Kim Nguyễn [Sun, 21 Oct 2012 07:43:52 +0000 (09:43 +0200)]
Split the Options module in two to remove a circular dependency in
the pretty-printing module:
- Options depends on Logger to get the list of all log levels.
- Logger depends on Options to check whether we are in verbose mode
  or not.
Split:
- Config holds the references storing the options.
- Options only handles the parsing of the command line.
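The split described in this commit can be sketched as follows. This is a minimal single-file illustration, not the project's actual code: the `verbose` reference, the level names and the `-log` flag are assumptions made for the example; only the module names Config, Logger and Options and the -v flag appear in the log.

```ocaml
(* Hypothetical sketch: in the real project these would be separate
   compilation units rather than nested modules. *)
module Config = struct
  (* Only mutable state lives here; Config depends on nothing else. *)
  let verbose = ref false
end

module Logger = struct
  (* Assumed list of log levels, for illustration only. *)
  let levels = [ "top-down"; "bottom-up" ]

  (* Logger reads Config, not Options, so the cycle is gone. *)
  let should_print level = !Config.verbose && List.mem level levels
end

module Options = struct
  (* Options can now depend on Logger for the list of levels. *)
  let spec =
    [ ("-v", Arg.Set Config.verbose, " verbose mode");
      ("-log", Arg.Symbol (Logger.levels, (fun _ -> ())), " select a log level") ]

  let parse argv =
    Arg.parse_argv ~current:(ref 0) argv spec (fun _ -> ()) "usage: prog [options]"
end
```

The dependency graph becomes Options -> Logger -> Config, which the compiler can order without trouble.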
Kim Nguyễn [Sat, 20 Oct 2012 10:27:49 +0000 (12:27 +0200)]
Clean-up temporary files after testing.
Kim Nguyễn [Fri, 19 Oct 2012 19:33:45 +0000 (21:33 +0200)]
Make the non-regression testing more robust by feeding the output
of the reference XML engine into SXSI to produce the output of
'/child::*'. This re-serializes the XML document in exactly the
same way as SXSI does, so that a simple diff can be used to check
for differences in the output.
Kim Nguyễn [Fri, 19 Oct 2012 19:13:06 +0000 (21:13 +0200)]
Add option -nw to control the wrapping of results in an <xml_result/> node.
Kim Nguyễn [Fri, 19 Oct 2012 18:20:52 +0000 (20:20 +0200)]
Completely silence the output unless -v is given.
Kim Nguyễn [Fri, 19 Oct 2012 13:07:45 +0000 (15:07 +0200)]
Fix bug in the handling of element-subtree lazy result sets.
Kim Nguyễn [Fri, 19 Oct 2012 13:06:54 +0000 (15:06 +0200)]
Also serialize results in counting mode (prints the count in the output file)
Kim Nguyễn [Fri, 19 Oct 2012 13:03:45 +0000 (15:03 +0200)]
Add test docs.
Kim Nguyễn [Fri, 19 Oct 2012 12:46:52 +0000 (14:46 +0200)]
Add non-regression testing infrastructures and first tests.
Kim Nguyễn [Fri, 19 Oct 2012 12:46:03 +0000 (14:46 +0200)]
Explicitly add index files to the main .gitignore;
remove the directory-local .gitignore file.
Kim Nguyễn [Fri, 19 Oct 2012 12:44:55 +0000 (14:44 +0200)]
Remove the tests directory from the .gitignore file.
Kim Nguyễn [Fri, 19 Oct 2012 12:44:27 +0000 (14:44 +0200)]
Fix the xpath script to enclose results in an <xml_result></xml_result> element
Kim Nguyễn [Fri, 19 Oct 2012 12:42:46 +0000 (14:42 +0200)]
Rename directory non_regression_tests to comparison_tests
Kim Nguyễn [Wed, 17 Oct 2012 17:05:14 +0000 (19:05 +0200)]
Follow the changes in XMLTree API: xml_tree::subtree_elements() does
not require an extra array of attributes anymore.
Kim Nguyễn [Wed, 17 Oct 2012 17:00:00 +0000 (19:00 +0200)]
Silence a format-related warning (%i -> %lu).
Kim Nguyễn [Wed, 17 Oct 2012 16:57:45 +0000 (18:57 +0200)]
Fix the path to the #included bp.h (bp/bp.h -> libbp/bp.h)
Kim Nguyễn [Sat, 13 Oct 2012 12:56:22 +0000 (14:56 +0200)]
Change configure to reflect the name change of the bp library
(bp becomes libbp).
Kim Nguyễn [Sat, 13 Oct 2012 12:54:02 +0000 (14:54 +0200)]
Add runtime assertion to check that values passed to Obj_val()
are indeed custom blocks.
Kim Nguyễn [Fri, 12 Oct 2012 20:46:38 +0000 (22:46 +0200)]
Fix a nasty bug where the wrong pointer was passed to the C side.
Kim Nguyễn [Fri, 12 Oct 2012 18:55:24 +0000 (20:55 +0200)]
BUG closed: [fc7c30b] Wrong display of empty attributes.
Kim Nguyễn [Fri, 12 Oct 2012 14:37:51 +0000 (16:37 +0200)]
BUG added: [fc7c30b28705] Wrong display of empty attributes.
Kim Nguyễn [Fri, 12 Oct 2012 14:10:57 +0000 (16:10 +0200)]
BUG closed: [59c2eb8] Wrong result for jumping from root
Kim Nguyễn [Fri, 12 Oct 2012 14:08:04 +0000 (16:08 +0200)]
Merge branch 'master' of ssh://git.nguyen.vg/home/kim/repository/SXSI/xpathcomp
Kim Nguyễn [Fri, 12 Oct 2012 14:07:51 +0000 (16:07 +0200)]
Ignore everything under test.
Kim Nguyễn [Fri, 12 Oct 2012 14:04:45 +0000 (16:04 +0200)]
Add a test program for experimenting with lexicographic indices.
Kim Nguyễn [Tue, 24 Jul 2012 15:33:03 +0000 (17:33 +0200)]
Remove all traces of Tom's Grammar.
Kim Nguyễn [Tue, 24 Jul 2012 15:11:01 +0000 (17:11 +0200)]
Fix incorrect English in error message.
Kim Nguyễn [Wed, 30 May 2012 12:56:26 +0000 (14:56 +0200)]
BUG added: [59c2eb816cf4] Wrong result for jumping from root
Kim Nguyễn [Wed, 30 May 2012 12:17:47 +0000 (14:17 +0200)]
Add tests for wordbased index.
Kim Nguyễn [Tue, 29 May 2012 05:59:34 +0000 (07:59 +0200)]
Fix bug in commandline parsing.
Kim Nguyễn [Tue, 29 May 2012 05:58:54 +0000 (07:58 +0200)]
Rework test scripts
Kim Nguyễn [Tue, 29 May 2012 05:53:07 +0000 (07:53 +0200)]
Fix typo in debugging message.
Kim Nguyễn [Tue, 29 May 2012 05:50:26 +0000 (07:50 +0200)]
Add command line option to disable caching and jumping
Kim Nguyễn [Tue, 29 May 2012 05:49:14 +0000 (07:49 +0200)]
Bump file format magic number, invalidating all previously generated indexes.
Kim Nguyễn [Fri, 4 May 2012 14:00:11 +0000 (16:00 +0200)]
Add -doc-stats option to print document statistics.
Kim Nguyễn [Fri, 4 May 2012 12:57:57 +0000 (14:57 +0200)]
Add support for multiline XPath queries.
Kim Nguyễn [Wed, 2 May 2012 12:41:18 +0000 (14:41 +0200)]
BUG added: [2c899aa21af4] OCamlbuild dependencies
Kim Nguyễn [Wed, 2 May 2012 12:34:45 +0000 (14:34 +0200)]
Fixes on queries and test scripts to handle old versions of SXSI.
Kim Nguyễn [Wed, 2 May 2012 12:34:18 +0000 (14:34 +0200)]
Various fixes for bottom-up run.
Kim Nguyễn [Wed, 2 May 2012 12:32:34 +0000 (14:32 +0200)]
Various improvements:
- Store the set of attributes in a TLIST instead of a Ptset.Int.t
- Add a C trim function for strings (to remove leading and trailing whitespace)
- Rewrite the text_query wrappers.
Kim Nguyễn [Wed, 2 May 2012 12:31:57 +0000 (14:31 +0200)]
Add more logging statements.
Kim Nguyễn [Wed, 2 May 2012 12:31:00 +0000 (14:31 +0200)]
Add 'bottom-up' logging level.
Kim Nguyễn [Wed, 2 May 2012 12:26:42 +0000 (14:26 +0200)]
Add a command line option to disable the indexing of ignorable whitespace.
Kim Nguyễn [Wed, 2 May 2012 12:24:36 +0000 (14:24 +0200)]
Don't flush the XML printing buffer if nothing was printed.
Kim Nguyễn [Wed, 2 May 2012 12:11:50 +0000 (14:11 +0200)]
Add word-based index auxiliary index files to .gitignore.
Kim Nguyễn [Tue, 24 Apr 2012 14:24:39 +0000 (16:24 +0200)]
Revert "Call directly the low-level subtree_elements function instead of"
This reverts commit d6c57f01eabebe2b11e1c701835562c2efc2fd92.
The tentative fix for the performance regression is buggy and makes things slower.
Kim Nguyễn [Tue, 24 Apr 2012 14:24:19 +0000 (16:24 +0200)]
BUG reopened: [a3fdf1a] Performance regression
Kim Nguyễn [Tue, 24 Apr 2012 13:52:29 +0000 (15:52 +0200)]
BUG closed: [a3fdf1a] Performance regression
Kim Nguyễn [Tue, 24 Apr 2012 13:50:38 +0000 (15:50 +0200)]
Call directly the low-level subtree_elements function instead of
re-implementing it in ocaml.
Fixes bug: a3fdf1a5 Performance regression
Kim Nguyễn [Tue, 24 Apr 2012 07:53:05 +0000 (09:53 +0200)]
BUG added: [2a89b3b569f4] duplicate interface definitions
Kim Nguyễn [Tue, 24 Apr 2012 07:47:38 +0000 (09:47 +0200)]
BUG added: [a3fdf1a569f4] Performance regression
Kim Nguyễn [Tue, 24 Apr 2012 07:44:31 +0000 (09:44 +0200)]
Initialized bug tracker
Kim Nguyễn [Fri, 20 Apr 2012 14:54:22 +0000 (16:54 +0200)]
Add hooks to re-initialize hconsed modules.
Kim Nguyễn [Fri, 20 Apr 2012 14:49:47 +0000 (16:49 +0200)]
Replace \n by @\n in log message.
Remove the tracing code around exec (OCaml generates less efficient code
even for no-ops).
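The difference between a raw "\n" and the Format hint "@\n" can be illustrated with a small sketch. The message text and box layout below are made up for the example and are not the project's log messages; the point is only that "@\n" is a line break the pretty-printer knows about, so indentation of the enclosing box is applied, while a raw "\n" is an opaque character inside the string.

```ocaml
(* A raw newline inside the literal: Format treats the whole text as one
   token and does not re-indent after it. *)
let with_raw = Format.asprintf "@[<v 2>log:\nline@]"

(* "@\n" forces a newline through the pretty-printer, so the indentation
   of the enclosing box is taken into account. *)
let with_hint = Format.asprintf "@[<v 2>log:@\nline@]"

let () =
  print_string with_raw;
  print_newline ();
  print_string with_hint;
  print_newline ()
```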
Kim Nguyễn [Fri, 20 Apr 2012 14:05:44 +0000 (16:05 +0200)]
Minor code factoring.
Call Ata.init() before a global top-down run to clear global cache.
Kim Nguyễn [Fri, 20 Apr 2012 14:01:53 +0000 (16:01 +0200)]
Favor {first,next}_element calls over select_{descendant,following_below}
when the set of target tags is large (esp. when it contains everything but attributes).
Kim Nguyễn [Fri, 20 Apr 2012 13:41:17 +0000 (15:41 +0200)]
Make the constant constructors CACHE and RETURN of L2JIT.opcode take a
dummy unit argument. This improves the code generated for the pattern
matching in l2jit_dispatch (in runtime.ml).
Replace inline macros LOOP and LOOP_TAG with function calls.
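A likely rationale for the dummy unit argument, sketched under assumptions: in OCaml's usual value representation, constant constructors are immediate integers while constructors with arguments are heap blocks, so a match over a mixed type has to distinguish the two kinds before branching. The opcode types below are hypothetical; only the names CACHE and RETURN come from the log, and LEFT and RIGHT are invented for illustration.

```ocaml
module Mixed = struct
  (* CACHE and RETURN are constant constructors (immediate values);
     LEFT and RIGHT are allocated blocks. *)
  type opcode = CACHE | RETURN | LEFT of int | RIGHT of int

  let dispatch = function
    | CACHE -> 0
    | RETURN -> 1
    | LEFT n -> n + 2
    | RIGHT n -> n + 3
end

module Uniform = struct
  (* Every constructor carries an argument, so every value is a block
     and the match can branch on the block tag alone. *)
  type opcode = CACHE of unit | RETURN of unit | LEFT of int | RIGHT of int

  let dispatch = function
    | CACHE () -> 0
    | RETURN () -> 1
    | LEFT n -> n + 2
    | RIGHT n -> n + 3
end
```

Both versions compute the same result; the trade-off is one extra allocation per CACHE or RETURN value against a simpler dispatch.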
Kim Nguyễn [Fri, 20 Apr 2012 13:34:41 +0000 (15:34 +0200)]
Use better defaults for top-down cache size.
Kim Nguyễn [Fri, 20 Apr 2012 13:31:44 +0000 (15:31 +0200)]
Add -r <n> option to repeat the query execution n times.
Kim Nguyễn [Fri, 20 Apr 2012 13:28:52 +0000 (15:28 +0200)]
Change the read_procmem function to return the stack size of the process,
not the heap size.
Kim Nguyễn [Fri, 20 Apr 2012 13:27:31 +0000 (15:27 +0200)]
Add performance regression test script.
Kim Nguyễn [Wed, 18 Apr 2012 13:10:27 +0000 (15:10 +0200)]
Change the inlining
Remove dependency on pkg-config
Kim Nguyễn [Wed, 18 Apr 2012 11:48:10 +0000 (13:48 +0200)]
Add utility header file.
Kim Nguyễn [Wed, 18 Apr 2012 11:47:13 +0000 (13:47 +0200)]
Change from unordered_set<tag> to int array in low-level select_* functions.
Kim Nguyễn [Wed, 18 Apr 2012 11:45:20 +0000 (13:45 +0200)]
Change the ifndef guard from FOO_H_ to FOO_HPP_ to keep it consistent with
the filename.
Kim Nguyễn [Wed, 18 Apr 2012 11:43:32 +0000 (13:43 +0200)]
Misc. rewrites:
- cosmetic changes (tabs -> spaces)
- more logging
Kim Nguyễn [Thu, 12 Apr 2012 16:08:18 +0000 (18:08 +0200)]
Wrap serialization results in <xml_result>...</xml_result>
Kim Nguyễn [Thu, 12 Apr 2012 14:32:33 +0000 (16:32 +0200)]
More debugging:
- remove progress printing during parsing
- add debugging trace in resJIT (show which node is added to the result set)
Kim Nguyễn [Fri, 6 Apr 2012 12:04:14 +0000 (14:04 +0200)]
Finish adapting to new libxml-tree API
- Code is much cleaner
- Speed is mostly the same, often faster, but with two offenders:
    Q9 : 155 ms -> 190 ms
    Q28: 2 s -> 3.5 s
  These need to be investigated.
Kim Nguyễn [Wed, 4 Apr 2012 17:07:23 +0000 (19:07 +0200)]
Big refactoring of libxml-tree, part (1) (everything compiles)
Kim Nguyễn [Mon, 2 Apr 2012 13:09:27 +0000 (15:09 +0200)]
Optimize the bottom-up run using a Camlp4 macro instead of an
(un-inlined) recursive call.
Kim Nguyễn [Mon, 2 Apr 2012 13:09:11 +0000 (15:09 +0200)]
Silence compiler warning about unused variables.
Kim Nguyễn [Mon, 2 Apr 2012 13:05:27 +0000 (15:05 +0200)]
Use the Logger.print function instead of Printf.eprintf
Kim Nguyễn [Mon, 2 Apr 2012 13:02:41 +0000 (15:02 +0200)]
Add text() and node() tokens in the lexer to allow node tests and
text node tests in XPath expressions.
Kim Nguyễn [Mon, 2 Apr 2012 13:00:24 +0000 (15:00 +0200)]
Fix bug where the Lvl2 Cache got corrupted upon resizing.
Kim Nguyễn [Mon, 2 Apr 2012 12:39:58 +0000 (14:39 +0200)]
Remove unused memory profiling code.
Kim Nguyễn [Mon, 2 Apr 2012 12:37:21 +0000 (14:37 +0200)]
Add efficient compare_int in INCLUDED .ml files.
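A minimal sketch of what such a function might look like, assuming the goal is to avoid the generic polymorphic comparison. The name compare_int comes from the log; the body below is an assumption, not the project's code.

```ocaml
(* Monomorphic comparison: the int annotations let the compiler emit
   direct integer comparisons instead of calling polymorphic compare. *)
let compare_int (x : int) (y : int) : int =
  if x < y then -1 else if x > y then 1 else 0

(* Typical use: as the ordering of a functor argument. *)
module IntSet = Set.Make (struct
  type t = int
  let compare = compare_int
end)
```

The explicit branches are used rather than `x - y`, since the subtraction can overflow for operands of opposite sign.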
Kim Nguyễn [Tue, 20 Mar 2012 20:17:18 +0000 (21:17 +0100)]
Change the logging infrastructure:
- rely on the Format module to correctly indent log messages
- re-use the Pretty module as much as possible to print
sequences and arrays.
- add versions of print_list and print_array that take a printer
as optional argument to print the separator (rather than a string).
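The separator-as-printer idea from the last item can be sketched as follows. The name print_list comes from the log, but the signature and default separator here are assumptions. The standard library's Format.pp_print_list, with its ~pp_sep argument, follows the same design.

```ocaml
(* Hypothetical signature: [sep] is a printer rather than a string, so it
   can emit break hints that the enclosing box is free to use. *)
let print_list ?(sep = fun ppf () -> Format.fprintf ppf ",@ ") pp_elt ppf l =
  let rec loop = function
    | [] -> ()
    | [ x ] -> pp_elt ppf x
    | x :: rest ->
        pp_elt ppf x;
        sep ppf ();
        loop rest
  in
  loop l

(* Usage with a custom separator inside a horizontal box. *)
let example =
  Format.asprintf "@[<h>[%a]@]"
    (print_list ~sep:(fun ppf () -> Format.fprintf ppf ";@ ") Format.pp_print_int)
    [ 1; 2; 3 ]
```

A string separator could only ever print fixed characters; a printer can also carry "@ " or "@," hints, which is what lets Format indent long sequences correctly.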
Kim Nguyễn [Tue, 20 Mar 2012 20:16:15 +0000 (21:16 +0100)]
Rename flags for build script from -foo to -enable-foo
Kim Nguyễn [Mon, 19 Mar 2012 17:41:12 +0000 (18:41 +0100)]
Add the auction.dtd document type definition to the repository.
Kim Nguyễn [Mon, 19 Mar 2012 17:40:25 +0000 (18:40 +0100)]
Rename 'Tracer' module to 'Logger'.
Kim Nguyễn [Mon, 19 Mar 2012 14:24:15 +0000 (15:24 +0100)]
Revert "Remove the need for a NOP operation in automata bytecode."
This reverts commit a6c781462ddca7c25fe95789c81c2265f153203c.
The automaton is unsound and returns bogus results for child moves
without the nop case.
Kim Nguyễn [Mon, 19 Mar 2012 14:19:28 +0000 (15:19 +0100)]
Temporary commit
Kim Nguyễn [Thu, 15 Mar 2012 15:06:43 +0000 (16:06 +0100)]
Remove -flto from the linking phase.
Kim Nguyễn [Wed, 14 Mar 2012 23:20:49 +0000 (00:20 +0100)]
Ensure that the c++ flags defined in myocamlbuild_config.ml.in are passed
to ocamlopt for linking.
Kim Nguyễn [Wed, 14 Mar 2012 18:07:26 +0000 (19:07 +0100)]
Change inlining parameter to 1000 again.
Kim Nguyễn [Wed, 14 Mar 2012 18:07:42 +0000 (19:07 +0100)]
Finally clean up formula representation.
Kim Nguyễn [Wed, 14 Mar 2012 14:33:42 +0000 (15:33 +0100)]
Add query
Kim Nguyễn [Wed, 14 Mar 2012 14:33:15 +0000 (15:33 +0100)]
Add performance tests
Kim Nguyễn [Wed, 14 Mar 2012 14:32:49 +0000 (15:32 +0100)]
Fix test scripts.
Kim Nguyễn [Wed, 14 Mar 2012 13:02:41 +0000 (14:02 +0100)]
Ignore .g files (grammar).
Kim Nguyễn [Wed, 14 Mar 2012 13:01:49 +0000 (14:01 +0100)]
Make the tree/text interface flexible enough to support various text indexes.
Kim Nguyễn [Wed, 14 Mar 2012 13:00:44 +0000 (14:00 +0100)]
Small refactoring:
- split subtree/subtree_tag marking into separate function
- general clean-up
- comment out some grammar related stuff
Kim Nguyễn [Wed, 14 Mar 2012 13:00:02 +0000 (14:00 +0100)]
Remove the need for a NOP operation in automata bytecode.
Kim Nguyễn [Wed, 14 Mar 2012 12:52:16 +0000 (13:52 +0100)]
Clean-up Hcons module:
- remove dead code
- avoid one allocation.
Kim Nguyễn [Wed, 14 Mar 2012 12:47:31 +0000 (13:47 +0100)]
Add iteri function to traverse cache data structures.
Kim Nguyễn [Wed, 14 Mar 2012 12:46:30 +0000 (13:46 +0100)]
Sort and remove duplicates from text query results (needed for the
word-based text index).
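A minimal sketch of the normalization step, assuming the results are a list of integer text identifiers (the actual result type is not stated in the log, and the function name is hypothetical):

```ocaml
(* Hypothetical helper: sort the identifiers and drop duplicates in one
   pass, using the standard library's List.sort_uniq. *)
let normalize_results (results : int list) : int list =
  List.sort_uniq (compare : int -> int -> int) results
```

For instance, `normalize_results [ 7; 3; 7; 1; 3 ]` yields each identifier once, in increasing order.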
Kim Nguyễn [Wed, 14 Mar 2012 12:45:17 +0000 (13:45 +0100)]
Add a C implementation of leading_bit and clz to optimize the Patricia
tree library.
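For reference, a pure-OCaml sketch of what a leading_bit function computes, and of how big-endian Patricia trees typically use it. This is an assumption about the role of the function based on the standard Patricia tree construction; the project's own code and its C replacement are not shown in the log, and branching_bit is a hypothetical helper name.

```ocaml
(* Returns the value of the highest set bit of [x], or 0 when [x] = 0.
   This loop is the slow reference version that a clz-based C primitive
   would replace. *)
let leading_bit (x : int) : int =
  let rec go x bit = if x lsr 1 = 0 then bit else go (x lsr 1) (bit lsl 1) in
  if x = 0 then 0 else go x 1

(* In a big-endian Patricia tree, two keys branch at the highest bit
   where they differ, i.e. the leading bit of their xor. *)
let branching_bit (k1 : int) (k2 : int) : int = leading_bit (k1 lxor k2)
```

Since this function runs on every insertion and merge, replacing the loop by a single count-leading-zeros instruction is a plausible optimization target.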
Kim Nguyễn [Wed, 14 Mar 2012 12:39:42 +0000 (13:39 +0100)]
Update compilation flags:
- change the value of inlining for OCaml code
- pass -O3 as a flag to the C/C++ compiler
Kim Nguyễn [Thu, 1 Mar 2012 13:31:13 +0000 (14:31 +0100)]
Add text-attribute tags to the star tagset.