X-Git-Url: http://git.nguyen.vg/gitweb/?a=blobdiff_plain;f=HACKING;h=e88ebe9e21cdf0b40debc0b210265374a8afaaed;hb=0158dd0bd369b50f3e784b3f4b6122ea4cbde822;hp=b8fe7c6406d0e020fcfececf55989ccd3d3a5094;hpb=7a84508ebd7b534215a768c771171e8f062e0d0b;p=SXSI%2Fxpathcomp.git

diff --git a/HACKING b/HACKING
index b8fe7c6..e88ebe9 100644
--- a/HACKING
+++ b/HACKING
@@ -1,58 +1,56 @@
-Compilation instruction:
+Dependencies
+------------
 
-compile the TextCollection :
-
-cd XMLTree/TextCollection
-make
-
-compile libcds :
-
-cd XMLTree/libcds
-make
-
-compile XMLTree :
-cd XMLTree
-make
-
-compile xpathcomp/
-make
-
-
-Building xpathcomp requires:
-
-libxml2.6++-dev
+libxml2.6++-dev
 ocaml-findlib
 ocaml-ulex
 ocaml-nox
 camlp4
 
-and their dependencies
+and their dependencies as well as ocamlbuild which is part of the standard ocaml distribution.
+The "src" directory should contain a copy of the XMLTree source dir (this can be a symbolic link to
+the XMLTree directory).
 
+Build instructions
+------------------
 
-you can compile with make DEBUG=true to get more statistics the overall programm is slower
-but you get precise timing for each individual function calls to the XMLTree interface.
+Build the XMLTree library:
+cd src/XMLTree; make clean all; cd -
 
-Usage :
-./main 'file.xml' 'query' [output]
+Build the xpathcomp program:
+./configure
+./build
 
-file.xml is the input file. There are some in the tests subdirectory. At the moment you can mainly test
-with base.xml and tiny.xml. Any file bigger than this will take too much time to load.
+See ./build --help to customize the build process (build with debugging/profiling code, build ocaml
+bytecode instead of native binary etc...).
+This will produce a main.native executable (a symlink to the real binary which lies in the _build/src
+directory).
 
-output is optional. If specified, it is the name of a file to which the result of the query is serialized.
-If output is not given, the the result of the query is kept as a set of identifier and no access to
-the string collection is made.
+Usage
+-----
+./main.native [options] 'input_file' 'query' [output_file]
 
 
+input_file can be either an xml file (and thus the name must have a .xml extension) or an indexed file
+(with a .srx extension). Due to a bug in the parser, the query can only use the explicit syntax, that
+is:
 
-There are a few flags:
+/descendant::a/child::b[ child::a and descendant::c or not(contains( ., "str")) ]/descendant::d
 
--f sample factor [default=64]
--i index empty texts [default=false]
--d Disable text collection[default=false]
--help Display this list of options
---help Display this list of options
+and no // or a/b/c. Text predicates must be of the form function( ., "string") where function can
+be: contains, equals, starts-with, ends-with.
 
+Available options are:
 
 
-for instance:
-./main -f 29 -d tests/tiny.xml '//para'
+ -c counting only (don't materialize the result set)
+ -f sample factor [default=64]
+ -i index empty texts [default=false]
+ -d disable text collection [default=false]
+ -s save the intermediate representation into file.srx
+ -b real bottom up run
+ -nj disable jumping
+ -index-type {default|swcsa|rlcsa} choose text index type
+ -v verbose mode
+ -help Display this list of options
+ --help Display this list of options
\ No newline at end of file
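
Because the parser only accepts the explicit syntax, an abbreviated path has to be expanded by hand before it is passed to main.native. The shell sketch below shows one way to do that expansion. The expand_query helper is hypothetical and not part of xpathcomp. It assumes a simple absolute path made of plain element names, with no predicates and no axes already written out, and it maps // to descendant::, which is the usual equivalent for such paths.

```shell
# Hypothetical helper, not part of xpathcomp: expand an abbreviated
# absolute path such as //a/b into the explicit syntax.
# Assumes plain element names only (no predicates, no explicit axes).
#
# Three sed passes:
#   1. mark every '//' with a placeholder so pass 2 does not touch it
#   2. turn each remaining '/name' step into '/child::name'
#   3. replace the placeholder with the descendant axis
expand_query() {
  printf '%s\n' "$1" | sed \
    -e 's|//|/%%D%%|g' \
    -e 's|/\([A-Za-z_]\)|/child::\1|g' \
    -e 's|%%D%%|descendant::|g'
}

expand_query '//a/b'    # prints /descendant::a/child::b
```

The result can then be used as the query argument, for example ./main.native -c input.xml "$(expand_query '//a/b')", where input.xml stands for whatever document is being queried. Queries with predicates still have to be written out by hand.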