From f9ae94d85120129130e9e9ddf9c832c5e92c3cdd Mon Sep 17 00:00:00 2001
From: kim
Date: Fri, 9 Sep 2011 16:01:49 +0000
Subject: [PATCH] Update HACKING file with new build instructions

git-svn-id: svn+ssh://idea.nguyen.vg/svn/sxsi/trunk/xpathcomp@1114 3cdefd35-fc62-479d-8e8d-bae585ffb9ca
---
 HACKING | 72 ++++++++++++++++++++++++++------------------------------
 1 file changed, 33 insertions(+), 39 deletions(-)

diff --git a/HACKING b/HACKING
index bb1dae8..e88ebe9 100644
--- a/HACKING
+++ b/HACKING
@@ -1,62 +1,56 @@
-Compilation instruction:
+Dependencies
+------------
 
-compile the TextCollection :
-
-cd XMLTree/TextCollection
-make
-
-compile libcds :
-
-cd XMLTree/libcds
-make
-
-compile XMLTree :
-cd XMLTree
-make
-
-compile xpathcomp/
-make
-
-
-Building xpathcomp requires:
-
-libxml2.6++-dev
+libxml2.6++-dev
 ocaml-findlib
 ocaml-ulex
 ocaml-nox
 camlp4
-and their dependencies
-
+and their dependencies as well as ocamlbuild which is part of the standard ocaml distribution.
+The "src" directory should contain a copy of the XMLTree source dir (this can be a symbolic link to
+the XMLTree directory).
 
-you can compile with make DEBUG=true to get more statistics the overall programm is slower
-but you get precise timing for each individual function calls to the XMLTree interface.
+Build instructions
+------------------
+Build the XMLTree library:
 
-Usage :
-./main 'file.xml' 'query' [output]
+cd src/XMLTree; make clean all; cd -
 
-file.xml is the input file. There are some in the tests subdirectory. At the moment you can mainly test
-with base.xml and tiny.xml. Any file bigger than this will take too much time to load.
-If the file ends with the .srx extensions then it is considered to be a serialized output saved to disk
-with the -s flag (see bellow).
+Build the xpathcomp program:
+./configure
+./build
+See ./build --help to customize the build process (build with debugging/profiling code, build ocaml
+bytecode instead of native binary etc...).
 
-output is optional. If specified, it is the name of a file to which the result of the query is serialized.
-If output is not given, the the result of the query is kept as a set of identifier and no access to
-the string collection is made.
+This will produce a main.native executable (a symlink to the real binary which lies in the _build/src
+directory).
+Usage
+-----
+./main.native [options] 'input_file' 'query' [output_file]
+input_file can be either an xml file (and thus the name must have a .xml extension) or an indexed file
+(with a .srx extension). Due to a bug in the parser, the query can only use the explicit syntax, that
+is:
+/descendant::a/child::b[ child::a and descendant::c or not(contains( ., "str")) ]/descendant::d
 
-There are a few flags:
+and no // or a/b/c. Text predicates must be of the form function( ., "string") where function can
+be: contains, equals, starts-with, ends-with.
 
-./main 'query' [output]
+Available options are:
  -c counting only (don't materialize the result set)
- -max-tc set maximum count for which the TextCollection is used
  -f sample factor [default=64]
+ -i index empty texts [default=false]
+ -d disable text collection [default=false]
  -s save the intermediate representation into file.srx
  -b real bottom up run
+ -nj disable jumping
+ -index-type {default|swcsa|rlcsa} choose text index type
+ -v verbose mode
  -help Display this list of options
- --help Display this list of options
+ --help Display this list of options
\ No newline at end of file
-- 
2.17.1