An indexing tool that indexes the content between <Text> and </Text> and between <Headline> and </Headline> in a file, removing punctuation and excess markup tags. <DOCNO> is used as pagination, marking where each document begins.
gatoy / index-tool
A Java project that implements an inverted index and query.
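To make the description concrete, here is a minimal sketch of how such a tool might work. It is not the repository's actual code. The class name `InvertedIndexSketch`, its methods, the `<DOC>` wrapper in the sample input, and the choice of AND semantics for multi-word queries are all assumptions made for illustration. The sketch splits the input at each `<DOCNO>`, indexes only the text inside `<Text>` and `<Headline>`, strips nested tags and punctuation, and maps each term to the set of document numbers that contain it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; names and behavior are illustrative assumptions.
class InvertedIndexSketch {
    // <DOCNO>...</DOCNO> marks the start of a document and carries its id.
    private static final Pattern DOCNO =
        Pattern.compile("<DOCNO>(.*?)</DOCNO>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    // Only content inside <Text> or <Headline> is indexed.
    private static final Pattern CONTENT =
        Pattern.compile("<(TEXT|HEADLINE)>(.*?)</\\1>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // term -> sorted set of DOCNO ids whose indexed content contains the term
    private final Map<String, Set<String>> postings = new TreeMap<>();

    // Drop markup tags and punctuation, lowercase, and split on whitespace.
    static List<String> tokenize(String raw) {
        String noTags = raw.replaceAll("<[^>]*>", " ");
        String cleaned = noTags.replaceAll("[^\\p{L}\\p{N}\\s]", " ").toLowerCase(Locale.ROOT);
        List<String> tokens = new ArrayList<>();
        for (String token : cleaned.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Split the file at each <DOCNO> and index the sections that follow it.
    void indexFile(String fileContent) {
        List<String> ids = new ArrayList<>();
        List<Integer> starts = new ArrayList<>();
        List<Integer> ends = new ArrayList<>();
        Matcher doc = DOCNO.matcher(fileContent);
        while (doc.find()) {
            ids.add(doc.group(1).trim());
            starts.add(doc.start());
            ends.add(doc.end());
        }
        for (int i = 0; i < ids.size(); i++) {
            int from = ends.get(i);
            int to = (i + 1 < ids.size()) ? starts.get(i + 1) : fileContent.length();
            Matcher section = CONTENT.matcher(fileContent.substring(from, to));
            while (section.find()) {
                for (String term : tokenize(section.group(2))) {
                    postings.computeIfAbsent(term, k -> new TreeSet<>()).add(ids.get(i));
                }
            }
        }
    }

    // Assumed query semantics: return documents containing every query term (AND).
    Set<String> query(String q) {
        Set<String> result = null;
        for (String term : tokenize(q)) {
            Set<String> docs = postings.getOrDefault(term, new TreeSet<>());
            if (result == null) {
                result = new TreeSet<>(docs);
            } else {
                result.retainAll(docs);
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        // Made-up sample input; the <DOC> wrapper is an assumption.
        String sample =
            "<DOC><DOCNO> D1 </DOCNO><Headline>Search engines</Headline>"
          + "<Text>An inverted index maps terms to documents.</Text></DOC>"
          + "<DOC><DOCNO> D2 </DOCNO><Text>Queries, too, use the index!</Text></DOC>";
        InvertedIndexSketch idx = new InvertedIndexSketch();
        idx.indexFile(sample);
        System.out.println(idx.query("index"));          // prints [D1, D2]
        System.out.println(idx.query("inverted index")); // prints [D1]
    }
}
```

Replacing punctuation with a space instead of deleting it keeps words such as "too,use" from fusing, at the cost of splitting contractions. A real implementation might choose differently, and might also store term positions or frequencies in the postings.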