
esalib's People

Contributors

ticcky


esalib's Issues

Build indexes of a subset of Wikipedia pages

Hi,

I am interested in one specific sub-domain rather than the whole knowledge base. I've downloaded the pages I need. Is there a way to build the index only on these pages?

Thanks!
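
The issue does not say how esalib would consume a reduced page set, so the following is only a sketch of one possible preparation step: cutting a MediaWiki XML dump down to a chosen set of titles before any indexing. The function names `local_name` and `iter_selected_pages` are hypothetical, and whether the esalib tools accept such a reduced dump unchanged is an assumption.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def iter_selected_pages(dump_path, wanted_titles):
    """Yield <page> elements from a MediaWiki XML dump whose title is wanted."""
    for _event, element in ET.iterparse(dump_path, events=("end",)):
        if local_name(element.tag) != "page":
            continue
        title = next(
            (child.text for child in element if local_name(child.tag) == "title"),
            None,
        )
        if title in wanted_titles:
            # The caller must finish with the element before asking for the next one.
            yield element
        element.clear()  # drop the page's children once it has been handled
```

Each yielded element can be serialized with `ET.tostring(element)` and written into a smaller dump file.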

Error while following tutorial

Syntax error in memesa_builder/memndx/prepare.py, line 98:

if len(vector != 808:

I presume this should be:

if len(vector) != 804:
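
The syntax difference between the two lines can be checked without running the tutorial. This is a minimal sketch using Python's standard `ast` module; the helper name `parses` is hypothetical, and a `pass` body has been added to each quoted line so that it forms a complete statement.

```python
import ast


def parses(source):
    """Return True if `source` is syntactically valid Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


# The line as reported (unbalanced parenthesis) and the proposed fix.
reported = "if len(vector != 808:\n    pass\n"
proposed = "if len(vector) != 804:\n    pass\n"

print(parses(reported))  # False
print(parses(proposed))  # True
```

This only checks syntax. It says nothing about whether 808 or 804 is the intended vector length; the two lines quoted in the report differ on that point.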

tutorial steps for importing wikipedia dump

Hi, I'm trying to follow the tutorial steps for updating the imported Wikipedia snapshot and related data structures on a Linux workstation with MySQL, but I'm having some trouble. My understanding is that PrepareWikiDb should be called several times during the import with different arguments (1=>drop_indexes, 2=>clear, 3=>rebuild_indexes, 4=>page_concepts...) in order to update the SQL database. Is that correct? Could you state the correct order? Thank you!

As usual when I try to import Wiki dumps, I run into problems like:

ERROR 1366 (HY000) at line 32: Incorrect string value: '\xD0' for column 'll_title' at row 2577

This happens even when I set the character set to utf8 with COLLATE utf8_general_ci. Am I the only person to get this error? I switched to an importer that can ignore the very few errors of this kind and continue importing the rest of the dataset.

5.6.21 MySQL Community Server (GPL)
Ubuntu 14.04.1 LTS
java version "1.7.0_72" 64bit
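
The byte `\xD0` in the error message is the first byte of a two-byte UTF-8 sequence (used by Cyrillic letters, among others), so on its own it is not a complete character. Whether it comes from a truncated title or from a charset mismatch between the dump and the column cannot be told from the report. The sketch below only shows how such incomplete sequences can be located in raw data; the function name `invalid_utf8_lines` is hypothetical.

```python
def invalid_utf8_lines(lines):
    """Yield (line_number, reason) for byte strings that are not valid UTF-8."""
    for number, raw in enumerate(lines, start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError as exc:
            yield number, exc.reason


complete = "Д".encode("utf-8")  # b'\xd0\x94', a full two-byte sequence
truncated = complete[:1]         # b'\xd0', the lead byte alone

for number, reason in invalid_utf8_lines([complete, truncated]):
    print(number, reason)
```

To scan a real dump, pass a file opened in binary mode (`open(path, "rb")`), which iterates over byte lines.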

Initial Problem

Dear Ticcky:
I tried to follow your steps on my Mac Pro:
== So how to get ESA running in 2 minutes for English? ==
  0. # git co https://github.com/ticcky/esalib.git
     # cd esalib
  1. Create a symbolic link to the sample database
     # ln -s example/esa_en.db esa_db.db
  2. Get relatedness estimate of two texts:
     # ./run_analyzer "computer" "apple"

In the end I get the following error log:
Cannot execute command: SELECT * FROM en_ndx
java.sql.SQLException: [SQLITE_ERROR] SQL error or missing database (no such table: en_ndx)
Exception in thread "main" java.lang.NullPointerException
at clldsystem.esa.ESAAnalyzer.initDB(ESAAnalyzer.java:109)
at clldsystem.esa.ESAAnalyzer.<init>(ESAAnalyzer.java:155)
at clldsystem.esa.ESAAnalyzer.main(ESAAnalyzer.java:522)

Is there any other initial setup that should be done before using esalib?
Thank you for your kind help
:)
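
The error says the table `en_ndx` was not found in the database that was opened. A quick way to see which tables a SQLite file actually contains is to query `sqlite_master`. This is a minimal sketch using Python's standard `sqlite3` module; the function name `list_tables` is hypothetical, and the file name `esa_db.db` is taken from the steps quoted above.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at `db_path`."""
    connection = sqlite3.connect(db_path)
    try:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        connection.close()
    return [name for (name,) in rows]
```

Calling `list_tables("esa_db.db")` from the esalib directory and checking whether `"en_ndx"` is in the result would show whether the linked file contains the table the analyzer asks for.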

esalib help

How can I add a French database to ESAlib?
Thanks in advance

[Fatal Error]

I ran into an error. Please help me. I tried the 2-minute option.

[Fatal Error] config.xml:1:1: Content is not allowed in prolog.
org.apache.commons.configuration.ConfigurationException: Unable to load the configuration
at org.apache.commons.configuration.XMLConfiguration.load(XMLConfiguration.java:863)
at org.apache.commons.configuration.XMLConfiguration.load(XMLConfiguration.java:821)
at org.apache.commons.configuration.XMLConfiguration$XMLFileConfigurationDelegate.load(XMLConfiguration.java:1453)
at org.apache.commons.configuration.AbstractFileConfiguration.load(AbstractFileConfiguration.java:287)
at org.apache.commons.configuration.AbstractFileConfiguration.load(AbstractFileConfiguration.java:219)
at org.apache.commons.configuration.AbstractFileConfiguration.load(AbstractFileConfiguration.java:197)
at org.apache.commons.configuration.AbstractHierarchicalFileConfiguration.load(AbstractHierarchicalFileConfiguration.java:164)
at common.config.AppConfig.init(AppConfig.java:72)
at common.config.AppConfig.(AppConfig.java:61)
at common.config.AppConfig.(AppConfig.java:25)
at clldsystem.esa.ESAAnalyzer.main(ESAAnalyzer.java:518)
Caused by: org.xml.sax.SAXParseException; systemId: file:/home/milina/Desktop/esa/ticcky-esalib-b228628/config/config.xml; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:253)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:288)
at org.apache.commons.configuration.XMLConfiguration.load(XMLConfiguration.java:855)
... 10 more
Exception in thread "main" java.lang.NullPointerException
at common.db.DBConfig.setConnectionFromDrupalUrl(DBConfig.java:25)
at common.db.DBConfig.(DBConfig.java:21)
at common.db.DB.(DB.java:55)
at clldsystem.esa.ESAAnalyzer.main(ESAAnalyzer.java:522)

Not enough space for wikixray.py

When running scripts/wikixray.py pages-articles.xml, I got the following error:
Traceback (most recent call last):
File "G:/esalib/scripts/wikixray.py", line 509, in <module>
filter_handler.parse(sys.stdin)
File "J:\python64\lib\xml\sax\saxutils.py", line 283, in parse
self._parent.parse(source)
File "J:\python64\lib\xml\sax\expatreader.py", line 107, in parse
xmlreader.IncrementalParser.parse(self, source)
File "J:\python64\lib\xml\sax\xmlreader.py", line 121, in parse
buffer = file.read(self._bufsize)
IOError: [Errno 12] Not enough space

Does this mean it needs more memory?

Thanks
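
The traceback shows the failure inside `file.read(self._bufsize)` while the parser reads from `sys.stdin`; errno 12 is the operating system's out-of-memory code. One thing that could be tried is to read the dump from a file opened directly and feed it to the parser in fixed-size chunks. The sketch below assumes Python 3 and uses a stand-in handler, `PageCounter`, because the handler from wikixray.py is not reproduced here; `parse_in_chunks` is likewise a hypothetical name. Whether this avoids the error on the reporter's machine is not known.

```python
import xml.sax


class PageCounter(xml.sax.ContentHandler):
    """Stand-in handler that counts <page> elements."""

    def __init__(self):
        super().__init__()
        self.pages = 0

    def startElement(self, name, attrs):
        if name == "page":
            self.pages += 1


def parse_in_chunks(path, handler, chunk_size=1 << 20):
    """Feed the file at `path` to a SAX parser `chunk_size` bytes at a time."""
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    with open(path, "rb") as source:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break
            parser.feed(chunk)
    parser.close()  # signal end of input to the parser
    return handler
```

With this approach only one chunk of the dump is held by the reading code at a time, regardless of the dump's total size.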

can't run the example "computer" "apple" using the given esa_en.db

Hi,
I have just cloned the repo and did exactly what is required to run the example, but unfortunately I am stuck with this error:

Cannot execute command: SELECT * FROM en_ndx
java.sql.SQLException: [SQLITE_ERROR] SQL error or missing database (no such table: en_ndx)
Exception in thread "main" java.lang.NullPointerException
at clldsystem.esa.ESAAnalyzer.initDB(ESAAnalyzer.java:109)
at clldsystem.esa.ESAAnalyzer.<init>(ESAAnalyzer.java:155)
at clldsystem.esa.ESAAnalyzer.main(ESAAnalyzer.java:522)

Could you please tell me what could be wrong here? I already browsed the .db file and the tables do exist, so I really don't know what's going on!

My OS is Ubuntu.

Thanks in advance
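
Since the tables are reported to exist in the .db file, one thing worth ruling out is that the program opens a different file from the one that was inspected. SQLite creates a new, empty database when asked to open a path that does not exist, and an empty database would produce a "no such table" error. Whether that is what happened here is not known. The sketch below, with the hypothetical function name `describe_db_path`, resolves a path through any symbolic links and reports the size of the file it ends at.

```python
import os


def describe_db_path(path):
    """Return the resolved location and size of the file `path` points to."""
    resolved = os.path.realpath(path)
    exists = os.path.isfile(resolved)
    size = os.path.getsize(resolved) if exists else None
    return {"resolved": resolved, "exists": exists, "size_bytes": size}
```

A missing target or a size of 0 bytes for `esa_db.db` would be consistent with the analyzer having opened an empty database.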
