Comments (6)

hitvoice commented on July 20, 2024

Run python prepro.py before running python train.py. Please read the instructions carefully.

from drqa.

commented on July 20, 2024

I already ran python prepro.py, and it took over 10 hours to complete, but I didn't see the file. If the file is only created once the script finishes, I'll try again and see if that helps.

hitvoice commented on July 20, 2024

What's your spaCy version? If you follow all the instructions and install spacy==1.9, the prepro script should take only 1-2 minutes.
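A quick way to answer the version question is to read the installed package metadata. The snippet below is only a sketch: the helper names `installed_version` and `matches_pin` are made up for illustration, and the pin "1.9" is taken from the comment above.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(version, pin):
    """True if `version` equals `pin` or is a patch release of it (e.g. 1.9.0 for pin 1.9)."""
    return version is not None and (version == pin or version.startswith(pin + "."))


if __name__ == "__main__":
    found = installed_version("spacy")
    print("spaCy version:", found)
    if not matches_pin(found, "1.9"):
        # The 1.9 pin comes from the thread; other versions may behave differently.
        print("This is not the spacy==1.9 release mentioned in the thread.")
```

Running pip show spacy gives the same information from the command line.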

tim5go commented on July 20, 2024

@hitvoice
It seems like the annotate function from spaCy eats up a lot of memory (>10 GB) during the data preparation process. If the machine doesn't have enough RAM, the process becomes extremely slow. I guess that is what @YuanTW was experiencing.

Not sure if it is a memory leak in spaCy.

hitvoice commented on July 20, 2024

@tim5go I didn't pay attention to the memory usage before. I'll check that.

hitvoice commented on July 20, 2024

@tim5go I ran the script again and, indeed, the memory usage keeps increasing to about 10 GB. I haven't figured out the exact reason, so I'll add a warning to the README. Thanks for reporting this!
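For anyone who wants to narrow down where the memory goes, one option is to process the texts in batches and log memory after each batch. This is a rough sketch, not the code from prepro.py: `annotate_stub` is a hypothetical stand-in for the real annotate step, and `batches` and `process_with_memory_log` are illustrative names. Note that tracemalloc only sees allocations made through Python's allocator, so memory held by native extension code may not show up here.

```python
import tracemalloc


def batches(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def annotate_stub(text):
    # Hypothetical placeholder for the real annotate step; it only splits on whitespace.
    return text.split()


def process_with_memory_log(texts, batch_size=1000):
    """Process texts batch by batch and record traced memory after each batch."""
    tracemalloc.start()
    results, log = [], []
    for batch in batches(texts, batch_size):
        results.extend(annotate_stub(t) for t in batch)
        current, peak = tracemalloc.get_traced_memory()
        # Each entry: (items processed so far, current bytes, peak bytes).
        log.append((len(results), current, peak))
    tracemalloc.stop()
    return results, log
```

If the current-bytes column grows much faster than the size of the accumulated results, that hints at something being retained between batches. If it stays flat while the process as a whole still grows, the memory is more likely held outside Python's allocator.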
