Comments (11)

Danial-Alh commented on June 7, 2024

Could you please give more information about the dataset you are using and post a sample of it here? Is it publicly available?
I duplicated my own dataset to reach 1.5 million samples, each about 10 words long on average, and did not run into any problem.

from fast-bleu.

StevenTang1998 commented on June 7, 2024

For example, reference_corpus = [[str(i) + str(j) for i in range(100)] for j in range(70000)].

StevenTang1998 commented on June 7, 2024

My guess is that it is caused by the vocabulary size.
With reference_corpus = [['a' for i in range(100)] for j in range(70000)], it works.
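
To make the vocabulary-size guess concrete, here is a rough sketch that counts the distinct n-grams (orders 1 to 4) in scaled-down versions of the two corpora above. The helper name `distinct_ngrams` and the reduced size of 700 references are my own choices for illustration. Whether fast_bleu stores n-grams this way internally is an assumption, so treat the count only as a proxy for how large any n-gram table would have to be.

```python
def distinct_ngrams(corpus, max_n=4):
    """Count distinct n-grams of order 1..max_n across all references.

    Hypothetical helper for illustration; not part of fast_bleu.
    """
    seen = set()
    for tokens in corpus:
        for n in range(1, max_n + 1):
            # Zipping n shifted views of the token list yields every n-gram as a tuple.
            seen.update(zip(*(tokens[i:] for i in range(n))))
    return len(seen)


# Scaled-down versions of the two corpora (700 references instead of 70000).
varied = [[str(i) + str(j) for i in range(100)] for j in range(700)]
uniform = [['a' for i in range(100)] for j in range(700)]

print("varied corpus: ", distinct_ngrams(varied))
print("uniform corpus:", distinct_ngrams(uniform))  # 4: one n-gram per order
```

The uniform corpus collapses to a single n-gram per order, while the varied corpus contributes new n-grams with almost every reference, so the count grows with the number of references.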

Danial-Alh commented on June 7, 2024

Yes, your guess is close.
I ran your example without any issue, but the large vocabulary caused a large amount of memory consumption.

Does your machine have enough memory?

StevenTang1998 commented on June 7, 2024

Yes, I am running this code on a server with 256GB of memory, and the code above was using only about 3GB when it failed. My guess is that the issue is due to the memory limit on local variables inside functions.
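
If it helps to check that guess, here is a minimal sketch using Python's standard `resource` module (Unix only). It reports the process's peak resident memory and the soft stack limit, which is the limit that applies to local variables in native code. The function name `memory_report` and the dictionary keys are hypothetical, and nothing in this thread confirms that the stack limit is the actual cause.

```python
import resource


def memory_report():
    """Return peak resident set size and the soft stack limit of this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    soft_stack, _hard_stack = resource.getrlimit(resource.RLIMIT_STACK)
    return {
        # On Linux, ru_maxrss is reported in kilobytes.
        "peak_rss_kb": usage.ru_maxrss,
        # A value equal to resource.RLIM_INFINITY means the stack is unlimited.
        "stack_soft_limit_bytes": soft_stack,
    }


report = memory_report()
print(report)
```

Calling `memory_report()` just before the failing call shows how far the process is from the machine's memory capacity and what stack limit it is running under.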

Danial-Alh commented on June 7, 2024

Excuse me, what are your OS, Python, and fast_bleu versions?

As I see in your stack trace, there is something about threading.
Is the code you are running exactly like this?

from fast_bleu import BLEU

bleu = BLEU(reference_corpus, weights)

If your Python code runs in multi-threaded mode, could you please test it in standard, single-threaded mode?
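
For anyone collecting the same details, the sketch below prints the OS, the Python version, and the installed package version using only the standard library. The function name `environment_report` is hypothetical, and the distribution name "fast-bleu" is an assumption about how the package is registered with pip. `importlib.metadata` requires Python 3.8 or newer, so the code falls back to a note on older interpreters.

```python
import platform


def environment_report(package="fast-bleu"):
    """Return three lines describing the OS, Python, and package version."""
    lines = [
        f"OS: {platform.platform()}",
        f"Python: {platform.python_version()}",
    ]
    try:
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # importlib.metadata was added to the standard library in Python 3.8.
        lines.append(f"{package}: unknown (importlib.metadata needs Python 3.8+)")
        return lines
    try:
        lines.append(f"{package}: {version(package)}")
    except PackageNotFoundError:
        lines.append(f"{package}: not installed")
    return lines


for line in environment_report():
    print(line)
```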

StevenTang1998 commented on June 7, 2024

I have tested on three servers and hit the same error on each.
OS: Ubuntu 16.04.1 LTS (GNU/Linux 4.4.0-193-generic x86_64), Ubuntu 16.04.5 LTS (GNU/Linux 4.15.0-120-generic x86_64), Ubuntu 18.04.2 LTS (GNU/Linux 4.15.0-121-generic x86_64)
Memory: 256GB, 128GB, 64GB
Python: 3.7.4, 3.7.4, 3.7.7
fast_bleu: 0.0.86, 0.0.86, 0.0.86

The entire code is shown as follows, without using multi-threading:

from fast_bleu import BLEU

reference_corpus = [[str(i) + str(j) for i in range(100)] for j in range(70000)]
weight = {'4': (.25, .25, .25, .25)}
BLEU(reference_corpus, weight)

Danial-Alh commented on June 7, 2024

Thank you for the extra information.
Could you please install the 0.0.87 version and test it?

StevenTang1998 commented on June 7, 2024

Thanks for your help. It now runs successfully on my server with GCC version 7.5.0.
However, on the server with GCC version 5.4.0 it failed to install. Could you lower the required GCC version so that the package can be used more widely?

Danial-Alh commented on June 7, 2024

I created a new branch, gcc-downgrade, and lowered the compilation flag to C++11.
I tested it and there seems to be no issue.

Could you please clone the repo and test whether the gcc-downgrade branch works?
Please uninstall the currently installed version and install the new one with the python setup.py install command.

StevenTang1998 commented on June 7, 2024

I can run it successfully with GCC version 5.4.0. Thank you very much for your help.

Best wishes!
