xiamx / fasttext

This project forked from facebookresearch/fasttext

Windows Build of fastText, library for text representation and classification.

Home Page: http://cs.mcgill.ca/~mxia3/FastText-for-Windows/

License: Other

Makefile 0.09% Shell 2.00% Python 4.35% C++ 7.05% Perl 0.12% CMake 0.13% JavaScript 10.13% HTML 73.93% CSS 2.18%
fasttext nlp text-classification win64 windows word-embedding word-embeddings

fasttext's Issues

Optimization of hyperparameters does not work

Hyperparameter optimization (autotune) doesn't work:
Unknown argument: -autotune-validation (etc.)

This bug was present in the original fastText and has already been fixed there.
Could you rebuild the binaries from the current code?

Same data, same code, different thread count, different results. Why?

With the same data and the same code, changing only the thread count gives different results. Why?

fastText.exe supervised -input train.dat -output model -epoch 25 -loss softmax -ws 5 -thread 10
fastText.exe test model.bin test.dat
P@1 0.874
R@1 0.874

fastText.exe supervised -input train.dat -output model -epoch 25 -loss softmax -ws 5 -thread 20
fastText.exe test model.bin test.dat
P@1 0.939
R@1 0.939

Line count of the text file differs from the number of prediction results

My text file has 1537584 lines, but the prediction process produces only 1537490 results.
I have already checked that the file does not contain any empty lines.

After I reported this issue to facebookresearch/fasttext, they said it might be an old bug.
(facebookresearch#344)

Do you have a Windows binary built from the latest code? It would really help me! My environment is Windows Server 2016, and your binary is the only build I can run. I ran into several problems with the R/Python fastText wrappers; only yours runs successfully.
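One way to narrow down a mismatch like this is to count the input and output lines independently and list input lines that contain no tokens at all. The following is a minimal sketch, not a fix: the helper names `count_lines` and `tokenless_lines` are hypothetical, and the separator set is an assumption about how fastText splits tokens rather than something stated in this issue.

```python
# Characters treated as token separators here. This set is an assumption
# about fastText's tokenizer, not something confirmed in the issue.
SEPARATORS = b" \n\r\t\v\f\0"


def count_lines(path):
    """Count newline-terminated lines, plus a final unterminated line if any."""
    count = 0
    last = b"\n"
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so that very large files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            count += chunk.count(b"\n")
            last = chunk[-1:]
    return count + (0 if last == b"\n" else 1)


def tokenless_lines(path):
    """Return 1-based numbers of lines made up only of separator characters."""
    suspects = []
    with open(path, "rb") as handle:
        # Iterating a binary file splits on b"\n" only, so a stray "\r"
        # stays inside its line instead of starting a new one.
        for number, raw in enumerate(handle, start=1):
            if not raw.strip(SEPARATORS):
                suspects.append(number)
    return suspects
```

Comparing `count_lines` on the input file and on the saved prediction output gives the size of the gap, and `tokenless_lines` on the input points at lines worth inspecting by hand, for example lines that look non-empty in an editor but hold only tabs or carriage returns.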

Creates empty model file

Using the following training and test data:
`

# 1 is positive, 0 is negative

f = open('train.txt', 'w')
f.write('__label__1 i love you\n')
f.write('__label__1 he loves me\n')
f.write('__label__1 she likes baseball\n')
f.write('__label__0 i hate you\n')
f.write('__label__0 sorry for that\n')
f.write('__label__0 this is awful')
f.close()

f = open('test.txt', 'w')
f.write('sorry hate you')
f.close()
`

Running fasttext supervised -input train.txt -output model -dim 2 yields a model.bin file with 0 bytes

Unable to use pretrained model

Facebook has published pretrained vectors. I tried using them with this Windows build, but the command takes a very long time: it has been running for at least 15 minutes. Will it work at all?

I also noticed that the binaries produced are about 0.16 times the size of the self-built binaries. Why is that?
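Before waiting on a long-running load, it can help to confirm that the pretrained file itself is readable. The sketch below assumes the vectors are in the text `.vec` format (a header line with vocabulary size and dimension, then one word followed by its values per line); the issue does not say which file or command was used, and `peek_vec` is a hypothetical helper name.

```python
def peek_vec(path, limit=5):
    """Read the header and the first `limit` entries of a text-format .vec file."""
    with open(path, "r", encoding="utf-8", errors="replace", newline="\n") as handle:
        header = handle.readline().split()
        if len(header) != 2:
            raise ValueError("expected a header line of the form '<vocab_size> <dim>'")
        vocab_size, dim = int(header[0]), int(header[1])

        entries = []
        for _ in range(limit):
            line = handle.readline()
            if not line:
                break  # the file has fewer entries than `limit`
            parts = line.split()
            if len(parts) < dim + 1:
                raise ValueError("entry has fewer values than the header dimension")
            # The last `dim` fields are the vector; whatever precedes them is the word.
            word = " ".join(parts[:-dim])
            values = [float(x) for x in parts[-dim:]]
            entries.append((word, values))
    return vocab_size, dim, entries
```

Because only the first few lines are read, this returns quickly even for multi-gigabyte files, which makes it easier to tell a slow load apart from a damaged download.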

fasttext issue

After downloading the files provided at
https://github.com/xiamx/fastText/releases
what should I do next?
I extracted the files but am not able to open them. Can anyone explain what I need to do after extracting them?
How do I get fastText running?
Please reply as soon as possible.

Assertion failed when using predict or predict-prob

Hi, I am using your fastText binary. Everything works fine, especially applying nn to a model I trained myself. Using predict or predict-prob on this or any other model produces the following error:

Assertion failed: A.m_ == m_, file src\vector.cc, line 93

What's wrong here?

FastText for Information Extraction

I'm trying to train FastText to perform Information Extraction (treated as a text classification problem) on a big corpus where the positive examples (speakers) are not organized one per line, as in the paragraph below.
Can FastText perform the classification based on this kind of input?

"The Student Alumni Relations Council (SARC) invites you to a Career Talk featuring __label__speaker Andrew Gault HS'80 HNZ'94 and three university speakers, __label__speaker Mary Francis McLaughlin (volunteer and community service opportunities), __label__speaker Jessie Ramey (research opportunities) and __label__speaker Judi Mancuso (part-time work-study summer and internship opportunities) on Tuesday January 31 at 7:00 P M in the Carnegie Conference Room 1st Floor Warner Hall."
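To make the question concrete, here is a rough sketch of how a line like the one above is commonly understood to be read in supervised mode: the line is split on whitespace, tokens starting with the label prefix are collected as labels for the whole line, and everything else is treated as words. This is an assumption based on the documented input format, not a copy of fastText's code, and `split_example` is a hypothetical helper.

```python
LABEL_PREFIX = "__label__"  # the default prefix used in the examples on this page


def split_example(line, label_prefix=LABEL_PREFIX):
    """Split one input line into (labels, words) by the label prefix."""
    labels, words = [], []
    for token in line.split():
        if token.startswith(label_prefix):
            labels.append(token)
        else:
            words.append(token)
    return labels, words


# A shortened excerpt of the paragraph from the issue.
sample = (
    "a Career Talk featuring __label__speaker Andrew Gault HS'80 HNZ'94 "
    "and three university speakers, __label__speaker Mary Francis McLaughlin"
)
labels, words = split_example(sample)
print(sorted(set(labels)), len(words))
```

Under this reading, the whole paragraph becomes a single example tagged `__label__speaker`, and the position of each marker inside the text carries no information. Marking individual names would therefore need one candidate per line, each with its own label.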

FastText returns no results after testing

To explore FastText, I prepared a train.txt containing 90 examples tagged with a single label and a test.txt containing 4 examples. Training completes successfully, but I get no results when testing the model.
Is this due to the number of examples in the train and test files?
