kimiamania / mitlm
License: BSD 3-Clause "New" or "Revised" License
What steps will reproduce the problem?
1. Download the latest version (0.4.1)
2. Run make -j
What is the expected output? What do you see instead?
Compilation errors related to the 'fortran_wrapper'.
What version of the product are you using? On what operating system?
Version 0.4.1 on Mac OS X 10.9.2
Please provide any additional information below.
The errors:
src/optimize/fortran_wrapper.c:38:6: error: function cannot return function
type 'void (int *, int *, double *, double *, double *, int *, double *,
double *, double *, double *, double *, int *, char *, int *, char *, int *, int *, double *)'
void setulb_f77(int *n, int *m, double *x, double *l, double *u, int *nbd,
^
src/optimize/fortran_wrapper.c:36:29: note: expanded from macro 'setulb_f77'
#define setulb_f77 F77_FUNC (setulb, SETULB)
^
src/optimize/fortran_wrapper.c:38:6: error: a parameter list without types is
only allowed in a function definition
src/optimize/fortran_wrapper.c:36:30: note: expanded from macro 'setulb_f77'
#define setulb_f77 F77_FUNC (setulb, SETULB)
^
src/optimize/fortran_wrapper.c:46:6: error: function cannot return function
type 'void (int *, int *, double *, double *, double *, int *, double *, int
*, double *, double *, double *, int *)'
void lbfgs_f77(int *n, int *m, double *x, double *f, double *g,
^
src/optimize/fortran_wrapper.c:44:28: note: expanded from macro 'lbfgs_f77'
#define lbfgs_f77 F77_FUNC (lbfgs, LBFGS)
^
src/optimize/fortran_wrapper.c:46:6: error: a parameter list without types is
only allowed in a function definition
src/optimize/fortran_wrapper.c:44:29: note: expanded from macro 'lbfgs_f77'
#define lbfgs_f77 F77_FUNC (lbfgs, LBFGS)
^
src/optimize/fortran_wrapper.c:56:2: error: use of undeclared identifier
'setulb'
setulb_f77(n, m, x, l, u, nbd, f, g, factr, pgtol, wa,
^
src/optimize/fortran_wrapper.c:36:30: note: expanded from macro 'setulb_f77'
#define setulb_f77 F77_FUNC (setulb, SETULB)
^
src/optimize/fortran_wrapper.c:64:2: error: use of undeclared identifier 'lbfgs'
lbfgs_f77(n, m, x, f, g, diagco, diag, iprint, eps,
^
src/optimize/fortran_wrapper.c:44:29: note: expanded from macro 'lbfgs_f77'
#define lbfgs_f77 F77_FUNC (lbfgs, LBFGS)
Original issue reported on code.google.com by [email protected]
on 10 Apr 2014 at 5:23
What steps will reproduce the problem?
1. configure
2. make
What is the expected output? What do you see instead?
Expected Output: compilation with no errors.
Instead: I see "fatal error: 'tr1/unordered_map' file not found"
What version of the product are you using? On what operating system?
Latest version (0.4.1), on Mac OS X 10.9.2
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 10 Apr 2014 at 4:33
What steps will reproduce the problem?
1. estimate-ngram -h
What is the expected output? What do you see instead?
expected: help message
actual: estimate-ngram: error while loading shared libraries: libmitlm.so.0:
cannot open shared object file: No such file or directory
What version of the product are you using? On what operating system?
Using revision 48 on Ubuntu 11.10.
Please provide any additional information below.
After searching I found that libmitlm.so.0 is located at
/usr/local/lib/libmitlm.so.0
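The usual fix for this symptom is to tell the dynamic loader about /usr/local/lib, which is not on the default search path on some distributions. A hedged sketch (the install prefix is taken from the report above; adjust to your system):

```shell
# Per-shell fix: add the install directory to the loader's search path.
export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH

# System-wide alternative: refresh the loader cache (run as root):
#   sudo ldconfig /usr/local/lib

# After either, `estimate-ngram -h` should print the help message.
```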
Original issue reported on code.google.com by [email protected]
on 4 Jun 2012 at 4:59
What steps will reproduce the problem?
1. checkout revision 48
2. copy Makefile.example to Makefile
3. compile with "make -j"
What is the expected output? What do you see instead?
I expected a successful compilation. Instead I get a failed compilation with
non-zero return code
What version of the product are you using? On what operating system?
I'm using a clean checkout of revision 48 of mitlm. I'm compiling it on Mac OS
X Lion (10.7).
Please provide any additional information below.
I managed to solve the problem by performing the following steps:
1. Install Fortran with Homebrew: "brew install gfortran"
2. Add this setting to your Makefile after the FFLAGS line: "FC = gfortran"
3. Change the LDFLAGS line to "LDFLAGS = -L. -lgfortran -lmitlm"
4. Create a symlink to your libgfortran library: "ln -s /usr/local/Cellar/gfortran/4.2.4-5666.3/lib/gcc/i686-apple-darwin11/4.2.1/x86_64/libgfortran.a"
Once I did this, I was able to compile mitlm with no errors.
I haven't yet used the binaries for anything, but I noticed that running the
interpolate-ngram command without any input produces a segmentation fault:
$ ./interpolate-ngram
Interpolating component LMs...
Tying parameters across n-gram order...
Segmentation fault: 11
Since I've never used it before, I'm not sure if this is just me.
Original issue reported on code.google.com by [email protected]
on 25 Aug 2011 at 2:37
My Input:
interpolate-ngram -lm "test_model.lm, lm_giga_5k_nvp_3gram.arpa" -wl
combined_lang_model.lm -verbose
test_model.lm is one I created. It interpolates fine by itself.
lm_giga_5k_nvp_3gram.arpa does not work even if you interpolate it by itself.
What version of the product are you using? On what operating system?
version 48
Please provide any additional information below.
interpolate-ngram: src/NgramModel.cpp:329: void
NgramModel::LoadLM(std::vector<DenseVector<double>,
std::allocator<DenseVector<double> > >&, std::vector<DenseVector<double>,
std::allocator<DenseVector<double> > >&, ZFile&): Assertion `p >=
&line[lineLen]' failed.
Aborted
Original issue reported on code.google.com by [email protected]
on 14 Jul 2011 at 4:21
What steps will reproduce the problem?
1. Try to load arpa lm using evaluate-lm
I have tried with n-grams estimated using both MITLM and other toolsets.
What is the expected output? What do you see instead?
0.000 Loading LM exp/arpa/imd/imd.p7E-8.arpa...
terminate called after throwing an instance of 'std::invalid_argument'
what(): Unexpected file format.
What version of the product are you using? On what operating system?
Used both latest SVN trunk and 0.4.1 on Ubuntu Linux
Please provide any additional information below.
The problem seems to be on line 293 of NgramModel.cpp
"if (sscanf(line, "\\%u-ngrams:", &i) != 1 || i != o) {"
On line 382 we see the problem, when writing out the n-gram:
"fprintf(lmFile, "\n\\%lu-grams:\n", (unsigned long)o);"
Thus the fix is fairly straightforward; just change line 293 to:
"if (sscanf(line, "\\%u-grams:", &i) != 1 || i != o) {"
Original issue reported on code.google.com by [email protected]
on 18 May 2014 at 9:52
Dear all,
I've tried to install mitlm-0.4 to test Phonetisaurus, but the build failed
because of problems with the Fortran code and the Makefile (a missing library).
I've spent a long time on this, but still can't find the cause of the errors.
Please help me get out of this problem!
What version of the product are you using? On what operating system?
- Cygwin with gcc 4.3.4 (20090804)
- Windows 7
- mitlm-0.4
For additional information, please refer to the attached file (for the error
message).
.................................
gfortran -g -fPIC -fmessage-length=0 -O3 -DNDEBUG -funroll-loops -c -o
src/optimize/lbfgsb.o src/optimize/lbfgsb.f
f951: warning: -fPIC ignored for target (all code is position independent)
gfortran -g -fPIC -fmessage-length=0 -O3 -DNDEBUG -funroll-loops -c -o
src/optimize/lbfgs.o src/optimize/lbfgs.f
f951: warning: -fPIC ignored for target (all code is position independent)
ar rcs libmitlm.a src/util/RefCounter.o src/util/Logger.o
src/util/CommandOptions.o src/Vocab.o src/NgramVector.o src/NgramModel.o
src/NgramLM.o src/InterpolatedNgramLM.o src/Smoothing.o
src/MaxLikelihoodSmoothing.o src/KneserNeySmoothing.o src/PerplexityOptimizer.o
src/WordErrorRateOptimizer.o src/Lattice.o src/optimize/lbfgsb.o
src/optimize/lbfgs.o
g++ -g -Wall -fPIC -fmessage-length=0 -Isrc -O3 -DNDEBUG -funroll-loops -c -o
src/estimate-ngram.o src/estimate-ngram.cpp
src/estimate-ngram.cpp:1: warning: -fPIC ignored for target (all code is
position independent)
In file included from
/usr/lib/gcc/i686-pc-cygwin/4.3.4/include/c++/ext/hash_map:64,
from src/util/CommandOptions.h:37,
from src/estimate-ngram.cpp:36:
/usr/lib/gcc/i686-pc-cygwin/4.3.4/include/c++/backward/backward_warning.h:33:2:
warning: #warning This file includes at least one deprecated or antiquated
header which may be removed without further notice at a future date. Please use
a non-deprecated interface with equivalent functionality instead. For a listing
of replacement headers and interfaces, consult the file backward_warning.h. To
disable this warning use -Wno-deprecated.
g++ src/estimate-ngram.o -o estimate-ngram -L. -lgfortran -lmitlm -O3
-funroll-loops
./libmitlm.a(lbfgs.o): In function `mcsrch':
/home/seng/Phonetisaurus/mitlm.0.4/src/optimize/lbfgs.f:670: undefined
reference to `__gfortran_st_write'
/home/seng/Phonetisaurus/mitlm.0.4/src/optimize/lbfgs.f:670: undefined
reference to `__gfortran_st_write_done'
./libmitlm.a(lbfgs.o): In function `lb1':
/home/seng/Phonetisaurus/mitlm.0.4/src/optimize/lbfgs.f:469: undefined
reference to `__gfortran_st_write'
/home/seng/Phonetisaurus/mitlm.0.4/src/optimize/lbfgs.f:469: undefined
reference to `__gfortran_transfer_integer'
/home/seng/Phonetisaurus/mitlm.0.4/src/optimize/lbfgs.f:469: undefined
reference to `__gfortran_transfer_integer'
/home/seng/Phonetisaurus/mitlm.0.4/src/optimize/lbfgs.f:469: undefined
reference to `__gfortran_transfer_real'
/home/seng/Phonetisaurus/mitlm.0.4/src/optimize/lbfgs.f:469: undefined
reference to `__gfortran_transfer_real'
/home/seng/Phonetisaurus/mitlm.0.4/src/optimize/lbfgs.f:469: undefined
reference to `__gfortran_transfer_real'
/home/seng/Phonetisaurus/mitlm.0.4/src/optimize/lbfgs.f:469: undefined
reference to `__gfortran_st_write_done'
./libmitlm.a(lbfgs.o):/home/seng/Phonetisaurus/mitlm.0.4/src/optimize/lbfgs.f:480: more undefined references to `__gfortran_transfer_real' follow
./libmitlm.a(lbfgs.o): In function `lb1':
/home/seng/Phonetisaurus/mitlm.0.4/src/optimize/lbfgs.f:480: undefined
reference to `__gfortran_st_write_done'
./libmitlm.a(lbfgs.o): In function `lbfgs':
/home/seng/Phonetisaurus/mitlm.0.4/src/optimize/lbfgs.f:246: undefined
reference to `__gfortran_st_write'
/home/seng/Phonetisaurus/mitlm.0.4/src/optimize/lbfgs.f:246: undefined
reference to `__gfortran_st_write_done'
.................
./libmitlm.a(lbfgsb.o): In function `setulb':
/home/seng/Phonetisaurus/mitlm.0.4/src/optimize/lbfgsb.f:196: undefined
reference to `__gfortran_compare_string'
collect2: ld returned 1 exit status
make: *** [estimate-ngram] Error 1
Thank you in advance for your help !
Seng,
Original issue reported on code.google.com by [email protected]
on 15 Feb 2013 at 12:25
Attachments:
1. I decompressed the mitlm-0.4.1.tar.gz file on OS X Yosemite.
From Terminal:
2. ./compile
3. make -j
Error:
Input is: estimate-ngram -text mysent.txt -write-lm mysent.lm
But then I get: estimate-ngram: command not found
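"command not found" here just means the shell cannot find the freshly built binary: the build directory is not on PATH, and `make install` has not been run. A small stand-alone demonstration of the principle (the script below is a dummy stand-in, not the real estimate-ngram):

```shell
# Build a dummy "estimate-ngram" in a local directory.
mkdir -p demo_build
printf '#!/bin/sh\necho built ok\n' > demo_build/estimate-ngram
chmod +x demo_build/estimate-ngram

# Invoking it by bare name fails, because the directory is not on PATH...
estimate-ngram 2>/dev/null || echo "not on PATH"

# ...while an explicit path works.
demo_build/estimate-ngram
```

So for the reporter: either run ./estimate-ngram from the build directory, or run make install (which typically places it in /usr/local/bin) and retry.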
Original issue reported on code.google.com by [email protected]
on 19 Jun 2015 at 9:31
I have two 4-gram models, 191567768 and 38095008 bytes in MITLM binary format.
When I use interpolate-ngram to linearly interpolate (LI) them without any
perplexity optimization, it works fine. However, when I add the
--optimize-perplexity option, I get a segmentation fault.
This is what happens:
$ ~/lbin/mitlm-svn/interpolate-ngram -l build/lm/tmp/model1.mitlm
model2.mitlm -o 4 --optimize-perplexity dev.txt -write-lm out.arpa.gz
Loading component LM model1.mitlm...
Loading component LM model2.mitlm...
Interpolating component LMs...
Interpolation Method = LI
Loading development set dev.txt...
Optimizing 1 parameters...
Segmentation fault (core dumped)
Backtrace from gdb:
(gdb) bt
#0 0x0000000000430c9c in InterpolatedNgramLM::_EstimateProbsMasked
(this=0x7fffa952a110, params=@0x7fffa952a210, pMask=0x5b4280) at
src/InterpolatedNgramLM.cpp:342
#1 0x00000000004316dd in InterpolatedNgramLM::Estimate
(this=0x7fffa952a110, params=@0x7fffa952a3a0, pMask=0x5b4280) at
src/InterpolatedNgramLM.cpp:214
#2 0x0000000000441f7a in PerplexityOptimizer::ComputeEntropy
(this=0x7fffa952a250, params=@0x7fffa952a3a0) at src/PerplexityOptimizer.cpp:61
#3 0x0000000000443381 in
PerplexityOptimizer::ComputeEntropyFunc::operator() (this=0x7fffa9529f90,
params=@0x7fffa952a3a0) at src/PerplexityOptimizer.h:64
#4 0x0000000000445076 in
MinimizeLBFGSB<PerplexityOptimizer::ComputeEntropyFunc>
(func=@0x7fffa9529f90, x=@0x7fffa952a3a0, numIter=@0x7fffa9529f8c,
step=1e-08, factr=10000000,
pgtol=1.0000000000000001e-05, maxIter=15000) at src/optimize/LBFGSB.h:79
#5 0x0000000000442643 in PerplexityOptimizer::Optimize
(this=0x7fffa952a250, params=@0x7fffa952a3a0, technique=LBFGSBOptimization)
at src/PerplexityOptimizer.cpp:122
#6 0x000000000046db3e in main (argc=10, argv=0x7fffa952aab8) at
src/interpolate-ngram.cpp:277
A similar thing happens with -i CM, but it crashes earlier:
$ ~/lbin/mitlm-svn/interpolate-ngram -l build/lm/tmp/model1.mitlm
model2.mitlm -o 4 --optimize-perplexity dev.txt -write-lm out.arpa.gz -i CM
Interpolating component LMs...
Interpolation Method = CM
Loading counts for model1.mitlm from log:model1.counts...
Loading counts for model2.mitlm from log:model2.counts...
Loading development set dev.txt...
Segmentation fault (core dumped)
(gdb) bt
#0 0x0000000000429924 in Copy<unsigned char const*, unsigned char*>
(input=0xa0 <Address 0xa0 out of bounds>, begin=0x2aaad1028010 "",
end=0x2aaad15da9f0 "")
at src/util/FastIO.h:56
#1 0x000000000042db71 in DenseVector<unsigned char>::operator=
(this=0x5b4810, v=@0x5b3ea0) at src/vector/DenseVector.tcc:146
#2 0x0000000000431974 in InterpolatedNgramLM::GetMask
(this=0x7fff0e36af40, probMaskVectors=@0x7fff0e36ad30,
bowMaskVectors=@0x7fff0e36ad10) at src/InterpolatedNgramLM.cpp:153
#3 0x0000000000442c6e in PerplexityOptimizer::LoadCorpus
(this=0x7fff0e36b080, corpusFile=@0x7fff0e36b2f0) at
src/PerplexityOptimizer.cpp:55
#4 0x000000000046db01 in main (argc=12, argv=0x7fff0e36b8e8) at
src/interpolate-ngram.cpp:274
BTW, the same thing works with small toy models.
I'm using MITLM from SVN, Linux, amd64.
Original issue reported on code.google.com by [email protected]
on 10 Dec 2008 at 1:41
What steps will reproduce the problem?
Create a large counts file in which there is an n-gram (e.g. "foo bar baz")
whose suffix n-gram ("bar baz") doesn't exist earlier in the file.
Run `estimate-ngram -wl lm.arpa -counts counts` on it.
Note this doesn't always happen consistently for me with smaller count files,
but seems to replicate fairly consistently with larger (or at least
middle-sized) files.
What is the expected output? What do you see instead?
I'd ideally expect it to allow a language model to be built in this case, even if
it means removing/skipping over the ngram in question, or making some
assumption about the count for the missing suffix (e.g. same as the
higher-order ngram).
I realise that these missing suffixes won't occur if I use MITLM itself to
compute the counts from a corpus, however if dealing with large amounts of
count-based source data from some other tools/sources, it's possible for these
kinds of constraints to be violated accidentally due to data corruption or bugs
beyond your control, and so it would be convenient if MITLM could cope
gracefully with these cases.
Alternatively if this is a WONTFIX then it would be good to at least document
what the constraint is on acceptable input for counts files, and give a more
friendly error message if the constraint is violated, so people know how to fix
up their input files in order to get MITLM to work.
Currently what you see is:
estimate-ngram: src/NgramModel.cpp:811: void
mitlm::NgramModel::_ComputeBackoffs(): Assertion `allTrue(backoffs !=
NgramVector::Invalid)' failed.
Aborted (core dumped)
What version of the product are you using? On what operating system?
Built from latest github master, Ubuntu 14.04.1
Cheers!
Original issue reported on code.google.com by [email protected]
on 11 Feb 2015 at 12:08
The crash only happens if the ngram order is higher than 1, and only if the #
occurs at the start of a token.
I'm guessing this is because it interprets a # at the beginning of a line in a
text counts file as a comment and skips it, meaning a unigram beginning with a
# is missing from the term dictionary when it's encountered in a later bigram.
What steps will reproduce the problem?
$ estimate-ngram -wc counts -text <(echo 'a #hashtag')
0.001 Loading corpus /dev/fd/63...
0.002 Smoothing[1] = ModKN
0.002 Smoothing[2] = ModKN
0.002 Smoothing[3] = ModKN
0.002 Set smoothing algorithms...
0.002 Saving counts to counts...
$ cat counts
<s> 1
a 1
#hashtag 1
<s> a 1
a #hashtag 1
#hashtag </s> 1
<s> a #hashtag 1
a #hashtag </s> 1
$ estimate-ngram -counts counts -wl lm.arpa
0.001 Loading counts counts...
estimate-ngram: src/NgramModel.cpp:800: void
mitlm::NgramModel::_ComputeBackoffs(): Assertion `allTrue(backoffs !=
NgramVector::Invalid)' failed.
Aborted (core dumped)
What version of the product are you using? On what operating system?
Built from latest master on github. Ubuntu 14.04.1
Original issue reported on code.google.com by [email protected]
on 10 Feb 2015 at 6:39
Hi,
I am trying to build and interpolate very small language models (most higher
order n-grams are unique). I am not able to interpolate the ARPAs, because it
always throws the following error:
$ estimate-ngram -order 3 -text 1.txt -s ML -wl 1.arpa
...
$ estimate-ngram -order 3 -text 2.txt -s ML -wl 2.arpa
...
$ interpolate-ngram -order 3 -lm "1.arpa, 2.arpa" -wl 3.arpa
...
interpolate-ngram: src/InterpolatedNgramLM.cpp:327: void
mitlm::InterpolatedNgramLM::_EstimateBows(): Assertion `!anyTrue(isnan(bows))'
failed.
Aborted (core dumped)
PS 1. The same happens with an open vocabulary (-unk).
PS 2. My minimalistic ARPA has most of the BOWs set to -99.
PS 3. There are quite a lot of n-grams with -log(p) == 0.00000 in the ARPA.
PS 4. I found out that the "</s>" 1-gram _DOES_NOT_ have a BOW in the ARPA.
PS 5. I am using mitlm-0.4.1.
Any ideas?
Original issue reported on code.google.com by [email protected]
on 22 Jul 2015 at 8:03
I have a background unigram model (bg.arpa), some additional training data
(train.txt) and some dev text (dev.txt). I want to create an interpolated
unigram that optimizes the perplexity of dev.txt. I also need open
vocabulary LM (-unk).
I execute:
$ interpolate-ngram -l bg.arpa -t train.txt -op dev.txt -o 1 -wf
"entropy:train.txt" -unk 1 -v etc/vocab
I get:
...
Loading component LM bg.arpa...
-unk with -lm is not implemented yet.
-- RefCounter----------
map[0x2aaaab5a5010] = 0
map[0x2aaaab0dc010] = 1
map[0x5a9c60] = 0
map[0x5a94f0] = 0
map[0x2aaaab1dd010] = 1
map[0x2aaaab018010] = 1
map[0x5a9f00] = 0
map[0x5a9cf0] = 0
-----------------------
Without -unk it seems to work fine.
OK, I understand it's not implemented, but maybe it's just a simple fix...
Otherwise, I think I know a workaround. Thanks.
Original issue reported on code.google.com by [email protected]
on 26 Feb 2009 at 4:42
What steps will reproduce the problem?
1. Download the latest 0.4.1 tarball
2. Experience failure to build
Solution:
Add #include <string> to FastIO.h, after #include <cstring>
Recipe to fix:
Use the Xcode toolchain, not Homebrew GCC
Download gfortran 4.9 from https://gcc.gnu.org/wiki/GFortranBinaries#MacOS
Check out the latest source: svn checkout http://mitlm.googlecode.com/svn/trunk/ mitlm-read-only
Add #include <string> to FastIO.h
autogen.sh
make
make install
Original issue reported on code.google.com by [email protected]
on 24 Nov 2014 at 1:13
Hello,
I tried to create a 4-gram language model with the help of your estimate-ngram
tool, which led to the following debug output:
0.000 Loading vocab wlist...
0.170 Loading corpus corpus.txt...
estimate-ngram: src/vector/DenseVector.tcc:406: void
DenseVector<T>::_allocate() [with T = int]: Assertion `_data' failed.
I used the command:
estimate-ngram -order 4 -v wlist -unk -t corpus.txt -wl arpa
When I try to create a trigram model from the same corpus, the tool runs the
task like a charm.
Original issue reported on code.google.com by [email protected]
on 18 Jun 2010 at 8:47
What steps will reproduce the problem?
1. Download mitlm-0.4 and extract it
2. Go to the mitlm-0.4 directory
3. Run make -j
What is the expected output? What do you see instead?
Successful compilation of the tool
What version of the product are you using? On what operating system?
mitlm-0.4
OS: Fedora 12
Please provide any additional information below.
The errors:
src/vector/VectorOps.h:168: error: expected primary-expression before ‘>’
token
src/vector/VectorOps.h:168: error: no matching function for call to ‘min()’
In file included from src/NgramModel.h:42,
from src/NgramModel.cpp:43:
src/Vocab.h: In member function ‘VocabIndex Vocab::Find(const char*) const’:
src/Vocab.h:78: error: ‘strlen’ was not declared in this scope
src/Vocab.h: In member function ‘VocabIndex Vocab::Add(const char*)’:
src/Vocab.h:80: error: ‘strlen’ was not declared in this scope
src/NgramModel.cpp: In member function ‘void
NgramModel::LoadCorpus(std::vector<DenseVector<int>,
std::allocator<DenseVector<int> > >&, ZFile&, bool)’:
src/NgramModel.cpp:93: error: ‘strncmp’ was not declared in this scope
src/NgramModel.cpp:93: error: ‘strcmp’ was not declared in this scope
src/NgramModel.cpp: In member function ‘void
NgramModel::LoadLM(std::vector<DenseVector<double>,
std::allocator<DenseVector<double> > >&, std::vector<DenseVector<double>,
std::allocator<DenseVector<double> > >&, ZFile&)’:
src/NgramModel.cpp:264: error: ‘strcmp’ was not declared in this scope
src/NgramModel.cpp:297: error: ‘strlen’ was not declared in this scope
src/NgramModel.cpp:323: error: ‘strcmp’ was not declared in this scope
src/NgramModel.cpp:344: error: ‘strcmp’ was not declared in this scope
src/NgramModel.cpp: In member function ‘void
NgramModel::LoadEvalCorpus(std::vector<DenseVector<int>,
std::allocator<DenseVector<int> > >&, std::vector<DenseVector<int>,
std::allocator<DenseVector<int> > >&, BitVector&, ZFile&, size_t&, size_t&)
const’:
src/NgramModel.cpp:478: error: ‘strncmp’ was not declared in this scope
src/NgramModel.cpp:478: error: ‘strcmp’ was not declared in this scope
src/NgramModel.cpp: In member function ‘void
NgramModel::LoadComputedFeatures(std::vector<DenseVector<double>,
std::allocator<DenseVector<double> > >&, const char*, size_t) const’:
src/NgramModel.cpp:586: error: ‘strcmp’ was not declared in this scope
src/NgramModel.cpp:611: error: ‘strcmp’ was not declared in this scope
src/NgramModel.cpp: In member function ‘void
NgramModel::_LoadFrequency(std::vector<DenseVector<double>,
std::allocator<DenseVector<double> > >&, ZFile&, size_t) const’:
src/NgramModel.cpp:858: error: ‘strcmp’ was not declared in this scope
src/NgramModel.cpp:870: error: ‘strncmp’ was not declared in this scope
src/NgramModel.cpp: In member function ‘void
NgramModel::_LoadEntropy(std::vector<DenseVector<double>,
std::allocator<DenseVector<double> > >&, ZFile&, size_t) const’:
src/NgramModel.cpp:937: error: ‘strcmp’ was not declared in this scope
src/NgramModel.cpp:951: error: ‘strncmp’ was not declared in this scope
make: *** [src/NgramModel.o] Error 1
In file included from /usr/lib/gcc/i686-redhat-linux/4.4.2/../../../../include/c++/4.4.2/ext/hash_map:59,
from src/util/RefCounter.h:38,
from src/util/SharedPtr.h:38,
from src/NgramLM.h:39,
from src/KneserNeySmoothing.cpp:37:
/usr/lib/gcc/i686-redhat-linux/4.4.2/../../../../include/c++/4.4.2/backward/backward_warning.h:28:2:
warning: #warning This file includes at least one deprecated or antiquated
header which may be removed without further notice at a future date. Please
use a non-deprecated interface with equivalent functionality instead. For a
listing of replacement headers and interfaces, consult the file
backward_warning.h. To disable this warning use -Wno-deprecated.
In file included from src/vector/DenseVector.tcc:40,
from src/vector/DenseVector.h:144,
from src/Types.h:40,
from src/NgramLM.h:41,
from src/KneserNeySmoothing.cpp:38:
src/util/FastIO.h: In function ‘bool getline(FILE*, char*, size_t)’:
src/util/FastIO.h:111: error: ‘strlen’ was not declared in this scope
src/util/FastIO.h: In function ‘bool getline(FILE*, char*, size_t,
size_t*)’:
src/util/FastIO.h:123: error: ‘strlen’ was not declared in this scope
src/util/FastIO.h: In function ‘void WriteHeader(FILE*, const char*)’:
src/util/FastIO.h:184: error: ‘strlen’ was not declared in this scope
src/util/FastIO.h: In function ‘void VerifyHeader(FILE*, const char*)’:
src/util/FastIO.h:237: error: ‘strlen’ was not declared in this scope
src/util/FastIO.h:239: error: ‘strncmp’ was not declared in this scope
In file included from src/Types.h:42,
from src/NgramLM.h:41,
from src/KneserNeySmoothing.cpp:38:
src/vector/VectorOps.h: In function ‘typename V::ElementType min(const
Vector<I>&)’:
src/vector/VectorOps.h:159: error: ‘numeric_limits’ is not a member of
‘std’
src/vector/VectorOps.h:159: error: expected primary-expression before ‘>’
token
src/vector/VectorOps.h:159: error: ‘::max’ has not been declared
src/vector/VectorOps.h: In function ‘typename V::ElementType max(const
Vector<I>&)’:
src/vector/VectorOps.h:168: error: ‘numeric_limits’ is not a member of
‘std’
src/vector/VectorOps.h:168: error: expected primary-expression before ‘>’
token
src/vector/VectorOps.h:168: error: no matching function for call to ‘min()’
In file included from src/Vocab.h:41,
from src/NgramLM.h:42,
from src/KneserNeySmoothing.cpp:38:
src/util/ZFile.h: In member function ‘bool ZFile::endsWith(const char*,
const char*)’:
src/util/ZFile.h:51: error: ‘strlen’ was not declared in this scope
src/util/ZFile.h:54: error: ‘strncmp’ was not declared in this scope
In file included from src/NgramLM.h:42,
from src/KneserNeySmoothing.cpp:38:
src/Vocab.h: In member function ‘VocabIndex Vocab::Find(const char*) const’:
src/Vocab.h:78: error: ‘strlen’ was not declared in this scope
src/Vocab.h: In member function ‘VocabIndex Vocab::Add(const char*)’:
src/Vocab.h:80: error: ‘strlen’ was not declared in this scope
In file included from src/vector/DenseVector.h:144,
from src/Types.h:40,
from src/NgramLM.h:41,
from src/KneserNeySmoothing.cpp:38:
src/vector/DenseVector.tcc: In member function ‘void DenseVector<T>::set(T)
[with T = double]’:
src/KneserNeySmoothing.cpp:163: instantiated from here
src/vector/DenseVector.tcc:366: error: ‘memset’ was not declared in this
scope
make: *** [src/KneserNeySmoothing.o] Error 1
In file included from /usr/lib/gcc/i686-redhat-linux/4.4.2/../../../../include/c++/4.4.2/ext/hash_map:59,
from src/util/RefCounter.h:38,
from src/vector/DenseVector.tcc:37,
from src/vector/DenseVector.h:143,
from src/Types.h:39,
from src/Lattice.h:41,
from src/Lattice.cpp:41:
/usr/lib/gcc/i686-redhat-linux/4.4.2/../../../../include/c++/4.4.2/backward/backward_warning.h:28:2:
warning: #warning This file includes at least one deprecated or antiquated
header which may be removed without further notice at a future date. Please
use a non-deprecated interface with equivalent functionality instead. For a
listing of replacement headers and interfaces, consult the file
backward_warning.h. To disable this warning use -Wno-deprecated.
In file included from src/Lattice.cpp:41:
src/util/FastIO.h: In function ‘bool getline(FILE*, char*, size_t)’:
src/util/FastIO.h:111: error: ‘strlen’ was not declared in this scope
src/util/FastIO.h: In function ‘bool getline(FILE*, char*, size_t,
size_t*)’:
src/util/FastIO.h:123: error: ‘strlen’ was not declared in this scope
src/util/FastIO.h: In function ‘void WriteHeader(FILE*, const char*)’:
src/util/FastIO.h:184: error: ‘strlen’ was not declared in this scope
src/util/FastIO.h: In function ‘void VerifyHeader(FILE*, const char*)’:
src/util/FastIO.h:237: error: ‘strlen’ was not declared in this scope
src/util/FastIO.h:239: error: ‘strncmp’ was not declared in this scope
In file included from src/Lattice.h:41,
from src/Lattice.cpp:42:
src/util/ZFile.h: In member function ‘bool ZFile::endsWith(const char*,
const char*)’:
src/util/ZFile.h:51: error: ‘strlen’ was not declared in this scope
src/util/ZFile.h:54: error: ‘strncmp’ was not declared in this scope
In file included from src/NgramLM.h:42,
from src/Lattice.h:43,
from src/Lattice.cpp:42:
src/Vocab.h: In member function ‘VocabIndex Vocab::Find(const char*) const’:
src/Vocab.h:78: error: ‘strlen’ was not declared in this scope
src/Vocab.h: In member function ‘VocabIndex Vocab::Add(const char*)’:
src/Vocab.h:80: error: ‘strlen’ was not declared in this scope
src/Lattice.cpp: In member function ‘void Lattice::LoadLattice(ZFile&)’:
src/Lattice.cpp:109: error: ‘strcmp’ was not declared in this scope
src/Lattice.cpp:112: error: ‘strcmp’ was not declared in this scope
make: *** [src/Lattice.o] Error 1
Original issue reported on code.google.com by [email protected]
on 17 Jan 2010 at 12:26
Hi Paul,
I've experienced some problems with the linear interpolation process and I am
not sure how to solve them. When I try to interpolate a 3-component LM I get
the warning:
--------
THE SEARCH DIRECTION IS NOT A DESCENT DIRECTION
IFLAG= -1
LINE SEARCH FAILED. SEE DOCUMENTATION OF ROUTINE MCSRCH
ERROR RETURN OF LINE SEARCH: INFO= 0
POSSIBLE CAUSES: FUNCTION OR GRADIENT ARE INCORRECT
OR INCORRECT TOLERANCES
--------
What I did was create three single language models using
estimate-ngram -order 3 -v wlist -unk true -t train1.txt -opt-perp opt1.txt -wl
arpa_a.gz
estimate-ngram -order 3 -v wlist -unk true -t train2.txt -opt-perp opt2.txt -wl
arpa_b.gz
estimate-ngram -order 3 -v wlist -unk true -t train3.txt -opt-perp opt3.txt -wl
arpa_c.gz
then unpacked the components
gzip -d -f arpa_a.gz
gzip -d -f arpa_b.gz
gzip -d -f arpa_c.gz
then interpolated using
interpolate-ngram -l "arpa_a,arpa_b,arpa_c" -opt-perp int-opt.txt -wl arpa_full
The process actually yields a language model I can use afterwards, but I am
not sure what the warning/error is about and what I have to do to fix it.
My system is Linux 2.6.31.12-0.2-desktop x86_64 with 8 GB RAM and a quad-core
AMD 2360SE.
Thanks in advance!
Original issue reported on code.google.com by [email protected]
on 25 Oct 2010 at 12:29
What steps will reproduce the problem?
1. Run estimate-ngram on an input text file with lines longer than 4096
characters, where the 4096th character is in the middle of a word.
2. Check the LM file for partial words created by splitting the word as above.
What is the expected output? What do you see instead?
In a very long line containing e.g. the word "defect", where "c" is the 4096th
character, the non-words "def" and "ect" appear in the LM.
What version of the product are you using? On what operating system?
0.4.1 on Ubuntu 12.04
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 18 Jul 2014 at 7:28
Hi, I would like to train an LM using mitlm. Is it possible to use UTF-8 encoded
data? I'm also interested in whether it is possible to invoke case-insensitive
handling of data. Thanks for the answers. Jan
Original issue reported on code.google.com by [email protected]
on 28 Sep 2010 at 9:56
Hi, I'm trying to interpolate two fairly straightforward 3-gram LMs with the
interpolate-ngram tool.
The command I'm running is,
-------------------
$ interpolate-ngram -o 3 -l lm1.arpa,lm2.arpa -wl lm1lm2.arpa
Loading component LM lm1.arpa...
Loading component LM lm2.arpa...
Segmentation fault
-------------------
The first LM was created with the estimate-ngram tool from a fairly small
training text (approx. 70 MB):
$ estimate-ngram -t lm1.txt -wl lm1.arpa -o 3
The second lm is the gigaword 64k NVP 3gram model from Keith Vertanen's
open source LM page,
http://www.keithv.com/software/giga/
My guess is that there is something about the KV model that
interpolate-ngram doesn't like, but it isn't terribly clear what that might be.
Also, neither of the vocabularies is a subset of the other (although I
don't know whether or not that is relevant).
Original issue reported on code.google.com by [email protected]
on 28 Feb 2010 at 1:47
When I use interpolate-ngram to interpolate two models with CM or GLI and
perplexity optimization, I get the following faults:
1st:
interpolate-ngram -lm "model1.lm, model2.lm" -smoothing ModKN -interpolation CM
-opt-perp dev-set.txt -write-lm CM-model.lm
Loading component LM model1.lm...
Loading component LM model2.lm...
Interpolating component LMs...
Tying parameters across n-gram order...
Interpolation Method = CM
Loading feature for model1.lm from log:sumhist:model1.effcounts...
terminate called after throwing an instance of 'std::runtime_error'
what(): Cannot open file
Aborted
2nd:
interpolate-ngram -lm "model1.lm, model2.lm" -smoothing ModKN -interpolation
GLI -opt-perp dev-set.txt -write-lm GLI-model.lm
Loading component LM model1.lm...
Loading component LM model2.lm...
Interpolating component LMs...
Tying parameters across n-gram order...
Interpolation Method = GLI
Segmentation fault
I'm using MITLM v0.4 from SVN under Linux, Intel i7.
Jan
Original issue reported on code.google.com by [email protected]
on 8 Sep 2010 at 7:11
I'm not sure if this is a feature or a bug, but when using estimate-ngram
with the -v option, words that are specified in the vocabulary but not seen
in the training data do not appear in the resulting LM. It would be nice if
there were a way to apply some discounting to also estimate the unigram
probabilities of unseen words (i.e. like SRILM's ngram-count does).
I'm using MITLM from SVN.
Original issue reported on code.google.com by [email protected]
on 12 Dec 2008 at 3:22
MITLM tools cannot load gzipped ARPA LM files, even those produced by
estimate-ngram or interpolate-ngram.
This is what happens:
$ ~/lbin/mitlm-svn/evaluate-ngram --read-lm tmp.arpa.gz
--evaluate-perplexity dev.txt
Loading LM tmp.arpa.gz...
terminate called after throwing an instance of 'std::invalid_argument'
what(): Unexpected file format.
Backtrace from gdb:
(gdb) bt
#0 0x00000035c102ee25 in raise () from /lib64/libc.so.6
#1 0x00000035c1030770 in abort () from /lib64/libc.so.6
#2 0x00000035c27c0f74 in __gnu_cxx::__verbose_terminate_handler () from
/usr/lib64/libstdc++.so.6
#3 0x00000035c27bf0b6 in std::set_unexpected () from /usr/lib64/libstdc++.so.6
#4 0x00000035c27bf0e3 in std::terminate () from /usr/lib64/libstdc++.so.6
#5 0x00000035c27bf1ca in __cxa_throw () from /usr/lib64/libstdc++.so.6
#6 0x00000000004181ac in NgramModel::LoadLM (this=0x5adff0,
probVectors=@0x7fff7925fc28, bowVectors=@0x7fff7925fc40,
lmFile=@0x7fff7925fe50)
at src/NgramModel.cpp:289
#7 0x0000000000426d1a in ArpaNgramLM::LoadLM (this=0x7fff7925fc10,
lmFile=@0x7fff7925fe50) at src/NgramLM.cpp:141
#8 0x000000000046c38f in main (argc=5, argv=0x7fff79260118) at
src/evaluate-ngram.cpp:150
I'm using MITLM from SVN, Linux, amd64.
I attached the tmp.arpa.gz file (produced with estimate-ngram).
Loading uncompressed ARPA files works fine.
Original issue reported on code.google.com by [email protected]
on 9 Dec 2008 at 1:30
Attachments:
I'm trying to follow the tutorial at
https://code.google.com/p/mitlm/wiki/Tutorial, and want to make sure I'm doing
it right. Where can I find the sample files like lectures.txt?
Original issue reported on code.google.com by [email protected]
on 14 Apr 2015 at 10:59
Error after the make command.
I want to build a language model, but the following error occurs.
Check the attached make file.
MITLM: latest version from the SVN repository.
Ubuntu 12.04, 32-bit system.
libtool: link: /usr/bin/nm -B src/util/.libs/CommandOptions.o
src/util/.libs/RefCounter.o src/util/.libs/Logger.o src/.libs/NgramLM.o
src/.libs/Vocab.o src/.libs/PerplexityOptimizer.o src/.libs/Lattice.o
src/.libs/Smoothing.o src/.libs/NgramModel.o src/.libs/NgramVector.o
src/.libs/MaxLikelihoodSmoothing.o src/.libs/KneserNeySmoothing.o
src/.libs/InterpolatedNgramLM.o src/optimize/.libs/lbfgs.o
src/optimize/.libs/lbfgsb.o src/optimize/.libs/fortran_wrapper.o
src/.libs/WordErrorRateOptimizer.o | sed -n -e 's/^.*[
]\([ABCDGIRSTW][ABCDGIRSTW]*\)[ ][ ]*\([_A-Za-z][_A-Za-z0-9]*\)$/\1 \2 \2/p'
| sed '/ __gnu_lto/d' | /bin/sed 's/.* //' | sort | uniq > .libs/libmitlm.exp
libtool: link: /bin/grep -E -e "mitlm" ".libs/libmitlm.exp" >
".libs/libmitlm.expT"
libtool: link: mv -f ".libs/libmitlm.expT" ".libs/libmitlm.exp"
libtool: link: g++ -fPIC -DPIC -shared -nostdlib
/usr/lib/gcc/i686-linux-gnu/4.6/../../../i386-linux-gnu/crti.o
/usr/lib/gcc/i686-linux-gnu/4.6/crtbeginS.o src/util/.libs/CommandOptions.o
src/util/.libs/RefCounter.o src/util/.libs/Logger.o src/.libs/NgramLM.o
src/.libs/Vocab.o src/.libs/PerplexityOptimizer.o src/.libs/Lattice.o
src/.libs/Smoothing.o src/.libs/NgramModel.o src/.libs/NgramVector.o
src/.libs/MaxLikelihoodSmoothing.o src/.libs/KneserNeySmoothing.o
src/.libs/InterpolatedNgramLM.o src/optimize/.libs/lbfgs.o
src/optimize/.libs/lbfgsb.o src/optimize/.libs/fortran_wrapper.o
src/.libs/WordErrorRateOptimizer.o -Wl,-rpath -Wl,/home/java/test/mitlm/.libs
-L. -lgfortran /home/java/test/mitlm/.libs/libmitlm.so
-L/usr/lib/gcc/i686-linux-gnu/4.6
-L/usr/lib/gcc/i686-linux-gnu/4.6/../../../i386-linux-gnu
-L/usr/lib/gcc/i686-linux-gnu/4.6/../../../../lib -L/lib/i386-linux-gnu
-L/lib/../lib -L/usr/lib/i386-linux-gnu -L/usr/lib/../lib
-L/usr/lib/gcc/i686-linux-gnu/4.6/../../.. -lstdc++ -lm -lc -lgcc_s
/usr/lib/gcc/i686-linux-gnu/4.6/crtendS.o
/usr/lib/gcc/i686-linux-gnu/4.6/../../../i386-linux-gnu/crtn.o -Wl,-soname
-Wl,libmitlm.so.0 -Wl,-retain-symbols-file -Wl,.libs/libmitlm.exp -o
.libs/libmitlm.so.0.0.0
g++: error: /home/java/test/mitlm/.libs/libmitlm.so: No such file or directory
make[1]: *** [libmitlm.la] Error 1
make[1]: Leaving directory `/home/java/test/mitlm'
make: *** [all-recursive] Error 1
Original issue reported on code.google.com by [email protected]
on 21 Feb 2015 at 7:18
Attachments:
When I run something like:
interpolate-ngram -l lm1.mitlm lm2.mitlm --write-lm tmp3.arpa.gz
--optimize-perplexity dev.txt
Loading component LM lm1.mitlm...
Loading component LM lm2.mitlm...
Interpolating component LMs...
Interpolation Method = LI
Loading development set dev.txt...
Segmentation fault (core dumped)
gdb shows:
(gdb) bt
#0 0x0000000000447d9f in PerplexityOptimizer::LoadCorpus
(this=0x7fffd28165d0, corpusFile=Variable "corpusFile" is not available.
) at src/util/FastIO.h:54
#1 0x0000000000479ee5 in main (argc=8, argv=0x7fffd2816de8) at
src/interpolate-ngram.cpp:270
I'm using mitlm from SVN under Linux, amd64.
Original issue reported on code.google.com by [email protected]
on 3 Dec 2008 at 2:20
Just in case anyone comes across the same problem:
The linker was complaining about -lg2c and an undefined reference to
ApplySort(...). Here are a few steps leading to a successful compilation:
1. install gfortran:
sudo apt-get install gfortran
2. in the Makefile of mitlm, it is needed to set:
LDFLAGS = -L. -lgfortran -lmitlm
FC = gfortran
...the LDFLAGS line is already there, but it is needed to replace -lg2c
with -lgfortran
3. open the file NgramModel.cpp in a text editor and comment out the
definition of the ApplySort template method. Then copy its code and paste it
into NgramModel.h right after the declaration of ApplySort. The code of a
template method simply has to be in the header file.
4. rm src/NgramLM.o
5. make
Thank you Bo-June, for the toolkit.
Best regards,
Michal
Original issue reported on code.google.com by [email protected]
on 14 Jan 2010 at 11:29
When running:
estimate-ngram -read-text tmp.txt --write-binary-lm tmp.mitml
Loading corpus tmp.txt...
Smoothing[1] = ModKN
Smoothing[2] = ModKN
Smoothing[3] = ModKN
Set smoothing algorithms...
Segmentation fault (core dumped)
When I add -w tmp.wfeatures, where tmp.wfeatures is an empty file, it works OK.
I'm using mitlm from SVN under Linux, amd64.
Original issue reported on code.google.com by [email protected]
on 3 Dec 2008 at 1:53
I am getting an error very similar to the -i CM case in Issue 4, with a
failure in the Copy function (though I get it while copying the LM during
smoothing). This happens during the 6th pass, and only with -o 6
(not -o 5) and only on a 300+ MB text, so it seems like it might be a memory
issue of some sort.
Starting program: /data/homes/benp/lmlowdata/mitlm.0.4~/estimate-ngram -t
spec.txt -wc spec.counts -o 6
Loading corpus all-Podiatry-all-all.txt...
Smoothing[1] = ModKN
Smoothing[2] = ModKN
Smoothing[3] = ModKN
Smoothing[4] = ModKN
Smoothing[5] = ModKN
Smoothing[6] = ModKN
Set smoothing algorithms...
Program received signal SIGSEGV, Segmentation fault.
KneserNeySmoothing::Initialize (this=0x86d7fe8, pLM=0xbfffb9c0, order=6) at
src/util/FastIO.h:56
56 *begin = *input;
(gdb) bt
#0 KneserNeySmoothing::Initialize (this=0x86d7fe8, pLM=0xbfffb9c0,
order=6) at src/util/FastIO.h:56
#1 0x08074743 in NgramLM::SetSmoothingAlgs (this=0xbfffb9c0,
smoothings=@0xbfffb71c) at src/NgramLM.cpp:287
#2 0x0807747b in NgramLM::Initialize (this=0xbfffb9c0, vocab=0x0,
useUnknown=false, text=0xbfffd160 "all-Podiatry-all-all.txt", counts=0x0,
smoothingDesc=0x80c8986 "ModKN", featureDesc=0x0) at src/NgramLM.cpp:225
#3 0x0804daa9 in main (argc=Cannot access memory at address 0x0
) at src/estimate-ngram.cpp:120
Original issue reported on code.google.com by [email protected]
on 3 Mar 2009 at 11:03
What steps will reproduce the problem?
1. Check out trunk (r48)
2. ./autogen.sh
3. make
What is the expected output? What do you see instead?
Build fails with:
make[1]: *** No rule to make target `lbfgs.lo', needed by `libmitlm.la'. Stop.
make[1]: Leaving directory `/home/stephen/mitlm2'
make: *** [all-recursive] Error 1
This can be fixed by editing the Makefile produced by autogen.sh:
--- Makefile.orig 2011-03-08 14:19:53.000000000 +0200
+++ Makefile 2011-03-08 14:20:26.000000000 +0200
@@ -75,7 +75,7 @@
src/Vocab.lo src/PerplexityOptimizer.lo src/Lattice.lo \
src/Smoothing.lo src/NgramModel.lo src/NgramVector.lo \
src/MaxLikelihoodSmoothing.lo src/KneserNeySmoothing.lo \
- src/InterpolatedNgramLM.lo lbfgs.lo lbfgsb.lo \
+ src/InterpolatedNgramLM.lo src/optimize/lbfgs.lo src/optimize/lbfgsb.lo \
src/WordErrorRateOptimizer.lo
libmitlm_la_OBJECTS = $(am_libmitlm_la_OBJECTS)
binPROGRAMS_INSTALL = $(INSTALL_PROGRAM)
but I don't know what the underlying cause is.
What version of the product are you using? On what operating system?
trunk (r48)
Linux srvslngrd003.uct.ac.za 2.6.18-194.3.1.el5 #1 SMP Fri May 7 01:43:09 EDT
2010 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Scientific Linux SL release 5.4 (Boron)
Please provide any additional information below.
autoconf (GNU Autoconf) 2.59
Original issue reported on code.google.com by [email protected]
on 8 Mar 2011 at 12:27
Attachments:
What steps will reproduce the problem?
1.
2.
3.
What is the expected output? What do you see instead?
What version of the product are you using? On what operating system?
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 23 Dec 2014 at 12:16
It seems that make is trying to use a libtool option that does not exist.
What can I do to fix it?
What steps will reproduce the problem?
1. type: ./autogen.sh --prefix=$(pwd)/usr
2. type: make
What is the expected output? What do you see instead?
This should successfully make the program.
What version of the product are you using? On what operating system?
Revision 48
Please provide any additional information below.
Here is the exact bug in the make process:
/bin/bash ./libtool --tag=F77 --mode=compile -c -o src/optimize/lbfgs.lo
src/optimize/lbfgs.f
libtool: compile: unrecognized option `-c'
libtool: compile: Try `libtool --help' for more information.
make[1]: *** [src/optimize/lbfgs.lo] Error 1
make[1]: Leaving directory `/home/myname/workspace/sphinx/mitlm/mitlm-read-only'
make: *** [all-recursive] Error 1
Original issue reported on code.google.com by [email protected]
on 13 Jul 2011 at 8:44
What steps will reproduce the problem?
1.
$ make
...
libtool: compile: g++ "-DPACKAGE_NAME=\"MIT Language Modeling Toolkit\""
-DPACKAGE_TARNAME=\"mitlm\" -DPACKAGE_VERSION=\"0.4.1\" "-DPACKAGE_STRING=\"MIT
Language Modeling Toolkit 0.4.1\""
-DPACKAGE_BUGREPORT=\"[email protected]\" -DPACKAGE_URL=\"\"
-DPACKAGE=\"mitlm\" -DVERSION=\"0.4.1\" "-DF77_FUNC(name,NAME)=name ## _"
"-DF77_FUNC_(name,NAME)=name ## _" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1
-DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1
-DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1
-DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" -DHAVE_STRING_H=1 -DHAVE_MATH_H=1
-DSTDC_HEADERS=1 -DHAVE_STDLIB_H=1 -DHAVE_MALLOC=1 -DHAVE_STDLIB_H=1
-DHAVE_REALLOC=1 -I. -I./src -g -O2 -MT src/NgramModel.lo -MD -MP -MF
src/.deps/NgramModel.Tpo -c src/NgramModel.cpp -fno-common -DPIC -o
src/.libs/NgramModel.o
src/NgramModel.cpp: In member function 'void
mitlm::NgramModel::LoadLM(std::vector<mitlm::DenseVector<double>,
std::allocator<mitlm::DenseVector<double> > >&,
std::vector<mitlm::DenseVector<double>,
std::allocator<mitlm::DenseVector<double> > >&, mitlm::ZFile&)':
src/NgramModel.cpp:325: error: call of overloaded 'pow(int, int)' is ambiguous
/usr/include/math.h:436: note: candidates are: double pow(double, double)
/usr/include/c++/4.2.1/cmath:357: note: float std::pow(float,
float)
/usr/include/c++/4.2.1/cmath:361: note: long double
std::pow(long double, long double)
/usr/include/c++/4.2.1/cmath:365: note: double std::pow(double,
int)
/usr/include/c++/4.2.1/cmath:369: note: float std::pow(float,
int)
/usr/include/c++/4.2.1/cmath:373: note: long double
std::pow(long double, int)
make[1]: *** [src/NgramModel.lo] Error 1
make: *** [all-recursive] Error 1
$
I fixed it this way:
325c325
< assert(prob <= std::pow(10, -99));
---
> assert(prob <= std::pow(10.0, -99));
Original issue reported on code.google.com by mamadontgodaddycomehome
on 21 Nov 2013 at 10:52
What steps will reproduce the problem?
1. ./autogen.sh
2. make
Error messages:
src/NgramModel.cpp:325: error: call of overloaded 'pow(int, int)' is ambiguous
/usr/include/bits/mathcalls.h:154: note: candidates are: double pow(double,
double)
/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/cmath:345:
note: float std::pow(float, float)
/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/cmath:349:
note: long double std::pow(long double, long double)
/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/cmath:353:
note: double std::pow(double, int)
/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/cmath:357:
note: float std::pow(float, int)
/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/cmath:361:
note: long double std::pow(long double, int)
make[1]: *** [src/NgramModel.lo] Error 1
make[1]: Leaving directory `/home/wyz/lm/mitlm-0.4.1'
make: *** [all-recursive] Error 1
Solution
Changed src/NgramModel.cpp:325 from
assert(prob <= std::pow(10, -99));
to
assert(prob <= std::pow(10.0, -99.0));
Again, run `make'.
Error message
make[1]: *** No rule to make target `lbfgs.lo', needed by `libmitlm.la'. Stop.
make[1]: Leaving directory `/home/wyz/lm/mitlm-0.4.1'
make: *** [all-recursive] Error 1
Solution:
according to http://code.google.com/p/mitlm/issues/detail?id=26
Change Makefile and Makefile.in
Error message
./.libs/libmitlm.so: undefined reference to `__cxa_get_exception_ptr'
collect2: ld returned 1 exit status
make[1]: *** [evaluate-ngram] Error 1
make[1]: Leaving directory `/home/wyz/lm/mitlm-0.4.1'
make: *** [all-recursive] Error 1
Then I have no idea about how to solve this problem.
What version of the product are you using? On what operating system?
mitlm-0.4.1.tar.gz
Description: Red Hat Enterprise Linux Server release 5.8 (Tikanga)
autoconf (GNU Autoconf) 2.69 (the original version is 2.59)
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 16 Jun 2014 at 4:34
What steps will reproduce the problem?
Run estimate-ngram -v vocab.file -t training.txt -wl lm.arpa, where vocab.file
contains a subset of the vocab in the LM training text.
What is the expected output? What do you see instead?
Program output:
0.000 Loading vocab vocab...
0.010 Loading corpus training.txt...
Segmentation fault
What version of the product are you using? On what operating system?
Latest SVN, ubuntu 10.04 64bit
Original issue reported on code.google.com by [email protected]
on 14 Jul 2011 at 4:34
What steps will reproduce the problem?
The project summary page at http://code.google.com/p/mitlm/ mentions a
dependency on the Boost C++ libraries, but mitlm no longer seems to require
these.
What is the expected output? What do you see instead?
Reference to Boost should be removed from project documentation.
What version of the product are you using? On what operating system?
r48 (trunk)
Original issue reported on code.google.com by [email protected]
on 8 Mar 2011 at 12:14
Dear Sir,
I would like to make a request.
Can you modify mitlm to take Unicode files as input?
Original issue reported on code.google.com by [email protected]
on 22 Jan 2010 at 6:07
On the tutorial wiki page (http://code.google.com/p/mitlm/wiki/Tutorial) it
is written that a language model with an <unk> symbol for out-of-vocabulary
words can be estimated with this command:
estimate-ngram -v CS.vocab -unk -t Lectures.txt -wl Lectures.CS.unk.lm
but that does not work. You have to add T, t, 1, TRUE, or true after -unk:
estimate-ngram -v CS.vocab -unk true -t Lectures.txt -wl Lectures.CS.unk.lm
Original issue reported on code.google.com by [email protected]
on 24 Feb 2010 at 3:30
Upon running interpolate-ngram twice, I received the same error twice after
the 46th iteration. When I did this with one fewer component LM, it went
through 188 iterations without error.
interpolate-ngram -i GLI -op dev.txt -wl GLI.lm -if
"log:sumhist:%s.effcounts" -o 6 -l "1.lm, 2.lm, 3.lm, 4.lm, 5.lm"
Optimizing 9 parameters...
IFLAG= -1
LINE SEARCH FAILED. SEE DOCUMENTATION OF ROUTINE MCSRCH
ERROR RETURN OF LINE SEARCH: INFO= 3
POSSIBLE CAUSES: FUNCTION OR GRADIENT ARE INCORRECT
OR INCORRECT TOLERANCES
Iterations = 46
Elapsed Time = 516.460000
Perplexity = 28.741343
Original issue reported on code.google.com by [email protected]
on 27 Feb 2009 at 10:56
When creating unigram LMs, using word features, and trying to optimize on
a dev set, I get a segmentation fault:
$ estimate-ngram -v etc/vocab -unk 1 -o 1 -t train.txt -wl tmp.arpa.gz -wf
entropy:%s.txt -op tmp.txt
Replace unknown words with <unk>...
Loading vocab etc/vocab...
Loading corpus train.txt...
Loading weight features entropy:train.txt...
Smoothing[1] = ModKN
Set smoothing algorithms...
Loading development set tmp.txt...
Segmentation fault
The same line works fine with -o 2.
I'm using MITLM from SVN.
Original issue reported on code.google.com by [email protected]
on 26 Feb 2009 at 2:13
What steps will reproduce the problem?
1. Compile with gcc 4.x on a Linux platform
2. Run with gzip files on Linux
What is the expected output? What do you see instead?
ZFile._file is null because popen() fails
What version of the product are you using? On what operating system?
0.4 running on ubuntu 8.04
Please provide any additional information below.
The fix is to change the modes "rb" and "wb" to "r" and "w". popen()
implementations do not always support "rb" or "wb" modes (see:
http://opengroup.org/onlinepubs/007908775/xsh/popen.html).
Original issue reported on code.google.com by [email protected]
on 13 May 2009 at 12:11
I have several arpa-formatted language models, and I'd like to mix these LMs
with a list of numerical weights (proportion) using interpolate-ngram, e.g.,
[0.3, 0.4, 0.1, 0.2]. I have been looking in the tutorial but could not find an
option for this. Is that implemented yet? Is there any alternative way to do
this (mix LMs with numerical weights)?
What version of the product are you using? On what operating system?
0.4.1, Linux 3.13.0-39-generic
Thanks!
Original issue reported on code.google.com by [email protected]
on 19 May 2015 at 11:18
What steps will reproduce the problem?
1. Compile trunk (r48)
2. Download an arpa LM e.g. the LM in
http://www.keithv.com/software/giga/lm_giga_64k_nvp_3gram.zip
3. Run evaluate-ngram which loads the model, e.g.
evaluate-ngram -lm lm_giga_64k_nvp_3gram.arpa -eval-perp data.txt
What is the expected output? What do you see instead?
0.001 Loading LM lm_giga_64k_nvp_3gram.arpa...
Assertion failed: (p >= &line[lineLen]), function LoadLM, file
src/NgramModel.cpp, line 329.
Abort trap
What version of the product are you using? On what operating system?
r48, MacOSX
Darwin stephen-marquards-macbook-pro.local 10.6.0 Darwin Kernel Version 10.6.0:
Wed Nov 10 18:13:17 PST 2010; root:xnu-1504.9.26~3/RELEASE_I386 i386
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 9 Mar 2011 at 1:01
When I run the following command I get many "Feature skipped" error
messages. I previously populated the effcounts files with estimate-ngram -t
-wec for each model, and they look fine. What do these error messages mean?
I also tried this with CM, and got many of the "Feature skipped" messages
and a "feature read from..." message for each model.
interpolate-ngram -lm "model1.lm, model2.lm, model3.lm" -interpolation GLI
-op dev.txt -wl GLI.lm -if
"log:sumhist:%s.effcounts,pow2:log1p:sumhist:%s.effcounts"
Thanks
Original issue reported on code.google.com by [email protected]
on 26 Feb 2009 at 11:27
1)
set option = "-i LI -opt-alg LBFGSB "
interpolate-ngram "$trainingdata,$adaptdata" $option -wl $trigram -op $devset
-eval-perp $testset
2)
set option = "-i CM -opt-alg LBFGSB "
interpolate-ngram -c "$count21,$count22" $option
-if "log:sumhist:$effcount21;log:sumhist:$effcount22" -wl $trigram -op $devset
-eval-perp $testset
Both of the above methods create many nan backoffs in the output LM.
However, their perplexities seem OK.
If -op $devset is not used, the nans are not created, but the perplexities of
"CM" and "GLI" are more than double that of "LI".
What version of the product are you using? On what operating system?
MITLM 0.4, on CentOS 4.7
Original issue reported on code.google.com by [email protected]
on 10 Nov 2010 at 8:15
After updating from SVN, both LI and CM interpolation seem to be broken: in
the interpolated LM, there are many "nans" and most back-off weights are zero.
Sample interpolated LM:
\data\
ngram 1=199992
ngram 2=865062
ngram 3=2657490
ngram 4=4259246
\1-grams:
-1.564484 </s>
-99 <s> nan
[...]
-5.898262 Abadan
-6.074985 Abadi
-6.242569 Abadia 0.000000
-6.105848 Abadie
-6.242569 Abadou 0.000000
[...]
\2-grams:
nan </s> -t-il -0.019559
-2.477506 </s> <UNK> -0.299104
nan </s> A -0.020696
nan </s> A.
nan </s> A.B.
nan </s> A.K.
nan </s> ABM
nan </s> ACF
nan </s> AFP -0.045175
The source LMs (estimated with estimate-ngram) seem to be OK.
Original issue reported on code.google.com by [email protected]
on 16 Dec 2008 at 10:24
What steps will reproduce the problem?
1. Create an LM with estimate-ngram and the eval-perp param
2. Use evaluate-ngram with eval-perp on the same LM
3. Perplexity results differ
What is the expected output? What do you see instead?
evaluate-ngram -lm rlst8-similar.lm -eval-perp "$TRANSCRIPT_CONT,
$TRANSCRIPT_SENT"
0.001 Loading LM rlst8-similar.lm...
7.262 Perplexity Evaluations:
7.262 Loading eval set
/data/src/sphinx/experiments/transcripts/rlst-transcript.corpus...
7.318 /data/src/sphinx/experiments/transcripts/rlst-transcript.corpus 385.071
7.322 Loading eval set
/data/src/sphinx/experiments/transcripts/rlst-transcript.sentences...
7.376 /data/src/sphinx/experiments/transcripts/rlst-transcript.sentences 312.224
$ estimate-ngram -unk 1 -vocab $VOCAB_AUGMENTED -text $SENTENCE_CORPUS -wl
$LM_SIMILAR -eval-perp "$TRANSCRIPT_CONT, $TRANSCRIPT_SENT"
0.001 Replace unknown words with <unk>...
0.001 Loading vocab rlst8-merged-vocab.txt...
0.013 Loading corpus sentences.similar.corpus...
10.127 Smoothing[1] = ModKN
10.127 Smoothing[2] = ModKN
10.127 Smoothing[3] = ModKN
10.127 Set smoothing algorithms...
10.243 Estimating full n-gram model...
10.459 Saving LM to rlst8-similar.lm...
14.192 Perplexity Evaluations:
14.192 Loading eval set
/data/src/sphinx/experiments/transcripts/rlst-transcript.corpus...
14.351 /data/src/sphinx/experiments/transcripts/rlst-transcript.corpus 377.913
14.359 Loading eval set
/data/src/sphinx/experiments/transcripts/rlst-transcript.sentences...
14.516 /data/src/sphinx/experiments/transcripts/rlst-transcript.sentences 307.090
I would expect the two sets of perplexity results to be the same.
The difference appears to arise from use of the "-unk" parameter. Without it
(i.e. when the LM excludes <unk>), the perplexity results from estimate-ngram
and evaluate-ngram are the same.
What version of the product are you using? On what operating system?
r48
MacOS X 10.6.1
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 4 Jun 2011 at 7:42
Hi, I am attempting to compile, and after installing gcc-gfortran version
4.4.1 I get an error when compiling lbfgsb.f:
make: f77: Command not found
Is there some way to tell Linux or GCC that the Fortran compiler is not f77?
Thanks
Hal
Original issue reported on code.google.com by [email protected]
on 5 Nov 2009 at 12:56
Hi, I have a problem with smoothing large files of 3-gram counts.
I use the command estimate-ngram -order 3 -counts allgrams -smoothing FixModKN -wl
allgrams.FixModKN.lm and I get this error:
Saving LM to train.corpus.lm...
estimate-ngram: src/NgramModel.cpp:422: void NgramModel::SaveLM(const
std::vector<DenseVector<double>, std::allocator<DenseVector<double> > >&, const
std::vector<DenseVector<double>, std::allocator<DenseVector<double> > >&,
ZFile&) const: Assertion `(size_t)(ptr - lineBuffer.data()) <
lineBuffer.size()' failed.
Earlier I tried 2-grams with a 4.7 GB file and it worked fine. The 3-gram
file is 20 GB.
My operating system is GNU/Linux x86_64 with 96 GB RAM.
Original issue reported on code.google.com by [email protected]
on 15 Nov 2012 at 4:48
Hi,
I've created configuration files for autotools and Debian packages. Are you interested?
1) Here are quick instructions to patch the revision 41 of the code (you have
to download the attached patch file). Please note that I needed to apply a few
fixes to the source in order to be able to compile mitlm on Debian squeeze:
svn checkout -r 41 http://mitlm.googlecode.com/svn/trunk/ mitlm-read-only
cd mitlm-read-only
patch -p0 < ../autotools_debian_compilation.diff
chmod 755 autogen.sh debian/rules
touch NEWS AUTHORS ChangeLog debian/info
svn add configure.ac autogen.sh Makefile.am NEWS AUTHORS ChangeLog debian
svn remove Makefile
svn move LICENSE COPYING
2) Once you have installed autoconf, automake, and libtool, you can use this
command line to generate the configure script and run it:
./autogen.sh --prefix=$(pwd)/usr
3) After this you can run "make dist-gzip" to create the mitlm-0.4.tar.gz
archive for distribution (users of this package can simply run ./configure &&
make && make install). You can also run "dpkg-buildpackage -rfakeroot" to
create Debian packages for your architecture (it will create 4 packages, one
for the binaries and three for the library).
4) To change the package version, just edit the configure.ac file and re-run
steps 2 and 3.
Best regards.
Original issue reported on code.google.com by [email protected]
on 16 Nov 2010 at 9:11
Attachments: