kwonoj / cld3-asm
WebAssembly based JavaScript bindings for Google Compact Language Detector v3
License: MIT License
Same as kwonoj/hunspell-asm#65.
ENVIRONMENT (https://kripken.github.io/emscripten-site/docs/api_reference/module.html?highlight=environment#overriding-execution-environment) to NODE for the applicable case (for the binaryEndpoint case too). #14
export LanguageIdentifier
Same refactor as kwonoj/hunspell-asm#67 is required.
I'm trying to run this module as a Cloudflare Worker. However, I get "Error: environment detection error". I think this is because Cloudflare Workers don't implement "importScripts".
I'm quite new to Workers and WebAssembly, so I'm not sure whether this is just a configuration problem on my end. However, in the webpack-produced code I found this check: "Module.ENVIRONMENT has been deprecated. To force the environment, use the ENVIRONMENT compile-time option (for example, -s ENVIRONMENT=web or -s ENVIRONMENT=node)". I also found a commit related to environment detection.
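For context, the error in this report is what Emscripten-style glue code typically raises when none of its runtime probes match. The following is a minimal sketch of that kind of detection, not the code shipped in cld3-asm: the `detectEnvironment` function and its `scope` parameter are hypothetical, and the exact checks differ between Emscripten versions.

```javascript
// Sketch of Emscripten-style environment detection. The `scope` parameter is
// hypothetical; real glue code inspects the actual globals instead.
function detectEnvironment(scope) {
  const isWeb = typeof scope.window === 'object';
  const isWorker = typeof scope.importScripts === 'function';
  const isNode =
    typeof scope.process === 'object' &&
    typeof scope.require === 'function' &&
    !isWeb &&
    !isWorker;

  if (isNode) return 'node';
  if (isWorker) return 'worker';
  if (isWeb) return 'web';
  // None of the probes matched, e.g. a runtime without window,
  // importScripts, or process.
  throw new Error('environment detection error');
}

// A worker-like runtime that exposes none of the probed globals.
const bareRuntime = { self: {}, fetch() {} };

try {
  detectEnvironment(bareRuntime);
} catch (err) {
  console.log(err.message); // logs "environment detection error"
}
```

Under this assumption, a runtime that lacks `window`, `importScripts`, and `process` falls through every branch, which would match the behavior described above.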
I created an example repo for reproducing the issue.
Just run
https://github.com/kwonoj/cld3-asm/blob/master/src/languageCode.ts#L5
stack": "TypeError: Cannot read property 'UNKNOWN' of undefined\n
Maybe this needs to be exported as an enum instead of a const enum?
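As background for the suggestion above: TypeScript erases a `const enum` declaration by default and inlines its members at the usage sites, so no object exists at runtime, while a regular `enum` compiles to a real object. The sketch below shows roughly what each leaves behind in plain JavaScript. The member name `UNKNOWN` comes from the stack trace; the value `'und'` and the `moduleWithConstEnum` name are assumptions for illustration.

```javascript
// What a regular `enum LanguageCode { UNKNOWN = 'und' }` roughly compiles to:
// a real object that exists at runtime. The value 'und' is an assumption.
var LanguageCode;
(function (LanguageCode) {
  LanguageCode['UNKNOWN'] = 'und';
})(LanguageCode || (LanguageCode = {}));

// A `const enum` emits nothing by default, so the export is simply missing.
// `moduleWithConstEnum` is a hypothetical stand-in for such a module.
const moduleWithConstEnum = {};

console.log(LanguageCode.UNKNOWN);

try {
  // Mirrors a consumer whose compiler did not inline the const enum member.
  moduleWithConstEnum.LanguageCode.UNKNOWN;
} catch (err) {
  console.log(err instanceof TypeError);
}
```

A consumer that transpiles file by file (without cross-file inlining) would hit the second case, which is consistent with the TypeError quoted above.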
8.0.2 to 8.0.3. This version is covered by your current version range and after updating it in your project the build failed.
lint-staged is a devDependency of this project. It might not break your production code or affect downstream projects, but probably breaks your build or test tools, which may prevent deploying or publishing.
The new version differs by 1 commit.
225a904
fix: Allow to use lint-staged on CI (#523)
There is a collection of frequently asked questions. If those don’t help, you can always ask the humans behind Greenkeeper.
Your Greenkeeper Bot 🌴
1.14.3 to 1.15.0. This version is covered by your current version range and after updating it in your project the build failed.
prettier is a devDependency of this project. It might not break your production code or affect downstream projects, but probably breaks your build or test tools, which may prevent deploying or publishing.
I ran across this issue:
Error in getLanguage: TypeError: runtimeModule is not a function
at cld3-asm.js?v=13ef29a7:1958:23
at Module.loadModule (cld3-asm.js?v=13ef29a7:2138:10)
I've been using it in a React app with Vite.
I spent a while trying to figure out what exactly was wrong, and ultimately found that you have to specify the path from which the module gets loaded.
import { defineConfig } from 'vite'
import react from '@vitejs/plugin-react'

export default defineConfig({
  plugins: [react()],
  resolve: {
    alias: {
      // Point the bare 'cld3-asm' import at the CommonJS build
      'cld3-asm': 'cld3-asm/dist/cjs/index.js'
    }
  }
})
I tried your lib and "findLanguage" seems to work fine, but when I combine texts in different languages and call "findMostFrequentLanguages", it finds only one language in a couple of cases.
const test = require("tape");
const { loadModule } = require("cld3-asm");

test("findMostFrequentLanguages", async t => {
  t.plan(7);

  const cldFactory = await loadModule();
  const identifier = cldFactory.create(0, 100);

  const textEN = "This piece of text is in English.";
  const textBG = "Този текст е на Български.";
  const textFI = "Tämä teksti on suomea.";
  const textSV = "Den här texten är på Svenska.";

  const testEN = identifier.findLanguage(textEN);
  t.equal(testEN.language, "en"); // ok
  const testBG = identifier.findLanguage(textBG);
  t.equal(testBG.language, "bg"); // ok
  const testFI = identifier.findLanguage(textFI);
  t.equal(testFI.language, "fi"); // ok
  const testSV = identifier.findLanguage(textSV);
  t.equal(testSV.language, "sv"); // ok

  const testEN_BG = identifier.findMostFrequentLanguages(
    `${textEN} ${textBG}`,
    3
  );
  t.deepEqual(testEN_BG.map(lang => lang.language), ["bg", "en"]); // ok

  const testEN_FI = identifier.findMostFrequentLanguages(
    `${textEN} ${textFI}`,
    3
  );
  t.deepEqual(testEN_FI.map(lang => lang.language), ["fi", "en"]); // not ok, just ["fi"]

  const testEN_SV = identifier.findMostFrequentLanguages(
    `${textEN} ${textSV}`,
    3
  );
  t.deepEqual(testEN_SV.map(lang => lang.language), ["sv", "en"]); // not ok, just ["sv"]
});
Is there any option to exclude languages from detection?
I simply tested 'test', but it gives the result below.
{language: "de", probability: 0.6367550492286682, is_reliable: false, proportion: 1}
If I could narrow the detectable languages, it might give a better result.
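One way to approximate this on the caller's side is to ask for several candidates and keep the first one that belongs to an allow list. The sketch below is only a post-filter under stated assumptions: `pickAllowedLanguage` is a hypothetical helper, it assumes the candidates arrive in ranked order, and it does not change how the detector scores the text.

```javascript
// Hypothetical helper: returns the best-ranked result whose language is in
// `allowed`, or undefined when none of the candidates qualifies.
function pickAllowedLanguage(results, allowed) {
  const allowedSet = new Set(allowed);
  return results.find(result => allowedSet.has(result.language));
}

// Sample candidates shaped like the result object quoted above; with the
// library these would come from identifier.findMostFrequentLanguages(text, n).
// The second entry is made-up illustration data, not real detector output.
const candidates = [
  { language: 'de', probability: 0.6367550492286682, is_reliable: false, proportion: 1 },
  { language: 'en', probability: 0.3, is_reliable: false, proportion: 1 }
];

const best = pickAllowedLanguage(candidates, ['en', 'fr']);
console.log(best ? best.language : 'no allowed language found');
```

For very short inputs such as 'test', the allowed language may not appear among the candidates at all, in which case the helper returns undefined and the caller has to decide on a fallback.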
2.0.5 to 2.0.7. This version is covered by your current version range and after updating it in your project the build failed.
conventional-changelog-cli is a devDependency of this project. It might not break your production code or affect downstream projects, but probably breaks your build or test tools, which may prevent deploying or publishing.
10.11.0 to 10.11.1. This version is covered by your current version range and after updating it in your project the build failed.
@types/node is a devDependency of this project. It might not break your production code or affect downstream projects, but probably breaks your build or test tools, which may prevent deploying or publishing.
Check the case where the fetch is attempted in an Electron renderer, or where the asm.js fallback fails to load: the footprint is too verbose.
Do not use binaryEndpoint; instead, allow overriding locateFile as needed for the wasm binary, which would be more bundler-config agnostic (i.e. file-loader hash).
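A rough sketch of what such an override could look like follows. Emscripten glue code calls `Module.locateFile(path, scriptDirectory)` when it is defined, but whether and how cld3-asm forwards an override object to the module is an assumption here; `createModuleOverrides` and the hashed URL are hypothetical.

```javascript
// Hypothetical helper: builds an Emscripten-style override object that maps
// the wasm binary to a bundler-provided URL and leaves other files alone.
function createModuleOverrides(wasmUrl) {
  return {
    locateFile(path, scriptDirectory) {
      // Only redirect the wasm binary; fall back to the default
      // `scriptDirectory + path` resolution for everything else.
      return path.endsWith('.wasm') ? wasmUrl : scriptDirectory + path;
    }
  };
}

// Example with a made-up, file-loader style hashed asset URL.
const overrides = createModuleOverrides('/assets/cld3.abc123.wasm');
console.log(overrides.locateFile('cld3.wasm', '/dist/'));
```

The point of the design is that the bundler decides the final URL (including any content hash), and the library only needs to accept the callback.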
It doesn't include the correct dist in the package.
2 breaking changes in planning:
Both will allow simplifying the interfaces in general, as well as reducing module size.
This is blocked until upstream Emscripten releases a new version that contains the SINGLE_FILE option.
Hi
I was randomly testing some text messages trying to detect the language using your assembly and got these results:
Like super duper sketchy
Language is: da
Probability: 0.9992335438728333
living in music loving art
Language is: no
Probability: 0.996842622756958
AMERICAN DIABETES ASSOCIATION ALERT DAY
Language is: hu
Probability: 0.26049503684043884
great late brunch in Lox five ways
Language is: fy
Probability: 0.878024160861969
Actually, all of these are detected by cld3 as English, and I don't know why it reported incorrect results here.
Hi,
First of all, many thanks for this amazing repo.
When using cld3-asm in a node service, on each unhandled rejection my service shuts down without sending any beforeExit, exit, SIGINT or SIGTERM event that would let me handle those cases properly.
mkdir cld3-asm-issue
cd cld3-asm-issue
nvm use 12
npm init --y
npm i express cld3-asm
touch index.js
// index.js
const cld3 = require('cld3-asm');
const app = require('express')();

process.on(
  'SIGTERM',
  () => process.stdout.write('SIGTERM\n')
);
process.on(
  'beforeExit',
  () => process.stdout.write('beforeExit\n')
);
process.on(
  'exit',
  (status) => process.stdout.write(`exit: ${status}\n`)
);

cld3.loadModule()
  .then(() => {
    app.listen(
      5432,
      () => console.log('listening...')
    );
  })
  .then(() => {
    setTimeout(() => Promise.reject(), 2000);
  });
node index.js
listening...
/cld3-asm-issue/node_modules/cld3-asm/dist/cjs/lib/node/cld3.js:8
var Module=typeof Module!=="undefined"?Module:{};var moduleOverrides={};var key;for(key in Module){if(Module.hasOwnProperty(key)){moduleOverrides[key]=Module[key]}}Module["arguments"]=[];Module["thisProgram"]="./this.program";Module["quit"]=function(status,toThrow){throw toThrow};Module["preRun"]=[];Module["postRun"]=[];var ENVIRONMENT_IS_WEB=false;var ENVIRONMENT_IS_WORKER=false;var ENVIRONMENT_IS_NODE=true;if(Module["ENVIRONMENT"]){throw new Error("Module.ENVIRONMENT has been deprecated. To force the environment, use the ENVIRONMENT compile-time option (for example, -s ENVIRONMENT=web or -s ENVIRONMENT=node)")}var scriptDirectory="";function locateFile(path){if(Module["locateFile"]){return Module["locateFile"](path,scriptDirectory)}else{return scriptDirectory+path}}if(ENVIRONMENT_IS_NODE){if(!(typeof process==="object"&&typeof require==="function"))throw new Error("not compiled for this environment (did you b
abort() at Error
at jsStackTrace (/cld3-asm-issue/node_modules/cld3-asm/dist/cjs/lib/node/cld3.js:8:11112)
at stackTrace (/cld3-asm-issue/node_modules/cld3-asm/dist/cjs/lib/node/cld3.js:8:11283)
at process.abort (/cld3-asm-issue/node_modules/cld3-asm/dist/cjs/lib/node/cld3.js:8:1058443)
at process.emit (events.js:215:7)
at processPromiseRejections (internal/process/promises.js:201:33)
at processTicksAndRejections (internal/process/task_queues.js:94:32)
And my service died without telling anyone.
Is there a way to change this behavior? Maybe I missed some config. Is it mandatory for you to re-throw the error you catch via the unhandledRejection event?
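The stack trace above suggests that the generated glue code registers a listener on 'unhandledRejection' that calls abort(). A possible workaround, sketched under that assumption, is to replace the listeners after the module has loaded. `takeOverUnhandledRejections` is a hypothetical helper, not part of cld3-asm, and note that it also removes listeners installed by any other library.

```javascript
// Hypothetical helper: drops every existing 'unhandledRejection' listener on
// `target` and installs `handler` instead. `target` defaults to `process`.
function takeOverUnhandledRejections(handler, target = process) {
  target.removeAllListeners('unhandledRejection');
  target.on('unhandledRejection', handler);
}

// Usage sketch (commented out so the snippet has no side effects):
// cld3.loadModule().then(() => {
//   takeOverUnhandledRejections(reason => {
//     console.error('unhandled rejection:', reason);
//   });
// });
```

This only helps if the listener is registered once during loading and not added again later, which is an assumption worth verifying against the installed version.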