ivanakcheurov / ntextcat
License: MIT License
I used your package: I copy-pasted your code and added the XML, TXT, and LM files from those folders into my app.
It appears to recognize Hebrew as Danish for some reason.
You can try this to check it yourself:
var languages = identifier.Identify("קדימה");
When I checked the XML, Hebrew is present there.
I have used Turkish documents of varying sizes, but they are always detected as Swedish or Norwegian. Is there a known issue with detecting Turkish? Thanks.
The online demo webpage needs to be updated.
Requirements:
Hi, thanks for the library.
Hi Ivan,
Are you accepting any PRs, or do you plan to do a .NET Core/Standard version?
Would you consider allowing other contributors on this repo?
Thanks, and well done on your port; it is a great accomplishment.
Could you give a small example of using your library?
Environment: Windows 7 x64, Visual Studio 2017.
Installed "ntextcat" through NuGet.
I need to determine the language of the text that is entered in "textBox2.Text".
The result is output to "textBox1.Text".
The entered text is expected to be in European languages, languages written with characters (Chinese, Japanese), and others.
I found the sample code.
I get an error on this line:
var identifier = factory.Load("NTextCat 0.2.1.1\\LanguageModels\\Core14.profile.xml");
Code:
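using System;
using System.Linq;
using System.Windows.Forms;
using NTextCat;
namespace rsh
{
    public partial class Form2 : Form
    {
        public Form2()
        {
            InitializeComponent();
        }
        private void button1_Click(object sender, EventArgs e)
        {
            // Load the language profile; a relative path is resolved against the working directory.
            var factory = new RankedLanguageIdentifierFactory();
            var identifier = factory.Load("NTextCat 0.2.1.1\\LanguageModels\\Core14.profile.xml");
            var languages = identifier.Identify(textBox2.Text);
            // FirstOrDefault() returns null when no language is identified.
            var mostCertainLanguage = languages.FirstOrDefault();
            textBox1.Text = mostCertainLanguage != null ? mostCertainLanguage.Item1.Iso639_3 : string.Empty;
        }
    }
}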
How can I solve this problem?
Just wondering whether the identifier factory is a thread-safe object.
Thanks for sharing this!
I am using it to triple-check subtitles for my Plex processing. 'The Voice', which had a closed captions stream all in upper case, was detected as Norwegian instead of English.
using var model = httpClient.GetStreamAsync("https://raw.githubusercontent.com/ivanakcheurov/ntextcat/master/src/LanguageModels/Core14.profile.xml").Result;
RankedLanguageIdentifierFactory identifierFactory = new();
var identifier = identifierFactory.Load(model);
Ivan, good afternoon,
What is the language detection logic when a text contains phrases in two or three languages at once?
Thank you for your work!
There are many ideas and issues on
https://archive.codeplex.com/?p=ntextcat
Do you still work on these topics?
Are RankedLanguageIdentifierFactory and RankedLanguageIdentifier thread-safe?
Thanks for making your library available. Could you state the license/terms of use?
Looking at your library, it has no dependencies, so it should be very simple (less than a couple of hours' work) to convert to .NET Standard.
Do you have any plans to do this?