Comments (2)
Hi Vito,
thanks for your interest in JedAI!
The error you get is caused by an incompatibility between JedAI-gui and JedAI-core. The former is practically deprecated, so it's better to use JedAI's web app through Docker. See Table 4 here for instructions: https://helios2.mi.parisdescartes.fr/~themisp/publications/is21-jedaiRepro.pdf
Still, our custom CSV reader gets confused when the separator is not a comma. The best approach is to format your dataset like the Leipzig benchmarks (https://dbs.uni-leipzig.de/research/projects/object_matching/benchmark_datasets_for_entity_resolution), using a comma as the separator and quotes around the values. Below you can find screenshots showing that JedAI can successfully read the CSV files of the Abt-Buy dataset.
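For anyone who needs to reformat a dataset this way, here is a minimal sketch in Python. It is not part of JedAI; it only uses the standard `csv` module. The function name `to_quoted_csv`, the semicolon input separator and the sample rows are assumptions made for illustration.

```python
import csv
import io


def to_quoted_csv(src, dst, delimiter=";"):
    """Rewrite delimiter-separated rows as comma-separated CSV with every value quoted.

    `src` and `dst` are text file objects; `delimiter` is the separator
    used by the input (assumed here to be a semicolon).
    """
    reader = csv.reader(src, delimiter=delimiter)
    writer = csv.writer(
        dst, delimiter=",", quoting=csv.QUOTE_ALL, lineterminator="\n"
    )
    for row in reader:
        writer.writerow(row)


# In-memory example; the rows are made up for illustration.
src = io.StringIO("id;name;price\n1;Sony TV, 40 inch;499\n")
dst = io.StringIO()
to_quoted_csv(src, dst)
print(dst.getvalue())
```

With real files, open both with `newline=""` (as the `csv` documentation recommends) and pass the file objects in place of the `StringIO` buffers.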
Btw, some of the Leipzig datasets involve more than one duplicate per entity, even though they are Clean-Clean ER datasets. This is not supported by JedAI, which automatically removes equivalence clusters with more than two entities and prints the relevant messages in the command line.
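To see in advance which entities are affected, you could scan the ground-truth pairs yourself. The sketch below is not JedAI's own logic, only a rough pre-check under the assumption that the ground truth is a list of `(id_in_dataset_1, id_in_dataset_2)` pairs; the function name `multi_match_ids` and the sample IDs are hypothetical.

```python
from collections import Counter


def multi_match_ids(pairs):
    """Return the IDs that appear in more than one ground-truth pair.

    `pairs` is an iterable of (left_id, right_id) tuples. The result is a
    tuple (left_ids, right_ids) of sets, one per dataset.
    """
    pairs = list(pairs)
    left_counts = Counter(left for left, _ in pairs)
    right_counts = Counter(right for _, right in pairs)
    left_multi = {i for i, n in left_counts.items() if n > 1}
    right_multi = {i for i, n in right_counts.items() if n > 1}
    return left_multi, right_multi


# Made-up example: "a2" is matched to two different entities.
ground_truth = [("a1", "b1"), ("a2", "b2"), ("a2", "b3")]
left_multi, right_multi = multi_match_ids(ground_truth)
print(left_multi, right_multi)
```

Any ID reported here belongs to an equivalence cluster with more than two entities, so its matches are likely to be among those dropped.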
Thanks! My tests succeeded with JedAI's web app through Docker.