Comments (2)
Hi,
sorry, I only just saw this issue. Yes, this is the same problem we ran into when writing hdtCat. As far as I remember, we use memory-mapped files, and that feature is provided by the operating system rather than by the JVM. For some reason this does not work on Windows (or at least it did not at the time). I can no longer find the related ticket on the JDK issue tracker, so I do not know whether it was resolved, but when I saw it, it had already been open for many years. There may be an easy fix, but I do not know of one.
Salut Dennis
I think it is related to this JDK bug: https://bugs.java.com/bugdatabase/view_bug.do?bug_id=6359560
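For readers who want to reproduce the behaviour, here is a minimal sketch using only the standard `java.nio` API. It is not taken from hdt-java; the class and method names (`MappedFileDemo`, `roundTrip`, `tryDelete`) are made up for illustration. It writes through a memory-mapped region and then tries to delete the file while the mapping may still be alive. The JDK documentation states that a mapping stays valid until the buffer is garbage-collected, even after the channel is closed, so the outcome of the delete is platform-dependent: it usually succeeds on Linux, while on Windows it is commonly reported to fail.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of hdt-java.
class MappedFileDemo {

    /** Creates an empty temporary file to map. */
    static Path newTempFile() {
        try {
            return Files.createTempFile("mapped-demo", ".bin");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes text through a memory-mapped region and reads it back from the same mapping. */
    static String roundTrip(Path file, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Mapping past the current end of the file grows the file to fit.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, bytes.length);
            buf.put(bytes);
            buf.force();   // ask the OS to flush the mapped pages to disk
            buf.flip();    // rewind so the same bytes can be read back
            byte[] back = new byte[bytes.length];
            buf.get(back);
            return new String(back, StandardCharsets.UTF_8);
            // Closing the channel here does NOT release the mapping.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Tries to delete the file; returns false instead of throwing if the OS refuses. */
    static boolean tryDelete(Path file) {
        try {
            Files.deleteIfExists(file);
            return !Files.exists(file);
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Path file = newTempFile();
        System.out.println("read back: " + roundTrip(file, "hello"));
        // Platform-dependent: may report false on Windows while the mapping is alive.
        System.out.println("deleted:   " + tryDelete(file));
    }
}
```

If the delete fails, the file typically becomes deletable only after the mapped buffer has been garbage-collected, which the program cannot force reliably.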
from hdt-java.
After fixing HDTcat to work under Windows, I was able to evaluate the initial idea (question 1 in my initial post), though only to a limited extent. Result: it may be the right approach when the available memory simply is not enough to generate an HDT file in one go. However, if the whole HDT file can be generated in memory, doing so is much faster than the map-reduce approach (map to HDT chunks, then reduce the chunks using HDTcat). The difference in execution time is probably due to the many additional filesystem accesses the map-reduce approach requires, since each triple is processed multiple times. If the system could be changed to use in-memory data structures instead of memory-mapped files for smaller amounts of data (i.e., data that fits in memory), this result might change as well.
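To illustrate why each triple is touched several times in the reduce phase, here is a small self-contained sketch. It does not use HDTcat or any hdt-java API: sorted lists of strings stand in for HDT chunks, and a plain sorted merge stands in for the cat operation. All names (`ChunkMergeDemo`, `copies`, `reduceSequentially`, `reducePairwise`) are hypothetical. The counter records how many elements are copied in total, which is a rough proxy for the repeated reads and writes that hit the filesystem when the chunks are memory-mapped files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the "reduce chunks" step; not HDTcat itself.
class ChunkMergeDemo {

    /** Total number of elements copied by all merges since the last reset. */
    static long copies = 0;

    /** Merges two sorted lists into a new sorted list, counting every copied element. */
    static List<String> merge(List<String> a, List<String> b) {
        List<String> out = new ArrayList<>(a.size() + b.size());
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            out.add(a.get(i).compareTo(b.get(j)) <= 0 ? a.get(i++) : b.get(j++));
        }
        while (i < a.size()) out.add(a.get(i++));
        while (j < b.size()) out.add(b.get(j++));
        copies += out.size();
        return out;
    }

    /** Folds the chunks left to right: the growing result is re-copied at every step. */
    static List<String> reduceSequentially(List<List<String>> chunks) {
        List<String> acc = chunks.get(0);
        for (int k = 1; k < chunks.size(); k++) {
            acc = merge(acc, chunks.get(k));
        }
        return acc;
    }

    /** Merges neighbouring chunks in rounds, so each element is copied once per round. */
    static List<String> reducePairwise(List<List<String>> chunks) {
        List<List<String>> level = new ArrayList<>(chunks);
        while (level.size() > 1) {
            List<List<String>> next = new ArrayList<>();
            for (int k = 0; k + 1 < level.size(); k += 2) {
                next.add(merge(level.get(k), level.get(k + 1)));
            }
            if (level.size() % 2 == 1) {
                next.add(level.get(level.size() - 1)); // odd chunk moves up unchanged
            }
            level = next;
        }
        return level.get(0);
    }

    public static void main(String[] args) {
        List<List<String>> chunks = Arrays.asList(
                Arrays.asList("a", "e"), Arrays.asList("b", "f"),
                Arrays.asList("c", "g"), Arrays.asList("d", "h"));
        copies = 0;
        reduceSequentially(chunks);
        System.out.println("sequential copies: " + copies);
        copies = 0;
        reducePairwise(chunks);
        System.out.println("pairwise copies:   " + copies);
    }
}
```

With four chunks of two elements each, the sequential fold copies 18 elements and the pairwise variant 16, against 8 elements of actual data. Either way every element is written more than once, which fits the observation above that a single in-memory pass is faster when the data fits.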