parchain's Issues
How do dimensions affect memory usage and speed?
Hi Shangdi,
I tried it with the dendrogram option.
I could run one of the sample datasets (10D_UCI1_19K), and it works.
Then I tried it with some embeddings (768 dimensions); I had to add the code for that dimensionality.
My full dataset has 2.5M samples, and it produces an out-of-memory error. With a 1M-sample dataset it uses 89 GB of memory (including some swap) and still fails: it dies silently.
This is the output:
+++++++++++++++
eps = 9.999999999999999451532715e-21
use matrix = 0
use matrix range = 0
naive_thresh = 5
cache_size/2 = 32
starting dendro
average Linkage of 1000000, dim 768 points
distAverage5 point array, norm = 1, rebuild tree
The command I run is this one:
CILK_NWORKERS=12 numactl -i all ./parchain/linkage/framework/linkage -method avg -r 1 -d $DIMENSIONS -dendro ./outputs/${METHOD}/dendrogaram.txt ./datasets/$FILE > outputs/${METHOD}/test.txt
I have 16 cores and 64 GB of memory.
Now I have it running with only 2 workers. It has been running for 102 hours and seems to be making some progress, but it has not finished yet!
+++++++++++++++
eps = 9.999999999999999451532715e-21
use matrix = 0
use matrix range = 0
naive_thresh = 5
cache_size/2 = 32
starting dendro
average Linkage of 1000000, dim 768 points
distAverage5 point array, norm = 1, rebuild tree 2
========= CHAIN TREE =========
num workers 2
hash table size 64
::initialize: 22.557
So I would ask: is there a way to estimate the memory needed as a function of the following?
- number of processors
- dimensionality
- dataset size
- method to compute distance
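As a starting point for the question above, here is a minimal sketch that computes only the size of the raw point array for a given dataset size and dimensionality. It assumes each coordinate is stored as an 8-byte double, which this thread does not confirm, and the function names are hypothetical. It gives a lower bound for the input alone: whatever the tool allocates for its tree, caches, and per-worker structures comes on top of this and is not estimated here.

```python
# Lower bound on memory for the input points only.
# Assumption: coordinates are 8-byte doubles (not confirmed for parchain).

def point_array_bytes(n_points: int, dim: int, bytes_per_coord: int = 8) -> int:
    """Bytes needed to hold n_points x dim coordinates and nothing else."""
    return n_points * dim * bytes_per_coord


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 2**30


if __name__ == "__main__":
    # Dataset sizes mentioned in this issue, all at 768 dimensions.
    for n_points in (1_000_000, 2_500_000):
        size = point_array_bytes(n_points, 768)
        print(f"{n_points} points x 768 dims: {to_gib(size):.2f} GiB")
```

Comparing this figure with the observed 89 GB for 1M points suggests that most of the memory goes to working structures rather than to the points themselves, so the raw array size alone will not predict whether a run fits in RAM.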
Generate dendrogram given a file containing N d-dimensional vectors
Hi, thanks for the great work!
I am wondering if there are docs about how to use the tool to actually run HAC on a file of vectors, instead of benchmarking.
What is the expected input data format, and what would the output be?
Thanks again!
Install Cilk
Hi,
Your package looks great, but I was not able to install or run it.
Do you have more information on how to install Cilk?
Which version do you use: the Intel version (and if so, which one), or OpenCilk?
Maybe there is a Docker image with C++ and Cilk already installed. Do you know if one exists?
I found this one:
https://github.com/jonniesweb/docker-cilkplus/blob/master/Dockerfile
but it seems that its gcc is 5.4, which may be too old ;-(
Thanks!!!
Joan