Comments (7)
I'll attempt to take this on! Btw, there is a Python squash_edges
function in test_topology.py
over here.
from tskit.
Hmm, yes, you're right @gtsambos, the function is assuming that there's only one parent in the list of edges. It's easy enough to generalise to handle more than one though. Here's what I'd suggest:
-
Generalise the
squash_edges
function to take lists that have more than one distinct parent value. This would require (a) addingparent
as the first key in the comparison function (there should be lots of other comparison functions around to use as an example); (b) changing the for loop logic to deal with changing from one parent to another; (c) writing a couple of (very) simple test cases to cover this functionality (manually creating a few unsorted small lists of edges and testing the output is squashed appropriately). -
Add a a new method
tsk_edge_table_squash
that mallocs a list of edges, copies its contents into this list, callssquash_edges
, callstsk_edge_table_clear()
and then runstsk_edge_table_add_row
for each of the new squashed edges. -
Set up the plumbing between
tsk_edge_table_squash
andEdgeTable_squash
in_tskitmodule.c
.
Does this seem like a good approach @gtsambos? Sorry about the misinformation on what the current function does.
from tskit.
Cool - the python version will be handy to write a few quick tests to compare with.
from tskit.
Hmmm, I'm wondering whether the existing squash_edges
function in C was made for some slightly different purpose to what we need it for here (link to code is here)?
- I think the first argument should be of type
tsk_edge_table_t *
? I've assumed this is a typo and fixed it - Assuming that
edges
has typeedge_table_t *
, then all the things likeedges[k].parent
should actually beedges->parent[k]
(I think?) - This assertion
assert(edges[k - 1].parent == edges[k].parent)
suggests that the input into this function isn't a normal edge table but some truncated one, as most edge tables would have more than one distinct parent in them. The comparison functioncmp_edge_cl
that you defined here also looks like it's been designed for this purpose.
Btw I know you're very busy @jeromekelleher! Just flagging these to indicate that this task won't just be setting up plumbing, so might take a bit longer than we first thought
tsk_squash_edges(tsk_edge_t *edges, size_t num_edges, size_t *num_output_edges)
{
int ret = 0;
size_t j, k, l;
tsk_edge_t e;
qsort(edges, num_edges, sizeof(tsk_edge_t), cmp_edge_cl);
j = 0;
l = 0;
for (k = 1; k < num_edges; k++) {
assert(edges[k - 1].parent == edges[k].parent);
if (edges[k - 1].right != edges[k].left || edges[j].child != edges[k].child) {
e = edges[j];
e.right = edges[k - 1].right;
edges[l] = e;
j = k;
l++;
}
}
e = edges[j];
e.right = edges[k - 1].right;
edges[l] = e;
*num_output_edges = l + 1;
return ret;
}
from tskit.
Hi @jeromekelleher, let's continue the conversation over here.
from tskit.
Presumably this can now be closed?
from tskit.
Yes, closed in #285.
from tskit.
Related Issues (20)
- Use reusable workflows across repos? HOT 2
- Error pip install tskit on Windows HOT 2
- Add node_is_sample array HOT 2
- TreeSequence.f2 is not symmetric with multiallelic sites HOT 3
- Bug assertion in `ts.allele_frequency_spectrum` HOT 8
- Folded AFS between branch and site modes differs by factor of two HOT 7
- edges lost after merging two trees HOT 20
- Codecov upload issues HOT 1
- keep_intervals() giving _tskit.LibraryError: Can't squash, flush, simplify or link ancestors... HOT 7
- Update GitHub upload/download artefacts
- Add XTable.drop_metadata HOT 1
- trees.c Compilation Error HOT 2
- Split large numbers in html/cli print out.
- Post release for 0.5.7 HOT 1
- Fixup tests for lshmm 0.0.6 HOT 7
- Fix tests for numpy 2.0 HOT 1
- Drop "benchmark" CI job?
- Pass a numpy array of booleans to python simplify? HOT 7
- Cannot pickle '_tskit.Tree' object HOT 3
- AFS folding (polarised=False) HOT 6
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from tskit.