spiralgenetics / biograph

The BioGraph genomics analysis platform

License: BSD 2-Clause "Simplified" License

Starlark 2.92% C++ 56.40% C 9.27% Shell 1.07% Python 15.45% CSS 0.43% JavaScript 3.70% Dockerfile 0.01% Makefile 0.01% Perl 0.05% Jupyter Notebook 10.71%

biograph's People

surabhim mhoak dna0ff scchess hackerfriendly wangdepin iamh2o qpc-github

biograph's Issues

frozen study indication

vdb study list and vdb study show should indicate whether a study is frozen.

Download link for biograph model is broken

Hi,

Could you provide a working download link for the biograph model? The current one (https://archive.spiralgenetics.com/files/models/biograph_model-7.1.0.ml) is broken. I am happy to upload this file to a reliable storage platform.

I could not locate this file in the docker container.

Thanks,
Yann

Remove pyvcf dependency

PyVCF can only be installed with use_2to3, which setuptools no longer supports as of version 58. We therefore need to remove the PyVCF dependency.

PyVCF is used:

./python/biograph/tools/coverage.py
- Just uses it for header stuff
./python/functest/biograph_test.py
- Single line just to count vcf entries
./python/functest/GenomeGraphTests.py
- Again, just a counter

There are a few places in ./python/biograph/internal/ where PyVCF is also used. But before refactoring that code, we should check if we even need to support those tools at all.
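
For the header and counter uses listed above, the standard library may be enough. The following is a minimal sketch, not the project's code: the function names open_vcf, count_vcf_entries, and read_vcf_header are hypothetical, and it assumes a VCF data record is any non-empty line that does not start with '#'.

```python
import gzip


def open_vcf(path):
    """Open a plain or gzip/bgzip-compressed VCF in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def count_vcf_entries(path):
    """Count data records: non-empty lines that do not start with '#'."""
    with open_vcf(path) as handle:
        return sum(1 for line in handle if line.strip() and not line.startswith("#"))


def read_vcf_header(path):
    """Return the leading '#' lines, without their trailing newlines."""
    header = []
    with open_vcf(path) as handle:
        for line in handle:
            if not line.startswith("#"):
                break
            header.append(line.rstrip("\n"))
    return header
```

This does not parse INFO or FORMAT fields, so it would only cover the counting and header cases, not the tools under ./python/biograph/internal/.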

import_formats_test fails sporadically

I can't reproduce this error outside of github-ci, but it fails fairly regularly when testing with or without -copt.

https://github.com/spiralgenetics/biograph/runs/3624148617?check_suite_focus=true

Running out of space is a non-fatal error

Trying to export a study without a properly set TMPDIR gave error:


Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/ubuntu/pyenv/lib/python3.6/site-packages/biograph/vdb/study_cmd.py", line 97, in run
    file=f
OSError: [Errno 28] No space left on device
Exporting VCF

So the export failed but kept going, exporting only chromosome 1 and part of chromosome 10.
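
One way to make this fatal is to check each worker's exit code after joining it. This is a rough sketch under assumptions: it is not taken from study_cmd.py, and the names ExportFailed, start_workers, and join_or_raise are hypothetical. It relies only on the documented behavior that a multiprocessing.Process whose target raises an exception ends with a non-zero exitcode.

```python
import multiprocessing


class ExportFailed(RuntimeError):
    """Raised when at least one export worker exits abnormally."""


def start_workers(target, items):
    """Start one process per item (for example, one per chromosome)."""
    procs = []
    for item in items:
        proc = multiprocessing.Process(target=target, args=(item,), name=str(item))
        proc.start()
        procs.append(proc)
    return procs


def join_or_raise(procs):
    """Join every worker, then raise if any exited with a non-zero code."""
    failed = []
    for proc in procs:
        proc.join()
        if proc.exitcode != 0:
            failed.append((proc.name, proc.exitcode))
    if failed:
        raise ExportFailed(f"workers failed: {failed}")
```

A caller would run join_or_raise(start_workers(worker, chromosomes)) and let ExportFailed stop the export instead of writing a partial VCF.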

boto3 reports credentials file too often

Every time a new connection is made to Athena or S3, boto3 reports:

Found credentials in shared credentials file: ~/.aws/credentials

This is quite annoying when running many child processes in parallel.

Implement a logging filter, e.g. https://stackoverflow.com/questions/879732/logging-with-filters
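
A filter along these lines might look like the sketch below. It assumes the message is emitted by the "botocore.credentials" logger and begins with "Found credentials in"; the class name DropCredentialsNotice is hypothetical. Only the standard logging module is used.

```python
import logging


class DropCredentialsNotice(logging.Filter):
    """Drop the 'Found credentials in ...' records; pass everything else."""

    def filter(self, record):
        # Returning False tells the logging machinery to discard the record.
        return not record.getMessage().startswith("Found credentials in")


# Assumption: botocore logs this notice on the "botocore.credentials" logger.
logging.getLogger("botocore.credentials").addFilter(DropCredentialsNotice())
```

A filter attached to a logger applies only to records logged on that logger itself, so other botocore messages are unaffected. Each child process would need to install the filter if loggers are not inherited from the parent.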