pinellolab / haystack_bio
Haystack: Epigenetic Variability and Transcription Factor Motifs Analysis Pipeline
License: Other
How does haystack_bio handle replicates? For example, if each cell type in the example had 3 replicates, and therefore 3 input BAMs, could we still run this?
I'm trying to get haystack up and running, but I get an out-of-range error when generating the bigwig files. The output of haystack_run_test is shown below.
I get a similar error when running on real data.
Thanks,
Shannan
sambamba 0.7.0
by Artem Tarasov and Pjotr Prins (C) 2012-2019
LDC 1.13.0 / DMD v2.083.1 / LLVM7.0.0 / bootstrap LDC - the LLVM D compiler (0.17.6)
sambamba-view: not enough data in stream
Traceback (most recent call last):
File "/Users/shosui/miniconda2/bin/haystack_hotspots", line 10, in <module>
sys.exit(main())
File "/Users/shosui/miniconda2/lib/python2.7/site-packages/haystack/find_hotspots.py", line 1002, in main
args.read_ext)
File "/Users/shosui/miniconda2/lib/python2.7/site-packages/haystack/find_hotspots.py", line 460, in to_normalized_extended_reads_tracks
scaling_factor = get_scaling_factor(bam_filename)
File "/Users/shosui/miniconda2/lib/python2.7/site-packages/haystack/find_hotspots.py", line 63, in get_scaling_factor
scaling_factor = (1.0 / float(stdout.strip())) * 1000000
ValueError: could not convert string to float:
Traceback (most recent call last):
File "/Users/shosui/miniconda2/bin/haystack_pipeline", line 10, in <module>
sys.exit(main())
File "/Users/shosui/miniconda2/lib/python2.7/site-packages/haystack/run_pipeline.py", line 258, in main
'Background_for_%s*.bed' % sample_name))[0]
IndexError: list index out of range
INFO @ Mon, 23 Sep 2019 13:33:47:
Test completed successfully
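The first traceback above is consistent with the read-count command returning an empty string (sambamba reports "not enough data in stream"), so `float('')` fails on the scaling-factor line. Below is a minimal sketch of a more defensive version of that calculation. The helper name `scaling_factor_from_count` and its arguments are hypothetical; this is not haystack_bio's actual code, only the expression from the traceback wrapped in explicit checks.

```python
def scaling_factor_from_count(count_output, bam_filename="<unknown>"):
    """Return a reads-per-million scaling factor from a read-count string.

    Hypothetical helper: it mirrors the expression shown in the traceback,
    (1.0 / n_reads) * 1000000, but fails with a readable message when the
    counting tool printed nothing.
    """
    text = count_output.strip()
    if not text:
        # An empty count usually means the counting command itself failed.
        raise ValueError(
            "No read count returned for %s; the BAM file may be truncated "
            "or unreadable." % bam_filename
        )
    n_reads = float(text)
    if n_reads <= 0:
        raise ValueError(
            "Read count for %s is %s; cannot compute a scaling factor."
            % (bam_filename, text)
        )
    return (1.0 / n_reads) * 1000000
```

With a check like this, the failure would point at the BAM file itself rather than surfacing later as the `IndexError` in the second traceback.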
How many cell types are required for haystack_bio to provide accurate and reliable results? For example, I have ATAC-seq and RNA-seq data on three cell types. Would this be sufficient for running haystack_bio?
Hi,
I have a miniconda3 env with Python 2.7 and pandas 0.21, as required.
Still, when I run haystack_hotspots, I get the following error during the Determine High Plastic Regions (HPR) step:
Traceback (most recent call last):
File "/home/davidebrex/miniconda3/envs/py27/bin/haystack_hotspots", line 10, in <module>
sys.exit(main())
File "/home/davidebrex/miniconda3/envs/py27/lib/python2.7/site-packages/haystack/find_hotspots.py", line 1043, in main
output_directory)
File "/home/davidebrex/miniconda3/envs/py27/lib/python2.7/site-packages/haystack/find_hotspots.py", line 665, in find_hpr_coordinates
result_type ='broadcast')
File "/home/davidebrex/miniconda3/envs/py27/lib/python2.7/site-packages/pandas/core/frame.py", line 4854, in apply
ignore_failures=ignore_failures)
File "/home/davidebrex/miniconda3/envs/py27/lib/python2.7/site-packages/pandas/core/frame.py", line 4950, in _apply_standard
results[i] = func(v)
File "/home/davidebrex/miniconda3/envs/py27/lib/python2.7/site-packages/pandas/core/frame.py", line 4831, in f
return func(x, *args, **kwds)
TypeError: ("zscore() got an unexpected keyword argument 'result_type'", u'occurred at index 65286')
Any idea about what might cause this?
Thank you!
Davide
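The `TypeError` above is what one would expect if the installed pandas predates the `result_type` argument of `DataFrame.apply`: older releases forward unrecognized keyword arguments to the applied function, so `zscore()` receives `result_type`. A minimal sketch of a version check follows, assuming that `result_type` was introduced in pandas 0.23.0; the function names are hypothetical and not part of haystack_bio or pandas.

```python
import re


def version_tuple(version):
    """Turn a version string such as '0.21.0' into a tuple of ints."""
    return tuple(int(part) for part in re.findall(r"\d+", version)[:3])


def apply_accepts_result_type(pandas_version):
    """Return True if DataFrame.apply should accept `result_type`.

    Assumption: the argument was added in pandas 0.23.0.
    """
    return version_tuple(pandas_version) >= (0, 23)
```

In the affected environment, this could be called as `apply_accepts_result_type(pandas.__version__)` after `import pandas`.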
Hi, I am trying to install haystack on our server and I am running into an error when running the tests.
The tests report that they completed successfully, but at the end I get this:
INFO @ Wed, 07 Apr 2021 16:10:54:
Analyzing MA0724.1 from:/home/user/haystack_test_output/HAYSTACK_PIPELINE_RESULTS/HAYSTACK_MOTIFS/HAYSTACK_MOTIFS_on_K562/genes_lists/MA0724.1_motif_region_in_target.tss.bed
/home/user/.conda/envs/hotspots/lib/python2.7/site-packages/haystack/generate_tf_activity_plane.py:189:FutureWarning: read_table is deprecated, use read_csv instead, passing sep='\t'.
mapped_genes = map(str.upper, list(pd.read_table(motif_gene_filename,keep_default_na=False,na_values='null').dropna()['Symbol'].values.astype(str)))
Traceback (most recent call last):
File "/home/users/.conda/envs/hotspots/bin/haystack_tf_activity_plane", line 10, in <module>
sys.exit(main())
File "/home/user/.conda/envs/hotspots/lib/python2.7/site-packages/haystack/generate_tf_activity_plane.py", line 193, in main
ds_values = zscore_series(gene_ranking.ix[mapped_genes, :].mean())
File "/home/user/.conda/envs/hotspots/lib/python2.7/site-packages/pandas/core/indexing.py", line 120, in __getitem__
return self._getitem_tuple(key)
File "/home/user/.conda/envs/hotspots/lib/python2.7/site-packages/pandas/core/indexing.py", line 888, in _getitem_tuple
retval = getattr(retval, self.name)._getitem_axis(key, axis=i)
File "/home/user/.conda/envs/hotspots/lib/python2.7/site-packages/pandas/core/indexing.py", line 1088, in _getitem_axis
return self._getitem_iterable(key, axis=axis)
File "/home/user/.conda/envs/hotspots/lib/python2.7/site-packages/pandas/core/indexing.py", line 1205, in _getitem_iterable
raise_missing=False)
File "/home/user/.conda/envs/hotspots/lib/python2.7/site-packages/pandas/core/indexing.py", line 1161, in _get_listlike_indexer
raise_missing=raise_missing)
File "/home/user/.conda/envs/hotspots/lib/python2.7/site-packages/pandas/core/indexing.py", line 1252, in _validate_read_indexer
raise KeyError("{} not in index".format(not_found))
KeyError: "['BAGE5', 'GRIK1-AS2'] not in index"
INFO @ Wed, 07 Apr 2021 16:10:54:
Test completed successfully
Should I be worried?
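The `KeyError` says that two of the mapped gene symbols are missing from the index of `gene_ranking`, and the pandas version in use raises on missing labels. One possible workaround is to keep only the labels that are present before indexing. The sketch below shows that filtering step in plain Python; `keep_known_labels` is a hypothetical helper, not haystack_bio code, and the gene symbols in the usage example other than those in the traceback are illustrative.

```python
def keep_known_labels(labels, index_labels):
    """Return the labels that also occur in index_labels, in original order."""
    known = set(index_labels)
    return [label for label in labels if label in known]


# Illustrative data: two of the four mapped genes are absent from the index.
mapped_genes = ["GATA1", "BAGE5", "TAL1", "GRIK1-AS2"]
ranked_index = ["TAL1", "GATA1", "KLF1"]
present = keep_known_labels(mapped_genes, ranked_index)
```

The filtered list could then be used for the row selection in place of the full `mapped_genes` list, so that genes missing from the ranking are skipped rather than raising.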
Do we need all the files in haystack_data/extra ?
The three files below contain functions used by all the modules. Can they be combined under a single file? Are all the functions needed by the modules?
PEP8: Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability. Python packages should also have short, all-lowercase names, although the use of underscores is discouraged.
When I use the haystack_download_genome command to download the reference genome, it displays the following error:
Traceback (most recent call last):
File "/picb/epigenome/usr/zhangge/Softwares/haystack_bio_master/haystack/download_genome.py", line 24, in <module>
main()
File "/picb/epigenome/usr/zhangge/Softwares/haystack_bio_master/haystack/download_genome.py", line 21, in main
initialize_genome(args.name)
File "/picb/epigenome/usr/zhangge/Softwares/haystack_bio_master/haystack/haystack_common.py", line 213, in initialize_genome
from bioutilities import Genome_2bit
File "/picb/epigenome/usr/zhangge/Softwares/haystack_bio_master/haystack/bioutilities.py", line 20, in <module>
from bx.intervals.intersection import Intersecter, Interval
File "/home/zhangge/miniconda3/envs/python27/lib/python2.7/site-packages/bx/intervals/__init__.py", line 7, in <module>
from bx.intervals.intersection import * # noqa: F40
ImportError: /home/zhangge/miniconda3/envs/python27/lib/python2.7/site-packages/bx/intervals/intersection.so: undefined symbol: PyUnicodeUCS2_FromStringAndSize
How can I solve this? Thanks!
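An undefined `PyUnicodeUCS2_*` symbol usually indicates that a compiled extension (here, the bx-python `intersection.so`) was built against a Python 2 with narrow (UCS2) Unicode, while the running interpreter uses wide (UCS4) Unicode. A minimal sketch for checking which kind of interpreter is in use, based on `sys.maxunicode`; the function name `unicode_build` is hypothetical.

```python
import sys


def unicode_build(maxunicode=None):
    """Classify an interpreter as 'UCS2' (narrow) or 'UCS4' (wide).

    sys.maxunicode is 65535 on narrow Python 2 builds and 1114111 on wide
    builds (and on every Python 3.3+ interpreter).
    """
    if maxunicode is None:
        maxunicode = sys.maxunicode
    return "UCS2" if maxunicode == 0xFFFF else "UCS4"
```

If `unicode_build()` reports `UCS4` inside the `python27` environment, reinstalling bx-python from the same channel as the interpreter, so that both use the same Unicode width, is one thing to try.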