pinellolab / haystack_bio Goto Github PK

43.0 4.0 10.0 1.21 GB

Haystack: Epigenetic Variability and Transcription Factor Motifs Analysis Pipeline

License: Other

Shell 0.02% Python 0.73% CSS 0.01% HTML 98.78% Jupyter Notebook 0.46% R 0.01% Dockerfile 0.01%

epigenomics gene regulation tfs transcription-factor-binding rna-seq chip-seq fimo motifs epigenetics

haystack_bio's Issues

replicates with haystack_bio

How does haystack_bio handle replicates? For example, if each cell type in the example had 3 replicates, and therefore 3 input BAM files, could we still run the pipeline?

Error generating normalized bigwig files

I'm trying to get haystack up and running, but I get a "list index out of range" error while it generates the bigwig files. This is the output after running haystack_run_test.

I get a similar error when running on real data.

Thanks,
Shannan

sambamba 0.7.0
by Artem Tarasov and Pjotr Prins (C) 2012-2019
LDC 1.13.0 / DMD v2.083.1 / LLVM7.0.0 / bootstrap LDC - the LLVM D compiler (0.17.6)

sambamba-view: not enough data in stream
Traceback (most recent call last):
File "/Users/shosui/miniconda2/bin/haystack_hotspots", line 10, in <module>
sys.exit(main())
File "/Users/shosui/miniconda2/lib/python2.7/site-packages/haystack/find_hotspots.py", line 1002, in main
args.read_ext)
File "/Users/shosui/miniconda2/lib/python2.7/site-packages/haystack/find_hotspots.py", line 460, in to_normalized_extended_reads_tracks
scaling_factor = get_scaling_factor(bam_filename)
File "/Users/shosui/miniconda2/lib/python2.7/site-packages/haystack/find_hotspots.py", line 63, in get_scaling_factor
scaling_factor = (1.0 / float(stdout.strip())) * 1000000
ValueError: could not convert string to float: 
Traceback (most recent call last):
File "/Users/shosui/miniconda2/bin/haystack_pipeline", line 10, in <module>
sys.exit(main())
File "/Users/shosui/miniconda2/lib/python2.7/site-packages/haystack/run_pipeline.py", line 258, in main
'Background_for_%s*.bed' % sample_name))[0]
IndexError: list index out of range
INFO  @ Mon, 23 Sep 2019 13:33:47:
 Test completed successfully
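The first traceback above fails because an empty string reaches `float()` in `get_scaling_factor`, right after sambamba reports "not enough data in stream". As a minimal sketch of how that step could fail with a clearer message, the helper below applies the same arithmetic shown in the traceback but checks the input first. The function name `scaling_factor_from_count` and its error messages are hypothetical and are not part of haystack_bio.

```python
def scaling_factor_from_count(count_output):
    """Return a reads-per-million scaling factor from a read-count string.

    `count_output` stands in for the text captured from the read-counting
    command (the traceback calls it `stdout`). Hypothetical helper, not
    the haystack_bio implementation.
    """
    text = count_output.strip()
    if not text:
        # An empty string is what triggers the ValueError in the traceback.
        raise ValueError(
            "read-count command produced no output; "
            "check that the BAM file is not empty or truncated"
        )
    n_reads = float(text)
    if n_reads <= 0:
        raise ValueError("BAM file reports %s reads" % text)
    # Same arithmetic as in the traceback: (1.0 / count) * 1000000
    return (1.0 / n_reads) * 1000000


if __name__ == "__main__":
    print(scaling_factor_from_count("2000000\n"))
```

With a check like this, the run would stop at the unreadable BAM file instead of continuing to the later `IndexError`.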

Number of cell types required

How many cell types are required for haystack_bio to provide accurate and reliable results? For example, I have ATAC-seq and RNA-seq data for three cell types. Would this be sufficient for running haystack_bio?

TypeError: ("zscore() got an unexpected keyword argument 'result_type'", u'occurred at index 65286')

Hi,
I have a miniconda3 environment with Python 2.7 and pandas 0.21, as required.
Still, when I run haystack_hotspots, I get the following error during the "Determine High Plastic Regions (HPR)" step:

Traceback (most recent call last):
  File "/home/davidebrex/miniconda3/envs/py27/bin/haystack_hotspots", line 10, in <module>
    sys.exit(main())
  File "/home/davidebrex/miniconda3/envs/py27/lib/python2.7/site-packages/haystack/find_hotspots.py", line 1043, in main
    output_directory)
  File "/home/davidebrex/miniconda3/envs/py27/lib/python2.7/site-packages/haystack/find_hotspots.py", line 665, in find_hpr_coordinates
    result_type ='broadcast')
  File "/home/davidebrex/miniconda3/envs/py27/lib/python2.7/site-packages/pandas/core/frame.py", line 4854, in apply
    ignore_failures=ignore_failures)
  File "/home/davidebrex/miniconda3/envs/py27/lib/python2.7/site-packages/pandas/core/frame.py", line 4950, in _apply_standard
    results[i] = func(v)
  File "/home/davidebrex/miniconda3/envs/py27/lib/python2.7/site-packages/pandas/core/frame.py", line 4831, in f
    return func(x, *args, **kwds)
TypeError: ("zscore() got an unexpected keyword argument 'result_type'", u'occurred at index 65286')

Any idea about what might cause this?
Thank you!

Davide
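One possible cause, read off the traceback rather than confirmed by the maintainers: the `result_type` argument of `DataFrame.apply` was added in pandas 0.23.0. Older releases forward unknown keyword arguments to the applied function (see `return func(x, *args, **kwds)` in the traceback), so `zscore()` ends up receiving `result_type`. The sketch below checks a version string against that threshold; `supports_result_type` is a hypothetical helper name.

```python
import re


def supports_result_type(pandas_version):
    """Return True if DataFrame.apply accepts result_type (pandas >= 0.23)."""
    match = re.match(r"(\d+)\.(\d+)", pandas_version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % pandas_version)
    major, minor = int(match.group(1)), int(match.group(2))
    # Tuple comparison handles both 0.x and 1.x style versions.
    return (major, minor) >= (0, 23)


if __name__ == "__main__":
    for version in ("0.21.0", "0.23.4"):
        print(version, supports_result_type(version))
```

In a real environment, the string to pass in would be `pandas.__version__`.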

KeyError while running tests

Hi, I am trying to install haystack on our server and I am running into an error when running the tests.
The test run reports success, but this appears at the end of the output:

INFO  @ Wed, 07 Apr 2021 16:10:54:
	 Analyzing MA0724.1 from:/home/user/haystack_test_output/HAYSTACK_PIPELINE_RESULTS/HAYSTACK_MOTIFS/HAYSTACK_MOTIFS_on_K562/genes_lists/MA0724.1_motif_region_in_target.tss.bed
/home/user/.conda/envs/hotspots/lib/python2.7/site-packages/haystack/generate_tf_activity_plane.py:189:FutureWarning: read_table is deprecated, use read_csv instead, passing sep='\t'.
  mapped_genes = map(str.upper, list(pd.read_table(motif_gene_filename,keep_default_na=False,na_values='null').dropna()['Symbol'].values.astype(str)))
Traceback (most recent call last):
  File "/home/users/.conda/envs/hotspots/bin/haystack_tf_activity_plane", line 10, in <module>
    sys.exit(main())
  File "/home/user/.conda/envs/hotspots/lib/python2.7/site-packages/haystack/generate_tf_activity_plane.py", line 193, in main
    ds_values = zscore_series(gene_ranking.ix[mapped_genes, :].mean())
  File "/home/user/.conda/envs/hotspots/lib/python2.7/site-packages/pandas/core/indexing.py", line 120, in __getitem__
    return self._getitem_tuple(key)
  File "/home/user/.conda/envs/hotspots/lib/python2.7/site-packages/pandas/core/indexing.py", line 888, in _getitem_tuple
    retval = getattr(retval, self.name)._getitem_axis(key, axis=i)
  File "/home/user/.conda/envs/hotspots/lib/python2.7/site-packages/pandas/core/indexing.py", line 1088, in _getitem_axis
    return self._getitem_iterable(key, axis=axis)
  File "/home/user/.conda/envs/hotspots/lib/python2.7/site-packages/pandas/core/indexing.py", line 1205, in _getitem_iterable
    raise_missing=False)
  File "/home/user/.conda/envs/hotspots/lib/python2.7/site-packages/pandas/core/indexing.py", line 1161, in _get_listlike_indexer
    raise_missing=raise_missing)
  File "/home/user/.conda/envs/hotspots/lib/python2.7/site-packages/pandas/core/indexing.py", line 1252, in _validate_read_indexer
    raise KeyError("{} not in index".format(not_found))
KeyError: "['BAGE5', 'GRIK1-AS2'] not in index"
INFO  @ Wed, 07 Apr 2021 16:10:54:
	 Test completed successfully

Should I be worried?
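The `KeyError` says that two gene symbols from the motif gene list are absent from the index of the gene ranking table, so the label-based lookup fails. A minimal sketch of one way to guard such a lookup is to split the requested symbols into present and missing ones first. The helper name `split_by_presence` and the symbols `GATA1` and `TAL1` are illustrative only, and whether silently skipping missing genes is the right behavior for haystack_bio is an assumption.

```python
def split_by_presence(requested, available):
    """Split requested labels into (present, missing), preserving order."""
    available_set = set(available)
    present = [g for g in requested if g in available_set]
    missing = [g for g in requested if g not in available_set]
    return present, missing


if __name__ == "__main__":
    # Illustrative data: two symbols from the error message plus one match.
    mapped_genes = ["GATA1", "BAGE5", "GRIK1-AS2"]
    ranking_index = ["GATA1", "TAL1"]
    present, missing = split_by_presence(mapped_genes, ranking_index)
    print("using:", present)
    print("skipping:", missing)
```

Only the `present` list would then be passed to the lookup, and the `missing` list could be logged as a warning.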

code organization

These files in the repository root are not needed:

haystack_download_genome.py
haystack_hotspots.py
haystack_motifs.py
haystack_pipeline.py

Do we need all the files in haystack_data/extra?
The three files below contain functions used by all the modules. Can they be combined into a single file? Are all of their functions actually needed by the modules?

bioutilities.py
external.py
haystack_common.py

The module names are not all-lowercase, and the leading haystack prefix does not describe what each module does:

haystack_download_genome_CORE.py
haystack_hotspots_CORE.py
haystack_motifs_CORE.py
haystack_pipeline_CORE.py
haystack_tf_activity_plane_CORE.py

PEP8: Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability. Python packages should also have short, all-lowercase names, although the use of underscores is discouraged.
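As a rough illustration of the naming rule quoted above, the sketch below flags file names whose stem is not all-lowercase. The regular expression is one reasonable reading of "short, all-lowercase names" with optional underscores, not an official PEP 8 check, and `is_lowercase_module_name` is a hypothetical helper.

```python
import re

# Lowercase letters, digits and underscores, starting with a letter.
_MODULE_NAME = re.compile(r"^[a-z][a-z0-9_]*$")


def is_lowercase_module_name(filename):
    """Return True if the module file name follows the lowercase convention."""
    stem = filename[:-3] if filename.endswith(".py") else filename
    return _MODULE_NAME.match(stem) is not None


if __name__ == "__main__":
    for name in ("haystack_hotspots_CORE.py", "find_hotspots.py"):
        print(name, is_lowercase_module_name(name))
```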

haystack_download_genome error

When I use the haystack_download_genome command to download the reference genome, it displays the following error:
Traceback (most recent call last):
File "/picb/epigenome/usr/zhangge/Softwares/haystack_bio_master/haystack/download_genome.py", line 24, in <module>
main()
File "/picb/epigenome/usr/zhangge/Softwares/haystack_bio_master/haystack/download_genome.py", line 21, in main
initialize_genome(args.name)
File "/picb/epigenome/usr/zhangge/Softwares/haystack_bio_master/haystack/haystack_common.py", line 213, in initialize_genome
from bioutilities import Genome_2bit
File "/picb/epigenome/usr/zhangge/Softwares/haystack_bio_master/haystack/bioutilities.py", line 20, in <module>
from bx.intervals.intersection import Intersecter, Interval
File "/home/zhangge/miniconda3/envs/python27/lib/python2.7/site-packages/bx/intervals/__init__.py", line 7, in <module>
from bx.intervals.intersection import * # noqa: F40
ImportError: /home/zhangge/miniconda3/envs/python27/lib/python2.7/site-packages/bx/intervals/intersection.so: undefined symbol: PyUnicodeUCS2_FromStringAndSize

How can I solve this? Thanks!
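The symbol name `PyUnicodeUCS2_FromStringAndSize` hints at a Unicode build mismatch: the compiled `intersection.so` appears to expect a Python 2 interpreter built with 2-byte (UCS2) Unicode, while the interpreter loading it seems to be a 4-byte (UCS4) build. This is a reading of the error message, not a confirmed diagnosis. A quick way to see which kind of build an interpreter is uses `sys.maxunicode`; the helper name `unicode_build` is hypothetical.

```python
import sys


def unicode_build(maxunicode=None):
    """Classify an interpreter as a UCS2 (narrow) or UCS4 (wide) build.

    Narrow Python 2 builds report sys.maxunicode == 0xFFFF; wide builds
    (and every Python 3.3+ interpreter) report 0x10FFFF.
    """
    if maxunicode is None:
        maxunicode = sys.maxunicode
    return "UCS2" if maxunicode == 0xFFFF else "UCS4"


if __name__ == "__main__":
    print(unicode_build())
```

If the interpreter reports UCS4, reinstalling bx-python so that it is built for, or downloaded from the same source as, that interpreter may resolve the import error.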
