amcpherson / remixt
Clone-specific genomic structure estimation in cancer
License: MIT License
I am going to analyze BAM files mapped to the hg38 reference. How can I run "create_ref_data" for hg38?
Thank you.
Hi,
The following command
remixt create_ref_data $ref_data_dir
works and ends properly. However, the subsequent command
remixt mappability_bwa $ref_data_dir
fails because bwa does not find the index of the reference genome.
If the default behavior is to generate a mappability file based on bwa alignments, would it be possible to add the proper bwa indexing as an additional step in remixt create_ref_data?
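One way to confirm that a missing index is the cause is to look for the files that bwa index normally writes next to the FASTA. The following is a minimal Python sketch under stated assumptions: the FASTA filename inside the reference data directory (genome.fa) is a placeholder and not taken from remixt, and both helper names are hypothetical.

```python
import os

# Suffixes of the files that `bwa index` writes next to the FASTA.
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_bwa_index_files(fasta_path):
    """Return the bwa index files that do not exist yet for fasta_path."""
    return [
        fasta_path + suffix
        for suffix in BWA_INDEX_SUFFIXES
        if not os.path.exists(fasta_path + suffix)
    ]


def bwa_index_command(fasta_path):
    """Build the command that would create the index (not executed here)."""
    return ["bwa", "index", fasta_path]


if __name__ == "__main__":
    # Hypothetical path: substitute the genome FASTA that create_ref_data
    # actually placed in your reference data directory.
    fasta = os.path.join(os.environ.get("ref_data_dir", "."), "genome.fa")
    missing = missing_bwa_index_files(fasta)
    if missing:
        print("Missing index files:", missing)
        print("Suggested command:", " ".join(bwa_index_command(fasta)))
    else:
        print("bwa index looks complete for", fasta)
```

If any files are reported missing, running the printed bwa index command by hand before remixt mappability_bwa may be enough as a stopgap.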
Hi,
With the most recent versions of conda and bioconda, the ReMixT pipeline fails when running samtools with the following error:
samtools: error while loading shared libraries: libcrypto.so.1.0.0: cannot open shared object file: No such file or directory
This is due to the recent transition to OpenSSL 1.1.1 in bioconda. To solve this issue, samtools must be updated to at least the following version:
conda install samtools=1.9=*_11
Would it be possible to add this updated version of samtools to the conda distribution of remixt?
Hi @amcpherson,
When trying the example code for ReMixT, we encountered the following error:
2018-05-17 12:29:25,210 - pypeliner.scheduler - INFO - job /remixt_seqdata_workflow/create_segments executing
2018-05-17 12:29:25,219 - pypeliner.scheduler - INFO - job /remixt_seqdata_workflow/create_segments -> remixt.analysis.segment.create_segments('/gpfs/commons/groups/imielinski_lab/git/mskilab/flows/modules/remixt/testing/raw_data/segments.tsv.tmp', {'ensembl_assemblies': ['chromosome.15'], 'chromosomes': ['15']}, '/gpfs/commons/groups/imielinski_lab/git/mskilab/flows/modules/remixt/testing/ref_data', breakpoint_filename='/gpfs/commons/groups/imielinski_lab/git/mskilab/flows/modules/remixt/testing/HCC1395_breakpoints.tsv')
2018-05-17 12:29:36,620 - pypeliner.scheduler - ERROR - job /remixt_seqdata_workflow/create_segments failed to complete
--- stdout ---
--- stderr ---
Traceback (most recent call last):
  File "/gpfs/commons/home/mimielinski/software/anaconda2/lib/python2.7/site-packages/pypeliner/jobs.py", line 286, in __call__
    self.ret_value = self.func(*self.callset.args, **self.callset.kwargs)
  File "/gpfs/commons/home/mimielinski/software/anaconda2/lib/python2.7/site-packages/remixt/analysis/segment.py", line 71, in create_segments
    changepoints.sort(['chromosome', 'position'], inplace=True)
  File "/gpfs/commons/home/mimielinski/software/anaconda2/lib/python2.7/site-packages/pandas/core/generic.py", line 3614, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'sort'
I suspect this is related to the pandas version; we are using pandas-0.22.0.
Please let me know how we can resolve this. Thanks!
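The traceback is consistent with that suspicion: DataFrame.sort was deprecated in favour of sort_values and later removed from pandas, so it is absent in 0.22.0. Below is a minimal sketch of the equivalent call on made-up data, assuming the intent of line 71 of segment.py is an in-place sort by chromosome and position. It is an illustration, not the maintainer's patch.

```python
import pandas as pd

# Toy changepoints table; the values are invented for illustration.
changepoints = pd.DataFrame({
    "chromosome": ["15", "15", "2"],
    "position": [300, 100, 200],
})

# Old call that fails on recent pandas:
#   changepoints.sort(['chromosome', 'position'], inplace=True)
# Replacement available since pandas 0.17:
changepoints.sort_values(["chromosome", "position"], inplace=True)

print(changepoints)
```

The other workaround is to pin an older pandas release that still provides DataFrame.sort, at the cost of holding back the rest of the environment.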
Hello,
I ran into a problem when running remixt. The command I executed is:
remixt run ref_data_dir result_remix breakpoit/bp.hg19.txt --normal_sample_id oec --normal_bam_file bam/OEC130618.rlg.bam --tumour_sample_ids lm130227 --tumour_bam_files bam/LM130227.rlg.bam --results_files lm130227.hd --tmpdir temp
and the error is:
/home/yuzh/miniconda2/lib/python2.7/site-packages/statsmodels/compat/pandas.py:56: FutureWarning: The pandas.core.datetools module is deprecated and will be removed in a future version. Please use the pandas.tseries module instead.
  from pandas.core import datetools
Traceback (most recent call last):
  File "/home/yuzh/miniconda2/bin/remixt", line 11, in <module>
    load_entry_point('remixt==0.5.5', 'console_scripts', 'remixt')()
  File "/home/yuzh/miniconda2/lib/python2.7/site-packages/remixt/ui/main.py", line 24, in main
    func(**args)
  File "/home/yuzh/miniconda2/lib/python2.7/site-packages/remixt/ui/run.py", line 32, in run
    pyp = pypeliner.app.Pypeline([remixt], pypeliner_config)
  File "/home/yuzh/miniconda2/lib/python2.7/site-packages/pypeliner/app.py", line 195, in __init__
    config_filename=self.config['submit_config'])
  File "/home/yuzh/miniconda2/lib/python2.7/site-packages/pypeliner/execqueue/factory.py", line 6, in create
    raise Exception('No submit queue specified')
Exception: No submit queue specified
By the way, the reference genome I used is hg19 rather than GRCh37. Will this affect the results?
I have no idea how to solve this problem and would very much appreciate your help.
Thank you!
Dear authors,
I would like to test this very interesting method! I have a collection of BAM files corresponding to multiple samples from the same patient and a corresponding matched-normal sample. Unfortunately, these BAM files specify chromosomes with the chr notation (e.g. chr1, chr2, chr3, ..., chr22), which seems to differ from the assumptions behind the default values of the config file.
As such, I would like to know whether there is a simple way to run ReMixT without changing the BAM files. Is it sufficient to provide the chromosome names as chromosomes: ['chr1', ...] in the config file? Is there anything else that needs to be changed, for example the names of the corresponding 1000G files, where the files are specified as _chr{chromosome} and should probably become _{chromosome}?
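As an illustration of the config change being asked about, the chr-prefixed list can be generated rather than typed by hand. This is only a sketch: the chromosomes key appears in the log of another report on this page, but whether setting it is sufficient is not confirmed here, and the filename templates below are hypothetical stand-ins for the real 1000G settings.

```python
def chr_prefixed(names):
    """Add a 'chr' prefix to chromosome names that lack one."""
    return [n if n.startswith("chr") else "chr" + n for n in names]


# Autosomes plus X, written without the prefix.
plain = [str(i) for i in range(1, 23)] + ["X"]
chromosomes = chr_prefixed(plain)

# Hypothetical filename templates, for illustration only: if the config
# value already carries the prefix, a template containing a literal
# "chr" would double it.
template_with_prefix = "panel_chr{chromosome}.vcf.gz"
template_without_prefix = "panel_{chromosome}.vcf.gz"

print(chromosomes[:3])
print(template_with_prefix.format(chromosome=chromosomes[0]))
print(template_without_prefix.format(chromosome=chromosomes[0]))
```

The second print shows why a template of the form _chr{chromosome} would likely need to become _{chromosome} once the chromosome names themselves include the prefix.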
Thank you
I have installed remixt and started downloading the reference data, but that process failed with the following message:
--2023-10-11 06:02:33-- http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000G_2504_high_coverage/working/20220422_3202_phased_SNV_INDEL_SV/1kGP_high_coverage_Illumina.chrX.filtered.SNV_INDEL_SV_phased_panel.vcf.gz
Resolving ftp.1000genomes.ebi.ac.uk (ftp.1000genomes.ebi.ac.uk)... 193.62.193.167
Connecting to ftp.1000genomes.ebi.ac.uk (ftp.1000genomes.ebi.ac.uk)|193.62.193.167|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2023-10-11 06:05:37 ERROR 404: Not Found.
Using a web browser, I visited http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000G_2504_high_coverage/working/20220422_3202_phased_SNV_INDEL_SV and saw that the file 1kGP_high_coverage_Illumina.chrX.filtered.SNV_INDEL_SV_phased_panel.vcf.gz is not there; instead, I see a similarly named file, 1kGP_high_coverage_Illumina.chrX.filtered.SNV_INDEL_SV_phased_panel.v2.vcf.gz.
How should I address this situation?
Can I restart the download process from the point where it halted, without re-downloading the files that I already have?
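One way to see what is still missing before retrying is to compare the expected filenames against what is already on disk. The sketch below is generic Python and not part of remixt: the helper names are hypothetical, the .v2 rename simply mirrors the filename observed on the server, and whether remixt skips files that are already present is not confirmed here.

```python
import os


def v2_name(filename):
    """Map 'name.vcf.gz' to 'name.v2.vcf.gz' (the pattern seen on the server)."""
    suffix = ".vcf.gz"
    if filename.endswith(suffix) and not filename.endswith(".v2" + suffix):
        return filename[: -len(suffix)] + ".v2" + suffix
    return filename


def files_to_download(expected, directory):
    """Return expected filenames that are absent or empty in directory."""
    todo = []
    for name in expected:
        path = os.path.join(directory, name)
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            todo.append(name)
    return todo


name = "1kGP_high_coverage_Illumina.chrX.filtered.SNV_INDEL_SV_phased_panel.vcf.gz"
print(v2_name(name))
```

Treating empty files as missing is a deliberate choice here, since an interrupted download can leave a zero-byte file behind. A partially written non-empty file would still need a size or checksum comparison to detect.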
Regards,
Eric Sisson
Hi,
When running on a server without an active X server, the ReMixT pipeline fails because it is unable to save some .pdf files.
Would it be possible to provide an optional argument to switch matplotlib and similar libraries to the Agg backend instead?
A temporary workaround is to set the corresponding environment variable manually.
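The environment-variable workaround can be sketched as follows. matplotlib reads the MPLBACKEND environment variable when it is imported, so setting it to Agg in the environment of the child process selects the non-interactive backend. The helper name is hypothetical, and the child command here is a stand-in that only echoes the variable; in practice it would be the remixt command line.

```python
import os
import subprocess
import sys


def headless_env(base=None):
    """Return a copy of the environment with matplotlib forced to Agg."""
    env = dict(os.environ if base is None else base)
    env["MPLBACKEND"] = "Agg"
    return env


# Stand-in child process: prints the backend variable it inherited.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['MPLBACKEND'])"],
    env=headless_env(),
    stdout=subprocess.PIPE,
    universal_newlines=True,
    check=True,
)
print(result.stdout.strip())
```

From a shell, exporting MPLBACKEND=Agg before launching the pipeline has the same effect without any wrapper.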