Comments (3)
Hi @ashenflower,
Maybe you are looking for the --target-avg-size and --antitarget-avg-size parameters of the batch subcommand? (See cnvkit.py batch --help.)
Hope this helps!
Have a nice day,
Felix.
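For readers who want to see where those options fit, here is a minimal sketch that only assembles a batch command line and prints it; it does not run CNVkit. The file names (sample.bam, reference.fa), the output directory, and the helper name build_batch_command are placeholders invented for illustration, and the choice of a flat reference (-n with no normal samples) in WGS mode is an assumption about the setup, not something stated in this thread.

```python
import shlex


def build_batch_command(bams, fasta, target_avg_size=1000,
                        antitarget_avg_size=None, output_dir="cnvkit_out"):
    """Assemble (but do not run) a 'cnvkit.py batch' command line.

    Hypothetical helper: all paths are placeholders supplied by the caller.
    """
    cmd = ["cnvkit.py", "batch", *bams,
           "-n",                 # no normal samples given: flat reference (assumed setup)
           "-m", "wgs",          # whole-genome protocol (assumed setup)
           "-f", fasta,
           "--target-avg-size", str(target_avg_size),
           "-d", output_dir]
    if antitarget_avg_size is not None:
        cmd += ["--antitarget-avg-size", str(antitarget_avg_size)]
    return cmd


cmd = build_batch_command(["sample.bam"], "reference.fa", target_avg_size=1000)
print(shlex.join(cmd))
```

Checking cnvkit.py batch --help against the installed version is still the safest way to confirm the exact option names and defaults.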
from cnvkit.
Hello @tetedange13, thank you very much for your reply! That's what I was looking for (I had seen that parameter before, but I had misunderstood its usage).
But another question came to mind after setting --target-avg-size to 1000 and running the batch pipeline. If I am correct, the output .cnr file should report the log2 ratio for each bin, and I wanted to apply the CN calling directly to this file. However, it seems that not all bins are listed in the file (e.g., for some chromosomes the bins do not start from 0). Is that expected? Why is that so?
Also, in my genomic.target.bed file, several sequences are missing, even though they are listed among the 'Accessible regions' in the log file, and I don't understand why.
Thank you so much for your help!
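One way to make the "bins do not start from 0" observation concrete is to summarize, per chromosome, where the first bin starts and how many gaps separate consecutive bins. The sketch below assumes the .cnr file is tab-separated with a header containing chromosome, start and end columns, and that rows are sorted by position within each chromosome; the three data rows are made up for illustration (only their coordinates echo the log posted in this thread). A first bin at 10000 rather than 0 would be consistent with that log, where the first accessible region on NC_000001.10 begins at 10000.

```python
import csv
import io


def summarize_bins(cnr_handle):
    """Return {chromosome: (n_bins, first_start, last_end, n_gaps)}.

    Assumes a tab-separated file with 'chromosome', 'start', 'end' columns,
    sorted by position within each chromosome.
    """
    summary = {}
    prev_end = {}
    for row in csv.DictReader(cnr_handle, delimiter="\t"):
        chrom = row["chromosome"]
        start, end = int(row["start"]), int(row["end"])
        n, first, last, gaps = summary.get(chrom, (0, start, end, 0))
        # A gap is any bin that starts after the previous bin ended.
        if chrom in prev_end and start > prev_end[chrom]:
            gaps += 1
        summary[chrom] = (n + 1, min(first, start), max(last, end), gaps)
        prev_end[chrom] = end
    return summary


# Illustrative, made-up rows; replace with open("sample.cnr") for a real file.
example = io.StringIO(
    "chromosome\tstart\tend\tgene\tdepth\tlog2\tweight\n"
    "NC_000001.10\t10000\t11000\t-\t30.0\t0.01\t0.9\n"
    "NC_000001.10\t11000\t12000\t-\t29.5\t-0.02\t0.9\n"
    "NC_000001.10\t227417\t228417\t-\t31.2\t0.05\t0.9\n"
)
summary = summarize_bins(example)
for chrom, (n, first, last, gaps) in summary.items():
    print(chrom, n, first, last, gaps)
```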
This is the log file I get by applying CNVkit on an old reference:
WGS protocol: recommend '--annotate' option (e.g. refFlat.txt) to help locate genes in output files.
NC_000001.10: Scanning for accessible regions
Accessible region NC_000001.10:10000-177417 (size 167417)
Accessible region NC_000001.10:227417-267719 (size 40302)
Accessible region NC_000001.10:317719-471368 (size 153649)
Accessible region NC_000001.10:521368-2634220 (size 2112852)
Accessible region NC_000001.10:2684220-3845268 (size 1161048)
Accessible region NC_000001.10:3995268-13052998 (size 9057730)
Accessible region NC_000001.10:13102998-13219912 (size 116914)
Accessible region NC_000001.10:13319912-13557162 (size 237250)
Accessible region NC_000001.10:13607162-17125658 (size 3518496)
Accessible region NC_000001.10:17175658-29878082 (size 12702424)
Accessible region NC_000001.10:30028082-103863906 (size 73835824)
Accessible region NC_000001.10:103913906-120697156 (size 16783250)
Accessible region NC_000001.10:120747156-120936695 (size 189539)
Accessible region NC_000001.10:121086695-121485434 (size 398739)
Accessible region NC_000001.10:142535434-142731022 (size 195588)
Accessible region NC_000001.10:142781022-142967761 (size 186739)
Accessible region NC_000001.10:143117761-143292816 (size 175055)
Accessible region NC_000001.10:143342816-143544525 (size 201709)
Accessible region NC_000001.10:143644525-143771002 (size 126477)
Accessible region NC_000001.10:143871002-144095783 (size 224781)
Accessible region NC_000001.10:144145783-144224481 (size 78698)
Accessible region NC_000001.10:144274481-144401744 (size 127263)
Accessible region NC_000001.10:144451744-144622413 (size 170669)
Accessible region NC_000001.10:144672413-144710724 (size 38311)
Accessible region NC_000001.10:144810724-145833118 (size 1022394)
Accessible region NC_000001.10:145883118-146164650 (size 281532)
Accessible region NC_000001.10:146214650-146253299 (size 38649)
Accessible region NC_000001.10:146303299-148026038 (size 1722739)
Accessible region NC_000001.10:148176038-148361358 (size 185320)
Accessible region NC_000001.10:148511358-148684147 (size 172789)
Accessible region NC_000001.10:148734147-148954460 (size 220313)
Accessible region NC_000001.10:149004460-149459645 (size 455185)
Accessible region NC_000001.10:149509645-205922707 (size 56413062)
Accessible region NC_000001.10:206072707-206332221 (size 259514)
Accessible region NC_000001.10:206482221-223747846 (size 17265625)
Accessible region NC_000001.10:223797846-235192211 (size 11394365)
Accessible region NC_000001.10:235242211-248908210 (size 13665999)
Accessible region NC_000001.10:249058210-249240621 (size 182411)
NT_113878.1: Scanning for accessible regions
Accessible region NT_113878.1:0-106433 (size 106433)
NT_167207.1: Scanning for accessible regions
Accessible region NT_167207.1:0-547496 (size 547496)
NC_000002.11: Scanning for accessible regions
Accessible region NC_000002.11:10000-3529312 (size 3519312)
Accessible region NC_000002.11:3579312-5018788 (size 1439476)
Accessible region NC_000002.11:5118788-16279724 (size 11160936)
Accessible region NC_000002.11:16329724-21153113 (size 4823389)
Accessible region NC_000002.11:21178113-31705550 (size 10527437)
Accessible region NC_000002.11:31705551-31725939 (size 20388)
Accessible region NC_000002.11:31726790-31816827 (size 90037)
Accessible region NC_000002.11:31816828-31816854 (size 26)
Accessible region NC_000002.11:31816855-31816858 (size 3)
Accessible region NC_000002.11:31816859-33092197 (size 1275338)
Accessible region NC_000002.11:33093197-33141692 (size 48495)
Accessible region NC_000002.11:33142692-87668206 (size 54525514)
Accessible region NC_000002.11:87718206-89630436 (size 1912230)
Accessible region NC_000002.11:89830436-90321525 (size 491089)
Accessible region NC_000002.11:90371525-90545103 (size 173578)
Accessible region NC_000002.11:91595103-92326171 (size 731068)
Accessible region NC_000002.11:95326171-110109337 (size 14783166)
Accessible region NC_000002.11:110251337-149690582 (size 39439245)
Accessible region NC_000002.11:149790582-234003741 (size 84213159)
Accessible region NC_000002.11:234053741-239801978 (size 5748237)
Accessible region NC_000002.11:239831978-240784132 (size 952154)
Accessible region NC_000002.11:240809132-243102476 (size 2293344)
Accessible region NC_000002.11:243152476-243189373 (size 36897)
NC_000003.11: Scanning for accessible regions
Accessible region NC_000003.11:60000-17137943 (size 17077943)
NT_113878.1: Joining over small gaps
NT_167207.1: Joining over small gaps
Wrote GCF_000001405.25_GRCh37.p13_genomic.bed with 2 regions
So it seems to me that the sequences NC_000001.10, NT_113878.1, NT_167207.1, NC_000002.11 and NC_000003.11 all have large accessible regions, but in the end only NT_113878.1 and NT_167207.1 ended up in my target.bed file.
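The comparison described above can be automated. The following is a minimal sketch, not part of CNVkit: it parses "Accessible region" lines in the format shown in the log, totals the accessible size per sequence, and lists the sequences that have accessible regions but do not appear in a BED file. The function names are hypothetical, the log lines are copied from the excerpt above, and the two BED rows are an assumption about what a 2-region output might contain.

```python
import re
from collections import defaultdict

# Matches e.g. "Accessible region NC_000001.10:10000-177417 (size 167417)"
REGION_RE = re.compile(r"^Accessible region (\S+):(\d+)-(\d+) \(size (\d+)\)")


def accessible_by_sequence(log_lines):
    """Total accessible size per sequence name, as reported in the log."""
    totals = defaultdict(int)
    for line in log_lines:
        match = REGION_RE.match(line.strip())
        if match:
            totals[match.group(1)] += int(match.group(4))
    return dict(totals)


def missing_from_bed(log_lines, bed_lines):
    """Sequences with accessible regions in the log but no row in the BED."""
    bed_names = {line.split("\t")[0] for line in bed_lines if line.strip()}
    return sorted(set(accessible_by_sequence(log_lines)) - bed_names)


log_lines = [
    "Accessible region NC_000001.10:10000-177417 (size 167417)",
    "Accessible region NC_000001.10:227417-267719 (size 40302)",
    "Accessible region NT_113878.1:0-106433 (size 106433)",
    "Accessible region NT_167207.1:0-547496 (size 547496)",
]
# Assumed BED content, for illustration only.
bed_lines = ["NT_113878.1\t0\t106433", "NT_167207.1\t0\t547496"]

totals = accessible_by_sequence(log_lines)
missing = missing_from_bed(log_lines, bed_lines)
print(totals)
print(missing)
```

With real files, the two lists can be replaced by open("run.log") and open("genomic.target.bed"), where both file names are placeholders.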