
Comments (3)

tetedange13 commented on August 13, 2024

Hi @ashenflower,

Maybe you are looking for the --target-avg-size and --antitarget-avg-size parameters of the batch subcommand? (See cnvkit.py batch --help.)

Hope this helps!
Have a nice day,
Felix.

from cnvkit.

ashenflower commented on August 13, 2024

Hello @tetedange13, thank you very much for your reply! That's what I was looking for (I had seen that parameter before, but I had misunderstood its usage).

But another question came to mind after setting --target-avg-size to 1000 and running the batch pipeline. If I am correct, the output .cnr file should report the log2 ratio for each bin, and I wanted to apply CN calling directly to this file. However, it seems that not all the bins are listed in the file (e.g. for some chromosomes the bins do not start from 0). Is that expected? Why is that so?
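To make the "missing bins" observation concrete, here is a minimal sketch that lists the stretches of each chromosome not covered by any bin in a .cnr file. It assumes the file is tab-separated with a header row containing `chromosome`, `start` and `end` columns and that rows are sorted by position within each chromosome; the function name `find_bin_gaps` and the sample rows are made up for illustration and are not part of CNVkit. For what it is worth, the log posted further down in this thread shows the first accessible region of NC_000001.10 starting at 10000, so a leading gap like the one in the toy data would not be surprising.

```python
import csv

def find_bin_gaps(cnr_lines):
    """Return (chromosome, gap_start, gap_end) for stretches with no bin.

    cnr_lines: an iterable of text lines (an open file works too).
    Assumes rows are sorted by start position within each chromosome.
    """
    reader = csv.DictReader(cnr_lines, delimiter="\t")
    prev_end = {}  # chromosome -> end coordinate of the last bin seen
    gaps = []
    for row in reader:
        chrom = row["chromosome"]
        start, end = int(row["start"]), int(row["end"])
        last = prev_end.get(chrom, 0)
        if start > last:
            gaps.append((chrom, last, start))
        prev_end[chrom] = max(last, end)
    return gaps

# Toy rows (hypothetical values, not taken from a real run).
sample = [
    "chromosome\tstart\tend\tgene\tdepth\tlog2\tweight",
    "chr1\t10000\t11000\t-\t30.0\t0.01\t0.9",
    "chr1\t11000\t12000\t-\t31.0\t0.02\t0.9",
    "chr1\t15000\t16000\t-\t29.0\t-0.03\t0.9",
]

for chrom, gap_start, gap_end in find_bin_gaps(sample):
    print(f"{chrom}: no bins between {gap_start} and {gap_end}")
```

With a real file, `find_bin_gaps(open("sample.cnr"))` would work the same way (the file name here is a placeholder).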

Also, in my genomic.target.bed file, several sequences are missing, even though they are listed among the 'Accessible regions' in the log file, and I don't understand why.

Thank you so much for your help!


ashenflower commented on August 13, 2024

This is the log file I get by applying CNVkit on an old reference:

WGS protocol: recommend '--annotate' option (e.g. refFlat.txt) to help locate genes in output files.
NC_000001.10: Scanning for accessible regions
        Accessible region NC_000001.10:10000-177417 (size 167417)
        Accessible region NC_000001.10:227417-267719 (size 40302)
        Accessible region NC_000001.10:317719-471368 (size 153649)
        Accessible region NC_000001.10:521368-2634220 (size 2112852)
        Accessible region NC_000001.10:2684220-3845268 (size 1161048)
        Accessible region NC_000001.10:3995268-13052998 (size 9057730)
        Accessible region NC_000001.10:13102998-13219912 (size 116914)
        Accessible region NC_000001.10:13319912-13557162 (size 237250)
        Accessible region NC_000001.10:13607162-17125658 (size 3518496)
        Accessible region NC_000001.10:17175658-29878082 (size 12702424)
        Accessible region NC_000001.10:30028082-103863906 (size 73835824)
        Accessible region NC_000001.10:103913906-120697156 (size 16783250)
        Accessible region NC_000001.10:120747156-120936695 (size 189539)
        Accessible region NC_000001.10:121086695-121485434 (size 398739)
        Accessible region NC_000001.10:142535434-142731022 (size 195588)
        Accessible region NC_000001.10:142781022-142967761 (size 186739)
        Accessible region NC_000001.10:143117761-143292816 (size 175055)
        Accessible region NC_000001.10:143342816-143544525 (size 201709)
        Accessible region NC_000001.10:143644525-143771002 (size 126477)
        Accessible region NC_000001.10:143871002-144095783 (size 224781)
        Accessible region NC_000001.10:144145783-144224481 (size 78698)
        Accessible region NC_000001.10:144274481-144401744 (size 127263)
        Accessible region NC_000001.10:144451744-144622413 (size 170669)
        Accessible region NC_000001.10:144672413-144710724 (size 38311)
        Accessible region NC_000001.10:144810724-145833118 (size 1022394)
        Accessible region NC_000001.10:145883118-146164650 (size 281532)
        Accessible region NC_000001.10:146214650-146253299 (size 38649)
        Accessible region NC_000001.10:146303299-148026038 (size 1722739)
        Accessible region NC_000001.10:148176038-148361358 (size 185320)
        Accessible region NC_000001.10:148511358-148684147 (size 172789)
        Accessible region NC_000001.10:148734147-148954460 (size 220313)
        Accessible region NC_000001.10:149004460-149459645 (size 455185)
        Accessible region NC_000001.10:149509645-205922707 (size 56413062)
        Accessible region NC_000001.10:206072707-206332221 (size 259514)
        Accessible region NC_000001.10:206482221-223747846 (size 17265625)
        Accessible region NC_000001.10:223797846-235192211 (size 11394365)
        Accessible region NC_000001.10:235242211-248908210 (size 13665999)
        Accessible region NC_000001.10:249058210-249240621 (size 182411)
NT_113878.1: Scanning for accessible regions
        Accessible region NT_113878.1:0-106433 (size 106433)
NT_167207.1: Scanning for accessible regions
        Accessible region NT_167207.1:0-547496 (size 547496)
NC_000002.11: Scanning for accessible regions
        Accessible region NC_000002.11:10000-3529312 (size 3519312)
        Accessible region NC_000002.11:3579312-5018788 (size 1439476)
        Accessible region NC_000002.11:5118788-16279724 (size 11160936)
        Accessible region NC_000002.11:16329724-21153113 (size 4823389)
        Accessible region NC_000002.11:21178113-31705550 (size 10527437)
        Accessible region NC_000002.11:31705551-31725939 (size 20388)
        Accessible region NC_000002.11:31726790-31816827 (size 90037)
        Accessible region NC_000002.11:31816828-31816854 (size 26)
        Accessible region NC_000002.11:31816855-31816858 (size 3)
        Accessible region NC_000002.11:31816859-33092197 (size 1275338)
        Accessible region NC_000002.11:33093197-33141692 (size 48495)
        Accessible region NC_000002.11:33142692-87668206 (size 54525514)
        Accessible region NC_000002.11:87718206-89630436 (size 1912230)
        Accessible region NC_000002.11:89830436-90321525 (size 491089)
        Accessible region NC_000002.11:90371525-90545103 (size 173578)
        Accessible region NC_000002.11:91595103-92326171 (size 731068)
        Accessible region NC_000002.11:95326171-110109337 (size 14783166)
        Accessible region NC_000002.11:110251337-149690582 (size 39439245)
        Accessible region NC_000002.11:149790582-234003741 (size 84213159)
        Accessible region NC_000002.11:234053741-239801978 (size 5748237)
        Accessible region NC_000002.11:239831978-240784132 (size 952154)
        Accessible region NC_000002.11:240809132-243102476 (size 2293344)
        Accessible region NC_000002.11:243152476-243189373 (size 36897)
NC_000003.11: Scanning for accessible regions
        Accessible region NC_000003.11:60000-17137943 (size 17077943)
NT_113878.1: Joining over small gaps
NT_167207.1: Joining over small gaps
Wrote GCF_000001405.25_GRCh37.p13_genomic.bed with 2 regions

So it seems to me that all of the sequences NC_000001.10, NT_113878.1, NT_167207.1, NC_000002.11 and NC_000003.11 have large accessible regions, but in the end only NT_113878.1 and NT_167207.1 ended up in my target.bed file.
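One way to check this systematically is to tally the "Accessible region" lines per sequence and compare the names against the first column of the resulting BED file. The sketch below does that with the standard library only; the helper names (`summarize_access_log`, `missing_from_bed`) are hypothetical, the log excerpt reuses three lines from the log above, and the BED rows are an assumption about what a two-region file might contain. It only reports the discrepancy and does not explain it, although it may be relevant that the log prints "Joining over small gaps" only for NT_113878.1 and NT_167207.1.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   Accessible region NT_113878.1:0-106433 (size 106433)
REGION_RE = re.compile(r"Accessible region ([^\s:]+):(\d+)-(\d+) \(size (\d+)\)")

def summarize_access_log(log_text):
    """Return {sequence_name: (region_count, total_accessible_bp)}."""
    counts = defaultdict(int)
    sizes = defaultdict(int)
    for name, _start, _end, size in REGION_RE.findall(log_text):
        counts[name] += 1
        sizes[name] += int(size)
    return {name: (counts[name], sizes[name]) for name in counts}

def missing_from_bed(log_text, bed_lines):
    """Return sequences reported in the log but absent from the BED file."""
    in_bed = {line.split("\t")[0] for line in bed_lines if line.strip()}
    return sorted(name for name in summarize_access_log(log_text) if name not in in_bed)

# Excerpt copied from the log above.
log_excerpt = """\
        Accessible region NT_113878.1:0-106433 (size 106433)
        Accessible region NT_167207.1:0-547496 (size 547496)
        Accessible region NC_000003.11:60000-17137943 (size 17077943)
"""

# Assumed BED content (illustrative only; the real file was not posted).
bed_rows = [
    "NT_113878.1\t0\t106433",
    "NT_167207.1\t0\t547496",
]

print(summarize_access_log(log_excerpt))
print("Missing from BED:", missing_from_bed(log_excerpt, bed_rows))
```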
