tgg's People

Contributors

yaozhou89

tgg's Issues

SV identified with 706 accessions

In your method, structural variants (SVs) were genotyped in 706 accessions with the following code:

multigrmpy.py -i /public10/home/sci0011/projects/tomato2/08_paragraph/02_vcf/SV.paragraph.vcf -m /public10/home/sci0011/projects/tomato2/08_paragraph/01_bam/samples_${sample}.txt -r ~/data/ref/SL5.0/SL5.0_chr_number.fa -o . --threads 64

How can I obtain the file "SV.paragraph.vcf"?
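For readers who want to run the genotyping command above over many accessions, the following is a minimal sketch that only assembles the command line for each sample. It uses just the flags that appear in the command quoted in this issue (`-i`, `-m`, `-r`, `-o`, `--threads`). The sample names, the shortened paths, and the per-sample output directory are assumptions made for illustration, and the sketch does not show how `SV.paragraph.vcf` itself is produced.

```python
import shlex


def multigrmpy_command(sample, vcf, manifest_dir, reference, out_dir=".", threads=64):
    """Build the multigrmpy.py argument list for one sample.

    The manifest name follows the samples_${sample}.txt pattern from the
    command quoted in the issue; everything else is passed through as given.
    """
    manifest = f"{manifest_dir}/samples_{sample}.txt"
    return [
        "multigrmpy.py",
        "-i", vcf,
        "-m", manifest,
        "-r", reference,
        "-o", out_dir,
        "--threads", str(threads),
    ]


if __name__ == "__main__":
    # Hypothetical sample names and shortened paths, for illustration only.
    for sample in ["S001", "S002"]:
        cmd = multigrmpy_command(
            sample,
            vcf="SV.paragraph.vcf",
            manifest_dir="01_bam",
            reference="SL5.0_chr_number.fa",
            out_dir=f"genotypes/{sample}",  # assumption: one output directory per sample
        )
        # Print the command instead of running it, so it can be inspected first.
        print(shlex.join(cmd))
```

Printing the commands rather than executing them keeps the sketch safe to run on a machine where Paragraph is not installed; the printed lines could be pasted into a job script.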

Missing code/scripts for the `Genome_assembly` part

Hi, the Readme in 1.Genome_assembly/1.contig states that

Flye, Hicanu and Hifiasm are used to assemble primary genome. GALA is used to fillter the potential miss-assembly regions. WGS is manual software intergrated in pipeline.

but I am not able to find the corresponding code/scripts. For example, a script called draft_comp.sh seems to be missing, and I cannot see where flye, canu, or gala are called.

I should mention that I am new to Snakemake, so I apologize if that is the source of the problem.

Question about dedup merged SV

Hello, Mr. Zhou!
I had a question while reading your code for deduplicating merged SVs. Is this step necessary? Does it mean that if I do long-read sequencing (HiFi or ONT) on some individuals, I also need to do short-read sequencing (>20x) on them, so that I can use their short reads for deduplication after merging the SVs?

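Question about building a graph from a VCF

Hello, in the TGG1.1 VCF dataset you released, I see that all variant sites have been merged under a single sample name (Figure 1), whereas some vg graph constructions keep the information for every sample in the VCF file. What is the difference between these two approaches? Why are the variants from multiple samples all collected into a single sample set?
Figure 1: TGG1.1 vcf
image
Figure 2: human vg vcf file
image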

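A question about the Benchmark study of variant calling

Hello Yaozhou89. Section 5.1 of the reference for the paper "Graph pangenome captures missing heritability and empowers tomato breeding" mentions that you used simug to simulate genomes and art_illumina to simulate short reads. However, I could not find a more detailed description in the "Benchmark study of graph simulation" part, and the shell script for the simug part is empty. Could you update and complete this part?

In addition, I could not find the simulated short-read data in the http://solomics.agis.org.cn/tomato/ftp database. Have you uploaded these data to another database, or would you be able to provide them?
Screenshot 2024-01-20 at 16 32 17

Thank you for your valuable time and help.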

GCTA grm

Hi, Zhou!
I read your article on the tomato pan-genome, fantastic job! But I am confused by the part of the methods section on the genome-wide association study, where you said: "After pruning using PLINK (v.2.0) with the parameter ‘-indep-pairwise’ set to ‘50 5 0.2’, the pruned SNPs were used for the kinship matrix (genetic relationship matrix; GRM). For SNPs and indels, the pruned dataset (-indep-pairwise 100, 1, 0.98) was used". Does this sentence mean that the second set of parameters is used when the VCF file contains both SNPs and indels? Your article also analyzes other variant sets, such as SVs and indels separately, or SNP+indel+SV combined. Do these also need LD filtering, and if so, which parameters did you use?
Sorry to bother you. Thanks a million!
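To make the two quoted pruning settings concrete, here is a minimal sketch that writes them out as PLINK 2 and GCTA argument lists. The values `50 5 0.2` and `100 1 0.98` come from the quoted methods text; the file prefixes, the `plink2` and `gcta64` executable names, and the pairing of each setting with a particular dataset are assumptions for illustration. It does not settle which parameters the authors used for SVs.

```python
# Pruning settings quoted from the methods text: (window, step, r^2 threshold).
GRM_SNP_SETTING = (50, 5, 0.2)         # pruned SNPs used for the kinship matrix (GRM)
SNP_INDEL_SETTING = (100, 1, 0.98)     # pruned dataset quoted for SNPs and indels


def prune_commands(prefix, window, step, r2):
    """Return PLINK 2 argument lists for LD pruning and for extracting the kept variants.

    `prefix` is a hypothetical PLINK binary fileset prefix (.bed/.bim/.fam).
    """
    prune = [
        "plink2", "--bfile", prefix,
        "--indep-pairwise", str(window), str(step), str(r2),
        "--out", f"{prefix}.ld",
    ]
    extract = [
        "plink2", "--bfile", prefix,
        "--extract", f"{prefix}.ld.prune.in",  # variants retained by the pruning step
        "--make-bed",
        "--out", f"{prefix}.pruned",
    ]
    return prune, extract


def grm_command(pruned_prefix):
    """Return a GCTA argument list that builds a GRM from a pruned fileset."""
    # Assumption: the GCTA executable is installed under the name gcta64.
    return ["gcta64", "--bfile", pruned_prefix, "--make-grm", "--out", f"{pruned_prefix}.grm"]


if __name__ == "__main__":
    prune, extract = prune_commands("snps", *GRM_SNP_SETTING)  # "snps" is a placeholder prefix
    for cmd in (prune, extract, grm_command("snps.pruned")):
        print(" ".join(cmd))
```

Keeping the two settings as named constants makes it easy to see which threshold feeds which downstream step once the authors confirm the intended mapping.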

A sincere request

Dear Teacher Zhou:
Recently, I selected the SL5.0 tomato genome from "Graph pangenome captures missing heritability and empowers tomato breeding" as a reference genome. I would like to know the centromere position of each chromosome of SL5.0 shown in Figure 1a. Thank you!
I very much look forward to your reply!
image

Missing workflow images

Hi, there are multiple missing images in the markdown files. The links are broken as they point to a local folder. Here are a few of them:

in https://github.com/YaoZhou89/TGG/blob/main/1.Genome_assembly/1.contig/Readme.md
![image-20220103182607789](/Users/zhiyangzhang/Library/Application Support/typora-user-images/image-20220103182607789.png)

in https://github.com/YaoZhou89/TGG/blob/main/1.Genome_assembly/2.scaffold/Readme.md
![image-20220103183236565](/Users/zhiyangzhang/Library/Application Support/typora-user-images/image-20220103183236565.png)

in https://github.com/YaoZhou89/TGG/blob/main/2.Genome_annotation/Readme.md
![image-20220101234140544](/Users/zhiyangzhang/Library/Application Support/typora-user-images/image-20220101234140544.png)

in https://github.com/YaoZhou89/TGG/blob/main/4.Graph_pangenome/1.construction_graph_genome/Readme.md
![image-20220102154738977](/Users/zhiyangzhang/Library/Application Support/typora-user-images/image-20220102154738977.png)
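Links like the ones listed above can be located automatically. The following is a rough sketch, not part of the TGG repository, that scans Markdown text for image links whose target looks like an absolute path on someone's machine. The function names and the heuristic for what counts as a local path are assumptions; note that a target starting with `/` may be valid in some hosting setups, so the hits are candidates to review rather than confirmed broken links.

```python
import re

# Matches Markdown image syntax ![alt](target); the target may contain spaces.
IMAGE_LINK = re.compile(r"!\[([^\]]*)\]\(([^)]*)\)")
# Matches a Windows drive prefix such as C:\ or C:/.
WINDOWS_DRIVE = re.compile(r"^[A-Za-z]:[\\/]")


def is_local_absolute(target):
    """Heuristic: True if the link target looks like an absolute local path."""
    if target.startswith(("http://", "https://")):
        return False
    return target.startswith(("/", "~", "file://")) or bool(WINDOWS_DRIVE.match(target))


def find_local_image_links(markdown_text):
    """Return (line_number, target) pairs for image links that look local."""
    hits = []
    for line_number, line in enumerate(markdown_text.splitlines(), start=1):
        for match in IMAGE_LINK.finditer(line):
            target = match.group(2).strip()
            if is_local_absolute(target):
                hits.append((line_number, target))
    return hits


if __name__ == "__main__":
    sample = (
        "![image-20220103182607789](/Users/zhiyangzhang/Library/Application Support/"
        "typora-user-images/image-20220103182607789.png)\n"
        "![logo](https://example.org/logo.png)\n"  # hypothetical remote link, not flagged
    )
    for line_number, target in find_local_image_links(sample):
        print(f"line {line_number}: {target}")
```

Running such a check over each Readme.md would give a list of images that need to be committed to the repository and relinked with relative paths.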

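How to merge the SVs of 31 samples into one VCF file

Dear Professor Zhou @YaoZhou89,
In the step shown below, how are multiple individuals merged into one VCF: are they kept as multiple samples, or fused into a single sample?
image
What does cleanSV do here?

Also, regarding the step shown below: if my pipeline uses only one SV calling tool rather than several, can I skip this step for deduplicating SVs?
image

Respectfully,
强森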
