Code Monkey home page Code Monkey logo

Comments (9)

xunchen85 avatar xunchen85 commented on June 25, 2024

Thanks for running the VIcaller_v1.1.

  1. Yes, that path is the one i use on our server, i have revised it correspondingly.

  2. I have delete the Extract_read_information.pl script which is only used when i debug the tool. And the two changes you made are correct.

  3. There is a file is not created, I have revised the VIcaller.pl script. I also corrected the script of "Extract_fuq_split.pl" which have a bug about the "\n".

With those changes, you should be able to run the validation function now.

Please let me know if you found any other bugs. Thanks.

from vicaller.

byeonggill avatar byeonggill commented on June 25, 2024

I appreciate your help.

I download revised scripts and running validate function and calculate function.
but, I got the same error.

My data is targeted sequencing data.
and I attach output file.
fcd_Genareal_1.xlsx

file lists:
image

validate cmd: nohup perl /data/program/VIcaller_v1.1/VIcaller.pl validate -i 11FCD -c /data/program/VIcaller_v1.1/VIcaller.config -t 4 -S 11FCD_14_75126278_75126928_abelson_murine_leukemia_virus_61487 -G 61487 -V murine_leukemia_virus > validate.out
validate.out:
image

calculate cmd: nohup perl /data/program/VIcaller_v1.1/VIcaller.pl calculate -i 11FCD -c /data/program/VIcaller_v1.1/VIcaller.config -t 4 -F .fastq.gz -C 14 -P 75126936 -B 2 -N 147 > cal.out
cal.out:
image

from vicaller.

xunchen85 avatar xunchen85 commented on June 25, 2024

A small correction was made for the VIcaller.pl script, can you try the latest one.

May I ask if you enriched specific viruses for target sequencing? What is the read length?

from vicaller.

byeonggill avatar byeonggill commented on June 25, 2024

I'm sorry that the answer was delayed.
The validate function seems to work normally.
but, calculate function have error.
In my samtools version,
system ("${samtools_d}samtools view ${input_sampleID}s_h.bam chr${Chr}:${position1}-${position2} >${input_sampleID}${Chr}_${Position}_h.sort.sam")
chr1 not 1

Anyway, why does it change to 1 even if I put another number in the c parameter?

cmd: nohup perl /data/program/VIcaller_v1.1/VIcaller.pl calculate -i PYH024 -t 4 -F _s_h.bam -C 4 -I -P 112387642 -B 2 -N 96 > cal.out

this is my revision.
image

this is results.
image

from vicaller.

xunchen85 avatar xunchen85 commented on June 25, 2024

If the input file is FASTQ format, you can use "" for -F parameter. Or you can change to "-F _h.bam" if you have the PYH024_h.bam file indexed.

For the chromosome ID issue, because the "chr" for the chromosome IDs are deleted in the output file in the current VIcaller version, thus you can add the "chr" in the script to extract corresponding reads from the BAM file for now.

I will update the VIcaller.pl in the next version to support any chromosome ID format soon.

from vicaller.

byeonggill avatar byeonggill commented on June 25, 2024

My previous question is even if i take parameter -C 4, VIcaller receive chr1.

Sorry but i have another question.
I want to modify the virus library.
I want viral genome library composed only hepatitis. so, I modified viral.fa, viral.taxonomy and viral.virus_list files.
but, virus names in output files were recorded of unknown.
virus GI seem to normally.

Please let me know how to modify viral genome library in detail.

from vicaller.

xunchen85 avatar xunchen85 commented on June 25, 2024

Sorry for my late reply.

Really appreciate for your feedback, i have correct the bug in the VIcaller main script.
Specifically, you can either downloaded the latest VIcaller script, or directly revised the line 8 defining the chromosome ID as follow:
" "C|Chr=s" => $Chr,"

For your second question:
Can you show me an example of your modified three files? If you are able to detect the virus GI in the FASTA file, but cannot find the corresponding virus name, it means something wrong with your modified virus_list, or viral.taxonomy file; Or it may means that you did not specify the correct paths for those two files in the config file.

virus_list file: only have two column, 1) GI #, and 2) virus_name without space;
viral.taxonomy: 11 columns separated by "$": column 1: GI; column 2: GB, column 3: length; column 6: original virus name; column 9: original virus name; column 10: modified_virus_name (no space)

As an example of HBV, you can modify the corresponding columns, and you should be able to find all the info from the NCBI NT database using the GI:

110225259$AB246317.1$3185$Hepatitis B virus; Viruses; Retro-transcribing viruses; Hepadnaviridae; Orthohepadnavirus.$Hepatitis B virus DNA, complete genome, isolate: TA112.$Hepatitis B virus$TA112$0$Hepatitis B virus$hepatitis_b_virus$0

from vicaller.

byeonggill avatar byeonggill commented on June 25, 2024

thank you,
I'm running it just using non-modified virome reference.

but, I have another question,
I have some problem when using parameter -r in detect function such as Repeatmasker is so confused and i can't download Rebase database and TRF output position is strange.

so, I did not use parameter -r. but, I'm worried that this might be big difference.
Can you give me advice on removing repeat region?

from vicaller.

xunchen85 avatar xunchen85 commented on June 25, 2024

For the repeatmasker, you need to apply for Rebase database and then use it with repeatmasker follow their online manuscript.

You can show some example of TRF output, I can check it out for you.

I would suggest you to disable Repeatmasker function in the script, and then only use TRF and DUST to remove repeat regions. I would be happy to help you with it if you want.

from vicaller.

Related Issues (12)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.