Dear Professor, I am attempting to use the fdog-Assembly tool to fin

Issue with fdog-Assembly: ModuleNotFoundError for 'fdog.fDOGassembly' (closed)

Somnous1998 commented on July 17, 2024

Issue with fdog-Assembly: ModuleNotFoundError for 'fdog.fDOGassembly'

ebersber commented on July 17, 2024

Dear Zechen Tang,
Thank you very much for pointing this out. Could you please specify from which branch you have downloaded fDOG? Please use the branch fdog_goes_assembly (https://github.com/BIONF/fDOG/tree/fdog_goes_assembly). If you did use this branch, please indicate so.
Kind regards,
Ingo Ebersberger

Somnous1998 commented on July 17, 2024

Dear Ingo Ebersberger,

Thank you very much for your response. I downloaded and configured fDOG from the branch you specified (fdog_goes_assembly: https://github.com/BIONF/fDOG/tree/fdog_goes_assembly). However, when I attempted to use it, I encountered an error indicating that the module does not exist. Further investigation suggests that it may have been removed.

Could you please provide further guidance or confirm if the module has indeed been removed?

Thank you for your assistance.

Kind regards,
Zechen Tang

mueli94 commented on July 17, 2024

Dear Zechen Tang,

Thank you for pointing this out.
The fDOG-Assembly module was removed from the fDOG master branch in an earlier version and now exists only in its own branch (fdog_goes_assembly). Please double-check that you are using the correct branch and the latest version. The following instructions install fDOG-Assembly:

# clone repo
git clone https://github.com/BIONF/fDOG.git

# navigate into the fDOG folder
cd fDOG

# check out the branch
git checkout --track origin/fdog_goes_assembly

# install fDOG
python setup.py develop

# set up fDOG
fdog.setup -d <directory_for_fDOG_data>

# set up FAS
fas.setup -t </annotation_tools>

Please let me know if you still encounter the same error.
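
One quick way to confirm that the installed checkout actually ships the assembly module is to ask Python whether it can locate it. The sketch below is only a diagnostic and not part of fDOG. The module name fdog.fDOGassembly is taken from the error message in the issue title, and the helper name module_available is made up for illustration.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if Python can locate the module `name`."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (for example, fdog itself) is missing.
        return False


if __name__ == "__main__":
    # Module names taken from the error message in this thread.
    for mod in ("fdog", "fdog.fDOGassembly"):
        print(mod, "found" if module_available(mod) else "missing")
```

If fdog is found but fdog.fDOGassembly is missing, the installed package most likely comes from a branch that does not contain the assembly module.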

Kind regards,

Hannah

Somnous1998 commented on July 17, 2024

Dear Hannah,

Thank you for your help. I have now managed to include the module by obtaining the correct version from GitHub. However, I have some further questions I hope you can assist with:

I noticed that the --augustusRefSpec AUGUSTUSREFSPEC parameter is required. Does this mean that I need to pre-train Augustus in my environment, or do I need to provide specific commands to train Augustus during the setup? I did not find any specific instructions regarding configuring Augustus.

The --gene option requires alignment and HMM files. When running fdog.run on a protein dataset, the three specified folders (--searchpath /path/to/your/searchTaxa_dir, --corepath /path/to/your/coreTaxa_dir, and --annopath /path/to/your/annotation_dir) do not include HMM and alignment files. However, according to your 2009 publication, these files are necessary. I am a bit confused if I missed a step where these core gene HMM and alignment files should have been generated during fdog.run. Is the process as follows: aligning sequences of orthologous genes, building, training, and calibrating pHMM using HMMER, and then combining all generated files into one file to be used as an input for fdog.run?

Currently, I am running fDOG version 0.1.32 with fdog.assembly version 0.0.1.

Thank you very much for your patience and assistance, and I apologize for any inconvenience.

Kind regards,
Zechen Tang

mueli94 commented on July 17, 2024

Dear Zechen Tang,

The current version of fDOG-Assembly is 0.1.5.1. Please be sure to use the latest version, which includes many improvements and new features. Please run git pull again today, because I made some updates recently. You can check the version with

fdog.assembly --version

Augustus already offers many pre-trained models. To list them all, run augustus --species=help, then pass the identifier of your choice as the --augustusRefSpec parameter.

fDOG (fdog.run) produces, among others, a folder called core_orthologs as output. In this folder, you can find the alignments and HMM files you need for fDOG-Assembly in the correct format. You can pass the path to the folder called core_orthologs to --coregroupPath. Afterwards, specify the gene name you want to run fDOG-Assembly with. Be sure that the gene name and the subfolder in which its data is located in the folder core_orthologs are the same. The reference species you select with --refSpec has to be included in the core_ortholog group and the name must be identical. The data of this reference species must be contained in the fDOG folders searchTaxa_dir, coreTaxa_dir, annotation_dir (have a look at the github wiki for more information about the fDOG data structure).
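
The two naming requirements above (a subfolder named exactly like the gene, and a reference species that is part of the core group) can be checked before starting a run. The following is a minimal sketch, not an fDOG utility. It assumes that the gene subfolder holds FASTA files ending in .fa and that the species name appears in the header lines; both are assumptions about the file layout, and check_core_group is a hypothetical helper.

```python
from pathlib import Path


def check_core_group(coregroup_path, gene, ref_spec):
    """Return a list of human-readable problems (an empty list means OK)."""
    gene_dir = Path(coregroup_path) / gene
    if not gene_dir.is_dir():
        return [f"no subfolder named '{gene}' in {coregroup_path}"]

    # ASSUMPTION: sequences live in FASTA files ending in .fa whose header
    # lines (starting with '>') contain the species name.
    headers = [
        line
        for fasta in sorted(gene_dir.glob("*.fa"))
        for line in fasta.read_text().splitlines()
        if line.startswith(">")
    ]
    if not any(ref_spec in header for header in headers):
        return [f"reference species '{ref_spec}' not found in '{gene}'"]
    return []
```

Calling check_core_group with the values intended for --coregroupPath, --gene, and --refSpec returns an empty list when both conditions hold.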

The last thing you need is a folder containing all the assemblies you want to search in. The assemblies should have the same naming scheme as described in the fDOG wiki. fDOG-Assembly requires a subfolder for each species containing the assembly fasta file. For example, the data structure for the species Drosophila melanogaster would look like the following:

Name: DROME@7227@v1
Folder structure:

assembly_dir
 |-DROME@7227@v1
 |  |-DROME@7227@v1.fa
 |  |-blast_dir
 |  |  |-DROME@7227@v1.* (BLAST database files)

The blast_dir will be automatically computed by fDOG-Assembly if it does not exist. You can use the script fdog.addAssembly which generates the required file structure automatically.
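
For illustration only, the layout described above (one subfolder per species, containing the assembly FASTA file) can be sketched in a few lines of Python. The fdog.addAssembly script mentioned above is the supported route; this sketch merely shows the structure. The .fa extension, the name pattern, and the helper name stage_assembly are assumptions, not taken from the fDOG documentation.

```python
import re
import shutil
from pathlib import Path

# ASSUMPTION: names follow ABBREVIATION@taxonomy-id@version, e.g. DROME@7227@v1.
NAME_PATTERN = re.compile(r"^[A-Za-z0-9]+@[0-9]+@[A-Za-z0-9]+$")


def stage_assembly(fasta, assembly_dir, name, extension=".fa"):
    """Copy `fasta` to <assembly_dir>/<name>/<name><extension>; return the new path."""
    if not NAME_PATTERN.match(name):
        raise ValueError(f"'{name}' does not look like ABBREV@taxid@version")
    target_dir = Path(assembly_dir) / name
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{name}{extension}"
    shutil.copyfile(fasta, target)
    return target
```

The blast_dir subfolder is deliberately not created here, since fDOG-Assembly computes it when it is missing.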

I hope things are clearer now. Please don't hesitate to ask if something is unclear or if you encounter any problems.

Kind regards,

Hannah

Somnous1998 commented on July 17, 2024

Dear Hannah,

Thank you very much for your assistance. Upon careful review, I realized that I had previously not selected the correct branch in the git repository. I have now corrected this and am using the following command:

fdog.assembly --gene NAEY1_g34.t1 --refSpec REFSPE ANTNE@642069@240617 --augustusRefSpec Anthocoris_zoui --metaeukDb /datapool/home/yangzc/soft/pacbio/fdog/Pfam/Pfam-hmms/targetDB --coregroupPath /datapool/home/yangzc/soft/OTHER/orth-datasets/other_group/OrthoFinder/Results_Jun04/ortho_braker/NAEY1/core_orthologs --dataPath /datapool/home/yangzc/soft/pacbio/gapfiller-main/Anthocoris/MEGAHIT_MJGY2/ --strict --force

Anthocoris_zoui is an Augustus training set generated previously using transcriptome data via the Braker3 pathway, and NAEY1 is the seed species used for fdog.run.

However, I encountered the following error during execution.

Additionally, I would like to ask if there is currently a way to perform batch gene searches. Specifically, if we use a verified protein dataset to conduct a reverse BLAST using Diamond on an assembled genome, and then extract transcripts using TransDecoder followed by running fdog.run, can we achieve the same goal of batch orthologous gene searches? Based on my understanding, it seems that fdog.assembly does not yet support batch processing of genes.

Thank you very much for your patience and support, and I apologize for any inconvenience.

Kind regards,
Zechen Tang

mueli94 commented on July 17, 2024

Dear Zechen Tang,
In fDOG-Assembly we have implemented two different gene prediction methods, namely Augustus and MetaEuk. Currently, MetaEuk is the default gene prediction method. If you want to use Augustus, you have to pass --augustus as a parameter and, additionally, the --augustusRefSpec parameter. Please remove the --metaEukDB parameter and replace it with --augustus if you want to use Augustus.
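
The choice between the two gene predictors can be made explicit when a command line is assembled programmatically. The sketch below is a hypothetical helper, not part of fDOG. It uses only flags that appear in this thread, and it copies the spelling --metaeukDb from the command posted earlier, which is an assumption given that the flag is spelled differently in the reply above.

```python
def build_assembly_args(gene, ref_spec, coregroup_path, data_path,
                        augustus_ref_spec=None, metaeuk_db=None):
    """Build an argument list for fdog.assembly using one gene predictor."""
    if augustus_ref_spec and metaeuk_db:
        raise ValueError("choose either Augustus or MetaEuk, not both")

    args = [
        "fdog.assembly",
        "--gene", gene,
        "--refSpec", ref_spec,
        "--coregroupPath", str(coregroup_path),
        "--dataPath", str(data_path),
    ]
    if augustus_ref_spec:
        # Augustus needs the switch plus the name of a trained model.
        args += ["--augustus", "--augustusRefSpec", augustus_ref_spec]
    elif metaeuk_db:
        # ASSUMPTION: flag spelling taken from the command posted in this thread.
        args += ["--metaeukDb", str(metaeuk_db)]
    return args
```

The returned list can be passed to subprocess.run or joined with spaces for inspection.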

There was indeed a bug when using --strict; thank you for reporting it. It is now fixed, so please update fDOG-Assembly. Note, however, that --strict can decrease the number of orthologs fDOG-Assembly reports, or lead to no ortholog being reported at all. I recommend running fDOG-Assembly without --strict and choosing as reference (--refSpec) the species most closely related to your species under investigation.

Currently, a batch gene search is not implemented in fDOG-Assembly. Maybe we can deliver that in a future update.
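
Until such a feature exists, one possible workaround is a small wrapper that calls fdog.assembly once per gene. The sketch below is not an fDOG feature. It assumes that every subfolder of the core_orthologs folder corresponds to one gene, and the helper names are hypothetical. By default it only builds the commands (dry run) so they can be inspected first.

```python
import subprocess
from pathlib import Path


def genes_in_core_groups(coregroup_path):
    """List gene names, assuming one subfolder per gene."""
    return sorted(p.name for p in Path(coregroup_path).iterdir() if p.is_dir())


def run_per_gene(coregroup_path, ref_spec, data_path, extra_args=(), dry_run=True):
    """Build (and optionally run) one fdog.assembly command per gene."""
    commands = []
    for gene in genes_in_core_groups(coregroup_path):
        cmd = [
            "fdog.assembly",
            "--gene", gene,
            "--refSpec", ref_spec,
            "--coregroupPath", str(coregroup_path),
            "--dataPath", str(data_path),
            *extra_args,
        ]
        commands.append(cmd)
        if not dry_run:
            # check=False: one failing gene does not abort the remaining runs.
            subprocess.run(cmd, check=False)
    return commands
```

Extra flags such as --augustus and --augustusRefSpec can be supplied through extra_args. Whether consecutive runs can share an output location is not covered in this thread and should be checked before running with dry_run=False.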

Kind regards,

Hannah

Somnous1998 commented on July 17, 2024

Dear Hannah,

Thank you very much for your detailed response and the provided information. Based on your guidance, I will make the following adjustments to my command:

Remove the --metaEukDB parameter and replace it with --augustus.
Add the --augustus parameter alongside --augustusRefSpec.

I have also noted your recommendation about not using the --strict parameter, as it can decrease the number of reported orthologs or lead to no orthologs being reported at all. I will update fDOG-Assembly to the latest version to ensure the bug fix is applied.

Regarding batch gene search, thank you for clarifying that it is not currently implemented in fDOG-Assembly. I look forward to any future updates that might include this feature.

Thank you again for your patience and support.

Kind regards,
Zechen Tang
