Comments (17)

Midnighter commented on August 11, 2024

Thanks for the report @dmalzl, will add a fix immediately. I will just remove the flag since most of the time nobody will follow the log output anyway.

from fetchngs.

dmalzl commented on August 11, 2024

On the same issue, there seems to be a problem with the configuration of the sra-toolkit: I get this message after resolving the command-line parameter issue:

This sra toolkit installation has not been configured.
Before continuing, please run: vdb-config --interactive
For more information, see https://www.ncbi.nlm.nih.gov/sra/docs/sra-cloud/

after which it retries and fails

Midnighter commented on August 11, 2024

I had a look and I don't see the same thing when running prefetch from the container. What version of sra-tools are you running?

Output that I see:

prefetch --help

  -p|--progress                    Show progress

"prefetch" version 2.11.0

I'm happy to remove the hardcoded option, though, since it doesn't really serve a purpose.

On the configuration: there is code in the module that generates a configuration for you. Why is that not working for you? Please open a separate issue on that or join us on Slack to discuss.

dmalzl commented on August 11, 2024

Ah sorry, this is my fault. I had the configuration problem before and therefore used the preinstalled module on our cluster, which is v2.9.6.1. I didn't anticipate the version difference, so the flag problem was just on my side. However, the configuration problem is unchanged. It now runs with my version, but I am really curious why the configuration is not working for me. Would you suggest opening another issue here or going directly to Slack? Which one would be better for discussion?
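A version mismatch like this one can be spotted before running the pipeline. The following is a minimal shell sketch, assuming the version line has the format shown in the help output earlier in this thread (`"prefetch" version 2.11.0`). The function name `extract_sra_version` is made up for illustration and is not part of sra-tools or the pipeline.

```shell
# Hypothetical helper: pull the version number out of a line such as
#   "prefetch" version 2.11.0
# Reads text on stdin and prints only the dotted version string.
# Lines that do not match this (assumed) format are ignored.
extract_sra_version() {
    sed -n 's/^[[:space:]]*"[^"]*" version \([0-9][0-9.]*\).*$/\1/p'
}

# Usage (assumes prefetch is on PATH and prints the line above):
#   prefetch --help | extract_sra_version
```

Comparing the printed value for the cluster module and for the container would have shown the 2.9.6.1 vs. 2.11.0 difference directly.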

Midnighter commented on August 11, 2024

I think Slack will be easier, if you don't mind. There's a fetchngs channel.

dmalzl commented on August 11, 2024

Sorry for being annoying, but I joined and can't find the fetchngs channel. I only see pipelines. Is this the one you were referring to?

dmalzl commented on August 11, 2024

No worries, found it.

dmalzl commented on August 11, 2024

Last issue connected to this, I promise, because I fixed it this way. After fixing my NCBI config by adding the two missing lines

/LIBS/GUID = "bd94a196-a984-494f-909e-14e1ffb250e4"
/libs/cloud/report_instance_identity = "true"

I retried and it worked. To make sure the issue was not fixed only because of my other changes to the code, I reverted all my changes to reset the pipeline to its original state and tried again. Unfortunately, that was the case: it failed again. The reason is that prefetch does not write the file to the cwd but to some other directory specified in the NCBI config. Simply adding -o ./$id to the command fixed this. So the complete change to the module should be

    retry_with_backoff.sh prefetch \\
        $args \\
        $id \\
        -o ./$id

This will also prevent others from running into the same issues I experienced today.
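For anyone who wants to check their own settings file for the two keys mentioned above, here is a minimal shell sketch. The helper name `missing_ncbi_keys` is made up for illustration, and the file path in the usage comment is an assumption about where the user configuration lives; adjust it to your setup.

```shell
# Hypothetical helper: report which of the two keys from this thread are
# missing from an NCBI settings file. Prints one missing key per line and
# prints nothing when both keys are present.
missing_ncbi_keys() {
    cfg="$1"
    for key in '/LIBS/GUID' '/libs/cloud/report_instance_identity'; do
        # -F: match the key literally; -q: no output, exit status only.
        # A missing or unreadable file counts as "key not found".
        grep -qF -- "$key" "$cfg" 2>/dev/null || printf '%s\n' "$key"
    done
}

# Usage (the path is an assumption, not taken from this thread):
#   missing_ncbi_keys "$HOME/.ncbi/user-settings.mkfg"
```

The check only looks for the key names, not their values, so it tells you whether a line is absent but not whether it is set correctly.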

dmalzl commented on August 11, 2024

I don't know whether prefixing $id with ./ is necessary, though. I had to do it with v2.9.6.1, but v2.11 does not complain about it.

Midnighter commented on August 11, 2024

Since this seems specific to your case, i.e., the content of your NCBI configuration, I suggest you make use of the args with a local configuration.

process {
    withName: SRATOOLS_PREFETCH {
        ext.args = { "-o ./$id" }
    }
}

dmalzl commented on August 11, 2024

Ah, that's clever. Thanks for the suggestion.

dmalzl commented on August 11, 2024

I have identified yet another possible problem. Using -o ./$id forces the pipeline, at least in my case, to save the data into the cwd of the current process. However, the file is then named $id, which is subsequently passed on to fasterq-dump. Passing just a plain SRA id to fasterq-dump results in fasterq-dump downloading the SRA file again instead of processing the already existing one. I tried suffixing the additional arg with .sra, but then the pipeline tells me it could not find the expected output, since the output it expects from prefetch is a file named with just the SRA id.
I fixed this by changing the output to path("${id}.sra"), which seems to work fine until the process completes, at which point it throws an error that I haven't had time to look into yet. In any case, I think the module should be changed to incorporate the -o argument and force the output to have the .sra suffix, in order to avoid redundant downloads and data accumulation.

Midnighter commented on August 11, 2024

The default behavior is that prefetch creates a directory which contains the SRA file, so $id/$id.sra if you will. This directory is then supplied to fasterq-dump, which will look for such a directory. I hadn't used the -o flag before and didn't look at it specifically.

I suggest you use either -O . / --output-directory . to force output in the current directory or -o $id/$id.sra / --output-file $id/$id.sra. Then everything should work as expected.
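To confirm that prefetch left the file where fasterq-dump will look for it, a small check along these lines may help. This is only a sketch: `has_prefetch_layout` is a made-up helper name, and it tests nothing beyond the `$id/$id.sra` layout described in this comment.

```shell
# Hypothetical helper: succeed only if the accession sits in the layout
# <base>/<id>/<id>.sra. The base directory defaults to the current one.
has_prefetch_layout() {
    id="$1"
    base="${2:-.}"
    [ -f "$base/$id/$id.sra" ]
}

# Usage: only hand the directory to fasterq-dump when the file is there.
#   has_prefetch_layout "$id" && fasterq-dump "$id"
```

If the check fails after prefetch has run, the file was most likely written somewhere else, for example to a location set in the NCBI configuration or under a name given with -o.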

dmalzl commented on August 11, 2024

Okay. Again something I didn't know. I have now just deleted my NCBI config file to let the pipeline take care of it. Hope that settles it now. Thanks.

Midnighter commented on August 11, 2024

I'm working on some improvements for the pipeline based on your troubles but it will still take a while until they are released.

dmalzl commented on August 11, 2024

No worries. Since I did not really depend on the settings in my file, simply deleting it seems to have done the trick for now. But I guess at least some future user will profit from my experiences. Thanks for taking care of this.

drpatelh commented on August 11, 2024

Looks like this has been resolved. Will close for now but feel free to re-open if the problem persists.
