Dear All, I am trying to run gcloud alpha genomics but have recurren

Docker run failed: command failed: /tmp/ggp-494856422: line 16: type: gsutil: not found\ndebconf about deepvariant HOT 12 CLOSED

google commented on May 17, 2024

Docker run failed: command failed: /tmp/ggp-494856422: line 16: type: gsutil: not found\ndebconf

from deepvariant.

Comments (12)

depristo commented on May 17, 2024

@arostamianfar This seems like an issue for you.


arostamianfar commented on May 17, 2024

The actual error is "The TF examples in /mnt/data/input/gs/wgs-test-shan/test_samples/UDN689484temp/examples/examples_output.tfrecord-00000-of-00064.gz has image/format 'None' (expected 'raw') which means you might need to rerun make_examples to genenerate the examples again."

@pichuan @depristo this is odd since the pipeline ran as a single workflow. The model and docker binary paths also seem correct. One issue I can think of is most of the shards being empty (the output has 64 shards, but it's only 1.3KB in total). Do you know if empty shards could cause such an error?

P.S. the 'gsutil not found' error is actually harmless. I think we should provide a 'parser' for these errors based on the logs that provides a meaningful error message.
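To make the 'parser' idea concrete, here is a minimal sketch, not an existing DeepVariant or Cloud Genomics tool. It assumes the pipeline log is available as plain text. The pattern tables and the function name `summarize_log` are hypothetical and cover only the two messages quoted in this thread.

```python
import re

# Hypothetical tables: only the two messages quoted in this thread are covered.
HARMLESS_PATTERNS = [
    re.compile(r"type: gsutil: not found"),
]
KNOWN_ERRORS = [
    (
        re.compile(
            r"has image/format '(?P<found>[^']*)' \(expected '(?P<expected>[^']*)'\)"
        ),
        "call_variants could not read image/format from the examples "
        "(found {found!r}, expected {expected!r}); the input shard may be empty.",
    ),
]


def summarize_log(log_text):
    """Return readable messages for known errors, skipping harmless lines."""
    messages = []
    for line in log_text.splitlines():
        # Lines known to be noise are dropped before any matching.
        if any(p.search(line) for p in HARMLESS_PATTERNS):
            continue
        for pattern, template in KNOWN_ERRORS:
            match = pattern.search(line)
            if match:
                messages.append(template.format(**match.groupdict()))
    return messages
```

Run on a log that contains both messages, this would surface only the image/format problem and stay silent about gsutil.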


arostamianfar commented on May 17, 2024

Yep, it's caused by empty shards. I was able to reproduce this by using 64 shards with the quickstart test data. @depristo should I file a separate issue for this, as it's not really a Docker issue?

@chenshan03: thanks for the report. As a workaround until this bug is fixed, you may reduce the number of shards to avoid having empty ones.


depristo commented on May 17, 2024

@pichuan @scott7z I believe the empty shards bug has been fixed, is that correct?


pichuan commented on May 17, 2024

Hi Mark and Asha,
here's what I believe the current status is:
(1) If just one shard out of many is empty (the shard file exists but contains 0 records), the code moves on to the next shard and tries to read image/format from it. This is what Mark meant by the previously fixed empty shards bug.
(2) However, if all the shard files exist but every one of them contains 0 records, the current code can fail with the error message above.

In this case, if the actual error message observed is:
The TF examples in /mnt/data/input/gs/wgs-test-shan/test_samples/UDN689484temp/examples/examples_output.tfrecord-00000-of-00064.gz has image/format 'None' (expected 'raw')

It seems this call_variants run is being done on that one file specifically. If that file has 0 records, it will unfortunately fail with that error at the moment. :-(

So, I think this is a real bug that we should fix, because we do expect the use case where users run 64 separate call_variants processes, and some of them might get a single input file that is completely empty. Is that correct?
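Cases (1) and (2) can be sketched with the standard library alone. This is an illustration, not DeepVariant's code: it assumes the shards are gzip-compressed files in which an empty shard decompresses to zero bytes, and the helper names `shard_is_empty` and `first_nonempty_shard` are hypothetical.

```python
import gzip


def shard_is_empty(path):
    """True if a gzip-compressed shard decompresses to zero bytes (no records)."""
    with gzip.open(path, "rb") as f:
        return f.read(1) == b""


def first_nonempty_shard(paths):
    """Return the first shard that holds data, or None if every shard is empty.

    Case (1): some shards are empty, so the scan moves on to the next one.
    Case (2): all shards are empty, so there is nothing to read image/format from.
    """
    for path in paths:
        if not shard_is_empty(path):
            return path
    return None
```

When a process is given a single shard, `first_nonempty_shard([that_shard])` returns `None` as soon as that shard is empty, which is the situation reported here.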


arostamianfar commented on May 17, 2024

yes, I think this is a real bug that still exists.
Due to the distributed nature of the cloud process, some machines may get shards that are all empty. Also, we actually only supply one of the shards to each process, so (1) doesn't really apply (there is no 'next shard').
You can reproduce this by adding "--shards 64" to the quickstart test data configuration in https://cloud.google.com/genomics/deepvariant.


depristo commented on May 17, 2024

My view is that if all shards are empty we should just write an empty CVO file. If that's not what happens right now, let's add a bug to buganizer and fix it.


pichuan commented on May 17, 2024

I filed a bug in buganizer.


cmclean commented on May 17, 2024

This has been fixed by the DeepVariant 0.5.1 release that just came out a few minutes ago. Thank you for raising attention to this issue.


pgrosu commented on May 17, 2024

Hi Cory (@cmclean),

Thank you for the new release, but if we look at the new timings with the 0.5.1 release, they seem to have gotten longer than with the previous version:

Commit v0.5.1

Timings: Whole Genome Case Study - [`0.5 (pink) vs. 0.5.1 (green)`]

Timings: Exome Case Study - [`0.5 (pink) vs. 0.5.1 (green)`]

What is the cause of the additional delay in version 0.5.1 as compared to the previous one?

Thanks,
Paul


depristo commented on May 17, 2024

Hi Paul,

Two quick suggestions. First, I'd recommend posting this question in a separate issue, to keep the discussion clean since this is a very interesting and general observation.

Second, it's unclear to us whether this is just normal variation in cloud timing, since not all machines you create are identical. For example, the case study command:

gcloud beta compute instances create "${USER}-deepvariant-casestudy"  --scopes "compute-rw,storage-full,cloud-platform" --image-family "ubuntu-1604-lts" --image-project "ubuntu-os-cloud" --machine-type "custom-64-131072" --boot-disk-size "300" --boot-disk-type "pd-ssd" --zone "us-west1-b"

doesn't pin the exact CPU platform, so we're likely getting Skylake processors sometimes and Broadwell processors other times. That alone could account for the variation in timing we are seeing here.
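One way to check this would be to record the CPU each VM reports next to its timings. The following is a minimal sketch, assuming a Linux VM where `/proc/cpuinfo` is readable. The function name `cpu_model_names` is hypothetical, and the reported model string may not name the microarchitecture directly.

```python
def cpu_model_names(cpuinfo_text):
    """Return the distinct 'model name' values from /proc/cpuinfo-style text."""
    names = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model name":
            name = value.strip()
            if name not in names:
                names.append(name)
    return names


# Example use on a Linux VM (not executed here):
#   with open("/proc/cpuinfo") as f:
#       print(cpu_model_names(f.read()))
```

Logging this value for each run would show whether the slower timings line up with a different processor.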


pichuan commented on May 17, 2024

Hi all,
it has recently been reported again that the crash in call_variants on an empty shard wasn't fully resolved last time. I just released v0.6.1, which should really resolve this issue now:
https://github.com/google/deepvariant/releases/tag/v0.6.1

The issue was that I didn't properly return in the if branch where an empty shard was detected:
12f9e67
(And the unit test I had for it was flawed. We'll fix the unit test in a later release.)
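The control flow at issue can be illustrated with a small sketch. This is not the code from the commit above. `call_variants_sketch`, the dict-shaped records, and the `write_output` callback are all hypothetical stand-ins that show only why the early `return` matters.

```python
def call_variants_sketch(records, write_output):
    """Hypothetical stand-in for the empty-shard branch described above."""
    records = list(records)
    if not records:
        # Empty shard: write an empty output and stop.
        write_output([])
        # Without this return, execution would fall through to the
        # image/format check below and fail on a shard with no records.
        return 0

    image_format = records[0].get("image/format")
    if image_format != "raw":
        raise ValueError("image/format %r (expected 'raw')" % (image_format,))

    # Placeholder for real inference: one output entry per input record.
    write_output([{"call_for": r} for r in records])
    return len(records)
```

Calling it with an empty list writes one empty output and returns 0, which is also a small case that a unit test for this branch could assert on.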

This time I've tested it manually on an empty shard and confirmed that call_variants works when there are zero records.

Please feel free to report if you see any issues again. Thank you!

