I've downloaded files from a Google Cloud Storage bucket and noticed that some of them throw an error when extracted with tar. For example, running tar -xvf dataset_unaligned/0075.tar
outputs the following error:
...
0075/0000075_0044740_0000005_0007124.txt
0075/0000075_0100856_0000007_0018797.jpg
0075/0000075_0040870_0000002_0006601.txt
0075/0000075_0023229_0000008_0003366.jpg
gzip: stdin: invalid compressed data--crc error
0075/0000075_0038536_0000003_0006132.txt
tar: Child returned status 1
tar: Error is not recoverable: exiting now
$ cat dataset_unaligned/0075/0000075_0038536_0000003_0006132.txt
d 75 6132 38536 3 3.611326736278057e+01 -1.151703076311115e+02 6.481528000000000e+02 -2.939905000000000e-02 -9.995678000000000e-01 0.000000000000000e+00 3.611320610005831e+01 -1.151701627748344e+02 6.428045040000000e+02 1.468759559608981e+01 -6.236738248856333e+01 2.000843336507016e+01 0.000000000000000e+00
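
To see which of the downloaded archives are affected without extracting each one, a check along these lines might help. This is only a minimal sketch: it assumes the archives are gzip-compressed despite the .tar extension (as the gzip error above suggests) and that they sit directly under the given directory. The function name find_corrupt_archives is a hypothetical helper, not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: print every *.tar file in a directory whose
# gzip stream fails an integrity test.
find_corrupt_archives() {
    dir=$1
    for archive in "$dir"/*.tar; do
        # Skip the unexpanded pattern when the directory has no matches.
        [ -e "$archive" ] || continue
        # gzip -t decompresses without writing output and verifies the CRC.
        # Reading from stdin keeps gzip from looking at the file suffix.
        if ! gzip -t < "$archive" 2>/dev/null; then
            printf '%s\n' "$archive"
        fi
    done
}
```

Running find_corrupt_archives dataset_unaligned would then list the archives that fail the test. Files that fail here were most likely damaged in storage or during download, so downloading them again is a reasonable first thing to try.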