
Comments (7)

mnshgl0110 commented on August 13, 2024

Hi. This issue is probably caused by the BAM file format specification, which limits the chromosome size.
I will add a new reader for SAM files in the next few days, which I think should resolve this issue. Until then, could you please try using mummer3? It is slower than mummer4 but more memory efficient, and it would bypass this issue.

from syri.

mnshgl0110 commented on August 13, 2024

Hi. SyRI now reads SAM files directly and should work with large genomes.


lvbakker commented on August 13, 2024

Hi,
Thanks for looking into this.
Unfortunately, I tried to resubmit with the SAM file and now I get this error:

2020-08-20 11:04:53,669 - Running SyRI - WARNING - :135 - Using pandas version: 0.18.0. Expected vesion: 0.23.4. This might result in unexpected errors.
2020-08-20 11:04:53,682 - Running SyRI - DEBUG - :136 - Python and Pandas version are ok
2020-08-20 11:04:55,347 - Reading Coords - DEBUG - :163 - S
2020-08-20 11:04:55,348 - Reading Coords - INFO - :163 - Reading input from SAM file
2020-08-20 11:05:18,105 - Reading BAM/SAM file - ERROR - :163 - Error in reading BAM/SAM file. truncated file

I successfully converted the file to BAM format, so the file itself is not truncated.
Can you reopen my original issue about submitting a BAM file as input?
Thank you.


mnshgl0110 commented on August 13, 2024

Hi. The error message is unexpected for SAM files. Is this happening with all SAM files or just one specific SAM file? Would it be possible to share a sample SAM file for me to test?


Jquimcrz commented on August 13, 2024

Hi! I am actually having the same problem as mentioned here. How was this issue finally resolved? Thanks!


mnshgl0110 commented on August 13, 2024

Hi. I assume that you are talking about the truncated-file issue.
I did not receive any reply, so I thought that the issue was fixed. Anyway, I think this has something to do with your specific SAM files, as it is not happening to other people.

Have you tried the example analysis, and did it work? Also, how did you create the input file? Is it SAM or BAM? Does the issue happen with only one specific file, or always?

If possible, could you test if you can read the SAM/BAM file with pysam?
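
One rough way to run that check, as a minimal sketch rather than anything SyRI itself does: the helper name count_records is made up for illustration, and the fallback branch is only a crude field-count check for plain-text SAM when pysam is not installed. If the file really is damaged, reading it to the end should raise an error instead of returning a count.

```python
try:
    import pysam  # third-party; used when available
except ImportError:
    pysam = None


def count_records(path):
    """Read every record in a SAM/BAM file and return how many were read.

    Hypothetical helper, not part of SyRI or pysam.
    """
    if pysam is not None:
        # "rb" for BAM, "r" for plain-text SAM
        mode = "rb" if path.endswith(".bam") else "r"
        with pysam.AlignmentFile(path, mode) as handle:
            return sum(1 for _ in handle)

    # Fallback without pysam: plain-text SAM only.
    # Each alignment line must have at least the 11 mandatory fields.
    count = 0
    with open(path) as handle:
        for number, line in enumerate(handle, 1):
            if line.startswith("@"):  # header line
                continue
            if len(line.rstrip("\n").split("\t")) < 11:
                raise ValueError("line %d has fewer than 11 fields" % number)
            count += 1
    return count
```

Calling count_records("path/to/sam") and comparing the result with the number of alignments you expect is usually enough to tell whether the file can be read through to the end.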


Jquimcrz commented on August 13, 2024

Hi, sorry that my first message was so vague. Let me explain in a little more detail. First, I had no problem reproducing the example analysis; it worked as expected. Then I tried to run the protocol on my own genomes. I have several genome assemblies that I want to compare to the current reference, but none of them is at the chromosome level, and they have varying numbers of scaffolds. I aligned each of my queries independently to the reference using minimap2 and then tried to call the structural rearrangements (SR) using SyRI with the following parameters:

./syri/bin/syri -c path/to/sam -r path/to/reference -q path/to/query -F S -k -f --no-chrmatch

SyRI started running, reported that the reference and the query have different numbers of scaffolds and that some were not aligned, and continued to run until it crashed. The error message was as follows:

SAM reader - WARNING - A1_scaffold0151 do not align with any reference sequence and cannot be analysed. Remove all unplaced scaffolds and contigs from the assemblies.
Reading Coords - WARNING - Chromosomes IDs do not match.
Reading Coords - WARNING - --no-chrmatch is set. Not matching chromosomes automatically.
Reading Coords - WARNING - BDIQ01000126.1, BDIQ01000179.1, BDIQ01000057.1, BDIQ01000041.1, BDIQ01000183.1, BDIQ01000086.1, BDIQ01000074.1, BDIQ01000107.1, BDIQ01000051.1, BDIQ01000167.1, BDIQ01000205.1, BDIQ01000152.1, BDIQ01000040.1, BDIQ01000030.1, BDIQ01000163.1, BDIQ01000135.1, BDIQ01000080.1, BDIQ01000117.1, BDIQ01000116.1, BDIQ01000088.1, BDIQ01000136.1, BDIQ01000169.1, BDIQ01000038.1, BDIQ01000055.1, BDIQ01000171.1, BDIQ01000160.1, BDIQ01000028.1, BDIQ01000193.1, BDIQ01000067.1, BDIQ01000191.1, BDIQ01000132.1, BDIQ01000014.1, BDIQ01000130.1, BDIQ01000121.1, BDIQ01000144.1, BDIQ01000024.1, BDIQ01000134.1, BDIQ01000139.1, BDIQ01000185.1, BDIQ01000058.1, BDIQ01000129.1, BDIQ01000032.1, BDIQ01000123.1, BDIQ01000063.1, BDIQ01000166.1, BDIQ01000035.1, BDIQ01000190.1, BDIQ01000102.1, BDIQ01000095.1, BDIQ01000076.1, BDIQ01000100.1, BDIQ01000125.1, BDIQ01000112.1, BDIQ01000006.1, BDIQ01000174.1, BDIQ01000178.1, BDIQ01000085.1, BDIQ01000011.1, BDIQ01000090.1, BDIQ01000151.1, BDIQ01000066.1, BDIQ01000108.1, BDIQ01000013.1, BDIQ01000017.1, BDIQ01000061.1, BDIQ01000098.1, BDIQ01000031.1, BDIQ01000198.1, BDIQ01000075.1, BDIQ01000137.1, BDIQ01000048.1, BDIQ01000156.1, BDIQ01000147.1, BDIQ01000127.1, BDIQ01000054.1, BDIQ01000164.1, BDIQ01000060.1, BDIQ01000081.1, BDIQ01000170.1, BDIQ01000012.1, BDIQ01000010.1, BDIQ01000068.1, BDIQ01000165.1, BDIQ01000184.1, BDIQ01000016.1, BDIQ01000050.1, BDIQ01000131.1, BDIQ01000077.1, BDIQ01000020.1, BDIQ01000042.1, BDIQ01000140.1, BDIQ01000158.1, BDIQ01000097.1, BDIQ01000168.1, BDIQ01000180.1, BDIQ01000105.1, BDIQ01000138.1, BDIQ01000022.1, BDIQ01000089.1, BDIQ01000122.1, BDIQ01000197.1, BDIQ01000161.1, BDIQ01000096.1, BDIQ01000047.1, BDIQ01000146.1, BDIQ01000128.1, BDIQ01000201.1, BDIQ01000001.1, BDIQ01000007.1, BDIQ01000194.1, BDIQ01000154.1, BDIQ01000091.1, BDIQ01000188.1, BDIQ01000120.1, BDIQ01000149.1, BDIQ01000083.1, BDIQ01000109.1, BDIQ01000079.1, BDIQ01000043.1, BDIQ01000143.1, BDIQ01000059.1, BDIQ01000114.1, BDIQ01000070.1, 
BDIQ01000118.1, BDIQ01000056.1, BDIQ01000033.1, BDIQ01000115.1, BDIQ01000025.1, BDIQ01000052.1, BDIQ01000093.1, BDIQ01000141.1, BDIQ01000199.1, BDIQ01000133.1, BDIQ01000065.1, BDIQ01000071.1, BDIQ01000195.1, BDIQ01000053.1, BDIQ01000039.1, BDIQ01000177.1, BDIQ01000073.1, BDIQ01000162.1, BDIQ01000192.1, BDIQ01000082.1, BDIQ01000034.1, BDIQ01000159.1, BDIQ01000101.1, BDIQ01000106.1, BDIQ01000157.1, BDIQ01000046.1, BDIQ01000145.1, BDIQ01000124.1, BDIQ01000111.1, BDIQ01000148.1, BDIQ01000104.1, BDIQ01000153.1, BDIQ01000150.1, BDIQ01000155.1, BDIQ01000002.1, BDIQ01000187.1, BDIQ01000196.1, BDIQ01000062.1, BDIQ01000078.1, BDIQ01000173.1, BDIQ01000186.1, BDIQ01000110.1, BDIQ01000027.1, BDIQ01000182.1, BDIQ01000021.1, BDIQ01000044.1, BDIQ01000172.1, BDIQ01000023.1, BDIQ01000064.1, BDIQ01000092.1, A1_scaffold0145, A1_scaffold0028, A1_scaffold0004, A1_scaffold0065, A1_scaffold0097, A1_scaffold0096, A1_scaffold0012, A1_scaffold0025, A1_scaffold0136, A1_scaffold0040, A1_scaffold0102, A1_scaffold0044, A1_scaffold0049, A1_scaffold0130, A1_scaffold0123, A1_scaffold0019, A1_scaffold0138, A1_scaffold0017, A1_scaffold0135, A1_scaffold0053, A1_scaffold0127, A1_scaffold0092, A1_scaffold0034, A1_scaffold0041, A1_scaffold0036, A1_scaffold0003, A1_scaffold0133, A1_scaffold0117, A1_scaffold0108, A1_scaffold0061, A1_scaffold0057, A1_scaffold0113, A1_scaffold0119, A1_scaffold0099, A1_scaffold0089, A1_scaffold0093, A1_scaffold0079, A1_scaffold0144, A1_scaffold0018, A1_scaffold0005, A1_scaffold0002, A1_scaffold0048, A1_scaffold0128, A1_scaffold0142, A1_scaffold0084, A1_scaffold0087, A1_scaffold0116, A1_scaffold0141, A1_scaffold0082, A1_scaffold0022, A1_scaffold0043, A1_scaffold0148, A1_scaffold0132, A1_scaffold0143, A1_scaffold0029, A1_scaffold0023, A1_scaffold0088, A1_scaffold0106, A1_scaffold0075, A1_scaffold0045, A1_scaffold0147, A1_scaffold0068, A1_scaffold0067, A1_scaffold0105, A1_scaffold0059, A1_scaffold0125, A1_scaffold0046, A1_scaffold0121, A1_scaffold0140, A1_scaffold0115, 
A1_scaffold0101, A1_scaffold0078, A1_scaffold0033, A1_scaffold0024, A1_scaffold0085, A1_scaffold0052, A1_scaffold0009, A1_scaffold0026, A1_scaffold0006, A1_scaffold0081, A1_scaffold0071, A1_scaffold0076, A1_scaffold0015, A1_scaffold0124, A1_scaffold0030, A1_scaffold0062, A1_scaffold0011, A1_scaffold0060, A1_scaffold0031, A1_scaffold0055, A1_scaffold0047, A1_scaffold0080, A1_scaffold0131, A1_scaffold0070, A1_scaffold0035, A1_scaffold0074, A1_scaffold0064, A1_scaffold0146, A1_scaffold0014, A1_scaffold0066, A1_scaffold0137, A1_scaffold0069, A1_scaffold0016, A1_scaffold0104, A1_scaffold0122, A1_scaffold0110, A1_scaffold0008, A1_scaffold0126, A1_scaffold0086, A1_scaffold0032, A1_scaffold0027, A1_scaffold0129, A1_scaffold0073, A1_scaffold0109, A1_scaffold0098, A1_scaffold0063, A1_scaffold0090, A1_scaffold0149, A1_scaffold0037, A1_scaffold0042, A1_scaffold0058, A1_scaffold0077, A1_scaffold0020, A1_scaffold0118, A1_scaffold0054, A1_scaffold0111, A1_scaffold0095, A1_scaffold0094, A1_scaffold0001, A1_scaffold0100, A1_scaffold0091, A1_scaffold0007, A1_scaffold0072, A1_scaffold0021, A1_scaffold0050, A1_scaffold0120, A1_scaffold0039, A1_scaffold0083, A1_scaffold0038, A1_scaffold0150, A1_scaffold0010, A1_scaffold0051, A1_scaffold0107, A1_scaffold0134, A1_scaffold0013, A1_scaffold0103, A1_scaffold0114, A1_scaffold0056, A1_scaffold0112, A1_scaffold0139 present in only one genome. Removing corresponding alignments
Traceback (most recent call last):
  File "/scratch/jcruzcor/04_SyRI_analysis/syri/syri/bin/syri", line 250, in <module>
    startSyri(args, coords[["aStart", "aEnd", "bStart", "bEnd", "aLen", "bLen", "iden", "aDir", "bDir", "aChr", "bChr"]])
  File "syri/pyxFiles/synsearchFunctions.pyx", line 467, in syri.pyxFiles.synsearchFunctions.startSyri
  File "syri/pyxFiles/synsearchFunctions.pyx", line 860, in syri.pyxFiles.synsearchFunctions.outSyn
  File "/scratch/jcruzcor/trial_minimap/SYRI/lib/python3.5/site-packages/pandas/core/generic.py", line 4389, in __setattr__
    return object.__setattr__(self, name, value)
  File "pandas/_libs/properties.pyx", line 69, in pandas._libs.properties.AxisProperty.__set__
  File "/scratch/jcruzcor/trial_minimap/SYRI/lib/python3.5/site-packages/pandas/core/generic.py", line 646, in _set_axis
    self._data.set_axis(axis, labels)
  File "/scratch/jcruzcor/trial_minimap/SYRI/lib/python3.5/site-packages/pandas/core/internals.py", line 3323, in set_axis
    'values have {new} elements'.format(old=old_len, new=new_len))
ValueError: Length mismatch: Expected axis has 0 elements, new values have 7 elements
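
The first warning in the log asks for unplaced scaffolds and contigs to be removed from the assemblies. A minimal sketch of one way to do that in plain Python follows, assuming an uncompressed SAM file and FASTA files whose sequence IDs are the first word of each header. The function names aligned_names and filter_fasta are made up for illustration and are not part of SyRI. This only addresses the warning; whether it also prevents the crash above is not established here.

```python
def aligned_names(sam_lines):
    """Return (reference_names, query_names) seen in mapped SAM records."""
    refs, queries = set(), set()
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header and empty lines
        fields = line.rstrip("\n").split("\t")
        flag, rname = int(fields[1]), fields[2]
        if flag & 4 or rname == "*":
            continue  # unmapped record
        queries.add(fields[0])  # QNAME: query scaffold
        refs.add(rname)         # RNAME: reference sequence
    return refs, queries


def filter_fasta(fasta_lines, keep):
    """Yield the lines of FASTA records whose ID is in `keep`."""
    writing = False
    for line in fasta_lines:
        if line.startswith(">"):
            header = line[1:].split()
            writing = bool(header) and header[0] in keep
        if writing:
            yield line
```

With open file handles, aligned_names(open("path/to/sam")) gives the two name sets, and writing the output of filter_fasta(open("path/to/query"), query_names) to a new file gives a query assembly restricted to the scaffolds that have at least one alignment. The same can be done for the reference.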

