
Comments (8)

quentinsf commented on August 22, 2024

Hi João,

It looks as if you may have hit some limit on your server, or maybe it's timing out. I'd need to look more carefully at this and I'm afraid I'm not likely to manage that in the near future.

If you need a temporary fix, can I suggest splitting your inbox into folders, e.g. by year, running the program against each folder, and then (if you really want an inbox that large!) recombining them again?

Best,
Quentin

from imapdedup.

vortek commented on August 22, 2024

Hello Quentin,
How do you suggest that I split the inbox?
Thanks!


quentinsf commented on August 22, 2024

Well, there are ways you could script it, but I would just use an email program to create a new folder, select all the messages in one year, and move them over. Then do the next year...

Depending on your email client, you may be able to do something clever with smart mailboxes to make the selection process easier...
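For anyone who would rather script the split, here is a minimal sketch using Python's standard imaplib. It is not part of imapdedup. The helper names (`year_criteria`, `move_year`) and the `INBOX-2020` style folder naming are hypothetical, and some servers need a different folder prefix or delimiter. Note that SINCE/BEFORE match the server's internal date; SENTSINCE/SENTBEFORE would match the Date header instead.

```python
def year_criteria(year):
    """Build an IMAP SEARCH criterion for messages received during `year`.

    SINCE is inclusive and BEFORE is exclusive; both use the internal date.
    """
    return "(SINCE 01-Jan-%d BEFORE 01-Jan-%d)" % (year, year + 1)


def move_year(conn, year, source="INBOX", chunk=100):
    """Copy one year's messages from `source` into '<source>-<year>',
    flag the originals as deleted, then expunge. Returns the message count.

    `conn` is assumed to be a logged-in imaplib.IMAP4 / IMAP4_SSL object.
    """
    target = "%s-%d" % (source, year)   # hypothetical naming scheme
    conn.create(target)                 # server answers NO if it already exists
    conn.select(source)
    typ, data = conn.uid("SEARCH", year_criteria(year))
    uids = (data[0] or b"").split()
    # Work in small batches so no single command is very large.
    for i in range(0, len(uids), chunk):
        batch = b",".join(uids[i:i + chunk]).decode("ascii")
        typ, _ = conn.uid("COPY", batch, target)
        if typ == "OK":                 # only delete what was copied
            conn.uid("STORE", batch, "+FLAGS", r"(\Deleted)")
    conn.expunge()
    return len(uids)


# Usage (assumed host and credentials):
#   import imaplib
#   conn = imaplib.IMAP4_SSL("imap.example.com")
#   conn.login("user", "password")
#   for year in range(2010, 2020):
#       print(year, move_year(conn, year))
```

Try it on a copy or a small test folder first, since the expunge at the end is not reversible.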


vortek commented on August 22, 2024

Thanks for the tips!


Bill48105 commented on August 22, 2024

I ran into this as well on an inbox with 300K+ messages. (Don't ask..) The first run was great: it deleted 100K dupes and I was excited, but there were still dupes showing up in Roundcube, so I figured I'd run it again. This time I got that EOF error on the same fetch-headers line. I changed (RFC822.HEADER) to (BODY.PEEK[HEADER]) and it worked again, for one run. Then the dreaded EOF error on every run after. So I changed (BODY.PEEK[HEADER]) back to (RFC822.HEADER) and it worked.. for one run. Then EOF again, until I let it sit a while and it worked again.. for one run, then EOF. By that time it was clear something funky was up, so I decided to dig deeper to try and narrow it down. I tried many things, including adjusting MAXLINE and wrapping the IMAP commands in try/except hoping it'd continue (it doesn't), but it wasn't until I enabled debugging with imaplib.Debug and m.debug = True that I finally got a big clue as to what was going on:

35:55.56 BYE response: Server shutting down.

So yeah, umm, it seems the remote server is shutting down mid-session? That would explain why it works after editing (the time that passed let the server come back online). Note that it happened on a folder with only 39 messages: I had switched to another folder with fewer messages to try and narrow down the issue. I thought it was a fluke, but I was able to reproduce this shutdown multiple times.

There are many guesses as to what is up, from corrupt messages on the server, to overloading the server, to a bug in Python's imaplib, to who knows what. Clearly there's an issue; I just can't say it's in IMAPdedup (in fact I don't think it is, since I get similar issues with other programs/scripts), beyond that it might be helpful if it handled the error better and recovered.

Btw not sure about OP but in my case this is all on InMotion shared business hosting which is Dovecot:

* OK [CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE IDLE NAMESPACE STARTTLS AUTH=PLAIN AUTH=LOGIN] Dovecot ready.

EDIT: OK, it seems the rate limiting in that post may be syslog rate limiting, so it may be unrelated and a weird coincidence..
A little searching suggests it may be rate limiting:
"server dovecot: imap([email protected]): Server shutting down. in=7140 out=70598"
https://www.howtoforge.com/community/threads/server-dovecot-imap-account-tld-com-server-shutting-down-in-7140-out-70598.74887/

If that's the case, maybe it needs an option to limit the max number of messages it handles at a time, and/or sleeps in the loop, to help?
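As a rough illustration of that idea (not imapdedup's actual code), a throttled fetch loop could look something like the sketch below. The function names and the `size`, `pause` and `limit` parameters are hypothetical; `conn` is assumed to be a logged-in imaplib connection with a mailbox already selected, and `message_ids` a list of message-number strings.

```python
import time


def chunks(items, size):
    """Yield successive slices of `items`, each at most `size` long."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def fetch_headers_throttled(conn, message_ids, size=100, pause=1.0, limit=None):
    """Fetch headers in small batches, sleeping between batches.

    `limit` caps how many messages are processed in this run;
    `pause` is the number of seconds to wait after each FETCH.
    """
    ids = message_ids if limit is None else message_ids[:limit]
    results = []
    for batch in chunks(ids, size):
        typ, data = conn.fetch(",".join(batch), "(BODY.PEEK[HEADER])")
        if typ != "OK":
            raise RuntimeError("FETCH failed for %s" % ",".join(batch))
        results.extend(data)
        time.sleep(pause)   # give a rate-limited server room to breathe
    return results
```

Whether this helps depends on what the server is actually limiting (commands per second, bytes, session length), which the logs above don't settle.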


shubhammatta commented on August 22, 2024

1150749 others in INBOX
30:37.28 > LCJD5 FETCH 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100 (RFC822.HEADER)
30:38.69 last 0 IMAP4 interactions:
30:38.69 > LCJD6 LOGOUT
30:38.69 last 0 IMAP4 interactions:
Traceback (most recent call last):
  File "imapdedup.py", line 324, in <module>
    main(sys.argv[1:])
  File "imapdedup.py", line 321, in main
    process(options, mboxes)
  File "imapdedup.py", line 248, in process
    ms = check_response(server.fetch(message_ids, '(RFC822.HEADER)'))
  File "/root/daily_build/64_23/4.3.4/SysUtil/Python-2.7.5-cross/install_path_full/lib/python2.7/imaplib.py", line 443, in fetch
  File "/root/daily_build/64_23/4.3.4/SysUtil/Python-2.7.5-cross/install_path_full/lib/python2.7/imaplib.py", line 1070, in _simple_command
  File "/root/daily_build/64_23/4.3.4/SysUtil/Python-2.7.5-cross/install_path_full/lib/python2.7/imaplib.py", line 899, in _command_complete
imaplib.abort: command: FETCH => socket error: EOF

I turned the imaplib debug on.
I understand that the INBOX has a huge number of mails, but fetching results in a socket error: EOF. Does anyone have any insights?


quentinsf commented on August 22, 2024

Mmm. Do you have access to the server logs?

The imaplib source says that '"abort" exceptions imply the connection should be reset, and
the command re-tried.'

So perhaps that's what we should do (if anyone who can test this would like to submit a pull request!)
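A minimal, untested-against-a-real-server sketch of that reset-and-retry idea follows. `fetch_with_retry` and its `connect` callback are hypothetical names, not anything in imapdedup; `connect` is assumed to return a freshly logged-in imaplib connection. One caveat: message sequence numbers can change between sessions if anything was expunged, so UIDs would be safer in real code.

```python
import imaplib
import time


def fetch_with_retry(connect, mailbox, message_set,
                     parts="(RFC822.HEADER)", retries=3, delay=5.0, conn=None):
    """Run FETCH, reconnecting and retrying when imaplib raises `abort`.

    Returns (conn, data) so the caller can keep using the live connection.
    Re-raises the last abort once `retries` reconnect attempts are used up.
    """
    for attempt in range(retries + 1):
        try:
            if conn is None:
                conn = connect()        # new login after a dropped session
                conn.select(mailbox)
            typ, data = conn.fetch(message_set, parts)
            return conn, data
        except imaplib.IMAP4.abort:
            conn = None                 # the old connection is unusable
            if attempt == retries:
                raise
            time.sleep(delay)           # let the server recover first


# Usage (assumed host and credentials):
#   def connect():
#       c = imaplib.IMAP4_SSL("imap.example.com")
#       c.login("user", "password")
#       return c
#   conn, data = fetch_with_retry(connect, "INBOX", "1:100")
```

Only `abort` is caught here; other `imaplib.IMAP4.error` exceptions usually indicate a bad command rather than a dropped connection, so retrying them is unlikely to help.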

I guess your mail server may be very heavily loaded and timing out trying to do this even for 100 messages. However, you may be asking for problems with any IMAP server if you keep more than a million messages in a single mailbox! Not to mention using a lot of RAM on your local machine if you do manage to download even their headers...


shubhammatta commented on August 22, 2024

Thanks for the info. I reduced the chunk size to 1 and the script ran. If it aborts again, I will try adding the reconnect part to the script, and will comment if that works.
I hope it does not abort, though. I have been at it for quite some time now.

