Comments (11)
@carragom So, BinaryField is definitely not the answer because we can't filter on it... I finally added a new setting to configure the encoding to use (default value is now LATIN1) and I removed some useless convert_from calls.
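For illustration, a minimal sketch of how a configurable encoding could feed the PostgreSQL `convert_from` calls. The helper name `convert_from_expr` and the constant `DEFAULT_ENCODING` are hypothetical and are not taken from the modoboa-amavis code; only `convert_from` itself and the LATIN1 default come from the comment above.

```python
import re

# Assumption: mirrors the default mentioned in the comment above.
DEFAULT_ENCODING = "LATIN1"

# Only plain identifiers are accepted, because the values are interpolated into SQL.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def convert_from_expr(column, encoding=DEFAULT_ENCODING):
    """Build a convert_from() SQL fragment for a bytea column (hypothetical helper)."""
    if not _IDENTIFIER.match(encoding):
        raise ValueError("invalid encoding name: %r" % (encoding,))
    if not all(_IDENTIFIER.match(part) for part in column.split(".")):
        raise ValueError("invalid column name: %r" % (column,))
    return "convert_from({}, '{}')".format(column, encoding.upper())
```

With the default, `convert_from_expr("maddr.email")` yields `convert_from(maddr.email, 'LATIN1')`, so the encoding can be changed in one place instead of in every query.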
from modoboa-amavis.
Hi,
As for the Quarantine.mail_text field, we could try using a BinaryField and removing all calls to convert_from.
Do you think you could try it?
@carragom ping
From @carragom on December 18, 2015 19:39
Hi @tonioo, sorry for the absence. At least in version 1.2.2, Quarantine.mail_text already is a BinaryField, as shown here, or am I looking at the wrong place? Maybe I did not understand what you meant?
Hi @carragom, your problem seems to be related to the email field:
https://github.com/modoboa/modoboa-amavis/blob/master/modoboa_amavis/models.py#L20
From @carragom on December 18, 2015 22:25
@tonioo Yes, that's the field causing the problems. Like I said, I see two options to fix this:
1- Keep the conversion in the database as it is now, but ask the database to convert to LATIN1 instead of UTF8 in all occurrences of the convert_from function here. I did this and it's working for me. I'm not sure this is the right encoding, but it is certainly better than UTF8, which we already know breaks.
2- Switch to using BinaryField for Maddr.email and handle the conversion in Python. Just keep in mind that the number of rows in the Maddr table grows fast: when I reported this a month ago the table was around 7K rows, and right now it's sitting at 12K rows. So doing this conversion outside the database could be a performance issue.
I have been looking around here to see if there is any indication of which encoding is actually used, without luck. But it does mention that the option $sql_allow_8bit_address needs to be set to use this field as bytea. So LATIN1 sounds like a safe encoding to use.
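A small sketch of why LATIN1 cannot fail where UTF8 does: Latin-1 assigns a character to every one of the 256 byte values, while UTF-8 rejects some byte sequences. The sample address below is made up for illustration. Note that a successful Latin-1 decode does not guarantee the characters are the ones the sender intended if the bytes were written in another encoding.

```python
# Hypothetical sample: an address containing a raw 0xE9 byte ("é" in Latin-1).
raw_email = b"jos\xe9@example.com"


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


utf8_result = try_decode(raw_email, "utf-8")      # None: 0xE9 followed by "@" is not valid UTF-8
latin1_result = try_decode(raw_email, "latin-1")  # "josé@example.com"

# Every possible byte value survives a Latin-1 decode/encode round trip.
all_bytes = bytes(range(256))
round_trip = all_bytes.decode("latin-1").encode("latin-1")
```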
If you ask me, I would just go with option 1, which is simple to implement and has no performance issues.
I can provide a quick PR for option 1 if you decide to go with it.
Let me know.
@carragom Using LATIN1 as encoding won't cover all cases. I guess we will encounter the same issues with another encoding soon. I still think a BinaryField is the right answer and I do hope Django uses the appropriate field when it generates queries. The manual conversion you see in the current code would also disappear.
@carragom BTW, the right place for this issue is the https://github.com/modoboa/modoboa-amavis repository.
From @carragom on January 27, 2016 23:40
@tonioo I agree that LATIN1 does not cover all cases and it's far from ideal. The one thing for sure is that this problem renders Modoboa unusable, all of it, not just the amavis module. So this should be fixed in any way necessary.
It might be that amavis does not intend these fields to be used as text. From the amavis README.sql-pg.txt:
Upgrade note: field quarantine.mail_text should be of data type 'bytea'
and not 'text' as suggested in earlier documentation; this is to prevent
it from being unjustifiably associated with a character set, and to be
able to store any byte value; to convert existing field from type 'text'
to type 'bytea' the following clause may be used:
ALTER TABLE quarantine ALTER mail_text TYPE bytea
USING decode(replace(mail_text,'\\','\\\\'),'escape');
Thanks a lot for your time.
Please look at this thread (the end of the page is interesting):
https://code.djangoproject.com/ticket/2417
And this commit (django source code):
django/django@8ee1edd
And tell me what you think :)
From @carragom on January 29, 2016 19:2
Yes, using a BinaryField is definitely an option, see here. But switching to BinaryField alone is not enough. Every custom query using convert_from needs to be replaced with something that fetches the data from the table and filters it on the Python side. This probably means rewriting this entire class.
In any case, it does not matter what type of field is used or where the conversion happens (database or application): at some point those bytes in the database will have to be converted to text in order to be useful, and that conversion requires a character set. UTF8 is not the right charset for that data and currently breaks the entire application. The main objective here is to find a way for the application not to break even if the conversion fails.
Again I see two options:
1- Find a way to handle the conversion gracefully at the database level (maybe a stored procedure would help here, or just use LATIN1 as the charset, which is working for me and seems to be what amavis is using).
2- Use a BinaryField and move the entire logic of converting/filtering the data to the web app, which is inefficient, a lot of work, and will still break if we keep trying to use UTF8 as the charset.
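To make option 2 concrete, here is a rough sketch of app-side decoding that does not break when a conversion fails, plus the Python-side filtering it implies. The function names and the order of encodings are assumptions for illustration, not code from modoboa-amavis.

```python
def decode_address(raw, encodings=("utf-8", "latin-1")):
    """Decode raw bytes (bytes or memoryview) without ever raising."""
    data = bytes(raw)  # database drivers may hand back a memoryview for bytea columns
    for encoding in encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort when every listed encoding fails: keep what is readable
    # and substitute U+FFFD for the undecodable bytes.
    return data.decode("utf-8", errors="replace")


def filter_addresses(rows, needle):
    """Case-insensitive substring filter done in Python (hypothetical helper)."""
    needle = needle.lower()
    return [text for text in map(decode_address, rows) if needle in text.lower()]
```

For example, `filter_addresses([b"jos\xe9@example.org", b"bob@example.com"], "example.org")` returns `["josé@example.org"]`. Every row has to be fetched and decoded before it can be filtered, which is the performance cost mentioned above.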
Again thanks for your time, I hope I was a bit more clear this time.
Cheers.