intevation / intelmq-mailgen
IntelMQ command line tool to process events and send out email notifications.
Home Page: http://intevation.github.io/intelmq-mailgen/
License: Other
Outgoing emails should be signed.
Ideally the OpenPGP/MIME format should be used (no encoding problems, easier automatic parsing),
but as some old implementations can probably only handle no-MIME signatures, we start by supporting those.
There is certtools/intelmq#534 for the general OpenPGP support.
In order to facilitate tests, we should have a database sql extract (part of a dump) that we can insert
and run mailgen on.
It would be good to have a few examples of every pattern represented in the database table.
This should be easy to produce from a test run with IntelMQ inserting the events.
Email receivers that want to get emails with attachments (e.g. x-arf or csv attachments) need an OpenPGP/MIME signature to be able to verify the sender in a standards-compliant way.
A module or code can be useful to implement parts of certtools/intelmq#534 .
RFCs 2015 and 3156 define a MIME-compatible solution for OpenPGP-signed emails. The advantage is that encodings and MIME types are handled cleanly, even if the mail user agent does not know about crypto.
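To illustrate the multipart/signed layout that RFC 3156 describes, here is a minimal sketch using Python's standard email package. The function name build_pgp_mime_skeleton and the placeholder signature are hypothetical; a real implementation would compute the detached signature with GnuPG over the canonical form of the first part, which this sketch does not do.

```python
from email import encoders
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_pgp_mime_skeleton(body_text, armored_signature):
    """Wrap a text body and a detached signature as multipart/signed.

    armored_signature is assumed to be an ASCII-armored detached signature
    produced elsewhere; this function only assembles the MIME structure.
    """
    signed_part = MIMEText(body_text, "plain", "utf-8")

    signature_part = MIMEApplication(
        armored_signature, "pgp-signature", encoders.encode_7or8bit
    )
    signature_part.add_header("Content-Description", "OpenPGP digital signature")

    # The protocol and micalg parameters are required by RFC 3156.
    container = MIMEMultipart(
        "signed",
        protocol="application/pgp-signature",
        micalg="pgp-sha256",
    )
    container.attach(signed_part)
    container.attach(signature_part)
    return container
```

A mail client without OpenPGP support still sees an ordinary text part plus a small attachment, which is the compatibility advantage mentioned above.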
The documentation needs some cleanup and rework, two especially important points:
Right now, only the logging_level for intelmqmail can be set via cb.main()
and the configuration file. This is close to how intelmq does it right now.
If logging is used in production with more requirements, it may make sense to
The Debian packages should make it possible to set the intelmq Apache password
non-interactively. This is a precondition for automatic tests.
Solution idea:
Just generate a password and write it into the system wide configuration file.
So admins can look it up.
The intelmq-mailgen file contains about 20 functions right now.
In order to be able to write unit tests for it, we should probably turn it into a module.
Within the tests directory I would want to import the functions and write tests for them,
e.g. move the code into a directory "mailgen" with an __init__.py
and turn intelmq-mailgen into a file that just imports that module and runs it.
Mailgen should by default prevent formulas in the event data
to be used as injection vector, as described here:
The current implementation uses https://docs.python.org/3/library/csv.html
so a fix could be an extension to this upstream library.
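One possible mitigation, sketched with the standard csv module: prefix any cell that starts with a formula trigger character with a single quote before writing it. The helper names and the choice of trigger characters are assumptions for illustration, not mailgen's actual code.

```python
import csv
import io

# Characters that spreadsheet programs commonly treat as the start of a formula.
FORMULA_PREFIXES = ("=", "+", "-", "@")


def neutralise_formula(value):
    """Return the cell as text, prefixed with ' if it looks like a formula."""
    text = "" if value is None else str(value)
    if text.startswith(FORMULA_PREFIXES):
        return "'" + text
    return text


def write_safe_csv(rows, fieldnames):
    """Write rows (dicts) as CSV with a header line and neutralised cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow({key: neutralise_formula(row.get(key)) for key in fieldnames})
    return buffer.getvalue()
```

Note that this simple rule also prefixes legitimate values such as negative numbers, so a real fix would need to decide which columns to treat this way.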
Before issue #19 dpkg-buildpackage
would run (a part of) mailgen's
test suite via its Makefile.
After the restructuring and switch to pybuild, this is no longer the
case:
I: pybuild base:170: cd /2auto/intelmq-mailgen/.pybuild/pythonX.Y_3.4/build; python3.4 -m unittest discover -v
----------------------------------------------------------------------
Ran 0 tests in 0.000s
OK
I: pybuild base:170: cd /2auto/intelmq-mailgen/.pybuild/pythonX.Y_3.5/build; python3.5 -m unittest discover -v
----------------------------------------------------------------------
Ran 0 tests in 0.000s
OK
Tests can still be run manually with make
but we should probably
fix this regression.
The CSV-Output of mailgen has no header.
A header has to exist. In output
If the SMTP server cannot be reached, a traceback is raised at the moment. There should be a simpler error message.
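A minimal sketch of what a friendlier failure could look like, assuming the connection is opened with smtplib. The function names are hypothetical and the wording of the message is only a suggestion.

```python
import smtplib
import sys


def smtp_error_message(exc, host, port):
    """Build a one-line explanation instead of showing a traceback."""
    return "Could not reach SMTP server {}:{} ({})".format(host, port, exc)


def open_smtp_connection(host, port, timeout=10):
    """Connect to the SMTP server or exit with a short error message."""
    try:
        return smtplib.SMTP(host, port, timeout=timeout)
    except (OSError, smtplib.SMTPException) as exc:
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit(smtp_error_message(exc, host, port))
```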
Currently there is a difference between csv and CSV.
This should be fixed.
We need to decide on how to handle notifications for which mailgen cannot determine the format to use for the message. E.g. if a notification specifies a format which mailgen simply does not know about, what should mailgen do?
The example config still uses "event-processor/" as a part of its paths. This is outdated. The path should be replaced with "intelmq-mailgen"
Split out from #28
Make the ticket number unguessable.
Implementation idea:
Add a table that keeps the used ticket ids,
draw new ones randomly until you find one that has not been in use.
Within our current design size:
100,000,000 possibilities for numbers per day
and aiming for sending out 1,000,000 mails per day,
this process will need a redraw in at most 1% of cases.
So we are okay.
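The draw-until-unused idea and the 1% estimate can be sketched as follows. An in-memory set stands in for the proposed database table, and the function names are hypothetical; in PostgreSQL a unique constraint plus a retry on conflict would play the same role.

```python
import secrets

TICKET_SPACE = 100_000_000  # eight decimal digits per day, as in the text


def draw_unused_ticket(used, space=TICKET_SPACE):
    """Draw random numbers until one is found that is not in `used`."""
    while True:
        candidate = secrets.randbelow(space)
        if candidate not in used:
            used.add(candidate)
            return candidate


def worst_case_redraw_probability(mails_per_day, space=TICKET_SPACE):
    """Chance that a draw collides once all of the day's tickets are taken."""
    return mails_per_day / space
```

With 1,000,000 mails per day the worst case is 1,000,000 / 100,000,000 = 0.01, which matches the 1% figure above.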
Less attractive implementation idea:
Using a Feistel cipher as suggested on the PostgreSQL wiki.
It is harder to prove that the cipher will create no collisions
and what would be needed (in terms of the used "round" function or "key")
to make it unguessable, if the source is known.
The debugging smtpd of python3 only dumps the email to stdout.
For analysis, manual and automatic testing, it makes sense to save the emails
somewhere. Maybe on disk in Maildir format so that emails can be inspected by email
clients or other Python scripts.
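Storing a message in Maildir format needs only the standard mailbox module. This is a sketch of the storage step alone, with a hypothetical function name; how the message gets from the test SMTP server to this function is left open.

```python
import mailbox
from email.message import EmailMessage


def store_in_maildir(maildir_path, message):
    """Append a message to a Maildir (created if missing) and return its key."""
    box = mailbox.Maildir(maildir_path, create=True)
    try:
        return box.add(message)
    finally:
        box.close()


def example_message():
    """Build a small message for trying out store_in_maildir."""
    msg = EmailMessage()
    msg["Subject"] = "mailgen test"
    msg["From"] = "sender@example.org"
    msg["To"] = "receiver@example.org"
    msg.set_content("Dear Sir or Madam,\n")
    return msg
```

The resulting directory can be opened with any Maildir-capable mail client or read back with mailbox.Maildir in a test.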
I'll give it a shot.
Next steps:
Right now only one mailgen instance can run at a time
and mailgen only uses one (Python) thread.
If email sending speed becomes an issue, it may be possible to enable
more email creation processes to work in parallel.
Possibilities:
Both ideas will allow a machine with several cores to utilise them better.
If implemented, the sql interactions must be checked for race conditions.
Technically the use of SQL selections should right now prevent
a second mailgen script from running at the same time.
@bernhard-herzog
https://github.com/Intevation/intelmq-mailgen/blob/master/intelmq-mailgen#L715
has FOR UPDATE NOWAIT
Does this prevent mailgen from running twice, as you said?
# apt install intelmq-mailgen
Reading package lists... Done
Building dependency tree
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:
The following packages have unmet dependencies:
intelmq-mailgen : Depends: python3 (>= 3.6) but 3.5.3-1 is to be installed
Depends: gnupg (>= 2.2) but 2.1.18-8~deb9u4 is to be installed
Recommends: python3-pyxarf (>= 0.0.5) but it is not installable
E: Unable to correct problems, you have held broken packages.
The pyxarf package is not available in any Ubuntu repository.
It is available for xenial here: http://apt.intevation.de/dists/xenial/intelmq-testing/binary-amd64/
but that is pretty useless nowadays.
pyxarf is highly optional, so it should not be recommended either.
Also see certtools/intelmq#910
The current implementation of mailgen can only create two variations
of mails:
Looking at the code at line 599 ff., this comes as no surprise:
    if notification["format"] == "feed_specific":
        formatter = known_formatters['feed_specific'].get(('csv',
                                                           notification["feed_name"]))
        if formatter is not None:
            formatter = known_formatters['feed_specific'].get(('csv',
                                                               'DEFAULT'))
    else:
        formatter = known_formatters['generic'].get((notification["format"],
                                                     "GENERIC"))
Besides being hilarious, the "if formatter is not None:" makes no sense
at all and should be deleted. The whole
mail_format_feed_specific_as_csv should be deleted, too, as there is
already a generic fallback...
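For comparison, if the intent of the quoted code was a fallback to the DEFAULT formatter, the condition would have to test for None rather than "not None". The following sketch shows that reading; it is one interpretation only, select_formatter is a hypothetical name, and the issue itself proposes deleting the feed-specific branch instead.

```python
def select_formatter(known_formatters, notification):
    """Pick a formatter, falling back to the feed-specific DEFAULT entry.

    known_formatters is assumed to have the same shape as in the quoted
    code: a dict with 'feed_specific' and 'generic' sub-dicts keyed by tuples.
    """
    if notification["format"] == "feed_specific":
        feed_formatters = known_formatters["feed_specific"]
        formatter = feed_formatters.get(("csv", notification["feed_name"]))
        if formatter is None:  # only fall back when nothing specific was found
            formatter = feed_formatters.get(("csv", "DEFAULT"))
        return formatter
    return known_formatters["generic"].get((notification["format"], "GENERIC"))
```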
A number of feeds come as one block, e.g. once a day. (This means that their events will have the same time.observation value in intelmq.)
Each recipient wants to get one aggregated email with all notifications for this block, as fast and as complete as possible.
@bernhard-herzog has implemented a way to detect when a directive was last inserted for a specific set of aggregation values. When this is used to aggregate on time.observation and max(d.inserted_at) is 2 hours ago, we trigger sending the email.
This method has the drawback that if the first event and the last event of one load (or batch) are for this set of aggregation values, it will take a long time before another directive enters the database, so the time interval has to be quite long to reliably detect that processing is complete.
This issue is about using a better detection mechanism that can detect the completion of processing faster, thus sending emails faster on average.
Use an extra table that, for each feed.name
and time.observation,
keeps the last inserted directive time. This way the email aggregation script can use a simple additional query to see with a higher reliability that the batch has been processed fully.
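A rough sketch of such a tracking table, using an in-memory SQLite database as a stand-in for PostgreSQL. The table name batch_progress, the column names, and the "quiet period" rule for deciding that a batch is complete are all assumptions made for illustration.

```python
import sqlite3


def create_tracking_table(conn):
    """Create the per-batch table holding the last directive insertion time."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS batch_progress (
               feed_name TEXT NOT NULL,
               time_observation TEXT NOT NULL,
               last_inserted_at REAL NOT NULL,
               PRIMARY KEY (feed_name, time_observation)
           )"""
    )


def record_directive(conn, feed_name, time_observation, inserted_at):
    """Remember when the latest directive for this batch was inserted."""
    conn.execute(
        "INSERT OR REPLACE INTO batch_progress VALUES (?, ?, ?)",
        (feed_name, time_observation, inserted_at),
    )


def batch_looks_complete(conn, feed_name, time_observation, now, quiet_seconds):
    """True if the batch is known and saw no insert for quiet_seconds."""
    row = conn.execute(
        "SELECT last_inserted_at FROM batch_progress"
        " WHERE feed_name = ? AND time_observation = ?",
        (feed_name, time_observation),
    ).fetchone()
    return row is not None and now - row[0] >= quiet_seconds
```

Because the table is keyed on the whole batch rather than on one recipient's aggregation values, the quiet period can be much shorter than the two hours mentioned above.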
Necessary implementation steps (roughly):
In the rare case that very long lines are in there, quoted-printable is necessary.
For the other cases for a text/plain body it is not strictly necessary,
but we have some requests to do quoted-printable there anyway.
(Probably because of the compatibility with some email receivers.)
== Technical Analysis:
The version of python3 on Debian Jessie does not do quoted-printable on very long lines,
so enforcing could make the solution more robust.
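With the current standard library, quoted-printable can be requested explicitly when the body is set. This sketch uses email.message.EmailMessage, which may not match the API mailgen uses or what is available on older Python versions; make_text_mail is a hypothetical name.

```python
from email.message import EmailMessage


def make_text_mail(body):
    """Create a text/plain message whose body is always quoted-printable."""
    msg = EmailMessage()
    msg.set_content(
        body, subtype="plain", charset="utf-8", cte="quoted-printable"
    )
    return msg
```

Very long lines are then split with soft line breaks by the encoder, independent of whether the library would have chosen quoted-printable on its own.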
To enable testing as a regular user, reading the system configuration file should be optional,
and if there is a complete system configuration file there is no need to require an additional
per-user one.
We have seen example emails where there is a comment in the
OpenPGP no-mime signature, e.g.
"""
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Dear Sir or Madam,
.....
-----BEGIN PGP SIGNATURE-----
Comment: Key verification https://example.org/hints-about-verification
....
-----END PGP SIGNATURE-----
"""
Would be nice to document how to add a comment like this.
(Of course this could be placed in the main text template as an alternative.)
@gsiv does debian/rules
```sh
sed 's@/usr/local/lib/intelmq@/usr/lib/intelmq@' \
    debian/intelmq-mailgen/usr/share/doc/intelmq-mailgen/examples/intelmq-mailgen.conf.example \
    > debian/intelmq-mailgen/etc/intelmq/intelmq-mailgen.conf
```
leave the system configuration line okay?
(The replacement also seems unnecessary right now.)
In order to prepare intelmq-mailgen for more output formats,
e.g. for more schemas from xarf and other types such as malware for csv,
we should put each format into a file (or module) of its own.
Should be able to send out x-arf emails.
Specification available from http://www.x-arf.org,
the question is which version: v0.2 or the v0.3 draft.
TODO: List x-arf senders and receivers. Look for example emails.
cb
calls logging.basicConfig on import; it's probably better to call this in cb.main() or elsewhere.
If cb is imported by a different component it should not mess with the root logger.
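The usual pattern is to create only a named logger at import time and to configure handlers inside main(). The sketch below uses hypothetical names (setup_logging, the "intelmqmail" logger name is taken from the logging_level remark elsewhere on this page) and is not the actual cb module.

```python
import logging

# Creating a named logger has no side effects on the root logger.
log = logging.getLogger("intelmqmail")


def setup_logging(level_name="INFO"):
    """Configure output handlers; called from main(), never on import."""
    logging.basicConfig(
        format="%(asctime)s %(name)s %(levelname)s - %(message)s"
    )
    log.setLevel(getattr(logging, level_name.upper()))


def main(level_name="INFO"):
    """Entry point: only here is the logging configuration touched."""
    setup_logging(level_name)
    log.info("mailgen starting")
```

A component that merely imports the module then keeps full control over its own logging setup.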
( hint by @bernhard-herzog )
The last procedure searches for the JSON object "source_directives", but neither the CERT-bund Contact Database nor the CERT-bund Contact Rules seems to add this information to column "extra" in table "events". This is how my "extra" column looks when formatted:
{
  "features": "cmd,stat_v2,shell_v2",
  "certbund": {
    "source_contacts": {
      "organisations": [
        {
          "import_source": "",
          "name": "test1",
          "id": 0,
          "managed": "manual",
          "sector": null,
          "contacts": [
            {
              "email": "[email protected]",
              "managed": "manual",
              "email_status": "enabled",
              "annotations": []
            }
          ],
          "annotations": []
        }
      ],
      "matches": [
        {
          "organisations": [0],
          "managed": "manual",
          "field": "asn",
          "annotations": []
        }
      ]
    }
  },
  "model": "SM-G960F",
  "name": "starltexx",
  "tag": "adb",
  "device": "starlte"
}
I only have source_contacts. I added the contact "[email protected]" from the fody application, but I didn't insert all the info, only the ASN and the mail address. Could this be the problem? I doubt it. Otherwise, can you help me find where the CERT-bund expert should add the "source_directives" info to column "extra"?
Thanks in advance.
we are missing template test data as well.
When running intelmqcbmail in dry-run mode, no mails are sent and the directives in the database are left untouched. But the ticket number counter is still increased.
In my view the main file needs a header stating the authors, copyright and license.
And BTW: shouldn't the shebang line be #!/usr/bin/env python3 to be sure?
If mailgen is deployed on a different machine than intelmq, Termstyle is missing.
Commit add2e8e emits csv-inline no-mime mails as multipart/mixed;
they should be sent without MIME.
The "Date:" (aka. orig-date field) header is not only extremely useful, but also required according to RFC 5322 section 3.6. "Field Definitions".
But currently this header is missing from mails generated by mailgen. Mailgen should add this header, with the current date and time from the moment the mail was generated. For further semantics see the RFC.
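Adding the header is a one-liner with the standard library. A small sketch, with a hypothetical helper name, that leaves an already present Date header alone:

```python
from email.message import EmailMessage
from email.utils import formatdate


def add_date_header(msg):
    """Set the Date header to the current local time if it is missing."""
    if "Date" not in msg:
        # localtime=True yields the local time with its numeric UTC offset.
        msg["Date"] = formatdate(localtime=True)
    return msg
```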
There should be a way to prevent creation of notification mails for certain recipients.
A simple design would be to send no notifications if the notification interval
for the contact is set to -1.
Each email should have a unique ticket number.
It should be usable for help desks, this means:
The idea is: use a prefix for the CERT, like "CE" for cert-example,
then an ISO-like date and a unique random number,
formatted to be readable over the phone.
Example
CE-20160818-1234-5678
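A formatting helper for this layout could look like the following sketch. The function name and the assumption of an eight-digit number split into two groups of four are taken from the example above, not from mailgen's code.

```python
import datetime


def format_ticket(prefix, day, number):
    """Format a ticket as PREFIX-YYYYMMDD-NNNN-NNNN."""
    if not 0 <= number < 100_000_000:
        raise ValueError("ticket number must have at most eight digits")
    digits = "{:08d}".format(number)
    return "{}-{}-{}-{}".format(
        prefix, day.strftime("%Y%m%d"), digits[:4], digits[4:]
    )
```

For example, format_ticket("CE", datetime.date(2016, 8, 18), 12345678) returns "CE-20160818-1234-5678".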
Variants:
Implementation ideas:
Using postgresql seems to be preferable to not introduce more dependencies.
If the roundtrip to the db or to the service becomes a problem, someone could draw a couple of ids and cache them for usage.
If the group from get_pending_notifications()
https://github.com/Intevation/intelmq-mailgen/blob/master/intelmq-mailgen#L700
leads to several emails, mark_as_send must be called for each email to
make sure that each email gets its own intelmq_ticket number (for each of its notifications).
Right now it is not:
https://github.com/Intevation/intelmq-mailgen/blob/master/intelmq-mailgen#L691
GnuPG has published official Python bindings.
The currently used GnuPG python bindings (pygpgme) are being phased out from GNU/Linux distributions and are not very actively maintained.
We shall support the official bindings (and test with Ubuntu 20.04 LTS)
additionally. The official bindings are in package python3-gpg
(see https://packages.ubuntu.com/focal/python3-gpg).
Support can be dropped for pygpgme, once Ubuntu 16.04 LTS stops getting maintenance updates (April 2021) https://ubuntu.com/about/release-cycle https://wiki.ubuntu.com/Releases
pygpgme is https://launchpad.net/pygpgme (which would be the package python3-gpgme
for Ubuntu 16.04 LTS, https://packages.ubuntu.com/search?suite=default&section=all&arch=any&keywords=python3-gpgme&searchon=names).
As per customer request:
CSV output should:
I'll implement the change.