
pytroll-collectors's Introduction

This is the sandbox area for the pytroll project, an international cooperation 
on a future distributed real-time processing system for Meteorological Satellite
Data.



------------------------------------
December 2010

Lars Ørum Rasmussen, Esben Sigård Nielsen, Kristian Rune Larsen, 
Martin Raspaud, Anna Geidne, Adam Dybbroe.

Danish Meteorological Institute (DMI)
Swedish Meteorological and Hydrological Institute (SMHI)

pytroll-collectors's People

Contributors

adybbroe, dependabot[bot], djhoese, gerritholl, hundahl, lahtinep, mraspaud, paulovcmedeiros, pnuu, stickler-ci, talonglong, tecnavia-dev, vgiuffrida


pytroll-collectors's Issues

Add S3 configuration to documentation

The configuration of the S3 connection should be added to the documentation. This requires #110, #114 and #117 to be merged first.

For the segment gatherer, and for checking which segments exist after startup and the first message, this might be adaptable:

    All the connection configurations and such are done using the `fsspec` configuration system:

    https://filesystem-spec.readthedocs.io/en/latest/features.html#configuration

    An example configuration could, for instance, be placed in `~/.config/fsspec/s3.json`:

        {
            "s3": {
                "client_kwargs": {"endpoint_url": "https://s3.server.foo.com"},
                "secret": "VERYBIGSECRET",
                "key": "ACCESSKEY"
            }
        }

Depending on how #114 is finalized, this might need some adjustment.

Originally posted by @pnuu in #117 (comment)
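
As a quick sanity check of such a setup, the following sketch shows how fsspec would pick that configuration up automatically; the bucket name below is a placeholder and not part of any real configuration:

import fsspec

# With ~/.config/fsspec/s3.json in place, no credentials or endpoint need to be
# passed here; fsspec merges the configured "s3" defaults into the filesystem.
fs = fsspec.filesystem("s3")

# Hypothetical bucket name, only to show that listing uses the configured endpoint.
print(fs.ls("my-segment-bucket"))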

segment_gatherer fails with yaml ParserError when passed ini-config file from examples directory

The examples/ directory contains some configuration files in .ini and some in .yaml. I used the trollstalker_config.ini_template to configure a trollstalker.ini, and similarly for trollstalker_logging.ini. This is handled well by trollstalker -c trollstalker.ini. I then used segment_gatherer.ini_template as a template to configure my segment_gatherer.ini, but when I then ran segment_gatherer -c segment_gatherer.ini it failed with an exception — apparently segment_gatherer wants yaml files now.

To reproduce

  • Create a segment_gatherer.ini file
  • Run segment_gatherer -c segment_gatherer.ini

Expected result

Since I based this on an example in current git master, I expected it to work with current git master.

Actual result

Traceback (most recent call last):
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/bin/segment_gatherer.py", line 95, in <module>
    main()
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/bin/segment_gatherer.py", line 57, in main
    config = read_yaml(args.config)
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pytroll_collectors/helper_functions.py", line 209, in read_yaml
    data = yaml.load(fid, Loader=yaml.SafeLoader)
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/yaml/__init__.py", line 114, in load
    return loader.get_single_data()
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/yaml/constructor.py", line 49, in get_single_data
    node = self.get_single_node()
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/yaml/composer.py", line 39, in get_single_node
    if not self.check_event(StreamEndEvent):
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/yaml/parser.py", line 98, in check_event
    self.current_event = self.state()
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/yaml/parser.py", line 171, in parse_document_start
    raise ParserError(None, None,
yaml.parser.ParserError: expected '<document start>', but found '<scalar>'
  in "segment_gatherer.ini", line 2, column 1

Desired solution

The examples/ directory should only contain configuration file formats that are current. Any older formats should be converted to the equivalent in the current formats and then the old files should be deleted.

Further context

It appears that zipcollector_config also contains both ini and yaml examples.

Add "multi-scene" collecting and publishing

For the creation of multi-temporal datasets, data need to be collected and published for multiple time slots.

As an example, pytroll/satpy#2488 needs three distinct datasets:

  • files for a dataset at T-2
  • files for a dataset at T-1
  • files for the latest available dataset

The time shift between the datasets can be anything, for example 15/30/60 minutes. It can even be irregular, for instance when used for polar satellite data or when emphasis is needed in one direction or the other.

There are other envisioned needs for this kind of collection/publishing, so the feature needs to be kept as flexible as possible.

Messages

Currently we have the following message types for publishing data:

  • file: plain json without nested lists nor dictionaries, everything at the "top level" of the message
    • used for individual files
  • dataset: combined metadata (start/end times, platform, and such) at the top level, and a list named dataset of dictionaries having URI and UID of individual files
    • used for geostationary segments
  • collection: same as above, but there is a list named collection with dictionaries of individual start/end times and datasets
    • used for multi-segment multi-time data, such as granulated VIIRS SDR swaths

The collection message type could be used for the collection of multi-temporal data described here, but how should it be distinguished from the existing usage? Should there be a new message type like library (file -> dataset -> collection -> library 😜), or something that has a list named library with collections (with datasets inside)?
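
Purely as an illustration of that last option (everything below, including the type name library, is made up for this issue and not an existing message format), the payload would nest one level deeper than a collection message:

# Hypothetical "library" payload: combined metadata at the top level and a list
# named "library" whose items each look like the body of a collection message.
library_message = {
    "platform_name": "Meteosat-12",   # example values only
    "sensor": ["fci"],
    "library": [
        {
            "start_time": "2023-06-01T10:00:00",
            "end_time": "2023-06-01T10:10:00",
            "collection": [
                {
                    "start_time": "2023-06-01T10:00:00",
                    "end_time": "2023-06-01T10:10:00",
                    "dataset": [{"uri": "/path/to/segment.nc", "uid": "segment.nc"}],
                },
            ],
        },
        # ... one more entry per published time slot (T-1, T-2, ...)
    ],
}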

Configuration

This is a first, crude idea of how to configure which data are published together. The publishing would be triggered after each data collection has terminated.

published_slots:
  - {min_age: 0, max_age: 0}
  - {min_age: 60, max_age: 65}
  - {min_age: 120, max_age: 125}

The min/max ages are relative to the start time of the currently completed collection. Having just the 0/0 combination would equal the current behaviour of publishing the latest completed set. It also needs to be decided what happens if not all the criteria are met (just after a restart, for example, we might not have the earlier slots collected).
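
A rough sketch of the selection logic this would imply, assuming completed slots are kept in a dict keyed by their start time string and the ages are given in minutes (the function name and the key format are assumptions for illustration):

import datetime as dt

def select_slots_to_publish(completed_slots, current_start, published_slots):
    """Return the completed slots whose age relative to the just-completed
    collection falls inside one of the configured min_age/max_age windows."""
    selected = []
    for key, slot in completed_slots.items():
        # Assumed key format; the real keys depend on the gatherer config.
        slot_start = dt.datetime.strptime(key, "%Y%m%d%H%M")
        age = (current_start - slot_start).total_seconds() / 60
        for window in published_slots:
            if window["min_age"] <= age <= window["max_age"]:
                selected.append(slot)
                break
    return selected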

Internals

Currently the completed slots are deleted. We need to add a new check that looks at the published_slots config (and timeliness?) to determine which slots are not needed anymore. As the keys in the self.slots dictionary are the nominal or start time (possibly rounded, depending on config) of the slot as a string, the comparison is quite easy.
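
And a corresponding sketch of that cleanup check, again with made-up names and ages in minutes:

import datetime as dt

def prune_stale_slots(slots, now, published_slots, timeliness):
    """Delete slots that are too old to ever fall into a publishing window."""
    max_age = max(window["max_age"] for window in published_slots)
    cutoff = now - dt.timedelta(minutes=max_age + timeliness)
    for key in list(slots):
        # Assumed key format, see the selection sketch above.
        if dt.datetime.strptime(key, "%Y%m%d%H%M") < cutoff:
            del slots[key]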

Add option to adjust start and/or end time after pass calculation is made in gatherer

I'm thinking of how I can add an option to adjust the start and/or end time of the gatherer.

When a file is received, the gatherer predicts the possible files for the area it has been given.

For EARS data this is mostly OK, but for MERSI2 data the coverage of the EARS area is (for the moment) poor due to low prioritisation and overlap with Suomi-NPP passes.

The EARS MERSI pass prediction from EUMETSAT can be seen (or downloaded) here:
https://uns.eumetsat.int/ (see "EARS pass prediction" and scroll down to the files with "mersi" in the name),

or e.g.:

https://uns.eumetsat.int/downloads/ears/ears_mersi_pass_prediction_19-11-29.txt

An example of the passes can be seen here.

I think something similar is done with AQUA/TERRA for antenna scheduling, where the DB is scheduled to be off.

So my suggestion is to let the gatherer do the pass calculation as it is done today, then download the EUM pass prediction file (configured somehow in the config file), and then adjust the start and/or end time of the calculated pass so that the gatherer does not wait for segments that will never arrive.
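
A rough sketch of just the adjustment step, assuming the EUM prediction has already been downloaded and parsed into start/end datetimes (the parsing itself is left out and the function name is made up):

def clamp_pass_to_prediction(pass_start, pass_end, eum_start, eum_end):
    """Shrink the calculated pass to the interval EUMETSAT actually plans to
    disseminate, so the gatherer does not wait for segments that never arrive."""
    new_start = max(pass_start, eum_start)
    new_end = min(pass_end, eum_end)
    if new_start >= new_end:
        return None  # no overlap at all: nothing will arrive for this pass
    return new_start, new_end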

segment-gatherer does not respect critical_segments in config

As far as I can see, the segment gatherer does not respect critical_segments in the config when using an ini config:

[metopc-ears-avhrr-pps-hrw]
pattern = S_NWC_{segment}_metopc_{orbit_number:05d}_{start_time:%Y%m%dT%H%M%S}{start_frac:1s}Z_{end_time_}Z.nc
critical_files = :CT,:CMA,:CTTH,:avhrr
wanted_files = :CT,:CMA,:CTTH,:avhrr
all_files = :CT,:CMA,:CTTH,:avhrr
topics = /tpcs
publish_topic = /PT
timeliness = 1200
time_name = start_time

Example from log file:

[WARNING: 2021-12-10 12:42:18 : segment_gatherer] Missing files: S_NWC_avhrr_metopc_00000_20211210T1133000Z_20211210T1133413Z.nc
[INFO: 2021-12-10 12:42:18 : segment_gatherer] Sending: pytroll://PPSV2021/EARS/GATHERED/AVHRR/PPS/HRW dataset ubuntu@pps-v2021-a 2021-12-10T12:42:18.808006 v1.01 application/json {"orbit_number": 0, "start_time": "2021-12-10T11:33:00", "start_frac": "0", "end_time_": "20211210T1133413", "platform_name_number": 3, "process_level": "l1b", "antenna": "ears", "platform_name": "Metop-C", "end_time": "2021-12-10T11:34:00", "disposition_mode": "O", "process_time": "2021-12-10T11:34:31", "origin": "157.249.97.83:9152", "format": "CF", "type": "netCDF4", "data_processing_level": "2", "station": "oslo", "env": "ears", "orig_orbit_number": 16046, "dataset": [{"uri": "/data/pytroll/nwcsaf/export/S_NWC_CMA_metopc_00000_20211210T1133000Z_20211210T1133413Z.nc", "uid": "S_NWC_CMA_metopc_00000_20211210T1133000Z_20211210T1133413Z.nc"}, {"uri": "/data/pytroll/nwcsaf/export/S_NWC_CT_metopc_00000_20211210T1133000Z_20211210T1133413Z.nc", "uid": "S_NWC_CT_metopc_00000_20211210T1133000Z_20211210T1133413Z.nc"}, {"uri": "/data/pytroll/nwcsaf/export/S_NWC_CTTH_metopc_00000_20211210T1133000Z_20211210T1133413Z.nc", "uid": "S_NWC_CTTH_metopc_00000_20211210T1133000Z_20211210T1133413Z.nc"}], "sensor": ["avhrr/3"]}

So I get this message when the timeout is reached. But I'm not sure what the actual meaning of critical_segments is. Should this never be sent?

python                    3.9.7           hb7a2778_3_cpython    conda-forge
pytroll-collectors        0.11.1                   pypi_0    pypi

Cleanup unused versioning configurations

There seem to be configuration files for three different versioning systems in the repository now. Only the one that is actually used should remain.

And the CHANGELOG.rst should be renamed to something like CHANGELOG_until_0.8.4.rst.

gatherer raises KeyError in terminator when default config section lacks publish_topic and metadata lacks format

When for some reason in gatherer.py there is a call to terminator() with publish_topic=None, and the metadata (containing a message sent via posttroll?) does not contain the format key, gatherer.py raises a KeyError: 'format'.

MCVE

The true minimum would be a Python script calling terminator() with certain arguments, but I think that's not the most useful way to reproduce it, because the bug may be upstream (why does terminator() get called this way). My setup:

  • nameserver is running
  • trollstalker is running with trollstalker -c trollstalker.ini -C avhrrl0 -n localhost — contents of trollstalker.ini:
[avhrrl0]
topic=/file/poes/avhrr
directory=/data/pytroll/IN/HRPT/
posttroll_port=0
publish_port=
event_names=IN_CLOSE_WRITE,IN_MOVED_TO
loglevel=DEBUG
stalker_log_config=/opt/pytroll/pytroll_inst/config/trollstalker_logging.ini
filepattern={path}AVHR_HRP_00_{platform_name}_{start_time:%Y%m%d%H%M%S}Z_{end_time:%Y%m%d%H%M%S}Z_N_O_{processing_time:%Y%m%d%H%M%S}Z
instruments=avhrr/3
alias_platform_name=M01:Metop-B|M02:Metop-A|M03:Metop-C|noaa18:NOAA-18|noaa19:NOAA-19
history=0
nameservers=localhost
  • contents of gatherer.ini:
[default]
regions = eurol
timeliness = 180
service =
topics = /file/poes/avhrr

[metopa-avhrr-hrpt]
regions = eurol
service =
pattern = /data/pytroll/IN/HRPT/AVHR_HRP_00_M02_{start_time:%Y%m%d%H%M%S}Z_{end_time:%Y%m%d%H%M%S}Z_N_O_{processing_time:%Y%m%d%H%M%S}Z
sensor = avhrr
topics = /file/poes/avhrr
publish_topic = /collection/metopa/avhrr
timeliness = 180
duration = 300
platform_name = Metop-A
format = HRPT
type = binary
variant = DR
level = 0
orbit_type = polar
nameserver = localhost

Running gatherer.py -v gatherer.ini. In another terminal, move some matching files to the directory watched by trollstalker (mv AVHR_HRP_00_M02_20210114080* /tmp/foo/ && mv /tmp/foo/AVHR_HRP_00* .).

Expected output

I don't understand gatherer.py well enough yet to know what to expect, but I suspect it should publish a message and continue running.

Actual output

Output from gatherer.py (I added a print(metadata) in the terminator function for debugging purposes):

Output from `gatherer.py -v gatherer.ini`
[DEBUG: 2021-01-14 16:37:41 : gatherer] Using posttroll for default
[DEBUG: 2021-01-14 16:37:41 : pytroll_collectors.trigger] Nameserver: localhost
[DEBUG: 2021-01-14 16:37:41 : gatherer] Using posttroll for metopa-avhrr-hrpt
[DEBUG: 2021-01-14 16:37:41 : pytroll_collectors.trigger] Nameserver: localhost
[INFO: 2021-01-14 16:37:41 : posttroll.publisher] publisher started on port 33687
[INFO: 2021-01-14 16:37:41 : posttroll.subscriber] Subscriber adding SUB hook tcp://localhost:16543 for topics ['pytroll://address']
[INFO: 2021-01-14 16:37:41 : posttroll.subscriber] Subscriber adding address tcp://141.38.37.143:16543 with topics ['pytroll://file/poes/avhrr']
[INFO: 2021-01-14 16:37:41 : posttroll.subscriber] Subscriber adding address tcp://141.38.37.143:32883 with topics ['pytroll://file/poes/avhrr']
[INFO: 2021-01-14 16:37:41 : posttroll.subscriber] Subscriber adding SUB hook tcp://localhost:16543 for topics ['pytroll://address']
[INFO: 2021-01-14 16:37:41 : posttroll.subscriber] Subscriber adding address tcp://141.38.37.143:16543 with topics ['pytroll://file/poes/avhrr']
[INFO: 2021-01-14 16:37:41 : posttroll.subscriber] Subscriber adding address tcp://141.38.37.143:32883 with topics ['pytroll://file/poes/avhrr']
[DEBUG: 2021-01-14 16:37:47 : pytroll_collectors.trigger] mda: {'path': '', 'platform_name': 'Metop-A', 'start_time': datetime.datetime(2021, 1, 14, 8, 0, 12), 'end_time': datetime.datetime(2021, 1, 14, 8, 3, 12), 'processing_time': datetime.datetime(2021, 1, 14, 8, 0, 12), 'uri': '/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z', 'uid': 'AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z', 'sensor': ['avhrr/3'], 'orig_platform_name': 'M02'}
[DEBUG: 2021-01-14 16:37:47 : pytroll_collectors.region_collector] Adding area ID to metadata: eurol
[INFO: 2021-01-14 16:37:47 : pytroll_collectors.region_collector] Platform name Metop-A and sensor ['avhrr/3']: Start and end times = 20210114 08:00:12 20210114 08:03:12
[DEBUG: 2021-01-14 16:37:47 : pytroll_collectors.trigger] mda: {'path': '', 'platform_name': 'Metop-A', 'start_time': datetime.datetime(2021, 1, 14, 8, 0, 12), 'end_time': datetime.datetime(2021, 1, 14, 8, 3, 12), 'processing_time': datetime.datetime(2021, 1, 14, 8, 0, 12), 'uri': '/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z', 'uid': 'AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z', 'sensor': ['avhrr/3'], 'orig_platform_name': 'M02'}
[DEBUG: 2021-01-14 16:37:47 : pytroll_collectors.region_collector] Adding area ID to metadata: eurol
[DEBUG: 2021-01-14 16:37:47 : pytroll_collectors.region_collector] Estimated granule duration to 0:03:00
[INFO: 2021-01-14 16:37:47 : pytroll_collectors.region_collector] Platform name Metop-A and sensor ['avhrr/3']: Start and end times = 20210114 08:00:12 20210114 08:03:12
[DEBUG: 2021-01-14 16:37:47 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:37:47 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:37:47 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:37:47 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:37:47 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:37:47 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:37:48 : trollsched.boundary] Instrument = avhrr/3
[INFO: 2021-01-14 16:37:49 : pytroll_collectors.region_collector] Added Metop-A (2021-01-14 08:00:12) granule to area eurol
[DEBUG: 2021-01-14 16:37:49 : pytroll_collectors.region_collector] Predicting granules covering eurol
[DEBUG: 2021-01-14 16:37:49 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:37:49 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:37:49 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:37:50 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:37:51 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:37:51 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:37:51 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:37:52 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:37:53 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:37:53 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:37:53 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:37:53 : trollsched.boundary] Instrument = avhrr/3
[INFO: 2021-01-14 16:37:53 : pytroll_collectors.region_collector] Added Metop-A (2021-01-14 08:00:12) granule to area eurol
[DEBUG: 2021-01-14 16:37:53 : pytroll_collectors.region_collector] Predicting granules covering eurol
[DEBUG: 2021-01-14 16:37:53 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:37:53 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:37:53 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:37:54 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:37:54 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:37:54 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:37:54 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:37:59 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:37:59 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:37:59 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:37:59 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:01 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:38:01 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:01 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:01 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:02 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:38:02 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:02 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:02 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:06 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:38:07 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:07 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:07 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:07 : trollsched.boundary] Instrument = avhrr/3
[INFO: 2021-01-14 16:38:08 : pytroll_collectors.region_collector] Planned granules for Euro 3.0km area - Europe: [datetime.datetime(2021, 1, 14, 7, 50, 12), datetime.datetime(2021, 1, 14, 7, 55, 12), datetime.datetime(2021, 1, 14, 8, 0, 12), datetime.datetime(2021, 1, 14, 8, 5, 12)]
[INFO: 2021-01-14 16:38:08 : pytroll_collectors.region_collector] Planned timeout for Euro 3.0km area - Europe: 2021-01-14T11:10:12
[WARNING: 2021-01-14 16:38:08 : pytroll_collectors.trigger] Timeout detected, terminating collector
[DEBUG: 2021-01-14 16:38:08 : pytroll_collectors.trigger] mda: {'path': '', 'platform_name': 'Metop-A', 'start_time': datetime.datetime(2021, 1, 14, 8, 3, 12), 'end_time': datetime.datetime(2021, 1, 14, 8, 6, 12), 'processing_time': datetime.datetime(2021, 1, 14, 8, 3, 12), 'uri': '/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080312Z_20210114080612Z_N_O_20210114080312Z', 'uid': 'AVHR_HRP_00_M02_20210114080312Z_20210114080612Z_N_O_20210114080312Z', 'sensor': ['avhrr/3'], 'orig_platform_name': 'M02'}
/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pyproj/crs/crs.py:543: UserWarning: You will likely lose important projection information when converting to a PROJ string from another format. See: https://proj.org/faq.html#what-is-the-best-format-for-describing-coordinate-reference-systems
  proj_string = self.to_proj4()
[DEBUG: 2021-01-14 16:38:08 : pytroll_collectors.trigger] Area: Area ID: eurol
Description: Euro 3.0km area - Europe
Projection: {'ellps': 'WGS84', 'lat_0': '90', 'lat_ts': '60', 'lon_0': '0', 'no_defs': 'None', 'proj': 'stere', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 2560
Number of rows: 2048
Area extent: (-3780000.0, -7644000.0, 3900000.0, -1500000.0), timeout: 2021-01-14 11:10:12
[DEBUG: 2021-01-14 16:38:08 : pytroll_collectors.region_collector] Adding area ID to metadata: eurol
[INFO: 2021-01-14 16:38:08 : gatherer] Composing topic.
[INFO: 2021-01-14 16:38:08 : pytroll_collectors.region_collector] Platform name Metop-A and sensor ['avhrr/3']: Start and end times = 20210114 08:03:12 20210114 08:06:12
[DEBUG: 2021-01-14 16:38:08 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:08 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:08 : pyorbital.tlefile] Fetch TLE from the internet.
[INFO: 2021-01-14 16:38:08 : gatherer] sending pytroll://collection/metopa/avhrr collection [email protected] 2021-01-14T16:38:08.450784 v1.01 application/json {"path": "", "platform_name": "Metop-A", "start_time": "2021-01-14T08:00:12", "end_time": "2021-01-14T08:03:12", "processing_time": "2021-01-14T08:00:12", "sensor": ["avhrr/3"], "orig_platform_name": "M02", "collection_area_id": "eurol", "collection": [{"start_time": "2021-01-14T08:00:12", "end_time": "2021-01-14T08:03:12", "uri": "/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z", "uid": "AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z"}]}
[DEBUG: 2021-01-14 16:38:08 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:38:09 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:09 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:09 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:12 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:38:13 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:13 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:13 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:13 : trollsched.boundary] Instrument = avhrr/3
[INFO: 2021-01-14 16:38:14 : pytroll_collectors.region_collector] Added Metop-A (2021-01-14 08:03:12) granule to area eurol
[DEBUG: 2021-01-14 16:38:14 : pytroll_collectors.region_collector] Predicting granules covering eurol
[DEBUG: 2021-01-14 16:38:14 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:14 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:14 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:15 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:38:15 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:15 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:15 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:18 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:38:18 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:18 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:18 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:20 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:38:21 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:21 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:21 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:21 : trollsched.boundary] Instrument = avhrr/3
[INFO: 2021-01-14 16:38:22 : pytroll_collectors.region_collector] Planned granules for Euro 3.0km area - Europe: [datetime.datetime(2021, 1, 14, 7, 51, 12), datetime.datetime(2021, 1, 14, 7, 54, 12), datetime.datetime(2021, 1, 14, 7, 57, 12), datetime.datetime(2021, 1, 14, 8, 0, 12), datetime.datetime(2021, 1, 14, 8, 3, 12), datetime.datetime(2021, 1, 14, 8, 6, 12)]
[INFO: 2021-01-14 16:38:22 : pytroll_collectors.region_collector] Planned timeout for Euro 3.0km area - Europe: 2021-01-14T11:09:12
[DEBUG: 2021-01-14 16:38:22 : pytroll_collectors.trigger] mda: {'path': '', 'platform_name': 'Metop-A', 'start_time': datetime.datetime(2021, 1, 14, 8, 3, 12), 'end_time': datetime.datetime(2021, 1, 14, 8, 6, 12), 'processing_time': datetime.datetime(2021, 1, 14, 8, 3, 12), 'uri': '/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080312Z_20210114080612Z_N_O_20210114080312Z', 'uid': 'AVHR_HRP_00_M02_20210114080312Z_20210114080612Z_N_O_20210114080312Z', 'sensor': ['avhrr/3'], 'orig_platform_name': 'M02'}
[DEBUG: 2021-01-14 16:38:22 : pytroll_collectors.region_collector] Adding area ID to metadata: eurol
[INFO: 2021-01-14 16:38:22 : pytroll_collectors.region_collector] Added Metop-A (2021-01-14 08:03:12) granule to area eurol
[DEBUG: 2021-01-14 16:38:22 : pytroll_collectors.trigger] mda: {'path': '', 'platform_name': 'Metop-A', 'start_time': datetime.datetime(2021, 1, 14, 8, 6, 12), 'end_time': datetime.datetime(2021, 1, 14, 8, 6, 41), 'processing_time': datetime.datetime(2021, 1, 14, 8, 6, 12), 'uri': '/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080612Z_20210114080641Z_N_O_20210114080612Z', 'uid': 'AVHR_HRP_00_M02_20210114080612Z_20210114080641Z_N_O_20210114080612Z', 'sensor': ['avhrr/3'], 'orig_platform_name': 'M02'}
[DEBUG: 2021-01-14 16:38:22 : pytroll_collectors.region_collector] Adding area ID to metadata: eurol
[INFO: 2021-01-14 16:38:22 : pytroll_collectors.region_collector] Added Metop-A (2021-01-14 08:06:12) granule to area eurol
[INFO: 2021-01-14 16:38:22 : pytroll_collectors.region_collector] Adjusted timeout: 2021-01-14T11:00:12
[WARNING: 2021-01-14 16:38:22 : pytroll_collectors.trigger] Timeout detected, terminating collector
/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pyproj/crs/crs.py:543: UserWarning: You will likely lose important projection information when converting to a PROJ string from another format. See: https://proj.org/faq.html#what-is-the-best-format-for-describing-coordinate-reference-systems
  proj_string = self.to_proj4()
[DEBUG: 2021-01-14 16:38:22 : pytroll_collectors.trigger] Area: Area ID: eurol
Description: Euro 3.0km area - Europe
Projection: {'ellps': 'WGS84', 'lat_0': '90', 'lat_ts': '60', 'lon_0': '0', 'no_defs': 'None', 'proj': 'stere', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 2560
Number of rows: 2048
Area extent: (-3780000.0, -7644000.0, 3900000.0, -1500000.0), timeout: 2021-01-14 11:00:12
[INFO: 2021-01-14 16:38:22 : gatherer] Using default topic.
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pytroll_collectors/trigger.py", line 141, in run
    self.terminator(next_timeout[0].finish(),
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/bin/gatherer.py", line 90, in terminator
    subject = "/".join(("", mda["format"], mda["data_processing_level"],
KeyError: 'format'
[CRITICAL: 2021-01-14 16:38:22 : gatherer] Something went wrong!
[WARNING: 2021-01-14 16:38:22 : gatherer] Ending publication the gathering of granules...
Setting timezone to UTC
[{'path': '', 'platform_name': 'Metop-A', 'start_time': datetime.datetime(2021, 1, 14, 8, 0, 12), 'end_time': datetime.datetime(2021, 1, 14, 8, 3, 12), 'processing_time': datetime.datetime(2021, 1, 14, 8, 0, 12), 'uri': '/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z', 'uid': 'AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z', 'sensor': ['avhrr/3'], 'orig_platform_name': 'M02', 'collection_area_id': 'eurol'}] /collection/metopa/avhrr
[{'path': '', 'platform_name': 'Metop-A', 'start_time': datetime.datetime(2021, 1, 14, 8, 0, 12), 'end_time': datetime.datetime(2021, 1, 14, 8, 3, 12), 'processing_time': datetime.datetime(2021, 1, 14, 8, 0, 12), 'uri': '/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z', 'uid': 'AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z', 'sensor': ['avhrr/3'], 'orig_platform_name': 'M02', 'collection_area_id': 'eurol'}, {'path': '', 'platform_name': 'Metop-A', 'start_time': datetime.datetime(2021, 1, 14, 8, 3, 12), 'end_time': datetime.datetime(2021, 1, 14, 8, 6, 12), 'processing_time': datetime.datetime(2021, 1, 14, 8, 3, 12), 'uri': '/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080312Z_20210114080612Z_N_O_20210114080312Z', 'uid': 'AVHR_HRP_00_M02_20210114080312Z_20210114080612Z_N_O_20210114080312Z', 'sensor': ['avhrr/3'], 'orig_platform_name': 'M02', 'collection_area_id': 'eurol'}, {'path': '', 'platform_name': 'Metop-A', 'start_time': datetime.datetime(2021, 1, 14, 8, 6, 12), 'end_time': datetime.datetime(2021, 1, 14, 8, 6, 41), 'processing_time': datetime.datetime(2021, 1, 14, 8, 6, 12), 'uri': '/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080612Z_20210114080641Z_N_O_20210114080612Z', 'uid': 'AVHR_HRP_00_M02_20210114080612Z_20210114080641Z_N_O_20210114080612Z', 'sensor': ['avhrr/3'], 'orig_platform_name': 'M02', 'collection_area_id': 'eurol'}] None
[DEBUG: 2021-01-14 16:38:25 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:38:26 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:26 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:26 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:28 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:38:29 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:29 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:29 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:34 : trollsched.boundary] Instrument = avhrr/3
[INFO: 2021-01-14 16:38:35 : pytroll_collectors.region_collector] Planned granules for Euro 3.0km area - Europe: [datetime.datetime(2021, 1, 14, 7, 48, 12), datetime.datetime(2021, 1, 14, 7, 53, 12), datetime.datetime(2021, 1, 14, 7, 58, 12), datetime.datetime(2021, 1, 14, 8, 3, 12)]
[INFO: 2021-01-14 16:38:35 : pytroll_collectors.region_collector] Planned timeout for Euro 3.0km area - Europe: 2021-01-14T11:08:12
[INFO: 2021-01-14 16:38:35 : pytroll_collectors.region_collector] Adjusted timeout: 2021-01-14T11:03:12

At this point, gatherer.py quits. I'm not sure if it quits normally; it quits with exit code 0. The logging output from trollstalker in the meantime:

Logging output from trollstalker
[DEBUG: 2021-01-14 16:37:47,858: trollstalker] trigger: IN_CLOSE_WRITE
[DEBUG: 2021-01-14 16:37:47,858: trollstalker] processing /data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z
[DEBUG: 2021-01-14 16:37:47,858: trollstalker] filter: {path}AVHR_HRP_00_{platform_name}_{start_time:%Y%m%d%H%M%S}Z_{end_time:%Y%m%d%H%M%S}Z_N_O_{processing_time:%Y%m%d%H%M%S}Z         event: /data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z
[DEBUG: 2021-01-14 16:37:47,858: trollstalker] No origin_inotify_base_dir_skip_levels in self.custom_vars
[DEBUG: 2021-01-14 16:37:47,859: trollstalker] Extracted: OrderedDict([('path', ''), ('platform_name', 'M02'), ('start_time', datetime.datetime(2021, 1, 14, 8, 0, 12)), ('end_time', datetime.datetime(2021, 1, 14, 8, 3, 12)), ('processing_time', datetime.datetime(2021, 1, 14, 8, 0, 12))])
[DEBUG: 2021-01-14 16:37:47,859: trollstalker] self.info['sensor']: ['avhrr/3']
[INFO: 2021-01-14 16:37:47,860: trollstalker] Publishing message pytroll://file/poes/avhrr file [email protected] 2021-01-14T16:37:47.859793 v1.01 application/json {"path": "", "platform_name": "Metop-A", "start_time": "2021-01-14T08:00:12", "end_time": "2021-01-14T08:03:12", "processing_time": "2021-01-14T08:00:12", "uri": "/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z", "uid": "AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z", "sensor": ["avhrr/3"], "orig_platform_name": "M02"}
[DEBUG: 2021-01-14 16:37:47,943: trollstalker] trigger: IN_CLOSE_WRITE
[DEBUG: 2021-01-14 16:37:47,943: trollstalker] processing /data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080312Z_20210114080612Z_N_O_20210114080312Z
[DEBUG: 2021-01-14 16:37:47,943: trollstalker] filter: {path}AVHR_HRP_00_{platform_name}_{start_time:%Y%m%d%H%M%S}Z_{end_time:%Y%m%d%H%M%S}Z_N_O_{processing_time:%Y%m%d%H%M%S}Z         event: /data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080312Z_20210114080612Z_N_O_20210114080312Z
[DEBUG: 2021-01-14 16:37:47,944: trollstalker] No origin_inotify_base_dir_skip_levels in self.custom_vars
[DEBUG: 2021-01-14 16:37:47,944: trollstalker] Extracted: OrderedDict([('path', ''), ('platform_name', 'M02'), ('start_time', datetime.datetime(2021, 1, 14, 8, 3, 12)), ('end_time', datetime.datetime(2021, 1, 14, 8, 6, 12)), ('processing_time', datetime.datetime(2021, 1, 14, 8, 3, 12))])
[DEBUG: 2021-01-14 16:37:47,944: trollstalker] self.info['sensor']: ['avhrr/3']
[INFO: 2021-01-14 16:37:47,945: trollstalker] Publishing message pytroll://file/poes/avhrr file [email protected] 2021-01-14T16:37:47.945067 v1.01 application/json {"path": "", "platform_name": "Metop-A", "start_time": "2021-01-14T08:03:12", "end_time": "2021-01-14T08:06:12", "processing_time": "2021-01-14T08:03:12", "uri": "/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080312Z_20210114080612Z_N_O_20210114080312Z", "uid": "AVHR_HRP_00_M02_20210114080312Z_20210114080612Z_N_O_20210114080312Z", "sensor": ["avhrr/3"], "orig_platform_name": "M02"}
[DEBUG: 2021-01-14 16:37:47,965: trollstalker] trigger: IN_CLOSE_WRITE
[DEBUG: 2021-01-14 16:37:47,965: trollstalker] processing /data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080612Z_20210114080641Z_N_O_20210114080612Z
[DEBUG: 2021-01-14 16:37:47,965: trollstalker] filter: {path}AVHR_HRP_00_{platform_name}_{start_time:%Y%m%d%H%M%S}Z_{end_time:%Y%m%d%H%M%S}Z_N_O_{processing_time:%Y%m%d%H%M%S}Z         event: /data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080612Z_20210114080641Z_N_O_20210114080612Z
[DEBUG: 2021-01-14 16:37:47,965: trollstalker] No origin_inotify_base_dir_skip_levels in self.custom_vars
[DEBUG: 2021-01-14 16:37:47,966: trollstalker] Extracted: OrderedDict([('path', ''), ('platform_name', 'M02'), ('start_time', datetime.datetime(2021, 1, 14, 8, 6, 12)), ('end_time', datetime.datetime(2021, 1, 14, 8, 6, 41)), ('processing_time', datetime.datetime(2021, 1, 14, 8, 6, 12))])
[DEBUG: 2021-01-14 16:37:47,966: trollstalker] self.info['sensor']: ['avhrr/3']
[INFO: 2021-01-14 16:37:47,966: trollstalker] Publishing message pytroll://file/poes/avhrr file [email protected] 2021-01-14T16:37:47.966510 v1.01 application/json {"path": "", "platform_name": "Metop-A", "start_time": "2021-01-14T08:06:12", "end_time": "2021-01-14T08:06:41", "processing_time": "2021-01-14T08:06:12", "uri": "/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080612Z_20210114080641Z_N_O_20210114080612Z", "uid": "AVHR_HRP_00_M02_20210114080612Z_20210114080641Z_N_O_20210114080612Z", "sensor": ["avhrr/3"], "orig_platform_name": "M02"}

Additional diagnosis

When I add to the [default] section of gatherer.ini the line publish_topic = /collection/metopa/avhrr (duplicating the line under [metopa-avhrr-hrpt]), I don't get this exception, and gatherer.py does not appear to quit at all.
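
For what it's worth, a minimal sketch of how the default-topic composition in terminator() could be made tolerant of missing metadata keys; the key names come from the traceback above, the fallback values are made up, and this is not the actual code in gatherer.py:

def compose_default_subject(mda):
    """Compose the default publish topic without raising KeyError when
    "format" or "data_processing_level" are missing from the metadata."""
    return "/".join(("",
                     mda.get("format", "unknown"),
                     mda.get("data_processing_level", "unknown")))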

Go through Gatherer and Segment Gatherer log messages

I'll be needing better defined log messages at proper levels when starting to feed the logs to Graylog. At least the messages by gatherer.py and segment_gatherer.py need to be checked and adjusted as necessary.

gatherer does not complain when configuration file does not exist

Summary

When the configuration file passed to gatherer does not exist (or otherwise cannot be read), there is neither a warning nor an error from gatherer.py. From the console output (or logfile output) there is no direct indication that anything is wrong — only the indirect evidence from the observation that it doesn't seem to be doing much.

MCVE

$ gatherer.py -v /does/not/exist

Expected output

I think this should fail immediately, as I don't see any way in which this error could be recoverable.

Actual output

$ gatherer.py -v /does/not/exist
Setting timezone to UTC
[INFO: 2021-01-14 15:02:24 : posttroll.publisher] publisher started on port 38155
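
A minimal sketch of the kind of early check that would make this fail immediately; the argument handling is simplified and does not reflect gatherer.py's actual argument parsing:

import argparse
import os
import sys

parser = argparse.ArgumentParser()
parser.add_argument("config", help="path to the gatherer configuration file")
parser.add_argument("-v", "--verbose", action="store_true")
args = parser.parse_args()

if not os.path.isfile(args.config):
    sys.exit(f"Configuration file not found: {args.config}")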

Gather a xor b

We produce a world composite image by collecting messages from six trollflow2 instances, corresponding to production of imagery from six geostationary satellite sensors: MTG-FCI, MSG-SEVIRI (IODC), GK2A-AMI, HIMAWARI-AHI, GOES-West ABI, and GOES-East ABI. At present, we ignore GK2A-AMI and produce a world composite image when the other five constituent images have been produced. This is suboptimal, because if HIMAWARI-AHI is missing or delayed, GK2A-AMI should be able to jump in. It would be nice to instruct the segment gatherer to collect either HIMAWARI-AHI or GK2A-AMI, whichever comes first, and then drop the other one.
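
A sketch of the completeness check such an "a xor b" option would boil down to, with the required sources and the alternative group written out as plain Python sets (the grouping below is only an illustration, not an existing configuration option):

REQUIRED = {"MTG-FCI", "MSG-SEVIRI-IODC", "GOES-West-ABI", "GOES-East-ABI"}
# Groups where any single member is enough; the first one received "wins".
ALTERNATIVE_GROUPS = [{"HIMAWARI-AHI", "GK2A-AMI"}]

def is_complete(received):
    """True when every required source and at least one member of each
    alternative group have delivered their image."""
    if not REQUIRED <= received:
        return False
    return all(group & received for group in ALTERNATIVE_GROUPS)

# Example: HIMAWARI-AHI missing but GK2A-AMI arrived -> still complete.
print(is_complete(REQUIRED | {"GK2A-AMI"}))  # True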

gatherer raises KeyError in terminator when default config section lacks publish_topic and metadata lacks format

When for some reason in gatherer.py there is a call to terminator() with publish_topic=None, and the metadata (containing a message sent via posttroll?) does not contain the format key, gatherer.py raises a KeyError: 'format'.

MCVE

The true minimum would be a Python script calling terminator() with certain arguments, but I think that's not the most useful way to reproduce it, because the bug may be upstream (why does terminator() get called this way). My setup:

  • nameserver is running
  • trollstalker is running with trollstalker -c trollstalker.ini -C avhrrl0 -n localhost — contents of trollstalker.ini:
[avhrrl0]
topic=/file/poes/avhrr
directory=/data/pytroll/IN/HRPT/
posttroll_port=0
publish_port=
event_names=IN_CLOSE_WRITE,IN_MOVED_TO
loglevel=DEBUG
stalker_log_config=/opt/pytroll/pytroll_inst/config/trollstalker_logging.ini
filepattern={path}AVHR_HRP_00_{platform_name}_{start_time:%Y%m%d%H%M%S}Z_{end_time:%Y%m%d%H%M%S}Z_N_O_{processing_time:%Y%m%d%H%M%S}Z
instruments=avhrr/3
alias_platform_name=M01:Metop-B|M02:Metop-A|M03:Metop-C|noaa18:NOAA-18|noaa19:NOAA-19
history=0
nameservers=localhost
  • contents of gatherer.ini:
[default]
regions = eurol
timeliness = 180
service =
topics = /file/poes/avhrr

[metopa-avhrr-hrpt]
regions = eurol
service =
pattern = /data/pytroll/IN/HRPT/AVHR_HRP_00_M02_{start_time:%Y%m%d%H%M%S}Z_{end_time:%Y%m%d%H%M%S}Z_N_O_{processing_time:%Y%m%d%H%M%S}Z
sensor = avhrr
topics = /file/poes/avhrr
publish_topic = /collection/metopa/avhrr
timeliness = 180
duration = 300
platform_name = Metop-A
format = HRPT
type = binary
variant = DR
level = 0
orbit_type = polar
nameserver = localhost

Running gatherer.py -v gatherer.ini. In another terminal, move some matching files to the directory watched by trollstalker (mv AVHR_HRP_00_M02_20210114080* /tmp/foo/ && mv /tmp/foo/AVHR_HRP_00* .).

Expected output

I don't understand gatherer.py well enough yet to know what to expect, but I suspect it should publish a message and continue running.

Actual output

Output from gatherer.py (I added a print(metadata) in the terminator function for debugging purposes):

Output from `gatherer.py -v gatherer.ini`
[DEBUG: 2021-01-14 16:37:41 : gatherer] Using posttroll for default
[DEBUG: 2021-01-14 16:37:41 : pytroll_collectors.trigger] Nameserver: localhost
[DEBUG: 2021-01-14 16:37:41 : gatherer] Using posttroll for metopa-avhrr-hrpt
[DEBUG: 2021-01-14 16:37:41 : pytroll_collectors.trigger] Nameserver: localhost
[INFO: 2021-01-14 16:37:41 : posttroll.publisher] publisher started on port 33687
[INFO: 2021-01-14 16:37:41 : posttroll.subscriber] Subscriber adding SUB hook tcp://localhost:16543 for topics ['pytroll://address']
[INFO: 2021-01-14 16:37:41 : posttroll.subscriber] Subscriber adding address tcp://141.38.37.143:16543 with topics ['pytroll://file/poes/avhrr']
[INFO: 2021-01-14 16:37:41 : posttroll.subscriber] Subscriber adding address tcp://141.38.37.143:32883 with topics ['pytroll://file/poes/avhrr']
[INFO: 2021-01-14 16:37:41 : posttroll.subscriber] Subscriber adding SUB hook tcp://localhost:16543 for topics ['pytroll://address']
[INFO: 2021-01-14 16:37:41 : posttroll.subscriber] Subscriber adding address tcp://141.38.37.143:16543 with topics ['pytroll://file/poes/avhrr']
[INFO: 2021-01-14 16:37:41 : posttroll.subscriber] Subscriber adding address tcp://141.38.37.143:32883 with topics ['pytroll://file/poes/avhrr']
[DEBUG: 2021-01-14 16:37:47 : pytroll_collectors.trigger] mda: {'path': '', 'platform_name': 'Metop-A', 'start_time': datetime.datetime(2021, 1, 14, 8, 0, 12), 'end_time': datetime.datetime(2021, 1, 14, 8, 3, 12), 'processing_time': datetime.datetime(2021, 1, 14, 8, 0, 12), 'uri': '/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z', 'uid': 'AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z', 'sensor': ['avhrr/3'], 'orig_platform_name': 'M02'}
[DEBUG: 2021-01-14 16:37:47 : pytroll_collectors.region_collector] Adding area ID to metadata: eurol
[INFO: 2021-01-14 16:37:47 : pytroll_collectors.region_collector] Platform name Metop-A and sensor ['avhrr/3']: Start and end times = 20210114 08:00:12 20210114 08:03:12
[DEBUG: 2021-01-14 16:37:47 : pytroll_collectors.trigger] mda: {'path': '', 'platform_name': 'Metop-A', 'start_time': datetime.datetime(2021, 1, 14, 8, 0, 12), 'end_time': datetime.datetime(2021, 1, 14, 8, 3, 12), 'processing_time': datetime.datetime(2021, 1, 14, 8, 0, 12), 'uri': '/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z', 'uid': 'AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z', 'sensor': ['avhrr/3'], 'orig_platform_name': 'M02'}
[DEBUG: 2021-01-14 16:37:47 : pytroll_collectors.region_collector] Adding area ID to metadata: eurol
[DEBUG: 2021-01-14 16:37:47 : pytroll_collectors.region_collector] Estimated granule duration to 0:03:00
[INFO: 2021-01-14 16:37:47 : pytroll_collectors.region_collector] Platform name Metop-A and sensor ['avhrr/3']: Start and end times = 20210114 08:00:12 20210114 08:03:12
[DEBUG: 2021-01-14 16:37:47 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:37:47 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:37:47 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:37:47 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:37:47 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:37:47 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:37:48 : trollsched.boundary] Instrument = avhrr/3
[INFO: 2021-01-14 16:37:49 : pytroll_collectors.region_collector] Added Metop-A (2021-01-14 08:00:12) granule to area eurol
[DEBUG: 2021-01-14 16:37:49 : pytroll_collectors.region_collector] Predicting granules covering eurol
[DEBUG: 2021-01-14 16:37:49 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:37:49 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:37:49 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:37:50 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:37:51 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:37:51 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:37:51 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:37:52 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:37:53 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:37:53 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:37:53 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:37:53 : trollsched.boundary] Instrument = avhrr/3
[INFO: 2021-01-14 16:37:53 : pytroll_collectors.region_collector] Added Metop-A (2021-01-14 08:00:12) granule to area eurol
[DEBUG: 2021-01-14 16:37:53 : pytroll_collectors.region_collector] Predicting granules covering eurol
[DEBUG: 2021-01-14 16:37:53 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:37:53 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:37:53 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:37:54 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:37:54 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:37:54 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:37:54 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:37:59 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:37:59 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:37:59 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:37:59 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:01 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:38:01 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:01 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:01 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:02 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:38:02 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:02 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:02 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:06 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:38:07 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:07 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:07 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:07 : trollsched.boundary] Instrument = avhrr/3
[INFO: 2021-01-14 16:38:08 : pytroll_collectors.region_collector] Planned granules for Euro 3.0km area - Europe: [datetime.datetime(2021, 1, 14, 7, 50, 12), datetime.datetime(2021, 1, 14, 7, 55, 12), datetime.datetime(2021, 1, 14, 8, 0, 12), datetime.datetime(2021, 1, 14, 8, 5, 12)]
[INFO: 2021-01-14 16:38:08 : pytroll_collectors.region_collector] Planned timeout for Euro 3.0km area - Europe: 2021-01-14T11:10:12
[WARNING: 2021-01-14 16:38:08 : pytroll_collectors.trigger] Timeout detected, terminating collector
[DEBUG: 2021-01-14 16:38:08 : pytroll_collectors.trigger] mda: {'path': '', 'platform_name': 'Metop-A', 'start_time': datetime.datetime(2021, 1, 14, 8, 3, 12), 'end_time': datetime.datetime(2021, 1, 14, 8, 6, 12), 'processing_time': datetime.datetime(2021, 1, 14, 8, 3, 12), 'uri': '/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080312Z_20210114080612Z_N_O_20210114080312Z', 'uid': 'AVHR_HRP_00_M02_20210114080312Z_20210114080612Z_N_O_20210114080312Z', 'sensor': ['avhrr/3'], 'orig_platform_name': 'M02'}
/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pyproj/crs/crs.py:543: UserWarning: You will likely lose important projection information when converting to a PROJ string from another format. See: https://proj.org/faq.html#what-is-the-best-format-for-describing-coordinate-reference-systems
  proj_string = self.to_proj4()
[DEBUG: 2021-01-14 16:38:08 : pytroll_collectors.trigger] Area: Area ID: eurol
Description: Euro 3.0km area - Europe
Projection: {'ellps': 'WGS84', 'lat_0': '90', 'lat_ts': '60', 'lon_0': '0', 'no_defs': 'None', 'proj': 'stere', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 2560
Number of rows: 2048
Area extent: (-3780000.0, -7644000.0, 3900000.0, -1500000.0), timeout: 2021-01-14 11:10:12
[DEBUG: 2021-01-14 16:38:08 : pytroll_collectors.region_collector] Adding area ID to metadata: eurol
[INFO: 2021-01-14 16:38:08 : gatherer] Composing topic.
[INFO: 2021-01-14 16:38:08 : pytroll_collectors.region_collector] Platform name Metop-A and sensor ['avhrr/3']: Start and end times = 20210114 08:03:12 20210114 08:06:12
[DEBUG: 2021-01-14 16:38:08 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:08 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:08 : pyorbital.tlefile] Fetch TLE from the internet.
[INFO: 2021-01-14 16:38:08 : gatherer] sending pytroll://collection/metopa/avhrr collection [email protected] 2021-01-14T16:38:08.450784 v1.01 application/json {"path": "", "platform_name": "Metop-A", "start_time": "2021-01-14T08:00:12", "end_time": "2021-01-14T08:03:12", "processing_time": "2021-01-14T08:00:12", "sensor": ["avhrr/3"], "orig_platform_name": "M02", "collection_area_id": "eurol", "collection": [{"start_time": "2021-01-14T08:00:12", "end_time": "2021-01-14T08:03:12", "uri": "/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z", "uid": "AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z"}]}
[DEBUG: 2021-01-14 16:38:08 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:38:09 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:09 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:09 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:12 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:38:13 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:13 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:13 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:13 : trollsched.boundary] Instrument = avhrr/3
[INFO: 2021-01-14 16:38:14 : pytroll_collectors.region_collector] Added Metop-A (2021-01-14 08:03:12) granule to area eurol
[DEBUG: 2021-01-14 16:38:14 : pytroll_collectors.region_collector] Predicting granules covering eurol
[DEBUG: 2021-01-14 16:38:14 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:14 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:14 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:15 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:38:15 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:15 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:15 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:18 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:38:18 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:18 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:18 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:20 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:38:21 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:21 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:21 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:21 : trollsched.boundary] Instrument = avhrr/3
[INFO: 2021-01-14 16:38:22 : pytroll_collectors.region_collector] Planned granules for Euro 3.0km area - Europe: [datetime.datetime(2021, 1, 14, 7, 51, 12), datetime.datetime(2021, 1, 14, 7, 54, 12), datetime.datetime(2021, 1, 14, 7, 57, 12), datetime.datetime(2021, 1, 14, 8, 0, 12), datetime.datetime(2021, 1, 14, 8, 3, 12), datetime.datetime(2021, 1, 14, 8, 6, 12)]
[INFO: 2021-01-14 16:38:22 : pytroll_collectors.region_collector] Planned timeout for Euro 3.0km area - Europe: 2021-01-14T11:09:12
[DEBUG: 2021-01-14 16:38:22 : pytroll_collectors.trigger] mda: {'path': '', 'platform_name': 'Metop-A', 'start_time': datetime.datetime(2021, 1, 14, 8, 3, 12), 'end_time': datetime.datetime(2021, 1, 14, 8, 6, 12), 'processing_time': datetime.datetime(2021, 1, 14, 8, 3, 12), 'uri': '/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080312Z_20210114080612Z_N_O_20210114080312Z', 'uid': 'AVHR_HRP_00_M02_20210114080312Z_20210114080612Z_N_O_20210114080312Z', 'sensor': ['avhrr/3'], 'orig_platform_name': 'M02'}
[DEBUG: 2021-01-14 16:38:22 : pytroll_collectors.region_collector] Adding area ID to metadata: eurol
[INFO: 2021-01-14 16:38:22 : pytroll_collectors.region_collector] Added Metop-A (2021-01-14 08:03:12) granule to area eurol
[DEBUG: 2021-01-14 16:38:22 : pytroll_collectors.trigger] mda: {'path': '', 'platform_name': 'Metop-A', 'start_time': datetime.datetime(2021, 1, 14, 8, 6, 12), 'end_time': datetime.datetime(2021, 1, 14, 8, 6, 41), 'processing_time': datetime.datetime(2021, 1, 14, 8, 6, 12), 'uri': '/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080612Z_20210114080641Z_N_O_20210114080612Z', 'uid': 'AVHR_HRP_00_M02_20210114080612Z_20210114080641Z_N_O_20210114080612Z', 'sensor': ['avhrr/3'], 'orig_platform_name': 'M02'}
[DEBUG: 2021-01-14 16:38:22 : pytroll_collectors.region_collector] Adding area ID to metadata: eurol
[INFO: 2021-01-14 16:38:22 : pytroll_collectors.region_collector] Added Metop-A (2021-01-14 08:06:12) granule to area eurol
[INFO: 2021-01-14 16:38:22 : pytroll_collectors.region_collector] Adjusted timeout: 2021-01-14T11:00:12
[WARNING: 2021-01-14 16:38:22 : pytroll_collectors.trigger] Timeout detected, terminating collector
/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pyproj/crs/crs.py:543: UserWarning: You will likely lose important projection information when converting to a PROJ string from another format. See: https://proj.org/faq.html#what-is-the-best-format-for-describing-coordinate-reference-systems
  proj_string = self.to_proj4()
[DEBUG: 2021-01-14 16:38:22 : pytroll_collectors.trigger] Area: Area ID: eurol
Description: Euro 3.0km area - Europe
Projection: {'ellps': 'WGS84', 'lat_0': '90', 'lat_ts': '60', 'lon_0': '0', 'no_defs': 'None', 'proj': 'stere', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 2560
Number of rows: 2048
Area extent: (-3780000.0, -7644000.0, 3900000.0, -1500000.0), timeout: 2021-01-14 11:00:12
[INFO: 2021-01-14 16:38:22 : gatherer] Using default topic.
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pytroll_collectors/trigger.py", line 141, in run
    self.terminator(next_timeout[0].finish(),
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/bin/gatherer.py", line 90, in terminator
    subject = "/".join(("", mda["format"], mda["data_processing_level"],
KeyError: 'format'
[CRITICAL: 2021-01-14 16:38:22 : gatherer] Something went wrong!
[WARNING: 2021-01-14 16:38:22 : gatherer] Ending publication the gathering of granules...
Setting timezone to UTC
[{'path': '', 'platform_name': 'Metop-A', 'start_time': datetime.datetime(2021, 1, 14, 8, 0, 12), 'end_time': datetime.datetime(2021, 1, 14, 8, 3, 12), 'processing_time': datetime.datetime(2021, 1, 14, 8, 0, 12), 'uri': '/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z', 'uid': 'AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z', 'sensor': ['avhrr/3'], 'orig_platform_name': 'M02', 'collection_area_id': 'eurol'}] /collection/metopa/avhrr
[{'path': '', 'platform_name': 'Metop-A', 'start_time': datetime.datetime(2021, 1, 14, 8, 0, 12), 'end_time': datetime.datetime(2021, 1, 14, 8, 3, 12), 'processing_time': datetime.datetime(2021, 1, 14, 8, 0, 12), 'uri': '/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z', 'uid': 'AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z', 'sensor': ['avhrr/3'], 'orig_platform_name': 'M02', 'collection_area_id': 'eurol'}, {'path': '', 'platform_name': 'Metop-A', 'start_time': datetime.datetime(2021, 1, 14, 8, 3, 12), 'end_time': datetime.datetime(2021, 1, 14, 8, 6, 12), 'processing_time': datetime.datetime(2021, 1, 14, 8, 3, 12), 'uri': '/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080312Z_20210114080612Z_N_O_20210114080312Z', 'uid': 'AVHR_HRP_00_M02_20210114080312Z_20210114080612Z_N_O_20210114080312Z', 'sensor': ['avhrr/3'], 'orig_platform_name': 'M02', 'collection_area_id': 'eurol'}, {'path': '', 'platform_name': 'Metop-A', 'start_time': datetime.datetime(2021, 1, 14, 8, 6, 12), 'end_time': datetime.datetime(2021, 1, 14, 8, 6, 41), 'processing_time': datetime.datetime(2021, 1, 14, 8, 6, 12), 'uri': '/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080612Z_20210114080641Z_N_O_20210114080612Z', 'uid': 'AVHR_HRP_00_M02_20210114080612Z_20210114080641Z_N_O_20210114080612Z', 'sensor': ['avhrr/3'], 'orig_platform_name': 'M02', 'collection_area_id': 'eurol'}] None
[DEBUG: 2021-01-14 16:38:25 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:38:26 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:26 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:26 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:28 : trollsched.boundary] Instrument = avhrr/3
[DEBUG: 2021-01-14 16:38:29 : trollsched.satpass] kwargs: {'instrument': 'avhrr/3'}
[DEBUG: 2021-01-14 16:38:29 : trollsched.satpass] instrument: avhrr/3
[DEBUG: 2021-01-14 16:38:29 : pyorbital.tlefile] Fetch TLE from the internet.
[DEBUG: 2021-01-14 16:38:34 : trollsched.boundary] Instrument = avhrr/3
[INFO: 2021-01-14 16:38:35 : pytroll_collectors.region_collector] Planned granules for Euro 3.0km area - Europe: [datetime.datetime(2021, 1, 14, 7, 48, 12), datetime.datetime(2021, 1, 14, 7, 53, 12), datetime.datetime(2021, 1, 14, 7, 58, 12), datetime.datetime(2021, 1, 14, 8, 3, 12)]
[INFO: 2021-01-14 16:38:35 : pytroll_collectors.region_collector] Planned timeout for Euro 3.0km area - Europe: 2021-01-14T11:08:12
[INFO: 2021-01-14 16:38:35 : pytroll_collectors.region_collector] Adjusted timeout: 2021-01-14T11:03:12

At this point, gatherer.py quits. I'm not sure if it quits normally; it quits with exit code 0. The logging output from trollstalker in the meantime:

[DEBUG: 2021-01-14 16:37:47,858: trollstalker] trigger: IN_CLOSE_WRITE
[DEBUG: 2021-01-14 16:37:47,858: trollstalker] processing /data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z
[DEBUG: 2021-01-14 16:37:47,858: trollstalker] filter: {path}AVHR_HRP_00_{platform_name}_{start_time:%Y%m%d%H%M%S}Z_{end_time:%Y%m%d%H%M%S}Z_N_O_{processing_time:%Y%m%d%H%M%S}Z         event: /data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z
[DEBUG: 2021-01-14 16:37:47,858: trollstalker] No origin_inotify_base_dir_skip_levels in self.custom_vars
[DEBUG: 2021-01-14 16:37:47,859: trollstalker] Extracted: OrderedDict([('path', ''), ('platform_name', 'M02'), ('start_time', datetime.datetime(2021, 1, 14, 8, 0, 12)), ('end_time', datetime.datetime(2021, 1, 14, 8, 3, 12)), ('processing_time', datetime.datetime(2021, 1, 14, 8, 0, 12))])
[DEBUG: 2021-01-14 16:37:47,859: trollstalker] self.info['sensor']: ['avhrr/3']
[INFO: 2021-01-14 16:37:47,860: trollstalker] Publishing message pytroll://file/poes/avhrr file [email protected] 2021-01-14T16:37:47.859793 v1.01 application/json {"path": "", "platform_name": "Metop-A", "start_time": "2021-01-14T08:00:12", "end_t
ime": "2021-01-14T08:03:12", "processing_time": "2021-01-14T08:00:12", "uri": "/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z", "uid": "AVHR_HRP_00_M02_20210114080012Z_20210114080312Z_N_O_20210114080012Z", "sensor": ["avhrr/3"], "orig_platform_name": "M02"}
[DEBUG: 2021-01-14 16:37:47,943: trollstalker] trigger: IN_CLOSE_WRITE
[DEBUG: 2021-01-14 16:37:47,943: trollstalker] processing /data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080312Z_20210114080612Z_N_O_20210114080312Z
[DEBUG: 2021-01-14 16:37:47,943: trollstalker] filter: {path}AVHR_HRP_00_{platform_name}_{start_time:%Y%m%d%H%M%S}Z_{end_time:%Y%m%d%H%M%S}Z_N_O_{processing_time:%Y%m%d%H%M%S}Z         event: /data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080312Z_20210114080612Z_N_O_20210114080312Z
[DEBUG: 2021-01-14 16:37:47,944: trollstalker] No origin_inotify_base_dir_skip_levels in self.custom_vars
[DEBUG: 2021-01-14 16:37:47,944: trollstalker] Extracted: OrderedDict([('path', ''), ('platform_name', 'M02'), ('start_time', datetime.datetime(2021, 1, 14, 8, 3, 12)), ('end_time', datetime.datetime(2021, 1, 14, 8, 6, 12)), ('processing_time', datetime.datetime(2021, 1, 14, 8, 3, 12))])
[DEBUG: 2021-01-14 16:37:47,944: trollstalker] self.info['sensor']: ['avhrr/3']
[INFO: 2021-01-14 16:37:47,945: trollstalker] Publishing message pytroll://file/poes/avhrr file [email protected] 2021-01-14T16:37:47.945067 v1.01 application/json {"path": "", "platform_name": "Metop-A", "start_time": "2021-01-14T08:03:12", "end_time": "2021-01-14T08:06:12", "processing_time": "2021-01-14T08:03:12", "uri": "/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080312Z_20210114080612Z_N_O_20210114080312Z", "uid": "AVHR_HRP_00_M02_20210114080312Z_20210114080612Z_N_O_20210114080312Z", "sensor": ["avhrr/3"], "orig_platform_name": "M02"}
[DEBUG: 2021-01-14 16:37:47,965: trollstalker] trigger: IN_CLOSE_WRITE
[DEBUG: 2021-01-14 16:37:47,965: trollstalker] processing /data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080612Z_20210114080641Z_N_O_20210114080612Z
[DEBUG: 2021-01-14 16:37:47,965: trollstalker] filter: {path}AVHR_HRP_00_{platform_name}_{start_time:%Y%m%d%H%M%S}Z_{end_time:%Y%m%d%H%M%S}Z_N_O_{processing_time:%Y%m%d%H%M%S}Z         event: /data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080612Z_20210114080641Z_N_O_20210114080612Z
[DEBUG: 2021-01-14 16:37:47,965: trollstalker] No origin_inotify_base_dir_skip_levels in self.custom_vars
[DEBUG: 2021-01-14 16:37:47,966: trollstalker] Extracted: OrderedDict([('path', ''), ('platform_name', 'M02'), ('start_time', datetime.datetime(2021, 1, 14, 8, 6, 12)), ('end_time', datetime.datetime(2021, 1, 14, 8, 6, 41)), ('processing_time', datetime.datetime(2021, 1, 14, 8, 6, 12))])
[DEBUG: 2021-01-14 16:37:47,966: trollstalker] self.info['sensor']: ['avhrr/3']
[INFO: 2021-01-14 16:37:47,966: trollstalker] Publishing message pytroll://file/poes/avhrr file [email protected] 2021-01-14T16:37:47.966510 v1.01 application/json {"path": "", "platform_name": "Metop-A", "start_time": "2021-01-14T08:06:12", "end_time": "2021-01-14T08:06:41", "processing_time": "2021-01-14T08:06:12", "uri": "/data/pytroll/IN/HRPT/AVHR_HRP_00_M02_20210114080612Z_20210114080641Z_N_O_20210114080612Z", "uid": "AVHR_HRP_00_M02_20210114080612Z_20210114080641Z_N_O_20210114080612Z", "sensor": ["avhrr/3"], "orig_platform_name": "M02"}

Additional diagnosis

When I add to gatherer.ini a line to the [default] section publish_topic = /collection/metopa/avhrr (duplicating the line under [metopa-avhrr-hrpt]), I don't get this exception, and the gatherer does not appear to quit at all.

Scene not marked as complete when used with files with time tolerance

The problem with using time as key and finding files within a time tolerance was partly solved in
#124.

The files are now found, but the scene is not marked as ready.
To see where the code fails, add the line assert slot.get_status() == Status.SLOT_READY to the test test_mismatching_files_within_tolerance_generate_one_slot.

align_time does not round sub-second time

The function pytroll_collectors.helper_functions.align_time is supposed to round times. However, if a sub-second component is present, it is kept, so the time is not fully rounded.

This script:

from datetime import datetime, timedelta
from pytroll_collectors.helper_functions import align_time

t = datetime(2015, 10, 21, 22, 29, 0, 12345)
t2 = align_time(t, timedelta(minutes=5))
print(t2)
assert t2 == datetime(2015, 10, 21, 22, 25, 0, 0)

results in an AssertionError:

2015-10-21 22:25:00.012345
Traceback (most recent call last):
  File "/home/gholl/checkouts/protocode/mwe/bad-align.py", line 7, in <module>
    assert t2 == datetime(2015, 10, 21, 22, 25, 0, 0)
AssertionError

because the result of align_time is datetime(2015, 10, 21, 22, 25, 0, 12345) rather than datetime(2015, 10, 21, 22, 25, 0, 0).
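
For reference, a minimal sketch of the behaviour the assertion expects, using a hypothetical align_time_floor helper that aligns within the day and drops microseconds; this is not the library's implementation:

from datetime import datetime, time, timedelta


def align_time_floor(dt, interval):
    """Floor dt to the previous multiple of interval within the day, dropping microseconds."""
    step = int(interval.total_seconds())
    seconds = dt.hour * 3600 + dt.minute * 60 + dt.second
    return datetime.combine(dt.date(), time()) + timedelta(seconds=seconds - seconds % step)


assert align_time_floor(datetime(2015, 10, 21, 22, 29, 0, 12345),
                        timedelta(minutes=5)) == datetime(2015, 10, 21, 22, 25)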

gatherer stops gathering if uncaught exception raised in pyorbital

When an uncaught exception is raised in pyorbital, for example, due to pytroll/pyorbital#74, gatherer stops gathering. The last sign of life in my gatherer logfile is:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pytroll_collectors/trigger.py", line 397, in run                                                                                                                  
    self.process(msg)
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pytroll_collectors/trigger.py", line 111, in add_file                                                                                                             
    self._do(pathname)
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pytroll_collectors/trigger.py", line 107, in _do                                                                                                                  
    Trigger._do(self, mda)
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pytroll_collectors/trigger.py", line 86, in _do                                                                                                                   
    res = collector(metadata.copy())
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pytroll_collectors/region_collector.py", line 65, in __call__                                                                                                     
    return self.collect(granule_metadata)
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pytroll_collectors/region_collector.py", line 147, in collect                                                                                                     
    granule_pass = Pass(platform, start_time, end_time,
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/trollsched/satpass.py", line 176, in __init__                                                                                                                     
    self.orb = orbital.Orbital(satellite, line1=tle1, line2=tle2)
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pyorbital/orbital.py", line 164, in __init__                                                                                                                      
    self.tle = tlefile.read(satellite, tle_file=tle_file,
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pyorbital/tlefile.py", line 106, in read                                                                                                                          
    return Tle(platform, tle_file=tle_file, line1=line1, line2=line2)
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pyorbital/tlefile.py", line 154, in __init__                                                                                                                      
    self._read_tle()
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pyorbital/tlefile.py", line 200, in _read_tle                                                                                                                     
    urls = (max(glob.glob(os.environ["TLES"]),
ValueError: max() arg is an empty sequence

Despite the exception, the daemon is still running, as shown by supervisorctl:

$ supervisorctl -c supervisord.conf status
pytroll-polar:pytroll-aapp-runner                     RUNNING   pid 19658, uptime 11 days, 7:59:32
pytroll-polar:pytroll-cat                             RUNNING   pid 14107, uptime 28 days, 1:45:12
pytroll-polar:pytroll-gatherer-metopa                 RUNNING   pid 14080, uptime 24 days, 6:16:04
pytroll-polar:pytroll-gatherer-metopb                 RUNNING   pid 14229, uptime 24 days, 6:16:00
pytroll-polar:pytroll-gatherer-metopc                 RUNNING   pid 14353, uptime 24 days, 6:15:57
pytroll-polar:pytroll-nameserver                      RUNNING   pid 14096, uptime 28 days, 1:45:12
pytroll-polar:pytroll-trollflow2                      RUNNING   pid 398, uptime 15 days, 0:35:45
pytroll-polar:pytroll-trollstalker-metopa-direkt      RUNNING   pid 26102, uptime 28 days, 0:58:37
pytroll-polar:pytroll-trollstalker-metopa-eumetcast   RUNNING   pid 26265, uptime 28 days, 0:58:28
pytroll-polar:pytroll-trollstalker-metopb-direkt      RUNNING   pid 26111, uptime 28 days, 0:58:34
pytroll-polar:pytroll-trollstalker-metopb-eumetcast   RUNNING   pid 26451, uptime 28 days, 0:58:25
pytroll-polar:pytroll-trollstalker-metopc-direkt      RUNNING   pid 26223, uptime 28 days, 0:58:31
pytroll-polar:pytroll-trollstalker-metopc-eumetcast   RUNNING   pid 26479, uptime 28 days, 0:58:22
pytroll-polar:pytroll-trollstalker-noaa               RUNNING   pid 14103, uptime 28 days, 1:45:12

This is the worst of both worlds: it continues running, so it is not restarted by supervisord, but it is not doing anything anymore, so production has stopped.

The gatherer should catch and log exceptions raised downstream (such as by pyorbital), then try to resume gathering if at all possible.
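
A minimal sketch of such defensive handling, assuming the message loop lives in a run()-style method; the class and helper names below are illustrative, not the actual pytroll_collectors API:

import logging

logger = logging.getLogger(__name__)


class ResilientTrigger:
    """Illustrative trigger whose message loop survives downstream exceptions."""

    def __init__(self, subscriber, process):
        self._subscriber = subscriber  # hypothetical message source (iterable of messages)
        self._process = process        # e.g. the region collector callback
        self.loop = True

    def run(self):
        for msg in self._subscriber:
            if not self.loop:
                break
            try:
                self._process(msg)
            except Exception:
                # Log the downstream failure (e.g. from pyorbital) and keep
                # gathering instead of letting the thread die silently.
                logger.exception("Failed to process message, continuing: %s", msg)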

trollstalker seems to do nothing in current main

Starting with 2b0ec74, which was merged in #141, trollstalker appears to do nothing at all.

With my trollstalker configuration, starting trollstalker with trollstalker.py -c /tmp/trollstalker.ini -C msg-seviri-iodc -n localhost on commits older than 2b0ec74, I get

Setting timezone to UTC
[DEBUG: 2023-12-05 13:58:52,784: trollstalker] Logger started

followed by messages related to files that are arriving. With newer versions, I simply get

Setting timezone to UTC

and then nothing.

The trollstalker configuration is:

[msg-seviri-iodc]
topic = /file/msg/seviri/iodc
directory = /data/pytroll/IN/HRIT/
instruments = seviri
alias_platform_name = MSG1:Meteosat-8|MSG2:Meteosat-9
filepattern = {path}H-000-{orig_platform_name:4s}__-{platform_name:4s}_IODC___-{channel_name:_<9s}-{segment:_<9s}-{start_time:%Y%m%d%H%M}-{compression:1s}_
var_gatherer_time = {start_time:%Y%m%d%H%M%S|align(15,0,0)}
stalker_log_config = /tmp/trollstalker_logging.ini
event_names = IN_MOVED_TO
# Port to send the posttroll messages to, optional so use "0" to take a random free port:
posttroll_port = 0
nameservers = localhost

with the logging configuration

[loggers]
keys=root,trollstalker

[handlers]
keys=consoleHandler,fileHandler

[formatters]
keys=simpleFormatter

[logger_root]
level=DEBUG
handlers=consoleHandler,fileHandler

[logger_trollstalker]
level=DEBUG
handlers=consoleHandler,fileHandler
qualname=trollstalker
propagate=0

[handler_consoleHandler]
class=StreamHandler
level=DEBUG
formatter=simpleFormatter
args=(sys.stdout,)

[handler_fileHandler]
class=handlers.TimedRotatingFileHandler
level=DEBUG
formatter=simpleFormatter
args=("/opt/pytroll/pytroll_inst/log/trollstalker.log", 'midnight', 1, 7, None, True, True)

[formatter_simpleFormatter]
format=[%(levelname)s: %(asctime)s: %(name)s] %(message)s
datefmt=

Add a "monitor for silence"-style timeout to (geographic) gatherer

The (geographic) gatherer currently has a timeout option that is measured relative to the latest expected granule. This is useful, but not for all situations. Consider:

Current situation

Consider a granule duration of 180 seconds and a timeliness of 5 minutes for a certain area:

  • [10:03] Granule covering 10:00 - 10:03 arrives. Gatherer expects 10:00-10:03, 10:03-10:06, 10:06-10:09, 10:09-10:12, 10:12-10:15, 10:15-10:18.
  • [10:06] Granule covering 10:03 - 10:06 arrives.
  • Nothing happens.
  • [10:23] Timeout detected, gatherer sends message covering 10:00-10:06, with data that are 17-23 minutes old.

Desirable situation

With a "monitor for silence" set to 7 minutes:

  • [10:03] Granule covering 10:00 - 10:03 arrives. Gatherer expects 10:00-10:03, 10:03-10:06, 10:06-10:09, 10:09-10:12, 10:12-10:15, 10:15-10:18.
  • [10:06] Granule covering 10:03 - 10:06 arrives.
  • Nothing happens.
  • [10:13] Silence detected, gatherer sends message covering 10:00-10:06, with data that are 7-13 minutes old.

The monitor for silence would bring the data to the users 10 minutes faster.

There is a tradeoff between speed and completeness, but that is up to the user to decide.
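
A rough sketch of how the two deadlines could be combined, using the numbers from the example above (the function and parameter names are illustrative):

from datetime import datetime, timedelta


def next_timeout(latest_planned_granule_end, last_arrival,
                 timeliness=timedelta(minutes=5),
                 silence=timedelta(minutes=7)):
    """Fire at whichever deadline comes first."""
    planned_deadline = latest_planned_granule_end + timeliness
    silence_deadline = last_arrival + silence
    return min(planned_deadline, silence_deadline)


# Example from above: last planned granule ends at 10:18, last granule arrived at 10:06.
print(next_timeout(datetime(2024, 1, 1, 10, 18), datetime(2024, 1, 1, 10, 6)))
# -> 2024-01-01 10:13:00, i.e. the collection is published at 10:13 instead of 10:23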

Remove usage of six

As we don't support Python 2.7 anymore, the usage of the six compatibility library can be removed.

Starting the geographic gatherer fails with ModuleNotFoundError

Starting the geographic gatherer fails with a ModuleNotFoundError related to the pytroll_collectors.triggers module:

$ pip install ~gholl/checkouts/pytroll-collectors/
(...)
Successfully installed pytroll-collectors-0.10.0+118.g71d66a1
$ geographic_gatherer.py -v -n localhost gatherer-n19-ears.ini
Traceback (most recent call last):
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py39/bin/geographic_gatherer.py", line 34, in <module>
    from pytroll_collectors.geographic_gatherer import GeographicGatherer
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py39/lib/python3.9/site-packages/pytroll_collectors/geographic_gatherer.py", line 41, in <module>
    from pytroll_collectors.triggers import PostTrollTrigger, WatchDogTrigger
ModuleNotFoundError: No module named 'pytroll_collectors.triggers'

This regression was introduced in #85.

geographic gatherer stopped after IndexError exception

The geographic gatherer stopped working, but did not exit with a status that supervisor recognized as requiring a restart. This caused supervisord not to restart the gatherer. Supervisor reported:

2022-03-21 21:38:51,889 INFO exited: gatherer-winds (exit status 0; expected)

The log from geographic gatherer

Exception in thread Thread-7:
Traceback (most recent call last):
  File "/software/miniconda/envs/pytroll/lib/python3.9/threading.py", line 973, in _bootstrap_inner
    self.run()
  File "/software/miniconda/envs/pytroll/lib/python3.9/site-packages/pytroll_collectors/triggers/_base.py", line 227, in run
    self.publish_collection(next_timeout[0].finish_without_reset())
  File "/software/miniconda/envs/pytroll/lib/python3.9/site-packages/pytroll_collectors/triggers/_base.py", line 91, in publish_collection
    subject = self._get_topic(metadata[0])
IndexError: list index out of range
[ERROR: 2022-03-21 21:38:40 : pytroll_collectors.geographic_gatherer] Something went wrong
Traceback (most recent call last):
  File "/software/miniconda/envs/pytroll/lib/python3.9/site-packages/pytroll_collectors/geographic_gatherer.py", line 201, in run
    raise RuntimeError
RuntimeError
[INFO: 2022-03-21 21:38:40 : pytroll_collectors.geographic_gatherer] Ending publication the gathering of granules...
[INFO: 2022-03-21 21:38:40 : gatherer] GeographicGatherer has stopped.
Setting timezone to UTC

I guess the metadata list is empty, but I don't understand why:

def publish_collection(self, metadata):
    """Terminate the gathering."""
    subject = self._get_topic(metadata[0])
    mda = _merge_metadata(metadata)
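
A possible guard, sketched under the assumption that the timeout can fire before any granule has been collected, so that metadata is an empty list (this is not the actual fix):

def publish_collection(self, metadata):
    """Terminate the gathering."""
    if not metadata:
        # Nothing was collected before the timeout fired; log and return
        # instead of raising IndexError on metadata[0].
        # "logger" is assumed to be the module-level logger.
        logger.warning("Timeout with no collected granules, nothing to publish.")
        return
    subject = self._get_topic(metadata[0])
    mda = _merge_metadata(metadata)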

Python 3 support

Some of the modules/scripts have Python 2 specific syntax, imports and/or dependencies. These should be converted so that everything works also with Python 3.

Installing fails on Python 3.12; versioneer update needed?

Trying to install pytroll-collectors fails on Python 3.12. The failure apparently happens due to an outdated version of versioneer, which refers to SafeConfigParser, a class that has been removed.

In a clean repo, pip install pytroll-collectors fails with:

Collecting pytroll-collectors
  Using cached pytroll_collectors-0.15.1.tar.gz (112 kB)
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'error'
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [18 lines of output]
      /tmp/pip-install-a_grrpv_/pytroll-collectors_63499942fb5345279a073fa4e81584bc/versioneer.py:421: SyntaxWarning: invalid escape sequence '\s'
        LONG_VERSION_PY['git'] = '''
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-a_grrpv_/pytroll-collectors_63499942fb5345279a073fa4e81584bc/setup.py", line 63, in <module>
          version=versioneer.get_version(),
                  ^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-install-a_grrpv_/pytroll-collectors_63499942fb5345279a073fa4e81584bc/versioneer.py", line 1480, in get_version
          return get_versions()["version"]
                 ^^^^^^^^^^^^^^
        File "/tmp/pip-install-a_grrpv_/pytroll-collectors_63499942fb5345279a073fa4e81584bc/versioneer.py", line 1412, in get_versions
          cfg = get_config_from_root(root)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-install-a_grrpv_/pytroll-collectors_63499942fb5345279a073fa4e81584bc/versioneer.py", line 342, in get_config_from_root
          parser = configparser.SafeConfigParser()
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      AttributeError: module 'configparser' has no attribute 'SafeConfigParser'. Did you mean: 'RawConfigParser'?
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

If tle is missing the actual gatherer thread is crashing

If the TLE is missing for some reason, the actual gatherer thread crashes like this:

[DEBUG: 2019-02-13 04:12:02 : trollsched.satpass] Failed in PyOrbital: u"Found no TLE entry for 'NOAA 19'"
[DEBUG: 2019-02-13 04:12:02 : pyorbital.tlefile] Reading TLE from /data/pytroll/tle-in/tle_db/tle-latest-pytroll.txt
Exception in thread Thread-11:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/software/pytroll/lib/python2.7/site-packages/pytroll_collectors-0.7.0-py2.7.egg/pytroll_collectors/trigger.py", line 381, in run
    self.process(msg)
  File "/software/pytroll/lib/python2.7/site-packages/pytroll_collectors-0.7.0-py2.7.egg/pytroll_collectors/trigger.py", line 94, in add_file
    self._do(pathname)
  File "/software/pytroll/lib/python2.7/site-packages/pytroll_collectors-0.7.0-py2.7.egg/pytroll_collectors/trigger.py", line 89, in _do
    Trigger._do(self, mda)
  File "/software/pytroll/lib/python2.7/site-packages/pytroll_collectors-0.7.0-py2.7.egg/pytroll_collectors/trigger.py", line 64, in _do
    res = collector(metadata.copy())
  File "/software/pytroll/lib/python2.7/site-packages/pytroll_collectors-0.7.0-py2.7.egg/pytroll_collectors/region_collector.py", line 61, in __call__
    return self.collect(granule_metadata)
  File "/software/pytroll/lib/python2.7/site-packages/pytroll_collectors-0.7.0-py2.7.egg/pytroll_collectors/region_collector.py", line 169, in collect
    instrument=granule_metadata["sensor"])
  File "/software/pytroll/lib/python2.7/site-packages/pytroll_schedule-0.3.3-py2.7.egg/trollsched/satpass.py", line 184, in __init__
    self.orb = orbital.Orbital(NOAA20_NAME.get(satellite, satellite), line1=tle1, line2=tle2)
  File "/software/pytroll/lib/python2.7/site-packages/pyorbital-1.3.1-py2.7.egg/pyorbital/orbital.py", line 142, in __init__
    line1=line1, line2=line2)
  File "/software/pytroll/lib/python2.7/site-packages/pyorbital-1.3.1-py2.7.egg/pyorbital/tlefile.py", line 146, in read
    return Tle(platform, tle_file=tle_file, line1=line1, line2=line2)
  File "/software/pytroll/lib/python2.7/site-packages/pyorbital-1.3.1-py2.7.egg/pyorbital/tlefile.py", line 197, in __init__
    self._read_tle()
  File "/software/pytroll/lib/python2.7/site-packages/pyorbital-1.3.1-py2.7.egg/pyorbital/tlefile.py", line 280, in _read_tle
    raise KeyError("Found no TLE entry for '%s'" % self._platform)
KeyError: u"Found no TLE entry for 'NOAA 19'"

A possible try/except with a restart might avoid this, or at least the thread should be restarted.

Add option for non-greedy matching

Feature request

The file delivery system in use at the German Weather Service (DWD) (the Automatic File Distributor, see also the English-language link) delivers files to a system by first creating a temporary file starting with a ., then renaming that file when the transfer is complete. For example, it will first create .AVHR_HRP_00_M01_20210121082440Z_20210121082740Z_N_O_20210121082441Z, then move that file to AVHR_HRP_00_M01_20210121082440Z_20210121082740Z_N_O_20210121082441Z when copying is complete. Due to the non-greedy matching implemented in Trollsift, any pattern that matches the final, intended file will also match the temporary, unintended file. For example, using filepattern={path}AVHR_HRP_00_{platform_name}_{start_time:%Y%m%d%H%M%S}Z_{end_time:%Y%m%d%H%M%S}Z_N_O_{processing_time:%Y%m%d%H%M%S}Z, we get:

[DEBUG: 2021-01-17 22:25:47,850: trollstalker] trigger: IN_CLOSE_WRITE
[DEBUG: 2021-01-17 22:25:47,851: trollstalker] processing /data/pytroll/IN/HRPT/.AVHR_HRP_00_M03_20210117221434Z_20210117221733Z_N_O_20210117221434Z
[DEBUG: 2021-01-17 22:25:47,851: trollstalker] filter: {path}AVHR_HRP_00_{platform_name}_{start_time:%Y%m%d%H%M%S}Z_{end_time:%Y%m%d%H%M%S}Z_N_O_{processing_time:%Y%m%d%H%M%S}Z  event: /data/pytroll/IN/HRPT/.AVHR_HRP_00_M03_20210117221434Z_20210117221733Z_N_O_20210117221434Z
[DEBUG: 2021-01-17 22:25:47,851: trollstalker] No origin_inotify_base_dir_skip_levels in self.custom_vars
[DEBUG: 2021-01-17 22:25:47,852: trollstalker] Extracted: OrderedDict([('path', '.'), ('platform_name', 'M03'), ('start_time', datetime.datetime(2021, 1, 17, 22, 14, 34)), ('end_time', datetime.datetime(2021, 1, 17, 22, 17, 33)), ('processing_time', datetime.datetime(2021, 1, 17, 22, 14, 34))])
[DEBUG: 2021-01-17 22:25:47,852: trollstalker] self.info['sensor']: ['avhrr/3']
[INFO: 2021-01-17 22:25:47,852: trollstalker] Publishing message pytroll://file/poes/avhrr file [email protected] 2021-01-17T22:25:47.852507 v1.01 application/json {"path": ".", "platform_name": "Metop-C", "start_time": "2021-01-17T22:14:34", "end_time": "2021-01-17T22:17:33", "processing_time": "2021-01-17T22:14:34", "uri": "/data/pytroll/IN/HRPT/.AVHR_HRP_00_M03_20210117221434Z_20210117221733Z_N_O_20210117221434Z", "uid": ".AVHR_HRP_00_M03_20210117221434Z_20210117221733Z_N_O_20210117221434Z", "s
ensor": ["avhrr/3"], "orig_platform_name": "M03"}
[DEBUG: 2021-01-17 22:25:47,865: trollstalker] trigger: IN_MOVED_TO
[DEBUG: 2021-01-17 22:25:47,865: trollstalker] processing /data/pytroll/IN/HRPT/AVHR_HRP_00_M03_20210117221434Z_20210117221733Z_N_O_20210117221434Z
[DEBUG: 2021-01-17 22:25:47,866: trollstalker] filter: {path}AVHR_HRP_00_{platform_name}_{start_time:%Y%m%d%H%M%S}Z_{end_time:%Y%m%d%H%M%S}Z_N_O_{processing_time:%Y%m%d%H%M%S}Z  event: /data/pytroll/IN/HRPT/AVHR_HRP_00_M03_20210117221434Z_20210117221733Z_N_O_20210117221434Z
[DEBUG: 2021-01-17 22:25:47,866: trollstalker] No origin_inotify_base_dir_skip_levels in self.custom_vars
[DEBUG: 2021-01-17 22:25:47,866: trollstalker] Extracted: OrderedDict([('path', ''), ('platform_name', 'M03'), ('start_time', datetime.datetime(2021, 1, 17, 22, 14, 34)), ('end_time', datetime.datetime(2021, 1, 17, 22, 17, 33)), ('processing_time', datetime.datetime(2021, 1, 17, 22, 14, 34))])
[DEBUG: 2021-01-17 22:25:47,866: trollstalker] self.info['sensor']: ['avhrr/3']
[INFO: 2021-01-17 22:25:47,867: trollstalker] Publishing message pytroll://file/poes/avhrr file [email protected] 2021-01-17T22:25:47.867246 v1.01 application/json {"path": "", "platform_name": "Metop-C", "start_time": "2021-01-17T22:14:34", "end_time": "2021-01-17T22:17:33", "processing_time": "2021-01-17T22:14:34", "uri": "/data/pytroll/IN/HRPT/AVHR_HRP_00_M03_20210117221434Z_20210117221733Z_N_O_20210117221434Z", "uid": "AVHR_HRP_00_M03_20210117221434Z_20210117221733Z_N_O_20210117221434Z", "sensor": ["avhrr/3"], "orig_platform_name": "M03"}

This is undesirable because Trollstalker sends messages about these files. Somewhere down the chain this is going to cause error messages when the temporary files are absent. Although those error messages do not prevent successful processing of the final input files, they do clutter the logs and may make more anomalous error messages harder to spot.

Describe the solution you'd like

I would like a new flag for whole-name matching only. This would likely require a change in both Trollstalker and Trollsift.
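
As a sketch of one possible approach (not a decided design), whole-name matching could be done on top of trollsift by comparing the bare filename against the globified pattern before parsing:

from fnmatch import fnmatchcase
from trollsift.parser import globify, parse

PATTERN = ("AVHR_HRP_00_{platform_name}_{start_time:%Y%m%d%H%M%S}Z_"
           "{end_time:%Y%m%d%H%M%S}Z_N_O_{processing_time:%Y%m%d%H%M%S}Z")


def parse_whole_name(pattern, filename):
    """Parse filename only if the whole name matches the pattern."""
    if not fnmatchcase(filename, globify(pattern)):
        raise ValueError(f"{filename!r} does not fully match {pattern!r}")
    return parse(pattern, filename)

# With this check, the temporary ".AVHR_HRP_00_M03_..." file would be rejected,
# while the final file name is parsed as before.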

Describe any changes to existing user workflow

The new flag would be optional, and the default behaviour would correspond to the status quo. Therefore, this improvement should have no impact on backward compatibility.

Additional context

I tried to change the filepattern, but if I add a / between {path} and the literal AVHR_HRP_00_ part, then neither the temporary nor the final file is matched; this is because Trollstalker matches against the filename, not against the full path.

My present workaround is to monitor only for the event IN_MOVED_TO and not for IN_CLOSE_WRITE. This workaround is problematic because it relies on an implementation detail of the file monitoring software. This detail may change without warning (from the perspective of us users), which could suddenly break operational file processing. Therefore, a more sustainable solution would be desirable.

segment_gatherer publishes incorrect end_time when using group_by_minutes

When using group_by_minutes, the segment gatherer publishes an incorrect end time. It publishes the end_time of the first segment in the slot, rather than the end_time of the segment latest in time (the maximum of all end times).

In the following MCVE, the segment gatherer is configured with group_by_minutes: 10 and receives three segments. The first segment covers 13:00:00-13:01:00, the second 13:01:00-13:02:00, and the third 13:02:00-13:03:00. They all fit in the slot starting at 13:00:00. The correct start and end time for the slot would be 13:00:00-13:03:00, but the segment gatherer publishes a message with start and end times 13:00:00-13:01:00 instead.

MCVE:

import datetime
from pytroll_collectors.segments import SegmentGatherer
from posttroll.message import Message

sg = SegmentGatherer({
    "patterns": {
        "oak": {
            "pattern": "oak-s{start_time:%Y%m%d%H%M%S}-e{end_time:%Y%m%d%H%M%S}-s{segment}.tree",
            "critical_files": None,
            "wanted_files": ":001-003",
            "all_files": ":001-003",
            "is_critical_set": False,
            "variable_tags": ["start_time", "end_time"]}},
    "timeliness": 10,
    "group_by_minutes": 10,
    "time_name": "start_time"})

messages = [Message(
    rawstr=f"pytroll://tree/oak file pytroll@forest 1980-01-01T13:0{i:d}:00.000000 v1.01 application/json "
           f'{{"platform_name": "forest", "start_time": "1980-01-01T13:0{i:d}:00", "end_time": '
           f'"1980-01-01T13:0{i+1:d}:00", "uri": "/data/oak-s19800101130{i:d}00-e19800101130{i+1:d}00-'
           f's00{i:d}.tree", "uid": "oak-s19800101130{i:d}00-e19800101130{i+1:d}00-s00{i:d}.tree", '
           '"sensor": "Thaumetopoea processionea"}')
    for i in range(3)]
for msg in messages:
    sg.process(msg)
print(sg.slots["1980-01-01 13:00:00"].output_metadata["start_time"])
print(sg.slots["1980-01-01 13:00:00"].output_metadata["end_time"])

Expected output:

1980-01-01 13:00:00
1980-01-01 13:03:00

Actual output:

1980-01-01 13:00:00
1980-01-01 13:01:00

This is with the latest pytroll-collectors main.
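
For reference, the expected aggregation is simply the minimum start time and maximum end time over all gathered segments; a minimal sketch (the function name is illustrative, this is not the segment gatherer's actual code):

def merge_slot_times(segment_metadata):
    """Combine per-segment times into the slot-level start and end times."""
    return {
        "start_time": min(meta["start_time"] for meta in segment_metadata),
        "end_time": max(meta["end_time"] for meta in segment_metadata),
    }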

add option to gatherer to measure timeout since first arriving granule not clock time

For our EUMETCAST GDS processing, granules may arrive very late, sometimes more than 90 minutes after the initial measurement. Currently, gatherer measures the timeliness only by comparing processing time with measurement time, so if the timeout is set to 90 minutes and a message arrives about a newly arrived granule that was measured 92 minutes ago, gatherer will conclude a timeout immediately. This means that either gatherer ships out too many short granules (defeating the point of gatherer) or the user needs to set the timeout very long, which leads to other problems. At the moment I have the timeout set to two hours, which sometimes leads to high-latitude granules being gathered across subsequent orbits.

I would like to see an option in gatherer in which the timeliness is measured not by "current time - measurement time" but by "current time - time of first granule". As an added bonus, this will also make it somewhat easier to test a setup by dropping in old files.

region collector never finishes if start_time after end_time (probably stuck in infinite loop)

When the end time for a granule is before the start time, the region collector is apparently supposed to do something clever, judging from this code block:

if start_time > end_time:
    old_end_time = end_time
    end_date = start_time.date()
    if end_time.time() < start_time.time():
        end_date += timedelta(days=1)
    end_time = datetime.combine(end_date, end_time.time())
    LOG.debug('Adjusted end time from %s to %s.',
              old_end_time, end_time)

However, I've tried to write a unit test for this, see #77 or

https://github.com/gerritholl/pytroll-collectors/blob/1ad53748b5b2c267f6513a13a1cb5d6b23cfc50c/pytroll_collectors/tests/test_region_collector.py

where this unit test never finishes, apparently stuck in an infinite loop:

@unittest.mock.patch("pyorbital.tlefile.urlopen", new=_fakeopen)
def test_faulty_end_time(europe_collector, caplog):
    """Test adapting if end_time before start_time."""
    granule_metadata = {
        "platform_name": "Metop-C",
        "sensor": "avhrr",
        "start_time": datetime.datetime(2021, 4, 11, 0, 0),
        "end_time": datetime.datetime(2021, 4, 10, 23, 58)}
    with caplog.at_level(logging.DEBUG):
        europe_collector(granule_metadata)
    assert "Adjusted end time" in caplog.text

which I suspect is one of these two loops:

while True:
    gr_time += self.granule_duration
    gr_pass = Pass(platform, gr_time,
                   gr_time + self.granule_duration,
                   instrument=self.sensor)
    if not gr_pass.area_coverage(self.region) > 0:
        break
    self.planned_granule_times.add(gr_time)
gr_time = start_time
while True:
    gr_time -= self.granule_duration
    gr_pass = Pass(platform, gr_time,
                   gr_time + self.granule_duration,
                   instrument=self.sensor)
    if not gr_pass.area_coverage(self.region) > 0:
        break
    self.planned_granule_times.add(gr_time)
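
A possible safeguard, sketched under the assumption that the planning loops never legitimately need more than roughly one orbit of granules (the 102-minute bound below is illustrative, not taken from the code):

from datetime import timedelta

# Bound the forward planning loop; the backward loop would get the same guard.
max_granules = int(timedelta(minutes=102) / self.granule_duration)
gr_time = start_time
for _ in range(max_granules):
    gr_time += self.granule_duration
    gr_pass = Pass(platform, gr_time,
                   gr_time + self.granule_duration,
                   instrument=self.sensor)
    if not gr_pass.area_coverage(self.region) > 0:
        break
    self.planned_granule_times.add(gr_time)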

cat.py fails to check if command ran successfully

Describe the bug

When cat.py is used to concatenate EPS files, for example using kai, it fails to test if kai ran successfully. When configuring cat.py with:

[kai_cat]
topic=/collection/metop/avhrr
command=kai -i {input_files} -o {output_file}
output_file_pattern=/data/pytroll/TMP/avhrr/avhrr_{platform_name}_{start_time:%Y%m%d%H%M%S}_{end_time:%Y%m%d%H%M%S}
publish_topic=/cat/metop/avhrr
nameservers=localhost
subscriber_nameserver = localhost

then the command run with kai may be, for example:

kai -i /data/pytroll/IN/HRPT/AVHR_HRP_00_M03_20210119081730Z_20210119082030Z_N_O_20210119081730Z /data/pytroll/IN/HRPT/AVHR_HRP_00_M03_20210119082030Z_20210119082330Z_N_O_20210119082030Z /data/pytroll/IN/HRPT/AVHR_HRP_00_M03_20210119082330Z_20210119082630Z_N_O_20210119082331Z /data/pytroll/IN/HRPT/AVHR_HRP_00_M03_20210119082630Z_20210119082930Z_N_O_20210119082630Z /data/pytroll/IN/HRPT/AVHR_HRP_00_M03_20210119082930Z_20210119083123Z_N_O_20210119082930Z -o /data/pytroll/TMP/avhrr/avhrr_Metop-C_20210119081730_20210119083123

If the output directory /data/pytroll/TMP/avhrr does not exist, the actual kai command will finish with a message to stderr:

Cannot open file '/data/pytroll/TMP/avhrr/avhrr_Metop-C_20210119081730_20210119083123` mode w+

and exit with exit code 1. The cat log however only monitors stdout, happily noting:

[INFO: 2021-01-19 08:31:27 : cat] b'Writing 61469828 bytes (4784 records) to /data/pytroll/TMP/avhrr/avhrr_Metop-C_20210119081730_20210119083123'
[INFO: 2021-01-19 08:31:27 : cat] Sending pytroll://file/cat/avhrr file [email protected] 2021-01-19T08:31:27.797434 v1.01 application/json {"path": "", "platform_name": "Metop-C", "start_time": "2021-01-19T08:17:30", "end_time": "2021-01-19T08:31:23", "processing_time": "2021-01-19T08:17:30", "sensor": ["avhrr/3"], "orig_platform_name": "M03", "collection_area_id": "eurol", "filename": "avhrr_Metop-C_20210119081730_20210119083123", "uri": "/data/pytroll/TMP/avhrr/avhrr_Metop-C_20210119081730_20210119083123"}

Expected behaviour

I expect that cat.py verifies that kai or whatever command was used exits successfully (exit code 0). If it doesn't, it should copy the stderr output for the process to the logfile and it should not send a posttroll message.

Actual behaviour

In reality cat.py monitors only stdout, appears to assume everything is fine, and sends a posttroll message in any case. It's up to the next program in the chain to realise the expected input file is absent.
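
A minimal sketch of the check described under "Expected behaviour", assuming cat.py runs the configured command through subprocess; this is not the current implementation:

import logging
import subprocess

logger = logging.getLogger("cat")


def run_command(cmd):
    """Run cmd, log its output, and return True only if it exited with code 0."""
    proc = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    if proc.stdout:
        logger.info(proc.stdout.strip())
    if proc.returncode != 0:
        logger.error("Command failed with exit code %d: %s",
                     proc.returncode, proc.stderr.strip())
        return False
    return True

# The posttroll message would then only be sent when run_command(...) returns True.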

ValueError: can't have unbuffered text I/O with python-dæmon 2.2.4

Starting the nameserver in dæmon mode fails with Python 3.8.6, pytroll-collectors-0.10.0, and python-dæmon 2.2.4:

$ nameserver -d start -l /tmp/test.log
Traceback (most recent call last):
  File "/data/gholl/miniconda3/envs/py38b/bin/nameserver", line 129, in <module>
    angel = daemon.runner.DaemonRunner(APP)
  File "/data/gholl/miniconda3/envs/py38b/lib/python3.8/site-packages/daemon/runner.py", line 114, in __init__
    self._open_streams_from_app_stream_paths(app)
  File "/data/gholl/miniconda3/envs/py38b/lib/python3.8/site-packages/daemon/runner.py", line 134, in _open_streams_from_app_stream_paths
    self.daemon_context.stderr = open(
ValueError: can't have unbuffered text I/O

This happens whether logging to a file or not.

Could this be a Python 2 / Python 3 issue?

SegmentGatherer: configuration of time_name via ini not possible

The configuration of time_name via an ini configuration file is no longer possible, e.g.:
time_name = xxx_time

Doing the same in a YAML file works.

It seems that the parameters from ini files are copied in the method ini_to_dict(); unfortunately, time_name is missing there.
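
If that is indeed the cause, the fix is presumably to include time_name among the keys copied by ini_to_dict(); a rough sketch, with an illustrative key list that does not claim to match the real function:

def ini_to_dict(config, section):
    """Copy the known options of one ini section into a plain dict (illustrative)."""
    conf = {}
    for key in ("pattern", "critical_files", "wanted_files", "all_files",
                "timeliness", "time_name"):  # time_name added to the copied keys
        if config.has_option(section, key):
            conf[key] = config.get(section, key)
    return conf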

Use asyncio for gatherers

The current implementation of the geographic and segment gatherers makes use of threading to address the concurrency needs of the algorithms. While this works fine, I believe that using asyncio with a single event loop and coroutines would allow the code to be made much simpler, more maintainable and more testable.
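
A tiny sketch of the idea, with granules delivered through an asyncio queue and the timeout expressed via asyncio.wait_for instead of a timer thread (names are illustrative):

import asyncio


async def gather(queue, timeout):
    """Collect granule metadata from queue until timeout seconds of silence."""
    granules = []
    while True:
        try:
            granules.append(await asyncio.wait_for(queue.get(), timeout))
        except asyncio.TimeoutError:
            break
    return granules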
