
datastore's Issues

Move PySPARC data with bad timestamps to test stations

Stations 102 and 202 were using PySPARC for a period in which it generated bad timestamps.
This is bad data because we can never recover the correct timestamp for most of the data (wrong units for Quantization Error).

102: 1 sep 2014 - 21 sep 2015
202: 19 dec 2014 - 21 sep 2015
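
Moving the affected data could be done per daily raw data file by relocating the station group to a test station group. A minimal sketch, assuming the usual /hisparc/<cluster>/station_<n> layout; the cluster name, file path and target test station number below are illustrative assumptions:

import tables

# Hedged sketch: move the data of station 102 in one daily file to a
# hypothetical test station group, so it no longer appears under the real
# station. Cluster name, file path and target number are assumptions.
SRC_GROUP = '/hisparc/cluster_amsterdam/station_102'
DST_PARENT = '/hisparc/cluster_amsterdam'
DST_NAME = 'station_99102'  # hypothetical test station number

with tables.open_file('/databases/frome/2015/1/2015_1_1.h5', 'r+') as f:
    f.move_node(SRC_GROUP, DST_PARENT, DST_NAME)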

Move GPS test data from stations 501 and 502 to a test station

For a GPS offset test, stations 501 and 502 were triggered simultaneously using a pulse generator. This data is polluting the real cosmic-ray data. It is easily identified by the trace signals (no pulses, external trigger) and the interval between events (250 ms). Moreover, we performed the tests ourselves, so we know the dates: from 2011/10/21 up to and including 2011/10/31.
We should move this data away from these stations and store it under test stations. Stations 94 and 95 are obvious candidates, since those are used for similar tests and started data taking after the 501-502 tests. We should ensure that 94/95 also receive configs from 501/502 to get the right GPS coordinates.
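
A minimal sketch of how the pulse-generator events could be identified in one daily raw data file, by looking for the ~250 ms spacing mentioned above. The file path, group path and column names are assumptions and should be checked against the actual layout:

import numpy as np
import tables

with tables.open_file('/databases/frome/2011/10/2011_10_21.h5', 'r') as f:
    events = f.get_node('/hisparc/cluster_amsterdam/station_501', 'events')
    # Combine GPS timestamp and nanosecond part into seconds.
    t = events.col('timestamp') + 1e-9 * events.col('nanoseconds')
    dt = np.diff(np.sort(t))
    # Fraction of intervals close to the 250 ms pulse-generator period.
    print('~250 ms intervals:', np.isclose(dt, 0.25, atol=0.01).mean())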

Properly migrate HiSPARC data from /databases/kascade to the datastore

In /databases/kascade there is still about 2 years worth of event data from the station at KASCADE. This should be migrated to the datastore, which already contains some data from the KASCADE station. Note that the current cluster name (karlsruhe) and station number (70001) are different from how they are stored in unmigrated kascade files (kascade, 601).

The directory also contains the KASCADE experiment data (HiSparc-new.dat.gz). I think that at least an HDF5 version should also be stored for easier access (using sapphire.kascade.StoreKascadeData).
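
A rough sketch of storing an HDF5 version of the experiment data; the exact constructor arguments and method name of StoreKascadeData used below are assumptions and should be checked against the SAPPHiRE source/documentation:

import tables
from sapphire.kascade import StoreKascadeData

# Assumed usage: an open, writable PyTables file and the path to the raw
# KASCADE data file; verify the actual signature in SAPPHiRE before use.
with tables.open_file('kascade.h5', 'w') as data:
    kascade = StoreKascadeData(data, 'HiSparc-new.dat.gz')
    kascade.read_and_store_data()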

Restarting httpd (change config) sometimes leaves old threads running

I added two new stations in the publicdb: 521 and 522.
When I converted 501 to 521, the station sometimes got a 206 return code, which means unknown station number. This error was also logged on frome, in a thread with a process ID different from the other threads running at the time, meaning that thread was not properly killed when hisparc-datastore tried to reload httpd.

Update repo on frome

I am going to update the datastore on frome to the latest master.
After that I will merge the lightning branch and update again.

  • Check if current code on frome is equal to master.
    Only one modification, which is also in master 6b0bdcf
  • Check location and values of config/ini/scripts. On frome, config.ini and writer_app.py are in the top level of the datastore repo. The application.wsgi is renamed to datastore.wsgi and lives one level higher (outside the repo). The files in the top level are now ignored by git 51dc9e6
  • Find purpose of the unmaintained_datastore dir. This dir contained some differences from the datastore dir: it was still a bzr repo, and the only other difference of note (diff -r unmaintained_datastore datastore) was the eventwarehouse migration log mig_ew.log, which I moved to /var/log/hisparc/
  • Backup current live repo. cp -rp datastore datastore_backup
  • Switch remote to HTTPS from SSH. git remote set-url origin https://github.com/HiSPARC/datastore.git
  • Fetch latest master. git fetch
  • Stop writer and httpd. Stop httpd with service httpd stop and stop the writer by attaching the corresponding screen session and interrupting it (Ctrl+C)
  • Checkout latest version. git reset --hard origin/master
  • Restart httpd and writer. service httpd start and start the writer again in the screen using sudo -u www python /var/www/wsgi-bin/datastore/writer_app.py
  • Does everything still work? (check logs and raw data files)
  • Stop httpd/writer
  • Checkout lightning branch. git reset --hard origin/lightning
  • Start httpd/writer
  • Does everything still work? (check logs and raw data files, and new data types; singles/satellites)
  • Merge lightning branch into master
  • Checkout master and restart httpd/writer

Bad trigger config for 502 from 2012-6-8 to 2012-10-29

On 2012-5-16 station 502 transitioned to HiSPARC III electronics.
The baseline was automatically calibrated to 30 ADC by the DAQ.
It was soon found that the thresholds used in the DAQ were incorrect:
the threshold conversion from mV to ADC assumed a baseline of 200 ADC.
On 2012-6-8 the thresholds were changed to be close to the normal trigger levels.

The thresholds are not properly accounted for in the ESD.
In the period mentioned in the title about 70% of the trigger time reconstructions failed.

I am uncertain whether this data was reprocessed after SAPPHiRE was updated to properly take different trigger settings into account.
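
For illustration, a small sketch of the mV-to-ADC threshold conversion described above. The conversion factor is an assumed placeholder; the point is that a hard-coded 200 ADC baseline shifts every threshold by 170 ADC counts relative to the actual 30 ADC baseline:

MV_PER_ADC = 0.57  # assumed conversion factor, check the DAQ source

def threshold_adc(threshold_mv, baseline_adc):
    """Convert a threshold in mV to an absolute ADC value."""
    return int(round(baseline_adc + threshold_mv / MV_PER_ADC))

# Same mV threshold, wrong vs correct baseline:
wrong = threshold_adc(30, baseline_adc=200)
right = threshold_adc(30, baseline_adc=30)
print(wrong - right)  # 170 ADC counts too high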

Remove excessive number of configs from station 4 and 91

$ h5ls /databases/frome/2017/4/2017_4_15.h5/hisparc/cluster_amsterdam/station_4
blobs                    Dataset {799871/Inf}
config                   Dataset {248013/Inf}
…
$ h5ls /databases/frome/2017/4/2017_4_12.h5/hisparc/cluster_amsterdam/station_91
blobs                    Dataset {399080/Inf}
config                   Dataset {121744/Inf}
…

The public database seems to crash while processing these configs (hisparc-update), or while trying to render the config data (uwsgi). Those configs should be removed (keeping perhaps only the first and/or last?).

The blobs could also be updated, but they do not seem to take a lot of extra space. Removing the config blob data entries would mean that other blob indexes need to be updated as well, so it is easier to simply leave that data.
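
A minimal sketch of trimming the config table in one daily file while keeping the first and last entry, leaving the blob data untouched as discussed above; the file and group paths are taken from the h5ls output:

import tables

with tables.open_file('/databases/frome/2017/4/2017_4_15.h5', 'r+') as f:
    config = f.get_node('/hisparc/cluster_amsterdam/station_4', 'config')
    if config.nrows > 2:
        # Keep the first and last row, drop everything in between.
        config.remove_rows(1, config.nrows - 1)
        config.flush()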

Improved data integrity checking

Given the occurrences of bad data (HiSPARC/publicdb#63), some of which I suspect are due to data corruption, while others may just be due to migration issues:
We could activate the fletcher32 option for PyTables. This means that checksums will be computed for all data, which will help ensure data integrity.

Additionally, we could enable Blosc data compression to possibly save some space and speed up data readout (making I/O less of a bottleneck).
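
A minimal sketch of the PyTables filters this would amount to, to be passed when the writer creates new tables and arrays (the file, group, table name and description below are illustrative):

import tables

# Fletcher-32 checksums for integrity, Blosc compression to save space.
filters = tables.Filters(complevel=5, complib='blosc', fletcher32=True)

class Event(tables.IsDescription):
    timestamp = tables.UInt32Col()
    nanoseconds = tables.UInt32Col()

with tables.open_file('example.h5', 'w') as f:
    group = f.create_group('/', 'station_x', 'Example station')
    f.create_table(group, 'events', Event, filters=filters)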

GPS WNRO: Apr 7, 2019

The day you knew would come...

Apr 7, 2019 is the second GPS WNRO (week number rollover). On this day the 10-bit GPS week number overflowed from 1023 back to 0. GPS week 0 started on Jan 6, 1980. The full GPS week number of Apr 7, 2019 is 2048.

Our Trimble Resolution T GPS clocks use the start of week number 2048 (Apr 7, 2019 0:00) as the default time after a GPS cold start (no signal acquired). The DAQ will happily send events to the datastore even when no GPS signal has been acquired. Events from Apr 7, 2019 are thus very suspicious. Now that Apr 7, 2019 has passed, we have verified that the GPS units still use Apr 7, 2019 as the default time after a cold start.
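
For reference, a small sketch of the week number arithmetic: Apr 7, 2019 is the start of full GPS week 2048, which a 10-bit receiver reports as week 0:

from datetime import date

GPS_EPOCH = date(1980, 1, 6)  # start of GPS week 0

def gps_week(d):
    """Full (un-rolled-over) GPS week number for a given date."""
    return (d - GPS_EPOCH).days // 7

week = gps_week(date(2019, 4, 7))
print(week, week % 1024)  # 2048 0 -> second 10-bit rollover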

Apr 7, 2019 used to be in the future... Not anymore.

Soon, we must mark events from Apr 7, 2019 as suspicious and not import them into the raw datastore, because they were most probably caused by a missing or bad GPS signal.

(Last week I already moved the "old" Apr 2019 suspicious data in the raw datastore.)

Remove bad GPS config for 3201

On 28 April 2011 station 3201 started using HiSPARC electronics previously used by 3202.
The wrong GPS position was submitted by 3201 (it actually belonged to 3202).
Later the correct position was sent in a config, but the wrong one was not removed from the datastore.

writer: Events get discarded on raw datastore IO errors

Today we discovered that the writer had erroneously been running as root lately, thus creating raw datastore HDF5 files owned by root:root. A few days ago frome was physically moved to a new location and the server was restarted. The writer was then restarted as user www (as specified in the docs).

The writer running as user www could not write to the raw data store. All data was dropped:

/var/log/hisparc/hisparc-log.writer
2018-09-24 00:00:05,473 writer.store_events[4758].store_event_list.ERROR: Cannot process event, discarding event (station: 8006)
2018-09-24 00:00:05,473 writer.store_events[4758].store_event_list.ERROR: Cannot process event, discarding event (station: 8006)

The code that generates this error:
https://github.com/HiSPARC/datastore/blob/master/writer/store_events.py#L127L148

When store_events.store_event_list is unsuccessful, we still remove the incoming pickled data from the partial folder!

Solution: only remove the pickle if process_data is successful: https://github.com/HiSPARC/datastore/blob/master/writer/writer_app.py#L73
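
A hedged sketch of what that fix could look like; the helper names below are hypothetical, not the actual functions in writer_app.py:

import logging
import os

logger = logging.getLogger('writer')

def process_file(path, load_pickle, store_event_list):
    """Only remove the incoming pickle after the data was stored successfully."""
    data = load_pickle(path)  # hypothetical helper
    try:
        store_event_list(data)  # may raise when the raw datastore is unwritable
    except Exception:
        logger.exception('Could not store %s, keeping pickle for a retry', path)
        return
    os.remove(path)  # only reached on success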

Bad data for stations on 2012-11-05

The data cannot be processed for the ESD.
HDF5 errors occur when reading it from the raw datastore.
Perhaps all data for the affected stations needs to be removed.

Errors for stations 10, 303, 504.
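
A minimal sketch for locating the unreadable tables in that day's raw data file; the file path is illustrative:

import tables

with tables.open_file('/databases/frome/2012/11/2012_11_5.h5', 'r') as f:
    for table in f.walk_nodes('/', 'Table'):
        try:
            table.read()
        except tables.HDF5ExtError as exc:
            print('Unreadable:', table._v_pathname, exc)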

Create an acceptance test for a datastore (VM)

(This is a proposal, not an issue)

Currently there is no real way to test the datastore (for example when commissioning a new frome) before it is put "live".

Risk:

  • Once we "accept" data from a station, it is deleted on the station side. If we mess up, we lose data.

We need a semi-automated test that we can run in a VM, but we cannot spend days and days creating it.

Proposal for a test:

What do I want to test:

  • Will we receive and store all data?
  • Is the data stored correctly?
    (Some uncaught exception in the WSGI app may cause data to be dropped, or the writer may mangle data)

How to create test data

  • Provision a datastore VM.
  • Add this test datastore to a real station, adding it as a local datastore.
  • Let it run for a few days and compare the raw data to the real datastore. (Comparing an entire day should be straightforward; code is already in sapphire.tests. See the sketch at the end of this issue.)
  • Let it run again: capture the received data (1 hour? 1 day?) and create test data (if real data, rebrand it as data from a test station), so we can test the "real" new frome before it is replaced.

It would be best to use a test station for all of this, but since we lack one (and keeping in mind that we cannot afford to spend much time on this) we can use one or more SPA stations.
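
A minimal sketch of the comparison step, assuming the usual /hisparc/<cluster>/station_<n>/events layout; the file paths are illustrative:

import tables

def event_counts(path):
    """Number of events per station group in one daily raw data file."""
    counts = {}
    with tables.open_file(path, 'r') as f:
        for table in f.walk_nodes('/hisparc', 'Table'):
            if table.name == 'events':
                counts[table._v_parent._v_pathname] = table.nrows
    return counts

test = event_counts('/tmp/test_datastore/2018/9/2018_9_24.h5')
real = event_counts('/databases/frome/2018/9/2018_9_24.h5')
for station in sorted(set(test) | set(real)):
    if test.get(station) != real.get(station):
        print(station, test.get(station), real.get(station))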

singles counts for missing detectors are 0 instead of -1 or -999

This might not be a datastore issue but rather a station software issue; since it affects raw data, however, I'm reporting it here.

Singles counts for two-detector stations (no slave) are apparently reported as zero (0) instead of -1 or -999 for the missing detectors.
IMHO, this is wrong.

Sample TSV downloaded from my publicdb test VM:

# Event Summary Data - Singles
#
# Station: (203) College Hageveld
#
(...)
#
2017-01-02	00:00:00	1483315200	162	90	122	67	0	0	0	0
2017-01-02	00:00:01	1483315201	139	80	151	98	0	0	0	0
2017-01-02	00:00:02	1483315202	145	86	164	104	0	0	0	0

I can fix this at import into the ESD by simply setting the missing channels to -999, but that still leaves the raw data ambiguous.
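
A minimal sketch of that ESD-import fix, assuming (as in the TSV sample above) that the last four singles columns belong to the missing slave detectors:

import numpy as np

def mask_missing_detectors(singles, n_detectors):
    """Set singles rates of absent detectors to -999 (singles: (N, 8) array)."""
    singles = np.asarray(singles, dtype=np.int64).copy()
    if n_detectors == 2:
        singles[:, 4:] = -999  # slave low/high rates for detectors 3 and 4
    return singles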

Then again, for singles data we might be safe in assuming 0 means "not connected" or "missing detector".

Possible solutions:

  • Fix in station software. Change existing raw data.
  • Fix in datastore. Change existing raw data.
  • Leave raw data as is. Change at import into ESD.
  • Leave raw data as is. Import into ESD unchanged. Leave slave channels out of histograms for missing slave.

Any thoughts?

Reduce logging or number of stored logs

frome now has the log level set to DEBUG for the writer and uwsgi processes. With the new PySPARC stations that upload every event individually, this causes a lot of extra log messages. We should set the log level to INFO. Alternatively, instead of reducing the log level, we could reduce the number of log files kept by the TimedRotatingFileHandler.

Recently frome was having issues accepting new events due to a full partition, caused by the large logs.
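
A minimal sketch of the two options, with an illustrative log file path: raise the level to INFO, and/or limit how many rotated files the TimedRotatingFileHandler keeps:

import logging
from logging.handlers import TimedRotatingFileHandler

handler = TimedRotatingFileHandler('/var/log/hisparc/hisparc-log.writer',
                                   when='midnight', backupCount=7)
handler.setFormatter(logging.Formatter('%(asctime)s %(name)s.%(levelname)s: %(message)s'))

logger = logging.getLogger('writer')
logger.addHandler(handler)
logger.setLevel(logging.INFO)  # instead of DEBUG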

Vagrant for setting up the datastore server

It would be very useful to be able to set up the datastore server with Vagrant.
That can help in case of a server crash or upgrade.
It would also allow us to easily test improvements to how the server runs,
such as adding supervisor to keep certain processes running.

frome is not a very complex server, so it should be relatively easy to set up (compared to publicdb).
Some notes on how it was originally set up can be found here: http://docs.hisparc.nl/servers/frome.html

Remove test data from station 1102

Station 1102 was tested at Nikhef before deployment.

The test data was unfortunately uploaded to the actual station.
Most importantly, the GPS location of Nikhef is now attached to the station.
This needs to be removed/migrated to a test station:
Example test data: http://data.hisparc.nl/show/stations/1102/2016/5/23/

Once the station is properly installed the date for real/good data will be established and the test data can easily be removed.

'Todo' items

Are these issues still relevant?

  • Undo support (transaction-like) for handling exceptions and recovering
    from them. Might remove the need for 'partial'?
  • Check for signed/unsigned issues client-side (-999 etc.).
