mseed3-utils's Introduction

miniSEED3-Utils

This repository contains command-line utilities to validate and inspect miniSEED 3 files:

  • miniSEED 3 Validator
    • Validates a single miniSEED 3 file or a collection of them
  • mseed3-text
    • Prints the contents of a selected miniSEED 3 file in text format to the terminal
  • mseed3-json
    • Prints the contents of a selected miniSEED 3 file in JSON format to the terminal

Dependencies

  1. cmake >= 2.8.0
  2. libmseed >= 3.0
  3. WJElement > 1.3

NOTE: libmseed and WJElement are downloaded and built locally when make is run (see Build/Install)

Supported Platforms

  • Linux
  • macOS
  • Windows

Clone - Configure - Build - Install

Clone & Configure Project

  • git clone https://github.com/earthscope/mseed3-utils.git
  • cd mseed3-utils
  • mkdir build/
  • cd build/
  • Run cmake ..
    • To specify an install prefix, use: cmake -DCMAKE_INSTALL_PREFIX:PATH={user specified path} ..

Build/Install

  • Linux/MacOS

    • Run make. NOTE: an Internet connection is required to pull and build the supporting libraries (see Dependencies)
    • (optional) Run make install to install system-wide or into the location specified by -DCMAKE_INSTALL_PREFIX:PATH
  • Windows

    • Run MSBuild mseed3-utils.sln

miniSEED 3 Validator

Checks a miniSEED 3 file for:

  1. Valid fixed header
  2. Valid payload
  3. Valid extra headers, checked against a user-provided JSON schema (optional)

All information on the miniSEED file is printed to the terminal.

Usage:

Usage: ./mseed3-validator [options] infile(s)

     ## Options ##
     -h help    Display usage information
     -j json    Json schema
     -v verbose Verbosity level
     -d data    Print data payload
     -W         Option flag  *e.g* -W error,skip-payload
     -V version Print program version

mseed3-text

Prints the contents of a selected miniSEED file in text format to the terminal

Usage:

Program to print an miniSEED file in human readable format:

Usage: ./mseed3-text [options] infile(s)

     ## Options ##
     -h help    Display usage information
     -v verbose Verbosity level
     -d data    Print data payload
     -V version Print program version

mseed3-json

Prints the contents of a selected miniSEED file in JSON format to the terminal

Usage:

Program to print an miniSEED file in JSON format:

Usage: ./mseed3-json [options] infile(s)

     ## Options ##
     -h help    Display usage information
     -v verbose Verbosity level
     -d data    Print data payload
     -V version Print program version

mseed3-utils's People

Contributors

jeffreyleifer, chad-earthscope

mseed3-utils's Issues

cmake build system misdetects libmseed version

$ cmake .
Configuring xseed-utils version: 1.0.0
-- Found MSEED: /usr/local/lib/libmseed.dylib (found suitable version "3.0.0", minimum required is "3.0") 
-- 
...

but:

$ ls -l /usr/local/lib/libmseed.dylib
lrwxr-xr-x  1 chad  staff  21 Mar 24 19:26 /usr/local/lib/libmseed.dylib -> libmseed.2.19.5.dylib

However, when a build is actually triggered, that installed library is not used, so the build works. Something is wrong with the detection mechanism, and it is unclear whether the right version would be used if it were present on the system.
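The manual check above (following the symlink and reading the version out of the file name) can be sketched in a few lines. This is only an illustrative cross-check in Python, not how the cmake find module works; `library_version` and `satisfies_minimum` are hypothetical helper names, and the assumption is that the resolved file name carries a dotted version as in libmseed.2.19.5.dylib.

```python
import re


def library_version(filename):
    """Extract a dotted version such as (2, 19, 5) from a library file name.

    Returns None when the name carries no dotted version (e.g. libmseed.dylib).
    """
    match = re.search(r"(\d+(?:\.\d+)+)", filename)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def satisfies_minimum(filename, minimum=(3, 0)):
    """True only when a version was found and it is at least `minimum`."""
    version = library_version(filename)
    return version is not None and version >= minimum


if __name__ == "__main__":
    # The symlink target reported by `ls -l` in this issue.
    print(library_version("libmseed.2.19.5.dylib"))
    print(satisfies_minimum("libmseed.2.19.5.dylib"))
```

In practice the name passed in would be the symlink target, for example the basename of os.path.realpath("/usr/local/lib/libmseed.dylib").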

Need documentation of the xseed-validator -W option flags

The -W command line option for xseed-validator is documented in the usage message as:

	 -W         Option flag  *e.g* -W error,skip-payload

It remains undocumented what other "flags" are supported.

The source file src/xseed-validator/parse_warn_options.c, which appears to deal with these flags, has more options. Based on the amount of commented-out and incomplete code elsewhere, there is little confidence that these flags are honored and review is needed in addition to documentation.

flags flipped order

I think mseed2text flips the order of the bits in the flags.

For example, using the reference data file reference-sinusoid-steim2.xseed, mseed2json gives:

{
    "SID": "XFDSN:XX_TEST__L_H_Z",
    "RecordLength": 956,
    "FormatVersion": 3,
    "Flags": {
        "RawUInt8": 4,
        "ClockLocked": true
    },
    "StartTime": "2012-01-01T00:00:00.000000000Z",
    "EncodingFormat": 11,
    "SampleRate": 1,
    "SampleCount": 500,
    "CRC": "0x5CFF0548",
    "PublicationVersion": 1,
    "ExtraLength": 0,
    "DataLength": 896
}

But mseed2text gives:

XFDSN:XX_TEST__L_H_Z, version 1, 956 bytes (format: 3)
             start time: 2012,001,00:00:00.000000
      number of samples: 500
       sample rate (Hz): 1
                  flags: [00100000] 8 bits
                         [Bit 2] Clock locked
                    CRC: 0x5CFF0548
    extra header length: 0 bytes
    data payload length: 896 bytes
       payload encoding: STEIM-2 integer compression (val: 11)

But 4 in binary is 00000100, not 00100000; i.e., bit 2 is the third from the right, not the third from the left.
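The expected rendering is easy to check with a small sketch. This is illustrative Python, not the tool's C code; `flag_bits` and `set_bits` are hypothetical helper names.

```python
def flag_bits(value):
    """Render an 8-bit flag byte most significant bit first,
    so bit 0 is the rightmost character."""
    if not 0 <= value <= 0xFF:
        raise ValueError("flags must fit in one byte")
    return format(value, "08b")


def set_bits(value):
    """Return the indices of the set bits, counting bit 0 as least significant."""
    return [bit for bit in range(8) if value & (1 << bit)]


if __name__ == "__main__":
    print(flag_bits(4))  # 00000100
    print(set_bits(4))   # [2]
```

Reversing the string from `flag_bits(4)` gives 00100000, which matches the output reported above and suggests the bits are being printed least significant first.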

seg fault on mseed3-validator

git clone git@github.com:EarthScope/mseed3-utils.git
cd mseed3-utils
mkdir build
cd build
cmake ..
make

copy reference-sinusoid-int32.mseed3 from miniseed repo

bin/mseed3-validator reference-sinusoid-int32.mseed3
zsh: segmentation fault  bin/mseed3-validator reference-sinusoid-int32.mseed3

This is on an M3 Max MacBook, so the new chip may be part of the problem.

Oddly, mseed3-text works just fine:

bin/mseed3-text reference-sinusoid-int32.mseed3
FDSN:XX_TEST__V_H_Z, version 1, 2059 bytes (format: 3)
             start time: 2022-06-05T20:32:38.123456789Z (156)
      number of samples: 500
       sample rate (Hz): 0.1
                  flags: [00000100] 8 bits
                         [Bit 2] Clock locked
                    CRC: 0x37223EA2
    extra header length: 0 bytes
    data payload length: 2000 bytes
       payload encoding: 32-bit integer (val: 3)

mseed3-json invalid on multiple records

An mseed3 file with multiple records should still generate valid JSON.

Example using reference-data:

cat reference-sinusoid-int32.xseed reference-sinusoid-int16.xseed > combine.xseed
mseed2json combine.xseed                                      
{
    "SID": "XFDSN:XX_TEST__L_H_Z",
    "RecordLength": 2060,
    "FormatVersion": 3,
    "Flags": {
        "RawUInt8": 4,
        "ClockLocked": true
    },
    "StartTime": "2012-01-01T00:00:00.000000000Z",
    "EncodingFormat": 3,
    "SampleRate": 1,
    "SampleCount": 500,
    "CRC": "0x727BF4BD",
    "PublicationVersion": 1,
    "ExtraLength": 0,
    "DataLength": 2000
},{
    "SID": "XFDSN:XX_TEST__L_H_Z",
    "RecordLength": 860,
    "FormatVersion": 3,
    "Flags": {
        "RawUInt8": 4,
        "ClockLocked": true
    },
    "StartTime": "2012-01-01T00:00:00.000000000Z",
    "EncodingFormat": 1,
    "SampleRate": 1,
    "SampleCount": 400,
    "CRC": "0x106EAFA5",
    "PublicationVersion": 1,
    "ExtraLength": 0,
    "DataLength": 800
}

It looks like this was meant to be a JSON array at the top level, but the opening and closing brackets [ ] are missing.
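A quick way to confirm the problem and the proposed fix is to feed both forms to a JSON parser. The sketch below uses Python's standard json module with two abbreviated records; `is_valid_json` and `as_json_array` are hypothetical helper names, not part of the tools.

```python
import json


def is_valid_json(text):
    """True when `text` parses as a single JSON document."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


def as_json_array(records):
    """Serialize a list of record dictionaries as one top-level JSON array."""
    return json.dumps(records, indent=4)


# Two abbreviated records, joined with a comma the way the current output is.
first = {"SID": "XFDSN:XX_TEST__L_H_Z", "RecordLength": 2060}
second = {"SID": "XFDSN:XX_TEST__L_H_Z", "RecordLength": 860}
comma_joined = json.dumps(first) + "," + json.dumps(second)

if __name__ == "__main__":
    print(is_valid_json(comma_joined))              # False
    print(is_valid_json("[" + comma_joined + "]"))  # True
```

Wrapping the existing comma-separated output in [ and ] is enough to make it parse, so the change on the tool side is likely small.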

Rename program source files appropriately instead of main.c

The main() functions for xseed-validator, xseed2json, and xseed2text are each contained in a file named main.c. Unless there is a really good reason for this super generic, identical name, these files should be renamed to the actual name of the program.

Heaven help the developer that has more than one of the main.c files open in an editor and confuses them. Nope, hasn't happened to me, not ever, nope, not five minutes ago either.

mseed exec names

Just a thought, but since this is dealing with mseed3, it is a little confusing to have some commands named like mseed2json and some like mseed3-validator. Perhaps mseed3-json and mseed3-text. Otherwise they look like they might deal with miniseed version 2.

xseed2json incorrectly maps record details to JSON values

Somewhere in xseed2json/main.c:

    ierr = json_object_set_number (jsonObj, "reclen", msr->reclen);

    if (ierr == JSONFailure)
    {
      printf ("Something went wrong parsing to JSON : reclen");
      return EXIT_FAILURE;
    }

    ierr = json_object_set_number (jsonObj, "reclen", msr->formatversion);

    if (ierr == JSONFailure)
    {
      printf ("Something went wrong parsing to JSON : formatversion");
      return EXIT_FAILURE;
    }

    ierr = json_object_set_number (jsonObj, "reclen", msr->formatversion);

    if (ierr == JSONFailure)
    {
      printf ("Something went wrong parsing to JSON : formatversion");
      return EXIT_FAILURE;
    }
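What the snippet above does is write three values under the same "reclen" key: the record length first, then the format version twice. The effect can be sketched with a plain dictionary, assuming the JSON library replaces the value when an existing key is set again. The function names are hypothetical, and the "formatversion" key in the intended mapping is an assumption taken from the error message string.

```python
def buggy_mapping(reclen, formatversion):
    """Mirror the key usage of the quoted C snippet."""
    out = {}
    out["reclen"] = reclen
    out["reclen"] = formatversion  # wrong key: overwrites the record length
    out["reclen"] = formatversion  # duplicated block: same effect again
    return out


def intended_mapping(reclen, formatversion):
    """Each record field gets its own key, set exactly once."""
    return {"reclen": reclen, "formatversion": formatversion}


if __name__ == "__main__":
    print(buggy_mapping(956, 3))     # {'reclen': 3}
    print(intended_mapping(956, 3))  # {'reclen': 956, 'formatversion': 3}
```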

time formatting

It would be useful if the time output in mseed2text printed nanoseconds instead of microseconds.

Also, perhaps the output time should be more ISO friendly, like YYYY-DDD instead of YYYY,DDD. See "Ordinal Dates" in ISO 8601.

Maybe add Z to make it clear that the date is UTC.

Change:

             start time: 2012,001,00:00:00.000000

to

             start time: 2012-001T00:00:00.000000000Z

Note mseed2json already does some of this:

    "StartTime": "2012-01-01T00:00:00.000000000Z",

but might be good if it used day of year instead of month day since that is what is stored in the header?
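For reference, both renderings are simple to produce from year, day of year, and nanosecond fields. The sketch below is illustrative Python rather than the tools' C code; `ordinal_time` and `calendar_date` are hypothetical helper names.

```python
from datetime import date, timedelta


def ordinal_time(year, day_of_year, hour, minute, second, nanosecond):
    """Format an ISO 8601 ordinal-date timestamp with nanoseconds and a Z suffix."""
    return (f"{year:04d}-{day_of_year:03d}"
            f"T{hour:02d}:{minute:02d}:{second:02d}.{nanosecond:09d}Z")


def calendar_date(year, day_of_year):
    """Convert year and day of year to a YYYY-MM-DD calendar date."""
    return (date(year, 1, 1) + timedelta(days=day_of_year - 1)).isoformat()


if __name__ == "__main__":
    print(ordinal_time(2012, 1, 0, 0, 0, 0))  # 2012-001T00:00:00.000000000Z
    print(calendar_date(2012, 1))             # 2012-01-01
```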

Check payload from read data, instead of re-reading

Currently the payload checks are performed by re-reading and re-parsing the entire file being checked (in check_file.c), i.e. all files are read twice.

This should be improved by performing the payload check for each record while it is already read into memory, in the range noted by:

//TODO check payload via buffer, for now payloads are checked via libmseed after all headers are checked
...
//TODO validate payload using the buffer contains
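A single-pass approach could look roughly like the sketch below: parse each fixed header from the buffer already in memory, then hand the payload slice of that same buffer to the payload check. This is illustrative Python, not the validator's C code. The 40-byte little-endian layout is my reading of the miniSEED 3 specification, `iter_records` is a hypothetical name, and neither the CRC nor the payload decoding is shown.

```python
import struct

# Fixed section of a miniSEED 3 record header (little-endian, 40 bytes):
# "MS", version, flags, nanosecond, year, day of year, hour, minute, second,
# encoding, sample rate, sample count, CRC, publication version,
# SID length, extra header length, data payload length.
FIXED = struct.Struct("<2sBBIHHBBBBdIIBBHI")


def iter_records(buffer):
    """Yield (sid, encoding, payload) for each record in an in-memory buffer."""
    offset = 0
    while offset < len(buffer):
        if len(buffer) - offset < FIXED.size:
            raise ValueError(f"truncated fixed header at offset {offset}")
        (magic, version, _flags, _nsec, _year, _doy, _hour, _minute, _second,
         encoding, _rate, _nsamples, _crc, _pubver,
         sid_len, extra_len, data_len) = FIXED.unpack_from(buffer, offset)
        if magic != b"MS" or version != 3:
            raise ValueError(f"not a miniSEED 3 record at offset {offset}")
        sid_start = offset + FIXED.size
        data_start = sid_start + sid_len + extra_len
        end = data_start + data_len
        if end > len(buffer):
            raise ValueError(f"truncated record at offset {offset}")
        sid = buffer[sid_start:sid_start + sid_len].decode("ascii")
        # The payload is a slice of the buffer already read; no second file read.
        yield sid, encoding, buffer[data_start:end]
        offset = end
```

The record lengths quoted elsewhere in these issues are consistent with this layout: 40 bytes of fixed header plus a 20-character SID plus 896 bytes of payload gives the 956-byte record shown for reference-sinusoid-steim2.xseed.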

xseed-validator does not work with combined JSON Schema via $ref usage

The JSON Schema language allows $ref values to be "includes" of other schema components, either in the same schema structure or separate files. This is used internally in the ExtraHeaders-FDSN.schema.json schema file, so it works within a file. The intended method of combining FDSN and non-FDSN schemas is to create a simple schema document that combines all schemas using the $ref mechanism. xseed-validator does not work in this case.

The reference schema area contains an FDSN schema, some example schemas for Manufacturer123 and OperatorXYZ, and an all-schemas.json file that combines all of them. Using this file as the schema for validation should work for validating all the extra headers against all schemas.

The all-schemas.json file simply contains:

{
  "allOf": [
    {
      "$ref": "ExtraHeaders-FDSN.schema.json"
    },
    {
      "$ref": "ExtraHeaders-Manufacturer123.schema.json"
    },
    {
      "$ref": "ExtraHeaders-OperatorXYZ.schema.json"
    }
  ]
}

Failure to report Extra Header validation failure as a validation failure

Validating a single record that includes an un-allowed extra header:

$ xseed-validator -j ../ExtraHeaders/ExtraHeaders-FDSN.schema.json reference-detectiononly.xseed.invalidEH

reports:


**********xseed-validator STARTING validation**********
---------------------------------------------------------

Reading file reference-detectiononly.xseed.invalidEH
Error in Schema Validation- Detection[0]: extra property 'DetectionWave' found.
*** Completed processing 1 records ***
xseed-validator RESULT - file reference-detectiononly.xseed.invalidEH is VALID xSEED

----------------------------------------------------------
*xseed-validator COMPLETE - 1 record(s) processed in 1 file(s)*
****xseed-validator SUCCESSFULLY validated 1 file(s)****

The summary messages quite (EMPHATICALLY) indicate that validation was successful even though a validation error was reported for the file.

When the validator is given a non-xSEED file (to generate a validation error), the summary does report a failure, so the validator knows the difference:

*xseed-validator COMPLETE - 0 record(s) processed in 1 file(s)*
*xseed-validator FAILED to validate 1 file(s) out of the 1 file(s) processed*
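The expected rule is that any reported error, including a schema error, makes the file invalid and the run unsuccessful. A minimal sketch of that rule, in illustrative Python with hypothetical names (`file_status`, `exit_code`), not the validator's actual logic:

```python
def file_status(records_processed, errors):
    """A file is VALID only if at least one record was read and
    no error of any kind (header, payload, or schema) was reported."""
    return "VALID" if records_processed > 0 and not errors else "INVALID"


def exit_code(statuses):
    """Return 0 when every file is VALID, otherwise 1."""
    return 0 if all(status == "VALID" for status in statuses) else 1


if __name__ == "__main__":
    schema_error = "Detection[0]: extra property 'DetectionWave' found."
    print(file_status(1, [schema_error]))  # INVALID
    print(file_status(0, []))              # INVALID
```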

Default verbosity is too verbose, needs to be a tight summary

The default (minimum) verbosity is too verbose, containing at least three lines for each file. This validator should be able to run on thousands of files and millions of records and this level of output quickly becomes overwhelming. The validator also produces some header and footer bracketing:

**********xseed-validator STARTING validation**********
---------------------------------------------------------

...

----------------------------------------------------------
*xseed-validator COMPLETE - 11 record(s) processed in 11 file(s)*
****xseed-validator SUCCESSFULLY validated 11 file(s)****

All but the last two lines are useless and the copious asterisks are visual noise. Perhaps the intention was to create parsable output, which could be useful, but this could be achieved with much less noise.

The current verbosity levels should all be moved up one, e.g. default becomes -v, and -v becomes -vv, etc.

The default minimum verbosity should only include the filename and validation errors when they occur, i.e. no output for files that are valid, and the final status of all files with a file count.
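The proposed default could be sketched as follows. This is illustrative Python under the assumption that per-file errors have already been collected; `report` and the exact wording of the summary line are hypothetical.

```python
def report(results, verbosity=0):
    """Build the output lines for a validation run.

    `results` maps each file name to its list of validation error strings.
    At verbosity 0 only failing files and the final count are reported;
    at verbosity 1 or higher, valid files are listed as well.
    """
    lines = []
    for name, errors in results.items():
        if errors:
            lines.extend(f"{name}: {error}" for error in errors)
        elif verbosity >= 1:
            lines.append(f"{name}: valid")
    failed = sum(1 for errors in results.values() if errors)
    total = len(results)
    lines.append(f"{total - failed} of {total} file(s) valid, {failed} failed")
    return lines


if __name__ == "__main__":
    for line in report({"a.ms3": [], "b.ms3": ["bad CRC"]}):
        print(line)
```

With this shape, a run over thousands of valid files prints a single line by default.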

Fail hard on file not found

bin/mseed3-validator -v file_that_does_not_exist.ms3 ~/Downloads/seismograms/CO.BIRD.00.HHZ_2021-11-1_2.ms3 ~/Downloads/seismograms/CO.BIRD.00.HHZ_2021-11-1_3.ms3
Error! Cannot read file: file_that_does_not_exist.ms3, File Not Found! 
Reading file /Users/crotwell/Downloads/seismograms/CO.BIRD.00.HHZ_2021-11-1_2.ms3
mseed3-validator RESULT - file /Users/crotwell/Downloads/seismograms/CO.BIRD.00.HHZ_2021-11-1_2.ms3 is VALID miniSEED 3
Reading file /Users/crotwell/Downloads/seismograms/CO.BIRD.00.HHZ_2021-11-1_3.ms3
mseed3-validator RESULT - file /Users/crotwell/Downloads/seismograms/CO.BIRD.00.HHZ_2021-11-1_3.ms3 is VALID miniSEED 3

----------------------------------------------------------
mseed3-validator COMPLETE - 6 record(s) processed in 2 file(s)

It might be better to fail hard when a file does not exist instead of continuing to process. Otherwise the error can be easy to miss in all the other text.

Alternatively, maybe change the COMPLETE message to FAILED. Basically the last output line probably should give an indication if there was a failure of any type.

I see the -W error arg, but even that doesn't halt on file not found.
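Both alternatives can be sketched briefly: check the inputs up front so a missing file can stop the run, or carry the missing files into the last output line. This is illustrative Python, not the validator's code; `split_inputs` and `final_line` are hypothetical names, and the FAILED wording is only a suggestion.

```python
import os


def split_inputs(paths, exists=os.path.isfile):
    """Separate existing files from paths that cannot be found.

    `exists` is injectable so the logic can be exercised without touching disk.
    """
    found = [path for path in paths if exists(path)]
    missing = [path for path in paths if not exists(path)]
    return found, missing


def final_line(records, files, missing):
    """Last output line: FAILED if any input was missing, COMPLETE otherwise."""
    status = "FAILED" if missing else "COMPLETE"
    line = f"mseed3-validator {status} - {records} record(s) processed in {files} file(s)"
    if missing:
        line += f", {len(missing)} file(s) not found"
    return line
```

For the fail-hard variant, a caller would exit with a non-zero status as soon as `split_inputs` returns a non-empty `missing` list, before reading any file.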
