
nappy's Introduction

nappy

NASA Ames Package in Python.

Description

A NASA Ames I/O package: a Python input/output library for reading and writing NASA Ames file formats.

Version History

This repository was previously hosted on CEDA's Subversion repository. The first tagged release here is:

Reference doc

Software written with reference to:

  • 'Format Specification for Data Exchange' paper by Gaines and Hipskind (1998).
  • makeheader.f, a Fortran application to write NASA Ames headers, by Anne de Rudder (2000).
  • The Ames Python library developed by Bryan Lawrence (2003).

Conventions:

The basic NASAAmes class holds a dictionary called naVars, which contains all the variables described in the Gaines and Hipskind document. These are named using CAPITAL LETTERS for compliance with, and easy cross-reference to, that document.

For example, the number of independent variables is held in the instance variable:

self["NIV"]

Return values calculated within many functions/methods are often prefixed with 'rt', symbolising 'return'.

Usage documentation for nappy

Nappy provides the following functionality:

  1. A set of I/O routines for most NASA Ames File Format Indices (FFIs).
  2. An implicit checking facility for NASA Ames compliance - i.e. if the file is formatted incorrectly then a Python error will be raised. This checking facility will eventually be made explicit to report NASA Ames-specific errors.
  3. Methods to interrogate the contents of NASA Ames files (such as naFile.getVariable(), naFile.getIndependentVariables(), naFile.getMissingValue(), etc.).
  4. A set of conversion routines to and from NetCDF (for the most common FFIs) using the Xarray library. Note that any Xarray-compatible format can potentially be converted to NASA Ames via these routines. In order to use this feature your software should include nappy[netcdf_conversion] in its requirements (see the install example after this list).
  5. Some command line utilities for the format conversions in (4).
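
For example, to install nappy from PyPI together with the NetCDF conversion extra mentioned in (4) (a minimal sketch; the extra name is taken from the list above):

pip install "nappy[netcdf_conversion]"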

PYTHONPATH and import issues

The most common stumbling block for Python users is getting to grips with PYTHONPATH (or sys.path), the mechanism used to tell Python where it should look for modules and packages.

In order for your python scripts, modules and interactive sessions to find the nappy package you must make the directory visible by pointing to it in one of the following ways.

If the nappy directory has been installed at /my/nappy/location/nappy then the directory you need to tell python about is /my/nappy/location.

Option 1. Append your nappy path to the PYTHONPATH environment variable:

export PYTHONPATH=$PYTHONPATH:/my/nappy/location

Option 2: Append your nappy path once within python:

>>> import sys   # Imports the sys module
>>> sys.path.append("/my/nappy/location")   # Adds the directory to others
                                            # used when searching for a module.

You should then be able to import nappy with:

>>> import nappy

Option 3: Installing to a virtualenv

Download the requirements.txt file from the nappy github page into your working directory. Then run the following commands in the terminal.

On Unix or MacOS run:

python -m venv .nappy-env
source .nappy-env/bin/activate
pip install -r requirements.txt
pip install git+https://github.com/cedadev/nappy.git

On Windows PowerShell run:

python -m venv .nappy-env
& .nappy-env/Scripts/activate.ps1
pip install -r requirements.txt
pip install git+https://github.com/cedadev/nappy.git
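
In either case you can then check that the package imports correctly (a quick, hedged sanity check; nappy.__version__ is the version attribute defined in the package's __init__.py):

python -c "import nappy; print(nappy.__version__)"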

Usage Examples

The following examples give an overview of nappy usage:

Example 1: Opening and interrogating a NASA Ames file

Open the python interactive prompt:

python

Import the nappy package:

>>> import nappy

Open a NASA Ames file (reading the header only):

>>> myfile = nappy.openNAFile('some_nasa_ames_file.na')

Query the methods on the 'myfile' object:

>>> dir(myfile)

['A', 'AMISS', 'ANAME', 'ASCAL', 'DATE', 'DX', 'FFI', 'IVOL', 
'LENA', 'LENX', 'MNAME', 'NAUXC', 'NAUXV', 'NCOM', 'NIV', 
'NLHEAD', 'NNCOML', 'NSCOML', 'NV', 'NVOL', 'NVPM', 'NX', 
'NXDEF', 'ONAME', 'ORG', 'RDATE', 'SCOM', 'SNAME', 'V', 
'VMISS', 'VNAME', 'VSCAL', 'X', 'XNAME', '__doc__', 
'__getitem__', '__init__', '__module__', '_checkForBlankLines', 
'_normalizeIndVars', '_normalizedX', '_open', '_parseDictionary', 
'_readAuxVariablesHeaderSection', '_readCharAuxVariablesHeaderSection',
'_readComments', '_readCommonHeader', '_readData1', '_readData2', 
'_readLines', '_readNormalComments', '_readSpecialComments', 
'_readTopLine', '_readVariablesHeaderSection', '_setupArrays', 
'_writeAuxVariablesHeaderSection', '_writeComments', 
'_writeCommonHeader', '_writeVariablesHeaderSection', 
'auxToXarrayVariable', 'close', 'createXarrayAuxVariables', 
'createXarrayAxes', 'createXarrayVariables', 'file', 'filename', 
'floatFormat', 'getAuxMissingValue', 'getAuxScaleFactor', 
'getAuxVariable', 'getAuxVariables', 'getFFI', 'getFileDates', 
'getIndependentVariable', 'getIndependentVariables', 
'getMissingValue', 'getMission', 'getNADict', 'getNormalComments', 
'getNumHeaderLines', 'getOrg', 'getOrganisation', 'getOriginator', 
'getScaleFactor', 'getSource', 'getSpecialComments', 'getVariable', 
'getVariables', 'getVolumes', 'naDict', 'pattnBrackets', 'readData', 
'readHeader', 'delimiter', 'toXarrayAxis', 'toXarrayFile', 'toXarrayVariable', 
'writeData', 'writeHeader']

List the variables:

>>> myfile.getVariables()
[('Mean zonal wind', 'm/s', 200.0, 1.0)]

List the independent variables (or dimension axes):

>>> myfile.getIndependentVariables()
[('Altitude', 'km'), ('Latitude', 'degrees North')]

Get a dictionary of the file contents, keyed using the terms defined in the NASA Ames documentation:

>>> myfile.getNADict()
{'ASCAL': [1.0], 'NLHEAD': 43, 'NNCOML': 11, 'NCOM': 
['The files included in this data set illustrate each of the 9 NASA Ames file', 
'format indices (FFI). A detailed description of the NASA Ames format can be', 
'found on the Web site of the British Atmospheric Data Centre (BADC) at', 
'http://www.badc.rl.ac.uk/help/formats/NASA-Ames/', 
'E-mail contact: [email protected]', 
'Reference: S. E. Gaines and R. S. Hipskind, Format Specification for Data', 
'Exchange, Version 1.3, 1998. This work can be found at', 
'http://cloud1.arc.nasa.gov/solve/archiv/archive.tutorial.html', 
'and a copy of it at', 
'http://www.badc.rl.ac.uk/help/formats/NASA-Ames/G-and-H-June-1998.html', ''], 
'DX': [20.0, 10.0], 'DATE': [1969, 1, 1], 'NXDEF': [1], 
'ONAME': 'De Rudder, Anne', 'SNAME': 'Anemometer measurements averaged over longitude', 
'MNAME': 'NERC Data Grid (NDG) project', 'NX': [9], 'NSCOML': 9, 
'RDATE': [2002, 10, 31], 'AMISS': [2000.0], 'VSCAL': [1.0], 'NV': 1, 
'NVOL': 13, 'X': [[], [0.0]], 'XNAME': ['Altitude (km)', 'Latitude (degrees North)'], 
'VNAME': ['Mean zonal wind (m/s)'], 'SCOM': ['Example of FFI 2010 (b).', 
'This example illustrating NASA Ames file format index 2010 is based on results', 
'from Murgatroyd (1969) as displayed in Brasseur and Solomon, Aeronomy of the', 
'Middle Atmosphere, Reidel, 1984 (p.36). It is representative of the mean zonal', 
'wind distribution in the winter hemisphere as a function of latitude and height.', 
'The first date on line 7 (1st of January 1969) is fictitious.', 
'From line 10 (NXDEF = 1) we know that the latitude points are defined by', 
'X(i) = X(1) + (i-1)DX1 for i = 1, ..., NX', 
'with X(1) = 0 deg (line 11), DX1 = 10 deg (line 8) and NX = 9 (line 9).'], 
'VMISS': [200.0], 'IVOL': 7, 'FFI': 2010, 
'ORG': 'Rutherford Appleton Laboratory, Chilton OX11 0QX, UK - Tel.: +44 (0) 1235 445837', 'NIV': 2, 
'ANAME': ['Pressure (hPa)'], 'NAUXV': 1}

Grab the normal comments:

>>> comm=myfile.na_dict["NCOM"]
>>> print(comm)
['The files included in this data set illustrate each of the 9 NASA Ames file', 
'format indices (FFI). A detailed description of the NASA Ames format can be', 
'found on the Web site of the British Atmospheric Data Centre (BADC) at', 
'http://www.badc.rl.ac.uk/help/formats/NASA-Ames/', 'E-mail contact: [email protected]', 
'Reference: S. E. Gaines and R. S. Hipskind, Format Specification for Data', 
'Exchange, Version 1.3, 1998. This work can be found at', 
'http://cloud1.arc.nasa.gov/solve/archiv/archive.tutorial.html', 
'and a copy of it at', 
'http://www.badc.rl.ac.uk/help/formats/NASA-Ames/G-and-H-June-1998.html', '']

Use the file method to get the normal comments:

>>> myfile.getNormalComments()
['The files included in this data set illustrate each of the 9 NASA Ames file', 
'format indices (FFI). A detailed description of the NASA Ames format can be', 
'found on the Web site of the British Atmospheric Data Centre (BADC) at', 
'http://www.badc.rl.ac.uk/help/formats/NASA-Ames/', 'E-mail contact: [email protected]',
'Reference: S. E. Gaines and R. S. Hipskind, Format Specification for Data', 
'Exchange, Version 1.3, 1998. This work can be found at', 
'http://cloud1.arc.nasa.gov/solve/archiv/archive.tutorial.html', 
'and a copy of it at', 
'http://www.badc.rl.ac.uk/help/formats/NASA-Ames/G-and-H-June-1998.html', '']

Read the actual data:

>>> myfile.readData()

Inspect the data array ("V") in the NASA Ames dictionary:

>>> print(myfile.na_dict["V"])
[[[-3.0, -2.6000000000000001, -2.2999999999999998, 2.0, 4.7999999999999998, 
4.5999999999999996, 4.5, 3.0, -0.90000000000000002], [-15.1, -4.2000000000000002, 
6.9000000000000004, 12.800000000000001, 14.699999999999999, 20.0, 21.5, 18.0, 
8.1999999999999993], [-29.0, -15.199999999999999, 3.3999999999999999, 
28.199999999999999, 41.0, 39.100000000000001, 17.899999999999999, 8.0, 
0.10000000000000001], [-10.0, 8.4000000000000004, 31.199999999999999, 
59.899999999999999, 78.5, 77.700000000000003, 47.0, 17.600000000000001, 
16.0], [200.0, 200.0, 200.0, 200.0, 200.0, 200.0, 200.0, 200.0, 200.0]]]
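
As a follow-on sketch (hedged; it uses only the methods and keys shown above, where the third element of each getVariables() tuple matches VMISS), the missing value can be filtered out of the first variable's data:

>>> name, units, miss, scale = myfile.getVariables()[0]
>>> data = myfile.na_dict["V"][0]    # values for the first (and only) variable
>>> cleaned = [[v for v in row if v != miss] for row in data]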

Example 2: Writing a NASA Ames file

Start the python interactive prompt:

python

Import the nappy package:

>>> import nappy

Assume you have created the complete contents of a NASA Ames file in a dictionary called na_contents.
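
One way to obtain such a dictionary (a hedged sketch; it assumes you have also opened and read an existing file as in Example 1) is to copy the dictionary of that file and modify it:

>>> na_contents = myfile.getNADict()
>>> na_contents["ONAME"] = "Your Name"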

Write the data to a NASA Ames file:

>>> nappy.openNAFile('my_file_to_write.na', 'w', na_contents)

Example 3: Converting a NASA Ames file to a NetCDF file

[Note: this utility is only available on Unix/Linux platforms]

Run the command-line utility na2nc:

na2nc -t "seconds since 1999-01-01 00:00:00" -i my_nasa_ames_file.na -o my_netcdf_file.nc

Note that the -t argument allows you to pass a NetCDF-style date/time units description into your NetCDF file so that software packages can identify the time axis correctly. This is required when the time unit string in your NASA Ames file is non-standard.

For help on the command-line utility type:

na2nc -h

na2nc.py
========

Converts a NASA Ames file to a NetCDF file.

Usage
=====

   na2nc.py [-m <mode>] [-g <global_atts_list>]
            [-r <rename_vars_list>] [-t <time_units>] [-n]
            -i <na_file> [-o <nc_file>]

Where
-----

    <mode>                      is the file mode, either "w" for write or "a" for append
    <global_atts_list>          is a comma-separated list of global attributes to add
    <rename_vars_list>          is a comma-separated list of <old_name>,<new_name> pairs to rename variables
    <time_units>                is a valid time units string such as "hours since 2003-04-30 10:00:00"
    -n                          suppresses the time units warning if invalid
    <na_file>                   is the input NASA Ames file path
    <nc_file>                   is the output NetCDF file path (default is to replace ".na" from NASA Ames
                                 file with ".nc").

Example 4: Converting a NetCDF file to a NASA Ames file

[Note: this utility is only available on Unix/Linux platforms]

Run the command-line utility nc2na:

nc2na -i my_netcdf_file.nc -o my_nasa_ames_file.na

For help on the command-line utility type:

nc2na -h

nc2na.py
========

Converts a NetCDF file into one or more NASA Ames files.

Usage
=====

    nc2na.py [-v <var_list>] [--ffi=<ffi>] [-f <float_format>]
             [-d <delimiter>] [-l <limit_ffi_1001_rows>]
             [-e <exclude_vars>] [--overwrite-metadata=<key1>,<value1>[,<key2>,<value2>[...]]]
             [--names-only] [--no-header] [--annotated]
             -i <nc_file> [-o <na_file>]

Where
-----

    <nc_file>                   - name of input file (NetCDF).
    <na_file>                   - name of output file (NASA Ames or CSV) - will be used as base name if multiple files.
    <var_list>                  - a comma-separated list of variables (i.e. var ids) to include in the output file(s).
    <ffi>                       - NASA Ames File Format Index (FFI) to write to (normally automatic).
    <float_format>              - a python formatting string such as %s, %g or %5.2f
    <delimiter>                 - the delimiter you wish to use between data items in the output file, such as whitespace or a comma.
    <limit_ffi_1001_rows>       - if format FFI is 1001 then chop files up into <limitFFI1001Rows> rows of data.
    <exclude_vars>              - a comma-separated list of variables (i.e. var ids) to exclude in the output file(s).
    <key1>,<value1>[,<key2>,<value2>[...]] - a comma-separated list of key,value pairs to overwrite in the output files.
                                  Typically the keys are one of:
                                  "DATE", "RDATE", "ANAME", "MNAME", "ONAME", "ORG", "SNAME", "VNAME".
    --names-only                - only display a list of file names that would be written (i.e. don't convert actual files).
    --no-header                 - do not write the NASA Ames header.
    --annotated                 - add an annotation column as the first column.
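
For example, a hedged invocation using options from the list above (file and variable names are illustrative):

    nc2na.py -v temperature --ffi=1001 -f "%g" -i my_netcdf_file.nc -o my_nasa_ames_file.na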

nappy's People

Contributors

agstephens, bastianzim, fobersteiner, lukejones123, nisc586, poelsner, reimarbauer, tomkralidis

nappy's Issues

Python 3 support?

Hi Ag,

Do you intend to produce a version compatible with Python 3 in the future?
I am currently using nappy in EGADS with Python 2.7, only for NASA Ames support, and I would like to update the EGADS code to Python 3 if I have time before the end of the EUFAR contract.

Kind regards,

Olivier

LICENSE file missing in pypi package

Hi

please add the LICENSE file to the pypi package.

I need this for a conda-forge build. Currently I have to add it manually to the recipe.

regards
Reimar

use filelike obj and/or filename

I have become a great fan of pyfilesystem2.

The minimum needed for nappy to be usable with it is accepting a file-like object for accessing data, not only
a path + filename. I have no idea if there is a limitation imposed by the cdat-lite dependency.

Do you think it is possible to refactor to accept a file-like object as well? Is this already planned?

fix test dependencies

Running python setup.py test yields 6 errors all of type:

Exception: Could not import third-party software. Nappy requires the CDMS and Numeric packages to be installed to convert to CDMS and NetCDF.

Scanning the code it seems the following packages are required:

  • numpy
  • cdms2 (or cdms)
  • Numeric

I don't see cdms2, cdms or Numeric on PyPI, are they elsewhere?
Perhaps we can update the tests so that the base tests that are run are those that do not require NetCDF support (numpy/cdms2 or cdms/Numeric).

Fix time bounds values so they are correct in HadUK-Grid test

$ cat output_01_5.csv

34,2010
Met Office
Data held at British Atmospheric Data Centre (BADC), Rutherford Appleton Laboratory, UK.
HadUK-Grid_v1.0.3.0
Gridded surface climate observations data for the UK
1,1
1971 1 16,2021 9 13
1,0
2
2
0,1
bnds
time
1
1
-1e+20
time_bnds
0
0
14
==== Normal Comments follow ====
creation_date:   2021-07-12T19:39:41
frequency:   mon
references:   doi: 10.1002/joc.1161
short_name:   monthly_snowlying
version:   v20210712
Conventions:   CF-1.7
NCO:   netCDF Operators version 4.9.7 (Homepage = http://nco.sf.net, Code = http://github.com/nco/nco)
=== Additional Global Attributes defined in the source file ===
Monthly resolution gridded climate observations
History:  2021-09-13 09:21:23 - Converted to NASA Ames format using nappy-0.3.0.
    Tue Aug 10 16:50:24 2021: ncks -d projection_x_coordinate,,,10 -d projection_y_coordinate,,,10 -d time,,,12 --variable snowLying /badc/ukmo-hadobs/data/insitu/MOHC/HadOBS/HadUK-Grid/v1.0.3.0/1km/snowLying/mon/v20210712/snowLying_hadukgrid_uk_1km_mon_197101-197112.nc ./archive/badc/ukmo-hadobs/data/insitu/MOHC/HadOBS/HadUK-Grid/v1.0.3.0/1km/snowLying/mon/v20210712/snowLying_hadukgrid_uk_1km_mon_197101-197112.nc
==== Normal Comments end ====
=== Data Section begins on the next line ===
1.4993e+06
1.49895e+06,1.49967e+06

Are they right? Need to check.

Performance: readData utterly slow for files with many lines of data

Description

Loading data from small files completes in a decent amount of time. With many lines of data (10k+), the process becomes a "bottleneck".

What I Did

read 4.3k lines of data, ffi1001:

%timeit myfile.readData()
67.9 ms ± 7.36 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

read 86.6k lines of data, ffi1001:

%timeit myfile.readData()
51.5 s ± 2.54 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

That's nearly a minute per file! If I'd want to load many such files, I'd have to go have a lot of coffee in the meantime ☕👾


tracing the execution of the call to readData, I find

Refactor location of scripts

  • Move the command-line scripts from the script directory into a cli directory.
  • Move nappy/utils/compare_na.py to cli/ - but remember that compare is imported in tests and other code, so the underlying functions need to remain in a library rather than a script.
  • Maybe compare_na.py and compare.py are the same!?

different nappy version in nappy.ini vs. nappy.__version__

Just noted that the version in the nappy .ini file is still 0.3.0, while __init__.py specifies __version__ = "2.0.1".

Those should be consistent, I think, for reproducibility - for example if NA files are converted to nc files.

I noted that some internal functions use getVersion(), which obtains the package version from the .ini file, while the setup.py file just uses __version__. Since I have only just started working my way through the source code, I'm not sure of the best way to make this consistent...

Remove CI support for Python 3.6

Hi @FObersteiner and @LukeJones123, you have both been active in helping to improve nappy recently. When reviewing the first Pull Request I see that it fails with Python 3.6 in the GitHub Actions CI. I don't think we need to support Python 3.6 anymore. Are you happy to remove support for it?

Bug prevents access to auxiliary variables

  • roocs-utils version:
  • Python version:
  • Operating System:

Description

I find that I can't access the auxiliary variables from certain Ames files in the NDACC database - the na_file method getAuxVariables() returns an empty list even when auxiliary variables are present in the file.

What I Did

An example file header is ...

Woods P.            FTIR        ABERDEEN    TOTALCOL    23-JAN-1994 12:00:0007-MAY-1994 12:00:000001
30  1010
Woods, Peter; Bell, William; Paton-Walsh, Clare; Gardiner, Tom; Gould, Adrian; Donohoue, Liam
National Physical Laboratory (NPL)
Daily mean vertical column amounts from FTIR  ground-based experiment, 1993/1994 SESAME I;ground-based.
NDSC
1  1
1994 01 23   1994 05 07 
0
Julian Day of Observation 
5
1.0E+16 1.0E+14 1.0E+13 1.0E+13 1.0E+13
999 999 999 999 999 
N2O vertical column amount; molecules/cm**2
HNO3 vertical column amount; molecules/cm**2
ClONO2 vertical column amount; molecules/cm**2
HCl vertical column amount; molecules/cm**2
HF vertical column amount; molecules/cm**2
5
1.0 1.0 1.0 0.1 0.1 
9999 9999 9999 9999 9999
Year of Observation
Month number of Observation
Day of month
Latitude of observer; degrees N
Longtitude of observer; degrees E
0
3
Estimated overall uncertainties are: N2O 6.4%; HNO3 14.3%; ClONO2 24.4%; HCl 9.7%; HF 7.7%.
Estimated daily precision (neglecting line parameter uncertainties) are: N2O 4.9%; HNO3 6.8%; ClONO2 20.4%; HCl 7.7%; HF 5.7%.
     N2O  HNO3 ClONO2  HCl   HF
 13 1994  1 13  572  -21
     999   999  257    999   999

Capture VNAMEs when converting NA to NC

@FObersteiner said:

One thing that came up for me was that it might be useful (for me at least ^^) to dump VNAMEs when converting na to nc (i.e. to a global attribute maybe). The VNAMEs I have to deal with are not exactly standard and I want to provide sufficient info on those variables in the netCDF file as well (so the user doesn't have to open the na file additionally).

@agstephens said:

Do you want to propose a global attribute such as vnames_from_source_data ?

@FObersteiner said:

yes, that sounds reasonable.
Another option might be to set the original VNAME from the na file as an attribute of the xr.DataArray (make a "vname_from_source_data" tag).
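
A minimal sketch of that second option, with illustrative values (the attribute name is the one proposed above; none of this is existing nappy API):

import xarray as xr

da = xr.DataArray([1.0e16, 2.0e16], name="n2o_column")
da.attrs["vname_from_source_data"] = "N2O vertical column amount; molecules/cm**2"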

There is an xarray issue with the "wet" variable in CRUTS

Here is an issue we need to solve:

# ncdump -h ~/.mini-ceda-archive/main/archive/badc/cru/data/cru_ts/cru_ts_4.04/data/wet/cru_ts4.04.1901.2019.wet.dat.nc | grep wet
netcdf cru_ts4.04.1901.2019.wet.dat {
        float wet(time, lat, lon) ;
                wet:long_name = "wet day frequency" ;
                wet:units = "days" ;
                wet:correlation_decay_distance = 450.f ;
                wet:_FillValue = 9.96921e+36f ;
                wet:missing_value = 9.96921e+36f ;

Xarray tries to be clever:

ds = xr.open_dataset(fpath)
ds.wet --> has been converted to a datetime64 type
- because units are "days".

We don't want that so:

ds = xr.open_dataset(fpath, decode_times=False)
ds.wet --> has been decoded to a float64 type
- BUT: the time dimension is also not decoded

Not sure how to fix this!!!
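
One possible avenue (a hedged sketch, assuming a recent xarray that supports the decode_timedelta keyword; not verified against the nappy code paths):

import xarray as xr

# fpath: path to the CRUTS "wet" NetCDF file referenced above
ds = xr.open_dataset(fpath, decode_timedelta=False)
# only the units-based timedelta decoding (triggered by units such as "days")
# is switched off, so "wet" stays numeric while the time coordinate is still decoded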

Error when processing name of variable in HadUK-Grid data

Inputs:

/mini-ceda-archive/....../archive/badc/ukmo-hadobs/data/insitu/MOHC/HadOBS/HadUK-Grid/v1.0.3.0/1km/snowLying/mon/v20210712/snowLying_hadukgrid_uk_1km_mon_197101-197112.nc

 File "/usr/local/anaconda/envs/flamingo/lib/python3.7/site-packages/pywps/app/Process.py", line 248, in _run_process
    self.handler(wps_request, wps_response)  # the user must update the wps_response.
  File "/usr/local/anaconda/envs/flamingo/lib/python3.7/site-packages/flamingo/processes/_wps_subset_base.py", line 210, in _handler
    output_uris = write_to_csvs(results, self.workdir)
  File "/usr/local/anaconda/envs/flamingo/lib/python3.7/site-packages/flamingo/utils/output_utils.py", line 37, in write_to_csvs
    xr_to_na.convert()
  File "/usr/local/anaconda/envs/flamingo/lib/python3.7/site-packages/nappy/nc_interface/xarray_to_na.py", line 89, in convert
    variables = self._convertSingletonVars(self.xr_variables)
  File "/usr/local/anaconda/envs/flamingo/lib/python3.7/site-packages/nappy/nc_interface/xarray_to_na.py", line 147, in _convertSingletonVars
    id = xarray_utils.getBestName(var_metadata).replace(" ", "_"),
  File "/usr/local/anaconda/envs/flamingo/lib/python3.7/site-packages/nappy/nc_interface/xarray_utils.py", line 49, in getBestName
    raise Exception(f"Cannot find a valid name for variable: {var}.")
Exception: Cannot find a valid name for variable: {'grid_mapping_name': 'transverse_mercator', 'longitude_of_prime_meridian': 0.0, 'semi_major_axis': 6377563.396, 'semi_minor_axis': 6356256.909, 'longitude_of_central_meridian': -2.0, 'latitude_of_projection_origin': 49.0, 'false_easting': 400000.0, 'false_northing': -100000.0, 'scale_factor_at_central_meridian': 0.9996012717}.

support NDACC based 2160 files

The Network for the Detection of Atmospheric Composition Change (NDACC) project has a 2160 variation which adds a line at the beginning of the file.

Snippet from ftp://ftp.cpc.ncep.noaa.gov/ndacc/station/alert/ames/o3sonde/al000106.b23:

FAST H.             O3SONDE     ALERT NWT   OZONE       06-Jan-2000 23:00:0007-Jan-2000 02:00:000001
54 2160 
Jonathan Davies
AES
Ozonesonde
ARQX OZONE PROGRAM
1 1 
2000  1  6  2000  1  7 
0 
40 
Pressure at observation (hPa) 
Station name

Running against master results in the following traceback

>>> from nappy import openNAFile
WARNING:nappy.nappy_api:You cannot use nappy NetCDF conversion tools as your system does not have CDMS installed, or it is not in your sys.path.
>>> n = openNAFile('al000106.b23')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "nappy/nappy_api.py", line 163, in openNAFile
    ffi = readFFI(filename)
  File "nappy/utils/common_utils.py", line 43, in readFFI
    ffi = text_parser.readItemsFromLine(topline, 2, int)[-1]
  File "nappy/utils/text_parser.py", line 48, in readItemsFromLine
    raise Exception("Incorrect number of items (%s) found in line: \n'%s'" % (nitems, line))
Exception: Incorrect number of items (2) found in line:
'FAST H.             O3SONDE     ALERT NWT   OZONE       06-Jan-2000 23:00:0007-Jan-2000 02:00:000001
'
>>>

Managing version updates with poetry

Consider a new approach for version updates and tagging (using CI):

As an illustration, in a poetry-managed package, I would set the git tag first,

git tag x.y.z
git push --tags

then put that version in the pyproject.toml file like

poetry version $(git describe --tags --abbrev=0)

and finally

poetry build

to build the package. Put the poetry part in a github action and you have an automated workflow.

To correctly show the version number within the package, you put

from importlib import metadata
__version__ = metadata.version("package-name")

in the main __init__.py file.

Some data providers encode var names and units in a format that Nappy doesn't parse

  • roocs-utils version:
  • Python version:
  • Operating System:

Description

Some of the files in the NDACC database do not have variable name+units strings that conform to the expectations of Nappy. For example, some providers separate the two with a semicolon, e.g.

N2O vertical column amount; molecules/cm**2
HNO3 vertical column amount; molecules/cm**2
ClONO2 vertical column amount; molecules/cm**2
HCl vertical column amount; molecules/cm**2
HF vertical column amount; molecules/cm**2

This issue could be addressed by a more complicated regular expression, but other cases are more convoluted and regular expressions alone aren't enough. Thus it would be good if a custom user callback function could be allowed to do the parsing.
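
As an illustration only (not part of the nappy API; the function name is hypothetical), a callback for the semicolon-separated style shown above might be as simple as:

def parse_name_and_units(line):
    """Split e.g. 'N2O vertical column amount; molecules/cm**2' into (name, units)."""
    name, _, units = line.partition(";")
    return name.strip(), units.strip()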


read error with 2110 ngv file

Hi Ag,
Just installed nappy with my conda py2.7.14 as I would like to read some 2110 FFI files. Nappy imports fine; however, I receive the following error when trying to open a 2110 FFI file. Was curious if there is a workaround - I apologize if the answer is posted elsewhere, I did search around without success. I get a similar problem when running na2nc to convert to NetCDF. I posted a snippet of the header and each error below. Would be great to get your feedback as you have time.

thanks,
Matt

----MP20140613.NGV
64 2110
Julie Haggerty ([email protected])
NCAR, Boulder, CO
NGV Microwave Temperature Profiler (MTP/NGV)
DEEPWAVE Production Data
1 1
2014 06 13 2015 04 24 20140008 {FLT DATE, REDUCTION DATE & FLIGHT NUMBER}
0.0 0.0

---error with na2nc

na2nc -t "seconds since 1999-01-01 00:00:00" -i /scratch/fearon/deepwave/temperature_profiler/MP20140613.NGV -o test.nc
Traceback (most recent call last):
File "/users/fearon/.conda/envs/pyn_env/bin/na2nc", line 11, in
load_entry_point('nappy==1.1.4', 'console_scripts', 'na2nc')()
File "/users/fearon/.conda/envs/pyn_env/lib/python2.7/site-packages/nappy-1.1.4-py2.7.egg/nappy/script/na2nc.py", line 120, in na2nc
nc_file = apply(nappy.convertNAToNC, [na_file], arg_dict)
File "/users/fearon/.conda/envs/pyn_env/lib/python2.7/site-packages/nappy-1.1.4-py2.7.egg/nappy/nappy_api.py", line 206, in convertNAToNC
convertor = apply(nappy.nc_interface.na_to_nc.NAToNC, [], arg_dict)
File "/users/fearon/.conda/envs/pyn_env/lib/python2.7/site-packages/nappy-1.1.4-py2.7.egg/nappy/nc_interface/na_to_nc.py", line 54, in init
na_file_obj = nappy.openNAFile(na_file_obj)
File "/users/fearon/.conda/envs/pyn_env/lib/python2.7/site-packages/nappy-1.1.4-py2.7.egg/nappy/nappy_api.py", line 164, in openNAFile
return apply(getNAFileClass(ffi), (filename, mode))
File "/users/fearon/.conda/envs/pyn_env/lib/python2.7/site-packages/nappy-1.1.4-py2.7.egg/nappy/na_file/na_file.py", line 70, in init
self.readHeader()
File "/users/fearon/.conda/envs/pyn_env/lib/python2.7/site-packages/nappy-1.1.4-py2.7.egg/nappy/na_file/na_file_2110.py", line 32, in readHeader
self._readCommonHeader()
File "/users/fearon/.conda/envs/pyn_env/lib/python2.7/site-packages/nappy-1.1.4-py2.7.egg/nappy/na_file/na_file.py", line 183, in _readCommonHeader
dates = nappy.utils.text_parser.readItemsFromLine(self.file.readline(), 6, int)
File "/users/fearon/.conda/envs/pyn_env/lib/python2.7/site-packages/nappy-1.1.4-py2.7.egg/nappy/utils/text_parser.py", line 44, in readItemsFromLine
raise "Incorrect number of items (%s) found in line: \n'%s'" % (nitems, line)
TypeError: exceptions must be old-style classes or derived from BaseException, not str

---error with nappy within python

python 2.7.14 |Anaconda, Inc.| (default, Dec 7 2017, 17:05:42)
[GCC 7.2.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.

import nappy
myfile = nappy.openNAFile('/scratch/fearon/deepwave/temperature_profiler/MP20140624.NGV')
Traceback (most recent call last):
File "", line 1, in
File "/users/fearon/.conda/envs/pyn_env/lib/python2.7/site-packages/nappy-1.1.4-py2.7.egg/nappy/nappy_api.py", line 164, in openNAFile
return apply(getNAFileClass(ffi), (filename, mode))
File "/users/fearon/.conda/envs/pyn_env/lib/python2.7/site-packages/nappy-1.1.4-py2.7.egg/nappy/na_file/na_file.py", line 70, in init
self.readHeader()
File "/users/fearon/.conda/envs/pyn_env/lib/python2.7/site-packages/nappy-1.1.4-py2.7.egg/nappy/na_file/na_file_2110.py", line 32, in readHeader
self._readCommonHeader()
File "/users/fearon/.conda/envs/pyn_env/lib/python2.7/site-packages/nappy-1.1.4-py2.7.egg/nappy/na_file/na_file.py", line 183, in _readCommonHeader
dates = nappy.utils.text_parser.readItemsFromLine(self.file.readline(), 6, int)
File "/users/fearon/.conda/envs/pyn_env/lib/python2.7/site-packages/nappy-1.1.4-py2.7.egg/nappy/utils/text_parser.py", line 44, in readItemsFromLine
raise "Incorrect number of items (%s) found in line: \n'%s'" % (nitems, line)
TypeError: exceptions must be old-style classes or derived from BaseException, not str
myfile = nappy.openNAFile('/scratch/fearon/deepwave/temperature_profiler/MP20140624.NGV')
Traceback (most recent call last):
File "", line 1, in
File "/users/fearon/.conda/envs/pyn_env/lib/python2.7/site-packages/nappy-1.1.4-py2.7.egg/nappy/nappy_api.py", line 164, in openNAFile
return apply(getNAFileClass(ffi), (filename, mode))
File "/users/fearon/.conda/envs/pyn_env/lib/python2.7/site-packages/nappy-1.1.4-py2.7.egg/nappy/na_file/na_file.py", line 70, in init
self.readHeader()
File "/users/fearon/.conda/envs/pyn_env/lib/python2.7/site-packages/nappy-1.1.4-py2.7.egg/nappy/na_file/na_file_2110.py", line 32, in readHeader
self._readCommonHeader()
File "/users/fearon/.conda/envs/pyn_env/lib/python2.7/site-packages/nappy-1.1.4-py2.7.egg/nappy/na_file/na_file.py", line 183, in _readCommonHeader
dates = nappy.utils.text_parser.readItemsFromLine(self.file.readline(), 6, int)
File "/users/fearon/.conda/envs/pyn_env/lib/python2.7/site-packages/nappy-1.1.4-py2.7.egg/nappy/utils/text_parser.py", line 44, in readItemsFromLine
raise "Incorrect number of items (%s) found in line: \n'%s'" % (nitems, line)
TypeError: exceptions must be old-style classes or derived from BaseException, not str

conversion scripts in Python 3: missing apply()

  • Python version: 3.9.x
  • Operating System: Ubuntu 20.04, Windows 10 (tested on both)

Description

Scripts in the /scripts directory still seem to contain some Python 2 code. That causes, for example:

[...]/nappy/nappy/script$ python na2nc.py -i '[...]/nappy/tests/testdata/1001.na' -o '[...]/test_out.nc'
Traceback (most recent call last):
  File "[...]/nappy/nappy/script/na2nc.py", line 126, in <module>
    na2nc(args)
  File "[...]/nappy/nappy/script/na2nc.py", line 120, in na2nc
    nc_file = apply(nappy.convertNAToNC, [na_file], arg_dict)
NameError: name 'apply' is not defined

I think this can be fixed pretty easily without breaking anything. For example in na2nc.py, lines 118 to 120 would be replaced with

nc_file = nappy.convertNAToNC(**arg_dict)

to run in Python 3. I can make a PR if you want.

API extension: na 2 xarray.DataArray or xarray.Dataset?

This is more of a suggestion for a new feature than an "issue".

Background: we still have a lot of data in NASA Ames format. Currently, there's an initiative at our institute to develop a collection of tools that are basically method extensions for xarray.Dataarray and xarray.Dataset. github: imktk. So I was looking for convenient ways to load the na data to xarray. And since I noted that nappy uses xarray internally for the conversion to netCDF, I thought that could be a possibility.

A way to do this with the existing version of nappy could be e.g.

from pathlib import Path
import xarray as xr

import nappy
import nappy.nc_interface.na_to_xarray as na2xr

f = Path('./nappy/example_files/1001a.na') # from the samples collection
xr_converter_class = na2xr.NADictToXarrayObjects(nappy.openNAFile(f))

xr_tuple = xr_converter_class.convert()
arrays = xr_tuple[0] # list of data arrays

new_attrs = {} # we need to combine attributes manually
for a in arrays:
    for k, v in a.attrs.items():
        new_attrs[a.name + '_' + k] = v # not guaranteed to work with ANY input!

xrds = xr.merge(arrays, combine_attrs="drop")
xrds.attrs = new_attrs

print(xrds)

<xarray.Dataset>
Dimensions:              (pressure: 28)
Coordinates:
  * pressure             (pressure) float64 1.013e+03 540.5 ... 4e-05 2.5e-05
Data variables:
    total_concentration  (pressure) float64 2.55e+19 1.53e+19 ... 5.03e+11
    temperature          (pressure) float64 288.0 256.0 223.0 ... 300.0 360.0
Attributes:
    total_concentration_units:                 cm-3
    total_concentration_long_name:             total_concentration
    total_concentration_title:                 total_concentration
    total_concentration_nasa_ames_var_number:  0
    temperature_units:                         degrees K
    temperature_long_name:                     temperature
    temperature_title:                         temperature
    temperature_nasa_ames_var_number:          1

While that works for me, it's not explicitly part of the nappy API - would it be a useful extension?

var_and_units_pattern (re.Pattern) as a modifiable attribute

this is a follow-up to #45

I'd like to implement var_and_units_pattern as an attribute with a @setter so I can modify it after creating the na file object with openNAFile.

This involves changes to na_core.py - I'd like to accompany the modification with some minor refactoring.
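
A minimal sketch of what such an attribute could look like (illustrative only; the class and default pattern are assumptions, not the actual na_core.py code):

import re

class NACore:
    def __init__(self):
        # default: 'name (units)' style, e.g. 'Altitude (km)'
        self._var_and_units_pattern = re.compile(r"^\s*(?P<name>.*)\((?P<units>.*)\)\s*$")

    @property
    def var_and_units_pattern(self):
        return self._var_and_units_pattern

    @var_and_units_pattern.setter
    def var_and_units_pattern(self, pattern):
        # accept either a compiled pattern or a plain string
        self._var_and_units_pattern = re.compile(pattern) if isinstance(pattern, str) else pattern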

Get nappy working with Xarray

This is quite a big task, because there is a lot of code that translates NASA Ames concepts/objects to/from NetCDF-style concepts. However, it is more complex than that because we previously used cdms as the library to convert to NetCDF. Now, we are moving to xarray.

NOTE: Initially, we only need the Xarray-to-NASAAmes|CSV functionality to work!

Here is an overview of the order of tasks:

tests/tests-na2nc.py.txt
tests/tests-nc2csv.py.txt
tests/tests-nc2na.py.txt
cdms_utils.var_utils.getBestName(var)
cdms_utils.var_utils.getMissingValue(var)

cdms_utils.axis_utils.isUniformlySpaced(axis) 
cdms_utils.axis_utils.areAxesIdentical(ax1, ax2) 
cdms_utils.axis_utils.isAxisRegularlySpacedSubsetOf(ax1, ax2)
  • Then it's a brute force (let's get it working) approach!
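
For illustration of the kind of translation involved (a hedged sketch of an xarray-flavoured getBestName; the attribute preference order is an assumption, not the actual nappy implementation):

def getBestName(var):
    """Pick the most descriptive name available on an xarray variable."""
    attrs = var.attrs
    return attrs.get("long_name") or attrs.get("standard_name") or attrs.get("title") or var.name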

Improvements to the documentation for installing to a virtualenv

  • roocs-utils version: ?
  • Python version: 3.10
  • Operating System: Windows

Description

Hi,
The way to install nappy to a virtualenv as it is described in the ReadMe does not work on Windows.
It also looks a bit outdated, so I would like to change it to a version that uses pip.

I also noticed that the setup fails if the pypandoc package is installed. Apparently pypandoc's convert function has been renamed to convert_file.

What I Did

I prepared a pull request where I updated the ReadMe. I also renamed the convert function in setup.py.

cannot install package without already having xarray installed

The setup.py imports the version from the top-level __init__.py, but this also imports * from nappy_api.
This means that you cannot install unless you already have xarray. Can the version be baked into the setup.py script, or the API import moved elsewhere?
