
act's People

Contributors

adamtheisen, ajsockol, cgodine, dennyh-ssec, dependabot[bot], jhemedin, jkyrouac, jrobrien91, kenkehoe, maxwelllevin, mgrover1, rcjackson, zssherman

act's Issues

Check arm standards not really needed anymore

Since Ken has now uploaded a separate routine that checks for ARM standards, we really don't need to do a hard check for the datastream when reading netcdf and csv files. I suggest removing the warning we get.

Also, add **kwargs to the csvfiles reader so they are passed through to the pd.read_csv function.

Discuss guidelines for making ACT release versions

@AdamTheisen and @kenkehoe were discussing how the DQ Office installs ACT on the ADC production system. We currently release off the GitHub repo with a simple RPM build. This does not track with the official ACT release version number. We will be changing to pip soon, which should resolve this issue.

I would like to understand our guidelines on how often we create a new ACT release version so I can understand how often we need to create new RPMs and how we can get bug fixes implemented quickly.

Using act object attributes to store important information

I've tried to use the ACT plotting library to plot data that I read with my xarray wrapper routine. To my surprise, the plotting routine expects some hidden object attributes that are set during the ACT reading process, namely act.file_dates. I don't think we should expect the user to use the ACT reading routine, so we should not have any necessary hidden information.

My current issue is that the xarray object needs arm_ds.act.file_dates, which neither the default xarray reader nor my reader sets.

Add functions for QC data

The DQ Office has a few functions to work with bit-packed QC. I will look into implementing these in qc_utils. I think we will need to have some discussions on the metadata format standard once the data are in xarray, since ARM, CF and others have different standards. I'm leaning towards the current updated CF method. Do we have any examples of other programs' QC after reading (NEON, NWC, ...)?
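As a rough illustration of what working with bit-packed QC involves, here is a minimal sketch in plain Python. It assumes the convention that test number N is stored in bit N-1 of the QC integer (so test 1 has value 1, test 2 has value 2, test 3 has value 4). The function names are hypothetical and are not the DQ Office or qc_utils routines.

```python
def unpack_qc_bits(qc_value, n_tests=32):
    """Return the 1-based test numbers whose bits are set in qc_value.

    Assumes test N is stored in bit N-1 (an assumption for this sketch).
    """
    return [bit + 1 for bit in range(n_tests) if qc_value & (1 << bit)]


def pack_qc_bits(test_numbers):
    """Combine 1-based test numbers into a single bit-packed integer."""
    value = 0
    for test in test_numbers:
        value |= 1 << (test - 1)
    return value


# A QC value of 5 has bits 0 and 2 set, i.e. tests 1 and 3 were tripped.
tripped = unpack_qc_bits(5)
```

Whatever metadata standard is chosen would only change how each test number is mapped to a description or an assessment, not the bit arithmetic itself.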

overplot times of an event at top/bottom of time series plot

I think we need a way to overplot a status onto a time series plot. For example, when a sensor is tripped (e.g. MWR rain flag), when rain is detected by another instrument (e.g. MET TBRG), or when some state is happening with the data (e.g. a fan is running, indicated by a variable being over a threshold).

My plan is to add a bar or something like that to an existing plot. But instead of adding this feature to the existing plot definition, I'm thinking of a new method to add to an existing plot. This may be better for cases when we want to add information from a different instrument (object) without adding to the primary object used for plotting.

You can assign this issue to me.
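One possible building block for the status bar described in this issue is a helper that turns a per-sample status flag into contiguous (start, end) spans, which a plotting method could then draw along the top or bottom of an axis. The sketch below is plain Python; the name event_spans is hypothetical and nothing here is part of ACT.

```python
def event_spans(times, flags):
    """Group consecutive True flags into (start_time, end_time) spans."""
    spans = []
    start = None
    prev = None
    for t, flag in zip(times, flags):
        if flag and start is None:
            # A new event begins at this sample.
            start = t
        elif not flag and start is not None:
            # The event ended at the previous sample.
            spans.append((start, prev))
            start = None
        prev = t
    if start is not None:
        # The event was still active at the last sample.
        spans.append((start, prev))
    return spans


# Example: a variable compared against a threshold gives the flags.
values = [0.1, 0.9, 0.8, 0.2, 0.7, 0.1]
flags = [v > 0.5 for v in values]
spans = event_spans(range(len(values)), flags)
```

Because the helper only needs times and flags, the flags can come from a different instrument than the one being plotted.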

Standard in docs and bug in Sphinx

So I was converting over Py-ART's docs and ran into something from ACT early on. I remember we had the docs set up so that parameters had a space on both sides of the colon before the parameter type. This is in line with the doc styles of major packages. When it stopped working, the spaces were removed. It turned out this was caused by a Sphinx update that messed with numpy docs:
readthedocs/sphinx_rtd_theme#766 (comment)

The fix mentioned there, however, fixes the rtd theme but then centers all tables for parameters, which looks ugly in my opinion.

A way around all of this, which looks cleaner both standards-wise and in the documentation itself, is to remove numpydoc, use the Sphinx napoleon extension, and revert to having a space before and after the colon in the docs to keep with standards. I can do a pull request soon if you all wish.

Need to find way to unit test discovery module safely

Currently, the discovery module takes in a user specified username and token as a string, but if we want to do unit testing on this module having these hard coded in the unit test is troublesome. Having a more secure way to log into the ARM Archive for testing would enable us to do this unit test safely.
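One common way to keep credentials out of test code is to read them from environment variables and skip the test when they are absent. The sketch below assumes hypothetical variable names ARM_USERNAME and ARM_PASSWORD; it is not how the discovery module currently works.

```python
import os


def get_archive_credentials(environ=None):
    """Return (username, token) from the environment, or None if unset.

    ARM_USERNAME and ARM_PASSWORD are hypothetical names for this sketch.
    """
    environ = os.environ if environ is None else environ
    username = environ.get("ARM_USERNAME")
    token = environ.get("ARM_PASSWORD")
    if not username or not token:
        return None
    return username, token


# In a unit test, a missing credential pair would mean "skip", not "fail".
creds = get_archive_credentials({"ARM_USERNAME": "user", "ARM_PASSWORD": "token"})
```

On a CI service the two variables could be stored as encrypted secrets, so the token never appears in the repository.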

In addition, due to the size of 2D datasets such as ceilometer data, I would like to add a unit test for time-range plots, but I need a way to download the ceilometer data, either from the ARM Archive or, for now, from a temporary location, to unit test 2D plotting.

Plotting all NaN data causes error with y limits

File "/home/kehoe/dev/dq/lib/python/ACT/act/plotting/plot.py", line 365, in set_yrng
self.axes[subplot_index].set_ylim(yrng)
File "/apps/base/python3.6/lib/python3.6/site-packages/matplotlib/axes/_base.py", line 3616, in set_ylim
bottom = self._validate_converted_limits(bottom, self.convert_yunits)
File "/apps/base/python3.6/lib/python3.6/site-packages/matplotlib/axes/_base.py", line 3139, in _validate_converted_limits
raise ValueError("Axis limits cannot be NaN or Inf")
ValueError: Axis limits cannot be NaN or Inf

We need to catch when all the data are set to NaN and set the y limits to something like [0, 1].

I can fix this if needed, let me know.
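A minimal sketch of the guard described above, in plain Python: compute the y range from the finite values only and fall back to [0, 1] when there are none. The name safe_yrng is hypothetical and this is not the actual set_yrng code.

```python
import math


def safe_yrng(values, default=(0.0, 1.0)):
    """Return [min, max] of the finite values, or the default if none exist."""
    finite = [v for v in values if math.isfinite(v)]
    if not finite:
        # All data are NaN or Inf, so use a placeholder range instead.
        return list(default)
    return [min(finite), max(finite)]


nan = float("nan")
all_nan_range = safe_yrng([nan, nan, nan])
mixed_range = safe_yrng([1.0, nan, 3.0])
```

The result can be passed to set_ylim without triggering the "Axis limits cannot be NaN or Inf" error shown in the traceback.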

Dependencies need to be agreed upon

Currently, ACT has the following dependencies:

  • Numpy
  • Xarray
  • Matplotlib
  • Astral (for shading in day vs night in timeseries plots)

as well as the packages these depend on such as scipy and pandas. We should agree about what dependencies ACT should have.

read_netcdf function causes segfault when attempting to read in a day of radar data

It seems that the read_netcdf function in act.io.armfiles causes a segfault when reading in radar data, perhaps because of the large file sizes. The specific case that causes the segfault is reading in the following list of files:

['/data/datastream/sgp/sgpkazrcfrgeC1.a1/sgpkazrcfrgeC1.a1.20190225.000013.nc', '/data/datastream/sgp/sgpkazrcfrgeC1.a1/sgpkazrcfrgeC1.a1.20190225.010019.nc', '/data/datastream/sgp/sgpkazrcfrgeC1.a1/sgpkazrcfrgeC1.a1.20190225.020010.nc', '/data/datastream/sgp/sgpkazrcfrgeC1.a1/sgpkazrcfrgeC1.a1.20190225.030009.nc', '/data/datastream/sgp/sgpkazrcfrgeC1.a1/sgpkazrcfrgeC1.a1.20190225.040011.nc', '/data/datastream/sgp/sgpkazrcfrgeC1.a1/sgpkazrcfrgeC1.a1.20190225.050013.nc', '/data/datastream/sgp/sgpkazrcfrgeC1.a1/sgpkazrcfrgeC1.a1.20190225.060019.nc', '/data/datastream/sgp/sgpkazrcfrgeC1.a1/sgpkazrcfrgeC1.a1.20190225.070007.nc', '/data/datastream/sgp/sgpkazrcfrgeC1.a1/sgpkazrcfrgeC1.a1.20190225.080020.nc', '/data/datastream/sgp/sgpkazrcfrgeC1.a1/sgpkazrcfrgeC1.a1.20190225.090004.nc', '/data/datastream/sgp/sgpkazrcfrgeC1.a1/sgpkazrcfrgeC1.a1.20190225.100010.nc', '/data/datastream/sgp/sgpkazrcfrgeC1.a1/sgpkazrcfrgeC1.a1.20190225.110016.nc', '/data/datastream/sgp/sgpkazrcfrgeC1.a1/sgpkazrcfrgeC1.a1.20190225.120011.nc', '/data/datastream/sgp/sgpkazrcfrgeC1.a1/sgpkazrcfrgeC1.a1.20190225.130013.nc', '/data/datastream/sgp/sgpkazrcfrgeC1.a1/sgpkazrcfrgeC1.a1.20190225.140004.nc', '/data/datastream/sgp/sgpkazrcfrgeC1.a1/sgpkazrcfrgeC1.a1.20190225.150010.nc', '/data/datastream/sgp/sgpkazrcfrgeC1.a1/sgpkazrcfrgeC1.a1.20190225.160019.nc', '/data/datastream/sgp/sgpkazrcfrgeC1.a1/sgpkazrcfrgeC1.a1.20190225.170014.nc', '/data/datastream/sgp/sgpkazrcfrgeC1.a1/sgpkazrcfrgeC1.a1.20190225.180013.nc', '/data/datastream/sgp/sgpkazrcfrgeC1.a1/sgpkazrcfrgeC1.a1.20190225.190008.nc']

The function call looks like:
read_netcdf(files, return_None=True, mask_and_scale=False, data_vars='minimal')
where data_vars is simply the list of variables.

Speed up import time to load ACT

I think we need to pay more attention to the way we load ACT modules. Currently it takes about 8 seconds to load ACT. Also it would be better to not load modules that are not used. For example the cartopy module is loaded for any plot created even if it is not used. I think we should put that module import under the GeographicPlotDisplay class so it is only imported when that class is loaded.

Not sure how this is typically done so if someone has suggestions I'm happy to implement.
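One typical approach is to defer the import until the module is first used, either by moving the import statement inside the class or method that needs it, or with a small proxy object. The sketch below shows the proxy variant using only the standard library; LazyModule is a hypothetical helper, and json stands in for a heavy dependency such as cartopy.

```python
import importlib


class LazyModule:
    """Proxy that imports the named module on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Only reached for attributes not found on the proxy itself.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


# Nothing is imported here; in ACT this could be LazyModule("cartopy.crs").
lazy_json = LazyModule("json")
loaded_before = lazy_json._module is not None
encoded = lazy_json.dumps([1, 2])  # the real import happens on this call
loaded_after = lazy_json._module is not None
```

The simpler alternative, placing the import inside the method that uses it, has the same effect and needs no helper class.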

Xarray not able to read old MMCR data

Some older data might not be compatible with xarray. sgpmmcrmomC1.b1.20041107 data yielded this error:

xarray.core.variable.MissingDimensionsError: 'heights' has more than 1-dimension and the same name as one of its dimensions ('mode', 'heights'). xarray disallows such variables because they conflict with the coordinates used to label dimensions.

Is there a way around this?

geodisplay.py projection not working with any other options but the default

While using the GeographicPlotDisplay function when trying to use other cartopy projections the map extent changes or no data appears.

The default is PlateCarree and works fine but when trying to use a Mercator projection the data shows up but the map extent is changed.

When using a LambertConformal projection, the extent appears to be off, since none of the features that were supposed to be plotted appear.

Combine clean.py into armfiles.py

The clean.py script is very ARM-centric, so I'm wondering if it would make sense to include it in the armfiles.py file instead of keeping it as a stand-alone one. Or, as another option, since it is very QC-centric, it could be moved into the qc directory and titled appropriately if it is mainly for ARM QC. @kenkehoe thoughts?

Roadmap Discussion

It seems like a lot of the initial framework is developed. It seems like a good time to discuss what else we envision adding to this repo in the near and far term.

Plot embedded QC

We need to create a plot similar to DQ Inspector of embedded QC lining up with data. We can assign this to me.
sgpmetE13 b1 atmos_pressure_qc_tests 20190801

Auto Day/Night Background with 1D time Series

Right now, the day/night background is automatically displayed if the variable is a 1d time series and there's no "ydata". Should we remove this and leave up to users to display background?

if ydata is None:
    self.day_night_background(subplot_index)
    ax.plot(xdata, data, '.', color=line_color)
...

Easy Overlay of QC in TimeSeries Plot

We need an easy way to flag data based on QC variables, starting with the TimeSeries plotting. Something to which we could pass a list of test numbers or a list of quality levels (bad, suspect) and produce something like the attached plot.

Screen Shot 2019-07-10 at 11 44 22 AM

QC Barh Plot for 2D data

When a user tries to plot 2D-QC data, it currently runs into an error. It would be good to figure out plots similar to this.

multi-1d_vars

New QC Tests to Add

This is to track what QC tests need to be added. I added all the numpy masked array options and two more, persistence and single variable comparison.

Clean way to store multiple objects in display object

Another issue that I need to get working on is to have a cleaner way to plot data from multiple objects at a time in one display. Right now you have to merge objects, but it would be nicer to have the display object natively support the display of data from more than one object at a time so that the user does not have to make a new object and hog up memory and resources.

Figsize Not Working Correctly

While we have figsize in an example, I don't think it is actually being used in the plotting script. Changing figsize in the example below does not actually change the plot size:

display2 = act.plotting.TimeSeriesDisplay(new,figsize=(8,5))
display2.plot('backscatter')

License and Author Files Incorrect

It looks like the AUTHOR and LICENSE files from my initial build using the scientific cookie cutter were pulled over. We should update these accordingly.

Secondary Y axis

@rcjackson @kenkehoe I was looking at adding in a feature for working with a secondary y axis on the timeseries plots, but it is a little more in depth than I thought, mainly due to the set_yrng function. We rely on the subplot indices to set a lot of things and that currently does not include anything about a secondary y-axis. I could put in a work around for now that I think will work, but it's not ideal long-term. Could we adjust the subplot indices at all or are there any other options you could think of?

Day/night background issues with polar night

The day/night plotting background has issues with astral when calculating sunrise and sunset times. It throws an error when there is no sun during the day. We can fix that with a try/except.

The other issue is that if a user catches the exception, the background for polar night comes out as all day. I think this is an issue with how things are plotted.

Screen Shot 2020-01-31 at 12 41 59

Adding internal global attributes to Xarray Dataset to replace ACT object attributes

Initially ACT would add some attributes to the Xarray Dataset that are used by the plotting functions. Some of them include file_dates, file_times, and datastream. We changed to adding them as Dataset global attributes so others who don't use our reader can fix the missing information when needed.

But the part I'm not a fan of is adding these without an indication that they are part of ACT and not part of the data file. I suggest prepending "__" to any internal information added to the Dataset so we know about it and can easily remove it if needed. The current attributes I see, file_dates, file_times, arm_standards_flag and datastream, should be changed to __file_dates, __file_times, __arm_standards_flag and __datastream.

This is also causing issues with the variables that do exist in the data file like "datastream". We have some code that fixes issues when processing older ARM data that are missing global attributes, particularly datastream. Since ACT is adding that global attribute our code thinks things are OK and we have other issues.

Here is an example of a Dataset's global attributes:

    ...
    serial_number:              P2710385
    qc_standards_version:       1.0
    qc_method:                  Standard Mentor QC
    qc_comment:                 The QC field values are a bit packed represen...
    zeb_platform:               enasondewnpnC1.b1
    history:                    created by user dsmgr on machine garnet at 20...
    file_dates:                 ['20191206']
    file_times:                 ['113100']
    datastream:                 act_datastream
    arm_standards_flag:         ARMStandardsFlag.OK
    __source_files:             ['enasondewnpnC1.b1.20191206.113100.cdf']

Everything after history: is added by ACT or DQ Office code. We add __source_files.
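The "__" prefix proposal can be sketched with plain dictionaries standing in for the Dataset global attributes. The helper names below are hypothetical; the point is only that prefixed keys never collide with attributes from the data file and can be removed in one pass.

```python
INTERNAL_PREFIX = "__"


def add_internal_attrs(attrs, internal):
    """Return a copy of attrs with internal values stored under prefixed keys."""
    out = dict(attrs)
    for key, value in internal.items():
        out[INTERNAL_PREFIX + key] = value
    return out


def strip_internal_attrs(attrs):
    """Return a copy of attrs without any prefixed (internal) keys."""
    return {k: v for k, v in attrs.items() if not k.startswith(INTERNAL_PREFIX)}


# Attributes that came from the data file itself.
file_attrs = {"datastream": "enasondewnpnC1.b1", "qc_standards_version": "1.0"}
# Information added by the reader, kept separate by the prefix.
tagged = add_internal_attrs(
    file_attrs, {"datastream": "act_datastream", "file_dates": ["20191206"]}
)
restored = strip_internal_attrs(tagged)
```

With this layout the file's own datastream attribute is left untouched, so code that repairs missing global attributes can still tell whether the file provided one.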

Stability Indices for BBSS data

It may be useful to have a way to calculate stability indices for the BBSS data. Perhaps a way to output a table with the plot or a separate file.

Docs has wrong location to install from

In the docs, it says to clone from the AdamTheisen repository, but it's actually better to clone from the ANL-DIGR repository. In the end, we'll fix this whole issue with pip and conda packaging, but this is a reminder for me to change that part of the docs tomorrow.

Way to set absolute limits for time series plot

I am putting in a feature request to have a way to set an absolute limit when making a time series plot so a single very large or small value will not make the plot unusable. I have some ideas on how to implement.

  • Thinking not used as default
  • can add to min or max independently
  • can add absolute values or percentage of mean, mode or median
  • Will indicate on plot that the range displayed does not show all values

I'll try to get to this soon.
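A minimal sketch of the absolute-value variant of the ideas listed above, in plain Python. The function name and the abs_min/abs_max parameters are hypothetical. It clamps each end of the range independently and reports whether any values fall outside the displayed range, so the plot can indicate that.

```python
import math


def limited_yrng(values, abs_min=None, abs_max=None):
    """Return ([low, high], clipped) with optional absolute limits applied.

    clipped is True when some finite values lie outside the returned range.
    """
    finite = [v for v in values if math.isfinite(v)]
    if not finite:
        raise ValueError("no finite values to compute a range from")
    low, high = min(finite), max(finite)
    clipped = False
    if abs_min is not None and low < abs_min:
        low, clipped = abs_min, True
    if abs_max is not None and high > abs_max:
        high, clipped = abs_max, True
    return [low, high], clipped


# A single spike of 1000 no longer stretches the axis.
yrng, clipped = limited_yrng([1, 2, 3, 1000], abs_max=10)
```

Leaving both limits as None reproduces the plain min/max range, which matches the idea that the feature is not used by default.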

Documentation issues

It looks like the way that we are currently doing the documentation is not translating well into the sphinx generated docs.

I.e.

obj : Xarray Dataset Object

creates this:

objXarray Dataset Object

There is no spacing between the variable name and the type. Are we doing this the correct way @rcjackson @zssherman ?

Wind rose and sounding plots.

In addition to a TimeSeriesDisplay, we need to have a WindRoseDisplay and a SkewTDisplay object. This needs to be done by the ARM-ASR PI meeting as a demonstration that shows that Paytsar is able to use it for her data.

For Skew-T plots, I am thinking of introducing metpy as a dependency since it does these really well.

reading csv into xarray

Adam mentioned the need to allow data to be in a more general format for this repo to be broader to the open source community. A quick search for how to read CSV data into xarray didn't show any results, but it does look like pandas can read CSV. Can we create a module or example using pandas to show how the CSV data will be read in with pandas and then converted to xarray with DataFrame.to_xarray()?
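A minimal sketch of that idea, assuming pandas and xarray are installed. The wrapper name read_csv_to_xarray and the sample column names are hypothetical; pd.read_csv and DataFrame.to_xarray are the real calls, and any keyword arguments are passed straight through to pd.read_csv.

```python
import io

import pandas as pd


def read_csv_to_xarray(filepath_or_buffer, **kwargs):
    """Read CSV data with pandas and convert the result to an xarray Dataset."""
    df = pd.read_csv(filepath_or_buffer, **kwargs)
    # The DataFrame index becomes the Dataset coordinate.
    return df.to_xarray()


# Small in-memory example; a file path works the same way.
csv_text = "time,temp\n2019-01-01 00:00,1.5\n2019-01-01 00:01,1.7\n"
ds = read_csv_to_xarray(io.StringIO(csv_text), index_col="time", parse_dates=True)
```

Setting index_col to the time column makes time the dimension of the resulting Dataset, which is the layout a time series plot would need.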

ACT Title Missing 'data'

The title for the ACT repo is missing 'data' and should read Atmospheric data Community Toolkit

u/v wind barb adding all colorbars to last subplot

When plotting multiple subplots using act.plotting.TimeSeriesDisplay.plot_barbs_from_u_v and adding a colormap to the subplots, all of the colorbars end up being plotted on the last subplot rather than their respective subplots:

I didn't notice anything in the code that I thought would be causing this. Any ideas?

Documentation typo

Contributing_—_Atmospheric_data_Community_Toolkit_0_1_3_documentation

In the documentation under Setting up an Environment/Install there is a typo in the phrase about using ACT and "its dependencies": it reads "is dependencies".

Speed up Plotting of 2D datasets

We need to find a way to plot out high-res 2D datasets faster. The MPL data takes a ridiculous amount of time to plot up 3 plots. Maybe we need to play around with image plotting of the data as well instead of pcolormesh and see how that works for cases like this.

Missing dependency : appdirs

Hi, I just installed ACT from conda-forge and this error came up on import:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
~/.conda/envs/cmac_env/lib/python3.6/site-packages/metpy/_version.py in get_version()
     13     try:
---> 14         from setuptools_scm import get_version
     15         return get_version(root='..', relative_to=__file__,

ModuleNotFoundError: No module named 'setuptools_scm'

During handling of the above exception, another exception occurred:

DistributionNotFound                      Traceback (most recent call last)
<ipython-input-11-d20e7d291629> in <module>
     31 import matplotlib.ticker as mt
     32 import matplotlib.font_manager as fm
---> 33 import act
     34 get_ipython().run_line_magic('matplotlib', 'inline')

~/.conda/envs/cmac_env/lib/python3.6/site-packages/act/__init__.py in <module>
      1 from . import io
----> 2 from . import plotting
      3 from . import corrections
      4 from . import utils
      5 from . import tests

~/.conda/envs/cmac_env/lib/python3.6/site-packages/act/plotting/__init__.py in <module>
     19 from .ContourDisplay import ContourDisplay
     20 from .WindRoseDisplay import WindRoseDisplay
---> 21 from .SkewTDisplay import SkewTDisplay
     22 from .XSectionDisplay import XSectionDisplay
     23 from .GeoDisplay import GeographicPlotDisplay

~/.conda/envs/cmac_env/lib/python3.6/site-packages/act/plotting/SkewTDisplay.py in <module>
     12 
     13 try:
---> 14     import metpy.calc as mpcalc
     15     METPY_AVAILABLE = True
     16 except ImportError:

~/.conda/envs/cmac_env/lib/python3.6/site-packages/metpy/__init__.py in <module>
     34 from ._version import get_version  # noqa: E402
     35 from .xarray import *  # noqa: F401, F403
---> 36 __version__ = get_version()
     37 del get_version

~/.conda/envs/cmac_env/lib/python3.6/site-packages/metpy/_version.py in get_version()
     17     except (ImportError, LookupError):
     18         from pkg_resources import get_distribution
---> 19         return get_distribution(__package__).version

~/.conda/envs/cmac_env/lib/python3.6/site-packages/pkg_resources/__init__.py in get_distribution(dist)
    480         dist = Requirement.parse(dist)
    481     if isinstance(dist, Requirement):
--> 482         dist = get_provider(dist)
    483     if not isinstance(dist, Distribution):
    484         raise TypeError("Expected string, Requirement, or Distribution", dist)

~/.conda/envs/cmac_env/lib/python3.6/site-packages/pkg_resources/__init__.py in get_provider(moduleOrReq)
    356     """Return an IResourceProvider for the named module or requirement"""
    357     if isinstance(moduleOrReq, Requirement):
--> 358         return working_set.find(moduleOrReq) or require(str(moduleOrReq))[0]
    359     try:
    360         module = sys.modules[moduleOrReq]

~/.conda/envs/cmac_env/lib/python3.6/site-packages/pkg_resources/__init__.py in require(self, *requirements)
    899         included, even if they were already activated in this working set.
    900         """
--> 901         needed = self.resolve(parse_requirements(requirements))
    902 
    903         for dist in needed:

~/.conda/envs/cmac_env/lib/python3.6/site-packages/pkg_resources/__init__.py in resolve(self, requirements, env, installer, replace_conflicting, extras)
    785                     if dist is None:
    786                         requirers = required_by.get(req, None)
--> 787                         raise DistributionNotFound(req, requirers)
    788                 to_activate.append(dist)
    789             if dist not in req:

DistributionNotFound: The 'appdirs' distribution was not found and is required by pooch

What to do when file not found

I would like to change the base behavior of io.read_netcdf() when a file is not found: catch the FileNotFoundError and just return None instead. It makes more sense to me to let the reading function check whether the file exists. We do a lot of things that depend on the availability of files, and having to wrap everything in a try seems excessive. I would suggest adding a verbose option that prints a message when the file is not found, but not making that the default. If this is OK I'll make the update.
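The proposed behavior can be sketched as a thin wrapper around any reader function. The name read_or_none and its parameters are hypothetical, and the stand-in reader below simply opens a text file; this is not the io.read_netcdf() implementation.

```python
def read_or_none(reader, filename, verbose=False):
    """Call reader(filename) and return None if the file does not exist."""
    try:
        return reader(filename)
    except FileNotFoundError:
        if verbose:
            # Optional message, off by default as suggested above.
            print(f"File not found: {filename}")
        return None


def read_text(path):
    """Stand-in reader used only for this example."""
    with open(path) as handle:
        return handle.read()


result = read_or_none(read_text, "/no/such/dir/missing_file.nc")
```

Callers can then write a plain "if result is None" check instead of wrapping each read in its own try block.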
