Code Monkey home page Code Monkey logo

pycontrails's People

Contributors

dependabot[bot] avatar marcstettler avatar mlshapiro avatar nickmasson avatar roger-teoh avatar thabbott avatar trdean1 avatar tsankar avatar zebengberg avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

pycontrails's Issues

Remove `FlightPhase` dataclass

Description

In 0.34, we moved the FlightPhase data class from the BADA files into the core.flight module.

This class doesn't necessarily make sense and I think should be wrapped into the Flight class as a property that can be overridden like altitude or level.

I would suggest

  • The single key "phase" with a value scheme of {-1: descending, 0: cruise, 1: ascending, np.nan: unknown}
  • A Flight.phase property that returns this key
  • Any other necessary properties or methods that support the calculation of the phase

pin version of Black/Ruff

Description

Black and Ruff are used as part of code development control. They are not part of the critical path for the pycontrails application code.

It is a common occurrence that updated Black releases will cause issues (either causing previously passing code to fail, and/or, creating inconsistency between a local development environment and CICD environment.

As such, it is sensible to pin the version of these two packages to fixes versions, resulting in deterministic behavior across all environments.

Task(s)
Update all areas where Black/Ruff versions are declared to use a fixed/pinned version. This includes, but is not limited to, directives that set up CICD environment and local environment. If feasible, refactor to include a single reference for black/ruff version across the entire project (i.e. avoid declarations in multiple places).

Acceptance Criteria
Pycontrails in a production/installed environment, development environment and CICD environment all use the same version of black/ruff.

Review test coverage

Description

We calculate test coverage. We should review and choose to share as a tag on repository (or not).

Additional context

I personally think test coverage isn't a very helpful stat, but its pretty common.

Perhaps we can improve the codebase slightly by motivating an expansion of coverage.

GFS + PCR differs widely from ERA5 + PCR

Description

I'm sure I missed something, and I don't yet understand humidity scaling but something looks off using ISSR and PCR with GFS:

  • Running SAC on both GFS and ERA5 metdataset -> essentially identical results.
  • Running ISSR on both sets with specifying humidity scaling -> essentially identical results
  • Running ISSR on both sets without specifying humidity scaling -> widely different results
  • Running PCR (which doesn't allow for specifying humidity scaling afaik) -> widely different results

What am I missing?

Details

  • Version: [ 0.45]
  • Module: [issr.py, pcr.py]

Steps to Reproduce

from pycontrails.datalib.gfs import GFSForecast
from pycontrails.datalib.ecmwf import ERA5
from pycontrails.models.sac import SAC
from pycontrails.models.issr import ISSR
from pycontrails.models.pcr import PCR
from pycontrails.models.humidity_scaling import ConstantHumidityScaling
import matplotlib.pyplot as plt

# ignore pycontrails warning about ECMWF humidity scaling
import warnings
warnings.filterwarnings(message=".*humidity scaling.*", action="ignore")

# Store data files to local disk (default behavior)
times = ("2023-07-20 00:00:00")
pressure_levels = [300, 250]
variables = ["t", "q"]
gfs = GFSForecast(time=times, variables=variables, pressure_levels=pressure_levels, show_progress=True)
era5 = ERA5(time=times, variables=variables, pressure_levels=pressure_levels)

met_gfs = gfs.open_metdataset()
met_era5 = era5.open_metdataset()

# Instantiate and run model
scaling = ConstantHumidityScaling(rhi_adj=0.98)

issr1 = ISSR(met_gfs, humidity_scaling=scaling).eval()
issr2 = ISSR(met_era5, humidity_scaling=scaling).eval()
issr3 = ISSR(met_gfs).eval()
issr4 = ISSR(met_era5).eval()

sac1 = SAC(met=met_gfs).eval()
sac2 = SAC(met=met_era5).eval()

pcr1 = PCR(met=met_gfs).eval()
pcr2 = PCR(met=met_era5).eval()

fig, axes = plt.subplots(nrows=4, ncols=2, figsize=(12, 18))
level = 0

sac1.data.isel(time=0, level=level).plot(x="longitude", y="latitude", ax=axes[0,0], cmap="Reds", vmin=0, vmax=1, add_colorbar=False)
axes[0,0].set_title('SAC+GFS')

sac2.data.isel(time=0, level=level).plot(x="longitude", y="latitude", ax=axes[0,1], cmap="Blues", vmin=0, vmax=1, add_colorbar=False)
axes[0,1].set_title('SAC+ERA5')

issr1.data.isel(time=0, level=level).plot(x="longitude", y="latitude", ax=axes[1,0], cmap="Reds", vmin=0, vmax=1, add_colorbar=False)
axes[1,0].set_title('ISSR+GFS (With Humidity Scaling)')

issr2.data.isel(time=0, level=level).plot(x="longitude", y="latitude", ax=axes[1,1], cmap="Blues", vmin=0, vmax=1, add_colorbar=False)
axes[1,1].set_title('ISSR+ERA5 (With Humidity Scaling)')

issr3.data.isel(time=0, level=level).plot(x="longitude", y="latitude", ax=axes[2,0], cmap="Reds", vmin=0, vmax=1, add_colorbar=False)
axes[2,0].set_title('ISSR+GFS (No Humidity Scaling)')

issr4.data.isel(time=0, level=level).plot(x="longitude", y="latitude", ax=axes[2,1], cmap="Blues", vmin=0, vmax=1, add_colorbar=False)
axes[2,1].set_title('ISSR+ERA5 (No Humidity Scaling)')

pcr1.data.isel(time=0, level=level).plot(x="longitude", y="latitude", ax=axes[3,0], cmap="Reds", vmin=0, vmax=1, add_colorbar=False)
axes[3,0].set_title('PCR+GFS')

pcr2.data.isel(time=0, level=level).plot(x="longitude", y="latitude", ax=axes[3,1], cmap="Blues", vmin=0, vmax=1, add_colorbar=False)
axes[3,1].set_title('PCR+ERA5')

plt.tight_layout()
plt.show()

image

Implement ATC parser

Description

We want to be able to parse ATC flight plans and turn into a Flight class.

For now, this can be a function in the flight module or a new atc module (or equiv). This can be a classmethod on Flight like in pandas with the prefrix Flight.from_atc_plan -> Flight.

  • Parse ATC plan into a dict and set to attrs. You can set the route string to a route key in the attr dict.

Error when creating `Fleet` from sequence with empty `Flight`s

Opening an issue to note an error that I found difficult to debug. I'll assign myself and work on an improvement.

The error appears when creating a Fleet from a sequence of Flights, some of which are missing data:

import pandas as pd
import numpy as np
from pycontrails import Flight, Fleet

df = pd.DataFrame()
df["longitude"] = np.linspace(0, 50, 100)
df["latitude"] = np.linspace(0, 10, 100)
df["altitude"] = 11000
df["time"] = pd.date_range("2022-03-01 00:00:00", "2022-03-01 02:00:00", periods=100)

flights = [Flight(df, flight_id=1), Flight(flight_id=2)]
Fleet.from_seq(flights)

results in ValueError: Unexpected flight_id(s) {2} in fl_attrs. This simple example is not too hard to debug, but I ran into this error when resample_and_fill dropped a small fraction of a large number of flights to 0 length, and it took me a while to figure out what the problem was. I think we probably want to catch this case earlier on and provide a more informative error.

Add `Flight.to_contrails_api()` method

Description

It would be nice if we could have a the Flight class directly output a json object in a form submittable to the Contrails API. e.g. in the example:

flight = {
    "longitude": np.linspace(-10, -60, n_waypoints).tolist(),
    "latitude": np.linspace(45, 40, n_waypoints).tolist(),
    "altitude": altitude.tolist(),
    "time": time.astype(int).tolist(),
    "aircraft_type": "A320",
}

Maybe we call this .to_contrails_api()?

New ACCF update allows for preselected forecast_step, leading to error in eval runs

Description

Currently trying to carry out some ACCF test runs, however due to a recent update to ClimaCCF, forecast_step can now be predefined. This needs to be updated in the accf.py script in Pycontrails.

Details

  • Version: [e.g. 0.25.1]
  • Module: accf
  • OS: Ubuntu
  • Reporter: Kieran Tait

Steps to Reproduce

  1. Run accf.ipynb with climaccf updated.

Additional Notes

(Additional code snippets, screenshots, potential solutions, etc)

_> _**KeyError Traceback (most recent call last)

Cell In[7], line 2
1 # Eval accfs over this flight
----> 2 fl = ac.eval(fl)

File ~/miniconda3/envs/env_climaccf/lib/python3.9/site-packages/pycontrails/models/accf.py:238, in ACCF.eval(self, source, **params)
236 self.set_source_met()
237 self._update_accf_config()
--> 238 self._generate_weather_store()
240 # check aircraft type and set in config if needed
241 if self.params["nox_ei"] != "TTV":

File ~/miniconda3/envs/env_climaccf/lib/python3.9/site-packages/pycontrails/models/accf.py:310, in ACCF._generate_weather_store(self)
307 else:
308 self.ds_sur = None
--> 310 ws = WeatherStore(
311 self.ds_met, self.ds_sur, ll_resolution=self.p_settings["horizontal_resolution"]
312 )
313 if self.p_settings["lat_bound"] and self.p_settings["lon_bound"]:
314 ws.reduce_domain(
315 {
316 "latitude": self.p_settings["lat_bound"],
317 "longitude": self.p_settings["lon_bound"],
318 }
319 )

File ~/miniconda3/envs/env_climaccf/lib/python3.9/site-packages/climaccf-1.0-py3.9.egg/climaccf/weather_store.py:156, in WeatherStore.init(self, weather_data, weather_data_sur, flipud, **weather_config)
154 self.pre_coordinate_names = inf_coordinates['pre_coor_name']
155 self.aCCF_bool = logic_cal_accfs(inf_variables['logic_variable'])
--> 156 self.forecast_step = self.cfg['forecast_step']
158 if self.cfg['save_as_xr']:
159 self.var_xr = {}

KeyError: 'forecast_step'**_
_

Variation in EF sum - Passing flight object as a list

Discussed in #77

Originally posted by nataj27 August 14, 2023
I've been experimenting with the pycontrails using some sample waypoints.
I observed that the sum of EF changes depending on the order in which flights are stored in the flight object list.

For Example: If the flight object is in the following order -->Flight_object_list = [FL1_obj, FL2_obj, FL3_obj, FL4_obj] the sum of EF for each of the flights are different compared to when the Flight_object_list = [FL3_obj, FL4_obj, FL1_obj, FL2_obj].

Comments in energy_forcing (and subsequent)

Description

In the function mean_energy_flux_per_m it states:

- Instead of taking an average of the previous and following segments, energy flux is only calculated for the following segment.

I think this refers to to the multiplication with the segment length in energy_forcing, therefore I suggest moving the comment to this function. The same is also true for the other statement

- Discontinuity is no longer set to 0 (this occurs directly in model :mod:pycontrails.models.cocip.Cocip)

which also makes more sense in the docstring of energy_forcing.

Also, it might be worth stating in the docstring that the term "energy forcing" seems not to be consistently defined (e.g., in Schumann et al., "Atmospheric Physics: Background - Methods - Trends", 2012), energy forcing is defined as energy input per distance. Teoh et al. ("Targeted Use...", 2022) give different energy forcings (integrated, energy per flight distance, energy per contrail length).

Finally, why is only the segment length at time 2 used for the calculation (return energy_flux_per_m * seg_length_t2), instead of a mean segment length?

Exponent of effective_tau_cirrus

Description

Hello pycontrail team,

Looking at the 'effective_tau_cirrus' function in radiative_forcing.py, it seems that in the code the product (tau_cirrus_eff * delta_sc_aps) is outside the exponent, while the source paper places it inside the exponent. Which one is correct?

Best regards,
Wessel

Details

  • Version: 0.47.2
  • Module: radiative_forcing.py
  • OS: windows

Additional Notes

def effective_tau_cirrus(
tau_cirrus: npt.NDArray[np.float_],
mue: npt.NDArray[np.float_],
delta_sc: npt.NDArray[np.float_],
delta_sc_aps: npt.NDArray[np.float_],
) -> npt.NDArray[np.float_]:
r"""
Calculate the effective optical depth of natural cirrus above the contrail, e_sw.

Refer to Eq. (11) of Schumann et al. (2012).

Parameters
----------
tau_cirrus : npt.NDArray[np.float_]
    Optical depth of numerical weather prediction (NWP) cirrus above the
    contrail for each waypoint.
mue : npt.NDArray[np.float_]
    Cosine of the solar zenith angle (theta), mue = cos(theta) = sdr/sd0
delta_sc : npt.NDArray[np.float_]
    Habit-specific parameter to account for the optical depth of natural
    cirrus above the contrail
delta_sc_aps : npt.NDArray[np.float_]
    Habit-specific parameter to account for the optical depth of natural
    cirrus above the contrail

Returns
-------
npt.NDArray[np.float_]
    Effective optical depth of natural cirrus above the contrail,
    ``n_waypoints x 8`` (habit) columns.
"""
tau_cirrus_eff = tau_cirrus / (mue + 1e-6)
return np.exp(tau_cirrus * delta_sc) - (tau_cirrus_eff * delta_sc_aps)

Add DOI to repository

Description

  • Add Zenodo DOI to repository so its citable.
  • Decide if we want to add a separate Zenodo data repository with test datasets

Support loading GFS files from local file paths

We now support loading HRES forecast data from local files. Its unfortunately not as straightforward to load local GFS files using similar patterns.

GFS currently works well when downloading files and caching.

This is low priority, and we may agree to close this as a wontfix.

Review Sphinx API docstrings

Description

When running the documentation build, there are many warnings and errors about the docstrings.

Docstrings have also never been fully reviewed and cleaned up.

This task should include

  • Addressing or ignoring all the docstring errors in the make docs-build process
  • Reviewing the built API docs and correcting any formatting errors

Wrong use of constants.epsilon in sac.slope_mixing_line

Description

The sac.slope_mixing_line function is making use of parameter constants.epsilon, but epsilon is placed at an incorrect location in the equation. How constants.epsilon is currently defined (epsilon: float = R_d / R_v), it should be placed above the division bar, not below.

Either epsilon should be redefined as:
epsilon: float = R_v/ R_d

or

slope_mixing_line should be redefined as:
slope_mixing_line = (ei_h2o * c_pm * air_pressure * constants.epsilon) / (q_fuel * (1.0 - engine_efficiency))

Details

  • Version: [0.45.0]
  • Module: [sac.slope_mixing_line, constants.epsilon]
  • OS: [windows]

Additional Notes

Current code:

#: Ratio of gas constant for dry air / gas constant for water vapor
epsilon: float = R_d / R_v

def slope_mixing_line(
specific_humidity: ArrayLike,
air_pressure: ArrayLike,
engine_efficiency: float | ArrayLike,
ei_h2o: float,
q_fuel: float,
) -> ArrayLike:
r"""Calculate the slope of the mixing line in a temperature-humidity diagram.

This quantity is often notated with ``G`` in the literature.

Parameters
----------
specific_humidity : ArrayLike
    A sequence or array of specific humidity values, [:math:`kg_{H_{2}O} \ kg_{air}`]
air_pressure : ArrayLike
    A sequence or array of atmospheric pressure values, [:math:`Pa`].
engine_efficiency: float | ArrayLike
    Engine efficiency, [:math:`0 - 1`]
ei_h2o : float
    Emission index of water vapor, [:math:`kg \ kg^{-1}`]
q_fuel : float
    Specific combustion heat of fuel combustion, [:math:`J \ kg^{-1} \ K^{-1}`]

Returns
-------
ArrayLike
    Slope of the mixing line in a temperature-humidity diagram, [:math:`Pa \ K^{-1}`]
"""
c_pm = thermo.c_pm(specific_humidity)  # Often taken as 1004 (= constants.c_pd)
return (ei_h2o * c_pm * air_pressure) / (constants.epsilon * q_fuel * (1.0 - engine_efficiency))

Implement spire flight data processing (Teoh 2023) as datalib

  • Module in datalib called spire. Feel free to make this functional, or a class Spire
  • Input is a Pandas Dataframe from load_parquet or load_csv on raw spire ADS-B data
  • Necessary utilities are:
    • Smooth and filter flight waypoints (this may already exist in our flight utilities)
    • Altitude interpolation
    • Separation of individual flights

Minor pain points in develop instructions

I ran into a few minor pain points while following instructions for creating a pycontrails development environment (https://py.contrails.org/develop.html). I'm documenting them here in case the record is useful. Addressing them might somewhat improve the new-developer experience, but probably shouldn't be a priority.

Fix ArrayLike TypeVar

Description

The ArrayLike TypeVar is not set up correctly and may be hiding underlying type inconsistencies. Our goal here was to create a single type that could handle xr.DataArray and np.ndarray consistently.

It currently works decently well, but we have to use inline ignores in certain circumstances (see # type: ignore[assignment]).

I think the preferred solution is to use np.ndarray everywhere specifying the types we expect in the array.

Details

  • Version: 0.35.0
  • Module: utils.types
  • Reporter: @mlshapiro

Additional Notes

This may require more handling in the basic grid models, like issr, sac, pcr.

Changing fuel on fleet

Discussed in #84

Originally posted by GunnarQuante August 25, 2023
For some weeks I am trying to change the fuel used on a fleet/list of flights from JetA to a SAFBlend. I have a fleet of e.g., 100 flights and would like to run Cocip on them once with JetA and once with a 100% SAFBlend. When I loop through each flight in the fleet, change the fuel and run Cocip on each flight individually, the overall contrail EF changes. This is not the case if I change the fuel for the entire fleet. In this case, the overal contrail EF of the JetA run to the SAFBlend run is identical. To reduce computational time, I am looking for a way to run Cocip on the entire fleet:

def CoCiP_by_fuels(met, rad, flights_in, fuel):

# Use BADA3 for performance modelling
bada_model = BADAFlight(bada_priority=3)

params = {
    "process_emissions": True,
    "verbose_outputs": False,
    "humidity_scaling": ConstantHumidityScaling(rhi_adj=0.98),
}
cocip = Cocip(met=met, rad=rad, params=params, aircraft_performance=bada_model)

    Fleet = pycontrails.Fleet.from_seq(flights_in)
    Fleet.fuel = SAFBlend(100)
    print(Fleet.fuel.fuel_name)
    Fleet_out=cocip.eval(Fleet)
    print(Fleet_out.fuel.fuel_name)

return Fleet_out

Note that also the fuel_name changes from Fleet to Fleet_out. I suppose I am missing something here, but after quite some reading I still can't find a fix.

Create dry-advection model for flight trajectories

Description

Based on flight trajectory, create a model of where a contrail "might be" based on advection.

We'd have to make assumptions about spreading and sedimentation, but these could be model parameters.

Alternatives

Additional context

  • I think the team at Google may have already implemented a model like this as their "baseline"
  • This could be baseline for a slight more nuanced model that takes into account SAC and ISSR

Consistency in `Cocip` outputs

Presently, the Cocip model attaches different variables to the source and contrail attributes depending on if flight waypoints generate persistent contrails. These differences should be documented or be made consistent.

  • Avoid unnecessary operations in the default evaluation mode
  • Attach any extraneous variables by setting params["verbose_outputs"]
  • Consider running Cocip in a "low memory" mode in which the contrail_list attribute isn't used (instead, only the latest_contrail is kept during model evaluation). This would very helpful for better memory management in the contrails API.

Remove Aircraft class and document special Flight.attr keys

Description

I struggle to understand the difference between engine_uid and engine_name. According to the documentation, the Aircraft class contains the attribute engine_name. The Flight class contains nothing related to the engine itself, only an unspecified attrs attribute.

https://py.contrails.earth/api/pycontrails.Aircraft.html
https://py.contrails.earth/api/pycontrails.Flight.html

Internally, I find that engine_uid is used for BADA4 support.

https://github.com/contrailcirrus/pycontrails-bada/blob/32ce4a5544929fcf78f490bc353ae74c965929d9/pycontrails_bada/bada_interface.py#L124

In BADAFlight.eval the docstring says

If the following variables are not provided in the flight attribute, the default aircraft-engine assignment and
aircraft mass properties from BADA will be assumed:
- engine_uid

https://github.com/contrailcirrus/pycontrails-bada/blob/4968c976cd2b35efc94b134f49d7c201c6f60df2/pycontrails_bada/bada_model.py#L396

So, engine_uid should be given in the Flight.attrs dictionary? What is Aircraft.engine_name used for? Maybe some documentation is lacking on the usage and the differences?

Automatic Flight resampling in Cocip

Implicitly, CoCiP assumes flight waypoints occur with sufficiently fine time frequency to ensure weather conditions along the associated segment are close to constant. In practice, waypoints with at least 1 minute time frequency are preferred when running Cocip. If the source parameter has coarse resolution in Cocip.eval, it could be automatically resampled to a higher frequency, then the per-waypoints results could be aggregated back to the original source.

This could be governed by a flight_resample_freq parameter, which is None by default.

Unit inconsistency in turbulent_kinetic_energy_dissipation_rate function

Description

My understanding is that the turbulent kinetic energy dissipation rate (or epsilon) should be in the unit of [m^2 / s^3]. However, whitin the calculation of epsilon, the square of w'n [m/s] is multiplied with the square of (enhanced) shear [s^-1], resulting in [m^2/ s^4]. This seems inconsistent.

I see this both in code and in "A contrail cirrus prediction model" Schumann 2012.

Am I missing something or should this be corrected?

Details

  • Version: [bv0.45.0]
  • Module: [wake_vortex.py]
  • OS: [windows]

Additional Notes

See snippet below:

def turbulent_kinetic_energy_dissipation_rate(
ds_dz: npt.NDArray[np.float_],
shear_enhancement_factor: npt.NDArray[np.float_] | float = 1.0,
) -> npt.NDArray[np.float_]:
"""
Calculate the turbulent kinetic energy dissipation rate (epsilon).

The shear enhancement factor is used to account for any sub-grid scale turbulence.

Parameters
----------
ds_dz : npt.NDArray[np.float_]
    Difference in wind speed over dz in the atmosphere, [:math:`m s^{-1} / m`]
shear_enhancement_factor : npt.NDArray[np.float_] | float
    Multiplication factor to enhance the wind shear

Returns
-------
npt.NDArray[np.float_]
    turbulent kinetic energy dissipation rate

Notes
-----
See eq. (37) in :cite:`schumannContrailCirrusPrediction2012`.

References
----------
- :cite:`schumannContrailCirrusPrediction2012`
"""
return 0.5 * 0.1**2 * (ds_dz * shear_enhancement_factor) ** 2

Run `nbval` in stricter mode

  • Run notebook tests as part of the CI. This could happen when a pull request is opened.
  • Right now, nbval-lax is run. We may want to run the stricter nbval check (no lax).
  • Consider using a tool like nb-clean to sanitize notebooks. This can be used with pre-commit.

Support model levels

Description

General

  • Add new metview dependency for [ecmwf] installation
  • Add a new input model_levels to ERA5 constructor. Type should be a list of model levels (like pressure_levels).
  • pressure_level input is required when defining model_levels - we will interpolate the model levels to these pressure levels.

CDS integration

When no cached data and no paths are supplied, try to download data from CDS:

  • Update request to cdsapi to reflect ml when model_levels is defined. Continue to default to grid coordinates (rather than native spherical harmonic coefficients) so regridding is done by CDS and returned data can come in netcdf format.
  • Use metview to automatically interpolate CDS returned data for all variables / times to pressure levels.
    • This should probably happen before caching so that we don't need to do re-interpolate every time? This may be less robust.

Local Files

  • If local file paths are found, assume that it is a NetCDF file with the same format as the output from CDS. Load the local files and then handle the same was as the CDS integration.

Alternatives

  • Calculate average pressure levels for each model level (valid for high altitudes)

Additional context

Reference

@depion @milenkorz

Add required met and radiation variables to Cocip documentation

Description

We have the properties Cocip.met_variables and Cocip.rad_variables, but these don't render nicely in Sphinx.

We should add a succinct description of the required ECMWF met and rad variables to the Cocip docstring and Cocip example notebook (under download meteorology).

Alternatives

  • Try to render the met_variables and rad_variables properties more clearly. Link from the docstrring

Additional context

While walking through a new user, I was looking for the required met variables and couldn't find them outside of the source code.

Inconsistent hole detection

Description

Polygon generation with different thresholds can produce inconsistent sets of holes.

If a hole is present with a low threshold, it should also be present at a higher threshold provided the enclosing polygon remains intact. In some cases, however, holes appear only at intermediate thresholds.

In the example below, holes should be present in the large region of high EF at all thresholds, but appear only with thresholds near 50,000,000 J/m.

image

image

image

Details

  • Version: v0.50.0
  • Module: core.met
  • OS: macOS

Steps to Reproduce

Download https://storage.googleapis.com/contrails-301217-public-data/sample.nc and run

import xarray as xr
import matplotlib.colors as colors
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import geojson
import numpy as np

from pycontrails.core import MetDataArray

def build_polygons(field, threshold):
   params = dict(
       fill_value=0.0,
       iso_value=threshold,
       min_area=0.3,
       epsilon=0.05,
       precision=2,
       interiors=True,
       convex_hull=False,
       include_altitude=True,
   )
   poly = field.to_polygon_feature(**params)
   return geojson.FeatureCollection(poly)

ds = xr.open_dataset("sample.nc")

plt.figure(figsize=(10, 3), dpi=300)
ax = plt.subplot(111, projection=ccrs.PlateCarree())
ax.coastlines(lw=0.1, color="gray")
im = ax.pcolormesh(ds["longitude"], ds["latitude"], ds["ef_per_m"].squeeze().T,
    shading="nearest", cmap="RdBu_r", norm=colors.SymLogNorm(linthresh=1e7))
plt.colorbar(im, label="EF (J m$^{-1}$)")

threshold = 3e7
for poly in build_polygons(
   MetDataArray(ds["ef_per_m"]),
   threshold
)["features"]["geometry"]["coordinates"]:
   for region in poly:
      v = np.array(region)
      ax.plot(v[:,0], v[:,1], transform=ccrs.Geodetic(), lw=0.5, color="black")

ax.set_title(f"{threshold} J/m")

plt.figure(figsize=(10, 3), dpi=300)
ax = plt.subplot(111, projection=ccrs.PlateCarree())
ax.coastlines(lw=0.1, color="gray")
im = ax.pcolormesh(ds["longitude"], ds["latitude"], ds["ef_per_m"].squeeze().T,
    shading="nearest", cmap="RdBu_r", norm=colors.SymLogNorm(linthresh=1e7))
plt.colorbar(im, label="EF (J m$^{-1}$)")

threshold = 5e7
for poly in build_polygons(
   MetDataArray(ds["ef_per_m"]),
   threshold
)["features"]["geometry"]["coordinates"]:
   for region in poly:
      v = np.array(region)
      ax.plot(v[:,0], v[:,1], transform=ccrs.Geodetic(), lw=0.5, color="black")

ax.set_title(f"{threshold} J/m")

plt.figure(figsize=(10, 3), dpi=300)
ax = plt.subplot(111, projection=ccrs.PlateCarree())
ax.coastlines(lw=0.1, color="gray")
im = ax.pcolormesh(ds["longitude"], ds["latitude"], ds["ef_per_m"].squeeze().T,
    shading="nearest", cmap="RdBu_r", norm=colors.SymLogNorm(linthresh=1e7))
plt.colorbar(im, label="EF (J m$^{-1}$)")

threshold = 7e7
for poly in build_polygons(
   MetDataArray(ds["ef_per_m"]),
   threshold
)["features"]["geometry"]["coordinates"]:
   for region in poly:
      v = np.array(region)
      ax.plot(v[:,0], v[:,1], transform=ccrs.Geodetic(), lw=0.5, color="black")

ax.set_title(f"{threshold} J/m")

Add ECMWF Open Data via API?

Description
ECMWF-IFS low resolution data are available on the opendata server without the need of registration with the open analogue of ecmwf-api-client, https://github.com/ecmwf/ecmwf-opendata.
As both packages use a really similar architecture maybe it wouldn't be hard to implement this in the package?

UserWarning: Met data 'rad' does not overlap the grid domain along the time axis. Lead to missing values in first hour of eval.

Description

Loading met and rad data from cachestore with same start and end times leads to UserWarning only for rad parameters. Consequently, the first hour of eval does not contain any values. I think this is not the intended behavior, since ::func:_check_met_rad_times (lines 141-153 in cocip_time_handling) checks rad and met with same tmin and tmax values.

Details

  • Version: v0.49.3 (but also v0.49.2/1)
  • Module: models.cocipgrid.cocip_time_handling
  • OS: linux cluster
  • Reporter: kirscsim

Steps to Reproduce

Minimal example:


from pycontrails import DiskCacheStore
from pycontrails.datalib.ecmwf import ERA5, HRES
from pycontrails.models.cocip import Cocip
from pycontrails.models.cocipgrid import CocipGrid
from pycontrails.models.ps_model.ps_grid import PSGrid
from pycontrails.core.met import MetDataset
from pycontrails.models.humidity_scaling import ConstantHumidityScaling
from datetime import datetime, timedelta

Data loading from cachestore and preprocessing:

t0 = datetime(2024,2,5,0)
t1 = datetime(2024,2,5,13)

pressure_levels = [
    900, 850, 800, 700, 600, 500, 400, 300, 250, 200, 150, 100, 70, 50, 30
]

forecast_time = datetime(2024,2,5,0)

cachestore = DiskCacheStore(
        cache_dir='path/to/cachestore', 
        allow_clear=False,
    )

met = HRES(
        time=(t0, t1),
        variables=Cocip.met_variables + Cocip.optional_met_variables, 
        pressure_levels=pressure_levels, 
        grid=.25,
        cachestore=cachestore,
        url='https://api.ecmwf.int/v1',
        key=***,
        email=***,
        forecast_time=forecast_time,
)
    
rad = HRES(
    time=(t0, t1),
    variables=Cocip.rad_variables,
    grid=.25,
    cachestore=cachestore, 
    url='https://api.ecmwf.int/v1',
    key=***,
    email=***,
    forecast_time=forecast_time,
)

met = met.open_metdataset(xr_kwargs=dict(parallel=False))
rad = rad.open_metdataset(xr_kwargs=dict(parallel=False))

met = met.wrap_longitude()
rad = rad.wrap_longitude()

Met times:
met['time'].values.min(), met['time'].values.max()
output:

(numpy.datetime64('2024-02-05T00:00:00.000000000'),
 numpy.datetime64('2024-02-05T13:00:00.000000000'))

Rad times:
rad['time'].values.min(), rad['time'].values.max()
output:

(numpy.datetime64('2024-02-05T00:00:00.000000000'),
 numpy.datetime64('2024-02-05T13:00:00.000000000'))

Cocipgrid calculation:

cocipgrid = CocipGrid(
    met=met,
    rad=rad,
    process_emissions=True,
    downselect_met=False,
    humidity_scaling=ConstantHumidityScaling(rhi_adj=0.99),
    aircraft_performance=PSGrid(),
)

# Define the 4D grid box that should be evaluated
delta = 0.25
longitude = np.arange(-20, 20, delta)
latitude = np.arange(50, 70, delta)
level = [
    376.0, 359.0, 344.0, 329.0, 314.0, 300.0, 287.0, 274.0, 262.0, 250.0, 
    238.0, 227.0, 216.0, 206.0, 196.0, 187.0,
]
time = [t0 + timedelta(hours=i) for i in range(2)]
print(time)
grid = MetDataset.from_coords(longitude, latitude, level, time)

# Apply the gridded CoCiPcd
result = cocipgrid.eval(
    source=grid,
    max_age=np.timedelta64(12, 'h'),
)

Output:

[datetime.datetime(2024, 2, 5, 0, 0), datetime.datetime(2024, 2, 5, 1, 0)]

***/micromamba/envs/pycontrails_complete/lib/python3.11/site-packages/pycontrails/models/cocipgrid/cocip_grid.py:2193: UserWarning: Met data 'rad' does not overlap the grid domain along the time axis. This causes interpolated values to be nan, leading to meaningless results.
  warnings.warn(
***/micromamba/envs/pycontrails_complete/lib/python3.11/site-packages/pycontrails/models/cocipgrid/cocip_time_handling.py:149: UserWarning: Parameter 'rad' is too short in the time dimension. Include additional time in 'rad' or reduce 'max_age' parameter.Model start time: 2024-02-05 00:00:00 Model end time: 2024-02-05 13:00:00
  warnings.warn(

CocipGrid eval: 77%
43/56 [01:21<00:25, 1.95s/it]

Additional Notes

I expect a warning from ::func:_check_met_rad_times from both met and rad or None. I also tried to increase time by adding time at the end, which resulted in the same behavior. Adding time before the first hour solved this issue and is my current workaround.

Here visualization:
Issue_fig0
Issue_fig1

Wrong OLR weather value in ACCF contrail

Description

The contrail accf by day is not using the right weather variable for outgoing longwave radiation. Climaccf is using "top net thermal radiation (ttr)" in J m^-2 from ERA5 single level dataset. It is then divided in their implementation by the time interval (3600s in this case) to convert it to the mean flux on this time interval (W m^-2). Climaccf should therefore be given the value in J m^-2, and not the one in W m^-2.

This is the expected result with Climaccf :
climaccf_contrailACCF

This is the result with Pycontrail accf:
pycontrail_contrailACCF

This is the result with climaccf with the wrong OLR values :
climaccf_contrailACCF_wrong_OLR

We think this is the main difference between the two results.

Details

  • Version: 0.42.0
  • Module: models.accf
  • OS: windows

Steps to Reproduce

  1. Run the accfs with pycontrail
  2. Compare with Climaccf with both the weather variable in J/m-² and in W/m-²

Run CI tests on windows

To better support windows usage, we should run the tests workflow with windows-latest in the testing matrix.

Most of the steps in the test.yaml file assume a unix operating system. These should be generalized for a windows runner.

Determine appropriate interpolation methodology for pressure level and specific humidity

Description

In discussions with collaborators, we've received the suggestion to interpolate pressure levels on a log scale to better approximate geometric altitude.

We've also heard the recommendation to interpolate specific humidity as the log(specific humidity) to account for the wide range of value.

Alternatives

  • Linear interpolation (current method)

Additional context

Per discussion

@zebengberg @mccloskey

Create install, FAQ, and Performance documentation

Description

We should add a few more supporting documentation pages:

  • install (more detailed README)
  • FAQ - for questions we answer, we can write up the answers
  • Performance - for questions about performance optimizations

Better Cartopy support

Description

Cartopy is a pain to get work correctly with the other pycontrails dependencies. Provide an optional install target that installs a working version of cartopy, e.g. pip install pycontrails[cartopy]

Resample, fill, and filter flight trajectories

Description

We want to provide an opinionated, general set of methods for preparing flight trajectories for analysis.

Generally this includes:

  • Resample trajectories to have a waypoint every ~ 10s - 1 min
  • Interpolate spatial (lat, lon) gaps when upsampling. Interpolate small gaps linearly, interpolate longer gaps with geodesics.
  • Filter the altitude signal to have an reasonable flight profile with distinct flight phases

Currently we have implemented Flight.resample_and_fill which works pretty well. This function contains some logic to validate an vertical profile, but the logic may be out of date.

  • If possible, make resample_and_fill a module function (flight.resample_and_fill) and call the module function from the class.

We also have a function flight.filter_altitude that should remove high frequency noise (usually ADS-B or FDR) and snap cruise altitudes to 1000s of feet. This should be a cascade of filters with a (1) median filter (2) low pass filter (3) cruise altitude correction. Feel free to design a better methodology that applies to all the possible segment durations we may encounter.

  • Take segment_duration as an input to filter_altitude. In the class method version, we can calculate segment duration automatically
  • Create a class method Flight.filter_altitude that calls the module function
  • Design a set of filtering strategies that work with the set of altitude/segment duration inputs we may receive. Ideally we'll get data with a time interval of < 10s minutes. We may decide we can't filter higher interval data, so we can throw an error if the segment duration > 30 min or whatever starts to really fail.
  • Implement filtering strategy
  • Implement tests for (1) flight with very high frequency altitude data (segment_duration ~ 1s) (2) nominal frequency (segment_duration ~ 1min) (3) lower frequency (segment_duration ~ 10 min) (3) low frequency (like an ATC route string)

As a final step, confirm that the implementations of resample_and_fill and filter_altitude play well together. In the Flight class methods, we can make sure this is handled well. In the module functions, we can reference the input sources (e.g. flight.segment_duration) and suggest the best way to assemble cleaned trajectories.

  • Update the Flight.ipynb with an example of cleaning a trajectory

Alternatives

  • Use the traffic.core.flight.Flight.filter. This relies on the client side implementing the cascading filter. I'd prefer to implement an opinionated set of fliters that work best, and then raise warning/errors when we don't get good results

Additional context

Deprecate or remove GeoVectorDataset.attrs["crs"]

Generally, pycontrails assumes all geospatial data is provided in WGS84. In the GeoVectorDataset docstring, we write:

Expect latitude-longitude CRS in WGS 84.

The GeoVectorDataset interface gives an option to set a custom CRS through its attrs. This attribute is not consistently used in geospatial calculations. As a concrete example, Flight.length and Flight.segment_length both check the crs attribute, whereas Flight.segment_angle and Flight.segment_groundspeed do not. In the cases where the CRS is checked, an error is raised when the data is not WGS84 (the default).

The GeoVectorDataset.attrs["crs"] can likely be removed without impacting the end user. It would result in a slightly cleaner code base, slight performance improvements, and simplify some patterns (for example, Fleet constructions).

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.