piaminterfaces's Introduction

Project specific interfaces to REMIND / MAgPIE

R package piamInterfaces, version 0.18.7

Purpose and Functionality

Project specific interfaces to REMIND / MAgPIE.

Tutorial

  • To understand how to submit to the IIASA database, read this REMIND tutorial.
  • In the following, we differentiate templates (list of variables and corresponding units used in a project) and mappings (specifying which PIAM variable will be mapped to a project variable).

Mappings

Mappings found in the inst/mappings folder serve to map variables from the PIAM framework to variables needed for the submission to databases. The mappings are ;-separated files, using # as comment character, with the following mandatory columns:

  • variable: name of the variable in the project template
  • unit: unit corresponding to variable
  • piam_variable: name of the variable in REMIND / MAgPIE / EDGE-T etc. reporting
  • piam_unit: unit corresponding to piam_variable
  • piam_factor: factor with which the piam_variable has to be multiplied for units to match

Recommended columns:

  • description: description text defining the variable. Never use " and ; in the text.
  • source: abbreviation of the PIAM part where the piam_variable comes from. Use B = Brick, C = MAGICC, M = MAgPIE, R = REMIND, S = SDP postprocessing, T = EDGE-Transport. This column is used to select the variables passed to remind2 and coupling tests. If the variable is not normally reported, add a small x after the model abbreviation for it to be skipped.

Additionally, some mappings use these columns:

  • idx: serial number of variable
  • Tier: importance of variable. 1 means most important
  • Comment: place for comments
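
As a minimal sketch of how these columns work together, the following base-R snippet reads a small ;-separated mapping with # comments and applies piam_factor. The two mapping rows, the units, the values, and the rule that an empty piam_factor counts as 1 are assumptions made for illustration. They are not taken from a real mapping or from the package code.

```r
# Hypothetical two-row mapping in the file format described above.
mappingText <- "
# lines starting with # are comments
variable;unit;piam_variable;piam_unit;piam_factor
Population;million;Population;million;
Emissions|CO2;Mt CO2/yr;Emi|CO2;Gt CO2/yr;1000
"

mapping <- read.csv(text = mappingText, sep = ";", comment.char = "#", quote = "",
                    stringsAsFactors = FALSE, check.names = FALSE)

# Assumption for this sketch: an empty piam_factor means "multiply by 1".
mapping$piam_factor[is.na(mapping$piam_factor)] <- 1

# Invented reporting values, keyed by piam_variable.
reporting <- data.frame(piam_variable = c("Population", "Emi|CO2"),
                        value = c(8000, 40),
                        stringsAsFactors = FALSE)

# Join reporting values to the mapping and convert them to the project unit.
mapped <- merge(mapping, reporting, by = "piam_variable")
mapped$value <- mapped$value * mapped$piam_factor
print(mapped[, c("variable", "unit", "value")])
```

The snippet uses read.csv with sep = ";" rather than read.csv2 so that factors written with a decimal point are parsed as numbers.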

To edit a mapping in R, use:

mappingdata <- getMapping("AR6")
...
write.csv2(mappingdata, "test.csv", na = "", row.names = FALSE, quote = FALSE)

Opening the csv files in Excel can be problematic, as it sometimes changes values and quotation marks. You can edit the files in LibreOffice Calc using these settings in the Text Import dialog:

  • Text Import with:
    • Character set: Unicode (UTF-8)
    • Separated by: Semicolon.
  • Save with:
    • Character set: Unicode (UTF-8)
    • Field Delimiter: ;
    • String Delimiter: (none)

The GitHub diff of a large semicolon-separated file is often unreadable. For a human-readable output, save the old version of the mapping and run:

remind2::compareScenConf(fileList = c("oldfile.csv", "mappingfile.csv"), row.names = NULL)

Model intercomparison

  • To compare the results of different models, pass a quitte object or a csv/xlsx file as modeldata. You get one PDF document for each scenario and each model, with area plots for all summation groups in the AR6 (or NAVIGATE) summation files and line plots for each variable in the lineplotVariables vector you supply. This takes some time, so it is better to run it as a slurm job:

    plotIntercomparison(modeldata, summationsFile = "AR6", lineplotVariables = c("Temperature|Global Mean", "Population"))
    
  • If your modeldata is not well filtered, for example because the model regions differ too much, you can use interactive = TRUE, which lets you select the models, regions, scenarios and variables to include in the PDF. As lineplotVariables, you can also specify mapping names.

    plotIntercomparison(modeldata, summationsFile = "AR6", lineplotVariables = c("AR6", "AR6_NGFS"), interactive = TRUE)
    

Installation

For installation of the most recent package version an additional repository has to be added in R:

options(repos = c(CRAN = "@CRAN@", pik = "https://rse.pik-potsdam.de/r/packages"))

The additional repository can be made available permanently by adding the line above to a file called .Rprofile stored in the home folder of your system (Sys.glob("~") in R returns the home directory).

After that the most recent version of the package can be installed using install.packages:

install.packages("piamInterfaces")

Package updates can be installed using update.packages (make sure that the additional repository has been added before running that command):

update.packages()

Questions / Problems

In case of questions / problems please contact Falk Benke.

Citation

To cite package piamInterfaces in publications use:

Benke F, Richters O (2024). piamInterfaces: Project specific interfaces to REMIND / MAgPIE. R package version 0.18.7, <URL: https://github.com/pik-piam/piamInterfaces>.

A BibTeX entry for LaTeX users is

@Manual{,
 title = {piamInterfaces: Project specific interfaces to REMIND / MAgPIE},
 author = {Falk Benke and Oliver Richters},
 year = {2024},
 note = {R package version 0.18.7},
 url = {https://github.com/pik-piam/piamInterfaces},
}

piaminterfaces's People

Contributors

0umfhxcvx5j7joaohfss5mncnistjj6q, bs538, fbenke-pik, flohump, fschreyer, jmuessel, johannah-pik, orichters, pre-commit-ci[bot], pweigmann, renato-rodrigues, robinhasse, strefler, tonnrueter, weindl

piaminterfaces's Issues

Initial Setup of piam_interfaces

  • migrate all relevant functionality and templates from project_interfaces
  • decide: unify template structure?
    • Rename the r30m44* variables to remmag or so?
    • streamline AR6 template and delete all old mappings?
  • refactor function generateIIASASubmission
  • refactor function generateMappingFile
  • write check functions to be called in remind2 to assure that variable renamings don't cause problems
  • write tests in remind2 to assure that variable renamings don't cause problems
    • SHAPE
    • NGFS
    • AR6
    • NAVIGATE
  • README and documentation
  • tutorial in remindmodel
  • set up interface with xlsx_IIASA.R scripts in remindmodel such that export documents can be generated using output.R --> remindmodel/remind#1016
  • migrate ARIADNE template to this library
  • migrate ECEMF template to this library

Automate download of community template

checkFixUnits should tell which variables are affected and also write that to the logfile

# 1 unit mismatches between template and reporting.
If they are identical apart from spelling, add them to vector 'identicalUnits' in piamInterfaces::checkFixUnits() as:
                      "million t DM/yr" = "kcal/cap/day",

Error in checkFixUnits(mifdata, template, logFile) : Unit mismatches!

This is not helpful if it does not tell you which variable is affected. And the info could also be entered into the logfile.

https://github.com/pik-piam/piamInterfaces/blob/master/R/checkFixUnits.R#L78
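
A minimal base-R sketch of the requested behaviour: list every variable whose unit differs between template and reporting. The variable names, units and column layout are invented for illustration, and this is not the code of checkFixUnits().

```r
# Hypothetical template units and reported units (invented for illustration).
template <- data.frame(variable = c("Population", "Consumption", "Demand|Food"),
                       unit = c("million", "billion US$2005/yr", "kcal/cap/day"),
                       stringsAsFactors = FALSE)
reported <- data.frame(variable = c("Population", "Consumption", "Demand|Food"),
                       unit = c("million", "billion US$2005/yr", "million t DM/yr"),
                       stringsAsFactors = FALSE)

both <- merge(template, reported, by = "variable",
              suffixes = c(".template", ".reported"))
mismatch <- both[both$unit.template != both$unit.reported, ]

# One line per affected variable; the same lines could be appended to a log
# file, e.g. with cat(msg, file = logFile, sep = "\n", append = TRUE).
msg <- sprintf("%s: template unit '%s', reported unit '%s'",
               mismatch$variable, mismatch$unit.template, mismatch$unit.reported)
message(paste(msg, collapse = "\n"))
```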

Referencing non-existing REMIND variables; breaks remind2 tests

851;2;energy (final);Final Energy|Transportation|LDV|Electricity;EJ/yr;FE|Transport|LDV|Electricity;EJ/yr;;;;final energy consumption of electricity (including on-site solar PV), excluding transmission/distribution losses in the transportation sector by LDV;
852;2;energy (final);Final Energy|Transportation|LDV|Gases;EJ/yr;FE|Transport|LDV|Gases;EJ/yr;;;;final energy consumption of gases (natural gas, biogas, coal-gas), excluding transmission/distribution losses in the transportation sector by LDV;
853;2;energy (final);Final Energy|Transportation|LDV|Hydrogen;EJ/yr;FE|Transport|LDV|Hydrogen;EJ/yr;;;;;
854;2;energy (final);Final Energy|Transportation|LDV|Liquids;EJ/yr;FE|Transport|LDV|Liquids;EJ/yr;;;;;

FE|Transport|LDV|Liquids, FE|Transport|LDV|Gases, FE|Transport|LDV|Electricity, and FE|Transport|LDV|Hydrogen don't exist, but FE|Transport|LDV|+|Liquids, FE|Transport|LDV|+|Gases, FE|Transport|LDV|+|Electricity, and FE|Transport|LDV|+|Hydrogen do
https://github.com/pik-piam/remind2/blob/fcd6fcf6f63086355d39c806d96862a726f4e8f7/R/reportFE.R#L955-L958
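
One way to spot such cases could be to compare names after removing the summation markers. The sketch below uses a hypothetical helper removePlusMarkers() and only the variable names quoted above. It does not reflect how remind2 or piamInterfaces perform this check.

```r
# Hypothetical helper: remove "|+|", "|++|", ... markers from variable names.
removePlusMarkers <- function(x) gsub("\\|\\++\\|", "|", x)

inMapping <- c("FE|Transport|LDV|Liquids", "FE|Transport|LDV|Gases")
inReporting <- c("FE|Transport|LDV|+|Liquids", "FE|Transport|LDV|+|Gases")

# Names that are missing verbatim but present once the markers are ignored.
missingVerbatim <- setdiff(inMapping, inReporting)
onlyPlusDiffers <- missingVerbatim[missingVerbatim %in% removePlusMarkers(inReporting)]
print(onlyPlusDiffers)
```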

Topics for documentation

Useful code:

To edit a template file in R:

templatedata <- getTemplate("AR6")
...
write.csv2(templatedata, "test.csv", na = "", row.names = FALSE, quote = FALSE)

The GitHub diff of large semicolon-separated csv files is often completely unreadable.
To get a nice output, save the old version of the mapping and then use remind2::compareScenConf:

remind2::compareScenConf(fileList = c("oldfile.csv", "mappingfile.csv"), row.names = NULL)

Remove hard-coded World region from summation checks plots creation

Issue:

Using the checkSummations function with models that do not include a World region throws an error if generatePlots is set to true because the code requires the World region as the mainReg parameter of the showAreaAndBarPlots function call.

mip::showAreaAndBarPlots(plotdata, intersect(childs, unique(plotdata$variable)), tot = p,
mainReg = "World", yearsBarPlot = c(2030, 2050), scales = "fixed")

Request:

Remove the hard-coded value of the mainReg parameter by adding an additional parameter to the checkSummations function, so that the function can be used with results that do not include the World region.
E.g.

checkSummations <- function(mifFile, ... , plotMainReg = "World"){
...
mip::showAreaAndBarPlots(plotdata, intersect(childs, unique(plotdata$variable)), tot = p,
                                   mainReg = plotMainReg, yearsBarPlot = c(2030, 2050), scales = "fixed")
...
}

Application:

This change allows the checkSummations function to work with results and models that do not include the World region.

Strange summation results for "extractVariables"

Somehow, checkSummations seems to be confused if you have more than one + summation. See:

gdx <- "/p/tmp/oliverr/debugging/fulldata_checkSummations.gdx"
output <- remind2::reportSE(gdx)

sumChecks <- piamInterfaces::checkSummations(
    mifFile = output, outputDirectory = NULL,
    summationsFile = "extractVariableGroups",
    absDiff = 1.5e-8, relDiff = 1e-8, roundDiff = FALSE
)

This complains:

SE|Electricity|+|Coal <
   + SE|Electricity|Coal|+|w/ CC
   + SE|Electricity|Coal|+|w/o CC
Relative difference between -321.82833288036% and -6.0824905776075e-06%, absolute difference up to 3.74993647369593e-05 EJ/yr.

But I checked that carefully, and they actually do sum up correctly...

Let us remove the SE|Electricity|Coal|++| variables:

outputNoPlusPlus <- output[,, grep("SE|Electricity|Coal|++|", getNames(output), fixed = TRUE, value = TRUE, invert = TRUE)]

sumChecks <- piamInterfaces::checkSummations(
    mifFile = outputNoPlusPlus, outputDirectory = NULL,
    summationsFile = "extractVariableGroups",
    absDiff = 1.5e-8, relDiff = 1e-8, roundDiff = FALSE
)

results in "All summation checks were fine". So obviously this summation is not the problem. But:

outputOnlyPlusPlus <- output[,, c("SE|Electricity|+|Coal (EJ/yr)", grep("SE|Electricity|Coal|++|", getNames(output), fixed = TRUE, value = TRUE))]

sumChecks <- piamInterfaces::checkSummations(
    mifFile = outputOnlyPlusPlus , outputDirectory = NULL,
    summationsFile = "extractVariableGroups",
    absDiff = 1.5e-8, relDiff = 1e-8, roundDiff = FALSE
)

Raises an error again, but a different one:

SE|Electricity|+|Coal <
   + SE|Electricity|Coal|++|Gasification Combined Cycle w/o CC
   + SE|Electricity|Coal|++|Gasification Combined Cycle w/ CC
   + SE|Electricity|Coal|++|Pulverised Coal w/o CC
   + SE|Electricity|Coal|++|Pulverised Coal w/ CC
   + SE|Electricity|Coal|++|Combined Heat and Power w/o CC
   + SE|Electricity|Coal|++|Other
Relative difference between -321.82833288036% and -6.0824905776075e-06%, absolute difference up to 3.74993647369593e-05 EJ/yr.

So somehow the matching of the printing to the actual problem is not working as of now.

Cleanup N2O and CH4 emission reporting

Emissions|N2O

Variables in remind2:

  • Emi|N2O, with subgroups Agriculture, Energy Supply, Industry, Land-Use Change, Transport, Waste

N2O in NAVIGATE template:

Emissions|N2O                      <- Emi|N2O
Emissions|N2O|AFOLU                <- Emi|N2O|+|Land-Use Change + Emi|N2O|+|Agriculture
Emissions|N2O|Energy               <- Emi|N2O|Energy Supply and Demand    (does not exist in remind2!)
Emissions|N2O|Waste                <- Emi|N2O|+|Waste
Emissions|N2O|Other                <- Emi|N2O|Other                       (does not exist in remind2!)

Non-mapped variables from remind2: Emi|N2O|+|Energy Supply, Emi|N2O|+|Industry, Emi|N2O|+|Transport

Suggestions:

Emissions|N2O|Energy               <- Emi|N2O|+|Energy Supply + Industry + Transport
Emissions|N2O|Other                <- NA

N2O in AR6 template:

Emissions|N2O                      <- Emi|N2O
Emissions|N2O|AFOLU                <- Emissions|N2O|Land
Emissions|N2O|Energy               <- Emi|N2O|Energy Supply and Demand    (does not exist in remind2!)
Emissions|N2O|Industrial Processes <- Emi|N2O|+|Industry
Emissions|N2O|Other                <- Emi|N2O|Other                       (does not exist in remind2!)
Emissions|N2O|Waste                <- Emi|N2O|+|Waste

Non-mapped variables: Emi|N2O|+|Transport, Emi|N2O|+|Land-Use Change, Emi|N2O|+|Agriculture

Emissions|N2O|Land is not available for REMIND standalone.

Suggestions:

Emissions|N2O|AFOLU                <- Emi|N2O|+|Land-Use Change + Emi|N2O|+|Agriculture
Emissions|N2O|Energy               <- Emi|N2O|+|Energy Supply + Transport
Emissions|N2O|Other                <- NA

Unclear to me whether Industry should be mapped to Industrial Processes or Energy

Emissions|CH4

Variables in remind2:

  • Emi|CH4 with subgroups Agriculture, Energy Supply, Extraction, Land-Use Change, Waste

CH4 in NAVIGATE template:

Emissions|CH4                      <- Emi|CH4
Emissions|CH4|AFOLU                <- Emi|CH4|+|Agriculture + Emi|CH4|+|Land-Use Change
Emissions|CH4|Energy               <- Emi|CH4|+|Energy Supply
Emissions|CH4|Industrial Processes <- NA
Emissions|CH4|Other                <- NA
Emissions|CH4|Waste                <- Emi|CH4|+|Waste

Non-mapped variables from remind2: Emi|CH4|+|Extraction

Suggestion:

Emissions|CH4|Energy               <- Emi|CH4|+|Energy Supply + Extraction

CH4 in AR6 template:

Emissions|CH4                      <- Emi|CH4
Emissions|CH4|AFOLU                <- Emissions|CH4|Land
Emissions|CH4|Energy               <- Emi|CH4|Energy Supply and Demand     (does not exist in remind2!)
Emissions|CH4|Industrial Processes <- NA
Emissions|CH4|Other                <- Emi|CH4|Other                        (does not exist in remind2!)
Emissions|CH4|Waste                <- Emi|CH4|+|Waste

Non-mapped variables: Emi|CH4|+|Agriculture, Emi|CH4|+|Energy Supply, Emi|CH4|+|Extraction, Emi|CH4|+|Land-Use Change

Emissions|CH4|Land is not available for REMIND standalone.

Suggestion: Use settings from NAVIGATE.
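
The suggested mapping for Emissions|CH4|Energy can be sketched in base R as follows. The numbers are invented, and the rule that several piam variables mapped to the same project variable are summed is an assumption of this sketch.

```r
# Suggested mapping rows: two remind2 variables feed one project variable.
mapping <- data.frame(
  variable = c("Emissions|CH4|Energy", "Emissions|CH4|Energy", "Emissions|CH4|Waste"),
  piam_variable = c("Emi|CH4|+|Energy Supply", "Emi|CH4|+|Extraction", "Emi|CH4|+|Waste"),
  stringsAsFactors = FALSE)

# Invented values for one region and year.
reporting <- data.frame(
  piam_variable = c("Emi|CH4|+|Energy Supply", "Emi|CH4|+|Extraction", "Emi|CH4|+|Waste"),
  value = c(5, 95, 60),
  stringsAsFactors = FALSE)

joined <- merge(mapping, reporting, by = "piam_variable")

# Assumption: rows sharing the same target variable are summed.
result <- aggregate(value ~ variable, data = joined, FUN = sum)
print(result)
```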

generateIIASASubmission silently drops identical scenarios

library(piamInterfaces)
d1 <- d2 <- quitte::quitte_example_data
d12 <- rbind(d1, d2)
r1  <- generateIIASASubmission(d1 , mapping = "AR6", outputFilename = NULL, logFile = NULL)
r12 <- generateIIASASubmission(d12, mapping = "AR6", outputFilename = NULL, logFile = NULL)

message("d1 : ", nrow(d1 ), " -> ", nrow(r1 ))
message("d12: ", nrow(d12), " -> ", nrow(r12))
d1 : 19152 -> 19200
d12: 38304 -> 19200

Possible solutions:

  1. take the mif object at the end, remove $value column and check for any(duplicated(data))
  2. while reading mif files, check whether the scenario already exists and if yes, fail. Disadvantage: This doesn't catch the situation above, and it might lead to problems if you read say REMIND and MAgPIE data from different files (so no variable collisions) and you actually want to join them

I would probably opt for the first.
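
A minimal base-R sketch of the first option, using an invented data frame instead of quitte example data. The helper hasDuplicates() is hypothetical and not part of piamInterfaces.

```r
# Invented long-format data; d12 contains every row twice, as in the example above.
d1 <- data.frame(model = "REMIND", scenario = "Base", region = "World",
                 variable = c("Population", "Consumption"), period = 2030,
                 value = c(8000, 100), stringsAsFactors = FALSE)
d12 <- rbind(d1, d1)

# Hypothetical helper: TRUE if any combination of the identifying columns repeats.
hasDuplicates <- function(data) {
  any(duplicated(data[, setdiff(names(data), "value"), drop = FALSE]))
}

if (hasDuplicates(d12)) {
  message("Duplicated rows found, e.g. the same scenario was read twice.")
}
```

Dropping the value column before calling duplicated() means that rows with the same identifiers but different values are also reported.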

Add new energy waste emissions to mapping templates to ensure summation consistency in projects

New energy emissions variables for waste were added to REMIND with the feedstocks implementation (reporting added in this PR) such that energy emissions now sum as

Emi|CO2|+|Energy = Emi|CO2|Energy|+|Supply + Emi|CO2|Energy|+|Demand + Emi|CO2|Energy|+|Waste,
Emi|GHG|Energy = ... etc.

The project mapping files still need to be adapted to this change to ensure summation consistency of IIASA submissions.

The question is where to put this to ensure consistency in projects that do not have this variable structure. Looking, for instance, at the NAVIGATE template (which I understand to be the most up-to-date IAMC template), I think we need to decide whether we map the waste emissions to energy supply or to energy demand, as that is the distinction they have. Assuming that most of the waste goes to power or heating plants, one approach would be to map it to Emissions|CO2|Energy|Supply|Other. This has the following definition in NAVIGATE:

CO2 emissions from fuel combustion in other energy supply sectors (please provide a definition of other sources in this category in the 'comments' tab)

So, it seems to be a rather flexible category. Just a suggestion. Main thing is that it gets accounted somewhere.

Tagging @mellamoSimon, @strefler. Also @orichters as someone who seems experienced with piamInterfaces.

Emissions|CO2 from transport don't sum correctly

There is some inconsistency in the summation of Emissions|CO2|Energy|Demand|Transportation. Maybe further variables should be added to the summation group? Maybe, Johanna, you want to have a look?

The left column is the equation summation_group_AR6.csv says should be satisfied in terms of AR6 variables. The right column is which values from REMIND correspond to each term according to mapping_template_AR6.csv. This information was provided by checkSummations().

summation_groups_AR6.csv
Emissions|CO2|Energy|Demand|Transportation <>                            Emi|CO2|Energy|Demand|+|Transport <>
   + Emissions|CO2|Energy|Demand|Transportation|Aviation|Passenger          + Emi|CO2|Transport|Pass|Aviation|Domestic|Demand
   + Emissions|CO2|Energy|Demand|Transportation|Maritime|Freight            + Emi|CO2|Transport|Freight|Navigation|Demand
   + Emissions|CO2|Energy|Demand|Transportation|Rail                        + Emi|CO2|Transport|Rail|Demand
   + Emissions|CO2|Energy|Demand|Transportation|Road                        + Emi|CO2|Transport|Road|Demand

A check with /p/tmp/oliverr/debugging/REMIND_generic_SSP2EU-AMT-Base.mif (from the automated model tests) reveals that there is a relative difference of the summation between -37.6% and 17.5%, absolute difference up to 1552.6 Mt CO2/yr. So somehow the equation is not satisfied.

The unusual thing is that the sum is sometimes higher, sometimes lower than Emissions|CO2|Energy|Demand|Transportation.
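
To illustrate what such a check computes, here is a small base-R sketch with invented numbers. The sign convention (parent minus sum of children) is an assumption and may differ from what checkSummations() reports.

```r
# Invented values (Mt CO2/yr) for three periods; not taken from the mif file above.
parent <- c(7000, 6500, 4000)
children <- rbind(aviation = c(800, 900, 700),
                  maritime = c(600, 600, 500),
                  rail     = c(100, 100, 100),
                  road     = c(5000, 5200, 2500))

childSum <- colSums(children)
absDiff <- parent - childSum        # positive: parent larger than the sum
relDiff <- 100 * absDiff / parent   # in percent of the parent

# A sign change across periods means the sum is sometimes higher, sometimes lower.
print(data.frame(parent, childSum, absDiff, relDiff = round(relDiff, 1)))
```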

Here are further variables that start with Emissions|CO2|Energy|Demand|Transportation| that were found in the AR6 template that are not part of the summation group, received from Rscript -e "piamInterfaces::variableInfo('Emi|CO2|Energy|Demand|+|Transport', template='AR6')"

# Child variables not in summation group
  . Emissions|CO2|Energy|Demand|Transportation|Aviation               . NA
  . Emissions|CO2|Energy|Demand|Transportation|Freight                . Emi|CO2|Transport|Freight|Short-Medium Distance|Demand
  . Emissions|CO2|Energy|Demand|Transportation|Maritime               . NA
  . Emissions|CO2|Energy|Demand|Transportation|Passenger              . Emi|CO2|Transport|Pass|Short-Medium Distance|Demand

And these additional variables were found in the mif file that are not used in the template.

# Additional variables found in mif file
- Emi|CO2|Energy|Demand|Transport|+|Gases
- Emi|CO2|Energy|Demand|Transport|+|Liquids
- Emi|CO2|Energy|Demand|Transport|International Bunkers

So options to fix this include: 1. adapting the mapping, 2. changing the summation group, 3. finding an error in the reporting elsewhere.

Find solution for dropping aggregated regions in generateIIASASubmission

When using REMIND EU21, EUR and NEU have to be dropped from the data to prevent double counting in the IIASA database. For ECEMF, Renato also drops World before submission.

Oliver is reluctant to add a general rule to generateIIASASubmission because it is used by multiple models and projects, but we should at least warn the user about such things.

The problem is that in a mif file, you can no longer identify which regions are aggregates (apart from World, maybe).

As a current workaround, a pipeline like this can be used:

mifs %>%
  generateIIASASubmission(mapping = "ECEMF", model = "REMIND 3.2", outputDirectory = NULL) %>%
  dplyr::filter(! .data$region %in% c("World","EUR","NEU")) %>%
  quitte::write.IAMCxlsx("output/submission.xlsx")

checkSummation plots do not include totals for multiple summations cases of the same variable

Reproduction:

Consider two summation groups, for example:

  • group 1:
Emissions|Kyoto Gases;Emissions|Kyoto Gases|Industrial Processes;1
Emissions|Kyoto Gases;Emissions|Kyoto Gases|Energy;1
Emissions|Kyoto Gases;Emissions|Kyoto Gases|Waste;1
Emissions|Kyoto Gases;Emissions|Kyoto Gases|Other;1
Emissions|Kyoto Gases;Emissions|Kyoto Gases|AFOLU;1
Emissions|Kyoto Gases|AFOLU;Emissions|Kyoto Gases|AFOLU|Agriculture;1
Emissions|Kyoto Gases|AFOLU;Emissions|Kyoto Gases|AFOLU|Land;1
Emissions|Kyoto Gases|Energy and Industrial Processes;Emissions|Kyoto Gases|Industrial Processes;1
Emissions|Kyoto Gases|Energy and Industrial Processes;Emissions|Kyoto Gases|Energy;1
  • group 2:
Emissions|Kyoto Gases 2;Emissions|F-Gases;1
Emissions|Kyoto Gases 2;Emissions|N2O;0.265
Emissions|Kyoto Gases 2;Emissions|CH4;28
Emissions|Kyoto Gases 2;Emissions|CO2;1

Summation deviations charts will be created perfectly in the pdf for the first group.
However, the chart for the second group will have the total black line missing, most probably because the total variable passed to the showAreaAndBarPlots function in the following line is Emissions|Kyoto Gases 2 instead of Emissions|Kyoto Gases.

mip::showAreaAndBarPlots(plotdata, intersect(childs, unique(plotdata$variable)), tot = p,
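
One conceivable fix, sketched under the assumption that alternative summation groups are always marked by a trailing space and number: strip that suffix before looking up the total. stripGroupSuffix() is a hypothetical helper, not an existing piamInterfaces function.

```r
# Hypothetical helper: strip a trailing " 2", " 3", ... from a summation group name.
stripGroupSuffix <- function(x) sub(" [0-9]+$", "", x)

parents <- c("Emissions|Kyoto Gases", "Emissions|Kyoto Gases 2")
print(stripGroupSuffix(parents))
```

A real variable name that legitimately ends in a space followed by digits would be shortened as well, so such a rule would need a check against the known variable names.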

checkSummations does not show piam_factor in human-readable summary

Carbon Sequestration|CCS >                                   Carbon Management|Storage >
   + Carbon Sequestration|CCS|Biomass                          + Carbon Management|Storage|+|Biomass|Pe2Se + Carbon Management|Storage|Industry Energy|+|Biomass
   + Carbon Sequestration|CCS|Fossil                           + Carbon Management|Storage|+|Fossil|Pe2Se + Carbon Management|Storage|Industry Energy|+|Fossil
   + Carbon Sequestration|Direct Air Capture                   + Emi|CO2|CDR|DACCS
   + Carbon Sequestration|CCS|Industrial Processes             + Carbon Management|Storage|+|Industry Process
Relative difference between 0.0121% and 28.8%, absolute difference up to 58.7 Mt CO2/yr.

should be

Carbon Sequestration|CCS >                                   Carbon Management|Storage >
   + Carbon Sequestration|CCS|Biomass                          + Carbon Management|Storage|+|Biomass|Pe2Se + Carbon Management|Storage|Industry Energy|+|Biomass
   + Carbon Sequestration|CCS|Fossil                           + Carbon Management|Storage|+|Fossil|Pe2Se + Carbon Management|Storage|Industry Energy|+|Fossil
   + Carbon Sequestration|Direct Air Capture                   - 1 Emi|CO2|CDR|DACCS
   + Carbon Sequestration|CCS|Industrial Processes             + Carbon Management|Storage|+|Industry Process
Relative difference between 0.0121% and 28.8%, absolute difference up to 58.7 Mt CO2/yr.
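
A sketch of how a term could be formatted so that the factor becomes visible, following the "- 1 Emi|CO2|CDR|DACCS" notation requested above. formatTerm() is a hypothetical helper, and treating a missing factor as 1 is an assumption.

```r
# Hypothetical helper: render one summand including its piam_factor.
formatTerm <- function(piamVariable, piamFactor = 1) {
  # Factor 1 (or missing, by assumption) is printed as a plain "+ variable".
  if (is.na(piamFactor) || piamFactor == 1) return(paste("+", piamVariable))
  sign <- if (piamFactor < 0) "-" else "+"
  paste(sign, format(abs(piamFactor)), piamVariable)
}

print(formatTerm("Emi|CO2|CDR|DACCS", -1))
print(formatTerm("Carbon Management|Storage|+|Industry Process"))
```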

Variables with wrong piam_unit are just silently dropped instead of an error/warning raised

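library(dplyr)           # provides %>%, filter, mutate, as_tibble
library(piamInterfaces)  # provides generateIIASASubmission

qe <- quitte::quitte_example_data %>%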
  filter(.data$variable %in% c("Consumption", "Population")) %>%
  droplevels()

AR6 <- qe %>%
  mutate(unit = factor(ifelse(.data$variable == "Consumption", "quadrillion", as.character(.data$unit)))) %>%
  generateIIASASubmission(mapping = "AR6", outputFilename = NULL) %>%
  as_tibble()

message("qe: ", paste(levels(qe$variable), collapse = ", "),
        ". AR6: ", paste(levels(AR6$variable), collapse = ", "))

had to correct it here when I noticed some variables were missing:
e93480a

allow to edit mapping templates with R

The way to edit the templates as suggested in the tutorial does not work:

  • for NAVIGATE, it drops all the " in the variable definitions (such as "power-to-gas")
  • if you add quote = "" to the read.csv2 statement in getTemplate, it works better, but not for the ARIADNE template, where some descriptions are in quotation marks because the description itself contains a ;.

What could be done:

  • Replace ; by . or , in the ARIADNE descriptions
  • somehow improve the functions, no idea how.
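
The effect of quote = "" can be illustrated with a base-R round trip on an invented two-line file. This is only a sketch of the read and write settings and not the code of getTemplate. A description that itself contains ; would still break with these settings, so the first suggestion above would be needed as well.

```r
# Invented two-line template with " characters inside a description.
original <- c("variable;description",
              "Secondary Energy|Gases;includes \"power-to-gas\" output")

# quote = "" disables quote handling, so the " characters stay part of the text.
template <- read.csv2(text = original, quote = "", colClasses = "character",
                      check.names = FALSE)

# Write back without adding quotes and compare with the input lines.
roundTrip <- capture.output(
  write.csv2(template, quote = FALSE, row.names = FALSE, na = ""))
print(identical(roundTrip, original))
```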

Recursive support for summation checks

Assume a summationsFile containing these summation checks:

parent                   child
Primary Energy           Primary Energy|non-Fossil
Primary Energy           Primary Energy|Fossil

Primary Energy|Fossil    Primary Energy|Coal
Primary Energy|Fossil    Primary Energy|Gas
Primary Energy|Fossil    Primary Energy|Oil

If the mif file provided to piamInterfaces::checkSummations does not contain the variable Primary Energy|Fossil, the summation for Primary Energy would not be tested, even if it could be obtained by the summation of Primary Energy|Coal, Primary Energy|Gas and Primary Energy|Oil.

Feature request:

  • Whenever a variable is missing in a summation check, check if the variable can be obtained from another summation defined in the summationsFile, and use its value if it contains some information.
  • This could be added as an extra option to the function, possibly enabled by default, e.g. recursive = TRUE.

Alternative potential implementation:

  • Create a function in piamInterfaces to fill missing variables based on the summationsFile groups (e.g. fillMissing function).
  • Run this function inside checkSummations before executing the summation checks.
  • Pros: the fillMissing function can be useful for other use cases also.
  • Cons: It can increase dimensionality of summation checks, and/or be incompatible with summation groups that include redundant (duplicated) information in different variables of the same summation group. The latter can be avoided by adding additional alternative summation groups as in #141
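
The alternative implementation could look roughly like the following base-R sketch. fillMissing() is the hypothetical function named above, the values are invented, and summation factors are ignored for brevity.

```r
# Summation groups from the example above.
summations <- data.frame(
  parent = c("Primary Energy", "Primary Energy",
             "Primary Energy|Fossil", "Primary Energy|Fossil", "Primary Energy|Fossil"),
  child = c("Primary Energy|non-Fossil", "Primary Energy|Fossil",
            "Primary Energy|Coal", "Primary Energy|Gas", "Primary Energy|Oil"),
  stringsAsFactors = FALSE)

# Invented values (EJ/yr); "Primary Energy|Fossil" is deliberately missing.
values <- c("Primary Energy" = 600, "Primary Energy|non-Fossil" = 100,
            "Primary Energy|Coal" = 150, "Primary Energy|Gas" = 150,
            "Primary Energy|Oil" = 200)

# Hypothetical fillMissing(): repeatedly add parents whose children are all known.
fillMissing <- function(values, summations) {
  repeat {
    added <- FALSE
    for (p in setdiff(unique(summations$parent), names(values))) {
      children <- summations$child[summations$parent == p]
      if (all(children %in% names(values))) {
        values[p] <- sum(values[children])
        added <- TRUE
      }
    }
    if (!added) break
  }
  values
}

filled <- fillMissing(values, summations)
print(filled[["Primary Energy|Fossil"]])
```

With the missing variable filled in, the check for Primary Energy can run against Primary Energy|non-Fossil plus the reconstructed Primary Energy|Fossil.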

feature requests model comparison

  • set factor = 1 if the column is not provided in getSummations: #58
  • allow for multiple summations in one variable (such as Final Energy|Industry) -> Use Final Energy|Industry 2
  • allow checkSummations to be called without a template: #47
  • in checkSummations, loop over models to have them differentiated: #47
  • make as.quitte/read.quitte support xlsx files from the IIASA explorer: pik-piam/quitte#38
  • produce plot with all failing summation checks: #47

Industry|Heat variables

Dear industry colleagues, the AR6 template expects the following variables to exist in remind2 reporting, which doesn't seem to be the case:

FE|Industry|Heat|Cement
FE|Industry|Heat|Chemicals
FE|Industry|Heat|Steel

I found FE|Industry|Chemicals|Heat (different order), but nothing for Cement and Steel.

Should I replace what exists, and remove the other?

Inconsistency in emission variables

I received feedback that we are inconsistent in our NGFS runs wrt. the variable Emissions|CO2|Energy and Industrial Processes. It does not form the sum of ...Energy and ...Industrial Processes as one would expect. I could track this down to the mapping template, where Emissions|CO2|Energy also includes the carbon management variables for carbon capture and carbon storage, see here or in the picture below.

@strefler do you know whether this is correct? And if yes - it seems we are not doing it in the same way on the REMIND reporting side, where only used carbon is subtracted (if I understand correctly what is happening here). The way I see it, Emissions|CO2|Energy means something different than our REMIND Emi|CO2|Energy but within Emissions|CO2|Energy and Industrial Processes we treat them as if they were the same, resulting in the inconsistency. Do you have an idea what to do?

image

Check SDP post-processing variables

When checking the templates, we noticed that many variables do not seem to come from REMIND or MAgPIE or EDGE-T or MAGICC, but rather from some SDP-related post-processing. So we allocated source = S to them in line with the tutorial.

library(piamInterfaces)
sort(unique(unlist(lapply(c("AR6", "SHAPE", "NAVIGATE"), getTemplateVariables, "S"))))

It would be nice if you, @bs538, whenever you have nothing to do at some point, could have a short look whether that list is accurate.

wrong calculation for checkSummation

Hey Falk,

any idea how your most recent changes could have caused this?

# A tibble: 32 × 5
   scenario           variable                         unit  period      value
   <fct>              <fct>                            <fct>  <int>      <dbl>
 1 C_o_lowdem_d95high Capacity|Electricity|Oil         GW      2005 510.
 2 C_o_lowdem_d95high Capacity|Electricity|Oil         GW      2010 409.
 3 C_o_lowdem_d95high Capacity|Electricity|Oil         GW      2015 319.
 4 C_o_lowdem_d95high Capacity|Electricity|Oil         GW      2020 230.
 5 C_o_lowdem_d95high Capacity|Electricity|Oil         GW      2025 143.
 6 C_o_lowdem_d95high Capacity|Electricity|Oil         GW      2030  50.4
 7 C_o_lowdem_d95high Capacity|Electricity|Oil         GW      2035   0.0012
 8 C_o_lowdem_d95high Capacity|Electricity|Oil         GW      2040   0.0012
 9 C_o_lowdem_d95high Capacity|Electricity|Oil         GW      2045   0.0012
10 C_o_lowdem_d95high Capacity|Electricity|Oil         GW      2050   0.0012
11 C_o_lowdem_d95high Capacity|Electricity|Oil         GW      2055   0.0012
12 C_o_lowdem_d95high Capacity|Electricity|Oil         GW      2060   0.0012
13 C_o_lowdem_d95high Capacity|Electricity|Oil         GW      2070   0.0012
14 C_o_lowdem_d95high Capacity|Electricity|Oil         GW      2080   0.0012
15 C_o_lowdem_d95high Capacity|Electricity|Oil         GW      2090   0.0012
16 C_o_lowdem_d95high Capacity|Electricity|Oil         GW      2100   0.000660
17 C_o_lowdem_d95high Capacity|Electricity|Oil|w/o CCS GW      2005 510.
18 C_o_lowdem_d95high Capacity|Electricity|Oil|w/o CCS GW      2010 409.
19 C_o_lowdem_d95high Capacity|Electricity|Oil|w/o CCS GW      2015 319.
20 C_o_lowdem_d95high Capacity|Electricity|Oil|w/o CCS GW      2020 230.
21 C_o_lowdem_d95high Capacity|Electricity|Oil|w/o CCS GW      2025 143.
22 C_o_lowdem_d95high Capacity|Electricity|Oil|w/o CCS GW      2030  50.4
23 C_o_lowdem_d95high Capacity|Electricity|Oil|w/o CCS GW      2035   0.0012
24 C_o_lowdem_d95high Capacity|Electricity|Oil|w/o CCS GW      2040   0.0012
25 C_o_lowdem_d95high Capacity|Electricity|Oil|w/o CCS GW      2045   0.0012
26 C_o_lowdem_d95high Capacity|Electricity|Oil|w/o CCS GW      2050   0.0012
27 C_o_lowdem_d95high Capacity|Electricity|Oil|w/o CCS GW      2055   0.0012
28 C_o_lowdem_d95high Capacity|Electricity|Oil|w/o CCS GW      2060   0.0012
29 C_o_lowdem_d95high Capacity|Electricity|Oil|w/o CCS GW      2070   0.0012
30 C_o_lowdem_d95high Capacity|Electricity|Oil|w/o CCS GW      2080   0.0012
31 C_o_lowdem_d95high Capacity|Electricity|Oil|w/o CCS GW      2090   0.0012
32 C_o_lowdem_d95high Capacity|Electricity|Oil|w/o CCS GW      2100   0.000660

Looks good to satisfy this summation check: Capacity|Electricity|Oil = Capacity|Electricity|Oil|w/ CCS + Capacity|Electricity|Oil|w/o CCS.

But:

d4 <- readRDS("/p/tmp/oliverr/debugging/oilerror.rds")
checkSummations(mifFile = d4, template = "AR6", generatePlots = FALSE, summationsFile = "AR6", dataDumpFile = "oilerror.csv", outputDirectory="/p/tmp/oliverr/debugging")

yields:

Capacity|Electricity|Oil <                                                    Cap|Electricity|Oil|w/o CC <
   + Capacity|Electricity|Oil|w/ CCS                                             + NA
   + Capacity|Electricity|Oil|w/o CCS                                            + Cap|Electricity|Oil|w/o CC
Relative difference between -100% and -100%, absolute difference up to 510.2 GW.

image

So somehow the script now thinks that the sum is twice the actual value, maybe because one of the variables is missing?

I think that wasn't the case before. Can you have a look, please?

fix and factor out generateIIASASubmission() functionality to be used as general REMIND output filter for submissions

As generateIIASASubmission() is now the designated way of filtering REMIND
data for publication submissions, there are a couple of issues that need
addressing.

  • generateIIASASubmission() documentation is missing.
    The description only repeats the title, there is no explanation of what
    steps are taken between .mif files and the submission file, and the
    interaction of different parameters is not clarified.
  • no documentation on how different mapping parameters interact (set
    union, set difference, …?)
  • the possibility to pass mapping file paths via the mapping parameter is
    not documented
  • logFile parameter has unstated dependency on outputDirectory
    Setting outputDirectory to something other than output, but not adjusting
    logFile, leads to
    Error in file(file, ifelse(append, "a", "w")) : 
      cannot open the connection
    In addition: Warning message:
    In file(file, ifelse(append, "a", "w")) :
      cannot open file 'output/missing.log': No such file or directory
    
    Either make sure the directory exists, or better, always write the log to the
    stated output directory, which should be what most users expect.
  • logFile should default to something based on outputFilename
    Like sub('\\.[^\\.]+$', '_missing.log', outputFilename), in case users want
    to write several submission files but do not want to adjust the name of the
    log file every time.
  • option to deactivate the log file is missing
    Users might not be interested in the file (when converting lots of mifs and
    being aware of the state of the reporting). They should be able to set
    logFile to NULL to deactivate it.
    • added in #237. set logFile to FALSE please.
  • add a helper function for getting a list of available mappings
    Currently users have to fake-run generateIIASASubmission() without mapping
    parameters to get a list of the available mappings.
  • model parameter should default to NULL
    If users want to change the parameter from what is in the mifs, they should
    do so explicitly. Preset defaults are easily overlooked.
  • no-ops should not issue any messages
    • # Adapt scenario names: '' will be prepended, '' will be removed.
    • # iiasatemplate does not exist, returning full list of variables.
    • fixed in #237
  • cannot combine mapping and mappingFile parameters
    Since the documentation is lacking, I was under the impression that the
    mapping and mappingFile parameters work basically the same, but that the
    latter explicitly expects a path to a file, while the former accepts both
    mapping names and paths.
    But while it is possible to pass multiple mappings to mapping
    generateIIASASubmission(
        mapping = c('AR6', 'AR6_NGFS'),
        …)
    
    passing multiple mappings as mappingFile
    generateIIASASubmission(
        mappingFile = system.file('templates', 
                                  paste0('mapping_template_',
                                         c('AR6', 'AR6_NGFS'),
                                         '.csv'), 
                                  package = 'piamInterfaces'),
        …)
    
    leads to an uncaught error
    Error in length(mapping) > 0 || is.null(mappingFile) || !file.exists(mappingFile) :                                                      
      'length = 2' in coercion to 'logical(1)'
    
    Using both a mapping and a mappingFile ignores the mapping in
    mappingFile, although the messages report that it is applied:
    generateIIASASubmission(
        mapping = 'AR6',
        mappingFile = system.file('templates',
                                  'mapping_template_AR6_NGFS.csv', 
                                  package = 'piamInterfaces'),
        …)
    
    says
    ### Generating mapping  based on templates AR6
    
    # Read AR6
    
    ### Generating submission file using mapping AR6, /home/pehl/R/x86_64-pc-linux-gnu-library/4.3/piamInterfaces/templates/mapping_template_AR6_NGFS.csv.
    # Adapt scenario names: '' will be prepended, '' will be removed.
    # Apply mapping /home/pehl/R/x86_64-pc-linux-gnu-library/4.3/piamInterfaces/templates/mapping_template_AR6_NGFS.csv
    
    but the resulting file is identical to the one produced by
    generateIIASASubmission(mapping = 'AR6', …).
    • corrected in #237. If you specify a mapping, you cannot also use a mappingFile at the moment. The script will now warn you.
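
The logFile and mappingFile points above could be handled roughly as follows. This is only a sketch in base R, not the package's implementation: resolveLogFile() and checkMappingFiles() are hypothetical helpers, the NA default for logFile is an assumption made for this example, and treating NULL or FALSE as "no log file" simply follows the suggestions in this issue.

```r
# Hypothetical helper: derive the log file path from the output settings.
# Returns NULL when logging is switched off via logFile = NULL or FALSE.
resolveLogFile <- function(outputDirectory, outputFilename, logFile = NA) {
  if (is.null(logFile) || isFALSE(logFile)) {
    return(NULL)
  }
  if (is.na(logFile)) {
    # default suggested in the issue: replace the extension by "_missing.log"
    logFile <- sub("\\.[^\\.]+$", "_missing.log", outputFilename)
  }
  # always write into the stated output directory (assumption of this sketch)
  file.path(outputDirectory, basename(logFile))
}

# Hypothetical helper: existence check for one or more mapping files.
# all() collapses the vector to a single TRUE/FALSE, so `||` never sees
# a vector of length 2.
checkMappingFiles <- function(mappingFile) {
  is.null(mappingFile) || all(file.exists(mappingFile))
}

print(resolveLogFile("out", "submission.xlsx"))
print(resolveLogFile("out", "submission.xlsx", logFile = FALSE))
print(checkMappingFiles(c("no_such_file_1.csv", "no_such_file_2.csv")))
```

Using basename() means a logFile given with a directory part is still written to outputDirectory; whether that is the desired behaviour is a design choice left open here.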

improve output of checkDataLength

This is very hard to read:

### Check whether all scenarios have same number of variables
- For Emissions|CO2|AFOLU, data points per scenario differ: d_delfrag: 0. d_strain: 660. h_cpol: 0. h_ndc: 0. o_1p5c: 0. o_2c: 0.
- For Energy Service|Transportation|Aviation, data points per scenario differ: d_delfrag: 0. d_strain: 0. h_cpol: 660. h_ndc: 660. o_1p5c: 660. o_2c: 660.
- For Energy Service|Transportation|Freight, data points per scenario differ: d_delfrag: 0. d_strain: 0. h_cpol: 660. h_ndc: 660. o_1p5c: 660. o_2c: 660.
- For Energy Service|Transportation|Freight|International Shipping, data points per scenario differ: d_delfrag: 0. d_strain: 0. h_cpol: 660. h_ndc: 660. o_1p5c: 660. o_2c: 660.
- For Energy Service|Transportation|Freight|Road, data points per scenario differ: d_delfrag: 0. d_strain: 0. h_cpol: 660. h_ndc: 660. o_1p5c: 660. o_2c: 660.
- For Energy Service|Transportation|Passenger, data points per scenario differ: d_delfrag: 0. d_strain: 0. h_cpol: 660. h_ndc: 660. o_1p5c: 660. o_2c: 660.
- For Energy Service|Transportation|Passenger|Aviation, data points per scenario differ: d_delfrag: 0. d_strain: 0. h_cpol: 660. h_ndc: 660. o_1p5c: 660. o_2c: 660.
- For Energy Service|Transportation|Passenger|Bicycling and Walking, data points per scenario differ: d_delfrag: 0. d_strain: 0. h_cpol: 660. h_ndc: 660. o_1p5c: 660. o_2c: 660.
- For Energy Service|Transportation|Rail, data points per scenario differ: d_delfrag: 0. d_strain: 0. h_cpol: 660. h_ndc: 660. o_1p5c: 660. o_2c: 660.
- For Energy Service|Transportation|Road, data points per scenario differ: d_delfrag: 0. d_strain: 0. h_cpol: 660. h_ndc: 660. o_1p5c: 660. o_2c: 660.
- For Investment|Energy Supply|Electricity|Coal|w/o CCS, data points per scenario differ: d_delfrag: 600. d_strain: 600. h_cpol: 620. h_ndc: 620. o_1p5c: 600. o_2c: 600.

Better would be:

variable  d_delfrag  d_strain
Emi|CO2   500        700
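
A wide table like this could be produced with base R alone. The sketch below is not the checkDataLength implementation: the data frame and its column names (scenario, variable, period) are made up for illustration, and countDataPoints() is a hypothetical helper.

```r
# Made-up long-format data, one row per data point (illustrative only)
d <- data.frame(
  scenario = c("d_delfrag", "d_delfrag", "d_delfrag",
               "d_strain", "d_strain", "d_strain", "d_strain"),
  variable = c("Emi|CO2", "FE", "FE",
               "Emi|CO2", "Emi|CO2", "FE", "FE"),
  period   = c(2020, 2020, 2030, 2020, 2030, 2020, 2030)
)

# Hypothetical helper: count data points per variable (rows) and scenario
# (columns) and keep only the variables whose counts differ between scenarios.
countDataPoints <- function(data) {
  counts <- unclass(table(data$variable, data$scenario))
  differs <- apply(counts, 1, function(x) length(unique(x)) > 1)
  counts[differs, , drop = FALSE]
}

print(countDataPoints(d))
```

Only Emi|CO2 is printed here, because FE has the same number of data points in both scenarios.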

Fix last 14 unclear variable mappings

   variable                                                              unit               piam_variable                             piam_unit            piam_factor template
 1 Capacity Additions|Electricity|Storage Capacity                       GWh/yr             New Cap|Electricity|Storage|Battery       GW/yr                NA          AR6
 2 Capacity Additions|Electricity|Storage Capacity                       GWh/yr             New Cap|Electricity|Storage|Battery       GW/yr                NA          NAVIGATE
 3 Capacity Additions|Electricity|Storage Reservoir|Stationary Batteries GWh/yr             New Cap|Electricity|Storage|Battery       GW/yr                6           ARIADNE
 4 Capacity|Electricity|Storage Reservoir|Stationary Batteries           GWh                Cap|Electricity|Storage|Battery           GW                   6           ARIADNE

@robertpietzcker explained that the factor 6 is a rough conversion from GW to GWh. We will use that in AR6 and NAVIGATE also.

 5 Capital Stock                                                         billion EUR2020/yr Capital Stock|Non-ESM                     billion US$2005      1.174       ARIADNE

Error in ARIADNE variable list, Felix will ask the person responsible for it whether it should be changed.

 6 Expenditure Share|Food                                                %                  SDG|SDG02|Food expenditure share          income               100         AR6
 7 Expenditure Share|Food                                                %                  SDG|SDG02|Food expenditure share          income               100         AR6_MAgPIE
 8 Expenditure Share|Food                                                %                  SDG|SDG02|Food expenditure share          income               100         SHAPE

That seems ok, I added an accepted conversion from income to % with a piam_factor of 100.

 9 Expenditure|household|Food                                            billion US$2010/yr Household Expenditure|Food|Expenditure    USD/capita           NA          AR6
10 Expenditure|household|Food                                            billion US$2010/yr Household Expenditure|Food|Expenditure    USD/capita           NA          AR6_MAgPIE
11 Expenditure|household|Food                                            billion US$2010/yr Household Expenditure|Food|Expenditure    USD/capita           NA          NAVIGATE

complete nonsense, removed.

12 Freshwater|Environmental Flow Violations                              km3/yr             Water|Environmental flow violation volume km3                  NA          SHAPE

@FelicitasBeier will adapt the MAgPIE unit

13 GDP|PPP                                                               billion US$2010/yr Total income                              million US$05 PPP/yr 0.00000111  AR6_MAgPIE

An error: the factor should have three fewer zeros (i.e. 0.00111), as confirmed by @mleimbach and @flohump who introduced it.

14 Intensity|Final Energy                                                EJ/billion US$2010 Intensity|GDP|Final Energy                MJ/US$2005           0.00090274  AR6

Slightly adjusted to new conversion value, but overall ok.

Code to generate that:

library(tidyverse); options(width = 180)
devtools::load_all()
# collect, across all templates, the rows whose unit conversion is flagged
all <- NULL
for (name in names(templateNames())) {
  t <- as_tibble(getTemplate(name)) %>%
    select(c("variable", "unit", "piam_variable", "piam_unit", "piam_factor")) %>%
    mutate(template = name)
  # keep only the rows for which checkUnitFactor() is not TRUE
  all <- rbind(all, t %>% filter(! checkUnitFactor(t)))
}
arrange(all, .data$variable) %>% print(n = 200)

Fix ggplot2 3.5.0 warnings

With ggplot2 3.5.0 (as opposed to 3.4.0), plotIntercomparison raises six warnings (see checks in #243). This may need to be solved in mip. Skipping that for now; one might want to remove it from .buildlibrary later.

Warning (test-plotIntercomparison.R:7:3): plotComparison works
No shared levels found between `names(values)` of the manual scale and the data's fill values.
Backtrace:
     ▆
  1. ├─utils::capture.output(...) at test-plotIntercomparison.R:7:2
  2. │ └─base::withVisible(...elt(i))
  3. ├─testthat::capture_messages(...)
  4. │ └─base::withCallingHandlers(...)
  5. └─piamInterfaces::plotIntercomparison(...)
  6.   └─piamInterfaces:::makepdf(...) at piamInterfaces/R/plotIntercomparison.R:128:6
  7.     └─mip::showLinePlots(plotdata, p, mainReg = mainReg, color.dim.name = legendTitle) at piamInterfaces/R/plotIntercomparison.R:169:4
  8.       └─mip:::getLegend(p1)
  9.         ├─ggplot2::ggplot_gtable(ggplot_build(plt))
 10.         │ └─ggplot2:::attach_plot_env(data$plot$plot_env)
 11.         │   └─base::options(ggplot2_plot_env = env)
 12.         ├─ggplot2::ggplot_build(plt)
 13.         └─ggplot2:::ggplot_build.ggplot(plt)
 14.           └─plot$guides$build(npscales, plot$layers, plot$labels, data)
 15.             └─ggplot2 (local) build(..., self = self)
 16.               └─guides$train(scales, labels)
 17.                 └─ggplot2 (local) train(..., self = self)
 18.                   └─base::Map(...)
 19.                     └─base::mapply(FUN = f, ..., SIMPLIFY = FALSE)
 20.                       └─ggplot2 (local) `<fn>`(...)
 21.                         └─guide$train(param, scale, aes, title = labels[[aes]])
 22.                           └─ggplot2 (local) train(..., self = self)
 23.                             ├─rlang::inject(self$extract_key(scale, !!!params))
 24.                             └─self$extract_key(...)
 25.                               └─ggplot2 (local) extract_key(...)
 26.                                 └─scale$get_breaks()
 27.                                   └─ggplot2 (local) get_breaks(..., self = self)
 28.                                     ├─base::intersect(breaks, limits)
 29.                                     │ └─base::as.vector(y)
 30.                                     └─self$get_limits()
 31.                                       └─ggplot2 (local) get_limits(..., self = self)
 32.                                         └─self$limits(self$range$range)
 33.                                           └─ggplot2:::limits(...)
 34.                                             └─cli::cli_warn(...)

plots generated by checkSummations do not consider the factor values defined in the summation_groups_*.csv file

Issue:

  • The pdf report charts created using checkSummations and generatePlots = TRUE do not consider the factors defined in the summation check tables (summation_groups_*.csv).

Reproduction:

  • The ECEMF summation check for total GHG is defined as Emissions|Kyoto Gases 2 = 28 * Emissions|CH4 + Emissions|CO2 + Emissions|F-Gases + 0.265 * Emissions|N2O

Emissions|Kyoto Gases 2;Emissions|F-Gases;1
Emissions|Kyoto Gases 2;Emissions|N2O;0.265
Emissions|Kyoto Gases 2;Emissions|CH4;28
Emissions|Kyoto Gases 2;Emissions|CO2;1

  • The values in the summation csv are presented correctly:

    model:    IMAGE
    scenario: WP1 NetZero
    region:   EU27 & UK (*)
    period:   2005
    variable: Emissions|Kyoto Gases 2
    unit:     Mt CO2e/yr
    value:    5892.23413
    checkSum: 5917.05777
    diff:     24.8236421
    reldiff:  0.4
    details:  28 * Emissions|CH4 (23.5059933662415) + Emissions|CO2 (4765.56206810088) + Emissions|F-Gases (126.401477813721) + 0.265 * Emissions|N2O (1384.62797290641)

  • However, the chart uses the values before the conversion factors are applied, as can be seen below in the much smaller CH4 values and the much higher N2O values:

image
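
The numbers in the details column can be used to check what the chart should show. The following sketch only redoes the arithmetic of the reported row in base R; the object names are chosen for illustration and nothing here calls checkSummations.

```r
# Components and factors copied from the details column of the reported row
values  <- c("Emissions|CH4"     = 23.5059933662415,
             "Emissions|CO2"     = 4765.56206810088,
             "Emissions|F-Gases" = 126.401477813721,
             "Emissions|N2O"     = 1384.62797290641)
factors <- c("Emissions|CH4"     = 28,
             "Emissions|CO2"     = 1,
             "Emissions|F-Gases" = 1,
             "Emissions|N2O"     = 0.265)

# What the plot should stack: each component multiplied by its factor
weighted <- values * factors[names(values)]
print(round(weighted, 2))

# The weighted sum reproduces the reported checkSum of 5917.05777
print(sum(weighted))

# The unweighted sum, which is what the chart currently stacks, does not
print(sum(values))
```

After weighting, CH4 contributes about 658 and N2O about 367 Mt CO2e/yr, so CH4 should appear larger than N2O in the chart, not the other way round.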

inconsistency in Investment|Energy Supply|Liquids

Dear colleagues,
a user noticed an inconsistency in NGFS results:
image

The corresponding mapping chain is (NAVIGATE / AR6 / remind2):

remindmodel    -> remind2 reporting                         -> NAVIGATE / AR6
peoil          -> Energy Investments|Liquids|Oil Ref        -> Investment|Energy Supply|Liquids|Oil
petyf          -> Energy Investments|Liquids|Fossil         -> Investment|Energy Supply|Liquids|Coal and Gas
pecoal + pegas -> Energy Investments|Liquids|Fossil|w/o oil -> not mapped

Maybe better to map w/o oil to Coal and Gas?

question about Final Energy|Transportation|Liquids variables

In this PR, I took out the following summation group:

Final Energy|Transportation|Liquids = 
  Final Energy|Transportation|Liquids|Bioenergy
+ Final Energy|Transportation|Liquids|Coal
+ Final Energy|Transportation|Liquids|Fossil synfuel
+ Final Energy|Transportation|Liquids|Natural Gas
+ Final Energy|Transportation|Liquids|Oil

But maybe this was an error and it should rather be fixed on the reporting side, not by deleting it here? Maybe @johannah-pik, at some point in the future, you can have a look? (Sorry for putting work on you, again)
