Comments (11)
I have no strong feelings about this not being the default enabled option, although I guess it would make more sense to be that way. Asking other modelling groups to add missing variables, however, is either not an option or would take too long in our case.
About the summation question, I would say it depends on the summation groups you define.
If you define the summations:
PE|Fossil = PE|Fossil|w/ CCS + PE|Fossil|w/o CCS
PE|Fossil 2 = PE|Fossil|Coal + PE|Fossil|Gas + PE|Fossil|Oil
PE = PE|Fossil + PE|Renewable
I would assume that the summation check would need to:
(1) for any missing variable, replace it with a non-zero value if the variable is the result of another summation, and issue a warning in the log file for that. You can simply assign to PE|Fossil the value of the first non-zero summation in this case.
(2) check all of the above summations after the missing variables are filled, and issue an error if the summations do not add up;
(3) check whether alternative summations for the same variable provide the same value (e.g. PE|Fossil and PE|Fossil 2 above), and issue an error if both are non-zero and have different values.
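To make the three steps concrete, here is a minimal sketch in Python. It is not piamInterfaces code: the function names `base_name`, `fill_missing` and `check_summations` are hypothetical, the data is a plain dict of made-up numbers for one region and year, and "first non-zero summation" is simplified to "first summation group whose children are all reported".

```python
import math

# Summation groups from the comment above. A trailing " 2" marks an
# alternative definition of the same variable (here PE|Fossil).
SUMMATIONS = {
    "PE|Fossil": ["PE|Fossil|w/ CCS", "PE|Fossil|w/o CCS"],
    "PE|Fossil 2": ["PE|Fossil|Coal", "PE|Fossil|Gas", "PE|Fossil|Oil"],
    "PE": ["PE|Fossil", "PE|Renewable"],
}


def base_name(group):
    """Strip a trailing ' <number>' so 'PE|Fossil 2' maps to 'PE|Fossil'."""
    head, _, tail = group.rpartition(" ")
    return head if head and tail.isdigit() else group


def fill_missing(data, summations):
    """Step (1): fill each missing parent from the first summation group
    whose children are all present. Returns a new dict and the warnings."""
    filled = dict(data)
    warnings = []
    changed = True
    while changed:  # repeat so that filled parents can feed higher-level sums
        changed = False
        for group, children in summations.items():
            parent = base_name(group)
            if parent in filled or not all(c in filled for c in children):
                continue
            filled[parent] = sum(filled[c] for c in children)
            warnings.append(f"{parent} filled from summation group '{group}'")
            changed = True
    return filled, warnings


def check_summations(data, summations, tol=1e-9):
    """Steps (2) and (3): a parent must match every summation group whose
    children are all present, so conflicting alternatives show up as errors."""
    errors = []
    for group, children in summations.items():
        parent = base_name(group)
        if parent not in data or not all(c in data for c in children):
            continue
        total = sum(data[c] for c in children)
        if not math.isclose(data[parent], total, abs_tol=tol):
            errors.append(f"{group}: {parent}={data[parent]}, children sum to {total}")
    return errors


# Made-up numbers for illustration; PE|Fossil is deliberately not reported.
data = {
    "PE|Fossil|w/ CCS": 1.0, "PE|Fossil|w/o CCS": 4.0,
    "PE|Fossil|Coal": 2.0, "PE|Fossil|Gas": 2.0, "PE|Fossil|Oil": 1.0,
    "PE|Renewable": 3.0, "PE": 8.0,
}
filled, warnings = fill_missing(data, SUMMATIONS)
errors = check_summations(filled, SUMMATIONS)
print(warnings)
print(errors)
```

With these numbers, PE|Fossil is filled with 5.0 from the first group, one warning is logged, and no errors are raised because the alternative group also sums to 5.0.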
from piaminterfaces.
Thanks for your input, Renato. Definitely can look into this, but need some additional input on how urgent / frequent this use case is so I can prioritize the task.
Any thoughts @Renato-Rodrigues @orichters @LaviniaBaumstark ?
from piaminterfaces.
I can describe my use case to inform your decision:
@robertpietzcker is currently working on a model comparison for ECEMF, which includes bottom-up and IAM models that are not able to report the same set of variables.
We need to make sure that results from these models are consistent and this feature would allow using piamInterfaces to do summation checks for those other models.
from piaminterfaces.
I don't need this feature for our NGFS model comparison and would rather have summation checks fail if some variables are missing, so that I can tell the other modeling team to add them. I would also prefer that recursive = FALSE remains the default that we use for checking our REMIND submissions. This is of course no opposition to implementing such a feature.
One question: suppose, for example, that PE|Fossil is PE|Fossil|w/ CCS + PE|Fossil|w/o CCS but also PE|Fossil|Coal + PE|Fossil|Gas. Now, if PE = PE|Fossil + PE|Renewable but PE|Fossil is not reported, should the recursive mode check both of them?
from piaminterfaces.
The discussion already shows how complex implementing such a feature is, due to all the edge cases.
.. is currently working on a model comparison for ECEMF, which includes bottom-up and IAM models that are not able to report the same set of variables.
We need to make sure that results from these models are consistent and this feature would allow using piamInterfaces to do summation checks for those other models.
For this use case, I would
- Check the submissions per model for completeness of the variables (something similar to this example, section "Validate data").
- Then calculate the missing variables per model (an example from Ariadne)
- Then run checkSummations as is.
Step 1 and 2 might not be feasible if we are talking about large gaps with many missing variables across all the models, but it should be fine for smaller gaps.
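As a rough illustration of step 1, a per-model completeness check could look like the sketch below. This is plain Python rather than the linked example; the function `missing_variables` and the model names are hypothetical.

```python
def missing_variables(reported, required):
    """Map each model to the sorted list of required variables it lacks.

    `reported` maps a model name to the variables it submitted;
    `required` is the list of variables the summation checks rely on."""
    return {
        model: sorted(set(required) - set(variables))
        for model, variables in reported.items()
    }


# Hypothetical submissions: model_b does not report PE|Fossil.
required = ["PE", "PE|Fossil", "PE|Renewable"]
reported = {
    "model_a": ["PE", "PE|Fossil", "PE|Renewable"],
    "model_b": ["PE", "PE|Renewable"],
}
gaps = missing_variables(reported, required)
print(gaps)
```

Models with a non-empty list are the ones that need step 2 before the summation checks can run unchanged.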
from piaminterfaces.
That being said, if you want to keep going in this direction, I can set up a new function that fills gaps based on the summation checks.
But I would keep it outside of the checkSummations function (you run this new function first to fill the gaps, then pass the result to checkSummations).
As this can become really complex, I would for now cover only the edge cases you have in your data, not all potential edge cases, and extend it if a real need arises.
What do you think, Renato?
from piaminterfaces.
My suggestion in the first post was to create exactly the function you are mentioning: see the fillMissing mention above.
Supporting this in the checkSummations code is necessary for the case of contradictory alternative summation groups, if this is not handled already in the checkSummations code.
The implementation shouldn't be more complex than:
1. add a parameter called recursive to the function checkSummations.
2. if recursive = TRUE:
2.1 call the fillMissing function to fill the gaps in the data.
2.2 add a warning to the log file if two group summations for the same variable do not sum up to the same value, e.g. PE|Fossil <> PE|Fossil 2 (if this is not already in place as one of the checkSummations tests).
2.3 if two summations have different values (e.g. PE|Fossil 2 > PE|Fossil <> 0), or if only an alternative summation has a value (e.g. PE|Fossil 2 <> 0 and PE|Fossil = 0), choose the one with a value (or, in the first case, maybe the one with the biggest value) and rename it to the plain variable name, e.g. rename PE|Fossil 2 to PE|Fossil in this case. Add a warning to the log file that PE|Fossil 2 values are being used for PE|Fossil in summation checks.
3. run the summations check code as it is now.
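The decision rule in 2.2 and 2.3 can be sketched as follows. This is a Python illustration only, not piamInterfaces code: `resolve_alternatives` is a hypothetical helper, and picking the largest total when the alternatives disagree is just the tentative option mentioned in 2.3.

```python
import math


def resolve_alternatives(variable, totals, tol=1e-9):
    """Pick a value for `variable` from alternative summation totals.

    `totals` maps a summation group name (e.g. 'PE|Fossil 2') to the sum
    of its children. Returns (value, warnings)."""
    warnings = []
    nonzero = {
        group: total for group, total in totals.items()
        if not math.isclose(total, 0.0, abs_tol=tol)
    }
    if not nonzero:
        return 0.0, warnings  # nothing to choose from
    values = list(nonzero.values())
    # 2.2: warn if the non-zero alternatives disagree with each other.
    if any(not math.isclose(v, values[0], abs_tol=tol) for v in values[1:]):
        warnings.append(f"alternative summations for {variable} differ: {nonzero}")
    # 2.3: use the group with the largest total (assumption, see lead-in);
    # on ties, the group listed first wins.
    chosen = max(nonzero, key=nonzero.get)
    if chosen != variable:
        warnings.append(f"{chosen} values are being used for {variable}")
    return nonzero[chosen], warnings


# Only the alternative group has a value.
value_a, warn_a = resolve_alternatives(
    "PE|Fossil", {"PE|Fossil": 0.0, "PE|Fossil 2": 5.0})
# Both groups have a value, but they disagree.
value_b, warn_b = resolve_alternatives(
    "PE|Fossil", {"PE|Fossil": 4.0, "PE|Fossil 2": 5.0})
print(value_a, warn_a)
print(value_b, warn_b)
```

In both calls the value 5.0 from PE|Fossil 2 is used for PE|Fossil; the second call additionally warns that the two groups differ.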
PS: It makes little difference whether 2.2 and 2.3 are added to the fillMissing function code, allowing us to simply call the functions sequentially as you mentioned above, or whether these duplication and missing-information decisions are handled inside the summation check code. However, this step is necessary.
from piaminterfaces.
Defining project-specific validation code for summation checks by hand, as was done in ARIADNE, is not a good alternative in my opinion, because we could use the information in the group summations file to do exactly the same in an automated way.
It just seems like too much duplicated work to me.
from piaminterfaces.
Renato, what I haven't understood yet is: In the end, do you also want the data with gaps filled saved or returned or do you just want the summation check results?
If I understand it correctly, the question is whether you call
checkSummations(data, summationsFile = "AR6", fillMissing = TRUE, …)
which calls fillMissing() itself, or
checkSummations(fillMissing(data, summationsFile = "AR6"), summationsFile = "AR6", …)
To avoid redundancy, I would opt for the first.
2.2 is not yet implemented, but I also don't think this is an important feature for the non-recursive checks, because if PE = PE|Fossil + PE|Renewable and PE 2 = PE|w/ CCS + PE|w/o CCS have different values, then, as you have a unique PE value in the data, at least one of them will fail. So I would probably follow the suggestion to put 2.2 and 2.3 into the fillMissing function.
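A small worked example of that argument, in Python with made-up numbers (the helper `failing_groups` is hypothetical): because PE is reported only once, two summation groups with different totals cannot both match it.

```python
import math

# Made-up values: the first group sums to 10.0, the second to 9.0.
data = {
    "PE": 10.0,
    "PE|Fossil": 7.0, "PE|Renewable": 3.0,
    "PE|w/ CCS": 2.0, "PE|w/o CCS": 7.0,
}
groups = {
    "PE": ["PE|Fossil", "PE|Renewable"],
    "PE 2": ["PE|w/ CCS", "PE|w/o CCS"],
}


def failing_groups(data, parent, groups, tol=1e-9):
    """Return the summation groups whose children do not sum to `parent`."""
    return [
        name for name, children in groups.items()
        if not math.isclose(data[parent], sum(data[c] for c in children), abs_tol=tol)
    ]


failed = failing_groups(data, "PE", groups)
print(failed)
```

Here only the second group fails; changing the reported PE to 9.0 would instead make the first group fail, so the inconsistency is flagged either way.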
from piaminterfaces.
For my current use case, it does not matter whether I have access to the "filled" data.
However, having fillMissing as an independent function is a more elegant solution in my opinion and could also prove useful in the future.
Anyway, I have no strong opinion on whether to integrate this directly into the checkSummations function or to call fillMissing first and only then checkSummations, as long as 2.2 and 2.3 are considered in either solution.
from piaminterfaces.
@Renato-Rodrigues Can you provide
- some runs from the ECEMF project with missing variables which I can use for testing
- a summations file (only if it is not identical to the ECEMF summations file you checked in)
from piaminterfaces.