Comments (4)
David,
This will be an excellent and very useful addition to the CF Conventions! I have not yet wrapped my head around the technical details, but there is one thing I do not quite understand. First you write:
The aggregation variables contain no data but instead record instructions on both how to find the data in their original files, and how to combine the data into an aggregated data array
And below the Figure 1 you write:
Note that this proposal does not cover how to decide whether or not the data arrays of two existing variables could or should be aggregated into a single larger array.
Probably I am missing something here, but to me this seems contradictory. Anyway, that is a detail, and I think the more important questions are the ones you raise in the Technical Proposal Summary:
... incorporate CFA into CF ... that this is a good idea, ...
To me this is no doubt a good idea, which already has a strong community backing.
... how the new content should be structured (e.g. a new section, a new appendix, both, or something else).
Perhaps an outline somewhere in the main text: end of Chapter 2 regarding aggregation files and their relation to the fragment files, somewhere in Chapter 3 regarding aggregation variables? And then an exhaustive description in an Appendix?
This brings me to a more general point that I have been thinking about for some time:
I think that the CF Conventions document is getting increasingly long and complex, and difficult to get an overview of. The Table of Content takes 8 full screens (5 pdf pages), then 5 screens of Tables of tables/figures/examples (3 pdf pages). I have no idea how to improve upon this, but it becomes more and more of a concern as we add new features to the Conventions. However, this is not something to discuss and solve here in this enhancement proposal, but I wanted to bring it up here anyway.
from cf-conventions.
Thank you for your comments, Lars, and sorry that it has taken me some time to respond.
Even though you are the only person to have commented here (and in support), this proposal has been scrutinised carefully at two CF workshops, with a group decision being reached in 2023 to work towards incorporating CFA into CF. I'm therefore minded to move to writing the PR, now that Lars has made a good suggestion of how and where the content could go into the existing CF conventions. This shouldn't take too long, because it will largely be a "cut and paste" job from the existing CFA description, which was deliberately written in a CF-ish style in anticipation of this :).
The aggregation variables contain no data but instead record instructions on both how to find the data in their original files, and how to combine the data into an aggregated data array
...
Note that this proposal does not cover how to decide whether or not the data arrays of two existing variables could or should be aggregated into a single larger array.
Good point. The first statement applies to the reading of the data, and the second to the writing of the data. The CFA conventions do not give any guidance on deciding how fragment files can be combined prior to creating an aggregation variable; rather, once you have an aggregation in mind, they provide a framework in which you can encode it in such a way that other people can decode it.
If I give you two datasets (A and B) then the CFA conventions won't give you any help in working out if A and B can be sensibly combined into a single larger dataset (C). There are various ways in which you could work this out yourself - you could inspect the metadata and apply an aggregation algorithm (e.g. this one, or by visual inspection), or base it on file names (e.g. I know that model outputs from March.nc and April.nc are safe to combine into a 2-month dataset), etc.
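As a rough illustration of the "inspect the metadata" route mentioned above, here is a minimal Python sketch of a compatibility check for two datasets. The dictionary layout, the `can_aggregate` helper, the rule it applies and the example values are all hypothetical; none of this is defined by CFA, which deliberately leaves the decision to whoever creates the aggregation.

```python
def can_aggregate(a, b, axis="time"):
    """Return True if datasets `a` and `b` look safe to join along `axis`.

    Hypothetical rule: same variable name and units, identical sizes on
    every other dimension, and `b` starting no earlier than `a` ends.
    """
    if a["name"] != b["name"] or a["units"] != b["units"]:
        return False
    # Compare the sizes of all dimensions except the aggregation axis.
    other_a = {dim: n for dim, n in a["shape"].items() if dim != axis}
    other_b = {dim: n for dim, n in b["shape"].items() if dim != axis}
    if other_a != other_b:
        return False
    # `range` is a half-open (start, end) interval along the axis.
    return a["range"][1] <= b["range"][0]


# Made-up metadata standing in for March.nc and April.nc.
march = {"name": "tas", "units": "K",
         "shape": {"time": 31, "lat": 73, "lon": 96}, "range": (0, 31)}
april = {"name": "tas", "units": "K",
         "shape": {"time": 30, "lat": 73, "lon": 96}, "range": (31, 61)}

print(can_aggregate(march, april))  # True
print(can_aggregate(april, march))  # False: April does not precede March
```

Once a check like this (or any other method) has settled that A and B belong together, the CFA conventions take over and describe how to encode the resulting aggregation.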
Perhaps an outline somewhere in the main text: end of Chapter 2 regarding aggregation files and their relation to the fragment files, somewhere in Chapter 3 regarding aggregation variables? And then an exhaustive description in an Appendix?
I like the idea of a Chapter 2 outline. I might suggest content from Introduction, Terminology, Aggregation variables, and Aggregation instructions (without its subsections) for Chapter 2, and everything else - which is most of the existing CFA document - (Standardized aggregation instructions, Non-standardized terms, Fragment Storage and examples) for the appendix.
The Table of Content takes 8 full screens (5 pdf pages), then 5 screens of Tables of tables/figures/examples (3 pdf pages).
Just a thought - the TOC currently shows all subsections - maybe it could be restricted to just one level of subsection, so for instance Chapter 7 would go from
[7. Data Representative of Cells](https://cfconventions.org/cf-conventions/cf-conventions.html#_data_representative_of_cells)
[7.1. Cell Boundaries](https://cfconventions.org/cf-conventions/cf-conventions.html#cell-boundaries)
[7.2. Cell Measures](https://cfconventions.org/cf-conventions/cf-conventions.html#cell-measures)
[7.3. Cell Methods](https://cfconventions.org/cf-conventions/cf-conventions.html#cell-methods)
[7.3.1. Statistics for more than one axis](https://cfconventions.org/cf-conventions/cf-conventions.html#statistics-more-than-one-axis)
[7.3.2. Recording the spacing of the original data and other information](https://cfconventions.org/cf-conventions/cf-conventions.html#recording-spacing-original-data)
[7.3.3. Statistics applying to portions of cells](https://cfconventions.org/cf-conventions/cf-conventions.html#statistics-applying-portions)
[7.3.4. Cell methods when there are no coordinates](https://cfconventions.org/cf-conventions/cf-conventions.html#cell-methods-no-coordinates)
[7.4. Climatological Statistics](https://cfconventions.org/cf-conventions/cf-conventions.html#climatological-statistics)
[7.5. Geometries](https://cfconventions.org/cf-conventions/cf-conventions.html#geometries)
to
[7. Data Representative of Cells](https://cfconventions.org/cf-conventions/cf-conventions.html#_data_representative_of_cells)
[7.1. Cell Boundaries](https://cfconventions.org/cf-conventions/cf-conventions.html#cell-boundaries)
[7.2. Cell Measures](https://cfconventions.org/cf-conventions/cf-conventions.html#cell-measures)
[7.3. Cell Methods](https://cfconventions.org/cf-conventions/cf-conventions.html#cell-methods)
[7.4. Climatological Statistics](https://cfconventions.org/cf-conventions/cf-conventions.html#climatological-statistics)
[7.5. Geometries](https://cfconventions.org/cf-conventions/cf-conventions.html#geometries)
That alone would remove 71 lines from the TOC! But as you say, any more on that should be discussed elsewhere, which I would welcome.
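To make the effect on Chapter 7 concrete, here is a small Python sketch that filters TOC entries by the depth of their section number. The `toc_depth` and `restrict_toc` helpers are hypothetical and only illustrate the idea; they say nothing about how the conventions document is actually built.

```python
import re


def toc_depth(entry):
    """Depth of an entry from its section number: '7.' -> 1, '7.3.1.' -> 3."""
    match = re.match(r"\s*((?:\d+\.)+)", entry)
    return match.group(1).count(".") if match else 0


def restrict_toc(entries, max_depth=2):
    """Keep chapters and at most one level of subsection by default."""
    return [entry for entry in entries if toc_depth(entry) <= max_depth]


chapter7 = [
    "7. Data Representative of Cells",
    "7.1. Cell Boundaries",
    "7.2. Cell Measures",
    "7.3. Cell Methods",
    "7.3.1. Statistics for more than one axis",
    "7.3.2. Recording the spacing of the original data and other information",
    "7.3.3. Statistics applying to portions of cells",
    "7.3.4. Cell methods when there are no coordinates",
    "7.4. Climatological Statistics",
    "7.5. Geometries",
]

short = restrict_toc(chapter7)
print(len(chapter7), "->", len(short))  # 10 -> 6
```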
I think this is generally a good idea and have been meaning to go over the details.
A quick thought about the table of contents: Would it be easy in the web view to collapse the subsection hierarchy to 1 or 2 levels, then click on an upper level to display its subsections? That might give a newbie a more accessible overview. On the other hand, I usually just execute "find" for some key word I know is relevant to what I want to look up, and if that word becomes hidden (in a hidden low level subsection), then I may have a harder time navigating quickly to the relevant section. So I can see arguments for the current expanded table of contents.