Comments (7)
Original comment by Kristian Rönn (Bitbucket: zero0nee, GitHub: zero0nee).
@{557058:3bfb0d84-eba1-44d8-b3a8-65379547e130} Any updates on this issue? I assume this is still not implemented. It would make migration between OpenLCA and brightway2 massively easier. It would also make it a lot easier to use data from https://nexus.openlca.org/. I’m more than happy to help with the coding if you could give me some pointers on where to start.
from brightway2-io.
One of the first steps here is to go from the extracted dictionary with keys:
['processes', 'dq_systems', 'locations', 'actors', 'flow_properties', 'sources', 'unit_groups', 'categories', 'flows']
to a list of activities that fits the BW standard IO format.
- `dq_systems` seems completely irrelevant.
- We could take the `code` from `locations`. The default is e.g. 'Northern America', and the code is `RNA`. The tradeoff here is we get compatibility with (dumb) old systems, by using their (dumb) old abbreviations. After typing this, I would vote to use the human-readable names.
- I propose to delete `actors` and `sources`; the part of the information we would want here is already included in the process.
- Here is a `flow_property`. What would we want to include? This is more philosophy than engineering:
{'@context': 'http://greendelta.github.io/olca-schema/context.jsonld',
'@type': 'FlowProperty',
'@id': 'f6811440-ee37-11de-8a39-0800200c9a66',
'name': 'Energy',
'version': '00.00.000',
'category': {'@type': 'Category',
'@id': '87cfd36b-db77-3a88-8c0e-7102ce682690',
'name': 'Technical flow properties',
'categoryType': 'FlowProperty'},
'flowPropertyType': 'PHYSICAL_QUANTITY',
'unitGroup': {'@type': 'UnitGroup',
'@id': '93a60a57-a3c8-11da-a746-0800200c9a66',
'name': 'Units of energy',
'categoryPath': ['Technical unit groups']}}
- `unit_groups` shows how to convert units. Not needed for a first working implementation.
- Here is an entry in `categories`. What I would consider the same category is distinguished between process and flow:
'0927a450-e2b2-3455-983b-42ff1ebacc3c': {'@context': 'http://greendelta.github.io/olca-schema/context.jsonld',
'@type': 'Category',
'@id': '0927a450-e2b2-3455-983b-42ff1ebacc3c',
'name': '3251: Basic Chemical Manufacturing',
'version': '00.00.000',
'category': {'@type': 'Category',
'@id': '00388557-8210-3c1a-b8d8-08fe34203c28',
'name': '31-33: Manufacturing',
'categoryType': 'Process'},
'modelType': 'PROCESS'},
'2665bab2-c905-3680-8ad7-16954f1ea251': {'@context': 'http://greendelta.github.io/olca-schema/context.jsonld',
'@type': 'Category',
'@id': '2665bab2-c905-3680-8ad7-16954f1ea251',
'name': '3251: Basic Chemical Manufacturing',
'version': '00.00.000',
'category': {'@type': 'Category',
'@id': '196f44d8-40ec-37ed-8e73-a0c0825569eb',
'name': '31-33: Manufacturing',
'categoryPath': ['Technosphere Flows'],
'categoryType': 'Flow'},
'modelType': 'FLOW'},
And here is what we extract for the category of an activity:
'category': {'@type': 'Category',
'@id': '0927a450-e2b2-3455-983b-42ff1ebacc3c',
'name': '3251: Basic Chemical Manufacturing',
'categoryPath': ['31-33: Manufacturing'],
'categoryType': 'Process'}
I don't see anything in the metadata which we care about which isn't in the data.
- `flows`. Here is an exchange, which has a flow:
{'@type': 'Exchange',
'avoidedProduct': False,
'input': False,
'amount': 11.2,
'internalId': 3,
'flow': {'@type': 'Flow',
'@id': '8afb5cc3-26fc-416c-991e-afe07bb06bd4',
'name': 'Tar; from thermochemical conversion; at plant',
'categoryPath': ['Technosphere Flows',
'31-33: Manufacturing',
'3251: Basic Chemical Manufacturing'],
'flowType': 'PRODUCT_FLOW',
'location': 'US',
'refUnit': 'kg'},
'unit': {'@type': 'Unit',
'@id': '20aadc24-a391-41cf-b340-3e4529f44bde',
'name': 'kg'},
'flowProperty': {'@type': 'FlowProperty',
'@id': '93a60a56-a3c8-11da-a746-0800200b9a66',
'name': 'Mass',
'categoryPath': ['Technical flow properties']}},
And here is that flow from `flows`:
{'@context': 'http://greendelta.github.io/olca-schema/context.jsonld',
'@type': 'Flow',
'@id': '8afb5cc3-26fc-416c-991e-afe07bb06bd4',
'name': 'Tar; from thermochemical conversion; at plant',
'description': '',
'version': '00.00.002',
'lastChange': '2019-11-25T14:50:26.620-05:00',
'category': {'@type': 'Category',
'@id': '2665bab2-c905-3680-8ad7-16954f1ea251',
'name': '3251: Basic Chemical Manufacturing',
'categoryPath': ['Technosphere Flows', '31-33: Manufacturing'],
'categoryType': 'Flow'},
'flowType': 'PRODUCT_FLOW',
'infrastructureFlow': False,
'location': {'@type': 'Location',
'@id': '0b3b97fa-6688-3c56-88ee-4ae80ec0c3c2',
'name': 'United States'},
'flowProperties': [{'@type': 'FlowPropertyFactor',
'referenceFlowProperty': True,
'flowProperty': {'@type': 'FlowProperty',
'@id': '93a60a56-a3c8-11da-a746-0800200b9a66',
'name': 'Mass',
'categoryPath': ['Technical flow properties']},
'conversionFactor': 1.0}]}
The only thing I see here is that the location is different (!?). 'United States' versus 'US'. So, as we are using the full names, we need to normalize these names in the exchanges.
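A minimal sketch of that normalization, assuming the extracted `flows` dictionary is keyed by flow `@id` (as the `categories` excerpt above is keyed by category `@id`). The function name `normalize_exchange_locations` is hypothetical and not part of bw2io; the sample data is trimmed from the excerpts above.

```python
def normalize_exchange_locations(exchanges, flows):
    """Overwrite the location embedded in each exchange's flow reference
    with the full location name from the referenced flow data set.

    Hypothetical helper: `flows` is assumed to map flow @id -> flow data set.
    """
    for exchange in exchanges:
        ref = exchange.get("flow", {})
        flow = flows.get(ref.get("@id"))
        if flow is None:
            continue  # unknown flow: leave the reference untouched
        location = flow.get("location")
        if location and "name" in location:
            ref["location"] = location["name"]
    return exchanges


# Sample data trimmed from the exchange and flow shown above.
flows = {
    "8afb5cc3-26fc-416c-991e-afe07bb06bd4": {
        "@type": "Flow",
        "@id": "8afb5cc3-26fc-416c-991e-afe07bb06bd4",
        "location": {"@type": "Location", "name": "United States"},
    }
}
exchanges = [
    {
        "@type": "Exchange",
        "flow": {
            "@type": "Flow",
            "@id": "8afb5cc3-26fc-416c-991e-afe07bb06bd4",
            "location": "US",
        },
    }
]
normalize_exchange_locations(exchanges, flows)
```

After the call, the exchange's flow reference carries 'United States' instead of 'US'. Flows without a location, and references to unknown flows, are left unchanged.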
Current status:
- Creates a new, separate biosphere database based on flows available. This could be improved by importing the US EPA biosphere flow list.
- Internal linking is working for most technosphere and all biosphere flows. The one wrinkle is an unlinked waste flow, `Water; from thermochemical conversion; at plant`, in the US FPL test fixture.
- Need integration tests against a toy database with all possible JSON-LD wrinkles.
Great progress. I know the format is complicated and it has grown over the years...
Regarding the different location names: the structure of a data set is in general like this:
{
"@type": "SomeThing",
"@id": "...",
"referenceToAnotherThing": {
"@type": "AnotherThing",
"@id": "...",
"some": "meta-data to understand what this reference is"
}
}
There are data sets that contain references to other data sets. A reference to a data set is just a small descriptor. The only required thing there is the `@id` (everything else could be missing). In the example above, the exchange references a flow, and the flow references a location. We put some meta-data into these references so that it is possible to understand what it means, but it is better not to use this in the import.
The import could first index the locations by their IDs by going through the location data sets in the `locations` folder (e.g. by mapping the names or codes in these data sets to the Brightway locations: `olca-location-id -> bw-location`). Then, when reading the flows, it could take the correct location for the respective ID.
The same goes for units, flow types, category paths, etc.: these things may not be present in the references, so it would be better to index them bottom-up: `units -> flow properties -> locations -> flows`
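The last two steps of that chain (locations, then flows) could be sketched as follows. This is only an illustration under stated assumptions: the helper names `index_locations` and `index_flows` are hypothetical, the optional `bw_names` mapping stands in for whatever name translation Brightway needs, and the `code` value in the sample location is illustrative. Only the `@id` of a reference is read; the embedded meta-data is ignored.

```python
def index_locations(location_datasets, bw_names=None):
    """Build olca-location-id -> bw-location from the location data sets.

    `bw_names` is a hypothetical optional mapping from an olca name or
    code to the label wanted on the Brightway side.
    """
    bw_names = bw_names or {}
    index = {}
    for ds in location_datasets:
        label = ds.get("name") or ds.get("code")
        index[ds["@id"]] = bw_names.get(label, label)
    return index


def index_flows(flow_datasets, locations):
    """Build flow-id -> resolved attributes, using only the @id of the
    location reference and never its embedded meta-data."""
    index = {}
    for ds in flow_datasets:
        ref = ds.get("location") or {}
        index[ds["@id"]] = {
            "name": ds.get("name"),
            "location": locations.get(ref.get("@id")),
        }
    return index


# Illustrative data; the flow's location reference carries only an @id.
locations = index_locations(
    [{"@id": "0b3b97fa-6688-3c56-88ee-4ae80ec0c3c2",
      "name": "United States", "code": "US"}]
)
flows_index = index_flows(
    [{"@id": "8afb5cc3-26fc-416c-991e-afe07bb06bd4",
      "name": "Tar; from thermochemical conversion; at plant",
      "location": {"@type": "Location",
                   "@id": "0b3b97fa-6688-3c56-88ee-4ae80ec0c3c2"}}],
    locations,
)
```

Units and flow properties would be indexed the same way before the flows, so that each level only ever looks up IDs in the level below it.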
Thanks @msrocka, and good timing, I was just about to email you :)
We are actually doing as you suggest for locations and unit conversions, but should apply this strategy more broadly.
In any format, it is a challenge to get all the assumptions of the designers into the spec. This is normal, and the existing documentation is a good help in understanding things. Of course, there are still some places we don't perfectly understand.
A few more questions on the format:
- Exchange `internalId` values are intended to identify references to exchanges when multiple exchanges have the same flow. However, at least in the test data we downloaded from the LCA Commons (e.g. US FPL), these values are not unique. Do you rely on using these `internalId` values?
- Are the allocation algorithms described anywhere? I guess I could guess, but this is not as nice as a reference.
- How do you do allocation for avoided products (if at all)? I guess these are also just allocated as in other inputs...
- What to do if there is causal allocation, but not specific allocation factors for each exchange? Or does causal allocation require specific factors for every exchange?
- Does OpenLCA use the same flow UUIDs as https://github.com/USEPA/Federal-LCA-Commons-Elementary-Flow-List?
And a general question - are there other programs or reference implementations of OLCA-schema IO aside from OpenLCA?
> Exchange internalId values are intended to identify references to exchanges when multiple exchanges have the same flow. However, at least in the test data we downloaded from the LCA Commons (e.g. US FPL), these values are not unique. Do you rely on using these internalId values?
This is an error. These IDs need to be unique within a process. We use them to identify the exchanges for causal allocation factors and for process links in product systems: when there are two inputs of the same product (or two outputs of the same waste flow), these can be linked to different providers, and we need this ID to know which exchange is linked to which provider.
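Since the test data violates this rule, an importer could check for it up front. A minimal sketch, assuming a process data set keeps its exchanges in a list under an `exchanges` key; the function name `duplicate_internal_ids` is hypothetical.

```python
from collections import Counter


def duplicate_internal_ids(process):
    """Return the internalId values that occur more than once in a process.

    Hypothetical helper; exchanges without an internalId are ignored.
    """
    counts = Counter(
        exc["internalId"]
        for exc in process.get("exchanges", [])
        if exc.get("internalId") is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)


# Illustrative process with one repeated internalId.
process = {
    "@type": "Process",
    "exchanges": [
        {"internalId": 1, "amount": 1.0},
        {"internalId": 2, "amount": 11.2},
        {"internalId": 2, "amount": 0.5},
        {"internalId": 3, "amount": 4.0},
    ],
}
duplicates = duplicate_internal_ids(process)
```

A non-empty result means exchange references in that process (causal allocation factors, process links) cannot be resolved unambiguously.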
> Are the allocation algorithms described anywhere? I guess I could guess, but this is not as nice as a reference.
Unfortunately not. Basically it is like this:
- physical and economic allocation: you define a factor for each product output and waste input and these are applied on the amounts of all elementary flows, product inputs, and waste outputs of the process
- causal allocation: for each output product or waste input you define a factor for each other exchange
In openLCA, the allocation method can be then selected in the calculation and the respective factors are applied then (so there should be factors supplied for each allocation method). There is also the option to select the default allocation method for a process and use these default settings in the calculation.
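The physical and economic case described above can be sketched with plain Python structures. This is not openLCA's implementation: the function name `apply_allocation`, the `shared_exchanges` list (standing for the elementary flows, product inputs, and waste outputs of the process), and the `factors` dictionary are all hypothetical simplifications.

```python
def apply_allocation(shared_exchanges, factors):
    """Scale every shared exchange by each product's allocation factor.

    shared_exchanges: elementary flows, product inputs and waste outputs.
    factors: {product name: factor} for one allocation method
             (physical or economic).
    Returns {product name: list of scaled copies of the exchanges}.
    """
    return {
        product: [
            {**exc, "amount": exc["amount"] * factor}
            for exc in shared_exchanges
        ]
        for product, factor in factors.items()
    }


# Illustrative two-product process.
shared_exchanges = [
    {"name": "Carbon dioxide", "amount": 10.0},
    {"name": "Electricity", "amount": 4.0},
]
allocated = apply_allocation(shared_exchanges, {"A": 0.75, "B": 0.25})
```

Each product output (or waste input) gets its own scaled copy of the shared exchanges, which matches the idea of splitting a multi-output process into one single-output process per product.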
> How do you do allocation for avoided products (if at all)? I guess these are also just allocated as in other inputs...
Yes, the allocation factors should be also applied on the avoided products then (although I have never seen or even tested anything like this yet; I need to verify this).
> What to do if there is causal allocation, but not specific allocation factors for each exchange? Or does causal allocation require specific factors for every exchange?
Yes, it requires the specific factors for every exchange. In openLCA, when a factor is missing it defaults to `1`.
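A sketch of causal allocation with that fallback, again using hypothetical simplified structures rather than the olca-schema layout: `causal_factors` maps each product to a dictionary of per-exchange factors keyed by `internalId`, and `apply_causal_allocation` is an illustrative name.

```python
def apply_causal_allocation(shared_exchanges, causal_factors, default=1.0):
    """Scale each exchange by the factor defined for (product, exchange).

    causal_factors: {product name: {internalId: factor}}.
    A missing factor falls back to `default`, mirroring the openLCA
    behaviour described above (missing factor -> 1).
    """
    result = {}
    for product, per_exchange in causal_factors.items():
        result[product] = [
            {
                **exc,
                "amount": exc["amount"]
                * per_exchange.get(exc["internalId"], default),
            }
            for exc in shared_exchanges
        ]
    return result


# Product "A" has no factor for exchange 2, so that amount is unchanged.
shared = [
    {"internalId": 1, "amount": 10.0},
    {"internalId": 2, "amount": 4.0},
]
causal = apply_causal_allocation(
    shared, {"A": {1: 0.5}, "B": {1: 0.5, 2: 0.25}}
)
```

Because the factors are looked up by `internalId`, this only works when those IDs are unique within the process.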
> Does OpenLCA use the same flow UUIDs as https://github.com/USEPA/Federal-LCA-Commons-Elementary-Flow-List?
Not yet. But they already defined a mapping with which the openLCA flows can be replaced with the FEFL flows:
https://github.com/USEPA/Federal-LCA-Commons-Elementary-Flow-List/blob/master/fedelemflowlist/flowmapping/openLCA.csv
I think we should move to this list at some point. But we have quite a few databases that would require an update then...
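Applying such a mapping during import might look like the sketch below. The column names `SourceFlowUUID` and `TargetFlowUUID` are an assumption about the CSV layout and are passed as parameters so they can be adjusted; the helper names and the sample IDs are placeholders, and unit conversion factors are deliberately ignored here.

```python
import csv
import io


def load_uuid_mapping(csv_file, source_col="SourceFlowUUID",
                      target_col="TargetFlowUUID"):
    """Read source-UUID -> target-UUID pairs from an open CSV file.

    The default column names are assumptions, not verified against the
    real mapping file. Rows missing either value are skipped.
    """
    mapping = {}
    for row in csv.DictReader(csv_file):
        source, target = row.get(source_col), row.get(target_col)
        if source and target:
            mapping[source] = target
    return mapping


def remap_flow_ids(exchanges, mapping):
    """Replace the @id of each exchange's flow reference when mapped."""
    for exc in exchanges:
        ref = exc.get("flow", {})
        if ref.get("@id") in mapping:
            ref["@id"] = mapping[ref["@id"]]
    return exchanges


# Placeholder IDs, not real flow UUIDs.
sample_csv = io.StringIO(
    "SourceFlowUUID,TargetFlowUUID\n"
    "old-id-1,new-id-1\n"
    "old-id-2,\n"
)
mapping = load_uuid_mapping(sample_csv)
remapped = remap_flow_ids(
    [{"flow": {"@id": "old-id-1"}}, {"flow": {"@id": "old-id-2"}}],
    mapping,
)
```

Flows without a mapped target keep their original ID, so unmapped flows can still be reported as unlinked afterwards.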
> And a general question - are there other programs or reference implementations of OLCA-schema IO aside from OpenLCA?
I am not aware of such tools. The things I know are always linked to openLCA: packaging data for openLCA or exchanging data with openLCA via some interface. Improving the format or specifying a subset of the format so that it is more useful for other tools would be great though.
First of all, thanks for the work on both the Olca schema and this converter. My name is Selim Youssry and I am the CTO of Sapiologie; Michael and I were briefly in touch last year.
We actively use the Olca schema at Sapiologie, which answers your question from Sept 24th, 2021 @cmutel.
> And a general question - are there other programs or reference implementations of OLCA-schema IO aside from OpenLCA?
We really like the format and more generally don't want to introduce yet another LCA format :). Being JSON based, well documented, and compatible with all Nexus databases are huge gains.
As we are looking very closely at Brightway for the computation parts, I'd like to offer our help to close the last 5 tasks on this ticket. Short term, I'm happy to take over and submit a PR, if that is okay with you both.
If that is of interest to you, @cmutel, would you have some time to bring me/us up to speed on some questions we have? We have examined the Brightway codebase, its documentation, and tutorials, but a call would help us get through the remaining questions faster. Either way, I'd love to connect.
Feel free to answer directly here or email me at selim [ at ] sapiologie [ dot ] com.
Thank you and hopefully talk to you soon