Comments (8)
It's a broad topic; here are some broad thoughts.
The original intent of extraction was always to perform serialisation only, and not involve itself with location or interaction with databases. In this case, extracting into a temporary directory and passing this directory on to integration is exactly aligned with this.
Integration then is the complete opposite. It doesn't do any generation of data on its own, but merely "mediates" the data, and aligns it with the overall pipeline.
Where Collection represents the "input" of a processing graph, Integration then is the "output". In between, data may "fan out" and become divided into smaller tasks, but in the end it must all pass through Integration, i.e. "fan in", if the content is ever to see the light of day.
We also want to implement versioning. So that could be additional required data.
Canonically, no process should ever know about existing assets or the state of existing assets until it comes to integration. In the case of versioning, which requires knowledge about which is the currently highest version in order to increment it, this would have to happen solely during integration.
This means that an integrator is free to not only produce final outputs, but also communicate and gather information (unrelated to validation and extraction) in order to make its final decision. An integrator is always assumed to be right, so no validation is ever required here, nor serialisation. In most cases this should converge into plain file-copying and persistence of data within each Instance and/or Context.
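A minimal sketch of this idea, assuming a flat staging directory produced by extraction and a publish directory whose subfolders are named v001, v002, and so on. The names `stage_dir`, `publish_dir` and the version pattern are hypothetical conventions for illustration, not part of any Pyblish API:

```python
import os
import re
import shutil


def next_version(publish_dir):
    """Scan existing published versions (v001, v002, ...) and return
    the next number. This query happens only at integration time;
    no earlier step is allowed to know about existing assets."""
    versions = []
    if os.path.isdir(publish_dir):
        for name in os.listdir(publish_dir):
            match = re.search(r"v(\d+)$", name)
            if match:
                versions.append(int(match.group(1)))
    return (max(versions) if versions else 0) + 1


def integrate(stage_dir, publish_dir):
    """Plain file-copy of everything staged during extraction into
    a freshly numbered version directory."""
    version_dir = os.path.join(
        publish_dir, "v%03d" % next_version(publish_dir))
    shutil.copytree(stage_dir, version_dir)
    return version_dir
```

Note how the integrator never validates anything; it only looks at what already exists on disk to make its one decision (the version number), then copies.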
Though this will always override it for any instance that has been Collected, which might be more annoying than what we gain from removing this duplication in code.
Not sure what you mean here, but if you mean that the first instance will create a temporary directory, whereas subsequent instances would be written to an already existing temporary directory, then that's perfectly fine and intended.
The temporary directory is much like Git's "staging area" in that it holds an arbitrary amount of information, but does so temporarily until it all is converged, or integrated, with the rest of the data.
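As a sketch of that analogy, using a plain dict to stand in for Pyblish's Context; the `stagingDir` key is a made-up convention for this example, not an official one:

```python
import os
import tempfile


def staging_dir(context):
    """Return the shared staging directory for this publish, creating
    it on first use. The first extractor to ask creates it; every
    subsequent extractor writes into the same area, much like files
    added to Git's index ahead of a commit."""
    if "stagingDir" not in context:
        context["stagingDir"] = tempfile.mkdtemp(prefix="pyblish_")
    return context["stagingDir"]
```

Every extractor calls the same helper, so whichever runs first creates the directory and the rest simply reuse it.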
from pyblish-magenta.
> Canonically, no process should ever know about existing assets or the state of existing assets until it comes to integration. In the case of versioning, which requires knowledge about which is the currently highest version in order to increment it, this would have to happen solely during integration.
> This means that an integrator is free to not only produce final outputs, but also communicate and gather information (unrelated to validation and extraction) in order to make its final decision. An integrator is always assumed to be right, so no validation is ever required here, nor serialisation. In most cases this should converge into plain file-copying and persistence of data within each Instance and/or Context.
Why would it be up to the Integrator to acquire the data (e.g. about the current highest version) as opposed to the Collector? This would also limit Validations (e.g. for versioning) like this one: https://github.com/mkolar/pyblish-kredenc/blob/master/plugins/common/validate_version_number.py
I feel it might be nice to have the Selector provide data about the current highest published version of the asset. I was thinking about having an Integrator ordered -0.1 that is toggled off by default for Increment Version. Only if this is toggled on will it Incrementally Publish. It's up to the artist to ensure the changes he made won't break anything. What do you think?
Either way. I would love to see a simple pseudocode example on what the Collector does, what the Extractor does and what the Integrator does.
> Why would it be up to the Integrator to acquire the data (e.g. about the current highest version) as opposed to the Collector?
Because it isn't related to the quality of what you are outputting. If a version on disk is faulty, then that is a fault carried over from a previous publish.
> Either way. I would love to see a simple pseudocode example on what the Collector does, what the Extractor does and what the Integrator does.
Sure, I'll have a look at this.
> Either way. I would love to see a simple pseudocode example on what the Collector does, what the Extractor does and what the Integrator does.
I've mocked up an example for you here.
> I feel it might be nice to have the Selector provide data about the current highest published version of the asset. I was thinking about having an Integrator ordered -0.1 that is toggled off by default for Increment Version. Only if this is toggled on will it Incrementally Publish. It's up to the artist to ensure the changes he made won't break anything. What do you think?
It would be nice and convenient, but also break encapsulation. Think about it. That data doesn't need validation, it has already been saved to disk. The damage is already done.
Furthermore, that data isn't part of what an artist has produced, it's part of what previous Integrators have produced. If anyone should be warned about an invalid version or bad naming convention on already written files, it should be the developer who produced the integrator.
> It would be nice and convenient, but also break encapsulation. Think about it. That data doesn't need validation, it has already been saved to disk. The damage is already done.
This isn't correct. The damage wouldn't have been done if the Validator caught it before Extraction. Plus it wouldn't even be in a "damaging" position if it were validated after Extraction; it would only be stored in the temporary location.
I think it's not that we're validating whether previous extractions went alright, but whether the version we are integrating now is up to par with our requirements.
Though as you state, it's definitely not up to the artist to define where it goes, unless there is user-defined data that influences as what type of data it gets extracted. A good example could be publishing shader variations (which we do a lot in our pipeline). For example, we build a red, blue and yellow bottle of wine. Each individual variation (for a single asset) could be validated for whether it's named correctly, whether it already exists, etc. The point being that when a user can interact with data that influences Integration, we want it to get validated, because it's prone to human error.
But I think it's good to see where the ship leads us if we keep it purely implemented in Integration.
> I've mocked up an example for you here: https://gist.github.com/mottosso/863e97d6f9d08a0d9eee
Some questions that come to mind:
- How do we let the Extractors extract to the correct temporary location without having to redesign Extractors per pipeline? Should we add a Selector that sets up extractDir data? (Or an Extractor that is ordered -0.1, whatever makes more sense.) Do we let multiple Extractors extract to the same directory? If so, what do we do on naming conflicts? Or how do we ensure there are no naming conflicts?
- What data do we provide so that the integrator knows how to rename a file in the end? This is partially dependent on the structure for how we want files to be integrated. Do we smash it all into a single published folder for an asset?
> This isn't correct. The damage wouldn't have been done if the Validator catches it before Extraction.
Are we talking about looking at existing files on disk, and validating whether those files are valid, during the publish of a new file?
Here's what I'm hearing.
MyAsset
├── publish
│ ├── myasset_v001.ma
│ ├── myasset_v002.ma
│ └── myasset_v003.ma
└── dev
When we're about to publish MyAsset once more, it would then create myasset_v004.ma.
You would like to (1) include myasset_v001-3.ma during collection of MyAsset, and (2) validate these versions? I'm sure this isn't what you mean.
> A good example could be publishing shader variations (which we do a lot in our pipeline). For example we build a red, blue and yellow bottle of wine.
I guarantee you that there is a better way to solve this exact thing which doesn't involve integration to be validated.
I invite you to produce this asset in the \Pyblish\_sandbox\magenta directory and I'll gladly walk you through how this can happen without complicating integration.
> How do we let the Extractors extract to the correct temporary location without having to redesign Extractors per pipeline? Do we let multiple Extractors extract to the same directory? If so, what do we do on naming conflicts? Or how do we ensure there are no naming conflicts?
Yes, that's right, multiple extractors write to the same directory. That's what this is doing. The directory is a generic staging area; each extractor could create its own little subdirectory if needed, but in general, the data each extractor produces should be unique enough not to need that.
The way I handled this in Napoleon was to create one subdirectory per family, and typically only extracted a single family via single extractor.
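That per-family convention could look something like the helper below; `family_dir` is a hypothetical name sketched for this discussion, not Napoleon's actual code:

```python
import os


def family_dir(staging, family):
    """One subdirectory per family inside the shared staging area,
    so extractors for different families can never clash on names."""
    path = os.path.join(staging, family)
    if not os.path.isdir(path):
        os.makedirs(path)
    return path
```

An extractor for the "model" family would write into `family_dir(staging, "model")`, and a playblast extractor into its own subdirectory, keeping naming conflicts structurally impossible.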
> What data do we provide so that the integrator knows how to rename a file in the end?
It depends on what file we're talking about.
Let's take the model from ben in The Deal as an example. Ben is extracted as e.g. ben.mb, and his parent (temporary) directory is stored in his instance as e.g. commitDir.
/tmp
└── ben.mb
In this case, an integrator with support for model families would come to expect models to be stored in this manner, a name and a suffix, and could simply move this exact file into the appropriate directory and give it an appropriate name.
In case a playblast and a gif are also present:
/tmp
├── ben.mov
├── ben.gif
└── ben.mb
The integrator will now need to support gifs and playblasts to properly manage their final locations; once it does, it will know what to do with files in whichever format they are expected to arrive in. For example, it could make the distinction based on their suffix.
So you see, there needs to be an interplay between extractors and integrators. There needs to be an "API" or "contract" which they have both agreed to. Any extractor going rogue, producing things an integrator isn't expecting, will simply not get integrated. No harm done.
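The "contract" could be as small as a suffix-to-destination table. The mapping and directory names below are illustrative only, assuming a flat commitDir as in the ben example above:

```python
import os
import shutil

# Hypothetical contract between extractors and this integrator:
# which suffix belongs in which published subdirectory.
DESTINATIONS = {
    ".mb": "models",
    ".mov": "playblasts",
    ".gif": "previews",
}


def integrate_by_suffix(commit_dir, publish_dir):
    """Move each staged file into place based on its suffix. Files
    this integrator doesn't recognise are simply left behind in the
    temporary directory - no harm done."""
    integrated = []
    for name in sorted(os.listdir(commit_dir)):
        suffix = os.path.splitext(name)[1]
        if suffix not in DESTINATIONS:
            continue  # "rogue" output from an unknown extractor
        dst_dir = os.path.join(publish_dir, DESTINATIONS[suffix])
        if not os.path.isdir(dst_dir):
            os.makedirs(dst_dir)
        shutil.move(os.path.join(commit_dir, name),
                    os.path.join(dst_dir, name))
        integrated.append(name)
    return integrated
```

Unrecognised files are skipped rather than raising an error, matching the "will simply not get integrated, no harm done" behaviour described above.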
So much has changed since this discussion that I'm not even sure how to "relate" it to the current state of Magenta. If this is still relevant, I think it would be great to see briefly outlined what exactly we need to fix or add; otherwise, let's close the discussion.