Comments (8)
As to redundancy, I'm perfectly fine with things as they are. "Redundant" code (if that's what it is, which I am not convinced of) is not by itself a bad thing. It's not worth it to find some ultra-Julian way to compress the indexing functionality just because the number of lines is large. `VarInfo` works really well, and that comes at the cost of complexity. I think we should just focus on functionality and revisit "redundancy" if it ever actually starts to become a non-cosmetic issue.
As to the API, I do not find that I run into cases where various API calls are repeated, or that I am unclear as to which API function to use, so I think the API is more or less fine. The docstrings are actually quite comprehensive.
As to new API, the only things I'd like to see done by smarter people than me are:
- Setting values with `NamedTuple`s in a better way than my hacky version for the Turing/AdvancedMH interconnect. This might already exist but I don't know where it is.
- Getting a vector of the support for each variable -- may help for box minimization, and potentially other sampler types.
- A super quick convenience function: getting a vector of variable names with indexing, i.e. `names(vi)` would return something like `["a[1,1]", "a[1,2]", "sigma"]`. Also probably something that already exists.
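To make the wish concrete, here is a minimal sketch of such a convenience function over a toy `Dict`-backed stand-in; `SimpleVI` and `flat_names` are illustrative names, not DynamicPPL API:

```julia
# Hypothetical sketch: flatten variable names with indices, assuming the
# varinfo can expose each variable's value in its original shape.
struct SimpleVI
    vals::Dict{Symbol,Any}
end

function flat_names(vi::SimpleVI)
    out = String[]
    for (sym, val) in sort!(collect(vi.vals); by = first)
        if val isa AbstractArray
            # One entry per element, in column-major order.
            for I in CartesianIndices(val)
                push!(out, string(sym, "[", join(Tuple(I), ","), "]"))
            end
        else
            push!(out, string(sym))
        end
    end
    return out
end

vi = SimpleVI(Dict(:a => [1.0 2.0], :sigma => 0.5))
flat_names(vi)  # ["a[1,1]", "a[1,2]", "sigma"]
```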
On testing: The Turing test issue is hard, because it's not obvious to me that there's a super easy solution. Perhaps one way is just to add a script that pulls Turing's master branch into the test folder and runs the build from there, though this may have some unintended consequences and build issues.
from dynamicppl.jl.
First of all, I don't think we should strive for a small number of lines of code just for the sake of it. IMO the ultimate goal is a reasonable balance of readability/clarity, simplicity, composability, functionality, and performance (maybe I forgot to list something here, but at least I think all these aspects are important). Moreover, the current implementation seems to work reasonably well, so every change should be evaluated carefully, and I think there's no need to rewrite everything completely. Maybe it would be even better to add alternative approaches and ideas as separate `AbstractVarInfo` implementations instead of replacing `VarInfo`.
That being said, probably you know that I like to throw around ideas that might not be completely useful or reasonable in the end, so here they are
IMO a major problem is that there is no clearly defined API right now (as discussed previously), which makes it difficult to actually come up with alternative `AbstractVarInfo` implementations (and, e.g., it's unclear if `ThreadedVarInfo` actually implements all required methods or maybe even too many). I think probably we should try to address this issue before adding any alternative implementations or rewrites of `VarInfo`.
To me it seems, with the current version of `VarInfo` in mind, an `AbstractVarInfo` object should at least support:
- `getindex(varinfo, variables...)` and `setindex!(varinfo, value, variables...)` for getting and setting values of variables
- `copyto!(varinfo, vector)`, `copyto!(vector, varinfo)`, and probably `copyto!(varinfo, variables, vector)` and `copyto!(vector, varinfo, variables)` for copying to/from vectors of variables
- `getlogp`, `setlogp!`, `resetlogp!`, and `acclogp!` for handling log probabilities
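A minimal sketch of this contract, assuming a toy `Dict`-backed implementation (`ToyVarInfo` and the method bodies are illustrative stand-ins, not DynamicPPL's actual internals):

```julia
abstract type AbstractVarInfo end

mutable struct ToyVarInfo <: AbstractVarInfo
    vals::Dict{Symbol,Float64}
    logp::Float64
end

# Value access by variable.
Base.getindex(vi::ToyVarInfo, var::Symbol) = vi.vals[var]
Base.setindex!(vi::ToyVarInfo, value, var::Symbol) = (vi.vals[var] = value; vi)

# Log-probability handling.
getlogp(vi::ToyVarInfo) = vi.logp
setlogp!(vi::ToyVarInfo, lp) = (vi.logp = lp; vi)
resetlogp!(vi::ToyVarInfo) = setlogp!(vi, 0.0)
acclogp!(vi::ToyVarInfo, lp) = setlogp!(vi, getlogp(vi) + lp)

# Copy values for the given variables from a flat vector into the varinfo.
function Base.copyto!(vi::ToyVarInfo, vars::Vector{Symbol}, v::AbstractVector)
    for (i, var) in enumerate(vars)
        vi[var] = v[i]
    end
    return vi
end

vi = ToyVarInfo(Dict(:m => 0.0), 0.0)
acclogp!(vi, -1.5)
copyto!(vi, [:m], [2.0])
```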
Additionally, as @cpfiffer mentioned above, the API of DynamicPPL should also provide some convenience functions for, e.g., retrieving the values (in the original shape!) as a NamedTuple or retrieving the names of the variables. Ideally DynamicPPL should also provide the functionality of `Turing._params_to_array` and its components in an efficient implementation.
As mentioned above, currently it is quite confusing to see what the difference actually is between `getval` and `getindex`, and between `setval!` and `setindex!`. I think some of the confusion arises from the fact that currently some methods expect vectors whereas others don't, and some expect transformed data whereas others don't. I think both aspects should be separated. A method that updates values of variables from vectors should just do that (IMO `copyto!` would be consistent with, e.g., updating parameters (of whatever shape) in Flux or Zygote from vectors), and not impose additional assumptions about transformations. If multiple versions (for transformed and untransformed data) are needed, one might want to add a positional or keyword argument that specifies in which space the data is. That leads me to the following point.
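The `copyto!` semantics suggested here can be sketched as follows: fill arbitrarily shaped parameter arrays from a flat vector, preserving each array's shape and imposing no transformation. `copy_from_vector!` is an illustrative name, not an existing API:

```julia
# Hypothetical sketch: update parameters of arbitrary shape from a flat
# vector, similar in spirit to restructuring flat parameter vectors in
# Flux/Zygote. No assumptions about transformed vs. untransformed values.
function copy_from_vector!(params::Vector{<:AbstractArray}, v::AbstractVector)
    offset = 0
    for p in params
        n = length(p)
        # copyto! preserves the destination's shape; only values change.
        copyto!(p, 1, v, offset + 1, n)
        offset += n
    end
    @assert offset == length(v)  # the flat vector must match exactly
    return params
end

a = zeros(2, 2)
sigma = zeros(1)
copy_from_vector!([a, sigma], [1.0, 2.0, 3.0, 4.0, 0.5])
# a is filled column-major: [1.0 3.0; 2.0 4.0]; sigma == [0.5]
```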
@devmotion has brought up more than once his preference to design the `VarInfo` data structure around unvectorized values. I understand the appeal of that, but I don't think it will necessarily simplify the code by a lot. We still need to keep track of distributions, sampler gids and varnames. We still need to get and set a vectorized form of the values for HMC samplers. We still need to cater for variables disappearing and popping up at any time. We still need to specialize the type of `VarInfo` to cater for mixed variable types and automatic differentiation in a type-stable way. All of this would still need to be done. Will it be simpler? Maybe, but I doubt it will be much simpler.
The motivation for not storing samples in a linearized form is not simplicity. Maybe the code will become more complicated, maybe it will be less complicated, I don't know. The main motivation is that linearization destroys information (such as shape) that is present in the samples and currently has to be reconstructed carefully and with a lot of heuristics. Moreover, it requires promotion of different types, which is, e.g., weird in the case of both discrete and continuous variables. Samplers such as HMC that operate on vectors can use a method such as `copyto!` to update samples of whatever shape, so IMO there is no need to organize the data in vectorized form. In fact, for other samplers such as MH or ESS the current linearization is very inconvenient.
That leads me to another point.
For example, we need to be able to update the gids of a variable to assign a sampler to it. This is because new variables can pop up in dynamic models and they need to be assigned to HMC for example in the next HMC call.
To me it seems some of the design of AbstractVarInfo originates from having specific samplers in mind. I think ideally AbstractVarInfo should be sampler agnostic. So quite drastically, if gids are needed by the sampler in the next step, it should be part of the state of the sampler (and by that I mean it should be part of the transition - as discussed in AbstractMCMC, I don't think samplers should have a mutable state since that leads to abstract fields and weird dummy initializations e.g. in the HMC sampler). So I think ideally we should not actually handle gids.
Along the same lines, I'm not sure if the concept of transformed and untransformed samples should be handled at the level of AbstractVarInfo. A sampler operates in transformed or untransformed space, and hence it should know best how the values should be handled. At the beginning of sampling (and in every Gibbs step), it would just make sure that the variables are in the correct space, and at the end of sampling (and the end of every Gibbs step) it would make sure that they are in the original space. `logpdf` evaluation and `rand` calls could be sampler-dependent, to know if we operate in transformed or untransformed space and hence either use an appropriate bijector or not.
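The sampler-side handling described above can be sketched like this. In the Turing ecosystem the transform would come from Bijectors.jl; to stay self-contained, a plain log/exp pair stands in for the bijector here, and all names (`link`, `invlink`, `logabsdetjac_inv`, `unconstrained_logdensity`) are illustrative:

```julia
# The sampler, not the varinfo, decides which space it works in.
link(x) = log(x)        # constrained (x > 0) -> unconstrained
invlink(y) = exp(y)     # unconstrained -> constrained
# log |d invlink / dy| = y for the exp transform
logabsdetjac_inv(y) = y

# Evaluate the target density in unconstrained space, including the
# Jacobian correction, as an HMC-like sampler would.
function unconstrained_logdensity(logdensity_constrained, y)
    x = invlink(y)
    return logdensity_constrained(x) + logabsdetjac_inv(y)
end

# E.g. an Exponential(1) log-density on x > 0 (up to a constant):
target(x) = -x
y = link(2.0)                           # move to unconstrained space at step start
lp = unconstrained_logdensity(target, y)
x = invlink(y)                          # back to original space at step end
# x ≈ 2.0, lp == -2.0 + log(2.0)
```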
Regarding the test structure: If the interface should be tested with specific samplers, we could have a simple internal implementation of these samplers, similar to https://github.com/SciML/DiffEqBase.jl/blob/master/src/internal_euler.jl, in addition to downstream tests. Downstream tests could be handled similarly to https://github.com/SciML/DiffEqBase.jl/blob/139e5ce266fa41ffd81508ba1c632e88b0118e58/test/runtests.jl#L32-L36 by including them in a separate environment to which one could temporarily add a Manifest.toml file for tests against a specific branch or version of, e.g., Turing.
from dynamicppl.jl.
There are a lot of good points here. I'll add some of my thoughts too.
I think the lack of a clear set of compulsory APIs (as mentioned above) for `AbstractVarInfo` is becoming a major issue for extensibility and readability. The lack of clear APIs is an important source of perceived complexity. I think a lot of `VarInfo`'s complexity can be alleviated by introducing a clearer type hierarchy, for example (these types are just off the top of my head, without optimisation):

- `AbstractVarInfo`
  - `UntypedVarInfo`
  - `ThreadedVarInfo`
  - `AbstractTypedVarInfo`
    - `UnconstrainedTypedVarInfo`
    - `ConstrainedTypedVarInfo`
  - `CompositionalVarInfo`
We can have a separate file `abstract_varinfo.jl` for defining `AbstractVarInfo` and common APIs. (I support adding a new file!)
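As Julia declarations, the hierarchy proposed above would look something like this (names taken from the comment; the concrete structs are empty placeholders for illustration):

```julia
# Sketch of the proposed AbstractVarInfo type hierarchy.
abstract type AbstractVarInfo end

struct UntypedVarInfo <: AbstractVarInfo end
struct ThreadedVarInfo <: AbstractVarInfo end

abstract type AbstractTypedVarInfo <: AbstractVarInfo end
struct UnconstrainedTypedVarInfo <: AbstractTypedVarInfo end
struct ConstrainedTypedVarInfo <: AbstractTypedVarInfo end

struct CompositionalVarInfo <: AbstractVarInfo end
```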
Some of the existing complexity around constrained-space and unconstrained-space parameters can be distinguished by introducing the additional types `UnconstrainedTypedVarInfo` and `ConstrainedTypedVarInfo`. As @devmotion mentioned, the sampler should be responsible for handling conversion between `UnconstrainedTypedVarInfo` and `ConstrainedTypedVarInfo` at the beginning and end of the sampling process.
Another helpful feature that we currently lack is some granularity control for storing variables in `VarInfo`s. For example, I might want to flatten a set of variables for certain convenience goals, but still want to keep other variables in their original shape (maybe at some loss of performance). Another example is that I might want to store some variables in the constrained space, while others are stored in the unconstrained space. In short, we need a mechanism for combining various `VarInfo`s into a `CompositionalVarInfo` object. Such additional `VarInfo` compositionality, together with a set of conversion methods between primitive `VarInfo`s, would be hugely helpful for improving extensibility, modularity and ultimately clarity, I think.
As for some of the internal functions, like `getval`/`setval!` and `getindex`/`setindex!`, I think once we have a distinction between various `VarInfo` types, we can consider unifying these functions into `getindex` and `setindex!`. For example, calling `getindex(::UnconstrainedTypedVarInfo)` should always return parameters in the unconstrained space, and similarly for `setindex!`. Furthermore, calling `getindex(::AbstractNonlinearVarInfo)` should return variables in their original shape.
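The unification could work purely through dispatch on the varinfo type, as in this sketch. The types are minimal stand-ins, both storing constrained values internally, with a log transform standing in for the bijector; none of this is existing DynamicPPL code:

```julia
# Sketch: the varinfo type alone determines which space getindex returns
# values in, replacing the getval/getindex split.
abstract type AbstractTypedVarInfo end

struct ConstrainedTypedVarInfo <: AbstractTypedVarInfo
    vals::Dict{Symbol,Float64}   # values in the original (constrained) space
end

struct UnconstrainedTypedVarInfo <: AbstractTypedVarInfo
    vals::Dict{Symbol,Float64}   # also stored constrained, for illustration
end

# Constrained varinfo returns values as stored:
Base.getindex(vi::ConstrainedTypedVarInfo, var::Symbol) = vi.vals[var]
# Unconstrained varinfo always returns values in unconstrained space
# (here via a log transform applied on access):
Base.getindex(vi::UnconstrainedTypedVarInfo, var::Symbol) = log(vi.vals[var])

c = ConstrainedTypedVarInfo(Dict(:sigma => 2.0))
u = UnconstrainedTypedVarInfo(Dict(:sigma => 2.0))
c[:sigma]  # 2.0
u[:sigma]  # log(2.0)
```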
It is possible that we can leverage parametric types to reduce the total number of `AbstractVarInfo` subtypes, but the general principle remains the same, I think. Also, I think we can achieve this refactoring gradually, with each PR containing a small, well-defined set of changes. A good starting point might be `abstract_varinfo.jl`.
from dynamicppl.jl.
So the different `VarInfo` types idea is similar to the `mode` field I introduced in #115. In that PR, I defined `LinkMode` and `StandardMode` structs which change the behavior of the `VarInfo`. Fully separating the `VarInfo` type into 2 types, unconstrained and constrained, sounds reasonable too. I can experiment with that after NeurIPS.
To accommodate the unvectorized representation, we may use a dictionary in the `UnvectorizedVarInfo` type or a `NamedTuple` of dictionaries. Essentially, the current `VarInfo` data structure is mostly just a multi-value dictionary that tries to keep the values contiguous in memory. Perhaps we can have a package for that data structure that makes it easy to construct various `VarInfo` types around a `MultiValueDict` type. We can have `ContiguousMultiDict`, a `CompositionalMultiDict` that's a combination of `AbstractMultiDict`s, etc. The different `VarInfo`s can then be thin wrappers around `MultiDict`.
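The "multi-value dictionary that keeps values contiguous in memory" idea can be sketched as a dict of ranges into one flat storage vector. `ContiguousMultiDict` here is a toy, not an existing package:

```julia
# Each key maps to a range into a single contiguous storage vector.
struct ContiguousMultiDict{K,V}
    ranges::Dict{K,UnitRange{Int}}
    storage::Vector{V}
end

ContiguousMultiDict{K,V}() where {K,V} =
    ContiguousMultiDict(Dict{K,UnitRange{Int}}(), V[])

# Append the values for a new key at the end of the storage.
function Base.push!(d::ContiguousMultiDict{K,V}, key::K, vals::Vector{V}) where {K,V}
    start = length(d.storage) + 1
    append!(d.storage, vals)
    d.ranges[key] = start:length(d.storage)
    return d
end

# Return a contiguous view of the values for `key` (no copy).
Base.getindex(d::ContiguousMultiDict, key) = view(d.storage, d.ranges[key])

d = ContiguousMultiDict{Symbol,Float64}()
push!(d, :a, [1.0, 2.0, 3.0])
push!(d, :b, [4.0])
d[:a]  # 3-element view into the flat storage: [1.0, 2.0, 3.0]
```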
We can also define a `specialize_types` function that specializes the types of the keys and values, while creating a fallback generic dictionary for taking in previously unseen types. I think there is a lot of potential there in terms of separation of concerns. This data structure could then become usable by other PPLs if they want, by just customizing it for their own key types (e.g. using a Gen address instead of a Turing `VarName`) or different value types.
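The `specialize_types` idea could look roughly like this: narrow an untyped store to the concrete key/value types seen so far, keeping a generic fallback dict for types that appear later. All names and the layout are illustrative assumptions:

```julia
# A concretely typed dict for the types seen so far, plus a generic
# fallback for previously unseen key/value types.
struct SpecializedStore{K,V}
    typed::Dict{K,V}
    fallback::Dict{Any,Any}
end

function specialize_types(store::Dict{Any,Any})
    # Compute the narrowest common key and value types of the current entries.
    K = typejoin(map(typeof, collect(keys(store)))...)
    V = typejoin(map(typeof, collect(values(store)))...)
    typed = Dict{K,V}(store)
    return SpecializedStore{K,V}(typed, Dict{Any,Any}())
end

store = Dict{Any,Any}(:a => 1.0, :b => 2.0)
s = specialize_types(store)
# s.typed is a Dict{Symbol,Float64}; entries of new types would go to s.fallback
```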
For now, I will try to fix dynamic model support and stochastic control flow in branches. Then I will come back to refactoring VarInfo after NeurIPS.
from dynamicppl.jl.
@devmotion getting rid of gids from `VarInfo` sounds interesting, but this needs more thinking. Currently we know which variable belongs to which sampler using the gids, but we also know it from the space of the `Sampler`, so this might be an actual redundancy. I assume this redundancy was introduced to enable one variable symbol to be sampled by two different samplers in a Gibbs context, but currently I don't think we do this. Even if that's needed, it might make more sense to store which `VarName`s each sampler samples in the `Sampler` struct instead of the `VarInfo` struct.
from dynamicppl.jl.
I tried to summarize the relevant ideas and arguments from here in the AbstractPPL.jl discussion, but I dare not close this issue yet, as the information content is pretty dense. Would everyone involved please have another look to make sure nothing important is lost in the transfer over there?
from dynamicppl.jl.
It's probably ok to keep this open and revisit after we finish the refactoring of the `VarInfo` APIs.
from dynamicppl.jl.
Closed in favour of AbstractPPL.jl discussion, #68 and #309
from dynamicppl.jl.