alexhernandezgarcia / gflownet

Generative Flow Networks - GFlowNet

Home Page: https://gflownet.readthedocs.io/en/latest/

License: Apache License 2.0

Python 99.59% Shell 0.41%

aiforsocialgood generativemodeling gflownet opensource python pytorch aiforscience


gflownet's Issues

Adjust mask of ctorus

I get that this mask is in a way fake, but for consistency wouldn't it make more sense to set the first entry of the mask (corresponding to the generic action) to True and the second (corresponding to EOS) to False if the number of actions is equal to the trajectory length, and the other way around if any continuous action (except EOS) is valid?

Originally posted by @michalkoziarski in #193 (comment)

FL and DB loss have issues with storing checkpoints

Need a way to restrict space groups in spacegroup.py

Need to pass a list of space group labels or numbers to spacegroup.py and have it restrict generation to only these groups.

I don't want to break it or make a mess. I assume there's an elegant way to do it, maybe in get_mask_invalid_actions_forward(), but I haven't been able to run it and figure out the logic yet.
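A minimal sketch of one way such a restriction could be expressed, assuming the mask is a list of booleans in which True marks an invalid action and that space groups are numbered 1 to 230. The function name `mask_space_groups` is hypothetical and not part of spacegroup.py:

```python
from typing import Iterable, List

N_SPACE_GROUPS = 230  # number of crystallographic space groups in 3D


def mask_space_groups(allowed: Iterable[int]) -> List[bool]:
    """Return a mask over space groups 1..230 where True means invalid."""
    allowed_set = set(allowed)
    unknown = [sg for sg in allowed_set if not 1 <= sg <= N_SPACE_GROUPS]
    if unknown:
        raise ValueError(f"Invalid space group numbers: {sorted(unknown)}")
    # Every space group that is not explicitly allowed is masked out.
    return [sg not in allowed_set for sg in range(1, N_SPACE_GROUPS + 1)]


mask = mask_space_groups([1, 2, 225])
```

Such a mask could then be combined (logical OR) with whatever mask the environment already computes, so that the existing logic stays untouched.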

Example Scripts for "Towards equilibrium molecular conformation generation with GFlowNets"?

Thanks for the lovely work! Are there any example scripts for generating conformational ensembles for proteins, similar to what is done in the paper "Towards equilibrium molecular conformation generation with GFlowNets", which uses this repo? This would be very helpful to have. Perhaps there is another GitHub repo that does this?

Improve efficiency of searching in dicts, lists and tuples

https://www.geeksforgeeks.org/python-memory-consumption-dictionary-vs-list-of-tuples/

Write unit tests for GFlowNetAgent methods

Better handling of exceptions in the type of training buffer

#155 (comment)

Do something to detect when the ids of the envs to sample are not unique.

min and norm properties of proxies can lead to silent errors

So if the caller of that property (accidentally) does anything in place on the returned tensor, that change will affect everyone else calling min in the future?
If so, it feels like it might lead to bugs that would be really difficult to track.
The same comment applies to the other proxies that implement the same strategy: torus (for min and norm) and uniform (for min).

Originally posted by @carriepl in #204 (comment)
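The concern can be illustrated with a small, self-contained sketch. The class `CachedMinProxy` and both property names are hypothetical, and a plain list stands in for the tensor; with a tensor, the copy would typically be a clone:

```python
import copy


class CachedMinProxy:
    """Hypothetical proxy that caches its minimum in a mutable container."""

    def __init__(self):
        self._min = [-1.0]  # a list stands in for a tensor here

    @property
    def min_shared(self):
        # Returns the cached object itself: callers can mutate it.
        return self._min

    @property
    def min_copy(self):
        # Returns an independent copy, so the cache cannot be altered.
        return copy.copy(self._min)


proxy = CachedMinProxy()
leaked = proxy.min_shared
leaked[0] = 0.0  # in-place change silently alters the cached value
shared_after = proxy.min_shared[0]

proxy = CachedMinProxy()
safe = proxy.min_copy
safe[0] = 0.0  # only the caller's copy changes
copy_after = proxy.min_copy[0]
```

Returning a copy costs an allocation per access, which is the trade-off against the risk of silent corruption.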

Add test `Batch._compute_rewards_source`

          Since this is a new method in the Batch class, I think it would be a good idea to add the corresponding tests in the batch tests: https://github.com/alexhernandezgarcia/gflownet/blob/new_fl_loss/tests/gflownet/utils/test_batch.py

Originally posted by @alexhernandezgarcia in #253 (comment)

Remove `step_backwards` from tree env

See ece416a#commitcomment-125988565

The proxy should not be copied into each environment instance

Currently, the proxy is set as an attribute of the environments and the base environment implements the methods proxy2reward() and reward2proxy() that determine the conversion between proxy outputs and reward. The environment also implements the methods reward() and reward_batch(), which call the proxy and the conversion methods. This is probably not ideal for various reasons.

I no longer see a good reason to keep the proxy and these methods within the environment. It seems both possible and a good idea to completely detach the environment from the proxy. Some proxies need information from the environment, which is currently set via the call to Env.setup_proxy(), which in turn calls the proxy's setup() method. But this could just be done elsewhere.

Now, in terms of alternatives, I am not completely settled on what the best option would be. In particular, where should the methods that convert between proxy and reward go?

In the (base) proxy?
In the GFlowNet agent?

Batch of actions doesn't need to be converted to tensors

Currently, the actions are converted into tensors in the Batch. It seems that it is not necessary and it leads to inefficiency, for example in the computation of log probabilities.

About `GFlowNetEnv.top_k_metrics_and_plots()`

This may go rather in the logger class.

Flexible Policy Definition

Policies were originally MLPs.
Now we need to be able to use arbitrary function approximators.
This will be a library-level change that will affect all projects.

`gflownet/gflownet.py` - should not store `self.env` as well as `self.env_maker`

Currently, self.env is stored in the GFlowNet class because other classes expect an env instance (to access various methods / attributes).

Since the gflownet now stores a class factory rather than a class instance, we should figure out another way to communicate with these other classes (instead of storing an env instance).

Add test `Batch.get_rewards_source`

          Since this is a new method in the Batch class, I think it would be a good idea to add the corresponding tests in the batch tests: https://github.com/alexhernandezgarcia/gflownet/blob/new_fl_loss/tests/gflownet/utils/test_batch.py

Originally posted by @alexhernandezgarcia in #253 (comment)

Integrate Crystal and CCrystal

Both environments seem to share quite a bit of functionality. It would be good to refactor it so it's not copy-pasted between them.

test issue

task one
task two

Branch Cleanup

We should clean out all the old / dead branches.

There are like a billion.

SVP :)

Decide on a format for conditional modelling.

Unless I'm missing it, I don't see a method for conditional generation anywhere. For my purposes it would be:

load conditions in batches from a dataloader
assign each env to a condition
encode the condition (via a graph model; the condition in this case is a molecule and the condition encoding is a vector)
concatenate the condition encoding to the GFN policy input
train as normal
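The assignment, encoding and concatenation steps above can be sketched roughly as follows. Everything here is hypothetical: `encode_condition` is a placeholder for the graph model, states and conditions are plain lists of floats, and no part of this reflects an existing gflownet API:

```python
from typing import Callable, List, Sequence

Vector = List[float]


def encode_condition(condition: Sequence[float]) -> Vector:
    """Placeholder for the conditioning model (e.g. a graph network)."""
    return [sum(condition) / len(condition), max(condition)]


def policy_inputs(
    states: Sequence[Vector],
    conditions: Sequence[Sequence[float]],
    encoder: Callable[[Sequence[float]], Vector] = encode_condition,
) -> List[Vector]:
    """Pair env i with condition i and append the encoding to its state."""
    if len(states) != len(conditions):
        raise ValueError("Need exactly one condition per environment")
    return [list(s) + encoder(c) for s, c in zip(states, conditions)]


batch = policy_inputs(
    states=[[0.0, 1.0], [2.0, 3.0]],
    conditions=[[1.0, 3.0], [4.0, 4.0]],
)
```

In evaluation mode, the encodings could be computed once per condition and reused at every generation step instead of calling the encoder inside the loop.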

If possible it would also be ideal if the conditioning model could be updated during training along with the policy model, though I could probably find a way to pretrain one which is at least 'ok' if necessary. I haven't read deep enough into the conditional gflownet work to know what's optimal here. In my case, the distribution of high-scoring samples is both very sharp and extremely sensitive to the conditions.

In evaluation mode, for speed, we could call the conditioning model once and use the same encoding at all generation steps. For training, particularly if the conditioning model is being updated, probably fine to call it with the policy at every action.

I have started playing with this locally but don't want to conflict with any planned format.

Make test state count N configurable

Any time we use N in a test, it should come from a test-specific key in a dict defined in the superclass.

Compute and log variances of the log probs

One more thing: I'd add computing and tracking two variances of the log probs:

variance over samples of logprobs_estimates (to understand better the behaviour of the correlation coefficient over the training)
median over samples of the variances of the logprobs_estimates over trajectories for each sample (to get a sense of how noisy the estimation is). The math is a bit tricky here, as we use the log of the mean as the estimate, not just the mean. But there are some workarounds: https://stats.stackexchange.com/questions/418313/variance-of-x-and-variance-of-logx-how-to-relate-them
But in any case, we will need to compute empirical var(P_F(tau) / P_B (tau)) / n_traj for each sample and then play around a bit with it to get variance for the log mean estimation.

Originally posted by @AlexandraVolokhova in #167 (comment)
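One standard workaround is the first-order delta method, which gives Var(log(mean)) ≈ Var(mean) / mean². The sketch below applies it to a list of per-trajectory ratios P_F(tau) / P_B(tau) for a single sample. The function name `var_log_mean` is hypothetical, and the approximation is only reasonable when the variance of the mean is small relative to the squared mean:

```python
import statistics
from typing import Sequence


def var_log_mean(ratios: Sequence[float]) -> float:
    """Delta-method approximation of Var(log(mean(ratios)))."""
    n = len(ratios)
    if n < 2:
        raise ValueError("Need at least two trajectories")
    mean = statistics.fmean(ratios)
    # Empirical var(P_F / P_B) / n_traj: the variance of the sample mean.
    var_of_mean = statistics.variance(ratios) / n
    # First-order Taylor expansion of log around the mean.
    return var_of_mean / mean**2


ratios = [1.0, 2.0, 3.0]
approx = var_log_mean(ratios)
```

The median of this quantity across samples would then give the second statistic described above.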

About `GFlowNetEnv.compute_train_energy_proxy_and_rewards()`

Rethink whether this is the best way of computing train energies and rewards. This may not be the best location.

Also rethink Proxy.infer_on_train_set()

Be more Pythonic when checking boolean is True and is False
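As an illustration of what this could mean in practice (the function names are hypothetical), compare an explicit identity check with relying on the truth value. Note that the two are not equivalent for non-boolean inputs, so each replacement needs a quick check of the variable's type:

```python
def is_done_verbose(done) -> bool:
    # Identity comparison with the singleton: redundant for real booleans.
    if done is True:
        return True
    return False


def is_done(done) -> bool:
    # Idiomatic: rely on the truth value directly.
    return bool(done)


same_for_bools = is_done_verbose(True) == is_done(True)
# The integer 1 is truthy but is not the object True.
differs_for_ints = is_done_verbose(1) != is_done(1)
```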

Refactor common environment tests

The way the common tests are implemented has several issues. Importantly, the repetition decorators are ignored.

See this comment: https://github.com/alexhernandezgarcia/gflownet/pull/204/files#r1339115115

Check uses of copy()

Might be a good idea to rename the copy() method in the common utils as copy_state() and move it to the base environment.

replay.pkl

Small detail in gflownet.py: enumerate returns an iterator without a length, so tqdm(enumerate(...)) does not print a full-width progress bar because tqdm is not aware of the length of enumerate(...). Simple fix: enumerate(tqdm(...)) 😄
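The underlying reason can be checked without tqdm installed; the list `items` is just a placeholder for whatever is being iterated:

```python
items = ["a", "b", "c"]

# enumerate() returns an iterator with no __len__, so a progress bar
# wrapped around it cannot know the total number of steps.
enumerate_has_len = hasattr(enumerate(items), "__len__")
list_has_len = hasattr(items, "__len__")

# With tqdm installed, either of these would give tqdm the total:
#   for i, item in enumerate(tqdm(items)): ...
#   for i, item in tqdm(enumerate(items), total=len(items)): ...
```

The second form, which passes `total` explicitly, is useful when the iterable being wrapped has no length of its own.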

Issue

Not sure you want to fix this in this PR, but testing a gflownet creates files (like replay.pkl) in the current working directory and pollutes it, which is particularly dangerous when the directory is tracked by git.

Create training README

Batch size:

forward: number of forward trajectories to include in the training batch. These are on-policy trajectories possibly with random actions (if random_action_prob > 0) or with a tempered policy if temperature < 1.0
train: number of backward trajectories to include in the training batch, sampled (backwards) from data points in a "training set"
replay: number of backward trajectories to include in the training batch, sampled (backwards) from data points in the replay buffer.

The total number of trajectories in the training batch is the sum of the above.
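As a small sketch of this bookkeeping (the class `BatchSize` is hypothetical and only mirrors the three keys described above):

```python
from dataclasses import dataclass


@dataclass
class BatchSize:
    forward: int = 0  # on-policy forward trajectories
    train: int = 0    # backward trajectories from the training set
    replay: int = 0   # backward trajectories from the replay buffer

    @property
    def total(self) -> int:
        # Total number of trajectories in the training batch.
        return self.forward + self.train + self.replay


sizes = BatchSize(forward=10, train=5, replay=5)
```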

state2proxy

state2proxy can be either state2oracle or state2obs. Input arg of state2oracle is a list of states, and that for state2obs is a single state. To use both of them interchangeably as state2proxy, the input form should be the same.
Need to add support for transformer-friendly data transformation. This would be as simple as changing the data type of the state to int, or applying no transformation at all. (In the latter case, the transformation to int can be done within the forward call of the transformer itself.) state2oracle cannot be used as the transformer-friendly transformation because:
a. for example, for the grid, if we use state2proxy = state2oracle, the input states to the transformer would be [-1, -1] (oracle-friendly) instead of [0, 0], and the embedding for negative indices is not defined,
b. the oracle does not necessarily take indices as input (which the embedding layer requires).

What happens with "stuck trajectories" in `env.trajectory_random()`?

Delete unused branches

We should clear out the outstanding PRs / old branches to make it easier for us to organize the work.

Crystal-GFN Fix lattice parameters issue

Talk with Pierre Luc

Constraints:

α + β + γ >= 360
α + β - γ <= 0
α + γ - β <= 0
β + γ - α <= 0
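A minimal sketch of a validity check, assuming the four inequalities above describe the angle combinations to exclude (this reading is an assumption, though it is consistent with the usual condition for a unit cell with non-zero volume). Angles are in degrees and the function name is hypothetical:

```python
def angles_are_valid(alpha: float, beta: float, gamma: float) -> bool:
    """Check lattice angles (in degrees) against the four inequalities."""
    violations = (
        alpha + beta + gamma >= 360,
        alpha + beta - gamma <= 0,
        alpha + gamma - beta <= 0,
        beta + gamma - alpha <= 0,
    )
    # The angles are accepted only if none of the inequalities holds.
    return not any(violations)


cubic_ok = angles_are_valid(90, 90, 90)
flat_ok = angles_are_valid(120, 120, 120)   # the angles sum to exactly 360
skewed_ok = angles_are_valid(30, 30, 90)    # 30 + 30 - 90 <= 0
```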

get_rewards vs. get_terminating_rewards in Batch

It may be possible to get rid of get_rewards() altogether since it is probably used by the flow matching loss only and it does not need to get the 0s of the non-terminating states.

Make separate base envs for discrete and continuous environments

On expression of variance and mean of Beta distribution being only valid under certain conditions.

So I didn't get into too much detail on why this is the case, but the Wikipedia page on the beta distribution states that these expressions hold if `variance < mean * (1 - mean)`. Assuming that's true, perhaps add an assert to validate this here?

Originally posted by @michalkoziarski in #214 (comment)
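A minimal sketch of such a check, using the standard method-of-moments relation between the mean, the variance and the two shape parameters of a beta distribution. The function name is hypothetical, and it raises a ValueError rather than using assert so that the check survives running Python with optimizations enabled:

```python
from typing import Tuple


def beta_params_from_moments(mean: float, variance: float) -> Tuple[float, float]:
    """Return (alpha, beta) of the beta distribution with the given moments."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    if not 0.0 < variance < mean * (1.0 - mean):
        raise ValueError("variance must satisfy 0 < variance < mean * (1 - mean)")
    nu = mean * (1.0 - mean) / variance - 1.0  # equals alpha + beta
    return mean * nu, (1.0 - mean) * nu


alpha, beta = beta_params_from_moments(mean=0.5, variance=0.05)
```

When the condition fails, `nu` would be zero or negative and the resulting shape parameters would not define a valid distribution, which is why the check comes first.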

sample_batch args in `test_top_k` in gflownet.py

I just cloned a fresh copy of the repo in WSL and tried a run with all default configs, and it looks like the sample_batch on line 906 in gflownet.py is getting the wrong arguments (at least, it crashes every time). Looks like the parent function test_top_k was added August 1.

call:

for b in batch_with_rest(0, self.logger.test.n_top_k, self.batch_size.forward):
    gfn_states += self.sample_batch(
        self.env, len(b), train=False, progress=progress
    )[0]

function definition:

def sample_batch(
    self,
    n_forward: int = 0,
    n_train: int = 0,
    n_replay: int = 0,
    train=True,
    progress=False,
):
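To see how the arguments of the call line up with the definition, the sketch below binds them with inspect.signature. The function body is omitted, and `agent` and `env` are placeholder objects standing in for the agent and self.env:

```python
import inspect


def sample_batch(
    self,
    n_forward: int = 0,
    n_train: int = 0,
    n_replay: int = 0,
    train=True,
    progress=False,
):
    """Stand-in with the signature quoted above; the body is omitted."""


agent, env = object(), object()  # placeholders for the agent and self.env

# Mirrors the call: self.sample_batch(self.env, len(b), train=False, ...)
bound = inspect.signature(sample_batch).bind(
    agent, env, 3, train=False, progress=False
)
bound_args = dict(bound.arguments)
```

With this binding the environment ends up in n_forward and the batch length in n_train, which would explain the crash. A call such as sample_batch(n_forward=len(b), train=False, progress=progress) may be what was intended, though that is an assumption.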