lightly-ai / lightly
A Python library for self-supervised learning on images.
Home Page: https://docs.lightly.ai/self-supervised-learning/
License: MIT License
This is required for #103.
We want to add a tutorial which highlights how a user can use custom augmentations to do self-supervised learning with lightly.
After sampling, we get a tag back. Translate the tag into a list of filenames.
Got the following error while uploading embeddings:
lightly-upload token='' dataset_id='' embeddings='/home/usr/embeddings.csv'
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/effd/lib/python3.7/site-packages/lightly/cli/upload_cli.py", line 107, in upload_cli
_upload_cli(cfg)
File "/home/ubuntu/anaconda3/envs/effd/lib/python3.7/site-packages/lightly/cli/upload_cli.py", line 52, in _upload_cli
embedding_name=cfg['embedding_name']
File "/home/ubuntu/anaconda3/envs/effd/lib/python3.7/site-packages/lightly/api/upload.py", line 115, in upload_embeddings_from_csv
raise RuntimeError(msg)
RuntimeError: Forbidden upload to dataset with no existing tags.
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
Currently, the BaseCollateFunction concatenates the input images as follows:
# three concatenations
b0 = torch.cat([transform(img) for img in imgs], 0)
b1 = torch.cat([transform(img) for img in imgs], 0)
return torch.cat([b0, b1], 0)
This is inefficient because new memory has to be allocated for every concatenation operation (of which there are three).
An alternative with a single concatenation would be:
# single concatenation
b = [transform(imgs[i % bsz]) for i in range(2*bsz)]
return torch.cat(b, 0)
A short experiment for bsz=512 and image_height=128 shows the potential speed-up:
Required time (so far): 0.1442425012588501s
Required time (new): 0.09266984462738038s
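For reference, a minimal benchmark sketch of the two variants (the use of random tensors and RandomHorizontalFlip as the transform is an assumption, not the original experiment code):

import time
import torch
import torchvision.transforms as T

bsz, h = 512, 128
imgs = [torch.rand(1, 3, h, h) for _ in range(bsz)]  # stand-ins for real images
transform = T.RandomHorizontalFlip()

start = time.time()
b0 = torch.cat([transform(img) for img in imgs], 0)
b1 = torch.cat([transform(img) for img in imgs], 0)
batch = torch.cat([b0, b1], 0)  # three concatenations
print(f'Required time (so far): {time.time() - start}s')

start = time.time()
batch = torch.cat([transform(imgs[i % bsz]) for i in range(2 * bsz)], 0)  # one concatenation
print(f'Required time (new): {time.time() - start}s')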
When trying to use the lightly-magic CLI command without fine-tuning a model, the CLI fails because it tries to load a checkpoint during the embedding phase that doesn't exist.
Command to reproduce: lightly-magic input_dir=raw trainer.max_epochs=0
Error message:
Training: 0it [00:00, ?it/s]Saving latest checkpoint...
[2020-11-26 15:47:06,327][lightning][INFO] - Saving latest checkpoint...
Training: 0it [00:00, ?it/s]
Best model is stored at: /datasets/videos/lightly_outputs/2020-11-26/15-46-37/
Traceback (most recent call last):
File "/opt/conda/envs/lightly/lib/python3.7/site-packages/lightly/cli/lightly_cli.py", line 72, in lightly_cli
return _lightly_cli(cfg)
File "/opt/conda/envs/lightly/lib/python3.7/site-packages/lightly/cli/lightly_cli.py", line 28, in _lightly_cli
embeddings = _embed_cli(cfg, is_cli_call)
File "/opt/conda/envs/lightly/lib/python3.7/site-packages/lightly/cli/embed_cli.py", line 85, in _embed_cli
checkpoint, map_location=device
File "/opt/conda/envs/lightly/lib/python3.7/site-packages/torch/serialization.py", line 581, in load
with _open_file_like(f, 'rb') as opened_file:
File "/opt/conda/envs/lightly/lib/python3.7/site-packages/torch/serialization.py", line 230, in _open_file_like
return _open_file(name_or_buffer, mode)
File "/opt/conda/envs/lightly/lib/python3.7/site-packages/torch/serialization.py", line 211, in __init__
super(_open_file, self).__init__(open(name, mode))
IsADirectoryError: [Errno 21] Is a directory: '/datasets/videos/lightly_outputs/2020-11-26/15-46-37/'
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
Add hooks for validation at the end of each epoch (only informative if there are labels). Let's make sure it can be switched off and that the user can pass a validation set of their choice.
The following larger functions are currently untested:
General thoughts:
The config.yaml does not contain the latest collate parameters; e.g. vf_prob and hf_prob are missing.
Additionally, there is no information about using half-precision. This should be added.
When uploading images, my PC went to sleep for about a minute. After waking up, the upload stopped with the following error:
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='storage.googleapis.com', port=443): Max retries exceeded with url: /boris-platform-dev/google-oauth2%7C108619227381715356556/5ff6ff536580b3000accacaf/training/training/n7/n7039.jpg?X-Goog-Algorithm=GOOG4-RSA-SHA256&X-Goog-Credential=boris-platform-dev%40boris-250909.iam.gserviceaccount.com%2F20210107%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20210107T125248Z&X-Goog-Expires=3601&X-Goog-SignedHeaders=host&X-Goog-Signature=228f3730fa7f8e31d8540e0a0261d61b8232220a416e663f7f3bb556f253b19f258edeb32a00e482b78a901e4f2f97d5448ba643e00daffdec08f7fb815c93eb1335d556b7074d13974a21ba221224f656db3bed34506607f3e0b69b5e8c21ef71cefe79510d30ab5be186f9d98bd7fb9ffaec7bcce5c1014b8d6aaf096671cc196c35ccc0e2dbc7f34554a01fc778166f958a3552f52162c122f532b4e2857a6bf63f63ad8dc5c58acab304465fa86631ee3395dbcba17df617d99d21f183b950bc6f77741510d0437f9b31c188132c52fc72b8f782ed31f956ddd73ebeaf41d64f55aace58952e75d0067d83ebd5d76e84a1445db846f874ff512b7b970193 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x15780a400>: Failed to establish a new connection: [Errno 51] Network is unreachable'))
Let's add some high-level information about the structure of the package and explain how the different modules are connected. Maybe even add an illustration?
Currently, the ResNet implementation always adds a classification head, but for almost all of the self-supervised approaches you only care about the actual backbone (and so you strip away this linear layer).
I'd propose to make the num_classes argument optional and, if None, omit the final linear layer so you only get the actual backbone. I can open a pull request for this if you're open to it.
Also, it looks like the library's implementation is not totally consistent with the default PyTorch ones (some missing max pools, different kernel sizes). I think a lot of people are likely to use the default PyTorch models for a lot of things, which would make the ResNets trained using Lightly incompatible. Would there be an interest in switching to the default PyTorch implementations?
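A minimal sketch of the proposed change; the stand-in backbone below is only illustrative since the real ResNet internals are omitted here:

import torch.nn as nn

class ResNet(nn.Module):
    def __init__(self, num_classes=None):
        super().__init__()
        self.backbone = nn.Sequential(      # stands in for the actual conv layers
            nn.Conv2d(3, 512, kernel_size=3),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        # omit the final linear layer when num_classes is None
        self.fc = nn.Linear(512, num_classes) if num_classes is not None else nn.Identity()

    def forward(self, x):
        return self.fc(self.backbone(x))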
Using files on disk to store images is often inefficient due to the large number of files.
Providing a way to use a memory abstraction instead would be really helpful.
It could be https://www.tensorflow.org/datasets or https://pytorch.org/docs/stable/data.html
What do you think? Is this already possible but simply not documented?
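As a sketch of what such a memory abstraction could look like (the class below is hypothetical, not an existing lightly API):

import torch
from torch.utils.data import Dataset

class InMemoryImageDataset(Dataset):
    # holds all images as a single tensor instead of many files on disk
    def __init__(self, images: torch.Tensor, transform=None):
        self.images = images            # shape (N, C, H, W)
        self.transform = transform

    def __len__(self):
        return self.images.size(0)

    def __getitem__(self, i):
        img = self.images[i]
        return self.transform(img) if self.transform else img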
I found the one by https://github.com/swagger-api/swagger-codegen.git to work best, as the models it creates have exactly and only the parameters from the spec as arguments.
Add an implementation of the CoMatch framework.
With the current setup, the embedding module SelfSupervisedEmbedding automatically creates checkpoints when trained. This also causes problems if checkpoints already exist, basically making it hard to use on its own. I would propose to remove the checkpoint_callback code (lightly/lightly/embedding/_base.py, line 46 in d1b4711) from the embedding module.
I think that should be a rather small change but it could help with usability.
The SimCLR and MoCo architectures are currently implemented as standalone architectures.
Before adding new architectures like BYOL or SimSiam, we should probably make use of the existing code, e.g.
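One possibility would be to factor out shared building blocks such as the projection head, since SimCLR, MoCo, BYOL, and SimSiam all attach an MLP head to a backbone (a sketch; this class is an assumption, not existing library code):

import torch.nn as nn

class ProjectionHead(nn.Module):
    # shared MLP head reusable across the architectures above
    def __init__(self, in_dim=512, hidden_dim=512, out_dim=128):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, x):
        return self.layers(x)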
Currently, if one has created a LightlyDataset and wants to apply transforms at a later stage, we have to do it as follows:
dataset = data.LightlyDataset(root="./", name='CIFAR10', download=True)
# ... use the dataset to train a SimCLR model, for example ...
test_transforms = torchvision.transforms.ToTensor()
dataset.dataset.transform = test_transforms
We should extend the Dataset wrapper to directly support transforms.
This will save the user a line of code when trying to reproduce results and can make our example code leaner.
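Usage could then look like this (a sketch, assuming the wrapper accepts a transform keyword as proposed):

import torchvision

test_transforms = torchvision.transforms.ToTensor()
# proposed: pass the transform directly to the wrapper (hypothetical keyword)
dataset = data.LightlyDataset(root='./', name='CIFAR10', download=True,
                              transform=test_transforms)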
I suggest something like the following lines in lightly.data.collate.py
class SimCLRCollateFunction(ImageCollateFunction):
    """Description...
    """
    def __init__(self, input_size=32):
        super(SimCLRCollateFunction, self).__init__(
            input_size=input_size,
            # put all the SimCLR settings here
        )
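Usage would then be straightforward (a sketch, assuming a dataset has already been created):

from torch.utils.data import DataLoader

collate_fn = SimCLRCollateFunction(input_size=32)
dataloader = DataLoader(dataset, batch_size=256, collate_fn=collate_fn)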
The PIP package should provide a function to get the current tag from the API.
https://github.com/lightly-ai/lightly-core/issues/86
Let's add the loss from SwAV to the package such that it can be used with and without a memory bank.
We'll put it in a separate file at lightly/loss/swav.py.
Following closely our implementation of the NT-Xent loss, the backbone should look like this:
class SwAVLoss(MemoryBankModule):
    def __init__(self,
                 # whatever arguments we need
                 memory_bank_size: int = 3000):
        super(SwAVLoss, self).__init__(size=memory_bank_size)
        ...

    def forward(self, output: torch.Tensor, labels: torch.Tensor = None):
        output, negatives = super(SwAVLoss, self).forward(output)
        if negatives is None:
            # calculate loss from batch only
            ...
        else:
            # calculate loss from batch and negatives
            ...
        return loss
Currently, the collate function converts each image into a pair of transformed images before concatenating them to a batch. This has led to a lot of confusion. Ideally, the transform could be passed to the LightlyDataset constructor.
This would also have an impact on #68. Furthermore, a LightlyDataset should have a train and an inference mode to switch between augmentations for contrastive learning and for inferring image representations.
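A minimal sketch of such a mode switch; all names below are assumptions for illustration:

from torch.utils.data import Dataset

class ModalDataset(Dataset):
    # wraps a base dataset and switches transforms by mode
    def __init__(self, base, train_transform, inference_transform):
        self.base = base
        self.train_transform = train_transform
        self.inference_transform = inference_transform
        self.mode = 'train'

    def __len__(self):
        return len(self.base)

    def __getitem__(self, i):
        img, label = self.base[i]
        t = self.train_transform if self.mode == 'train' else self.inference_transform
        return t(img), label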
Pre-trained models from the model zoo can only be used from the CLI. It would be better to move them to the models module and add the option to load a pre-trained model.
Since there is no evidence that pre-trained models can hurt performance, I would use them by default.
Then we could create them using, e.g.:
model = lightly.models.ResNetMoCo(num_ftrs=128, pretrained=True)
To improve the speed of the upload process, don't open the images prior to uploading, since we are not using the extracted metadata anywhere.
As proposed by users u/lfotofilter and u/AiDreamer on Reddit.
Add an implementation of the SimSiam Representation Learning framework.
As described in https://github.com/lightly-ai/lightly-core/issues/123, the pip package needs a mapping between the following four representations of the samples belonging to a tag:
The mapping can be done using the following two lists downloaded from the API:
The following mapping functions are needed:
from 1. to 2.: implemented in the bitmask class
from 2. to 3. by simple indexing, using A)
from 2. to 4. by simple indexing, using B)
from 4. to 2. by reverse indexing using A)
from 4. to 3. by reverse indexing using B)
from 2. to 1.: implemented in the bitmask class
Furthermore, the following workflows have to be updated to use these mappings:
To make it easier for other contributors to work on lightly, it would be good to outline the structure of the PIP package a bit and the reasoning behind it. There were many internal discussions on how to derive a scalable architecture.
It should be possible to request a sampling from our API using the package directly (no need to go through the web-app).
Currently, all CLI configs are stored in a single file, config.yaml. In the future, the file structure should look as follows:
config/
|-- config.yaml
|-- model/
|   |-- resnet-18.yaml
|   |-- resnet-34.yaml
|   |-- ...
|-- data/
|   |-- data.yaml
|   |-- ...
This way, a user can still overwrite the default arguments like so:
lightly-train input_dir=my-dir model.num_ftrs=32
However, one could easily switch between different default settings and even write custom config files:
# this will use the default settings from resnet-34.yaml
lightly-train input_dir=my-dir model=resnet-34
# this will use the settings in the custom config file
lightly-train input_dir=my-dir model=my-model
As implemented in PIL.ImageOps. May be used to replace ImageNet normalization when working on X-ray images. This normalization increases contrast in the image.
Example:
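A minimal usage sketch, assuming the operation meant is histogram equalization via PIL.ImageOps.equalize (the filename is a placeholder):

from PIL import Image, ImageOps

img = Image.open('xray.png').convert('L')   # X-ray images are grayscale
img_eq = ImageOps.equalize(img)             # spreads the histogram, increasing contrast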
The new Lightly platform will use a new tag representation. The transmitted data will be encoded as (16bit) hex strings.
The goal is to provide a simple helper class to work with this new format and switch between hex representation and binary representation.
Some functionalities which need to be implemented:
'ab3f' --> '1010101100111111'
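A minimal helper sketch covering this conversion; the class and method names are assumptions:

class BitMask:
    def __init__(self, value: int, num_bits: int):
        self.value = value
        self.num_bits = num_bits

    @classmethod
    def from_hex(cls, hexstring: str) -> 'BitMask':
        # 4 bits per hex digit, e.g. 'ab3f' -> 16 bits
        return cls(int(hexstring, 16), 4 * len(hexstring))

    def to_bin(self) -> str:
        return format(self.value, f'0{self.num_bits}b')

    def to_hex(self) -> str:
        return format(self.value, f'0{self.num_bits // 4}x')

assert BitMask.from_hex('ab3f').to_bin() == '1010101100111111'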
When I use the lightly CLI, it always creates a new output folder with a timestamp. This is good for making experiments reproducible. However, for some CLI commands such as lightly-upload or lightly-download, creating folders does not add any value.
Furthermore, I would rename the folder to something more descriptive than just outputs.
- Make lightly-upload and lightly-download not create new output folders by default.
- Rename outputs to lightly_outputs or similar to make it clear which folders have been created by the lightly CLI.
Support for data augmentation pipelines, optimally data augmentation pipelines that support higher-level optimization like policy search or HPO.
Data augmentation would have its own abstraction but be bound to a pipeline abstraction; pipelines would then be passed to the collate function for the dual augmentation of batches specific to many self-supervision tasks. Collation could be associated more with descriptive methods for how much divergence is introduced (via randomness) for each of the two "paths".
This would allow researchers to use strategies such as curriculum/active learning, with augmentation diverging as the consistency loss reduces (for example).
To reduce constructor clutter and to allow for easy logging and experiment design, I believe that we should allow common hyper-parameters to be bundled and make assertions about these parameters in one place.
There are two options: the first is a top-level params API that bundles one object to be passed to the data loader and the model; the second uses two objects, data params and model params.
I am biased towards the former, specifically because of the framework-wide approach of self-supervision: there is a deep interconnection between data, data transformations, and models (perhaps a bit of which is present in all of deep learning), but I believe this paradigm in particular forces one to think of all of this as very interconnected.
This also forces the good practice of researchers thinking up-front about their entire experimental design, a kind of literate experimental design. With rich logging, enforced invariants, and sensible defaults it could also be very beginner-friendly.
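A sketch of the former option as a single bundled object; the fields below are assumptions:

from dataclasses import dataclass

@dataclass
class ExperimentParams:
    # one bundle passed to both the data loader and the model
    input_size: int = 64
    batch_size: int = 256
    num_ftrs: int = 32
    lr: float = 1e-3

    def __post_init__(self):
        # invariants about common hyper-parameters live in one place
        assert self.input_size > 0 and self.batch_size > 0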
Requirements
Currently, the primary ResNet backbone uses a custom implementation that is not directly compatible with the default PyTorch implementations (in addition to having a slightly different layer configuration). I think it would be advantageous to move to the standard models offered in torchvision, as most people likely default to those.
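For example, a backbone can be built from a standard torchvision model by stripping the classification head (a common pattern, not current library code):

import torch.nn as nn
import torchvision

resnet = torchvision.models.resnet18()
backbone = nn.Sequential(*list(resnet.children())[:-1])  # drop the final fc layer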
Depends on #111
The goal is to make sure the pip package can communicate using the new tag format.
As proposed by user u/extracoffeeplease on Reddit.
Add an implementation of the BYOL framework.
At the moment, our CLI only supports two ways of downloading datasets:
However, it would be nice to be able to download the full dataset (original images), if available from the platform, to a destination folder.
The CLI documentation mentions that we have parameters, but there is no list of them. It would be great to provide a list, so users know what kind of parameters they can set.
Currently, in lightly.cli.embed_cli, the following lines make it impossible to embed eval/test data from the CLI:
dataset = LightlyDataset(root, name=data, train=True, download=download,
                         from_folder=input_dir, transform=transform)
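A possible fix would be to expose the flag instead of hardcoding it (the config key used here is an assumption):

dataset = LightlyDataset(root, name=data, train=cfg.get('train', True),
                         download=download, from_folder=input_dir,
                         transform=transform)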
For SimCLR and MoCo there are two versions. Our implementations are closer to v2 of both, from what I've seen, but I'm not sure we use all the changes from the papers. I think it makes sense to clarify which version lightly is using, and if there is a difference to the paper we should mention it. Maybe also mention how one could use SimCLRv1 or SimCLRv2 by changing some of the parameters?
In their paper, the authors of MoCo mention that they shuffle their batches in order to prevent a flow of information between the key encoder and the query encoder (if the positive pairs are normalized with the same statistics, the model can cheat). As a solution, they shuffle the batches and split them into smaller sub-batches on which the batch norm is then calculated. We can and should implement a similar strategy.
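A minimal single-GPU sketch of the idea (the sub-batch split is a further refinement; the function name is an assumption):

import torch

def forward_key_encoder_shuffled(key_encoder, x):
    # shuffle so the batch norm statistics differ between query and key encoders
    idx = torch.randperm(x.size(0), device=x.device)
    k = key_encoder(x[idx])
    # undo the shuffle to restore the original pair ordering
    return k[torch.argsort(idx)]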
At the moment, tox installs the dependencies in the local environment which can overwrite the current installation of required packages.
Use CI to track the test coverage
Tasks: