vicco-group / thingsvision
Python package for extracting representations from state-of-the-art computer vision models
Home Page: https://vicco-group.github.io/thingsvision/
License: MIT License
rename the vgg ecoset model to VGG16bn_ecoset
.show_model() currently displays all modules of the model, but also prompts the user to input a module name, which it then checks for validity and returns. It should either be renamed to something more descriptive or only show the modules, as the name suggests.
We might also consider moving module name checking into .extract_features(...), as the user is not required to call .show_model() before extracting.
When I try to import just ImageDataset, as in from thingsvision.dataset import ImageDataset, I get the error cannot import name 'ImageDataset' from partially initialized module 'thingsvision.dataset'. It comes from dataset.py importing vision (https://github.com/ViCCo-Group/THINGSvision/blob/master/thingsvision/dataset.py#L12), which in turn imports ImageDataset (https://github.com/ViCCo-Group/THINGSvision/blob/master/thingsvision/vision.py#L31).
When running images through CORnet-z, the output layer for IT, as far as I understand from the CORnet preprint (https://www.biorxiv.org/content/10.1101/408385v1.full.pdf), is supposed to be 7x7x512 which when flattened would be a vector of length 25088. This is indeed what happens when apply_center_crop is set to True. However, when it is set to False, the output vectors of the same images (224 x 224 pixels), same layer/module, end up being of size 32768.
This behavior does not appear to be layer-specific. For example, I also tested it for the V1 output layer, and there the vector length also depends on whether center crop is applied.
I am not quite sure if this is a bug or "normal" behavior that I then just do not quite understand.
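The two vector lengths are at least consistent with a difference in spatial input size. Below is a minimal arithmetic sketch. It assumes, without confirmation from the report, that the IT output has 512 channels and that the network downsamples each spatial dimension by a total factor of 32, which matches the 7x7x512 figure for 224-pixel inputs. The function name and its defaults are illustrative only.

```python
def flattened_length(input_size: int, channels: int = 512, stride: int = 32) -> int:
    """Length of the flattened feature map for a square input.

    Assumes the network reduces each spatial dimension by `stride` overall;
    both defaults are assumptions chosen to match the 7x7x512 IT output.
    """
    side = input_size // stride
    return channels * side * side


with_crop = flattened_length(224)     # 512 * 7 * 7
without_crop = flattened_length(256)  # 512 * 8 * 8
print(with_crop, without_crop)
```

Under these assumptions, a length of 32768 corresponds to a 256 x 256 input. That would fit a preprocessing pipeline that resizes images to 256 pixels and relies on the center crop to reach 224, but whether thingsvision does this is not stated in the report.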
Unit tests should be split up further and placed in a test directory
Enable inference of images using all available GPUs
Add source-specific config classes (e.g., TimmConfig(), VisslConfig()), similar to how huggingface does it
add SimCLR from https://github.com/google-research/simclr to custom models.
update ecoset models and clip models
Hey,
when I call euclidean_matrix (through compute_rdm(x, 'euclidean')), the execution crashes with double free or corruption (!prev) (input: features from alexnet layers). Unfortunately, no more error logs are created. The other distance metrics work fine.
We run python 3.8 with CUDA 10.2 in a container. The python packages are up-to-date.
Can you reproduce that euclidean_matrix does not work, or is it specific to our setup?
Thanks
David
EDIT: If anyone else has the same problem, I use scipy's squareform + pdist functions as a workaround. I did not benchmark, but it seems pretty fast.
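The workaround mentioned in the edit can be sketched as follows. This is a minimal example, not the reporter's actual code: the function name euclidean_rdm and the random placeholder features are assumptions for illustration.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform


def euclidean_rdm(features: np.ndarray) -> np.ndarray:
    """Pairwise Euclidean distances between the rows of `features`."""
    # pdist returns the condensed upper triangle; squareform expands it
    # into a symmetric (n, n) matrix with zeros on the diagonal.
    return squareform(pdist(features, metric="euclidean"))


rng = np.random.default_rng(0)
features = rng.standard_normal((5, 16))  # placeholder for extracted features
rdm = euclidean_rdm(features)
```

Each row of features is treated as one image, so the result has one row and one column per image.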
Hello there,
I would like to try your tool, but there are some issues during import (please see below).
I have re-installed scikit-image but this did not help. Any ideas?
Thank you.
ImportError: dlopen(/anaconda3/envs/coactivations/lib/python3.9/site-packages/skimage/_shared/geometry.cpython-39-darwin.so, 2): Symbol not found: ____chkstk_darwin
Referenced from: /anaconda3/envs/coactivations/lib/python3.9/site-packages/skimage/_shared/../.dylibs/libomp.dylib (which was built for Mac OS X 10.15)
Expected in: /usr/lib/libSystem.B.dylib
in /anaconda3/envs/coactivations/lib/python3.9/site-packages/skimage/_shared/../.dylibs/libomp.dylib
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "", line 1, in <module>
import thingsvision.vision as vision
File "/anaconda3/envs/coactivations/lib/python3.9/site-packages/thingsvision/vision.py", line 67, in <module>
from skimage.transform import resize
File "/anaconda3/envs/coactivations/lib/python3.9/site-packages/skimage/__init__.py", line 124, in <module>
_raise_build_error(e)
File "/anaconda3/envs/coactivations/lib/python3.9/site-packages/skimage/__init__.py", line 102, in _raise_build_error
raise ImportError("""%s
ImportError: dlopen(/anaconda3/envs/coactivations/lib/python3.9/site-packages/skimage/_shared/geometry.cpython-39-darwin.so, 2): Symbol not found: ____chkstk_darwin
Referenced from: /anaconda3/envs/coactivations/lib/python3.9/site-packages/skimage/_shared/../.dylibs/libomp.dylib (which was built for Mac OS X 10.15)
Expected in: /usr/lib/libSystem.B.dylib
in /anaconda3/envs/coactivations/lib/python3.9/site-packages/skimage/_shared/../.dylibs/libomp.dylib
It seems that scikit-image has not been built correctly.
Your install of scikit-image appears to be broken.
Try re-installing the package following the instructions at:
https://scikit-image.org/docs/stable/install.html
If this already exists, how do I get this time bin data?
Some datasets have images that are stored in HDF5 format (e.g. NSD stimuli). Currently, these have to be written to disk to use the Extractor with an ImageDataset. It would be nice to have an HDF5Dataset that directly uses HDF5 files for extraction.
add a custom model for VGG without batch norm
There is a well-maintained reference implementation for Representational Similarity Analysis, the rsatoolbox. I therefore suggest deprecating THINGSvision's RSA functions.
Advantages:
THINGSvision avoids getting issues raised asking for RSA functions currently not implemented in THINGSvision, e.g. other distance functions.
Less for THINGSvision to maintain, as the focus is solely on feature extraction.
Possibly PR THINGSvision's RSA functions to the rsatoolbox if they turn out to be more efficient.
Dear Lukas,
first thanks a lot for this great tool!
When extracting the activations of the CLIP-ViT penultimate layer for the whole THINGS image data set, we noticed a mismatch between the count of input images (n=26,111?), the files in file_names.txt that the data loader saves (26,109), and the rows of the feature matrix (26,107). The difference between file_names.txt and the feature matrix seems to be due to two missing activations for images of the object "peppermint".
Do you know what causes the differences?
Thanks in advance!
Jonas
add a directory where users can implement their own models/load weights/...
these models can then be loaded via model name and lookup in the directory
OS: macOS 10.14.6 (18G9323) (I know... ask my IT why)
I created and activated a fresh conda env using the environment.yml. I did not install any additional packages.
device = 'cuda' if torch.cuda.is_available() else 'cpu'
batch_size = 68
model_name = 'clip-ViT'
module_name = 'visual'
backend = 'pt'
clip = True
out_path = ...
everything works as expected until I execute
dl = vision.load_dl(root,
                    out_path=out_path,
                    batch_size=batch_size,
                    transforms=model.get_transformations(),
                    backend=backend,
                    file_names=file_names)
Which outputs:
/Users/kaniuth/Desktop/extractivations/conda_env/lib/python3.8/site-packages/torchvision/transforms/transforms.py:332:
UserWarning: Argument interpolation should be of type InterpolationMode instead of int.
Please, use InterpolationMode enum. warnings.warn(
The pipeline seems to continue normally; I can extract and save activations just fine. I just don't know whether that warning has any consequences for the extracted activations.
That warning did not occur with earlier versions of THINGSvision.
Improve readability of the README file. Make the text more concise and easier to understand?
If torch.device is cuda, then thingsvision.model_class.Model.extract_features() throws a type error, since features are torch.Tensor() and features = np.vstack(features) (#L223) is not possible. Suggested fix on #L221: features.append(act.cpu().numpy())
Inside model.extract_features, add a loop to extract features from multiple layers. Use a defaultdict(list) to store activations.
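One way the requested loop could look is sketched below. This is a framework-free illustration, not the library's implementation: the forward callable, which returns a dict mapping module names to activations, and the toy model are hypothetical stand-ins.

```python
from collections import defaultdict

import numpy as np


def extract_features(batches, module_names, forward):
    """Collect activations for several modules across all batches.

    `forward(batch)` is a hypothetical callable returning a dict that maps
    each module name to that module's activations for the batch.
    """
    activations = defaultdict(list)
    for batch in batches:
        outputs = forward(batch)
        for name in module_names:
            act = np.asarray(outputs[name])
            # Flatten everything except the batch dimension.
            activations[name].append(act.reshape(len(act), -1))
    # Stack the per-batch arrays into one (n_images, n_features) matrix per module.
    return {name: np.vstack(acts) for name, acts in activations.items()}


def toy_forward(batch):
    """Toy stand-in for a model with two modules."""
    return {
        "features.0": batch * 2.0,
        "classifier": batch.sum(axis=(1, 2))[:, None],
    }


batches = [np.ones((4, 3, 3)), np.ones((2, 3, 3))]
features = extract_features(batches, ["features.0", "classifier"], toy_forward)
```

The defaultdict removes the need to initialize an empty list per module before the first batch arrives.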
Cannot import thingsvision.dataset.ImageDataset as a stand-alone, without prior import of thingsvision.vision. Code to reproduce:
from thingsvision.dataset import ImageDataset
throws: ImportError: cannot import name 'ImageDataset' from partially initialized module 'thingsvision.dataset' (most likely due to a circular import). A temporary fix is:
import thingsvision.vision
from thingsvision.dataset import ImageDataset
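A common way to break such a cycle for good is to defer one of the two imports until call time. The sketch below demonstrates this with two throwaway modules written to a temporary directory. The modules demo_dataset and demo_vision are hypothetical stand-ins for thingsvision/dataset.py and thingsvision/vision.py, and this is not the fix that was actually applied in the repository.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Stand-in for dataset.py: the import of the other module is deferred
# into the method body, so importing this module alone never triggers the cycle.
DATASET_SRC = textwrap.dedent('''
    class ImageDataset:
        def load(self):
            from demo_vision import describe  # resolved at call time
            return describe(self)
''')

# Stand-in for vision.py: still imports ImageDataset at module level.
VISION_SRC = textwrap.dedent('''
    from demo_dataset import ImageDataset

    def describe(dataset):
        return f"loaded {type(dataset).__name__}"
''')


def import_dataset_standalone() -> str:
    """Import the dataset module on its own and exercise the deferred import."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "demo_dataset.py").write_text(DATASET_SRC)
        Path(tmp, "demo_vision.py").write_text(VISION_SRC)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            module = importlib.import_module("demo_dataset")
            return module.ImageDataset().load()
        finally:
            # Clean up so the demo leaves no trace in the interpreter.
            sys.path.remove(tmp)
            sys.modules.pop("demo_dataset", None)
            sys.modules.pop("demo_vision", None)


result = import_dataset_standalone()
print(result)
```

With the import moved into the method, the dataset module finishes initializing before the vision module asks for ImageDataset, so the import order no longer matters.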
When executing module_name = extractor.show_model(), the user is supposed to input the part of the model for which they would like to extract features. Currently, any value is accepted, even if the value is not a valid module name for the respective model.
Therefore, check whether the user's input actually denotes a valid module name. If not, reject it and prompt the user again. This avoids a potentially confusing error message later in the pipeline, when the user tries to execute features = extractor.extract_features(...).
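The validate-and-reprompt behaviour described above could be sketched like this. The function prompt_for_module and its read parameter are hypothetical and not part of thingsvision. The read parameter only exists so that the loop can be exercised without a terminal.

```python
from typing import Callable, Iterable


def prompt_for_module(
    valid_names: Iterable[str],
    read: Callable[[str], str] = input,
) -> str:
    """Ask for a module name until the answer is one of `valid_names`."""
    valid = set(valid_names)
    while True:
        name = read("Enter module name: ").strip()
        if name in valid:
            return name
        print(f"'{name}' is not a valid module name. Please try again.")


# Simulated user input: one typo, then a valid name.
answers = iter(["feature.0", "features.0"])
chosen = prompt_for_module(
    ["features.0", "features.1"],
    read=lambda _prompt: next(answers),
)
```

For a PyTorch model, the valid names could be gathered from model.named_modules(). The equivalent lookup for other backends is left open here.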
Allow a user to add their own (pretrained) models without the necessity to upload them to thingsvision / make a PR
add TF weights for an AlexNet model trained on ecoset https://codeocean.com/capsule/9570390/tree/v1
Add support for Tensorflow 2 by:
Currently, we have some if-statements that address model-specific exceptions. Since these are exceptions rather than something general, we want to specify them in the custom model file or move them to source-specific extractor classes, if applicable. There's an if-statement that exclusively concerns clip models. This should go into the custom model file or the extractor classes for OpenCLIP.
Right now, features are stored in RAM when calling extractor.extract_features(...), which quickly makes memory usage explode. It might be a good idea to enable a low-memory option that just writes features to disk immediately.
Maybe add an output path to the extractor, extractor = Extractor(out_path='...'), and an extra flag, extractor.extract_features(store_to_disk=True)?
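A low-memory mode along these lines could write each batch to its own file as soon as it is computed. The sketch below is an illustration under assumptions: the helper names extract_to_disk and load_features, the forward callable, and the file naming scheme are all hypothetical and do not reflect the Extractor API.

```python
import tempfile
from pathlib import Path

import numpy as np


def extract_to_disk(batches, forward, out_path):
    """Write each batch of features to its own .npy file instead of keeping it in RAM."""
    out_dir = Path(out_path)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, batch in enumerate(batches):
        feats = np.asarray(forward(batch))
        path = out_dir / f"features_{i:05d}.npy"
        np.save(path, feats)  # only one batch is held in memory at a time
        paths.append(path)
    return paths


def load_features(paths):
    """Reassemble the full feature matrix from the per-batch files."""
    return np.vstack([np.load(p) for p in paths])


with tempfile.TemporaryDirectory() as tmp:
    batches = [np.ones((4, 8)), np.zeros((2, 8))]
    paths = extract_to_disk(batches, lambda b: b * 2.0, tmp)
    restored = load_features(paths)
```

Zero-padded file names keep the batches in order when the directory is listed later.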
Center cropping currently raises an exception with the TensorFlow backend when CNN transformations are activated.
https://github.com/ViCCo-Group/THINGSvision/blob/master/thingsvision/model_class.py#L317
The crop width 224 should not be greater than input width.
Condition x >= 0 did not hold element-wise:
x (shape=() dtype=int32) =
-221
Enable compatibility with most recent torchvision version (>= 0.13.0)
I think there's a rather serious (but fortunately easily fixable!) problem with THINGSvision. Namely, in the hook that retrieves model activations, the dictionary activations is given a pointer to the original tensor output rather than a copy of output. This means that a layer that operates in-place, such as ReLU(inplace=True), can modify our activation tensor, even if said layer comes after the layer whose activations we are extracting.
To replicate this, you can use the pytorch Colab notebook given, and extract features for Alexnet on pretty much any image. Examining the features from layer features.0 (a 2D convolution), all entries are non-negative, and many are 0. This is because the following layer, features.1, is an in-place ReLU operation, and overwrote all of the negative entries.
To solve this, the dictionary activations should store a copy of the tensor, rather than the original tensor. I'm happy to make a PR and fix this myself, but I thought I would also bring it to folks' attention here as well.
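The aliasing problem can be illustrated without a deep learning framework. The sketch below is a NumPy analogy rather than the actual hook code: relu_inplace stands in for ReLU(inplace=True), and the dictionary keys are made up for the demonstration.

```python
import numpy as np


def relu_inplace(x: np.ndarray) -> np.ndarray:
    """Clamp negatives to zero, overwriting the input buffer (like ReLU(inplace=True))."""
    np.maximum(x, 0.0, out=x)
    return x


conv_output = np.array([-1.5, 0.5, -0.2, 2.0])  # stand-in for the output of features.0

activations = {}
activations["by_reference"] = conv_output    # stores the same buffer the next layer will modify
activations["by_copy"] = conv_output.copy()  # the proposed fix: store an independent copy

relu_inplace(conv_output)  # the next layer runs after the hook has fired

print(activations["by_reference"])  # negatives have been overwritten
print(activations["by_copy"])       # original values are preserved
```

In PyTorch, the corresponding change would presumably be to store output.clone() in the hook instead of output itself.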
DEFAULT weights for torchvision, which is limited behavior (our goal is flexible behavior)
source by inheritance from a base extractor class and a backend-specific extractor class (TensorFlowMixin and PyTorchMixin); something along the following lines does the trick:
@dataclass(repr=True)
class TimmExtractor(BaseExtractor, PyTorchMixin):
    def __init__(self, config: object) -> None:
        super(TimmExtractor, self).__init__()
        raise NotImplementedError
Add to README that model names and layer names are backend specific