voxel51 / fiftyone
The open-source tool for building high-quality datasets and computer vision models
Home Page: https://fiftyone.ai
License: Apache License 2.0
This code:
import fiftyone.zoo as foz

train_dataset = foz.load_zoo_dataset("cifar10", split="train")
valid_dataset = foz.load_zoo_dataset("cifar10", split="test")
generates this output:
Using default dataset directory '/home/jason/fiftyone/cifar10/train'
Dataset already downloaded
Parsing samples...
Creating dataset 'cifar10' containing 50000 samples
Using default dataset directory '/home/jason/fiftyone/cifar10/test'
Dataset already downloaded
Parsing samples...
Creating dataset 'cifar10' containing 10000 samples
It's confusing to say "Creating dataset 'cifar10'" twice.
Infinite scrolling of images in the dashboard is awesome! So awesome that we made it a front-and-center feature that every user will definitely interact with. As a result, it's important that we optimize its performance as much as possible.
This will be a recurring issue, but here's an initial list of low-hanging fruit that we brainstormed:
Scrolling efficiency ideas:
- Load a subset of Samples at a time and show those in the samples view

Here's a v1 design for our docs from our designer:
Personally, I like the colors/styling of the light theme the best:
The more important point is making sure the various components of the docs are laid out in a user-friendly way.
For reference, here's a tutorial from PyTorch:
and here's a tutorial from TF:
Distilling all of this, here are the page elements that I like:
- Get Started, Docs, Tutorials, GitHub, etc
- View source on GitHub and Download notebook links

Discussion is welcome!
Making the Display and View sections on the dashboard sidebar collapsible (but expanded by default) will be important to support cases where the sample field schema and/or the current view are complex and thus cannot both fit on the screen at the same time.
Similarly, even with Display collapsed, for example, the current View may exceed the screen height, so the sidebar will need to be scrollable. (Maybe it already is? Not sure)
Just documenting this: we need the ability to programmatically close the dashboard
When viewing a dataset in the dashboard, the default should be to have at least one sample field of type fiftyone.core.label.Label automatically selected (the user could deselect it if desired, of course)
Here's a proposal:
My justification for this is:
The parsing samples progress bar is a bit disingenuous because loading a dataset always seems to hang for some time afterwards.
I'll further note that this hang time increases with the dataset size.
This hang time is not communicated to the user as anything specific.
From IPython, I run the following commands:
In [1]: import fiftyone as fo
In [2]: session = fo.launch_dashboard()
In [3]: exit()
The last command exits my shell with the dashboard still open. The dashboard does successfully close, but my shell is now borked: commands can still be run and their outputs will print, but anything I type does not appear on screen. My only recourse is to close the shell and open a new one.
Trying to run the following from the dataset creation walkthrough:
import fiftyone.zoo as foz
# List available datasets
print(foz.list_zoo_datasets())
# Load a zoo dataset
# The dataset will be downloaded from the web the first time you access it
dataset = foz.load_zoo_dataset("cifar10")
# Print a few samples from the dataset
print(dataset.view().head())
Without tensorflow installed, it prompts me to pip install tensorflow>=1.15.
However, with tensorflow==1.15, I get the following error:
Done writing /home/eric/fiftyone/cifar10/tmp-download/cifar10/3.0.2.incompleteFX5EZH/cifar10-train.tfrecord. Shard lengths: [50000]
Generating split test
Shuffling and writing examples to /home/eric/fiftyone/cifar10/tmp-download/cifar10/3.0.2.incompleteFX5EZH/cifar10-test.tfrecord
0%| | 0/10000 [00:00<?, ? examples/s]Done writing /home/eric/fiftyone/cifar10/tmp-download/cifar10/3.0.2.incompleteFX5EZH/cifar10-test.tfrecord. Shard lengths: [10000]
Skipping computing stats for mode ComputeStatsMode.AUTO.
Dataset cifar10 downloaded and prepared to /home/eric/fiftyone/cifar10/tmp-download/cifar10/3.0.2. Subsequent calls will reuse this data.
Constructing tf.data.Dataset for split test, from /home/eric/fiftyone/cifar10/tmp-download/cifar10/3.0.2
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-1-5a9ae139399a> in <module>
6 # Load a zoo dataset
7 # The dataset will be downloaded from the web the first time you access it
----> 8 dataset = foz.load_zoo_dataset("cifar10")
9
10 # Print a few samples from the dataset
~/venvs/test/lib/python3.6/site-packages/fiftyone/zoo/__init__.py in load_zoo_dataset(name, splits, dataset_dir, download_if_necessary)
142 if download_if_necessary:
143 info, dataset_dir = download_zoo_dataset(
--> 144 name, splits=splits, dataset_dir=dataset_dir
145 )
146 zoo_dataset = info.zoo_dataset
~/venvs/test/lib/python3.6/site-packages/fiftyone/zoo/__init__.py in download_zoo_dataset(name, splits, dataset_dir)
110 """
111 zoo_dataset, dataset_dir = _parse_dataset_details(name, dataset_dir)
--> 112 info = zoo_dataset.download_and_prepare(dataset_dir, splits=splits)
113 return info, dataset_dir
114
~/venvs/test/lib/python3.6/site-packages/fiftyone/zoo/__init__.py in download_and_prepare(self, dataset_dir, splits)
550 logger.info("Downloading split '%s' to '%s'", split, split_dir)
551 format, num_samples, classes = self._download_and_prepare(
--> 552 split_dir, scratch_dir, split
553 )
554
~/venvs/test/lib/python3.6/site-packages/fiftyone/zoo/tf.py in _download_and_prepare(self, dataset_dir, scratch_dir, split)
184 get_class_labels_fcn,
185 get_num_samples_fcn,
--> 186 sample_parser,
187 )
188
~/venvs/test/lib/python3.6/site-packages/fiftyone/zoo/tf.py in _download_and_prepare(dataset_dir, scratch_dir, download_fcn, get_class_labels_fcn, get_num_samples_fcn, sample_parser)
762 # Write the formatted dataset to `dataset_dir`
763 write_dataset_fcn(
--> 764 dataset.as_numpy_iterator(),
765 dataset_dir,
766 sample_parser=sample_parser,
AttributeError: 'PrefetchDataset' object has no attribute 'as_numpy_iterator'
When trying this code snippet with tensorflow==2.2.0, it worked without issue. After running with v2.2.0, I reinstalled v1.15 and it still worked without issue. Only after deleting ~/fiftyone/cifar10 and rerunning with v1.15 did the error reoccur.
$ ipython
Python 3.6.10 (tags/debian/3.6.10-1+xenial1:7bb1d22, Jan 11 2020, 15:15:40)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.8.0 -- An enhanced Interactive Python. Type '?' for help.
[ins] In [1]: import fiftyone as fo
[ins] In [2]: fo?
Type: module
String form: <module 'fiftyone' from '/home/jason/voxel51/j/fiftyone/fiftyone/__init__.py'>
File: ~/voxel51/j/fiftyone/fiftyone/__init__.py
Docstring:
FiftyOne package namespace.
| Copyright 2017-2020, Voxel51, Inc.
| `voxel51.com <https://voxel51.com/>`_
|
Let's be sure to improve this documentation string, as this is a common usage pattern in IPython...
Single-clicking on an image in the dashboard should open the expanded view. See #113 for details on how sample selection, the current single-click behavior that is being usurped by this request, should be implemented.
Wouldn't this be cool and useful!?
dataset = fo.Dataset(...)
view1 = (
dataset.view()
....
)
view2 = (
dataset.view()
....
)
view_intersection = view1 & view2
# equivalent long version:
# view_intersection = view1.intersection(view2)
view_union = view1 | view2
# equivalent long version:
# view_union = view1.union(view2)
...
We could support all the natural set operations:
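A minimal sketch of how such set operations could be implemented with Python operator overloading. The ViewIds class here is a hypothetical stand-in that tracks matching sample IDs, not the real DatasetView API:

```python
class ViewIds:
    """Toy stand-in for a DatasetView that tracks matching sample IDs."""

    def __init__(self, ids):
        self._ids = frozenset(ids)

    def __and__(self, other):
        # intersection: samples present in both views
        return ViewIds(self._ids & other._ids)

    def __or__(self, other):
        # union: samples present in either view
        return ViewIds(self._ids | other._ids)

    def __sub__(self, other):
        # difference: samples in this view but not the other
        return ViewIds(self._ids - other._ids)

    # long-form aliases, matching the proposed API
    intersection = __and__
    union = __or__
    difference = __sub__


view1 = ViewIds(["a", "b", "c"])
view2 = ViewIds(["b", "c", "d"])
print(sorted((view1 & view2)._ids))  # ['b', 'c']
print(sorted((view1 | view2)._ids))  # ['a', 'b', 'c', 'd']
```

In the real implementation, each operator would presumably compose the underlying aggregation pipelines rather than materializing ID sets, but the user-facing syntax would be the same.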
I can break the dashboard in a very unpleasant way by doing the following:
export FIFTYONE_DEFAULT_ML_BACKEND=tensorflow
fiftyone zoo download mnist --splits test
import fiftyone as fo
import fiftyone.zoo as foz
test = foz.load_zoo_dataset("mnist", splits=["test"])
session = fo.launch_dashboard(dataset=test)
Close terminal session, get a coffee, etc
Now re-download the dataset with Torch:
mv ~/fiftyone/mnist ~/fiftyone/mnist-torch
export FIFTYONE_DEFAULT_ML_BACKEND=torch
fiftyone zoo download mnist --splits test
import fiftyone as fo
import fiftyone.zoo as foz
test = foz.load_zoo_dataset("mnist", splits=["test"])
session = fo.launch_dashboard(dataset=test)
As you can see, the images didn't change, but the labels did! My dashboard is broken!
The issue is that the TF- and Torch-based datasets yield images on disk with the same filenames ~/fiftyone/mnist/test/data/%05d.jpg, but the order of the filenames is permuted, so ~/fiftyone/mnist/test/data/00001.jpg is a different image in each download of the dataset.
I think this is some kind of aggressive image caching on the frontend that persists between sessions and doesn't realize if the source image has actually changed on disk.
For example, this shows that the image I see in the dashboard does not necessarily match what is actually on disk at the time:
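If the root cause is the frontend caching images keyed only on filepath, one common mitigation is content-based cache busting in the URLs the backend serves. A hedged sketch (the image_url helper is hypothetical, not part of FiftyOne):

```python
import hashlib


def image_url(filepath, image_bytes):
    """Append a short content hash so the browser cannot serve a stale image.

    Same filepath + different pixels -> different URL -> guaranteed cache miss.
    """
    digest = hashlib.md5(image_bytes).hexdigest()[:8]
    return "%s?v=%s" % (filepath, digest)


path = "mnist/test/data/00001.jpg"
print(image_url(path, b"tf-pixels") != image_url(path, b"torch-pixels"))  # True
```

A cheaper variant would hash the file's mtime + size instead of its contents, at the cost of missing in-place edits that preserve both.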
The dashboard currently renders images with lots of image smoothing turned on. However, our technical audience of CV/ML scientists will want to see their raw images.
The difference is especially apparent at low resolutions. I would prefer to see the pixelated 32x32 image on the left below (opened on my machine with smoothing turned off), not the image on the right (in the FO dashboard):
print(dataset_or_view) can get out of hand when the collection contains many samples. We should consider a threshold above which we only serialize X samples and then append a message like ... X of Y total samples or something
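A sketch of what the truncated serialization could look like; summarize_samples is a hypothetical helper and the exact message format is up for discussion:

```python
def summarize_samples(samples, max_shown=3):
    """Serialize at most max_shown samples, noting how many were omitted."""
    total = len(samples)
    lines = [str(s) for s in samples[:max_shown]]
    if total > max_shown:
        lines.append("... %d of %d total samples shown" % (max_shown, total))
    return "\n".join(lines)


print(summarize_samples(["s1", "s2", "s3", "s4", "s5"], max_shown=2))
# s1
# s2
# ... 2 of 5 total samples shown
```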
Proposed MVP implementation after our meeting on June 1st:
The tab headings are up for discussion; unsure how to best organize.
MVP field types to support:
- fiftyone.core.label.Classification labels
- fiftyone.core.label.Detections labels
- tags values

#117 went out a bit quick, but, upon closer inspection, I think we should not have deleted fiftyone.core.odm.ODMDatasetSample.
The current implementation has only fiftyone.core.odm.ODMSample and fiftyone.core.odm.NoDatasetSample, where the latter inherits directly from SerializableDocument:
fiftyone/fiftyone/core/odm/sample.py
Line 405 in 2730847
I think there's still value in having the following hierarchy:
fiftyone.core.odm.ODMSample
fiftyone.core.odm.ODMDatasetSample
fiftyone.core.odm.ODMNoDatasetSample
where the base class fiftyone.core.odm.ODMSample defines the interface that all samples support. This will make it clearer what fiftyone.core.sample.Sample is allowed to do with its backing documents, for example.
Although NoDatasetSample is now completely home-brewed (no MongoEngine), it is still a "document" in the sense that it is a JSON-serializable representation of a Sample. So I see no issue with using the ODMNoDatasetSample name. It is in the odm package, after all.
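For illustration, the proposed hierarchy could look roughly like this, with an abstract base class defining the shared interface. This is a sketch only; the method names here are assumptions, not the actual ODMSample interface:

```python
from abc import ABC, abstractmethod


class ODMSample(ABC):
    """Base class defining the interface all backing documents support."""

    @abstractmethod
    def to_dict(self):
        """Serialize the sample to a JSON-ready dict."""

    @abstractmethod
    def save(self):
        """Persist any modifications."""


class ODMDatasetSample(ODMSample):
    """Backing document for samples in a dataset (MongoEngine-backed in practice)."""

    def __init__(self, **fields):
        self._fields = dict(fields)

    def to_dict(self):
        return dict(self._fields)

    def save(self):
        # the real implementation would write to MongoDB here
        pass


class ODMNoDatasetSample(ODMSample):
    """In-memory backing document for samples not yet added to a dataset."""

    def __init__(self, **fields):
        self._fields = dict(fields)

    def to_dict(self):
        return dict(self._fields)

    def save(self):
        # nothing to persist; the sample lives only in memory
        pass
```

With this shape, Sample can be written against the ODMSample interface alone, regardless of which backing document it holds.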
Currently dataset summaries look something like this:
>>> d = fo.Dataset("ASDf")
>>> d
Name: ASDf
Num samples: 0
Tags: []
Sample fields:
filepath: fiftyone.core.fields.StringField
tags: fiftyone.core.fields.ListField(field=fiftyone.core.fields.StringField)
metadata: fiftyone.core.fields.EmbeddedDocumentField(document_type=fiftyone.core.metadata.Metadata)
We should consider implementing custom __str__ for our fiftyone.core.fields classes to make the sample field representations more concise, particularly list fields and embedded fields. The fact that metadata is an "embedded document" is definitely not relevant to the user, for example.
Example concise representation:
>>> d = fo.Dataset("ASDf")
>>> d
Name: ASDf
Num samples: 0
Tags: []
Sample fields:
filepath: fiftyone.core.fields.StringField
tags: fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
metadata: fiftyone.core.metadata.Metadata
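A sketch of how the concise representations could be achieved with custom __str__ implementations. These are simplified stand-ins, not the real fiftyone.core.fields classes (which would also include the module path):

```python
class Field:
    def __str__(self):
        # concise default: just the class name
        return self.__class__.__name__


class StringField(Field):
    pass


class ListField(Field):
    def __init__(self, field):
        self.field = field

    def __str__(self):
        # drop the "field=" keyword from the default repr
        return "%s(%s)" % (self.__class__.__name__, self.field)


class EmbeddedDocumentField(Field):
    def __init__(self, document_type):
        self.document_type = document_type

    def __str__(self):
        # show only the embedded document type; the "embedded document"
        # machinery is not relevant to the user
        return self.document_type.__name__


class Metadata:
    pass


print(ListField(StringField()))          # ListField(StringField)
print(EmbeddedDocumentField(Metadata))   # Metadata
```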
In the Image Deduplication with FiftyOne walkthrough docs, in section 4, Compute File Hashes:
We have two ways to visualize this new information:
1. From your terminal:
sample = dataset.view().first()
print(sample)
1. By refreshing the dashboard:
session.dataset = dataset
The second list item renders as 1. but should be 2.
In the GUI, allow users to move from one sample to the next using arrow keys.
I think the most critical design challenge we have remaining is how to ensure that it is crystal clear to the user when their dataset is synced with the DB.
Some thoughts on the issue:
Can we make sample.save() efficient enough, or is this fundamentally an antipattern for large-scale datasets?
An alternative batch approach is as follows:
for sample in dataset:
sample["file_hash"] = fof.compute_filehash(sample.filepath)
# batch updates all samples that have been modified
dataset.save()
where samples would report to their parent dataset that they have been modified, and then the dataset would handle batch saving all modified samples (the dataset would maintain a queue of in-memory samples to save).
A critical issue with the above code as written is that we'll have very unhappy users if they get an error on the 123456th sample they updated and then discover that none of their other modifications were saved.
As a result, I believe Dataset needs a context manager that manages batch syncing to the DB:
with dataset:
for sample in dataset:
sample["file_hash"] = fof.compute_filehash(sample.filepath)
where __exit__ ensures that any changes will always be synced to the DB. As an optimization, the dataset could sync changes in batches of n to avoid the need to store every sample in memory
Originally posted by @brimoor in https://github.com/voxel51/fiftyone/diffs
The current FiftyOne install process requires Xcode developer tools, because the Electron app needs to build some things with gyp.
Also, when @michaelsare ran the install script with Xcode dev tools installed, he didn't see any obvious errors, but yarn was not properly installed. This had to be fixed via brew install yarn. The install script (tries?) to install yarn via npm install -g yarn.
We should find a way to ensure that a developer install from scratch on a fresh machine completes successfully. We'll have non-developer users of the tool internally who will need to be able to spin up a bleeding-edge version of the tool to demo to folks, and we'll want to install FiftyOne from scratch on fresh VMs.
Here's what the no dataset page of the FiftyOne Dashboard currently looks like:
If the user has a dashboard open, then they've already figured out how to run fo.launch_dashboard()
, right? So, perhaps we should extend the help a bit to show the full workflow:
import fiftyone as fo
# Load your FiftyOne dataset
dataset = ...
# Launch your dashboard locally
# (if you're reading this from your dashboard, you've already done this!)
session = fo.launch_dashboard()
# Load a dataset
session.dataset = dataset
# Load a specific view into your dataset
session.view = view
Remote connections are a bit of a special case, so perhaps they should be hidden in a view that appears only after clicking on a Remote session? link or something similar. Then that help page could show bifurcated local/remote instructions like this:
import fiftyone as fo
# Load your FiftyOne dataset
dataset = ...
# Launch the dashboard that you'll connect to from your local machine
session = fo.launch_dashboard(remote=True)
# Load a dataset
session.dataset = dataset
# Load a specific view into your dataset
session.view = view
# Configure port forwarding to access the session on your remote machine
ssh -L 5151:127.0.0.1:5151 username@remote_machine_ip
# Launch the dashboard
# NOTE: I'm going to support this via the CLI as a shortcut for launching a
# dashboard that you intend to connect to remotely
fiftyone remote
Need type support for cifar10-like chunks of data...
This is blocked by #64 and the current approach to the hidden mongodb installation. Notably, the same database and log directories are used by mongodb for every session.
This is blocked by mongodb always using the same port (27017).
ValidationError: ValidationError (Classification:None) (StringField only accepts string values: ['label'])
After an image is shown in the sidebar, add an option for viewing it fullscreen. This should still use player51 to show detections.
Is Dataset.serialize() being used for anything?
fiftyone/fiftyone/core/dataset.py
Lines 414 to 420 in e9b8d72
I find it confusing because it does not serialize the dataset; it just returns metadata (currently only the name).
This is becoming especially confusing given #121, which introduces a Dataset.to_dict() method that actually serializes the entire dataset.
If you have Torch as your backend, but TensorFlow is available, then the zoo should load the dataset from TF.
Google Photos please!!!
This has shown up with my work on the new interface. It shows up maybe once in ~5-10 calls. Need to investigate.
https://api.mongodb.com/python/current/api/pymongo/errors.html#pymongo.errors.NotMasterError
(fiftyone) tylerganter@tgmbp:~/source/fiftyone/tests$ python 51in15.py
Uncaught exception
Traceback (most recent call last):
File "51in15.py", line 25, in <module>
sample_id = dataset.add_sample(filepath="/path/to/img.jpg", tags=["train"])
File "/Users/tylerganter/source/fiftyone/fiftyone/core/dataset.py", line 205, in add_sample
sample = self._Doc(*args, **kwargs)
File "/Users/tylerganter/envs/fiftyone/lib/python3.6/site-packages/mongoengine/base/document.py", line 115, in __init__
setattr(self, key, value)
File "/Users/tylerganter/source/fiftyone/fiftyone/core/odm.py", line 214, in __setattr__
self.save()
File "/Users/tylerganter/envs/fiftyone/lib/python3.6/site-packages/mongoengine/document.py", line 408, in save
object_id = self._save_create(doc, force_insert, write_concern)
File "/Users/tylerganter/envs/fiftyone/lib/python3.6/site-packages/mongoengine/document.py", line 473, in _save_create
object_id = wc_collection.insert_one(doc).inserted_id
File "/Users/tylerganter/envs/fiftyone/lib/python3.6/site-packages/pymongo/collection.py", line 698, in insert_one
session=session),
File "/Users/tylerganter/envs/fiftyone/lib/python3.6/site-packages/pymongo/collection.py", line 612, in _insert
bypass_doc_val, session)
File "/Users/tylerganter/envs/fiftyone/lib/python3.6/site-packages/pymongo/collection.py", line 600, in _insert_one
acknowledged, _insert_command, session)
File "/Users/tylerganter/envs/fiftyone/lib/python3.6/site-packages/pymongo/mongo_client.py", line 1491, in _retryable_write
return self._retry_with_session(retryable, func, s, None)
File "/Users/tylerganter/envs/fiftyone/lib/python3.6/site-packages/pymongo/mongo_client.py", line 1384, in _retry_with_session
return func(session, sock_info, retryable)
File "/Users/tylerganter/envs/fiftyone/lib/python3.6/site-packages/pymongo/collection.py", line 595, in _insert_command
retryable_write=retryable_write)
File "/Users/tylerganter/envs/fiftyone/lib/python3.6/site-packages/pymongo/pool.py", line 618, in command
self._raise_connection_failure(error)
File "/Users/tylerganter/envs/fiftyone/lib/python3.6/site-packages/pymongo/pool.py", line 613, in command
user_fields=user_fields)
File "/Users/tylerganter/envs/fiftyone/lib/python3.6/site-packages/pymongo/network.py", line 167, in command
parse_write_concern_error=parse_write_concern_error)
File "/Users/tylerganter/envs/fiftyone/lib/python3.6/site-packages/pymongo/helpers.py", line 136, in _check_command_response
raise NotMasterError(errmsg, response)
pymongo.errors.NotMasterError: interrupted at shutdown
__del__ isn't guaranteed to be called, and even if it is, it isn't guaranteed to be called before other objects are deleted. When testing #75, this has led to some intermittent errors like this:
Exception ignored in: <bound method Service.__del__ of <fiftyone.core.service.ServerService object at 0x7f70aa0a4b38>>
Traceback (most recent call last):
File "/path/to/venv/lib/python3.5/site-packages/fiftyone/core/service.py", line 50, in __del__
File "/path/to/venv/lib/python3.5/site-packages/fiftyone/core/service.py", line 88, in stop
AttributeError: 'NoneType' object has no attribute 'STOP_SERVER'
Exception ignored in: <bound method Service.__del__ of <fiftyone.core.service.DatabaseService object at 0x7f70b117ae80>>
Traceback (most recent call last):
File "/path/to/venv/lib/python3.5/site-packages/fiftyone/core/service.py", line 50, in __del__
File "/path/to/venv/lib/python3.5/site-packages/fiftyone/core/service.py", line 75, in stop
AttributeError: 'NoneType' object has no attribute 'STOP_DB'
In this case, the entire fiftyone.constants module was deleted before the Service instances!
I think avoiding __del__ entirely is the right approach here. My understanding is that #76 has already made some improvements; this issue is mainly for tracking purposes.
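One alternative to __del__ is registering cleanup with atexit, whose handlers run before module globals are torn down. A sketch under the assumption that each service exposes a stop() method; the Service class here is a simplified stand-in, not the real fiftyone.core.service code:

```python
import atexit


class Service:
    """Sketch of a service whose cleanup runs via atexit instead of __del__."""

    def __init__(self, name):
        self.name = name
        self.stopped = False
        # atexit handlers run before module globals (e.g. fiftyone.constants)
        # are torn down, unlike __del__, which may fire after they are gone.
        # Note: atexit.register keeps a strong reference to self; use
        # weakref.finalize instead if services should be collectable earlier.
        atexit.register(self.stop)

    def stop(self):
        if self.stopped:
            return  # idempotent: safe whether called explicitly or at exit
        self.stopped = True
        # ...send the real STOP command to the child process here...
```

An explicit stop() call makes the later atexit invocation a no-op, so both shutdown paths are safe.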
Currently all bubbles rendered on images in the dashboard are displayed in <value> mode. However, for certain fields (usually numeric fields), it can be strange to just see a number like 0.987235 on an image.
My proposal is that the user can click on any bubble on any image to toggle between showing the values in <value> and <key>: <value> mode. In the latter mode, you would see confidence: 0.987235, for example.
I have seen this feature in CVAT, and it provides for a great experience when trying to zoom in on small detections, for example.
Scroll up: zoom in
Scroll down: zoom out
I cannot select the text for the id, filepath, or tags on an expanded image in the UI.
The discussion below was prompted by the following command:
view.match({"metadata.num_channels": 3, "metadata.size_bytes": {"$gt": 1200}})
Having the user write things like "metadata.size_bytes": {"$gt": 1200} is getting a little too close to exposing the user to MongoDB syntax for my taste.
In an ideal pythonic interface, one could write:
view.match(lambda sample: sample.metadata.num_channels == 3)
Now, of course, this would have to be implemented as a pipeline stage that reads the samples into memory and applies the function, so it may not be as efficient as possible, but it would be very easy for the user to understand, and powerful.
Maybe the operation is reasonably fast even for datasets with 100K+ samples, and so we can just do it. Or maybe it's a bit slow so we expose the functionality and then suggest to the user that there are faster ways to implement certain operations.
This is analogous to https://www.tensorflow.org/api_docs/python/tf/py_function.
For any "optimized" operations we support, how about a more generic syntax like:
view.match("'metadata.size_bytes' > 1200")
where we would have a simple whitespace-based parser that translates the string into "metadata.size_bytes": {"$gt": 1200}.
The syntax of the match string "'metadata.size_bytes' > 1200" vs "metadata.size_bytes": {"$gt": 1200} is a small point.
The real ask here is implementing view.match(<function>) rather than view.match(<match-string-using-either-syntax>). In the former case, the user has ultimate power to define a match operation that depends on 15 different fields in strange ways, which they can put in their custom, IDE-friendly function, at the cost of a potentially small performance overhead of reading the samples into memory during that stage.
On a multidisplay machine, I've noticed that fo.launch_dashboard() causes a dashboard to launch on my default display. It would be nice if the dashboard would launch on the same display as the terminal window that I used to launch it; on multidisplay machines, you might not realize that the app has opened if you're not looking at the right display.
Similarly, if possible, it would be nice if the window would open with focus (so it doesn't appear in the background, again to ensure that users see that the dashboard has launched).
1) foz.load_zoo_dataset() without pip installing torch, torchvision, tensorflow-datasets
2) FIXED The uniqueness walkthrough errors on the call fob.compute_uniqueness():
In [8]: fob.compute_uniqueness(dataset)
Search path is empty
---------------------------------------------------------------------------
ModelError Traceback (most recent call last)
<ipython-input-8-66cf7b6a3d6c> in <module>
----> 1 fob.compute_uniqueness(dataset)
~/venvs/test/lib/python3.6/site-packages/fiftyone/brain/uniqueness.py in compute_uniqueness(***failed
resolving arguments***)
~/venvs/test/lib/python3.6/site-packages/eta/core/learning.py in load_default_deployment_model(model_
name)
130 specified model
131 """
--> 132 model = etam.get_model(model_name)
133 config = ModelConfig.from_dict(model.default_deployment_config_dict)
134 return config.build()
~/venvs/test/lib/python3.6/site-packages/eta/core/models.py in get_model(name)
97 ModelError: if the model could not be found
98 """
---> 99 return _find_model(name)[0]
100
101
~/venvs/test/lib/python3.6/site-packages/eta/core/models.py in _find_model(name)
611 if Model.has_version_str(name):
612 return _find_exact_model(name)
--> 613 return _find_latest_model(name)
614
615
~/venvs/test/lib/python3.6/site-packages/eta/core/models.py in _find_latest_model(base_name)
635
636 if _model is None:
--> 637 raise ModelError("No models found with base name '%s'" % base_name)
638 if _model.has_version:
639 logger.debug(
ModelError: No models found with base name 'simple_resnet_cifar10'
In this code (taken from https://github.com/voxel51/fiftyone/blob/develop/examples/model_inference/README.md), I had to add float(confidence); otherwise I got an error about confidence, which was a numpy float32 or something similar, not being a supported value for a mongoengine.fields.FloatField.
for imgs, sample_ids in data_loader:
predictions, confidences = predict(model, imgs)
# Add predictions to your FiftyOne dataset
for sample_id, prediction, confidence in zip(
sample_ids, predictions, confidences
):
sample = dataset[sample_id]
sample[model_name] = fo.Classification(label=labels_map[prediction])
sample["confidence"] = float(confidence) # float() is required here, but shouldn't need to be...
sample.save()
Kind of hard to believe that MongoEngine doesn't handle casting a np.float32 into a float, but, alas, it seems our wrapper around mongoengine.fields.FloatField will need to override the validate() function below to cast non-int types with float() as well...
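A sketch of the proposed override, using a toy stand-in for the wrapper class. MongoEngine's actual validate() raises on invalid values rather than returning; this simplified version returns the cast value so the behavior is easy to see:

```python
class SafeFloatField:
    """Toy stand-in for a FloatField wrapper that casts any float()-able value.

    MongoEngine's FloatField casts ints but rejects other numeric types such
    as np.float32; casting with float() first avoids the validation error.
    """

    def validate(self, value):
        if not isinstance(value, float):
            try:
                value = float(value)  # handles np.float32, np.float64, int, ...
            except (TypeError, ValueError):
                raise ValueError(
                    "FloatField only accepts float values: %r" % (value,)
                )
        return value
```

With this in place, the float(confidence) cast in the example loop above would no longer be needed by the user.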
I've been running through the file_hashing example and have noticed some issues so far (and from a quick search, other examples appear to be affected as well):
- DatasetView.filter() was replaced in 6665090 (#52)
- Dataset.aggregate() was made private in 6401fc3 (is there a user-facing replacement? This change doesn't appear to have been discussed in a PR)

There could be others; I haven't gone through other examples yet. I don't know how to properly update all of the examples, so I would prefer to have someone with more knowledge of these changes address them.
(this issue belongs in ETA, but I'm adding it here first for visibility, as FiftyOne is the primary user of this needed install improvement)
FiftyOne currently performs a full ETA install (https://github.com/voxel51/eta/blob/develop/install.bash), which includes the following items, which should be removed from the pip install eta component of the ETA install and instead moved to a startup script or some other appropriate place:
- ffmpeg
- imagemagick
- parts of eta.core.image that we'd never be calling
- tensorflow, installed for fiftyone or eta package reasons
- a clone of tensorflow/models

ETA needs a lite install process that installs what FiftyOne needs and nothing else, which can be accomplished via pip install.
To preserve functionality for existing users, ETA also needs a full install that can be accomplished via pip install