
kubric's Issues

Avoid object overlap

The worker currently does not ensure that the randomly placed objects do not overlap.
The simulator does check this and returns a bool, but that information is not used yet.
Overlapping objects should be repositioned, e.g. by re-sampling their pose.
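A minimal rejection-sampling sketch of what this could look like; simulator.check_overlap and sample_random_pose are hypothetical names, not the current worker API:

import logging

MAX_TRIES = 100  # hypothetical retry budget

def place_without_overlap(obj, simulator, sample_random_pose):
  # Re-sample the object's pose until the simulator reports no overlap.
  for _ in range(MAX_TRIES):
    obj.position, obj.quaternion = sample_random_pose()
    if not simulator.check_overlap(obj):  # hypothetical overlap query
      return
  logging.warning("could not place %s without overlap", obj)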

Additional render passes error (blender 2.83.2)

I had to comment these lines out:

    # bpy.context.scene.use_pass_crypto_object = True  # segmentation
    # bpy.context.scene.pass_crypto_depth = 4

Why? They trigger the following error:

Traceback (most recent call last):
  File "/home/atagliasacchi_google_com/dev/kubric/worker.py", line 67, in <module>
    renderer = THREE.Renderer()
  File "./kubric/viewer/blender.py", line 316, in __init__
    bpy.context.scene.use_pass_crypto_object = True  # segmentation
AttributeError: 'Scene' object has no attribute 'use_pass_crypto_object'
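A hedged guess at the cause: in Blender 2.8x the Cryptomatte settings appear to live on the view layer's Cycles settings rather than on the scene, so something like the following might work instead (untested; the attribute locations are an assumption):

import bpy

# Assumed 2.83 location of the Cryptomatte settings: per view layer, not per scene.
view_layer = bpy.context.view_layer
view_layer.cycles.use_pass_crypto_object = True  # segmentation
view_layer.cycles.pass_crypto_depth = 4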

Add blender "modifiers" to interface.py

import bpy

def subdivision(mesh, level=0):
  bpy.context.view_layer.objects.active = mesh
  bpy.ops.object.modifier_add(type='SUBSURF')
  mesh.modifiers["Subdivision"].render_levels = level  # subdivision level at render time
  mesh.modifiers["Subdivision"].levels = level  # subdivision level in the 3D viewport

Migrate github unittest action to dockerfile workflow

Since #69, the tests require a working bpy module to run, and should thus be run in the Kubruntu container.

For that I suggest changing the actions as follows:

  • on every release (?) build the Kubruntu docker container and upload it to dockerhub
  • on every commit / PR run the tests using the latest Kubruntu container

[bpy] Properly importing OBJs from file

In blender.Scene.add_from_file(), this operation:

bpy.ops.import_scene.obj(filepath=path, axis_forward='Y', axis_up='Z')

does not update bpy.context.object (which we need to create the blender.Object3D).

So I resorted to bpy.context.selected_objects[:], as mentioned here.

But this solution still needs to be verified.
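For reference, a sketch of the workaround (the snapshot of selected_objects is the part that needs verification):

import bpy

def import_obj(path):
  # The OBJ importer does not update bpy.context.object, but it does leave
  # the newly created objects selected, so we snapshot the selection instead.
  bpy.ops.import_scene.obj(filepath=path, axis_forward='Y', axis_up='Z')
  return bpy.context.selected_objects[:]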

Patch cycles

This is a known Blender bug, but the fact that it produces a (harmless) error is still distracting:

Exception ignored in: <function CyclesRender.__del__ at 0x7f54b7d5c320>
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/2.83/scripts/addons/cycles/__init__.py", line 68, in __del__
TypeError: 'NoneType' object is not callable

Hence I propose a Kubruntu patch for
/usr/local/lib/python3.7/site-packages/2.83/scripts/addons/cycles/__init__.py
that replaces

    def __del__(self):
        engine.free(self)

with:

    def __del__(self):
        if engine.free is not None:
            engine.free(self)

ShapeNet preprocessing

Script for preprocessing ShapeNet so that it can be used as an AssetSource:

  • prepare a docker container that contains all dependencies for the task
  • prepare a Makefile to reproduce the whole pipeline standalone

Cleanup quaternion conversion

Blender and PyBullet use different quaternion formats. Currently the conversion happens in worker.py;
it needs to be cleaned up and hidden from the user-facing code.
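For context, PyBullet stores quaternions as (x, y, z, w) while Blender's rotation_quaternion uses (w, x, y, z); the conversion that should be hidden away boils down to a component reordering:

def pybullet_to_blender(quat_xyzw):
  # PyBullet (x, y, z, w) -> Blender rotation_quaternion (w, x, y, z).
  x, y, z, w = quat_xyzw
  return (w, x, y, z)

def blender_to_pybullet(quat_wxyz):
  # Blender rotation_quaternion (w, x, y, z) -> PyBullet (x, y, z, w).
  w, x, y, z = quat_wxyz
  return (x, y, z, w)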

Blender REPL with custom pip dependencies

Ubuntu 18.04

⚠️ the Ubuntu apt package manager contains an obsolete version of Blender (unusable)
⚠️ the snapd package cannot be used: its read-only file system prevents pip installs
⚠️ one could in theory point Blender at a custom virtualenv Python ... no luck

Solution: download the binary tarball and pip-install into it:

wget https://download.blender.org/release/Blender2.83/blender-2.83.2-linux64.tar.xz
md5sum blender-2.83.2-linux64.tar.xz  # expected: 96194bd9d0630686765d7c12d92fcaeb
tar -xvf blender-2.83.2-linux64.tar.xz
export PATH="$PWD/blender-2.83.2-linux64:$PATH"

To find out the location of the Blender REPL python:

blender -noaudio --background --python-console
blender -noaudio --background --python-expr "import os; print(os.__file__)"

And finally dependencies can be installed via:

alias bpython=blender-2.83.2-linux64/2.83/python/bin/python3.7m
bpython -m ensurepip
bpython -m pip install --upgrade pip
bpython -m pip install wheel
bpython -m pip install numpy
...

To test a hello-world script that uses some of these dependencies:

blender -noaudio --background --python helloworld.py
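A minimal helloworld.py that exercises both bpy and a pip-installed dependency could look like this (a sketch):

# helloworld.py -- sanity check that bpy and pip-installed numpy coexist.
import bpy
import numpy as np

print("blender version:", bpy.app.version_string)
print("numpy says:", np.arange(3) + 1)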

macOS Catalina (10.15.4)

brew cask install blender

and then simply

alias bpython="/Applications/Blender.app/Contents/Resources/2.83/python/bin/python3.7m"
bpython -m ensurepip
bpython -m pip install --upgrade pip
bpython -m pip install wheel
bpython -m pip install numpy

Unify terminology

Each library has its own standards for how to name properties, objects and other settings. This is especially annoying for things like position/location and rotation/orientation.
We should decide upon a standard and stay at least internally consistent.
Here is an overview for position and rotation, along with proposed naming in Kubric (more to follow).

Kubric       PyBullet      Blender              ThreeJS
position     position      location             position
quaternion   orientation   rotation_quaternion  quaternion
rotation     -             rotation_euler       rotation

make_render gives an error for the -t option

Hi,

I am following the steps for running on GCP with my project and instance.

make_kubruntu finished successfully. Below is the output of make_render local, which resulted in an error:

[name]@instance-1:~/kubric/kubric$ ./make_render.sh local
++ date +%b%d_%H%M%S
+ JOB_NAME=kubric_Aug20_144910
++ gcloud config get-value project
+ PROJECT_ID=[name]:[name]
+ TAG=gcr.io/[name]:[name]/kubric
+ REGION=us-central1
+ cat
+ docker build -f /tmp/Dockerfile -t gcr.io/[name]:[name]/kubric /home/[name]/kubric/kubric
invalid argument "gcr.io/[name]:[name]/kubric" for "-t, --tag" flag: invalid reference format
See 'docker build --help'.
+ cat
+ run_mode=local
+ shift
+ '[' local == local ']'
+ docker run gcr.io/[name]:[name]/kubric -- --output kubric_Aug20_144910/frame_
docker: invalid reference format.
See 'docker run --help'.

My guess: the issue is with my project id having a ':' in it. Any idea how to fix it?

(I redacted the names to [name].)

Dataset Export

Currently the script generates a set of numpy arrays for the images, masks, optical flow, etc., as well as a dict with metadata.
This information has to be structured in a standardized format, aggregated, and converted into a TF dataset.

Preliminary suggestion for the format of each batch:

  • image (B, T, H, W, 3) float32 RGB information in the interval [0, 1]
  • segmentation_id (B, T, H, W, 2) int32 object IDs per pixel
  • segmentation_alpha (B, T, H, W, 2) float32 in the interval [0, 1]
  • optical_flow (B, T, H, W, 2) float32 in the interval [-1, 1]
  • camera_position (B, T, 3)
  • camera_rotation (B, T, 3)
  • objects
    • number (B, ) int32 : number of objects in this scene
    • global_position (B, K, 3) float32 global (world) coordinates of the object center of mass
    • global_rotation (B, K, 3) float32 orientation of the object in global (world) coordinates
    • id (B, K) int32 object/shape IDs
    • ...
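A hedged sketch of how the per-example part of this format could be declared with tensorflow_datasets; the batch dimension B would be added by the data loader, and the concrete T/H/W/K values are placeholder assumptions:

import tensorflow as tf
import tensorflow_datasets as tfds

T, H, W, K = 24, 256, 256, 10  # hypothetical: frames, resolution, max objects

features = tfds.features.FeaturesDict({
    "image": tfds.features.Tensor(shape=(T, H, W, 3), dtype=tf.float32),
    "segmentation_id": tfds.features.Tensor(shape=(T, H, W, 2), dtype=tf.int32),
    "segmentation_alpha": tfds.features.Tensor(shape=(T, H, W, 2), dtype=tf.float32),
    "optical_flow": tfds.features.Tensor(shape=(T, H, W, 2), dtype=tf.float32),
    "camera_position": tfds.features.Tensor(shape=(T, 3), dtype=tf.float32),
    "camera_rotation": tfds.features.Tensor(shape=(T, 3), dtype=tf.float32),
    "objects": tfds.features.FeaturesDict({
        "number": tfds.features.Tensor(shape=(), dtype=tf.int32),
        "global_position": tfds.features.Tensor(shape=(K, 3), dtype=tf.float32),
        "global_rotation": tfds.features.Tensor(shape=(K, 3), dtype=tf.float32),
        "id": tfds.features.Tensor(shape=(K,), dtype=tf.int32),
    }),
})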

Add support for PolyGen

DeepMind recently released their 3D mesh generator PolyGen as open source here.
It can generate a variety of pretty high-quality class-conditional meshes (see below).
I think we should integrate support for this directly into Kubric, in the form of a specialized asset source.
Their example Colab seems pretty easy to use, uses a model pre-trained on ShapeNet, and can even export meshes as .obj files. Adding it to Kubric should thus be very straightforward.

[Screenshot: examples of PolyGen-generated meshes]

Distribute pre-generated datasets + parametrized GitHub workflows for generating a few sample frames on GitHub CI without having to install Blender/dependencies locally

It would be very helpful if Kubric put some example datasets that it is able to produce online. This would allow getting familiar with the file structure (even if Kubric still requires some polishing and is not ready for mass usage yet).

Another question: how much compute does dataset generation require? How long does it take? If it takes hours of single-core CPU time, it may be possible to just use GitHub Actions for generating simple CATER-like datasets.

Add support for scaling objects

It should be possible to scale objects during import.
This includes:

  • adding a scale parameter to Objects
  • adding scaling to the Simulator
  • adding scaling to the Blender viewer / interface
  • adjusting physical information such as mass, volume, area, inertia, ... (see the sketch after this list)
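For the last point, under a uniform scale factor s and constant density the adjustments are deterministic; a sketch (the attribute names are assumptions):

def rescale_physical_properties(obj, s):
  # Constant density: area scales with s**2, volume and mass with s**3,
  # and the inertia tensor with s**5 (mass * length**2).
  obj.area *= s ** 2                               # hypothetical attribute
  obj.volume *= s ** 3                             # hypothetical attribute
  obj.mass *= s ** 3                               # hypothetical attribute
  obj.inertia = [i * s ** 5 for i in obj.inertia]  # hypothetical attribute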

warning on cloud bucket authentication

Get rid of this warning:

~/Envs/python3/lib/python3.6/site-packages/google/auth/_default.py:69: UserWarning: Your application has authenticated using end user credentials from Google Cloud SDK without a quota project. You might receive a "quota exceeded" or "API not enabled" error. We recommend you rerun `gcloud auth application-default login` and make sure a quota project is added. Or you can use service accounts instead. For more information about service accounts, see https://cloud.google.com/docs/authentication/
  warnings.warn(_CLOUD_SDK_CREDENTIALS_WARNING)
~/Envs/python3/lib/python3.6/site-packages/google/auth/_default.py:69: UserWarning: Your application has authenticated using end user credentials from Google Cloud SDK without a quota project. You might receive a "quota exceeded" or "API not enabled" error. We recommend you rerun `gcloud auth application-default login` and make sure a quota project is added. Or you can use service accounts instead. For more information about service accounts, see https://cloud.google.com/docs/authentication/

I followed the gcloud auth application-default login suggestion to no avail; do we need a service account?
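Until the auth setup is sorted out, one possible stopgap is a targeted warnings filter (a sketch; the message prefix is copied from the log above):

import warnings

# Stopgap only: silence this specific google.auth UserWarning. The proper fix
# is a quota project or a service account, as the message itself suggests.
warnings.filterwarnings(
    "ignore",
    message="Your application has authenticated using end user credentials",
    category=UserWarning,
)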

Configure python linter

Currently the automatic linting only fails on syntax errors.
It should be properly configured to (partially) enforce our style guidelines.

Set up automated tests

We should have a continuous integration server that automatically runs style checks, linters, and unit tests on each PR.

Calm codecov down a notch

Our code coverage reporting fails on any PR that decreases test coverage (this is the default setting for codecov).
At the current state of development that is not ideal, so we should deactivate it.

Configurable Materials

Currently material information is tied to the object and stored as an .mtl file alongside the .obj file for the visual geometry.
This is not ideal for two reasons:

  • .mtl is very limited and cannot store PBR attributes such as metalness
  • there is currently no way to change/randomize the material of objects

We should add:

  • A way to store and import materials separately from objects. Maybe the asset source could create Materials the same way it creates objects.
  • A way to configure a few basic material properties, at least including color, roughness, specularity, and metalness.
  • A way to change textures, possibly as part of materials.
  • A way to export material information as ground truth.
  • Possibly a way to assign physical properties to materials, such as friction, density and bounciness (see the sketch below).
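As a starting point for discussion, the user-facing side could be a small dataclass (a sketch; all names and defaults are hypothetical):

import dataclasses
from typing import Optional, Tuple

@dataclasses.dataclass
class Material:
  # Hypothetical user-facing material description.
  color: Tuple[float, float, float] = (1.0, 1.0, 1.0)
  roughness: float = 0.5
  specular: float = 0.5
  metalness: float = 0.0
  texture_path: Optional[str] = None  # optional texture (see bullet above)
  # possible physical properties (last bullet):
  friction: float = 0.5
  density: float = 1000.0  # kg / m**3
  restitution: float = 0.0  # "bounciness"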

Only send kubric logging info to stdout/stderr

A first attempt failed: PyBullet writes to stdout at the C level, so it ignores the following Python-level redirection:

import logging
import pathlib
import sys

def redirect_stdout(dirpath):
  filepath = pathlib.Path(dirpath) / "stdout.txt"
  logging.info(f"redirecting stdout to {filepath}")
  sys.stdout = open(filepath, "w")  # Python-level only; C-level writes bypass this
  sys.stderr = open(filepath, "w")
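A likely fix is to redirect at the file-descriptor level, which C extensions do inherit (a sketch):

import os
import pathlib

def redirect_stdout_fd(dirpath):
  # Duplicate the log file's descriptor over fds 1 and 2, so that C-level
  # writes (e.g. from PyBullet) are redirected too.
  filepath = pathlib.Path(dirpath) / "stdout.txt"
  log_fd = os.open(str(filepath), os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
  os.dup2(log_fd, 1)  # stdout
  os.dup2(log_fd, 2)  # stderr
  os.close(log_fd)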

Integrate pybullet→blender in worker.py

Main design decision: how to import meshes, since URDF is not directly supported by Blender and PyBullet's mesh export is experimental.

Blender side:

bpy.ops.import_scene.obj(filepath=path, axis_forward='Y', axis_up='Z')

PyBullet side (post-import), see the docs:

getVisualShapeData(..)
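A hedged sketch of gluing the two sides together in worker.py; index 4 of each getVisualShapeData tuple is the mesh asset file name per the PyBullet docs, the rest is an assumption:

import bpy
import pybullet as pb

def import_body_visuals(body_id):
  # Import the OBJ meshes behind a PyBullet body into the Blender scene.
  for shape in pb.getVisualShapeData(body_id):
    mesh_filename = shape[4].decode("utf-8")  # mesh asset file name (bytes)
    if mesh_filename.endswith(".obj"):
      bpy.ops.import_scene.obj(filepath=mesh_filename,
                               axis_forward='Y', axis_up='Z')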

File I/O that is transparent to local vs. bucket

Currently path operations cannot use pathlib.Path, as Google Cloud bucket URL prefixes are not supported: gs://bucket/foo/bar gets collapsed to gs:/bucket/foo/bar and becomes invalid.

TFDS seems to offer some functionality in this regard:

path = tfds.core.as_path("gs://foo/bar") / blah
path.write_text('asdas')
with path.open('rb') as fp:
  ...
tf.io.gfile.exists(path)  # Works in TF 2.4+

Note: PathLike support for tf.io.gfile was added only in TF 2.4+. Before that, you need to replace tf.io.gfile.exists(path) with tf.io.gfile.exists(os.fspath(path)). In TFDS we mock tf.io.gfile during generation for backward compatibility.

Thanks to @Conchylicultor for the pointers

Exporting to cloud bucket

  1. just use gsutil -m cp -r from_path to_path and rely on the cloud project existing and being authenticated
  2. use the google.cloud storage Python interface:
import logging
import pathlib
from google.cloud import storage

def mirror_to_bucket(from_path, bucket_path: str, project=None):
  """Equivalent to: gsutil -m cp -r from_path bucket_path"""
  from_path = pathlib.Path(from_path)
  bucket_path = pathlib.Path(bucket_path)
  assert bucket_path.parts[0] == "gs:"

  bucket_name = bucket_path.parts[1]
  client = storage.Client(project=project)
  bucket_obj = client.get_bucket(bucket_name)  # expects the bucket name without prefix
  target_blob_prefix = pathlib.Path(*bucket_path.parts[2:])

  for source_blob in from_path.rglob("*"):
    if not source_blob.is_file():
      continue
    target_blob = str(target_blob_prefix / source_blob.relative_to(from_path))
    source_blob = str(source_blob)
    logging.info(f"copying {source_blob} → {target_blob}")
    bucket_obj.blob(target_blob).upload_from_filename(source_blob)

→ after discussion, (1) seems to be the cleaner way, especially as the sync happens after the rendering job is complete (the top-level -m option enables multi-threaded upload to the bucket)

pre-setup (to be implemented in the Dockerfile)

For both cases to work, we need to install the Google Cloud SDK and silently configure it.
p.s. isn't there a cleaner apt-get package for the Cloud SDK? → investigate

apt-get install curl
curl -O https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-315.0.0-linux-x86_64.tar.gz
tar -zxvf google-cloud-sdk-315.0.0-linux-x86_64.tar.gz
./google-cloud-sdk/install.sh --quiet

and then make it available

echo source ./google-cloud-sdk/path.bash.inc >> ~/.bashrc
echo export GCLOUD_PROJECT=........... >> ~/.bashrc
source ~/.bashrc

Logs from blender REPL not flowing

When print or logging.info is executed within a Blender REPL and the job runs on cloud, nothing is printed to GCP's logging infrastructure.

If we use the Python module everywhere this is not a problem, but if we stick to the REPL it would be good to understand how to fix this, as otherwise it will make future debugging a nightmare.
