surfemb's People

Contributors

rasmushaugaard


surfemb's Issues

value of loss in ycbv

I would like to know the value of the loss after 500,000 iterations (the default config) of training on the YCBV PBR dataset and the real-image dataset, because I get a worse recall than in the paper. I just need a rough range of the loss.
Thank you.

remove invisible parts of the objects

When I try to run python -m surfemb.scripts.misc.surface_samples_remesh_visible tless, it fails with:
ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by /home/robot/miniconda3/envs/surfemb/lib/python3.8/site-packages/cv2.cpython-38-x86_64-linux-gnu.so)

I installed opencv-python==4.1.2.30 to fix it.

But that causes another problem:
Traceback (most recent call last):
  File "/home/zzz/miniconda3/envs/surfemb-test/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/zzz/miniconda3/envs/surfemb-test/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/zzz/github/surfemb/surfemb/scripts/misc/surface_samples_remesh_visible.py", line 29, in <module>
    ms.repair_non_manifold_edges_by_removing_faces()
AttributeError: 'pymeshlab.pmeshlab.MeshSet' object has no attribute 'repair_non_manifold_edges_by_removing_faces'

question about pose score.

Hi, thank you again.

I'm comparing the pose score you proposed in the paper with its implementation in the code, and I have some questions.

  1. In this line you calculate neg_mask_log_prob by negating mask_lgts before feeding it into logsigmoid. Why do you negate mask_lgts? Does it actually mean anything? (See the sketch after this list.)
  2. Can I treat the pose score as a confidence, as long as I map its value to [0, 1] via some monotonically increasing function?
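For reference, the negation most likely exploits the identity log(1 − σ(x)) = logsigmoid(−x), i.e. the log-probability of a pixel not belonging to the mask, computed in a numerically stable way. A minimal sketch (hypothetical tensor shapes and values, not the repo's code):

import torch
import torch.nn.functional as F

mask_lgts = torch.randn(4, 224, 224)           # raw mask logits (hypothetical shape/values)

pos_mask_log_prob = F.logsigmoid(mask_lgts)    # log P(pixel is inside the mask)
neg_mask_log_prob = F.logsigmoid(-mask_lgts)   # log(1 - sigmoid(x)) == logsigmoid(-x)

# same value as the naive form, but without log(0) issues for large logits
naive = torch.log(1.0 - torch.sigmoid(mask_lgts))
print(torch.allclose(neg_mask_log_prob, naive, atol=1e-5))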

Pose Refiner Diverges

Thank you for the wonderful work. Both the paper and the code are a pleasure to read.

I have tried the approach on a different dataset and would like to ask for your expert opinion, if I may. A fraction of the predictions (~60%) are very good even with only RGB refinement, but the remaining pose predictions are far away from the actual pose (about 1 m in L1 distance) and could be ruled out by calculating the xyz boundaries of the crop (see the sketch after the questions below). The input poses from PnP are almost equally good for both fractions.

  • Do you have an idea of how to keep the refinement in check?
  • How would you analyze the quality of the incoming query image or the sampled keys?
  • Do you have any other suggestions on what to look for in these cases?
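As a rough illustration of the crop-boundary filter mentioned above (purely a sketch with hypothetical names, thresholds and units, not code from the repo), one could reject a pose whose estimated translation falls outside the crop's view frustum:

import numpy as np

def translation_inside_crop(t, K_crop, crop_res=224, z_range=(0.1, 3.0)):
    # t: (3,) translation in the camera frame (metres, hypothetical units)
    # K_crop: (3, 3) intrinsics of the crop; crop_res: crop size in pixels
    z = t[2]
    if not (z_range[0] < z < z_range[1]):
        return False          # implausible depth for the crop
    u, v, w = K_crop @ t      # project the object centre into the crop
    u, v = u / w, v / w
    return 0 <= u < crop_res and 0 <= v < crop_res

K = np.array([[500., 0., 112.],
              [0., 500., 112.],
              [0., 0., 1.]])
print(translation_inside_crop(np.array([0.0, 0.0, 0.8]), K))  # True: centre lands in the crop
print(translation_inside_crop(np.array([1.0, 0.0, 0.8]), K))  # False: ~1 m off to the side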

Thanks again for the wonderful work.

What do the `current_pose` array values mean?

I have modified the infer_debug file so that when a pose estimate is completed, it sends the contents of the current_pose array. My question is: what do these numbers actually mean? I would assume x, y, z, pitch, roll and yaw, but I have no idea which corresponds to which.
(screenshot omitted)
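If current_pose holds a rigid transform, i.e. a 3×3 rotation matrix plus a translation vector (an assumption worth checking against the array's shape in infer_debug), the values are not Euler angles directly, but they can be decomposed into x, y, z and roll, pitch, yaw. A hypothetical sketch:

import numpy as np
from scipy.spatial.transform import Rotation

# hypothetical 4x4 pose: rotation R in the top-left 3x3 block, translation t in the last column
current_pose = np.eye(4)
current_pose[:3, :3] = Rotation.from_euler('xyz', [10, 20, 30], degrees=True).as_matrix()
current_pose[:3, 3] = [0.05, -0.02, 0.60]   # x, y, z in the camera frame

R, t = current_pose[:3, :3], current_pose[:3, 3]
roll, pitch, yaw = Rotation.from_matrix(R).as_euler('xyz', degrees=True)
print('t (x, y, z):', t)
print('roll, pitch, yaw [deg]:', roll, pitch, yaw)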

assert len(obj_ids) > 0 AssertionError

PLEASE SEE COMMENT BELOW FOR NEW ERROR:

I'm getting this strange error when I run python -m surfemb.scripts.infer <path>/data/models/ycbv-jwpvdij1.compact.ckpt --device cuda:0.

It originates from line 50 of infer.py, and also occurs when you run infer_debug.py.
If the full error stack would be useful, let me know.

Full Stack:
Traceback (most recent call last):
  File "/home/usr/anaconda3/envs/surfemb/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/usr/anaconda3/envs/surfemb/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/usr/Documents/6DPose/surfemb/scripts/infer.py", line 50, in <module>
    assert len(obj_ids) > 0
AssertionError

The solution was to rename ycbv_models to ycbv.

Any Plan to upgrade?

Hi Rasmus,
Your deep learning model is really powerful; there is so much work in it. I have made it work and am using it. For my custom PBR dataset it works really well, but not perfectly. My question is whether you have plans to improve the model to a v2? It would be great if you do.
Thanks in advance
Best Regards

question about bbox and mask

Thanks again for this great work. I am still trying to make it work for our case. To run the model with our custom dataset, do we need to have bbox and mask in our BOP-formatted custom dataset? We can provide bbox_visible but not bbox, and mask_visible but not mask. How important are they? Would bbox_visible and mask_visible be enough?
If we do need them for this model, do you have any idea how to generate them? (A small sketch for deriving the visible bbox from a mask follows below.)
Thanks in advance.
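For what it's worth (a plain numpy illustration, not part of the BOP toolkit): a visible bounding box can be derived directly from a visible mask, whereas the amodal bbox/mask would require rendering the full, unoccluded object.

import numpy as np

def bbox_from_mask(mask):
    # tight (x, y, w, h) box around the non-zero mask pixels, i.e. a BOP-style visible bbox;
    # returns None for an empty mask
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None
    x0, x1 = xs.min(), xs.max()
    y0, y1 = ys.min(), ys.max()
    return int(x0), int(y0), int(x1 - x0 + 1), int(y1 - y0 + 1)

mask = np.zeros((480, 640), dtype=np.uint8)
mask[100:200, 300:350] = 1
print(bbox_from_mask(mask))  # (300, 100, 50, 100)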

Custom dataset

I want to train on my own dataset (texture-less), which has 4 classes.

First, how should I train the 2D detector? Mask R-CNN or RetinaNet?

Second, which parameters should I modify in surfemb?

Can you give me some advice? Thanks~

The code for visualization

Hi, authors!

Thank you for your great work! The visualization results in your paper and on the website are amazing, so would you please also share the visualization code?

dataset settings

How is everyone formatting the dataset? When I train on itodd, I get:
ValueError: num_samples should be a positive integer value, but got num_samples=0
My layout is:
data
  bop
    itodd
      models
      train
      test
      camera.json
      dataset_info.md
      test_targets_bop19.json

train on a custom dataset

I'm trying to train surfemb on a custom dataset, but when I do inference I find that some files/dirs are required (detection_results, surface_samples, surface_samples_normals).
For detection_results, I already know it comes from CosyPose, but I don't know how to generate it in detail.
For surface_samples, I followed your guidance and first ran $ python -m surfemb.scripts.misc.surface_samples_remesh_visible clip, but I encountered this error:

Traceback (most recent call last):
  File "/root/miniconda3/envs/surfemb/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/miniconda3/envs/surfemb/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/root/surfemb/surfemb/scripts/misc/surface_samples_remesh_visible.py", line 32, in <module>
    ms.compute_scalar_ambient_occlusion(occmode='per-Face (deprecated)', reqviews=256)
AttributeError: 'pymeshlab.pmeshlab.MeshSet' object has no attribute 'compute_scalar_ambient_occlusion'

Could you please give me any suggestions on these two problems?

multiple instances in a scene

Hello, thanks for your great work and patient replies.
I see that the inputs to the proposed SurfEmb network are cropped images containing only one instance, both during training and inference, aren't they? So can you deal with an image that has multiple instances whose bounding boxes are not known? (A rough sketch of the usual detect-then-crop pipeline follows below.)
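For context, a rough sketch of the common detect-then-crop pipeline for multi-instance images, with hypothetical helper names and a dummy model standing in for SurfEmb + PnP-RANSAC:

import numpy as np

def crop(image, bbox):
    # bbox = (x, y, w, h); hypothetical crop helper
    x, y, w, h = bbox
    return image[y:y + h, x:x + w]

def estimate_poses(image, detections, pose_model):
    # one pose estimate per detected instance (sketch, not the repo's API)
    poses = []
    for det in detections:  # det: {'obj_id': int, 'bbox': (x, y, w, h)}
        pose = pose_model(crop(image, det['bbox']), det['obj_id'])
        poses.append((det['obj_id'], pose))
    return poses

# toy usage with dummy stand-ins for the detector output and the pose model
image = np.zeros((480, 640, 3), dtype=np.uint8)
detections = [{'obj_id': 1, 'bbox': (300, 100, 50, 100)},
              {'obj_id': 5, 'bbox': (10, 20, 64, 64)}]
dummy_model = lambda crop_img, obj_id: np.eye(4)  # stands in for SurfEmb + PnP-RANSAC
print(estimate_poses(image, detections, dummy_model))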

Some problems encountered when training tless

I downloaded tless from BOP and put it under data/bop/tless, and the code can load the CAD models. I run python -m surfemb.scripts.train tless --gpus 0, but after it reaches trainer.fit the following error occurs:
TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found <class 'trimesh.caching.TrackedArray'>
Sorry, have I set something up wrong?
(screenshot omitted)

An offset for PBR's camera K

Hi,

Thank you for your excellent work!
I'm just wondering why there is a 0.5 offset applied to the camera K for PBR data.
Thank you!

Best,
Rui
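One plausible reading (not a confirmed answer from the author): a half-pixel shift of the principal point compensates for differing conventions on whether a pixel's centre sits at integer or half-integer coordinates, e.g. between the renderer that produced the PBR data and the cropping/rendering code. A tiny sketch of applying such an offset to a hypothetical K:

import numpy as np

# hypothetical camera matrix (fx, fy, cx, cy)
K = np.array([[1075.65, 0.0, 320.0],
              [0.0, 1073.91, 240.0],
              [0.0, 0.0, 1.0]])

# Shifting the principal point by half a pixel converts between the
# "pixel centre at integer coordinates" and "pixel centre at i + 0.5" conventions.
K_shifted = K.copy()
K_shifted[:2, 2] += 0.5   # or -= 0.5, depending on which convention the data assumes
print(K_shifted)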

Stuck at 0% while training (Epoch 0: 0% 0/13000)

Hi, I run 'python -m surfemb.scripts.train ycbv'; ./data/bop/ycbv/models has 21 .ply files and there are 80 folders in ./data/bop/ycbv/train_real. I did not use synthetic images, but the program stays at 0%. Is the program preprocessing image information, i.e. cropping the objects from the images? I waited a dozen hours and it was still at 0% (Epoch 0: 0%), although the Python process was still running. Do I have to wait a long time before the model starts training? Has anyone seen a similar situation?

But if I have just 3 .ply files in ./data/bop/ycbv/models and 1 folder in ./data/bop/ycbv/train_real, training starts soon (3-4 minutes) and eventually produces a trained model.
For example, obj_000008.ply, obj_000014.ply and obj_000021.ply in ./data/bop/ycbv/models, and scene 000000 in ./data/bop/ycbv/train_real (the images in 000000 contain only those 3 object types: obj_8, obj_14, obj_21).

If I want to train all the objects in ycbv at once, do I have to wait longer? I'm using a server, and the CPU performance is not weak.

{
"os": "Linux-4.15.0-175-generic-x86_64-with-debian-buster-sid",
"python": "3.7.11",
"heartbeatAt": "2022-04-09T09:34:05.311715",
"startedAt": "2022-04-09T09:34:02.496814",
"docker": null,
"gpu": "GeForce RTX 3090",
"gpu_count": 8,
"cpu_count": 40,
"cuda": null,
"args": [],
"state": "running",
"program": "-m surfemb.scripts.train",
"git": {
"remote": "https://github.com/rasmushaugaard/surfemb.git",
"commit": "46f46ddc5670848d696968dc8ec65c8ce62b16a8"
},
"email": "[email protected]",
"root": "/home/aa/prjs/surfemb",
"host": "sddx-PR4908P",
"username": "aa",
"executable": "/home/aa/anaconda3/envs/d2_1.10/bin/python"
}

logs:
2022-04-09 11:02:46,213 INFO MainThread:16337 [wandb_setup.py:_flush():75] Loading settings from /home/aa/.config/wandb/settings
2022-04-09 11:02:46,214 INFO MainThread:16337 [wandb_setup.py:_flush():75] Loading settings from /home/aa/prjs/bcnet/pose/surfemb/wandb/settings
2022-04-09 11:02:46,214 INFO MainThread:16337 [wandb_setup.py:_flush():75] Loading settings from environment variables: {'api_key': 'REDACTED', 'mode': 'offline', '_require_service': 'True'}
2022-04-09 11:02:46,214 WARNING MainThread:16337 [wandb_setup.py:_flush():75] Could not find program at -m surfemb.scripts.train
2022-04-09 11:02:46,214 INFO MainThread:16337 [wandb_setup.py:_flush():75] Inferring run settings from compute environment: {'program_relpath': None, 'program': '-m surfemb.scripts.train'}
2022-04-09 11:02:46,214 INFO MainThread:16337 [wandb_init.py:_log_setup():405] Logging user logs to /home/aa/prjs/bcnet/pose/surfemb/wandb/offline-run-20220409_110246-3fewafz3/logs/debug.log
2022-04-09 11:02:46,214 INFO MainThread:16337 [wandb_init.py:_log_setup():406] Logging internal logs to /home/aa/prjs/bcnet/pose/surfemb/wandb/offline-run-20220409_110246-3fewafz3/logs/debug-internal.log
2022-04-09 11:02:46,215 INFO MainThread:16337 [wandb_init.py:init():439] calling init triggers
2022-04-09 11:02:46,215 INFO MainThread:16337 [wandb_init.py:init():443] wandb.init called with sweep_config: {}
config: {}
2022-04-09 11:02:46,215 INFO MainThread:16337 [wandb_init.py:init():492] starting backend
2022-04-09 11:02:46,228 INFO MainThread:16337 [backend.py:_multiprocessing_setup():101] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
2022-04-09 11:02:46,232 INFO MainThread:16337 [wandb_init.py:init():501] backend started and connected
2022-04-09 11:02:46,238 INFO MainThread:16337 [wandb_init.py:init():565] updated telemetry
2022-04-09 11:02:46,578 INFO MainThread:16337 [wandb_init.py:init():625] starting run threads in backend
2022-04-09 11:02:49,104 INFO MainThread:16337 [wandb_run.py:_console_start():1733] atexit reg
2022-04-09 11:02:49,106 INFO MainThread:16337 [wandb_run.py:_redirect():1606] redirect: SettingsConsole.WRAP
2022-04-09 11:02:49,107 INFO MainThread:16337 [wandb_run.py:_redirect():1643] Wrapping output streams.
2022-04-09 11:02:49,108 INFO MainThread:16337 [wandb_run.py:_redirect():1667] Redirects installed.
2022-04-09 11:02:49,109 INFO MainThread:16337 [wandb_init.py:init():664] run started, returning control to user process
2022-04-09 11:02:49,130 INFO MainThread:16337 [wandb_run.py:_config_callback():992] config_cb None None {'n_objs': 21, 'emb_dim': 12, 'n_pos': 1024, 'n_neg': 1024, 'lr_cnn': 0.0003, 'lr_mlp': 3e-05, 'mlp_name': 'siren', 'mlp_hidden_features': 256, 'mlp_hidden_layers': 2, 'key_noise': 0.001, 'warmup_steps': 2000, 'separate_decoders': True, 'pa_sigma': 0.0, 'align_corners': False, 'dataset': 'ycbv', 'n_valid': 200, 'res_data': 256, 'res_crop': 224, 'batch_size': 16, 'num_workers': 'None', 'min_visib_fract': 0.1, 'max_steps': 500000, 'gpus': 2, 'debug': False, 'ckpt': 'None', 'synth': False, 'real': True}
2022-04-09 11:07:50,141 WARNING MsgRouterThr:16337 [router.py:message_loop():76] message_loop has been closed

TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'

When I try to train tless with python -m surfemb.scripts.train tless:

Traceback (most recent call last):
  File "/home/zzz/miniconda3/envs/surfemb/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/zzz/miniconda3/envs/surfemb/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/zzz/github/surfemb/surfemb/scripts/train.py", line 121, in <module>
    main()
  File "/home/zzz/github/surfemb/surfemb/scripts/train.py", line 61, in main
    model = SurfaceEmbeddingModel(n_objs=len(obj_ids), **vars(args))
  File "/home/zzz/github/surfemb/surfemb/surface_embedding.py", line 48, in __init__
    n_class=(emb_dim + 1) if separate_decoders else n_objs * (emb_dim + 1),
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'

It seems emb_dim is None, but it has the default value emb_dim=12.

About the implementation of Unet.

Hi, thank you for your great work!
I notice that your implementation of U-Net is a little different from the original one. In this line you feed the data from the "contracting path" through a convrelu before concatenating it with the current data from the "expansive path", which you do in the next line. But the U-Net authors merely copy it, with no further processing. What is the reason for doing it this way? (A simplified sketch of the difference follows below.)
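To make the difference concrete, here is a simplified decoder-block sketch (not the repo's exact module): the original U-Net concatenates the skip features as they are, while the variant asked about passes them through a conv+ReLU first, e.g. to process or re-project them before fusion.

import torch
import torch.nn as nn

def convrelu(in_ch, out_ch, k=3):
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, k, padding=k // 2), nn.ReLU(inplace=True))

class DecoderBlock(nn.Module):
    # sketch of one upsampling step with an (optionally) processed skip connection
    def __init__(self, skip_ch, up_ch, out_ch, process_skip=True):
        super().__init__()
        # conv+ReLU on the skip features (the variant asked about);
        # the original U-Net would effectively use nn.Identity() here
        self.skip = convrelu(skip_ch, skip_ch) if process_skip else nn.Identity()
        self.up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False)
        self.fuse = convrelu(skip_ch + up_ch, out_ch)

    def forward(self, skip_feat, up_feat):
        up_feat = self.up(up_feat)
        x = torch.cat([self.skip(skip_feat), up_feat], dim=1)
        return self.fuse(x)

block = DecoderBlock(skip_ch=64, up_ch=128, out_ch=64)
out = block(torch.randn(1, 64, 56, 56), torch.randn(1, 128, 28, 28))
print(out.shape)  # torch.Size([1, 64, 56, 56])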

scores in results and bop19_average_recall

Question 1: I used ycbv-jwpvdij1.compact.ckpt (a trained model that you provided) to run inference on the ycbv test set (python -m surfemb.scripts.infer) and then python -m surfemb.scripts.misc.format_results_for_eval. The scores in the results are all negative, for example -0.339 and -0.401. Is that normal?
(screenshot omitted; columns: A = scene_id, B = img_id, C = est_obj_id, D = score)

ImportError: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.25' not found

Hi~
Thanks for your sharing.

Ubuntu 16.04
2070 super

When I try to run python -m surfemb.scripts.misc.surface_samples_remesh_visible tless, I get the following problem:

Traceback (most recent call last):
  File "surfemb/scripts/misc/surface_samples_remesh_visible.py", line 4, in <module>
    import pymeshlab
  File "/home/zzz/miniconda3/envs/surfemb_test/lib/python3.8/site-packages/pymeshlab/__init__.py", line 11, in <module>
    from .pmeshlab import *
ImportError: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.25' not found (required by /home/zzz/miniconda3/envs/surfemb_test/lib/python3.8/site-packages/pymeshlab/lib/libpython3.8.so.1.0)

I cannot solve it, so I came here to ask for help.

Question about inference data

For some strange reason wget does not download the data to the correct place on my machine, so I downloaded the inference_data.zip file manually. I am now unsure where to extract its contents: would I do this in the root directory, or perhaps in /data or /data/models? Thank you.

About the instantiation of renderer

I'm trying to use your rendering code for depth image generation. Although I have set 'OPENBLAS_NUM_THREADS' etc., a segmentation fault always occurs in the workers if num_workers > 0. Do you have any idea about this? (A general workaround sketch follows below.)
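As a general note (not specific to this repo): an OpenGL/EGL context created in the parent process usually cannot be used from DataLoader worker processes, so a common workaround is to create the renderer lazily inside each worker. A minimal sketch with a placeholder standing in for the real renderer:

import torch
from torch.utils.data import Dataset, DataLoader

class RenderDataset(Dataset):
    # creates the (placeholder) renderer lazily, once per worker process,
    # instead of in the parent process before the workers are forked/spawned
    def __init__(self, n=100):
        self.n = n
        self.renderer = None          # do NOT create an OpenGL context here

    def _get_renderer(self):
        if self.renderer is None:
            self.renderer = object()  # stand-in for e.g. ObjCoordRenderer(objs, res)
        return self.renderer

    def __len__(self):
        return self.n

    def __getitem__(self, idx):
        renderer = self._get_renderer()
        # ... render a depth image with `renderer` here ...
        return torch.zeros(1)

if __name__ == '__main__':
    loader = DataLoader(RenderDataset(), batch_size=4, num_workers=2)
    for batch in loader:
        pass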

Depth inference fails due to value error

I have just tried to run the depth inference script. It completed all of the tasks then errored at line 154.

The error was:

poses_depth_timings = poses_timings + np.array(all_depth_timings) <- THIS LINE CAUSED THE ERROR
ValueError: operands could not be broadcast together with shapes (2,0) (2,5438)

I've had a good look online, and apparently this kind of error can be fixed by using numpy.dot(), but I can't see where that would apply here.
I'm also unsure whether the + operator is performing matrix multiplication or adding a 2-by-0 matrix to a 2-by-5438 matrix, which is obviously impossible. Do I need to resize the 2-by-5438 matrix using .resize?
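For reference (a plain numpy illustration, not the repo's data): + is element-wise addition with broadcasting, not matrix multiplication, and the (2, 0) shape means poses_timings is empty, so the real problem is likely upstream (no pose timings were recorded) rather than the addition itself.

import numpy as np

a = np.zeros((2, 0))     # empty, like the reported poses_timings: no columns at all
b = np.zeros((2, 5438))  # one timing per estimate

try:
    a + b                # element-wise add requires matching (or broadcastable) shapes
except ValueError as e:
    print(e)             # operands could not be broadcast together ...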

This is what happens when I run render_poses.py:
(screenshot omitted)

Then remove invisible parts of the objects

I run python -m surfemb.scripts.misc.surface_samples_remesh_visible and get:
AttributeError: 'pymeshlab.pmeshlab.MeshSet' object has no attribute 'repair_non_manifold_edges_by_removing_faces'

I installed pymeshlab==0.2.1 to fix it.
But that causes another problem:

0%| | 0/21 [00:00<?, ?it/s]
/home/c/surfemb/data/bop/ycbv/models/obj_000013.ply

----------AngleRad 0.523599 Angledeg 30.000000 ratio 0.066987 vn 256 vn2 3821
asked 3821 got 3821 (expecting 255 instead of 256)
0%| | 0/21 [00:02<?, ?it/s]
Traceback (most recent call last):
  File "/home/c/surfemb/surfemb/scripts/misc/surface_samples_remesh_visible.py", line 42, in <module>
    area_reduction = trimesh.load_mesh(remesh_fp).area / trimesh.load_mesh(mesh_fp).area
  File "/home/c/anaconda3/envs/surfemb/lib/python3.8/site-packages/trimesh/constants.py", line 153, in timed
    result = method(*args, **kwargs)
  File "/home/c/anaconda3/envs/surfemb/lib/python3.8/site-packages/trimesh/exchange/load.py", line 209, in load_mesh
    results = mesh_loaders[file_type](file_obj,
  File "/home/c/anaconda3/envs/surfemb/lib/python3.8/site-packages/trimesh/exchange/ply.py", line 106, in load_ply
    ply_binary(elements, file_obj)
  File "/home/c/anaconda3/envs/surfemb/lib/python3.8/site-packages/trimesh/exchange/ply.py", line 881, in ply_binary
    raise ValueError('File is unexpected length!')
ValueError: File is unexpected length!

error when resuming from checkpoint

Whenever I try to resume from a previous checkpoint, I get this error:
File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/runpy.py", line 194, in _run_module_as_main return _run_code(code, main_globals, None, File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/runpy.py", line 87, in _run_code exec(code, run_globals) File "/home/moritz/surfemb/surfemb/surfemb/scripts/train.py", line 123, in <module> main() File "/home/moritz/surfemb/surfemb/surfemb/scripts/train.py", line 119, in main trainer.fit(model, loader_train, loader_valid) File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 768, in fit self._call_and_handle_interrupt( File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 721, in _call_and_handle_interrupt return trainer_fn(*args, **kwargs) File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 809, in _fit_impl results = self._run(model, ckpt_path=self.ckpt_path) File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1234, in _run results = self._run_stage() File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1321, in _run_stage return self._run_train() File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1351, in _run_train self.fit_loop.run() File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 204, in run self.advance(*args, **kwargs) File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/pytorch_lightning/loops/fit_loop.py", line 268, in advance self._outputs = self.epoch_loop.run(self._data_fetcher) File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 204, in run self.advance(*args, **kwargs) File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 208, in advance batch_output = self.batch_loop.run(batch, batch_idx) File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 204, in run self.advance(*args, **kwargs) File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 88, in advance outputs = self.optimizer_loop.run(split_batch, optimizers, batch_idx) File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 204, in run self.advance(*args, **kwargs) File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 203, in advance result = self._run_optimization( File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 256, in _run_optimization self._optimizer_step(optimizer, opt_idx, batch_idx, closure) File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 369, in _optimizer_step self.trainer._call_lightning_module_hook( File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1593, in _call_lightning_module_hook output = fn(*args, **kwargs) File 
"/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/pytorch_lightning/core/lightning.py", line 1644, in optimizer_step optimizer.step(closure=optimizer_closure) File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/pytorch_lightning/core/optimizer.py", line 168, in step step_output = self._strategy.optimizer_step(self._optimizer, self._optimizer_idx, closure, **kwargs) File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 193, in optimizer_step return self.precision_plugin.optimizer_step(model, optimizer, opt_idx, closure, **kwargs) File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 155, in optimizer_step return optimizer.step(closure=closure, **kwargs) File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/torch/optim/lr_scheduler.py", line 65, in wrapper return wrapped(*args, **kwargs) File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/torch/optim/optimizer.py", line 109, in wrapper return func(*args, **kwargs) File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context return func(*args, **kwargs) File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/torch/optim/adam.py", line 157, in step adam(params_with_grad, File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/torch/optim/adam.py", line 213, in adam func(params, File "/home/moritz/anaconda3/envs/surfemb/lib/python3.8/site-packages/torch/optim/adam.py", line 255, in _single_tensor_adam assert not step_t.is_cuda, "If capturable=False, state_steps should not be CUDA tensors." AssertionError: If capturable=False, state_steps should not be CUDA tensors.

Any idea how to resolve this?

Also, I can't get the training to run with the standard settings: I get an out-of-memory error on an RTX 3070 Ti (8 GB) unless I run with n-valid = 2 and batch-size = 1.

One 2080 Ti GPU gives an out-of-memory error while training

Hi, thanks for your great work, I hope I will make it work and be able to use it on my custom dataset.
My problem is this:
I have one 2080 Ti and I am trying to train on the tless PBR dataset, but I get a "CUDA out of memory" error.
I have used a smaller batch size of 8 and decreased the number of workers to 0, but it keeps giving the error (it now appears later than before, but it still appears).

It only works if I decrease the number of scenes in the train_pbr folder from 50 to 1; otherwise there is no chance.

Is this normal behavior with this single GPU, or am I missing something?

thanks in advance
(screenshot omitted)

cannot import name 'egl' from 'glcontext'

I have tried to run the inference inspection code on my Windows machine with the given inference data, as proposed in the README, but I got this error:

$ python -m surfemb.scripts.infer_debug data/models/tless-2rs64lwh.compact.ckpt --device cpu
loading objects: 0it [00:00, ?it/s]
Traceback (most recent call last):
  File "C:\Users\39331\anaconda3\envs\surfemb\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\39331\anaconda3\envs\surfemb\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\39331\Documenti\Final Year Project\surfemb\surfemb\scripts\infer_debug.py", line 43, in <module>
    renderer = ObjCoordRenderer(objs, res_crop)
  File "C:\Users\39331\Documenti\Final Year Project\surfemb\surfemb\data\renderer.py", line 43, in __init__
    self.ctx = moderngl.create_context(standalone=True, backend='egl', device_index=device_idx)
  File "C:\Users\39331\anaconda3\envs\surfemb\lib\site-packages\moderngl\context.py", line 1619, in create_context
    ctx.mglo, ctx.version_code = mgl.create_context(glversion=require, mode=mode, **settings)
  File "C:\Users\39331\anaconda3\envs\surfemb\lib\site-packages\glcontext\__init__.py", line 49, in get_backend_by_name
    return _egl()
  File "C:\Users\39331\anaconda3\envs\surfemb\lib\site-packages\glcontext\__init__.py", line 106, in _egl
    from glcontext import egl
ImportError: cannot import name 'egl' from 'glcontext' (C:\Users\39331\anaconda3\envs\surfemb\lib\site-packages\glcontext\__init__.py)

I have tried reinstalling OpenGL, but it did not solve the problem, and I cannot find any sources on resolving the dependency.
How would you suggest I solve it?
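A note on the cause (a hedged suggestion, not a verified fix for this repo): the renderer explicitly requests moderngl's EGL backend, which is typically a Linux/headless setup; on Windows, moderngl can usually create a standalone context with its default native backend instead, though whether the rest of the pipeline then works unchanged is an open question.

import moderngl

# On Windows, omitting backend='egl' lets moderngl pick the native backend (WGL).
# Whether the rest of the SurfEmb renderer works unchanged this way is an assumption.
ctx = moderngl.create_context(standalone=True)
print(ctx.info['GL_RENDERER'])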

Some source files may be missing

Thanks for sharing your code.

Some source files seem to be missing under the ./data folder, such as config, instance, detector_crops and ObjCoordRenderer.

A question about ambiguities

Hi.
Thanks for your work.
I have a question about your paper.
According to your paper, 'uniformly sampled object points are fed through the same key model to provide negative keys'.
But according to the InfoNCE loss, the denominator should contain the query's positive point and the negative points. So if the sampled points include points that are symmetric to the query point, this will not satisfy the requirement of the InfoNCE loss, right? (See the sketch below.)
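For reference, a minimal InfoNCE-style sketch (hypothetical shapes and normalization, not the repo's loss code): the positive key is the numerator term, and the denominator sums over the positive plus the sampled negatives, which is exactly where a symmetry-equivalent negative would also receive a high similarity.

import torch
import torch.nn.functional as F

emb_dim, n_neg = 12, 1024
query = F.normalize(torch.randn(emb_dim), dim=0)            # query embedding of one pixel
pos_key = F.normalize(torch.randn(emb_dim), dim=0)          # key of the corresponding 3D point
neg_keys = F.normalize(torch.randn(n_neg, emb_dim), dim=1)  # keys of uniformly sampled points

logits = torch.cat([(query * pos_key).sum().view(1),  # positive similarity
                    neg_keys @ query])                # negative similarities
# InfoNCE: -log( exp(s_pos) / sum_j exp(s_j) ); target index 0 is the positive
loss = F.cross_entropy(logits.view(1, -1), torch.zeros(1, dtype=torch.long))
print(loss)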

CosyPose BOP-trained detection results

Hello, is there a link to the test results of Cosypose? I would like to use the detection results of other methods to find differences by observing the composition of the results. Thank you so much!

Why normalize the objects' coordinates when rendering?

Hello, I want to express my gratitude for your excellent work.

I recently observed that you normalize the objects' coordinates using offset and scale in this line.
Why not use the original coordinates? I would greatly appreciate it if you could provide the rationale behind this decision (a sketch of such a normalization is shown below). Thank you very much.
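One common reason for this kind of normalization (my reading, not necessarily the author's): mapping vertex coordinates into a fixed range such as [-1, 1] keeps the rendered coordinate images and the downstream MLP inputs on a consistent scale across objects of very different sizes. A sketch with hypothetical names:

import numpy as np

def normalize_vertices(vertices):
    # map vertex coordinates into [-1, 1]^3 via an offset and a scale
    vmin, vmax = vertices.min(axis=0), vertices.max(axis=0)
    offset = (vmin + vmax) / 2        # centre of the bounding box
    scale = (vmax - vmin).max() / 2   # half the largest bounding-box side
    return (vertices - offset) / scale, offset, scale

verts = np.random.rand(1000, 3) * [100.0, 50.0, 20.0]  # e.g. a model in millimetres
norm_verts, offset, scale = normalize_vertices(verts)
print(norm_verts.min(axis=0), norm_verts.max(axis=0))   # largest axis spans roughly -1 .. 1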
