brjathu / LART
Code repository for the paper "On the Benefits of 3D Pose and Tracking for Human Action Recognition" (CVPR 2023)
Home Page: https://github.com/brjathu/LART
Hi, thank you for your excellent work.
I followed this issue to make PHALP save features for my dataset videos. I had to set save_fast_tracks = True in phalp/configs/base.py, as well as save_fast_tracks = true in outputs/.hydra/config.yaml, to get the post_process function to run. However, I'm getting the following error:
Error executing job with overrides: ['video.source=assets/gymnasts_short.mp4', '+half=True']
Traceback (most recent call last):
File "/home/lecun/LART_dev/scripts/demo.py", line 114, in main
lart_model.postprocessor.run_lart(pkl_path)
File "/home/lecun/LART_dev/venv/lib/python3.10/site-packages/phalp/visualize/postprocessor.py", line 102, in run_lart
final_visuals_dic = self.post_process(final_visuals_dic, save_fast_tracks=self.cfg.post_process.save_fast_tracks, video_pkl_name=video_pkl_name)
File "/home/lecun/LART_dev/venv/lib/python3.10/site-packages/phalp/visualize/postprocessor.py", line 42, in post_process
for idx, appe_idx in enumerate(smoothed_fast_track_['apperance_index']):
KeyError: 'apperance_index'
I'm using the default configs from the LART repository. I'm also using the dev branch to run the code in half precision.
Do you have any suggestions as to why this is happening?
Thanks
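In case it helps others hitting the same KeyError: a hedged workaround sketch that skips the appearance bookkeeping when the key is absent. Note that 'apperance_index' is spelled exactly as in the traceback, and whether skipping is safe for the downstream smoothing is an assumption, not a confirmed fix:

# Hypothetical guard around the failing loop in post_process();
# the placeholder dict stands in for a track saved without appearance data.
smoothed_fast_track_ = {"fid": [0, 1, 2]}
for idx, appe_idx in enumerate(smoothed_fast_track_.get("apperance_index", [])):
    print(idx, appe_idx)  # body never runs when the key is missing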
@brjathu Thank you for sharing this amazing work.
I have a question regarding finetuning on a custom dataset. I ran the demo on some videos to get the results_temporal_fast pkl files, and I modified them to use my ground-truth labels instead of the 80 AVA labels.
I noticed that the demo predicts both AVA labels and Kinetics labels. I'm not sure whether I should replace the AVA part or the Kinetics part with my ground truth. The code seems to use pseudo-labels for the AVA label space as well, so the dimension of the pseudo-labels does not match that of my ground-truth labels. Should I drop the pseudo-label part, since I have ground-truth annotations for each frame?
My other question is: how can I finetune the pretrained model you provide on my dataset, which has a smaller number of classes?
Thanks
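For the last question, a generic sketch of the usual recipe: load the pretrained checkpoint with strict=False, then swap the classification head for one sized to your label count. Everything here (the stand-in model, the attribute name action_head, the checkpoint path) is an assumption for illustration, not the repo's actual API:

import torch
import torch.nn as nn

NUM_MY_CLASSES = 12  # placeholder: your number of action classes

class TinyActionModel(nn.Module):
    """Stand-in for the real model: a backbone plus a classification head."""
    def __init__(self, num_classes=80, dim=256):
        super().__init__()
        self.backbone = nn.Linear(dim, dim)             # stand-in for the transformer
        self.action_head = nn.Linear(dim, num_classes)  # pretrained AVA head (80 classes)

    def forward(self, x):
        return self.action_head(self.backbone(x))

model = TinyActionModel(num_classes=80)
# Load pretrained weights; strict=False keeps every tensor whose shape matches
# and skips the mismatched head (checkpoint path is a placeholder):
# state = torch.load("lart.ckpt", map_location="cpu")["state_dict"]
# model.load_state_dict(state, strict=False)

# Swap in a head for your label space, then finetune as usual.
model.action_head = nn.Linear(model.action_head.in_features, NUM_MY_CLASSES)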
Hi, author. I want to train the model with your configuration. Can you tell me how much GPU memory that requires and how long training takes with that configuration?
When will the demo with the Hiera backbone be released?
Hello author, during model initialization some pretrained models need to be loaded. Where can I download these models?
e.g.:
FileNotFoundError: [Errno 2] No such file or directory: '/home/zck/.cache/phalp/3D/smpl_mean_params.npz'
code:
import numpy as np
import torch

# Load the mean SMPL parameters used to initialize the regression head.
mean_params = np.load(cfg.MODEL.SMPL_HEAD.SMPL_MEAN_PARAMS)  # '/home/zck/.cache/phalp/3D/smpl_mean_params.npz'
# Mean body pose, shape (betas), and camera parameters, each given a batch dimension.
init_body_pose = torch.from_numpy(mean_params['pose'].astype(np.float32)).unsqueeze(0)
init_betas = torch.from_numpy(mean_params['shape'].astype(np.float32)).unsqueeze(0)
init_cam = torch.from_numpy(mean_params['cam'].astype(np.float32)).unsqueeze(0)
# Registered as buffers: they move with the module but are not trained.
self.register_buffer('init_body_pose', init_body_pose)
self.register_buffer('init_betas', init_betas)
self.register_buffer('init_cam', init_cam)
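A minimal sketch for checking that the expected file is present; the path is taken verbatim from the traceback above, and the assumption that PHALP fetches these assets into ~/.cache/phalp on the first demo run is mine, not confirmed by the repo:

from pathlib import Path

# Path taken from the FileNotFoundError above: PHALP's default cache location.
smpl_mean_params = Path.home() / ".cache" / "phalp" / "3D" / "smpl_mean_params.npz"
if not smpl_mean_params.exists():
    print(f"Missing {smpl_mean_params}; re-run the demo in case PHALP can fetch "
          "its assets automatically, or place the file there manually.")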
Hi. I have some questions on how you encoded the SMPL parameters.
I am a beginner in computer vision. I want to try running the demo myself, but I keep hitting errors while setting up the environment; I guess it is a version-incompatibility problem. If you can take the time to help me solve it, I will be very grateful. The error is as follows:
ERROR: Could not build wheels for detectron2, which is required to install pyproject.toml-based projects
Hello, congratulations on your work!
I cloned the repository and followed the instructions in the "Training and Evaluation" section. However, I encountered "ModuleNotFoundError: No module named 'lart.ActivityNet'", indicating that the 'lart.ActivityNet' file was missing. I resolved the error by following the additional instructions in the "Demo on videos" section, which downloads the remaining files. I suggest adding a note to the README advising users to complete the "Demo on videos" section before proceeding to "Training and Evaluation".
Hi @brjathu.
Great work! I had some questions regarding video input requirements/processing and Hiera:
Hi,
Thanks for releasing this great work! When running the demo script on two 4090 GPUs, I get the following error:
File "/home/zhangy76/anaconda3/envs/lart/lib/python3.10/site-packages/slowfast/models/attention.py", line 112, in cal_rel_pos_spatial
attn[:, :, sp_idx:, sp_idx:].view(B, -1, q_t, q_h, q_w, k_t, k_h, k_w)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 5.25 GiB (GPU 0; 23.65 GiB total capacity; 17.74 GiB already allocated; 3.99 GiB free; 19.04 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
I wonder if you have any suggestions on possible reasons for this error?
Yufei
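One knob the error message itself suggests is the allocator's max_split_size_mb, which reduces fragmentation. A minimal sketch of setting it before the demo runs; the 128 MiB value is an arbitrary starting point, not taken from the repo:

import os

# Must be set before the first CUDA allocation (i.e., before torch touches the GPU).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"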
I generated output from my own video clip, running in a Python 3.10.12 notebook on Google Colab. (sys.version within the notebook agrees with what !python --version produces.) I used the given command:
!python scripts/demo.py video.source="../../4dhumans/inputs/round5_clip.mp4"
However, I cannot read any of the pkl files in outputs/results_temporal_fast. I get:
with open("demo_round5_clip_100_38.pkl", "rb") as f:
z = pkl.load(f)
==>
UnpicklingError Traceback (most recent call last)
[<ipython-input-21-802b467ad1f5>](https://localhost:8080/#) in <cell line: 1>()
1 with open("demo_round5_clip_100_38.pkl", "rb") as f:
----> 2 z = pkl.load(f)
UnpicklingError: invalid load key, '\x0c'.
Even running python -m pickletools <filename> gives an error:
191: \x94 MEMOIZE (as 22)
192: \x88 NEWTRUE
193: \x8c SHORT_BINUNICODE 'numpy_array_alignment_bytes'
222: \x94 MEMOIZE (as 23)
223: K BININT1 16
225: u SETITEMS (MARK at 69)
226: b BUILD
Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/usr/lib/python3.10/pickletools.py", line 2883, in <module>
dis(args.pickle_file[0], args.output, None,
File "/usr/lib/python3.10/pickletools.py", line 2448, in dis
for opcode, arg, pos in genops(pickle):
File "/usr/lib/python3.10/pickletools.py", line 2285, in _genops
raise ValueError("at position %s, opcode %r unknown" % (
ValueError: at position 227, opcode b'\x0c' unknown
Any clue?
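For what it's worth, later in this thread postprocessor.py loads these same files with joblib.load rather than pickle.load, and the 'numpy_array_alignment_bytes' entry in your pickletools dump is a joblib marker (joblib appends raw numpy payloads after the pickle stream, which is what trips up both pickle and pickletools). A minimal sketch, assuming the files are joblib dumps:

import joblib

# PHALP/LART write these .pkl files with joblib, so load them the same way.
tracks = joblib.load("demo_round5_clip_100_38.pkl")
print(type(tracks), len(tracks))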
The output video shows the original video on the left and the reconstruction, with SMPL meshes for the humans, on the right. When the system is run on a video with very fast human motion (e.g. hands moving extremely rapidly), there appears to be a perceptible lag between the left-hand side and the right-hand side.
Do you know why? Is the lag known to be constant (in which case it does not matter; one can simply align the two), or does it vary with some parameters (e.g. time, complexity of the preceding actions)?
@brjathu Great work. Thank you for sharing.
May I request that you make the script that generates data in the format found in https://github.com/brjathu/LART/blob/main/scripts/download_data.sh publicly available? I want to apply it to MultiSports.
Thanks
Gurkirt
Hello,
First of all, thank you for the interesting research you presented!
Of the benchmarks you used, I'm interested in the AVA dataset.
I downloaded the raw AVA videos; how can I extract image frames that match the annotations you published?
I cannot match the humans with the GT bounding boxes.
What should I do? I will wait for your reply.
Best regards,
Chanwoo
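One common pitfall worth checking, independent of the repo's own preprocessing: AVA bounding boxes are stored as normalized (x1, y1, x2, y2) coordinates in [0, 1], so they must be scaled by the frame size before they will line up with the pixels. A minimal sketch; the file names and the example box are placeholders:

import cv2

frame = cv2.imread("frame_000001.jpg")       # placeholder frame
h, w = frame.shape[:2]
x1, y1, x2, y2 = 0.12, 0.05, 0.48, 0.97      # placeholder AVA annotation
cv2.rectangle(frame, (int(x1 * w), int(y1 * h)),
              (int(x2 * w), int(y2 * h)), (0, 255, 0), 2)
cv2.imwrite("frame_with_box.jpg", frame)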
Hello, first of all thanks for your contribution. When I run this code on Colab: !python scripts/demo.py video.source="assets/jump.mp4" +half=True, I get the message: INFO OpenGL_accelerate module. Then, in the "Visualize the reconstructions" cell of the Colab interface, there is no processed video after the code runs; only the original video is output. Please answer, thank you!
Hi,
In phalp/visualize/postprocessor.py, in the run_lart() function, there is this code:
final_visuals_dic = joblib.load(phalp_pkl_path)
where the function reads the tracks PHALP generated from the video. As I understand it, the phalp_pkl_path file contains all the features of all tracks in the target video. However, a few lines later the post_process() function is called, which in turn calls:
self.phalp_tracker.pose_predictor.smooth_tracks()
where all the tracks are smoothed.
What is the difference between smoothed and non-smoothed tracks? Can I use the features of the non-smoothed tracks in my application?
Thanks
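A minimal sketch for inspecting the raw (pre-smoothing) tracks yourself; the path is a placeholder, and the top-level layout (a dict with one entry per frame) is an assumption based on how run_lart() consumes the file:

import joblib

# Load the raw PHALP output, before smooth_tracks() is applied.
final_visuals_dic = joblib.load("outputs/results/demo_video.pkl")  # placeholder path

# Print the per-frame keys to see which features (pose, appearance,
# location, ...) are stored for each track.
first_frame = next(iter(final_visuals_dic))
print(first_frame, sorted(final_visuals_dic[first_frame].keys()))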
I am having some trouble installing the dependencies for running the demo. FYI, I am running this on Windows in a Conda environment.
The first problem is that I get this error when I run pip install -e .[demo]:
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "C:\Users\ahnji\anaconda3\lib\site-packages\colorama\ansitowin32.py", line 59, in closed
    return stream.closed
ValueError: underlying buffer has been detached
ERROR: Failed building wheel for detectron2
The second problem is that if I install detectron2 elsewhere and get past that error, I reach a stage where the OpenGL library causes a problem along these lines:
raise ImportError("Unable to load OpenGL library", *err.args)
ImportError: ('Unable to load OpenGL library', "Could not find module 'OSMesa' (or one of its dependencies)
Can you help me resolve this issue?
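One thing worth checking, offered as an assumption rather than a confirmed fix: PyOpenGL (and hence the renderer) selects its backend from the PYOPENGL_PLATFORM environment variable, and OSMesa is rarely available on Windows. A sketch of forcing a different backend before any rendering code is imported; whether "egl" actually works depends on your GPU drivers:

import os

# Must be set before pyrender/OpenGL are imported anywhere in the process.
os.environ["PYOPENGL_PLATFORM"] = "egl"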
The demo Colab notebook seems broken. Everything installs fine, but on running the demo:
!python scripts/demo.py video.source="assets/jump.mp4" +half=True
No sequences are generated, hence:
FileNotFoundError: [Errno 2] No such file or directory: 'outputs/results_temporal_videos/'
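A minimal sketch for checking whether the tracking stage produced anything before the step that fails; the directory names are taken from the error above and from this thread:

from pathlib import Path

# If these are empty or missing, the failure is upstream (tracking),
# not in the LART step that raises the FileNotFoundError.
for d in ["outputs/results_temporal_fast", "outputs/results_temporal_videos"]:
    p = Path(d)
    print(d, sorted(x.name for x in p.iterdir()) if p.exists() else "missing")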