Comments (10)
In the example above, the first sample of 16 frames starts at 94 and ends at 110, which is 60 frames TTE. The last sample starts at 125 and ends at 140, which is 30 frames TTE. In lines 383-384 we already subtract the observation length (16 frames) to ensure that this is the case. This corresponds to the description in the paper.
In other words, we count TTE at the last frame of the observation, not the first frame. If we started sampling at sequence_length - 60, then the TTE at the end of the observation would be 44 (60 - 16), not 60.
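The difference between the two conventions can be checked with a few lines of Python, using the 170-frame track discussed in this thread. This is only a sketch: `tte_from_last_frame` and `tte_from_first_frame` are hypothetical helper names, not functions from the repository, and the track is assumed to end exactly at the event.

```python
def tte_from_last_frame(track_len, start, obs_length=16):
    """Frames between the end of the observation slice and the event."""
    return track_len - (start + obs_length)


def tte_from_first_frame(track_len, start):
    """Frames between the first observed frame and the event."""
    return track_len - start


track_len = 170    # example track length used in this thread
first_start = 94   # first sample covers seq[94:110]

print(tte_from_last_frame(track_len, first_start))   # 60
print(tte_from_first_frame(track_len, first_start))  # 76
```

Counting from the first frame gives 76 (16 observed frames plus 60 frames TTE), which is why the observation length has to be subtracted when the start index is computed.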
from pedestrianactionbenchmark.
Hi Xingchen,
Thank you for your interest in our work.
To answer your question, from each pedestrian track we generate multiple observations of 16 frames (default observation length) within 1-2s time-to-event (TTE). The overlap parameter controls the step of the sliding window. At 0 overlap the samples will start at frames 0, 16, 32,... and so on, i.e. every observation sample starts after the previous one ends. At maximum overlap of 1, the samples will be collected starting at every frame.
Therefore, the purpose of overlap is to increase the amount of training data. For a smaller dataset such as JAAD we use a higher overlap of 0.8 to get a number of training samples comparable to that generated from the PIE dataset with 0.6 overlap. In our implementation, the function action_predict.py:get_data_sequence() is where observation samples are extracted from pedestrian tracks.
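A minimal sketch of how overlap could translate into a sliding-window step. The rounding rule below (truncate `(1 - overlap) * obs_length`, with a minimum step of 1) is an assumption chosen to match the numbers quoted in this thread, and `window_step` and `window_starts` are hypothetical names; get_data_sequence() is the place to check the actual rule.

```python
def window_step(overlap, obs_length=16):
    """Step between consecutive window starts for an overlap in [0, 1].

    Assumed rule: truncate (1 - overlap) * obs_length, never below 1.
    """
    return max(1, int((1 - overlap) * obs_length))


def window_starts(num_frames, overlap, obs_length=16):
    """Start indices of all full-length windows in a track of num_frames."""
    step = window_step(overlap, obs_length)
    return list(range(0, num_frames - obs_length + 1, step))


print(window_starts(64, 0))  # [0, 16, 32, 48]
print(window_step(0.8))      # 3
print(window_step(1))        # 1
```

Under this assumed rule a higher overlap gives a smaller step and therefore more samples per track, which matches the stated purpose of the parameter.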
from pedestrianactionbenchmark.
Hi,
Thank you so much for your quick and detailed reply! You are very kind.
I think now I know what overlap means and how you handle the training data.
If I understand correctly, you used the training data in this way:
- Each pedestrian track you mentioned contains an event (crossing or not) and the 30 frames (1-2s) before this event, say from frame 0 to frame 29.
- You apply a sliding window (controlled by 'overlap') within this 30-frame window prior to the event.
- If overlap is 1, then the samples will be collected starting at every frame. So we can generate 15 observations of 16 frames (frames 0-15, 1-16, 2-17, 3-18, 4-19, 5-20, 6-21, 7-22, 8-23, 9-24, 10-25, 11-26, 12-27, 13-28, 14-29). These 15 observations have the same label: C or NC.
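The count in the last bullet can be reproduced directly. This sketch only illustrates the arithmetic of the list above (a 30-frame window, overlap 1, inclusive frame ranges); it does not use the benchmark code, and the variable names are illustrative.

```python
obs_length = 16
window_len = 30  # frames 0..29

# One window per possible start frame; each tuple is (first_frame, last_frame).
windows = [(s, s + obs_length - 1) for s in range(window_len - obs_length + 1)]

print(len(windows))             # 15
print(windows[0], windows[-1])  # (0, 15) (14, 29)
```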
Could you please kindly let me know if I understand correctly?
Thank you very much again for your nice work!
Bests,
Xingchen
from pedestrianactionbenchmark.
Hi Xingchen,
You are welcome and yes, your understanding is correct.
Yulia
from pedestrianactionbenchmark.
Hi Yulia,
Thank you very much for your reply and confirmation!
Very nice work! Hope I can develop a new method using your data.
Bests,
Xingchen
from pedestrianactionbenchmark.
Hi Yulia,
Sorry, I just read your paper again and I think I made a mistake in my previous reply.
Actually, for each pedestrian track you have 76 frames (16 for observation and 60 for TTE). You apply the sliding window on the first 46 frames rather than 30 frames, right? In my previous reply, I said that you applied the sliding window on the 30-frame window (1-2 seconds) because I forgot the observation period.
Could you let me know if now I understand correctly?
Thanks a lot!
Xingchen
from pedestrianactionbenchmark.
Hi Xingchen,
If the observation is between 1-2s TTE, we start observing 60 frames before the event and stop at 30 before the event. In this case, we apply a sliding window within the 30 frame range. If a single TTE instead of a range is set, then only 16 frames ending at that TTE are collected.
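As a rough illustration of the single-TTE case, the slice below takes the 16 frames that end a fixed number of frames before the event. `single_tte_sample` is a hypothetical helper, not the repository's code, and the track is assumed to be a list that already ends at the event.

```python
def single_tte_sample(track, tte, obs_length=16):
    """Return the obs_length frames that end tte frames before the end of track."""
    end = len(track) - tte
    start = end - obs_length
    if start < 0:
        raise ValueError("track too short for this TTE and observation length")
    return track[start:end]


track = list(range(170))  # frame indices stand in for real features
sample = single_tte_sample(track, tte=30)
print(sample[0], sample[-1], len(sample))  # 124 139 16
```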
Please take a look at the function action_predict.py:get_data_sequence(). Pedestrian tracks stored in dictionary d are already cropped so that they end at the event (crossing or not crossing).
Lines 383-384 show how the first and last index of observation is computed.
For example, the track is 170 frames, i.e. the event happens 170 frames after the pedestrian appears on screen.
start_idx = track length - observation length - 60 = 170-16-60 = 94
end_idx = track length - observation length - 30 = 170-16-30 = 125
The 16-frame segments are sampled within the 30-frame range starting at start_idx and ending at end_idx+1, with the step determined by the overlap parameter (at 0.8 it is every 3 frames, or every frame when overlap is 1).
Hope this clarifies your question.
Yulia
from pedestrianactionbenchmark.
Hi Yulia,
I appreciate your detailed clarification very much!
Now I know what you mean. I previously thought that in this case the end_idx was 170-30 = 140. This is why I said the sliding window was applied to a 46-frame window.
By the way, I am just curious why you did not use an end_idx of 140 in this case. In your paper (on page 3) you said 'so the last frame of observation is between 1 and 2s (or 30-60 frames) prior to the crossing event start'. If you use 125 as the end_idx, then the last frame of observation is actually between 46-60 frames prior to the crossing event start.
Anyway, I am just curious about this. Maybe this is more practical in real cases.
Many thanks again for your help!
Bests,
Xingchen
from pedestrianactionbenchmark.
Thank you very much Yulia! Now it is very clear!
from pedestrianactionbenchmark.
You are welcome :)
from pedestrianactionbenchmark.