
Comments (6)

denfed commented on July 27, 2024

Hello! My apologies; the README isn't clear on this part and I'll update it. You need to use flickr-soundnet-dl to download both the train AND test videos. The only thing you need from the second link is the Annotation folder, which contains the bounding boxes labeled by previous work. This is exactly the problem we ran into originally: the original test videos are no longer available, so we had to download them manually and then use only the annotations from Learning to localize sound sources.

To reiterate steps:

  • Use flickr-soundnet-dl to download training AND testing samples. You can download all Flickr samples and then split them, or you can manually separate urls_public.txt into train and test sections and download each separately.
  • Then use our train and test scripts to create the training and testing audio-frame-flow pairs for the train and test set respectively.
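The manual split of urls_public.txt described above could be sketched roughly as follows. This is only an illustration under assumptions not confirmed in this thread: that urls_public.txt holds one URL per line, that a sample's ID is the file stem of the URL path, and that you already have a set of test IDs (for example, collected from the file names in the Annotation folder). The function names `video_id` and `split_urls` are hypothetical and are not part of flickr-soundnet-dl or this repository.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def video_id(url: str) -> str:
    """Return the file stem of the URL path, e.g. '.../123.mp4' -> '123'.

    Assumption: the sample ID is encoded as the file name in the URL.
    """
    return PurePosixPath(urlparse(url).path).stem


def split_urls(urls, test_ids):
    """Partition URLs into (train, test) lists.

    A URL goes to the test list when its ID is in `test_ids`;
    everything else goes to the train list. Blank lines are skipped.
    """
    train, test = [], []
    for url in urls:
        url = url.strip()
        if not url:
            continue
        if video_id(url) in test_ids:
            test.append(url)
        else:
            train.append(url)
    return train, test
```

To use it, you would read the lines of urls_public.txt, call `split_urls(lines, test_ids)`, and write each returned list to its own file before running the downloader on the two files separately.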

I hope this answers all your questions and please comment again if you have an issue!

Thanks.

from heartheflow.

denfed commented on July 27, 2024

I recommend splitting urls_public.txt and downloading the training and testing samples separately, as that is what the preprocessing scripts expect. If you look at the differences between the train and test preprocessing scripts, this should make sense.


denfed commented on July 27, 2024

Hello! Thanks for your interest in our work! I can't say for sure what the issue is without more information, but looking at the script I realized that I didn't add code to create the folder structure for the training data. I'm not sure if you did this already or if this is the problem, but try to manually create the folder structure, like this:

train/
train/frames
train/audio
train/flow
train/flow/flow_x
train/flow/flow_y
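If you would rather create these folders from Python than by hand, a minimal sketch could look like this. The helper name `make_split_dirs` and the `root` argument are hypothetical; only the folder names come from the list above. Passing a different `split` value (such as "test") is an assumption, since only the train layout is given here.

```python
from pathlib import Path

# Sub-folders listed above, relative to the split folder (e.g. train/).
SUBDIRS = ["frames", "audio", "flow/flow_x", "flow/flow_y"]


def make_split_dirs(root, split="train"):
    """Create <root>/<split>/ with the sub-folders listed above.

    Folders that already exist are left untouched.
    Returns the list of sub-folder paths.
    """
    base = Path(root) / split
    created = []
    for sub in SUBDIRS:
        path = base / sub
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `make_split_dirs(".")` from the dataset root would create the train/ tree shown above.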

Let me know if this doesn't work or if something else is wrong!


gohyunhua commented on July 27, 2024

Hi! Thanks for your reply! I followed your suggestion and it works! However, some files are missing from both of the dataset links you gave, which caused errors while training the model (or maybe I missed something). Here are my findings on the two datasets.

image

  • No data annotation files (.xml)

image

  • The mp4 files contain only a black screen (they look like mp3 rather than mp4 files)

Please enlighten me if there's any misunderstanding on my side. Thanks! Have a nice day!


gohyunhua commented on July 27, 2024
  1. Hi! I have successfully downloaded the samples with flickr-soundnet-dl and split them into training and testing samples. I also ran both preprocess_flickr_train.py and preprocess_flickr_test.py, and both work SUCCESSFULLY, producing the audio-frame-flow pairs.

  2. The only problem now is that when I try to run the training pretrained model OR training your own model section, I get an error because the annotation files from Learning to localize sound sources don't match the files from flickr-soundnet-dl at all. (The files from the two links are totally different.)
    image

  3. Or could you help by providing a link to download all your annotations for the flickr-soundnet-dl dataset (or all the folders for train & test)? Thanks!


denfed commented on July 27, 2024

Given the nature of downloading these datasets from YouTube, you most likely cannot download some videos that have since been deleted. I recommend further filtering the test CSVs down to the samples you do have access to.

Our Flickr test set ended up with 178 samples, but you will most likely be missing a few more of these.
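One way to do that filtering is sketched below. It rests on assumptions that are not confirmed in this thread: that each row of a test CSV has the sample ID in its first column with no header row, and that an available sample shows up as a file named `<id>.jpg` in a frames folder. The helper names `available_ids`, `filter_rows`, and `filter_csv` are hypothetical; adjust the column index and file suffix to match your actual files.

```python
import csv
from pathlib import Path


def available_ids(folder, suffix=".jpg"):
    """Collect the file stems of all files in `folder` ending with `suffix`."""
    return {p.stem for p in Path(folder).glob(f"*{suffix}")}


def filter_rows(rows, keep_ids, id_column=0):
    """Keep only non-empty rows whose ID column is in `keep_ids`."""
    return [row for row in rows if row and row[id_column] in keep_ids]


def filter_csv(src, dst, keep_ids, id_column=0):
    """Write the rows of `src` whose ID is in `keep_ids` to `dst`.

    Returns the number of rows kept.
    """
    with open(src, newline="") as f:
        rows = list(csv.reader(f))
    kept = filter_rows(rows, keep_ids, id_column)
    with open(dst, "w", newline="") as f:
        csv.writer(f).writerows(kept)
    return len(kept)
```

Writing to a new file rather than overwriting the original CSV keeps the full list around in case more samples become available later.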

I hope that makes sense.

If this isn't the problem you have, I recommend making sure you have your filepaths correct for the ground truth annotations folder.

