uncg-daisy / pywebcat

Python tool for working with the NOAA NOS Web Camera Applications Testbed (WebCAT)

License: MIT License

Jupyter Notebook 96.86% Python 3.14%

pywebcat's People

tomasbeuzen

pywebcat's Issues

grab frames from metadata

data releases will have the tags + the csv'ed metadata (from wc.save_frames).

we might want to have a way to grab frames directly from metadata record (i.e., for rehydrating the tagged data release)
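
A minimal sketch of the first step of such a rehydration: read the metadata CSV and work out which frames are needed from which video. The column names `url` and `frame` are assumptions for illustration, not the confirmed output format of wc.save_frames.

```python
import csv
from collections import defaultdict
from io import StringIO


def frames_by_url(csv_file):
    """Group the requested frame numbers by source video URL."""
    wanted = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # "url" and "frame" are hypothetical column names.
        wanted[row["url"]].append(int(row["frame"]))
    return {url: sorted(frames) for url, frames in wanted.items()}


# Small in-memory stand-in for a metadata record.
example = StringIO(
    "url,frame\n"
    "http://example.com/a.mp4,10\n"
    "http://example.com/a.mp4,0\n"
    "http://example.com/b.mp4,20\n"
)
print(frames_by_url(example))
```

Each URL would then only need to be opened once, and only the listed frames saved.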

URL with single digit months and single digit times

URLs do not generate correctly for single-digit months, days, and times; a leading 0 is needed in these cases.

Example with months. Here I use the CLI and pass a month of 09 (9 gives the same behavior):

evanbgoldstein$ pywebcat -dir /Users/evanbgoldstein/Downloads/WebCAT -s buxtoncoastalcam -y 2018 -m 09 -d 12 -t 1000 -i 20 -v

OpenCV: Couldn't read video stream from file "http://webcat-video.axds.co/buxtoncoastalcam/raw/2018/2018_9/2018_9_12/buxtoncoastalcam.2018-9-12_1000.mp4"

Warning: http://webcat-video.axds.co/buxtoncoastalcam/raw/2018/2018_9/2018_9_12/buxtoncoastalcam.2018-9-('buxtoncoastalcam', 2018, 9, 12, 1000)_1000.mp4 not a valid url... Skipping.

This URL works:
http://webcat-video.axds.co/buxtoncoastalcam/raw/2018/2018_09/2018_09_12/buxtoncoastalcam.2018-09-12_1000.mp4

this is also true for days 01-09:

~ evanbgoldstein$ pywebcat -dir /Users/evanbgoldstein/Downloads/WebCAT -s buxtoncoastalcam -y 2018 -m 09 -d 01 -t 1000 -i 20 -v

OpenCV: Couldn't read video stream from file "http://webcat-video.axds.co/buxtoncoastalcam/raw/2018/2018_9/2018_9_1/buxtoncoastalcam.2018-9-1_1000.mp4"

Warning: http://webcat-video.axds.co/buxtoncoastalcam/raw/2018/2018_9/2018_9_1/buxtoncoastalcam.2018-9-('buxtoncoastalcam', 2018, 9, 1, 1000)_1000.mp4 not a valid url... Skipping.

working URL:
http://webcat-video.axds.co/buxtoncoastalcam/raw/2018/2018_09/2018_09_01/buxtoncoastalcam.2018-09-01_1000.mp4

and times (0800):

:~ evanbgoldstein$ pywebcat -dir /Users/evanbgoldstein/Downloads/WebCAT -s buxtoncoastalcam -y 2018 -m 09 -d 01 -t 0800 -i 20 -v
OpenCV: Couldn't read video stream from file "http://webcat-video.axds.co/buxtoncoastalcam/raw/2018/2018_9/2018_9_1/buxtoncoastalcam.2018-9-1_800.mp4"

Warning: http://webcat-video.axds.co/buxtoncoastalcam/raw/2018/2018_9/2018_9_1/buxtoncoastalcam.2018-9-('buxtoncoastalcam', 2018, 9, 1, 800)_800.mp4 not a valid url... Skipping.

working URL:

http://webcat-video.axds.co/buxtoncoastalcam/raw/2018/2018_09/2018_09_01/buxtoncoastalcam.2018-09-01_0800.mp4
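
One possible fix, sketched here with Python format specifiers: pad the month and day to two digits and the time to four. The function name `webcat_url` is hypothetical; the URL layout is copied from the working URLs above.

```python
def webcat_url(station, year, month, day, time):
    """Build a WebCAT video URL with zero-padded month, day, and time."""
    # int() lets the function accept "09" or 9, "0800" or 800.
    year, month, day, time = int(year), int(month), int(day), int(time)
    return (
        f"http://webcat-video.axds.co/{station}/raw/{year}/"
        f"{year}_{month:02d}/{year}_{month:02d}_{day:02d}/"
        f"{station}.{year}-{month:02d}-{day:02d}_{time:04d}.mp4"
    )


print(webcat_url("buxtoncoastalcam", 2018, 9, 1, 800))
```

With these inputs the result matches the working 0800 URL above.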

Move repo

Move webcat utils to Tom's page,
move Tiny Collision to DAISY

Directory Structure

- Coastal Station/
    |- Time XXX/
        |- metadata.csv
        |- jpgs/
            |- frame 0.jpg
            |- frame 10.jpg
            |- ....
    |- Time YYY/
    |- Time ZZZ/
    |- ....
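
A rough sketch of how this layout could be created with pathlib. The function name and the way the station and time folders are named are assumptions for illustration only.

```python
import tempfile
from pathlib import Path


def make_output_dirs(root, station, timestamp):
    """Create <root>/<station>/<timestamp>/jpgs and return the two paths."""
    jpg_dir = Path(root) / station / timestamp / "jpgs"
    # parents=True builds the station and time folders as needed;
    # exist_ok=True makes repeated runs harmless.
    jpg_dir.mkdir(parents=True, exist_ok=True)
    csv_path = jpg_dir.parent / "metadata.csv"
    return csv_path, jpg_dir


# Try it in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    csv_path, jpg_dir = make_output_dirs(tmp, "buxtoncoastalcam", "2018_09_01_0800")
    print(csv_path.relative_to(tmp), jpg_dir.relative_to(tmp))
```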

@ebgoldstein I uploaded a notebook demonstrating a simple pseudo-labelling workflow (note that it draws from a "data" directory, but I haven't uploaded the labelled images used to train the classifier because I wasn't sure whether they have a DOI yet).

We could turn this into a function/class/script to formalise it if we want to go down this path further.

Package up webcat utils

At the moment, webcat utils acts as a standalone module. It would be possible to package it up so that it can be installed with pip install webcat_utils.

However, as far as I'm aware, the only way to use the CLI aspect of webcat_utils without having to call the absolute path of the file on a local machine ($ python path/webcat_utils.py) is to add it to $PATH, which can be done using setup.py. But I'm not sure how I feel about this, need to think more on it...
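
For reference, setuptools can generate a command on $PATH through a `console_scripts` entry point, which points at a callable such as a `main()` function. Below is a sketch of what that callable might look like with argparse. The short flags are taken from the CLI calls shown in the URL issue; their meanings, the `dest` names, and the default interval are assumptions.

```python
import argparse


def build_parser():
    """Argument parser mirroring the flags seen in the CLI examples."""
    parser = argparse.ArgumentParser(prog="pywebcat")
    parser.add_argument("-dir", dest="directory", required=True,
                        help="output directory (assumed meaning)")
    parser.add_argument("-s", dest="station", required=True)
    parser.add_argument("-y", dest="year", type=int, required=True)
    parser.add_argument("-m", dest="month", type=int, required=True)
    parser.add_argument("-d", dest="day", type=int, required=True)
    parser.add_argument("-t", dest="time", type=int, required=True)
    parser.add_argument("-i", dest="interval", type=int, default=10,
                        help="frame interval (assumed meaning and default)")
    parser.add_argument("-v", dest="verbose", action="store_true")
    return parser


def main(argv=None):
    """Entry point; argv=None makes argparse read sys.argv."""
    args = build_parser().parse_args(argv)
    if args.verbose:
        print(f"station={args.station} date={args.year}-{args.month}-{args.day}")
    return args


# Passing an explicit list makes the function easy to call from tests.
main(["-dir", "out", "-s", "buxtoncoastalcam", "-y", "2018",
      "-m", "09", "-d", "01", "-t", "0800", "-i", "20", "-v"])
```

Keeping the logic in `main(argv=None)` means the same function works from the command line and from a test suite.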

delete old nbs

I think notebooks/FindEvents.ipynb and notebooks/WebCAT_Retriever.ipynb can be deleted. The event nb is not critical to this package. Is that ok w/ you, @TomasBeuzen?

find good times on Buxton cam

Collision and non-collision

Create a requirements.txt (or equivalent)

How I bundle up the requirements depends in some part on whether this becomes a package or not (see #12). Leaving this issue open for now as a reminder.

What to do with existing .jpg files in a directory

If you run webcat_utils more than once for the same station+datetime but with a different "save frame interval", the existing .csv metadata will be overwritten, but the existing .jpg frames will not be. We have two options:

1. Clear any existing .jpg files in a directory that already exists.
2. Append new metadata to the existing .csv rather than overwriting it (and keep existing .jpg files).
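
Both options are small to implement. A hedged sketch using only the standard library follows; the function names and the CSV field names are placeholders, not what webcat_utils currently writes.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical column names for the metadata file.
FIELDS = ["file", "frame", "url"]


def clear_jpgs(directory):
    """Option 1: delete existing .jpg files and return how many were removed."""
    removed = 0
    for jpg in Path(directory).glob("*.jpg"):
        jpg.unlink()
        removed += 1
    return removed


def append_metadata(csv_path, rows):
    """Option 2: append rows, writing the header only for a new or empty file."""
    csv_path = Path(csv_path)
    write_header = not csv_path.exists() or csv_path.stat().st_size == 0
    with csv_path.open("a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerows(rows)
```

Note that option 2 can leave duplicate rows if the same frame is saved twice, so some de-duplication may still be needed.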

thoughts about `WebCAT.utils`

Ok, @TomasBeuzen so i looked over webcat_utils and had a think —

For this project, we might want a single method that does it all..

Inputs:

camera of interest
a date-time (or date-time range)
number of seconds between frame captures (default is 1 frame every 10 seconds… 61 frames for a 10 minute video)
the option to save or delete the raw video (default is trash the video since they are so big)
option to write a .csv

Outputs:

frames saved in a folder called jpgs
optionally, the video
optionally, a .csv with some info

the method would do this:

from the date-time / date range, generate the URLs needed, using the WebCAT format
download each of the files from the URLs (vr.download_url does this)
split the video into frames (at the user-specified interval) and save the frames in a folder called jpgs (vr.save_frames does this)
optionally delete the video
optionally write to a csv that has:
- the file names of each capture,
- the date time of the video,
- the frame number for the capture,
- the Webcat URL (This CSV will be used to join with the labeled data)
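
The default of one frame every 10 seconds gives 61 captures for a 10 minute video because both endpoints are counted: 600 / 10 + 1 = 61. A small sketch of that arithmetic, with hypothetical function names and an assumed frame rate in the example:

```python
def capture_times(duration_s=600, interval_s=10):
    """Capture times in seconds, including both t=0 and t=duration."""
    return list(range(0, duration_s + 1, interval_s))


def frame_numbers(times_s, fps, total_frames):
    """Convert capture times to frame indices, clamped to the last frame."""
    # The capture at t=duration falls one past the final frame, so clamp it.
    return [min(round(t * fps), total_frames - 1) for t in times_s]


times = capture_times()
print(len(times))
# 30 fps is an assumed frame rate, used only for illustration.
print(frame_numbers(times[:3], fps=30, total_frames=600 * 30))
```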

release

I enabled the Zenodo webhook, so when we decide on #17 and #9 we should:

make a release (job for @TomasBeuzen )
tidy up the citation formatting as Beuzen and Goldstein, etc. (job for @ebgoldstein )

tag images collision vs. non collision vs. 'close' to collision

using something like pigeon .. (thx @TomasBeuzen)
release/publish tagged data

Improve the testing framework and code coverage

This is somewhat related to #12: testing of functionality with pytest and code coverage both need to be improved. I might also implement some GitHub Actions for automating this, as well as black formatting.

create WebCAT URLs

The code needs to create WebCAT URLs to download videos for user-defined times and store them in a directory structure.
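
As a sketch, URLs for a user-defined time range could be generated by stepping a datetime and formatting it with strftime-style codes. The 10 minute step assumes one video per 10 minutes, and `urls_between` is a hypothetical name; the URL layout follows the WebCAT URLs quoted in the issues on this page.

```python
from datetime import datetime, timedelta


def urls_between(station, start, end, step_minutes=10):
    """List WebCAT video URLs from start to end (inclusive)."""
    urls = []
    current = start
    while current <= end:
        # %m, %d, %H and %M are always zero-padded by strftime.
        urls.append(
            f"http://webcat-video.axds.co/{station}/raw/"
            f"{current:%Y}/{current:%Y_%m}/{current:%Y_%m_%d}/"
            f"{station}.{current:%Y-%m-%d_%H%M}.mp4"
        )
        current += timedelta(minutes=step_minutes)
    return urls


for url in urls_between("buxtoncoastalcam",
                        datetime(2018, 9, 1, 8, 0),
                        datetime(2018, 9, 1, 8, 20)):
    print(url)
```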