waldo-vision / models
Repository for model development and training
Home Page: https://waldo.vision
License: Mozilla Public License 2.0
Develop code that takes a gameplay video as input, segments it into smaller clips of no longer than 30 seconds, and trims irrelevant sections such as menus, intros, and outros. The code should be designed in a modular fashion to accommodate game-specific features, allowing it to work with various games.
Since game-specific features may vary, it is suggested to create a basic solution first and then incrementally add support for different games as needed.
Consider using machine learning techniques, such as computer vision or deep learning, to detect and trim irrelevant sections with higher accuracy.
For better compatibility, consider using open-source libraries and tools for video processing, such as OpenCV, FFmpeg, or similar.
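The segmentation step described above can be sketched as follows. This is a minimal illustration, not the project's implementation: `clip_bounds` and `ffmpeg_command` are hypothetical names, the video duration is assumed to be known already, and detection of menus, intros, and outros is left out entirely. The FFmpeg command is only built, not executed.

```python
def clip_bounds(duration_s, max_len_s=30.0):
    """Split [0, duration_s) into consecutive (start, end) windows.

    Every window is at most max_len_s seconds long; the last one may be shorter.
    """
    if duration_s < 0 or max_len_s <= 0:
        raise ValueError("duration must be >= 0 and max_len_s must be > 0")
    bounds = []
    start = 0.0
    while start < duration_s:
        end = min(start + max_len_s, duration_s)
        bounds.append((start, end))
        start = end
    return bounds


def ffmpeg_command(src, dst, start, end):
    """Build (but do not run) an FFmpeg command that extracts one window.

    "-c copy" avoids re-encoding, so cut points snap to nearby keyframes.
    """
    return [
        "ffmpeg",
        "-ss", str(start),       # seek to the window start
        "-i", src,
        "-t", str(end - start),  # window length in seconds
        "-c", "copy",
        dst,
    ]
```

A game-specific module could later filter or adjust the windows returned by `clip_bounds` before any command is run, which keeps the basic splitter independent of any one game.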
The current GitHub Action for running pylint is broken, mainly because setting up conda on a runner is proving difficult.
Right now we do not need the entire conda environment, so we should remove the conda setup steps and go back to installing dependencies with pip.
We need code that downloads sets of gameplay video URLs submitted by users of waldo.vision and stored in our SQL database. The code should ensure the URLs are valid, have been reviewed by users 25 or more times, and have a 90% or higher positive rating. The code must also prevent downloading duplicate links. The analysis team will need to coordinate with the infrastructure team to identify the best way to download these URLs.
Consider using an ORM (Object-Relational Mapper) library, such as SQLAlchemy, to interact with the SQL database in a more pythonic and maintainable way.
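The selection rules above (valid URL, 25 or more reviews, 90% or higher positive rating, no duplicates) can be sketched as below. The table name `gameplay` and the columns `url`, `review_count`, and `positive_count` are assumptions made for illustration; the real schema is not described here. The standard-library `sqlite3` module stands in for the production database, and the same query could be expressed through an ORM such as SQLAlchemy.

```python
import sqlite3
from urllib.parse import urlparse

MIN_REVIEWS = 25
MIN_POSITIVE_PERCENT = 90


def is_valid_url(url):
    """Accept only http(s) URLs that include a host."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def eligible_urls(conn, already_downloaded=()):
    """Return URLs that meet the review thresholds and were not fetched before.

    Hypothetical schema: gameplay(url, review_count, positive_count).
    Integer arithmetic is used for the 90% check to avoid float rounding.
    """
    rows = conn.execute(
        "SELECT DISTINCT url FROM gameplay "
        "WHERE review_count >= ? "
        "AND positive_count * 100 >= ? * review_count "
        "ORDER BY url",
        (MIN_REVIEWS, MIN_POSITIVE_PERCENT),
    )
    seen = set(already_downloaded)
    result = []
    for (url,) in rows:
        if url in seen or not is_valid_url(url):
            continue
        seen.add(url)
        result.append(url)
    return result
```

Passing the set of previously downloaded URLs as `already_downloaded` is one simple way to prevent duplicate downloads across runs; where that set is stored is left open.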
Currently, link retrieval produces output like this:
Requesting page 18
Requesting page 19
Requesting page 20
Requesting page 21
Requesting page 22
Requesting page 23
Requesting page 24
Requesting page 25
Requesting page 26
Requesting page 27
Requesting page 28
Requesting page 29
Requesting page 30
Since the total number of pages is known, a better solution would be to show a progress bar using tqdm or a similar library.
This would let the developer see roughly how long the download will take and would make for a better experience.
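A minimal sketch of the progress-bar idea follows. `fetch_page` is a hypothetical stand-in for the real page request, and the example URLs are made up. Because the loop iterates over a `range`, tqdm can read the total and show an estimated time remaining. The fallback keeps the sketch runnable where tqdm is not installed.

```python
try:
    from tqdm import tqdm
except ImportError:  # fall back to a plain loop if tqdm is not installed
    def tqdm(iterable, **kwargs):
        return iterable


def fetch_page(page):
    """Hypothetical stand-in for the real request; returns that page's links."""
    return [f"https://example.com/clip/{page}-{i}" for i in range(2)]


def retrieve_links(total_pages, fetch=fetch_page):
    """Collect links from pages 1..total_pages behind a single progress bar."""
    links = []
    pages = range(1, total_pages + 1)
    for page in tqdm(pages, desc="Requesting pages", unit="page"):
        links.extend(fetch(page))
    return links
```

This replaces one printed line per page with a single bar that updates in place.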
We need a script that takes a short video as input, processes it, and outputs a series of cropped images that are frames of the input video. The script should be easy to use and well-documented, so other team members can understand and extend it if necessary.
The script should accept a video file as input (in common formats like .mp4, .avi, .mov, etc.).
The user should be able to specify the cropping dimensions (width and height) and optionally the position (x and y coordinates) of the cropped area.
The script should convert the video into a series of cropped images that are frames of the input video, preserving the original frame rate.
The script should save the cropped images to a specified output directory.
The script should be able to handle videos of varying lengths and resolutions.
The script should be implemented using a popular and well-supported programming language (e.g., Python) and libraries (e.g., OpenCV).
The script should successfully process a video file and output a series of cropped images as specified by the user.
The script should be well-documented and easy for other team members to understand.
The script should be tested with various video formats, resolutions, and lengths to ensure compatibility and robustness.
Additional Context
This script will be used as part of a larger pipeline for video processing and analysis. It is crucial that the script is efficient and reliable, as it may be used on large datasets with multiple videos.
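The cropping requirements above might be sketched like this with OpenCV. The function names are hypothetical, and centering the crop when no position is given is an assumption, not something the issue specifies. Writing one image per decoded frame keeps a one-to-one mapping with the source frames, which is how this sketch interprets "preserving the original frame rate".

```python
import os


def resolve_crop_box(frame_w, frame_h, crop_w, crop_h, x=None, y=None):
    """Return (x, y, w, h); the box is centered when x or y is omitted."""
    if crop_w <= 0 or crop_h <= 0 or crop_w > frame_w or crop_h > frame_h:
        raise ValueError("crop size must be positive and fit inside the frame")
    x = (frame_w - crop_w) // 2 if x is None else x
    y = (frame_h - crop_h) // 2 if y is None else y
    if x < 0 or y < 0 or x + crop_w > frame_w or y + crop_h > frame_h:
        raise ValueError("crop box falls outside the frame")
    return x, y, crop_w, crop_h


def crop_video_frames(video_path, out_dir, crop_w, crop_h, x=None, y=None):
    """Save every frame of the video, cropped, as a PNG; return the frame count."""
    import cv2  # imported lazily so resolve_crop_box works without OpenCV

    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        raise IOError(f"could not open {video_path}")
    frame_w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    frame_h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    bx, by, bw, bh = resolve_crop_box(frame_w, frame_h, crop_w, crop_h, x, y)
    count = 0
    try:
        while True:
            ok, frame = cap.read()
            if not ok:  # end of stream or decode failure
                break
            cropped = frame[by:by + bh, bx:bx + bw]
            cv2.imwrite(os.path.join(out_dir, f"frame_{count:06d}.png"), cropped)
            count += 1
    finally:
        cap.release()
    return count
```

Frames are processed one at a time, so memory use stays flat regardless of video length, which matters when the script runs over large datasets.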
We need to write new code, or adapt existing code, that takes an input list of YouTube video URLs and downloads the videos to a specified directory. This will be run on a Linux system.
Ability to input a list of YouTube video URLs (e.g., via a text file or command line arguments).
Validate input URLs to ensure they are valid YouTube video URLs.
Download each video in a specified format (e.g., MP4, WebM, etc.).
Save downloaded videos to a specified directory.
Provide progress updates during the download process (e.g., percentage completed, estimated time remaining, etc.).
Handle download errors, such as network issues or invalid URLs, gracefully.
Provide clear documentation on how to use the code and specify the download directory and format.
Successfully input a list of YouTube video URLs (e.g., via a text file or command line arguments).
Validate input URLs to ensure they are valid YouTube video URLs with at least 95% accuracy.
Download each video in the specified format (e.g., MP4, WebM, etc.).
Save downloaded videos to the specified directory.
Provide progress updates during the download process, including percentage completed and estimated time remaining.
Handle download errors, such as network issues or invalid URLs, gracefully, without crashing the program.
Clear documentation provided on how to use the code and specify the download directory and format.
For better compatibility and to comply with YouTube's terms of service, consider using open-source libraries and tools specifically designed for this purpose, such as youtube-dl, pytube, or similar.
This issue is partially blocked by #1 because we don't know yet the format in which links will be stored locally.
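The validation and download requirements above could be sketched as follows. The helper names are hypothetical, and the check assumes video IDs are 11 characters drawn from letters, digits, `-`, and `_`, which matches current YouTube links but is not a documented guarantee. Only `watch` and `youtu.be` links are recognized here. The download step uses youtube-dl, one of the libraries suggested above, and is imported lazily so the validation helpers work without it.

```python
import os
import re
from urllib.parse import parse_qs, urlparse

YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}
VIDEO_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")  # assumed ID shape


def youtube_video_id(url):
    """Return the video ID if url looks like a YouTube video link, else None."""
    parts = urlparse(url.strip())
    if parts.scheme not in ("http", "https"):
        return None
    host = (parts.hostname or "").lower()
    if host == "youtu.be":
        candidate = parts.path.lstrip("/")
    elif host in YOUTUBE_HOSTS and parts.path == "/watch":
        candidate = parse_qs(parts.query).get("v", [""])[0]
    else:
        return None
    return candidate if VIDEO_ID_RE.fullmatch(candidate) else None


def split_urls(lines):
    """Split input lines into (valid, rejected), skipping blanks and repeats."""
    valid, rejected, seen = [], [], set()
    for line in lines:
        url = line.strip()
        if not url:
            continue
        video_id = youtube_video_id(url)
        if video_id is None:
            rejected.append(url)
        elif video_id not in seen:
            seen.add(video_id)
            valid.append(url)
    return valid, rejected


def download_all(urls, out_dir, fmt="mp4"):
    """Download each URL into out_dir with youtube-dl, continuing past errors."""
    import youtube_dl  # third-party; imported lazily

    opts = {
        "format": fmt,
        "outtmpl": os.path.join(out_dir, "%(id)s.%(ext)s"),
        "ignoreerrors": True,  # keep going when one download fails
    }
    with youtube_dl.YoutubeDL(opts) as ydl:
        ydl.download(list(urls))
```

The input lines could come from a text file (for example `split_urls(open(path))`) or from command-line arguments; rejected URLs are returned rather than raised so the caller can report them without stopping the run.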
Train the implemented video masked autoencoder model on a dataset of general gameplay clips from first-person shooter (FPS) games.
Monitor training progress and be prepared to adjust hyperparameters, such as learning rate, batch size, or other factors, to optimize model performance.
Blocked by #5
Implement the VideoMAE2 model on a dataset of videogame gameplay clips.
See the following papers:
Code for VideoMAE2 was recently released
Test the video masked autoencoder model on a small test dataset.
Evaluate the model's performance using appropriate metrics (e.g., reconstruction error, SSIM, etc.).
Provide proper attribution.
Provide clear documentation on how to use the model, including training, evaluation, and any customization options.
Successfully implement VideoMAE2 into our codebase.
Test the video masked autoencoder model on the prepared dataset, achieving satisfactory performance as indicated by appropriate metrics.
Ensure compliance with the license and provide proper attribution.
Clear documentation provided on how to use the model, including training, evaluation, and any customization options.
Be sure to preprocess the dataset appropriately, including resizing, normalization, and data augmentation if necessary.
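As one illustration of the reconstruction-error metric mentioned above, the sketch below computes mean squared error and the PSNR derived from it. PSNR is an addition here, not something the issue asks for, and the functions are hypothetical helpers that work on flat sequences of pixel values normalized to [0, 1]. A real evaluation would more likely operate on tensors and would add SSIM from an existing library.

```python
import math


def mean_squared_error(original, reconstructed):
    """Average squared difference between two equal-length pixel sequences."""
    if len(original) != len(reconstructed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    total = sum((a - b) ** 2 for a, b in zip(original, reconstructed))
    return total / len(original)


def psnr(original, reconstructed, max_value=1.0):
    """Peak signal-to-noise ratio in dB; infinite for a perfect reconstruction."""
    mse = mean_squared_error(original, reconstructed)
    if mse == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / mse)
```

Lower MSE and higher PSNR both indicate a closer reconstruction; what counts as satisfactory is left for the team to decide.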