Comments (6)
I have noticed this too, any idea what's causing it?
from videohash.
Just learning this library myself, but if you check out the collages, you can see the collage images are virtually identical (located at v1.collage_path and v2.collage_path). Basically, the scenes are too short and, as far as the video hash is concerned, consist of white pixels at the exact same two points and black everywhere else. My guess is that this will not be an effective tool with short videos such as the two in the example. I have been trying to find recommendations on minimum scene lengths.
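To illustrate why two near-black collages with white pixels in the same places can collide, here is a minimal sketch using a plain average hash. This is a simpler, swapped-in technique and not a reproduction of videohash's own hashing; the helper names (average_hash, hamming_distance, sparse_frame) and the 8x8 frames are made up for the example.

```python
def average_hash(pixels):
    """Return a bit string: '1' where a pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if value > mean else "0" for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length bit strings differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def sparse_frame(size, bright_points, background=0):
    """Build a size x size grayscale frame that is dark except at bright_points."""
    frame = [[background] * size for _ in range(size)]
    for row, col in bright_points:
        frame[row][col] = 255
    return frame


# Two hypothetical collages: dark everywhere except the same two white points.
# The second one has a slightly brighter background to mimic a different source.
collage_1 = sparse_frame(8, [(1, 2), (5, 6)], background=0)
collage_2 = sparse_frame(8, [(1, 2), (5, 6)], background=3)

hash_1 = average_hash(collage_1)
hash_2 = average_hash(collage_2)

print(hash_1 == hash_2)                  # do the two hashes collide?
print(hamming_distance(hash_1, hash_2))  # how many bits differ?
```

With so little structure in the image, the only bits that are ever set are the two bright points, so any pair of inputs sharing those points ends up with the same hash under this scheme.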
Just did some testing, and you can increase the number of frames sampled per second with frame_interval. Check out the results of this:
from videohash import VideoHash
v1 = VideoHash(url='https://user-images.githubusercontent.com/47534140/185008752-da1f09c7-a177-4a46-9c64-230744e998c1.mp4', frame_interval=5)
v2 = VideoHash(url='https://user-images.githubusercontent.com/47534140/185008748-b8922142-37cc-48a0-bad9-1385ba016587.mov', frame_interval=5)
print(v1 == v2)
# and compare their collages to the ones you created without using frame_interval
print(v1.collage_path)
print(v2.collage_path)
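As a rough back-of-the-envelope sketch of why this helps for short clips: assuming frame_interval behaves as the number of frames sampled per second (as described above), the number of frames that end up in the collage grows with it. The function name sampled_frame_count and the 3-second duration are hypothetical, and the real extraction may round differently.

```python
import math


def sampled_frame_count(duration_seconds, frames_per_second):
    """Approximate number of frames sampled from a clip (assumed: at least one)."""
    if duration_seconds <= 0 or frames_per_second <= 0:
        raise ValueError("duration and sampling rate must be positive")
    return max(1, math.floor(duration_seconds * frames_per_second))


# A hypothetical 3-second clip sampled at different rates.
for rate in (0.2, 1, 5):
    print(rate, sampled_frame_count(3, rate))
```

More sampled frames means more distinct content in the collage, which gives the hash more to work with on very short videos.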
from videohash.
I'll have to look at that example later. I've also had the opposite problem, where the same video produces different hashes. On top of that, it always takes a few seconds to run, which is quite long for real-world applications these days.
I think I'll either have to fork this and see if I can improve it, or switch to something else. I'd also like to see if I can add partial fingerprinting, where a video that's part of another one can be recognised as such.
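For what it's worth, one way partial fingerprinting could work is to keep a hash per sampled frame and slide the shorter sequence over the longer one, allowing a few differing bits per frame. This is only a sketch of that idea, not something videohash provides; find_clip, max_distance_per_frame, and the 8-bit example hashes are all made up.

```python
def hamming_distance(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def find_clip(full_hashes, clip_hashes, max_distance_per_frame=2):
    """Return the first offset where clip_hashes matches a window of
    full_hashes within the per-frame tolerance, or None if there is none."""
    window = len(clip_hashes)
    if window == 0 or window > len(full_hashes):
        return None
    for offset in range(len(full_hashes) - window + 1):
        if all(
            hamming_distance(full_hashes[offset + i], clip_hashes[i])
            <= max_distance_per_frame
            for i in range(window)
        ):
            return offset
    return None


# Hypothetical per-frame hashes for a full video and two candidate clips.
full_video = [0b10110010, 0b11110000, 0b00001111, 0b01010101, 0b11001100]
clip = [0b00001110, 0b01010101]       # frames 2-3 of full_video, one bit flipped
unrelated = [0b11111111, 0b00000000]  # not taken from full_video

print(find_clip(full_video, clip))
print(find_clip(full_video, unrelated))
```

The tolerance also speaks to the "same video, different hashes" problem: comparing by distance instead of strict equality absorbs small differences from re-encoding. The scan is O(n * m) in the two sequence lengths, so it would need indexing to scale.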
from videohash.
The issue of collages for short videos being almost entirely black seems to stem from the fact that the width of the collage is set to 1024px no matter what. Instead, I tried editing collagemaker.py so that it calculates the width of the collage based on the already-existing variable self.images_per_row_in_collage, and it resulted in much nicer collages, although I have not tested it extensively. From my limited testing, it produces the same hash for a video when:
- it is converted to a different format (tested on .mov)
- it is compressed
- it is downscaled (by 50%)
And, more importantly, it produces different hashes for the two videos I uploaded in the original issue.
Link: MikPisula@b4b8f32
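For readers who don't want to open the commit, the general idea of deriving the collage size from the number of images per row (rather than hard-coding the width) can be sketched like this. It is not a copy of the linked change; collage_size and the 144x144 frame size are assumptions for illustration only.

```python
import math


def collage_size(frame_count, frame_width, frame_height, images_per_row):
    """Return (width, height) of a collage that is exactly as large as the
    grid of frames it holds, so no unused black area is left over."""
    if frame_count < 1 or images_per_row < 1:
        raise ValueError("frame_count and images_per_row must be at least 1")
    columns = min(frame_count, images_per_row)
    rows = math.ceil(frame_count / images_per_row)
    return columns * frame_width, rows * frame_height


# Hypothetical short video: 3 sampled frames of 144x144, 2 images per row.
print(collage_size(3, 144, 144, 2))
```

Sizing the canvas to the frames keeps a short video's collage filled with actual content instead of padding.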
from videohash.
When it comes to performance, perhaps Python's multiprocessing library could be used to speed up the image-manipulation part?
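A minimal sketch of what that could look like with multiprocessing.Pool, using a stand-in per-frame function. frame_brightness and process_frames are hypothetical names, and the tiny list-of-lists frames are placeholders for whatever image work the library actually does per frame.

```python
from multiprocessing import Pool


def frame_brightness(frame):
    """Stand-in for per-frame image work: mean brightness of a grayscale frame."""
    flat = [value for row in frame for value in row]
    return sum(flat) / len(flat)


def process_frames(frames, workers=2):
    """Apply frame_brightness to every frame in parallel, preserving order."""
    with Pool(processes=workers) as pool:
        return pool.map(frame_brightness, frames)


if __name__ == "__main__":
    # Four hypothetical 4x4 frames, each filled with a single shade.
    frames = [[[shade] * 4 for _ in range(4)] for shade in (0, 64, 128, 255)]
    print(process_frames(frames))
```

The worker has to be a top-level function and the pool should be started under the `if __name__ == "__main__":` guard, because platforms that spawn new interpreters (Windows, and macOS by default) re-import the main module in each worker. Whether this is a net win depends on how much of the runtime is per-frame work versus FFmpeg and I/O.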
from videohash.
It could do but it has to be done in a way that works across devices. I think an algorithm with decent time complexity would be best. I'm also thinking it might be better to start over than to fork. I'd like to see if video fingerprinting might be possible.
Edit: I just found this: https://pypi.org/project/videofingerprint/
Looks like @akamhy was working on it but the repo doesn't exist anymore.
(Gonna start a separate issue for speed)
from videohash.