img_hash's Introduction

img_hash

Now builds on stable Rust! (But needs nightly to bench.)

A library for getting perceptual hash values of images.

Thanks to Dr. Neal Krawetz for the outlines of the Mean (aHash), Gradient (dHash), and DCT (pHash) perceptual hash algorithms:
http://www.hackerfactor.com/blog/?/archives/432-Looks-Like-It.html (Accessed August 2014)
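As a rough illustration of the Mean (aHash) idea, here is a minimal sketch. It is not taken from img_hash's source, and it assumes the image has already been converted to grayscale and downscaled to a small grid. `mean_hash` is a hypothetical helper name used only for this example.

```rust
/// Mean hash (aHash) sketch: one bit per pixel, set when the pixel is
/// brighter than the average of the whole grid.
/// Assumes `pixels` is an already-downscaled grayscale grid (e.g. 8x8).
fn mean_hash(pixels: &[u8]) -> Vec<bool> {
    if pixels.is_empty() {
        return Vec::new();
    }
    // Sum in u32 so that many u8 samples cannot overflow.
    let sum: u32 = pixels.iter().map(|&p| u32::from(p)).sum();
    let mean = sum as f32 / pixels.len() as f32;
    pixels.iter().map(|&p| f32::from(p) > mean).collect()
}

fn main() {
    // A tiny 2x2 "image": dark, bright, dark, bright.
    let bits = mean_hash(&[10, 200, 30, 220]);
    println!("{:?}", bits);
}
```

The Gradient and DCT variants differ in what is compared (neighbouring pixels, or low-frequency DCT coefficients), but they end in the same kind of thresholding step.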

Also provides an implementation of the Blockhash.io algorithm.

This crate can operate directly on buffers from the PistonDevelopers/image crate.

Usage

Documentation

Add img_hash to your Cargo.toml:

[dependencies.img_hash]
version = "3.0"

Example program:

 extern crate image;
 extern crate img_hash;
 
 use img_hash::{HasherConfig, HashAlg};

 fn main() {
     let image1 = image::open("image1.png").unwrap();
     let image2 = image::open("image2.png").unwrap();
     
     let hasher = HasherConfig::new().to_hasher();

     let hash1 = hasher.hash_image(&image1);
     let hash2 = hasher.hash_image(&image2);
     
     println!("Image1 hash: {}", hash1.to_base64());
     println!("Image2 hash: {}", hash2.to_base64());
     
     println!("Hamming Distance: {}", hash1.dist(&hash2));
 }
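The Hamming distance printed at the end is the number of bit positions in which the two hashes differ. A minimal std-only sketch of that calculation over raw hash bytes follows. `hamming_distance` is a hypothetical helper for illustration, not the crate's `dist` implementation.

```rust
/// Number of differing bits between two equally long byte strings.
/// Returns `None` when the lengths differ, because hashes of different
/// sizes are not comparable.
fn hamming_distance(a: &[u8], b: &[u8]) -> Option<u32> {
    if a.len() != b.len() {
        return None;
    }
    // XOR leaves a 1 bit wherever the inputs disagree; count those bits.
    Some(a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum())
}

fn main() {
    let h1 = [0b1111_0000u8, 0b0000_0001];
    let h2 = [0b1111_1111u8, 0b0000_0000];
    println!("{:?}", hamming_distance(&h1, &h2));
}
```

A small distance suggests perceptually similar images; the threshold to use depends on the hash size and the algorithm.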

Benchmarking

In order to build and test on Rust stable, the benchmarks have to be placed behind a feature gate. If you have Rust nightly installed and want to run benchmarks, use the following command:

cargo bench --features bench

License

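Licensed under either of

  • Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
  • MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)

at your option.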

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

img_hash's Issues

Example code in documentation not running

I'm trying to run the example code and was unable to get around this error: "the trait img_hash::HashImage is not implemented for image::DynamicImage". Could you look into this?

Comparison with https://www.phash.org/

First of all, thanks for this library, because the C implementation is fully abandoned. I've been doing some tests (in our database we rely on phash dct u32 hashes) and it looks like there's a difference between what phash produces and what img_hash produces. The hashes are completely different, for example 10424892573973741600 vs 14502596198080605850. I used to use the ph_dct_imagehash function, which produced the latter one. I wonder whether that is because of a different algorithm?

Point to the PistonDevelopers image in Cargo.toml?

Getting error:

warning: using multiple versions of crate `image`
src/lib.rs:11:1: 11:27 note: used here
src/lib.rs:11 extern crate gfx_graphics;
              ^~~~~~~~~~~~~~~~~~~~~~~~~~
note: crate name: image
src/lib.rs:37:1: 37:41 note: used here
src/lib.rs:37 extern crate "img_hash" as img_hash_lib;
              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
note: crate name: image

Optimized DCT Implementation

Seeking an optimized DCT implementation with a compatible license.

Options:

  • RustDCT
    Needs benchmarking as current API supports 1D slices only so it needs a transpose step that the in-tree impl doesn't. (ejmahler/rust_dct#2)

Dubious:

  • FFTW; licensed under GPL, not compatible with MIT
    Can provide interface to link in like old UserDCT implementation but composable with the new 3.0-alpha API.

cc @ejmahler

hash_size doesn't match hash length when using DoubleGradient

Hi there,

I am trying out the DoubleGradient hash algorithm. I expected the hash_size() passed to HasherConfig to be respected (assuming width and height being multiples of 2), but the resulting hashes have fewer bits than that. Here's a snippet of code and the resulting output:

let image = image::open("grayscale.png").unwrap();

for (w,h) in [(8,8), (16,16), (8,16), (16,8)] {
    let hasher = HasherConfig::new().hash_size(w,h).hash_alg(HashAlg::Gradient).to_hasher();
    println!("Gradient({}, {}): {:?} bits", w, h, 8 * hasher.hash_image(&image).as_bytes().len());

    let hasher = HasherConfig::new().hash_size(w,h).hash_alg(HashAlg::DoubleGradient).to_hasher();
    println!("DoubleGradient({}, {}): {:?} bits", w, h, 8 * hasher.hash_image(&image).as_bytes().len());
}

I also added a println inside hash_image to print bytes.len(), resize_width, and resize_height.

HashVals: 72 (9x8?)
Gradient(8, 8): 64 bits
HashVals: 25 (5x5?)
DoubleGradient(8, 8): 40 bits
HashVals: 272 (17x16?)
Gradient(16, 16): 256 bits
HashVals: 81 (9x9?)
DoubleGradient(16, 16): 144 bits
HashVals: 144 (9x16?)
Gradient(8, 16): 128 bits
HashVals: 45 (5x9?)
DoubleGradient(8, 16): 80 bits
HashVals: 136 (17x8?)
Gradient(16, 8): 128 bits
HashVals: 45 (9x5?)
DoubleGradient(16, 8): 80 bits

Both 8 and 16 are multiples of two, so I didn't expect any changes when using DoubleGradient. I think this is a bug, but I wasn't able to pinpoint the problem yet.

I tried both with img_hash 3.2.0 and with the latest commit on the main branch, which seems to be the same.
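The reported lengths are consistent with one simple reading. This is an inference from the numbers above, not something confirmed in the crate's source. Assume Gradient compares horizontally adjacent pixels in the resized grid, DoubleGradient adds the vertical comparisons, and the bit count is then padded to whole bytes. The helper names below are hypothetical.

```rust
/// Bits from comparing each pixel with its right-hand neighbour.
fn gradient_bits(w: u32, h: u32) -> u32 {
    w.saturating_sub(1) * h
}

/// Bits from row comparisons plus column comparisons.
fn double_gradient_bits(w: u32, h: u32) -> u32 {
    w.saturating_sub(1) * h + w * h.saturating_sub(1)
}

/// Round a bit count up to a whole number of bytes, expressed in bits.
fn padded_bits(bits: u32) -> u32 {
    (bits + 7) / 8 * 8
}

fn main() {
    // (resize_width, resize_height) pairs as guessed in the output above.
    for (w, h) in [(9, 8), (17, 16), (9, 16), (17, 8)] {
        println!("Gradient {}x{}: {} bits", w, h, padded_bits(gradient_bits(w, h)));
    }
    for (w, h) in [(5, 5), (9, 9), (5, 9), (9, 5)] {
        println!("DoubleGradient {}x{}: {} bits", w, h, padded_bits(double_gradient_bits(w, h)));
    }
}
```

Under that reading, a 5x5 grid gives 20 + 20 = 40 bits and a 5x9 grid gives 36 + 40 = 76 bits, which pads to the 80 bits seen above. The shortfall would then come from the resize dimensions chosen for DoubleGradient, not from the comparison step.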

Not compiling with latest version of PistonDevelopers/image and Rust

I tried to make it compile again and succeeded somehow.

You can find my changes here: https://github.com/forgemo/img_hash

I, however, failed to keep the image parameter of square_resize_and_gray generic.

I had to change

fn square_resize_and_gray<Img: GenericImage>(img: &Img, size: u32) -> GrayImage

to

fn square_resize_and_gray(img: &RgbImage, size: u32) -> GrayImage

and did some further modifications following this change.

It seems that my understanding of Rust's generic type system is not good enough to make it work in a generic manner. Maybe someone could review my changes and help me make it generic again.

ImageHash should be serializable

Since it is fairly costly to calculate a hash, it makes sense to serialize and deserialize the ImageHash struct. To do this, the struct should derive Serialize and Deserialize from the serde crate.

Thread panic when using Blockhash, but not with DCT

works:

let img = image::load_from_memory(&mut buf).unwrap();
let hash = ImageHash::hash(&img, 8, HashType::DCT);

doesn't work:

let img = image::load_from_memory(&mut buf).unwrap();
let hash = ImageHash::hash(&img, 8, HashType::Block);

thread 'main' panicked at 'index out of bounds: the len is 64 but the index is 64', /checkout/src/libcore/slice/mod.rs:2085:14

versions:
rustc: 1.28-nightly
image = "0.13" # had to peg this to an older version to get it to work

[dependencies.img_hash]
version = "^2.0.1"
features = ["rust-image"]

stack trace:

thread 'main' panicked at 'index out of bounds: the len is 64 but the index is 64', /checkout/src/libcore/slice/mod.rs:2085:14
stack backtrace:
   0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
             at libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
   1: std::sys_common::backtrace::print
             at libstd/sys_common/backtrace.rs:71
             at libstd/sys_common/backtrace.rs:59
   2: std::panicking::default_hook::{{closure}}
             at libstd/panicking.rs:211
   3: std::panicking::default_hook
             at libstd/panicking.rs:227
   4: std::panicking::rust_panic_with_hook
             at libstd/panicking.rs:511
   5: std::panicking::continue_panic_fmt
             at libstd/panicking.rs:426
   6: rust_begin_unwind
             at libstd/panicking.rs:337
   7: core::panicking::panic_fmt
             at libcore/panicking.rs:92
   8: core::panicking::panic_bounds_check
             at libcore/panicking.rs:60
   9: <usize as core::slice::SliceIndex<[T]>>::index_mut
             at /checkout/src/libcore/slice/mod.rs:2085
  10: core::slice::<impl core::ops::index::IndexMut<I> for [T]>::index_mut
             at /checkout/src/libcore/slice/mod.rs:1958
  11: <alloc::vec::Vec<T> as core::ops::index::IndexMut<I>>::index_mut
             at /checkout/src/liballoc/vec.rs:1717
  12: img_hash::block::blockhash_slow::{{closure}}
             at /home/~/img_hash-2.0.1/src/block.rs:99
  13: img_hash::rust_image::<impl img_hash::HashImage for image::dynimage::DynamicImage>::foreach_pixel
             at /home/~/img_hash-2.0.1/src/rust_image.rs:89
  14: img_hash::block::blockhash_slow
             at /home/~/img_hash-2.0.1/src/block.rs:72
  15: img_hash::block::blockhash
             at /home/~/img_hash-2.0.1/src/block.rs:29
  16: img_hash::HashType::hash
             at /home/~/img_hash-2.0.1/src/lib.rs:279
  17: img_hash::ImageHash::hash
             at /home/~/img_hash-2.0.1/src/lib.rs:85
  18: app::event
             at src/main.rs:63
  19: nannou::run_loop::process_and_emit_glutin_event
             at /home/~/nannou-0.6.0/src/lib.rs:368
  20: nannou::run_loop
             at /home/~/nannou-0.6.0/src/lib.rs:412
  21: <nannou::Builder<M, E>>::run
             at /home/~/nannou-0.6.0/src/lib.rs:147
  22: app::main
             at src/main.rs:14
  23: std::rt::lang_start::{{closure}}
             at /checkout/src/libstd/rt.rs:74
  24: std::panicking::try::do_call
             at libstd/rt.rs:59
             at libstd/panicking.rs:310
  25: __rust_maybe_catch_panic
             at libpanic_unwind/lib.rs:105
  26: std::rt::lang_start_internal
             at libstd/panicking.rs:289
             at libstd/panic.rs:392
             at libstd/rt.rs:58
  27: std::rt::lang_start
             at /checkout/src/libstd/rt.rs:74
  28: main
  29: __libc_start_main
  30: _start

Add support for `image-rs 0.24`

Since the latest version of the image-rs library requires Rust 1.56.0, I think that img_hash should also move to the Rust 2021 edition and bump its minimum requirements.

For now I can't use image-rs 0.24.0, because the img_hash crate cannot interoperate with this version, and I get the following error (this works with 0.23):

error[E0277]: the trait bound `image::DynamicImage: Image` is not satisfied
   --> czkawka_core/src/similar_images.rs:606:50
    |
606 |                     let hash = hasher.hash_image(&image);
    |                                       ---------- ^^^^^^ the trait `Image` is not implemented for `image::DynamicImage`
    |                                       |
    |                                       required by a bound introduced by this call

Unable to generate pHash hashes that match other implementations

I'm hoping that I can use your crate to generate phash hashes but I've been having some trouble with this. My current test program looks like this.

I have a teammate that generated a large collection of perceptual hashes. I want to reproduce those hashes for the same files using your library. I tried some different variations with your API and couldn't get it to work.

My next step was to verify that I could at least reproduce those hashes with a different pHash library.

Initially I thought that my teammate was using the ph_dct_imagehash() API from the original (GPL) pHash project. I customized one of the pHash project example programs to print the DCT pHash for a single file. I was unable to generate hashes with img_hash that match the hashes created by pHash's ph_dct_imagehash().

I spoke with the teammate and learned that they're actually using Johannes Buchner's Python-based (BSD) imagehash library. Specifically, he's using Johannes' DCT phash() function. To test it, I made this prototype to create phash() hashes with imagehash. Unfortunately, I was also unable to use img_hash to produce hashes that match those generated by Johannes' imagehash's phash() function.

I suppose you're primarily interested in reproducing the algorithms from the original pHash library (without reading their GPL source). But my motivation is to calculate hashes in Rust that match those that my teammate generated using Johannes Buchner's phash() function. I'm wondering if you'd be able to help me use your library to do this.

One thing I'm thinking is that maybe img_hash should provide a way to use Median instead of Mean for the hash algorithm. From your documentation:

http://www.hackerfactor.com/blog/?/archives/432-Looks-Like-It.html Krawetz describes a "pHash" algorithm which is equivalent to Mean + DCT preprocessing here.

But if you read the comments it says that pHash actually uses Median. Johannes' library also appears to use Median for phash() (though he uses Mean for phash_simple(), but we're not using phash_simple()).
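To make the Mean versus Median distinction concrete, here is a small std-only sketch that thresholds the same values both ways. It is an illustration only: the input stands in for low-frequency DCT coefficients, and the function names are hypothetical, not part of img_hash or imagehash.

```rust
/// Arithmetic mean. Assumes `values` is non-empty.
fn mean(values: &[f64]) -> f64 {
    values.iter().sum::<f64>() / values.len() as f64
}

/// Median, averaging the two middle values for even lengths.
/// Assumes `values` is non-empty.
fn median(values: &[f64]) -> f64 {
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    }
}

/// One bit per value: set when the value exceeds the threshold.
fn threshold_bits(values: &[f64], threshold: f64) -> Vec<bool> {
    values.iter().map(|&v| v > threshold).collect()
}

fn main() {
    // One large outlier drags the mean far above the median.
    let coeffs = [1.0, 2.0, 3.0, 100.0];
    println!("mean bits:   {:?}", threshold_bits(&coeffs, mean(&coeffs)));
    println!("median bits: {:?}", threshold_bits(&coeffs, median(&coeffs)));
}
```

Because a single large coefficient shifts the mean but not the median, the two thresholds can produce different bits for the same input, so hashes built with one are not expected to match hashes built with the other.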

To try to test using Median instead, I made an effort to add HashAlg::Median to img_hash: micahsnyder@6f5f603
I'm not certain that my Median code is correct, which is why I haven't submitted a PR to you. And my overall test program using img_hash::HashAlg::Median still didn't result in matching hashes.

Would you be willing to look at Johannes' phash() implementation, compare it, and help me figure out what I'm doing wrong?

Static lifetime required for hashing?

Hi. I am not exactly sure what is going on, but my best guess is that there is an explicit static lifetime somewhere in this library code or in its dependencies.

fn save_to_disk<'a>(renderer: &'a Renderer, buffer: &'a nanoview::wgpu::Buffer) -> ImageHash {

    let buffer_slice: nanoview::wgpu::BufferSlice<'a> = buffer.slice(..);
    let mapping = buffer_slice.map_async(nanoview::wgpu::MapMode::Read);
    renderer.device.poll(nanoview::wgpu::Maintain::Wait);
    pollster::block_on(mapping).unwrap();
    let data: nanoview::wgpu::BufferView<'a> = buffer_slice.get_mapped_range();
    let image_buffer: ImageBuffer<Bgra<u8>, nanoview::wgpu::BufferView<'a>> = ImageBuffer::<Bgra<u8>, _>::from_raw(512, 512, data).unwrap();
    //image_buffer.save("image.jpg").unwrap();

    let hasher = HasherConfig::new().to_hasher();

    let hash: ImageHash = hasher.hash_image(&image_buffer); // <- 

    return hash;
}

Here is the compiler error:

    |
256 | fn save_to_disk<'a>(renderer: &'a Renderer, buffer: &'a nanoview::wgpu::Buffer) -> ImageHash {
    |                               ------------ this data with lifetime `'a`...
...
271 |     let hash: ImageHash = hasher.hash_image(&image_buffer);
    |                                             ^^^^^^^^^^^^^ ...is captured here...
    |
note: ...and is required to live as long as `'static` here
   --> src/main.rs:271:34
    |
271 |     let hash: ImageHash = hasher.hash_image(&image_buffer);
    |    

It seems hash_image only takes an image with a static lifetime?
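This kind of error is what a `'static` bound on a trait or type parameter typically produces. Assuming that is the cause here (not verified against the crate's source), one possible workaround is to copy the mapped bytes into an owned `Vec<u8>` before building the `ImageBuffer`. The std-only sketch below reproduces the pattern; `needs_static` and `len_of_borrowed` are hypothetical stand-ins, not img_hash APIs.

```rust
/// Stand-in for an API that only accepts containers holding no borrowed data.
fn needs_static<C: AsRef<[u8]> + 'static>(container: C) -> usize {
    container.as_ref().len()
}

/// Takes data borrowed for an arbitrary lifetime and still calls the
/// `'static`-bounded function.
fn len_of_borrowed(data: &[u8]) -> usize {
    // `needs_static(data)` would be rejected here, because `&[u8]` borrows
    // for a lifetime shorter than `'static`. Copying into a `Vec<u8>`
    // produces an owned container, which satisfies the bound.
    needs_static(data.to_vec())
}

fn main() {
    let frame = vec![0u8; 16];
    println!("{}", len_of_borrowed(&frame));
}
```

The copy costs one allocation per frame, which may or may not be acceptable depending on how often images are hashed.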

blockhash values differ from other implementations' values

The results using the blockhash algorithm in a simple test differed from those of other implementations.

In particular, I compared the blockhash crate (v0.2) and the Python implementation (which both produced the same results) on a 16-bit hash of the 16x16_rgb.png image shipped with the blockhash crate:

$ ./blockhash.py ~/git/crates/blockhash/images/16x16_rgb.png --bits 16
ff00ff00ff00fe20fc3efc18f900f980f3c0f7c0ef80fe40fee07ee05e7a1804  [...]
$ cargo run ~/git/crates/blockhash/images/16x16_rgb.png
ff00ff00ff007f043f7c3f189f009f01cf03ef03f7017f007f077e077a1e1820

The img_hash program was the hash_image base program modified by #40 and with this configuration:

let hash = HasherConfig::new().hash_size(16, 16).hash_alg(img_hash::HashAlg::Blockhash).to_hasher().hash_image(&image);

Most of the difference can be explained by blockhash.py/blockhash-crate and img_hash using different bit orders -- once you flip the bits in the bytes, most of the difference goes away. By majority voting (and without looking into the spec of blockhash), I'm opening this as a bug here -- if you insist this behavior is correct, I'm happy to move it to the other projects.

The remaining two bit errors could be explained by rescaling; is there a hash configuration that can be documented and tested to produce exactly the same hashes as the other implementations?
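The bit-order observation can be checked with a short std-only sketch that reverses the bits within each byte of one hash and counts the remaining differences. The two hex strings are the ones quoted above; the helper names are hypothetical.

```rust
/// Parse a hex string into bytes; `None` on odd length or invalid digits.
fn parse_hex(s: &str) -> Option<Vec<u8>> {
    if !s.is_ascii() || s.len() % 2 != 0 {
        return None;
    }
    (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&s[i..i + 2], 16).ok())
        .collect()
}

/// Reverse the bit order inside every byte (0b0000_0001 becomes 0b1000_0000).
fn reverse_bits_per_byte(bytes: &[u8]) -> Vec<u8> {
    bytes.iter().map(|b| b.reverse_bits()).collect()
}

/// Count differing bits; `None` when the lengths differ.
fn bit_difference(a: &[u8], b: &[u8]) -> Option<u32> {
    if a.len() != b.len() {
        return None;
    }
    Some(a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum())
}

fn main() {
    let reference = "ff00ff00ff00fe20fc3efc18f900f980f3c0f7c0ef80fe40fee07ee05e7a1804";
    let img_hash = "ff00ff00ff007f043f7c3f189f009f01cf03ef03f7017f007f077e077a1e1820";
    let a = parse_hex(reference).expect("valid hex");
    let b = parse_hex(img_hash).expect("valid hex");
    println!("as reported:   {:?} differing bits", bit_difference(&a, &b));
    let flipped = reverse_bits_per_byte(&b);
    println!("after flipping: {:?} differing bits", bit_difference(&a, &flipped));
}
```

After the per-byte flip, only two bits differ between the two hashes, which matches the two remaining bit errors mentioned above.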

Support for 16bit integer and 16/32bit float color?

This crate looks perfect for my use case but my data is of the above kind (and u8).

I can of course just transmute the u16, f16 and f32 to u8.

But I wonder how effective the algorithm is in that case. And if it is not effective, could support for these types be added?
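One caveat about transmuting: reinterpreting the raw bytes of a u16 or f32 sample as u8 pixels interleaves high and low bytes (or float bit patterns) as if they were neighbouring pixels, which is unlikely to hash well. Converting each sample by value is a safer starting point. The sketch below is a std-only illustration with hypothetical helper names; it assumes float samples are normalized to the range 0.0 to 1.0, and it leaves out f16 because the standard library has no stable f16 type.

```rust
/// Reduce a 16-bit sample to 8 bits by keeping the most significant byte.
fn u16_to_u8(v: u16) -> u8 {
    (v >> 8) as u8
}

/// Map a float sample to 8 bits.
/// Assumes the nominal range is 0.0..=1.0; values outside are clamped.
fn f32_to_u8(v: f32) -> u8 {
    (v.clamp(0.0, 1.0) * 255.0).round() as u8
}

fn main() {
    let deep: [u16; 3] = [0, 0x8000, 0xffff];
    let as_u8: Vec<u8> = deep.iter().map(|&v| u16_to_u8(v)).collect();
    println!("{:?}", as_u8);

    let floats = [0.0f32, 0.5, 1.0];
    let as_u8: Vec<u8> = floats.iter().map(|&v| f32_to_u8(v)).collect();
    println!("{:?}", as_u8);
}
```

This discards precision, but perceptual hashes work on heavily downscaled data, so the loss may matter little in practice. That is an expectation, not a measured result.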

Produced gradient hash has an invalid size

I've been testing this library with this simple script, but I can't figure out why the produced hash has an invalid size (here 56 instead of 64 bits). The size is also incorrect for DoubleGradient with 135 instead of 128 bits.
When using the Mean, Block or DCT hashtype, the resulting hash has the correct length (64 bits).
(I'm a novice in Rust, so I might have done something wrong.)

extern crate image;
extern crate img_hash;

use std::path::Path;
use img_hash::{ImageHash, HashType};

fn main() {
    let img = image::open(&Path::new("00002701.jpg")).unwrap();
    let hash = ImageHash::hash(&img, 8, HashType::Gradient);
    println!("{:?}", hash.bitv);
    println!("{}", hash.size());
}

(The image is just a random picture, I am only looking at the hash size here.)

Relicense under dual MIT/Apache-2.0

This issue was automatically generated. Feel free to close without ceremony if
you do not agree with re-licensing or if it is not possible for other reasons.
Respond to @cmr with any questions or concerns, or pop over to
#rust-offtopic on IRC to discuss.

You're receiving this because someone (perhaps the project maintainer)
published a crates.io package with the license as "MIT" xor "Apache-2.0" and
the repository field pointing here.

TL;DR the Rust ecosystem is largely Apache-2.0. Being available under that
license is good for interoperation. The MIT license as an add-on can be nice
for GPLv2 projects to use your code.

Why?

The MIT license requires reproducing countless copies of the same copyright
header with different names in the copyright field, for every MIT library in
use. The Apache license does not have this drawback. However, this is not the
primary motivation for me creating these issues. The Apache license also has
protections from patent trolls and an explicit contribution licensing clause.
However, the Apache license is incompatible with GPLv2. This is why Rust is
dual-licensed as MIT/Apache (the "primary" license being Apache, MIT only for
GPLv2 compat), and doing so would be wise for this project. This also makes
this crate suitable for inclusion and unrestricted sharing in the Rust
standard distribution and other projects using dual MIT/Apache, such as my
personal ulterior motive, the Robigalia project.

Some ask, "Does this really apply to binary redistributions? Does MIT really
require reproducing the whole thing?" I'm not a lawyer, and I can't give legal
advice, but some Google Android apps include open source attributions using
this interpretation. Others also agree with it.
But, again, the copyright notice redistribution is not the primary motivation
for the dual-licensing. It's stronger protections to licensees and better
interoperation with the wider Rust ecosystem.

How?

To do this, get explicit approval from each contributor of copyrightable work
(as not all contributions qualify for copyright, due to not being a "creative
work", e.g. a typo fix) and then add the following to your README:

## License

Licensed under either of

 * Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
 * MIT license ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)

at your option.

### Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted
for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any
additional terms or conditions.

and in your license headers, if you have them, use the following boilerplate
(based on that used in Rust):

// Copyright 2016 img_hash developers
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.

It's commonly asked whether license headers are required. I'm not comfortable
making an official recommendation either way, but the Apache license
recommends it in their appendix on how to use the license.

Be sure to add the relevant LICENSE-{MIT,APACHE} files. You can copy these
from the Rust repo for a plain-text
version.

And don't forget to update the license metadata in your Cargo.toml to:

license = "MIT/Apache-2.0"

I'll be going through projects which agree to be relicensed and have approval
by the necessary contributors and making these changes, so feel free to leave
the heavy lifting to me!

Contributor checkoff

To agree to relicensing, comment with :

I license past and future contributions under the dual MIT/Apache-2.0 license, allowing licensees to chose either at their option.

Or, if you're a contributor, you can check the box in this repo next to your
name. My scripts will pick this exact phrase up and check your checkbox, but
I'll come through and manually review this issue later as well.

Example with image buffer

Hello,

I am running stable rust and trying to add this hash lib. I don't fully understand the error message I am receiving from rust. It looks like my buffer should work.

sbeckeriv/make-me-an-image#2

thanks for the code and any advice.
Becker

[edit] I'm on stable. I will try with nightly as well.

Why are some dependencies ranges?

Hey!

I'm considering using img_hash (thanks for your work!)

I'm hoping to gain an understanding of why some of the dependencies are pinned to be below a certain minor version.

My main concern is using img_hash and pulling in multiple versions of crates.

I was able to find the PR that introduced the image cap, but couldn't quite tell why f51961c#diff-80398c5faae3c069e4e6aa2ed11b28c0

Thanks!!
