JuliaCrypto / SHA.jl
A performant, 100% native-julia SHA1, SHA2, and SHA3 implementation
License: Other
I just wanted to register my disappointment that this library returns hex strings. A hash is not a hex string; it's a value, most naturally described as an array of bytes. Returning a string may suit the kinds of applications that you use hashes for, but hashes are more general than that. Please, if or when this is ever moved into the main language, consider providing bytes.
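For readers arriving at this request today: in the SHA.jl versions that ship with Julia 1.x, as far as I am aware, the hash functions already return raw bytes, and the hex form is an explicit extra step. A minimal sketch, assuming the exported sha256 from SHA together with bytes2hex and hex2bytes from Base:

```julia
using SHA

# The digest itself is a byte vector, not a string.
digest = sha256("abc")

# Hex encoding is an explicit, separate step for display or logging.
hex = bytes2hex(digest)

# The hex form round-trips back to the same bytes.
roundtrip = hex2bytes(hex)

println(typeof(digest), " of length ", length(digest))
println(hex)
```

Keeping the byte vector as the primary value and converting to hex only at the edges is the usage pattern the request above asks for.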
Searching for fast Keccak implementations in Julia, I came across @tecosaur's package KangarooTwelve.jl. I think it should be straightforward to extract SHA-3 implementations from it. For example, after changing the hardcoded number of rounds from 12 to 24 in ::Val{nrounds}=Val{12}()), SHA3-256 can be computed as follows:
tecosaur_sha3_256(message) = KangarooTwelve.turboshake(NTuple{32, UInt8}, message, 0b110, Val(512))
On my machine, it performs better for short messages:
julia> msg = zeros(UInt8, 10);
julia> @btime SHA.sha3_256(msg);
871.358 ns (5 allocations: 688 bytes)
julia> @btime tecosaur_sha3_256(msg);
437.598 ns (1 allocation: 48 bytes)
and significantly better for long messages:
julia> msg = zeros(UInt8, 100_000_000);
julia> @btime SHA.sha3_256(msg);
614.836 ms (5 allocations: 688 bytes)
julia> @btime tecosaur_sha3_256(msg);
218.257 ms (1 allocation: 48 bytes)
I just wanted to leave this observation as a note. Feel free to close the issue.
Python's hashlib.sha256(data).hexdigest() is 10x faster than SHA.jl's sha256(data) on my computer:
In [8]: %timeit hashlib.sha256(data).hexdigest()
53.8 µs ± 6.88 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
julia> @benchmark sha256(data)
BenchmarkTools.Trial: 8884 samples with 1 evaluation.
Range (min … max): 514.644 μs … 913.010 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 518.341 μs ┊ GC (median): 0.00%
Time (mean ± σ): 561.483 μs ± 99.395 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
This is presumably (I haven't checked, though) because Python's underlying C library uses SHA-specific CPU instructions, whereas this package does not.
This might not be possible to fix in current versions of Julia, but given that we see a 10x performance difference, this is something Julia needs to support eventually (and probably will). When that time comes, this package should be updated.
This issue was discovered by @jonalm
@staticfloat could you please tag a version? master seems to have a bunch of Julia 0.5 deprecation warnings fixed; it would be good to have those in a tagged version. Thanks.
1.7.0 stdlib/SHA: https://docs.julialang.org/en/v1.7.0-rc1/stdlib/SHA/
1.8 none: https://docs.julialang.org/en/v1.8-dev/search/?q=SHA
Should we copy stdlib/SHA/docs/src/index.md to this repo?
https://github.com/JuliaLang/julia/pull/41370/files#diff-893e1da8d89d3231b3b4137c31c3eaa0553e79b5ea62ed497cfc74418bc88b8c
Currently, SHA contexts still work after calling digest!. Perhaps they should be made non-functional, i.e. throw an error once they have been digested?
At the very least, the docs should mention that this should not be done.
CC @staticfloat
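To make the proposal concrete, here is a minimal sketch of the "error after digest" behaviour implemented outside the package. OneShotContext, feed! and finish! are hypothetical names invented for this illustration; they are not part of SHA.jl. Only SHA2_256_CTX, update! and digest! are assumed from the package.

```julia
using SHA

# Hypothetical wrapper: remembers whether the wrapped context was digested.
mutable struct OneShotContext{C}
    ctx::C
    finished::Bool
end
OneShotContext(ctx) = OneShotContext(ctx, false)

# Feed more data; refuse if the context has already been digested.
function feed!(w::OneShotContext, data)
    w.finished && throw(ArgumentError("context has already been digested"))
    update!(w.ctx, data)
    return w
end

# Produce the digest exactly once; later calls throw.
function finish!(w::OneShotContext)
    w.finished && throw(ArgumentError("context has already been digested"))
    w.finished = true
    return digest!(w.ctx)
end

w = OneShotContext(SHA2_256_CTX())
feed!(w, b"some data")
feed!(w, b"some more data")
result = finish!(w)

# A second use is rejected instead of silently continuing.
reused = try
    feed!(w, b"late data")
    false
catch err
    err isa ArgumentError
end
```

Whether such a check belongs in the package itself or only in the documentation is the open question of this issue; the wrapper just shows that the stricter behaviour is cheap to express.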
I'm trying to test those lines:
https://app.codecov.io/gh/JuliaCrypto/SHA.jl/blob/master/src%2Fshake.jl#L84
Lines 74 to 76 in 88e1c83
Lines 83 to 88 in 88e1c83
So I construct an input:
julia> using SHA
julia> SHA.blocklen(SHAKE_128_CTX) |> Int
168
julia> SHA.shake128(codeunits("0" ^ 167), UInt(32)) |> bytes2hex
"1350314219fb0728c65f0a9f560e70c402d63f1862f9f34555ce27828f5e0373"
julia> SHA.shake128(codeunits("0" ^ 168), UInt(32)) |> bytes2hex
"39ebc9a5df5c81a46d8a6f543326348695ff59d0dd10e020b53cb49a1c1532e4"
python output:
>>> import hashlib
>>> h = hashlib.shake_128()
>>> h.update(b'0' * 167)
>>> h.hexdigest(32)
'ff60b0516fb8a3d4032900976e98b5595f57e9d4a88a0e37f7cc5adfa3c47da2'
>>> h = hashlib.shake_128()
>>> h.update(b'0' * 168)
>>> h.hexdigest(32)
'39ebc9a5df5c81a46d8a6f543326348695ff59d0dd10e020b53cb49a1c1532e4'
Notice that for the input "0" ^ 167, Julia's output does not match Python's.
There might be a bug here. @immoschuett
I just noticed that the SHA-3 functions yield incorrect results if the length of the message is one byte less than a multiple of the block size. Taking as an example sha3_512 and a message of length 71:
julia> using SHA; bytes2hex(sha3_512(zeros(UInt8, SHA.blocklen(SHA.SHA3_512_CTX) - 1)))
"67dd0c8d9120f2772eac8c9287888a2f6128ce21f8894734f71ce4943e71f0e60f72823986c9a1c82f38d94fdbdc1905fc5df19c09499ea356950767d0812714"
However, the correct hash is:
"cd87417194c917561a59c7f2eb4b95145971e32e8e4ef3b23b0f190bfd29e3692cc7975275750a27df95d5c6a99b7a341e1b8a38a750a51aca5b77bae41fbbfc"
Ping @staticfloat
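Both reports above (167 bytes for SHAKE128 with its 168-byte block, 71 bytes for SHA3-512 with its 72-byte block) concern a message one byte short of a block boundary. In the Keccak padding rule (pad10*1), that is exactly the case where the domain-separation suffix and the final 0x80 bit must share a single byte, which makes it a natural edge case to check. The sketch below is not taken from SHA.jl's source; keccak_pad is a hypothetical helper written only to illustrate the rule, with the suffix bytes 0x06 (SHA-3) and 0x1f (SHAKE) as defined in FIPS 202.

```julia
# Pad `msg` to a multiple of `rate` bytes using the Keccak pad10*1 rule.
# `suffix` is 0x06 for the SHA-3 hash functions and 0x1f for SHAKE.
function keccak_pad(msg::AbstractVector{UInt8}, rate::Int, suffix::UInt8)
    padlen = rate - (length(msg) % rate)   # always between 1 and rate
    padding = zeros(UInt8, padlen)
    padding[1] |= suffix                   # domain bits plus first pad bit
    padding[end] |= 0x80                   # final pad bit
    return [msg; padding]
end

# One byte short of a block: both markers land in the same byte.
sha3_edge  = keccak_pad(zeros(UInt8, 71), 72, 0x06)
shake_edge = keccak_pad(zeros(UInt8, 167), 168, 0x1f)
println(string(sha3_edge[end], base = 16))    # 86
println(string(shake_edge[end], base = 16))   # 9f

# Exactly one block: a full extra block of padding is appended.
full = keccak_pad(zeros(UInt8, 72), 72, 0x06)
println(length(full))                         # 144
```

An implementation that writes the suffix and the 0x80 byte with plain assignments rather than OR-ing them would overwrite one with the other precisely in this single-byte case.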
Tag v0.2.2 already exists here, but does not appear in METADATA.jl.
The lack of update for SHA.jl is, somehow, keeping DataStructures.jl from testing on master: https://travis-ci.org/JuliaLang/DataStructures.jl/jobs/175961386
(Cheers, Elliot!)
I would be happy to do the work, I figure the module would end up exporting the following new objects:
and
The algorithms appear to mostly be the SHA512 with a different initial hash value. Does that seem reasonable?
julia-buildpkg@v1 on Windows takes about 6 to 7 minutes.
It only takes a few seconds on other platforms.
https://github.com/JuliaCrypto/SHA.jl/actions/runs/7591156618/job/20678829291?pr=100
@staticfloat Some time ago I created the org JuliaCrypto, with the purpose of serving Krypto.jl and the further intent to hold all major cryptographic packages in Julia, as JuliaMath, BioJulia, etc. do for their fields of science. This is a suggestion to transfer SHA.jl into the organisation. As the current owner and lead developer, you would, of course, retain that position :) The transfer would further simplify the creation of a bigger Hash.jl, containing the current SHA.jl, if that were to happen at some point in the future.
Of course, such a transfer could cause some problems, as all dependent repos would have to update the location of the package (I'm looking at you, METADATA.jl).
If you agree with this proposal, I would be very glad to see you reply to this thread, and likewise if you disagree. :-)
The latest version of this package is 0.7. You cannot install SHA@0.7 on Julia 1.6 and 1.7 as these versions already include SHA.jl as a stdlib:
julia> VERSION
v"1.6.6"
(@v1.6) pkg> add SHA@0.7
Resolving package versions...
ERROR: Unsatisfiable requirements detected for package SHA [ea8e919c]:
SHA [ea8e919c] log:
├─possible versions are: 1.6.6 or uninstalled
└─restricted to versions 0.7 by an explicit requirement — no versions left
This is problematic: if you want to specify a Project.toml compat entry for SHA.jl and your package supports Julia 1.6+, you would have to define a compat entry of:
[compat]
SHA = "~1.6, ~1.7, 0.7"
Although this isn't a problem at the moment, we could run into an issue in the future where SHA.jl version 1.0.0 removes functionality that was present in the stdlib, and then the stdlib SHA.jl and the external SHA.jl both have a version 1.6.6 providing different functionality.
Additionally, when this external package reaches version 2.0 then Julia 1.6 and 1.7 will be unable to use this new version as they are stuck with the stdlib version.
The only options I see are to:
julia> tmp_key = rand(UInt8,16);
julia> str = "123";
julia> bytes2hex( SHA.hmac_sha256(tmp_key,str) )
"ce0006ef9b02c5c710b8bfcb47e6f5177b44323992e2918dfbdb087a03ec6f4f"
julia> bytes2hex(tmp_key)
"e85726cc1b12609bbaebe535f0926f59"
where the correct answer should be "917ca2bb41787aadb5384038f2790b6e0a02e058593e37e1297f970be269e583" using other methods (ref: https://www.freeformatter.com/hmac-generator.html).
Julia version v1.4.1. I tried other strings and found the results incorrect as well.
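A way to check hmac_sha256 independently of any web tool is to compare it against a published test vector, such as the Wikipedia example that also appears further down this page. The sketch below does that, and also illustrates one thing worth ruling out when two tools disagree: whether the key was interpreted as raw bytes or as the text of its hex encoding. That second point is an assumption on my part about a common source of mismatches, not something the report above confirms.

```julia
using SHA

# Published vector: HMAC-SHA256 with key "key" over the pangram below.
key = Vector{UInt8}("key")
msg = "The quick brown fox jumps over the lazy dog"
mac = bytes2hex(hmac_sha256(key, msg))
println(mac)

# The same key, but fed in as the ASCII text of its hex encoding,
# is a different key and therefore yields a different MAC.
key_as_text = Vector{UInt8}(bytes2hex(key))
mac_text = bytes2hex(hmac_sha256(key_as_text, msg))
println(mac == mac_text)
```

If the known vector matches, the remaining suspects are the key and message encodings used on each side of the comparison.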
@staticfloat
julia> bytes2hex(sha256("test"))
ERROR: MethodError: `bytes2hex` has no method matching bytes2hex(::ASCIIString)
If I knew what it should be, I'm happy to submit a PR :)
FYI: I made a pure Julia Ripemd implementation, in case you are interested. I stuck to the update!, digest! interface to keep things consistent. It is currently only Ripemd160, but should be easily extensible.
For short objects it is faster than Nettle and for long objects it takes < 1.5x the time of Nettle.
It would be helpful if for each hash there was an object representing a hash (e.g. SHA1, SHA256 etc.), similar to UUID in Base.
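One possible shape for such an object, sketched under my own assumptions: SHA256Digest and sha256_digest are hypothetical names that do not exist in SHA.jl, and the fixed-size tuple storage is just one design option, chosen so that values compare by content in the way UUID values do.

```julia
using SHA

# Hypothetical value type holding exactly 32 digest bytes.
struct SHA256Digest
    bytes::NTuple{32,UInt8}
end

# Hypothetical convenience constructor that hashes the input.
sha256_digest(data) = SHA256Digest(Tuple(sha256(data)))

# Print as hex, the conventional human-readable form.
Base.show(io::IO, d::SHA256Digest) =
    print(io, "SHA256Digest(\"", bytes2hex(collect(d.bytes)), "\")")

a = sha256_digest("abc")
b = sha256_digest(b"abc")
println(a)
println(a == b)
```

Because the struct is immutable and holds only plain bits, equality falls out of the default comparison without extra code, and the type can be used to distinguish digests of different algorithms in method signatures.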
Just saw this weird failure on Julia 0.5 win64 for jump-dev/ECOS.jl#46:
LoadError: MethodError: no method matching blocklen(::Type{SHA.SHA2_256_CTX})
Closest candidates are:
blocklen(!Matched::Type{SHA.SHA1_CTX}) at C:\Users\appveyor\.julia\v0.5\SHA\src\types.jl:92
blocklen(!Matched::Type{SHA.SHA2_224_CTX}) at C:\Users\appveyor\.julia\v0.5\SHA\src\types.jl:93
blocklen(!Matched::Type{SHA.SHA2_512_CTX}) at C:\Users\appveyor\.julia\v0.5\SHA\src\types.jl:94
...
while loading C:\Users\appveyor\.julia\v0.5\ECOS\deps\build.jl, in expression starting on line 48
@staticfloat What do you think of adding a separate CI job to run the doctests?
If the stdlib SHA package replaces this package for Julia v1.0, there should be a version cap in the REQUIRE file for this package.
diff: 7ac490d...master
I use this package for the TOTP algorithm and some exchange protocols. Both require HMAC functionality, so I hope this can be built into this package like Nettle.jl did.
The implementation (from the wiki) can be quite simple:
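using SHA  # provides sha1, sha256, ...

# HMAC as described on the wiki; `hash` is a function such as sha1 or sha256.
# Accepts any byte vector (including b"..." literals) for key and msg.
function hmac(key::AbstractVector{UInt8}, msg::AbstractVector{UInt8}, hash, blocksize::Int=64)
    key = Vector{UInt8}(key)               # copy, so the caller's key is not mutated
    if length(key) > blocksize
        key = hash(key)                    # keys longer than one block are hashed first
    end
    pad = blocksize - length(key)
    if pad > 0
        append!(key, zeros(UInt8, pad))    # right-pad the key with zeros to the block size
    end
    o_key_pad = key .⊻ 0x5c
    i_key_pad = key .⊻ 0x36
    hash([o_key_pad; hash([i_key_pad; msg])])
end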
and here is a test suite (also from wiki):
@test hmac(b"", b"", sha1, 64) == hex2bytes("fbdb1d1b18aa6c08324b7d64b71fb76370690e1d")
@test hmac(b"", b"", sha256, 64) == hex2bytes("b613679a0814d9ec772f95d778c35fc5ff1697c493715653c6c712144292c5ad")
@test hmac(b"key", b"The quick brown fox jumps over the lazy dog", sha1, 64) == hex2bytes("de7c9b85b8b78aa6bc8a7a36f70a90701c9db4d9")
@test hmac(b"key", b"The quick brown fox jumps over the lazy dog", sha256, 64) == hex2bytes("f7bc83f430538424b13298e6aa6fb143ef4d59a14946175997479dbc2d1a3cd8")
When opening an issue, please ping @staticfloat since he does not receive emails automatically when new issues are created.
What do you think of the notion of adding a pure Julia MD5 to this package?
Conceptually it seems pretty similar to SHA1, so there is maybe some code overlap?
(But I am not a crypto guy. I know they are both from the same lineage of hash functions.)
We have access to MD5 in Nettle.jl, MbedTLS.jl and Crypto.jl, but all three wrap binaries,
which doesn't seem worth it for such a simple, short algorithm.
I want it for DataDeps.jl, because most public data sources provide MD5 sums (if they provide a checksum at all).
For example: http://datadryad.org/resource/doi:10.5061/dryad.ds68r/4
I've started to implement it, but I'm not sure how much of the code for SHA I can reuse.
How different would that be, would it reuse much/any code with what's already here? We can get it from libgit2 (or easier from nettle.jl) so no rush, but would be convenient to have it available in pure Julia at some point.
I have an md5 implementation here. It uses the same structure as SHA.jl and I copy-pasted some SHA.jl code there. This is not ideal; I can think of a couple of other options:
a HashFunctionsBase.jl package with common code
What do you think @staticfloat?
This is incredibly slow and wasteful; read in something like a page at a time, or even better, use mmap.
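The "read a chunk at a time" suggestion can be sketched with the streaming context API. This is an illustration rather than the package's implementation: sha256_chunked is a hypothetical helper name and the 64 KiB default is an arbitrary choice; only SHA2_256_CTX, update! and digest! are assumed from SHA.jl.

```julia
using SHA

# Hash an IO stream in fixed-size chunks instead of byte by byte.
function sha256_chunked(io::IO; chunk_size::Int = 64 * 1024)
    ctx = SHA2_256_CTX()
    buf = Vector{UInt8}(undef, chunk_size)   # reused for every read
    while !eof(io)
        n = readbytes!(io, buf)              # number of bytes actually read
        update!(ctx, view(buf, 1:n))         # feed only the valid part
    end
    return digest!(ctx)
end

data = rand(UInt8, 100_000)
chunked = sha256_chunked(IOBuffer(data))
println(chunked == sha256(data))
```

The same function works on a file handle, for example inside open(path) do io ... end. For the mmap route mentioned above, the Mmap standard library can expose a file as a byte vector that is then passed to sha256 directly.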
On python 3.11.6
In [1]: import random
In [2]: import hashlib
In [3]: def calc_sha(bytes_):
   ...:     sha = hashlib.sha256()
   ...:     sha.update(bytes_)
   ...:     return sha
   ...:
In [4]: bytes_ = random.randbytes(10000)
In [5]: %timeit calc_sha(bytes_)
3.93 µs ± 13.5 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
On julia 1.10 beta
julia> using SHA
julia> using BenchmarkTools
julia> @btime sha256(a) setup=a=rand(UInt8, 10000);
25.584 μs (4 allocations: 352 bytes)
I saw that SHA is part of the Julia stdlib, so I guess that this repo is not maintained any more. You should put a comment in the readme. @staticfloat
From JuliaLang/julia#42871 (https://buildkite.com/julialang/julia-master/builds/5337#c13096a2-2434-4121-9250-1ed7fc8b81bd):
[ Info: Doctest: running doctests.
┌ Error: doctest failure in src/stdlib/SHA.md:48-88
│
│ ```jldoctest
│ julia> using SHA
│
│ julia> ctx = SHA2_256_CTX()
│ SHA2 256-bit hash state
│
│ julia> update!(ctx, b"some data")
│ 0x0000000000000009
│
│ julia> update!(ctx, b"some more data")
│ 0x0000000000000017
│
│ julia> digest!(ctx)
│ 32-element Vector{UInt8}:
│ 0xbe
│ 0xcf
│ 0x23
│ 0xda
│ 0xaf
│ 0x02
│ 0xf7
│ 0xa3
│ 0x57
│ 0x92
│ 0x5b
│ 0xc5
│ 0xe1
│ ⋮
│ 0x19
│ 0xa0
│ 0x1b
│ 0x89
│ 0x4f
│ 0x59
│ 0xd8
│ 0xb3
│ 0xb4
│ 0x81
│ 0x8b
│ 0xc5
│ ```
│
│ Subexpression:
│
│ digest!(ctx)
│
│ Evaluated output:
│
│ 32-element Vector{UInt8}:
│ 0xbe
│ 0xcf
│ 0x23
│ 0xda
│ 0xaf
│ 0x02
│ 0xf7
│ 0xa3
│ 0x57
│ 0x92
│ ⋮
│ 0x89
│ 0x4f
│ 0x59
│ 0xd8
│ 0xb3
│ 0xb4
│ 0x81
│ 0x8b
│ 0xc5
│
│ Expected output:
│
│ 32-element Vector{UInt8}:
│ 0xbe
│ 0xcf
│ 0x23
│ 0xda
│ 0xaf
│ 0x02
│ 0xf7
│ 0xa3
│ 0x57
│ 0x92
│ 0x5b
│ 0xc5
│ 0xe1
│ ⋮
│ 0x19
│ 0xa0
│ 0x1b
│ 0x89
│ 0x4f
│ 0x59
│ 0xd8
│ 0xb3
│ 0xb4
│ 0x81
│ 0x8b
│ 0xc5
│
│ diff =
│ 32-element Vector{UInt8}:
│ 0xbe
│ 0xcf
│ 0x23
│ 0xda
│ 0xaf
│ 0x02
│ 0xf7
│ 0xa3
│ 0x57
│ 0x92
│ 0x5b
│ 0xc5
│ 0xe1
│ 0x92
│ ⋮
│ 0x19
│ 0xa0
│ 0x1b
│ 0x89
│ 0x4f
│ 0x59
│ 0xd8
│ 0xb3
│ 0xb4
│ 0x81
│ 0x8b
│ 0xc5
└ @ Documenter.DocTests /cache/build/amdci4-3/julialang/julia-master/doc/deps/packages/Documenter/f5jts/src/DocTests.jl:385
ERROR: LoadError: `makedocs` encountered a doctest error. Terminating build
@staticfloat, now that the stress of the 1.0 release has passed, please don't forget to add the fast sha2 implementation to the SHA version in Stdlib.
Minor, but worth pointing out: SHA.jl does have HMAC functions for all of SHA1 through SHA3, and if you search the terminal help you can find (some of) the functions listed below, but there is no documentation about them in SHA.jl:
hmac_sha1, hmac_sha512, hmac_sha384, hmac_sha256, hmac_sha224
hmac_sha2_512, hmac_sha2_384, hmac_sha2_256, hmac_sha2_224
hmac_sha3_512, hmac_sha3_384, hmac_sha3_256, hmac_sha3_224