aahancoc / tree_magic
Determines the MIME type of a file by traversing a filetype tree.
License: MIT License
When file data is pulled from a stream, a slightly different API is needed, because we can't be sure that we have enough data yet.
I think from_u8() and match_u8() should return Option<_>, where None means "not enough data to determine".
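A minimal sketch of what such an Option-returning check could look like. This is not tree_magic's API or its matching algorithm: the function name `sniff_partial` and the small signature table are hypothetical, chosen only to show how None can mean "keep reading" while Some carries a final answer.

```rust
/// Hypothetical signature table (not tree_magic's database):
/// each entry is (leading magic bytes, MIME type).
const SIGNATURES: &[(&[u8], &str)] = &[
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
];

/// Returns `Some(mime)` once the buffer is long enough to decide,
/// or `None` when the bytes seen so far are still a prefix of at
/// least one known signature (i.e. not enough data to determine).
fn sniff_partial(buf: &[u8]) -> Option<&'static str> {
    let mut undecided = false;
    for &(magic, mime) in SIGNATURES {
        if buf.len() >= magic.len() {
            if buf.starts_with(magic) {
                return Some(mime);
            }
        } else if magic.starts_with(buf) {
            // The buffer is too short to rule this signature in or out.
            undecided = true;
        }
    }
    if undecided {
        None
    } else {
        // Every signature has been ruled out: fall back to a generic type.
        Some("application/octet-stream")
    }
}
```

A stream consumer would call `sniff_partial` after each read and stop as soon as it returns Some. An empty buffer always yields None in this sketch, since nothing can be ruled out yet.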
In tree_magic 0.2.0 the following panics from unwrapping a None:
let file = std::path::Path::new("does-not-exist");
tree_magic::from_filepath(file);
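Until the library handles this itself, a caller can avoid the panic by reading the file header with ordinary error handling and passing the bytes on. The sketch below uses only the standard library; `read_header` is a hypothetical helper name, not part of tree_magic.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Hypothetical helper: read at most `limit` bytes from the start of
/// `path`. A missing or unreadable file becomes an `Err`, not a panic.
fn read_header(path: &Path, limit: u64) -> io::Result<Vec<u8>> {
    let file = File::open(path)?;
    let mut buf = Vec::new();
    // `take` caps how much is read, so large files are not loaded whole.
    file.take(limit).read_to_end(&mut buf)?;
    Ok(buf)
}
```

The returned bytes could then be handed to a byte-based detector such as from_u8(), with the Err case reported to the user.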
Currently JSON files do not seem to be detected properly, but instead detected as application/json. Is this planned, or out of scope?
It would be very cool if you would update the obsolete nom v2 to the latest v4. I tried to do that update myself, but it turned out to be nontrivial.
Thanks!
First, thanks for this crate. It's the nicest MIME type detector I've found in pure Rust.
I'm using it as a library, and the v0.2.2 release made a breaking change to the API, namely tree_magic::from_filepath now returns an Option.
I've gone ahead and changed my crate's version requirement to be = 0.2.1 until I can update it, but in the future it'd be helpful if changes like this could be accompanied by a minor version bump.
I forked tree_magic because I'd like to submit a PR that updates its deps, but I don't have confidence I won't break anything because some tests are failing for me on master (nightly Rust, macOS):
running 16 tests
test from_u8::application_zip ... ok
test from_u8::application_x_7z ... ok
test from_u8::image_gif ... ok
test from_u8::image_png ... ok
test from_u8::image_bmp ... ok
test from_u8::application_tar ... ok
test from_u8::image_tiff ... ok
test from_u8::image_x_portable_bitmap ... ok
test from_u8::image_x_pcx ... ok
test from_u8::image_x_tga ... ok
test from_u8::text_plain ... ok
test from_u8::audio_ogg ... FAILED
test from_u8::audio_opus ... FAILED
test from_u8::audio_mpeg ... ok
test from_u8::audio_wav ... FAILED
test from_u8::audio_flac ... ok
failures:
---- from_u8::audio_ogg stdout ----
thread 'from_u8::audio_ogg' panicked at 'assertion failed: `(left == right)`
left: `"audio/x-vorbis+ogg"`,
right: `"audio/ogg"`', tests/from_u8.rs:118:3
note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
---- from_u8::audio_opus stdout ----
thread 'from_u8::audio_opus' panicked at 'assertion failed: `(left == right)`
left: `"audio/x-opus+ogg"`,
right: `"audio/opus"`', tests/from_u8.rs:126:3
---- from_u8::audio_wav stdout ----
thread 'from_u8::audio_wav' panicked at 'assertion failed: `(left == right)`
left: `"application/x-riff"`,
right: `"audio/wav"`', tests/from_u8.rs:134:3
failures:
from_u8::audio_ogg
from_u8::audio_opus
from_u8::audio_wav
All of the JPG files I have tried return text/plain.
I want to use tree_magic 0.2.3 to validate user content being uploaded to my website (using actix).
The files being uploaded are assigned a UUID filename. I discard the filename provided by the user, since it cannot be trusted.
I have tried both from_u8() and from_filepath().
If the path ends in .jpg then tree_magic classifies the file correctly as image/jpeg.
However, I cannot trust that the user has provided the correct file extension.
Interestingly, all the PNG files I have tried are identified correctly as image/png.
The PNG identification works no matter what file extension I'm using, and I can leave it out entirely.
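As a stopgap for upload validation, the two formats mentioned here can be recognized from their leading bytes alone, independent of any file extension. This is a sketch under stated assumptions, not how tree_magic works internally: the function name `sniff_image` is hypothetical, and it only checks the start-of-image marker (FF D8 FF) for JPEG and the 8-byte PNG signature.

```rust
/// JPEG files start with the SOI marker FF D8, followed by another
/// marker whose first byte is FF.
const JPEG_MAGIC: &[u8] = &[0xFF, 0xD8, 0xFF];
/// The fixed 8-byte PNG signature.
const PNG_MAGIC: &[u8] = b"\x89PNG\r\n\x1a\n";

/// Hypothetical helper: classify an upload by content only.
/// Returns `None` for anything that is neither JPEG nor PNG.
fn sniff_image(data: &[u8]) -> Option<&'static str> {
    if data.starts_with(JPEG_MAGIC) {
        Some("image/jpeg")
    } else if data.starts_with(PNG_MAGIC) {
        Some("image/png")
    } else {
        None
    }
}
```

A check like this only confirms the header; it does not prove the rest of the file is a valid image, so untrusted uploads may still need to be decoded or re-encoded before use.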
I know this library is unmaintained, but opening this for a future maintainer :)
This file: test.tar.zip (zipped to prevent GitHub complaining)
file --mime-type: application/x-tar
tmagic: application/pdf
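For reference, POSIX (ustar) tar archives carry their magic string at byte offset 257 of the first header block rather than at the start of the file, so a detector has to look at that offset to recognize them. A minimal sketch of such a check follows; the name `looks_like_tar` is hypothetical, and old pre-POSIX (v7) archives without the magic field would not be recognized by it.

```rust
/// Offset of the `magic` field inside a 512-byte ustar header block.
const USTAR_OFFSET: usize = 257;

/// Hypothetical helper: true if the buffer has "ustar" at offset 257.
/// Only the first five bytes are compared, so both the POSIX
/// ("ustar\0") and GNU ("ustar ") variants of the field match.
fn looks_like_tar(data: &[u8]) -> bool {
    data.get(USTAR_OFFSET..USTAR_OFFSET + 5) == Some(&b"ustar"[..])
}
```

Because the check needs at least 262 bytes, a detector that inspects only a short prefix of the file cannot see this field at all.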
Attached is minimal.pdf, which tree_magic 0.2.0 misidentifies as text/x-tex, when application/pdf is expected.
Unix "file" command, as tested on Ubuntu, correctly identifies it as "PDF document, version 1.1".
Happens when tree_magic::from_filepath(path) is called on non-regular files such as FIFOs and character or block devices.
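The report does not say exactly what goes wrong, but one defensive option on the caller's side is to check the file type before handing a path to the detector. The sketch below uses only the standard library; `is_regular_file` is a hypothetical helper name.

```rust
use std::fs;
use std::path::Path;

/// Hypothetical guard: true only for regular files.
/// `fs::metadata` follows symlinks, so a symlink to a regular file
/// counts as regular. Missing paths, directories, FIFOs, and device
/// nodes all yield false.
fn is_regular_file(path: &Path) -> bool {
    fs::metadata(path)
        .map(|meta| meta.file_type().is_file())
        .unwrap_or(false)
}
```

A caller would run this check first and skip detection (or report a generic type) for anything that is not a regular file.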
Might be out of scope of this project, but I'll submit this anyways.
If you test tree_magic on the test files in Musync (check the data/ folder) you get mostly bogus results, with only a couple identifying correctly.
An optional feature to allow parsing directly to the https://github.com/hyperium/mime.rs Mime type would be nice.
Could you put out a 0.3 release of tree_magic? There's an API fix I'd like to integrate into https://gitlab.gnome.org/GNOME/fractal that's on master, but I'd like to avoid directly fetching from the git repos.
It seems .docx is missing from the MIME database.
I'm not sure this is a problem with tree_magic itself.
doc.doc:
magic: application/msword
treemagic: application/msword
doc.odt:
magic: application/vnd.oasis.opendocument.text
treemagic: application/vnd.oasis.opendocument.text
doc.docx:
magic: application/vnd.openxmlformats-officedocument.wordprocessingml.document
treemagic: application/zip
Filing this issue mainly for awareness, since it may be important to some users of this crate: This package lists its license as "MIT," but the database files in the builtin directory were released by the FreeDesktop project under the GNU GPL. Developers should be careful to avoid violating the GPL when distributing programs that include this crate.
The tree_magic_mini fork of this crate offers an option to build without the GPLed database files. For details, see: mbrubeck#1
Hello. Sorry for abandoning this project. I mostly made it to learn how to program in Rust, and didn't really expect people to use it. That said I'm glad everyone finds it useful.
I'm not in a position where I have a lot of time to dedicate to maintaining this (and also it was a little too ambitious for my Rust skills), so if an active contributor would rather take charge I think that would be much better.
This file path doesn't seem right:
tree_magic/src/fdo_magic/sys.rs
Line 72 in 9bc1f46
Cargo.toml:
[package]
name = "blah"
version = "0.1.0"
[dependencies]
tree_magic="0.2.0"
index.html:
<!DOCTYPE html>
src/lib.rs:
extern crate tree_magic;

#[cfg(test)]
mod tests {
    use super::*;
    use std::path::Path;

    static MIME: &str = "text/html";
    static FILE: &str = "index.html";

    #[test]
    fn from_filepath() {
        assert_eq!(MIME, tree_magic::from_filepath(Path::new(FILE)));
    }

    #[test]
    fn match_filepath() {
        assert!(tree_magic::match_filepath(MIME, Path::new(FILE)));
    }
}
Running cargo test on the above project shows from_filepath passing and match_filepath failing.
from_filepath seems to pass as long as the HTML file has DOCTYPE html, ignoring everything else.
Attached is misidentified_rtf.zip, which tree_magic 0.2.0 misidentifies as application/octet-stream, when application/rtf is expected.
Unix "file" command, as tested on Ubuntu, correctly identifies it as "Rich Text Format data, version 1, ANSI".