finalfusion / finalfusion-utils
finalfusion command-line utilities
License: Other
finalfusion seems to work on Apple Silicon. I am using an x86_64 toolchain under Rosetta 2 because for some other projects the dependencies are not available yet for Apple Silicon (e.g. libtorch). Then I cross-compile for Apple Silicon:
$ rustup target add aarch64-apple-darwin
$ cargo build --target=aarch64-apple-darwin --release
$ file target/aarch64-apple-darwin/release/finalfusion
target/aarch64-apple-darwin/release/finalfusion: Mach-O 64-bit executable arm64
$ target/aarch64-apple-darwin/release/finalfusion
zsh: killed target/aarch64-apple-darwin/release/finalfusion
Turns out that this is a known problem with code signing in cross compilation.
# Sign with ad-hoc signing
$ codesign -s - target/aarch64-apple-darwin/release/finalfusion
$ target/aarch64-apple-darwin/release/finalfusion
finalfusion
USAGE:
finalfusion <SUBCOMMAND>
Open questions:
There is quite a bit of repetition in the setup of the command-line parser; we might be able to factor out some of the common arguments, such as the input format or the input path.
Amend the README (and maybe the command info?) to state that the results of compute-accuracy are not comparable unless the vocabularies are identical.
The MKL software license has permitted redistribution for a while now: https://software.intel.com/content/dam/develop/external/us/en/documents/pdf/intel-simplified-software-license.pdf
Maybe we should use MKL in the precompiled version:
I guess this only means that we have to switch from a MUSL build to glibc, but that's not a problem.
I can do the implementation work, just wanted to check that you are ok with this, @sebpuetz.
This subcommand should do the opposite of finalfusion quantize and reconstruct an embedding matrix from a quantized matrix. This would make it possible to evaluate quantized matrices with compute-accuracy. This is motivated by @NianhengWu's observation in finalfusion/finalfusion-rust#82 that it is currently difficult to do an intrinsic evaluation of quantized embedding matrices.
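To make the idea of reconstruction concrete, here is a minimal sketch, assuming the matrix was quantized with product quantization. The codebook and code layout used here is hypothetical and is not the finalfusion storage format; each row is rebuilt by concatenating the centroid selected by each subquantizer.

```rust
/// Reconstruct vectors from product-quantized codes (hypothetical layout).
/// `codebooks[s][c]` is centroid `c` of subquantizer `s`.
/// `codes[i][s]` is the centroid index of row `i` for subquantizer `s`.
fn reconstruct(codebooks: &[Vec<Vec<f32>>], codes: &[Vec<usize>]) -> Vec<Vec<f32>> {
    codes
        .iter()
        .map(|row| {
            // Concatenate the selected centroid of every subquantizer.
            row.iter()
                .zip(codebooks)
                .flat_map(|(&c, book)| book[c].iter().copied())
                .collect()
        })
        .collect()
}

fn main() {
    // Two subquantizers with two centroids each, both of dimension 2.
    let codebooks = vec![
        vec![vec![0.0, 0.0], vec![1.0, 1.0]],
        vec![vec![2.0, 2.0], vec![3.0, 3.0]],
    ];
    let codes = vec![vec![1, 0], vec![0, 1]];
    for row in reconstruct(&codebooks, &codes) {
        println!("{:?}", row);
    }
}
```

The reconstructed rows are approximations of the original embeddings, which is what an intrinsic evaluation of the quantized matrix would need to measure.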
Proposed solution: change analogy to return a Result<..., [bool; 3]>, then print failed words.
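A rough sketch of how the error side of such a Result could be used to print the failed words. The check_query function below is a hypothetical stand-in for the analogy lookup (it only tests vocabulary membership), and the meaning of the array (true = word found) is an assumption.

```rust
/// Hypothetical stand-in for the analogy lookup: it only checks whether
/// each query word is in the vocabulary. On failure, the error holds one
/// flag per query word (assumed meaning: true = word was found).
fn check_query(query: [&str; 3], vocab: &[&str]) -> Result<(), [bool; 3]> {
    let present = [
        vocab.contains(&query[0]),
        vocab.contains(&query[1]),
        vocab.contains(&query[2]),
    ];
    if present.iter().all(|&p| p) {
        Ok(())
    } else {
        Err(present)
    }
}

/// Collect the query words whose flag is false.
fn missing_words<'a>(query: [&'a str; 3], present: [bool; 3]) -> Vec<&'a str> {
    let mut missing = Vec::new();
    for (word, ok) in query.iter().zip(present.iter()) {
        if !*ok {
            missing.push(*word);
        }
    }
    missing
}

fn main() {
    let vocab = ["Paris", "France", "Berlin"];
    let query = ["Paris", "France", "Berlim"];
    if let Err(present) = check_query(query, &vocab) {
        eprintln!("Unknown words: {}", missing_words(query, present).join(", "));
    }
}
```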
I just realized that we do not have an ff-analogy utility. This should read lines from the standard input consisting of three words separated by a space. If the words are A B C, then it should return words that qualify as D in A is to B as C is to D.
Analogy queries are already supported by the finalfusion crate. So, this utility would be largely a copy of ff-similar, but reading three words rather than one word and using the analogy method.
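The input handling described above could look roughly like this. It is only a sketch: parse_query is a hypothetical helper, the embedding lookup itself is left out, and the example feeds a fixed sample instead of the standard input so that it runs on its own.

```rust
use std::io::{self, BufRead};

/// Split a line into exactly three whitespace-separated words.
/// Returns None if the line has fewer or more than three words.
fn parse_query(line: &str) -> Option<[&str; 3]> {
    let mut parts = line.split_whitespace();
    let query = [parts.next()?, parts.next()?, parts.next()?];
    if parts.next().is_some() {
        return None;
    }
    Some(query)
}

fn main() {
    // Sample input; a real utility would iterate over io::stdin().lock().
    let input = io::Cursor::new("Paris France Berlin\ntoo few\n");
    for line in input.lines() {
        let line = line.expect("cannot read line");
        match parse_query(&line) {
            // The actual analogy query would be issued here.
            Some([a, b, c]) => println!("{} is to {} as {} is to ?", a, b, c),
            None => eprintln!("Expected three words, got: {}", line),
        }
    }
}
```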
@NianhengWu: maybe this is something that you could do besides the finalfrontier experiments? Seems like little work for a lot of gratification ;). You can get embeddings to test with at:
Continuing the discussion in #26. I think I have found a nice solution to this: with Nix I can build a single binary with all the dependencies. It is not a static binary, but the binary will actually contain a tarball of the transitive closure of dependencies. It's self-extracting and uses user namespaces to pivot root (since the libraries are not relocatable).
I think it has nice properties, such as that we have complete control over the dependencies, down to the C library. But it requires that we have a single binary rather than several binaries as we have now.
So, I fear I have to use subcommands, which you know I am not a fan of ;). I'll implement a single finalfusion command with subcommands and see if I can live with it.
Proposed approach: add an analogy_masked method to finalfusion where one can specify which words should be masked.
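One way to picture the masking, as a sketch only: the function name apply_mask and the convention that a true flag removes the corresponding query word from the results are assumptions for illustration, not the signature of the proposed method.

```rust
/// Drop candidates that equal a query word whose mask flag is true.
/// Assumed convention: mask[i] == true means query[i] must not be returned.
fn apply_mask<'a>(candidates: &[&'a str], query: [&str; 3], mask: [bool; 3]) -> Vec<&'a str> {
    candidates
        .iter()
        .copied()
        .filter(|cand| {
            !query
                .iter()
                .zip(mask.iter())
                .any(|(q, &m)| m && q == cand)
        })
        .collect()
}

fn main() {
    let candidates = ["Berlin", "Germany", "Paris"];
    let query = ["Paris", "France", "Berlin"];
    // Mask all three query words.
    println!("{:?}", apply_mask(&candidates, query, [true, true, true]));
    // Mask nothing, so query words may show up as answers.
    println!("{:?}", apply_mask(&candidates, query, [false, false, false]));
}
```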
Even though we cannot easily ship this with OpenBLAS support, this is better than not providing any binary builds at all.
We're lacking some basic documentation here. Especially after merging everything into a single binary, giving some examples might be nice.
We have a number of tools that just quietly do nothing if we forget to pass an input. Maybe printing a status message to stderr stating something like "Reading input from stdin" or "Reading input from path/to/file" would allow users to catch that.
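A small sketch of what such a message could look like. The helper name announce_input is hypothetical, and opening the input is left out.

```rust
/// Print where the input comes from to stderr and return the message.
/// `path` is None when no input file was given on the command line.
fn announce_input(path: Option<&str>) -> String {
    let message = match path {
        Some(p) => format!("Reading input from {}", p),
        None => "Reading input from stdin".to_string(),
    };
    // stderr keeps the status message out of the tool's regular output.
    eprintln!("{}", message);
    message
}

fn main() {
    announce_input(None);
    announce_input(Some("path/to/file"));
}
```

Since the message goes to stderr, it does not interfere with piping the output of one tool into another.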
What's your opinion?
Master currently doesn't build, probably because of a mismatch of toml versions (0.4 and 0.5) in finalfrontier-utils vs. finalfrontier-rust:
error[E0631]: type mismatch in function arguments
--> src/bin/ff-convert.rs:99:64
|
99 | let metadata = config.metadata_filename.map(read_metadata).map(Metadata);
| ^^^
| |
| expected signature of `fn(toml::value::Value) -> _`
| found signature of `fn(toml::value::Value) -> _`
error: aborting due to previous error
For more information about this error, try `rustc --explain E0631`.
error: Could not compile `finalfusion-utils`.
warning: build failed, waiting for other jobs to finish...
error: build failed