Comments (4)
A helpful rule of thumb is that one token generally corresponds to ~4 characters of text for common English text. This translates to roughly ¾ of a word (so 100 tokens ~= 75 words).
Source: https://beta.openai.com/tokenizer
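For anyone who wants to see what that rule of thumb gives in practice, here is a minimal sketch in Python. The function names and the sample prompt are made up for illustration, and the two constants come straight from the quote above. This is only an estimate for common English text, not a real tokenizer.

```python
import math

CHARS_PER_TOKEN = 4      # "one token generally corresponds to ~4 characters"
WORDS_PER_TOKEN = 0.75   # "100 tokens ~= 75 words"


def estimate_tokens_from_chars(text: str) -> int:
    """Rough token estimate based on character count (hypothetical helper)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_from_words(text: str) -> int:
    """Rough token estimate based on whitespace-separated words (hypothetical helper)."""
    return math.ceil(len(text.split()) / WORDS_PER_TOKEN)


prompt = "a photo of an astronaut riding a horse on mars"
print("chars-based estimate:", estimate_tokens_from_chars(prompt))
print("words-based estimate:", estimate_tokens_from_words(prompt))
```

The two estimates will usually disagree with each other, which is a good reminder that neither is the count the model actually sees.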
from mochidiffusion.
Token counts differ based on the model's vocab list. Also, after I made the initial commit for the naive token counter, @CarterLombardi improved it by calculating the real token count (commit: 034bbe7).
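To illustrate why the count depends on the vocab list, here is a toy sketch in Python. It uses a greedy longest-match splitter, which is a simplification chosen for this example and not the algorithm Mochi or any particular model uses. Both vocabularies are invented; the point is only that the same prompt yields different counts.

```python
def tokenize(text: str, vocab: set[str]) -> list[str]:
    """Greedy longest-match split of each word into pieces found in `vocab`.

    Any character not covered by a vocab entry becomes its own token.
    Toy illustration only, not a real model tokenizer.
    """
    tokens = []
    for word in text.lower().split():
        i = 0
        while i < len(word):
            # Try the longest remaining substring first, then shrink.
            for j in range(len(word), i, -1):
                if word[i:j] in vocab or j == i + 1:
                    tokens.append(word[i:j])
                    i = j
                    break
    return tokens


# Two hypothetical vocabularies, invented for this example.
VOCAB_A = {"a", "an", "astronaut", "riding", "horse"}
VOCAB_B = {"a", "an", "astro", "naut", "rid", "ing", "horse"}

prompt = "an astronaut riding a horse"
tokens_a = tokenize(prompt, VOCAB_A)
tokens_b = tokenize(prompt, VOCAB_B)
print(len(tokens_a), tokens_a)
print(len(tokens_b), tokens_b)
```

With these made-up vocabularies the same five-word prompt splits into a different number of pieces, so a counter built for one vocab will be off for a model that uses another.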
So the token ≈ word assumption isn't correct. I can see the text being truncated in the Xcode output, but words get cut partway through when the prompt goes over the "token" length. I've tried looking at webui's implementation of their token counter, but the code is messy and isn't very straightforward. I'm open to a PR, but I can't figure out exactly what a token is.
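The mid-word truncation described here is what you would expect if tokens are subword pieces rather than whole words. A small Python sketch, assuming a hypothetical split of the prompt into pieces (the pieces, the limit of 5, and the helper name are all invented for illustration; a trailing space marks a piece that ends a word):

```python
def truncate_tokens(tokens: list[str], max_tokens: int) -> tuple[list[str], list[str]]:
    """Return (kept, dropped) after cutting the token list at `max_tokens`."""
    return tokens[:max_tokens], tokens[max_tokens:]


# Hypothetical subword split of "a photo of an astronaut riding a horse".
# A real tokenizer decides these boundaries from its own vocabulary.
tokens = ["a ", "photo ", "of ", "an ", "astro", "naut ", "rid", "ing ", "a ", "horse"]

word_count = len("".join(tokens).split())
kept, dropped = truncate_tokens(tokens, max_tokens=5)

print("words:", word_count, "tokens:", len(tokens))
print("kept:   ", repr("".join(kept)))
print("dropped:", repr("".join(dropped)))
```

In this made-up split the prompt has more tokens than words, and cutting at the limit leaves half of "astronaut" behind, which matches parts of words getting truncated.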
From the same source:
"If you need a programmatic interface for tokenizing text, check out the transformers package for python or the gpt-3-encoder package for node.js."
The gpt-3-encoder package provides a Python implementation as well as a JavaScript one, both of which give me a more accurate token count than what Mochi currently provides. Unfortunately, I have never written a single line of Swift, so I have no idea how I would implement this, but I thought I'd leave the info out there.