Provides functions to get fixed width of the character or string.
runewidth.StringWidth("つのだ☆HIRO") == 12
Yasuhiro Matsumoto
under the MIT License: http://mattn.mit-license.org/2013
wcwidth for golang
License: MIT License
Which license did you adopt for this product? Thanks.
According to the Unicode® Standard Annex #11 Na stands for narrow:
ED5. East Asian Narrow (Na): All other characters that are always narrow and have explicit fullwidth or wide counterparts. These characters are implicitly narrow in East Asian typography and legacy character sets because they have explicit fullwidth or wide counterparts. All of ASCII is an example of East Asian Narrow characters.
Therefore, the characters that are currently considered to belong to the nonassigned table should have width 1, not width 0.
Two of these characters are commonly used in quantum mechanics: |α⟩⟨α|
EDIT: This issue is fixed by #44. Please merge that PR.
Variation Selectors 1-256 (Unicode ranges 0xFE00-0xFE0F and 0xE0100-0xE01EF) report as width = 1. These are nonprintable characters and should report width 0. I think it would make sense to add them to the nonprint table. I can submit a PR if that sounds good.
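The range check proposed above can be sketched with the standard library alone. This is only an illustration of the idea, not the library's table code: isVariationSelector and widthIgnoringSelectors are hypothetical names, and the helper assumes every other rune is exactly one cell wide.

```go
package main

import "fmt"

// isVariationSelector reports whether r lies in one of the two ranges
// named in the issue: U+FE00..U+FE0F or U+E0100..U+E01EF.
func isVariationSelector(r rune) bool {
	return (r >= 0xFE00 && r <= 0xFE0F) || (r >= 0xE0100 && r <= 0xE01EF)
}

// widthIgnoringSelectors is a hypothetical helper. It counts every rune as
// one cell (a simplifying assumption) except variation selectors, which it
// counts as zero cells.
func widthIgnoringSelectors(s string) int {
	width := 0
	for _, r := range s {
		if !isVariationSelector(r) {
			width++
		}
	}
	return width
}

func main() {
	// U+2764 followed by U+FE0F (variation selector 16).
	fmt.Println(widthIgnoringSelectors("\u2764\uFE0F"))
}
```

With this check in place, the selector contributes nothing to the total, which is the behaviour the issue asks for.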
It would be great if you could add support for zero-width joiners (ZWJ). I have the following code example which doesn't work as expected:
package main

import (
	"fmt"

	runewidth "github.com/mattn/go-runewidth"
)

func main() {
	e := "👨‍👨‍👧"
	r := []rune(e)
	var widths []int
	for _, c := range r {
		widths = append(widths, runewidth.RuneWidth(c))
	}
	fmt.Printf("%s : len=%d numrunes=%d width=%d widths=%v runes=%X\n", e, len(e), len(r), runewidth.StringWidth(e), widths, r)
}
The output is:
👨👨👧 : len=18 numrunes=5 width=6 widths=[2 0 2 0 2] runes=[1F468 200D 1F468 200D 1F467]
Specifically, width should be 2 instead of 6. I found this article which explains how they work. It does not only affect emojis but also characters in some languages.
This came up in rivo/tview#161. It would be great if support for ZWJ could be added so I can implement support for these Unicode characters in tview. I understand that not all kinds of combinations are supported and it's probably difficult to figure out which ones are. But assuming these characters are supported will help a lot. I don't expect users to try to print ZWJ combinations which are not supported anyway.
Thanks!
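To make the request concrete, here is a minimal sketch of the "assume the combination is supported" approach: a rune that directly follows a zero-width joiner (U+200D) is treated as part of the previous cluster and adds no width. It uses only the standard library. simpleRuneWidth is a stand-in invented for this example (it is not runewidth.RuneWidth), and zwjAwareWidth is a hypothetical name.

```go
package main

import "fmt"

const zwj = '\u200D' // ZERO WIDTH JOINER

// simpleRuneWidth is a stand-in for a real per-rune width function.
// Assumption: runes at or above U+1F000 are 2 cells, the joiner is 0,
// and everything else is 1.
func simpleRuneWidth(r rune) int {
	switch {
	case r == zwj:
		return 0
	case r >= 0x1F000:
		return 2
	default:
		return 1
	}
}

// zwjAwareWidth sums rune widths but skips any rune that directly
// follows a joiner, so a joined sequence keeps the width of its first rune.
func zwjAwareWidth(s string) int {
	width := 0
	joined := false
	for _, r := range s {
		switch {
		case r == zwj:
			joined = true
		case joined:
			joined = false // fused into the previous cluster
		default:
			width += simpleRuneWidth(r)
		}
	}
	return width
}

func main() {
	family := "\U0001F468\u200D\U0001F468\u200D\U0001F467"
	fmt.Println(zwjAwareWidth(family)) // prints 2
}
```

The sketch does not check whether a terminal or font actually renders the sequence as one glyph; it simply assumes it does, as the issue suggests.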
Check this for the definition of box-drawing (BD below) characters.
I found that these characters are defined to be of ambiguous width, so passing these to RuneWidth returns 2 in my environment. This is somewhat inconvenient since, AFAIK, terminal fonts tend to render BD characters as half-width.
Is it possible to remove these characters from the ambiguous table? I can make the PR if you think this sounds sane.
Thanks.
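Until the table question is settled, a caller could work around it by overriding the width for the Box Drawing block (U+2500 to U+257F) before consulting any other width function. The sketch below is standard-library only and rests on assumptions: eastAsianStub is an invented stand-in for a width function running in East Asian mode, and narrowBoxDrawing is a hypothetical wrapper name.

```go
package main

import "fmt"

// widthFunc is any function that maps a rune to a cell width.
type widthFunc func(rune) int

// eastAsianStub is a stand-in for a real width function. Assumption: it
// reports 2 for box-drawing runes and CJK ideographs, and 1 otherwise.
func eastAsianStub(r rune) int {
	if (r >= 0x2500 && r <= 0x257F) || (r >= 0x4E00 && r <= 0x9FFF) {
		return 2
	}
	return 1
}

// narrowBoxDrawing wraps base so that runes in the Box Drawing block
// always report width 1; every other rune is passed through unchanged.
func narrowBoxDrawing(base widthFunc) widthFunc {
	return func(r rune) int {
		if r >= 0x2500 && r <= 0x257F {
			return 1
		}
		return base(r)
	}
}

func main() {
	width := narrowBoxDrawing(eastAsianStub)
	fmt.Println(eastAsianStub('─'), width('─'), width('世')) // prints 2 1 2
}
```

A wrapper like this keeps the override in the application, so it does not depend on any change to the library's ambiguous table.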
runewidth.StringWidth("🇩🇰") returns 2.
I haven't looked into this at all, and I have no idea what I should expect, but a width of 1 seems reasonable.
I am trying to install Go Fiber 2.40.0 using Go v1.15.
I encountered an error saying something like:
github.com/mattn/[email protected]/runewidth.go:7:2: found packages uniseg(doc.go) and main (gen_breaktest.go) in ...
Has anyone else ever encountered this before?
src/github.com/mattn/go-runewidth/runewidth.go:7:2: found packages uniseg (doc.go) and main (gen_breaktest.go) in /root/ngrok/src/github.com/rivo/uniseg
make: *** [Makefile:8: deps] Error 1
c9bd7d1 and 43a826d broke benchmark tests
$ go test -bench . -benchmem
--- FAIL: BenchmarkRuneWidthAll
benchmark_test.go:27: got 1293942, want 1293932
goos: linux
goarch: amd64
pkg: github.com/mattn/go-runewidth
cpu: 11th Gen Intel(R) Core(TM) i3-1115G4 @ 3.00GHz
BenchmarkRuneWidth768-4 650364 1877 ns/op 0 B/op 0 allocs/op
--- FAIL: BenchmarkRuneWidthAllEastAsian
benchmark_test.go:27: got 1432568, want 1432558
BenchmarkRuneWidth768EastAsian-4 85194 14217 ns/op 0 B/op 0 allocs/op
--- FAIL: BenchmarkString1WidthAll
benchmark_test.go:62: got 1295990, want 1295980
BenchmarkString1Width768-4 9513 125876 ns/op 86016 B/op 3072 allocs/op
--- FAIL: BenchmarkString1WidthAllEastAsian
benchmark_test.go:62: got 1436664, want 1436654
BenchmarkString1Width768EastAsian-4 8168 142574 ns/op 86016 B/op 3072 allocs/op
BenchmarkTablePrivate-4 656 1798150 ns/op 0 B/op 0 allocs/op
BenchmarkTableNonprint-4 402 2982255 ns/op 0 B/op 0 allocs/op
BenchmarkTableCombining-4 264 4511447 ns/op 0 B/op 0 allocs/op
BenchmarkTableDoublewidth-4 222 5379437 ns/op 0 B/op 0 allocs/op
BenchmarkTableAmbiguous-4 183 6475643 ns/op 0 B/op 0 allocs/op
BenchmarkTableEmoji-4 222 5272836 ns/op 0 B/op 0 allocs/op
BenchmarkTableNarrow-4 522 2255628 ns/op 0 B/op 0 allocs/op
BenchmarkTableNeutral-4 144 8281886 ns/op 0 B/op 0 allocs/op
FAIL
exit status 1
FAIL github.com/mattn/go-runewidth 19.880s
dot := '\uF111' // a dot
println(runewidth.RuneWidth(dot))
/*
Linux(wsl):1 (correct)
Windows 11:2 (incorrect)
*/
go version: 1.20.4
I was trying to parse terminal input commands in Java, but parsing ANSI codes is difficult, so I want to use this project. However, this project is written in Go. Is there a Java version of it, or any similar third-party library?
There are 14 commits since master.
func main() {
	b := `─` // unicode 0x2500
	fmt.Println(runewidth.StringWidth(b))
}
On Windows/Mac I get: 2
On Linux I get: 1
I'm trying to port an old DOS program using tcell (which uses RuneWidth). My program has a table mapping CP437 char code to rune, and then I print that rune to the screen. I'm in the terminal with fixed width fonts, so I expect all chars to be the same width.
The issue is that RuneWidth('\u2666') and some other characters return width 2 instead of 1, which makes tcell allocate 2 chars for them and causes "gaps" in the rendering. Here's playground code showing which chars do this: https://play.golang.org/p/Hjq3GOC0Pcd -- output is:
RuneWidth('☺') = 2
RuneWidth('☻') = 2
RuneWidth('♥') = 2
RuneWidth('♦') = 2
RuneWidth('♣') = 2
RuneWidth('♠') = 2
RuneWidth('♂') = 2
RuneWidth('♀') = 2
RuneWidth('♪') = 2
RuneWidth('♫') = 2
RuneWidth('☼') = 2
RuneWidth('↕') = 2
RuneWidth('‼') = 2
RuneWidth('↔') = 2
I believe it's happening because these are treated as Emoji characters. Is this behavior expected? If so, how do I work around this in tcell?
Hi,
Currently, your go1 tag points to commit ce86f93. So when someone does go get -u github.com/mattn/go-runewidth, it will check out that revision.
However, you have newer commits on master that add Truncate and fix bugs, and these are not available.
Can you either update the go1 tag to point to the latest stable version (I'm guessing 39104c7), or simpler yet, remove it and let master be the latest go-gettable version? You can use feature branches for development and merge them into master when they're ready.
I'm guessing this was an unintended situation, but please let me know if that's not the case. Thanks.
Hello!
I maintain a golang library for drawing ASCII tables at https://github.com/jedib0t/go-pretty and this is one of the few dependencies I have, to calculate rune width for drawing the tables. Sample: https://go.dev/play/p/I6uxssyXxhN?v=goprev
Now, a couple of users reported some alignment issues, and after some investigation I figured out that the width returned for Box Drawing characters was not the expected value when LANG=zh_CN.UTF-8 or when EastAsianWidth=true is set in go-runewidth.
To replicate the bug, I created this program, say foo.go:
package main

import (
	"fmt"
	"strings"

	"github.com/mattn/go-runewidth"
)

func main() {
	boxDrawingChars := []string{
		"+", "-", "=",
		"┏", "┳", "┓",
		"┣", "╋", "┫",
		"┗", "┻", "┛",
		"━", "┃",
	}
	cellWidth := 8
	for _, boxDrawingChar := range boxDrawingChars {
		padding := strings.Repeat(" ", cellWidth-runewidth.StringWidth(boxDrawingChar))
		fmt.Printf("| %s%s |\n", boxDrawingChar, padding)
	}
}
Output:
$ LANG=en_US.UTF-8 go run foo.go
| + |
| - |
| = |
| ┏ |
| ┳ |
| ┓ |
| ┣ |
| ╋ |
| ┫ |
| ┗ |
| ┻ |
| ┛ |
| ━ |
| ┃ |
$ LANG=zh_CN.UTF-8 go run foo.go
| + |
| - |
| = |
| ┏ |
| ┳ |
| ┓ |
| ┣ |
| ╋ |
| ┫ |
| ┗ |
| ┻ |
| ┛ |
| ━ |
| ┃ |
Is this behavior right, or am I using runewidth.RuneWidth/StringWidth incorrectly?
Hi,
Consider the following three similar unicode characters:
'-' - Unicode Character 'HYPHEN-MINUS' (U+002D)
'–' - Unicode Character 'EN DASH' (U+2013)
'—' - Unicode Character 'EM DASH' (U+2014)
From shurcooL/markdownfmt#7 (comment), I've learned that go-runewidth considers the width of the first character to be 1, and the width of the second and third characters to be 2.
Is that intended?
I'm not sure how to test this reliably, but in most environments it seems that EN DASH has width that's closer to 1 than 2.
Any thoughts on this?
It appears that StringWidth reports the length of certain runes incorrectly. The problem seems to be centered around languages used primarily in India (Tamil, Telugu, and Hindi are examples).
Sample program that shows the problem:
package main

import (
	"fmt"
	"strings"

	"github.com/mattn/go-runewidth"
)

func main() {
	words := []string{
		"English",
		"हिन्द",
		"தமிழ்",
		"ไทย",
		"עברית",
	}
	for _, w := range words {
		max := 12 - runewidth.StringWidth(w)
		fmt.Printf("|%s%s|\n", w, strings.Repeat(" ", max))
	}
}
The output shows the misalignment in the 2nd and 3rd rows (sorry, but pasting here won't work since GitHub seems to force "Liberation Mono" as the monospace font and this font appears to have its own issues). I've tried this on terminals, browsers, etc., always with similar results.
I'm using the package github.com/jhillyerd/go.enmime from an AppEngine classic project where the syscall package is not available.
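A quick way to see why these scripts are hard to measure is to count how many runes in each word are combining marks, since a per-rune width sum has to decide what each mark contributes. The sketch below uses only the standard library's unicode.IsMark; splitMarks is a hypothetical helper, and it is a diagnostic aid rather than a width algorithm.

```go
package main

import (
	"fmt"
	"unicode"
)

// splitMarks counts how many runes in s are combining marks
// (Unicode general category M) and how many are not.
func splitMarks(s string) (base, marks int) {
	for _, r := range s {
		if unicode.IsMark(r) {
			marks++
		} else {
			base++
		}
	}
	return base, marks
}

func main() {
	for _, w := range []string{"English", "हिन्द", "தமிழ்"} {
		base, marks := splitMarks(w)
		fmt.Printf("%s: base=%d marks=%d\n", w, base, marks)
	}
}
```

Words with a high share of marks are the ones where the rendered width depends most on how the terminal and font shape the text.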
The go.enmime in turn imports this package github.com/mattn/go-runewidth.
Unfortunately running the project from dev_appserver.py on windows results in:
go-app-builder: Failed parsing input: parser: bad import "syscall" in github.com\mattn\go-runewidth\runewidth.go from GOPATH
I had to change the file runewidth_windows.go to following to make the project build:
package runewidth

import (
	//"syscall"
)

var (
	//kernel32 = syscall.NewLazyDLL("kernel32")
	//procGetConsoleOutputCP = kernel32.NewProc("GetConsoleOutputCP")
)

// IsEastAsian returns true if the current locale is CJK.
// With the syscall lookup commented out, it always returns false.
func IsEastAsian() bool {
	return false
}
Hello,
Runewidth of '…' is not equal to actual width on terminal.
Is this expected?
I stumbled over a character that, when output to the console directly, takes up two characters. But StringWidth() gives me 1. This is because the first rune of this character has a width of 1 and that's what's being used, see here. I know I wrote this code and I'm sure that you cannot simply add up the widths of individual runes ("🏳️🌈" would then have a width of 4 which is obviously wrong) and using the first rune's width worked fine so far. But it turns out that it fails in some cases.
I'm not familiar with Indian characters but it seems to me that the second rune is a modifier that turns the character from a width of 1 into a width of 2. Are you aware of any logic that we could add to go-runewidth that makes this right?
Here's example code that illustrates the issue:
package main

import (
	"fmt"

	runewidth "github.com/mattn/go-runewidth"
)

func main() {
	s := "खा"
	fmt.Println("0123456789")
	fmt.Println(s + "<")
	fmt.Printf("String width: %d\n", runewidth.StringWidth(s))
	var i int
	for _, r := range s {
		fmt.Printf("Rune %s (%d) width: %d\n", string(r), i, runewidth.RuneWidth(r))
		i++
	}
}
Output (on macOS with iTerm2):
I stumbled over this while working on #47.
It seems that RuneWidth is not always equal to the StringWidth of a single rune.
This is quite unexpected, TBH.
Please see markus-oberhumer-forks@5da511d for a test case.
This is a question about how you are defining "width". I'm mostly looking for a solution that gives me character width in monospaced fonts. So for the examples in #39 and #36, the "width" would still be 2: although a flag is considered 1 character in modern renderers, it still takes up the space of 2 normal characters.
When using an East Asian encoding, the following runes are given a width of 2 but they should be 1: ─┌└┐┘│.
To reproduce:
export LC_CTYPE="ja_JP.UTF-8"
(in go program)
runewidth.RuneWidth('─') // returns 2
Looking at the runewidth_table.go file, the culprit is {0x24EB, 0x254B} in the ambiguous table. I'm not sure how to update this; the file is auto-generated.
In terminal apps which render box characters this can lead to broken rendering:
Let me know if there's anything else I can add. Thanks :)
go get github.com/brandleesee/TerminalStocks
# github.com/mattn/go-runewidth
../../mattn/go-runewidth/runewidth.go:823: function ends without a return statement
Here's a short example that illustrates an issue with flags (or "regional indicators"):
fmt.Println(runewidth.StringWidth("🇩🇪")) // Should be "2", outputs "4".
The flag consists of two code points which are processed separately by runewidth. But most modern systems will combine them into one flag emoji.
This is part of a larger topic which I describe in more detail here: gdamore/tcell#264. It doesn't just affect flags but also characters in e.g. Arabic and Korean where there are more sophisticated rules than "combining characters" and zero-width joiners (which you added with #20).
I don't know exactly how you calculate the widths of characters. I'm also not sure how you would solve flags as well as some of the other rules described in the Unicode specification but it would sure be nice as printing these flags currently gives me trouble in tview. There have been multiple issues asking for better support for different languages and emojis so it seems that there are quite a few people who use the terminal with these characters.
(Maybe my new package uniseg can help you here.)
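For the flag case specifically, the pairing rule can be sketched in a few lines: regional indicator symbols occupy U+1F1E6 through U+1F1FF, and two in a row form one flag. The code below is a standard-library-only illustration, not how go-runewidth or uniseg work internally. flagAwareWidth is a hypothetical helper, and it assumes every non-indicator rune is one cell wide and that an unpaired indicator is also drawn two cells wide.

```go
package main

import "fmt"

// isRegionalIndicator reports whether r is one of the 26 regional
// indicator symbols, U+1F1E6..U+1F1FF.
func isRegionalIndicator(r rune) bool {
	return r >= 0x1F1E6 && r <= 0x1F1FF
}

// flagAwareWidth counts each pair of consecutive regional indicators as
// one flag of width 2. Assumptions: an unpaired indicator is also 2 cells,
// and every other rune is 1 cell.
func flagAwareWidth(s string) int {
	runes := []rune(s)
	width := 0
	for i := 0; i < len(runes); i++ {
		if isRegionalIndicator(runes[i]) {
			if i+1 < len(runes) && isRegionalIndicator(runes[i+1]) {
				i++ // consume the second half of the flag
			}
			width += 2
			continue
		}
		width++
	}
	return width
}

func main() {
	// U+1F1E9 U+1F1EA is the indicator pair "DE".
	fmt.Println(flagAwareWidth("\U0001F1E9\U0001F1EA")) // prints 2
}
```

Pairing from the left like this matches how a run of four indicators would be read as two flags, but it does not cover the other grapheme cluster rules mentioned above.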
ZeroWidthJoiner was removed after v0.0.9: https://github.com/mattn/go-runewidth/blob/v0.0.9/runewidth.go#L14
The next version was v0.0.10, but this introduced a breaking API change.
While being v0 means you can introduce breaking API changes, would it be possible to get a v1 release that can ensure API stability?
It's fine to just keep cutting new versions when API changes happen, but right now it makes managing Go Module dependencies rather painful, since it just assumes patch versions don't introduce breaking changes.
bash-3.2$ go get -u -d github.com/coreos/etcd/...
# cd .; git clone https://github.com/mattn/go-runewidth /Users/admin/go/src/github.com/mattn/go-runewidth
fatal: could not create work tree dir '/Users/admin/go/src/github.com/mattn/go-runewidth': Permission denied
package github.com/mattn/go-runewidth: exit status 128
# go build
/go/pkg/mod/github.com/mattn/[email protected]/runewidth.go:7:2: //go:build comment without // +build comment
Currently on my Linux machine it's 0, and in the terminal it's 8, but in most IDEs it's customizable.
I don't know if there are other characters like this. Should I just define its width myself?
Hi,
Updating go-runewidth from v0.0.4 to v0.0.5 breaks my tests in https://github.com/MichaelMure/go-term-text. go-term-text is a package doing text formatting for the terminal, relying on go-runewidth to get the character width.
Here is an example of before/after:
Notice that after switching to 0.0.5, the text goes further than it should. As the algorithm remains unchanged, I suspect go-runewidth returns a different length. Would that be possible? If so, why?