Code Monkey home page Code Monkey logo

Comments (6)

josegonzalez avatar josegonzalez commented on July 4, 2024 1

@allanlaal mind making a pr to change to the correct one?

from dokku-mariadb.

josegonzalez avatar josegonzalez commented on July 4, 2024 1

I think you did it to your repo and not this one :D

from dokku-mariadb.

allanlaal avatar allanlaal commented on July 4, 2024 1

correct PR done. thats what happens if I dont get my mornin' coffee =)

from dokku-mariadb.

josegonzalez avatar josegonzalez commented on July 4, 2024

Pull requests welcome.

from dokku-mariadb.

allanlaal avatar allanlaal commented on July 4, 2024

PR #ab67e88 used utf8mb3 instead of utf8mb4, which does not cover the entire UTF8 range.

TLDR: utf8mb4 covers 100% of Unicode characters (including emojis, chinese, icons etc)

UTF-8 is a variable-length encoding. In the case of UTF-8, this means that storing one code point requires one to four bytes. However, MySQL's encoding called "utf8" (alias of "utf8mb3") only stores a maximum of three bytes per code point.

So the character set "utf8"/"utf8mb3" cannot store all Unicode code points: it only supports the range 0x000 to 0xFFFF, which is called the "Basic Multilingual Plane".
See also Comparison of Unicode encodings.

This is what (a previous version of the same page at) the MySQL documentation has to say about it:

The character set named utf8[/utf8mb3] uses a maximum of three bytes per character and contains only BMP characters. As of MySQL 5.5.3, the utf8mb4 character set uses a maximum of four bytes per character supports supplemental characters:

  • For a BMP character, utf8[/utf8mb3] and utf8mb4 have identical storage characteristics: same code values, same encoding, same length.

  • For a supplementary character, utf8[/utf8mb3] cannot store the character at all, while utf8mb4 requires four bytes to store it. Since utf8[/utf8mb3] cannot store the character at all, you do not have any supplementary characters in utf8[/utf8mb3] columns and you need not worry about converting characters or losing data when upgrading utf8[/utf8mb3] data from older versions of MySQL.

So if you want your column to support storing characters lying outside the BMP (and you usually want to), such as emoji, use "utf8mb4". See also https://stackoverflow.com/questions/5567249/what-are-the-most-common-non-bmp-unicode-characters-in-actual-use.

source: https://stackoverflow.com/a/30074553/934511

from dokku-mariadb.

allanlaal avatar allanlaal commented on July 4, 2024

PR done :)

from dokku-mariadb.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.