carpedm20 / emoji


1.8K 26.0 271.0 14.07 MB

emoji terminal output for Python

License: Other

Python 99.46% CSS 0.16% JavaScript 0.30% HTML 0.08%

python emoji

emoji's Introduction

Emoji

Emoji for Python. This project was inspired by kyokomi.

Example

The entire set of emoji codes defined by the Unicode Consortium is supported, along with a number of aliases. By default only the official list is enabled; passing language='alias' to emoji.emojize() enables both the full list and the aliases.

>>> import emoji
>>> print(emoji.emojize('Python is :thumbs_up:'))
Python is 👍
>>> print(emoji.emojize('Python is :thumbsup:', language='alias'))
Python is 👍
>>> print(emoji.demojize('Python is 👍'))
Python is :thumbs_up:
>>> print(emoji.emojize("Python is fun :red_heart:"))
Python is fun ❤
>>> print(emoji.emojize("Python is fun :red_heart:", variant="emoji_type"))
Python is fun ❤️ #red heart, not black heart
>>> print(emoji.is_emoji("👍"))
True

The default language is English (language='en'). The other supported languages are:

Spanish ('es')
Portuguese ('pt')
Italian ('it')
French ('fr')
German ('de')
Farsi/Persian ('fa')
Indonesian ('id')
Simplified Chinese ('zh')
Japanese ('ja')
Korean ('ko')
Russian ('ru')
Arabic ('ar')
Turkish ('tr')

>>> print(emoji.emojize('Python es :pulgar_hacia_arriba:', language='es'))
Python es 👍
>>> print(emoji.demojize('Python es 👍', language='es'))
Python es :pulgar_hacia_arriba:
>>> print(emoji.emojize("Python é :polegar_para_cima:", language='pt'))
Python é 👍
>>> print(emoji.demojize("Python é 👍", language='pt'))
Python é :polegar_para_cima:

Installation

Via pip:

$ python -m pip install emoji --upgrade

From master branch:

$ git clone https://github.com/carpedm20/emoji.git
$ cd emoji
$ python -m pip install .

Developing

$ git clone https://github.com/carpedm20/emoji.git
$ cd emoji
$ python -m pip install -e .\[dev\]
$ pytest
$ coverage run -m pytest
$ coverage report

The script utils/get_codes_from_unicode_emoji_data_files.py generates unicode_codes/data_dict.py. Generally speaking, it scrapes a table on the Unicode Consortium's website with BeautifulSoup and prints the contents to stdout as a Python dictionary. For more information, see utils/README.md.

Links

Documentation

https://carpedm20.github.io/emoji/docs/

Overview of all emoji:

https://carpedm20.github.io/emoji/

(auto-generated list of the emoji that are supported by the current version of this package)

For English:

Emoji Cheat Sheet

Official Unicode list

For Spanish:

For Portuguese:

For Italian:

For French:

For German:

Authors

Taehoon Kim / @carpedm20

Kevin Wurster / @geowurster

Maintainer

Tahir Jalilov / @TahirJalilov


emoji's Issues

Some alias typos

Some aliases, like :upside-down_face:, should be :upside_down_face: (hyphen turned to underscore). I observed this bug when exporting Slack messages and emojifying them.

The following hack fixed it for now, though it may override some legit aliases:

def fix_emoji():
    """Fix emoji's aliases as they have some typos."""
    from emoji import unicode_codes
    for key, val in list(unicode_codes.EMOJI_UNICODE.items()):
        unicode_codes.EMOJI_UNICODE[key.replace('-', '_')] = val
    for key, val in list(unicode_codes.EMOJI_ALIAS_UNICODE.items()):
        unicode_codes.EMOJI_ALIAS_UNICODE[key.replace('-', '_')] = val

    unicode_codes.UNICODE_EMOJI = {v: k for k, v in unicode_codes.EMOJI_UNICODE.items()}
    unicode_codes.UNICODE_EMOJI_ALIAS = {v: k for k, v in unicode_codes.EMOJI_ALIAS_UNICODE.items()}


fix_emoji()

Add note to readme about current utils being for development purposes only

Blocked until #19 is merged

emoji in table

Hi,

Thanks for your repo. I need to use it with terminaltables

As you can see in the images below, the emoji break the table layout, even when I add more space. Any help?

Markdown

I am sorry, but I have a basic question. How do I use it on a website? Is it possible to use it with the Markdown2 Python module?

National flag emojis should not contain space character

All the national flag emoji character combinations in your regexp and lookup tables contain a space between the two "regional indicator" letters. This means they won't actually match the national flag sequences:

In [7]: emoji.get_emoji_regexp().match("🇨🇦")

In [8]: emoji.get_emoji_regexp().match("🇨 🇦")
Out[8]: <_sre.SRE_Match object; span=(0, 3), match='🇨 🇦'>

outdated emoji list

I wanted to print the :first_place_medal: ( 🥇 ) emoji in my own Telegram bot, but emojis from Unicode v6+ are not supported. Any help?

Unable to detect flag emojis

import emoji

def emoji_lis(string):
    _entities = []
    for pos, c in enumerate(string):
        if c in emoji.UNICODE_EMOJI:
            print("Matched!!", c, c.encode('ascii', "backslashreplace"))
            _entities.append({
                "location": pos,
                "emoji": c
            })
        else:
            print(c, c.encode('ascii', "backslashreplace"))
    return _entities

#emoji_lis("مدیحہ🇵🇰")
emoji_lis("🇵🇰 👧🏿")

Output:
[{u'emoji': u'\U0001f467', u'location': 3},
{u'emoji': u'\U0001f3ff', u'location': 4}]

convert back to '\uxxx' code

import io,sys
import emoji

sys.stdout = io.TextIOWrapper(sys.stdout.buffer,encoding='utf-8')
s = 'hello\u2665'
emojiCode = emoji.demojize(s)
print(emojiCode) #hello:black_heart_suit: 
print(emoji.emojize(emojiCode)) #hello鈾?

How can I make the '\u2665' code just convert back to '♥'?

Docstring formatting

@carpedm20 do you have a preference for docstring formatting - looks like you're using RST? Can you link me to a guide?

generation of EMOJI_ALIAS_UNICODE

I guess it's more a question than an issue, how do you generate the mapping in EMOJI_ALIAS_UNICODE in here

Does this come from github/gemoji?

Asking to make progress on that issue: hadley/emo#21

The dataset in unicode_codes.py contains helm_symbol, which is not in the Unicode standard

In an attempt to answer #18, I compared the contents in the EMOJI_UNICODE dict with that in http://www.unicode.org/emoji/charts/full-emoji-list.html (by using utils/get-codes-from-unicode-consortium.py). It seems that all characters are present, but there is one extra character: :helm_symbol:. More info about that character here: http://www.fileformat.info/info/unicode/char/2388/index.htm

Why is U+2388 included as an emoji? Perhaps that character should be in a separate dict of characters that are often treated as emojis?

Release v0.3.5

@carpedm20 I think this packaging issue might be affecting a lot of people so unless you have anything else to add I vote we do this as a patch and then do #14 (and anything else that pops up before it is implemented) in the next release.

Update changelog
Pull onto v0.3-maint branch
Tag a release in GitHub
Push to PyPi

regexp does not detect some emoji

Python 2.7.9 (default, Mar  1 2015, 12:57:24)
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from emoji import get_emoji_regexp
>>> s='test sting 👍'
>>> RX = get_emoji_regexp()
>>> res = RX.match(s)
>>> print res
None

also 🚀 and 🔥 📧 and others
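
One possible explanation for the report above, offered as a guess rather than a confirmed diagnosis: match() on a compiled pattern only succeeds when the match starts at index 0, while search() scans the whole string. The sketch below uses a small hypothetical pattern (EMOJI_RX is an assumption, not the library's regexp) and Python 3 to show the difference.

```python
import re

# Hypothetical stand-in for the library's compiled pattern: it only knows
# three emoji (thumbs up, rocket, fire), written as escapes.
EMOJI_RX = re.compile("[\U0001F44D\U0001F680\U0001F525]")

s = "test string \U0001F44D"

# match() is anchored at index 0, where the text is "t", so it finds nothing.
at_start = EMOJI_RX.match(s)

# search() scans forward and finds the emoji at the end of the string.
anywhere = EMOJI_RX.search(s)

print(at_start)
print(anywhere.group(), anywhere.start())
```

If this is the cause, calling search() or findall() instead of match() would be the fix on the caller's side.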

Use https://github.com/iamcal/emoji-data as source for short names

I think we should use https://github.com/iamcal/emoji-data as the source of the emoji data. This will add support for several older code-points (pre the now-standard Unicode format), and it will also make sure that the shortnames (:smile:) are the same as what other vendors use.

It will also add support for :skin-tone-2: and so on, which are now incorrectly called :emoji_modifier_fitzpatrick_type-1-2:.

If you think this is a good idea, I'd be happy to submit a PR trying it out. I'm thinking of adding iamcal/emoji-data as a submodule, and writing a script to generate core_data.py from it.

Upgrade fails on mac, python 3.4.2

Executed command: pip install -U emoji
Error occurred: UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 738: ordinal not in range(128)

I ran the update from the PyCharm IDE. Maybe an invisible character is involved, like this issue had? But it could just be PyCharm related instead.

Collecting emoji
  Using cached emoji-0.2.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 20, in <module>
      File "/private/var/folders/zh/ntn59b4954l6tldmt4hn8bgc0000gn/T/pycharm-packaging1.tmp/emoji/setup.py", line 17, in <module>
        readme_content = f.read().strip()
      File "/Users/luckydonald/virtualenv3.4.3/bin/../lib/python3.4/encodings/ascii.py", line 26, in decode
        return codecs.ascii_decode(input, self.errors)[0]
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 738: ordinal not in range(128)

    ----------------------------------------

    Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/zh/ntn59b4954l6tldmt4hn8bgc0000gn/T/pycharm-packaging1.tmp/emoji
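
The traceback above shows setup.py reading the README through the ascii codec and failing on byte 0xf0, which is the lead byte of a four-byte UTF-8 sequence. A minimal sketch of the usual workaround follows, assuming the README is UTF-8 encoded; the temporary file and its contents are hypothetical and are not the project's actual setup.py.

```python
import io
import os
import tempfile

# Hypothetical README containing one emoji (U+1F44D), written as UTF-8.
path = os.path.join(tempfile.mkdtemp(), "README.rst")
with io.open(path, "w", encoding="utf-8") as f:
    f.write(u"Python is \U0001F44D\n")

# Reading with the ascii codec reproduces the failure from the traceback.
try:
    with io.open(path, encoding="ascii") as f:
        f.read()
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Passing the encoding explicitly does not depend on the locale default.
with io.open(path, encoding="utf-8") as f:
    readme_content = f.read().strip()

print(ascii_failed, readme_content)
```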

Re-implement emoji.decode() in a more useful way

See second half of discussion in #10.

Add emoji "annotations"

The annotations on the page you scrape would be useful for a program that wants to classify emoji, as well as for bots that might want to, for example, choose a random "grin" face to spice up their text.

demojize() not correctly decoding flag emojis

Noticed this problem while cleaning a twitter corpus:

@MENTION: I've been wanting to post this since I saw the HT. :smiling_face_with_open_mouth_&_smiling_eyes: off to work I go. Good Morning 🇺🇸 Good Night 🇵🇭…

My code is calling emoji.demojize(text) with a data['text'] directly from Twython. Not sure what's going on here -- am I doing something wrong?

Thanks!

Add missing GitHub releases

The latest (and only) GitHub release is 0.3.4, while PyPI is on 0.4.5.

Work backwards: \U0001f382 to :birthday:

Is it possible for this library to work backwards? Example:

u"\U0001F382" would change to :birthday:

Give user a better option to enable emoji aliases

GitHub and others support a bunch of emoji aliases that are not officially part of the Unicode set. Many of the aliases are easier to remember than their official counterparts, but they are only supported by some platforms. They should really be considered specific to this library: different platforms might have different aliases that point to different codes, so this library can only support one set of aliases.

Replace is_alias in emojize() with use_aliases to toggle character sets.

Symbols not render

example :one: 1️⃣

Getting black and white Emojis

Here is my code.
import emoji

print(emoji.emojize("Hello 🌎", use_aliases=True))

Here is the output.

I tried other emojis too, but still got no colors on either Windows or Linux.

:gift: emoji only prints :gift:

As the title says, :gift: doesn't print 🎁.

snowflake and other emojis not outputting correctly

Snowflake, high voltage, and other emojis are outputting incorrectly:

MyPy typeshed stub file

I've created a library stub for the mypy static type checker. Here is the pull request for it: python/typeshed#1506
Can you please comment on this pull request on whether it could be merged from your point of view as the author and maintainer of the library?

emoji unicode set

does this package contain all of emoji-data v1.0

:hash: was printing only :hash: not emoji

I tried these methods:
emoji.emojize("Настройки #️⃣", use_aliases=True)
emoji.emojize("Настройки #️⃣")
In each case the result is the same.

add use_aliases parameter to demojize

as aliases are usually shorter, in many cases they are preferred for emojize/demojize

Release update to PyPI

Especially this commit b93de0d that's from June 2017 would be really helpful to have on the package index.

Release v0.3.4

Merge #7.
Close out https://github.com/carpedm20/emoji/milestones/v0.3.4.
Update changelog with mention of aliases and restoration of default functionality.
Make sure aliases are properly documented after the recent changes.
Create a v0.3-maint branch as a safety measure and keep it in git.
Tag a release in GitHub.
Upload to PyPi.

square output?

Emoji fail to show in my terminal (UTF-8). Also, is :smile: not supported 😢?

>>> import emoji
>>> print(emoji.emojize('Python is :thumbs_up_sign:'))
Python is 👍
>>> print(emoji.emojize('Python is :thumbsup:', use_aliases=True))
Python is 👍
>>> print(emoji.emojize('Python is :smile:'))
Python is :smile:

emoji.decode() is fundamentally broken, not needed, and should be removed.

@carpedm20 having emoji.EMOJI_UNICODE gives us an easy way to look up unicode codes by emoji and emoji.UNICODE_EMOJI gives us an easy way to look up emoji names by unicode codes, but multiple aliases point to the same unicode code so we can't do reverse alias lookups. See the sample code below. About 400 aliases are dropped.

emoji.decode() really isn't that useful and I vote we just remove it. Any objections?

>>> import emoji
>>> len(emoji.EMOJI_UNICODE)
1282
>>> len(emoji.UNICODE_EMOJI)
1282
>>> len(emoji.EMOJI_ALIAS_UNICODE)
1694
>>> len(emoji.UNICODE_EMOJI_ALIAS)
1279
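
The length mismatch above follows from inverting a dictionary whose values are not unique. Here is a minimal sketch with a three-entry hypothetical alias table (the names mirror the library's dictionaries, but the contents are made up for illustration), along with one way to invert without losing aliases.

```python
from collections import defaultdict

# Hypothetical miniature alias table: two aliases share one code point.
EMOJI_ALIAS_UNICODE = {
    ":thumbsup:": "\U0001F44D",
    ":+1:": "\U0001F44D",
    ":bee:": "\U0001F41D",
}

# Naive inversion: a later alias overwrites an earlier one with the same code.
UNICODE_EMOJI_ALIAS = {code: alias for alias, code in EMOJI_ALIAS_UNICODE.items()}

# Lossless inversion: keep every alias in a list per code point.
aliases_by_code = defaultdict(list)
for alias, code in EMOJI_ALIAS_UNICODE.items():
    aliases_by_code[code].append(alias)

print(len(EMOJI_ALIAS_UNICODE), len(UNICODE_EMOJI_ALIAS))
print(dict(aliases_by_code))
```

The list-valued mapping keeps the reverse lookup well defined, at the cost of the caller having to pick which alias to display.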

Regexp is slow -- could use codepoint ranges for consecutive emojis

Currently the regexp is quite slow, likely due to it containing a separate union term for each emoji. A lot of the emojis are consecutive codepoints, and some like the national flag emojis could be specified concisely as [\U0001F1E6-\U0001F1FF]{2}
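
The range idea above can be sketched with the standard re module in Python 3. The two patterns below are illustrative assumptions and not the library's regexp: FLAG_RX accepts any two regional indicator symbols (including pairs that are not real flags), and EMOTICON_RX covers only the Emoticons block.

```python
import re

# Two consecutive regional indicator symbols (U+1F1E6..U+1F1FF) form a flag.
FLAG_RX = re.compile("[\U0001F1E6-\U0001F1FF]{2}")

# One character class range instead of one union term per emoji;
# U+1F600..U+1F64F is the Emoticons block.
EMOTICON_RX = re.compile("[\U0001F600-\U0001F64F]")

canada = "\U0001F1E8\U0001F1E6"          # regional indicators C + A
with_space = "\U0001F1E8 \U0001F1E6"     # same letters separated by a space

print(bool(FLAG_RX.fullmatch(canada)))
print(bool(FLAG_RX.search(with_space)))
print(EMOTICON_RX.search("hi \U0001F602").start())
```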

Example code in README.rst is faulty

I tried the example and only the aliases are converted on my system.

>>> print(emoji.emojize('Python is :thumbsup: :thumbs_up_sign:', use_aliases=True))

This results in:
Python is 👍 :thumbs_up_sign:

I have a Windows machine, could this be the reason why it's not fully working?
I am using the pypi version 0.4.5

Some flags are absent

I didn't find these flags ':flag_for_Northern_Ireland:', ':flag_for_Wales:' and ':flag_for_England:'.
Is it possible to add them?

help

i keep getting Invalid Syntax 'import emoji'

Ignore Fitzpatrick Modifiers and other Flags

I want to extract all emojis from a long list of strings and then count the number of occurrences of each emoji. The emoji module does a great job, but it extracts emojis together with their flags (which is what would normally be expected). Thus, distinct = set(list(emojis)) treats the same emoji with different flags/colors as different emojis. How can I ignore these modifiers?

This is my code:

def extract_emojis(str):
  return list(c for c in str if c in emoji.UNICODE_EMOJI)

I already tried:

def extract_emojis(str):
  return list(emoji.emojize(c,use_aliases=True) for c in str if c in emoji.UNICODE_EMOJI)

but it does not work. For example, I get:

❤,8654
😍,4774
🏻,3603
🏼,2839
✨,2696
☀,2439
😂,1904
😊,1862
👌,1690
💕,1677
🎄,1587
😎,1559
🏛,1459
✌,1434

The numbers are the occurrence counts. In this case, I believe the first 4 emojis all refer to variants of the "heart" emoji.
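
One way to approach the question above, sketched under assumptions: the 🏻 and 🏼 entries in the listing are Fitzpatrick skin-tone modifiers (U+1F3FB..U+1F3FF), which iterate as separate characters. Skipping them, along with variation selector U+FE0F, before counting folds the variants together. The is_emoji argument and the KNOWN set are hypothetical stand-ins for the library's own lookup.

```python
from collections import Counter

# Fitzpatrick skin-tone modifiers occupy U+1F3FB..U+1F3FF.
SKIN_TONE_MODIFIERS = {chr(cp) for cp in range(0x1F3FB, 0x1F400)}
VARIATION_SELECTOR_16 = "\ufe0f"

def count_base_emoji(strings, is_emoji):
    """Count emoji characters, ignoring skin-tone modifiers and U+FE0F."""
    counts = Counter()
    for text in strings:
        for ch in text:
            if ch in SKIN_TONE_MODIFIERS or ch == VARIATION_SELECTOR_16:
                continue
            if is_emoji(ch):
                counts[ch] += 1
    return counts

# Hypothetical stand-in for the library's emoji lookup table.
KNOWN = {"\U0001F44D", "\u2764", "\U0001F60D"}

texts = [
    "\U0001F44D\U0001F3FB \U0001F44D\U0001F3FF \u2764\ufe0f",  # two toned thumbs, a heart
    "\U0001F60D \U0001F44D",                                    # heart eyes, plain thumb
]
counts = count_base_emoji(texts, KNOWN.__contains__)
print(counts.most_common())
```

This works per character, so multi-character sequences such as flags or ZWJ sequences would need separate handling.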

get nothing or exception

Hi!

I just tried this plugin, and all I get is either nothing or an exception:

Traceback (most recent call last):
  File "C:\add\emoji\test.py", line 4, in <module>
    print(emoji.emojize('Water! :water_wave:'))
  File "C:\Users\math\python\lib\encodings\cp850.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 7-8: character maps to <undefined>

Does anyone have an idea where it could be coming from?

I'm using Python 2 on Windows 8.

Mathieu
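
The traceback in this report ends in the cp850 codec, which suggests the console encoding cannot represent the emoji. A minimal Python 3 sketch of one workaround follows: escape whatever the target encoding cannot encode instead of raising. safe_for_console is a hypothetical helper name, not part of the library.

```python
def safe_for_console(text, encoding):
    """Return text with characters the encoding cannot represent escaped."""
    # backslashreplace turns unencodable characters into \uXXXX or \UXXXXXXXX
    # escapes; decoding again yields a str that is safe to print.
    return text.encode(encoding, errors="backslashreplace").decode(encoding)

message = "Water! \U0001F30A"

# cp850 has no emoji, so the wave is escaped rather than raising an error.
print(safe_for_console(message, "cp850"))

# A UTF-8 console can represent everything, so the text is unchanged.
print(safe_for_console(message, "utf-8") == message)
```

In practice the encoding argument would typically come from sys.stdout.encoding; switching the console to a UTF-8 code page is the other common route.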

Creating a new module which returns the location of the emoji and emoji dictionaries.

Hi, I just created a new enhancement for this code. It allows the user to detect the index of each emoji so they can track or change it. They can even analyze where emoji are used in most sentences.

cannot display emoji on linux

It just displays the Unicode escape, like u5408,
but some emoji work.

Duplicate aliases

@carpedm20 the following aliases appear multiple times in emoji.EMOJI_ALIAS_UNICODE, which means that only the last one encountered is kept in the dictionary and the rest are thrown away. Could you pick one for each and update the dictionary?

{
    ':bee:': u'\U0001F41D',
    ':bee:': u'\U0001F41D',
    ':satellite:': u'\U0001F4E1',
    ':satellite:': u'\U0001F6F0',
    ':snowman:': u'\U00002603',
    ':snowman:': u'\U000026C4',
    ':umbrella:': u'\U00002602',
    ':umbrella:': u'\U00002614'
}
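
A short sketch of how such repeats could be detected before they are silently collapsed. The pairs are the ones from the report above, stored as a list because a dict literal keeps only the last value for a repeated key; ALIAS_PAIRS and the helper variables are illustrative names.

```python
from collections import defaultdict

# Alias/code pairs from the report, kept as a list so repeats survive.
ALIAS_PAIRS = [
    (":bee:", "\U0001F41D"),
    (":bee:", "\U0001F41D"),
    (":satellite:", "\U0001F4E1"),
    (":satellite:", "\U0001F6F0"),
    (":snowman:", "\u2603"),
    (":snowman:", "\u26C4"),
    (":umbrella:", "\u2602"),
    (":umbrella:", "\u2614"),
]

# Collect the distinct code points seen for each alias.
codes_by_alias = defaultdict(set)
for alias, code in ALIAS_PAIRS:
    codes_by_alias[alias].add(code)

# Aliases bound to more than one distinct code point are real conflicts;
# :bee: is only a harmless repeat of the same pair.
conflicts = sorted(a for a, codes in codes_by_alias.items() if len(codes) > 1)

# Building a dict from the pairs keeps only the last code per alias.
as_dict = dict(ALIAS_PAIRS)

print(conflicts)
print(len(as_dict))
```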

package name conflict on PyPI

This package installs into a top-level emoji. This conflicts with the top-level of an older project named django-emoji.

It is impossible to use both packages within a single project.

To import from this package, you need to import from emoji.

This conflicts with an older project called django-emoji. To import from django-emoji you also need to import from emoji.

pip does not have the tooling to rename packages on install.

PyPI has a policy of unique package names, which this project violates: http://legacy.python.org/dev/peps/pep-0423/ "make sure your project name is unique, i.e. avoid duplicates:"

I'm getting many question marks

I think that some of the emojis I try to convert using the emoji library are being converted to question marks.

Could this be caused by the new emojis that come with iOS 10?

Windows Version

I tried to install this on Python for Windows, but nothing works. Do you have a tutorial or other code?

Not able to run on my terminal!

Does it support the Windows command prompt?
If yes, what should I set as the output encoding?
I am getting the following error when I run print emojize("emoji is 👍 "):
UnicodeEncodeError: 'charmap' codec can't encode characters in position 9-10: character maps to <undefined>

GitHub is parsing :thumbsup: here, so I am adding a clip too.

emoji not included in the dictionary

The emoji listed below are not included in the current unicode_codes.py :

🤬 \U0001F92C
🤭 \U0001F92D

the emoji didn't work....

As you can see, I followed the example, but it didn't work:

# -*- coding: <encoding unicode > -*-

import emoji
print(emoji.emojize('Python is :thumbs_up_sign:'))

The result is:

Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:38:22) [MSC v.1600 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> ================================ RESTART ================================
>>> 
Python is :thumbs_up_sign:
>>>

:one: alias doesn't work

emojize(":one:") doesn't print the emoji. I found that the correct alias is ":keycap_digit_one:". By the way, the emoji that appears is different from the Apple one. I found that the Apple emoji is composed of two characters: the first is the number and the second is chr(8419). So for me the best solution was to replace emojize(":one:") with "1"+chr(8419). This also works for other numbers (the second character is always the same).
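
The workaround described above can be written as a small helper. chr(8419) is U+20E3 (COMBINING ENCLOSING KEYCAP); the Unicode keycap emoji sequences also place variation selector U+FE0F between the base character and the keycap, so the sketch makes that optional. keycap is a hypothetical function name, not part of the library.

```python
COMBINING_ENCLOSING_KEYCAP = chr(8419)  # U+20E3, as used in the report above
VARIATION_SELECTOR_16 = "\ufe0f"        # requests emoji presentation

def keycap(symbol, emoji_presentation=True):
    """Build a keycap sequence for a single digit, '#' or '*'."""
    if len(symbol) != 1 or symbol not in "0123456789#*":
        raise ValueError("keycap base must be one of 0-9, '#' or '*'")
    middle = VARIATION_SELECTOR_16 if emoji_presentation else ""
    return symbol + middle + COMBINING_ENCLOSING_KEYCAP

print(keycap("1"))
print(keycap("1", emoji_presentation=False) == "1" + chr(8419))
```

Whether the two-character or the three-character form renders as a colored keycap depends on the platform's font, so both are worth trying.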

Don't import * in init

Blocked until #19 is merged

core.py does not implement an __all__ but this package is small enough that we should probably just explicitly import what we need to the top level.
