
Comments (7)

tallforasmurf commented on August 13, 2024

Note that if I use alg = (natsort.ns.LOCALE | natsort.ns.GROUPLETTERS), uppercase letters group with lowercase, but the accented versions still sort last; that is, 'a' is far away from 'å'.

from natsort.

SethMMorton commented on August 13, 2024

Do you have PyICU installed? I have found that Python's built-in locale library (which does the work of understanding locale-dependent sorting) does not work properly on some systems (specifically on Mac, which is what you are on). If you have PyICU installed, natsort will use that under the hood, and it gives better results. Can you try that?
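A quick way to see whether PyICU is visible to the interpreter is sketched below. It assumes PyICU exposes its usual top-level module name, icu; the helper name pyicu_available is made up for this example and is not part of natsort.

```python
import importlib.util


def pyicu_available():
    """Return True if a module named 'icu' (PyICU's import name) can be found."""
    return importlib.util.find_spec("icu") is not None


print("PyICU importable:", pyicu_available())
```

This only checks that the module can be located; it does not prove the underlying ICU libraries load correctly.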


tallforasmurf commented on August 13, 2024

I saw the note about PyICU in the docs, where it is specifically recommended for OS X. Before I install that rather large package: (a) what sequence would you expect the above code to print if everything is working as you expect (e.g. on your own test system)? and (b) would you expect changing the locale from en_US to fr_FR or de_DE to make a difference?


SethMMorton commented on August 13, 2024

a. I would expect the sequence that Qt printed to be the correct sequence.
b. In my tests it makes no difference which locale was used.

I can confirm that using Mac OS X's locale library (Python uses the system's C locale library), I get the (incorrect) results that you see. Below is the test file I used.

# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals
import locale
from natsort import natsort_keygen, ns

words = ['apple', 'åpple', 'Apple', 'Äpple', 'Epple', 'Èpple', 'épple', 'epple']
locale.setlocale(locale.LC_ALL, str('de_DE.UTF-8'))
key_func_L = natsort_keygen(alg=ns.LOCALE)
print(' '.join(sorted(words, key=key_func_L)))

When I disable PyICU, I get:

Apple Epple apple epple Äpple Èpple åpple épple

When I turn on PyICU, I get:

apple Apple åpple Äpple epple Epple épple Èpple

This is identical to what Qt is reporting.
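To illustrate why the second ordering is the expected one, here is a rough standard-library sketch of the idea behind locale-aware collation: compare base letters first, then accents, then case. It is not how natsort or ICU implement it, and the helper names base_letters and approx_collation_key are invented for this example. It groups accented letters with their base letters, but it does not reproduce ICU's exact ordering among different accents.

```python
import unicodedata


def base_letters(word):
    """Strip combining accents and fold case, e.g. 'Äpple' -> 'apple'."""
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def approx_collation_key(word):
    """Approximate key: base letters, then accents, then lowercase before uppercase."""
    decomposed = unicodedata.normalize("NFD", word)
    accents = tuple(ch for ch in decomposed if unicodedata.combining(ch))
    has_upper = word != word.lower()
    return (base_letters(word), accents, has_upper)


words = ['apple', 'åpple', 'Apple', 'Äpple', 'Epple', 'Èpple', 'épple', 'epple']
result = sorted(words, key=approx_collation_key)
print(' '.join(result))
```

With this key every 'a' variant sorts ahead of every 'e' variant and the unaccented forms come first in each group, which is the property the broken locale output above lacks.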


SethMMorton commented on August 13, 2024

Unfortunately, this is not something I can fix... it is a bug in the BSD locale implementation. There is a recent Python bug report on this... check it out: http://bugs.python.org/issue23195 (also check this out: http://stackoverflow.com/questions/3412933/python-not-sorting-unicode-properly-strcoll-doesnt-help). I'll definitely keep an eye on the bug report, but notice one of the solutions suggested is to install PyICU. Incidentally, it seems like the only affected locales are en_US, fr_FR and de_DE, which are the three you tried.
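If you want to check whether a given system is affected, a small diagnostic along these lines may help. It is only a sketch: the function name locale_groups_accents and the default locale name are assumptions for the example, and the result depends entirely on the platform's C library.

```python
import locale


def locale_groups_accents(locale_name="en_US.UTF-8"):
    """Report whether the platform collation sorts 'å' between 'a' and 'b'.

    Returns True or False, or None if the locale is not available here.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # remember current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        key = locale.strxfrm
        return key("a") < key("å") < key("b")
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)  # restore it


print(locale_groups_accents())
```

A False result suggests the system's locale library sorts accented letters after the whole unaccented alphabet, which is the symptom described in this thread.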

I'll make sure to update the docs in the next release to indicate that PyICU should only be needed on Mac OS X and BSD.


SethMMorton commented on August 13, 2024

BTW, if you use Homebrew (and I recommend it!), you can easily install ICU and PyICU with the following commands:

brew install icu4c
CFLAGS=-I/usr/local/opt/icu4c/include LDFLAGS=-L/usr/local/opt/icu4c/lib pip install pyicu

Homebrew does not link icu4c to the system to avoid conflicts, so you need to tell Python where to find it when installing PyICU.
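Once the install finishes, one way to confirm that PyICU itself collates correctly, independent of natsort, is to sort the same words with an ICU collator. This is a sketch that assumes PyICU's usual Collator and Locale classes; the wrapper sort_with_icu is a made-up name, and it returns None when PyICU cannot be imported.

```python
def sort_with_icu(words, locale_name="de_DE"):
    """Sort words with an ICU collator, or return None if PyICU is missing."""
    try:
        import icu  # PyICU's import name
    except ImportError:
        return None
    collator = icu.Collator.createInstance(icu.Locale(locale_name))
    return sorted(words, key=collator.getSortKey)


words = ['apple', 'åpple', 'Apple', 'Äpple', 'Epple', 'Èpple', 'épple', 'epple']
result = sort_with_icu(words)
print("PyICU not installed" if result is None else ' '.join(result))
```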


tallforasmurf commented on August 13, 2024

Yes, good. I had to add exports; pip didn't pick up the flags otherwise. Putting this here for reference for anybody else:

brew install icu4c
CFLAGS=-I/usr/local/opt/icu4c/include
export CFLAGS
LDFLAGS=-L/usr/local/opt/icu4c/lib
export LDFLAGS
pip install pyicu

After which natsort did behave as you say.
Thank you for your prompt & detailed help.

