Comments (1)
Adding to the example above, one could also use unicodedata.normalize
to remove diacritics from each word:
import re
import unicodedata

def normalize_rom_name(name: str) -> str:
    converted_name = ' '.join(re.findall(r'(?:(?!_)\w)+', name))  # remove non-words and underscores
    normalized_name = unicodedata.normalize('NFD', converted_name)  # decompose into base characters + combining marks
    canonical_form = ''.join(c for c in normalized_name if not unicodedata.combining(c))  # remove accents
    return canonical_form.lower()

# "àéêö亰" -> "aeeo亰"
...
exact_matches = [
    rom
    for rom in roms
    if normalize_rom_name(rom["name"]) == normalize_rom_name(search_term)
    or normalize_rom_name(rom["slug"]) == normalize_rom_name(search_term)
    or rom["name"].lower() == search_term.lower()
    or rom["slug"].lower() == search_term.lower()
]
...
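For a quick sanity check of the matching above, here is a minimal self-contained sketch. The `roms` sample data and the `find_exact_matches` helper are hypothetical, made up for illustration; the only real change is that the search term is normalized once instead of once per ROM.

```python
import re
import unicodedata


def normalize_rom_name(name: str) -> str:
    # Keep runs of word characters (excluding underscores), joined by single spaces
    converted_name = " ".join(re.findall(r"(?:(?!_)\w)+", name))
    # Decompose accented characters into base character + combining marks
    normalized_name = unicodedata.normalize("NFD", converted_name)
    # Drop the combining marks (accents)
    canonical_form = "".join(
        c for c in normalized_name if not unicodedata.combining(c)
    )
    return canonical_form.lower()


def find_exact_matches(roms: list[dict], search_term: str) -> list[dict]:
    # Hypothetical helper: normalize the search term once, then compare
    wanted = normalize_rom_name(search_term)
    return [
        rom
        for rom in roms
        if normalize_rom_name(rom["name"]) == wanted
        or normalize_rom_name(rom["slug"]) == wanted
    ]


# Hypothetical sample data ("\u00e9" is a precomposed "é")
roms = [
    {"name": "Pok\u00e9mon Red", "slug": "pokemon-red"},
    {"name": "Super_Mario_World", "slug": "super-mario-world"},
    {"name": "Tetris", "slug": "tetris"},
]

matches = find_exact_matches(roms, "pokemon red")
```

One caveat I'm fairly sure of: the regex runs before the NFD step, so input that already arrives decomposed (base letter followed by a combining mark) would be split at the mark. Normalizing to NFC first would probably avoid that, but I haven't tested it against real ROM names.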
EDIT: Alternatively, with article handling (I don't know if this covers all of them). It's not as pretty, and it assumes a comma is used before a trailing article (e.g. ", the"):
def normalize_rom_name(name: str) -> str:
    # Convert to lower case, replace underscores with spaces
    name = name.lower()
    name = re.sub(r'_', ' ', name)

    # Remove a leading article, and an article that follows a comma (e.g. ", the")
    name = re.sub(r'^(?:a|an|the)\s+', '', name)
    name = re.sub(r',\s*(?:a|an|the)\b', '', name)  # \s* allows the space after the comma

    # Remove special characters and punctuation
    converted_name = ' '.join(re.findall(r'\w+', name))  # only keep words, no special characters or punctuation
    normalized_name = unicodedata.normalize('NFD', converted_name)  # decompose into base characters + combining marks
    canonical_form = ''.join(c for c in normalized_name if not unicodedata.combining(c))  # remove accents
    return canonical_form
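To illustrate what the article handling is meant to achieve, the sketch below checks that a leading article, a comma-separated article, and an underscore-separated name all end up with the same key. `strip_articles_key` is a hypothetical name used only here, and the patterns are an assumption on my part: a leading article must be followed by whitespace, and any other article must follow a comma (with optional whitespace).

```python
import re
import unicodedata


def strip_articles_key(name: str) -> str:
    # Hypothetical variant of normalize_rom_name, for illustration only
    name = name.lower().replace("_", " ")
    # Leading article: "the legend of zelda" -> "legend of zelda"
    name = re.sub(r"^\s*(?:a|an|the)\s+", "", name)
    # Article after a comma: "legend of zelda, the" -> "legend of zelda"
    name = re.sub(r",\s*(?:a|an|the)\b", "", name)
    # Keep only word characters, joined by single spaces
    converted_name = " ".join(re.findall(r"\w+", name))
    # Decompose, then drop combining marks (accents)
    normalized_name = unicodedata.normalize("NFD", converted_name)
    return "".join(c for c in normalized_name if not unicodedata.combining(c))


titles = [
    "The Legend of Zelda",
    "Legend of Zelda, The",
    "legend_of_zelda",
]
# All three spellings should collapse to a single key
keys = {strip_articles_key(t) for t in titles}
```

Requiring whitespace after a leading article keeps titles like "A-Train" from losing their first letter, at the cost of missing any article followed directly by punctuation.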