Comments (12)
Sounds like a good idea, but how should it be implemented? Choose encoding based on the file extension?
from cantools.
I would say:
Encoding defaults to None.
If encoding is None, select the default encoding by the type argument. If that is None too, select the encoding by file name extension. If that does not help either, fall back to UTF-8.
The selections are in a dict [type, default_encoding].
The described selection process could be a local function named e.g. guess_encoding(filename, type).
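A minimal sketch of the selection order described above, not the actual cantools implementation. The table contents are assumptions: only the DBC entry (CP1252, matching CANdb++) and the KCD entry (UTF-8) are taken from this thread. The parameter is called `database_format` here instead of `type` to avoid shadowing the Python builtin, and an `encoding` parameter is added so that an explicit value short-circuits the guess.

```python
import os

# Hypothetical lookup table: format name -> default encoding.
# Only 'dbc' and 'kcd' are backed by statements in this thread.
DEFAULT_ENCODINGS = {
    'dbc': 'cp1252',
    'kcd': 'utf-8',
}


def guess_encoding(filename, database_format=None, encoding=None):
    """Pick an encoding: explicit value, then format, then extension, then UTF-8."""
    if encoding is not None:
        # The caller knows what they are doing; do not second-guess them.
        return encoding

    if database_format is None:
        # No format given, so fall back to the file name extension.
        database_format = os.path.splitext(filename)[1].lstrip('.')

    # Unknown or missing formats end up with UTF-8.
    return DEFAULT_ENCODINGS.get(database_format.lower(), 'utf-8')
```

With this table, `guess_encoding('vehicle.dbc')` picks CP1252, while a file with an unknown extension gets UTF-8.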
What if the user wants to pass encoding as None to open() to use the platform-dependent encoding? https://docs.python.org/3/library/functions.html#open
Maybe add a special default encoding, like 'auto', which will do as you suggested.
I would suggest using a special option 'platform', because I would argue that selecting the encoding based on the platform is the edgiest of cases. The file formats are associated with specific tools that use one specific encoding. If I deviate from that, I should know what I am doing and make that explicit by passing the respective arguments to cantools.
I know that this strategy is different from the one that open() uses, but open() is general purpose. And that's not necessarily the most convenient, even though it is the most consistent with Python. But still, this is cantools :-).
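To make the 'platform' idea concrete, here is a rough sketch under stated assumptions: `resolve_encoding` and its `format_default` parameter are hypothetical names, not part of cantools. `None` means "use the format's default", and only the explicit string 'platform' asks for the locale-dependent encoding that open() would use on its own.

```python
import locale


def resolve_encoding(encoding, format_default='cp1252'):
    """Map the user-facing encoding argument to a concrete codec name."""
    if encoding is None:
        # Default case: use the encoding of the tool that owns the format.
        return format_default

    if encoding == 'platform':
        # Explicit opt-in to the platform-dependent encoding.
        return locale.getpreferredencoding(False)

    # Anything else is taken as a codec name chosen by the user.
    return encoding
```

The point of the design is that the edge case has to be spelled out by the caller, while the common case needs no argument at all.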
All I know is that file encoding is much harder to get right than one can imagine. There is always some use case you don't think about. When I'm in the unknown, I tend to implement as few restrictions as possible in the API. An additional platform argument might work, but it would be nice if it's not needed.
I totally agree that DBC-files should have the same default encoding as CANdb++, we just have to figure out how to do it in a good way =)
Btw, how do you know that CP1252 is the default encoding?
At first I assumed it. Then I noticed that the canmatrix project uses iso-8859-1 in one of its examples. So I verified it by creating a dbc with a € char in it. I read that with an editor in 1252 mode and it came out fine.
8859-1 and 1252 are basically the same, but M$ replaced most of the control chars in the 0x80-0x9F range with printables like €.
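A quick way to see the difference, using only Python's built-in codecs:

```python
euro = '\u20ac'  # the euro sign

# CP1252 stores the euro sign as the single byte 0x80.
raw = euro.encode('cp1252')
assert raw == b'\x80'

# ISO 8859-1 (latin-1) maps the same byte to a C1 control character.
assert raw.decode('latin-1') == '\x80'

# The euro sign cannot be written as latin-1 at all.
try:
    euro.encode('latin-1')
    euro_fits_latin1 = True
except UnicodeEncodeError:
    euro_fits_latin1 = False
assert not euro_fits_latin1

# Curly quotes also live in the range that CP1252 adds.
assert '\u201c\u201d'.encode('cp1252') == b'\x93\x94'

# Outside that range the two encodings agree, e.g. for the degree sign.
assert '\u00b0C'.encode('cp1252') == '\u00b0C'.encode('latin-1') == b'\xb0C'
```

So a file that only uses characters such as ° reads the same under both encodings, which makes a wrong guess easy to miss until a € or a curly quote shows up.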
Let's implement it as you first suggested. If someone wants to use the platform encoding, they can always use load() instead of load_file().
I implemented the suggested behavior on master, not yet released. Please give it a try. Consult the documentation for details.
I have updated to master and removed all the encoding arguments to load_file. Works like a charm. It even fixes an issue I had in my early days with cantools: some clever customer worked around regular quotes not being allowed in the comment field of CANdb++ by using fancy quotes... which also happen to be in the char range that CP1252 adds to the ISO charset. Since I did not get the encoding right back then, I got broken dbc files when saving with cantools (CANdb++ would refuse to open them).
By the way, the last potential hurdle for this use case is that a cantools user needs to pass the correct encoding to write() when saving the db string. A wrapper save_file that uses the appropriate encoding could work around that. Otherwise I would expect to find an increasing amount of dbc files written with the wrong encoding in the wild.
I also verified that e.g. a degC survives dbc to kcd translation (the latter being written in UTF-8).
The sym default encoding also seems fine: I checked one of the Peak tools, and it uses UTF-8 with BOM.
Great. Thanks.
Great that it works!
Yeah, feel free to add a dump (or write) function.
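The save_file wrapper proposed earlier in the thread could look roughly like the sketch below. This is not an existing cantools function: the signature, the return value and the encoding table are assumptions for illustration, with only the DBC (CP1252) and KCD (UTF-8) defaults taken from the discussion above.

```python
import os

# Hypothetical table: file extension -> default encoding.
DEFAULT_ENCODINGS = {
    'dbc': 'cp1252',
    'kcd': 'utf-8',
}


def save_file(text, filename, encoding=None):
    """Write a database string to disk using the format's default encoding."""
    if encoding is None:
        # Pick the encoding from the file extension, UTF-8 if unknown.
        extension = os.path.splitext(filename)[1].lstrip('.').lower()
        encoding = DEFAULT_ENCODINGS.get(extension, 'utf-8')

    with open(filename, 'w', encoding=encoding) as fout:
        fout.write(text)

    # Return the encoding actually used, which helps when debugging.
    return encoding
```

A caller would pass the string produced by the database object together with the target file name, and only supply `encoding` when the file is meant for a tool that expects something other than the format default.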