Comments (55)
Looks good
from pyais.
Not sure yet. I am thinking about moving the current state to a different branch (e.g. version-1). Then main would be the place for the new, active development.
from pyais.
Happy to test it and see how things go. Could I get a link to the repo/branch that contains the new code?
Thanks :)
from pyais.
Looks like there is an issue with asdict attempting to cast enums which are None...
Traceback (most recent call last):
  File "/pyais/messages.py", line 438, in asdict
    d = {slt: int(getattr(self, slt)) if slt in ENUM_FIELDS else getattr(self, slt) for slt in self.__slots__}
  File "/pyais/messages.py", line 438, in <dictcomp>
    d = {slt: int(getattr(self, slt)) if slt in ENUM_FIELDS else getattr(self, slt) for slt in self.__slots__}
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
from pyais.
This seems like an error of your code, right?
Not technically an error. Those are part 2 messages that could not be matched to a part 1 (I'm not using the built-in matching from pyais), so it's something you can ignore.
May I ask where you got your large dataset from?
Unfortunately it's not publicly available and I don't have permission to share it, sorry <3
I guess, that we're ready for a v2 release, don't you think so?
Definitely looks like it. If you want to wait a day or two, I will re-decode the ~203M test dataset with the new changes to see if anything else pops up. After it's ready for release I intend to re-decode the full historical dataset of 7 billion reports.
With the new decoding method, errors for the test dataset are down to 0.01%, which is an order of magnitude improvement.
from pyais.
The only fix I can see for this currently is to check the length of the bit array before each access, which would still cause a significant slowdown... Perhaps this is an issue the bitarray library needs to solve.
from pyais.
Figured out why the error is not being returned: the silent parameter for decoding defaults to True, hiding errors...
def decode(self, silent: bool = True) -> Optional["AISMessage"]:
    """
    Decode the message content.
    @param silent: Boolean. If set to true errors are ignored and None is returned instead
    """
    msg = AISMessage(self)
    try:
        msg.decode()
    except Exception as e:
        if not silent:
            raise e
    return msg
from pyais.
I'm thinking the issue is that the message being decoded is part of a multipart message, so even though the first section of data is fine, the missing parts still result in an error being thrown.
from pyais.
Never mind, getting a lot of errors with single part messages too.
!AIVDM,1,1,,B,E>lt;KLab21@1bb@I@@@@@@@@@@D8k2tnmvs000003v0@,2*52
This for example fails as the last bit is missing, yet it has a valid checksum...
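A valid checksum and a complete payload are independent checks, which is why a sentence like this can pass one and fail the other. The sketch below is a standalone illustration using only the standard library (the helper names are made up here and are not part of pyais): the checksum is the XOR of the characters between `!` and `*`, while the payload length is six bits per armored character minus the fill bits.

```python
from functools import reduce


def nmea_checksum(sentence: str) -> int:
    """XOR of every character between the leading '!' or '$' and the '*'."""
    body = sentence[1:sentence.index("*")]
    return reduce(lambda acc, ch: acc ^ ord(ch), body, 0)


def checksum_is_valid(sentence: str) -> bool:
    """Compare the computed checksum with the two hex digits after '*'."""
    expected = int(sentence.rsplit("*", 1)[1][:2], 16)
    return nmea_checksum(sentence) == expected


def payload_bit_length(sentence: str) -> int:
    """Each armored payload character carries 6 bits; fill bits are padding."""
    fields = sentence.split(",")
    payload = fields[5]
    fill_bits = int(fields[6].split("*")[0])
    return len(payload) * 6 - fill_bits


msg = "!AIVDM,1,1,,B,E>lt;KLab21@1bb@I@@@@@@@@@@D8k2tnmvs000003v0@,2*52"
print("checksum valid:", checksum_is_valid(msg))
print("payload bits:", payload_bit_length(msg))
```

The checksum only protects the characters in transit; it says nothing about whether the sender produced enough bits for every field of the message type.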
from pyais.
The fact that any exception that occurs during decoding is caught and thrown as pyais.exceptions.UnknownMessageException makes it unnecessarily difficult to debug issues...
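One common way to keep a single library-level exception without losing the original cause is exception chaining. This is only a sketch of the general Python technique, not what pyais does; `UnknownMessageError` and `parse_payload` are hypothetical stand-ins.

```python
class UnknownMessageError(Exception):
    """Hypothetical stand-in for a library-level decoding exception."""


def parse_payload(bits: str) -> int:
    # Raises IndexError when the payload is shorter than 271 bits.
    return int(bits[270])


def decode(bits: str) -> int:
    try:
        return parse_payload(bits)
    except IndexError as exc:
        # 'from exc' stores the original error in __cause__, so the
        # traceback shows both the wrapper and the underlying IndexError.
        raise UnknownMessageError("payload too short") from exc


try:
    decode("0" * 270)
except UnknownMessageError as err:
    print(type(err.__cause__).__name__)
```

With chaining, callers can still catch one exception type, while the traceback keeps pointing at the line that actually failed.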
from pyais.
Never mind, getting a lot of errors with single part messages too.
!AIVDM,1,1,,B,E>lt;KLab21@1bb@I@@@@@@@@@@D8k2tnmvs000003v0@,2*52
This for example fails as the last bit is missing, yet it has a valid checksum...
So the issue with this and others appears to be that it's missing the last bit for the assigned property. I'm looking into whether this is just a small subset of type 21 messages or a larger issue.
from pyais.
So, looking over everything and testing with 6 million records: type 21 reports make up about 1% of the records, 40% of type 21 reports encounter this error, and type 21 reports are getting 50x more errors than other types. Not sure if this is just due to type 21 reports being unlucky or if there is something funky with that last bit causing valid messages to be dropped....
On the one hand the bit is missing, so it makes sense for the library to drop the message. On the other hand, the number of messages failing to decode because of it seems to indicate that perhaps that bit isn't required or isn't strictly sent all the time.
from pyais.
Hey @Inrixia,
thank you for bringing this up. Sadly, I am aware of this issue. This is why I started to refactor the whole project a while ago.
I began to implement a more reliable and generous approach for encoding messages. You can see the current state by looking at encode.py. I still need to figure out how to refactor the existing decoding part without breaking too many things.
from pyais.
Ah thanks, it certainly is an interesting problem. Having had a brief look over the rewrite, it looks promising.
Ideally it would be possible to optionally allow for messages missing bits to be decoded with missing values set to None, especially if the checksums are valid.
As for the refactor, I would think it's best to increment the major version number and just accept breaking changes. It would be a good opportunity to have things like the boolean types actually return booleans instead of 0 or 1. Currently I am having to cast boolean and enum types to their respective boolean and int values after decoding. I am also combining NMEA message header information with the decoded information, as message decoding does not return information like the talker and fragment count etc.
Curious on your thoughts and plans for where you want to go with it
from pyais.
Part of the underlying issue is that the NMEA specification is not freely available. Therefore, many decoders and encoders rely on unofficial resources like https://gpsd.gitlab.io/gpsd/AIVDM.html. According to this specification, Type 21 messages should have between 272 and 360 bits:
This message is unusual in that it varies in length depending on the presence and size of the Name Extension field. May vary between 272 and 360 bits.
So it seems that your example message "!AIVDM,1,1,,B,E>lt;KLab21@1bb@I@@@@@@@@@@D8k2tnmvs000003v0@,2*52" is actually invalid. In fact other online decoders such as http://ais.tbsalling.dk/ fail to decode this message for this exact reason.
from pyais.
So it seems that your example message "!AIVDM,1,1,,B,E>lt;KLab21@1bb@I@@@@@@@@@@D8k2tnmvs000003v0@,2*52" is actually invalid. In fact other online decoders such as http://ais.tbsalling.dk/ fail to decode this message for this exact reason.
Yep, this is the conclusion I came to. As for why roughly 50% of type 21 messages encounter this, I don't know. It's certainly weird, but I do believe that failing to decode due to the missing bit is expected behavior with the current state of the library.
Though, as mentioned above, having the ability to decode messages with a valid checksum, with None values where the bits end early, could be beneficial. At the same time, allowing these messages to be decoded could allow actually corrupt messages to be decoded as well, which is why it would need to be optional and not default behavior.
from pyais.
Generally I think that it is best to be very liberal in what to accept, as the standard seems to be mostly a rough guideline for most encoders. :-D
To allow for more liberal decoding I am planning to slightly change the way of accessing parts of the message. If we look at the decoding now, we can see that I am currently slicing the bitarray explicitly:
return {
    'type': get_int_from_data(0, 6),
    'repeat': get_int_from_data(6, 8),
    'mmsi': get_mmsi(bit_arr, 8, 38),
    'status': NavigationStatus(get_int_from_data(38, 42)),
    'turn': get_int_from_data(42, 50, signed=True),
    'speed': get_int_from_data(50, 60) / 10.0,
    'accuracy': bit_arr[60],
    'lon': get_int_from_data(61, 89, signed=True) / 600000.0,
    'lat': get_int_from_data(89, 116, signed=True) / 600000.0,
    'course': get_int_from_data(116, 128) * 0.1,
    'heading': get_int_from_data(128, 137),
    'second': get_int_from_data(137, 143),
    'maneuver': ManeuverIndicator(get_int_from_data(143, 145)),
    'raim': bit_arr[148],
    'radio': get_int_from_data(149, len(bit_arr)),
}
By doing so, it is obvious that the lib will run into an IndexError if the message has fewer bits than expected.
Instead, I think that it would be better to decode each part of the message until the end of the bitarray is reached. If the message is too short for every field, the remaining fields would be set to some default value. One could also set a warning flag or so for the message to indicate that the decoded message seems strange.
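The idea of walking the fields in order and stopping at the end of the data can be sketched roughly as follows. This is an illustration only: the field layout is a shortened, hypothetical one, the bits are a plain `'0'`/`'1'` string rather than a bitarray, and none of the names come from pyais.

```python
from typing import Dict, List, Tuple

# (field name, width in bits); a shortened, hypothetical layout
LAYOUT: List[Tuple[str, int]] = [
    ("type", 6),
    ("repeat", 2),
    ("mmsi", 30),
    ("assigned", 1),
]


def decode_fields(bits: str, default: int = 0) -> Tuple[Dict[str, int], bool]:
    """Decode fields in order; fields that do not fit get the default value.

    Returns the decoded values and a flag telling whether the input was
    too short for at least one field.
    """
    values: Dict[str, int] = {}
    truncated = False
    pos = 0
    for name, width in LAYOUT:
        chunk = bits[pos:pos + width]
        if len(chunk) < width:
            # Not enough bits left: fall back to the default and flag it.
            values[name] = default
            truncated = True
        else:
            values[name] = int(chunk, 2)
        pos += width
    return values, truncated


full = "010101" + "00" + format(123456789, "030b") + "1"
print(decode_fields(full))
print(decode_fields(full[:-1]))  # last bit missing
```

Because slicing a string past its end never raises, the short case is detected by comparing the chunk length with the expected width instead of catching an IndexError.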
from pyais.
My thoughts exactly. Though I would use None over a default to make it clear that no value was decoded. Using a default would be misleading.
from pyais.
Using None has the disadvantage that a user of the library would need to perform a None check explicitly for every message. This may lead to bugs and a lot of boilerplate code.
from pyais.
Also type signatures would become messy
from pyais.
Assuming partial decoding is enabled by default, I would expect a flag to be provided on each decoded object if it failed to decode completely (easy to do, just set it or remove it at the end). So a user would only need to check the flag, or set an option to have default values returned. But returning defaults is worse imo, as it would allow a user to not check the flag and treat invalid data as valid. Not to mention that with defaults you have no way to identify which fields failed to decode.
from pyais.
I will need to think about this when the time has come. But this is a fairly small implementation detail, which can be changed easily later on. As my time is limited, I am planning the following steps:
- I will finish my work on the encoding of messages, as this adds a class for every message
- I will tackle this issue
Nevertheless, I am thinking about a quick fix for this issue:
We could change the get_int function to add a bounds check:
def get_int(data: bitarray, ix_low: int, ix_high: int, signed: bool = False) -> int:
    if len(data) < ix_low:
        return 0
    ix_high = min(ix_high, len(data) - 1)
    shift: int = (8 - ((ix_high - ix_low) % 8)) % 8
    data = data[ix_low:ix_high]
    i: int = from_bytes_signed(data) if signed else from_bytes(data)
    return i >> shift
Also, all explicit slices of the bitarray would be replaced with a call to get_int. For Type 21 this would mean that we change:
'assigned': bit_arr[270],
to
'assigned': get_int_from_data(270, 270),
So 0 would become a somewhat universal default value for too short messages. One could also return None, but as get_int is currently guaranteed to return an integer, this could mess things up.
from pyais.
I would be happy to assist in the rewrite, if you want help.
That sounds like a good plan. But I don't think a quick fix of returning 0 would work, as 0 is a valid value for many properties, especially considering that boolean values are currently returned as ints where 0 = False (another thing that could be fixed in the rewrite). I don't think there's a major need to try and fix this on the current version, as it really does need a new version with different return types.
The best approach is definitely using the new method of decoding mentioned above, accessing the array from the lowest index up. Then just catch any exception that occurs and either return a partial object with an error flag set, or re-raise it.
Of course this would require some consideration on if None should be explicitly set for invalid properties or if the kvp should just be missing due to how python treats those two instances differently.
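The difference between the two options is easy to see with plain dictionaries. A small sketch (the field names are only examples): an explicit None keeps the key visible and serializes to JSON `null`, while an omitted key can only be detected by a membership test and raises on direct access.

```python
import json

explicit = {"mmsi": 123456789, "assigned": None}  # field present, value unknown
omitted = {"mmsi": 123456789}                      # field left out entirely

# Membership tells the two cases apart ...
print("assigned" in explicit, "assigned" in omitted)

# ... but .get() does not: both return None.
print(explicit.get("assigned"), omitted.get("assigned"))

# Direct access only works when the key exists.
try:
    omitted["assigned"]
except KeyError as err:
    print("missing key:", err)

# Serialized, the explicit None survives as null; the omitted key vanishes.
print(json.dumps(explicit))
print(json.dumps(omitted))
```

An explicit None therefore keeps the schema stable for downstream consumers, at the cost of requiring None checks.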
from pyais.
The main thing is being able to identify bad/failed properties if partial decoding is enabled. For me at least, I can only allow for either a fully decoded object or one where I know exactly which properties failed or are missing. A partial one where I don't know which properties could be invalid would make the object useless, as I couldn't trust the data.
from pyais.
Booleans and integers are actually the same objects in Python, so I don't see any need to return a boolean:
1 == True
True
0 == False
True
from pyais.
When accessed by other libraries or serialized out to JSON, it is interpreted as an integer though, which requires casting beforehand.
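The serialization difference can be reproduced with the standard library alone. In Python `bool` is a subclass of `int`, so `1 == True` holds, but the `json` module still writes the two differently (the key name below is just an example):

```python
import json

as_int = json.dumps({"raim": 1})
as_bool = json.dumps({"raim": True})

print(as_int)   # {"raim": 1}
print(as_bool)  # {"raim": true}

# Equal in comparisons, but not the same type.
print(1 == True, type(1) is type(True))
```

A consumer with a typed schema (for example a boolean database column) will see these as different values, which is why the cast is needed before writing.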
from pyais.
I finished my work on adding support for encoding messages. So now there are classes for every message, which should make an iterative decoding approach quite feasible.
I hope that I will find time to work on this soon.
from pyais.
Sounds good, I would be happy to implement partial decoding myself and do a pull request but looks like only the encoding is using the new classes so will hold off until I know what you intend for the decoding.
from pyais.
Hey @Inrixia,
I added a very basic proof of concept on how we could reuse the existing classes used for encoding.
See the following method:
Line 245 in d643ebe
Actually, it would be fairly easy to reuse the classes without too many changes. I am thinking of the following:
- Get rid of the encode.py and decode.py files and move everything into the messages.py file
- Add a from_bytes and from_str method to the Payload superclass
- Add all required datatypes and matching decoding functions
What do you think?
from pyais.
I implemented most of the logic. So the new structure of the project is basically done. The following things still need some love:
- the iterative decoding is not yet fully tested
- bad/invalid/fishy messages are not marked as such
- the streaming interfaces for sockets and files need to be tested
- the documentation needs to be updated
But most of the work should be done. Overall I am quite happy with the result. I really like the fact that the message classes are somewhat declarative and not procedural anymore. Instead of telling the decoder how to decode the individual parts, the class just lists all of its fields and their width & value/type.
class MessageType1:
    msg_type = bit_field(6, int, default=1)
    repeat = bit_field(2, int, default=0)
    mmsi = bit_field(30, int, from_converter=from_mmsi, to_converter=to_mmsi)
    ...
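To make the declarative idea concrete without relying on pyais internals, here is a rough standalone sketch using `dataclasses` from the standard library instead of attrs. The `bit_field` helper, the `PositionHeader` class and `from_bits` are hypothetical and only mirror the shape of the snippet above: each field declares its width, and one generic loop does the decoding.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


def bit_field(width: int):
    """Hypothetical helper: store the bit width as dataclass field metadata."""
    return field(default=None, metadata={"width": width})


@dataclass
class PositionHeader:
    msg_type: Optional[int] = bit_field(6)
    repeat: Optional[int] = bit_field(2)
    mmsi: Optional[int] = bit_field(30)

    @classmethod
    def from_bits(cls, bits: str) -> "PositionHeader":
        """Generic decoder: walk the declared fields in order."""
        kwargs = {}
        pos = 0
        for f in fields(cls):
            width = f.metadata["width"]
            chunk = bits[pos:pos + width]
            # Fields that no longer fit into the input stay None.
            kwargs[f.name] = int(chunk, 2) if len(chunk) == width else None
            pos += width
        return cls(**kwargs)


bits = "000001" + "00" + format(367000000, "030b")
print(PositionHeader.from_bits(bits))
print(PositionHeader.from_bits(bits[:8]))  # mmsi is missing
```

Adding or removing a field then only touches the class body; the decoding loop stays unchanged.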
from pyais.
I also decided to publish the refactored library under a different name. Currently, there are at least 11 dependent packages of this lib, and because the interface changed significantly this could cause some irritation/confusion. I am thinking about the name pyais-2.
from pyais.
Awesome, is the refactor just going to be on a separate branch then?
from pyais.
@Inrixia Sooo. Everything should be done. You can preview the newest version by installing: pip install pyais==2.0.0-alpha
May I ask you to install this version and mess around with it? I would love to get some feedback on bugs or errors that I may have missed. :-)
from pyais.
yep: https://github.com/M0r13n/pyais/tree/iterative-decoding
you can also install it with pip:
pip install pyais==2.0.0-alpha
from pyais.
Have tested using the new library on a few messages and looked into how it will integrate. Have not done an extensive test across message types to check for errors yet.
The good:
- Boolean values are fixed
- Partial decoding works with None properties set as expected. Though looking at the code, defaults are all set to 0, so I am confused as to how attr.ib is interpreting this. Perhaps you can clarify if this is expected behaviour or if 0 should be replaced with None.
- Default error handling is no longer set to silently fail
The bad:
- Enum types do not use None if the fields are missing when partial decoding
- Using slotted classes removes the __dict__ attribute. You have added an asdict() function to iterate over the fields and return a dict, but this is slow. A better approach would be to either have asdict() defined for each message class to return an object without iteration, or don't use slotted classes and return the __dict__ attribute.
- While slightly out of scope of this issue, message classes that return Enums require explicit conversion to integers even when using asdict(). Can you think of any good way to have integers optionally returned in place of Enums?
Overall it's a massive improvement, and other than the few kinks to sort out with enums and asdict()/__dict__ I cannot see any other issues.
Please let me know your thoughts
from pyais.
Oh, I should also note that currently I am merging the NMEAMessage attributes with the decoded attributes. This is done using the following code:
def resultFromNMEA(NMEAMessage):
    result = {}
    result["validChecksum"] = NMEAMessage.is_valid
    result["message_fragments"] = NMEAMessage.message_fragments
    result["fragment_number"] = NMEAMessage.fragment_number
    # Named ais_id header_type to remove confusion between ais_id and message_id
    # ais_id (header_type) is the ais message type
    # message_id is incremented for each new multi-fragment message sent by a vessel.
    # It allows a decoding program to match together fragments that belong to the same message.
    result["header_type"] = NMEAMessage.ais_id
    result["NMEATalker"] = NMEAMessage.talker.value
    result["NMEAType"] = NMEAMessage.type
    result["message_id"] = NMEAMessage.message_id
    result["channel"] = NMEAMessage.channel
    return result
and then
result.update(decodedObjectGoesHere)
Might be nice to have this merging supported directly by the library, as the values provided by NMEAMessage are useful and critical for multi part decoding if not using the library's built-in functionality for doing so.
from pyais.
@Inrixia Thanks for your detailed reply. :-)
Enum types do not use None if the fields are missing when partial decoding
That's not so easy to achieve, because passing None to any Enum would result in a ValueError: None is not a valid Enum. To keep things simple, I therefore decided that every unknown value should result in some default value being returned. For example passing None to EpfdType returns EpfdType.Undefined. Is there a legitimate use case where it would be useful to distinguish between None and EpfdType.Undefined? If the payload is too short for the Enum value to be decoded, it's garbage anyway.
Using slotted classes removes the __dict__ attribute, you have added an asdict() function to iterate over the fields and return a dict, but this is slow.
True. But on the other hand the classes itself consume less space and attribute access is faster.
A better approach would be to either have asdict() defined for each message class
I would like to not do this. The beauty of the current solution is that all messages are self-contained in a single class and every field is defined through a single attribute. If we defined asdict() for every message, there would be duplicate logic. To add or remove a field, one would have to change both the class and the asdict() method.
I did the test and benchmarked the following options:
- the current asdict() with iterating over fields(): Took: 1.997422695159912 s for 1000000 messages
- replacing asdict() with this expression: {slt: getattr(self, slt) for slt in self.__slots__}: Took: 1.2585115432739258 s for 1000000 messages
- define asdict() for every message (in this case only for Type 21): Took: 0.6488058567047119 s for 1000000 messages
I think {slt: getattr(self, slt) for slt in self.__slots__} is a nice option, which is roughly 1.6x faster than the current asdict().
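For anyone who wants to repeat this kind of comparison, a minimal `timeit` harness could look like the following. The `Report` class is a made-up four-field stand-in, not a pyais message, so absolute numbers will differ from the ones quoted above; only the relative ordering of the approaches is of interest.

```python
import timeit


class Report:
    """Made-up slotted class with a handful of fields."""
    __slots__ = ("msg_type", "mmsi", "lat", "lon")

    def __init__(self, msg_type, mmsi, lat, lon):
        self.msg_type = msg_type
        self.mmsi = mmsi
        self.lat = lat
        self.lon = lon

    def asdict(self):
        # Generic variant: iterate over the slot names.
        return {slt: getattr(self, slt) for slt in self.__slots__}

    def asdict_explicit(self):
        # Hand-written variant: no iteration, but duplicates the field list.
        return {
            "msg_type": self.msg_type,
            "mmsi": self.mmsi,
            "lat": self.lat,
            "lon": self.lon,
        }


report = Report(21, 123456789, 1.0, 2.0)
for name in ("asdict", "asdict_explicit"):
    seconds = timeit.timeit(getattr(report, name), number=100_000)
    print(f"{name}: {seconds:.3f} s for 100000 calls")
```

The hand-written variant avoids the loop and `getattr` calls, which is where the measured difference comes from, but it has to be kept in sync with the slots by hand.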
Can you think of any good way to be able to have integers optionally returned in place of Enum's
Not really. But I came up with a compromise:
def asdict(self, enum_as_int: bool = False) -> typing.Dict[str, typing.Any]:
    """
    Convert the message to a dictionary.
    @param enum_as_int: If set to True all Enum values will be returned as raw ints.
    @return: The message as a dictionary.
    """
    if enum_as_int:
        d = {slt: int(getattr(self, slt)) if slt in ENUM_FIELDS else getattr(self, slt) for slt in self.__slots__}
    else:
        d = {slt: getattr(self, slt) for slt in self.__slots__}
    return d
Using this approach you can get the values as ints when using asdict() with a relatively small performance penalty. For 1000000 messages this takes only ~1.5s instead of ~1.3s on my machine.
Might be nice to have this merging supported directly by the library as the values provided by NMEAMessage are useful and critical for multi part decoding if not using the libraries built in functionality for doing so.
Your example is somewhat application specific. I added a more general implementation:
def decode_and_merge(self, enum_as_int: bool = False) -> Dict[str, Any]:
    """
    Decodes the message and returns the result as a dict together with all attributes of
    the original NMEA message.
    @param enum_as_int: Set to True to treat IntEnums as pure integers
    @return: A dictionary that holds all fields, defined in __slots__ + the decoded msg
    """
    rlt = self.asdict()
    del rlt['bit_array']
    decoded = self.decode()
    rlt.update(decoded.asdict(enum_as_int))
    return rlt
from pyais.
Can you try it again? :-)
pip install git+https://github.com/M0r13n/pyais.git@iterative-decoding
from pyais.
Is there a legitimate usecase where it would be useful to distinguish between None and EpfdType.Undefined?
Yes, the enum only exists in the context of Python. As soon as you want to write the data to anything outside of Python, i.e. a file or database, the enum is turned into an int when serializing, and there is a big difference between a default int value and null/None.
If every enum uses the same int for its "None" value then it could be cast to None when serializing at the cost of a performance hit, but this doesn't seem like a good fix, and is inconsistent with all other properties that use None to indicate a non decoded field. This is effectively the same discussion as the one on not using default values for non decoded properties.
Also, just to be sure: when you state "I therefore decided that every unknown value should result in some default value being returned", is this just for Enums currently? I observed other properties being set to None when not decoded (which is what should be happening).
True. But on the other hand the classes itself consume less space and attribute access is faster.
I would argue that this warrants actually checking the performance difference, since presumably any system that is concerned with performance will be bulk processing or streaming data and accessing the entire object. But since using non slotted classes would require casting the enums to ints after the fact anyway, it's potentially better to just go with the asdict approach that implements enum_as_int.
Also, to improve performance of asdict it's better to not have an if statement and to just have a separate function named something like asdict_with_ints. This would remove the overhead of the if check for both asdict and asdict_with_ints.
Using this approach you can get the values as ints when using asdict() with a relatively small performance penalty. For 1000000 this takes only ~ 1.5s instead of ~1.3s on my machine.
This is pretty good performance. Assuming those numbers are accurate and looking at my tests, it would be roughly 4-5% of processing time out of my whole decoding process, where it takes roughly ~20min per ~20mil records.
Though it would be nice if the time it takes could be compared to the actual time to decode one message to see how much of a performance impact it would have, i.e. if it actually takes the same time as decoding the message then that would indicate a 2x speedup is possible.
Your example is somewhat application specific. I added a more general implementation...
Yes, and I've just realized that I'm actually performing some logic between getting the NMEA message and decoding it, so the above won't be useful for me, sorry. However, it's probably still nice to include it for people who want to have both.
from pyais.
Can you try it again? :-)
Yep, I'll check out the new asdict functionality, but I doubt I will have any feedback for it.
Currently the only things left are the issue with None for enums and performance considerations. Once those are resolved I intend to properly test the changes by re-decoding my current dataset of roughly ~7 billion reports.
Ideally with partial decoding the number of reports that are possible to decode will get much closer to 99% compared to the current metrics:
(6176797644+33384245)/6994626173 = 88.79% of messages were decoded
33384245/6176797644 = 0.54% of messages failed to decode (not including mismatched part 2 messages)
from pyais.
Currently the only things left is the issue with None for enums and performance considerations.
If a field is null/None it is also None for enums now. In order to achieve this, I added the following class function to all Enums:
@classmethod
def from_value(cls, v: typing.Optional[typing.Any]) -> typing.Optional["StationIntervals"]:
    return cls(v) if v is not None else None
So it will return None if None is passed as a value.
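A self-contained version of this pattern, for experimenting outside the library, might look like this. The enum below is a shortened, illustrative subset and is defined locally rather than imported from pyais; the serialization helper is likewise only an example.

```python
import json
from enum import IntEnum
from typing import Any, Optional


class EpfdType(IntEnum):
    """Shortened, illustrative subset of an EPFD type enum."""
    Undefined = 0
    GPS = 1

    @classmethod
    def from_value(cls, v: Optional[Any]) -> Optional["EpfdType"]:
        # None means "field was not decoded" and is passed through unchanged.
        return cls(v) if v is not None else None


def to_json(epfd: Optional[EpfdType]) -> str:
    # Keep undecoded fields as null instead of a default integer.
    return json.dumps({"epfd": None if epfd is None else int(epfd)})


print(EpfdType.from_value(1))
print(EpfdType.from_value(None))
print(to_json(EpfdType.from_value(0)))
print(to_json(EpfdType.from_value(None)))

# Calling the enum directly with None still raises.
try:
    EpfdType(None)
except ValueError as err:
    print("ValueError:", err)
```

This keeps "decoded as Undefined" (0) and "not decoded at all" (null) distinguishable once the data leaves Python.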
Also, I think that we shouldn't focus on performance too much. We should definitely avoid obvious performance bottlenecks, but no more than that. If one needs really fast performance, there is the lib-ais project written in C++, which is ~3 times faster anyway.
from pyais.
That looks great.
Can I just pip install git+https://github.com/M0r13n/pyais.git@iterative-decoding to test?
from pyais.
Yep. That should work.
from pyais.
These are the current errors I'm seeing with a 10mil test dataset.
Looks like everything is good aside from the above ENUM NoneType issue
decodingError | count |
---|---|
null | 9260287 |
Traceback (most recent call last): File "", line 19, in decodeNMEA File "pyais/messages.py", line 438, in asdict d = {slt: int(getattr(self, slt)) if slt in ENUM_FIELDS else getattr(self, slt) for slt in self.__slots__} File "pyais/messages.py", line 438, in <dictcomp> d = {slt: int(getattr(self, slt)) if slt in ENUM_FIELDS else getattr(self, slt) for slt in self.__slots__} TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType' | 24189 |
Cannot decode messages with a fragment number larger than 1 using single message decoder | 129 |
Message type 31 is not supported! | 110 |
Message type 30 is not supported! | 75 |
Message type 28 is not supported! | 72 |
Message type 29 is not supported! | 46 |
Message type 39 is not supported! | 30 |
Message type 42 is not supported! | 25 |
Message type 52 is not supported! | 22 |
Message type 56 is not supported! | 21 |
Message type 48 is not supported! | 19 |
Message type 54 is not supported! | 18 |
Message type 47 is not supported! | 15 |
Message type 55 is not supported! | 10 |
Message type 45 is not supported! | 4 |
Traceback (most recent call last): File "", line 19, in decodeNMEA File "pyais/messages.py", line 311, in decode return MSG_CLASS[self.ais_id].from_bitarray(self.bit_array) File "pyais/messages.py", line 429, in from_bitarray return cls(**kwargs) # type:ignore File "", line 11, in init self.data = __attr_converter_data(data) File "pyais/util.py", line 180, in int_to_bytes return int.from_bytes(val, 'big') TypeError: cannot convert 'NoneType' object to bytes | 3 |
Traceback (most recent call last): File "", line 19, in decodeNMEA File "pyais/messages.py", line 311, in decode return MSG_CLASS[self.ais_id].from_bitarray(self.bit_array) File "pyais/messages.py", line 1077, in from_bitarray raise ValueError(f"Partno {partno} is not allowed!") ValueError: Partno 2 is not allowed! | 2 |
Message type 60 is not supported! | 1 |
Message type 46 is not supported! | 1 |
from pyais.
Also, just want to double check: have any of the types changed for values returned compared to v1, aside from the Booleans and Enums?
from pyais.
Here are the error metrics using the fix in #50 on a large test dataset of ~230M reports.
decodingError | count |
---|---|
null | 232008704 |
Cannot decode messages with a fragment number larger than 1 using single message decoder | 2926 |
Message type 31 is not supported! | 2843 |
Message type 30 is not supported! | 1572 |
Message type 28 is not supported! | 1250 |
Message type 29 is not supported! | 1123 |
Message type 39 is not supported! | 731 |
Message type 42 is not supported! | 653 |
Message type 56 is not supported! | 602 |
Message type 52 is not supported! | 593 |
Message type 55 is not supported! | 552 |
Message type 48 is not supported! | 501 |
Message type 47 is not supported! | 427 |
Message type 54 is not supported! | 405 |
Message type 45 is not supported! | 119 |
Message type 63 is not supported! | 91 |
Message type 59 is not supported! | 62 |
Traceback (most recent call last): File "", line 19, in decodeNMEA File "pyais/messages.py", line 311, in decode return MSG_CLASS[self.ais_id].from_bitarray(self.bit_array) File "messages.py", line 429, in from_bitarray return cls(**kwargs) # type:ignore File "", line 11, in init self.data = __attr_converter_data(data) File "pyais/util.py", line 180, in int_to_bytes return int.from_bytes(val, 'big') TypeError: cannot convert 'NoneType' object to bytes | 47 |
Message type 61 is not supported! | 41 |
Message type 62 is not supported! | 33 |
Message type 60 is not supported! | 33 |
Message type 46 is not supported! | 32 |
Message type 58 is not supported! | 26 |
Message type 49 is not supported! | 23 |
Traceback (most recent call last): File "", line 19, in decodeNMEA File "pyais/messages.py", line 311, in decode return MSG_CLASS[self.ais_id].from_bitarray(self.bit_array) File "pyais/messages.py", line 1076, in from_bitarray raise ValueError(f"Partno {partno} is not allowed!") ValueError: Partno 2 is not allowed! | 18 |
Message type 44 is not supported! | 18 |
Message type 57 is not supported! | 18 |
Traceback (most recent call last): File "", line 19, in decodeNMEA File "pyais/messages.py", line 311, in decode return MSG_CLASS[self.ais_id].from_bitarray(self.bit_array) File "pyais/messages.py", line 1076, in from_bitarray raise ValueError(f"Partno {partno} is not allowed!") ValueError: Partno 3 is not allowed! | 15 |
Message type 38 is not supported! | 13 |
Message type 40 is not supported! | 11 |
Message type 33 is not supported! | 3 |
Message type 53 is not supported! | 2 |
Message type 35 is not supported! | 1 |
Message type 43 is not supported! | 1 |
Only error that stands out now is
Traceback (most recent call last):
  File "", line 19, in decodeNMEA
  File "pyais/messages.py", line 311, in decode
    return MSG_CLASS[self.ais_id].from_bitarray(self.bit_array)
  File "pyais/messages.py", line 429, in from_bitarray
    return cls(**kwargs) # type:ignore
  File "<attrs generated init pyais.messages.MessageType6>", line 11, in __init__
    self.data = __attr_converter_data(data)
  File "pyais/util.py", line 180, in int_to_bytes
    return int.from_bytes(val, 'big')
TypeError: cannot convert 'NoneType' object to bytes
But it's only for 47/232008704 reports so probably fine to ignore it.
Outside of non throwable errors or issues with schema changes (see my previous question), I think that with #50 the implementation of iterative decoding is pretty much done.
from pyais.
Performance wise, at least for my pipeline, which includes some advanced multi part decoding, I got roughly 27k/s/core (reports per second per core), which is pretty good. Note this includes things like writing out data, so the actual performance of just the pyais library will be faster.
from pyais.
Thank you very much for your detailed tests - I really appreciate it. 👍
Looks like there is an issue with asdict attempting to cast enums which are None...
This is closed thanks to your PR. I also added a unittest for this.
TypeError: cannot convert 'NoneType' object to bytes
This is also fixed and should not occur anymore. The reason was that the converter function was called even if the value was None, which could lead to errors. Now None values are simply omitted. Matching unittests were added.
Cannot decode messages with a fragment number larger than 1 using single message decoder
This seems like an error of your code, right?
I also added a new base exception: AISBaseException, from which all other exceptions inherit. Instead of plain ValueErrors, pyais now raises slightly more verbose exceptions. For example, ValueError: Partno 2 is not allowed! is now UnknownPartNoException(Partno 2 is not allowed!). This is done so that a user can simply catch all AISBaseExceptions. Any other error that is not an AISBaseException instance is then most likely a bug in pyais.
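The benefit of a common base class can be sketched like this. The two exception classes below are local stand-ins that reuse the names mentioned above so the snippet runs without pyais installed; `decode_part` and `safe_decode` are hypothetical helpers.

```python
class AISBaseException(Exception):
    """Local stand-in for the library's base exception."""


class UnknownPartNoException(AISBaseException):
    """Local stand-in for one of the more specific exceptions."""


def decode_part(partno: int) -> int:
    # Hypothetical decoder step that only accepts part numbers 0 and 1.
    if partno not in (0, 1):
        raise UnknownPartNoException(f"Partno {partno} is not allowed!")
    return partno


def safe_decode(partno: int):
    try:
        return decode_part(partno)
    except AISBaseException as exc:
        # Expected, data-related failure: record it and move on.
        return f"skipped: {exc}"
    # Anything else (TypeError, IndexError, ...) is not caught here and
    # propagates, which points at a bug rather than at bad input.


print(safe_decode(1))
print(safe_decode(2))
```

A bulk pipeline can then count the skipped messages per exception type while still failing loudly on unexpected errors.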
Also just want to double check have any of the types changed for values returned compared to v1 aside from the Booleans and Enums?
I don't think so.
May I ask where you got your large dataset from? It would be great, if I could use such a large dataset as kind of an integration test.
Again, thanks for your help. :-)
I guess, that we're ready for a v2 release, don't you think so?
from pyais.
I guess I will wait for you to decode the ~203M dataset. The new version will then be published next weekend! :-)
from pyais.
Here is the result using the latest version:
decodingError | count |
---|---|
null | 232008924 |
Cannot decode messages with a fragment number larger than 1 using single message decoder | 2926 |
Message type 31 is not supported! | 2843 |
Message type 30 is not supported! | 1572 |
Message type 28 is not supported! | 1250 |
Message type 29 is not supported! | 1123 |
Message type 39 is not supported! | 730 |
Message type 42 is not supported! | 653 |
Message type 56 is not supported! | 602 |
Message type 52 is not supported! | 595 |
Message type 55 is not supported! | 552 |
Message type 48 is not supported! | 501 |
Message type 47 is not supported! | 427 |
Message type 54 is not supported! | 405 |
Message type 45 is not supported! | 119 |
Message type 63 is not supported! | 91 |
Message type 59 is not supported! | 62 |
Message type 61 is not supported! | 41 |
Message type 62 is not supported! | 33 |
Message type 60 is not supported! | 33 |
Message type 46 is not supported! | 32 |
Message type 58 is not supported! | 26 |
Message type 49 is not supported! | 23 |
Traceback (most recent call last): File "", line 19, in decodeNMEA File "pyais/messages.py", line 314, in decode return MSG_CLASS[self.ais_id].from_bitarray(self.bit_array) File "pyais/messages.py", line 1091, in from_bitarray raise UnknownPartNoException(f"Partno {partno} is not allowed!") pyais.exceptions.UnknownPartNoException: Partno 2 is not allowed! | 18 |
Message type 44 is not supported! | 18 |
Message type 57 is not supported! | 18 |
Traceback (most recent call last): File "", line 19, in decodeNMEA File "pyais/messages.py", line 314, in decode return MSG_CLASS[self.ais_id].from_bitarray(self.bit_array) File "pyais/messages.py", line 1091, in from_bitarray raise UnknownPartNoException(f"Partno {partno} is not allowed!") pyais.exceptions.UnknownPartNoException: Partno 3 is not allowed! | 15 |
Message type 38 is not supported! | 13 |
Message type 40 is not supported! | 11 |
Message type 33 is not supported! | 3 |
Message type 53 is not supported! | 2 |
Message type 35 is not supported! | 1 |
Message type 43 is not supported! | 1 |
Will proceed with decoding the full dataset now and update with the metrics for that. I reckon this is pretty much ready for release.
from pyais.
Here are the results using the full dataset, a few new edge cases but overall extremely good.
decodingError | count |
---|---|
null | 5557216332 |
Cannot decode messages with a fragment number larger than 1 using single message decoder | 42007 |
Message type 31 is not supported! | 11346 |
Message type 30 is not supported! | 6346 |
Traceback (most recent call last): File "", line 19, in decodeNMEA File "/pyais/messages.py", line 314, in decode return MSG_CLASS[self.ais_id].from_bitarray(self.bit_array) File "/pyais/messages.py", line 1091, in from_bitarray raise UnknownPartNoException(f"Partno {partno} is not allowed!") pyais.exceptions.UnknownPartNoException: Partno 3 is not allowed! | 6090 |
Message type 29 is not supported! | 5109 |
Message type 28 is not supported! | 5029 |
Traceback (most recent call last): File "", line 19, in decodeNMEA File "/pyais/messages.py", line 314, in decode return MSG_CLASS[self.ais_id].from_bitarray(self.bit_array) File "/pyais/messages.py", line 1091, in from_bitarray raise UnknownPartNoException(f"Partno {partno} is not allowed!") pyais.exceptions.UnknownPartNoException: Partno 2 is not allowed! | 2976 |
Message type 42 is not supported! | 2972 |
Message type 39 is not supported! | 2931 |
Message type 56 is not supported! | 2613 |
Message type 52 is not supported! | 2526 |
Message type 55 is not supported! | 2205 |
Message type 48 is not supported! | 2192 |
Message type 54 is not supported! | 1767 |
Message type 47 is not supported! | 1709 |
Message type 45 is not supported! | 457 |
Message type 63 is not supported! | 98 |
Message type 62 is not supported! | 51 |
Traceback (most recent call last): File "", line 19, in decodeNMEA File "/pyais/messages.py", line 314, in decode return MSG_CLASS[self.ais_id].from_bitarray(self.bit_array) File "/pyais/messages.py", line 1174, in from_bitarray addressed: int = bit_arr[38] IndexError: bitarray index out of range | 47 |
Message type 32 is not supported! | 41 |
Message type 59 is not supported! | 41 |
Message type 46 is not supported! | 40 |
Message type 44 is not supported! | 39 |
Message type 58 is not supported! | 35 |
Message type 40 is not supported! | 31 |
Message type 61 is not supported! | 30 |
Message type 57 is not supported! | 24 |
Message type 60 is not supported! | 22 |
Message type 35 is not supported! | 22 |
Message type 50 is not supported! | 21 |
Traceback (most recent call last): File "", line 19, in decodeNMEA File "/pyais/messages.py", line 314, in decode return MSG_CLASS[self.ais_id].from_bitarray(self.bit_array) File "/pyais/messages.py", line 1274, in from_bitarray addressed: int = bit_arr[38] IndexError: bitarray index out of range | 20 |
Message type 49 is not supported! | 20 |
Traceback (most recent call last): File "", line 19, in decodeNMEA File "/pyais/messages.py", line 314, in decode return MSG_CLASS[self.ais_id].from_bitarray(self.bit_array) File "/pyais/messages.py", line 996, in from_bitarray if bit_arr[139]: IndexError: bitarray index out of range | 14 |
Message type 51 is not supported! | 14 |
Message type 38 is not supported! | 13 |
Message type 34 is not supported! | 10 |
Message type 53 is not supported! | 9 |
Message type 33 is not supported! | 9 |
Message type 43 is not supported! | 7 |
Message type 36 is not supported! | 6 |
Message type 41 is not supported! | 5 |
Message type 37 is not supported! | 3 |
from pyais.
Also just want to double check have any of the types changed for values returned compared to v1 aside from the Booleans and Enums?
I don't think so.
Looks like shiptype is now ship_type and type is now msg_type? I've made some changes for them already. Do you know if any other field names changed?
from pyais.
Thanks again for your valuable input. I looked at those IndexErrors. Those were indeed bugs still present in the code. These errors occurred for messages of type 22, 25 or 26. If these messages are very short, pyais tried to access an element of the bit array that did not exist. Example messages were:
!AIVDO,1,1,,A,F0001,0*74 (type 22)
!AIVDO,1,1,,A,Ig,0*65 (type 25)
!AIVDO,1,1,,A,Jgg,4*4E (type 26)
These are obviously garbage, but should not cause any crashes nonetheless.
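One possible way to guard such an access, shown as a sketch only. It is not necessarily how pyais fixed the problem, `get_bit` is a hypothetical helper, and a plain list of ints stands in for the bit array.

```python
from typing import Optional, Sequence


def get_bit(bits: Sequence[int], index: int) -> Optional[bool]:
    """Return the bit at `index`, or None if the payload is too short."""
    if index < 0 or index >= len(bits):
        # Truncated payload: report "unknown" instead of raising IndexError.
        return None
    return bool(bits[index])


# A truncated payload of 30 bits has no bit 38 (the flag read in the tracebacks above).
short_payload = [0] * 30
addressed = get_bit(short_payload, 38)
```

Here `addressed` ends up as None for the short payload, so the caller can decide whether to skip the field or reject the message.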
Regarding your concerns about field names:
You are right, I changed some names. type was renamed to msg_type to avoid shadowing Python's built-in type. Besides shiptype, I also found that name_extension was renamed to name_ext. I also renamed some attributes of the NMEAMessage class:
message_fragments -> frag_cnt
fragment_number -> frag_num
message_id -> seq_id
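For anyone migrating stored v1 output, the renames mentioned in this thread could be applied with a small mapping like the one below. This is only a sketch: `rename_keys` and the two dictionary names are hypothetical, and the list of renames may not be exhaustive.

```python
from typing import Any, Dict

# Decoded message fields renamed between v1 and v2, as listed in this thread.
MESSAGE_FIELD_RENAMES: Dict[str, str] = {
    "type": "msg_type",
    "shiptype": "ship_type",
    "name_extension": "name_ext",
}

# NMEAMessage attributes renamed between v1 and v2, as listed in this thread.
NMEA_ATTR_RENAMES: Dict[str, str] = {
    "message_fragments": "frag_cnt",
    "fragment_number": "frag_num",
    "message_id": "seq_id",
}


def rename_keys(record: Dict[str, Any], renames: Dict[str, str]) -> Dict[str, Any]:
    """Return a copy of `record` with old key names replaced by new ones."""
    return {renames.get(key, key): value for key, value in record.items()}
```

For example, `rename_keys({"type": 5, "shiptype": 70, "mmsi": 1}, MESSAGE_FIELD_RENAMES)` yields a record keyed by `msg_type` and `ship_type`, with unrelated keys such as `mmsi` left unchanged.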
from pyais.
Version 2.0.0 is released. 🥳
I would suggest that we treat new errors in dedicated issues. This thread is already quite huge and covers lots of different topics.
from pyais.
Awesome! And yes this did outgrow its initial cause. Thanks again for the help
from pyais.