Comments (6)
Frankly, I'm not sure what you mean by streaming here. According to the XML standard, a document is either well-formed or not; well-formedness errors are fatal, that is, they unconditionally stop the parsing process. If a document does not have a valid element structure (e.g. there is no closing tag for some opening tag), then the document is not well-formed. EventReader strives to follow the specification; therefore, EndDocument should only be emitted if the root element is closed and there are no problems with the element structure. It is a parsing error otherwise. If EventReader behaves differently, it is a bug.
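To make the element-structure rule concrete, here is a minimal sketch of a nesting check. It is not how xml-rs works internally: check_nesting is a hypothetical helper, and it assumes a restricted subset of XML (no prolog, comments, processing instructions or CDATA, and no '>' inside attribute values).

```rust
/// Hypothetical helper, not part of xml-rs: checks only that every opening
/// tag has a matching closing tag, for a restricted XML subset.
fn check_nesting(input: &str) -> Result<(), String> {
    let mut open: Vec<&str> = Vec::new(); // names of currently open elements
    let mut rest = input;
    while let Some(start) = rest.find('<') {
        let end = rest[start..].find('>').ok_or("unterminated tag")? + start;
        let tag = &rest[start + 1..end];
        rest = &rest[end + 1..];
        if let Some(name) = tag.strip_prefix('/') {
            // Closing tag: it must match the most recently opened element.
            let name = name.trim();
            match open.pop() {
                Some(expected) if expected == name => {}
                Some(expected) => {
                    return Err(format!("expected </{}>, found </{}>", expected, name))
                }
                None => return Err(format!("unexpected </{}>", name)),
            }
        } else if !tag.ends_with('/') {
            // Opening tag (self-closing tags such as <a/> are skipped).
            open.push(tag.split_whitespace().next().unwrap_or(""));
        }
    }
    match open.last() {
        Some(name) => Err(format!("<{}> is never closed", name)),
        None => Ok(()),
    }
}

fn main() {
    assert!(check_nesting("<a><b/></a>").is_ok());
    // <b> has no closing tag, so the structure is rejected.
    assert!(check_nesting("<a><b></a>").is_err());
    // An unclosed root is rejected as well.
    assert!(check_nesting("<a>").is_err());
}
```

A real parser reports such a problem as a hard error at the point where it is detected, rather than scanning the whole input first.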
Additionally, each EventReader has only one Read instance as a backing source; I'm not sure how additional methods which provide more data come into the picture here.
Could you please explain in detail how exactly what you propose should work and why the existing API cannot do the same thing?
from xml-rs.
I’m currently trying to write an XMPP client library, and this protocol starts with a prolog, then a <stream:stream …> root element with various attributes and namespace declarations, and then children of this stream:stream will be received on various events. When the user disconnects, they send </stream:stream> followed by EOF and wait for the server to do the same.
The XML “document” that makes this session is well-formed (any error will lead to the stream being closed), and has a few restrictions like no PI, no comment, no doctype/entity declaration, etc.
I am currently (ab?)using the xml::EventReader::from_str method to parse a single stanza (the name given to the direct children of stream:stream), but the issue is that it creates a new parser for every element received, and doesn’t play well with the namespaces declared before.
My first prototype created an xml::EventReader from the std::net::TcpStream corresponding to the session, but it waited for the stream to reach EOF instead of emitting xml::XmlElements as they came.
I have since moved to mio::tcp::TcpStream, which provides non-blocking IO but doesn’t implement std::io::Read, and I also plan on adding TLS encryption to this stream, so pushing &str to the parser made sense, but I’m open to any better suggestion.
from xml-rs.
Hmm, I see now, thanks for the explanation.
If EventReader waited for stream EOF before emitting the next event, that is likely a bug. Ideally it should read no more than absolutely needed from the stream to produce the next event, so if the next stanza is fully available in the stream, it should result in an event immediately. This is worth investigating, thanks.
And I do see the problem with byte sources not implementing Read. First, are you sure that mio::tcp::TcpStream does not implement Read? Its documentation hints otherwise. Second, I'm really unwilling to add any other byte sources aside from Read instances. I would suggest the following approach.
EventReader would provide a &mut R reference to its internal reader, and a configuration option would be added to allow graceful passthrough of EOF errors. Then you can put a Cursor<Vec<u8>> inside the EventReader and push incoming data to it through the &mut R reference, while reading events from it by conventional means.
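The buffer side of this idea can be sketched with the standard library alone. The helpers push_bytes and drain_available are hypothetical names used only for illustration, and the parser itself is left out: the sketch only shows that a Cursor<Vec<u8>> reports end of input when it runs dry and then yields more data once new bytes are appended behind it.

```rust
use std::io::{Cursor, Read};

/// Hypothetical helper: appends newly received bytes to the buffer behind
/// the cursor without moving the read position.
fn push_bytes(source: &mut Cursor<Vec<u8>>, data: &[u8]) {
    source.get_mut().extend_from_slice(data);
}

/// Hypothetical helper: reads everything currently available. An empty
/// result means the cursor has reached its (temporary) end of input.
/// Assumes chunks never split a multi-byte UTF-8 character.
fn drain_available(source: &mut Cursor<Vec<u8>>) -> String {
    let mut out = String::new();
    source
        .read_to_string(&mut out)
        .expect("buffered input should be valid UTF-8");
    out
}

fn main() {
    let mut source = Cursor::new(Vec::new());

    push_bytes(&mut source, b"<stream:stream>");
    assert_eq!(drain_available(&mut source), "<stream:stream>");

    // Nothing new has arrived: the reader sees end of input for now.
    assert_eq!(drain_available(&mut source), "");

    // More data arrives later and becomes readable from the same cursor.
    push_bytes(&mut source, b"<message/>");
    assert_eq!(drain_available(&mut source), "<message/>");
}
```

The temporary end of input in the middle is exactly the case the proposed configuration option would have to let through instead of treating it as a fatal error.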
What do you think?
from xml-rs.
Hi, sorry for the late reply.
Making the parser emit XmlEvents before EOF (triggered every time new data is pushed into the mio::tcp::TcpStream) would be perfect for my use case. It will require the ability to close the document at some point, like after authentication or after STARTTLS negotiation, so I’m not sure if that’s perfect yet.
I was indeed wrong, mio::tcp::TcpStream does implement Read, so my previous arguments are void. :)
from xml-rs.
@linkmauve note that since the parser is pull-based, events are not triggered in response to new data in the stream; rather, upon the user's request the parser attempts to read more data from the input stream, and thus its behavior depends on the behavior of the source (it may block or return an error).
That said, I don't know how mio::tcp::TcpStream behaves when there is no data in the socket yet. Does it block, does it return EOF, or something else? If it blocks, then there is little xml-rs could do; if it returns some special value, probably an error, it will be propagated to the user, and we only need to allow the parser not to terminate upon certain errors.
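For reference, non-blocking sockets in Rust conventionally signal "no data yet" with an io::Error of kind WouldBlock, which is distinct from EOF (a read of zero bytes). The sketch below shows how a pull-style consumer could tell the three cases apart; ScriptedSource is a hypothetical stand-in for a socket, and ReadOutcome and poll_read are illustrative names, not xml-rs or mio API.

```rust
use std::collections::VecDeque;
use std::io::{self, ErrorKind, Read};

/// Hypothetical stand-in for a non-blocking socket: `Some(chunk)` is data
/// that has arrived, `None` is a moment with nothing to read, and an empty
/// queue means the peer closed the connection. Chunks are assumed non-empty.
struct ScriptedSource {
    steps: VecDeque<Option<Vec<u8>>>,
}

impl Read for ScriptedSource {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        match self.steps.pop_front() {
            Some(Some(chunk)) => {
                let n = chunk.len().min(buf.len());
                buf[..n].copy_from_slice(&chunk[..n]);
                if n < chunk.len() {
                    // Keep the unread tail for the next call.
                    self.steps.push_front(Some(chunk[n..].to_vec()));
                }
                Ok(n)
            }
            Some(None) => Err(ErrorKind::WouldBlock.into()),
            None => Ok(0), // EOF
        }
    }
}

#[derive(Debug, PartialEq)]
enum ReadOutcome {
    Data(Vec<u8>),
    NotReady,
    Eof,
}

/// Performs one read and classifies the result; only WouldBlock is treated
/// as recoverable, every other error is passed on to the caller.
fn poll_read<R: Read>(source: &mut R) -> io::Result<ReadOutcome> {
    let mut buf = [0u8; 1024];
    match source.read(&mut buf) {
        Ok(0) => Ok(ReadOutcome::Eof),
        Ok(n) => Ok(ReadOutcome::Data(buf[..n].to_vec())),
        Err(e) if e.kind() == ErrorKind::WouldBlock => Ok(ReadOutcome::NotReady),
        Err(e) => Err(e),
    }
}

fn main() -> io::Result<()> {
    let mut source = ScriptedSource {
        steps: VecDeque::from(vec![Some(b"<message/>".to_vec()), None]),
    };
    assert_eq!(poll_read(&mut source)?, ReadOutcome::Data(b"<message/>".to_vec()));
    assert_eq!(poll_read(&mut source)?, ReadOutcome::NotReady);
    assert_eq!(poll_read(&mut source)?, ReadOutcome::Eof);
    Ok(())
}
```

A parser that wants to survive the NotReady case has to keep its state intact and let the caller retry later, which is the "not to terminate upon certain errors" part.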
I'm also not sure what you mean by closing the document. The document will be "closed" automatically when the last closing tag arrives. This does not require special actions from the user.
from xml-rs.
With #146 being merged, this issue is now fixed, thanks!
from xml-rs.