
Comments (6)

netvl commented on May 31, 2024

Frankly, I'm not sure what you mean by streaming here. According to the XML standard, a document is either well-formed or not; well-formedness errors are hard errors, that is, they unconditionally stop the parsing process. If a document does not have a valid element structure (e.g. there is no closing tag for some opening tag), then the document is not well-formed. EventReader strives to follow the specification; therefore, EndDocument should only be emitted if the root element is closed and there are no problems with the element structure. Otherwise it is a parsing error. If EventReader behaves differently, that is a bug.

Additionally, each EventReader has only one Read instance as a backing source; I'm not sure how additional methods which provide more data come into the picture here.

Could you please explain in detail how exactly what you propose should work and why the existing API cannot do the same thing?

from xml-rs.

linkmauve commented on May 31, 2024

I’m currently trying to write an XMPP client library, and this protocol starts with a prolog, then a <stream:stream …> root element with various attributes and namespace declarations, and then children of this stream:stream will be received on various events. When the user disconnects, they send </stream:stream> followed by EOF and wait for the server to do the same.

The XML “document” that makes up this session is well-formed (any error will lead to the stream being closed), and has a few restrictions like no PI, no comment, no doctype/entity declaration, etc.

I am currently (ab?)using the xml::EventReader::from_str method to parse a single stanza (the name given to the direct children of stream:stream), but the issue is that it creates a new parser for every element received, and doesn’t play well with the namespaces declared before.

My first prototype created an xml::EventReader from the std::net::TcpStream corresponding to the session, but it waited for the stream to reach EOF instead of emitting xml::XmlElements as they came.

I have since moved to mio::tcp::TcpStream, which provides non-blocking IO but doesn’t implement std::io::Read, and I also plan on adding TLS encryption to this stream, so pushing &str to the parser made sense, but I’m open to any better suggestion.

netvl commented on May 31, 2024

Hmm, I see now, thanks for the explanation.

That EventReader waited for stream EOF before emitting the next event is likely a bug. Ideally it should read no more than absolutely needed from the stream to produce the next event, so if the next stanza is fully available in the stream, it should result in an event immediately. This is worth investigating, thanks.

And I do see the problem with byte sources not implementing Read. First, are you sure that mio::tcp::TcpStream does not implement Read? Its documentation hints otherwise. Second, I'm really unwilling to add any other byte sources aside from Read instances. I would suggest the following approach.

EventReader would provide a &mut R reference to its internal reader, and a configuration option would be added to allow graceful passthrough of EOF errors. Then you could put a Cursor<Vec<u8>> inside the EventReader and push incoming data to it through that &mut R reference, while reading events from it by conventional means.
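The buffer half of this idea can be sketched with the standard library alone. The sketch below does not use xml-rs, since the accessor and the EOF option are only proposed in this comment; `push_bytes` and `drain_available` are hypothetical helper names. It shows that a `Cursor<Vec<u8>>` reports `Ok(0)` once its data is exhausted and resumes reading after more bytes are appended through a mutable reference.

```rust
use std::io::{Cursor, Read};

/// Hypothetical helper: appends newly received bytes to the buffer behind
/// the cursor. The read position is left untouched.
fn push_bytes(source: &mut Cursor<Vec<u8>>, data: &[u8]) {
    source.get_mut().extend_from_slice(data);
}

/// Hypothetical helper: reads everything that is currently available and
/// returns it as a String. Stops at the first `Ok(0)`, which for a Cursor
/// means "no more data right now", not necessarily "end of session".
fn drain_available(source: &mut Cursor<Vec<u8>>) -> std::io::Result<String> {
    let mut out = Vec::new();
    let mut chunk = [0u8; 64];
    loop {
        let n = source.read(&mut chunk)?;
        if n == 0 {
            break;
        }
        out.extend_from_slice(&chunk[..n]);
    }
    Ok(String::from_utf8_lossy(&out).into_owned())
}
```

Calling `push_bytes` with one stanza and then `drain_available` returns that stanza; a second `drain_available` returns an empty string until more bytes are pushed. A parser sitting on top of such a source would need to treat that temporary `Ok(0)` as recoverable, which is what the proposed configuration option is for.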

What do you think?

linkmauve commented on May 31, 2024

Hi, sorry for the late reply.

Making the parser emit XmlEvents before EOF (triggered every time new data is pushed into the mio::tcp::TcpStream) would be perfect for my use case. It will require the ability to close the document at some point, like after authentication or after STARTTLS negotiation, so I’m not sure if that’s perfect yet.

I was indeed wrong, mio::tcp::TcpStream does implement Read, so my previous arguments are void. :)

netvl commented on May 31, 2024

@linkmauve note that since the parser is pull-based, events are not triggered in response to new data in the stream; rather, upon the user's request the parser attempts to read more data from the input stream, and thus its behavior depends on the behavior of the source (it may block or return an error).

That said, I don't know how mio::tcp::TcpStream behaves when there is no data in the socket yet. Does it block, return EOF, or do something else? If it blocks, then there is little xml-rs could do; if it returns some special value, probably an error, that value will be propagated to the user, and we only need to allow the parser not to terminate upon certain errors.
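To make the three possibilities concrete, here is a minimal standard-library sketch. It assumes the source reports "no data yet" as `io::ErrorKind::WouldBlock`, which is how non-blocking sockets in `std` behave; whether mio does the same is exactly the open question above. `MockNonBlocking`, `ReadOutcome`, and `poll_read` are hypothetical names used only for illustration.

```rust
use std::collections::VecDeque;
use std::io::{self, ErrorKind, Read};

/// Hypothetical stand-in for a non-blocking socket: yields queued chunks,
/// reports WouldBlock while the queue is empty and the peer is still
/// connected, and returns Ok(0) once the peer has closed the connection.
struct MockNonBlocking {
    chunks: VecDeque<Vec<u8>>,
    closed: bool,
}

impl Read for MockNonBlocking {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        match self.chunks.pop_front() {
            Some(mut chunk) => {
                let n = chunk.len().min(buf.len());
                buf[..n].copy_from_slice(&chunk[..n]);
                if n < chunk.len() {
                    // Keep the unread tail for the next call.
                    self.chunks.push_front(chunk.split_off(n));
                }
                Ok(n)
            }
            None if self.closed => Ok(0),
            None => Err(ErrorKind::WouldBlock.into()),
        }
    }
}

/// What a pull-based consumer can learn from a single read attempt.
#[derive(Debug, PartialEq)]
enum ReadOutcome {
    Data(Vec<u8>),
    NotReady,
    Eof,
}

/// Performs one read and classifies the result. WouldBlock is treated as
/// "try again later"; every other error is passed on to the caller.
fn poll_read<R: Read>(source: &mut R) -> io::Result<ReadOutcome> {
    let mut buf = [0u8; 256];
    match source.read(&mut buf) {
        Ok(0) => Ok(ReadOutcome::Eof),
        Ok(n) => Ok(ReadOutcome::Data(buf[..n].to_vec())),
        Err(e) if e.kind() == ErrorKind::WouldBlock => Ok(ReadOutcome::NotReady),
        Err(e) => Err(e),
    }
}
```

The point of the sketch is the `NotReady` arm: a parser that lets such an error through without discarding its internal state could be asked for the next event again once the socket becomes readable.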

I'm also not sure what you mean by closing the document. The document will be "closed" automatically when the last closing tag arrives. This does not require special actions from the user.

linkmauve commented on May 31, 2024

With #146 being merged, this issue is now fixed, thanks!
