smeans / pjxml

Pure JavaScript XML parser.

License: MIT License

CSS 0.48% HTML 10.21% JavaScript 89.31%

xml parsing tree javascript

pjxml's Introduction

Pure JavaScript XML (pjxml)

This package provides a very lightweight, forgiving XML parser. It is written entirely in JavaScript, works in Node.js, and has no package dependencies.

Parsing - Plain JavaScript

Include pjxml.js then call the parse() method. For a full example, see the index.html demo page in the pjxml GitHub repository.

var xml = '<document attribute="value"><name>David Bowie</name></document>';
var doc = pjXML.parse(xml);

Parsing - Node.js

Install the pjxml package, then include it using require.

var pjXML = require('pjxml');

var xml = '<document attribute="value"><name>David Bowie</name></document>';
var doc = pjXML.parse(xml);

the document tree

The parse() function returns a hierarchical object tree with each element mapped to one object. The text and element contents of each element are stored in an array in the content property. Any attributes are in the attributes property.
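
To make the tree shape concrete, here is a minimal sketch that walks a hand-written object literal. The literal only imitates the node shape shown in this README's example output (type, name, attributes, content); it is not produced by pjxml, and listElements is a hypothetical helper, not part of the library. Treating type 1 as "element" is an assumption based on that example output.

```javascript
// Hand-written stand-in for a parsed tree. The property names follow the
// example output in this README; the literal itself is an assumption.
const sampleTree = {
  type: 1,
  name: 'document',
  attributes: { attribute: 'value' },
  content: [
    { type: 1, name: 'name', attributes: {}, content: ['David Bowie'] }
  ]
};

// Hypothetical helper (not part of pjxml): collects every element node
// together with its nesting depth.
function listElements(node, depth = 0, out = []) {
  if (node.type === 1) {
    out.push({ name: node.name, depth: depth, attributes: node.attributes });
  }
  for (const child of node.content || []) {
    // Strings in content are text; objects are child nodes.
    if (typeof child === 'object' && child !== null) {
      listElements(child, depth + 1, out);
    }
  }
  return out;
}

const elements = listElements(sampleTree);
console.log(elements.map(e => '  '.repeat(e.depth) + e.name).join('\n'));
```

The same loop works on any node, since text and child elements share the one content array.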

finding things

The select() method supports a very minimal XPath selection syntax. It returns an array of all elements that match the path given. If only one node matches, it returns a single node instead of an array. The selectAll() method always returns an array.

The // operator matches recursively.

For example:

var el = doc.selectAll('//name');

returns

[{"type":1,"content":["David Bowie"],"name":"name","attributes":{}}]
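
Because select() may hand back either a single node or an array, code that receives its result may want to normalize it first. The sketch below is one way to do that; toArray is a hypothetical helper, not part of pjxml, and the handling of a missing result is an assumption, since this README does not say what select() returns when nothing matches. Calling selectAll() directly avoids the problem altogether.

```javascript
// Hypothetical helper (not part of pjxml): always returns an array.
function toArray(result) {
  // Assumption: treat a missing result as "no matches".
  if (result === undefined || result === null) {
    return [];
  }
  return Array.isArray(result) ? result : [result];
}

// Hand-written node in the shape shown in the example output above.
const nameNode = { type: 1, content: ['David Bowie'], name: 'name', attributes: {} };

const fromSingle = toArray(nameNode);              // single node, wrapped
const fromMany = toArray([nameNode, nameNode]);    // array, returned as-is
const fromNothing = toArray(undefined);            // nothing, empty array

console.log(fromSingle.length, fromMany.length, fromNothing.length);
```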

extracting text

The text() method returns all text content for a given element and its children:

console.log(doc.text());

returns

David Bowie
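
As a rough illustration of what gathering text from an element and its children involves, the sketch below concatenates every string found in the content arrays of a hand-written tree. collectText is a hypothetical helper and the tree literal is an assumption modeled on the example output; pjxml's own text() may treat whitespace or other node types differently.

```javascript
// Hand-written stand-in for a parsed tree (an assumption, not pjxml output).
const textTree = {
  type: 1,
  name: 'document',
  attributes: { attribute: 'value' },
  content: [
    { type: 1, name: 'name', attributes: {}, content: ['David Bowie'] }
  ]
};

// Hypothetical helper (not part of pjxml): depth-first concatenation of
// all string entries found in content arrays.
function collectText(node) {
  if (typeof node === 'string') {
    return node;
  }
  if (!node || !Array.isArray(node.content)) {
    return '';
  }
  return node.content.map(collectText).join('');
}

const allText = collectText(textTree);
console.log(allText);
```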

extracting XML

The xml() method returns valid XML for any given document or element object.

console.log(doc.xml());

returns

<document attribute="value"><name>David Bowie</name></document>

pjxml's Issues

Licensing

I'm thinking about using this in a project. Has this been released under an open source license like MIT?

Not sure how to select a path

Great that you made this module! I was using xml2js before, but needed something lightweight for simple stuff. But I'm running into problems with a device that spits out XML that is probably not fully compliant with the XML standard.

In the XML example below I'm trying to select everything from m:GetInfoResponse with:

const el = doc.select('//m:GetInfoResponse*');

But the result is not making much sense because it somehow includes 'soap-env:Envelope', which I definitely cannot use:

Node {
  type: 9,
  content:
   [ '',
     Node { type: 7, content: [Array] },
     '',
     Node {
       type: 1,
       content: [],
       name: 'soap-env:Envelope',
       attributes: [Object] },
     '/Description>        ',

So how can I solve this?

Thanks for your help!

<?xml version="1.0" encoding="UTF-8"?>
<soap-env:Envelope
        xmlns:soap-env="http://schemas.xmlsoap.org/soap/envelope/"
        soap-env:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
        >
<soap-env:Body>
    <m:GetInfoResponse
        xmlns:m="urn:NETGEAR-ROUTER:service:DeviceInfo:1">
        <ModelName>R7000</ModelName>
        <Description>Netgear Smart Wizard 3.0, specification 0.7 version</Description>
        <SerialNumber>3LG23B7106783</SerialNumber>
        <Firmwareversion>V1.0.9.12</Firmwareversion>
        <SmartAgentversion>3.0</SmartAgentversion>
        <FirewallVersion>ACOS NAT-Netfilter v3.0.0.5 (Linux Cone NAT Hot Patch 06/11/2010)</FirewallVersion>
        <VPNVersion>N/A</VPNVersion>
        <OthersoftwareVersion>1.2.23</OthersoftwareVersion>
        <Hardwareversion>R7000</Hardwareversion>
        <Otherhardwareversion>N/A</Otherhardwareversion>
        <FirstUseDate>Saturday, 20 Feb 2016 23:40:20</FirstUseDate>
    </m:GetInfoResponse>
    <ResponseCode>000</ResponseCode>
</soap-env:Body>
</soap-env:Envelope>

Self-closing XML tags seem to confuse the parser

In other words: <This/> notation for empty tags causes stuff to go wrong.
A simple .replace(/<\w+\/>/g, "") helps me work around the issue for now.
Thanks for sharing this handy piece of work!
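
Note that the replace() above deletes the empty elements and only matches tag names without attributes, whitespace, or namespace prefixes. A possible alternative, sketched below, rewrites each self-closing tag as an explicit open/close pair before parsing. expandSelfClosing is a hypothetical helper, not part of pjxml, and the regular expression is a deliberately simple assumption: it will misfire on attribute values that contain "/>" and it does not skip comments or CDATA sections.

```javascript
// Hypothetical pre-processing helper (not part of pjxml).
// Rewrites <name attrs/> as <name attrs></name>.
function expandSelfClosing(xml) {
  // Group 1: tag name, optionally with a namespace prefix.
  // Group 2: attribute text, if any (kept verbatim).
  const selfClosing = /<([A-Za-z_][\w.:-]*)((?:\s+[^<>]*?)?)\s*\/>/g;
  return xml.replace(selfClosing, '<$1$2></$1>');
}

const input = '<a><This/><m:item id="1" /></a>';
const expanded = expandSelfClosing(input);
console.log(expanded);
```

The expanded string can then be passed to the parser in place of the original input.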

Pull request to fix early decoding of &-encoded characters

Hi, I've been experimenting with using pjxml in an embedded project where I need a single source-file lightweight XML parser, and it seems ideal for my purpose.

But I noticed one parsing error in my testing. It came up when I had a node that had an encoded < (&lt;) at the end of its text content. The parser decoded it too early and then erroneously treated it as the beginning of a new node. I think I figured out the problem and have applied a fix that solves it for my test cases at least. You can find the details in the pull request I just submitted, #5.

Thanks!
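
The general problem can be reproduced without pjxml. The sketch below uses a toy tokenizer to show why entities are safer to decode after markup has been split into tags and text rather than before. decodeEntities and tokenize are hypothetical helpers written for this illustration; they are neither pjxml's code nor the fix in the pull request, and decodeEntities handles only the five predefined XML entities.

```javascript
// Hypothetical helper: decodes the five predefined XML entities only.
function decodeEntities(text) {
  const map = { lt: '<', gt: '>', amp: '&', quot: '"', apos: "'" };
  return text.replace(/&(lt|gt|amp|quot|apos);/g, (match, name) => map[name]);
}

// Toy tokenizer: splits a string into tag tokens and text tokens.
function tokenize(xml) {
  return xml.match(/<[^>]*>|[^<]+/g) || [];
}

const source = '<note>1 &lt;</note>';

// Early decoding: the decoded "<" is indistinguishable from markup,
// so it gets absorbed into the following tag token.
const early = tokenize(decodeEntities(source));

// Late decoding: split first, then decode text tokens only.
const late = tokenize(source).map(
  token => (token.startsWith('<') ? token : decodeEntities(token))
);

console.log(early); // [ '<note>', '1 ', '<</note>' ]
console.log(late);  // [ '<note>', '1 <', '</note>' ]
```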
