fuzi's Introduction

Fuzi (斧子)

A fast & lightweight XML/HTML parser in Swift that makes your life easier. [Documentation]

Fuzi is based on a Swift port of Mattt Thompson's Ono(斧), using most of its low level implementations with moderate class & interface redesign following standard Swift conventions, along with several bug fixes.

Fuzi(斧子) means "axe", in homage to Ono(斧), which in turn is inspired by Nokogiri (鋸), which means "saw".

简体中文 日本語

A Quick Look

let xml = "..."
// or
// let xmlData = <some NSData or Data>
do {
  let document = try XMLDocument(string: xml)
  // or
  // let document = try XMLDocument(data: xmlData)
  
  if let root = document.root {
    // Accessing all child nodes of root element
    for element in root.children {
      print("\(element.tag): \(element.attributes)")
    }
    
    // Getting child element by tag & accessing attributes
    if let length = root.firstChild(tag:"Length", inNamespace: "dc") {
      print(length["unit"])     // `unit` attribute
      print(length.attributes)  // all attributes
    }
  }
  
  // XPath & CSS queries
  for element in document.xpath("//element") {
    print("\(element.tag): \(element.attributes)")
  }
  
  if let firstLink = document.firstChild(css: "a, link") {
    print(firstLink["href"])
  }
} catch let error {
  print(error)
}

Features

Inherited from Ono

  • Extremely performant document parsing and traversal, powered by libxml2
  • Support for both XPath and CSS queries
  • Automatic conversion of date and number values
  • Correct, common-sense handling of XML namespaces for elements and attributes
  • Ability to load HTML and XML documents from String, NSData, or [CChar]
  • Comprehensive test suite
  • Full documentation

Improved in Fuzi

  • Simple, modern API following standard Swift conventions, no more return types like AnyObject! that cause unnecessary type casts
  • Customizable date and number formatters
  • Some bug fixes
  • More convenience methods for HTML Documents
  • Access XML nodes of all types (Including text, comment, etc.)
  • Support for more CSS selectors (yet to come)

Requirements

  • iOS 8.0+ / Mac OS X 10.9+
  • Xcode 8.0+

Use version 0.4.0 for Swift 2.3.

Installation

There are 4 ways you can install Fuzi to your project.

Using CocoaPods

You can use CocoaPods to install Fuzi by adding it to your Podfile:

platform :ios, '8.0'
use_frameworks!

target 'MyApp' do
	pod 'Fuzi', '~> 1.0.0'
end

Then, run the following command:

$ pod install

Using Swift Package Manager

The Swift Package Manager is built into Xcode 11 (currently in beta). You can add Fuzi as a dependency by choosing File > Swift Packages > Add Package Dependency..., or by opening the Swift Packages tab of your project file and clicking +. Use https://github.com/cezheng/Fuzi as the repository, and Xcode should automatically resolve the current version.

Manually

  1. Add all *.swift files in the Fuzi directory to your project.
  2. In your Xcode project Build Settings:
    1. Find Search Paths, add $(SDKROOT)/usr/include/libxml2 to Header Search Paths.
    2. Find Linking, add -lxml2 to Other Linker Flags.

Using Carthage

Create a Cartfile or Cartfile.private in the root directory of your project, and add the following line:

github "cezheng/Fuzi" ~> 1.0.0

Run the following command:

$ carthage update

Then do the following in Xcode:

  1. Drag the Fuzi.framework built by Carthage into your target's General -> Embedded Binaries.
  2. In Build Settings, find Search Paths, add $(SDKROOT)/usr/include/libxml2 to Header Search Paths.

Usage

XML

import Fuzi

let xml = "..."
do {
  // if encoding is omitted, it defaults to String.Encoding.utf8
  let document = try XMLDocument(string: xml, encoding: String.Encoding.utf8)
  if let root = document.root {
    print(root.tag)
    
    // define a prefix for a namespace
    document.definePrefix("atom", defaultNamespace: "http://www.w3.org/2005/Atom")
    
    // get first child element with given tag in namespace(optional)
    print(root.firstChild(tag: "title", inNamespace: "atom"))

    // iterate through all children, with their positions
    for (index, element) in root.children.enumerated() {
      print("\(index) \(element.tag): \(element.attributes)")
    }
  }
  // you can also use CSS selectors against XMLDocument when you feel it makes sense
} catch let error as XMLError {
  switch error {
  case .noError: print("wth this should not appear")
  case .parserFailure, .invalidData: print(error)
  case .libXMLError(let code, let message):
    print("libxml error code: \(code), message: \(message)")
  }
}

HTML

HTMLDocument is a subclass of XMLDocument.

import Fuzi

let html = "<html>...</html>"
do {
  // if encoding is omitted, it defaults to String.Encoding.utf8
  let doc = try HTMLDocument(string: html, encoding: String.Encoding.utf8)
  
  // CSS queries
  if let elementById = doc.firstChild(css: "#id") {
    print(elementById.stringValue)
  }
  for link in doc.css("a, link") {
    print(link.rawXML)
    print(link["href"])
  }
  
  // XPath queries
  if let firstAnchor = doc.firstChild(xpath: "//body/a") {
    print(firstAnchor["href"])
  }
  for script in doc.xpath("//head/script") {
    print(script["src"])
  }
  
  // Evaluate XPath functions
  if let result = doc.eval(xpath: "count(/*/a)") {
    print("anchor count : \(result.doubleValue)")
  }
  
  // Convenient HTML methods
  print(doc.title) // gets <title>'s innerHTML in <head>
  print(doc.head)  // gets <head> element
  print(doc.body)  // gets <body> element
  
} catch let error {
  print(error)
}

I don't care about error handling

import Fuzi

let xml = "..."

// Don't show me the errors, just don't crash
if let doc1 = try? XMLDocument(string: xml) {
  //...
}

let html = "<html>...</html>"

// I'm sure this won't crash
let doc2 = try! HTMLDocument(string: html)
//...

I want to access Text Nodes

Not only text nodes, you can specify what types of nodes you would like to access.

let document = ...
// Get all child nodes that are Element nodes, Text nodes, or Comment nodes
document.root?.childNodes(ofTypes: [.Element, .Text, .Comment])

Migrating From Ono?

Looking at example programs is the swiftest way to know the difference. The following 2 examples do exactly the same thing.

Ono Example

Fuzi Example

Accessing children

Ono

[doc firstChildWithTag:tag inNamespace:namespace];
[doc firstChildWithXPath:xpath];
[doc firstChildWithCSS:css];
for (ONOXMLElement *element in parent.children) {
  //...
}
[doc childrenWithTag:tag inNamespace:namespace];

Fuzi

doc.firstChild(tag: tag, inNamespace: namespace)
doc.firstChild(xpath: xpath)
doc.firstChild(css: css)
for element in parent.children {
  //...
}
doc.children(tag: tag, inNamespace:namespace)

Iterate through query results

Ono

Conforms to NSFastEnumeration.

// simply iterating through the results
// mark `__unused` to unused params `idx` and `stop`
[doc enumerateElementsWithXPath:xpath usingBlock:^(ONOXMLElement *element, __unused NSUInteger idx, __unused BOOL *stop) {
  NSLog(@"%@", element);
}];

// stop the iteration at second element
[doc enumerateElementsWithXPath:xpath usingBlock:^(ONOXMLElement *element, NSUInteger idx, BOOL *stop) {
  *stop = (idx == 1);
}];

// getting element by index 
ONOXMLElement *nthElement = [(NSEnumerator*)[doc CSS:css] allObjects][n];

// total element count
NSUInteger count = [(NSEnumerator*)[doc XPath:xpath] allObjects].count;

Fuzi

Conforms to Swift's SequenceType and Indexable.

// simply iterating through the results
// no need to write the unused `idx` or `stop` params
for element in doc.xpath(xpath) {
  print(element)
}

// stop the iteration at second element
for (idx, element) in doc.xpath(xpath).enumerated() {
  if idx == 1 {
    break
  }
}

// getting element by index 
if let nthElement = doc.css(css)[n] {
  //...
}

// total element count
let count = doc.xpath(xpath).count

Evaluating XPath Functions

Ono

ONOXPathFunctionResult *result = [doc functionResultByEvaluatingXPath:xpath];
result.boolValue;    //BOOL
result.numericValue; //double
result.stringValue;  //NSString

Fuzi

if let result = doc.eval(xpath: xpath) {
  result.boolValue   //Bool
  result.doubleValue //Double
  result.stringValue //String
}

License

Fuzi is released under the MIT license. See LICENSE for details.

fuzi's People

Contributors

a-yasui, banjun, bryant1410, cezheng, devssun, joediv, jordanekay, klaas, mhmiles, mickael-menu-mantano, monoqlo, rayps, readmecritic, rendercoder, superlopuh, thabz, thebluepotato, tualatrix, valeriyvan

fuzi's Issues

css attribute selector

It seems queries on attributes are not working:

let x = tr.firstChild(css: "td[attribute=value]")?.stringValue

returns the wrong value (that of the first td, not the one that should be matched by attribute=value).

have issue with playground

I like testing frameworks in a playground (using this tool), which works fine with most pods such as Alamofire. Your framework needs to include libxml2, which complicates the setup, and it does not work. Can you help me look into this problem?

parseHTML: some questions

Here's my code

func parseHTML() {
    let html = "http://ec2-52-193-39-25.ap-northeast-1.compute.amazonaws.com/ForMyDear/"
    do {
        let doc = try HTMLDocument(string: html, encoding: String.Encoding.utf8)

        print(doc.title) // gets <title>'s innerHTML in <head>
        print(doc.head)  // gets <head> element
        print(doc.body)
    } catch let error {
        print(error)
    }
}

The print(doc.title) line shows a yellow warning:
Expression implicitly coerced from 'XMLElement?' to Any

Where is the problem?
I'm sorry, I'm new to Swift.
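
The warning text itself says the value handed to print is an optional ('XMLElement?'), which Swift flags whenever an optional is implicitly converted to Any. A minimal sketch of one way to avoid it, assuming doc.title is an optional string and that HTMLDocument(string:) parses the text it is given rather than downloading a URL; the html literal is a made-up sample:

```swift
import Foundation
import Fuzi

// Made-up sample markup. A URL string would need to be downloaded first;
// the assumption here is that HTMLDocument(string:) only parses text.
let html = "<html><head><title>For My Dear</title></head><body><p>Hello</p></body></html>"

do {
  let doc = try HTMLDocument(string: html, encoding: String.Encoding.utf8)

  // Supply a fallback instead of printing the optional directly.
  print(doc.title ?? "(no title)")

  // Unwrap the optional elements before printing them.
  if let head = doc.head {
    print(head.rawXML)
  }
  if let body = doc.body {
    print(body.rawXML)
  }
} catch let error {
  print(error)
}
```

Unwrapping with if let (or supplying a default with ??) gives print a non-optional value, so the coercion warning has nothing to report.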

Error in HTML parsing

Hello!
I encountered an error when attempting to parse this part of a page:

...
<i>
    <p class="center">Тут должен быть текст на кириллице. Много текста.<br>
    <br>
    И здесь еще чуть чуть.
    <br>
    </p>
</i>
...

And after using:

let doc = try HTMLDocument(string: mainHTMLFile, encoding: .utf8)

The resulting nodes look like this:

...
<i></i>
<p class="center">Тут должен быть текст на кириллице. Много текста.<br>
<br>
И здесь еще чуть чуть.
<br>
</p>
...

Using Swift, Xcode 8.3.2, pod 'Fuzi', '~> 1.0.0'

Update README

The README is somewhat out of date, as some descriptions are still written for Swift 3.

  • English
  • Chinese
  • Japanese

Redefinition of module 'libxml2' error when using framework that uses Fuzi

I ran into an issue using Fuzi via Carthage and two nested frameworks.
Framework A has a Carthage dependency on Fuzi and is itself a Carthage dependency of framework B.
Building framework A works fine, but framework B ends up with 2 module maps and a redefinition of libxml2 (as Fuzi is contained in framework A and is also pulled into B as a resolved dependency). I don't have much experience with module maps, so this might very well be an error on my side, or maybe a Carthage issue.

Parsing HTML and then pretty printing it

I'm trying to parse some html text and then pretty print the entire document. I couldn't tell what was the best way to traverse the hierarchy of nodes/elements and wasn't sure how to get the inner html content of a tag. I'm posting here because I think this could improve the documentation for the API.

    let html = "**** put some html text here. ****"

    let doc = try HTMLDocument(string: html, encoding: NSUTF8StringEncoding)

    if let root = doc.root {
        let str = self.dumpElement(root)
        print(str)
    }

    func dumpElement(element:XMLElement) -> String {
        var str = ""

        str = "<\(element.tag!.uppercaseString)"
        for attr in element.attributes {
            str += " \(attr.0)=\(attr.1)"
        }
        str += ">"
        let nodes = element.childNodes(ofTypes: [.Text])
        for node in nodes {
            str += node.stringValue
        }

        for el in element.children {
            str += self.dumpElement(el)
        }
        str += "</\(element.tag!.uppercaseString)>"
        return str
    }

Is this correct?

Anchor tag with table in it produces incorrect XMLElement.

Description:

  • Expected behaviour:
do {
    var string = "<a href=\"/something.pdf\"><table>click here to go to pdf</table></a>"
    let document = try HTMLDocument(string: string)
    print(document.body!)
} catch let error {
    print(error)
}

should print

<body>
<a href="/something.pdf"><table>click here to go to pdf</table></a>
</body>
  • Actual behaviour:
    prints
<body>
<a href="/something.pdf"></a><table>click here to go to pdf</table>
</body>

Environment

  • Package Manager:

    • [x ] Carthage, version: 0.12.0
    • CocoaPods, version:
    • Manually
  • Fuzi version: master

  • Xcode version: 8.2.1

How to reproduce:

See above. I know that this is very much an edge case. And probably by design. According to the HTML5 standard: The a element may be wrapped around entire paragraphs, lists, tables, and so forth, even entire sections, so long as there is no interactive content within (e.g. buttons or other links).
The code acts as expected when you wrap it in a div or a p tag. It's only the a tag that messes up.

dyld: Library not loaded: @rpath/Fuzi.framework/Fuzi

error:
dyld: Library not loaded: @rpath/Fuzi.framework/Fuzi
Referenced from: /Users/xxx/Library/Developer/Xcode/DerivedData/marksix-ccpkaihhjxcprxalsglvfbucokfl/Build/Products/Debug-iphonesimulator/xxxx.framework/markSixKit
Reason: image not found
(lldb)

swift 3.0
Xcode 8

cocoapods

Can I import this into my Objective-C project?

Thanks for the great library! I am using Ono and AFOnoResponseSerializer (https://github.com/AFNetworking/AFOnoResponseSerializer) but I found it doesn't work well for some CSS selectors even though they are not that complex. For example, even "h1.title" or ".highlighted .itemImg .img" do not seem to work. Does Fuzi work for these CSS selectors? If so, is there a way to import Fuzi into my Objective-C project and use it with AFOnoResponseSerializer? Thanks!

CSS selector expression generates XPath error

I use CSS selectors heavily, and when they are a bit more complex the XPathFromCSS routine fails.

The css selector .box-paging a:not(.active) generates the error

XPath error : Undefined namespace prefix
.//*[contains(concat(' ',normalize-space(@class),' '),' box-paging ')]/descendant::a:not([contains(concat(' ',normalize-space(@class),' '),' active) ')]

You can reproduce the error yourself using the following snippet

    func checkSelector() {
        let html = "<div class=\"box-paging\"><a class=\"active\" href=\"1.html\">1</a><a href=\"2.html\">2</a><a href=\"3.html\">3</a></div>"

        if let htmlDocument = try? HTMLDocument(string: html) {
            htmlDocument.css(".box-paging a:not(.active)")
        }

    }

Obviously the selector works fine using firebug or document.querySelectorAll()

Fuzi and AttributedString

Hi!
Is it possible to use Fuzi to render HTML inside a UILabel using AttributedString?

Thanks
-Paolo

Fuzi 2.0.0, iOS11, XCode 9.1beta - crash due to nil value - issue occurred after migration to Swift 4

Description:

I have migrated my code to Swift 4 and updated Fuzi to v2.0.0. My code:

//...
            let manifest = try String(contentsOfFile: "\(destination)/imsmanifest.xml");
            
            
            let document = try XMLDocument(string: manifest)
            
            if let root = document.root {
                printlog("\(root.tag)")
                // launcher
                if let launch = root.firstChild(tag: "resources")?.firstChild(tag: "resource")?.attr("href") {
                    launcher = launch
                    printlog("Found: \(launch)")
                }
                
                //mastery score
                if let _mastery = root.firstChild(tag: "organizations")?.firstChild(tag: "organization")?.firstChild(tag: "item")?.firstChild(tag: "masteryscore", inNamespace: "adlcp") {
                    printlog("Masteryscore: \(_mastery.numberValue?.intValue)")
                    mastery = _mastery.numberValue?.intValue
                }
//...

manifest file with commented out masteryscore

    <manifest xmlns="http://www.imsproject.org/xsd/imscp_rootv1p1p2" xmlns:adlcp="http://www.adlnet.org/xsd/adlcp_rootv1p2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" identifier="adapt_manifest" version="1" xsi:schemaLocation="http://www.imsproject.org/xsd/imscp_rootv1p1p2 imscp_rootv1p1p2.xsd http://www.imsglobal.org/xsd/imsmd_rootv1p2p1 imsmd_rootv1p2p1.xsd http://www.adlnet.org/xsd/adlcp_rootv1p2 adlcp_rootv1p2.xsd">
        <metadata>
            <schema>ADL SCORM</schema>
            <schemaversion>1.2</schemaversion>
            <lom xmlns="http://www.imsglobal.org/xsd/imsmd_rootv1p2p1" xsi:schemaLocation="http://www.imsglobal.org/xsd/imsmd_rootv1p2p1 imsmd_rootv1p2p1.xsd">
                <general>
                    <title>
                        <langstring xml:lang="x-none"><![CDATA[title]]></langstring>
                    </title>
                    <description>
                        <langstring xml:lang="x-none"><![CDATA[Test!]]></langstring>
                    </description>
                </general>
            </lom>
        </metadata>
        <organizations default="adapt_scorm">
            <organization identifier="adapt_scorm">
                <title><![CDATA[Adapt Version 2.0 demonstration]]></title>
                <item identifier="item_1" isvisible="true" identifierref="res1">
                    <title><![CDATA[Adapt Version 2.0 demonstration]]></title>
                     <!-- <adlcp:masteryscore>70</adlcp:masteryscore> -->
                </item>
            </organization>
        </organizations>
        <resources>
            <resource identifier="res1" type="webcontent" href="index_lms.html" adlcp:scormtype="sco">
                <file href="index_lms.html"/>
            </resource>
        </resources>
    </manifest>
  • Expected behaviour:
    To work as it did in the previous version, before the migration :)

  • Actual behaviour:
    The error points to line 142 in Helpers.swift:
    guard let prefix = node?.pointee.ns.pointee.prefix else {
    Thread 32: Fatal error: Unexpectedly found nil while unwrapping an Optional value

I've noticed that on 1.0.1 there is no "smiling helper" on that line if that matters:)

Environment

  • Package Manager:

    • Carthage, version:
    • [X ] CocoaPods, version:
    • Manually
  • Fuzi version: 2.0.0

  • Xcode version: 9.1beta (9b37)

How to reproduce:

How to get all html, and html from node element

I want to write my HTML page to a file like this:

do {
   try HTMLDocument(string: NSString(data: responseData, encoding: NSUTF8StringEncoding) as! String, encoding: NSUTF8StringEncoding).allhtml!.stringValue.writeToFile(filename, atomically: true, encoding: NSUTF8StringEncoding)
} catch {}

But how do I write the full file?

And how to get all html from node like this:

for node in doc.xpath(xpath) {
            node.HTML_with_text
}

Is this possible ?
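
The README's HTML example prints link.rawXML for each matched element, so that property may be what is wanted here. A rough sketch, assuming rawXML on the root element serialises the whole tree; the sample markup and the output path are placeholders:

```swift
import Foundation
import Fuzi

// Placeholder markup standing in for the downloaded page.
let html = "<html><body><div id=\"a\">Hello <b>world</b></div></body></html>"

// Placeholder output location.
let path = NSTemporaryDirectory() + "page.html"

do {
  let doc = try HTMLDocument(string: html, encoding: String.Encoding.utf8)

  // Markup (tags and text) of every node matched by an XPath query.
  for node in doc.xpath("//div") {
    print(node.rawXML)
  }

  // Serialise from the root element and write the result to a file.
  if let root = doc.root {
    try root.rawXML.write(toFile: path, atomically: true, encoding: .utf8)
  }
} catch let error {
  print(error)
}
```

Note that the parser may normalise the markup, so the written file is not guaranteed to match the downloaded bytes exactly.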

Swift Compiler Error: No such module "libxml2"

Description:

Header Search Paths: $(SDKROOT)/usr/include/libxml2
Other Linker Flags: -lxml2

  • Expected behaviour:

  • Actual behaviour:
    Element.swift: No such module "libxml2"

Environment

  • Package Manager:

    • Carthage, version:
    • CocoaPods, version:
    • Manually
  • Fuzi version:master

  • Xcode version:9.2

How to reproduce:

Xcode 9.4 error

Pods/Fuzi/Sources/Queryable.swift:329:64:
'range(at:)' has been renamed to 'rangeAt(_:)'

Only namespace prefixes bound on the root node are considered by document.xpath(_ xpath: String)

Description:

  • Expected behaviour:

XML namespaces can be bound to a prefix at any element in a document. They can be bound within an element that uses that namespace prefix. XPath expressions should be able to use the same prefixes. E.g given the document:

<root>
    <u:BrowseResponse xmlns:u="urn:foo.bar" />
</root>

the XPath expression /root/u:BrowseResponse should resolve to a nodelist of 1 element.

  • Actual behaviour:

The XPath expression /root/u:BrowseResponse resolves to a nodelist of 0 elements and the following is printed

XPath error : Undefined namespace prefix
XPath error : Invalid expression

If the document is amended such that the prefix u: is bound at the root element the xpath expression evaluates correctly.

Environment

  • Package Manager:
    • CocoaPods, version: 1.2.2
  • Fuzi version: 1.0.1
  • Xcode version: 8.3.3

How to reproduce:

The following test fails, demonstrating the issue.

    func testNestedNamespacePrefixWithXPath() {
        let docStr = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root><u:BrowseResponse xmlns:u=\"urn:foo.bar\"/></root>"

        do {
            let document = try Fuzi.XMLDocument(string: docStr)
            let nodes = document.xpath("/root/u:BrowseResponse")
            XCTAssertFalse(nodes.isEmpty)
        } catch let error {
            print(error)
            XCTFail(String(describing:error))
        }
    }

Issues on Swift 4.

Environment

  • Fuzi version: 2.0.0
  • Xcode version: 9.0

How to reproduce:

Compile any Swift 4 project. Issues with rangeAt(1) / range(at: 1).

Remove element

Hello, is there any way to add the ability to remove an Element from the document tree?

I tried to add some code

	open func remove() {
		xmlUnlinkNode(cNode)
		xmlFreeNode(cNode)
	}

to XMLElement, but it does nothing. The content remains the same.

HTML parsing error

I'm encountering a bug when I attempt to parse this page:
https://itunes.apple.com/us/podcast/the-adam-carolla-show/id306390087?mt=2&i=372787085

Here is an excerpt of the source:

                <span class="ac-gn-searchview-close-wrapper">
                    <span class="ac-gn-searchview-close-left"></span>
                    <span class="ac-gn-searchview-close-right"></span>
                </span>
            </button>
        </aside>
        <aside class="ac-gn-bagview" data-analytics-region="bag">
            <div class="ac-gn-bagview-scrim"> <span class="ac-gn-bagview-caret ac-gn-bagview-caret-small"></span> </div>
            <div class="ac-gn-bagview-content" id="ac-gn-bagview-content"> </div>
        </aside>
   </div>
</nav>
...
</html>

Here is the result of printing the XMLDocument:

                <span class="ac-gn-searchview-close-wrapper">
                    <span class="ac-gn-searchview-close-left"/>
                    <span class="ac-gn-searchview-close-right"/>
                </span>
            </button>
        </nav>
        <aside class="ac-gn-bagview" data-analytics-region="bag">
            <div class="ac-gn-bagview-scrim"> <span class="ac-gn-bagview-caret ac-gn-bagview-caret-small"/> </div>
            <div class="ac-gn-bagview-content" id="ac-gn-bagview-content"> </div>
        </aside>
    </body>
</html>

Somehow the nav tag is getting closed early and cutting off the rest of the document.

FUZI support for Swift 3

Hi Cezheng,

Our team needs to use Fuzi with Swift 3 and is currently developing on Xcode Beta 8.3. It works well on Swift 2.2, but building for Swift 3 with Xcode Beta 8.3 produces many errors.

Is there any official update from your end to support Swift 3? If not, what can I do to use Fuzi with Swift 3?

More importantly, we are also going to use Fuzi for parsing on Linux. Will this work? Is there a way to make it work on Linux too? Are there any dependencies to add? Please guide me on using Fuzi in this context.

Thanks for the library, and to the team.

rawXML self-closes empty tags

rawXML self-closes empty tags, which can cause problems (with frames in my case).

I've changed this function to look like this, and it seems to have fixed the problem:

import <libxml2/libxml/xmlsave.h> (in libxml2-fuzi.h)

public private(set) lazy var rawXML: String = {
  let buffer = xmlBufferCreate()
  let ctxt = xmlSaveToBuffer(buffer, nil, Int32(XML_SAVE_NO_EMPTY.rawValue))
  xmlSaveTree(ctxt, self.cNode)
  xmlSaveClose(ctxt)
  let dumped = ^-^xmlBufferContent(buffer) ?? ""
  xmlBufferFree(buffer)
  return dumped
}()

Parsing invalid document will crash

When creating an XMLDocument instance like this:

guard let XML = try? XMLDocument(data: validData) else {
    return .failure(DataRequest.createSerializerFailure("Invalid XML"))
}

alternatively:

guard let XML = try? XMLDocument(string: String(data: validData, encoding: .utf8)!) else {
    return .failure(DataRequest.createSerializerFailure("Invalid XML"))
}

Then I would expect the .failure block to be run. Instead, I get fatal error: unexpectedly found nil while unwrapping an Optional value in XMLDocument.init:
[screenshot: 2016-10-26 at 13:01:24]

validData contains the UTF8 string "RES" (just some garbage, really).

This used to work, but now crashes. Is that related to Swift 3?

Update for Xcode 8 beta 6 / Swift 3.0 beta 6

@cezheng Hi! The very recent Xcode update brought many changes to Swift 3.0, especially the end of implicitly bridged types and changes to pointers. Because the latter are heavily used in Fuzi, it won't compile. I tried fixing it myself but I really have trouble understanding how Swift handles pointers...

And a little heads up for you : they changed the whole "public/private" system and from what I've gathered from the release notes open is the old public and fileprivate is the old private :

  • A declaration marked as private can now only be accessed within the lexical scope it is declared in
    (essentially the enclosing curly braces {}). A private declaration at the top level of a file can be
    accessed anywhere in that file, as in Swift 2. The access level formerly known as private is now called
    fileprivate. (SE-0025)
  • Classes declared as public can no longer be subclassed outside of their defining module, and
    methods declared as public can no longer be overridden outside of their defining module. To allow
    a class to be externally subclassed or a method to be externally overridden, declare them as open,
    which is a new access level beyond public.

Root Element with attributes not working

Hi Team,

Fuzi has been good to use. I have a problem, though: when my XML document's root element has attributes, it does not work.

When I remove the attributes from the root element, it works like a charm and gives the desired results.

Please tell me how to solve this.

Issue with XMLNode.stringValue

Hello! I'm trying to parse some HTML text from a website and I get strange behaviour from the XMLNode.stringValue property. Part of the HTML looks like this:

...
<div class="main_text" itemprop="articleBody">
Не так давно 
<a target="_blank" href="https://somelinkhere.com">мы писали</a>
, как журналист Game Informer пытался выяснить хоть что-нибудь о судьбе 
<b>Half-Life 3</b>
, так что её просто не существует. Разговоры о третьей части в команде разработчиков действительно были, но в середине производства  
<br>
...

The plain text is Cyrillic. When I print .stringValue for elements, I get random \320 and \321 sequences in the text, and more. Some lines of text get repeated multiple times. For the last part of the given HTML, print(element.stringValue) looks like this:

, так что её просто не существует. Разговоры о третьеё просто не существует. Разговоры о третье\320й части в команде разработчиков действё просто не существует. Разговоры о третье\320й части в команде разработчиков действ\320ительно были, но в середине ё просто не существует. Разговоры о третье\320й части в команде разработчиков действ\320ительно были, но в середине \320производства 

The text duplication is random each time I parse it. One time this part is OK, another time it gets all mixed up. Although .rawXML doesn't contain those duplicates, the \321 and \320 sequences can be found there too.

Using Swift, Xcode 8.2, pod 'Fuzi', '~> 1.0.0'

Is there any workaround?

Thank you.

Init doesn't fail for invalid XML

Description:

  • Expected behaviour:
    Reading a broken XML throws an error.
  • Actual behaviour:
    Fuzi recovers and parses the XML partially (up to the error).

Environment

  • Package Manager: CocoaPods
  • Fuzi version: 1.0.1
  • Xcode version: 8.3.2

How to reproduce:

While unit testing our library, we want to catch that our parser fails on invalid documents. Instead, the broken file gets partially parsed, because Fuzi sets the option to ignore errors (XML_PARSE_NOERROR): https://github.com/cezheng/Fuzi/blob/master/Sources/Document.swift#L112

As the initializers are failable, as a user we'd expect it to fail for invalid documents. Now, I can understand that not all users might want this, but could it be made available as an option? For example, modify the initialisers to accept an extra options parameter, by default set to

Int32(XML_PARSE_NOWARNING.rawValue | XML_PARSE_NOERROR.rawValue | XML_PARSE_RECOVER.rawValue)

Even better would be to provide a swift OptionSet that maps to the libxml values.
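
The OptionSet idea could look roughly like the sketch below. ParseOptions and its member names are hypothetical and not part of Fuzi; the raw values are hard-coded to match libxml2's xmlParserOption constants so the set can be passed through as an Int32.

```swift
// Hypothetical option set, not part of Fuzi. Raw values mirror libxml2's
// xmlParserOption constants so rawValue can be handed to the parser as-is.
struct ParseOptions: OptionSet {
  let rawValue: Int32

  init(rawValue: Int32) {
    self.rawValue = rawValue
  }

  static let recover   = ParseOptions(rawValue: 1 << 0) // XML_PARSE_RECOVER
  static let noError   = ParseOptions(rawValue: 1 << 5) // XML_PARSE_NOERROR
  static let noWarning = ParseOptions(rawValue: 1 << 6) // XML_PARSE_NOWARNING

  // The combination quoted above: recover quietly from errors.
  static let lenient: ParseOptions = [.recover, .noError, .noWarning]

  // No recovery, so a broken document could surface as a failure.
  static let strict: ParseOptions = []
}

let options: ParseOptions = .lenient
print(options.rawValue)
print(options.contains(.recover))
```

An initializer taking such a parameter with .lenient as its default would keep existing call sites unchanged while letting tests opt into .strict.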

Example faulty XML:

<?xml version="1.0" encoding="utf-8"?>
<resources>
  <>
  <color name="ArticleTitle">#33fe66</color>
</resources>

Get access to items inside XMLElement

Hi,
I have HTML page like:

...
// Product 1
<div class="lst_main">
    <a href="link1.html">
    <span> Product 1 name </span>
    <div class="lst_meta">
        <span> Product1 $price</span>
    </div>
</div>
// Product 2
<div class="lst_main">
</div>
....
// Product N
<div class="lst_main">
</div>
....

I need to get the URL, name and price for each product. Here is my code:

let document = try HTMLDocument(string: htmlString, encoding: String.Encoding.utf8)
for productSection in document.xpath("//div[@class='lst_main']") {
}

Using this code I can get every product, but I could not work out how to get the link, name and price for each product.

Thank you.
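
One possible approach is to run further queries on each matched element. The sketch below assumes that an element accepts the same firstChild(xpath:) call the README shows on documents, and that a path starting with "." is evaluated relative to that element. The sample markup and the Product type are made up for illustration.

```swift
import Foundation
import Fuzi

// Made-up sample using the same class names as the page in the question.
let htmlString = "<html><body>"
  + "<div class=\"lst_main\">"
  + "<a href=\"link1.html\"><span> Product 1 name </span></a>"
  + "<div class=\"lst_meta\"><span> Product 1 price </span></div>"
  + "</div>"
  + "</body></html>"

// Hypothetical container for the extracted values.
struct Product {
  let link: String?
  let name: String?
  let price: String?
}

var products: [Product] = []

do {
  let document = try HTMLDocument(string: htmlString, encoding: String.Encoding.utf8)
  for productSection in document.xpath("//div[@class='lst_main']") {
    // Assumption: "." makes each query relative to productSection.
    let link = productSection.firstChild(xpath: ".//a")?["href"]
    let name = productSection.firstChild(xpath: ".//a/span")?.stringValue
    let price = productSection.firstChild(xpath: ".//div[@class='lst_meta']/span")?.stringValue
    products.append(Product(
      link: link,
      name: name?.trimmingCharacters(in: .whitespacesAndNewlines),
      price: price?.trimmingCharacters(in: .whitespacesAndNewlines)))
  }
} catch let error {
  print(error)
}

for product in products {
  print(product.link ?? "-", product.name ?? "-", product.price ?? "-")
}
```

If the real page nests the name and price differently, only the three relative paths need to change.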

Find youtube video code - question

Description:

  • Expected behaviour: question

  • Actual behaviour: question

Environment

  • Package Manager:

    • Carthage, version:
    • CocoaPods, version: latest
    • Manually
  • Fuzi version: latest

  • Xcode version: latest

How to reproduce:

Hello! Thanks for the library. I have some shortcodes in my HTML, all of them inside square brackets []. How can I find them and extract the video code?

[vc_video link=’https://youtu.be/xxxxxxxxxxx’]
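
Shortcodes like this are plain text rather than markup, so a regular expression over the string may be simpler than an XML query. A small sketch using Foundation only; the videoLinks helper is hypothetical, and the pattern assumes the exact vc_video link=… shape shown above, with straight or typographic quotes:

```swift
import Foundation

/// Hypothetical helper: returns the link of every `[vc_video link=…]`
/// shortcode found in `text`.
func videoLinks(in text: String) -> [String] {
  // Accept straight or typographic quotes around the URL.
  let pattern = "\\[vc_video\\s+link=['\"‘’]([^'\"‘’\\]]+)['\"‘’]\\]"
  guard let regex = try? NSRegularExpression(pattern: pattern) else {
    return []
  }
  let range = NSRange(text.startIndex..., in: text)
  return regex.matches(in: text, range: range).compactMap { match in
    // Capture group 1 holds the URL between the quotes.
    Range(match.range(at: 1), in: text).map { String(text[$0]) }
  }
}

let sample = "Intro [vc_video link=’https://youtu.be/xxxxxxxxxxx’] outro"
let links = videoLinks(in: sample)

// For youtu.be links, the video code is the last path component.
let codes = links.compactMap { URL(string: $0)?.lastPathComponent }
print(links, codes)
```

The text to search could come from an element's stringValue once the surrounding HTML has been parsed.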

Build fails with xcode 9.3 (Beta)

Description:

Building Fuzi with Xcode 9.3 fails with the error Redefinition of module 'libxml2'
[ I think the issue is caused by the file module.modulemap. The error should go away if it is renamed to something else, such as Fuzi_module.modulemap.]

  • Expected behaviour:
    Successful Build without errors.

  • Actual behaviour:
    Build Failure.

Environment

  • Package Manager:

    • Carthage, version:
    • CocoaPods, version:1.4
    • Manually
  • Fuzi version: Latest (2.0.1)

  • Xcode version: 9.3

How to reproduce:

Just add Fuzi to a new project using CocoaPods and try building with Xcode 9.3.
Sample -> Fuzi_xcode93.zip

Iterate Text Nodes

Hey!

Is it possible to also get text nodes when iterating over the children of an XMLElement? Right now we use xpath with the //text() selector for this.

br
denis
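
For reference, a minimal sketch of how this might look, assuming the installed Fuzi version provides `childNodes(ofTypes:)` together with the `.Element` and `.Text` node types (older releases may not). `childDescriptions(of:)` is a hypothetical helper name.

```swift
import Fuzi

// Assumption: this Fuzi version has `childNodes(ofTypes:)` and `XMLNode.type`.
// Lists the direct children of `element`, text nodes included, in document order.
func childDescriptions(of element: Fuzi.XMLElement) -> [String] {
    return element.childNodes(ofTypes: [.Element, .Text]).map { node in
        let kind = node.type == .Text ? "text" : "element"
        return "\(kind): \(node.stringValue)"
    }
}
```

Unlike `//text()`, this only visits direct children, so nested text has to be reached by recursing into the element nodes.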

Unsupported Architectures when uploading to App Store

I've integrated Fuzi using Carthage as described in the README.
The code works great when I run it in the simulator.
However, I'm getting the following errors when trying to upload to the App Store:

ERROR ITMS-90087: "Unsupported Architectures. The executable for StockWatch.app/Frameworks/Fuzi.framework contains unsupported architectures '[x86_64, i386]'."
ERROR ITMS-90209: "Invalid Segment Alignment. The app binary at 'StockWatch.app/Frameworks/Fuzi.framework/Fuzi' does not have proper segment alignment. Try rebuilding the app with the latest Xcode version."

Any idea?

Can't import Fuzi into new project

This is the error I get when I import Fuzi into my project using CocoaPods version 0.39.0 (/Library/Ruby/Gems/2.0.0/gems/cocoapods-0.39.0/lib/cocoapods.rb):
[Screenshot, 2015-11-24 4:12:51 pm]
[Screenshot, 2015-11-24 4:12:19 pm]
This is the step where Fuzi installed successfully.
My Xcode version is 7.1.1.
[Screenshot, 2015-11-24 4:16:14 pm]

[Fuzi 0.1.1, iOS9, Swift 2] XMLElement.firstChild(...) crash: "libswiftCore.dylib`_swift_abortRetainUnowned:", "attempted to retain deallocated object".

Input:

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
    <SOAP-ENV:Header />
    <SOAP-ENV:Body>
        <sub:rr xmlns:sub="http://www.rr.com/schema">
            <sub:error>
                <sub:code>1234</sub:code>
            </sub:error>
        </sub:rr>
    </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Code:

...
guard let rootXML = try? XMLDocument(data: responseEnvelopeData).root else {
   ...
}

...

if let _ = rootXML?.firstChild(tag: "Body")?.firstChild(tag: "rr")?.firstChild(tag: "error")?.stringValue {
... 
}

...

Crash:

(lldb) bt

  * thread #15: tid = 0x6b9ffb, 0x0000000108b1899f libswiftCore.dylib`_swift_abortRetainUnowned + 15, queue = 'NSOperationQueue 0x7fcf91d2fb50 :: NSOperation 0x7fcf91e4f7b0 (QOS: LEGACY)', stop reason = EXC_BREAKPOINT (code=EXC_I386_BPT, subcode=0x0)
    * frame #0: 0x0000000108b1899f libswiftCore.dylib`_swift_abortRetainUnowned + 15
      frame #1: 0x0000000108b18986 libswiftCore.dylib`swift_retainUnowned + 38
      frame #2: 0x0000000105a2398b Fuzi`Fuzi.XMLElement.firstChild (tag="Body", ns=nil, self=0x00007fcf91e50290)(tag : Swift.String, inNamespace : Swift.Optional<Swift.String>) -> Swift.Optional<Fuzi.XMLElement> + 411 at Element.swift:81
      ...
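
A possible workaround, offered as a guess from the backtrace and not a confirmed diagnosis: `swift_abortRetainUnowned` suggests that the element refers to its document through an unowned reference, and in the code above the document is a temporary that can be released as soon as `.root` has been read. The sketch below keeps the document alive for the whole lookup. `errorCode(in:)` is a hypothetical helper, and the extra `firstChild(tag: "code")` step is added only to return the inner value from the sample input.

```swift
import Foundation
import Fuzi

// Hypothetical helper: reads <sub:code> from the SOAP envelope shown above.
func errorCode(in responseEnvelopeData: Data) -> String? {
    // Bind the document to a constant instead of using it as a temporary.
    guard let document = try? Fuzi.XMLDocument(data: responseEnvelopeData),
          let root = document.root else {
        return nil
    }
    // Guarantees the document outlives every element access in the closure.
    return withExtendedLifetime(document) {
        root.firstChild(tag: "Body")?
            .firstChild(tag: "rr")?
            .firstChild(tag: "error")?
            .firstChild(tag: "code")?
            .stringValue
    }
}
```

If the crash persists with the document held this way, the cause lies elsewhere and this sketch does not apply.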

Framework issues

Hi,
I added the framework to my project manually but ran into some issues:

  1. I had to add libxml2.tbd to the framework.

  2. It only had a macOS framework. There is none for iOS, etc. Every OS should have a separate framework.

  3. It has not been updated to the recommended settings for Xcode 8.1.
