tobiaszcudnik / phpquery


Server-side implementation of jQuery in PHP5 (2009)

HTML 48.19% CSS 0.33% PHP 50.97% JavaScript 0.51%


phpquery's Issues

privacy-breach-w3c-valid-html, privacy-breach-generic

lintian on Debian found the following problems:

privacy-breach-w3c-valid-html test-cases/document-types/document-fragment-utf8.xhtml (http://www.w3.org/icons/valid-xhtml10)
N:
N:    This package creates a potential privacy breach by fetching W3C
N:    validation icons.
N:
N:    These badges may be displayed to tell readers that care has been taken
N:    to make a page compliant with W3C standards. Unfortunately, downloading
N:    the image from www.w3.org might expose the reader's IP address to
N:    potential tracking.
N:
N:    Note that these icons are non-free and must not be copied into the
N:    package. You could safely delete this W3C validation badge.
N:
N:    Refer to http://validator.w3.org/docs/help.html#icon and
N:    http://www.w3.org/Consortium/Legal/logo-usage-20000308 for details.
N:
N:    Severity: serious, Certainty: possible

privacy-breach-generic test-cases/document-types/document-fragment-utf8.xhtml (http://www.w3.org/tr/xhtml1/xhtml1.pdf) 
N:  
N:    This package creates a potential privacy breach by fetching data from an 
N:    external website at runtime. Please remove these scripts or external 
N:    HTML resources.
N:
N:    Please replace any scripts, images, or other remote resources with
N:    non-remote resources. It is preferable to replace them with text and
N:    links but local copies of the remote resources are also acceptable as
N:    long as they don't also make calls to remote services. Please ensure
N:    that the remote resources are suitable for Debian main before making
N:    local copies of them.
N:
N:    Severity: important, Certainty: wild-guess

Other affected files are:

privacy-breach-generic test-cases/document-types/document-iso88592-nocharset.xhtml (http://www.w3.org/tr/xhtml1/xhtml1.pdf)
privacy-breach-w3c-valid-html test-cases/document-types/document-iso88592-nocharset.xhtml (http://www.w3.org/icons/valid-xhtml10)
privacy-breach-generic test-cases/document-types/document-iso88592.xhtml (http://www.w3.org/tr/xhtml1/xhtml1.pdf)
privacy-breach-w3c-valid-html test-cases/document-types/document-iso88592.xhtml (http://www.w3.org/icons/valid-xhtml10)
privacy-breach-generic test-cases/document-types/document-utf8-nocharset.xhtml (http://www.w3.org/tr/xhtml1/xhtml1.pdf)
privacy-breach-w3c-valid-html test-cases/document-types/document-utf8-nocharset.xhtml (http://www.w3.org/icons/valid-xhtml10)
privacy-breach-generic test-cases/document-types/document-utf8.xhtml (http://www.w3.org/tr/xhtml1/xhtml1.pdf)
privacy-breach-w3c-valid-html test-cases/document-types/document-utf8.xhtml (http://www.w3.org/icons/valid-xhtml10)

Please publish a Composer package on packagist.org

Please publish phpQuery as a Composer package on packagist.org. Thanks.

Function split() is deprecated

The split() function has been changed to explode().
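
For context, split() was deprecated in PHP 5.3 and removed in PHP 7.0. A minimal sketch of the two usual replacements follows; the sample strings are made up for illustration and are not taken from the phpQuery source. explode() covers literal delimiters, and preg_split() covers the cases where split() was given a regular expression.

```php
<?php
// Literal delimiter: explode() is a drop-in replacement for split(',', ...).
$selectors = explode(',', 'h1,h2,h3');

// Regex delimiter: preg_split() replaces split() when the pattern is a
// regular expression (here: a comma with optional surrounding whitespace).
$parts = preg_split('/\s*,\s*/', 'h1 , h2,h3');
```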

Separate find queries for selector 'h1,h2,h3'

In original jQuery, $('h1,h2,h3') returns a collection of elements sorted by their position in the document.
In phpQuery, ['h1,h2,h3'] launches separate find queries for h1, h2 and h3, so the resulting collection is grouped by tag name in the order given in the selector.
This makes it impossible to quickly build a document's table of contents (heading hierarchy).
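
One possible workaround, sketched here with PHP's built-in DOMXPath rather than phpQuery, is to match all three heading levels in a single location path so the nodes come back in document order. The sample markup and the $toc variable are assumptions for illustration only.

```php
<?php
// Hypothetical sample document with interleaved heading levels.
$html = '<html><body><h1>A</h1><h2>A.1</h2><h1>B</h1><h3>B.1.1</h3></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// A single location path with a predicate yields nodes in document order,
// unlike three independent queries for h1, h2 and h3.
$toc = [];
foreach ($xpath->query('//*[self::h1 or self::h2 or self::h3]') as $node) {
    $toc[] = $node->nodeName . ': ' . $node->textContent;
}
```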

XML namespaces

The query namespace|* selects all elements, not only those from the specified namespace.
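
As a point of comparison (not a fix for phpQuery itself), the expected behaviour can be sketched with PHP's built-in DOMXPath, where a registered prefix followed by * matches only elements in that namespace. The namespace URI urn:example:a and the element names are made up for this example.

```php
<?php
// Hypothetical document mixing namespaced and non-namespaced elements.
$xml = '<root xmlns:a="urn:example:a">'
     . '<a:item>1</a:item><item>2</item><a:other>3</a:other>'
     . '</root>';

$dom = new DOMDocument();
$dom->loadXML($xml);

$xpath = new DOMXPath($dom);
$xpath->registerNamespace('a', 'urn:example:a');

// "a:*" matches every element in the registered namespace and nothing else.
$names = [];
foreach ($xpath->query('//a:*') as $node) {
    $names[] = $node->nodeName;
}
```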

not working: ->find() method with direct child selector

On an item node like this one,

<item>
    <artist>
        <name>Niagara</name>
    </artist>
    <name>Pendant que les champs brûlent</name>
</item>

$node->find('> name');

gets those nodes

<name>Niagara</name>

<name>Pendant que les champs brûlent</name>

while it should only get this one

<name>Pendant que les champs brûlent</name>

Is that a bug?

I could use

$node->children('name');

But I want to be able to select more than just direct descendants.

How can I do that?
Thanks.
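
For comparison, the child-versus-descendant distinction described above can be expressed with PHP's built-in DOMXPath, queried relative to the item node. This sidesteps phpQuery's selector engine and does not explain its behaviour; the variable names are illustrative.

```php
<?php
// The item node from the report, loaded as XML.
$xml = '<item><artist><name>Niagara</name></artist>'
     . '<name>Pendant que les champs brûlent</name></item>';

$dom = new DOMDocument();
$dom->loadXML($xml);
$item = $dom->documentElement;
$xpath = new DOMXPath($dom);

// "./name" matches only direct children of the context node.
$direct = $xpath->query('./name', $item);

// ".//name" matches descendants at any depth below the context node.
$all = $xpath->query('.//name', $item);
```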

Mangled JavaScript if it contains closing tags in strings

What steps will reproduce the problem?

Process HTML that contains a <script> tag with HTML in strings, for example:

<div>
<script>
  var html = [
    '<div>',
    '<select>',
    '</select>',
    '</div>',
  ];
</script>
</div>

What is the expected output?
I expect the JS code within <script> tag not to be changed.

What do you see instead?
Some closing tags, like </select> or </option>, are removed entirely, and others, like </div>, are changed to close open tags outside of <script>.

What version of the product are you using? On what operating system?
phpQuery 0.9.5
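
A common workaround for HTML parsers that end script content at the first "</" is to write "<\/" inside JavaScript strings, which JavaScript reads as the same characters. The helper below is a hypothetical pre-processing sketch, not part of phpQuery. It relies on a simple regular expression and assumes the script body does not itself contain a literal </script>.

```php
<?php
// Hypothetical helper: escape "</" inside <script> bodies before parsing.
function escapeClosingTagsInScripts(string $html): string
{
    $escaped = preg_replace_callback(
        '~(<script\b[^>]*>)(.*?)(</script>)~is',
        function (array $m): string {
            // Only the script body ($m[2]) is touched; the tags stay intact.
            return $m[1] . str_replace('</', '<\/', $m[2]) . $m[3];
        },
        $html
    );

    // preg_replace_callback() returns null on error; fall back to the input.
    return $escaped ?? $html;
}

$input  = "<div><script>var html = ['<div>', '</div>'];</script></div>";
$output = escapeClosingTagsInScripts($input);
```

The escaped markup would then be handed to the parser in place of the original string.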

In all github forks: Couldn't fetch DOMElement

Parsing works fine for some fetched sites, but for others it fails with this error:
Couldn't fetch DOMElement. Node no longer exists in .../phpQuery/phpQuery.php on line 148
It's the same with all forks I've found on github.
Any ideas?

UTF-8 issue when trying to create a DOM document

I have a page fetched with cURL whose charset is windows-1250 and whose doctype is

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

I change the encoding of my string, check it, and replace the meta charset in the string:

$html = str_replace('windows-1250', 'UTF-8', mb_convert_encoding($result, 'UTF-8'));
var_dump(mb_detect_encoding($html, "UTF-8, ASCII, ISO-8859-1, windows-1250"));
$Doc = \phpQuery::newDocumentHTML($html, 'UTF-8');
echo pq($Doc)->html();

All the UTF-8 characters are garbled. var_dump says it's UTF-8, and content-type="text/plain; charset=UTF-8".

When I var_dump($Doc), I see that the DOMDocument encoding and xmlEncoding values are null.

But if I am using:

$Dom = new \DOMDocument();
$Dom->loadHTML($html);

and var_dump it, then everything is fine and the characters are OK.

I've checked createDocumentWrapper and the $contentType is OK.

If I set the static $debug to true, I get this:

string 'Load markup for content type text/html;charset=utf-8' (length=52)

string 'Loading HTML, content type 'text/html;charset=utf-8'' (length=52)

string 'Full markup load (HTML):

' (length=275)

string 'DOC: UTF-8 REQ: UTF-8' (length=21)

string 'Full markup load (HTML), documentCreate('utf-8')' (length=48)

string 'Selecting document '52280a0c077ec7c5fb2f2350db12f22c' as default one' (length=68)
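
One detail in the report above may be worth checking: mb_convert_encoding() is called without a source encoding, so it assumes PHP's internal encoding rather than windows-1250. A minimal sketch of naming the source encoding explicitly follows, using iconv() on a made-up one-paragraph page. Whether this resolves the phpQuery behaviour described is not confirmed.

```php
<?php
// Hypothetical sample: "č" is byte 0xE8 in windows-1250.
$result = "<meta http-equiv=\"Content-Type\" "
        . "content=\"text/html; charset=windows-1250\"><p>\xE8</p>";

// Name the source encoding explicitly instead of relying on the default.
$html = iconv('WINDOWS-1250', 'UTF-8', $result);

// Keep the declared charset in the markup consistent with the new encoding.
$html = str_replace('windows-1250', 'UTF-8', $html);

// The /u modifier makes preg_match() fail on invalid UTF-8 input.
$isValidUtf8 = preg_match('//u', $html) === 1;
```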

pq() returns array in a random order

I tried to use it to parse a page like this:

phpQuery::newDocumentFile($search_url,"text/html;charset=utf8");
$array = pq(".results .wx-rb");
foreach ($array as $item) {
    echo pq($item)->find("a")->text();
}

but it outputs the items in random order, not the order in which they originally appear. How can I fix that?

Where is the documentation?

Joint repo @ phpquery/phpquery - gathering all edits & forks..

Dear Tobiasz,

dunno if you're actively pursuing phpquery anymore - likely not - however i'd like to inform you about a phpquery repo i put together with the aim to gather all the forks and edits in one coherent and complete repo as a single source for further development.

Current state: I pulled the googlecode SVN - thus preserving your original commit history - added some of the forks as branches, applied some tags and so on..

https://github.com/phpquery/phpquery

If you wish, i can hand over the ownership of that repo to you.. or alternatively to anyone who is willing to act as a future maintainer.

Feel free to contact me at lib..
cheers,
Jan

Parsing HTML5 data ..

Hi! I first downloaded original phpQuery code from Google Code site, and i love the project ;)

I had problems using PQ with some html5 markup ..
In particular, when i try to work with special scripts (e.g. Handlebars templates) inside a script tag, phpQuery breaks the code, because it treats the text inside as if it were normal HTML.

For example, consider "appending" this simple code:

$doc->append("<script> document.write('<div>Hello!</div>'); </script>");

PQ transforms it like this:

<script> document.write('<div>Hello!'); </script>

I had to use Masterminds php-HTML5 Parser like this:

$HTML5 = new HTML5(['disable_html_ns'=>true]) ;
$doc->append($HTML5->loadHTMLFragment("<script> document.write('<div>Hello!</div>'); </script>"));

So, i'd like to know if it is possible to extend the phpQuery HTML parser to use a (better) HTML5 parser, such as the Masterminds one, or any other ...
I could try to do it myself, but i need someone to point me to where i should add the code.

Hope this could be useful for someone else. TY!

Error: Unable to create XML parser while installing phpquery

[root@default /]# pear channel-discover phpquery-pear.appspot.com
Error: Unable to create XML parser
Discovering channel phpquery-pear.appspot.com over http:// failed with message: channel-add: invalid channel.xml file
Trying to discover channel phpquery-pear.appspot.com over https:// instead
Error: Unable to create XML parser
Discovery of channel "phpquery-pear.appspot.com" failed (channel-add: invalid channel.xml file)

<?php
var_dump(xml_parser_create());
?>

outputs

resource(2, xml)

Latest Amazon Linux

cat /etc/issue

Amazon Linux AMI release 2014.03