
pquery's People

Contributors

tburry


pquery's Issues

Removed html attributes

I have HTML content with the attributes align="left" and valign="top".
When I tried to parse the content, the output came back without these attributes. Is this a bug?

Thanks.

PHP 8 Deprecations

Some deprecation notices are raised on PHP 8:


Deprecated: Return type of pQuery::offsetExists($offset) should either be compatible with ArrayAccess::offsetExists(mixed $offset): bool, or the #[\ReturnTypeWillChange] attribute should be used to temporarily suppress the notice in /var/www/html/api/vendor/tburry/pquery/pQuery.php on line 121

Deprecated: Return type of pQuery::offsetGet($offset) should either be compatible with ArrayAccess::offsetGet(mixed $offset): mixed, or the #[\ReturnTypeWillChange] attribute should be used to temporarily suppress the notice in /var/www/html/api/vendor/tburry/pquery/pQuery.php on line 125

Deprecated: Return type of pQuery::offsetSet($offset, $value) should either be compatible with ArrayAccess::offsetSet(mixed $offset, mixed $value): void, or the #[\ReturnTypeWillChange] attribute should be used to temporarily suppress the notice in /var/www/html/api/vendor/tburry/pquery/pQuery.php on line 129

Deprecated: Return type of pQuery::offsetUnset($offset) should either be compatible with ArrayAccess::offsetUnset(mixed $offset): void, or the #[\ReturnTypeWillChange] attribute should be used to temporarily suppress the notice in /var/www/html/api/vendor/tburry/pquery/pQuery.php on line 138

Deprecated: Return type of pQuery::getIterator() should either be compatible with IteratorAggregate::getIterator(): Traversable, or the #[\ReturnTypeWillChange] attribute should be used to temporarily suppress the notice in /var/www/html/api/vendor/tburry/pquery/pQuery.php on line 97

Deprecated: Return type of pQuery::count() should either be compatible with Countable::count(): int, or the #[\ReturnTypeWillChange] attribute should be used to temporarily suppress the notice in /var/www/html/api/vendor/tburry/pquery/pQuery.php on line 81

Deprecated: Return type of pQuery\DomNode::count() should either be compatible with Countable::count(): int, or the #[\ReturnTypeWillChange] attribute should be used to temporarily suppress the notice in /var/www/html/api/vendor/tburry/pquery/gan_node_html.php on line 2355

I think adding the same types that the interfaces declare should solve the problem:

public function offsetExists(mixed $offset): bool { /* ... */ }
//                           ^^^^^           ^^^^

public function offsetGet(mixed $offset): mixed { /* ... */ }
//                        ^^^^^           ^^^^^

public function offsetSet(mixed $offset, mixed $value): void { /* ... */ }
//                        ^^^^^          ^^^^^          ^^^^

public function offsetUnset(mixed $offset): void { /* ... */ }
//                          ^^^^^           ^^^^
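
A minimal sketch of typed signatures on a class that implements all three interfaces named in the notices. NodeList is a hypothetical stand-in, not pQuery's actual class, and the mixed type assumes PHP 8.0 or later:

```php
<?php
// Hypothetical stand-in for a pQuery-like collection; not the library's code.
final class NodeList implements ArrayAccess, IteratorAggregate, Countable
{
    /** @var array<int|string, mixed> */
    private array $items;

    public function __construct(array $items = [])
    {
        $this->items = $items;
    }

    public function offsetExists(mixed $offset): bool
    {
        return isset($this->items[$offset]);
    }

    public function offsetGet(mixed $offset): mixed
    {
        return $this->items[$offset] ?? null;
    }

    public function offsetSet(mixed $offset, mixed $value): void
    {
        if ($offset === null) {
            $this->items[] = $value;          // supports $list[] = $value
        } else {
            $this->items[$offset] = $value;
        }
    }

    public function offsetUnset(mixed $offset): void
    {
        unset($this->items[$offset]);
    }

    public function getIterator(): Traversable
    {
        return new ArrayIterator($this->items);
    }

    public function count(): int
    {
        return count($this->items);
    }
}
```

If the library still has to run on PHP 7, the notices themselves point at the other route: placing #[\ReturnTypeWillChange] on its own line above each method. PHP 7 reads a line starting with # as a comment, so that variant should parse on both versions.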

Interface 'Counter' not found

I'm trying to use pQuery in a project of mine, but when I create an instance of the class with

$doc=pQuery::parseStr($htmlPage);

I get this error:

PHP Fatal error: Can't inherit abstract function pQuery\IQuery::count() (previously declared abstract in Countable) in /home/freemem/Travel/RipScripts/Rippers/crystalcruises/vendor/tburry/pquery/pQuery.php on line 15

After I removed "Countable" from the implements list in the pQuery class declaration, everything runs perfectly. Why?
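
One possible explanation, offered as an assumption rather than a diagnosis: the PHP manual notes that releases before 5.3.9 rejected a class that receives the same method from two interfaces. The sketch below shows a layout that avoids declaring count() twice, by letting the library interface extend Countable. QueryLike and Doc are hypothetical names, not pQuery's code:

```php
<?php
// Hypothetical stand-in for pQuery\IQuery. Extending Countable means count()
// is declared exactly once in the interface hierarchy.
interface QueryLike extends Countable
{
    /** @return string[] the stored tags equal to $tag */
    public function query(string $tag): array;
}

// Hypothetical document class; implementing QueryLike also satisfies Countable.
final class Doc implements QueryLike
{
    /** @var string[] */
    private array $tags;

    public function __construct(array $tags)
    {
        $this->tags = $tags;
    }

    public function query(string $tag): array
    {
        return array_values(array_filter(
            $this->tags,
            static function (string $t) use ($tag): bool {
                return $t === $tag;
            }
        ));
    }

    public function count(): int
    {
        return count($this->tags);
    }
}

$doc = new Doc(['td', 'td', 'tr']);
echo count($doc), "\n";               // 3
echo count($doc->query('td')), "\n";  // 2
```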

Regards

Fatal error: Class 'pQueryTestCase' not found

Fatal error: Class 'pQueryTestCase' not found in /var/www/public/pquery/tests/BasicTest.php on line 3

I'm not able to load your classes. I used Composer, but maybe I'm doing something wrong or messing something up. Here is what I have:

Windows 8.1 with Vagrant/VirtualBox/scotchbox.io (for the dev virtual machine, which runs Ubuntu).

I go into my virtual machine (Linux) through SSH, then go to the public folder (/var/www/public/), the default folder for index.php, and there I run: git clone https://github.com/tburry/pquery.git

Then I run this command: composer require tburry/pquery

After that I try your test files and get these kinds of errors (Class not found).

The same happens with the code snippet you give as an example (see the attached index.php).

I'm sure I'm doing something awfully wrong, but I can't figure out what, and on top of that I'm new to Composer :(

Would you be kind enough to give me some guidance?

index2.txt
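
A hedged sketch of the usual Composer setup, assuming composer require tburry/pquery was run in the project root, so that Composer generated vendor/autoload.php there (the separate git clone should not be needed in that case). loadComposerAutoloader is a hypothetical helper name. The test classes may need PHPUnit's own bootstrap, which this sketch does not cover:

```php
<?php
/**
 * Hypothetical helper: load Composer's autoloader from $projectRoot.
 * Returns false instead of failing when the file is missing.
 */
function loadComposerAutoloader(string $projectRoot): bool
{
    $autoload = rtrim($projectRoot, '/') . '/vendor/autoload.php';
    if (!is_file($autoload)) {
        return false;
    }
    require_once $autoload;
    return true;
}

if (loadComposerAutoloader(__DIR__) && class_exists('pQuery')) {
    // parseStr(), query() and html() are the calls quoted elsewhere on this page.
    $dom = pQuery::parseStr('<p>Hello</p>');
    echo $dom->query('p')->html(), "\n";
} else {
    echo "pQuery is not loadable; check that composer was run in this directory\n";
}
```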

Bug <tag><hello</tag>

Hello,

I'm parsing some game reports, and players can choose their own name (here it's <:::::::::::::)=o).

$dom = \pQuery::parseStr('<td><:::::::::::::)=o</td><td>1,524</td><td>29</td>');
foreach ($dom->query('td') as $td) {
    var_dump($td->text());
}

Results:

string(7) "1,52429"
string(5) "1,524"
string(2) "29"

The text of the first cell, <:::::::::::::)=o, is missing.
Is that a known bug?
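
A possible workaround, sketched under the assumption that the parser treats every "<" as the start of a tag: escape the "<" characters that cannot begin a tag before handing the string to the parser. escapeStrayLessThan is a hypothetical helper, not part of pQuery, and it does not help when the name itself looks like a tag (for example "<hello"):

```php
<?php
/**
 * Escape every "<" that cannot start a tag, closing tag, comment, or
 * processing instruction. Workaround sketch only.
 */
function escapeStrayLessThan(string $html): string
{
    // A "<" is left alone only when followed by a letter, "/", "!" or "?".
    return preg_replace('/<(?![A-Za-z\/!?])/', '&lt;', $html) ?? $html;
}

$raw = '<td><:::::::::::::)=o</td><td>1,524</td><td>29</td>';
echo escapeStrayLessThan($raw), "\n";
// <td>&lt;:::::::::::::)=o</td><td>1,524</td><td>29</td>
```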

Thanks,

tagname:nth-of-type(2).classname does not work

tagname:nth-of-type(2).classname is recognized as two selectors:

tagname:nth-of-type(2) and .classname

instead of one.

what works is:
tagname.classname:nth-of-type(2)

A simple patch would be in gan_selector.php:

protected function parse_conditions() {
    ...
    $conditions_all[] = $conditions;
    $break = false;
    if ($tok === CSSQueryTokenizer::TOK_WHITESPACE) {
        $tok = $p->next_no_whitespace();
        $break = true;
    }
    if ($tok === CSSQueryTokenizer::TOK_COMMA) {
        $tok = $p->next_no_whitespace();
        continue;
    } else {
        if ($break)
            break;
    }
    ....
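
Until the parser is patched, the ordering that is reported to work can be produced mechanically. A small sketch, assuming simple compound selectors of the form tag, pseudo-classes, classes; classesBeforePseudo is a hypothetical helper and does not handle attribute selectors or combinators:

```php
<?php
/**
 * Rewrite "tag:pseudo(...).class" as "tag.class:pseudo(...)".
 * Selectors that do not have this exact shape are returned unchanged.
 */
function classesBeforePseudo(string $compound): string
{
    // 1: tag name, 2: one or more pseudo-classes, 3: one or more classes
    $pattern = '/^([A-Za-z][\w-]*)((?::[\w-]+(?:\([^)]*\))?)+)((?:\.[\w-]+)+)$/';
    return preg_replace($pattern, '$1$3$2', $compound) ?? $compound;
}

echo classesBeforePseudo('tagname:nth-of-type(2).classname'), "\n";
// tagname.classname:nth-of-type(2)
```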

Comma-separated selector fails

The following selector returns 0 results with pquery:
#NewsItem1 a.topnewsitem_heading,#mellemartikelforside a.topnewsitem_heading,#mindreartikel a.topnewsitem_heading

There are 21 elements in the dom, and using PHP Simple DOM for reference with the same selector, I get all of the elements.

This must be a bug in either pquery or Ganon, but I am not able to figure out the problem.
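
A workaround sketch, assuming each selector in the list works when queried on its own: split the list on commas, run the parts one by one, and merge the results. queryEach is a hypothetical helper. The split is naive (a comma inside an attribute value or parentheses would break it), and results are neither de-duplicated nor returned in document order:

```php
<?php
/**
 * Run $query once per comma-separated selector and merge the results.
 * $query is any callable that takes a selector and returns an iterable,
 * for example: function ($s) use ($dom) { return $dom->query($s); }
 */
function queryEach(callable $query, string $selectorList): array
{
    $results = [];
    foreach (explode(',', $selectorList) as $selector) {
        $selector = trim($selector);
        if ($selector === '') {
            continue;
        }
        foreach ($query($selector) as $node) {
            $results[] = $node;
        }
    }
    return $results;
}

// Demo with a fake lookup table instead of a real DOM.
$fakeIndex = [
    '#NewsItem1 a.topnewsitem_heading'     => ['n1', 'n2'],
    '#mindreartikel a.topnewsitem_heading' => ['n3'],
];
$lookup = static function (string $selector) use ($fakeIndex): array {
    return $fakeIndex[$selector] ?? [];
};
$all = queryEach($lookup, '#NewsItem1 a.topnewsitem_heading,#mellemartikelforside a.topnewsitem_heading,#mindreartikel a.topnewsitem_heading');
echo count($all), "\n"; // 3
```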

Script dying in parseStr

For several hours I thought I might be going crazy, and then I realized pQuery is die-ing/exit-ing under some unknown conditions while executing parseStr(). About 1 in 5 times (a guess) I call it on a web page (fetched via file_get_contents()) it will kill my script.

The entirety of the pQuery-related code I'm calling is:

$html = file_get_contents($url);
// It always reaches here
$dom = pQuery::parseStr($html);
// It sometimes does NOT reach here
$html2 = $dom->query('div[class="col-xs-12 col-sm-6 col-md-5 col-lg-4"]')->html();
$dom2 = pQuery::parseStr($html2); // I'm doing this second parseStr because I can't find an equivalence to jquery $('#foo').find('#bar')
$imageUrl = $dom2->query('img')->attr('src');

I do not have a sample of the HTML which causes the dying. I'll try to collect that later and add it. The actual contents might not prove all that relevant, though: I'm working through a list of URLs, and when I retry after the script death, the same URL works fine (it isn't stuck on that URL). With the pQuery code disabled the script works fine, so it's not a case where the $html it is given is ever "" or null.
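
A diagnostic sketch that may help narrow this down, not a fix. It logs fatal errors at shutdown, catches anything the parser throws, and also checks for false, which is what file_get_contents() returns on failure. tryParse is a hypothetical helper. An explicit exit() or die() inside a library leaves no error behind, so that case would show up only as silence in the log:

```php
<?php
// Log fatal errors (for example memory exhaustion) that end the script
// without throwing an exception.
register_shutdown_function(static function (): void {
    $error = error_get_last();
    $fatal = [E_ERROR, E_PARSE, E_CORE_ERROR, E_COMPILE_ERROR];
    if ($error !== null && in_array($error['type'], $fatal, true)) {
        error_log(sprintf('Fatal: %s in %s:%d', $error['message'], $error['file'], $error['line']));
    }
});

/**
 * Run $parser on $html and return its result, or null when the input is
 * unusable or the parser throws. $parser is any callable taking a string,
 * for example ['pQuery', 'parseStr'].
 */
function tryParse(callable $parser, $html)
{
    if (!is_string($html) || $html === '') {
        error_log('No HTML to parse (did the fetch fail?)');
        return null;
    }
    try {
        return $parser($html);
    } catch (Throwable $e) {
        error_log(get_class($e) . ': ' . $e->getMessage());
        return null;
    }
}
```

With this in place, the call in the snippet above would become tryParse(['pQuery', 'parseStr'], file_get_contents($url)), followed by a null check before calling query().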

Broken javascript content

Hi,
First, sorry for my elementary English.
I found that this statement (gan_node_html.php line 465):
function getPlainText() {
    return preg_replace('`\s+`', ' ', html_entity_decode($this->toString(true, true, true), ENT_QUOTES));
}
breaks embedded JavaScript that contains double-slash comments.
In fact, the pattern \s+ matches spaces and newlines, so the code block is collapsed into a single line, and everything after the first '//' becomes part of the comment.
Changing the \s+ part of the pattern to '\ +' (spaces only) makes the transformation work.
Hope this can help somebody.
Franco
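
The effect is easy to reproduce outside the library. A short sketch comparing the two patterns on a made-up script string ($script is illustrative input, not taken from pQuery):

```php
<?php
$script = "var a = 1;  // set a\nvar b = 2;";

// \s+ also matches the newline, so the second statement lands inside the comment.
$allWhitespace = preg_replace('/\s+/', ' ', $script);
echo $allWhitespace, "\n";
// var a = 1; // set a var b = 2;

// Matching spaces only collapses runs of spaces but keeps the line break.
$spacesOnly = preg_replace('/ +/', ' ', $script);
echo $spacesOnly, "\n";
// var a = 1; // set a
// var b = 2;
```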

Element attr without quotes

<a href=/index.php/example>Example</a>

$e->query('a')->attr('href') returns "e" (the last character of the href, I think); it should return "/index.php/example".
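
For reference, HTML allows unquoted attribute values as long as they contain no whitespace, quotes, "=", "<", ">" or backticks. A regex-based sketch of reading such a value from a single start tag; attributeValue is a hypothetical helper for illustration, not how pQuery parses attributes:

```php
<?php
/**
 * Return the value of attribute $name in the start tag $tagHtml, or null
 * when the attribute is absent. Handles double-quoted, single-quoted and
 * unquoted values.
 */
function attributeValue(string $tagHtml, string $name): ?string
{
    $pattern = '/\s' . preg_quote($name, '/')
        . '\s*=\s*(?:"([^"]*)"|\'([^\']*)\'|([^\s"\'=<>`]+))/i';
    if (preg_match($pattern, $tagHtml, $m, PREG_UNMATCHED_AS_NULL) !== 1) {
        return null;
    }
    // Exactly one of the three groups is set, depending on the quoting style.
    return $m[1] ?? $m[2] ?? $m[3] ?? null;
}

echo attributeValue('<a href=/index.php/example>', 'href'), "\n";
// /index.php/example
```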

pQuery namespace

pQuery renames ganon's classes and puts them into a namespace.

pQuery.php is not in a namespace - more specifically, it's in the global namespace. It's supposed to be in the same namespace as the others. You should also use PSR-4.

Escaped values in attributes are converted back to characters

When sanitized strings are added to attributes, pQuery's output can break form elements, because it converts the encoded entities back to their literal characters.

Proof of concept:

require_once('pquery/load_pquery.php'); 
$domObj = pQuery::parseStr('<input type="text" placeholder="Hello">'); 
$domObj->query('input')->attr('placeholder', '\&quot;&gt; &#039;'); 
echo $domObj->html(); 

Output:
<input type="text" placeholder="\"> '" />

This is caused by the getInnerText() function in gan_node_html.php:

function getInnerText() {
    return html_entity_decode($this->toString(true, true, 1), ENT_QUOTES);
}

I'm not sure what the purpose is for that line since pQuery doesn't seem to store those values encoded.
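
The decode step by itself can be shown with plain PHP functions. A sketch using the string from the proof of concept, with re-encoding at output time as one way to keep the markup intact (how pQuery serializes attributes internally is not shown here):

```php
<?php
$stored = '\&quot;&gt; &#039;';

// Decoding turns the entities back into literal quote and bracket characters.
$decoded = html_entity_decode($stored, ENT_QUOTES, 'UTF-8');
echo $decoded, "\n";
// \"> '

// Re-encoding when the attribute is written out restores a safe value.
$encoded = htmlspecialchars($decoded, ENT_QUOTES, 'UTF-8');
echo '<input type="text" placeholder="' . $encoded . '" />', "\n";
// <input type="text" placeholder="\&quot;&gt; &#039;" />
```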

Browsers always insert tbody if missing in tables

A simple patch mimics browser behaviour by inserting the missing tbody into tables:

protected function parse_hierarchy($self_close = null) {
    $tag_curr = strtolower($this->status['tag_name']);
    if ($self_close === null) {
        $this->status['self_close'] = ($self_close = isset($this->tags_selfclose[$tag_curr]));
    }

    if (! ($self_close || $this->status['closing_tag'])) {
        //$tag_prev = strtolower(end($this->hierarchy)->tag);
        $tag_prev = strtolower($this->hierarchy[count($this->hierarchy) - 1]->tag);
        if (isset($this->tags_optional_close[$tag_curr]) && isset($this->tags_optional_close[$tag_curr][$tag_prev])) {
            array_pop($this->hierarchy);
        }

        //patch begin
        if ($tag_curr == "tr" && $tag_prev == "table") {
            $this->hierarchy[] = $this->hierarchy[count($this->hierarchy) - 1]->addChild(new DomNode("tbody", $this->hierarchy[count($this->hierarchy) - 1]));
        }
        //patch end
    }

    return parent::parse_hierarchy($self_close);
}
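
The rule the patch applies can be illustrated on a plain array tree, independent of pQuery's DomNode class. In this sketch node() and withImpliedTbody() are hypothetical helpers, and the grouping is a simplification of what browsers do: consecutive tr children of a table are collected into one tbody, and any other child ends the current group.

```php
<?php
/** Build a tiny tree node: ['tag' => ..., 'children' => [...]]. */
function node(string $tag, array $children = []): array
{
    return ['tag' => $tag, 'children' => $children];
}

/** Wrap runs of direct tr children of $table in an implied tbody. */
function withImpliedTbody(array $table): array
{
    $children = [];
    $tbodyIndex = null; // index of the tbody currently collecting rows
    foreach ($table['children'] as $child) {
        if ($child['tag'] === 'tr') {
            if ($tbodyIndex === null) {
                $children[] = node('tbody');
                $tbodyIndex = count($children) - 1;
            }
            $children[$tbodyIndex]['children'][] = $child;
        } else {
            $children[] = $child;
            $tbodyIndex = null; // a non-row child ends the current run
        }
    }
    $table['children'] = $children;
    return $table;
}

$table = withImpliedTbody(node('table', [node('tr'), node('tr')]));
echo $table['children'][0]['tag'], "\n";              // tbody
echo count($table['children'][0]['children']), "\n";  // 2
```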
