tburry / pquery
A jQuery-like HTML DOM parser written in PHP.
License: GNU Lesser General Public License v2.1
I have HTML content with the attributes align="left" and valign="top".
When I parse the content, the result comes back without these attributes. Is this a bug?
Thanks.
Some deprecation notices occur on PHP 8:
<b>Deprecated</b>: Return type of pQuery::offsetExists($offset) should either be compatible with ArrayAccess::offsetExists(mixed $offset): bool, or the #[\ReturnTypeWillChange] attribute should be used to temporarily suppress the notice in <b>/var/www/html/api/vendor/tburry/pquery/pQuery.php</b> on line <b>121</b>
<b>Deprecated</b>: Return type of pQuery::offsetGet($offset) should either be compatible with ArrayAccess::offsetGet(mixed $offset): mixed, or the #[\ReturnTypeWillChange] attribute should be used to temporarily suppress the notice in <b>/var/www/html/api/vendor/tburry/pquery/pQuery.php</b> on line <b>125</b>
<b>Deprecated</b>: Return type of pQuery::offsetSet($offset, $value) should either be compatible with ArrayAccess::offsetSet(mixed $offset, mixed $value): void, or the #[\ReturnTypeWillChange] attribute should be used to temporarily suppress the notice in <b>/var/www/html/api/vendor/tburry/pquery/pQuery.php</b> on line <b>129</b>
<b>Deprecated</b>: Return type of pQuery::offsetUnset($offset) should either be compatible with ArrayAccess::offsetUnset(mixed $offset): void, or the #[\ReturnTypeWillChange] attribute should be used to temporarily suppress the notice in <b>/var/www/html/api/vendor/tburry/pquery/pQuery.php</b> on line <b>138</b>
<b>Deprecated</b>: Return type of pQuery::getIterator() should either be compatible with IteratorAggregate::getIterator(): Traversable, or the #[\ReturnTypeWillChange] attribute should be used to temporarily suppress the notice in <b>/var/www/html/api/vendor/tburry/pquery/pQuery.php</b> on line <b>97</b>
<b>Deprecated</b>: Return type of pQuery::count() should either be compatible with Countable::count(): int, or the #[\ReturnTypeWillChange] attribute should be used to temporarily suppress the notice in <b>/var/www/html/api/vendor/tburry/pquery/pQuery.php</b> on line <b>81</b>
<b>Deprecated</b>: Return type of pQuery\DomNode::count() should either be compatible with Countable::count(): int, or the #[\ReturnTypeWillChange] attribute should be used to temporarily suppress the notice in <b>/var/www/html/api/vendor/tburry/pquery/gan_node_html.php</b> on line <b>2355</b>
I think adding the same types as the interface should solve the problem:
public function offsetExists(mixed $offset): bool { /* ... */ }
//                           ^^^^^           ^^^^
public function offsetGet(mixed $offset): mixed { /* ... */ }
//                        ^^^^^           ^^^^^
public function offsetSet(mixed $offset, mixed $value): void { /* ... */ }
//                        ^^^^^          ^^^^^          ^^^^
public function offsetUnset(mixed $offset): void { /* ... */ }
//                          ^^^^^           ^^^^
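The notices above also mention getIterator() and count(), so a fuller sketch may help. The class below is a hypothetical stand-in (NodeList is not pQuery's real class) that implements ArrayAccess, IteratorAggregate and Countable with signatures matching the interfaces. It assumes PHP 8.0 or later because of the mixed type; a library that must still run on older versions would more likely put #[\ReturnTypeWillChange] above each method instead, as the notices suggest.

```php
<?php
// Hypothetical stand-in for a node collection; not pQuery's actual code.
final class NodeList implements ArrayAccess, IteratorAggregate, Countable
{
    /** @var array */
    private $nodes;

    public function __construct(array $nodes = [])
    {
        $this->nodes = $nodes;
    }

    public function offsetExists(mixed $offset): bool
    {
        return isset($this->nodes[$offset]);
    }

    public function offsetGet(mixed $offset): mixed
    {
        return $this->nodes[$offset] ?? null;
    }

    public function offsetSet(mixed $offset, mixed $value): void
    {
        if ($offset === null) {
            $this->nodes[] = $value;        // $list[] = $value
        } else {
            $this->nodes[$offset] = $value; // $list[$key] = $value
        }
    }

    public function offsetUnset(mixed $offset): void
    {
        unset($this->nodes[$offset]);
    }

    public function getIterator(): Traversable
    {
        return new ArrayIterator($this->nodes);
    }

    public function count(): int
    {
        return count($this->nodes);
    }
}
```

With these signatures, PHP 8.1 and later should no longer emit the return-type deprecation notices for a class like this.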
I'm trying to use pQuery in a project of mine, but when I create an instance of the class
$doc=pQuery::parseStr($htmlPage);
I get this error
PHP Fatal error: Can't inherit abstract function pQuery\IQuery::count() (previously declared abstract in Countable) in /home/freemem/Travel/RipScripts/Rippers/crystalcruises/vendor/tburry/pquery/pQuery.php on line 15
I removed "Countable" from the implements list in the pQuery class declaration, and now everything runs perfectly. Is that expected?
Regards
pQuery doesn't work with the mb_* functions or with the php.ini setting mbstring.func_overload = 2
Fatal error: Class 'pQueryTestCase' not found in /var/www/public/pquery/tests/BasicTest.php on line 3
I'm not able to load your classes. I used Composer, but maybe I'm doing something wrong or messing something up. Here is what I have:
Windows 8.1 with Vagrant/VirtualBox/scotchbox.io (the dev virtual machine, which runs Ubuntu).
I go into my virtual machine (Linux) through SSH, then go to the public folder (/var/www/public/, the default folder for index.php), and there I run: git clone https://github.com/tburry/pquery.git
Then I run this command: composer require tburry/pquery
After that I try your test files and get this kind of error (Class not found).
The same happens with the snippet of code you give as an example (see the attached index.php).
I'm sure I'm doing something awfully wrong but can't figure out what, and on top of that I'm new to Composer :(
Would you be kind enough to give me some guidance?
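A common cause of "Class not found" after composer require is that the script never includes Composer's autoloader. The sketch below assumes the package was installed with Composer in the same directory as the script, so that vendor/autoload.php exists there; loadComposerAutoloader() is a hypothetical helper name, not part of pQuery or Composer.

```php
<?php
/**
 * Load Composer's autoloader from a project root, if it exists.
 * Returns false instead of failing so the caller can print a useful hint.
 * (Hypothetical helper for illustration.)
 */
function loadComposerAutoloader(string $projectRoot): bool
{
    $autoload = rtrim($projectRoot, '/\\') . '/vendor/autoload.php';
    if (!is_file($autoload)) {
        return false;
    }
    require_once $autoload;
    return true;
}

if (!loadComposerAutoloader(__DIR__)) {
    error_log('vendor/autoload.php not found; run "composer require tburry/pquery" in this directory first.');
} else {
    // parseStr() and html() are used the same way in the other reports on this page.
    $dom = pQuery::parseStr('<p>Hello</p>');
    echo $dom->html(), "\n";
}
```

If the autoloader is found, classes installed through Composer should load without a separate git clone of the repository.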
Hello,
I'm parsing some game reports, and players can choose their own names (here the name is <:::::::::::::)=o).
$dom = \pQuery::parseStr('<td><:::::::::::::)=o</td><td>1,524</td><td>29</td>');
foreach ($dom->query('td') as $td) {
    var_dump($td->text());
}
Results:
string(7) "1,52429"
string(5) "1,524"
string(2) "29"
The text of the first cell, <td><:::::::::::::)=o</td>, is missing.
Is that a known bug?
Thanks,
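One possible workaround, until the parser handles this case, is to escape any "<" that cannot start a tag before handing the markup to the parser. This is only a sketch and an assumption about what helps here, not a fix inside pQuery: escapeStrayLessThan() is a hypothetical helper, and the rule it uses (a "<" only opens a tag when followed by a letter, "/", "!" or "?") is borrowed from how HTML tokenizers treat the tag-open state.

```php
<?php
/**
 * Replace every "<" that is not followed by a letter, "/", "!" or "?"
 * with "&lt;", so it is treated as text rather than as the start of a tag.
 * (Hypothetical pre-processing helper, not part of pQuery.)
 */
function escapeStrayLessThan(string $html): string
{
    return preg_replace('~<(?![a-zA-Z/!?])~', '&lt;', $html);
}

$name = ':::::::::::::)=o';
$html = "<td><{$name}</td><td>1,524</td><td>29</td>";

// Real tags are left alone; only the "<" that starts the player name is escaped.
echo escapeStrayLessThan($html), "\n";
```

The escaped string can then be passed to pQuery::parseStr() as in the report above. Whether the parser then returns the name as the first cell's text would still need to be checked.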
tagname:nth-of-type(2).classname is recognized as two selectors:
tagname:nth-of-type(2) and .classname
instead of one.
What works is:
tagname.classname:nth-of-type(2)
A simple patch would be in gan_selector.php:
protected function parse_conditions() {
    ...
    $conditions_all[] = $conditions;
    $break = false;
    if ($tok === CSSQueryTokenizer::TOK_WHITESPACE) {
        $tok = $p->next_no_whitespace();
        $break = true;
    }
    if ($tok === CSSQueryTokenizer::TOK_COMMA) {
        $tok = $p->next_no_whitespace();
        continue;
    } else {
        if ($break)
            break;
    }
    ....
The following selector returns 0 results with pquery:
#NewsItem1 a.topnewsitem_heading,#mellemartikelforside a.topnewsitem_heading,#mindreartikel a.topnewsitem_heading
There are 21 elements in the dom, and using PHP Simple DOM for reference with the same selector, I get all of the elements.
This must be a bug in either pquery or Ganon, but I am not able to figure out the problem.
For several hours I thought I might be going crazy, and then I realized pQuery is dying/exiting under some unknown conditions while executing parseStr(). About 1 in 5 times (a guess) that I call it on a web page (fetched via file_get_contents()), it kills my script.
The entirety of the pQuery-related code I'm calling is:
$html = file_get_contents($url);
// It always reaches here
$dom = pQuery::parseStr($html);
// It sometimes does NOT reach here
$html2 = $dom->query('div[class="col-xs-12 col-sm-6 col-md-5 col-lg-4"]')->html();
$dom2 = pQuery::parseStr(
$imageUrl = $dom2->query('img')->attr('src');
I do not have a sample of the HTML which causes the dying... I'll try to collect that later and add it. The actual contents seem like they might not prove all that relevant because I'm basically working through a list of URLs and when I retry (after the script death) it works fine with the same URL (it isn't stuck on that URL) and with the pQuery code disabled the script works fine, so it's not a case where the $html it is given is ever "" or null.
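When a script stops without any visible message, a shutdown handler can at least record whether PHP raised a fatal error (for example a memory or time limit, which is only a guess here and not something the report confirms). The sketch below uses register_shutdown_function() and error_get_last(); describeFatal() is a hypothetical helper name.

```php
<?php
/**
 * Turn the array returned by error_get_last() into a one-line summary,
 * or return null when there was no error or it was not fatal.
 * (Hypothetical helper for illustration.)
 */
function describeFatal(?array $error): ?string
{
    $fatalTypes = [E_ERROR, E_PARSE, E_CORE_ERROR, E_COMPILE_ERROR, E_USER_ERROR];
    if ($error === null || !in_array($error['type'], $fatalTypes, true)) {
        return null;
    }
    return sprintf('Fatal error: %s in %s:%d', $error['message'], $error['file'], $error['line']);
}

// Runs when the script ends for any reason, including a fatal error.
register_shutdown_function(function (): void {
    $summary = describeFatal(error_get_last());
    if ($summary !== null) {
        error_log($summary . ' (peak memory: ' . memory_get_peak_usage(true) . ' bytes)');
    }
});
```

Registering this before the call to parseStr() should leave a log line the next time the script dies, which would narrow down whether the parser or the environment is responsible.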
Hi,
First, sorry for my elementary English.
I found that this function (gan_node_html.php, line 465):
function getPlainText() {
    return preg_replace('`\s+`', ' ', html_entity_decode($this->toString(true, true, true), ENT_QUOTES));
}
breaks embedded JavaScript that contains double-slash comments.
In fact, the pattern \s+ matches both spaces and newlines, so the code block is transformed into a single line, and everything after the first '//' becomes part of the comment.
After changing the pattern to '`\ +`' (spaces only), the transformation works great.
Hope this can help somebody.
Franco
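The difference between the two patterns can be shown without pQuery. In this sketch both function names are hypothetical, and the second pattern is a small variation on the one suggested above: it also collapses tabs, but still leaves newlines alone.

```php
<?php
// Collapses every run of whitespace, including newlines, into one space.
function collapseAllWhitespace(string $text): string
{
    return preg_replace('`\s+`', ' ', $text);
}

// Collapses only runs of spaces and tabs; line breaks are preserved.
function collapseHorizontalWhitespace(string $text): string
{
    return preg_replace('`[ \t]+`', ' ', $text);
}

$js = "var a = 1;   // set a\nvar b = 2;";

// The newline is gone, so "var b = 2;" is now inside the // comment.
var_dump(collapseAllWhitespace($js));

// The newline survives, so the comment still ends where it did.
var_dump(collapseHorizontalWhitespace($js));
```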
Hi. For example, I have <div style="color: #cc0000;">. How can I find this div using only the CSS color property?
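One way to approach this, sketched here without relying on any particular selector support in pQuery, is to read each element's style attribute and compare the parsed declarations. Both function names are hypothetical. The parser is deliberately naive: it splits on ";" and ":" and would mis-handle values that contain those characters, such as data URLs.

```php
<?php
/**
 * Parse an inline style string into a map of lower-cased property => value.
 * Naive: assumes values contain no ";" and property names no ":".
 */
function parseInlineStyle(string $style): array
{
    $props = [];
    foreach (explode(';', $style) as $declaration) {
        $parts = explode(':', $declaration, 2);
        if (count($parts) !== 2) {
            continue; // skip empty or malformed declarations
        }
        $name = strtolower(trim($parts[0]));
        if ($name !== '') {
            $props[$name] = strtolower(trim($parts[1]));
        }
    }
    return $props;
}

/** True when the style string sets $property to exactly $value (case-insensitive). */
function styleHasProperty(string $style, string $property, string $value): bool
{
    $props = parseInlineStyle($style);
    $key = strtolower($property);
    return isset($props[$key]) && $props[$key] === strtolower(trim($value));
}
```

In use, one could loop over the result of query('div') and keep the nodes for which styleHasProperty($node->attr('style'), 'color', '#cc0000') is true, assuming attr() returns the raw attribute string. Comparing parsed properties rather than searching for the substring "color" avoids accidentally matching background-color.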
<a href=/index.php/example>Example</a>
$e->query('a')->attr('href') returns "e" (the last character of the href, I think); it should return "/index.php/example".
pQuery renames ganon's classes and puts them into a namespace.
pQuery.php is not in a namespace; more specifically, it's in the global namespace. It's supposed to be in the same namespace as the others. You should also use PSR-4.
When sanitized strings are added to attributes, pQuery's output can break the form element, because it converts the encoded strings back to their character values.
Proof of concept:
require_once('pquery/load_pquery.php');
$domObj = pQuery::parseStr('<input type="text" placeholder="Hello">');
$domObj->query('input')->attr('placeholder', '\&quot;&gt; &#039;');
echo $domObj->html();
Output:
<input type="text" placeholder="\"> '" />
This is the result of this getInnerText() function in gan_node_html.php
function getInnerText() {
return html_entity_decode($this->toString(true, true, 1), ENT_QUOTES);
}
I'm not sure what the purpose is for that line since pQuery doesn't seem to store those values encoded.
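The round trip behind this report can be reproduced with plain PHP. The sketch below shows that html_entity_decode() undoes the sanitizing, and that re-encoding when the attribute is written out keeps the markup intact. renderAttribute() is a hypothetical helper, not pQuery's serializer.

```php
<?php
// A value that was sanitized before being stored in an attribute.
$sanitized = htmlspecialchars("\"> '", ENT_QUOTES, 'UTF-8');

// Decoding turns the entities back into characters that can close the attribute.
$decoded = html_entity_decode($sanitized, ENT_QUOTES, 'UTF-8');

/**
 * Serialize one attribute, always encoding the raw value on output.
 * (Hypothetical helper for illustration.)
 */
function renderAttribute(string $name, string $rawValue): string
{
    return sprintf('%s="%s"', $name, htmlspecialchars($rawValue, ENT_QUOTES, 'UTF-8'));
}

var_dump($sanitized);                               // entities
var_dump($decoded);                                 // raw quote characters again
var_dump(renderAttribute('placeholder', $decoded)); // safe to embed in markup
```

The design point is that decoding on read is harmless only if the writer encodes again on output; the output in the report suggests the second step is missing.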
A simple patch that mimics browser behavior for table tbody elements:
protected function parse_hierarchy($self_close = null) {
    $tag_curr = strtolower($this->status['tag_name']);
    if ($self_close === null) {
        $this->status['self_close'] = ($self_close = isset($this->tags_selfclose[$tag_curr]));
    }
    if (! ($self_close || $this->status['closing_tag'])) {
        //$tag_prev = strtolower(end($this->hierarchy)->tag);
        $tag_prev = strtolower($this->hierarchy[count($this->hierarchy) - 1]->tag);
        if (isset($this->tags_optional_close[$tag_curr]) && isset($this->tags_optional_close[$tag_curr][$tag_prev])) {
            array_pop($this->hierarchy);
        }
        //patch begin
        if ($tag_curr == "tr" && $tag_prev == "table") {
            $this->hierarchy[] = $this->hierarchy[count($this->hierarchy) - 1]->addChild(new DomNode("tbody", $this->hierarchy[count($this->hierarchy) - 1]));
        }
        //patch end
    }
    return parent::parse_hierarchy($self_close);
}
Do you have any plans to implement jQuery's dom traversal methods?
http://api.jquery.com/category/traversing/
It would be very helpful... :)
If you do implement it, I suggest making a separate interface for the traversal methods, for example:
IQueryTraversable
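As a rough illustration of what such an interface might look like, here is a sketch whose method names mirror jQuery's traversal API. Everything in it is an assumption: the method set, the optional selector parameters and the return types are not taken from pQuery.

```php
<?php
// Hypothetical interface sketch; none of this exists in pQuery.
interface IQueryTraversable
{
    /** Direct children of each matched node, optionally filtered by a selector. */
    public function children(?string $selector = null): self;

    /** Parent of each matched node, optionally filtered by a selector. */
    public function parent(?string $selector = null): self;

    /** Nearest ancestor (or the node itself) that matches the selector. */
    public function closest(string $selector): self;

    /** All siblings of each matched node, optionally filtered by a selector. */
    public function siblings(?string $selector = null): self;

    /** Immediately following sibling of each matched node. */
    public function next(?string $selector = null): self;

    /** Immediately preceding sibling of each matched node. */
    public function prev(?string $selector = null): self;
}
```

Keeping traversal in its own interface would let the existing query classes adopt it incrementally without changing IQuery.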