lubosdz / parser-orsr
Parser of the Slovak Business Register (Obchodný register SR)
Fixed in PR: #1
Hello. First of all, thank you very much, this is a very useful class 👍. Just for interest, I would like to mention my use case:
The tidy PHP extension is not available on my server, and installing it would take effort from several people :D
So, in place of tidy, I gave this combination a chance:
libxml_use_internal_errors(true); ... $xml = new \DOMDocument('1.0', 'utf-8'); $xml->preserveWhiteSpace = false;
Of course, I commented out the tidy-related parts of the code. The line libxml_use_internal_errors(true);
ensures that the call $xml->loadHTML($html);
does not fail on invalid HTML. With the line $xml->preserveWhiteSpace = false;
I tell DOMDocument to ignore whitespace between tags (otherwise it would treat it as part of the DOM).
The result looks usable. However, I only use the getDetailByICO()
method, so I do not know whether this works for all methods of the class...
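The approach described above can be sketched roughly as follows. This is only a minimal illustration, not the library's code: the helper name loadHtmlWithoutTidy() and the sample HTML are hypothetical, and the effect of preserveWhiteSpace is taken from the comment above rather than verified here.

```php
<?php
// Hypothetical helper for illustration only; it is not part of parser-orsr.
function loadHtmlWithoutTidy(string $html): \DOMXPath
{
    // Collect libxml parse errors internally instead of emitting PHP warnings,
    // and remember the previous setting so it can be restored afterwards.
    $previous = libxml_use_internal_errors(true);

    $xml = new \DOMDocument('1.0', 'utf-8');
    // As described in the comment above: meant to skip whitespace between tags.
    $xml->preserveWhiteSpace = false;
    $xml->loadHTML($html);

    // Drop the collected errors and restore the previous error handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return new \DOMXPath($xml);
}

// Deliberately sloppy HTML (unclosed <td> and <tr>), ASCII only.
$xpath = loadHtmlWithoutTidy('<table><tr><td>Name:<td>Example s.r.o.</table>');
$cells = $xpath->query('//td');
```

Restoring the previous libxml setting keeps the change local, so other code that relies on PHP warnings from libxml is not affected.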
Line 531
$xml = new \DOMDocument('1.0', 'utf-8');
$xml->loadHTML($html);
$xpath = new \DOMXpath($xml);
An error occurs in this part because PHP tries to load a string whose encoding is broken. The actual fault is one line above, where preg_replace corrupts the encoding of the string. It happens only on some PHP versions that were compiled differently. In these cases the correct fix is to use the u flag (the original line is followed by the corrected one):
$html = preg_replace('/\s+/', ' ', $html);
$html = preg_replace('/\s+/u', ' ', $html);
I have already created a pull request, please apply it. #3
https://maxivak.com/working-with-regular-expressions-preg_-and-utf-8-strings-in-php/
https://stackoverflow.com/questions/35887031/why-does-w-match-non-english-characters-in-mac-os-x-php-environment
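A small sketch of the whitespace collapsing with the u flag, under the assumption that the input is valid UTF-8. The function name collapseWhitespace() is hypothetical and only wraps the corrected line from the report so that a failure is not silently ignored.

```php
<?php
// Hypothetical wrapper for illustration only; it is not part of parser-orsr.
function collapseWhitespace(string $html): string
{
    // The u modifier makes PCRE treat pattern and subject as UTF-8, so single
    // bytes inside multi-byte characters cannot be matched as whitespace.
    $result = preg_replace('/\s+/u', ' ', $html);

    if ($result === null) {
        // preg_replace() returns null on failure, for example when the
        // subject is not valid UTF-8 and the u modifier is used.
        throw new \RuntimeException('preg_replace failed, code ' . preg_last_error());
    }

    return $result;
}

$clean = collapseWhitespace("Košice  \n\t  Šaľa");
```

Checking for null matters here because, with the u flag, an invalidly encoded input makes preg_replace() fail instead of returning a damaged string.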
It can happen that a single ICO returns more than one result
https://www.orsr.sk/hladaj_ico.asp?ICO=45281025
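The company (s.r.o.) has since changed both its name and its address. The parser processes the data from the first result, but the correct one is (I assume always) the second one. The incorrect result contains this notice:
Spis odstúpený na iný registrový súd z dôvodu miestnej nepríslušnosti (file transferred to another registry court due to lack of local jurisdiction)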
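One possible way to handle this, sketched under stated assumptions: the function pickCurrentResult() and the row layout with 'name' and 'note' keys are hypothetical and do not reflect the parser's real data structures. The idea is simply to skip any result carrying the notice quoted above.

```php
<?php
// Beginning of the notice quoted in the report above.
const TRANSFERRED_NOTICE = 'Spis odstúpený na iný registrový súd';

// Hypothetical helper for illustration only; the row layout is an assumption.
function pickCurrentResult(array $results): ?array
{
    foreach ($results as $row) {
        $note = $row['note'] ?? '';
        // Skip entries whose file was transferred to another registry court.
        if (strpos($note, TRANSFERRED_NOTICE) !== false) {
            continue;
        }
        return $row;
    }

    // Every result carried the notice, or the list was empty.
    return null;
}

$results = [
    ['name' => 'Old name', 'note' => 'Spis odstúpený na iný registrový súd z dôvodu miestnej nepríslušnosti'],
    ['name' => 'New name', 'note' => ''],
];
$current = pickCurrentResult($results);
```

Filtering on the notice, rather than always taking the second result, avoids relying on the order in which the register lists the entries.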
Passing null to trim() is deprecated (since PHP 8.1) and may become an issue in the future.
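A minimal sketch of one way to avoid the deprecation notice, assuming the value may be either a string or null. The helper name safeTrim() is hypothetical and not part of the library.

```php
<?php
// Hypothetical helper for illustration only; it is not part of parser-orsr.
function safeTrim(?string $value): string
{
    // Replace null with an empty string before calling trim(), so no
    // "passing null to parameter" deprecation is raised on PHP 8.1+.
    return trim($value ?? '');
}

$fromNull = safeTrim(null);
$fromText = safeTrim("  Bratislava \n");
```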
If I understand correctly, getDetailByICO
should return a detail, the same as getDetailByPartialLink,
but it returns a list with one item, like findByObchodneMeno