bundesapi / deutschland
Germany's most important APIs in one Python package.
License: Apache License 2.0
Note that the RS ("Bundeseinheitlicher Regionalschlüssel") is used here, not the AGS ("Amtlicher Gemeindeschlüssel") as used by the DWD.
However, only the RS of city districts and county districts works, not that of towns.
The following request returns all warnings for "Stadtkreis Heilbronn" / "Universitätsstadt Heilbronn" (["081210000000","Heilbronn, Universitätsstadt",null]):
https://warnung.bund.de/api31/dashboard/081210000000.json
But the following request does not return the warnings for "Möckmühl" (["081255007063","Möckmühl, Stadt",null]):
https://warnung.bund.de/api31/dashboard/081255007063.json
Instead, we must use the RS of "Landkreis Heilbronn" (which is not provided in the linked AGS table):
https://warnung.bund.de/api31/dashboard/081250000000.json
But in this case, we receive all warnings for the county district, not only those for the targeted town.
How can one receive only the warnings for a single town (e.g. Möckmühl), as the NINA app does today (warning level == "Gemeinde")?
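Lacking a documented Gemeinde-level endpoint, one client-side workaround could be to query the county dashboard and filter by regional key. A minimal sketch, assuming each warning carries the affected RS values under a field here called `regionalkeys` (that field name is hypothetical; the real payload may differ):

```python
# Sketch: narrow county-level NINA dashboard warnings down to one town.
# Assumption: each warning dict lists affected RS values under "regionalkeys".
def warnings_for_town(county_warnings, town_rs):
    """Keep warnings that list the town's RS (or the whole county) as affected."""
    # The first 5 RS digits identify the county: "081255007063" -> "081250000000"
    county_rs = town_rs[:5] + "0000000"
    return [
        w for w in county_warnings
        if town_rs in w.get("regionalkeys", [])
        or county_rs in w.get("regionalkeys", [])
    ]
```

Whether NINA exposes per-warning regional keys in exactly this shape would need to be checked against a real dashboard response.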
It would be great to change the documentation to:
from deutschland import bundesanzeiger
ba = bundesanzeiger.Bundesanzeiger()
This worked.
Is there a good API for courts in Germany, so that one can send filings to the correct court?
E.g. a website to scrape could be this one:
This would currently be helpful for FragDenStaat, which sometimes chooses the wrong court in its complaint generation tool: bundesAPI/sofortmassnahmen#64
Maybe we should switch to the latest tag to avoid future breaking changes.
```python
import time
from pprint import pprint

from deutschland import destatis
from deutschland.destatis.api import default_api

with destatis.ApiClient() as api_client:
    # Create an instance of the API class
    api_instance = default_api.DefaultApi(api_client)
    username = "xxx"
    password = "xxx"
    name = "45341-0102"
    area = "all"
    compress = "false"
    transpose = "false"
    startyear = "startyear_example"
    endyear = "endyear_example"
    timeslices = "timeslices_example"
    regionalvariable = "regionalvariable_example"
    regionalkey = "regionalkey_example"
    classifyingvariable1 = "classifyingvariable1_example"
    classifyingkey1 = "classifyingkey1_example"
    classifyingvariable2 = "classifyingvariable2_example"
    classifyingkey2 = "classifyingkey2_example"
    classifyingvariable3 = "classifyingvariable3_example"
    classifyingkey3 = "classifyingkey3_example"
    job = "false"
    stand = "01.01.1970 01:00"
    language = "de"
    format = "csv"
    try:
        api_instance.table(
            username=username,
            password=password,
            name=name,
            area=area,
            compress=compress,
            transpose=transpose,
            startyear=startyear,
            endyear=endyear,
            timeslices=timeslices,
            regionalvariable=regionalvariable,
            regionalkey=regionalkey,
            classifyingvariable1=classifyingvariable1,
            classifyingkey1=classifyingkey1,
            classifyingvariable2=classifyingvariable2,
            classifyingkey2=classifyingkey2,
            classifyingvariable3=classifyingvariable3,
            classifyingkey3=classifyingkey3,
            job=job,
            stand=stand,
            language=language,
        )
    except destatis.ApiException as e:
        print("Exception:")
        print(e)
```
Ok... configured analogously to the standard example, and the query runs through without errors.
Probably a dumb question, but how do I get at the result? :)
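Generated OpenAPI client methods normally return the parsed response body directly, so the result only needs to be assigned instead of discarded. A sketch with a stand-in class (`HypotheticalApi` is not the real destatis client; in the real code the call would be `result = api_instance.table(...)`):

```python
# Sketch: generated OpenAPI client methods return the parsed response,
# so capture the call's return value. HypotheticalApi is a stand-in for
# the generated DefaultApi (an assumption about the client's behavior).
class HypotheticalApi:
    def table(self, **params):
        return {"name": params.get("name"), "rows": ["..."]}

api_instance = HypotheticalApi()
result = api_instance.table(name="45341-0102")  # assign instead of discarding
print(result["name"])
```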
I ran the sample code, but had the following error:
OSError Traceback (most recent call last)
in ()
1 from deutschland import Bundesanzeiger
----> 2 ba = Bundesanzeiger()
3 # search term
4 data = ba.get_reports("Deutsche Bahn AG")
5 # returns a dictionary with all reports found as fulltext reports
3 frames
/usr/local/lib/python3.7/dist-packages/keras/saving/save.py in load_model(filepath, custom_objects, compile, options)
202 if isinstance(filepath_str, str):
203 if not tf.io.gfile.exists(filepath_str):
--> 204 raise IOError(f'No file or directory found at {filepath_str}')
205
206 if tf.io.gfile.isdir(filepath_str):
OSError: No file or directory found at assets/model.h5
I noticed there is no linter. ruff is a great option to use.
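If ruff were adopted, a minimal configuration could live in pyproject.toml; the rule selection below is only a suggestion, not an existing project setting:

```toml
[tool.ruff]
line-length = 88

[tool.ruff.lint]
select = ["E", "F", "I"]  # pycodestyle errors, pyflakes, import sorting
```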
How to reproduce:
from deutschland.bundesanzeiger import Bundesanzeiger
ba = Bundesanzeiger()
data = ba.get_reports('4steps systems')
Expected result:
data
What I got instead:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[4], line 3
1 from deutschland.bundesanzeiger import Bundesanzeiger
2 ba = Bundesanzeiger()
----> 3 data = ba.get_reports('4steps systems')
File ~/miniconda3/envs/uregister/lib/python3.8/site-packages/deutschland/bundesanzeiger/bundesanzeiger.py:186, in Bundesanzeiger.get_reports(self, company_name)
182 # perform the search
183 response = self.session.get(
184 f"https://www.bundesanzeiger.de/pub/de/start?0-2.-top%7Econtent%7Epanel-left%7Ecard-form=&fulltext={company_name}&area_select=&search_button=Suchen"
185 )
--> 186 return self.__generate_result(response.text)
File ~/miniconda3/envs/uregister/lib/python3.8/site-packages/deutschland/bundesanzeiger/bundesanzeiger.py:120, in Bundesanzeiger.__generate_result(self, content)
118 """iterate trough all results and try to fetch single reports"""
119 result = {}
--> 120 for element in self.__find_all_entries_on_page(content):
121 get_element_response = self.session.get(element.content_url)
123 if self.__is_captcha_needed(get_element_response.text):
File ~/miniconda3/envs/uregister/lib/python3.8/site-packages/deutschland/bundesanzeiger/bundesanzeiger.py:90, in Bundesanzeiger.__find_all_entries_on_page(self, page_content)
88 soup = BeautifulSoup(page_content, "html.parser")
89 wrapper = soup.find("div", {"class": "result_container"})
---> 90 rows = wrapper.find_all("div", {"class": "row"})
91 for row in rows:
92 info_element = row.find("div", {"class": "info"})
AttributeError: 'NoneType' object has no attribute 'find_all'
I tried other search terms, numeric and non-numeric, and hit the same error pattern.
My env: Ubuntu 22.04, python 3.8.17
In a map oriented to the north, the first coordinate passed to Geo.fetch() is the lower-left corner and the second coordinate is the upper-right corner.
The following image shows both coordinates from the documentation in OpenStreetMap.
# top_right and bottom_left coordinates
data = geo.fetch([52.50876180448243, 13.359631043007212],
[52.530116236589244, 13.426532801586827])
Maybe I misunderstand the names top_right and bottom_left. Please correct me if this is the case. Otherwise, I would work on a pull request for this issue, which renames the variables.
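Independent of the variable names, the ambiguity could also be removed by normalizing the two corners before use, so either order works. A small sketch (not the library's actual code):

```python
# Sketch: accept two opposite corners in any order and return a
# (south, west, north, east) bounding box. Not part of the library.
def normalize_bbox(corner_a, corner_b):
    (lat1, lon1), (lat2, lon2) = corner_a, corner_b
    return (min(lat1, lat2), min(lon1, lon2), max(lat1, lat2), max(lon1, lon2))
```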
I am receiving the following error message; it looks like the model used to solve the captchas does not exist in the assets/model.h5 directory:
OSError: SavedModel file does not exist at: assets/model.h5/{saved_model.pbtxt|saved_model.pb}
Here's my code:
from deutschland import Bundesanzeiger
ba = Bundesanzeiger()
# search term
data = ba.get_reports("Deutsche Bahn AG")
# returns a dictionary with all reports found as fulltext reports
print(data.keys())
# dict_keys(['Jahresabschluss zum Geschäftsjahr vom 01.01.2020 bis zum 31.12.2020', 'Konzernabschluss zum Geschäftsjahr vom 01.01.2020 bis zum 31.12.2020\nErgänzung der Veröffentlichung vom 04.06.2021',
Hi,
I have exported your model for solving the CAPTCHAs to the ONNX format. This has the advantage that you get rid of TensorFlow as a dependency (~500 MB) and can use onnxruntime (~5 MB) for inference instead, which would make the whole project much more lightweight. There are also significantly fewer problems updating onnxruntime than TensorFlow without breaking the model.
And I could also fix #8, if you're interested.
What would one have to change in the code to search only within the Rechnungslegung/Finanzberichte section?
I suspect the requested URL would need something in "area_select=":
response = self.session.get(
    f"https://www.bundesanzeiger.de/pub/de/start?0-2.-top%7Econtent%7Epanel-left%7Ecard-form=&fulltext={company_name}&area_select=&search_button=Suchen"
)
update to format and test workflow
I started to work on the documentation (see #41) and I have a few suggestions:
We need to find a way to create API bindings from all the openapi specs to integrate them automatically into the deutschland lib.
Any suggestions on how to tackle this?
I think the Handelsregister code in this repo is deprecated in favor of the Handelsregister repo.
This means we should be able to clean up this repo and move the tests over to the Handelsregister repo.
Looks like a new release should fix this problem soon.
After fresh virtualenv install via:
pipenv install git+https://github.com/bundesAPI/deutschland.git#egg=deutschland
I get:
Python 3.9.5 (default, Nov 23 2021, 15:27:38)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
import deutschland as de
from deutschland import Geo
Traceback (most recent call last):
File "", line 1, in
ImportError: cannot import name 'Geo' from 'deutschland' (unknown location)
geo = Geo()
Traceback (most recent call last):
File "", line 1, in
NameError: name 'Geo' is not defined
data = geo.fetch([52.530116236589244, 13.426532801586827],
... [52.50876180448243, 13.359631043007212])
Traceback (most recent call last):
File "", line 1, in
NameError: name 'geo' is not defined
A similar problem with Google Colab (Python 3.7):
!pip install deutschland
I get the following:
Attempting uninstall: urllib3
Found existing installation: urllib3 1.24.3
Uninstalling urllib3-1.24.3:
Successfully uninstalled urllib3-1.24.3
Attempting uninstall: requests
Found existing installation: requests 2.23.0
Uninstalling requests-2.23.0:
Successfully uninstalled requests-2.23.0
Attempting uninstall: regex
Found existing installation: regex 2022.6.2
Uninstalling regex-2022.6.2:
Successfully uninstalled regex-2022.6.2
Attempting uninstall: numpy
Found existing installation: numpy 1.21.6
Uninstalling numpy-1.21.6:
Successfully uninstalled numpy-1.21.6
Attempting uninstall: Pillow
Found existing installation: Pillow 7.1.2
Uninstalling Pillow-7.1.2:
Successfully uninstalled Pillow-7.1.2
Attempting uninstall: pandas
Found existing installation: pandas 1.3.5
Uninstalling pandas-1.3.5:
Successfully uninstalled pandas-1.3.5
Attempting uninstall: lxml
Found existing installation: lxml 4.2.6
Uninstalling lxml-4.2.6:
Successfully uninstalled lxml-4.2.6
Attempting uninstall: beautifulsoup4
Found existing installation: beautifulsoup4 4.6.3
Uninstalling beautifulsoup4-4.6.3:
Successfully uninstalled beautifulsoup4-4.6.3
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
xarray-einstats 0.2.2 requires numpy>=1.21, but you have numpy 1.19.0 which is incompatible.
tensorflow 2.8.2+zzzcolab20220527125636 requires numpy>=1.20, but you have numpy 1.19.0 which is incompatible.
google-colab 1.0.0 requires requests~=2.23.0, but you have requests 2.28.1 which is incompatible.
datascience 0.10.6 requires folium==0.2.1, but you have folium 0.8.3 which is incompatible.
albumentations 0.1.12 requires imgaug<0.2.7,>=0.2.5, but you have imgaug 0.2.9 which is incompatible.
Successfully installed Pillow-8.4.0 beautifulsoup4-4.11.1 boto3-1.24.34 botocore-1.27.34 dateparser-1.1.1 de-autobahn-1.0.4 de-bundesrat-0.1.0 de-bundestag-0.1.0 de-dwd-1.0.1 de-interpol-0.1.0 de-jobsuche-0.1.0 de-ladestationen-1.0.5 de-mudab-0.1.0 de-nina-1.0.2 de-polizei-brandenburg-0.1.0 de-risikogebiete-0.1.0 de-smard-0.1.0 de-strahlenschutz-1.0.0 de-travelwarning-0.1.0 de-zoll-0.1.0 deutschland-0.3.0 gql-2.0.0 graphql-core-2.3.2 jmespath-1.0.1 lxml-4.9.1 mapbox-vector-tile-1.2.1 numpy-1.19.0 onnxruntime-1.10.0 pandas-1.1.5 pyclipper-1.3.0.post3 pypresseportal-0.1 regex-2022.3.2 requests-2.28.1 rx-1.6.1 s3transfer-0.6.0 slugify-0.0.1 urllib3-1.26.10
WARNING: The following packages were previously imported in this runtime:
[PIL,numpy]
You must restart the runtime in order to use newly installed versions.
import deutschland as de
geo = de.Geo()
AttributeError Traceback (most recent call last)
in ()
1 import deutschland as de
----> 2 geo = de.Geo()
AttributeError: module 'deutschland' has no attribute 'Geo'
Running an older version:
deutschland = "==0.1.9"
I don't get the import errors with Geo(), but an empty result:
Python 3.9.13 (main, May 23 2022, 22:01:06)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
from deutschland import Geo
geo = Geo()
data = geo.fetch([52.530116236589244, 13.426532801586827],
... [52.50876180448243, 13.359631043007212])
print(data.keys())
dict_keys([])
With the Bundesanzeiger I get the import error again:
from deutschland import Bundesanzeiger
ba = Bundesanzeiger()
2022-07-22 00:09:52.683533: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-07-22 00:09:52.683555: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Traceback (most recent call last):
File "", line 1, in
File "/home/moritz/.local/share/virtualenvs/deutschland-fqErnsp1/lib/python3.9/site-packages/deutschland/bundesanzeiger/bundesanzeiger.py", line 47, in init
self.model = deutschland.bundesanzeiger.model.load_model()
File "/home/moritz/.local/share/virtualenvs/deutschland-fqErnsp1/lib/python3.9/site-packages/deutschland/bundesanzeiger/model.py", line 36, in load_model
return keras.models.load_model(
File "/home/moritz/.local/share/virtualenvs/deutschland-fqErnsp1/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/home/moritz/.local/share/virtualenvs/deutschland-fqErnsp1/lib/python3.9/site-packages/keras/saving/save.py", line 206, in load_model
raise IOError(f'No file or directory found at {filepath_str}')
OSError: No file or directory found at assets/model.h5
data = ba.get_reports("Deutsche Bahn AG")
Traceback (most recent call last):
File "", line 1, in
NameError: name 'ba' is not defined
Could you point to what I am doing wrong? Best regards
>>> de.fetch([48.51999, 9.07136], [48.51999, 9.07137])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ajung/src/deutschland/lib/python3.9/site-packages/deutschland/geo.py", line 81, in fetch
return parsed
UnboundLocalError: local variable 'parsed' referenced before assignment
A new version of the API generator has been released which should improve the quality of the generated Python code (e.g. type hints).
We need to evaluate whether our current process of generating the clients still works with this newest release.
Possibly we will need to adapt the post-processing script for the new code.
All,
I'd like to create another library for .NET, so I thought I'd start with something simple like the geodata API. It turns out I couldn't find the OpenAPI doc.
Is there one? Or is there another kind of documentation that just wasn't enough to make a repository yet? A cURL/Postman/etc. example request would be enough for me, as reading Python is not one of my strong suits (although I have already started doing that).
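For what it's worth, the tile URLs that show up in other issues here (e.g. https://adv-smart.de/tiles/smarttiles_de_public_v1/15/17605/10745.pbf) look like standard slippy-map vector tiles in a {z}/{x}/{y}.pbf scheme. Under that assumption, the tile indices for a WGS84 coordinate follow the usual Web Mercator formula, which is language-agnostic and easy to port to .NET:

```python
import math

# Sketch: standard Web Mercator (slippy map) tile math. Assumes the geo
# endpoint follows the usual {z}/{x}/{y}.pbf scheme seen in the tile URLs.
def deg2tile(lat_deg, lon_deg, zoom):
    n = 2 ** zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat_deg))) / math.pi) / 2.0 * n)
    return x, y

# A point in central Berlin at zoom 15
print(deg2tile(52.5, 13.4, 15))  # -> (17603, 10749)
```

Fetching a bounding box then means iterating over all (x, y) pairs between the tiles of the two corners and requesting each .pbf.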
Execute tests with pytest.
Tests fail with:
_______________________ test_for_no_data_handelsregister _______________________
def test_for_no_data_handelsregister():
hr = Handelsregister()
data = hr.search(keywords="foobar", keyword_match_option=3)
> assert (
len(data) == 0
), "Found registered companies for 'foobar' although none were expected."
E TypeError: object of type 'NoneType' has no len()
tests/integration_test.py:19: TypeError
___________ test_fetching_handelsregister_data_for_deutsche_bahn_ag ____________
def test_fetching_handelsregister_data_for_deutsche_bahn_ag():
hr = Handelsregister()
data = hr.search(
keywords="Deutsche Bahn Aktiengesellschaft", keyword_match_option=3
)
> assert (
len(data) > 0
), "Found no data for 'Deutsche Bahn Aktiengesellschaft' although it should exist."
E TypeError: object of type 'NoneType' has no len()
tests/integration_test.py:29: TypeError
___ test_fetching_handelsregister_data_for_deutsche_bahn_ag_with_raw_params ____
def test_fetching_handelsregister_data_for_deutsche_bahn_ag_with_raw_params():
r = Registrations()
data = r.search_with_raw_params(
{"schlagwoerter": "Deutsche Bahn Aktiengesellschaft", "schlagwortOptionen": 3}
)
> assert (
len(data) > 0
), "Found no data for 'Deutsche Bahn Aktiengesellschaft' although it should exist."
E TypeError: object of type 'NoneType' has no len()
The example from the README does not work:
from deutschland.handelsregister import Handelsregister
hr = Handelsregister()
# search by keywords, see documentation for all available params
hr.search(keywords="Deutsche Bahn Aktiengesellschaft")  # returns None here
print(hr)  # this doesn't work anyway; it just prints the Handelsregister object
I looked into this a bit yesterday, and the server returns a 404. Could it be that the endpoint is now https://www.handelsregister.de/rp_web/normalesuche.xhtml? The parameters are named somewhat differently too. After my IP got blocked yesterday, I stopped investigating.
Some reports share the same name, e.g. Mitteilung von Netto-Leerverkaufspositionen. The problem is that get_reports cannot return multiple items with the same name because it returns a dictionary/map.
#!/opt/homebrew/bin/python3
from deutschland.bundesanzeiger import Bundesanzeiger
ba = Bundesanzeiger()
data = ba.get_reports("DE000A0TGJ55")
print(data.keys())
print(data["Mitteilung von Netto-Leerverkaufspositionen"])
dict_keys(['Mitteilung von Netto-Leerverkaufspositionen'])
{'date': datetime.datetime(2022, 9, 26, 0, 0), 'name': 'Mitteilung von Netto-Leerverkaufspositionen', 'company': 'BlackRock Investment Management (UK) Limited', 'report': '\n\n\n\n\xa0\n\n\n\n\n\n\n\nBlackRock Investment Management (UK) Limited\nLondon\nMitteilung von Netto-Leerverkaufspositionen\nZu folgendem Emittenten wird vom oben genannten Positionsinhaber eine Netto-Leerverkaufsposition\n gehalten:\n\nVARTA AKTIENGESELLSCHAFT\n\n\nISIN: DE000A0TGJ55\n\nDatum der Position: 23.09.2022\nProzentsatz des ausgegebenen Aktienkapitals: 2,26 %\n\xa0\n\n\n\n\n\n\n\n\n\n\n\n\n'}
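Until the return shape changes (e.g. to a list), colliding names could be disambiguated by suffixing a counter when building the dictionary. A sketch over a hypothetical list of report dicts, not the library's internals:

```python
# Sketch: give reports with identical names unique dictionary keys by
# appending a counter. Input is a hypothetical list of report dicts.
def reports_to_dict(reports):
    result = {}
    counts = {}
    for report in reports:
        name = report["name"]
        counts[name] = counts.get(name, 0) + 1
        key = name if counts[name] == 1 else f"{name} ({counts[name]})"
        result[key] = report
    return result
```

Returning a list of reports would arguably be the cleaner fix, but it would break the existing dictionary-based API.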
Spectral removed auto-detection for OpenAPI schemas. Because of this, linting fails when no .spectral.yaml is provided.
See: stoplightio/spectral#1796
In the Autobahn repo there is a .spectral.yaml file to account for that.
However, if we do not expect to use custom linting rulesets, we can simply get rid of this extra file by executing this before the linting process:
echo "extends: spectral:oas" > .spectral.yaml
Add the topic hacktoberfest to allow participants of Hacktoberfest (more information at https://hacktoberfest.digitalocean.com/) to count pull requests made here.
This could also help more people discover the repository and the whole organization.
I was just trying to install this package and couldn't get it to work using Python 3.9 on Windows 10, mostly due to numpy/blas/lapack/mkl errors somewhere deep down. However, reading the pyproject.toml I saw the mention of Python 3.6.2.
=> Using Python 3.6 I was able to install this package without errors.
Hi,
trying to get your example https://github.com/bundesAPI/deutschland#geographic-data
to run …
pip3 install deutschland
on my macOS 12.5 with homebrew results in:
Requirement already satisfied: deutschland in /opt/homebrew/lib/python3.9/site-packages (0.1.4)
Requirement already satisfied: mapbox-vector-tile<2.0.0,>=1.2.1 in /opt/homebrew/lib/python3.9/site-packages (from deutschland) (1.2.1)
Requirement already satisfied: Shapely<2.0.0,>=1.7.1 in /opt/homebrew/lib/python3.9/site-packages (from deutschland) (1.8.2)
Requirement already satisfied: requests<3.0.0,>=2.26.0 in /opt/homebrew/lib/python3.9/site-packages (from deutschland) (2.28.1)
Requirement already satisfied: pyclipper in /opt/homebrew/lib/python3.9/site-packages (from mapbox-vector-tile<2.0.0,>=1.2.1->deutschland) (1.3.0.post3)
Requirement already satisfied: setuptools in /opt/homebrew/lib/python3.9/site-packages (from mapbox-vector-tile<2.0.0,>=1.2.1->deutschland) (63.3.0)
Requirement already satisfied: protobuf in /opt/homebrew/lib/python3.9/site-packages (from mapbox-vector-tile<2.0.0,>=1.2.1->deutschland) (4.21.4)
Requirement already satisfied: future in /opt/homebrew/lib/python3.9/site-packages (from mapbox-vector-tile<2.0.0,>=1.2.1->deutschland) (0.18.2)
Requirement already satisfied: certifi>=2017.4.17 in /opt/homebrew/lib/python3.9/site-packages (from requests<3.0.0,>=2.26.0->deutschland) (2022.6.15)
Requirement already satisfied: idna<4,>=2.5 in /opt/homebrew/lib/python3.9/site-packages (from requests<3.0.0,>=2.26.0->deutschland) (3.3)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /opt/homebrew/lib/python3.9/site-packages (from requests<3.0.0,>=2.26.0->deutschland) (1.26.11)
Requirement already satisfied: charset-normalizer<3,>=2 in /opt/homebrew/lib/python3.9/site-packages (from requests<3.0.0,>=2.26.0->deutschland) (2.1.0)
[notice] A new release of pip available: 22.2.1 -> 22.2.2
[notice] To update, run: python3.9 -m pip install --upgrade pip
(Didn’t save the initial output, but it seems to be successfully installed)
Now running your example code of https://github.com/bundesAPI/deutschland#geographic-data
fails early in the game:
% python3 deutschland.py
Traceback (most recent call last):
File "/Users/ghoffart/src/deutschland.py", line 1, in <module>
from deutschland.geo import Geo
File "/Users/ghoffart/src/deutschland.py", line 1, in <module>
from deutschland.geo import Geo
ModuleNotFoundError: No module named 'deutschland.geo'; 'deutschland' is not a package
Am I overlooking something very obvious? Sorry, Python’s really not part of my knowledge package :-o
The installation of Shapely doesn't work in a fresh environment.
How to reproduce:
Expected result:
What I got instead:
ERROR: Cannot install deutschland==0.1.0, deutschland==0.1.1, deutschland==0.1.2, deutschland==0.1.3, deutschland==0.1.4, deutschland==0.1.5, deutschland==0.1.6 and deutschland==0.1.7 because these package versions have conflicting dependencies.
The conflict is caused by:
deutschland 0.1.7 depends on Shapely<2.0.0 and >=1.7.1
Workaround:
pip install Shapely
My environment:
Seems like the API changed:
When I run the test, I get the following:
HTTPSConnectionPool(host='adv-smart.de', port=443): Max retries exceeded with url: /tiles/smarttiles_de_public_v1/15/17605/10747.pbf (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1091)')))
To reproduce:
wget https://adv-smart.de/tiles/smarttiles_de_public_v1/15/17605/10745.pbf --no-check-certificate
Gives 404 error.
@LilithWittmann Do you know more?
When you're in paranoia mode and want to use (anonymous) proxies, replace this line
deutschland/deutschland/geo.py
Line 67 in b85d765
with this one
result = requests.get(url, headers=headers, proxies=proxies)
and add this to your main function (with your proxy servers of course):
proxies = {
'http': 'http://10.10.1.10:3128',
'https': 'http://10.10.1.10:1080',
}
I thought about contributing to your package by adding the extended search functionality (i.e. not only search for all documents but add the possibility to limit the search to certain types of documents).
Unfortunately, this only works for certain companies, while for certain others the captcha solver always fails. Any ideas why that might be?
(e.g. it works without errors for "Deutsche Bahn AG" but keeps failing for "Deutsche Bank AG")
Change:
add the value 22 to the search request
response = self.session.get(
f"https://www.bundesanzeiger.de/pub/de/start?0-2.-top%7Econtent%7Epanel-left%7Ecard-form=&fulltext={company_name}&area_select=22&search_button=Suchen"
)
The links to the API documentation for the auto-generated API clients are currently not working on https://pypi.org/project/deutschland/ because they are relative links.
For example: https://pypi.org/project/deutschland/docs/nina/README.md
Are absolute links the way to go here, or is there a better option?
If absolute links are okay, I can do a PR for that.
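If absolute links are the way to go, the README could also be rewritten automatically before publishing instead of by hand. A regex-based sketch; the base URL and the helper name are assumptions, not existing project code:

```python
import re

# Sketch: turn relative markdown links like [text](docs/nina/README.md)
# into absolute GitHub URLs before publishing to PyPI. BASE_URL is an
# assumption, not an existing project setting.
BASE_URL = "https://github.com/bundesAPI/deutschland/blob/main/"

def absolutize_links(markdown):
    # Leave http(s), mailto and in-page anchor links untouched.
    pattern = re.compile(r"\]\((?!https?://|mailto:|#)([^)]+)\)")
    return pattern.sub(lambda m: "](" + BASE_URL + m.group(1) + ")", markdown)
```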
I’ve searched and searched, but I couldn’t see a license file.
She's on Twitter and is the Federal Government's Commissioner for Digitalisation. Maybe she's also on GitHub and can put in a good word with Merkel?
I would like to retrieve all PDFs found for a given search term in the "Bundesanzeiger". Maybe you can help me with this!?
Proper documentation would be quite useful, especially as the system keeps growing.
The usage documentation in the README is nice for getting started but does not show all parameters or advanced usage.
Running the demo code in the README.md for the Handelsregister module returns an error.
How to reproduce:
>>> from deutschland import Bundesanzeiger
>>> from deutschland import Handelsregister
>>> hr = Handelsregister()
>>> hr.search(keywords="Deutsche Bahn Aktiengesellschaft")
Expected result:
What I got instead:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/xxx/Dokumente/bundesapi/test_bundesapi_2/env/lib/python3.9/site-packages/deutschland/handelsregister/handelsregister.py", line 138, in search
return self.search_with_raw_params(params)
File "/home/xxx/Dokumente/bundesapi/test_bundesapi_2/env/lib/python3.9/site-packages/deutschland/handelsregister/handelsregister.py", line 215, in search_with_raw_params
return self.__find_entries(soup)
File "/home/xxx/Dokumente/bundesapi/test_bundesapi_2/env/lib/python3.9/site-packages/deutschland/handelsregister/handelsregister.py", line 242, in __find_entries
data = self.__extract_history(tr)
File "/home/xxx/Dokumente/bundesapi/test_bundesapi_2/env/lib/python3.9/site-packages/deutschland/handelsregister/handelsregister.py", line 276, in __extract_history
[position, historical_name] = tds[1].text.strip().split(".) ", 1)
ValueError: not enough values to unpack (expected 2, got 1)
My env:
Running the following sample code:
from deutschland.bundesanzeiger import Bundesanzeiger
ba = Bundesanzeiger()
data = ba.get_reports("Deutsche Bahn AG")
throws the following error:
/usr/local/lib/python3.10/dist-packages/deutschland/bundesanzeiger/bundesanzeiger.py in __find_all_entries_on_page(self, page_content)
88 soup = BeautifulSoup(page_content, "html.parser")
89 wrapper = soup.find("div", {"class": "result_container"})
---> 90 rows = wrapper.find_all("div", {"class": "row"})
91 for row in rows:
92 info_element = row.find("div", {"class": "info"})
AttributeError: 'NoneType' object has no attribute 'find_all'
I am using Python 3.9 in a Docker image and plan to use the bundesAPI in a
headless project. But the Bundesanzeiger API uses google-chrome;
can this be avoided?
ba = Bundesanzeiger()
====== WebDriver manager ======
====== WebDriver manager ======
/bin/sh: 1: google-chrome: not found
/bin/sh: 1: google-chrome-stable: not found
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.9/site-packages/deutschland/bundesanzeiger/bundesanzeiger.py", line 46, in __init__
self.driver = webdriver.Chrome(ChromeDriverManager().install(), options=options)
File "/usr/local/lib/python3.9/site-packages/webdriver_manager/chrome.py", line 25, in __init__
self.driver = ChromeDriver(name=name,
File "/usr/local/lib/python3.9/site-packages/webdriver_manager/driver.py", line 57, in __init__
self.browser_version = chrome_version(chrome_type)
File "/usr/local/lib/python3.9/site-packages/webdriver_manager/utils.py", line 155, in chrome_version
raise ValueError(f'Could not get version for Chrome with this command: {cmd}')
ValueError: Could not get version for Chrome with this command: google-chrome --version || google-chrome-stable --version
Hi, I am trying to find out whether there is a way to access and download (.xml, PDF) data from the command line. If so, how can one execute that?
Command line: handelsregister.py [-h] [-d] [-f] -s SCHLAGWOERTER [-so {all,min,exact}]
Is it even supported?
The tests fail under Windows, as can be seen here.
The reason is that under Windows the telephone symbol is somehow not parsed properly.
When I tried to print it, I saw something like
E UnicodeEncodeError: 'charmap' codec can't encode character '\u260e' in position 38: character maps to <undefined>
Unfortunately, I do not have access to a Windows machine to dig deeper in a reasonable manner.
Raw error:
================================== FAILURES ===================================
_________________________________ test_verena _________________________________
def test_verena():
v = Verena()
> res = v.get()
tests\verena\test_verena.py:6:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\deutschland\verena\verena.py:21: in get
extract = VerenaExtractor(page).extract()
C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\deutschland\verena\verenaextractor.py:38: in extract
phone, fax, homepage, email, deadline = self.__extract_part4(aus_parts[3])
C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\deutschland\verena\verenaextractor.py:158: in __extract_part4
print(x)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <encodings.cp1252.IncrementalEncoder object at 0x000002AC31A54F10>
input = '\\r\\r\\n\\r\\r\\n \u260e 02381 973060\\r\\r\\n '
final = False
def encode(self, input, final=False):
> return codecs.charmap_encode(input,self.errors,encoding_table)[0]
E UnicodeEncodeError: 'charmap' codec can't encode character '\u260e' in position 38: character maps to <undefined>
C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\encodings\cp1252.py:19: UnicodeEncodeError
___________________________ test_extractor_content ____________________________
def test_extractor_content():
with open("tests/verena/ausschreibung_test_input.html", "r") as f:
with open("tests/verena/ausschreibung_correct_result.json", "r") as correct:
content = "<html><body>" + f.read() + "</body></html>"
ve = VerenaExtractor(content)
res = ve.extract()
> assert len(res) == 1 and res[0] == json.loads(correct.read())
E AssertionError: assert (1 == 1 and {'comments': ...ulingen', ...} == {'comments': ...ulingen', ...}
E + where 1 = len([{'comments': 'Bemerkung zur Stelle: Testbemerkung', 'contact': {'fax': '0172 2222 2222', 'homepage': 'http://www.eine...line/': '17.09.2021', 'desc': 'Eine Schule\nSchule der Sekundarstufe II\ndes Landkreis Schuling\n9999 Schulingen', ...}])
E Omitting 11 identical items, use -vv to show
E Differing items:
E {'contact': {'fax': '0172 2222 2222', 'homepage': 'http://www.eine-schule.de/', 'mail': {'adress': 'bewerbung@eineschul...'mailto:[email protected]?subject=Stellenausschreibung in VERENA', 'subject': 'Stellenausschreibung in VERENA'}}} != {'contact': {'fax': '0172 2222 2222', 'homepage': 'http://www.eine-schule.de/', 'mail': {'adress': '[email protected]?subject=Stellenausschreibung in VERENA', 'subject': 'Stellenausschreibung in VERENA'}, 'phone': '0172 1111 1111'}}
E Full diff:
E {
E 'comments': 'Bemerkung zur Stelle: Testbemerkung',
E 'contact': {'fax': '0172 2222 2222',
E 'homepage': 'http://www.eine-schule.de/',
E 'mail': {'adress': '[email protected]',
E 'raw': 'mailto:[email protected]?subject=Stellenausschreibung '
E 'in VERENA',
E - 'subject': 'Stellenausschreibung in VERENA'},
E + 'subject': 'Stellenausschreibung in VERENA'}},
E ? +
E - 'phone': '0172 1111 1111'},
E 'deadline': '17.09.2021',
E 'desc': 'Eine Schule\n'
E 'Schule der Sekundarstufe II\n'
E 'des Landkreis Schuling\n'
E '9999 Schulingen',
E 'duration': '01.01.2021 - 01.01.2022',
E 'geolocation': {'coord_system': 'epsg:25832',
E 'coordinates': [1111111,
E 1111111],
E 'post_adress': 'Eine Straße 1\n'
E '99999 Schulingen'},
E 'hours_per_week': '13,5',
E 'replacement_job_title': 'Lehrkraft',
E 'replacement_job_type': 'Vertretung',
E 'replacement_job_type_raw': 'Vertretung für',
E 'school_id': '99999',
E 'subjects': ['Fach 1',
E 'Fach 2'],
E })
tests\verena\test_verenaextractor.py:12: AssertionError
============================== warnings summary ===============================
Hi,
I would like to extract all the charging stations in Germany. I guess the ladestationen API would give me this info. However, I'm not sure what "geometry" should be provided as input here. Also, the URI in the example is localhost. Could you please provide a valid URI?
Thanks
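For the geometry parameter in the question above, a plausible input is an ArcGIS-style envelope covering all of Germany. The exact shape the ladestationen endpoint expects is an assumption here and should be checked against the API reference; the coordinates below are an approximate WGS84 bounding box for Germany:

```python
import json

# Approximate bounding box for Germany in WGS84 lon/lat, expressed as an
# ArcGIS-style envelope. The parameter shape is an assumption -- verify it
# against the ladestationen API documentation before use.
germany_bbox = {
    "xmin": 5.87, "ymin": 47.27,
    "xmax": 15.04, "ymax": 55.06,
    "spatialReference": {"wkid": 4326},
}
geometry = json.dumps(germany_bbox)
print(geometry)
```

A request with such an envelope would return stations for the whole country; smaller envelopes can be used to page through regions.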
The imports of Bundesanzeiger and Handelsregister do not work in a fresh install of the "deutschland" package.
How to reproduce:
Expected result:
What I got instead:
>>> from deutschland import Bundesanzeiger
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: cannot import name 'Bundesanzeiger' from 'deutschland' (/opt/homebrew/Caskroom/miniforge/base/envs/py_de/lib/python3.9/site-packages/deutschland/__init__.py)
>>> from deutschland import Handelsregister
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: cannot import name 'Handelsregister' from 'deutschland' (/opt/homebrew/Caskroom/miniforge/base/envs/py_de/lib/python3.9/site-packages/deutschland/__init__.py)
My environment:
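The failure above is standard Python import mechanics: `deutschland` exposes lowercase submodules, and `from deutschland import Bundesanzeiger` asks for a name that `deutschland/__init__.py` never exports. The working form, per the project docs, is `from deutschland import bundesanzeiger` followed by `bundesanzeiger.Bundesanzeiger()`. The same behaviour can be demonstrated with the stdlib `email` package as a stand-in:

```python
# A submodule can be imported as a module from its package...
from email import message  # works: 'message' is a submodule

# ...but a class living inside that submodule is not a package attribute:
try:
    from email import Message  # fails: 'Message' is not in email/__init__.py
except ImportError as exc:
    print("ImportError:", exc)

# The class must be imported from the submodule itself:
from email.message import Message
print(Message.__module__)  # → email.message
```

Applied to this issue: use `from deutschland import bundesanzeiger` (module), then instantiate `bundesanzeiger.Bundesanzeiger()` (class), rather than importing the class from the top-level package.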
Hi @LilithWittmann @wirthual, there have been a couple of changes since the last release at the end of 2022. Could you please create a new release of the library?
Thanks in advance
Python 3.11 supports the following Pillow versions: Pillow >= 9.3
readthedocs Pillow
deutschland 0.3.2 requires Pillow<9.0.0,>=8.3.1
Error message from prompt:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
deutschland 0.3.2 requires Pillow<9.0.0,>=8.3.1, but you have pillow 9.3.0 which is incompatible.
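The conflict is unsolvable for pip because the two requirements are disjoint: deutschland 0.3.2 pins Pillow to >=8.3.1,<9.0.0, while Python 3.11 needs Pillow >= 9.3. A small sketch (plain tuple comparison, standing in for real specifier parsing) makes the empty intersection visible:

```python
# deutschland 0.3.2 pin: Pillow >=8.3.1, <9.0.0
def satisfies_pin(v):
    return (8, 3, 1) <= v < (9, 0, 0)

# Python 3.11 support per Pillow: >= 9.3
def satisfies_py311(v):
    return v >= (9, 3, 0)

candidates = [(8, 3, 1), (8, 4, 0), (9, 0, 0), (9, 3, 0), (9, 5, 0)]
both = [v for v in candidates if satisfies_pin(v) and satisfies_py311(v)]
print(both)  # → [] -- no Pillow version satisfies both constraints
```

The only real fix is relaxing the upper bound in the package's dependency pins, which is what the release request above would ship.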
Hello,
when trying to use the destatis API (timeseries_data, to be exact), it throws an SSLCertVerificationError.
MaxRetryError: HTTPSConnectionPool(host='www-genesis.destatis.de', port=443): Max retries exceeded with url: /genesisWS/rest/2020/data/timeseries?username=*******&password=******* (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:997)')))
My code is just the example for timeseries_data, with a valid username and password provided.
The host URL is https://www-genesis.destatis.de/genesisWS/rest/2020.
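"self signed certificate in certificate chain" usually means the chain presented to the client contains a certificate not in the local trust store, often one injected by a TLS-inspecting corporate proxy, since the destatis endpoint itself serves a publicly trusted certificate. A hedged stdlib sketch of the usual remedy (adding the extra CA rather than disabling verification; the file name `corporate-ca.pem` is hypothetical):

```python
import ssl

# A default context verifies the peer certificate and the hostname.
ctx = ssl.create_default_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # → True
print(ctx.check_hostname)                    # → True

# Remedy: trust the extra (e.g. proxy) CA instead of turning verification off.
# ctx.load_verify_locations(cafile="corporate-ca.pem")  # hypothetical file
#
# For the generated destatis client, the Configuration object typically
# accepts a CA bundle path (assumption based on OpenAPI-generated clients):
# conf = destatis.Configuration(ssl_ca_cert="corporate-ca.pem")
```

Disabling verification entirely would also silence the error, but it removes the protection TLS provides and is best avoided outside quick local debugging.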
The data is provided by the AdV SmartMapping.
The link is outdated.