bundesapi / deutschland
Germany's most important APIs in one Python package.
License: Apache License 2.0
Note that the RS ("Bundeseinheitlicher Regionalschlüssel") is used here, not the AGS ("Amtlicher Gemeindeschlüssel") as used by the DWD.
However, only the RS of city districts and county districts works, not that of towns.
The following request returns all warnings for "Stadtkreis Heilbronn" / "Universitätsstadt Heilbronn" (["081210000000","Heilbronn, Universitätsstadt",null]):
https://warnung.bund.de/api31/dashboard/081210000000.json
But the following request does not return the warnings for "Möckmühl" (["081255007063","Möckmühl, Stadt",null]):
https://warnung.bund.de/api31/dashboard/081255007063.json
Instead, we must use the RS of "Landkreis Heilbronn" (which is not provided in the linked AGS table):
https://warnung.bund.de/api31/dashboard/081250000000.json
But in this case, we receive all warnings for the county district, not only those for the targeted town.
How can one receive only the warnings for a single town (e.g. Möckmühl), as the NINA app does today (warning level == "Gemeinde")?
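Lacking a documented Gemeinde-level endpoint, one client-side workaround could be to query the county dashboard and filter by regional key. A minimal sketch, assuming each warning carries the affected RS values under a field here called `regionalkeys` (that field name is hypothetical; the real payload may differ):

```python
# Sketch: narrow county-level NINA dashboard warnings down to one town.
# Assumption: each warning dict lists affected RS values under "regionalkeys".
def warnings_for_town(county_warnings, town_rs):
    """Keep warnings that list the town's RS (or the whole county) as affected."""
    # The first 5 RS digits identify the county: "081255007063" -> "081250000000"
    county_rs = town_rs[:5] + "0000000"
    return [
        w for w in county_warnings
        if town_rs in w.get("regionalkeys", [])
        or county_rs in w.get("regionalkeys", [])
    ]
```

Whether NINA exposes per-warning regional keys in exactly this shape would need to be checked against a real dashboard response.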
It would be great to change the documentation to:
from deutschland import bundesanzeiger
ba = bundesanzeiger.Bundesanzeiger()
This worked.
Is there a good API for courts in Germany, so that one can send filings to the correct court?
E.g. a website to scrape could be this one:
This would currently be helpful for FragDenStaat, which sometimes chooses the wrong court in its complaint generation tool: bundesAPI/sofortmassnahmen#64
Maybe we should switch to the latest tag to avoid future breaking changes.
```python
import time
from pprint import pprint

from deutschland import destatis
from deutschland.destatis.api import default_api

with destatis.ApiClient() as api_client:
    # Create an instance of the API class
    api_instance = default_api.DefaultApi(api_client)
    username = "xxx"
    password = "xxx"
    name = "45341-0102"
    area = "all"
    compress = "false"
    transpose = "false"
    startyear = "startyear_example"
    endyear = "endyear_example"
    timeslices = "timeslices_example"
    regionalvariable = "regionalvariable_example"
    regionalkey = "regionalkey_example"
    classifyingvariable1 = "classifyingvariable1_example"
    classifyingkey1 = "classifyingkey1_example"
    classifyingvariable2 = "classifyingvariable2_example"
    classifyingkey2 = "classifyingkey2_example"
    classifyingvariable3 = "classifyingvariable3_example"
    classifyingkey3 = "classifyingkey3_example"
    job = "false"
    stand = "01.01.1970 01:00"
    language = "de"
    format = "csv"
    try:
        api_instance.table(
            username=username,
            password=password,
            name=name,
            area=area,
            compress=compress,
            transpose=transpose,
            startyear=startyear,
            endyear=endyear,
            timeslices=timeslices,
            regionalvariable=regionalvariable,
            regionalkey=regionalkey,
            classifyingvariable1=classifyingvariable1,
            classifyingkey1=classifyingkey1,
            classifyingvariable2=classifyingvariable2,
            classifyingkey2=classifyingkey2,
            classifyingvariable3=classifyingvariable3,
            classifyingkey3=classifyingkey3,
            job=job,
            stand=stand,
            language=language,
        )
    except destatis.ApiException as e:
        print("Exception:")
        print(e)
```
Ok... configured analogously to the standard example, and the query runs through without errors.
Probably a dumb question, but how do I get at the result? :)
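Generated OpenAPI client methods normally return the parsed response body directly, so the result only needs to be assigned instead of discarded. A sketch with a stand-in class (`HypotheticalApi` is not the real destatis client; in the real code the call would be `result = api_instance.table(...)`):

```python
# Sketch: generated OpenAPI client methods return the parsed response,
# so capture the call's return value. HypotheticalApi is a stand-in for
# the generated DefaultApi (an assumption about the client's behavior).
class HypotheticalApi:
    def table(self, **params):
        return {"name": params.get("name"), "rows": ["..."]}

api_instance = HypotheticalApi()
result = api_instance.table(name="45341-0102")  # assign instead of discarding
print(result["name"])
```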
I ran the sample code, but had the following error:
OSError Traceback (most recent call last)
in ()
1 from deutschland import Bundesanzeiger
----> 2 ba = Bundesanzeiger()
3 # search term
4 data = ba.get_reports("Deutsche Bahn AG")
5 # returns a dictionary with all reports found as fulltext reports
3 frames
/usr/local/lib/python3.7/dist-packages/keras/saving/save.py in load_model(filepath, custom_objects, compile, options)
202 if isinstance(filepath_str, str):
203 if not tf.io.gfile.exists(filepath_str):
--> 204 raise IOError(f'No file or directory found at {filepath_str}')
205
206 if tf.io.gfile.isdir(filepath_str):
OSError: No file or directory found at assets/model.h5
I noticed there is no linter. ruff is a great option to use.
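If ruff were adopted, a minimal configuration could live in pyproject.toml; the rule selection below is only a suggestion, not an existing project setting:

```toml
[tool.ruff]
line-length = 88

[tool.ruff.lint]
select = ["E", "F", "I"]  # pycodestyle errors, pyflakes, import sorting
```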
How to reproduce:
from deutschland.bundesanzeiger import Bundesanzeiger
ba = Bundesanzeiger()
data = ba.get_reports('4steps systems')
Expected result:
data
What I got instead:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[4], line 3
1 from deutschland.bundesanzeiger import Bundesanzeiger
2 ba = Bundesanzeiger()
----> 3 data = ba.get_reports('4steps systems')
File ~/miniconda3/envs/uregister/lib/python3.8/site-packages/deutschland/bundesanzeiger/bundesanzeiger.py:186, in Bundesanzeiger.get_reports(self, company_name)
182 # perform the search
183 response = self.session.get(
184 f"https://www.bundesanzeiger.de/pub/de/start?0-2.-top%7Econtent%7Epanel-left%7Ecard-form=&fulltext={company_name}&area_select=&search_button=Suchen"
185 )
--> 186 return self.__generate_result(response.text)
File ~/miniconda3/envs/uregister/lib/python3.8/site-packages/deutschland/bundesanzeiger/bundesanzeiger.py:120, in Bundesanzeiger.__generate_result(self, content)
118 """iterate trough all results and try to fetch single reports"""
119 result = {}
--> 120 for element in self.__find_all_entries_on_page(content):
121 get_element_response = self.session.get(element.content_url)
123 if self.__is_captcha_needed(get_element_response.text):
File ~/miniconda3/envs/uregister/lib/python3.8/site-packages/deutschland/bundesanzeiger/bundesanzeiger.py:90, in Bundesanzeiger.__find_all_entries_on_page(self, page_content)
88 soup = BeautifulSoup(page_content, "html.parser")
89 wrapper = soup.find("div", {"class": "result_container"})
---> 90 rows = wrapper.find_all("div", {"class": "row"})
91 for row in rows:
92 info_element = row.find("div", {"class": "info"})
AttributeError: 'NoneType' object has no attribute 'find_all'
I tried other search terms, numeric and non-numeric, and hit the same error pattern.
My env: Ubuntu 22.04, python 3.8.17
In a map oriented to the north, the first coordinate passed to Geo.fetch() is the lower-left corner and the second coordinate is the upper-right corner.
The following image shows both coordinates from the documentation in OpenStreetMap.
# top_right and bottom_left coordinates
data = geo.fetch([52.50876180448243, 13.359631043007212],
[52.530116236589244, 13.426532801586827])
Maybe I misunderstand the names top_right and bottom_left. Please correct me if this is the case. Otherwise, I would work on a pull request for this issue, which renames the variables.
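Independent of the variable names, the ambiguity could also be removed by normalizing the two corners before use, so either order works. A small sketch (not the library's actual code):

```python
# Sketch: accept two opposite corners in any order and return a
# (south, west, north, east) bounding box. Not part of the library.
def normalize_bbox(corner_a, corner_b):
    (lat1, lon1), (lat2, lon2) = corner_a, corner_b
    return (min(lat1, lat2), min(lon1, lon2), max(lat1, lat2), max(lon1, lon2))
```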
I am receiving the following error message; it looks like the model used to solve the captchas does not exist in the assets/model.h5 directory:
OSError: SavedModel file does not exist at: assets/model.h5/{saved_model.pbtxt|saved_model.pb}
Here's my code:
from deutschland import Bundesanzeiger
ba = Bundesanzeiger()
# search term
data = ba.get_reports("Deutsche Bahn AG")
# returns a dictionary with all reports found as fulltext reports
print(data.keys())
# dict_keys(['Jahresabschluss zum Geschäftsjahr vom 01.01.2020 bis zum 31.12.2020', 'Konzernabschluss zum Geschäftsjahr vom 01.01.2020 bis zum 31.12.2020\nErgänzung der Veröffentlichung vom 04.06.2021',
Hi,
I have exported your model for solving the CAPTCHAs to the ONNX format. This has the advantage that you get rid of TensorFlow as a dependency (~500 MB) and can use onnxruntime (~5 MB) for inference instead, which would make the whole project much more lightweight. There are also significantly fewer problems updating onnxruntime than TensorFlow without breaking the model.
And I could also fix #8, if you're interested.
What would one have to change in the code to search only within the Rechnungslegung/Finanzberichte section?
I suspect the requested URL would need something in "area_select=":
response = self.session.get(
    f"https://www.bundesanzeiger.de/pub/de/start?0-2.-top%7Econtent%7Epanel-left%7Ecard-form=&fulltext={company_name}&area_select=&search_button=Suchen"
)
update to format and test workflow
I started to work on the documentation (see #41) and I have a few suggestions:
We need to find a way to create API bindings from all the openapi specs to integrate them automatically into the deutschland lib.
Any suggestions on how to tackle this?
I think the Handelsregister code in this repo is deprecated in favor of the Handelsregister repo.
This means we should be able to clean up this repo and move the tests over to the Handelsregister repo.
Looks like a new release should fix this problem soon.
After fresh virtualenv install via:
pipenv install git+https://github.com/bundesAPI/deutschland.git#egg=deutschland
I get:
Python 3.9.5 (default, Nov 23 2021, 15:27:38)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
import deutschland as de
from deutschland import Geo
Traceback (most recent call last):
File "", line 1, in
ImportError: cannot import name 'Geo' from 'deutschland' (unknown location)
geo = Geo()
Traceback (most recent call last):
File "", line 1, in
NameError: name 'Geo' is not defined
data = geo.fetch([52.530116236589244, 13.426532801586827],
... [52.50876180448243, 13.359631043007212])
Traceback (most recent call last):
File "", line 1, in
NameError: name 'geo' is not defined
A similar problem with Google Colab (Python 3.7):
!pip install deutschland
I get the following:
Attempting uninstall: urllib3
Found existing installation: urllib3 1.24.3
Uninstalling urllib3-1.24.3:
Successfully uninstalled urllib3-1.24.3
Attempting uninstall: requests
Found existing installation: requests 2.23.0
Uninstalling requests-2.23.0:
Successfully uninstalled requests-2.23.0
Attempting uninstall: regex
Found existing installation: regex 2022.6.2
Uninstalling regex-2022.6.2:
Successfully uninstalled regex-2022.6.2
Attempting uninstall: numpy
Found existing installation: numpy 1.21.6
Uninstalling numpy-1.21.6:
Successfully uninstalled numpy-1.21.6
Attempting uninstall: Pillow
Found existing installation: Pillow 7.1.2
Uninstalling Pillow-7.1.2:
Successfully uninstalled Pillow-7.1.2
Attempting uninstall: pandas
Found existing installation: pandas 1.3.5
Uninstalling pandas-1.3.5:
Successfully uninstalled pandas-1.3.5
Attempting uninstall: lxml
Found existing installation: lxml 4.2.6
Uninstalling lxml-4.2.6:
Successfully uninstalled lxml-4.2.6
Attempting uninstall: beautifulsoup4
Found existing installation: beautifulsoup4 4.6.3
Uninstalling beautifulsoup4-4.6.3:
Successfully uninstalled beautifulsoup4-4.6.3
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
xarray-einstats 0.2.2 requires numpy>=1.21, but you have numpy 1.19.0 which is incompatible.
tensorflow 2.8.2+zzzcolab20220527125636 requires numpy>=1.20, but you have numpy 1.19.0 which is incompatible.
google-colab 1.0.0 requires requests~=2.23.0, but you have requests 2.28.1 which is incompatible.
datascience 0.10.6 requires folium==0.2.1, but you have folium 0.8.3 which is incompatible.
albumentations 0.1.12 requires imgaug<0.2.7,>=0.2.5, but you have imgaug 0.2.9 which is incompatible.
Successfully installed Pillow-8.4.0 beautifulsoup4-4.11.1 boto3-1.24.34 botocore-1.27.34 dateparser-1.1.1 de-autobahn-1.0.4 de-bundesrat-0.1.0 de-bundestag-0.1.0 de-dwd-1.0.1 de-interpol-0.1.0 de-jobsuche-0.1.0 de-ladestationen-1.0.5 de-mudab-0.1.0 de-nina-1.0.2 de-polizei-brandenburg-0.1.0 de-risikogebiete-0.1.0 de-smard-0.1.0 de-strahlenschutz-1.0.0 de-travelwarning-0.1.0 de-zoll-0.1.0 deutschland-0.3.0 gql-2.0.0 graphql-core-2.3.2 jmespath-1.0.1 lxml-4.9.1 mapbox-vector-tile-1.2.1 numpy-1.19.0 onnxruntime-1.10.0 pandas-1.1.5 pyclipper-1.3.0.post3 pypresseportal-0.1 regex-2022.3.2 requests-2.28.1 rx-1.6.1 s3transfer-0.6.0 slugify-0.0.1 urllib3-1.26.10
WARNING: The following packages were previously imported in this runtime:
[PIL,numpy]
You must restart the runtime in order to use newly installed versions.
import deutschland as de
geo = de.Geo()
AttributeError Traceback (most recent call last)
in ()
1 import deutschland as de
----> 2 geo = de.Geo()
AttributeError: module 'deutschland' has no attribute 'Geo'
Running an older version:
deutschland = "==0.1.9"
I don't get the import errors with Geo(), but an empty result:
Python 3.9.13 (main, May 23 2022, 22:01:06)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
from deutschland import Geo
geo = Geo()
data = geo.fetch([52.530116236589244, 13.426532801586827],
... [52.50876180448243, 13.359631043007212])
print(data.keys())
dict_keys([])
With the Bundesanzeiger I get the import error again:
from deutschland import Bundesanzeiger
ba = Bundesanzeiger()
2022-07-22 00:09:52.683533: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-07-22 00:09:52.683555: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Traceback (most recent call last):
File "", line 1, in
File "/home/moritz/.local/share/virtualenvs/deutschland-fqErnsp1/lib/python3.9/site-packages/deutschland/bundesanzeiger/bundesanzeiger.py", line 47, in init
self.model = deutschland.bundesanzeiger.model.load_model()
File "/home/moritz/.local/share/virtualenvs/deutschland-fqErnsp1/lib/python3.9/site-packages/deutschland/bundesanzeiger/model.py", line 36, in load_model
return keras.models.load_model(
File "/home/moritz/.local/share/virtualenvs/deutschland-fqErnsp1/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/home/moritz/.local/share/virtualenvs/deutschland-fqErnsp1/lib/python3.9/site-packages/keras/saving/save.py", line 206, in load_model
raise IOError(f'No file or directory found at {filepath_str}')
OSError: No file or directory found at assets/model.h5
data = ba.get_reports("Deutsche Bahn AG")
Traceback (most recent call last):
File "", line 1, in
NameError: name 'ba' is not defined
Could you point to what I am doing wrong? Best regards
>>> de.fetch([48.51999, 9.07136], [48.51999, 9.07137])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ajung/src/deutschland/lib/python3.9/site-packages/deutschland/geo.py", line 81, in fetch
return parsed
UnboundLocalError: local variable 'parsed' referenced before assignment
A new version of the API generator has been released which should improve the quality of the generated Python code (e.g. type hints).
We need to evaluate whether our current process of generating the clients still works with this newest release.
Possibly we will need to adapt the post-processing script for the new code.
All,
I'd like to create another library for .NET, so I thought I'd start with something simple like the geodata API. It turns out I couldn't find the OpenAPI doc.
Is there one? Or is there another kind of documentation that just wasn't enough to make a repository yet? A cURL/Postman/etc. example request would be enough for me, as reading Python is not one of my strong suits (although I have already started doing that).
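For what it's worth, the tile URLs that show up in other issues here (e.g. https://adv-smart.de/tiles/smarttiles_de_public_v1/15/17605/10745.pbf) look like standard slippy-map vector tiles in a {z}/{x}/{y}.pbf scheme. Under that assumption, the tile indices for a WGS84 coordinate follow the usual Web Mercator formula, which is language-agnostic and easy to port to .NET:

```python
import math

# Sketch: standard Web Mercator (slippy map) tile math. Assumes the geo
# endpoint follows the usual {z}/{x}/{y}.pbf scheme seen in the tile URLs.
def deg2tile(lat_deg, lon_deg, zoom):
    n = 2 ** zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat_deg))) / math.pi) / 2.0 * n)
    return x, y

# A point in central Berlin at zoom 15
print(deg2tile(52.5, 13.4, 15))  # -> (17603, 10749)
```

Fetching a bounding box then means iterating over all (x, y) pairs between the tiles of the two corners and requesting each .pbf.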
Execute tests with pytest.
Tests fail with:
_______________________ test_for_no_data_handelsregister _______________________
def test_for_no_data_handelsregister():
hr = Handelsregister()
data = hr.search(keywords="foobar", keyword_match_option=3)
> assert (
len(data) == 0
), "Found registered companies for 'foobar' although none were expected."
E TypeError: object of type 'NoneType' has no len()
tests/integration_test.py:19: TypeError
___________ test_fetching_handelsregister_data_for_deutsche_bahn_ag ____________
def test_fetching_handelsregister_data_for_deutsche_bahn_ag():
hr = Handelsregister()
data = hr.search(
keywords="Deutsche Bahn Aktiengesellschaft", keyword_match_option=3
)
> assert (
len(data) > 0
), "Found no data for 'Deutsche Bahn Aktiengesellschaft' although it should exist."
E TypeError: object of type 'NoneType' has no len()
tests/integration_test.py:29: TypeError
___ test_fetching_handelsregister_data_for_deutsche_bahn_ag_with_raw_params ____
def test_fetching_handelsregister_data_for_deutsche_bahn_ag_with_raw_params():
r = Registrations()
data = r.search_with_raw_params(
{"schlagwoerter": "Deutsche Bahn Aktiengesellschaft", "schlagwortOptionen": 3}
)
> assert (
len(data) > 0
), "Found no data for 'Deutsche Bahn Aktiengesellschaft' although it should exist."
E TypeError: object of type 'NoneType' has no len()
The example from the README does not work:
from deutschland.handelsregister import Handelsregister
hr = Handelsregister()
# search by keywords, see documentation for all available params
hr.search(keywords="Deutsche Bahn Aktiengesellschaft")  # returns None here
print(hr)  # this doesn't work anyway; it just prints the Handelsregister object
I looked into this a bit yesterday, and the server returns a 404. Could it be that the endpoint is now https://www.handelsregister.de/rp_web/normalesuche.xhtml? The parameters are named somewhat differently too. After my IP got blocked yesterday, I stopped investigating.
Some reports share the same name, e.g. Mitteilung von Netto-Leerverkaufspositionen. The problem is that get_reports cannot return multiple items with the same name because it returns a dictionary/map.
#!/opt/homebrew/bin/python3
from deutschland.bundesanzeiger import Bundesanzeiger
ba = Bundesanzeiger()
data = ba.get_reports("DE000A0TGJ55")
print(data.keys())
print(data["Mitteilung von Netto-Leerverkaufspositionen"])
dict_keys(['Mitteilung von Netto-Leerverkaufspositionen'])
{'date': datetime.datetime(2022, 9, 26, 0, 0), 'name': 'Mitteilung von Netto-Leerverkaufspositionen', 'company': 'BlackRock Investment Management (UK) Limited', 'report': '\n\n\n\n\xa0\n\n\n\n\n\n\n\nBlackRock Investment Management (UK) Limited\nLondon\nMitteilung von Netto-Leerverkaufspositionen\nZu folgendem Emittenten wird vom oben genannten Positionsinhaber eine Netto-Leerverkaufsposition\n gehalten:\n\nVARTA AKTIENGESELLSCHAFT\n\n\nISIN: DE000A0TGJ55\n\nDatum der Position: 23.09.2022\nProzentsatz des ausgegebenen Aktienkapitals: 2,26 %\n\xa0\n\n\n\n\n\n\n\n\n\n\n\n\n'}
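Until the return shape changes (e.g. to a list), colliding names could be disambiguated by suffixing a counter when building the dictionary. A sketch over a hypothetical list of report dicts, not the library's internals:

```python
# Sketch: give reports with identical names unique dictionary keys by
# appending a counter. Input is a hypothetical list of report dicts.
def reports_to_dict(reports):
    result = {}
    counts = {}
    for report in reports:
        name = report["name"]
        counts[name] = counts.get(name, 0) + 1
        key = name if counts[name] == 1 else f"{name} ({counts[name]})"
        result[key] = report
    return result
```

Returning a list of reports would arguably be the cleaner fix, but it would break the existing dictionary-based API.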
Spectral removed auto-detection for OpenAPI schemas. Because of this, linting fails when no .spectral.yaml is provided.
See: stoplightio/spectral#1796
In the Autobahn repo there is a .spectral.yaml file to account for that.
However, if we do not expect to use custom linting rulesets, we can simply get rid of this extra file by executing this before the linting process:
echo "extends: spectral:oas" > .spectral.yaml
Add the topic hacktoberfest to allow participants of Hacktoberfest (more information at https://hacktoberfest.digitalocean.com/) to count pull requests made here.
This could also help more people discover the repository and the whole organization.
I was just trying to install this package and couldn't get it to work using Python 3.9 on Windows 10, mostly due to numpy/blas/lapack/mkl errors somewhere deep down. However, reading the pyproject.toml I saw the mention of Python 3.6.2.
=> Using Python 3.6 I was able to install this package without errors.
Hi,
trying to get your example https://github.com/bundesAPI/deutschland#geographic-data
to run …
pip3 install deutschland
on my macOS 12.5 with homebrew results in:
Requirement already satisfied: deutschland in /opt/homebrew/lib/python3.9/site-packages (0.1.4)
Requirement already satisfied: mapbox-vector-tile<2.0.0,>=1.2.1 in /opt/homebrew/lib/python3.9/site-packages (from deutschland) (1.2.1)
Requirement already satisfied: Shapely<2.0.0,>=1.7.1 in /opt/homebrew/lib/python3.9/site-packages (from deutschland) (1.8.2)
Requirement already satisfied: requests<3.0.0,>=2.26.0 in /opt/homebrew/lib/python3.9/site-packages (from deutschland) (2.28.1)
Requirement already satisfied: pyclipper in /opt/homebrew/lib/python3.9/site-packages (from mapbox-vector-tile<2.0.0,>=1.2.1->deutschland) (1.3.0.post3)
Requirement already satisfied: setuptools in /opt/homebrew/lib/python3.9/site-packages (from mapbox-vector-tile<2.0.0,>=1.2.1->deutschland) (63.3.0)
Requirement already satisfied: protobuf in /opt/homebrew/lib/python3.9/site-packages (from mapbox-vector-tile<2.0.0,>=1.2.1->deutschland) (4.21.4)
Requirement already satisfied: future in /opt/homebrew/lib/python3.9/site-packages (from mapbox-vector-tile<2.0.0,>=1.2.1->deutschland) (0.18.2)
Requirement already satisfied: certifi>=2017.4.17 in /opt/homebrew/lib/python3.9/site-packages (from requests<3.0.0,>=2.26.0->deutschland) (2022.6.15)
Requirement already satisfied: idna<4,>=2.5 in /opt/homebrew/lib/python3.9/site-packages (from requests<3.0.0,>=2.26.0->deutschland) (3.3)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /opt/homebrew/lib/python3.9/site-packages (from requests<3.0.0,>=2.26.0->deutschland) (1.26.11)
Requirement already satisfied: charset-normalizer<3,>=2 in /opt/homebrew/lib/python3.9/site-packages (from requests<3.0.0,>=2.26.0->deutschland) (2.1.0)
[notice] A new release of pip available: 22.2.1 -> 22.2.2
[notice] To update, run: python3.9 -m pip install --upgrade pip
(Didn’t save the initial output, but it seems to be successfully installed)
Now running your example code of https://github.com/bundesAPI/deutschland#geographic-data
fails early in the game:
% python3 deutschland.py
Traceback (most recent call last):
File "/Users/ghoffart/src/deutschland.py", line 1, in <module>
from deutschland.geo import Geo
File "/Users/ghoffart/src/deutschland.py", line 1, in <module>
from deutschland.geo import Geo
ModuleNotFoundError: No module named 'deutschland.geo'; 'deutschland' is not a package
Am I overlooking something very obvious? Sorry, Python’s really not part of my knowledge package :-o
The installation of Shapely doesn't work in a fresh environment.
How to reproduce:
Expected result:
What I got instead:
ERROR: Cannot install deutschland==0.1.0, deutschland==0.1.1, deutschland==0.1.2, deutschland==0.1.3, deutschland==0.1.4, deutschland==0.1.5, deutschland==0.1.6 and deutschland==0.1.7 because these package versions have conflicting dependencies.
The conflict is caused by:
deutschland 0.1.7 depends on Shapely<2.0.0 and >=1.7.1
Workaround:
pip install Shapely
My environment:
Seems like the API changed:
When I run the test, I get the following:
HTTPSConnectionPool(host='adv-smart.de', port=443): Max retries exceeded with url: /tiles/smarttiles_de_public_v1/15/17605/10747.pbf (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1091)')))
To reproduce:
wget https://adv-smart.de/tiles/smarttiles_de_public_v1/15/17605/10745.pbf --no-check-certificate
Gives 404 error.
@LilithWittmann Do you know more?
When you're in paranoia mode and want to use (anonymous) proxies, replace this line
deutschland/deutschland/geo.py
Line 67 in b85d765
with this one
result = requests.get(url, headers=headers, proxies=proxies)
and add this to your main function (with your proxy servers of course):
proxies = {
'http': 'http://10.10.1.10:3128',
'https': 'http://10.10.1.10:1080',
}
I thought about contributing to your package by adding the extended search functionality (i.e. not only search for all documents but add the possibility to limit the search to certain types of documents).
Unfortunately, this only works for certain companies, while for certain others the captcha solver always fails. Any ideas why that might be?
(e.g. it works without errors for "Deutsche Bahn AG" but keeps failing for "Deutsche Bank AG")
Change:
add the value 22 to the search request
response = self.session.get(
f"https://www.bundesanzeiger.de/pub/de/start?0-2.-top%7Econtent%7Epanel-left%7Ecard-form=&fulltext={company_name}&area_select=22&search_button=Suchen"
)
The links to the API documentation for the auto-generated API clients are currently not working on https://pypi.org/project/deutschland/ because they are relative links.
For example: https://pypi.org/project/deutschland/docs/nina/README.md
Are absolute links the way to go here, or is there a better option?
If absolute links are okay, I can do a PR for that.
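If absolute links are the way to go, the README could also be rewritten automatically before publishing instead of by hand. A regex-based sketch; the base URL and the helper name are assumptions, not existing project code:

```python
import re

# Sketch: turn relative markdown links like [text](docs/nina/README.md)
# into absolute GitHub URLs before publishing to PyPI. BASE_URL is an
# assumption, not an existing project setting.
BASE_URL = "https://github.com/bundesAPI/deutschland/blob/main/"

def absolutize_links(markdown):
    # Leave http(s), mailto and in-page anchor links untouched.
    pattern = re.compile(r"\]\((?!https?://|mailto:|#)([^)]+)\)")
    return pattern.sub(lambda m: "](" + BASE_URL + m.group(1) + ")", markdown)
```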
I’ve searched and searched, but I couldn’t see a license file.
She's on Twitter and is the Federal Government's Commissioner for Digitalisation. Maybe she's also on GitHub and can put in a good word with Merkel?
I would like to retrieve all PDFs found for a given search term in the "Bundesanzeiger". Maybe you can help me with this!?
Proper documentation would be quite useful, especially as the system keeps growing.
The usage documentation in the README is nice for getting started but does not show all parameters or advanced usage.
Running the demo code in the README.md for the Handelsregister module returns an error.
How to reproduce:
>>> from deutschland import Bundesanzeiger
>>> from deutschland import Handelsregister
>>> hr = Handelsregister()
>>> hr.search(keywords="Deutsche Bahn Aktiengesellschaft")
Expected result:
What I got instead:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/xxx/Dokumente/bundesapi/test_bundesapi_2/env/lib/python3.9/site-packages/deutschland/handelsregister/handelsregister.py", line 138, in search
return self.search_with_raw_params(params)
File "/home/xxx/Dokumente/bundesapi/test_bundesapi_2/env/lib/python3.9/site-packages/deutschland/handelsregister/handelsregister.py", line 215, in search_with_raw_params
return self.__find_entries(soup)
File "/home/xxx/Dokumente/bundesapi/test_bundesapi_2/env/lib/python3.9/site-packages/deutschland/handelsregister/handelsregister.py", line 242, in __find_entries
data = self.__extract_history(tr)
File "/home/xxx/Dokumente/bundesapi/test_bundesapi_2/env/lib/python3.9/site-packages/deutschland/handelsregister/handelsregister.py", line 276, in __extract_history
[position, historical_name] = tds[1].text.strip().split(".) ", 1)
ValueError: not enough values to unpack (expected 2, got 1)
My env:
Running the following sample code:
from deutschland.bundesanzeiger import Bundesanzeiger
ba = Bundesanzeiger()
data = ba.get_reports("Deutsche Bahn AG")
throws the following error:
/usr/local/lib/python3.10/dist-packages/deutschland/bundesanzeiger/bundesanzeiger.py in __find_all_entries_on_page(self, page_content)
88 soup = BeautifulSoup(page_content, "html.parser")
89 wrapper = soup.find("div", {"class": "result_container"})
---> 90 rows = wrapper.find_all("div", {"class": "row"})
91 for row in rows:
92 info_element = row.find("div", {"class": "info"})
AttributeError: 'NoneType' object has no attribute 'find_all'
I am using Python 3.9 in a Docker image and plan to use the bundesAPI in a
headless project. But the Bundesanzeiger API uses google-chrome;
can this be avoided?
ba = Bundesanzeiger()
====== WebDriver manager ======
====== WebDriver manager ======
/bin/sh: 1: google-chrome: not found
/bin/sh: 1: google-chrome-stable: not found
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.9/site-packages/deutschland/bundesanzeiger/bundesanzeiger.py", line 46, in __init__
self.driver = webdriver.Chrome(ChromeDriverManager().install(), options=options)
File "/usr/local/lib/python3.9/site-packages/webdriver_manager/chrome.py", line 25, in __init__
self.driver = ChromeDriver(name=name,
File "/usr/local/lib/python3.9/site-packages/webdriver_manager/driver.py", line 57, in __init__
self.browser_version = chrome_version(chrome_type)
File "/usr/local/lib/python3.9/site-packages/webdriver_manager/utils.py", line 155, in chrome_version
raise ValueError(f'Could not get version for Chrome with this command: {cmd}')
ValueError: Could not get version for Chrome with this command: google-chrome --version || google-chrome-stable --version
Hi, I am trying to find out whether there is a way to access and download (.xml, PDF) data from the command line. If so, how can one execute that?
Command line: handelsregister.py [-h] [-d] [-f] -s SCHLAGWOERTER [-so {all,min,exact}]
Is it even supported?
The tests fail under Windows, as can be seen here.
The reason is that under Windows the telephone symbol is somehow not parsed properly.
When I tried to print it, I saw something like
E UnicodeEncodeError: 'charmap' codec can't encode character '\u260e' in position 38: character maps to <undefined>
Unfortunately, I do not have access to a Windows machine to dig deeper in a reasonable manner.
Raw error:
================================== FAILURES ===================================
_________________________________ test_verena _________________________________
def test_verena():
v = Verena()
> res = v.get()
tests\verena\test_verena.py:6:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\deutschland\verena\verena.py:21: in get
extract = VerenaExtractor(page).extract()
C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\deutschland\verena\verenaextractor.py:38: in extract
phone, fax, homepage, email, deadline = self.__extract_part4(aus_parts[3])
C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\deutschland\verena\verenaextractor.py:158: in __extract_part4
print(x)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <encodings.cp1252.IncrementalEncoder object at 0x000002AC31A54F10>
input = '\\r\\r\\n\\r\\r\\n \u260e 02381 973060\\r\\r\\n '
final = False
def encode(self, input, final=False):
> return codecs.charmap_encode(input,self.errors,encoding_table)[0]
E UnicodeEncodeError: 'charmap' codec can't encode character '\u260e' in position 38: character maps to <undefined>
C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\encodings\cp1252.py:19: UnicodeEncodeError
___________________________ test_extractor_content ____________________________
def test_extractor_content():
with open("tests/verena/ausschreibung_test_input.html", "r") as f:
with open("tests/verena/ausschreibung_correct_result.json", "r") as correct:
content = "<html><body>" + f.read() + "</body></html>"
ve = VerenaExtractor(content)
res = ve.extract()
> assert len(res) == 1 and res[0] == json.loads(correct.read())
E AssertionError: assert (1 == 1 and {'comments': ...ulingen', ...} == {'comments': ...ulingen', ...}
E + where 1 = len([{'comments': 'Bemerkung zur Stelle: Testbemerkung', 'contact': {'fax': '0172 2222 2222', 'homepage': 'http://www.eine...line/': '17.09.2021', 'desc': 'Eine Schule\nSchule der Sekundarstufe II\ndes Landkreis Schuling\n9999 Schulingen', ...}])
E Omitting 11 identical items, use -vv to show
E Differing items:
E {'contact': {'fax': '0172 2222 2222', 'homepage': 'http://www.eine-schule.de/', 'mail': {'adress': 'bewerbung@eineschul...'mailto:[email protected]?subject=Stellenausschreibung in VERENA', 'subject': 'Stellenausschreibung in VERENA'}}} != {'contact': {'fax': '0172 2222 2222', 'homepage': 'http://www.eine-schule.de/', 'mail': {'adress': '[email protected]?subject=Stellenausschreibung in VERENA', 'subject': 'Stellenausschreibung in VERENA'}, 'phone': '0172 1111 1111'}}
E Full diff:
E {
E 'comments': 'Bemerkung zur Stelle: Testbemerkung',
E 'contact': {'fax': '0172 2222 2222',
E 'homepage': 'http://www.eine-schule.de/',
E 'mail': {'adress': '[email protected]',
E 'raw': 'mailto:[email protected]?subject=Stellenausschreibung '
E 'in VERENA',
E - 'subject': 'Stellenausschreibung in VERENA'},
E + 'subject': 'Stellenausschreibung in VERENA'}},
E ? +
E - 'phone': '0172 1111 1111'},
E 'deadline': '17.09.2021',
E 'desc': 'Eine Schule\n'
E 'Schule der Sekundarstufe II\n'
E 'des Landkreis Schuling\n'
E '9999 Schulingen',
E 'duration': '01.01.2021 - 01.01.2022',
E 'geolocation': {'coord_system': 'epsg:25832',
E 'coordinates': [1111111,
E 1111111],
E 'post_adress': 'Eine Straße 1\n'
E '99999 Schulingen'},
E 'hours_per_week': '13,5',
E 'replacement_job_title': 'Lehrkraft',
E 'replacement_job_type': 'Vertretung',
E 'replacement_job_type_raw': 'Vertretung für',
E 'school_id': '99999',
E 'subjects': ['Fach 1',
E 'Fach 2'],
E })
tests\verena\test_verenaextractor.py:12: AssertionError
============================== warnings summary ===============================
Hi,
I would like to extract all the charging stations in Germany. I guess the ladestationen API would give me this info. However, I'm not sure what "geometry" should be provided as input here. Also, the URI in the example is localhost. Could you please provide a valid URI?
Thanks
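For the geometry parameter in the question above, a plausible input is an ArcGIS-style envelope covering all of Germany. The exact shape the ladestationen endpoint expects is an assumption here and should be checked against the API reference; the coordinates below are an approximate WGS84 bounding box for Germany:

```python
import json

# Approximate bounding box for Germany in WGS84 lon/lat, expressed as an
# ArcGIS-style envelope. The parameter shape is an assumption -- verify it
# against the ladestationen API documentation before use.
germany_bbox = {
    "xmin": 5.87, "ymin": 47.27,
    "xmax": 15.04, "ymax": 55.06,
    "spatialReference": {"wkid": 4326},
}
geometry = json.dumps(germany_bbox)
print(geometry)
```

A request with such an envelope would return stations for the whole country; smaller envelopes can be used to page through regions.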
The imports of Bundesanzeiger and Handelsregister do not work in a fresh install of the "deutschland" package.
How to reproduce:
Expected result:
What I got instead:
>>> from deutschland import Bundesanzeiger
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: cannot import name 'Bundesanzeiger' from 'deutschland' (/opt/homebrew/Caskroom/miniforge/base/envs/py_de/lib/python3.9/site-packages/deutschland/__init__.py)
>>> from deutschland import Handelsregister
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: cannot import name 'Handelsregister' from 'deutschland' (/opt/homebrew/Caskroom/miniforge/base/envs/py_de/lib/python3.9/site-packages/deutschland/__init__.py)
My environment:
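The failure above is standard Python import mechanics: `deutschland` exposes lowercase submodules, and `from deutschland import Bundesanzeiger` asks for a name that `deutschland/__init__.py` never exports. The working form, per the project docs, is `from deutschland import bundesanzeiger` followed by `bundesanzeiger.Bundesanzeiger()`. The same behaviour can be demonstrated with the stdlib `email` package as a stand-in:

```python
# A submodule can be imported as a module from its package...
from email import message  # works: 'message' is a submodule

# ...but a class living inside that submodule is not a package attribute:
try:
    from email import Message  # fails: 'Message' is not in email/__init__.py
except ImportError as exc:
    print("ImportError:", exc)

# The class must be imported from the submodule itself:
from email.message import Message
print(Message.__module__)  # → email.message
```

Applied to this issue: use `from deutschland import bundesanzeiger` (module), then instantiate `bundesanzeiger.Bundesanzeiger()` (class), rather than importing the class from the top-level package.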
Hi @LilithWittmann @wirthual, there have been a couple of changes since the last release at the end of 2022. Could you please create a new release of the library?
Thanks in advance
Python 3.11 supports the following Pillow versions: Pillow >= 9.3
readthedocs Pillow
deutschland 0.3.2 requires Pillow<9.0.0,>=8.3.1
Error message from prompt:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
deutschland 0.3.2 requires Pillow<9.0.0,>=8.3.1, but you have pillow 9.3.0 which is incompatible.
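The conflict is unsolvable for pip because the two requirements are disjoint: deutschland 0.3.2 pins Pillow to >=8.3.1,<9.0.0, while Python 3.11 needs Pillow >= 9.3. A small sketch (plain tuple comparison, standing in for real specifier parsing) makes the empty intersection visible:

```python
# deutschland 0.3.2 pin: Pillow >=8.3.1, <9.0.0
def satisfies_pin(v):
    return (8, 3, 1) <= v < (9, 0, 0)

# Python 3.11 support per Pillow: >= 9.3
def satisfies_py311(v):
    return v >= (9, 3, 0)

candidates = [(8, 3, 1), (8, 4, 0), (9, 0, 0), (9, 3, 0), (9, 5, 0)]
both = [v for v in candidates if satisfies_pin(v) and satisfies_py311(v)]
print(both)  # → [] -- no Pillow version satisfies both constraints
```

The only real fix is relaxing the upper bound in the package's dependency pins, which is what the release request above would ship.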
Hello,
when trying to use the destatis API (timeseries_data, to be exact), it throws an SSLCertVerificationError.
MaxRetryError: HTTPSConnectionPool(host='www-genesis.destatis.de', port=443): Max retries exceeded with url: /genesisWS/rest/2020/data/timeseries?username=*******&password=******* (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:997)')))
My code is just the example for timeseries_data, with a valid username and password provided.
The host URL is https://www-genesis.destatis.de/genesisWS/rest/2020.
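"self signed certificate in certificate chain" usually means the chain presented to the client contains a certificate not in the local trust store, often one injected by a TLS-inspecting corporate proxy, since the destatis endpoint itself serves a publicly trusted certificate. A hedged stdlib sketch of the usual remedy (adding the extra CA rather than disabling verification; the file name `corporate-ca.pem` is hypothetical):

```python
import ssl

# A default context verifies the peer certificate and the hostname.
ctx = ssl.create_default_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # → True
print(ctx.check_hostname)                    # → True

# Remedy: trust the extra (e.g. proxy) CA instead of turning verification off.
# ctx.load_verify_locations(cafile="corporate-ca.pem")  # hypothetical file
#
# For the generated destatis client, the Configuration object typically
# accepts a CA bundle path (assumption based on OpenAPI-generated clients):
# conf = destatis.Configuration(ssl_ca_cert="corporate-ca.pem")
```

Disabling verification entirely would also silence the error, but it removes the protection TLS provides and is best avoided outside quick local debugging.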
The data is provided by the AdV SmartMapping.
The link is outdated.