ccs-amsterdam / amcat4py
API Client for AmCAT4
License: MIT License
When performing a bad request, the Amcat Client always returns HTTP Status Code 500, even though the error generated is 422 (or a different type).
Furthermore, error messages are not forwarded to the user. Performing a bad request with the client raises this exception:
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://opted.amcat.nl/api/index/wp3/documents
Performing the same request manually raises this:
requests.exceptions.HTTPError: 422 Client Error: Unprocessable Entity for url: https://opted.amcat.nl/api/index/wp3/documents
r.text
'{"detail":[{"loc":["body","documents"],"msg":"field required","type":"value_error.missing"},{"loc":["body","columns"],"msg":"field required","type":"value_error.missing"}]}'
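The body quoted above already carries everything a user needs to see. As a minimal sketch (the helper name `format_validation_errors` is hypothetical and not part of amcat4py), a client could turn such a payload into one readable line per problem instead of hiding it behind a generic status code:

```python
import json


def format_validation_errors(body: str) -> list[str]:
    """Turn a 422 response body of the shape quoted above into readable lines.

    Hypothetical helper for illustration; not part of amcat4py.
    """
    payload = json.loads(body)
    detail = payload.get("detail", [])
    # Some errors may carry a plain string instead of a list of problems
    if isinstance(detail, str):
        return [detail]
    lines = []
    for item in detail:
        location = ".".join(str(part) for part in item.get("loc", []))
        lines.append(f"{location}: {item.get('msg', 'unknown error')}")
    return lines


# The response text quoted in the report above
body = (
    '{"detail":[{"loc":["body","documents"],"msg":"field required",'
    '"type":"value_error.missing"},{"loc":["body","columns"],'
    '"msg":"field required","type":"value_error.missing"}]}'
)
messages = format_validation_errors(body)
print("\n".join(messages))
```

Raising an exception that carries the original status code together with these lines would keep the 422 visible to the caller.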
Listing, adding, removing and modifying index users is currently not implemented, as far as I can see. That is:
Just tried to log into an amcat instance I set up through docker on my server. .env looks like this:
# Host this instance is served at (needed for checking tokens)
amcat4_host=http://192.168.2.180:8069/amcat
# Elasticsearch password. This the password for the 'elastic' user when Elastic xpack security is enabled
#amcat4_elastic_password=
# Elasticsearch host. Default: https://localhost:9200 if elastic_password is set, http://localhost:9200 otherwise
amcat4_elastic_host=http://elastic7:9200
# Elasticsearch verify SSL (only used if elastic_password is set). Default: True unless host is localhost)
amcat4_elastic_verify_ssl=True
# Do we require authorization?
# Valid options:
# - no_auth: everyone (that can reach the server) can do anything they want
# - allow_guests: everyone can use the server, dependent on index-level guest_role authorization settings
# - allow_authenticated_guests: everyone can use the server, if they have a valid middlecat login, and dependent on index-level guest_role authorization settings
# - authorized_users_only: only people with a valid middlecat login and an explicit server role can use the server
amcat4_auth=allow_authenticated_guests
# Middlecat server to trust as ID provider
amcat4_middlecat_url=https://middlecat.up.railway.app
# Email address for a hardcoded admin email (useful for setup and recovery)
[email protected]
# Elasticsearch index to store authorization information in
amcat4_system_index=amcat4_system
When I use amcat.login(), Python is stuck at "Waiting for authorization in browser..." as the redirect seemingly does not work. The address generated by middlecat is:
https://middlecat.up.railway.app/authorize?response_type=code&client_id=amcat4py&redirect_uri=http%3A%2F%2Flocalhost%3A65432%2F&state=Y6igGmWz7aQGsvlHWUd68yV7mK4Ljd&code_challenge=UU5NSzZcSgBVU5g3d4ltDgs4xlhUUODxsFfMwCly538&code_challenge_method=S256&resource=http%3A%2F%2F192.168.2.180%3A8069%2Famcat&refresh_mode=static&session_type=api_key
http://localhost:65432/ is open, but the code is seemingly not sent through.
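Decoding the query string of that address shows where middlecat is asked to send the authorization code. The sketch below uses only the standard library and the URL from the report. It shows that the redirect target is port 65432 on localhost, which the browser resolves on the machine it runs on. If Python runs on a different machine than the browser, the code would not reach the waiting listener, though whether that is the cause here is not established.

```python
from urllib.parse import parse_qs, urlsplit

# Authorization URL copied from the report above
auth_url = (
    "https://middlecat.up.railway.app/authorize?response_type=code"
    "&client_id=amcat4py&redirect_uri=http%3A%2F%2Flocalhost%3A65432%2F"
    "&state=Y6igGmWz7aQGsvlHWUd68yV7mK4Ljd"
    "&code_challenge=UU5NSzZcSgBVU5g3d4ltDgs4xlhUUODxsFfMwCly538"
    "&code_challenge_method=S256"
    "&resource=http%3A%2F%2F192.168.2.180%3A8069%2Famcat"
    "&refresh_mode=static&session_type=api_key"
)

# parse_qs returns a list per key; every key occurs once here
params = {key: values[0] for key, values in parse_qs(urlsplit(auth_url).query).items()}
redirect = urlsplit(params["redirect_uri"])

print("code is sent to:", params["redirect_uri"])
print("host / port:", redirect.hostname, redirect.port)
print("token is requested for:", params["resource"])
```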
I don't want to be nagging, but I think the low-level methods (e.g., put, url) should be private methods (_put, _url). Not that it makes a big difference in Python, but it is probably better to hide that stuff from new users.
I would like to use the query_aggregate function in my current project. Can you update the PyPI amcat4py package so that I can update my local installation and use the new features? Or is a GitHub installation advised?
amcat = AmcatClient("http://localhost/amcat")
amcat.query_aggregate(...)
---
'AmcatClient' object has no attribute 'query_aggregate'
Error thrown as the package is outdated.
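To tell an outdated installation apart from a typo, it can help to check which version is installed and whether the method exists at all. A small standard-library sketch (both helper names are hypothetical):

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def supports(obj: object, method_name: str) -> bool:
    """True if obj has a callable attribute with the given name."""
    return callable(getattr(obj, method_name, None))


print("amcat4py version:", installed_version("amcat4py"))
```

With a client at hand, `supports(amcat, "query_aggregate")` would return False on the outdated release described above.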
Extend upload_documents to allow for chunked uploads of documents (steal from copy_index.py). Add a progress bar for large uploads and maybe add some documentation so the user knows right away what the server expects.
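A minimal sketch of what chunking could look like, independent of how copy_index.py does it (which is not shown here). The names `chunked` and `upload_in_chunks` are hypothetical, and `upload` stands for any callable that sends one batch; a plain print takes the place of a progress bar.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator


def chunked(items: Iterable[dict], size: int) -> Iterator[list[dict]]:
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def upload_in_chunks(
    upload: Callable[[list[dict]], object],
    documents: Iterable[dict],
    chunk_size: int = 100,
) -> int:
    """Send documents batch by batch and return how many were sent."""
    total = 0
    for chunk in chunked(documents, chunk_size):
        upload(chunk)
        total += len(chunk)
        print(f"uploaded {total} documents")  # stand-in for a progress bar
    return total


# Demonstration with a fake uploader that only records the batches
batches: list[list[dict]] = []
sent = upload_in_chunks(batches.append, [{"title": str(i)} for i in range(250)], 100)
```

With the client, `upload` could be something like `lambda chunk: amcat.upload_documents("my_index", chunk)`, where the index name is a placeholder.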
Looks like to get a batch of, say, 1000 documents, one needs to go through _post directly, as there is no way to stop query() or documents() once they start pulling results:
amcat = AmcatClient("http://localhost/amcat")
body = dict(queries="test",
            fields=["_id", "text"],
            page=0,
            per_page=10)
res = amcat._post("query", index="state_of_the_union", json=body, ignore_status=[404]).json()
len(res['results'])
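The page and per_page parameters above suggest one way to cap the number of results: request pages until enough documents have arrived. A sketch under that assumption, where `fetch_page` is a hypothetical callable that returns the result list for one page (it could wrap the _post call shown above):

```python
from typing import Callable


def first_n(
    fetch_page: Callable[[int, int], list[dict]],
    n: int,
    per_page: int = 10,
) -> list[dict]:
    """Collect at most n documents by requesting one page at a time."""
    results: list[dict] = []
    page = 0
    while len(results) < n:
        batch = fetch_page(page, per_page)
        if not batch:  # no more results on the server
            break
        results.extend(batch)
        page += 1
    return results[:n]


# Demonstration against an in-memory stand-in for the server
data = [{"_id": i} for i in range(25)]


def fake_fetch(page: int, per_page: int) -> list[dict]:
    return data[page * per_page:(page + 1) * per_page]


subset = first_n(fake_fetch, 12)
```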
Installed package via pip install git+https://... but get an error when running
from amcat4py import AmcatClient
amcat = AmcatClient("http://localhost/amcat")
amcat.login()
---
TypeError Traceback (most recent call last)
Cell In[2], line 2
1 from amcat4py import AmcatClient
----> 2 amcat = AmcatClient("http://localhost/amcat")
3 amcat.login()
File ~/anaconda3/envs/opted/lib/python3.11/site-packages/amcat4py/amcatclient.py:44, in AmcatClient.__init__(self, host, ignore_tz)
42 self.server_config = self.get_server_config()
43 # If we have a token cached, load it. Otherwise, only log in if explicitly requested
---> 44 self.token = _get_token(self.host, login_if_needed=False)
File ~/anaconda3/envs/opted/lib/python3.11/site-packages/amcat4py/auth.py:132, in _get_token(host, force_refresh, login_if_needed)
130 file_path = user_cache_dir(CLIENT_ID) + "/" + sha256(host.encode()).hexdigest()
131 if os.path.exists(file_path) and not force_refresh:
--> 132 token = secret_read(file_path, host)
133 elif login_if_needed:
134 token = get_middlecat_token(host)
File ~/anaconda3/envs/opted/lib/python3.11/site-packages/amcat4py/auth.py:176, in secret_read(path, host)
174 with open(path, "rb") as f:
175 token_enc = f.read()
--> 176 fernet = Fernet(make_key(host))
177 return loads(fernet.decrypt(token_enc).decode())
File ~/anaconda3/envs/opted/lib/python3.11/site-packages/amcat4py/auth.py:191, in make_key(key)
181 """
182 Helper function to make key for encryption of tokens
183 :param key: string that is turned into key.
184 """
185 kdf = PBKDF2HMAC(
186 algorithm=sha256(),
187 length=32,
188 salt="supergeheim".encode(),
189 iterations=5,
190 )
--> 191 return urlsafe_b64encode(kdf.derive(key.encode()))
File ~/anaconda3/envs/opted/lib/python3.11/site-packages/cryptography/hazmat/primitives/kdf/pbkdf2.py:53, in PBKDF2HMAC.derive(self, key_material)
50 raise AlreadyFinalized("PBKDF2 instances can only be used once.")
51 self._used = True
---> 53 return rust_openssl.kdf.derive_pbkdf2_hmac(
54 key_material,
55 self._algorithm,
56 self._salt,
57 self._iterations,
58 self._length,
59 )
It works when downgrading cryptography to 40.0.2.
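The traceback ends inside PBKDF2HMAC.derive, and make_key passes algorithm=sha256(). If that sha256 comes from hashlib (an assumption, based on the .hexdigest() call on line 130 of the traceback), newer cryptography releases may reject it because they expect an object from cryptography.hazmat.primitives.hashes. A dependency-free sketch of the same kind of derivation, using hashlib.pbkdf2_hmac with the salt, iteration count and length from the traceback, would be:

```python
import hashlib
from base64 import urlsafe_b64decode, urlsafe_b64encode


def make_key(key: str) -> bytes:
    """Derive a 32-byte key with PBKDF2-HMAC-SHA256 and return it base64-encoded.

    Sketch only: salt, iterations and length are copied from the traceback;
    this is not the package's actual fix.
    """
    raw = hashlib.pbkdf2_hmac(
        "sha256",                 # hash name, replaces the algorithm object
        key.encode(),             # password material (the host string)
        "supergeheim".encode(),   # salt as shown in the traceback
        5,                        # iterations as shown in the traceback
        dklen=32,                 # derived key length in bytes
    )
    return urlsafe_b64encode(raw)


derived = make_key("http://localhost/amcat")
print(len(derived))
```

The result is 44 URL-safe base64 characters encoding 32 bytes, which is the key format Fernet expects.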
When I upload a document to amcat4 via amcat4py and forget the required fields (title, text, date), the following exception is raised:
Unexpected exception formatting exception. Falling back to standard exception
Traceback (most recent call last):
File "/home/peter/anaconda3/envs/vds/lib/python3.10/site-packages/amcat4py/amcatclient.py", line 84, in _request
r.raise_for_status()
File "/home/peter/anaconda3/envs/vds/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 422 Client Error: Unprocessable Entity for url: http://localhost/amcat/index/speeches_aut/documents
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/peter/anaconda3/envs/vds/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3433, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "/tmp/ipykernel_31037/2743098145.py", line 1, in <module>
amcat.upload_documents("speeches_aut", speeches_aut.to_dicts())
File "/home/peter/anaconda3/envs/vds/lib/python3.10/site-packages/amcat4py/amcatclient.py", line 267, in upload_documents
self._post("documents", index=index, json=body)
File "/home/peter/anaconda3/envs/vds/lib/python3.10/site-packages/amcat4py/amcatclient.py", line 100, in _post
return self._request("post", url=self._url(url, index), data=data, headers=headers, ignore_status=ignore_status)
File "/home/peter/anaconda3/envs/vds/lib/python3.10/site-packages/amcat4py/amcatclient.py", line 86, in _request
raise AmcatError(e.response, e.request) from e
amcat4py.amcatclient.AmcatError: Error from server (422): [{'loc': ['body', 'documents', 0, 'title']
[...]
'msg': 'field required', 'type': 'value_error.missing'}, {'loc': ['body', 'documents', 98, 'title'], 'msg': 'field required', 'type': 'value_error.missing'}, {'loc': ['body', 'documents', 99, 'title'], 'msg': 'field required', 'type': 'value_error.missing'}, {'loc': ['body', 'documents', 100, 'title'], 'msg': 'field required', 'type': 'value_error.missing'}]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/peter/anaconda3/envs/vds/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 2052, in showtraceback
stb = self.InteractiveTB.structured_traceback(
File "/home/peter/anaconda3/envs/vds/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1118, in structured_traceback
return FormattedTB.structured_traceback(
File "/home/peter/anaconda3/envs/vds/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1012, in structured_traceback
return VerboseTB.structured_traceback(
File "/home/peter/anaconda3/envs/vds/lib/python3.10/site-packages/IPython/core/ultratb.py", line 865, in structured_traceback
formatted_exception = self.format_exception_as_a_whole(etype, evalue, etb, number_of_lines_of_context,
File "/home/peter/anaconda3/envs/vds/lib/python3.10/site-packages/IPython/core/ultratb.py", line 818, in format_exception_as_a_whole
frames.append(self.format_record(r))
File "/home/peter/anaconda3/envs/vds/lib/python3.10/site-packages/IPython/core/ultratb.py", line 736, in format_record
result += ''.join(_format_traceback_lines(frame_info.lines, Colors, self.has_colors, lvals))
File "/home/peter/anaconda3/envs/vds/lib/python3.10/site-packages/stack_data/utils.py", line 144, in cached_property_wrapper
value = obj.__dict__[self.func.__name__] = self.func(obj)
File "/home/peter/anaconda3/envs/vds/lib/python3.10/site-packages/stack_data/core.py", line 734, in lines
pieces = self.included_pieces
File "/home/peter/anaconda3/envs/vds/lib/python3.10/site-packages/stack_data/utils.py", line 144, in cached_property_wrapper
value = obj.__dict__[self.func.__name__] = self.func(obj)
File "/home/peter/anaconda3/envs/vds/lib/python3.10/site-packages/stack_data/core.py", line 677, in included_pieces
scope_pieces = self.scope_pieces
File "/home/peter/anaconda3/envs/vds/lib/python3.10/site-packages/stack_data/utils.py", line 144, in cached_property_wrapper
value = obj.__dict__[self.func.__name__] = self.func(obj)
File "/home/peter/anaconda3/envs/vds/lib/python3.10/site-packages/stack_data/core.py", line 614, in scope_pieces
scope_start, scope_end = self.source.line_range(self.scope)
File "/home/peter/anaconda3/envs/vds/lib/python3.10/site-packages/stack_data/core.py", line 178, in line_range
return line_range(self.asttext(), node)
File "/home/peter/anaconda3/envs/vds/lib/python3.10/site-packages/executing/executing.py", line 428, in asttext
self._asttext = ASTText(self.text, tree=self.tree, filename=self.filename)
File "/home/peter/anaconda3/envs/vds/lib/python3.10/site-packages/asttokens/asttokens.py", line 307, in __init__
super(ASTText, self).__init__(source_text, filename)
File "/home/peter/anaconda3/envs/vds/lib/python3.10/site-packages/asttokens/asttokens.py", line 44, in __init__
source_text = six.ensure_text(source_text)
AttributeError: module 'six' has no attribute 'ensure_text'
The client does not seem to handle the 422 error or forward the server's error message. This differs from amcat4r, which prints "The fields title, date, and text are required and can never be NA".
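A client-side check before uploading could give a similarly short message. The sketch below is one possible shape, not what amcat4py or amcat4r actually do; the helper name `find_missing_fields` is hypothetical and the required field names are taken from the report above.

```python
from typing import Iterable

# Required fields as named in the report above
REQUIRED_FIELDS = ("title", "date", "text")


def find_missing_fields(
    documents: Iterable[dict],
    required: tuple[str, ...] = REQUIRED_FIELDS,
) -> dict[int, list[str]]:
    """Map each offending document's position to the fields it lacks."""
    problems: dict[int, list[str]] = {}
    for position, doc in enumerate(documents):
        # A field counts as missing if it is absent or set to None
        missing = [field for field in required if doc.get(field) is None]
        if missing:
            problems[position] = missing
    return problems


docs = [
    {"title": "A", "date": "2023-01-01", "text": "hello"},
    {"date": "2023-01-02", "text": "no title"},
    {"title": "C", "date": None, "text": None},
]
problems = find_missing_fields(docs)
if problems:
    print(f"{len(problems)} document(s) lack required fields: {problems}")
```

Reporting a summary like this avoids the several-hundred-entry error list shown in the traceback above.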