openlegaldata / oldp
Open Legal Data Platform
Home Page: https://openlegaldata.io
License: MIT License
See
Would a Mastodon account for the project be possible?
Both https://de.openlegaldata.io/api/cases/search/
and https://de.openlegaldata.io/api/laws/search/
return an error 500 with the following message:
Fehler 500
Webservice currently unavailable. Our service team has been dispatched to bring it back online.
The expected response is a valid list of results matching the search query.
Pagination with total number of items
<< Previous page | 123 items | Next page >>
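The pagination line above could be produced along these lines. This is only a minimal sketch in plain Python: `PageInfo`, `paginate`, and the page size of 50 are hypothetical names and values chosen for illustration, not part of the OLDP code base.

```python
from dataclasses import dataclass
from math import ceil
from typing import Optional


@dataclass
class PageInfo:
    """Pagination state for one result page (hypothetical helper)."""
    total_items: int
    page: int
    total_pages: int
    previous_page: Optional[int]
    next_page: Optional[int]

    def render(self) -> str:
        # Show the previous/next links only when such a page exists.
        parts = []
        if self.previous_page is not None:
            parts.append("<< Previous page")
        parts.append(f"{self.total_items} items")
        if self.next_page is not None:
            parts.append("Next page >>")
        return " | ".join(parts)


def paginate(total_items: int, page: int, page_size: int) -> PageInfo:
    """Compute neighbours of `page`, clamping it into the valid range."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    total_pages = max(1, ceil(total_items / page_size))
    page = min(max(page, 1), total_pages)
    return PageInfo(
        total_items=total_items,
        page=page,
        total_pages=total_pages,
        previous_page=page - 1 if page > 1 else None,
        next_page=page + 1 if page < total_pages else None,
    )


print(paginate(total_items=123, page=2, page_size=50).render())
```

With 123 items and an assumed page size of 50, page 2 has both neighbours, so both links are rendered around the total.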
The tellme feedback form is not responsive.
On the page for the German court decisions dataset, the "Download" section points to an outdated data version. Only at the end of the page is there a small remark: "Update: For more recent dumps, please refer to this link".
IMO the Download section should just point to all available dumps.
Thanks for releasing this data set.
Dear Maintainers, I wanted to get involved in the development and have a look at the project.
Since I prefer not to install everything locally, I wanted to use Docker for a test deployment.
1) docker-compose up
app_1 | usage: gunicorn [OPTIONS] [APP_MODULE]
app_1 | gunicorn: error: unrecognized arguments: oldp.wsgi:application
Which corresponds to this line in the code:
Line 38 in edcf93a
2) docker-compose build
While trying to fix the issue in 1), I changed the line mentioned above, but then saw that the container does not build.
$ sudo docker-compose build
[...]
Step 14/18 : RUN python manage.py collectstatic --no-input
---> Running in c6bac8bbfe6e
Traceback (most recent call last):
  File "manage.py", line 11, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python3.6/site-packages/django/core/management/__init__.py", line 381, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python3.6/site-packages/django/core/management/__init__.py", line 357, in execute
    django.setup()
  File "/usr/local/lib/python3.6/site-packages/django/__init__.py", line 24, in setup
    apps.populate(settings.INSTALLED_APPS)
  File "/usr/local/lib/python3.6/site-packages/django/apps/registry.py", line 91, in populate
    app_config = AppConfig.create(entry)
  File "/usr/local/lib/python3.6/site-packages/django/apps/config.py", line 90, in create
    module = import_module(entry)
  File "/usr/local/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'django_extensions'
ERROR: Service 'app' failed to build: The command '/bin/sh -c python manage.py collectstatic --no-input' returned a non-zero code: 1
I have either misread the documentation or it is incomplete here in the part where it says:
mkdir -p ./data/es
mkdir -p ./data/mysql
chmod 777 ./data/es
chmod 777 ./data/mysql
The correct path should be ./docker/data.
Still, thank you for your awesome project!
Let me know if I can help resolve the issues mentioned above.
Use multiple threads for some processing tasks
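One way this could look is a thread pool that applies a single processing step to many items. The sketch below uses only the standard library; `process_in_threads` and the `fetch_length` step are hypothetical and not taken from the OLDP processing code. Note that in CPython, threads mainly help with I/O-bound steps (database or HTTP calls); CPU-bound steps would likely need processes instead.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def process_in_threads(
    items: Iterable[T],
    step: Callable[[T], R],
    max_workers: int = 4,
) -> List[R]:
    """Apply `step` to every item using a pool of worker threads.

    Results come back in input order because Executor.map preserves it.
    An exception raised inside `step` is re-raised here when its result
    is collected.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(step, items))


def fetch_length(text: str) -> int:
    # Hypothetical stand-in for an I/O-bound processing step.
    return len(text)


print(process_in_threads(["case one", "case two!"], fetch_length))
```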
See CourtListener
Make recent content available with RSS feeds.
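As a rough illustration, an RSS 2.0 document for recent items can be built with the standard library alone. In a Django project, the built-in syndication framework (`django.contrib.syndication`) would probably be the more natural choice; the sketch below avoids it only to stay self-contained. The function name `build_rss`, the item dictionary keys, and the example URL path are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict, Iterable


def build_rss(title: str, link: str, description: str,
              items: Iterable[Dict]) -> str:
    """Return an RSS 2.0 document as a string.

    Each item is a dict with the (assumed) keys "title", "link" and
    "published", where "published" is a timezone-aware datetime.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
        # RSS expects RFC 822 style dates, which format_datetime emits.
        ET.SubElement(node, "pubDate").text = format_datetime(item["published"])
    return ET.tostring(rss, encoding="unicode")


feed = build_rss(
    title="Recent cases",
    link="https://de.openlegaldata.io/",
    description="Most recently added court decisions",
    items=[{
        "title": "Example decision",                      # placeholder entry
        "link": "https://de.openlegaldata.io/case/example",  # hypothetical URL
        "published": datetime(2024, 1, 1, tzinfo=timezone.utc),
    }],
)
print(feed)
```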
Extracted references should be auto-assigned to corresponding items.
Use the Jekyll static site generator (in a separate repo) for the www landing page and maybe the blog as well.
Task: Provide a more verbose explanation of the processing pipeline and how the results are used in the front-end. https://oldp.readthedocs.io/en/latest/processing.html
Future work: Actual workflows (processing step dependencies)
Currently, loading a web page can take several seconds, which makes the site hard to use. We should therefore improve site performance.
db_index=True for all query fields
select_related() to decrease the number of DB queries (don't use own get_x() methods)
How can I retrieve newer or updated versions of the dumps, meaning all new cases since the last dump (Oct 2022) and updated versions of the references, which haven't been updated since 2019?
We do not need to load all DB items into a list in advance. See https://github.com/openlegaldata/oldp/blob/master/oldp/apps/processing/content_processor.py#L265
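The general idea can be sketched with a generator that pulls items lazily and hands them on in small batches, so that only one batch is held in memory at a time. `iter_batches` is a hypothetical helper, not code from `content_processor.py`. For Django querysets specifically, `QuerySet.iterator(chunk_size=...)` serves a similar purpose, though whether it fits the linked code is an assumption.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_batches(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield lists of at most `batch_size` items without materializing
    the whole input first."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def fake_db_rows(count: int) -> Iterator[int]:
    # Hypothetical stand-in for a lazily evaluated database query.
    for row_id in range(count):
        yield row_id


for batch in iter_batches(fake_db_rows(5), batch_size=2):
    print(batch)
```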
Flow:
Add Dockerfile to repo and make it available on Docker Hub.
Footer link still points to legalresearch.io
When I try to call the API, I get the following error:
MaxRetryError: HTTPSConnectionPool(host='de.openlegaldata.io', port=443): Max retries exceeded with url: /api/courts/ (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')])")))
Any idea how to enable requests?
Hello,
I have downloaded the data dumps from https://static.openlegaldata.io/dumps/de/ and they include a refs.csv file, which gives the mapping between cases and cited laws. For the cases, the IDs are given in the cases.json file, but for the laws in laws.json this is not the case. Is there any way to access the relation between the laws and their respective IDs?
Hey, I was trying to check out OLDP using the provided Docker Compose file. But when I try to initialize the database using
docker exec -it oldp_app_1 python manage.py migrate
as the docs at https://oldp.readthedocs.io/en/latest/docker.html say, the database crashes:
E:\_Dev\oldp_test>docker exec -it oldp_test_app_1 python manage.py migrate
Operations to perform:
Apply all migrations: account, admin, annotations, auth, authtoken, cases, contenttypes, courts, flatpages, laws, references, search, sessions, sites, socialaccount, sources, tellme, topics
Running migrations:
Applying contenttypes.0001_initial... OK
Applying auth.0001_initial...Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
[...]
django.db.utils.OperationalError: (2013, 'Lost connection to MySQL server during query')
and database logs:
db_1 | 2023-03-30 15:55:52 0x7f55f4119640 InnoDB: Assertion failure in file ./storage/innobase/fil/fil0fil.cc line 604
db_1 | InnoDB: Failing assertion: fsize != os_offset_t(-1)
db_1 | InnoDB: We intentionally generate a memory trap.
db_1 | InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
db_1 | InnoDB: If you get repeated assertion failures or crashes, even
db_1 | InnoDB: immediately after the mariadbd startup, there may be
db_1 | InnoDB: corruption in the InnoDB tablespace. Please refer to
db_1 | InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
db_1 | InnoDB: about forcing recovery.
db_1 | 230330 15:55:52 [ERROR] mysqld got signal 6 ;
db_1 | This could be because you hit a bug. It is also possible that this binary
db_1 | or one of the libraries it was linked against is corrupt, improperly built,
db_1 | or misconfigured. This error can also be caused by malfunctioning hardware.
db_1 |
db_1 | To report this bug, see https://mariadb.com/kb/en/reporting-bugs
db_1 |
db_1 | We will try our best to scrape up some info that will hopefully help
db_1 | diagnose the problem, but since we have already crashed,
db_1 | something is definitely wrong and this may fail.
db_1 |
db_1 | Server version: 10.11.2-MariaDB-1:10.11.2+maria~ubu2204 source revision: cafba8761af55ae16cc69c9b53a341340a845b36
db_1 | key_buffer_size=134217728
db_1 | read_buffer_size=131072
db_1 | max_used_connections=3
db_1 | max_threads=153
db_1 | thread_count=3
db_1 | It is possible that mysqld could use up to
db_1 | key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 468019 K bytes of memory
db_1 | Hope that's ok; if not, decrease some variables in the equation.
db_1 |
db_1 | Thread pointer: 0x7f55b8000c68
db_1 | Attempting backtrace. You can use the following information to find out
db_1 | where mysqld died. If you see no messages after this, something went
db_1 | terribly wrong...
db_1 | stack_bottom = 0x7f55f4118c78 thread_stack 0x49000
db_1 | Printing to addr2line failed
db_1 | mariadbd(my_print_stacktrace+0x32)[0x555f5fd475e2]
db_1 | mariadbd(handle_fatal_signal+0x488)[0x555f5f81dc08]
db_1 | /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f560abe0520]
db_1 | /lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c)[0x7f560ac34a7c]
db_1 | /lib/x86_64-linux-gnu/libc.so.6(raise+0x16)[0x7f560abe0476]
db_1 | /lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x7f560abc67f3]
db_1 | mariadbd(+0x693ef1)[0x555f5f445ef1]
db_1 | mariadbd(+0x6a3054)[0x555f5f455054]
db_1 | mariadbd(+0x6a3a06)[0x555f5f455a06]
db_1 | mariadbd(+0x6a616d)[0x555f5f45816d]
db_1 | mariadbd(+0x6a6b5a)[0x555f5f458b5a]
db_1 | mariadbd(+0xee6616)[0x555f5fc98616]
db_1 | mariadbd(+0xee7811)[0x555f5fc99811]
db_1 | mariadbd(+0xe645db)[0x555f5fc165db]
db_1 | mariadbd(+0xea720e)[0x555f5fc5920e]
db_1 | mariadbd(+0xea7ad7)[0x555f5fc59ad7]
db_1 | mariadbd(+0xddd7d9)[0x555f5fb8f7d9]
db_1 | mariadbd(+0xd679de)[0x555f5fb199de]
db_1 | mariadbd(+0xd6f67a)[0x555f5fb2167a]
db_1 | mariadbd(+0xd71f45)[0x555f5fb23f45]
db_1 | mariadbd(_Z17mysql_alter_tableP3THDPK25st_mysql_const_lex_stringS3_P22Table_specification_stP10TABLE_LISTP13Recreate_infoP10Alter_infojP8st_orderbb+0x4d3f)[0x555f5f68662f]
db_1 | mariadbd(_ZN19Sql_cmd_alter_table7executeEP3THD+0x398)[0x555f5f6f5978]
db_1 | mariadbd(_Z21mysql_execute_commandP3THDb+0x4cfd)[0x555f5f5c918d]
db_1 | mariadbd(_Z11mysql_parseP3THDPcjP12Parser_state+0x1e7)[0x555f5f5ca367]
db_1 | mariadbd(_Z16dispatch_command19enum_server_commandP3THDPcjb+0x14c5)[0x555f5f5ccab5]
db_1 | mariadbd(_Z10do_commandP3THDb+0x138)[0x555f5f5ce698]
db_1 | mariadbd(_Z24do_handle_one_connectionP7CONNECTb+0x3bf)[0x555f5f6f09cf]
db_1 | mariadbd(handle_one_connection+0x5d)[0x555f5f6f0d1d]
db_1 | mariadbd(+0xc99b66)[0x555f5fa4bb66]
db_1 | /lib/x86_64-linux-gnu/libc.so.6(+0x94b43)[0x7f560ac32b43]
db_1 | /lib/x86_64-linux-gnu/libc.so.6(clone+0x44)[0x7f560acc3bb4]
db_1 |
db_1 | Trying to get some variables.
db_1 | Some pointers may be invalid and cause the dump to abort.
db_1 | Query (0x7f55b8010a70): ALTER TABLE `auth_permission` ADD CONSTRAINT `auth_permission_content_type_id_codename_01ab375a_uniq` UNIQUE (`content_type_id`, `codename`)
db_1 |
db_1 | Connection ID (thread ID): 6
db_1 | Status: NOT_KILLED
db_1 |
db_1 | Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off
db_1 |
db_1 | The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
db_1 | information that should help you find out what is causing the crash.
db_1 | Writing a core file...
db_1 | Working directory at /var/lib/mysql
db_1 | Resource Limits:
db_1 | Limit Soft Limit Hard Limit Units
db_1 | Max cpu time unlimited unlimited seconds
db_1 | Max file size unlimited unlimited bytes
db_1 | Max data size unlimited unlimited bytes
db_1 | Max stack size 8388608 unlimited bytes
db_1 | Max core file size 0 unlimited bytes
db_1 | Max resident set unlimited unlimited bytes
db_1 | Max processes unlimited unlimited processes
db_1 | Max open files 1048576 1048576 files
db_1 | Max locked memory 83968000 83968000 bytes
db_1 | Max address space unlimited unlimited bytes
db_1 | Max file locks unlimited unlimited locks
db_1 | Max pending signals 102413 102413 signals
db_1 | Max msgqueue size 819200 819200 bytes
db_1 | Max nice priority 0 0
db_1 | Max realtime priority 0 0
db_1 | Max realtime timeout unlimited unlimited us
db_1 | Core pattern: core
db_1 |
db_1 | Kernel version: Linux version 4.19.128-microsoft-standard (oe-user@oe-host) (gcc version 8.2.0 (GCC)) #1 SMP Tue Jun 23 12:58:10 UTC 2020
db_1 |
db_1 | Fatal signal 11 while backtracing
oldp_test_db_1 exited with code 139
For some of the decisions (e.g., this one), the references are not aligned at all with the corresponding occurrences in the text.
Is there any way to work with the data prior to the annotation (as it is available through the JSON), to potentially help with investigating this?
requirements.txt should use exact package versions (not >=)
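A small check like the following could flag requirement lines that are not pinned with `==`. It is a simplified sketch: the regular expression only covers plain `name==version` entries (optionally with extras), pip option lines starting with `-` are skipped, and the package names and versions in the sample are purely illustrative.

```python
import re
from typing import Iterable, List

# Matches "name==version" or "name[extra]==version"; deliberately simple.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?\s*==\s*[^\s;]+")


def find_unpinned(lines: Iterable[str]) -> List[str]:
    """Return requirement lines that do not pin an exact version."""
    unpinned = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


sample = [
    "Django>=2.1",        # illustrative, not the project's real pin
    "requests==2.31.0",   # illustrative version number
    "# a comment",
    "gunicorn",
]
print(find_unpinned(sample))
```

Such a check could run in CI so that a loosened pin is noticed before it reaches a Docker build.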
Legal Markdown is not suitable: it cannot represent complex documents (e.g., it has problems with tables).
For a better code quality and to make collaboration easier we should add more unit and integration tests.
Hello team,
the SSL certificate for your domain expired yesterday (Tue, 09 Apr 2024 11:15:04 GMT).
Could you please renew it and, ideally, set up an automated renewal process, if that is not already in place?
Thank you very much, and thanks for your project!
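Independent of how renewal itself is automated, a monitoring script could warn some days before a certificate expires. The following is a minimal sketch using Python's `ssl` module; `fetch_not_after`, `days_until_expiry`, and the idea of running it on a schedule are assumptions for illustration, not an existing part of the project.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_not_after(host: str, port: int = 443) -> str:
    """Return the certificate's notAfter string for `host`.

    The default context verifies the certificate, so an already expired
    certificate makes the handshake fail with an SSL error instead.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` (timezone-aware) and the notAfter timestamp."""
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


# Offline example using the expiry time reported in the issue above.
one_week_before = datetime(2024, 4, 2, 11, 15, 4, tzinfo=timezone.utc)
print(days_until_expiry("Apr 09 11:15:04 2024 GMT", one_week_before))
```

A scheduled job could call `fetch_not_after("de.openlegaldata.io")` and raise an alert once the remaining days drop below a chosen threshold.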
Make it easier for users to navigate the site.
For the German court decisions dataset it is unclear under which Licence the dataset is being shared.
Thanks for making this dataset public.
API Bulk Export & Import
Extending OLDP to other countries would get much easier if we support themes.
Germany-specific elements should be moved to a separate repo: