
paperless-ngx_ynh's Introduction

Paperless-ngx for YunoHost


Install Paperless-ngx with YunoHost

Read this README in French.

This package allows you to install Paperless-ngx quickly and simply on a YunoHost server. If you don't have YunoHost, please consult the guide to learn how to install it.

Overview

Paperless-ngx is a document management system that transforms your physical documents into a searchable online archive so you can keep, well, less paper.

Features

  • Organize and index your scanned documents with tags, correspondents, types, and more.
  • Performs OCR on your documents, adds selectable text to image-only documents, and adds tags, correspondents, and document types to your documents.
  • Supports PDF documents, images, plain text files, and Office documents (Word, Excel, Powerpoint, and LibreOffice equivalents).
  • Paperless stores your documents plain on disk. Filenames and folders are managed by paperless and their format can be configured freely.
  • Single page application front end.
  • Full text search helps you find what you need.
  • Email processing: Paperless adds documents from your email accounts.
  • Machine learning powered document matching.
  • Optimized for multi core systems: Paperless-ngx consumes multiple documents in parallel.
  • The integrated sanity checker makes sure that your document archive is in good health.
  • More screenshots are available in the documentation.

Shipped version: 2.3.3~ynh1

Demo: https://demo.paperless-ngx.com/

Screenshots

Screenshot of Paperless-ngx

Documentation and resources

Developer info

Please send your pull request to the testing branch.

To try the testing branch, proceed as follows:

sudo yunohost app install https://github.com/YunoHost-Apps/paperless-ngx_ynh/tree/testing --debug
or
sudo yunohost app upgrade paperless-ngx -u https://github.com/YunoHost-Apps/paperless-ngx_ynh/tree/testing --debug

More info regarding app packaging: https://yunohost.org/packaging_apps

paperless-ngx_ynh's People

Contributors

alexaubin, ericgaspar, fabianwilkens, orhtej2, tagadda, yalh76, yunohost-bot

paperless-ngx_ynh's Issues

Fresh installation – NLTK Data not available

Describe the bug

I installed the paperless-ngx app via the YunoHost webadmin. The installation logs look fine. I am able to upload documents and classify them manually, but automatic classification never happens. In the log view of the web page I see the error documented below.

Context

  • Hardware: Old computer
  • YunoHost version: 11.1.16 (stable)
  • I have access to my server: direct access via keyboard / screen
  • Are you in a special context or did you perform some particular tweaking on your YunoHost instance?: no

Steps to reproduce

Make a fresh installation of the paperless-ngx application in YunoHost.

Expected behavior

After several documents have been added and classified manually, paperless-ngx should start to classify new documents by itself.

Logs

[2023-04-01 01:03:45,720] [DEBUG] [paperless.classifier] Document classification model does not exist (yet), not performing automatic matching.
[2023-04-01 01:03:45,723] [DEBUG] [paperless.classifier] Gathering data from database...
[2023-04-01 01:03:49,436] [WARNING] [paperless.tasks] Classifier error: 
**********************************************************************
  Resource stopwords not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('stopwords')

  For more information see: https://www.nltk.org/data.html

  Attempted to load corpora/stopwords

  Searched in:
    - '/opt/yunohost/paperless-ngx/src/__FINALPATH_/nltk_data'
**********************************************************************

[2023-04-01 01:04:25,388] [DEBUG] [paperless.classifier] Document classification model does not exist (yet), not performing automatic matching.
[2023-04-01 01:04:25,391] [DEBUG] [paperless.classifier] Gathering data from database...
[2023-04-01 01:04:29,029] [WARNING] [paperless.tasks] Classifier error: 
**********************************************************************
  Resource stopwords not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('stopwords')

  For more information see: https://www.nltk.org/data.html

  Attempted to load corpora/stopwords

  Searched in:
    - '/opt/yunohost/paperless-ngx/src/__FINALPATH_/nltk_data'
**********************************************************************

Workaround

Manually download the NLTK stopwords into the “Searched in” path shown in the log, following the guide at https://www.nltk.org/data.html.

Possible fix?

In the file paperless.conf.example, the parameter
PAPERLESS_NLTK_DIR=__FINALPATH_/nltk_data

__FINALPATH_ seems to be missing its second trailing underscore (it should read __FINALPATH__).
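
A typo like this is easy to miss by eye. As a rough sketch of how a template could be scanned for it, the snippet below flags any token that starts like a YunoHost-style placeholder but does not close with exactly two underscores. The function name and the sample text are made up for illustration, and the regular expressions only assume the usual __UPPER_CASE__ placeholder shape; this is not part of the package.

```python
import re

# Any token that begins with underscores followed by an upper-case name
# and is not glued to a preceding identifier (so PAPERLESS_NLTK_DIR is ignored).
TOKEN = re.compile(r"(?<![A-Za-z0-9_])_+[A-Z][A-Z0-9_]*")

# Assumed well-formed shape: two underscores, UPPER_CASE name, two underscores.
WELL_FORMED = re.compile(r"__[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)*__")


def find_malformed_placeholders(text):
    """Return (line_number, token) pairs for placeholder-like tokens that are not well formed."""
    bad = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in TOKEN.finditer(line):
            if not WELL_FORMED.fullmatch(match.group()):
                bad.append((lineno, match.group()))
    return bad


# Hypothetical two-line template; the first line mirrors the report above.
sample = (
    "PAPERLESS_NLTK_DIR=__FINALPATH_/nltk_data\n"
    "PAPERLESS_DATA_DIR=__DATA_DIR__/data\n"
)
print(find_malformed_placeholders(sample))
```

Running the same check over every file in a package's conf directory would catch this class of mistake before an install ever writes the literal placeholder into a path.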

MissingDependencyError: gs - OCRmyPDF requires 'gs' 9.55 or higher.

Describe the bug

Uploading a PDF failed with the message: MissingDependencyError: gs

Context

  • YunoHost version: 11.2.8.2
  • Paperless-ngx 2.1.3~ynh1

Logs

[2023-12-26 10:39:24,687] [ERROR] [paperless.consumer] Error occurred while consuming document 20231016 Wasser Analyse Testergebnisse.pdf: MissingDependencyError: gs

Traceback (most recent call last):
  File "/var/www/paperless-ngx/src/paperless_tesseract/parsers.py", line 330, in parse
    ocrmypdf.ocr(**args)
  File "/var/www/paperless-ngx/venv/lib/python3.9/site-packages/ocrmypdf/api.py", line 374, in ocr
    check_options(options, plugin_manager)
  File "/var/www/paperless-ngx/venv/lib/python3.9/site-packages/ocrmypdf/_validation.py", line 248, in check_options
    _check_plugin_options(options, plugin_manager)
  File "/var/www/paperless-ngx/venv/lib/python3.9/site-packages/ocrmypdf/_validation.py", line 241, in _check_plugin_options
    plugin_manager.hook.check_options(options=options)
  File "/var/www/paperless-ngx/venv/lib/python3.9/site-packages/pluggy/_hooks.py", line 493, in __call__
    return self._hookexec(self.name, self._hookimpls, kwargs, firstresult)
  File "/var/www/paperless-ngx/venv/lib/python3.9/site-packages/pluggy/_manager.py", line 115, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  File "/var/www/paperless-ngx/venv/lib/python3.9/site-packages/pluggy/_callers.py", line 113, in _multicall
    raise exception.with_traceback(exception.__traceback__)
  File "/var/www/paperless-ngx/venv/lib/python3.9/site-packages/pluggy/_callers.py", line 77, in _multicall
    res = hook_impl.function(*args)
  File "/var/www/paperless-ngx/venv/lib/python3.9/site-packages/ocrmypdf/builtin_plugins/ghostscript.py", line 51, in check_options
    check_external_program(
  File "/var/www/paperless-ngx/venv/lib/python3.9/site-packages/ocrmypdf/subprocess/__init__.py", line 341, in check_external_program
    raise MissingDependencyError(program)
ocrmypdf.exceptions.MissingDependencyError: gs

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/var/www/paperless-ngx/src/documents/consumer.py", line 446, in try_consume_file
    document_parser.parse(self.path, mime_type, self.filename)
  File "/var/www/paperless-ngx/src/paperless_tesseract/parsers.py", line 397, in parse
    raise ParseError(f"{e.__class__.__name__}: {e!s}") from e
documents.parsers.ParseError: MissingDependencyError: gs
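
The traceback alone does not say whether gs is absent or merely older than the 9.55 named in the title. A minimal diagnostic sketch that separates the two cases, using only the Python standard library, might look like the following. The function names are hypothetical, and it assumes that `gs --version` prints a dotted version number and that the shell's PATH matches the one the paperless-ngx service sees.

```python
import re
import shutil
import subprocess

MIN_GS = (9, 55)  # minimum version quoted in the issue title


def parse_version(text):
    """Return the first dotted version found in text as a tuple of ints, or None."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups() if part is not None)


def classify_version(version_text, min_version=MIN_GS):
    """Classify a version string as 'ok', 'too old', or 'unknown'."""
    version = parse_version(version_text)
    if version is None:
        return "unknown"
    return "ok" if version >= min_version else "too old"


def ghostscript_status(min_version=MIN_GS):
    """Report 'missing' if gs is not on PATH, otherwise classify its version."""
    path = shutil.which("gs")
    if path is None:
        return "missing"
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False, timeout=10
    )
    return classify_version(result.stdout, min_version)


if __name__ == "__main__":
    print(ghostscript_status())
```

A result of "too old" would point at the distribution's Ghostscript package rather than at a missing install step in the app package.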

Trouble using the manage.py script

Describe the bug

A clear and concise description of what the bug is.

Context

  • Hardware: VPS
  • YunoHost version: 11.1.6
  • I have access to my server: Through SSH
  • Are you in a special context or did you perform some particular tweaking on your YunoHost instance?: no
  • Using, or trying to install package version/branch: standard

Steps to reproduce

  •   sudo -Hu paperless-ngx python3 /opt/yunohost/paperless-ngx/src/manage.py document retagger -c -T -t

Expected behavior

This should start the Retagger. Instead, it complained that Django was not installed. I installed it using

pip install Django

Same for celery. Now it still gives me an error instead of retagging:

Logs

Traceback (most recent call last):
  File "/opt/yunohost/paperless-ngx/.local/lib/python3.9/site-packages/django/core/management/__init__.py", line 259, in fetch_command
    app_name = commands[subcommand]
KeyError: 'document'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/yunohost/paperless-ngx/src/manage.py", line 11, in <module>
    execute_from_command_line(sys.argv)
  File "/opt/yunohost/paperless-ngx/.local/lib/python3.9/site-packages/django/core/management/__init__.py", line 446, in execute_from_command_line
    utility.execute()
  File "/opt/yunohost/paperless-ngx/.local/lib/python3.9/site-packages/django/core/management/__init__.py", line 440, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/opt/yunohost/paperless-ngx/.local/lib/python3.9/site-packages/django/core/management/__init__.py", line 266, in fetch_command
    settings.INSTALLED_APPS
  File "/opt/yunohost/paperless-ngx/.local/lib/python3.9/site-packages/django/conf/__init__.py", line 92, in __getattr__
    self._setup(name)
  File "/opt/yunohost/paperless-ngx/.local/lib/python3.9/site-packages/django/conf/__init__.py", line 79, in _setup
    self._wrapped = Settings(settings_module)
  File "/opt/yunohost/paperless-ngx/.local/lib/python3.9/site-packages/django/conf/__init__.py", line 190, in __init__
    mod = importlib.import_module(self.SETTINGS_MODULE)
  File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 790, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/opt/yunohost/paperless-ngx/src/paperless/settings.py", line 15, in <module>
    from concurrent_log_handler.queue import setup_logging_queues
ModuleNotFoundError: No module named 'concurrent_log_handler'

I feel like there is some basic mistake I am making, can you help me with this? I am trying to run the document_retagger.
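
Two details in the traceback stand out. Django was loaded from .local/lib/python3.9/site-packages, so the system python3 was used rather than the app's own environment, and KeyError: 'document' shows the command was split at the space instead of being passed as document_retagger. The sketch below assembles the command line with both points addressed. It assumes the virtualenv lives at venv/bin/python3 under the install directory, which this report does not confirm, and the helper name is made up.

```python
from pathlib import PurePosixPath


def build_manage_command(install_dir, command, *args, user="paperless-ngx"):
    """Build an argv list that runs manage.py with the app's own interpreter."""
    if any(ch.isspace() for ch in command):
        # "document retagger" would be parsed as the unknown command "document".
        raise ValueError(f"management command must be one word, got {command!r}")
    root = PurePosixPath(install_dir)
    python = root / "venv" / "bin" / "python3"  # assumed virtualenv location
    manage = root / "src" / "manage.py"
    return ["sudo", "-H", "-u", user, str(python), str(manage), command, *args]


# Install path taken from the traceback; the flags are the ones used in the report.
argv = build_manage_command(
    "/opt/yunohost/paperless-ngx", "document_retagger", "-c", "-T", "-t"
)
print(" ".join(argv))
```

Packages installed with a bare pip install Django end up outside that environment, which would explain why further modules such as concurrent_log_handler keep turning up missing.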

API Access does not work with guest access

Describe the bug

I've added access to "Paperless-ngx (paperless-ngx API)" to Guests, assuming this would make the API available outside of SSO.
(In a second attempt, I wiped the app and checked the "app access" option during install, which seems to have the same or a similar result.)

Unfortunately something (I assume the SSO stuff) is still messing with authentication in a way that makes it impossible to use the API from the outside. My knowledge of how Yuno-SSO works is limited but I guess it could mess with the Authorization header that is used to authenticate API usage.

I have compared this with a local Docker install of Paperless-ngx that I set up beforehand for testing. On the YunoHost instance, neither the "get token" call nor any API call authenticated via token behaves the way it does on my local installation. Instead, I always get a 302 towards the login page.

Context

  • Hardware: VPS bought online
  • YunoHost version: 11.0.11
  • I have access to my server: Through SSH | through the webadmin
  • Are you in a special context or did you perform some particular tweaking on your YunoHost instance?: no

Steps to reproduce

  • Install paperless-ngx
  • Make sure guest access is active for api
  • create access token via admin UI
  • curl -H "Authorization: Token $YOURTOKEN" https://$HOST/api/correspondents

Expected behavior

  • API should authenticate via token.

Observed behaviour

  • API never authenticates, instead always 302's to the login page. (/accounts/login/?next=/api/correspondents)
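
To reproduce the observation without curl, a standard-library sketch like the one below sends the same Authorization header, refuses to follow redirects, and reports the status together with the Location header. The helper names are hypothetical, and the host and token in the commented call are placeholders.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following 3xx responses so they surface as HTTPError."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def build_request(host, token, path="/api/correspondents"):
    """Build the same request as the curl call in the steps to reproduce."""
    return urllib.request.Request(
        f"https://{host}{path}",
        headers={"Authorization": f"Token {token}", "Accept": "application/json"},
    )


def probe(host, token):
    """Return (status, location) for one token-authenticated API call."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(build_request(host, token), timeout=10) as response:
            return response.status, None
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("Location")


def interpret(status, location):
    """Turn a probe result into a short human-readable verdict."""
    if status in (301, 302, 303, 307, 308) and location and "/accounts/login" in location:
        return "redirected to login: the token was not accepted or never reached the app"
    if status == 200:
        return "token accepted"
    return f"unexpected status {status}"


# Example call with placeholder values:
# print(interpret(*probe("example.org", "YOURTOKEN")))
```

Running the probe against both the YunoHost instance and the local Docker install gives a like-for-like comparison of how each one treats the header.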

Logs

I can provide logs but neither in the nginx nor in the paperless-ngx logs is there anything meaningful beyond the obvious redirect.

document_importer / exporter is not working, folders missing

Context

  • Hardware: VPS bought online
  • YunoHost version: 11.2.5
  • I have access to my server: Through SSH | through the webadmin | direct access via keyboard
  • Are you in a special context or did you perform some particular tweaking on your YunoHost instance?: no

Steps to reproduce

  • Install paperless-ngx via web interface
  • Put export data from another paperless instance into the export directory of paperless-ngx_ynh
  • Run the import command:
    sudo -u paperless-ngx /var/www/paperless-ngx/venv/bin/python3 /var/www/paperless-ngx/src/manage.py document_importer ../export

Expected behavior

*Error:*
SystemCheckError: System check identified some issues:

ERRORS:
?: PAPERLESS_CONSUMPTION_DIR is set but doesn't exist.
	HINT: Create a directory at /var/www/paperless-ngx/consume
?: PAPERLESS_MEDIA_ROOT is set but doesn't exist.
	HINT: Create a directory at /var/www/paperless-ngx/media

Logs

nothing conspicuous in the logs

Workaround ???

If I use the paperless-ngx_ynh instance alone, does it make sense to create the required folders under /var/www/paperless-ngx/consume ?
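
As a sketch of that workaround, the snippet below pulls the directory paths out of the HINT lines of the system check and, only when asked to, creates them. The function names are made up, the default is a dry run, and it assumes the directories also need to be owned by the paperless-ngx user, which has to be handled separately. Whether the package should create them itself is a question for the maintainers.

```python
import re
from pathlib import Path

# Matches the hint format shown in the SystemCheckError output above.
HINT = re.compile(r"HINT: Create a directory at (\S+)")


def missing_directories(check_output):
    """Return the directory paths named in the HINT lines, in order."""
    return HINT.findall(check_output)


def create_missing(check_output, dry_run=True):
    """Create each hinted directory unless dry_run is set; return the paths considered."""
    paths = missing_directories(check_output)
    if not dry_run:
        for path in paths:
            Path(path).mkdir(parents=True, exist_ok=True)
    return paths


check_output = """ERRORS:
?: PAPERLESS_CONSUMPTION_DIR is set but doesn't exist.
\tHINT: Create a directory at /var/www/paperless-ngx/consume
?: PAPERLESS_MEDIA_ROOT is set but doesn't exist.
\tHINT: Create a directory at /var/www/paperless-ngx/media
"""
print(create_missing(check_output))  # dry run: lists the paths, creates nothing
```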

No longer able to upload docs.

Describe the bug

Uploading a new document to the server, by any method, fails with a missing or corrupt dependency error.

Context

  • Hardware: VPS bought online / Old laptop or computer / Raspberry Pi at home / Internet Cube with VPN / Other ARM board / ...
  • YunoHost version: 11.2.8
  • I have access to my server: Through SSH | through the webadmin | direct access via keyboard / screen | ...
  • Are you in a special context or did you perform some particular tweaking on your YunoHost instance?: no
    • If yes, please explain:
  • Using, or trying to install package version/branch:
  • If upgrading, current package version: v2.1.0-ynh1

Steps to reproduce

Two weeks ago I installed Paperless-ngx and started to use it in anger, and I find it very useful. However, there was a recent upgrade to the app via the ynh GUI. Since that upgrade, I am no longer able to upload documents. Any document I try to upload gives the attached error.

It doesn't matter which upload method you use; the error is always the same.

Expected behaviour

The document should upload to the server.

Logs

I’m sorry but I’m just a simple user and wouldn’t know where to look for a log. The rest of the app is fine, I just can’t upload new ones.

Screenshot 2023-12-09 at 13 42 18

RuntimeError: Response content longer than Content-Length

Describe the bug

I see this error in the logs:

root@YunoHost:~# tail -f /var/log/paperless-ngx/paperless-ngx.log 
    return await application(scope, receive, send)
  File "/var/www/paperless-ngx/venv/lib/python3.9/site-packages/django/core/handlers/asgi.py", line 160, in __call__
    await self.handle(scope, receive, send)
  File "/var/www/paperless-ngx/venv/lib/python3.9/site-packages/django/core/handlers/asgi.py", line 190, in handle
    await self.send_response(response, send)
  File "/var/www/paperless-ngx/venv/lib/python3.9/site-packages/django/core/handlers/asgi.py", line 274, in send_response
    await send(
  File "/var/www/paperless-ngx/venv/lib/python3.9/site-packages/uvicorn/protocols/http/httptools_impl.py", line 560, in send
    raise RuntimeError("Response content longer than Content-Length")
RuntimeError: Response content longer than Content-Length
/var/www/paperless-ngx/venv/lib/python3.9/site-packages/django/http/response.py:517: Warning: StreamingHttpResponse must consume synchronous iterators in order to serve them asynchronously. Use an asynchronous iterator instead.
  warnings.warn(

But the web service works, and I can log in and use it.

Context

  • YunoHost version: 11.2.8.2
  • Paperless-ngx 2.1.3~ynh1

State of this package?

Hey, just a question: what's the current state of this app package? Is it ready for testing on my server?
And which branch should I try, master or testing? Thank you.

Paperless-ngx https://$HOST/api not found (Error 404)

Hello,

My YunoHost Server
Hardware: VPS Purchased online
YunoHost Version: 11.1.21.4 (stable)

I have access to my server: over SSH | over Webadmin |
Are you in a special context or have you made certain settings on your YunoHost instance? : no

Description of my problem

Paperless-ngx: https://$HOST/api

Error message

not found (Error 404)

All logs show no errors.

Best regards

Enable "less powerful" mode

Describe the bug

Not a bug, but I'd like to install paperless-ngx on a Raspberry Pi, which isn't powerful enough for the default setup. Most of the RAM usage comes from OCR and the extra workers. The paperless-ngx documentation describes the config changes needed to run it on a Raspberry Pi and similar devices.

It'd be nice to be able to choose between the "default" and "less powerful" config, where the second one would exclude OCR and maybe even NLTK.
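
To make the request concrete, here is a rough sketch of what applying a "less powerful" profile to a paperless.conf-style file could look like. The option names are the ones I recall from the upstream paperless-ngx configuration documentation and should be checked against the shipped version. The profile contents, the function name, and the sample config are assumptions, and the package currently offers no such switch.

```python
# Hypothetical low-resource profile; option names recalled from upstream docs, values illustrative.
LOW_POWER = {
    "PAPERLESS_TASK_WORKERS": "1",
    "PAPERLESS_THREADS_PER_WORKER": "1",
    "PAPERLESS_WEBSERVER_WORKERS": "1",
    "PAPERLESS_OCR_PAGES": "1",
    "PAPERLESS_ENABLE_NLTK": "false",
}


def apply_profile(conf_text, overrides):
    """Rewrite KEY=VALUE lines for overridden keys and append the keys that were absent."""
    remaining = dict(overrides)
    out = []
    for line in conf_text.splitlines():
        # Treat "#KEY=value" like "KEY=value" so commented defaults get replaced too.
        key = line.lstrip("# ").split("=", 1)[0].strip()
        if key in remaining:
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)
    out.extend(f"{key}={value}" for key, value in remaining.items())
    return "\n".join(out) + "\n"


# Made-up starting config for illustration.
conf = (
    "PAPERLESS_TASK_WORKERS=4\n"
    "#PAPERLESS_OCR_PAGES=0\n"
    "PAPERLESS_URL=https://example.org\n"
)
print(apply_profile(conf, LOW_POWER))
```

An install-time question could select between an empty override set and a profile like this one, leaving every other setting untouched.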
