
kube-paperless-ng's Introduction


Paperless-ngx

Paperless-ngx is a document management system that transforms your physical documents into a searchable online archive so you can keep, well, less paper.

Paperless-ngx forked from paperless-ng to continue the great work and distribute responsibility of supporting and advancing the project among a team of people. Consider joining us! Discussion of this transition can be found in issues #1599 and #1632.

Paperless-ngx on Kubernetes

I don't care...take me to TLDR

I ran paperless-ngx on Docker for a while and then moved the installation to Kubernetes. I did this for a few reasons:

  • Learn Kubernetes better
  • Help out people who wanted to learn Kubernetes better
  • See how a document management system would scale and work for me

This is the repo of the manifests that I used to put paperless into my microk8s cluster. There are a few caveats here:

  • If you run multiple nodes in your cluster, your PVC configs will need to reflect that. OpenEBS is a good option.
  • I put the consumption directory on an NFS share. Why, you ask? Simple:
    • If you have a scanner, you can scan directly to a share on a NAS, file server, or whatever you want.
    • The scan target is usually a shared folder somewhere anyway, so creating a PVC for it seemed like a bad idea; a share was the better fit.
    • You can be granular with permissions on the share, so make sure you grant paperless the access it needs.
    • You can also have an "inotify" process running to pass files into this directory from another share if you want.

The possibilities are endless really
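As a rough illustration of the NFS-backed consumption directory described above, here is a minimal sketch of a Deployment that mounts an NFS export directly as a volume instead of going through a PVC. The Deployment name, labels, image tag, NFS server address, and export path are all placeholders I made up for the example and are not taken from the manifests in this repo. The mount path assumes the default consume location in the paperless-ngx container image.

```yaml
# Sketch only: every name, the image tag, the server address and the
# export path below are placeholders, not values from this repo.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: paperless-app
  namespace: paperless
spec:
  replicas: 1
  selector:
    matchLabels:
      app: paperless-app
  template:
    metadata:
      labels:
        app: paperless-app
    spec:
      containers:
        - name: paperless
          image: ghcr.io/paperless-ngx/paperless-ngx:latest
          volumeMounts:
            # Assumed default consume directory inside the container
            - name: consume
              mountPath: /usr/src/paperless/consume
      volumes:
        # Mount the NFS export directly, no PVC involved
        - name: consume
          nfs:
            server: 192.0.2.10      # placeholder NAS address
            path: /export/scans     # placeholder export path
```

One thing to watch for: file-change notifications generally do not propagate over NFS, so paperless may need its PAPERLESS_CONSUMER_POLLING setting enabled in the config to pick up new scans.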

I also offloaded OCR and document conversion to Tika and Gotenberg respectively. The OCR deployment and service manifests show the services needed.

I did not put an NGINX ingress on this installation because I didn't want one; I just wanted the port exposed. In my setup, an external load balancer/controller handles access lists and certificates. You can easily change the webserver service to sit behind an ingress if you wish and then create the ingress manifest, which I may include later.
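For anyone who does want an ingress, a minimal sketch might look like the following. This is not part of the repo. The Service name `paperless-webserver`, the hostname, and the `nginx` ingress class are assumptions for illustration, and port 8000 is assumed to be the port the webserver Service exposes.

```yaml
# Sketch only: the Service name, host and ingress class are hypothetical.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: paperless
  namespace: paperless
spec:
  ingressClassName: nginx            # assumes an NGINX ingress controller is installed
  rules:
    - host: paperless.example.com    # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: paperless-webserver   # hypothetical Service name
                port:
                  number: 8000              # assumed webserver port
```

With an ingress in front, the webserver Service can usually be a plain ClusterIP Service rather than one that exposes a port outside the cluster.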

You'll also notice an AV manifest. I was working on a solution to scan uploads with ClamAV, but it's not there yet, so you can safely remove it, or keep it and see if you can get Clam to scan the consume directory.

Installation

Have a working K8S installation somewhere. Microk8s, Minikube...doesn't matter. Download/pull the manifests and edit paperless-config.yaml to your liking. You'll notice those are the environment values for paperless itself, so you can easily add or remove what you want based on the paperless documentation.

Please note the strings for the OCR section, as they point to the OCR service deployment. The cluster's internal DNS resolves the service names, not the names of the containers, so make sure you don't change them or try to resolve the container names as you would with Docker. If you have those services running somewhere else, like a different tenant or outside K8S, you'll have to use the IP address.

Be careful upgrading the Tika version past the 1 series branch, as it does break stuff. I might try it later with Tensor-Flow but who knows.
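To make the service-name point concrete, here is a sketch of what the relevant part of a ConfigMap could look like. The ConfigMap name and the Service names `tika` and `gotenberg` are assumptions for illustration; use whatever names the service manifests in this repo actually define. The ports shown are the usual defaults for Tika and Gotenberg, so adjust them if your Services expose something different.

```yaml
# Sketch only: the ConfigMap name and the Service names "tika" and
# "gotenberg" are assumed; match them to your own service manifests.
apiVersion: v1
kind: ConfigMap
metadata:
  name: paperless-config
  namespace: paperless
data:
  PAPERLESS_TIKA_ENABLED: "1"
  # Hostnames are Kubernetes Service names resolved by cluster DNS,
  # not container names as they would be under docker-compose.
  PAPERLESS_TIKA_ENDPOINT: "http://tika:9998"
  PAPERLESS_TIKA_GOTENBERG_ENDPOINT: "http://gotenberg:3000"
```

If Tika or Gotenberg runs outside the cluster, the same keys would carry an IP address or external hostname instead of a Service name.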

All deployments should pull their configs from ConfigMaps.

If you want to run this in production, create a secret for the PostgreSQL database login info and change the env values in the deployment manifest to reflect that. Using secrets is easy and well documented. Then change the env values to:

    envFrom:
      - secretRef:
          name: your-paperless-db-secret

Or whatever you want to call your secret.
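A minimal sketch of such a secret is shown below. The username and password are placeholders, and the key names assume you want them injected as the paperless database settings PAPERLESS_DBUSER and PAPERLESS_DBPASS. With envFrom, each key in the secret becomes an environment variable of the same name in the container.

```yaml
# Sketch only: the credentials are placeholders, replace them before applying.
apiVersion: v1
kind: Secret
metadata:
  name: your-paperless-db-secret
  namespace: paperless
type: Opaque
stringData:
  # stringData takes plain text; Kubernetes stores it base64-encoded
  PAPERLESS_DBUSER: paperless
  PAPERLESS_DBPASS: change-me
```

The PostgreSQL deployment needs matching credentials, so it likely makes sense to feed it from a secret as well rather than keeping the password in its manifest.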

That's it.

TLDR

  1. Go to the CLI: kubectl create namespace paperless
  2. Edit paperless-config.yaml to your liking
  3. Back to the CLI: kubectl -n paperless apply -f .
  4. Create a superuser: kubectl -n paperless exec --stdin --tty pod/paperless-app-xxxxxxx -- python3 ./manage.py createsuperuser
  5. Profit

TO DO LIST

Add autoscaling with a horizontal pod autoscaler
