Hinkskalle

On-Premises Container Registry for OCI/docker and singularity

What Am I

(buckethead) - I can store, retrieve and manage OCI and singularity containers in a central library.

Compatible with/re-implementation of the singularity library protocol and the OCI distribution spec.

Documentation

Installation + Usage instructions can be found here:

https://csf-ngs.github.io/hinkskalle/

Continue to read here if you are interested in mucking around in the source code (and maybe helping out with the odd PR or two).

Development

Clone Current HEAD

git clone https://github.com/csf-ngs/hinkskalle.git

Docker development environment

Achieve the best development experience with continuous reloads and frontend builds! No need to set up/mess up your computer!

Hinkskalle comes with a development environment based on docker-compose.

The ghcr.io/csf-ngs/hinkskalle-dev image contains a complete development environment.

First Setup

Initial setup (or maybe you want to reset your dev environment):

# (of course you can use your own favorite dummy secrets)
cat <<_EOF > conf/db_secrets.env
POSTGRES_PASSWORD=supersecret
_EOF

cat <<_EOF > conf/secrets.env
HINKSKALLE_SECRET_KEY=superdupersecret
DB_PASSWORD=supersecret
HINKSKALLE_LDAP_BIND_PASSWORD=superldapsecret
_EOF

cat <<_EOF > conf/slapd_secrets.env
LDAP_ROOT_PASSWORD=superrootsecret
LDAP_LOGIN_PASSWORD=superldapsecret
_EOF

# start hinkdb first to set up base database
docker-compose up -d hinkdb
# give it a second
# install current database schema
docker-compose run --rm api flask db upgrade

# set up first admin user
docker-compose run --rm api flask localdb add-user \
    -u admin.hase \
    -p oink \
    -e '[email protected]' \
    -f Admin \
    -l Hase \
    --admin
# set up a normal user
docker-compose run --rm api flask localdb add-user \
    -u test.hase \
    -p oink \
    -e '[email protected]' \
    -f Test \
    -l Hase 

# ONLY when you need to reset the dev env: clean everything
docker-compose down
docker-compose config --volumes | xargs docker volume rm 
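
If you prefer random values over the dummy secrets, the three env files from the start of the setup can be generated with a small script. This is only a sketch: the helper names (render_env, build_secret_files, write_secret_files) are made up for illustration, and it assumes that the database and LDAP passwords have to match across files, as the identical dummy values above suggest.

```python
import secrets
from pathlib import Path


def render_env(values: dict) -> str:
    """Render a mapping as KEY=value lines, one per entry."""
    return "".join(f"{key}={value}\n" for key, value in values.items())


def build_secret_files() -> dict:
    """Return the contents of the three env files with random values.

    Assumption: DB_PASSWORD must equal POSTGRES_PASSWORD and
    HINKSKALLE_LDAP_BIND_PASSWORD must equal LDAP_LOGIN_PASSWORD,
    because the dummy values in the setup commands are identical.
    """
    db_password = secrets.token_urlsafe(24)
    ldap_password = secrets.token_urlsafe(24)
    return {
        "conf/db_secrets.env": {"POSTGRES_PASSWORD": db_password},
        "conf/secrets.env": {
            "HINKSKALLE_SECRET_KEY": secrets.token_urlsafe(32),
            "DB_PASSWORD": db_password,
            "HINKSKALLE_LDAP_BIND_PASSWORD": ldap_password,
        },
        "conf/slapd_secrets.env": {
            "LDAP_ROOT_PASSWORD": secrets.token_urlsafe(24),
            "LDAP_LOGIN_PASSWORD": ldap_password,
        },
    }


def write_secret_files(base_dir: str = ".") -> None:
    """Write the env files below base_dir, creating conf/ if needed."""
    for relative_path, values in build_secret_files().items():
        target = Path(base_dir) / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(render_env(values))
```

Call write_secret_files() from the repository root before starting hinkdb for the first time.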

Running Development Instances

Dev server: http://localhost:7660

# WARNING: On first startup we need to install node modules and build the frontend
# this might take a few minutes and the dev server will show
# The requested URL was not found on the server. until that's done.
#
# Whole stack (rarely needed)
docker-compose up -d
# bare minimum
docker-compose up -d api build_frontend
# log output
docker-compose logs -f
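
Instead of reloading the browser until the frontend build is done, you can poll the dev server. The following is a rough sketch, not part of the repository: http_status and wait_until_ready are hypothetical helpers, and it assumes that http://localhost:7660 answers with status 200 once the frontend is built (until then the "not found" page is served).

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str) -> int:
    """Return the HTTP status code for url, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx answers (e.g. the 404 shown during the first build) land here
        return err.code
    except OSError:
        # connection refused, timeout, DNS failure, ...
        return 0


def wait_until_ready(probe: Callable[[], int], attempts: int = 60, delay: float = 5.0) -> bool:
    """Call probe until it returns 200; give up after the given number of attempts."""
    for attempt in range(attempts):
        if probe() == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Usage: wait_until_ready(lambda: http_status("http://localhost:7660")) returns True as soon as the server answers with 200, or False after roughly five minutes with the defaults.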

The current working directory (base) is mounted into the relevant containers. You can edit the source files with your favorite editor/IDE directly. Services will automatically rebuild and/or restart on changes.

This starts the following services:

api: Local Backend Instance at port 7660

Using script/start-dev.sh. Restarts on changes in backend/

build_frontend: Continuous Frontend Build

Using script/start-dev-frontend.sh, basically a yarn build --watch

hinkdb: Postgres database

rq_scheduler, rq_worker, redis: Backend async job queue (optional)

ldap: for testing LDAP authentication (optional)

hockeypuck, hockeypuck_db: PGP keyserver (optional)

Side Notes

  • uploaded images are stored in ./tmp

Development Install

Needs postgresql dev libraries! Install according to your OS instructions, e.g.:

# mac os x
brew install postgresql
# ubuntu/debian
apt install libpq-dev
# etc.
cd backend/
python3 -m venv venv
source venv/bin/activate
pip install -e '.[dev]'

This will also install nose2, Jinja2, fakeredis and psycopg2 for running tests and generating typescript classes.

You also need to set up Node, Vue and vue-cli for testing and compiling the frontend:

# install node according to your OS
cd frontend/
yarn install

Patch Singularity

Not necessary for singularity v3.9.0 or newer, use --insecure and --no-https params instead

Singularity versions before v3.9.0 absolutely require that the library server is reachable via https. While you can set this up for your development server, it's much easier to patch the source code and compile your own binary.

The necessary patch is provided in share/singularity-plain-http.patch and should work on all versions.

ORAS requires a similar patch. If you want to play around with that, apply share/oras-plain-http.patch.

Follow the instructions on https://sylabs.io/guides/3.7/admin-guide/installation.html (adjust for the version you would like) and apply the patch between the steps "Checkout Code from Git" and "Compile Singularity":

cd ${GOPATH}/src/github.com/sylabs/singularity
patch -p1 < /path/to/singularity-plain-http.patch
patch -p1 < /path/to/oras-plain-http.patch

Start Development Server

script/start-dev.sh
# continuous build of frontend
script/start-dev-frontend.sh
# (optional: start rq worker)
# script/start-dev-worker.sh

Backend Tests

cd backend
nose2

Frontend Tests

cd frontend
yarn test:unit

OCI Conformance Tests

Requires a docker image built from https://github.com/opencontainers/distribution-spec/tree/main/conformance.

cd share/oci
./conformance-test.sh

Your backend should be available at localhost:7660

Generate Typescript Classes for models

pip3 install git+https://github.com/csf-ngs/swagspotta
# make sure your local dev server is running at localhost:7660
share/generate-models.sh

Built With

Contributing

Please do!

Authors

License

This project is licensed under the MIT License - see the LICENSE.md file for details

hinkskalle's People

Contributors

grisuthedragon, h3kker

hinkskalle's Issues

do not re-serialize dump_only fields

after update to marshmallow 3.x dump_only fields are not accepted any more (instead of ignored)

swagger-to-typescript and swagger-to-python should not include those fields in serialize*
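
One way the generated serialize* helpers could handle this is to drop the dump_only fields before the payload is sent back. A minimal sketch in plain Python, where strip_dump_only and the field names are made up for illustration and are not taken from the generators:

```python
from typing import Any, Dict, Iterable


def strip_dump_only(payload: Dict[str, Any], dump_only: Iterable[str]) -> Dict[str, Any]:
    """Return a copy of payload without the fields the server marks as dump_only."""
    blocked = set(dump_only)
    return {key: value for key, value in payload.items() if key not in blocked}


# Hypothetical example: "id" and "createdAt" stand in for dump_only fields.
container = {"id": "42", "name": "testhase", "createdAt": "2022-09-23T09:01:51"}
print(strip_dump_only(container, ["id", "createdAt"]))
```

The generated code would need the list of dump_only fields per model, which presumably has to come from the swagger definition.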

groups #17

enable groups, handle permissions

local/ldap group membership

create groups for users?

image housekeeping job #64

  • check file exists
  • check correct size
  • check hash?
  • verify signature for singularity

possibly also symlink maintenance in same job #9
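
The first three checks could look roughly like the sketch below. It is not the actual job: check_image_file and sha256_of are hypothetical names, the expected size and hash are assumed to come from the image record in the database, and signature verification for singularity is left out.

```python
import hashlib
from pathlib import Path
from typing import List, Optional


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash the file in chunks so large images do not have to fit into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_image_file(location: str, expected_size: int,
                     expected_sha256: Optional[str] = None) -> List[str]:
    """Return a list of problems found for one image file (empty list = all good)."""
    path = Path(location)
    if not path.is_file():
        return ["file missing"]
    problems = []
    actual_size = path.stat().st_size
    if actual_size != expected_size:
        problems.append(f"size mismatch: expected {expected_size}, found {actual_size}")
    # hashing is optional because it is expensive for large images
    if expected_sha256 is not None and sha256_of(path) != expected_sha256:
        problems.append("sha256 mismatch")
    return problems
```

Keeping the hash check optional matches the question mark in the list above: the job could run the cheap checks often and the hash check only occasionally.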

Tag has multiple architectures exception

Hard to reproduce, but this keeps popping up at vbcf after some push (running v4.3.1):

api_1           | [2022-09-23 09:01:51,044] ERROR in app:1455 | Exception on /v1/latest [GET]
api_1           | Traceback (most recent call last):
api_1           |   File "/usr/local/lib/python3.8/dist-packages/flask_rebar/compat.py", line 44, in dump
api_1           |     result = schema.dump(data)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/marshmallow/schema.py", line 552, in dump
api_1           |     result = self._serialize(processed_obj, many=many)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/marshmallow/schema.py", line 520, in _serialize
api_1           |     value = field_obj.serialize(attr_name, obj, accessor=self.get_attribute)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/marshmallow/fields.py", line 338, in serialize
api_1           |     return self._serialize(value, attr, obj, **kwargs)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/marshmallow/fields.py", line 634, in _serialize
api_1           |     return schema.dump(nested_obj, many=many)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/marshmallow/schema.py", line 552, in dump
api_1           |     result = self._serialize(processed_obj, many=many)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/marshmallow/schema.py", line 514, in _serialize
api_1           |     return [
api_1           |   File "/usr/local/lib/python3.8/dist-packages/marshmallow/schema.py", line 515, in <listcomp>
api_1           |     self._serialize(d, many=False)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/marshmallow/schema.py", line 520, in _serialize
api_1           |     value = field_obj.serialize(attr_name, obj, accessor=self.get_attribute)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/marshmallow/fields.py", line 338, in serialize
api_1           |     return self._serialize(value, attr, obj, **kwargs)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/marshmallow/fields.py", line 634, in _serialize
api_1           |     return schema.dump(nested_obj, many=many)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/marshmallow/schema.py", line 552, in dump
api_1           |     result = self._serialize(processed_obj, many=many)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/marshmallow/schema.py", line 520, in _serialize
api_1           |     value = field_obj.serialize(attr_name, obj, accessor=self.get_attribute)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/marshmallow/fields.py", line 330, in serialize
api_1           |     value = self.get_value(obj, attr, accessor=accessor)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/marshmallow/fields.py", line 260, in get_value
api_1           |     return accessor_func(obj, check_key, default)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/marshmallow/schema.py", line 479, in get_attribute
api_1           |     return get_value(obj, attr, default)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/marshmallow/utils.py", line 241, in get_value
api_1           |     return _get_value_for_key(obj, key, default)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/marshmallow/utils.py", line 255, in _get_value_for_key
api_1           |     return getattr(obj, key, default)
api_1           |   File "/srv/hinkskalle/backend/Hinkskalle/models/Container.py", line 174, in imageTags
api_1           |     raise Exception(f"Tag {tag.name} has multiple architectures")
api_1           | Exception: Tag 0.21.0-asm294v2 has multiple architectures
api_1           | During handling of the above exception, another exception occurred:
api_1           | Traceback (most recent call last):
api_1           |   File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1523, in full_dispatch_request
api_1           |     rv = self.dispatch_request()
api_1           |   File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1509, in dispatch_request
api_1           |     return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/flask_rebar/rebar.py", line 164, in wrapped
api_1           |     marshaled = marshal(data=data, schema=schema)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/flask_rebar/utils/request_utils.py", line 91, in marshal
api_1           |     return compat.dump(schema=schema, data=data)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/flask_rebar/compat.py", line 48, in dump
api_1           |     raise marshmallow.ValidationError(str(e))
api_1           | marshmallow.exceptions.ValidationError: Tag 0.21.0-asm294v2 has multiple architectures
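
To find the affected tags before the API trips over them, the tag/architecture pairs could be grouped like this. Only a sketch: tags_with_multiple_architectures is a hypothetical helper, the (tag name, architecture) rows are assumed to come from a query on the tag table, and the architecture values below are invented.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def tags_with_multiple_architectures(rows: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Group (tag name, architecture) rows and keep tags seen with more than one architecture."""
    seen: Dict[str, Set[str]] = defaultdict(set)
    for tag_name, arch in rows:
        seen[tag_name].add(arch)
    return {name: archs for name, archs in seen.items() if len(archs) > 1}


# Invented sample rows; only the tag name is taken from the log above.
rows = [
    ("0.21.0-asm294v2", "amd64"),
    ("0.21.0-asm294v2", "arm64"),
    ("latest", "amd64"),
]
print(tags_with_multiple_architectures(rows))
```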

error: exec: "./script/start.sh": stat ./script/start.sh: no such file or directory

After fixing the "Hinkskalle.util" python error I now get:

api_1           | error: exec: "./script/start.sh": stat ./script/start.sh: no such file or directory

I use the plain docker-compose.yml from share/deploy and commented out all LDAP related stuff from the environment files. I set the system up by:

docker-compose up -d hinkdb
docker-compose run --rm api flask db upgrade
docker-compose run --rm api flask localdb add-user   -u admin   -p admin   -e [email protected]   -f Admin -l Hase   --admin
docker-compose up 

Max Image size

Push works fine for me with smaller images (~100MB), but I get a NumericValueOutOfRange exception for files of a few gigabytes.
How can I expand the capacity?

ModuleNotFoundError: No module named 'Hinkskalle.util'

In the current containers, downloaded by docker compose, the following error is thrown:

# flask db upgrade
Usage: flask db upgrade [OPTIONS] [REVISION]
Try 'flask db upgrade --help' for help.

Error: While importing 'Hinkskalle', an ImportError was raised:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/flask/cli.py", line 234, in locate_app
    __import__(module_name)
  File "/usr/local/lib/python3.8/dist-packages/Hinkskalle/__init__.py", line 17, in <module>
    from Hinkskalle.util.swagger import register_authenticators
ModuleNotFoundError: No module named 'Hinkskalle.util'


Make LDAP Sync Better to configure

Since I got hinkskalle running and the first experiments were satisfying, I started integrating it into our infrastructure. The LDAP sync allows us to log in with our normal usernames (something like lastname+x), but hinkskalle uses lowercase(givenname.lastname), which is not intuitive in our setup. The reason for this is the following line:

user.username = slugify(_get_attr(attrs.get('cn')), separator='.')

in my installation, I changed it to:

user.username = _get_attr(attrs.get('uid'))

Since everybody has a bit different ldap installation, this should be configurable.
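
A configurable version could look roughly like this. It is a sketch only: username_from_ldap, its attribute and slug parameters, and the two helpers are hypothetical stand-ins (simple_slugify is a simplified ASCII-only replacement for slugify, and _first_value only approximates what _get_attr presumably does).

```python
import re
from typing import Mapping, Sequence, Union

LdapValues = Sequence[Union[str, bytes]]


def _first_value(values: LdapValues) -> str:
    """Return the first value of an LDAP attribute as str (stand-in for _get_attr)."""
    if not values:
        raise ValueError("LDAP attribute is missing or empty")
    value = values[0]
    return value.decode("utf-8") if isinstance(value, bytes) else value


def simple_slugify(text: str, separator: str = ".") -> str:
    """Lowercase text and join its ASCII alphanumeric runs with separator."""
    return separator.join(re.findall(r"[a-z0-9]+", text.lower()))


def username_from_ldap(attrs: Mapping[str, LdapValues],
                       attribute: str = "cn", slug: bool = True) -> str:
    """Build the username from a configurable LDAP attribute.

    The defaults reproduce the current behaviour (slugified cn);
    attribute="uid", slug=False gives the variant from this issue.
    """
    raw = _first_value(attrs.get(attribute, ()))
    return simple_slugify(raw) if slug else raw


attrs = {"cn": ["Test Hase"], "uid": [b"haset"]}
print(username_from_ldap(attrs))                               # test.hase
print(username_from_ldap(attrs, attribute="uid", slug=False))  # haset
```

The two parameters could be filled from the LDAP section of the configuration, so existing installations keep their usernames.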

Integer out of Range when uploading large Images

When we upload a large image (around 14 GB), hinkskalle returns a HTTP 500 and we get the following output in the log:

hinkdb_1        | 2023-01-24 12:50:34.094 UTC [47] ERROR:  integer out of range
hinkdb_1        | 2023-01-24 12:50:34.094 UTC [47] STATEMENT:  INSERT INTO image_upload_url (id, "expiresAt", path, size, md5sum, sha256sum, state, type, "partNumber", "totalParts", "createdAt", "createdBy", parent_id, image_id) VALUES ('35b54ce4-2a61-41c9-83f9-25713125b8f8', '2023-01-24T12:55:34.094330'::timestamp, '/mnt/images/_tmp/tmp61ddtvmq', 14068015104, NULL, NULL, 'initialized', 'multipart', NULL, 210, '2023-01-24T12:50:34.094355'::timestamp, 'koehlerm', NULL, 35)
api_1           | [2023-01-24 12:50:34,097] ERROR in app:1449 | Exception on /v2/imagefile/35/_multipart [POST]
api_1           | Traceback (most recent call last):
api_1           |   File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1900, in _execute_context
api_1           |     self.dialect.do_execute(
api_1           |   File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/default.py", line 736, in do_execute
api_1           |     cursor.execute(statement, parameters)
api_1           | psycopg2.errors.NumericValueOutOfRange: integer out of range
api_1           | 
api_1           | 
api_1           | The above exception was the direct cause of the following exception:
api_1           | 
api_1           | Traceback (most recent call last):
api_1           |   File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1517, in full_dispatch_request
api_1           |     rv = self.dispatch_request()
api_1           |   File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1503, in dispatch_request
api_1           |     return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/flask_rebar/rebar.py", line 141, in wrapped
api_1           |     rv = f(*args, **kwargs)
api_1           |   File "/srv/hinkskalle/backend/Hinkskalle/routes/imagefiles.py", line 231, in push_image_v2_multi_init
api_1           |     db.session.commit()
api_1           |   File "<string>", line 2, in commit
api_1           |   File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 1451, in commit
api_1           |     self._transaction.commit(_to_root=self.future)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 829, in commit
api_1           |     self._prepare_impl()
api_1           |   File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 808, in _prepare_impl
api_1           |     self.session.flush()
api_1           |   File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 3429, in flush
api_1           |     self._flush(objects)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 3569, in _flush
api_1           |     transaction.rollback(_capture_exception=True)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
api_1           |     compat.raise_(
api_1           |   File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/util/compat.py", line 208, in raise_
api_1           |     raise exception
api_1           |   File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 3529, in _flush
api_1           |     flush_context.execute()
api_1           |   File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/unitofwork.py", line 456, in execute
api_1           |     rec.execute(self)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/unitofwork.py", line 630, in execute
api_1           |     util.preloaded.orm_persistence.save_obj(
api_1           |   File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/persistence.py", line 245, in save_obj
api_1           |     _emit_insert_statements(
api_1           |   File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/persistence.py", line 1238, in _emit_insert_statements
api_1           |     result = connection._execute_20(
api_1           |   File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1705, in _execute_20
api_1           |     return meth(self, args_10style, kwargs_10style, execution_options)
api_1           |   File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/elements.py", line 333, in _execute_on_connection
api_1           |     return connection._execute_clauseelement(
api_1           |   File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1572, in _execute_clauseelement
api_1           |     ret = self._execute_context(
api_1           |   File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1943, in _execute_context
api_1           |     self._handle_dbapi_exception(
api_1           |   File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 2124, in _handle_dbapi_exception
api_1           |     util.raise_(
api_1           |   File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/util/compat.py", line 208, in raise_
api_1           |     raise exception
api_1           |   File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1900, in _execute_context
api_1           |     self.dialect.do_execute(
api_1           |   File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/default.py", line 736, in do_execute
api_1           |     cursor.execute(statement, parameters)
api_1           | sqlalchemy.exc.DataError: (psycopg2.errors.NumericValueOutOfRange) integer out of range
api_1           | 
api_1           | [SQL: INSERT INTO image_upload_url (id, "expiresAt", path, size, md5sum, sha256sum, state, type, "partNumber", "totalParts", "createdAt", "createdBy", parent_id, image_id) VALUES (%(id)s, %(expiresAt)s, %(path)s, %(size)s, %(md5sum)s, %(sha256sum)s, %(state)s, %(type)s, %(partNumber)s, %(totalParts)s, %(createdAt)s, %(createdBy)s, %(parent_id)s, %(image_id)s)]
api_1           | [parameters: {'id': '35b54ce4-2a61-41c9-83f9-25713125b8f8', 'expiresAt': datetime.datetime(2023, 1, 24, 12, 55, 34, 94330), 'path': '/mnt/images/_tmp/tmp61ddtvmq', 'size': 14068015104, 'md5sum': None, 'sha256sum': None, 'state': 'initialized', 'type': 'multipart', 'partNumber': None, 'totalParts': 210, 'createdAt': datetime.datetime(2023, 1, 24, 12, 50, 34, 94355), 'createdBy': 'koehlerm', 'parent_id': None, 'image_id': 35}]
api_1           | (Background on this error at: https://sqlalche.me/e/14/9h9h)

Running Hinkskalle 4.6.0, upgraded from 4.4.x.

Some details:

In models/Image.py the size column is created as

size = db.Column(db.BigInteger())

but looking at the table in the postgres db we get:

hinkskalle=# \d image_upload_url
                      Table "public.image_upload_url"
   Column   |            Type             | Collation | Nullable | Default 
------------+-----------------------------+-----------+----------+---------
 id         | character varying           |           | not null | 
 expiresAt  | timestamp without time zone |           |          | 
 path       | character varying           |           | not null | 
 size       | integer                     |           |          | 
 md5sum     | character varying           |           |          | 
 sha256sum  | character varying           |           |          | 
 state      | character varying           |           |          | 
 createdAt  | timestamp without time zone |           |          | 
 createdBy  | character varying           |           |          | 
 image_id   | integer                     |           | not null | 
 type       | upload_types                |           |          | 
 parent_id  | character varying           |           |          | 
 partNumber | integer                     |           |          | 
 totalParts | integer                     |           |          | 

This results in a 4 byte integer column.
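
A quick check of the numbers, as a sketch (fits_signed is just an illustrative helper): a signed 4 byte integer ends at 2147483647, so the size from the failed INSERT cannot fit, while an 8 byte bigint has plenty of room.

```python
def fits_signed(value: int, num_bytes: int) -> bool:
    """Return True if value fits into a signed integer of num_bytes bytes."""
    bits = num_bytes * 8
    return -(2 ** (bits - 1)) <= value <= 2 ** (bits - 1) - 1


reported_size = 14068015104  # size value from the failed INSERT above

print(2 ** 31 - 1)                    # 2147483647, upper limit of a 4 byte integer
print(fits_signed(reported_size, 4))  # False
print(fits_signed(reported_size, 8))  # True
```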

The reason seems to be that the db migrations are not executed during the update to 4.6. For now, I did it manually:

docker exec -it hinkskalle_hinkdb_1 psql -U XXXXX -d hinkskalle
alter table image_upload_url alter column size TYPE BIGINT;
alter table image alter column size TYPE BIGINT;

Either we're missing something during upgrade or the migration does not work properly.
@h3kker I think, a short "upgrade" section in the docs would be nice.

webauthn backend url validate

origin validation fails with trailing /:

api_1           | webauthn.helpers.exceptions.InvalidRegistrationResponse: Unexpected client data origin "https://kubel.testha.se", expected "https://kubel.testha.se/"
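
A possible fix is to normalize the configured backend URL to a bare origin (scheme and host, no path) before it is used as the expected origin. A minimal sketch with a hypothetical helper name, not the actual code:

```python
from urllib.parse import urlsplit


def normalize_origin(url: str) -> str:
    """Reduce a URL to scheme://host[:port], dropping any path or trailing slash."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {url!r}")
    return f"{parts.scheme}://{parts.netloc}"


print(normalize_origin("https://kubel.testha.se/"))  # https://kubel.testha.se
print(normalize_origin("https://kubel.testha.se"))   # https://kubel.testha.se
```

With this, both spellings of the backend URL would give the origin that the browser reports in the client data.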

archive old images #62

upload unused images to azure cold storage

  • not downloaded within 1 year
  • created >6 months ago+never downloaded

or something.
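
As a sketch, the two criteria could be expressed like this. should_archive is a hypothetical helper, the download date is assumed to be tracked per image (see the checklist item on that), and "6 months" is approximated as 182 days.

```python
from datetime import datetime, timedelta
from typing import Optional

UNUSED_AFTER = timedelta(days=365)            # not downloaded within 1 year
NEVER_DOWNLOADED_AFTER = timedelta(days=182)  # assumption: 6 months as 182 days


def should_archive(created_at: datetime, last_downloaded_at: Optional[datetime],
                   now: datetime) -> bool:
    """Decide whether an image is a candidate for cold storage."""
    if last_downloaded_at is None:
        # never downloaded: archive once it is older than ~6 months
        return now - created_at > NEVER_DOWNLOADED_AFTER
    return now - last_downloaded_at > UNUSED_AFTER


now = datetime(2023, 1, 1)
print(should_archive(datetime(2022, 1, 1), None, now))                   # True
print(should_archive(datetime(2020, 1, 1), datetime(2022, 6, 1), now))   # False
```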

[ ] download date
[ ] maintenance routine to move unused
[ ] set Image.location to storage url, e.g. https://myaccount.blob.core.windows.net/mycontainer/myblob
[ ] would need to wrap send_file for pulls. can singularity/oras/... do redirects?
[ ] maintenance/command to move back to local storage

https://pypi.org/project/cloudstorage/ seems dubious (documentation?)

symlink maintenance #49

link handling should be done on image:

  • gather current links
  • remove without tag
  • remove changed tag
  • add links for current tags

on tag route. also run on oci push for singularity media type?
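
The four steps boil down to a diff between the links that exist and the links the current tags ask for. A sketch under assumptions: plan_links is a hypothetical helper, and both inputs are plain dicts mapping a link path to the image file it should point to (the paths below are invented).

```python
from typing import Dict, Set, Tuple


def plan_links(current_links: Dict[str, str],
               wanted_links: Dict[str, str]) -> Tuple[Set[str], Dict[str, str]]:
    """Compare existing links with the links derived from the current tags.

    Returns (links to remove, links to create). A link whose tag now points to
    another image shows up in both, so it is removed and re-created.
    """
    to_remove = {link for link, target in current_links.items()
                 if wanted_links.get(link) != target}
    to_add = {link: target for link, target in wanted_links.items()
              if current_links.get(link) != target}
    return to_remove, to_add


current = {"c/v1": "img-a", "c/old": "img-b", "c/latest": "img-a"}
wanted = {"c/v1": "img-a", "c/latest": "img-c", "c/v2": "img-c"}
to_remove, to_add = plan_links(current, wanted)
print(sorted(to_remove))  # ['c/latest', 'c/old']
print(to_add)             # {'c/latest': 'img-c', 'c/v2': 'img-c'}
```

Because the plan is computed per image from its tags, the same function could be used from the tag route and from a job that walks all images.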

trigger link maintenance route

link maintenance on all images -> rq job

could be folded into #5

update dev docs

pip3 install '.[dev]' requires postgres client libraries

start postgres container once for db init, run initial migrations

default quota for new users

After running hinkskalle for two months, we found a nice-to-have enhancement: it would be great if we could set a default quota for users and groups, applied on a user's first login or when a group is created.
