edge's Introduction

Storj V3 Network

Storj is building a distributed cloud storage network. Check out our white paper for more info!


Storj is an S3-compatible platform and suite of distributed applications that allows you to store data in a secure and distributed manner. Your files are encrypted, broken into little pieces and stored in a global distributed network of computers. Luckily, we also support allowing you (and only you) to retrieve those files!

Contributing to Storj

All of our code for Storj v3 is open source. If anything feels off, or if you feel that some functionality is missing, please check out the contributing page. There you will find instructions for sharing your feedback, building the tool locally, and submitting pull requests to the project.

A Note about Versioning

While we are practicing semantic versioning for our client libraries such as uplink, we are not practicing semantic versioning in this repo, as we do not intend for it to be used via Go modules. We may have backwards-incompatible changes between minor and patch releases in this repo.

Start using Storj

Our wiki has documentation and tutorials. Check out these three tutorials:

License

This repository is currently licensed with the AGPLv3 license.

For code released under the AGPLv3, we request that contributors sign our updated Contributor License Agreement (CLA) v2 so that we can relicense the code under Apache v2, or other licenses in the future.

Support

If you have any questions or suggestions please reach out to us on our community forum or file a ticket at https://support.storj.io/.

edge's People

Contributors

aishwaryasharmaccoew, alexeyaleonov, amwolff, boshevski, calebcase, cam-a, dlamarmorgan, egonelbre, elek, erikvv, fadila82, halkyon, ifraixedes, isaachess, jessicagreben, jewharton, jtolio, kaloyan-raev, mniewrzal, mobyvb, neo-cypher, nergdron, nikolaiyurchenko, pwilloughby, stefanbenten, vinozzz, vitaliishpital, wilfred-asomanii, wthorp, zeebo

edge's Issues

Release binaries have incorrect version

Release binaries produced for gateway-mt, linksharing, and authservice on the CI don't show the correct version when running the version command, e.g. gateway-mt version. Instead, they show "Development build". This is the output for gateway-mt built with the v1.15.1 release:

gateway-mt version

2021-10-26T10:46:40.592+1300	DEBUG	process/tracing.go:72	Anonymized tracing disabled
Development build

Custom metadata is not saved when S3 MPU is used

Custom metadata is not saved when S3 MPU is used (in this example the threshold is 8MB)

As you can see in the second case the metadata is empty, but with real AWS it's foo: bar in both cases.

> aws s3 cp --endpoint http://192.168.80.4:7777 --metadata foo=bar 1mb s3://bucket1/1mb
upload: ./1mb to s3://bucket1/1mb                                
 > aws s3 cp --endpoint http://192.168.80.4:7777 --metadata foo=bar 20mb s3://bucket1/20mb
upload: ./20mb to s3://bucket1/20mb                               
> aws s3api get-object --endpoint http://192.168.80.4:7777 --bucket bucket1 --key 1mb x 
{
    "AcceptRanges": "bytes",
    "LastModified": "Thu, 30 Sep 2021 11:36:10 GMT",
    "ContentLength": 1048576,
    "ETag": "\"f5f16eee8ad5a86a3439dc75429831ee\"",
    "ContentType": "application/octet-stream",
    "Metadata": {
        "foo": "bar"
    }
}
> aws s3api get-object --endpoint http://192.168.80.4:7777 --bucket bucket1 --key 20mb x
{
    "AcceptRanges": "bytes",
    "LastModified": "Thu, 30 Sep 2021 11:36:21 GMT",
    "ContentLength": 20971520,
    "ETag": "\"80f096bdc4c7fd9639307cd1174b398a-3\"",
    "ContentType": "application/octet-stream",
    "Metadata": {}
}

Credit:

The problem was identified by the rclone S3 unit tests during the conversation in rclone/rclone#5616, where @olefrost suggested re-running the S3 unit tests against Storj gateway-mt and @ncw drew attention to this problem among the many failures.

Try to remove globalDomainNames

Another Minio global: globalDomainNames. Removing it almost definitely involves a custom implementation of registerAPIRouter. Bill at one time had rewritten the muxer from scratch; ask him if you can't find that.

Investigate runtime errors while completing the multipart upload

While completing the multipart upload, we encountered invalid memory address or nil pointer dereference runtime error panics. A good starting point would be to try to replicate them locally. It might also be a good idea to try to find similar issues on the net (maybe in minio/minio?). There's a possible data race somewhere.

A sample stack trace from the killed Gateway-MT process:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x48deaa]

goroutine 628087 [running]:
bufio.(*Writer).Available(...)
	/usr/local/go/src/bufio/bufio.go:624
bufio.(*Writer).WriteString(0x0, {0x1b351fa, 0x1d})
	/usr/local/go/src/bufio/bufio.go:706 +0x6a
net/http.writeStatusLine(0x1b3d59b, 0x0, 0x1d, {0xc006d4d94b, 0x40001, 0x3})
	/usr/local/go/src/net/http/server.go:1491 +0x47
net/http.(*chunkWriter).writeHeader(0xc006d4d8c0, {0xc0039ab800, 0x180a000, 0x800})
	/usr/local/go/src/net/http/server.go:1461 +0x103c
net/http.(*chunkWriter).Write(0xc006d4d8c0, {0xc0039ab800, 0x28, 0x800})
	/usr/local/go/src/net/http/server.go:370 +0x3d
bufio.(*Writer).Flush(0xc004aa1d00)
	/usr/local/go/src/bufio/bufio.go:607 +0x62
net/http.(*response).Flush(0xc006d4d880)
	/usr/local/go/src/net/http/server.go:1658 +0x3a
storj.io/gateway-mt/pkg/server/middleware.(*StatusRecorder).Flush(0x40bedd)
	/go/build/pkg/server/middleware/metrics.go:35 +0x35
github.com/minio/minio/cmd/logger.(*ResponseWriter).Flush(0x40bedd)
	/go/pkg/mod/storj.io/[email protected]/cmd/logger/audit.go:119 +0x37
github.com/minio/minio/cmd/http/stats.(*OutgoingTrafficMeter).Flush(0x0)
	/go/pkg/mod/storj.io/[email protected]/cmd/http/stats/http-traffic-recorder.go:58 +0x37
github.com/minio/minio/cmd/logger.(*ResponseWriter).Flush(0x0)
	/go/pkg/mod/storj.io/[email protected]/cmd/logger/audit.go:119 +0x37
github.com/minio/minio/cmd.sendWhiteSpace.func1()
	/go/pkg/mod/storj.io/[email protected]/cmd/object-handlers.go:2468 +0x9f
created by github.com/minio/minio/cmd.sendWhiteSpace
	/go/pkg/mod/storj.io/[email protected]/cmd/object-handlers.go:2452 +0x9d

Fix release versioning

The edge services can't report their version number correctly, because the versioning build process doesn't quite work right for them. We end up having to manually set flags which say it's a release build, etc. Also, since we created that process, Go now supports release tagging as a feature, without our custom process. Find what makes sense for Edge and update things so our releases are properly tagged.

More client side caching / less downloading of previews in Satellite UI

Steps to reproduce:

All the steps must be done from the UI.
PS: Keeping the dashboard section open in a separate tab helps when collecting the information.

  • Check the current bandwidth usage before starting.

  • Upload a .MP4 file to a bucket.

  • Right after the file is uploaded (preview screen), check the bandwidth again.

  • Click on linksharing.

  • Check the bandwidth again.

Gateway-MT seems to have Cache-Control, Last-Modified, and ETag headers; linksharing does not.
Somehow it seems like we get Last-Modified headers anyway (Traefik?).

Moby says only the Object Browser appears to handle caching poorly.
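To make the header discussion concrete, here is a minimal sketch of how a Go handler can emit these validators and let the standard library answer conditional requests. It is not linksharing's code: serveCached and statusFor are hypothetical names, and the Cache-Control policy and ETag value are placeholders.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// serveCached is a hypothetical handler body. It sets an ETag and a
// Cache-Control policy, then lets http.ServeContent handle Last-Modified,
// range requests, and conditional requests such as If-None-Match.
func serveCached(w http.ResponseWriter, r *http.Request, name, etag string, modTime time.Time, data []byte) {
	w.Header().Set("ETag", `"`+etag+`"`)
	w.Header().Set("Cache-Control", "private, max-age=300") // placeholder policy
	http.ServeContent(w, r, name, modTime, bytes.NewReader(data))
}

// statusFor issues a GET with an optional If-None-Match header against
// serveCached and returns the resulting status code.
func statusFor(ifNoneMatch string) int {
	req := httptest.NewRequest(http.MethodGet, "/video.mp4", nil)
	if ifNoneMatch != "" {
		req.Header.Set("If-None-Match", ifNoneMatch)
	}
	rec := httptest.NewRecorder()
	serveCached(rec, req, "video.mp4", "abc123", time.Unix(1633000000, 0), []byte("payload"))
	return rec.Code
}

func main() {
	fmt.Println("first request:", statusFor(""))
	fmt.Println("revalidation with matching ETag:", statusFor(`"abc123"`))
}
```

With a matching validator the client can reuse its cached copy, so a repeated preview would not need to download the object body again.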

Auth service DB blueprint

We want to provide an alternative to using CockroachDB. We have another ticket for creating a TiKV provider for test purposes. What if we wrote something from scratch? Create a blueprint.

How can we test this at scale? What do we need for migrations?

Linkshare has extra round trips to satellite, slowing cross-region linksharing to satellite requests significantly

The linksharing method presentWithProject is all of the logic around requesting data from Satellites/Storage nodes. https://github.com/storj/linksharing/blob/main/sharing/present.go#L50

Here is how it works:

  • If the path requested is empty or ends in a slash, it starts a goroutine to check if index.html exists. This is a round trip to the satellite (StatObject).

  • In parallel, it checks to see if the object requested exists (another round trip to the satellite (StatObject)). If it exists, it calls showObject (discussed below). If it doesn’t exist, it determines if the path requested is a prefix, in which case it redirects to have a slash. This does another round of round trips. We can ignore the isPrefix redirection for the purposes of this ticket.

  • If index.html exists, it calls showObject.

  • If it ends with a slash, it otherwise lists objects.

So, so far, we have done most likely one, potentially two round trips to the Satellite in parallel (StatObject), and in the case of not finding anything without a trailing slash, we may have done a bunch of sequential round trips to determine if the path is a prefix.

showObject then does yet another set of round trips! It calls httpranger.ServeContent with an objectranger. httpranger.ServeContent will parse the byte range requests and so on. This is a really complicated method, to the point that JT has been trying to break it apart into a more modular codebase here: https://github.com/jtolio/servecontent. Once it has parsed byte ranges, then it will go and make another round trip to the Satellite to get order limits.

Let’s talk about a specific example for the purpose of this ticket.

Let’s say the request is to load https://link.us1.storjshare.io/raw//roms/android/my-android.rom

This path does not end in a slash, so we’re going to do one StatObject request to find out if the object exists. Let’s say it does. Then we’re going to do one DownloadObject request (inside the objectranger) to get order limits. So this request is a best case scenario and still has an unnecessary round trip. We could have just done DownloadObject.

To fix this, we will need to explode the ServeContent method so we can do the byte range processing up front. This is a really big refactor, so I propose using https://github.com/jtolio/servecontent as a starting point.

We probably need some test cases to preserve the existing semantic behavior.

Specifically:

  • If there is an object whose name ends with a slash, it is currently served and is not assumed to be a prefix.

  • If a path is requested that ends with a slash or is the empty key (special case), but no object exists with that exact name (ending with a slash), we first check whether an index.html exists. If it does, we serve that. If it doesn’t, we try to do a prefix listing if we have the listing permission.

  • If a path is requested that does not end in a slash but that object doesn’t exist, we assume we don’t have the listing permission and check whether an object with a key addition of “/index.html” exists. If +“/index.html” doesn’t exist, we try to do a prefix listing. Either way, if we confirm that the path is a prefix, we redirect with a trailing slash. If we can’t confirm the path is a prefix, we return either an access denied (no listing permission) or a 404.

  • Otherwise we serve correctly named objects.

In all object serving cases, we do our best to serve with

  • range requests

  • etags

  • if match, etc

  • whatever else ServeContent provides

Create TiKV Auth database provider

We want to provide an alternative to using CockroachDB. Research and create a TiKV provider for test purposes. How can we test this at scale? What do we need for migrations? Help feed into the blueprint.

Is this actually robust if a peer goes down?

Write CORS and CriticalError handlers from scratch

We currently import the Apache2 copyright blocks for Minio CORS and Critical Error handlers. It'd be nice to just rewrite these from scratch, and avoid nested licensing. It'd be nice to validate we even need the critical error handler, and we probably don't need as fancy of a CORS solution as we brought over from Minio.

Re-add credential caching to Gateway-MT

As part of recent refactoring of authentication in Gateway-MT, we're no longer caching credentials as was previously being done by minio. This means every S3 request is going to the authservice.

We'll want to introduce a cache again of some form, possibly re-using what Linksharing is doing for this.

Analyze segment sizes per user agent

See if we can get uploaded segment size data for Gateway-MT specifically, and if we can correlate this with user agent information somehow. If not, figure out how we can gather this information and implement it.

If we have the necessary information, create a draft blog post on how to configure these clients to use larger segment sizes. If there is development work / metrics that need to be added before we can understand this, spin this second part out into a new ticket.

Create MIME type tests

Test https://www.storj.io/blog/nft-storage-for-opensea-sdk-with-storj-dcs.

Karl Mozurkewich 28 minutes ago
Does minio have a config to change the mime Type in the response based on file type?

Bill Thorp 21 minutes ago
I believe this is just stored as metadata, and we may be defaulting to the application/octet-stream if the user doesn't set this via their S3 client. It would make sense to try to look it up by file extension, I'm not seeing this logic in my search of the Minio code yet

Karl Mozurkewich 17 minutes ago
Does uplink set content-type when uploading?

Karl Mozurkewich 16 minutes ago
Not sure if aws cli does....

Karl Mozurkewich 11 minutes ago
I'm afk, but would test if aws cli upload to us for an image preserves mime

Karl Mozurkewich 11 minutes ago
Same with uplink in the metadata

Karl Mozurkewich 10 minutes ago
Likely you guys should confirm your test coverage for content type preservation for these workflows
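The lookup by file extension mentioned in the thread could look roughly like the sketch below. contentTypeFor is a hypothetical helper, not code from Gateway-MT, Minio, or uplink; it assumes a client-declared type always wins and that application/octet-stream remains the last resort.

```go
package main

import (
	"fmt"
	"mime"
	"path"
)

// contentTypeFor is a hypothetical helper. It prefers the content type the
// client declared, then guesses from the object key's extension, and finally
// falls back to application/octet-stream.
func contentTypeFor(key, declared string) string {
	if declared != "" {
		return declared
	}
	if t := mime.TypeByExtension(path.Ext(key)); t != "" {
		return t
	}
	return "application/octet-stream"
}

func main() {
	fmt.Println(contentTypeFor("nft/image.png", ""))
	fmt.Println(contentTypeFor("nft/blob", ""))
	fmt.Println(contentTypeFor("nft/image.png", "text/plain"))
}
```

The same three cases would make reasonable table entries for the content type preservation tests requested above.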

CopyObject does not support x-amz-metadata-directive header to replace metadata

When copying an object, Gateway-ST and MT currently copy over the metadata, but do not respect the x-amz-metadata-directive header being set to "REPLACE", which you can use to indicate that the metadata should be set to what was provided in the CopyObject request, instead of copying it from the original object.

The default is to copy the metadata if no header is specified. See https://docs.aws.amazon.com/AmazonS3/latest/API/API_CopyObject.html

I put together a WIP: https://review.dev.storj.io/c/storj/gateway/+/5927

Gateway-MT logs too much

Artur reported "a 10-day old container produced 20GiB worth of logs" and "it's mostly a full request/response log". We should figure out and implement some strategies to log less.

Metric if auth service use is global

We want to understand if the auth service feature of being a distributed database is something that is actually of value. Can we create some sort of metric to understand if this is the case? Monkit? Or do we need to tweak the database and run queries against it? Timebox this to < 4 hours.

Implement S3 authorization outside of Minio

There are a few reasons why the existing Minio authorization logic is non-ideal. Two stand out. The first is that it caches credentials, which is against the security posture we’re marketing toward. The second is that it requires lots of internal changes to Minio to pass the AccessGrant into the Storj Object API implementation.

If we have a lightweight auth handler of our own (e.g. signature.go), it could extract credentials but not validate them. It would look up credentials in the Auth Service, and persist this information to the context, so that the Access Grant is available from all of our code.

We continue to use Minio for validation, except we use an IAMObjectStore solution as in https://review.dev.storj.io/c/storj/gateway-mt/+/5827. This solution would no longer talk to the Auth Service; instead it would simply check that the information in the context matches what was expected. It is stateless and relies only on context, so it should be safe to use globally.

Finally, gateway.go, our ObjectLayer, would no longer pull its AccessGrant from Minio’s logger. That information would have been set in the context by our lightweight auth handler before Minio even validated the credentials. Also note that the current Minio solution requires a fork to store information into the context for logger to access, so this would remove the last remaining Minio hack.

Why did the compilation fail and where is the storj-release command

[root@localhost gateway-mt]# make binaries
for C in gateway-mt authservice linksharing; do
make binary COMPONENT=$C || exit $?
; done
make[1]: Entering directory '/opt/gateway-mt'

freebsd/amd64 target is currently skipped until https://storjlabs.atlassian.net/browse/GMT-302

storj-release --components="cmd/gateway-mt" --go-version="1.15.14" --branch="" --skip-osarches="freebsd/amd64"
/bin/sh: storj-release: command not found
make[1]: *** [Makefile:112: binary] Error 127
make[1]: Leaving directory '/opt/gateway-mt'
make: *** [Makefile:117: binaries] Error 2

Provide an option to perform in-memory encoding on upload to both Gateway-MT and Gateway-ST

We added the hacktoberfest label to issues we considered a good fit for 2021's Hacktoberfest. If you want to work on this issue, let us know in the discussion below, so we can remove the up for grabs label and avoid having people working on the same issue independently, making sure no one's work goes in vain. Please also consider discussing how you would like to approach the problem, if appropriate. Thank you for contributing!


This is for tracking Gateway-MT-part of storj/gateway-st#36.

Fix go test issues with --count > 1

Running go test ./... --count 5 causes errors in TestLogError and TestMetrics.

Global singletons or something cause these tests to fail, see if we can fix this.

Add monkit metrics for auth type

Create a monkit metric that shows what style auth clients are using:
V4 querystring
V4 headers
V4 multipart form
V2 querystring
V2 headers
V2 multipart form

Instrument all non-trivial functions with monkit's Task

We added the hacktoberfest label to issues we considered a good fit for 2021's Hacktoberfest. If you want to work on this issue, let us know in the discussion below, so we can remove the up for grabs label and avoid having people working on the same issue independently, making sure no one's work goes in vain. Please also consider discussing how you would like to approach the problem, if appropriate. Thank you for contributing!


We use monkit to monitor most of our code. Every function/method that is not trivial or performance-critical, especially from the object layer, should be instrumented with monkit's Task, e.g.

var mon = monkit.Package()

...

func (s service) RemoveExpiredUsers(ctx context.Context) (err error) {
	defer mon.Task()(&ctx)(&err)

	...

}

Read more about monkit here.

Segment already exists issue

Track Jeff's change set, see if there's anything we need to do. Wait for satellite release. Close the loop with customer.

Blueprint of auto-deploy to staging

JT and Kaloyan think that because the "edge" services like Gateway, Auth, and Link sharing are stateless, they'd be a good place to start working on a continuous release process. Write up a plan to have these automatically deploy to staging on a successful build.

Make sure Krista is involved.

Implement auto-deploy to staging

Work with Stefan to get this going.

Once Stefan is done, there will probably be a lot of Jenkins work. This would tell the Docker stack to redeploy using the job. Potentially use watchtower instead?

Secondary part could be post deployment tests.

Gateway-MT: Re-evaluate logging

We currently use a global sync.Once with the Minio logger. The need to copy things between contexts and the use of globals are unfortunate. Could we simplify the logging solution if we use our own middleware? Maybe we would have to implement a custom RegisterAPIRouter equivalent; Bill had one at one point that included all the routes and was hand written, not derived from Minio.

Create GitHub templates

Build things to make community feedback more valuable. Clement may already have some, talk to him first.

Rename object endpoint coordinate with Storj DCS UI

Metainfo has a rename/move object feature that might be useful to the UI team. Work with Moby to create an endpoint which is useful to them so that this feature can be exposed via the UI. Some things to keep in mind: S3 does not support renames due to immutability promises. Fadilla is working on a Copy metainfo feature, which is how S3 supports move / rename. We need to decide if this feature lives in Linksharing or Gateway-MT, which will largely be motivated by the needs of the UI team, I imagine.

Sync testsuite scripts between Gateway-MT and Gateway-ST

We added the hacktoberfest label to issues we considered a good fit for 2021's Hacktoberfest. If you want to work on this issue, let us know in the discussion below, so we can remove the up for grabs label and avoid having people working on the same issue independently, making sure no one's work goes in vain. Please also consider discussing how you would like to approach the problem, if appropriate. Thank you for contributing!


Scripts in the testsuite folders should be synced (Gateway-MT, this repository, is the ground-truth revision) and linted so that they pass shellcheck.

Gateway-MT does not shutdown on signal

If you start Gateway-MT, then attempt to shut it down with SIGINT, it does not exit and hangs on "Got a signal..." message.

gateway-mt run --auth-token "super-secret" --auth-url http://localhost:8000 --domain-name localhost --server.address 0.0.0.0:7777
2021-10-27T16:30:37.901+1300	DEBUG	process/tracing.go:72	Anonymized tracing disabled
2021-10-27T16:30:37.902+1300	DEBUG	process/debug.go:37	debug server listening on 127.0.0.1:4000
2021-10-27T16:30:37.902+1300	DEBUG	process/metrics.go:79	Telemetry disabled
2021-10-27T16:30:37.903+1300	INFO	gateway-mt/main.go:105	Starting Storj DCS S3 Gateway
2021-10-27T16:30:37.903+1300	INFO	gateway-mt/main.go:106	Endpoint: 0.0.0.0:7777
2021-10-27T16:30:40.407+1300	INFO	process/exec_conf.go:116	Got a signal from the OS: "interrupt"

E2E encryption blueprint

Track down all the previous end-to-end encryption work and create an updated blueprint for this work. Topics may include 1) do we need a new service? 2) overall architecture etc

Object / File Caching Blueprint

We know there are high usage files which may benefit from caching of objects in Gateway-MT and Linksharing. They have a few files which are requested in burst-y traffic patterns. However, caching files has complications. How will we capture billing information? How will we know if / when to invalidate a cache and/or detect changes on the satellite? Will all files be eligible, or will this be a setting? Do we have appropriate hardware for these workflows? Will SNOs be compensated in any way? How will range queries be billed [per byte? per segment]? Can we leverage an off-the-shelf caching layer, such as Squid? Will our current infrastructure support adequate cache size?

"key length is too big" unmapped error

/storj/storj/satellite/metainfo/metainfo.go contains a check that returns a frequent and unmapped error:

	if objectKeyLength > endpoint.config.MaxEncryptedObjectKeyLength {
		return nil, rpcstatus.Error(rpcstatus.InvalidArgument, fmt.Sprintf("key length is too big, got %v, maximum allowed is %v", objectKeyLength, endpoint.config.MaxEncryptedObjectKeyLength))
	}

It'd be okay to map this, but it seems better to catch it ahead of time.

Try to remove GlobalNotificationSys

These are more Minio globals, I'm not sure which we'd need to try to figure out / remove. GlobalNotificationSys = cmd.NewNotificationSys(GlobalEndpoints)

Gateway-MT : Replace Minio fork with most recent Apache2 version

We want to roll back the "great purge" (see storj/minio) because it has led to numerous bugs. We want Minio to be as close as possible to minio/minio, just because it's difficult to maintain multiple repos. We may want to join Gateway-MT and Gateway-ST, so we probably don't want to use a version of Minio that is AGPL3.

We may want to hold off on this work until the Auth work is complete. It’s quite possible that we don’t have to rebase our Minio work so much as just abandon the use of that repo, as we may have removed all the necessary hacks.

Gateway-MT testable HTTP service

I also did a little work on https://review.dev.storj.io/c/storj/gateway-mt/+/4221, which runs bash tests from Go. That process highlights that we have Go tests which run against the Miniogw interface (no live HTTP) or compiled executable, but currently no HTTP-enabled testable code.

The real blocker is Minio’s use of global variables. We will likely need to get authorization working outside of Minio to make progress on this ticket, as globalIAMSys is the main culprit for requiring custom configuration during testing.

This ticket should consider moving any existing tests that use the compiled version of gateway-mt services.

Look into Minio object API tests

Are there Minio tests that we can use to test our Storj Object API implementation?

Things like:
/storj/minio/cmd/object_api_suite_test.go
