
ucs's Introduction

Unity Cache Server

.. in Go

ARCHIVED

This was mostly an exercise in writing some work-related Go. As I no longer work with Unity, I don't really have a need for this, nor can I test it in any meaningful capacity.

If you like the project, feel free to fork it - or petition Unity to take it over.

Installation from source

go get -u github.com/msiebuhr/ucs/cmd/ucs
ucs

This will listen for cache requests on TCP port 8126 and start a small web server on http://localhost:9126 with setup instructions and Prometheus metrics.

Full usage options are shown with ucs -h. Note that options can be passed as environment variables, making the following examples equivalent:

ucs -quota 10GB
ucs --quota 10GB
QUOTA=10GB ucs

As it is generally recommended to use a cache per major Unity release and project, the server supports namespaces. This is done by passing multiple -port arguments or a comma-separated list.

ucs -port=8126 -port=name:8127
ucs -port=8126,name:8127
PORT=8126,name:8127 ucs

Each name/port will have a separate cache, but they are garbage-collected as one (so old projects' data will all but vanish and new ones will get lots of space).

For convenience, ports can be named, as in name:8127. The name is used for the file-system path and is displayed on the help page and in metrics. If the name is left out, the port number also becomes the name.

Load testing

There's also a quick-and-dirty loadtest utility, ucs-bender:

go get -u github.com/msiebuhr/ucs/cmd/ucs-bender
ucs-bender # Will run against localhost

ucs's Issues

Too many open files

While doing stress-testing with go run ./cmd/ucs-bender -workers=5 -requests=100000, I began getting logs indicating it's running up against ulimits:

 server: namespace=default addr=127.0.0.1:56946 Error reading from cache: open /Users/msiebuhr/Source/go/src/github.com/msiebuhr/ucs/cache5.0/default/ce/cefdfc072182654f163f5f0f9a621d72-9566c74d10037c4d7bbb0407d1e2c649.bin: too many open files
server: namespace=default addr=127.0.0.1:56955 Error reading from cache: open /Users/msiebuhr/Source/go/src/github.com/msiebuhr/ucs/cache5.0/default/ce/cefdfc072182654f163f5f0f9a621d72-9566c74d10037c4d7bbb0407d1e2c649.info: too many open files
server: namespace=default addr=127.0.0.1:56952 Error reading from cache: open /Users/msiebuhr/Source/go/src/github.com/msiebuhr/ucs/cache5.0/default/ce/cefdfc072182654f163f5f0f9a621d72-9566c74d10037c4d7bbb0407d1e2c649.info: too many open files
server: namespace=default addr=127.0.0.1:56949 Error reading from cache: open /Users/msiebuhr/Source/go/src/github.com/msiebuhr/ucs/cache5.0/default/ce/cefdfc072182654f163f5f0f9a621d72-9566c74d10037c4d7bbb0407d1e2c649.info: too many open files

And finally:

server: namespace=default Error accepting:  accept tcp [::]:8126: accept: too many open files
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x13a9264]

goroutine 10 [running]:
github.com/msiebuhr/ucs.(*Server).Listener(0xc0000683c0, 0x151f800, 0xc00019c060, 0xc0001cc000, 0xf, 0xc0001a8f28)
	/Users/msiebuhr/Source/go/src/github.com/msiebuhr/ucs/server.go:125 +0x1e4
github.com/msiebuhr/ucs.(*Server).Listen(0xc0000683c0, 0x151f780, 0xc000098008, 0x14aa351, 0x5, 0x0, 0x0)
	/Users/msiebuhr/Source/go/src/github.com/msiebuhr/ucs/server.go:100 +0x1a3
created by main.main
	/Users/msiebuhr/Source/go/src/github.com/msiebuhr/ucs/cmd/ucs/main.go:115 +0x31a
exit status 2
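
A minimal sketch of one possible mitigation - bounding in-flight connections with a buffered-channel semaphore in the accept-loop. The limit of 256 and the handler placeholder are illustrative, not from the repo:

package main

import "net"

func main() {
	ln, err := net.Listen("tcp", ":8126")
	if err != nil {
		panic(err)
	}

	// Buffered channel as a counting semaphore; 256 is an arbitrary
	// limit comfortably below typical ulimits.
	sem := make(chan struct{}, 256)

	for {
		conn, err := ln.Accept()
		if err != nil {
			// Don't panic on EMFILE; skip this accept and retry.
			continue
		}
		sem <- struct{}{} // blocks while 256 connections are in flight
		go func(c net.Conn) {
			defer func() { <-sem }()
			defer c.Close()
			// handle(c) would go here - placeholder for the real
			// protocol handler.
		}(conn)
	}
}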

Caching large projects hangs forever

The server successfully accepts uploads of data from large projects (~1 GB tested), but re-building with the cache server fails with a server-side timeout.

(The good news: Until the timeout, things happen at ~3 Gbit on a 10Gbit LAN...)

A packet capture (sudo tcpdump -i enp1s0 -s 65535 -w failed-build-$(date -Iseconds).pcap tcp port 10005 &) reveals some TCP-level issues when inspected with Wireshark:

[Screenshot: Wireshark view of the capture, 2018-12-03 09:37:39]

GC: Remove files in pairs

Right now, the GC rather blindly deletes whatever file it finds to be the oldest. But files are uploaded pair-wise (hence the transactions).

We really should delete the files pairwise as well!

Initial idea: in the GC-scan, we only track the (ns, uuid, hash, age) of the oldest file. At delete-time, we then generate all suffixes/names and delete them. That will create some issues w.r.t. knowing when to stop (and how much we're actually deleting). Guess we could re-scan the folder in question, re-sort the list of oldest entries and then delete the next top-result.
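
A rough sketch of the delete side; the suffix set is an assumption (.bin and .info appear in the logs above, .resource is a guess at the remaining pair member):

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// Assumed suffix set; .bin and .info appear in the issue logs,
// .resource is a guess.
var suffixes = []string{".bin", ".info", ".resource"}

// deletePair removes every file belonging to one (ns, uuid, hash)
// cache entry, returning the total number of bytes freed.
func deletePair(basedir, ns, uuid, hash string) (int64, error) {
	var freed int64
	for _, suffix := range suffixes {
		name := filepath.Join(basedir, ns, uuid[:2], uuid+"-"+hash+suffix)
		stat, err := os.Stat(name)
		if os.IsNotExist(err) {
			continue // not all entries have all suffixes
		} else if err != nil {
			return freed, err
		}
		if err := os.Remove(name); err != nil {
			return freed, err
		}
		freed += stat.Size()
	}
	return freed, nil
}

func main() {
	n, err := deletePair("./cache5.0", "default",
		"cefdfc072182654f163f5f0f9a621d72", "9566c74d10037c4d7bbb0407d1e2c649")
	fmt.Println(n, err)
}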

Replace PutObject with composed struct

We have a frequent use for getting some file contents and knowing its size up-front; in the client when we want to upload things and in the server when we return things.

A rough sketch:

package main

import (
	"fmt"
	"io"
	"os"
)

// SizeReader is an io.Reader that also knows its total size up-front.
type SizeReader interface {
	io.Reader
	Size() int64
}

// SizeFile wraps *os.File to satisfy SizeReader.
type SizeFile struct {
	*os.File
}

// Size returns the file size in bytes, or 0 if Stat fails.
func (sf *SizeFile) Size() int64 {
	stat, _ := sf.Stat()
	if stat == nil {
		return 0
	}
	return stat.Size()
}

// Add things that make sized random readers, network, ...

func main() {
	x, err := os.Open("./test.go")
	if err != nil {
		panic(err)
	}
	sf := &SizeFile{x}
	fmt.Println(x, err, sf.Size())
}

Would also be useful for the streaming put/get interfaces...

ucs-cli

A quick-and-dirty CLI tool for poking at cache servers. Ideas for now:

  • ucscli ping <ip>:<port> - connects, does a handshake and exits with return code 0 on success and 1 on failure (see the sketch after this list)
  • ucscli fs-cache-filenames <path> or ucscli fs-cache-filenames <ns> <uuid/hash> - given an "example" name, it will generate all pairs of .info, .asset and .resource for those files. Useful for deleting files in relevant pairs from shell-scripts (e.g. find ./cache5.0 -type f -atime +30 | ucscli fs-cache-filenames | parallel rm).
  • As above - do a ucscli fsck to clean up invalid pairs of cache-entries + do other checks (the upstream server has a few hidden somewhere).
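
A minimal sketch of the ping subcommand; the wire format (client sends its protocol version as hex digits, server echoes back the version it will speak) is an assumption about the handshake, not taken from this repo:

package main

import (
	"fmt"
	"io"
	"net"
	"os"
	"time"
)

func main() {
	addr := "localhost:8126"
	if len(os.Args) > 1 {
		addr = os.Args[1]
	}

	conn, err := net.DialTimeout("tcp", addr, 5*time.Second)
	if err != nil {
		fmt.Fprintln(os.Stderr, "ping:", err)
		os.Exit(1)
	}
	defer conn.Close()
	conn.SetDeadline(time.Now().Add(5 * time.Second))

	// Assumed handshake: send our version as hex digits, read back
	// the eight hex digits the server answers with.
	if _, err := conn.Write([]byte("fe")); err != nil {
		fmt.Fprintln(os.Stderr, "ping:", err)
		os.Exit(1)
	}
	version := make([]byte, 8)
	if _, err := io.ReadFull(conn, version); err != nil {
		fmt.Fprintln(os.Stderr, "ping:", err)
		os.Exit(1)
	}
	fmt.Printf("ok: server speaks version %s\n", version)
}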

Add namespacing

It is generally sane to use a separate cache-server per project (and Unity version - [citation needed]).

Instead of the currently required one-process-per-cache setup, we should add namespacing, so the same cache-server can serve many different ports from the same cache (and keep them from poisoning each other).

  • PRO: We'd have simpler server-administration.
  • PRO: Shared GC. Old stuff from inactive builds would be discarded sooner.
  • CON: Single point of failure.
  • CON: Old projects will be wiped, making re-builds potentially very slow.

Practically speaking, I think it'd be done by running it as

ucs -ns project-one:8125 -ns project-two:8127 ...

or

ucs -listen project-one:8125,project-two:8127,...

Internally, the server would either communicate namespaces with parameters or through known context variables - and then start up multiple ucs.Server instances with the same back-end attached.

For the file-system back-end, I think the folder structure should be something along the lines of

$BASEDIR/<namespace>/UU/UUID-HASH.type
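
A minimal sketch of that layout as a path helper (the example UUID/hash values are reused from the logs above):

package main

import (
	"fmt"
	"path/filepath"
)

// cachePath maps a cache entry onto the proposed on-disk layout:
// $BASEDIR/<namespace>/UU/UUID-HASH.type
func cachePath(basedir, namespace, uuid, hash, kind string) string {
	return filepath.Join(basedir, namespace, uuid[:2], uuid+"-"+hash+"."+kind)
}

func main() {
	fmt.Println(cachePath("/var/cache/ucs", "project-one",
		"cefdfc072182654f163f5f0f9a621d72", "9566c74d10037c4d7bbb0407d1e2c649", "info"))
}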

HTTP Backend

I'm suspecting that many/most HTTP servers will have better performance than what I can squeeze out of this, so we might as well do a back-end that is an HTTP proxy.

Allowing it to be used as a proxy for Amazon S3 or Google Cloud Store would be a bonus.
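
A rough sketch of how a Get/Put pair could map onto plain HTTP verbs; the method set here is a guess at the shape of a back-end, not the repo's actual cache.Cacher interface:

package cache

import (
	"fmt"
	"io"
	"net/http"
)

// HTTPCache proxies cache entries to an HTTP server (nginx, S3, GCS,
// ...) keyed by URL path.
type HTTPCache struct {
	BaseURL string // e.g. "http://cache.example.com/ucs" (placeholder)
}

// Put uploads one entry with an HTTP PUT.
func (h *HTTPCache) Put(key string, data io.Reader) error {
	req, err := http.NewRequest(http.MethodPut, h.BaseURL+"/"+key, data)
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("put %s: %s", key, resp.Status)
	}
	return nil
}

// Get fetches one entry with an HTTP GET; the caller closes the body.
func (h *HTTPCache) Get(key string) (io.ReadCloser, error) {
	resp, err := http.Get(h.BaseURL + "/" + key)
	if err != nil {
		return nil, err
	}
	if resp.StatusCode != http.StatusOK {
		resp.Body.Close()
		return nil, fmt.Errorf("get %s: %s", key, resp.Status)
	}
	return resp.Body, nil
}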

Be more intelligent about when to GC

Spun off from #12

Thinking it over, it should have some way of ensuring we'll do a GC on/after an upload. The current logic is to check if (quota < size + newly-uploaded-item) { gc() }, which means we'll do a GC on most uploads when we're hovering around the quota.

  • Run GC periodically (either timer or every X requests/connections)
  • Over-GC (say, 10% or just remove the oldest entry from every bucket)
  • Have a lock/queue, so we don't over-schedule GCs (we can upload small items a lot faster than we can GC them) - see the sketch after this list
  • More fine-grained locking so GC doesn't block new uploads (goes with locked GC'ing, so we don't have something looking at files while another is deleting them...)
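
A minimal sketch of the lock/queue idea - callers can request a GC at any rate, but at most one run is queued and one is running at a time (names here are illustrative):

package main

import (
	"log"
	"time"
)

// gcTrigger coalesces GC requests: many callers may ask for a GC, but
// at most one is pending and one runs at a time.
type gcTrigger struct {
	kick chan struct{}
}

func newGCTrigger(gc func()) *gcTrigger {
	t := &gcTrigger{kick: make(chan struct{}, 1)}
	go func() {
		for range t.kick {
			gc()
		}
	}()
	return t
}

// Request schedules a GC unless one is already pending; never blocks.
func (t *gcTrigger) Request() {
	select {
	case t.kick <- struct{}{}:
	default: // a GC is already queued; don't pile up more
	}
}

func main() {
	t := newGCTrigger(func() {
		log.Println("running GC")
		time.Sleep(100 * time.Millisecond) // stand-in for real work
	})
	for i := 0; i < 10; i++ {
		t.Request() // uploads can request GC far faster than it runs
	}
	time.Sleep(time.Second)
}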

Cache size quotas

Otherwise we'll quickly run out of space.

For the file-system back-end, I envision that we track the total amount used and then the oldest file (via atime) in any of the ~256 folders we have. That way we can do a quick scan to find out what to delete, without having to scan the entire thing, nor track every single file.
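
A sketch of the per-folder scan; ModTime is used here as a portable stand-in for atime, which Go's os.FileInfo doesn't expose directly:

package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// folderStat summarizes one of the ~256 prefix folders: total bytes
// used plus the path and age of its oldest file.
type folderStat struct {
	size   int64
	oldest string
	mtime  time.Time
}

func scanFolder(dir string) (folderStat, error) {
	fs := folderStat{mtime: time.Now()}
	err := filepath.Walk(dir, func(path string, info os.FileInfo, err error) error {
		if err != nil || info.IsDir() {
			return err
		}
		fs.size += info.Size()
		if info.ModTime().Before(fs.mtime) {
			fs.oldest, fs.mtime = path, info.ModTime()
		}
		return nil
	})
	return fs, err
}

func main() {
	fs, err := scanFolder("./cache5.0/default/ce")
	fmt.Printf("%+v %v\n", fs, err)
}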

Admin interface

Using the "stock" server in production has shown that most administrative work involves removing a specific UUID/HASH combination (known to be bad) or, worst-case, removing the entire backing cache.

Deleting individual items implies that the cache.Cacher should have at least Search(key) chan(uuidAndHash) and Delete(uuidAndHash).
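
Sketched as an extension of the interface: Search and Delete are the signatures named above, Wipe covers the worst-case full removal, and the layout of uuidAndHash is assumed:

package cache

// uuidAndHash identifies one cache entry (field layout assumed).
type uuidAndHash struct {
	UUID [16]byte
	Hash [16]byte
}

// AdminCacher sketches the extra methods an admin interface would
// need on top of the existing cache.Cacher.
type AdminCacher interface {
	// Search streams all entries matching key (e.g. a UUID prefix).
	Search(key string) <-chan uuidAndHash

	// Delete removes a single entry, in all its file pairs.
	Delete(uuidAndHash) error

	// Wipe removes the entire backing cache.
	Wipe() error
}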

When a fresh project is simultaneously built multiple times, metrics get off

Both builds will detect that everything is missing and upload the required items (which are only stored once).

As metrics increase on uploads, we end up counting the doubly-uploaded items twice, even though they're only stored once in the cache.

Instead of doing lots of checking to see if anything changes, a simple solution would be to do a GC-run after each connection that uploads anything.

Benchmark utility

I've been using crude shell-scripts (not terribly consistent) and Go's internal BenchmarkXX (which ATM says FS-cache is ~100x faster than in-memory!?).

I've been pondering writing something that uploads a single file and hammers it with downloads.

Seems Pinterest is already there: https://github.com/pinterest/bender
