gcp-storage-emulator's Introduction

Local Emulator for Google Cloud Storage

Google doesn't (yet) ship an emulator for the Cloud Storage API like they do for Cloud Datastore.

This is a stub emulator so you can run your tests and do local development without having to connect to the production Storage APIs.

THIS IS A WORK IN PROGRESS AND ONLY SUPPORTS A LIMITED SUBSET OF THE API


Installation

pip install gcp-storage-emulator

CLI Usage

Starting the emulator

Start the emulator with:

gcp-storage-emulator start

By default, the server will listen on http://localhost:9023 and data is stored under ./.cloudstorage. You can configure the folder using the env variables STORAGE_BASE (default ./) and STORAGE_DIR (default .cloudstorage).
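
For example, a hypothetical setup that would store data under /tmp/gcs-data:

export STORAGE_BASE=/tmp
export STORAGE_DIR=gcs-data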

If you wish to run the emulator in a testing environment, or if you don't want to persist any data, you can use the --in-memory parameter. For tests, you might want to consider starting the server from your code (see the Python APIs below).

If you're using the Google client library (e.g. google-cloud-storage for Python) then you can set the STORAGE_EMULATOR_HOST environment variable to tell the library to connect to your emulator endpoint rather than the standard https://storage.googleapis.com, e.g.:

export STORAGE_EMULATOR_HOST=http://localhost:9023

Wiping data

You can wipe the data by running:

gcp-storage-emulator wipe

You can pass --keep-buckets to wipe the data while keeping the buckets.
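
For example:

gcp-storage-emulator wipe --keep-buckets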

Example

Use in-memory storage and automatically create the default storage bucket my-bucket:

gcp-storage-emulator start --host=localhost --port=9023 --in-memory --default-bucket=my-bucket

Python APIs

To start a server from your code, you can do:

from gcp_storage_emulator.server import create_server

server = create_server("localhost", 9023, in_memory=False)

server.start()
# ........
server.stop()

You can wipe the data by calling server.wipe().

This can also be achieved (e.g. during tests) by hitting the /wipe HTTP endpoint.
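
For example, a minimal sketch with requests, assuming the emulator is listening on localhost:9023 (a plain GET is assumed here; check the server routes for the exact HTTP method):

import requests

# Assumption: the /wipe endpoint responds to a simple GET.
requests.get("http://localhost:9023/wipe")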

Example

import os

from google.cloud import storage
from gcp_storage_emulator.server import create_server

HOST = "localhost"
PORT = 9023
BUCKET = "test-bucket"

# default_bucket parameter creates the bucket automatically
server = create_server(HOST, PORT, in_memory=True, default_bucket=BUCKET)
server.start()

os.environ["STORAGE_EMULATOR_HOST"] = f"http://{HOST}:{PORT}"
client = storage.Client()

bucket = client.bucket(BUCKET)
blob = bucket.blob("blob1")
blob.upload_from_string("test1")
blob = bucket.blob("blob2")
blob.upload_from_string("test2")
for blob in bucket.list_blobs():
    content = blob.download_as_bytes()
    print(f"Blob [{blob.name}]: {content}")

server.stop()

Docker

Pull the Docker image.

docker pull oittaa/gcp-storage-emulator

Inside the container instance, the value of the PORT environment variable always reflects the port to which requests are sent. It defaults to 8080. The directory used for the emulated storage is located under /storage in the container. In the following example the host's directory $(pwd)/cloudstorage will be bound to the emulated storage.

docker run -d \
  -e PORT=9023 \
  -p 9023:9023 \
  --name gcp-storage-emulator \
  -v "$(pwd)/cloudstorage":/storage \
  oittaa/gcp-storage-emulator
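
You can then connect to the emulator from the host, for example with the Python client:
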
import os

from google.cloud import exceptions, storage

HOST = "localhost"
PORT = 9023
BUCKET = "test-bucket"

os.environ["STORAGE_EMULATOR_HOST"] = f"http://{HOST}:{PORT}"
client = storage.Client()

try:
    bucket = client.create_bucket(BUCKET)
except exceptions.Conflict:
    bucket = client.bucket(BUCKET)

blob = bucket.blob("blob1")
blob.upload_from_string("test1")
print(blob.download_as_bytes())

gcp-storage-emulator's Issues

Resumable Upload problem

It seems that the upload functionality returns an error due to a wrong host name returned by the server to the client during resumable uploads.

The same client Python program works as expected when run from the host system and uses the exposed port of the GCP Emulator server.

Here is the minimal case that showcases the problem:

compose.yaml

services:
    google_storage:
        image: oittaa/gcp-storage-emulator
        ports:
            # Exposed in port 9023 of localhost
            - "127.0.0.1:9023:9023/tcp"
        entrypoint: gcp-storage-emulator
        command: ["start",
            "--host=0.0.0.0", "--port=9023", "--in-memory",
            "--default-bucket=localtesting_bucket" ]

    upload:
        image: python:3.7-buster
        environment:
            STORAGE_EMULATOR_HOST: "http://google_storage:9023"
        entrypoint: /entrypoint.sh
        volumes:
            - ./entrypoint.sh:/entrypoint.sh:ro
            - ./upload.py:/upload.py:ro

entrypoint.sh

#!/usr/bin/env bash

# Install Python requirements
pip install google-cloud-storage==1.31.2

echo "STORAGE_EMULATOR_HOST: ${STORAGE_EMULATOR_HOST}"

# Test upload data, 10MiB file to force resumable upload
dd if=/dev/zero of=/test.data bs=1024 count=10240

exec python3 /upload.py

upload.py

import sys
from google.auth.credentials import AnonymousCredentials
from google.cloud import storage

# Upload
client = storage.Client(credentials=AnonymousCredentials(),
                                project='localtesting')
bucket_obj = client.bucket('localtesting_bucket')
blob_obj = bucket_obj.blob('test/test.data')
blob_obj.upload_from_filename('/test.data')

# Print bucket contents
for bucket in client.list_buckets():
    print(f'Bucket: {bucket}')
    for blob in bucket.list_blobs():
        print(f'|_Blob: {blob}')

sys.exit(0)

Put these three files in a directory and run docker-compose up. The initial connection to the GCP emulator service succeeds:

google_storage_1  | "POST /upload/storage/v1/b/localtesting_bucket/o?uploadType=resumable HTTP/1.1" 200 -

but then an error occurs during connection, with host 0.0.0.0 and port 9023:

upload_1          | Traceback (most recent call last):
upload_1          |   File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 170, in _new_conn
upload_1          |     (self._dns_host, self.port), self.timeout, **extra_kw
upload_1          |   File "/usr/local/lib/python3.7/site-packages/urllib3/util/connection.py", line 96, in create_connection
upload_1          |     raise err
upload_1          |   File "/usr/local/lib/python3.7/site-packages/urllib3/util/connection.py", line 86, in create_connection
upload_1          |     sock.connect(sa)
upload_1          | ConnectionRefusedError: [Errno 111] Connection refused
upload_1          | 
upload_1          | During handling of the above exception, another exception occurred:
upload_1          | 
upload_1          | Traceback (most recent call last):
upload_1          |   File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 706, in urlopen
upload_1          |     chunked=chunked,
upload_1          |   File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 394, in _make_request
upload_1          |     conn.request(method, url, **httplib_request_kw)
upload_1          |   File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 234, in request
upload_1          |     super(HTTPConnection, self).request(method, url, body=body, headers=headers)
upload_1          |   File "/usr/local/lib/python3.7/http/client.py", line 1281, in request
upload_1          |     self._send_request(method, url, body, headers, encode_chunked)
upload_1          |   File "/usr/local/lib/python3.7/http/client.py", line 1327, in _send_request
upload_1          |     self.endheaders(body, encode_chunked=encode_chunked)
upload_1          |   File "/usr/local/lib/python3.7/http/client.py", line 1276, in endheaders
upload_1          |     self._send_output(message_body, encode_chunked=encode_chunked)
upload_1          |   File "/usr/local/lib/python3.7/http/client.py", line 1036, in _send_output
upload_1          |     self.send(msg)
upload_1          |   File "/usr/local/lib/python3.7/http/client.py", line 976, in send
upload_1          |     self.connect()
upload_1          |   File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 200, in connect
upload_1          |     conn = self._new_conn()
upload_1          |   File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 182, in _new_conn
upload_1          |     self, "Failed to establish a new connection: %s" % e
upload_1          | urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fe241769150>: Failed to establish a new connection: [Errno 111] Connection refused
upload_1          | 
upload_1          | During handling of the above exception, another exception occurred:
upload_1          | 
upload_1          | Traceback (most recent call last):
upload_1          |   File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
upload_1          |     timeout=timeout
upload_1          |   File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 756, in urlopen
upload_1          |     method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
upload_1          |   File "/usr/local/lib/python3.7/site-packages/urllib3/util/retry.py", line 574, in increment
upload_1          |     raise MaxRetryError(_pool, url, error or ResponseError(cause))
upload_1          | urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='0.0.0.0', port=9023): Max retries exceeded with url: /upload/storage/v1/b/localtesting_bucket/o?uploadType=resumable&upload_id=localtesting_bucket%3Atest%2Ftest.data%3A2021-09-08+10%3A13%3A34.413826 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fe241769150>: Failed to establish a new connection: [Errno 111] Connection refused'))
upload_1          | 
upload_1          | During handling of the above exception, another exception occurred:
upload_1          | 
upload_1          | Traceback (most recent call last):
upload_1          |   File "/upload.py", line 10, in <module>
upload_1          |     blob_obj.upload_from_filename('/test.data')
upload_1          |   File "/usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 2348, in upload_from_filename
upload_1          |     checksum=checksum,
upload_1          |   File "/usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 2235, in upload_from_file
upload_1          |     checksum=checksum,
upload_1          |   File "/usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 2082, in _do_upload
upload_1          |     checksum=checksum,
upload_1          |   File "/usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 1949, in _do_resumable_upload
upload_1          |     response = upload.transmit_next_chunk(transport, timeout=timeout)
upload_1          |   File "/usr/local/lib/python3.7/site-packages/google/resumable_media/requests/upload.py", line 503, in transmit_next_chunk
upload_1          |     timeout=timeout,
upload_1          |   File "/usr/local/lib/python3.7/site-packages/google/resumable_media/requests/_request_helpers.py", line 136, in http_request
upload_1          |     return _helpers.wait_and_retry(func, RequestsMixin._get_status_code, retry_strategy)
upload_1          |   File "/usr/local/lib/python3.7/site-packages/google/resumable_media/_helpers.py", line 188, in wait_and_retry
upload_1          |     raise error
upload_1          |   File "/usr/local/lib/python3.7/site-packages/google/resumable_media/_helpers.py", line 177, in wait_and_retry
upload_1          |     response = func()
upload_1          |   File "/usr/local/lib/python3.7/site-packages/google/auth/transport/requests.py", line 486, in request
upload_1          |     **kwargs
upload_1          |   File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 542, in request
upload_1          |     resp = self.send(prep, **send_kwargs)
upload_1          |   File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 655, in send
upload_1          |     r = adapter.send(request, **kwargs)
upload_1          |   File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 516, in send
upload_1          |     raise ConnectionError(e, request=request)
upload_1          | requests.exceptions.ConnectionError: HTTPConnectionPool(host='0.0.0.0', port=9023): Max retries exceeded with url: /upload/storage/v1/b/localtesting_bucket/o?uploadType=resumable&upload_id=localtesting_bucket%3Atest%2Ftest.data%3A2021-09-08+10%3A13%3A34.413826 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fe241769150>: Failed to establish a new connection: [Errno 111] Connection refused'))

I think the problem resides in the URL returned by the server to resume the upload. It contains the bind IP, which breaks the HTTP client on the uploader side.

Bucket Upload notification to PubSub

Description

The main goal is to have any bucket upload integrated with a Pub/Sub notification. I don't know whether this feature doesn't exist yet in this emulator or if I'm doing it wrong.

Example

After running the local emulator with docker-compose:

version: "3"

services:
  gcs:
    container_name: gcs
    image: oittaa/gcp-storage-emulator
    ports:
      - 9023:9023
    environment:
      PORT: 9023

I then tried to run the following code without success:

import os
import google.cloud.storage as gcs
from google.cloud.storage.notification import BucketNotification

os.environ["STORAGE_EMULATOR_HOST"] = 'http://localhost:9023'

gcs_client: gcs.Client = gcs.Client()
gcs_client.create_bucket('bucket_name')

bucket: gcs.Bucket = gcs_client.get_bucket('bucket_name')
notification: BucketNotification = bucket.notification('topic_name', 'project_id')

notification.create(gcs_client)

It raises a ConnectionError:

('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

Any help or ideas?
Thanks!

Make project number as configurable as project id

Is your feature request related to a problem? Please describe.
The project number for the buckets is hardcoded as "1234":

"projectNumber": "1234",

This is not a valid GCP project number, so instead allow this to be configurable in the same way as the project ID.

Describe the solution you'd like
Make the project number configurable.

storage_client = storage.Client.create_anonymous_client()
storage_client.project_number = <value>

Describe alternatives you've considered
None

Additional context
None

Tiny file blobs cause an error

Uploading a tiny text file blob (e.g. one containing just foo) causes an error in server.py:_read_raw_data because the chunk-size line is an empty byte string. The following change seems to fix it:

def _read_raw_data(request_handler):
    # If Content-Length is set, read exactly that many bytes.
    if request_handler.headers["Content-Length"]:
        return request_handler.rfile.read(
            int(request_handler.headers["Content-Length"])
        )

    # Chunked transfer encoding: each chunk is preceded by its size in hex.
    if request_handler.headers["Transfer-Encoding"] == "chunked":
        raw_data = b""

        while True:
            line = request_handler.rfile.readline().strip()
            # An empty size line previously crashed int(line, 16);
            # treat it as a terminating zero-length chunk.
            chunk_size = int(line, 16) if line else 0
            if chunk_size == 0:
                break

            raw_data += request_handler.rfile.read(chunk_size)

            # Consume the CRLF that terminates each chunk.
            request_handler.rfile.readline()

        return raw_data

    return None

How to integrate other languages with the emulator?

Hi,

The emulator looks awesome. It is exactly what I was looking for. Can you help me to integrate the emulator with my Java project?

I am using Kotlin, but it's very different from Java or Python.

class StorageBucketTestServer(projectName: String, private val bucketName: String) {

    private val credentials: Credentials = ImpersonatedCredentials
        .newBuilder()
        .setScopes(listOf())
        .build()

    private val storage: Storage = StorageOptions
        .newBuilder()
        .setCredentials(credentials)
        .setProjectId(projectName)
        .setHost("http://localhost:9023")
        .build()
        .service

    fun upload(file: File) {
      storage[bucketName]
    }
}

I start emulator like this: gcp-storage-emulator start --in-memory --default-bucket=my-common-test

Explanation:

ImpersonatedCredentials - I create credentials similar to AnonymousCredentials in Python.
StorageOptions builder is the same as storage.Client in Python

The issue happens on the line storage[bucketName], claiming no buckets. More precisely, it's NullPointerException, which is practically the same.

What would be the correct way to access the bucket?

Note: I based my implementation on your examples and baeldung.com article.

Python API inside a Docker container fails

Describe the bug
Using the Python API example works perfectly in a virtual environment, but it fails with a timeout when run in a Docker container:
HTTPConnectionPool(host='172.17.0.1', port=63342): Max retries exceeded with url:

To Reproduce

import os
import pathlib as pl
import shutil

import pytest
from gcp_storage_emulator.server import create_server
from google.auth.credentials import AnonymousCredentials
from google.cloud import storage

import upload


@pytest.fixture
def mock_storage_client():
    """Google Cloud Storage emulator"""
    host = "localhost"
    port = 9023
    server = create_server(host, port, in_memory=True)
    server.start()
    os.environ["STORAGE_EMULATOR_HOST"] = f"http://{host}:{port}"
    yield storage.Client(credentials=AnonymousCredentials(), project="test-project")
    server.stop()
    del os.environ["STORAGE_EMULATOR_HOST"]


def test_upload_folder(mock_storage_client):
    """Test uploading a folder to Google Cloud Storage"""
    bucket_name = "test-upload-bucket"
    mock_storage_client.create_bucket(bucket_name)

Expected behavior
Bucket is created.

System (please complete the following information):

  • OS version: macos / linux
  • Python version: 3.9
  • gcp-storage-emulator version: latest

A blob without any content cannot be created

If you attempt to create a blob without any content in it, gcp-storage-emulator fails with an internal error.

Script to replicate the bug:

import os

from gcp_storage_emulator.server import create_server
from google.cloud.storage.client import Client

server = create_server("localhost", 9023, in_memory=True, default_bucket="test_bucket")
server.start()

os.environ["STORAGE_EMULATOR_HOST"] = "http://localhost:9023"
gcp_client = Client.create_anonymous_client()

test_bucket = gcp_client.get_bucket("test_bucket")
test_bucket.blob("empty_blob").open("w").close()

Exception message:

$ python3 /home/thomash/Documents/test-folder/test.py
An error has occurred while running the handler for PUT http://127.0.0.1:9023/upload/storage/v1/b/test_bucket/o?uploadType=resumable&upload_id=test_bucket%3Aempty_blob%3A2021-12-14+11%3A18%3A48.989292
a bytes-like object is required, not 'NoneType'
----------------------------------------
Exception occurred during processing of request from ('127.0.0.1', 53540)
Traceback (most recent call last):
  File "/usr/lib/python3.9/socketserver.py", line 316, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/lib/python3.9/socketserver.py", line 347, in process_request
    self.finish_request(request, client_address)
  File "/usr/lib/python3.9/socketserver.py", line 360, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/gcp_storage_emulator/server.py", line 351, in __init__
    super().__init__(*args, **kwargs)
  File "/usr/lib/python3.9/socketserver.py", line 747, in __init__
    self.handle()
  File "/usr/lib/python3.9/http/server.py", line 427, in handle
    self.handle_one_request()
  File "/usr/lib/python3.9/http/server.py", line 415, in handle_one_request
    method()
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/gcp_storage_emulator/server.py", line 367, in do_PUT
    router.handle(PUT)
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/gcp_storage_emulator/server.py", line 337, in handle
    raise e
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/gcp_storage_emulator/server.py", line 328, in handle
    handler(request, response, self._request_handler.storage)
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/gcp_storage_emulator/handlers/objects.py", line 268, in upload_partial
    obj = _checksums(data, obj)
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/gcp_storage_emulator/handlers/objects.py", line 91, in _checksums
    crc32c_hash = _crc32c(content)
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/gcp_storage_emulator/handlers/objects.py", line 80, in _crc32c
    val = google_crc32c.Checksum(content)
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/google_crc32c/cext.py", line 36, in __init__
    self._crc = value(initial_value)
TypeError: a bytes-like object is required, not 'NoneType'
----------------------------------------
An error has occurred while running the handler for PUT http://127.0.0.1:9023/upload/storage/v1/b/test_bucket/o?uploadType=resumable&upload_id=test_bucket%3Aempty_blob%3A2021-12-14+11%3A18%3A48.989292
a bytes-like object is required, not 'NoneType'
----------------------------------------
Exception occurred during processing of request from ('127.0.0.1', 53542)
Traceback (most recent call last):
  File "/usr/lib/python3.9/socketserver.py", line 316, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/lib/python3.9/socketserver.py", line 347, in process_request
    self.finish_request(request, client_address)
  File "/usr/lib/python3.9/socketserver.py", line 360, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/gcp_storage_emulator/server.py", line 351, in __init__
    super().__init__(*args, **kwargs)
  File "/usr/lib/python3.9/socketserver.py", line 747, in __init__
    self.handle()
  File "/usr/lib/python3.9/http/server.py", line 427, in handle
    self.handle_one_request()
  File "/usr/lib/python3.9/http/server.py", line 415, in handle_one_request
    method()
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/gcp_storage_emulator/server.py", line 367, in do_PUT
    router.handle(PUT)
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/gcp_storage_emulator/server.py", line 337, in handle
    raise e
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/gcp_storage_emulator/server.py", line 328, in handle
    handler(request, response, self._request_handler.storage)
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/gcp_storage_emulator/handlers/objects.py", line 268, in upload_partial
    obj = _checksums(data, obj)
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/gcp_storage_emulator/handlers/objects.py", line 91, in _checksums
    crc32c_hash = _crc32c(content)
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/gcp_storage_emulator/handlers/objects.py", line 80, in _crc32c
    val = google_crc32c.Checksum(content)
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/google_crc32c/cext.py", line 36, in __init__
    self._crc = value(initial_value)
TypeError: a bytes-like object is required, not 'NoneType'
----------------------------------------
Traceback (most recent call last):
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/urllib3/connectionpool.py", line 445, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/urllib3/connectionpool.py", line 440, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.9/http/client.py", line 1377, in getresponse
    response.begin()
  File "/usr/lib/python3.9/http/client.py", line 320, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.9/http/client.py", line 289, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/urllib3/connectionpool.py", line 755, in urlopen
    retries = retries.increment(
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/urllib3/util/retry.py", line 532, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/urllib3/packages/six.py", line 769, in reraise
    raise value.with_traceback(tb)
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/urllib3/connectionpool.py", line 445, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/urllib3/connectionpool.py", line 440, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.9/http/client.py", line 1377, in getresponse
    response.begin()
  File "/usr/lib/python3.9/http/client.py", line 320, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.9/http/client.py", line 289, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/thomash/Documents/test-folder/test.py", line 13, in <module>
    test_bucket.blob("empty_blob").open("w").close()
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/google/cloud/storage/fileio.py", line 421, in close
    self._upload_chunks_from_buffer(1)
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/google/cloud/storage/fileio.py", line 400, in _upload_chunks_from_buffer
    upload.transmit_next_chunk(transport)
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/google/resumable_media/requests/upload.py", line 515, in transmit_next_chunk
    return _request_helpers.wait_and_retry(
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/google/resumable_media/requests/_request_helpers.py", line 170, in wait_and_retry
    raise error
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/google/resumable_media/requests/_request_helpers.py", line 147, in wait_and_retry
    response = func()
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/google/resumable_media/requests/upload.py", line 507, in retriable_request
    result = transport.request(
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/google/auth/transport/requests.py", line 480, in request
    response = super(AuthorizedSession, self).request(
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/requests/sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "/home/thomash/Documents/test-folder/venv/lib/python3.9/site-packages/requests/adapters.py", line 498, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

Hive working with GCP storage emulator

Hi Team,

Can you please describe what core-site.xml I need to use for Hive to be able to reach the emulator?

I am using the config below, which throws an access-denied error when I try to run the create_table command.

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.AbstractFileSystem.gs.impl</name>
        <value>com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS</value>    
    </property>
    <property>
        <name>fs.gs.impl</name>
        <value>com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem</value>
    </property>
    <property>
        <name>fs.gs.project.id</name>
        <value>test-project</value>    
    </property>
    <property>
        <name>spark.hadoop.google.cloud.auth.service.account.enable</name>
        <value>true</value>
    </property>
    <property>
        <name>spark.hadoop.google.cloud.auth.service.account.json.keyfile</name>
        <value>/run/secrets/gcloudservicekey</value>
    </property>

</configuration>

ERROR:

hmsclient.hmsclient.genthrift.hive_metastore.ttypes.MetaException: MetaException(message='Got exception: java.io.IOException Error accessing gs://bec1c26d1ce7c49b19773cfb768ef690c/test_user/foo')
    raise ThriftHiveError(f"error creating table {table_object}") from ex
apollo.core.utils.hive.ThriftHiveError: error creating table Table(tableName='test_user__foo', dbName='testdb_98edf558bfa54e759b83c9926fb5719b', owner=None, createTime=None, lastAccessTime=None, retention=None, sd=StorageDescriptor(cols=[FieldSchema(name='count', type='int', comment='')], location='gs://bec1c26d1ce7c49b19773cfb768ef690c/test_user/foo', inputFormat='org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat', outputFormat='org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat', compressed=None, numBuckets=None, serdeInfo=SerDeInfo(name=None, serializationLib='org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe', parameters=None), bucketCols=None, sortCols=None, parameters=None, skewedInfo=None, storedAsSubDirectories=None), partitionKeys=[], parameters={'EXTERNAL': 'TRUE'}, viewOriginalText=None, viewExpandedText=None, tableType='EXTERNAL_TABLE', privileges=None, temporary=False, rewriteEnabled=None)

ENV:

> Docker environment
> Local Hive setup
> Python 3.6
> gcp-storage-emulator==2021.12.14
> google-auth==2.14.1
> POST commands to the server work but GET commands fail with access denied issue

upload_from_file() with content type "application/json" not working

According to the method documentation of gcp_storage_emulator.storage.Storage.add_to_resumable_upload, it expects the content parameter to be of type bytes, but it is actually passed as something else (in my example, a list), because request.data (which is passed as content) is the result of json.loads() when the content type is application/json.
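
A hypothetical coercion helper (not the project's actual code) illustrates the kind of guard that would avoid the crash: keep resumable upload chunks as bytes, re-serializing bodies that were parsed as JSON.

import json

def ensure_bytes(content):
    """Hypothetical helper: coerce parsed request bodies back to bytes."""
    if isinstance(content, bytes):
        return content
    if isinstance(content, str):
        return content.encode("utf-8")
    # e.g. a list like [{"a": 1}] parsed from an application/json body
    return json.dumps(content).encode("utf-8")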

  1. Start the emulator
docker run  \
  -e PORT=9023 \
  -p 9023:9023 \
  --name gcp-storage-emulator-outtest \
  -v "$(pwd)/cloudstorage":/storage \
  oittaa/gcp-storage-emulator
  2. Given a file data.json with content:
    [{"a": 1}]

  3. Try to upload the file (using Postman, for example):

import uvicorn
from fastapi import FastAPI, UploadFile, File

import os

from google.cloud import exceptions, storage

HOST = "localhost"
PORT = 9023
BUCKET = "test-bucket"

os.environ["STORAGE_EMULATOR_HOST"] = f"http://{HOST}:{PORT}"
client = storage.Client(project="TEST")

try:
    bucket = client.create_bucket(BUCKET)
except exceptions.Conflict:
    bucket = client.bucket(BUCKET)

#####


app = FastAPI()


@app.post("/upload")
async def upload_json(file: UploadFile = File(...)):
    blob_bucket = bucket.blob("test_blob")
    blob_bucket.upload_from_file(file.file, content_type=file.content_type)  # this fails, and at this point file.content_type == "application/json"
 

if __name__ == '__main__':
    uvicorn.run("main:app", debug=True, reload=True)

  4. The error I get on the emulator is:
Resource not found:
resource '355470393de4cb2d1491b7e5a7a4e47375563fb727c56ea585a2af657d5aa8ec' not found
An error has occurred while running the handler for PUT http://0.0.0.0:9023/upload/storage/v1/b/test-bucket/o?uploadType=resumable&upload_id=test-bucket%3Atest_blob%3A2022-04-09+08%3A08%3A08.176611
can't concat list to bytes
----------------------------------------
Exception occurred during processing of request from ('172.17.0.1', 59828)
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/socketserver.py", line 316, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/local/lib/python3.10/socketserver.py", line 347, in process_request
    self.finish_request(request, client_address)
  File "/usr/local/lib/python3.10/socketserver.py", line 360, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/local/lib/python3.10/site-packages/gcp_storage_emulator/server.py", line 351, in __init__
    super().__init__(*args, **kwargs)
  File "/usr/local/lib/python3.10/socketserver.py", line 747, in __init__
    self.handle()
  File "/usr/local/lib/python3.10/http/server.py", line 427, in handle
    self.handle_one_request()
  File "/usr/local/lib/python3.10/http/server.py", line 415, in handle_one_request
    method()
  File "/usr/local/lib/python3.10/site-packages/gcp_storage_emulator/server.py", line 367, in do_PUT
    router.handle(PUT)
  File "/usr/local/lib/python3.10/site-packages/gcp_storage_emulator/server.py", line 337, in handle
    raise e
  File "/usr/local/lib/python3.10/site-packages/gcp_storage_emulator/server.py", line 328, in handle
    handler(request, response, self._request_handler.storage)
  File "/usr/local/lib/python3.10/site-packages/gcp_storage_emulator/handlers/objects.py", line 324, in upload_partial
    data = storage.add_to_resumable_upload(upload_id, request.data, total_size)
  File "/usr/local/lib/python3.10/site-packages/gcp_storage_emulator/storage.py", line 249, in add_to_resumable_upload
    file_content += content
TypeError: can't concat list to bytes
----------------------------------------
Resource not found:
resource '355470393de4cb2d1491b7e5a7a4e47375563fb727c56ea585a2af657d5aa8ec' not found
An error has occurred while running the handler for PUT http://0.0.0.0:9023/upload/storage/v1/b/test-bucket/o?uploadType=resumable&upload_id=test-bucket%3Atest_blob%3A2022-04-09+08%3A08%3A08.176611
can't concat list to bytes
----------------------------------------
Exception occurred during processing of request from ('172.17.0.1', 59830)
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/socketserver.py", line 316, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/local/lib/python3.10/socketserver.py", line 347, in process_request
    self.finish_request(request, client_address)
  File "/usr/local/lib/python3.10/socketserver.py", line 360, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/local/lib/python3.10/site-packages/gcp_storage_emulator/server.py", line 351, in __init__
    super().__init__(*args, **kwargs)
  File "/usr/local/lib/python3.10/socketserver.py", line 747, in __init__
    self.handle()
  File "/usr/local/lib/python3.10/http/server.py", line 427, in handle
    self.handle_one_request()
  File "/usr/local/lib/python3.10/http/server.py", line 415, in handle_one_request
    method()
  File "/usr/local/lib/python3.10/site-packages/gcp_storage_emulator/server.py", line 367, in do_PUT
    router.handle(PUT)
  File "/usr/local/lib/python3.10/site-packages/gcp_storage_emulator/server.py", line 337, in handle
    raise e
  File "/usr/local/lib/python3.10/site-packages/gcp_storage_emulator/server.py", line 328, in handle
    handler(request, response, self._request_handler.storage)
  File "/usr/local/lib/python3.10/site-packages/gcp_storage_emulator/handlers/objects.py", line 324, in upload_partial
    data = storage.add_to_resumable_upload(upload_id, request.data, total_size)
  File "/usr/local/lib/python3.10/site-packages/gcp_storage_emulator/storage.py", line 249, in add_to_resumable_upload
    file_content += content
TypeError: can't concat list to bytes

Implement IAM endpoints for GCP Buckets

Is your feature request related to a problem? Please describe.
For my emulator use case we need to allow editing of IAM policies. As this emulator doesn't implement even stubs of the IAM endpoints, we can't use it to test our code.

Describe the solution you'd like
It would be really useful if the bucket/iam endpoints were implemented in this emulator.

Describe alternatives you've considered
We looked into using the new firebase storage emulator but we need to create and delete buckets.

Additional context
GetIAMPolicy
SetIAMPolicy

Docker Repo no longer exists

Describe the bug
The docker repo no longer exists: https://hub.docker.com/layers/oittaa/gcp-storage-emulator/

To Reproduce
docker pull oittaa/gcp-storage-emulator returns a "repository does not exist" error.

BUG: Objects not being written to or read from bucket using Docker image

Hi there. This is a pretty useful sounding project, but it unfortunately doesn't appear to work (at least when using the docker image).

Objects don't appear to be written to or read from the mounted filesystem while following the documentation in this repo and for Google's API. The following will set up a basic test server and demonstrate the surprising behavior:

Setup.

mkdir cloudstorage
docker run -d \
  -p 8080:8080 \
  --name gcp-storage-emulator \
  -v "$(pwd)/cloudstorage":/storage \
  oittaa/gcp-storage-emulator

Create bucket "test".

Request:

curl -X "POST" "http://localhost:8080/storage/v1/b?project=example" \
     -H 'content-type: application/json' \
     -d $'{
  "iamConfiguration": {
    "uniformBucketLevelAccess": {
      "enabled": true
    }
  },
  "storageClass": "NEARLINE",
  "name": "test",
  "location": "US"
}'

Response: HTTP 200

{
  "kind": "storage#bucket",
  "id": "test",
  "selfLink": "/storage/v1/b/test",
  "projectNumber": "1234",
  "name": "test",
  "timeCreated": "2021-10-22T21:19:41.824117Z",
  "updated": "2021-10-22T21:19:41.824117Z",
  "metageneration": "1",
  "iamConfiguration": {
    "bucketPolicyOnly": {
      "enabled": false
    },
    "uniformBucketLevelAccess": {
      "enabled": false
    }
  },
  "location": "US",
  "locationType": "multi-region",
  "storageClass": "STANDARD",
  "etag": "CAE="
}

List objects in bucket "test".

Request:

curl "http://localhost:8080/storage/v1/b/test/o?project=example" \
     -H 'content-type: application/json'

Response: HTTP 200

{
  "kind": "storage#object",
  "prefixes": [],
  "items": []
}

Upload an object to the bucket.

Request:

curl -X "POST" "http://localhost:8080/upload/storage/v1/b/test/o?uploadType=media&name=test.txt" \
     -H 'content-type: text/plain' \
     -d "Hello"

Response: HTTP 200

Attempt to download the object.

curl "http://localhost:8080/storage/v1/b/test/o/test.txt?alt=media" -v

Response: HTTP 404

List objects in bucket "test" again.

Request:

curl "http://localhost:8080/storage/v1/b/test/o?project=example" \
     -H 'content-type: application/json'

Response: HTTP 200, but still empty

{
  "kind": "storage#object",
  "prefixes": [],
  "items": []
}

Check the directory's contents.

ls -la cloudstorage
total 8
drwxr-xr-x  3 mike  staff   96 Oct 22 14:19 .
drwxr-xr-x  3 mike  staff   96 Oct 22 14:19 ..
-rw-r--r--  1 mike  staff  649 Oct 22 14:19 .meta
cat cloudstorage/.meta
{
  "buckets": {
    "test": {
      "kind": "storage#bucket",
      "id": "test",
      "selfLink": "/storage/v1/b/test",
      "projectNumber": "1234",
      "name": "test",
      "timeCreated": "2021-10-22T21:19:41.824117Z",
      "updated": "2021-10-22T21:19:41.824117Z",
      "metageneration": "1",
      "iamConfiguration": {
        "bucketPolicyOnly": {
          "enabled": false
        },
        "uniformBucketLevelAccess": {
          "enabled": false
        }
      },
      "location": "US",
      "locationType": "multi-region",
      "storageClass": "STANDARD",
      "etag": "CAE="
    }
  },
  "objects": {},
  "resumable": {}
}

Emulator returns `0.0.0.0` address instead of `localhost` for resumable upload in Docker (.NET client)

Describe the bug
The emulator returns its internal server address 0.0.0.0 instead of localhost / 127.0.0.1 as the resumable upload return address when running in a Docker environment.

To Reproduce
Use .NET client + docker compose:
See repo

Expected behavior
The emulator should return localhost / 127.0.0.1 when running in Docker.

System (please complete the following information):

  • OS version: Windows 11 / 22621.674
  • Python version: 3.9.13
  • gcp-storage-emulator version: oittaa/gcp-storage-emulator (latest)

Additional context
It works without problems on the same sample when you don't use Docker.

Downloading blobs not working

Setup using the docker image oittaa/gcp-storage-emulator (docker-compose).

...
storage:
    image: oittaa/gcp-storage-emulator
    volumes:
      - ./.cloudstorage:/storage
    ports:
      - 9023:9023
    environment:
      - PORT=9023

and tests:

import pytest
from google.auth.credentials import AnonymousCredentials
from google.cloud import storage
from google.cloud.exceptions import Conflict

TEST_BUCKET = "TEST_BUCKET"  # bucket name used by the tests (per the error log below)

@pytest.fixture(scope="session", autouse=True)
def setup_local_storage_emulator():
    client = storage.Client(
        credentials=AnonymousCredentials(),
        project="test",
    )
    # fill buckets with test data
    try:
        bucket = client.create_bucket(TEST_BUCKET)
    except Conflict:
        bucket = client.get_bucket(TEST_BUCKET)

    blob = bucket.blob("blob1")
    blob.upload_from_string("test1")
    blob = bucket.blob("blob2")
    blob.upload_from_string("test2")
    # up to here everything runs
    for blob in bucket.list_blobs():
        content = blob.download_as_bytes()
        # this step fails
        print("Blob [{}]: {}".format(blob.name, content))

The error I get after a while is the following:

requests.exceptions.ConnectionError: HTTPConnectionPool(host='0.0.0.0', port=9023): Max retries exceeded with url: /download/storage/v1/b/TEST_BUCKET/o/blob1?generation=1640043791&alt=media (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb2434a7460>: Failed to establish a new connection: [Errno 111] Connection refused'))

Is there something I'm missing?

Installing gcp-storage-emulator==2022.4.9 breaks tests

Describe the bug
In a repo with tests under a tests folder, installing gcp-storage-emulator causes pytest to pick up /venv/lib/python3.9/site-packages/tests/__init__.py instead.

To Reproduce
Prepare a small repo with a tests folder, in which a file imports something from the tests module (e.g. fixtures).
Run pytest before and after installing gcp-storage-emulator.

Expected behavior
The tests should work the same as before the installation.

System (please complete the following information):

  • OS version: MacOS
  • Python version: 3.9.12
  • gcp-storage-emulator version: 2022.4.9

Additional context
I suspect the tests folder is being recognized as a package during installation and is being installed as such.

Add functionality that allow to pre-load data into the storage bucket(s)

It would be really useful if the Docker container could start with pre-loaded data. This would help with using the tool for unit and integration testing. It could be done either by attaching a volume with data to pre-load or by providing a hook script that is called before the server starts. Similar functionality is implemented in other GCP Storage emulators and in other common Docker images like databases (the postgres Docker image has the /docker-entrypoint-initdb.d directory, where the user can put SQL scripts for database initialization and data import).

Resumable upload errors if the (optional) name is omitted

Thanks for the great package!

Describe the bug
When initiating a resumable upload, the name query parameter is optional:

This is not required if you included a name in the object metadata file in Step 2.

However, gcp-storage-emulator expects to get some data on the first request (request.data.get("name")), causing:

  File ".../gcp_storage_emulator/handlers/objects.py", line 188, in _create_resumable_upload
    object_id = request.data.get("name")
AttributeError: 'NoneType' object has no attribute 'get'

To Reproduce

import os

import gcp_storage_emulator.server
import gcsfs

server = gcp_storage_emulator.server.create_server("localhost", 0, in_memory=True)
server.start()
host, port = server._api._httpd.socket.getsockname()
os.environ["STORAGE_EMULATOR_HOST"] = f"http://{host}:{port}"

try:
    gcs_fs = gcsfs.GCSFileSystem()
    gcs_fs.mkdir("test_bucket")
    gcs_fs.touch("test_bucket/file") # Triggers a resumable upload w/o name on first request and errors
finally:
    server.stop()

Or, with curl to a running instance w/ existing bucket:

curl -i -X 'POST' -H 'content-type: application/json' 'http://127.0.0.1:9023/upload/storage/v1/b/test_bucket/o?uploadType=resumable'

Expected behavior
No error; the request should redirect to a unique generated upload ID.

System (please complete the following information):

  • OS version: macos monterey
  • Python version: 3.9
  • gcp-storage-emulator version: 2022.2.17

CredentialsError on first Client() logon

Seeing stuff like this. How do I handle the credentials for my first login?

>>> from google.cloud import storage
>>> storage.Client()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/palewire/.local/share/virtualenvs/bln-rP4rNvCf/lib/python3.9/site-packages/google/cloud/storage/client.py", line 124, in __init__
    super(Client, self).__init__(
  File "/home/palewire/.local/share/virtualenvs/bln-rP4rNvCf/lib/python3.9/site-packages/google/cloud/client/__init__.py", line 320, in __init__
    _ClientProjectMixin.__init__(self, project=project, credentials=credentials)
  File "/home/palewire/.local/share/virtualenvs/bln-rP4rNvCf/lib/python3.9/site-packages/google/cloud/client/__init__.py", line 268, in __init__
    project = self._determine_default(project)
  File "/home/palewire/.local/share/virtualenvs/bln-rP4rNvCf/lib/python3.9/site-packages/google/cloud/client/__init__.py", line 287, in _determine_default
    return _determine_default_project(project)
  File "/home/palewire/.local/share/virtualenvs/bln-rP4rNvCf/lib/python3.9/site-packages/google/cloud/_helpers/__init__.py", line 152, in _determine_default_project
    _, project = google.auth.default()
  File "/home/palewire/.local/share/virtualenvs/bln-rP4rNvCf/lib/python3.9/site-packages/google/auth/_default.py", line 579, in default
    raise exceptions.DefaultCredentialsError(_HELP_MESSAGE)
google.auth.exceptions.DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
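
A pattern used elsewhere in these issues sidesteps the default-credentials lookup by passing anonymous credentials and an explicit project (a sketch; it assumes the emulator is already running on localhost:9023):

import os

from google.auth.credentials import AnonymousCredentials
from google.cloud import storage

os.environ["STORAGE_EMULATOR_HOST"] = "http://localhost:9023"
# Anonymous credentials skip google.auth.default(), so no
# GOOGLE_APPLICATION_CREDENTIALS is needed against the emulator.
client = storage.Client(credentials=AnonymousCredentials(), project="test-project")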

Support for streaming of files upload

Hello there, thanks for this awesome package. I am currently using it in a docker-compose environment with a Node.js Fastify application running on Node 20.

I am trying to stream a file upload to a bucket, following Google's sample here: https://github.com/googleapis/nodejs-storage/blob/main/samples/streamFileUpload.js

  // Creates a client
  const storage = new Storage();

  // Get a reference to the bucket
  const myBucket = storage.bucket(bucketName);

  // Create a reference to a file object
  const file = myBucket.file(destFileName);

  // Create a pass through stream from a string
  const passthroughStream = new stream.PassThrough();
  passthroughStream.write(contents);
  passthroughStream.end();

  async function streamFileUpload() {
    passthroughStream.pipe(file.createWriteStream()).on('finish', () => {
      // The file upload is complete
    });

    console.log(`${destFileName} uploaded to ${bucketName}`);
  }

  streamFileUpload().catch(console.error);

But on every call, I receive this error:
'NoneType' object is not callable

I did the same test using a real bucket on Google Cloud Storage and it works flawlessly. I guess this is related to some methods not being implemented in this emulator yet?

Thanks again 👍

Custom metadata not included in HTTP headers when using storage emulator

Describe the bug

When custom metadata is added to a cloud object and you access it via an HTTP GET request, each key-value pair should be included in the response headers with the prefix x-goog-meta-<key>. These key-value pairs are missing from the response headers when the storage emulator is used.

For example, if I add this metadata to a cloud object:

{"some": "metadata"}

then there should be a header in the response that looks like this:

{"x-goog-meta-some": "metadata"}

Instead, the only headers included in the response are:

{
    'Server': 'BaseHTTP/0.6 Python/3.9.13', 
    'Date': 'Wed, 29 Jun 2022 15:59:33 GMT', 
    'X-Goog-Hash': 'crc32c=pdRKBQ==,md5=vrakOt+5UOxvgs7tGb7uIQ==', 
    'Content-type': 'text/plain',
    'Content-Length': '10'
}

To Reproduce

With the emulator running, execute the following python script:

import os

import requests
from google.cloud import storage


client = storage.Client()
client.create_bucket("my-bucket")

blob = client.get_bucket("my-bucket").blob(blob_name="my_file.txt")
blob.upload_from_string("blah blah blah")
blob.metadata = {"some": "metadata"}
blob.patch()

url = f"{os.environ['STORAGE_EMULATOR_HOST']}/my-bucket/my_file.txt"
response = requests.get(url)
assert response.headers["x-goog-meta-some"] == "metadata"

This will raise:

KeyError: 'x-goog-meta-some'

Expected behavior

Custom metadata (key-value pairs) should appear in the headers as key-value pairs like:

x-goog-meta-<key>: <value>

System (please complete the following information)

  • OS version: MacOS Monterey 12.4
  • Python version: 3.9.13
  • gcp-storage-emulator version: 2022.06.11

Additional context

I came across this while creating signed URLs to cloud objects and attempting to get their metadata, but this problem occurs for all URLs to cloud objects.

Connecting to emulator through PySpark

Hello,

I've been trying to use the emulator inside PySpark using something like this:
csv_file = 'gs://bucket/file.csv'
df = spark_session.read.format('csv').load(csv_file)

I've also added the JAR gcs-connector-hadoop3-latest.jar, because the gs file scheme could not be recognized otherwise.

When running the above code, the program hangs. Do you have any idea as to what the problem could be?

To start the emulator server I use the following:
server = create_server('localhost', 9023, in_memory=False, default_bucket='bucket')
server.start()

os.environ['STORAGE_EMULATOR_HOST'] = 'http://localhost:9023'
client = storage.Client()
client.create_bucket('bucket')

Upload from file results in a corrupted blob

When calling upload_from_file on a blob, the resulting blob cannot access certain properties such as the id or public_url.

The reason for this seems to be that obj is used to determine the JSON to return, but it is overwritten by a call to storage.create_file, which does not return a value, hence obj ends up being None.
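
A minimal sketch of the fix the report implies (hypothetical variable names, not the project's actual handler code): don't rebind obj to the return value of create_file.

# Before (per the report): obj is clobbered because create_file returns None
# obj = storage.create_file(bucket_name, file_name, content, obj)

# After: create the file, but keep returning the locally built resource
storage.create_file(bucket_name, file_name, content, obj)
response.json(obj)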

Support signing URLs

Is your feature request related to a problem? Please describe.
Signing of URLs doesn't work ATM.

Describe the solution you'd like
It would be good to make it possible to use/test signed URLs.

Describe alternatives you've considered
Use the go package, but it's harder to integrate.

Additional context
Thank you!

Module 'crc32c' has no attribute 'crc32c'

As far as I can tell my crc32c dependency is correct; could this be a Python compatibility issue?

File "lib/python3.8/site-packages/gcp_storage_emulator/handlers/objects.py", line 64, in _crc32c val = crc32c.crc32c(content) AttributeError: module 'crc32c' has no attribute 'crc32c'

cannot upload a file in nodejs

Describe the bug
Uploading a file is returning 404 when using nodejs.

To Reproduce

import { Storage } from "@google-cloud/storage";

const host = "localhost:9023";
process.env.STORAGE_EMULATOR_HOST = host;

const bucketName = "my-bucket";
const filename = "my-file";
const filepath = "file.txt";

const storage = new Storage();
storage.bucket(bucketName).upload(filepath, { destination: filename });

System (please complete the following information):

  • OS version: linux fedora 36
  • version: (node 16.13.2)
  • gcp-storage-emulator version: (latest)

Additional context
The log produced by the emulator:

"POST /upload/storage/v1/b/my-bucket/o?uploadType=multipart&name=my-file HTTP/1.1" 404 -

The file to be uploaded exists.

I didn't have this issue two weeks ago or so.

Also, when creating a bucket, the emulator reports this error in the log:

Method not implemented: POST - /b
"POST /b?project=my-project HTTP/1.1" 501 -

With Python, instead, it looks like things work smoothly.
Honestly, I don't understand why, as the API should be the same.

Does it work with current SDK versions?

Hello! Thanks for being the only one around who cared to make an emulator for GCP Storage!

So, I've tried to use it with Docker, running the simple hello world from Google in .NET, and I'm getting a "NotImplemented" response with this in the container logs:

2023-06-03 23:02:26 Starting server at 0.0.0.0:9594
2023-06-03 23:02:26 [SERVER] All services started
2023-06-03 23:02:38 Method not implemented: POST - /b
2023-06-03 23:02:38 "POST /b?project=orleans-test HTTP/1.1" 501 -

This is happening with the call to create a bucket.

I'm setting the STORAGE_EMULATOR_HOST.

Also tried building the client explicitly with this:

var client = new StorageClientBuilder()
{
    BaseUri = this.StorageEndpoint,
    UnauthenticatedAccess = true
}.Build();

Same problem...

Any idea what may be wrong?

Thank you! I appreciate any help!

Bucket or Object not found

Describe the bug
When downloading an object, a 404 Not Found is reported.

To Reproduce

  • Run the Docker container as per the readme example
  • Create my-bucket/my-filename in the shared folder, cloudstorage
  • Use a snippet program to download it; it won't be found, even with something like:
import os

from google.cloud import exceptions, storage

HOST = "localhost"
PORT = 9023
BUCKET = "my-bucket"

os.environ["STORAGE_EMULATOR_HOST"] = f"http://{HOST}:{PORT}"
client = storage.Client()

# try:
#     bucket = client.create_bucket(BUCKET)
# except exceptions.Conflict:
#     bucket = client.bucket(BUCKET)

bucket = client.bucket(BUCKET)
blob = bucket.blob("my-filename")
# blob.upload_from_string("test1")
print(blob.download_as_bytes())

Expected behavior
If the file is present in the Docker folder /storage/my-bucket/my-filename, it should be found and downloaded.


Additional context
It looks like a file must be uploaded through the API and cannot simply be placed in the cloud storage folder by hand, presumably because the emulator tracks objects in its own metadata rather than by scanning the directory.
This is a basic use case, though: having a file available for download shouldn't require uploading it through the API first.
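If that's the case, a workaround is to seed fixtures through the API rather than the filesystem. A minimal sketch, reusing the names from this report (the local fixture file "my-filename" is assumed to exist):

import os

from google.cloud import storage

os.environ["STORAGE_EMULATOR_HOST"] = "http://localhost:9023"
client = storage.Client()
bucket = client.bucket("my-bucket")

# Upload through the API so the emulator registers the object,
# instead of copying the file into the storage folder by hand
bucket.blob("my-filename").upload_from_filename("my-filename")
print(bucket.blob("my-filename").download_as_bytes())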

Need to check query string for blob name

The Cloud Storage Node.js client sends the blob name in the query string when uploading. The query string should therefore be checked for the blob name if it's not in the request data. The following change to objects.py:_multipart_upload seems to do the job:

def _multipart_upload(request, response, storage):
    name = request.data["meta"].get("name")
    # The name might be in the query string instead, e.g. with the Node.js client
    if "name" in request.query:
        name = request.query["name"][0]

    obj = _make_object_resource(
        request.base_url,
        request.params["bucket_name"],
        name,
        request.data["content-type"],
        str(len(request.data["content"])),
        request.data["meta"],
    )
    try:
        obj = _checksums(request.data["content"], obj)
        storage.create_file(
            request.params["bucket_name"],
            name,
            request.data["content"],
            obj,
        )

        response.json(obj)
    except NotFound:
        response.status = HTTPStatus.NOT_FOUND
    except Conflict as err:
        _handle_conflict(response, err)

AttributeError: 'str' object has no attribute 'get_payload'

Describe the bug
gsutil can't upload files.

To Reproduce
All commands are run as root.

  1. Start the emulator:
gcloud-storage-emulator start --port=9023 --default-bucket=test
  2. Start a devd HTTPS proxy (gsutil supports only HTTPS):
./devd-0.9-linux64/devd --tls --port 443 http://localhost:9023
  3. Create a .boto file:
# cat /root/.boto
[Credentials]
  gs_json_host=localhost
[Boto]
  https_validate_certificates=false
[GSUtil]
  default_project_id=1
  4. Run any gsutil command:
gsutil cp -r . gs://test/

Expected behavior
gsutil finishes successfully.

System (please complete the following information):

  • OS version: Debian GNU/Linux 10 (buster)
  • Python version: Python 3.7.3
  • gcp-storage-emulator version: v2022.06.11

Additional context
gcloud-storage-emulator logs:

gcloud-storage-emulator start --port=9023 --default-bucket=test  
Starting server at localhost:9023
[SERVER] Creating default bucket "test"
[SERVER] All services started

"GET /storage/v1/b/test/o?alt=json&fields=nextPageToken%2Cprefixes%2Citems%2Fname&key=AIzaSyDnacJHrKma0048b13sh8cgxNUwulubmJM&delimiter=%2F&maxResults=1000&prefix=cache&projection=noAcl HTTP/1.1" 200 -
An error has occurred while running the handler for POST http://127.0.0.1:9023/upload/storage/v1/b/test/o?alt=json&fields=generation%2Cmd5Hash%2CcustomerEncryption%2Cetag%2Csize%2Ccrc32c&key=AIzaSyDnacJHrKma0048b13sh8cgxNUwulubmJM&uploadType=multipart
'str' object has no attribute 'get_payload'
----------------------------------------
Exception happened during processing of request from ('127.0.0.1', 59258)
Traceback (most recent call last):
  File "/usr/lib/python3.7/socketserver.py", line 316, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/lib/python3.7/socketserver.py", line 347, in process_request
    self.finish_request(request, client_address)
  File "/usr/lib/python3.7/socketserver.py", line 360, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/local/lib/python3.7/dist-packages/gcloud_storage_emulator/server.py", line 244, in __init__
    super().__init__(*args, **kwargs)
  File "/usr/lib/python3.7/socketserver.py", line 720, in __init__
    self.handle()
  File "/usr/lib/python3.7/http/server.py", line 426, in handle
    self.handle_one_request()
  File "/usr/lib/python3.7/http/server.py", line 414, in handle_one_request
    method()
  File "/usr/local/lib/python3.7/dist-packages/gcloud_storage_emulator/server.py", line 252, in do_POST
    router.handle(POST)
  File "/usr/local/lib/python3.7/dist-packages/gcloud_storage_emulator/server.py", line 232, in handle
    raise e
  File "/usr/local/lib/python3.7/dist-packages/gcloud_storage_emulator/server.py", line 225, in handle
    handler(request, response, self._request_handler.storage)
  File "/usr/local/lib/python3.7/dist-packages/gcloud_storage_emulator/handlers/objects.py", line 95, in insert
    return _multipart_upload(request, response, storage)
  File "/usr/local/lib/python3.7/dist-packages/gcloud_storage_emulator/handlers/objects.py", line 44, in _multipart_upload
    request.data["meta"]["name"],
  File "/usr/local/lib/python3.7/dist-packages/gcloud_storage_emulator/server.py", line 156, in data
    self._data = _read_data(self._request_handler)
  File "/usr/local/lib/python3.7/dist-packages/gcloud_storage_emulator/server.py", line 100, in _read_data
    "meta": json.loads(payload[0].get_payload()),
AttributeError: 'str' object has no attribute 'get_payload'
----------------------------------------
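A guess at the failure mode (not verified against the emulator internals): the multipart body appears to be parsed with the stdlib email module, and when the parser doesn't recognize the body as multipart, get_payload() returns a plain str rather than a list of Message parts, so payload[0] is a single character with no get_payload attribute:

from email import message_from_bytes

# A body the parser does not treat as multipart (e.g. boundary mismatch)
raw = b"Content-Type: text/plain\r\n\r\nnot multipart"
msg = message_from_bytes(raw)
payload = msg.get_payload()
print(type(payload))  # <class 'str'>; payload[0].get_payload() then raises AttributeError

gsutil may be formatting the multipart request differently from the google-cloud-python client, which would explain why only gsutil trips over this.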

Help with signed URLs

Hi!

I'm trying to utilize gcp-storage-emulator for automated testing in https://github.com/strayer/django-gcloud-storage.

The basics are working, but two tests that call bucket.get_blob(name).generate_signed_url have issues… I built a fake credentials object that pretends to be able to sign:

from google.auth.credentials import AnonymousCredentials, Signing


class FakeSigningCredentials(Signing, AnonymousCredentials):
    def sign_bytes(self, message):
        return b"foobar"

    @property
    def signer_email(self):
        return "[email protected]"

    @property
    def signer(self):
        pass

This makes the generate_signed_url function work in general. The returned URLs are problematic in two ways:

  1. the host seems to be hardcoded to https://storage.googleapis.com in the SDK itself [see 1] – nothing gcp-storage-emulator can do about that

  2. the path is not generated in a way gcp-storage emulator expects: GET /download/storage/v1/b/test_bucket_zdkgfs/test.jpg?Expires=1643405203&GoogleAccessId=foobar%40example.tld&Signature=Zm9vYmFy HTTP/1.1" 404 -

What is missing here is /o/ before the filename.

I've got my tests working with a very ugly hack that rewrites the URLs returned by generate_signed_url, at least to confirm some kind of working state:

url = storage.url(file_name)
if os.getenv("STORAGE_EMULATOR_HOST"):
    url = url.replace(f"/{file_name}", f"/o/{file_name}")
    url = url.replace(
        "https://storage.googleapis.com",
        f"{os.getenv('STORAGE_EMULATOR_HOST')}/download/storage/v1/b",
    )

assert "image/jpeg" == urlopen(url).info().get("Content-Type")

Even though this is really ugly, I can at least finally get automated tests working without actual GCS access, but I'm curious whether I'm just missing something obvious here. Any help would be appreciated!

1: https://github.com/googleapis/python-storage/blob/main/google/cloud/storage/blob.py#L420
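For the first problem, recent google-cloud-storage releases accept an api_access_endpoint argument to generate_signed_url, which may avoid the host rewrite (a hedged sketch: it reuses the FakeSigningCredentials class above and assumes a release that supports the parameter):

import os
from datetime import timedelta

signed = bucket.get_blob(file_name).generate_signed_url(
    expiration=timedelta(hours=1),
    api_access_endpoint=os.environ["STORAGE_EMULATOR_HOST"],  # point the signed host at the emulator
    credentials=FakeSigningCredentials(),
)

The missing /o/ in the path would still need the second rewrite, though.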

`TypeError` when uploading binary files

I don't know if I'm doing something wrong when uploading binary files, but I get this error:

Exception occurred during processing of request from ('172.17.0.1', 55522)
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/socketserver.py", line 316, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/local/lib/python3.9/socketserver.py", line 347, in process_request
    self.finish_request(request, client_address)
  File "/usr/local/lib/python3.9/socketserver.py", line 360, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/local/lib/python3.9/site-packages/gcp_storage_emulator/server.py", line 348, in __init__
    super().__init__(*args, **kwargs)
  File "/usr/local/lib/python3.9/socketserver.py", line 747, in __init__
    self.handle()
  File "/usr/local/lib/python3.9/http/server.py", line 427, in handle
    self.handle_one_request()
  File "/usr/local/lib/python3.9/http/server.py", line 415, in handle_one_request
    method()
  File "/usr/local/lib/python3.9/site-packages/gcp_storage_emulator/server.py", line 356, in do_POST
    router.handle(POST)
  File "/usr/local/lib/python3.9/site-packages/gcp_storage_emulator/server.py", line 334, in handle
    raise e
  File "/usr/local/lib/python3.9/site-packages/gcp_storage_emulator/server.py", line 325, in handle
    handler(request, response, self._request_handler.storage)
  File "/usr/local/lib/python3.9/site-packages/gcp_storage_emulator/handlers/objects.py", line 230, in insert
    return _create_resumable_upload(request, response, storage)
  File "/usr/local/lib/python3.9/site-packages/gcp_storage_emulator/handlers/objects.py", line 176, in _create_resumable_upload
    request.data["name"],
TypeError: byte indices must be integers or slices, not str

This was the URL from the logs (URL decoded):

http://0.0.0.0:8080/upload/storage/v1/b/logs/o?alt=json&name=Log1/210719-112358Z-apabepa/Log1-210719-112358Z-apabepa.lcmlog&prettyPrint=false&projection=full&uploadType=resumable&upload_id=logs:Log1/210719-112358Z-apabepa/Log1-210719-112358Z-apabepa.lcmlog:2021-09-16+11:16:33.090102

The file was roughly 100–200 MB, IIRC.

Something you've seen before?
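A guess from the traceback alone (not from reading the emulator source): for resumable uploads the object name arrives in the query string, so request.data can be the raw bytes payload rather than a parsed dict, and request.data["name"] then fails with exactly this TypeError. A defensive lookup might read as follows (the attribute names mirror those in the _multipart_upload snippet earlier):

def _object_name(request):
    # Prefer parsed JSON metadata when the body was a dict
    if isinstance(request.data, dict):
        name = request.data.get("name")
        if name:
            return name
    # Fall back to the query string (a parse_qs-style dict of lists)
    values = request.query.get("name")
    return values[0] if values else None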

using the emulator, upload a file then downloading it return 404

Describe the bug
A file uploads successfully, but then trying to download it returns a 404.

[SERVER] All services started
"POST /upload/storage/v1/b/local-bucket/o?name=Example.xml&uploadType=resumable HTTP/1.1" 200 -
No file to remove '_resumable/6effa36d984da5caa33e4f5fb3a0f70e54ab18f801d03504f7cd2fdaa38e2cc5'
"PUT /upload/storage/v1/b/local-bucket/o?name=Example.xml&uploadType=resumable&upload_id=local-bucket%3AExample.xml%3A2022-10-21+13%3A31%3A18.531160 HTTP/1.1" 200 -
Resource not found:
**resource 'b' not found**
"GET /b/local-bucket/o/Example.xml?alt=media HTTP/1.1" 404 -

The /b/<bucket>/o/<object> URL being fetched is the standard GCP Storage layout (bucket, then object).
A GET on a /b URL isn't expected to fail.

To Reproduce

  1. upload a file with Node.js
  2. download the file with Node.js
import { Storage } from "@google-cloud/storage"
import process from "process"

const HOST = "localhost"
const PORT = 9023
const BUCKET = "local-bucket"

process.env.STORAGE_EMULATOR_HOST = `http://${HOST}:${PORT}`

const storage = new Storage()

async function upload(filePath, destinationFilename) {
    await storage.bucket(BUCKET).upload(filePath, {
        destination: destinationFilename,
    })
}

// Usage: pass the local path to upload and the name of the file at the destination,
// e.g. node local_storage_upload.js mypath/filename.txt filename.txt
upload(process.argv[2], process.argv[3]).catch(console.error)

download:

const [fileData] = await storage.bucket("local-bucket").file("fileToUpload.txt").download({ validation: "md5" })
// this triggers a 404 and throws, as the file is not found

Expected behavior
it should download the file

System (please complete the following information):

  • gcp-storage-emulator version: latest (there is no way to retrieve it via a -h or --version argument)

Additional context
Basically what it's saying is "URL not found" for /b. The /b part of the path is the bucket in GCP Storage, so this should just work without further issue, as it is the basic GCP Storage URL.
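A hedged sanity check of the paths involved (assuming the emulator serves downloads on the JSON API media path, as the signed-URL report above suggests, and that the requests package is available):

import requests

BASE = "http://localhost:9023"

# The bare /b/... path from the log; observed to return 404
bad = requests.get(f"{BASE}/b/local-bucket/o/Example.xml", params={"alt": "media"})
print(bad.status_code)

# The JSON API media-download path
good = requests.get(f"{BASE}/download/storage/v1/b/local-bucket/o/Example.xml", params={"alt": "media"})
print(good.status_code)  # expected 200 if the object exists

If so, the 404 reflects the URL the client built rather than a missing object.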
