Comments (32)
Is there a reliable way to check whether the CLI is able to call webbrowser.open? This is convenient for local access.
How convenient? In my terminal I just click on the link and a browser opens. I suppose that this may depend on terminal choice though.
from gcsfs.
If you're on GCP I recommend using GCSFileSystem(token='cloud')
This could definitely use a better error message though.
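What token='cloud' does, roughly, is ask the GCE metadata server for an access token, as the traceback in the next comment shows. A minimal sketch of that request (not gcsfs's actual code; the metadata hostname only resolves on machines running inside Google Cloud):

```python
import urllib.request

# The endpoint gcsfs hits for token='cloud', taken from the
# traceback below (Metadata-Flavor header is required by Google).
METADATA_URL = ('http://metadata.google.internal/computeMetadata/v1/'
                'instance/service-accounts/default/token')

def metadata_token_request():
    """Build (but do not send) the metadata-server token request."""
    return urllib.request.Request(
        METADATA_URL, headers={'Metadata-Flavor': 'Google'})

req = metadata_token_request()
# Off Google Cloud, 'metadata.google.internal' does not resolve,
# which is exactly the gaierror reported in the traceback below.
```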
With fs = gcsfs.GCSFileSystem(project='pangeo-181919', token='cloud'), I get the following error:
---------------------------------------------------------------------------
gaierror Traceback (most recent call last)
/home/rpa/.conda/envs/dask_distributed/lib/python3.5/site-packages/requests/packages/urllib3/connection.py in _new_conn(self)
140 conn = connection.create_connection(
--> 141 (self.host, self.port), self.timeout, **extra_kw)
142
/home/rpa/.conda/envs/dask_distributed/lib/python3.5/site-packages/requests/packages/urllib3/util/connection.py in create_connection(address, timeout, source_address, socket_options)
59
---> 60 for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
61 af, socktype, proto, canonname, sa = res
/home/rpa/.conda/envs/dask_distributed/lib/python3.5/socket.py in getaddrinfo(host, port, family, type, proto, flags)
732 addrlist = []
--> 733 for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
734 af, socktype, proto, canonname, sa = res
gaierror: [Errno -2] Name or service not known
During handling of the above exception, another exception occurred:
NewConnectionError Traceback (most recent call last)
/home/rpa/.conda/envs/dask_distributed/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
599 body=body, headers=headers,
--> 600 chunked=chunked)
601
/home/rpa/.conda/envs/dask_distributed/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
355 else:
--> 356 conn.request(method, url, **httplib_request_kw)
357
/home/rpa/.conda/envs/dask_distributed/lib/python3.5/http/client.py in request(self, method, url, body, headers)
1106 """Send a complete request to the server."""
-> 1107 self._send_request(method, url, body, headers)
1108
/home/rpa/.conda/envs/dask_distributed/lib/python3.5/http/client.py in _send_request(self, method, url, body, headers)
1151 body = _encode(body, 'body')
-> 1152 self.endheaders(body)
1153
/home/rpa/.conda/envs/dask_distributed/lib/python3.5/http/client.py in endheaders(self, message_body)
1102 raise CannotSendHeader()
-> 1103 self._send_output(message_body)
1104
/home/rpa/.conda/envs/dask_distributed/lib/python3.5/http/client.py in _send_output(self, message_body)
933
--> 934 self.send(msg)
935 if message_body is not None:
/home/rpa/.conda/envs/dask_distributed/lib/python3.5/http/client.py in send(self, data)
876 if self.auto_open:
--> 877 self.connect()
878 else:
/home/rpa/.conda/envs/dask_distributed/lib/python3.5/site-packages/requests/packages/urllib3/connection.py in connect(self)
165 def connect(self):
--> 166 conn = self._new_conn()
167 self._prepare_conn(conn)
/home/rpa/.conda/envs/dask_distributed/lib/python3.5/site-packages/requests/packages/urllib3/connection.py in _new_conn(self)
149 raise NewConnectionError(
--> 150 self, "Failed to establish a new connection: %s" % e)
151
NewConnectionError: <requests.packages.urllib3.connection.HTTPConnection object at 0x7f1af4133908>: Failed to establish a new connection: [Errno -2] Name or service not known
During handling of the above exception, another exception occurred:
MaxRetryError Traceback (most recent call last)
/home/rpa/.conda/envs/dask_distributed/lib/python3.5/site-packages/requests/adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
422 retries=self.max_retries,
--> 423 timeout=timeout
424 )
/home/rpa/.conda/envs/dask_distributed/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
648 retries = retries.increment(method, url, error=e, _pool=self,
--> 649 _stacktrace=sys.exc_info()[2])
650 retries.sleep()
/home/rpa/.conda/envs/dask_distributed/lib/python3.5/site-packages/requests/packages/urllib3/util/retry.py in increment(self, method, url, response, error, _pool, _stacktrace)
375 if new_retry.is_exhausted():
--> 376 raise MaxRetryError(_pool, url, error or ResponseError(cause))
377
MaxRetryError: HTTPConnectionPool(host='metadata.google.internal', port=80): Max retries exceeded with url: /computeMetadata/v1/instance/service-accounts/default/token (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f1af4133908>: Failed to establish a new connection: [Errno -2] Name or service not known',))
During handling of the above exception, another exception occurred:
ConnectionError Traceback (most recent call last)
<ipython-input-18-c16df27e58be> in <module>()
1 import gcsfs
----> 2 fs = gcsfs.GCSFileSystem(token='cloud')
3 fs
/home/rpa/.conda/envs/dask_distributed/lib/python3.5/site-packages/gcsfs/core.py in __init__(self, project, access, token, block_size)
162 self.access = access
163 self.dirs = {}
--> 164 self.connect()
165 self._singleton[0] = self
166
/home/rpa/.conda/envs/dask_distributed/lib/python3.5/site-packages/gcsfs/core.py in connect(self, refresh)
259 'http://metadata.google.internal/computeMetadata/v1/'
260 'instance/service-accounts/default/token',
--> 261 headers={'Metadata-Flavor': 'Google'})
262 data = r.json()
263 data['timestamp'] = time.time()
/home/rpa/.conda/envs/dask_distributed/lib/python3.5/site-packages/requests/api.py in get(url, params, **kwargs)
68
69 kwargs.setdefault('allow_redirects', True)
---> 70 return request('get', url, params=params, **kwargs)
71
72
/home/rpa/.conda/envs/dask_distributed/lib/python3.5/site-packages/requests/api.py in request(method, url, **kwargs)
54 # cases, and look like a memory leak in others.
55 with sessions.Session() as session:
---> 56 return session.request(method=method, url=url, **kwargs)
57
58
/home/rpa/.conda/envs/dask_distributed/lib/python3.5/site-packages/requests/sessions.py in request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
486 }
487 send_kwargs.update(settings)
--> 488 resp = self.send(prep, **send_kwargs)
489
490 return resp
/home/rpa/.conda/envs/dask_distributed/lib/python3.5/site-packages/requests/sessions.py in send(self, request, **kwargs)
607
608 # Send the request
--> 609 r = adapter.send(request, **kwargs)
610
611 # Total elapsed time of the request (approximately)
/home/rpa/.conda/envs/dask_distributed/lib/python3.5/site-packages/requests/adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
485 raise ProxyError(e, request=request)
486
--> 487 raise ConnectionError(e, request=request)
488
489 except ClosedPoolError as e:
ConnectionError: HTTPConnectionPool(host='metadata.google.internal', port=80): Max retries exceeded with url: /computeMetadata/v1/instance/service-accounts/default/token (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f1af4133908>: Failed to establish a new connection: [Errno -2] Name or service not known',))
I think token='cloud' did not work because I don't have any gcloud stuff installed on this computer. I was hoping to get redirected to a browser to create a new token, as happened when I tried this on my laptop.
I am now going to try manually generating a token.
The token='cloud'
approach should grab a token if the compute from which you are running it was deployed on Google Cloud.
Also, if you're trying to recreate the workflow from Pangeo you'll also need to ensure that your data is readable with your current permissions.
Then I was mistaken in recommending token='cloud'. You'll need to authenticate, perhaps with gcloud auth login if you can install gcloud.
If you log in with gcloud, gcsfs will pick up your default token (this is relatively new).
For the original error, this sounds like the previous occasion when Google deleted the OAuth client; but checking in the console, it is still there.
Yes: you either need a token as text or a file (as generated by gcloud or gcsfs), or the browser-based method should work. I don't know why the latter is failing.
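The options listed here (a token as text, a token file, or fall back to the browser flow) can be sketched as a tiny resolution helper. This is hypothetical; pick_token is not part of gcsfs:

```python
import json
import os

def pick_token(token=None):
    """Resolve a credentials source, roughly mirroring the options
    described above. Hypothetical helper, not gcsfs's actual
    resolution logic."""
    if isinstance(token, dict):
        return token               # token given directly as a dict
    if isinstance(token, str) and os.path.exists(token):
        with open(token) as f:
            return json.load(f)    # token stored in a JSON file
    return None                    # caller starts the browser-based flow

# A dict passes straight through:
tok = pick_token({'type': 'authorized_user', 'refresh_token': '...'})
```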
Even after installing gcloud and running gcloud init, I am getting the same OAuth error.
How do I generate a token manually?
Some of this documentation might help: https://cloud.google.com/storage/docs/authentication
Also, for gcloud you want to run gcloud auth login
Also http://gcsfs.readthedocs.io/en/latest/#credentials
I created a json key for a "service account." When I passed it via token to GCSFileSystem, I got the error
ValueError: Only 'authorized_user' tokens accepted, got: service_account
Yeah, there was a request to support service account credentials recently, but I'm not sure what to do with them and, indeed, what it means to access a project via a service account. If you let me know how to make one, perhaps I can try it - could be that I just need to relax the test for "authorized_user".
The problem is that I don't know how to get an authorized_user json token manually. I have read all the docs you guys have pointed me to, but I'm stuck.
I am assuming that authorized_user tokens are related to "OAuth 2.0 Client IDs" on this page: https://console.cloud.google.com/apis/credentials
When I try to create a new OAuth Client ID, it just gives me a "client ID" and "client secret" via a dialog in the browser. I don't know how to convert this to the json key or dict needed by gcsfs.
No, client IDs are used by applications, so that they can request their own auth - this is what the browser-based method is supposed to do. You can in theory use your own client credentials to authenticate, by replacing the contents of gcsfs.core.not_secret; I think that has a decent chance of working.
Ok, then it looks like I am stuck. Until the original issue (RuntimeError: b'{\n "error" : "deleted_client",\n "error_description" : "The OAuth client was deleted."\n}') can be resolved somehow, I can't connect to gcsfs. This means I can't transfer the data from my server to gcs, which is needed for pangeo-data/pangeo#19.
Any advice on how to debug?
Browser auth still works at this end. Are you sure you have a recent gcsfs? gcsfs.core.not_secret["client_id"] should be "586241054156-7a3vrghs70ffkkfkmnnatjnbjg03cq9a.apps.googleusercontent.com".
I have gcsfs.__version__ == '0.0.2' and gcsfs.core.not_secret["client_id"] == '586241054156-is96mugvl2prnj0ib5gsg1l3q9m9jp7p.apps.googleusercontent.com'. This does appear to be different from yours. I think the problem is that the install instructions on the gcsfs docs still refer to your personal repo:
pip install git+https://github.com/martindurant/gcsfs.git
That's quite a miss, sorry.
You want conda install -c conda-forge gcsfs to get 0.0.3, or pip-install from the dask org.
I just pip installed it. I got past that error! Woo hoo!
However, I am still stuck, because the remote server can't open a browser window. It says "Enter the following code when prompted in the browser:", but then nothing happens. If it would just print the url, I could click on it and authenticate.
The docs say I can pass a token file that should be located at ~/.config/gcloud/application_default_credentials.json. I did not have this file. After lots of experimentation I found a valid credentials file at
~/.config/gcloud/legacy_credentials/<google username>/adc.json
It worked. I am now authenticated.
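A small sketch of checking and using such a file (is_authorized_user is a hypothetical helper; the path is the one found above, with the username placeholder left in):

```python
import json
import os

# Path found above; the '<google username>' part is user-specific.
ADC_PATH = os.path.expanduser(
    '~/.config/gcloud/legacy_credentials/<google username>/adc.json')

def is_authorized_user(path):
    """Return True if a gcloud credentials file carries the
    'authorized_user' type that gcsfs accepts; 'service_account'
    tokens are rejected, as seen earlier in this thread."""
    with open(path) as f:
        return json.load(f).get('type') == 'authorized_user'

# With a valid file, the path can then be handed to gcsfs:
# fs = gcsfs.GCSFileSystem(project='pangeo-181919', token=ADC_PATH)
```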
Sorry that was so tricky for you. That gcloud path is where it happens to be for me, but I guess it is version-dependent. gcsfs will pick it up if the oauth library knows about that location.
I am going to make a PR which updates the docs to address some of the challenges I encountered here.
@martindurant any objection to the following change?
diff --git a/gcsfs/core.py b/gcsfs/core.py
index e61b769..101dbdc 100644
--- a/gcsfs/core.py
+++ b/gcsfs/core.py
@@ -18,7 +18,6 @@ import requests
import sys
import time
import warnings
-import webbrowser
from requests.exceptions import RequestException
from .utils import HtmlError
@@ -253,9 +252,8 @@ class GCSFileSystem(object):
'scope': scope})
validate_response(r, path)
data = json.loads(r.content.decode())
- print('Enter the following code when prompted in the browser:')
- print(data['user_code'])
- webbrowser.open(data['verification_url'])
+ print('Navigate to:', data['verification_url'])
+ print('Enter code:', data['user_code'])
for i in range(max(self.retries, 10)):
# minimum 10 retries =20s, usually reasonable
time.sleep(2)
Had better up the timeout for the additional clicks - could be quite large, as the user can always ^C out.
Is there a reliable way to check whether the CLI is able to call webbrowser.open?
I doubt it; in fact, it has been broken on OSX. It could be fire-and-forget.
Docs are now correct, so I'll close.