joshmckin / em_aws
EM-Synchrony handler for AWS-SDK
License: MIT License
Get error: log writing failed. undefined method `lock' for #<EventMachine::Synchrony::Thread::Mutex:0x00000004285038>
This happens when deploying to Heroku Bamboo using em_aws version 0.1.6.
Hi Josh,
great package. My question is about how people are generally using em_aws. In my application, I need to know when an upload to S3 has succeeded in order to kick-off the next task.
I would like to perform many uploads asynchronously and in parallel, and then attach callbacks to kick off the next tasks.
Is this feasible with the library? It looks like line 152/fetch_response does not support this type of use...
The other option is to wrap the standard aws-sdk s3.write calls using EM.defer. You might have considered doing similar things; assuming I only need ~20 concurrent uploads, would this be a feasible solution in your opinion?
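To make the EM.defer idea above concrete, here is a minimal stdlib-only sketch of the same operation/callback shape. It does not use em_aws or EventMachine: plain threads stand in for the deferred workers, and `upload_all` and the `upload` lambda are hypothetical names, with the lambda taking the place of a real s3.write call.

```ruby
require "thread"

# Runs every upload on its own thread and fires on_done as each one finishes.
# `upload` is any callable taking a key; here it is a hypothetical stand-in
# for a blocking S3 write.
def upload_all(keys, upload, on_done)
  results = Queue.new
  threads = keys.map do |key|
    Thread.new do
      begin
        results << [key, upload.call(key), nil]
      rescue StandardError => e
        results << [key, nil, e]
      end
    end
  end
  # Queue#pop blocks until the next upload reports in, so callbacks run
  # on the calling thread in completion order.
  keys.size.times { on_done.call(*results.pop) }
  threads.each(&:join)
end

fake_upload = ->(key) { sleep 0.01; "stored #{key}" }
finished = []
upload_all(%w[a.txt b.txt c.txt], fake_upload,
           ->(key, _result, error) { finished << key unless error })
puts finished.sort.inspect
```

With around 20 concurrent uploads a thread per upload is cheap; the callback is where the next task would be kicked off.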
Seeing this line in your sample code:
EM::Synchrony.sleep(2) # Let the pending fibers run
I gather this is to simulate the reactor doing other stuff while the async operation completes. But what if we want to chain a callback on the async operation -- do you have sample code of this working with your gem?
I'm sending some trivial dynamo fetch requests that are returning a status of 0. These requests work fine with a 200 response when using the default net_http_handler.
I am using ruby 2.0.0-p195 with aws-sdk 1.9.5 and em_aws 0.3.0
Here is some debug information about the requests being sent, which I pulled from the #process_request method:
url: https://dynamodb.us-east-1.amazonaws.com:443
options: {:inactivity_timeout=>60, :connect_timeout=>10}
opts: {
:inactivity_timeout => 0,
:connect_timeout => 10,
:head=> {
"content-type" => "application/x-amz-json-1.0",
"x-amz-target" => "DynamoDB_20111205.GetItem",
"content-length" => "150",
"user-agent" => "aws-sdk-ruby/1.9.5 ruby/2.0.0 x86_64-darwin12.3.0",
"host" => "dynamodb.us-east-1.amazonaws.com",
"x-amz-date"=>"20130608T182936Z",
"x-amz-content-sha256" => "redacted",
"authorization"=>"redacted"
},
:query => nil,
:body => "{\"AttributesToGet\":[\"forward\"],\"TableName\":\"redacted\",\"Key\":{\"HashKeyElement\":{\"S\":\"progress is made on midnight oil\"}}}",
:path=> "/",
:async=>nil
}
method: apost
Both headers and body are the same with either HTTP handler, so I'm not sure why em-http-request is returning a status of 0. The response from em-http is not very helpful either.
(Apologies for the wall of text below; I tried to format it a little better than the plain old #inspect.)
got response: #<EventMachine::HttpClient:0x007fa7427b6dd0
@conn = #<EventMachine::HttpConnection:0x007fa74388a7b0
@deferred=false,
@middleware=[],
@connopts= #<HttpConnectionOptions:0x007fa7427b2028
@connect_timeout=10,
@inactivity_timeout=60,
@tls={},
@proxy=nil,
@host="dynamodb.us-east-1.amazonaws.com",
@port=443
>,
@uri="https://dynamodb.us-east-1.amazonaws.com:443",
@clients=[#<EventMachine::HttpClient:0x007fa7427b6dd0 ...>],
@pending=[],
@p= #<HTTP::Parser:0x007fa7427b4b98>,
@conn= #<EventMachine::HttpStubConnection:0x007fa7427b5520
@signature=7,
@parent= #<EventMachine::HttpConnection:0x007fa74388a7b0 ...>,
@deferred_status=:unknown,
@callbacks=[#<Proc:0x007fa7427b4760@/Users/kbishop/.rvm/gems/ruby-2.0.0-p195@n/gems/em-http-request-1.0.3/lib/em-http/http_connection.rb:94>]
>
>,
@req=#<HttpClientOptions:0x007fa74388a760
@keepalive=false,
@redirects=0,
@followed=0,
@method="POST",
@headers = {
"content-type"=>"application/x-amz-json-1.0",
"x-amz-target"=>"DynamoDB_20111205.GetItem",
"content-length"=>"150",
"user-agent"=>"aws-sdk-ruby/1.9.5 ruby/2.0.0 x86_64-darwin12.3.0",
"host"=>"dynamodb.us-east-1.amazonaws.com",
"x-amz-date"=>"20130608T183008Z",
"x-amz-content-sha256"=>"redacted",
"authorization"=>"redacted"
},
@query=nil,
@path="/",
@file=nil,
@body="{\"AttributesToGet\":[\"forward\"],
\"TableName\":\"redacted\",
\"Key\":{\"HashKeyElement\":{\"S\":\"progress is made on midnight oil\"}}}",
@pass_cookies=true,
@decoding=true,
@uri=#<Addressable::URI:0x3fd3a13dabbc URI:https://dynamodb.us-east-1.amazonaws.com:443/>,
@host="dynamodb.us-east-1.amazonaws.com",
@port=443>,
@stream=nil,
@headers=nil,
@cookies=[],
@cookiejar=#<EventMachine::HttpClient::CookieJar:0x007fa7427b63f8 @jar=#<CookieJar::Jar:0x007fa7427b5b60 @domains={}>>,
@response_header={},
@state=:response_header,
@response="",
@error=nil,
@content_decoder=nil,
@content_charset=nil,
@deferred_status=:failed,
@callbacks=[#<Proc:0x007fa743890200@/Users/kbishop/.rvm/gems/ruby-2.0.0-p195@n/gems/em-synchrony-1.0.3/lib/em-synchrony.rb:64>],
@errbacks=[],
@deferred_timeout=nil,
@deferred_args=[#<EventMachine::HttpClient:0x007fa7427b6dd0 ...>]
>
Hi,
It looks like HttpHandler does not respect port settings. This is important when working with e.g. fake_dynamo which runs locally on a port different than 443 or 80.
The current version does not work with an example config:
AWS.config({
  dynamo_db_endpoint: 'localhost',
  dynamo_db_port: '4567',
  use_ssl: false
})
Having peeked into the original handler, I would suggest a simple solution in aws/core/http/em_http_handler.rb:
def fetch_url(request)
  (request.use_ssl? ? "https" : "http") + "://#{request.host}:#{request.port}"
end
The port is already set on the request. I have not done extensive tests, but it seems to work for both original AWS services and local fake_dynamo.
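The suggestion above can be tried in isolation. This is only a sketch: `FakeRequest` is a hypothetical stand-in that models just the three request attributes `fetch_url` reads, not the real aws-sdk request class.

```ruby
# Hypothetical stand-in for the SDK request object.
FakeRequest = Struct.new(:host, :port, :ssl) do
  def use_ssl?
    ssl
  end
end

# Same logic as the proposed fetch_url: scheme from use_ssl?, port from the request.
def fetch_url(request)
  scheme = request.use_ssl? ? "https" : "http"
  "#{scheme}://#{request.host}:#{request.port}"
end

puts fetch_url(FakeRequest.new("localhost", 4567, false))
# prints "http://localhost:4567"
puts fetch_url(FakeRequest.new("dynamodb.us-east-1.amazonaws.com", 443, true))
# prints "https://dynamodb.us-east-1.amazonaws.com:443"
```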
I forked their library to switch it to:
if defined?(EM) && EM.reactor_running?
  fiber = Fiber.current
  EM::Timer.new(sleeps.shift) { fiber.resume }
  Fiber.yield
else
  Kernel.sleep....
But I thought I would mention in case you want to monkey patch, as it was killing the performance of my servers (predictably).
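For anyone wanting to monkey patch along these lines, here is a minimal self-contained sketch of the pattern. The method name `reactor_friendly_sleep` is hypothetical, and the fallback branch is my assumption about what the elided line does; it is not taken from the fork.

```ruby
# Pauses the current fiber on a reactor timer when EventMachine is running,
# and falls back to a blocking sleep otherwise. The EM branch must be called
# from inside a non-root fiber (for example within EM.synchrony).
def reactor_friendly_sleep(seconds)
  if defined?(EventMachine) && EventMachine.reactor_running?
    fiber = Fiber.current
    EventMachine::Timer.new(seconds) { fiber.resume }
    Fiber.yield
  else
    Kernel.sleep(seconds)
  end
end

# Without EventMachine loaded this simply blocks for the given time.
reactor_friendly_sleep(0.01)
```

The point of the timer branch is that only the calling fiber waits, so the reactor keeps serving other connections during the back-off.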
This is important when long polling takes place, e.g. when using AWS::SimpleWorkflow::DecisionTaskCollection#poll and probably other services like SQS.
What happens right now is that the handler uses its default timeout settings, which makes it time out after ~15s:
[AWS SimpleWorkflow 0 15.294478 0 retries] poll_for_activity_task(:domain=>"streamza-dev1",:identity=>"mf:8254",:task_list=>{:name=>"download-list"})
What should happen is that the handler should respect the read_timeout on the request object and set the inactivity_timeout on the EM::HttpRequest object. Timeout on polling requests equals.
My very rough fix to this is (in em_http_handler.rb):
# Builds and attempts the request. Occasionally under load em-http-request
# returns a status of 0 with nil for header and body, in such situations
# we retry as many times as status_0_retries is set. If our retries exceed
# status_0_retries we assume there is a network error
def process_request(request,response,async=false,retries=0,&read_block)
  method = "a#{request.http_method}".downcase.to_sym # aget, apost, aput, adelete, ahead
  opts = fetch_request_options(request)
  opts[:async] = (async || opts[:async])
  opts[:inactivity_timeout] = request.read_timeout
  puts opts
  url = fetch_url(request)
  begin
    http_response = fetch_response(url,method,opts,&read_block)
    unless opts[:async]
      response.status = http_response.response_header.status.to_i
      if response.status == 0
        if retries <= status_0_retries.to_i
          # Pass async through so the counter lands in the retries parameter
          process_request(request,response,async,(retries + 1),&read_block)
        else
          response.network_error = true
        end
      else
        response.headers = fetch_response_headers(http_response)
        response.body = http_response.response
      end
    end
  rescue Timeout::Error => error
    response.network_error = error
  rescue *EM_PASS_THROUGH_ERRORS => error
    raise error
  rescue Exception => error
    response.network_error = error
  end
  nil
end
end
and
def fetch_response(url,method,opts={},&read_block)
  inactivity_timeout = opts.delete :inactivity_timeout
  if @pool
    @pool.run(url) do |connection|
      req = connection.send(method, {:keepalive => true}.merge(opts))
      req.stream &read_block if block_given?
      return EM::Synchrony.sync req unless opts[:async]
    end
  else
    req = EM::HttpRequest.new(url, :inactivity_timeout => inactivity_timeout).send(method,opts)
    req.stream &read_block if block_given?
    return EM::Synchrony.sync req unless opts[:async]
  end
  nil
end
There are just 3 lines changed. I have not figured out how to re-configure @pool, but I hope this gives a good hint.
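The core of the change is mapping the request's read_timeout onto the connection's inactivity_timeout. A small sketch of just that mapping, outside the handler, might look like this. `PollRequest`, `connection_options`, and the default values are hypothetical; the defaults only mirror numbers seen in debug output elsewhere and are an assumption, not the gem's documented behaviour.

```ruby
# Hypothetical stand-in for a request that carries a read timeout.
PollRequest = Struct.new(:read_timeout)

# Assumed defaults for illustration only.
DEFAULT_CONNECTION_OPTIONS = { inactivity_timeout: 60, connect_timeout: 10 }.freeze

# Returns connection options where the request's read_timeout, when present,
# overrides the default inactivity timeout.
def connection_options(request, defaults = DEFAULT_CONNECTION_OPTIONS)
  timeout = request.read_timeout
  timeout ? defaults.merge(inactivity_timeout: timeout) : defaults.dup
end

puts connection_options(PollRequest.new(90)).inspect
puts connection_options(PollRequest.new(nil)).inspect
```

A pooled setup would need the same value applied wherever the pool builds its connections, which is the part the post above leaves open.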
Hi,
I have a problem with em_aws and DynamoDB. Here is what is happening:
require 'em-synchrony'
require 'aws-sdk'
require 'aws/core/http/em_http_handler'
AWS.config({
  access_key_id: .............,
  secret_access_key: ............,
  http_handler: AWS::Http::EMHttpHandler.new()
})
dynamo_db = AWS::DynamoDB.new
dynamo_db.tables.each {|table| puts table.name }
results in:
/Users/michalf/.rvm/gems/ruby-1.9.3-p327/gems/aws-sdk-1.8.1.1/lib/aws/core/client.rb:318:in `return_or_raise': CRC32 integrity check failed (AWS::DynamoDB::Errors::CRC32CheckFailed)
from /Users/michalf/.rvm/gems/ruby-1.9.3-p327/gems/aws-sdk-1.8.1.1/lib/aws/core/client.rb:419:in `client_request'
from (eval):3:in `list_tables'
from /Users/michalf/.rvm/gems/ruby-1.9.3-p327/gems/aws-sdk-1.8.1.1/lib/aws/dynamo_db/table_collection.rb:121:in `_each_item'
from /Users/michalf/.rvm/gems/ruby-1.9.3-p327/gems/aws-sdk-1.8.1.1/lib/aws/core/collection/with_limit_and_next_token.rb:54:in `_each_batch'
from /Users/michalf/.rvm/gems/ruby-1.9.3-p327/gems/aws-sdk-1.8.1.1/lib/aws/core/collection.rb:82:in `each_batch'
from /Users/michalf/.rvm/gems/ruby-1.9.3-p327/gems/aws-sdk-1.8.1.1/lib/aws/core/collection.rb:49:in `each'
from testdynamodb.rb:12:in `<main>'
Somehow the CRC32 check fails. The above code runs OK if the dynamo_db_crc32 option is set to false in AWS.config:
AWS.config({
  access_key_id: .............,
  secret_access_key: ............,
  http_handler: AWS::Http::EMHttpHandler.new(),
  dynamo_db_crc32: false
})
It would be great if this issue can be fixed! Thanks,
Michal
PS em_aws is great! It made my projects so much simpler!
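As background for the CRC32 failure above: DynamoDB reports a checksum of the response body in the x-amz-crc32 header, and the client compares it with a CRC32 of the body it received. The sketch below shows only that comparison using Ruby's Zlib; `crc32_ok?` is a hypothetical helper, and whether a changed body (for example one the HTTP layer has already decoded) is the actual cause here is an assumption, not something established in this thread.

```ruby
require "zlib"

# True when the body's CRC32 matches the value reported in the header.
def crc32_ok?(body, header_value)
  Zlib.crc32(body) == header_value.to_i
end

body = '{"TableNames":["example"]}'
reported = Zlib.crc32(body).to_s # what the header would carry for this exact body

puts crc32_ok?(body, reported)
# prints "true"
puts crc32_ok?(body.sub("example", "Example"), reported)
# prints "false": even a one-byte difference fails the check
```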
There's a comment that says it defaults a connection pool to size 5. Is this actually happening? I'm having an incredibly hard time finding where this is implemented, as it isn't here and it isn't in em-http-request.
Hey @JoshMcKin,
I'm using your em_aws (great idea, by the way) but I'm running into an issue. I'm using it with DynamoDB to do massive amounts of writes. The initial query makes an HTTP request that requires a session authorization from AWS, which unfortunately involves some locking. Any subsequent requests made before the session authorization comes back result in the following exception:
[2012-05-11 23:26:17.954 #18573] ERROR -- : #<ThreadError: deadlock; recursive locking>
[2012-05-11 23:26:17.954 #18573] DEBUG -- : <internal:prelude>:8:in `lock'
<internal:prelude>:8:in `synchronize'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/session_signer.rb:63:in `get_session'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/session_signer.rb:72:in `session'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/session_signer.rb:42:in `access_key_id'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/dynamo_db/request.rb:31:in `add_authorization!'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/client.rb:436:in `build_request'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/client.rb:375:in `block (3 levels) in client_request'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/response.rb:65:in `call'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/response.rb:65:in `rebuild_request'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/response.rb:60:in `initialize'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/client.rb:169:in `new'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/client.rb:169:in `new_response'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/client.rb:375:in `block (2 levels) in client_request'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/client.rb:287:in `log_client_request'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/client.rb:363:in `block in client_request'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/client.rb:275:in `return_or_raise'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/client.rb:362:in `client_request'
(eval):3:in `describe_table'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/dynamo_db/table.rb:497:in `get_resource'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/dynamo_db/table.rb:305:in `exists?'
Once the session is returned by AWS, the thing works like a charm.
I was wondering if you had any insight, maybe we could tackle this problem together?
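The error itself is easy to reproduce with a plain Mutex, which may help when reasoning about it. This sketch re-enters a lock that is already held; it does not use aws-sdk or fibers. My assumption about the trace above is that a second fiber on the same thread reached the session signer's synchronize while the first still held the lock, which Ruby of that era would report the same way.

```ruby
lock = Mutex.new

# Taking a Mutex that the current thread already holds raises ThreadError
# instead of blocking forever.
error =
  begin
    lock.synchronize { lock.synchronize { :never_reached } }
  rescue ThreadError => e
    e
  end

puts error.class
puts error.message
```

One way around this class of problem is to make sure the session is fetched once, before the concurrent requests start, so nothing contends for that lock.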
Breaks darn near everything
Hi,
I came across the following bug. Given the config:
AWS.config({
access_key_id: CONFIG.aws.api_key,
secret_access_key: CONFIG.aws.secret,
http_handler: AWS::Http::EMHttpHandler.new
})
the following fails if an object does NOT exist:
AWS::S3.new.buckets[mybucket].objects['this-does-not-exist'].exists?
AWS::Errors::Base:
from /Users/michalf/.rvm/gems/ruby-1.9.3-p327/gems/aws-sdk-1.8.1.3/lib/aws/core/client.rb:318:in `return_or_raise'
from /Users/michalf/.rvm/gems/ruby-1.9.3-p327/gems/aws-sdk-1.8.1.3/lib/aws/core/client.rb:419:in `client_request'
from (eval):3:in `head_object'
from /Users/michalf/.rvm/gems/ruby-1.9.3-p327/gems/aws-sdk-1.8.1.3/lib/aws/s3/s3_object.rb:294:in `head'
from /Users/michalf/.rvm/gems/ruby-1.9.3-p327/gems/aws-sdk-1.8.1.3/lib/aws/s3/s3_object.rb:271:in `exists?'
from (irb):5
from /Users/michalf/.rvm/rubies/ruby-1.9.3-p327/bin/irb:16:in `<main>'
I have not yet had time to look into this, but it seems critical to our project. It works with the default HTTP handler.
Thanks in advance!
Michal
I see you're using the AWS autoloader. I highly recommend you drop in:
AWS.eager_autoload!
as well, since otherwise the AWS library itself isn't thread-safe. Since EventMachine can't/doesn't run in a non-threaded environment (a unicorn deployment, for instance), this problem is basically impossible to avoid, and I hit a couple of heisenbugs because of it. The line of code doesn't change the functionality of the library in any way, but it's a little obscure and took me a while to find even after I figured out what the problem was.
Since it's something that must be in all the projects that use this library, it might save some people the confusion of the undefined method exceptions deep within the dark depths of Amazon's code. There might be an issue with you calling it too soon if people are using other patches to the aws-sdk, but I haven't seen any troubles.
Thanks for sharing the cool code. This isn't an issue so much as a suggestion (maybe the best solution is me issuing a pull request for a change to the readme?)
Thanks for the gem. Works great in most of my cases. One scenario when streaming a file in sinatra causes the following error:
(eval):11:in `yield': can't yield from root fiber (FiberError)
from (eval):11:in `head'
from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/em_aws-0.1.3/lib/aws/core/http/em_http_handler.rb:94:in `fetch_response'
from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/em_aws-0.1.3/lib/aws/core/http/em_http_handler.rb:124:in `handle_it'
from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/em_aws-0.1.3/lib/aws/core/http/em_http_handler.rb:99:in `handle'
from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/aws-sdk-1.5.2/lib/aws/core/client.rb:224:in `block in make_sync_request'
from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/aws-sdk-1.5.2/lib/aws/core/client.rb:235:in `retry_server_errors'
from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/aws-sdk-1.5.2/lib/aws/core/client.rb:219:in `make_sync_request'
from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/aws-sdk-1.5.2/lib/aws/core/client.rb:400:in `block (2 levels) in client_request'
from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/aws-sdk-1.5.2/lib/aws/core/client.rb:289:in `log_client_request'
from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/aws-sdk-1.5.2/lib/aws/core/client.rb:373:in `block in client_request'
from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/aws-sdk-1.5.2/lib/aws/core/client.rb:271:in `return_or_raise'
from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/aws-sdk-1.5.2/lib/aws/core/client.rb:372:in `client_request'
from (eval):3:in `head_object'
from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/aws-sdk-1.5.2/lib/aws/s3/s3_object.rb:94:in `head'
from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/aws-sdk-1.5.2/lib/aws/s3/s3_object.rb:71:in `exists?'
from /Users/philipwi/CloudDev/storage_server/storage.rb:86:in `stream'
from /Users/philipwi/CloudDev/storage_server/myapp.rb:155:in `block (2 levels) in <class:StorageApp>'
from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/sinatra-1.3.2/lib/sinatra/base.rb:296:in `block in stream'
from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/sinatra-1.3.2/lib/sinatra/base.rb:264:in `call'
from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/sinatra-1.3.2/lib/sinatra/base.rb:264:in `block in each'
from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/eventmachine-1.0.0.beta.4/lib/eventmachine.rb:1012:in `call'
from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/eventmachine-1.0.0.beta.4/lib/eventmachine.rb:1012:in `block in spawn_threadpool'
Do you have any ideas?
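The bottom frames of the trace show the call running on an EventMachine thread-pool thread, where there is no enclosing fiber to yield back to. A stdlib-only sketch of that situation follows; `yield_outcome` is a hypothetical helper, plain threads stand in for the thread pool, and the exact error message differs between Ruby versions, so only the error class is shown.

```ruby
# Tries to pause the current fiber and reports what happened.
def yield_outcome
  Fiber.yield(:paused)
  :resumed
rescue FiberError => e
  e.class
end

# A bare thread has only its root fiber, so there is nothing to yield to.
bare = Thread.new { yield_outcome }.value

# Wrapping the same call in a Fiber gives the yield a caller to return to.
wrapped = Thread.new { Fiber.new { yield_outcome }.resume }.value

puts bare.inspect
puts wrapped.inspect
```

By analogy, running the S3 call inside its own fiber on the reactor thread, rather than directly inside the streaming block, is one direction to try; I have not verified it against Sinatra's streaming.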
What are the holdups that need to be addressed to support ruby 2.0.0?
When using methods such as .with_prefix, the HTTP params are not being passed in the request to AWS.