rturk's Introduction

RTurk - A ridiculously simple Mechanical Turk library in Ruby

What's it do?!?

RTurk is designed to fire off Mechanical Turk tasks for pages that reside on an external site.

The pages could be part of a Rails app, or just a simple JavaScript-enabled form.

If you're integrating RTurk with a Rails app, do yourself a favor and check out Turkee by Jim Jones. It integrates your Rails forms with Mechanical Turk, and includes rake tasks to pull and process submissions. Definitely a time saver.

Installation

# Requires Ruby >1.9.2
gem install rturk

Use

Let's say you have a form at "http://myapp.com/turkers/add_tags" where Turkers can add some tags to items in your catalogue.

Creating HITs

require 'rturk'

RTurk.setup(YourAWSAccessKeyId, YourAWSAccessKey, :sandbox => true)
hit = RTurk::Hit.create(:title => "Add some tags to a photo") do |hit|
  hit.max_assignments = 2
  hit.description = 'blah'
  hit.question("http://myapp.com/turkers/add_tags",
               :frame_height => 1000)  # pixels for iframe
  hit.reward = 0.05
  hit.qualifications.add :approval_rate, { :gt => 80 }
end

p hit.url #=>  'https://workersandbox.mturk.com:443/mturk/preview?groupId=Q29J3XZQ1ASZH5YNKZDZ'

Reviewing and Approving HITs

hits = RTurk::Hit.all_reviewable

puts "#{hits.size} reviewable hits. \n"

unless hits.empty?
  puts "Reviewing all assignments"

  hits.each do |hit|
    hit.assignments.each do |assignment|
      puts assignment.answers['tags']
      assignment.approve! if assignment.status == 'Submitted'
    end
  end
end

Wiping all your hits out

hits = RTurk::Hit.all_reviewable

puts "#{hits.size} reviewable hits. \n"

unless hits.empty?
  puts "Approving all assignments and disposing of each hit!"

  hits.each do |hit|
    hit.expire!
    hit.assignments.each do |assignment|
      assignment.approve!
    end
    hit.dispose!
  end
end

Logging

Want to see what's going on? Enable logging.

RTurk.logger.level = Logger::DEBUG

Nitty Gritty

Here's a quick peek at what happens on the Mechanical Turk side.

A worker takes a look at your hit. The page will contain an iframe with your question URL loaded inside of it.

If you want to use an Amazon-hosted QuestionForm, do

hit.question_form "<Question>What color is the sky?</Question>" # not the real format

Amazon will append the AssignmentId parameter to the URL for your own information. In preview mode this will look like:

http://myapp.com/turkers/add_tags?item_id=1234&AssignmentId=ASSIGNMENT_ID_NOT_AVAILABLE

If the Turker accepts the HIT, the page will reload and the iframe URL will resemble

http://myapp.com/turkers/add_tags?item_id=1234&AssignmentId=1234567890123456789ABC

The form in your page MUST CONTAIN the AssignmentId in a hidden input element. You could do this on the server side with a Rails app, or on the client side with JavaScript (check the examples).

Anything submitted in this form will be sent to Amazon and saved for your review later.
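
A minimal server-side sketch of the hidden-field requirement described above, using only Ruby's standard library. The helper names (assignment_id_from, preview?, hidden_assignment_field) are hypothetical and not part of RTurk, and the field name assignmentId is an assumption that should be checked against Amazon's ExternalQuestion documentation. The parameter lookup is case-insensitive so that either spelling of the key is accepted.

```ruby
require 'erb'
require 'uri'

# Placeholder value Amazon sends while the worker is only previewing the HIT.
PREVIEW_ASSIGNMENT_ID = 'ASSIGNMENT_ID_NOT_AVAILABLE'

# Extract the assignment ID from the URL loaded in the iframe.
# Hypothetical helper; matches the key case-insensitively.
def assignment_id_from(url)
  query = URI.parse(url).query.to_s
  pair = URI.decode_www_form(query).find { |key, _| key.casecmp('assignmentid').zero? }
  pair && pair.last
end

# True while the worker has not yet accepted the HIT (or no ID was sent).
def preview?(assignment_id)
  assignment_id.nil? || assignment_id == PREVIEW_ASSIGNMENT_ID
end

# Render the hidden input that carries the ID back with the form submission.
# The field name is an assumption, not taken from this README.
def hidden_assignment_field(assignment_id)
  value = ERB::Util.html_escape(assignment_id.to_s)
  %(<input type="hidden" name="assignmentId" value="#{value}">)
end

url = 'http://myapp.com/turkers/add_tags?item_id=1234&AssignmentId=1234567890123456789ABC'
id = assignment_id_from(url)
puts hidden_assignment_field(id) unless preview?(id)
```

In preview mode it is usually friendlier to disable the submit button than to render a form that cannot be submitted.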

Testing

bundle install
rake

More information

Take a look at the Amazon MTurk developer docs for more information. They have a complete list of API operations, many of which can be called with this library.

Mark gave a presentation about RTurk at the Atlanta Ruby User Group that got recorded as a 20-minute screencast video.

Contributors

Zach Hale David Balatero Rob Hanlon Haris Amin Tyler David Dai Alex Chaffee

rturk's People

Contributors

abscondment, alexch, bobbytables, c0r0nel, dbalatero, denniskuczynski, eggie5, hamin, hampei, karimn, mdp, robhanlon22, seeingidog, tansey, zachhale

rturk's Issues

How to set assignment time?

hit.duration= sets the amount of time the HIT will run for, but how do I set the amount of time a worker will have to complete the HIT? Thank you

X-Frame-Options from MTurk stopping the submit frame from loading

I am using RTurk in a Rails 4 app, trying to send custom HITs to the requester sandbox. It seems that the response from the form submission cannot be displayed in an iframe because of the response header 'X-Frame-Options'. I ran into the same console error when trying to get my HIT to display, but since the HIT was hosted server side it was easy to change the header. How do I stop MTurk from sending this header, or is it something I need to do server side?

Here is the error:
Refused to display 'https://workersandbox.mturk.com/mturk/externalSubmit' in a frame because it set 'X-Frame-Options' to 'SAMEORIGIN'.

Really sorry if this is not an issue for this gem!

License missing from gemspec

RubyGems.org doesn't report a license for your gem. This is because it is not specified in the gemspec of your last release.

You can specify one in the gemspec, e.g.

spec.license = 'MIT'
# or
spec.licenses = ['MIT', 'GPL-2']

Including a license in your gemspec is an easy way for rubygems.org and other tools to check how your gem is licensed. As you can imagine, scanning your repository for a LICENSE file or parsing the README, and then attempting to identify the license or licenses is much more difficult and more error prone. So, even for projects that already specify a license, including a license in your gemspec is a good practice. See, for example, how rubygems.org uses the gemspec to display the rails gem license.

There is even a License Finder gem to help companies/individuals ensure all gems they use meet their licensing needs. This tool depends on license information being available in the gemspec. This is an important enough issue that even Bundler now generates gems with a default 'MIT' license.

I hope you'll consider specifying a license in your gemspec. If not, please just close the issue with a nice message. In either case, I'll follow up. Thanks for your time!

Appendix:

If you need help choosing a license (sorry, I haven't checked your readme or looked for a license file), GitHub has created a license picker tool. Code without a license specified defaults to 'All rights reserved'-- denying others all rights to use of the code.
Here's a list of the license names I've found and their frequencies

P.S. In case you're wondering how I found you and why I made this issue, it's because I'm collecting stats on gems (I was originally looking for download data) and decided to collect license metadata, too, and make issues for gemspecs not specifying a license as a public service :). See the previous link or my blog post about this project for more information.

hit.max_assignments has no effect

So I've been using rturk to post HITs to Mechanical Turk with no problem so far except that hit.max_assignments isn't setting the correct amount of assignments per worker.

RTurk.setup(YourAWSAccessKeyId, YourAWSAccessKey, :sandbox => true)
hit = RTurk::Hit.create(:title => "My Hit") do |hit|
  hit.max_assignments = 1
  hit.assignments = 29
  hit.description = description
  hit.question("myapp.com",
               :frame_height => 500)
  hit.reward = 0.01
  hit.keywords = ['fun','easy','research','search']
  hit.lifetime = 1800
end

When I search for my HIT as a worker it appears in the search results, but says only one HIT is available. But when I go to my requester dashboard it says that I have the correct amount of assignments left. Is this a bug, or am I just missing something? Thanks!

Allow non-global configuration

Is it possible to configure rturk on a per-request or per-client basis? I need to use multiple AWS credentials and a mix of sandbox and real environments.

IAM User vs AWS User

AWS has changed their Security Credentials and do not allow you to retrieve or create new AWS security creds:

image

However, Mechanical Turk (and rturk) don't allow you to use IAM users:

image

Any idea how to get around this?

Looking for new maintainer

I've not actively used RTurk since early 2010, so I doubt I'm the best person to keep maintaining this gem.

If anyone wants to volunteer for the role, I only ask a couple things:

  • You currently use the RTurk gem
  • You have a decent level of experience maintaining a Ruby gem (Tests!)

All this could be yours 😄 Just let me know!

Use HTTPS to make requests

Amazon will be deprecating HTTP calls in January 2012. RTurk needs to change to HTTPS calls. I'm putting this here so I don't forget.

Need post option for long strings

When calling CreateQualificationType with a large test, I kept getting a 400 Request Error (shorter tests worked fine). Finally patched requester.rb as follows:

< RTurk.logger.debug "Sending request:\n\t #{credentials.host}?#{querystring}"
< RestClient.get("#{credentials.host}?#{querystring}")
---
> RTurk.logger.debug "Posting request to #{credentials.host}:\n\t #{params.inspect}"
> RestClient.post(credentials.host.to_s, params)

and then they submit fine. Maybe this should be an option if the params are long or called explicitly from a higher layer?
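
One way to picture the "POST only when the params are long" option is the sketch below. It uses Ruby's standard Net::HTTP rather than RestClient, builds the request without sending it, and is not RTurk's requester code. The build_request helper and the MAX_QUERY_LENGTH cutoff are hypothetical; the real limit depends on the server and any proxies in between.

```ruby
require 'net/http'
require 'uri'

# Hypothetical cutoff, chosen only for illustration.
MAX_QUERY_LENGTH = 2_000

# Build (but do not send) a request: GET for short parameter sets, POST otherwise.
def build_request(endpoint, params, max_query_length: MAX_QUERY_LENGTH)
  uri = URI.parse(endpoint)
  query = URI.encode_www_form(params)

  if query.length <= max_query_length
    uri.query = query
    Net::HTTP::Get.new(uri)
  else
    request = Net::HTTP::Post.new(uri)
    request.set_form_data(params) # form-encodes params into the request body
    request
  end
end

short = build_request('https://example.com/', { 'Operation' => 'GetAccountBalance' })
long  = build_request('https://example.com/', { 'Operation' => 'CreateQualificationType',
                                                'Test' => 'x' * 5_000 })
puts short.method # GET
puts long.method  # POST
```

Always using POST would be simpler; the length check only matters if GET requests need to stay the default.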

Search for non-expired hits

RTurk::Hit.all returns all hits (well, 100 of them). Is it possible to search for only non-expired hits? Also, how can we return more than 100 hits, since Hit.all doesn't take any parameters?

RequiredToPreview cannot be set to false

In qualification_builder.rb, need

43a44,47
>     if opts.has_key?(:required_to_preview)
>       qualifier[:RequiredToPreview] = opts[:required_to_preview]
>     end
57c61
< params["RequiredToPreview"] = qualifier[:RequiredToPreview] || 'true'
---
> params["RequiredToPreview"] = qualifier.has_key?(:RequiredToPreview) ? qualifier[:RequiredToPreview] : 'true'

Also, the comments about how to add a qualification requirement seemed a little misleading to me. Maybe you want to make it more flexible now for custom requirement types? I modified the comments as follows in qualifications_builder.rb:

20,21c20,21
< # Needs a type name(you can reference this later)
< # and the operation as a hash: ':gt => 85'
---
> # Needs a type name (you can reference this later) or your custom type ID
> # and the operation as a hash: ':gt => 85' ; NOTE, though the first key of the hash is special and must be the comparator
23c23
< # qualifications.add('EnglishSkillsRequirement', :gt => 66, :type_id => '1234567890123456789ABC')
---
> # qualifications.add('1234567890123456789ABC', :gt => 66, :required_to_preview => false )
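
The reason the original line cannot produce false is that `||` falls through to the default for any falsy value, not only for a missing key. The sketch below isolates that behavior with two hypothetical helper methods (not RTurk code) and shows Hash#fetch as an equivalent to the has_key? ternary in the patch.

```ruby
# Stand-in for the line being patched: `||` treats false like a missing key.
def required_to_preview_with_or(qualifier)
  qualifier[:RequiredToPreview] || 'true'
end

# Hash#fetch only uses the default when the key is absent.
def required_to_preview_with_fetch(qualifier)
  qualifier.fetch(:RequiredToPreview, 'true')
end

explicit_false = { :RequiredToPreview => false }
p required_to_preview_with_or(explicit_false)    # => "true"
p required_to_preview_with_fetch(explicit_false) # => false
p required_to_preview_with_fetch({})             # => "true"
```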

Allow more than 100 assignments to be returned

I'm having a difficult time retrieving more than 100 assignments from any specific HIT. I wonder if this is a similar issue as #19?

Essentially, I'm always returned page 1 of results, no matter which page I request.

a1 = RTurk::GetAssignmentsForHIT(hit_id: 'HIT_ID', page_size: 10, page: 1)
a2 = RTurk::GetAssignmentsForHIT(hit_id: 'HIT_ID', page_size: 10, page: 2)

# a1 == a2

Also, RTurk::Hit#assignments will only allow you to iterate through the first 100 results.

hit = RTurk::Hit.find HIT_ID
hit.assignments.each do |a|
  process_assignment a
end

It never makes it past 100 assignments

Any help would be greatly appreciated!

Allow more than 100 hits to be traversed

No time to do this right and fork this, but here's a monkey patch that hopefully will make its way into main.

module RTurk
  class Hit
    # Yield every HIT, fetching pages of 100 until all results are covered.
    def self.each(page_number = 1, &block)
      result = RTurk::SearchHITs.create(page_number: page_number, page_size: 100)
      total_num_results = result.xpath("//TotalNumResults").text.to_i
      result.hits.each do |hit|
        yield new(hit.id, hit)
      end
      each(page_number + 1, &block) if total_num_results > page_number * 100
    end
  end
end
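
The same idea can be written as a loop that returns an Enumerator, which avoids recursion and lets callers stop early. This is a sketch that runs without RTurk: each_paged is a hypothetical helper, and the fetch lambda is a stub standing in for a paged API call that returns one page of items plus the total result count.

```ruby
# fetch_page is any callable: (page_number, page_size) -> [items, total_count].
def each_paged(fetch_page, page_size: 100)
  Enumerator.new do |yielder|
    page = 1
    loop do
      items, total = fetch_page.call(page, page_size)
      items.each { |item| yielder << item }
      # Stop on an empty page or once the pages seen cover the reported total.
      break if items.empty? || page * page_size >= total
      page += 1
    end
  end
end

# Stub data source: 250 fake HIT ids served in slices.
FAKE_HIT_IDS = (1..250).map { |n| "HIT#{n}" }
fetch = lambda do |page, size|
  [FAKE_HIT_IDS[(page - 1) * size, size] || [], FAKE_HIT_IDS.size]
end

puts each_paged(fetch).count # 250
```

Because pages are fetched on demand, something like each_paged(fetch).first(10) only requests the first page.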

Make alias for removed CreateHIT methods

In a recent version of RTurk, the RTurk::CreateHIT instance methods note=, assignments=, and auto_approval= were removed and replaced with methods with slightly different names. This breaks the public RTurk::Hit.create, and specifically how users of RTurk set up new HITs.

Given the importance of this basic RTurk function, it would imho make sense to set up aliases pointing from the old methods to the new methods, at least until this important change could be, say, added to the Changelog with a deprecation notice. The change broke a bunch of my code, and while everything is now fixed on my end, it would be nice to either avoid breaking code using the old public methods (however poorly named) or at least to avoid doing so without giving notice beyond the github commit log.

It is nice to see RTurk development and improvement happening, don't get me wrong, I just wonder if we can't provide a bridge for old, working code to the shiny new world. Thanks for considering.
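
A bridge of this kind could look roughly like the sketch below. ExampleCreateHit is a hypothetical stand-in class, not RTurk's CreateHIT, and mapping assignments= to max_assignments= is an assumption based on the README's use of hit.max_assignments; the issue itself does not name the replacement methods.

```ruby
# Hypothetical stand-in class; not RTurk's actual CreateHIT implementation.
class ExampleCreateHit
  attr_accessor :max_assignments

  # Define old_name as a forwarding method that warns, then calls new_name.
  def self.deprecated_alias(old_name, new_name)
    define_method(old_name) do |*args|
      warn "#{old_name} is deprecated; use #{new_name} instead"
      public_send(new_name, *args)
    end
  end

  deprecated_alias :assignments=, :max_assignments=
end

hit = ExampleCreateHit.new
hit.assignments = 3      # prints a deprecation warning to stderr
puts hit.max_assignments # 3
```

Unlike a plain alias_method, the forwarding method leaves a visible trace in logs, so callers learn about the rename before the old name is dropped.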

Disposing of an unassignable hit

When I create a hit, and the task is accepted by the turker, but not yet completed by him/her, it seems that I am in a limbo where I cannot expire! or dispose! of it.

Is this the desired behavior? Maybe it makes sense for most cases, but I feel like it runs the risk of me having hits sitting around that I can't get rid of.

SSL

Has RTurk been updated to support Amazon's new SSL requirements for MTurk requests?

How to define multiple assignments

Looking at the README, I see that you define 2 assignments for the tagging hit, and later on the URL contains an item_id. I was wondering how one defines multiple assignments and the corresponding ids.

The examples have only one assignment, and the rdoc is pretty light.

Additional HIT details

It would be nice if, when getting details about a HIT, you would parse the external question and qualifications. It comes back with the XML, but it doesn't appear that it's being parsed into the RTurk::Hit adapter.
