Code Monkey home page Code Monkey logo

grobot's Introduction

GRobot Build Status

GRobot is still under development and I love to refactor.So,control your risks.

GRobot is a powerful web robot based on gevent. This project comes from Ghost.py,which I have rewrote most of the code inside, and changed its name to GRobot.

import gevent
from grobot import GRobot

def test():
    robot = GRobot()
    robot.open("http://www.yahoo.com")
    assert 'yahoo' in robot.content()

gevent.spawn(test).join()

What it can and can't do

Can do

  • Can set up the socks5/http{s} proxy.
  • Can simulate ALL the operations of hunman beings.
  • Can run webkit plugin.
  • Can evaluate javascript.
  • Can grab the web page as image.
  • Can run on a GUI-Less server(by install xvfb).

Can't do

  • Can't operate Flash.
  • Can't run without PyQt and gevent
  • Can't get back http status code.
  • Can't transport human to the Mars.

Installation

First you need to install PyQt.

In Ubuntu

sudo apt-get install python-qt4

Install gevent from the development version.

pip install http://gevent.googlecode.com/files/gevent-1.0b3.tar.gz#egg=gevent

Install Flask for unittest(optional).

pip install Flask

Install GRobot using pip.

pip install GRobot

How to use

Quick start

First of all, you need a instance of GRobot in greenlet:

import gevent
from grobot import GRobot

def test():
    robot = GRobot()
    #do something

gevent.spawn(test).join()

Element selector

Element selector tell GRobot which HTML element a command refers to. The format of a selector is:

selectorType=argument

We support the following strategies for locating elements:

  • identifier = id : Select the element with the specified @id attribute. If no match is found, select the first element whose @name attribute is id. (This is normally the default; see below.)

  • id = id : Select the element with the specified @id attribute.

  • name = name : Select the first element with the specified @name attribute.

  • xpath = xpathExpression : Locate an element using an XPath expression.

      xpath=//img[@alt='The image alt text']
      xpath=//table[@id='table1']//tr[4]/td[2]
      xpath=//a[contains(@href,'#id1')]
      xpath=//a[contains(@href,'#id1')]/@class
      xpath=(//table[@class='stylee'])//th[text()='theHeaderText']/../td
      xpath=//input[@name='name2' and @value='yes']
      xpath=//*[text()="right"]
    
  • link = textPattern : Select the link (anchor) element which contains text matching the specified pattern.

      link=The link text
    
  • css = cssSelectorSyntax : Select the element using css selectors. Please refer to CSS2 selectors for more information.

      css=a[href="#id3"]
      css=span#firstChild + span
    

Option Selector

When you using GRobot.select to deal with select tag,option selector tell GRobot which option you want. The format of a selector is:

selectorType=argument

We support the following strategies for locating elements:

  • text = text : Match options based on their the visible text.(This is normally the default.)
  • id = id : Match options based on their @id attribute.
  • name = name : Match options based on their @name attribute.
  • value = value : Match options based on their values.

Open a web page

GRobot provide a method that open web page the following way:

robot.open('http://my.web.page')

Web page actions

We have three shortcut method.

robot.reload()
robot.forward()
robot.back()

If you want more action,check in Qt Document

robot.trigger_action('SelectAll)

Execute javascript

Executing javascripts inside webkit frame is one of the most interesting features provided by GRobot:

result = robot.evaluate( "document.getElementById('my-input').getAttribute('value');")

As many other GRobot methods, you can pass an extra parameter that tells GRobot you expect a page loading:

robot.evaluate( "document.getElementById('link').click();", expect_loading=True)

Dealing with forms

Type the text

Simulating human typing.

robot.type('id=blog_content',u'Hello,world.I'm Tom.\n你好,世界.我是无名氏\n')

Type key by key,only ASCII character allowed.

robot.key_clicks('id=blog_content','Hello,world.\rToday is good for sleep.')

Select options

Select the options those match selector from select tag.

Single selectbox.

robot.select('name=sex','text=Male')

Multiple selectbox.

robot.select('id=like',[
    ('apple',True),
    ('orange',False),
    ('banana',True),
])

Set the checkbox

You can specify the state of checkbox.

Check it.

robot.check('id=agree')

Uncheck it.

robot.check('id=agree',False)

Click something

Click the first element which selected by selector.

robot.click("xpath=//input[@type='submit']")

Click a point of the absolute position (1500,36).

robot.click_at(1500,36)

Move the mouse

Move your mouse to first element which selected by selector.

robot.move_to('css=#button')

Move your mouse to the absolute position (500,300).

robot.move_at(500,300)

Setup input file

Selenium can't access the <input type='file'/> tag.You can't use selenium to set up a file type input.

robot.choose_file('id=file-upload', '/tmp/file')

Waiters

GRobot provides several methods for waiting for specific things before the script continue execution:

wait_for_page_loaded()

That wait until a new page is loaded.

robot.wait_for_page_loaded()

wait_for_selector(selector)

That wait until a element match the given css selector.

result = robot.wait_for_selector("ul.results")

wait_for_text(text)

That wait until the given text exists inside the frame.

result = robot.wait_for_text("My result")

Sample use case

Post a twitter

import gevent
import logging
from grobot import GRobot

USERNAME = 'your twitter username'
PASSWORD = 'your twitter password'

def main():

    robot = GRobot(display=True, log_level=logging.DEBUG, develop=False)
    robot.set_proxy('socks5://127.0.0.1:7070')
    robot.open('https://twitter.com')

    #Login
    robot.key_clicks('id=signin-email',USERNAME)
    robot.key_clicks('id=signin-password',PASSWORD)

    robot.click("xpath=//td/button[contains(text(),'Sign in')]",expect_loading=True)

    #Post a twitter
    robot.key_clicks("id=tweet-box-mini-home-profile","GRobot is too powerful.https://github.com/DYFeng/GRobot")

    #Wait for post success
    while 1:
        robot.click("xpath=//div[@class='module mini-profile']//button[text()='Tweet']")
        try:
            robot.wait_for_text('Your Tweet was posted')
            break
        except :
            #Something go wrong,refresh page.
            if 'refresh the page' in robot.content():
                robot.reload()

    # Wait forever.
    robot.wait_forever()

if __name__ == '__main__':
    gevent.spawn(main).join()

Browsing Google.com and find GRobot project

import gevent
import logging
from grobot import GRobot

def main():

    # Show the browser window.Open the webkit inspector.
    robot = GRobot(display=True, develop=False, log_level=logging.DEBUG, loading_timeout=10, operate_timeout=10)

    # In China,people can only using proxy to access google.
    robot.set_proxy('socks5://127.0.0.1:7070')

    #Open google
    robot.open('http://www.google.com/')

    #Type out project and search.
    robot.type('name=q','GRobot github')
    robot.click('name=btnK', expect_loading=True)

    for i in xrange(1, 10):
        # Waiting for the ajax page loading.

        robot.wait_for_xpath("//tr/td[@class='cur' and text()='%s']" % i)

        if u'https://github.com/DYFeng/GRobot' in robot.content():
            print 'The porject in page', i
            break

        # Click the Next link.We don't use expect_loading.Because it's ajax loading,not page loading.
        robot.click("xpath=//span[text()='Next']")

    else:
        print "Can not found.Make a promotion for it."

    # Wait forever.
    robot.wait_forever()

if __name__ == '__main__':
    gevent.spawn(main).join()

grobot's People

Contributors

bryant1410 avatar dyfeng avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.