surf's People

Contributors

cassiobotaro, charl, chlunde, cornerot, ethankent, eziscky, fede-bitlogic, haruyama, headzoo, jtwatson, lalyos, lennyxc, lestrrat, lox, lxt2, mashinamashina, mattn, mholt, nicot, noy, nxadm, shavit, sqs, tatsushid, tlianza, utahta, uwynell

surf's Issues

browser.Body() Error, replace with Text()

Open this page:
link := "https://player.vimeo.com/video/244405542/config"
Then, assuming:
bow := surf.NewBrowser(); bow.Open(link); res := bow.Body()

Here res is parsed incorrectly. It would be better to make
bow.Body() return doc.Find("body").Text()

Edit:
This works only for this document; for other documents it seems to include the rest of the body element.
The parsing problem is with the double quotes in the video->embed_code object.

Thank you

surf is incompatible with app engine

I see two solvable problems in getting surf to work with app engine.

We need this commit brought into master: 4015114. app engine is a sandbox environment and requires that any url fetching be done through their urlfetch package. They provide an http.RoundTripper implementation for this.

We need a way to disable the current agent code. On linux (which is what is used when building for app engine), the agent code expects to be able to use syscall, which isn't allowed in the app engine sandbox. Because surf.go calls the agent code when setting up its DefaultUserAgent var, there isn't a way around this without changing something in the surf library. For this, I propose adding an agent_appengine implementation that essentially returns some hard-coded values.

I'm glad to put together a PR for this. If you would be receptive to that, please let me know, and please let me know any thoughts you have about my proposed approach.

syscall.Utsname is OS dependent.

agent/agent.go uses syscall.Utsname{}
https://github.com/headzoo/surf/blob/master/agent/agent.go#L363

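But this is OS dependent; it does not work on Darwin (go 1.3.1).

Please use the uname command instead, for example:

    import (
        "os/exec"
        "strings"
    )

    // Ask uname for the kernel name; fall back to "linux" if the command fails.
    out, err := exec.Command("uname", "-s").Output()
    if err != nil {
        return "linux"
    }
    return strings.ToLower(strings.TrimSpace(string(out)))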

Use uname -s for osName and uname -r for osVersion.

However, uname -r shows the Linux kernel version, whereas agent_test.go expects a distribution version such as 14.04 for Ubuntu 14.04.

If you want to get a distribution version, you may use https://github.com/shirou/gopsutil and HostInfo().

Some problems with encoding.

Hello from Russia!
I'm going to try using Go instead of Ruby (mechanize), but I have a problem with encoding. When I fetch a page (e.g. http://google.com/), some Russian words are shown as a diamond with a question mark. What should I do?

P.S. I tried changing the response to utf-8 and cp1251...

Problem code:

a := surf.NewBrowser()
a.Open("http://google.com/")
fmt.Println(a.Body()) // For example

Here I get bad characters.

Ajax calls

Does Surf handle ajax and dynamic pages?

proxy: unknown scheme: http

I set up a proxy server on my local PC with nginx.
My configuration code is as follows:
err = bow.SetProxy("http://192.168.1.120:82")
if err != nil {
    fmt.Println(err)
}
When I run my program, it shows the error in the title.
Please help, thanks in advance!

Download

From my testing, downloading assets doesn't use cookies. For some sites, assets can't be accessed without cookies. Any chance this could be fixed?

Click a link that doesn't have a class name ?

Is it possible to click a link that doesn't have a class name like this ?

<a href="/some/kind/of/link">The Link</a>

I know you can do

bow.Click("a.new")
as in the docs, which finds any anchor with the class "new". But how do we click a link based only on its anchor tag and its text, such as "The Link"?

Click wants a selector string. Is there any way to get that from bow.Find ? I can get all the links with

bow.Find("a").Each(func(_ int, s *goquery.Selection) {
        if s.Text() == "The Link" {
            // how do I click on the link here?
        }
})

But then bow.Click wants a selection string and I have a goquery.Selection. How do I click the link once I've found it ?
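
One possible workaround, sketched here under assumptions: inside the Each callback, read the link target with s.Attr("href"), resolve it against the current page URL (bow.Url()), and pass the result to bow.Open. The resolving step needs only the standard library and is shown below; resolveHref is a hypothetical helper, not part of surf. Another issue on this page also uses a selector of the form a:contains("...") with bow.Click, which may be simpler when the link text is unique.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveHref turns a possibly relative href into an absolute URL,
// using pageURL as the base, the same way a browser would.
func resolveHref(pageURL, href string) (string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return "", err
	}
	ref, err := url.Parse(href)
	if err != nil {
		return "", err
	}
	return base.ResolveReference(ref).String(), nil
}

func main() {
	abs, err := resolveHref("http://example.com/a/b", "/some/kind/of/link")
	fmt.Println(abs, err)
}
```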

Request Header Too Large

https://github.com/TheInsideMan/SainsburysScraper/tree/header
(run instructions in README.md)

The script goes to a page, finds all the product links, and loops through them with .Each:

bow.Find("#page #main #content #productsContainer #productLister ul li .product .productInner").Each(func(i int, s *goquery.Selection) {
    title := strings.TrimSpace(s.Find(".productInfoWrapper  .productInfo h3 a").Text())
    click_err := bow.Click("a:contains(\"" + title + "\")")
    if click_err != nil {
        fmt.Println(click_err.Error())
    } else {
        tdesc := bow.Find("title").Text()
        // commonly fails here due to request header being too large
        if tdesc == "400 Bad Request" {
            fmt.Printf("%v: NOT FOUND!! -- %v\n", i, tdesc)
        }
    }
    // get some Text() from this page
    bow.Back()
    // I've had to add in the below to reset the request header
    // is there a better way of doing this so that the .Click() doesn't slow down
    // due to remote server resetting cookies?
    if i != 0 {
        if math.Mod(float64(i), 2) != 0 {
            c := bow.SiteCookies()
            cookieJar, _ := cookiejar.New(nil)
            casurl, _ := url.Parse(url_link)
            cookieJar.SetCookies(casurl, c)
            bow.SetCookieJar(cookieJar)
            // bow.DelRequestHeader("Cookie")
        }
    }
})

Basically, the remote server responds with 400 Bad Request because the request header is too large. Something seems to add a new set of cookies to the header with every bow.Click().

bow.DelRequestHeader("Cookie") does get around the issue, but it really slows the script down, because the cookies then need to be reset for every link!

If I could just keep to one set of cookies, I think the problem would be solved. Any ideas?
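
As a point of reference, the standard library's cookie jar on its own keeps a single entry per cookie name, domain and path, so repeated Set-Cookie responses do not pile up there. The sketch below illustrates that behaviour in isolation; it does not show where the duplicated header comes from in surf, and cookieCountAfterRepeatedSets is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/cookiejar"
	"net/url"
)

// cookieCountAfterRepeatedSets stores a cookie with the same name n times
// in a fresh jar and reports how many cookies the jar would send back.
func cookieCountAfterRepeatedSets(rawURL string, n int) (int, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return 0, err
	}
	jar, err := cookiejar.New(nil)
	if err != nil {
		return 0, err
	}
	for i := 0; i < n; i++ {
		c := &http.Cookie{Name: "session", Value: fmt.Sprintf("v%d", i)}
		jar.SetCookies(u, []*http.Cookie{c})
	}
	return len(jar.Cookies(u)), nil
}

func main() {
	count, err := cookieCountAfterRepeatedSets("http://example.com/", 5)
	fmt.Println(count, err)
}
```

If the jar behaves this way, the growth is more likely to come from cookies also being added to the request headers by hand, which would be worth checking first.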

install problem

When I do this:

go get gopkg.in/headzoo/surf.v2

I get the following:

# gopkg.in/headzoo/surf.v2
src\gopkg.in\headzoo\surf.v2\browser.go:197:31: not enough arguments in call to jar.NewHistoryState
        have (*http.Request, *http.Response)
        want (*http.Request, *http.Response, *goquery.Document)
src\gopkg.in\headzoo\surf.v2\surf.go:47:33: too many arguments in call to agent.Create
        have (string, string)
        want ()

Form submit hangs when downloading a large zip file

The following code hangs on f2.Submit(). It works when the download is small but for large downloads it hangs and never comes back.

 func (ct CT) GetArticles() {
	bow := surf.NewBrowser()
	bow.Open("https://clinicaltrials.gov/")

	f, _ := bow.Form("form[action='/ct2/results']")
	f.Input("home-search-query", "cancer")
	f.Submit()
	logger.Info(bow.Title())

	bow.Click("a#save-list-link")
	f2, err := bow.Form("form[action='/ct2/results/download']")
	if err != nil {
		logger.Error(err)
	}
	//f2.Set("down_stds", "all")
	f2.Set("down_typ", "study")
	f2.Set("down_flds", "all")
	f2.Set("down_fmt", "xml")
	err = f2.Submit()
	logger.Info(bow.ResponseHeaders())

	file, e := os.Create("./dt.zip")
	if e != nil {
		logger.Error(e)
	}
	defer file.Close()
	bow.Download(file)
}

Getting framebuster code on consent page while doing google oauth

I'm trying to do Google OAuth. I can submit the email and password forms, but when the browser navigates to the consent page I get framebuster code, and the browser is not able to get around it. This is what I get in the body:

<noscript>
  <meta http-equiv="refresh" content="0;url=/o/noscript">
</noscript>
<!-- framebuster code starts here -->
<style nonce="mblvj0tjsrS61nO4S0oBmILXhsY">
  plaintext {
    display: none
  }
</style>
<script nonce="mblvj0tjsrS61nO4S0oBmILXhsY">
  (function () {
    try {
      var win = this;
      while ("<plaintext>") {
        if (win.parent == win)
          break;
        eval("win.frameElement.src").substr(0, 1);
        win = win.parent;
      }
      if (win.frameElement != null) throw 'busted';
      document.write("\x3Cxmp style\x3Ddisplay:none\x3E");
    } catch (e) {
      try {
        if (!open(location, '_top'))
          alert('this content cant be framed');
        top.location = location;
      } catch (e) { }
    }
  })();
</script>
<!-- do not remove the plaintext nor xmp tags -->
<plaintext>
  <xmp>.</xmp>
  <!-- framebuster code ends here -->
  <div id="ogb">
    <div id=guser width=100%>
      <nobr>
        <span id=gbn class=gbi></span>
        <span id=gbf class=gbf></span>
        <b class=gb4>[email protected]</b> |
        <span id=gbe></span>
        <a target=_blank href="https://myaccount.google.com/?utm_source=OGB" class=gb4>My Account</a> |
        <a target=_top id=gb_71 href="https://accounts.google.com/Logout?continue=https://accounts.google.com/o/oauth2/auth?access_type%3Doffline%26approval_prompt%3Dforce%26"
          class=gb4>Sign out</a>
      </nobr>
    </div>
    <div class=gbh style=left:0></div>
    <div class=gbh style=right:0></div>
  </div>
  <div id="third_party_info_container">
    <div id="third_party_info" class="section_container" data-section="main">
      <div class="column">
        <div class="clear" id="grant_heading">
          <div id="parties_brand"></div>
          <a href="javascript:void(0);" id="developer_info_a" class="third_party_name_wide">
            <span class="goog-flat-menu-button-dropdown goog-inline-block third_party_dropdown"></span>test-oauth2</a> would like to:</div>
        <div id="scope_list">
          <div class="scope_spacer">
            <div class="scope_icon_container">
              <img class="icon" src="/o/static/4184593589-default_scope_icon.png" alt="">
            </div>
            <div class="scope_summary">Have offline access</div>
            <div class="info_icon_container">
              <a href="javascript:void(0);" class="more_info_icon" data-heading="" data-ok="OK">
                <img class="icon" src="/o/static/787932081-icon_info.png" alt="Click for more information">
              </a>
              <div class="more_info_detail">This app still has access to your account when your device is turned off.</div>
            </div>
          </div>
        </div>
      </div>
      <div id="approval_container">
        <div class="column">
          <div id="connect_container" class="modal-dialog-buttons button_container">
            <form id="connect-approve" action="https://accounts.google.com/o/oauth2/approval?"
              method="POST" style="display: inline;">
              <input type="hidden" name="bgresponse" id="bgresponse">
              <input type="hidden" id="_utf8" name="_utf8" value="☃">
              <input id="state_wrapper" type="hidden" name="state_wrapper" value="CnohQ2hSRU4xcHFSbFpqWHpoc2FrcFpVVU5MYkhCblpoSWZjell4UzBOeVltazRNM05YVlU5eFdtbGZRamR0VWkxVlZHRnJiMFpvV1HiiJlBQ1RoWnQ0QUFBQUFXbmpBd1B3QzM3dU51dkpDYzlTaHdjc3EzWkNjUkhPTxIVMTAwNjQ1MzA5NTM4NTE0MTc5OTk2GNjqjZnFvOTIMw">
              <input type="hidden" id="submit_access" name="submit_access" value="">
              <button id="submit_approve_access" type="submit" disabled tabindex="1" class="goog-buttonset-action">Allow</button>
              <button id="submit_deny_access" type="submit" disabled tabindex="2">Deny</button>
            </form>
            <div class="clear"></div>
          </div>
        </div>
      </div>
    </div>
  </div>
  <div style="display:none">
    <div id="developer_info_tooltip_content">
      <br/>Clicking "Allow" will redirect you to:
      <br/>http://localhost:3030/callback</div>
  </div>
  <div id="tooltip_bubble"></div>
  <script type="text/javascript" nonce="mblvj0tjsrS61nO4S0oBmILXhsY">window.onload = lso.dynamicAdjustHeight;</script>
  </body>

  </html>

As you can see, it's not valid HTML: the opening and closing tags don't match up, so the browser is not able to parse form#connect-approve. Has anyone else faced this problem? Is JavaScript not supported in this browser?

Add support for calling Browser.Post from initial state

The Post method of Browser references bow.Url(), which implies that the browser instance has already established a connection (via a GET call?) prior to the POST call. This causes a nil pointer dereference if you attempt to make a Post call with a newly instantiated Browser instance. My current workaround is to make a dummy get call to the same url prior to the POST call.

[todo] Support Tabs/Processes

I like that Surf is stateful. Actionable methods like Open() and Post() modify the state of the browser rather than returning some object representing the request. However I've run into some minor problems with this stateful approach. I would like to add support for tabs/processes so developers can configure the browser once, and then create any number of tabs/processes that do the actual work.

How a page is opened now:

bow := surf.NewBrowser()
err := bow.Open("http://golang.org")
if err != nil {
    panic(err)
}

How it would be opened with "tab" support.

// Create and configure a new browser.
bow := surf.NewBrowser()
bow.AddRequestHeader("User-Agent", "Surfbot 1.0")
bow.SetTransport(&http.Transport{})

// Create a new tab. The tab is essentially an instance
// of Browsable, and inherits its configuration from
// the parent browser.
tab := bow.NewTab()
err := tab.Open("http://golang.org")
if err != nil {
    panic(err)
}

Still debating using the term "tab", which may be taking the browser terminology too far, but on the upside the concept behind a tab -- and how it works -- should be easily understood by developers.

The feature may be treading into "over engineered" territory. Will have to give it some thought.

Reddit login not working

Reddit login example from Readme fails with "wrong password" message.

Is it because clicking Reddit "login" button seems to send an ajax request (in browser)?

Any ideas how to fix it?

No way to access the underlying transport mechanism

While attempting to manipulate an https url, I encountered problems with an unverified x509 certificate:

err = x509: certificate signed by unknown authority

Googling indicated that this is a known issue for some sites under OSX. The recommended fix was to adjust the http.Transport to ignore unverified certificates. Reference the following Google groups link:

https://groups.google.com/forum/#!topic/golang-nuts/v5ShM8R7Tdc

I modified my version of surf to expose a method to set the transport, and this fixes my issue locally. Would you mind reviewing my diff and considering it for integration into the mainline?

diff --git a/browser/browser.go b/browser/browser.go
index 48d8afe..01121b7 100644
--- a/browser/browser.go
+++ b/browser/browser.go
@@ -168,6 +169,10 @@ type Browser struct {

        // refresh is a timer used to meta refresh pages.
        refresh *time.Timer
+
+       // transport specifies the mechanism by which individual HTTP
+       // requests are made.
+       transport *http.Transport
 }

 // Open requests the given URL using the GET method.
@@ -423,6 +428,11 @@ func (bow *Browser) SetHeadersJar(h http.Header) {
        bow.headers = h
 }

+// SetTransport sets the http library transport mechanism for each request.
+func (bow *Browser) SetTransport(t *http.Transport) {
+       bow.transport = t
+}
+
 // AddRequestHeader sets a header the browser sends with each request.
 func (bow *Browser) AddRequestHeader(name, value string) {
        bow.headers.Add(name, value)
@@ -498,7 +508,7 @@ func (bow *Browser) Find(expr string) *goquery.Selection {

 // buildClient creates, configures, and returns a *http.Client type.
 func (bow *Browser) buildClient() *http.Client {
-       client := &http.Client{}
+       client := &http.Client{Transport: bow.transport}
        client.Jar = bow.cookies
        client.CheckRedirect = bow.shouldRedirect
        return client
diff --git a/surf.go b/surf.go
index f7ff95f..63f6c6f 100644
--- a/surf.go
+++ b/surf.go
@@ -2,6 +2,8 @@
 package surf

 import (
+       "net/http"
+
        "github.com/headzoo/surf/agent"
        "github.com/headzoo/surf/browser"
        "github.com/headzoo/surf/jar"
@@ -34,6 +36,7 @@ func NewBrowser() *browser.Browser {
                browser.MetaRefreshHandling: DefaultMetaRefreshHandling,
                browser.FollowRedirects:     DefaultFollowRedirects,
        })
+       bow.SetTransport(&http.Transport{})

        return bow
 }

No support for form input types checkbox, radio or select multi

There is a serious issue when handling forms that have checkboxes, radio buttons or select-multiple input types. If you submit a form that has checkboxes, they will all be checked when submitted, regardless of their state as downloaded. Forms with radio buttons will have the last button selected regardless of the original state. Disabled fields are submitted, etc.

I will be submitting a PR shortly to implement robust support for Checkboxes, Radio buttons and Select multiple input elements. It also handles disabled fields properly.

Proposal: A way to make contrived forms

Really enjoying surf so far. However, as mentioned in the TODO section of the readme, surf doesn't execute JavaScript. One of my projects right now involves logging in on a site where the login form is injected into the page by JavaScript. I can use the network inspector in Chrome to see what the form submit looks like, but the trouble is getting the form into the DOM for a browser that doesn't support JavaScript.

One thought I had would be to contrive a form that doesn't actually exist on the DOM but is submittable all the same. Basically, the user would invoke surf.NewFakeForm(action, method) which acts like a regular surf.Form except that methods like Input(name, value) don't check to make sure name already exists (it would create it).

Basically, the Form type might become an interface.

Would you accept a pull request that adds this functionality? And I'm still kind of new to this package, so if you know of a better way to accomplish what I need to do, feel free to let me know!

Versioning?

So currently there is a tag that points to v1:

3ab5b77 =>
gopkg.in/headzoo/surf.v1

and there is work on a v2 branch.

Thoughts on how we should proceed with point releases in between?

My suggestion would be to adopt semver, and then keep using gopkg.in:

When using branches or tags to version the GitHub repository, gopkg.in understands that a selector in the URL such as "v1" may be satisfied by a tag or branch "v1.2" or "v1.2.1" (vMAJOR[.MINOR[.PATCH]]) in the repository, and will select the highest version satisfying the requested selector.
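
To make the quoted selection rule concrete, here is a small sketch that picks the highest vMAJOR[.MINOR[.PATCH]] tag for a given major version. It is an illustration of the rule as quoted above, not gopkg.in's actual implementation; parseVersion, less and highestForMajor are hypothetical names.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion parses "v1", "v1.2" or "v1.2.1" into {major, minor, patch}.
// Missing parts are treated as 0; anything else is rejected.
func parseVersion(tag string) ([3]int, bool) {
	var v [3]int
	if !strings.HasPrefix(tag, "v") {
		return v, false
	}
	parts := strings.Split(tag[1:], ".")
	if len(parts) > 3 {
		return v, false
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return v, false
		}
		v[i] = n
	}
	return v, true
}

// less reports whether version a sorts before version b.
func less(a, b [3]int) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}
	return false
}

// highestForMajor returns the tag with the highest version whose
// major number equals major, ignoring tags that are not versions.
func highestForMajor(major int, tags []string) (string, bool) {
	var best [3]int
	bestTag, found := "", false
	for _, t := range tags {
		v, ok := parseVersion(t)
		if !ok || v[0] != major {
			continue
		}
		if !found || less(best, v) {
			best, bestTag, found = v, t, true
		}
	}
	return bestTag, found
}

func main() {
	tags := []string{"v1", "v1.2", "v1.2.1", "v2.0", "master"}
	fmt.Println(highestForMajor(1, tags))
}
```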

Support the ability to use hidden form controls (input, button)

Is there a way in Surf to make hidden elements visible so that they can be set and submitted with the form? I'm working with a form that hides/shows fields using JavaScript, and I cannot submit it. Are there any options? My code:

func (pubmed Pubmed) GetArticles() {
	bow := surf.NewBrowser()
	bow.Open("https://www.ncbi.nlm.nih.gov/pubmed/")

	f, err := bow.Form("form[name='EntrezForm']")
	if err != nil {
		logger.Error(err)
	}
	f.Input("term", QUERY)
	err = f.Submit()
	logger.Error(err)
	logger.Info(bow.Title())

	err = bow.Click("a.tgt_dark:hidden")
	logger.Error(err)
	f, err = bow.Form("form[name='EntrezForm']")
	logger.Error(err)

	f, err = bow.Form("form[name='EntrezForm']")
	err = f.Input("EntrezSystem2.PEntrez.PubMed.Pubmed_ResultsPanel.Pubmed_DisplayBar.SendTo", "File")
	logger.Error(err)
	err = f.Set("EntrezSystem2.PEntrez.PubMed.Pubmed_ResultsPanel.Pubmed_DisplayBar.FFormat", "xml")
	logger.Error(err)
	err = f.Click("EntrezSystem2.PEntrez.PubMed.Pubmed_ResultsPanel.Pubmed_DisplayBar.SendToSubmit")
	logger.Error(err)
	logger.Info(bow.ResponseHeaders())


	file, e := os.Create("./pubmed.xml")
	if e != nil {
		logger.Error(e)
	}
	defer file.Close()
	bow.Download(file)
}

Add support for recognizing text/csv content-type during download

Hello,

I'm using surf to log in to a website and download a CSV report. My problem is that the report is downloaded as an HTML file, not as plain text. After running

err := bow.Open(reportURL)
if err != nil {
    return err
}
f, err := os.Create(output)
if err != nil {
    return err
}
defer f.Close()
fmt.Println(bow.ResponseHeaders())
i, err := bow.Download(f)

I get a file which is prepended with <html><head></head><body>, some symbols are HTML-escaped, etc.

When I print the headers,

fmt.Println(bow.ResponseHeaders())

I get:

map[Date:[Wed, 18 Feb 2015 02:27:13 GMT] 
Content-Disposition:[attachment; filename=source.csv] 
Content-Type:[text/csv; charset=ISO-8859-1] 
X-Cnection:[close]]

Looks like content-type text/csv is not recognized. When I follow the reportURL link with a browser, I get the file downloaded properly.

Any advice on the best way to download the file properly? Or maybe it's a bug/feature request...
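
One way a client could decide whether a response should be treated as HTML at all is to parse the Content-Type header with mime.ParseMediaType, as in this minimal sketch. It is only an illustration of the check; isHTML is a hypothetical helper and nothing here reflects how surf currently handles downloads.

```go
package main

import (
	"fmt"
	"mime"
)

// isHTML reports whether a Content-Type header value denotes an HTML
// document. Parameters such as charset are ignored.
func isHTML(contentType string) bool {
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return false
	}
	return mediaType == "text/html" || mediaType == "application/xhtml+xml"
}

func main() {
	fmt.Println(isHTML("text/csv; charset=ISO-8859-1"))
	fmt.Println(isHTML("text/html; charset=utf-8"))
}
```

A response for which isHTML is false could then be written to disk byte for byte instead of being passed through an HTML parser.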

Browser default page state

I would like to know whether the headless browser produced by the surf package behaves like well-known browsers. I mean, does it allow the page to load and run its own JavaScript for additional UI rendering? I can't seem to find a specific DOM element after logging in to Facebook.

Is it possible to submit form if any form field has no "name" attribute?

The current implementation uses the field's name attribute to create the url.Values, which is correct according to the W3C standards. Is it nevertheless possible to submit a form even if a field has no name attribute?

Say we have this

<form class='some-form'>
        <textarea id='someid'></textarea>
       <button id='someid' type="button" onclick="someJSsubmit()"></button>
</form>

Is it possible to populate the textarea and then click the button?

[todo] Add Middleware/Plugin Support

It would be nice if Surf had a built in mechanism for middleware/plugins. It would be nice if developers could set plugin modules that perform common tasks and/or add functionality. For example:

  • OAuth support
  • Logging into popular sites (Github, Reddit, GMail, etc)

Possible syntax:

// GithubLogin implements a Surf plugin for Github authentication.
// When requests are made to the github.com domain, the plugin
// performs user authentication automatically if the user isn't
// logged in already.
type GithubLogin struct {
    username string
    password string
}

... implement methods

func main() {
    bow := NewBrowser()

    login := middleware.NewGithubLogin("username", "password")
    bow.AddMiddleWare(login)

    err := bow.Open("https://github.com")
    if err != nil {
        panic(err)
    }
}

Use a File interface to allow alternative source for file upload

Currently in order to POST a multipart message, I create a temporary file and upload that. However, instead of requiring a file to be created on disk, there should be a mechanism to allow arbitrary data sources that implement a common interface. I propose https://golang.org/pkg/mime/multipart/#File.

type File interface {
        io.Reader
        io.ReaderAt
        io.Seeker
        io.Closer
}

This exposes enough information to construct the body buffer I use in the http library. This way, I can write a filestring as follows:

type filestring struct {
	*strings.Reader
}
func (fs *filestring) Close() error {
	return nil
}
func NewFileString(s string) *filestring {
	return &filestring{strings.NewReader(s)}
}

Can't work with "gzip, deflate"

When I add an http.Header like this:

head := http.Header{}
head.Set("Accept", "*/*")
head.Set("Accept-Encoding", "gzip, deflate")
head.Set("Connection", "keep-alive")
bow.SetHeadersJar(head)

the Body method only returns garbled text like:

�Z�SW�&lt;�W��Af����ĔF&#39;�[)�jz���=v��C��D1�h⃨(����H4�A�Y�g�俰���s�l�R\�}���{��BߎG��?v�+c%�ا���d�D�@2y ��;�A��a.�HqyMPtِUE(%�?�9

Many other methods (like Title and Links) don't return the expected results either.

Screenshots

Is it possible to add support for screenshot generation?

Any interest in a collaborator?

There seem to be a lot of open PRs and issues that could pretty easily be brought in without breaking BC.

I end up using Surf quite a bit in various projects, would be quite happy to help out if you are inclined @headzoo.

Unbounded History growth can consume lots of memory

I use this lib in production for large batch updates where we may make 100,000+ calls in one run. Memory usage exceeded 8 GB before the program was terminated by the OS.

Researching this issue, I found it was simply the History object growing indefinitely until the Browser object was released, at which point the GC would clean up.

I implemented some optional methods and default settings that do not change the current behavior, but make it simple to limit the growth of the History or just clear the history at will.

PR is on its way.

I can't manually set the user agent when a page redirects

Hello,
Sorry, my English is poor; I am using Google Translate.
When a page redirect happens, the user agent is reset.
Please add the req.Header.Set line below to shouldRedirect in browser.go:

func (bow *Browser) shouldRedirect(req *http.Request, _ []*http.Request) error {
    if bow.attributes[FollowRedirects] {
        // Re-apply the configured user agent on every redirect.
        req.Header.Set("User-Agent", bow.userAgent)
        return nil
    }
    return errors.NewLocation(
        "Redirects are disabled. Cannot follow '%s'.", req.URL.String())
}
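
The same idea can be tried outside surf with a plain http.Client, whose CheckRedirect hook receives the upcoming request and may change its headers. The sketch below uses a local test server; newClient, newRedirectServer, finalUserAgent and the "Surfbot 1.0" value are illustrative assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newClient returns a client that re-applies userAgent on every redirect.
func newClient(userAgent string) *http.Client {
	return &http.Client{
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			// A custom CheckRedirect replaces the default 10-redirect limit,
			// so enforce a limit here as well.
			if len(via) >= 10 {
				return errors.New("stopped after 10 redirects")
			}
			req.Header.Set("User-Agent", userAgent)
			return nil
		},
	}
}

// newRedirectServer redirects every path to /final, which echoes the
// User-Agent header it received.
func newRedirectServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/final", func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, r.UserAgent())
	})
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "/final", http.StatusFound)
	})
	return httptest.NewServer(mux)
}

// finalUserAgent follows redirects from url and returns the echoed agent.
func finalUserAgent(client *http.Client, url string) (string, error) {
	resp, err := client.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return string(b), err
}

func main() {
	srv := newRedirectServer()
	defer srv.Close()
	fmt.Println(finalUserAgent(newClient("Surfbot 1.0"), srv.URL))
}
```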

HTTPS Insecure Certificates

Hi, it's just an idea...
I had to modify browser.go file (buildClient func) to work with insecure certificates urls (https):

import (
    "crypto/tls"
)

func (bow *Browser) buildClient() *http.Client {
    tr := &http.Transport{
        TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
    }
    client := &http.Client{Transport: tr}
    client.Jar = bow.cookies
    client.CheckRedirect = bow.shouldRedirect
    return client
}

Would be nice if it were configurable...
:)

Can't get surf.v2

gopkg.in/headzoo/surf.v2
......\gopkg.in\headzoo\surf.v2\browser.go:197:31: not enough arguments in call to jar.NewHistoryState
have (*http.Request, *http.Response)
want (*http.Request, *http.Response, *goquery.Document)
......\gopkg.in\headzoo\surf.v2\surf.go:47:33: too many arguments in call to agent.Create
have (string, string)
want ()

Update gopkg.in/headzoo/surf.v1

Hi,

The documentation points to gopkg.in/headzoo/surf.v1 to retrieve surf. However, this tag does not include more recently accepted PRs (like mine: #52).

Is it possible to update gopkg.in/headzoo/surf.v1 to a more recent commit, or to create a new tag (e.g. v1.1)?

Thank you,

C.
