generalmills / pytrends
Pseudo API for Google Trends
License: Other
"iron"
will have a drop down of "Iron Chemical Element, Iron Cross, Iron Man, etc"
Using Fiddler or some other proxy tool we can find the request that gets the suggested terms. This would be an additional functionality.
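A minimal sketch of what such a helper could look like. The endpoint URL below is a guess that a proxy capture would need to confirm, and the `)]}'` guard handling reflects a common pattern in Google's JSON endpoints, not a documented contract:

```python
import json
from urllib.parse import quote

# Hypothetical endpoint: the exact URL would come from the proxy capture.
AUTOCOMPLETE_URL = "https://trends.google.com/trends/api/autocomplete/{}"

def build_suggestions_url(keyword):
    """Build the request URL for a keyword's suggestion drop-down."""
    return AUTOCOMPLETE_URL.format(quote(keyword))

def parse_trends_json(raw):
    """Google's JSON endpoints often prepend a )]}' guard line;
    skip ahead to the first '{' before decoding."""
    return json.loads(raw[raw.index("{"):])
```

Fetching the URL with any HTTP client and running the body through `parse_trends_json` would then yield the suggestion list, if the captured endpoint matches.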
MINOR
From testing, the date parameters appear to have changed; only the Daily and Hourly patterns seem affected:
Daily: date="today 12-m"
Hourly: date="now 1-H"
I cloned this repository to get a sense of the data that is returned. I tried running example.py and got this error: File "example.py", line 19, in <module> connector.csv(path, "pizza").
Looking through pyGTrends.py, I noticed the pyGTrends class has a method called save_csv(). When I changed line 19 of example.py to connector.save_csv(path, "pizza"), it successfully saved a file.
I don't think my fix is polished enough for a commit, so I'll just leave this here:
While running the included example I immediately hit an SSL error:
URLError: urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590)
I fixed this by editing pyGTrends.py:
on line 11, add:
import ssl
on line 18, extend the urllib2 import to include HTTPSHandler:
from urllib2 import build_opener, HTTPCookieProcessor, HTTPSHandler
then in def _connect in pyGTrends.py:
on line 59, add:
self.hn = HTTPSHandler(context=ssl._create_unverified_context())
and change line 60 to:
self.opener = build_opener(HTTPCookieProcessor(self.cj), self.hn)
I hope this helps!
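Putting the steps above together, a version-agnostic sketch of the workaround might look like this. Note that it disables certificate verification entirely, so it should be treated strictly as a local debugging fix, not a committed change:

```python
import ssl

# Works on both Python 2 (urllib2) and Python 3 (urllib.request).
try:
    from urllib2 import build_opener, HTTPCookieProcessor, HTTPSHandler
    from cookielib import CookieJar
except ImportError:
    from urllib.request import build_opener, HTTPCookieProcessor, HTTPSHandler
    from http.cookiejar import CookieJar

cj = CookieJar()
# WARNING: disables certificate verification; use only as a workaround.
https_handler = HTTPSHandler(context=ssl._create_unverified_context())
opener = build_opener(HTTPCookieProcessor(cj), https_handler)
```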
Hello everyone,
I had this problem, with the traceback you can see below. It seems to be the same problem described here: #30
I did what was suggested there and updated pytrends through pip, but it didn't solve the problem. I'm using Python 3.5.0 and the reporter in that issue had 3.5.1; could that be the difference?
TRACEBACK:
/Library/Frameworks/Python.framework/Versions/3.5/bin/python3.5 "/Users/santanna_santanna/PycharmProjects/Predictive Models/APIGtrends.py"
Traceback (most recent call last):
File "/Users/santanna_santanna/PycharmProjects/Predictive Models/APIGtrends.py", line 19, in
connector = pyGTrends(google_username, google_password)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pytrends/pyGTrends.py", line 41, in init
self.fake_ua = UserAgent()
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/fake_useragent/fake.py", line 10, in init
self.data = load_cached()
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/fake_useragent/utils.py", line 140, in load_cached
update()
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/fake_useragent/utils.py", line 135, in update
write(load())
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/fake_useragent/utils.py", line 92, in load
browsers_dict[browser_key] = get_browser_versions(browser)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/fake_useragent/utils.py", line 53, in get_browser_versions
html = get(settings.BROWSER_BASE_PAGE, browser)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/fake_useragent/utils.py", line 20, in get
return urlopen(url).read()
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 162, in urlopen
return opener.open(url, data, timeout)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 471, in open
response = meth(req, response)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 581, in http_response
'http', request, response, code, msg, hdrs)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 509, in error
return self._call_chain(*args)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 443, in _call_chain
result = func(*args)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 589, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
Traceback (most recent call last):
File "general.py", line 22, in
connector = pyGTrends(google_username, google_password)
File "/Library/Python/2.7/site-packages/pytrends/pyGTrends.py", line 51, in init
self._connect()
File "/Library/Python/2.7/site-packages/pytrends/pyGTrends.py", line 67, in _connect
raise Exception('Cannot parse GALX out of login page')
Exception: Cannot parse GALX out of login page
Any thoughts?
Hi everyone,
first of all, thanks for the support on here; it's been super helpful! I just updated to 2.0.2 to solve the HTTP connection error problem. But when I run request_report() now, I get a TypeError: "request_report() got an unexpected keyword argument 'date'"
The code that triggers the error looks something like this:
search_string_date = str(month) + "/" + str(year) + " 36m"
# connect to Google
connector = pyGTrends(google_username, google_password)
# make request
connector.request_report(keys, date=search_string_date, geo="US")
Any ideas? I'm lost at this point. This worked perfectly before the Google connection problems and my update to 2.0.2, and I haven't changed anything.
Thanks in advance!
The 'toprelated' API function seems to pull data from the Google Trends "related queries" section. On the Google Trends website, one can configure the related queries section to show:
It seems the data pulled with the "toprelated" API function only includes the rising queries. It would be nice to be able to select the "top" queries as well.
This is not an issue, just wanted to reach out. Matt Reid here. I decided to retire my Bitbucket version of Google Trends, as I have been away for some time and you have fixed all the problems in the meantime. I did write a bunch of classes to manipulate the data inside the downloaded CSV file but didn't upload them. Do you want them? It would be nice to collaborate and make this more professional. My only worry is that Google changes the format... AGAIN... so there is not much point spending too much time, since we're at the mercy of the great and powerful Google.
Quick question about the export: when I set return_type = 'df', it returns empty. Any idea why? (return_type = 'json' works.)
Also, the export doesn't seem to include the relative search interest when multiple terms are included?
I have been using this Python script for a long time, but today it suddenly stopped working. I am getting a 400 Bad Request error and am no longer able to download any Google Trends CSV file from the script.
I'm also getting an error for the connector: "connector = pyGTrends(google_username, google_password)"
I think this is the main issue.
Hey,
Been banging my head against the wall trying to get data from Trends. I stumbled across this tool and was trying to test it out, but got an error when creating the connector.
I was using a google datalab notebook:
code:
from pytrends.pyGTrends import pyGTrends
import time
from random import randint
google_username = "[email protected]"
google_password = "XXX"
path = ""
# connect to Google
connector = pyGTrends(google_username, google_password)
Error:
---------------------------------------------------------------------------
HTTPError Traceback (most recent call last)
<ipython-input-11-be8cbc60619e> in <module>()
8
9 # connect to Google
---> 10 connector = pyGTrends(google_username, google_password)
11
/usr/local/lib/python2.7/dist-packages/pytrends/pyGTrends.pyc in __init__(self, username, password)
49 self.url_CookieCheck = 'https://www.google.com/accounts/CheckCookie?chtml=LoginDoneHtml'
50 self.url_PrefCookie = 'http://www.google.com'
---> 51 self._connect()
52
53 def _connect(self):
/usr/local/lib/python2.7/dist-packages/pytrends/pyGTrends.pyc in _connect(self)
69 params = urlencode(self.login_params).encode('utf-8')
70 self.opener.open(self.url_ServiceLoginBoxAuth, params)
---> 71 self.opener.open(self.url_CookieCheck)
72 self.opener.open(self.url_PrefCookie)
73
/usr/lib/python2.7/urllib2.pyc in open(self, fullurl, data, timeout)
435 for processor in self.process_response.get(protocol, []):
436 meth = getattr(processor, meth_name)
--> 437 response = meth(req, response)
438
439 return response
/usr/lib/python2.7/urllib2.pyc in http_response(self, request, response)
548 if not (200 <= code < 300):
549 response = self.parent.error(
--> 550 'http', request, response, code, msg, hdrs)
551
552 return response
/usr/lib/python2.7/urllib2.pyc in error(self, proto, *args)
467 http_err = 0
468 args = (dict, proto, meth_name) + args
--> 469 result = self._call_chain(*args)
470 if result:
471 return result
/usr/lib/python2.7/urllib2.pyc in _call_chain(self, chain, kind, meth_name, *args)
407 func = getattr(handler, meth_name)
408
--> 409 result = func(*args)
410 if result is not None:
411 return result
/usr/lib/python2.7/urllib2.pyc in http_error_302(self, req, fp, code, msg, headers)
654 fp.close()
655
--> 656 return self.parent.open(new, timeout=req.timeout)
657
658 http_error_301 = http_error_303 = http_error_307 = http_error_302
/usr/lib/python2.7/urllib2.pyc in open(self, fullurl, data, timeout)
435 for processor in self.process_response.get(protocol, []):
436 meth = getattr(processor, meth_name)
--> 437 response = meth(req, response)
438
439 return response
/usr/lib/python2.7/urllib2.pyc in http_response(self, request, response)
548 if not (200 <= code < 300):
549 response = self.parent.error(
--> 550 'http', request, response, code, msg, hdrs)
551
552 return response
/usr/lib/python2.7/urllib2.pyc in error(self, proto, *args)
473 if http_err:
474 args = (dict, 'default', 'http_error_default') + orig_args
--> 475 return self._call_chain(*args)
476
477 # XXX probably also want an abstract factory that knows when it makes
/usr/lib/python2.7/urllib2.pyc in _call_chain(self, chain, kind, meth_name, *args)
407 func = getattr(handler, meth_name)
408
--> 409 result = func(*args)
410 if result is not None:
411 return result
/usr/lib/python2.7/urllib2.pyc in http_error_default(self, req, fp, code, msg, hdrs)
556 class HTTPDefaultErrorHandler(BaseHandler):
557 def http_error_default(self, req, fp, code, msg, hdrs):
--> 558 raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
559
560 class HTTPRedirectHandler(BaseHandler):
HTTPError: HTTP Error 400: Bad Request
Do you think something changed on the Google side that caused this, or is it more likely my setup?
Cheers
Andy
Hi there.
I'm trying to use your script. However, when trying to log in with my credentials I get the following message:
"connector = pyGTrends(google_username, google_password)
Traceback (most recent call last):
File "", line 1, in
File "/home/persican/Downloads/pytrends-master/pyGTrends.py", line 37, in init
self._connect()
File "/home/persican/Downloads/pytrends-master/pyGTrends.py", line 64, in _connect
self.opener.open(self.url_CookieCheck)
File "/usr/lib/python3.4/urllib/request.py", line 461, in open
response = meth(req, response)
File "/usr/lib/python3.4/urllib/request.py", line 571, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python3.4/urllib/request.py", line 493, in error
result = self._call_chain(*args)
File "/usr/lib/python3.4/urllib/request.py", line 433, in _call_chain
result = func(*args)
File "/usr/lib/python3.4/urllib/request.py", line 676, in http_error_302
return self.parent.open(new, timeout=req.timeout)
File "/usr/lib/python3.4/urllib/request.py", line 461, in open
response = meth(req, response)
File "/usr/lib/python3.4/urllib/request.py", line 571, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python3.4/urllib/request.py", line 499, in error
return self._call_chain(*args)
File "/usr/lib/python3.4/urllib/request.py", line 433, in _call_chain
result = func(*args)
File "/usr/lib/python3.4/urllib/request.py", line 579, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request
"
Do you have an idea?
Regards,
Phil
trend_payload = {'q': ['pizza','macaroni'],'date':'now 7-d'}
df = pytrend.trend(trend_payload, return_type='dataframe')
print(df)
results in:
Traceback (most recent call last):
File "trend.py", line 16, in <module>
df = pytrend.trend(trend_payload, return_type='dataframe')
File "/usr/lib/python3.5/site-packages/pytrends-3.0.0-py3.5.egg/pytrends/request.py", line 81, in trend
File "/usr/lib/python3.5/site-packages/pytrends-3.0.0-py3.5.egg/pytrends/request.py", line 154, in _trend_dataframe
TypeError: 'NoneType' object is not subscriptable
I also tried:
trend_payload = {'q': ['pizza','macaroni'],'date':'today 7-d'}
with the same error result.
Hi, when I call the trend(payload) function on over 45 different pairs of keywords, the program keeps returning a "You have reached your quota limit. Please try again later." JSON response after about 10 successful calls. From what I found on StackOverflow, enabling the PREF cookie on google.com should solve the problem, but I don't know how to do that from Python. Could anyone provide a hint or a fix? Thanks!
Has anyone else run into the following when establishing the connection?
You have reached your quota limit. Please try again later.
This is after doing zero requests in the last 20 hours, so I'm not sure if it is related to possibly hitting trends too hard in a previous session or if this is a cookie issue.
I've attempted to establish a connection using proxies and different accounts and I get the same response back from the connector every time.
Any thoughts?
Unbelievably, it looks like Google Trends outputs comma-separated files that separate numerical values in the thousands with, yes, a comma:
Rising searches for Egg
dragon city,Breakout
яйцо,+1,650%
trung,+1,600%
яйца,+800%
...
This is malformed CSV that gets wrongly parsed in, say, MS Excel, and it throws an exception in parse_data(). I'll work on making the parsing code more robust, particularly against this issue. Hopefully it doesn't get too complex.
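One way to make the parser tolerant of this is to treat commas inside a percentage value as thousands separators rather than field separators. A sketch, where parse_trend_value is a hypothetical helper, not part of the library:

```python
import re

def parse_trend_value(cell):
    """Parse a rising-query value such as '+1,650%' or 'Breakout'.

    Returns an int percentage, or None for non-numeric cells like
    'Breakout'. Commas inside the number are thousands separators,
    not CSV field separators, so strip them before converting.
    """
    m = re.fullmatch(r'\+?([\d,]+)%', cell.strip())
    if not m:
        return None
    return int(m.group(1).replace(',', ''))
```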
If we use a keyword such as q="Tied tee shirt dress", Google Trends will not return any data, because there is too little search volume.
The Python script gives this error:
Python Error:
Traceback (most recent call last):
File "/usr/local/lib/python3.4/dist-packages/pytrends/request.py", line 79, in trend
self.results = json.loads(text)
File "/usr/lib/python3.4/json/init.py", line 318, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.4/json/decoder.py", line 343, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.4/json/decoder.py", line 361, in raw_decode
raise ValueError(errmsg("Expecting value", s, err.value)) from None
ValueError: Expecting value: line 1 column 1 (char 0)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "individual.py", line 18, in
df = pytrend.trend(trend_payload, return_type='dataframe')
File "/usr/local/lib/python3.4/dist-packages/pytrends/request.py", line 81, in trend
raise ResponseError(req.content)
File "/usr/local/lib/python3.4/dist-packages/pytrends/request.py", line 200, in init
self.server_error = BeautifulSoup(content, "lxml").findAll("div", {"class": "errorSubTitle"})[0].get_text()
IndexError: list index out of range
Could you fix this so that it returns 0 instead?
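Until the library handles this case itself, a small wrapper could swallow the error and return a default value (0, or an empty DataFrame). A sketch: safe_trend is a hypothetical helper, and the broad except stands in for the ResponseError shown in the traceback above:

```python
def safe_trend(pytrend, payload, empty=None):
    """Hypothetical wrapper: instead of raising when Google returns no
    data for a rare keyword, return `empty` (e.g. 0 or an empty frame)."""
    try:
        return pytrend.trend(payload, return_type='dataframe')
    except Exception:  # stands in for pytrends' ResponseError
        return empty
```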
Thanks for updating the code; the connector works now. However, it seems the request_report automated link doesn't work.
When I use the example payload = {'q': ['Pizza, Italian, Spaghetti, Breadsticks, Sausage'], 'cat': '0-71'}, the report link is https://accounts.google.com/logout?hl=en-US&continue=http%3A%2F%2Fwww.google.com%2Ftrends%23cmpt%3Dq%26q%3DPizza%2C%2BItalian%2C%2BSpaghetti%2C%2BBreadsticks%2C%2BSausage%26hl%3Den-US%26cat%3D0-71%26content%3D1
which doesn't seem to be a valid download link. Is anyone else having the same problem?
Hi,
the code worked fine in the past, but since last week I get an error message: "HTTPError: Temporary Redirect". There seems to be a problem with __init__, or more precisely with self._authenticate(username, password).
Does anyone have the same kind of problem? What can I do to handle this?
Best,
Lari
Here is the complete output of the console:
File "", line 5, in
downloader = pyGoogleTrendsCsvDownloader(google_username, google_pass)
File "Y:/Python_Codes/pyGoogleTrendsCsvDownloaderNew.py", line 67, in init
self._authenticate(username, password)
File "Y:/pyGoogleTrendsCsvDownloaderNew.py", line 114, in _authenticate
self.opener.open(self.url_authenticate, params)
File "C:\Python27\lib\urllib2.py", line 437, in open
response = meth(req, response)
File "C:\Python27\lib\urllib2.py", line 550, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Python27\lib\urllib2.py", line 469, in error
result = self._call_chain(*args)
File "C:\Python27\lib\urllib2.py", line 409, in _call_chain
result = func(*args)
File "C:\Python27\lib\urllib2.py", line 635, in http_error_302
new = self.redirect_request(req, fp, code, msg, headers, newurl)
File "C:\Python27\lib\urllib2.py", line 596, in redirect_request
raise HTTPError(req.get_full_url(), code, msg, headers, fp)
HTTPError: Temporary Redirect
Hi, I tried running example.py with valid credentials but am getting an exception saying "Cannot parse GALX out of login page".
Currently, all code is bundled in a single module, but it would probably help with developing, debugging, and usability to refactor it into separate modules for, say, connecting to Google, interfacing with user queries, and parsing raw CSV data. Modules for Py2/3 compatibility and a package-specific Exception class may also be useful.
I have a refactor
branch going, but it may be a while before everything is ready for a PR. It currently breaks the API -- I renamed a couple classes and class methods, for clarity and adhering to community standards.
Do you think this is a good idea, @dreyco676? Any preferences or requests?
With both Python 3.5.1 and 2.7.11 on OSX, example.py
returns an error as follows:
connector = pyGTrends(google_username, google_password)
File "/usr/local/lib/python2.7/site-packages/pytrends/pyGTrends.py", line 51, in __init__
self._connect()
File "/usr/local/lib/python2.7/site-packages/pytrends/pyGTrends.py", line 75, in _connect
self.opener.open(self.url_CookieCheck)
File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 437, in open
response = meth(req, response)
File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 550, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 469, in error
result = self._call_chain(*args)
File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 656, in http_error_302
return self.parent.open(new, timeout=req.timeout)
File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 437, in open
response = meth(req, response)
File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 550, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 475, in error
return self._call_chain(*args)
File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 558, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 400: Bad Request
This package is only Python3-compatible, and its predecessor was only Python2-compatible. Any interest in making this work in both Python versions? I'm happy to hack on it and submit a PR.
I should note that if this is made Py2/3 compatible, a universal wheel could be used if/when registering with PyPI.
Someone accessed my account using my password minutes after I used this API.
It was a throwaway account, but please be cautious!
Hi,
I am getting the following errors as of recently
Traceback (most recent call last):
File "..", line 32, in <module>
connector = pyGTrends(google_username, google_password)
File "/Library/Python/2.7/site-packages/pytrends/pyGTrends.py", line 41, in __init__
self.fake_ua = UserAgent()
File "/Library/Python/2.7/site-packages/fake_useragent/fake.py", line 10, in __init__
self.data = load_cached()
File "/Library/Python/2.7/site-packages/fake_useragent/utils.py", line 140, in load_cached
update()
File "/Library/Python/2.7/site-packages/fake_useragent/utils.py", line 135, in update
write(load())
File "/Library/Python/2.7/site-packages/fake_useragent/utils.py", line 92, in load
browsers_dict[browser_key] = get_browser_versions(browser)
File "/Library/Python/2.7/site-packages/fake_useragent/utils.py", line 55, in get_browser_versions
html = html.split('<div id=\'liste\'>')[1]
IndexError: list index out of range
The same code worked fine a few days ago. Has Google maybe changed their HTML code? Any tips?
Hi there.
Regarding the date parameters, I think the documentation should be updated. I have just fetched a good portion of keywords, and it seems the "API" is not consistent :-(. I don't know what to write, and unfortunately I don't have the time to investigate further.
def download_report(words, downloader):
    default_date = "1/2011 61m"
    default_geo = "DK"
    terms = ', '.join(words)
    downloader.request_report(terms, geo=default_geo, date=default_date)
    downloader.save_csv('./queries/', terms.replace(", ", "_"))
See the two attached files as an example, both files are produced from the above code.
11_beskyttelse_1_indført.txt
tilfælde_hpv-vaccine_doser_stivkrampe.txt
Getting HTTP Error 400, Please Help
C:\Windows\System32>python C:\Users\SGG\Desktop\pytrends-master\examples\example.py
Traceback (most recent call last):
File "C:\Users\SGG\Desktop\pytrends-master\examples\example.py", line 10, in <
module>
connector = pyGTrends(google_username, google_password)
File "C:\Python34\lib\site-packages\pytrends-1.1.2-py3.4.egg\pytrends\pyGTrend
s.py", line 51, in init
File "C:\Python34\lib\site-packages\pytrends-1.1.2-py3.4.egg\pytrends\pyGTrend
s.py", line 71, in _connect
File "C:\Python34\lib\urllib\request.py", line 469, in open
response = meth(req, response)
File "C:\Python34\lib\urllib\request.py", line 579, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Python34\lib\urllib\request.py", line 501, in error
result = self._call_chain(*args)
File "C:\Python34\lib\urllib\request.py", line 441, in _call_chain
result = func(*args)
File "C:\Python34\lib\urllib\request.py", line 684, in http_error_302
return self.parent.open(new, timeout=req.timeout)
File "C:\Python34\lib\urllib\request.py", line 469, in open
response = meth(req, response)
File "C:\Python34\lib\urllib\request.py", line 579, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Python34\lib\urllib\request.py", line 507, in error
return self._call_chain(*args)
File "C:\Python34\lib\urllib\request.py", line 441, in _call_chain
result = func(*args)
File "C:\Python34\lib\urllib\request.py", line 587, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request
I seem to reach the quota limit after only 10 downloads. Is anyone else encountering a similar issue?
Hi!
Thank you for the updates of the code. I tried to run the new updated version. After about 10 downloads, I receive the following traceback:
Traceback (most recent call last):
File "C:/Users/Documents/Python Scripts/collect_gtrends.py", line 34, in
trend=pytrend.trend(trend_payload, return_type='dataframe')
File "C:\Users\AppData\Roaming\Python\Python27\site-packages\pytrends\request.py", line 62, in trend
raise RateLimitError
pytrends.request.RateLimitError
I don't think this is the quota limit problem. Maybe I was downloading too frequently? How many seconds do you wait between requests? My current program sleeps for 5-10 seconds. Is that not enough? Thank you!
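One common mitigation is to back off exponentially rather than sleeping a fixed 5-10 seconds. A sketch: polite_request is a hypothetical helper, the delay values are guesses (Google's actual quota window is undocumented), and the broad except stands in for pytrends' RateLimitError:

```python
import random
import time

def polite_request(fetch, max_retries=5, base_delay=30):
    """Retry fetch() with exponential backoff plus jitter when a
    rate-limit error is raised (base_delay in seconds)."""
    for attempt in range(max_retries):
        try:
            return fetch()
        except Exception:  # stands in for pytrends.request.RateLimitError
            if attempt == max_retries - 1:
                raise
            # 30s, 60s, 120s, ... plus up to base_delay of random jitter
            time.sleep(base_delay * (2 ** attempt)
                       + random.uniform(0, base_delay))
```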
Hi. I have a problem with the script.
If I use two keywords, each with a different period of time (month and week), the parse_data function returns a dictionary missing one keyword. For example, if I request "mleko migdałowe, posadzka betonowa" (Polish), parse_data returns trends only for "mleko migdałowe".
Is there a way to return a dict for both keywords? I care about the relation between the keywords' trends, so I can't simply request each one separately.
Greetings
Bartek
Google provides a timezone parameter, tz, which uses the pattern Etc/GMT+5 before URL-encoding. This parameter gets appended to the end of the request URL:
http://www.google.com/trends/explore#q=pizza&date=today%2012-m&cmpt=q&tz=Etc%2FGMT%2B5
We should implement this to enable better results.
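A sketch of how the tz parameter could be appended when building the explore URL; using quote_via=quote reproduces the %20/%2F/%2B escaping seen in the example URL above:

```python
from urllib.parse import urlencode, quote

# Sketch: append the tz parameter to the explore URL fragment.
params = {'q': 'pizza', 'date': 'today 12-m', 'cmpt': 'q', 'tz': 'Etc/GMT+5'}
url = ('http://www.google.com/trends/explore#'
       + urlencode(params, quote_via=quote))
# '/' becomes %2F, '+' becomes %2B, and spaces become %20.
```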
Breakout trends that occur at the end of a chunk aren't cleaned: the trailing \n is stripped by the chunker, so the replacement never happens :(
It would be great to be able to access data from https://www.google.com/trends/hottrends
It is different from the currently implemented functions, which require knowing the search terms in advance; this would help discover trendy search terms. Example:
Request: What are the top trending searches in 2016?
Response: Meldonium (300m searches), Nate Diaz (200m searches), etc
Hi,
I am trying to use the example.py code, but it gives me the following error right after the line:
pytrend = TrendReq(google_username, google_password, custom_useragent='My Pytrends Script')
Here's the error:
Traceback (most recent call last):
File "", line 1, in
File "/usr/lib/python2.7/site-packages/pytrends/request.py", line 34, in init
self._connect()
File "/usr/lib/python2.7/site-packages/pytrends/request.py", line 45, in _connect
soup_login = BeautifulSoup(login_html.content, "lxml").find('form').find_all('input')
File "/usr/lib/python2.7/site-packages/bs4/init.py", line 165, in init
% ",".join(features))
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?
Do you happen to see what the problem is? I have tried installing lxml, and it installed fine.
Thanks.
Hi all,
I'm having trouble using this library with Python 2 when there is an error in the response: the JSONDecodeError that is caught when parsing the response is not defined in Python 2 (as stated in https://docs.python.org/3/library/json.html#json.JSONDecodeError).
The docs also state that JSONDecodeError is a subclass of ValueError, so ValueError could be used for the Python 2 version.
Thanks,
Luca
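A small compatibility alias along the lines Luca describes could look like this (safe_loads is just an illustrative wrapper, not library code):

```python
import json

# json.JSONDecodeError exists only on Python 3; since it subclasses
# ValueError there, falling back to ValueError works on both versions.
try:
    JSONDecodeError = json.JSONDecodeError
except AttributeError:
    JSONDecodeError = ValueError

def safe_loads(text):
    """Decode JSON, returning None instead of raising on bad input."""
    try:
        return json.loads(text)
    except JSONDecodeError:
        return None
```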
In trying to get related entities, it looks like Google is no longer using the same GET request method as the pytrends API. It's using this URL:
https://www.google.com/trends/api/widgetdata/relatedsearches/csv?req={"restriction":{"geo":{"country":"US"},"time":"2004-01-01 2016-08-16"},"keywordType":"ENTITY","metric":["TOP","RISING"],"trendinessSettings":{"compareTime":"2004-01-01 2005-01-01"},"requestOptions":{"property":"","backend":"IZG","category":44}}&token=[TOKEN]&tz=240
Are there plans to incorporate this new method into the pytrends module? Thanks in advance.
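For illustration, the req payload from the URL above could be serialized and escaped like this. TOKEN is a placeholder; the real token would have to be obtained from the explore endpoint first, which is not shown here:

```python
import json
from urllib.parse import quote

# The req structure copied from the observed widgetdata URL.
req = {
    "restriction": {
        "geo": {"country": "US"},
        "time": "2004-01-01 2016-08-16",
    },
    "keywordType": "ENTITY",
    "metric": ["TOP", "RISING"],
    "trendinessSettings": {"compareTime": "2004-01-01 2005-01-01"},
    "requestOptions": {"property": "", "backend": "IZG", "category": 44},
}
base = "https://www.google.com/trends/api/widgetdata/relatedsearches/csv"
url = "{}?req={}&token={}&tz=240".format(
    base, quote(json.dumps(req, separators=(",", ":"))), "TOKEN")
```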
Every time I specify date="today 90-d", I don't get the full ninety days of trend data for any term. Is this related to Google authentication, or is it a bug in the library?
I'm adapting my application to your new code. I'm using the following code:
# connect to Google
pytrend = TrendReq(google_username, google_password, custom_useragent='Hey!')
# make request
def google_request():
googlekey="tomate"
search_param={'q':googlekey,'h1':'pt-BR','geo':'BR'}
trend=pytrend.trend(search_param,return_type='dataframe')
Apparently there is a problem here
trend=pytrend.trend(search_param,return_type='dataframe')
As you can see in the below traceback:
Traceback (most recent call last):
Response did not parse. See server response for details.
This page is currently unavailable. Please try again later.<br/> Please make sure your query is valid and try again.<br/> If you're experiencing long delays, consider reducing your comparison items.<br/> Thanks for your patience.
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pytrends/request.py", line 79, in trend
self.results = json.loads(text)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/json/__init__.py", line 319, in loads
return _default_decoder.decode(s)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/json/decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/json/decoder.py", line 357, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/santanna_santanna/PycharmProjects/Predictive Models/APIGtrends.py", line 37, in <module>
google_request()
File "/Users/santanna_santanna/PycharmProjects/Predictive Models/APIGtrends.py", line 27, in google_request
trend=pytrend.trend(search_param,return_type='dataframe')
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pytrends/request.py", line 81, in trend
raise ResponseError(req.content)
pytrends.request.ResponseError: b'<!DOCTYPE html><html ><head><meta name="google-site-verification" content="-uo2JByp3-hxDA1ZgvM3dP8BE1_qDDddCm_st_w41P8" /><meta...
[ HERE COMES A LONG LONG LONG HTML CODE] ...This page is currently unavailable. Please try again later.<br/> Please make sure your query is valid and try again.<br/> If you're experiencing long delays, consider reducing your comparison items.<br/> Thanks for your patience.</div></div></div></div></div></div></div></div><div class="gb_6a"></div></body></html>'
Any ideas where the problem might be?
Hi,
trying to connect raises an error for me:
----> 8 connector = pyGTrends(google_username, google_password)
9
10 # make request
/home/cs/anaconda3/lib/python3.5/site-packages/pytrends/pyGTrends.py in __init__(self, username, password)
49 self.url_CookieCheck = 'https://www.google.com/accounts/CheckCookie?chtml=LoginDoneHtml'
50 self.url_PrefCookie = 'http://www.google.com'
---> 51 self._connect()
52
53 def _connect(self):
/home/cs/anaconda3/lib/python3.5/site-packages/pytrends/pyGTrends.py in _connect(self)
73 params = urlencode(self.login_params).encode('utf-8')
74 self.opener.open(self.url_ServiceLoginBoxAuth, params)
---> 75 self.opener.open(self.url_CookieCheck)
76 self.opener.open(self.url_PrefCookie)
77
/home/cs/anaconda3/lib/python3.5/urllib/request.py in open(self, fullurl, data, timeout)
469 for processor in self.process_response.get(protocol, []):
470 meth = getattr(processor, meth_name)
--> 471 response = meth(req, response)
472
473 return response
/home/cs/anaconda3/lib/python3.5/urllib/request.py in http_response(self, request, response)
579 if not (200 <= code < 300):
580 response = self.parent.error(
--> 581 'http', request, response, code, msg, hdrs)
582
583 return response
/home/cs/anaconda3/lib/python3.5/urllib/request.py in error(self, proto, *args)
501 http_err = 0
502 args = (dict, proto, meth_name) + args
--> 503 result = self._call_chain(*args)
504 if result:
505 return result
/home/cs/anaconda3/lib/python3.5/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
441 for handler in handlers:
442 func = getattr(handler, meth_name)
--> 443 result = func(*args)
444 if result is not None:
445 return result
/home/cs/anaconda3/lib/python3.5/urllib/request.py in http_error_302(self, req, fp, code, msg, headers)
684 fp.close()
685
--> 686 return self.parent.open(new, timeout=req.timeout)
687
688 http_error_301 = http_error_303 = http_error_307 = http_error_302
/home/cs/anaconda3/lib/python3.5/urllib/request.py in open(self, fullurl, data, timeout)
469 for processor in self.process_response.get(protocol, []):
470 meth = getattr(processor, meth_name)
--> 471 response = meth(req, response)
472
473 return response
/home/cs/anaconda3/lib/python3.5/urllib/request.py in http_response(self, request, response)
579 if not (200 <= code < 300):
580 response = self.parent.error(
--> 581 'http', request, response, code, msg, hdrs)
582
583 return response
/home/cs/anaconda3/lib/python3.5/urllib/request.py in error(self, proto, *args)
507 if http_err:
508 args = (dict, 'default', 'http_error_default') + orig_args
--> 509 return self._call_chain(*args)
510
511 # XXX probably also want an abstract factory that knows when it makes
/home/cs/anaconda3/lib/python3.5/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
441 for handler in handlers:
442 func = getattr(handler, meth_name)
--> 443 result = func(*args)
444 if result is not None:
445 return result
/home/cs/anaconda3/lib/python3.5/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
587 class HTTPDefaultErrorHandler(BaseHandler):
588 def http_error_default(self, req, fp, code, msg, hdrs):
--> 589 raise HTTPError(req.full_url, code, msg, hdrs, fp)
590
591 class HTTPRedirectHandler(BaseHandler):
HTTPError: HTTP Error 400: Bad Request
Do you have an idea why the request does not work properly? I double-checked my authentication credentials and they are correct.
Thanks and cheers,
Carsten
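For reference, here is a Python 3 sketch of the opener construction from `_connect`, including the `HTTPSHandler` workaround described earlier in this thread. The names mirror the library but are reproduced here as assumptions; note that a 400 on the `CheckCookie` redirect may also mean Google's login flow itself changed, rather than that the credentials are wrong.

```python
import ssl
from http.cookiejar import CookieJar
from urllib.request import build_opener, HTTPCookieProcessor, HTTPSHandler

cj = CookieJar()

# Unverified context sidesteps CERTIFICATE_VERIFY_FAILED errors.
# Fine for local debugging, not recommended for production use.
https_handler = HTTPSHandler(context=ssl._create_unverified_context())

# Cookie-aware opener, analogous to self.opener in pyGTrends._connect.
opener = build_opener(HTTPCookieProcessor(cj), https_handler)
```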
Hi,
Regarding the relative search, I used the example here
payload = {'q': ['Pizza', 'Italian', 'Spaghetti', 'Breadsticks', 'Sausage']}
df = connector.trend(payload, return_type = 'dataframe')
And this is the output I get
df
Out[11]:
pizza
Date
2004-01-01 31.0
2004-02-01 30.0
2004-03-01 30.0
2004-04-01 30.0
2004-05-01 29.0
2004-06-01 28.0
2004-07-01 31.0
2004-08-01 30.0
2004-09-01 30.0
2004-10-01 33.0
2004-11-01 31.0
2004-12-01 33.0
2005-01-01 34.0
2005-02-01 36.0
2005-03-01 35.0
2005-04-01 35.0
2005-05-01 34.0
2005-06-01 33.0
2005-07-01 37.0
2005-08-01 36.0
2005-09-01 36.0
2005-10-01 37.0
2005-11-01 38.0
2005-12-01 40.0
2006-01-01 40.0
2006-02-01 41.0
2006-03-01 39.0
2006-04-01 39.0
2006-05-01 37.0
2006-06-01 38.0
...
2014-04-01 79.0
2014-05-01 80.0
2014-06-01 78.0
2014-07-01 83.0
2014-08-01 84.0
2014-09-01 75.0
2014-10-01 79.0
2014-11-01 82.0
2014-12-01 83.0
2015-01-01 82.0
2015-02-01 84.0
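One possible explanation for getting only the `pizza` column back: the legacy Trends CSV export expected the comparison terms as a single comma-separated `q` value, so a Python list passed through unjoined may yield only one series. Joining explicitly is worth testing (this is an assumption about the internals, not confirmed behavior):

```python
# Join the comparison keywords into one comma-separated string before
# building the payload (assumption: the legacy export expects this form).
keywords = ['Pizza', 'Italian', 'Spaghetti', 'Breadsticks', 'Sausage']
payload = {'q': ', '.join(keywords)}
```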
Hey, as far as I can tell, the hl parameter works across most Google services to set the language of the web interface, but it doesn't filter results (for trends, searches, etc.) by that language. The repo's README claims otherwise. Could you double-check, @dreyco676?
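To make the distinction concrete: `hl` controls the language of the labels in the response, while `geo` is what actually restricts the data to a region. A small sketch of a request query string (parameter names as used in Google Trends URLs; the keyword is just an example):

```python
from urllib.parse import urlencode

params = {
    'hl': 'de-DE',   # language of returned labels/messages only
    'geo': 'DE',     # region filter actually applied to the data
    'q': 'Fahrrad',  # example keyword
}
# Sorted for a deterministic query string.
query = urlencode(sorted(params.items()))
```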
https://github.com/GeneralMills/pytrends/blob/master/pytrends/pyGTrends.py#L196
I think the regexes don't cover all possible values; for example, I get this error:
ValueError: val=3d printer dtype not recognized
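A keyword like "3d printer" starts with a digit, which can trip a strict value-classifying regex. A more forgiving approach is to attempt numeric conversion and fall back to a plain string; this is a sketch of the idea, not the library's actual code:

```python
def coerce(val):
    """Try int, then float; otherwise keep the value as a string."""
    for cast in (int, float):
        try:
            return cast(val)
        except ValueError:
            pass
    return val  # keywords like "3d printer" stay strings

# Mixed row of date, score, and a digit-leading keyword:
row = [coerce(v) for v in ('2015-02-01', '84.0', '3d printer')]
```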
Hi, I'm interested in using pytrends for predicting winners in senate races. I copied your example code to begin playing around with it, and entered a path, my username, password, etc. When I run the code, however, I get "NameError: name 'pyGTrends' is not defined." Any ideas?
Is there any interest in restructuring this project into a module distribution and adding a setup.py for installation? Maybe also building a source distribution and uploading it to PyPI? I've never done this before, but I'm happy to take a first pass and send a pull request.
I'm trying to use your API as part of a web app. I want to gather search data from a variety of sources, including Google Trends, then aggregate it to get an even more general estimate of how popular a topic is. I was saving to CSV first and then immediately reading the CSV back, but that's a little silly.
How could I, say, print line 500 of the CSV data before the script stops running?
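Rather than round-tripping through disk, you can keep the CSV text in memory and index into its rows directly. In this sketch `csv_text` stands in for whatever string the download call returns (the exact connector method varies by version), and the fake data is kept small for illustration:

```python
import csv
import io

# Stand-in for the CSV text returned by the downloader.
csv_text = "\n".join("2004-01-%02d,%d" % (i, i) for i in range(1, 32))

# Parse the in-memory text exactly as if it were a file on disk.
reader = csv.reader(io.StringIO(csv_text))
rows = list(reader)

# Rows are 0-indexed, so "line 5" of the data is rows[4]
# (for line 500 of a real download, use rows[499]).
row_five = rows[4]
```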
Hi,
first of all, thank you for this beautiful and efficient script.
For a research project I am interested in using the Google Trends 'interest by subregion' function automatically, and I am curious whether a modified version of pytrends could provide this functionality.
For example, if one enters the term 'Python' on the main page and sets the country to 'United Kingdom', below the 'interest over time' graph (which is accessible via the 'trends' function) there is a map, and I need to be able to grab the corresponding data, which is also available as CSV, automatically.
To modify pytrends, my first idea is to use the existing 'trends' function as a framework, where the core change would be replacing payload['cid'] = 'TIMESERIES_GRAPH_0' with the value that corresponds to the JSON of the 'interest by subregion' data. I tried to find that 'cid' value with the Chrome developer console by observing the requests and responses Google Trends makes, yet I was not able to find it.
(To that end, I am looking for the link of the form https://www.google.com/trends/fetchComponent?hl=en-US&q=Python&geo=US&cid=**--WhichValueHere--** that points to the 'interest by region' data.)
It would be very helpful if you could point out the correct value for the 'cid' key, or show me where I can find it.
Otherwise, I would be happy to hear about a different or better approach to get the JSON (or CSV) of the 'interest by subregion' section of Google Trends.
If I am able to write the code with the desired functionality, I would be happy to contribute it to the pytrends project.
Your help is much appreciated!
Regards
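To make the modification above concrete, here is a sketch of the `fetchComponent` URL construction. The `cid` is deliberately left as a placeholder, since the correct component id for the subregion map is exactly what is being asked; everything else mirrors the URL quoted in the question:

```python
from urllib.parse import urlencode

def fetch_component_url(q, geo, cid):
    """Build a Trends fetchComponent URL (cid must be filled in by hand)."""
    base = "https://www.google.com/trends/fetchComponent"
    return base + "?" + urlencode({"hl": "en-US", "q": q, "geo": geo, "cid": cid})

# 'REPLACE_WITH_MAP_CID' is a placeholder, not a real component id.
url = fetch_component_url("Python", "GB", cid="REPLACE_WITH_MAP_CID")
```

One way to discover the real `cid` is to open the Network tab in Chrome's developer tools, filter requests for "fetchComponent", and watch which `cid` the map widget's own XHR uses.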
Hi,
Firstly, many thanks for the library and updates :)
I'm trying to use the hottrends function, starting from example.py. I have changed country_payload to {'geo': 'TR'}. However, the results do not change when country_payload changes. I have tried different countries, and the result is always the same. Is there anything I'm missing, or are the parameters not working as expected?
Cheers
Hi,
Firstly, thanks for the module.
I've fetched some data using the module, but when I check it against the CSV downloaded manually, there are some differences in the numbers.
For example, for keyword "olympics" with time "today 30-d",
the value for 2016-08-14 is 100 (manual) vs. 86 (from the API).
Am I missing something, or can you reproduce this as well?
Thanks in advance.
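One likely explanation (not a confirmed diagnosis): Trends scores are relative, rescaled so the maximum within the requested window is 100, and Google samples the underlying data, so two pulls of the "same" series can disagree. The sketch below uses hypothetical raw numbers chosen to reproduce a 100-vs-86 gap for the same date:

```python
def normalize(series):
    """Rescale a series so its maximum is 100, the way Trends scores work."""
    peak = max(series)
    return [round(100 * v / peak) for v in series]

# Hypothetical raw counts for three days; index 1 is the date in question.
raw_a = [37, 50, 43]  # sample where that date is the window's peak
raw_b = [37, 50, 58]  # sample where a different day peaks higher

scaled_a = normalize(raw_a)  # the date scores 100 here
scaled_b = normalize(raw_b)  # the same date scores 86 here
```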
What am I doing wrong that it doesn't save to the path I've entered?