waditu / tushare
TuShare is a utility for crawling historical data of China stocks
License: BSD 3-Clause "New" or "Revised" License
Sources for the "Dragon-Tiger List" (龙虎榜) data
Shenzhen Stock Exchange: http://www.szse.cn/main/disclosure/news/scgkxx/
Shanghai Stock Exchange: http://www.sse.com.cn/disclosure/diclosure/public/
Thanks.
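High bonus-share plans (高送转) and interim reports are both themes that tend to attract speculative trading.
To study how high-bonus-share plays work, we need a promptly updated list that contains the bonus/rights plan (ratio), the record date (including historical ones), the ex-rights date, the stock code, and the stock name.
Ultimately we could query it like this: the stock codes with a ratio equal to 1 (10 bonus shares per 10 held) that went ex-rights in January to March 2008.
Interim reports are similar: stock code, (expected) disclosure date, forecast increase (or decrease), and the percentage.
History must be included so that the direction and strength of the speculation can be studied.
As I recall, high-bonus-share plans are speculated on in the first half of the year and interim reports in the second half.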
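How reliable is the adjusted-price (复权) data? I understand that tushare relies on free data sources, so it cannot be 100% accurate, but I would like to know roughly what level of accuracy the adjusted data reaches.
On yahoo-finance, at least 30% of the adjusted data for the Shanghai and Shenzhen markets is incomplete or wrong. Are domestic data sources any better?
Thanks!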
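A question:
For historical tick data, ts.get_tick_data('600848', date='<date>'), the number of rows and the timestamps differ from day to day, sometimes by more than a thousand rows. Is the raw data simply like this, or is it a data-quality problem in the API?
The error message is: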
#No tables found matching pattern '.+'
Thank you for open-sourcing tushare. On my first run I get an error: under Spyder, calling any tushare function fails.
For example: import tushare as ts
ts.get_area_classified()
File "C:\Anaconda\lib\site-packages\tushare\stock\trading.py", line 13, in
import lxml.html
ImportError: No module named html
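But in PowerShell, starting python and running import lxml.html succeeds without an error.
pip list shows lxml (3.4.4).
This is my first time using Python, so I am not sure whether it is a problem with my setup.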
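One way to narrow down this kind of mismatch is to check, from inside the failing environment, which interpreter is running and where lxml would be loaded from. The following is a minimal sketch; module_origin is a hypothetical helper, not part of tushare, and it only reports locations without fixing anything.

```python
import importlib.util
import sys


def module_origin(name):
    """Return the path a top-level module would load from, or None if not found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec is not None else None


# Run this both in Spyder and in the PowerShell session and compare the output.
print("interpreter:", sys.executable)
print("lxml found at:", module_origin("lxml"))
```

If the two sessions print different interpreters or different lxml locations, the IDE is probably not using the installation that pip list describes.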
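For example, historical data for the CSI 300 index, 000300.SH.
I found that index fund data is not supported. I can program in Python; how do I add support for new data to the project?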
data = get_h_data(code=code, end="2015-07-01")
2015-07-01 14.35 14.59 13.92 13.74 183540608 2629096448
2015-06-30 13.54 14.54 14.54 13.38 254810320 3572978432
....
The result contains data later than July 1.
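Until the end bound is enforced by the library, the rows can be clipped on the caller's side. This is a minimal sketch on plain rows with the date string first; clip_to_range is a hypothetical helper, and it relies on ISO dates (YYYY-MM-DD) sorting correctly as strings.

```python
def clip_to_range(rows, start=None, end=None):
    """Keep rows whose first field (an ISO date string) lies within [start, end]."""
    kept = []
    for row in rows:
        date = row[0]
        if start is not None and date < start:
            continue
        if end is not None and date > end:
            continue
        kept.append(row)
    return kept


rows = [
    ("2015-07-02", 14.10),  # made-up row, only here to show a date past `end`
    ("2015-07-01", 14.35),
    ("2015-06-30", 13.54),
]
clipped = clip_to_range(rows, end="2015-07-01")
print(clipped)
```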
@jimmysoa
Hello,
After upgrading to 2.0 I get the following error and do not know how to fix it.
NiTu:Liang ~$ python dddd.py
Traceback (most recent call last):
File "dddd.py", line 1, in <module>
import tushare as ts
File "/Users/Liang/anaconda/lib/python2.7/site-packages/tushare/__init__.py", line 5, in <module>
from tushare.stock.trading import (get_hist_data, get_tick_data,
File "/Users/Liang/anaconda/lib/python2.7/site-packages/tushare/stock/trading.py", line 16, in <module>
import pandas as pd
File "/Users/Liang/anaconda/lib/python2.7/site-packages/pandas/__init__.py", line 41, in <module>
from pandas.core.api import *
File "/Users/Liang/anaconda/lib/python2.7/site-packages/pandas/core/api.py", line 9, in <module>
from pandas.core.groupby import Grouper
File "/Users/Liang/anaconda/lib/python2.7/site-packages/pandas/core/groupby.py", line 15, in <module>
from pandas.core.frame import DataFrame
File "/Users/Liang/anaconda/lib/python2.7/site-packages/pandas/core/frame.py", line 40, in <module>
from pandas.computation.eval import eval as _eval
File "/Users/Liang/anaconda/lib/python2.7/site-packages/pandas/computation/eval.py", line 8, in <module>
from pandas.computation.expr import Expr, _parsers, tokenize_string
File "/Users/Liang/anaconda/lib/python2.7/site-packages/pandas/computation/expr.py", line 148, in <module>
(getattr(ast, node) for node in dir(ast))))
File "/Users/Liang/anaconda/lib/python2.7/site-packages/pandas/computation/expr.py", line 147, in <lambda>
issubclass(x, ast.AST),
AttributeError: 'module' object has no attribute 'AST'
Thanks
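I suggest extending get_hist_data() to handle a special argument all, that is,
get_hist_data('all', '2015-05-01', ktype='D')
The return value would be the daily data of every stock for that day, indexed by stock code. That would make it easier to build, for example, a table of turnover rates for the last five days:
code '2015-05-01' '2015-05-02' '2015-05-03'
600001 0.051 0.061 0.123
Per stock and per day, like space and time, are the two most basic dimensions for extracting information. At the moment TuShare appears to support only one of them.
Please consider adding the ability to extract all data for a given time, starting with daily bars and extending to other time scales. How the time parameter is expressed is also an issue: the market is closed on some dates, so it would be good to support both calendar time and trading time, for example Unix time.
Many thanks!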
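In the meantime, the requested code-by-date table can be approximated by looping over codes and pivoting on the caller's side. This is a minimal sketch with plain dicts; pivot_by_date and fetch_history are hypothetical names (fetch_history stands in for whatever per-stock call is used), and the sample numbers for 600002 are invented.

```python
def pivot_by_date(codes, fetch_history, field):
    """Build {code: {date: value}} plus the sorted list of all dates seen.

    fetch_history(code) must return {date: {field: value, ...}} for one stock.
    """
    table = {}
    dates = set()
    for code in codes:
        history = fetch_history(code)
        table[code] = {d: rec[field] for d, rec in history.items()}
        dates.update(history)
    return table, sorted(dates)


# Stand-in data source (hypothetical); a real one would call the library.
SAMPLE = {
    "600001": {"2015-05-01": {"turnover": 0.051}, "2015-05-02": {"turnover": 0.061}},
    "600002": {"2015-05-02": {"turnover": 0.020}},
}

table, dates = pivot_by_date(["600001", "600002"], SAMPLE.__getitem__, "turnover")
for code in table:
    # Days with no data for a stock show up as None rather than being dropped.
    print(code, [table[code].get(d) for d in dates])
```

Keeping missing days as None makes non-trading days and suspensions visible instead of silently shifting columns.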
import tushare as ts
data = ts.get_h_data('000033',start='2014-06-29',end='2015-06-29')
produces the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/hao/venv/q-django/lib/python3.4/site-packages/tushare/stock/trading.py", line 392, in get_h_data
if ((float(rt['high']) == 0) & (float(rt['low']) == 0)):
TypeError: 'NoneType' object is not subscriptable
Maybe this stock is a bit odd? It is not suspended, yet it has no data either.
sym1 is None after the following snippet executes. It worked fine for 2015-07-16 but stopped working on 2015-07-17.
import tushare as ts
sym1 = ts.get_tick_data('600893', date='2015-07-17')
Hello,
This project is extremely useful, thank you!
When I run the upgrade command for 0.1.8, I get this error:
Traceback (most recent call last):
File "setup.py", line 4, in <module>
import tushare
File "/Users/Liang/tushare/tushare/__init__.py", line 7, in <module>
from tushare.stock.fundamental import (get_stock_basics, get_report_data,
File "/Users/Liang/tushare/tushare/stock/fundamental.py", line 11, in <module>
import lxml.html
ImportError: No module named lxml.html
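The number of advancing stocks and the number of declining stocks: in the
"market index quotes list"
(get the real-time quotes list for the market indexes, presented as a table),
called like this:
import tushare as ts
df = ts.get_index()
could these be returned?
The number of advancing stocks helps judge how the major players view the day.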
In the dataframe returned by get_report_data, code is numeric, so codes that start with 0 keep only their trailing digits.
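Until the column is returned as text, the leading zeros can be restored on the caller's side. This is a minimal sketch that assumes every code is six digits long; normalize_code is a hypothetical helper, not a tushare function.

```python
def normalize_code(code):
    """Turn a code that was parsed as a number back into a 6-character string."""
    return str(int(code)).zfill(6)


print(normalize_code(33))        # 000033
print(normalize_code(600848))    # 600848
print(normalize_code("000033"))  # 000033 (already a string, unchanged)
```

On a DataFrame the helper could be applied with df['code'].map(normalize_code).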
df=ts.get_h_data('000033',start='2015-06-01',end='2015-07-21')
if df is not None:
print "OK"
else:
print "ERROR"
Fetching fails for some stocks.
INDEX_COLS is missing the "open" column. Also, I suggest that get_index not modify amount:
df['amount'] = df['amount'] / 100000000; it is better to keep the original data.
/usr/lib64/python2.7/urllib2.pyc in _call_chain(self, chain, kind, meth_name, _args)
380 func = getattr(handler, meth_name)
381
--> 382 result = func(_args)
383 if result is not None:
384 return result
/usr/lib64/python2.7/urllib2.pyc in http_error_default(self, req, fp, code, msg, hdrs)
529 class HTTPDefaultErrorHandler(BaseHandler):
530 def http_error_default(self, req, fp, code, msg, hdrs):
--> 531 raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
532
533 class HTTPRedirectHandler(BaseHandler):
HTTPError: HTTP Error 403: Forbidden
http://vip.stock.finance.sina.com.cn/corp/go.php/vMS_FuQuanMarketHistory/stockid/900901.phtml?year=2015&jidu=1
http://vip.stock.finance.sina.com.cn/corp/go.php/vMS_FuQuanMarketHistory/stockid/200418.phtml?year=2015&jidu=1
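and others all come out wrong: the current price is zero point something, but the adjusted prices are in the hundreds, which causes
df = ts.get_h_data('900901', start='2015-08-01', end=endDate)
to fail here:
`if ((float(rt['high']) == 0) & (float(rt['low']) == 0)):`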
pydev debugger: starting
[Getting data:]Traceback (most recent call last):
File "/Applications/Aptana Studio 3/plugins/org.python.pydev_3.0.0.1388187472/pysrc/pydevd.py", line 1479, in <module>
debugger.run(setup['file'], None, None)
File "/Applications/Aptana Studio 3/plugins/org.python.pydev_3.0.0.1388187472/pysrc/pydevd.py", line 1125, in run
pydev_imports.execfile(file, globals, locals) #execute the script
File "/Users/apple/node/undo4096/tool/stock/winBuy.py", line 75, in <module>
df = ts.get_h_data('900901', start='2015-08-01', end=endDate)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tushare-0.3.4-py2.7.egg/tushare/stock/trading.py", line 392, in get_h_data
if ((float(rt['high']) == 0) & (float(rt['low']) == 0)):
TypeError: 'NoneType' object has no attribute '__getitem__'
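But Shenzhen B shares work fine. How can this be solved?
The ChiNext board and the SME board both have a similar problem; CSI 300 and CSI 500 work fine.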
import pandas as pd
import tushare as ts
pd.set_option('display.width', 200)
zxb = ts.get_sme_classified()
"C:\Program Files\Anaconda3\python.exe" "D:/MyDoc/Cloud Station/Investment/Python/PQuant/A Share DK Pailie.py"
Traceback (most recent call last):
File "D:/MyDoc/Cloud Station/Investment/Python/PQuant/A Share DK Pailie.py", line 8, in <module>
zxb = ts.get_sme_classified()
File "C:\Program Files\Anaconda3\lib\site-packages\tushare\stock\classifying.py", line 113, in get_sme_classified
df = fd.get_stock_basics()
File "C:\Program Files\Anaconda3\lib\site-packages\tushare\stock\fundamental.py", line 44, in get_stock_basics
text = urlopen(request, timeout=10).read()
File "C:\Program Files\Anaconda3\lib\urllib\request.py", line 161, in urlopen
return opener.open(url, data, timeout)
File "C:\Program Files\Anaconda3\lib\urllib\request.py", line 469, in open
response = meth(req, response)
File "C:\Program Files\Anaconda3\lib\urllib\request.py", line 579, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Program Files\Anaconda3\lib\urllib\request.py", line 507, in error
return self._call_chain(*args)
File "C:\Program Files\Anaconda3\lib\urllib\request.py", line 441, in _call_chain
result = func(*args)
File "C:\Program Files\Anaconda3\lib\urllib\request.py", line 587, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
Process finished with exit code 1
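I want to save historical data, quotes, and trade data to a database, but I found that the saved quotes carry no stock code, so there is no way to tell which stock a price belongs to.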
ind_data = ts.get_hist_data('sh', ktype='30', start='2012-06-25', end='2015-06-24')
The earliest row returned is 2015-04-22 13:30:00.
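Let users keep a local copy of all data and provide a unified data access interface: first check the local data for a hit; on a miss, automatically download the data from the network and save it locally. Practically every application that uses tushare as a data source for quantitative research needs to implement this.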
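The read-through behaviour described above can be sketched in a few lines. This is only an illustration under stated assumptions: cached_fetch is a hypothetical helper, the download function is injected by the caller, data is assumed to be JSON-serializable, and there is no invalidation, so a cached entry for an unfinished trading day would stay stale.

```python
import json
import os
import tempfile


def cached_fetch(key, fetch, cache_dir):
    """Return data for `key` from cache_dir if present, else fetch and store it."""
    path = os.path.join(cache_dir, key + ".json")
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as fh:
            return json.load(fh)
    data = fetch(key)
    os.makedirs(cache_dir, exist_ok=True)
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(data, fh)
    return data


calls = []


def fake_fetch(key):
    # Stand-in for a network download; records each call so misses are visible.
    calls.append(key)
    return [["2015-07-01", 14.35]]


cache_dir = tempfile.mkdtemp()
first = cached_fetch("600848_D", fake_fetch, cache_dir)   # miss: downloads
second = cached_fetch("600848_D", fake_fetch, cache_dir)  # hit: read from disk
print(len(calls), first == second)
```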
In tushare/stock/trading.py, _code_to_symbol:
return 'sh%s'%code if code[:1] == '6' else 'sz%s'%code
SSE B shares start with '9', ETFs start with '5', and there are also convertible bonds and so on.
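A possible extension of the mapping, using only the prefixes named in this report, is sketched below. code_to_symbol and SH_PREFIXES are hypothetical names; convertible bonds and any other instrument types are deliberately left out because their prefixes are not given here.

```python
# Leading digits treated as Shanghai-listed in this sketch (from the report above).
SH_PREFIXES = ("5", "6", "9")


def code_to_symbol(code):
    """Prefix a 6-digit code with 'sh' or 'sz' based on its first digit."""
    if len(code) != 6 or not code.isdigit():
        raise ValueError("expected a 6-digit code, got %r" % (code,))
    return ("sh" if code.startswith(SH_PREFIXES) else "sz") + code


for sample in ("600848", "900901", "200418", "000033"):
    print(sample, "->", code_to_symbol(sample))
```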
I found that get_profit_data only returns column titles.
It seems that "html = lxml.html.parse(url)" in "_get_profit_data" doesn't get the full content of the page, and that triggers an exception.
Simple test:
import lxml.html
url = 'http://vip.stock.finance.sina.com.cn/q/go.php/vFinanceAnalyze/kind/profit/index.phtml?s_i=&s_a=&s_c=&reportdate=2014&quarter=1&p=29&num=60'
html = lxml.html.parse(url)
xtrs = html.xpath('//table[@class="list_table"]/tr')
for trs in xtrs:
    code = trs.xpath('td[1]/a/text()')[0]
    name = trs.xpath('td[2]/a/text()')[0]
    roe = trs.xpath('td[3]/text()')
    print(code, name, roe)
I could work around this by fetching the content with requests.get and then parsing it with lxml.
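I see that the adjusted-price calculation divides by rate only once.
If the data fetched in one call covers a long period with several ex-rights events in between, each with a different adjustment factor, doesn't that miss many of the adjustment steps?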
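Whether one division is enough depends on what the input already contains. The arithmetic below is a sketch under the assumption that every row carries its own cumulative adjustment factor; the prices and factors are made up, the function names are hypothetical, and nothing here is taken from tushare's implementation. Under that assumption, backward-adjusted prices already include every ex-rights event, and dividing all of them by the single latest factor yields forward-adjusted prices.

```python
def backward_adjust(rows):
    """rows: list of (date, raw_close, cumulative_factor), oldest first."""
    return [(date, raw * factor) for date, raw, factor in rows]


def forward_adjust(rows):
    """Divide backward-adjusted prices by the latest cumulative factor."""
    latest_factor = rows[-1][2]
    return [(date, price / latest_factor) for date, price in backward_adjust(rows)]


# Made-up series: the raw price halves at two 10-for-10 bonus issues, so the
# cumulative factor doubles each time.
rows = [
    ("2015-01-05", 10.0, 1.0),
    ("2015-03-02", 5.0, 2.0),
    ("2015-06-01", 2.5, 4.0),
]
print(backward_adjust(rows))
print(forward_adjust(rows))
```

If the rate were instead a per-event ratio, the ratios of all events in the period would have to be multiplied together before dividing.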
cannot convert the series to <type 'float'>
Calling get_today_ticks()
fails with the following error:
UnboundLocalError: local variable 'data' referenced before assignment.
You could try moving data = pd.DataFrame()
outside the try block.
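The pattern behind this suggestion can be shown without pandas. This is a generic sketch, not the library's code: broken and load_ticks are hypothetical functions, and a plain list stands in for the DataFrame.

```python
def broken(fetch):
    try:
        data = fetch()
    except IOError:
        pass
    return data  # UnboundLocalError if fetch() raised before `data` was bound


def load_ticks(fetch):
    """Return whatever fetch() yields, or an empty list if it fails."""
    data = []  # bound before the try, so the return below always has a value
    try:
        data = fetch()
    except IOError:
        pass  # keep the empty default
    return data


def failing():
    raise IOError("network down")


print(load_ticks(failing))  # []
```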
For example, ma60 and ma200.
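I saw that one commit message says "real-time futures data retrieval".
Does that mean futures tick data can now be viewed in tushare? The data source should be fine; registering any CTP account is enough to use it.
But I don't see it among the current methods. Will it arrive in the next version?
Also, if the futures methods are mixed in with the stock methods, won't that get too confusing? I hope they can be separated.
And could the informational methods and the data methods be separated as well?
For example:
ts.info.get_gdp_*
ts.stock.get_today
Something like that.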
endDate='2015-07-23'
df=ts.get_h_data("000026",start='2015-06-01',end=endDate)
The market has already closed for the day, but today's data cannot be queried yet; the resulting df does not include today.
ts.get_hist_data('601106', start='2015-02-05', end='2015-02-06', ktype='30')
ts.get_hist_data('601106', start='2015-02-05', end='2015-02-06', ktype='15')
ts.get_hist_data('601106', start='2015-02-05', end='2015-02-06', ktype='5')
return nothing.
ts.get_hist_data('601106', start='2015-02-05', end='2015-02-06', ktype='60')
works fine.
Saving the data from get_industry_classified() as csv raises an error; saving it as xls works.
Saving the data from get_stock_basics() as csv works.
data2 = ts.get_industry_classified()
print('\nget classified ...ok')
data2.to_csv('D:/Other/PY/data/industry_classified2.csv')
print('save classified ...ok')
[Getting data:]#################################################Traceback (most recent call last):
File "D:/Other/PY/tuShareGetStockList.py", line 13, in <module>
data2.to_csv('D:/Other/PY/data/industry_classified2.csv')
File "D:\Soft\Anaconda\lib\site-packages\pandas\core\frame.py", line 1189, in to_csv
formatter.save()
File "D:\Soft\Anaconda\lib\site-packages\pandas\core\format.py", line 1467, in save
get classified ...ok
self._save()
File "D:\Soft\Anaconda\lib\site-packages\pandas\core\format.py", line 1567, in _save
self._save_chunk(start_i, end_i)
File "D:\Soft\Anaconda\lib\site-packages\pandas\core\format.py", line 1594, in _save_chunk
lib.write_csv_rows(self.data, ix, self.nlevels, self.cols, self.writer)
File "pandas\lib.pyx", line 975, in pandas.lib.write_csv_rows (pandas\lib.c:17612)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)
Process finished with exit code 1
Daily data is fine; weekly and monthly data raise an error.
import pandas as pd
import tushare as ts
RawData = ts.get_hist_data('600018', start='2015-01-01', end='2015-07-27', ktype='W')
print(RawData)
Traceback (most recent call last):
File "D:/MyDoc/LCW/Investment/Python/PQuant/test1.py", line 9, in <module>
RawData = ts.get_hist_data('600018', start='2015-01-01', end='2015-07-27', ktype='W')
File "C:\Program Files\Anaconda3\lib\site-packages\tushare\stock\trading.py", line 84, in get_hist_data
df[col] = df[col].astype(float)
File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\generic.py", line 2411, in astype
dtype=dtype, copy=copy, raise_on_error=raise_on_error, **kwargs)
File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\internals.py", line 2504, in astype
return self.apply('astype', dtype=dtype, **kwargs)
File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\internals.py", line 2459, in apply
applied = getattr(b, f)(**kwargs)
File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\internals.py", line 373, in astype
values=values, **kwargs)
File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\internals.py", line 403, in _astype
values = com._astype_nansafe(self.values.ravel(), dtype, copy=True)
File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\common.py", line 2734, in _astype_nansafe
return arr.astype(dtype)
ValueError: could not convert string to float:
Process finished with exit code 1
from tushare import get_stock_basics
base_frame=get_stock_basics()
print base_frame["name"]
How do I get the list of codes?
get_h_data('000300', start='2000-01-01', index=True)
returns a network error.
sr/lib/python2.7/site-packages/tushare/stock/trading.pyc in get_h_data(code, start, end, autype, index, retry_count, pause)
363 ct._write_console()
364 df = _parse_fq_data(_get_index_url(index, code, qt), index,
--> 365 retry_count, pause)
366 data = data.append(df, ignore_index=True)
367 if len(data) == 0 or len(data[(data.date>=start)&(data.date<=end)]) == 0:
/usr/lib/python2.7/site-packages/tushare/stock/trading.pyc in _parse_fq_data(url, index, retry_count, pause)
481 else:
482 return df
--> 483 raise IOError(ct.NETWORK_URL_ERROR_MSG)
484
485
IOError: 获取失败,请检查网络和URL
Fetching only recent data works fine.
Version 0.3.4
import tushare as ts
ts.get_st_classified()
>>> import tushare as ts
>>> ts.get_st_classified()
code name
0 000033 *ST新都
1 000059 *ST华锦
2 000068 *ST华赛
3 000155 *ST川化
4 000403 ST生化
5 000510 *ST金路
6 000520 *ST凤凰
7 000557 *ST广夏
8 000590 *ST古汉
9 000611 *ST蒙发
10 000659 *ST中富
11 000677 *ST海龙
12 000711 *ST京蓝
13 000799 *ST酒鬼
14 000815 *ST美利
15 000892 *ST星美
16 000912 *ST天化
17 000927 *ST夏利
18 000976 *ST春晖
19 000995 *ST皇台
20 002015 *ST霞客
21 002192 *ST路翔
22 002306 *ST云网
23 002417 *ST元达
24 002506 *ST集成
25 002608 *ST舜船
26 002633 *ST申科
27 600069 *ST银鸽
28 600071 *ST光学
29 600091 *ST明科
30 600145 *ST新亿
31 600163 *ST南纸
32 600217 *ST秦岭
33 600242 *ST中昌
34 600247 *ST成城
35 600265 ST景谷
36 600301 *ST南化
37 600401 *ST海润
38 600408 *ST安泰
39 600444 *ST国通
40 600539 *ST狮头
41 600608 *ST沪科
42 600644 *ST乐电
43 600656 *ST博元
44 600691 *ST阳化
45 600710 *ST常林
46 600715 *ST松辽
47 600722 *ST金化
48 600732 *ST新梅
49 600779 *ST水井
50 600793 ST宜纸
51 600817 ST宏盛
52 600870 *ST厦华
53 600962 *ST中鲁
54 600984 *ST建机
Why aren't S shares listed? For example, S佳通.
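Many thanks to jim for all the work.
Would it be possible to implement the following:
Foolproof configuration, for example only needing to set the mysql account and password,
then letting the script run continuously, automatically pulling data from the server and storing it in the database.
When the script runs again the next day, it automatically continues from where it left off, pulling data and storing it in the database,
without inserting duplicates.
Of course, this is not a particularly common need (:
Thanks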
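The "resume without duplicates" part of this request can be sketched with a composite primary key. The example below is an assumption-laden illustration: it uses Python's built-in sqlite3 as a stand-in for MySQL (where the comparable statement is INSERT IGNORE), the table layout and function names are hypothetical, and the sample rows are invented.

```python
import sqlite3


def open_db(path):
    """Open the database and make sure the daily table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily ("
        "code TEXT NOT NULL, date TEXT NOT NULL, close REAL, "
        "PRIMARY KEY (code, date))"
    )
    return conn


def store_rows(conn, rows):
    """Insert (code, date, close) rows; rows already present are skipped."""
    conn.executemany("INSERT OR IGNORE INTO daily VALUES (?, ?, ?)", rows)
    conn.commit()


def last_stored_date(conn, code):
    """Return the newest stored date for a code, or None if nothing is stored."""
    row = conn.execute(
        "SELECT MAX(date) FROM daily WHERE code = ?", (code,)
    ).fetchone()
    return row[0]


conn = open_db(":memory:")
store_rows(conn, [("600848", "2015-07-01", 14.35)])            # first run
store_rows(conn, [("600848", "2015-07-01", 14.35),             # overlap is ignored
                  ("600848", "2015-07-02", 14.10)])            # new day is added
count = conn.execute("SELECT COUNT(*) FROM daily").fetchone()[0]
print(count, last_stored_date(conn, "600848"))
```

A daily job could call last_stored_date first and request only data after that date, with the primary key acting as a safety net against overlap.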
The data is inconsistent with Sina Finance, and even with the ifeng (Phoenix) Finance website.
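When using get_today_all(), there are two suspected bugs:
1. Some stocks sometimes appear twice, and which stocks are duplicated is random.
2. Some stocks are missing, and which stocks are missing is random.
Environment: Windows 7 32-bit, Python 2.7.10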
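Both symptoms can at least be detected on the caller's side. This is a minimal sketch on plain dict rows; dedupe_by_code and missing_codes are hypothetical helpers, and the expected code list has to come from elsewhere.

```python
def dedupe_by_code(rows):
    """Keep the first row seen for each code, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        if row["code"] not in seen:
            seen.add(row["code"])
            unique.append(row)
    return unique


def missing_codes(rows, expected_codes):
    """Return the expected codes that do not appear in rows, sorted."""
    return sorted(set(expected_codes) - {row["code"] for row in rows})


rows = [{"code": "600848"}, {"code": "000033"}, {"code": "600848"}]
print(dedupe_by_code(rows))
print(missing_codes(rows, ["600848", "000033", "600004"]))
```

On a DataFrame, DataFrame.drop_duplicates(subset='code') covers the first case.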
For the type column in the data returned by get_tick_data(), what exactly are the criteria for classifying a trade as buy, sell, or neutral?
AssertionError: 15 columns passed, passed data had 14 columns
,AttributeError: 'list' object has no attribute 'keys'
For example, extend "ts.get_hist_data('600848')" to "ts.get_hist_data(['600848', '600004','000876'])", returning the combined dataframes of each stock's data, with an added 'STK_ID' index or column to tell them apart.
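Until a list argument is supported, the combination can be done in a small wrapper. This is a minimal sketch on plain dict rows; get_hist_many and fetch_one are hypothetical names, with fetch_one standing in for the single-stock call, and the sample values are invented.

```python
def get_hist_many(codes, fetch_one):
    """Concatenate per-stock rows, tagging each with its code under 'STK_ID'."""
    combined = []
    for code in codes:
        for row in fetch_one(code):
            tagged = dict(row)  # copy so the caller's rows are not modified
            tagged["STK_ID"] = code
            combined.append(tagged)
    return combined


def fake_fetch_one(code):
    # Stand-in for the real single-stock call; returns one invented row.
    return [{"date": "2015-05-04", "close": 1.0}]


result = get_hist_many(["600848", "600004"], fake_fetch_one)
print(result)
```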
Which web page is the get_tick_data data scraped from?
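Over the past few days I tried fetching data and writing it piece by piece to a local database, and found that some of the data raises errors. I wonder whether anyone else has run into this.
An example:
df = ts.get_stock_basics() # get the list of all stocks
Writing this data to the database produces this runtime error:
sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (1170, "BLOB/TEXT column 'code' used in key specification without a key length") [SQL: u'CREATE INDEX ix_hist5_code ON hist5 (code)']
When fetching
df = ts.get_index() # market index quotes
there is no error.
Platform: OSX 10.10.4, Python 2.7, MySQL 5.6.26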