Stooq API for non-US stock market data - python

I have been trying to compile current data for several stocks from the WSE (Warsaw Stock Exchange). I am focusing on Stooq because Yahoo doesn't cover the Polish stock market.
When I try the pandas-datareader code for Stooq, I see the error below:
"StooqDailyReader request returned no data;
check URL for invalid inputs: https://stooq.com/q/d/l/"
The CSV file at this link tells me the ticker I provided is wrong, but the same ticker works fine on the Stooq website directly.
Do you know what may be wrong there?
import pandas_datareader.data as web
prices = web.DataReader('KGH', 'stooq')
print(prices)

Do you want something like this?
import quandl
mydata = quandl.get("WSE/WIG20TR")
print(mydata)
That produces the following:
Open High Low Close % Change Turnover (1000s)
Date
2016-10-03 2911.76 2919.84 2911.76 2919.84 0.93 401910.61
2016-10-04 2940.43 2968.24 2940.43 2968.24 1.66 707134.92
2016-10-05 2970.79 2982.08 2970.79 2982.08 0.47 713899.62
2016-10-06 2972.15 2981.52 2972.15 2981.52 -0.02 522210.82
2016-10-07 2972.68 2974.87 2964.30 2964.30 -0.58 456252.80
2016-10-10 2977.32 2988.68 2977.32 2988.39 0.81 472514.88
2016-10-11 2987.31 2987.31 2973.30 2973.30 -0.50 457603.07
2016-10-12 2969.48 2979.04 2969.48 2979.04 0.19 447665.23
2016-10-13 2954.50 2954.50 2923.40 2923.40 -1.87 720287.00
2016-10-14 2937.36 2937.36 2910.02 2910.02 -0.46 498163.77
2016-10-17 2919.28 2919.28 2900.82 2900.82 -0.32 621362.01
2016-10-18 2922.67 2922.67 2901.12 2912.68 0.41 548288.70
2016-10-19 2919.90 2956.99 2919.90 2952.13 1.35 676115.64
2016-10-20 2963.88 2965.74 2959.68 2959.68 0.26 440663.31
2016-10-21 2975.40 2975.40 2965.20 2965.20 0.19 580519.75
2016-10-24 2991.65 3017.78 2991.65 3016.87 1.74 470453.44
2016-10-25 3042.78 3042.78 3018.19 3021.62 0.16 549692.97
2016-10-26 3016.35 3016.35 3004.60 3011.71 -0.33 503003.10
2016-10-27 3035.19 3038.10 3034.83 3038.10 0.88 699662.42
2016-10-28 3031.84 3076.10 3031.84 3076.10 1.25 532073.92
2016-10-31 3085.46 3085.46 3070.81 3070.81 -0.17 681466.77
2016-11-02 3026.11 3026.11 2987.83 2987.83 -2.70 928538.72
2016-11-03 2998.86 2998.86 2990.79 2990.79 0.10 656489.77
2016-11-04 2975.22 2980.33 2975.22 2975.30 -0.52 385598.43
2016-11-07 3005.59 3005.59 2981.35 2981.35 0.20 484758.97
2016-11-08 3013.01 3017.61 3008.44 3017.61 1.22 445944.20
2016-11-09 2988.07 3030.20 2988.07 3030.20 0.42 759370.28
2016-11-10 3084.03 3084.03 3040.50 3040.50 0.34 1199406.31
2016-11-14 3031.86 3031.86 2967.93 2967.93 -2.39 972806.47
2016-11-15 2989.57 2989.57 2964.69 2968.20 0.01 604846.49
... ... ... ... ... ...
2019-05-31 3908.15 3969.56 3893.10 3969.56 0.80 816765.70
2019-06-03 3960.51 3984.63 3949.97 3966.72 -0.07 532820.53
2019-06-04 3965.50 3974.06 3946.34 3959.69 -0.18 675309.03
2019-06-05 3967.09 3970.77 3935.23 3941.11 -0.47 655383.13
2019-06-06 3943.94 4027.33 3941.65 4007.89 1.69 995862.14
2019-06-07 4011.61 4050.60 4011.61 4042.75 0.87 616987.47
2019-06-10 4062.75 4068.64 4029.45 4046.13 0.08 651731.30
2019-06-11 4049.73 4078.42 4039.98 4069.34 0.57 1015106.95
2019-06-12 4052.77 4062.90 4013.92 4046.65 -0.56 769784.26
2019-06-13 4040.89 4091.55 4040.53 4077.50 0.76 811131.81
2019-06-14 4075.39 4075.39 4049.28 4053.14 -0.60 620163.19
2019-06-17 4058.67 4062.29 4026.71 4037.13 -0.40 426200.13
2019-06-18 4036.36 4123.23 4030.44 4123.23 2.13 949164.98
2019-06-19 4124.82 4124.82 4108.94 4113.58 -0.23 593252.42
2019-06-21 4124.86 4155.76 4085.95 4093.37 -0.49 1697240.27
2019-06-24 4113.01 4136.59 4099.92 4133.72 0.99 521246.30
2019-06-25 4123.47 4128.56 4073.93 4084.48 -1.19 692376.39
2019-06-26 4095.28 4109.06 4081.32 4109.06 0.60 635235.61
2019-06-27 4119.07 4160.37 4119.07 4140.91 0.78 682904.34
2019-06-28 4141.09 4144.04 4125.12 4132.41 -0.21 592757.96
2019-07-01 4195.52 4195.52 4130.92 4136.08 0.09 527728.12
2019-07-02 4152.31 4156.24 4101.91 4156.24 0.49 719797.10
2019-07-03 4145.92 4170.18 4141.94 4164.17 0.19 670327.79
2019-07-04 4165.05 4184.44 4152.69 4183.57 0.47 490173.23
2019-07-05 4186.73 4186.73 4145.94 4157.86 -0.61 516459.10
2019-07-08 4140.77 4167.74 4131.49 4152.64 -0.13 598552.93
2019-07-09 4148.48 4148.48 4109.20 4125.50 -0.65 679278.48
2019-07-10 4124.36 4174.02 4113.97 4125.75 0.01 764583.50
2019-07-11 4146.63 4167.49 4121.76 4133.21 0.18 598836.12
2019-07-12 4144.03 4145.02 4129.71 4130.38 -0.07 535157.11
[691 rows x 6 columns]
https://www.quandl.com/data/WSE-Warsaw-Stock-Exchange-GPW?keyword=KGH
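As a separate sketch, you could also check what Stooq itself returns for your ticker by building the CSV URL by hand. The endpoint comes from the error message in the question; the query parameter names `s` (symbol) and `i` (interval) and the helper name `stooq_csv_url` are my assumptions, so treat this as a starting point rather than a documented API.

```python
from urllib.parse import urlencode

# Endpoint taken from the error message in the question.
STOOQ_CSV_ENDPOINT = "https://stooq.com/q/d/l/"


def stooq_csv_url(ticker, interval="d"):
    """Build a Stooq CSV download URL (parameter names are assumptions)."""
    query = urlencode({"s": ticker.lower(), "i": interval})
    return f"{STOOQ_CSV_ENDPOINT}?{query}"


url = stooq_csv_url("KGH")
print(url)

# Loading the file needs network access and pandas, so it is left commented out:
# import pandas as pd
# prices = pd.read_csv(url, index_col="Date", parse_dates=True)
```

Opening the printed URL in a browser shows whether Stooq accepts the ticker in that form. It may also be worth trying the ticker with a `.PL` suffix in pandas-datareader, since I believe bare tickers get a `.US` suffix by default, but I have not verified that against your installed version.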

Python: Graph this QUARTERLY data with Year below Quarters

I am fairly new to Python. I have been trying to graph this quarterly data like the chart produced here (How do you create a line chart with quarter and year labels with monthly ticks?). I am sorry for the naïve question, but any help would be appreciated. The part I am stuck on is building the series of quarter label strings (the Qs). Thanks.
Date A B C
3/31/2013 7.16333 0.6 0.6982
6/30/2013 7.87967 0.6 0.6726
9/30/2013 7.26133 0.6 0.7771
12/31/2013 6.66667 0.6 0.9108
3/31/2014 7.06267 0.6 0.8292
6/30/2014 7.41867 0.6 0.7069
9/30/2014 7.85617 0.6 0.6246
12/31/2014 6.93353 0.8305 0.5752
3/31/2015 6.40496 0.987 0.586
6/30/2015 6.93939 0.8629 0.575
9/30/2015 6.12374 1.0991 0.5794
12/31/2015 6.12928 1.0922 0.5806
3/31/2016 5.37414 1.2744 0.6523
6/30/2016 5.8968 1.1046 0.5851
9/30/2016 5.84815 1.0991 0.5884
12/31/2016 5.59963 1.1397 0.5901
3/31/2017 5.68668 1.0695 0.5815
6/30/2017 5.0588 1.2126 0.5957
9/30/2017 5.07095 1.2178 0.5888
12/31/2017 5.07308 1.1371 0.593
3/31/2018 5.06668 1.1697 0.6059
6/30/2018 5.30797 0.9936 0.6167
9/30/2018 5.47215 0.8294 0.6733
12/31/2018 4.30104 1.0148 0.903
3/31/2019 4.77011 0.8565 0.924
6/30/2019 6.23133 0.6 1.0592
9/30/2019 6.1635 0.6 1.0556
12/31/2019 6.02583 0.6 1.0303
3/31/2020 7.53533 0.6 1.085
6/30/2020 6.42743 1.0238 0.5975
9/30/2020 7.33954 0.7784 0.6923
12/31/2020 8.62803 0.6514 0.6792
3/31/2021 8.28196 0.6359 0.7152
6/30/2021 7.63684 0.6347 0.7456
9/30/2021 6.92014 0.7851 0.6163
12/31/2021 7.04538 0.8175 0.5068
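One possible way to build the quarter label strings is sketched below. It assumes the dates are strings in month/day/year form as in the table, and that showing the year under the first quarter of each year is close enough to the linked chart. The helper names `quarter_labels` and `plot_quarterly` are made up for this sketch.

```python
from datetime import datetime


def quarter_labels(date_strings, fmt="%m/%d/%Y"):
    """Return 'Q1'..'Q4' labels, adding the year on a second line when it changes."""
    labels = []
    previous_year = None
    for text in date_strings:
        date = datetime.strptime(text, fmt)
        quarter = (date.month - 1) // 3 + 1
        label = f"Q{quarter}"
        if date.year != previous_year:
            label += f"\n{date.year}"  # year appears below the quarter
            previous_year = date.year
        labels.append(label)
    return labels


def plot_quarterly(date_strings, series):
    """Plot each named series against evenly spaced quarters (needs matplotlib)."""
    import matplotlib.pyplot as plt  # imported here so the label helper works without it

    positions = list(range(len(date_strings)))
    fig, ax = plt.subplots()
    for name, values in series.items():
        ax.plot(positions, values, marker="o", label=name)
    ax.set_xticks(positions)
    ax.set_xticklabels(quarter_labels(date_strings))
    ax.legend()
    return fig


# First five rows of the data in the question.
dates = ["3/31/2013", "6/30/2013", "9/30/2013", "12/31/2013", "3/31/2014"]
print(quarter_labels(dates))
```

Calling `plot_quarterly(dates, {"A": [7.16333, 7.87967, 7.26133, 6.66667, 7.06267]})` and saving or showing the returned figure should then give quarter ticks with the year underneath.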

matplotlib for stock data analysis plot not correct

I'm using matplotlib to draw trend lines for stock data.
import pandas as pd
import matplotlib.pyplot as plt
A = pd.read_csv('daily/A.csv', index_col=[0])
print(A)
AAL = pd.read_csv('daily/AAL.csv', index_col=[0])
print(AAL)
A['Close'].plot()
AAL['Close'].plot()
plt.show()
The result is:
High Low Open Close Volume Adj Close
Date
1999-11-18 35.77 28.61 32.55 31.47 62546300.0 27.01
1999-11-19 30.76 28.48 30.71 28.88 15234100.0 24.79
1999-11-22 31.47 28.66 29.55 31.47 6577800.0 27.01
1999-11-23 31.21 28.61 30.40 28.61 5975600.0 24.56
1999-11-24 30.00 28.61 28.70 29.37 4843200.0 25.21
... ... ... ... ... ... ...
2020-06-24 89.08 86.32 89.08 86.56 1806600.0 86.38
2020-06-25 87.35 84.80 86.43 87.26 1350100.0 87.08
2020-06-26 87.56 85.52 87.23 85.90 2225800.0 85.72
2020-06-29 87.36 86.11 86.56 87.29 1302500.0 87.29
2020-06-30 88.88 87.24 87.33 88.37 1428931.0 88.37
[5186 rows x 6 columns]
High Low Open Close Volume Adj Close
Date
2005-09-27 21.40 19.10 21.05 19.30 961200.0 18.19
2005-09-28 20.53 19.20 19.30 20.50 5747900.0 19.33
2005-09-29 20.58 20.10 20.40 20.21 1078200.0 19.05
2005-09-30 21.05 20.18 20.26 21.01 3123300.0 19.81
2005-10-03 21.75 20.90 20.90 21.50 1057900.0 20.27
... ... ... ... ... ... ...
2020-06-24 13.90 12.83 13.59 13.04 140975500.0 13.04
2020-06-25 13.24 12.18 12.53 13.17 117383400.0 13.17
2020-06-26 13.29 12.13 13.20 12.38 108813000.0 12.38
2020-06-29 13.51 12.02 12.57 13.32 114650300.0 13.32
2020-06-30 13.48 12.88 13.10 13.07 68669742.0 13.07
[3715 rows x 6 columns]
Yes, the two stocks start on different dates, but the end date is the same.
The plot I get looks like this:
stockplot
It does not look like a normal chart.
Can anyone advise me how to draw a normal trend line for the two stocks?
You can try making two separate plots with the same axis limits and then putting one over the other for comparison.
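Another thing that may be worth checking, assuming the CSV dates were read as plain strings (which is what `read_csv` does when `parse_dates` is not set), is whether both series share a real date axis. The sketch below uses tiny in-memory stand-ins for `daily/A.csv` and `daily/AAL.csv`, and the helper name `read_close` is made up for illustration.

```python
import io

import pandas as pd


def read_close(csv_source, name):
    """Read a price CSV and return its Close column indexed by real dates."""
    frame = pd.read_csv(csv_source, index_col=0, parse_dates=True)
    return frame["Close"].rename(name)


# In-memory stand-ins for the two files, using a few values from the question.
a_csv = io.StringIO("Date,Close\n1999-11-18,31.47\n2020-06-29,87.29\n2020-06-30,88.37\n")
aal_csv = io.StringIO("Date,Close\n2005-09-27,19.30\n2020-06-29,13.32\n2020-06-30,13.07\n")

# Align both series on one date index; dates missing from a series become NaN.
closes = pd.concat(
    [read_close(a_csv, "A"), read_close(aal_csv, "AAL")], axis=1
).sort_index()
print(closes)

# closes.plot() would then draw both lines against the same date axis.
```

With a shared date index, the stock that starts later simply has no line for the earlier years instead of being drawn from the left edge.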

How can I read the data in DataFrame one by one with a loop statement?

Say I have:
>>> import tushare as ts
>>> df = ts.get_stock_basics()
>>> print(df)
name industry area pe ... profit gpr npr holders
code ...
000629 攀钢钒钛 小金属 四川 15.46 ... 176.65 25.39 16.99 328000.0
002113 天润数娱 互联网 湖南 122.93 ... 75.16 52.10 10.44 63566.0
603029 天鹅股份 农用机械 山东 0.00 ... -132.89 35.05 -8.76 13965.0
600721 百花村 生物制药 新疆 23.82 ... 21.22 42.37 24.30 20891.0
300493 润欣科技 通信设备 上海 53.90 ... 2.11 10.69 3.45 23308.0
600532 宏达矿业 普钢 上海 0.00 ... -64.67 2.00 -4.47 16451.0
300749 顶固集创 家居用品 广东 50.01 ... 0.00 38.01 7.66 56044.0
300748 金力永磁 元器件 江西 40.92 ... 0.00 20.94 8.47 79240.0
002931 锋龙股份 汽车配件 浙江 70.47 ... 0.00 32.10 13.05 15734.0
600101 明星电力 水力发电 四川 23.21 ... 26.58 13.71 7.13 36654.0
002219 恒康医疗 中成药 甘肃 51.07 ... -56.30 31.11 3.88 24831.0
000593 大通燃气 供气供热 四川 161.89 ... 3.12 22.64 3.14 29631.0
002937 兴瑞科技 元器件 浙江 29.97 ... 0.00 26.30 10.13 87519.0
600568 中珠医疗 区域地产 湖北 51.79 ... -66.89 51.11 13.11 21593.0
603701 德宏股份 汽车配件 浙江 21.60 ... -3.29 31.54 16.26 13549.0
600603 广汇物流 仓储物流 四川 17.85 ... 76.97 55.25 24.74 26642.0
300005 探路者 服饰 北京 71.50 ... -69.47 30.23 2.75 42941.0
002568 百润股份 红黄药酒 上海 40.70 ... 28.41 68.29 14.17 21333.0
000697 炼石有色 航空 陕西 0.00 ... 10.23 18.64 -19.91 33614.0
002007 华兰生物 生物制药 河南 38.68 ... 5.05 60.09 37.71 44000.0
000782 美达股份 化纤 广东 60.01 ... 142.72 8.78 1.31 42060.0
603538 美诺华 化学制药 浙江 25.74 ... 37.21 27.65 12.94 11346.0
002627 宜昌交运 公共交通 湖北 33.06 ... 41.84 12.09 4.05 9891.0
002864 盘龙药业 中成药 陕西 57.00 ... 71.07 70.89 14.40 22538.0
300649 杭州园林 建筑施工 浙江 63.27 ... 90.70 18.98 7.17 16732.0
300168 万达信息 软件服务 上海 138.01 ... 101.63 38.12 7.64 42770.0
002299 圣农发展 农业综合 福建 30.52 ... 204.28 13.87 6.61 22637.0
600290 华仪电气 电气设备 浙江 144.15 ... 13.64 25.08 1.48 14832.0
002496 ST辉丰 农药化肥 江苏 17.21 ... -58.07 35.44 5.96 63487.0
002437 誉衡药业 化学制药 黑龙江 16.93 ... 3.09 73.33 8.91 45373.0
I want to output each code (the first column, which is the index) one by one, so that I can pass each code into another function.
You can use pandas.Index.map:
import pandas as pd
from math import sqrt
df = pd.DataFrame({'col1': (1,2,3), 'col2': (3,4,6),}, index=[1,4,9])
df
Out:
col1 col2
1 1 3
4 2 4
9 3 6
mapped_index = df.index.map(sqrt)
mapped_index
Out:
Float64Index([1.0, 2.0, 3.0], dtype='float64')
Then, if you need, you can just iterate through the result:
for i in df.index.map(sqrt):
    ...
A simple demonstration traversing DataFrame index:
import pandas as pd
import tushare as ts
df = ts.get_stock_basics()
print(df)
for i in df.index:
    print(i, type(i))
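To go one step further and actually hand each code to another function, something like the following might work. It is only a sketch: the small DataFrame stands in for `ts.get_stock_basics()` so that it runs without tushare, and `describe_code` is a hypothetical placeholder for whatever function should receive each code.

```python
import pandas as pd


def describe_code(code):
    """Hypothetical stand-in for the function that should receive each code."""
    return f"processing {code}"


# Small stand-in for ts.get_stock_basics(): the stock codes form the index.
df = pd.DataFrame(
    {"name": ["攀钢钒钛", "天润数娱", "天鹅股份"], "pe": [15.46, 122.93, 0.00]},
    index=pd.Index(["000629", "002113", "603029"], name="code"),
)

# Iterating over df.index yields the codes one at a time.
results = [describe_code(code) for code in df.index]
print(results)
```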

How to solve NaN values error using Lmfit with Python

I'm trying to fit a set of data produced by an external simulation and stored in a vector, using the Lmfit library.
Here is my code:
import numpy as np
import matplotlib.pyplot as plt
from lmfit import Model
from lmfit import Parameters
def DGauss3Par(x,I1,sigma1,sigma2):
    I2 = 2.63 - I1
    return (I1/np.sqrt(2*np.pi*sigma1))*np.exp(-(x*x)/(2*sigma1*sigma1)) + (I2/np.sqrt(2*np.pi*sigma2))*np.exp(-(x*x)/(2*sigma2*sigma2))
#TAKE DATA
xFull = []
yFull = []
fileTypex = np.dtype([('xFull', np.float)])
fileTypey = np.dtype([('yFull', np.float)])
fDatax = "xValue.dat"
fDatay = "yValue.dat"
xFull = np.loadtxt(fDatax, dtype=fileTypex)
yFull = np.loadtxt(fDatay, dtype=fileTypey)
xGauss = xFull[:]["xFull"]
yGauss = yFull[:]["yFull"]
#MODEL'S DEFINITION
gmodel = Model(DGauss3Par)
params = Parameters()
params.add('I1', value=1.66)
params.add('sigma1', value=1.04)
params.add('sigma2', value=1.2)
result3 = gmodel.fit(yGauss, x=xGauss, params=params)
#PLOTS
plt.plot(xGauss, result3.best_fit, 'y-')
plt.show()
When I run it, I get this error:
File "Overlap.py", line 133, in <module>
result3 = gmodel.fit(yGauss, x=xGauss, params=params)
ValueError: The input contains nan values
These are the values of the data contained in the vector xGauss (related to the x axis):
[-3.88 -3.28 -3.13 -3.08 -3.03 -2.98 -2.93 -2.88 -2.83 -2.78 -2.73 -2.68
-2.63 -2.58 -2.53 -2.48 -2.43 -2.38 -2.33 -2.28 -2.23 -2.18 -2.13 -2.08
-2.03 -1.98 -1.93 -1.88 -1.83 -1.78 -1.73 -1.68 -1.63 -1.58 -1.53 -1.48
-1.43 -1.38 -1.33 -1.28 -1.23 -1.18 -1.13 -1.08 -1.03 -0.98 -0.93 -0.88
-0.83 -0.78 -0.73 -0.68 -0.63 -0.58 -0.53 -0.48 -0.43 -0.38 -0.33 -0.28
-0.23 -0.18 -0.13 -0.08 -0.03 0.03 0.08 0.13 0.18 0.23 0.28 0.33
0.38 0.43 0.48 0.53 0.58 0.63 0.68 0.73 0.78 0.83 0.88 0.93
0.98 1.03 1.08 1.13 1.18 1.23 1.28 1.33 1.38 1.43 1.48 1.53
1.58 1.63 1.68 1.73 1.78 1.83 1.88 1.93 1.98 2.03 2.08 2.13
2.18 2.23 2.28 2.33 2.38 2.43 2.48 2.53 2.58 2.63 2.68 2.73
2.78 2.83 2.88 2.93 2.98 3.03 3.08 3.13 3.28 3.88]
And these are the values in the vector yGauss (related to the y axis):
[0.00173977 0.00986279 0.01529543 0.0242624 0.0287456 0.03238484
0.03285927 0.03945234 0.04615091 0.05701618 0.0637672 0.07194268
0.07763934 0.08565687 0.09615262 0.1043281 0.11350606 0.1199406
0.1260062 0.14093328 0.15079665 0.16651464 0.18065023 0.1938894
0.2047541 0.21794024 0.22806706 0.23793043 0.25164404 0.2635118
0.28075974 0.29568682 0.30871501 0.3311846 0.34648062 0.36984661
0.38540666 0.40618835 0.4283945 0.45002014 0.48303911 0.50746062
0.53167057 0.5548792 0.57835128 0.60256181 0.62566436 0.65704847
0.68289386 0.71332794 0.73258027 0.769608 0.78769989 0.81407275
0.83358852 0.85210239 0.87109068 0.89456217 0.91618782 0.93760247
0.95680234 0.96919757 0.9783219 0.98486193 0.9931429 0.9931429
0.98486193 0.9783219 0.96919757 0.95680234 0.93760247 0.91618782
0.89456217 0.87109068 0.85210239 0.83358852 0.81407275 0.78769989
0.769608 0.73258027 0.71332794 0.68289386 0.65704847 0.62566436
0.60256181 0.57835128 0.5548792 0.53167057 0.50746062 0.48303911
0.45002014 0.4283945 0.40618835 0.38540666 0.36984661 0.34648062
0.3311846 0.30871501 0.29568682 0.28075974 0.2635118 0.25164404
0.23793043 0.22806706 0.21794024 0.2047541 0.1938894 0.18065023
0.16651464 0.15079665 0.14093328 0.1260062 0.1199406 0.11350606
0.1043281 0.09615262 0.08565687 0.07763934 0.07194268 0.0637672
0.05701618 0.04615091 0.03945234 0.03285927 0.03238484 0.0287456
0.0242624 0.01529543 0.00986279 0.00173977]
I've also tried to print the values returned by my function, to see if there really were some NaN values:
params = Parameters()
params.add('I1', value=1.66)
params.add('sigma1', value=1.04)
params.add('sigma2', value=1.2)
func = DGauss3Par(xGauss,I1,sigma1,sigma2)
print func
but what I obtained is:
[0.04835225 0.06938855 0.07735839 0.08040181 0.08366964 0.08718237
0.09096169 0.09503048 0.0994128 0.10413374 0.10921938 0.11469669
0.12059333 0.12693754 0.13375795 0.14108333 0.14894236 0.15736337
0.16637406 0.17600115 0.18627003 0.19720444 0.20882607 0.22115413
0.23420498 0.24799173 0.26252377 0.27780639 0.29384037 0.3106216
0.32814069 0.34638266 0.3653266 0.38494543 0.40520569 0.42606735
0.44748374 0.46940149 0.49176057 0.51449442 0.5375301 0.56078857
0.58418507 0.60762948 0.63102687 0.65427809 0.6772804 0.69992818
0.72211377 0.74372824 0.76466232 0.78480729 0.80405595 0.82230355
0.83944875 0.85539458 0.87004937 0.88332762 0.89515085 0.90544838
0.91415806 0.92122688 0.92661155 0.93027889 0.93220625 0.93220625
0.93027889 0.92661155 0.92122688 0.91415806 0.90544838 0.89515085
0.88332762 0.87004937 0.85539458 0.83944875 0.82230355 0.80405595
0.78480729 0.76466232 0.74372824 0.72211377 0.69992818 0.6772804
0.65427809 0.63102687 0.60762948 0.58418507 0.56078857 0.5375301
0.51449442 0.49176057 0.46940149 0.44748374 0.42606735 0.40520569
0.38494543 0.3653266 0.34638266 0.32814069 0.3106216 0.29384037
0.27780639 0.26252377 0.24799173 0.23420498 0.22115413 0.20882607
0.19720444 0.18627003 0.17600115 0.16637406 0.15736337 0.14894236
0.14108333 0.13375795 0.12693754 0.12059333 0.11469669 0.10921938
0.10413374 0.0994128 0.09503048 0.09096169 0.08718237 0.08366964
0.08040181 0.07735839 0.06938855 0.04835225]
So it doesn't seem that there are any NaN values, and I don't understand why it gives me that error.
Could anyone help me, please? Thanks!
If you add a print call to your fit function, printing out sigma1 and sigma2, you'll find that
DGauss3Par has already been evaluated a few times before the error occurs.
Both sigma variables have a negative value at the time the error occurs.
Taking the square root of a negative value causes, of course, a NaN.
You should add a min bound or similar to your sigma1 and sigma2 parameters to prevent this. Using min=0.0 as an additional argument to params.add(...) will result in a good fit.
Be aware that for some analyses, setting explicit bounds to your fitting parameters may make these analyses invalid. For most cases, you'll be fine, but for some cases, you'll need to check whether the fitting parameters should be allowed to vary from negative infinity to positive infinity, or are allowed to be bounded.
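The effect described above can be reproduced with NumPy alone. This sketch copies DGauss3Par from the question and evaluates it once with a negative sigma1 and once with the starting values; the three sample x values are arbitrary and only serve the demonstration.

```python
import numpy as np


def DGauss3Par(x, I1, sigma1, sigma2):
    """Double Gaussian from the question, copied unchanged apart from layout."""
    I2 = 2.63 - I1
    return ((I1 / np.sqrt(2 * np.pi * sigma1)) * np.exp(-(x * x) / (2 * sigma1 * sigma1))
            + (I2 / np.sqrt(2 * np.pi * sigma2)) * np.exp(-(x * x) / (2 * sigma2 * sigma2)))


x = np.array([-1.0, 0.0, 1.0])  # arbitrary sample points

# A negative sigma1 puts a negative number under the square root.
with np.errstate(invalid="ignore"):  # silence the RuntimeWarning for this demo
    bad = DGauss3Par(x, 1.66, -0.5, 1.2)

# The starting values from the question give finite results.
good = DGauss3Par(x, 1.66, 1.04, 1.2)

print(np.isnan(bad).all())
print(np.isfinite(good).all())

# With lmfit, the lower bound suggested above would be written as:
# params.add('sigma1', value=1.04, min=0.0)
# params.add('sigma2', value=1.2, min=0.0)
```

This also explains why printing the function output at the starting values shows no NaN: the problem only appears once the fitter tries a negative sigma.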

Error building a function to calculate standard deviation

I am new to Python and I am trying to build a function to run some statistics on a data set. The data is in an Excel format and it contains 7 rows, with the first row ... I know what a function is and how it should be built; nevertheless, I can't figure out how to build this function.
This is the function:
def st_dev(benchmark, factor):
    benchmark = mkt_ret
    factor = smb
    statistics = st.stdev(benchmark, factor)
    return statistics
print(st_dev)
And this is the result:
Mkt-RF SMB HML RMW CMA RF
196307 -0.39 -0.46 -0.81 0.72 -1.16 0.27
196308 5.07 -0.81 1.65 0.42 -0.4 0.25
196309 -1.57 -0.48 0.19 -0.8 0.23 0.27
196310 2.53 -1.29 -0.09 2.75 -2.26 0.29
196311 -0.85 -0.85 1.71 -0.34 2.22 0.27
4.38
<function st_dev at 0x0000000002D92F28>
Process finished with exit code 0
the full code can be viewed here.
I tried several ways of writing the function; some error messages told me that I cannot convert 'Series' to numerator/denominator.
I am running Python 3.7.
Thank you for your help.
Alex
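A minimal sketch of one way such a function might be written, assuming `st` is the standard-library `statistics` module and that the goal is one standard deviation per series. Two things stand out in the code above: `statistics.stdev` treats its second argument as a precomputed mean rather than a second data set, and `print(st_dev)` prints the function object instead of calling it. The lists below are the first five Mkt-RF and SMB values from the printed table, used as stand-ins for the real columns.

```python
import statistics as st


def st_dev(benchmark, factor):
    """Return the sample standard deviation of each series as a tuple."""
    return st.stdev(benchmark), st.stdev(factor)


# First five Mkt-RF and SMB values from the table in the question.
mkt_ret = [-0.39, 5.07, -1.57, 2.53, -0.85]
smb = [-0.46, -0.81, -0.48, -1.29, -0.85]

# Call the function with arguments instead of printing the function object.
benchmark_sd, factor_sd = st_dev(mkt_ret, smb)
print(benchmark_sd, factor_sd)
```

The arguments are used as passed in rather than being overwritten inside the function, so the same function works for any pair of columns.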
