I am downloading a Eurostat dataset in Python using the eurostat package, and the resulting DataFrame is tricky to work with. I have been trying to turn the panel data into a time series, but I have not been successful.
I have filtered and cleaned the data a little, but I have failed to turn the table into a time series (I am fairly new to Python). Below is my code:
#pip install eurostat
import pandas as pd
import eurostat
# Commercial flights by reporting country – monthly data (source: Eurocontrol)
df_eurostat = eurostat.get_data_df('avia_tf_cm')
df_eurostat = df_eurostat.rename(columns={'geo\\time':'Region'})
# To exclude: 'EU27_2020', 'EU28'
# df_eurostat = df_eurostat.drop(columns='unit').T
country_list = ['AL', 'AT', 'BE', 'BG', 'CH', 'CY', 'CZ', 'DE', 'DK', 'EE', 'EL',
'ES', 'FI', 'FR', 'HR', 'HU', 'IE', 'IS', 'IT', 'LT', 'LU', 'LV',
'ME', 'MK', 'MT', 'NL', 'NO', 'PL', 'PT', 'RO', 'RS', 'SE', 'SI',
'SK', 'TR', 'UK']
df_eurostat = df_eurostat[df_eurostat['Region'].isin(country_list)]
df_eurostat = df_eurostat.loc[(df_eurostat['unit']=='NR')]
Before:
After - what I want to achieve:
Would highly appreciate it if anyone could help. Thank you in advance!
One more step:
to_date = lambda x: pd.to_datetime(x['Date'], format='%YM%m')  # Eurostat months look like '2021M12'
df_eurostat = df_eurostat.drop(columns='unit').set_index('Region').T \
.rename_axis(index='Date', columns=None) \
.reset_index().assign(Date=to_date)
Output:
>>> df_eurostat
Date AL AT BE BG CH CY CZ DE DK ... NO PL PT RO RS SE SI SK TR UK
0 2021-12-01 2265.0 15224.0 20055.0 4188.0 24102.0 3851.0 6690.0 94592.0 17277.0 ... 32284.0 23299.0 23977.0 10653.0 3804.0 19148.0 1038.0 1224.0 55338.0 96922.0
1 2021-11-01 1953.0 15513.0 20445.0 3694.0 21180.0 4452.0 6549.0 96853.0 17630.0 ... 33727.0 22105.0 23334.0 9294.0 3578.0 19088.0 993.0 1040.0 57975.0 90265.0
2 2021-10-01 2358.0 18314.0 21520.0 4945.0 26289.0 7118.0 7019.0 115037.0 18805.0 ... 33051.0 23325.0 27620.0 11708.0 4017.0 19070.0 1137.0 1178.0 81820.0 103358.0
3 2021-09-01 2998.0 18856.0 21834.0 6853.0 24979.0 6488.0 7785.0 107754.0 17609.0 ... 31901.0 25523.0 26989.0 13370.0 4691.0 18503.0 1155.0 1453.0 81744.0 98183.0
4 2021-08-01 3705.0 19579.0 22261.0 8807.0 26451.0 6873.0 7815.0 106657.0 16538.0 ... 28870.0 26381.0 29506.0 14416.0 5761.0 17061.0 1268.0 1695.0 90404.0 92697.0
5 2021-07-01 2973.0 17697.0 21617.0 7663.0 24531.0 6418.0 7291.0 99334.0 15357.0 ... 26152.0 24355.0 26176.0 13446.0 5831.0 15591.0 1210.0 1608.0 87664.0 72389.0
6 2021-06-01 2173.0 11225.0 15313.0 4441.0 15021.0 4328.0 5151.0 68482.0 8958.0 ... 21798.0 17129.0 19879.0 10222.0 3955.0 11832.0 788.0 992.0 58319.0 50648.0
7 2021-05-01 1452.0 7783.0 11247.0 2796.0 11619.0 3016.0 3051.0 51870.0 5993.0 ... 19007.0 8933.0 13758.0 6936.0 2736.0 8661.0 592.0 436.0 36572.0 35027.0
8 2021-04-01 1039.0 6632.0 9537.0 2457.0 10199.0 1872.0 2310.0 45712.0 4994.0 ... 18183.0 7256.0 10086.0 5720.0 2203.0 7683.0 455.0 280.0 39540.0 27739.0
9 2021-03-01 935.0 5327.0 8454.0 2071.0 8431.0 1334.0 2174.0 39463.0 4615.0 ... 19120.0 6120.0 6216.0 4212.0 1829.0 7502.0 479.0 377.0 38896.0 25305.0
10 2021-02-01 751.0 3976.0 7836.0 1756.0 7116.0 992.0 1889.0 30330.0 3522.0 ... 16159.0 4553.0 5134.0 3543.0 1527.0 6274.0 391.0 418.0 30167.0 20496.0
11 2021-01-01 881.0 4801.0 9481.0 2229.0 9262.0 1064.0 2208.0 36932.0 4937.0 ... 18953.0 6943.0 9227.0 4555.0 1741.0 7203.0 402.0 444.0 32167.0 28100.0
12 2020-12-01 880.0 5271.0 10360.0 2577.0 9804.0 1316.0 2572.0 39709.0 6030.0 ... 18913.0 7898.0 10387.0 4463.0 1887.0 8003.0 416.0 521.0 29614.0 38484.0
13 2020-11-01 872.0 5409.0 9787.0 2265.0 7667.0 1528.0 2248.0 40854.0 6328.0 ... 21194.0 8035.0 9738.0 3661.0 2130.0 8903.0 404.0 362.0 36441.0 34516.0
14 2020-10-01 1227.0 9237.0 11507.0 3392.0 12132.0 3185.0 3271.0 64376.0 9356.0 ... 24317.0 13245.0 15886.0 6179.0 2817.0 11103.0 577.0 653.0 49092.0 61735.0
15 2020-09-01 1513.0 11990.0 12241.0 4429.0 14364.0 3464.0 4749.0 69292.0 10604.0 ... 24939.0 15927.0 17980.0 7112.0 2845.0 10819.0 664.0 901.0 51449.0 72451.0
16 2020-08-01 2087.0 13469.0 14772.0 5396.0 18023.0 3770.0 5157.0 73205.0 10657.0 ... 24069.0 18681.0 20945.0 8059.0 2898.0 9963.0 796.0 1114.0 51758.0 79123.0
17 2020-07-01 1754.0 10377.0 13294.0 5026.0 15326.0 2914.0 4441.0 62889.0 9168.0 ... 23057.0 14361.0 14599.0 6925.0 2846.0 8154.0 703.0 783.0 36743.0 52547.0
18 2020-06-01 400.0 3901.0 6902.0 2495.0 6319.0 996.0 1715.0 31467.0 4085.0 ... 17126.0 3120.0 4340.0 2386.0 1570.0 5025.0 513.0 382.0 18020.0 21071.0
19 2020-05-01 186.0 1628.0 5626.0 1521.0 2841.0 457.0 979.0 20787.0 2245.0 ... 13377.0 1106.0 2208.0 1391.0 494.0 3716.0 340.0 191.0 4703.0 16397.0
20 2020-04-01 134.0 1297.0 4708.0 931.0 1936.0 355.0 823.0 17894.0 1974.0 ... 13114.0 1059.0 1600.0 1393.0 295.0 3422.0 369.0 207.0 3726.0 13634.0
21 2020-03-01 862.0 13690.0 16101.0 3551.0 20060.0 2749.0 5807.0 84579.0 14416.0 ... 28254.0 14506.0 17820.0 8349.0 2529.0 19940.0 903.0 811.0 40122.0 96914.0
22 2020-02-01 1667.0 24837.0 22531.0 4923.0 33073.0 4030.0 9417.0 128115.0 22684.0 ... 36181.0 27688.0 25712.0 12360.0 4278.0 27289.0 1256.0 1511.0 60161.0 137542.0
23 2020-01-01 1984.0 25526.0 23595.0 5261.0 34628.0 4422.0 10130.0 132506.0 23224.0 ... 38375.0 29776.0 26492.0 13357.0 4614.0 27758.0 1325.0 1580.0 66067.0 141097.0
24 2019-12-01 2204.0 25704.0 24205.0 5187.0 33464.0 4233.0 11243.0 134607.0 22640.0 ... 35866.0 29886.0 27860.0 13759.0 4792.0 27187.0 1409.0 1747.0 65175.0 148395.0
25 2019-11-01 1983.0 24584.0 24661.0 4931.0 30263.0 4886.0 11019.0 139360.0 24478.0 ... 39602.0 29281.0 27347.0 13367.0 4675.0 29459.0 1375.0 1641.0 68215.0 143007.0
26 2019-10-01 2173.0 28210.0 28315.0 6027.0 36833.0 7826.0 13484.0 175844.0 28961.0 ... 44260.0 33407.0 35370.0 15213.0 5655.0 34250.0 1274.0 1934.0 92012.0 179242.0
27 2019-09-01 2572.0 29329.0 29049.0 9908.0 37735.0 8426.0 15865.0 176614.0 29324.0 ... 43968.0 36534.0 37728.0 16539.0 6418.0 35217.0 2242.0 2917.0 99239.0 186990.0
28 2019-08-01 3012.0 30197.0 29686.0 12911.0 38535.0 9024.0 16373.0 174218.0 29149.0 ... 43548.0 37931.0 40481.0 17583.0 7150.0 32993.0 2726.0 3444.0 110635.0 196632.0
29 2019-07-01 2954.0 30638.0 30711.0 12911.0 39715.0 8895.0 16339.0 178418.0 28525.0 ... 42885.0 37728.0 40453.0 17591.0 7041.0 31426.0 2757.0 3535.0 108069.0 196964.0
30 2019-06-01 2479.0 29954.0 28327.0 10775.0 37872.0 8428.0 15645.0 171786.0 29099.0 ... 43533.0 35891.0 37269.0 16189.0 6103.0 33742.0 2508.0 2937.0 98885.0 189383.0
31 2019-05-01 2262.0 28262.0 28503.0 7053.0 37384.0 7597.0 13281.0 171324.0 28684.0 ... 43880.0 34017.0 36306.0 15470.0 5319.0 34604.0 2555.0 2104.0 86267.0 187445.0
32 2019-04-01 2110.0 27218.0 27080.0 5539.0 36308.0 6305.0 11985.0 158711.0 25866.0 ... 39057.0 30874.0 34229.0 14432.0 4958.0 31597.0 2426.0 1955.0 73548.0 169391.0
33 2019-03-01 1775.0 27362.0 24518.0 5108.0 37157.0 4415.0 11213.0 150008.0 26319.0 ... 41485.0 28338.0 28165.0 13041.0 4299.0 33232.0 2285.0 1848.0 67939.0 157772.0
34 2019-02-01 1625.0 23368.0 21019.0 4529.0 33206.0 3526.0 9256.0 131628.0 22559.0 ... 36782.0 25442.0 24069.0 11906.0 3824.0 28637.0 2022.0 1614.0 59619.0 139353.0
35 2019-01-01 1925.0 24110.0 23694.0 4990.0 35228.0 3751.0 10059.0 138258.0 23211.0 ... 38933.0 27756.0 26258.0 13292.0 4237.0 30192.0 2226.0 1723.0 66304.0 145002.0
[36 rows x 37 columns]
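One optional extra step: to treat the result as a proper time series (oldest first, dates as the index), you can sort and re-index. A minimal sketch, with a small made-up frame standing in for the reshaped Eurostat data:

```python
import pandas as pd

# Toy stand-in for the reshaped Eurostat frame (newest rows first)
df = pd.DataFrame({
    'Date': pd.to_datetime(['2021-03-01', '2021-02-01', '2021-01-01']),
    'AT': [5327.0, 3976.0, 4801.0],
    'BE': [8454.0, 7836.0, 9481.0],
})

# Sort chronologically and index by Date so date-based slicing and
# resampling work naturally
ts = df.sort_values('Date').set_index('Date')
print(ts.loc['2021-02':])  # keeps only February onwards
```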
Related
I have a list of 3 dataframes of stock tickers and prices that I want to convert into a single dataframe.
dataframes:
[ Date AMBU-B.CO BAVA.CO CARL-B.CO CHR.CO COLO-B.CO \
0 2020-01-02 112.500000 172.850006 984.400024 525.599976 814.000000
1 2020-01-03 111.300003 171.199997 989.799988 526.799988 812.000000
2 2020-01-06 108.150002 166.100006 1001.000000 519.599976 820.200012
3 2020-01-07 110.500000 170.000000 1002.000000 522.400024 823.599976
4 2020-01-08 109.599998 171.399994 993.000000 510.399994 820.000000
.. ... ... ... ... ... ...
308 2021-03-25 270.000000 295.200012 965.799988 562.599976 964.200012
309 2021-03-26 271.299988 302.000000 974.599976 548.599976 954.000000
310 2021-03-29 281.000000 294.000000 981.400024 575.000000 968.200012
311 2021-03-30 280.899994 282.600006 986.599976 567.400024 950.200012
312 2021-03-31 297.899994 286.399994 974.599976 576.400024 953.799988
DANSKE.CO DEMANT.CO DSV.CO FLS.CO ... NETC.CO \
0 110.349998 208.600006 769.799988 272.500000 ... 314.000000
1 107.900002 206.600006 751.400024 267.899994 ... 313.000000
2 106.699997 206.500000 752.400024 265.600006 ... 309.799988
3 107.750000 204.399994 753.799988 273.399994 ... 309.200012
4 108.250000 205.600006 755.799988 268.000000 ... 309.200012
.. ... ... ... ... ... ...
308 117.349998 260.399994 1170.000000 230.199997 ... 603.000000
309 120.050003 267.600006 1212.500000 237.800003 ... 603.500000
310 118.750000 267.100006 1206.000000 238.300003 ... 599.000000
311 120.500000 265.500000 1213.500000 243.600006 ... 592.000000
312 118.699997 268.700012 1244.500000 243.100006 ... 604.000000
NOVO-B.CO NZYM-B.CO ORSTED.CO PNDORA.CO RBREW.CO ROCK-B.CO \
0 388.700012 327.100006 681.000000 293.000000 603.000000 1584.0
1 383.200012 322.500000 677.400024 293.200012 605.200012 1567.0
2 382.049988 321.200012 670.200012 328.200012 601.599976 1547.0
3 381.700012 322.000000 662.000000 339.299988 612.200012 1546.0
4 382.500000 322.700012 645.000000 343.600006 602.200012 1531.0
.. ... ... ... ... ... ...
308 425.450012 403.399994 983.000000 655.799988 658.400024 2506.0
309 423.549988 404.100006 1013.500000 666.400024 666.599976 2672.0
310 431.549988 404.000000 1013.000000 678.400024 669.799988 2650.0
311 430.700012 401.500000 998.799988 678.400024 672.000000 2632.0
312 429.750000 406.299988 1024.500000 679.599976 663.400024 2674.0
SIM.CO TRYG.CO VWS.CO
0 776.0 196.399994 659.400024
1 764.5 195.600006 648.599976
2 751.5 195.000000 648.400024
3 753.5 200.000000 639.599976
4 762.0 197.500000 645.400024
.. ... ... ...
308 769.0 145.300003 1138.500000
309 775.5 146.500000 1187.000000
310 772.0 149.000000 1217.000000
311 781.0 149.800003 1245.000000
312 785.5 149.600006 1302.000000
[313 rows x 26 columns],
Date 1COV.DE ADS.DE ALV.DE BAS.DE BAYN.DE \
0 2020-01-02 42.180000 291.549988 221.500000 68.290001 73.519997
1 2020-01-03 41.900002 291.950012 219.050003 67.269997 72.580002
2 2020-01-06 39.889999 289.649994 217.699997 66.269997 71.739998
3 2020-01-07 40.130001 294.750000 218.199997 66.300003 72.129997
4 2020-01-08 40.830002 302.850006 218.300003 65.730003 74.000000
.. ... ... ... ... ... ...
314 2021-03-29 56.439999 264.100006 214.600006 70.029999 53.360001
315 2021-03-30 58.200001 265.000000 219.050003 71.879997 53.750000
316 2021-03-31 57.340000 266.200012 217.050003 70.839996 53.959999
317 2021-04-01 57.660000 267.950012 217.649994 71.629997 53.419998
318 2021-04-01 57.660000 267.950012 217.649994 71.629997 53.419998
BEI.DE BMW.DE CON.DE DAI.DE ... IFX.DE LIN.DE \
0 105.650002 74.220001 116.400002 49.974998 ... 20.684999 190.050003
1 105.650002 73.320000 113.980003 49.070000 ... 20.389999 185.300003
2 106.000000 73.050003 112.680000 48.805000 ... 20.045000 183.600006
3 105.750000 74.220001 115.120003 49.195000 ... 21.040001 185.300003
4 106.199997 74.410004 117.339996 49.470001 ... 21.309999 185.850006
.. ... ... ... ... ... ... ...
314 90.220001 85.599998 111.949997 73.709999 ... 34.880001 237.000000
315 90.040001 88.800003 113.449997 75.940002 ... 35.535000 238.500000
316 90.099998 88.470001 112.699997 76.010002 ... 36.154999 238.899994
317 90.500000 89.519997 112.760002 NaN ... 36.570000 238.699997
318 90.500000 89.519997 112.760002 74.970001 ... 36.570000 238.699997
MRK.DE MTX.DE MUV2.DE RWE.DE SAP.DE SIE.DE \
0 106.000000 258.100006 265.899994 26.959999 122.000000 118.639999
1 107.250000 257.799988 262.600006 26.840000 120.459999 116.360001
2 108.400002 258.000000 262.700012 26.450001 119.559998 115.820000
3 109.500000 262.299988 264.500000 27.049999 120.099998 116.559998
4 111.300003 263.000000 265.000000 27.170000 120.820000 117.040001
.. ... ... ... ... ... ...
314 145.949997 196.199997 260.200012 32.709999 104.300003 137.839996
315 145.949997 201.300003 265.000000 32.400002 103.559998 141.080002
316 145.800003 200.699997 262.600006 33.419998 104.419998 140.000000
317 145.800003 206.199997 266.049988 34.060001 106.000000 141.020004
318 145.800003 206.199997 266.049988 34.060001 106.000000 141.020004
VNA.DE VOW3.DE
0 48.419998 180.500000
1 48.599998 176.639999
2 48.450001 176.619995
3 48.709999 176.059998
4 48.970001 176.820007
.. ... ...
314 55.599998 229.750000
315 55.619999 240.550003
316 55.700001 238.600006
317 56.099998 235.850006
318 56.099998 235.850006
[319 rows x 31 columns],
Date ADE.OL AKRBP.OL BAKKA.OL DNB.OL EQNR.OL \
0 2020-01-02 106.800003 289.000000 664.0 165.800003 177.949997
1 2020-01-03 108.199997 292.899994 670.0 164.850006 180.949997
2 2020-01-06 107.000000 296.299988 654.0 164.899994 185.000000
3 2020-01-07 111.199997 295.700012 657.5 163.899994 183.000000
4 2020-01-08 108.800003 295.299988 668.5 166.000000 183.600006
.. ... ... ... ... ... ...
310 2021-03-25 133.000000 237.500000 633.0 178.050003 164.449997
311 2021-03-26 133.300003 244.199997 640.0 181.449997 167.649994
312 2021-03-29 131.100006 248.199997 660.0 182.000000 169.750000
313 2021-03-30 126.900002 244.800003 672.0 182.500000 168.600006
314 2021-03-31 125.900002 242.800003 677.5 182.000000 167.300003
GJF.OL LSG.OL MOWI.OL NAS.OL ... NHY.OL \
0 184.149994 59.240002 229.500000 4094.000000 ... 33.410000
1 185.100006 58.900002 229.800003 3986.000000 ... 32.660000
2 182.550003 59.000000 229.199997 3857.000000 ... 32.299999
3 184.600006 59.000000 227.199997 3964.000000 ... 32.220001
4 184.199997 59.700001 226.699997 3964.000000 ... 32.090000
.. ... ... ... ... ... ...
310 199.199997 70.680000 205.500000 53.299999 ... 50.060001
311 200.000000 71.959999 208.000000 53.020000 ... 53.080002
312 200.600006 73.099998 209.699997 55.000000 ... 53.060001
313 200.399994 73.419998 210.800003 60.759998 ... 53.419998
314 200.600006 73.099998 212.199997 66.400002 ... 54.759998
ORK.OL SALM.OL SCATC.OL SCHA.OL STB.OL SUBC.OL \
0 89.959999 454.000000 123.400002 271.299988 69.900002 105.900002
1 89.699997 453.899994 123.000000 272.100006 69.500000 107.150002
2 89.139999 453.500000 117.300003 268.299988 68.639999 108.150002
3 89.879997 447.700012 116.000000 272.299988 69.720001 107.699997
4 87.720001 451.799988 118.400002 271.899994 70.139999 107.250000
.. ... ... ... ... ... ...
310 84.000000 568.799988 235.000000 368.200012 81.779999 87.800003
311 84.400002 581.799988 237.600006 375.700012 83.860001 87.000000
312 84.839996 585.000000 244.600006 367.399994 84.540001 87.820000
313 84.800003 587.400024 246.399994 361.000000 85.400002 87.279999
314 83.839996 590.000000 258.600006 359.000000 86.139999 85.900002
TEL.OL TOM.OL YAR.OL
0 157.649994 287.799988 361.299988
1 158.800003 284.399994 356.000000
2 159.399994 280.000000 356.000000
3 156.850006 274.000000 351.399994
4 155.449997 278.600006 357.299988
.. ... ... ...
310 149.350006 376.200012 438.000000
311 149.050003 376.700012 444.000000
312 151.000000 378.500000 448.500000
313 150.600006 372.799988 447.200012
314 150.500000 370.299988 444.799988
[315 rows x 21 columns]]
I found out that pd.concat is usually used to solve this, but it does not seem to work for me:
df = pd.concat(dataframes)
df
It returns a lot of NaNs, and it should not. How can I solve this? If it helps, all dataframes use the same dates, from 2020-01-02 to 2021-03-31.
Date AMBU-B.CO BAVA.CO CARL-B.CO CHR.CO COLO-B.CO DANSKE.CO DEMANT.CO DSV.CO FLS.CO ... NHY.OL ORK.OL SALM.OL SCATC.OL SCHA.OL STB.OL SUBC.OL TEL.OL TOM.OL YAR.OL
0 2020-01-02 112.500000 172.850006 984.400024 525.599976 814.000000 110.349998 208.600006 769.799988 272.500000 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 2020-01-03 111.300003 171.199997 989.799988 526.799988 812.000000 107.900002 206.600006 751.400024 267.899994 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 2020-01-06 108.150002 166.100006 1001.000000 519.599976 820.200012 106.699997 206.500000 752.400024 265.600006 ... NaN NaN NaN
EDIT: here is how the dataframes are created in the first place:
import yfinance as yf

def motor_daily(ticker_list):
    # Uses start and end dates to get closing prices for the given tickers.
    df = yf.download(ticker_list, start=phase_2.start(),
                     end=phase_2.tomorrow()).Close
    return df

def ticker_data(ticker_lists):
    # Takes "ticker_lists", a list of ticker-name lists, and uses
    # motor_daily to fetch a data frame from the Yahoo API for each one.
    data = []
    for ticks in ticker_lists:
        data.append(motor_daily(ticks))
    return data
res = ticker_data(list_of_test)
dataframes = [pd.DataFrame(lst) for lst in res]
I fixed it myself, here is what I did:
dataframes_concat = pd.concat(dataframes)
df1 = dataframes_concat.groupby('Date', as_index=True).first()
print(df1)
AMBU-B.CO BAVA.CO CARL-B.CO CHR.CO COLO-B.CO DANSKE.CO DEMANT.CO DSV.CO FLS.CO GMAB.CO ... NHY.OL ORK.OL SALM.OL SCATC.OL SCHA.OL STB.OL SUBC.OL TEL.OL TOM.OL YAR.OL
Date
2020-01-02 112.500000 172.850006 984.400024 525.599976 814.000000 110.349998 208.600006 769.799988 272.500000 1486.5 ... 33.410000 89.959999 454.000000 123.400002 271.299988 69.900002 105.900002 157.649994 287.799988 361.299988
2020-01-03 111.300003 171.199997 989.799988 526.799988 812.000000 107.900002 206.600006 751.400024 267.899994 1444.5 ... 32.660000 89.699997 453.899994 123.000000 272.100006 69.500000 107.150002 158.800003 284.399994 356.000000
2020-01-06 108.150002 166.100006 1001.000000 519.599976 820.200012 106.699997 206.500000 752.400024 265.600006 1419.5 ... 32.299999 89.139999 453.500000 117.300003 268.299988 68.639999 108.150002 159.399994 280.000000 356.000000
2020-01-07 110.500000 170.000000 1002.000000 522.400024 823.599976 107.750000 204.399994 753.799988 273.399994 1456.0 ... 32.220001 89.879997 447.700012 116.000000 272.299988 69.720001 107.699997 156.850006 274.000000 351.399994
2020-01-08 109.599998 171.399994 993.000000 510.399994 820.000000 108.250000 205.600006 755.799988 268.000000 1466.5 ... 32.090000 87.720001 451.799988 118.400002 271.899994 70.139999 107.250000 155.449997 278.600006 357.299988
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2021-03-26 271.299988 302.000000 974.599976 548.599976 954.000000 120.050003 267.600006 1212.500000 237.800003 2045.0 ... 53.080002 84.400002 581.799988 237.600006 375.700012 83.860001 87.000000 149.050003 376.700012 444.000000
2021-03-29 281.000000 294.000000 981.400024 575.000000 968.200012 118.750000 267.100006 1206.000000 238.300003 2028.0 ... 53.060001 84.839996 585.000000 244.600006 367.399994 84.540001 87.820000 151.000000 378.500000 448.500000
2021-03-30 280.899994 282.600006 986.599976 567.400024 950.200012 120.500000 265.500000 1213.500000 243.600006 2019.0 ... 53.419998 84.800003 587.400024 246.399994 361.000000 85.400002 87.279999 150.600006 372.799988 447.200012
2021-03-31 297.899994 286.399994 974.599976 576.400024 953.799988 118.699997 268.700012 1244.500000 243.100006 2087.0 ... 54.759998 83.839996 590.000000 258.600006 359.000000 86.139999 85.900002 150.500000 370.299988 444.799988
2021-04-01 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
The last row is NaN because markets were closed for Easter.
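An alternative that avoids the NaN blocks entirely is to index each frame by Date first and then concatenate column-wise with axis=1, so the frames are placed side by side instead of stacked. A sketch with toy frames standing in for the downloaded data:

```python
import pandas as pd

# Toy stand-ins for the downloaded per-exchange frames
df_dk = pd.DataFrame({'Date': pd.to_datetime(['2020-01-02', '2020-01-03']),
                      'AMBU-B.CO': [112.5, 111.3]})
df_de = pd.DataFrame({'Date': pd.to_datetime(['2020-01-02', '2020-01-03']),
                      'ADS.DE': [291.55, 291.95]})

# Index by Date, then stack side by side instead of on top of each other
combined = pd.concat([d.set_index('Date') for d in [df_dk, df_de]], axis=1)
print(combined)
```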
I'm using matplotlib to draw trend lines for stock data.
import pandas as pd
import matplotlib.pyplot as plt
A = pd.read_csv('daily/A.csv', index_col=[0])
print(A)
AAL = pd.read_csv('daily/AAL.csv', index_col=[0])
print(AAL)
A['Close'].plot()
AAL['Close'].plot()
plt.show()
Then the result is:
High Low Open Close Volume Adj Close
Date
1999-11-18 35.77 28.61 32.55 31.47 62546300.0 27.01
1999-11-19 30.76 28.48 30.71 28.88 15234100.0 24.79
1999-11-22 31.47 28.66 29.55 31.47 6577800.0 27.01
1999-11-23 31.21 28.61 30.40 28.61 5975600.0 24.56
1999-11-24 30.00 28.61 28.70 29.37 4843200.0 25.21
... ... ... ... ... ... ...
2020-06-24 89.08 86.32 89.08 86.56 1806600.0 86.38
2020-06-25 87.35 84.80 86.43 87.26 1350100.0 87.08
2020-06-26 87.56 85.52 87.23 85.90 2225800.0 85.72
2020-06-29 87.36 86.11 86.56 87.29 1302500.0 87.29
2020-06-30 88.88 87.24 87.33 88.37 1428931.0 88.37
[5186 rows x 6 columns]
High Low Open Close Volume Adj Close
Date
2005-09-27 21.40 19.10 21.05 19.30 961200.0 18.19
2005-09-28 20.53 19.20 19.30 20.50 5747900.0 19.33
2005-09-29 20.58 20.10 20.40 20.21 1078200.0 19.05
2005-09-30 21.05 20.18 20.26 21.01 3123300.0 19.81
2005-10-03 21.75 20.90 20.90 21.50 1057900.0 20.27
... ... ... ... ... ... ...
2020-06-24 13.90 12.83 13.59 13.04 140975500.0 13.04
2020-06-25 13.24 12.18 12.53 13.17 117383400.0 13.17
2020-06-26 13.29 12.13 13.20 12.38 108813000.0 12.38
2020-06-29 13.51 12.02 12.57 13.32 114650300.0 13.32
2020-06-30 13.48 12.88 13.10 13.07 68669742.0 13.07
[3715 rows x 6 columns]
Yes, the start dates of the two stocks differ, but the end date is the same.
So the plot I get looks like this:
stockplot
It does not look right, unlike the others.
Could anyone advise me on how to draw a normal trend line for the two stocks?
You can try making two separate plots with the same limits and overlaying one on the other for comparison.
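For instance, two stacked panels sharing the x-axis let each series keep its own price scale. A sketch with made-up prices standing in for the CSV data:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up closing prices standing in for the two CSV files
idx = pd.date_range('2020-01-01', periods=5, freq='D')
a = pd.Series([86.5, 87.2, 85.9, 87.3, 88.4], index=idx, name='A')
aal = pd.Series([13.0, 13.2, 12.4, 13.3, 13.1], index=idx, name='AAL')

# Two stacked panels sharing the x-axis, so each series keeps its own scale
fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
a.plot(ax=ax1, title='A')
aal.plot(ax=ax2, title='AAL')
fig.savefig('compare.png')  # or plt.show() in an interactive session
```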
I have the following code:
data1 = data1.set_index('Date')
data1['ret'] = 1 - data1['Low'].div(data1['Close'].shift(freq='1d'))
data1['ret'] = data1['ret'].astype(float)*100
For some reason I am getting NaN values in the ret column:
High Low Open Close Volume Adj Close ret
Date
2020-01-24 3333.179932 3281.530029 3333.100098 3295.469971 3707130000 3295.469971 1.323394
2020-01-27 3258.850098 3234.500000 3247.159912 3243.629883 3823100000 3243.629883 NaN
2020-01-28 3285.780029 3253.219971 3255.350098 3276.239990 3526720000 3276.239990 -0.295659
2020-01-29 3293.469971 3271.889893 3289.459961 3273.399902 3584500000 3273.399902 0.132777
2020-01-30 3285.909912 3242.800049 3256.449951 3283.659912 3787250000 3283.659912 0.934803
2020-01-31 3282.330078 3214.679932 3282.330078 3225.520020 4527830000 3225.520020 2.100704
2020-02-03 3268.439941 3235.659912 3235.659912 3248.919922 3757910000 3248.919922 NaN
2020-02-04 3306.919922 3280.610107 3280.610107 3297.590088 3995320000 3297.590088 -0.975407
2020-02-05 3337.580078 3313.750000 3324.909912 3334.689941 4117730000 3334.689941 -0.490052
2020-02-06 3347.959961 3334.389893 3344.919922 3345.780029 3868370000 3345.780029 0.008998
2020-02-07 3341.419922 3322.120117 3335.540039 3327.709961 3730650000 3327.709961 0.707157
2020-02-10 3352.260010 3317.770020 3318.280029 3352.090088 3450350000 3352.090088 NaN
2020-02-11 3375.629883 3352.719971 3365.870117 3357.750000 3760550000 3357.750000 -0.018791
2020-02-12 3381.469971 3369.719971 3370.500000 3379.449951 3926380000 3379.449951 -0.356488
2020-02-13 3385.090088 3360.520020 3365.899902 3373.939941 3498240000 3373.939941 0.560148
2020-02-14 3380.689941 3366.149902 3378.080078 3380.159912 3398040000 3380.159912 0.230888
2020-02-18 3375.010010 3355.610107 3369.040039 3370.290039 3746720000 3370.290039 NaN
2020-02-19 3393.520020 3378.830078 3380.389893 3386.149902 3600150000 3386.149902 -0.253392
2020-02-20 3389.149902 3341.020020 3380.449951 3373.229980 4007320000 3373.229980 1.332779
2020-02-21 3360.760010 3328.449951 3360.500000 3337.750000 3899270000 3337.750000 1.327512
2020-02-24 3259.810059 3214.649902 3257.610107 3225.889893 4842960000 3225.889893 NaN
2020-02-25 3246.989990 3118.770020 3238.939941 3128.209961 5591510000 3128.209961 3.320630
2020-02-26 3182.510010 3108.989990 3139.899902 3116.389893 5478110000 3116.389893 0.614408
2020-02-27 3097.070068 2977.389893 3062.540039 2978.760010 7058840000 2978.760010 4.460289
2020-02-28 2959.719971 2855.840088 2916.899902 2954.219971 8563850000 2954.219971 4.126547
2020-03-02 3090.959961 2945.189941 2974.280029 3090.229980 6376400000 3090.229980 NaN
2020-03-03 3136.719971 2976.629883 3096.459961 3003.370117 6355940000 3003.370117 3.676105
2020-03-04 3130.969971 3034.379883 3045.750000 3130.120117 5035480000 3130.120117 -1.032499
2020-03-05 3083.040039 2999.830078 3075.699951 3023.939941 5575550000 3023.939941 4.162461
2020-03-06 2985.929932 2901.540039 2954.199951 2972.370117 6552140000 2972.370117 4.047696
2020-03-09 2863.889893 2734.429932 2863.889893 2746.560059 8423050000 2746.560059 NaN
2020-03-10 2882.590088 2734.000000 2813.479980 2882.229980 7635960000 2882.229980 0.457301
2020-03-11 2825.600098 2707.219971 2825.600098 2741.379883 7374110000 2741.379883 6.072035
2020-03-12 2660.949951 2478.860107 2630.860107 2480.639893 8829380000 2480.639893 9.576191
2020-03-13 2711.330078 2492.370117 2569.989990 2711.020020 8258670000 2711.020020 -0.472871
2020-03-16 2562.979980 2380.939941 2508.590088 2386.129883 7781540000 2386.129883 NaN
2020-03-17 2553.929932 2367.040039 2425.659912 2529.189941 8358500000 2529.189941 0.800034
2020-03-18 2453.570068 2280.520020 2436.500000 2398.100098 8755780000 2398.100098 9.831999
2020-03-19 2466.969971 2319.780029 2393.479980 2409.389893 7946710000 2409.389893 3.265922
2020-03-20 2453.010010 2295.560059 2431.939941 2304.919922 9044690000 2304.919922 4.724426
2020-03-23 2300.729980 2191.860107 2290.709961 2237.399902 7402180000 2237.399902 NaN
2020-03-24 2449.709961 2344.439941 2344.439941 2447.330078 7547350000 2447.330078 -4.784126
2020-03-25 2571.419922 2407.530029 2457.770020 2475.560059 8285670000 2475.560059 1.626264
2020-03-26 2637.010010 2500.719971 2501.290039 2630.070068 7753160000 2630.070068 -1.016332
2020-03-27 2615.909912 2520.020020 2555.870117 2541.469971 6194330000 2541.469971 4.184301
2020-03-30 2631.800049 2545.280029 2558.979980 2626.649902 5746220000 2626.649902 NaN
Why am I getting NaN?
The reason for the missing values is that Series.shift with freq='1d' shifts by calendar days.
The DatetimeIndex is missing some dates because weekends were removed, so each Monday row is shifted against a non-existent Sunday row and the output is NaN.
The solution is to drop the freq argument:
data1 = data1.set_index('Date')
data1['ret'] = 1 - data1['Low'].div(data1['Close'].shift())
data1['ret'] = data1['ret'].astype(float)*100
Then each Monday uses the value from the previous Friday.
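The difference is easy to see on a tiny series with a weekend gap (made-up prices):

```python
import pandas as pd

# Thu, Fri, then Mon: the weekend dates are missing from the index
idx = pd.to_datetime(['2020-01-02', '2020-01-03', '2020-01-06'])
s = pd.Series([100.0, 101.0, 102.0], index=idx)

# shift() moves values by position: Monday gets Friday's value
print(s.shift())
# shift(freq='1d') moves the index by one calendar day instead, so
# Monday would align against a non-existent Sunday row, producing NaN
print(s.shift(freq='1d'))
```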
I have a DataFrame of votes and I would like to create one of preferences.
For example, here is the number of votes for each party P1, P2, P3 in each city comm1, comm2, ...
Comm Votes P1 P2 P3
0 comm1 1315.0 2.0 424.0 572.0
1 comm2 4682.0 117.0 2053.0 1584.0
2 comm3 2397.0 2.0 40.0 192.0
3 comm4 931.0 2.0 12.0 345.0
4 comm5 842.0 47.0 209.0 76.0
... ... ... ... ... ...
1524 comm1525 10477.0 13.0 673.0 333.0
1525 comm1526 2674.0 1.0 55.0 194.0
1526 comm1527 1691.0 331.0 29.0 78.0
These electoral results would suffice for a first-past-the-post system, but I would like to test an alternative-vote model. So for each political party I need the preferences.
As I don't know the preferences, I want to generate them with random numbers. I assume that voters are honest. For example, for the "P1" party in town "comm1", we know that 2 people voted for it and that there are 1315 voters. I need to create preferences to see whether people would rank it as their first, second or third option. That is, for each party:
Comm Votes P1_1 P1_2 P1_3 P2_1 P2_2 P2_3 P3_1 P3_2 P3_3
0 comm1 1315.0 2.0 1011.0 303.0 424.0 881.0 10.0 570.0 1.0 1.0
... ... ... ... ... ...
1526 comm1527 1691.0 331.0 1300.0 60.0 299.0 22.0 10.0 ...
So I have to do:
# for each column in parties I create (parties -1) other columns
# I rename them all Party_i. The former 1 becomes Party_1.
# In the other columns I put a random number.
# For a given line, the sum of all Party_i for i in [1, parties] must be equal to Votes
I tried this so far:
parties = [item for item in df.columns if item not in ['Comm','Votes']]
for index, row in df_test.iterrows():
# In the other columns I put a random number.
for party in parties:
# for each column in parties I create (parties -1) other columns
for i in range(0,len(parties) -1):
print(random.randrange(0, row['Votes']))
# I rename them all Party_i. The former 1 becomes Party_1.
row["{party}_{preference}".format(party = party,preference = i)] = random.randrange(0, row['Votes']) if (row[party] < row['Votes']) else 0 # false because the sum of the votes isn't = to df['Votes']
The results are:
Comm Votes ... P1_1 P1_2 P1_3 P2_1 P2_2 P2_3 P3_1 P3_2 P3_3
0 comm1 1315.0 ... 1003 460 1588 1284 1482 1613 1429 345
1 comm2 1691.0 ... 1003 460 1588 1284 1482 1613 ...
...
But:
the numbers are the same for each row;
the value in the Pi_1 column isn't equal to the one in the Pi column (Pi being a given party);
the sum of Pi_j over all j isn't equal to the number in the Votes column.
Update
I tried Antihead's answer with his own data and it worked well. But when applying it to my own data it doesn't. It leaves me with an empty dataframe:
import collections
import numpy as np

def fill_cells(cell):
    v_max = cell['Votes']
    all_dict = {}
    # iterate over the parties
    for p in parties:
        tmp_l = parties.copy()
        tmp_l.remove(p)
        # sample new data with equal choices
        sampled = np.random.choice(tmp_l, int(v_max - cell[p]))
        # transform into a dictionary of counts
        c_sampled = dict(collections.Counter(sampled))
        c_sampled.update({p: cell[p]})
        # batch update of the dictionary keys
        all_dict.update(
            dict(zip([p + '_%s' % k[1] for k in c_sampled.keys()], c_sampled.values()))
        )
    return pd.Series(all_dict)
Indeed, with the following dataframe:
Comm Votes LPC CPC BQ
0 comm1 1315.0 2.0 424.0 572.0
1 comm2 4682.0 117.0 2053.0 1584.0
2 comm3 2397.0 2.0 40.0 192.0
3 comm4 931.0 2.0 12.0 345.0
4 comm5 842.0 47.0 209.0 76.0
... ... ... ... ... ...
1522 comm1523 23808.0 1588.0 4458.0 13147.0
1523 comm1524 639.0 40.0 126.0 40.0
1524 comm1525 10477.0 13.0 673.0 333.0
1525 comm1526 2674.0 1.0 55.0 194.0
1526 comm1527 1691.0 331.0 29.0 78.0
I have an empty dataframe:
0
1
2
3
4
...
1522
1523
1524
1525
1526
Does this work?
# data
columns = ['Comm', 'Votes', 'P1', 'P2', 'P3']
data =[['comm1', 1315.0, 2.0, 424.0, 572.0],
['comm2', 4682.0, 117.0, 2053.0, 1584.0],
['comm3', 2397.0, 2.0, 40.0, 192.0],
['comm4', 931.0, 2.0, 12.0, 345.0],
['comm5', 842.0, 47.0, 209.0, 76.0],
['comm1525', 10477.0, 13.0, 673.0, 333.0],
['comm1526', 2674.0, 1.0, 55.0, 194.0],
['comm1527', 1691.0, 331.0, 29.0, 78.0]]
df =pd.DataFrame(data=data, columns=columns)
import collections
import numpy as np
def fill_cells(cell):
v_max = cell['Votes']
all_dict = {}
#iterate over parties
for p in ['P1', 'P2', 'P3']:
tmp_l = ['P1', 'P2', 'P3']
tmp_l.remove(p)
# sample new data with equal choices
sampled = np.random.choice(tmp_l, int(v_max-cell[p]))
# transform into dictionary
c_sampled = dict(collections.Counter(sampled))
c_sampled.update({p:cell[p]})
# batch update of the dictionary keys
all_dict.update(
dict(zip([p+'_%s' %k[1] for k in c_sampled.keys()], c_sampled.values()))
)
return pd.Series(all_dict)
# get back a data frame
df.apply(fill_cells, axis=1)
If you need to merge the data frames back together, do something like:
new_df = df.apply(fill_cells, axis=1)
pd.concat([df, new_df], axis=1)
Based on Antihead's answer and for the following dataset:
Comm Votes LPC CPC BQ
0 comm1 1315.0 2.0 424.0 572.0
1 comm2 4682.0 117.0 2053.0 1584.0
2 comm3 2397.0 2.0 40.0 192.0
3 comm4 931.0 2.0 12.0 345.0
4 comm5 842.0 47.0 209.0 76.0
... ... ... ... ... ...
1522 comm1523 23808.0 1588.0 4458.0 13147.0
1523 comm1524 639.0 40.0 126.0 40.0
1524 comm1525 10477.0 13.0 673.0 333.0
1525 comm1526 2674.0 1.0 55.0 194.0
1526 comm1527 1691.0 331.0 29.0 78.0
I tried:
def fill_cells(cell):
votes_max = cell['Votes']
all_dict = {}
#iterate over parties
parties_temp = parties.copy()
for p in parties_temp:
preferences = ['1','2','3']
for preference in preferences:
preferences.remove(preference)
# sample new data with equal choices
sampled = np.random.choice(preferences, int(votes_max-cell[p]))
# transform into dictionary
c_sampled = dict(collections.Counter(sampled))
c_sampled.update({p:cell[p]})
c_sampled['1'] = c_sampled.pop(p)
# batch update of the dictionary keys
all_dict.update(
dict(zip([p+'_%s' %k for k in c_sampled.keys()],c_sampled.values()))
)
return pd.Series(all_dict)
It returns:
LPC_2 LPC_3 LPC_1 CPC_2 CPC_3 CPC_1 BQ_2 BQ_3 BQ_1
0 891.0 487.0 424.0 743.0 373.0 572.0 1313.0 683.0 2.0
1 2629.0 1342.0 2053.0 3098.0 1603.0 1584.0 4565.0 2301.0 117.0
2 2357.0 1186.0 40.0 2205.0 1047.0 192.0 2395.0 1171.0 2.0
3 919.0 451.0 12.0 586.0 288.0 345.0 929.0 455.0 2.0
4 633.0 309.0 209.0 766.0 399.0 76.0 795.0 396.0 47.0
... ... ... ... ... ... ... ... ... ...
1520 1088.0 536.0 42.0 970.0 462.0 160.0 1117.0 540.0 13.0
1521 4742.0 2341.0 219.0 3655.0 1865.0 1306.0 4705.0 2375.0 256.0
1522 19350.0 9733.0 4458.0 10661.0 5352.0 13147.0 22220.0 11100.0 1588.0
1523 513.0 264.0 126.0 599.0 267.0 40.0 599.0 306.0 40.0
1524 9804.0 4885.0 673.0 10144.0 5012.0 333.0 10464.0 5162.0 13.0
It's almost good. I would have preferred the preferences to be dynamically encoded rather than hard-coding ['1','2','3'].
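One way to derive the preference labels dynamically is to build them from the number of parties instead of writing ['1','2','3'] out by hand. A minimal sketch, assuming one preference rank per party and that rank '1' keeps each party's actual count (the single-row DataFrame below reuses the `comm1` numbers from the dataset above):

```python
import collections

import numpy as np
import pandas as pd

parties = ['LPC', 'CPC', 'BQ']
# rank labels derived from the party list, not hard-coded
preferences = [str(i) for i in range(1, len(parties) + 1)]

def fill_cells(cell):
    votes_max = cell['Votes']
    all_dict = {}
    for p in parties:
        # rank '1' keeps the party's actual first-preference count;
        # the remaining votes are spread at random over the other ranks
        others = [r for r in preferences if r != '1']
        sampled = np.random.choice(others, int(votes_max - cell[p]))
        c_sampled = dict(collections.Counter(sampled))
        c_sampled['1'] = cell[p]
        all_dict.update({p + '_' + k: v for k, v in c_sampled.items()})
    return pd.Series(all_dict)

np.random.seed(0)
df = pd.DataFrame({'Comm': ['comm1'], 'Votes': [1315.0],
                   'LPC': [2.0], 'CPC': [424.0], 'BQ': [572.0]})
new_df = df.apply(fill_cells, axis=1)
```

With four or more parties, only the `parties` list changes; the rank labels and the sampling pool follow automatically.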
I know this is how you normally get daily stock price quotes using pandas, but I'm wondering if it's possible to get monthly or weekly quotes. Is there maybe a parameter I can pass to get monthly quotes?
from pandas.io.data import DataReader
from datetime import datetime
ibm = DataReader('IBM', 'yahoo', datetime(2000,1,1), datetime(2012,1,1))
print(ibm['Adj Close'])
Monthly closing prices from Yahoo! Finance...
import pandas_datareader.data as web

data = web.get_data_yahoo('IBM', '01/01/2015', interval='m')
where you can replace the interval input as required ('d', 'w', 'm', etc).
Using yfinance, it is possible to get stock prices via the "interval" option, with "1mo" instead of "m", as shown:
# Libraries
import yfinance as yf
from datetime import datetime

# Load monthly stock prices
df = yf.download("IBM", start=datetime(2000, 1, 1), end=datetime(2012, 1, 1), interval='1mo')
df
The result is a DataFrame of monthly prices.
The other possible interval options are: 1m, 2m, 5m, 15m, 30m, 60m, 90m, 1h, 1d, 5d, 1wk, 1mo, 3mo.
Try this:
In [175]: from pandas_datareader.data import DataReader
In [176]: ibm = DataReader('IBM', 'yahoo', '2001-01-01', '2012-01-01')
UPDATE: show average for Adj Close only (month start)
In [12]: ibm.groupby(pd.TimeGrouper(freq='MS'))['Adj Close'].mean()
Out[12]:
Date
2001-01-01 79.430605
2001-02-01 86.625519
2001-03-01 75.938913
2001-04-01 81.134375
2001-05-01 90.460754
2001-06-01 89.705042
2001-07-01 83.350254
2001-08-01 82.100543
2001-09-01 74.335789
2001-10-01 79.937451
...
2011-03-01 141.628553
2011-04-01 146.530774
2011-05-01 150.298053
2011-06-01 146.844772
2011-07-01 158.716834
2011-08-01 150.690990
2011-09-01 151.627555
2011-10-01 162.365699
2011-11-01 164.596963
2011-12-01 167.924676
Freq: MS, Name: Adj Close, dtype: float64
show average for Adj Close only (month end)
In [13]: ibm.groupby(pd.TimeGrouper(freq='M'))['Adj Close'].mean()
Out[13]:
Date
2001-01-31 79.430605
2001-02-28 86.625519
2001-03-31 75.938913
2001-04-30 81.134375
2001-05-31 90.460754
2001-06-30 89.705042
2001-07-31 83.350254
2001-08-31 82.100543
2001-09-30 74.335789
2001-10-31 79.937451
...
2011-03-31 141.628553
2011-04-30 146.530774
2011-05-31 150.298053
2011-06-30 146.844772
2011-07-31 158.716834
2011-08-31 150.690990
2011-09-30 151.627555
2011-10-31 162.365699
2011-11-30 164.596963
2011-12-31 167.924676
Freq: M, Name: Adj Close, dtype: float64
monthly averages (all columns):
In [179]: ibm.groupby(pd.TimeGrouper(freq='M')).mean()
Out[179]:
Open High Low Close Volume Adj Close
Date
2001-01-31 100.767857 103.553571 99.428333 101.870357 9474409 79.430605
2001-02-28 111.193160 113.304210 108.967368 110.998422 8233626 86.625519
2001-03-31 97.366364 99.423637 95.252272 97.281364 11570454 75.938913
2001-04-30 103.990500 106.112500 102.229501 103.936999 11310545 81.134375
2001-05-31 115.781363 117.104091 114.349091 115.776364 7243463 90.460754
2001-06-30 114.689524 116.199048 113.739523 114.777618 6806176 89.705042
2001-07-31 106.717143 108.028095 105.332857 106.646666 7667447 83.350254
2001-08-31 105.093912 106.196521 103.856522 104.939999 6234847 82.100543
2001-09-30 95.138667 96.740000 93.471334 94.987333 12620833 74.335789
2001-10-31 101.400870 103.140000 100.327827 102.145217 9754413 79.937451
2001-11-30 113.449047 114.875715 112.510952 113.938095 6435061 89.256046
2001-12-31 120.651001 122.076000 119.790500 121.087999 6669690 94.878736
2002-01-31 116.483334 117.509524 114.613334 115.994762 9217280 90.887920
2002-02-28 103.194210 104.389474 101.646316 102.961579 9069526 80.764672
2002-03-31 105.246500 106.764499 104.312999 105.478499 7563425 82.756873
... ... ... ... ... ... ...
2010-10-31 138.956188 140.259048 138.427142 139.631905 6537366 122.241844
2010-11-30 144.281429 145.164762 143.385241 144.439524 4956985 126.878319
2010-12-31 145.155909 145.959545 144.567273 145.251819 4245127 127.726929
2011-01-31 152.595000 153.950499 151.861000 153.181501 5941580 134.699880
2011-02-28 163.217895 164.089474 162.510002 163.339473 4687763 144.050847
2011-03-31 160.433912 161.745652 159.154349 160.425651 5639752 141.628553
2011-04-30 165.437501 166.587500 164.760500 165.978500 5038475 146.530774
2011-05-31 169.657144 170.679046 168.442858 169.632857 5276390 150.298053
2011-06-30 165.450455 166.559093 164.691819 165.593635 4792836 146.844772
2011-07-31 178.124998 179.866502 177.574998 178.981500 5679660 158.716834
2011-08-31 169.734350 171.690435 166.749567 169.360434 8480613 150.690990
2011-09-30 169.752858 172.034761 168.109999 170.245714 6566428 151.627555
2011-10-31 181.529525 183.597145 180.172379 182.302381 6883985 162.365699
2011-11-30 184.536668 185.950952 182.780477 184.244287 4619719 164.596963
2011-12-31 188.151428 189.373809 186.421905 187.789047 4925547 167.924676
[132 rows x 6 columns]
weekly averages (all columns):
In [180]: ibm.groupby(pd.TimeGrouper(freq='W')).mean()
Out[180]:
Open High Low Close Volume Adj Close
Date
2001-01-07 89.234375 94.234375 87.890625 91.656250 11060200 71.466436
2001-01-14 93.412500 95.062500 91.662500 93.412500 7470200 72.835824
2001-01-21 100.250000 103.921875 99.218750 102.250000 13851500 79.726621
2001-01-28 109.575000 111.537500 108.675000 110.600000 8056720 86.237303
2001-02-04 113.680000 115.465999 111.734000 113.582001 6538080 88.562436
2001-02-11 113.194002 115.815999 111.639999 113.884001 7269320 88.858876
2001-02-18 113.960002 116.731999 113.238000 115.106000 7225420 89.853021
2001-02-25 109.525002 111.375000 105.424999 107.977501 10722700 84.288436
2001-03-04 103.390001 106.052002 100.386000 103.228001 11982540 80.580924
2001-03-11 105.735999 106.920000 103.364002 104.844002 9226900 81.842391
2001-03-18 95.660001 97.502002 93.185997 94.899998 13863740 74.079992
2001-03-25 90.734000 92.484000 88.598000 90.518001 11382280 70.659356
2001-04-01 95.622000 97.748000 94.274000 96.106001 10467580 75.021411
2001-04-08 95.259999 97.360001 93.132001 94.642000 12312580 73.878595
2001-04-15 98.350000 99.520000 95.327502 97.170000 10218625 75.851980
... ... ... ... ... ... ...
2011-09-25 170.678003 173.695996 169.401996 171.766000 6358100 152.981582
2011-10-02 176.290002 178.850000 174.729999 176.762000 7373680 157.431216
2011-10-09 175.920001 179.200003 174.379999 177.792001 7623560 158.348576
2011-10-16 185.366000 187.732001 184.977997 187.017999 5244180 166.565614
2011-10-23 180.926001 182.052002 178.815997 180.351999 9359200 160.628611
2011-10-30 183.094003 184.742001 181.623996 183.582001 5743800 163.505379
2011-11-06 184.508002 186.067999 183.432004 184.716003 4583780 164.515366
2011-11-13 185.350000 186.690002 183.685999 185.508005 4180620 165.750791
2011-11-20 187.600003 189.101999 185.368002 186.738000 5104420 166.984809
2011-11-27 181.067497 181.997501 178.717499 179.449997 4089350 160.467733
2011-12-04 185.246002 187.182001 184.388000 186.052002 5168720 166.371376
2011-12-11 191.841998 194.141998 191.090002 192.794000 4828580 172.400204
2011-12-18 191.085999 191.537998 187.732001 188.619998 6037220 168.667729
2011-12-25 183.810001 184.634003 181.787997 183.678000 5433360 164.248496
2012-01-01 185.140003 185.989998 183.897499 184.750000 3029925 165.207100
[574 rows x 6 columns]
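Note that `pd.TimeGrouper` was removed in pandas 0.25; on current pandas the same grouping is spelled `pd.Grouper` (or, equivalently, done with `resample`). A minimal sketch on synthetic daily data, so it runs without a network call (the prices here are made up):

```python
import numpy as np
import pandas as pd

# synthetic daily "Adj Close" series standing in for the DataReader result
idx = pd.date_range('2001-01-01', '2001-03-31', freq='D')
ibm = pd.DataFrame({'Adj Close': np.arange(len(idx), dtype=float)}, index=idx)

# month-start labels, the modern spelling of pd.TimeGrouper(freq='MS')
monthly = ibm.groupby(pd.Grouper(freq='MS'))['Adj Close'].mean()

# the same thing via resample
monthly_rs = ibm['Adj Close'].resample('MS').mean()

# weekly averages of all columns
weekly = ibm.resample('W').mean()
```

The frequency strings ('MS', 'M', 'W', ...) are the same ones `TimeGrouper` accepted, so the transcripts above translate directly.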
Get it from Quandl:
import pandas as pd
import quandl
quandl.ApiConfig.api_key = 'xxxxxxxxxxxx' # Optional
quandl.ApiConfig.api_version = '2015-04-09' # Optional
ibm = quandl.get("WIKI/IBM", start_date="2000-01-01", end_date="2012-01-01", collapse="monthly", returns="pandas")