Flattening out a multiindex dataframe - python

I have a df:
df = pd.DataFrame.from_dict({('group', ''): {0: 'A',
1: 'A',
2: 'A',
3: 'A',
4: 'A',
5: 'A',
6: 'A',
7: 'A',
8: 'A',
9: 'B',
10: 'B',
11: 'B',
12: 'B',
13: 'B',
14: 'B',
15: 'B',
16: 'B',
17: 'B',
18: 'all',
19: 'all'},
('category', ''): {0: 'Amazon',
1: 'Apple',
2: 'Facebook',
3: 'Google',
4: 'Netflix',
5: 'Tesla',
6: 'Total',
7: 'Uber',
8: 'total',
9: 'Amazon',
10: 'Apple',
11: 'Facebook',
12: 'Google',
13: 'Netflix',
14: 'Tesla',
15: 'Total',
16: 'Uber',
17: 'total',
18: 'Total',
19: 'total'},
(pd.Timestamp('2020-06-29'), 'last_sales'): {0: 195.0,
1: 61.0,
2: 106.0,
3: 61.0,
4: 37.0,
5: 13.0,
6: 954.0,
7: 4.0,
8: 477.0,
9: 50.0,
10: 50.0,
11: 75.0,
12: 43.0,
13: 17.0,
14: 14.0,
15: 504.0,
16: 3.0,
17: 252.0,
18: 2916.0,
19: 2916.0},
(pd.Timestamp('2020-06-29'), 'sales'): {0: 1268.85,
1: 18274.385000000002,
2: 19722.65,
3: 55547.255,
4: 15323.800000000001,
5: 1688.6749999999997,
6: 227463.23,
7: 1906.0,
8: 113731.615,
9: 3219.6499999999996,
10: 15852.060000000001,
11: 17743.7,
12: 37795.15,
13: 5918.5,
14: 1708.75,
15: 166349.64,
16: 937.01,
17: 83174.82,
18: 787625.7400000001,
19: 787625.7400000001},
(pd.Timestamp('2020-06-29'), 'difference'): {0: 0.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 0.0,
7: 0.0,
8: 0.0,
9: 0.0,
10: 0.0,
11: 0.0,
12: 0.0,
13: 0.0,
14: 0.0,
15: 0.0,
16: 0.0,
17: 0.0,
18: 0.0,
19: 0.0},
(pd.Timestamp('2020-07-06'), 'last_sales'): {0: 26.0,
1: 39.0,
2: 79.0,
3: 49.0,
4: 10.0,
5: 10.0,
6: 436.0,
7: 5.0,
8: 218.0,
9: 89.0,
10: 34.0,
11: 133.0,
12: 66.0,
13: 21.0,
14: 20.0,
15: 732.0,
16: 3.0,
17: 366.0,
18: 2336.0,
19: 2336.0},
(pd.Timestamp('2020-07-06'), 'sales'): {0: 3978.15,
1: 12138.96,
2: 19084.175,
3: 40033.46000000001,
4: 4280.15,
5: 1495.1,
6: 165548.29,
7: 1764.15,
8: 82774.145,
9: 8314.92,
10: 12776.649999999996,
11: 28048.075,
12: 55104.21000000002,
13: 6962.844999999999,
14: 3053.2000000000003,
15: 231049.11000000002,
16: 1264.655,
17: 115524.55500000001,
18: 793194.8000000002,
19: 793194.8000000002},
(pd.Timestamp('2020-07-06'), 'difference'): {0: 0.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 0.0,
7: 0.0,
8: 0.0,
9: 0.0,
10: 0.0,
11: 0.0,
12: 0.0,
13: 0.0,
14: 0.0,
15: 0.0,
16: 0.0,
17: 0.0,
18: 0.0,
19: 0.0},
(pd.Timestamp('2021-06-28'), 'last_sales'): {0: 96.0,
1: 56.0,
2: 106.0,
3: 44.0,
4: 34.0,
5: 13.0,
6: 716.0,
7: 9.0,
8: 358.0,
9: 101.0,
10: 22.0,
11: 120.0,
12: 40.0,
13: 13.0,
14: 8.0,
15: 610.0,
16: 1.0,
17: 305.0,
18: 2652.0,
19: 2652.0},
(pd.Timestamp('2021-06-28'), 'sales'): {0: 5194.95,
1: 19102.219999999994,
2: 22796.420000000002,
3: 30853.115,
4: 11461.25,
5: 992.6,
6: 188143.41,
7: 3671.15,
8: 94071.705,
9: 6022.299999999998,
10: 7373.6,
11: 33514.0,
12: 35943.45,
13: 4749.000000000001,
14: 902.01,
15: 177707.32,
16: 349.3,
17: 88853.66,
18: 731701.46,
19: 731701.46},
(pd.Timestamp('2021-06-28'), 'difference'): {0: 0.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 0.0,
7: 0.0,
8: 0.0,
9: 0.0,
10: 0.0,
11: 0.0,
12: 0.0,
13: 0.0,
14: 0.0,
15: 0.0,
16: 0.0,
17: 0.0,
18: 0.0,
19: 0.0},
(pd.Timestamp('2021-07-07'), 'last_sales'): {0: 45.0,
1: 47.0,
2: 87.0,
3: 45.0,
4: 13.0,
5: 8.0,
6: 494.0,
7: 2.0,
8: 247.0,
9: 81.0,
10: 36.0,
11: 143.0,
12: 56.0,
13: 9.0,
14: 9.0,
15: 670.0,
16: 1.0,
17: 335.0,
18: 2328.0,
19: 2328.0},
(pd.Timestamp('2021-07-07'), 'sales'): {0: 7556.414999999998,
1: 14985.05,
2: 16790.899999999998,
3: 36202.729999999996,
4: 4024.97,
5: 1034.45,
6: 163960.32999999996,
7: 1385.65,
8: 81980.16499999998,
9: 5600.544999999999,
10: 11209.92,
11: 32832.61,
12: 42137.44500000001,
13: 3885.1499999999996,
14: 1191.5,
15: 194912.34000000003,
16: 599.0,
17: 97456.17000000001,
18: 717745.3400000001,
19: 717745.3400000001},
(pd.Timestamp('2021-07-07'), 'difference'): {0: 0.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 0.0,
7: 0.0,
8: 0.0,
9: 0.0,
10: 0.0,
11: 0.0,
12: 0.0,
13: 0.0,
14: 0.0,
15: 0.0,
16: 0.0,
17: 0.0,
18: 0.0,
19: 0.0}}).set_index(['group','category'])
I am trying to flatten it so it is no longer a multiindex df. Since there are several dates, I try to select one:
df.loc[:, '2020-06-29 00:00:00']
But this gives me an error:
KeyError: '2020-06-29 00:00:00'
I want the first week (2020-06-29, which is also my final output) to look like this:
group category last_sales sales difference
A Amazon 195.00 1,268.85 0.00
A Apple 61.00 18,274.39 0.00
A Facebook 106.00 19,722.65 0.00
A Google 61.00 55,547.25 0.00
A Netflix 37.00 15,323.80 0.00
A Tesla 13.00 1,688.67 0.00
A Total 954.00 227,463.23 0.00
A Uber 4.00 1,906.00 0.00
A total 477.00 113,731.62 0.00
B Amazon 50.00 3,219.65 0.00
B Apple 50.00 15,852.06 0.00
B Facebook 75.00 17,743.70 0.00
B Google 43.00 37,795.15 0.00
B Netflix 17.00 5,918.50 0.00
B Tesla 14.00 1,708.75 0.00
B Total 504.00 166,349.64 0.00
B Uber 3.00 937.01 0.00
B total 252.00 83,174.82 0.00
all Total 2,916.00 787,625.74 0.00

Try via pd.to_datetime():
out = df.loc[:, pd.to_datetime('2020-06-29 00:00:00')]
#out = df.loc[:, pd.to_datetime('2020-06-29 00:00:00')].reset_index()
or via pd.Timestamp():
out = df.loc[:, pd.Timestamp('2020-06-29 00:00:00')]
#out = df.loc[:, pd.Timestamp('2020-06-29 00:00:00')].reset_index()
The 0th level of your columns holds Timestamp objects, which you can verify by:
df.columns.to_numpy()
#output
array([(Timestamp('2020-06-29 00:00:00'), 'last_sales'),
(Timestamp('2020-06-29 00:00:00'), 'sales'),
(Timestamp('2020-06-29 00:00:00'), 'difference'),
(Timestamp('2020-07-06 00:00:00'), 'last_sales'),
(Timestamp('2020-07-06 00:00:00'), 'sales'),
(Timestamp('2020-07-06 00:00:00'), 'difference'),
(Timestamp('2021-06-28 00:00:00'), 'last_sales'),
(Timestamp('2021-06-28 00:00:00'), 'sales'),
(Timestamp('2021-06-28 00:00:00'), 'difference'),
(Timestamp('2021-07-07 00:00:00'), 'last_sales'),
(Timestamp('2021-07-07 00:00:00'), 'sales'),
(Timestamp('2021-07-07 00:00:00'), 'difference')], dtype=object)
output of out:
last_sales sales difference
group category
A Amazon 195.0 1268.850 0.0
Apple 61.0 18274.385 0.0
Facebook 106.0 19722.650 0.0
Google 61.0 55547.255 0.0
Netflix 37.0 15323.800 0.0
Tesla 13.0 1688.675 0.0
Total 954.0 227463.230 0.0
Uber 4.0 1906.000 0.0
total 477.0 113731.615 0.0
B Amazon 50.0 3219.650 0.0
Apple 50.0 15852.060 0.0
Facebook 75.0 17743.700 0.0
Google 43.0 37795.150 0.0
Netflix 17.0 5918.500 0.0
Tesla 14.0 1708.750 0.0
Total 504.0 166349.640 0.0
Uber 3.0 937.010 0.0
total 252.0 83174.820 0.0
all Total 2916.0 787625.740 0.0
total 2916.0 787625.740 0.0
NOTE:
There is no need to provide a tuple in .loc[] because you are selecting on the 0th level only.
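If you want to flatten every week at once rather than selecting a single date, moving the date level of the columns into the rows is another option; a minimal sketch, assuming the df built in the question:
# Stack the date level of the columns into the row index
flat = df.stack(level=0)
# Name the new index level and turn the index back into regular columns
flat = flat.rename_axis(['group', 'category', 'week']).reset_index()
print(flat.head())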

I’m also getting a KeyError, but if you use a Timestamp object to index the first-level columns, it seems to work:
>>> df[pd.Timestamp('2020-06-29 00:00:00')]
last_sales sales difference
group category
A Amazon 195.0 1268.850 0.0
Apple 61.0 18274.385 0.0
Facebook 106.0 19722.650 0.0
Google 61.0 55547.255 0.0
Netflix 37.0 15323.800 0.0
Tesla 13.0 1688.675 0.0
Total 954.0 227463.230 0.0
Uber 4.0 1906.000 0.0
total 477.0 113731.615 0.0
B Amazon 50.0 3219.650 0.0
Apple 50.0 15852.060 0.0
Facebook 75.0 17743.700 0.0
Google 43.0 37795.150 0.0
Netflix 17.0 5918.500 0.0
Tesla 14.0 1708.750 0.0
Total 504.0 166349.640 0.0
Uber 3.0 937.010 0.0
total 252.0 83174.820 0.0
all Total 2916.0 787625.740 0.0
total 2916.0 787625.740 0.0
Otherwise you could use .xs, which also gives you more flexibility, e.g. selecting on the second level of the columns, and so on:
>>> df.xs(pd.Timestamp('2020-06-29 00:00:00'), axis='columns', level=0)
last_sales sales difference
group category
A Amazon 195.0 1268.850 0.0
Apple 61.0 18274.385 0.0
Facebook 106.0 19722.650 0.0
Google 61.0 55547.255 0.0
Netflix 37.0 15323.800 0.0
Tesla 13.0 1688.675 0.0
Total 954.0 227463.230 0.0
Uber 4.0 1906.000 0.0
total 477.0 113731.615 0.0
B Amazon 50.0 3219.650 0.0
Apple 50.0 15852.060 0.0
Facebook 75.0 17743.700 0.0
Google 43.0 37795.150 0.0
Netflix 17.0 5918.500 0.0
Tesla 14.0 1708.750 0.0
Total 504.0 166349.640 0.0
Uber 3.0 937.010 0.0
total 252.0 83174.820 0.0
all Total 2916.0 787625.740 0.0
total 2916.0 787625.740 0.0
You can then add .drop(index=[('all', 'total')]) to remove the duplicated total line, and possibly .reset_index() to turn group and category back into regular columns.
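Putting those pieces together, a minimal sketch that produces roughly the desired single-week output (assuming the df from the question):
out = (df.xs(pd.Timestamp('2020-06-29'), axis='columns', level=0)
         .drop(index=[('all', 'total')])  # drop the duplicated grand-total row
         .reset_index())                  # group/category become regular columns
print(out)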
The way to do it with .loc[] is to provide a tuple, with the first item being a Timestamp object and the second an empty slice. However, this keeps both levels of the column index, so it is not quite what you want:
>>> df.loc[:, (pd.Timestamp('2020-06-29 00:00:00'), slice(None))].head(2)
2020-06-29 00:00:00
last_sales sales difference
group category
A Amazon 195.0 1268.850 0.0
Apple 61.0 18274.385 0.0
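If you prefer to stay with .loc, you can drop the now-redundant date level from the columns afterwards; a sketch:
out = df.loc[:, (pd.Timestamp('2020-06-29 00:00:00'), slice(None))]
out = out.droplevel(0, axis='columns')  # remove the date level, keeping last_sales/sales/difference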

Related

Convert `DataFrame.groupby()` into dictionary (and then reverse it)

Say I have the following DataFrame() where I have repeated observations per individual (column id_ind). Hence, the first two rows belong to the first individual, the third and fourth rows belong to the second individual, and so forth...
import pandas as pd
X = pd.DataFrame.from_dict({'x1_1': {0: -0.1766214634108258, 1: 1.645852185286492, 2: -0.13348860101031038, 3: 1.9681043689968933, 4: -1.7004428240831382, 5: 1.4580091413853749, 6: 0.06504113741068565, 7: -1.2168493676768384, 8: -0.3071304478616376, 9: 0.07121332925591593}, 'x1_2': {0: -2.4207773498298844, 1: -1.0828751040719462, 2: 2.73533787008624, 3: 1.5979611987152071, 4: 0.08835542172064115, 5: 1.2209786277076156, 6: -0.44205979195950784, 7: -0.692872860268244, 8: 0.0375521181289943, 9: 0.4656030062266639}, 'x1_3': {0: -1.548320898226322, 1: 0.8457342014424675, 2: -0.21250514722879738, 3: 0.5292389938329516, 4: -2.593946520223666, 5: -0.6188958526077123, 6: 1.6949245117526974, 7: -1.0271341091035742, 8: 0.637561891142571, 9: -0.7717170035055559}, 'x2_1': {0: 0.3797245517345564, 1: -2.2364391598508835, 2: 0.6205947900678905, 3: 0.6623865847688559, 4: 1.562036259999875, 5: -0.13081282910947759, 6: 0.03914373833251773, 7: -0.995761652421108, 8: 1.0649494418154162, 9: 1.3744782478849122}, 'x2_2': {0: -0.5052556836786106, 1: 1.1464291788297152, 2: -0.5662380273138174, 3: 0.6875729143723538, 4: 0.04653136473130827, 5: -0.012885303852347407, 6: 1.5893672346098884, 7: 0.5464286050059511, 8: -0.10430829457707284, 9: -0.5441755265313813}, 'x2_3': {0: -0.9762973303149007, 1: -0.983731467806563, 2: 1.465827578266328, 3: 0.5325950414202745, 4: -1.4452121324204903, 5: 0.8148816373643869, 6: 0.470791989780882, 7: -0.17951636294180473, 8: 0.7351814781280054, 9: -0.28776723200679066}, 'x3_1': {0: 0.12751822396637064, 1: -0.21926633684030983, 2: 0.15758799357206943, 3: 0.5885412224632464, 4: 0.11916562911189271, 5: -1.6436210334529249, 6: -0.12444368631987467, 7: 1.4618564171802453, 8: 0.6847234328916137, 9: -0.23177118858569187}, 'x3_2': {0: -0.6452955690715819, 1: 1.052094761527654, 2: 0.20190339195326157, 3: 0.6839430295237913, 4: -0.2607691613858866, 5: 0.3315513026670213, 6: 0.015901139336566113, 7: 0.15243420084881903, 8: -0.7604225072161022, 9: -0.4387652927008854}, 'x3_3': {0: -1.067058994377549, 1: 0.8026914180717286, 2: -1.9868531745912268, 3: -0.5057770735303253, 4: -1.6589569342151713, 5: 0.358172252880764, 6: 1.9238983803281329, 7: 2.2518318810978246, 8: -1.2781475121874357, 9: -0.7103081175166167}})
Y = pd.DataFrame.from_dict({'CHOICE': {0: 1.0, 1: 1.0, 2: 2.0, 3: 2.0, 4: 3.0, 5: 2.0, 6: 1.0, 7: 1.0, 8: 2.0, 9: 2.0}})
Z = pd.DataFrame.from_dict({'z1': {0: 2.4196730570917233, 1: 2.4196730570917233, 2: 2.822802255159467, 3: 2.822802255159467, 4: 2.073171091633643, 5: 2.073171091633643, 6: 2.044165101485163, 7: 2.044165101485163, 8: 2.4001241292606275, 9: 2.4001241292606275}, 'z2': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 1.0, 5: 1.0, 6: 1.0, 7: 1.0, 8: 0.0, 9: 0.0}, 'z3': {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0, 4: 2.0, 5: 2.0, 6: 2.0, 7: 2.0, 8: 3.0, 9: 3.0}})
id = pd.DataFrame.from_dict({'id_choice': {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0, 4: 5.0, 5: 6.0, 6: 7.0, 7: 8.0, 8: 9.0, 9: 10.0}, 'id_ind': {0: 1.0, 1: 1.0, 2: 2.0, 3: 2.0, 4: 3.0, 5: 3.0, 6: 4.0, 7: 4.0, 8: 5.0, 9: 5.0}} )
# Create a dataframe with all the data
data = pd.concat([id, X, Z, Y], axis=1)
print(data.head(4))
# id_choice id_ind x1_1 x1_2 x1_3 x2_1 x2_2 \
# 0 1.0 1.0 -0.176621 -2.420777 -1.548321 0.379725 -0.505256
# 1 2.0 1.0 1.645852 -1.082875 0.845734 -2.236439 1.146429
# 2 3.0 2.0 -0.133489 2.735338 -0.212505 0.620595 -0.566238
# 3 4.0 2.0 1.968104 1.597961 0.529239 0.662387 0.687573
#
# x2_3 x3_1 x3_2 x3_3 z1 z2 z3 CHOICE
# 0 -0.976297 0.127518 -0.645296 -1.067059 2.419673 0.0 1.0 1.0
# 1 -0.983731 -0.219266 1.052095 0.802691 2.419673 0.0 1.0 1.0
# 2 1.465828 0.157588 0.201903 -1.986853 2.822802 0.0 1.0 2.0
# 3 0.532595 0.588541 0.683943 -0.505777 2.822802 0.0 1.0 2.0
I want to perform two operations.
First, I want to convert the DataFrame data into a dictionary of DataFrame()s where the keys are the individual ids (in this particular case, numbers ranging from 1.0 to 5.0). I've done this below as suggested here. Unfortunately, I am getting a dictionary of plain lists, not a dictionary of DataFrame()s.
# Create a dictionary with the data for each individual
data_dict = data.set_index('id_ind').groupby('id_ind').apply(lambda x : x.to_numpy().tolist()).to_dict()
print(data_dict.keys())
# dict_keys([1.0, 2.0, 3.0, 4.0, 5.0])
print(data_dict[1.0])
#[[1.0, -0.1766214634108258, -2.4207773498298844, -1.548320898226322, 0.3797245517345564, -0.5052556836786106, -0.9762973303149007, 0.12751822396637064, -0.6452955690715819, -1.067058994377549, 2.4196730570917233, 0.0, 1.0, 1.0], [2.0, 1.645852185286492, -1.0828751040719462, 0.8457342014424675, -2.2364391598508835, 1.1464291788297152, -0.983731467806563, -0.21926633684030983, 1.052094761527654, 0.8026914180717286, 2.4196730570917233, 0.0, 1.0, 1.0]]
Second, I want to recover the original DataFrame data reversing the previous operation. The naive approach is as follows. However, it is, of course, not producing the expected result.
# Naive approach
res = pd.DataFrame.from_dict(data_dict, orient='index')
print(res)
# 0 1
#1.0 [1.0, -0.1766214634108258, -2.4207773498298844... [2.0, 1.645852185286492, -1.0828751040719462, ...
#2.0 [3.0, -0.13348860101031038, 2.73533787008624, ... [4.0, 1.9681043689968933, 1.5979611987152071, ...
#3.0 [5.0, -1.7004428240831382, 0.08835542172064115... [6.0, 1.4580091413853749, 1.2209786277076156, ...
#4.0 [7.0, 0.06504113741068565, -0.4420597919595078... [8.0, -1.2168493676768384, -0.692872860268244,...
#5.0 [9.0, -0.3071304478616376, 0.0375521181289943,... [10.0, 0.07121332925591593, 0.4656030062266639...
This solution was inspired by @mozway's comments.
# Create a dictionary with the data for each individual
data_dict = dict(list(data.groupby('id_ind')))
# Convert the dictionary into a dataframe
res = pd.concat(data_dict, axis=0).reset_index(drop=True)
print(res.head(4))
# id_choice id_ind x1_1 x1_2 x1_3 x2_1 x2_2 \
#0 1.0 1.0 -0.176621 -2.420777 -1.548321 0.379725 -0.505256
#1 2.0 1.0 1.645852 -1.082875 0.845734 -2.236439 1.146429
#2 3.0 2.0 -0.133489 2.735338 -0.212505 0.620595 -0.566238
#3 4.0 2.0 1.968104 1.597961 0.529239 0.662387 0.687573
#
# x2_3 x3_1 x3_2 x3_3 z1 z2 z3 CHOICE
#0 -0.976297 0.127518 -0.645296 -1.067059 2.419673 0.0 1.0 1.0
#1 -0.983731 -0.219266 1.052095 0.802691 2.419673 0.0 1.0 1.0
#2 1.465828 0.157588 0.201903 -1.986853 2.822802 0.0 1.0 2.0
#3 0.532595 0.588541 0.683943 -0.505777 2.822802 0.0 1.0 2.0
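A closely related variant keeps actual DataFrame()s as the dictionary values and checks the round trip; a sketch, assuming the data frame built above:
# Dict comprehension: each value is the group's sub-DataFrame
data_dict = {k: g for k, g in data.groupby('id_ind')}
print(type(data_dict[1.0]))  # <class 'pandas.core.frame.DataFrame'>
# Reversing the operation should reproduce the original frame
res = pd.concat(data_dict.values()).reset_index(drop=True)
print(res.equals(data))  # True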

Why does merging 2 data frames give me one with triple the rows?

I have df1:
x y no.
0 -17.7 -0.785430 y1
1 -15.0 -3820.085000 y4
2 -12.5 2.138833 y3
.. .... ........ ..
40 15.6 5.486901 y2
41 19.2 1.980686 y3
42 19.6 9.364718 y2
and df2:
delta y x
0 0.053884 -17.7
1 0.085000 -15.0
2 0.143237 -12.5
.. ........ ....
40 0.113099 15.6
41 0.102245 19.2
42 0.235282 19.6
They both have 43 rows, and the x column is exactly the same in both.
Somehow, when I merge them on x, I get a df with 123 rows:
x y no. delta y
0 -17.7 -0.785430 y1 0.053884
1 -15.0 -3820.085000 y4 0.085000
2 -12.5 2.138833 y3 0.143237
3 -12.4 1.721205 y3 0.251180
4 -12.1 2.227343 y2 0.127343
.. ... ... .. ...
118 12.1 1.642526 y3 0.143886
119 14.4 2576.435000 y4 0.171000
120 15.6 5.486901 y2 0.113099
121 19.2 1.980686 y3 0.102245
122 19.6 9.364718 y2 0.235282
My input: final = df1.merge(df2, on="x")
The dtypes of df1 and df2:
x float64
y float64
no. object
dtype: object
delta y float64
x float64
dtype: object
df1 = pd.DataFrame({'x': {0: -17.7, 1: -15.0, 2: -12.5, 3: -12.4, 4: -12.1, 5: -11.2, 6: -8.9, 7: -7.5, 8: -7.5, 9: -6.0, 10: -6.0, 11: -4.7, 12: -4.1, 13: -3.8, 14: -3.4, 15: -3.4, 16: -1.9, 17: -1.5, 18: -1.1, 19: -0.4, 20: -0.1, 21: 3.5, 22: 3.8, 23: 5.3, 24: 5.3, 25: 5.3, 26: 5.3, 27: 5.3, 28: 5.3, 29: 5.3, 30: 5.3, 31: 5.3, 32: 6.4, 33: 6.8, 34: 6.8, 35: 10.2, 36: 10.3, 37: 11.9, 38: 12.1, 39: 14.4, 40: 15.6, 41: 19.2, 42: 19.6}, 'y': {0: -0.7854295, 1: -3820.085, 2: 2.1388333, 3: 1.7212046, 4: 2.227343, 5: 0.04315967, 6: -0.9616607, 7: -1.9878536, 8: -0.52237016, 9: -283.27216, 10: -282.5332, 11: -0.4335017, 12: -1.1585577, 13: -0.008831219, 14: 848.92303, 15: -57.407845, 16: -9.010686, 17: -3.2473037, 18: 0.5536767, 19: 1.8351307, 20: 4.8347697, 21: -6.45842, 22: -1.5683812, 23: 0.9338831, 24: 0.9338831, 25: 97.65833, 26: 1.6500127, 27: 1.6500127, 28: 97.65833, 29: 97.65833, 30: 1.6500127, 31: 97.65833, 32: -3.655422, 33: 1.9058462, 34: 227.5592, 35: 857.7455, 36: -0.68584794, 37: 1.6785516, 38: 1.6425261, 39: 2576.435, 40: 5.4869013, 41: 1.9806856, 42: 9.364718}, 'no.': {0: 'y1', 1: 'y4', 2: 'y3', 3: 'y3', 4: 'y2', 5: 'y3', 6: 'y2', 7: 'y2', 8: 'y2', 9: 'y4', 10: 'y4', 11: 'y1', 12: 'y3', 13: 'y1', 14: 'y4', 15: 'y4', 16: 'y4', 17: 'y4', 18: 'y1', 19: 'y3', 20: 'y4', 21: 'y2', 22: 'y3', 23: 'y3', 24: 'y3', 25: 'y4', 26: 'y3', 27: 'y3', 28: 'y4', 29: 'y3', 30: 'y4', 31: 'y4', 32: 'y2', 33: 'y3', 34: 'y3', 35: 'y4', 36: 'y3', 37: 'y3', 38: 'y3', 39: 'y4', 40: 'y2', 41: 'y3', 42: 'y2'}})
df2 = pd.DataFrame({'delta y': {0: 0.05388353000000001, 1: 0.08500000000003638, 2: 0.14323679999999994, 3: 0.25117999999999996, 4: 0.12734299999999976, 5: 0.36285006000000003, 6: 0.13833930000000005, 7: 0.5121464, 8: 1.97762984, 9: 0.2721599999999853, 10: 0.4667999999999779, 11: 0.2692114, 12: 0.00890970000000002, 13: 0.314458351, 14: 906.34703, 15: 0.0161549999999977, 16: 0.06831400000000087, 17: 0.3723036999999998, 18: 0.2988478, 19: 0.006991300000000145, 20: 0.14423030000000026, 21: 0.04157999999999973, 22: 0.013554200000000183, 23: 0.17486560000000007, 24: 0.17486560000000007, 25: 0.03866999999999621, 26: 0.541264, 27: 0.541264, 28: 0.03866999999999621, 29: 96.5495813, 30: 96.0469873, 31: 0.03866999999999621, 32: 0.05542200000000008, 33: 0.1670513, 34: 225.82040510000002, 35: 0.38250000000005, 36: 0.59580486, 37: 0.10641100000000003, 38: 0.14388610000000002, 39: 0.17099999999982174, 40: 0.11309869999999922, 41: 0.10224489999999986, 42: 0.23528199999999977}, 'x': {0: -17.7, 1: -15.0, 2: -12.5, 3: -12.4, 4: -12.1, 5: -11.2, 6: -8.9, 7: -7.5, 8: -7.5, 9: -6.0, 10: -6.0, 11: -4.7, 12: -4.1, 13: -3.8, 14: -3.4, 15: -3.4, 16: -1.9, 17: -1.5, 18: -1.1, 19: -0.4, 20: -0.1, 21: 3.5, 22: 3.8, 23: 5.3, 24: 5.3, 25: 5.3, 26: 5.3, 27: 5.3, 28: 5.3, 29: 5.3, 30: 5.3, 31: 5.3, 32: 6.4, 33: 6.8, 34: 6.8, 35: 10.2, 36: 10.3, 37: 11.9, 38: 12.1, 39: 14.4, 40: 15.6, 41: 19.2, 42: 19.6}})
final = df1.merge(df2, on="x")
The problem is that the x values are not unique, so the merge duplicates rows to get all of the combinations. A simple example:
>>> import pandas as pd
>>> df1=pd.DataFrame({"a":[1,2,3,2], "b":['a', 'b', 'c', 'd']})
>>> df2=pd.DataFrame({"a":[1,2,3,2], "c":['aa', 'bb', 'cc', 'dd']})
>>> df1.merge(df2, on='a')
a b c
0 1 a aa
1 2 b bb
2 2 b dd
3 2 d bb
4 2 d dd
5 3 c cc
2 is not unique in the column, so you get all of the combinations (notice that both b and d are paired with both bb and dd).
In your case the x column is identical in the two dataframes, which also means the indexes line up, so you can simply assign the column you want to df1:
df1["delta y"] = df2["delta y"]
Try the following: df1.join(df2, rsuffix='_df2'). (A plain df1.join(df2) raises ValueError: columns overlap but no suffix specified, because both frames have an x column.)
join is a column-wise left join
pd.merge is a column-wise inner join
pd.concat is a row-wise outer join
pd.concat:
takes iterable arguments; thus, it cannot take DataFrames directly (use [df1, df2])
Dimensions of DataFrame should match along axis
Join and pd.merge:
can take DataFrame arguments
ref: Merge two dataframes by index
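For this question's frames, where the indexes already line up, a minimal sketch of the join route:
# Drop the duplicate x column so join does not complain about overlapping names
final = df1.join(df2.drop(columns='x'))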
Try the following syntax, and I encourage you to thoroughly read the official pandas documentation; the link is given at the bottom.
I think you might have x values in df1 and df2 that are not 100% identical, perhaps because of the decimals.
import pandas as pd
left = pd.DataFrame(
    {
        "key": ["K0", "K1", "K2", "K3"],
        "A": ["A0", "A1", "A2", "A3"],
        "B": ["B0", "B1", "B2", "B3"],
    }
)
right = pd.DataFrame(
    {
        "key": ["K0", "K1", "K2", "K3"],
        "C": ["C0", "C1", "C2", "C3"],
        "D": ["D0", "D1", "D2", "D3"],
    }
)
result = pd.merge(left, right, on="key")
print(result)
#   key   A   B   C   D
# 0  K0  A0  B0  C0  D0
# 1  K1  A1  B1  C1  D1
# 2  K2  A2  B2  C2  D2
# 3  K3  A3  B3  C3  D3
pandas Merge, Join, Concatenate official guide

How to Use Melt to Tidy Dataframe in Pandas?

dt = {'Ind': {0: 'Ind1',
1: 'Ind2',
2: 'Ind3',
3: 'Ind4',
4: 'Ind5',
5: 'Ind6',
6: 'Ind7',
7: 'Ind8',
8: 'Ind9',
9: 'Ind10',
10: 'Ind1',
11: 'Ind2',
12: 'Ind3',
13: 'Ind4',
14: 'Ind5',
15: 'Ind6',
16: 'Ind7',
17: 'Ind8',
18: 'Ind9',
19: 'Ind10'},
'Treatment': {0: 'Treat',
1: 'Treat',
2: 'Treat',
3: 'Treat',
4: 'Treat',
5: 'Treat',
6: 'Treat',
7: 'Treat',
8: 'Treat',
9: 'Treat',
10: 'Cont',
11: 'Cont',
12: 'Cont',
13: 'Cont',
14: 'Cont',
15: 'Cont',
16: 'Cont',
17: 'Cont',
18: 'Cont',
19: 'Cont'},
'value': {0: 4.5,
1: 8.3,
2: 6.2,
3: 4.2,
4: 7.1,
5: 7.5,
6: 7.9,
7: 5.1,
8: 5.8,
9: 6.0,
10: 11.3,
11: 11.6,
12: 13.3,
13: 12.2,
14: 13.4,
15: 11.7,
16: 12.1,
17: 12.0,
18: 14.0,
19: 13.8}}
mydt = pd.DataFrame(dt, columns=['Ind', 'Treatment', 'value'])
How can I tidy up my dataframe to make it look like the desired output below?
[Desired output image]
You can use DataFrame.from_dict
pd.DataFrame.from_dict(dt, orient='index')
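Since the question asks for a reshape, a pivot may be closer to the desired output; a sketch, assuming the goal is one row per individual with one column per treatment:
mydt = pd.DataFrame(dt, columns=['Ind', 'Treatment', 'value'])
wide = mydt.pivot(index='Ind', columns='Treatment', values='value')
print(wide.head())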

Adding a total per level-2 index in a multiindex pandas dataframe

I have a dataframe:
df_full = pd.DataFrame.from_dict({('group', ''): {0: 'A',
1: 'A',
2: 'A',
3: 'A',
4: 'A',
5: 'A',
6: 'A',
7: 'B',
8: 'B',
9: 'B',
10: 'B',
11: 'B',
12: 'B',
13: 'B'},
('category', ''): {0: 'Books',
1: 'Candy',
2: 'Pencil',
3: 'Table',
4: 'PC',
5: 'Printer',
6: 'Lamp',
7: 'Books',
8: 'Candy',
9: 'Pencil',
10: 'Table',
11: 'PC',
12: 'Printer',
13: 'Lamp'},
(pd.Timestamp('2021-06-28 00:00:00'),
'Sales_1'): {0: 9.937449997200002, 1: 30.71300000639998, 2: 58.81199999639999, 3: 25.661999978399994, 4: 3.657999996, 5: 12.0879999972, 6: 61.16600000040001, 7: 6.319439989199998, 8: 12.333119997600003, 9: 24.0544100028, 10: 24.384659998799997, 11: 1.9992000012000002, 12: 0.324, 13: 40.69122000000001},
(pd.Timestamp('2021-06-28 00:00:00'),
'Sales_2'): {0: 21.890370397789923, 1: 28.300470581874837, 2: 53.52039700062155, 3: 52.425508769690694, 4: 6.384936971649232, 5: 6.807138946302334, 6: 52.172, 7: 5.916852561, 8: 5.810764652, 9: 12.1243325, 10: 17.88071596, 11: 0.913782413, 12: 0.869207661, 13: 20.9447844},
(pd.Timestamp('2021-06-28 00:00:00'), 'last_week_sales'): {0: np.nan,
1: np.nan,
2: np.nan,
3: np.nan,
4: np.nan,
5: np.nan,
6: np.nan,
7: np.nan,
8: np.nan,
9: np.nan,
10: np.nan,
11: np.nan,
12: np.nan,
13: np.nan},
(pd.Timestamp('2021-06-28 00:00:00'), 'total_orders'): {0: 86.0,
1: 66.0,
2: 188.0,
3: 556.0,
4: 12.0,
5: 4.0,
6: 56.0,
7: 90.0,
8: 26.0,
9: 49.0,
10: 250.0,
11: 7.0,
12: 2.0,
13: 44.0},
(pd.Timestamp('2021-06-28 00:00:00'), 'total_sales'): {0: 4390.11,
1: 24825.059999999998,
2: 48592.39999999998,
3: 60629.77,
4: 831.22,
5: 1545.71,
6: 34584.99,
7: 5641.54,
8: 6798.75,
9: 13290.13,
10: 42692.68000000001,
11: 947.65,
12: 329.0,
13: 29889.65},
(pd.Timestamp('2021-07-05 00:00:00'),
'Sales_1'): {0: 13.690399997999998, 1: 38.723000005199985, 2: 72.4443400032, 3: 36.75802000560001, 4: 5.691999996, 5: 7.206999998399999, 6: 66.55265999039996, 7: 6.4613199911999954, 8: 12.845630001599998, 9: 26.032340003999998, 10: 30.1634600016, 11: 1.0203399996, 12: 1.4089999991999997, 13: 43.67116000320002},
(pd.Timestamp('2021-07-05 00:00:00'),
'Sales_2'): {0: 22.874363860953647, 1: 29.5726042895728, 2: 55.926190956481534, 3: 54.7820864335212, 4: 6.671946105284065, 5: 7.113126469779095, 6: 54.517, 7: 6.194107518, 8: 6.083562133, 9: 12.69221484, 10: 18.71872129, 11: 0.956574175, 12: 0.910216433, 13: 21.92632044},
(pd.Timestamp('2021-07-05 00:00:00'), 'last_week_sales'): {0: 4390.11,
1: 24825.059999999998,
2: 48592.39999999998,
3: 60629.77,
4: 831.22,
5: 1545.71,
6: 34584.99,
7: 5641.54,
8: 6798.75,
9: 13290.13,
10: 42692.68000000001,
11: 947.65,
12: 329.0,
13: 29889.65},
(pd.Timestamp('2021-07-05 00:00:00'), 'total_orders'): {0: 109.0,
1: 48.0,
2: 174.0,
3: 587.0,
4: 13.0,
5: 5.0,
6: 43.0,
7: 62.0,
8: 13.0,
9: 37.0,
10: 196.0,
11: 8.0,
12: 1.0,
13: 33.0},
(pd.Timestamp('2021-07-05 00:00:00'), 'total_sales'): {0: 3453.02,
1: 17868.730000000003,
2: 44707.82999999999,
3: 60558.97999999999,
4: 1261.0,
5: 1914.6000000000001,
6: 24146.09,
7: 6201.489999999999,
8: 5513.960000000001,
9: 9645.87,
10: 25086.785,
11: 663.0,
12: 448.61,
13: 26332.7}}).set_index(['group','category'])
I am trying to get a total for each column per group. So in this df example I want to add a 'Total' line below Lamp in each of the two groups containing the totals of each column (in the original screenshot, red lines indicated the desired placement).
What I've tried:
df_out['total'] = df_out.sum(level=1).loc[:, (slice(None), 'total_sales')]
But get:
ValueError: Wrong number of items passed 4, placement implies 1
I also checked this question but could not apply it to my case.
Let us try groupby on level=0:
s = df_full.groupby(level=0).sum()
s.index = pd.MultiIndex.from_product([s.index, ['Total']])
df_out = df_full.append(s).sort_index()
print(df_out)
2021-06-28 00:00:00 2021-07-05 00:00:00
Sales_1 Sales_2 last_week_sales total_orders total_sales Sales_1 Sales_2 last_week_sales total_orders total_sales
group category
A Books 9.93745 21.890370 NaN 86.0 4390.11 13.69040 22.874364 4390.11 109.0 3453.020
Candy 30.71300 28.300471 NaN 66.0 24825.06 38.72300 29.572604 24825.06 48.0 17868.730
Lamp 61.16600 52.172000 NaN 56.0 34584.99 66.55266 54.517000 34584.99 43.0 24146.090
PC 3.65800 6.384937 NaN 12.0 831.22 5.69200 6.671946 831.22 13.0 1261.000
Pencil 58.81200 53.520397 NaN 188.0 48592.40 72.44434 55.926191 48592.40 174.0 44707.830
Printer 12.08800 6.807139 NaN 4.0 1545.71 7.20700 7.113126 1545.71 5.0 1914.600
Table 25.66200 52.425509 NaN 556.0 60629.77 36.75802 54.782086 60629.77 587.0 60558.980
Total 202.03645 221.500823 0.0 968.0 175399.26 241.06742 231.457318 175399.26 979.0 153910.250
B Books 6.31944 5.916853 NaN 90.0 5641.54 6.46132 6.194108 5641.54 62.0 6201.490
Candy 12.33312 5.810765 NaN 26.0 6798.75 12.84563 6.083562 6798.75 13.0 5513.960
Lamp 40.69122 20.944784 NaN 44.0 29889.65 43.67116 21.926320 29889.65 33.0 26332.700
PC 1.99920 0.913782 NaN 7.0 947.65 1.02034 0.956574 947.65 8.0 663.000
Pencil 24.05441 12.124332 NaN 49.0 13290.13 26.03234 12.692215 13290.13 37.0 9645.870
Printer 0.32400 0.869208 NaN 2.0 329.00 1.40900 0.910216 329.00 1.0 448.610
Table 24.38466 17.880716 NaN 250.0 42692.68 30.16346 18.718721 42692.68 196.0 25086.785
Total 110.10605 64.460440 0.0 468.0 99589.40 121.60325 67.481717 99589.40 350.0 73892.415
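Note that DataFrame.append was later deprecated (and removed in pandas 2.0); the same result with pd.concat, as a sketch:
s = df_full.groupby(level=0).sum()
s.index = pd.MultiIndex.from_product([s.index, ['Total']])
df_out = pd.concat([df_full, s]).sort_index()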

Adding hourly values from every 3 columns in Python

Imagine you have a raw dataframe like the following (shown as an image in the original post):
What I would like, in order to be able to work on the data, is to rearrange it so that every 3 columns (which represent the hourly values of each day) become a new row with the datetime values (e.g. 2015-05-31 00:00:00, 2015-05-31 01:00:00, 2015-05-31 02:00:00, etc.), ending up with just 4 columns: Date, Tmin, Tmax, and Nsum.
Here is the raw dictionary from the imported CSV (just a few rows):
{'Date': {0: '2015-04-30', 1: '2015-05-01', 2: '2015-05-02', 3: '2015-05-03', 4: '2015-05-04'}, 'T min °C': {0: 11.7, 1: 8.3, 2: 8.3, 3: 11.6, 4: 12.4}, 'T max °C': {0: 11.9, 1: 8.9, 2: 8.4, 3: 11.8, 4: 12.7}, 'N sum mm': {0: 0.0, 1: 0.0, 2: 0.6, 3: 1.9, 4: 0.0}, 'T min °C.1': {0: 11.6, 1: 8.0, 2: 8.3, 3: 11.4, 4: 12.4}, 'T max °C.1': {0: 11.8, 1: 8.2, 2: 8.3, 3: 11.6, 4: 12.4}, 'N sum mm.1': {0: 0.0, 1: 0.1, 2: 0.6, 3: 0.3, 4: 0.0}, 'T min °C.2': {0: 10.2, 1: 7.9, 2: 8.2, 3: 11.1, 4: 12.2}, 'T max °C.2': {0: 11.2, 1: 8.1, 2: 8.3, 3: 11.4, 4: 12.3}, 'N sum mm.2': {0: 0.0, 1: 0.0, 2: 1.5, 3: 0.2, 4: 0.0}, 'T min °C.3': {0: 9.2, 1: 7.5, 2: 8.1, 3: 11.0, 4: 12.1}, 'T max °C.3': {0: 9.8, 1: 7.8, 2: 8.2, 3: 11.1, 4: 12.2}, 'N sum mm.3': {0: 0.0, 1: 0.0, 2: 0.4, 3: 0.0, 4: 0.0}, 'T min °C.4': {0: 8.8, 1: 7.0, 2: 8.2, 3: 10.8, 4: 12.0}, 'T max °C.4': {0: 9.2, 1: 7.5, 2: 8.2, 3: 10.9, 4: 12.1}, 'N sum mm.4': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.1, 4: 0.0}, 'T min °C.5': {0: 8.4, 1: 7.0, 2: 8.2, 3: 10.6, 4: 11.9}, 'T max °C.5': {0: 8.6, 1: 7.1, 2: 8.3, 3: 10.8, 4: 12.1}, 'N sum mm.5': {0: 0.1, 1: 0.0, 2: 0.0, 3: 0.2, 4: 0.0}, 'T min °C.6': {0: 8.6, 1: 6.9, 2: 8.1, 3: 10.5, 4: 11.8}, 'T max °C.6': {0: 8.7, 1: 7.0, 2: 8.3, 3: 10.6, 4: 11.9}, 'N sum mm.6': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.1, 4: 0.0}, 'T min °C.7': {0: 8.5, 1: 6.8, 2: 8.4, 3: 10.4, 4: 11.8}, 'T max °C.7': {0: 8.7, 1: 7.0, 2: 8.9, 3: 10.5, 4: 12.0}, 'N sum mm.7': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.2, 4: 0.2}, 'T min °C.8': {0: 8.4, 1: 7.0, 2: 9.1, 3: 10.6, 4: 12.0}, 'T max °C.8': {0: 8.4, 1: 7.2, 2: 10.8, 3: 10.8, 4: 12.8}, 'N sum mm.8': {0: 1.4, 1: 0.0, 2: 0.0, 3: 0.1, 4: 0.0}, 'T min °C.9': {0: 7.0, 1: 7.3, 2: 11.2, 3: 10.9, 4: 13.0}, 'T max °C.9': {0: 8.3, 1: 7.8, 2: 12.5, 3: 11.4, 4: 13.5}, 'N sum mm.9': {0: 2.9, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0}, 'T min °C.10': {0: 6.7, 1: 8.0, 2: 12.3, 3: 11.5, 4: 13.6}, 'T max °C.10': {0: 6.9, 1: 8.2, 2: 13.9, 3: 12.3, 4: 14.8}, 'N sum mm.10': {0: 2.9, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0}, 'T min °C.11': {0: 6.5, 1: 8.2, 2: 14.5, 3: 12.3, 4: 15.0}, 'T max °C.11': {0: 6.6, 1: 8.5, 2: 16.1, 3: 12.7, 4: 15.8}, 'N sum mm.11': {0: 3.7, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0}, 'T min °C.12': {0: 6.7, 1: 8.3, 2: 16.3, 3: 12.8, 4: 15.9}, 'T max °C.12': {0: 7.3, 1: 8.4, 2: 17.6, 3: 13.4, 4: 16.3}, 'N sum mm.12': {0: 1.1, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0}, 'T min °C.13': {0: 7.6, 1: 8.4, 2: 17.8, 3: 13.6, 4: 16.3}, 'T max °C.13': {0: 8.8, 1: 8.5, 2: 18.6, 3: 13.9, 4: 17.0}, 'N sum mm.13': {0: 0.0, 1: 0.1, 2: 0.0, 3: 0.0, 4: 0.0}, 'T min °C.14': {0: 9.5, 1: 8.6, 2: 19.2, 3: 14.1, 4: 17.0}, 'T max °C.14': {0: 11.4, 1: 9.1, 2: 19.8, 3: 14.3, 4: 17.3}, 'N sum mm.14': {0: 0.0, 1: 0.3, 2: 0.0, 3: 0.0, 4: 0.0}, 'T min °C.15': {0: 11.4, 1: 9.0, 2: 20.0, 3: 14.4, 4: 16.7}, 'T max °C.15': {0: 12.6, 1: 9.1, 2: 20.5, 3: 15.0, 4: 17.0}, 'N sum mm.15': {0: 0.0, 1: 0.4, 2: 0.0, 3: 0.0, 4: 0.0}, 'T min °C.16': {0: 12.6, 1: 9.1, 2: 20.0, 3: 14.8, 4: 16.8}, 'T max °C.16': {0: 13.4, 1: 9.3, 2: 20.4, 3: 14.9, 4: 17.1}, 'N sum mm.16': {0: 0.0, 1: 0.2, 2: 0.0, 3: 0.0, 4: 0.0}, 'T min °C.17': {0: 13.7, 1: 9.2, 2: 19.6, 3: 14.6, 4: 16.3}, 'T max °C.17': {0: 14.1, 1: 9.3, 2: 20.0, 3: 14.7, 4: 16.5}, 'N sum mm.17': {0: 0.0, 1: 0.1, 2: 0.0, 3: 0.0, 4: 0.0}, 'T min °C.18': {0: 12.9, 1: 8.9, 2: 17.7, 3: 14.2, 4: 16.0}, 'T max °C.18': {0: 13.9, 1: 9.1, 2: 19.4, 3: 14.6, 4: 16.4}, 'N sum mm.18': {0: 0.0, 1: 0.1, 2: 0.0, 3: 0.0, 4: 0.0}, 'T min °C.19': {0: 11.0, 1: 8.7, 2: 16.0, 3: 14.0, 4: 15.3}, 'T max °C.19': {0: 12.2, 1: 8.9, 2: 17.9, 3: 14.1, 4: 16.1}, 'N sum 
mm.19': {0: 0.0, 1: 0.2, 2: 0.0, 3: 0.0, 4: 0.0}, 'T min °C.20': {0: 9.9, 1: 8.6, 2: 14.6, 3: 13.2, 4: 14.5}, 'T max °C.20': {0: 10.9, 1: 8.7, 2: 16.0, 3: 14.0, 4: 15.4}, 'N sum mm.20': {0: 0.0, 1: 0.7, 2: 0.0, 3: 0.0, 4: 0.0}, 'T min °C.21': {0: 10.2, 1: 8.6, 2: 13.8, 3: 12.8, 4: 14.2}, 'T max °C.21': {0: 10.5, 1: 8.6, 2: 14.9, 3: 13.4, 4: 14.9}, 'N sum mm.21': {0: 0.0, 1: 1.5, 2: 0.2, 3: 0.0, 4: 0.0}, 'T min °C.22': {0: 9.1, 1: 8.5, 2: 12.1, 3: 12.8, 4: 13.8}, 'T max °C.22': {0: 10.2, 1: 8.5, 2: 13.2, 3: 12.9, 4: 14.3}, 'N sum mm.22': {0: 0.0, 1: 1.3, 2: 0.7, 3: 0.0, 4: 0.0}, 'T min °C.23': {0: 9.1, 1: 8.4, 2: 11.9, 3: 12.7, 4: 13.4}, 'T max °C.23': {0: 9.6, 1: 8.4, 2: 12.7, 3: 12.8, 4: 14.1}, 'N sum mm.23': {0: 0.0, 1: 1.3, 2: 2.1, 3: 0.0, 4: 0.0}}
First create a DatetimeIndex, then reshape the values into 3 columns and build the new index with numpy.repeat:
import numpy as np
import pandas as pd

df = df.set_index('Date')
df = pd.DataFrame(df.values.reshape(-1, 3),
                  index=pd.to_datetime(np.repeat(df.index, len(df.columns) // 3)),
                  columns=['Tmin', 'Tmax', 'Nsum'])
Last, add the hours by converting the row position modulo 24 to timedeltas:
df.index += pd.to_timedelta(np.arange(len(df)) % 24, unit='h')
df = df.rename_axis('Date').reset_index()
print(df.head(30))
Date Tmin Tmax Nsum
0 2015-04-30 00:00:00 11.7 11.9 0.0
1 2015-04-30 01:00:00 11.6 11.8 0.0
2 2015-04-30 02:00:00 10.2 11.2 0.0
3 2015-04-30 03:00:00 9.2 9.8 0.0
4 2015-04-30 04:00:00 8.8 9.2 0.0
5 2015-04-30 05:00:00 8.4 8.6 0.1
6 2015-04-30 06:00:00 8.6 8.7 0.0
7 2015-04-30 07:00:00 8.5 8.7 0.0
8 2015-04-30 08:00:00 8.4 8.4 1.4
9 2015-04-30 09:00:00 7.0 8.3 2.9
10 2015-04-30 10:00:00 6.7 6.9 2.9
11 2015-04-30 11:00:00 6.5 6.6 3.7
12 2015-04-30 12:00:00 6.7 7.3 1.1
13 2015-04-30 13:00:00 7.6 8.8 0.0
14 2015-04-30 14:00:00 9.5 11.4 0.0
15 2015-04-30 15:00:00 11.4 12.6 0.0
16 2015-04-30 16:00:00 12.6 13.4 0.0
17 2015-04-30 17:00:00 13.7 14.1 0.0
18 2015-04-30 18:00:00 12.9 13.9 0.0
19 2015-04-30 19:00:00 11.0 12.2 0.0
20 2015-04-30 20:00:00 9.9 10.9 0.0
21 2015-04-30 21:00:00 10.2 10.5 0.0
22 2015-04-30 22:00:00 9.1 10.2 0.0
23 2015-04-30 23:00:00 9.1 9.6 0.0
24 2015-05-01 00:00:00 8.3 8.9 0.0
25 2015-05-01 01:00:00 8.0 8.2 0.1
26 2015-05-01 02:00:00 7.9 8.1 0.0
27 2015-05-01 03:00:00 7.5 7.8 0.0
28 2015-05-01 04:00:00 7.0 7.5 0.0
29 2015-05-01 05:00:00 7.0 7.1 0.0
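An alternative that avoids the raw .values reshape is to turn the columns into an (hour, measurement) MultiIndex and stack the hour level; a sketch under the same assumptions (3 measurements per hour, 24 hours per row), with raw being the dictionary above:
import pandas as pd

df = pd.DataFrame(raw).set_index('Date')
# 72 data columns -> hours 0..23, each with Tmin, Tmax, Nsum (the CSV's column order)
df.columns = pd.MultiIndex.from_product([range(24), ['Tmin', 'Tmax', 'Nsum']])
out = df.stack(level=0)  # index becomes (Date, hour)
out.index = (pd.to_datetime(out.index.get_level_values(0))
             + pd.to_timedelta(out.index.get_level_values(1), unit='h'))
out = out.rename_axis('Date').reset_index()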
