How to transform multiple columns of days into week columns - python

I would like to know how I can transform the day columns into week columns.
I tried groupby.sum(), but there is no column-name pattern, so I don't know what to group by.
The result should have column names like 'weekX' - "week1 (sum of the first 7 days), week2, week3" and so on.
Thanks in advance.

You can try:
# build an integer key mapping each day column (from the 5th column on) to a week: 0, 0, ..., 1, 1, ...
idx = pd.RangeIndex(len(df.columns[4:])) // 7
# sum each block of 7 day columns, then label the groups Week1, Week2, ...
out = df.iloc[:, 4:].groupby(idx, axis=1).sum().rename(columns=lambda x: f'Week{x+1}')
# reattach the first four identifier columns
out = pd.concat([df.iloc[:, :4], out], axis=1)
print(out)
# Output
Province/State Country/Region Lat ... Week26 Week27 Week28
0 NaN Afghanistan 3.393.911 ... 247210 252460 219855
1 NaN Albania 411.533 ... 28068 32671 32113
2 NaN Algeria 280.339 ... 157675 187224 183841
3 NaN Andorra 425.063 ... 6147 6283 5552
4 NaN Angola -112.027 ... 4741 6341 6978
.. ... ... ... ... ... ... ...
261 NaN Sao Tome and Principe 1.864 ... 5199 5813 5231
262 NaN Yemen 15.552.727 ... 11089 11717 10363
263 NaN Comoros -116.455 ... 2310 2419 2292
264 NaN Tajikistan 38.861 ... 47822 50032 44579
265 NaN Lesotho -29.61 ... 2259 3011 3922
[266 rows x 32 columns]
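Note that DataFrame.groupby(..., axis=1) is deprecated in recent pandas releases. As a sketch under that assumption, the same block-sum can be done by transposing, grouping along the rows, and transposing back:
# sketch for newer pandas, where groupby(axis=1) is deprecated:
# transpose so the day columns become rows, sum blocks of 7, transpose back
days = df.iloc[:, 4:]
idx = pd.RangeIndex(len(days.columns)) // 7
weekly = days.T.groupby(idx).sum().T.rename(columns=lambda x: f'Week{x+1}')
out = pd.concat([df.iloc[:, :4], weekly], axis=1)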

You can use the melt method to combine all your date columns into a single 'Date' column:
df = df.melt(id_vars=['Province/State', 'Country/Region', 'Lat', 'Long'], var_name='Date', value_name='Value')
From this point it should be straightforward to group the 'Date' column by week, and then unstack it if you want it as multiple columns again.
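As a rough sketch of that pipeline (assuming the melted 'Date' strings parse with pd.to_datetime, and using blocks of 7 days from the first date to match "week1 = sum of the first 7 days"):
# sketch: melt, derive a week number, aggregate, and pivot back to columns
long = df.melt(id_vars=['Province/State', 'Country/Region', 'Lat', 'Long'],
               var_name='Date', value_name='Value')
long['Date'] = pd.to_datetime(long['Date'])
long['Week'] = (long['Date'] - long['Date'].min()).dt.days // 7 + 1
weekly = (long.groupby(['Country/Region', 'Week'])['Value'].sum()
              .unstack('Week').add_prefix('Week'))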

Related

Using DataFrame Columns as id

Does anyone know how to transform this DataFrame so that the column names become a query ID (keeping the df length) and the values are flattened? I am trying to learn about 'learning to rank' algorithms. Thanks for the help.
AUD=X CAD=X CHF=X ... SGD=X THB=X ZAR=X
Date ...
2004-06-30 NaN 1.33330 1.25040 ... 1.72090 40.834999 6.12260
2004-07-01 NaN 1.33160 1.24900 ... 1.71420 40.716999 6.16500
2004-07-02 NaN 1.32270 1.23320 ... 1.71160 40.638000 6.12010
2004-07-05 NaN 1.32470 1.23490 ... 1.71480 40.658001 6.15010
2004-07-06 NaN 1.32660 1.23660 ... 1.71530 40.765999 6.20990
... ... ... ... ... ... ...
2021-07-19 1.352997 1.26169 0.91853 ... 1.35630 32.810001 14.38950
2021-07-20 1.362546 1.27460 0.91850 ... 1.36360 32.840000 14.53068
2021-07-21 1.362600 1.26751 0.92123 ... 1.36621 32.820000 14.59157
2021-07-22 1.360060 1.25689 0.91757 ... 1.36383 32.849998 14.57449
2021-07-23 1.354922 1.25640 0.91912 ... 1.35935 32.879002 14.69760
In [3]: df
Out[3]:
AUD=X CAD=X CHF=X SGD=X THB=X ZAR=X
Date
2004-06-30 NaN 1.3333 1.2504 1.7209 40.834999 6.1226
2004-07-01 NaN 1.3316 1.2490 1.7142 40.716999 6.1650
2004-07-02 NaN 1.3227 1.2332 1.7116 40.638000 6.1201
2004-07-05 NaN 1.3247 1.2349 1.7148 40.658001 6.1501
2004-07-06 NaN 1.3266 1.2366 1.7153 40.765999 6.2099
In [6]: df.columns = df.columns.str.slice(0, -2)
In [8]: df.T
Out[8]:
Date 2004-06-30 2004-07-01 2004-07-02 2004-07-05 2004-07-06
AUD NaN NaN NaN NaN NaN
CAD 1.333300 1.331600 1.3227 1.324700 1.326600
CHF 1.250400 1.249000 1.2332 1.234900 1.236600
SGD 1.720900 1.714200 1.7116 1.714800 1.715300
THB 40.834999 40.716999 40.6380 40.658001 40.765999
ZAR 6.122600 6.165000 6.1201 6.150100 6.209900
I'm still not super clear on the requirements, but this transformation might help.
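If the goal is instead one row per (date, ticker) pair, a melt-based sketch (assuming 'Date' is the index, as shown above) would be:
# sketch: flatten the wide frame so each former column name becomes a query id
flat = df.reset_index().melt(id_vars='Date', var_name='query_id', value_name='value')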

Python : Remodeling the presentation data from a pandas Dataframe / group duplicates

Let's say that I have this dataframe with three columns: "Name", "Account" and "Ccy".
import pandas as pd
Name = ['Dan', 'Mike', 'Dan', 'Dan', 'Sara', 'Charles', 'Mike', 'Karl']
Account = ['100', '30', '50', '200', '90', '20', '65', '230']
Ccy = ['EUR','EUR','USD','USD','','CHF', '','DKN']
df = pd.DataFrame({'Name':Name, 'Account' : Account, 'Ccy' : Ccy})
Name Account Ccy
0 Dan 100 EUR
1 Mike 30 EUR
2 Dan 50 USD
3 Dan 200 USD
4 Sara 90
5 Charles 20 CHF
6 Mike 65
7 Karl 230 DKN
I would like to represent this data differently. I would like to write a script that finds all the duplicates in the "Name" column and regroups them with the different accounts, and if there is a currency "Ccy", it adds a new column next to it with all the associated currencies.
So something like that :
Dan Ccy1 Mike Ccy2 Sara Charles Ccy3 Karl Ccy4
0 100 EUR 30 EUR 90 20 CHF 230 DKN
1 50 USD 65
2 200 USD
I don't really know how to start! So I simplified the problem to do it step by step. I tried to regroup the duplicates by name with a list, but it did not identify the duplicates.
x_len, y_len = df.shape
new_data = []
for i in range(x_len):
    if df.iloc[i,0] not in new_data:
        print(str(df.iloc[i,0]) + '\t' + str(df.iloc[i,1]) + '\t' + str(bool(df.iloc[i,0] not in new_data)))
        new_data.append([df.iloc[i,0], df.iloc[i,1]])
    else:
        new_data[str(df.iloc[i,0])].append(df.iloc[i,1])
Then I thought that it would be easier to use a dictionary. So I tried this loop, but there is an error, and maybe it is not the best way to get to the expected final result:
from collections import defaultdict

dico = defaultdict(list)
x_len, y_len = df.shape
for i in range(x_len):
    if df.iloc[i,0] not in dico:
        print(str(df.iloc[i,0]) + '\t' + str(df.iloc[i,1]) + '\t' + str(bool(df.iloc[i,0] not in dico)))
        dico[str(df.iloc[i,0])] = df.iloc[i,1]
        print(dico)
    else:
        dico[df.iloc[i,0]].append(df.iloc[i,1])
Does anyone have an idea how to start, or how to write the code if it is simple?
Thank you
Use GroupBy.cumcount as a counter, reshape with DataFrame.set_index and DataFrame.unstack, and finally flatten the column names:
g = df.groupby(['Name']).cumcount()
df = df.set_index([g,'Name']).unstack().sort_index(level=1, axis=1)
df.columns = df.columns.map(lambda x: f'{x[0]}_{x[1]}')
print (df)
Account_Charles Ccy_Charles Account_Dan Ccy_Dan Account_Karl Ccy_Karl \
0 20 CHF 100 EUR 230 DKN
1 NaN NaN 50 USD NaN NaN
2 NaN NaN 200 USD NaN NaN
Account_Mike Ccy_Mike Account_Sara Ccy_Sara
0 30 EUR 90
1 65 NaN NaN
2 NaN NaN NaN NaN
If you need custom column names, use if-else in a list comprehension:
g = df.groupby(['Name']).cumcount()
df = df.set_index([g,'Name']).unstack().sort_index(level=1, axis=1)
L = [b if a == 'Account' else f'{a}{i // 2}' for i, (a, b) in enumerate(df.columns)]
df.columns = L
print (df)
Charles Ccy0 Dan Ccy1 Karl Ccy2 Mike Ccy3 Sara Ccy4
0 20 CHF 100 EUR 230 DKN 30 EUR 90
1 NaN NaN 50 USD NaN NaN 65 NaN NaN
2 NaN NaN 200 USD NaN NaN NaN NaN NaN NaN
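For reference, the cumcount counter simply numbers each repeated name from zero, which is what makes the (counter, Name) index unique before unstacking:
# what the counter looks like for the sample data above
g = df.groupby(['Name']).cumcount()
print(g.tolist())
# [0, 0, 1, 2, 0, 0, 1, 0] -> Dan appears three times, so it gets 0, 1, 2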

How can I convert a row of years that also contains NaNs into integers?

This is the head of my dataframe (immigration):
nan 1850.0 1851.0 1852.0 1853.0 1854.0 1855.0 1856.0 1857.0 1858.0 ... 2008.0 2009.0 2010.0 2011.0 2012.0 2013.0 2014.0 2015.0 2016.0 2017.0
0 NaN 1850.000000 1851.000000 1852.000000 1853.000000 1854.000000 1855.000000 1856.000000 1857.000000 1858.000000 ... 2008.000000 2009.000000 2010.000000 2011.000000 2012.000000 2013.000000 2014.000000 2015.000000 2016.000000 2017.000000
1 California 0.235450 0.282475 0.311489 0.331177 0.345413 0.356185 0.364622 0.371407 0.376984 ... 0.268349 0.269110 0.271770 0.270484 0.270779 0.268994 0.270921 0.273046 0.272042 0.269457
2 New York 0.211768 0.217419 0.222798 0.227924 0.232815 0.237486 0.241952 0.246226 0.250320 ... 0.212731 0.213811 0.221615 0.221817 0.226076 0.223056 0.226143 0.228841 0.229732 0.228741
3 New Jersey 0.122454 0.130429 0.137851 0.144774 0.151249 0.157317 0.163015 0.168377 0.173430 ... 0.199191 0.202058 0.209573 0.214619 0.212452 0.216395 0.219366 0.220733 0.225400 0.228197
What I would like to do is:
1) Get rid of that "nan" at the beginning, and replace it with the word "Country"
2) I would like to get rid of the decimal points in the numbers in the header since those are years
I tried:
immigration.columns = pd.to_numeric(immigration.iloc[0], downcast='integer', errors='coerce')
Also:
immigration.iloc[0].astype(int)
Neither of those worked.
Disclaimer
I personally would advise fixing this data at the source, so that once you read it into a DataFrame, you don't have to deal with this type of data cleaning. If that is not an option, you can use this approach.
First, replace the NaN with your Country header:
df.columns = df.iloc[0].fillna('Country').astype(str).values
Country 1850.0 1851.0 1852.0 1853.0 1854.0 1855.0
0 NaN 1850.000000 1851.000000 1852.000000 1853.000000 1854.000000 1855.000000
1 California 0.235450 0.282475 0.311489 0.331177 0.345413 0.356185
2 New York 0.211768 0.217419 0.222798 0.227924 0.232815 0.237486
3 New Jersey 0.122454 0.130429 0.137851 0.144774 0.151249 0.157317
Now use a regular expression to rename your columns, and slice your DataFrame:
import re

df.rename(columns=lambda x: re.sub(r'\.\d+', '', x)).iloc[1:]
Country 1850 1851 1852 1853 1854 1855
1 California 0.235450 0.282475 0.311489 0.331177 0.345413 0.356185
2 New York 0.211768 0.217419 0.222798 0.227924 0.232815 0.237486
3 New Jersey 0.122454 0.130429 0.137851 0.144774 0.151249 0.157317
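Alternatively (a sketch, not part of the original answer), you can skip the regex and cast the float-like year headers straight to integers after the fillna step:
# sketch: keep 'Country', cast the remaining year headers to int, drop the year row
df.columns = ['Country'] + [int(float(c)) for c in df.columns[1:]]
df = df.iloc[1:]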

Pandas merge error TypeError: '>' not supported between instances of 'int' and 'str'

I have a dataset with several tables, each in the form of countries, years, and some indicators. I have converted all the excel tables to csv files, then merged them into one table.
The problem is that I have some tables that refuse to be merged, and the following message appears: TypeError: '>' not supported between instances of 'int' and 'str'
I tried everything I could, but no luck; the same error still appears!
Also, I tried with hundreds of different files, and there are still dozens of files that hit this problem.
Here is the code I used with the sample files file17.csv and file35.csv (in case someone needs to reproduce it):
# To load the first file
import pandas as pd
filename1 = 'file17.csv'
df1 = pd.read_csv(filename1, encoding='cp1252', low_memory=False)
df1.set_index(['Country', 'Year'], inplace=True)
df1.dropna(axis=0, how='all', inplace=True)
df1.head()
Out>>>
+-------------+------+--------+--------+
| | | ind500 | ind356 |
| Country | Year | | |
| Afghanistan | 1800 | 603.0 | NaN |
| | 1801 | 603.0 | NaN |
| | 1802 | 603.0 | NaN |
| | 1803 | 603.0 | NaN |
| | 1804 | 603.0 | NaN |
+-------------+------+--------+--------+
In>>>
# To load the second file
filename2 = 'file35.csv'
df2 = pd.read_csv(filename2, encoding='cp1252', low_memory=False)
df2.set_index(['Country', 'Year'], inplace=True)
df2.dropna(axis=0, how='all', inplace=True)
df2.head()
Out>>>
# To merge the two dataframes
gross_df = pd.merge(df1, df2, left_index=True, right_index=True, how='outer')
gross_df.dropna(axis=0, how='all', inplace=True)
print (gross_df.shape)
gross_df.to_csv('merged.csv')
Important notice:
I noticed that in all the successful files, the column names appear in ascending order, i.e. ind001, ind009, ind012, as if they were sorted automatically, while the files with errors have one or more columns out of order, like ind500 followed by ind356 in the first table; the same applies to the second sample provided.
Notice that the two dataframes have two indices (Country and Year).
The error
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\pandas\core\algorithms.py in safe_sort(values, labels, na_sentinel, assume_unique)
480 try:
--> 481 sorter = values.argsort()
482 ordered = values.take(sorter)
TypeError: '>' not supported between instances of 'int' and 'str'
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
<ipython-input-11-960b2698de60> in <module>()
----> 1 gross_df = pd.merge(df1, df2, left_index=True, right_index=True, how='outer', sort=False)
2 gross_df.dropna(axis=0, how='all', inplace=True)
3 print (gross_df.shape)
4 gross_df.to_csv('merged.csv')
C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\pandas\core\reshape\merge.py in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator)
52 right_index=right_index, sort=sort, suffixes=suffixes,
53 copy=copy, indicator=indicator)
---> 54 return op.get_result()
55
56
C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\pandas\core\reshape\merge.py in get_result(self)
567 self.left, self.right)
568
--> 569 join_index, left_indexer, right_indexer = self._get_join_info()
570
571 ldata, rdata = self.left._data, self.right._data
C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\pandas\core\reshape\merge.py in _get_join_info(self)
720 join_index, left_indexer, right_indexer = \
721 left_ax.join(right_ax, how=self.how, return_indexers=True,
--> 722 sort=self.sort)
723 elif self.right_index and self.how == 'left':
724 join_index, left_indexer, right_indexer = \
C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\pandas\core\indexes\base.py in join(self, other, how, level, return_indexers, sort)
2995 else:
2996 return self._join_non_unique(other, how=how,
-> 2997 return_indexers=return_indexers)
2998 elif self.is_monotonic and other.is_monotonic:
2999 try:
C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\pandas\core\indexes\base.py in _join_non_unique(self, other, how, return_indexers)
3076 left_idx, right_idx = _get_join_indexers([self.values],
3077 [other._values], how=how,
-> 3078 sort=True)
3079
3080 left_idx = _ensure_platform_int(left_idx)
C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\pandas\core\reshape\merge.py in _get_join_indexers(left_keys, right_keys, sort, how, **kwargs)
980
981 # get left & right join labels and num. of levels at each location
--> 982 llab, rlab, shape = map(list, zip(* map(fkeys, left_keys, right_keys)))
983
984 # get flat i8 keys from label lists
C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\pandas\core\reshape\merge.py in _factorize_keys(lk, rk, sort)
1409 if sort:
1410 uniques = rizer.uniques.to_array()
-> 1411 llab, rlab = _sort_labels(uniques, llab, rlab)
1412
1413 # NA group
C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\pandas\core\reshape\merge.py in _sort_labels(uniques, left, right)
1435 labels = np.concatenate([left, right])
1436
-> 1437 _, new_labels = algos.safe_sort(uniques, labels, na_sentinel=-1)
1438 new_labels = _ensure_int64(new_labels)
1439 new_left, new_right = new_labels[:l], new_labels[l:]
C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\pandas\core\algorithms.py in safe_sort(values, labels, na_sentinel, assume_unique)
483 except TypeError:
484 # try this anyway
--> 485 ordered = sort_mixed(values)
486
487 # labels:
C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\pandas\core\algorithms.py in sort_mixed(values)
469 str_pos = np.array([isinstance(x, string_types) for x in values],
470 dtype=bool)
--> 471 nums = np.sort(values[~str_pos])
472 strs = np.sort(values[str_pos])
473 return _ensure_object(np.concatenate([nums, strs]))
C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\numpy\core\fromnumeric.py in sort(a, axis, kind, order)
820 else:
821 a = asanyarray(a).copy(order="K")
--> 822 a.sort(axis=axis, kind=kind, order=order)
823 return a
824
TypeError: '>' not supported between instances of 'int' and 'str'
This error indicates that the indices in the merged DataFrames have different dtypes.
Demo - how to convert string index level to int:
In [183]: df
Out[183]:
0 1 2 3
bar 1 -0.205037 0.762509 0.816608 -1.057907
2 1.249104 0.338777 -0.982084 0.329330
baz 1 0.845695 -0.996365 0.548100 -0.113733
2 1.247092 -2.674061 -0.071993 -0.734242
foo 1 -1.233825 -0.195377 -0.240303 1.168055
2 -0.108942 -0.615612 -1.299512 0.908641
qux 1 0.844421 0.251425 -0.506877 1.307800
2 0.038580 0.045072 -0.262974 0.629804
In [184]: df.index
Out[184]:
MultiIndex(levels=[['bar', 'baz', 'foo', 'qux'], ['1', '2']],
labels=[[0, 0, 1, 1, 2, 2, 3, 3], [0, 1, 0, 1, 0, 1, 0, 1]])
In [185]: df.index.get_level_values(1)
Out[185]: Index(['1', '2', '1', '2', '1', '2', '1', '2'], dtype='object')
In [187]: df.index = df.index.set_levels(df.index.get_level_values(1) \
.map(lambda x: pd.to_numeric(x, errors='coerce')), level=1)
Result:
In [189]: df.index.get_level_values(1)
Out[189]: Int64Index([1, 2, 1, 2, 1, 2, 1, 2], dtype='int64')
UPDATE: try this:
In [247]: d1 = pd.read_csv('https://docs.google.com/uc?id=1jUsbr5pw6sUMvewI4fmbpssroG4RZ7LE&export=download', index_col=[0,1])
In [248]: d2 = pd.read_csv('https://docs.google.com/uc?id=1Ufx6pvnSC6zQdTAj05ObmV027fA4-Mr3&export=download', index_col=[0,1])
In [249]: d2 = d2[pd.to_numeric(d2.index.get_level_values(1), errors='coerce').notna()]
In [250]: d2.index = d2.index.set_levels(d2.index.get_level_values(1).map(lambda x: pd.to_numeric(x, errors='coerce')), level=1)
In [251]: d1.reset_index().merge(d2.reset_index(), on=['Country','Year'], how='outer').set_index(['Country','Year'])
Out[251]:
ind500 ind356 ind475 ind476 ind456
Country Year
Afghanistan 1800 603.0 NaN NaN NaN NaN
1801 603.0 NaN NaN NaN NaN
1802 603.0 NaN NaN NaN NaN
1803 603.0 NaN NaN NaN NaN
1804 603.0 NaN NaN NaN NaN
1805 603.0 NaN NaN NaN NaN
1806 603.0 NaN NaN NaN NaN
1807 603.0 NaN NaN NaN NaN
1808 603.0 NaN NaN NaN NaN
1809 603.0 NaN NaN NaN NaN
... ... ... ... ... ...
Bahamas, The 1967 NaN NaN NaN NaN 18381.131314
Gambia, The 1967 NaN NaN NaN NaN 937.355288
Korea, Dem. Rep. 1967 NaN NaN NaN NaN 1428.689253
Lao PDR 1967 NaN NaN NaN NaN 1412.359955
Netherlands Antilles 1967 NaN NaN NaN NaN 14076.731352
Russian Federation 1967 NaN NaN NaN NaN 11794.726437
Serbia and Montenegro 1967 NaN NaN NaN NaN 2987.080489
Syrian Arab Republic 1967 NaN NaN NaN NaN 2015.913906
Yemen, Rep. 1967 NaN NaN NaN NaN 1075.693355
Bahamas, The 1968 NaN NaN NaN NaN 18712.082830
[46607 rows x 5 columns]
For anyone who stumbles across this in 2021:
The problem here is that the pandas multi-index is not unique in the dataset.
You can solve this either by:
Selecting a unique multi-index
Or, resetting the index and doing a merge on the columns
e.g. pd.merge(d1.reset_index(), d2.reset_index(), on=['Country','Year'], how='outer')
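Before merging, it can also help to confirm that the join levels share a dtype; a quick diagnostic sketch for the two-level Country/Year index from the question:
# compare the dtype of each index level across the two frames before merging
for name in ['Country', 'Year']:
    print(name, df1.index.get_level_values(name).dtype,
          df2.index.get_level_values(name).dtype)
# if 'Year' is int64 in one frame and object in the other, the outer join
# tries to sort mixed int/str keys and raises the TypeError shown above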

pandas rolling_apply TypeError: "int object is not iterable"

I have a function defined and saved in a different script called TechAnalysis.py. This function just outputs a scalar, so I plan to use pd.rolling_apply() to generate a new column in the original dataframe (df).
The function works fine when executed directly, but I have problems when using it with rolling_apply(). This link, Passing arguments to rolling_apply, shows how you should do it, and that is how I think my code is written, but the error "TypeError: int object is not iterable" still appears.
This is the function (located in the script TechAnalysis.py):
def hurst(df, days):
    import pandas as pd
    import numpy as np
    df2 = pd.DataFrame()
    df2 = df[-days:]
    rango = lambda x: x.max() - x.min()
    df2['ret'] = 1 - df.PX_LAST/df.PX_LAST.shift(1)
    df2 = df2.dropna()
    ave = pd.expanding_mean(df2.ret)
    df2['desvdeprom'] = df2.ret - ave
    df2['acum'] = df2['desvdeprom'].cumsum()
    df2['rangorolled'] = pd.expanding_apply(df2.acum, rango)
    df2['datastd'] = pd.expanding_std(df2.ret)
    df2['rango_rangostd'] = np.log(df2.rangorolled/df2.datastd)
    df2['tiempo1'] = np.log(range(1, len(df2.index)+1))
    df2 = df2.dropna()
    model1 = pd.ols(y=df2['rango_rangostd'], x=df2['tiempo1'], intercept=False)
    return model1.beta
and now this is the main script:
import pandas as pd
import numpy as np
import TechAnalysis as ta
df = pd.DataFrame(np.log(np.cumsum(np.random.randn(100000)+1)+1000),columns =['PX_LAST'])
The following works:
print ta.hurst(df,50)
This doesn't work:
df['hurst_roll'] = pd.rolling_apply(df, 15 , ta.hurst, args=(50))
What's wrong in the code?
If you check the type of df within the hurst function, you'll see that rolling_apply passes it as a numpy.array. Also note that args=(50) is just the integer 50, not a tuple; a one-element tuple needs a trailing comma, args=(50,), which is the likely source of the "int object is not iterable" error.
If you create a DataFrame from this numpy.array inside rolling_apply, it works. I also used a longer window because there were only 15 values per array, but you seemed to be planning on using the last 50 days.
def hurst(df, days):
    df = pd.DataFrame(df, columns=['PX_LAST'])
    df2 = pd.DataFrame()
    df2 = df.loc[-days:, :]
    rango = lambda x: x.max() - x.min()
    df2['ret'] = 1 - df.loc[:, 'PX_LAST']/df.loc[:, 'PX_LAST'].shift(1)
    df2 = df2.dropna()
    ave = pd.expanding_mean(df2.ret)
    df2['desvdeprom'] = df2.ret - ave
    df2['acum'] = df2['desvdeprom'].cumsum()
    df2['rangorolled'] = pd.expanding_apply(df2.acum, rango)
    df2['datastd'] = pd.expanding_std(df2.ret)
    df2['rango_rangostd'] = np.log(df2.rangorolled/df2.datastd)
    df2['tiempo1'] = np.log(range(1, len(df2.index)+1))
    df2 = df2.dropna()
    model1 = pd.ols(y=df2['rango_rangostd'], x=df2['tiempo1'], intercept=False)
    return model1.beta

def rol_apply():
    df = pd.DataFrame(np.log(np.cumsum(np.random.randn(1000)+1)+1000), columns=['PX_LAST'])
    df['hurst_roll'] = pd.rolling_apply(df, 100, hurst, args=(50, ))
PX_LAST hurst_roll
0 6.907911 NaN
1 6.907808 NaN
2 6.907520 NaN
3 6.908048 NaN
4 6.907622 NaN
5 6.909895 NaN
6 6.911281 NaN
7 6.911998 NaN
8 6.912245 NaN
9 6.912457 NaN
10 6.913794 NaN
11 6.914294 NaN
12 6.915157 NaN
13 6.916172 NaN
14 6.916838 NaN
15 6.917235 NaN
16 6.918061 NaN
17 6.918717 NaN
18 6.920109 NaN
19 6.919867 NaN
20 6.921309 NaN
21 6.922786 NaN
22 6.924173 NaN
23 6.925523 NaN
24 6.926517 NaN
25 6.928552 NaN
26 6.930198 NaN
27 6.931738 NaN
28 6.931959 NaN
29 6.932111 NaN
.. ... ...
970 7.562284 0.653381
971 7.563388 0.630455
972 7.563499 0.577746
973 7.563686 0.552758
974 7.564105 0.540144
975 7.564428 0.541411
976 7.564351 0.532154
977 7.564408 0.530999
978 7.564681 0.532376
979 7.565192 0.536758
980 7.565359 0.538629
981 7.566112 0.555789
982 7.566678 0.553163
983 7.566364 0.577953
984 7.567587 0.634843
985 7.568583 0.679807
986 7.569268 0.662653
987 7.570018 0.630447
988 7.570375 0.659497
989 7.570704 0.622190
990 7.571009 0.485458
991 7.571886 0.551147
992 7.573148 0.459912
993 7.574134 0.463146
994 7.574478 0.463158
995 7.574671 0.535014
996 7.575177 0.467705
997 7.575374 0.531098
998 7.575620 0.540611
999 7.576727 0.465572
[1000 rows x 2 columns]
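Note for current pandas versions: pd.rolling_apply, the pd.expanding_* functions, and pd.ols have all since been removed. Below is a rough sketch of the equivalent call today, with a hypothetical hurst_np adapter and a plain no-intercept least squares standing in for pd.ols; treat it as a starting point, not a drop-in replacement.
import numpy as np
import pandas as pd

def hurst_np(window):
    # with raw=True the rolling window arrives as a plain numpy array
    s = pd.Series(window)
    ret = (1 - s / s.shift(1)).dropna().reset_index(drop=True)
    dev = (ret - ret.expanding().mean()).cumsum()
    rng = dev.expanding().apply(lambda x: x.max() - x.min(), raw=True)
    rs = np.log((rng / ret.expanding().std()).to_numpy())
    t = np.log(np.arange(1, len(ret) + 1))
    mask = np.isfinite(rs)
    # no-intercept least-squares slope, in place of pd.ols(..., intercept=False)
    return float(np.dot(t[mask], rs[mask]) / np.dot(t[mask], t[mask]))

df = pd.DataFrame(np.log(np.cumsum(np.random.randn(1000) + 1) + 1000), columns=['PX_LAST'])
df['hurst_roll'] = df['PX_LAST'].rolling(100).apply(hurst_np, raw=True)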
