Calculate the five worst months from a given CSV - Python

I tried to find the five worst months in the data, but I'm not sure about the process and I'm quite confused. The answer should look something like (June 2001, July 2002), but when I tried it my result wasn't as expected: only the January data was sorted. The way I tried to solve it is shown below, and the CSV data is also provided.
My solution is given below:
import pandas as pd

PATH = "tourist_arrival.csv"
df = pd.read_csv(PATH)
print(df.sort_values(by=['Jan.','Feb.','Mar.','Apr.','May.','Jun.','Jul.','Aug.','Sep.','Oct.','Nov.','Dec.'],ascending=False))
Year ,Jan.,Feb.,Mar.,Apr.,May.,Jun.,Jul.,Aug.,Sep.,Oct.,Nov.,Dec.,Total
1992, 17451,27489,31505,30682,29089,22469,20942,27338,24839,42647,32341,27561,334353
1993 ,19238,23931,30818,20121,20585,19602,13588,21583,23939,42242,30378,27542,293567
1994, 21735,24872,31586,27292,26232,22907,19739,27610,27959,39393,28008,29198,326531
1995 ,22207,28240,34219,33994,27843,25650,23980,27686,30569,46845,35782,26380,363395
1996 ,27886,29676,39336,36331,29728,26749,22684,29080,32181,47314,37650,34998,393613
1997,25585,32861,43177,35229,33456,26367,26091,35549,31981,56272,40173,35116,421857
1998,28822,37956,41338,41087,35814,29181,27895,36174,39664,62487,47403,35863,463684
1999,29752,38134,46218,40774,42712,31049,27193,38449,44117,66543,48865,37698,491504
2000,25307,38959,44944,43635,28363,26933,24480,34670,43523,59195,52993,40644,463646
2001,30454,38680,46709,39083,28345,13030,18329,25322,31170,41245,30282,18588,361237
2002,17176,20668,28815,21253,19887,17218,16621,21093,23752,35272,28723,24990,275468
2003,21215,24349,27737,25851,22704,20351,22661,27568,28724,45459,38398,33115,338132
2004,30988,35631,44290,33514,26802,19793,24860,33162,25496,43373,36381,31007,385297
2005,25477,20338,29875,23414,25541,22608,23996,36910,36066,51498,41505,38170,375398
2006,28769,25728,36873,21983,22870,26210,25183,33150,33362,49670,44119,36009,383926
2007,33192,39934,54722,40942,35854,31316,35437,44683,45552,70644,52273,42156,526705
2008,36913,46675,58735,38475,30410,24349,25427,40011,41622,66421,52399,38840,500277
2009,29278,40617,49567,43337,30037,31749,30432,44174,42771,72522,54423,41049,509956
2010,33645,49264,63058,45509,32542,33263,38991,54672,54848,79130,67537,50408,602867
2011,42622,56339,67565,59751,46202,46115,42661,71398,63033,96996,83460,60073,736215
2012,52501,66459,89151,69796,50317,53630,49995,71964,66383,86379,83173,63344,803092
2013,47846,67264,88697,65152,52834,54599,54011,68478,66755,99426,75485,57069,797616

Sorting by a list of columns sorts by the first column and only uses the later ones to break ties, which is why only January looked sorted. Instead, melt your DataFrame and then sort_values:
output = (df.melt("Year", df.drop(["Year", "Total"], axis=1).columns, var_name="Month")
            .sort_values("value")
            .reset_index(drop=True))
>>> output
Year Month value
0 2001 Jun. 13030
1 1993 Jul. 13588
2 2002 Jul. 16621
3 2002 Jan. 17176
4 2002 Jun. 17218
.. ... ... ...
259 2012 Oct. 86379
260 2013 Mar. 88697
261 2012 Mar. 89151
262 2011 Oct. 96996
263 2013 Oct. 99426
[264 rows x 3 columns]
For just the 5 worst months, you can do:
>>> output.iloc[:5]
Year Month value
0 2001 Jun. 13030
1 1993 Jul. 13588
2 2002 Jul. 16621
3 2002 Jan. 17176
4 2002 Jun. 17218
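If you only need the k smallest rows, DataFrame.nsmallest avoids sorting the whole frame. A minimal sketch on a toy two-year frame (the numbers are lifted from the real data above, but the frame itself is illustrative):

```python
import pandas as pd

# Toy two-year frame in the question's wide layout (values taken from the
# real data shown above; the frame itself is illustrative)
df = pd.DataFrame({
    "Year": [2001, 2002],
    "Jun.": [13030, 17218],
    "Jul.": [18329, 16621],
    "Total": [31359, 33839],
})

# Melt the month columns into rows, then keep the 3 smallest values
melted = df.melt("Year", ["Jun.", "Jul."], var_name="Month")
worst = melted.nsmallest(3, "value")
print(worst)
```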


Python matplotlib bar chart

I have 2 Python files and 1 CSV file.
I'm trying to accumulate all the visitors from 2000-2009 for each country and select the top 3 countries, which should then show up in a bar chart.
The error I'm getting is:
Traceback (most recent call last):
File "C:/ASP/pythonProjectDA_YODA/main.py", line 3, in <module>
countries=Countries("2000","2009","China","Japan")
File "C:\ASP\pythonProjectDA_YODA\countries.py", line 8, in __init__
dfVisitor.index=pd.to_datetime(dfVisitor.index)
File "C:\Users\65965\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pandas\core\tools\datetimes.py", line 812, in to_datetime
result = convert_listlike(arg, format, name=arg.name)
File "C:\Users\65965\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pandas\core\tools\datetimes.py", line 459, in _convert_listlike_datetimes
result, tz_parsed = objects_to_datetime64ns(
File "C:\Users\65965\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pandas\core\arrays\datetimes.py", line 2044, in objects_to_datetime64ns
result, tz_parsed = tslib.array_to_datetime(
File "pandas\_libs\tslib.pyx", line 352, in pandas._libs.tslib.array_to_datetime
File "pandas\_libs\tslib.pyx", line 579, in pandas._libs.tslib.array_to_datetime
File "pandas\_libs\tslib.pyx", line 718, in pandas._libs.tslib.array_to_datetime_object
File "pandas\_libs\tslib.pyx", line 552, in pandas._libs.tslib.array_to_datetime
TypeError: <class 'tuple'> is not convertible to datetime
I have no idea what this means, as this is my first time learning this.
The main.py file is below:
from countries import Countries
countries=Countries("ListedCountries.csv","2000","2009","China","Japan")
countries.top3()
countries.drawchart()
The other Python file is below as well:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

class Countries:
    def __init__(self,syear,eyear,scountries,ecountries):
        dfVisitor=pd.read_csv("ListedCountries.csv")
        dfVisitor.index=pd.to_datetime(dfVisitor.index)
        dfVisitor.columns=dfVisitor.colums.str.strip()
        dfOther=dfVisitor.loc[syear:eyear, scountries:ecountries]
        dfOtherTotal=dfOther.sum()
        self.dfOtherTotalSorted=dfOtherTotal.sort_values(ascending=False)
        print(self.dfOtherTotalSorted)
    def top3(self):
        value=self.dfOtherTotalSorted.to_dict()
        c=1
        print("Top 3 countries in the region over a span of 10 years")
        for x, y in value.items():
            if c<=3:
                print(c,x,y)
                c+=1  # c=c+1
        if len(value)>0:
            return True
        else:
            return False
    def drawchart(self):
        ps=self.dfOtherTotalSorted
        index=np.arrange(len(ps.index))
        plt.xlabel("No. of Visitors in 10 years visit Singapore",fontsize=10)
        plt.ylabel((1000000, 2000000, 3000000, 4000000), fontsize=15)
        plt.xticks(index,ps.index,fontsize=15, rotation=90)
        plt.title("Total visitors from 2000 to 2009 in Singapore")
        plt.bar(ps.index,ps.values)
        plt.show()
Lastly, the CSV file:
Japan,HongKong,China,Taiwan,Korea
2000 Jan,72,131,16,288,38,887,19,329,32,621
2000 Feb,71,245,28,265,45,148,30,528,28,932
2000 Mar,91,844,21,513,30,644,22,934,30,682
2000 Apr,60,540,29,308,36,511,27,737,27,237
2000 May,62,152,20,822,37,934,23,635,30,739
2000 Jun,67,977,22,011,30,706,26,582,25,318
2000 Jul,84,634,30,218,36,148,35,570,32,960
2000 Aug,101,785,31,963,41,162,30,732,34,877
2000 Sep,89,417,20,566,31,239,19,824,23,207
2000 Oct,73,383,21,512,35,195,17,685,28,185
2000 Nov,80,889,21,326,32,999,17,034,31,169
2000 Dec,73,898,22,183,37,762,19,314,28,426
2001 Jan,65,381,27,778,56,460,20,418,32,727
2001 Feb,72,335,18,442,36,157,20,078,32,777
2001 Mar,85,655,27,025,30,320,16,438,32,441
2001 Apr,58,348,25,816,37,542,19,756,30,150
2001 May,58,984,19,806,41,999,16,381,28,842
2001 Jun,64,582,23,752,31,882,19,445,26,914
2001 Jul,76,373,24,929,45,570,25,185,34,830
2001 Aug,92,508,28,515,51,208,21,981,35,899
2001 Sep,69,850,20,024,34,386,15,218,23,526
2001 Oct,35,970,19,363,42,586,14,259,24,125
2001 Nov,32,294,18,583,41,208,15,219,29,452
2001 Dec,43,483,22,124,48,080,17,709,27,400
2002 Jan,47,447,16,630,50,303,18,995,38,613
2002 Feb,49,583,26,760,81,649,21,463,30,745
2002 Mar,68,549,24,043,42,728,16,038,38,393
2002 Apr,49,149,21,771,63,880,17,554,32,704
2002 May,50,563,23,490,56,486,16,570,27,807
2002 Jun,54,892,22,965,41,186,17,251,27,519
2002 Jul,66,566,26,488,51,147,25,238,32,353
2002 Aug,85,655,26,513,62,699,22,147,39,236
2002 Sep,77,884,18,914,47,217,13,553,21,472
2002 Oct,58,489,21,025,57,693,15,730,28,827
2002 Nov,54,294,17,425,56,422,11,981,28,758
2002 Dec,60,338,19,941,58,683,12,796,24,591
2003 Jan,53,131,17,336,62,454,15,826,34,976
2003 Feb,50,469,24,563,89,704,17,940,32,707
2003 Mar,54,497,16,460,54,063,11,498,25,186
2003 Apr,12,501,4,808,23,002,2,531,2,890
2003 May,7,056,5,510,3,994,1,283,2,552
2003 Jun,14,051,16,426,8,405,5,412,8,477
2003 Jul,28,636,29,541,20,989,18,298,25,714
2003 Aug,43,016,34,391,52,847,19,466,30,591
2003 Sep,47,623,17,839,57,716,13,190,20,942
2003 Oct,38,418,19,234,56,700,14,982,24,175
2003 Nov,37,630,18,368,67,541,12,271,29,059
2003 Dec,47,021,21,778,71,068,12,233,24,125
2004 Jan,39,191,22,763,79,717,17,014,30,255
2004 Feb,43,760,17,189,50,903,13,918,29,835
2004 Mar,53,022,18,564,53,481,13,060,25,853
2004 Apr,38,801,24,158,75,068,13,484,26,713
2004 May,43,714,23,922,70,021,13,963,31,482
2004 Jun,44,112,21,679,63,014,15,181,29,912
2004 Jul,56,066,27,380,92,649,21,955,35,568
2004 Aug,66,617,30,887,90,212,19,708,38,602
2004 Sep,62,264,19,562,62,134,13,542,25,956
2004 Oct,51,340,21,884,70,449,13,840,26,936
2004 Nov,48,066,19,317,88,223,12,747,31,623
2004 Dec,51,858,24,381,84,369,14,030,28,344
2005 Jan,48,004,17,457,45,801,16,774,20,386
2005 Feb,40,310,28,713,61,601,19,104,24,531
2005 Mar,52,225,31,089,52,249,15,669,23,476
2005 Apr,41,599,23,614,68,775,16,345,28,923
2005 May,43,968,25,187,62,872,16,019,28,927
2005 Jun,43,020,23,843,61,150,16,710,32,366
2005 Jul,49,791,35,295,93,889,27,702,42,961
2005 Aug,61,522,38,649,101,134,22,950,42,791
2005 Sep,57,085,23,649,67,061,15,670,25,572
2005 Oct,49,532,22,996,74,501,17,754,30,060
2005 Nov,50,402,20,552,88,704,14,094,31,277
2005 Dec,50,994,22,770,79,945,15,123,32,803
2006 Jan,45,402,23,587,81,734,19,898,40,604
2006 Feb,44,695,22,743,96,562,17,723,40,835
2006 Mar,62,353,21,726,91,092,16,227,36,144
2006 Apr,41,269,28,836,97,423,17,657,31,780
2006 May,42,907,24,008,78,594,15,410,34,236
2006 Jun,43,153,23,998,71,213,17,393,36,327
2006 Jul,52,407,28,265,113,127,27,109,45,685
2006 Aug,62,970,30,672,103,459,23,438,44,846
2006 Sep,51,284,20,463,63,550,15,350,29,315
2006 Oct,47,552,21,801,70,690,17,087,35,025
2006 Nov,52,047,22,845,88,343,15,953,43,791
2006 Dec,48,367,22,530,81,414,16,218,36,134
2007 Jan,49,959,19,559,76,116,17,156,46,756
2007 Feb,46,920,26,025,111,934,23,307,31,464
2007 Mar,58,843,22,361,79,239,16,091,42,071
2007 Apr,37,962,29,338,99,136,15,343,32,219
2007 May,38,813,25,261,85,198,14,952,34,408
2007 Jun,41,289,25,551,77,239,16,868,38,027
2007 Jul,49,234,31,990,108,881,24,849,46,123
2007 Aug,58,288,32,177,114,463,21,028,45,910
2007 Sep,54,186,22,902,76,181,16,276,30,265
2007 Oct,51,825,23,224,83,831,14,426,33,383
2007 Nov,53,784,22,638,103,906,13,742,43,965
2007 Dec,53,411,21,084,97,832,14,118,39,701
2008 Jan,52,973,19,817,108,486,16,342,50,432
2008 Feb,47,449,27,263,121,031,17,829,40,998
2008 Mar,57,364,27,600,98,180,13,778,39,683
2008 Apr,36,301,20,232,107,639,13,944,33,946
2008 May,42,382,22,867,86,785,14,276,36,412
2008 Jun,40,879,23,055,70,565,13,146,35,998
2008 Jul,47,659,28,218,105,528,19,398,39,614
2008 Aug,53,699,26,847,91,325,16,923,42,338
2008 Sep,48,771,20,765,66,582,12,303,25,914
2008 Oct,47,736,20,016,76,836,13,503,30,930
2008 Nov,48,225,19,197,79,096,12,535,24,445
2008 Dec,47,602,22,238,66,689,11,947,22,308
2009 Jan,38,382,23,399,105,144,15,986,25,516
2009 Feb,42,807,19,720,80,037,12,744,27,387
2009 Mar,46,797,21,290,91,275,12,542,20,759
2009 Apr,31,633,28,587,86,525,11,970,21,323
2009 May,29,800,21,529,52,058,11,602,21,854
2009 Jun,28,060,21,703,41,650,11,486,20,991
2009 Jul,46,633,33,382,72,326,17,351,30,389
2009 Aug,50,698,36,218,86,530,17,490,32,332
2009 Sep,52,561,21,509,59,588,10,443,15,714
2009 Oct,43,247,25,512,83,273,13,208,16,069
2009 Nov,38,949,20,525,90,358,11,900,19,620
2009 Dec,40,420,21,046,87,983,10,039,20,033
112,551,37,334,126,870,29,368,52,654
This code works once you eliminate the embedded commas in the CSV file. One other problem: you were trying to select "China":"Japan", but your columns are not in that order; it needs to be "Japan":"China". You also had several spelling errors (colums, arrange).
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

class Countries(object):
    def __init__(self,filename,syear,eyear,scountries,ecountries):
        dfVisitor=pd.read_csv(filename)
        dfVisitor = dfVisitor.set_index('Date')
        dfVisitor.index=pd.to_datetime(dfVisitor.index)
        dfOther=dfVisitor.loc[syear:eyear, scountries:ecountries]
        dfOtherTotal=dfOther.sum()
        self.dfOtherTotalSorted=dfOtherTotal.sort_values(ascending=False)
        print(self.dfOtherTotalSorted)
    def top3(self):
        value=self.dfOtherTotalSorted.to_dict()
        print("Top 3 countries in the region over a span of 10 years")
        for x, y in list(value.items())[:3]:
            print(x,y)
    def drawchart(self):
        ps=self.dfOtherTotalSorted
        index=np.arange(len(ps.index))
        plt.xlabel("No. of Visitors in 10 years visit Singapore",fontsize=10)
        plt.ylabel((1000000, 2000000, 3000000, 4000000), fontsize=15)
        plt.xticks(index,ps.index,fontsize=15, rotation=90)
        plt.title("Total visitors from 2000 to 2009 in Singapore")
        plt.bar(ps.index,ps.values)
        plt.show()
countries=Countries("ListedCountries.csv",pd.to_datetime("2000-01-01"),pd.to_datetime("2009-12-31"),"Japan","China")
countries.top3()
countries.drawchart()
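Eliminating the embedded commas can be done in a short pre-processing pass. A sketch, under these assumptions: each data row has a date plus 10 numeric fields (5 values each split in two by a thousands comma), and the header gains a hypothetical "Date" column name so that set_index('Date') has something to find:

```python
import csv
import io

# Two rows of the CSV above, with an assumed "Date" header column added
raw = """Date,Japan,HongKong,China,Taiwan,Korea
2000 Jan,72,131,16,288,38,887,19,329,32,621
2000 Feb,71,245,28,265,45,148,30,528,28,932
"""

cleaned = []
for row in csv.reader(io.StringIO(raw)):
    if len(row) == 6:  # header (or an already-clean row): keep as-is
        cleaned.append(row)
    else:              # date + 10 half-numbers: glue adjacent pairs back together
        date, rest = row[0], row[1:]
        values = [int(rest[i] + rest[i + 1]) for i in range(0, len(rest), 2)]
        cleaned.append([date] + values)
```

The cleaned rows can then be written back out, after which pd.read_csv parses one value per country as intended.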

Adding columns and an index to summed values using Pandas in Python

I have a .csv file; after reading it with Pandas I have this output:
Year Month Brunei Darussalam ... Thailand Viet Nam Myanmar
348 2007 Jan 3813 ... 25863 12555 4887
349 2007 Feb 3471 ... 22575 11969 3749
350 2007 Mar 4547 ... 33087 14060 5480
351 2007 Apr 3265 ... 34500 15553 6838
352 2007 May 3641 ... 30555 14995 5295
.. ... ... ... ... ... ... ...
474 2017 Jul 5625 ... 48620 71153 12619
475 2017 Aug 4610 ... 40993 51866 10934
476 2017 Sep 5387 ... 39692 40270 9888
477 2017 Oct 4202 ... 61448 39013 11616
478 2017 Nov 5258 ... 39304 36964 11402
I use this to get the sum for each country across all the years, to display the top 3:
top3_country = new_df.iloc[0:, 2:9].sum(axis=0).sort_values(ascending=False).nlargest(3)
though my output is this
Indonesia 27572424
Malaysia 11337420
Philippines 6548622
I want to add column names and an index to the summed values, as if it were a new dataframe, like this:
Countries Visitors
0 Indonesia 27572424
1 Malaysia 11337420
2 Philippines 6548622
Sorry, I am just starting to learn Pandas; any help will be gladly appreciated.
Use Series.reset_index to get a 2-column DataFrame, then set the new column names from a list:
top3_country = top3_country.reset_index()
top3_country.columns = ['Countries', 'Visitors']
Or use Series.rename_axis with Series.reset_index:
top3_country = top3_country.rename_axis('Countries').reset_index(name='Visitors')
You can convert back to a pd.DataFrame, using reset_index and rename. Change your code to:
import pandas as pd
top3_country = (pd.DataFrame(df.iloc[0:, 2:9].sum(axis=0).sort_values(ascending=False).nlargest(3))
                .reset_index()
                .rename(columns={'index': 'Countries', 0: 'visitors'}))
top3_country
Countries visitors
0 Indonesia 27572424
1 Malaysia 11337420
2 Philippines 6548622
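As a self-contained check of the rename_axis variant, on a toy Series standing in for the real sums:

```python
import pandas as pd

# Toy Series standing in for the question's summed visitor counts
top3_country = pd.Series(
    {"Indonesia": 27572424, "Malaysia": 11337420, "Philippines": 6548622}
)

# Name the index, then move it into a column alongside the named values
result = top3_country.rename_axis("Countries").reset_index(name="Visitors")
print(result)
```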

How do I plot a graph for a specific subset of a dataframe

I've been trying to plot a graph for a specific subset of my data. The data covers the years 1960 to 2018, but I'm only interested in plotting my histogram using one specific column and only the rows from 1981 onwards.
So far I've tried plotting using 2 variables
x = df1y.index
which returns the values:
Int64Index([1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991,
1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013,
2014, 2015, 2016, 2017],
dtype='int64', name=' Time ')
and
y = df1.iloc[21:58, [15]] ## 21 to 58 are rows for 1981 to 2017 while column 15 refers to the column I've been trying to get
y returns:
Resident Live-births(Number)
Time
1981 41,000
1982 41,300
1983 39,300
1984 40,200
1985 41,100
1986 37,159
1987 42,362
1988 51,537
1989 46,361
1990 49,787
1991 47,805
1992 47,907
1993 48,739
1994 48,075
1995 46,916
1996 46,707
1997 45,356
1998 41,636
1999 41,327
2000 44,765
2001 39,281
2002 38,555
2003 35,474
2004 35,135
2005 35,528
2006 36,272
2007 37,074
2008 37,277
2009 36,925
2010 35,129
2011 36,178
2012 38,641
2013 35,681
2014 37,967
2015 37,861
2016 36,875
2017 35,444
After keying in
x = df1y.index
y = df1.iloc[21:58, [15]]
plt.plot(x, y, 'o-')
I've received an error:
TypeError: unhashable type: 'numpy.ndarray'
Use
y = df1.iloc[21:58, 15].values
to do it the way you originally tried.
However, you normally don't want to compute the subset indices by hand, so consider something like
y = df1.loc[df1.index >= 1981, 'name_of_your_column_15_here'].values
to get the numpy array of the (y-)values you want. (Note >= rather than >, since you want 1981 included.)
And for some more convenience, just try applying .plot() directly to the Series (it also works with whole DataFrames) and see what happens:
idx_slice = df1.index >= 1981
df1.loc[idx_slice, 'name_of_your_column_15_here'].plot()
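To make the label-based selection concrete, a minimal sketch on a toy frame (the column name is taken from the question's output; the values are made up):

```python
import pandas as pd

# Toy frame standing in for df1: a yearly index and one numeric column
df1 = pd.DataFrame(
    {"Resident Live-births(Number)": [42000, 41000, 41300, 39300]},
    index=[1980, 1981, 1982, 1983],
)
df1.index.name = "Time"

# Label-based selection instead of hand-counted iloc positions
subset = df1.loc[df1.index >= 1981, "Resident Live-births(Number)"]
y = subset.values  # a plain 1-D numpy array, safe to pass to plt.plot
```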

Summarize values in panda data frames

I want to calculate the maximum value for each year and show the sector together with that value. For example, from the screenshot, I would like to display:
2010: Telecom 781
2011: Tech 973
I have tried using:
df.groupby(['Year', 'Sector'])['Revenue'].max()
but this does not give me the name of the Sector that has the highest value.
Try using idxmax and loc:
df.loc[df.groupby(['Sector','Year'])['Revenue'].idxmax()]
MVCE:
import pandas as pd
import numpy as np

np.random.seed(123)
df = pd.DataFrame({'Sector': ['Telecom','Tech','Financial Service','Construction','Heath Care']*3,
                   'Year': [2010,2011,2012,2013,2014]*3,
                   'Revenue': np.random.randint(101,999,15)})
df.loc[df.groupby(['Sector','Year'])['Revenue'].idxmax()]
Output:
Sector Year Revenue
3 Construction 2013 423
12 Financial Service 2012 838
9 Heath Care 2014 224
1 Tech 2011 466
5 Telecom 2010 843
Also .sort_values + .tail, grouping on just Year. Data from @Scott Boston:
df.sort_values('Revenue').groupby('Year').tail(1)
Output:
Sector Year Revenue
9 Heath Care 2014 224
3 Construction 2013 423
1 Tech 2011 466
12 Financial Service 2012 838
5 Telecom 2010 843
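Since the question asks for one row per year, grouping on Year alone also works with idxmax; a sketch using the question's example numbers:

```python
import pandas as pd

# Toy frame: two sectors observed in two years (numbers from the question)
df = pd.DataFrame({
    "Sector": ["Telecom", "Tech", "Telecom", "Tech"],
    "Year": [2010, 2010, 2011, 2011],
    "Revenue": [781, 600, 500, 973],
})

# Grouping on Year alone gives exactly one row per year: the row that
# holds that year's maximum Revenue
per_year_max = df.loc[df.groupby("Year")["Revenue"].idxmax()]
print(per_year_max)
```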

Transpose subset of pandas dataframe into multi-indexed data frame

I have the following dataframe:
df.head(14)
I'd like to transpose just the yr and the ['WA_','BA_','IA_','AA_','NA_','TOM_']
variables by Label. The resulting dataframe should then be a multi-indexed frame with Label and WA_, BA_, etc. as the row index, and the column names will be 2010, 2011, etc. I've tried
transpose(), groupby(), pivot_table(), wide_to_long(),
and before I roll my own nested loop going line by line through this df, I thought I'd ping the community. Something like this for every Label group:
I feel like the answer is in one of those functions but I'm just missing it. Thanks for your help!
From what I can tell from your illustrated screenshots, you want WA_, BA_, etc. as rows and yr as columns, with Label remaining as a row index. If so, consider stack() and unstack():
# sample data
import numpy as np
import pandas as pd

labels = ["Albany County","Big Horn County"]
n_per_label = 7
n_rows = n_per_label * len(labels)
years = np.arange(2010, 2017)
min_val = 10000
max_val = 40000
data = {"Label": sorted(np.array(labels * n_per_label)),
        "WA_": np.random.randint(min_val, max_val, n_rows),
        "BA_": np.random.randint(min_val, max_val, n_rows),
        "IA_": np.random.randint(min_val, max_val, n_rows),
        "AA_": np.random.randint(min_val, max_val, n_rows),
        "NA_": np.random.randint(min_val, max_val, n_rows),
        "TOM_": np.random.randint(min_val, max_val, n_rows),
        "yr": np.append(years, years)}
df = pd.DataFrame(data)
AA_ BA_ IA_ NA_ TOM_ WA_ Label yr
0 27757 23138 10476 20047 34015 12457 Albany County 2010
1 37135 30525 12296 22809 27235 29045 Albany County 2011
2 11017 16448 17955 33310 11956 19070 Albany County 2012
3 24406 21758 15538 32746 38139 39553 Albany County 2013
4 29874 33105 23106 30216 30176 13380 Albany County 2014
5 24409 27454 14510 34497 10326 29278 Albany County 2015
6 31787 11301 39259 12081 31513 13820 Albany County 2016
7 17119 20961 21526 37450 14937 11516 Big Horn County 2010
8 13663 33901 12420 27700 30409 26235 Big Horn County 2011
9 37861 39864 29512 24270 15853 29813 Big Horn County 2012
10 29095 27760 12304 29987 31481 39632 Big Horn County 2013
11 26966 39095 39031 26582 22851 18194 Big Horn County 2014
12 28216 33354 35498 23514 23879 17983 Big Horn County 2015
13 25440 28405 23847 26475 20780 29692 Big Horn County 2016
Now set Label and yr as indices.
df.set_index(["Label","yr"], inplace=True)
From here, unstack() will pivot the inner-most index to columns. Then, stack() can swing our value columns down into rows.
df.unstack().stack(level=0)
yr 2010 2011 2012 2013 2014 2015 2016
Label
Albany County AA_ 27757 37135 11017 24406 29874 24409 31787
BA_ 23138 30525 16448 21758 33105 27454 11301
IA_ 10476 12296 17955 15538 23106 14510 39259
NA_ 20047 22809 33310 32746 30216 34497 12081
TOM_ 34015 27235 11956 38139 30176 10326 31513
WA_ 12457 29045 19070 39553 13380 29278 13820
Big Horn County AA_ 17119 13663 37861 29095 26966 28216 25440
BA_ 20961 33901 39864 27760 39095 33354 28405
IA_ 21526 12420 29512 12304 39031 35498 23847
NA_ 37450 27700 24270 29987 26582 23514 26475
TOM_ 14937 30409 15853 31481 22851 23879 20780
WA_ 11516 26235 29813 39632 18194 17983 29692
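The same reshape on a minimal frame, to make the mechanics easy to follow (labels and numbers are made up):

```python
import pandas as pd

# Minimal version of the reshape: two labels, two years, two value columns
df = pd.DataFrame({
    "Label": ["A", "A", "B", "B"],
    "yr": [2010, 2011, 2010, 2011],
    "WA_": [1, 2, 3, 4],
    "BA_": [5, 6, 7, 8],
}).set_index(["Label", "yr"])

# unstack() pivots yr up into the columns; stack(level=0) then pulls the
# value-column names down into the row index
out = df.unstack().stack(level=0)
```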
