Pandas merge can't merge all columns - python

I am trying to merge two Excel files. My data is:
tabla muestra.xlsx
{'Mandante': {0: 400, 1: 400, 2: 400, 3: 400, 4: 400}, 'Usuario': {0: 152163681, 1: '162181297', 2: '144912861', 3: '140752630', 4: '167300316'}, 'Funcion': {0: 'COMPRADOR', 1: 'JEFE DE COMPRAS', 2: 'COMPRADOR', 3: 'COMPRADOR', 4: 'JEFE DE COMPRAS'}, 'Tipo usuario contractual': {0: 'SAP Application Professional', 1: 'SAP Application Professional', 2: 'SAP Application Professional', 3: 'SAP Application Professional', 4: 'SAP Application Professional'}}
and tabla usuarios roles.xlsx
{'Identificación mdte.': {0: 400, 1: 400, 2: 400, 3: 400, 4: 400}, 'Rol': {0: 'SAP_BC_WEBSERVICE_ADMIN', 1: 'SAP_BC_WEBSERVICE_CONSUMER', 2: 'SAP_BC_WEBSERVICE_SERVICE_USER', 3: 'SAP_J2EE_ADMIN', 4: 'SAP_SDCCN_ALL'}, 'Usuario': {0: 'WEBSERVICE', 1: 'WEBSERVICE', 2: 'WEBSERVICE', 3: 'SM_ADMIN_S4P', 4: 'ADMIN_SONDA'}, 'Fecha de inicio': {0: '01.03.2019', 1: '01.03.2019', 2: '01.03.2019', 3: '16.05.2019', 4: '06.08.2019'}, 'Fecha fin': {0: '31.12.9999', 1: '31.12.9999', 2: '31.12.9999', 3: '31.12.9999', 4: '31.12.9999'}, 'Excluido': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan}, 'Fecha': {0: '01.03.2019', 1: '01.03.2019', 2: '01.03.2019', 3: '16.05.2019', 4: '06.08.2019'}, 'Hora': {0: datetime.time(16, 11, 6), 1: datetime.time(16, 11, 6), 2: datetime.time(16, 11, 6), 3: datetime.time(15, 27, 30), 4: datetime.time(9, 25, 57)}, 'Cronomarcador UTC en forma breve (AAAAMMDDhhmmss)': {0: 0, 1: 0, 2: 0, 3: 0, 4: 0}, 'Org.HR': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan}, 'Asign.proviene de rol compuesto': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan}}
Using the code
# importing the module
import pandas
# reading the files
f1 = pandas.read_excel("~/Desktop/tabla muestra.xlsx")
f2 = pandas.read_excel("~/Desktop/tabla usuarios roles.xlsx")
# merging the files
f3 = f1[["Usuario"]].merge(f2[["Usuario", "Rol"]],
on = "Usuario",
how = "outer")
# creating a new file
f3.to_excel("~/Desktop/Resultstest5.xlsx", index = False)
After the code it returns the following
I checked and the IDs are in both tables. Any clue what is happening?
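One thing worth checking, going only by the sample above: the 'Usuario' column in the first table holds an integer (152163681) next to strings ('162181297'). pandas compares merge keys by value and type, so an ID stored as a number in one sheet and as text in the other will not pair up. The sketch below uses two small made-up frames (the role names ROL_A and ROL_B are placeholders, not taken from the real files) and assumes that normalising both key columns to stripped strings is acceptable for these IDs. The indicator=True argument adds a _merge column that shows which side each row came from, which helps confirm whether the keys really match.

```python
import pandas as pd

# Hypothetical miniature of the two sheets: the same ID is an int on one side
# and a str on the other, as in the 'Usuario' sample from the question.
left = pd.DataFrame({"Usuario": [152163681, "162181297"]})
right = pd.DataFrame({
    "Usuario": ["152163681", "162181297"],
    "Rol": ["ROL_A", "ROL_B"],  # placeholder role names
})

# Normalise the key on both sides to stripped strings before merging.
left["Usuario"] = left["Usuario"].astype(str).str.strip()
right["Usuario"] = right["Usuario"].astype(str).str.strip()

# indicator=True adds a '_merge' column: 'both', 'left_only' or 'right_only'.
merged = left.merge(right, on="Usuario", how="outer", indicator=True)
print(merged)
```

If rows still show up as left_only or right_only after this, the keys differ in some other way (for example, leading zeros or a float suffix such as '.0'), which would need its own clean-up.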

Related

Merge dataframes in pandas with a combination of keys

I have two dataframes that I need to combine based on a key (an 'incident number'). The key, however, is repeated, because the database that will ingest the data requires a particular format for coordinates. How can I join the necessary columns based on a combination of keys?
For example, the two tables look like:
Incident_Number
Lat/Long
GPSCoordinates
AB123
Lat
32.123
AB123
Long
120.123
CD321
Lat
31.321
CD321
Long
121.321
and...
Incident_Number
Lat/Long
GeoCodeCoordinates
AB123
Lat
35.123
AB123
Long
125.123
CD321
Lat
36.321
CD321
Long
126.321
And I need to get to...
IncidentNumber
Lat/Long
GPSCoordinates
GeoCodeCoordinates
AB123
Lat
32.123
35.123
AB123
Long
120.123
125.123
CD321
Lat
31.321
36.321
CD321
Long
121.321
126.321
The number of records is not exactly the same in each table, so the merge needs to allow for NaNs. I am essentially trying to add the column 'GeoCodeCoordinates' to the other dataframe on a combination of 'Incident Number' and 'Lat/Long', so that 'AB123 + Lat' and 'AB123 + Long' are each treated as a single key. Can this be specified within the code, or do I need to create a new column that combines the two values into one key?
I imagine I went about this in a bit of a goofy way. The Lat and Long were originally stored in separate fields and I used .melt() to make the data longer. The database that will ultimately take this in requires the longer format for the Lat/Long field.
GPSColList = list(GPSRecords.columns)
GPSColList.remove('Latitude')
GPSColList.remove('Longitude')
GPSMelt = GPSRecords.melt(id_vars=GPSColList, value_vars=['Latitude', 'Longitude'], var_name='Lat/Long', value_name="GPSCoordinates")
As the two sets of coordinates were in separate fields I created two dataframes with each set of coordinates and melted them separately. My attempt to merge them looks like:
mergeMelt = pd.merge(GPSMelt, GeoCodeMelt[["GeoCodeCoordinates"]], on=['Incident_Number', 'Lat/Long'])
Result is KeyError: 'Incident_Number'
Adding samples as requested:
geocodeMelt:
print(geocodeMelt.head(10).to_dict())
{'OID_': {0: 5211, 1: 5212, 2: 5213, 3: 5214, 4: 5215, 5: 5216, 6: 5217, 7: 5218, 8: 5219, 9: 5220}, 'Unit_Level': {0: 'RRU (Riverside
Unit)', 1: 'RRU (Riverside Unit)', 2: 'RRU (Riverside Unit)', 3: 'RRU (Riverside Unit)', 4: 'RRU (Riverside Unit)', 5: 'RRU (Riverside
Unit)', 6: 'RRU (Riverside Unit)', 7: 'RRU (Riverside Unit)', 8: 'RRU (Riverside Unit)', 9: 'RRU (Riverside Unit)'}, 'Agency_FDID': {0: 33090, 1: 33051, 2: 33054, 3: 33054, 4: 33090, 5: 33070, 6: 33030, 7: 33054, 8: 33090, 9: 33052}, 'Incident_Number': {0: '21CARRU0000198', 1: '21CARRU0000564', 2: '21CARRU0000523', 3: '21CARRU0000624', 4: '21CARRU0000436', 5: '21CARRU0000439', 6: '21CARRU0000496', 7: '21CARRU0000422', 8: '21CARRU0000466', 9: '21CARRU0000016'}, 'Exposure': {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0}, 'CAD_Incident_Type': {0: '71', 1: '67B01O', 2: '71C01', 3: '69D03', 4: '67', 5: '67', 6: '71', 7: '69D06', 8: '71C01', 9: '82B01'}, 'CALFIRS_Incident_Type': {0: 'Passenger vehicle fire', 1: 'Outside rubbish, trash or waste fire', 2: 'Passenger vehicle fire', 3: 'Building fire', 4: 'Outside rubbish, trash or waste fire', 5: 'Outside rubbish, trash or waste fire', 6: 'Passenger vehicle fire', 7: 'Dumpster or other outside trash receptacle fire', 8: 'Passenger vehicle fire', 9: 'Brush or brush-and-grass mixture fire'}, 'Incident_Date': {0: '1/1/2021 0:00:00', 1: '1/1/2021 0:00:00', 2: '1/1/2021 0:00:00', 3: '1/1/2021 0:00:00', 4: '1/1/2021 0:00:00', 5: '1/1/2021 0:00:00', 6: '1/1/2021 0:00:00', 7: '1/1/2021 0:00:00', 8: '1/1/2021 0:00:00', 9: '1/1/2021 0:00:00'}, 'Report_Date_Time': {0: nan, 1: '1/1/2021 20:34:00', 2: '1/1/2021 19:07:00', 3: '1/1/2021 23:33:00', 4: nan, 5: '1/1/2021 16:56:00', 6: '1/1/2021 18:28:00', 7: '1/1/2021 16:16:00', 8: '1/1/2021 17:40:00', 9: '1/1/2021 0:15:00'}, 'Day': {0: '06 - Friday', 1: '06 - Friday', 2: '06 - Friday', 3: '06 - Friday', 4: '06 - Friday', 5: '06 - Friday', 6: '06 - Friday', 7: '06 - Friday', 8: '06 - Friday', 9: '06 - Friday'}, 'Incident_Name': {0: 'HY 91 W/ SERFAS CLUB DR', 1: 'QUAIL PL MENI', 2: 'CAR', 3: 'SUNNY', 4: 'MARTINEZ RD SANJ', 5: 'W METZ RD / ALTURA DR', 6: 'PALM DR / BUENA VISTA AV', 7: 'DELL', 8: 'HY 74 E HEM', 9: 'MADISON ST / AVE 60'}, 'Address': {0: 'HY 91 W Corona CA 92880', 
1: '23880 KENNEDY LN Menifee CA 92587', 2: 'THEODORE ST/EUCALYPTUS AV Moreno Valley CA 92555', 3: '24490 SUNNYMEAD Moreno Valley CA 92553', 4: '40300 MARTINEZ San Jacinto CA 92583', 5: '1388 West METZ Perris CA 92570', 6: 'PALM DR/BUENA VISTA AV Desert hot springs CA 92240', 7: '25361 DELPHINIUM Moreno Valley CA 92553', 8: '43763 HY 74 East Hemet CA 92544', 9: 'MADISON ST/AVE 60 La Quinta CA 92253'}, 'Acres_Burned': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: 0.01}, 'Wildland_Fire_Cause': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: 'UU - Undetermined'}, 'Latitude_D': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7:
nan, 8: nan, 9: nan}, 'Longitude_D': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: nan}, 'Member_Making_Report': {0: 'Muhammad Nassar', 1: 'TODD PHILLIPS', 2: 'DAVID COLOMBO', 3: 'GREGORY MOWAT', 4: 'MICHAEL ESPARZA', 5: 'Benjamin Hall', 6: 'TIMOTHY CABRAL', 7: 'JORGE LOMELI', 8: 'JOSHUA BALBOA', 9: 'SETH SHIVELY'}, 'Battalion': {0: 4.0, 1: 13.0, 2: 9.0, 3: 9.0, 4: 5.0, 5: 1.0, 6: 10.0, 7: 9.0, 8: 5.0, 9: 6.0}, 'Incident_Status': {0: 'Submitted', 1: 'Submitted', 2: 'Submitted', 3: 'Submitted', 4: 'Submitted', 5: 'Submitted', 6: 'Submitted', 7: 'Submitted', 8: 'Submitted', 9: 'Submitted'}, 'DDLat': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: nan}, 'DDLon': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: nan}, 'DiscrepancyDistanceFeet': {0: 4178.0, 1: 107.0, 2: 2388.0, 3: 233159.0, 4: 102.0, 5: 1768.0, 6: 1094.0, 7: 78.0, 8: 35603721.0, 9: 149143.0}, 'DiscrepancyDistanceMiles': {0: 1.0, 1: 0.0, 2: 0.0, 3: 44.0, 4: 0.0, 5: 0.0, 6: 0.0, 7: 0.0, 8: 6743.0, 9: 28.0}, 'DiscrepancyGreaterThan1000ft': {0: 1.0, 1: 2.0, 2: 1.0, 3: 1.0, 4: 2.0, 5: 1.0, 6: 1.0, 7: 2.0, 8: 1.0, 9: 1.0}, 'LocationLegitimate': {0: nan, 1: 1.0, 2: nan, 3: nan, 4: 1.0, 5: nan, 6: nan, 7: 1.0, 8: nan, 9: nan}, 'LocationErrorCategory': {0: nan, 1: 7.0, 2: nan, 3: nan, 4: 7.0,
5: nan, 6: nan, 7: 7.0, 8: nan, 9: nan}, 'LocationErrorComment': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: nan}, 'LocationErrorResolution': {0: nan, 1: 6.0, 2: nan, 3: nan, 4: 6.0, 5: nan, 6: nan, 7: 6.0, 8: nan, 9: nan}, 'LocationErrorResolutionComment': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: nan}, 'CADLatitudeDDM': {0: '33 53.0746416', 1: '33 42.3811205', 2: '33 55.9728055', 3: '33 56.3706594', 4: '33 47.9788195', 5: '33 47.6486387', 6: '33 57.5747994', 7: '33 54.3721212', 8: '33 44.8499992', 9: '33 38.1589793'}, 'CADLongitudeDDM': {0: '-117 38.2368024', 1: '-117 14.5374611', 2: '-117 07.9119009', 3: '-117 14.1319211', 4: '-116 57.4446600', 5: '-117 15.4013420', 6: '-116 30.2784078', 7: '-117 13.2052213', 8: '-116 53.8524596',
9: '-116 15.0473995'}, 'GeocodeSymbology': {0: 2, 1: 2, 2: 2, 3: 2, 4: 2, 5: 2, 6: 2, 7: 2, 8: 2, 9: 2}, 'Lat/Long': {0: 'Latitude', 1: 'Latitude', 2: 'Latitude', 3: 'Latitude', 4: 'Latitude', 5: 'Latitude', 6: 'Latitude', 7: 'Latitude', 8: 'Latitude', 9: 'Latitude'}, 'CAD_Coords': {0: '33 52.924', 1: '33 42.364', 2: '33 56.100', 3: '33 93.991', 4: '33 47.9629', 5: '33 47.390', 6: '33 57.573', 7: '33 54.385', 8: '33 44.859', 9: '33 61.269'}}
and GPSMelt:
print(GPSMelt.head(10).to_dict())
{'OID_': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 6, 6: 7, 7: 8, 8: 9, 9: 10}, 'Unit_Level': {0: 'RRU (Riverside Unit)', 1: 'RRU (Riverside Unit)', 2: 'RRU (Riverside Unit)', 3: 'RRU (Riverside Unit)', 4: 'RRU (Riverside Unit)', 5: 'RRU (Riverside Unit)', 6: 'RRU (Riverside Unit)', 7: 'RRU (Riverside Unit)', 8: 'RRU (Riverside Unit)', 9: 'RRU (Riverside Unit)'}, 'Agency_FDID': {0: 33090, 1: 33054, 2: 33030, 3: 33051, 4: 33054, 5: 33090, 6: 33070, 7: 33054, 8: 33090, 9: 33035}, 'Incident_Number': {0: '21CARRU0000198', 1: '21CARRU0000523', 2: '21CARRU0000496', 3: '21CARRU0000564', 4: '21CARRU0000624', 5: '21CARRU0000436', 6: '21CARRU0000439', 7: '21CARRU0000422', 8: '21CARRU0000466', 9: '21CARRU0000007'}, 'Exposure': {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0}, 'CAD_Incident_Type': {0: '71', 1: '71C01', 2: '71', 3: '67B01O', 4: '69D03', 5: '67', 6: '67', 7: '69D06', 8: '71C01', 9: '82C03'}, 'CALFIRS_Incident_Type': {0: 'Passenger vehicle fire', 1: 'Passenger vehicle fire', 2: 'Passenger vehicle fire', 3: 'Outside rubbish, trash or waste fire', 4: 'Building fire', 5: 'Outside rubbish, trash or waste fire', 6: 'Outside rubbish, trash or waste fire', 7: 'Dumpster or other outside trash receptacle fire', 8: 'Passenger vehicle fire', 9: 'Brush or brush-and-grass mixture fire'}, 'Incident_Date': {0: '1/1/2021 0:00:00', 1: '1/1/2021 0:00:00', 2: '1/1/2021 0:00:00', 3: '1/1/2021 0:00:00', 4: '1/1/2021 0:00:00', 5: '1/1/2021 0:00:00', 6: '1/1/2021 0:00:00', 7: '1/1/2021 0:00:00', 8: '1/1/2021 0:00:00', 9: '1/1/2021 0:00:00'}, 'Report_Date_Time': {0: nan, 1: '1/1/2021 19:07:00', 2: '1/1/2021 18:28:00', 3: '1/1/2021 20:34:00', 4: '1/1/2021 23:33:00', 5: nan, 6: '1/1/2021 16:56:00', 7: '1/1/2021 16:16:00', 8: '1/1/2021 17:40:00', 9: '1/1/2021 0:07:00'}, 'Day': {0: '06 - Friday', 1: '06 - Friday', 2: '06 - Friday', 3: '06 - Friday', 4: '06 - Friday', 5: '06 - Friday', 6: '06 - Friday', 7: '06 - Friday', 8: '06 - Friday', 9: '06 - Friday'}, 'Incident_Name': {0: 'HY 91 W/ 
SERFAS CLUB DR', 1: 'CAR', 2: 'PALM DR / BUENA VISTA AV', 3: 'QUAIL PL MENI', 4: 'SUNNY', 5: 'MARTINEZ RD SANJ', 6: 'W METZ RD / ALTURA DR', 7: 'DELL', 8: 'HY 74 E HEM', 9: 'RIVERSIDE DR / JOY ST'}, 'Address': {0: 'HY 91 W Corona CA 92880', 1: 'THEODORE ST/EUCALYPTUS AV Moreno Valley CA 92555', 2: 'PALM DR/BUENA VISTA AV Desert hot springs CA 92240', 3: '23880 KENNEDY LN Menifee CA 92587', 4: '24490 SUNNYMEAD Moreno Valley CA 92553', 5: '40300 MARTINEZ San Jacinto CA 92583', 6: '1388 West METZ Perris CA 92570', 7: '25361 DELPHINIUM Moreno Valley CA 92553', 8: '43763 HY 74 East Hemet CA 92544', 9: 'RIVERSIDE DR/JOY ST Lake Elsinore CA 92530'}, 'Acres_Burned': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: 1.0}, 'Wildland_Fire_Cause': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: 'Misuse of Fire by a Minor'}, 'Latitude_D': {0: 33.88206666666667, 1: 33.935, 2: 33.95955, 3: 33.706066666666665, 4: 34.566516666666665, 5: 33.79938166666667, 6: 33.789833333333334, 7: 33.906416666666665, 8: 33.74765, 9: 33.679883333333336}, 'Longitude_D': {0: -117.62385, 1: -117.13931666666667, 2: -116.50103333333333, 3: -117.2422, 4: -117.39321666666666, 5: -116.9573, 6: -117.254, 7: -117.22008333333332, 8: 116.89728333333332, 9: -117.37076666666665}, 'Member_Making_Report': {0: 'Muhammad Nassar', 1: 'DAVID COLOMBO', 2: 'TIMOTHY CABRAL', 3: 'TODD PHILLIPS', 4: 'GREGORY MOWAT', 5: 'MICHAEL ESPARZA', 6: 'Benjamin Hall', 7: 'JORGE LOMELI', 8: 'JOSHUA BALBOA', 9: 'KEVIN MERKH'}, 'Battalion': {0: 4.0, 1: 9.0, 2: 10.0, 3: 13.0, 4: 9.0, 5: 5.0, 6: 1.0, 7: 9.0, 8: 5.0, 9: 2.0}, 'Incident_Status': {0: 'Submitted', 1: 'Submitted', 2: 'Submitted', 3: 'Submitted', 4: 'Submitted', 5: 'Submitted', 6: 'Submitted', 7: 'Submitted', 8: 'Submitted', 9: 'Submitted'}, 'DDLat': {0: '33.88206667N', 1: '33.93500000N', 2: '33.95955000N', 3: '33.70606667N', 4: '34.56651667N', 5: '33.79938167N', 6: '33.78983333N', 7: '33.90641667N', 8: '33.74765000N', 9: 
'33.67988333N'}, 'DDLon': {0: '117.62385000W', 1: '117.13931667W', 2: '116.50103333W', 3: '117.24220000W', 4: '117.39321667W', 5: '116.95730000W', 6: '117.25400000W', 7: '117.22008333W', 8: '116.89728333E', 9: '117.37076667W'}, 'DiscrepancyDistanceFeet': {0: 4178.0, 1: 2388.0, 2: 1094.0, 3: 107.0, 4: 233159.0, 5: 102.0, 6: 1768.0, 7: 78.0, 8: 35603721.0, 9: 9298.0}, 'DiscrepancyDistanceMiles': {0: 1.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 44.0, 5: 0.0, 6: 0.0, 7: 0.0, 8: 6743.0, 9: 2.0}, 'DiscrepancyGreaterThan1000ft': {0: 1.0, 1: 1.0, 2: 1.0, 3: 2.0, 4: 1.0, 5: 2.0, 6: 1.0, 7: 2.0, 8: 1.0, 9: 1.0}, 'LocationLegitimate': {0: nan, 1: nan, 2: nan, 3: 1.0, 4: nan, 5: 1.0, 6: nan, 7: 1.0, 8: nan, 9: nan}, 'LocationErrorCategory': {0: nan, 1: nan, 2: nan, 3: 7.0, 4: nan, 5: 7.0, 6: nan, 7: 7.0, 8: nan, 9: nan}, 'LocationErrorComment': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: nan}, 'LocationErrorResolution': {0: nan, 1: nan, 2: nan, 3: 6.0, 4: nan, 5: 6.0, 6: nan, 7: 6.0, 8: nan, 9: nan}, 'LocationErrorResolutionComment': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: nan}, 'CADLatitudeDDM': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: nan}, 'CADLongitudeDDM': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: nan}, 'GeocodeSymbology': {0: 1, 1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1}, 'Lat/Long': {0: 'Latitude', 1: 'Latitude', 2: 'Latitude', 3: 'Latitude', 4: 'Latitude', 5: 'Latitude', 6: 'Latitude', 7: 'Latitude', 8: 'Latitude', 9: 'Latitude'}, 'CALFIRS_Coords': {0: '33 52.924', 1: '33 56.100', 2: '33 57.573', 3: '33 42.364', 4: '33 93.991', 5: '33 47.9629', 6: '33 47.390', 7: '33 54.385', 8: '33 44.859', 9: '33 40.793'}}
Try:
cols = ['Incident_Number', 'Lat/Long', 'GeoCodeCoordinates']
mergeMelt = GPSMelt.merge(GeoCodeMelt[cols], on=cols[:-1])
The KeyError: 'Incident_Number' is raised because you use GeoCodeMelt[['GeoCodeCoordinates']] so your columns Incident_Number and Lat/Long don't exist when you merge.
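Since the question mentions that the two tables do not have the same number of records, here is a minimal sketch of the same two-key merge with how="left", so that rows missing from the second table are kept and filled with NaN. The frames reuse the small example tables from the question; dropping the 'CD321' / 'Long' row from the second frame is an assumption made here purely to show the NaN case.

```python
import pandas as pd

# Small frames modelled on the example tables in the question.
gps = pd.DataFrame({
    "Incident_Number": ["AB123", "AB123", "CD321", "CD321"],
    "Lat/Long": ["Lat", "Long", "Lat", "Long"],
    "GPSCoordinates": [32.123, 120.123, 31.321, 121.321],
})
# Assumption for illustration: the ('CD321', 'Long') record is missing here.
geo = pd.DataFrame({
    "Incident_Number": ["AB123", "AB123", "CD321"],
    "Lat/Long": ["Lat", "Long", "Lat"],
    "GeoCodeCoordinates": [35.123, 125.123, 36.321],
})

keys = ["Incident_Number", "Lat/Long"]

# The right-hand selection must include the key columns as well as the value
# column; how="left" keeps every row of gps even without a match in geo.
merged = gps.merge(geo[keys + ["GeoCodeCoordinates"]], on=keys, how="left")
print(merged)
```

Passing a list to on= makes pandas treat the pair of values as the key, so no concatenated helper column is needed. Use how="outer" instead if unmatched rows from both tables should be kept.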

How to update a seaborn line plot with ipywidgets checkboxes?

I am struggling with the ipywidgets module.
I am trying to make a plot where you can toggle lines off/on with checkboxes based on a province.
fig, ax = plt.subplots(figsize=(10,10))
sns.lineplot(data=df5, x="Date_of_report", y="Total_reported", hue="Province", ax=ax)
provinces = df5["Province"].unique()
chk = [widgets.Checkbox(description=a) for a in provinces]
def updatePlot(**kwargs):
    print([(k,v) for k, v in kwargs.items()])
widgets.interact(updatePlot, **{c.description: c.value for c in chk})
As you can see, I can draw the checkboxes, and it prints out their status, but I don't know how to update the seaborn line plot. For example, when you select Drenthe, it should show only the line for Drenthe.
here is the dataframe as a dict:
{'Date_of_report': {0: Timestamp('2020-03-13 10:00:00'), 1: Timestamp('2020-03-13 10:00:00'), 2: Timestamp('2020-03-13 10:00:00'), 3: Timestamp('2020-03-13 10:00:00'), 4: Timestamp('2020-03-13 10:00:00'), 5: Timestamp('2020-03-13 10:00:00'), 6: Timestamp('2020-03-13 10:00:00'), 7: Timestamp('2020-03-13 10:00:00'), 8: Timestamp('2020-03-13 10:00:00'), 9: Timestamp('2020-03-13 10:00:00')}, 'Province': {0: 'Drenthe', 1: 'Flevoland', 2: 'Friesland', 3: 'Gelderland', 4: 'Groningen', 5: 'Limburg', 6: 'Noord-Brabant', 7: 'Noord-Holland', 8: 'Overijssel', 9: 'Utrecht'}, 'Total_reported': {0: 14, 1: 7, 2: 8, 3: 64, 4: 4, 5: 71, 6: 377, 7: 66, 8: 18, 9: 83}, 'Hospital_admission': {0: 0, 1: 3, 2: 2, 3: 9, 4: 1, 5: 17, 6: 65, 7: 4, 8: 0, 9: 7}, 'Deceased': {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 3, 6: 5, 7: 0, 8: 0, 9: 0}}

Resolving a value is trying to be set on a copy of a slice error

Trying to resolve the error:
application.py:25: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
application.py:26: SettingWithCopyWarning:
but I can't figure out why I'm getting this warning or how to resolve it.
This is my code:
hr = hr_data[['Month','SalesSystemCode','TITULO','BirthDate','HireDate','SupervisorEmployeeID','BASE','carallowance','Commission_Target','Area','Fulfilment %','Commission Accrued','Commission paid',
'Características (D)', 'Características (I)', 'Características (S)','Características (C)', 'Motivación (D)', 'Motivación (I)','Motivación (S)', 'Motivación (C)', 'Bajo Stress (D)',
'Bajo Stress (I)', 'Bajo Stress (S)', 'Bajo Stress (C)']]
sales = sales_data[['Report month', 'Area','Customer','Rental Charge','Cod. Motivo Desconexion','ID Vendedor']]
#report month to datetime
sales['Report month'] = pd.to_datetime(sales['Report month'])
hr['Month'] = pd.to_datetime(hr['Month'])
#remove sales where customer churned
sales_clean = sales.loc[sales['Cod. Motivo Desconexion'] == 0]
sales_clean = sales_clean[['Report month','Rental Charge','ID Vendedor']]
sales_clean2 = pd.DataFrame(sales_clean.groupby(['Report month','ID Vendedor'])['Rental Charge'].sum())
sales_clean2.reset_index(inplace=True)
hr_area = hr.loc[hr['Area'] == 'Area 1']
merged_hr = hr_area.merge(sales_clean, left_on=['SalesSystemCode','Month'],right_on=['ID Vendedor','Report month'],how='left')
#creating new features: months of employment
merged_hr['MonthsofEmploymentRounded'] = round((merged_hr['Month'] - merged_hr['HireDate'])/np.timedelta64(1,'M')).astype('int')
#filters for interaction
YEAR_MONTH = merged_hr['Month'].unique()
#css stylesheet
external_stylesheets = ['https://codepen.io/chriddyp/pen/bWLwgP.css']
app = dash.Dash(__name__, external_stylesheets=external_stylesheets)
#html layout
app.layout = html.Div(children=[
html.H1(children='SAC Challenge Level 2 Dashboard', style ={
'textAlign': 'center',
'height':'10'
}),
html.Div(children='''
Objective: Studying the impact of supervision on the performance of sales executives in Area 1
'''),
dcc.DatePickerRange(
id='year_month',
start_date= min(merged_hr['Month'].dt.date.tolist()),
end_date = 'Select date'
),
dcc.Graph(
id='performancetable'
)
])
@app.callback(dash.dependencies.Output('performancetable','figure'),
    [dash.dependencies.Input('year_month', 'start_date'),
     dash.dependencies.Input('year_month','end_date')])
def update_table(year_month):
    if year_month is None or year_month ==[]:
        year_month = YEAR_MONTH
    performance = merged_hr[(merged_hr['Month'].isin(year_month))]
    return {
        'data': [
            go.Table(
                header = dict(values=list(performance.columns),fill_color='paleturquoise',align='left'),
                cells = dict(values=[performance['Month'],performance['SalesSystemCode'],performance['TITULO'],
                    performance['HireDate'],performance['MonthsofEmploymentRounded'],performance['SupervisorEmployeeID'],
                    performance['BASE'],performance['carallowance'],performance['Commission_Target'],
                    performance['Fulfilment %'], performance['Commission Accrued'],performance['Commission paid'],
                    performance['Características (D)'],performance['Características (I)'],performance['Características (S)'],
                    performance['Características (C)'],performance['Motivación (D)'],performance['Motivación (I)'],
                    performance['Motivación (S)'],performance['Motivación (C)'],performance['Bajo Stress (D)'],
                    performance['Bajo Stress (I)'],performance['Bajo Stress (S)'],performance['Bajo Stress (C)'],
                    performance['Rental Charge']])
            )],
    }
if __name__ == '__main__':
    app.run_server(debug=True)
Here is a sample of hr_data:
{'Month': {0: Timestamp('2017-12-01 00:00:00'),
1: Timestamp('2017-12-01 00:00:00'),
2: Timestamp('2017-12-01 00:00:00'),
3: Timestamp('2017-12-01 00:00:00'),
4: Timestamp('2017-12-01 00:00:00')},
'EmployeeID': {0: 91868, 1: 1812496, 2: 1812430, 3: 700915, 4: 1812581},
'PayrollProviderName': {0: 'Tele',
1: 'People',
2: 'People',
3: 'Stratego',
4: 'People'},
'SalesSystemCode': {0: 91868.0,
1: 802496.0,
2: 2430.0,
3: 700915.0,
4: 802581.0},
'Payroll Type': {0: 'Insourcing',
1: 'Third Party',
2: 'Third Party',
3: 'Third Party',
4: 'Third Party'},
'Name': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'TITULO': {0: 'SALES SUPERVISOR',
1: 'SALES EXECUTIVE',
2: 'SALES EXECUTIVE',
3: 'SALES EXECUTIVE',
4: 'SALES EXECUTIVE'},
'Sexo': {0: 'M', 1: 'F', 2: 'F', 3: 'M', 4: 'F'},
'BirthDate': {0: Timestamp('1982-11-05 00:00:00'),
1: Timestamp('1987-09-24 00:00:00'),
2: Timestamp('1981-01-13 00:00:00'),
3: Timestamp('1986-04-18 00:00:00'),
4: Timestamp('1991-06-24 00:00:00')},
'HireDate': {0: Timestamp('2012-04-23 00:00:00'),
1: Timestamp('2017-04-10 00:00:00'),
2: Timestamp('2017-03-13 00:00:00'),
3: Timestamp('2015-01-22 00:00:00'),
4: Timestamp('2017-05-18 00:00:00')},
'SupervisorEmployeeID': {0: 7935, 1: 91868, 2: 91868, 3: 91868, 4: 91868},
'SupervisorName': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'BASE': {0: 895, 1: 700, 2: 700, 3: 700, 4: 700},
'carallowance': {0: 350, 1: 250, 2: 250, 3: 250, 4: 250},
'Commission_Target': {0: 708.33, 1: 583.33, 2: 583.33, 3: 583.33, 4: 583.33},
'Nacionalidad': {0: 'INT', 1: 'INT', 2: 'INT', 3: 'INT', 4: 'INT'},
'Area': {0: 'Area 1', 1: 'Area 1', 2: 'Area 1', 3: 'Area 1', 4: 'Area 1'},
'Comment': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Sales Quota (points)': {0: 1810.0, 1: 108.0, 2: 108.0, 3: 108.0, 4: 108.0},
'Real (points)': {0: 1855.0, 1: 86.0, 2: 245.0, 3: 149.0, 4: 91.0},
'Fulfilment %': {0: 1.0248618784530388,
1: 0.7962962962962963,
2: 2.2685185185185186,
3: 1.3796296296296295,
4: 0.8425925925925926},
'Commission Accrued': {0: 708.33, 1: 583.33, 2: 583.33, 3: 583.33, 4: 583.33},
'OA Commission Accrued': {0: 653.66,
1: 87.5,
2: 1494.79,
3: 794.79,
4: 160.42},
'Clawback': {0: 0.0, 1: 24.33, 2: 144.9, 3: 36.77, 4: 0.0},
'Other Commissions': {0: 0.0, 1: 0.0, 2: 9.16, 3: 9.16, 4: 0.0},
'Commission paid': {0: 1361.99, 1: 646.51, 2: 1942.38, 3: 1350.52, 4: 743.75},
'Exit Date': {0: NaT,
1: Timestamp('2018-04-13 00:00:00'),
2: NaT,
3: NaT,
4: Timestamp('2018-08-31 00:00:00')},
'Legal Motive': {0: nan,
1: 'Artículo No. 212',
2: nan,
3: nan,
4: 'Artículo No. 212'},
'Características (D)': {0: nan, 1: 70.0, 2: 70.0, 3: 60.0, 4: 67.0},
'Características (I)': {0: nan, 1: 95.0, 2: 62.0, 3: 25.0, 4: 15.0},
'Características (S)': {0: nan, 1: 20.0, 2: 48.0, 3: 75.0, 4: 40.0},
'Características (C)': {0: nan, 1: 25.0, 2: 34.0, 3: 85.0, 4: 94.0},
'Motivación (D)': {0: nan, 1: 85.0, 2: 75.0, 3: 40.0, 4: 59.0},
'Motivación (I)': {0: nan, 1: 95.0, 2: 74.0, 3: 74.0, 4: 25.0},
'Motivación (S)': {0: nan, 1: 11.0, 2: 58.0, 3: 65.0, 4: 65.0},
'Motivación (C)': {0: nan, 1: 7.0, 2: 33.0, 3: 84.0, 4: 93.0},
'Bajo Stress (D)': {0: nan, 1: 60.0, 2: 69.0, 3: 79.0, 4: 79.0},
'Bajo Stress (I)': {0: nan, 1: 86.0, 2: 60.0, 3: 6.0, 4: 18.0},
'Bajo Stress (S)': {0: nan, 1: 40.0, 2: 60.0, 3: 89.0, 4: 30.0},
'Bajo Stress (C)': {0: nan, 1: 60.0, 2: 48.0, 3: 84.0, 4: 92.0}}
sales_data:
{'Month': {0: Timestamp('2017-07-01 00:00:00'),
1: Timestamp('2017-07-01 00:00:00'),
2: Timestamp('2017-07-01 00:00:00'),
3: Timestamp('2017-07-01 00:00:00'),
4: Timestamp('2017-07-01 00:00:00')},
'Report month': {0: '2017-07',
1: '2017-07',
2: '2017-07',
3: '2017-07',
4: '2017-07'},
'Area': {0: 'Area 1', 1: 'Area 1', 2: 'Area 1', 3: 'Area 1', 4: 'Area 1'},
'Fecha de solicitud': {0: Timestamp('2017-07-25 14:49:51'),
1: Timestamp('2017-07-25 14:56:14'),
2: Timestamp('2017-06-30 13:07:10'),
3: Timestamp('2017-07-03 18:25:17'),
4: Timestamp('2017-07-04 09:56:24')},
'Fecha de salida': {0: Timestamp('2017-07-27 13:11:42'),
1: Timestamp('2017-07-27 15:08:39'),
2: Timestamp('2017-07-04 11:50:07'),
3: Timestamp('2017-07-07 16:40:44'),
4: Timestamp('2017-07-14 14:52:45')},
'Fecha de salida final': {0: Timestamp('2017-07-28 15:13:53'),
1: Timestamp('2017-07-27 15:46:16'),
2: Timestamp('2017-07-05 10:24:46'),
3: Timestamp('2017-07-08 08:36:43'),
4: Timestamp('2017-07-15 10:00:02')},
'Fecha de proceso': {0: Timestamp('2017-08-01 00:00:00'),
1: Timestamp('2017-08-01 00:00:00'),
2: Timestamp('2017-08-01 00:00:00'),
3: Timestamp('2017-08-01 00:00:00'),
4: Timestamp('2017-08-01 00:00:00')},
'Fecha de sistema': {0: Timestamp('2017-07-25 14:49:51'),
1: Timestamp('2017-07-25 14:56:14'),
2: Timestamp('2017-06-30 13:07:10'),
3: Timestamp('2017-07-03 18:25:17'),
4: Timestamp('2017-07-04 09:56:24')},
'Fecha de completada': {0: Timestamp('2017-07-28 15:13:52'),
1: Timestamp('2017-07-27 15:46:15'),
2: Timestamp('2017-07-05 10:24:45'),
3: Timestamp('2017-07-08 08:36:42'),
4: Timestamp('2017-07-15 10:00:02')},
'Fecha de creada': {0: Timestamp('2017-07-25 14:50:00'),
1: Timestamp('2017-07-25 14:56:00'),
2: Timestamp('2017-06-30 13:07:00'),
3: Timestamp('2017-07-03 18:25:00'),
4: Timestamp('2017-07-04 09:56:00')},
'Cod. de Distribucion': {0: 2302, 1: 2302, 2: 2302, 3: 91818, 4: 2302},
'Customer': {0: 19308378, 1: 19308378, 2: 27504455, 3: 27104497, 4: 17608676},
'Cod. Tipo Cliente': {0: 'R', 1: 'R', 2: 'R', 3: 'R', 4: 'R'},
'Tipo De Cliente': {0: 'Residencial ',
1: 'Residencial ',
2: 'Residencial ',
3: 'Residencial ',
4: 'Residencial '},
'Cuenta': {0: 193083780000,
1: 193083780000,
2: 275044550000,
3: 271044970000,
4: 176086760000},
'Status Cuenta': {0: 'W', 1: 'W', 2: 'W', 3: 'W', 4: 'W'},
'Tipo de Contabilidad': {0: 'RP', 1: 'RP', 2: 'RP', 3: 'RP', 4: 'RP'},
'Desc. Tipo Contabilidad': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Tos Cat': {0: 'K', 1: 'K', 2: 'K', 3: 'K', 4: 'K'},
'Desc. Tos Cat': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Mktg Cat': {0: 990005.0, 1: 990005.0, 2: 990000.0, 3: 990000.0, 4: 990000.0},
'Desc. Mktg Cat': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Cod. Bill Sort': {0: 571.0, 1: 571.0, 2: 571.0, 3: 691.0, 4: 256.0},
'Orden de Servicio': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Comando': {0: 'PMO', 1: 'PFB', 2: 'PMO', 3: 'PMO', 4: 'PMO'},
'Desc. Comando': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Prioridad': {0: 5, 1: 5, 2: 5, 3: 5, 4: 5},
'Cod. Línea': {0: 3, 1: 2, 2: 1, 3: 1, 4: 1},
'Número de Servicio': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Producto': {0: 1420, 1: 31000, 2: 1403, 3: 1404, 4: 1404},
'Desc. Producto': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Familia': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Sub Familia': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Rental Charge': {0: 22.5,
1: 18.7125,
2: 15.257499999999999,
3: 19.95,
4: 19.95},
'Inst Charge': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0},
'Control': {0: 'CONEXIONES_COMPLETADAS_CT',
1: 'CONEXIONES_COMPLETADAS_CT',
2: 'CONEXIONES_COMPLETADAS',
3: 'CONEXIONES_COMPLETADAS',
4: 'CONEXIONES_COMPLETADAS'},
'Cod. Estatus': {0: 'A', 1: 'A', 2: 'A', 3: 'A', 4: 'A'},
'Status': {0: 'Por Acción ',
1: 'Por Acción ',
2: 'Por Acción ',
3: 'Por Acción ',
4: 'Por Acción '},
'Cod Razon Pendiente': {0: ' ', 1: ' ', 2: ' ', 3: ' ', 4: ' '},
'Razon Pendiente': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Cod. Motivo Desconexion': {0: 0, 1: 0, 2: 0, 3: 0, 4: 0},
'Motivo Desconexion': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Cod. Agencia': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Agencia': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'ID Vendedor': {0: 2352.0, 1: 2352.0, 2: 2352.0, 3: 2352.0, 4: 2352.0},
'ID Oficinista': {0: 229113.0,
1: 229113.0,
2: 224666.0,
3: 221532.0,
4: 224666.0},
'ID Acct Manager': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0},
'Desc. Acct Manager': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Provincia': {0: 'A', 1: 'A', 2: 'A', 3: 'B', 4: 'B'},
'Central': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Chrg Prod Ant': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Tipo Srv': {0: 'MO', 1: 'TI', 2: 'MO', 3: 'MO', 4: 'MO'},
'Tipo Srv Desc': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Diferencia ': {0: 2.5500000000000007,
1: 0.0,
2: 15.257499999999999,
3: 19.95,
4: 19.95},
'Puntos ': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan}}
@QuanHoang was pointing in the right direction with his comment, but you need to add .copy() for both the hr and sales dataframes:
hr = hr_data[['Month','SalesSystemCode','TITULO','BirthDate','HireDate','SupervisorEmployeeID','BASE','carallowance','Commission_Target','Area','Fulfilment %','Commission Accrued','Commission paid',
'Características (D)', 'Características (I)', 'Características (S)','Características (C)', 'Motivación (D)', 'Motivación (I)','Motivación (S)', 'Motivación (C)', 'Bajo Stress (D)',
'Bajo Stress (I)', 'Bajo Stress (S)', 'Bajo Stress (C)']].copy()
sales = sales_data[['Report month', 'Area','Customer','Rental Charge','Cod. Motivo Desconexion','ID Vendedor']].copy()
Using .copy() works because it makes hr and sales explicitly independent DataFrames, so pandas no longer treats them as possible slices of hr_data and sales_data. Subsequent assignments then apply to the copy without raising the warning.
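As a minimal sketch of the pattern, assuming a tiny made-up stand-in for hr_data (only the column names are borrowed from the question; the values are placeholders):

```python
import pandas as pd

# Hypothetical stand-in for hr_data; the values are placeholders.
hr_data = pd.DataFrame({
    "Month": ["2017-12-01", "2018-01-01"],
    "Area": ["Area 1", "Area 1"],
    "BASE": [895, 700],
})

# Take an explicit copy of the column subset, then modify it freely.
hr = hr_data[["Month", "Area"]].copy()
hr["Month"] = pd.to_datetime(hr["Month"])

print(hr.dtypes)
print(hr_data["Month"].iloc[0])  # the source frame still holds the original string
```

The assignment to hr['Month'] changes only the copy, so hr_data keeps its original string values.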
Another option is to use .loc[] indexing when you do the selection from hr_data and sales_data. This should also work:
hr = hr_data.loc[:, ['Month','SalesSystemCode','TITULO','BirthDate','HireDate','SupervisorEmployeeID','BASE','carallowance','Commission_Target','Area','Fulfilment %','Commission Accrued','Commission paid',
'Características (D)', 'Características (I)', 'Características (S)','Características (C)', 'Motivación (D)', 'Motivación (I)','Motivación (S)', 'Motivación (C)', 'Bajo Stress (D)',
'Bajo Stress (I)', 'Bajo Stress (S)', 'Bajo Stress (C)']]
sales = sales_data.loc[:, ['Report month', 'Area','Customer','Rental Charge','Cod. Motivo Desconexion','ID Vendedor']]
Note that selecting columns with .loc[] uses the format df.loc[:, [ *columns* ]] because .loc[] requires specifying the rows explicitly.
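Using .loc[] avoids the warning because a selection made through .loc[] (or .iloc[]) with a list of column labels is not flagged by pandas as a possible copy of a slice, so later assignments to it are not subject to the 'setting with copy' check. The result is still a separate object from hr_data or sales_data, so .copy() remains the more explicit way to state that intent.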

How to convert Monthly data into Yearly data in pandas dataframe?

All,
My dataframe looks like the following. I am trying to convert my monthly data into yearly data, that is, to aggregate the dataframe so that the monthly data points for the year 1997 are added up and shown in a sum column. I would like to do this for each of the years 1997-2018. I have also included a dput of my dataset for reference.
Note: the snapshot below shows only a few months of data for 1997 and 1998; however, I have the full monthly data for the years 1997 through 2018.
Dput of the dataframe:
{'RegionID': {0: 84654, 1: 91982, 2: 84616, 3: 93144, 4: 91940}, 'RegionName': {0: 60657, 1: 77494, 2: 60614, 3: 79936, 4: 77449}, 'City': {0: 'Chicago', 1: 'Katy', 2: 'Chicago', 3: 'El Paso', 4: 'Katy'}, 'State': {0: 'IL', 1: 'TX', 2: 'IL', 3: 'TX', 4: 'TX'}, 'Metro': {0: 'Chicago-Naperville-Elgin', 1: 'Houston-The Woodlands-Sugar Land', 2: 'Chicago-Naperville-Elgin', 3: 'El Paso', 4: 'Houston-The Woodlands-Sugar Land'}, 'CountyName': {0: 'Cook County', 1: 'Harris County', 2: 'Cook County', 3: 'El Paso County', 4: 'Harris County'}, 'SizeRank': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5}, '1997-01': {0: 344400.0, 1: 197300.0, 2: 503400.0, 3: 77800.0, 4: 96600.0}, '1997-02': {0: 345700.0, 1: 195400.0, 2: 502200.0, 3: 77900.0, 4: 96400.0}, '1997-03': {0: 346700.0, 1: 193000.0, 2: 500000.0, 3: 77900.0, 4: 96200.0}, '1997-04': {0: 347800.0, 1: 191800.0, 2: 497900.0, 3: 77800.0, 4: 96100.0}, '1997-05': {0: 349000.0, 1: 191800.0, 2: 496300.0, 3: 77800.0, 4: 96200.0}, '1997-06': {0: 350400.0, 1: 193000.0, 2: 495200.0, 3: 77800.0, 4: 96300.0}, '1997-07': {0: 352000.0, 1: 195200.0, 2: 494700.0, 3: 77800.0, 4: 96600.0}, '1997-08': {0: 353900.0, 1: 198400.0, 2: 494900.0, 3: 77800.0, 4: 97000.0}, '1997-09': {0: 356200.0, 1: 202800.0, 2: 496200.0, 3: 77900.0, 4: 97500.0}, '1997-10': {0: 358800.0, 1: 208000.0, 2: 498600.0, 3: 78100.0, 4: 98000.0}, '1997-11': {0: 361800.0, 1: 213800.0, 2: 502000.0, 3: 78200.0, 4: 98400.0}, '1997-12': {0: 365700.0, 1: 220700.0, 2: 507600.0, 3: 78400.0, 4: 98800.0}, '1998-01': {0: 370200.0, 1: 227500.0, 2: 514900.0, 3: 78600.0, 4: 99200.0}, '1998-02': {0: 374700.0, 1: 231800.0, 2: 522200.0, 3: 78800.0, 4: 99500.0}, '1998-03': {0: 378900.0, 1: 233400.0, 2: 529500.0, 3: 79000.0, 4: 99700.0}, '1998-04': {0: 383500.0, 1: 233900.0, 2: 537900.0, 3: 79100.0, 4: 100000.0}, '1998-05': {0: 388300.0, 1: 233500.0, 2: 546900.0, 3: 79200.0, 4: 100200.0}, '1998-06': {0: 393300.0, 1: 233300.0, 2: 556400.0, 3: 79300.0, 4: 100400.0}, '1998-07': {0: 398500.0, 1: 234300.0, 2: 
566100.0, 3: 79300.0, 4: 100700.0}, '1998-08': {0: 403800.0, 1: 237400.0, 2: 575600.0, 3: 79300.0, 4: 101100.0}, '1998-09': {0: 409100.0, 1: 242800.0, 2: 584800.0, 3: 79400.0, 4: 101800.0}, '1998-10': {0: 414600.0, 1: 250200.0, 2: 593500.0, 3: 79500.0, 4: 102900.0}, '1998-11': {0: 420100.0, 1: 258600.0, 2: 601600.0, 3: 79500.0, 4: 104300.0}, '1998-12': {0: 426200.0, 1: 268000.0, 2: 610100.0, 3: 79600.0, 4: 106200.0}, '1999-01': {0: 432600.0, 1: 277000.0, 2: 618600.0, 3: 79700.0, 4: 108400.0}, '1999-02': {0: 438600.0, 1: 283600.0, 2: 625600.0, 3: 79900.0, 4: 110400.0}, '1999-03': {0: 444200.0, 1: 288500.0, 2: 631100.0, 3: 80100.0, 4: 112100.0}, '1999-04': {0: 450000.0, 1: 293900.0, 2: 636600.0, 3: 80300.0, 4: 113200.0}, '1999-05': {0: 455900.0, 1: 299200.0, 2: 642100.0, 3: 80600.0, 4: 113600.0}, '1999-06': {0: 462100.0, 1: 304300.0, 2: 647600.0, 3: 80900.0, 4: 113500.0}, '1999-07': {0: 468500.0, 1: 308600.0, 2: 653300.0, 3: 81200.0, 4: 113000.0}, '1999-08': {0: 475300.0, 1: 311400.0, 2: 659300.0, 3: 81400.0, 4: 112500.0}, '1999-09': {0: 482500.0, 1: 312300.0, 2: 665800.0, 3: 81700.0, 4: 112200.0}, '1999-10': {0: 490200.0, 1: 311900.0, 2: 672900.0, 3: 82100.0, 4: 112100.0}, '1999-11': {0: 498200.0, 1: 311100.0, 2: 680500.0, 3: 82400.0, 4: 112400.0}, '1999-12': {0: 507200.0, 1: 311700.0, 2: 689600.0, 3: 82600.0, 4: 113100.0}, '2000-01': {0: 516800.0, 1: 313500.0, 2: 699700.0, 3: 82800.0, 4: 114200.0}, '2000-02': {0: 526300.0, 1: 315000.0, 2: 709300.0, 3: 82900.0, 4: 115700.0}, '2000-03': {0: 535300.0, 1: 316700.0, 2: 718300.0, 3: 83000.0, 4: 117800.0}, '2000-04': {0: 544500.0, 1: 319800.0, 2: 727600.0, 3: 83000.0, 4: 120300.0}, '2000-05': {0: 553500.0, 1: 323700.0, 2: 737100.0, 3: 82900.0, 4: 122900.0}, '2000-06': {0: 562400.0, 1: 327500.0, 2: 746600.0, 3: 82800.0, 4: 125600.0}, '2000-07': {0: 571200.0, 1: 329900.0, 2: 756200.0, 3: 82700.0, 4: 128000.0}, '2000-08': {0: 579800.0, 1: 329800.0, 2: 765800.0, 3: 82400.0, 4: 129800.0}, '2000-09': {0: 588100.0, 1: 326400.0, 
2: 775100.0, 3: 82100.0, 4: 130800.0}, '2000-10': {0: 596300.0, 1: 320100.0, 2: 784400.0, 3: 81900.0, 4: 130900.0}, '2000-11': {0: 604200.0, 1: 312200.0, 2: 793500.0, 3: 81600.0, 4: 129900.0}, '2000-12': {0: 612200.0, 1: 304700.0, 2: 803000.0, 3: 81300.0, 4: 128000.0}, '2001-01': {0: 620200.0, 1: 298700.0, 2: 812500.0, 3: 81000.0, 4: 125600.0}, '2001-02': {0: 627700.0, 1: 294300.0, 2: 821200.0, 3: 80800.0, 4: 123000.0}, '2001-03': {0: 634500.0, 1: 291400.0, 2: 829200.0, 3: 80600.0, 4: 120500.0}, '2001-04': {0: 641000.0, 1: 290800.0, 2: 837000.0, 3: 80300.0, 4: 118300.0}, '2001-05': {0: 647000.0, 1: 291700.0, 2: 844400.0, 3: 80000.0, 4: 116600.0}, '2001-06': {0: 652700.0, 1: 293000.0, 2: 851600.0, 3: 79800.0, 4: 115200.0}, '2001-07': {0: 658100.0, 1: 293600.0, 2: 858600.0, 3: 79500.0, 4: 114200.0}, '2001-08': {0: 663300.0, 1: 292900.0, 2: 865300.0, 3: 79200.0, 4: 113500.0}, '2001-09': {0: 668400.0, 1: 290500.0, 2: 871800.0, 3: 78900.0, 4: 113200.0}, '2001-10': {0: 673400.0, 1: 286700.0, 2: 878200.0, 3: 78600.0, 4: 113100.0}, '2001-11': {0: 678300.0, 1: 282200.0, 2: 884700.0, 3: 78400.0, 4: 113200.0}, '2001-12': {0: 683200.0, 1: 276900.0, 2: 891300.0, 3: 78200.0, 4: 113400.0}, '2002-01': {0: 688300.0, 1: 271000.0, 2: 898000.0, 3: 78200.0, 4: 113700.0}, '2002-02': {0: 693300.0, 1: 264200.0, 2: 904700.0, 3: 78200.0, 4: 114000.0}, '2002-03': {0: 698000.0, 1: 257000.0, 2: 911200.0, 3: 78300.0, 4: 114300.0}, '2002-04': {0: 702400.0, 1: 249700.0, 2: 917600.0, 3: 78400.0, 4: 114700.0}, '2002-05': {0: 706400.0, 1: 243100.0, 2: 923800.0, 3: 78600.0, 4: 115100.0}, '2002-06': {0: 710200.0, 1: 237000.0, 2: 929800.0, 3: 78900.0, 4: 115500.0}, '2002-07': {0: 714000.0, 1: 231700.0, 2: 935700.0, 3: 79200.0, 4: 116100.0}, '2002-08': {0: 717800.0, 1: 227100.0, 2: 941400.0, 3: 79500.0, 4: 116700.0}, '2002-09': {0: 721700.0, 1: 223300.0, 2: 947100.0, 3: 79900.0, 4: 117200.0}, '2002-10': {0: 725700.0, 1: 220300.0, 2: 952800.0, 3: 80300.0, 4: 117800.0}, '2002-11': {0: 729900.0, 1: 
217300.0, 2: 958900.0, 3: 80700.0, 4: 118200.0}, '2002-12': {0: 733400.0, 1: 214700.0, 2: 965100.0, 3: 81000.0, 4: 118500.0}, '2003-01': {0: 735600.0, 1: 213800.0, 2: 971000.0, 3: 81200.0, 4: 118800.0}, '2003-02': {0: 737200.0, 1: 215100.0, 2: 976400.0, 3: 81400.0, 4: 119100.0}, '2003-03': {0: 739000.0, 1: 217300.0, 2: 981400.0, 3: 81500.0, 4: 119300.0}, '2003-04': {0: 740900.0, 1: 219600.0, 2: 985700.0, 3: 81500.0, 4: 119500.0}, '2003-05': {0: 742600.0, 1: 221400.0, 2: 989400.0, 3: 81600.0, 4: 119600.0}, '2003-06': {0: 744400.0, 1: 222300.0, 2: 992900.0, 3: 81700.0, 4: 119700.0}, '2003-07': {0: 746000.0, 1: 222700.0, 2: 996800.0, 3: 81900.0, 4: 119900.0}, '2003-08': {0: 747200.0, 1: 223000.0, 2: 1000800.0, 3: 82000.0, 4: 120200.0}, '2003-09': {0: 748000.0, 1: 223700.0, 2: 1004600.0, 3: 82200.0, 4: 120500.0}, '2003-10': {0: 749000.0, 1: 225100.0, 2: 1008000.0, 3: 82500.0, 4: 120900.0}, '2003-11': {0: 750200.0, 1: 227200.0, 2: 1010600.0, 3: 82900.0, 4: 121500.0}, '2003-12': {0: 752300.0, 1: 229600.0, 2: 1012600.0, 3: 83400.0, 4: 122500.0}, '2004-01': {0: 755300.0, 1: 231800.0, 2: 1014500.0, 3: 84000.0, 4: 123900.0}, '2004-02': {0: 759200.0, 1: 233100.0, 2: 1017000.0, 3: 84700.0, 4: 125300.0}, '2004-03': {0: 764000.0, 1: 233500.0, 2: 1020500.0, 3: 85500.0, 4: 126600.0}, '2004-04': {0: 769600.0, 1: 233000.0, 2: 1024900.0, 3: 86400.0, 4: 127500.0}, '2004-05': {0: 775600.0, 1: 232100.0, 2: 1029800.0, 3: 87200.0, 4: 128100.0}, '2004-06': {0: 781900.0, 1: 231300.0, 2: 1035100.0, 3: 88000.0, 4: 128500.0}, '2004-07': {0: 787900.0, 1: 230700.0, 2: 1040500.0, 3: 88900.0, 4: 128800.0}, '2004-08': {0: 793200.0, 1: 230800.0, 2: 1046000.0, 3: 89700.0, 4: 128900.0}, '2004-09': {0: 798200.0, 1: 231500.0, 2: 1052100.0, 3: 90400.0, 4: 129000.0}, '2004-10': {0: 803100.0, 1: 232700.0, 2: 1058600.0, 3: 91100.0, 4: 129200.0}, '2004-11': {0: 807900.0, 1: 234000.0, 2: 1065000.0, 3: 91900.0, 4: 129400.0}, '2004-12': {0: 812900.0, 1: 235500.0, 2: 1071900.0, 3: 92700.0, 4: 129800.0}, 
'2005-01': {0: 818100.0, 1: 237000.0, 2: 1079000.0, 3: 93600.0, 4: 130100.0}, '2005-02': {0: 823200.0, 1: 238700.0, 2: 1086000.0, 3: 94400.0, 4: 130200.0}, '2005-03': {0: 828300.0, 1: 240600.0, 2: 1093100.0, 3: 95200.0, 4: 130300.0}, '2005-04': {0: 834000.0, 1: 241800.0, 2: 1100500.0, 3: 95800.0, 4: 130400.0}, '2005-05': {0: 839800.0, 1: 241700.0, 2: 1107400.0, 3: 96300.0, 4: 130400.0}, '2005-06': {0: 845600.0, 1: 240700.0, 2: 1113500.0, 3: 96700.0, 4: 130300.0}, '2005-07': {0: 851700.0, 1: 239300.0, 2: 1118800.0, 3: 97200.0, 4: 130100.0}, '2005-08': {0: 858000.0, 1: 238000.0, 2: 1123700.0, 3: 97700.0, 4: 129800.0}, '2005-09': {0: 864300.0, 1: 236900.0, 2: 1129200.0, 3: 98400.0, 4: 129400.0}, '2005-10': {0: 870600.0, 1: 235700.0, 2: 1135400.0, 3: 99000.0, 4: 129000.0}, '2005-11': {0: 876200.0, 1: 234700.0, 2: 1141900.0, 3: 99600.0, 4: 128800.0}, '2005-12': {0: 880600.0, 1: 233400.0, 2: 1148000.0, 3: 100200.0, 4: 128800.0}, '2006-01': {0: 884500.0, 1: 231700.0, 2: 1152800.0, 3: 101000.0, 4: 129000.0}, '2006-02': {0: 887800.0, 1: 230100.0, 2: 1155900.0, 3: 102000.0, 4: 129200.0}, '2006-03': {0: 890600.0, 1: 229000.0, 2: 1157900.0, 3: 103000.0, 4: 129400.0}, '2006-04': {0: 893200.0, 1: 228500.0, 2: 1159500.0, 3: 104300.0, 4: 129500.0}, '2006-05': {0: 895500.0, 1: 228700.0, 2: 1161000.0, 3: 105800.0, 4: 129700.0}, '2006-06': {0: 897300.0, 1: 229400.0, 2: 1162800.0, 3: 107400.0, 4: 130000.0}, '2006-07': {0: 898900.0, 1: 230400.0, 2: 1165300.0, 3: 109100.0, 4: 130300.0}, '2006-08': {0: 900300.0, 1: 231600.0, 2: 1168100.0, 3: 111000.0, 4: 130700.0}, '2006-09': {0: 902000.0, 1: 233000.0, 2: 1171300.0, 3: 113000.0, 4: 131200.0}, '2006-10': {0: 904300.0, 1: 234700.0, 2: 1174400.0, 3: 115000.0, 4: 131800.0}, '2006-11': {0: 907000.0, 1: 237100.0, 2: 1176700.0, 3: 117000.0, 4: 132300.0}, '2006-12': {0: 909500.0, 1: 240200.0, 2: 1178400.0, 3: 118800.0, 4: 132700.0}, '2007-01': {0: 912000.0, 1: 242900.0, 2: 1179900.0, 3: 120600.0, 4: 133000.0}, '2007-02': {0: 913400.0, 1: 
244600.0, 2: 1181100.0, 3: 122200.0, 4: 133200.0}, '2007-03': {0: 913200.0, 1: 245200.0, 2: 1182800.0, 3: 124000.0, 4: 133600.0}, '2007-04': {0: 911800.0, 1: 245200.0, 2: 1184800.0, 3: 126000.0, 4: 134100.0}, '2007-05': {0: 909200.0, 1: 245000.0, 2: 1185300.0, 3: 128000.0, 4: 134700.0}, '2007-06': {0: 905200.0, 1: 245600.0, 2: 1183700.0, 3: 129600.0, 4: 135400.0}, '2007-07': {0: 901300.0, 1: 246900.0, 2: 1181000.0, 3: 130700.0, 4: 136000.0}, '2007-08': {0: 897900.0, 1: 248700.0, 2: 1177900.0, 3: 131400.0, 4: 136600.0}, '2007-09': {0: 895300.0, 1: 250700.0, 2: 1175400.0, 3: 132000.0, 4: 137000.0}, '2007-10': {0: 893500.0, 1: 252500.0, 2: 1173800.0, 3: 132300.0, 4: 137300.0}, '2007-11': {0: 891100.0, 1: 254000.0, 2: 1171700.0, 3: 132300.0, 4: 137400.0}, '2007-12': {0: 886700.0, 1: 254800.0, 2: 1167900.0, 3: 132000.0, 4: 137200.0}, '2008-01': {0: 881900.0, 1: 254000.0, 2: 1162900.0, 3: 131300.0, 4: 136500.0}, '2008-02': {0: 876500.0, 1: 252400.0, 2: 1157000.0, 3: 130300.0, 4: 135600.0}, '2008-03': {0: 870600.0, 1: 250900.0, 2: 1150700.0, 3: 129300.0, 4: 134700.0}, '2008-04': {0: 864900.0, 1: 249600.0, 2: 1144200.0, 3: 128300.0, 4: 133800.0}, '2008-05': {0: 859000.0, 1: 248400.0, 2: 1135900.0, 3: 127300.0, 4: 133000.0}, '2008-06': {0: 851600.0, 1: 247900.0, 2: 1125700.0, 3: 126300.0, 4: 132000.0}, '2008-07': {0: 843800.0, 1: 247700.0, 2: 1114200.0, 3: 125400.0, 4: 131200.0}, '2008-08': {0: 836400.0, 1: 247800.0, 2: 1102200.0, 3: 124600.0, 4: 130500.0}, '2008-09': {0: 830700.0, 1: 247900.0, 2: 1092100.0, 3: 123900.0, 4: 130000.0}, '2008-10': {0: 827300.0, 1: 247800.0, 2: 1085300.0, 3: 123300.0, 4: 129400.0}, '2008-11': {0: 824800.0, 1: 247600.0, 2: 1079400.0, 3: 122600.0, 4: 128700.0}, '2008-12': {0: 821400.0, 1: 247500.0, 2: 1072500.0, 3: 122100.0, 4: 128200.0}, '2009-01': {0: 818500.0, 1: 246600.0, 2: 1065400.0, 3: 121600.0, 4: 127600.0}, '2009-02': {0: 815200.0, 1: 245700.0, 2: 1057900.0, 3: 121200.0, 4: 127100.0}, '2009-03': {0: 810200.0, 1: 245600.0, 2: 1048900.0, 
3: 120800.0, 4: 126400.0}, '2009-04': {0: 803500.0, 1: 246000.0, 2: 1037900.0, 3: 120300.0, 4: 125900.0}, '2009-05': {0: 795400.0, 1: 246300.0, 2: 1024300.0, 3: 119700.0, 4: 125300.0}, '2009-06': {0: 786800.0, 1: 246800.0, 2: 1010100.0, 3: 119100.0, 4: 124700.0}, '2009-07': {0: 780500.0, 1: 247200.0, 2: 999000.0, 3: 118700.0, 4: 124300.0}, '2009-08': {0: 776800.0, 1: 247600.0, 2: 990800.0, 3: 118400.0, 4: 124100.0}, '2009-09': {0: 774600.0, 1: 247900.0, 2: 985400.0, 3: 118200.0, 4: 124100.0}, '2009-10': {0: 774200.0, 1: 248100.0, 2: 983300.0, 3: 117900.0, 4: 124200.0}, '2009-11': {0: 774500.0, 1: 248200.0, 2: 982800.0, 3: 117600.0, 4: 124400.0}, '2009-12': {0: 775800.0, 1: 248000.0, 2: 983000.0, 3: 117500.0, 4: 124500.0}, '2010-01': {0: 774600.0, 1: 249800.0, 2: 985000.0, 3: 117300.0, 4: 124700.0}, '2010-02': {0: 774500.0, 1: 250500.0, 2: 988000.0, 3: 117300.0, 4: 125000.0}, '2010-03': {0: 773800.0, 1: 250100.0, 2: 986200.0, 3: 116900.0, 4: 125100.0}, '2010-04': {0: 769500.0, 1: 250400.0, 2: 978800.0, 3: 116100.0, 4: 124600.0}, '2010-05': {0: 765800.0, 1: 251800.0, 2: 974700.0, 3: 115700.0, 4: 124200.0}, '2010-06': {0: 767300.0, 1: 251300.0, 2: 975300.0, 3: 116100.0, 4: 124100.0}, '2010-07': {0: 765500.0, 1: 251200.0, 2: 973600.0, 3: 116400.0, 4: 124100.0}, '2010-08': {0: 761300.0, 1: 250600.0, 2: 967500.0, 3: 116700.0, 4: 123700.0}, '2010-09': {0: 756700.0, 1: 250000.0, 2: 957800.0, 3: 117400.0, 4: 123400.0}, '2010-10': {0: 747800.0, 1: 250000.0, 2: 945800.0, 3: 118200.0, 4: 123000.0}, '2010-11': {0: 738600.0, 1: 249700.0, 2: 935500.0, 3: 118700.0, 4: 122400.0}, '2010-12': {0: 732000.0, 1: 248100.0, 2: 927000.0, 3: 118800.0, 4: 121400.0}, '2011-01': {0: 730800.0, 1: 247400.0, 2: 924800.0, 3: 119000.0, 4: 120800.0}, '2011-02': {0: 732200.0, 1: 248500.0, 2: 926800.0, 3: 118800.0, 4: 120200.0}, '2011-03': {0: 732500.0, 1: 249400.0, 2: 925200.0, 3: 118300.0, 4: 119900.0}, '2011-04': {0: 731300.0, 1: 249200.0, 2: 918500.0, 3: 118100.0, 4: 120100.0}, '2011-05': {0: 
731500.0, 1: 249300.0, 2: 914200.0, 3: 117600.0, 4: 120000.0}, '2011-06': {0: 731400.0, 1: 249500.0, 2: 912100.0, 3: 116800.0, 4: 119600.0}, '2011-07': {0: 732400.0, 1: 249500.0, 2: 913700.0, 3: 116500.0, 4: 119000.0}, '2011-08': {0: 735100.0, 1: 249400.0, 2: 919800.0, 3: 116100.0, 4: 118100.0}, '2011-09': {0: 736500.0, 1: 248900.0, 2: 924800.0, 3: 114800.0, 4: 117100.0}, '2011-10': {0: 736600.0, 1: 248000.0, 2: 925000.0, 3: 113500.0, 4: 116800.0}, '2011-11': {0: 735900.0, 1: 247100.0, 2: 924800.0, 3: 112800.0, 4: 116700.0}, '2011-12': {0: 739000.0, 1: 247000.0, 2: 930400.0, 3: 112700.0, 4: 116400.0}, '2012-01': {0: 739300.0, 1: 248600.0, 2: 930800.0, 3: 112400.0, 4: 116000.0}, '2012-02': {0: 735600.0, 1: 251200.0, 2: 925800.0, 3: 112200.0, 4: 115900.0}, '2012-03': {0: 735700.0, 1: 252600.0, 2: 927300.0, 3: 112400.0, 4: 115800.0}, '2012-04': {0: 741600.0, 1: 252600.0, 2: 940100.0, 3: 112800.0, 4: 115200.0}, '2012-05': {0: 746200.0, 1: 252700.0, 2: 954200.0, 3: 113200.0, 4: 114700.0}, '2012-06': {0: 752200.0, 1: 252700.0, 2: 967900.0, 3: 113400.0, 4: 114700.0}, '2012-07': {0: 762000.0, 1: 252400.0, 2: 978100.0, 3: 113100.0, 4: 115000.0}, '2012-08': {0: 772800.0, 1: 252500.0, 2: 986000.0, 3: 112800.0, 4: 115500.0}, '2012-09': {0: 781400.0, 1: 253300.0, 2: 995100.0, 3: 112900.0, 4: 115800.0}, '2012-10': {0: 788800.0, 1: 254200.0, 2: 1002400.0, 3: 112900.0, 4: 115900.0}, '2012-11': {0: 795800.0, 1: 255200.0, 2: 1005000.0, 3: 112900.0, 4: 116200.0}, '2012-12': {0: 800900.0, 1: 256600.0, 2: 1005100.0, 3: 112800.0, 4: 116700.0}, '2013-01': {0: 804200.0, 1: 257000.0, 2: 1008500.0, 3: 113000.0, 4: 117300.0}, '2013-02': {0: 808100.0, 1: 256500.0, 2: 1015700.0, 3: 113400.0, 4: 117900.0}, '2013-03': {0: 813200.0, 1: 256600.0, 2: 1027500.0, 3: 113600.0, 4: 118500.0}, '2013-04': {0: 819200.0, 1: 257300.0, 2: 1040800.0, 3: 113500.0, 4: 119300.0}, '2013-05': {0: 827900.0, 1: 258400.0, 2: 1055300.0, 3: 113300.0, 4: 120500.0}, '2013-06': {0: 838200.0, 1: 260700.0, 2: 1071300.0, 3: 
113000.0, 4: 121800.0}, '2013-07': {0: 848300.0, 1: 263900.0, 2: 1090600.0, 3: 112900.0, 4: 123000.0}, '2013-08': {0: 853800.0, 1: 266900.0, 2: 1108500.0, 3: 112900.0, 4: 124300.0}, '2013-09': {0: 856500.0, 1: 269100.0, 2: 1123600.0, 3: 112700.0, 4: 125400.0}, '2013-10': {0: 856800.0, 1: 270900.0, 2: 1135600.0, 3: 112500.0, 4: 126100.0}, '2013-11': {0: 855400.0, 1: 273100.0, 2: 1142400.0, 3: 112300.0, 4: 126800.0}, '2013-12': {0: 854500.0, 1: 275800.0, 2: 1145800.0, 3: 112000.0, 4: 127600.0}, '2014-01': {0: 858500.0, 1: 277700.0, 2: 1148400.0, 3: 111500.0, 4: 128400.0}, '2014-02': {0: 862700.0, 1: 279600.0, 2: 1150700.0, 3: 111500.0, 4: 129100.0}, '2014-03': {0: 866500.0, 1: 282100.0, 2: 1152700.0, 3: 112100.0, 4: 130100.0}, '2014-04': {0: 874900.0, 1: 284500.0, 2: 1157700.0, 3: 112600.0, 4: 131300.0}, '2014-05': {0: 885100.0, 1: 286200.0, 2: 1162400.0, 3: 112700.0, 4: 132600.0}, '2014-06': {0: 890800.0, 1: 288300.0, 2: 1165200.0, 3: 113100.0, 4: 133700.0}, '2014-07': {0: 893800.0, 1: 290700.0, 2: 1169400.0, 3: 113900.0, 4: 134500.0}, '2014-08': {0: 894100.0, 1: 293100.0, 2: 1174900.0, 3: 114300.0, 4: 135300.0}, '2014-09': {0: 891300.0, 1: 295600.0, 2: 1175700.0, 3: 114400.0, 4: 136400.0}, '2014-10': {0: 889700.0, 1: 298200.0, 2: 1174000.0, 3: 114300.0, 4: 137600.0}, '2014-11': {0: 891900.0, 1: 300200.0, 2: 1176300.0, 3: 114200.0, 4: 138800.0}, '2014-12': {0: 894300.0, 1: 301500.0, 2: 1180100.0, 3: 114300.0, 4: 140000.0}, '2015-01': {0: 895000, 1: 301800, 2: 1178600, 3: 114700, 4: 141000}, '2015-02': {0: 897300, 1: 302200, 2: 1176700, 3: 115000, 4: 142000}, '2015-03': {0: 903700, 1: 303700, 2: 1180800, 3: 115100, 4: 143300}, '2015-04': {0: 911300, 1: 306600, 2: 1187600, 3: 115300, 4: 144800}, '2015-05': {0: 915600, 1: 309300, 2: 1193500, 3: 115700, 4: 146100}, '2015-06': {0: 916200, 1: 311900, 2: 1198300, 3: 115900, 4: 147200}, '2015-07': {0: 916700, 1: 314100, 2: 1199600, 3: 115600, 4: 148500}, '2015-08': {0: 918600, 1: 316000, 2: 1198000, 3: 115300, 4: 149700}, 
'2015-09': {0: 924400, 1: 318600, 2: 1199200, 3: 115300, 4: 151100}, '2015-10': {0: 935600, 1: 321800, 2: 1206600, 3: 115400, 4: 152200}, '2015-11': {0: 947200, 1: 324400, 2: 1218000, 3: 115700, 4: 153000}, '2015-12': {0: 950900, 1: 326400, 2: 1226400, 3: 116200, 4: 154100}, '2016-01': {0: 952700, 1: 327400, 2: 1230300, 3: 116200, 4: 156000}, '2016-02': {0: 959000, 1: 326900, 2: 1234700, 3: 115700, 4: 157800}, '2016-03': {0: 966400, 1: 327300, 2: 1240300, 3: 115100, 4: 159600}, '2016-04': {0: 970300, 1: 328900, 2: 1244700, 3: 114700, 4: 161700}, '2016-05': {0: 973200, 1: 330000, 2: 1245800, 3: 114300, 4: 164200}, '2016-06': {0: 973300, 1: 330000, 2: 1245300, 3: 114000, 4: 166100}, '2016-07': {0: 970600, 1: 328900, 2: 1243700, 3: 114000, 4: 167400}, '2016-08': {0: 971800, 1: 327500, 2: 1243400, 3: 113800, 4: 168100}, '2016-09': {0: 977800, 1: 326300, 2: 1245000, 3: 114000, 4: 168400}, '2016-10': {0: 985200, 1: 325300, 2: 1250800, 3: 114800, 4: 168400}, '2016-11': {0: 992900, 1: 324700, 2: 1259300, 3: 115600, 4: 168400}, '2016-12': {0: 997600, 1: 324700, 2: 1266600, 3: 116200, 4: 168400}, '2017-01': {0: 996000, 1: 323700, 2: 1270800, 3: 116800, 4: 168200}, '2017-02': {0: 993100, 1: 322100, 2: 1274500, 3: 117400, 4: 167900}, '2017-03': {0: 991500, 1: 320800, 2: 1278900, 3: 117800, 4: 167400}, '2017-04': {0: 990000, 1: 320400, 2: 1282600, 3: 118200, 4: 167000}, '2017-05': {0: 991400, 1: 320300, 2: 1285800, 3: 118700, 4: 166900}, '2017-06': {0: 998200, 1: 320900, 2: 1288100, 3: 119000, 4: 166800}, '2017-07': {0: 1004000, 1: 320900, 2: 1288500, 3: 119100, 4: 166800}, '2017-08': {0: 1006800, 1: 320300, 2: 1287500, 3: 119400, 4: 167300}, '2017-09': {0: 1008400, 1: 319800, 2: 1289200, 3: 119900, 4: 168300}, '2017-10': {0: 1011300, 1: 320200, 2: 1295000, 3: 120200, 4: 169500}, '2017-11': {0: 1015500, 1: 320800, 2: 1301100, 3: 120200, 4: 170700}, '2017-12': {0: 1022000, 1: 321100, 2: 1304300, 3: 120100, 4: 172100}, '2018-01': {0: 1028900, 1: 322700, 2: 1310100, 3: 120300, 4: 
173500}, '2018-02': {0: 1034500, 1: 326500, 2: 1315300, 3: 120500, 4: 174600}, '2018-03': {0: 1037400, 1: 330400, 2: 1317900, 3: 120800, 4: 175500}, '2018-04': {0: 1038700, 1: 332700, 2: 1321100, 3: 121300, 4: 176400}, '2018-05': {0: 1041500, 1: 334500, 2: 1325300, 3: 122200, 4: 176900}, '2018-06': {0: 1042800, 1: 335900, 2: 1323800, 3: 123000, 4: 176900}, '2018-07': {0: 1042900, 1: 337000, 2: 1321200, 3: 123600, 4: 177300}, '2018-08': {0: 1044400, 1: 338300, 2: 1320700, 3: 124500, 4: 178000}, '2018-09': {0: 1047800, 1: 338400, 2: 1319500, 3: 125600, 4: 178500}, '2018-10': {0: 1049700, 1: 336900, 2: 1318800, 3: 126300, 4: 179300}, '2018-11': {0: 1048300, 1: 336000, 2: 1319700, 3: 126800, 4: 180200}, '2018-12': {0: 1047900, 1: 336500, 2: 1323300, 3: 127400, 4: 180700}}
I am new to Python, so please provide explanation with your code.
You can perform a groupby and sum on the columns:
df.iloc[:,7:].groupby(by=lambda x: x.split('-')[0], axis=1).sum().add_suffix('_sum')
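We extract the monthly columns (everything from the eighth column onward) and aggregate them by year. To do this, I pass groupby a callback that splits each column name and returns the year: x.split('-')[0] returns '1997' whenever x is '1997-XX'. Finally, add_suffix('_sum') renames the yearly columns to '1997_sum', '1998_sum', and so on.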
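Here is a small sketch of the same idea on a cut-down frame (only two identifier columns and three month columns, taken from the question's data). It assumes a variant of the one-liner: as far as I know, groupby(..., axis=1) is deprecated in recent pandas releases, so this version transposes the monthly block, groups the rows by year, and transposes back. The names monthly, yearly and result are just illustrative.

```python
import pandas as pd

# Cut-down version of the question's data: 2 id columns + 3 month columns.
df = pd.DataFrame({
    "RegionID": [84654, 91982],
    "City": ["Chicago", "Katy"],
    "1997-01": [344400.0, 197300.0],
    "1997-02": [345700.0, 195400.0],
    "1998-01": [370200.0, 227500.0],
})

# Keep only the monthly columns (here they start at position 2, not 7).
monthly = df.iloc[:, 2:]

# After transposing, each month is a row label, so the callback receives
# strings like "1997-01" and returns the year part, "1997".
yearly = (
    monthly.T
    .groupby(lambda col: col.split("-")[0])
    .sum()
    .T
    .add_suffix("_sum")
)

# Put the identifier columns back next to the yearly totals.
result = pd.concat([df.iloc[:, :2], yearly], axis=1)
print(result)
```

With the full dataset from the question, the slice would be df.iloc[:, 7:] and df.iloc[:, :7] instead, since there are seven identifier columns before the first month.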

Apply function across pandas dataframe columns

This seems to have been similarly answered, but I can't get it to work.
I have a pandas DataFrame that looks like sig_vars below. This df has a VAF and a Background column. I would like to use the ztest function from statsmodels to assign a p-value to a new p-value column.
The p-value is calculated something like this for each row:
from statsmodels.stats.weightstats import ztest
p_value = ztest(sig_vars.Background,value=sig_vars.VAF)[1]
I have tried something like this, but I can't quite get it to work:
def calc(x):
    return ztest(x.Background, value=x.VAF.astype(float))[1]
sig_vars.dropna().assign(pval = lambda x: calc(x)).head()
It seems strange to me that this works just fine however:
def calc(x):
    return ztest([0.0001,0.0002,0.0001], value=x.VAF.astype(float))[1]
sig_vars.dropna().assign(pval = lambda x: calc(x)).head()
Here is my DataFrame sig_vars:
sig_vars = pd.DataFrame({'AO': {0: 4.0, 1: 16.0, 2: 12.0, 3: 19.0, 4: 2.0},
'Background': {0: nan,
1: [0.00018832391713747646, 0.0002114408734430263, 0.000247843759294141],
2: nan,
3: [0.00023965141612200435,
0.00018864365214110544,
0.00036566589684372596,
0.0005452562704471102],
4: [0.00017349063150589867]},
'Change': {0: 'T>A', 1: 'T>C', 2: 'T>A', 3: 'T>C', 4: 'C>A'},
'Chrom': {0: 'chr1', 1: 'chr1', 2: 'chr1', 3: 'chr1', 4: 'chr1'},
'ConvChange': {0: 'T>A', 1: 'T>C', 2: 'T>A', 3: 'T>C', 4: 'C>A'},
'DP': {0: 16945.0, 1: 16945.0, 2: 16969.0, 3: 16969.0, 4: 16969.0},
'Downstream': {0: 'NaN', 1: 'NaN', 2: 'NaN', 3: 'NaN', 4: 'NaN'},
'Gene': {0: 'TIIIa', 1: 'TIIIa', 2: 'TIIIa', 3: 'TIIIa', 4: 'TIIIa'},
'ID': {0: '86.fastq/onlyProbedRegions.vcf',
1: '86.fastq/onlyProbedRegions.vcf',
2: '86.fastq/onlyProbedRegions.vcf',
3: '86.fastq/onlyProbedRegions.vcf',
4: '86.fastq/onlyProbedRegions.vcf'},
'Individual': {0: 1, 1: 1, 2: 1, 3: 1, 4: 1},
'IntEx': {0: 'TIII', 1: 'TIII', 2: 'TIII', 3: 'TIII', 4: 'TIII'},
'Loc': {0: 115227854, 1: 115227854, 2: 115227855, 3: 115227855, 4: 115227856},
'Upstream': {0: 'NaN', 1: 'NaN', 2: 'NaN', 3: 'NaN', 4: 'NaN'},
'VAF': {0: 0.00023605783416937148,
1: 0.0009442313366774859,
2: 0.0007071719017031057,
3: 0.0011196888443632507,
4: 0.00011786198361718427},
'Var': {0: 'A', 1: 'C', 2: 'A', 3: 'C', 4: 'A'},
'WT': {0: 'T', 1: 'T', 2: 'T', 3: 'T', 4: 'C'}})
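The callable you pass to assign receives the whole DataFrame, so x.Background is the entire column of lists rather than one row's list. Use apply with axis=1 instead, which calls the function once per row. Try this: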
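def calc(x):
    # x is a single row here: x['Background'] is one list, x['VAF'] one number
    return ztest(x['Background'], value=float(x['VAF']))[1]
# rows removed by dropna() get NaN in the new column when the result is aligned on the index
sig_vars['pval'] = sig_vars.dropna().apply(calc, axis=1)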
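To see the row-wise mechanics in isolation, here is a rough sketch that runs without statsmodels. The function one_sample_z_pvalue is a hypothetical hand-rolled stand-in for ztest(...)[1] (a two-sided one-sample z-test using the sample standard deviation); it is an assumption for illustration and is not guaranteed to match statsmodels exactly. The data is a rounded three-row subset of the question's frame.

```python
import math
import statistics

import pandas as pd


def one_sample_z_pvalue(sample, value):
    """Hypothetical stand-in for ztest(sample, value=value)[1]."""
    if len(sample) < 2:
        return float("nan")  # spread is undefined for a single observation
    std_err = statistics.stdev(sample) / math.sqrt(len(sample))
    z = (statistics.mean(sample) - value) / std_err
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value


# Rounded subset of the question's data: one missing Background,
# one list of three values, one single-element list.
sig_vars = pd.DataFrame({
    "Background": [None, [0.00019, 0.00021, 0.00025], [0.00017]],
    "VAF": [0.00024, 0.00094, 0.00012],
})

# Drop rows without a Background list, then call the function once per row.
valid = sig_vars.dropna(subset=["Background"])
pvals = valid.apply(
    lambda row: one_sample_z_pvalue(row["Background"], float(row["VAF"])),
    axis=1,
)

# Assignment aligns on the index, so the dropped row simply gets NaN.
sig_vars["pval"] = pvals
print(sig_vars)
```

Using dropna(subset=["Background"]) instead of a bare dropna() is a design choice for this sketch: it only discards rows where the list itself is missing, rather than rows with a NaN in any column.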
