Merge dataframes in pandas with a combination of keys - python

I have two dataframes that I need to combine based on a key (an 'incident number'). The key, however, is repeated, because the database that will ingest the data requires a particular format for coordinates. How can I join the necessary columns based on a combination of keys?
For example, the two tables look like:
Incident_Number  Lat/Long  GPSCoordinates
AB123            Lat       32.123
AB123            Long      120.123
CD321            Lat       31.321
CD321            Long      121.321
and...
Incident_Number  Lat/Long  GeoCodeCoordinates
AB123            Lat       35.123
AB123            Long      125.123
CD321            Lat       36.321
CD321            Long      126.321
And I need to get to...
IncidentNumber  Lat/Long  GPSCoordinates  GeoCodeCoordinates
AB123           Lat       32.123          35.123
AB123           Long      120.123         125.123
CD321           Lat       31.321          36.321
CD321           Long      121.321         126.321
The number of records is not exactly equal in each table, so the merge needs to allow for NaNs. I am essentially trying to add the column 'GeoCodeCoordinates' to the other dataframe on a combination of 'Incident Number' and 'Lat/Long', so that 'AB123 + Lat' and 'AB123 + Long' are each treated as a single key. Can this be specified within code, or do I need to create a new column that combines the two values into one key?
I imagine I went about this in a bit of a goofy way. The Lat and Long were originally stored in separate fields and I used .melt() to make the data longer. The database that will ultimately take this in requires the longer format for the Lat/Long field.
GPSColList = list(GPSRecords.columns)
GPSColList.remove('Latitude')
GPSColList.remove('Longitude')
GPSMelt = GPSRecords.melt(id_vars=GPSColList, value_vars=['Latitude', 'Longitude'], var_name='Lat/Long', value_name="GPSCoordinates")
As the two sets of coordinates were in separate fields I created two dataframes with each set of coordinates and melted them separately. My attempt to merge them looks like:
mergeMelt = pd.merge(GPSMelt, GeoCodeMelt[["GeoCodeCoordinates"]], on=['Incident_Number', 'Lat/Long'])
Result is KeyError: 'Incident_Number'
Adding samples as requested:
geocodeMelt:
print(geocodeMelt.head(10).to_dict())
{'OID_': {0: 5211, 1: 5212, 2: 5213, 3: 5214, 4: 5215, 5: 5216, 6: 5217, 7: 5218, 8: 5219, 9: 5220}, 'Unit_Level': {0: 'RRU (Riverside
Unit)', 1: 'RRU (Riverside Unit)', 2: 'RRU (Riverside Unit)', 3: 'RRU (Riverside Unit)', 4: 'RRU (Riverside Unit)', 5: 'RRU (Riverside
Unit)', 6: 'RRU (Riverside Unit)', 7: 'RRU (Riverside Unit)', 8: 'RRU (Riverside Unit)', 9: 'RRU (Riverside Unit)'}, 'Agency_FDID': {0: 33090, 1: 33051, 2: 33054, 3: 33054, 4: 33090, 5: 33070, 6: 33030, 7: 33054, 8: 33090, 9: 33052}, 'Incident_Number': {0: '21CARRU0000198', 1: '21CARRU0000564', 2: '21CARRU0000523', 3: '21CARRU0000624', 4: '21CARRU0000436', 5: '21CARRU0000439', 6: '21CARRU0000496', 7: '21CARRU0000422', 8: '21CARRU0000466', 9: '21CARRU0000016'}, 'Exposure': {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0}, 'CAD_Incident_Type': {0: '71', 1: '67B01O', 2: '71C01', 3: '69D03', 4: '67', 5: '67', 6: '71', 7: '69D06', 8: '71C01', 9: '82B01'}, 'CALFIRS_Incident_Type': {0: 'Passenger vehicle fire', 1: 'Outside rubbish, trash or waste fire', 2: 'Passenger vehicle fire', 3: 'Building fire', 4: 'Outside rubbish, trash or waste fire', 5: 'Outside rubbish, trash or waste fire', 6: 'Passenger vehicle fire', 7: 'Dumpster or other outside trash receptacle fire', 8: 'Passenger vehicle fire', 9: 'Brush or brush-and-grass mixture fire'}, 'Incident_Date': {0: '1/1/2021 0:00:00', 1: '1/1/2021 0:00:00', 2: '1/1/2021 0:00:00', 3: '1/1/2021 0:00:00', 4: '1/1/2021 0:00:00', 5: '1/1/2021 0:00:00', 6: '1/1/2021 0:00:00', 7: '1/1/2021 0:00:00', 8: '1/1/2021 0:00:00', 9: '1/1/2021 0:00:00'}, 'Report_Date_Time': {0: nan, 1: '1/1/2021 20:34:00', 2: '1/1/2021 19:07:00', 3: '1/1/2021 23:33:00', 4: nan, 5: '1/1/2021 16:56:00', 6: '1/1/2021 18:28:00', 7: '1/1/2021 16:16:00', 8: '1/1/2021 17:40:00', 9: '1/1/2021 0:15:00'}, 'Day': {0: '06 - Friday', 1: '06 - Friday', 2: '06 - Friday', 3: '06 - Friday', 4: '06 - Friday', 5: '06 - Friday', 6: '06 - Friday', 7: '06 - Friday', 8: '06 - Friday', 9: '06 - Friday'}, 'Incident_Name': {0: 'HY 91 W/ SERFAS CLUB DR', 1: 'QUAIL PL MENI', 2: 'CAR', 3: 'SUNNY', 4: 'MARTINEZ RD SANJ', 5: 'W METZ RD / ALTURA DR', 6: 'PALM DR / BUENA VISTA AV', 7: 'DELL', 8: 'HY 74 E HEM', 9: 'MADISON ST / AVE 60'}, 'Address': {0: 'HY 91 W Corona CA 92880', 
1: '23880 KENNEDY LN Menifee CA 92587', 2: 'THEODORE ST/EUCALYPTUS AV Moreno Valley CA 92555', 3: '24490 SUNNYMEAD Moreno Valley CA 92553', 4: '40300 MARTINEZ San Jacinto CA 92583', 5: '1388 West METZ Perris CA 92570', 6: 'PALM DR/BUENA VISTA AV Desert hot springs CA 92240', 7: '25361 DELPHINIUM Moreno Valley CA 92553', 8: '43763 HY 74 East Hemet CA 92544', 9: 'MADISON ST/AVE 60 La Quinta CA 92253'}, 'Acres_Burned': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: 0.01}, 'Wildland_Fire_Cause': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: 'UU - Undetermined'}, 'Latitude_D': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7:
nan, 8: nan, 9: nan}, 'Longitude_D': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: nan}, 'Member_Making_Report': {0: 'Muhammad Nassar', 1: 'TODD PHILLIPS', 2: 'DAVID COLOMBO', 3: 'GREGORY MOWAT', 4: 'MICHAEL ESPARZA', 5: 'Benjamin Hall', 6: 'TIMOTHY CABRAL', 7: 'JORGE LOMELI', 8: 'JOSHUA BALBOA', 9: 'SETH SHIVELY'}, 'Battalion': {0: 4.0, 1: 13.0, 2: 9.0, 3: 9.0, 4: 5.0, 5: 1.0, 6: 10.0, 7: 9.0, 8: 5.0, 9: 6.0}, 'Incident_Status': {0: 'Submitted', 1: 'Submitted', 2: 'Submitted', 3: 'Submitted', 4: 'Submitted', 5: 'Submitted', 6: 'Submitted', 7: 'Submitted', 8: 'Submitted', 9: 'Submitted'}, 'DDLat': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: nan}, 'DDLon': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: nan}, 'DiscrepancyDistanceFeet': {0: 4178.0, 1: 107.0, 2: 2388.0, 3: 233159.0, 4: 102.0, 5: 1768.0, 6: 1094.0, 7: 78.0, 8: 35603721.0, 9: 149143.0}, 'DiscrepancyDistanceMiles': {0: 1.0, 1: 0.0, 2: 0.0, 3: 44.0, 4: 0.0, 5: 0.0, 6: 0.0, 7: 0.0, 8: 6743.0, 9: 28.0}, 'DiscrepancyGreaterThan1000ft': {0: 1.0, 1: 2.0, 2: 1.0, 3: 1.0, 4: 2.0, 5: 1.0, 6: 1.0, 7: 2.0, 8: 1.0, 9: 1.0}, 'LocationLegitimate': {0: nan, 1: 1.0, 2: nan, 3: nan, 4: 1.0, 5: nan, 6: nan, 7: 1.0, 8: nan, 9: nan}, 'LocationErrorCategory': {0: nan, 1: 7.0, 2: nan, 3: nan, 4: 7.0,
5: nan, 6: nan, 7: 7.0, 8: nan, 9: nan}, 'LocationErrorComment': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: nan}, 'LocationErrorResolution': {0: nan, 1: 6.0, 2: nan, 3: nan, 4: 6.0, 5: nan, 6: nan, 7: 6.0, 8: nan, 9: nan}, 'LocationErrorResolutionComment': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: nan}, 'CADLatitudeDDM': {0: '33 53.0746416', 1: '33 42.3811205', 2: '33 55.9728055', 3: '33 56.3706594', 4: '33 47.9788195', 5: '33 47.6486387', 6: '33 57.5747994', 7: '33 54.3721212', 8: '33 44.8499992', 9: '33 38.1589793'}, 'CADLongitudeDDM': {0: '-117 38.2368024', 1: '-117 14.5374611', 2: '-117 07.9119009', 3: '-117 14.1319211', 4: '-116 57.4446600', 5: '-117 15.4013420', 6: '-116 30.2784078', 7: '-117 13.2052213', 8: '-116 53.8524596',
9: '-116 15.0473995'}, 'GeocodeSymbology': {0: 2, 1: 2, 2: 2, 3: 2, 4: 2, 5: 2, 6: 2, 7: 2, 8: 2, 9: 2}, 'Lat/Long': {0: 'Latitude', 1: 'Latitude', 2: 'Latitude', 3: 'Latitude', 4: 'Latitude', 5: 'Latitude', 6: 'Latitude', 7: 'Latitude', 8: 'Latitude', 9: 'Latitude'}, 'CAD_Coords': {0: '33 52.924', 1: '33 42.364', 2: '33 56.100', 3: '33 93.991', 4: '33 47.9629', 5: '33 47.390', 6: '33 57.573', 7: '33 54.385', 8: '33 44.859', 9: '33 61.269'}}
and GPSMelt:
print(GPSMelt.head(10).to_dict())
{'OID_': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 6, 6: 7, 7: 8, 8: 9, 9: 10}, 'Unit_Level': {0: 'RRU (Riverside Unit)', 1: 'RRU (Riverside Unit)', 2: 'RRU (Riverside Unit)', 3: 'RRU (Riverside Unit)', 4: 'RRU (Riverside Unit)', 5: 'RRU (Riverside Unit)', 6: 'RRU (Riverside Unit)', 7: 'RRU (Riverside Unit)', 8: 'RRU (Riverside Unit)', 9: 'RRU (Riverside Unit)'}, 'Agency_FDID': {0: 33090, 1: 33054, 2: 33030, 3: 33051, 4: 33054, 5: 33090, 6: 33070, 7: 33054, 8: 33090, 9: 33035}, 'Incident_Number': {0: '21CARRU0000198', 1: '21CARRU0000523', 2: '21CARRU0000496', 3: '21CARRU0000564', 4: '21CARRU0000624', 5: '21CARRU0000436', 6: '21CARRU0000439', 7: '21CARRU0000422', 8: '21CARRU0000466', 9: '21CARRU0000007'}, 'Exposure': {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0}, 'CAD_Incident_Type': {0: '71', 1: '71C01', 2: '71', 3: '67B01O', 4: '69D03', 5: '67', 6: '67', 7: '69D06', 8: '71C01', 9: '82C03'}, 'CALFIRS_Incident_Type': {0: 'Passenger vehicle fire', 1: 'Passenger vehicle fire', 2: 'Passenger vehicle fire', 3: 'Outside rubbish, trash or waste fire', 4: 'Building fire', 5: 'Outside rubbish, trash or waste fire', 6: 'Outside rubbish, trash or waste fire', 7: 'Dumpster or other outside trash receptacle fire', 8: 'Passenger vehicle fire', 9: 'Brush or brush-and-grass mixture fire'}, 'Incident_Date': {0: '1/1/2021 0:00:00', 1: '1/1/2021 0:00:00', 2: '1/1/2021 0:00:00', 3: '1/1/2021 0:00:00', 4: '1/1/2021 0:00:00', 5: '1/1/2021 0:00:00', 6: '1/1/2021 0:00:00', 7: '1/1/2021 0:00:00', 8: '1/1/2021 0:00:00', 9: '1/1/2021 0:00:00'}, 'Report_Date_Time': {0: nan, 1: '1/1/2021 19:07:00', 2: '1/1/2021 18:28:00', 3: '1/1/2021 20:34:00', 4: '1/1/2021 23:33:00', 5: nan, 6: '1/1/2021 16:56:00', 7: '1/1/2021 16:16:00', 8: '1/1/2021 17:40:00', 9: '1/1/2021 0:07:00'}, 'Day': {0: '06 - Friday', 1: '06 - Friday', 2: '06 - Friday', 3: '06 - Friday', 4: '06 - Friday', 5: '06 - Friday', 6: '06 - Friday', 7: '06 - Friday', 8: '06 - Friday', 9: '06 - Friday'}, 'Incident_Name': {0: 'HY 91 W/ 
SERFAS CLUB DR', 1: 'CAR', 2: 'PALM DR / BUENA VISTA AV', 3: 'QUAIL PL MENI', 4: 'SUNNY', 5: 'MARTINEZ RD SANJ', 6: 'W METZ RD / ALTURA DR', 7: 'DELL', 8: 'HY 74 E HEM', 9: 'RIVERSIDE DR / JOY ST'}, 'Address': {0: 'HY 91 W Corona CA 92880', 1: 'THEODORE ST/EUCALYPTUS AV Moreno Valley CA 92555', 2: 'PALM DR/BUENA VISTA AV Desert hot springs CA 92240', 3: '23880 KENNEDY LN Menifee CA 92587', 4: '24490 SUNNYMEAD Moreno Valley CA 92553', 5: '40300 MARTINEZ San Jacinto CA 92583', 6: '1388 West METZ Perris CA 92570', 7: '25361 DELPHINIUM Moreno Valley CA 92553', 8: '43763 HY 74 East Hemet CA 92544', 9: 'RIVERSIDE DR/JOY ST Lake Elsinore CA 92530'}, 'Acres_Burned': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: 1.0}, 'Wildland_Fire_Cause': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: 'Misuse of Fire by a Minor'}, 'Latitude_D': {0: 33.88206666666667, 1: 33.935, 2: 33.95955, 3: 33.706066666666665, 4: 34.566516666666665, 5: 33.79938166666667, 6: 33.789833333333334, 7: 33.906416666666665, 8: 33.74765, 9: 33.679883333333336}, 'Longitude_D': {0: -117.62385, 1: -117.13931666666667, 2: -116.50103333333333, 3: -117.2422, 4: -117.39321666666666, 5: -116.9573, 6: -117.254, 7: -117.22008333333332, 8: 116.89728333333332, 9: -117.37076666666665}, 'Member_Making_Report': {0: 'Muhammad Nassar', 1: 'DAVID COLOMBO', 2: 'TIMOTHY CABRAL', 3: 'TODD PHILLIPS', 4: 'GREGORY MOWAT', 5: 'MICHAEL ESPARZA', 6: 'Benjamin Hall', 7: 'JORGE LOMELI', 8: 'JOSHUA BALBOA', 9: 'KEVIN MERKH'}, 'Battalion': {0: 4.0, 1: 9.0, 2: 10.0, 3: 13.0, 4: 9.0, 5: 5.0, 6: 1.0, 7: 9.0, 8: 5.0, 9: 2.0}, 'Incident_Status': {0: 'Submitted', 1: 'Submitted', 2: 'Submitted', 3: 'Submitted', 4: 'Submitted', 5: 'Submitted', 6: 'Submitted', 7: 'Submitted', 8: 'Submitted', 9: 'Submitted'}, 'DDLat': {0: '33.88206667N', 1: '33.93500000N', 2: '33.95955000N', 3: '33.70606667N', 4: '34.56651667N', 5: '33.79938167N', 6: '33.78983333N', 7: '33.90641667N', 8: '33.74765000N', 9: 
'33.67988333N'}, 'DDLon': {0: '117.62385000W', 1: '117.13931667W', 2: '116.50103333W', 3: '117.24220000W', 4: '117.39321667W', 5: '116.95730000W', 6: '117.25400000W', 7: '117.22008333W', 8: '116.89728333E', 9: '117.37076667W'}, 'DiscrepancyDistanceFeet': {0: 4178.0, 1: 2388.0, 2: 1094.0, 3: 107.0, 4: 233159.0, 5: 102.0, 6: 1768.0, 7: 78.0, 8: 35603721.0, 9: 9298.0}, 'DiscrepancyDistanceMiles': {0: 1.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 44.0, 5: 0.0, 6: 0.0, 7: 0.0, 8: 6743.0, 9: 2.0}, 'DiscrepancyGreaterThan1000ft': {0: 1.0, 1: 1.0, 2: 1.0, 3: 2.0, 4: 1.0, 5: 2.0, 6: 1.0, 7: 2.0, 8: 1.0, 9: 1.0}, 'LocationLegitimate': {0: nan, 1: nan, 2: nan, 3: 1.0, 4: nan, 5: 1.0, 6: nan, 7: 1.0, 8: nan, 9: nan}, 'LocationErrorCategory': {0: nan, 1: nan, 2: nan, 3: 7.0, 4: nan, 5: 7.0, 6: nan, 7: 7.0, 8: nan, 9: nan}, 'LocationErrorComment': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: nan}, 'LocationErrorResolution': {0: nan, 1: nan, 2: nan, 3: 6.0, 4: nan, 5: 6.0, 6: nan, 7: 6.0, 8: nan, 9: nan}, 'LocationErrorResolutionComment': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: nan}, 'CADLatitudeDDM': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: nan}, 'CADLongitudeDDM': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan, 6: nan, 7: nan, 8: nan, 9: nan}, 'GeocodeSymbology': {0: 1, 1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1}, 'Lat/Long': {0: 'Latitude', 1: 'Latitude', 2: 'Latitude', 3: 'Latitude', 4: 'Latitude', 5: 'Latitude', 6: 'Latitude', 7: 'Latitude', 8: 'Latitude', 9: 'Latitude'}, 'CALFIRS_Coords': {0: '33 52.924', 1: '33 56.100', 2: '33 57.573', 3: '33 42.364', 4: '33 93.991', 5: '33 47.9629', 6: '33 47.390', 7: '33 54.385', 8: '33 44.859', 9: '33 40.793'}}

Try:
cols = ['Incident_Number', 'Lat/Long', 'GeoCodeCoordinates']
mergeMelt = GPSMelt.merge(GeoCodeMelt[cols], on=cols[:-1])
The KeyError: 'Incident_Number' is raised because you pass GeoCodeMelt[['GeoCodeCoordinates']], which selects only that one column, so the key columns Incident_Number and Lat/Long no longer exist in the right-hand dataframe when you merge.
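Since the question mentions that the two tables do not hold exactly the same records, it may also be worth passing how='outer' (or how='left') so that unmatched rows are kept and filled with NaN, because merge defaults to an inner join. A minimal sketch on toy data shaped like the example tables; the values and the extra EF999 row are made up for illustration:

```python
import pandas as pd

# Toy frames mirroring the example tables in the question (values are illustrative).
GPSMelt = pd.DataFrame({
    "Incident_Number": ["AB123", "AB123", "CD321", "CD321"],
    "Lat/Long": ["Lat", "Long", "Lat", "Long"],
    "GPSCoordinates": [32.123, 120.123, 31.321, 121.321],
})
GeoCodeMelt = pd.DataFrame({
    "Incident_Number": ["AB123", "AB123", "EF999"],
    "Lat/Long": ["Lat", "Long", "Lat"],
    "GeoCodeCoordinates": [35.123, 125.123, 30.0],
})

keys = ["Incident_Number", "Lat/Long"]

# Keep the key columns in the right-hand frame, and use an outer join so rows
# present in only one table survive with NaN in the other table's column.
mergeMelt = GPSMelt.merge(
    GeoCodeMelt[keys + ["GeoCodeCoordinates"]],
    on=keys,
    how="outer",
)
print(mergeMelt)
```

Passing a list to on makes pandas match on the combination of both columns, so no concatenated helper key is needed.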

Related

How to subtract two dates based on a filter on two other columns

I am new to Python and I am struggling to reshape my DataFrame.
For a particular client (contact_id), I want to add a new column that subtracts the DTHR_OPERATION date of a row with 'TYPE_OPER_VALIDATION = 1' from the DTHR_OPERATION date of a row with 'TYPE_OPER_VALIDATION = 3'.
If 'TYPE_OPER_VALIDATION' is equal to 3 and there is an hour or less between those two dates, I want to put a string such as 'connection' in the new column.
I get the error "'Series' object has no attribute 'total_seconds'" when I try to check whether the time difference is less than or equal to an hour. I tried many solutions I found on the Internet, but I always seem to run into a data type issue.
Here is my code snippet:
df_oper_one = merged_table.loc[(merged_table['TYPE_OPER_VALIDATION']==1),['contact_id','TYPE_OPER_VALIDATION','DTHR_OPERATION']]
df_oper_three = merged_table.loc[(merged_table['TYPE_OPER_VALIDATION']==3),['contact_id','TYPE_OPER_VALIDATION','DTHR_OPERATION']]
connection = []
for row in merged_table['contact_id']:
    if (df_validation.loc[(df_validation['TYPE_OPER_VALIDATION']==3)]) & ((pd.to_datetime(df_oper_three['DTHR_OPERATION'],format='%Y-%m-%d %H:%M:%S') - pd.to_datetime(df_oper_one['DTHR_OPERATION'],format='%Y-%m-%d %H:%M:%S').total_seconds()) <= 3600): connection.append('connection')
    # if diff_date.total_seconds() <= 3600: connection.append('connection')
    else: connection.append('null')
merged_table['connection'] = pd.Series(connection)
Hello Nicolas and welcome to Stack Overflow. Please remember to always include sample data to reproduce your issue. Here is sample data to reproduce part of your dataframe:
import numpy as np
import pandas as pd

df = pd.DataFrame({'Id contact':['cf2e79bc-8cac-ec11-9840-000d3ab078e6']*12+['865c5edf-c7ac-ec11-9840-000d3ab078e6']*10,
'DTHR OPERATION':['11/10/2022 07:07', '11/10/2022 07:29', '11/10/2022 15:47', '11/10/2022 16:22', '11/10/2022 16:44', '11/10/2022 18:06', '12/10/2022 07:11', '12/10/2022 07:25', '12/10/2022 17:21', '12/10/2022 18:04', '13/10/2022 07:09', '13/10/2022 18:36', '14/09/2022 17:59', '15/09/2022 09:34', '15/09/2022 19:17', '16/09/2022 08:31', '16/09/2022 19:18', '17/09/2022 06:41', '17/09/2022 11:19', '17/09/2022 15:48', '17/09/2022 16:13', '17/09/2022 17:07'],
'lastname':['BOUALAMI']*12+['VERVOORT']*10,
'TYPE_OPER_VALIDATION':[1, 3, 1, 3, 3, 3, 1, 3, 1, 3, 1, 3, 3, 1, 1, 1, 1, 1, 1, 1, 3, 3]})
# The sample dates are day-first (e.g. 13/10/2022), so say so explicitly
df['DTHR OPERATION'] = pd.to_datetime(df['DTHR OPERATION'], dayfirst=True)
I would recommend creating a new table to more easily accomplish your task:
df2 = pd.merge(df[['Id contact', 'DTHR OPERATION']][df['TYPE_OPER_VALIDATION']==3], df[['Id contact', 'DTHR OPERATION']][df['TYPE_OPER_VALIDATION']==1], on='Id contact', suffixes=('_type3','_type1'))
Then find the time difference:
df2['seconds'] = (df2['DTHR OPERATION_type3']-df2['DTHR OPERATION_type1']).dt.total_seconds()
Finally, flag connections of an hour or less:
df2['connection'] = np.where(df2['seconds']<=3600, 'yes', 'no')
Hope this helps!
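For context on the error in the question: total_seconds() is a method of a single Timedelta, while subtracting two datetime columns gives a whole Series of timedeltas, whose element-wise methods are reached through the .dt accessor. The sketch below puts the steps above together on a small made-up frame. The lower bound of 0 seconds is an assumption about the intent (the type-3 event should come after the type-1 event); without it, negative differences would also pass a plain <= 3600 test.

```python
import numpy as np
import pandas as pd

# Toy data (made up): one contact with type-1 and type-3 validations.
df = pd.DataFrame({
    "contact_id": ["A", "A", "A", "A"],
    "DTHR_OPERATION": pd.to_datetime([
        "2022-10-11 07:07", "2022-10-11 07:29",
        "2022-10-11 15:47", "2022-10-11 18:06",
    ]),
    "TYPE_OPER_VALIDATION": [1, 3, 1, 3],
})

cols = ["contact_id", "DTHR_OPERATION"]
type3 = df.loc[df["TYPE_OPER_VALIDATION"] == 3, cols]
type1 = df.loc[df["TYPE_OPER_VALIDATION"] == 1, cols]

# Pair every type-3 row with every type-1 row of the same contact.
pairs = type3.merge(type1, on="contact_id", suffixes=("_type3", "_type1"))

# Subtracting two datetime columns yields a Series of timedeltas; the
# element-wise methods live under the .dt accessor.
pairs["seconds"] = (
    pairs["DTHR_OPERATION_type3"] - pairs["DTHR_OPERATION_type1"]
).dt.total_seconds()

# Assumed intent: the type-3 event follows the type-1 event by at most one hour.
within_hour = pairs["seconds"].between(0, 3600)
pairs["connection"] = np.where(within_hour, "connection", "null")
print(pairs)
```

Note that the merge produces one row per (type 3, type 1) pair for each contact, so the result can grow quickly for contacts with many validations.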
Sure, here is the information you are looking for:
df_contact = pd.DataFrame{'contact_id': {0: '865C5EDF-C7AC-EC11-9840', 1: '9C9690B1-F8AC-EC11', 2: '4DD27359-14AF-EC11-9840', 3: '0091373E-E7F4-4170-BCAC'}, 'birthdate': {0: Timestamp('2005-05-19
00:00:00'), 1: Timestamp('1982-01-28 00:00:00'), 2: Timestamp('1997-05-15 00:00:00'), 3: Timestamp('2005-03-22 00:00:00')}, 'fullname': {0: 'Laura VERVO', 1: 'Mélanie ALBE', 2: 'Eric VANO', 3: 'Jean Docq'}, 'lastname': {0: 'VERVO', 1: 'ALBE', 2: 'VANO', 3: 'Docq'}, 'age': {0: 17, 1: 40, 2: 25, 3: 17}}
df_validation = pd.dataframe{'validation_id': {0: 8263835881, 1: 8263841517, 2: 8263843376, 3: 8263843377, 4: 8263843381, 5: 8263843382, 6: 8263863088, 7: 8263863124, 8: 8263868113, 9: 8263868123}, 'LIBEL_LONG_PRODUIT_TITRE': {0: 'Mens NEXT 12-17', 1: 'Ann NEXT 25-64%B', 2: 'Ann EXPRESS CBLANCHE', 3: 'Multi 8 NEXT', 4: 'Ann EXPRESS 18-24', 5: 'SNCB+TEC NEXT ABO', 6: 'Ann EXPRESS 18-24', 7: 'Ann EXPRESS 12-17%B', 8: '1 jour EX Réfugié', 9: 'Ann EXPRESS 2564%B'}, 'DTHR_OPERATION':
{0: Timestamp('2022-10-01 00:02:02'), 1: Timestamp('2022-10-01 00:22:45'), 2: Timestamp('2022-10-01 00:02:45'), 3: Timestamp('2022-10-01 00:02:49'), 4: Timestamp('2022-10-01 00:07:03'), 5: Timestamp('2022-10-01 00:07:06'), 6: Timestamp('2022-10-01 00:07:40'), 7: Timestamp('2022-10-01 00:31:51'), 8: Timestamp('2022-10-01 00:03:33'), 9: Timestamp('2022-10-01 00:07:40')}, 'TYPE_OPER_VALIDATION': {0: 1, 1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 3, 7: 3, 8: 2, 9: 1}, 'NUM_SERIE_SUPPORT': {0: '2040121921', 1: '2035998914', 2: '2034456458', 3: '14988572652829627697', 4: '2035956003', 5: '2033613155', 6: '2040119429', 7: '2036114867', 8: '14988572650230713650', 9: '2040146199'}}
{'support_id': {0: '8D3A331D-3E86-EC11-93B0', 1: '44863926-3E86-EC11-93B0', 2: '45863926-3E86-EC11-93B0', 3: '46863926-3E86-EC11-93B0', 4: '47863926-3E86-EC11-93B0', 5: 'E3863926-3E86-EC11-93B0', 6: '56873926-3E86-EC11', 7: 'E3CE312C-3E86-EC11-93B0', 8: 'F3CE312C-3E86-EC11-93B0', 9: '3CCF312C-3E86-EC11-93B0'}, 'bd_linkedcustomer': {0: '15CCC384-C4AD-EC11', 1: '9D27061D-14AE-EC11-9840', 2: '74CAE68F-D4AC-EC11-9840', 3: '18F5FE1A-58AC-EC11-983F', 4: None, 5: '9FBDA103-2FAD-EC11-9840', 6: 'EEA1FB63-75AC-EC11-9840', 7: 'F150EC3D-0DAD-EC11-9840', 8: '111DE8C4-CAAC-EC11-9840', 9: None}, 'bd_supportserialnumber': {0: '44884259', 1: '2036010559', 2: '62863150', 3: '2034498160', 4: '62989611', 5: '2036094315', 6: '2033192919', 7: '2036051529', 8: '2036062236', 9: '2033889172'}}
df_support = pd.dataframe{'support_id': {0: '8D3A331D-3E86-EC11-93B0', 1: '44863926-3E86-EC11', 2: '45863926-3E86-EC11-93B0', 3: '46863926-3E86-EC11-93B0', 4: '47863926-3E86-EC11-93B0', 5: 'E3863926-3E86-EC11-93B0', 6: '56873926-3E86-EC11-93B0', 7: 'E3CE312C-3E86-EC11-93B0', 8: 'F3CE312C-3E86-EC11-93B0', 9: '3CCF312C-3E86-EC11-93B0'}, 'bd_linkedcustomer': {0: '15CCC384-C4AD-EC11-9840', 1: '9D27061D-14AE-EC11-9840', 2: '74CAE68F-D4AC-EC11-9840', 3: '18F5FE1A-58AC-EC11-983F', 4: None, 5: '9FBDA103-2FAD-EC11', 6: 'EEA1FB63-75AC-EC11-9840', 7: 'F150EC3D-0DAD-EC11-9840', 8: '111DE8C4-CAAC-EC11-9840', 9: None}, 'bd_supportserialnumber': {0: '44884259', 1: '2036010559', 2: '62863150', 3: '2034498160', 4: '62989611', 5: '2036094315', 6: '2033192919', 7: '2036051529', 8: '2036062236', 9: '2033889172'}}
df2 = pd.dataframe{'support_id': {0: '4BE73E8C-B8F9-EC11-BB3D', 1: '4BE73E8C-B8F9-EC11-BB3D', 2: '4BE73E8C-B8F9-EC11-BB3D', 3: '4BE73E8C-B8F9-EC11-BB3D', 4: '4BE73E8C-B8F9-EC11-BB3D', 5: '4BE73E8C-B8F9-EC11-BB3D', 6: '4BE73E8C-B8F9-EC11', 7: '4BE73E8C-B8F9-EC11-BB3D', 8: '4BE73E8C-B8F9-EC11-BB3D', 9: '4BE73E8C-B8F9-EC11-BB3D'}, 'bd_linkedcustomer': {0: '9C9690B1-F8AC-EC11-9840', 1: '9C9690B1-F8AC-EC11-9840', 2: '9C9690B1-F8AC-EC11-9840', 3: '9C9690B1-F8AC-EC11-9840', 4: '9C9690B1-F8AC-EC11-9840',
5: '9C9690B1-F8AC-EC11-9840', 6: '9C9690B1-F8AC-EC11-9840', 7: '9C9690B1-F8AC-EC11-9840', 8: '9C9690B1-F8AC-EC11-9840', 9: '9C9690B1-F8AC-EC11-9840'}, 'bd_supportserialnumber': {0: '2036002771', 1: '2036002771', 2: '2036002771', 3: '2036002771', 4: '2036002771', 5: '2036002771', 6: '2036002771', 7: '2036002771', 8: '2036002771', 9: '2036002771'}, 'contact_id': {0: '9C9690B1-F8AC-EC11-9840', 1: '9C9690B1-F8AC-EC11-9840', 2: '9C9690B1-F8AC-EC11-9840', 3: '9C9690B1-F8AC-EC11-9840', 4: '9C9690B1-F8AC-EC11-9840', 5: '9C9690B1-F8AC-EC11-9840', 6: '9C9690B1-F8AC-EC11-9840', 7: '9C9690B1-F8AC-EC11-9840', 8: '9C9690B1-F8AC-EC11-9840', 9: '9C9690B1-F8AC-EC11-9840'}, 'birthdate': {0: Timestamp('1982-01-28 00:00:00'), 1: Timestamp('1982-01-28 00:00:00'), 2: Timestamp('1982-01-28 00:00:00'), 3: Timestamp('1982-01-28 00:00:00'), 4: Timestamp('1982-01-28 00:00:00'), 5: Timestamp('1982-01-28 00:00:00'), 6: Timestamp('1982-01-28 00:00:00'), 7: Timestamp('1982-01-28 00:00:00'), 8: Timestamp('1982-01-28 00:00:00'), 9: Timestamp('1982-01-28 00:00:00')}, 'fullname': {0: 'Mélanie ALBE', 1: 'Mélanie ALBE', 2: 'Mélanie ALBE', 3: 'Mélanie ALBE', 4: 'Mélanie ALBE', 5: 'Mélanie ALBE', 6: 'Mélanie ALBE', 7: 'Mélanie ALBE', 8: 'Mélanie ALBE', 9: 'Mélanie ALBE'}, 'lastname': {0: 'ALBE', 1: 'ALBE', 2: 'ALBE', 3: 'ALBE', 4: 'ALBE', 5: 'ALBE', 6: 'ALBE', 7: 'ALBE', 8: 'ALBE', 9: 'ALBE'}, 'age': {0: 40, 1: 40, 2: 40, 3: 40, 4: 40, 5: 40, 6: 40, 7: 40, 8: 40, 9: 40}, 'validation_id': {0: 8264573419, 1: 8264574166, 2: 8264574345, 3: 8264676975, 4: 8265441741, 5: 8272463799, 6: 8272471694, 7: 8274368291, 8: 8274397366, 9: 8277077728}, 'LIBEL_LONG_PRODUIT_TITRE': {0: 'Ann NEXT 25-64', 1: 'Ann NEXT 25-64', 2: 'Ann NEXT 25-64', 3: 'Ann NEXT 25-64', 4: 'Ann NEXT 25-64', 5: 'Ann NEXT 25-64', 6: 'Ann NEXT 25-64', 7: 'Ann NEXT 25-64', 8: 'Ann NEXT 25-64', 9: 'Ann NEXT 25-64'}, 'DTHR_OPERATION': {0: Timestamp('2022-10-01 08:30:18'), 1: Timestamp('2022-10-01 12:23:34'), 2: Timestamp('2022-10-01 07:47:46'), 3: 
Timestamp('2022-10-01 13:11:54'), 4: Timestamp('2022-10-01 12:35:02'), 5: Timestamp('2022-10-04 08:34:23'), 6: Timestamp('2022-10-04 08:04:50'), 7: Timestamp('2022-10-04 17:17:47'), 8: Timestamp('2022-10-04 15:20:29'), 9: Timestamp('2022-10-05 07:54:14')}, 'TYPE_OPER_VALIDATION': {0: 3, 1: 1, 2: 1, 3: 3, 4: 3, 5: 3, 6: 1, 7: 1, 8: 1, 9: 1}, 'NUM_SERIE_SUPPORT': {0: '2036002771', 1: '2036002771', 2: '2036002771', 3: '2036002771', 4: '2036002771', 5: '2036002771', 6: '2036002771', 7: '2036002771', 8: '2036002771', 9: '2036002771'}}
df3 = pd.dataframe{'contact_id': {0: '9C9690B1-F8AC-EC11-9840', 1: '9C9690B1-F8AC-EC11-9840', 2: '9C9690B1-F8AC-EC11-9840', 3: '9C9690B1-F8AC-EC11-9840', 4: '9C9690B1-F8AC-EC11-9840', 5: '9C9690B1-F8AC-EC11-9840', 6: '9C9690B1-F8AC-EC11-9840', 7: '9C9690B1-F8AC-EC11-9840', 8: '9C9690B1-F8AC-EC11-9840', 9: '9C9690B1-F8AC-EC11-9840'}, 'DTHR_OPERATION_type3': {0: Timestamp('2022-10-01 08:30:18'), 1: Timestamp('2022-10-01 08:30:18'), 2: Timestamp('2022-10-01 08:30:18'), 3: Timestamp('2022-10-01 08:30:18'), 4: Timestamp('2022-10-01 08:30:18'), 5: Timestamp('2022-10-01 08:30:18'), 6: Timestamp('2022-10-01 08:30:18'), 7: Timestamp('2022-10-01 08:30:18'), 8: Timestamp('2022-10-01 08:30:18'), 9: Timestamp('2022-10-01 08:30:18')}, 'DTHR_OPERATION_type1': {0: Timestamp('2022-10-01 12:23:34'), 1: Timestamp('2022-10-01 07:47:46'), 2: Timestamp('2022-10-04 08:04:50'), 3: Timestamp('2022-10-04 17:17:47'), 4: Timestamp('2022-10-04 15:20:29'), 5: Timestamp('2022-10-05 07:54:14'), 6: Timestamp('2022-10-05 18:22:42'), 7: Timestamp('2022-10-06 08:14:28'), 8: Timestamp('2022-10-06 18:19:33'), 9: Timestamp('2022-10-08 07:46:45')}, 'seconds': {0: -13996.0, 1: 2552.0, 2: -257672.00000000003, 3: -290849.0, 4: -283811.0, 5: -343436.0, 6: -381144.0, 7: -431050.0, 8: -467355.00000000006, 9: -602187.0}, 'first_connection': {0: 'no', 1: 'yes', 2: 'no', 3: 'no', 4: 'no', 5: 'no', 6: 'no', 7: 'no', 8: 'no', 9: 'no'}}
df4 = pd.dataframe{'contact_id': {0: '9C9690B1-F8AC-EC11-9840', 1: '9C9690B1-F8AC-EC11-9840', 2: '9C9690B1-F8AC-EC11-9840', 3: '9C9690B1-F8AC-EC11-9840', 4: '9C9690B1-F8AC-EC11-9840', 5: '9C9690B1-F8AC-EC11-9840', 6: '9C9690B1-F8AC-EC11-9840', 7: '9C9690B1-F8AC-EC11-9840', 8: '9C9690B1-F8AC-EC11-9840', 9: '9C9690B1-F8AC-EC11-9840'}, 'DTHR_OPERATION_type3': {0: Timestamp('2022-10-01 08:30:18'), 1: Timestamp('2022-10-01 08:30:18'), 2: Timestamp('2022-10-01 08:30:18'), 3: Timestamp('2022-10-01 08:30:18'), 4: Timestamp('2022-10-01 08:30:18'), 5: Timestamp('2022-10-01 08:30:18'), 6: Timestamp('2022-10-01 08:30:18'), 7: Timestamp('2022-10-01 08:30:18'), 8: Timestamp('2022-10-01 08:30:18'), 9: Timestamp('2022-10-01 08:30:18')}, 'DTHR_OPERATION_type3bis': {0: Timestamp('2022-10-01 08:30:18'), 1: Timestamp('2022-10-01 13:11:54'), 2: Timestamp('2022-10-01 12:35:02'), 3: Timestamp('2022-10-04 08:34:23'), 4: Timestamp('2022-10-05 08:27:04'), 5: Timestamp('2022-10-05 19:05:29'), 6: Timestamp('2022-10-06 08:34:21'), 7: Timestamp('2022-10-06 18:37:56'), 8: Timestamp('2022-10-06 19:08:30'), 9: Timestamp('2022-10-08 13:01:13')}, 'seconds_type3': {0: 0.0, 1: -16896.0, 2: -14684.000000000002, 3: -259445.00000000003, 4: -345406.0, 5: -383711.0, 6: -432243.0, 7: -468458.00000000006, 8: -470292.00000000006, 9: -621055.0}, 'second_or_more_connection': {0: 'no', 1: 'no', 2: 'no', 3: 'no', 4: 'no', 5: 'no', 6: 'no', 7: 'no', 8: 'no', 9: 'no'}}
The desired result is a df5 with the following columns [['contact_id', 'fullname', 'validation_id', 'LIBEL_LONG_PRODUIT_TITRE', 'TYPE_OPER_VALIDATION']] as well as the new column df5['connection']. Don't hesitate to reach out if you need further information or clarification. Many thanks for your support :)

Pandas merge can't merge all columns

I am trying to merge two Excel files. My data is:
tabla muestra.xlsx
{'Mandante': {0: 400, 1: 400, 2: 400, 3: 400, 4: 400}, 'Usuario': {0: 152163681, 1: '162181297', 2: '144912861', 3: '140752630', 4: '167300316'}, 'Funcion': {0: 'COMPRADOR', 1: 'JEFE DE COMPRAS', 2: 'COMPRADOR', 3: 'COMPRADOR', 4: 'JEFE DE COMPRAS'}, 'Tipo usuario contractual': {0: 'SAP Application Professional', 1: 'SAP Application Professional', 2: 'SAP Application Professional', 3: 'SAP Application Professional', 4: 'SAP Application Professional'}}
and tabla usuarios roles.xlsx
{'Identificación mdte.': {0: 400, 1: 400, 2: 400, 3: 400, 4: 400}, 'Rol': {0: 'SAP_BC_WEBSERVICE_ADMIN', 1: 'SAP_BC_WEBSERVICE_CONSUMER', 2: 'SAP_BC_WEBSERVICE_SERVICE_USER', 3: 'SAP_J2EE_ADMIN', 4: 'SAP_SDCCN_ALL'}, 'Usuario': {0: 'WEBSERVICE', 1: 'WEBSERVICE', 2: 'WEBSERVICE', 3: 'SM_ADMIN_S4P', 4: 'ADMIN_SONDA'}, 'Fecha de inicio': {0: '01.03.2019', 1: '01.03.2019', 2: '01.03.2019', 3: '16.05.2019', 4: '06.08.2019'}, 'Fecha fin': {0: '31.12.9999', 1: '31.12.9999', 2: '31.12.9999', 3: '31.12.9999', 4: '31.12.9999'}, 'Excluido': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan}, 'Fecha': {0: '01.03.2019', 1: '01.03.2019', 2: '01.03.2019', 3: '16.05.2019', 4: '06.08.2019'}, 'Hora': {0: datetime.time(16, 11, 6), 1: datetime.time(16, 11, 6), 2: datetime.time(16, 11, 6), 3: datetime.time(15, 27, 30), 4: datetime.time(9, 25, 57)}, 'Cronomarcador UTC en forma breve (AAAAMMDDhhmmss)': {0: 0, 1: 0, 2: 0, 3: 0, 4: 0}, 'Org.HR': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan}, 'Asign.proviene de rol compuesto': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan}}
Using the code
# importing the module
import pandas
# reading the files
f1 = pandas.read_excel("~/Desktop/tabla muestra.xlsx")
f2 = pandas.read_excel("~/Desktop/tabla usuarios roles.xlsx")
# merging the files
f3 = f1[["Usuario"]].merge(f2[["Usuario", "Rol"]],
on = "Usuario",
how = "outer")
# creating a new file
f3.to_excel("~/Desktop/Resultstest5.xlsx", index = False)
After the code it returns the following
I checked and the IDs are in both tables. Any clue what's happening?
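One thing worth checking, going only by the sample shown above: in tabla muestra.xlsx the Usuario column mixes an integer (152163681) with strings ('162181297'), and pandas treats 152163681 and '152163681' as different keys. Whether this is the cause in the real files is an assumption. A minimal sketch, on made-up data, that normalises both key columns to stripped strings before merging and uses indicator=True to show which rows matched:

```python
import pandas as pd

# Made-up frames: the left key mixes int and str, as in the sample above.
f1 = pd.DataFrame({"Usuario": [152163681, "162181297", "144912861"]})
f2 = pd.DataFrame({
    "Usuario": ["152163681", "162181297 ", "999"],
    "Rol": ["ROLE_A", "ROLE_B", "ROLE_C"],
})

# Normalise both key columns to stripped strings so equal IDs compare equal.
for frame in (f1, f2):
    frame["Usuario"] = frame["Usuario"].astype(str).str.strip()

# indicator=True adds a _merge column: 'both', 'left_only' or 'right_only'.
f3 = f1.merge(f2, on="Usuario", how="outer", indicator=True)
print(f3)
```

Rows marked left_only or right_only in the _merge column are the ones whose key found no partner, which makes mismatches easy to inspect.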

Given multiple dictionaries, can I get the lowest (or highest) value for each key?

I'm not sure how to approach this problem, but given the following dicts:
{'diff': {0: 358438.3179047619, 1: 2877912.924419369, 2: 822017.9039274186, 3: 4914425.223282051, 4: 574184.9971827588, 5: 7432268.5341428565, 6: 1111639.5132252753, 7: 1322861.412610346, 8: 1179799.2592362808, 9: 87556.64146904761}}
{'diff': {0: 292811.4124761905, 1: 2831096.9336261265, 2: 760006.755798387, 3: 4868369.423293451, 4: 509515.30310344836, 5: 7390444.080714285, 6: 1028933.0801098899, 7: 1240273.4906724147, 8: 1138039.7093932922, 9: 43618.81660000001}}
{'diff': {0: 393148.40700238093, 1: 2923931.0134306327, 2: 878450.4552137096, 3: 4962539.102763245, 4: 660218.1550965513, 5: 7483527.590967346, 6: 1223029.819152747, 7: 1372622.6893804593, 8: 1202322.4719079277, 9: 113611.58858809523}}
{'diff': {0: 386402.65016666666, 1: 2916900.423062612, 2: 870947.0239475806, 3: 4954526.795990028, 4: 652106.3039551723, 5: 7475754.573836735, 6: 1212934.2664368134, 7: 1365836.4194977009, 8: 1196003.2297920743, 9: 108039.20073571429}}
{'diff': {0: 349975.29688095255, 1: 2876674.3017342356, 2: 827975.0650000006, 3: 4913329.426507118, 4: 605245.163706897, 5: 7431737.75197959, 6: 1154341.8745934067, 7: 1325611.1466724137, 8: 1167062.6884146344, 9: 78813.5207857143}}
{'diff': {0: 389236.3094642856, 1: 2919969.395930179, 2: 873295.801427419, 3: 4957163.9330507135, 4: 653377.0037568965, 5: 7479596.044428572, 6: 1214463.8978571433, 7: 1366351.4634890805, 8: 1200255.7743564018, 9: 112641.91081666667}}
{'diff': {0: 391681.69095, 1: 2921278.030853604, 2: 874417.996964516, 3: 4960328.984978635, 4: 658758.8998741381, 5: 7484168.382208164, 6: 1218278.5344219788, 7: 1367466.964590805, 8: 1200111.4596570123, 9: 113533.64980238095}}
{'diff': {0: 355994.5180714284, 1: 2882303.7541306294, 2: 835458.8338790324, 3: 4919442.302396014, 4: 610290.0786551724, 5: 7441912.343979592, 6: 1164700.055917583, 7: 1327737.2043103438, 8: 1169616.6454146332, 9: 81680.70286904761}}
{'diff': {0: 379893.7180714286, 1: 2913403.720793244, 2: 865857.7399225802, 3: 4948973.331188316, 4: 643761.719862069, 5: 7468621.204883674, 6: 1209897.9149901094, 7: 1359244.204440804, 8: 1192828.6090381108, 9: 104051.28336904761}}
{'diff': {0: 390466.6839142858, 1: 2923088.262698646, 2: 877156.1510145164, 3: 4962513.822048144, 4: 659759.9533551724, 5: 7483875.484744897, 6: 1222339.2461901105, 7: 1369121.0132643674, 8: 1201501.5448817061, 9: 113458.57306428571}}
{'diff': {0: 301792.62588095234, 1: 2854027.945333335, 2: 804759.8740564514, 3: 4876267.124210826, 4: 584088.0599310346, 5: 7378153.378530612, 6: 1133044.61306044, 7: 1291385.6421149436, 8: 1139054.1821890248, 9: 38275.36907142856}}
{'diff': {0: 387509.1658071429, 1: 2919049.8491373872, 2: 874219.6323653222, 3: 4955459.435102557, 4: 656559.3065396551, 5: 7476533.654826531, 6: 1217855.0112197807, 7: 1366842.1718931037, 8: 1198388.2114634141, 9: 108848.47544047613}}
{'diff': {0: 377328.5187738094, 1: 2907686.5556463962, 2: 861963.8367822578, 3: 4942903.962752138, 4: 642356.8619948275, 5: 7463152.797857141, 6: 1203804.631930769, 7: 1356454.3497155162, 8: 1189309.752909755, 9: 99476.27148809524}}
{'diff': {0: 352355.7500238095, 1: 2887318.768563064, 2: 841642.5822338712, 3: 4925029.717854701, 4: 621227.5312931032, 5: 7443790.6748775495, 6: 1183558.6595329673, 7: 1333697.2241666662, 8: 1172889.8671798778, 9: 81039.74188095241}}
{'diff': {0: 396255.3198571428, 1: 2926250.0441639633, 2: 880795.3943693547, 3: 4965277.919590886, 4: 663756.0494362068, 5: 7487330.063967346, 6: 1225360.306425824, 7: 1374148.3940419543, 8: 1204383.4553957311, 9: 117133.45492380955}}
{'diff': {0: 397275.22611428564, 1: 2928138.3937932434, 2: 882549.978358064, 3: 4967271.384024783, 4: 665063.7757241379, 5: 7489353.048779594, 6: 1227848.7195598893, 7: 1375612.8537936783, 8: 1205968.5550199081, 9: 118622.92846666674}}
{'diff': {0: 370638.9714999999, 1: 2901794.814063063, 2: 854231.343169355, 3: 4941840.968413107, 4: 636963.8949827587, 5: 7462906.844836734, 6: 1198474.5955769236, 7: 1349199.7593390818, 8: 1181772.3528810989, 9: 94418.88628571431}}
{'diff': {0: 399605.39451595227, 1: 2930519.3274677014, 2: 884866.901809758, 3: 4970067.843492109, 4: 668209.9673794828, 5: 7492181.322271633, 6: 1230438.6087753302, 7: 1377940.4613927014, 8: 1207999.2446168917, 9: 120766.5979569048}}
{'diff': {0: 394437.6273380953, 1: 2926444.621315316, 2: 880587.8419403222, 3: 4965842.826658971, 4: 663091.0379724137, 5: 7487427.719579593, 6: 1226653.0014609892, 7: 1373195.8957902302, 8: 1204177.9670981697, 9: 116665.88206190476}}
{'diff': {0: 343177.5738333332, 1: 2872438.88899099, 2: 824308.511145161, 3: 4901171.498498574, 4: 594996.441275862, 5: 7417912.784775511, 6: 1150261.9712527473, 7: 1323742.7629367814, 8: 1160229.847768293, 9: 70927.47897619048}}
{'diff': {0: 388380.7712333334, 1: 2919408.214353603, 2: 872090.4287451615, 3: 4957030.500496009, 4: 653652.7608896552, 5: 7478706.309169387, 6: 1210682.681124176, 7: 1365970.9027482753, 8: 1199850.2533893296, 9: 112281.24546190478}}
{'diff': {0: 397734.3032357143, 1: 2928578.1657990995, 2: 883142.2520741936, 3: 4967757.699845867, 4: 666066.6204482759, 5: 7489537.145657143, 6: 1228560.9616604394, 7: 1376307.6271563205, 8: 1206326.1817006094, 9: 118790.83264523809}}
{'diff': {0: 382516.58267857146, 1: 2915073.5680945935, 2: 869510.7518991937, 3: 4954054.122938737, 4: 651079.4144172415, 5: 7474328.030612243, 6: 1213478.4472478013, 7: 1363524.8940068972, 8: 1194637.4700762194, 9: 106180.7426238095}}
{'diff': {0: 395288.79071904765, 1: 2925967.5104626133, 2: 880203.2222838707, 3: 4964695.553390598, 4: 663391.0626017239, 5: 7486925.765212244, 6: 1224951.42912967, 7: 1373539.6626603457, 8: 1204264.0033243895, 9: 116594.66418571424}}
{'diff': {0: 397971.03177380946, 1: 2928499.596860811, 2: 882838.8586330643, 3: 4967899.803066952, 4: 665805.6550189656, 5: 7489992.297071429, 6: 1227996.4815417586, 7: 1376172.2323091957, 8: 1206288.5885036567, 9: 119011.36759404762}}
{'diff': {0: 381045.4717000001, 1: 2915758.7289301776, 2: 868614.6180701618, 3: 4952364.463031057, 4: 649488.6040396551, 5: 7473145.036408164, 6: 1211084.349763737, 7: 1359986.3620787358, 8: 1195206.9199817067, 9: 106315.4963142857}}
{'diff': {0: 396112.6919309524, 1: 2927023.3355063056, 2: 881553.0804177421, 3: 4965659.387115391, 4: 664581.2356241376, 5: 7487379.988112247, 6: 1226928.2231780214, 7: 1375081.4878034485, 8: 1204924.4063521333, 9: 117568.09009999997}}
{'diff': {0: 398791.83564142865, 1: 2929904.1134937378, 2: 884268.6928083871, 3: 4969186.882990996, 4: 667453.0766655172, 5: 7491095.65462857, 6: 1229856.0675498352, 7: 1377319.7854759197, 8: 1207308.6914243596, 9: 119888.01348333333}}
{'diff': {0: 361949.4825238095, 1: 2896682.3701126124, 2: 848862.5437822583, 3: 4928751.334897436, 4: 630220.5688913792, 5: 7450946.972428572, 6: 1187394.4575274729, 7: 1340303.0000775873, 8: 1179480.5218445128, 9: 87392.22891666667}}
{'diff': {0: 315083.9590238095, 1: 2875386.155256756, 2: 791020.9478145165, 3: 4919627.580308269, 4: 527800.8608620691, 5: 7418705.913040818, 6: 1038398.8350329669, 7: 1278070.7195321836, 8: 1177528.7164743906, 9: 74112.39018833333}}
{'diff': {0: 372749.6816428571, 1: 2896460.682382884, 2: 847202.8589435485, 3: 4930094.333652413, 4: 625970.5209655174, 5: 7459734.096877551, 6: 1180734.9652692305, 7: 1343552.4978419545, 8: 1181099.5920807915, 9: 96045.81380238094}}
{'diff': {0: 344613.2344047619, 1: 2879554.2198873875, 2: 832675.4379758065, 3: 4903486.329074071, 4: 607037.1537931036, 5: 7421040.479510205, 6: 1162536.2775054947, 7: 1323128.7494655175, 8: 1167686.5103018305, 9: 71892.5343452381}}
{'diff': {0: 384315.38414285716, 1: 2912584.135851351, 2: 868077.15266129, 3: 4948184.254348999, 4: 649068.8833655174, 5: 7468412.446755102, 6: 1210267.5453626376, 7: 1363363.9941091961, 8: 1193581.5346847544, 9: 105967.95393333334}}
{'diff': {0: 301436.47307142866, 1: 2814021.3400585586, 2: 745901.4201774193, 3: 4840460.746145298, 4: 474416.3661724136, 5: 7368423.201306123, 6: 956427.8998351648, 7: 1251712.683614943, 8: 1124866.8502560984, 9: 39000.96757142855}}
{'diff': {0: 350845.38038095244, 1: 2877748.116752253, 2: 830765.2659354841, 3: 4904564.910629633, 4: 609597.3515431035, 5: 7433322.860551022, 6: 1165465.3320219782, 7: 1328359.751505749, 8: 1164311.8769268298, 9: 75226.42738095239}}
{'diff': {0: 273791.3171666667, 1: 2806945.4198468463, 2: 722071.8942903227, 3: 4835423.34654701, 4: -74570571675091.14, 5: 7366856.7167755095, 6: 949178.8157032969, 7: 1234820.710781609, 8: 1113428.221018293, 9: 19590.516166666657}}
For each key 0-9 (in my real case the dictionaries are larger), I want the lowest value across all the dictionaries, collected into a single dictionary.
So if we have:
{'diff': {0: 5, 1: 4, 2: 3, 3: 43, 4: -34, 5: 43, 6: 65, 7: 543, 8: 23, 9: 23}}
{'diff': {0: 6, 1: 3, 2: 8, 3: 78, 4: -23, 5: 54, 6: 76, 7: 43, 8: 234, 9: 54}}
Then I would expect:
{'diff': {0: 5, 1: 3, 2: 3, 3: 43, 4: -34, 5: 43, 6: 65, 7: 43, 8: 23, 9: 23}}
update: when you print the list of dicts, you get:
[{'diff': {0: 358438.3179047619, 1: 2877912.924419369, 2: 822017.9039274186, 3: 4914425.223282051, 4: 574184.9971827588, 5: 7432268.5341428565, 6: 1111639.5132252753, 7: 1322861.412610346, 8: 1179799.2592362808, 9: 87556.64146904761}}, {'diff': {0: 292811.4124761905, 1: 2831096.9336261265, 2: 760006.755798387, 3: 4868369.423293451, 4: 509515.30310344836, 5: 7390444.080714285, 6: 1028933.0801098899, 7: 1240273.4906724147, 8: 1138039.7093932922, 9: 43618.81660000001}}, {'diff': {0: 393148.40700238093, 1: 2923931.0134306327, 2: 878450.4552137096, 3: 4962539.102763245, 4: 660218.1550965513, 5: 7483527.590967346, 6: 1223029.819152747, 7: 1372622.6893804593, 8: 1202322.4719079277, 9: 113611.58858809523}}, {'diff': {0: 386402.65016666666, 1: 2916900.423062612, 2: 870947.0239475806, 3: 4954526.795990028, 4: 652106.3039551723, 5: 7475754.573836735, 6: 1212934.2664368134, 7: 1365836.4194977009, 8: 1196003.2297920743, 9: 108039.20073571429}}, {'diff': {0: 349975.29688095255, 1: 2876674.3017342356, 2: 827975.0650000006, 3: 4913329.426507118, 4: 605245.163706897, 5: 7431737.75197959, 6: 1154341.8745934067, 7: 1325611.1466724137, 8: 1167062.6884146344, 9: 78813.5207857143}}, {'diff': {0: 389236.3094642856, 1: 2919969.395930179, 2: 873295.801427419, 3: 4957163.9330507135, 4: 653377.0037568965, 5: 7479596.044428572, 6: 1214463.8978571433, 7: 1366351.4634890805, 8: 1200255.7743564018, 9: 112641.91081666667}}, {'diff': {0: 391681.69095, 1: 2921278.030853604, 2: 874417.996964516, 3: 4960328.984978635, 4: 658758.8998741381, 5: 7484168.382208164, 6: 1218278.5344219788, 7: 1367466.964590805, 8: 1200111.4596570123, 9: 113533.64980238095}}, {'diff': {0: 355994.5180714284, 1: 2882303.7541306294, 2: 835458.8338790324, 3: 4919442.302396014, 4: 610290.0786551724, 5: 7441912.343979592, 6: 1164700.055917583, 7: 1327737.2043103438, 8: 1169616.6454146332, 9: 81680.70286904761}}, {'diff': {0: 379893.7180714286, 1: 2913403.720793244, 2: 865857.7399225802, 3: 4948973.331188316, 4: 643761.719862069, 
5: 7468621.204883674, 6: 1209897.9149901094, 7: 1359244.204440804, 8: 1192828.6090381108, 9: 104051.28336904761}}, {'diff': {0: 390466.6839142858, 1: 2923088.262698646, 2: 877156.1510145164, 3: 4962513.822048144, 4: 659759.9533551724, 5: 7483875.484744897, 6: 1222339.2461901105, 7: 1369121.0132643674, 8: 1201501.5448817061, 9: 113458.57306428571}}, {'diff': {0: 301792.62588095234, 1: 2854027.945333335, 2: 804759.8740564514, 3: 4876267.124210826, 4: 584088.0599310346, 5: 7378153.378530612, 6: 1133044.61306044, 7: 1291385.6421149436, 8: 1139054.1821890248, 9: 38275.36907142856}}, {'diff': {0: 387509.1658071429, 1: 2919049.8491373872, 2: 874219.6323653222, 3: 4955459.435102557, 4: 656559.3065396551, 5: 7476533.654826531, 6: 1217855.0112197807, 7: 1366842.1718931037, 8: 1198388.2114634141, 9: 108848.47544047613}}, {'diff': {0: 377328.5187738094, 1: 2907686.5556463962, 2: 861963.8367822578, 3: 4942903.962752138, 4: 642356.8619948275, 5: 7463152.797857141, 6: 1203804.631930769, 7: 1356454.3497155162, 8: 1189309.752909755, 9: 99476.27148809524}}, {'diff': {0: 352355.7500238095, 1: 2887318.768563064, 2: 841642.5822338712, 3: 4925029.717854701, 4: 621227.5312931032, 5: 7443790.6748775495, 6: 1183558.6595329673, 7: 1333697.2241666662, 8: 1172889.8671798778, 9: 81039.74188095241}}, {'diff': {0: 396255.3198571428, 1: 2926250.0441639633, 2: 880795.3943693547, 3: 4965277.919590886, 4: 663756.0494362068, 5: 7487330.063967346, 6: 1225360.306425824, 7: 1374148.3940419543, 8: 1204383.4553957311, 9: 117133.45492380955}}, {'diff': {0: 397275.22611428564, 1: 2928138.3937932434, 2: 882549.978358064, 3: 4967271.384024783, 4: 665063.7757241379, 5: 7489353.048779594, 6: 1227848.7195598893, 7: 1375612.8537936783, 8: 1205968.5550199081, 9: 118622.92846666674}}, {'diff': {0: 370638.9714999999, 1: 2901794.814063063, 2: 854231.343169355, 3: 4941840.968413107, 4: 636963.8949827587, 5: 7462906.844836734, 6: 1198474.5955769236, 7: 1349199.7593390818, 8: 1181772.3528810989, 9: 94418.88628571431}}, 
{'diff': {0: 399605.39451595227, 1: 2930519.3274677014, 2: 884866.901809758, 3: 4970067.843492109, 4: 668209.9673794828, 5: 7492181.322271633, 6: 1230438.6087753302, 7: 1377940.4613927014, 8: 1207999.2446168917, 9: 120766.5979569048}}, {'diff': {0: 394437.6273380953, 1: 2926444.621315316, 2: 880587.8419403222, 3: 4965842.826658971, 4: 663091.0379724137, 5: 7487427.719579593, 6: 1226653.0014609892, 7: 1373195.8957902302, 8: 1204177.9670981697, 9: 116665.88206190476}}, {'diff': {0: 343177.5738333332, 1: 2872438.88899099, 2: 824308.511145161, 3: 4901171.498498574, 4: 594996.441275862, 5: 7417912.784775511, 6: 1150261.9712527473, 7: 1323742.7629367814, 8: 1160229.847768293, 9: 70927.47897619048}}, {'diff': {0: 388380.7712333334, 1: 2919408.214353603, 2: 872090.4287451615, 3: 4957030.500496009, 4: 653652.7608896552, 5: 7478706.309169387, 6: 1210682.681124176, 7: 1365970.9027482753, 8: 1199850.2533893296, 9: 112281.24546190478}}, {'diff': {0: 397734.3032357143, 1: 2928578.1657990995, 2: 883142.2520741936, 3: 4967757.699845867, 4: 666066.6204482759, 5: 7489537.145657143, 6: 1228560.9616604394, 7: 1376307.6271563205, 8: 1206326.1817006094, 9: 118790.83264523809}}, {'diff': {0: 382516.58267857146, 1: 2915073.5680945935, 2: 869510.7518991937, 3: 4954054.122938737, 4: 651079.4144172415, 5: 7474328.030612243, 6: 1213478.4472478013, 7: 1363524.8940068972, 8: 1194637.4700762194, 9: 106180.7426238095}}, {'diff': {0: 395288.79071904765, 1: 2925967.5104626133, 2: 880203.2222838707, 3: 4964695.553390598, 4: 663391.0626017239, 5: 7486925.765212244, 6: 1224951.42912967, 7: 1373539.6626603457, 8: 1204264.0033243895, 9: 116594.66418571424}}, {'diff': {0: 397971.03177380946, 1: 2928499.596860811, 2: 882838.8586330643, 3: 4967899.803066952, 4: 665805.6550189656, 5: 7489992.297071429, 6: 1227996.4815417586, 7: 1376172.2323091957, 8: 1206288.5885036567, 9: 119011.36759404762}}, {'diff': {0: 381045.4717000001, 1: 2915758.7289301776, 2: 868614.6180701618, 3: 4952364.463031057, 4: 
649488.6040396551, 5: 7473145.036408164, 6: 1211084.349763737, 7: 1359986.3620787358, 8: 1195206.9199817067, 9: 106315.4963142857}}, {'diff': {0: 396112.6919309524, 1: 2927023.3355063056, 2: 881553.0804177421, 3: 4965659.387115391, 4: 664581.2356241376, 5: 7487379.988112247, 6: 1226928.2231780214, 7: 1375081.4878034485, 8: 1204924.4063521333, 9: 117568.09009999997}}, {'diff': {0: 398791.83564142865, 1: 2929904.1134937378, 2: 884268.6928083871, 3: 4969186.882990996, 4: 667453.0766655172, 5: 7491095.65462857, 6: 1229856.0675498352, 7: 1377319.7854759197, 8: 1207308.6914243596, 9: 119888.01348333333}}, {'diff': {0: 361949.4825238095, 1: 2896682.3701126124, 2: 848862.5437822583, 3: 4928751.334897436, 4: 630220.5688913792, 5: 7450946.972428572, 6: 1187394.4575274729, 7: 1340303.0000775873, 8: 1179480.5218445128, 9: 87392.22891666667}}, {'diff': {0: 315083.9590238095, 1: 2875386.155256756, 2: 791020.9478145165, 3: 4919627.580308269, 4: 527800.8608620691, 5: 7418705.913040818, 6: 1038398.8350329669, 7: 1278070.7195321836, 8: 1177528.7164743906, 9: 74112.39018833333}}, {'diff': {0: 372749.6816428571, 1: 2896460.682382884, 2: 847202.8589435485, 3: 4930094.333652413, 4: 625970.5209655174, 5: 7459734.096877551, 6: 1180734.9652692305, 7: 1343552.4978419545, 8: 1181099.5920807915, 9: 96045.81380238094}}, {'diff': {0: 344613.2344047619, 1: 2879554.2198873875, 2: 832675.4379758065, 3: 4903486.329074071, 4: 607037.1537931036, 5: 7421040.479510205, 6: 1162536.2775054947, 7: 1323128.7494655175, 8: 1167686.5103018305, 9: 71892.5343452381}}, {'diff': {0: 384315.38414285716, 1: 2912584.135851351, 2: 868077.15266129, 3: 4948184.254348999, 4: 649068.8833655174, 5: 7468412.446755102, 6: 1210267.5453626376, 7: 1363363.9941091961, 8: 1193581.5346847544, 9: 105967.95393333334}}, {'diff': {0: 301436.47307142866, 1: 2814021.3400585586, 2: 745901.4201774193, 3: 4840460.746145298, 4: 474416.3661724136, 5: 7368423.201306123, 6: 956427.8998351648, 7: 1251712.683614943, 8: 1124866.8502560984, 9: 
39000.96757142855}}, {'diff': {0: 350845.38038095244, 1: 2877748.116752253, 2: 830765.2659354841, 3: 4904564.910629633, 4: 609597.3515431035, 5: 7433322.860551022, 6: 1165465.3320219782, 7: 1328359.751505749, 8: 1164311.8769268298, 9: 75226.42738095239}}, {'diff': {0: 273791.3171666667, 1: 2806945.4198468463, 2: 722071.8942903227, 3: 4835423.34654701, 4: -74570571675091.14, 5: 7366856.7167755095, 6: 949178.8157032969, 7: 1234820.710781609, 8: 1113428.221018293, 9: 19590.516166666657}}]
This is a classical reduce problem, so one approach is to use the built-in function functools.reduce:
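from functools import reduce
# data is the list of dictionaries from the question
def min_(x, y, key="diff"):
    # element-wise minimum of the inner dictionaries of x and y
    return {key: {ki: min(xi, y[key][ki]) for ki, xi in x[key].items()}}
res = reduce(min_, data)
print(res)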
Output (for the given data)
{'diff': {0: 273791.3171666667, 1: 2806945.4198468463, 2: 722071.8942903227, 3: 4835423.34654701, 4: -74570571675091.14, 5: 7366856.7167755095, 6: 949178.8157032969, 7: 1234820.710781609, 8: 1113428.221018293, 9: 19590.516166666657}}
As an alternative, you could use pandas, as below:
import pandas as pd
# assuming data is a list of dictionaries in the same format as in the question
res = {"diff": pd.DataFrame(data=[d["diff"] for d in data]).min().to_dict()}
print(res)
Output (using pandas)
{'diff': {0: 273791.3171666667, 1: 2806945.4198468463, 2: 722071.8942903227, 3: 4835423.34654701, 4: -74570571675091.14, 5: 7366856.7167755095, 6: 949178.8157032969, 7: 1234820.710781609, 8: 1113428.221018293, 9: 19590.516166666657}}
Note that pandas is a (heavy) third-party library that needs to be installed.
You can do it with a dictionary comprehension that calls min() across all the dictionaries in the list.
result = {'diff':
    {key: min(item['diff'][key] for item in list_of_dicts)
     for key in list_of_dicts[0]['diff']}
}
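The comprehension above takes its keys from the first dictionary only. As a rough sketch of how the same idea could cope with dictionaries that do not all share the same keys, the keys can be gathered from every dictionary first. The toy data below is made up for illustration and is not from the question:

```python
# Hypothetical toy data; the second dict is missing key 2 on purpose.
list_of_dicts = [
    {"diff": {0: 5, 1: 4, 2: 3}},
    {"diff": {0: 6, 1: 3}},
    {"diff": {0: 7, 1: 9, 2: 1}},
]

# Collect every key that appears in any inner dict.
all_keys = sorted(set().union(*(d["diff"] for d in list_of_dicts)))

# For each key, take the minimum over only the dicts that contain it.
result = {
    "diff": {
        k: min(d["diff"][k] for d in list_of_dicts if k in d["diff"])
        for k in all_keys
    }
}
print(result)  # {'diff': {0: 5, 1: 3, 2: 1}}
```

If every dictionary is guaranteed to have the same keys, the shorter version is enough.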
A combination of zip and map can do this for you:
Input:
dicts = [ {'diff': {0: 358438.3179047619, 1: 2877912.924419369, 2: 822017.9039274186, 3: 4914425.223282051, 4: 574184.9971827588, 5: 7432268.5341428565, 6: 1111639.5132252753, 7: 1322861.412610346, 8: 1179799.2592362808, 9: 87556.64146904761}},
{'diff': {0: 292811.4124761905, 1: 2831096.9336261265, 2: 760006.755798387, 3: 4868369.423293451, 4: 509515.30310344836, 5: 7390444.080714285, 6: 1028933.0801098899, 7: 1240273.4906724147, 8: 1138039.7093932922, 9: 43618.81660000001}},
{'diff': {0: 393148.40700238093, 1: 2923931.0134306327, 2: 878450.4552137096, 3: 4962539.102763245, 4: 660218.1550965513, 5: 7483527.590967346, 6: 1223029.819152747, 7: 1372622.6893804593, 8: 1202322.4719079277, 9: 113611.58858809523}},
{'diff': {0: 386402.65016666666, 1: 2916900.423062612, 2: 870947.0239475806, 3: 4954526.795990028, 4: 652106.3039551723, 5: 7475754.573836735, 6: 1212934.2664368134, 7: 1365836.4194977009, 8: 1196003.2297920743, 9: 108039.20073571429}},
{'diff': {0: 349975.29688095255, 1: 2876674.3017342356, 2: 827975.0650000006, 3: 4913329.426507118, 4: 605245.163706897, 5: 7431737.75197959, 6: 1154341.8745934067, 7: 1325611.1466724137, 8: 1167062.6884146344, 9: 78813.5207857143}},
{'diff': {0: 389236.3094642856, 1: 2919969.395930179, 2: 873295.801427419, 3: 4957163.9330507135, 4: 653377.0037568965, 5: 7479596.044428572, 6: 1214463.8978571433, 7: 1366351.4634890805, 8: 1200255.7743564018, 9: 112641.91081666667}},
{'diff': {0: 391681.69095, 1: 2921278.030853604, 2: 874417.996964516, 3: 4960328.984978635, 4: 658758.8998741381, 5: 7484168.382208164, 6: 1218278.5344219788, 7: 1367466.964590805, 8: 1200111.4596570123, 9: 113533.64980238095}},
{'diff': {0: 355994.5180714284, 1: 2882303.7541306294, 2: 835458.8338790324, 3: 4919442.302396014, 4: 610290.0786551724, 5: 7441912.343979592, 6: 1164700.055917583, 7: 1327737.2043103438, 8: 1169616.6454146332, 9: 81680.70286904761}},
{'diff': {0: 379893.7180714286, 1: 2913403.720793244, 2: 865857.7399225802, 3: 4948973.331188316, 4: 643761.719862069, 5: 7468621.204883674, 6: 1209897.9149901094, 7: 1359244.204440804, 8: 1192828.6090381108, 9: 104051.28336904761}},
{'diff': {0: 390466.6839142858, 1: 2923088.262698646, 2: 877156.1510145164, 3: 4962513.822048144, 4: 659759.9533551724, 5: 7483875.484744897, 6: 1222339.2461901105, 7: 1369121.0132643674, 8: 1201501.5448817061, 9: 113458.57306428571}},
{'diff': {0: 301792.62588095234, 1: 2854027.945333335, 2: 804759.8740564514, 3: 4876267.124210826, 4: 584088.0599310346, 5: 7378153.378530612, 6: 1133044.61306044, 7: 1291385.6421149436, 8: 1139054.1821890248, 9: 38275.36907142856}},
{'diff': {0: 387509.1658071429, 1: 2919049.8491373872, 2: 874219.6323653222, 3: 4955459.435102557, 4: 656559.3065396551, 5: 7476533.654826531, 6: 1217855.0112197807, 7: 1366842.1718931037, 8: 1198388.2114634141, 9: 108848.47544047613}},
{'diff': {0: 377328.5187738094, 1: 2907686.5556463962, 2: 861963.8367822578, 3: 4942903.962752138, 4: 642356.8619948275, 5: 7463152.797857141, 6: 1203804.631930769, 7: 1356454.3497155162, 8: 1189309.752909755, 9: 99476.27148809524}},
{'diff': {0: 352355.7500238095, 1: 2887318.768563064, 2: 841642.5822338712, 3: 4925029.717854701, 4: 621227.5312931032, 5: 7443790.6748775495, 6: 1183558.6595329673, 7: 1333697.2241666662, 8: 1172889.8671798778, 9: 81039.74188095241}},
{'diff': {0: 396255.3198571428, 1: 2926250.0441639633, 2: 880795.3943693547, 3: 4965277.919590886, 4: 663756.0494362068, 5: 7487330.063967346, 6: 1225360.306425824, 7: 1374148.3940419543, 8: 1204383.4553957311, 9: 117133.45492380955}},
{'diff': {0: 397275.22611428564, 1: 2928138.3937932434, 2: 882549.978358064, 3: 4967271.384024783, 4: 665063.7757241379, 5: 7489353.048779594, 6: 1227848.7195598893, 7: 1375612.8537936783, 8: 1205968.5550199081, 9: 118622.92846666674}},
{'diff': {0: 370638.9714999999, 1: 2901794.814063063, 2: 854231.343169355, 3: 4941840.968413107, 4: 636963.8949827587, 5: 7462906.844836734, 6: 1198474.5955769236, 7: 1349199.7593390818, 8: 1181772.3528810989, 9: 94418.88628571431}},
{'diff': {0: 399605.39451595227, 1: 2930519.3274677014, 2: 884866.901809758, 3: 4970067.843492109, 4: 668209.9673794828, 5: 7492181.322271633, 6: 1230438.6087753302, 7: 1377940.4613927014, 8: 1207999.2446168917, 9: 120766.5979569048}},
{'diff': {0: 394437.6273380953, 1: 2926444.621315316, 2: 880587.8419403222, 3: 4965842.826658971, 4: 663091.0379724137, 5: 7487427.719579593, 6: 1226653.0014609892, 7: 1373195.8957902302, 8: 1204177.9670981697, 9: 116665.88206190476}},
{'diff': {0: 343177.5738333332, 1: 2872438.88899099, 2: 824308.511145161, 3: 4901171.498498574, 4: 594996.441275862, 5: 7417912.784775511, 6: 1150261.9712527473, 7: 1323742.7629367814, 8: 1160229.847768293, 9: 70927.47897619048}},
{'diff': {0: 388380.7712333334, 1: 2919408.214353603, 2: 872090.4287451615, 3: 4957030.500496009, 4: 653652.7608896552, 5: 7478706.309169387, 6: 1210682.681124176, 7: 1365970.9027482753, 8: 1199850.2533893296, 9: 112281.24546190478}},
{'diff': {0: 397734.3032357143, 1: 2928578.1657990995, 2: 883142.2520741936, 3: 4967757.699845867, 4: 666066.6204482759, 5: 7489537.145657143, 6: 1228560.9616604394, 7: 1376307.6271563205, 8: 1206326.1817006094, 9: 118790.83264523809}},
{'diff': {0: 382516.58267857146, 1: 2915073.5680945935, 2: 869510.7518991937, 3: 4954054.122938737, 4: 651079.4144172415, 5: 7474328.030612243, 6: 1213478.4472478013, 7: 1363524.8940068972, 8: 1194637.4700762194, 9: 106180.7426238095}},
{'diff': {0: 395288.79071904765, 1: 2925967.5104626133, 2: 880203.2222838707, 3: 4964695.553390598, 4: 663391.0626017239, 5: 7486925.765212244, 6: 1224951.42912967, 7: 1373539.6626603457, 8: 1204264.0033243895, 9: 116594.66418571424}},
{'diff': {0: 397971.03177380946, 1: 2928499.596860811, 2: 882838.8586330643, 3: 4967899.803066952, 4: 665805.6550189656, 5: 7489992.297071429, 6: 1227996.4815417586, 7: 1376172.2323091957, 8: 1206288.5885036567, 9: 119011.36759404762}},
{'diff': {0: 381045.4717000001, 1: 2915758.7289301776, 2: 868614.6180701618, 3: 4952364.463031057, 4: 649488.6040396551, 5: 7473145.036408164, 6: 1211084.349763737, 7: 1359986.3620787358, 8: 1195206.9199817067, 9: 106315.4963142857}},
{'diff': {0: 396112.6919309524, 1: 2927023.3355063056, 2: 881553.0804177421, 3: 4965659.387115391, 4: 664581.2356241376, 5: 7487379.988112247, 6: 1226928.2231780214, 7: 1375081.4878034485, 8: 1204924.4063521333, 9: 117568.09009999997}},
{'diff': {0: 398791.83564142865, 1: 2929904.1134937378, 2: 884268.6928083871, 3: 4969186.882990996, 4: 667453.0766655172, 5: 7491095.65462857, 6: 1229856.0675498352, 7: 1377319.7854759197, 8: 1207308.6914243596, 9: 119888.01348333333}},
{'diff': {0: 361949.4825238095, 1: 2896682.3701126124, 2: 848862.5437822583, 3: 4928751.334897436, 4: 630220.5688913792, 5: 7450946.972428572, 6: 1187394.4575274729, 7: 1340303.0000775873, 8: 1179480.5218445128, 9: 87392.22891666667}},
{'diff': {0: 315083.9590238095, 1: 2875386.155256756, 2: 791020.9478145165, 3: 4919627.580308269, 4: 527800.8608620691, 5: 7418705.913040818, 6: 1038398.8350329669, 7: 1278070.7195321836, 8: 1177528.7164743906, 9: 74112.39018833333}},
{'diff': {0: 372749.6816428571, 1: 2896460.682382884, 2: 847202.8589435485, 3: 4930094.333652413, 4: 625970.5209655174, 5: 7459734.096877551, 6: 1180734.9652692305, 7: 1343552.4978419545, 8: 1181099.5920807915, 9: 96045.81380238094}},
{'diff': {0: 344613.2344047619, 1: 2879554.2198873875, 2: 832675.4379758065, 3: 4903486.329074071, 4: 607037.1537931036, 5: 7421040.479510205, 6: 1162536.2775054947, 7: 1323128.7494655175, 8: 1167686.5103018305, 9: 71892.5343452381}},
{'diff': {0: 384315.38414285716, 1: 2912584.135851351, 2: 868077.15266129, 3: 4948184.254348999, 4: 649068.8833655174, 5: 7468412.446755102, 6: 1210267.5453626376, 7: 1363363.9941091961, 8: 1193581.5346847544, 9: 105967.95393333334}},
{'diff': {0: 301436.47307142866, 1: 2814021.3400585586, 2: 745901.4201774193, 3: 4840460.746145298, 4: 474416.3661724136, 5: 7368423.201306123, 6: 956427.8998351648, 7: 1251712.683614943, 8: 1124866.8502560984, 9: 39000.96757142855}},
{'diff': {0: 350845.38038095244, 1: 2877748.116752253, 2: 830765.2659354841, 3: 4904564.910629633, 4: 609597.3515431035, 5: 7433322.860551022, 6: 1165465.3320219782, 7: 1328359.751505749, 8: 1164311.8769268298, 9: 75226.42738095239}},
{'diff': {0: 273791.3171666667, 1: 2806945.4198468463, 2: 722071.8942903227, 3: 4835423.34654701, 4: -74570571675091.14, 5: 7366856.7167755095, 6: 949178.8157032969, 7: 1234820.710781609, 8: 1113428.221018293, 9: 19590.516166666657}}]
Output:
result = {'diff': dict(map(min, zip(*(d['diff'].items() for d in dicts))))}
print(result)
{'diff': {0: 273791.3171666667, 1: 2806945.4198468463,
2: 722071.8942903227, 3: 4835423.34654701,
4: -74570571675091.14, 5: 7366856.7167755095,
6: 949178.8157032969, 7: 1234820.710781609,
8: 1113428.221018293, 9: 19590.516166666657}}
Note that this assumes that all 10 keys are always present and in the same order in every dictionary.
If the keys are not always present or not in the same order, you could do this (float('inf') is the default so that a missing key never wins the minimum):
result = {'diff': {k: min(d['diff'].get(k, float('inf')) for d in dicts) for k in range(10)}}

How to compare numerical values to categorical ranges in column headers in pandas?

I have one dataframe that looks like this:
import pandas as pd
import datetime
from numpy import nan  # the dictionaries below use bare nan
df1 = pd.DataFrame.from_dict(
{'Unnamed: 4': {0: 'Values'},
datetime.datetime(2021, 1, 1, 0, 0): {0: 8},
datetime.datetime(2021, 1, 2, 0, 0): {0: 12},
datetime.datetime(2021, 1, 3, 0, 0): {0: 99},
datetime.datetime(2021, 1, 4, 0, 0): {0: 25},
datetime.datetime(2021, 1, 5, 0, 0): {0: 35}}
)
and a second dataframe that looks like this
df2 = pd.DataFrame.from_dict(
{'Level': {0: 'Range',
1: 'Middle point',
2: 'Total available',
3: nan,
4: 1,
5: 2,
6: 3,
7: 4,
8: 5,
9: 6,
10: 7,
11: 8,
12: 9,
13: 10,
14: 11,
15: 12,
16: 13,
17: 14,
18: 15,
19: 16,
20: 17,
21: 18,
22: 19,
23: 20,
24: 21,
25: 22,
26: 23,
27: 24,
28: 25,
29: 26,
30: 27},
1: {0: '1 to 10',
1: 5,
2: 17.5,
3: nan,
4: nan,
5: nan,
6: 8,
7: nan,
8: nan,
9: nan,
10: nan,
11: nan,
12: 1,
13: 1,
14: nan,
15: nan,
16: nan,
17: nan,
18: 1,
19: 1,
20: nan,
21: nan,
22: 0.5,
23: nan,
24: 1,
25: 1,
26: 1,
27: nan,
28: nan,
29: 1,
30: 1},
11: {0: '11 to 20',
1: 15,
2: 24.5,
3: nan,
4: nan,
5: nan,
6: 15,
7: nan,
8: nan,
9: nan,
10: nan,
11: nan,
12: 1,
13: 1,
14: nan,
15: nan,
16: nan,
17: nan,
18: 1,
19: 1,
20: nan,
21: nan,
22: 0.5,
23: nan,
24: 1,
25: 1,
26: 1,
27: nan,
28: nan,
29: 1,
30: 1},
21: {0: '21 to 30',
1: 25,
2: 34.5,
3: nan,
4: nan,
5: nan,
6: 25,
7: nan,
8: nan,
9: nan,
10: nan,
11: nan,
12: 1,
13: 1,
14: nan,
15: nan,
16: nan,
17: nan,
18: 1,
19: 1,
20: nan,
21: nan,
22: 0.5,
23: nan,
24: 1,
25: 1,
26: 1,
27: nan,
28: nan,
29: 1,
30: 1},
31: {0: '31 to 40',
1: 35,
2: 46.5,
3: nan,
4: nan,
5: nan,
6: 37,
7: nan,
8: nan,
9: nan,
10: nan,
11: nan,
12: 1,
13: 1,
14: nan,
15: nan,
16: nan,
17: nan,
18: 1,
19: 1,
20: nan,
21: nan,
22: 0.5,
23: nan,
24: 1,
25: 1,
26: 1,
27: nan,
28: nan,
29: 1,
30: 1},
41: {0: '41 to 50',
1: 45,
2: 53.5,
3: nan,
4: nan,
5: nan,
6: 44,
7: nan,
8: nan,
9: nan,
10: nan,
11: nan,
12: 1,
13: 1,
14: nan,
15: nan,
16: nan,
17: nan,
18: 1,
19: 1,
20: nan,
21: nan,
22: 0.5,
23: nan,
24: 1,
25: 1,
26: 1,
27: nan,
28: nan,
29: 1,
30: 1}}
)
How can I compare the ranges (for example '1 to 10') with a value such as 8? These records cover a full month, and the range values can go up to 10000 with a bin size of 10.
How can I get the ranges compared against my values row?
One thing I didn't mention before: the bins are not constant, e.g. the bin size is 10 up to 200 and then changes to 50.
EDIT
df1 contains the values by date.
df2 contains the ranges.
I need to compare the values in df1 with the ranges in df2. For example, on Jan 1 the value is 8, which falls in the range 1-10, so the final df should contain all the values under that range for index 1-30.
The final df looks like this:
output = pd.DataFrame.from_dict(
{'Date': {0: 'Values',
1: 1,
2: 2,
3: 3,
4: 4,
5: 5,
6: 6,
7: 7,
8: 8,
9: 9,
10: 10},
datetime.datetime(2021, 1, 1, 0, 0): {0: 8.0,
1: nan,
2: nan,
3: 8.0,
4: nan,
5: nan,
6: nan,
7: nan,
8: nan,
9: 1.0,
10: 1.0},
datetime.datetime(2021, 1, 2, 0, 0): {0: 12.0,
1: nan,
2: nan,
3: 15.0,
4: nan,
5: nan,
6: nan,
7: nan,
8: nan,
9: 1.0,
10: 1.0},
datetime.datetime(2021, 1, 3, 0, 0): {0: 39.0,
1: nan,
2: nan,
3: 25.0,
4: nan,
5: nan,
6: nan,
7: nan,
8: nan,
9: 1.0,
10: 1.0},
datetime.datetime(2021, 1, 4, 0, 0): {0: 25.0,
1: nan,
2: nan,
3: 37.0,
4: nan,
5: nan,
6: nan,
7: nan,
8: nan,
9: 1.0,
10: 1.0},
datetime.datetime(2021, 1, 5, 0, 0): {0: 35.0,
1: nan,
2: nan,
3: 44.0,
4: nan,
5: nan,
6: nan,
7: nan,
8: nan,
9: 1.0,
10: 1.0}}
)
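As a possible starting point for the matching step only, here is a minimal sketch that assigns each value to a range label with pd.cut. The bin edges (width 10 up to 200, then width 50 up to 500) are an assumption based on the description above, and the variable names are made up for illustration. It does not build the final dataframe:

```python
import pandas as pd

# Assumed bin edges: width 10 up to 200, then width 50 up to 500 (illustrative only).
edges = list(range(0, 201, 10)) + list(range(250, 501, 50))

# Labels in the "1 to 10" style used by the first row of df2.
labels = [f"{lo + 1} to {hi}" for lo, hi in zip(edges[:-1], edges[1:])]

# The values row from df1, indexed by date.
values = pd.Series(
    [8, 12, 99, 25, 35],
    index=pd.date_range("2021-01-01", periods=5),
)

# right=True (the default) makes each bin (lo, hi], so 10 falls in "1 to 10".
matched = pd.cut(values, bins=edges, labels=labels)
print(matched.tolist())
# ['1 to 10', '11 to 20', '91 to 100', '21 to 30', '31 to 40']
```

Once each date has a label, the matching column of df2 could be looked up by that label, provided df2's columns are keyed the same way.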

Resolving the "A value is trying to be set on a copy of a slice" warning

Trying to resolve the error:
application.py:25: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
application.py:26: SettingWithCopyWarning:
but I can't figure out why I'm getting this warning or how to resolve it.
This is my code:
hr = hr_data[['Month','SalesSystemCode','TITULO','BirthDate','HireDate','SupervisorEmployeeID','BASE','carallowance','Commission_Target','Area','Fulfilment %','Commission Accrued','Commission paid',
'Características (D)', 'Características (I)', 'Características (S)','Características (C)', 'Motivación (D)', 'Motivación (I)','Motivación (S)', 'Motivación (C)', 'Bajo Stress (D)',
'Bajo Stress (I)', 'Bajo Stress (S)', 'Bajo Stress (C)']]
sales = sales_data[['Report month', 'Area','Customer','Rental Charge','Cod. Motivo Desconexion','ID Vendedor']]
#report month to datetime
sales['Report month'] = pd.to_datetime(sales['Report month'])
hr['Month'] = pd.to_datetime(hr['Month'])
#remove sales where customer churned
sales_clean = sales.loc[sales['Cod. Motivo Desconexion'] == 0]
sales_clean = sales_clean[['Report month','Rental Charge','ID Vendedor']]
sales_clean2 = pd.DataFrame(sales_clean.groupby(['Report month','ID Vendedor'])['Rental Charge'].sum())
sales_clean2.reset_index(inplace=True)
hr_area = hr.loc[hr['Area'] == 'Area 1']
merged_hr = hr_area.merge(sales_clean, left_on=['SalesSystemCode','Month'],right_on=['ID Vendedor','Report month'],how='left')
#creating new features: months of employment
merged_hr['MonthsofEmploymentRounded'] = round((merged_hr['Month'] - merged_hr['HireDate'])/np.timedelta64(1,'M')).astype('int')
#filters for interaction
YEAR_MONTH = merged_hr['Month'].unique()
#css stylesheet
external_stylesheets = ['https://codepen.io/chriddyp/pen/bWLwgP.css']
app = dash.Dash(__name__, external_stylesheets=external_stylesheets)
#html layout
app.layout = html.Div(children=[
    html.H1(children='SAC Challenge Level 2 Dashboard', style ={
        'textAlign': 'center',
        'height':'10'
    }),
    html.Div(children='''
        Objective: Studying the impact of supervision on the performance of sales executives in Area 1
    '''),
    dcc.DatePickerRange(
        id='year_month',
        start_date= min(merged_hr['Month'].dt.date.tolist()),
        end_date = 'Select date'
    ),
    dcc.Graph(
        id='performancetable'
    )
])
@app.callback(dash.dependencies.Output('performancetable','figure'),
              [dash.dependencies.Input('year_month', 'start_date'),
               dash.dependencies.Input('year_month','end_date')])
def update_table(year_month):
    if year_month is None or year_month ==[]:
        year_month = YEAR_MONTH
    performance = merged_hr[(merged_hr['Month'].isin(year_month))]
    return {
        'data': [
            go.Table(
                header = dict(values=list(performance.columns),fill_color='paleturquoise',align='left'),
                cells = dict(values=[performance['Month'],performance['SalesSystemCode'],performance['TITULO'],
                    performance['HireDate'],performance['MonthsofEmploymentRounded'],performance['SupervisorEmployeeID'],
                    performance['BASE'],performance['carallowance'],performance['Commission_Target'],
                    performance['Fulfilment %'], performance['Commission Accrued'],performance['Commission paid'],
                    performance['Características (D)'],performance['Características (I)'],performance['Características (S)'],
                    performance['Características (C)'],performance['Motivación (D)'],performance['Motivación (I)'],
                    performance['Motivación (S)'],performance['Motivación (C)'],performance['Bajo Stress (D)'],
                    performance['Bajo Stress (I)'],performance['Bajo Stress (S)'],performance['Bajo Stress (C)'],
                    performance['Rental Charge']])
            )],
    }
if __name__ == '__main__':
    app.run_server(debug=True)
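For context, this warning is commonly raised when a column is assigned on a dataframe that was itself produced by selecting from another dataframe, as with sales and hr in the code above. A minimal sketch of one frequently suggested remedy, taking an explicit .copy() of the selection before assigning, is shown below. The small sales_data frame is a made-up stand-in, and whether the warning appears at all depends on the pandas version:

```python
import pandas as pd

# Hypothetical stand-in for sales_data with a few of its columns.
sales_data = pd.DataFrame({
    "Report month": ["2017-12-01", "2018-01-01"],
    "Rental Charge": [10.0, 20.0],
    "Area": ["Area 1", "Area 2"],
})

# Take an explicit copy of the column selection so that the later
# assignment clearly targets an independent dataframe.
sales = sales_data[["Report month", "Rental Charge"]].copy()
sales["Report month"] = pd.to_datetime(sales["Report month"])

print(sales.dtypes)
```

The original sales_data is left untouched, so only the copy ends up with a datetime column.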
Here is a sample of hr_data:
{'Month': {0: Timestamp('2017-12-01 00:00:00'),
1: Timestamp('2017-12-01 00:00:00'),
2: Timestamp('2017-12-01 00:00:00'),
3: Timestamp('2017-12-01 00:00:00'),
4: Timestamp('2017-12-01 00:00:00')},
'EmployeeID': {0: 91868, 1: 1812496, 2: 1812430, 3: 700915, 4: 1812581},
'PayrollProviderName': {0: 'Tele',
1: 'People',
2: 'People',
3: 'Stratego',
4: 'People'},
'SalesSystemCode': {0: 91868.0,
1: 802496.0,
2: 2430.0,
3: 700915.0,
4: 802581.0},
'Payroll Type': {0: 'Insourcing',
1: 'Third Party',
2: 'Third Party',
3: 'Third Party',
4: 'Third Party'},
'Name': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'TITULO': {0: 'SALES SUPERVISOR',
1: 'SALES EXECUTIVE',
2: 'SALES EXECUTIVE',
3: 'SALES EXECUTIVE',
4: 'SALES EXECUTIVE'},
'Sexo': {0: 'M', 1: 'F', 2: 'F', 3: 'M', 4: 'F'},
'BirthDate': {0: Timestamp('1982-11-05 00:00:00'),
1: Timestamp('1987-09-24 00:00:00'),
2: Timestamp('1981-01-13 00:00:00'),
3: Timestamp('1986-04-18 00:00:00'),
4: Timestamp('1991-06-24 00:00:00')},
'HireDate': {0: Timestamp('2012-04-23 00:00:00'),
1: Timestamp('2017-04-10 00:00:00'),
2: Timestamp('2017-03-13 00:00:00'),
3: Timestamp('2015-01-22 00:00:00'),
4: Timestamp('2017-05-18 00:00:00')},
'SupervisorEmployeeID': {0: 7935, 1: 91868, 2: 91868, 3: 91868, 4: 91868},
'SupervisorName': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'BASE': {0: 895, 1: 700, 2: 700, 3: 700, 4: 700},
'carallowance': {0: 350, 1: 250, 2: 250, 3: 250, 4: 250},
'Commission_Target': {0: 708.33, 1: 583.33, 2: 583.33, 3: 583.33, 4: 583.33},
'Nacionalidad': {0: 'INT', 1: 'INT', 2: 'INT', 3: 'INT', 4: 'INT'},
'Area': {0: 'Area 1', 1: 'Area 1', 2: 'Area 1', 3: 'Area 1', 4: 'Area 1'},
'Comment': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Sales Quota (points)': {0: 1810.0, 1: 108.0, 2: 108.0, 3: 108.0, 4: 108.0},
'Real (points)': {0: 1855.0, 1: 86.0, 2: 245.0, 3: 149.0, 4: 91.0},
'Fulfilment %': {0: 1.0248618784530388,
1: 0.7962962962962963,
2: 2.2685185185185186,
3: 1.3796296296296295,
4: 0.8425925925925926},
'Commission Accrued': {0: 708.33, 1: 583.33, 2: 583.33, 3: 583.33, 4: 583.33},
'OA Commission Accrued': {0: 653.66,
1: 87.5,
2: 1494.79,
3: 794.79,
4: 160.42},
'Clawback': {0: 0.0, 1: 24.33, 2: 144.9, 3: 36.77, 4: 0.0},
'Other Commissions': {0: 0.0, 1: 0.0, 2: 9.16, 3: 9.16, 4: 0.0},
'Commission paid': {0: 1361.99, 1: 646.51, 2: 1942.38, 3: 1350.52, 4: 743.75},
'Exit Date': {0: NaT,
1: Timestamp('2018-04-13 00:00:00'),
2: NaT,
3: NaT,
4: Timestamp('2018-08-31 00:00:00')},
'Legal Motive': {0: nan,
1: 'Artículo No. 212',
2: nan,
3: nan,
4: 'Artículo No. 212'},
'Características (D)': {0: nan, 1: 70.0, 2: 70.0, 3: 60.0, 4: 67.0},
'Características (I)': {0: nan, 1: 95.0, 2: 62.0, 3: 25.0, 4: 15.0},
'Características (S)': {0: nan, 1: 20.0, 2: 48.0, 3: 75.0, 4: 40.0},
'Características (C)': {0: nan, 1: 25.0, 2: 34.0, 3: 85.0, 4: 94.0},
'Motivación (D)': {0: nan, 1: 85.0, 2: 75.0, 3: 40.0, 4: 59.0},
'Motivación (I)': {0: nan, 1: 95.0, 2: 74.0, 3: 74.0, 4: 25.0},
'Motivación (S)': {0: nan, 1: 11.0, 2: 58.0, 3: 65.0, 4: 65.0},
'Motivación (C)': {0: nan, 1: 7.0, 2: 33.0, 3: 84.0, 4: 93.0},
'Bajo Stress (D)': {0: nan, 1: 60.0, 2: 69.0, 3: 79.0, 4: 79.0},
'Bajo Stress (I)': {0: nan, 1: 86.0, 2: 60.0, 3: 6.0, 4: 18.0},
'Bajo Stress (S)': {0: nan, 1: 40.0, 2: 60.0, 3: 89.0, 4: 30.0},
'Bajo Stress (C)': {0: nan, 1: 60.0, 2: 48.0, 3: 84.0, 4: 92.0}}
sales_data:
{'Month': {0: Timestamp('2017-07-01 00:00:00'),
1: Timestamp('2017-07-01 00:00:00'),
2: Timestamp('2017-07-01 00:00:00'),
3: Timestamp('2017-07-01 00:00:00'),
4: Timestamp('2017-07-01 00:00:00')},
'Report month': {0: '2017-07',
1: '2017-07',
2: '2017-07',
3: '2017-07',
4: '2017-07'},
'Area': {0: 'Area 1', 1: 'Area 1', 2: 'Area 1', 3: 'Area 1', 4: 'Area 1'},
'Fecha de solicitud': {0: Timestamp('2017-07-25 14:49:51'),
1: Timestamp('2017-07-25 14:56:14'),
2: Timestamp('2017-06-30 13:07:10'),
3: Timestamp('2017-07-03 18:25:17'),
4: Timestamp('2017-07-04 09:56:24')},
'Fecha de salida': {0: Timestamp('2017-07-27 13:11:42'),
1: Timestamp('2017-07-27 15:08:39'),
2: Timestamp('2017-07-04 11:50:07'),
3: Timestamp('2017-07-07 16:40:44'),
4: Timestamp('2017-07-14 14:52:45')},
'Fecha de salida final': {0: Timestamp('2017-07-28 15:13:53'),
1: Timestamp('2017-07-27 15:46:16'),
2: Timestamp('2017-07-05 10:24:46'),
3: Timestamp('2017-07-08 08:36:43'),
4: Timestamp('2017-07-15 10:00:02')},
'Fecha de proceso': {0: Timestamp('2017-08-01 00:00:00'),
1: Timestamp('2017-08-01 00:00:00'),
2: Timestamp('2017-08-01 00:00:00'),
3: Timestamp('2017-08-01 00:00:00'),
4: Timestamp('2017-08-01 00:00:00')},
'Fecha de sistema': {0: Timestamp('2017-07-25 14:49:51'),
1: Timestamp('2017-07-25 14:56:14'),
2: Timestamp('2017-06-30 13:07:10'),
3: Timestamp('2017-07-03 18:25:17'),
4: Timestamp('2017-07-04 09:56:24')},
'Fecha de completada': {0: Timestamp('2017-07-28 15:13:52'),
1: Timestamp('2017-07-27 15:46:15'),
2: Timestamp('2017-07-05 10:24:45'),
3: Timestamp('2017-07-08 08:36:42'),
4: Timestamp('2017-07-15 10:00:02')},
'Fecha de creada': {0: Timestamp('2017-07-25 14:50:00'),
1: Timestamp('2017-07-25 14:56:00'),
2: Timestamp('2017-06-30 13:07:00'),
3: Timestamp('2017-07-03 18:25:00'),
4: Timestamp('2017-07-04 09:56:00')},
'Cod. de Distribucion': {0: 2302, 1: 2302, 2: 2302, 3: 91818, 4: 2302},
'Customer': {0: 19308378, 1: 19308378, 2: 27504455, 3: 27104497, 4: 17608676},
'Cod. Tipo Cliente': {0: 'R', 1: 'R', 2: 'R', 3: 'R', 4: 'R'},
'Tipo De Cliente': {0: 'Residencial ',
1: 'Residencial ',
2: 'Residencial ',
3: 'Residencial ',
4: 'Residencial '},
'Cuenta': {0: 193083780000,
1: 193083780000,
2: 275044550000,
3: 271044970000,
4: 176086760000},
'Status Cuenta': {0: 'W', 1: 'W', 2: 'W', 3: 'W', 4: 'W'},
'Tipo de Contabilidad': {0: 'RP', 1: 'RP', 2: 'RP', 3: 'RP', 4: 'RP'},
'Desc. Tipo Contabilidad': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Tos Cat': {0: 'K', 1: 'K', 2: 'K', 3: 'K', 4: 'K'},
'Desc. Tos Cat': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Mktg Cat': {0: 990005.0, 1: 990005.0, 2: 990000.0, 3: 990000.0, 4: 990000.0},
'Desc. Mktg Cat': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Cod. Bill Sort': {0: 571.0, 1: 571.0, 2: 571.0, 3: 691.0, 4: 256.0},
'Orden de Servicio': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Comando': {0: 'PMO', 1: 'PFB', 2: 'PMO', 3: 'PMO', 4: 'PMO'},
'Desc. Comando': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Prioridad': {0: 5, 1: 5, 2: 5, 3: 5, 4: 5},
'Cod. Línea': {0: 3, 1: 2, 2: 1, 3: 1, 4: 1},
'Número de Servicio': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Producto': {0: 1420, 1: 31000, 2: 1403, 3: 1404, 4: 1404},
'Desc. Producto': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Familia': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Sub Familia': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Rental Charge': {0: 22.5,
1: 18.7125,
2: 15.257499999999999,
3: 19.95,
4: 19.95},
'Inst Charge': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0},
'Control': {0: 'CONEXIONES_COMPLETADAS_CT',
1: 'CONEXIONES_COMPLETADAS_CT',
2: 'CONEXIONES_COMPLETADAS',
3: 'CONEXIONES_COMPLETADAS',
4: 'CONEXIONES_COMPLETADAS'},
'Cod. Estatus': {0: 'A', 1: 'A', 2: 'A', 3: 'A', 4: 'A'},
'Status': {0: 'Por Acción ',
1: 'Por Acción ',
2: 'Por Acción ',
3: 'Por Acción ',
4: 'Por Acción '},
'Cod Razon Pendiente': {0: ' ', 1: ' ', 2: ' ', 3: ' ', 4: ' '},
'Razon Pendiente': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Cod. Motivo Desconexion': {0: 0, 1: 0, 2: 0, 3: 0, 4: 0},
'Motivo Desconexion': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Cod. Agencia': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Agencia': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'ID Vendedor': {0: 2352.0, 1: 2352.0, 2: 2352.0, 3: 2352.0, 4: 2352.0},
'ID Oficinista': {0: 229113.0,
1: 229113.0,
2: 224666.0,
3: 221532.0,
4: 224666.0},
'ID Acct Manager': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0},
'Desc. Acct Manager': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Provincia': {0: 'A', 1: 'A', 2: 'A', 3: 'B', 4: 'B'},
'Central': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Chrg Prod Ant': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Tipo Srv': {0: 'MO', 1: 'TI', 2: 'MO', 3: 'MO', 4: 'MO'},
'Tipo Srv Desc': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Diferencia ': {0: 2.5500000000000007,
1: 0.0,
2: 15.257499999999999,
3: 19.95,
4: 19.95},
'Puntos ': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan}}
@QuanHoang was pointing in the right direction with his comment, but you need to add .copy() to both the hr and sales dataframes:
hr = hr_data[['Month','SalesSystemCode','TITULO','BirthDate','HireDate','SupervisorEmployeeID','BASE','carallowance','Commission_Target','Area','Fulfilment %','Commission Accrued','Commission paid',
'Características (D)', 'Características (I)', 'Características (S)','Características (C)', 'Motivación (D)', 'Motivación (I)','Motivación (S)', 'Motivación (C)', 'Bajo Stress (D)',
'Bajo Stress (I)', 'Bajo Stress (S)', 'Bajo Stress (C)']].copy()
sales = sales_data[['Report month', 'Area','Customer','Rental Charge','Cod. Motivo Desconexion','ID Vendedor']].copy()
Using .copy() works because it creates a full copy of the data, rather than a view. Subsequent indexing operations work correctly on the copy.
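To make the .copy() behaviour concrete, here is a minimal sketch. The small hr_data frame and the fixed_pay column are made up for illustration (only the column names are borrowed from the question), so treat it as a toy version of the real data rather than the actual fix applied to it:

```python
import pandas as pd

# Hypothetical stand-in for hr_data, reusing a few column names from the question
hr_data = pd.DataFrame({
    'EmployeeID': [91868, 1812496, 1812430],
    'BASE': [895, 700, 700],
    'carallowance': [350, 250, 250],
    'Sexo': ['M', 'F', 'F'],
})

# Select the columns and copy them, so hr owns its own data
hr = hr_data[['EmployeeID', 'BASE', 'carallowance']].copy()

# Adding a column (fixed_pay is an invented name) modifies only hr
hr['fixed_pay'] = hr['BASE'] + hr['carallowance']

print(hr)
print(list(hr_data.columns))  # hr_data is left without a fixed_pay column
```

The same pattern applies to sales: select the columns, call .copy(), then assign new columns to the result.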
Another option is to use .loc[] indexing when you do the selection from hr_data and sales_data. This should also work:
hr = hr_data.loc[:, ['Month','SalesSystemCode','TITULO','BirthDate','HireDate','SupervisorEmployeeID','BASE','carallowance','Commission_Target','Area','Fulfilment %','Commission Accrued','Commission paid',
'Características (D)', 'Características (I)', 'Características (S)','Características (C)', 'Motivación (D)', 'Motivación (I)','Motivación (S)', 'Motivación (C)', 'Bajo Stress (D)',
'Bajo Stress (I)', 'Bajo Stress (S)', 'Bajo Stress (C)']]
sales = sales_data.loc[:, ['Report month', 'Area','Customer','Rental Charge','Cod. Motivo Desconexion','ID Vendedor']]
Note that selecting columns with .loc[] uses the format df.loc[:, [ *columns* ]] because .loc[] expects the row indexer first; the colon selects all rows.
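As far as I can tell, using .loc[] works because selecting a list of columns through .loc[] returns a new dataframe that pandas does not flag as a possible copy of a slice, so later assignments to it are not subject to the 'setting with copy' check. The result is not a view of the original dataframe; if you want that to be explicit in the code, .copy() is the clearer choice.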
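A short sketch of the df.loc[rows, columns] format may help here. The sales_data frame below is a made-up three-row stand-in using column names from the question, and area1 is just an illustrative name:

```python
import pandas as pd

# Hypothetical stand-in for sales_data
sales_data = pd.DataFrame({
    'Report month': ['2017-07', '2017-07', '2017-08'],
    'Area': ['Area 1', 'Area 1', 'Area 2'],
    'Rental Charge': [22.5, 18.7125, 19.95],
    'Provincia': ['A', 'A', 'B'],
})

cols = ['Report month', 'Area', 'Rental Charge']

# ':' in the row slot selects every row; the list selects the columns
sales = sales_data.loc[:, cols]

# The row slot can also take a boolean mask, filtering rows in the same step
area1 = sales_data.loc[sales_data['Area'] == 'Area 1', cols]

print(sales.shape)
print(area1.shape)
```

Because the row indexer always comes first, the same call can filter rows and pick columns at once, which is handy if you only need the Area 1 records.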
