Calculate stoppage time between start and stop events with python

I have a dataframe with a number of serial numbers; each serial number has 'START' and 'STOP' events. I want to calculate the stoppage time for each serial number for each day. There can be multiple start/stop events in a day, but I need the cumulative stoppage time (STOP - START). How can I do this with Python? I've attached an image with a glimpse of the data.
This is what I have for one serial number, but how do I write it for all serial numbers?
import pandas as pd
import numpy as np

dfout = pd.DataFrame()
dfout['EventType'] = UptimeS['EventType']
dfout['EventStartTime'] = UptimeS['EventDate']
dfout['SerialNumber'] = UptimeS['SerialNumber']
# mark rows where the event type changes (collapses repeated START/STOP rows)
dfout['change'] = np.where(UptimeS['EventType'] != UptimeS['EventType'].shift(), 1, 0)
dfout = dfout.loc[dfout['change'] != 0, :]
# the end of each event is the start of the next one
dfout['EventEndTime'] = dfout['EventStartTime'].shift(-1)
dfout = dfout.loc[dfout['EventType'] == 'STOP_Op']
dfout['CommDownTime'] = pd.to_datetime(dfout['EventEndTime']) - pd.to_datetime(dfout['EventStartTime'])
dfout = dfout.groupby(['SerialNumber'])['CommDownTime'].sum()
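To do this for every serial number at once, a grouped version of the same idea might look like the sketch below. It is only a sketch: it assumes the frame is called UptimeS with the columns shown above (SerialNumber, EventType with 'STOP_Op' for stops, EventDate), that START and STOP alternate per serial number (apply the 'change' filter above first if they do not), and that a stoppage is credited to the calendar day on which it starts.
import pandas as pd

df = UptimeS.copy()
df['EventDate'] = pd.to_datetime(df['EventDate'])
df = df.sort_values(['SerialNumber', 'EventDate'])

# the end of an event is the start of the next event for the *same* serial number
df['NextEventTime'] = df.groupby('SerialNumber')['EventDate'].shift(-1)

# downtime runs from each STOP until the following event
stops = df[df['EventType'] == 'STOP_Op'].copy()
stops['DownTime'] = stops['NextEventTime'] - stops['EventDate']

# cumulative stoppage per serial number per day
daily = (stops
         .groupby(['SerialNumber', stops['EventDate'].dt.date])['DownTime']
         .sum())
print(daily)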

Related

Finding the delta of two unmatched dataframes in pandas [closed]

I have two DataFrames with readings taken at two different times:
DF1
Sensor ID Reference Pressure Sensor Pressure
0 013677 100.15 93.18
1 013688 101.10 95.23
2 013699 100.87 93.77
... ... ... ...
And
DF2
Sensor ID Reference Pressure Sensor Pressure
0 013688 120.01 119.43
1 013677 118.93 118.88
2 013699 120.05 118.85
... ... ... ...
What would be the optimal way of creating a third DataFrame that contains the difference between those readings, given that the order of the "Sensor ID" values does not match between the two DataFrames?
Pandas has this beautiful feature where it automatically aligns on indices. So we can use that to solve your problem:
df1.set_index("Sensor ID").sub(df2.set_index("Sensor ID"))
Reference Pressure Sensor Pressure
Sensor ID
13677 -18.78 -25.70
13688 -18.91 -24.20
13699 -19.18 -25.08
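For completeness, here is a self-contained version built from the sample readings above (keeping "Sensor ID" as a string to preserve the leading zero, which is an assumption on my part):
import pandas as pd

df1 = pd.DataFrame({
    'Sensor ID': ['013677', '013688', '013699'],
    'Reference Pressure': [100.15, 101.10, 100.87],
    'Sensor Pressure': [93.18, 95.23, 93.77],
})
df2 = pd.DataFrame({
    'Sensor ID': ['013688', '013677', '013699'],
    'Reference Pressure': [120.01, 118.93, 120.05],
    'Sensor Pressure': [119.43, 118.88, 118.85],
})

# setting "Sensor ID" as the index lets pandas align the rows before
# subtracting, so the differing row order between the frames does not matter
diff = df1.set_index('Sensor ID').sub(df2.set_index('Sensor ID'))
print(diff)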

how to get Unique count from a DataFrame in case of duplicate index [closed]

I am working on a dataframe (the data is in the attached image).
Q. I want the number of shows released per year, but if I apply the count() function it gives me 6 instead of 3. Could anyone suggest how to get the correct count?
To get the unique count for a single year, you can use
count = len(df.loc[df['release_year'] == 1945, 'show_id'].unique())
# or
count = df.loc[df['release_year'] == 1945, 'show_id'].nunique()
To summarize the unique counts per year for the whole dataframe, you can drop_duplicates() on the show_id column first.
df.drop_duplicates(subset=['show_id']).groupby('release_year').count()
Or use value_counts() on the column after dropping the duplicates.
df.drop_duplicates(subset=['show_id'])['release_year'].value_counts()
df.groupby('release_year')['show_id'].nunique()
should do the job (nunique() already returns the distinct count, so there is nothing to call count() on).
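A small self-contained illustration with made-up data (one row per cast member, say, so each show_id repeats):
import pandas as pd

# hypothetical data: each show appears on several rows
df = pd.DataFrame({
    'show_id':      ['s1', 's1', 's2', 's2', 's2', 's3'],
    'release_year': [1945, 1945, 1945, 1945, 1945, 1946],
})

# unique shows per release year, ignoring the duplicated rows
print(df.groupby('release_year')['show_id'].nunique())
# release_year
# 1945    2
# 1946    1
# Name: show_id, dtype: int64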

How do I remove square brackets from data that is saved as a list in a DataFrame in Python [closed]

The data is like this and it is in a data frame.
PatientId Payor
0 PAT10000 [Cash, Britam]
1 PAT10001 [Madison, Cash]
2 PAT10002 [Cash]
3 PAT10003 [Cash, Madison, Resolution]
4 PAT10004 [CIC Corporate, Cash]
I want to remove the square brackets and filter all patients who used a certain mode of payment, e.g. Madison, and then obtain their IDs. Please help.
This will generate a list of tuples (id, payor); df is the dataframe. It assumes Payor holds strings such as "[Cash, Britam]", so the [1:-1] slice strips the brackets.
payment = 'Madison'
ids = [(pid, df.Payor[i][1:-1]) for i, pid in enumerate(df.PatientId) if payment in df.Payor[i]]
Let's say your data frame variable is initialized as "df", and after removing the square brackets you want to filter all rows containing "Madison" in the "Payor" column:
# escape the brackets (they are regex metacharacters) and assign the result back; replace() is not in-place
df = df.replace({r'\[': '', r'\]': ''}, regex=True)
filteredDf = df.loc[df['Payor'].str.contains("Madison")]
print(filteredDf)
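If the Payor column holds actual Python lists rather than strings that merely look like lists, the string-based replace above will not apply. A list-aware sketch, built from the sample rows in the question with list values assumed, could be:
import pandas as pd

df = pd.DataFrame({
    'PatientId': ['PAT10000', 'PAT10001', 'PAT10002', 'PAT10003', 'PAT10004'],
    'Payor': [['Cash', 'Britam'], ['Madison', 'Cash'], ['Cash'],
              ['Cash', 'Madison', 'Resolution'], ['CIC Corporate', 'Cash']],
})

payment = 'Madison'
# keep rows whose Payor list contains the payment mode, then take the IDs
ids = df.loc[df['Payor'].apply(lambda payors: payment in payors), 'PatientId'].tolist()
print(ids)  # ['PAT10001', 'PAT10003']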

How to fill color in particular cell in dataframe? [closed]

I have created a DataFrame which has a column Status.
This column can have two values: Success and Failed.
I want to fill a color in all rows of this column that have the value Failed.
How can I implement this?
example:
Sample dataframe is given below:
Master Job Name Status
Settlement_limit Success
Settlement_Trans **Failed**
Ix_rm_bridge Success
Unit_test **Failed**
You can use Styler.applymap to apply style functions elementwise:
import pandas as pd

data = {'Master Job Name': ['Settlement_limit', 'Settlement_Trans', 'Ix_rm_bridge', 'Unit_test'],
        'Status': ['Success', 'Failed', 'Success', 'Failed']}
df = pd.DataFrame(data)

# style function: color the font of 'Failed' cells red
def fill_color(val):
    return 'color: red' if val == 'Failed' else ''

df = df.style.applymap(fill_color, subset=['Status'])
# the styled sheet is saved to the present working directory; open it to view the changes
df.to_excel('mysheet.xlsx')
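Since the question asks about filling the cell rather than the font, a background-colour variant is a one-line change (same data as above; writing the styled frame to Excel needs openpyxl installed):
import pandas as pd

data = {'Master Job Name': ['Settlement_limit', 'Settlement_Trans', 'Ix_rm_bridge', 'Unit_test'],
        'Status': ['Success', 'Failed', 'Success', 'Failed']}
df = pd.DataFrame(data)

def highlight_failed(val):
    # fill the cell background instead of changing the font colour
    return 'background-color: red' if val == 'Failed' else ''

# on pandas >= 2.1, Styler.map is the non-deprecated name for Styler.applymap
styled = df.style.applymap(highlight_failed, subset=['Status'])
styled.to_excel('mysheet.xlsx')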

creating date range on csv using python-pandas [closed]

How to create a date range in python using pandas in Y-M-D?
import pandas as pd

df = pd.DataFrame([['2015-07-07', '2016-09-22'],
                   ['2012-02-03', '2013-02-19'],
                   ['2013-02-17', '2013-03-22']],
                  columns=['start', 'end'])
# change the strings to datetime format
df['start'] = pd.to_datetime(df['start'])
df['end'] = pd.to_datetime(df['end'])
df['range'] = df['end'] - df['start']
df
Output should be:
start end range
0 2015-07-07 2016-09-22 443 days
1 2012-02-03 2013-02-19 382 days
2 2013-02-17 2013-03-22 33 days
In case you want to read from csv, switch the beginning to:
df = pd.read_csv('file_name.csv')
In case you want a concatenated column:
# str(z)[:-9] drops the trailing ' 00:00:00' from the Timedelta, leaving e.g. '443 days'
df['details'] = [str(x)+' - '+str(y)+' has '+str(z)[:-9] for x,y,z in zip(df['start'],df['end'],df['range'])]
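An equivalent without the string slicing, using the .dt accessors on the same columns (just a sketch of an alternative):
# build the same text from the datetime/timedelta accessors instead of str(z)[:-9]
df['details'] = (df['start'].dt.strftime('%Y-%m-%d') + ' - '
                 + df['end'].dt.strftime('%Y-%m-%d') + ' has '
                 + df['range'].dt.days.astype(str) + ' days')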
