How to add minutes to datetime64 object in a Pandas Dataframe [closed] - python

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 3 years ago.
I want to add a list of minutes to a datetime64 column and store the result in a new DataFrame column.
I tried using datetime.timedelta(minutes=x) in a for loop, but it adds the same constant value to all of my rows. How do I resolve this?
for x in wait_min:
    data['New_datetime'] = data['Date'] + datetime.timedelta(minutes=x)
I expect to iterate through the list and add corresponding minutes, but this is adding a constant value of 16 minutes to each row.

Let us try
data['Date'] + pd.to_timedelta(wait_min, unit='m')
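To see why the loop in the question ends up with one constant offset, and how the one-liner above avoids it, here is a minimal sketch. The sample dates and the values in wait_min are made up for illustration, since the question does not show its data.

```python
import pandas as pd

# Hypothetical sample data; the original post does not show its frame.
data = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2021-01-01 10:00", "2021-01-01 11:00", "2021-01-01 12:00"]
    )
})
wait_min = [5, 10, 16]  # one entry per row

# Each pass overwrites the whole column, so only the last offset survives.
for x in wait_min:
    data["Looped"] = data["Date"] + pd.Timedelta(minutes=x)

# Converting the list once gives one offset per row.
data["New_datetime"] = data["Date"] + pd.to_timedelta(wait_min, unit="m")

print(data)
```

The Looped column ends up with the last list value added to every row, which matches the constant 16 minutes described in the question.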

pandas adds two Series element-wise, matching rows by index label. All you need to do is create a Series of timedelta objects that shares the dataframe's index.
So if wait_min is a list of minutes of length equal to the number of rows in your dataframe, this will do:
data['New_datetime'] = data['Date'] + pd.Series([datetime.timedelta(minutes=x) for x in wait_min], index=data.index)
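One thing to watch with this approach is that pandas pairs Series by index label rather than by position. A small sketch of that behaviour, assuming a made-up frame whose index is not 0..n-1 (for example after filtering):

```python
import datetime
import pandas as pd

# Hypothetical frame with a non-default index.
data = pd.DataFrame(
    {"Date": pd.to_datetime(["2021-01-01 10:00", "2021-01-01 11:00"])},
    index=[10, 20],
)
wait_min = [5, 10]

deltas = [datetime.timedelta(minutes=x) for x in wait_min]

# The default index (0, 1) shares no labels with (10, 20): every sum is NaT.
misaligned = data["Date"] + pd.Series(deltas)

# Reusing the frame's index pairs each offset with its row.
data["New_datetime"] = data["Date"] + pd.Series(deltas, index=data.index)

print(misaligned)
print(data)
```

If the frame already has the default RangeIndex, both forms give the same result.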

The following changes worked for me:
for i, x in enumerate(wait_min):
    data.loc[data.index[i], 'New_Datetime'] = data['Date'].iloc[i] + datetime.timedelta(minutes=x)
It might not be the best solution, but it works for what I was trying to do.

Related

how to get Unique count from a DataFrame in case of duplicate index [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 1 year ago.
I am working on a dataframe. Data in the image
Q. I want the number of shows released per year, but when I apply the count() function it gives me 6 instead of 3. Could anyone suggest how to get the correct count?
To get the number of unique shows for a single year, you can use
count = len(df.loc[df['release_year'] == 1945, 'show_id'].unique())
# or
count = df.loc[df['release_year'] == 1945, 'show_id'].nunique()
To count unique shows per year across the whole dataframe, you can call drop_duplicates() on the show_id column first.
df.drop_duplicates(subset=['show_id']).groupby('release_year').count()
Or use value_counts() on the column after dropping duplicates.
df.drop_duplicates(subset=['show_id'])['release_year'].value_counts()
df.groupby('release_year')['show_id'].nunique()
should do the job.
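As a rough check of the difference between counting rows and counting unique shows, here is a small sketch. The data is invented to mirror the 6 versus 3 described in the question, because the original image is not available.

```python
import pandas as pd

# Hypothetical data: each show appears twice, and the index has duplicates.
df = pd.DataFrame(
    {
        "show_id": ["s1", "s1", "s2", "s2", "s3", "s3"],
        "release_year": [1945] * 6,
    },
    index=[0, 0, 1, 1, 2, 2],
)

# count() counts rows, so duplicated shows are counted more than once.
rows_per_year = df.groupby("release_year")["show_id"].count()

# nunique() counts each distinct show_id once per year.
shows_per_year = df.groupby("release_year")["show_id"].nunique()

print(rows_per_year)
print(shows_per_year)
```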

how to add 1 to a column in dataframe range [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 2 years ago.
I want a range of values built from two columns, From and To. The number in the To column should be included in the range, so I am adding 1 to it, as shown below:
df.apply(lambda x : range(x['From'],x['To']+1),1)
df.apply(lambda x : ','.join(map(str, range(x['From'],x['To']))),1)
I need output something like this: if the From value is 5 and the To value is 11, my output should be
5,6,7,8,9,10,11
but I am getting values only up to 10, even though I added 1 to the end of the range.
df:
----
From To
15887 16251
15888 16252
15889 16253
15890 16254
The range should be written to a new column.
Try this:
df=pd.DataFrame({'From':[15887,15888,15889,15890],'To':[16251,16252,16253,16254]})
df['Range']=[list(range(i,k+1)) for i,k in zip(df['From'],df['To'])]
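Note that the second apply call in the question leaves out the +1, which would explain why the joined string stops at 10. If the comma-separated string is what is wanted rather than a list, a sketch along these lines may help. The first row reuses the 5 to 11 example from the question and the second row is a shortened, made-up range.

```python
import pandas as pd

df = pd.DataFrame({"From": [5, 15887], "To": [11, 15890]})

# range() excludes its stop value, so add 1 to include the "To" number.
df["Range"] = [
    ",".join(str(n) for n in range(start, stop + 1))
    for start, stop in zip(df["From"], df["To"])
]

print(df["Range"].tolist())
```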

How to bypass this error to make the column zero if it is greater than the said date? " TypeError: invalid type promotion " [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 3 years ago.
df_main['month_1']= np.where(df_main['month_1'] >='2020-02-01',0,df_main['month_1'])
I need all the items in the month_1 column to be zero if the date is on or after February 1st, 2020.
I tried '02/01/2020' format as well, which doesn't work.
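Since your column holds timestamps, compare it against a timestamp value such as pd.Timestamp(2020, 2, 1) rather than a string. The TypeError itself most likely comes from np.where trying to combine the integer 0 and datetime64 values in one array, so the options below avoid np.where.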
you can use pandas.Series.map:
df_main['month_1'] = df_main['month_1'].map(lambda x: 0 if x >= pd.Timestamp(2020, 2,1) else x)
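or you can filter and assign with .loc, casting to object first so the column can hold both timestamps and 0:
df_main['month_1'] = df_main['month_1'].astype(object)
df_main.loc[df_main['month_1'] >= pd.Timestamp(2020, 2, 1), 'month_1'] = 0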
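A minimal runnable sketch of the map approach, using made-up dates since the question does not show df_main. The name cutoff is introduced here only for readability.

```python
import pandas as pd

# Hypothetical data; the post does not show df_main.
df_main = pd.DataFrame(
    {"month_1": pd.to_datetime(["2020-01-15", "2020-02-01", "2020-03-10"])}
)
cutoff = pd.Timestamp(2020, 2, 1)

# map returns an object-dtype Series, which can hold both timestamps and 0.
df_main["month_1"] = df_main["month_1"].map(lambda x: 0 if x >= cutoff else x)

print(df_main)
```

The column is no longer datetime64 afterwards, so later date operations on it would need the zeros handled first.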

User entry as column names in pandas [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 3 years ago.
The user inputs two lists, one of sizes and one of minutes. For example, they can input sizes 111, 121 and minutes 5, 10, 15.
I want the dataframe to have columns named by size and minute. (I used a for loop to extract each size and minute.) For example, I want the columns to say 111,5; 111,10; 111,15, etc. I tried df[size+minute]=values (values is the data I want to put into each column), but the column name is the two numbers added together, so I got 116 instead of 111,5.
If you have two lists:
l = [111,121]
l2 = [5,10,15]
Then you can use list comprehension to form your column names:
col_names = [str(x)+';'+str(y) for x in l for y in l2]
print(col_names)
['111;5', '111;10', '111;15', '121;5', '121;10', '121;15']
And create a dataframe with these column names using pandas.DataFrame:
df = pd.DataFrame(columns=col_names)
If we add a row of data:
row = pd.DataFrame([[1,2,3,4,5,6]])
row.columns = col_names
df = pd.concat([df, row])
We can see that our dataframe looks like this:
print(df)
  111;5 111;10 111;15 121;5 121;10 121;15
0     1      2      3     4      5      6
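If the comma separator from the question is preferred, the same idea can be sketched with itertools.product and an f-string. The variable names sizes and minutes and the row of values are assumptions made for illustration.

```python
from itertools import product

import pandas as pd

sizes = [111, 121]
minutes = [5, 10, 15]

# Build "size,minute" labels as strings so the numbers are not added together.
col_names = [f"{size},{minute}" for size, minute in product(sizes, minutes)]

# Hypothetical values, one per column.
df = pd.DataFrame([[1, 2, 3, 4, 5, 6]], columns=col_names)

print(df)
```

Building the labels as strings is what avoids the 116 seen in the question, where the two integers were summed.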

Pandas new column with calculation based on other existing column [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I have a pandas DataFrame and want to do a calculation based on an existing column.
However, the apply function is not working for some reason.
Let's say it is something like
df = pd.DataFrame({'Age': age, 'Input': input})
and the input column is something like [1.10001, 1.49999, 1.60001]
Now I want to add a new column to the Dataframe, that is doing the following:
Add 0.0001 to each element in column
Multiply each value by 10
Transform each value of new column to int
Use Series.add, Series.mul and Series.astype:
# input is a Python builtin, so it is better not to use it as a variable name
inp = [1.10001, 1.49999, 1.60001]
age = [10,20,30]
df = pd.DataFrame({'Age': age, 'Input': inp})
df['new'] = df['Input'].add(0.0001).mul(10).astype(int)
print (df)
   Age    Input  new
0   10  1.10001   11
1   20  1.49999   15
2   30  1.60001   16
You could also write a simple function and apply it row by row.
def f(row):
    return int((row['Input'] + 0.0001) * 10)
df['new'] = df.apply(f, axis=1)
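As a quick sanity check that the vectorized and row-wise versions agree, here is a small sketch using the sample values from the answer above. The names vectorized and row_wise are introduced only for the comparison.

```python
import pandas as pd

df = pd.DataFrame({"Age": [10, 20, 30], "Input": [1.10001, 1.49999, 1.60001]})

# Vectorized: operate on the whole column at once.
vectorized = ((df["Input"] + 0.0001) * 10).astype(int)

# Row-wise: call a Python function once per row (usually slower on large frames).
row_wise = df.apply(lambda row: int((row["Input"] + 0.0001) * 10), axis=1)

print(vectorized.tolist())
print(row_wise.tolist())
```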
