how to add 1 to a column in dataframe range [closed] - python

I want a range of values from the two columns From and To. Since the number in the To column should be included in the range, I'm adding 1 to it, as shown below:
df.apply(lambda x : range(x['From'],x['To']+1),1)
df.apply(lambda x : ','.join(map(str, range(x['From'],x['To']))),1)
I need output something like this: if the From value is 5 and the To value is 11, my output should be
5,6,7,8,9,10,11
but I'm only getting up to 10, even though I added +1 to the end of the range.
df:
----
From To
15887 16251
15888 16252
15889 16253
15890 16254
and the range should be written to a new column.

Try this:
df=pd.DataFrame({'From':[15887,15888,15889,15890],'To':[16251,16252,16253,16254]})
df['Range']=[list(range(i,k+1)) for i,k in zip(df['From'],df['To'])]
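If you want the comma-separated string shown in the question rather than a list, a minimal sketch (assuming the same From/To columns) joins the values while building the range. Note the +1: range() excludes its stop value, which is why the second apply() in the question stopped at 10.
df['Range'] = [','.join(map(str, range(i, k + 1))) for i, k in zip(df['From'], df['To'])]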

Related

How to get the City from address column in Python Pandas [closed]

I am trying to get the city from the Purchase Address column; my code is shown in the screenshot below. When I try [0] or [-1] I can get the street address or the state/zip, but when I try [1] it raises the error: index out of range. Can anyone help solve this problem?
Getting the street address works; trying [1] to get the city, which sits in the middle of the address, raises the error.
[screenshot of the code and the error]
Example
We need a minimal, reproducible example to answer; please post text or code rather than an image.
df = pd.DataFrame(['a,B,c', 'a,C,b', 'd'], columns=['col1'])
df
col1
0 a,B,c
1 a,C,b
2 d
Code
Your code:
df['col1'].apply(lambda x: x.split(',')[1])
IndexError: list index out of range
Try the following code instead:
out = df['col1'].str.split(',').str[1]
out
0 B
1 C
2 NaN
Name: col1, dtype: object
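The .str accessor returns NaN instead of raising when an element has too few comma-separated parts, which is why this works where the plain apply failed. A minimal sketch applied to address-like data (the sample addresses below are made up, since the original column was only shown as an image):
import pandas as pd
df = pd.DataFrame({'Purchase Address': ['917 1st St, Dallas, TX 75001',
                                        '682 Chestnut St, Boston, MA 02215']})
# The second comma-separated field is the city; strip the leading space.
df['City'] = df['Purchase Address'].str.split(',').str[1].str.strip()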

How to find which doctor a patient is using, when only given a list of doctor's patients? (code improvement request) [closed]

I need to create a dataframe which lists all patients and their matching doctors.
I have a txt file with doctor/patient records organized in the following format:
Doctor_1: patient23423,patient292837,patient1232423...
Doctor_2: patient456785,patient25363,patient23425665...
And a list of all unique patients.
To do this, I imported the txt file into a doctorsDF dataframe, separated by a colon. I also created a patientsDF dataframe with two columns: 'Patient', filled from the patient list, and an empty 'Doctor' column.
I then ran the following:
for pat in patientsDF['Patient']:
    for i, doc in enumerate(doctorsDF[1]):
        if doctorsDF[1][i].find(str(pat)) >= 0:
            patientsDF['Doctor'][i] = doctorsDF.loc[i, 0]
        else:
            continue
This worked fine, and now all patients are matched with the doctors, but the method seems clumsy. Is there any function that can more cleanly achieve the result? Thanks!
(First StackOverflow post here. Sorry if this is a newb question!)
If you use Pandas, try:
df = pd.read_csv('data.txt', sep=':', header=None, names=['Doctor', 'Patient'])
df = df[['Doctor']].join(df['Patient'].str.strip().str.split(',')
.explode()).reset_index(drop=True)
Output:
>>> df
Doctor Patient
0 Doctor_1 patient23423
1 Doctor_1 patient292837
2 Doctor_1 patient1232423
3 Doctor_2 patient456785
4 Doctor_2 patient25363
5 Doctor_2 patient23425665
How to search:
>>> df.loc[df['Patient'] == 'patient25363', 'Doctor'].squeeze()
'Doctor_2'
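To fill the asker's patientsDF directly, one option is a left merge onto the exploded frame; a minimal sketch, assuming patientsDF has a 'Patient' column and an empty 'Doctor' column as described in the question, and that the patient IDs in both frames match exactly:
# Drop the empty column, then fill it via a left join so every patient
# keeps a row even if no doctor lists them.
patientsDF = patientsDF.drop(columns='Doctor').merge(df, on='Patient', how='left')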

how to get Unique count from a DataFrame in case of duplicate index [closed]

I am working on a dataframe (data shown in the image). I want the number of shows released per year, but when I apply count(), it gives me 6 instead of 3. Could anyone suggest how to get the correct count?
To get the unique count for a single year, you can use
count = len(df.loc[df['release_year'] == 1945, 'show_id'].unique())
# or
count = df.loc[df['release_year'] == 1945, 'show_id'].nunique()
To summarize the unique counts for the whole dataframe by year, you can call drop_duplicates() on the show_id column first.
df.drop_duplicates(subset=['show_id']).groupby('release_year').count()
Or use value_counts() on release_year after dropping the duplicates.
df.drop_duplicates(subset=['show_id'])['release_year'].value_counts()
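A small self-contained demo of the difference (the show_id and release_year values below are made up, since the original data was only posted as an image):
import pandas as pd
df = pd.DataFrame({'show_id': ['s1', 's1', 's2', 's2', 's3', 's3'],
                   'release_year': [1945] * 6})
# count() counts rows, duplicates included -> 6
print(df.loc[df['release_year'] == 1945, 'show_id'].count())
# nunique() counts distinct shows -> 3
print(df.loc[df['release_year'] == 1945, 'show_id'].nunique())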
df.groupby('release_year')['show_id'].nunique()
should do the job, giving the number of distinct shows per year.

Pandas: slicing the dataframe using index values [closed]

Pandas: I have a dataframe, shown below, which contains the same set of banks twice. I need to slice the data from the 0th index, which contains a bank name, up to the index that contains the same bank name again (here, DEUTSCH BANK AG), and I need to apply the same logic to any dataframe of this kind.
I tried df25.iloc[0,1]==df25[1].any(), but it only returns True, not the index position.
DataFrame: [1]: https://i.stack.imgur.com/iJ1hJ.png, https://i.stack.imgur.com/J2aDX.png
You need to get the indices of all the rows that contain the value you are looking for (in this case the bank name) and then slice the data frame using those indices.
Example:
df = pd.DataFrame({'Col1':list('abcdeafgbfhi')})
search_str = 'b'
idx_list = list(df[(df['Col1']==search_str)].index.values)
print(df[idx_list[0]:idx_list[1]])
Output:
Col1
1 b
2 c
3 d
4 e
5 a
6 f
7 g
Note that this assumes there are only two rows with the same value. If there are more, you will have to work with the index list values to get what you need. Hope this helps.
Keep in mind that posting a sample data set will always help you get more answers; people tend to move on to another question when they see images or screenshots, because reproducing the issue takes extra steps.
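Applied to the question's setup, where the value to match is whatever sits in row 0, a hedged sketch (df25 and the bank-name column label 1 are taken from the question's df25[1]):
import numpy as np
bank = df25.iloc[0, 1]                             # bank name in the first row
hits = np.flatnonzero(df25[1].to_numpy() == bank)  # positions of every occurrence
sliced = df25.iloc[hits[0]:hits[1]]                # first occurrence up to (not including) the second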

How to add minutes to datetime64 object in a Pandas Dataframe [closed]

I want to add a list of minutes to a datetime64 column and store the result in a new df column.
I tried using datetime.timedelta(minutes=x) in a for loop, but it adds a constant value to all of my rows. How do I resolve this?
for x in wait_min:
    data['New_datetime'] = data['Date'] + datetime.timedelta(minutes=x)
I expected to iterate through the list and add the corresponding minutes to each row, but this adds a constant value of 16 minutes to every row.
Let us try
data['Date'] + pd.to_timedelta(wait_min, unit='m')
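A small usage sketch with made-up data (the Date column, the New_datetime target and the wait_min list come from the question; the sample timestamps are assumptions):
import pandas as pd
data = pd.DataFrame({'Date': pd.to_datetime(['2023-01-01 10:00',
                                             '2023-01-01 11:00',
                                             '2023-01-01 12:00'])})
wait_min = [5, 10, 16]   # one offset per row
# to_timedelta builds one timedelta per element, so each row gets its own
# offset instead of the last value assigned by the loop.
data['New_datetime'] = data['Date'] + pd.to_timedelta(wait_min, unit='m')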
pandas adds two Series element-wise when their indexes align (with a default RangeIndex that simply means equal length). All you need to do is create a Series of timedelta objects.
So if wait_min is a list of minutes whose length equals the number of rows in your dataframe, this will do:
data['New_datetime'] = data['Date'] + pd.Series([datetime.timedelta(minutes=x) for x in wait_min])
The following changes worked for me:
for i, x in enumerate(wait_min):
    data['New_Datetime'].iloc[i] = data['Date'].iloc[i] + datetime.timedelta(minutes=x)
It might not be the best solution, but it works for what I was trying to do.
