I want to lowercase the data taken from a pandas column, strip all spaces, and then test for equality.
df['ColumnA'].loc[lambda x: x.lower().replace(" ", "") == var_name]
Code is above.
It says a pandas Series has no lower method. But I need to search the data in column A with pandas while lowercasing all letters and removing whitespace.
How can I achieve this in pandas?
In your lambda function, x is a Series, not a string, so you have to use the str accessor for both lower and replace:
df['ColumnA'].loc[lambda x: x.str.lower().str.replace(" ", "") == var_name]
Another way:
df.loc[df['ColumnA'].str.lower().str.replace(' ', '') == var_name, 'ColumnA']
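A minimal sketch of the approach above on made-up data. The sample values and the contents of var_name are assumptions for illustration only; missing values simply do not match.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({"ColumnA": ["Foo Bar", " foobar ", "Baz", None]})
var_name = "foobar"

# Lowercase, then remove every space (regex=False treats " " literally).
normalized = df["ColumnA"].str.lower().str.replace(" ", "", regex=False)

# Missing values compare as False, so they are filtered out.
mask = normalized == var_name
matches = df.loc[mask, "ColumnA"]
print(matches)
```

The original, un-normalized values are returned because the mask is applied to the untouched column.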
I would like to extract the text inside the range "text: ....." from this dataframe and create another column with that value.
This is my Pandas Dataframe
issues_df['new_column'] = issues_df['fields.description.content'].apply(lambda x: x['text'])
However, it returns the following error:
issues_df['new_column'] = issues_df['fields.description.content'].apply(lambda x: x['text'])
TypeError: Object 'float' is not writable.
Any suggestions?
Thanks in advance.
The problem is the NaN values in the column. You can try the .str accessor, which leaves NaN as NaN:
issues_df['new_column'] = issues_df['fields.description.content'].str[0].str['content'].str[0].str['text']
That could be a good task for the rather efficient json_normalize:
df['new_column'] = pd.json_normalize(
df['fields.description.content'], 'content'
)['text']
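If the nesting is uneven, a plain helper applied per cell is another option. This is only a sketch: the nested shape (a list of dicts, each holding a 'content' list of dicts with a 'text' key) is inferred from the accessor chain above, and first_text is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

def first_text(cell):
    """Return cell[0]['content'][0]['text'], or NaN if any level is missing."""
    try:
        return cell[0]["content"][0]["text"]
    except (TypeError, IndexError, KeyError):
        # A float NaN is not subscriptable (TypeError); short or
        # incomplete structures raise IndexError or KeyError.
        return np.nan

# Hypothetical sample data mimicking the assumed structure.
issues_df = pd.DataFrame({
    "fields.description.content": [
        [{"content": [{"text": "hello"}]}],
        np.nan,
        [{"content": []}],
    ]
})

issues_df["new_column"] = issues_df["fields.description.content"].apply(first_text)
print(issues_df["new_column"])
```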
I'm searching a DF using:
df.loc[df['Ticker'] == 'ibm']
The problem is that df['Ticker'] is formatted with another value after it (for example 'ibm US').
Normally, for a string, I can do something like .split(" ")[0] to find the match, but it doesn't work for my pandas search above (df.loc[df['Ticker'].split(" ")[0] == 'ibm'] fails with AttributeError: 'Series' object has no attribute 'split').
What can I do to achieve my goal?
Are you looking for str.contains?
new_df = df[df['Ticker'].str.contains(r'ibm', case=False)]
which will create a new dataframe from the rows whose 'Ticker' column contains 'ibm'.
You can match several alternatives with | (regex or) and case=False (case insensitive) in str.contains:
new_df = df[df['Ticker'].str.contains(r'ibm|msft|google|..', case=False)]
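Note that str.contains matches substrings, so it would also pick up a ticker such as 'ibmx US'. A sketch of the vectorised equivalent of split(" ")[0], on made-up tickers, for comparison:

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({"Ticker": ["ibm US", "IBM UK", "msft US", "ibmx US"]})

# Exact match on the first whitespace-separated token, case-insensitive.
first_token = df["Ticker"].str.split(" ").str[0].str.lower()
exact = df.loc[first_token == "ibm"]

# Substring match: also returns 'ibmx US'.
loose = df[df["Ticker"].str.contains("ibm", case=False)]

print(exact)
print(loose)
```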
I am trying to write a program that lets the user enter a column, sort the column, and replace a cell with other entered information, but I get an error, probably a syntax error.
I searched but could not find a solution.
import pandas as pd
data = pd.read_csv('List')
df = pd.DataFrame(data, columns = ['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O'])
findL = ['example']
replaceL = ['convert']
col = 'C'
df[col] = df[col].replace(findL, replaceL)
TypeError: Cannot compare types 'ndarray(dtype=float64)' and 'str'
It seems that your df[col] and findL and replaceL do not have the same datatype. Try running df[col] = df[col].astype(str) before you run df[col] = df[col].replace(findL, replaceL) and it should work.
If the column(s) you are dealing with have blank entries, set the na_filter parameter of .read_csv() to False.
That way, blank entries are read as empty strings instead of NaN, so the column holds strings throughout.
Calling .replace() then does not raise a TypeError, because both sides of the comparison are strings rather than 'ndarray(dtype=float64)' and 'str'.
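A small sketch of the na_filter suggestion, using an in-memory CSV instead of the 'List' file from the question. The CSV contents are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV with a blank entry in column C.
csv_text = "A,C\n1,example\n2,\n3,other\n"

# With na_filter=False the blank cell is read as '' rather than NaN.
df = pd.read_csv(io.StringIO(csv_text), na_filter=False)

findL = ["example"]
replaceL = ["convert"]
col = "C"

df[col] = df[col].replace(findL, replaceL)
print(df)
```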
I'm trying to convert object to string in my dataframe using pandas.
Having following data:
particulars
NWCLG 545627 ASDASD KJKJKJ ASDASD
TGS/ASDWWR42045645010009 2897/SDFSDFGHGWEWER
dtype:object
While trying to convert the particulars column from object to string using astype() (with the str, |S, |S32, and |S80 types) or directly using str functions, it does not convert to string (it remains object), and for str methods (replacing '/' with ' ') it says AttributeError: 'DataFrame' object has no attribute 'str'.
I am using pandas 0.23.4.
I also referred to: https://github.com/pandas-dev/pandas/issues/18796
Use astype('string') instead of astype(str) (the dedicated string dtype requires pandas 1.0 or later):
df['column'] = df['column'].astype('string')
You could read the excel specifying the dtype as str:
df = pd.read_excel("Excelfile.xlsx", dtype=str)
then use str.replace on the particulars column as below:
df['particulars'] = df['particulars'].str.replace('/', ' ')
Notice that df['particulars'] with single brackets is a Series, which has the .str accessor.
The AttributeError in the question means .str was called on a DataFrame, for example via df.str or df[['particulars']].str. A DataFrame has no .str accessor, hence the error.
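A quick check, using the two sample values from the question, of where the .str accessor is available. Single brackets select a Series; double brackets select a one-column DataFrame.

```python
import pandas as pd

df = pd.DataFrame({
    "particulars": [
        "NWCLG 545627 ASDASD KJKJKJ ASDASD",
        "TGS/ASDWWR42045645010009 2897/SDFSDFGHGWEWER",
    ]
})

# A Series exposes .str; a DataFrame does not.
series_has_str = hasattr(df["particulars"], "str")
frame_has_str = hasattr(df[["particulars"]], "str")

# Replace '/' with a space; regex=False treats '/' as a literal character.
df["particulars"] = df["particulars"].str.replace("/", " ", regex=False)
print(series_has_str, frame_has_str)
print(df["particulars"].tolist())
```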
I'm having trouble applying upper case to a column in my DataFrame.
dataframe is df.
1/2 ID is the column head that need to apply UPPERCASE.
The problem is that the values are made up of three letters and three numbers. For example rrr123 is one of the values.
df['1/2 ID'] = map(str.upper, df['1/2 ID'])
I got an error:
TypeError: descriptor 'upper' requires a 'str' object but received a 'unicode' error.
How can I apply upper case to the first three letters in the column of the DataFrame df?
If your version of pandas is a recent version then you can just use the vectorised string method upper:
df['1/2 ID'] = df['1/2 ID'].str.upper()
This method does not work inplace, so the result must be assigned back.
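For illustration, a short sketch on made-up values shaped like the question's example (three letters followed by three digits). Digits are unaffected by upper(), and missing values stay missing.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({"1/2 ID": ["rrr123", "abc456", None]})

# Vectorised uppercase; the result is assigned back to the column.
df["1/2 ID"] = df["1/2 ID"].str.upper()
print(df)
```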
This should work (on Python 3, map returns an iterator, so wrap it in list):
df['1/2 ID'] = list(map(lambda x: str(x).upper(), df['1/2 ID']))
and should you want all the column names to be in uppercase format:
df.columns = list(map(lambda x: str(x).upper(), df.columns))
str.upper() wants a plain old Python 2 string
unicode.upper() will want a unicode not a string (or you get TypeError: descriptor 'upper' requires a 'unicode' object but received a 'str')
So I'd suggest making use of duck typing and calling .upper() on each of your elements. Series.apply has no inplace option, so assign the result back, e.g.
df['1/2 ID'] = df['1/2 ID'].apply(lambda x: x.upper())