Create a new column which is cast to a string in pandas - python

What would be the proper way to assign a stringified copy of a column to a dataframe? I would like to keep the original column, so I don't want to use .astype({'deliveries': 'str'}). So far I have:
df = ( df.groupby('path')
.agg(agg_dict)
.assign(deliveries_str=df['deliveries'].str ??)
)
What would be the proper way to do this?
I also tried the following but I get an unhashable type error:
.assign(deliveries_str=lambda x: x.deliveries.str)
TypeError: unhashable type: 'list'

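.str is an accessor that exposes string methods, not a conversion function, so it does not return a string column by itself. Use .astype(str) instead: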
.assign(deliveries_str=lambda x: x.deliveries.astype(str))
To keep missing values as NaN instead of turning them into the string 'nan', add a mask:
.assign(deliveries_str=lambda x: x['deliveries'].astype(str).mask(x['deliveries'].isnull()))
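A minimal runnable sketch of the chain above, assuming made-up sample data and a hypothetical agg_dict that takes the maximum per group. The lambda matters here: it receives the aggregated frame, whereas df['deliveries'] would refer to the frame from before the groupby.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "path": ["a", "a", "b", "c"],
    "deliveries": [1, 2, 3, None],
})
agg_dict = {"deliveries": "max"}  # assumed aggregation, for illustration only

out = (
    df.groupby("path")
    .agg(agg_dict)
    # x is the aggregated frame, so the new column lines up with the groups.
    .assign(
        deliveries_str=lambda x: x["deliveries"]
        .astype(str)
        .mask(x["deliveries"].isnull())  # keep missing values missing
    )
)
print(out)
```

The original deliveries column keeps its numeric dtype, and deliveries_str holds the string version next to it.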

Related

How in Pandas Dataframe and Python can extract the specific text string after a given word?

I would like to extract the text inside the range "text: ....." from this dataframe and create another column with that value.
This is what I tried on my pandas DataFrame:
issues_df['new_column'] = issues_df['fields.description.content'].apply(lambda x: x['text'])
However, it returns the following error:
TypeError: 'float' object is not subscriptable
Any suggestions?
Thanks in advance.
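The problem is the NaN values in the column: NaN is a float, so x['text'] fails on those rows. The .str accessor passes missing values through, so you can chain it instead: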
issues_df['new_column'] = issues_df['fields.description.content'].str[0].str['content'].str[0].str['text']
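Here is a small sketch of that chained .str lookup. The nesting (a list of dicts whose 'content' key holds a list of dicts with a 'text' key) is an assumption inferred from the indexing in the answer, and the sample values are invented.

```python
import numpy as np
import pandas as pd

# Hypothetical rows; the second one is missing, like the NaN in the question.
issues_df = pd.DataFrame({
    "fields.description.content": [
        [{"content": [{"text": "first issue"}]}],
        np.nan,
        [{"content": [{"text": "second issue"}]}],
    ]
})

col = issues_df["fields.description.content"]
# Each .str[...] step indexes into the list or dict and leaves NaN untouched.
issues_df["new_column"] = col.str[0].str["content"].str[0].str["text"]
print(issues_df["new_column"])
```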
That could be a good task for the rather efficient json_normalize:
df['new_column'] = pd.json_normalize(
df['fields.description.content'], 'content'
)['text']

String object is not callable when creating a dataframe

I am trying to create a dataframe in pandas as follows:
cols = ['col1','col2']
df = pd.DataFrame(columns = cols)
I get the following error:
TypeError: 'str' object is not callable
Does anybody know the solution here?
Somewhere in your code you've overwritten the value for pd.DataFrame by assigning it a string value. Find and remove the offending line, restart your kernel, and try again.
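To illustrate how this error arises, the sketch below deliberately overwrites pd.DataFrame with a string and then restores it. The "oops" assignment is a hypothetical stand-in for whatever line did this in the asker's session.

```python
import pandas as pd

cols = ["col1", "col2"]
original_dataframe = pd.DataFrame  # keep a reference so it can be restored

pd.DataFrame = "oops"  # simulated accidental overwrite (hypothetical)
try:
    pd.DataFrame(columns=cols)  # now calls a str, not the class
    error_message = None
except TypeError as exc:
    error_message = str(exc)
finally:
    pd.DataFrame = original_dataframe  # undo the overwrite

print(error_message)

# With the name restored, the original code works again.
df = pd.DataFrame(columns=cols)
print(list(df.columns))
```

In an interactive session, restarting the kernel has the same effect as the restore step, provided the offending line is gone.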

Pandas column dtype is object but python thinks it is float

I read in a csv like this
df = pd.read_csv(self.file_path, dtype=str)
then I try this:
df = df[df["MY_COLUMN"].apply(lambda x: x.isnumeric())]
I get an AttributeError:
AttributeError: 'float' object has no attribute 'isnumeric'
Why is this happening? The column contains mostly digits.
I want to filter out the ones where there are no digits.
This question is not how to achieve that or do it better but why do I get an AttributeError here?
Why is this happening?
I think it is because dtype=str does not convert missing values: empty cells are still read as NaN, and NaN is a float, so x.isnumeric() fails on those rows.
Use Series.str.isnumeric instead. Like all text functions in pandas, it passes missing values through, so fill them with False before filtering:
df[df["MY_COLUMN"].str.isnumeric().fillna(False)]
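A small sketch that reproduces the situation with an in-memory CSV. The ID column and the sample values are invented; only MY_COLUMN and dtype=str come from the question.

```python
import io

import pandas as pd

# Hypothetical CSV with one empty cell in MY_COLUMN.
csv_text = "ID,MY_COLUMN\n1,123\n2,\n3,abc\n4,456\n"
df = pd.read_csv(io.StringIO(csv_text), dtype=str)

# The empty cell is still a missing value, and NaN is a Python float.
missing = df.loc[1, "MY_COLUMN"]
print(type(missing))

# Treat missing values as "not numeric" so the mask is purely boolean.
mask = df["MY_COLUMN"].str.isnumeric().fillna(False).astype(bool)
filtered = df[mask]
print(filtered)
```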

Pandas - Convert object to string and then to int

I have the following data frame.
What I am trying to do is convert this object column to a string and then to a numeric type.
I have looked at using the astype function, first to string and then to int. What I would like is for the data type to be int:
df['a'] = df['a'].astype(str).astype(int)
I have tried other variations. What I have been trying to do is get the column values to become numbers (obviously without the commas). It is just that the data type is an object initially.
Thanks so much!
You need to remove all the commas first:
df['a'] = df['a'].str.replace(',', '').astype(int)
With both columns, you can do:
df[['a','b']] = df[['a','b']].replace(',', '', regex=True).astype('int')
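A runnable sketch of both variants, assuming the columns hold numbers stored as strings with comma thousands separators. The sample values are invented, since the original data frame is not shown.

```python
import pandas as pd

# Hypothetical data: object columns with commas inside the numbers.
df = pd.DataFrame({
    "a": ["1,000", "25", "3,456,789"],
    "b": ["7", "8,900", "10"],
})

# One column: strip the commas as literal text, then cast.
single = df["a"].str.replace(",", "", regex=False).astype(int)

# Several columns: regex=True makes replace match inside each string.
both = df[["a", "b"]].replace(",", "", regex=True).astype(int)

print(single.tolist())
print(both.dtypes)
```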

Lambda mapping column to uppercase [duplicate]

I'm having trouble applying upper case to a column in my DataFrame.
dataframe is df.
1/2 ID is the column head that need to apply UPPERCASE.
The problem is that the values are made up of three letters and three numbers. For example rrr123 is one of the values.
df['1/2 ID'] = map(str.upper, df['1/2 ID'])
I got an error:
TypeError: descriptor 'upper' requires a 'str' object but received a 'unicode'
How can I apply upper case to the first three letters in the column of the DataFrame df?
If your version of pandas is a recent version then you can just use the vectorised string method upper:
df['1/2 ID'] = df['1/2 ID'].str.upper()
This method does not work inplace, so the result must be assigned back.
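As a quick sketch with invented values (only the column name '1/2 ID' comes from the question): since digits have no uppercase form, uppercasing the whole value is enough to uppercase the three leading letters.

```python
import pandas as pd

# Hypothetical values: three letters followed by three digits, plus a missing entry.
df = pd.DataFrame({"1/2 ID": ["rrr123", "abc456", None]})

# Vectorised upper-casing; missing values stay missing instead of raising.
df["1/2 ID"] = df["1/2 ID"].str.upper()
print(df)
```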
This should work (in Python 3, map returns a lazy iterator, so wrap it in list to hand pandas concrete values):
df['1/2 ID'] = list(map(lambda x: str(x).upper(), df['1/2 ID']))
And should you want all the column names to be in uppercase:
df.columns = list(map(lambda x: str(x).upper(), df.columns))
str.upper() wants a plain Python 2 str.
unicode.upper() wants a unicode object, not a str (otherwise you get TypeError: descriptor 'upper' requires a 'unicode' object but received a 'str').
So I'd suggest making use of duck typing and calling .upper() on each of your elements. Series.apply has no inplace option, so assign the result back, e.g.
df['1/2 ID'] = df['1/2 ID'].apply(lambda x: x.upper())
