Get pandas to print the complete string [duplicate]

This question already has answers here:
Pandas to_html() truncates string contents
(2 answers)
Closed 5 years ago.
I have a pandas dataframe with two columns - one for id and the other for the corresponding title. I am subsetting the dataframe for a few project ids and displaying the resulting dataframe. The project id is displayed fine, but the corresponding title gets truncated and ends with ... after a few characters. How do I get pandas to display the full text in the title column?

You can use display.max_colwidth:
import pandas as pd
import numpy as np

df = pd.DataFrame(np.array([[1, 'aaa'],
                            [2, 'long string long string long string long string long string']]),
                  columns=['id', 'title'])
print(df)
  id                                              title
0  1                                                aaa
1  2  long string long string long string long strin...

# temporarily display long text
with pd.option_context('display.max_colwidth', 100):
    print(df)
  id                                                         title
0  1                                                           aaa
1  2  long string long string long string long string long string
Documentation
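If you want the wider column for every subsequent display rather than just inside one block, a minimal sketch of setting the option globally (assuming a reasonably recent pandas, where None disables truncation entirely):

import pandas as pd

# show full column contents for all subsequent print/display calls
pd.set_option('display.max_colwidth', None)   # or a large integer such as 200

# restore the default later if needed
pd.reset_option('display.max_colwidth')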

Related

Efficiently labelling a column that contains repeated elements [duplicate]

This question already has answers here:
How to efficiently assign unique ID to individuals with multiple entries based on name in very large df
(3 answers)
Pandas: convert categories to numbers
(6 answers)
Convert pandas series from string to unique int ids [duplicate]
(2 answers)
Closed 1 year ago.
I have a dataframe with a column consisting of author names, where sometimes the name of an author repeats. My problem is: I want to assign a unique number to each author name in a corresponding parallel column (for simplicity, assume that this numbering follows the progression of whole numbers, starting with 0, then 1, 2, 3, and so on).
I can do this using nested for loops, but with 57,000 records covering some 500-odd unique authors it takes far too long. Is there a quicker way to do this?
For example,
Original DataFrame contains:
**Author**
Name 1
Name 2
Name 1
Name 3
I want another column added next to it, such that:
**Author**   **AuthorID**
Name 1       1
Name 2       2
Name 1       1
Name 3       3
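No answer is reproduced here, but the linked duplicates point to vectorised solutions; a minimal sketch using pd.factorize, with the Author/AuthorID column names taken from the example above:

import pandas as pd

df = pd.DataFrame({'Author': ['Name 1', 'Name 2', 'Name 1', 'Name 3']})

# factorize assigns 0, 1, 2, ... in order of first appearance, with no Python-level loop
df['AuthorID'] = pd.factorize(df['Author'])[0]

# an equivalent alternative: df['AuthorID'] = df['Author'].astype('category').cat.codes
print(df)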

How to convert all numeric values to floats when treated as string [duplicate]

This question already has answers here:
Change column type in pandas
(16 answers)
Closed 7 months ago.
I have a df like this:
    label                                                data       start
37      1  Ses01M_impro04_F018 [145.2100-153.0500]: We're...  145.21000
38      2  Ses01M_impro04_M019 [148.3800-151.8400]: Well,...  148.38000
39      2                                      M: [BREATHING]   BREATHING
40      1  Ses01M_impro04_M020 [159.7700-161.8600]: I'm n...  159.77000
I parsed out the start column to get the starting timestamp for each row using this code:
df['start'] = df.data.str.split().str[1].str[1:-2].str.split('-').str[0]
I want to convert df.start into floats because the values are treated as strings right now. However, I can't simply call .astype(float) because of the actual string BREATHING in row 39.
I'd like to just drop the row containing alphabetic characters (row 39). I don't know how to do this because at this point every value in df.start is a string, so I can't filter with something like isnumeric(). How do I do this?
Pasting skeletal code; you can modify and use it:
if a.isnumeric():   # note: isnumeric() is False for strings containing '.', e.g. '145.21'
    newa = pd.to_numeric(a)
else:
    newa = a
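A vectorised sketch of the same idea, assuming the goal is simply to drop the rows that cannot be parsed as floats: errors='coerce' turns unparseable strings such as BREATHING into NaN and also copes with the decimal point that isnumeric() rejects.

import pandas as pd

# convert what can be converted; 'BREATHING' becomes NaN
df['start'] = pd.to_numeric(df['start'], errors='coerce')

# drop the rows that could not be parsed (e.g. row 39), keeping the rest as floats
df = df.dropna(subset=['start'])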

Split a string with , and split a string even no , present in a dataframe column [duplicate]

This question already has answers here:
Pandas split column into multiple columns by comma
(8 answers)
Closed 1 year ago.
I have a dataframe column and I need to split it on ",", even when no "," is present in the value.
Value
=====
59.5
59.5, 5
60
60,5

Desired output:
value1   value2
======   ======
59.5
59.5     5
60
60       5
I tried the code below but get this error:
df['value1'], df_merge['value2'] = df['value'].str.split(',', 1).str
ValueError: not enough values to unpack (expected 2, got 1)
You could search for "," first and only do the split if the string contains ",".
Use the string method str.partition. It always returns three parts: what is to the left of your chosen separator (a comma here), the separator itself, and what is to the right of it. In pandas, Series.str.partition returns those parts as three columns:
parts = df['value'].str.partition(',')
df['value1'], df['value2'] = parts[0], parts[2]
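A minimal runnable sketch with the sample values from the question, showing the split alternative from the linked duplicate (column names assumed):

import pandas as pd

df = pd.DataFrame({'value': ['59.5', '59.5, 5', '60', '60,5']})

# str.split with expand=True pads rows without a comma with NaN, so unpacking never fails
df[['value1', 'value2']] = df['value'].str.split(',', n=1, expand=True)

# strip stray spaces left over from values like '59.5, 5'
df['value2'] = df['value2'].str.strip()

print(df)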

extract semicolon separated value from pandas df column [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 2 years ago.
I need to extract a specific value from pandas df column. The data looks like this:
row  my_column
1    artid=delish.recipe.45064;artid=delish_recipe_45064;avb=83.3;role=4;data=list;prf=i
2    ab=px_d_1200;ab=2;ab=t_d_o_1000;artid=delish.recipe.23;artid=delish;role=1;pdf=true
3    dat=_o_1000;artid=delish.recipe.23;ar;role=56;passing=true;points001
The data is not consistent, but the fields are separated by semicolons, and I need to extract role=x.
I split the data on the semicolons and can loop through the values to fetch the roles, but I was wondering if there is a more elegant way to solve it.
Desired output:
row  my_column
1    role=4
2    role=1
3    role=56
Thank you.
You can use str.extract and pass the required pattern within parentheses.
df['my_column'] = df['my_column'].str.extract(r'(role=\d+)')
   row my_column
0    1    role=4
1    2    role=1
2    3   role=56
This should work:
def get_role(x):
    # split the record on ';' and return the first field that starts with 'role'
    l = x.split(sep=';')
    t = [i for i in l if i[:4] == 'role'][0]
    return t

df['my_column'] = df['my_column'].map(get_role)

Return rows that match a larger partial string of a string [duplicate]

This question already has answers here:
Python Pandas: Check if string in one column is contained in string of another column in the same row
(3 answers)
Closed 4 years ago.
I have a dataframe df that looks like the following:
Type      Size
Biomass   12
Coal      15
Nuclear   23
And I have a string str such as the following: Biomass_wood
I would like to return the following dataframe:
Type      Size
Biomass   12
This is because Biomass is partially matched by the first part of Biomass_wood.
This would effectively be the opposite of df[df.Type.str.contains(str)], since the bigger string is contained in str rather than in the Type column.
The following should do it:
df[df['Type'].map(lambda t: t in 'Biomass_wood')]
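A minimal runnable sketch of the same idea, with the search string kept in a variable (named search_str here to avoid shadowing the builtin str):

import pandas as pd

df = pd.DataFrame({'Type': ['Biomass', 'Coal', 'Nuclear'],
                   'Size': [12, 15, 23]})
search_str = 'Biomass_wood'

# keep the rows whose Type value is a substring of search_str
print(df[df['Type'].map(lambda t: t in search_str)])
#       Type  Size
# 0  Biomass    12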
