Pandas DataFrames Lists

I created a new column in my DataFrame from a list. Now every entry has square brackets ('[ ]') around the text. How do I remove them? It seems easy, but I'm not getting there. Code used:
df.insert(1, 'Email', emails_list, True)
Now all the data in the Email column is in [square brackets]. I want to remove those brackets.

Each row in the 'Email' column probably holds a list rather than a string. The code below takes the first element of each list and replaces the original list with it.
df['Email'] = df['Email'].map(lambda x: x[0] if len(x) > 0 else '')
The code takes each cell value in the column and checks whether it has non-zero length. If it does, the list in the cell is replaced with its first element; otherwise it is replaced with an empty string.
If the problem persists, check the type and shape of 'emails_list'.
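As a rough sketch of the same idea on made-up data (the names, addresses and the `unwrap` helper are assumptions, not taken from the question), the version below also leaves cells alone when they are not lists, which avoids errors on mixed columns:

```python
import pandas as pd

# Hypothetical data: every email is wrapped in its own one-element list,
# and one entry is an empty list.
df = pd.DataFrame({
    "Name": ["Ann", "Bob", "Cy"],
    "Email": [["ann@example.com"], ["bob@example.com"], []],
})

def unwrap(value):
    """Return the first element of a non-empty list, '' for an empty list,
    and leave any non-list value untouched."""
    if isinstance(value, list):
        return value[0] if value else ""
    return value

df["Email"] = df["Email"].map(unwrap)
print(df["Email"].tolist())
```

If the lists come from `emails_list` itself, flattening that list before calling `df.insert` would likely be simpler than repairing the column afterwards.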

Related

How to append cell values in openpyxl directly

I'm trying to append to a cell's value directly using openpyxl.
This works:
from openpyxl import load_workbook
wb1=load_workbook('test.xlsx')
ws1=wb1.active
testlist=('two','three','four','five')
for i in testlist:
    ws1['A1'].value = ws1['A1'].value +(' ') + i
print(ws1['A1'].value)
A1 has a value of "one"; after the loop runs it has "one two three four five".
But is it possible to use the append method directly on the cell value?
for i in testlist:
    ws1.append['A1'].value = i
However, this throws an error:
"TypeError: 'method' object does not support item assignment"

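An error such as "'method' object is not subscriptable" or "'method' object does not support item assignment" means you are using square brackets on an object that does not support indexing. ws1.append is a method, so it has to be called with parentheses, ws1.append(...), not indexed like a dict with ws1.append['A1']. Note also that append adds a new row at the bottom of the sheet; it does not add text to an existing cell.
As per the openpyxl documentation, you can pass worksheet.append:
A list: all values are added in order, starting from the first column.
That is your case. Simply doing the following should work: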
from openpyxl import Workbook
wb1=Workbook()
ws1=wb1.active
testlist=('one','two','three','four','five')
# append each element side by side in a single row
ws1.append(testlist)
# To append each element vertically, one per new row, uncomment the next 2 lines.
#for entry in testlist:
#    ws1.append([entry])
wb1.save('test.xlsx')
A dict: values are assigned to the columns indicated by the keys (numbers or letters). This might help if you are targeting a specific column.
Or, to have more control, simply use worksheet.cell.
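The three options mentioned in this answer (a list, a dict, and worksheet.cell) could be sketched together as follows. The cell values are placeholders, and the workbook is kept in memory rather than saved:

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

# A list (or tuple) fills one new row, starting from column A.
ws.append(["one", "two", "three"])

# A dict places values only in the named columns of the next new row.
ws.append({"A": "left", "C": "right"})

# worksheet.cell addresses a single cell by row and column number.
ws.cell(row=3, column=2, value="middle")

for row in ws.iter_rows(values_only=True):
    print(row)
```

Each call to append starts a new row, so it suits adding records; worksheet.cell is the better fit when one specific cell has to change.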

You will need to join the tuple into a string; you can then assign it to cell A1 like this.
from openpyxl import Workbook
wb1=Workbook()
ws1=wb1.active
testlist=('one','two','three','four','five')
myString = ' '.join(map(str, testlist))
ws1['A1'].value = myString
wb1.save('test1.xlsx')
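For the situation in the question, where A1 already holds "one", a minimal sketch might read the existing value and join the new words onto it in one step. Here the starting value is set by hand as a stand-in for the saved workbook:

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws["A1"].value = "one"  # stand-in for the value already stored in test.xlsx

testlist = ("two", "three", "four", "five")

# Build the new text once and assign it back to the cell.
ws["A1"].value = " ".join([ws["A1"].value, *testlist])
print(ws["A1"].value)
```

This assumes the cell already contains a string; an empty cell holds None, which join would reject.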

Efficient way to convert a list of one string to a string and perform split operation

I have a list which contains a string shown below. I have defined mylist in the global space as a string using "".
mylist = ""
mylist = ["1.22.43.45"]
I get an execution error stating that the split operation is not possible as it is being performed on a list rather than the string.
mylist.rsplit(".",1)[-1]
I tried to resolve it by using the following code:
str(mylist.rsplit(".",1)[-1]
Is this the best way to do it? The output I want is 45. I am splitting the string and accessing the last element. Any help is appreciated.

mylist=["1.22.43.45"]
newstring = mylist[0].rsplit(".",1)[-1]
First select the element in your list, then split it, then take the last element of the split.

Just because you assigned mylist = "" first, doesn't mean it'll cast the list to a string. You've just reassigned the variable to point at a list instead of an empty string.
You can accomplish what you want using:
mylist = ["1.22.43.45"]
mylist[-1].rsplit('.', 1)[-1]
This gets the last item from the list and performs an rsplit on it. Of course, it won't work if the list is empty or if the last item in the list is not a string. You may want to wrap it in a try/except block to catch IndexError, for example.
EDIT: Added the [-1] index to the end to grab the last list item from the split, since rsplit() returns a list, not a string. See DrBwts' answer
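The try/except guard suggested in this answer could look roughly like the sketch below. The helper name `last_segment` and its None fallback are assumptions made for illustration:

```python
def last_segment(items, sep="."):
    """Return the text after the last `sep` in the last item of `items`,
    or None if the list is empty or the item is not a string."""
    try:
        return items[-1].rsplit(sep, 1)[-1]
    except IndexError:
        # items is empty, so items[-1] does not exist
        return None
    except AttributeError:
        # the last item has no rsplit method, e.g. it is an int
        return None

print(last_segment(["1.22.43.45"]))  # 45
print(last_segment([]))              # None
print(last_segment([12345]))         # None
```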

You can access the first element (the string, in your case) with the index operator []:
mylist[0].rsplit(".", 1)[-1]

Remove list type in columns while preserving list structure

I have two columns that, because of the way my data was pulled, contain lists. This may be a really easy question; I just haven't found the right way to produce the result I'm looking for.
I need the "a" column to be a string without the [] and the "b" column to be integers separated by a comma, if that's possible.
I've tried this code:
df['a'] = df['a'].astype(str)
to convert to a string, but it failed and outputs:
What I need the output to look like is:
a                    b
hbhprecision.com     123,1234,12345,123456
thomsonreuters.com   1234,12345,123456
etc.
Please help and thank you very much in advance!

For the first part, removing the brackets [ ]:
df['c_u'].apply(lambda x: x.strip("['").strip("']"))
For the second part (assuming you removed the brackets there as well), splitting the values across columns:
df['tawgs.db_id'].str.split(',', expand=True)
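The strip approach above works when the cells hold text such as "['x']". If the cells hold real Python lists, as the question suggests, one possible sketch is to join each list into a comma-separated string. The frame below is made up to match the desired output shown in the question:

```python
import pandas as pd

# Hypothetical frame shaped like the question: both columns hold lists.
df = pd.DataFrame({
    "a": [["hbhprecision.com"], ["thomsonreuters.com"]],
    "b": [[123, 1234, 12345, 123456], [1234, 12345, 123456]],
})

# Join every list into one comma-separated string; str() handles the integers.
for col in ["a", "b"]:
    df[col] = df[col].map(lambda items: ",".join(str(item) for item in items))

print(df)
```

After this step column "b" holds strings, not integers, which is the trade-off of storing several numbers in a single cell.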

Remove first 3 characters in string using a condition statement

Can anyone kindly help, please?
I'm trying to remove the first three characters of a string using the statement:
Data['COUNTRY_CODE'] = Data['COUNTRY1'].str[3:]
This creates a new column with the first three characters of each string removed. However, I do not want this applied to all of the values in the column, so I was hoping there would be a way to use a conditional statement such as 'where' to change only the desired strings.

I assume you are using pandas, so your condition check can look like:
condition_mask = Data['COL_YOU_WANT_TO_CHECK'] == 'SOME CONDITION'
Your new column can be created as:
# Assuming you want the first 3 chars as COUNTRY_CODE
Data.loc[condition_mask, 'COUNTRY_CODE'] = Data['COUNTRY1'].str[:3]
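If the goal is the one stated in the question (drop the first three characters, but only on selected rows), a sketch along the same lines might copy the original column first and then overwrite just the masked rows. The sample values and the "XX-" prefix condition are assumptions:

```python
import pandas as pd

# Hypothetical data: only values starting with "XX-" should lose their prefix.
Data = pd.DataFrame({"COUNTRY1": ["XX-France", "Germany", "XX-Spain"]})

# Boolean mask marking the rows to change (the condition is an assumption).
condition_mask = Data["COUNTRY1"].str.startswith("XX-")

# Start from a copy of the original values, then overwrite only the masked rows.
Data["COUNTRY_CODE"] = Data["COUNTRY1"]
Data.loc[condition_mask, "COUNTRY_CODE"] = Data.loc[condition_mask, "COUNTRY1"].str[3:]

print(Data)
```

Copying the column first keeps the unmatched rows unchanged; assigning through .loc on a column that does not exist yet would leave them as NaN.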

Append specific rows from one list to another

I'm having some difficulty taking a 2D list with 7 columns and 10 rows and appending all rows from only columns 4, 5 and 6 (or 3, 4, 5 counting from index 0) to a new list. The original list is actually a csv and is much, much longer, but I've just put part of it in the function for troubleshooting purposes.
What I have so far is...
def coords():
    # just an example of first couple lines...
    bigList = [['File','FZone','Type','ID','Lat','Lon','Ref','RVec'],
               ['20120505','Cons','mit','3_10','-21.77','119.11','mon_grs','14.3']]
    newList=[]
    for row in bigList[1:]: # skip the header
        newList.append(row[3])
    return newList # return newList to main so it can be sent to other functions
This code gives me a new list with 'ID' only, but I also want 'Lat' and 'Lon'.
The new list should look like...['3_10', '-21.77','119.11']['4_10','-21.10'...]
I tried rewriting it as newList.append(row[3,4,5]), and of course that doesn't work, but I'm not sure how to go about it.

row[3] refers to the fourth element only. You seem to want the fourth through sixth elements, so slice it:
row[3:6]
You could also do this all with a list comprehension:
newList = [row[3:6] for row in bigList[1:]]
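Put together, the slicing approach might look like the sketch below. The second data row is invented to show more than one result, and passing the rows in as a parameter is a design choice, not something from the question:

```python
# Sample rows; the second data row is invented for illustration.
bigList = [
    ["File", "FZone", "Type", "ID", "Lat", "Lon", "Ref", "RVec"],
    ["20120505", "Cons", "mit", "3_10", "-21.77", "119.11", "mon_grs", "14.3"],
    ["20120506", "Cons", "mit", "4_10", "-21.10", "119.20", "mon_grs", "12.9"],
]

def coords(rows):
    """Return [ID, Lat, Lon] for every row after the header."""
    return [row[3:6] for row in rows[1:]]

newList = coords(bigList)
print(newList)
```

The slice row[3:6] stops before index 6, so it yields exactly the elements at indexes 3, 4 and 5.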
