How to split elements of a list? - python

I have a list:
my_list = ['element1\t0238.94', 'element2\t2.3904', 'element3\t0139847']
How can I delete the \t and everything after to get this result:
['element1', 'element2', 'element3']

Something like:
>>> l = ['element1\t0238.94', 'element2\t2.3904', 'element3\t0139847']
>>> [i.split('\t', 1)[0] for i in l]
['element1', 'element2', 'element3']

myList = [i.split('\t')[0] for i in myList]

Try iterating through each element of the list, splitting it at the tab character, and appending the first part to a new list.
new_list = []
for item in my_list:
    new_list.append(item.split('\t')[0])

Do not use list as a variable name, since it shadows the built-in list type.
You can also take a look at the following code, which handles elements that contain no tab:
clist = ['element1\t0238.94', 'element2\t2.3904', 'element3\t0139847', 'element5']
clist = [x[:x.index('\t')] if '\t' in x else x for x in clist]
Or in-place editing:
for i, x in enumerate(clist):
    if '\t' in x:
        clist[i] = x[:x.index('\t')]

Solution with map and lambda expression:
my_list = list(map(lambda x: x.split('\t')[0], my_list))
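For comparison, here is a minimal sketch that runs the approaches above on the same input. It also includes str.partition, which none of the answers mention but which behaves the same way here. The names items, by_split, by_partition and by_map are illustrative only, and the extra element without a tab is borrowed from the clist example.

```python
# Sample data: the question's list plus one element that has no tab.
items = ['element1\t0238.94', 'element2\t2.3904', 'element3\t0139847', 'element5']

# split with maxsplit=1 stops after the first tab; [0] is the part before it.
by_split = [s.split('\t', 1)[0] for s in items]

# partition returns (head, separator, tail); head is the whole string if no tab is found.
by_partition = [s.partition('\t')[0] for s in items]

# map with a lambda is equivalent to the first comprehension.
by_map = list(map(lambda s: s.split('\t')[0], items))

print(by_split)
```

All three leave an element without a tab unchanged, so the explicit '\t' in x check is only needed with index, which raises ValueError when the tab is missing.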

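I had to split each list element into two parts, lt and lc, for feature extraction:
import random
import nltk
def act_features(atext):
    # Map every lowercased token of the text to a boolean "contains" feature.
    features = {}
    for word in nltk.word_tokenize(atext):
        features['cont({})'.format(word.lower())] = True
    return features
# df4 is the answerer's DataFrame. .ix was removed in pandas 1.0; .iloc is used
# here on the assumption that columns 3 and 7 were meant positionally.
ltexts = df4.iloc[:, [3, 7]].values.tolist()
random.shuffle(ltexts)
featsets = [(act_features(lt), lc) for lc, lt in ltexts]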

Related

Slice all strings in a list from their first '\n'

How to remove first RemoveThisX\n from list
['RemoveThis1\nDontRemove\nDontRemove','RemoveThis2\nDontRemove\nDontRemove', 'RemoveThis3\nDontRemove\nDontRemove', 'RemoveThis4\nDontRemove\nDontRemove']
I am trying to remove RemoveThis1\n, RemoveThis2\n, RemoveThis3\n and RemoveThis4\n.
The final result needs to be:
['DontRemove\nDontRemove','DontRemove\nDontRemove', 'DontRemove\nDontRemove', 'DontRemove\nDontRemove']
a_list = ['RemoveThis1\nDontRemove\nDontRemove','RemoveThis2\nDontRemove\nDontRemove', 'RemoveThis3\nDontRemove\nDontRemove', 'RemoveThis4\nDontRemove\nDontRemove']
result = [item[item.find('\n')+1:] for item in a_list]
print(result)
['DontRemove\nDontRemove', 'DontRemove\nDontRemove', 'DontRemove\nDontRemove', 'DontRemove\nDontRemove']
test_list = ['RemoveThis1\nDontRemove\nDontRemove','RemoveThis2\nDontRemove\nDontRemove', 'RemoveThis3\nDontRemove\nDontRemove', 'RemoveThis4\nDontRemove\nDontRemove']
result = ["\n".join(item.split("\n")[1:]) for item in test_list]
print(result)
Output will be:
['DontRemove\nDontRemove', 'DontRemove\nDontRemove', 'DontRemove\nDontRemove', 'DontRemove\nDontRemove']
assuming:
initial_list = ['RemoveThis1\nDontRemove\nDontRemove','RemoveThis2\nDontRemove\nDontRemove', 'RemoveThis3\nDontRemove\nDontRemove', 'RemoveThis4\nDontRemove\nDontRemove']
I would recommend using either the map function:
mapped_list = list(map(lambda x: x[x.find('\n') + 1:], initial_list))
or list comprehension:
comprehended_list = [string[string.find('\n') + 1:] for string in initial_list]
Both should produce the asked list.
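The answers above agree on the question's data, but they differ when an element contains no newline at all. The following is a small sketch of that edge case; the second sample string is hypothetical and not part of the original question.

```python
# One string from the question and one hypothetical string without a newline.
samples = ['RemoveThis1\nDontRemove\nDontRemove', 'NoNewlineHere']

# find returns -1 when there is no newline, so the slice starts at 0
# and the whole string is kept.
by_find = [s[s.find('\n') + 1:] for s in samples]

# split yields a one-element list when there is no newline, so [1:] is empty
# and joining it gives an empty string.
by_split_join = ['\n'.join(s.split('\n')[1:]) for s in samples]

# split with maxsplit=1 and [-1] also keeps the whole string in that case.
by_maxsplit = [s.split('\n', 1)[-1] for s in samples]

print(by_find)
print(by_split_join)
```

Which behavior is right depends on whether a string without a newline should survive intact or be treated as entirely removable.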

List resolver with tuples

I have a list:
List = [('4022-a751',), ('0bfc-4d53',)]
And want to resolve it into the output below:
Output = ['4022-a751','0bfc-4d53']
You should read about List Comprehensions in Python
list_ = [('4022-a751',), ('0bfc-4d53',)]
res = [x for item in list_ for x in item]
Output
['4022-a751', '0bfc-4d53']
A tuple can be indexed like a list, so you can take the first item of each tuple.
input_arr = [('4022-a751',), ('0bfc-4d53',)]
output_arr = [a[0] for a in input_arr]
print(output_arr)
You can use this.
old_list= [('4022-a751',), ('0bfc-4d53',)]
new_list = [''.join(i) for i in old_list]
print(new_list)
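The three answers give the same result for one-element tuples, but they are not interchangeable in general. This sketch compares them, together with itertools.chain.from_iterable as a further flattening option; the list wide is a hypothetical input added only to show the difference.

```python
from itertools import chain

rows = [('4022-a751',), ('0bfc-4d53',)]

# Flatten every tuple into one list (nested comprehension and chain are equivalent).
flat_nested = [x for row in rows for x in row]
flat_chain = list(chain.from_iterable(rows))

# Hypothetical input with a two-element tuple, to show where the answers diverge.
wide = [('a', 'b'), ('c',)]
flattened = [x for row in wide for x in row]   # keeps every element
first_only = [row[0] for row in wide]          # keeps only the first element
joined = [''.join(row) for row in wide]        # concatenates the strings in each tuple

print(flat_nested, flattened, first_only, joined)
```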

Delete end of element from a list if the element ends with an element from another list

I have the following two lists. If an element of my_list ends with an extension from extensions, that extension should be removed. I can't seem to find a solution that doesn't require too many lines of code.
Input:
my_list = ['abc_sum_def_sum', 'abc_sum_def_mean', 'abc_sum', 'abc_abc']
extensions = ['_sum', '_mean']
Output:
new_list = ['abc_sum_def', 'abc_sum_def', 'abc', 'abc_abc']
One-liner list comprehension:
new_list = [min(e[:(-len(ext) if e.endswith(ext) else len(e))] for ext in extensions) for e in my_list]
Result:
['abc_sum_def', 'abc_sum_def', 'abc', 'abc_abc']
Explanation:
This loops over my_list and checks whether each element e ends with either of the items in extensions. If it does, the matching extension is trimmed off; if it doesn't, the element is left untouched. Without the min applied, it first does this:
[[e[:(-len(ext) if e.endswith(ext) else len(e))] for ext in extensions] for e in my_list]
which produces:
[['abc_sum_def', 'abc_sum_def_sum'],
['abc_sum_def_mean', 'abc_sum_def'],
['abc', 'abc_sum'],
['abc_abc', 'abc_abc']]
and then applies min to pick the smaller item of each pair. Because a prefix always sorts before the longer string it was cut from, that min is either the trimmed-down version of the element or, if nothing matched, the untouched element itself.
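To make the role of min concrete, here is a short sketch for a single element. It relies on Python's lexicographic string comparison, under which a prefix compares as smaller than the full string. The variable names candidates and shortest are illustrative, and min with key=len is shown only as a more explicit alternative.

```python
extensions = ['_sum', '_mean']
e = 'abc_sum_def_mean'

# One candidate per extension: trimmed if e ends with it, otherwise e itself.
candidates = [e[:-len(ext)] if e.endswith(ext) else e for ext in extensions]

# Every candidate is a prefix of e, so the lexicographic minimum is the shortest one.
shortest = min(candidates)

# Selecting by length states the intent more directly and gives the same result.
shortest_by_len = min(candidates, key=len)

print(candidates, shortest)
```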
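Here is a plain loop version; you can convert it into a list comprehension for a more Pythonic approach:
my_list = ['abc_sum_def_sum','abc_sum_def_mean','abc_sum','abc_abc']
extensions = ['_sum','_mean']
new_list = []
for x in my_list:
    for elem in extensions:
        if x.endswith(elem):
            # Trim the first matching extension and stop checking the rest.
            x = x[:-len(elem)]
            break
    # Append every element, trimmed or not, so unmatched items are kept.
    new_list.append(x)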
This is one approach using Regex.
Ex:
import re
my_list = ['abc_sum_def_sum','abc_sum_def_mean','abc_sum','abc_abc']
extensions = ['_sum','_mean']
pattern = re.compile(r"(" + "|".join(map(re.escape, extensions)) + r")$")  # escape in case an extension contains regex metacharacters
print([pattern.sub("", i) for i in my_list])
Output:
['abc_sum_def', 'abc_sum_def', 'abc', 'abc_abc']
my_list = ['abc_sum_def_sum','abc_sum_def_mean','abc_sum','abc_abc']
extensions = ['_sum','_mean']
new_list =[]
for x in my_list:
    if x.endswith(extensions[0]) or x.endswith(extensions[1]):
        if x.endswith(extensions[0]):
            y = x[:-len(extensions[0])]
            new_list.append(y)
        else:
            y = x[:-len(extensions[1])]
            new_list.append(y)
    else:
        new_list.append(x)
print(new_list)
output:
['abc_sum_def', 'abc_sum_def', 'abc', 'abc_abc']
A solution using lambdas:
my_list = ['abc_sum_def_sum','abc_sum_def_mean','abc_sum','abc_abc']
extensions = ['_sum','_mean']
def ext_cleaner(extensions, str_arg):
    ext_found = [ext for ext in extensions if str_arg.endswith(ext)]
    ret = str_arg[:-len(ext_found[0])] if ext_found else str_arg
    return ret
list(map(lambda x: ext_cleaner(extensions, x), my_list))
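The answers above can be condensed into a small helper plus a list comprehension. This is a sketch, not the approach of any particular answer: strip_suffix is a hypothetical name, and it removes at most one extension, the first one in the list that matches.

```python
def strip_suffix(text, suffixes):
    """Return text without the first matching suffix, or unchanged if none match."""
    for suffix in suffixes:
        # Skip empty suffixes: text[:-0] would wrongly return an empty string.
        if suffix and text.endswith(suffix):
            return text[:-len(suffix)]
    return text

my_list = ['abc_sum_def_sum', 'abc_sum_def_mean', 'abc_sum', 'abc_abc']
extensions = ['_sum', '_mean']

new_list = [strip_suffix(item, extensions) for item in my_list]
print(new_list)
```

On Python 3.9 and later, str.removesuffix performs the same trim for a single suffix, so the slice inside the helper could be replaced with it.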

Two conditions loop python

I have data as below
lst = ['abs', '#abs', '&abs']
I need to remove every item that contains # or &. I do it like this:
new = []
simbol1 = '#'
simbol2 = '&'
for item in lst:
    if simbol1 not in item:
        if simbol2 not in item:
            new.append(item)
Is there a simpler way to write this loop?
I tried this:
lst = ['abs', '#abs', '&abs']
new = []
simbol1 = '#'
simbol2 = '&'
for item in lst:
    if any([simbol1 not in item, simbol2 not in item]):
        new.append(item)
But I get
new
['abs', '#abs', '&abs']
What is the correct way to use multiple conditions in this case?
You can use a list comprehension and merge the two ifs as follows:
>>> lst = ['abs', '#abs', '&abs']
>>> new_lst = [l for l in lst if '#' not in l and '&' not in l]
>>> new_lst
['abs']
You can also use all() instead of multiple ifs, as follows:
>>> lst = ['abs', '#abs', '&abs']
>>> new_lst = [l for l in lst if all(x not in l for x in ['#','&'])]
>>> new_lst
['abs']
You can just combine the two ifs:
if simbol1 not in item and simbol2 not in item:
    new.append(item)
lst = ['abs', '#abs', '&abs']
new = []
simbol1 = '#'
simbol2 = '&'
for item in lst:
    if all([simbol1 not in item, simbol2 not in item]):
        new.append(item)
print(new)
Functionally, you could do
new_list = list(filter(lambda x: all(f not in x for f in ['#', '&']), lst))
As an explanation: the lambda returns True only when none of the forbidden characters f are in the string, and filter drops every element for which it returns False. In Python 3, filter returns a lazy iterator rather than a list, so wrap it in list() to get one.
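A minimal sketch of that behavior, assuming Python 3, where the object returned by filter is a one-shot iterator. The names forbidden, filtered, first_pass and second_pass are illustrative only.

```python
lst = ['abs', '#abs', '&abs']
forbidden = ['#', '&']

# filter is lazy: nothing is evaluated until the result is iterated.
filtered = filter(lambda s: all(ch not in s for ch in forbidden), lst)

# The first pass consumes the iterator and produces the kept items.
first_pass = list(filtered)

# A second pass finds the iterator already exhausted.
second_pass = list(filtered)

print(first_pass, second_pass)
```

If the result is needed more than once, store the list rather than the filter object.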
lst = ['abs', '#abs', '&abs']
out_lst = [el for el in lst if '#' not in el and '&' not in el]
print(out_lst)  # ['abs']
Your code is very close to correct; the only problem is that you got the negation backward.
This:
if any([simbol1 not in item , simbol2 not in item]):
… is asking "are any of these not in the item?" Just as in English, that's true if one is not in the item but the other is, which isn't what you want. You only want it to be true if neither is in the item.
In other words, you want either "are all of these not in the item?"
if all([simbol1 not in item, simbol2 not in item]):
… or "are none of these in the item?"
if not any([simbol1 in item, simbol2 in item]):
However, when you only have a fixed list of two things like this, it's usually easier to use and or or instead of any or all, again just like in English:
if simbol1 not in item and simbol2 not in item:
… or:
if not (simbol1 in item or simbol2 in item):
If, on the other hand, you had a whole bunch of symbols to check, or a list of them that you couldn't even know until runtime, you'd want a loop:
if all(simbol not in item for simbol in (simbol1, simbol2)):
if not any(simbol in item for simbol in (simbol1, simbol2)):
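The difference between the three conditions can be checked directly. This sketch evaluates each of them for every item; the fourth item, which contains both symbols, is a hypothetical addition that shows the only case where the any(... not in ...) test rejects anything.

```python
lst = ['abs', '#abs', '&abs', '#&abs']
symbols = ('#', '&')

results = {}
for item in lst:
    results[item] = (
        any(s not in item for s in symbols),   # the question's (wrong) test
        all(s not in item for s in symbols),   # "all of these are not in the item"
        not any(s in item for s in symbols),   # "none of these are in the item"
    )

for item, row in results.items():
    print(item, row)
```

The second and third columns always agree, which is De Morgan's law applied to any and all.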

Splitting a list into a 2 dimensional list

Given a list with several names as a parameter, I was wondering if there was a way to split the names by first and last and create a 2d list of all first names and all last names. For example given:
lst=(["DiCaprio,Leonardo","Pitt, Brad", "Jolie, Angelina"])
the output should be:
[["Leonardo","Brad","Angelina"],["DiCaprio","Pitt","Jolie"]]
I know that I need to iterate over the given list and append the first names and last names to new lists, but I'm not quite sure how to go about it.
This is what I've got so far:
fname=[]
lname=[]
for i in nlst:
    i.split()
    fname.append(i[0])
    lname.append(i[1])
return lname,fname
You can use a simple list comprehension:
>>> lst=(["DiCaprio,Leonardo","Pitt, Brad", "Jolie, Angelina"])
>>> new = [[item.split(',')[1] for item in lst], [item.split(',')[0] for item in lst]]
>>> new
[['Leonardo', ' Brad', ' Angelina'], ['DiCaprio', 'Pitt', 'Jolie']]
Or you can zip it:
>>> x = [item.split(',') for item in lst]
>>> list(zip(*x))[::-1]
[('Leonardo', ' Brad', ' Angelina'), ('DiCaprio', 'Pitt', 'Jolie')]
This might help you:
list_names = []
list_last = []
lst = ["DiCaprio,Leonardo", "Pitt, Brad", "Jolie, Angelina"]
for element in lst:
    # Each entry is "Last,First", so the last name comes before the comma.
    last, first = element.split(',')
    list_names.append(first.strip())
    list_last.append(last.strip())
Let me know if you need anything else
A function that returns the names after splitting:
def split_list(the_list=["DiCaprio,Leonardo","Pitt, Brad", "Jolie, Angelina"]):
    list_2d = [[], []]
    for full_name in the_list:
        # Each entry is "Last,First", so the last name comes before the comma.
        last_name, first_name = full_name.split(',')
        list_2d[0].append(first_name.strip())
        list_2d[1].append(last_name.strip())
    return list_2d
print(split_list())
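Based on your code, you can try this:
lst = ["DiCaprio,Leonardo", "Pitt, Brad", "Jolie, Angelina"]
fname = []
lname = []
for i in lst:
    # The part before the comma is the last name; strip the space some entries have after the comma.
    lname.append(i.split(",")[0])
    fname.append(i.split(",")[1].strip())
print([fname, lname])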
You can split and transpose:
lst = ["DiCaprio,Leonardo","Pitt, Brad", "Jolie, Angelina"]
print([list(ele) for ele in zip(*(ele.split(",") for ele in lst))])
Or use map:
print(list(map(list, zip(*(ele.split(",") for ele in lst)))))
[['DiCaprio', 'Pitt', 'Jolie'], ['Leonardo', ' Brad', ' Angelina']]
If you need the order reversed:
print(list(map(list, zip(*(reversed(ele.split(",")) for ele in lst)))))
[['Leonardo', ' Brad', ' Angelina'], ['DiCaprio', 'Pitt', 'Jolie']]
For Python 2, you can use izip and then drop the list call around map:
from itertools import izip
print(map(list, izip(*(reversed(ele.split(",")) for ele in lst))))
You can use the zip function (Python 2, where zip returns a list that can be sliced):
>>> l=["DiCaprio,Leonardo","Pitt, Brad", "Jolie, Angelina"]
>>> zip(*[i.split(',') for i in l])[::-1]
[('Leonardo', ' Brad', ' Angelina'), ('DiCaprio', 'Pitt', 'Jolie')]
If you want the result as a list of lists, you can convert it with the map function (again on Python 2):
>>> map(list,zip(*[i.split(',') for i in l])[::-1])
[['Leonardo', ' Brad', ' Angelina'], ['DiCaprio', 'Pitt', 'Jolie']]
You could use a regex:
>>> import re
>>> re.findall(r'(\w+),\s*(\w+);?', ';'.join(lst))
[('DiCaprio', 'Leonardo'), ('Pitt', 'Brad'), ('Jolie', 'Angelina')]
Then use either zip or map to transpose the list of tuples (map(None, ...) works on Python 2 only):
>>> map(None, *re.findall(r'(\w+),\s*(\w+);?', ';'.join(lst)))
[('DiCaprio', 'Pitt', 'Jolie'), ('Leonardo', 'Brad', 'Angelina')]
On Python 3, you can't use map that way. Instead, you would do:
>>> list(map(lambda *x: [e for e in x], *re.findall(r'(\w+),\s*(\w+);?', ';'.join(lst))))
Then reverse the order and make a list of lists rather than a list of tuples if you wish (zip works on both Python 2 and 3):
>>> lot = list(zip(*re.findall(r'(\w+),\s*(\w+);?', ';'.join(lst))))
>>> [list(lot[1]), list(lot[0])]
[['Leonardo', 'Brad', 'Angelina'], ['DiCaprio', 'Pitt', 'Jolie']]
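Several of the outputs above keep the leading space in ' Brad' and ' Angelina', because only some of the input names have a space after the comma. The following Python 3 sketch splits, strips and transposes in one pass; the names pairs, last_names, first_names and result are illustrative only.

```python
lst = ["DiCaprio,Leonardo", "Pitt, Brad", "Jolie, Angelina"]

# Split each "Last,First" entry once and strip stray whitespace from both parts.
pairs = [[part.strip() for part in name.split(",", 1)] for name in lst]

# zip(*pairs) transposes the rows into one column of last names and one of first names.
last_names, first_names = (list(column) for column in zip(*pairs))

# The question asks for first names first.
result = [first_names, last_names]
print(result)
```

This assumes every entry contains exactly one comma; an entry without one would make the two-way unpacking fail.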
