Removing quotation marks in Python

I used Python to import text data as a list. As a result, the items in the list are wrapped in double quotes, which I do not want. I want to remove them, but I'm having some difficulty. I would be glad if someone could help me with this issue.
The list in python is of this nature:
lst = ["'BEN','JIM'", "'GIN','ANN'"]
I want to have the double quotes removed so I can get this list:
lst = ['BEN','JIM', 'GIN','ANN']
I tried these lines of code, but to no avail:
lst = ["'BEN','JIM'", "'GIN','ANN'"]
lst = [x.strip("'") for x in lst]
lst
and the result is not what I'm expecting:
["BEN','JIM", "GIN','ANN"]
Again, I would be grateful for your help.

Use the replace() string function to get rid of the single quotes, and then split on commas.
lst = ["'BEN','JIM'", "'GIN','ANN'"]
newlst = []
for pair in lst:
    for name in pair.replace("'", "").split(","):
        newlst.append(name)

You have confused the display representation of an item as being equivalent to its value.
Look at what you have: a list of two elements:
["'BEN','JIM'",
"'GIN','ANN'"]
You want to obtain a list of four elements:
['BEN',
'JIM',
'GIN',
'ANN']
You cannot do this by simple character manipulation: the operations you tried do not change the quantity of elements.
Instead, you have to process the elements you have, splitting 2-for-1.
I'll keep the Python technology low ...
new_lst = []
for two_name in lst:
    name_pair = two_name.split(',')
    new_lst.extend(name_pair)
Output:
["'BEN'", "'JIM'", "'GIN'", "'ANN'"]
Now you can use your previous technique to remove the single-quotes, leaving you with the four, 3-letter names you wanted.
Does that solve your problem?
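Putting the two steps together, a minimal sketch might look like this. The names quoted_names and names are only illustrative and do not come from the answer above.

```python
lst = ["'BEN','JIM'", "'GIN','ANN'"]

# Step 1: split each two-name string on the comma, growing the list to four items.
quoted_names = []
for two_name in lst:
    quoted_names.extend(two_name.split(','))

# Step 2: strip the leftover single quotes from each individual name.
names = [name.strip("'") for name in quoted_names]
print(names)  # ['BEN', 'JIM', 'GIN', 'ANN']
```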

The data looks similar to a CSV file where the individual cells are quoted. If it looks like a duck... use a reader to strip them for you. This is a nested list comprehension that first builds rows from the list and then flattens them into a single list.
>>> import csv
>>> lst = ["'BEN','JIM'", "'GIN','ANN'"]
>>> [cell for row in csv.reader(lst, quotechar="'") for cell in row]
['BEN', 'JIM', 'GIN', 'ANN']
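One place where the reader approach could matter is when a quoted cell itself contains a comma. The input below is a made-up example, not data from the question; it only sketches the difference from a plain replace-and-split.

```python
import csv

# Hypothetical input: the first cell contains a comma inside its quotes.
rows = ["'SMITH, JOHN','ANN'"]

# Plain replace/split breaks the first name at the embedded comma.
naive = [name for row in rows for name in row.replace("'", "").split(",")]

# csv.reader treats the quoted cell as a single field.
parsed = [cell for row in csv.reader(rows, quotechar="'") for cell in row]

print(naive)   # ['SMITH', ' JOHN', 'ANN']
print(parsed)  # ['SMITH, JOHN', 'ANN']
```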

lst = ["'BEN','JIM'", "'GIN','ANN'"]
lst = "".join(lst)
print(lst)
Output:
'BEN','JIM''GIN','ANN'
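Note that joining with an empty string produces a single string, not the four-item list the question asks for. One possible way to finish the idea, assuming every item is a comma-separated run of valid Python string literals and the joined text holds at least two names, is to join with a comma and parse the result with ast.literal_eval. This is a sketch of a different technique, not what the answer above does.

```python
import ast

lst = ["'BEN','JIM'", "'GIN','ANN'"]

# Join with a comma so the pieces form one tuple literal.
joined = ",".join(lst)

# literal_eval safely parses the literal into a tuple of strings.
names = list(ast.literal_eval(joined))
print(names)  # ['BEN', 'JIM', 'GIN', 'ANN']
```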

You can do this with list comprehensions on two lines:
input:
lst = ["'BEN','JIM'", "'GIN','ANN'"]
lst1 = [x.replace("'", "").split(',') for x in lst]
lst2 = [z for y in lst1 for z in y]
or on one line
lst2 = [z for y in [x.replace("'", "").split(',') for x in lst] for z in y]
output:
['BEN', 'JIM', 'GIN', 'ANN']
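If the nested comprehension is hard to read, the flattening step could instead be handed to itertools.chain.from_iterable. This is an alternative sketch; the name flat is illustrative.

```python
from itertools import chain

lst = ["'BEN','JIM'", "'GIN','ANN'"]

# Each string yields a small list of names; chain.from_iterable flattens them.
flat = list(chain.from_iterable(x.replace("'", "").split(",") for x in lst))
print(flat)  # ['BEN', 'JIM', 'GIN', 'ANN']
```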

Related

Splitting a list into single list amounts

I'm trying to split a list in Python into single-item lists, but I can't seem to get it to work, and I can't find any questions on Stack Overflow that try to achieve this.
At the moment, I've got code that produces ids, but I need those ids separated:
['325', '323', '324', '322']
I want to split these so they go into
['323']
['324']
['322']
What would be the best way to do this?
The lists vary in length, and some of them contain only one id.
Since you want each element in its own list, iterate through the original list, wrap each element in a new list, and append that new list to the result.
main_list = ['325', '323', '324', '322']
final_solution = []
for element in main_list:
    tmp = [element]
    final_solution.append(tmp)
print(final_solution)
# output -> [['325'], ['323'], ['324'], ['322']]
or, by using list comprehension
final_solution = [[element] for element in main_list]
print(final_solution)
# output -> [['325'], ['323'], ['324'], ['322']]
Here's a simple one-liner that turns each item in the list to an array containing the item:
list(map(lambda x : [x], arr))
so if you have arr = [1,2,3] :
>>> a = list(map(lambda x : [x], arr))
>>> print(a)
[[1], [2], [3]]
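my_list = ['325', '323', ['324', '327'], '345', '322']
new_list = []
for i in my_list:
    if isinstance(i, str):
        # a plain string becomes its own single-item list
        new_list.append([i])
    else:
        # a nested list contributes one single-item list per element
        for j in i:
            new_list.append([j])
# new_list is now [['325'], ['323'], ['324'], ['327'], ['345'], ['322']]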

Replace string in specific index in list of lists python

How can I replace a string in a list of lists in Python so that the change applies only to a specific index and leaves the other indexes untouched? Here is an example:
mylist = [["test_one", "test_two"], ["test_one", "test_two"]]
I want to change the word "test" to "my" so that only the second index is affected:
mylist = [["test_one", "my_two"], ["test_one", "my_two"]]
I can figure out how to change both items in each list, but I can't figure out how to change only one specific index.
Use indexing:
newlist = []
for l in mylist:
    l[1] = l[1].replace("test", "my")
    newlist.append(l)
print(newlist)
Or a one-liner, if you always have two elements in the sublist:
newlist = [[i, j.replace("test", "my")] for i, j in mylist]
print(newlist)
Output:
[['test_one', 'my_two'], ['test_one', 'my_two']]
There is a way to do this on one line but it is not coming to me at the moment. Here is how to do it in two lines.
for two_word_list in mylist:
    two_word_list[1] = two_word_list[1].replace("test", "my")
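The loops above modify the sublists in place and hard-code index 1. As a rough sketch, a helper that takes the index as a parameter and returns new lists could look like the following. The function name replace_at and its parameters are hypothetical and do not come from the answers.

```python
def replace_at(rows, index, old, new):
    """Return new rows where `old` is replaced by `new` only in the item at `index`."""
    return [
        [item.replace(old, new) if i == index else item for i, item in enumerate(row)]
        for row in rows
    ]

mylist = [["test_one", "test_two"], ["test_one", "test_two"]]
result = replace_at(mylist, 1, "test", "my")
print(result)  # [['test_one', 'my_two'], ['test_one', 'my_two']]
```

Because the helper builds new lists, mylist itself is left unchanged.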

Check if element of list is sub-element of other list elements in same list

I am looking for a way to check if an element of a list is sub-element of any other elements of that same list?
For example, let's use the below list as an example.
['Lebron James', 'Lebron', 'James']
The 2nd and 3rd elements of this list are a sub-element of the 1st element of the list.
I am looking for a way to remove these elements from the list so only the 1st element remains. I have been spinning my wheels and unable to come up with a solution.
Can someone help?
Thanks
Here's a slow solution, might be acceptable depending on your data size:
lst = ['Lebron James', 'Lebron', 'James']
[s for s in lst if not any(s in s2.split() for s2 in lst if s != s2)]
This is definitely an easier problem to tackle with the starting and ending points for the match instead of the strings themselves.
One approach can be to take all ranges from biggest to smallest, and work backwards, creating the result as you go, given a range is not fully contained in another.
lst = [(0, 10),(0, 4),(5, 10)]
result = []
def membership(big_range, small_range):
    '''return true if big_range fully contains the small_range.
    where both are tuples with a start and end value.
    '''
    if small_range[0] >= big_range[0] and small_range[1] <= big_range[1]:
        return True
    return False
for range_ in sorted(lst, key=lambda x: x[1] - x[0], reverse=True):
    if not any(membership(x, range_) for x in result):
        result.append(range_)
print(result)
# [(0, 10)]
Edit: this answer was in response to the OP'S edited question, which seems to have since been rolled back. Oh well. Hope it helps someone anyways.
You can try creating a dictionary of all permutations (the choice between permutations, sublists, or something else depends on the desired behavior), grouped by each element's word count:
import re
import itertools
from collections import defaultdict
lst = [
'Lebron Raymone James', 'Lebron Raymone',
'James', "Le", "Lebron James",
'Lebron James 1 2 3', 'Lebron James 1 2'
]
d = defaultdict(dict)
g = r"\b\w+\b"
for x in lst:
    words = re.findall(g, x)  # could simply use x.split() if have just spaces
    combos = [
        x for i in range(1, len(words) + 1)
        for x in list(itertools.permutations(words, i))
    ]
    for c in combos:
        d[len(words)][tuple(c)] = True
Then keep only the elements whose words are not present in any of the groups with a greater word count:
M = max(d)
res = []
for x in lst:
    words = tuple(re.findall(g, x))
    if not any(d[i].get(words) for i in range(len(words)+1, M+1)):
        res.append(x)
set(res)
# {'Le', 'Lebron James 1 2 3', 'Lebron Raymone James'}
Create a set containing all the words in the strings that are multiple words. Then go through the list, testing the strings to see if they're in the set.
wordset = set()
lst = ['Lebron James', 'Lebron', 'James']
for s in lst:
    if " " in s:
        wordset.update(s.split())
result = [x for x in lst if x not in wordset]
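Wrapped in a function, the set-based idea might be sketched as below. The function name drop_subwords and the extra 'Kobe' entry are illustrative assumptions. Note that this only removes single words that appear inside a multi-word string; a multi-word string contained in a longer one would be kept.

```python
def drop_subwords(items):
    """Remove single-word items that occur as a word of some multi-word item."""
    words = set()
    for s in items:
        parts = s.split()
        if len(parts) > 1:
            # only multi-word strings contribute words to the lookup set
            words.update(parts)
    return [s for s in items if s not in words]

result = drop_subwords(['Lebron James', 'Lebron', 'James', 'Kobe'])
print(result)  # ['Lebron James', 'Kobe']
```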

Join elements of an arbitrary number of lists into one list of strings python

I want to join the elements of two lists into one list and add some characters, like so:
list_1 = ['some1','some2','some3']
list_2 = ['thing1','thing2','thing3']
joined_list = ['some1_thing1', 'some2_thing2', 'some3_thing3']
However, I don't know in advance how many lists I will have to do this for, i.e. I want to do this for an arbitrary number of lists.
Also, I currently receive a list in the following form:
list_A = [('some1','thing1'),('some2','thing2'),('some3','thing3')]
so I split it up into lists like so:
list_B = [i for i in zip(*list_A)]
I do this because sometimes I have an int instead of a string
list_A = [('some1','thing1',32),('some1','thing1',42),('some2','thing3', 52)]
so I can do this after
list_C = [list(map(str, list_B[i])) for i in range(0, len(list_B))]
and basically list_1 and list_2 are the elements of list_C.
So is there a more efficient way to do all this ?
Try this if you are using python>=3.6:
[f'{i}_{j}' for i,j in zip(list_1, list_2)]
If you are using Python 3.5, you can do this:
['{}_{}'.format(i,j) for i,j in zip(list_1, list_2)]
also you can use this if you don't want to use formatted string:
['_'.join([i,j]) for i,j in zip(list_1, list_2)]
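The three variants above each pair exactly two lists. For an arbitrary number of lists, as the question asks, one possible sketch is to unpack the lists into zip and convert every item with str before joining. The helper name join_columns, its sep parameter, and list_3 are assumptions made for illustration.

```python
def join_columns(*lists, sep="_"):
    """Join the i-th items of every list into one string per position."""
    # zip stops at the shortest list; str() lets ints mix with strings
    return [sep.join(map(str, parts)) for parts in zip(*lists)]

list_1 = ['some1', 'some2', 'some3']
list_2 = ['thing1', 'thing2', 'thing3']
list_3 = [32, 42, 52]  # hypothetical third list containing ints

two = join_columns(list_1, list_2)
three = join_columns(list_1, list_2, list_3)
print(two)    # ['some1_thing1', 'some2_thing2', 'some3_thing3']
print(three)  # ['some1_thing1_32', 'some2_thing2_42', 'some3_thing3_52']
```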
You can apply join directly to the original list_A, with no need to split it first to handle possible int values:
list_A = [('some1','thing1',32),('some1','thing1',42), ('some2','thing3', 52)]
["_".join(map(str, i)) for i in list_A]
Output:
['some1_thing1_32', 'some1_thing1_42', 'some2_thing3_52']
Update:
For your requirement, where you want to ignore the last element of the last tuple in list_A, add an if-else condition inside the list comprehension as below (enumerate is used for the position so that duplicate tuples are handled correctly):
["_".join(map(str, t if n != len(list_A) - 1 else t[:-1])) for n, t in enumerate(list_A)]
Updated Output:
['some1_thing1_32', 'some1_thing1_42', 'some2_thing3']
For ignoring the last element of every tuple in list_A, I found this to be the quickest way:
["_".join(map(str, i)) for i in [x[:-1] for x in list_A] ]

How to split a list of lists

So I want to split a list of lists.
the code is
myList = [['Sam has an apple,5,May 5'],['Amy has a pie,6,Mar 3'],['Yoo has a Football, 5 ,April 3']]
I tried use this:
for i in mylist:
    i.split(",")
But it keeps giving me an error message.
I want to get:
["Amy has a pie" , "6" , "Mar 3"] This kind of format
Here's how you do it. I used an inline for loop to iterate through each item and split them by comma.
myList = [item[0].split(",") for item in myList]
print(myList)
OR you can use enumerate to iterate through the list normally, replacing items as you go:
for index, item in enumerate(myList):
    myList[index] = item[0].split(",")
print(myList)
OR You can create a new list as you iterate through with the improved value:
newList = []
for item in myList:
    newList.append(item[0].split(","))
print(newList)
Split each string in sublist :)
new = []
for l in myList:
    new.append([x.split(',') for x in l])
print(new)
It's because it's a list of lists. Your code is trying to split the sub-list, not a string. Index into each sub-list first, and keep the result, since split() returns a new list:
for i in myList:
    print(i[0].split(","))
Another way would be:
list(map(lambda x: x[0].split(","), myList))
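The third entry in the question has spaces around the 5, which a plain split keeps. If those should be trimmed, one possible variation is to strip each field after splitting. The name cleaned is illustrative.

```python
myList = [['Sam has an apple,5,May 5'], ['Amy has a pie,6,Mar 3'], ['Yoo has a Football, 5 ,April 3']]

# Split the single string in each sub-list, then trim whitespace from every field.
cleaned = [[field.strip() for field in row[0].split(",")] for row in myList]
print(cleaned[2])  # ['Yoo has a Football', '5', 'April 3']
```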
