Splitting a Python list based on criteria - python

I have a Python list as follows:
mylist = [('Item A','CA','10'),('Item B','CT','12'),('Item C','CA','14')]
I would like to filter it into a new list that keeps only the rows where column 2 == 'CA'.
Desired Output:
filtered_list = [('Item A','CA','10'),('Item C','CA','14')]
My attempt (clearly there are some issues here!):
mylist = [('Item A','CA','10'),('Item B','CT','12'),('Item C','CA','14')]
filtered_list[]
for row in mylist:
    if [row:1] = 'CA'
        filtered_list.append(mylist[row])

You can use list comprehension to achieve this:
mylist = [('Item A','CA','10'),('Item B','CT','12'),('Item C','CA','14')]
filtered_list = [item for item in mylist if item[1]=='CA']

You can use Python's built-in filter for this purpose in the following way:
filtered_list = list(filter(lambda x: x[1] == 'CA', mylist))

Instead of writing my own answer, I would like to point out where you went wrong, with an explanation.
mylist = [('Item A','CA','10'),('Item B','CT','12'),('Item C','CA','14')]
filtered_list[] ## I believe this was a typo
for row in mylist:
    if [row:1] = 'CA' ## this is where you went wrong!
        filtered_list.append(mylist[row])
I have corrected your code:
mylist = [('Item A','CA','10'),('Item B','CT','12'),('Item C','CA','14')]
filtered_list = []  ## list created
for row in mylist:
    if row[1] == 'CA':  ## index with row[1], compare with ==, and end the line with :
        filtered_list.append(row)  ## append the row itself if row[1] == 'CA'
print(filtered_list)
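The answers above keep only the matching rows. If the non-matching rows are needed as well, which is one way to read "splitting", a single pass can build both lists. This is a minimal sketch; the helper name partition is hypothetical and not part of the standard library.

```python
def partition(rows, predicate):
    """Split rows into (matching, non_matching) in a single pass."""
    matching, non_matching = [], []
    for row in rows:
        # Append each row to exactly one of the two output lists.
        (matching if predicate(row) else non_matching).append(row)
    return matching, non_matching


mylist = [('Item A', 'CA', '10'), ('Item B', 'CT', '12'), ('Item C', 'CA', '14')]
ca_rows, other_rows = partition(mylist, lambda row: row[1] == 'CA')
print(ca_rows)     # [('Item A', 'CA', '10'), ('Item C', 'CA', '14')]
print(other_rows)  # [('Item B', 'CT', '12')]
```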

Related

trying to delete duplicate dictionary values from the list of dictionaries

Below is my list of dictionaries.
I am trying to remove the duplicate values from 'r_id'. Some values are lists and some are strings.
list_of_dict = [{'fid':200, 'r_id':['4321', '4321']}, {'fid':201, 'r_id':'5321'}]
Expected output:
list_of_dict = [{'fid':200, 'r_id':['4321']}, {'fid':201, 'r_id':['5321']}]
I've tried the code below, but it did not work:
for item in list_of_dict:
    if type(item["record_id"]) == list:
        item["record_id"] = [set(item["record_id"])]
Please suggest a solution.
Do:
result = [{ "fid" : d["fid"] , "r_id" : list(set(d["r_id"])) } for d in list_of_dict]
print(result)
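Or simply:
for d in list_of_dict:
    d["r_id"] = list(set(d["r_id"]))
print(list_of_dict)
Note that calling set() on a string such as '5321' splits it into individual characters, so both snippets above assume every r_id is a list.
If you really need to check the type, use isinstance:
for d in list_of_dict:
    if isinstance(d["r_id"], list):
        d["r_id"] = list(set(d["r_id"]))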
For the canonical way of checking type in Python, read this.
If item['r_id'] can also hold another type, such as str, you can try this:
list_of_dict = [{'fid':201, 'r_id':'5321'}, {'fid':200, 'r_id':['4321', '4321']}]
for item in list_of_dict:
    if type(item['r_id']) == list:
    # if isinstance(item['r_id'], list):
        item['r_id'] = list(set(item['r_id']))
    elif type(item['r_id']) == str:
    # elif isinstance(item['r_id'], str):
        item['r_id'] = [item['r_id']]
# Shortest approach
>>> [{'fid' : item['fid'], 'r_id' : list(set(item['r_id'])) if type(item['r_id']) == list else [item['r_id']]} for item in list_of_dict]
[{'fid': 201, 'r_id': ['5321']}, {'fid': 200, 'r_id': ['4321']}]
You are almost there!
Though there may be other (better) solutions, your solution will also work if you change it as below:
for item in list_of_dict:
    if type(item["r_id"]) == list:
        item["r_id"] = list(set(item["r_id"]))
Try this:
for item in list_of_dict:
    if isinstance(item["r_id"], list):
        temp_list = []
        for value in item["r_id"]:
            if value not in temp_list:  # keep only the first occurrence
                temp_list.append(value)
        item["r_id"] = temp_list
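The answers above differ in whether they handle the string case and whether they keep the original order (set() does not guarantee it). One possible way to combine both is sketched below. The helper name normalize_r_id is hypothetical, and the sketch assumes every r_id is either a str or a list of hashable values.

```python
def normalize_r_id(value):
    """Return value as a list with duplicates removed, keeping first-seen order."""
    if isinstance(value, str):
        # A bare string becomes a one-element list instead of being split into characters.
        return [value]
    # dict.fromkeys keeps one key per distinct element, in insertion order (Python 3.7+).
    return list(dict.fromkeys(value))


list_of_dict = [{'fid': 200, 'r_id': ['4321', '4321']}, {'fid': 201, 'r_id': '5321'}]
for d in list_of_dict:
    d['r_id'] = normalize_r_id(d['r_id'])
print(list_of_dict)
# [{'fid': 200, 'r_id': ['4321']}, {'fid': 201, 'r_id': ['5321']}]
```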

Filtering a list of items with the same hour and keeping just one in Python

I have a list of multiple items in Python. The list is generated randomly; this is an example:
['12:01;Jhon',
'13:25;Charlie',
'14:00;Joshua',
'12:01;Dean',
'15:04;Derek',
'14:58;George',
'12:01;Wilson',
'15:04;Marcus']
I need to generate a new list that keeps only the first item for each hour, along with the items whose hour appears only once:
['12:01;Jhon',
'13:25;Charlie',
'14:00;Joshua',
'15:04;Derek',
'14:58;George']
Explaining the new list: Jhon was the first item with 12:01, so he is in the new list, while Dean and Wilson are removed because they also have 12:01. Joshua and George stay in the list because their hours differ from all the others. Derek was the first item with 15:04, so Marcus is removed because he also has 15:04.
You can use set() to filter out the duplicates. For example:
lst = [
    "12:01;Jhon",
    "13:25;Charlie",
    "14:00;Joshua",
    "12:01;Dean",
    "15:04;Derek",
    "14:58;George",
    "12:01;Wilson",
    "15:04;Marcus",
]
out, seen = [], set()
for item in lst:
    hour = item.split(";", maxsplit=1)[0]
    if hour not in seen:
        out.append(item)
        seen.add(hour)
print(out)
Prints:
['12:01;Jhon', '13:25;Charlie', '14:00;Joshua', '15:04;Derek', '14:58;George']
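The same seen-set idea can be wrapped in a reusable generator so that the key (here, the hour before the semicolon) is passed in as a function. This is only a sketch; the name first_per_key is hypothetical, and the shortened input list is for illustration.

```python
def first_per_key(items, key):
    """Yield the first item seen for each distinct key(item), in original order."""
    seen = set()
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            yield item


lst = ["12:01;Jhon", "13:25;Charlie", "12:01;Dean", "15:04;Derek", "15:04;Marcus"]
result = list(first_per_key(lst, key=lambda s: s.partition(";")[0]))
print(result)  # ['12:01;Jhon', '13:25;Charlie', '15:04;Derek']
```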
Just a short dict solution (s[:5] assumes the hour is always the first five characters, as in HH:MM):
d = {}
for s in lst:
    d.setdefault(s[:5], s)
result = list(d.values())
This could work:
x = ['12:01;Jhon', '13:25;Charlie', '14:00;Joshua', '12:01;Dean', '15:04;Derek', '14:58;George', '12:01;Wilson', '15:04;Marcus']
y = {}
for elems in x:
    elems = elems.split(';')
    if elems[0] not in y:
        y[elems[0]] = elems[1]
x = [elems2 + ';' + y[elems2] for elems2 in y]
print(x)
I would also suggest keeping a dictionary for this kind of data, but to match your example output I turned the dict back into a list.
My suggestion:
We need to split up the items into lists [hour, name] and make it a dict:
items = ['12:01;Jhon', '13:25;Charlie', '14:00;Joshua', '12:01;Dean', '15:04;Derek', '14:58;George', '12:01;Wilson', '15:04;Marcus']
split_items = list(map(lambda x: x.split(';'), items))
# we need to reverse it first because dict keeps the last value assigned to a duplicate key
new_items_dict = dict(reversed(split_items))
# return it back to a list
new_items_list = list(new_items_dict.items())
# new_items_list == [('15:04', 'Derek'), ('12:01', 'Jhon'), ('14:58', 'George'), ('14:00', 'Joshua'), ('13:25', 'Charlie')]
# And if you want to join them back
new_items = list(map(lambda x: ';'.join(x), new_items_list))
# new_items == ['15:04;Derek', '12:01;Jhon', '14:58;George', '14:00;Joshua', '13:25;Charlie']
You can try this (assuming items is your list):
new_list = []
checks = []
for i in items:
    a = i.split(';')[0]
    if a not in checks:
        new_list.append(i)
        checks.append(a)
del checks
print(new_list)

What is an easy way to remove duplicates from only part of the string in Python?

I have a list of strings that goes like this:
1;213;164
2;213;164
3;213;164
4;213;164
5;213;164
6;213;164
7;213;164
8;213;164
9;145;112
10;145;112
11;145;112
12;145;112
13;145;112
14;145;112
15;145;112
16;145;112
17;145;112
1001;1;151
1002;2;81
1003;3;171
1004;4;31
I would like to remove all duplicates where the last two numbers are the same. After running the list through the program, I would get something like this:
1;213;164
9;145;112
1001;1;151
1002;2;81
1003;3;171
1004;4;31
But something like
8;213;164
15;145;112
1001;1;151
1002;2;81
1003;3;171
1004;4;31
would also be correct.
Here is a nice and fast trick you can use (assuming l is your list):
list({ s.split(';', 1)[1] : s for s in l }.values())
No need to import anything, and fast as can be.
In general, you can define:
def custom_unique(L, keyfunc):
    return list({ keyfunc(li): li for li in L }.values())
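One detail worth seeing in action: a dict comprehension overwrites the value each time a key repeats, so this trick keeps the last item of each group (the second acceptable output in the question). If the first item is preferred, setdefault can be used instead. The sketch below compares the two; the function names and the shortened list are made up for illustration.

```python
def unique_keep_last(items, keyfunc):
    # A dict comprehension overwrites the value on every repeat, so the last item wins.
    return list({keyfunc(item): item for item in items}.values())


def unique_keep_first(items, keyfunc):
    first = {}
    for item in items:
        # setdefault only stores the item if the key has not been seen yet.
        first.setdefault(keyfunc(item), item)
    return list(first.values())


rows = ['1;213;164', '2;213;164', '9;145;112', '10;145;112', '1001;1;151']
tail = lambda s: s.split(';', 1)[1]
print(unique_keep_last(rows, tail))   # ['2;213;164', '10;145;112', '1001;1;151']
print(unique_keep_first(rows, tail))  # ['1;213;164', '9;145;112', '1001;1;151']
```

Both versions rely on dicts preserving insertion order, which is guaranteed from Python 3.7 onward.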
You can group the items by this key and then use the first item in each group (assuming l is your list).
import itertools
keyfunc = lambda x: x.split(";", 1)[1]
[next(g) for k, g in itertools.groupby(sorted(l, key=keyfunc), keyfunc)]
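The sorted() call matters because groupby only merges consecutive items with the same key, and sorting also changes the order of the result. If the duplicates are already adjacent, as they appear to be in the sample data (an assumption about the real input), the sort can be skipped and the original order is kept. A minimal sketch on a shortened list:

```python
from itertools import groupby

rows = ['1;213;164', '2;213;164', '9;145;112', '10;145;112', '1001;1;151', '1002;2;81']
tail = lambda s: s.split(';', 1)[1]

# Without sorting, groupby only merges consecutive items that share a key,
# so the original order is kept and the first item of each run is taken.
deduped = [next(group) for _, group in groupby(rows, key=tail)]
print(deduped)  # ['1;213;164', '9;145;112', '1001;1;151', '1002;2;81']
```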
Here is the code run on the first few items; just swap my list for yours:
x = [
    '7;213;164',
    '8;213;164',
    '9;145;112',
    '10;145;112',
    '11;145;112',
]
new_list = []
for i in x:
    check = True
    s_part = i[i.find(';'):]  # everything from the first ';' onward
    for j in new_list:
        if j.endswith(s_part):  # endswith avoids accidental matches in the middle of j
            check = False
    if check:
        new_list.append(i)
print(new_list)
Output:
['7;213;164', '9;145;112']

python edit tuple duplicates in a list

My goal is this:
While looping over a list, I would like to check for duplicates and, if there are any, append a number to them. See the following example.
My list, as an example:
[('name','company'), ('someguy','microsoft'), ('anotherguy','microsoft'), ('thirdguy','amazon')]
In a loop, I would like to edit those duplicates, so that instead of the second microsoft I would have microsoft1 (and if there were three microsoft entries, the third would have microsoft2).
With the code below I can find the duplicates, but I don't know how to edit them directly in the list:
list = [('name','company'), ('someguy','microsoft'), ('anotherguy','microsoft'), ('thirdguy','amazon')]
names = []
double = []
for u in list[1:]:
    names.append(u[1])
list_size = len(names)
for i in range(list_size):
    k = i + 1
    for j in range(k, list_size):
        if names[i] == names[j] and names[i] not in double:
            double.append(names[i])
This is one approach using collections.defaultdict.
Ex:
from collections import defaultdict
lst = [('name','company'), ('someguy','microsoft'), ('anotherguy','microsoft'), ('thirdguy','amazon')]
seen = defaultdict(int)
result = []
for k, v in lst:
    if seen[v]:
        result.append((k, "{}_{}".format(v, seen[v])))
    else:
        result.append((k,v))
    seen[v] += 1
print(result)
Output:
[('name', 'company'),
('someguy', 'microsoft'),
('anotherguy', 'microsoft_1'),
('thirdguy', 'amazon')]
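The answer above joins the name and the counter with an underscore, while the question asked for microsoft1 and microsoft2. A small variation without the underscore might look like the sketch below. It uses collections.Counter in place of defaultdict, and the extra third microsoft row is made up here to show the second suffix.

```python
from collections import Counter

# The ('thirdguy', 'microsoft') row is a hypothetical addition to the question's data.
rows = [('name', 'company'), ('someguy', 'microsoft'),
        ('anotherguy', 'microsoft'), ('thirdguy', 'microsoft'), ('fourthguy', 'amazon')]

seen = Counter()
result = []
for name, company in rows:
    count = seen[company]  # how many times this company appeared before this row
    # The first occurrence stays unchanged; later ones get 1, 2, ... appended directly.
    result.append((name, company if count == 0 else f"{company}{count}"))
    seen[company] += 1

print(result)
```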

Slice a list into a nested list based on special characters using Python

I have a list of strings like this:
lst = ['23532','user_name=app','content=123',
'###########################',
'54546','user_name=bee','content=998 hello','source=fb',
'###########################',
'12/22/2015']
I want a method similar to string.split('#') that can give me output like this:
[['23532','user_name=app','content=123'],
['54546','user_name=bee','content=998 hello','source=fb'],
['12/22/2015']]
but I know a list has no split method. I cannot use ''.join(lst) either, because this list comes from part of a txt file I read in, and the file is so big that joining it throws a memory error.
I don't think there's a one-liner for this, but you can easily write a generator to do what you want:
def sublists(lst):
    x = []
    for item in lst:
        if item == '###########################': # or whatever condition you like
            if x:
                yield x
            x = []
        else:
            x.append(item)
    if x:
        yield x
new_list = list(sublists(old_list))
If you can't use .join(), you can loop through the list and save the index of any string that contains # then loop again to slice the list:
lst = ['23532', 'user_name=app', 'content=123', '###########################' ,'54546','user_name=bee','content=998 hello','source=fb','###########################','12/22/2015']
idx = []
new_lst = []
for i, val in enumerate(lst):
    if '#' in val:
        idx.append(i)
j = 0
for x in idx:
    new_lst.append(lst[j:x])
    j = x+1
new_lst.append(lst[j:])
print(new_lst)
output:
[['23532', 'user_name=app', 'content=123'], ['54546', 'user_name=bee', 'content=998 hello', 'source=fb'], ['12/22/2015']]
from pprint import pprint
sep = '###########################'
def split_list(_list):
    lists = list()
    sub_list = list()
    for x in _list:
        if x == sep:
            # separator reached: store the finished sub-list and start a new one
            lists.append(sub_list)
            sub_list = list()
        else:
            sub_list.append(x)
    lists.append(sub_list)
    return lists
l = ['23532','user_name=app','content=123',
'###########################',
'54546','user_name=bee','content=998 hello','source=fb',
'###########################',
'12/22/2015']
pprint(split_list(l))
Output:
[['23532', 'user_name=app', 'content=123'],
['54546', 'user_name=bee', 'content=998 hello', 'source=fb'],
['12/22/2015']]
You can achieve this with itertools.groupby:
from itertools import groupby
lst = ['23532','user_name=app','content=123',
'###########################','54546','user_name=bee','content=998 hello','source=fb',
'###########################','12/22/2015']
[list(g) for k, g in groupby(lst, lambda x: x == '###########################') if not k]
Output
[['23532', 'user_name=app', 'content=123'],
['54546', 'user_name=bee', 'content=998 hello', 'source=fb'],
['12/22/2015']]
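Since the question mentions a file too large to join in memory, the same groupby idea can be applied lazily to the file's lines, so only one record is held at a time. This is a sketch under assumptions: the names is_separator and records are hypothetical, a separator is taken to be any line made up only of '#' characters, and io.StringIO stands in for the real file.

```python
import io
from itertools import groupby


def is_separator(line):
    # Assumption: a separator line consists only of '#' characters.
    return bool(line) and set(line) == {'#'}


def records(lines):
    """Lazily yield one list per record from an iterable of lines."""
    stripped = (line.rstrip('\n') for line in lines)
    for is_sep, group in groupby(stripped, key=is_separator):
        if not is_sep:
            yield list(group)


# io.StringIO stands in for a real file object here.
fake_file = io.StringIO("23532\nuser_name=app\n####\n54546\nsource=fb\n####\n12/22/2015\n")
for record in records(fake_file):
    print(record)
```

With a real file, the StringIO object would be replaced by the handle returned from open(), iterated inside a with block.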
