Splitting a Python list based on criteria


I have a Python list as follows:
mylist = [('Item A','CA','10'),('Item B','CT','12'),('Item C','CA','14')]
I would like to filter it into a new list that keeps only the rows whose second column is 'CA'.
Desired output:
filtered_list = [('Item A','CA','10'),('Item C','CA','14')]
My attempt (clearly there are some issues here!):
mylist = [('Item A','CA','10'),('Item B','CT','12'),('Item C','CA','14')]
filtered_list[]
for row in mylist:
    if [row:1] = 'CA'
        filtered_list.append(mylist[row])

You can use a list comprehension to achieve this:
mylist = [('Item A','CA','10'),('Item B','CT','12'),('Item C','CA','14')]
filtered_list = [item for item in mylist if item[1] == 'CA']

You can use Python's built-in filter for this purpose in the following way:
filtered_list = list(filter(lambda x: x[1] == 'CA', mylist))

Instead of writing my own answer, I would like to point out where you went wrong, with an explanation.
mylist = [('Item A','CA','10'),('Item B','CT','12'),('Item C','CA','14')]
filtered_list[]  ## I believe this was a typo
for row in mylist:
    if [row:1] = 'CA'  ## this is where you went wrong!
        filtered_list.append(mylist[row])
Here is your code, corrected:
mylist = [('Item A','CA','10'),('Item B','CT','12'),('Item C','CA','14')]
filtered_list = []  ## list created
for row in mylist:
    if row[1] == 'CA':  ## row[1] for indexing, == for comparison, and a trailing :
        filtered_list.append(row)  ## append the row itself if row[1] == 'CA'
print(filtered_list)
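
If both halves of the split are needed (the rows that match and the rows that do not), a single pass can build two lists. This is only a minimal sketch; the helper name `partition` and its `predicate` argument are illustrative assumptions, not something from the question.

```python
def partition(rows, predicate):
    """Split rows into (matching, non_matching) in one pass, keeping order."""
    matching, non_matching = [], []
    for row in rows:
        # Append to whichever list the predicate selects.
        (matching if predicate(row) else non_matching).append(row)
    return matching, non_matching


mylist = [('Item A', 'CA', '10'), ('Item B', 'CT', '12'), ('Item C', 'CA', '14')]
ca_rows, other_rows = partition(mylist, lambda row: row[1] == 'CA')
print(ca_rows)     # [('Item A', 'CA', '10'), ('Item C', 'CA', '14')]
print(other_rows)  # [('Item B', 'CT', '12')]
```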

Related

Trying to delete duplicate dictionary values from a list of dictionaries

Below is my list of dictionaries.
I'm trying to remove the duplicate values from 'r_id'. Some values are lists and some are strings.
list_of_dict = [{'fid':200, 'r_id':['4321', '4321']}, {'fid':201, 'r_id':'5321'}]
Expected output:
list_of_dict = [{'fid':200, 'r_id':['4321']}, {'fid':201, 'r_id':['5321']}]
I've tried the code below, but it did not work:
for item in list_of_dict:
    if type(item["record_id"]) == list:
        item["record_id"] = [set(item["record_id"])]
Please suggest a solution.

If every 'r_id' is a list, do:
result = [{ "fid" : d["fid"] , "r_id" : list(set(d["r_id"])) } for d in list_of_dict]
print(result)
Or simply:
for d in list_of_dict:
    d["r_id"] = list(set(d["r_id"]))
print(list_of_dict)
If some values are strings, check the type first with isinstance, because calling set() on a string splits it into characters:
for d in list_of_dict:
    if isinstance(d["r_id"], list):
        d["r_id"] = list(set(d["r_id"]))
isinstance is generally preferred over comparing type() directly, since it also accepts subclasses.

If item['r_id'] can also hold another type such as str, you can try this:
list_of_dict = [{'fid':201, 'r_id':'5321'}, {'fid':200, 'r_id':['4321', '4321']}]
for item in list_of_dict:
    if type(item['r_id']) == list:
    # if isinstance(item['r_id'], list):
        item['r_id'] = list(set(item['r_id']))
    elif type(item['r_id']) == str:
    # elif isinstance(item['r_id'], str):
        item['r_id'] = [item['r_id']]
# Shortest approach
>>> [{'fid' : item['fid'], 'r_id' : list(set(item['r_id'])) if type(item['r_id']) == list else [item['r_id']]} for item in list_of_dict]
[{'fid': 201, 'r_id': ['5321']}, {'fid': 200, 'r_id': ['4321']}]

You are almost there!
Though there may be other (better) solutions, your own solution will also work if you change it as below:
for item in list_of_dict:
    if type(item["r_id"]) == list:
        item["r_id"] = list(set(item["r_id"]))

Try this:
for item in list_of_dict:
    temp_list = list()
    if isinstance(item["r_id"], list):
        for value in item["r_id"]:
            if value not in temp_list:
                temp_list.append(value)
        item["r_id"] = temp_list
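
Note that set() does not guarantee any particular order, so the set()-based answers above may reorder the values in longer lists. Here is a small sketch that handles both strings and lists and keeps first-seen order by using dict.fromkeys. The helper name `normalize_r_id` is hypothetical, introduced only for this example.

```python
def normalize_r_id(value):
    """Return value as a list of unique items in first-seen order."""
    if isinstance(value, str):
        # A bare string becomes a one-element list instead of being split into characters.
        return [value]
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(value))


list_of_dict = [{'fid': 200, 'r_id': ['4321', '4321']}, {'fid': 201, 'r_id': '5321'}]
for d in list_of_dict:
    d['r_id'] = normalize_r_id(d['r_id'])
print(list_of_dict)
# [{'fid': 200, 'r_id': ['4321']}, {'fid': 201, 'r_id': ['5321']}]
```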

Filtering a list of items with the same hour and keeping just one in Python

I have a list of multiple items in Python. The list is generated randomly; this is an example:
['12:01;Jhon',
'13:25;Charlie',
'14:00;Joshua',
'12:01;Dean',
'15:04;Derek',
'14:58;George',
'12:01;Wilson',
'15:04;Marcus']
I need to generate a new list that keeps only the first item for each hour, along with the items whose hour is unique:
['12:01;Jhon',
'13:25;Charlie',
'14:00;Joshua',
'15:04;Derek',
'14:58;George']
Explaining the new list: Jhon was the first item with 12:01, so he is in the new list, and Dean and Wilson are removed because they also have 12:01. Joshua and George stay in the list because their hours differ from all the others. Derek was the first item with 15:04, so Marcus is removed because he also has 15:04.

You can use a set to keep track of the hours already seen and filter out the duplicates. For example:
lst = [
    "12:01;Jhon",
    "13:25;Charlie",
    "14:00;Joshua",
    "12:01;Dean",
    "15:04;Derek",
    "14:58;George",
    "12:01;Wilson",
    "15:04;Marcus",
]
out, seen = [], set()
for item in lst:
    hour = item.split(";", maxsplit=1)[0]
    if hour not in seen:
        out.append(item)
        seen.add(hour)
print(out)
Prints:
['12:01;Jhon', '13:25;Charlie', '14:00;Joshua', '15:04;Derek', '14:58;George']

Just a short dict solution (assuming lst is your list and every item starts with a five-character HH:MM time):
d = {}
for s in lst:
    d.setdefault(s[:5], s)
result = list(d.values())

This could work:
x = ['12:01;Jhon', '13:25;Charlie', '14:00;Joshua', '12:01;Dean', '15:04;Derek', '14:58;George', '12:01;Wilson', '15:04;Marcus']
y = {}
for elems in x:
    elems = elems.split(';')
    if elems[0] not in y:
        y[elems[0]] = elems[1]
x = [elems2 + ';' + y[elems2] for elems2 in y]
print(x)
I would also suggest keeping a dictionary for this kind of data; I turned the dict back into a list only to match your example output.

My suggestion:
We need to split the items into [hour, name] lists and build a dict from them:
items = ['12:01;Jhon', '13:25;Charlie', '14:00;Joshua', '12:01;Dean', '15:04;Derek', '14:58;George', '12:01;Wilson', '15:04;Marcus']
split_items = list(map(lambda x: x.split(';'), items))
# reverse first: dict() lets later pairs overwrite earlier ones for an existing key, so reversing makes the first occurrence win
new_items_dict = dict(reversed(split_items))
# turn it back into a list (note that the order follows the reversed input)
new_items_list = list(new_items_dict.items())
# new_items_list == [('15:04', 'Derek'), ('12:01', 'Jhon'), ('14:58', 'George'), ('14:00', 'Joshua'), ('13:25', 'Charlie')]
# and if you want to join them back:
new_items = list(map(lambda x: ';'.join(x), new_items_list))
# new_items == ['15:04;Derek', '12:01;Jhon', '14:58;George', '14:00;Joshua', '13:25;Charlie']

You can try this (assuming lst is your list):
new_list = []
checks = []  # hours already seen
for i in lst:
    a = i.split(';')[0]
    if a not in checks:
        new_list.append(i)
        checks.append(a)
del checks
print(new_list)
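
The loop-based answers above share one pattern: compute a key for each item and keep the item only the first time its key appears. A generic sketch of that pattern follows; the function name `first_per_key` is an assumption made for illustration.

```python
def first_per_key(items, key):
    """Keep the first item for each distinct key, preserving input order."""
    seen = set()
    result = []
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            result.append(item)
    return result


lst = ['12:01;Jhon', '13:25;Charlie', '14:00;Joshua', '12:01;Dean',
       '15:04;Derek', '14:58;George', '12:01;Wilson', '15:04;Marcus']
# The key is the text before the first ';', i.e. the hour.
print(first_per_key(lst, key=lambda s: s.split(';', 1)[0]))
# ['12:01;Jhon', '13:25;Charlie', '14:00;Joshua', '15:04;Derek', '14:58;George']
```

Using a set for the seen keys keeps each membership test cheap, which matters more as the list grows.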

What is an easy way to remove duplicates from only part of the string in Python?

I have a list of strings that goes like this:
1;213;164
2;213;164
3;213;164
4;213;164
5;213;164
6;213;164
7;213;164
8;213;164
9;145;112
10;145;112
11;145;112
12;145;112
13;145;112
14;145;112
15;145;112
16;145;112
17;145;112
1001;1;151
1002;2;81
1003;3;171
1004;4;31
I would like to remove all duplicates where the last two numbers are the same. After running the list through the program, I would get something like this:
1;213;164
9;145;112
1001;1;151
1002;2;81
1003;3;171
1004;4;31
But something like
8;213;164
15;145;112
1001;1;151
1002;2;81
1003;3;171
1004;4;31
would also be correct.

Here is a quick trick you can use (assuming l is your list):
list({ s.split(';', 1)[1] : s for s in l }.values())
No imports are needed, and it runs in a single pass over the list. Because later entries overwrite earlier ones in the dict, it keeps the last string for each key.
In general you can define:
def custom_unique(L, keyfunc):
    return list({ keyfunc(li): li for li in L }.values())
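
For illustration, here is how the dict-based helper behaves on a shortened version of the data, next to a first-wins variant built on dict.setdefault. The name `custom_unique_first` and the `tail` key function are hypothetical, introduced only for this sketch.

```python
def custom_unique(L, keyfunc):
    # Later items overwrite earlier ones, so the last item per key is kept.
    return list({keyfunc(li): li for li in L}.values())


def custom_unique_first(L, keyfunc):
    # setdefault only stores a value when the key is new,
    # so the first item per key is kept.
    d = {}
    for li in L:
        d.setdefault(keyfunc(li), li)
    return list(d.values())


l = ['1;213;164', '2;213;164', '9;145;112', '10;145;112', '1001;1;151']
tail = lambda s: s.split(';', 1)[1]  # everything after the first ';'
print(custom_unique(l, tail))        # ['2;213;164', '10;145;112', '1001;1;151']
print(custom_unique_first(l, tail))  # ['1;213;164', '9;145;112', '1001;1;151']
```

Both results are acceptable according to the question, so the choice only matters if a specific representative of each group is wanted.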

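You can group the items by this key and then take the first item in each group (assuming l is your list). Because the list is sorted by the key first, the result comes back ordered by key rather than in the original order: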
import itertools
keyfunc = lambda x: x.split(";", 1)[1]
[next(g) for k, g in itertools.groupby(sorted(l, key=keyfunc), keyfunc)]

Here is the code applied to a few of the items; just swap my list for yours:
x = [
    '7;213;164',
    '8;213;164',
    '9;145;112',
    '10;145;112',
    '11;145;112',
]
new_list = []
for i in x:
    check = True
    s_part = i[i.find(';'):]  # everything from the first ';' onward, e.g. ';213;164'
    for j in new_list:
        if j.endswith(s_part):  # endswith avoids false matches in the middle of a string
            check = False
    if check:
        new_list.append(i)
print(new_list)
Output:
['7;213;164', '9;145;112']

python edit tuple duplicates in a list

My goal:
While looping over a list, I would like to check for duplicates and, if there are any, append a number to them. See the following example.
My list, as an example:
[('name','company'), ('someguy','microsoft'), ('anotherguy','microsoft'), ('thirdguy','amazon')]
In a loop I would like to edit those duplicates, so instead of the second microsoft I would have microsoft1 (and if there were three microsoft entries, the third would be microsoft2).
With the code below I can find the duplicates, but I don't know how to edit them directly in the list:
list = [('name','company'), ('someguy','microsoft'), ('anotherguy','microsoft'), ('thirdguy','amazon')]
names = []
double = []
for u in list[1:]:
    names.append(u[1])
list_size = len(names)
for i in range(list_size):
    k = i + 1
    for j in range(k, list_size):
        if names[i] == names[j] and names[i] not in double:
            double.append(names[i])

This is one approach using collections.defaultdict.
Example:
from collections import defaultdict
lst = [('name','company'), ('someguy','microsoft'), ('anotherguy','microsoft'), ('thirdguy','amazon')]
seen = defaultdict(int)
result = []
for k, v in lst:
    if seen[v]:
        result.append((k, "{}_{}".format(v, seen[v])))
    else:
        result.append((k, v))
    seen[v] += 1
print(result)
Output:
[('name', 'company'),
 ('someguy', 'microsoft'),
 ('anotherguy', 'microsoft_1'),
 ('thirdguy', 'amazon')]
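
The question asked for suffixes without an underscore (microsoft1, microsoft2). Below is a sketch of the same counting idea that produces that format, assuming the first occurrence should stay unchanged. The extra `('fourthguy', 'microsoft')` entry is not in the question; it is added here only to show the third occurrence.

```python
from collections import defaultdict

lst = [('name', 'company'), ('someguy', 'microsoft'),
       ('anotherguy', 'microsoft'), ('thirdguy', 'amazon'),
       ('fourthguy', 'microsoft')]  # last entry added for illustration

seen = defaultdict(int)  # how many times each company has appeared so far
result = []
for name, company in lst:
    count = seen[company]
    # The first occurrence keeps its name; later ones get 1, 2, ... appended.
    result.append((name, company if count == 0 else f"{company}{count}"))
    seen[company] += 1

print(result)
# [('name', 'company'), ('someguy', 'microsoft'), ('anotherguy', 'microsoft1'),
#  ('thirdguy', 'amazon'), ('fourthguy', 'microsoft2')]
```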

Slice a list into a nested list based on special characters using Python

I have a list of strings like this:
lst = ['23532','user_name=app','content=123',
'###########################',
'54546','user_name=bee','content=998 hello','source=fb',
'###########################',
'12/22/2015']
I want a method similar to string.split('#') that gives me output like this:
[['23532','user_name=app','content=123'],
 ['54546','user_name=bee','content=998 hello','source=fb'],
 ['12/22/2015']]
but I know that a list has no split attribute. I cannot use ''.join(lst) either, because this list comes from part of a text file I read in, and the file is so big that joining it throws a memory error.

I don't think there's a one-liner for this, but you can easily write a generator to do what you want:
def sublists(lst):
    x = []
    for item in lst:
        if item == '###########################':  # or whatever condition you like
            if x:
                yield x
                x = []
        else:
            x.append(item)
    if x:
        yield x
new_list = list(sublists(old_list))

If you can't use .join(), you can loop through the list and save the index of every string that contains #, then loop again to slice the list:
lst = ['23532', 'user_name=app', 'content=123', '###########################' ,'54546','user_name=bee','content=998 hello','source=fb','###########################','12/22/2015']
idx = []
new_lst = []
for i, val in enumerate(lst):
    if '#' in val:
        idx.append(i)
j = 0
for x in idx:
    new_lst.append(lst[j:x])
    j = x + 1
new_lst.append(lst[j:])
print(new_lst)
output:
[['23532', 'user_name=app', 'content=123'], ['54546', 'user_name=bee', 'content=998 hello', 'source=fb'], ['12/22/2015']]

from pprint import pprint
sep = '###########################'
def split_list(_list):
    lists = list()
    sub_list = list()
    for x in _list:
        if x == sep:  # separator reached: close the current sub-list
            lists.append(sub_list)
            sub_list = list()
        else:
            sub_list.append(x)
    lists.append(sub_list)
    return lists
l = ['23532','user_name=app','content=123',
     '###########################',
     '54546','user_name=bee','content=998 hello','source=fb',
     '###########################',
     '12/22/2015']
pprint(split_list(l))
Output:
[['23532', 'user_name=app', 'content=123'],
['54546', 'user_name=bee', 'content=998 hello', 'source=fb'],
['12/22/2015']]

You can achieve this using itertools.groupby:
from itertools import groupby
lst = ['23532','user_name=app','content=123',
       '###########################','54546','user_name=bee','content=998 hello','source=fb',
       '###########################','12/22/2015']
[list(g) for k, g in groupby(lst, lambda x: x == '###########################') if not k]
Output
[['23532', 'user_name=app', 'content=123'],
['54546', 'user_name=bee', 'content=998 hello', 'source=fb'],
['12/22/2015']]
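
Because the question mentions that the source file is too large to join in memory, the same groupby idea can be applied directly to a lazy iterator of lines, so only one record is held at a time. This is a sketch under stated assumptions: io.StringIO stands in for the real file, a separator is taken to be any non-empty line made up only of # characters, and the names `is_separator` and `split_records` are hypothetical.

```python
import io
from itertools import groupby


def is_separator(line):
    # Assumption: a separator is a non-empty line consisting only of '#'.
    return bool(line) and set(line) == {'#'}


def split_records(lines):
    """Yield one list per record, reading the input lazily."""
    stripped = (line.rstrip('\n') for line in lines)
    for is_sep, group in groupby(stripped, key=is_separator):
        if not is_sep:
            yield list(group)


# io.StringIO stands in for an opened text file here.
fake_file = io.StringIO(
    "23532\nuser_name=app\ncontent=123\n"
    "###########################\n"
    "54546\nuser_name=bee\ncontent=998 hello\nsource=fb\n"
    "###########################\n"
    "12/22/2015\n"
)
records = list(split_records(fake_file))
print(records)
```

With a real file, iterating over `split_records(f)` inside a `with open(...) as f:` block and processing each record as it arrives would avoid building the full list at all.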
