d = {'Name1': ['Male', '18'],
'Name2': ['Male', '16'],
'Name3': ['Male', '18'],
'Name4': ['Female', '18'],
'Name5': ['Female', '18']}
I am trying to find a way to isolate the duplicate keys to a list if any. Like:
['Name1', 'Name3']
['Name4', 'Name5']
How can I achieve this? Thanks
An imperative solution would be to just iterate over the dictionary and add the items into another dictionary that uses the gender-age-tuple as a key, for example:
# using a defaultdict, which automatically adds an empty list for missing keys when first accessed
from collections import defaultdict
by_data = defaultdict(list)
for name, data in d.items():
    # turn the data into something immutable, so it can be used as a dictionary key
    data_tuple = tuple(data)
    by_data[data_tuple].append(name)
The result will contain:
{('Female', '18'): ['Name4', 'Name5'],
('Male', '16'): ['Name2'],
('Male', '18'): ['Name1', 'Name3']}
If you are only interested in duplicates, you can filter out the entries that hold only one name.
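The filtering step could look like the following sketch. It rebuilds by_data from the question's d so that it runs on its own; the name duplicates is only illustrative and not part of the original answer.

```python
from collections import defaultdict

d = {'Name1': ['Male', '18'],
     'Name2': ['Male', '16'],
     'Name3': ['Male', '18'],
     'Name4': ['Female', '18'],
     'Name5': ['Female', '18']}

# group the names by their (gender, age) tuple, as in the loop above
by_data = defaultdict(list)
for name, data in d.items():
    by_data[tuple(data)].append(name)

# keep only the groups that contain more than one name
# ("duplicates" is an illustrative name, not from the original answer)
duplicates = [names for names in by_data.values() if len(names) > 1]
print(duplicates)
```

On Python 3.7+, where dictionaries keep insertion order, this prints [['Name1', 'Name3'], ['Name4', 'Name5']].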
Try this, which groups the names by age only:
d = {'Name1': ['Male', '18'],
'Name2': ['Male', '16'],
'Name3': ['Male', '18'],
'Name4': ['Female', '18'],
'Name5': ['Female', '18']}
ages = {}  # create a dictionary to hold items with identical ages
# loop over all the items in the dictionary
for key in d.keys():
    age = d[key][1]
    # if the ages dictionary does not yet have an entry for this age,
    # create a list to hold the items with the same age
    if age not in ages:
        ages[age] = []
    ages[age].append(key)  # finally append items with the same age together
# loop over all the groups in the ages dictionary
for value in ages.values():
    if len(value) > 1:  # if more than one item shares this age
        print(value)  # print it
I'm guessing you meant duplicate values and not keys, in which case you could do this with pandas:
import pandas as pd
df = pd.DataFrame(d).T #load the data into a dataframe, and transpose it
df.index[df.duplicated(keep = False)]
df.duplicated(keep = False) gives you a Series of True/False values, where the value is True whenever that row has a duplicate and False otherwise. We use that to index the row names, which are 'Name1', 'Name2', etc.
I have a sorted dict as below -
check = {'id1':'01', 'id2':'03', 'id3':'03', 'id4':'10'}
I want to check the values in the above dict in Python code and randomize the ids if the values are the same.
Expected output is check2 = {'id1':'01', 'id3':'03', 'id2':'03', 'id4':'10'} (randomize the ids which have the same values: sometimes id2 is in second position and sometimes id3).
As your dictionary is sorted by values, you can use itertools.groupby to group by identical values.
Then use random.sample to shuffle the keys per group.
Finally generate a new dictionary from the list of keys:
from itertools import groupby, chain
from random import sample
keys = list(chain.from_iterable(sample((l:=list(g)), len(l))
                                for k,g in groupby(check, lambda x:check[x])))
check2 = {k: check[k] for k in keys}
example output:
{'id1': '01', 'id3': '03', 'id2': '03', 'id4': '10'}
intermediate result:
>>> keys
['id1', 'id3', 'id2', 'id4']
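The same idea can be spelled out as an explicit loop, which may be easier to follow than the one-liner. This is only a sketch: it swaps random.sample for random.shuffle, and the name group_keys is illustrative. Like the original, it assumes check is already sorted by value, because groupby only merges adjacent items.

```python
from itertools import groupby
from random import shuffle

check = {'id1': '01', 'id2': '03', 'id3': '03', 'id4': '10'}

keys = []
# iterate over the keys of check, grouping neighbours that map to the same value
for _, group in groupby(check, key=check.get):
    group_keys = list(group)
    shuffle(group_keys)  # randomize the order inside one group of equal values
    keys.extend(group_keys)

# rebuild the dictionary in the new key order
check2 = {k: check[k] for k in keys}
print(check2)
```

The values stay in sorted order; only the ids that share a value can swap places between runs.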
lst = [['111', 'kam'],['222', 'Van']]
Header = ['ID', 'Name']
I want to convert the above list to a list of dictionaries based on the headers. I can do this simply with a for loop, taking each element in turn and appending it to a new list one by one.
But I want to achieve the same without using a loop, to avoid performance issues.
Is there any way to do that?
Expected Output:
[{'ID' : '111', 'Name' : 'Kam'},{'ID' : '222', 'Name' : 'Van'}]
You can use a list comprehension:
lst = [['111', 'kam'],['222', 'Van']]
Header = ['ID', 'Name']
result = [dict(zip(Header, row)) for row in lst]
print(result)
Output:
[{'ID': '111', 'Name': 'kam'}, {'ID': '222', 'Name': 'Van'}]
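A list comprehension still iterates over the rows internally. If the aim is only to avoid writing the loop yourself, map expresses the same thing; the sketch below is an alternative spelling, not a claim that it runs faster.

```python
lst = [['111', 'kam'], ['222', 'Van']]
Header = ['ID', 'Name']

# pair each row with the header names and build one dict per row
result = list(map(lambda row: dict(zip(Header, row)), lst))
print(result)
```

This produces the same list of dictionaries as the comprehension.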
I have two dictionaries that have the same key and value pairs. I want to compare only specific key-value pairs and return True if they match.
I am new to Python. Please help me write a function for this.
The dictionaries are
A: {'id1': 'target', 'start1': '39', 'end1': '45', \
'id2': 'query', 'start2': '98', 'end2': '104'}
B: {'id1': 'target', 'start1': '39', 'end1': '45', \
'id2': 'query', 'start2': '98', 'end2': '104'}
Here I want to check whether the 'start1', 'end1', 'start2' and 'end2' values are the same or not.
result = all( A[k]==B[k] for k in ('start1', 'end1', 'start2', 'end2'))
You can use a for loop:
wanted_keys = {'start1', 'end1', 'start2', 'end2'}
same = True
for k in wanted_keys:
    if A.get(k) != B.get(k):
        same = False
        break
Or as a one-liner:
all(A.get(k) == B.get(k) for k in wanted_keys)
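Since the question asks for a function, the one-liner can be wrapped as in this sketch. The name same_on_keys and the extra dictionary C are made up for illustration.

```python
def same_on_keys(a, b, keys):
    """Return True if a and b hold equal values for every key in keys."""
    # .get returns None for a missing key, so a key absent from both
    # dictionaries is treated as equal
    return all(a.get(k) == b.get(k) for k in keys)


A = {'id1': 'target', 'start1': '39', 'end1': '45',
     'id2': 'query', 'start2': '98', 'end2': '104'}
B = {'id1': 'target', 'start1': '39', 'end1': '45',
     'id2': 'query', 'start2': '98', 'end2': '104'}
C = dict(B, end2='105')  # hypothetical variant that differs in one value

wanted_keys = ('start1', 'end1', 'start2', 'end2')
print(same_on_keys(A, B, wanted_keys))  # True
print(same_on_keys(A, C, wanted_keys))  # False
```

If a missing key should count as a mismatch instead, index with a[k] and b[k] and handle the KeyError.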
keys = ['key1', 'key2', 'key3', 'key4']
list1 = ['a1', 'b3', 'c4', 'd2', 'h0', 'k1', 'p2', 'o3']
list2 = ['1', '2', '25', '23', '4', '5', '6', '210', '8', '02', '92', '320']
abc = dict(zip(keys[:4], [list1,list2]))
with open('myfilecsvs.csv', 'wb') as f:
    [f.write('{0},{1}\n'.format(key, value)) for key, value in abc.items()]
With this, I get all the keys in the first column and their values in the next column.
What I am trying to achieve is all keys in the first row, i.e. each key in its own column of the first row, with its values below it. Something like a transpose.
I would be very grateful for your help with this.
You can use join and zip_longest to do this.
",".join(abc.keys()) returns the first row (the keys), like key1,key2. Then use zip_longest (izip_longest in Python 2.x) to aggregate the elements of the value lists row by row, joining each row with , and the rows with \n in the same way.
zip_longest
Make an iterator that aggregates elements from each of the iterables.
If the iterables are of uneven length, missing values are filled-in
with fillvalue.
from itertools import zip_longest
with open('myfilecsvs.csv', 'w') as f:
    f.write("\n".join([",".join(abc.keys()),
                       *(",".join(i) for i in zip_longest(*abc.values(), fillvalue=''))]))
Output:
key1,key2
a1,1
b3,2
...
,02
,92
,320
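As an alternative to joining the strings by hand, the standard csv module can write the rows and takes care of quoting. This sketch writes to an in-memory io.StringIO buffer so it can run without touching the disk; for a real file you would pass open('myfilecsvs.csv', 'w', newline='') instead.

```python
import csv
import io
from itertools import zip_longest

keys = ['key1', 'key2', 'key3', 'key4']
list1 = ['a1', 'b3', 'c4', 'd2', 'h0', 'k1', 'p2', 'o3']
list2 = ['1', '2', '25', '23', '4', '5', '6', '210', '8', '02', '92', '320']
abc = dict(zip(keys, [list1, list2]))  # zip stops early, so only key1 and key2 are used

buffer = io.StringIO()  # stand-in for a file opened with newline=''
writer = csv.writer(buffer)
writer.writerow(abc.keys())  # header row: one key per column
# one row per position; the shorter list is padded with empty strings
writer.writerows(zip_longest(*abc.values(), fillvalue=''))

print(buffer.getvalue())
```

The csv writer ends lines with \r\n by default, which is why the file should be opened with newline='' when writing to disk.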
I have a dict that's like
dict1 = {'Lou': ['Male', '15', '2'],'Jen':['Female','10','3']...and more}
I'm trying to search for values greater than 14 in the 2nd element of each list and then print out the key/value. I understand that I have to convert the strings to integers, and I believe I have to iterate using the dict1.values method; however, I'm unsure how to specify the 2nd value in the list.
You can use dict1.items to iterate through keys and values at the same time:
for key, value in dict1.items():
    if int(value[1]) > 14:
        print(key, value)
For each value you get the second part with value[1], you convert it to an integer with int and then you perform your check. When the check is successful, we print both key and value, as we have access to them.
You could use a dict comprehension.
>>> dict1 = {'Lou': ['Male', '15', '2'],'Jen':['Female','10','3']}
>>> {x:y for x,y in dict1.items() if int(y[1]) > 14}
{'Lou': ['Male', '15', '2']}
You need to use dict.items, which gives you tuples containing the key/value pairs.
Using filter and lambda:
>>> my_dict = {'Lou': ['Male', '15', '2'],'Jen':['Female','10','3']}
>>> list(filter(lambda x:int(x[1][1])>14, my_dict.items()))
[('Lou', ['Male', '15', '2'])]
Using keys:
>>> {x:my_dict[x] for x in my_dict if int(my_dict[x][1])>14}
{'Lou': ['Male', '15', '2']}