Deleting multiple key-value pairs from a dictionary in Python

I generated a Python dictionary for all the duplicate images in a folder. The dictionary now contains values in the following format:
{
"image_1.jpg": ['image_xyz.jpg', 'image_abc.jpg'],
"image_xyz.jpg": ["image_1.jpg", "image_abc.jpg"],
"image_abc.jpg": ["image_xyz.jpg","image_1.jpg"],
"image_2.jpg": ["image_3.jpg"],
"image_3.jpg": ["image_2.jpg"],
"image_5.jpg": []
}
Each group of duplicates thus appears at least twice in the dictionary. Keys that have no duplicates have an empty list as their value.
Is there a way to delete all the duplicate key-value pairs, so that the dictionary looks like the following?
{
"image_1.jpg": ['image_xyz.jpg', 'image_abc.jpg'],
"image_2.jpg": ["image_3.jpg"],
"image_5.jpg": []
}
I tried first storing all the values from the key-value pairs in a list and then deleting them from the dictionary, but that empties the whole dictionary.

source = {
"image_1.jpg": ['image_xyz.jpg', 'image_abc.jpg'],
"image_xyz.jpg": ["image_1.jpg", "image_abc.jpg"],
"image_abc.jpg": ["image_xyz.jpg","image_1.jpg"],
"image_2.jpg": ["image_3.jpg"],
"image_3.jpg": ["image_2.jpg"],
"image_5.jpg": []
}
dest = dict()
for k, v in source.items():
    ok = True
    # Skip k if it is already listed as a duplicate of a key we kept
    for k1, v1 in dest.items():
        if k in v1:
            ok = False
    if ok:
        dest[k] = v
print(dest)  # New filtered dict
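The same idea can be written with a set of already-seen file names, which avoids rescanning `dest` for every key. This is only a sketch: the helper name `drop_mirrored_entries` is made up for illustration, and it assumes (as in the question) that every duplicate of a kept key is listed in that key's value.

```python
def drop_mirrored_entries(source):
    """Keep a key only if it has not been seen as a key or as a duplicate."""
    seen = set()
    dest = {}
    for key, dupes in source.items():
        if key in seen:
            continue          # already covered by an earlier entry
        dest[key] = dupes
        seen.add(key)
        seen.update(dupes)    # remember everything this entry covers
    return dest


source = {
    "image_1.jpg": ["image_xyz.jpg", "image_abc.jpg"],
    "image_xyz.jpg": ["image_1.jpg", "image_abc.jpg"],
    "image_abc.jpg": ["image_xyz.jpg", "image_1.jpg"],
    "image_2.jpg": ["image_3.jpg"],
    "image_3.jpg": ["image_2.jpg"],
    "image_5.jpg": [],
}
filtered = drop_mirrored_entries(source)
```

With the sample data, `filtered` keeps only the `image_1.jpg`, `image_2.jpg` and `image_5.jpg` entries.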

I usually use this method when getting rid of duplicates in a list:
First put all the values, including the key, into a matrix (a 2-dimensional list), so the first 3 entries, which look like this:
{
"image_1.jpg": ['image_xyz.jpg', 'image_abc.jpg'],
"image_xyz.jpg": ["image_1.jpg", "image_abc.jpg"],
"image_abc.jpg": ["image_xyz.jpg","image_1.jpg"],
}
would turn into:
List=[
["image_1.jpg","image_xyz.jpg","image_abc.jpg"],
["image_xyz.jpg","image_1.jpg","image_abc.jpg"],
["image_abc.jpg","image_xyz.jpg","image_1.jpg"]
]
Make sure the keys are all in the 0th position so that you can save them:
keys=[x[0] for x in List]
Then sort each inner list, so that groups with the same members compare equal:
sorted_list=[sorted(x) for x in List]
Then compare them and keep only the first occurrence of each group. Deleting from sorted_list while looping over it skips elements, and every list is equal to itself, so build new lists instead:
unique_keys, unique_groups = [], []
for key, group in zip(keys, sorted_list):
    if group not in unique_groups:  # first time this group is seen
        unique_keys.append(key)
        unique_groups.append(group)
Now that the duplicates are gone and you have the keys, convert the list back into a dictionary if preferred.
Overall code if needed:
List=[
["image_1.jpg","image_xyz.jpg","image_abc.jpg"],
["image_xyz.jpg","image_1.jpg","image_abc.jpg"],
["image_abc.jpg","image_xyz.jpg","image_1.jpg"]
]
keys=[x[0] for x in List]
sorted_list=[sorted(x) for x in List]
unique_keys, unique_groups = [], []
for key, group in zip(keys, sorted_list):
    if group not in unique_groups:  # first time this group is seen
        unique_keys.append(key)
        unique_groups.append(group)
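The last step, turning the de-duplicated groups back into a dictionary, is not shown above. Here is a minimal sketch of the whole round trip. Note that it swaps sorting for a `frozenset` of the group's members (my substitution, not the answer's method), and that `dedupe_groups` is a hypothetical helper name.

```python
def dedupe_groups(source):
    """Keep the first key of every group of mutually duplicate images."""
    seen_groups = set()
    result = {}
    for key, dupes in source.items():
        # Order-independent identity of the group: the key plus its duplicates
        group = frozenset([key, *dupes])
        if group not in seen_groups:
            seen_groups.add(group)
            result[key] = list(dupes)
    return result


source = {
    "image_1.jpg": ["image_xyz.jpg", "image_abc.jpg"],
    "image_xyz.jpg": ["image_1.jpg", "image_abc.jpg"],
    "image_abc.jpg": ["image_xyz.jpg", "image_1.jpg"],
    "image_2.jpg": ["image_3.jpg"],
    "image_3.jpg": ["image_2.jpg"],
    "image_5.jpg": [],
}
deduped = dedupe_groups(source)
```

A `frozenset` is hashable, so each group can be looked up in a set directly instead of being compared against every other group.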

Related

Iteratively adding new lists as values to a dictionary

I have created a dictionary (dict1) which is not empty and contains keys with corresponding lists as their values. I want to create a new dictionary (dict2) in which new lists modified by some criterion should be stored as values with the corresponding keys from the original dictionary. However, when trying to add the newly created list (list1) during every loop iteratively to the dictionary (dict2) the stored values are empty lists.
dict1 = {"key1" : [-0.04819, 0.07311, -0.09809, 0.14818, 0.19835],
"key2" : [0.039984, 0.0492105, 0.059342, -0.0703545, -0.082233],
"key3" : [0.779843, 0.791255, 0.802576, 0.813777, 0.823134]}
dict2 = {}
list1 = []
for key in dict1:
    if (index + 1 < len(dict1[key]) and index - 1 >= 0):
        for index, element in enumerate(dict1[key]):
            if element - dict1[key][index+1] > 0:
                list1.append(element)
    dict2['{}'.format(key)] = list1
    list1.clear()
print(dict2)
The output I want:
dict2 = {"key1" : [0.07311, 0.14818, 0.19835],
"key2" : [0.039984, 0.0492105, 0.059342],
"key3" : [0.779843, 0.791255, 0.802576, 0.813777, 0.823134]}
The problem is that list1 always refers to the same list object, which you empty by calling clear(). Therefore all values in the dict refer to the same empty list object in memory.
>>> # ... running your example ...
>>> [id(v) for v in dict2.values()]
[2111145975936, 2111145975936, 2111145975936]
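To see the aliasing in isolation, here is a small sketch. The names `shared` and `stored` are made up for illustration and do not come from the question.

```python
shared = []
stored = {}
for key in ("key1", "key2"):
    shared.append(key)
    stored[key] = shared   # stores a reference to the list, not a copy
    shared.clear()         # empties the one list every value points to

# Every value in the dict is literally the same object
all_same_object = all(value is shared for value in stored.values())
```

After the loop, both values in `stored` are empty, because they are one and the same list.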
It looks like you want to filter out negative elements from the values in dict1. A simple dict-comprehension will do the job.
>>> dict2 = {k: [x for x in v if x > 0] for k, v in dict1.items()}
>>> dict2
{'key1': [0.07311, 0.14818, 0.19835],
'key2': [0.039984, 0.0492105, 0.059342],
'key3': [0.779843, 0.791255, 0.802576, 0.813777, 0.823134]}
@timgeb gives a great solution which simplifies your code to a dictionary comprehension but doesn't show how to fix your existing code. As he says there, you are reusing the same list on each iteration of the for loop. So to fix your code, you just need to create a new list on each iteration instead:
for key in dict1:
    my_list = []
    # the rest of the code is the same, except you don't need to call clear()
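Spelled out in full, the loop might look like the sketch below. It assumes the criterion is "keep the positive elements", as the other answer inferred from the desired output. The question's own condition is not reproduced here, and `kept` is an illustrative name.

```python
dict1 = {
    "key1": [-0.04819, 0.07311, -0.09809, 0.14818, 0.19835],
    "key2": [0.039984, 0.0492105, 0.059342, -0.0703545, -0.082233],
    "key3": [0.779843, 0.791255, 0.802576, 0.813777, 0.823134],
}

dict2 = {}
for key, values in dict1.items():
    kept = []                  # a fresh list for every key
    for element in values:
        if element > 0:        # assumed criterion: keep positive values
            kept.append(element)
    dict2[key] = kept          # each key gets its own list object
```

Because `kept` is rebound to a new list on every pass, nothing needs to be cleared and no two keys share a value.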

How to Mix 2 list as Dictionary in Python with custom key value pair

I have 2 lists:
1. Contains keys + values
2. Contains keys
Now I have to make a dictionary from them in which each key maps to all the values that follow it in the first list, up to the next key.
Example:
List 1: ['a','ef','ddw','b','rf','re','rt','c','dc']
List 2: ['a','b','c']
Dictionary that I want to create: {
'a':['ef','ddw'],
'b':['rf','re','rt'],
'c':['dc']
}
I am only familiar with Python and would like a solution in Python only.
I think this should work:
result = {}
cur_key = None
for key in list_1:
    if key in list_2:
        result[key] = []
        cur_key = key
    else:
        result[cur_key].append(key)
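A self-contained variant of the loop above, as a sketch. `group_by_keys` is a hypothetical name. Two behaviours are my own assumptions rather than anything the question specifies: items that appear before the first key are ignored, and a key that shows up again keeps appending to its existing list.

```python
def group_by_keys(items, keys):
    """Map each key to the items that follow it, up to the next key."""
    key_set = set(keys)      # set membership is faster than scanning a list
    result = {}
    current = None
    for item in items:
        if item in key_set:
            current = item
            result.setdefault(current, [])
        elif current is not None:
            result[current].append(item)
        # else: no key seen yet, so the item is skipped (assumption)
    return result


list_1 = ['a', 'ef', 'ddw', 'b', 'rf', 're', 'rt', 'c', 'dc']
list_2 = ['a', 'b', 'c']
grouped = group_by_keys(list_1, list_2)
```

The `current is not None` guard avoids the `KeyError` the plain loop would raise if the first element were not a key.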

python get dictionary values list as list if key match

I have the following list and dictionary:
match_keys = ['61df50b6-3b50-4f22-a175-404089b2ec4f']
locations = {
'2c50b449-416e-456a-bde6-c469698c5f7': ['422fe2d0-b10f-446d-ac3c-f75e5a3ff138'],
'61df50b6-3b50-4f22-a175-404089b2ec4f': [
'7112fa59-63b1-4057-8822-fe11168c328f', '6d06ee0a-7447-4726-822f-942b9e12c9ce'
]
}
If I want to search 'locations' for keys that match in the match_keys list and extract their values, to get something like this...
['7112fa59-63b1-4057-8822-fe11168c328f', '6d06ee0a-7447-4726-822f-942b9e12c9ce']
...what would be the best way?
You can iterate over match_keys and, for each key that is present in locations, collect the values stored under it:
out = [v for key in match_keys if key in locations for v in locations[key]]
Output:
['7112fa59-63b1-4057-8822-fe11168c328f', '6d06ee0a-7447-4726-822f-942b9e12c9ce']
for key, value in locations.items():
    if key == match_keys[0]:
        print(value)
Iterate over the keys and get each value with [].
res = []
for key in match_keys:
    res.append(locations[key])
Or, if you don't want a list of lists, you can use extend():
res = []
for key in match_keys:
    res.extend(locations[key])
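If some keys might be missing from `locations`, `dict.get` with an empty-list default avoids a `KeyError`, and `itertools.chain.from_iterable` flattens the result. A minimal sketch follows. The second entry in `match_keys` is a made-up key, added only to show the default in action.

```python
from itertools import chain

match_keys = [
    '61df50b6-3b50-4f22-a175-404089b2ec4f',
    'no-such-key',  # hypothetical key, not in locations
]
locations = {
    '2c50b449-416e-456a-bde6-c469698c5f7': ['422fe2d0-b10f-446d-ac3c-f75e5a3ff138'],
    '61df50b6-3b50-4f22-a175-404089b2ec4f': [
        '7112fa59-63b1-4057-8822-fe11168c328f',
        '6d06ee0a-7447-4726-822f-942b9e12c9ce',
    ],
}

# Missing keys contribute an empty list, so they simply add nothing
out = list(chain.from_iterable(locations.get(key, []) for key in match_keys))
```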

How to iterate through dict values containing lists and remove items?

Python novice here. I have a dictionary of lists, like so:
d = {
1: ['foo', 'foo(1)', 'bar', 'bar(1)'],
2: ['foobaz', 'foobaz(1)', 'apple', 'apple(1)'],
3: ['oz', 'oz(1)', 'boo', 'boo(1)']
}
I am trying to figure out how to loop through the keys of the dictionary and the corresponding list values and remove all strings in each list that have a parenthesized tail. So far this is what I have:
for key in keys:
    for word in d[key]...: # what else needs to go here?
        regex = re.compile('\w+\([0-9]\)')
        re.sub(regex, '', word) # Should this be a ".pop()" from list instead?
I would like to do this with a list comprehension, but as I said, I can't find much information on looping through dict keys and corresponding dict value of lists. What's the most efficient way of setting this up?
You can re-build the dictionary, letting only elements without parenthesis through:
d = {k: [elem for elem in v if not elem.endswith(')')] for k, v in d.items()}
for key, value in d.items():
    for elem in list(value):  # iterate over a copy of the list
        if elem.find(")") != -1:
            d[key].remove(elem)
You shouldn't modify a list while iterating over it, so iterate over a copy of each list, and whenever an element contains a closing parenthesis, remove it from the original list.
Alternatively, you can do it without rebuilding the dictionary, which may be preferable if it's huge...
for k, v in d.items():
    # list() is needed on Python 3, where filter() returns an iterator
    d[k] = list(filter(lambda x: not x.endswith(')'), v))
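If you want to stay close to the regex idea from the question, here is a sketch that drops only strings ending in a parenthesized number such as `(1)`. The pattern and the names `tail` and `cleaned` are my own assumptions about what counts as a "parentheses tail".

```python
import re

d = {
    1: ['foo', 'foo(1)', 'bar', 'bar(1)'],
    2: ['foobaz', 'foobaz(1)', 'apple', 'apple(1)'],
    3: ['oz', 'oz(1)', 'boo', 'boo(1)'],
}

# Matches "(1)", "(12)", ... but only at the very end of the string
tail = re.compile(r'\(\d+\)$')

# Build a new dict, keeping only the strings without such a tail
cleaned = {k: [s for s in v if not tail.search(s)] for k, v in d.items()}
```

Compared with `endswith(')')`, this leaves alone strings that happen to end in a parenthesis for some other reason.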

Defining a list of values for a dictionary key using an external file

I have a file with a list of paired entries (keys) that goes like this:
6416 2318
84665 88
90 2339
2624 5371
6118 6774
And I've got another file with the values to those keys:
266743 Q8IUM7
64343 H7BXU6
64343 Q9H6S1
64343 C9JB40
23301 Q8NDI1
23301 A8K930
As you can see, the same key can have more than one value. What I'm trying to do is create a dictionary by automatically creating the initial k, v pair and then appending more values for each entry that is already in the dictionary, like this:
Program finds "266743: 'Q8IUM7'", then "64343: 'H7BXU6'". And when it finds "64343: 'Q9H6S1'" it does this: "64343: ['H7BXU6', 'Q9H6S1']".
This is what I have so far:
# Create dictionary
data = {}
for line in inmap:
    value = []
    k, v = [x.strip() for x in line.split('\t')]
    data[k] = value.append(v)
    if k in data.viewkeys() == True and v in data.viewvalues() == False:
        data[k] = value.append(v)
But the if statement seems to not be working. That or having the value = [] inside the for loop. Any thoughts?
This is not a good idea. You should be using a list from the start and expand that list as you go along, not change from "string" to "list of strings" when more than one value is found for the key.
For this, you can simply use
from collections import defaultdict
data = defaultdict(list)
for line in inmap:
    k, v = (x.strip() for x in line.split('\t'))
    data[k].append(v)
This works because a defaultdict of type list will automatically create a key together with an empty list as its value when you try to reference a key that doesn't yet exist. Otherwise, it behaves just like a normal dictionary.
Result:
>>> data
defaultdict(<type 'list'>, {'23301': ['Q8NDI1', 'A8K930'],
'64343': ['H7BXU6', 'Q9H6S1', 'C9JB40'], '266743': ['Q8IUM7']})
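For a version you can run without the input file, here is a sketch that feeds the sample rows from the question through `io.StringIO`. It assumes the two columns are tab-separated, as the `split('\t')` in the code implies. The second loop shows the same grouping with a plain dict and `setdefault`, in case you'd rather not use `defaultdict`.

```python
import io
from collections import defaultdict

# Stand-in for the open file; assumes tab-separated columns
inmap = io.StringIO(
    "266743\tQ8IUM7\n"
    "64343\tH7BXU6\n"
    "64343\tQ9H6S1\n"
    "64343\tC9JB40\n"
    "23301\tQ8NDI1\n"
    "23301\tA8K930\n"
)

data = defaultdict(list)
for line in inmap:
    k, v = (x.strip() for x in line.split('\t'))
    data[k].append(v)      # a missing key starts out as an empty list

# Same grouping with a plain dict
inmap.seek(0)
plain = {}
for line in inmap:
    k, v = (x.strip() for x in line.split('\t'))
    plain.setdefault(k, []).append(v)
```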
