Python dictionary check if the values match with other key values - python

I have created a Python dictionary with a structure like this:
mydict = {'2018-08' : [32124,4234,23,2323,32423,342342],
'2018-07' : [13123,23424,2,3,4343,4232,2342],
'2018-06' : [1231,12,12313,12331,3123131313,434546,232]}
I want to check whether any value under the key '2018-08' matches any value under the other keys. Is there a short way to write this?

You can loop over the values you expect to find (those under '2018-08') and, for each one, check whether it is present in the values of any other key.
The idiom if item in some_list checks whether item is present in the list some_list.
target = '2018-08'
expected_values = mydict[target]
found = False
for expected in expected_values:
    for key in mydict:
        # Skip the target key, otherwise every value trivially matches itself
        if key != target and expected in mydict[key]:
            found = True
            break
    if found:
        break  # stop the outer loop as well
Keep in mind that this is a brute-force algorithm and may not be the optimal solution for larger dictionaries.
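If only a yes/no answer is needed, the same check can be written with sets. The sketch below is one possible variant, not the answer's original code: the helper name shares_value_with_other_keys is made up for illustration, and it skips the target key so that a month is never compared against its own list.

```python
def shares_value_with_other_keys(mapping, target):
    """Return True if any value under `target` also appears under another key."""
    wanted = set(mapping[target])
    return any(
        not wanted.isdisjoint(values)   # True as soon as one value is shared
        for key, values in mapping.items()
        if key != target                # never compare the target with itself
    )


mydict = {
    '2018-08': [32124, 4234, 23, 2323, 32423, 342342],
    '2018-07': [13123, 23424, 2, 3, 4343, 4232, 2342],
    '2018-06': [1231, 12, 12313, 12331, 3123131313, 434546, 232],
}
result = shares_value_with_other_keys(mydict, '2018-08')
print(result)
```

Because any() stops at the first key that shares a value, the remaining keys are not scanned once a match is found.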

The question is vague, so I am assuming that you want the values in the target month (e.g. 2018-08) that are contained somewhere within the other months.
Sets are much faster for testing membership compared to list iteration.
target = '2018-08'
s = set()
for k, v in mydict.items():  # .items() works on both Python 2 and Python 3
    if k != target:
        s.update(v)
matches = set(mydict[target]) & s

You can use itertools.chain to flatten the values of all the other keys into one long list:
import itertools
for key, value in mydict.items():
    temp_dict = mydict.copy()
    temp_dict.pop(key)
    big_value_list = list(itertools.chain(*temp_dict.values()))
    print(key, set(value) & set(big_value_list))
A dry run with your inputs modified so that some values overlap:
mydict = {'2018-08' : [32124,4234,23,2323,32423,342342],
'2018-07' : [13123,23424,2,3,4343,4232,2342],
'2018-06' : [1231,12,12313,12331,3123131313,434546,232,342342,2342]}
Output:
('2018-08', set([342342]))
('2018-07', set([2342]))
('2018-06', set([342342, 2342]))

Related

How to index a list of dictionaries in python?

If I have a list of dictionaries in a python script, that I intend to later on dump in a JSON file as an array of objects, how can I index the keys of a specific dictionary within the list?
Example :
dict_list = [{"first_dict": "some_value"}, {"second_dict":"some_value"}, {"third_dict": "[element1,element2,element3]"}]
My intuitive solution was dict_list[-1][0] (to access the first key of the last dictionary in the list for example). This however gave me the following error:
IndexError: list index out of range
A dictionary is indexed by its keys, not by position. An integer index only works when the integer is itself a key, as in dict = {0: some_value}.
To find a specific value:
list_dictionary = [{"dict1": 'value1'}, {"dict2": "value2"}]
value1 = list_dictionary[0]["dict1"]
The key is what you have to use to look up a value in a dictionary.
Example:
dictionary = {0: 'value'}
dictionary[0]
In this case it works, because 0 is a key.
But to pick out all the elements, we can do:
values = []
for dictionary in dict_list:
    for element in dictionary:
        values.append(dictionary[element])
Output:
['some_value', 'some_value', ['element1', 'element2', 'element3']]
dict_list = [{"first_dict": "some_value"}, {"second_dict":"some_value"}, {"third_dict": ['element1','element2','element3']}]
If your list looks like this, you can also do:
dict_list[-1]["third_dict"]
You can't access 'the first key' with an int, since you have a dict.
You can get the first key with .keys(); in Python 3 the result has to be converted to a list before it can be indexed:
list(dict_list[-1].keys())[0]
By using dict_list[-1][0], you are indexing the last item of the list as if it were itself a list. It is not: it is a dict, which has to be accessed by key.
Taking your example dict_list[-1][0]:
When you mention dict_list you are already "in the list".
The first index [-1] is referring to the last item of the list.
The second index would only be "usable" if the item mentioned in the previous index were a list. Hence the error.
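The two indexing steps described above can be traced in a short sketch. It assumes the variant of dict_list in which "third_dict" holds a real list, and it catches LookupError (the common base class of KeyError and IndexError) so that the exception type is reported rather than assumed.

```python
dict_list = [
    {"first_dict": "some_value"},
    {"second_dict": "some_value"},
    {"third_dict": ["element1", "element2", "element3"]},
]

last = dict_list[-1]          # list indexing: the last dict in the list
try:
    last[0]                   # dict lookup: 0 is treated as a key, not a position
    outcome = "found"
except LookupError as exc:    # covers both KeyError and IndexError
    outcome = type(exc).__name__

first_key = next(iter(last))  # first key in insertion order (Python 3.7+)
print(outcome, first_key, last[first_key][0])
```

Only the final expression, last[first_key][0], uses a positional index, and it is valid because the value stored under "third_dict" is a list.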
Using:
dict_list=[{"first_dict": "some_value"}, {"second_dict":"some_value"},{"third_dict": [0,1,2]}]
to access the value of third_dict you need:
for value in list(dict_list[-1].values())[0]:
    print(value)
Output:
0
1
2
If you know the order of the dictionary keys and you are using Python 3.7 or later (where dictionaries keep their insertion order), you can do:
dict_list = [
{"first_dict": "some_value"}
, {"second_dict":"some_value"}
, {"third_dict": ["element1", "element2", "element3"]}
]
first_key = next(iter(dict_list[-1].keys()))
### OR: value
first_value = next(iter(dict_list[-1].values()))
### OR: both key and value
first_key, first_value = next(iter(dict_list[-1].items()))
print(first_key)
print(first_key, first_value)
print(first_value)
If you have the following list of dictionaries:
dict_list = [{"key1":"val1", "key2":"val2"}, {"key10":"val10"}]
Then to access the last dictionary you'd indeed use dict_list[-1], but this returns a dictionary, which is indexed using its keys and not numbers: dict_list[0]["key1"]
To only use numbers, you'd need to get a list of the keys first: list(dict_list[-1]). The first element of this list list(dict_list[-1])[0] would then be the first key "key10"
You can then use indices to access the first key of the last dictionary:
dict_index = -1
key_index = 0
d = dict_list[dict_index]
keys = list(d)
val = d[keys[key_index]]
However you'd be using the dictionary as a list, so maybe a list of lists would be better suited than a list of dictionaries.

Python Comparing Values of Numpy Array Between 2 Dictionaries Value

I have 2 dictionaries and an input:
letter = 'd'
dict_1 = {"label_1": array(['a','b']), "label_2": array(['c','d']), ...}
dict_2 = {"label_1": array(['x','y']), "label_2": array(['z','o']), ...}
letter_translated = some_function(letter)
output desired: 'o'
What I have in mind right now is to get the index of the letter within the array under the key "label_2" in dict_1 and then look up the same index in dict_2. I am open to other ways of doing it. If anything in the question is unclear, feel free to drop a comment.
Note: the arrays are numpy arrays
I propose to iterate through the first dictionary, while keeping a trace of how to get to the current element (key and i) so that we can look in the second dict in the same place:
from numpy import array
dict_1 = {"label_1": array(['a','b']), "label_2": array(['c','d'])}
dict_2 = {"label_1": array(['x','y']), "label_2": array(['z','o'])}
def look_for_corresponding(letter, d1, d2):
    for key, array_of_letters in d1.items():
        for position, d1_letter in enumerate(array_of_letters):
            if d1_letter == letter:
                return d2[key][position]
    return None # Line not necessary, but added for clarity
output = look_for_corresponding('d', dict_1, dict_2)
print(output)
# o
Of course, this code will fail if dict_1 and dict_2 do not have exactly the same structure, or if the arrays are more than 1D. If those cases apply to you, please edit your question to indicate it.
Also, I am not sure what should be done if the letter is not to be found within dict_1. This code will return None, but it could also raise an exception.
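For the second option, raising instead of returning None, a minimal sketch could look like the following. The function name look_for_corresponding_strict and the choice of LookupError are assumptions made for illustration, and plain lists stand in for the NumPy arrays so that the example has no dependencies.

```python
def look_for_corresponding_strict(letter, d1, d2):
    """Like look_for_corresponding, but raise instead of returning None."""
    for key, letters in d1.items():
        for position, d1_letter in enumerate(letters):
            if d1_letter == letter:
                return d2[key][position]
    raise LookupError("letter {!r} not found in the first dictionary".format(letter))


# Plain lists stand in for the NumPy arrays of the question.
dict_1 = {"label_1": ["a", "b"], "label_2": ["c", "d"]}
dict_2 = {"label_1": ["x", "y"], "label_2": ["z", "o"]}

print(look_for_corresponding_strict("d", dict_1, dict_2))  # o
```

Raising makes a missing letter impossible to overlook, whereas a None return value has to be checked by every caller.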
What do you mean by 'index'? The position number?
Dictionaries don't have the concept of counted indices for their entries. You can only access data through the key (here "label_2"), or by iterating (for key in dict_1 ...).
Before Python 3.7 the iteration order is not guaranteed and can change; the order of your declaration is not necessarily kept.
If you wish to have "label_2" in both, then you need to access
key = "label_2"
item_from_1 = dict_1[key]
item_from_2 = dict_2[key]
If you need to iterate over dict_1 and, for each item, find the matching item in the second dictionary, this also has to go through the key:
for (key, value1) in dict_1.items():
    value2 = dict_2[key]
    .....
Note that on Python versions before 3.7 the order in which the items come up in the loop may vary, even from one run of the program to the next.

Indexing Dict with Multiple values at one key

I'm new to python and I was wondering if there's a way for me to pull a value at a specific index. Let's say I have a key with multiple values(list) associated with it.
d = {'ANIMAL' : ['CAT','DOG','FISH','HEDGEHOG']}
Let's say I want to iterate through the values and print out a value if it's equal to 'DOG'. Do key/value pairs have a specific index associated with the position of the values?
I've tried reading up on dict and how it works; apparently you can't really index it. I just wanted to know if there's a way to get around that.
You can perform the following (comments included):
d = {'ANIMAL' : ['CAT','DOG','FISH','HEDGEHOG']}
for keys, values in d.items(): #Will allow you to reference the key and value pair
    for item in values: #Will iterate through the list containing the animals
        if item == "DOG":
            print(item)
            print(values.index(item)) #will tell you the index of "DOG" in the list.
So maybe this will help:
d = {'ANIMAL' : ['CAT','DOG','FISH','HEDGEHOG']}
for item in d:
    for animal in d[item]:
        if animal == "DOG":
            print(animal)
Update: What if I want to compare the strings to see whether they're equal or not, let's say whether the value at the first index is equal to the value at the second index?
You can use this:
d = {'ANIMAL' : ['CAT','DOG','FISH','HEDGEHOG']}
for item in d:
    values = d[item]
    # Compare the first and second values in the list stored under this key
    if values[0] == values[1]:
        print("Equal")
    else:
        print("Unequal")
Keys and values in a dictionary are indexed by key and do not have a fixed index like in lists.
However, you can leverage the use of 'OrderedDict' to give an indexing scheme to your dictionaries. It is seldom used, but handy.
That being said, dictionaries are insertion ordered in CPython 3.6, and the language guarantees this from Python 3.7 onward.
More on that here:
Are dictionaries ordered in Python 3.6+?
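As a rough illustration of what insertion order makes possible, the sketch below derives positions from a plain dict. It assumes Python 3.7 or later; the names keys, second_key and positions are chosen here for the example.

```python
d = {
    'animal': ['cat', 'dog', 'kangaroo', 'monkey'],
    'flower': ['hibiscus', 'sunflower', 'rose'],
}

keys = list(d)                       # keys in insertion order (Python 3.7+)
second_key = keys[1]
positions = {key: index for index, key in enumerate(d)}

print(second_key)                    # flower
print(positions['animal'])           # 0
print(d[keys[0]].index('dog'))       # 1
```

The positions are only a snapshot: adding or deleting keys afterwards changes them, so the key itself remains the reliable way to address an entry.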
d = {'animal': ['cat', 'dog', 'kangaroo', 'monkey'], 'flower': ['hibiscus', 'sunflower', 'rose']}
for key, value in d.items():
    for element in value:
        if element == 'dog':  # use ==, not is, to compare string values
            print(value)
Does this help? Or do you want to print the index of the key in the dictionary?

Checking items in a list of dictionaries in python

I have a list of dictionaries:
a = [{"ID":1, "VALUE":2},{"ID":2, "VALUE":2},{"ID":3, "VALUE":4},...]
"ID" is a unique identifier for each dictionary. Considering the list is huge, what is the fastest way of checking if a dictionary with a certain "ID" is in the list, and if not append to it? And then update its "VALUE" ("VALUE" will be updated if the dict is already in list, otherwise a certain value will be written)
You'd not use a list. Use a dictionary instead, mapping ids to nested dictionaries:
a = {
1: {'VALUE': 2, 'foo': 'bar'},
42: {'VALUE': 45, 'spam': 'eggs'},
}
Note that you don't need to include the ID key in the nested dictionary; doing so would be redundant.
Now you can simply look up if a key exists:
if someid in a:
    a[someid]['VALUE'] = newvalue
I did make the assumption that your ID keys are not necessarily sequential numbers. I also made the assumption you need to store other information besides VALUE; otherwise just a flat dictionary mapping ID to VALUE values would suffice.
A dictionary lets you look up values by key in O(1) time (constant time independent of the size of the dictionary). Lists let you look up elements in constant time too, but only if you know the index.
If you don't and have to scan through the list, you have a O(N) operation, where N is the number of elements. You need to look at each and every dictionary in your list to see if it matches ID, and if ID is not present, that means you have to search from start to finish. A dictionary will still tell you in O(1) time that the key is not there.
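To make the difference concrete, here is a minimal sketch that indexes the question's list by ID once and then updates or creates entries through that index. The names by_id and upsert are hypothetical, and unlike the layout suggested above, the sketch keeps the "ID" key inside each entry because it reuses the original dictionaries.

```python
a = [{"ID": 1, "VALUE": 2}, {"ID": 2, "VALUE": 2}, {"ID": 3, "VALUE": 4}]

# One O(N) pass builds the index; every later lookup is O(1) on average.
by_id = {item["ID"]: item for item in a}


def upsert(index, item_id, value):
    """Update VALUE for item_id, creating the entry if it is missing."""
    entry = index.setdefault(item_id, {"ID": item_id})
    entry["VALUE"] = value
    return entry


upsert(by_id, 2, 10)   # existing ID: the value is overwritten
upsert(by_id, 7, 99)   # new ID: the entry is created
print(by_id[2], by_id[7])
```

The index holds references to the same dictionaries as the list, so updating an existing entry through by_id is also visible in a; newly created entries exist only in the index unless they are appended to the list as well.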
If you can, convert to a dictionary as the other answers suggest, but in case you have reason* not to change the data structure storing your items, here's what you can do:
items = [{"ID":1, "VALUE":2}, {"ID":2, "VALUE":2}, {"ID":3, "VALUE":4}]
def set_value_by_id(id, value):
    # Try to find the item, if it exists
    for item in items:
        if item["ID"] == id:
            break
    # Make and append the item if it doesn't exist
    else: # Here, `else` means "if the loop terminated not via break"
        item = {"ID": id}
        items.append(item)  # append the new dict itself, not just its id
    # In either case, set the value
    item["VALUE"] = value
* Some valid reasons I can think of include preserving the order of items and allowing duplicate items with the same id. For ways to make dictionaries work with those requirements, you might want to take a look at OrderedDict and this answer about duplicate keys.
Convert your list into a dict; checking for values is then much more efficient.
d = dict((item['ID'], item['VALUE']) for item in a)
for new_key, new_value in new_items:  # new_items: an iterable of (id, value) pairs
    if new_key not in d:
        d[new_key] = new_value
If you also need to update the value when the key is found:
d = dict((item['ID'], item['VALUE']) for item in a)
for new_key, new_value in new_items:
    d.setdefault(new_key, 0)
    d[new_key] = new_value
Answering the question you asked, without changing the datastructure around, there's no real faster way of looking without a loop and checking every element and doing a dictionary lookup for each one - but you can push the loop down to the Python runtime instead of using Python's for loop.
I haven't tried if it ends up faster though.
a = [{"ID":1, "VALUE":2},{"ID":2, "VALUE":2},{"ID":3, "VALUE":4}]
id = 2
# list() is needed on Python 3, where filter returns a lazy iterator
tmp = list(filter(lambda d: d['ID']==id, a))
# tmp is now either an empty list, or a list of one item.
if not tmp:
    tmp = {"ID":id, "VALUE":"default"}
    a.append(tmp)
else:
    tmp = tmp[0]
# tmp is bound to the found/new dictionary
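A related idiom, shown here as a sketch rather than as part of the answer above, is next() over a generator expression, which stops scanning at the first match instead of filtering the whole list. The helper name find_or_append and its default parameter are made up for illustration.

```python
a = [{"ID": 1, "VALUE": 2}, {"ID": 2, "VALUE": 2}, {"ID": 3, "VALUE": 4}]


def find_or_append(items, wanted_id, default="default"):
    """Return the dict with the given ID, appending a new one if none exists."""
    found = next((d for d in items if d["ID"] == wanted_id), None)
    if found is None:
        found = {"ID": wanted_id, "VALUE": default}
        items.append(found)
    return found


existing = find_or_append(a, 2)
created = find_or_append(a, 5)
print(existing, created, len(a))
```

This is still an O(N) scan in the worst case; it only avoids looking at the rest of the list once the ID has been found.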

How to Re-arrange items in a Python Dictionary during For Loop?

I am building a Python dictionary from a table in Excel. It's a Category:Name relationship. So, the first column in the spreadsheet is a category and the second column is the name of a file:
Forests - Tree Type
Forests - Soil Type
Administrative - Cities
Administrative - Buildings
Mineral - Gold
Mineral - Platinum
Water - Watershed
Water - Rivers
Water - Lakes
Water - Streams
and so on...
I use this code to build the dictionary:
layerListDict = dict()
for row in arcpy.SearchCursor(xls):
    # Set condition to pull out the Name field in the xls file.
    # LayerList being the list of all the 'Name' from the 'Name' column built earlier in the script
    if str(row.getValue("Name")).rstrip() in layerList:
        # Determine if the category item is in the dictionary as a key already. If so, then append the Name to the list of values associated with the category
        if row.getValue("Category") in layerListDict:
            layerListDict[row.getValue("Category")].append(str(row.getValue("Name")))
        # if not, create a new category key and add the associated Name value to it
        else:
            layerListDict[row.getValue("Category")] = [str(row.getValue("Name"))]
So, now I have a dictionary with Category as the key and a list of Names as the values:
{u'Forests': ['Tree Type', 'Soil Type'], u'Administrative': ['Cities', 'Buildings'], u'Mineral': ['Gold', 'Platinum'], u'Water': ['Watershed', 'Rivers', 'Lakes', 'Streams']}
I can now iterate over the sorted dictionary by key:
for k,v in sorted(layerListDict.iteritems()):
    print k, v
PROBLEM: What I would like to do is to iterate over the sorted dictionary with one caveat...I wanted to have the 'Mineral' key to be the very first key and then have the rest of the keys print out in alphabetical order like this:
Mineral ['Gold', 'Platinum']
Administrative ['Cities', 'Buildings']
Forests ['Tree Type', 'Soil Type']
Water ['Watershed', 'Rivers', 'Lakes', 'Streams']
Can anyone suggest how I can accomplish this?
I tried to set a variable to a sorted list, but it returns as a python list and I cannot iterate over the Python list by a key value pair anymore.
List2 = sorted(layerListDict.iteritems())
[u'Forests':['Tree Type', 'Soil Type'], u'Administrative': ['Cities', 'Buildings'], u'Mineral': ['Gold', 'Platinum'], u'Water': ['Watershed', 'Rivers', 'Lakes', 'Streams']]
print "Mineral", layerListDict.pop("Mineral")
for k, v in sorted(layerListDict.iteritems()):
    print k, v
If you don't want to modify layerListDict:
print "Mineral", layerListDict["Mineral"]
for k, v in sorted(layerListDict.iteritems()):
    if k != "Mineral":
        print k, v
An overly general solution:
import itertools
first = 'Mineral'
for k, v in itertools.chain([(first, layerListDict[first])],
                            ((k, v) for (k, v) in sorted(layerListDict.iteritems()) if k != first)):
    print k, v
or closer to my original incorrect solution:
for k in itertools.chain((first,),
                         (k for k in sorted(layerListDict)
                          if k != first)):
    print k, layerListDict[k]
If you're just looking to print the key-value pairs, then the other solutions get the job done quite well. If you're looking for the resulting dictionary to have a certain order so that you can perform other operations on it, you should look into the OrderedDict class:
https://docs.python.org/2/library/collections.html#collections.OrderedDict
Objects are stored in the order that they are inserted. In your case, you would do something similar to the other answers first to define the order:
dict_tuples = sorted(layerListDict.items())
ordered_tuples = [("Mineral", layerListDict["Mineral"],)]
ordered_tuples += [(k, v,) for k, v in dict_tuples if k != "Mineral"]
ordered_dict = collections.OrderedDict(ordered_tuples) #assumes import happened above
Now you can do whatever you want with ordered_dict (careful with deleting then reinserting, see the link above). Don't know if that helps you more than some of the other answers (which are all pretty great!).
EDIT: Whoops, my recollection of the update behavior of OrderedDicts was a bit faulty. Fixed above. Also streamlined the code a little. You could potentially generate the tuples in your first for loop and then put them in the OrderedDict, too.
EDIT 2: Forgot that tuples are naturally sorted by the first element (thanks John Y), removed the unnecessary key param in the sorted() call.
Keep a list of keys in the order you want to iterate over the map. Then iterate through the list, using the values as keys into the map.
Actually, after seeing the other solutions, I like chepner's answer with itertools.chain() better, especially if the list of keys is large, because mine will move things around in the list too much.
# sort the keys
keyList = sorted(layerListDict.keys())
# remove 'Mineral' from its place
del keyList[keyList.index('Mineral')]
# Put it in the beginning
keyList = ['Mineral'] + keyList
# Iterate
for k in keyList:
    for v in layerListDict[k]:
        print k, v
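Another way to get the same ordering, not taken from any of the answers here, is to let sorted() do all the work with a key function. The sketch below uses Python 3 syntax and a hand-written copy of the dictionary from the question.

```python
layerListDict = {
    'Forests': ['Tree Type', 'Soil Type'],
    'Administrative': ['Cities', 'Buildings'],
    'Mineral': ['Gold', 'Platinum'],
    'Water': ['Watershed', 'Rivers', 'Lakes', 'Streams'],
}

first = 'Mineral'
# False sorts before True, so the tuple (k != first, k) puts `first` at the
# front and leaves every other key in alphabetical order.
ordered_keys = sorted(layerListDict, key=lambda k: (k != first, k))

for k in ordered_keys:
    print(k, layerListDict[k])
```

The dictionary itself is left untouched; only the order in which its keys are visited changes.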
Second shot at an answer. This is pretty different from my original, and makes some possibly wrong assertions, but I like the feel of it a lot better. Since you're able to determine all of the values in the "name" column (layerList), I'm going to assume you can do the same for the "categories" column. This code assumes you've placed your categories (including "Mineral") into an unsorted list called categories, and replaces the original code:
categories.sort()
categories = ["Mineral"] + [cat for cat in categories if cat != "Mineral"]
# Insert the categories into our dict with placeholder lists that we can append to
layerListDict = collections.OrderedDict([(cat, [],) for cat in categories])
for row in arcpy.SearchCursor(xls):
    if str(row.getValue("Name")).rstrip() in layerList:
        layerListDict[row.getValue("Category")].append(str(row.getValue("Name")))
Now you can just iterate over layerListDict.items().
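The same idea can be tried without arcpy. In the sketch below, written in Python 3 syntax, the list rows is a hypothetical stand-in for the (Category, Name) pairs read from the spreadsheet, and the categories are derived from it instead of being collected separately.

```python
import collections

# Hypothetical stand-in for the rows read from the spreadsheet.
rows = [
    ("Forests", "Tree Type"), ("Forests", "Soil Type"),
    ("Administrative", "Cities"), ("Mineral", "Gold"),
    ("Water", "Rivers"), ("Mineral", "Platinum"),
]

categories = sorted({category for category, _ in rows})
categories = ["Mineral"] + [cat for cat in categories if cat != "Mineral"]

# Pre-create the keys in the desired order, then fill the lists.
layerListDict = collections.OrderedDict((cat, []) for cat in categories)
for category, name in rows:
    layerListDict[category].append(name)

for k, v in layerListDict.items():
    print(k, v)
```

Because the keys are inserted up front, the order in which the rows arrive no longer affects the iteration order.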
