I have two dictionaries and an input:
letter = 'd'
dict_1 = {"label_1": array(['a','b']), "label_2": array(['c','d']), ...}
dict_2 = {"label_1": array(['x','y']), "label_2": array(['z','o']), ...}
letter_translated = some_function(letter)
output desired: 'o'
What I have in mind right now is to get the index of the letter in the array stored under the key "label_2" in dict_1, then look up the same index in dict_2. I am open to other ways of doing it. If anything about the question is unclear, feel free to drop a comment.
Note: the arrays are numpy arrays
I propose to iterate through the first dictionary, while keeping track of how to get to the current element (key and position), so that we can look in the same place in the second dict:
from numpy import array
dict_1 = {"label_1": array(['a','b']), "label_2": array(['c','d'])}
dict_2 = {"label_1": array(['x','y']), "label_2": array(['z','o'])}
def look_for_corresponding(letter, d1, d2):
    for key, array_of_letters in d1.items():
        for position, d1_letter in enumerate(array_of_letters):
            if d1_letter == letter:
                return d2[key][position]
    return None  # Line not necessary, but added for clarity
output = look_for_corresponding('d', dict_1, dict_2)
print(output)
# o
Of course, this code will fail if dict_1 and dict_2 do not have exactly the same structure, or if the arrays are more than 1D. If those cases apply to you, please edit your question to indicate it.
Also, I am not sure what should be done if the letter is not to be found within dict_1. This code will return None, but it could also raise an exception.
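If many letters need translating, or if an exception is preferred over None, one option is to build a lookup index once and reuse it. This is only a sketch: the name build_index is made up for illustration, plain lists stand in for the arrays so that the snippet has no dependencies, and it assumes 1-D sequences and identical keys in both dictionaries. The loop only relies on iteration and integer indexing, so it should carry over to 1-D NumPy arrays.

```python
def build_index(d1):
    """Map each letter in d1 to the (key, position) where it was found."""
    index = {}
    for key, letters in d1.items():
        for position, d1_letter in enumerate(letters):
            # setdefault keeps the first occurrence if a letter appears twice
            index.setdefault(d1_letter, (key, position))
    return index


# Plain lists used as stand-ins for the arrays (assumption for this sketch)
dict_1 = {"label_1": ['a', 'b'], "label_2": ['c', 'd']}
dict_2 = {"label_1": ['x', 'y'], "label_2": ['z', 'o']}

index = build_index(dict_1)
key, position = index['d']  # raises KeyError if the letter is absent
print(dict_2[key][position])
# o
```

Building the index costs one pass over dict_1; each lookup afterwards is a single dictionary access.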
What do you mean by 'index'? The position number?
Dictionaries don't have the concept of counted indices for their entries. You can only access data through the key (here "label_2"), or by iterating (for key in dict_1 ...).
Before Python 3.7 the iteration order is not guaranteed and the order of your declaration is not necessarily kept. From Python 3.7 on, dicts preserve insertion order, but there is still no access by position.
If you wish to get "label_2" from both, then you need to access it by key:
key = "label_2"
item_from_1 = dict_1[key]
item_from_2 = dict_2[key]
If you need to iterate over dict_1 and, for each item, find the matching item in the second dict, then this also needs to go through the key:
for (key, value1) in dict_1.items():
    value2 = dict_2[key]
    .....
Note that on Python versions before 3.7 the order in which the items come up in the loop may vary, even from one run of the program to the next.
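To illustrate why matching through the key is robust, here is a small sketch in which the second dictionary was deliberately built in a different order. The data and the name pairs are made up for this example, and plain lists are used in place of arrays.

```python
dict_1 = {"label_1": ['a', 'b'], "label_2": ['c', 'd']}
# Same keys, but inserted in a different order (made-up example data)
dict_2 = {"label_2": ['z', 'o'], "label_1": ['x', 'y']}

pairs = {}
for key, value1 in dict_1.items():
    value2 = dict_2[key]  # matched by key, so insertion order does not matter
    pairs[key] = list(zip(value1, value2))

print(pairs["label_2"])
# [('c', 'z'), ('d', 'o')]
```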
The title is a bit confusing but I am essentially trying to store the name x of a tuple (x,y) by calling the values of y. The tuples are in a list, and there are three of them. However, this data may change so I want this loop to be able to process any sort of list length as long as the form comes in tuples. Additionally my tuples are in (str, [list]) form. Here is an example of my code.
key, val = list(d.keys()), list(d.values())
for i in val:
    if i == group_ids[0]:
        for x, y in list(d.items()):
            if y == i:
                print(x, y)
I have created a defaultdict previously in order to access my list of tuples in (str, [list]) format because ipywidgets8 no longer accepts dictionaries. After having successfully selected the list of values based on the str in the widgets interface I am looking to store the name of the selected list so that I can save each file based on their respective name. I know with pandas dataframe you can call the value of one column based on the value of another with .loc and I am trying to do something similar with my list of tuples but cannot figure out how. The group_ids variable is the list of selected values received from the ipywidgets button. I have to call [0] position because it has been stored with double brackets. Even more simply put, I want (this is not in any coding language just how my brain can best present this in human words):
FOR i IN val,
if i == group_ids[0]
PRINT x of key at the index where i in val is found
I hope this explanation is clear. I feel that this should not be so difficult to figure out but for some reason I cannot.
EDIT:
A sample of my data
group_ids = [[5876233, 5883627, 5891029, 5892881, 5896571, 5900242, 5902043, 5905766, 5905796, 5913064, 5913075, 5913080, 5914875, 5920356, 5924048, 5925824, 5927655, 5929456, 5929479, 5931307, 5934950, 5936704, 5940344, 5943972, 5944002, 5945785, 5947627, 5951172, 5951181, 5954751, 5958339, 5958354, 5958358, 5965416, 5965424, 5967145, 5968843, 5972099, 5978640, 5981887, 5983501, 5993193, 5967178, 5967171, 5963649, 5951209, 5929476, 5958331, 5938533, 5933134, 5918577, 5958359]]
list(d.items()) = [('Libraries', [5876233, 5883627, 5891029, 5892881, 5896571, 5900242, 5902043, 5905766, 5905796, 5913064, 5913075, 5913080, 5914875, 5920356, 5924048, 5925824, 5927655, 5929456, 5929479, 5931307, 5934950, 5936704, 5940344, 5943972, 5944002, 5945785, 5947627, 5951172, 5951181, 5954751, 5958339, 5958354, 5958358, 5965416, 5965424, 5967145, 5968843, 5972099, 5978640, 5981887, 5983501, 5993193, 5967178, 5967171, 5963649, 5951209, 5929476, 5958331, 5938533, 5933134, 5918577, 5958359]), ('Sport Facilities', [5812360, 5817970, 5818061, 5821851, 5823750, 5823751, 5827499, 5829344, 5829423, 5831312, 5833208, 5838752, 5840647, 5842526, 5842539, 5842575, 5842583, 5844396....)]
d.keys = dict_keys(['Libraries', 'Sports Facilities', 'Youth Facilities'])
I want the respective key for the list of ids seen in the group_ids variable
So I think you just need:
for k, v in d.items():
    if v[0] == group_ids[0][0]:
        print(k, v)
Note that group_ids is doubly nested, so the first id is group_ids[0][0]. Checking if v == group_ids[0] would be more precise, but I assume that all ids are different.
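If the name has to be looked up repeatedly, a reverse mapping from the id list to its name avoids the loop. The following is a sketch under assumptions: the id lists are shortened, the 'Youth Facilities' ids are invented, and name_by_ids and selected_name are hypothetical names. Lists are not hashable, so they are converted to tuples before being used as keys.

```python
# Shortened, partly invented sample data (assumption for this sketch)
d = {
    "Libraries": [5876233, 5883627],
    "Sport Facilities": [5812360, 5817970],
    "Youth Facilities": [5900001],
}
group_ids = [[5876233, 5883627]]  # doubly nested, as in the question

# Lists cannot be dict keys, so convert each id list to a tuple
name_by_ids = {tuple(ids): name for name, ids in d.items()}

# .get returns None when the selection matches no stored list
selected_name = name_by_ids.get(tuple(group_ids[0]))
print(selected_name)
# Libraries
```

This compares whole lists, including their order, so it only matches when the selection is exactly one of the stored lists.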
I have an array of dictionaries of the form:
[
{
generic_key: specific_key,
generic_value: specific_value
}
...
]
I am trying to interpret this into an array of dictionaries of this form:
[
{
specific_key: specific_value
}
...
]
I tried this:
new_array = []
for row in old_array:
    values = list(row.values())
    key = values[0]
    val = values[1]
    new_array.append({key: val})
This works in most cases, but in some, it swaps them around to form a dict like this:
{
specific_value: specific_key
}
I've looked at the source file, and the rows in which it does this are identical to the rows in which it does not do this.
It's perhaps worth mentioning that the list in question is about 3000 elements in length.
Am I doing something stupid? I guess that maybe list(row.values()) does not necessarily preserve the order, but I don't see why it wouldn't.
EDIT fixed code typo suggesting that it was appending sets
The order in which dict keys/values are enumerated should be treated as arbitrary. (Since Python 3.7 it is guaranteed to be insertion order, so it is consistent for a given dict, but two rows whose keys were inserted in a different order will still enumerate differently.) If you wanted order, you would have used a list instead of a dict to store them in the first place. If generic_key and generic_value are the same each time, then the ideal way to handle this problem is to simply extract by key:
key = row['generic_key']
value = row['generic_value']
If this isn't the case but there is a consistent way to differentiate between generic_key and generic_value, then you can grab both the keys and values, and do that:
items = tuple(row.items())
if items[0][0] == 'generic_key':  # insert whatever condition you need here
    key = items[0][1]
    value = items[1][1]
else:
    key = items[1][1]
    value = items[0][1]
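Put together, the extract-by-key approach fits in one list comprehension. This is a minimal sketch that assumes every row uses the literal keys 'generic_key' and 'generic_value'; the sample rows are made up, and the second one is deliberately written in the opposite order.

```python
# Made-up sample rows; the second has its keys in the opposite order
old_array = [
    {"generic_key": "colour", "generic_value": "red"},
    {"generic_value": 3, "generic_key": "count"},
]

# Looking up by name makes the result independent of key order
new_array = [{row["generic_key"]: row["generic_value"]} for row in old_array]

print(new_array)
# [{'colour': 'red'}, {'count': 3}]
```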
I have created a Python dictionary with a structure like this:
mydict = {'2018-08' : [32124,4234,23,2323,32423,342342],
'2018-07' : [13123,23424,2,3,4343,4232,2342],
'2018-06' : [1231,12,12313,12331,3123131313,434546,232]}
I want to check whether any of the values under the key '2018-08' match any of the values under the other keys. Is there a short way to write this?
You can simply loop over your expected values from mydict, and for each of them check whether it is present in any of the other values of the dictionary.
You can use the idiom if item in some_list to check whether item is present in the list some_list:
expected_values = mydict['2018-08']
found = False
for expected in expected_values:
    for key in mydict:
        # skip the target key itself, otherwise every value trivially matches
        if key != '2018-08' and expected in mydict[key]:
            found = True
            break
Take into account that this is a brute-force algorithm and it may not be the optimal solution for larger dictionaries.
The question is vague, so I am assuming that you want the values in the target month (e.g. 2018-08) that are contained somewhere within the other months.
Sets are much faster for testing membership compared to list iteration.
target = '2018-08'
s = set()
for k, v in mydict.items():
    if k != target:
        s.update(v)
matches = set(mydict[target]) & s
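If it also matters which month each match came from, the same set idea can record the source key. This is a sketch under assumptions: the sample data is altered so that one value (342342) overlaps, and matches_by_value is a hypothetical name.

```python
from collections import defaultdict

# Sample data altered so that 342342 also appears in '2018-06' (assumption)
mydict = {
    '2018-08': [32124, 4234, 23, 2323, 32423, 342342],
    '2018-07': [13123, 23424, 2, 3, 4343, 4232, 2342],
    '2018-06': [1231, 12, 12313, 12331, 3123131313, 434546, 232, 342342],
}

target = '2018-08'
target_values = set(mydict[target])

# value -> set of other keys in which that value also appears
matches_by_value = defaultdict(set)
for key, values in mydict.items():
    if key != target:
        for value in target_values.intersection(values):
            matches_by_value[value].add(key)

print(dict(matches_by_value))
# {342342: {'2018-06'}}
```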
You can use itertools.chain to build one long list of all the other values:
import itertools
for key, value in mydict.items():
    temp_dict = mydict.copy()
    temp_dict.pop(key)
    big_value_list = list(itertools.chain(*temp_dict.values()))
    print(key, set(value) & set(big_value_list))
Dry run with your provided inputs changed so that some values overlap:
mydict = {'2018-08' : [32124,4234,23,2323,32423,342342],
'2018-07' : [13123,23424,2,3,4343,4232,2342],
'2018-06' : [1231,12,12313,12331,3123131313,434546,232,342342,2342]}
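Output (as printed by Python 2; Python 3 prints the key and the set separated by a space instead of a tuple):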
('2018-08', set([342342]))
('2018-07', set([2342]))
('2018-06', set([342342, 2342]))
I have a list of dictionaries:
a = [{"ID":1, "VALUE":2},{"ID":2, "VALUE":2},{"ID":3, "VALUE":4},...]
"ID" is a unique identifier for each dictionary. Considering the list is huge, what is the fastest way of checking if a dictionary with a certain "ID" is in the list, and if not append to it? And then update its "VALUE" ("VALUE" will be updated if the dict is already in list, otherwise a certain value will be written)
You'd not use a list. Use a dictionary instead, mapping ids to nested dictionaries:
a = {
1: {'VALUE': 2, 'foo': 'bar'},
42: {'VALUE': 45, 'spam': 'eggs'},
}
Note that you don't need to include the ID key in the nested dictionary; doing so would be redundant.
Now you can simply look up if a key exists:
if someid in a:
    a[someid]['VALUE'] = newvalue
I did make the assumption that your ID keys are not necessarily sequential numbers. I also made the assumption you need to store other information besides VALUE; otherwise just a flat dictionary mapping ID to VALUE values would suffice.
A dictionary lets you look up values by key in O(1) time (constant time independent of the size of the dictionary). Lists let you look up elements in constant time too, but only if you know the index.
If you don't and have to scan through the list, you have a O(N) operation, where N is the number of elements. You need to look at each and every dictionary in your list to see if it matches ID, and if ID is not present, that means you have to search from start to finish. A dictionary will still tell you in O(1) time that the key is not there.
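As a sketch of the conversion and the update-or-insert step, assuming the list from the question and unique IDs: the names by_id and upsert are made up for illustration, and the 'ID' entry is dropped from the nested dictionaries because the outer key already carries it.

```python
a = [{"ID": 1, "VALUE": 2}, {"ID": 2, "VALUE": 2}, {"ID": 3, "VALUE": 4}]

# One O(N) pass to build the mapping; 'ID' is dropped from the nested dicts
by_id = {
    item["ID"]: {k: v for k, v in item.items() if k != "ID"}
    for item in a
}


def upsert(mapping, someid, newvalue):
    """Set VALUE for someid, creating the record first if it is missing."""
    # setdefault returns the existing record, or inserts and returns {}
    mapping.setdefault(someid, {})["VALUE"] = newvalue


upsert(by_id, 2, 10)  # existing ID: value is updated
upsert(by_id, 7, 1)   # new ID: record is created
print(by_id)
# {1: {'VALUE': 2}, 2: {'VALUE': 10}, 3: {'VALUE': 4}, 7: {'VALUE': 1}}
```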
If you can, convert to a dictionary as the other answers suggest, but in case you have a reason* not to change the data structure storing your items, here's what you can do:
items = [{"ID":1, "VALUE":2}, {"ID":2, "VALUE":2}, {"ID":3, "VALUE":4}]
def set_value_by_id(id, value):
    # Try to find the item, if it exists
    for item in items:
        if item["ID"] == id:
            break
    # Make and append the item if it doesn't exist
    else:  # Here, `else` means "if the loop terminated not via break"
        item = {"ID": id}
        items.append(item)
    # In either case, set the value
    item["VALUE"] = value
* Some valid reasons I can think of include preserving the order of items and allowing duplicate items with the same id. For ways to make dictionaries work with those requirements, you might want to take a look at OrderedDict and this answer about duplicate keys.
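If the list has to stay as it is but lookups are frequent, a middle ground is to keep a small side index from ID to list position. This is only a sketch: it assumes IDs are unique and that items are never removed or reordered (either would invalidate the stored positions), and the names position_by_id and set_value_indexed are hypothetical.

```python
items = [{"ID": 1, "VALUE": 2}, {"ID": 2, "VALUE": 2}, {"ID": 3, "VALUE": 4}]

# Side index: ID -> position in the list (built once, O(N))
position_by_id = {item["ID"]: i for i, item in enumerate(items)}


def set_value_indexed(items, position_by_id, item_id, value):
    """Update the item with item_id, or append a new one, without scanning."""
    pos = position_by_id.get(item_id)
    if pos is None:
        # New ID: it will live at the end of the list
        position_by_id[item_id] = len(items)
        items.append({"ID": item_id, "VALUE": value})
    else:
        items[pos]["VALUE"] = value


set_value_indexed(items, position_by_id, 2, 99)  # update in place
set_value_indexed(items, position_by_id, 5, 7)   # append, order preserved
print(items)
```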
Converting your list into a dict and then checking for keys is much more efficient.
d = dict((item['ID'], item['VALUE']) for item in a)
for new_key, new_value in new_items:
    if new_key not in d:
        d[new_key] = new_value
If you also need to update the value when the key is found:
d = dict((item['ID'], item['VALUE']) for item in a)
for new_key, new_value in new_items:
    d.setdefault(new_key, 0)
    d[new_key] = new_value
Answering the question you asked, without changing the datastructure around, there's no real faster way of looking without a loop and checking every element and doing a dictionary lookup for each one - but you can push the loop down to the Python runtime instead of using Python's for loop.
I haven't tried if it ends up faster though.
a = [{"ID":1, "VALUE":2},{"ID":2, "VALUE":2},{"ID":3, "VALUE":4}]
id = 2
tmp = list(filter(lambda d: d['ID'] == id, a))
# the filtered list is either empty or holds exactly one item
# (list() is needed on Python 3, where filter returns a lazy iterator)
if not tmp:
    tmp = {"ID": id, "VALUE": "default"}
    a.append(tmp)
else:
    tmp = tmp[0]
# tmp is bound to the found/new dictionary
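A variation on the same idea uses next() with a generator expression, which stops at the first match instead of filtering the whole list. This is a sketch, not a benchmarked improvement; the function name find_or_append and the "default" placeholder value are assumptions for illustration.

```python
def find_or_append(records, wanted_id, default="default"):
    """Return the dict with the given ID, appending a new one if none exists."""
    # next() stops at the first match; None is returned when nothing matches
    found = next((d for d in records if d["ID"] == wanted_id), None)
    if found is None:
        found = {"ID": wanted_id, "VALUE": default}
        records.append(found)
    return found


a = [{"ID": 1, "VALUE": 2}, {"ID": 2, "VALUE": 2}, {"ID": 3, "VALUE": 4}]
existing = find_or_append(a, 2)  # already present, returned as-is
created = find_or_append(a, 7)   # missing, so it is appended
print(existing, created, len(a))
# {'ID': 2, 'VALUE': 2} {'ID': 7, 'VALUE': 'default'} 4
```

This is still an O(N) scan in the worst case; it only avoids building an intermediate list.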
In the below code d_arr is an array of dictionaries
def process_data(d_arr):
    flag2 = 0
    for dictionaries in d_arr:
        for k in dictionaries:
            if ( k == "*TYPE" ):
                """ Here we determine the type """
                if (dictionaries[k].lower() == "name"):
                    dictionaries.update({"type" : 0})
                    func = name(dictionaries)
                    continue
                elif (dictionaries[k].lower() == "ma"):
                    dictionaries.update({"type" : 1})
                    func = DCC(dictionaries)
                    logging.debug(type(func))
                    continue
When I run the above, I get an error saying:
    for k in dictionaries:
RuntimeError: dictionary changed size during iteration
2010-08-02 05:26:44,167 DEBUG Returning
Is it forbidden to do something like this?
It is, indeed, forbidden. Moreover, you don't really need a loop over all keys here, given that the weirdly named dictionaries appears to be a single dict; rather than the for k in dictionaries: (or the for k in dictionaries.keys() that @Triptych's answer suggests, which is workable on Python 2, where keys() returns a list), you could use...:
tp = dictionaries.get('*TYPE')
if tp is not None:
    """ Here we determine the type """
    if tp.lower() == 'name':
        dictionaries.update({"type" : 0})
        func = name(dictionaries)
    elif tp.lower() == "ma":
        dictionaries.update({"type" : 1})
        func = DCC(dictionaries)
        logging.debug(type(func))
This is going to be much faster if dictionaries has any considerable length, for you're reaching directly for the one entry you care about, rather than looping over all entries to check each of them for the purpose of seeing if it is the one you care about.
Even if you've chosen to omit part of your code, so that after this start the loop on dictionaries is still needed, I think my suggestion is still preferable because it lets you get any alteration to dictionaries done and over with (assuming of course that you don't keep altering it in the hypothetical part of your code I think you may have chosen to omit;-).
That error is pretty informative; you can't change the size of a dictionary you are currently iterating over.
The solution is to get the keys all at once and iterate over that snapshot:
# Do this (on Python 3, keys() alone is a live view and still raises, so wrap it in list())
for k in list(dictionaries.keys()):
# Not this
for k in dictionaries:
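A small self-contained sketch of both behaviours on Python 3 follows. The dictionary contents and the names record and error_message are made up for illustration.

```python
error_message = None

record = {"*TYPE": "name", "other": 1}
try:
    for k in record:
        if k == "*TYPE":
            record["type"] = 0  # adds a key while iterating over the dict
except RuntimeError as exc:
    error_message = str(exc)

print(error_message)
# dictionary changed size during iteration

record = {"*TYPE": "name", "other": 1}
for k in list(record):  # iterate over a snapshot of the keys instead
    if k == "*TYPE":
        record["type"] = 0  # safe: the snapshot list is not affected

print(record)
# {'*TYPE': 'name', 'other': 1, 'type': 0}
```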