Append values to a list of json objects - python

I'm collecting weather data and trying to create a list that has the latest (temperature) value by minute.
I want to add them to a list, and if the list does not contain the "minute index" it should add it as a new element in the list. So the list always keeps the latest temperature value per minute:
import datetime

def AddValue(arr, value):
    timestamp = datetime.datetime.utcnow().strftime("%Y-%m-%d %H:%M")
    for v in arr['values']:
        try:
            e = v[timestamp]  # will trigger the try/catch if not there
            v[timestamp] = value
        except KeyError:
            v.append({ timestamp: value })
history = [
    { 'values': [ {'2017-12-22 10:20': 1}, {'2017-12-22 10:21': 2}, {'2017-12-22 10:22': 3} ] },
]
AddValue(history, 99)
However, I'm getting
AttributeError: 'dict' object has no attribute 'append'

You associate a key k with a value v in a dictionary d with:
d[k] = v
This works regardless of whether the key k is already present in the dictionary; if it is, the value is overwritten. We can thus rewrite the for loop to:
for v in arr['values']:
    v[timestamp] = value
In case you want to update a dictionary with several keys, you can use .update and pass a dictionary object, or named parameters as keys (with the corresponding values). So we can write it as:
for v in arr['values']:
    v.update({timestamp: value})
which is semantically the same, but requires a bit more computational effort.
Nevertheless, since you need to iterate over the whole list of dicts just to update a single minute, you should perhaps reconsider the way you structured the data.
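For example, a minimal sketch of such a restructuring (the names here are illustrative, not from the question) keeps a single dict keyed by the minute timestamp, so no loop is needed at all and each minute holds exactly one latest value:

import datetime

# hypothetical restructured storage: one dict of values, keyed by minute
history = {'values': {}}

def add_value(store, value):
    # same minute-resolution timestamp as in the question
    timestamp = datetime.datetime.utcnow().strftime("%Y-%m-%d %H:%M")
    store['values'][timestamp] = value  # inserts or overwrites in O(1)

add_value(history, 99)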

Related

RuntimeError: dictionary changed size

I have two lists of nested dictionaries. Each dictionary has a key/value pair that is the same. I have some code that says: if these are the same, copy another key/value combo that exists in one of the dictionaries over to the other dictionary. I get the error RuntimeError: dictionary changed size during iteration. I've seen that you can use deepcopy to solve this, but does anybody else have any other ideas?
performances = [
    {'campaign_id': 'bob'},
    {'campaign_id': 'alice'},
]
campaign_will_spend = [
    {'id': 'bob'},
    {'id': 'alice'},
]
for item in campaign_will_spend:
    ad_dictt = dict()
    for willspendkey, willspendvalue in item.items():
        if willspendkey == "id":
            for i in performances:
                for key, value in i.items():
                    if key == 'campaign_id' and value == willspendvalue:
                        i['lifetime_budget'] = item
You're causing yourself a lot of trouble by treating dictionaries like lists and iterating over them in their entirety to find a particular item. Most of the code goes away when you just stop doing that, and the rest of it goes away if you build a dictionary to be able to easily look up entries in campaign_will_spend:
# Easy lookup for campaign_will_spend dictionaries by id.
cws_by_id = {d['id']: d for d in campaign_will_spend}
for p in performances:
    p["lifetime_budget"] = cws_by_id[p["campaign_id"]]

Filtering out Python Dictionary Values with Array of Nested Keys

I am trying to filter out a number of values from a python dictionary. Based on the answer seen here: Filter dict to contain only certain keys. I am doing something like:
new = {k:data[k] for k in FIELDS if k in data}
Basically create the new dictionary and care only about the keys listed in the FIELDS array. My array looks like:
FIELDS = ["timestamp", "unqiueID",etc...]
However, how do I do this if the key is nested, i.e. ['user']['color']?
How do I add a nested key to this array? I've tried:
[user][color], ['user']['color'], 'user]['color, and none of them are right :) Many of the values I need are nested fields. How can I add a nested key to this array and still have the new = {k:data[k] for k in FIELDS if k in data} bit work?
A quite simple approach could look like the following (it will not work for all cases, e.g. objects nested inside lists/arrays). You just need to specify a 'format' for how you want to look up nested values.
findValue splits the searchKey (here on dots) and looks the first key up in the given object; if it is found, it searches for the next 'sub-key' in the resulting value (assuming it is a dict/object), and so on.
myObj = {
    "foo": "bar",
    "baz": {
        "foo": {
            "bar": True
        }
    }
}

def findValue(obj, searchKey):
    keys = searchKey.split('.')
    for i, subKey in enumerate(keys):
        if subKey in obj:
            if i == len(keys) - 1:
                return obj[subKey]
            else:
                obj = obj[subKey]
        else:
            print("Key not found: %s (%s)" % (subKey, keys))
            return None

res = findValue(myObj, 'foo')
print(res)
res = findValue(myObj, 'baz.foo.bar')
print(res)
res = findValue(myObj, 'cantFind')
print(res)
Returns:
bar
True
Key not found: cantFind (cantFind)
None
Create a recursive function which checks whether a dictionary key holds a plain value or another dictionary.
If the key holds a dictionary, call the function again until you reach a non-dictionary value.
When you find a value, just add it to your newly created dictionary.
Hope this helps.
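One way to read that suggestion, as a minimal sketch (the flatten_keep helper and the dotted-key convention are assumptions of mine, not part of the answer):

def flatten_keep(data, wanted, prefix="", out=None):
    # Recursively walk the dict; keep only keys listed in `wanted`
    # (nested keys written with dots, e.g. "user.color").
    if out is None:
        out = {}
    for key, value in data.items():
        path = key if not prefix else prefix + "." + key
        if isinstance(value, dict):
            flatten_keep(value, wanted, path, out)
        elif path in wanted:
            out[path] = value
    return out

FIELDS = ["timestamp", "user.color"]
data = {"timestamp": 1, "user": {"color": "red", "size": 3}}
print(flatten_keep(data, FIELDS))  # {'timestamp': 1, 'user.color': 'red'}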

Iterating through (JSON?) map in Python

I have a datastructure like that:
sample_map = {'key1': {'internal_key1': ['value1']},
              'key2': {'internal_key2': ['value2']},
              }
I would like to iterate through every entry, key1 and key2, and get the value of 'internal_key1' and 'value1'.
I tried this way:
for keys in sample_map.keys():
    for value in sample_map[keys]:
        # get internal_keys and values
How should I do this? Could someone also tell me a bit about this data structure and how to use it?
for item in sample_map.values():
    for k, v in item.items():
        print k, v
How about the following:
for k, v in sample_map.iteritems():
    print k
    for k1, v1 in v.iteritems():
        print k1, v1[0]
This will print the following:
key2
internal_key2 value2
key1
internal_key1 value1
This is called a dictionary (type dict). It does resemble the JSON structure, although JSON is a format in which a string is built to represent a specific structure of data, and dict is a structure of data (that can be converted to a string in JSON format).
Anyway, the line for value in sample_map[keys]: does not do what you want. To get the value linked to a key, you just have to do value = sample_map[keys]. In this example, dicts will be assigned to value; those will be the inner dicts ({'internal_key1': ['value1']} and so on).
So to access the inner keys, call .keys() on value:
for keys in sample_map.keys():
    value = sample_map[keys]
    for internal_key in value.keys():
        internal_value = value[internal_key]
Also, when using a for loop there's no need for dict.keys(); iterating over a dict yields its keys automatically, so your code could look like:
for keys in sample_map:
    value = sample_map[keys]
    for internal_key in value:
        internal_value = value[internal_key]

Checking items in a list of dictionaries in python

I have a list of dictionaries:
a = [{"ID":1, "VALUE":2},{"ID":2, "VALUE":2},{"ID":3, "VALUE":4},...]
"ID" is a unique identifier for each dictionary. Considering the list is huge, what is the fastest way of checking if a dictionary with a certain "ID" is in the list, and if not append to it? And then update its "VALUE" ("VALUE" will be updated if the dict is already in list, otherwise a certain value will be written)
You'd not use a list. Use a dictionary instead, mapping ids to nested dictionaries:
a = {
    1: {'VALUE': 2, 'foo': 'bar'},
    42: {'VALUE': 45, 'spam': 'eggs'},
}
Note that you don't need to include the ID key in the nested dictionary; doing so would be redundant.
Now you can simply look up if a key exists:
if someid in a:
    a[someid]['VALUE'] = newvalue
I did make the assumption that your ID keys are not necessarily sequential numbers. I also made the assumption you need to store other information besides VALUE; otherwise just a flat dictionary mapping ID to VALUE values would suffice.
A dictionary lets you look up values by key in O(1) time (constant time independent of the size of the dictionary). Lists let you look up elements in constant time too, but only if you know the index.
If you don't and have to scan through the list, you have an O(N) operation, where N is the number of elements. You need to look at each and every dictionary in your list to see if it matches the ID, and if the ID is not present, you have to search from start to finish. A dictionary will still tell you in O(1) time that the key is not there.
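A minimal sketch of the difference (the names here are illustrative):

a_list = [{"ID": i, "VALUE": i * 2} for i in range(100000)]
a_dict = {i: {"VALUE": i * 2} for i in range(100000)}

# O(N): scan every element until a match is found (or the end is reached)
found_slow = any(d["ID"] == 99999 for d in a_list)

# O(1): a single hash lookup, regardless of the dictionary's size
found_fast = 99999 in a_dict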
If you can, convert to a dictionary as the other answers suggest, but in case you have a reason* not to change the data structure storing your items, here's what you can do:
items = [{"ID":1, "VALUE":2}, {"ID":2, "VALUE":2}, {"ID":3, "VALUE":4}]
def set_value_by_id(id, value):
# Try to find the item, if it exists
for item in items:
if item["ID"] == id:
break
# Make and append the item if it doesn't exist
else: # Here, `else` means "if the loop terminated not via break"
item = {"ID": id}
items.append(id)
# In either case, set the value
item["VALUE"] = value
* Some valid reasons I can think of include preserving the order of items and allowing duplicate items with the same id. For ways to make dictionaries work with those requirements, you might want to take a look at OrderedDict and this answer about duplicate keys.
Convert your list into a dict; checking for values is then much more efficient.
d = dict((item['ID'], item['VALUE']) for item in a)
for new_key, new_value in new_items:
    if new_key not in d:
        d[new_key] = new_value
If you also need to update the value when the key is found:
d = dict((item['ID'], item['VALUE']) for item in a)
for new_key, new_value in new_items:
    d.setdefault(new_key, 0)
    d[new_key] = new_value
Answering the question you asked, without changing the data structure: there's no real way to avoid looping over every element and doing a dictionary lookup for each one, but you can push the loop down into the Python runtime instead of writing an explicit for loop.
I haven't tested whether it ends up faster, though.
a = [{"ID":1, "VALUE":2},{"ID":2, "VALUE":2},{"ID":3, "VALUE":4}]
id = 2
tmp = filter(lambda d: d['ID']==id, a)
# the filter will either return an empty list, or a list of one item.
if not tmp:
tmp = {"ID":id, "VALUE":"default"}
a.append(tmp)
else:
tmp = tmp[0]
# tmp is bound to the found/new dictionary
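Note that in Python 3 filter returns a lazy iterator rather than a list, so the same idea there would need the result materialized first, e.g. (a sketch):

tmp = list(filter(lambda d: d['ID'] == id, a))
if not tmp:
    tmp = {"ID": id, "VALUE": "default"}
    a.append(tmp)
else:
    tmp = tmp[0]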

Multiple keys per value

Is it possible to assign multiple keys per value in a Python dictionary? One possible solution is to assign the value to each key:
dict = {'k1':'v1', 'k2':'v1', 'k3':'v1', 'k4':'v2'}
but this is not memory efficient since my data file is > 2 GB. Otherwise you could make a dictionary of dictionary keys:
key_dict = {'k1':'k1', 'k2':'k1', 'k3':'k1', 'k4':'k4'}
dict = {'k1':'v1', 'k4':'v2'}
main_key = key_dict['k2']
value = dict[main_key]
This is also very time- and effort-consuming because I have to go through the whole dictionary/file twice. Is there any other easy, built-in Python solution?
Note: my dictionary values are not simple strings (as in the question, 'v1', 'v2') but rather complex objects (containing other dictionaries/lists etc., and it is not possible to pickle them).
Note: the question seems similar to How can I use both a key and an index for the same dictionary value?
But I am not looking for an ordered/indexed dictionary; I am looking for efficient solutions (if any) other than the two mentioned in this question.
What type are the values?
dict = {'k1':MyClass(1), 'k2':MyClass(1)}
will give duplicate value objects, but
v1 = MyClass(1)
dict = {'k1':v1, 'k2':v1}
results in both keys referring to the same actual object.
In the original question, your values are strings: even though you're declaring the same string twice, I think they'll be interned to the same object in that case
NB. if you're not sure whether you've ended up with duplicates, you can find out like so:
if dict['k1'] is dict['k2']:
    print("good: k1 and k2 refer to the same instance")
else:
    print("bad: k1 and k2 refer to different instances")
(is check thanks to J.F.Sebastian, replacing id())
Check out this - it's an implementation of exactly what you're asking: multi_key_dict(ionary)
https://pypi.python.org/pypi/multi_key_dict
(sources at https://github.com/formiaczek/python_data_structures/tree/master/multi_key_dict)
(on Unix platforms it possibly comes as a package and you can try to install it with something like:
sudo apt-get install python-multi-key-dict
for Debian, or an equivalent for your distribution)
You can use different types for keys but also keys of the same type. Also you can iterate over items using key types of your choice, e.g.:
m = multi_key_dict()
m['aa', 12] = 12
m['bb', 1] = 'cc and 1'
m['cc', 13] = 'something else'
print m['aa']   # will print '12'
print m[12]     # will also print '12'
# but also:
for key, value in m.iteritems(int):
    print key, ':', value
# will print:
# 1 : cc and 1
# 12 : 12
# 13 : something else
# and iterating by string keys:
for key, value in m.iteritems(str):
    print key, ':', value
# will print:
# aa : 12
# cc : something else
# bb : cc and 1
m[12] = 20      # now update the value
print m[12]     # will print '20' (updated value)
print m['aa']   # will also print '20' (it maps to the same element)
There is no limit to the number of keys, so code like:
m['a', 3, 5, 'bb', 33] = 'something'
is valid, and any of the keys can be used to refer to the value so created (to read, write or delete it).
Edit: From version 2.0 it should also work with python3.
Using Python 2.7/3 you can combine (tuple-of-keys, value) pairs with a dictionary comprehension.
keys_values = ( (('k1','k2'), 0), (('k3','k4','k5'), 1) )
d = { key : value for keys, value in keys_values for key in keys }
You can also update the dictionary similarly.
keys_values = ( (('k1',), int), (('k3','k4','k6'), int) )
d.update({ key : value for keys, value in keys_values for key in keys })
I don't think this really gets to the heart of your question but in light of the title, I think this belongs here.
The most straightforward way to do this is to construct your dictionary using the dict.fromkeys() method. It takes a sequence of keys and a value as inputs and then assigns the value to each key.
Your code would be:
dict = dict.fromkeys(['k1', 'k2', 'k3'], 'v1')
dict.update(dict.fromkeys(['k4'], 'v2'))
And the output is:
print(dict)
{'k1': 'v1', 'k2': 'v1', 'k3': 'v1', 'k4': 'v2'}
You can build an auxiliary dictionary of objects that were already created from the parsed data. The key would be the parsed data, the value would be your constructed object -- say the string value should be converted to some specific object. This way you can control when to construct the new object:
existing = {}  # auxiliary dictionary for making the duplicates shared
result = {}

for k, v in parsed_data_generator():
    obj = existing.setdefault(v, MyClass(v))  # could be made more efficient
    result[k] = obj
Then all the result dictionary duplicate value objects will be represented by a single object of the MyClass class. After building the result, the existing auxiliary dictionary can be deleted.
Here dict.setdefault() is elegant and brief, but you should test whether the more verbose solution below is not actually more efficient. The reason is that MyClass(v) is always constructed (in the above example) and then thrown away if a duplicate already exists:
existing = {}  # auxiliary dictionary for making the duplicates shared
result = {}

for k, v in parsed_data_generator():
    if v in existing:
        obj = existing[v]
    else:
        obj = MyClass(v)
        existing[v] = obj
    result[k] = obj
This technique can also be used when v is not converted to anything special. For example, if v is a string, both the key and the value in the auxiliary dictionary will be the same string. However, the existence of the dictionary ensures that the object is shared (which Python does not always guarantee).
I was able to achieve similar functionality using pandas MultiIndex, although in my case the values are scalars:
>>> import numpy
>>> import pandas
>>> keys = [numpy.array(['a', 'b', 'c']), numpy.array([1, 2, 3])]
>>> df = pandas.DataFrame(['val1', 'val2', 'val3'], index=keys)
>>> df.index.names = ['str', 'int']
>>> df.xs('b', axis=0, level='str')
        0
int
2    val2
>>> df.xs(3, axis=0, level='int')
        0
str
c    val3
I'm surprised no one has mentioned using Tuples with dictionaries. This works just fine:
my_dictionary = {}
my_dictionary[('k1', 'k2', 'k3')] = 'v1'
my_dictionary[('k4',)] = 'v2'
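Note that lookups then need the full tuple as the key (and a one-element tuple needs a trailing comma); a quick sketch:

print(my_dictionary[('k1', 'k2', 'k3')])  # 'v1'
print(my_dictionary[('k4',)])             # 'v2'
# my_dictionary['k1'] would raise a KeyError: the whole tuple is the key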
