I have a dictionary whose keys and values are updated from the internet. The position and number of its keys change on every update (for some reason), but the key names and value formats stay the same. Initially, I converted its keys and values to separate arrays and stored the values in a database by array position. After I discovered this variability, that approach no longer works, since len(dictionary) keeps changing. The dictionary items are fetched from a URL on every update, which sometimes gives me 31 items (each item is key:value) and sometimes 3, 29, 28 or even 27 items.
So I have identified some 'always-there' items, and I want to extract them on every update by key rather than by position. In other words, I need to look up specific keys in the dictionary and save their corresponding values to variables. For instance, on one update the number of keys is:
>>> len(dict.keys())
30
on another update:
>>> len(dict.keys())
26
This shows that the number of items in the dictionary keeps varying. However, I have noted a list of obligatory keys (listed below) that are always there, so I just need to look for them whenever the dictionary is updated. More precisely, I need a way to extract specific keys and their corresponding values from the dictionary and save them to separate variables so that I can store them in the database. The keys to be searched are:
temp_f
relative_humidity
wind_dir
pressure_mb
location
Thanks.
If I understood your problem well, you don't need to maintain the order of keys/values in your dictionary; you just want to strip your dictionary of unwanted keys and rename the keys you are interested in. Your concern is that some keys might also be missing. I would solve it this way:
new_dict = {
    # .get() returns None instead of raising KeyError when a key is missing
    'tf': original_dict.get('temp_f', None),
    'rh': original_dict.get('relative_humidity', None),
    # And so on...
}
If you want to maintain the order, use collections.OrderedDict instead of a normal dict (on Python 3.7 and later, a normal dict already preserves insertion order).
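The same idea can be written once for the whole list of obligatory keys. This is only a sketch: the sample values, the extra `station_id` key, and the helper name `extract_required` are made up for illustration, and only the five key names come from the question.

```python
# Hypothetical sample of one update; real updates may carry a varying
# number of keys in any order.
update = {
    "wind_dir": "NW",
    "temp_f": 71.2,
    "station_id": "KXYZ",  # an extra key we do not care about
    "pressure_mb": 1013,
    "location": "Springfield",
    # "relative_humidity" is deliberately missing from this update
}

REQUIRED_KEYS = ("temp_f", "relative_humidity", "wind_dir", "pressure_mb", "location")


def extract_required(data, keys=REQUIRED_KEYS):
    """Return only the wanted keys, in a fixed order; missing keys map to None."""
    return {key: data.get(key) for key in keys}


row = extract_required(update)
print(row)
```

Because the lookup is by name, it does not matter how many other keys arrive or where they sit; a `None` value tells you that a supposedly obligatory key was absent from that update.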
I have a fairly complex dictionary with lists that I'm trying to use. I'm attempting to search the dictionary for the key "rows", which has a list.
With my current sample of data, I can easily pull it like so with index operator:
my_rows = my_dict['inputContent']['document']['fields'][2]['value']['rows']
'rows' is a list and those are the values I am trying to pull. However, my data from 'rows' won't always be in the exact same location but it will always be in my_dict. It could be in a different index like so:
my_dict['inputContent']['document']['fields'][4]['value']['rows']
or
my_dict['inputContent']['document']['fields'][7]['value']['rows']
Really any number.
I've tried using just the basic:
my_rows = my_dict.get("rows")
But this returns None.
I find lots of articles on how to search for a value and return its key, but I know the key and it will always be the same, while the values in 'rows' will always be different.
I'm new to Python and to using dictionaries in general, and I'm really struggling to drill down into this dictionary to pull this list.
my_dict['inputContent']['document']['fields'][2]['value']['rows']
my_dict['inputContent']['document']['fields'][4]['value']['rows']
my_dict['inputContent']['document']['fields'][7]['value']['rows']
Looks like the overall structure is the same; the only thing that varies is the numeric list index after 'fields'. So we need a loop that iterates over each element of that list:
for element in my_dict['inputContent']['document']['fields']:
    if 'rows' in element['value']:
        # found it!
        print(element['value']['rows'])
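If you want the list back in a variable rather than printed, the loop can be wrapped in a small function. This is a sketch under assumptions: the helper name `find_rows` and the sample data are invented here, and the `isinstance` check is a guess that some entries in 'fields' may hold a plain string or number under 'value'.

```python
def find_rows(data):
    """Return the first 'rows' list found under any entry of 'fields', else None."""
    for field in data["inputContent"]["document"]["fields"]:
        value = field.get("value")
        # Some fields may hold a plain string or number instead of a dict.
        if isinstance(value, dict) and "rows" in value:
            return value["rows"]
    return None


# Hypothetical sample data shaped like the paths in the question.
my_dict = {
    "inputContent": {
        "document": {
            "fields": [
                {"name": "title", "value": "Invoice"},
                {"name": "total", "value": {"amount": 12.5}},
                {"name": "table", "value": {"rows": [["a", 1], ["b", 2]]}},
            ]
        }
    }
}

my_rows = find_rows(my_dict)
print(my_rows)  # [['a', 1], ['b', 2]]
```

Returning `None` when nothing matches mirrors what `my_dict.get("rows")` does, so the calling code can test for it in the same way.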
I used jsondiff.diff() to compare the flattened items (as a json) from two jsons of the mapping of an Elasticsearch index. One is what I loaded, the original mapping I created, and the other is the mapping queried from Elasticsearch, so it could technically be different if the data inserted was different from expected.
The mappings are identical, so to test finding the diff I added 2 different fields to the original mapping.
Using jsondiff.diff() works beautifully, and I find my differences.
diff(dfpulledlist, dfloadedlist)
diff(dfloadedlist, dfpulledlist)
That produces:
{insert: [(114, 'mappings.properties.saw_something.properties.monster_fish'), (121, 'mappings.properties.aliens')]}
It looks like a dictionary. type() tells me it is a <class 'dict'>, but trying to get the items under insert gives me errors. I therefore printed the items and found something strange. Now I'm trying to figure out how to get to the items of insert so I can count them, and I don't know how to work my way through this.
print(diff_pulled_to_loaded.items())
Gives me:
dict_items([(insert, [(114, 'mappings.properties.saw_something.properties.monster_fish'), (121, 'mappings.properties.aliens')])])
Looking at this page, it looks like it is a dictionary built from a sequence in which each item is a pair. I can't access it as a dictionary though, whether with dict['insert'] or dict.insert. Each attempt with dict['insert'] gives me KeyError: 'insert'.
How do I get to the values of insert, which is an array or list of tuples, so I can query that information? What am I missing/misunderstanding here? I want to get to this:
[(114, 'mappings.properties.saw_something.properties.monster_fish'),
(121, 'mappings.properties.aliens')]
It's a dictionary, but keys like insert and replace are not strings, they're instances of the jsondiff.Symbol class. That's why they don't have quotes around them -- this class has a custom representation that just returns the symbol name.
So to access it I think you have to use
import jsondiff
from jsondiff import diff
d = diff(dfpulledlist, dfloadedlist)
print(d[jsondiff.insert])  # the key is a symbol object, not the string 'insert'
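To see why the string lookup fails, here is a self-contained sketch that does not use jsondiff at all. The `Symbol` class below is a hypothetical stand-in written for this example, not the library's real class; it only imitates the one property that matters, a repr that prints without quotes.

```python
class Symbol:
    """Hypothetical stand-in for a library's symbol type (not jsondiff's real class)."""

    def __init__(self, label):
        self.label = label

    def __repr__(self):
        # Printing without quotes is what makes the key look like a bare name.
        return self.label


insert = Symbol("insert")
d = {insert: [(114, "mappings.properties.aliens")]}

print(d)              # {insert: [(114, 'mappings.properties.aliens')]}
print("insert" in d)  # False: the string 'insert' is a different key
print(d[insert])      # [(114, 'mappings.properties.aliens')]

# If you only know the printed name, you can match on it explicitly.
by_name = {repr(key): value for key, value in d.items()}
print(len(by_name["insert"]))  # 1
```

Indexing with the symbol object itself is the direct route; rebuilding the dictionary with `repr(key)` as the key is a fallback for when you only have the printed name.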
I am new to Python and I'm trying to do a small project in pure Python, without any additional libraries. My data set looks like this:
LOC,DATE,ATTRIBUTE,COUNT
A,03/01/19,alpha,6483
A,03/01/19,beta,19
B,03/01/19,gamma,346158
B,02/01/19,gamma,156891
A,02/01/19,delta,1319
A,02/01/19,gamma,15272
A,02/01/19,gamma,56810
I have to transform this data set to this output:
B,02/01/19,gamma, 346158
A,02/01/19,alpha,6483
A,02/01/19,beta,19
B,02/01/19,gamma, 172163
A,02/01/19,delta,1319
B,01/01/19,gamma,56810
The data needs to be sorted by Date, Value, Measure, Loc
I thought that nested dictionaries should work, because I only have to update the value of the attribute. LOC can become the outer key:
dict = {'A': {}, 'B': {}}
Then date can be used as the key for the nested dictionary:
dict = {'A': {'03/01/19': {}, '02/01/19': {}}, 'B': {'03/01/19': {}, '02/01/19': {}}}
I keep nesting like this until I reach Count, and then update the count each time. But the code gets more complex at every level. My questions:
Is there an alternative data structure that I could use?
If I stay with a dictionary, is there a way to check the nested keys and add only the new values for each key?
Any help would be greatly appreciated!
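One alternative to nesting, offered only as a sketch, is a single flat dictionary keyed by the tuple (LOC, DATE, ATTRIBUTE). It uses just the standard library. Two things here are assumptions rather than facts from the question: that the dates are day/month/two-digit-year, and that "Date, Value, Measure, Loc" means sorting by date, then count, then attribute, then location. It sums the sample input and does not attempt to reproduce the expected output listed above.

```python
from datetime import datetime

raw = """LOC,DATE,ATTRIBUTE,COUNT
A,03/01/19,alpha,6483
A,03/01/19,beta,19
B,03/01/19,gamma,346158
B,02/01/19,gamma,156891
A,02/01/19,delta,1319
A,02/01/19,gamma,15272
A,02/01/19,gamma,56810"""

totals = {}
for line in raw.splitlines()[1:]:  # skip the header row
    loc, date, attribute, count = line.split(",")
    key = (loc, date, attribute)
    # .get() with a default of 0 handles new and existing keys alike.
    totals[key] = totals.get(key, 0) + int(count)


def sort_key(item):
    (loc, date, attribute), count = item
    # Assumption: dates are day/month/two-digit-year.
    return (datetime.strptime(date, "%d/%m/%y"), count, attribute, loc)


for (loc, date, attribute), count in sorted(totals.items(), key=sort_key):
    print(f"{loc},{date},{attribute},{count}")
```

With a tuple key there is nothing to check level by level: a combination that has not been seen yet starts from 0, and a repeated one is added to its running total.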
This is a question and answer I wanted to share, since I found it very useful.
Suppose I have a dictionary accessible with different keys. And at each position of the dictionary I have a list of a fixed length:
a={}
a["hello"]=[2,3,4]
a["bye"]=[0,10,100]
a["goodbye"]=[2,5,50]
I was interested to compute the sum across all entries in a using only position 1 of their respective lists.
In the example, I wanted to sum:
finalsum=sum([3,10,5]) #-----> 18
Just skip the keys entirely, since they don't really matter.
sum(i[1] for i in a.values())  # on Python 2, a.itervalues() avoids building a list
Also as a side note, you don't need to do a.keys() when iterating over a dict, you can just say for key in a and it will use the keys.
You can use a.values() to get all the values in a dict (a view in Python 3, a list in Python 2). As far as I can tell, the keys are irrelevant. In Python 2, a.itervalues() iterates rather than constructing a new list; in Python 3, a.values() already behaves that way. Passing a generator expression over the values to sum means no extraneous lists are created.
I used a list comprehension for my one-line solution (here separated into two lines):
elements=[a[pos][1] for pos in a.keys()] #----> [3,10,5] (insertion order on Python 3.7+)
finalsum=sum(elements)
I'm happy with this solution :) , but, any other suggestions?
Ok, this one should be simple. I have 3 dictionaries. They are all made, ordered, and filled to my satisfaction but I would like to put them all in an overarching dictionary so I can reference and manipulate them more easily and efficiently.
Layer0 = {}
Layer1 = {}
Layer2 = {}
That is how they look when created. Afterwards I feebly tried different things based on SO questions:
Layers = {Layer0, Layer1, Layer2}
which raised a syntax error
Layers = {'Layer0', 'Layer1', 'Layer2'}
which raised another syntax error
(Layers is the Dictionary I'm trying to create that will have all the previously made dictionaries within it)
All the other examples I found on SO have been related to creating dictionaries within dictionaries in order to fill them (or filling them simultaneously) and since I already coded a large number of lines to make these dictionaries, I'd rather put them into a dictionary after the fact instead of re-writing code.
It would be best if the order of the dictionaries is preserved when they are put into Layers.
Does anyone know if this is possible and how I should do it?
Dictionary items have both a key and a value.
Layers = {'Layer0': Layer0, 'Layer1': Layer1, 'Layer2': Layer2}
Keep in mind that a dictionary is a hash table: each key is hashed to find where its value is stored, so a lookup does not depend on position. Before Python 3.7 a dict made no promise about the order of its keys. From Python 3.7 onward a dict preserves insertion order as a language guarantee, and on older versions collections.OrderedDict gives you the same behavior.
So when you say "It would be best if the order of the dictionaries are preserved when put into Layers", whether that holds depends on your Python version. On older versions, if you rename your dictionaries to "A, B, C", you may see Layers.keys() print in an order such as "A, C, B", regardless of the order you used when building the dictionary. That order is a by-product of the keys' hash values and tells you nothing about the structure of your dictionary.
Note that you can iterate over a dictionary directly (for key in Layers yields the keys); on those older versions you just cannot rely on the order in which the keys come out.
As a side note, this hash function is what allows a dictionary to do crazy fast lookups. A good hash function gives you average-case constant time [O(1)] lookup, meaning you can check if a given item is in your dictionary in roughly the same amount of time whether the dictionary contains ten items or ten million. Pretty cool.
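Putting the pieces together, here is a short sketch of wrapping already-built dictionaries in an outer one. The contents of the three layers are placeholders invented for the example, and the ordering comment applies to Python 3.7 and later, where dicts preserve insertion order.

```python
# Placeholder contents; in the question these are already filled elsewhere.
Layer0 = {"name": "base"}
Layer1 = {"name": "middle"}
Layer2 = {"name": "top"}

Layers = {"Layer0": Layer0, "Layer1": Layer1, "Layer2": Layer2}

# Iterating a dict directly yields its keys (insertion order on Python 3.7+).
for layer_name in Layers:
    print(layer_name, Layers[layer_name])

# The outer dict stores references, so edits to Layer1 show up through Layers.
Layer1["visible"] = True
print(Layers["Layer1"])  # {'name': 'middle', 'visible': True}

# Membership tests use the hash table rather than scanning the keys.
print("Layer2" in Layers)  # True
```

Nothing is copied when the outer dictionary is built, so the existing code that fills Layer0, Layer1 and Layer2 does not need to be rewritten.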