I successfully imported this JSON file from the web. It looks like:
[{"h_mag":"19.7","i_deg":"9.65","moid_au":"0.035"},{"h_mag":"20.5","i_deg":"14.52","moid_au":"0.028"},
etc ...
I want to extract the values of the key moid_au and later compare them with the values of the key h_mag.
This works: print(data[1]['moid_au']). But when I ask for all the elements of the list, it fails; I tried print(data[:]['moid_au']).
I tried iterators and a lambda function, but nothing has worked yet, mostly because I'm new to data manipulation. It works when I have one dictionary, but not with a list of dictionaries.
Thanks in advance for any other tips. Some of the links I found were confusing.
Sounds like you are using lambda wrong because you need map as well:
c = [{"h_mag":"19.7","i_deg":"9.65","moid_au":"0.035"},{"h_mag":"20.5","i_deg":"14.52","moid_au":"0.028"}]
list(map(lambda rec: rec.get('moid_au'), c))
['0.035', '0.028']
The lambda receives one record at a time, and map applies it to every record in your list.
print(data[:]['moid_au']) is equivalent to print(data['moid_au']), because data[:] is just a copy of the list. That cannot work: data is a list, so it can only be indexed by position, not by a key such as 'moid_au'.
Try working with a loop:
for item in data:
    print(item['moid_au'])
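To see why the slice attempt fails, it may help to run it on a small sample. This is a minimal sketch, assuming Python 3 and only the two sample records shown in the question; the names copy_of_data, error_message and moid_values are made up for illustration.

```python
data = [
    {"h_mag": "19.7", "i_deg": "9.65", "moid_au": "0.035"},
    {"h_mag": "20.5", "i_deg": "14.52", "moid_au": "0.028"},
]

# data[:] builds a new list holding the same dictionaries; it is still a list.
copy_of_data = data[:]

try:
    copy_of_data["moid_au"]
except TypeError as exc:
    # Lists accept only integer or slice indices, so a string key is rejected.
    error_message = str(exc)
    print(error_message)

# Indexing each dictionary in turn is what actually works.
moid_values = [item["moid_au"] for item in data]
print(moid_values)
```

The key lookup has to happen on each dictionary, which is what the loop does.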
Using your approach of indexing into the whole list to get every instance of a key, this might work for you:
a = [data[i]['moid_au'] for i in range(len(data))]
print(a)
How exactly do you want to compare them?
Would it be useful to get the values like this?
list_of_dicts = [{"h_mag":"19.7","i_deg":"9.65","moid_au":"0.035"}, {"h_mag":"20.5","i_deg":"14.52","moid_au":"0.028"}]
moid_au_values = [d["moid_au"] for d in list_of_dicts]
h_mag_values = [d["h_mag"] for d in list_of_dicts]
# my_list must be a single dictionary here, e.g. one element of the list
for key, value in my_list.items():
    print(key)
    print(value)
for value in my_list.values():
    print(value)
for key in my_list.keys():
    print(key)
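Since the question mentions comparing moid_au with h_mag, note that the values in the sample are strings, so they probably need converting to numbers first. The following is only a sketch under assumptions: Python 3, the two sample records, and arbitrary thresholds (0.03 and 20.0) chosen purely for illustration, since the question does not say what comparison is wanted.

```python
data = [
    {"h_mag": "19.7", "i_deg": "9.65", "moid_au": "0.035"},
    {"h_mag": "20.5", "i_deg": "14.52", "moid_au": "0.028"},
]

# Convert both fields to float so that comparisons are numeric, not textual.
pairs = [(float(d["moid_au"]), float(d["h_mag"])) for d in data]

# Hypothetical filter: keep records below one threshold and above another.
selected = [(moid, h_mag) for moid, h_mag in pairs if moid < 0.03 and h_mag > 20.0]

print(pairs)
print(selected)
```

Keeping each pair together in a tuple avoids having to line up two separate lists by index later.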
Thank you for your help. I'm still very new to Python. I trust I'm not abusing the goodwill of SO with such questions. I am trying to evolve from a SQL database mentality to a Python lists/dictionary approach.
Here is a snippet of a list with nested tuples (always containing three elements):
List = [(u'32021', u'161', 1696.2), (u'32021', u'162', 452.2), (u'32044', u'148', 599.2), (u'32044', u'149', 212.2)]
Can this be converted to a dictionary with nested dictionaries, something like this:
{'32021': {'161': 1696.2, '162': 452.2}, '32044': {'148': 599.2, '149': 212.2}}
I addressed a similar problem that only had two items in each tuple using:
from collections import defaultdict
d = defaultdict(list)
for k, v in values:
    d[k].append(v)
For three items, would indexing inside a for loop be one solution?
Thank you.
You probably want this:
d = {}
for k1, k2, v in List:
    d[(k1, k2)] = v
or even this:
d = {(k1,k2):v for k1,k2,v in List}
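With tuple keys the dictionary stays flat, so a lookup needs both parts of the key, and collecting everything under one outer key means scanning all entries. A small sketch, assuming Python 3 and the four sample tuples from the question (records, flat and sub are illustrative names):

```python
records = [
    ('32021', '161', 1696.2), ('32021', '162', 452.2),
    ('32044', '148', 599.2), ('32044', '149', 212.2),
]

# Flat dictionary keyed by (outer, inner) tuples.
flat = {(k1, k2): v for k1, k2, v in records}

# A direct lookup requires the full tuple.
print(flat[('32021', '162')])

# Gathering all entries for one outer key requires a scan over every key.
sub = {k2: v for (k1, k2), v in flat.items() if k1 == '32044'}
print(sub)
```

If per-outer-key access is the common case, a nested structure may be the better fit.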
You can handle this case with a nested defaultdict:
from collections import defaultdict
d = defaultdict(lambda: defaultdict(list))
for k1, k2, v in values:
    d[k1][k2].append(v)
print(d['32044']['148'])
[599.2]
etc.
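The nested defaultdict(list) version stores each value inside a list. If the goal is the exact shape asked for in the question, with plain numbers as the inner values, a variant along these lines might do. It is a sketch that assumes Python 3 and that each (outer, inner) pair occurs only once; a repeated pair would silently overwrite the earlier value.

```python
from collections import defaultdict

records = [
    ('32021', '161', 1696.2), ('32021', '162', 452.2),
    ('32044', '148', 599.2), ('32044', '149', 212.2),
]

# The outer defaultdict creates an empty inner dict the first time a key is seen.
nested = defaultdict(dict)
for k1, k2, v in records:
    nested[k1][k2] = v

# Convert to a plain dict so that missing keys raise KeyError again.
result = dict(nested)
print(result)
print(result['32044']['148'])
```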
Also, see the bunch pattern, which is an easy-to-use similar idea that even lets you assign attributes inline without having to declare them first:
https://pypi.python.org/pypi/bunch/1.0.1
I was just wondering if there is a simple way to do this. I have a particular structure that is parsed from a file and the output is a list of a dict of a list of a dict. Currently, I just have a bit of code that looks something like this:
for i in xrange(len(data)):
    for j, k in data[i].iteritems():
        for l in xrange(len(data[i]['data'])):
            for m, n in data[i]['data'][l].iteritems():
                dostuff()
I just wanted to know if there was a function that would traverse a structure and internally figure out whether each entry was a list or a dict and if it is a dict, traverse into that dict and so on. I've only been using Python for about a month or so, so I am by no means an expert or even an intermediate user of the language. Thanks in advance for the answers.
EDIT: Even if it's possible to simplify my code at all, it would help.
You never need to iterate through xrange(len(data)). You iterate either through data (for a list) or data.items() (or values()) (for a dict).
Your code should look like this:
for elem in data:
    for item in elem['data']:
        for key, value in item.iteritems():
            dostuff()
which is quite a bit shorter.
Well, if you're looking to descend an arbitrary structure of array/hash thingies, you can write a recursive function that branches on the type of each element, using isinstance():
def traverse_it(it):
    if isinstance(it, list):
        for item in it:
            traverse_it(item)
    elif isinstance(it, dict):
        for key in it.keys():
            traverse_it(it[key])
    else:
        do_something_with_real_value(it)
Note that the average object-oriented guru will tell you not to do this. Instead, they would have you create a class tree where one class is based on a list and another on a dict, and give each a method with the same name (i.e., a virtual function) that processes its own contents. In other words, if/else trees based on types are "bad"; functions that can be called on an object to deal with its contents in its own way are "good".
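The type-based traversal can also be written as a generator, which separates walking the structure from whatever is done with each value. The sketch below assumes Python 3 (it uses yield from); iter_leaves and the sample data are made-up names and values, not part of the question.

```python
def iter_leaves(node, path=()):
    """Yield (path, value) for every non-container value in a nested structure."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from iter_leaves(child, path + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from iter_leaves(child, path + (index,))
    else:
        # Anything that is neither a dict nor a list is treated as a leaf.
        yield path, node


# Hypothetical data shaped like "a list of a dict of a list of a dict".
sample = [{"name": "a", "data": [{"x": 1, "y": 2}]}]

leaves = list(iter_leaves(sample))
for path, value in leaves:
    print(path, value)
```

The path tuple records the keys and indexes that lead to each value, which is handy when the processing step needs to know where a value came from.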
I think this is what you're trying to do. There is no need to use xrange() to pull out the index from the list since for iterates over each value of the list. In my example below d1 is therefore a reference to the current data[i].
for d1 in data:  # iterate over outer list, d1 is a dictionary
    for x in d1:  # iterate over keys in d1 (the x var is unused)
        for d2 in d1['data']:  # iterate over the list
            # iterate over (key, value) pairs in the innermost dict
            for k, v in d2.iteritems():
                dostuff()
You're also using the name l twice (intentionally or not), but beware of how the scoping works.
This question is quite old, but out of curiosity I tried a recursive approach that I think gives a cleaner answer.
Suppose the dictionary looks like: dict1 = { 'a':5,'b': [1,2,{'a':100,'b':100}], 'dict 2' : {'a':3,'b':5}}
Solution:
dict1 = { 'a':5,'b': [1,2,{'a':100,'b':100}], 'dict 2' : {'a':3,'b':5}}
def recurse(obj):
    # Descend into dictionaries and lists; print everything else.
    if isinstance(obj, dict):
        for key in obj:
            recurse(obj[key])
    elif isinstance(obj, list):
        for element in obj:
            recurse(element)
    else:
        print(obj)
recurse(dict1)
I have the two following lists:
# List of tuples representing the index of resources and their unique properties
# Format of (ID,Name,Prefix)
resource_types=[('0','Group','0'),('1','User','1'),('2','Filter','2'),('3','Agent','3'),('4','Asset','4'),('5','Rule','5'),('6','KBase','6'),('7','Case','7'),('8','Note','8'),('9','Report','9'),('10','ArchivedReport',':'),('11','Scheduled Task',';'),('12','Profile','<'),('13','User Shared Accessible Group','='),('14','User Accessible Group','>'),('15','Database Table Schema','?'),('16','Unassigned Resources Group','#'),('17','File','A'),('18','Snapshot','B'),('19','Data Monitor','C'),('20','Viewer Configuration','D'),('21','Instrument','E'),('22','Dashboard','F'),('23','Destination','G'),('24','Active List','H'),('25','Virtual Root','I'),('26','Vulnerability','J'),('27','Search Group','K'),('28','Pattern','L'),('29','Zone','M'),('30','Asset Range','N'),('31','Asset Category','O'),('32','Partition','P'),('33','Active Channel','Q'),('34','Stage','R'),('35','Customer','S'),('36','Field','T'),('37','Field Set','U'),('38','Scanned Report','V'),('39','Location','W'),('40','Network','X'),('41','Focused Report','Y'),('42','Escalation Level','Z'),('43','Query','['),('44','Report Template ','\\'),('45','Session List',']'),('46','Trend','^'),('47','Package','_'),('48','RESERVED','`'),('49','PROJECT_TEMPLATE','a'),('50','Attachments','b'),('51','Query Viewer','c'),('52','Use Case','d'),('53','Integration Configuration','e'),('54','Integration Command','f'),('55','Integration Target','g'),('56','Actor','h'),('57','Category Model','i'),('58','Permission','j')]
# This is a list of resource ID's that we do not want to reference directly, ever.
unwanted_resource_types=[0,1,3,10,11,12,13,14,15,16,18,20,21,23,25,27,28,32,35,38,41,47,48,49,50,57,58]
I'm attempting to compare the two in order to build a third list containing the 'Name' of each unique resource type whose ID currently exists in unwanted_resource_types. For example, the final result list should be:
result = ['Group','User','Agent','ArchivedReport','Scheduled Task','...','...']
I've tried the following that (I thought) should work:
result = []
for res in resource_types:
    if res[0] in unwanted_resource_types:
        result.append(res[1])
and when that failed to populate result I also tried:
result = []
for res in resource_types:
    for type in unwanted_resource_types:
        if res[0] == type:
            result.append(res[1])
also to no avail. Is there something I'm missing? I believe this would be the right place for a list comprehension, but that's still something I don't fully understand (the Python docs are a bit too succinct for me in this case).
I'm also open to completely rethinking this problem, but I do need to retain the list of tuples as it's used elsewhere in the script. Thank you for any assistance you may provide.
Your resource types are using strings, and your unwanted resources are using ints, so you'll need to do some conversion to make it work.
Try this:
result = []
for res in resource_types:
    if int(res[0]) in unwanted_resource_types:
        result.append(res[1])
or using a list comprehension:
result = [item[1] for item in resource_types if int(item[0]) in unwanted_resource_types]
The numbers in resource_types are numbers contained within strings, whereas the numbers in unwanted_resource_types are plain numbers, so your comparison is failing. This should work:
result = []
for res in resource_types:
    if int(res[0]) in unwanted_resource_types:
        result.append(res[1])
The problem is that your triples contain strings while your unwanted resources contain numbers. Either change the data to
resource_types=[(0,'Group','0'), ...
or use int() to convert the strings to ints before comparison, and it should work. Your result can be computed with a list comprehension as in
result=[rt[1] for rt in resource_types if int(rt[0]) in unwanted_resource_types]
If you change ('0', ...) into (0, ... you can leave out the int() call.
Additionally, you may change the unwanted_resource_types variable into a set, like
unwanted_resource_types=set([0,1,3, ... ])
to improve speed (if speed is an issue, else it's unimportant).
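Putting the int() conversion and the set together might look like the following. This is a sketch, assuming Python 3; it uses a shortened subset of the data so that it runs on its own, and unwanted_ids is an illustrative name.

```python
# Abbreviated subset of the (ID, Name, Prefix) tuples from the question.
resource_types = [
    ('0', 'Group', '0'), ('1', 'User', '1'), ('2', 'Filter', '2'),
    ('3', 'Agent', '3'), ('10', 'ArchivedReport', ':'),
]

# A set gives constant-time membership tests on average.
unwanted_ids = {0, 1, 3, 10}

# The string '0' never equals the integer 0, which is why the original test failed.
string_matches = '0' in unwanted_ids
print(string_matches)

# Convert the ID to int before testing membership.
result = [res[1] for res in resource_types if int(res[0]) in unwanted_ids]
print(result)
```

The result follows the order of resource_types, not the order of the unwanted IDs.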
The one-liner:
result = map(lambda x: dict(map(lambda a: (int(a[0]), a[1]), resource_types))[x], unwanted_resource_types)
without any explicit loop does the job.
Ok - you don't want to use this in production code - but it's fun. ;-)
Comment:
The inner dict(map(lambda a: (int(a[0]), a[1]), resource_types)) creates a dictionary from the input data:
{0: 'Group', 1: 'User', 2: 'Filter', 3: 'Agent', ...
The outer map chooses the names from the dictionary.
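For comparison, the same dictionary-lookup idea can be spelled out in two steps. This is a sketch, assuming Python 3 and a shortened subset of the data; name_by_id is an illustrative name. Note that in Python 3 map returns a lazy iterator, so the one-liner would need wrapping in list() to produce a list.

```python
# Abbreviated subset of the (ID, Name, Prefix) tuples from the question.
resource_types = [
    ('0', 'Group', '0'), ('1', 'User', '1'), ('2', 'Filter', '2'),
    ('3', 'Agent', '3'), ('10', 'ArchivedReport', ':'),
]
unwanted_resource_types = [0, 1, 3, 10]

# Build the ID -> name mapping once instead of once per lookup.
name_by_id = {int(res[0]): res[1] for res in resource_types}

# Look up each unwanted ID; an ID missing from the mapping would raise KeyError.
result = [name_by_id[res_id] for res_id in unwanted_resource_types]
print(result)
```

Here the result follows the order of unwanted_resource_types, and the mapping is built only once, whereas the one-liner rebuilds the dictionary for every element.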