Creating a nested dictionary where inner values are lists - python

So I want to create a nested dictionary of lists where the inner value is a list that allows duplicates:
d = {'45678':{'ant':['N4', 'N4', 'P3', 'P3']}}
This is what I have so far but can't figure out how to append a list to the inner value:
d = {}
with open(file_path, 'r') as f:
    for l in f.readlines()[4:]:
        peaks = l.split()
        if '1' in peaks[5]:
            d.setdefault(peaks[0], {})['ant'] = [peaks[7]]
Which returns:
{'20065037': {'ant': ['N4']}}
My question is how can I append a list as the inner value in the nested dictionary?

I think I may have figured out what you're trying to do.
Does this help?
d.setdefault(peaks[0], {}).setdefault('ant', []).append(peaks[7])
If not, please explain what the file looks like or something else about what you're trying to do.

Assuming your code is basically sound (in terms of processing the file), you should use setdefault on the inner dictionary and append to the list.
d = {}
with open(file_path, 'r') as f:
    for l in f.readlines()[4:]:
        peaks = l.split()
        if '1' in peaks[5]:
            d.setdefault(peaks[0], {}).setdefault('ant', []).append(peaks[7])
Currently, each assignment with [peaks[7]] creates a new one-element list and replaces the previous one, so nothing is ever appended.
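To see the difference without the input file, here is a minimal sketch that uses made-up pairs in place of peaks[0] and peaks[7] (the sample values are assumptions for illustration, not data from the question):

```python
# Hypothetical (key, value) pairs standing in for peaks[0] and peaks[7].
rows = [("45678", "N4"), ("45678", "N4"), ("45678", "P3"), ("45678", "P3")]

overwritten = {}
appended = {}
for key, value in rows:
    # Plain assignment replaces the inner list on every pass.
    overwritten.setdefault(key, {})["ant"] = [value]
    # setdefault returns the existing list (or creates it), so append accumulates.
    appended.setdefault(key, {}).setdefault("ant", []).append(value)

print(overwritten)  # {'45678': {'ant': ['P3']}}
print(appended)     # {'45678': {'ant': ['N4', 'N4', 'P3', 'P3']}}
```

A collections.defaultdict would also work here; the chained setdefault has the advantage of keeping d a plain dict.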

Related

How do I print specific items from a python list?

I am trying to print multiple items from a list in Python. I have looked at many examples but I am unable to get them to work. What is a good way to print items from the imported list based on their list index?
The commented-out example in my code is what I thought would work, and I can't find a simple solution.
items = []
with open('input.txt') as input_file:
    for line in input_file:
        items.append(line)
print(items[5],items[6])
#print(items[5,6,7]
One way to do it would be with a for loop.
item_list = ['a','b','c','d','e','f','g','i'] # an example list.
index_to_print = [5,6,7] # add all the indexes you want to print to this list.
for i in item_list:
    if item_list.index(i) in index_to_print:
        print(i)
Update: instead of looping through the whole item_list, we can loop through the indexes you want to print.
for i in index_to_print:
    if i < len(item_list):
        print(item_list[i])
    else:
        print('given index is out of range.')
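Another option, sketched here with the same example list, is to pick all the positions in one expression, either with a list comprehension or with operator.itemgetter from the standard library. Unlike the loop above, neither checks the range, so an index past the end raises IndexError.

```python
from operator import itemgetter

item_list = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'i']  # same example list as above
index_to_print = [5, 6, 7]

# A list comprehension picks the items in the order the indexes are given.
selected = [item_list[i] for i in index_to_print]
print(*selected)  # f g i

# itemgetter with several indexes returns a tuple of the matching items.
print(itemgetter(*index_to_print)(item_list))  # ('f', 'g', 'i')
```

Note that items[5,6,7] from the question cannot work on a plain list, because list indices must be integers or slices, not tuples.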

Iterating over a list of dictionaries

JSON
Newbie here. I want to iterate over this list of dictionaries to get the "SUPPLIER" for every dictionary. I tried
import json

turbine_json_path = '_maps/turbine/turbine_payload.json'
with open(turbine_json_path, "r") as f:
    turbine = json.load(f)
# print((turbine))
for supplier in turbine[0]['GENERAL']:
    print(supplier["SUPPLIER"])
But I get a type error: TypeError: string indices must be integers
Any help is appreciated.
for d in turbine:
    print(d["GENERAL"]["SUPPLIER"])
There is only one supplier key in your dictionary, so it would be
supplier = turbine[0]['GENERAL']['SUPPLIER']
Otherwise, your for loop iterates over the keys of the 'GENERAL' dictionary, which are strings, and indexing a string with "SUPPLIER" raises the TypeError.
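The JSON itself is not shown, so the following is only a sketch with a hypothetical payload (the "MODEL" key and the supplier names are assumptions). It shows what each kind of loop actually iterates over:

```python
import json

# Hypothetical payload: the real file is not shown, so this shape is an assumption.
payload = (
    '[{"GENERAL": {"SUPPLIER": "Acme", "MODEL": "T1"}},'
    ' {"GENERAL": {"SUPPLIER": "Bolt", "MODEL": "T2"}}]'
)
turbine = json.loads(payload)

# Iterating over a dict yields its keys, which are plain strings.
keys = [key for key in turbine[0]["GENERAL"]]
print(keys)  # ['SUPPLIER', 'MODEL']

# Iterating over the outer list yields one dict per entry.
suppliers = [d["GENERAL"]["SUPPLIER"] for d in turbine]
print(suppliers)  # ['Acme', 'Bolt']
```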

Multiple dictionary list of values assignment with a single for loop for multiple keys

I want to create a dictionary with a list of values for multiple keys with a single for loop in Python 3. Execution time and memory footprint are of utmost importance to me, since the file my script reads is rather long.
I have already tried the following simple script:
p_avg = []
p_y = []
m_avg = []
m_y = []
res_dict = {}
with open('/home/user/test', 'r') as f:
    for line in f:
        p_avg.append(float(line.split(" ")[5].split(":")[1]))
        p_y.append(float(line.split(" ")[6].split(":")[1]))
        m_avg.append(float(line.split(" ")[1].split(":")[1]))
        m_y.append(float(line.split(" ")[2].split(":")[1]))
res_dict['p_avg'] = p_avg
res_dict['p_y'] = p_y
res_dict['m_avg'] = m_avg
res_dict['m_y'] = m_y
print(res_dict)
The format of my /home/user/test file is:
n:1 m_avg:7588.39 m_y:11289.73 m_u:147.92 m_v:223.53 p_avg:9.33 p_y:7.60 p_u:26.43 p_v:24.64
n:2 m_avg:7587.60 m_y:11288.54 m_u:147.92 m_v:223.53 p_avg:9.33 p_y:7.60 p_u:26.43 p_v:24.64
n:3 m_avg:7598.56 m_y:11304.50 m_u:148.01 m_v:225.33 p_avg:9.32 p_y:7.60 p_u:26.43 p_v:24.60
.
.
.
The Python script shown above works, but first, it is too long and repetitive, and second, I am not sure how efficient it is. I was thinking of building the same result with list comprehensions, something like this:
(res_dict['p_avg'], res_dict['p_y']) = [(float(line.split(" ")[5].split(":")[1]), float(line.split(" ")[6].split(":")[1])) for line in f]
But for all four dictionary keys. Do you think that using a list comprehension could reduce the memory footprint of the script and improve its speed of execution? What would be the right syntax for the list comprehension?
[EDIT] I have changed the dict -> res_dict as it was mentioned that it is not a good practice, I have also fixed a typo, where the p_y wasn't pointing to the right value and added a print statement to print the resulting dictionary as mentioned by the other users.
You can make use of defaultdict. There is no need to split the line each time, and to make it more readable you can use a lambda to extract the fields for each item.
from collections import defaultdict

res = defaultdict(list)
# Take the value part of a "name:value" field and convert it to float.
extract = lambda x: float(x.split(':')[1])
with open('/home/user/test', 'r') as f:
    for line in f:
        items = line.split()  # split each line only once
        res['p_avg'].append(extract(items[5]))
        res['p_y'].append(extract(items[6]))
        res['m_avg'].append(extract(items[1]))
        res['m_y'].append(extract(items[2]))
You can initialize your dict to contain the string/list pairs, and then append directly as you iterate through every line. Also, you don't want to keep calling split() on line on each iteration. Rather, just call once and save to a local variable and index from this variable.
# Initialize dict to contain string key and list value pairs
dictionary = {'p_avg':[],
              'p_y':[],
              'm_avg':[],
              'm_y':[]
              }
with open('/home/user/test', 'r') as f:
    for line in f:
        items = line.split() # store line.split() so you don't split multiple times per line
        dictionary['p_avg'].append(float(items[5].split(':')[1]))
        dictionary['p_y'].append(float(items[6].split(':')[1])) # I think you meant index 6 here
        dictionary['m_avg'].append(float(items[1].split(':')[1]))
        dictionary['m_y'].append(float(items[2].split(':')[1]))
You can just pre-define the dict keys:
d = {
    'p_avg': [],
    'p_y': [],
    'm_avg': [],
    'm_y': []
}
and then append directly to them:
with open('/home/user/test', 'r') as f:
    for line in f:
        splitted_line = line.split(" ")
        d['p_avg'].append(float(splitted_line[5].split(":")[1]))
        d['p_y'].append(float(splitted_line[6].split(":")[1]))
        d['m_avg'].append(float(splitted_line[1].split(":")[1]))
        d['m_y'].append(float(splitted_line[2].split(":")[1]))
P.S. Never use variable names that shadow built-ins such as dict or list. The built-in becomes unreachable under that name, which leads to confusing errors later in the script.
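To cut the repetition further, one possible sketch is to parse each line's "name:value" fields into a dict and look the wanted values up by name instead of by position. Here io.StringIO stands in for the real file, and the names wanted and fields are introduced only for this illustration:

```python
import io

# Two sample lines in the format shown in the question; io.StringIO stands in for the file.
sample = io.StringIO(
    "n:1 m_avg:7588.39 m_y:11289.73 m_u:147.92 m_v:223.53 p_avg:9.33 p_y:7.60 p_u:26.43 p_v:24.64\n"
    "n:2 m_avg:7587.60 m_y:11288.54 m_u:147.92 m_v:223.53 p_avg:9.33 p_y:7.60 p_u:26.43 p_v:24.64\n"
)

wanted = ("p_avg", "p_y", "m_avg", "m_y")
res_dict = {name: [] for name in wanted}

for line in sample:
    # Turn "name:value" fields into a dict so values are looked up by name, not position.
    fields = dict(item.split(":", 1) for item in line.split())
    for name in wanted:
        res_dict[name].append(float(fields[name]))

print(res_dict)
```

A list comprehension would still have to build the same four lists, so it is unlikely to change the memory footprint much; the main saving in all of these versions comes from splitting each line only once.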

How to return all lines in csv.DictReader

I'm reading a CSV file using csv.DictReader. It returns only the last line as a dict but I want to return all of the lines.
I'm filtering each row with a dictionary comprehension to keep only two key:value pairs, using the fields dict, and then doing a little cleanup. I need to return each line as a dict after the cleaning process. Finally, it should return a dict.
for row in reader:
    data={value:row[key] for key, value in fields.items()}
    if data['binomialAuthority']=='NULL':
        data['binomialAuthority']=None
    data['label']=re.sub(r'\(.*?\)','',data['label']).strip()
return data
Output:
data= {{'label': 'Argiope', 'binomialAuthority': None}
{'label': 'Tick', 'binomialAuthority': None}}
Each iteration through the loop, you assign to data a single value. Think of data like a small markerboard that only has the last thing you wrote on it. At the end of the loop it will refer to the last item assigned.
If you just want to print your structure, move the print statement into the loop.
If you want a data structure containing multiple dicts, then you need to create a list and then append to it in the loop. Note that this will use a lot of memory when loading a large file.
e.g.
my_list = []
for row in reader:
    data = '...'
    my_list.append(data)
return my_list
The best way is to append each dict to a list and then use a for loop to unpack the list, so that you get each item back as a dict.
my_list = []
for row in reader:
    data = '...'
    my_list.append(data)
for i in my_list:
    print (i)
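Putting this together with the cleanup code from the question, a minimal sketch could look like the following. The sample CSV and the fields mapping are hypothetical, since the real ones are not shown, and clean_rows is a name made up for this example. It is written as a generator, so the caller can either collect everything with list() or process one row at a time when the file is large.

```python
import csv
import io
import re

# Hypothetical input and field mapping; the real file and `fields` dict are not shown.
sample = io.StringIO(
    "rdf-schema#label,binomialAuthority_label\n"
    "Argiope (spider),NULL\n"
    "Tick,NULL\n"
)
fields = {"rdf-schema#label": "label", "binomialAuthority_label": "binomialAuthority"}

def clean_rows(reader):
    """Yield one cleaned dict per CSV row instead of keeping only the last one."""
    for row in reader:
        data = {value: row[key] for key, value in fields.items()}
        if data["binomialAuthority"] == "NULL":
            data["binomialAuthority"] = None
        data["label"] = re.sub(r"\(.*?\)", "", data["label"]).strip()
        yield data

rows = list(clean_rows(csv.DictReader(sample)))
print(rows)
```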

python csv TypeError: unhashable type: 'list'

Hi, I'm trying to compare two CSV files and get the difference. However, I get the above-mentioned error. Could someone kindly give a helping hand? Thanks.
import csv
f = open('ted.csv','r')
psv_f = csv.reader(f)
attendees1 = []
for row in psv_f:
    attendees1.append(row)
f.close()
f = open('ted2.csv','r')
psv_f = csv.reader(f)
attendees2 = []
for row in psv_f:
    attendees2.append(row)
f.close()
attendees11 = set(attendees1)
attendees12 = set(attendees2)
print (attendees12.difference(attendees11))
When you iterate over a csv reader you get lists, so when you do
for row in psv_f:
    attendees2.append(row)
row is actually a list instance, so attendees1 and attendees2 are lists of lists.
When you convert one of them with set(), it needs to make sure no item appears more than once, and set() relies on the hash of each item to do that. You get the error because set() tries to hash a list, but lists are not hashable.
You will get the same exception if you do something like this:
set([1, 2, [1,2] ])
More on sets: https://docs.python.org/2/library/sets.html
The error happened on the line
attendees11 = set(attendees1)
didn't it? You are trying to make a set from a list of lists but it is impossible because set may only contain hashable types, which list is not. You can convert the lists to tuples.
attendees1.append(tuple(row))
This happens because you created a list of lists:
attendees1.append(row)
Likewise:
attendees2.append(row)
Then when you do:
attendees11 = set(attendees1)
the error is thrown.
What you should do instead, for both lists, is:
attendees1.append(tuple(row))
attendees2.append(tuple(row))
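As a complete sketch of the tuple approach, the following builds a set of tuples per file and takes the difference. The file contents are made up, io.StringIO stands in for ted.csv and ted2.csv, and read_rows is a hypothetical helper name:

```python
import csv
import io

def read_rows(handle):
    """Return the rows of a CSV file as a set of tuples (tuples are hashable)."""
    return {tuple(row) for row in csv.reader(handle)}

# Hypothetical file contents; io.StringIO stands in for ted.csv and ted2.csv.
ted = io.StringIO("alice,london\nbob,paris\n")
ted2 = io.StringIO("alice,london\ncarol,rome\n")

attendees1 = read_rows(ted)
attendees2 = read_rows(ted2)

# Rows present in the second file but not in the first.
print(attendees2.difference(attendees1))  # {('carol', 'rome')}
```

With real files, opening each one in a with block (for example with open('ted.csv', newline='') as f) also takes care of closing it.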
