Python: Redundancy when iterating through nested dictionary - python

I have a nested dictionary which I am trying to loop through in order to write to an excel file.
This is the code which initiates and creates the nested dictionary
from collections import defaultdict

def tree(): return defaultdict(tree)
KMstruct = tree()
for system in sheet.columns[0]:
    if system.value not in KMstruct:
        KMstruct[system.value]
        for row in range(1,sheet.get_highest_row()+1):
            if sheet['A'+str(row)].value == system.value and sheet['B'+str(row)].value not in KMstruct:
                KMstruct[system.value][sheet['B'+str(row)].value]
                if sheet['B'+str(row)].value == sheet['B'+str(row)].value and sheet['C'+str(row)].value not in KMstruct:
                    KMstruct[system.value][sheet['B'+str(row)].value][sheet['C'+str(row)].value]
                    if sheet['C'+str(row)].value == sheet['C'+str(row)].value and sheet['D'+str(row)].value not in KMstruct:
                        KMstruct[system.value][sheet['B'+str(row)].value][sheet['C'+str(row)].value][sheet['D'+str(row)].value]
                        KMstruct[system.value][sheet['B'+str(row)].value][sheet['C'+str(row)].value][sheet['D'+str(row)].value] = [sheet['E'+str(row)].value]
This is the code where I loop through it:
for key in KMstruct.keys():
    r += 1
    worksheet.write(r, col, key)
    for subkey in KMstruct[key]:
        if currsubkeyval != subkey:
            r += 1
            worksheet.write(r, col, key)
        r +=1
        worksheet.write(r, col, key + '\\' + subkey)
        for item in KMstruct[key][subkey]:
            if curritemval != item:
                r +=1
                worksheet.write(r, col, key + '\\' + subkey)
            for subitem in KMstruct[key][subkey][item]:
                r += 1
                worksheet.write(r, col, key + '\\' + subkey + '\\' + item)
                worksheet.write(r, col + 1, subitem)
                curritemval = item
                for finalitem in KMstruct[key][subkey][item][subitem]:
                    r += 1
                    worksheet.write(r, col, key + '\\' + subkey + '\\' + item + '\\' + subitem)
                    worksheet.write(r, col + 1, KMstruct[key][subkey][item][subitem])
Bear with me on this code since I am a noob; I am aware that it is not so beautiful. Anyhow, my problem is the last loop. I am trying to take the string values in KMstruct[key][subkey][item][subitem], but the loop variable finalitem goes through each character of the string value (note: the key subitem contains a list of strings). This means that if I only have one value to write, it gets written as many times as there are characters in the string.
E.g.: value: apple will be written on a new excel row 5 times
What am I doing wrong here?
Edit: The redundancy issue has been solved, but now I need to understand whether I am doing something wrong when assigning my list of strings to the subitem key.

The problem is that in Python a str is also an iterable, for example:
>>> for s in 'hello':
... print(s)
h
e
l
l
o
So you either need to avoid iterating when the value is a str, or wrap the str in another iterable (e.g. a list) so that it can be handled the same way. This is made slightly difficult by how you are structuring your code. For example, the following:
for key in KMstruct.keys():
    for subkey in KMstruct[key]:
...can be written as:
for key, value in KMstruct.items():
    for subkey, subvalue in value.items():
...giving you the value on each loop, making it possible to apply a test.
for key, val in KMstruct.items():
    r += 1
    worksheet.write(r, col, key)
    for subkey, subval in val.items():
        if currsubkeyval != subkey:
            r += 1
            worksheet.write(r, col, key)
        r +=1
        worksheet.write(r, col, key + '\\' + subkey)
        for item, itemval in subval.items():
            if curritemval != item:
                r +=1
                worksheet.write(r, col, key + '\\' + subkey)
            for subitem, subitemval in itemval.items():
                r += 1
                worksheet.write(r, col, key + '\\' + subkey + '\\' + item)
                worksheet.write(r, col + 1, subitem)
                curritemval = item
                # I am assuming at this point subitemval is either a list or a str?
                if isinstance(subitemval, str):
                    # If it is, wrap it as a list, so the next block works as expected
                    subitemval = [subitemval]
                for finalitem in subitemval:
                    r += 1
                    worksheet.write(r, col, key + '\\' + subkey + '\\' + item + '\\' + subitem)
                    worksheet.write(r, col + 1, finalitem) # This should be finalitem, correct?
Here we test the type of subitemval and, if it is a str, wrap it in a list [str] so the following block iterates as expected. There is also an apparent bug where you are not outputting finalitem on the last line. It's not possible to test this without your example data, but it should be functionally equivalent.
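The wrapping step can also be pulled into a small helper. The following is only a minimal sketch and not part of the code above: the name as_list is hypothetical, and it assumes each leaf value is either a single str or a list/tuple of str.

```python
def as_list(value):
    """Wrap a single string in a list; copy any other iterable into a list."""
    # A str is iterable character by character, so treat it as one item.
    if isinstance(value, str):
        return [value]
    return list(value)


for finalitem in as_list("apple"):
    print(finalitem)   # a single string yields one item, not one per character

for finalitem in as_list(["apple", "pear"]):
    print(finalitem)   # a list yields one item per element
```

With such a helper, the final loop could iterate over as_list(subitemval) without the separate type test.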

Your immediate problem is that the last line of your example should be:
worksheet.write(r,col+1, finalitem)
BTW, your code would be a lot easier to read if you created occasional temporary variables like:
subitemlist = KMstruct[key][subkey][item]
for subitem in subitemlist:

Related

Is there any way to loop through multiple lists and rename the second occurrence?

I am trying to loop through multiple lists and check whether any list occurs more than once. If so, I want to rename the lists.
I have tried two nested for loops, but this code only works for a single flat list, not for a list of lists.
new_word = [['abc'],['out'],['pqr'],['abc']]
for i in range(len(new_word)-1):
    word_counter = 1
    for j in range(i+1, len(new_word)):
        if new_word[i] == new_word[j]:
            word_counter = word_counter + 1
            new_word[j] = new_word[j] + "_" + str(word_counter)
    if word_counter > 1:
        new_word[i] = new_word[i] + "_1"
Expected :
[['abc_1'],['out'],['pqr'],['abc_2']]
Actual:
[['abc'],['out'],['pqr'],['abc']]
This is a bit more verbose than the accepted answer, but it scans the items of the original list just once. The difference does not matter for small lists (<10000 items) with no critical response time (i.e., when you don't have to feed this result to an HTTP response several times each second).
from copy import deepcopy

def suffix_repeats(lst):
    result = deepcopy(lst)
    mapping = {}
    for i, item in enumerate(lst):
        mapping.setdefault(item[0], []).append(i)
    for key, positions in mapping.items():
        if len(positions) > 1:
            for j, pos in enumerate(positions, 1):
                result[pos][0] += f"_{j}"
    return result

new_word = [['abc'],['out'],['pqr'],['abc']]
print(suffix_repeats(new_word))
new_word = [["abc"], ["out"], ["pqr"], ["abc"]]
for i in range(len(new_word) - 1):
    word_counter = 1
    for j in range(i + 1, len(new_word)):
        if new_word[i] == new_word[j]:
            word_counter = word_counter + 1
            new_word[j] = [new_word[j][0] + "_" + str(word_counter)]
    if word_counter > 1:
        new_word[i] = [new_word[i][0] + "_1"]
Try this:
new_word = [['abc'],['out'],['pqr'],['abc']]
res=[[el[0]] if new_word.count(el)<2 else [el[0]+"_"+str(new_word[:k].count(el)+1)] for k, el in enumerate(new_word)]
print(res)
And output:
[['abc_1'], ['out'], ['pqr'], ['abc_2']]

python adding tuple to list based on string representation

So I have this string and this dictionary of tuples:
data = '604452601061112210'
NewDict = {'60': ('PUSH1', 1), '61': ('PUSH2', 2), '52' : ('MSTORE', 0 ), '12' : ('ADD', 0)}
I'm trying to scan the string for the dictionary keys and add each matching key's name to the list, along with the bytes that follow it in the string. For example, it should print out in this order:
[PUSH1 44, MSTORE, PUSH1 10, PUSH2 1122, ADD]
however, my output is:
['PUSH1 44', 'MSTORE ', 'PUSH1 10', 'PUSH2 2210']
this is my code:
i = 0
L = []
while i < len(data):
    op = data[i:i+2]
    for item in NewDict:
        if op in item:
            i += NewDict[item][1] * 2
            pep = NewDict[item][0] + ' ' + data[i:i+NewDict[item][1] * 2]
            L.append(pep)
    else:
        i += 2
print(L)
any help would be appreciated, thanks
The problem is that you're incrementing i by NewDict[item][1] * 2 before you extract the parameter after the operator. You should do this after extracting.
You also don't need the for item in NewDict: loop, just access the element of NewDict directly.
while i < len(data):
    op = data[i:i+2]
    if op in NewDict:
        opname, oplen = NewDict[op]
        param = data[i+2:i+2+oplen*2]
        L.append(opname + " " + param)
        i += 2 + oplen*2
    else:
        i += 2
Even after fixing this you don't get ADD in the result because the last op in data is 10, not 12.
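For reference, the same idea as a self-contained function. This is a sketch under a few assumptions: the name disassemble is made up for illustration, unknown opcodes are simply skipped, and the trailing space after argument-less opcodes is stripped (the loop above keeps it). The second input string is a hypothetical variant of data that ends in 12 instead of 10.

```python
def disassemble(data, opcodes):
    """Decode a hex string into a list of 'NAME ARG' strings.

    opcodes maps a two-character opcode to (name, number_of_argument_bytes).
    """
    result = []
    i = 0
    while i < len(data):
        op = data[i:i + 2]
        i += 2                            # step past the opcode itself
        if op not in opcodes:
            continue                      # unknown byte: skip it
        name, arg_bytes = opcodes[op]
        arg = data[i:i + arg_bytes * 2]   # read the argument before advancing
        i += arg_bytes * 2
        result.append((name + ' ' + arg).strip())
    return result


NewDict = {'60': ('PUSH1', 1), '61': ('PUSH2', 2), '52': ('MSTORE', 0), '12': ('ADD', 0)}
print(disassemble('604452601061112210', NewDict))  # original data, ends in 10
print(disassemble('604452601061112212', NewDict))  # hypothetical data ending in 12
```

Only the second call produces ADD as the last entry, which matches the observation about the final byte of data.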
i = 0
L = []
while i < len(data):
    op = data[i:i+2]
    for item in NewDict:
        if op in item:
            i += NewDict[item][1] * 2
            pep = NewDict[item][0] + ' ' + data[i:i+NewDict[item][1] * 2]
            L.append(pep)
    else:
        i += 2
print(','.join(L))
','.join(L) prints the items without the quotes and brackets around them. I hope this is what you meant. Strictly speaking, you are not adding tuples to the list here, only strings built from them; a tuple is, simply put, an immutable list.

Python: Calculating difference of values in a nested list by using a While Loop

I have a list that is composed of nested lists, each nested list contains two values - a float value (file creation date), and a string (a name of the file).
For example:
n_List = [[201609070736L, 'GOPR5478.MP4'], [201609070753L, 'GP015478.MP4'],[201609070811L, 'GP025478.MP4']]
The nested list is already sorted in order of ascending values (creation dates). I am trying to use a While loop to calculate the difference between each sequential float value.
For Example: 201609070753 - 201609070736 = 17
The goal is to use the time difference values as the basis for grouping the files.
The problem I am having is that when the count reaches the last value for len(n_List) it throws an IndexError because count+1 is out of range.
IndexError: list index out of range
I can't figure out how to work around this error. No matter what I try, the count is always out of range when it reaches the last value in the list.
Here is the While loop I've been using.
count = 0
while count <= len(n_List):
    full_path = source_folder + "/" + n_List[count][1]
    time_dif = n_List[count+1][0] - n_List[count][0]
    if time_dif < 100:
        f_List.write(full_path + "\n")
        count = count + 1
    else:
        f_List.write(full_path + "\n")
        f_List.close()
        f_List = open(source_folder + 'GoPro' + '_' + str(count) + '.txt', 'w')
        f_List.write(full_path + "\n")
        count = count + 1
PS. The only workaround I can think of is to assume that the last value will always be appended to the final group of files. So, when the count reaches len(n_List) - 1, I skip the time_dif calculation and just automatically add that final value to the last group. While this will probably work most of the time, I can see edge cases where the final value in the list may need to go in a separate group.
I think using zip could be easier to get difference.
res1,res2 = [],[]
for i,j in zip(n_List,n_List[1:]):
    target = res1 if j[0]-i[0] < 100 else res2
    target.append(i[1])
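Building on the zip idea, here is a sketch that produces any number of groups and never drops the last file. It is not the code above: the function name group_by_gap and the fourth sample entry are invented for illustration, the Python 2 L suffix is omitted so that it also runs on Python 3, and the threshold of 100 is taken from the question.

```python
def group_by_gap(items, max_gap=100):
    """Split a sorted list of [timestamp, name] pairs into groups of names.

    A new group starts whenever the gap to the previous timestamp
    is max_gap or more.
    """
    if not items:
        return []
    groups = [[items[0][1]]]              # the first file opens the first group
    for prev, curr in zip(items, items[1:]):
        if curr[0] - prev[0] < max_gap:
            groups[-1].append(curr[1])    # close enough: same group
        else:
            groups.append([curr[1]])      # gap too large: start a new group
    return groups


n_List = [[201609070736, 'GOPR5478.MP4'],
          [201609070753, 'GP015478.MP4'],
          [201609070811, 'GP025478.MP4'],
          [201609071500, 'GOPR5479.MP4']]  # hypothetical later recording
print(group_by_gap(n_List))
```

Because each pair decides where the second element goes, the final file is placed in the last group or in a group of its own without any special case after the loop.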
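n_List[len(n_List)] will always raise an IndexError, because the valid indices run from 0 to len(n_List) - 1.
while count < len(n_List) - 1:
is the condition you need, because you are starting count at 0, not 1, and the loop body also reads n_List[count+1]. The last file then has to be written after the loop.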
FYI, here is the solution I used, thanks to @galaxyman for the help. I handled the issue of the last value in the nested list by simply adding that value after the loop completes. I don't know if that's the most elegant way to do it, but it works.
(Note: I'm only posting the function related to the zip method suggested in the previous posts.)
def list_zip(get_gp_list):
    ffmpeg_list = open(output_path + '\\' + gp_List[0][1][0:8] + '.txt', 'a')
    for a,b in zip(gp_List,gp_List[1:]):
        full_path = gopro_folder + '\\' + a[1]
        time_dif = b[0]-a[0]
        if time_dif < 100:
            ffmpeg_list.write("file " + full_path + "\n")
        else:
            ffmpeg_list.write("file " + full_path + "\n")
            ffmpeg_list.close()
            ffmpeg_list = open(output_path + '\\' + b[1][0:8] + '.txt', 'a')
    last_val = gp_List[-1][1]
    ffmpeg_list.write("file " + gopro_folder + '\\' + last_val + "\n")
    ffmpeg_list.close()

How to check and change a value in a Text file? (Python 2.7.11)

Text File explained:
Joe,Bloggs,J.bloggs#anemailaddress.com,01269512355, 1 ,0, 0, 0, 0
Fname, Lname, Email, number, Value i want checking ^ , ...,...,...,...
Objective: check that the number in field 4 (the fifth comma-separated value) of each line is not 0 or 7.
If it is 0 then it changes to 1, and if it is 7 then it changes to 6.
So if it's 0 then + 1, if it's 7 then -1.
Text file:
Joe,Bloggs,J.bloggs#anemailaddress.com,01269512355, 1,0, 0, 0, 0
Sarah,Brown,S.brown#anemailaddress.com,01866522555, 5,0, 0, 0, 0
Andrew,Smith,A.smith#anemailaddress.com,01899512785, 7,0, 0, 0, 0
Kevin,White,K.white#anemailaddress.com,01579122345, 0,0, 0, 0, 0
Samantha,Collins,S.collins#anemailaddress.com,04269916257, 0,0, 0, 0, 0
After the code has run, it should look like this:
Joe,Bloggs,J.bloggs#anemailaddress.com,01269512355, 1,0, 0, 0, 0
Sarah,Brown,S.brown#anemailaddress.com,01866522555, 5,0, 0, 0, 0
Andrew,Smith,A.smith#anemailaddress.com,01899512785, 6,0, 0, 0, 0
Kevin,White,K.white#anemailaddress.com,01579122345, 1,0, 0, 0, 0
Samantha,Collins,S.collins#anemailaddress.com,04269916257, 1,0, 0, 0, 0
The code i have so far produces an error:
fileinfo[j] = i[0] + ',' + i[1] + ',' + i[2] + ',' + i[3] + ',' + str(value) + ',' + i[5] + ',' + i[6] + ',' + i[7]+ ',' + i[8] + '\n'
IndexError: list assignment index out of range
Code:
f = open ("players.txt","r")
fileinfo = f.readlines()
f.close()
j = 0
for i in fileinfo:
    i = i.strip()
    i = i.split(",")
    value = int(i[4])
    if value == "0":
        value = value + 1
    fileinfo[j] = i[0] + ',' + i[1] + ',' + i[2] + ',' + i[3] + ',' + str(value) + ',' + i[5] + ',' + i[6] + ',' + i[7]+ ',' + i[8] + '\n'
    j = j + 1
    if value == "7":
        value = value - 1
    fileinfo[j] = i[0] + ',' + i[1] + ',' + i[2] + ',' + i[3] + ',' + str(value) + ',' + i[5] + ',' + i[6] + ',' + i[7]+ ',' + i[8] + '\n'
    j = j + 1
f = open ("players.txt","w")
for i in fileinfo:
    f.write(i)
f.close()
This is probably a very complex way of doing what i want to do. Please can you help me with my objective. Feel free to rewrite my entire code, but can you explain in detail what you have done. I am quite new to coding.
For future readers
There are two answers that work. I can only tick one of them. Hope this question helps future readers as it has done me.
The problems with your code are as follows:
Stripping the whitespace around the number before converting it is good practice, though it is not the cause of the crash:
int() ignores leading and trailing whitespace, so value = int(i[4]) works even for ' 1'. Use i[4].strip() when you compare the field as a string instead.
You're converting the value to an integer, then comparing that integer to a string. This will always evaluate to False.
value = int(i[4])
if value == "0":
You're incrementing j twice per loop, which is why your code crashes with an IndexError. I suggest using enumerate instead of manually maintaining j.
The fixed code could look like this:
for j, i in enumerate(fileinfo):
    i = i.strip()
    i = i.split(",")
    value = i[4].strip()
    if value == "0":
        i[4] = "1"
    elif value == "7":
        i[4] = "6"
    fileinfo[j] = ','.join(i) + '\n'
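The same logic can be tried out without touching players.txt. The sketch below assumes Python 3, uses a hypothetical function name (adjust_lines), and normalises the spacing around every field, which the loop above does not do; lines with fewer than five fields are passed through unchanged.

```python
def adjust_lines(lines):
    """Return new lines where field 4 is changed from 0 to 1 or from 7 to 6."""
    fixed = []
    for line in lines:
        # Split on commas and drop the whitespace around every field.
        fields = [field.strip() for field in line.strip().split(",")]
        if len(fields) > 4:
            if fields[4] == "0":
                fields[4] = "1"
            elif fields[4] == "7":
                fields[4] = "6"
        fixed.append(",".join(fields) + "\n")
    return fixed


sample = [
    "Andrew,Smith,A.smith#anemailaddress.com,01899512785, 7,0, 0, 0, 0\n",
    "Kevin,White,K.white#anemailaddress.com,01579122345, 0,0, 0, 0, 0\n",
]
for line in adjust_lines(sample):
    print(line, end="")
```

Keeping the transformation separate from the file handling makes it easy to check the result on a couple of sample lines before overwriting the real file.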
Here is a simplified version of your code (Totally untested).
with open ("players.txt","r") as f:
    parts = [[y.strip() for y in x.split(",")] for x in f if x.strip()]
with open("players.txt", "w") as f:
    for part in parts:
        #So if it's 0 then + 1, if it's 7 then -1
        if part[4] == "0": part[4] = "1"
        elif part[4] == "7": part[4] = "6"
        print>>f, ",".join(part)
You increment j twice each time through the loop, when it is supposed to track the position of each line, and thus it gets too big and causes the error.
Here is a corrected version of your code (explanation below). I tried to make as little changes as possible.
f = open("players.txt","r")
fileinfo = f.readlines()
f.close()
# enumerate yields (index, line) pairs; tuples are immutable, so unpack them.
for index, line in enumerate(fileinfo):
    fields = line.strip().split(",")
    #Make sure we don't go out of range, if so, skip to the next line.
    if len(fields) < 5:
        fileinfo[index] = line.strip()
        continue
    value = int(fields[4].strip())
    if value == 0:
        value += 1
    elif value == 7:
        value -= 1
    fields[4] = str(value)
    fileinfo[index] = ",".join(fields)
f = open("players.txt","w")
f.write("\n".join(fileinfo))
f.close()
value is an int, but you were comparing it against strings.
j was not needed and I replaced it with the enumerate function. This will create pairs of (index, value) for you to use.
You were checking value twice and setting i twice. I corrected this by cleaning up the if blocks and utilizing elif.
The .join method saves you a lot of work. It creates a string from a list with a delimiter. ' '.join(['Hello', 'world!']) becomes Hello world!.
If there are any lines which are too short, it would error (a blank line for example). I added a check which will skip that line in such a scenario by utilizing continue.
You should strip the whitespace around the element before attempting to convert to int.
When adding to a value, it's often easier to utilize += and -=.
This is what Scott Hunter meant:
j = 0
for i in fileinfo:
    # Do something on fileinfo[j]
    j = j + 1
    # Do something on fileinfo[j]
    j = j + 1
Let's say you have three lines in fileinfo: ['aaa', 'bbb', 'ccc']
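On the first loop iteration you have i = 'aaa' and j = 0. fileinfo[0] points to the first line; after the first increment, fileinfo[1] already points to the second line while i is still 'aaa'.
On the second iteration i = 'bbb' and j = 2. fileinfo[2] points to the third line.
After the next increment j = 3, and fileinfo[3] (a fourth line) does not exist, so the IndexError is raised before the third iteration even starts.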
Only increment the value of j once per loop iteration and i/j will stay in sync.
Scott Hunter is right about you erroneously incrementing j twice in the loop. You would probably also want to write the whole thing in a more readable way, such as:
filename = 'players.txt'
with open(filename, 'r') as f:
    lines = f.readlines()
output = []
for line in lines:
    split_line = line.split(',')
    # A list comprehension stays indexable on both Python 2 and 3.
    split_line = [field.strip() for field in split_line]
    num = split_line[4]
    if num == '0':
        split_line[4] = '1'
    if num == '7':
        split_line[4] = '6'
    output.append(','.join(split_line))
with open(filename, 'w') as f:
    f.write('\n'.join(output))
This assumes you don't care about the whitespace between the commas.

Write multiple columns on .xls through Python 2.7

out_file.write('Position'+'\t'+'Hydrophobic'+'\n')
for i in position:
    out_file.write(str(i)+'\n')
for j in value:
    out_file.write('\t'+str(j)+'\n')
so it says
Position Hydrophobic
0 a
1 b
2 c
#... and so on
When it writes to the Excel file, it puts the j from value at the bottom of the column, below the i from the position column.
How do i put them side by side with '\t' and '\n'?
for i, j in zip(position, value):
    out_file.write(str(i) + '\t' + str(j) + '\n')
or better:
    out_file.write('%s\t%s\n' % (str(i), str(j)))
or better:
text = ['%s\t%s' % (str(i), str(j)) for i, j in zip(position, value)]
text = '\n'.join(text)
out_file.write(text)
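An alternative is to let the standard csv module do the joining. This is a sketch that assumes Python 3; the function name write_columns is made up, and an io.StringIO buffer stands in for the real out_file so the result can be inspected.

```python
import csv
import io


def write_columns(out_file, position, value):
    """Write a header and one tab-separated row per (position, value) pair."""
    writer = csv.writer(out_file, delimiter='\t', lineterminator='\n')
    writer.writerow(['Position', 'Hydrophobic'])
    # zip pairs the two sequences so each row holds both columns.
    writer.writerows(zip(position, value))


buffer = io.StringIO()   # stands in for the real output file
write_columns(buffer, [0, 1, 2], ['a', 'b', 'c'])
print(buffer.getvalue())
```

Note that zip stops at the shorter of the two sequences, so any extra entries in the longer one are silently left out.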
