Afternoon all,
I'm encountering some strange behavior from the starred for loop below. The function iterates over the input dictionary patient_features, concatenating several strings to produce an SVMLight-style vector. This vector is then meant to be written to the deliverables file. However, for some reason the writes at the end of the function are being called on every iteration of the starred for loop, resulting in a massive file size (and some other, more minor issues). Any help on what might be causing this would be greatly appreciated.
def save_svmlight(patient_features, mortality, op_file, op_deliverable):
    deliverable1 = open(op_file, 'wb') # feature without patient id
    deliverable2 = open(op_deliverable, 'wb') # features with patient id
    d1_line = ''
    d2_line = ''
    count = 0 # VALUE TO TEST IF INCREMENTING
    print count
    for patient_id in patient_features: #**********
        value_tuple_list = patient_features[patient_id]
        value_tuple_list.sort()
        d2_line += str(int(patient_id)) + ' '
        if patient_id in mortality:
            d1_line += str(1) + ' '
            d2_line += str(1) + ' '
        else:
            d1_line += str(0) + ' '
            d2_line += str(0) + ' '
        for value_tuple in value_tuple_list:
            d1_line += str(int(value_tuple[0])) + ":" + str("{:1.6f}".format(value_tuple[1])) + ' '
            d2_line += str(int(value_tuple[0])) + ":" + str("{:1.6f}".format(value_tuple[1])) + ' '
    count += 1
    print count # VALUE INCREMENTS WHEN IT SHOULD NOT
    deliverable1.write(d1_line); # <- BEING WRITTEN TO EACH LOOP :(
    deliverable2.write(d2_line); # <- BEING WRITTEN TO EACH LOOP :(
The issue is that your code uses both spaces and tabs for indentation. Python is not a fan if you mix the two :/
For future reference, if you are using Sublime Text, select all the text, and in the toolbar on top go to View > Indentation > Convert Tabs to Spaces and the issue will be resolved.
Thought I would save you having to manually search for each tab and replace it :)
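If you are not using Sublime Text, a few lines of Python can locate the offending lines. This is only a rough sketch: find_tab_indented_lines is a hypothetical helper name, not something from the question, and it simply reports the 1-based numbers of lines whose leading whitespace contains a tab.

```python
def find_tab_indented_lines(source):
    """Return 1-based numbers of lines whose indentation contains a tab."""
    flagged = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Leading whitespace is whatever lstrip() would remove.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            flagged.append(number)
    return flagged


# Hypothetical sample: line 2 is indented with spaces, line 3 with a tab.
sample = "def f():\n    a = 1\n\tb = 2\n"
tab_lines = find_tab_indented_lines(sample)
print(tab_lines)
```

Python 3 rejects ambiguous mixes of tabs and spaces with a TabError, and Python 2 can be run with the -tt option to turn inconsistent tab usage into an error.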
I am trying to have my program format properly in a tkinter GUI but for some reason I get the error message:
product += ('%-10s%-10s%-0s%-0s') % (str(names[i])+ str(addedHours[i]) + str(payOut) + "\n")
TypeError: not enough arguments for format string
Desired Output:
Here is the snip of code related to the problem (Note that the Name/Hours/Pay and --- parts are working fine, just not the ones under the variable product)
def printPayroll(self):
    i = 0
    product = ""
    for y in names:
        payOut = float(wage[i]) * float(addedHours[i])
        product += ('%-10s%-10s%-0s%-0s') % (str(names[i])+ str(addedHours[i]) + str(payOut) + "\n")
        i += 1
    self.text.insert(END,("%-10s%-10s%-0s") % ('Name', 'Hours', 'Pay\n'))
    self.text.insert(END,("%-10s%-10s%-0s") % ('---','-----','---\n'))
    self.text.insert(END, product)
The error message you get, TypeError: not enough arguments for format string, is telling you exactly what the problem is.
Consider this line of code:
product += ('%-10s%-10s%-0s%-0s') % (str(names[i])+ str(addedHours[i]) + str(payOut) + "\n")
The above code is functionally identical to this:
s = str(names[i]) + str(addedHours[i]) + str(payOut) + "\n"
product += ('%-10s%-10s%-0s%-0s') % s
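Your format string requires four arguments, but you are only giving it one, because the + operators concatenate everything into a single string before % is applied. The simple solution is to replace every + with a , so that a four-element tuple is passed instead: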
product += ('%-10s%-10s%-0s%-0s') % (str(names[i]), str(addedHours[i]), str(payOut), "\n")
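The difference between the two forms can be seen in isolation. The snippet below is a minimal sketch with made-up values; format_row is a hypothetical helper, and the format string is simplified to three placeholders.

```python
def format_row(name, hours, pay):
    # The tuple supplies one value per %s placeholder.
    return "%-10s%-10s%s\n" % (name, hours, pay)


row = format_row("Ann", 5, 50.0)

# Concatenating first collapses everything into a single string,
# so the same format string runs out of arguments.
try:
    "%-10s%-10s%s\n" % ("Ann" + "5" + "50.0")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(repr(row))
print(error_message)
```

Note that the % operator pads each value to the requested width on its own, so the explicit str() calls from the question are not needed.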
I'm trying to make my program return the exact same string but with ** between each character. Here's my code.
def separate(st):
    total = " "
    n = len(st + st[-1])
    for i in range(n):
        total = str(total) + str(i) + str("**")
    return total
x = separate("12abc3")
print(x)
This should return:
1**2**a**b**c**3**
However, I'm getting 0**1**2**3**4**5**6**.
You can join the characters in the string together with "**" as the separator (this works because a string is an iterable sequence of characters, so join can walk over it one character at a time). To get the additional "**" at the end, just concatenate.
Here's an example:
def separate(st):
    return "**".join(st) + "**"
Sample:
x = separate("12abc3")
print(x) # "1**2**a**b**c**3**"
A note on your posted code:
The reason you get the output you do is that you loop using for i in range(n):, so the iteration variable i is a number, not a character. When you call str(total) + str(i) + str("**"), you convert i to a string, and i is just each index from 0 to n-1.
To fix that you could iterate over the characters in st directly, like this:
for c in st:
or use the index i to get the character at each position in st, like this:
for i in range(len(st)):
    total = total + st[i] + "**"
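Putting the first suggestion into the original function gives something like the sketch below. It assumes the result should not begin with a space, so total starts as an empty string rather than " " as in the question.

```python
def separate(st):
    total = ""  # start empty so the result has no leading space
    for c in st:
        # c is the character itself, not its index
        total = total + c + "**"
    return total


x = separate("12abc3")
print(x)
```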
Welcome to Stack Overflow!
I will explain part of your code line by line.
for i in range(n): since you provide only one argument (the stopping point), i takes the values 0, 1, 2, ..., n-1.
total = str(total) + str(i) + str("**"): this appends i (the current index, not the character at that index) and ** to the current total string. That is why the numbers are added to the result in sequence.
What you should do instead is total = str(total) + st[i] + str("**"), so that each character of st is added one by one.
In addition, you should initialize n as n = len(st). len(st + st[-1]) is one larger than the length of the string, so st[i] would go out of range on the last iteration.
I need to collect all the results into one list and print it, which I have not been able to do. Can anyone help me with this? I am comparing the data in one system with the data in another system and identifying the mismatches in each field.
For example, a table has the columns kid_id, name, age, gender, address in 2 different systems. I need to ensure all kids' data is correctly moved from the 1data system to the 2data system.
Emp_id like 1,2,3,4,5,6
# Python names cannot start with a digit, so the two data sets are called data1 and data2 here
data2 = self.get2Data(kid_id)
data1 = self.get1Data(kid_id)
for i in range(len(data1)):
    for key, value in data1[i].items():
        if data1[i][key] == data2[i][key]:
            result = str("LKG") + ","+ str(kid_id) +","+ str("PASS") + "," + str(key)
        else:
            result = str("LKG") + "," + str(kid_id) + "," + str("FAIL") + "," + str(key)
        MatchResult = result.split()
        print MatchResult
print "***It is Done*****"
Currently my output looks like this:
['LKG,100,PASS,address']
['LKG,102,FAIL,dob']
['LKG,105,FAIL,gender']
but I need it in this form:
(['LKG,100,PASS,address'],['LKG,102,FAIL,dob'],['LKG,105,FAIL,gender'])
or
[('LKG,100,PASS,address'),('LKG,102,FAIL,dob'),('LKG,105,FAIL,gender')]
Code details: the code above compares the data from the two systems and reports the pass and fail cases in the format shown. In the result above, address is reported as PASS while dob and gender are reported as FAIL, which means there is still a data mismatch in the dob and gender fields for the kids with IDs 102 and 105.
Move the list variable declaration before the loop, initialize it to an empty list, and then append each result to it inside the loop.
# Python names cannot start with a digit, so the two data sets are called data1 and data2 here
data2 = self.get2Data(kid_id)
data1 = self.get1Data(kid_id)
MatchResult=[]
for i in range(len(data1)):
    for key, value in data1[i].items():
        if data1[i][key] == data2[i][key]:
            result = str("LKG") + ","+ str(kid_id) +","+ str("PASS") + "," + str(key)
        else:
            result = str("LKG") + "," + str(kid_id) + "," + str("FAIL") + "," + str(key)
        MatchResult.append( result.split())
print MatchResult
print "***It is Done*****"
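The same accumulate-then-print pattern can be tried without the two systems. The sketch below is written for Python 3 and makes several assumptions: compare_records is a hypothetical function, the records are plain lists of dicts standing in for whatever get1Data and get2Data return, and each result is kept as a plain string rather than a one-element list.

```python
def compare_records(kid_id, data1, data2, grade="LKG"):
    """Compare matching records field by field and collect every result."""
    results = []  # created once, before the loops
    for rec1, rec2 in zip(data1, data2):
        for key in rec1:
            status = "PASS" if rec1[key] == rec2.get(key) else "FAIL"
            results.append(",".join([grade, str(kid_id), status, key]))
    return results


# Made-up sample data for illustration only.
system1 = [{"name": "Asha", "dob": "2012-01-05"}]
system2 = [{"name": "Asha", "dob": "2012-05-01"}]
match_result = compare_records(102, system1, system2)
print(match_result)
```

Because results is created once and returned after both loops finish, the caller receives every comparison in a single list.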
I am reading an external txt file and displaying all the lines that have Y in field 6. I then need to count these lines. However, when I add the sum function, only one of the lines is printed; if I remove the sum function, all the lines display as expected. I presume it has something to do with the for loop, but I can't figure out how to get all the lines to display and keep my sum. Can anyone help me spot where this is going wrong?
noLines = 0
fileOpen = open ("file.txt","r")
print ("Name: " + "\tDate: " + "\tAge: " + "\tColour: " + "\tPet")
for line in fileOpen:
    line = line[:-1]
    field = line.split(',')
    if field[6] == "Y":
        print()
        print (field[0] +"\t\t" + field[1] + "\t" + field[2] + "\t\t" + field[3] + "\t\t" + field[4])
        noLines = sum(1 for line in fileOpen)
print ()
print(noLines)
You are using sum incorrectly. To get the desired count, you can replace your current sum line with the following, placed outside the for loop:
noLines = sum(1 for l in open("file.txt","r") if l[:-1].split(',')[6]=='Y')
Issue with current code: fileOpen is an iterator that can only be read through once. The generator expression inside sum consumes the rest of the file, so your outer for loop has nothing left to iterate over and stops after the first matching line. Instead of using sum, you may initialize noLines before the for loop as:
noLines = 0
for line in fileOpen:
    # your stuff ...
And instead of sum, do:
        noLines += 1
        # instead of: noLines = sum(1 for line in fileOpen)
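Put together, the counter version might look like the sketch below. It is an illustration only: show_and_count is a hypothetical function, the sample rows are invented, and io.StringIO stands in for the real file.txt so the snippet runs on its own.

```python
import io


def show_and_count(lines, flag_index=6, flag="Y"):
    """Print each row whose flag field matches and return how many matched."""
    count = 0
    for line in lines:
        fields = line.rstrip("\n").split(",")
        # Skip short rows instead of raising IndexError.
        if len(fields) > flag_index and fields[flag_index] == flag:
            print("\t".join(fields[:5]))
            count += 1  # counted in the same pass that prints the row
    return count


# Invented sample rows; field 6 is the Y/N flag.
sample_text = (
    "Ann,2016-01-02,30,red,cat,x,Y\n"
    "Bob,2015-03-04,25,blue,dog,x,N\n"
    "Cy,2014-05-06,41,green,fish,x,Y\n"
)
no_lines = show_and_count(io.StringIO(sample_text))
print(no_lines)
```

The file is read exactly once, so printing and counting cannot interfere with each other.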
I have a list composed of nested lists. Each nested list contains two values: a number (the file creation date) and a string (the name of the file).
For example:
n_List = [[201609070736L, 'GOPR5478.MP4'], [201609070753L, 'GP015478.MP4'],[201609070811L, 'GP025478.MP4']]
The nested list is already sorted in ascending order of creation date. I am trying to use a while loop to calculate the difference between each pair of consecutive date values.
For Example: 201609070753 - 201609070736 = 17
The goal is to use the time difference values as the basis for grouping the files.
The problem I am having is that when the count reaches the last value for len(n_List) it throws an IndexError because count+1 is out of range.
IndexError: list index out of range
I can't figure out how to work around this error. No matter what I try, the count is always out of range when it reaches the last value in the list.
Here is the while loop I've been using.
count = 0
while count <= len(n_List):
    full_path = source_folder + "/" + n_List[count][1]
    time_dif = n_List[count+1][0] - n_List[count][0]
    if time_dif < 100:
        f_List.write(full_path + "\n")
        count = count + 1
    else:
        f_List.write(full_path + "\n")
        f_List.close()
        f_List = open(source_folder + 'GoPro' + '_' + str(count) + '.txt', 'w')
        f_List.write(full_path + "\n")
        count = count + 1
PS. The only workaround I can think of is to assume that the last value will always be appended to the final group of files. So, when the count reaches len(n_List) - 1, I skip the time difference calculation and just add that final value to the last group. While this will probably work most of the time, I can see edge cases where the final value in the list may need to go in a separate group.
I think using zip could be an easier way to get the differences.
res1,res2 = [],[]
for i,j in zip(n_List,n_List[1:]):
    target = res1 if j[0]-i[0] < 100 else res2
    target.append(i[1])
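To see what the zip pairing produces, the differences can be computed on their own. This is a sketch in Python 3 (so the L suffixes are dropped), using the sample list from the question. One caveat: the values are packed YYYYMMDDHHMM numbers, so plain subtraction only equals elapsed minutes when both files fall within the same hour. The datetime parsing shown is an assumption about how the stamps are encoded, not something confirmed in the question.

```python
from datetime import datetime

n_list = [
    [201609070736, "GOPR5478.MP4"],
    [201609070753, "GP015478.MP4"],
    [201609070811, "GP025478.MP4"],
]

# zip pairs each entry with its successor, so there is no index to overrun.
raw_diffs = [b[0] - a[0] for a, b in zip(n_list, n_list[1:])]


def to_datetime(stamp):
    # Assumes the number encodes year, month, day, hour and minute.
    return datetime.strptime(str(stamp), "%Y%m%d%H%M")


minute_diffs = [
    (to_datetime(b[0]) - to_datetime(a[0])).total_seconds() / 60
    for a, b in zip(n_list, n_list[1:])
]

print(raw_diffs)
print(minute_diffs)
```

For the second pair the raw subtraction gives 58 although only 18 minutes elapsed, which may matter when choosing the grouping threshold.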
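n_List[len(n_List)] will always raise an IndexError, because the last valid index is len(n_List) - 1.
while count < len(n_List):
fixes the loop bound itself, because you are starting count at 0, not 1. However, since the loop body also reads n_List[count+1], the condition needs to be count < len(n_List) - 1 to keep that lookup in range, and the last item then has to be handled after the loop.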
FYI, here is the solution I used, thanks to #galaxyman for the help.
I handled the issue of the last value in the nested list by simply adding that value after the loop completes. I don't know if that's the most elegant way to do it, but it works.
(Note: I'm only posting the function related to the zip method suggested in the previous posts.)
def list_zip(gp_List):
    # output_path and gopro_folder are defined elsewhere in the script
    ffmpeg_list = open(output_path + '\\' + gp_List[0][1][0:8] + '.txt', 'a')
    for a,b in zip(gp_List,gp_List[1:]):
        full_path = gopro_folder + '\\' + a[1]
        time_dif = b[0]-a[0]
        if time_dif < 100:
            ffmpeg_list.write("file " + full_path + "\n")
        else:
            # gap too large: finish this group and start a new file named after the next clip
            ffmpeg_list.write("file " + full_path + "\n")
            ffmpeg_list.close()
            ffmpeg_list = open(output_path + '\\' + b[1][0:8] + '.txt', 'a')
    # zip stops one short, so the last entry is written after the loop
    last_val = gp_List[-1][1]
    ffmpeg_list.write("file " + gopro_folder + '\\' + last_val + "\n")
    ffmpeg_list.close()
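An alternative that avoids the special case for the last value is to compare each entry with the previous one instead of the next one. The sketch below only builds the groups in memory and leaves the file writing out; group_by_gap and max_gap are hypothetical names, and the threshold of 100 is taken from the code above.

```python
def group_by_gap(items, max_gap=100):
    """Split (stamp, name) pairs into groups wherever the gap reaches max_gap."""
    groups = []
    previous = None
    for stamp, name in items:
        # Start a new group for the first item or after a large gap.
        if previous is None or stamp - previous >= max_gap:
            groups.append([])
        groups[-1].append(name)
        previous = stamp
    return groups


gp_list = [
    [201609070736, "GOPR5478.MP4"],
    [201609070753, "GP015478.MP4"],
    [201609070811, "GP025478.MP4"],
    [201609071305, "GOPR5479.MP4"],  # invented extra entry to force a second group
]
groups = group_by_gap(gp_list)
print(groups)
```

Because every entry, including the last, goes through the same branch, a final file that belongs in its own group is handled without extra code.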