Append the string in python for forloop - python

I need to collect the result of every comparison in a single Python list and print it, but I am not able to get it working. Can anyone help me with this? I am comparing data from one system with data from another system and identifying the mismatches in each field.
For example, a table has the columns kid_id, name, age, gender, and address in 2 different systems. I need to ensure that all kids' data is moved correctly from the 1data system to the 2data system.
Emp_id values look like 1, 2, 3, 4, 5, 6.
data_2 = self.get2Data(kid_id)
data_1 = self.get1Data(kid_id)
for i in range(len(data_1)):
    for key, value in data_1[i].items():
        if data_1[i][key] == data_2[i][key]:
            result = str("LKG") + "," + str(kid_id) + "," + str("PASS") + "," + str(key)
        else:
            result = str("LKG") + "," + str(kid_id) + "," + str("FAIL") + "," + str(key)
        MatchResult = result.split()
        print MatchResult
print "***It is Done*****"
Currently my output looks like this:
['LKG,100,PASS,address']
['LKG,102,FAIL,dob']
['LKG,105,FAIL,gender']
but I need it in this form:
(['LKG,100,PASS,address'],['LKG,102,FAIL,dob'],['LKG,105,FAIL,gender'])
or
[('LKG,100,PASS,address'),('LKG,102,FAIL,dob'),('LKG,105,FAIL,gender')]
Code details: the code above compares the data from the two systems and prints each pass or fail case in the format shown. In the result above, address is a pass while dob and gender are fails, which means there is still a data mismatch in the dob field for kid 102 and in the gender field for kid 105.

Declare the list before the loop, initialize it to an empty list, and append each result to it inside the loop.
data_2 = self.get2Data(kid_id)
data_1 = self.get1Data(kid_id)
MatchResult = []  # create the list once, before the loops
for i in range(len(data_1)):
    for key, value in data_1[i].items():
        if data_1[i][key] == data_2[i][key]:
            result = str("LKG") + "," + str(kid_id) + "," + str("PASS") + "," + str(key)
        else:
            result = str("LKG") + "," + str(kid_id) + "," + str("FAIL") + "," + str(key)
        MatchResult.append(result.split())  # collect every result
print MatchResult
print "***It is Done*****"
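For readers on Python 3, the same idea (create the list once, append inside the loop, print after it) might be sketched as below. The function name compare_records, the label parameter, and the sample rows are hypothetical stand-ins for the asker's get1Data / get2Data results, and the sketch stores plain strings instead of one-element lists. It also assumes a missing key should count as a mismatch, which the original code does not do.

```python
def compare_records(kid_id, source_rows, target_rows, label="LKG"):
    """Return one 'label,kid_id,PASS|FAIL,field' string per compared field."""
    results = []  # created once, before the loops
    for source, target in zip(source_rows, target_rows):
        for key, value in source.items():
            # Assumption: a key missing from the target row counts as FAIL.
            status = "PASS" if target.get(key) == value else "FAIL"
            results.append(",".join([label, str(kid_id), status, key]))
    return results


# Made-up sample rows standing in for the two systems' data.
rows_1 = [{"name": "Asha", "gender": "F"}]
rows_2 = [{"name": "Asha", "gender": "M"}]
print(compare_records(102, rows_1, rows_2))
# ['LKG,102,PASS,name', 'LKG,102,FAIL,gender']
```

Returning the list instead of printing inside the loop also makes it easy to filter the FAIL entries afterwards.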

Related

print statement never gets executed in a loop

In the code posted below, the first pair of nested for loops prints its log output as expected, but the second pair, which uses k and l as indices, never prints anything.
Please let me know why the print statement
print(str(x) + ",,,,,,,,,,,,,,,,,,," + str(y))
is never executed even though polygonCoordinatesInEPSG25832 contains values.
Python code:
for feature in featuresArray:
    polygonCoordinatesInEPSG4326.append(WebServices.fetchCoordinateForForFeature(feature))
for i in range(len(polygonCoordinatesInEPSG4326)):
    for j in range(len(polygonCoordinatesInEPSG4326[i])):
        lon = polygonCoordinatesInEPSG4326[i][j][0]
        lat = polygonCoordinatesInEPSG4326[i][j][1]
        x, y = transform(inputProj, outputProj, lon, lat)
        xy.append([x,y])
        print ("lon:" + str(lon) + "," + "lat:" + str(lat) + "<=>" + "x:" + str(x) + "," + "y:" , str(y))
        print(str(x) + "," + str(y))
        print("xy[%d]: %s"%(len(xy)-1,str(xy[len(xy)-1])))
    print("\n")
    print("len(xy): %d"%(len(xy)))
    polygonCoordinatesInEPSG25832.append(xy)
    print("len(polygonCoordinatesInEPSG25832[%d]: %d"%(i,len(polygonCoordinatesInEPSG25832[i])))
    xy.clear()
print("len(polygonCoordinatesInEPSG25832 = %d" %(len(polygonCoordinatesInEPSG25832)))
for k in range(len(polygonCoordinatesInEPSG25832)):
    for l in range(len(polygonCoordinatesInEPSG25832[k])):
        x = polygonCoordinatesInEPSG25832[k][l][0]
        y = polygonCoordinatesInEPSG25832[k][l][1]
        print(str(x) + ",,,,,,,,,,,,,,,,,,," + str(y))
polygonCoordinatesInEPSG25832 contains entries, but each polygonCoordinatesInEPSG25832[k] is empty.
You append xy to it, but that stores a reference to the same list rather than a copy, so when you call xy.clear() the appended entry becomes empty as well. Append a copy instead, for example a deep copy.
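The aliasing described above can be reproduced in a few lines. This is a minimal sketch with made-up coordinates; the names aliased and copied are illustrative only and do not appear in the original code.

```python
import copy

xy = []
aliased = []  # stores references to the one shared xy list
copied = []   # stores independent snapshots of xy

for polygon in ([[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0]]):
    for point in polygon:
        xy.append(list(point))
    aliased.append(xy)               # same object every time
    copied.append(copy.deepcopy(xy)) # a separate copy
    xy.clear()                       # empties xy and every alias of it

print([len(p) for p in aliased])  # [0, 0]
print([len(p) for p in copied])   # [2, 1]
```

Creating a fresh list at the top of each outer iteration (xy = []) instead of calling xy.clear() would avoid the problem without any copying.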

For Loop is not Following Indentations

Afternoon all,
I'm encountering some strange behavior from the starred for loop below. The below function essentially iterates over the input dictionary patient_features, concatenating several strings to produce a SVMLight style vector. This vector is then intended to be written to the deliverables file. However, for some reason the writes at the end of the function are being called for every iteration of the starred for loop, resulting in a massive file size (and some other more minor issues). Any help on what might be causing this would be greatly appreciated.
def save_svmlight(patient_features, mortality, op_file, op_deliverable):
    deliverable1 = open(op_file, 'wb') # feature without patient id
    deliverable2 = open(op_deliverable, 'wb') # features with patient id
    d1_line = ''
    d2_line = ''
    count = 0 # VALUE TO TEST IF INCREMENTING
    print count
    for patient_id in patient_features: #**********
        value_tuple_list = patient_features[patient_id]
        value_tuple_list.sort()
        d2_line += str(int(patient_id)) + ' '
        if patient_id in mortality:
            d1_line += str(1) + ' '
            d2_line += str(1) + ' '
        else:
            d1_line += str(0) + ' '
            d2_line += str(0) + ' '
        for value_tuple in value_tuple_list:
            d1_line += str(int(value_tuple[0])) + ":" + str("{:1.6f}".format(value_tuple[1])) + ' '
            d2_line += str(int(value_tuple[0])) + ":" + str("{:1.6f}".format(value_tuple[1])) + ' '
    count += 1
    print count # VALUE INCREMENTS WHEN IT SHOULD NOT
    deliverable1.write(d1_line); # <- BEING WRITTEN TO EACH LOOP :(
    deliverable2.write(d2_line); # <- BEING WRITTEN TO EACH LOOP :(
The issue is that your code mixes tabs and spaces for indentation. Python does not cope well when you mix the two :/
For future reference, if you are using Sublime Text, select all the text, then in the menu bar at the top go to View > Indentation > Convert Tabs to Spaces and the issue will be resolved.
I thought I would save you from having to search for each tab and replace it manually :)
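If no editor command is at hand, a small script can point at the offending lines. The following is a rough sketch, not an established tool: the helper name lines_indented_with_tabs is made up, and it simply reports every line whose leading whitespace contains a tab. Python 2 can also be started with the -tt option to turn inconsistent tab use into an error, and Python 3 raises TabError for it.

```python
def lines_indented_with_tabs(source):
    """Return 1-based numbers of lines whose indentation contains a tab."""
    flagged = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Leading whitespace is whatever lstrip() would remove.
        indent = line[:len(line) - len(line.lstrip())]
        if "\t" in indent:
            flagged.append(number)
    return flagged


sample = "for x in y:\n    a = 1\n\tb = 2\n"
print(lines_indented_with_tabs(sample))  # [3]
```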

Python: Calculating difference of values in a nested list by using a While Loop

I have a list composed of nested lists. Each nested list contains two values: a numeric value (the file creation date) and a string (the name of the file).
For example:
n_List = [[201609070736L, 'GOPR5478.MP4'], [201609070753L, 'GP015478.MP4'],[201609070811L, 'GP025478.MP4']]
The nested list is already sorted in ascending order of creation date. I am trying to use a while loop to calculate the difference between each pair of consecutive date values.
For Example: 201609070753 - 201609070736 = 17
The goal is to use the time difference values as the basis for grouping the files.
The problem I am having is that when the count reaches the last value for len(n_List) it throws an IndexError because count+1 is out of range.
IndexError: list index out of range
I can't figure out how to work around this error. No matter what I try, the count is always out of range when it reaches the last value in the list.
Here is the While loop I've been using.
count = 0
while count <= len(n_List):
    full_path = source_folder + "/" + n_List[count][1]
    time_dif = n_List[count+1][0] - n_List[count][0]
    if time_dif < 100:
        f_List.write(full_path + "\n")
        count = count + 1
    else:
        f_List.write(full_path + "\n")
        f_List.close()
        f_List = open(source_folder + 'GoPro' + '_' + str(count) + '.txt', 'w')
        f_List.write(full_path + "\n")
        count = count + 1
PS. The only workaround I can think of is to assume that the last value will always be appended to the final group of files. So, when the count reaches len(n_List) - 1, I skip the time difference calculation and automatically add that final value to the last group. While this will probably work most of the time, I can see edge cases where the final value in the list may need to go in a separate group.
I think using zip would make it easier to get the differences.
res1,res2 = [],[]
for i,j in zip(n_List,n_List[1:]):
    target = res1 if j[0]-i[0] < 100 else res2
    target.append(i[1])
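n_List[len(n_List)] will always raise an index out of range error, because the last valid index is len(n_List) - 1. Changing the condition to
while count < len(n_List):
fixes that part, because you are starting count at 0, not 1. Note that the loop body also reads n_List[count+1], so the condition has to be count < len(n_List) - 1 for that lookup to stay in range; the last entry then needs to be handled after the loop.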
FYI, here is the solution I used, thanks to #galaxyman for the help.
I handled the issue of the last value in the nested list by simply adding that value after the loop completes. I don't know if that's the most elegant way to do it, but it works.
(Note: I'm only posting the function related to the zip method suggested in the previous posts.)
def list_zip(gp_List):
    # gp_List: list of [timestamp, file name] pairs, sorted by timestamp
    ffmpeg_list = open(output_path + '\\' + gp_List[0][1][0:8] + '.txt', 'a')
    for a,b in zip(gp_List,gp_List[1:]):
        full_path = gopro_folder + '\\' + a[1]
        time_dif = b[0]-a[0]
        if time_dif < 100:
            ffmpeg_list.write("file " + full_path + "\n")
        else:
            # large gap: finish this group and start a file for the next one
            ffmpeg_list.write("file " + full_path + "\n")
            ffmpeg_list.close()
            ffmpeg_list = open(output_path + '\\' + b[1][0:8] + '.txt', 'a')
    # zip stops one short, so the last entry is written after the loop
    last_val = gp_List[-1][1]
    ffmpeg_list.write("file " + gopro_folder + '\\' + last_val + "\n")
    ffmpeg_list.close()
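The grouping discussed in this thread can also be sketched in Python 3 without any count+1 lookup, by comparing each entry with the previous one. The function name group_by_gap and the max_gap parameter are hypothetical, and the fourth sample entry is made up to show a second group. The sketch keeps the thread's simple subtraction of the raw date values and returns the groups instead of writing files.

```python
def group_by_gap(entries, max_gap=100):
    """Split sorted [timestamp, name] pairs into groups of file names.

    A new group starts whenever the difference to the previous
    timestamp is max_gap or more.
    """
    groups = []
    for index, (stamp, name) in enumerate(entries):
        if index == 0 or stamp - entries[index - 1][0] >= max_gap:
            groups.append([])  # start a new group
        groups[-1].append(name)
    return groups


n_list = [
    [201609070736, "GOPR5478.MP4"],
    [201609070753, "GP015478.MP4"],
    [201609070811, "GP025478.MP4"],
    [201609071530, "GOPR5479.MP4"],  # made-up entry with a large gap
]
print(group_by_gap(n_list))
# [['GOPR5478.MP4', 'GP015478.MP4', 'GP025478.MP4'], ['GOPR5479.MP4']]
```

Because every entry is compared with its predecessor, the last entry needs no special handling and can still end up in a group of its own.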

Organize dictionary by frequency

I create a dictionary of the most used words and get the top ten. I need the output to be sorted, in order. I can't do that without making a list, which I can't use. Here is my code. I am aware dictionaries cannot be sorted, but I still need help.
most_used_words = Counter()
zewDict = Counter(most_used_words).most_common(10)
newDict = dict(zewDict)
keys = newDict.keys()
values = newDict.values()
msg = ('Here is your breakdown of your most used words: \n\n'
'Word | Times Used'
'\n:--:|:--:'
'\n' + str(keys[0]).capitalize() + '|' + str(values[0]) +
'\n' + str(keys[1]).capitalize() + '|' + str(values[1]) +
'\n' + str(keys[2]).capitalize() + '|' + str(values[2]) +
'\n' + str(keys[3]).capitalize() + '|' + str(values[3]) +
'\n' + str(keys[4]).capitalize() + '|' + str(values[4]) +
'\n' + str(keys[5]).capitalize() + '|' + str(values[5]) +
'\n' + str(keys[6]).capitalize() + '|' + str(values[6]) +
'\n' + str(keys[7]).capitalize() + '|' + str(values[7]) +
'\n' + str(keys[8]).capitalize() + '|' + str(values[8]) +
'\n' + str(keys[9]).capitalize() + '|' + str(values[9]))
r.send_message(user, 'Most Used Words', msg)
How would I do it so the msg prints the words in order from most used word on the top to least on the bottom with the correct values for the word?
Edit: I know dictionaries cannot be sorted on their own, so can I work around this somehow?
Once you have the values it's as simple as:
print('Word | Times Used')
for e, t in collections.Counter(values).most_common(10):
    print("%s|%d" % (e,t))
This prints something like:
Word | Times Used
e|4
d|3
a|2
c|2
From the Docs: most_common([n])
Return a list of the n most common elements and their counts from the
most common to the least. If n is not specified, most_common() returns
all elements in the counter. Elements with equal counts are ordered
arbitrarily:
>>> Counter('abracadabra').most_common(3)
[('a', 5), ('r', 2), ('b', 2)]
Your code can be:
from collections import Counter
c = Counter(most_used_words)
msg = "Here is your breakdown of your most used words:\n\nWords | Times Used\n:--:|:--:\n"
msg += '\n'.join('%s|%s' % (k.capitalize(), v) for (k, v) in c.most_common(10))
r.send_message(user, 'Most Used Words', msg)
import operator
newDict = dict(zewDict)
sorted_newDict = sorted(newDict.iteritems(), key=operator.itemgetter(1))
msg = ''
for key, value in sorted_newDict:
    msg += '\n' + str(key).capitalize() + '|' + str(value)
This sorts by the dictionary values in ascending order. To list the most used word first, add reverse=True to sorted().
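Putting the answers together, a Python 3 sketch of the whole message could look like the following. The function name build_breakdown and the sample word list are made up for illustration. The key point is that most_common() already returns the pairs in descending order, so they should not be passed through dict() before formatting.

```python
from collections import Counter


def build_breakdown(words, top_n=10):
    """Format the top_n most frequent words as a two-column table."""
    counts = Counter(words)
    lines = ["Word | Times Used", ":--:|:--:"]
    # most_common() yields (word, count) pairs, most frequent first.
    for word, times in counts.most_common(top_n):
        lines.append("%s|%d" % (word.capitalize(), times))
    return "\n".join(lines)


sample = ["tea", "cake", "tea", "jam", "tea", "cake"]
print(build_breakdown(sample, top_n=2))
# Word | Times Used
# :--:|:--:
# Tea|3
# Cake|2
```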

Formatting output when writing a list to textfile

I have a list of lists that looks like this:
dupe = [['95d1543adea47e88923c3d4ad56e9f65c2b40c76', 'ron\\c', 'apa.txt'], ['95d1543adea47e88923c3d4ad56e9f65c2b40c76', 'ron\\c', 'knark.txt'], ['b5cc17d3a35877ca8b76f0b2e07497039c250696', 'ron\\a', 'apa2.txt'], ['b5cc17d3a35877ca8b76f0b2e07497039c250696', 'ron\\a', 'jude.txt']]
I write it to a file using a very basic loop:
try:
    file_name = open("dupe.txt", "w")
except IOError:
    pass
for a in range (len(dupe)):
    file_name.write(dupe[a][0] + " " + dupe[a][1] + " " + dupe[a][2] + "\n");
file_name.close()
With the output in the file looking like this:
95d1543adea47e88923c3d4ad56e9f65c2b40c76 ron\c apa.txt
95d1543adea47e88923c3d4ad56e9f65c2b40c76 ron\c knark.txt
b5cc17d3a35877ca8b76f0b2e07497039c250696 ron\a apa2.txt
b5cc17d3a35877ca8b76f0b2e07497039c250696 ron\a jude.txt
However, how can I make the output in the dupe.txt file look like this instead:
95d1543adea47e88923c3d4ad56e9f65c2b40c76 ron\c apa.txt, knark.txt
b5cc17d3a35877ca8b76f0b2e07497039c250696 ron\a apa2.txt, jude.txt
First, group the lines by the "key" (the first two elements of each array):
dupedict = {}
for a, b, c in dupe:
    dupedict.setdefault((a,b),[]).append(c)
Then print it out:
for key, values in dupedict.iteritems():
    print ' '.join(key), ', '.join(values)
I take it your last question didn't solve your problem?
Instead of putting each entry with a repeated ID and directory in a separate list, why not make the file element of each list a sub-list that contains all the files sharing the same ID and directory?
dupe would then look like this:
dupe = [['95d1543adea47e88923c3d4ad56e9f65c2b40c76', 'ron\\c', ['apa.txt','knark.txt']],
        ['b5cc17d3a35877ca8b76f0b2e07497039c250696', 'ron\\a', ['apa2.txt','jude.txt']]]
Then your print loop could be similar to:
for i in dupe:
    print i[0], i[1],
    for j in i[2]:
        print j,
    print
from collections import defaultdict
dupe = [
['95d1543adea47e88923c3d4ad56e9f65c2b40c76', 'ron\\c', 'apa.txt'],
['95d1543adea47e88923c3d4ad56e9f65c2b40c76', 'ron\\c', 'knark.txt'],
['b5cc17d3a35877ca8b76f0b2e07497039c250696', 'ron\\a', 'apa2.txt'],
['b5cc17d3a35877ca8b76f0b2e07497039c250696', 'ron\\a', 'jude.txt'],
]
with open("dupe.txt", "w") as f:
    data = defaultdict(list)
    for hash, dir, fn in dupe:
        data[(hash, dir)].append(fn)
    for hash_dir, fns in data.items():
        f.write("{0[0]} {0[1]} {1}\n".format(hash_dir, ', '.join(fns)))
Use a dict to group them:
data = [['95d1543adea47e88923c3d4ad56e9f65c2b40c76', 'ron\\c', 'apa.txt'], \
['95d1543adea47e88923c3d4ad56e9f65c2b40c76', 'ron\\c', 'knark.txt'], \
['b5cc17d3a35877ca8b76f0b2e07497039c250696', 'ron\\a', 'apa2.txt'], \
['b5cc17d3a35877ca8b76f0b2e07497039c250696', 'ron\\a', 'jude.txt']]
dupes = {}
for row in data:
    if row[0] in dupes:
        dupes[row[0]].append(row)
    else:
        dupes[row[0]] = [row]
for dupe in dupes.itervalues():
    print "%s\t%s\t%s" % (dupe[0][0], dupe[0][1], ",".join([x[2] for x in dupe]))
If this is your actual data, you can either:
Output one line for every two elements in dupe. This is easier. Or,
If your data isn't as structured, make a dictionary where the long hash is the key and the rest of the output line is the value. Make sense?
For idea one, I mean that you can do something like this:
tmp_string = ""
for a in range(len(dupe)):
    if a % 2 == 0:
        # first row of a pair: start a new output line
        tmp_string = dupe[a][0] + " " + dupe[a][1] + " " + dupe[a][2]
    else:
        # second row of the pair: add its file name and write the line
        tmp_string += ", " + dupe[a][2]
        file_name.write(tmp_string + "\n")
In idea two, you may have something like this:
x = dict()
for a in range(len(dupe)):
    key = dupe[a][0]  # the hash identifies the group
    if key in x:
        # hash already seen: add this file name to its line
        x[key] += ", " + dupe[a][2]
    else:
        x[key] = dupe[a][0] + " " + dupe[a][1] + " " + dupe[a][2]
for b in x:  # for every key in dictionary x
    file_name.write(x[b] + "\n")
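The dictionary-based answers in this thread are written for Python 2. A rough Python 3 equivalent is sketched below; the function name merge_duplicates is made up, and the hashes in the sample are shortened placeholders rather than the real values from the question. It relies on dictionaries keeping insertion order, which Python guarantees from version 3.7.

```python
def merge_duplicates(rows):
    """Group [hash, directory, file name] rows by (hash, directory)."""
    grouped = {}
    for file_hash, directory, file_name in rows:
        grouped.setdefault((file_hash, directory), []).append(file_name)
    # One output line per group, file names joined with ", ".
    return [
        "%s %s %s" % (file_hash, directory, ", ".join(names))
        for (file_hash, directory), names in grouped.items()
    ]


# Shortened, made-up hashes keep the sample readable.
dupe = [
    ["95d1", "ron\\c", "apa.txt"],
    ["95d1", "ron\\c", "knark.txt"],
    ["b5cc", "ron\\a", "apa2.txt"],
    ["b5cc", "ron\\a", "jude.txt"],
]
for line in merge_duplicates(dupe):
    print(line)
# 95d1 ron\c apa.txt, knark.txt
# b5cc ron\a apa2.txt, jude.txt
```

Writing the result to dupe.txt is then a matter of opening the file in a with block and writing each returned line followed by a newline.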
