My script only runs through 1 of 2 "if" statements - python

Here is a description of what I want to do:
I have 2 CSV files.
I want to search for the same thing (let's call it "hugo") in both files at the same time.
The problem is that it only prints the match from one file and not the other.
Here is my code:
try:
    while True:
        if pm10pl[j][2] == '"Victor Hugo"':
            victor1 = pm10pl[j]
            print victor1
        if pm25pl[t][2] == '"Victor Hugo"':
            victor2 = pm25pl[t]
            print victor2
        j = j + 1
        t = t + 1
except IndexError:
    pass
I've tried different things, such as using elif instead of if, replacing t with j, and splitting the logic into 2 functions. Each if works perfectly when the other is not there, and when I swap the 2 of them, the same one still prints, namely the pm25pl one.
I'm stuck.
(This is only the relevant part of my code; opening the files etc. works fine. The '""' is intentional: hugo appears in my file as "hugo", with the double quotes.)
Also, I can't use victor1 and victor2 outside of the if.
Do you have any idea what's going on?

You can iterate through 2 lists simultaneously using itertools' izip_longest function.
import itertools
l = []
for victor1, victor2 in itertools.izip_longest(pm10pl, pm25pl):
    # izip_longest pads the shorter list with None, hence the truthiness checks
    if victor1 and victor1[2] == '"Victor Hugo"':
        print victor1
    if victor2 and victor2[2] == '"Victor Hugo"':
        print victor2
    l.append((victor1, victor2))  # add the pair to the list
for i in l:  # prints all pairs
    print i
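For readers on Python 3, where izip_longest was renamed zip_longest, the same idea might be sketched as below. The sample rows are made up for illustration; only the column index 2 and the quoted station name come from the question. Collecting the matches in lists defined before the loop also keeps them available after it ends.

```python
from itertools import zip_longest

# Hypothetical sample rows; real data would come from the two CSV files.
pm10pl = [["1", "a", '"Victor Hugo"'], ["2", "b", '"Other"'], ["3", "c", '"Victor Hugo"']]
pm25pl = [["9", "z", '"Other"'], ["8", "y", '"Victor Hugo"']]

TARGET = '"Victor Hugo"'
matches10 = []
matches25 = []

# zip_longest pads the shorter list with None, so neither list is cut short.
for row10, row25 in zip_longest(pm10pl, pm25pl):
    if row10 is not None and row10[2] == TARGET:
        matches10.append(row10)
    if row25 is not None and row25[2] == TARGET:
        matches25.append(row25)

print(matches10)
print(matches25)
```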

Do one list comprehension for each CSV file:
[row for row in pm10pl if 'Victor Hugo' in row[2]]
[row for row in pm25pl if 'Victor Hugo' in row[2]]

Related

Why do I get an IndexError after my program completes a successful first step?

I tried to write a sorter that deletes duplicate IPs from the first list and saves the result to a file, but after the first successful round it gives me IndexError: list index out of range.
I expected a normal sorting process, but it doesn't work.
Code:
ip1 = open('hosts', 'r')
ip2 = open('rotten', 'r')
ipList1 = [line.strip().split('\n') for line in ip1]
ipList2 = [line.strip().split('\n') for line in ip2]
for i in range(len(ipList1)):
    for a in range(len(ipList2)):
        if(ipList1[i] == ipList2[a]):
            print('match')
            del(ipList1[i])
            del(ipList2[a])
            i -= 1
            a -= 1
c = open('end', 'w')
for d in range(len(ipList1)):
    c.write(str(ipList1[d]) + '\n')
c.close()
You're deleting from the list while iterating over it, that's why you're getting an IndexError.
This is easier to do with sets:
with open('hosts') as ip1, open('rotten') as ip2:
    # strip() alone yields a hashable string; split() would produce an unhashable list
    ipList1 = set(line.strip() for line in ip1)
    ipList2 = set(line.strip() for line in ip2)
good = ipList1 - ipList2
with open('end', 'w') as c:
    for d in good:
        c.write(d + '\n')
You changed the lists on the fly. The for loop takes its range from a list that has, for example, 5 elements; once you delete elements during the loop, a later iteration tries to access an index that no longer exists.
If you need to preserve the ordering, you can use a list comprehension:
ips = [ip for ip in ipList1 if ip not in set(ipList2)]
If you don't, just use set operations.
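A minimal Python 3 sketch of the order-preserving variant follows. The addresses and the names ip_list1, ip_list2 and rotten are made up for illustration. The set is built once, outside the comprehension, so each membership test stays cheap.

```python
# Hypothetical sample data standing in for the contents of 'hosts' and 'rotten'.
ip_list1 = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.2"]
ip_list2 = ["10.0.0.2", "10.0.0.9"]

# Build the lookup set once, outside the comprehension.
rotten = set(ip_list2)

# Keeps the original order of ip_list1 and drops every address found in rotten.
good = [ip for ip in ip_list1 if ip not in rotten]
print(good)  # ['10.0.0.1', '10.0.0.3']
```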
You should never modify a list that you are currently iterating over.
A fix would be to build a third list that holds the non-duplicates. Another way would be to use sets and subtract one from the other, although I don't know whether you want to keep duplicates within a single list. Also, the way you are doing it right now, a duplicate is only found if it's at the same index.
ip1 = open('hosts', 'r')
ip2 = open('rotten', 'r')
ipList1 = [line.strip().replace('\n', '') for line in ip1]
ipList2 = [line.strip().replace('\n', '') for line in ip2]
ip1.close()
ip2.close()
newlist = []
# iterate over the lists that were read, not over the closed file objects
for v in ipList1:
    if v not in ipList2:
        newlist.append(v)
c = open('end', 'w')
c.write('\n'.join(newlist))
c.close()
Other answers focus on deleting from a container while iterating over it. While that’s generally a bad idea, it’s not the crux of the problem here because you have (unpythonically) set up the for loops to use a sequence of indices, and so you aren’t strictly speaking iterating over the lists themselves anyway.
No, the problem here is that i-=1 and a-=1 have no effect: when a for loop begins a new iteration, it doesn’t work off of the previous value of the index. It just takes the next value that it was always destined to take, from the iterator that you established at the beginning (in your case, the output of range()).
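The point about the loop index can be checked with a small Python 3 sketch. The names seen, items and error are illustrative only and do not come from the question.

```python
seen = []
for i in range(5):
    seen.append(i)
    i -= 1  # rebinding i here does not affect the next value taken from range()

print(seen)  # [0, 1, 2, 3, 4]

# Deleting inside an index loop still fails, because range(len(items))
# was evaluated once, when the list still had 3 elements.
items = ["a", "b", "c"]
error = None
try:
    for i in range(len(items)):
        if items[i] == "a":
            del items[i]
except IndexError as exc:
    error = exc

print(repr(error))
```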

Cannot understand how this Python code works

I found this question on HackerRank and I am unable to understand the code (solution) that is displayed on the discussions page.
The question is:
Consider a list (list = []). You can perform the following commands:
insert i e: Insert integer e at position i.
print: Print the list.
remove e: Delete the first occurrence of integer e.
append e: Insert integer e at the end of the list.
sort: Sort the list.
pop: Pop the last element from the list.
reverse: Reverse the list.
Even though I have solved the problem using if-else, I do not understand how this code works:
n = input()
slist = []
for _ in range(n):
    s = input().split()
    cmd = s[0]
    args = s[1:]
    if cmd != "print":
        cmd += "(" + ",".join(args) + ")"
        eval("slist." + cmd)
    else:
        print slist
Well, the code takes advantage of Python's eval function. Many languages have this feature: eval, short for "evaluate", takes a piece of text and executes it as if it were part of the program instead of just a piece of data fed to the program. This line:
s = input().split()
reads a line of input from the user and splits it into words based on whitespace, so if you type "insert 1 2", s is set to the list ["insert","1","2"]. That is then transformed by the following lines into "insert(1,2)", which is then appended to "slist." and passed to eval, resulting in the method call slist.insert(1,2) being executed. So basically, this code is taking advantage of the fact that Python already has methods to perform the required functions, that even happen to have the same names used in the problem. All it has to do is take the name and arguments from an input line and transform them into Python syntax. (The print option is special-cased since there is no method slist.print(); for that case it uses the global command: print slist.)
In real-world code, you should almost never use eval; it is a very dangerous feature, since it allows users of your application to potentially cause it to run any code they want. It's certainly one of the easier features for hackers to use to break into things.
It's dirty code that's abusing eval.
Basically, when you enter, for example, "remove 1", it creates some code that looks like slist.remove(1), then gives the created code to eval. This has Python interpret it.
This is probably the worst way you could solve this outside of coding competitions though. The use of eval is entirely unnecessary here.
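One way to avoid eval, sketched here in Python 3 under the assumption that every argument is an integer (as in the problem statement), is to look the method up by name with getattr and restrict it to a whitelist. The function name run_commands and the sample commands are made up for illustration.

```python
def run_commands(lines):
    """Apply list commands given as text lines; return the printed snapshots."""
    slist = []
    snapshots = []
    # Whitelist of allowed method names, so arbitrary attributes cannot be called.
    allowed = {"insert", "remove", "append", "sort", "pop", "reverse"}
    for line in lines:
        cmd, *args = line.split()
        if cmd == "print":
            snapshots.append(list(slist))
            print(slist)
        elif cmd in allowed:
            # Look up the method by name and call it with integer arguments.
            getattr(slist, cmd)(*map(int, args))
        else:
            raise ValueError("unknown command: " + cmd)
    return snapshots


result = run_commands([
    "insert 0 5", "insert 1 10", "insert 0 6", "print",
    "remove 6", "append 9", "sort", "print",
    "pop", "reverse", "print",
])
```

Because the input is never executed as code, a line such as "__class__" or an arbitrary expression is rejected instead of being run.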
Actually, I found an error in the code, but I came to an understanding of how this code runs. Here it is:
input :
3
1 2 3
cmd = 1 + ( 2 + 3)
then eval(cmd) i.e., eval("1 + (2 + 3)") which gives an output 6
another input:
4
4 5 6 2
cmd = 4 + ( 5 + 6 + 2)
eval(cmd)
if __name__ == '__main__':
    # Python 3: input() replaces raw_input()
    N = int(input())
    lst = []
    for _ in range(N):
        cmd, *line = input().split()
        ele = list(map(str, line))
        if cmd in dir(lst):
            exec('lst.' + cmd + '(' + ','.join(ele) + ')')
        elif cmd == 'print':
            print(lst)
        else:
            print('wrong command', cmd)

Using two for loops to iterate over two large lists in Python: nothing happens

I want to use two for loops to iterate over two large lists, but nothing happens.
Here is the code:
pfam = open(r'D:\RPS_data\pfam_annotations.tbl', 'r')
number = []
description = []
for j in pfam:
    number.append(j.split('\t')[0])
    description.append(j.split('\t')[-2])
print len(number),len(description)
for j in number:
    for r in description:
        if j==r:
            print 'ok'
        else:
            print 'not match'
The result is:
D:\Pathyon\pathyon2.7\python.exe "D:/Biopython/Pycharm/Python Learning/war_battle/AUL-prediction/trash.py"
16230 16230
Process finished with exit code 0
My goal is to find the matching entries in number and description. It looks as if Python does not run the for loops. Is it because the two lists are large? Does anyone know how to overcome this?
I modified your script to run a test. This is what I did: I inserted a dummy text blob as an io.StringIO file, and printed the resulting columns.
import io
annotations=u"""
1024\tmiddle\tkay\tend
3192\tmiddle\tdiana\tend
8675309\tmiddle\tjenny\tend
""".strip()
#pfam = open(r'D:\RPS_data\pfam_annotations.tbl', 'r')
pfam = io.StringIO(annotations)
number = []
description = []
for j in pfam:
    number.append(j.split('\t')[0])
    description.append(j.split('\t')[-2])
print len(number),len(description)
print number, description
for j in number:
    for r in description:
        if j==r:
            print 'ok'
        else:
            print 'not match'
When I run this, the output is:
$ python2.7 test.py
3 3
[u'1024', u'3192', u'8675309'] [u'kay', u'diana', u'jenny']
not match
I'll argue that (1) your initial code appears to be working fine; and (2) Python is, in fact, running the second loop.
I am suspicious of your second loop. You are comparing every number against every description, using a direct comparison (==). Those both seem to be possible sources of error.
For example, are you sure you want to compare a number from line 1 (1024) with a description from line 3 (jenny)?
For example, are you sure you want to compare using equality? Are you looking for direct equality, or containment? A "description" might contain several words, one of which is the number in question.
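If the intent is to compare each number only with the description on its own line, and to test containment rather than equality, a Python 3 sketch could look like this. The two sample rows are invented; only the tab-separated layout and the column positions (first and second-to-last) follow the question.

```python
# Hypothetical rows in the same tab-separated shape as the test data above.
rows = [
    "1024\tmiddle\tprotein 1024 family\tend",
    "3192\tmiddle\tdiana\tend",
]

row_matches = []
for line in rows:
    fields = line.rstrip("\n").split("\t")
    number, description = fields[0], fields[-2]
    # Compare each number only with the description on its own line,
    # and test containment rather than strict equality.
    row_matches.append(number in description)

print(row_matches)  # [True, False]
```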

Writing sublists in a list of lists to separate text files

I’m really new to Python but find myself working on the travelling salesman problem with multiple drivers. Currently I handle the routes as a list of lists but I’m having trouble getting the results out in a suitable .txt format. Each sub-list represents the locations for a driver to visit, which corresponds to a separate list of lat/long tuples. Something like:
driver_routes = [[0,5,3,0],[0,1,4,2,0]]
lat_long =[(lat0,long0),(lat1,long1)...(latn,longn)]
What I would like is a separate .txt file (named “Driver(n)”) that lists the lat/long pairs for that driver to visit.
When I was just working with a single driver, the following code worked fine for me:
optimised_locs = open('Optimisedroute.txt', 'w')
for x in driver_routes:
    to_write = ','.join(map(str, lat_long[x]))
    optimised_locs.write(to_write)
    optimised_locs.write("\n")
optimised_locs.close()
So, I took the automated file naming code from Chris Gregg here (Printing out elements of list into separate text files in python) and tried to make an iterating loop for sublists:
num_drivers = 2
p = 0
while p < num_drivers:
    for x in driver_routes[p]:
        f = open("Driver"+str(p)+".txt","w")
        to_write = ','.join(map(str, lat_long[x]))
        print to_write # for testing
        f.write(to_write)
        f.write("\n")
        f.close()
    print "break" # for testing
    p += 1
The output on my screen looks exactly how I would expect it to look and I generate .txt files with the correct name. However, I just get one tuple printed to each file, not the list that I expect. It’s probably very simple but I can't see why the while loop causes this issue. I would appreciate any suggestions and thank you in advance.
You're overwriting the contents of the file f on every iteration of your for loop because you're re-opening it. You just need to modify your code as follows to open the file once per driver:
while p < num_drivers:
    f = open("Driver"+str(p)+".txt","w")
    for x in driver_routes[p]:
        to_write = ','.join(map(str, lat_long[x]))
        print to_write # for testing
        f.write(to_write)
        f.write("\n")
    f.close()
    p += 1
Note that opening f is moved to outside the for loop.
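The same fix can also be written with enumerate and a with block, which closes each file automatically. This is only a Python 3 sketch: the coordinates are invented, and the temporary directory out_dir and the list written exist only to keep the demo self-contained.

```python
import os
import tempfile

# Hypothetical coordinates; index i in a route refers to lat_long[i].
lat_long = [(51.50, -0.12), (51.52, -0.10), (51.48, -0.15),
            (51.51, -0.09), (51.49, -0.11), (51.53, -0.13)]
driver_routes = [[0, 5, 3, 0], [0, 1, 4, 2, 0]]

out_dir = tempfile.mkdtemp()  # a scratch directory, used here only for the demo

written = []
for p, route in enumerate(driver_routes):
    path = os.path.join(out_dir, "Driver" + str(p) + ".txt")
    # Open once per driver; the with block closes the file automatically.
    with open(path, "w") as f:
        for x in route:
            f.write(",".join(map(str, lat_long[x])) + "\n")
    written.append(path)
```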

Trying to write python CSV extractor

I am a complete newbie to programming and this is the first real program I am trying to write.
So I have this huge CSV file (hundreds of columns and thousands of rows) from which I am trying to extract only a few columns, based on the value in one field. It works fine and I get nice output, but the problem arises when I try to encapsulate the same logic in a function.
It returns only the first extracted row, although print works fine.
I have been playing with this for hours and have read other examples here, and now my mind is mush.
import csv
import sys
newlogfile = csv.reader(open(sys.argv[1], 'rb'))
outLog = csv.writer(open('extracted.csv', 'w'))
def rowExtractor(logfile):
    for row in logfile:
        if row[32] == 'No':
            a = []
            a.append(row[44])
            a.append(row[58])
            a.append(row[83])
            a.append(row[32])
            return a
outLog.writerow(rowExtractor(newlogfile))
You are exiting prematurely. Because return a is inside the for loop, the function returns the first time a matching row is found, which means that only the first extracted row is ever produced.
A simple way to do this would be to do:
def rowExtractor(logfile):
    #output holds all of the rows
    output = []
    for row in logfile:
        if row[32] == 'No':
            a = []
            a.append(row[44])
            a.append(row[58])
            a.append(row[83])
            a.append(row[32])
            output.append(a)
    #notice that the return statement is outside of the for-loop
    return output
outLog.writerows(rowExtractor(newlogfile))
You could also consider using yield
You've got a return statement in your function...when it hits that line, it will return (thus terminating your loop). You'd need yield instead.
See What does the "yield" keyword do in Python?
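A generator version might look like the following Python 3 sketch. The function name row_extractor, its parameters flag_col and keep_cols, and the 4-column sample data are assumptions made for illustration; the question's file uses columns 32, 44, 58 and 83.

```python
import csv
import io


def row_extractor(rows, flag_col, keep_cols):
    """Yield the selected columns of every row whose flag column is 'No'."""
    for row in rows:
        if row[flag_col] == "No":
            # yield hands back one row, then resumes the loop on the next request
            yield [row[c] for c in keep_cols]


# Hypothetical 4-column data standing in for the real log file.
sample = io.StringIO("a,b,No,d\ne,f,Yes,h\ni,j,No,l\n")
out = io.StringIO()
csv.writer(out).writerows(row_extractor(csv.reader(sample), 2, [0, 3, 2]))
extracted = out.getvalue()
print(extracted)
```

Since writerows accepts any iterable of rows, the generator can be passed to it directly without building an intermediate list.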
