Python readlines() raises IndexError inside a for loop - python

Intro:
I'm a Python beginner learning the syntax at the moment. I've come across the file reading and writing that Python supports natively. I decided to give it a try and ran into a bug once I put the read and write commands inside a loop. I wanted to randomly pick a name from a file of names and then write it into a new file. My file includes 19239 lines of names; randrange(18238) generates a number from 0 - 18238 and, supposedly, would read a random line between 1 - 18239. The problem is that the code that reads and writes works without the for loop but not with the for loop.
My attempt:
from random import randrange
rdname = open("names.dat", "r")
wrmain = open("main.dat", "a")
rdmain = open("main.dat", "r")
for x in range(6):
    nm = rdname.readlines()[randrange(18238)]
    print(str(randrange(18238)) + ": " + nm)
    wrmain.write("\n" + nm)
...
Error code:
Exception has occurred: IndexError
list index out of range

Good luck with your programming journey.
The readlines() method has some non-intuitive behaviour. When you call readlines(), it reads the entire remaining content of the file and returns it as a list of strings, one per line. Thus the second time you call rdname.readlines()[randrange(18238)], the rdname file object has nothing left to read and you actually get an empty list. So functionally you are telling your programme to run [][randrange(18238)] on the second iteration of the loop.
I also took the liberty of fixing the random number call: the way you had implemented it, two different random numbers were drawn, one when selecting the name with nm = rdname.readlines()[randrange(18238)] and another when printing the name and line number with print(str(randrange(18238)) + ": " + nm).
...
rdname = open("names.dat", "r")
wrmain = open("main.dat", "a")
rdmain = open("main.dat", "r")
rdname_list = rdname.readlines()
for x in range(6):
    rd_number = randrange(18238)
    nm = rdname_list[rd_number]
    print(str(rd_number) + ": " + nm)
    wrmain.write("\n" + nm)
...

rdname.readlines() exhausts your file handle. Running rdname.readlines() gives you the list of lines the first time, but returns an empty list every subsequent time. Obviously, you can't access an element in an empty list. To fix this, assign the result of readlines() to a variable just once, before your loop.
rdlines = rdname.readlines()
maxval = len(rdlines)
for x in range(6):
    randval = randrange(maxval)
    nm = rdlines[randval]
    print(str(randval) + ": " + nm)
    wrmain.write("\n" + nm)
Also, making sure your random number can only go to the length of your list is a good idea. No need to hardcode the length of the list though -- the len() function will give you that.
I highly recommend you take a look at how to debug small programs. Using a debugger to step through your code is immensely helpful because you can see how each line affects the values of your variables. In this case, if you'd looked at what rdname.readlines() returns in each iteration, it would be obvious why you got the IndexError: finding out that readlines() returns an empty list on the second call would point you in the direction of the answer.
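To see the exhaustion for yourself, here is a minimal sketch. It assumes io.StringIO as a stand-in for names.dat, and the sample names are invented. It also uses random.choice, which is an alternative to indexing with randrange and not what the answers above use.

```python
import io
import random

# io.StringIO stands in for open("names.dat"); the names are invented sample data.
rdname = io.StringIO("Alice\nBob\nCarol\n")

first = rdname.readlines()   # reads everything up to the end of the file
second = rdname.readlines()  # the position is already at the end: nothing left

print(first)   # ['Alice\n', 'Bob\n', 'Carol\n']
print(second)  # []

# Read once, then pick as often as needed from the list held in memory.
names = [line.strip() for line in first]
picks = [random.choice(names) for _ in range(6)]
print(picks)
```

Because random.choice works on whatever list it is given, there is no line count to hardcode or keep in sync with the file.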

Related

Why do I get an IndexError after the program completes its first step successfully?

I tried to make a sorter that deletes duplicate IPs from the first list and saves it to a file, but after the first successful round it gives me IndexError: list index out of range.
I expected a normal sorting process, but it doesn't work.
Code:
ip1 = open('hosts', 'r')
ip2 = open('rotten', 'r')
ipList1 = [line.strip().split('\n') for line in ip1]
ipList2 = [line.strip().split('\n') for line in ip2]
for i in range(len(ipList1)):
    for a in range(len(ipList2)):
        if(ipList1[i] == ipList2[a]):
            print('match')
            del(ipList1[i])
            del(ipList2[a])
            i -= 1
            a -= 1
c = open('end', 'w')
for d in range(len(ipList1)):
    c.write(str(ipList1[d]) + '\n')
c.close()
You're deleting from the list while iterating over it, that's why you're getting an IndexError.
This is easier to do with sets:
with open('hosts') as ip1, open('rotten') as ip2:
    # strip() already removes the trailing newline; keeping each entry a
    # plain string (not a list from split()) lets it go into a set
    ipList1 = set(line.strip() for line in ip1)
    ipList2 = set(line.strip() for line in ip2)
good = ipList1 - ipList2
with open('end', 'w') as c:
    for d in good:
        c.write(d + '\n')
You changed the lists on the fly. Say the for loop starts with a list of 5 elements: if you remove 4 of them during the first iteration, the second iteration tries to access the second element, but it no longer exists.
If you need to preserve the ordering, you can use a list comprehension:
ips = [ip for ip in ipList1 if ip not in set(ipList2)]
If you don't, just use the set expression.
You should never modify a list that you are currently iterating over.
A fix would be to make a third list that stores the non-duplicates. Another way would be to use sets and subtract one from the other, although I do not know whether you want to keep duplicates within a single list. Also, the way you are doing it right now, a duplicate is only found if it's at the same index.
ip1 = open('hosts', 'r')
ip2 = open('rotten', 'r')
ipList1 = [line.strip().replace('\n', '') for line in ip1]
ipList2 = [line.strip().replace('\n', '') for line in ip2]
ip1.close()
ip2.close()
newlist = []
# loop over the lists, not the (now closed) file objects
for v in ipList1:
    if v not in ipList2:
        newlist.append(v)
c = open('end', 'w')
c.write('\n'.join(newlist))
c.close()
Other answers focus on deleting from a container while iterating over it. While that’s generally a bad idea, it’s not the crux of the problem here because you have (unpythonically) set up the for loops to use a sequence of indices, and so you aren’t strictly speaking iterating over the lists themselves anyway.
No, the problem here is that i-=1 and a-=1 have no effect: when a for loop begins a new iteration, it doesn’t work off of the previous value of the index. It just takes the next value that it was always destined to take, from the iterator that you established at the beginning (in your case, the output of range())
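A small sketch of that point, using invented sample data: assigning to the loop variable inside a for loop does not change what the next iteration receives, whereas a while loop re-reads the variable on every pass.

```python
indices_seen = []
for i in range(3):
    indices_seen.append(i)
    i -= 1  # rebinds the name i for the rest of this iteration only

print(indices_seen)  # [0, 1, 2]

# A while loop checks the variable again on each pass, so adjusting it matters.
items = ["a", "b", "b", "c"]
i = 0
while i < len(items):
    if items[i] == "b":
        del items[i]  # the next element slides into position i, so don't advance
    else:
        i += 1

print(items)  # ['a', 'c']
```

If index bookkeeping like this is really needed, the while form is the one that honours it; in most cases building a new list is simpler.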

How to export ints to a "txt" file and then at a later date be able to import them back as ints

Exporting the data:
num = 0
exportData = open("results_file.txt", "a")
while num < len(runs) - 1:
    exportData.write(str(runs[num]) + "\n")
    num = num + 1
exportData.close()
Importing the data into the new file:
runs = []
num = 1
count = len(open("results_file.txt").readlines( ))
print(count)
importData = open("results_file.txt", "r")
while num < count:
    runs.append(importData.read(num))
    print(importData.read(num))
    num = num + 1
importData.close()
My goal is to export the array of integers to a file (can be something else than a txt file for all I care) and then to import them at a later date into a new file and use them there as integers (performing mathematical operations on them)
The error that I'm getting (on line 28 I'm trying to use the first number in the array for a mathematical calculation):
line 28, in if runs[num] < 7: TypeError: '<' not supported between instances of 'str' and 'int'
runs = []
num = 1
count = len(open("results_file.txt").readlines( ))
print(count)
importData = open("results_file.txt", "r")
while num < count:
    runs.append(int(importData.read(num)))
    print(importData.read(num))
    num = num + 1
importData.close()
Adding int() returns this error:
ValueError: invalid literal for int() with base 10: '4\n1'
You're not being pythonic, and many of the answers here aren't either. So, let me clean up your code a bit.
from ast import literal_eval
with open("results_file.txt", "a") as exportData:
    for run in runs:
        exportData.write(str(run) + "\n")
runs = []
with open("results_file.txt", "r") as importData:
    runs.extend([literal_eval(x) for x in importData])
I'll break this down line by line:
from ast import literal_eval is the safest way to parse things that are strings into python objects. It's better than using a plain old eval because it won't run arbitrary code. We'll use this function to read the data later.
with open(...) as ... is the best way to open a file. It contains the file-object within a single scope and makes sure the file is closed even if an error occurs. Look this one up in PEP 343.
for ... in ... The loops that you're using are not pythonic at all. The pythonic way is to use iterators: no need to count lines and declare variables to keep track... the python objects keep track of themselves. (If you need a counter, I highly recommend that you look up enumerate().)
exportData.write(str(run) + "\n") The only change here is that with the pythonic for loop there's no need to index the runs list anymore.
runs = [] I think you know what this is, but I have to declare it outside of the with statement because if the with statement throws an error, and you were to catch it, runs will still be initialized.
I've already discussed with statements.
runs.extend([literal_eval(x) for x in importData]) Has two things going on. extend appends a list to a list... cool. The more interesting part here is the list comprehension. Here's a tutorial on list comprehensions. As soon as you get comfortable with the for loops, the list comprehension is the next pythonic step. For further pythonic enlightenment, this line could also be replaced with: runs.extend(map(literal_eval, importData))
That's it, 9 lines. Happy hacking.
The error you are experiencing is most likely due to the fact you're trying to compare a string with an integer. Try doing
runs = []
num = 1
count = len(open("results_file.txt").readlines( ))
print(count)
importData = open("results_file.txt", "r")
while num < count:
    runs.append(int(importData.read(num)))
    print(importData.read(num))
    num = num + 1
importData.close()
The main function/tool you're looking for is int(). For example:
>>> int('15')
15
>>> int('15') + 5
20
But you also can save yourself some real headaches by coding this differently. For example, you do not need to know the number of lines ahead of time. You can just call .readline() until it returns an empty string. Or even iterate over the file, and when it ends, the loop will exit.
Another tip, and this is just good practice, is to use the context manager for opening files. So instead of
file = open('myfile.txt')
for line in file:
    print(line)
you would do:
with open('myfile.txt') as file:
    for line in file:
        print(line)
The big advantage of the latter is that it will always make sure file is closed properly. This is especially helpful when writing to a file.
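Putting these tips together, here is a minimal round-trip sketch. The sample values and the temporary directory are assumptions made for illustration; only the file name results_file.txt comes from the question. Note that read(num) reads up to num characters rather than line number num, so the chunks can straddle line breaks, which would explain a value like '4\n1'. Iterating over the file yields one line at a time instead.

```python
import os
import tempfile

runs = [4, 12, 7, 30]  # invented sample data

# A temporary directory keeps the sketch from touching real files.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "results_file.txt")

    # Export: one integer per line.
    with open(path, "w") as export_file:
        for run in runs:
            export_file.write(str(run) + "\n")

    # Import: iterate over the lines; int() tolerates the trailing newline.
    with open(path) as import_file:
        loaded = [int(line) for line in import_file if line.strip()]

print(loaded)         # [4, 12, 7, 30]
print(loaded[0] < 7)  # True
```

The `if line.strip()` filter simply skips blank lines, which can appear if the file was appended to by hand.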

Python - program for searching for relevant cells in excel does not work correctly

I've written a code to search for relevant cells in an excel file. However, it does not work as well as I had hoped.
In pseudocode, this is what it should do:
Ask for input excel file
Ask for input textfile containing keywords to search for
Convert input textfile to list containing keywords
For each keyword in list, scan the excelfile
If the keyword is found within a cell, write it into a new excelfile
Repeat with next word
The code works, but some keywords are not found while they are present within the input excelfile. I think it might have something to do with the way I iterate over the list, since when I provide a single keyword to search for, it works correctly. This is my whole code: https://pastebin.com/euZzN3T3
This is the part I suspect is not working correctly. Splitting the textfile into a list works fine (I think).
#IF TEXTFILE
elif btext == True:
    #Split each line of textfile into a list
    file = open(txtfile, 'r')
    #Keywords in list
    for line in file:
        keywordlist = file.read().splitlines()
    nkeywords = len(keywordlist)
    print(keywordlist)
    print(nkeywords)
    #Iterate over each string in list, look for match in .xlsx file
    for i in range(1, nkeywords):
        nfound = 0
        ws_matches.cell(row = 1, column = i).value = str.lower(keywordlist[i-1])
        for j in range(1, worksheet.max_row + 1):
            cursor = worksheet.cell(row = j, column = c)
            cellcontent = str.lower(cursor.value)
            if match(keywordlist[i-1], cellcontent) == True:
                ws_matches.cell(row = 2 + nfound, column = i).value = cellcontent
                nfound = nfound + 1
and my match() function:
def match(keyword, content):
"""Check if the keyword is present within the cell content, return True if found, else False"""
if content.find(keyword) == -1:
return False
else:
return True
I'm new to Python so my apologies if the way I code looks like a warzone. Can someone help me see what I'm doing wrong (or could be doing better?)? Thank you for taking the time!
Splitting the textfile into a list works fine (I think).
This is something you should actually test (hint: print the list and compare it with the contents of the file; wrapping file.read() in a for loop over the same file is at best inelegant). The best way to make easily testable code is to isolate functional units into separate functions, i.e. you could make a function that takes the name of a text file and returns a list of keywords. Then you can easily check if that bit of code works on its own. A more pythonic way to read lines from a file (which is what you do, assuming one word per line) is as follows:
with open(filename) as f:
    # splitlines() drops the trailing newline that readlines() would keep
    keywords = f.read().splitlines()
The rest of your code may actually work better than you expect. I'm not able to test it right now (and don't have your spreadsheet to try it on anyway), but if you're relying on nfound to give you an accurate count for all keywords, you've made a small but significant mistake: it's set to zero inside the loop, and thus you only get a count for the last keyword. Move nfound = 0 outside the loop.
In Python, the way to iterate over lists - or just about anything - is not to increment an integer and then use that integer to index the value in the list. Rather loop over the list (or other iterable) itself:
for keyword in keywordlist:
    ...
As a hint, you shouldn't need nkeywords at all.
I hope this gets you on the right track. When asking questions in future, it'd be a great help to provide more information about what goes wrong, and preferably enough to be able to reproduce the error.
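As a rough sketch of the advice above, the keyword loading can live in its own small function that is easy to test. The function name load_keywords, the io.StringIO stand-in for the text file and the sample words are all assumptions for illustration. The sketch also shows two things that may be worth checking in the question's code under Python 3: calling read() inside a for loop over the same file only returns what follows the first line, and range(1, n) yields n - 1 values.

```python
import io

def load_keywords(handle):
    """Return the non-empty lines of an open text file, lower-cased."""
    return [line.strip().lower() for line in handle if line.strip()]

# io.StringIO stands in for the keyword text file; the words are invented.
sample = "Alpha\nBeta\nGamma\n"

keywords = load_keywords(io.StringIO(sample))
print(keywords)  # ['alpha', 'beta', 'gamma']

# Calling read() inside a for loop over the same handle: the loop has
# already consumed the first line, so read() only returns the remainder.
handle = io.StringIO(sample)
for line in handle:
    remainder = handle.read().splitlines()
print(remainder)  # ['Beta', 'Gamma']

# range(1, n) yields n - 1 values, so the last keyword is never visited.
visited = [keywords[i - 1] for i in range(1, len(keywords))]
print(visited)  # ['alpha', 'beta']

# enumerate gives a 1-based column number without any index arithmetic.
columns = list(enumerate(keywords, start=1))
for column, keyword in columns:
    print(column, keyword)
```

With enumerate(..., start=1) the column number and the keyword arrive together, so neither nkeywords nor keywordlist[i-1] is needed.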

If-statement seemingly ignored by Write operation

What I am trying to do here is write the latitude and longitude of a pokemon sighting to a text file if it isn't already there. Since I am using an infinite loop, I added an if-statement that prevents an already existing pair of coordinates from being added.
Note that I also have a list Coordinates that stores the same information. The list works, as no repeats are added (I checked). However, the text file has the same coordinates appended over and over again, even though it theoretically shouldn't, as the write is contained within the same if-block as the list update.
import requests
pokemon_url = 'https://pogo.appx.hk/top'
while True:
    response = requests.get(pokemon_url)
    response.raise_for_status()
    pokemon = response.json()[0:]
    Sighting = 0
    Coordinates = [None] * 100
    for num in range(len(pokemon)):
        if pokemon[num]['pokemon_name'] == 'Aerodactyl':
            Lat = pokemon[num]['latitude']
            Long = pokemon[num]['longitude']
            if (Lat, Long) not in Coordinates:
                Coordinates[Sighting] = (Lat, Long)
                file = open("aerodactyl.txt", "a")
                file.write(str(Lat) + "," + str(Long) + "\n")
                file.close()
                Sighting += 1
For clarity purposes, this is the output
You need to put your Sighting and Coordinates variables outside of the while loop if you do not want them to reset on every iteration.
However, there are a lot more things wrong with the code. Without trying it, here's what I spot:
You have no exit condition for the while loop. Please don't do this to the poor website. You'll essentially be spamming requests.
file.close should be file.close(), but overall you should only need to open the file once, not on every single iteration of the loop. Open it once, and close once you're done (assuming you will add an exit condition).
Slicing from 0 (response.json()[0:]) is unnecessary. By default the list starts at index 0. This may be a convoluted way to get a new list, but that seems unnecessary here.
Coordinates should not be a hard-coded list of 100 Nones. Just use a set to track existing coordinates.
Get rid of Sighting altogether. It doesn't make sense if you're re-issuing the request over and over again. If you want to iterate through the pokémon from one response, use enumerate if you need the index.
It's generally good practice to use snake case for Python variables.
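The points above could be sketched roughly as follows. This is not the answerer's code: the helper name new_sightings and the hand-written sample responses are assumptions, and no network request is made. Only the field names pokemon_name, latitude and longitude are taken from the question.

```python
def new_sightings(pokemon_list, name, seen):
    """Return (lat, lon) pairs for `name` not yet in `seen`, recording them as seen."""
    fresh = []
    for entry in pokemon_list:
        if entry['pokemon_name'] != name:
            continue
        coords = (entry['latitude'], entry['longitude'])
        if coords not in seen:
            seen.add(coords)
            fresh.append(coords)
    return fresh

seen = set()  # created once, outside any polling loop, so it is never reset

# Two hand-written "responses" stand in for repeated calls to the API.
response_1 = [
    {'pokemon_name': 'Aerodactyl', 'latitude': 22.3, 'longitude': 114.1},
    {'pokemon_name': 'Pidgey', 'latitude': 22.4, 'longitude': 114.2},
]
response_2 = [
    {'pokemon_name': 'Aerodactyl', 'latitude': 22.3, 'longitude': 114.1},
    {'pokemon_name': 'Aerodactyl', 'latitude': 22.5, 'longitude': 114.3},
]

lines = []
for response in (response_1, response_2):
    for lat, lon in new_sightings(response, 'Aerodactyl', seen):
        lines.append("{},{}".format(lat, lon))

print(lines)  # ['22.3,114.1', '22.5,114.3']
```

In a real script the collected lines would be written to aerodactyl.txt from a file opened once, and the polling loop would need an exit condition and a delay between requests.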
Try this:
#!/usr/bin/env python
import urllib2  # Python 2 only
import json
pokemon_url = 'https://pogo.appx.hk/top'
pokemon = urllib2.urlopen(pokemon_url)
pokeset = json.load(pokemon)
Coordinates = [None] * 100
for num in range(len(pokeset)):
    if pokeset[num]['pokemon_name'] == 'Aerodactyl':
        Lat = pokeset[num]['latitude']
        Long = pokeset[num]['longitude']
        if (Lat, Long) not in Coordinates:
            Coordinates.append((Lat, Long))
            file = open("aerodactyl.txt", "a")
            file.write(str(Lat) + "," + str(Long) + "\n")
            file.close()  # call the method; a bare file.close does nothing

Python - 'str' object has no attribute 'append'

I've searched this error on here, but haven't seen anything that yet matches my situation (disclaimer, I'm still getting used to Python).
import os
os.chdir("C:\Projects\Rio_Grande\SFR_Checking") # set working directory
stressPeriod = 1
segCounter = 1
inFlow = 0
outFlow = 0
with open(r"C:\Projects\streamflow.dat") as inputFile:
    inputList = list(inputFile)
while stressPeriod <= 1:
    segCounter = 1
    lineCounter = 1
    outputFile = open("stats.txt", 'w') # Create the output file
    for lineItem in inputList:
        if (((stressPeriod - 1) * 11328) + 8) < lineCounter <= (stressPeriod * 11328):
            lineItem = lineItem.split()
            if int(lineItem[3]) == int(segCounter) and int(lineItem[4]) == int(1):
                inFlow = lineItem[5]
                outFlow = lineItem[7]
                lineItemMem = lineItem
            elif int(lineItem[3]) == int(segCounter) and int(lineItem[4]) <> int(1):
                outFlow = lineItem[7]
            else:
                gainLoss = str(float(outFlow) - float(inFlow))
                lineItemMem.append(gainLoss)
                lineItemMem = ','.join(lineItemMem)
                outputFile.write(lineItemMem + "\n") # write # lines to file
                segCounter += 1
                inFlow = lineItem[5]
                outFlow = lineItem[7]
        lineCounter += 1
    outputFile.close()
So basically this program is supposed to read a .dat file and parse out bits of information from it. I split each line of the file into a list to do some math on it (math operations are between varying lines in the file, which adds complexity to the code). I then append a new number to the end of the list for a given line, and that's where things inexplicably break down. I get the following error:
Traceback (most recent call last):
  File "C:/Users/Chuck/Desktop/Python/SFR/SFRParser2.py", line 49, in <module>
    lineItemMem.append(gainLoss)
AttributeError: 'str' object has no attribute 'append'
When I give it a print command to test that lineItemMem is actually a list and not a string, it prints a list for me. If I put in code for lineItemMem.split(",") to break the string, I get an error saying that list object has no attribute split. So basically, when I try to do list operations, the error says it's a string, and when I try to do string operations, the error says it's a list. I've tried a fair bit of mucking around, but frankly can't tell what the problem is here. Any insight is appreciated, thanks.
I think the issue has to do with these lines:
lineItemMem.append(gainLoss)
lineItemMem = ','.join(lineItemMem)
Initially lineItemMem is a list, and you can append an item to the end of it. However, the join call you're doing turns the list into a string. That means the next time this part of the code runs, the append call will fail.
I'm not certain exactly what the best solution is. Perhaps you should use a different variable for the string version? Or maybe after you join the list items together into a single string and write that result out, you should reinitialize the lineItemMem variable to a new empty list? You'll have to decide what works best for your actual goals.
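One of those options, using a different variable for the string version, might look like the sketch below. The sample rows and the names fields, csv_line and output_lines are invented for illustration and are not taken from the question's data file.

```python
# Each row: segment id, inflow, outflow (invented sample values).
rows = [["1", "10.0", "12.5"], ["2", "8.0", "7.25"]]

output_lines = []
for fields in rows:
    gain_loss = str(float(fields[2]) - float(fields[1]))
    fields.append(gain_loss)     # fields is still a list here
    csv_line = ','.join(fields)  # the joined string gets its own name
    output_lines.append(csv_line)

print(output_lines)  # ['1,10.0,12.5,2.5', '2,8.0,7.25,-0.75']
```

Because the list and the string never share a name, append always sees a list, no matter how many times the loop body runs.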
There are two places where lineItemMem is set. The first is this:
lineItem = lineItem.split()
# ...
lineItemMem = lineItem
where it is set to the result of a split operation, i.e. a List.
The second place is this:
lineItemMem = ','.join(lineItemMem)
here, it is set to the result of a join operation, i.e. a String.
So, the reason why the error sometimes states that it is a string and sometimes a list is that this is actually the case, depending on the conditions in the if statement.
The code, as presented, is imho near undebuggable. Instead of tinkering, it would be a better approach to think about the different goals that should be achieved (reading a file, parsing the content, formatting the data, writing it to another file) and tackle them individually.
