Just started learning Python and I'm struggling with this a little.
I'm opening a txt file that will be variable in length and I need to iterate over a user-definable number of lines at a time. When I get to the end of the file I receive the error in the subject field. I've also tried the readlines() function and a couple of variations on the "if" statement that causes the problem. I just can't seem to get the code to find EOF.
Hmm, as I write this, I'm thinking ... do I need to add "EOF" to the list and just look for that? Is a custom EOF marker the best solution?
My code snippet goes something like:
### variables defined outside of scapy PacketHandler ##
x = 0
B = 0
##########
with open('dict.txt') as f:
    lines = list(f)
global x
global B
B = B + int(sys.argv[3])
while x <= B:
    while y <= int(sys.argv[2]):
        if lines[x] != "":
            #...do stuff...
            # Scapy send packet Dot11Elt(ID="SSID", info="%s" % (lines[x].strip()))
            # ....more code...
            x = x + 1
Let’s say you need to read X lines at a time, put them in a list and process them:
with open('dict.txt') as f:
    enoughLines = True
    while enoughLines:
        lines = []
        for i in range(X):
            l = f.readline()
            if l != '':
                lines.append(l)
            else:
                enoughLines = False
                break
        if enoughLines:
            pass  # Do what has to be done with the list "lines"
        else:
            break
    # Do what needs to be done with the list "lines" that has fewer than X lines in it
Try a for-in loop. You have created your list; now iterate through it.
with open('dict.txt') as f:
    lines = list(f)
for item in lines:  # each item here is an item in the list you created
    print(item)
This way you go through each line of your text file and don't have to worry about where it ends.
Edit:
You can do this as well!
with open('dict.txt') as f:
    for row in f:
        print(row)
The following function is a generator that yields the next n lines of a file at a time:
def iter_n(obj, n):
    iterator = iter(obj)
    while True:
        result = []
        try:
            while len(result) < n:
                result.append(next(iterator))
        except StopIteration:
            if len(result) == 0:
                # On Python 3.7+ (PEP 479), re-raising StopIteration inside a
                # generator becomes a RuntimeError, so end with return instead
                return
        yield result
Here is how you can use it:
>>> with open('test.txt') as f:
...     for three_lines in iter_n(f, 3):
...         print(three_lines)
...
['first line\n', 'second line\n', 'third line\n']
['fourth line\n', 'fifth line\n', 'sixth line\n']
['seventh line\n']
Contents of test.txt:
first line
second line
third line
fourth line
fifth line
sixth line
seventh line
Note that, because the file does not have a multiple of 3 lines, the last value returned is not 3 lines, but just the rest of the file.
Because this solution uses a generator, it doesn't require that the full file be read into memory (into a list), but iterates over it as needed.
In fact, the above function can iterate over any iterable object, like lists, strings, etc:
>>> for three_numbers in iter_n([1, 2, 3, 4, 5, 6, 7], 3):
...     print(three_numbers)
...
[1, 2, 3]
[4, 5, 6]
[7]
>>> for three_chars in iter_n("1234567", 3):
...     print(three_chars)
...
['1', '2', '3']
['4', '5', '6']
['7']
If you want to get n lines in a list use itertools.islice yielding each list:
from itertools import islice
def yield_lists(f,n):
    with open(f) as f:
        for sli in iter(lambda : list(islice(f,n)),[]):
            yield sli
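To see what the two-argument form of iter is doing here: it calls the lambda repeatedly and stops as soon as the result equals the sentinel, the empty list. The sketch below is only an illustration. It uses io.StringIO as a stand-in for a real file so that it runs without dict.txt on disk, and the helper name chunks and the sample contents are made up:

```python
import io
from itertools import islice

def chunks(fileobj, n):
    """Yield lists of up to n lines from an already-open file-like object."""
    # iter(callable, sentinel) keeps calling the lambda until it returns []
    for block in iter(lambda: list(islice(fileobj, n)), []):
        yield block

# io.StringIO stands in for open('dict.txt'); the contents are invented
fake_file = io.StringIO("a\nb\nc\nd\ne\n")
result = list(chunks(fake_file, 2))
print(result)
```

The last chunk is simply shorter when the number of lines is not a multiple of n, so no explicit end-of-file check is needed.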
If you want to use loops, you don't need a while loop at all. Iterate over the file, and for each line use an inner loop over range(n-1) that calls next on the file object with a default value of an empty string. If we get an empty string, break out of the inner loop; otherwise append the line. Then yield each list:
def yield_lists(f,n):
    with open(f) as f:
        for line in f:
            temp = [line]
            for i in range(n-1):
                line = next(f,"")
                if not line:
                    break
                temp.append(line)
            yield temp
I am trying to write unique values to a csv that already has a list of ints inside it.
Currently I have tried to loop through a range of possible numbers then check if those numbers are in the csv. It appears that the checking is not working properly.
def generateUserCode():
    with open ('/MyLocation/user_codes.csv') as csvDataFile:
        userCodes = csv.reader(csvDataFile)
        for x in range(0, 201):
            if x not in userCodes:
                return x
def writeUserCode(userCode):
    with open ('/MyLocation/user_codes.csv', 'a') as csvDataFile:
        csvDataFile.write('\n' + str(userCode))
userCode = generateUserCode()
writeUserCode(userCode)
So it should print the first number not in the csv and add the number to the csv. However all it is doing is printing 0 and adding 0 to my csv every time it is run even if 0 is in the csv.
Update:
The csv looks something like this:
3
4
5
35
56
100
There are more values but it is generally the same with no repeats and values between 0 and 200
The problem is with the following line:
if x not in userCodes:
userCodes is not a list; it is a csv.reader object. Also, you should use
if str(x) not in line:
#use str(x) instead of x
This is the code that works for me:
import csv
def generateUserCode():
    with open ('file.csv') as csvDataFile:
        csvread = csv.reader(csvDataFile)
        userCodes = []
        #print(userCodes)
        for line in csvread:
            try:
                userCodes.append(line[0]) # As long as the code is the first
                                          # element in that line, it should work
            except IndexError:
                pass # Avoid blank lines
        print(userCodes)
        for x in range(0, 201):
            if str(x) not in userCodes:
                return x
def writeUserCode(userCode):
    with open ('file.csv', 'a') as csvDataFile:
        csvDataFile.write('\n' + str(userCode))
userCode = generateUserCode()
writeUserCode(userCode)
Iterating userCodes shows each item is a list of strings:
for x in userCodes:
    print(x)
returns:
['3']
['4']
['5']
['35']
['56']
['100']
So there are a lot of possible fixes, one would be:
def generateUserCode():
    with open ('/MyLocation/user_codes.csv') as csvDataFile:
        userCodes = csv.reader(csvDataFile)
        userCodes = [int(item[0]) for item in userCodes]
        for x in range(0, 201):
            if x not in userCodes:
                return x
It’s tricky to answer without seeing the CSV, but when you read the CSV, all fields are strings. Therefore you need to convert either the userCodes to int or x to str for the comparison to work.
For example:
userCodes = [int(d[0]) for d in csv.reader(csvDataFile)]
for x in range(0, 201):
    if x not in userCodes:
        return x
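To make the string-versus-int point concrete, here is a minimal sketch. It assumes the file holds one code per row, uses io.StringIO in place of the real CSV, and the helper name first_free_code is invented for illustration. It converts every row to int once, stores the values in a set, and skips blank rows (csv.reader yields an empty list for those):

```python
import csv
import io

def first_free_code(fileobj, upper=201):
    """Return the smallest int in range(upper) not present in the file, or None."""
    # Each row from csv.reader is a list of strings, e.g. ['35']; skip blank rows
    used = {int(row[0]) for row in csv.reader(fileobj) if row}
    return next((x for x in range(upper) if x not in used), None)

# Stand-in for the real file; the values are made up
sample = io.StringIO("0\n1\n3\n\n35\n")
free = first_free_code(sample)
print(free)
```

Returning None when every code is taken is just one possible choice; raising an exception would work as well.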
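You are checking whether a value is in an instance of csv.reader. A membership test like this doesn't do what you expect even with a normal file handle, because in iterates over the lines and compares each whole line to the value:
with open('somefile.txt') as fh:
    x = fh.read()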
x
'Spatial Reference: 43006\nName: Jones Tract\n424564.620666, 4396443.55267\n425988.30892, 4395630.01652\n426169.09473, 4395426.63249\n426214.291182, 4395268.4449\n\nName: Lewis Tract\n427909.158152, 4393935.14955\n428587.104939, 4393731.76552\n428700.096071, 4393528.38148\n428745.292523, 4393347.59567\n\nName: Adams Tract\n424180.450819, 4393957.74778\n424361.236629, 4393709.16729\n424655.013571, 4393641.37261\n424858.397607, 4393776.96197\n'
# now check if 'e' is in fh
with open('somefile.txt') as fh:
    'e' in fh
False
'e' in x
True
Also, your csv file isn't really a csv file, so I'd just use a normal file handle and ignore the csv entirely.
The better approach may be to aggregate your codes in a set and check from there:
def get_codes():
    with open('user_codes.csv') as fh:
        # return a set to test membership quickly
        return {line.strip() for line in fh}
codes = get_codes()
def add_code(code):
    code = str(code)  # codes are read from the file as strings
    if code not in codes:
        codes.add(code)
        with open('user_codes.csv', 'a') as fh:
            fh.write('\n' + code)
    else:
        raise ValueError("Code already exists")
        # or do something else
add_code(88)
add_code(88)
# ValueError
To generate a user code automatically, since you are using a range, this becomes relatively easy:
def generate_user_code():
    try:
        # this returns the first number not in codes
        # (codes holds strings, so compare against str(i))
        return next(i for i in range(201) if str(i) not in codes)
    except StopIteration:
        # you've exhausted your range, nothing is left
        raise ValueError("No unique codes available")
# and your write method can be changed to
def add_code(code):
    with open('user_codes.csv', 'a') as fh:
        codes.add(str(code))
        fh.write('\n' + str(code))
codes = get_codes()
user_code = generate_user_code()
add_code(user_code)
You may try to do this:
....
userCodes = csv.reader(csvDataFile)
uc = []
for y in userCodes:
    uc += y
for x in range(0, 201):
    if str(x) not in uc:
        return x
....
I am trying to figure out if it is possible to access the elements of a list around the element you are currently at. I have a list that is large (20k+ lines) and I want to find every instance of the string 'Name'. Additionally, I also want to get +/- 5 elements around each 'Name' element. So 5 lines before and 5 lines after. The code I am using is below.
search_string = 'Name'
with open('test.txt', 'r') as infile, open ('textOut.txt','w') as outfile:
    for line in infile:
        if search_string in line:
            outfile.writelines([line, next(infile), next(infile),
                                next(infile), next(infile), next(infile)])
Getting the lines after the occurrence of 'Name' is pretty straightforward, but figuring out how to access the elements before it has me stumped. Does anyone have any ideas?
20k lines isn't that much. If it's OK to read all of them into a list, we can take slices around the index where a match is found, like this:
with open('test.txt', 'r') as infile, open('textOut.txt','w') as outfile:
    lines = infile.readlines()  # keep the trailing newlines so writelines() emits one line per element
    n = len(lines)
    for i in range(n):
        if search_string in lines[i]:
            start = max(0, i - 5)
            end = min(n, i + 6)
            outfile.writelines(lines[start:end])
You can use the function enumerate, which lets you iterate over both the elements and their indexes.
Example accessing the elements 5 indexes before and after the current element:
n = len(l)
for i, x in enumerate(l):
    print(l[max(i-5, 0)]) # Clamp at 0 so a negative index doesn't wrap around to the end of the list
    print(x)
    print(l[min(i+5, n-1)]) # Prevent overflow
You need to keep track of the index of where in the list you currently are.
So something like:
# Read the file into list_of_lines
index = 0
while index < len(list_of_lines):
    if list_of_lines[index] == 'Name':
        print(list_of_lines[index - 1]) # This is the previous line
        print(list_of_lines[index + 1]) # This is the next line
        # And so on...
    index += 1
Let's say you have your lines stored in your list:
lines = ['line1', 'line2', 'line3', 'line4', 'line5', 'line6', 'line7', 'line8', 'line9']
You could define a function that yields every run of n consecutive elements, as a generator:
def each_cons(iterable, n = 2):
    if n < 2: n = 1
    i, size = 0, len(iterable)
    while i < size-n+1:
        yield iterable[i:i+n]
        i += 1
Then, just call the function. To show the content I'm calling list on it, but you can iterate over it:
lines_by_3_cons = each_cons(lines, 3) # or any number of lines, 5 in your case
print(list(lines_by_3_cons))
#=> [['line1', 'line2', 'line3'], ['line2', 'line3', 'line4'], ['line3', 'line4', 'line5'], ['line4', 'line5', 'line6'], ['line5', 'line6', 'line7'], ['line6', 'line7', 'line8'], ['line7', 'line8', 'line9']]
I personally loved this problem. The other answers all read the whole file into memory; I think the code below is more memory-efficient, since it only keeps a sliding window of lines.
Here, check this out!
myfile = open('infile.txt')
stack_print_moments = []
expression = 'MYEXPRESSION'
neighbourhood_size = 5
def print_stack(stack):
    for line in stack:
        print(line.strip())
    print('-----')
current_stack = []
for index, line in enumerate(myfile):
    current_stack.append(line)
    if len(current_stack) > 2 * neighbourhood_size + 1:
        current_stack.pop(0)
    if expression in line:
        stack_print_moments.append(index + neighbourhood_size)
    if index in stack_print_moments:
        print_stack(current_stack)
last_index = index
# Flush matches whose trailing context runs past the end of the file
for index in range(last_index + 1, last_index + neighbourhood_size + 1):
    if current_stack:
        current_stack.pop(0)
    if index in stack_print_moments:
        print_stack(current_stack)
More advanced code is here: Github link
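Another way to keep memory use small is collections.deque with a maxlen, which drops the oldest line automatically. The following is a rough sketch rather than a drop-in replacement: the function name grep_context and the sample data are invented, io.StringIO stands in for the real file, and overlapping contexts are emitted only once (similar to how grep -C behaves) instead of being repeated per match:

```python
import io
from collections import deque

def grep_context(lines, needle, before=5, after=5):
    """Yield each line containing needle plus its surrounding context lines."""
    history = deque(maxlen=before)  # most recent lines not yet emitted
    remaining_after = 0             # trailing lines still owed to the last match
    for line in lines:
        if needle in line:
            yield from history      # context before the match
            history.clear()
            yield line
            remaining_after = after
        elif remaining_after > 0:
            yield line
            remaining_after -= 1
        else:
            history.append(line)

# Ten invented lines, with the sixth one replaced by the search string
text = "".join(f"line{i}\n" for i in range(10)).replace("line5", "Name")
picked = list(grep_context(io.StringIO(text), "Name", before=2, after=1))
print(picked)
```

Because it only ever holds `before` lines, this works on any iterable of lines, including a file object that is much larger than memory.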
I am trying to read a file, collect some lines, batch process them and then post process the result.
Example:
with open('foo') as input:
    line_list = []
    for line in input:
        line_list.append(line)
        if len(line_list) == 10:
            result = batch_process(line_list)
            # something to do with result here
            line_list = []
    if len(line_list) > 0:  # very probably the total number of lines is not a multiple of 10, e.g. 11
        result = batch_process(line_list)
        # something to do with result here
I do not want to duplicate the batch invocation and post-processing, so I want to know if I could dynamically add some content to input, e.g.
with open('foo') as input:
    line_list = []
    # input.append("THE END")
    for line in input:
        if line != 'THE END':
            line_list.append(line)
        if len(line_list) == 10 or line == 'THE END':
            result = batch_process(line_list)
            # something to do with result here
            line_list = []
That way I would not have to duplicate the code in the if branch. Or is there a better way to know that I've reached the last line?
If your input is not too large and fits comfortably in memory, you can read everything into a list, slice the list into sub-lists of length 10 and loop over them.
k = 10
with open('foo') as input:
    lines = input.readlines()
    slices = [lines[i:i+k] for i in range(0, len(lines), k)]
    for slice in slices:
        batch_process(slice)
If you want to append a mark to the input lines, you also have to read all lines first.
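If the file is too large to read in one go, a generator can hand out the batches lazily, so the batch call and the post-processing are still written only once. This is a sketch under assumptions: batched_lines is a made-up helper name, and batch_process here is a hypothetical stand-in for the real processing step.

```python
from itertools import islice

def batched_lines(iterable, size):
    """Yield lists of up to `size` items; the last list may be shorter."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch

def batch_process(batch):
    # Hypothetical stand-in for the real processing step
    return len(batch)

# range(11) stands in for the 11 lines of the example file
results = [batch_process(b) for b in batched_lines(range(11), 10)]
print(results)
```

On Python 3.12 and later, itertools.batched does much the same thing, except that it yields tuples instead of lists.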
I recently started programming and I wanted to sort a file, but in the end, this code only returns one line, even though the text file has 65 lines...
f = open(".\\test.txt")
g, u = [], []
a = 0
for i, line in enumerate(f):
    a += 1
    if i%2 == 0:
        g.append(f.readlines()[i])
        print(i),
    elif i%2 == 1:
        u.append(f.readlines()[i])
        print(i),
print(u),
print(g)
Your for loop starts reading the file line by line. But then its contents go and read the rest of the file in a single readlines call; after that, there's nothing more to be read! So you end up with the first line in line and the second line in g, since you only kept one of the lines that readlines() read.
open(filename) gives you an iterator over the lines of a file. This iterator is exhausted after all the lines have been read once, so any call to readlines after the first one gives you an empty list.
Demo:
>>> with open('testfile.txt') as f:
...     a = f.readlines()
...     b = f.readlines()
...     a
...     b
...
['hello\n', 'stack\n', 'overflow\n']
[]
You have to read the lines once and then index into that list:
lines = f.readlines()
for i, line in enumerate(lines):
    a += 1
    if i%2 == 0:
        g.append(lines[i])
        print(i),
    elif i%2 == 1:
        u.append(lines[i])
        print(i),
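Since the goal is only to separate the even-indexed lines from the odd-indexed ones, slicing the list from a single readlines() call may be simpler than looping. A small sketch, with io.StringIO standing in for the real file and invented contents:

```python
import io

# io.StringIO stands in for open("test.txt"); the contents are invented
f = io.StringIO("l0\nl1\nl2\nl3\nl4\n")
lines = f.readlines()   # read once, then work on the list
g = lines[0::2]         # lines at even indexes: 0, 2, 4, ...
u = lines[1::2]         # lines at odd indexes: 1, 3, ...
leftover = f.readlines()  # the file is exhausted now, so this is empty
print(g)
print(u)
```

The step of 2 in the slice does the same job as the i%2 test in the loop.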
So I have a text file consisting of one column; each row consists of two numbers:
190..255
337..2799
2801..3733
3734..5020
5234..5530
5683..6459
8238..9191
9306..9893
I would like to discard the very first and the very last number (in this case, 190 and 9893) and basically move the rest of the numbers one spot forward, like this.
My desired output:
255..337
2799..2801
3733..3734
5020..5234
5530..5683
6459..8238
9191..9306
I hope that makes sense. I'm not sure how to approach this.
lines = """190..255
337..2799
2801..3733"""
values = [int(v) for line in lines.split() for v in line.split('..')]
# values = [190, 255, 337, 2799, 2801, 3733]
pairs = zip(values[1:-1:2], values[2:-1:2])
# pairs = [(255, 337), (2799, 2801)]
out = '\n'.join('%d..%d' % pair for pair in pairs)
# out = "255..337\n2799..2801"
Try this:
with open(filename, 'r') as f:
    lines = f.readlines()
numbers = []
for row in lines:
    numbers.extend(row.strip().split('..'))  # strip the trailing newline first
numbers = numbers[1:len(numbers)-1]
newLines = ['..'.join(numbers[idx:idx+2]) for idx in range(0, len(numbers), 2)]
with open(filename, 'w') as f:
    for line in newLines:
        f.write(line)
        f.write('\n')
Try this:
Read all of them into one list, split each line into two numbers, so you have one list of all your numbers.
Remove the first and last item from your list
Write out your list, two items at a time, with dots in between them.
Here's an example:
a = """190..255
337..2799
2801..3733
3734..5020
5234..5530
5683..6459
8238..9191
9306..9893"""
a_list = a.replace('..','\n').split()
b_list = a_list[1:-1]
b = ''
for i in range(len(b_list) // 2):
    b += '..'.join(b_list[2*i:2*i+2]) + '\n'
temp = []
with open('temp.txt') as ofile:
    for x in ofile:
        temp.append(x.rstrip("\n"))
for x in range(0, len(temp) - 1):
    print(temp[x].split("..")[1] + ".." + temp[x+1].split("..")[0])
Maybe this will help:
def makeColumns(listOfNumbers):
    n = 0
    while n < len(listOfNumbers):
        # sep='' keeps the output in the same 190..255 form as the input
        print(listOfNumbers[n], '..', listOfNumbers[n+1], sep='')
        n += 2
def trim(listOfNumbers):
    listOfNumbers.pop(0)
    listOfNumbers.pop()
listOfNumbers = [190, 255, 337, 2799, 2801, 3733, 3734, 5020, 5234, 5530, 5683, 6459, 8238, 9191, 9306, 9893]
makeColumns(listOfNumbers)
print()
trim(listOfNumbers)
makeColumns(listOfNumbers)
I think this might be useful too. I am reading data from a file named list.
data = open("list","r")
temp = []
value = []
print(data)
for line in data:
    temp = line.split("..")
    value.append(temp[0])
    value.append(temp[1])
for i in range(1,(len(value)-1),2):
    print(value[i].strip()+".."+value[i+1])
print(value)
After reading the data I split each line and store it in the temporary list. After that, I copy the data to the main list, value, which holds all of the data. Then I iterate from the second element to the second-to-last element to get the output of interest. The strip function is used to remove the '\n' character from the value.
You can later write these values to a file instead of printing them out.
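The flatten, trim and re-pair idea used by several of the answers above can also be wrapped in one small function that returns a string, which makes it easy to either print or write to a file. This is only a sketch; the name shift_ranges is invented, and it assumes every row has exactly the form a..b:

```python
def shift_ranges(text):
    """Drop the first and last number and re-pair the rest as 'a..b' lines."""
    # Flatten all rows into one list of number strings
    numbers = [n for line in text.split() for n in line.split("..")]
    inner = numbers[1:-1]
    # Re-pair neighbours: (1st, 2nd), (3rd, 4th), ...
    return "\n".join("..".join(inner[i:i + 2]) for i in range(0, len(inner), 2))

sample = "190..255\n337..2799\n2801..3733\n"
shifted = shift_ranges(sample)
print(shifted)
```

For the three sample rows this prints 255..337 and 2799..2801, matching the first two rows of the desired output in the question.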