Making lists from data in a file in Python

I'm really new to Python and need help making a list from data in a file. The file contains numbers on separate lines (separated by "\n"; this is something I don't want to change to CSV). The amount of numbers saved can change at any time, because the data is saved to the file as follows:
Program 1:
# creates a new file for writing
numbersFile = open('numbers.txt', 'w')
# determines how many times the loop will iterate
totalNumbers = input("How many numbers would you like to save in the file? ")
# loop to get numbers
count = 0
while count < totalNumbers:
    number = input("Enter a number: ")
    # writes number to file
    numbersFile.write(str(number) + "\n")
    count = count + 1
This is the second program that uses that data. This is the part that is messy and that I'm unsure of:
Program 2:
maxNumbers = input("How many numbers are in the file? ")
numFile = open('numbers.txt', 'r')
total = 0
count = 0
while count < maxNumbers:
    total = total + numbers[count]
    count = count + 1
I want to use the data gathered from program 1 to get a total in program 2. I wanted to put it in a list because the amount of numbers can vary. This is for an introduction to computer programming class, so I need a SIMPLE fix. Thank you to all who help.

Your first program is fine, although in Python 2 you should use raw_input() instead of input() for the numbers you write to the file (which also makes it unnecessary to call str() on the result). In Python 3, input() already returns a string.
Your second program has a small problem: You're not actually reading anything from the file. Fortunately, that's easy in Python. You can iterate over the lines in a file using
for line in numFile:
    # line now contains the current line, including a trailing \n, if present
so you don't need to ask for the total of numbers in your file at all.
If you want to add the numbers, don't forget to convert the string line to an int first:
total += int(line) # shorthand for total = total + int(line)
There remains one potential problem (thanks @tobias_k!): if the file contains a blank line, line is just "\n", and int() raises a ValueError on it, so you could strip the line and check it first:
for line in numFile:
    if line.strip():
        total += int(line)
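
Putting the pieces above together, a minimal Python 3 sketch might look like the following. The function name sum_numbers is made up for illustration. It accepts any iterable of lines, so an open file works as well as the in-memory list used here.

```python
def sum_numbers(lines):
    """Add up the integers found one per line, skipping blank lines."""
    total = 0
    for line in lines:
        text = line.strip()  # drop the trailing "\n" and other whitespace
        if text:             # skip blank lines
            total += int(text)
    return total


# Example with in-memory lines; an open file behaves the same way:
#     with open('numbers.txt') as numFile:
#         print(sum_numbers(numFile))
print(sum_numbers(["3\n", "10\n", "\n", "7\n"]))  # prints 20
```

Because the loop runs until the file is exhausted, Program 2 no longer needs to ask how many numbers the file holds.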

Related

How to write numbers to a text file (line by line) and add all of them?

I'm making a Python version of a stock game. There are stock prices, and users can buy stocks. I need something that shows the average price of the purchases a user has made once they have bought more than 0 stocks. Stock prices are driven by a randomizer. That's the background. I think I can write a smaller piece of code first and then place it into the game's code. Here's that small code:
numA=int(input("Number 1: "))
numB=int(input("Number 2: "))
with open('average.txt','a') as avg:
    avg.write(str(numA))
    avg.write("\n")
    avg.write(str(numB))
    avg.write("\n")
How can I write code that reads the text file and sums numA and numB without doing something like total=numA+numB? I want the program to read the values from the text file because, as I said, I'm making a bigger program that includes loops, and I will use the text file as memory.
You can try this one:
import numpy as np
with open('average.txt','r') as avg:
    lines = avg.readlines()
Sum = np.sum([int(line.rstrip()) for line in lines], axis = 0)
Do you want something like this?
with open("New Text Document.txt", "r") as file:
    text = file.read()
nums = text.replace("\n", " ").split(" ")
nums = (int(i) for i in nums if i.isdigit())
total = sum(nums)
I basically opened the file in read mode and stored its contents in the text variable. Then I replaced all the newlines with spaces and split the string on spaces. Then I used a generator expression to check whether each element i consists only of digits and, if it does, converted it to an int. Finally I used sum to add up the numbers produced by the nums generator.
Or you could do something like this:
with open("file", "r") as file:
    try:
        total = sum(map(int, file))
    except ValueError:
        print("File has to contain only numbers.")
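
Since the question is ultimately about an average purchase price, here is a small hedged sketch that computes the mean of the stored values instead of the sum. The function name average_from_lines is hypothetical, and float() is used on the assumption that prices may not be whole numbers.

```python
def average_from_lines(lines):
    """Return the mean of the numbers found one per line, or None if there are none."""
    values = [float(line) for line in lines if line.strip()]  # skip blank lines
    if not values:
        return None  # nothing bought yet, so there is no average
    return sum(values) / len(values)


# With the file from the question this would be called as:
#     with open('average.txt') as avg:
#         print(average_from_lines(avg))
print(average_from_lines(["10\n", "20\n", "\n"]))  # prints 15.0
```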

MemoryError Python, in file 99999999 string

Windows 10 Pro 64-bit, 64-bit version of Python installed.
The file is 1.80 GB.
How can I fix this error and print all the strings?
def count():
    reg = open('link_genrator.txt', 'r')
    s = reg.readline().split()
    print(s)
reg.read().split('\n') will give a list of all lines.
Why don't you just do s = reg.read(65536).splitlines()? This will give you a hint on the structure of the content and you can then play with the size you read in a chunk.
Once you know a bit more, you can try to loop that line and sum up the number of lines.
After looking at the answers and trying to understand what the initial question could be, I have come to a more complete answer than my previous one.
Looking at the question and the code in the sample function, I now assume the following:
- it seems he wants to separate the contents of a file into words and print them
- from the function name, I suppose he would like to count all these words
- the whole file is quite big, and thus Python stops with a memory error
Handling such large files obviously calls for a different treatment than the usual one. For example, I do not see any use in printing all the separated words of such a file on the console. Of course it might make sense to count these words or search for patterns in them.
To show how one might treat such big files, I wrote the following example. It is meant as a starting point for further refinements and changes according to your own requirements.
MAXSTR = 65536
MAXLP = 999999999
WORDSEP = ';'
lineCnt = 0
wordCnt = 0
lpCnt = 0
fn = 'link_genrator.txt'
fin = open(fn, 'r')
try:
    while lpCnt < MAXLP:
        pos = fin.tell()
        s = fin.read(MAXSTR)
        lines = s.splitlines(True)
        if len(lines) == 0:
            break
        # count words of line
        k = 0
        for l in lines:
            lineWords = l.split(WORDSEP)  # semi-colon separates each word
            k += len(lineWords)  # sum up words of each line
        wordCnt += k - 1  # last word most probably not complete: subtract one
        # count lines
        lineCnt += len(lines)-1
        # correction when line ends with \n
        if lines[len(lines)-1][-1] == '\n':
            lineCnt += 1
            wordCnt += 1
        lpCnt += 1
        print('{0} {4} - {5} act Pos: {1}, act lines: {2}, act words: {3}'.format(lpCnt, pos, lineCnt, wordCnt, lines[0][0:10], lines[len(lines)-1][-10:]))
finally:
    fin.close()
lineCnt += 1
print('Total line count: {}'.format(lineCnt))
That code works for files of around 2 GB (tested with 2.1 GB). The two constants at the beginning, MAXSTR and MAXLP, let you adjust the size of the chunks read and limit the amount of text processed. During testing you can then process just a subset of the whole data, which goes much faster.
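
The same chunked-reading idea can also be packaged as a generator that carries the unfinished last word over to the next chunk, so no correction arithmetic is needed. This is only a sketch of an alternative, not the code above: the name iter_words is made up, and it assumes that words are separated by ';' or by line breaks, as WORDSEP suggests.

```python
import io


def iter_words(stream, sep=';', chunk_size=65536):
    """Yield words from a text stream while holding only one chunk in memory."""
    tail = ''  # possibly incomplete word left over from the previous chunk
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Treat line breaks like separators, then split the chunk into words.
        parts = (tail + chunk).replace('\n', sep).split(sep)
        tail = parts.pop()  # the last piece may continue in the next chunk
        for part in parts:
            if part:        # ignore empty fields such as ";;"
                yield part
    if tail:
        yield tail


# Tiny in-memory demonstration with a deliberately small chunk size.
# For a real file: with open('link_genrator.txt') as fin: ... iter_words(fin)
demo = io.StringIO("a;b;c\nd;e")
print(sum(1 for _ in iter_words(demo, chunk_size=3)))  # prints 5
```

Counting with sum(1 for ...) never builds a list of all words, which is what keeps memory use flat even when the file has no line breaks at all.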

Python splitting function and storing values

I have a txt file that reads as follows:
math,89
history,90
history,94
I am trying to split each line so I can convert the numbers to integers and use them to find the average grade. I am having issues with splitting the string at the ',' and storing the two parts in two different variables.
Here is the main part of the code that I think is the issue:
def main():
    total = 0
    count = 0
    myfile = open('grades.txt','r')
    for line in myfile:
        array = line.split(",")
        course = array[0]
        amount = float(array[1])
        total += amount
        myfile.close()
Sorry forgot to add the error I am getting: ValueError: I/O operation on closed file
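
That error typically means the file was closed while the loop was still reading from it, for example when myfile.close() ends up inside the for loop. The following is a minimal sketch of one way to avoid it, assuming each non-empty line has the form course,grade. The function name average_grade is hypothetical. It takes any iterable of lines, so it can be fed an open file from a with block, which closes the file only after the loop has finished.

```python
def average_grade(lines):
    """Return the mean grade from lines of the form 'course,grade'."""
    total = 0.0
    count = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        course, amount = line.split(",")  # two parts into two variables
        total += float(amount)
        count += 1
    return total / count if count else None


# With the real file:
#     with open('grades.txt') as myfile:
#         print(average_grade(myfile))
print(average_grade(["math,89\n", "history,90\n", "history,94\n"]))  # prints 91.0
```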

How to read data like this from a text file?

The text file is like
101 # an integer
abcd # a string
2 # a number that indicates how many 3-line structures will there be below
1.4 # some float number
2 # a number indicating how many numbers will there be in the next line
1 5 # 2 numbers
2.7 # another float number
3 # another number
4 2 7 # three numbers
and the output should be like
[101,'abcd',[1.4,[1,5]],[2.7,[4,2,7]]]
I can do it line by line, with readlines(), strip(), int(), and for loop, but I'm not sure how to do it like a pro.
P.S. there can be spaces and tabs and maybe empty lines randomly inserted in the text file. The input was originally intended for C program where it doesn't matter :(
My code:
with open('data','r') as f:
    lines = [line.strip('\n') for line in f.readlines()]
i=0
while(i<len(lines)):
    course_id = int(lines[i])
    i+=1
    course_name = lines[i]
    i+=1
    class_no = int(lines[i])
    i+=1
    for j in range(class_no):
        fav = float(lines[i])
        i+=2
        class_sched = lines[i].split(" ")
The variables read from the file will be handled afterwards.
All those i+='s look absolutely hideous! And it seems to be a long Python program for this sort of task.
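
One way to avoid the manual index bookkeeping is to turn the cleaned lines into an iterator and pull values from it with next(). This is a minimal sketch under several assumptions: the file holds a single record laid out as in the example, any text after # is an annotation rather than data, and blank lines can be ignored. The name parse_record is made up for illustration.

```python
def parse_record(lines):
    """Parse one record: id, name, then N blocks of (float, count, numbers)."""
    # Drop '#' annotations and surrounding spaces/tabs, then skip blank lines.
    cleaned = (line.split('#', 1)[0].strip() for line in lines)
    tokens = iter([text for text in cleaned if text])

    result = [int(next(tokens)), next(tokens)]  # course id, course name
    for _ in range(int(next(tokens))):          # number of 3-line structures
        value = float(next(tokens))
        expected = int(next(tokens))            # how many numbers should follow
        numbers = [int(x) for x in next(tokens).split()]
        if len(numbers) != expected:
            raise ValueError("expected %d numbers, got %d" % (expected, len(numbers)))
        result.append([value, numbers])
    return result


sample = ["101\n", "abcd\n", "\n", "2\n", "1.4\n", "2\n", "1 \t5\n",
          "2.7\n", "3\n", "4 2 7\n"]
print(parse_record(sample))  # prints [101, 'abcd', [1.4, [1, 5]], [2.7, [4, 2, 7]]]
```

Calling split() with no argument handles any mix of spaces and tabs between the numbers, which a split(" ") would not.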

PYTHON how to search a text file for a number

There's a text file that I'm reading line by line. It looks something like this:
3
3
67
46
67
3
46
Each time the program encounters a new number, it writes it to a text file. The way I'm thinking of doing this is writing the first number to the file, then looking at the second number and checking if it's already in the output file. If it isn't, it writes THAT number to the file. If it is, it skips that line to avoid repetitions and goes on to the next line. How do I do this?
Rather than searching your output file, keep a set of the numbers you've written, and only write numbers that are not in the set.
Instead of checking output file for the number if it was already written it is better to keep this information in a variable (a set or list). It will save you on disk reads.
To search a file for numbers you need to loop through each line of that file, you can do that with for line in open('input'): loop, where input is the name of your file. On each iteration line would contain one line of input file ended with end of line character '\n'.
In each iteration you should try to convert the value on that line to a number, int() function may be used. You may want to protect yourself against empty lines or non-number values with try statement.
In each iteration having the number you should check if the value you found wasn't already written to the output file by checking a set of already written numbers. If value is not in the set yet, add it and write to the output file.
#!/usr/bin/env python
numbers = set()  # create a set for storing numbers that were already written
out = open('output', 'w')  # open 'output' file for writing
for line in open('input'):  # loop through each line of 'input' file
    try:
        i = int(line)  # try to convert line to integer
    except ValueError:  # if conversion to integer fails display a warning
        print("Warning: cannot convert to number string '%s'" % line.strip())
        continue  # skip to next line on error
    if i not in numbers:  # check if the number wasn't already added to the set
        out.write('%d\n' % i)  # write the number to the 'output' file followed by EOL
        numbers.add(i)  # add number to the set to mark it as already added
out.close()  # close the output file so everything is flushed to disk
This example assumes that your input file contains a single number on each line. In case of an empty or incorrect line, a warning will be printed to stdout.
You could also use a list in the above example, but it may be less efficient.
Instead of numbers = set() use numbers = [], and instead of numbers.add(i) use numbers.append(i). The if condition stays the same.
Don't do that. Use a set() to keep track of all the numbers you have seen. It will only have one of each.
numbers = set()
for line in open("numberfile"):
    numbers.add(int(line.strip()))
open("outputfile", "w").write("\n".join(str(n) for n in numbers))
Note this reads them all, then writes them all out at once. This may put them in a different order than in the original file, because a set has no guaranteed order (iterate over sorted(numbers) instead if you want ascending numeric order). If you don't want that, you can also write them as you read them, but only if they are not already in the set:
numbers = set()
with open("outfile", "w") as outfile:
    for line in open("numberfile"):
        number = int(line.strip())
        if number not in numbers:
            outfile.write(str(number) + "\n")
            numbers.add(number)
Are you working with exceptionally large files? You probably don't want to try to "search" the file you're writing to for a value you just wrote. You (probably) want something more like this:
encountered = set()
with open('file1') as fhi, open('file2', 'w') as fho:
    for line in fhi:
        if line not in encountered:
            encountered.add(line)
            fho.write(line)
If you want to scan through a file to see if it contains a number on any line, you could do something like this:
def file_contains(f, n):
    with f:
        for line in f:
            if int(line.strip()) == n:
                return True
        return False
However as Ned points out in his answer, this isn't a very efficient solution; if you have to search through the file again for each line, the running time of your program will increase proportional to the square of the number of numbers.
If the number of values is not incredibly large, it would be more efficient to use a set. Sets are designed to very efficiently keep track of unordered values. For example:
with open("input_file.txt", "rt") as in_file:
    with open("output_file.txt", "wt") as out_file:
        encountered_numbers = set()
        for line in in_file:
            n = int(line.strip())
            if n not in encountered_numbers:
                encountered_numbers.add(n)
                out_file.write(line)
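
As a compact variation on the set-based approach (a different technique, not what the answers above use), dict.fromkeys can remove duplicates while keeping first-seen order, since dictionaries preserve insertion order in Python 3.7 and later. The helper name unique_in_order is hypothetical, and the sketch assumes every non-blank line holds one integer.

```python
def unique_in_order(lines):
    """Return each distinct integer once, in order of first appearance."""
    numbers = (int(line) for line in lines if line.strip())  # skip blank lines
    return list(dict.fromkeys(numbers))  # dict keys are unique and keep order


# The sample data from the question:
sample = ["3\n", "3\n", "67\n", "46\n", "67\n", "3\n", "46\n"]
print(unique_in_order(sample))  # prints [3, 67, 46]
```

The result can then be written out in one go, for example with out_file.write("\n".join(str(n) for n in result) + "\n").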
