Reading a file and displaying the sum of names within that file - python

What I would like the final code to do is read a list of names from a text file named 'names.txt', count how many names the file contains, and display that count. The code I have so far was meant to display the sum of the numbers in a text file, but it is close enough to the program I need now that I think it can be reworked to count the strings/names and display that instead of the sum.
Here is the code so far:
def main():
    # Initialize an accumulator.
    total = 0.0
    try:
        # Open the file.
        myfile = open('names.txt', 'r')
        # Read and display the file's contents.
        for line in myfile:
            amount = float(line)
            total += amount
        # Close the file.
        myfile.close()
    except IOError:
        print('An error occurred trying to read the file.')
    except ValueError:
        print('Non-numeric data found in the file.')
    except:
        print('An error occurred.')

# Call the main function.
main()
I am still really new to Python programming, so please don't be too harsh on me. If anyone can figure out how to rework this to display the count of numbers/names instead of the sum of numbers, I would greatly appreciate it. If this program cannot be reworked, I would be happy to settle for a new solution.
Edit: this is an example of what 'names.txt' will look like:
john
mary
paul
ann

If you just want to count the lines in the file:

# Open the file.
myfile = open('names.txt', 'r')
# Count the lines in the file.
totalLines = len(myfile.readlines())
# Close the file.
myfile.close()

fh = open("file","r")
print "%d lines"%len(fh.readlines())
fh.close()
or you could do
fh=open("file","r")
print "%d words"%len(fh.read().split())
fh.close()
All this is readily available information that is not hard to find if you put forth some effort...just getting the answers usually results in flunked classes...

Assuming the names in your text file are delimited by newlines:

myfile = open('names.txt', 'r')
lstLines = myfile.read().split('\n')
name_counts = dict((name, lstLines.count(name)) for name in lstLines)

This creates a dictionary mapping each name to its number of occurrences. To look up the occurrences of a particular name such as 'name1' in the list:

lstLines.count('name1')
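For what it's worth, here is a minimal sketch of the same idea using collections.Counter from the standard library (assuming one name per line, as in the example above):

from collections import Counter

# Build a mapping of each name to its number of occurrences;
# blank lines are skipped.
with open('names.txt') as f:
    name_counts = Counter(line.strip() for line in f if line.strip())

print(sum(name_counts.values()))   # total number of names
print(name_counts['john'])         # occurrences of one particular name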

Assuming names are separated by whitespace:
def main():
    # Initialize an accumulator.
    total = 0
    try:
        # Open the file.
        myfile = open('names.txt', 'r')
        # Count the words on each line.
        for line in myfile:
            words = line.split()
            total += len(words)
        # Close the file.
        myfile.close()
        # Display the number of names found.
        print(total)
    except IOError:
        print('An error occurred trying to read the file.')
    except:
        print('An error occurred.')

# Call the main function.
main()

Use a with statement to open the file; it will close the file properly even if an exception occurs. You can also omit the file mode, since 'r' is the default.
If each name is on its own line and there are no duplicates:
with open('names.txt') as f:
    number_of_nonblank_lines = sum(1 for line in f if line.strip())
name_count = number_of_nonblank_lines
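If duplicates are possible and only distinct names should be counted, a set-based variant might look like this (a sketch, under the same one-name-per-line assumption):

with open('names.txt') as f:
    unique_names = {line.strip() for line in f if line.strip()}
print(len(unique_names))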
The task is very simple. Start from fresh code to avoid accumulating parts that are unused or invalid for this problem.
If all you need is to count the lines in a file (like the wc -l command), then you could use the .count('\n') method:
#!/usr/bin/env python
import sys
from functools import partial
read_chunk = partial(sys.stdin.read, 1 << 15) # or any text file instead of stdin
print(sum(chunk.count('\n') for chunk in iter(read_chunk, '')))
See also, Why is reading lines from stdin much slower in C++ than Python?
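As a sketch, the same chunk-counting technique applied to an ordinary file instead of stdin (using the names.txt file from the question; note that a final line without a trailing newline would not be counted):

from functools import partial

with open('names.txt') as f:
    read_chunk = partial(f.read, 1 << 15)   # read 32 KiB at a time
    line_count = sum(chunk.count('\n') for chunk in iter(read_chunk, ''))
print(line_count)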

Related

Try/Except not running properly when opening files

I am trying to open a file with this try/except block, but it goes straight to the except branch and does not open the file.
I've tried opening several different files, but none of them can be opened.
import string

fname = input('Enter a file name: ')
try:
    fhand = open(fname)
except:
    print('File cannot be opened:', fname)
    exit()

counts = dict()
L_N = 0
for line in fhand:
    line = line.rstrip()
    line = line.translate(line.maketrans(' ', ' ', string.punctuation))
    line = line.lower()
    words = line.split()
    L_N += 1
    for word in words:
        if word not in counts:
            counts[word] = [L_N]
        else:
            if L_N not in counts[word]:
                counts[word].append(L_N)

for h in range(len(counts)):
    print(counts)

out_file = open('word_index.txt', 'w')
out_file.write('Text file being analyzed is: ' + str(fname) + '\n\n')
out_file.close()
I would like the program to read a specific file and print the dictionary it builds.
Make sure you are including quotes around your filename ("myfile.txt") if you are using Python 2.7; with Python 3, quotes are not required.
Also make sure your input uses the absolute path to the file, or that the file exists in the same place you are running the Python program from.
For example, if your program and current working directory are in ~/code/ and you enter 'myfile.txt', then 'myfile.txt' must exist in ~/code/.
However, it is best to provide the absolute path to your input file, such as /home/user/myfile.txt; then your script will work no matter what directory you call it from.
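Here is a small sketch of that check using pathlib (the prompt mirrors the question's; the behaviour on a missing file is just illustrative):

from pathlib import Path

fname = input('Enter a file name: ')
path = Path(fname).expanduser().resolve()
print('Looking for:', path)      # shows exactly which absolute path is tried
if path.is_file():
    with path.open() as fhand:
        print('Opened OK, first line:', fhand.readline().rstrip())
else:
    print('File cannot be opened:', fname)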

How do I stop new data from python replacing old data in a csv file? [duplicate]

The code below is what I have so far. When it writes to the .csv file it overwrites what I had previously written. How can I write to the file in such a way that it doesn't erase my previous text? (The objective of my code is to have a person enter their name and have the program remember them.)
def main(src):
    try:
        input_file = open(src, "r")
    except IOError as error:
        print("Error: Cannot open '" + src + "' for processing.")
    print("Welcome to Learner!")
    print("What is your name? ")
    name = input()
    for line in input_file:
        w = line.split(",")
        for x in w:
            if x.lower() == name.lower():
                print("I remember you " + name.upper())
            else:
                print("NO")
                a = open("learner.csv", "w")
                a.write(name)
                a.close()
            break

if __name__ == "__main__":
    main("learner.csv")
You need to append to the file instead. This can be done by opening the file in append mode.
def addToFile(file, what):
    with open(file, 'a') as f:
        f.write(what)
change open("learner.csv", "w") to open("learner.csv", "a")
The second parameter with open is the mode, w is write, a is append. With append it automatically seeks to the end of the file.
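A tiny sketch illustrating the difference between the two modes (the file name demo.txt is arbitrary):

with open('demo.txt', 'w') as f:   # 'w' truncates any existing content
    f.write('first\n')
with open('demo.txt', 'a') as f:   # 'a' appends at the end instead
    f.write('second\n')
with open('demo.txt') as f:
    print(f.read())                # prints both lines: first, then second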
You'll want to open the file in append mode ('a') rather than write mode ('w'); the Python documentation explains the different modes available.
Also, you might want to consider using the with keyword:
It is good practice to use the with keyword when dealing with file objects. This has the advantage that the file is properly closed after its suite finishes, even if an exception is raised on the way.
>>> with open('/tmp/workfile', 'a') as f:
... f.write(your_input)

PermissionError: [Errno 13] Permission denied (after multiple successful write attempts to the file)

I wrote code which queries a Mongo database and writes the results to a file.
My code creates the file and starts writing to it successfully, but after multiple iterations (I am not sure whether the number of iterations is fixed or not) I get a PermissionError.
I've searched for this, but I only found answers from people who got the error on the first attempt because they didn't have permission. I will point out that I am not doing anything else on my computer during the execution, so I really don't understand how this can happen.
Here is parts of the code:
import json
from bson import json_util  # imports assumed by this excerpt

def query(self, query_date_part, query_actKey_part, filepath):
    empty = True
    print("0.0 %")
    for i in range(len(query_date_part)):
        query = {"dt": query_date_part[i], "actKey": query_actKey_part}
        cursor = self.collection.find(query)
        while cursor.alive:
            try:
                if empty:
                    with open(filepath, 'w') as fp:
                        json.dump(cursor.next(), fp, default=json_util.default)
                    empty = False
                else:
                    append_to_json(filepath, cursor.next())
            except StopIteration:
                print("Stop Iteration")
        print(str(round(float(i+1) / len(query_date_part) * 100, ndigits=2)) + " %")
    return 0

def append_to_json(filepath, data):
    """
    Append data in JSON format to the end of a JSON file.
    NOTE: Assumes file contains a JSON object (like a Python dict) ending in '}'.
    :param filepath: path to file
    :param data: dict to append
    """
    # construct JSON fragment as new file ending
    new_ending = ", " + json.dumps(data, default=json_util.default)[1:-1] + "}\n"
    # edit the file in situ - first open it in read/write mode
    with open(filepath, 'r+') as f:
        f.seek(0, 2)        # move to end of file
        index = f.tell()    # find index of last byte
        # walking back from the end of file, find the index
        # of the original JSON's closing '}'
        while not f.read().startswith('}'):
            index -= 1
            if index == 0:
                raise ValueError("can't find JSON object in {!r}".format(filepath))
            f.seek(index)
        # starting at the original ending '}' position, write out
        # the new ending
        f.seek(index)
        f.write(new_ending)
Part of the output:
6.75 %
Stop Iteration
6.76 %
Traceback (most recent call last):
File "C:/Users/username/PycharmProjects/mongodbtk/mquerytk.py", line 237, in <module>
mdbc.query(split_date(2017,5,6,1,0,2017,5,16,10,0,step=2), {"$in": ["aFeature"]}, 'test.json')
File "C:/Users/username/PycharmProjects/mongodbtk/mquerytk.py", line 141, in query
append_to_json(filepath, cursor.next())
File "C:/Users/username/PycharmProjects/mongodbtk/mquerytk.py", line 212, in append_to_json
with open(filepath, 'r+') as f:
PermissionError: [Errno 13] Permission denied: 'test.json'
Process finished with exit code 1
Note: the size of the file increases during the execution. When it crashes, the file is about 300 MB. I still have a lot of space on my hard drive, but maybe the size of the file can be an issue?
Config: I use Windows 7, Python 3.6 and my IDE is PyCharm Community Edition 2016.3.2
I had the same issue, and after testing it out, it seems like there might be some "bug" when writing to the same file too often, i.e. multiple times per second. Here is a very small code snippet you can test with:
import csv

text = "dfkjghdfkljghflkjghjkdfdfgsktjgrhsleiuthsl uirghuircbl iawehcg uygbc sgygerh"
FIELD_NAMES = ['asd', 'qwe']

with open('test.txt', 'w', newline='') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=FIELD_NAMES)
    writer.writeheader()

max_rows = 10000
i = 0
while i <= max_rows:
    print(str(i))
    with open('test.txt', 'a', newline='') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=FIELD_NAMES)
        rowData = {'asd': text, 'qwe': text}
        writer.writerow(rowData)
    i += 1
The code, I think, is pretty self-explanatory. I got the error very randomly; sometimes it happened after the ~75th iteration, sometimes it got to even ~750, but the code never seemed to reach the limit. So I recommend you write more data less often rather than small amounts very often. I hope it helps.
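A sketch of that suggestion applied to the snippet above: buffer the rows in memory and flush them in batches (the batch size of 100 is an arbitrary choice):

import csv

text = "some repeated row content"
FIELD_NAMES = ['asd', 'qwe']

with open('test.txt', 'w', newline='') as csvfile:
    csv.DictWriter(csvfile, fieldnames=FIELD_NAMES).writeheader()

buffer = []
for i in range(10000):
    buffer.append({'asd': text, 'qwe': text})
    if len(buffer) >= 100:                      # flush every 100 rows
        with open('test.txt', 'a', newline='') as csvfile:
            csv.DictWriter(csvfile, fieldnames=FIELD_NAMES).writerows(buffer)
        buffer = []
if buffer:                                      # flush whatever is left
    with open('test.txt', 'a', newline='') as csvfile:
        csv.DictWriter(csvfile, fieldnames=FIELD_NAMES).writerows(buffer)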
Try this:

except StopIteration:
    if not fp.closed:
        fp.close()
    print("Stop Iteration")
It appears to be a race condition. Creating a helper function which does the same task as the function raising the race condition worked for me.
For example, if a code block or function A() is hitting a race condition, we can handle the exception by calling another helper function A_helper() which does the same thing.
A Python code example:

try:
    A()
except PermissionError:
    # fall back to the helper that retries the same work
    A_helper()
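Another common workaround on Windows, under the assumption that the lock is transient (e.g. an antivirus or indexing service briefly holding the file), is to retry the open after a short delay; open_with_retry here is a hypothetical helper, not part of any library:

import time

def open_with_retry(filepath, mode, retries=5, delay=0.1):
    # Retry a transient PermissionError a few times before giving up.
    for attempt in range(retries):
        try:
            return open(filepath, mode)
        except PermissionError:
            if attempt == retries - 1:
                raise            # still failing after all retries
            time.sleep(delay)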
Your file may be located somewhere your program can't access it. Try moving it to a different directory, or check that you are not entering the wrong file path. I hope this works for you!

Why does my script write every input string twice to the output file?

When I look at what it wrote, it's always doubled. For example, if I write 'dog' I'll get 'dogdog'. Why?
Reading and writing to file, filename taken from command line arguments:
from sys import argv
script, text = argv

def reading(f):
    print f.read()

def writing(f):
    print f.write(line)

filename = open(text)
# opening file
reading(filename)
filename.close()

filename = open(text, 'w')
line = raw_input()
filename.write(line)
writing(filename)
filename.close()
As I said, the output I am getting is double the input I am giving.
You are getting the doubled value because you are writing it twice:
1) from the function call:

def writing(f):
    print f.write(line)

2) by writing to the file directly using filename.write(line).
Use this code:

from sys import argv
script, text = argv

def reading(f):
    print f.read()

def writing(f):
    print f.write(line)

filename = open(text, 'w')
line = raw_input()
writing(filename)
filename.close()
Also, there is no need to close the file twice; once you have finished all the read and write operations, just close it once.
If you want to display each line and then write a new line, you should probably just read the entire file first, and then loop over the lines when writing new content.
Here's how you can do it. When you use with open(), you don't have to close() the file, since that's done automatically.
from sys import argv

filename = argv[1]

# first read the file content
with open(filename, 'r') as fp:
    lines = fp.readlines()
# `lines` is now a list of strings.

# then open the file for writing.
# This will empty the file so we can write from the start.
with open(filename, 'w') as fp:
    # by using enumerate, we can get the line numbers as well.
    for index, line in enumerate(lines, 1):
        print 'line %d of %d:\n%s' % (index, len(lines), line.rstrip())
        new_line = raw_input()
        fp.write(new_line + '\n')

Slow python file I/O; Ruby runs better than this; Got the wrong language?

Please advise - I'm going to use this as a learning point. I'm a beginner.
I'm splitting a 25 MB file into several smaller files.
A kindly guru here gave me a Ruby script. It works beautifully fast. So, in order to learn, I mimicked it with a Python script. This runs like a three-legged cat (slow). I wonder if anyone can tell me why?
My Python script:
## split a file into smaller files
###########################################
import re

def splitlines(file):
    fileNo = 0001
    outFile = open("C:\\Users\\dunner7\\Desktop\\Textomics\\Media\\LexisNexus\\ele\\newdocs\\%s.txt" % fileNo, 'a')  ## open file to append
    fh = open(file, "r")  ## open the file for reading
    mylines = fh.readlines()  ### read in lines
    for line in mylines:  ## for each line
        if re.search("Copyright ", line):  # if the line matches the regex
            outFile.close()  ## close the file
            fileNo += 1  # and add one to the filename, starting to read lines in again
        else:  # otherwise
            outFile = open("C:\\Users\\dunner7\\Desktop\\Textomics\\Media\\LexisNexus\\ele\\newdocs\\%s.txt" % fileNo, 'a')  ## open file to append
            outFile.write(line)  ## then append it to the open outFile
    fh.close()
The guru's Ruby 1.9 script
g = 0001
f = File.open(g.to_s + ".txt", "w")
open("corpus1.txt").each do |line|
  if line[/\d+ of \d+ DOCUMENTS/]
    f.close
    f = File.open(g.to_s + ".txt", "w")
    g += 1
  end
  f.print line
end
There are many reasons why your script is slow -- the main one being that you reopen the output file for almost every line you write. Since the old file object gets implicitly closed when you open a new one (due to Python garbage collection), the write buffer is flushed for every single line you write, which is quite expensive.
A cleaned-up and corrected version of your script would be:
def file_generator():
    file_no = 1
    while True:
        f = open(r"C:\Users\dunner7\Desktop\Textomics\Media"
                 r"\LexisNexus\ele\newdocs\%s.txt" % file_no, 'a')
        yield f
        f.close()
        file_no += 1

def splitlines(filename):
    files = file_generator()
    out_file = next(files)
    with open(filename) as in_file:
        for line in in_file:
            if "Copyright " in line:
                out_file = next(files)
            out_file.write(line)
    out_file.close()
I guess the reason your script is so slow is that you open a new file descriptor for each line. If you look at your guru's Ruby script, it closes and reopens the output file only when the separator matches.
In contrast, your Python script opens a new file descriptor for every line you read (and, by the way, does not close them). Opening a file requires talking to the kernel, so this is relatively slow.
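A rough sketch that makes this cost visible (timings are machine-dependent; the file names are arbitrary):

import time

N = 20000

start = time.time()
for i in range(N):
    out = open('reopened.txt', 'a')   # reopen per line: flush + kernel call each time
    out.write('line\n')
    out.close()
print('reopen per line:', time.time() - start)

start = time.time()
out = open('kept_open.txt', 'a')      # one handle: writes mostly hit the buffer
for i in range(N):
    out.write('line\n')
out.close()
print('single handle:  ', time.time() - start)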
Another change I would suggest is to change
fh = open(file, "r") ## open the file for reading
mylines = fh.readlines() ### read in lines
for line in mylines: ## for each line
to
fh = open(file, "r")
for line in fh:
With this change, you do not read the whole file into memory, but only block after block. Although it should not matter with a 25MiB file, it will hurt you with big files and is good practice (and less code ;)).
The Python code might be slow due to the regex handling and not I/O. Try:

import re

def splitlines(file):
    fileNo = 0001
    outFile = open("newdocs/%s.txt" % fileNo, 'a')  ## open file to append
    reg = re.compile("Copyright ")
    for line in open(file, "r"):
        if reg.search(line):  # if the line matches the regex
            outFile.close()  ## close the file
            fileNo += 1  # and add one to the filename
            outFile = open("newdocs/%s.txt" % fileNo, 'a')  ## open the next file to append
        outFile.write(line)  ## then append the line to the open outFile
Several notes:
Always use / instead of \ in path names.
If a regex is used repeatedly, compile it.
Do you need re.search, or re.match? (See the sketch after the update below.)
UPDATE:
@Ed S: point taken
@Winston Ewert: code updated to be closer to the original Ruby code
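For reference, a quick sketch of the search-versus-match distinction raised in the notes (the sample strings are made up):

import re

reg = re.compile("Copyright ")
print(bool(reg.search("2011 Copyright ACME")))   # True: search scans the whole string
print(bool(reg.match("2011 Copyright ACME")))    # False: match is anchored at the start
print(bool(reg.match("Copyright 2011 ACME")))    # True: the string starts with the pattern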
rosser,
Don't use names of built-in objects as identifiers in your code (file, splitlines).
The following code reproduces the behaviour of your own code: an out_file is closed without writing the line containing 'Copyright ' that signals the closing.
The function writelines() is used to obtain faster execution than a repetition of out_file.write(line).
The `if li:` block is there to trigger the closing of out_file in case the last line of the read file doesn't contain 'Copyright '.
def splitfile(filename, wordstop, destrep, file_no=1):
    li = []
    with open(filename) as in_file:
        for line in in_file:
            if wordstop in line:
                with open(destrep + str(file_no) + '.txt', 'w') as f:
                    f.writelines(li)
                file_no += 1
                li = []
            else:
                li.append(line)
    if li:
        with open(destrep + str(file_no) + '.txt', 'w') as f:
            f.writelines(li)
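A possible call, assuming the 25 MB source is corpus1.txt and the pieces should land in an existing newdocs/ directory (both names taken from this thread):

splitfile('corpus1.txt', 'Copyright ', 'newdocs/')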
