I am trying to understand what the commented-out lines of code below do. When the lines are commented out, the program works as expected: the function tuple_to_word creates a dictionary with the lines of words.txt as the values.
When the code is uncommented, however, the program only prints an empty dictionary. But I can't understand why the for loop would have any effect on the call to tuple_to_word. I am guessing that the for loop in question changes the underlying file object, but how?
fin = open('words.txt')
word_dict = {}
'''
for i in fin:
    word_dict[i.strip()] = 1
'''
def signature(s):
    t = list(s)
    t.sort()
    t = ''.join(t)
    return t
def tuple_to_word():
    words_match_tuple = { }
    for line in fin:
        word = line.strip().lower()
        t = signature(word)
        words_match_tuple.setdefault(t, []).append(word)
    return words_match_tuple
print tuple_to_word()
The answer is: if you activate the code between ''' .. ''', it reads the input file line by line all the way to the end. The function tuple_to_word() then finds the file cursor at the end of the file, so there are no lines left to parse.
You should either reopen the input file or go back to the beginning of the file with:
fin.seek(0)
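A minimal sketch of that behaviour, assuming a small temporary file stands in for words.txt (the file contents and variable names here are made up for illustration). The second pass over the same file object yields nothing until the cursor is rewound:

```python
import os
import tempfile

# Create a small stand-in for words.txt (contents are illustrative only).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("stop\npots\ntops\n")
    path = tmp.name

fin = open(path)
first_pass = [line.strip() for line in fin]   # consumes the file to the end
second_pass = [line.strip() for line in fin]  # cursor is at the end: nothing left
fin.seek(0)                                   # rewind to the beginning
third_pass = [line.strip() for line in fin]   # the lines are available again
fin.close()
os.remove(path)

print(first_pass)
print(second_pass)
print(third_pass)
```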
I am new here and new to programming too.
I am reading Jamie Chan's Learn Python in One Day and am currently at the Practical Project section. I am trying to make Python read a line from a txt file. The txt file contains a name and a number separated by a comma.
This is the text file:
Benny, 102
Ann, 100
Carol, 214
Darren, 129
I succeeded in making it read the first line, but trying to print the second line by calling on the name there keeps returning nill. When I switch the lines, the same thing occurs: it reads the name in the first line but returns nill for the name in the second line.
This is the function I tried to use to read the texts:
def getUserPoint(userName):
    f = open('userScores.txt', 'r')
    for line in f:
        result = line.splitlines()
        if userName in line:
            return result
        else:
            return "nill"
    f.close()
s = getUserPoint(input('Ann'))
print(s)
And this is the result:
nill
And these are the instructions:
Each line records the information of one user. The first value is the user’s username and the second is the user’s score.
Next, the function reads the file line by line using a for loop. Each line is then split using the split() function
Let’s store the results of the split() function in the list content.
Next, the function checks if any of the lines has the same username as the value that is passed in as the parameter. If there is, the function closes the file and returns the score beside that username. If there isn’t, the function closes the file and returns the string ‘-1’
I am terribly sorry for the long-winded post.
You can use:
def getUserPoint(userName):
    f = open('userScores.txt', 'r')
    for line in f.readlines():
        result = line.splitlines()
        if userName in line:
            f.close()
            return result
    f.close()
    return "nill"
s = getUserPoint(input('Ann'))
print(s)
One problem is that the else branch is taken on the first line that does not match, and its return immediately ends both the loop and the function.
You need to return the default result only after you've looked at all lines:
def getUserPoint(userName):
    with open('userScores.txt') as f:
        for line in f:
            if userName == line.rstrip().split(',')[0]:
                return line
    return "nill"
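Then, as shown, you either want to split on the comma and check the first column, or use userName in line. Otherwise, if you test against the result of splitlines(), you are checking
'Ann' in ["Ann, 100"]
since splitlines() returns a list of whole lines, and membership in a list compares against each whole element rather than looking for a substring, which returns False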
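The difference between the three checks can be seen in a short sketch. The sample line is taken from the question's file; the variable names are only illustrative:

```python
line = "Ann, 100\n"

parts = line.splitlines()                   # a list holding the line without its newline
in_list = "Ann" in parts                    # list membership: compares whole elements
in_string = "Ann" in line                   # string membership: substring search
first_column = line.rstrip().split(",")[0]  # text before the first comma
exact_match = first_column == "Ann"         # exact comparison on the name column

print(parts, in_list, in_string, exact_match)
```

Comparing the first column exactly is the stricter choice: a substring test would also match a longer name that merely contains the one being searched for.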
See below.
The code takes care of closing the file.
It returns None if no match is found; otherwise the user's score is returned.
def get_user_point(user_name):
    with open('userScores.txt', 'r') as f:
        lines = [l.strip() for l in f]
    for line in lines:
        parts = line.split(',')
        if user_name == parts[0]:
            return parts[1].strip()  # drop the space that follows the comma
Thanks everyone for the help...
This code by OneCricketeer worked:
def getUserPoint(userName):
    with open('userScores.txt') as f:
        for line in f:
            if userName == line.split(',')[0]:
                return line
    return "nill"
Since I am new to Python and programming in general, I will probably be asking a lot more questions.
Thanks for the help everyone.
I have the following problem. I am supposed to open a CSV file (it's an Excel table) and read it without using any library.
I have already tried a lot and now have the first row in a tuple, and this tuple in a list. But only the first line, the header, and no other row.
This is what I have so far.
with open(path, 'r+') as file:
    results=[]
    text = file.readline()
    while text != '':
        for line in text.split('\n'):
            a=line.split(',')
            b=tuple(a)
            results.append(b)
        return results
The output should be: every line in a tuple, and all the tuples in a list.
My question now is: how can I read the other lines in Python?
I am really sorry, I am new to programming altogether, so I have a really hard time finding my mistake.
Thank you very much in advance for helping me out!
This problem has come up many times on Stack Overflow, so you should be able to find working code.
But it is much better to use the csv module for this.
You have the wrong indentation, and you use return results after reading the first line, so the function exits and never tries to read the other lines.
But after changing this there are still other problems, so it still will not read the next lines.
You use readline(), so you read only the first line, and your loop works with the same line all the time. It may never end, because text never becomes ''.
You should use read() to get all the text, which you later split into lines using split("\n"), or you could use readlines() to get all lines as a list, in which case you don't need split(). Or you can use for line in file: directly. In all of these cases you don't need while.
def read_csv(path):
    with open(path, 'r+') as file:
        results = []
        text = file.read()
        for line in text.split('\n'):
            items = line.split(',')
            results.append(tuple(items))
        # after for-loop
        return results
def read_csv(path):
    with open(path, 'r+') as file:
        results = []
        lines = file.readlines()
        for line in lines:
            line = line.rstrip('\n') # remove `\n` at the end of line
            items = line.split(',')
            results.append(tuple(items))
        # after for-loop
        return results
def read_csv(path):
    with open(path, 'r+') as file:
        results = []
        for line in file:
            line = line.rstrip('\n') # remove `\n` at the end of line
            items = line.split(',')
            results.append(tuple(items))
        # after for-loop
        return results
None of these versions will work correctly if an item contains '\n' or , that should not be treated as the end of a row or as a separator between items. Such items are enclosed in " ", and removing those quotes can cause problems too. You can resolve all of these problems by using the standard csv module.
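For comparison, here is a minimal sketch using the standard csv module. The helper name read_csv_rows and the sample data are made up for illustration, and io.StringIO stands in for a real file. Note how the quoted comma and the quoted newline stay inside their fields:

```python
import csv
import io

def read_csv_rows(source):
    """Return every row of an open CSV file (or file-like object) as a tuple."""
    return [tuple(row) for row in csv.reader(source)]

# io.StringIO stands in for a real file; the data is illustrative only.
sample = io.StringIO('name,comment\n"Smith, John","line one\nline two"\n')
rows = read_csv_rows(sample)
print(rows)
```

With a real file, the csv documentation recommends opening it with newline='', for example: with open(path, newline='') as f: rows = read_csv_rows(f).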
Your code is pretty good and you are near the goal:
with open(path, 'r+') as file:
    results=[]
    text = file.read()
    #while text != '':
    for line in text.split('\n'):
        a=line.split(',')
        b=tuple(a)
        results.append(b)
    return results
Your Code:
with open(path, 'r+') as file:
    results=[]
    text = file.readline()
    while text != '':
        for line in text.split('\n'):
            a=line.split(',')
            b=tuple(a)
            results.append(b)
        return results
So enjoy learning :)
One caveat is that the CSV should not end with a blank line (or a trailing newline), as this would result in an ugly tuple at the end of the list like ('',) (which looks like a smiley).
To prevent this you have to check for empty lines: if line != '': after the for will do the trick.
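A short sketch of that check, with a hypothetical helper parse_rows that works on text already read from the file:

```python
def parse_rows(text):
    """Split CSV-like text into tuples, skipping empty lines."""
    results = []
    for line in text.split('\n'):
        if line != '':  # ignore the empty string left after a trailing newline
            results.append(tuple(line.split(',')))
    return results

rows = parse_rows("a,b\n1,2\n")
print(rows)
```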
I have a file with 2 blocks that differ slightly from each other. Below is the content of the file:
Other codes in the file
function void one(int x)
message_same
rest of the code
endfunction
Other codes in the file
function void othercheck ::two(int x)
message_same
rest of the code
endfunction
Different codes in the file
I read this file into a list, made some changes, and would like to write the result into another file.
If "message_same" is seen under function one, it should be written as it is, but if it is seen under function two, the line should be deleted (that is, not written to the output file). The other lines of code should remain as they are.
Expected Output:
Other codes in the file
virtual function void one(int x)
message_same
rest of the code
endfunction
Other codes in the file
function void two:: othercheck(int x)
rest of the code
endfunction
Different codes in the file
I tried with the following code:
for word in words:
    found_one_function=re.search('virtual function',word)
    if found_in_function :
        found_in_end=re.search('endfunction',word)
        if not found_in_end:
            found_in_function=True
            while(found_in_function):
                fw.write(word)
                continue
        if re.search('message_same', word):
            continue
    fw.write(word)
I understand that logically it's not right but I am not sure how to iterate after finding the virtual function till I get the end function.
Any help would be great.
That's relatively easy to do - what you want is to iterate over your words list (assuming each element contains a single line from your example data) and check for the beginning of the second 'type' of functions and then strip out lines containing message_same until you encounter a sole endfunction, something like:
# assuming `words` list with each line of your data
# if not it's as easy as: with open("input.txt") as f: words = [line for line in f]
with open("output.txt", "w") as f: # open output.txt for writing
    in_function = False # an identifier to tell us we are within a `::` function
    for line in words: # iterate over words
        if in_function: # we are inside of a `::` function...
            if line.strip() == "endfunction": # end of the function
                in_function = False
            elif "message_same" in line: # skip this line
                continue
        # detect function begin if there is "function" in the line followed with ::
        elif "function" in line and line.find("function") < line.find("::"):
            in_function = True
        f.write(line) # write the line to the output file
        # f.write("\n") # uncomment if the lines in your `words` are not terminated
For a file whose lines have been loaded as elements of words containing:
Other codes in the file
function void one(int x)
message_same
rest of the code
endfunction
Other codes in the file
function void othercheck ::two(int x)
message_same
rest of the code
endfunction
Different codes in the file
It will produce output.txt containing:
Other codes in the file
function void one(int x)
message_same
rest of the code
endfunction
Other codes in the file
function void othercheck ::two(int x)
rest of the code
endfunction
Different codes in the file
You can have as many functions as you want, and they don't need to be ordered - the processing will be applied only on those with ::.
Iterate over each line in the file; use a flag to track whether the process is IN a :: function; use the flag to discard a message_same line; modify line as needed; write line to the new file.
import re
special = re.compile(r'function.*?::')
in_special_func = False
with open(in_filepath) as in_file, open(out_filepath, 'w') as out_file:
    for line in in_file:
        if special.search(line):
            in_special_func = True
        if 'endfunction' in line:
            in_special_func = False
        if in_special_func and 'message_same' in line:
            #skip
            continue
        # make line modifications here if needed
        # line = modify(line)
        # line = some_variation_of(line)
        # print(line)
        out_file.write(line)
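The same flag-based approach can be sketched as a function that works on a list of lines, which makes it easy to try out without any files. The function name drop_message_in_special and its marker parameter are hypothetical; the regex is the one used above:

```python
import re

SPECIAL = re.compile(r'function.*?::')  # a function header that contains `::`

def drop_message_in_special(lines, marker='message_same'):
    """Return `lines` without the `marker` lines that fall inside a `::` function."""
    kept = []
    in_special_func = False
    for line in lines:
        if SPECIAL.search(line):
            in_special_func = True   # entering a `::` function
        if 'endfunction' in line:
            in_special_func = False  # leaving any function
        if in_special_func and marker in line:
            continue                 # skip the marker line
        kept.append(line)
    return kept

sample = [
    "function void one(int x)\n",
    "message_same\n",
    "endfunction\n",
    "function void othercheck ::two(int x)\n",
    "message_same\n",
    "rest of the code\n",
    "endfunction\n",
]
result = drop_message_in_special(sample)
print("".join(result))
```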
PREVIOUS ERRONEOUS ATTEMPT
Construct a regex that will capture a complete function
f_re = re.compile(r'function.*?endfunction', flags = re.DOTALL)
Construct a regex to identify the special functions
special = re.compile(r'function.*?::')
Construct a regex that will match the line that needs to be removed
message_same = re.compile(r'^\s*message_same\s*\n', flags = re.MULTILINE)
Read the file into a string:
with open(in_filepath) as in_file:
    s = in_file.read()
Iterate over all the functions; if a function is special remove the line; make other modifications to the function; write it to a file.
with open(out_filepath, 'w') as out_file:
    for f in f_re.findall(s):
        #print(f)
        if special.search(f):
            f = message_same.sub('', f)
        # make other changes here
        # assuming the result is a single string
        out_file.write(f)
        #print(f)
Here's a way to remove the 'message_same' line for every function with a signature that contains 'function' and '::'. This assumes that the structure of your input file is very consistent and that the blocks are separated by blank lines.
# read file into list of lists (each inner list is a block)
with open('code_blocks.txt', 'r') as f:
    blocks = [block.split('\n') for block in f.read().split('\n\n')]
# iterate over blocks
for block in blocks:
    # if first line contains 'function' and '::' and second line contains 'message_same'
    if len(block) > 1 and 'function' in block[0] and '::' in block[0] and 'message_same' in block[1]:
        # remove the message_same line (the second line of the block)
        block.pop(1)
# combine list of lists back into single string and write it out
with open('code_blocks_out.txt', 'w') as f:
    f.write('\n\n'.join(['\n'.join(block) for block in blocks]))
I have a function that is meant to count the number of times each key in a dictionary occurs in a list of files (list_of_docs).
def calc(dictionary):
    for token in dictionary:
        count = 0
        for files in list_of_docs:
            current_file = open(files.name, 'r')
            text = current_file.read()
            line = text.split()
            if token in line:
                count +=1
    return count
When I call this function, it doesn't stop. When I interrupt the program it indicates that it's stuck on the line line = text.split(). (And if I remove that line, it gets stuck on text = current_doc.read().) Not sure why the program isn't stopping?
You are not closing your files. Call current_file.close() when you have finished reading each one. Alternatively, you can wrap the file reading in a with statement, which closes the file for you:
with open(files.name, 'r') as f:
    text = f.read()
    ...
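If the function is very slow rather than truly stuck, one possible contributor is that every file is reopened and reread once per token. The sketch below is an assumption about what was intended, not the asker's exact requirement: it takes a list of file paths (the question's list_of_docs holds objects with a .name attribute instead), reads each file once, and returns a count per token rather than only the last one. The function name and the demo files are made up for illustration.

```python
import os
import tempfile

def count_docs_containing(tokens, paths):
    """Map each token to the number of files in `paths` that contain it."""
    counts = {token: 0 for token in tokens}
    for path in paths:
        with open(path, 'r') as f:         # the file is closed automatically
            words = set(f.read().split())  # each file is read exactly once
        for token in counts:
            if token in words:             # set lookup instead of a list scan
                counts[token] += 1
    return counts

# Demo with two small temporary files (contents are illustrative only).
with tempfile.TemporaryDirectory() as tmp_dir:
    paths = []
    for i, text in enumerate(["red green", "green blue"]):
        doc_path = os.path.join(tmp_dir, "doc%d.txt" % i)
        with open(doc_path, "w") as f:
            f.write(text)
        paths.append(doc_path)
    result = count_docs_containing(["red", "green", "pink"], paths)

print(result)
```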
I just started learning Python 3 weeks ago, so I apologize if this is really basic. I needed to open a .txt file and print the length of the longest line of code in the file. I just made a random file, named it myfile, and saved it to my desktop.
myfile= open('myfile', 'r')
line= myfile.readlines()
len(max(line))-1
# (the "-1" is to remove the \n)
Is this code correct? I put it in interpreter and it seemed to work OK.
But I got it wrong because apparently I was supposed to use a while loop. Now I am trying to figure out how to put it in a while loop. I've read what it says on python.org, watched videos on youtube and looked through this site. I just am not getting it. The example to follow that was given is this:
import os
du=os.popen('du /usr/local')
while 1:
    line= du.readline()
    if not line:
        break
    if list(line).count('/')==3:
        print line,
print max([len(line) for line in file(filename).readlines()])
Taking what you have and stripping out the parts you don't need:
myfile = open('myfile', 'r')
max_len = 0
while 1:
    line = myfile.readline()
    if not line:
        break
    if len(line) # ... something
        # something
Note that this is a clumsy way to loop over a file. It relies on readline() returning an empty string at the end of the file (a blank line inside the file is read as '\n', so it does not stop the loop). But homework is homework...
max(['b','aaa']) is 'b'
This lexicographic order isn't what you want to maximise. You can use the key argument to choose a different function to maximise, like len.
max(['b','aaa'], key=len) is 'aaa'
So the solution could be: len(max(line, key=len)) - 1, where line is the list returned by readlines() in your code.
A more elegant solution would be to use a generator expression:
max(len(line)-1 for line in myfile.readlines())
As an aside, you should open the file using a with statement, which takes care of closing the file at the end of the indented block:
with open('myfile', 'r') as mf:
    print max ( len(line)-1 for line in mf.readlines() )
As others have mentioned, you need to find the line with the maximum length, which means giving the max() function a key= argument to extract that from each of the lines in the list you pass it.
Likewise, in a while loop you'd need to read each line and see if its length was greater than the longest one you had seen so far, which you could store in a separate variable and initialize to 0 before the loop.
BTW, you would not want to open the file with os.popen() as shown in your second example.
I think it will be easier to understand if we keep it simple:
max_len = -1 # Nothing was read so far
with open("filename.txt", "r") as f: # Opens the file and magically closes at the end
    for line in f:
        max_len = max(max_len, len(line))
print max_len
As this is homework... I would ask myself whether I should count the line feed character or not. If you need to chop the last char, replace len(line) with len(line[:-1]).
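One thing to watch for, shown in a small sketch with made-up sample strings: slicing off the last character undercounts a final line that has no trailing newline, whereas rstrip('\n') removes the newline only when it is there.

```python
normal_line = "abc\n"          # a line as returned by readline()
last_line_no_newline = "abc"   # final line of a file without a trailing newline

samples = (normal_line, last_line_no_newline)
sliced = [len(s[:-1]) for s in samples]            # always drops one character
stripped = [len(s.rstrip('\n')) for s in samples]  # drops only a trailing newline

print(sliced, stripped)
```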
If you have to use while, try this:
max_len = -1 # Nothing was read
with open("t.txt", "r") as f: # Opens the file
    while True:
        line = f.readline()
        if(len(line)==0):
            break
        max_len = max(max_len, len(line[:-1]))
print max_len
For those still in need, this is a little function which returns the longest line:
def get_longest_line(filename):
    length_lines_list = []
    open_file_name = open(filename, "r")
    all_text = open_file_name.readlines()
    open_file_name.close()  # close as soon as the lines are in memory
    for line in all_text:
        length_lines_list.append(len(line))
    max_length_line = max(length_lines_list)
    for line in all_text:
        if len(line) == max_length_line:
            return line.strip()