Trying to modify text file with if statements - python

I have a text file that I am reading in; based on certain conditions, I modify specific lines and write the result to a new text file. My present code mostly works, but one of the elif branches seems to be silently ignored by Python: there are no runtime errors, yet it has no effect. A MWE is as follows:
energy = .1
source = str('source x y z energy=%f' %energy)
c = energy - 0.001
c_string = str("1010 %f %f" %(c, energy))
f = open("file.txt", "r+")
with open ("newfiletxti", 'w') as n:
    d = f.readlines()
    for line in d:
        if not line.startswith("source"):
            if not line.startswith("xyz"):
                n.write(line)
        elif line.startswith("source"):
            n.write(source + "\n")
        elif line.startswith("xyz"):
            n.write(c_string + "\n")
    n.truncate()
    n.close()
The code:
elif line.startswith("source"):
    n.write(source + "\n")
works as expected: the line in the text file is replaced with the string named source. However, the next block:
elif line.startswith("xyz"):
    n.write(c_string + "\n")
has no effect. The new text file is simply missing the line that starts with xyz. My guess is that my syntax for multiple elif statements is incorrect, but I am uncertain why.

The first if and elif handle all the cases -- either the line starts with source or it doesn't. I think you need to combine the first if and its nested if into a single condition:
if not line.startswith("source") and not line.startswith("xyz"):
    n.write(line)
or the equivalent (by de Morgan's Laws):
if not(line.startswith("source") or line.startswith("xyz")):
    n.write(line)
Or you can make it clearer by reordering your conditions:
if line.startswith("source"):
    n.write(source + "\n")
elif line.startswith("xyz"):
    n.write(c_string + "\n")
else:
    n.write(line)

Try your if block like this:
if line.startswith("source"):
    n.write(source + "\n")
elif line.startswith("xyz"):
    n.write(c_string + "\n")
else:
    n.write(line)

The second elif (the "xyz" branch) will never be reached. Here is the code reduced for clarity:
if not line.startswith("source"):
    # stuff
elif line.startswith("xyz"):
    # stuff
Something that starts with "xyz" does not start with "source", so it is always caught by the first if and never gets as far as the elif.
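Putting the answers above together, here is a minimal sketch of the reordered if/elif/else logic. The helper name rewrite_lines and the sample lines are hypothetical, chosen only for illustration; it works on a list of lines so it can be tried without the original file.txt.

```python
def rewrite_lines(lines, source, c_string):
    """Return a new list of lines with 'source' and 'xyz' lines replaced.

    rewrite_lines is a hypothetical helper, not part of the original code.
    """
    out = []
    for line in lines:
        if line.startswith("source"):
            out.append(source + "\n")
        elif line.startswith("xyz"):
            out.append(c_string + "\n")
        else:
            # Every other line is copied through unchanged.
            out.append(line)
    return out


energy = 0.1
source = "source x y z energy=%f" % energy
c_string = "1010 %f %f" % (energy - 0.001, energy)

# Made-up sample input standing in for the contents of file.txt.
sample = ["header\n", "source old\n", "xyz old\n", "footer\n"]
result = rewrite_lines(sample, source, c_string)
print("".join(result), end="")
```

With real files, the same loop body would sit inside two with blocks (one for reading, one for writing), which also makes the explicit truncate() and close() calls unnecessary.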

Related

Python If/Else Logic with File I/O

I have a program that reads in 5 files, performs a "two sum" algorithm over the arrays in the file, and outputs to a new file if there's a sum of two numbers that matches the target.
I've got the logic to handle everything except the case where there's no match. If there's no match I need to write "No" to the output file. If I just add else: f.write("No") after the second if, then it'll write "No" on every pass that isn't a match. I need it to write "No" ONCE at the end of the processing, after it hasn't found a match.
# Read 5 "in" files
inPrefix = "in"
outPrefix = "out"
for i in range(1, 6):
    inFile = inPrefix + str(i) + ".txt"
    with open(inFile, 'r') as f:
        fileLines = f.readlines()
        target = fileLines[1]
        arr = fileLines[2]
    # Output 5 "out" files
    outFile = outPrefix + str(i) + ".txt"
    with open(outFile, 'a') as f:
        f.write(target)
        f.write(arr)
        target = int(target)
        num_arr = [int(j) for j in arr.split()]
        for a in range(len(num_arr)):
            for b in range(a, len(num_arr)):
                curr = num_arr[a] + num_arr[b]
                if num_arr[a]*2 == target:
                    a = str(num_arr[a])
                    target = str(target)
                    answer = "{}+{}={}".format(a,a,target)
                    f.write("Yes")
                    f.write("\n")
                    f.write(answer)
                    break
                if curr == target:
                    a = str(num_arr[a])
                    b = str(num_arr[b])
                    target = str(target)
                    answer = "{}+{}={}".format(a,b,target)
                    f.write("Yes")
                    f.write("\n")
                    f.write(answer)
                    break
    f.close()
Initialize a variable -- let's call it wrote_yes -- to False at the top of the code.
Anytime you write "Yes" to the file, set that variable to True.
When you reach the end of all the processing, check that variable. If it's still False, then you never wrote "Yes", so now you can write "No".
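The flag idea can be sketched roughly as follows. The function name two_sum_report and the variable found (playing the role of wrote_yes) are hypothetical; the function returns the lines that would be written instead of touching a file, so it can be checked on its own.

```python
def two_sum_report(num_arr, target):
    """Return the lines to write: 'Yes' plus the sum, or a single 'No'."""
    found = False  # plays the role of wrote_yes
    lines = []
    for a in range(len(num_arr)):
        # b starts at a, so a number may be paired with itself,
        # which covers the num_arr[a]*2 == target case.
        for b in range(a, len(num_arr)):
            if num_arr[a] + num_arr[b] == target:
                lines.append("Yes")
                lines.append("{}+{}={}".format(num_arr[a], num_arr[b], target))
                found = True
                break
        if found:
            break  # leave the outer loop as well
    if not found:
        # Reached only once, after every pair has been tried.
        lines.append("No")
    return lines


print(two_sum_report([1, 2, 3], 5))
print(two_sum_report([1, 2, 3], 10))
```

Checking the flag after the inner loop also stops the outer loop, which a bare break inside the inner loop does not do.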

Spell check algorithm outputs everything instead of just the typos (Python)?

I'm basically trying to code a simple spell-check program that will prompt you for an input file, then analyze the input file for possible spelling errors (by using binary search to see if the word is in the dictionary), before printing them in the output file.
However, currently, it outputs everything in the input file instead of just the errors...
My code is as follows:
import re
with open('DICTIONARY1.txt', 'r') as file:
    content = file.readlines()
    dictionary = []
    for line in content:
        line = line.rstrip()
        dictionary.append(line)
def binary_search(array, target, low, high):
    mid = (low + high) // 2
    if low > high:
        return -1
    elif array[mid] == target:
        return mid
    elif target < array[mid]:
        return binary_search(array, target, low, mid-1)
    else:
        return binary_search(array, target, mid+1, high)
input = input("Please enter file name of file to be analyzed: ")
infile = open(input, 'r')
contents = infile.readlines()
text = []
for line in contents:
    for word in line.split():
        word = re.sub('[^a-z\ \']+', " ", word.lower())
        text.append(word)
infile.close()
outfile = open('TYPO.txt', 'w')
for data in text:
    if data.strip() == '':
        pass
    elif binary_search(dictionary, data, 0, len(data)) == -1:
        outfile.write(data + "\n")
    else:
        pass
file.close
outfile.close
I can't seem to figure out what's wrong. :(
Any help would be very much appreciated!
Thank you. :)
I tried replacing len(data) with len(dictionary), as that made more sense to me, and it seems to work in my very limited tests. Note that high is used as an inclusive index here, so len(dictionary) - 1 is the safer upper bound: with len(dictionary), a word that sorts after every entry can make array[mid] index past the end of the list.
I think you were passing the length of the word in question as the upper bound on the dictionary. So if you were looking up the word "dog", you were only checking the first few entries of the dictionary, and since your dictionary is probably very large, almost no word was ever found (so every word ended up in the output file).
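As a small illustration of the bounds issue, here is a sketch of the same recursive search called with the whole dictionary as its range. The four-word dictionary and the word list are made up for the example, and the search assumes the dictionary is sorted.

```python
def binary_search(array, target, low, high):
    """Return the index of target in sorted array[low..high], or -1."""
    if low > high:
        return -1
    mid = (low + high) // 2
    if array[mid] == target:
        return mid
    if target < array[mid]:
        return binary_search(array, target, low, mid - 1)
    return binary_search(array, target, mid + 1, high)


# Made-up sample data; the real program reads DICTIONARY1.txt.
dictionary = ["apple", "cat", "dog", "zebra"]  # must already be sorted
words = ["dog", "dgo", "zebra", "zzz"]

# The bounds cover the whole dictionary, not the length of the word.
typos = [w for w in words
         if binary_search(dictionary, w, 0, len(dictionary) - 1) == -1]
print(typos)
```

Checking low > high before computing array[mid] means the function never reads an index outside the range it was given.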

Print and file.write giving different outputs, Python

message = 'aaa'
for alpha in message:
    num = ord(alpha) + 2
    ans = chr(num)
    print(ans)
    file = open('f.txt', 'w')
    file.write(ans)
    file.close()
print(ans) prints ccc as expected, but file.write(ans) writes only 'c'. How come it doesn't print the entire string? Thanks a bunch.
The write method writes starting at the file's current position. Because you reopen the file in 'w' mode on every iteration, the file is truncated each time and write starts again from the beginning, so only the last letter c survives.
You'll need to open the file before the loop and close it after the loop. This will make the file retain the position (check that with file.tell()):
message = "aaa"
file = open('f.txt', 'w')
for alpha in message:
    num = ord(alpha) + 2
    ans = chr(num)
    print(ans)
    file.write(ans)
file.close()
Or, even better, use a context manager:
message = "aaa"
with open('f.txt', 'w') as file:
    for alpha in message:
        num = ord(alpha) + 2
        ans = chr(num)
        print(ans)
        file.write(ans)
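Another option, sketched here as a variation rather than as what the answer proposes, is to build the whole shifted string first and write it in a single call. The temporary directory is used only so the sketch does not overwrite a real f.txt.

```python
import os
import tempfile

message = "aaa"
# Shift every character by 2 and join the results into one string.
shifted = "".join(chr(ord(ch) + 2) for ch in message)

# A throwaway location, chosen only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "f.txt")
with open(path, "w") as file:
    file.write(shifted)  # a single write, so nothing gets overwritten

with open(path) as file:
    contents = file.read()
print(contents)
```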

How to make automated writer to append lines in Python

Here is my code:
b = 1
a = "line"
f = open("test.txt", "rb+")
if a + " " + str(b) in f.read():
    f.write(a + " " + str(b + 1) + "\n")
else:
    f.write(a + " " + str(b) + "\n")
f.close()
It now writes line 1 and then line 2, but how can I make it read what the last "line x" is and write line x + 1?
For example:
test.txt would have
line 1
line 2
line 3
line 4
and my code would append line 5 in the end.
I was thinking maybe some kind of "find last word" kind of code?
How can I do this?
If you know for certain that every line has the format "word number" then you could use:
f = open("test.txt", "r+")
# Set l to be the last line
for l in f:
    pass
# Get the number from the last word in the line
num = int(l.split()[-1])
# Append the next line number
f.write("line %d\n" % (num + 1))
f.close()
If the format of each line can change and you also need to handle extracting numbers, re might be useful.
import re
f = open("test.txt", "r+")
# Set l to be the last line
for l in f:
    pass
# Get the numbers in the line
numstrings = re.findall(r'(\d+)', l)
# Handle no numbers
if len(numstrings) == 0:
    num = 0
else:
    num = int(numstrings[0])
# Append the next line number
f.write("line %d\n" % (num + 1))
f.close()
You can find more efficient ways of getting the last line in the question "What is the most efficient way to get first and last line of a text file?"
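For Python 3, the same idea can be sketched with separate read and append steps. The helper name append_next_line and the temporary file are assumptions made for this illustration; it takes the last number on the last non-empty line and falls back to 1 when no number is found.

```python
import os
import re
import tempfile


def append_next_line(path, prefix="line"):
    """Append '<prefix> N+1', where N is the last number on the last line.

    append_next_line is a hypothetical helper, not from the original post.
    """
    last = ""
    with open(path, "r") as f:
        for line in f:
            if line.strip():  # remember the last non-empty line
                last = line
    numbers = re.findall(r"\d+", last)
    next_num = int(numbers[-1]) + 1 if numbers else 1
    with open(path, "a") as f:  # 'a' always writes at the end of the file
        f.write("%s %d\n" % (prefix, next_num))
    return next_num


# Build a throwaway test.txt holding "line 1" through "line 4".
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "w") as f:
    f.write("line 1\nline 2\nline 3\nline 4\n")

next_num = append_next_line(path)
with open(path) as f:
    lines = f.read().splitlines()
print(lines[-1])
```

Opening the file a second time in 'a' mode avoids mixing reads and writes on one handle, at the cost of one extra open call.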

Forward slash not found in last string in line

My code is not very good, but this is a really interesting problem. When looking for a forward slash in a string, every occurrence is found except when the forward slash is in the last word in the file. Here is my code.
#!/usr/bin/python
import sys
if len(sys.argv)!=2:
    print "usage: %s filename\n" % (sys.argv[0]);
    exit(0);
f = open(sys.argv[1]);
lines = [i for i in f.readlines()]
finals = [];
for line in lines:
    words = line.split(",");
    for word in words:
        if word.find("/") != -1:
            datefixes = word.split("/")
            if datefixes[2].__len__() == 4:
                temp = datefixes[2]
                word = datefixes[0] + "-" + datefixes[1] + "-" + temp[-2:]
        finals += "," + word;
tempstring = ''.join(finals)
finallist = tempstring.split("\r\n")
finalstring = ""
for tmpstrpart in finallist:
    if tmpstrpart != "" or tmpstrpart !="\r\n":
        finalstring += tmpstrpart[1:] + "\r\n"
print finalstring
And here is a sample input:
ACPVBF,1930-729,Z729,12/16/2014,6/10/2008,1/5/2003,44-48-46,39-43-41,35-39-37,29-33-31
ACPVGT,1930-729,Z729,25-29-27,19-23-21,14-18-16,7/11/2009,2/6/2004,48-2-0,42-46-44
ACPUQH,1930-729,Z729,32-40-19,26-34-13,21-29-8,14-22-1,9/17/1946,5/13/1942,49-7-36
ACPVOU,1930-729,Z729,42-0-29,36-44-23,31-39-18,24-32-11,19-27-6,15-23-2,9/17/1946
In the code, these lines are split by commas. If the word at the end contains a /, the forward slash is not handled, but only when it is at the end. The rest work fine.
edit: The output I am currently getting on these lines is:
ACPVBF,1930-729,Z729,12-16-14,6-10-08,1-5-03,44-48-46,39-43-41,35-39-37,29-33-31
ACPVGT,1930-729,Z729,25-29-27,19-23-21,14-18-16,7-11-09,2-6-04,48-2-0,42-46-44
ACPUQH,1930-729,Z729,32-40-19,26-34-13,21-29-8,14-22-1,9-17-46,5-13-42,49-7-36
ACPVOU,1930-729,Z729,42-0-29,36-44-23,31-39-18,24-32-11,19-27-6,15-23-2,9/17/1946
The output that I am trying to get from these lines is:
ACPVBF,1930-729,Z729,12-16-14,6-10-08,1-5-03,44-48-46,39-43-41,35-39-37,29-33-31
ACPVGT,1930-729,Z729,25-29-27,19-23-21,14-18-16,7-11-09,2-6-04,48-2-0,42-46-44
ACPUQH,1930-729,Z729,32-40-19,26-34-13,21-29-8,14-22-1,9-17-46,5-13-42,49-7-36
ACPVOU,1930-729,Z729,42-0-29,36-44-23,31-39-18,24-32-11,19-27-6,15-23-2,[9-17-46]
I want the one with the brackets around it to change also.
Final working code, based on BrenBarn's answer:
#!/usr/bin/python
import sys
import re
if len(sys.argv)!=2:
    print "usage: %s filename\n" % (sys.argv[0]);
    exit(0);
f = open(sys.argv[1]);
x = f.read()
f.close()
filename = sys.argv[1]
filename = filename[:-4] + " finished.csv"
f = open(filename, 'w')
f.write(re.sub(r'(\d{1,2})/(\d{1,2})/\d{2}(\d{2})', r'\1-\2-\3', x))
f.close()
Thanks for all the help. Sorry I can't upvote yet.
I'm not sure what the problem is, but I think what you're trying to do can be accomplished way more easily by just using a regular expression.
>>> print x
ACPVBF,1930-729,Z729,12/16/2014,6/10/2008,1/5/2003,44-48-46,39-43-41,35-39-37,29-33-31
ACPVGT,1930-729,Z729,25-29-27,19-23-21,14-18-16,7/11/2009,2/6/2004,48-2-0,42-46-44
ACPUQH,1930-729,Z729,32-40-19,26-34-13,21-29-8,14-22-1,9/17/1946,5/13/1942,49-7-36
ACPVOU,1930-729,Z729,42-0-29,36-44-23,31-39-18,24-32-11,19-27-6,15-23-2,9/17/1946
>>> print re.sub(r'(\d{1,2})/(\d{1,2})/\d{2}(\d{2})', r'\1-\2-\3', x)
ACPVBF,1930-729,Z729,12-16-14,6-10-08,1-5-03,44-48-46,39-43-41,35-39-37,29-33-31
ACPVGT,1930-729,Z729,25-29-27,19-23-21,14-18-16,7-11-09,2-6-04,48-2-0,42-46-44
ACPUQH,1930-729,Z729,32-40-19,26-34-13,21-29-8,14-22-1,9-17-46,5-13-42,49-7-36
ACPVOU,1930-729,Z729,42-0-29,36-44-23,31-39-18,24-32-11,19-27-6,15-23-2,9-17-46
I think the issue with your code is the trailing newline. The last word on each line still has the line ending attached, so its year part is longer than 4 characters and the length check fails. I would suggest you call line.strip() before splitting.
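The trailing-newline explanation can be checked with a short Python 3 sketch. It assumes the file uses "\r\n" line endings (as the split on "\r\n" in the question suggests), uses a shortened version of the last sample row, and introduces a hypothetical helper fix_dates for the strip-then-split approach.

```python
# Shortened last row of the sample input, with an assumed "\r\n" ending.
line = "ACPVOU,1930-729,Z729,15-23-2,9/17/1946\r\n"

# Without strip(), the year still carries the line ending.
last = line.split(",")[-1]
year_len = len(last.split("/")[2])  # "1946\r\n" has 6 characters, not 4


def fix_dates(line):
    """Rewrite m/d/yyyy fields as m-d-yy (hypothetical helper)."""
    fixed = []
    for word in line.strip().split(","):
        parts = word.split("/")
        if len(parts) == 3 and len(parts[2]) == 4:
            word = parts[0] + "-" + parts[1] + "-" + parts[2][-2:]
        fixed.append(word)
    return ",".join(fixed)


print(year_len)
print(fix_dates(line))
```

Stripping before splitting keeps the field-by-field approach of the question; the regular expression in the other answer reaches the same result without splitting at all.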
