When I open a text file, it only reads the last line - python

Say customPassFile.txt has two lines in it. First line is "123testing" and the second line is "testing321". If passwordCracking = "123testing", then the output would be that "123testing" was not found in the file (or something similar). If passwordCracking = "testing321", then the output would be that "testing321" was found in the file. I think that the for loop I have is only reading the last line of the text file. Any solutions to fix this?
import time
import linecache
def solution_one(passwordCracking):
    print("Running Solution #1 # " + time.strftime("%Y-%m-%d %H:%M:%S",time.localtime()))
    startingTimeSeconds = time.time()
    currentLine = 1
    attempt = 1
    passwordFound = False
    wordListFile = open("customPassFile.txt", encoding="utf8")
    num_lines = sum(1 for line in open('customPassFile.txt'))
    while(passwordFound == False):
        for i, line in enumerate(wordListFile):
            if(i == currentLine):
                line = line
        passwordChecking = line
        if(passwordChecking == passwordCracking):
            passwordFound = True
            endingTimeSeconds = time.time()
            overallTimeSeconds = endingTimeSeconds - startingTimeSeconds
            print("~~~~~~~~~~~~~~~~~")
            print("Password Found: {}".format(passwordChecking))
            print("ATTEMPTS: {}".format(attempt))
            print("TIME TO FIND: {} seconds".format(overallTimeSeconds))
            wordListFile.close()
            break
        elif(currentLine == num_lines):
            print("~~~~~~~~~~~~~~~~~")
            print("Stopping Solution #1 # " + time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()))
            print("REASON: Password could not be cracked")
            print("ATTEMPTS: {}".format(attempt))
            break
        else:
            attempt = attempt + 1
            currentLine = currentLine + 1
            continue

The main problem with your code is that you open the file once but iterate over it multiple times. After the first pass the file object's position is at the end of the file and stays there, so the next pass reads nothing.
Example
Sometimes an example is worth more than lots of words.
Take the file test_file.txt with the following lines:
line1
line2
Now open the file and read it twice:
f = open('./test_file.txt')
f.tell()
>>> 0
for l in f:
    print(l, end='')
else:
    print('nothing')
>>> line1
>>> line2
>>> nothing
f.tell()
>>> 12
for l in f:
    print(l, end='')
else:
    print('nothing')
>>> nothing
f.close()
The second time nothing happens, as the file object is already at the end.
Solution
Here you have two options:
1. Read the file only once, save all the lines in a list, and then use the list in your code. It should be enough to replace
    wordListFile = open("customPassFile.txt", encoding="utf8")
    num_lines = sum(1 for line in open('customPassFile.txt'))
with
    with open("customPassFile.txt", encoding="utf8") as f:
        wordListFile = f.readlines()
    num_lines = len(wordListFile)
2. Reset the file object position with seek after you have read the file. It would be something along the lines of:
    for i, line in enumerate(wordListFile):
        if(i == currentLine):
            line = line
    wordListFile.seek(0)
I would go with option 1, unless you have a memory constraint (e.g. the file is bigger than memory).
Notes
I have a few extra notes:
Python starts counting at 0 (like C/C++) and not at 1 (like Fortran), and enumerate does the same by default. So you probably want to set:
currentLine = 0
When you read a file, the newline character \n is not stripped, so you have to strip it yourself (with strip) or account for it when comparing strings (using e.g. startswith). For example:
passwordChecking == passwordCracking
will likely always return False, as passwordChecking contains \n and passwordCracking very likely doesn't.
Disclaimer
I haven't tried the code, nor my suggestions, so there might be other bugs lurking around.
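Putting the notes above together, here is a minimal sketch of a single-pass search, assuming Python 3. The helper name find_password is hypothetical, and the demo writes its own temporary copy of customPassFile.txt so that it can run anywhere. It reads the file once, strips the trailing newline before comparing, and counts attempts from 1.

```python
import os
import tempfile


def find_password(path, target):
    """Return the 1-based attempt number where target is found, or None."""
    with open(path, encoding="utf8") as word_list:
        # A single pass over the file: the position never needs to be reset.
        for attempt, line in enumerate(word_list, start=1):
            # Drop the trailing newline so the comparison can succeed.
            if line.rstrip("\n") == target:
                return attempt
    return None


# Demo with a throwaway file holding the two lines from the question.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "customPassFile.txt")
    with open(demo_path, "w", encoding="utf8") as f:
        f.write("123testing\ntesting321\n")
    first = find_password(demo_path, "123testing")
    second = find_password(demo_path, "testing321")
    missing = find_password(demo_path, "not-in-file")
```

Because the function returns as soon as a match is found, no while loop or manual line counter is needed.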

I will delete this answer once the OP understands the indentation problem, or once I understand the intention of the code.
for i, line in enumerate(wordListFile):
    if(i == currentLine):
        line = line
passwordChecking = line
#rest of the code.
Here your code is outside of the for loop, so only the last line is kept. Indent it so that it runs inside the loop:
for i, line in enumerate(wordListFile):
    if(i == currentLine):
        line = line
        passwordChecking = line
        #rest of the code.

Related

How do I count the number of lines that are full-line comments in python?

I'm trying to create a function that accepts a file as input and prints the number of lines that are full-line comments (i.e. the line begins with # followed by some comments).
For example a file that contains say the following lines should print the result 2:
abc
#some random comment
cde
fgh
#another random comment
So far I have tried something along these lines, but it is just not picking up the hash symbol:
infile = open("code.py", "r")
line = infile.readline()
def countHashedLines(filename) :
    while line != "" :
        hashes = '#'
        value = line
        print(value) #here you will get all
        #if(value == hashes): tried this but just wasn't working
        #    print("hi")
        for line in value:
            line = line.split('#', 1)[1]
            line = line.rstrip()
            print(value)
        line = infile.readline()
    return()
Thanks in advance,
Jemma
I re-worded a few statements for ease of use (subjective) but this will give you the desired output.
def countHashedLines(lines):
    tally = 0
    for line in lines:
        if line.startswith('#'):
            tally += 1
    return tally
infile = open('code.py', 'r')
all_lines = infile.readlines()
num_hash_nums = countHashedLines(all_lines) # <- 2
infile.close()
...or if you want a compact and clean version of the function...
def countHashedLines(lines):
    return len([line for line in lines if line.startswith('#')])
I would pass the file through standard input
import sys
count = 0
for line in sys.stdin:  # Note: you could also open the file and iterate through it
    if line[0] == '#':  # Every time a line begins with #
        count += 1      # Increment
print(count)
Here is another solution that uses regular expressions and will detect comments that have white space in front.
import re
def countFullLineComments(infile) :
    count = 0
    p = re.compile(r"^\s*#.*$")
    for line in infile.readlines():
        m = p.match(line)
        if m:
            count += 1
            print(m.group(0))
    return count
infile = open("code.py", "r")
print(countFullLineComments(infile))
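The same idea can be sketched without a regular expression, assuming that a comment preceded only by white space should still count as a full-line comment. The function name count_full_line_comments and the sample lines are illustrative only; any iterable of lines, including an open file object, could be passed in.

```python
def count_full_line_comments(lines):
    """Count lines whose first non-blank character is '#'."""
    # lstrip() removes leading white space, so indented comments count too.
    return sum(1 for line in lines if line.lstrip().startswith("#"))


# Sample input modelled on the question, with one indented comment added.
sample = [
    "abc\n",
    "#some random comment\n",
    "cde\n",
    "fgh\n",
    "    #another random comment\n",
]
comment_count = count_full_line_comments(sample)
```

Using a generator inside sum avoids building an intermediate list, which matters only for very large files.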

Evaluating the next line in a For Loop while in the current iteration

Here is what I am trying to do:
I am trying to solve an issue that has to do with wrapping in a text file.
I want to open a txt file, read a line and if the line contains what I want it to contain, check the next line to see if it does not contain what is in the first line. If it does not, add the line to the first line.
import re
stuff = open("my file")
for line in stuff:
    if re.search("From ", line):
        first = line
        print first
        if re.search('From ', handle.next()):
            continue
        else: first = first + handle.next()
    else: continue
I have looked at quite a few things and cannot seem to find an answer. Please help!
I would try something like this, though it does not handle three consecutive "From " lines and is not elegant at all.
lines = open("file", 'r').readlines()
lines2 = open("file2", 'w')
counter_list=[]
last_from = 0
current_count = -2  # so the check below cannot fire before a "From " line is seen
for counter, line in enumerate(lines):
    if "From " in line and counter != last_from +1:
        last_from = counter
        current_count = counter
    if current_count+1 == counter:
        if "From " in line:
            counter_list.append(current_count+1)
for counter, line in enumerate(lines):
    if counter in counter_list:
        lines2.write(line)
    else:
        lines2.write(line + '\n')  # write() takes a single string
lines2.close()
Then you can check file2 to see whether it helped.
You could also reverse the order of the lines and then check the next line instead of the previous one. That would solve your problem in one loop.
Thank you Martjin for helping me reset my mind frame! This is what I came up with:
handle = open("my file")
first = ""
second = ""
sent = ""
for line in handle:
    line = line.rstrip()
    if len(first) > 0:
        if line.startswith("From "):
            if len(sent) > 0:
                print sent
            else: continue
            first = line
            second = ""
        else:
            second = second + line
    else:
        if line.startswith("From "):
            first = line
    sent = first + second
It is probably crude, but it definitely got the job done!
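The same joining logic can be sketched more compactly by appending each continuation line to the most recent "From " line. This assumes Python 3, and that lines appearing before the first "From " line can be ignored; the name merge_wrapped and the sample input are hypothetical.

```python
def merge_wrapped(lines, marker="From "):
    """Join each non-marker line onto the preceding line that starts with marker."""
    merged = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(marker):
            # A new record begins here.
            merged.append(line)
        elif merged:
            # A wrapped continuation: glue it onto the current record.
            merged[-1] += line
        # Lines before the first marker are skipped (an assumption of this sketch).
    return merged


sample = ["From alice\n", " wrapped part\n", "From bob\n"]
records = merge_wrapped(sample)
```

Since the function only looks backwards at the last record, it never needs to peek at the next line of the file, which is what made the original loop awkward.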

Split large text file using keyword delimiter

I'm trying to split a large text file into smaller text files by using a word delimiter. I tried searching, but I've only seen posts about breaking files apart after x lines. I'm fairly new to programming, but I've made a start. I want to go through all the lines, and if a line starts with hello, put it and all the following lines into one file until the next hello is reached. The first word in the file is hello. Ultimately, I'm trying to get the text into R, but I think it would be easier if I split it up like this first. Any help is appreciated, thanks.
text_file = open("myfile.txt","r")
lines = text_file.readlines()
print len(lines)
for line in lines :
    print line
    if line[0:5] == "hello":
If you are looking for very simple logic, try this.
text_file = open("myfile.txt","r")
lines = text_file.readlines()
print len(lines)
target = open ("filename.txt", 'a') ## a will append, w will over-write
hello1Found = False
hello2Found = False
for line in lines :
    if hello1Found == True :
        if line[0:5] == "hello":
            hello2Found = True
            hello1Found = False
            break ## When the second hello is found, looping/saving to file is stopped
                  ## (using break is not good practice here, but it suffices for your simple requirement)
        else:
            print line #write the line to new file
            target.write(line)
    if hello1Found == False:
        if line[0:5] == "hello": ##find first occurrence of hello
            hello1Found = True
            print line
            target.write(line) ##if hello is found for the first time, write the
                               ##line/subsequent lines to the new file until the second hello occurs
I am new to Python. I just finished a Python for Geographic Information Systems class at Northeastern University. This is what I came up with.
def files():
    # Yield a freshly opened, numbered output file each time one is requested.
    n = 0
    while True:
        n += 1
        yield open('/output/dir/%d.txt' % n, 'w')
pattern = 'hello'
fs = files()
outfile = next(fs)
filename = r'C:\output\dir\filename.txt'
with open(filename) as infile:
    for line in infile:
        if pattern not in line:
            outfile.write(line)
        else:
            items = line.split(pattern)
            outfile.write(items[0])
            for item in items[1:]:
                # Every occurrence of the pattern starts a new output file.
                outfile.close()
                outfile = next(fs)
                outfile.write(pattern + item)
outfile.close()  # infile is closed by the with block
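For comparison, here is a minimal streaming sketch that starts a new numbered output file whenever a line begins with the keyword, so the whole input never has to be held in memory. It assumes Python 3. The function name split_file, the parts directory and the sample text are all hypothetical, and the demo runs inside a temporary directory.

```python
import os
import tempfile


def split_file(in_path, out_dir, keyword="hello"):
    """Write each keyword-delimited section of in_path to out_dir/1.txt, 2.txt, ..."""
    paths = []
    out = None
    try:
        with open(in_path) as infile:
            for line in infile:
                # Open a new output file at each keyword line (and for the
                # very first line, in case the input does not start with it).
                if line.startswith(keyword) or out is None:
                    if out is not None:
                        out.close()
                    path = os.path.join(out_dir, "%d.txt" % (len(paths) + 1))
                    out = open(path, "w")
                    paths.append(path)
                out.write(line)
    finally:
        if out is not None:
            out.close()
    return paths


# Demo on a small throwaway input file.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "myfile.txt")
    with open(source, "w") as f:
        f.write("hello first\na\nb\nhello second\nc\n")
    parts_dir = os.path.join(tmp, "parts")
    os.mkdir(parts_dir)
    parts = split_file(source, parts_dir)
    contents = []
    for part in parts:
        with open(part) as f:
            contents.append(f.read())
```

Only one output file is open at any time, which keeps the number of open handles constant however many sections the input contains.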

Python Replace String in File in With clause

I am trying to replace a string in a file.
The code below simply modifies certain substrings within the bigger string read from the file. Any ideas on how I can actually replace line with current_line in the file itself?
from sys import *
import os
import re
import datetime
import fileinput
script, filename = argv
userhome = os.path.expanduser('~')
username = os.path.split(userhome)[-1]
print "\n"
print "User: " + username
today = datetime.date.today().strftime("%Y/%m/%d")
time = datetime.datetime.now().strftime("%H:%M:%S")
print "Date: " + str(today)
print "Current time: " + str(time)
print "Filename: %s\n" % filename
def replace_string():
    found = False
    with open(filename, 'r+') as f:
        for line in f:
            if re.search("CVS Header", line):
                print line
                ####################################################################################
                # Below logic: #
                # if length of revision number is 4 characters (e.g. 1.15) then increment by 0.01 #
                # else if it is 3 characters (e.g. 1.5) then increment by 0.1 #
                ####################################################################################
                if len(line.split("$Revision: ")[1].split()[0]) == 4:
                    new_line = str.replace(line, line.split("$Revision: ")[1].split()[0], str(float(line.split("$Revision: ")[1].split()[0]) + 0.01))
                elif len(line.split("$Revision: ")[1].split()[0]) == 3:
                    new_line = str.replace(line, line.split("$Revision: ")[1].split()[0], str(float(line.split("$Revision: ")[1].split()[0]) + 0.1))
                ###
                ###
                newer_line = str.replace(new_line, line.split("$Author: ")[1].split()[0], username)
                newest_line = str.replace(newer_line, line.split("$Date: ")[1].split()[0], today)
                current_line = str.replace(newest_line, line.split("$Date: ")[1].split()[1], time)
                print current_line
                found = True
    if not found:
        print "No CVS Header exists in %s" % filename
if __name__ == "__main__":
    replace_string()
I tried adding something like..
f.write(f.replace(line, current_line))
but this just clears all the contents out of the file and leaves it blank so obviously that is incorrect.
The fileinput module provides a way to edit a file in place. If you use the inplace parameter, the file is moved to a backup file and standard output is redirected to the input file.
import fileinput
def clause(line):
    return len(line) < 5
for line in fileinput.input('file.txt', inplace=1):
    if clause(line):
        print '+ ' + line[:-1]
fileinput.close()
Trying to apply this idea to your example, it could be something like this:
def replace_string():
    found = False
    for line in fileinput.input(filename, inplace=1): # <-
        if re.search("CVS Header", line):
            #print line
            ####################################################################################
            # Below logic: #
            # if length of revision number is 4 characters (e.g. 1.15) then increment by 0.01 #
            # else if it is 3 characters (e.g. 1.5) then increment by 0.1 #
            ####################################################################################
            if len(line.split("$Revision: ")[1].split()[0]) == 4:
                new_line = str.replace(line, line.split("$Revision: ")[1].split()[0], str(float(line.split("$Revision: ")[1].split()[0]) + 0.01))
            elif len(line.split("$Revision: ")[1].split()[0]) == 3:
                new_line = str.replace(line, line.split("$Revision: ")[1].split()[0], str(float(line.split("$Revision: ")[1].split()[0]) + 0.1))
            ###
            ###
            newer_line = str.replace(new_line, line.split("$Author: ")[1].split()[0], username)
            newest_line = str.replace(newer_line, line.split("$Date: ")[1].split()[0], today)
            current_line = str.replace(newest_line, line.split("$Date: ")[1].split()[1], time)
            print current_line[:-1] # <-
            found = True
        else:
            print line[:-1] # <- keep original line otherwise
    fileinput.close() # <-
    if not found:
        print "No CVS Header exists in %s" % filename
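The examples in this answer use Python 2 print statements. Under Python 3 the same in-place technique might look like the sketch below, where print is a function and end="" avoids doubling the newline that each line already carries. The helper name replace_in_place and the sample file contents are made up for illustration.

```python
import fileinput
import os
import tempfile


def replace_in_place(path, old, new):
    """Rewrite path in place, replacing old with new on every line."""
    # While the loop runs, print() output is redirected into the file itself.
    with fileinput.input(files=(path,), inplace=True) as stream:
        for line in stream:
            # Every line must be printed, changed or not, or it is lost.
            print(line.replace(old, new), end="")


# Demo on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "file.txt")
    with open(target, "w") as f:
        f.write("alpha\nbeta\n")
    replace_in_place(target, "beta", "gamma")
    with open(target) as f:
        rewritten = f.read()
```

Using fileinput.input as a context manager closes the stream and restores standard output even if the loop raises.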
The solution proposed by user2040251 is the correct way, and it is the way used by all text editors I know. The reason is that if a major problem occurs while writing the file, the previous version stays unmodified until the new version is ready.
But of course you can edit in place if you want, provided you accept the risk of completely losing the file in case of a crash. That can be acceptable for a file under version control, since you can always get the previous committed version back.
The principle is then to read before you write, ensuring that you never overwrite something that you have not yet read.
At the simplest level, you load everything into memory with readlines, replace the line, rewind the file to the correct position (or to the beginning) and write it back.
Edit: here is a simple implementation when all lines can be loaded in memory:
fd = open(filename, "r+")
lines = fd.readlines()
for i, line in enumerate(lines):
    # test if line is the searched line
    # (found and replacement_line are placeholders for your own test and new text)
    if found :
        lines[i] = replacement_line
        break
fd.seek(0)
fd.writelines(lines)
fd.truncate()  # drop leftover bytes if the new content is shorter than the old
fd.close()
It could be done even for a big file, using readlines(16384) for example instead of readlines() to read in chunks of a little more than 16K and always reading one chunk before writing the previous one, but it is really much more complicated, and anyway you should use a backup file when processing big files.
You can create another file and write the output to it. After that, you can just remove the original file and rename the new file.
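A minimal sketch of that write-then-rename approach, assuming Python 3.3 or later for os.replace. The names rewrite_file and bump, and the sample header line, are hypothetical. The temporary file is created in the same directory as the original so that the final rename stays on one filesystem; note that mkstemp creates the file with restrictive permissions, which this sketch does not copy back from the original.

```python
import os
import tempfile


def rewrite_file(path, transform):
    """Write transform(line) for every line to a temp file, then swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp, open(path) as src:
            for line in src:
                tmp.write(transform(line))
        # The original stays untouched until this single rename succeeds.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial temp file and leave the original as it was.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def bump(line):
    # Illustrative transform: change the revision only on the header line.
    if "CVS Header" in line:
        return line.replace("1.5", "1.6")
    return line


with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "example.txt")
    with open(target, "w") as f:
        f.write("# CVS Header $Revision: 1.5 $\nbody\n")
    rewrite_file(target, bump)
    with open(target) as f:
        result = f.read()
```

If anything fails before the rename, the original file is exactly as it was, which is the safety property described above.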

How to change back a line during file read

In my code I have a line length print like this:
line = file.readline()
print("length = ", len(line))
after that I start to scan the lines by doing this:
for i in range(len(line)):
    if(file.read(1) == 'b'):
        print("letter 'b' found.")
The problem is that the for loop starts reading on line 2 of the file.
How can I make it start reading at line 1 without closing and reopening the file?
It is possible to use file.seek to move the position of the next read, but that's inefficient. You've already read in the line, so you can just process line without having to read it in a second time.
with open(filename,'r') as f:
    line = f.readline()
    print("length = ", len(line))
    if 'b' in line:
        print("letter 'b' found.")
    for line in f:
        ...
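To see why the loop in the question appears to start on line 2, here is a small sketch that uses io.StringIO as a stand-in for a real file (an assumption made so the example is self-contained). readline moves the position past the first line, so the next read continues from line 2 unless seek(0) rewinds it.

```python
import io

# An in-memory text stream that behaves like an opened text file.
f = io.StringIO("abc\nxyz\n")

first_line = f.readline()      # consumes "abc\n"
after_readline = f.read(1)     # continues on the second line

f.seek(0)                      # rewind to the start of the stream
after_seek = f.read(1)         # back at the first character of line 1

# Searching the line that was already read needs no second read at all.
has_b = "b" in first_line
```

The last line shows the cheaper alternative recommended above: test the string you already have instead of re-reading the file.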
It seems that you need to handle the first line specially.
lineno = 1
found = False
for line in file:
    if 'b' in line:
        found = True
    if lineno == 1:
        print("length of first line: %d" % len(line))
    lineno += 1
if found:
    print("letter 'b' found.")
It sounds like you want something like this:
with open('file.txt', 'r') as f:
    for line in f:
        for character in line:
            if character == "b":
                print "letter 'b' found."
or if you just need the number:
with open('file.txt', 'r') as f:
    b = sum(1 for line in f for char in line if char == "b")
print "found %d b" % b
#!/usr/bin/env python
# Open the file; I assumed it is called somefile.txt
file = open('somefile.txt','r')
# Let's loop through the lines ...
for line in file.readlines():
    # test if letter 'b' is in each line ...
    if 'b' in line:
        # print that we found a b in the line
        print "letter b found"
