python: remove strings found in other files between specific strings - python

I've got a txt file like:
first.txt
Johnny^plumber^NY;Anna^doctor^Washington;Kate^admin^Florida
Then I've got one or many output3*.txt files in a folder, to which data is being saved all the time:
haha plumber blabla;
Another one could look like:
haha doctor blabla;haha admin blabla
If an output3*.txt file does not contain the word "exit", the script waits a few seconds, then searches every such file for the words between haha and blabla (plumber, doctor, admin) and removes those words from the first txt file.
file_names3 = glob.glob(pathtemp+"/output3*.txt")
abort_after = 1 * 5
start = time.time()
while True:
    if not file_names3:
        break
    delta = time.time() - start
    if delta >= abort_after:
        with open(path+"/"+statuses, "a") as statuses:
            statuses.write("-----------------\n ERRORS:\n\n-----------------\n")
            for file_name in file_names3:
                statuses.write("%s" % file_name + " - file not done: ")
                with open(file_name, 'r') as prenotf:
                    reader=prenotf.read()
                    for "haha" in reader:
                        finding=reader[reader.find("haha")+5:reader.find("blabla")]
                        statuses.write(finding)
        break
    time.sleep(3)
    for file_name in file_names3:
        with open(file_name, "r") as zz:
            if "exit" in zz.read(): #<<<--- test data
                file_names3.remove(file_name)
    print ("\n ############# List of files still Waiting to be done:\n")
    print (file_names3)
I'm stuck on searching for those words between haha and blabla.
Thanks for any help.

When you alter an object while you're iterating through it, you foul up the inherent location pointer. This pointer is absolute. If you delete 10 characters from the file, the rest of the file shifts up, but the pointer doesn't change. This effectively skips the next 10 characters.
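The same skipping effect appears with lists: the question's loop calls file_names3.remove(file_name) while iterating over file_names3. Below is a minimal sketch of the problem and one common way around it (building a new list). The names are made up for illustration.

```python
# Hypothetical data: names ending in "_exit" stand in for finished files.
names = ["a_exit", "b_exit", "c", "d_exit"]

# Buggy: removing from the list being iterated shifts later items left,
# so the item right after each removal is never visited.
buggy = list(names)
for name in buggy:
    if name.endswith("_exit"):
        buggy.remove(name)

# Safer: build a new list instead of mutating the one being iterated.
remaining = [name for name in names if not name.endswith("_exit")]

print(buggy)      # ['b_exit', 'c']  ("b_exit" was skipped)
print(remaining)  # ['c']
```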
Your logic comes in two parts, then:
Write to a second file while you parse the first. Once you're done, you can move the new file to the old name.
Maintain an active flag. Turn it off when you hit haha and back on when you hit blabla.
It looks something like this:
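active = True
with open("tempfile.txt", 'w') as temp_file:
    for line in input_lines:  # input_lines: your input, e.g. an open file
        if "haha" in line:
            active = False  # stop copying at the opening marker
        elif "blabla" in line:
            active = True   # resume copying after the closing marker
        elif active:
            temp_file.write(line)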
Can you work that into your program's current logic?
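Since the sample data in the question keeps haha and blabla on the same line, a regular expression may be a more direct way to pull out the words between them. The following is only a sketch under stated assumptions: the helper names words_between and remove_records are hypothetical, and it assumes that "removing those words" means dropping every ;-separated record of first.txt that has one of the words as a ^-separated field.

```python
import re


def words_between(text, start="haha", end="blabla"):
    """Return the stripped text found between each start/end marker pair."""
    pattern = re.compile(
        re.escape(start) + r"\s*(.*?)\s*" + re.escape(end), re.DOTALL
    )
    return [match.strip() for match in pattern.findall(text)]


def remove_records(records_text, words, record_sep=";", field_sep="^"):
    """Drop every record that has one of the words as a whole field.

    Assumption: a record like "Anna^doctor^Washington" is removed entirely
    when "doctor" is among the words.
    """
    kept = [
        record
        for record in records_text.split(record_sep)
        if record and not any(word in record.split(field_sep) for word in words)
    ]
    return record_sep.join(kept)


first = "Johnny^plumber^NY;Anna^doctor^Washington;Kate^admin^Florida"
found = words_between("haha doctor blabla;haha admin blabla")
print(found)                         # ['doctor', 'admin']
print(remove_records(first, found))  # Johnny^plumber^NY
```

In the question's program, words_between would be called on the contents of each output3*.txt file that does not contain "exit", and the result of remove_records would be written back to first.txt.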

Related

searching for a particular character, except in comments of a file

I am working on python migration from 2 to 3.
I want to check whether the files contain a "/" operation. Since there are too many files to check by hand, I plan to use a script to do so.
Although the script works fine, some files have comments and those comments have the "/" in between.
Eg:
File:
import sys
#blah blah
#get/set ---This gets detected
a=5
b=2
c=a/b --- I want to detect this
d=5/3 --- I want to detect this
I do not want the comments section to be considered, is there any regex that could help me here?
Script:
import os
import pathlib
import re

text = '/'
APP_FOLDER = r"C\Users\Files"
file_count = 0  # number of files containing at least one "/"
for dirpath, dirnames, filenames in os.walk(APP_FOLDER):
    for inputFile in filenames:
        if pathlib.Path(inputFile).suffix == ".py":
            file_path = os.path.join(dirpath, inputFile)
            with open(file_path) as f:
                num_lines = len(f.readlines())
            with open(file_path, 'r') as fp:
                for line in fp:
                    if re.findall(text, line, flags=re.IGNORECASE):
                        file_count = file_count + 1
                        print("File path: " + file_path)
                        print("File name: " + inputFile)
                        print("*******************************************************************************")
                        break
Looking forward to suggestions. PS: The # symbol need not be the first character in the line.
NOTE:
The comments on your question actually give a better answer than this...
You can do this quite easily by simply splitting on the # character and only evaluating the part before the # character. See below:
def find_char_in_text(text, subtext, commentchar='#'):
    result = []
    for line in text.split('\n'):
        if commentchar in line:
            # split on the comment character
            # reason to not change line itself directly is
            # so you can add the whole line to the results.
            evaluate_this = line.split(commentchar)[0]
        else:
            evaluate_this = line
        if subtext in evaluate_this:
            result.append(line)
    return result
text = """File:
import sys
#blah blah
#get/set ---This gets detected
a=5
b=2
c=a/b --- I want to detect this
d=5/3 --- I want to detect this"""
for result in find_char_in_text(text, '/'):
    print(result)
output
c=a/b --- I want to detect this
d=5/3 --- I want to detect this
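Splitting on # treats a # inside a string literal as the start of a comment, and it also reports a "/" that only appears inside a string. If that matters, one alternative (not what the answer above does) is to let the standard library tokenize module find the actual division operators. This is a rough sketch; the function name lines_with_division is hypothetical, and it assumes the source can be tokenized by the Python 3 tokenizer.

```python
import io
import tokenize


def lines_with_division(source):
    """Return sorted line numbers on which a '/' or '/=' operator appears."""
    hits = set()
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        # Comments and string literals are separate token types,
        # so a "/" inside them never shows up as an OP token.
        if tok.type == tokenize.OP and tok.string in ("/", "/="):
            hits.add(tok.start[0])
    return sorted(hits)


sample = (
    "import sys\n"
    "#get/set\n"
    "a = 5\n"
    "b = 2\n"
    "c = a/b  # ratio a/b\n"
    's = "x/y # not a comment"\n'
    "d = 5/3\n"
)
print(lines_with_division(sample))  # [5, 7]
```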

Appending the correct values from a list

I am making an Instagram bot and I store the names of the users that the bot has followed in file.txt.
unique_photos = len(pic_hrefs) # TODO Let this run once and check whether this block of code works or not
followers_list = [] # Contains the names of the people you followed
for pic_href in pic_hrefs:
    driver.get(pic_href)
    sleep(2)
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    try:
        # Like this picture
        driver.find_element_by_xpath("//*[@aria-label='Like']").click()
        print("Picture liked") # TODO After checking delete this line
        follow_button = driver.find_element_by_class_name('bY2yH')
        # Follow the user if not followed already
        if follow_button.text == "•\n" + "Follow":
            follow_button.click()
            followed = driver.find_element_by_class_name('e1e1d')
            followers_list.append(followed.text)
            with open("file.txt", 'a') as file:
                file.write(",".join(followers_list))
                file.write(",")
        else:
            continue
        for second in reversed(range(0, 3)):
            print_same_line("#" + tag + ': unique photos left: ' + str(unique_photos)
                            + " | Sleeping " + str(second))
            sleep(1)
    except Exception:
        sleep(2)
    unique_photos -= 1
This is the final result in the file.txt:
kr.dramas_,kr.dramas_,marcelly.lds,kr.dramas_,marcelly.lds,espn
It's clear that the problem is that as I append the whole followers_list (which contains all the usernames of the people the bot followed) the names repeat. So I need a way to only append the new names.
And I know that I can just change the code to 'w' to create a whole new file every time but that creates a problem because after I stop the bot and if I don't unfollow the users from that list and start the bot again I will lose all the names from the file, which I don't want.
So I need suggestions so that after the bot is stopped the file.txt looks like this:
kr.dramas_,marcelly.lds,espn,
I would suggest that once you've followed everyone, you can read all of the names from the file into a list/set and then add names that aren't in the list/set into it. Then simply overwrite the old file.
followers_list = [] # will be populated with follower names
with open("file.txt", 'r') as file:
    # skip empty entries left behind by a trailing comma or newline
    file_names = [name for name in file.readline().strip().split(",") if name]
for follower in followers_list:
    if follower not in file_names:
        file_names.append(follower)
with open("file.txt", 'w') as file:
    file.write(",".join(file_names))
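Another option, sketched here as an assumption rather than as part of the answer above, is to append only the newly followed name at the moment of following, and to load the names already in the file once at start-up so that restarting the bot does not lose or duplicate them. The helper names load_followed and record_follow are hypothetical.

```python
import os


def load_followed(path):
    """Return the set of names already stored in the comma-separated file."""
    if not os.path.exists(path):
        return set()
    with open(path) as f:
        return {name.strip() for name in f.read().split(",") if name.strip()}


def record_follow(path, name, seen):
    """Append name to the file only if it has not been recorded before."""
    if name in seen:
        return False
    with open(path, "a") as f:
        f.write(name + ",")
    seen.add(name)
    return True
```

In the bot's loop this would replace the block that writes ",".join(followers_list): call seen = load_followed("file.txt") once before the loop, then record_follow("file.txt", followed.text, seen) after each successful follow.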

Error whilst trying to delete string from a 'txt' file - Contacts list program

I'm creating a contact list/book program that can create new contacts, save them in a 'txt' file, list all contacts, and delete existing contacts. Well, sort of. My delete function has a bug and I can't quite tell why. No error is shown in the shell when running. It's meant to ask the user which contact they want to delete, find what the user entered in the 'txt' file, and then delete it. It can find it easily; however, it just doesn't delete the string at all.
I have tried other methods including if/else statements, other online code (copied) - nothing works.
import os, time, random, sys, pyautogui
#function for creating a new contact.
def new_contact():
    name = str(input("Clients name?\n:"))
    name = name + " -"
    info = str(input("Info about the client?\n:"))
    #starts formatting clients name and info for injection into file.
    total = "\n\n"
    total = total + name
    total = total + " "
    total = total + info
    total = total + "\n"
    #Injects info into file.
    with open("DATA.txt", "a") as file:
        file.write(str(total))
        file.close
    main()
#function for listing ALL contacts made.
def list():
    file = open("DATA.txt", "r")
    read = file.read()
    file.close
    #detects whether there are any contacts at all. If there are none the only str in the file is "Clients:"
    if read == "Clients:":
        op = str(input("You havn't made any contacts yet..\nDo you wish to make one?\n:"))
        if op == "y":
            new_contact()
        else:
            main()
    else:
        print (read)
        os.system('pause')
        main()
#Function for deleting contact
def delete_contact():
    file = open("DATA.txt", "r")
    read = file.read()
    file.close
    #detects whether there are any contacts at all. If there are none the only str in the file is "Clients:"
    if read == "Clients:":
        op = str(input("You havn't made any contacts yet..\nDo you wish to make one?\n:"))
        if op == "y":
            new_contact()
        else:
            main()
    else:
        #tries to delete whatever was inputted by the user.
        file = open("DATA.txt", "r")
        read = file.read()
        file.close
        print (read, "\n")
        op = input("copy the Clinets name and information you wish to delete\n:")
        with open("DATA.txt") as f:
            reptext=f.read().replace((op), '')
        with open("FileName", "w") as f:
            f.write(reptext)
        main()
#Main Menu Basically.
def main():
    list_contacts = str(input("List contacts? - L\n\n\nDo you want to make a new contact - N\n\n\nDo you want to delete a contact? - D\n:"))
    if list_contacts in ("L", "l"):
        list()
    elif list_contacts in ("N", "n"):
        new_contact()
    elif list_contacts in ("D", "d"):
        delete_contact()
    else:
        main()
main()
It is expected to delete everything the user inputs from the txt file. No errors show up on shell/console, it's as if the program thinks it's done it, but it hasn't. The content in the txt file contains:
Clients:
Erich - Developer
Bob - Test subject
In your delete function, instead of opening DATA.txt, you open "FileName"
When using "with", a file handle doesn't need to be closed. Also, file.close is a method: writing file.close without parentheses only references it and never calls it.
In addition, in the delete function, you opened “fileName” instead of “DATA.txt”
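Putting both points together, the delete step could look roughly like the sketch below. It is a hedged illustration, not the asker's final code: the function name delete_contact_text and its parameters are hypothetical, and it takes the file path and the text to delete as arguments instead of calling input().

```python
def delete_contact_text(path, text_to_delete):
    """Remove text_to_delete from the file at path; return True if it was found."""
    # Read the current contents; "with" closes the file automatically.
    with open(path) as f:
        contents = f.read()
    if not text_to_delete or text_to_delete not in contents:
        return False
    # Write back to the SAME file that was read, not to a different name.
    with open(path, "w") as f:
        f.write(contents.replace(text_to_delete, ""))
    return True
```

In the program above it would be called as delete_contact_text("DATA.txt", op).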

stop while loop when the text ends

I have a program that loops through the lines of a book to match some tags I've created indicating the start and the end of each chapter of this book. I want to separate each chapter into a different file. The program finds each chapter and asks the user to name the file, then it continues until the next chapter and so on. I don't know exactly where to put my "break" or something that could stop my loop. The program runs well but when it reaches the last chapter it goes back to the first chapter. I want to stop the loop and terminate the program when the tags and the chapters finish and also print something like "End of chapters". Can anyone help me with that? The code is below:
import re
def separate_files ():
    with open('sample.txt') as file:
        chapters = file.readlines()
    pat=re.compile(r"[#introS\].[\#introEnd#]")
    reg= list(filter(pat.match, chapters))
    txt=' '
    while True:
        for i in chapters:
            if i in reg:
                print(i)
                inp=input("write text a file? Y|N: ")
                if inp =='Y':
                    txt=i
                    file_name=input('Name your file: ')
                    out_file=open(file_name,'w')
                    out_file.write(txt)
                    out_file.close()
                    print('text', inp, 'written to a file')
                elif inp =='N':
                    break
                else:
                    continue
            else:
                continue
separate_files()
I think a simpler definition would be
import re
def separate_files ():
    pat = re.compile(r"[#introS\].[\#introEnd#]")
    with open('sample.txt') as file:
        for i in filter(pat.match, file):
            print(i)
            inp = input("write text to a file? Y|N: ")
            if inp != "Y":
                continue
            file_name = input("Name of your file: ")
            with open(file_name, "w") as out_file:
                out_file.write(i)
            print("text {} written to a file".format(i))
Continue the loop as soon as possible in each case, so that the following code doesn't need to be nested more and more deeply. Also, there's no apparent need to read the entire file into memory at once; just match each line against the pattern as it comes up.
You might also consider simply asking for a file name, treating a blank file name as declining to write the line to a file.
for i in filter(pat.match, file):
    print(i)
    file_name = input("Enter a file name to write to (or leave blank to continue): ")
    if not file_name:
        continue
    with open(file_name, "w") as out_file:
        out_file.write(i)
    print("text {} written to {}".format(i, file_name))
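I can't run your code, but I assume that if you remove the
while True:
line it should work fine. Its condition is always true and nothing inside ever exits it (the break only leaves the inner for loop), so each time the for loop finishes it starts again from the first chapter.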
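To illustrate why no explicit break is needed: a for loop over a finite sequence of matches ends on its own, and the "End of chapters" message can simply follow the loop. The sketch below is an assumption-laden example. The tags [S] and [E] are hypothetical stand-ins for the real chapter markers, and iter_chapters is a made-up helper that works on the whole text instead of line by line.

```python
import re


def iter_chapters(text, start_tag, end_tag):
    """Yield the text found between each start_tag/end_tag pair, in order."""
    pattern = re.compile(
        re.escape(start_tag) + r"(.*?)" + re.escape(end_tag), re.DOTALL
    )
    for match in pattern.finditer(text):
        yield match.group(1).strip()


# Hypothetical sample text and tags, for illustration only.
book = "[S]Chapter one text[E] filler [S]Chapter two\ntext[E]"
for number, chapter in enumerate(iter_chapters(book, "[S]", "[E]"), 1):
    print("Chapter {}: {!r}".format(number, chapter))
# The loop stops by itself once the matches are exhausted.
print("End of chapters")
```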

conditions on python when i check hard bounces email

I wrote a Python script to verify hard bounces:
from validate_email import validate_email
with open("test.txt") as fp:
    line = fp.readline()
    cnt = 1
    while line:
        line = fp.readline()
        print ('this email :' + str(line) +'status : ' + str((validate_email(line,verify=True))))
        stt=str(validate_email(line,verify=True))
        email=str(line)
        print ("-----------------")
        cnt += 1
        if stt == "True":
            file=open("clean.txt",'w+')
            file.write(email)
        if stt == "None":
            file=open("checkagain.txt",'w+')
            file.write(email)
        if stt == "False":
            file=open("bounces.txt",'w+')
            file.write(email)
For the False condition it creates the file, but there are no emails inside, even though I am sure that I have bounced emails.
You need to close the file for your changes to be written to it. Put:
file.close()
at the end.
You should instead be using:
with open('bounces.txt', 'a') as file:
    # your file operations
That way you won't have to close the file.
Your script contains a number of errors.
Each input line contains a trailing newline.
Opening the same file for writing multiple times is hideously inefficient. Failing to close the files is what caused them to end up empty. Reopening without closing might end up discarding things you've written on some platforms.
Several operations are repeated, some merely introducing inefficiencies, others outright errors.
Here is a refactoring, with inlined comments on the changes.
from validate_email import validate_email
# Open output files just once, too
with open("test.txt") as fp, \
        open('clean.txt', 'w') as clean, \
        open('checkagain.txt', 'w') as check, \
        open('bounces.txt', 'w') as bounces:
    # Enumerate to keep track of line number
    for i, line in enumerate(fp, 1):
        # Remove trailing newline
        email = line.rstrip()
        # Only validate once; don't coerce to string
        stt = validate_email(email, verify=True)
        # stt is True, False or None, so format it instead of concatenating
        print('this email: {} status: {}'.format(email, stt))
        # Really tempted to remove this, too...
        print("-----------------")
        # Compare to the actual values, not to strings
        if stt is True:
            clean.write(line)
        elif stt is None:
            check.write(line)
        elif stt is False:
            bounces.write(line)
You are not using the line number for anything, but I left it in to show how it's usually done.
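The three-way routing can also be separated from the file handling, which makes it easy to try out without sending any network requests. This is only a sketch: sort_emails is a hypothetical helper, and the validator argument stands in for something like lambda e: validate_email(e, verify=True), assumed to return True, False or None.

```python
def sort_emails(emails, validator):
    """Group emails by validator result: True, None (unknown) or False."""
    buckets = {True: [], None: [], False: []}
    for raw in emails:
        email = raw.strip()  # drop the trailing newline and stray spaces
        if not email:
            continue  # ignore blank lines
        result = validator(email)
        # Keep None as "check again"; treat anything else as a plain bool.
        key = None if result is None else bool(result)
        buckets[key].append(email)
    return buckets


# Stand-in validator for illustration; addresses not listed map to None.
fake_results = {"ok@example.com": True, "gone@example.com": False}
buckets = sort_emails(
    ["ok@example.com\n", "gone@example.com\n", "\n", "maybe@example.com\n"],
    fake_results.get,
)
print(buckets[True])   # ['ok@example.com']
print(buckets[None])   # ['maybe@example.com']
print(buckets[False])  # ['gone@example.com']
```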
