I want to delete some specific lines in a file. The below code doesn't seem to work. There are no errors thrown but this code won't delete the lines that are meant to be deleted.
#!/usr/bin/python
import argparse
import re
import string
p = argparse.ArgumentParser()
p.add_argument("input", help="input the data in format ip:port:name", nargs='*')
args = p.parse_args()
kkk_list = args.input # ['1.1.1.1:443:myname', '2.2.2.2:443:yourname']
def getStringInFormat(ip, port, name):
    formattedText = "HOST Address:{ip}:PORT:{port}\n"\
                    " server tcp\n"\
                    " server {ip}:{port} name {name}\n\n".format(ip=ip,
                                                                 port=port,
                                                                 name=name)
    return formattedText
with open("file.txt", "r+") as f:
    fileContent = f.read()
    # below two lines delete old content of file
    f.seek(0)
    f.truncate()
    for kkk in kkk_list:
        ip, port, name = re.split(":|,", kkk)
        stringNeedsToBeDeleted = getStringInFormat(ip, port, name)
        fileContent = fileContent.replace(stringNeedsToBeDeleted, "")
    f.write(fileContent)
The content of the file I'm trying to delete from looks like the following. Please note the space before the 2nd and 3rd lines.
------------do not delete this line------
HOST Address:1.1.1.1:PORT:443\n"\
server tcp\n"\
server 1.1.1.1:443 name myname1
--------------- do not delete this line either
If the script is successful the file should look like below where there is only one new line in between.
------------do not delete this line------
------------do not delete this line either ----
Any insights?
You are doing everything correctly in your file editing loop, which means that if nothing is actually being replaced, it's because the string you are looking for doesn't exist in the file. Indeed, you tell us that you are looking for this string:
------------do not delete this line------
HOST Address:1.1.1.1:PORT:443\n"\
server tcp\n"\
server 1.1.1.1:443 name myname1
--------------- do not delete this line either
It doesn't appear to match up with the string you are trying to match it with:
formattedText = "HOST Address:{ip}:PORT:{port}\n"\
" server tcp\n"\
" server {ip}:{port} name {name}\n\n"
Keep in mind that for your current code to replace this string, the two strings have to match exactly. In this case I don't see the \n between "HOST Address..." and " server tcp\n"\, or the trailing \ characters, in the same places. But I suspect those were just formatting errors on your part.
If you really want to get to the root of this problem, I suggest you take a string you know for certain you are trying to delete and test your code against it to make sure the strings are the same. Here is an example. If you want to find:
HOST Address:1.1.1.1:PORT:443
server tcp
server 1.1.1.1:443 name myname1
Then compare with your search string via:
# The block posted above. You should probably grab this from the real
# file for consistency, so whitespace and line endings match what is on disk.
test_string = ("HOST Address:1.1.1.1:PORT:443\n"
               " server tcp\n"
               " server 1.1.1.1:443 name myname1\n\n")
kkk = '1.1.1.1:443:myname'
ip, port, name = re.split(":|,", kkk)
assert ip == '1.1.1.1'
assert port == '443'
assert name == 'myname'
stringNeedsToBeDeleted = getStringInFormat(ip, port, name)
assert stringNeedsToBeDeleted == test_string, "Error, strings are not equal!"
This should give you a clue into what the actual problem is. myname1, which I grabbed directly from your example, doesn't match up with your match string.
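To make the exact-match requirement concrete, here is a minimal Python 3 sketch. `delete_block` is a hypothetical helper, and the sample text assumes the file really contains the block followed by one blank line; neither comes from the original post.

```python
def get_string_in_format(ip, port, name):
    # Same layout as getStringInFormat in the question.
    return ("HOST Address:{ip}:PORT:{port}\n"
            " server tcp\n"
            " server {ip}:{port} name {name}\n\n").format(ip=ip, port=port, name=name)


def delete_block(content, spec):
    # spec is "ip:port:name"; str.replace only removes an exact match.
    ip, port, name = spec.split(":")
    return content.replace(get_string_in_format(ip, port, name), "")


# Assumed file content, for illustration only.
sample = (
    "------------do not delete this line------\n"
    "HOST Address:1.1.1.1:PORT:443\n"
    " server tcp\n"
    " server 1.1.1.1:443 name myname1\n"
    "\n"
    "--------------- do not delete this line either\n"
)

unchanged = delete_block(sample, "1.1.1.1:443:myname")   # name differs, nothing removed
cleaned = delete_block(sample, "1.1.1.1:443:myname1")    # exact match, block removed
```

With the argument `myname` the text comes back untouched, because the file says `myname1`; with `myname1` only the two marker lines remain.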
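You're opening the file in 'r+' mode, which allows both reading and writing but does not truncate the file, so you have to read the content first and then seek back and truncate before writing. Leaving the mode out gives you the read-only default 'r'. Alternatively, reopen the file in write mode 'w', which empties it for you.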
You can copy the contents of your file to a list, write over the old file, mutate your list, and then write the list to the file.
import argparse
import re
import string
p = argparse.ArgumentParser()
p.add_argument("input", help="input the data in format ip:port:name",nargs='*')
args = p.parse_args()
kkk_list = args.input # ['1.1.1.1:443:myname', '2.2.2.2:443:yourname']
def getStringInFormat(ip, port, name):
    formattedText = "HOST Address:{ip}:PORT:{port}\n"\
                    " server tcp\n"\
                    " server {ip}:{port} name {name}\n\n".format(ip=ip,
                                                                 port=port,
                                                                 name=name)
    return formattedText
with open("file.txt", "r+") as f:
    fileContent = f.read()  # read everything before rewriting the file
    for kkk in kkk_list:
        ip, port, name = re.split(":|,", kkk)
        stringNeedsToBeDeleted = getStringInFormat(ip, port, name)
        fileContent = fileContent.replace(stringNeedsToBeDeleted, "")
    f.seek(0)
    f.truncate()
    f.write(fileContent)  # the file now holds the content without the deleted blocks
# Alternative: keep only the first and last lines of the file
with open("file.txt") as f:
    contents = [i.rstrip('\n') for i in f]
with open('file.txt', 'w') as new_file:  # 'w' empties the file on open
    new_file.write(contents[0])
    new_file.write('\n\n\n')
    new_file.write(contents[len(contents)-1])
I have a huge report file with some data, and I have to process the lines starting with the code "MLT-TRR".
For now I have extracted all the lines that start with that code in my script and placed them in a separate file, Rules.txt. The new file looks like this:
MLT-TRR Warning C:\Users\Di\Pictures\SavedPictures\top.png 63 10 Port is not registered [Folder: 'Picture']
MLT-TRR Warning C:\Users\Di\Pictures\SavedPictures\tree.png 315 10 Port is not registered [Folder: 'Picture.first_inst']
MLT-TRR Warning C:\Users\Di\Pictures\SavedPictures\top.png 315 10 Port is not registered [Folder: 'Picture.second_inst']
MLT-TRR Warning C:\Users\Di\Pictures\SavedPictures\tree.png 317 10 Port is not registered [Folder: 'Picture.third_inst']
MLT-TRR Warning C:\Users\Di\Pictures\SavedPictures\top.png 317 10 Port is not registered [Folder: 'Picture.fourth_inst']
For each of these lines I have to extract the data that comes after "[Folder: 'Picture". If there is no data after "[Folder: 'Picture", as in my first line, then skip that line and move on to the next one.
I also want to extract the file names for each of those lines: top.txt, tree.txt
I couldn't think of a simple method for this, as it involves a loop and gets messy.
Is there a way to do this, extracting just the file paths and the ending data of each line?
import os
import sys
from os import path
import numpy as np
folder_path = os.path.dirname(os.path.abspath(__file__))
inFile1 = 'Rules.txt'
inFile2 = 'TopRules.txt'
def open_file(filename):
    try:
        with open(filename,'r') as f:
            targets = [line for line in f if "MLT-TRR" in line]
            print targets
            f.close()
        with open(inFile1, "w") as f2:
            for line in targets:
                f2.write(line + "\n")
            f2.close()
    except Exception,e:
        print str(e)
        exit(1)
if __name__ == '__main__':
    name = sys.argv[1]
    filename = sys.argv[1]
    open_file(filename)
To extract the filenames and other data, you should be able to use a regular expression:
import re
for line in f:
    match = re.match(r"^MLT-TRR.*([A-Za-z]:\\[-A-Za-z0-9_:\\.]+).*\[Folder: 'Picture\.(\w+)']", line)
    if match:
        filename = match.group(1)
        data = match.group(2)
This assumes that the data after 'Picture. only contains alphanumeric characters and underscores. And you may have to change the allowed characters in the filename part [A-Za-z0-9_:\\.] if you have weird filenames. It also assumes the filenames start with the Windows drive letter (so absolute paths), to make it easier to distinguish from other data in the line.
If you just want the basename of the filename, then after extracting it you can use os.path.basename or pathlib.Path.name.
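As a minimal Python 3 sketch of both steps together, the snippet below applies the same pattern to two of the sample lines from the question. The names `LINE_RE`, `extract` and `sample_lines` are made up for illustration. One assumption worth flagging: `os.path.basename` only splits on backslashes when the script runs on Windows, so the sketch uses `pathlib.PureWindowsPath`, which handles Windows paths on any operating system.

```python
import re
from pathlib import PureWindowsPath

# Same pattern as above, compiled once.
LINE_RE = re.compile(
    r"^MLT-TRR.*([A-Za-z]:\\[-A-Za-z0-9_:\\.]+).*\[Folder: 'Picture\.(\w+)'\]"
)


def extract(lines):
    """Return (file name, folder suffix) pairs for lines that carry a suffix."""
    results = []
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            # PureWindowsPath splits on backslashes regardless of the host OS.
            results.append((PureWindowsPath(match.group(1)).name, match.group(2)))
    return results


sample_lines = [
    r"MLT-TRR Warning C:\Users\Di\Pictures\SavedPictures\top.png 63 10 Port is not registered [Folder: 'Picture']",
    r"MLT-TRR Warning C:\Users\Di\Pictures\SavedPictures\tree.png 315 10 Port is not registered [Folder: 'Picture.first_inst']",
]
pairs = extract(sample_lines)
```

The first sample line has nothing after `Picture`, so the pattern does not match and the line is skipped; the second yields `tree.png` and `first_inst`.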
I had a very similar problem and solved it by searching for the specific line 'key' (in your case "MLT-TRR") with a regex and then specifying which 'bytes' to take from that line. I then append the selected data to a list.
import re  # regular expression module
# Make empty lists:
P190 = []       # my file
shot = []       # events in my file (multiple lines of text for each event)
S011east = []   # what I want
S011north = []  # another thing I want
# Create your regex:
S011 = re.compile(r"^S0\w*\W*11\b")
# Search and append:
# Open the P190 file (import_file_path is the path to your input file)
with open(import_file_path, 'rt') as infile:
    for lines in infile:
        P190.append(lines.rstrip('\n'))
# Locate specific lines and extract data
for line in P190:
    if S011.search(line) is not None:
        easting = line[47:55]
        easting = float(easting)
        S011east.append(easting)
        northing = line[55:64]
        northing = float(northing)
        S011north.append(northing)
If you set up the regex to look for "MLT-TRR ????? Folder: 'Picture.'" then it should skip any lines that don't have any further information.
For the second part of your question:
I doubt your file names have a constant length, so the above method won't work there, as you can't specify a fixed number of bytes to extract. This code extracts the name and extension from a file path; you could apply it to whatever you extract from each line.
import os
tail=os.path.basename(import_file_path) #Get file name from path
I've got a problem with validating a text file. I need to check whether the parameters I set were saved correctly. The file name is the current date and time, and I need to check whether the parameters that were sent appear in this text (log) file. Below you can find my code:
Arguments are passed with argparse, e.g.
parser.add_argument("freq", type=int)
print('Saving Measurement...')
print(inst.write(':MMEMory:STORe:TRACe 0, "%s"' % timestr)) #Saving file on the inst
time.sleep(1) #Wait for file to save
print('Downloading file from device...')
ftp = FTP('XX.XX.XXX.XXX')
ftp.login()
ftp.retrbinary('RETR %s'% timestr + '.spa', open(timestr + '.spa', 'wb').write) #downloading saved file into a directory where you run script
print('Done, saved as: ' + timestr)
time.sleep(1)
with open (timestr + '.spa') as f:
    if (str(args.freq)) in f.read():
        print("saved correctly")
ftp.delete(timestr + '.spa') #Delete file from inst
ftp.quit()
I'm not sure if it works for me. Thank you for your help
You could use the re module to help you find a pattern inside your file. Here is a little example that searches for one particular pattern, in this case dates in the form dd-mm-yyyy:
import re
filepath = 'your-file-path.spa'
regex = r'\d\d-\d\d-\d\d\d\d'
with open(filepath, 'r') as f:
    file = f.read()
dates_found = re.findall(regex, file)
# dates_found will be a list with all the dates found in the file
print(dates_found)
You could use any regex you want as the first argument of re.findall(regex, file)
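The same idea can be turned toward the frequency check in the question. A plain substring test such as `str(args.freq) in f.read()` also succeeds when the number is part of a longer one. Below is a minimal sketch of a stricter check, assuming the value appears in the log as a standalone integer; `contains_number` is a hypothetical helper name, and the format of the .spa file is not known here.

```python
import re


def contains_number(text, value):
    """Return True if value occurs in text as a whole number."""
    # The lookarounds reject matches embedded in a longer run of digits.
    pattern = r"(?<!\d)" + re.escape(str(value)) + r"(?!\d)"
    return re.search(pattern, text) is not None
```

It could be called as `contains_number(f.read(), args.freq)` in place of the substring test.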
I need a way to compare two files that have the same hostname in them. I have written a function that will parse out the hostnames and save that in a list. Once I have that I need to be able to compare the files.
Each file is in a different directory.
Step One: Retrieve "hostname" from each file.
Step Two: Run compare on files with same "hostname" from two directories.
Retrieve hostname Code:
def hostname_parse(directory):
    results = []
    try:
        for filename in os.listdir(directory):
            if filename.endswith(('.cfg', '.startup', '.confg')):
                file_name = os.path.join(directory, filename)
                with open(file_name, "r") as in_file:
                    for line in in_file:
                        match = re.search('hostname\s(\S+)', line)
                        if match:
                            results.append(match.group(1))
                            #print "Match Found"
        return results
    except IOError as (errno, strerror):
        print "I/O error({0}): {1}".format(errno, strerror)
        print "Error in hostname_parse function"
Sample Data:
Test File:
19-30#
!
version 12.3
service timestamps debug datetime msec
service timestamps log datetime msec
service password-encryption
!
hostname 19-30
!
boot-start-marker
boot-end-marker
!
ntp clock-period 17179738
ntp source Loopback0
!
end
19-30#
In this case the hostname is 19-30. For ease of testing I just used the same file but modified it to be the same or not the same.
As stated above. I can extract the hostname but am now looking for a way to then compare the files based on the hostname found.
At the core of things it is a file comparison. However being able to look at specific fields will be what I would like to accomplish. For starters I'm just looking to see that the files are identical. Case sensitivity shouldn't matter as these are cisco generated files that have the same formatting. The contents of the files are more important as I'm looking for "configuration" changes.
Here is some code to meet your requirements. I had no way to test it, so it may have a few problems. I used hashlib to calculate a hash of each file's contents as a way to find changes.
import hashlib
import os
import re
HOSTNAME_RE = re.compile(r'hostname +(\S+)')

def get_file_info_from_lines(filename, file_lines):
    hostname = None
    a_hash = hashlib.sha1()
    for line in file_lines:
        a_hash.update(line.encode('utf-8'))
        match = HOSTNAME_RE.match(line)
        if match:
            hostname = match.group(1)
    return hostname, filename, a_hash.hexdigest()

def get_file_info(filename):
    if filename.endswith(('.cfg', '.startup', '.confg')):
        with open(filename, "r") as in_file:
            return get_file_info_from_lines(filename, in_file.readlines())

def hostname_parse(directory):
    results = {}
    for filename in os.listdir(directory):
        # os.listdir returns bare names, so join them with the directory
        info = get_file_info(os.path.join(directory, filename))
        if info is not None:
            results[info[0]] = info
    return results

results1 = hostname_parse('dir1')
results2 = hostname_parse('dir2')
for hostname, filename, filehash in results1.values():
    if hostname in results2:
        _, filename2, filehash2 = results2[hostname]
        if filehash != filehash2:
            print("%s has a change (%s, %s)" % (
                hostname, filehash, filehash2))
            print(filename)
            print(filename2)
            print()
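A hash only says that two files differ, not where. Since the goal is to spot configuration changes, a possible follow-up is to print a line-by-line diff for the pairs whose hashes differ. This is a sketch using the standard `difflib` module, which is a different technique from the hashing above; `config_diff` and the two sample texts are invented for illustration.

```python
import difflib


def config_diff(text_a, text_b, name_a="dir1", name_b="dir2"):
    """Return unified-diff lines between two configuration texts."""
    return list(difflib.unified_diff(
        text_a.splitlines(), text_b.splitlines(),
        fromfile=name_a, tofile=name_b, lineterm=""))


# Two made-up configurations that differ in one line.
old = "hostname 19-30\nntp source Loopback0\nend\n"
new = "hostname 19-30\nntp source Loopback1\nend\n"
changes = config_diff(old, new)
```

Lines starting with `-` exist only in the first text and lines starting with `+` only in the second; identical texts give an empty list.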
I am trying to write a simple Python SMTP enumeration script, which reads usernames from a text file (filename supplied as the second argument, sys.argv[2]) and checks them against an SMTP server (hostname or IP supplied as the first argument, sys.argv[1]). I found something that is kind of close, and tweaked it a bit, like so:
#!/usr/bin/python
import socket
import sys
users = sys.argv[2]
for line in users:
    line = line.strip()
    if line!='':
        users.append(line)
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((sys.argv[1], 25))
fn = s.makefile('rwb')
fn.readline()
fn.write('HELO testing.com \r\n')
fn.flush()
fn.readline()
for user in users:
    fn.write('VRFY %s\r\n' % user)
    fn.flush()
    print '%s: %s' % (user, fn.readline().strip())
fn.write('QUIT\r\n')
fn.flush()
s.close()
However, when I run the script (for example):
./smtp-vrfy.py 192.168.1.9 users.txt
It results in an error:
File "./smtp-vrfy.py", line 10, in <module>
users.append(line)
AttributeError: 'str' object has no attribute 'append'
What am I doing wrong? How can I fix it? Perhaps there is an easier way to accomplish what I'm trying to do?
users is a file name, but you're not reading it. Instead, see what happens:
>>> users = "users.txt"
>>> for line in users:
... print(line)
...
u
s
e
r
s
.
t
x
t
You probably want:
with open(users) as f:
    for line in f:
        # ...
Even better:
filename = sys.argv[2]
with open(filename) as f:
    # line.strip() is empty for blank lines, so they are filtered out
    users = [line.strip() for line in f if line.strip()]
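On the question of an easier way: the standard library's `smtplib` module has an `SMTP.vrfy` method, so the raw socket handling may not be needed. The following Python 3 sketch is untested against a real server; `read_users` and `verify_users` are hypothetical helper names, and the HELO name is simply the one from the question.

```python
import smtplib


def read_users(filename):
    """Return the non-blank lines of filename, stripped of whitespace."""
    with open(filename) as f:
        return [line.strip() for line in f if line.strip()]


def verify_users(host, users, port=25):
    """Send VRFY for each user and return {user: (code, message)}."""
    results = {}
    # The with-block sends QUIT and closes the connection on exit.
    with smtplib.SMTP(host, port) as server:
        server.helo("testing.com")
        for user in users:
            results[user] = server.vrfy(user)
    return results
```

A call such as `verify_users(sys.argv[1], read_users(sys.argv[2]))` would mirror the command line used in the question. Many servers disable VRFY, so the reply codes need to be interpreted with that in mind.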
I have written this small script to open a file, iterate through a list, and create new files based on the names in the list "hostnames.txt". Now I need to read a separate file, which we will call "template", and replace the word "hostname" with the actual hostnames taken from the "hostnames.txt" file, then write the result into the new file created for that hostname. Also, when the files are created, a "?" is added at the end of the hostname before the ".test", and I don't know where it's coming from.
import sys
input_file = open ('hostnames.txt', 'r')
template = open ('hosttemplate.txt', 'r')
count_lines = 0
for hostname in input_file:
    system = hostname
    computername = open(system.strip()+".test",'a')
    computername.write("need to write data from template to file and replace the string hostname with the hostname from hostnames.txt")
    print hostname
    count_lines += 1
print 'number of lines:', count_lines
import sys
input_file = open ('hostnames.txt', 'r')
template = open ('hosttemplate.txt', 'r')
tdata = template.readlines()
count_lines = 0
for hostname in input_file:
    system = hostname.strip()
    computername = open(system+".test",'a')
    for line in tdata:
        computername.write(line.replace('hostname', system))
    computername.close()
    print hostname
    count_lines += 1
print 'number of lines:', count_lines
Before the loop, we open hostnames.txt and read the contents of hosttemplate.txt into list tdata.
As we step through hostnames.txt ("for hostname in input_file"), we
strip the newline off hostname before assigning to system,
open the file system + ".test" with handle computername
iterate through tdata, writing each line to computername after replacing the string 'hostname' with the actual name of the host (system)
close the file open through handle computername
print the hostname and increment count_lines
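The steps above can also be sketched in Python 3 with context managers, so every file is closed automatically. `write_host_files` and its `out_dir` parameter are hypothetical names. Unlike the code above, this sketch opens the output in 'w' mode, so rerunning it overwrites the files instead of appending to them. The stray "?" in the file names is probably a carriage return from Windows line endings, which is an assumption; `strip()` removes it either way.

```python
import os


def write_host_files(hostnames_path, template_path, out_dir="."):
    """Write one <hostname>.test file per non-blank line of hostnames_path."""
    with open(template_path) as template:
        template_text = template.read()
    written = []
    with open(hostnames_path) as hostnames:
        for raw in hostnames:
            system = raw.strip()  # drops the newline and any stray '\r'
            if not system:
                continue  # skip blank lines
            out_path = os.path.join(out_dir, system + ".test")
            with open(out_path, "w") as out:
                out.write(template_text.replace("hostname", system))
            written.append(out_path)
    return written
```

Calling `write_host_files('hostnames.txt', 'hosttemplate.txt')` returns the list of files it created, whose length replaces the manual `count_lines` counter.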