How to convert this sed command to Python script? - python

I have an old shell script that includes the sed command below.
The source data ($Tmp) is an HTML table.
sed '/<table border/,/table>/d' $Tmp > $Out
Can someone help me convert this command to a Python script?
I can't figure out how to do it with regular expressions.
Many thanks.

Here's a simple implementation.
Briefly, it opens the file, iterates over it line by line, and prints each line to the output. When a line contains "<table border", the delete flag is set to True and the following lines are not printed until a line contains "table>".
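import sys

delete = False
with open(sys.argv[1]) as f:
    for line in f:
        if not delete and "<table border" in line:
            delete = True  # start marker found: stop printing
        if not delete:
            sys.stdout.write(line)  # the line already ends with a newline
        elif "table>" in line:
            delete = False  # end marker found: resume with the next line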

The script copies every line from the input file to the output file until it finds a line containing <table border. It then drops all lines up to and including the one containing /table> and resumes copying after that.
So one possibility would be:
with open('in') as inf, open('out', 'w') as outf:
    while True:
        line = inf.readline()
        if '<table border' in line:
            while True:
                line = inf.readline()
                if not line or '/table>' in line:
                    line = inf.readline()
                    break
        if not line:
            break
        outf.write(line)

import sys
with open(sys.argv[1]) as f:
    for line in f:
        if '<table border' in line:
            for line in f:
                if 'table>' in line:
                    break
        else:
            sys.stdout.write(line)
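For reuse, the same idea can be wrapped in a generator. This is only a sketch: the name drop_range and its parameters are made up for illustration, and it uses plain substring matching rather than sed's regular expressions. Like sed, it looks for the end marker only on lines after the one that opened the range.

```python
def drop_range(lines, start, end):
    """Yield lines, skipping every block that runs from a line containing
    `start` through the next later line containing `end` (both dropped)."""
    skipping = False
    for line in lines:
        if skipping:
            if end in line:
                skipping = False  # the closing line is dropped as well
            continue
        if start in line:
            skipping = True  # the opening line is dropped
            continue
        yield line


# Hypothetical sample data standing in for the contents of $Tmp.
sample = [
    "<p>keep</p>\n",
    "<table border=1>\n",
    "<tr><td>x</td></tr>\n",
    "</table>\n",
    "<p>also keep</p>\n",
]
print("".join(drop_range(sample, "<table border", "table>")), end="")
```

Because it accepts any iterable of lines, an open file object can be passed in directly, and the result can be written out with outf.writelines(...).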

Related

Python: modify part of a string inside a .cfg file

I want to access a file (C:\Programmer\Test.txt), find a line in that file beginning with 'SS', and replace everything after 'SS' on that line with the new string 'C:\Test\Flash'.
The code below prints the line I want to modify, but I can't find a suitable function to replace everything after the 'SS' with the new string.
import re
for line in open(r'C:\Programmer\Build\Test.txt'):
    if line.startswith('SS'):
        print(line)
        storedline = line
        print(storedline)
You can do
file_path = r'C:\Programmer\Build\Test.txt'  # raw strings keep the backslashes literal
new_line_content = r'C:\Test\Flash'
output = []
with open(file_path, 'r') as infile:
    line = infile.readline()
    while line:
        if line[0:2] == 'SS':
            output.append('SS{}\n'.format(new_line_content))
        else:
            output.append(line)
        line = infile.readline()
with open(file_path, 'w') as outfile:
    outfile.write(''.join(output))
Note that the test if line[0:2] == 'SS' takes your requirement "find a string inside that file beginning with 'SS'" literally: only lines whose first two characters are 'SS' are replaced.
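The replacement step can also be separated from the file handling, which makes it easier to try out on sample data. The sketch below assumes the same literal reading of the requirement; the function name replace_after_prefix and the sample lines are invented for illustration.

```python
def replace_after_prefix(lines, prefix, new_tail):
    """Return a new list in which every line starting with `prefix`
    becomes prefix + new_tail; all other lines are kept unchanged."""
    result = []
    for line in lines:
        if line.startswith(prefix):
            result.append(prefix + new_tail + "\n")
        else:
            result.append(line)
    return result


# Hypothetical file contents, one string per line.
sample = ["AA=1\n", "SS=C:\\Old\\Path\n", "BB=2\n"]
updated = replace_after_prefix(sample, "SS", r"C:\Test\Flash")
print("".join(updated), end="")
```

To apply it to a real file, read the lines with readlines(), call the function, and write the returned list back with writelines().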

Writing a line in a file when finding a keyword

I am trying to write a function that adds a line when it finds some text inside a file.
Example: it finds "hello" in a .txt file and then writes "Hi!" on the following line. One more thing: I want it to write "Hi!" not the first time it finds "hello" but the second.
Here is what I have been trying, but I don't know if the idea is right. Any help?
def line_replace(namefilein):
    print namefilein
    filein=open(namefilein, "rw")
    tag="intro"
    filein.read()
    for line in filein:
        if tag=="second" or tag=="coord":
            try:
                filein.write("\n\n %s" %(text-to-be-added))
                print line
            except:
                if tag=="coord":
                    tag="end"
                else:
                    tag="coord"
        if " text-to-find" in line:
            if tag=="intro":
                tag="first"
            elif tag=="first":
                tag="second"
    filein.close()
You can use this code:
def line_replace(namefilein):
    new_content = ''
    seen_before = False  # becomes True once the first 'hello' has been read
    with open(namefilein, 'r') as f:
        for line in f:
            new_content += line
            if 'hello' in line:
                if seen_before:
                    new_content += 'Hi!' + '\n'
                else:
                    seen_before = True
    with open(namefilein, 'w') as f:
        f.write(new_content)
Note that I am using the with statement, which in Python is a context manager. In this case it means that the file is closed automatically once the block of code has executed.
Let's suppose you have a file my_file.txt whose contents are:
hello
friend
this
is
hello
And let's say the file is in the same directory as the Python file that contains your code. Then calling:
line_replace('my_file.txt')
will leave my_file.txt with the following contents:
hello
friend
this
is
hello
Hi!
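The function above adds "Hi!" after every "hello" from the second one onward. If the line should be added after the second occurrence only, as the question asks, a counter can replace the flag. This is a sketch with invented names (insert_after_nth and its parameters), and it assumes every input line ends with a newline.

```python
def insert_after_nth(lines, keyword, text, n=2):
    """Return a copy of `lines` with `text` inserted as a new line right
    after the n-th line that contains `keyword`."""
    count = 0
    result = []
    for line in lines:
        result.append(line)
        if keyword in line:
            count += 1
            if count == n:  # fires exactly once, on the n-th match
                result.append(text + "\n")
    return result


# Hypothetical file contents with three occurrences of the keyword.
sample = ["hello\n", "friend\n", "hello\n", "again\n", "hello\n"]
print("".join(insert_after_nth(sample, "hello", "Hi!")), end="")
```

Reading the file with readlines() and writing the result back with writelines() turns this into an in-place edit, in the same way as the answer above.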

Reading a textfile into a String

I'm just starting to learn python and have a textfile that looks like this:
Hello
World
Hello
World
And I want to add the number '55' to the beginning and end of every string that starts with 'Hello',
the number '66' to the beginning and end of every string that starts with 'World',
etc.
So my final file should look like this:
55Hello55
66World66
55Hello55
66World66
I'm reading the file in all at once, storing it in a string, and then trying to append accordingly
fp = open("test.txt","r")
strHolder = fp.read()
print(strHolder)
if 'Hello' in strHolder:
    strHolder = '55' + strHolder + '55'
if 'World' in strHolder:
    strHolder = '66' + strHolder + '66'
print(strHolder)
fp.close()
However, '55' and '66' are always added to the start and end of the whole file rather than to the start and end of each matching string, so I get this output:
6655Hello
World
Hello
World
5566
Any help would be much appreciated.
You are reading the whole file at once with .read().
You can read it line by line in a for loop.
new_file = []
fp = open("test.txt", "r")
for line in fp:
    line = line.rstrip("\n")  # The string ends in a newline
    # str.rstrip("\n") removes newlines at the end
    if "Hello" in line:
        line = "55" + line + "55"
    if "World" in line:
        line = "66" + line + "66"
    new_file.append(line)
fp.close()
new_file = "\n".join(new_file)
print(new_file)
You could do it all at once, by reading the whole file and splitting by "\n" (newline)
new_file = []
fp = open("text.txt")
fp_read = fp.read()
fp.close()
for line in fp_read.split("\n"):
    if "Hello" in line:
        ...  # same handling as in the loop above
but this would load the whole file into memory at once, while the for loop only loads one line at a time (so this may not work for larger files).
With this code, any line that contains "Hello" gets "55" before and after it (even if the line is " sieohfoiHellosdf "), and likewise for "World". A line that contains both "Hello" and "World" (e.g. "Hello, World!" or "asdifhoasdfhHellosdjfhsodWorldosadh") gets "6655" before it and "5566" after it.
As a side note: you should use with to open a file, as it makes sure that the file is closed afterwards.
new_file = []
with open("test.txt") as fp:  # "r" mode is default
    for line in fp:
        line = line.rstrip("\n")
        if "Hello" in line:
            line = "55" + line + "55"
        if "World" in line:
            line = "66" + line + "66"
        new_file.append(line)
new_file = "\n".join(new_file)
print(new_file)
You need to iterate over each line of the file to get the desired result. Your code uses .read(); use .readlines() instead to get a list of all lines.
Below is sample code:
lines = []
with open("test.txt", "r") as f:
    for line in f.readlines():  # <-- iterate over each line
        line = line.rstrip("\n")  # drop the trailing newline before wrapping
        if line.startswith("Hello"):  # <-- check if line starts with "Hello"
            line = "55{}55".format(line)
        elif line.startswith("World"):
            line = "66{}66".format(line)
        lines.append(line)
print("\n".join(lines))
Why use with? From the Python docs:
The 'with' statement clarifies code that previously would use try...finally blocks to ensure that clean-up code is executed. In this section, I’ll discuss the statement as it will commonly be used. In the next section, I’ll examine the implementation details and show how to write objects for use with this statement.
The 'with' statement is a control-flow structure whose basic structure is:
with expression [as variable]:
    with-block
The expression is evaluated, and it should result in an object that supports the context management protocol (that is, has __enter__() and __exit__() methods).
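To make the protocol concrete, here is a minimal sketch of an object that can be used in a with statement. The class Tracked is invented for illustration and does nothing except record the order in which its methods run, including when the block raises.

```python
class Tracked:
    """Toy context manager that records when it is entered and exited."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self  # this is the value bound by "as r"

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")  # runs even if the block raised
        return False  # False means: do not swallow the exception


resource = Tracked()
try:
    with resource as r:
        r.events.append("body")
        raise ValueError("boom")
except ValueError:
    pass

print(resource.events)  # ['enter', 'body', 'exit']
```

File objects follow the same pattern: their __exit__ closes the file, which is why with open(...) needs no explicit close().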
Once you have read the file:
read_file = read_file.replace('hello','55hello55')
This replaces every occurrence of hello with 55hello55.
Also open the file using with open('text.txt', 'r') as file_hndler:
To read a text file, I recommend the following way which is compatible with Python 2 & 3:
import io
with io.open("test", mode="r", encoding="utf8") as fd:
    ...
Here, I assume that your file uses utf8 encoding.
Using a with statement makes sure the file is closed at the end of reading, even if an error (an exception) occurs. To learn more about context managers, take a look at the Context Library.
There are several ways to read a text file:
read the whole file with: fd.read(), or
read line by line with a loop: for line in fd.
If you read the whole file, you'll need to split the lines (see str.splitlines). Here are the two solutions:
with io.open("test", mode="r", encoding="utf8") as fd:
    content = fd.read()
for line in content.splitlines():
    if "Hello" in line:
        print("55" + line + "55")
    if "World" in line:
        print("66" + line + "66")
Or
with io.open("test", mode="r", encoding="utf8") as fd:
    for line in fd:
        line = line.rstrip("\n")  # drop the trailing newline
        if "Hello" in line:
            print("55" + line + "55")
        if "World" in line:
            print("66" + line + "66")
If you need to write the result to another file, you can open the output file in write mode and use print(thing, file=out) as follows:
with io.open("test", mode="r", encoding="utf8") as fd:
    # "test.out" is a placeholder name; it must differ from the input file,
    # because opening a file in "w" mode truncates it.
    with io.open("test.out", mode="w", encoding="utf8") as out:
        for line in fd:
            line = line.rstrip("\n")
            if "Hello" in line:
                print("55" + line + "55", file=out)
            if "World" in line:
                print("66" + line + "66", file=out)
If you use Python 2, you'll need the following directive to use the print function:
from __future__ import print_function
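Since the question ends its list of rules with "etc", the prefix checks can also be driven by a table instead of one if per word. The following is a sketch only: the dictionary TAGS and the function tag_line are made-up names, and it matches on the start of the line, as the question describes.

```python
# Hypothetical mapping from line prefix to the number wrapped around it.
TAGS = {"Hello": "55", "World": "66"}


def tag_line(line, tags=TAGS):
    """Return the line without its newline, wrapped in the tag of the
    first prefix it starts with; unmatched lines come back unwrapped."""
    text = line.rstrip("\n")
    for prefix, tag in tags.items():
        if text.startswith(prefix):
            return tag + text + tag
    return text


sample = ["Hello\n", "World\n", "Other\n"]
print("\n".join(tag_line(line) for line in sample))
```

Adding another rule then only means adding another entry to the dictionary.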

pyqt Qtextbrowser update

def sort_domain():
    if self.cb1.isChecked():
        for line in f:
            line= line.strip()
            if line.endswith('.com') is True:
                self.textBrowser.append(line)
            else:
                pass
    elif not self.cb1.isChecked() and not self.cb2.isChecked():
        for line in f:
            line=line.strip()
            self.textBrowser.append(line)
    if self.cb2.isChecked():
        for line in f:
            line= line.strip()
            if line.endswith('.net') is True:
                self.textBrowser.append(line)
            else:
                pass
    elif not self.cb1.isChecked() and not self.cb2.isChecked():
        for line in f:
            line=line.strip()
            self.textBrowser.append(line)
self.btn2.clicked.connect(sort_domain)
If I check both cb1 and cb2 (checkbox1 and checkbox2), the results are only the domains ending in .com.
What is the correct way to write a function that shows all matching domains when both checkBox1 (".com") and checkBox2 (".net") are checked?
Your implementation is inefficient because it reads the file more than once, and that is also the cause of the bug. After the first for loop the file object is positioned at the end of the file, so the later loops get no lines. To make them work you would have to seek back to the start first: f.seek(0)
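An alternative to seeking is to read the lines once and filter them in a single pass. The sketch below keeps the Qt parts out so it can run on its own; filter_domains and its parameters are hypothetical names, and the behavior for "no box checked" (show everything) is taken from the question's code.

```python
def filter_domains(lines, want_com, want_net):
    """Return the stripped lines that match the selected extensions.
    With neither option selected, every line is returned."""
    suffixes = []
    if want_com:
        suffixes.append(".com")
    if want_net:
        suffixes.append(".net")
    cleaned = [line.strip() for line in lines]
    if not suffixes:
        return cleaned
    # str.endswith accepts a tuple and matches any of its entries.
    return [domain for domain in cleaned if domain.endswith(tuple(suffixes))]


# Hypothetical file contents.
sample = ["a.com\n", "b.net\n", "c.org\n"]
print(filter_domains(sample, True, True))
```

Inside the slot, the two flags would come from self.cb1.isChecked() and self.cb2.isChecked(), and each returned domain would be passed to self.textBrowser.append().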

MD5 decrypt script

__author__ = 'Zane'
import hashlib
import sys
if (len(sys.argv)!=2 ) or (len(sys.argv[1])!= 32):
    print("[---] md5cracker.py & hash")
    sys.exit(1)
    crackedmd5 = sys.argv[1]
    # open a file and read its contents
    f = open('file.txt')
    lines = f.readline()
    f.close()
    for line in lines:
        cleanline = line.rstrip()
        hashobject = hashlib.md5(cleanline)
        if (hashobject==crackedmd5):
            print('Plain text password for ' + crackedmd5 + "is " + hashobject + '\n')
The script exits with exit code 1 and no error message, and I do not know what I got wrong.
Your program exits with status code 1 because you told it to (roughly on line 8):
sys.exit(1)
Python's code structure is based on line indentation. Right now all of your code is part of the if (len(sys.argv)!=2 ) or (len(sys.argv[1])!= 32): block.
You need to unindent every line by one level, starting from crackedmd5 = sys.argv[1]
EDIT
You also used lines = f.readline(), which reads only one line, so for line in lines iterates over every single character of that line rather than over multiple lines. You need to use lines = f.readlines() instead.
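Putting both fixes together, the lookup could be sketched as below. Two further changes go beyond what is described above and are needed for a match to be found: hashlib.md5() returns a hash object, so its hexdigest() string has to be compared with the target, and on Python 3 the word must be encoded to bytes before hashing. The function name crack_md5 and the sample words are invented for the example.

```python
import hashlib


def crack_md5(target_hash, candidates):
    """Return the first candidate whose MD5 hex digest equals
    `target_hash`, or None if no candidate matches."""
    target = target_hash.lower()
    for candidate in candidates:
        word = candidate.rstrip("\n")
        digest = hashlib.md5(word.encode("utf-8")).hexdigest()
        if digest == target:
            return word
    return None


# Build a target from a known word so the example is self-contained.
target = hashlib.md5(b"hunter2").hexdigest()
print(crack_md5(target, ["password\n", "hunter2\n"]))  # hunter2
```

In the script, candidates would be the list returned by f.readlines() and target_hash would be sys.argv[1].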
