Notepad ++ Python code to add a value to numbers matching a pattern - python

I have a notepad++ file with contents like below
{s:11:"wpseo_title";s:42:"Web Designing training institutes in Kochi";}i:357;a:1:
{s:11:"wpseo_title";s:32:"CSS training institutes in Kochi";}i:358;a:1:
{s:11:"wpseo_title";s:34:"HTML5 training institutes in Kochi";}i:359;a:1:
{s:11:"wpseo_title";s:39:"JavaScript training institutes in Kochi";}i:360;a:1:
{s:11:"wpseo_title";s:32:"XML training institutes in Kochi";}}}
I need a way to search for the phrase ";s:42:" and increment the number part of the phrase by 1. In this case, 42 will become 43.
I just need to get this done; I don't care whether it is through a Python script like this one:
Notepad++ Regular Expression add up numbers
or any other method.
Please help me. I am new to Python and to any such language.

Perl one-liner version:
perl -ne 's/(?<=;s:)(\d+)(?=:)/$1+1/ge; print' data.txt

Using your link as an example, the regex match should be:
def calculate(linenumber, match):
    # Rewrite the matched ";s:N:" as ";s:N+1:" on the line where it was found.
    editor.pyreplace(match.group(0), ';s:%d:' % (int(match.group(1)) + 1), 0, 0, linenumber, linenumber)
editor.pysearch(r';s:([0-9]+):', calculate)
I think. I've never actually done this before!

import re
import shutil
target_file = "some_file.txt"
with open("tmp_out.txt", "w") as f_out:
    with open(target_file) as f_in:
        for line in f_in:
            # each line keeps its own trailing newline, so none is added here
            f_out.write(re.sub(r";s:(\d+):", lambda match: ";s:%d:" % (int(match.group(1)) + 1), line))
shutil.move("tmp_out.txt", target_file)
Something like that, I think,
or even better:
import fileinput
import re
import sys
for line in fileinput.input(target_file, inplace=True):
    # with inplace=True, standard output is redirected into the file
    sys.stdout.write(re.sub(r";s:(\d+):", lambda match: ";s:%d:" % (int(match.group(1)) + 1), line))

A one-liner is difficult to achieve in Python, but you can do something similar using the callback feature of re.sub:
import re
# group(1) is the ";s:" prefix, group(2) the number; the pattern also consumes the trailing ":", so it is put back
repl = lambda e: e.group(1) + str(int(e.group(2)) + 1) + ":"
with open("in.txt") as fin:
    with open("out.txt", "w") as fout:
        fout.write(re.sub(r"(;s:)(\d+):", repl, fin.read()))
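The callback idea used in these answers can be tried on an in-memory string before touching a real file. This is only a minimal sketch: the helper name increment_lengths and its delta parameter are made up for illustration, and the sample string is the first line from the question.

```python
import re

def increment_lengths(text, delta=1):
    """Add delta to every number inside a ';s:<number>:' marker (hypothetical helper)."""
    def bump(match):
        # match.group(1) is the run of digits captured by (\d+)
        return ";s:%d:" % (int(match.group(1)) + delta)
    return re.sub(r";s:(\d+):", bump, text)

sample = '{s:11:"wpseo_title";s:42:"Web Designing training institutes in Kochi";}i:357;a:1:'
print(increment_lengths(sample))
```

Note that the leading {s:11: is left alone because it is not preceded by a semicolon; only ;s:42: becomes ;s:43:.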

Related

how do I write the output of this code in a different file in python [duplicate]

How do I write a line to a file in modern Python? I heard that this is deprecated:
print >>f, "hi there"
Also, does "\n" work on all platforms, or should I use "\r\n" on Windows?
This should be as simple as:
with open('somefile.txt', 'a') as the_file:
    the_file.write('Hello\n')
From The Documentation:
Do not use os.linesep as a line terminator when writing files opened in text mode (the default); use a single '\n' instead, on all platforms.
Some useful reading:
The with statement
open()
'a' is for append, or use
'w' to write with truncation
os (particularly os.linesep)
You should use the print() function which is available since Python 2.6+
from __future__ import print_function # Only needed for Python 2
print("hi there", file=f)
For Python 3 you don't need the import, since the print() function is the default.
The alternative in Python 3 would be to use:
with open('myfile', 'w') as f:
    f.write('hi there\n') # python will convert \n to os.linesep
Quoting from Python documentation regarding newlines:
When writing output to the stream, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep. If newline is '' or '\n', no translation takes place. If newline is any of the other legal values, any '\n' characters written are translated to the given string.
See also: Reading and Writing Files - The Python Tutorial
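The quoted rule about the newline argument can be checked by writing a file in text mode and reading it back in binary mode. The following is a rough sketch that assumes Python 3 (where the built-in open() accepts newline); the helper name written_bytes is made up for this example.

```python
import os
import tempfile

def written_bytes(text, newline):
    """Write text in text mode with the given newline setting and return the raw bytes."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w", newline=newline) as f:
            f.write(text)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)

print(written_bytes("hi there\n", ""))      # newline='': no translation
print(written_bytes("hi there\n", "\r\n"))  # '\n' is written as '\r\n'
print(written_bytes("hi there\n", None))    # '\n' is written as os.linesep
```

The last call is the default behaviour, so its result depends on the platform it runs on.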
The python docs recommend this way:
with open('file_to_write', 'w') as f:
    f.write('file contents\n')
So this is the way I usually do it :)
Statement from docs.python.org:
It is good practice to use the 'with' keyword when dealing with file
objects. This has the advantage that the file is properly closed after
its suite finishes, even if an exception is raised on the way. It is
also much shorter than writing equivalent try-finally blocks.
Regarding os.linesep:
Here is an exact unedited Python 2.7.1 interpreter session on Windows:
Python 2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.linesep
'\r\n'
>>> f = open('myfile','w')
>>> f.write('hi there\n')
>>> f.write('hi there' + os.linesep) # same result as previous line ?????????
>>> f.close()
>>> open('myfile', 'rb').read()
'hi there\r\nhi there\r\r\n'
>>>
On Windows:
As expected, os.linesep does NOT produce the same outcome as '\n'. There is no way that it could produce the same outcome. 'hi there' + os.linesep is equivalent to 'hi there\r\n', which is NOT equivalent to 'hi there\n'.
It's this simple: use \n which will be translated automatically to os.linesep. And it's been that simple ever since the first port of Python to Windows.
There is no point in using os.linesep on non-Windows systems, and it produces wrong results on Windows.
DO NOT USE os.linesep!
I do not think there is a "correct" way.
I would use:
with open('myfile', 'a') as f:
    f.write('hi there\n')
In memoriam Tim Toady.
In Python 3 it is a function, but in Python 2 you can add this to the top of the source file:
from __future__ import print_function
Then you do
print("hi there", file=f)
If you are writing a lot of data and speed is a concern you should probably go with f.write(...). I did a quick speed comparison and it was considerably faster than print(..., file=f) when performing a large number of writes.
import time
start = time.time()
with open("test.txt", 'w') as f:
    for i in range(10000000):
        # Time one variant per run: swap which of the next two lines is commented out.
        # print('This is a speed test', file=f)
        f.write('This is a speed test\n')
end = time.time()
print(end - start)
On average write finished in 2.45s on my machine, whereas print took about 4 times as long (9.76s). That being said, in most real-world scenarios this will not be an issue.
If you choose to go with print(..., file=f) you will probably find that you'll want to suppress the newline from time to time, or replace it with something else. This can be done by setting the optional end parameter, e.g.:
with open("test", 'w') as f:
    print('Foo1,', file=f, end='')
    print('Foo2,', file=f, end='')
    print('Foo3', file=f)
Whichever way you choose I'd suggest using with since it makes the code much easier to read.
Update: This difference in performance is explained by the fact that write is highly buffered and returns before any writes to disk actually take place (see this answer), whereas print (probably) uses line buffering. A simple test for this would be to check performance for long writes as well, where the disadvantages (in terms of speed) for line buffering would be less pronounced.
start = time.time()
long_line = 'This is a speed test' * 100
with open("test.txt", 'w') as f:
    for i in range(1000000):
        # Time one variant per run: swap which of the next two lines is commented out.
        # print(long_line, file=f)
        f.write(long_line + '\n')
end = time.time()
print(end - start, "s")
The performance difference now becomes much less pronounced, with an average time of 2.20s for write and 3.10s for print. If you need to concatenate a bunch of strings to get this loooong line performance will suffer, so use-cases where print would be more efficient are a bit rare.
Since Python 3.5 you can also use pathlib for that purpose:
Path.write_text(data, encoding=None, errors=None)
Open the file pointed to in text mode, write data to it, and close the file:
import pathlib
pathlib.Path('textfile.txt').write_text('content')
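As a small illustration of how write_text() behaves, here is a sketch that writes into a temporary directory so it does not touch any real file. The file name and contents are placeholders.

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = pathlib.Path(tmp) / "textfile.txt"
    # write_text() opens the file, writes the string and closes it again;
    # it returns the number of characters written.
    written = path.write_text("hi there\n", encoding="utf-8")
    # A second call replaces the previous contents rather than appending.
    path.write_text("second version\n", encoding="utf-8")
    content = path.read_text(encoding="utf-8")

print(written, repr(content))
```

Because write_text() always truncates, it suits writing a whole file in one go rather than adding lines to an existing one.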
By "line" we mean a sequence of characters terminated by a '\n' character, so every line you write should end with '\n'. Here is a solution:
with open('YOURFILE.txt', 'a') as the_file:
    the_file.write("Hello\n")
Append mode only positions each write at the end of the file; it does not add a newline for you. In both a and w mode you have to put the \n at the end of the string passed to write():
the_file.write("Hello\n")
If you want to avoid using write() or writelines() and joining the strings with a newline yourself, you can pass all of your lines to print(), and the newline delimiter and your file handle as keyword arguments. This snippet assumes your strings do not have trailing newlines.
print(line1, line2, sep="\n", file=f)
You don't need to put a special newline character at the end, because print() does that for you.
If you have an arbitrary number of lines in a list, you can use argument unpacking (the * operator) to pass them all to print().
lines = ["The Quick Brown Fox", "Lorem Ipsum"]
print(*lines, sep="\n", file=f)
It is OK to use "\n" as the separator on Windows, because print() will also automatically convert it to a Windows CRLF newline ("\r\n").
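To see exactly what print() emits with sep and the default end, the output can be captured in memory. This sketch uses io.StringIO as a stand-in for an open text file, which is an assumption made only to keep the example self-contained.

```python
import io

lines = ["The Quick Brown Fox", "Lorem Ipsum"]
buffer = io.StringIO()  # in-memory stand-in for an open text file

# sep goes between the items; end (default "\n") goes after the last one.
print(*lines, sep="\n", file=buffer)

print(repr(buffer.getvalue()))
```

The captured text ends with a newline that comes from end, not from sep.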
One can also use the io module as in:
import io
my_string = "hi there"
with io.open("output_file.txt", mode='w', encoding='utf-8') as f:
    f.write(my_string)
If you want to insert items in a list with a format per line, a way to start could be:
with open('somefile.txt', 'a') as the_file:
    for item in items:
        the_file.write(f"{item}\n")
When I need to write new lines a lot, I define a lambda that uses a print function:
out = open(file_name, 'w')
fwl = lambda *x, **y: print(*x, **y, file=out) # FileWriteLine
fwl('Hi')
This approach has the benefit that it can utilize all the features that are available with the print function.
Update: As is mentioned by Georgy in the comment section, it is possible to improve this idea further with the partial function:
from functools import partial
fwl = partial(print, file=out)
IMHO, this is a more functional and less cryptic approach.
To write text to a file in Flask, the following can be used:
filehandle = open("text.txt", "w")
filebuffer = ["hi","welcome","yes yes welcome"]
filehandle.writelines(filebuffer)  # note: writelines() does not add line separators
filehandle.close()
You can also try filewriter
pip install filewriter
from filewriter import Writer
Writer(filename='my_file', ext='txt') << ["row 1 hi there", "row 2"]
Writes into my_file.txt
Takes an iterable or an object with __str__ support.
with open('sample.txt', 'a') as f:
    f.write('Hello')
    f.write('\n')
Insert f.write('\n') at the end.
Since others have answered how to do it, I'll explain what happens, line by line.
with FileOpenerCM('file.txt') as fp: # is equal to "with open('file.txt', 'w') as fp:"
    fp.write('dummy text')
This is a so-called context manager: the object used in a with statement is a context manager. Let's see how this works under the hood.
class FileOpenerCM:
    def __init__(self, file, mode='w'):
        self.file = open(file, mode)
    def __enter__(self):
        return self.file
    def __exit__(self, exc_type, exc_value, exc_traceback):
        self.file.close()
The first method, __init__, is (as you all know) the initialization method of an object. It is called whenever an object is created, and it is where you put all of your initialization code.
The second method, __enter__, is a bit more interesting. Some of you might not have seen it because it is specific to context managers. What it returns is the value assigned to the variable after the as keyword, in our case fp.
The last method, __exit__, runs when the code leaves the with block, whether normally or because of an error. The exc_type, exc_value and exc_traceback variables hold the details of an error raised inside the with block (all three are None if there was none). For example,
exc_type: TypeError
exc_value: unsupported operand type(s) for +: 'int' and 'str'
exc_traceback: <traceback object at 0x6af8ee10bc4d>
From the first two variables you can get enough information about the error. Honestly, I don't know the use of the third variable, but for me the first two are enough. If you want to do more research on context managers, note that writing classes is not the only way to write them: with contextlib you can write context managers as functions (actually generators) as well. You can certainly try generator functions with contextlib, but as I see it, classes are much cleaner.
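For comparison, this is roughly what the generator-based form mentioned above could look like. It is a sketch, not a replacement for the built-in open(): the name file_opener is made up, and a temporary directory is used so nothing real is overwritten.

```python
import contextlib
import os
import tempfile

@contextlib.contextmanager
def file_opener(path, mode="w"):
    """Generator-based counterpart of the FileOpenerCM class (hypothetical name)."""
    f = open(path, mode)
    try:
        yield f      # the yielded value is bound to the name after "as"
    finally:
        f.close()    # runs on normal exit and when the block raises

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "file.txt")
    with file_opener(target) as fp:
        fp.write("dummy text")
    closed_after_block = fp.closed
    with open(target) as check:
        content = check.read()

print(closed_after_block, content)
```

The code before yield plays the role of __enter__, and the finally clause plays the role of __exit__.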

Python replacement of Rubys grep?

abc=123
dabc=123
abc=456
dabc=789
aabd=123
From the above file I need to find the lines beginning with abc= (leading whitespace doesn't matter).
In Ruby I would put this in an array and do
matches = input.grep(/^\s*abc=.*/).map(&:strip)
I'm a total noob in Python; even calling myself a fresh Python developer would be too much.
Maybe there is a better "Python way" of doing this without even grepping?
The Python version available on the platform where I need to solve the problem is 2.6.
Using Ruby is not an option at this time.
with open("myfile.txt") as myfile:
    matches = [line.rstrip() for line in myfile if line.lstrip().startswith("abc=")]
In Python you would typically use a list comprehension whose if clause does what you'd accomplish with Ruby's grep:
import sys, re
matches = [line.strip() for line in sys.stdin
           if re.match(r'^\s*abc=.*', line)]
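The same list-comprehension pattern can be wrapped in a small reusable function and tried on the sample data from the question. The function name grep is made up here, and one sample line has been given leading spaces (an assumption) to show that whitespace is ignored.

```python
import re

sample_lines = [
    "abc=123\n",
    "dabc=123\n",
    "  abc=456\n",
    "dabc=789\n",
    "aabd=123\n",
]

def grep(pattern, lines):
    """Return the stripped lines where pattern matches at the start (hypothetical helper)."""
    regex = re.compile(pattern)
    # regex.match() only matches at the beginning of the string, like ^ in the Ruby regex.
    return [line.strip() for line in lines if regex.match(line)]

matches = grep(r"\s*abc=", sample_lines)
print(matches)
```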

Using grep in python

There is a file (query.txt) which has some keywords/phrases which are to be matched with other files using grep. The last three lines of the following code are working perfectly but when the same command is used inside the while loop it goes into an infinite loop or something(ie doesn't respond).
import os
f=open('query.txt','r')
b=f.readline()
while b:
    cmd='grep %s my2.txt'%b #my2 is the file in which we are looking for b
    os.system(cmd)
    b=f.readline()
f.close()
a='He is'
cmd='grep %s my2.txt'%a
os.system(cmd)
First of all, you are not iterating over the file properly. You can simply use for b in f: without the .readline() stuff.
Then your code will blow up in your face as soon as a query contains any characters which have a special meaning in the shell. Use subprocess.call instead of os.system() and pass an argument list.
Here's a fixed version:
import os
import subprocess
with open('query.txt', 'r') as f:
    for line in f:
        line = line.rstrip() # remove trailing whitespace such as '\n'
        subprocess.call(['/bin/grep', line, 'my2.txt'])
However, you can improve your code even more by not calling grep at all.
Read my2.txt into a string instead and then use the re module to perform the search. In case you do not need a regex at all, you can even simply use if line in my2_content.
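A rough sketch of that suggestion, with in-memory strings standing in for the contents of my2.txt and query.txt (the sample data and the results dictionary are made up for illustration):

```python
import re

# Stand-ins for the contents of my2.txt and the lines of query.txt.
my2_content = "One two\ntwo two\ntwo three\n"
queries = ["two two", "three", "four"]

results = {}
for query in queries:
    # Plain substring test: no regular expression needed.
    found_plain = query in my2_content
    # Regex variant; re.escape() makes the query match literally.
    found_regex = re.search(re.escape(query), my2_content) is not None
    results[query] = (found_plain, found_regex)

print(results)
```

Dropping re.escape() would let the queries act as patterns, which is closer to what plain grep does with its argument.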
Your code scans the whole my2.txt file for each query in query.txt.
You want to:
read all queries into a list
iterate once over all lines of the text file and check each line against all queries.
Try this code:
with open('query.txt','r') as f:
    queries = [l.strip() for l in f]
with open('my2.txt','r') as f:
    for line in f:
        for query in queries:
            if query in line:
                print query, line
This isn't actually a good way to use Python, but if you have to do something like that, then do it correctly:
from __future__ import with_statement
import subprocess
def grep_lines(filename, query_filename):
    with open(query_filename, "rb") as myfile:
        for line in myfile:
            subprocess.call(["/bin/grep", line.strip(), filename])
grep_lines("my2.txt", "query.txt")
And hope that your file doesn't contain any characters which have special meanings in regular expressions =)
Also, you might be able to do this with grep alone:
grep -f query.txt my2.txt
It works like this:
~ $ cat my2.txt
One two
two two
two three
~ $ cat query.txt
two two
three
~ $ python bar.py
two two
two three
$ grep -wFf query.txt my2.txt > out.txt
This will match all the keywords in query.txt against the my2.txt file and save the output in out.txt.
Read man grep for a description of all the possible arguments.

Replace part of string using python regular expression

I have the following lines (many, many):
...
gfnfgnfgnf: 5656756734
arvervfdsa: 1343453563
particular: 4685685685
erveveersd: 3453454545
verveversf: 7896789567
..
What I'd like to do is to find line 'particular' (whatever number is after ':')
and replace this number with '111222333'. How can I do that using python regular expressions ?
for line in input:
    key, val = line.split(':')
    if key == 'particular':
        val = '111222333'
I'm not sure regex would be of any value in this specific case. My guess is they'd be slower. That said, it can be done. Here's one way:
for line in input:
    line = re.sub(r'^particular: .*', 'particular: 111222333', line)
There are subtleties involved in this, and this is almost certainly not what you'd want in production code. You need to check all of the re module constants to make sure the regex is acting the way you expect, etc. You might be surprised at the flexibility you find in dealing with problems like this in Python if you try not to use re (of course, this isn't to say re isn't useful) ;-)
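One of those subtleties is that a prefix test such as "starts with particular" also hits keys that merely begin with that word. A regex-free sketch that compares the whole key could look like this; the helper name replace_value and the extra sample lines are invented for the example, and the lines are assumed to have no trailing newline.

```python
def replace_value(line, key, new_value):
    """Return line with its value replaced if its key equals key (hypothetical helper)."""
    name, sep, _old = line.partition(":")
    # sep is empty when the line has no colon at all.
    if sep and name.strip() == key:
        return "%s: %s" % (name, new_value)
    return line

lines = [
    "arvervfdsa: 1343453563",
    "particular: 4685685685",
    "particular_other: 1",
    "no colon here",
]
updated = [replace_value(line, "particular", "111222333") for line in lines]
print(updated)
```

Only the line whose key is exactly particular changes; particular_other and the line without a colon pass through untouched.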
Are you sure you need a regular expression?
other_number = '111222333'
some_text, some_number = line.split(': ')
new_line = ': '.join([some_text, other_number])
#!/usr/bin/env python
import re
text = '''gfnfgnfgnf: 5656756734
arvervfdsa: 1343453563
particular: 4685685685
erveveersd: 3453454545
verveversf: 7896789567'''
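# anchor on the "particular: " line only; a bare [0-9]+ would replace every number in the text
print(re.sub(r'^(particular: )[0-9]+', r'\g<1>111222333', text, flags=re.MULTILINE))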
import re
input = """gfnfgnfgnf: 5656756734
arvervfdsa: 1343453563
particular: 4685685685
erveveersd: 3453454545
verveversf: 7896789567"""
entries = re.split("\n+", input)
for i, entry in enumerate(entries):
    if entry.startswith("particular"):
        # assign back into the list; rebinding entry alone would discard the result
        entries[i] = re.sub(r'[0-9]+', r'111222333', entry)
or with sed:
sed -e 's/^particular: [0-9].*$/particular: 111222333/g' file
An important point here is that if you have a lot of lines, you want to process them one by one. That is, instead of reading all the lines in, replacing them, and writing them out again, you should read in a line at a time and write out a line at a time. (This would be inefficient if you were actually reading a line at a time from the disk; however, Python's IO is competent and will buffer the file for you.)
with open(...) as infile, open(...) as outfile:
    for line in infile:
        if line.startswith("particular"):
            # the replacement needs its own newline, since the original line's is dropped
            outfile.write("particular: 111222333\n")
        else:
            outfile.write(line)
This will be speed- and memory-efficient.
Your sed example forces me to say neat!
python -c "import re, sys; print ''.join(re.sub(r'^(particular:) \d+', r'\1 111222333', l) for l in open(sys.argv[1]))" file
