Code too slow USACO Bronze 2015 Problem 1 python

Code too slow USACO Bronze 2015 Problem 1 python - python

I am practicing for USACO and I came across the "Censoring" Problem: http://www.usaco.org/index.php?page=viewproblem2&cpid=526
I solved it pretty quickly and though I got it right. However, it turns out that my the server gives me a time error for test cases 7-15 (it works well for the first 6 test cases).
Here is my code.
import sys
sys.stdin = open('censor.in', 'r')
sys.stdout = open('censor.out', 'w')
# Real code begins here
original_string = input()
censor_string = input()
# print(original_string.find(censor_string) + len(censor_string))
while censor_string in original_string:
original_string = original_string[0:original_string.find(censor_string)] + original_string[original_string.find(censor_string) +
len(censor_string): len(original_string)]
print(original_string)
Can someone help me fix it? The problem is probably that while loop. Not sure how to fix it though.

This is fast enough to get accepted. I build the result string one character at a time. Whenever this creates the bad string (at the end of the partial result), I remove it.
import sys
sys.stdin = open('censor.in')
sys.stdout = open('censor.out', 'w')
s, t = input(), input()
res = ''
for c in s:
res += c
if res.endswith(t):
res = res[:-len(t)]
print(res)

Related

Is the for loop in my code the speed bottleneck?

The following code looks through 2500 markdown files with a total of 76475 lines, to check each one for the presence of two strings.
#!/usr/bin/env python3
# encoding: utf-8
import re
import os
zettelkasten = '/Users/will/Dropbox/zettelkasten'
def zsearch(s, *args):
for x in args:
r = (r"(?=.* " + x + ")")
p = re.search(r, s, re.IGNORECASE)
if p is None:
return None
return s
for filename in os.listdir(zettelkasten):
if filename.endswith('.md'):
with open(os.path.join(zettelkasten, filename),"r") as fp:
for line in fp:
result_line = zsearch(line, "COVID", "vaccine")
if result_line != None:
UUID = filename[-15:-3]
print(f'›[[{UUID}]] OR', end=" ")
This correctly gives output like:
›[[202202121717]] OR ›[[202003311814]] OR
, but it takes almost two seconds to run on my machine, which I think is much too slow. What, if anything, can be done to make it faster?

The main bottleneck is the regular expressions you're building.
If we print(f"{r=}") inside the zsearch function:
>>> zsearch("line line covid line", "COVID", "vaccine")
r='(?=.* COVID)'
r='(?=.* vaccine)'
The (?=.*) lookahead is what is causing the slowdown - and it's also not needed.
You can achieve the same result by searching for:
r=' COVID'
r=' vaccine'

Piping my Python Program through another program

I'm trying to make program using Python.
I want to be able to pipe program through another program:
" #EXAMPLE " ./my_python | another programme "
Here is the code I have so far.
This code saves output to file:
#!/usr/bin/env python
import os, random, string
# This is not my own code
''' As far asi know, It belongs to NullUserException. Was found on stackoverflow.com'''
length = 8
chars = string.ascii_letters.upper()+string.digits
random.seed = (os.urandom(1024))
# my code
file_out = open('newRa.txt','w') # Create a 'FILE' to save Generated Passwords
list1=[]
while len(list1) < 100000:
list1.append(''.join(random.choice(chars) for i in range(length)))
for item in list1:
file_out.write('%s\n' % item)
file_out.close()
file_out1=open('test.txt','w')
for x in list1:
file_out1.write('%s\n' %x[::-1])
This is the code I have trying to pipe it through another program:
#!/usr/bin/env python
import os,string,random,sys
length = 8
chars = string.ascii_letters.upper()+string.digits
random.seed = (os.urandom(1024))
keep=[]
keep1=[]
while len(keep)<1000:
keep.append(''.join(random.choice(chars) for i in range(length)))
print '\n',keep[::-1]
for x in keep:
keep1.append(x[::-1])
while len(keep1) < 1000:
print keep1
I have tried chmod and using the script as a executable.

Ok sorry for my lack of google search.
sys.stdout is the answer
#!/usr/bin/env python
import os,string,random,sys
length = 8
chars = string.ascii_letters.upper()+string.digits
random.seed = (os.urandom(1024))
keep=[]
while len(keep)<1000:
keep = (''.join(random.choice(chars) for i in range(length)))
print sys.stdout.write(keep)
sys.stdout.flush()
I stripped my code down (as it makes it a lot faster, But I'm getting this when execute
my code........
P5DBLF4KNone
DVFV3JQVNone
CIMKZFP0None
UZ1QA3HTNone
How do I get rid of the 'None' on the end?
What I have done to cause this ?
Should This Be A Seperate Question??

Two regex functions together do not work

I am trying to get the index for the start of a tag and the end of another tag. However, when I use one regex it works absolutely fine but for two regex functions, it gives an error for the second one.
Kindly help in explaining the reason
The below code works fine:
import re
f = open('C:/Users/Jyoti/Desktop/PythonPrograms/try.xml','r')
opentag = re.search('<TEXT>',f.read())
begin = opentag.start()+6
print begin
But when I add another similar regex it give me the error
AttributeError: 'NoneType' object has no attribute 'start'
which I understand is due to the start() function returning None
Below is the code:
import re
f = open('C:/Users/Jyoti/Desktop/PythonPrograms/try.xml','r')
opentag = re.search('<TEXT>',f.read())
begin = opentag.start()+6
print begin
closetag = re.search('</TEXT>',f.read())
end = closetag.start() - 1
print end
Please provide a solution to how can I get this working. Also I am a newbie here so please don't mind if I ask more questions on the solution.

You are reading the file in f.read() which reads the whole file, and so the file descriptor moves forward, which means the text can't be read again when you do f.read() the next time.
If you need to search on the same text again, save the output of f.read(), and then do a regular expression search on it as below:
import re
f = open('C:/Users/Jyoti/Desktop/PythonPrograms/try.xml','r')
text = f.read()
opentag = re.search('<TEXT>',text)
begin = opentag.start()+6
print begin
closetag = re.search('</TEXT>',text)
end = closetag.start() - 1
print end

f.read() reads the whole file. So there's nothing left to read on the second f.read() call.
See https://docs.python.org/2/tutorial/inputoutput.html#methods-of-file-objects

First of all you have to know that f.read() after read file sets the pointer to the EOF so if you again use f.read() it gives you empty string ''. Secondly you should use r before string passed as a pattern of re.search function, which means raw, and automatically escapes special characters. So you have to do something like this:
import re
f = open('C:/Users/Jyoti/Desktop/PythonPrograms/try.xml','r')
data = f.read()
opentag = re.search(r'<TEXT>',data)
begin = opentag.start()+6
print begin
closetag = re.search(r'</TEXT>',data)
end = closetag.start() - 1
print end
gl & hf with Python :)

Python Memory error during function return statement

Hi i am processing a 600Mb file. i have written the below code. What i am doing was, to search for a keyword in the data between <dest> tags and if it exists then add a city tag to <dest> tag. It worked fine for small set of data but when i ran the program on large file it is throwing MEMORY ERROR. I guess i am getting this error when i use return statement in if condition can any one please let me know how to solve this?
import re
def casp ( tx ):
def tbcnv( st ):
ct = ''
prt = re.compile(r"(?i)(Slip Copy,.*?\))", re.DOTALL|re.M)
val = re.search(prt, st)
try:
ct = val.group(1)
if re.search(r"(?i)alaska", ct):
jval = "Alaska"
print jval
if jval:
prt = re.compile(r"(?i)(.*?<dest.*?>)", re.DOTALL|re.M)
vl = re.sub(prt, "\\1\n" + "<city>" + jval + "</city>" + "\n" ,st)
return vl
else:
return st
else:
return st
except:
print "Not available"
return st
pt = re.compile("(?i)(<dest.*?</dest>)", re.DOTALL|re.M)
t = re.sub(pt, lambda m: tbcnv(m.group(1)), tx)
return t
with open('input.txt', 'r') as content_file:
content = content_file.read()
pt = re.compile(r"(?i)<Lrlevel level='3'>(.*?)</Lrlevel>", re.DOTALL|re.M)
content = re.sub(pt,lambda m: "<Lrlevel level='3'>" + casp(m.group(1) + "</Lrlevel>" ), content)
with open('out.txt', 'w') as out_file:
out_file.write(content)

If you remove the return statement just before the expect, then the string built by re.sub() is much smaller.
I'm getting memory usage that is 3 times the file size, which means that you'd get a MemoryError if you don't have (more than) 2GB. This is reasonable here --- or at least I can guess why. It's how re.sub() works.
This means that you're using somehow the wrong tools, as explained in the comments above. You should either use a full xml-processing tool like lxml, or if you want to stick with regular expressions, find a way to never need the whole string in memory; or at least to never call re.sub() on it (e.g. only the tx variable ever contains a big string, which is the input; and you do pt.search(tx, startpos) in a loop, locating the places to change, and writing piece by piece parts of tx).

Function append lines to .csv

It has been awhile since I have written functions with for loops and writing to files so bare with my ignorance.
This function is given an IP address to read from a text file; pings the IP, searches for the received packets and then appends it to a .csv
My question is: Is there a better or an easier way to write this?
def pingS (IPadd4):
fTmp = "tmp"
os.system ("ping " + IPadd4 + "-n 500 > tmp")
sName = siteNF #sys.argv[1]
scrap = open(fTmp,"r")
nF = file(sName,"a") # appends
nF.write(IPadd4 + ",")
for line in scrap:
if line.startswith(" Packets"):
arrT = line.split(" ")
nF.write(arrT[10]+" \n")
scrap.close()
nF.close()
Note: If you need the full script I can supply that as well.

This in my opinion at least makes what is going on a bit more obvious. The len('Received = ') could obviously be replaced by a constant.
def pingS (IPadd4):
fTmp = "tmp"
os.system ("ping " + IPadd4 + "-n 500 > tmp")
sName = siteNF #sys.argv[1]
scrap = open(fTmp,"r")
nF = file(sName,"a") # appends
ip_string = scrap.read()
recvd = ip_string[ip_string.find('Received = ') + len('Received = ')]
nF.write(IPadd4 + ',' + recvd + '\n')
You could also try looking at the Python csv module for writing to the csv. In this case it's pretty trivial though.

This may not be a direct answer, but you may get some performance increase from using StringIO. I have had some dramatic speedups in IO with this. I'm a bioinformatics guy, so I spend a lot of time shooting large text files out of my code.
http://www.skymind.com/~ocrow/python_string/
I use method 5. Didn't require many changes. There are some fancier methods in there, but they didn't appeal to me as much.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Code too slow USACO Bronze 2015 Problem 1 python - python

Related

Is the for loop in my code the speed bottleneck?

Piping my Python Program through another program

Two regex functions together do not work

Python Memory error during function return statement

Function append lines to .csv

Categories

Resources