Optimise a Python/C++ algorithm - python

I was participating in a competitive programming contest and hit a question where my answer passed three of the four test cases but exceeded the time limit on the fourth.
I tried to get better results by converting my code from Python to C++ (I know the time complexity stays the same, but it was worth a shot :))
Following is the question:
A string is said to be using strong language if it contains at least K consecutive characters '*'.
You are given a string S with length N. Determine whether it uses strong language or not.
Input:
The first line of the input contains a single integer T denoting the number of test cases. The description of T test cases follows.
The first line of each test case contains two space-separated integers N and K.
The second line contains a single string S with length N.
Output:
Print a single line containing the string "YES" if the string contains strong language or "NO" if it does not
My Python approach:
for _ in range(int(input())):
    k = int(input().split()[1])
    s = input()
    s2 = "".join(["*"]*k)
    if len(s.split(s2))>1:
        print("YES")
    else:
        print("NO")
My converted C++ code (converted it myself):
#include <iostream>
#include <string>
using namespace std;
int main() {
    // your code goes here
    int t;
    std::cin >> t;
    for (int i = 0; i < t; i++) {
        /* code */
        int n,k;
        std::cin >> n >> k;
        string str;
        cin >> str;
        string str2(k,'*');
        size_t found = str.find(str2);
        if (found != string::npos){
            std::cout << "YES" << std::endl;
        } else {
            std::cout << "NO" << std::endl;
        }
    }
    return 0;
}
How can I reduce the time complexity?
Other approaches I have considered: using find() instead of split(), or using a for loop.
Edit:
Sample Input :
2
5 1
abd
5 2
*i**j
Output :
NO
YES

The bounds you posted suggest that linear time is OK in Python. You can simply keep a running track of how many asterisks you have seen in a row.
T = int(input())
for _ in range(T):
    n, k = map(int, input().split())
    s = input()
    count, ans = 0, False
    for c in s:
        if c == "*":
            count += 1
        else:
            count = 0
        # remember whether any run has reached length k
        ans = ans or count >= k
    if ans:
        print("YES")
    else:
        print("NO")
I can also tell you why you are getting TLE. Consider the case where n = 1e6, k = 5e5, and s is a string whose first k-1 characters are asterisks. If find is implemented as a naive search, it tries every starting position and, at each of the early ones, compares a long run of asterisks before hitting the mismatch. That is on the order of n*k character comparisons, i.e. O(n^2) in the worst case, which gives you a TLE.
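The same linear-time run check can also be phrased with itertools.groupby, which groups consecutive equal characters. This is only a sketch of an alternative; the helper name has_star_run is made up for illustration and is not part of the answer above.

```python
from itertools import groupby

def has_star_run(s, k):
    """Return True if s contains at least k consecutive '*' characters."""
    # groupby yields one group per maximal run of equal characters
    return any(ch == "*" and sum(1 for _ in run) >= k
               for ch, run in groupby(s))

# the two sample cases from the question
print("YES" if has_star_run("abd", 1) else "NO")
print("YES" if has_star_run("*i**j", 2) else "NO")
```

Each character is visited once, so the cost stays linear in the length of s regardless of k.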

Related

python - High time consumption and low efficiency in programming challenge

I was trying to solve this problem.
Recently Oz has found a magical string consisting of single digit "1". After experimenting on the string, Oz found a weird magical property of the string that is whenever he touches the string then each digit "1" of string changed to digit "0" and each digit "0" of string changed to "01". Oz found this property interesting and immediately asked a question to RK : "How many 1's and 0's will be in the magical string if he touches the string M times ?"
I wrote the following code for it:
l = [] #List of values
for x in range(int(raw_input())):
    l.append(int(raw_input()))
def after_touchs(n, string): #Main function finds the no. of 0's and 1's
    for x in range(n):
        string = string.replace('1', '2').replace('0', '01').replace('2', '0')
    return map(str, [string.count('1'), string.count('0')])
for num in l:
    print ' '.join(after_touchs(num, '1'))
I don't understand why this code takes so much time. To me it seems perfectly normal and should not be slow. It didn't pass on the site, so I ran it with the interpreter on my computer, and even an input of 50 seemed too large. Does the string.replace function take up too much time? If so, what alternatives can I use? Please help me reduce the time consumption and increase the efficiency of the code.
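You just need to count the number of 1's and 0's. String manipulation on a string that keeps growing is expensive, so I guess that's what slows you down.
number1 = 1
number0 = 0
for i in xrange(M):
    # each 0 -> 01 contributes one new 1
    newnumber1 = number0
    # each 1 -> 0 adds a 0, and each 0 -> 01 keeps its 0
    number0 += number1
    # we replace number1 with the new number
    number1 = newnumber1
# print the count of 1's first, then 0's, in the same order as the question's code
print "%d %d"%(number1,number0)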
EDIT
There is a more efficient solution, which I saw in Tim Stopfer's comment.
In fact, the numbers of 0's and 1's follow the Fibonacci sequence after the first change.
1: 1 0 1 1 2 3
0: 0 1 1 2 3 5
M: 0 1 2 3 4 5
Which means an O(1) solution would be:
if M>0:
    number1 = Fibo(M-1)
    number0 = Fibo(M)
But you have to approximate the value of the Fibonacci sequence with a closed-form formula (Binet's formula, found on Wikipedia).
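As a rough sketch of that idea, the counts can be read off from Fibonacci numbers, computed either exactly by iteration or approximately with Binet's closed form. The function names here are illustrative, and the rounded floating-point formula is only reliable for moderate n, so the exact version is the safer default.

```python
import math

def fib(n):
    """Exact Fibonacci number with fib(0) = 0 and fib(1) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_binet(n):
    """Binet's closed form, rounded; floats make this inexact for large n."""
    sqrt5 = math.sqrt(5)
    phi = (1 + sqrt5) / 2
    return round(phi ** n / sqrt5)

def counts_after(m):
    """Return (ones, zeros) after m touches, starting from the string "1"."""
    if m == 0:
        return 1, 0
    return fib(m - 1), fib(m)

print(counts_after(5))  # (3, 5), the last column of the table above
```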
It might be because you edit the string each time.
You are basically implementing the Fibonacci sequence, and the 50th number is
50 : 12586269025 = 5^2 x 11 x 101 x 151 x 3001
So you have a string of this length and you apply several string operations to it.
This may cause the process to slow down.
I hope I could help.
According to the question, the initial string and the strings after each of the first 6 touches would look like this:
"1", "0", "01", "010", "01001", "01001010", "0100101001001"
and the counts will be
1 0, 0 1, 1 1, 1 2, 2 3, 3 5, 5 8
which reminds me of the Fibonacci series.
Fibonacci numbers increase rapidly, so you will end up with very long strings which take up a lot of memory and are slow to manipulate.
If you need speed, then just calculate the Fibonacci numbers.
You can also speed up the naive Fibonacci by caching values that you have already calculated, so if you already know the 4th and 5th elements of the series you can quickly calculate the 6th.
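A small sketch of the caching idea, checked against a direct simulation of the touches for small M. The helper names fib and touch are made up for this example, and functools.lru_cache is just one convenient way to cache.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    """Fibonacci with cached results, fib(0) = 0 and fib(1) = 1."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)

def touch(s):
    """Apply one touch: every '1' becomes '0' and every '0' becomes '01'."""
    return "".join("01" if c == "0" else "0" for c in s)

s = "1"
for m in range(1, 11):
    s = touch(s)
    # after m touches there are fib(m - 1) ones and fib(m) zeros
    assert (s.count("1"), s.count("0")) == (fib(m - 1), fib(m))

print(fib(49), fib(50))  # counts of 1's and 0's after 50 touches
```

The simulation is only run for ten touches because the string length itself grows like the Fibonacci numbers.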
import java.util.HashMap;
import java.util.Map;
public class MagicalString {
    public static void main(String[] args) {
        String modifiedString = touchTheString();
        countOneAndZero(modifiedString);
    }
    // Counts how often each character occurs in the final string.
    private static void countOneAndZero(String modifiedString) {
        Map<Character,Integer> map = new HashMap<Character,Integer>();
        char [] data = modifiedString.toCharArray();
        for(char c : data) {
            if(map.containsKey(c)) {
                map.put(c, map.get(c)+1);
            }else {
                map.put(c, 1);
            }
        }
        System.out.println(map);
        System.out.println(map.get('0'));
        System.out.println(map.get('1'));
    }
    // Applies the touch rule (0 -> 01, 1 -> 0) five times to "1".
    static String touchTheString() {
        String input = "1";
        int touch = 5;
        while (touch != 0) {
            StringBuffer br = new StringBuffer();
            char[] temp = input.toCharArray();
            for (char c : temp) {
                if (c == '0') {
                    br.append("01");
                } else if (c == '1') {
                    br.append("0");
                }
            }
            // build the new string once per touch instead of once per character
            input = br.toString();
            touch--;
        }
        System.out.println(input);
        return input;
    }
}

Incorrect LRC value calculated from checksum

I'm trying to calculate an LRC (Longitudinal Redundancy Check) value with Python.
My Python code is pulled from other posts on StackOverflow. It looks like this:
lrc = 0
for b in message:
    lrc ^= b
print lrc
If I plug in the value '\x02\x47\x30\x30\x03', I get an LRC value of 70 or 0x46 (F)
However, I am expecting a value of 68 - 0x44 (D) instead.
I have calculated the correct LRC value via C# code:
byte LRC = 0;
for (int i = 1; i < bytes.Length; i++)
{
    LRC ^= bytes[i];
}
return LRC;
If I plug in the same byte array values, I get the expected result of 0x44.
Functionally, the code looks very similar. So I'm wondering what the difference is between the code. Is it my input value? Should I format my string differently?
Arrays are zero-indexed in C#, so by starting the iteration at int i = 1 you are skipping the first byte.
The Python result is the correct one.
Fixed reference code:
byte LRC = 0;
for (int i = 0; i < bytes.Length; i++)
{
    LRC ^= bytes[i];
}
return LRC;
To avoid this kind of mistake, consider using a foreach loop (although I'm not familiar with C# practices).
/edit
To skip the first byte in Python, simply use slice syntax:
lrc = 0
for b in message[1:]:
    lrc ^= b
print lrc
So I figured out the answer to my question. Thanks to Nsh for his insight. I found a way to make the algorithm work. I just had to skip the first byte in the for-loop. There's probably a better way to do this but it was quick and it's readable.
def calcLRC(input):
    input=input.decode('hex')
    lrc = 0
    i = 0
    message = bytearray(input)
    for b in message:
        if(i == 0):
            pass
        else:
            lrc ^= b
        i+=1
    return lrc
It now returns the expected 0x44 in my use case.
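For comparison, here is a compact Python 3 sketch of the same XOR checksum using the message bytes from the question. The function name lrc and its skip_first parameter are made up for this example; whether the first byte belongs in the checksum depends on the protocol, which the thread does not specify.

```python
from functools import reduce
from operator import xor

def lrc(data, skip_first=False):
    """XOR all bytes of data together, optionally ignoring the first byte."""
    payload = data[1:] if skip_first else data
    return reduce(xor, payload, 0)

message = bytes.fromhex("0247303003")  # '\x02\x47\x30\x30\x03' from the question
print(hex(lrc(message)))                   # 0x46, all five bytes
print(hex(lrc(message, skip_first=True)))  # 0x44, first byte skipped
```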

Possible optimizations for Project Euler #4 algorithm

Find the largest palindrome made from the product of two 3-digit numbers.
Even though the algorithm is fast enough for the problem at hand, I'd like to know if I missed any obvious optimizations.
from __future__ import division
from math import sqrt
def createPalindrome(m):
    m = str(m) + str(m)[::-1]
    return int(m)
def problem4():
    for x in xrange(999,99,-1):
        a = createPalindrome(x)
        for i in xrange(999,int(sqrt(a)),-1):
            j = a/i
            if (j < 1000) and (j % 1 == 0):
                c = int(i * j)
                return c
It seems the biggest slowdown in my code is converting an integer to a string, adding its reverse and converting the result back to an integer.
I looked up more information on palindromes and stumbled upon this formula, which allows me to convert a 3-digit number "n" into a 6-digit palindrome "p" (can be adapted for other digits but I'm not concerned about that).
p = 1100*n−990*⌊n/10⌋−99*⌊n/100⌋
My original code runs in about 0.75 ms and the new one takes practically the same amount of time (not to mention the formula would have to be adapted depending on the number of digits "n" has), so I guess there weren't many optimizations left to perform.
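To make the formula concrete, here is a small sketch that compares it with the string-based construction for every 3-digit n. The function names are illustrative only.

```python
def palindrome_from(n):
    """Build the 6-digit palindrome whose first three digits are n (100..999)."""
    return 1100 * n - 990 * (n // 10) - 99 * (n // 100)

def palindrome_via_str(n):
    """Same palindrome built by appending the reversed digits."""
    return int(str(n) + str(n)[::-1])

# the arithmetic formula and the string version agree for all 3-digit inputs
assert all(palindrome_from(n) == palindrome_via_str(n) for n in range(100, 1000))
print(palindrome_from(123))  # 123321
```

Writing n with digits a, b, c, the formula expands to 100001*a + 10010*b + 1100*c, which is exactly the number abccba.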
Look here for Ideas
In C++ I do it like this:
int euler004()
{
    // A palindromic number reads the same both ways. The largest palindrome
    // made from the product of two 2-digit numbers is 9009 = 91 x 99.
    // Find the largest palindrome made from the product of two 3-digit numbers.
    const int N=3;
    const int N2=N<<1;
    int min,max,a,b,c,i,j,s[N2],aa=0,bb=0,cc=0;
    for (min=1,a=1;a<N;a++) min*=10; max=(min*10)-1;
    i=-1;
    for (a=max;a>=min;a--)
        for (b=a;b>=min;b--)
        {
            c=a*b; if (c<cc) continue;
            for (j=c,i=0;i<N2;i++) { s[i]=j%10; j/=10; }
            for (i=0,j=N2-1;i<j;i++,j--)
                if (s[i]!=s[j]) { i=-1; break; }
            if (i>=0) { aa=a; bb=b; cc=c; }
        }
    return cc; // cc is the output
}
No need for sqrt.
The call to createPalindrome can slow things down due to heap/stack thrashing.
The string manipulation m = str(m) + str(m)[::-1] is slow.
String-to-int conversion can be faster if you do it yourself on a fixed-size array.
My implementation runs in around 1.7 ms, but a big portion of that time is the app output and formatting (AMD 3.2GHz, 32-bit app on W7 x64).
[edit1] implementing your formula
int euler004()
{
    int i,c,cc,c0,a,b;
    for (cc=0,i=999,c0=1100*i;i>=100;i--,c0-=1100)
    {
        c=c0-(990*int(i/10))-(99*int(i/100));
        for(a=999;a>=300;a--)
            if (c%a==0)
            {
                b=c/a;
                if ((b>=100)&&(b<1000)) { cc=c; i=0; break; }
            }
    }
    return cc;
}
this takes ~0.4 ms
[edit2] further optimizations
//---------------------------------------------------------------------------
int euler004()
{
    // A palindromic number reads the same both ways. The largest palindrome
    // made from the product of two 2-digit numbers is 9009 = 91 x 99.
    // Find the largest palindrome made from the product of two 3-digit numbers.
    int i0,i1,i2,c0,c1,c,cc=0,a,b,da;
    for (c0= 900009,i0=9;i0>=1;i0--,c0-=100001)   // first digit must be non zero so <1,9>
    for (c1=c0+90090,i1=9;i1>=0;i1--,c1-= 10010)  // all the rest <0,9>
    for (c =c1+ 9900,i2=9;i2>=0;i2--,c -=  1100)  // c is palindrome from 999999 to 100001
        for(a=999;a>=948;a-- )
            if (c%a==0)
            {
                // biggest palindrome is starting with 9
                // so smallest valid result is 900009
                // its larger factor is at least sqrt(900009), about 948.7,
                // so test in range <948,999>
                b=c/a;
                if ((b>=100)&&(b<1000)) { cc=c; i0=0; i1=0; i2=0; break; }
            }
    return cc;
}
//---------------------------------------------------------------------------
this is too fast for me to properly measure the time (raw time is around 0.037 ms)
removed the divisions and multiplications from palindrome generation
changed the ranges after some numeric analysis and thinking while waiting for bus
the first loop can be eliminated (result starts with 9)
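The same strategy (walk the palindromes downward and look for a 3-digit divisor no smaller than the square root) can be sketched in Python as follows. This is an illustrative translation rather than the code that was timed above; the function name is made up, and the palindrome formula is the one quoted in the question.

```python
def largest_palindrome_product():
    """Largest 6-digit palindrome that is a product of two 3-digit numbers."""
    for n in range(999, 99, -1):
        # palindrome whose first three digits are n
        p = 1100 * n - 990 * (n // 10) - 99 * (n // 100)
        a = 999
        while a * a >= p:  # the larger factor is at least sqrt(p)
            if p % a == 0 and 100 <= p // a <= 999:
                return p, a, p // a
            a -= 1
    return None

p, a, b = largest_palindrome_product()
print(p, a, b)
```

Because palindromes are visited from largest to smallest, the first hit is the answer and the search can stop immediately.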
I wrote this a while back when I just started learning python, but here it is:
for i in range (999, 800, -1):
    for j in range (999,800, -1):
        number = i*j
        str_number = str(number)
        rev_str_number = str_number[::-1]
        if str_number == rev_str_number:
            print("%s is a palindrome" % number)
I did not check all the numbers you did, but I still got the correct answer. What I really learned in this exercise is the "::" and how it works. You can check that out here.
Good luck with Euler!

What is the difference between this C++ code and this Python code?

Answer
Thanks to #TheDark for spotting the overflow. The new C++ solution is pretty freakin' funny, too. It's extremely redundant:
if(2*i > n && 2*i > i)
replaced the old line of code if(2*i > n).
Background
I'm doing this problem on HackerRank, though the problem may not be entirely related to this question. If you cannot see the webpage, or have to make an account and don't want to, the problem is listed in plain text below.
Question
My C++ code is timing out, but my python code is not. I first suspected this was due to overflow, but I used sizeof to be sure that unsigned long long can reach 2^64 - 1, the upper limit of the problem.
I practically translated my C++ code directly into Python to see if it was my algorithms causing the timeouts, but to my surprise my Python code passed every test case.
C++ code:
#include <iostream>
bool pot(unsigned long long n)
{
    if (n % 2 == 0) return pot(n/2);
    return (n==1); // returns true if n is power of two
}
unsigned long long gpt(unsigned long long n)
{
    unsigned long long i = 1;
    while(2*i < n) {
        i *= 2;
    }
    return i; // returns greatest power of two less than n
}
int main()
{
    unsigned int t;
    std::cin >> t;
    std::cout << sizeof(unsigned long long) << std::endl;
    for(unsigned int i = 0; i < t; i++)
    {
        unsigned long long n;
        unsigned long long count = 1;
        std::cin >> n;
        while(n > 1) {
            if (pot(n)) n /= 2;
            else n -= gpt(n);
            count++;
        }
        if (count % 2 == 0) std::cout << "Louise" << std::endl;
        else std::cout << "Richard" << std::endl;
    }
}
Python 2.7 code:
def pot(n):
    while n % 2 == 0:
        n/=2
    return n==1
def gpt(n):
    i = 1
    while 2*i < n:
        i *= 2
    return i
t = int(raw_input())
for i in range(t):
    n = int(raw_input())
    count = 1
    while n != 1:
        if pot(n):
            n /= 2
        else:
            n -= gpt(n)
        count += 1
    if count % 2 == 0:
        print "Louise"
    else:
        print "Richard"
To me, both versions look identical. I still think I'm somehow being fooled and am actually getting overflow, causing timeouts, in my C++ code.
Problem
Louise and Richard play a game. They have a counter set to N. Louise gets the first turn and the turns alternate thereafter. In the game, they perform the following operations.
If N is not a power of 2, they reduce the counter by the largest power of 2 less than N.
If N is a power of 2, they reduce the counter by half of N.
The resultant value is the new N which is again used for subsequent operations.
The game ends when the counter reduces to 1, i.e., N == 1, and the last person to make a valid move wins.
Given N, your task is to find the winner of the game.
Input Format
The first line contains an integer T, the number of testcases.
T lines follow. Each line contains N, the initial number set in the counter.
Constraints
1 ≤ T ≤ 10
1 ≤ N ≤ 2^64 - 1
Output Format
For each test case, print the winner's name in a new line. So if Louise wins the game, print "Louise". Otherwise, print "Richard". (Quotes are for clarity)
Sample Input
1
6
Sample Output
Richard
Explanation
As 6 is not a power of 2, Louise reduces the largest power of 2 less than 6 i.e., 4, and hence the counter reduces to 2.
As 2 is a power of 2, Richard reduces the counter by half of 2 i.e., 1. Hence the counter reduces to 1.
As we reach the terminating condition with N == 1, Richard wins the game.
When n is greater than 2^63, your gpt function will eventually have i as 2^63 and then multiply 2^63 by 2, which wraps around to 0 in a 64-bit unsigned type. Since 0 < n still holds, the loop sets i to 0 and never terminates, multiplying 0 by 2 each time.
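Python integers never wrap, which is why the Python version passes. The wraparound can still be emulated by masking to 64 bits, as in this sketch. The names MASK64, gpt_wrapping and gpt_safe are made up for illustration, and the step limit exists only so that the demonstration terminates.

```python
MASK64 = (1 << 64) - 1  # emulate unsigned 64-bit arithmetic

def gpt_wrapping(n, max_steps=200):
    """The question's loop with 64-bit wraparound; None if it fails to stop."""
    i = 1
    for _ in range(max_steps):
        if ((2 * i) & MASK64) >= n:
            return i
        i = (i * 2) & MASK64
    return None  # i has collapsed to 0 and the real loop would spin forever

def gpt_safe(n):
    """Largest power of two below n (1 for n = 1) without computing 2*i."""
    i = 1
    while i <= (n - 1) // 2:  # same condition as 2*i < n, but cannot wrap
        i *= 2
    return i

print(gpt_wrapping(6), gpt_safe(6))
print(gpt_wrapping(2**63 + 1), gpt_safe(2**63 + 1))
```

The rewritten condition i <= (n - 1) / 2 carries over to C++ unchanged for n >= 1, since it never forms a value larger than n.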
Try this bit-twiddling hack, which is probably slightly faster:
unsigned long largest_power_of_two_not_greater_than(unsigned long x) {
    for (unsigned long y; (y = x & (x - 1)); x = y) {}
    return x;
}
x&(x-1) is x without its least significant one-bit. So y will be zero (terminating the loop) exactly when x has been reduced to a power of two, which will be the largest power of two not greater than the original x. The loop is executed once for every 1-bit in x, which is on average half as many iterations as your approach. Also, this one has no issues with overflow. (It does return 0 if the original x was 0. That may or may not be what you want.) For this problem you would declare it with unsigned long long, since unsigned long is only 32 bits on some platforms.
Note that if the original x was a power of two, that value is simply returned immediately. So the function doubles as a test whether x is a power of two (or 0).
While that is fun and all, in real-life code you'd probably be better off finding your compiler's equivalent to this gcc built-in (unless your compiler is gcc, in which case here it is):
Built-in Function: int __builtin_clz (unsigned int x)
Returns the number of leading 0-bits in X, starting at the most
significant bit position. If X is 0, the result is undefined.
(Also available as __builtin_clzl for unsigned long arguments and __builtin_clzll for unsigned long long.)
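For readers following along in Python, both ideas translate directly. This is a sketch: the first function mirrors the bit-clearing loop above, and the second uses int.bit_length, which plays a role similar to a count-leading-zeros built-in. The name via_bit_length is made up for this example.

```python
def largest_power_of_two_not_greater_than(x):
    """Clear the lowest set bit until only the highest one remains."""
    while x & (x - 1):
        x &= x - 1
    return x

def via_bit_length(x):
    """Same result from the position of the highest set bit (requires x > 0)."""
    return 1 << (x.bit_length() - 1)

for value in (6, 8, 2**64 - 1):
    print(value, largest_power_of_two_not_greater_than(value), via_bit_length(value))
```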

(Project Euler #3) Trying to replicate a solution in Python to C++, going horribly wrong, not sure how

EDIT: Solved! Simple mistake, accidentally left the int values at just int which couldn't hold that big of a number. Thanks for the help!
I already completed the third Project Euler problem:
"The prime factors of 13195 are 5, 7, 13 and 29. What is the largest prime factor of the number 600851475143?"
In Python with this code (that works):
def main():
    num = 600851475143 # You can replace this number with any number you want to find the largest prime to
    x = 2
    while x * x < num:
        while num % x == 0:
            num = num / x #Divide number by generated number (X) to get the prime number.
        x = x + 1 # Continue in formula searching for largest prime
    print num #Prints largest prime of the assigned number (600851475143)
main()
and that worked fine. However, when I tried replicating said code in C++ with this code:
#include "stdafx.h"
#include <iostream>
int main()
{
    int num = 600851475143;
    int x = 2;
    while (x*x < num)
    {
        while (num % x == 0)
        {
            num /= x;
        }
        x = x++;
    }
    std::cout << num;
    char z;
    std::cin >> z;
    return 0;
}
I always get the output "-443946297" instead of the correct and very different output I was expecting, "6857"
Can anyone help explain how I am getting such an extremely crazy answer from essentially the same code? Thanks in advance!
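600851475143 is probably too large to fit in an int, leading to overflow. Try changing the type to long long. (You should probably change x to long long too, although it might not matter in this case.) Separately, x = x++; should be just x++; because assigning a variable its own post-increment is undefined behavior before C++17 and leaves x unchanged from C++17 on.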
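The exact number printed can be reproduced by keeping only the low 32 bits of the constant, which is what typical compilers with a 32-bit int do here. In the sketch below, to_int32 is a hypothetical helper written for this illustration, and largest_prime_factor is a slight variation of the loop from the question that also handles repeated factors.

```python
def to_int32(value):
    """Interpret the low 32 bits of value as a signed two's-complement int."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value

def largest_prime_factor(num):
    """Trial division: strip small factors until only the largest remains."""
    x = 2
    while x * x <= num:
        if num % x == 0:
            num //= x
        else:
            x += 1
    return num

print(to_int32(600851475143))              # -443946297, the value the C++ program printed
print(largest_prime_factor(600851475143))  # 6857
```

Since the wrapped value is negative, x*x < num is false on the first check, so the loop body never runs and the program prints the wrapped number unchanged.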
