String sorting problem with code execution time limit

I was recently trying to solve a HackerEarth problem. The code worked on the sample inputs and some custom inputs that I gave. But, when I submitted, it showed errors for exceeding the time limit. Can someone explain how I can make the code run faster?
Problem Statement: Cyclic shift
A large binary number is represented by a string A of size N and comprises of 0s and 1s. You must perform a cyclic shift on this string. The cyclic shift operation is defined as follows:
If the string A is [A0, A1,..., An-1], then after performing one cyclic shift, the string becomes [A1, A2,..., An-1, A0].
You performed the shift infinite number of times and each time you recorded the value of the binary number represented by the string. The maximum binary number formed after performing (possibly 0) the operation is B. Your task is to determine the number of cyclic shifts that can be performed such that the value represented by the string A will be equal to B for the Kth time.
Input format:
First line: A single integer T denoting the number of test cases
For each test case:
First line: Two space-separated integers N and K
Second line: A denoting the string
Output format:
For each test case, print a single line containing one integer that represents the number of cyclic shift operations performed such that the value represented by string A is equal to B for the Kth time.
Code:
import math
def value(s):
    u = len(s)
    d = 0
    for h in range(u):
        d = d + (int(s[u-1-h]) * math.pow(2, h))
    return d
t = int(input())
for i in range(t):
    x = list(map(int, input().split()))
    n = x[0]
    k = x[1]
    a = input()
    v = 0
    for j in range(n):
        a = a[1:] + a[0]
        if value(a) > v:
            b = a
            v = value(a)
    ctr = 0
    cou = 0
    while ctr < k:
        a = a[1:] + a[0]
        cou = cou + 1
        if a == b:
            ctr = ctr + 1
    print(cou)

In the problem, the constraint on n is 0<=n<=1e5. In the function value(), you are calculating an integer from a binary string whose length can go up to 1e5, so the integer you calculate can be as large as pow(2, 1e5). This is surely impractical.
As mentioned by Prune, you must use an efficient algorithm to find a subsequence, say sub1, whose repetitions make up the given string A. If you solve this by brute force, the time complexity will be O(n*n); since the maximum value of n is 1e5, the time limit will be exceeded. So use a more efficient algorithm.

I can't do much with the code you posted, since you obfuscated it with meaningless variables and a lack of explanation. When I scan it, I get the impression that you've made the straightforward approach of doing a single-digit shift in a long-running loop. You count iterations until you hit B for the Kth time.
This is easy to understand, but cumbersome and inefficient.
Since the cycle repeats every N iterations, you gain no new information from repeating that process. All you need to do is find where in the series of N iterations you encounter B ... which could be multiple times.
In order for B to appear multiple times, A must consist of a particular sub-sequence of bits, repeated 2 or more times. For instance, 101010 or 011011. You can detect this with a simple addition to your current algorithm: at each iteration, check to see whether the current string matches the original. The first time you hit this, simply compute the repetition factor as rep = len(a) / j. At this point, exit the shifting loop: the present value of b is the correct one.
Now that you have b and its position in the first j rotations, you can directly compute the needed result without further processing.
I expect that you can finish the algorithm and do the coding from here.
Ah -- taken as a requirements description, the wording of your problem suggests that B is a given. If not, then you need to detect the largest value.
To find B, append A to itself. Find the A-length string with the largest value. You can hasten this by finding the longest string of 1s, applying other well-known string-search algorithms for the value-trees after the first 0 following those largest strings.
Note that, while you iterate over A, you look for the first place in which you repeat the original value: this is the desired repetition length, which drives the direct-computation phase in the first part of my answer.
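The steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not the answerer's code: the names max_rotation_start and kth_max_shift are hypothetical, the largest rotation is located with a two-pointer scan (the standard least-rotation technique with the comparison flipped), the repetition length is found by searching for A inside A + A, and shift counts are assumed to start at 0 (so a string that already equals B has its first match at 0 shifts). Adjust that convention if the judge counts differently.

```python
def max_rotation_start(s):
    """Return an index i such that s[i:] + s[:i] is the largest rotation of s."""
    n = len(s)
    i, j, k = 0, 1, 0  # two candidate starts and the length matched so far
    while i < n and j < n and k < n:
        a, b = s[(i + k) % n], s[(j + k) % n]
        if a == b:
            k += 1
            continue
        if a < b:
            i += k + 1  # no start in i..i+k can be the maximum
        else:
            j += k + 1  # no start in j..j+k can be the maximum
        if i == j:
            j += 1
        k = 0
    return min(i, j)


def kth_max_shift(s, k):
    """Number of left shifts after which s equals its largest rotation for the k-th time.

    Assumes s is non-empty and that counting starts at 0 shifts.
    """
    # Smallest p >= 1 such that rotating s by p positions gives s again.
    period = (s + s).find(s, 1)
    # First shift (within one period) that produces the largest rotation.
    first = max_rotation_start(s) % period
    return first + (k - 1) * period


print(kth_max_shift("0101", 2))
```

Both helpers avoid converting the string to a number, so no huge integers are built and the loop over rotations is not repeated K times.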

Related

How to implement a bubble sort that only performs "K" times

I am solving the following bubble sort algorithm.
However, this algorithm does not seem to be a common bubble sort implementation. I have implemented the following code, but it will time out. The reason is that the time complexity of my code is still O (n^2).
How do I write code for a bubble sort to correctly understand and solve the problem?
Solution
Bubble sorting is an algorithm that sorts sequences of length N in such a way that two adjacent elements are examined to change their positions. Bubble sorting can be performed N times as shown below.
Compare the first value with the second value, and change the position if the first value is greater.
Compare the second value with the third value, and change the position if the second value is greater.
...
Compare the N-1 and N-th values, and change the position if the N-1th value is greater.
I know the result of bubble sorting, so I know the intermediate process of bubble sorting. However, since N is very large, it takes a long time to perform the above steps K times. Write a program that will help you to find the intermediate process of bubble sorting.
Input
N and K are given in the first line.
The second line gives the status of the first sequence. That is, N integers forming the first sequence are given in turn, with spaces between them.
1 <= N <= 100,000
1 <= K <= N
Each term in the sequence is an integer from 1 to 1,000,000,000.
Output
The above steps are repeated K times and the status of the sequence is output.
Commandline
Example input
4 1
62 23 32 15
Example output
23 32 15 62
My Code
n_k = input() # 4 1
n_k_s = [int(num) for num in n_k.split()]
progression = input() # 62 23 32 15 => 23 32 15 62
progressions = [int(num) for num in progression.split()]
def bubble(n_k_s, progressions):
    for i in range(0, n_k_s[1]):
        for j in range(i, n_k_s[0]-1):
            if (progressions[j] > progressions[j+1]):
                temp = progressions[j]
                progressions[j] = progressions[j+1]
                progressions[j+1] = temp
    for k in progressions:
        print(k, end=" ")
bubble(n_k_s, progressions)

I'm confused as to why you're saying "The reason is that the time complexity of my code is still O (n^2)"
The time complexity is always O(n²), unless you add a flag to check whether your list is already sorted (the complexity would then be O(n) if the list is sorted at the beginning of your program).

As best I can tell, you have implemented the algorithm requested. It is O(nk); Phillippe already covered the rationale I was typing.
Yes, you can set a flag to indicate whether you've made any exchanges on this pass. That doesn't change any complexity except for best-case -- although it does reduce the constant factor in many other cases.
One possibility I see for speeding up your process is to use a more efficient value exchange: use the Python idiom a, b = b, a. In your case, the inner loop might become:
    done = True
    for j in range(i, n_k_s[0]-1):
        if progressions[j] > progressions[j+1]:
            progressions[j], progressions[j+1] = progressions[j+1], progressions[j]
            done = False
    if done:
        break
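Putting the pieces together, a self-contained version of the K-pass loop might look like the sketch below. The function name bubble_k_passes is hypothetical. One deliberate difference from the question's code is an assumption on my part about what the problem statement intends: each pass starts at the first element and stops before the part that earlier passes already put in place, so the inner loop runs over range(n - 1 - i) rather than range(i, n - 1). It is still O(n*k) in the worst case.

```python
def bubble_k_passes(values, k):
    """Return the sequence after at most k bubble-sort passes (input is not modified)."""
    seq = list(values)
    n = len(seq)
    for i in range(min(k, n)):
        swapped = False
        # After i passes the last i elements are already in their final place.
        for j in range(n - 1 - i):
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
                swapped = True
        if not swapped:
            break  # already sorted, further passes change nothing
    return seq


print(*bubble_k_passes([62, 23, 32, 15], 1))
```

With the sample input this prints 23 32 15 62, matching the expected output in the question.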

Repeated String Match

I did Contest 52 of leetcode.com and I had trouble understanding the solution. The problem statement is:
Given two strings A and B, find the minimum number of times A has to be repeated such that B is a substring of it. If no such solution, return -1.
For example, with A = "abcd" and B = "cdabcdab".
Return 3, because by repeating A three times (“abcdabcdabcd”), B is a substring of it; and B is not a substring of A repeated two times ("abcdabcd").
The solution is:
import math

def repeatedStringMatch(self, A, B):
    """
    :type A: str
    :type B: str
    :rtype: int
    """
    times = int(math.ceil(float(len(B)) / len(A)))
    for i in range(2):
        if B in (A * (times + i)):
            return times + i
    return -1
The explanation from one of the collaborators was:
A has to be repeated sufficient times such that it is at least as long as B (or one more), hence we can conclude that the theoretical lower bound for the answer would be length of B / length of A.
Let x be the theoretical lower bound, which is ceil(len(B)/len(A)).
The answer n can only be x or x + 1
I don't understand why n can only be x or x+1, can someone help?

If x+1 < n and B is a substring of A repeated n times and you've embedded B in it then either you can chop off the last copy of A without hitting B (meaning that n is not minimal) or else the start of B in A is after the end of the first copy so you can chop off the first copy (and again n is not minimal).
Therefore if it fits at all, it must fit within x+1 copies. Based on length alone it can't fit within < x copies. So the only possibilities left are x and x+1 copies. (And examples can be found where each is the answer.)
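To make the "x or x + 1" argument concrete, here is a small standalone sketch (the function name repeated_string_match is hypothetical, and it assumes A is non-empty). It tries only the two candidate counts and shows one input for which each candidate is the answer.

```python
def repeated_string_match(a, b):
    """Minimum number of copies of a that contain b as a substring, or -1."""
    x = -(-len(b) // len(a))  # ceil(len(b) / len(a)) using integer arithmetic
    for n in (x, x + 1):      # by the argument above, no other count can be minimal
        if b in a * n:
            return n
    return -1


print(repeated_string_match("ab", "abab"))        # fits in exactly x copies
print(repeated_string_match("abcd", "cdabcdab"))  # needs x + 1 copies
print(repeated_string_match("abc", "acb"))        # never fits
```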

Suppose we have two strings m = "abcde" and any 11 character string "e_abcde_abcde", n to be searched. Since the 11 digits are to be matched, the minimum number of characters required is 11 which requires ((n/m)+1)=3 minimum sets for search due to integer rounding. Now the ending position depends on the beginning position and there are at most length(m) unique starting points in each set of m elements. So we can start at each of the positions till the first set is exhausted. The last try may go to m_last = 4, hence the last element of the matched string will go to (n/m+2) set. After the first set is exhausted, the above pattern is exhausted and there is no new searches. Hence we require (n/m + 2) * length (m) for full search.

I know this is an old question; still, I would like to add my 2 cents here.
I think @btilly has provided enough clarity, but a picture may help further.

Random number generator that returns only one number each time

Does Python have a random number generator that returns only one random integer number each time when next() function is called? Numbers should not repeat and the generator should return random integers in the interval [1, 1 000 000] that are unique.
I need to generate more than million different numbers and that sounds as if it is very memory consuming in case all the number are generated at same time and stored in a list.

You are looking for a linear congruential generator with a full period. This will allow you to get a pseudo-random sequence of non-repeating numbers in your target number range.
Implementing a LCG is actually very simple, and looks like this:
def lcg(a, c, m, seed = None):
    num = seed or 0
    while True:
        num = (a * num + c) % m
        yield num
Then, it just comes down to choosing the correct values for a, c, and m to guarantee that the LCG will generate a full period (which is the only guarantee that you get non-repeating numbers). As the Wikipedia article explains, the following three conditions need to be true:
m and c need to be relatively prime.
a - 1 is divisible by all prime factors of m
a - 1 is divisible by 4, if m is also divisible by 4.
The first one is very easily guaranteed by simply choosing a prime for c. Also, this is the value that can be chosen last, and this will ultimately allow us to mix up the sequence a bit.
The relationship between a - 1 and m is more complicated though. In a full period LCG, m is the length of the period. Or in other words, it is the number range your numbers come from. So this is what you are usually choosing first. In your case, you want m to be around 1000000. Choosing exactly your maximum number might be difficult since that restricts you a lot (in both your choice of a and also c), so you can also choose numbers larger than that and simply skip all numbers outside of your range later.
Let’s choose m = 1000000 now though. The prime factors of m are 2 and 5. And it’s also obviously divisible by 4. So for a - 1, we need a number that is a multiple of 2 * 2 * 5 to satisfy the conditions 2 and 3. Let’s choose a - 1 = 160, so a = 161.
For c, we are using a random prime that’s somewhere in between of our range: c = 506903
Putting that into our LCG gives us our desired sequence. We can choose any seed value from the range (0 <= seed < m) as the starting point of our sequence.
So let’s try it out and verify that what we thought of actually works. For this purpose, we are just collecting all numbers from the generator in a set until we hit a duplicate. At that point, we should have m = 1000000 numbers in the set:
>>> g = lcg(161, 506903, 1000000)
>>> numbers = set()
>>> for n in g:
        if n in numbers:
            raise Exception('Number {} already encountered before!'.format(n))
        numbers.add(n)

Traceback (most recent call last):
  File "<pyshell#5>", line 3, in <module>
    raise Exception('Number {} already encountered before!'.format(n))
Exception: Number 506903 already encountered before!
>>> len(numbers)
1000000
And it’s correct! So we did create a pseudo-random sequence of numbers that allowed us to get non-repeating numbers from our range m. Of course, by design, this sequence will be always the same, so it is only random once when you choose those numbers. You can switch up the values for a and c to get different sequences though, as long as you maintain the properties mentioned above.
The big benefit of this approach is of course that you do not need to store all the previously generated numbers. It is a constant space algorithm as it only needs to remember the initial configuration and the previously generated value.
It will also not deteriorate as you get further into the sequence. This is a general problem with solutions that just keep generating a random number until a new one is found that hasn’t been encountered before. This is because the longer the list of generated numbers gets, the less likely you are to hit a number that’s not in that list with an evenly distributed random algorithm. So generating the 1000000th number will likely take a long time with memory-based random generators.
But of course, a simple algorithm that just performs some multiplication and some addition does not appear very random. Keep in mind, though, that LCGs are the basis for many classic pseudo-random number generators. Python's random.random() uses a different algorithm internally (the Mersenne Twister), with a much larger state and period, so you don’t notice any pattern there.
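As a sketch of how the pieces above fit together, the following wraps the generator so that it checks the three conditions before starting, stops after one full period, and shifts the output from [0, m - 1] to [1, m]. The names has_full_period and unique_randoms are hypothetical helpers, not part of any library, and the small parameters in the usage line (a = 21, c = 7, m = 1000) are chosen only to keep the demonstration fast.

```python
from math import gcd


def has_full_period(a, c, m):
    """Check the three full-period conditions listed above for an LCG."""
    if gcd(c, m) != 1:  # m and c must be relatively prime
        return False
    # Collect the prime factors of m by trial division.
    factors, rest, p = set(), m, 2
    while p * p <= rest:
        while rest % p == 0:
            factors.add(p)
            rest //= p
        p += 1
    if rest > 1:
        factors.add(rest)
    if any((a - 1) % f for f in factors):  # a - 1 divisible by every prime factor
        return False
    return m % 4 != 0 or (a - 1) % 4 == 0  # extra condition when 4 divides m


def unique_randoms(a, c, m, seed=0):
    """Yield each integer in [1, m] exactly once, in LCG order."""
    if not has_full_period(a, c, m):
        raise ValueError("parameters do not give a full-period LCG")
    num = seed % m
    for _ in range(m):
        num = (a * num + c) % m
        yield num + 1  # shift from [0, m - 1] to [1, m]


gen = unique_randoms(21, 7, 1000, seed=42)
print(next(gen), next(gen), next(gen))
```

Calling unique_randoms(161, 506903, 1000000) with the values chosen in the answer works the same way and still needs only constant memory.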

If you really care about the memory you could use a NumPy array (or a Python array).
A one million NumPy array of int32 (more than enough to contain integers between 0 and 1 000 000) will only consume ~4MB, Python itself would require ~36MB (roughly 28byte per integer and 8 byte for each list element + overallocation) for an identical list:
>>> # NumPy array
>>> import numpy as np
>>> np.arange(1000000, dtype=np.int32).nbytes
4000000
>>> # Python list
>>> import sys
>>> import random
>>> l = list(range(1000000))
>>> random.shuffle(l)
>>> size = sys.getsizeof(l) # size of the list
>>> size += sum(sys.getsizeof(item) for item in l) # size of the list elements
>>> size
37000108
You only want unique values and you have a consecutive range (1 million requested items and 1 million different numbers), so you could simply shuffle the range and then yield items from your shuffled array:
def generate_random_integer():
    arr = np.arange(1000000, dtype=np.int32)
    np.random.shuffle(arr)
    yield from arr
    # yield from is equivalent to:
    # for item in arr:
    #     yield item
And it can be called using next:
>>> gen = generate_random_integer()
>>> next(gen)
443727
However that will throw away the performance benefit of using NumPy, so in case you want to use NumPy don't bother with the generator and just perform the operations (vectorized - if possible) on the array. It consumes much less memory than Python and it could be orders of magnitude faster (factors of 10-100 faster are not uncommon!).

For a large number of non-repeating random numbers, use encryption. With a given key, encrypt the numbers 0, 1, 2, 3, ... Since encryption is uniquely reversible, each encrypted number is guaranteed to be unique, provided you use the same key. For 64-bit numbers use DES. For 128-bit numbers use AES. For other sizes use some format-preserving encryption. For pure numbers you might find the Hasty Pudding cipher useful, as it allows a large range of different bit sizes and non-bit sizes as well, like [0..5999999].
Keep track of the key and the last number you encrypted. When you need a new unique random number just encrypt the next number you haven't used so far.
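The idea can be illustrated without a real cipher. The sketch below is not DES, AES, or the Hasty Pudding cipher: it swaps in a toy Feistel network whose round function is built from SHA-256 (via the standard hashlib module), plus "cycle walking" to stay inside [0, n). It is meant only to show why encrypting a counter yields unique outputs, and it has not been vetted for cryptographic strength. The names feistel_permute and encrypted_sequence are hypothetical.

```python
import hashlib


def _round_value(value, key, rnd, half_bits):
    """Keyed round function: hash (key, round, value) down to half_bits bits."""
    digest = hashlib.sha256(f"{key}:{rnd}:{value}".encode()).digest()
    return int.from_bytes(digest[:8], "big") & ((1 << half_bits) - 1)


def feistel_permute(x, key, half_bits, rounds=4):
    """Bijection on integers of 2 * half_bits bits (every Feistel round is invertible)."""
    mask = (1 << half_bits) - 1
    left, right = x >> half_bits, x & mask
    for rnd in range(rounds):
        left, right = right, left ^ _round_value(right, key, rnd, half_bits)
    return (left << half_bits) | right


def encrypted_sequence(n, key):
    """Yield every integer in [0, n) exactly once by 'encrypting' 0, 1, 2, ..."""
    half_bits = max(1, ((n - 1).bit_length() + 1) // 2)
    for counter in range(n):
        x = feistel_permute(counter, key, half_bits)
        while x >= n:  # cycle walking: re-encrypt until the value is in range
            x = feistel_permute(x, key, half_bits)
        yield x


gen = encrypted_sequence(1000000, key="my secret key")
print(next(gen), next(gen), next(gen))
```

Only the key and the current counter need to be remembered, which matches the bookkeeping described above.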

Considering your numbers should fit in a 64-bit integer, one million of them amount to 64 megabits (8 megabytes) of raw data, plus the overhead of the list and the integer objects. If your computer can afford that, the easiest way is to use shuffle:
import random
randInts = list(range(1000000))
random.shuffle(randInts)
print(randInts)
Note that the other method is to keep track of the previously generated numbers, which will get you to the point of having all of them stored too.

I just needed that function, and to my huge surprise I haven't found anything that would suit my needs. @poke's answer didn't satisfy me because I needed precise borders, and the other ones that involved lists used too much memory.
Initially, I needed a function that would generate numbers from a to b, where b - a could be anything from 0 to 2^32 - 1, which means the range of those numbers could be as large as the maximal 32-bit unsigned integer.
The idea of my own algorithm is simple both to understand and implement. It's a binary tree, where the next branch is chosen by a 50/50 chance boolean generator. Basically, we divide all numbers from a to b into two branches, then decide from which one we yield the next value, then do that recursively until we end up with single nodes, which are also picked at random.
The depth of recursion is roughly log2(b - a + 1), which implies that for the given stack limit of 256, your highest range would be 2^256, which is impressive.
Things to note:
a must be less than or equal to b - otherwise no output will be displayed.
Boundaries are included, meaning unique_random_generator(0, 3) will generate [0, 1, 2, 3].
TL;DR - here's the code
import math, random
# a, b - inclusive
def unique_random_generator(a, b):
    # corner case on wrong input
    if a > b:
        return
    # end node of the tree
    if a == b:
        yield a
        return
    # middle point of tree division
    c = math.floor((a + b) / 2)
    generator_left = unique_random_generator(a, c) # left branch - contains all the numbers between 'a' and 'c'
    generator_right = unique_random_generator(c + 1, b) # right branch - contains all the numbers between 'c + 1' and 'b'
    has_values = True
    while (has_values):
        # decide whether we pick up a value from the left branch, or the right
        decision = bool(random.getrandbits(1))
        if decision:
            next_left = next(generator_left, None)
            # if left branch is empty, check the right one
            if next_left == None:
                next_right = next(generator_right, None)
                # if both empty, current recursion's dessicated
                if next_right == None:
                    has_values = False
                else:
                    yield next_right
            else:
                yield next_left
                next_right = next(generator_right, None)
                if next_right != None:
                    yield next_right
        else:
            next_right = next(generator_right, None)
            # if right branch is empty, check the left one
            if next_right == None:
                next_left = next(generator_left, None)
                # if both empty, current recursion's dessicated
                if next_left == None:
                    has_values = False
                else:
                    yield next_left
            else:
                yield next_right
                next_left = next(generator_left, None)
                if next_left != None:
                    yield next_left
Usage:
for i in unique_random_generator(0, 2**32):
    print(i)

import random
# number of random entries
x = 1000
# The set of all values
y = set()
while x > 0:
    a = random.randint(0, 10**10)
    if a not in y:
        y.add(a)  # remember the value so it cannot be picked again
        x -= 1
This way you are sure you have perfectly random unique values
x represents the number of values you want

You can easily make one yourself:
from random import random
def randgen():
    while True:
        yield random()
ran = randgen()
next(ran)
next(ran)
...

I need help finding a smart solution to shorten the time this code runs

2 days ago i started practicing python 2.7 on Codewars.com and i came across a really interesting problem, the only thing is i think it's a bit too much for my level of python knowledge. I actually did solve it in the end but the site doesn't accept my solution because it takes too much time to complete when you call it with large numbers, so here is the code:
from itertools import permutations
def next_bigger(n):
    digz =list(str(n))
    nums =permutations(digz, len(digz))
    nums2 = []
    for i in nums:
        z =''
        for b in range(0,len(i)):
            z += i[b]
        nums2.append(int(z))
    nums2 = list(set(nums2))
    nums2.sort()
    try:
        return nums2[nums2.index(n)+1]
    except:
        return -1
"You have to create a function that takes a positive integer number and returns the next bigger number formed by the same digits" - These were the original instructions
Also, at one point i decided to forgo the whole permutations idea, and in the middle of this second attempt i realized that there's no way it would work:
def next_bigger(n):
    for i in range (1,11):
        c1 = n % (10**i) / (10**(i-1))
        c2 = n % (10**(i+1)) / (10**i)
        if c1 > c2:
            return ((n /(10**(i+1)))*10**(i+1)) + c1 *(10**i) + c2*(10**(i-1)) + n % (10**(max((i-1),0)))
            break
if anybody has any ideas, i'm all-ears and if you hate my code, please do tell, because i really want to get better at this.

stolen from http://www.geeksforgeeks.org/find-next-greater-number-set-digits/
Following are few observations about the next greater number.
1) If all digits are sorted in descending order, then the output is always “Not Possible”. For example, 4321.
2) If all digits are sorted in ascending order, then we need to swap the last two digits. For example, 1234.
3) For other cases, we need to process the number from the rightmost side (why? because we need to find the smallest of all greater numbers).
You can now try developing an algorithm yourself.
Following is the algorithm for finding the next greater number.
I) Traverse the given number from the rightmost digit, and keep traversing until you find a digit that is smaller than the previously traversed digit. For example, if the input number is “534976”, we stop at 4 because 4 is smaller than the next digit 9. If we do not find such a digit, then the output is “Not Possible”.
II) Now search the right side of the digit found above, ‘d’, for the smallest digit greater than ‘d’. For “534976”, the right side of 4 contains “976”. The smallest digit greater than 4 is 6.
III) Swap the two digits found above; we get 536974 in the above example.
IV) Now sort all digits from the position next to ‘d’ to the end of the number. The number that we get after sorting is the output. For the above example, we sort the digits “974” that follow the swapped position in 536974. We get “536479”, which is the next greater number for input 534976.
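Steps I to IV can be sketched in Python 3 as follows. This is one possible reading of the algorithm, not the linked article's code. It returns -1 instead of “Not Possible” (mirroring the question's own convention), and it uses the fact that the digits to the right of ‘d’ are in non-increasing order, so the search in step II can scan from the right and the sort in step IV reduces to a reversal.

```python
def next_bigger(n):
    """Return the next bigger number formed by the same digits, or -1 if none exists."""
    digits = list(str(n))
    # Step I: from the right, find the first digit smaller than its right neighbour.
    i = len(digits) - 2
    while i >= 0 and digits[i] >= digits[i + 1]:
        i -= 1
    if i < 0:
        return -1  # digits are in descending order: no bigger number
    # Step II: the suffix is non-increasing, so the first digit from the right
    # that is greater than digits[i] is the smallest such digit.
    j = len(digits) - 1
    while digits[j] <= digits[i]:
        j -= 1
    # Step III: swap the two digits.
    digits[i], digits[j] = digits[j], digits[i]
    # Step IV: the suffix is still non-increasing, so reversing it sorts it ascending.
    digits[i + 1:] = reversed(digits[i + 1:])
    return int("".join(digits))


print(next_bigger(534976))
```

The whole function is linear in the number of digits, so it avoids generating any permutations.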

"formed by the same digits" - there's a clue that you have to break the number into digits: n = list(str(n))
"next bigger". The fact that they want the very next item means that you want to make the least change. Focus on changing the 1s digit. If that doesn't work, try the 10's digit, then the 100's, etc. The smallest change you can make is to exchange two furthest digits to the right that will increase the value of the integer. I.e. exchange the two right-most digits in which the more right-most is bigger.
def next_bigger(n):
    n = list(str(n))
    for i in range(len(n)-1, -1, -1):
        for j in range(i-1, -1, -1):
            if n[i] > n[j]:
                n[i], n[j] = n[j], n[i]
                return int("".join(n))
print next_bigger(123)
Oops. This fails for next_bigger(1675). I'll leave the buggy code here for a while, for whatever it is worth.

How about this? See in-line comments for explanations. Note that the way this is set up, you don't end up with any significant memory use (we're not storing any lists).
#!/usr/bin/python3
from itertools import permutations
def next_bigger(n):
    # set next_bigger to an arbitrarily large value to start: see the for-loop
    next_bigger = float('inf')
    # this returns a generator for all the integers that are permutations of n
    # we want a generator because when the potential number of permutations is
    # large, we don't want to store all of them in memory.
    perms = map(lambda x: int(''.join(x)), permutations(str(n)))
    for p in perms:
        if (p > n) and (p <= next_bigger):
            # we can find the next-largest permutation by going through all the
            # permutations, selecting the ones that are larger than n, and then
            # selecting the smallest from them.
            next_bigger = p
    return next_bigger
Note that this is still a brute-force algorithm, even if implemented for speed. Here is an example result:
time python3 next_bigger.py 3838998888
3839888889
real 0m2.475s
user 0m2.476s
sys 0m0.000s
If your code needs to be faster yet, then you'll need a smarter, non-brute-force algorithm.

You don't need to look at all the permutations. Take a look at the two permutations of the last two digits. If you have an integer greater than your integer, that's it. If not, take a look at the permutations of the last three digits, etc.
from itertools import permutations
def next_bigger(number):
    check = 2
    found = False
    digits = list(str(number))
    if sorted(digits, reverse=True) == digits:
        raise ValueError("No larger number")
    while not found:
        options = permutations(digits[-1*check:], check)
        candidates = list()
        for option in options:
            new = digits.copy()[:-1*check]
            new.extend(option)
            candidate = int(''.join(new))
            if candidate > number:
                candidates.append(candidate)
        if candidates:
            result = sorted(candidates)[0]
            found = True
            return result
        check += 1

HackerRank "AND product"

When I submit the below code for testcases in HackerRank challenge "AND product"...
You will be given two integers A and B. You are required to compute the bitwise AND amongst all natural numbers lying between A and B, both inclusive.
Input Format:
First line of the input contains T, the number of testcases to follow.
Each testcase in a newline contains A and B separated by a single space.
from math import log
for case in range(int(raw_input())):
    l, u = map(int, (raw_input()).split())
    if log(l, 2) == log(u, 2) or int(log(l,2))!=int(log(l,2)):
        print 0
    else:
        s = ""
        l, u = [x for x in str(bin(l))[2:]], [x for x in str(bin(u))[2:]]
        while len(u)!=len(l):
            u.pop(0)
        Ll = len(l)
        for n in range(0, len(l)):
            if u[n]==l[n]:
                s+=u[n]
        while len(s)!=len(l):
            s+="0"
        print int(s, 2)
...it passes 9 of the test cases, shows "Runtime error" in 1 test case, and shows "Wrong Answer" in the remaining 10.
What's wrong in this?

It would be better for you to use the Bitwise operator in Python for AND. The operator is: '&'
Try this code:
def andProduct(a, b):
    j=a+1
    x=a
    while(j<=b):
        x = x&j
        j+=1
    return x
For more information on Bitwise operator you can see: https://wiki.python.org/moin/BitwiseOperators

Yeah you can do this much faster.
You are doing this very straightforward, calculating all ands in a for loop.
It should actually be possible to calculate this in O(1) (I think)
But here are some optimisations:
1) abort the for loop if you get the value 0, because it will stay 0 no matter what
2)If there is a power of 2 between l and u return 0 (you don't need a loop in that case)
My Idea for O(1) would be to think about which bits change between u and l.
Because every bit that changes somewhere between u and l becomes 0 in the answer.
EDIT 1: Here is an answer in O(same leading digits) time.
https://math.stackexchange.com/questions/1073532/how-to-find-bitwise-and-of-all-numbers-for-a-given-range
EDIT 2: Here is my code. I have not tested it extensively, but it seems to work (O(log(n))).
from math import log
for case in [[i+1,j+i+1] for i in range(30) for j in range(30)]:
    #Get input
    l, u = case
    invL=2**int(log(l,2)+1)-l
    invU=2**int(log(u,2)+1)-u
    #Calculate pseudo bitwise xnor of input and format to binary rep
    e=format((u&l | invL & invU),'010b')
    lBin=format(l,'010b')
    #output to zero
    res=0
    #boolean to check if we have found any zero
    anyZero=False
    #boolean to check the first one because we have leading zeros
    firstOne=False
    for ind,i in enumerate(e):
        #for every digit
        #if it is a leading one
        if i=='1' and (not anyZero):
            firstOne=True
            #leftshift result (multiply by 2)
            res=res<<1
            #and add 1
            res=res+int(lBin[ind])
        #else if we already had a one and find a zero this happens every time
        elif(firstOne):
            anyZero=True
            #leftshift
            res=res<<1
    #test if we are in the same power, if not there was a power between
    if(res!=0):
        #print "test",(int(log(res,2))!=int(log(l,2))) | ((log(res,2))!=int(log(u,2)))
        if((int(log(res,2))!=int(log(l,2))) or (int(log(res,2))!=int(log(u,2)))):
            res=0
    print res
It worked for every test case but one. A small change is needed to pass the last one. You'll have to find out what that small change is yourself. Seriously.
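The observation that every bit which changes somewhere between l and u ends up as 0 can also be expressed with plain shifts, without logarithms or string formatting. The sketch below is written in Python 3 and is not the code from either answer; the function name range_and is hypothetical. It drops low bits from both bounds until they agree, which leaves their common leading bits, then shifts those back into place.

```python
def range_and(a, b):
    """Bitwise AND of all integers in [a, b], assuming 0 <= a <= b."""
    shift = 0
    while a != b:
        # The lowest remaining bit differs somewhere in the range, so it becomes 0.
        a >>= 1
        b >>= 1
        shift += 1
    return a << shift  # common leading bits, padded with zeros


print(range_and(12, 15))
```

The loop runs at most once per bit of b, so the cost is O(log b) per test case regardless of how far apart the bounds are.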
