Suggestions for optimization Python


This is a fairly straightforward programming problem in Python, and I am looking for suggestions for further optimization. My solution runs within the time limit except for very large strings. I am not looking for code, but rather for areas I should research to improve performance. I have already identified that I can skip the even-indexed characters, which reduces the work done in the loop, and that, given the nature of the operation, the string eventually returns to its original form, which is why I track when the repeat occurs. This allows me to break out early when n > repeat. I am not sure whether converting the string to a list is the most effective approach.
Problem:
We have a string s and we have a number n that indicates the number of times to run the function. Here is a function that takes your string, concatenates the even-indexed chars to the front, odd-indexed chars to the back. You perform this operation n times.
Example:
example where s = "qwertyuio" and n = 2:
after 1 iteration s = "qetuowryi"
after 2 iterations s = "qtorieuwy"
return "qtorieuwy"
def jumbled_string(s, n):
    sl = list(s)
    repeat = 0
    for y in range(0,n):
        for i in range(1, (len(sl)//2)+1):
            sl.append(sl.pop(i))
        if repeat == 0 and ''.join(sl) == s:
            repeat = y+1
            break
    if repeat != 0:
        afterrepeat = n%repeat
        for y in range(0,afterrepeat):
            for i in range(1, (len(sl)//2)+1):
                sl.append(sl.pop(i))
    return ''.join(sl)

I don't know what you mean by "pattern repeats". But if we stick to the problem statement, it's a one liner in Python:
s='abecidofug'
from itertools import chain
s2 = ''.join(chain([s[c] for c in range(0, len(s), 2)],[s[c] for c in range(1, len(s), 2)]))
s2
'aeioubcdfg'

In Python 3.8+ (which introduced the := operator) you could do it like this:
import collections
def jumbled_string(s: str, n: int) -> str:
    generator = (s:=s[::2]+s[1::2] for _ in range(n))
    collections.deque(generator, maxlen=0)
    return s
collections.deque with maxlen=0 is used here because it is a fast way to consume an iterator without storing its items.
Though, for small n I'm finding it faster to use:
def jumbled_string(s: str, n: int) -> str:
    for _ in (s:=s[::2]+s[1::2] for _ in range(n)):
        pass
    return s
Test:
jumbled_string("qwertyuio", 2)
Output:
'qtorieuwy'
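The slicing shown above and the repeat tracking described in the question can be combined. The following is only a minimal sketch, not a tuned implementation, and the name jumbled_string_cycle is made up for illustration. It relies on the observation from the question that the string eventually returns to its original form.

```python
def jumbled_string_cycle(s: str, n: int) -> str:
    """Apply the even/odd shuffle n times, stopping early once a cycle is found."""
    original = s
    for step in range(1, n + 1):
        s = s[::2] + s[1::2]
        if s == original:
            # The string is back at its start after `step` shuffles, so only
            # the remainder n % step still has to be applied.
            for _ in range(n % step):
                s = s[::2] + s[1::2]
            return s
    return s


print(jumbled_string_cycle("qwertyuio", 2))  # 'qtorieuwy'
```

For large n the loop runs at most one full cycle plus the remainder, rather than n times.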

You don't explain what n does. The statement is this:
def jumbled_string(s: str) -> str:
    even = s[::2]
    odd = s[1::2]
    return even+odd
print(jumbled_string("0123456789"))
>>>0246813579

Related

Why does the `eval` function seemingly take so long in Python?

TL;DR why is Python's eval function slow?
Hi. I was solving a coding exercise and noticed that the eval function was causing timeouts for some test cases and was wondering why this is the case as I don't particularly recall reading that the function is slow.
Here's the exercise:
You are given a list of n nonnegative integers and a target integer. Find out how many possible ways there are to add '+' or '-' in between the provided integers in order to obtain the target. For example, the input [1, 1, 1, 1, 1] and 3 would return 5 since there are five total ways to insert + or - to obtain 3: -1+1+1+1+1, +1-1+1+1+1, +1+1-1+1+1, +1+1+1-1+1, +1+1+1+1-1.
The solution I initially came up with used eval as follows:
from itertools import product
from typing import List
def solution(numbers: List[int], target: int) -> int:
    numbers = [str(x) for x in numbers]
    length = len(numbers)
    all_operations = list(map(''.join, product('+-', repeat=length)))
    answer = 0
    for operation in all_operations:
        eval_string = ''.join([''.join(x) for x in zip(operation, numbers)])
        if eval(eval_string) == target:
            answer += 1
    return answer
The second solution that passes all test cases doesn't use eval and simply performs the arithmetic step by step:
from itertools import product
from typing import List
def solution(numbers: List[int], target: int) -> int:
    numbers = [str(x) for x in numbers]
    length = len(numbers)
    all_signs = list(map(''.join, product('+-', repeat=length)))
    answer = 0
    for operation in all_signs:
        value = 0
        for op, number in zip(operation, numbers):
            if op == '+':
                value += int(number)
            elif op == '-':
                value -= int(number)
        if value == target:
            answer += 1
    return answer

Converting your values to strings and then running eval is not going to be efficient, because every call has to parse and compile the expression string before evaluating it. This will be faster, although it may be possible to improve it even further:
import itertools
INTS = [1, 1, 1, 1, 1]
CONST = 3
# pair each integer with its negation, giving one (+x, -x) choice per position
P = [(x, -x) for x in INTS]
# work out every combination of signs, keeping each integer in its position
s = itertools.product(*P)
# iterate over (map) each combination to see if its sum matches CONST
print(list(map(lambda x: sum(x) == CONST, s)).count(True))

most efficient way to iterate over a large array looking for a missing element in Python

I was trying an online test. The test asked me to write a function that, given a list of up to 100000 integers whose range is 1 to 100000, finds the first missing integer.
For example, if the list is [1,4,5,2] the output should be 3.
I iterated over the list as follows:
def find_missing(num):
    for i in range(1, 100001):
        if i not in num:
            return i
The feedback I received is that the code is not efficient in handling big lists.
I am quite new and I could not find an answer. How can I iterate more efficiently?

The first improvement would be to make yours linear by using a set for the repeated membership test:
def find_missing(nums):
    s = set(nums)
    for i in range(1, 100001):
        if i not in s:
            return i
Given how C-optimized Python sorting is, you could also do something like:
def find_missing(nums):
    s = sorted(set(nums))
    return next(i for i, n in enumerate(s, 1) if i != n)
But both of these are fairly space inefficient as they create a new collection. You can avoid that with an in-place sort:
from itertools import groupby
def find_missing(nums):
    nums.sort()  # in-place
    return next(i for i, (k, _) in enumerate(groupby(nums), 1) if i != k)

For any range of consecutive numbers, the sum is given by Gauss's formula. The snippets below assume nums is sorted and exactly one number is missing:
# sum of all numbers up to and including nums[-1] minus
# sum of all numbers up to but not including nums[-1]
expected = nums[-1] * (nums[-1] + 1) // 2 - nums[0] * (nums[0] - 1) // 2
If a number is missing, the actual sum will be
actual = sum(nums)
The difference is the missing number:
result = expected - actual
This computation is O(n), which is as efficient as you can get. expected is an O(1) computation, while actual has to actually add up the elements.
A somewhat slower but similar complexity approach would be to step along the sequence in lockstep with either a range or itertools.count:
for a, e in zip(nums, range(nums[0], len(nums) + nums[0])):
    if a != e:
        return e  # or break if not in a function
Notice the difference between a single comparison a != e, vs a linear containment check like e in nums, which has to iterate on average through half of nums to get the answer.
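Put together as a function, the sum-based idea might look like the sketch below. The name find_missing_by_sum is hypothetical, and the code uses min() and max() instead of nums[0] and nums[-1] so that the input does not need to be sorted. It assumes the values are distinct and that exactly one value between the smallest and the largest is missing.

```python
def find_missing_by_sum(nums):
    """Return the single missing integer between min(nums) and max(nums).

    Assumes the values are distinct and exactly one of them is missing.
    """
    lo, hi = min(nums), max(nums)
    # Gauss: sum of lo..hi equals sum of 1..hi minus sum of 1..(lo - 1)
    expected = hi * (hi + 1) // 2 - lo * (lo - 1) // 2
    return expected - sum(nums)


print(find_missing_by_sum([1, 4, 5, 2]))  # 3
```

If no value is missing between the extremes the function returns 0, so it cannot tell "nothing missing" apart from a missing 0.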

You can use Counter to count every occurrence in your list. The smallest number with zero occurrences will be your output. For example:
from collections import Counter
def find_missing(your_list):
    count = Counter(your_list)
    keys = count.keys()  # the distinct values that occur in the list
    main_list = range(1, 100002)  # the values from 1 to 100001, so the difference is never empty
    missing_numbers = set(main_list) - set(keys)
    your_output = min(missing_numbers)
    return your_output

Python recursive list comprehension to iterative approach

I'm trying to understand how to rewrite a recursive method iteratively. For example, I have the following backtracking method:
def bitStr(n, s):
    if n == 1:
        return s
    return [digit + bits for digit in bitStr(1, s) for bits in bitStr(n - 1, s)]
I'm practicing how to accomplish something similar iteratively, or explicitly using a double for-loop.
I started with something like this, which I understand is incorrect; however, I am unable to fix it:
def bitStr2(n, s):
    if n == 1:
        return [c for c in s]
    for bits in bitStr2(n - 1, s):
        for digit in bitStr2(1, s):
            return digit + bits
Thank You

There are two issues in your code.
First, as pointed out by @MisterMiyagi, you switched the loops. In a list comprehension, loops are read from left to right. You should write the regular loops like this:
for digit in bitStr2(1, s):
    for bits in bitStr2(n - 1, s):
        ...
Second, a list comprehension produces... a list. You have to store the elements in a list:
...
result = []
for digit in bitStr2(1, s):
    for bits in bitStr2(n - 1, s):
        result.append(digit + bits)
return result
(Conversely: never use a list comprehension if you don't want to produce a list.) And you don't have to handle the n = 1 case differently. Full code:
def bitStr2(n, s):
    if n == 1:
        return s
    result = []
    for digit in bitStr2(1, s):
        for bits in bitStr2(n - 1, s):
            result.append(digit + bits)
    return result
Note that for digit in bitStr(1, s) is equivalent to for digit in s. I don't see why you call the method bitStr in this case, since you already know the result.
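The code above still recurses, so for a fully loop-based version something like the following sketch may help. The name bit_str_iterative is hypothetical, and the code assumes n >= 1 and that s is a string of single-character digits, as in the question. Unlike bitStr, it returns a list even for n == 1.

```python
def bit_str_iterative(n, s):
    """Build all length-n strings over the characters of s without recursion."""
    result = list(s)  # all strings of length 1
    for _ in range(n - 1):
        # Prepend every digit to every string built so far, which mirrors
        # the order of the loops in the recursive list comprehension.
        result = [digit + bits for digit in s for bits in result]
    return result


print(bit_str_iterative(2, "01"))  # ['00', '01', '10', '11']
```

Each pass through the loop plays the role of one level of recursion: it extends every string from the previous pass by one leading digit.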

Joining a string in another string

I have written this code, but the output is not what I want:
def replace(s,p,n):
    return "".join("{}".format(p) if not i % n else char for i, char in enumerate(s,1))
print(replace("university","-",3))
The output that I get is un-ve-si-y
The output I need is:
uni-ver-sit-y

This is one approach, using str slicing.
Demo:
def replace(s,p,n):
    return p.join([s[i:i+n] for i in range(0, len(s), n)])
print(replace("university","-",3))
Output:
uni-ver-sit-y

If you extend the code out over multiple lines:
chars_to_join = []
for i, char in enumerate(s,1):
    if not i % n:
        chars_to_join.append("{}".format(p))
    else:
        chars_to_join.append(char)
You'll see that when the if condition is true, it replaces the character instead of adding the separator after it. So just modify the format string to include the currently iterated character as well:
"{}{}".format(char, p)

Alternatively you can do it functionally like this:
from itertools import repeat
def take(s, n):
    """take n characters from s"""
    return s[:n]
def skip(s, n):
    """skip n characters from s"""
    return s[n:]
def replace(s, p, n):
    # create intervals at which to prefix
    intervals = range(0, len(s), n)
    # create the prefix for all chunks
    prefix = map(skip, repeat(s), intervals)
    # trim prefix for n characters each
    chunks = map(take, prefix, repeat(n))
    return p.join(chunks)
And now:
replace('university', '-', 3)
Will give you:
'uni-ver-sit-y'
Note: this is sample code. If it is meant to be efficient, you should probably use lazily evaluated functions (like islice), which can take a lot less memory for bigger inputs.
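A lazily evaluated variant along the lines of that note could look like the sketch below. The name replace_lazy is made up for illustration, and the code assumes n >= 1. It reads the input through a single iterator, so no suffix copies of s are created.

```python
from itertools import islice


def replace_lazy(s, p, n):
    """Join consecutive chunks of n characters from s with the separator p."""
    it = iter(s)
    # Each call pulls the next n characters from the shared iterator;
    # iter(callable, sentinel) stops as soon as an empty chunk comes back.
    chunks = iter(lambda: "".join(islice(it, n)), "")
    return p.join(chunks)


print(replace_lazy("university", "-", 3))  # uni-ver-sit-y
```

Because it only needs an iterator, the same function also works when s is a generator of characters rather than a string.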

For this question, I don't think the list comprehension is a very good idea, because it is hard to read. Maybe we can make it clearer with an explicit loop:
def replace(s,p,n):
    new_list = []
    for i, c in enumerate(s, 1):
        new_list.append(c)
        if i % n == 0:
            new_list.append(p)
    return "".join(new_list)
print(replace("university","-",3))

Find the smallest positive number not in list

I have a list in python like this:
myList = [1,14,2,5,3,7,8,12]
How can I easily find the first unused value? (in this case '4')

I came up with several different ways:
Iterate the first number not in set
I didn't want to get the shortest code (which might be the set-difference trickery) but something that could have a good running time.
This might be one of the best proposed here, my tests show that it might be substantially faster - especially if the hole is in the beginning - than the set-difference approach:
from itertools import count, filterfalse # ifilterfalse on py2
A = [1,14,2,5,3,7,8,12]
print(next(filterfalse(set(A).__contains__, count(1))))
The array is turned into a set, whose __contains__(x) method corresponds to x in set(A). count(1) creates a counter that counts upwards from 1 without end. filterfalse then consumes numbers from the counter until it finds one that is not in the set, and that first number is what next() returns.
Timing for len(a) = 100000, randomized and the sought-after number is 8:
>>> timeit(lambda: next(filterfalse(set(a).__contains__, count(1))), number=100)
0.9200698399945395
>>> timeit(lambda: min(set(range(1, len(a) + 2)) - set(a)), number=100)
3.1420603669976117
Timing for len(a) = 100000, ordered and the first free is 100001
>>> timeit(lambda: next(filterfalse(set(a).__contains__, count(1))), number=100)
1.520096342996112
>>> timeit(lambda: min(set(range(1, len(a) + 2)) - set(a)), number=100)
1.987783643999137
(note that this is Python 3 and range is the py2 xrange)
Use heapq
The asymptotically good answer: heapq with enumerate
from heapq import heapify, heappop
from functools import partial
# A = [1,2,3] also works
A = [1,14,2,5,3,7,8,12]
end = 2 ** 61       # these are different and neither of them can be the
sentinel = 2 ** 62  # first gap (unless you have 2^64 bytes of memory).
heap = list(A)
heap.append(end)
heapify(heap)
print(next(n for n, v in enumerate(
    iter(partial(heappop, heap), sentinel), 1) if n != v))
Now, the one above could be the preferred solution if written in C, but heapq is written in Python and most probably slower than many other alternatives that mainly use C code.
Just sort and enumerate to find the first not matching
Or the simple answer with good constants for O(n lg n)
next(i for i, e in enumerate(sorted(A) + [ None ], 1) if i != e)
This might be fastest of all if the list is almost sorted because of how the Python Timsort works, but for randomized the set-difference and iterating the first not in set are faster.
The + [ None ] is necessary for the edge cases of there being no gaps (e.g. [1,2,3]).

This makes use of the property of sets
>>> l = [1,2,3,5,7,8,12,14]
>>> m = range(1,len(l))
>>> min(set(m)-set(l))
4

I would suggest using a generator expression with enumerate over the sorted list to determine the missing element:
>>> next(a for a, b in enumerate(sorted(myList), min(myList)) if a != b)
4
enumerate pairs each element with a running index, so your goal is to find the first element that differs from its index. The list has to be sorted for this to work.
Note that the expression above does not assume the elements start at a particular value. If they always start at 1, as here, you can simplify the expression further:
>>> next(a for a, b in enumerate(sorted(myList), 1) if a != b)
4

A for loop with the list will do it.
l = [1,14,2,5,3,7,8,12]
for i in range(1, max(l)):
    if i not in l: break
print(i) # result 4

Don't know how efficient, but why not use an xrange as a mask and use set minus?
>>> myList = [1,14,2,5,3,7,8,12]
>>> min(set(xrange(1, len(myList) + 1)) - set(myList))
4
You're only creating a set as big as myList, so it can't be that bad :)
This won't work for "full" lists:
>>> myList = range(1, 5)
>>> min(set(xrange(1, len(myList) + 1)) - set(myList))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: min() arg is an empty sequence
But the fix to return the next value is simple (add one more to the masked set):
>>> min(set(xrange(1, len(myList) + 2)) - set(myList))
5

import itertools as it
next(i for i in it.count(1) if i not in mylist)
I like this because it reads very closely to what you're trying to do: "start counting, keep going until you reach a number that isn't in the list, then tell me that number". However, this is quadratic since testing i not in mylist is linear.
Solutions using enumerate are linear, but rely on the list being sorted and no value being repeated. Sorting first makes it O(n log n) overall, which is still better than quadratic. Alternatively, you could put the values into a set first (duplicates are simply dropped):
myset = set(mylist)
next(i for i in it.count(1) if i not in myset)
Since set containment checks are roughly constant time, this will be linear overall.

I just solved this in a probably non pythonic way
def solution(A):
    # Const-ish to improve readability
    MIN = 1
    if not A: return MIN
    # Save re-computing MAX
    MAX = max(A)
    # Loop over all entries with minimum of 1 starting at 1
    for num in range(1, MAX):
        # going for greatest missing number return optimistically (minimum)
        # If order needs to switch, then use max as start and count backwards
        if num not in A: return num
    # In case the max is < 0 double wrap max with minimum return value
    return max(MIN, MAX+1)
I think it reads quite well

My effort, no itertools. It sets "current" to one less than the value you are expecting, and assumes the list is sorted.
values = [1,2,3,4,5,7,8]
current = values[0]-1
for i in values:
    if i != current+1:
        print(current+1)
        break
    current = i

The naive way is to traverse the list, which is an O(n) solution. However, since the list is sorted, you can use this property to perform a (modified) binary search. Basically, you are looking for the last position whose value still equals its one-based position, i.e. A[i] == i + 1 with Python's zero-based indexing.
The algorithm will be something like:
def binarysearch(A):
    start = 0
    end = len(A) - 1
    result = 0  # stays 0 if even A[0] is not 1
    while start <= end:
        mid = (start + end) // 2
        if A[mid] == mid + 1:
            result = A[mid]
            start = mid + 1
        else:  # A[mid] > mid + 1, since A[mid] cannot be less than mid + 1
            end = mid - 1
    return result + 1
This is an O(log n) solution. It assumes the list is sorted, contains distinct positive integers, and is zero-indexed as Python lists are.
EDIT: if the list is not sorted, you can use the heapq Python library to turn the list into a min-heap and then pop the elements one by one:
import heapq
def first_missing(A):
    H = list(A)  # copy, so that A itself is not reordered
    heapq.heapify(H)  # heapify works in place and returns None
    count = 1
    while H:
        if heapq.heappop(H) != count: return count
        count += 1
    return count

sort + reduce to the rescue!
from functools import reduce # python3
myList = [1,14,2,5,3,7,8,12]
res = 1 + reduce(lambda x, y: x if y-x>1 else y, sorted(myList), 0)
print(res)
Unfortunately it won't stop after a match is found and will iterate over the whole list.
Faster (but less fun) is to use a for loop:
myList = [1,14,2,5,3,7,8,12]
res = 0
for num in sorted(myList):
    if num - res > 1:
        break
    res = num
res = res + 1
print(res)

You can try this:
for i in range(1,max(arr1)+2):
    if i not in arr1:
        print(i)
        break

Easy to read, easy to understand, gets the job done:
def solution(A):
    smallest = 1
    unique = set(A)
    # a set has no guaranteed order, so walk the values in sorted order
    for value in sorted(unique):
        if value == smallest:
            smallest += 1
    return smallest

Keep incrementing a counter in a loop until you find the first positive integer that's not in the list.
def getSmallestIntNotInList(number_list):
    """Returns the smallest positive integer that is not in a given list"""
    i = 0
    while True:
        i += 1
        if i not in number_list:
            return i
print(getSmallestIntNotInList([1,14,2,5,3,7,8,12]))
# 4
I found that this had the fastest performance compared to other answers on this post for this small, eight-element list. I tested using timeit in Python 3.10.8. My performance results can be seen below:
import timeit
def findSmallestIntNotInList(number_list):
    # Infinite while-loop until first number is found
    i = 0
    while True:
        i += 1
        if i not in number_list:
            return i
t = timeit.Timer(lambda: findSmallestIntNotInList([1,14,2,5,3,7,8,12]))
print('Execution time:', t.timeit(100000), 'seconds')
# Execution time: 0.038100800011307 seconds
import timeit
def findSmallestIntNotInList(number_list):
    # Loop with a range to len(number_list)+1
    for i in range (1, len(number_list)+1):
        if i not in number_list:
            return i
t = timeit.Timer(lambda: findSmallestIntNotInList([1,14,2,5,3,7,8,12]))
print('Execution time:', t.timeit(100000), 'seconds')
# Execution time: 0.05068870005197823 seconds
import timeit
def findSmallestIntNotInList(number_list):
    # Loop with a range to max(number_list) (by silgon)
    # https://stackoverflow.com/a/49649558/3357935
    for i in range (1, max(number_list)):
        if i not in number_list:
            return i
t = timeit.Timer(lambda: findSmallestIntNotInList([1,14,2,5,3,7,8,12]))
print('Execution time:', t.timeit(100000), 'seconds')
# Execution time: 0.06317249999847263 seconds
import timeit
from itertools import count, filterfalse
def findSmallestIntNotInList(number_list):
    # iterate the first number not in set (by Antti Haapala -- Слава Україні)
    # https://stackoverflow.com/a/28178803/3357935
    return(next(filterfalse(set(number_list).__contains__, count(1))))
t = timeit.Timer(lambda: findSmallestIntNotInList([1,14,2,5,3,7,8,12]))
print('Execution time:', t.timeit(100000), 'seconds')
# Execution time: 0.06515420007053763 seconds
import timeit
def findSmallestIntNotInList(number_list):
    # Use property of sets (by Bhargav Rao)
    # https://stackoverflow.com/a/28176962/3357935
    m = range(1, len(number_list))
    return min(set(m)-set(number_list))
t = timeit.Timer(lambda: findSmallestIntNotInList([1,14,2,5,3,7,8,12]))
print('Execution time:', t.timeit(100000), 'seconds')
# Execution time: 0.08586219989228994 seconds

The easiest way would be to loop through the sorted list, check whether each index is equal to its value, and if not, return the index as the solution.
This would have complexity O(n log n) because of the sorting:
for index,value in enumerate(sorted(myList)):
    if index != value:
        print(index)
        break
Another option is to use Python sets, which are somewhat like dictionaries without values, just keys. In a dictionary you can look up a key in constant time, which makes the whole solution look like the following, having only linear complexity O(n):
mySet = set(myList)
for i in range(len(mySet)):
    if i not in mySet:
        print(i)
        break
Edit:
If the solution should also deal with lists where no number is missing (e.g. [0,1]) and output the next following number and should also correctly consider 0, then a complete solution would be:
def find_smallest_positive_number_not_in_list(myList):
    mySet = set(myList)
    for i in range(1, max(mySet)+2):
        if i not in mySet:
            return i

A solution that returns all those values is
free_values = set(range(1, max(L))) - set(L)
it does a full scan, but those loops are implemented in C and unless the list or its maximum value are huge this will be a win over more sophisticated algorithms performing the looping in Python.
Note that if this search is needed to implement "reuse" of IDs, then keeping a free list around and maintaining it up-to-date (i.e. adding numbers to it when deleting entries and picking from it when reusing entries) is often a good idea.
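The free-list idea could be sketched as follows. The class name IdAllocator and its methods are hypothetical, and the min-heap is just one possible way to keep the free list so that the smallest released ID is reused first. The sketch does not guard against releasing the same ID twice.

```python
import heapq


class IdAllocator:
    """Hand out the smallest unused positive ID, reusing released ones."""

    def __init__(self):
        self._free = []   # min-heap of IDs that were released
        self._next = 1    # smallest ID that has never been handed out

    def acquire(self):
        # Prefer a released ID; otherwise extend the range by one.
        if self._free:
            return heapq.heappop(self._free)
        new_id = self._next
        self._next += 1
        return new_id

    def release(self, id_):
        # Assumes id_ is currently in use and is released only once.
        heapq.heappush(self._free, id_)


ids = IdAllocator()
print([ids.acquire() for _ in range(3)])  # [1, 2, 3]
ids.release(2)
print(ids.acquire())  # 2
```

With this bookkeeping, picking the next ID costs O(log k) for k released IDs, and no scan over the existing entries is needed.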

The following solution loops all numbers in between 1 and the length of the input list and breaks the loop whenever a number is not found inside it. Otherwise the result is the length of the list plus one.
listOfNumbers=[1,14,2,5,3,7,8,12]
for i in range(1, len(listOfNumbers)+1):
    if not i in listOfNumbers:
        nextNumber=i
        break
else:
    nextNumber=len(listOfNumbers)+1
