Reverse string time and space complexity - python

I have written several Python functions to reverse a given string, but I couldn't figure out which one is the most efficient. Can someone point out the differences between these algorithms in terms of time and space complexity?
def reverse_1(s):
    result = ""
    for i in s:
        result = i + result
    return result

def reverse_2(s):
    return s[::-1]
There are already some solutions out there, but I couldn't find out the time and space complexity. I would like to know how much space s[::-1] will take?

Without even trying to bench it (you can do it easily), reverse_1 would be dead slow because of many things:
an explicit Python-level loop over the characters
constantly prepending a character to the string, which creates a new copy each time.
So it is slow because of the interpreted loop, has O(n*n) time complexity because of the string copies, and O(n) space complexity because it uses extra memory to create temporary strings (which are hopefully garbage collected in the loop).
On the other hand s[::-1]:
doesn't use a visible loop
returns a string without the need to convert from/to list
uses compiled code from python runtime
So you cannot beat it in terms of time & space complexity and speed.
If you want an alternative you can use:
''.join(reversed(s))
but that will be slower than s[::-1] (it has to create a list so join can build a string back). It is mainly of interest when transformations other than reversing the string are also required.
Note that unlike C or C++ languages (as far as the analogy goes for strings) it is not possible to reverse the string with O(1) space complexity because of the immutability of strings: you need twice the memory because string operations cannot be done in-place (this can be done on list of characters, but the str <=> list conversions use memory)
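The comparison above can be checked with a small script. This is only a sketch: the function names reverse_concat, reverse_slice, reverse_join and compare are made up for illustration, and the timings it prints will vary by machine and Python version.

```python
import timeit

def reverse_concat(s):
    # Prepends one character at a time; each step builds a new string.
    result = ""
    for ch in s:
        result = ch + result
    return result

def reverse_slice(s):
    # A single slicing operation handled inside the interpreter.
    return s[::-1]

def reverse_join(s):
    # reversed() yields characters; join builds one new string from them.
    return ''.join(reversed(s))

def compare(sample, number=50):
    # Time each approach on the same input and print the totals.
    for fn in (reverse_concat, reverse_slice, reverse_join):
        seconds = timeit.timeit(lambda: fn(sample), number=number)
        print(f"{fn.__name__}: {seconds:.4f}s")

if __name__ == "__main__":
    compare("abcdefghij" * 200)
```

All three return the same string; only the amount of work done to get there differs.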

Related

How can I mark an index as 'used' when iterating over a list?

I will iterate over a list of integers, nums, multiple times, and each time, when an integer has been 'used' for something (doesn't matter what), I want to mark the index as used. So that in future iterations, I do not use this integer again.
Two questions:
1. My idea is to simply create a separate list marker = [1]*len(nums), and each time I use a number in nums, I will subtract 1 from the corresponding index in marker as a way to keep track of the numbers in nums I have used.
My first question is: is there a well-known, efficient way to do this? I believe this would make the SPACE COMPLEXITY O(n).
2. My other idea is to replace each entry in nums, like this: nums = [1,2,3,4] -> nums = [(1,1),(2,1),(3,1),(4,1)]. Each time I use an integer in nums, I would subtract 1 from the second element of its pair as a way of marking that it has been used. My question is: am I right in understanding that this would optimise the SPACE COMPLEXITY relative to solution 1. above, and that the SPACE COMPLEXITY here would be O(1)?
For reference, I am solving the following question: https://leetcode.com/contest/weekly-contest-256/problems/minimum-number-of-work-sessions-to-finish-the-tasks/
Where each entry in tasks needs to be used once.
I don't think there is a way to do it in O(1) space. Although, I believe that using a boolean value instead of an integer value or using the concept of sets would be a better solution.
No, the space complexity is still O(n). Think about it like this. Let us assume n is the size of the list. In the first method that you mentioned, we are storing n 'stuff' separately. So, the space complexity is O(n). In the second method also, we are storing n 'stuff' separately. It's just that those n 'stuff' are being stored as part of the same array. So, the space complexity still remains the same which is O(n).
Firstly, in both cases the space complexity comes out to be O(n). This is because nums itself uses O(n) space whether or not you use a separate list to store usage of elements. So the space complexity cannot come down to O(1) either way.
However, here is a suggestion.
If you don't want to use the used element again, then why not just remove it from the list?
Or, in case you don't want to disrupt the indexing, just overwrite the number with a sentinel such as -1 (assuming -1 can never be a valid entry).
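The boolean-marker idea mentioned in the answers could look roughly like the sketch below. The helper name take_first_unused and its predicate parameter are hypothetical, chosen only for illustration. The marker list still costs O(n) extra space, one flag per index.

```python
def take_first_unused(nums, used, predicate):
    """Return the index of the first unused value accepted by predicate.

    Marks that index as used. Returns None if nothing qualifies.
    """
    for i, value in enumerate(nums):
        if not used[i] and predicate(value):
            used[i] = True  # mark the index so later passes skip it
            return i
    return None

if __name__ == "__main__":
    nums = [3, 1, 3, 1, 1]
    used = [False] * len(nums)  # one boolean per entry in nums
    print(take_first_unused(nums, used, lambda v: v <= 2))
    print(used)
```

Unlike removing elements or overwriting them with a sentinel, this leaves nums itself untouched, so it can be reset for another pass by rebuilding the marker list.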

How to confirm whether my answer uses O(1) extra space and modifies the array in place? It is the Reverse a String Leetcode question

I was doing a problem on Leetcode - here is the problem:
Write a function that reverses a string. The input string is given as an array of characters char[].
Do not allocate extra space for another array, you must do this by modifying the input array in-place with O(1) extra memory.
You may assume all the characters consist of printable ascii characters.
My solution is
def reverseString(s):
    """
    Do not return anything, modify s in-place instead.
    """
    temp = ""
    for index,value in enumerate(s):
        temp+=value
    s.clear()
    for i in "".join(reversed(temp)):
        s.append(i)

reverseString(["h","e","l","l","o"])
My solution works and is accepted by Leetcode. It also passes all the test cases. However, I am still new to the concept of space and time and was not sure if my solution follows the requirements of O(1) and modifies the array in place. If someone could confirm if it does or not and also teach me how to confirm this, it would be helpful. Thank you!
O(1) means the extra memory your variables use is constant: it does not depend on the size of the input. By contrast, in the other cases the extra memory you use grows with the size of the input data. E.g., if the size of the character array you are asked to reverse is x, using y=ax+b memory for your variables means O(n), and y=ax^2+bx+c means O(n^2). Do you get it?
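Applying that rule to the solution in the question: temp grows to the same length as s, so the extra memory is proportional to the input, which is O(n) rather than O(1). For comparison, here is a minimal sketch of a two-pointer swap that uses only two index variables regardless of input size. The name reverse_in_place is chosen here for illustration and is not the signature Leetcode expects.

```python
def reverse_in_place(s):
    """Reverse the list of characters s without allocating another list."""
    left, right = 0, len(s) - 1
    while left < right:
        # Swap the outermost pair, then move both pointers inward.
        s[left], s[right] = s[right], s[left]
        left += 1
        right -= 1

if __name__ == "__main__":
    chars = ["h", "e", "l", "l", "o"]
    reverse_in_place(chars)
    print(chars)
```

A quick way to check a solution yourself is to list every variable it creates and ask whether its size changes when the input gets longer.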

How is the string.join(str_list, '') implemented under the hood in Python?

I know that concatenating two strings using the += operator makes a new copy of the old string and then concatenates the new string to that, resulting in quadratic time complexity.
This answer gives a nice time comparison between the += operation and string.join(str_list, ''). It looks like the join() method runs in linear time (correct me if I am wrong). Out of curiosity, I wanted to know how the string.join(str_list, '') method is implemented in Python since strings are immutable objects?
It's implemented in C, so Python-level immutability is less important. You can find the appropriate source here: unicodeobject.c

Python: understanding iterators and `join()` better

The join() function accepts an iterable as parameter. However, I was wondering why having:
text = 'asdfqwer'
This:
''.join([c for c in text])
Is significantly faster than:
''.join(c for c in text)
The same occurs with long strings (i.e. text * 10000000).
Watching the memory footprint of both executions with long strings, I think they both create one and only one list of chars in memory, and then join them into a string. So I am guessing perhaps the difference is only between how join() creates this list out of the generator and how the Python interpreter does the same thing when it sees [c for c in text]. But, again, I am just guessing, so I would like somebody to confirm/deny my guesses.
The join method reads its input twice; once to determine how much memory to allocate for the resulting string object, then again to perform the actual join. Passing a list is faster than passing a generator object that it needs to make a copy of so that it can iterate over it twice.
A list comprehension is not simply a generator object wrapped in a list, so constructing the list externally is faster than having join create it from a generator object. Generator objects are optimized for memory efficiency, not speed.
Of course, a string is already an iterable object, so you could just write ''.join(text). (Again, though, this is not as fast as creating the list explicitly from the string.)
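The two-pass idea described above can be illustrated in pure Python. This is only a sketch, not CPython's actual C code: it works on bytes instead of str so that a mutable bytearray can stand in for the preallocated result buffer, and the name join_two_pass is made up for this example.

```python
def join_two_pass(parts):
    """Concatenate an iterable of bytes objects using two passes."""
    # Materialise the input so it can be walked twice; a generator
    # would otherwise be exhausted after the first pass.
    parts = list(parts)

    # Pass 1: work out the total size of the result.
    total = sum(len(p) for p in parts)

    # Allocate the whole result once.
    buf = bytearray(total)

    # Pass 2: copy each piece into its final position.
    pos = 0
    for p in parts:
        buf[pos:pos + len(p)] = p
        pos += len(p)
    return bytes(buf)

if __name__ == "__main__":
    print(join_two_pass(c.encode() for c in "asdfqwer"))
```

Each input byte is copied a constant number of times, which is why this style of joining is linear in the total length, unlike repeated concatenation that re-copies the growing prefix.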

Longest Increasing Subsequence code in O(N)?

Someone asked me a question
Find the longest alphabetically increasing or equal string
composed of those letters. Note that you are allowed to drop
unused characters.
So ghaaawxyzijbbbklccc returns aaabbbccc.
Is an O(n) solution possible?
and I implemented it in code [in Python]:
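s = 'ghaaawxyzijbbbklccc'
# lst[k] holds the best sequence found so far that ends in letter k
lst = [[] for i in range(26)]
for ch in s:
    # find the longest sequence ending in a letter <= ch
    ml = 0
    for i in range(0, ord(ch) + 1 - ord('a')):
        if len(lst[i]) > len(lst[ml]):
            ml = i
    cpy = ''.join(lst[ml])
    lst[ord(ch) - ord('a')] = cpy + ch
# pick the longest of the 26 candidates
ml = 0
for i in range(26):
    if len(lst[i]) > len(lst[ml]):
        ml = i
print(lst[ml])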
and the answer is 'aaabbbccc'.
I have tried this on some more examples and they all work!
As far as I can tell, the complexity of this code is O(N).
Let's take an example:
suppose I have the string 'zzzz'.
The main loop will run 4 times and the inner loop will run 26 times for each iteration, so in the worst case the code will run in
O(26*N + 26)
where the final + 26 is the last loop over the 26 lists.
So is O(N) acceptable?
Now my questions are:
1. Does it work in O(N)? (my code at ideone)
2. If it works in O(N), then why use the O(N^2) DP? (code of DP)
3. Is it better than this code? (Friend's code)
4. What are the limitations of this code?
1. It's O(N).
2. 'Why to use DP of O(N^2)': You don't need to for this problem. Note, though, that you take advantage of the fact that your sequence tokens (letters) are finite, so you can set up a list to hold all the possible starting values (26) and you need only look for the longest member of that list, an O(1) operation. A more generalised solution for sequences with an arbitrary number of ordered tokens can be done in O(N log N).
3. Your friend's code is basically the same, just mapping the letters to numbers, and their list for the 26 starting places holds 26 numbers for letter counts. They don't need to do either of those. Conceptually, though, it's the same thing: holding a list of lists.
"Better" is a matter of opinion. Although it has the same asymptotic complexity, the constant terms may be different, so one may execute faster than the other. Also, in terms of storage, one may use very slightly more memory than the other. With such low n, judging which is more readable may be more important than the absolute performance of either algorithm. I'm not going to make a judgement.
You might notice a slight difference where the "winning" sequence is a tie. For instance, on the test string edxeducation that you have there, your implementation returns 'ddin' whereas your friend's returns 'ddio'. Both seem valid to me, without a rule to break such ties.
4. The major limitation of this code is that it can only cope with sequences composed entirely of letters in one particular case. You could extend it to cope with upper and lower case letters, either treating them the same, or using an ordering where all lower case letters were "less than" all upper case letters or something similar. This is just extending the finite set of tokens that it can cope with.
To generalise this limitation: the code will only cope with finite sets of sequence tokens, as noted in 2. above. Also, there is no error handling, so if you put in a string with, say, digits or punctuation, it will fail.
This is a variation of the Longest Increasing Subsequence problem.
The difference is that your elements are bounded, since they can only run from 'a' to 'z'. Your algorithm is indeed O(N). For the general problem O(N log N) is possible, for example using the algorithm from the link above; the bound on the number of possible elements is what turns this into O(N).
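One caveat about the O(N) count: cpy + ch in the posted code copies the whole candidate string on every step, so if copied characters are counted, the work can grow faster than linearly on long inputs such as a run of one repeated letter. A possible way around this, sketched below under the assumption that the input contains only lowercase a-z, is to keep lengths and back-pointers instead of the strings themselves. The function name longest_non_decreasing is made up here; this is neither the asker's nor the friend's code.

```python
def longest_non_decreasing(s):
    """Longest non-decreasing subsequence of a lowercase a-z string."""
    best_len = [0] * 26    # best_len[k]: longest subsequence ending in letter k
    best_end = [-1] * 26   # best_end[k]: index in s where that subsequence ends
    prev = [-1] * len(s)   # prev[i]: index of the character chosen before s[i]

    for i, ch in enumerate(s):
        k = ord(ch) - ord('a')
        # Best subsequence ending in any letter <= ch (first one wins ties).
        m = max(range(k + 1), key=lambda j: best_len[j])
        prev[i] = best_end[m]
        best_len[k] = best_len[m] + 1
        best_end[k] = i

    # Walk the back-pointers from the overall best ending position.
    m = max(range(26), key=lambda j: best_len[j])
    i = best_end[m]
    out = []
    while i != -1:
        out.append(s[i])
        i = prev[i]
    return ''.join(reversed(out))

if __name__ == "__main__":
    print(longest_non_decreasing('ghaaawxyzijbbbklccc'))
```

Each character does at most 26 comparisons plus constant bookkeeping, and the result is built once at the end, at the cost of O(N) extra space for the back-pointers.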
