Python: comparing strings to their equivalent integers efficiently

What's the most efficient way to compare two Python values, both of which are probably strings but might be integers? So far I'm using str(x) == str(y), but that feels inefficient and (more importantly) ugly:
>>> a = 1.0
>>> b = 1
>>> c = '1'
>>> a == b
True
>>> b == c
False # here I wanted this to be true
>>> str(b)==str(c)
True # true, as desired
My actual objects are dictionary values retrieved with get(), and most of them are strings.

Test it out. I like using %timeit in IPython:
In [1]: %timeit str("1") == str(1)
1000000 loops, best of 3: 702 ns per loop
In [2]: %timeit "1" == str(1)
1000000 loops, best of 3: 412 ns per loop
In [3]: %timeit int("1") == 1
1000000 loops, best of 3: 906 ns per loop
Apart from that, though, if you truly don't know what the input type is, there isn't much you can do about it, unless you want to make assumptions about the input data. For example, if you assume that most of the inputs are equal (same type, same value), you could do something like:
if a == b or str(a) == str(b):
    ... they are equal ...
This would be faster if the values are normally the same type and normally equal, but slower if they aren't normally the same type or aren't normally equal.
However, are you sure you can't cast everything to a str/int when they enter your code?
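If you can, normalizing at the boundary makes every later comparison a plain str == str. A minimal sketch, assuming the values are only ever str or int (the dict and its keys here are made up for illustration):
import json  # not needed; just plain str() coercion below

# Hypothetical example: coerce everything to str once, on ingestion.
raw = {'a': 1, 'b': '1', 'c': 2}
normalized = {k: str(v) for k, v in raw.items()}

normalized.get('a') == normalized.get('b')  # True: '1' == '1'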

wim@wim-acer:~/sandpit$ python -mtimeit "str('69') == str(69)"
1000000 loops, best of 3: 0.28 usec per loop
wim@wim-acer:~/sandpit$ python -mtimeit "int('69') == int(69)"
1000000 loops, best of 3: 0.5 usec per loop
wim@wim-acer:~/sandpit$ python -mtimeit "str('32767') == str(32767)"
1000000 loops, best of 3: 0.317 usec per loop
wim@wim-acer:~/sandpit$ python -mtimeit "int('32767') == int(32767)"
1000000 loops, best of 3: 0.492 usec per loop
Conclusion: Probably how you're already doing it is plenty fast enough. Optimise the slowest parts of your program, after everything is working.

Related

what is the most efficient way to find the position of the first np.nan value?

Consider the array a:
a = np.array([3, 3, np.nan, 3, 3, np.nan])
I could do
np.isnan(a).argmax()
But this requires finding all np.nan just to find the first.
Is there a more efficient way?
I've been trying to figure out if I can pass a parameter to np.argpartition such that np.nan gets sorted first as opposed to last.
EDIT regarding [dup].
There are several reasons this question is different.
That question and its answers addressed equality of values. This one is about isnan.
Those answers all suffer from the same issue my answer faces. Note that I provided a perfectly valid answer but highlighted its inefficiency. I'm looking to fix the inefficiency.
EDIT regarding second [dup].
Still addressing equality, and that question and its answers are old and very possibly outdated.
It might also be worth looking into numba.jit; without it, the vectorized version will likely beat a straightforward pure-Python search in most scenarios, but after compiling the code, the ordinary search takes the lead, at least in my testing:
In [63]: a = np.array([np.nan if i % 10000 == 9999 else 3 for i in range(100000)])
In [70]: %paste
import numba
import numpy as np

def naive(a):
    for i in range(len(a)):
        if np.isnan(a[i]):
            return i

def short(a):
    return np.isnan(a).argmax()

@numba.jit
def naive_jit(a):
    for i in range(len(a)):
        if np.isnan(a[i]):
            return i

@numba.jit
def short_jit(a):
    return np.isnan(a).argmax()
## -- End pasted text --
In [71]: %timeit naive(a)
100 loops, best of 3: 7.22 ms per loop
In [72]: %timeit short(a)
The slowest run took 4.59 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 37.7 µs per loop
In [73]: %timeit naive_jit(a)
The slowest run took 6821.16 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 6.79 µs per loop
In [74]: %timeit short_jit(a)
The slowest run took 395.51 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 144 µs per loop
Edit: As pointed out by @hpaulj in their answer, numpy actually ships with an optimized short-circuited search whose performance is comparable with the JITted search above:
In [26]: %paste
def plain(a):
    return a.argmax()

@numba.jit
def plain_jit(a):
    return a.argmax()
## -- End pasted text --
In [35]: %timeit naive(a)
100 loops, best of 3: 7.13 ms per loop
In [36]: %timeit plain(a)
The slowest run took 4.37 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 7.04 µs per loop
In [37]: %timeit naive_jit(a)
100000 loops, best of 3: 6.91 µs per loop
In [38]: %timeit plain_jit(a)
10000 loops, best of 3: 125 µs per loop
I'll nominate
a.argmax()
With @fuglede's test array:
In [1]: a = np.array([np.nan if i % 10000 == 9999 else 3 for i in range(100000)])
In [2]: np.isnan(a).argmax()
Out[2]: 9999
In [3]: np.argmax(a)
Out[3]: 9999
In [4]: a.argmax()
Out[4]: 9999
In [5]: timeit a.argmax()
The slowest run took 29.94 ....
10000 loops, best of 3: 20.3 µs per loop
In [6]: timeit np.isnan(a).argmax()
The slowest run took 7.82 ...
1000 loops, best of 3: 462 µs per loop
I don't have numba installed, so I can't compare that. But my speedup relative to short is greater than @fuglede's 6x.
I'm testing in Py3, which accepts comparisons involving np.nan, while Py2 raises a runtime warning. But the code search suggests this isn't dependent on that comparison.
/numpy/core/src/multiarray/calculation.c PyArray_ArgMax plays with axes (moving the one of interest to the end), and delegates the action to arg_func = PyArray_DESCR(ap)->f->argmax, a function that depends on the dtype.
In numpy/core/src/multiarray/arraytypes.c.src it looks like BOOL_argmax short circuits, returning as soon as it encounters a True.
for (; i < n; i++) {
    if (ip[i]) {
        *max_ind = i;
        return 0;
    }
}
And @fname@_argmax also short circuits on a maximal nan. np.nan is 'maximal' in argmin as well.
#if @isfloat@
    if (@isnan@(mp)) {
        /* nan encountered; it's maximal */
        return 0;
    }
#endif
Comments from experienced C coders are welcome, but it appears to me that, at least for np.nan, a plain argmax will be as fast as we can get.
Playing with the 9999 in generating a shows that the a.argmax time depends on that value, consistent with short circuiting.
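A rough way to see that scaling yourself (a sketch; the nan positions are arbitrary and the timings are illustrative):
import timeit
import numpy as np

# The further out the first nan sits, the longer a.argmax() takes,
# consistent with the short circuit described above.
for pos in (10, 10000, 99999):
    a = np.full(100000, 3.0)
    a[pos] = np.nan
    print(pos, timeit.timeit(a.argmax, number=1000))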
Here is a Pythonic approach using itertools.takewhile():
from itertools import takewhile
sum(1 for _ in takewhile(np.isfinite, a))
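The count works as an index because takewhile stops yielding at the first non-finite value; on the sample array from the question:
import numpy as np
from itertools import takewhile

a = np.array([3, 3, np.nan, 3, 3, np.nan])
# Counts the leading finite values, i.e. the index of the first nan.
# (Caveat: if there is no nan at all, this returns len(a) rather than raising.)
sum(1 for _ in takewhile(np.isfinite, a))  # -> 2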
Benchmark against the generator-expression-within-next approach (see note 1 below):
In [118]: a = np.repeat(a, 10000)
In [120]: %timeit next(i for i, j in enumerate(a) if np.isnan(j))
100 loops, best of 3: 12.4 ms per loop
In [121]: %timeit sum(1 for _ in takewhile(np.isfinite, a))
100 loops, best of 3: 11.5 ms per loop
But still (by far) slower than the numpy approach:
In [119]: %timeit np.isnan(a).argmax()
100000 loops, best of 3: 16.8 µs per loop
1. The problem with this approach is its use of the enumerate function, which first builds an enumerate object (an iterator-like object) from the numpy array; calling the generator function and the iterator's next method takes time.
When looking for the first match in various scenarios, we could iterate through, look for the first match, and bail out on that first match rather than processing the entire array. So, we would have an approach using Python's next function, like so -
next((i for i, val in enumerate(a) if np.isnan(val)))
Sample runs -
In [192]: a = np.array([3, 3, np.nan, 3, 3, np.nan])
In [193]: next((i for i, val in enumerate(a) if np.isnan(val)))
Out[193]: 2
In [194]: a[2] = 10
In [195]: next((i for i, val in enumerate(a) if np.isnan(val)))
Out[195]: 5
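One caveat worth noting: if the array contains no nan at all, next() raises StopIteration. Passing a default (the -1 sentinel here is an arbitrary choice) avoids that:
import numpy as np

a = np.array([3.0, 3.0, 3.0])   # no nan anywhere
next((i for i, val in enumerate(a) if np.isnan(val)), -1)  # -> -1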

python: deque vs list performance comparison

In the Python docs I can see that deque is a special collection highly optimized for popping/adding items from the left or right side. E.g. the documentation says:
Deques are a generalization of stacks and queues (the name is
pronounced “deck” and is short for “double-ended queue”). Deques
support thread-safe, memory efficient appends and pops from either
side of the deque with approximately the same O(1) performance in
either direction.
Though list objects support similar operations, they are optimized for
fast fixed-length operations and incur O(n) memory movement costs for
pop(0) and insert(0, v) operations which change both the size and
position of the underlying data representation.
I decided to make some comparisons using IPython. Could anyone explain to me what I did wrong here:
In [31]: %timeit range(1, 10000).pop(0)
10000 loops, best of 3: 114 us per loop
In [32]: %timeit deque(xrange(1, 10000)).pop()
10000 loops, best of 3: 181 us per loop
In [33]: %timeit deque(range(1, 10000)).pop()
1000 loops, best of 3: 243 us per loop
Could anyone explain to me what I did wrong here?
Yes, your timing is dominated by the time to create the list or deque. The time to do the pop is insignificant in comparison.
Instead you should isolate the thing you're trying to test (the pop speed) from the setup time:
In [1]: from collections import deque
In [2]: s = list(range(1000))
In [3]: d = deque(s)
In [4]: s_append, s_pop = s.append, s.pop
In [5]: d_append, d_pop = d.append, d.pop
In [6]: %timeit s_pop(); s_append(None)
10000000 loops, best of 3: 115 ns per loop
In [7]: %timeit d_pop(); d_append(None)
10000000 loops, best of 3: 70.5 ns per loop
That said, the real differences between deques and lists in terms of performance are:
Deques have O(1) speed for appendleft() and popleft() while lists have O(n) performance for insert(0, value) and pop(0).
List append performance is hit and miss because it uses realloc() under the hood. As a result, it tends to have over-optimistic timings in simple code (because the realloc doesn't have to move data) and really slow timings in real code (because fragmentation forces realloc to move all the data). In contrast, deque append performance is consistent because it never reallocs and never moves data.
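You can glimpse that over-allocation indirectly: sys.getsizeof only jumps occasionally as a list grows, and each jump is a realloc that reserved room for future appends (a rough illustration; the exact sizes vary by Python version and platform):
import sys

l = []
last = sys.getsizeof(l)
for i in range(32):
    l.append(None)
    size = sys.getsizeof(l)
    if size != last:
        # Each size jump marks a realloc that over-allocated.
        print(i, size)
        last = size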
For what it is worth:
Python 3
deque.pop vs list.pop
> python3 -mtimeit -s 'import collections' -s 'items = range(10000000); base = [*items]' -s 'c = collections.deque(base)' 'c.pop()'
5000000 loops, best of 5: 46.5 nsec per loop
> python3 -mtimeit -s 'import collections' -s 'items = range(10000000); base = [*items]' 'base.pop()'
5000000 loops, best of 5: 55.1 nsec per loop
deque.appendleft vs list.insert
> python3 -mtimeit -s 'import collections' -s 'c = collections.deque()' 'c.appendleft(1)'
5000000 loops, best of 5: 52.1 nsec per loop
> python3 -mtimeit -s 'c = []' 'c.insert(0, 1)'
50000 loops, best of 5: 12.1 usec per loop
Python 2
> python -mtimeit -s 'import collections' -s 'c = collections.deque(xrange(1, 100000000))' 'c.pop()'
10000000 loops, best of 3: 0.11 usec per loop
> python -mtimeit -s 'c = range(1, 100000000)' 'c.pop()'
10000000 loops, best of 3: 0.174 usec per loop
> python -mtimeit -s 'import collections' -s 'c = collections.deque()' 'c.appendleft(1)'
10000000 loops, best of 3: 0.116 usec per loop
> python -mtimeit -s 'c = []' 'c.insert(0, 1)'
100000 loops, best of 3: 36.4 usec per loop
As you can see, where it really shines is in appendleft vs insert.
I would recommend referring to
https://wiki.python.org/moin/TimeComplexity
Python lists and deques have similar complexities for most operations (push, pop, etc.).
I found my way to this question and thought I'd offer up an example with a little context.
A classic use-case for a deque is rotating/shifting elements in a collection, because (as others have mentioned) you get very good (O(1)) complexity for push/pop operations on both ends: these operations just move references around, whereas a list has to physically move objects around in memory.
So here are 2 very similar-looking implementations of a rotate-left function:
def rotate_with_list(items, n):
    l = list(items)
    for _ in range(n):
        l.append(l.pop(0))
    return l

from collections import deque

def rotate_with_deque(items, n):
    d = deque(items)
    for _ in range(n):
        d.append(d.popleft())
    return d
Note: This is such a common use of a deque that the deque has a built-in rotate method, but I'm doing it manually here for the sake of visual comparison.
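For reference, the built-in equivalent looks like this (a negative argument rotates left, matching the manual popleft/append loop above):
from collections import deque

d = deque(range(10))
d.rotate(-3)   # rotate left by 3, same effect as the manual loop
print(d)       # deque([3, 4, 5, 6, 7, 8, 9, 0, 1, 2])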
Now let's %timeit.
In [1]: def rotate_with_list(items, n):
   ...:     l = list(items)
   ...:     for _ in range(n):
   ...:         l.append(l.pop(0))
   ...:     return l
   ...:
   ...: from collections import deque
   ...: def rotate_with_deque(items, n):
   ...:     d = deque(items)
   ...:     for _ in range(n):
   ...:         d.append(d.popleft())
   ...:     return d
   ...:
In [2]: items = range(100000)
In [3]: %timeit rotate_with_list(items, 800)
100 loops, best of 3: 17.8 ms per loop
In [4]: %timeit rotate_with_deque(items, 800)
The slowest run took 5.89 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 527 µs per loop
In [5]: %timeit rotate_with_list(items, 8000)
10 loops, best of 3: 174 ms per loop
In [6]: %timeit rotate_with_deque(items, 8000)
The slowest run took 8.99 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 1.1 ms per loop
In [7]: more_items = range(10000000)
In [8]: %timeit rotate_with_list(more_items, 800)
1 loop, best of 3: 4.59 s per loop
In [9]: %timeit rotate_with_deque(more_items, 800)
10 loops, best of 3: 109 ms per loop
Pretty interesting how both data structures expose an eerily similar interface but have drastically different performance :)
Out of curiosity, I tried inserting at the beginning of a list vs. appendleft() on a deque. Clearly, the deque is the winner; see the sketch below.
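A minimal timeit sketch of that comparison (numbers will vary by machine; list.insert(0, ...) shifts every existing element, while deque.appendleft is O(1)):
import timeit

print(timeit.timeit('c.insert(0, 1)', setup='c = []', number=100000))
print(timeit.timeit('c.appendleft(1)',
                    setup='from collections import deque; c = deque()',
                    number=100000))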

python convert a binary string to a structure suitable for bitwise operators

I have a list of unicode strings representing binary numbers, e.g. "010110".
I wish to perform bitwise operations, so how do I convert these to a structure I can perform bitwise operations on (preferably an unsigned int)?
Use int() with the "base" option.
int("010110", 2)
You can convert the strings to int and then use the regular shift operators on them:
>>> x = int("010110", 2)
>>> x >> 3
2
>>> x << 3
176
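Once converted, all the usual bitwise operators apply, and format() gets you back to a zero-padded binary string (a small sketch):
>>> x = int("010110", 2)   # 22
>>> y = int("001100", 2)   # 12
>>> x & y
4
>>> x | y
30
>>> x ^ y
26
>>> format(x & y, '06b')   # back to a 6-character binary string
'000100'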
Using int() is the most obvious and useful way, but you didn't say whether you need these as integers or not. Just in case not:
>>> x = '1010100100'
>>> intx = int(x, 2)
>>> hex(intx)
'0x2a4'
>>> hex(intx >> 5)
'0x15'
>>> bin(intx >> 5)
'0b10101'
>>> x[:-5]
'10101'
>>> hex(intx << 3)
'0x1520'
>>> bin(intx << 3)
'0b1010100100000'
>>> x + '0'*3
'1010100100000'
The integer shifts are faster than the equivalent string slices, but not by as much as you'd think, and once you count the conversions the end result isn't necessarily faster at all. Even though the actual shift is probably a single cycle on most modern architectures while a slice is many more instructions, there is a lot of interpreter overhead (argument lookup etc.) that shrinks the difference.
# Shifts are about 40% faster with integers vs. using equivalent string methods
In [331]: %timeit intx>>5
10000000 loops, best of 3: 48.3 ns per loop
In [332]: %timeit x[:-5]
10000000 loops, best of 3: 69.9 ns per loop
In [333]: %timeit x+'0'*3
10000000 loops, best of 3: 70.5 ns per loop
In [334]: %timeit intx << 3
10000000 loops, best of 3: 51.7 ns per loop
# But the conversion back to string adds considerable time,
# dependent on the length of the string
In [335]: %timeit bin(intx>>5)
10000000 loops, best of 3: 157 ns per loop
In [338]: %timeit bin(intx<<3)
1000000 loops, best of 3: 242 ns per loop
# The whole process, including string -> int -> shift -> string,
# is about 8x slower than just using the string directly.
In [339]: %timeit int(x,2)>>5
1000000 loops, best of 3: 455 ns per loop
In [341]: %timeit int(x,2)<<3
1000000 loops, best of 3: 378 ns per loop
int(x,2) is probably still your best bet, but here are some other ideas for optimization if you have use for them.

String multiplication versus for loop

I was solving a Python question on CodingBat.com. I wrote the following code for a simple problem of repeating a string n times:
def string_times(str, n):
    return n * str
The official solution is:
def string_times(str, n):
    result = ""
    for i in range(n):
        result = result + str
    return result

print string_times('hello', 3)
The output is the same for both functions. I am curious how string multiplication (the first function) performs against the for loop (the second function). Which one is faster, and which is mostly used?
Also, please suggest a way for me to answer this question myself (using time.clock() or something like that).
We can use the timeit module to test this:
python -m timeit "100*'string'"
1000000 loops, best of 3: 0.222 usec per loop
python -m timeit "''.join(['string' for _ in range(100)])"
100000 loops, best of 3: 6.9 usec per loop
python -m timeit "result = ''" "for i in range(100):" " result = result + 'string'"
100000 loops, best of 3: 13.1 usec per loop
You can see that multiplying is the far faster option. Note that while the string concatenation version isn't that bad in CPython, that may not be true of other implementations of Python. You should opt for string multiplication or str.join() for this reason - not only for speed, but also for readability and conciseness.
I've timed the following three functions:
def string_times_1(s, n):
    return s * n

def string_times_2(s, n):
    result = ""
    for i in range(n):
        result = result + s
    return result

def string_times_3(s, n):
    return "".join(s for _ in range(n))
The results are as follows:
In [4]: %timeit string_times_1('hello', 10)
1000000 loops, best of 3: 262 ns per loop
In [5]: %timeit string_times_2('hello', 10)
1000000 loops, best of 3: 1.63 us per loop
In [6]: %timeit string_times_3('hello', 10)
100000 loops, best of 3: 3.87 us per loop
As you can see, s * n is not only the clearest and the most concise, it is also the fastest.
You can use the timeit module from either the command line or in code to see how fast some bit of Python code is:
$ python -m timeit "\"something\" * 100"
1000000 loops, best of 3: 0.608 usec per loop
Do something similar for your other function and compare.
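If you'd rather do this from a script than from the shell, the same module works in code; a sketch (the statement strings and loop count are arbitrary):
import timeit

mul = timeit.timeit("'hello' * 3", number=100000)
loop = timeit.timeit(
    "result = ''\nfor i in range(3):\n    result = result + 'hello'",
    number=100000,
)
print(mul, loop)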

One liner to determine if dictionary values are all empty lists or not

I have a dict as follows:
someDict = {'a':[], 'b':[]}
I want to determine if this dictionary has any values which are not empty lists. If so, I want to return True. If not, I want to return False. Any way to make this a one liner?
Per my testing, the following one-liner (my original answer) has the best time performance in all scenarios. See the edits below for testing information. I do acknowledge that solutions using generator expressions will be much more memory efficient and should be preferred for large dicts.
EDIT: This is an aging answer and the results of my testing may not be valid for the latest version of python. Since generator expressions are the more "pythonic" way, I'd imagine their performance is improving. Please do your own testing if you're running this in a 'hot' codepath.
bool([a for a in my_dict.values() if a != []])
Edit:
Decided to have some fun. A comparison of answers, not in any particular order:
(As used below, timeit picks the loop count as the order of magnitude that keeps a run under 0.2 seconds.)
bool([a for a in my_dict.values() if a != []]) :
python -mtimeit -s"my_dict={'a':[],'b':[]}" "bool([a for a in my_dict.values() if a != []])"
1000000 loops, best of 3: 0.875 usec per loop
any([my_dict[i] != [] for i in my_dict]) :
python -mtimeit -s"my_dict={'a':[],'b':[]}" "any([my_dict[i] != [] for i in my_dict])"
1000000 loops, best of 3: 0.821 usec per loop
any(x != [] for x in my_dict.itervalues()):
python -mtimeit -s"my_dict={'a':[],'b':[]}" "any(x != [] for x in my_dict.itervalues())"
1000000 loops, best of 3: 1.03 usec per loop
all(map(lambda x: x == [], my_dict.values())):
python -mtimeit -s"my_dict={'a':[],'b':[]}" "all(map(lambda x: x == [], my_dict.values()))"
1000000 loops, best of 3: 1.47 usec per loop
filter(lambda x: x != [], my_dict.values()):
python -mtimeit -s"my_dict={'a':[],'b':[]}" "filter(lambda x: x != [], my_dict.values())"
1000000 loops, best of 3: 1.19 usec per loop
Edit again - more fun:
any() is best-case O(1): it returns as soon as bool(list[0]) is True. any()'s worst case is a long list of values for which bool(list[i]) returns False, since it has to check every one of them.
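The short-circuit is easy to see with a generator that announces each check (note that any() over a list comprehension, as above, still builds the whole list first; only a generator lets it stop early):
def checked(values):
    for v in values:
        print('checking', v)   # shows how far any() actually gets
        yield v != []

any(checked([[], [1], [], []]))
# checking []
# checking [1]
# -> True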
Check out what happens when the dict gets big:
bool([a for a in my_dict.values() if a != []]) :
#n=1000
python -mtimeit -s"my_dict=dict(zip(range(1000),[[]]*1000))" "bool([a for a in my_dict.values() if a != []])"
10000 loops, best of 3: 126 usec per loop
#n=100000
python -mtimeit -s"my_dict=dict(zip(range(100000),[[]]*100000))" "bool([a for a in my_dict.values() if a != []])"
100 loops, best of 3: 14.2 msec per loop
any([my_dict[i] != [] for i in my_dict]):
#n=1000
python -mtimeit -s"my_dict=dict(zip(range(1000),[[]]*1000))" "any([my_dict[i] != [] for i in my_dict])"
10000 loops, best of 3: 198 usec per loop
#n=100000
python -mtimeit -s"my_dict=dict(zip(range(100000),[[]]*100000))" "any([my_dict[i] != [] for i in my_dict])"
10 loops, best of 3: 21.1 msec per loop
But that's not enough - what about a worst-case 'False' scenario?
bool([a for a in my_dict.values() if a != []]) :
python -mtimeit -s"my_dict=dict(zip(range(1000),[0]*1000))" "bool([a for a in my_dict.values() if a != []])"
10000 loops, best of 3: 198 usec per loop
any([my_dict[i] != [] for i in my_dict]) :
python -mtimeit -s"my_dict=dict(zip(range(1000),[0]*1000))" "any([my_dict[i] != [] for i in my_dict])"
1000 loops, best of 3: 265 usec per loop
Not falsey or not empty lists:
Not falsey:
any(someDict.values())
Not empty lists:
any(a != [] for a in someDict.values())
or
any(map(lambda x: x != [], someDict.values()))
Or if you are ok with a falsey return value:
filter(lambda x: x != [], someDict.values())
This returns a list of the items that are not empty lists, so if they are all empty lists it's an empty list :)
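One hedge for Python 3 readers: there filter() returns a lazy filter object, which is always truthy, so the idiom above only works as described on Python 2. Wrapping it in any() restores the behaviour:
someDict = {'a': [], 'b': []}
# Python 3: filter() is lazy, so test with any() instead of truthiness.
any(filter(lambda x: x != [], someDict.values()))   # False here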
Quite literally:
any(x != [] for x in someDict.itervalues())
Try this:
all([d[i] == [] for i in d])
Edit: oops, I think I got you backwards. Let's De Morgan that:
any([d[i] != [] for i in d])
This second way has the short-circuit advantage over the first anyhow (though to actually benefit you'd drop the square brackets, so any() consumes a generator rather than a fully built list).
>>> someDict = {'a':[], 'b':[]}
>>> all(map(lambda x: x == [], someDict.values()))
True
len(filter(lambda x: x!=[], someDict.values())) != 0
