sum function vs long addition - python

I am wondering whether the sum() builtin has an advantage over a long addition.
Is
sum(filter(None, [a, b, c, d]))
faster than
a + b + c + d
assuming I am using CPython?
Thanks.
EDIT: What if those variables are Decimals?

A quick example (note that, to try to be fairer, the sum version takes a tuple argument, so you don't include the time for building that structure (a, b, c, d), and doesn't include the unnecessary filter):
>>> import timeit
>>> def add_up(a, b, c, d):
...     return a + b + c + d
...
>>> def sum_up(t):
...     return sum(t)
...
>>> t = (1, 2, 3, 4)
>>> timeit.timeit("add_up(1, 2, 3, 4)", setup="from __main__ import sum_up, add_up, t")
0.2710826617188786
>>> timeit.timeit("sum_up(t)", setup="from __main__ import sum_up, add_up, t")
0.3691424539089212
This is pretty much inevitable: add_up makes no further function call, it just does 3 binary adds, whereas sum_up pays for the extra call to sum. But the different forms have different uses: sum doesn't care how many items are given to it, whereas with + you have to write each name out. In an example with a fixed number of items, where speed is crucial, + has the edge, but for almost all general cases sum is the way to go.
With Decimals:
>>> from decimal import Decimal
>>> t = tuple(map(Decimal, t))
>>> a = Decimal(1)
>>> b = Decimal(2)
>>> c = Decimal(3)
>>> d = Decimal(4)
>>> timeit.timeit("add_up(a, b, c, d)", setup="from __main__ import sum_up, add_up, t, a, b, c, d")
0.5005962150420373
>>> timeit.timeit("sum_up(t)", setup="from __main__ import sum_up, add_up, t, a, b, c, d")
0.7599533142681025
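The timings above leave out the filter(None, ...) from the question. As a small sketch of what that filter actually changes (the sample values are made up for illustration): it drops every falsy item, so it removes None but also zeros, and without it a None in the list makes the addition fail.

```python
from decimal import Decimal

# Hypothetical inputs: two real values, a missing one, and a zero.
values = [Decimal("1.5"), None, Decimal("2.5"), Decimal(0)]

# filter(None, ...) keeps only truthy items, so None and Decimal(0) both go.
kept = list(filter(None, values))

# sum() starts from the int 0; int + Decimal yields a Decimal.
total = sum(kept)

# Without the filter, 1.5 + None raises TypeError.
try:
    sum(values)
    failed = False
except TypeError:
    failed = True

print(kept, total, failed)
```

Dropping zeros does not change the total, so the filter only matters when some of the variables may be None.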


Intersecting two sets, retaining all (up to) three parts efficiently

If you have two sets a and b and intersect them, there are three interesting parts (which may be empty): h(ead) elements of a not in b, i(ntersection) elements in both a and b, and t(ail) elements of b not in a.
For example: {1, 2, 3} & {2, 3, 4} -> h:{1}, i:{2, 3}, t:{4} (not actual Python code, clearly)
One very clean way to code that in Python:
h, i, t = a - b, a & b, b - a
I figure that this can be slightly more efficient though:
h, t = a - (i := a & b), b - i
Since it first computes the intersection and then subtracts only that from a and then b, which would help if i is small and a and b are large - although I suppose it depends on the implementation of the subtraction whether it's truly faster. It's not likely to be worse, as far as I can tell.
I was unable to find such an operator or function, but since I can imagine efficient implementations that would perform the three-way split of a and b into h, i, and t in fewer iterations, am I missing something like this, which may already exist?
from magical_set_stuff import hit
h, i, t = hit(a, b)
It's not in Python, and I haven't seen such a thing in a 3rd-party library either.
Here's a perhaps unexpected approach that's largely insensitive to which sets are bigger than others, and to how much overlap among inputs there may be. I dreamed it up when facing a related problem: suppose you had 3 input sets, and wanted to derive the 7 interesting sets of overlaps (in A only, B only, C only, both A and B, both A and C, both B and C, or in all 3). This version strips that down to the 2-input case. In general, assign a unique power of 2 to each input, and use those as bit flags:
from collections import defaultdict

def hit(a, b):
    x2flags = defaultdict(int)
    for x in a:
        x2flags[x] = 1
    for x in b:
        x2flags[x] |= 2
    result = [None, set(), set(), set()]
    for x, flag in x2flags.items():
        result[flag].add(x)
    return result[1], result[3], result[2]
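The bit-flag idea extends to any number of inputs. The following is only a sketch of that generalization, not code from the answer: the name overlaps and the sample sets are made up for illustration. Input number k gets the flag 1 << k, and each element ends up in the region named by the OR of the flags of the sets that contain it.

```python
from collections import defaultdict

def overlaps(*sets):
    """Map each bit mask to the elements found in exactly those inputs.

    Hypothetical helper: bit k of the mask is set when the element
    is a member of sets[k].
    """
    flags = defaultdict(int)
    for bit, s in enumerate(sets):
        for x in s:
            flags[x] |= 1 << bit
    regions = defaultdict(set)
    for x, mask in flags.items():
        regions[mask].add(x)
    return dict(regions)

A, B, C = {1, 2, 3, 4}, {3, 4, 5, 6}, {4, 6, 7}
regions = overlaps(A, B, C)
# Mask 1 = A only, 2 = B only, 4 = C only, 3 = A and B only, 7 = all three.
print(regions)
```

Regions that turn out empty (here mask 5, "A and C only") are simply absent from the result, so callers may want regions.get(mask, set()).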
I won't accept my own answer unless nobody manages to beat my own solution or any of the good and concise Python ones.
But for anyone interested in some numbers:
from random import randint
from timeit import timeit

def grismar(a: set, b: set):
    h, i, t = set(), set(), b.copy()
    for x in a:
        if x in t:
            i.add(x)
            t.remove(x)
        else:
            h.add(x)
    return h, i, t

def good(a: set, b: set):
    return a - b, a & b, b - a

def better(a: set, b: set):
    h, t = a - (i := a & b), b - i
    return h, i, t

def ok(a: set, b: set):
    return a - (a & b), a & b, b - (a & b)

from collections import defaultdict

def tim(a, b):
    x2flags = defaultdict(int)
    for x in a:
        x2flags[x] = 1
    for x in b:
        x2flags[x] |= 2
    result = [None, set(), set(), set()]
    for x, flag in x2flags.items():
        result[flag].add(x)
    return result[1], result[3], result[2]

def pychopath(a, b):
    h, t = set(), b.copy()
    h_add = h.add
    t_remove = t.remove
    i = {x for x in a
         if x in t and not t_remove(x) or h_add(x)}
    return h, i, t

def enke(a, b):
    t = b - (i := a - (h := a - b))
    return h, i, t

xs = set(randint(0, 10000) for _ in range(10000))
ys = set(randint(0, 10000) for _ in range(10000))
# validation: every function must return the same three sets
g = (f(xs, ys) for f in (grismar, good, better, ok, tim, pychopath, enke))
l = set(tuple(tuple(sorted(s)) for s in t) for t in g)
assert len(l) == 1, 'functions are not equivalent'
# warmup, not competing
timeit(lambda: grismar(xs, ys), number=500)
# competition
print('a - b, a & b, b - a ', timeit(lambda: good(xs, ys), number=10000))
print('a - (i := a & b), b - i ', timeit(lambda: better(xs, ys), number=10000))
print('a - (a & b), a & b, b - (a & b) ', timeit(lambda: ok(xs, ys), number=10000))
print('tim ', timeit(lambda: tim(xs, ys), number=10000))
print('grismar ', timeit(lambda: grismar(xs, ys), number=10000))
print('pychopath ', timeit(lambda: pychopath(xs, ys), number=10000))
print('b - (i := a - (h := a - b)) ', timeit(lambda: enke(xs, ys), number=10000))
Results:
a - b, a & b, b - a                5.6963334
a - (i := a & b), b - i            5.3934624
a - (a & b), a & b, b - (a & b)    9.7732018
tim                                16.3080373
grismar                            7.709292500000004
pychopath                          6.76331460000074
b - (i := a - (h := a - b))        5.197220600000001
So far, the optimisation proposed by @enke in the comments appears to win out:
t = b - (i := a - (h := a - b))
return h, i, t
Edit: added @Pychopath's results, which are indeed substantially faster than my own, although @enke's result is still the one to beat (and likely won't be with just Python). If @enke posts their own answer, I'd happily accept it as the answer.
Optimized version of yours, seems to be about 20% faster than yours in your benchmark:
def hit(a, b):
    h, t = set(), b.copy()
    h_add = h.add
    t_remove = t.remove
    i = {x for x in a
         if x in t and not t_remove(x) or h_add(x)}
    return h, i, t
And you might want to do this at the start, especially if the two sets can have significantly different sizes:
    if len(a) > len(b):
        return hit(b, a)[::-1]
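Put together, the two snippets above might look like the following sketch. It simply places the size check as the first statement of hit; the sample sets at the bottom are made up for illustration.

```python
def hit(a, b):
    # Loop over the smaller set; reversing the result swaps head and tail back.
    if len(a) > len(b):
        return hit(b, a)[::-1]
    h, t = set(), b.copy()
    h_add = h.add
    t_remove = t.remove
    # t_remove(x) and h_add(x) both return None, so "not t_remove(x)" is True
    # (x goes into i and leaves t), while "h_add(x)" is falsy (x only goes into h).
    i = {x for x in a
         if x in t and not t_remove(x) or h_add(x)}
    return h, i, t

small_first = hit({1, 2, 3}, {2, 3, 4})
large_first = hit({1, 2, 3, 5}, {2, 3, 4})
print(small_first, large_first)
```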

Sympy Linsolve unexpected results

I am trying to solve a system of equations with linsolve, but I must have misunderstood something, since I keep getting unexpected results. Say I want to solve the two following equations:
a + b = 0
a - b + c = 0
I would expect the result:
b = 0.5*c
Instead Sympy returns the empty set. With nonlinsolve I get (-a), which doesn't make much sense either:
>>> import sympy
>>> a, b, c = sympy.symbols('a b c')
>>> Eqns = [a + b, a - b + c]
>>> sympy.linsolve(Eqns, b)
()
>>> sympy.nonlinsolve(Eqns, b)
(-a)
I think I'm going insane, please help :)
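You also need to pass the other variable. Pass as many unknowns as there are equations; with b as the only unknown, the two equations contradict each other for general a and c, so there is no solution, just like by hand.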
import sympy as sp
a, b, c = sp.symbols('a b c')
Eqns = [a + b, a - b + c]
sp.solve(Eqns, b, a)
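For reference, here is a small sketch of the same fix with linsolve, which is what the question started from. The variable names as_set and as_dict are just for illustration; both calls treat a and b as the unknowns and c as a parameter.

```python
import sympy as sp

a, b, c = sp.symbols('a b c')
eqns = [a + b, a - b + c]

# linsolve returns a set of solution tuples, ordered like the symbols passed in.
as_set = sp.linsolve(eqns, (a, b))

# solve with dict=True returns a list of {symbol: value} mappings.
as_dict = sp.solve(eqns, (a, b), dict=True)

print(as_set)
print(as_dict)
```

Both give a = -c/2 and b = c/2, which matches the b = 0.5*c expected in the question.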

Informing sympy about inequality between variables

I am trying to solve a system in Sympy of the form
max(a,b+c) == a^2
I would like, for example, to tell SymPy to search separately for solutions where $max(a,b+c) = a$ and where $max(a,b+c) = b+c$. Is that possible in some way? I tried doing it through solve, by solving a system with an inequality, as in:
import sympy as sp
a = sp.Symbol('a', finite = True)
b = sp.Symbol('b', finite = True)
c = sp.Symbol('c', finite = True)
eq = sp.Max(a,b+c) - a**2
sp.solve([eq, a > b+c], a)
But I get the error:
The inequality, Eq(-x**2 + Max(x, _b + _c), 0), cannot be solved using
solve_univariate_inequality.
Is there any way equations of this type can be solved? Or can I at least replace $Max(a,b+c)$ with one of its cases to simplify the expression?
Option 1
SymPy struggles solving equations with Min and Max. It is a little bit better at solving Piecewise equalities but it is still not great. Here is how I would tackle this specific problem using rewrite(Piecewise):
from sympy import *
a, b, c = symbols('a b c', real=True)
eq = Max(a, b+c) - a**2
solution = solve(eq.rewrite(Piecewise), a)
print(solution)
This gives
[Piecewise((0, b <= -c), (nan, True)), Piecewise((1, b + c <= 1), (nan, True)), Piecewise((-sqrt(b + c), b + c > -sqrt(b + c)), (nan, True)), Piecewise((sqrt(b + c), b + c > sqrt(b + c)), (nan, True))]
So this tells you that SymPy found 4 solutions all conditional on what b and c are. They seem like valid solutions after plugging them in. I'm not sure if those are all the solutions though.
SymPy might struggle a lot more if equations are more complicated than this.
The solutions would probably look even better if you added positive=True instead of real=True in the code above. Always try to give as much information as possible when defining symbols.
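Plugging the four solutions back in can be done with plain numbers. This is only a spot check under made-up sample values, not a proof: s stands for b + c, and satisfies is a hypothetical helper name.

```python
import math

def satisfies(a, s):
    """True when max(a, s) == a**2, where s stands for b + c."""
    return math.isclose(max(a, s), a * a, abs_tol=1e-12)

checks = [
    satisfies(0.0, -1.0),              # a = 0, valid because b + c <= 0
    satisfies(1.0, 0.5),               # a = 1, valid because b + c <= 1
    satisfies(math.sqrt(4.0), 4.0),    # a = sqrt(b + c) with b + c = 4
    satisfies(-math.sqrt(4.0), 4.0),   # a = -sqrt(b + c) with b + c = 4
]

# Outside its condition a solution fails: a = 0 with b + c = 2.
violation = satisfies(0.0, 2.0)

print(checks, violation)
```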
Option 2
Another route for solving these equations is to replace Max(a, b+c) with a, keeping in mind that the resulting solutions only hold for a >= b+c, and then repeat with b+c for the case b+c >= a. This would probably work better for more complicated equations.
For this specific example you can do so with something like:
from sympy import *
a, b, c = symbols('a b c', real=True)
eq = Max(a, b+c) - a**2
eq1 = eq.subs(Max(a, b+c), a)
solution1 = solveset(eq1, a)
eq2 = eq.subs(Max(a, b+c), b+c)
solution2 = solveset(eq2, a)
solution = Piecewise((solution1, a > b+c), (solution2, a < b+c), (solution1.union(solution2), True))
print(solution)
Giving the same answer as above but a bit more readable:
Piecewise((FiniteSet(0, 1), a > b + c), (FiniteSet(sqrt(b + c), -sqrt(b + c)), a < b + c), (FiniteSet(0, 1, sqrt(b + c), -sqrt(b + c)), True))
Notice how you need to know the arguments of the Max beforehand and that there is only one Max. Combining conditions with more than one Max will be difficult, especially since both solutions hold when the arguments are equal.
I suggest this option if you are solving equations interactively rather than in an automated fashion.
Option 3
I haven't tested this one, but I hope it provides the same answers in the more general case where you have several Max terms with different arguments. Each Max can only take 2 arguments, though.
from sympy import *
a, b, c = symbols('a b c', real=True)
eq = Max(a, b+c) - a**2
eqs = [eq]
conditions = [True]
for f in preorder_traversal(eq):
    new_eqs = []
    new_conds = []
    if f.func == Max:
        for equation, condition in zip(eqs, conditions):
            new_eqs.append(equation.subs(f, f.args[0]))
            new_conds.append(And(condition, f.args[0] >= f.args[1]))
            new_eqs.append(equation.subs(f, f.args[1]))
            new_conds.append(And(condition, f.args[0] <= f.args[1]))
        eqs = new_eqs
        conditions = new_conds
solutions = []
for equation in eqs:
    solutions.append(solveset(equation, a))
pieces = [(solution, condition) for solution, condition in zip(solutions, conditions)]
solution = Piecewise(*pieces)
print(solution)
This gives the same as above except for that last equality section:
Piecewise((FiniteSet(0, 1), a >= b + c), (FiniteSet(sqrt(b + c), -sqrt(b + c)), a <= b + c))
I could not combine both solutions when both of the inequalities hold so you just have to keep that in mind.

Power of 10 with Python Sympy and Latex

This is probably trivial, but I can't find the answer. Consider the following code:
from sympy import *
X = Symbol('X')
a=10
b=100
c=1000
d=10000
s = latex ( a*b*c*d / X )
print (s)
displays:
\frac{10000000000}{X}
And I would prefer
\frac{10^{10}}{X}
Is it possible? Note that a, b, c and d are read from files, so the values will change on each run. That means neither of the following solves my problem:
n20 = Symbol('10')
nor
latex(S('10**10/X', evaluate=False))
>>> from sympy import *
>>> var('X')
X
>>> latex(S('10**20/X', evaluate=False))
'\\frac{10^{20}}{X}'
See https://github.com/sympy/sympy/wiki/Quick-examples.
EDIT: Your edited question differs considerably from the original. Here's an answer to it.
Because your input values might not be powers of ten, their product r might not be one either. Consequently, when r is expressed as a power of ten its exponent might not be an integer; hence the use of base ten logarithms.
from sympy import latex, sympify, Symbol
from math import log10
a=10
b=100
c=1000
d=10000
r = a * b * c * d
exponent = log10(r)
X = Symbol('X')
s = latex(sympify('10**{}/X'.format(exponent), evaluate=False))
print (s)
The result for these values of a, b, c and d is \frac{10^{10.0}}{X}.
All you need is a little helper that returns your number with the powers of 10 factored out. Then wrap this in an unevaluated Mul and pass it to latex:
>>> from sympy import Mul, Pow, latex
>>> from sympy.abc import x
>>> def u10(n):
...     if abs(n) < 10 or int(n) != n: return n
...     s = str(n)
...     m = s.rstrip('0')
...     if len(m) == len(s): return n
...     return Mul(int(m), Pow(10, len(s) - len(m), evaluate=0), evaluate=0)
...
>>> u10(12300)
123*10**2
>>> latex(Mul(_,1/x,evaluate=False))
'\\frac{123 \\cdot 10^{2}}{x}'
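The same decomposition can also be done with integer arithmetic instead of string stripping. The sketch below is not a SymPy API: split_pow10 and latex_pow10 are hypothetical helper names, the input is assumed to be a Python int, and the LaTeX is assembled by hand rather than by sympy.latex.

```python
def split_pow10(n):
    """Return (m, k) such that n == m * 10**k and m is not divisible by 10."""
    if n == 0:
        return 0, 0
    k = 0
    while n % 10 == 0:
        n //= 10
        k += 1
    return n, k

def latex_pow10(n):
    """Format an int as LaTeX with its trailing power of ten pulled out."""
    m, k = split_pow10(n)
    if k == 0:
        return str(m)
    if m == 1:
        return f"10^{{{k}}}"
    return rf"{m} \cdot 10^{{{k}}}"

product = 10 * 100 * 1000 * 10000
print(r"\frac{" + latex_pow10(product) + "}{X}")
print(latex_pow10(12300))
```

Because the exponent comes from repeated division, it stays an integer, unlike the 10.0 produced by the log10 approach.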

Convert this function from recursive to iterative

def g(n):
    """Return the value of G(n), computed recursively.

    >>> g(1)
    1
    >>> g(2)
    2
    >>> g(3)
    3
    >>> g(4)
    10
    >>> g(5)
    22
    """
    if n <= 3:
        return n
    else:
        return g(n-1) + 2*g(n-2) + 3*g(n-3)
How do I convert this to an iterative function? Until now, I didn't realize that writing a recursive function is sometimes easier than writing an iterative one. The reason why it's so hard I think is because I don't know what operation the function is doing. In the recursive one, it's not obvious what's going on.
I want to write an iterative definition, and I know I need to use a while loop, but each time I try to write one either I add additional parameters to g_iter(n) (when there should only be one), or I make a recursive call. Can someone at least start me off on the right path? You don't have to give me a complete solution.
FYI: We have not learned of the all too common "stack" I'm seeing on all these pages. I would prefer to stay away from this.
def g_iter(n):
    """Return the value of G(n), computed iteratively.

    >>> g_iter(1)
    1
    >>> g_iter(2)
    2
    >>> g_iter(3)
    3
    >>> g_iter(4)
    10
    >>> g_iter(5)
    22
    """
    "*** YOUR CODE HERE ***"
def g(n):
    if n <= 3:
        return n
    a, b, c = 1, 2, 3
    for i in range(n - 3):
        a, b, c = b, c, c + 2 * b + 3 * a
    return c
UPDATE: in response to a comment, here is a version without a for loop.
def g(n):
    if n <= 3:
        return n
    a, b, c = 1, 2, 3
    while n > 3:
        a, b, c = b, c, c + 2 * b + 3 * a
        n -= 1
    return c
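To see what the loop is doing, it may help to print the three values it carries along. In this sketch (window_states is a made-up name for illustration), a, b, c always hold G(k-2), G(k-1), G(k) for the current k, and each step slides that window one position to the right.

```python
def g(n):
    """Recursive definition from the question, used here as the reference."""
    if n <= 3:
        return n
    return g(n - 1) + 2 * g(n - 2) + 3 * g(n - 3)

def window_states(n):
    """Return every (a, b, c) window the iterative version passes through."""
    a, b, c = 1, 2, 3          # G(1), G(2), G(3)
    states = [(a, b, c)]
    for _ in range(n - 3):
        # New c is G(k+1) = G(k) + 2*G(k-1) + 3*G(k-2).
        a, b, c = b, c, c + 2 * b + 3 * a
        states.append((a, b, c))
    return states

trace = window_states(5)
print(trace)
```

The last element of the final window is the answer, so for n > 3 it should agree with the recursive g(n).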
