Using an operator to add in Python - python

Consider:
operator.add(a, b)
I'm having trouble understanding what this does. An operator is something like +-*/, so what does operator.add(a, b) do and how would you use it in a program?

Operator functions let you pick operations dynamically.
They do the same thing as the operator, so operator.add(a, b) does the exact same thing as a + b, but you can now use these operators in abstract.
Take for example:
import operator, random
ops = [operator.add, operator.sub]
print(random.choice(ops)(10, 5))
The above code will randomly either add up or subtract the two numbers. Because the operators can be applied in function form, you can also store these functions in variables (lists, dictionaries, etc.) and use them indirectly, based on your code. You can pass them to map() or reduce() or partial, etc. etc. etc.

As operator.add is a function and you can pass argument to it, it's for the situations where you can not use statements like a+d, like the map or itertools.imap functions. For better understanding, see the following example:
>>> import operator
>>> from itertools import imap
>>> list(imap(operator.add,[1,3],[5,5]))
[6, 8]

It does the same, it's just a function version of the operator in the Python operator module. It returns the result, so you would just it like this:
result = operator.add(a, b)
This is functionally equivalent to
result = a + b

It literally is how the + operator is defined. Look at the following example
class foo():
def __init__(self, a):
self.a = a
def __add__(self, b):
return self.a + b
>>> x = foo(5)
>>> x + 3
8
The + operator actually just calls the __add__ method of the class
The same thing happens for native Python types,
>>> 5 + 3
8
>>> operator.add(5,3)
8
Note that since I defined my __add__ method, I can also do
>>> operator.add(x, 3)
8

For the first part of your question, checkout the source for operator.add. It does exactly as you'd expect; adds two values together.
The answer to part two of your question is a little tricky.
They can be good for when you don't know what operator you'll need until run time. Like when the data file you're reading contains the operation as well as the values:
# warning: nsfw
total = 0
with open('./foo.dat') as fp:
for line in fp:
operation, first_val, second_val = line.split()
total += getattr(operator, operation)(first_val, second_val)
Also, you might want to make your code cleaner or more efficient (subjective) by using the operator functions with the map built-in as the example shows in the Python docs:
orig_values = [1,2,3,4,5]
new_values = [5,4,3,2,1]
total = sum(map(operator.add, orig_values, new_values))
Those are both convoluted examples which usually means that you probably won't use them except in extraordinary situations. You should really know that you need these functions before you use them.

Related

Python sum(generator): calling an external function

take the code:
SUM = sum(x() for x in xs)
I am writing some code that is in need of calling another function before each x() such that x() will compute the right value
is the only way to do this like so?
for x in xs: x.pre()
SUM = sum(x() for x in xs)
or
SUM = 0
for x in xs:
x.pre()
SUM += x()
incorporating x.pre into x is obviously possible but would make the real code exceptionally ugly and hard to read. is there some way of using generator expressions I am unaware of that would allow what I am trying to achieve?
eg:
SUM = sum(x(), x.pre() for x in xs)
thats obviously just an non-summable tuple generator
I would simply use the for-loop you already presented.
There are ways to do what you want in other ways. For example you could use functools.reduce with a customized function here instead of sum :
def pre_then_add(accumulated, new_one):
new_one.pre() # do some stuff so we get the right value
return accumulated + new_one() # add the value to the accumulated sum
and then it's just a matter of calling reduce:
import functools
functools.reduce(pre_then_add, xs, 0)
Note that this requires to give a "base value" otherwise the function is too simple but also that reduce with a customized function is not necessarily the most efficient or elegant way.
As pointed in the comments (thanks to #SteveJessop) another possibility using sum would be:
def pre_then_call(x):
x.pre();
return x()
sum(pre_then_call(x) for x in xs)
But just to repeat it: I would use the explicit loop. Those are far easier to understand in terms of what you are doing and why.

temporary variables in a python expression

Some (many? all?) functional programming languages like StandardML and Haskell have a type of expression in the form let ... in ... where is possible to create temporary variables with the scope of the expression itself.
Example: let a=b*c in a*(a+1)
It seems that in Python there is no expression construct similar to this.
Motivation:
For example the body of a lambda function must be a (one) expression. Not two expressions. Not a statement (an assignment is a statement and not an expression).
Moreover, when writing functional expressions and in general one-liners in python, things can become messy pretty easily (see my answer to Python 2 list comprehension and eval).
The point of such a construct is to avoid repetition (which sometimes leads to re-computation), for example l[:l.index(a)]+l[l.index(a)+1:] instead of an hypothetic let i=l.index(a) in l[:i]+l[i+1:]
How can we achieve a similar language feature in python2 / python3?
This isn't really idiomatic code, but for single expressions you can use lambdas that you immediately invoke. Your example would look like this:
>>> b, c = 2, 3
>>> (lambda a: a * (a + 1))(b * c)
42
You can also write this using keyword arguments if that helps readability:
>>> (lambda a: a * (a + 1))(a=b * c)
42
You can sort of simulate a smaller scope for your temporary variable by putting it into an iterable and then looping over that iterable:
>>> b,c = 2,3
>>> var = [a*(a+1) for a in [b*c]][0]
>>> var
42
This puts the single value b*c into a list, then loops over that list in a comprehension, with each new value consisting of some transformation of each element. The created list is one element in length, and we get the first element in it with [0]. Without the comprehension, it looks as follows:
>>> b,c = 2,3
>>> var = []
>>> for a in [b*c]:
... var.append(a*(a+1))
...
>>> var = var[0]
>>> var
42

What does ... mean in numpy code?

And what is it called? I don't know how to search for it; I tried calling it ellipsis with the Google. I don't mean in interactive output when dots are used to indicate that the full array is not being shown, but as in the code I'm looking at,
xTensor0[...] = xVTensor[..., 0]
From my experimentation, it appears to function the similarly to : in indexing, but stands in for multiple :'s, making x[:,:,1] equivalent to x[...,1].
Yes, you're right. It fills in as many : as required. The only difference occurs when you use multiple ellipses. In that case, the first ellipsis acts in the same way, but each remaining one is converted to a single :.
Although this feature exists mainly to support numpy and other, similar modules, it's a core feature of the language and can be used anywhere, like so:
>>> class foo:
... def __getitem__(self, key):
... return key
...
>>> aFoo = foo()
>>> aFoo[..., 1]
(Ellipsis, 1)
>>>
or even:
>>> derp = {}
>>> derp[..., 1] = "herp"
>>> derp
{(Ellipsis, 1): 'herp'}
Documentation here: http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html
It does, well, what you describe it doing.

Case-insensitive comparison of sets in Python

I have two sets (although I can do lists, or whatever):
a = frozenset(('Today','I','am','fine'))
b = frozenset(('hello','how','are','you','today'))
I want to get:
frozenset(['Today'])
or at least:
frozenset(['today'])
The second option is doable if I lowercase everything I presume, but I'm looking for a more elegant way. Is it possible to do
a.intersection(b)
in a case-insensitive manner?
Shortcuts in Django are also fine since I'm using that framework.
Example from intersection method below (I couldn't figure out how to get this formatted in a comment):
print intersection('Today I am fine tomorrow'.split(),
'Hello How a re you TODAY and today and Today and Tomorrow'.split(),
key=str.lower)
[(['tomorrow'], ['Tomorrow']), (['Today'], ['TODAY', 'today', 'Today'])]
Here's version that works for any pair of iterables:
def intersection(iterableA, iterableB, key=lambda x: x):
"""Return the intersection of two iterables with respect to `key` function.
"""
def unify(iterable):
d = {}
for item in iterable:
d.setdefault(key(item), []).append(item)
return d
A, B = unify(iterableA), unify(iterableB)
return [(A[k], B[k]) for k in A if k in B]
Example:
print intersection('Today I am fine'.split(),
'Hello How a re you TODAY'.split(),
key=str.lower)
# -> [(['Today'], ['TODAY'])]
Unfortunately, even if you COULD "change on the fly" the comparison-related special methods of the sets' items (__lt__ and friends -- actually, only __eq__ needed the way sets are currently implemented, but that's an implementatio detail) -- and you can't, because they belong to a built-in type, str -- that wouldn't suffice, because __hash__ is also crucial and by the time you want to do your intersection it's already been applied, putting the sets' items in different hash buckets from where they'd need to end up to make intersection work the way you want (i.e., no guarantee that 'Today' and 'today' are in the same bucket).
So, for your purposes, you inevitably need to build new data structures -- if you consider it "inelegant" to have to do that at all, you're plain out of luck: built-in sets just don't carry around the HUGE baggage and overhead that would be needed to allow people to change comparison and hashing functions, which would bloat things by 10 times (or more) for the sae of a need felt in (maybe) one use case in a million.
If you have frequent needs connected with case-insensitive comparison, you should consider subclassing or wrapping str (overriding comparison and hashing) to provide a "case insensitive str" type cistr -- and then, of course, make sure than only instances of cistr are (e.g.) added to your sets (&c) of interest (either by subclassing set &c, or simply by paying care). To give an oversimplified example...:
class ci(str):
def __hash__(self):
return hash(self.lower())
def __eq__(self, other):
return self.lower() == other.lower()
class cifrozenset(frozenset):
def __new__(cls, seq=()):
return frozenset((ci(x) for x in seq))
a = cifrozenset(('Today','I','am','fine'))
b = cifrozenset(('hello','how','are','you','today'))
print a.intersection(b)
this does emit frozenset(['Today']), as per your expressed desire. Of course, in real life you'd probably want to do MUCH more overriding (for example...: the way I have things here, any operation on a cifrozenset returns a plain frozenset, losing the precious case independence special feature -- you'd probably want to ensure that a cifrozenset is returned each time instead, and, while quite feasible, that's NOT trivial).
First, don't you mean a.intersection(b)? The intersection (if case insensitive) would be set(['today']). The difference would be set(['i', 'am', 'fine'])
Here are two ideas:
1.) Write a function to convert the elements of both sets to lowercase and then do the intersection. Here's one way you could do it:
>>> intersect_with_key = lambda s1, s2, key=lambda i: i: set(map(key, s1)).intersection(map(key, s2))
>>> fs1 = frozenset('Today I am fine'.split())
>>> fs2 = frozenset('Hello how are you TODAY'.split())
>>> intersect_with_key(fs1, fs2)
set([])
>>> intersect_with_key(fs1, fs2, key=str.lower)
set(['today'])
>>>
This is not very efficient though because the conversion and new sets would have to be created on each call.
2.) Extend the frozenset class to keep a case insensitive copy of the elements. Override the intersection method to use the case insensitive copy of the elements. This would be more efficient.
>>> a_, b_ = map(set, [map(str.lower, a), map(str.lower, b)])
>>> a_ & b_
set(['today'])
Or... with less maps,
>>> a_ = set(map(str.lower, a))
>>> b_ = set(map(str.lower, b))
>>> a_ & b_
set(['today'])

Is it pythonic for a function to return multiple values?

In python, you can have a function return multiple values. Here's a contrived example:
def divide(x, y):
quotient = x/y
remainder = x % y
return quotient, remainder
(q, r) = divide(22, 7)
This seems very useful, but it looks like it can also be abused ("Well..function X already computes what we need as an intermediate value. Let's have X return that value also").
When should you draw the line and define a different method?
Absolutely (for the example you provided).
Tuples are first class citizens in Python
There is a builtin function divmod() that does exactly that.
q, r = divmod(x, y) # ((x - x%y)/y, x%y) Invariant: div*y + mod == x
There are other examples: zip, enumerate, dict.items.
for i, e in enumerate([1, 3, 3]):
print "index=%d, element=%s" % (i, e)
# reverse keys and values in a dictionary
d = dict((v, k) for k, v in adict.items()) # or
d = dict(zip(adict.values(), adict.keys()))
BTW, parentheses are not necessary most of the time.
Citation from Python Library Reference:
Tuples may be constructed in a number of ways:
Using a pair of parentheses to denote the empty tuple: ()
Using a trailing comma for a singleton tuple: a, or (a,)
Separating items with commas: a, b, c or (a, b, c)
Using the tuple() built-in: tuple() or tuple(iterable)
Functions should serve single purpose
Therefore they should return a single object. In your case this object is a tuple. Consider tuple as an ad-hoc compound data structure. There are languages where almost every single function returns multiple values (list in Lisp).
Sometimes it is sufficient to return (x, y) instead of Point(x, y).
Named tuples
With the introduction of named tuples in Python 2.6 it is preferable in many cases to return named tuples instead of plain tuples.
>>> import collections
>>> Point = collections.namedtuple('Point', 'x y')
>>> x, y = Point(0, 1)
>>> p = Point(x, y)
>>> x, y, p
(0, 1, Point(x=0, y=1))
>>> p.x, p.y, p[0], p[1]
(0, 1, 0, 1)
>>> for i in p:
... print(i)
...
0
1
Firstly, note that Python allows for the following (no need for the parenthesis):
q, r = divide(22, 7)
Regarding your question, there's no hard and fast rule either way. For simple (and usually contrived) examples, it may seem that it's always possible for a given function to have a single purpose, resulting in a single value. However, when using Python for real-world applications, you quickly run into many cases where returning multiple values is necessary, and results in cleaner code.
So, I'd say do whatever makes sense, and don't try to conform to an artificial convention. Python supports multiple return values, so use it when appropriate.
The example you give is actually a python builtin function, called divmod. So someone, at some point in time, thought that it was pythonic enough to include in the core functionality.
To me, if it makes the code cleaner, it is pythonic. Compare these two code blocks:
seconds = 1234
minutes, seconds = divmod(seconds, 60)
hours, minutes = divmod(minutes, 60)
seconds = 1234
minutes = seconds / 60
seconds = seconds % 60
hours = minutes / 60
minutes = minutes % 60
Yes, returning multiple values (i.e., a tuple) is definitely pythonic. As others have pointed out, there are plenty of examples in the Python standard library, as well as in well-respected Python projects. Two additional comments:
Returning multiple values is sometimes very, very useful. Take, for example, a method that optionally handles an event (returning some value in doing so) and also returns success or failure. This might arise in a chain of responsibility pattern. In other cases, you want to return multiple, closely linked pieces of data---as in the example given. In this setting, returning multiple values is akin to returning a single instance of an anonymous class with several member variables.
Python's handling of method arguments necessitates the ability to directly return multiple values. In C++, for example, method arguments can be passed by reference, so you can assign output values to them, in addition to the formal return value. In Python, arguments are passed "by reference" (but in the sense of Java, not C++). You can't assign new values to method arguments and have it reflected outside method scope. For example:
// C++
void test(int& arg)
{
arg = 1;
}
int foo = 0;
test(foo); // foo is now 1!
Compare with:
# Python
def test(arg):
arg = 1
foo = 0
test(foo) # foo is still 0
It's definitely pythonic. The fact that you can return multiple values from a function the boilerplate you would have in a language like C where you need to define a struct for every combination of types you return somewhere.
However, if you reach the point where you are returning something crazy like 10 values from a single function, you should seriously consider bundling them in a class because at that point it gets unwieldy.
Returning a tuple is cool. Also note the new namedtuple
which was added in python 2.6 which may make this more palatable for you:
http://docs.python.org/dev/library/collections.html#collections.namedtuple
OT: RSRE's Algol68 has the curious "/:=" operator. eg.
INT quotient:=355, remainder;
remainder := (quotient /:= 113);
Giving a quotient of 3, and a remainder of 16.
Note: typically the value of "(x/:=y)" is discarded as quotient "x" is assigned by reference, but in RSRE's case the returned value is the remainder.
c.f. Integer Arithmetic - Algol68
It's fine to return multiple values using a tuple for simple functions such as divmod. If it makes the code readable, it's Pythonic.
If the return value starts to become confusing, check whether the function is doing too much and split it if it is. If a big tuple is being used like an object, make it an object. Also, consider using named tuples, which will be part of the standard library in Python 2.6.
I'm fairly new to Python, but the tuple technique seems very pythonic to me. However, I've had another idea that may enhance readability. Using a dictionary allows access to the different values by name rather than position. For example:
def divide(x, y):
return {'quotient': x/y, 'remainder':x%y }
answer = divide(22, 7)
print answer['quotient']
print answer['remainder']

Categories