Consider the following code:
num = 1 + 1j
print(num.imag)
As opposed to
word = "hey"
print(word.islower())
One requires parentheses and the other doesn't. I know that in Python, referring to a function without parentheses gives back only a reference to the function, but that doesn't really answer it. Does 'imag' return a reference? It seems the method actually gets executed and returns the imaginary part.
imag is not a method. It's simply a number-valued attribute.
islower is a method. In order to call the method, you put parentheses after the name.
num.imag is not a function, it's an attribute. To call a function you need the parentheses, or the __call__ method.
Attributes (e.g. imag) are like variables inside the object so you don't use parentheses to access them. Methods (e.g. islower()) are like functions inside the object so they do require parentheses to accept zero or more parameters and perform some work.
Objects can also have 'properties' that are special functions that behave like attributes (i.e. no parentheses) but can perform calculation or additional work when they are referenced or assigned.
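A minimal sketch of all three kinds of access, using an illustrative class (the names here are made up for the example):

```python
class Complexish:
    """Toy class illustrating attribute vs. method vs. property."""
    def __init__(self, imag):
        self.imag = imag              # plain attribute: no parentheses

    def is_real(self):                # method: call it with parentheses
        return self.imag == 0

    @property
    def imag_doubled(self):           # property: runs code, but is
        return 2 * self.imag          # accessed without parentheses

c = Complexish(3)
print(c.imag)          # 3     (attribute)
print(c.is_real())     # False (method call)
print(c.imag_doubled)  # 6     (property)
```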
I need to create an algorithm which is executed as a sequence of functions chosen at run-time, a bit like the strategy pattern. I want to avoid creating many single method classes (100+), but use simple functions, and assemble them using a decorator.
actions = []

def act(specific_fn):
    actions.append(specific_fn)
    return specific_fn

@act
def specific_fn1():
    print("specific_fn1")

@act
def specific_fn2():
    print("specific_fn2")

def execute_strategy():
    for f in actions:
        f()
I have a couple of questions:
How can I modify the decorator function act to take the list actions as a parameter, so that it adds the decorated function into the list?
How do I use specific_fnX defined in another file? Currently, I just import the file inside the calling function - but that seems odd. Any other options?
Also, any other ideas on implementing this pattern?
This is a pretty good tutorial on using decorators, both with and without arguments. Others may know of better ones.
The trick is to remember that a decorator is a function call, but when applied to a function definition, it magically replaces the function with the returned result. If you want @my_decorator(my_list) to be a decorator, then my_decorator(my_list) must either return a function or an object with a __call__ method, and that function is then called on specific_fn1.
So yes, you need a function that returns a function that returns a function. Looking at some examples in a tutorial will make this clearer.
As to your second question, you can just call my_decorator(my_list)(specific_fn1) without using decorator syntax, and then ignore the result.
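For the first question, one way to sketch act as a decorator factory that takes the target list as a parameter (names mirror the question's code):

```python
actions = []

def act(fn_list):
    """Decorator factory: act(fn_list) returns the actual decorator,
    which appends the decorated function to fn_list."""
    def decorator(specific_fn):
        fn_list.append(specific_fn)
        return specific_fn
    return decorator

@act(actions)
def specific_fn1():
    print("specific_fn1")

@act(actions)
def specific_fn2():
    print("specific_fn2")

def execute_strategy():
    for f in actions:
        f()

execute_strategy()  # prints specific_fn1, then specific_fn2
```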
I've come across many such situations using built-in functions or modules where the syntax is sometimes "X.func_name" and other times "X.func_name()".
For example:
In Pandas, "df.columns" gives the names of all columns and throws an error if written by mistake as "df.columns()"  # TypeError: 'Index' object is not callable.
Also in Pandas, "count()", "min()" etc. are written as df.count() | df.min()
I hope I have explained my question properly.
I believe it has something to do with the OOP concept of a class and its member functions, but I'd like a more in-depth understanding.
The syntax to access an attribute foo of an object or module obj is always obj.foo.
An attribute may or may not be callable. An int is not callable (it's just a number), but a function is callable. To call a function foo, you use parentheses, possibly with parameters inside, e.g. foo() or foo("bar"). Attempting to call something that is not callable will give you a TypeError.
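For instance, you can check this distinction with the built-in callable:

```python
num = 1 + 1j
print(callable(num.imag))        # False: imag is just a float
print(callable("hey".islower))   # True: islower is a bound method

try:
    num.imag()                   # attempting to call a non-callable
except TypeError as e:
    print(e)                     # 'float' object is not callable
```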
So which syntax you use depends on whether the attribute is itself the value you want (e.g. an int or a str) or a function that will return that value. In your example, columns is itself the value you want (an Index of column names), whereas count is a function that you need to call in order to get a value.
Note that it's possible in Python to wrap any value in a function, or to turn a function into a property (i.e. make an attribute that automatically calls a function to produce its value), but in general the convention is that if something requires some kind of dynamic computation it will be a function, and values that are predetermined will not require a function invocation to retrieve.
The names followed by parentheses are methods (functions defined on the class), which can take parameters and so on. The names without parentheses are attributes.
So .loc and .iloc are not your typical functions. They somehow use [ and ] to surround the arguments so that it is comparable to normal array indexing. However, I have never seen this in another library (that I can think of; maybe numpy has something like this that I'm blanking on), and I have no idea how it technically works or is defined in the Python code.
Are the brackets in this case just syntactic sugar for a function call? If so, how would one make an arbitrary function use brackets instead of parentheses? Otherwise, what is special about their use/definition in Pandas?
Note: The first part of this answer is a direct adaptation of my answer to this other question, that was answered before this question was reopened. I expand on the "why" in the second part.
So .loc and .iloc are not your typical functions
Indeed, they are not functions at all. I'll make examples with loc; iloc is analogous (it just uses a different internal class).
The simplest way to check what loc actually is, is:
import pandas as pd
df = pd.DataFrame()
print(df.loc.__class__)
which prints
<class 'pandas.core.indexing._LocIndexer'>
this tells us that df.loc is an instance of a _LocIndexer class. The loc[] syntax derives from the fact that _LocIndexer defines __getitem__ and __setitem__*, which are the methods Python calls whenever you use the square-bracket syntax.
So yes, brackets are, technically, syntactic sugar for some function call, just not the function you thought it was (there are of course many reasons why python is designed this way, I won't go in the details here because 1) I am not sufficiently expert to provide an exhaustive answer and 2) there are a lot of better resources on the web about this topic).
*Technically, it's its base class _LocationIndexer that defines those methods, I'm simplifying a bit here
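A minimal sketch of the same mechanism, with a made-up class (this is not the actual pandas implementation):

```python
class Indexer:
    """Toy indexer: square brackets route to __getitem__/__setitem__."""
    def __init__(self):
        self._data = {}

    def __getitem__(self, key):         # ix[key]
        return self._data[key]

    def __setitem__(self, key, value):  # ix[key] = value
        self._data[key] = value

ix = Indexer()
ix["a"] = 1        # calls Indexer.__setitem__
print(ix["a"])     # calls Indexer.__getitem__ and prints 1
```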
Why does Pandas use square brackets with .loc and .iloc?
I'm entering speculation area here, because I couldn't find any document explicitly talking about design choices in Pandas, however: there are at least two good reasons I see for choosing the square brackets.
The first, and most important reason is: you simply can't do with a function call everything you do with the square-bracket notation, because assigning to a function call is a syntax error in python:
# contrived example to show this can't work
a = []

def f():
    global a
    return a

f().append(1)  # OK
f() = dict()   # SyntaxError: cannot assign to function call
Using round brackets for a "function" call invokes the underlying __call__ method (note that any class that defines __call__ is callable, so "function call" is an imprecise term: Python doesn't care whether something is a function or just behaves like one).
Using square brackets, instead, calls __getitem__ or __setitem__ depending on where the call happens (__setitem__ if it's on the left of an assignment operator, __getitem__ in any other case). There is no way to mimic this behaviour with a function call: you'd need a setter method to modify the data in the DataFrame, and it still wouldn't be allowed in an assignment operation:
# imaginary method-based alternative to the square bracket notation:
my_data = df.get_loc(my_index)
df.set_loc(my_index, my_data*2)
This example brings me to the second reason: consistency. You can access elements of a DataFrame via square brackets:
something = df['a']
df['b'] = 2*something
when using loc you're still trying to refer to some items in the DataFrame, so it's more consistent to use the same syntax instead of asking the user to use some getter and setter functions (it's also, I believe, "more pythonic", but that's a fuzzy concept I'd rather stay away from).
Underneath the covers, both are using the __setitem__ and __getitem__ functions.
In my python program, I have a ton of functions that are really wrappers for more complicated functions (the more complicated functions take more arguments, so the simple functions calculate the extra arguments and pass them along with the original arguments to the complex functions). I don't want the more complicated functions to be visible from the outer scope. However, my understanding is that if you define a function inside a function every time the outer function gets called it redefines the inner function, which is wasteful. How can I hide my inner functions without redefining them over and over again? There must be some way for the interpreter to parse my file and just do the definitions once but still keep them in the inner scope.
Rather than controlling access to your "inner functions" by nesting them, use either or both of:
naming conventions (a leading underscore on a name means private-by-convention, see the style guide); and
defining a list named __all__ to specify what gets imported from the package by default (see the tutorial on modules).
In use:
# define the names that get imported from this package
__all__ = ['outer_func']

def _inner_func(...):
    """Private-by-convention inner function."""
    ...

def outer_func(...):
    """Public outer function to call _inner_func."""
    ...
This makes testing much easier, too, as you can still get direct access to _inner_func when necessary.
The usual convention is to prepend the function name with a single underscore; two leading underscores on a name inside a class trigger name mangling, which is a different mechanism.
(See: http://www.diveintopython.net/object_oriented_framework/private_functions.html)
Let me start by saying what I would like to do. I want to create a lazy wrapper for a variable, as in I record all the method calls and operator calls and evaluate them later when I specify the variable to call it on.
As such, I want to be able to intercept all the method calls and operator calls and special methods so that I can work on them. However, __getattr__ doesn't intercept operator calls or __str__ and such, so I want to know if there is a generic way to overload all method calls, or should I just dynamically create a class and duplicate the code for all of it (which I already did, but is ugly).
It can be done, but yes, it becomes "ugly". I wrote a lazy decorator once that turns any function into a "lazily computed function".
Basically, I found out that the only moment an object's value is actually used in Python is when one of the special "dunder" methods is called. For example, when you have a number, its value is only used when you either use it in another operation or convert it to a string for I/O (which also goes through a dunder method).
So my wrapper annotates the parameters to a function call and returns a special object, which has potentially all of the dunder methods. Only when one of those methods is called is the original function actually invoked, and its return value is then cached for further use.
The implementation is here:
https://bitbucket.org/jsbueno/metapython/src/510a7d125b24/lazy_decorator.py
Sorry for the text and most of the presentation being in Portuguese.
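The linked code aside, here is a rough, simplified sketch of the idea. It forwards only two dunder methods; a full version would need to define essentially all of them (ideally generated in a loop when building the class):

```python
import functools

_SENTINEL = object()  # marks "not computed yet"

def lazy(func):
    """Turn func into a lazily computed function: calling it returns a
    proxy object; func only runs when the proxy's value is used."""
    class LazyResult:
        def __init__(self, *args, **kwargs):
            self._args, self._kwargs = args, kwargs
            self._value = _SENTINEL

        def _force(self):
            # Call the original function once, then cache the result.
            if self._value is _SENTINEL:
                self._value = func(*self._args, **self._kwargs)
            return self._value

        # The value is only ever "used" through dunder methods like these:
        def __add__(self, other):
            return self._force() + other

        def __str__(self):
            return str(self._force())

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return LazyResult(*args, **kwargs)
    return wrapper

calls = []

@lazy
def double(x):
    calls.append(x)
    return x * 2

r = double(21)
print(calls)   # []   -- nothing computed yet
print(r + 0)   # 42   -- __add__ forces the computation
print(calls)   # [21] -- computed exactly once, then cached
```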