How to use a functional in this case? - python

I have a Pandas dataframe and want to do different things with it. Now my function has this structure:
def process_dataframe(df, save_to_file, print_to_screen, etc):
    ...
    if save_to_file:
        df.to_csv(filename)
    elif print_to_screen:
        print(df)
    elif...
This is an ugly if/else chain. I want to pass in a function (a function pointer, so to speak) instead. Something like this, where I create several functions:
def save_to_file(df, filename):
    return create_function(to_csv, filename???)
def print_to_screen(df):
    return create_function(print)
Which means I can change the structure of my function to this single line instead:
result = process_dataframe(save_to_file)
...
...
def process_dataframe(df, my_functional):
    return my_functional(df)
The problem is that I don't understand the syntax. For instance, how do I return the class member function ".to_csv" in "save_to_file()"? What does "save_to_file()" look like? Which args does it take?
Of course, I could use a lambda instead of defining each function. But I want to understand how to define functions first. The next step with lambdas, I can figure out myself.

I'd make sure this is actually what you want to do, but assuming it is, you can just write a function that calls functions (and passes through arguments), like this:
def process_df(df, function, *args, **kwargs):
    # Pass the result through so callers can use whatever the action returns.
    return function(df, *args, **kwargs)
And define your two actions.
def print_to_screen(df):
    print(df)
def save_to_file(df, filename):
    df.to_csv(filename)
Then you can use these as you like:
In [193]: df = pd.DataFrame([[1,2,3],[2,4,5]], columns=['a','b','c'])
In [197]: process_df(df, print_to_screen)
   a  b  c
0  1  2  3
1  2  4  5
In [198]: process_df(df, save_to_file, 'temp.csv')
# writes temp.csv

The problem is that I don't understand the syntax. For instance, how do I
return the class member function ".to_csv" in "save_to_file()"?
I think what you are asking is this:
def save_to_file(filename):
    def df_to_csv(df):
        return df.to_csv(filename)
    return df_to_csv
And the call:
foo = save_to_file('myfile.csv')
foo(df) # <- here "df" will be saved to "myfile.csv"
You could also do this (which I believe is something you originally wanted):
def save_to_file(df, filename):
    def df_to_csv():
        return df.to_csv(filename)
    return df_to_csv
And then call it like so:
foo = save_to_file(df, 'myfile.csv')
foo() # <- "df" is saved to "myfile.csv"
But to me this seems not much less ugly than the first solution, so you might want to rethink your approach.
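As a variation on the closures above, functools.partial from the standard library can pre-fill the output argument and leave the frame to be supplied later. This is only a sketch: FakeFrame is a hypothetical stand-in for a DataFrame so the example runs without pandas, and save_to, target and the in-memory buffer are assumptions made for illustration.

```python
import functools
import io


class FakeFrame(object):
    """Hypothetical stand-in for a DataFrame so the sketch runs without pandas."""

    def __init__(self, rows):
        self.rows = rows

    def to_csv(self, target):
        # Write each row as one comma-separated line.
        for row in self.rows:
            target.write(",".join(str(value) for value in row) + "\n")


def save_to(target):
    # Pre-fill `target`; the frame itself is supplied later as the first argument.
    return functools.partial(FakeFrame.to_csv, target=target)


def process_dataframe(df, action):
    return action(df)


buffer = io.StringIO()
process_dataframe(FakeFrame([[1, 2, 3], [2, 4, 5]]), save_to(buffer))
```

With a real DataFrame the same idea can be written as a lambda, for example lambda df: df.to_csv(filename), which is the closure from the answer above in one line.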


Return from method if other method succeeds with one line

I have trouble finding a fitting title for this question, so please forgive me.
Many methods in my class look like this:
def one_of_many(self):
    # code to determine `somethings`
    for something in somethings:
        if self.try_something(something):
            return
    # code to determine `something_else`
    if self.try_something(something_else):
        return
    …
where self.try_something returns True or False.
Is there a way to express this with something like:
def one_of_many(self):
    # code to determine `somethings`
    for something in somethings:
        self.try_something_and_return(something) # this will return from `one_of_many` on success
    # code to determine `something_else`
    self.try_something_and_return(something_else) # this will return from `one_of_many` on success
    …
I was fiddling with decorators and context managers to make this happen with no success but I still believe that "There must be a better way!".
It looks like itertools to the rescue:
When you say method, I assume this is a method of a class so the code could look like this:
import itertools
class Thing:
    def one_of_many(self):
        # code to determine `somethings`
        for something in itertools.chain(somethings,[something_else]):
            if self.try_something(something):
                return
Hopefully something_else is not too difficult to compute.
Hopefully this mcve mimics your problem:
a = [1,2,3]
b = 3
def f(thing):
    print(thing)
    return False
class F:
    pass
self = F()
self.trysomething = f
Map the method to all the things and take action if any return True
if any(map(self.trysomething, a + [b])):
    print('yeay')
else:
    print('nay')
Depending on what a and b actually are, you may have to play around with ways to concatenate, flatten/add, or chain them, as @quamrana mentioned.
if self.try_something(a_thing) or self.try_something(another_thing):
    return
But you'll need to either know your things beforehand or calculate them with an expression inside the function call.
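The any() and chain ideas above can be combined with a generator so that the expensive candidate is only computed if the cheaper ones fail. A minimal sketch follows; Worker, candidates and compute_something_else are hypothetical names, and try_something is assumed to succeed only for the value 3.

```python
class Worker(object):
    """Hypothetical class; try_something succeeds only for the value 3."""

    def __init__(self):
        self.attempts = []

    def try_something(self, thing):
        self.attempts.append(thing)
        return thing == 3

    def compute_something_else(self):
        # Stands in for the "code to determine something_else" step.
        self.attempts.append("computed something_else")
        return 99

    def candidates(self):
        # Yield the cheap candidates first ...
        for something in [1, 2, 3]:
            yield something
        # ... and compute the expensive one only if the consumer gets this far.
        yield self.compute_something_else()

    def one_of_many(self):
        # any() stops at the first candidate for which try_something is true.
        return any(self.try_something(c) for c in self.candidates())


worker = Worker()
succeeded = worker.one_of_many()
```

Because both the generator and any() are lazy, compute_something_else is never called here: the third candidate already succeeds.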

Storing a data for recalling functions Python

I have a project in which I run multiple data through a specific function that "cleans" them.
The cleaning function looks like this:
Misc.py
import sys

def clean(my_data):
    sys.stdout.write("Cleaning genes...\n")
    synonyms = FileIO("raw_data/input_data", 3, header=False).openSynonyms()
    clean_data = {}
    # Iterate over a snapshot of the keys, since entries are deleted inside the loop.
    for g in list(my_data):
        if g in synonyms:
            # Found a data point which appears in the synonym list.
            #print synonyms[g]
            for synonym in synonyms[g]:
                if synonym in my_data:
                    del my_data[synonym]
                    clean_data[g] = synonym
                    sys.stdout.write("\t%s is also known as %s\n" % (g, clean_data[g]))
    return my_data
FileIO is a custom class I made to open files.
My question is this: the function will be called many times throughout the program's life cycle. I want to avoid reading input_data every time, since it's going to be the same every time. I know that I can just return it and pass it as an argument, like this:
def clean(my_data, synonyms=None):
    if synonyms is None:
        ...
    else:
        ...
But is there another, better looking way of doing this?
My file structure is the following:
lib
    Misc.py
    FileIO.py
    __init__.py
    ...
raw_data
runme.py
From runme.py, I do "from lib import *" and call all the functions I made.
Is there a pythonic way to go around this? Like a 'memory' for the function
Edit:
The line synonyms = FileIO("raw_data/input_data", 3, header=False).openSynonyms() returns a collections.OrderedDict() built from input_data, using the 3rd column as the key of the dictionary.
The dictionary for the following dataset:
column1  column2  key  data
...      ...      A    B|E|Z
...      ...      B    F|W
...      ...      C    G|P
...
Will look like this:
OrderedDict([('A',['B','E','Z']), ('B',['F','W']), ('C',['G','P'])])
This tells my script that A is also known as B, E and Z, that B is also known as F and W, etc.
So these are the synonyms. Since the synonyms list will never change throughout the life of the program, I want to read it just once and reuse it.
Use a class with a __call__ operator. You can call objects of this class and store data between calls in the object. Some data probably can best be saved by the constructor. What you've made this way is known as a 'functor' or 'callable object'.
Example:
class Incrementer:
    def __init__ (self, increment):
        self.increment = increment
    def __call__ (self, number):
        return self.increment + number
incrementerBy1 = Incrementer (1)
incrementerBy2 = Incrementer (2)
print (incrementerBy1 (3))
print (incrementerBy2 (3))
Output:
4
5
[EDIT]
Note that you can combine the answer of @Tagc with my answer to create exactly what you're looking for: a 'function' with built-in memory.
Name your class Clean rather than DataCleaner and name the instance clean. Name the method __call__ rather than clean.
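A minimal sketch of that combination is shown below. The synonyms are passed in directly so the example is self-contained; in the question they would come from FileIO(...).openSynonyms(). The cleaning rule used here (drop every entry that is a synonym of another entry) is an assumption based on the question's code, not a definitive implementation.

```python
class Clean(object):
    def __init__(self, synonyms):
        # Loaded once and kept as state on the instance.
        self.synonyms = synonyms

    def __call__(self, data):
        cleaned = dict(data)  # work on a copy, leave the caller's dict alone
        for gene in data:
            for synonym in self.synonyms.get(gene, []):
                cleaned.pop(synonym, None)  # drop entries that are just synonyms
        return cleaned


# The instance is used exactly like a function from here on.
clean = Clean({"A": ["B", "E", "Z"], "B": ["F", "W"], "C": ["G", "P"]})
result = clean({"A": 1, "B": 2, "C": 3, "X": 4})
```

Every later call to clean(...) reuses the same synonyms without touching the file again.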
Like a 'memory' for the function
Half-way to rediscovering object-oriented programming.
Encapsulate the data cleaning logic in a class, such as DataCleaner. Make it so that instances read synonym data once when instantiated and then retain that information as part of their state. Have the class expose a clean method that operates on the data:
class FileIO(object):
    def __init__(self, file_path, some_num, header):
        pass
    def openSynonyms(self):
        return []
class DataCleaner(object):
    def __init__(self, synonym_file):
        self.synonyms = FileIO(synonym_file, 3, header=False).openSynonyms()
    def clean(self, data):
        for g in data:
            if g in self.synonyms:
                # ...
                pass
if __name__ == '__main__':
    dataCleaner = DataCleaner('raw_data/input_file')
    dataCleaner.clean('some data here')
    dataCleaner.clean('some more data here')
As a possible future optimisation, you can expand on this approach to use a factory method to create instances of DataCleaner which can cache instances based on the synonym file provided (so you don't need to do expensive recomputation every time for the same file).
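One way to sketch such a caching factory is functools.lru_cache (Python 3.2+). The names load_synonyms, get_cleaner and load_count are hypothetical; load_synonyms stands in for the FileIO call and simply counts how often it runs.

```python
import functools

load_count = {"n": 0}


def load_synonyms(path):
    # Hypothetical stand-in for FileIO(path, 3, header=False).openSynonyms().
    load_count["n"] += 1
    return {"A": ["B", "E", "Z"], "B": ["F", "W"], "C": ["G", "P"]}


class DataCleaner(object):
    def __init__(self, synonym_file):
        self.synonyms = load_synonyms(synonym_file)


@functools.lru_cache(maxsize=None)
def get_cleaner(synonym_file):
    # One DataCleaner per distinct file path; repeated calls reuse the instance.
    return DataCleaner(synonym_file)


first = get_cleaner("raw_data/input_data")
second = get_cleaner("raw_data/input_data")
```

Both names refer to the same object, so the synonym file is read only once per path. On Python 2 a plain dictionary keyed by file path would play the same role.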
I think the cleanest way to do this would be to decorate your "clean" (pun intended) function with another function that provides the synonyms to the function. This is IMO cleaner and more concise than creating another custom class, yet still allows you to easily change the "input_data" file if you need to (factory function):
def defineSynonyms(datafile):
    # Read the synonyms once, when the decorator is applied, not on every call.
    synonyms = FileIO(datafile, 3, header=False).openSynonyms()
    def wrap(func):
        def wrapped(*args, **kwargs):
            kwargs['synonyms'] = synonyms
            return func(*args, **kwargs)
        return wrapped
    return wrap
@defineSynonyms("raw_data/input_data")
def clean(my_data, synonyms={}):
    # do stuff with synonyms and my_data...
    pass

Python use dictionary keys as function names

I would like to be able to use dictionary keys as function names, but I'm not sure if it's possible. As a quick example, instead of class().dothis(dictkey, otherstuff), I'd like to have an option for class().dictkey(otherstuff). Here's a non-working code example to give an idea of what I was thinking of.
class testclass:
    def __init__(self):
        self.dict = {'stuff':'value', 'stuff2':'value2'}
        #I know this part won't work, but it gives the general idea of what I'd like to do
        for key, value in self.dict.iteritems():
            def key():
                #do stuff
                return value
>>> testclass().stuff()
'value'
Obviously each key would need to be checked that it's not overriding anything important, but other than that, I'd appreciate a bit of help if it's possible to get working.
Basically, my script is to store other scripts in the headers of the Maya scene file, so you may call a command and it'll execute the matching script. It stores the scripts in text format in a dictionary, where I've done a wrapper like thing so you can input args and kwargs without much trouble, and because you can only enter and execute the scripts personally, there's virtually no danger of anything being malicious unless you do it to yourself.
The list is pickled and base64 encoded as it all needs to be in string format for the header, so each time the function is called it decodes the dictionary so you can edit or read it, so ideally I'd need the functions built each time it is called.
A couple of examples from the run function:
Execute a simple line of code
>>> SceneScript().add("MyScript", "print 5")
>>> SceneScript().run("MyScript")
5
Execute a function with a return
>>> SceneScript().add("MyScript", "def test(x): return x*5")
>>> SceneScript().run("MyScript", "test(10)", "test('c')")
[50, 'ccccc']
Pass a variable to a function command
>>> SceneScript().run("MyScript", 'test(a+b)', a=10, b=-50)
[-200]
Execute a function without a return
>>> SceneScript().add("MyScript", "def test(x): print x*5")
>>> SceneScript().run("MyScript", "test(10)", "test('c')")
50
ccccc
[None, None]
Pass a variable
>>> SceneScript().add("MyScript", "print x")
>>> SceneScript().run("MyScript", x=20)
20
So as this question is asking, in terms of the above code, I'd like to have something like SceneScript().MyScript( "test(10)" ), just to make it easier to use.
The only "correct" way I can think of to do this looks like this:
class SomeClass(object):
    def __init__(self, *args, **kwargs):
        funcs = {'funcname': 'returnvalue'}  # ...plus any further name/value pairs
        for func, ret_val in funcs.items():
            setattr(self, func, self.make_function(ret_val))
    @staticmethod
    def make_function(return_value):
        def wrapped_function(*args, **kwargs):
            # do some stuff
            return return_value
        return wrapped_function
This should allow you to do:
>>> foo = SomeClass()
>>> foo.funcname()
'returnvalue'
Of course the question of why you'd want to do something like this remains, as yet, unanswered :)
EDIT per updated question:
The problem lies in the fact that you cannot safely assign the method to the function signature. I'm not sure how SceneScript().add works currently, but that's essentially going to have to tie into this somehow or another.
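Another way to get the SceneScript().MyScript(...) spelling is to resolve unknown attribute names at lookup time with __getattr__, so nothing has to be rebuilt each time the dictionary changes. This is only a sketch: ScriptStore is a hypothetical stand-in for SceneScript, and its run method merely echoes its arguments instead of executing stored script text.

```python
import functools


class ScriptStore(object):
    """Hypothetical stand-in for SceneScript; run() only echoes its arguments."""

    def __init__(self, scripts):
        self._scripts = dict(scripts)

    def run(self, name, *commands):
        # Placeholder: the real run() would execute the stored script text.
        return (name, commands)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        if name.startswith("_") or name not in self._scripts:
            raise AttributeError(name)
        # Bind the script name so store.MyScript(...) forwards to run("MyScript", ...).
        return functools.partial(self.run, name)


store = ScriptStore({"MyScript": "def test(x): return x*5"})
result = store.MyScript("test(10)")
```

Because __getattr__ is consulted only after normal lookup fails, real methods such as run can never be shadowed by a dictionary key with the same name.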
Are you looking for a way to call a function residing inside the current module through a string with its name? If so,
def stuff(arg):
    return 5
d = {"stuff":"value","stuff2":"value2"}
print(globals()["stuff"](d["stuff"]))
will print 5.
I would look into partial functions using functools.partial, in conjunction with __getattribute__:
import functools

class Foo:
    def __init__(self):
        self.a = 5
        self.b = 6
    def funca(self, x):
        print(self.a + x)
    def funcb(self, x):
        self.a += x
        self.funca(x)
mydict = {'funca':1, 'funcb':2}
foo = Foo()
for funcname,param in mydict.items():
    print('foo before:', foo.a, foo.b)
    print('calling', funcname)
    functools.partial(foo.__getattribute__(funcname), param)()
    print('foo after:', foo.a, foo.b)
Output:
foo before: 5 6
calling funca
6
foo after: 5 6
foo before: 5 6
calling funcb
9
foo after: 7 6
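The same name-based dispatch can also be sketched with the built-in getattr, which looks up a bound method by its string name. In this variant the methods return their values instead of printing them, and a list of pairs is used for the calls so that the order is explicit; both changes are assumptions made to keep the example easy to check.

```python
class Foo(object):
    def __init__(self):
        self.a = 5

    def funca(self, x):
        return self.a + x

    def funcb(self, x):
        self.a += x
        return self.funca(x)


calls = [("funca", 1), ("funcb", 2)]
foo = Foo()
# getattr(foo, name) returns the bound method, which is then called with param.
results = [getattr(foo, name)(param) for name, param in calls]
```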

automatic wrapper that adds an output to a function

[I am using python 2.7]
I wanted to make a little wrapper function that adds one output to a function. Something like:
def add_output(fct, value):
    return lambda *args, **kargs: (fct(*args,**kargs),value)
Example of use:
def f(a): return a+1
g = add_output(f,42)
print g(12) # print: (13,42)
This is the expected result, but it does not work if the function given to add_output returns more than one output (or no output at all). In that case, the wrapped function returns two outputs: one containing all the outputs of the initial function (or None if it returns nothing), and one with the added output:
def f1(a): return a,a+1
def f2(a): pass
g1 = add_output(f1,42)
g2 = add_output(f2,42)
print g1(12) # print: ((12,13),42) instead of (12,13,42)
print g2(12) # print: (None,42) instead of 42
I can see this is related to the impossibility of distinguishing between one output of type tuple and several outputs. But it is disappointing not to be able to do something so simple in a dynamic language like Python...
Does anyone have an idea on a way to achieve this automatically and nicely enough, or am I in a dead-end ?
Note:
In case this changes anything, my real purpose is to wrap class (instance) methods so that they look like functions (for workflow stuff). However, self is required in the output (in case its content is changed):
class C(object):
    def f(self): return 'foo','bar'
def wrap(method):
    return lambda self, *args, **kargs: (self,method(self,*args,**kargs))
f = wrap(C.f)
c = C()
f(c) # returns (c,('foo','bar')) instead of (c,'foo','bar')
I am working with Python 2.7, so I want a solution for this version or else I will abandon the idea. I (and maybe future readers) am still interested in comments about this issue for Python 3, though.
Your add_output() function is what is called a decorator in Python. Regardless, you can use one of the collections module's ABCs (Abstract Base Classes) to distinguish between different results from the function being wrapped. For example:
import collections
def add_output(fct, value):
    def wrapped(*args, **kwargs):
        result = fct(*args, **kwargs)
        if isinstance(result, collections.Sequence):
            return tuple(result) + (value,)
        elif result is None:
            return value
        else: # non-None and non-sequence
            return (result, value)
    return wrapped
def f1(a): return a,a+1
def f2(a): pass
g1 = add_output(f1, 42)
g2 = add_output(f2, 42)
print g1(12) # -> (12,13,42)
print g2(12) # -> 42
Depending on what sort of functions you plan on decorating, you might need to use the collections.Iterable ABC instead of, or in addition to, collections.Sequence.
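One caveat with a Sequence check is that strings and lists are sequences too, so a function returning "abc" would be unpacked character by character. Since a function with several return values always hands back a tuple, a narrower sketch can test for tuple only. The variant below runs on both Python 2.7 and Python 3; treating only tuples as "multiple outputs" is an assumption that fits the question's examples.

```python
def add_output(fct, value):
    def wrapped(*args, **kwargs):
        result = fct(*args, **kwargs)
        if isinstance(result, tuple):
            # Several outputs: append the extra value to them.
            return result + (value,)
        if result is None:
            # No output: the extra value is the only output.
            return value
        # Exactly one output (including strings and lists): pair it up.
        return (result, value)
    return wrapped


def f(a): return a + 1
def f1(a): return a, a + 1
def f2(a): pass
def f3(a): return "abc"

g, g1, g2, g3 = (add_output(func, 42) for func in (f, f1, f2, f3))
```

The remaining ambiguity is a function that deliberately returns a single tuple as its one value; no wrapper can tell that apart from several outputs.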

How to implement a submethod in a Python-class?

I apologize if I didn't express myself clearly. What I want to do is this:
class someClass(object):
    def aMethod(self, argument):
        return some_data #for example a list or a more complex datastructure
    def aMethod_max(self, argument):
        var = self.aMethod(argument)
        #do something with var
        return altered_var
or I could do:
def aMethod(self, argument):
    self.someVar = some_data
    return some_data #for example a list or a more complex datastructure
def aMethod_max(self, argument):
    if not hasattr(self, 'someVar'):
        self.aMethod(argument)
    #do something with self.someVar
    return altered_var
But I considered this too complicated and hoped for a more elegant solution. I hope that it's clear now, what I want to accomplish.
Therefore I fantasized about something like the following.
class someClass(object):
    someMethod(self):
        #doSomething
        return result
        subMethod(self):
            #doSomething with the result of someMethod
Foo = someClass()
Foo.someMethod.subMethod()
or if someMethod has an argument something like
Foo.someMethod(argument).subMethod()
How would I do something like this in python?
EDIT: or like this?
def subMethod(self):
    var = self.someMethod()
    return doSomething(var)
Let's compare the existing solutions already given in your question (e.g. the ones you call "complicated" and "inelegant") with your proposed alternative.
The existing solutions mean you will be able to write:
foo.subMethod() # foo.someMethod() is called internally
but your proposed alternative means you have to write:
foo.someMethod().subMethod()
which is obviously worse.
On the other hand, if subMethod has to be able to modify the result of any method, rather than just someMethod, then the existing solutions would mean you have to write:
foo.subMethod(foo.anyMethod())
with the only disadvantage here being that you have to type foo twice, as opposed to once.
Conclusion: on the whole, the existing solutions are less complicated and inelegant than your proposed alternative - so stick with the existing solutions.
You can do method chaining when the result of someMethod is an instance of someClass.
Simple example:
>>> class someClass:
...     def someMethod(self):
...         return self
...     def subMethod(self):
...         return self.__class__
...
>>> x=someClass()
>>> x
<__main__.someClass instance at 0x2aaaaab30d40>
>>> x.someMethod().subMethod()
<class __main__.someClass at 0x2aaaaab31050>
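If someMethod should keep returning data rather than the instance itself, chaining still works when the returned object carries the follow-up method. A minimal sketch, assuming the result is list-like: Result and SomeClass are hypothetical names, and subMethod simply takes the maximum as a placeholder for "do something with the result".

```python
class Result(list):
    """A list that also knows the follow-up operation."""

    def subMethod(self):
        # Placeholder for "do something with the result of someMethod".
        return max(self)


class SomeClass(object):
    def someMethod(self, argument):
        # Wrap the data so callers can chain .subMethod() onto it.
        return Result([1, 2, 3, argument])


foo = SomeClass()
data = foo.someMethod(10)             # still behaves like a plain list
largest = foo.someMethod(10).subMethod()
```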
Not sure if I'm understanding it right, but perhaps you mean this:
Foo.subMethod(Foo.someMethod())
This passes the result of someMethod() to subMethod(). You'd have to change your current definition of subMethod() to accept the result of someMethod().
You can achieve something similar using decorators:
def on_result(f):
    def decorated(self,other,*args,**kwargs):
        result = getattr(self,other)(*args,**kwargs)
        return f(result)
    return decorated
Usage:
class someClass(object):
    def someMethod(self,x,y):
        #doSomething
        result = [1,2,3,x,y] # example
        return result
    @on_result
    def subMethod(self):
        #doSomething with the result of someMethod
        print self # example
Foo = someClass()
Foo.subMethod("someMethod",4,5)
Output:
[1, 2, 3, 4, 5]
As you see, the first argument is the name of the method to be chained, and the remaining ones will be passed to it, no matter what its signature is.
EDIT: on second thought, this is rather pointless, since you could always use
Foo.submethod(Foo.someMethod(4,5))...
Maybe I didn't understand what you're trying to achieve. Does the subMethod have to be linked to a specific method only? Or maybe it's the syntactic form
a.b().c()
that's important to you? (in that case, see kojiro's answer)
From the feedback so far, I understand that subMethod will link only to someMethod, right? Maybe you can achieve this combining a decorator with a closure:
def on_result(other):
    def decorator(f):
        def decorated(self,*a1,**k1):
            def closure(*a2,**k2):
                return f(self,getattr(self,other)(*a1,**k1),*a2,**k2)
            return closure
        return decorated
    return decorator
class someClass(object):
    def someMethod(self,a,b):
        return [a,2*b]
    @on_result('someMethod')
    def subMethod(self,result,c,d):
        result.extend([3*c,4*d])
        return result
Foo = someClass()
print Foo.subMethod(1,2)(3,4) # prints [1,4,9,16]
The decorator is kinda "ugly", but once written its usage is quite elegant IMHO (plus, there are no constraints on the signature of either method).
Note: I'm using Python 2.5 and this is the only way I know of writing decorators that take arguments. There's probably a better way, but I'm too lazy to look it up right now...
