I'm solving a problem and something confuses me. In the following code, isn't
counts = mapReduce(lines, mapper=computeWordCounts, reducer=sumUpWordCounts);
just wrong? Is it just pseudocode, or is such usage actually possible?
def computeWordCounts(line):
    # TODO
def sumUpWordCounts(word, counts):
    # TODO
def mapReduce(data, mapper, reducer):
    # TODO
def test():
    with open('/Users/bgedik/Desktop/zzz.txt') as f:
        lines = f.read().splitlines()
    counts = mapReduce(lines, mapper=computeWordCounts, reducer=sumUpWordCounts);
    for word, count in counts:
        print word, " => ", count
In Python, functions and classes are first-class citizens: you can pass them around just like any other variable.
def square(a_var):
    return a_var ** 2
def apply(value,fn):
    return fn(value)
print apply(5,square)
You can also bind them to another name:
sq = square
print sq(5)
There is a lot more to this; see the docs: https://docs.python.org/2/library/stdtypes.html#functions
It's perfectly valid; these are called named arguments (keyword arguments in Python). They're a way of skipping arbitrary arguments in the middle of the function call while still knowing which ones you're actually sending.
This pattern is actually very widespread, from VB6 to C# and more!
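To make the call from the question concrete, here is a minimal sketch of how the three TODO functions could be filled in. It is written for Python 3, the snake_case names are my own, and the grouping strategy is only one assumed way to implement a map/reduce step, not the intended solution to the exercise.

```python
from collections import defaultdict

def compute_word_counts(line):
    # Mapper: emit a (word, 1) pair for every word on the line.
    return [(word, 1) for word in line.split()]

def sum_up_word_counts(word, counts):
    # Reducer: collapse all counts collected for one word.
    return word, sum(counts)

def map_reduce(data, mapper, reducer):
    # Group every value emitted by the mapper under its key...
    groups = defaultdict(list)
    for item in data:
        for key, value in mapper(item):
            groups[key].append(value)
    # ...then reduce each group to a single result.
    return [reducer(key, values) for key, values in sorted(groups.items())]

lines = ["a b a", "b c"]
# The functions are passed by name, as keyword arguments, without calling them.
counts = map_reduce(lines, mapper=compute_word_counts, reducer=sum_up_word_counts)
for word, count in counts:
    print(word, "=>", count)
```

The keyword arguments only label which function plays which role; `map_reduce(lines, compute_word_counts, sum_up_word_counts)` would behave the same.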
I have trouble finding a fitting title for this question, so please forgive me.
Many methods in my class look like this:
def one_of_many():
    # code to determine `somethings`
    for something in somethings:
        if self.try_something(something):
            return
    # code to determine `something_else`
    if self.try_something(something_else):
        return
    …
where self.try_something returns True or False.
Is there a way to express this with something like:
def one_of_many():
    # code to determine `somethings`
    for something in somethings:
        self.try_something_and_return(something) # this will return from `one_of_many` on success
    # code to determine `something_else`
    self.try_something_and_return(something_else) # this will return from `one_of_many` on success
    …
I was fiddling with decorators and context managers to make this happen with no success but I still believe that "There must be a better way!".
It looks like itertools to the rescue:
When you say method, I assume this is a method of a class so the code could look like this:
import itertools
class Thing:
    def one_of_many(self):
        # code to determine `somethings`
        for something in itertools.chain(somethings,[something_else]):
            if self.try_something(something):
                return
Hopefully something_else is not too difficult to compute.
Hopefully this mcve mimics your problem:
a = [1,2,3]
b = 3
def f(thing):
    print(thing)
    return False
class F:
    pass
self = F()
self.trysomething = f
Map the method to all the things and take action if any of them returns True:
if any(map(self.trysomething, a + [b])):
    print('yeay')
else:
    print('nay')
Depending on what a and b actually are, you may have to play around with ways to concatenate, flatten/add, or chain, as @quamrana mentioned.
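The two suggestions so far (chaining the candidates and using any) can be combined. The sketch below is written for Python 3 and uses a made-up success condition and made-up candidate values purely for illustration. It records which candidates were tried, to show that any stops at the first success.

```python
import itertools

class Thing:
    def __init__(self):
        self.tried = []

    def try_something(self, something):
        self.tried.append(something)
        return something == "b"  # hypothetical success condition

    def one_of_many(self):
        somethings = ["a", "b", "c"]   # placeholder candidates
        something_else = "d"           # placeholder fallback
        candidates = itertools.chain(somethings, [something_else])
        # The generator expression is lazy, so any() stops at the first True.
        return any(self.try_something(c) for c in candidates)

thing = Thing()
found = thing.one_of_many()
print(found, thing.tried)
```

Note that something_else is still computed up front here. If it is expensive to determine, a generator function that yields the candidates one by one would defer that work until it is actually needed.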
if self.try_something(a_thing) or self.try_something(another_thing):
    return
But you'll either need to know your things beforehand, or calculate them with an expression within the function call.
I'm new to Stack Overflow, and my first question is about a problem in some code I wrote to learn about objects in Python.
I'm trying to trigger the creation of an object through a dictionary.
My goal is to create an object based on a number. For example, I have the dictionary newch = {1 : Character.new_dragon(), 2 : Character.new_goblin()}, and when I call Player1 = newch[1] it should create a new dragon (@classmethod new_dragon) and assign it to Player1.
The problem is that when I run the program, Character.new_dragon() and Character.new_goblin() are called automatically (I put in a control print), but when I write "DRAGO" after the prompt "which player?" the functions aren't called, because the control print doesn't appear.
import random
class Character:
    def __init__(self,idd,height,weight,att,defe):
        self.idd=idd
        self.height=height
        self.weight=weight
        self.att=att
        self.defe=defe
    @classmethod
    def new_goblin(cls):
        print('newgoblin')
        return cls(1,getr(1,1.5,0.1),getr(40,60,0.5),getr(5,15,1),getr(6,10,1))
    @classmethod
    def new_dragon(cls):
        print('newdrago')
        return cls(2,getr(20,30,1),getr(500,2000,5),getr(50,150,3),getr(20,100,3))
def getr(start,stop,step): #returns float
    x=random.randint(1, 1000)
    random.seed(x)
    return random.randint(0, int((stop - start) / step)) * step + start
play={1:'p1', 2:'p2', 3:'p3', 4:'p4'} #dict for players
newch={1:Character.new_dragon(),2:Character.new_goblin()} ############This doesn't work
i=1
while True:
    char=input("which player? Drago or Goblin?").upper()
    if(char=="DRAGO"):
        play[i]=newch[1] #here i try to call Character.new_dragon()
        i+=1
        break
    elif(char=="GOBLIN"):
        play[i]=newch[2]
        i+=1
        break
    print("write \'Drago\' or \'Goblin\'")
print(play[1].height, play[1].weight, play[1].att, play[1].defe)
Here's my code. If you could help me, I would be very glad. Thanks.
The new object is created immediately when you call Character.new_dragon(), and the object is then stored in the dict.
Instead of storing the object in the dict, you could store the function that creates it. That function is Character.new_dragon (without the ()). You can then call it when the player selects a character:
play[i]=newch[1]()
Complete code:
import random
class Character:
    def __init__(self,idd,height,weight,att,defe):
        self.idd=idd
        self.height=height
        self.weight=weight
        self.att=att
        self.defe=defe
    @classmethod
    def new_goblin(cls):
        print('newgoblin')
        return cls(1,getr(1,1.5,0.1),getr(40,60,0.5),getr(5,15,1),getr(6,10,1))
    @classmethod
    def new_dragon(cls):
        print('newdrago')
        return cls(2,getr(20,30,1),getr(500,2000,5),getr(50,150,3),getr(20,100,3))
def getr(start,stop,step): #returns float
    x=random.randint(1, 1000)
    random.seed(x)
    return random.randint(0, int((stop - start) / step)) * step + start
play={1:'p1', 2:'p2', 3:'p3', 4:'p4'} #dict for players
newch={1:Character.new_dragon,2:Character.new_goblin} # store the functions themselves, without calling them
i=1
while True:
    char=input("which player? Drago or Goblin?").upper()
    if(char=="DRAGO"):
        play[i]=newch[1]() # the call happens here and creates a new dragon
        i+=1
        break
    elif(char=="GOBLIN"):
        play[i]=newch[2]()
        i+=1
        break
    print("write \'Drago\' or \'Goblin\'")
print(play[1].height, play[1].weight, play[1].att, play[1].defe)
This works; however, I would not say it is the best coding style. It's hard to judge from only this piece of code, but it might be a better idea to make Drago and Goblin subclasses of the Character class and store those classes themselves in the dictionary.
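A rough sketch of that subclass idea follows. The class names Dragon and Goblin and the fixed attribute values are placeholders I made up, and the attribute list is trimmed to two fields to keep it short. Since a class is itself callable, it can sit in the dictionary exactly like the factory functions did.

```python
class Character:
    def __init__(self, height, weight):
        self.height = height
        self.weight = weight

class Dragon(Character):
    def __init__(self):
        # Placeholder values; the original code draws these at random.
        super().__init__(height=25, weight=1000)

class Goblin(Character):
    def __init__(self):
        super().__init__(height=1.2, weight=50)

# The dict stores the classes; nothing is instantiated yet.
newch = {1: Dragon, 2: Goblin}

# Calling the stored class creates the object on demand.
player = newch[1]()
print(type(player).__name__, player.height, player.weight)
```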
newch={1:Character.new_dragon(),2:Character.new_goblin()}
As this is written, the new_dragon and new_goblin functions are called when the dictionary is created. This is why you are seeing them both run "automatically" every time you run your program.
If you instead declared the dict like:
newch={1:Character.new_dragon,2:Character.new_goblin}
And later have something like:
if(char=="DRAGO"):
    play[i]=newch[1]()
(note the parentheses after newch[1]) you should get what you want.
Incidentally, the break statements are not there to prevent fall-through: an if/elif/else chain doesn't fall through like a switch statement in other languages. They are what exits the while True loop, so they do need to stay.
When you are initialising the dictionary this way:
newch={1:Character.new_dragon(),2:Character.new_goblin()}
You are binding the keys (1 and 2) to the return values of the new_dragon and new_goblin functions. You need to bind the functions themselves (without calling them), like so:
newch={1:Character.new_dragon,2:Character.new_goblin}
Notice there are no brackets!
And then, when you create players, you execute those functions like so:
play[i]=newch[1]() # notice that here we do have brackets!
Additionally, if I may suggest an improvement of the code here:
if(char=="DRAGO"):
    play[i]=newch[1]()
    i+=1
To avoid the if statement, you can create your mapping with string keys:
newch={"DRAGO":Character.new_dragon,"GOBLIN":Character.new_goblin}
And create instances just by calling
play[i]=newch[char]()
To handle errors, you can add a single if statement that checks whether the char string is one of the dict's keys.
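Put together, the string-keyed mapping and the membership check could look roughly like this. The Character class is cut down to a single illustrative attribute, and create is a hypothetical helper name, not something from the original code.

```python
class Character:
    def __init__(self, kind):
        self.kind = kind

    @classmethod
    def new_dragon(cls):
        return cls("dragon")

    @classmethod
    def new_goblin(cls):
        return cls("goblin")

# Keys match the upper-cased user input; values are uncalled functions.
newch = {"DRAGO": Character.new_dragon, "GOBLIN": Character.new_goblin}

def create(choice):
    """Return a new character for the given choice, or None if it is unknown."""
    key = choice.upper()
    if key not in newch:
        return None
    return newch[key]()

player = create("drago")
print(player.kind)
print(create("elf"))
```

In the interactive loop, a None result would be the signal to print the "write 'Drago' or 'Goblin'" hint and ask again.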
I'm writing some tooling for online programming contests.
Part of it is a test case checker which, based on a set of (input, output) file pairs, checks whether the solution method actually works.
Basically, the solution method is expected to be defined as follows:
def solution(inputs: Nexter):
    # blahblah some code here and there
    n = inputs.next_int()
    sub_process(inputs)
    # simulating a print something
    yield str(n)
which can then be translated (once the AST modifications are applied) into:
def solution():
    # blahblah some code here and there
    n = int(input())
    sub_process()
    print(str(n))
Note: Nexter is a class defined to act either as a generator of user input() calls or as a carrier of the expected inputs, plus some other goodies.
I'm aware of the issues related to converting the AST back to source code (it requires relying on third-party tools). I also know that there is a NodeTransformer class:
http://greentreesnakes.readthedocs.io/en/latest/manipulating.html
https://docs.python.org/3/library/ast.html#ast.NodeTransformer
But its use remains unclear to me; I don't know if I'm better off checking calls, expr, etc.
Here is what I've ended up with:
signature = inspect.signature(iterative_greedy_solution)
if len(signature.parameters) == 1 and "inputs" in signature.parameters:
    parameter = signature.parameters["inputs"]
    annotation = parameter.annotation
    if Nexter == annotation:
        source = inspect.getsource(iterative_greedy_solution)
        tree = ast.parse(source)
        NexterInputsRewriter().generic_visit(tree)
class NexterInputsRewriter(ast.NodeTransformer):
    def visit(self, node):
        #???
This is definitely not the best design ever. Next time, I would probably go the other way around: write the solution with plain user input() calls (and plain output, i.e. print(...)), and replace them with test case inputs when passing it to a tester class that asserts whether the actual outputs match the expected ones.
To sum up, this is what I would like to achieve, and I don't really know how (apart from subclassing the NodeTransformer class):
Get rid of the solution function arguments
Modify the inputs calls in the method body (as well as in the sub-calls of methods that also take inputs: Nexter) in order to replace them with their actual user input() implementation, e.g. inputs.next_int() becomes int(input())
EDIT
Found this tool (https://python-ast-explorer.com/), which helps a lot in visualizing which ast.AST derivatives are used for a given function.
You can probably use NodeTransformer + ast.unparse(), though it wouldn't be as effective as some third-party solutions, considering that it won't preserve any of your comments.
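For reference, a standard-library-only version of that idea might look like the sketch below. It assumes Python 3.9+ (for ast.unparse), assumes the parameter is always literally named inputs, and assumes every inputs.next_xxx() call maps to xxx(input()). It does not turn yield into print, and comments in the source are lost.

```python
import ast

class NexterInputsRewriter(ast.NodeTransformer):
    def visit_FunctionDef(self, node):
        # Drop the `inputs` parameter from any function that declares it.
        node.args.args = [a for a in node.args.args if a.arg != "inputs"]
        self.generic_visit(node)
        return node

    def visit_Call(self, node):
        self.generic_visit(node)
        func = node.func
        # inputs.next_xxx()  ->  xxx(input())
        if (isinstance(func, ast.Attribute)
                and isinstance(func.value, ast.Name)
                and func.value.id == "inputs"
                and func.attr.startswith("next_")):
            target = func.attr[len("next_"):]
            return ast.Call(
                func=ast.Name(id=target, ctx=ast.Load()),
                args=[ast.Call(func=ast.Name(id="input", ctx=ast.Load()),
                               args=[], keywords=[])],
                keywords=[],
            )
        # Drop `inputs` where it is passed on to a sub-call.
        node.args = [a for a in node.args
                     if not (isinstance(a, ast.Name) and a.id == "inputs")]
        return node

source = '''
def solution(inputs: Nexter):
    n = inputs.next_int()
    sub_process(inputs)
    yield str(n)
'''
tree = NexterInputsRewriter().visit(ast.parse(source))
result = ast.unparse(ast.fix_missing_locations(tree))
print(result)
```

Note that the transformer is applied with visit(tree), which dispatches to the visit_* methods; calling generic_visit on the root only descends into the children.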
Here is an example transformation done by refactor (I'm the author), which is a wrapper layer around ast.unparse for doing easy source-to-source transformations through the AST:
import ast
import refactor
from refactor import ReplacementAction
class ReplaceNexts(refactor.Rule):
    def match(self, node):
        # We need a call
        assert isinstance(node, ast.Call)
        # on an attribute (inputs.xxx)
        assert isinstance(node.func, ast.Attribute)
        # where the name for attribute is `inputs`
        assert isinstance(node.func.value, ast.Name)
        assert node.func.value.id == "inputs"
        target_func_name = node.func.attr.removeprefix("next_")
        # make a call to target_func_name (e.g int) with input()
        target_func = ast.Call(
            ast.Name(target_func_name),
            args=[
                ast.Call(ast.Name("input"), args=[], keywords=[]),
            ],
            keywords=[],
        )
        return ReplacementAction(node, target_func)
session = refactor.Session([ReplaceNexts])
source = """\
def solution(inputs: Nexter):
    # blahblah some code here and there
    n = inputs.next_int()
    sub_process(inputs)
    st = inputs.next_str()
    sub_process(st)
"""
print(session.run(source))
$ python t.py
def solution(inputs: Nexter):
    # blahblah some code here and there
    n = int(input())
    sub_process(inputs)
    st = str(input())
    sub_process(st)
I know this is super basic, and I have been searching everywhere, but I am still confused by everything I'm seeing. I'm not sure of the best way to do this and am having a hard time wrapping my head around it.
I have a script with multiple functions. I would like the first function to pass its output to the second, the second to pass its output to the third, and so on. Each performs its own step in an overall process applied to the starting dataset.
For example, very simplified and with bad names, but just to show the basic structure:
#!/usr/bin/python
# script called process.py
import sys
infile = sys.argv[1]
def function_one():
    do things
    return function_one_output
def function_two():
    take output from function_one, and do more things
    return function_two_output
def function_three():
    take output from function_two, do more things
    return/print function_three_output
I want this to run as one script and print the output, write it to a new file, or whatever, which I know how to do. I'm just unclear on how to pass the intermediate output of each function to the next.
infile -> function_one -> (intermediate1) -> function_two -> (intermediate2) -> function_three -> final result/outfile
I know I need to use return, but I am unsure how to call this at the end to get my final output
Individually?
function_one(infile)
function_two()
function_three()
or within each other?
function_three(function_two(function_one(infile)))
or within the actual function?
def function_one():
    do things
    return function_one_output
def function_two():
    input_for_this_function = function_one()
    # etc etc etc
Thank you friends, I am over complicating this and need a very simple way to understand it.
You could define a data streaming helper function
from functools import reduce
def flow(seed, *funcs):
    return reduce(lambda arg, func: func(arg), funcs, seed)
flow(infile, function_one, function_two, function_three)
#for example
flow('HELLO', str.lower, str.capitalize, str.swapcase)
#returns 'hELLO'
edit
I would now suggest that a more "pythonic" way to implement the flow function above is:
def flow(seed, *funcs):
    for func in funcs:
        seed = func(seed)
    return seed
As ZdaR mentioned, you can run each function and store the result in a variable then pass it to the next function.
def function_one(file):
    do things on file
    return function_one_output
def function_two(myData):
    doThings on myData
    return function_two_output
def function_three(moreData):
    doMoreThings on moreData
    return/print function_three_output
def Main():
    firstData = function_one(infile)
    secondData = function_two(firstData)
    function_three(secondData)
This is assuming your function_three would write to a file or doesn't need to return anything. Another method, if these three functions will always run together, is to call them inside function_three. For example...
def function_three(file):
    firstStep = function_one(file)
    secondStep = function_two(firstStep)
    doThings on secondStep
    return/print to file
Then all you have to do is call function_three in your main and pass it the file.
For safety, readability and debugging ease, I would temporarily store the results of each function.
def function_one():
    do things
    return function_one_output
def function_two(function_one_output):
    take function_one_output and do more things
    return function_two_output
def function_three(function_two_output):
    take function_two_output and do more things
    return/print function_three_output
result_one = function_one()
result_two = function_two(result_one)
result_three = function_three(result_two)
The added benefit here is that you can then check that each function is correct. If the end result isn't what you expected, just print the results you're getting or perform some other check to verify them. (Also, if you're running in the interactive interpreter, they will stay in the namespace after the script ends, so you can test them interactively.)
result_one = function_one()
print result_one
result_two = function_two(result_one)
print result_two
result_three = function_three(result_two)
print result_three
Note: I used multiple result variables, but as PM 2Ring notes in a comment you could just reuse the name result over and over. That'd be particularly helpful if the results would be large variables.
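As a runnable illustration of this pattern with a single reused result name, here is a small Python 3 sketch. The three step functions are invented stand-ins (split into lines, clean up, count), since the question does not say what the real steps do.

```python
def function_one(text):
    # Step 1: split the raw text into lines.
    return text.splitlines()

def function_two(lines):
    # Step 2: strip whitespace and drop empty lines.
    return [line.strip() for line in lines if line.strip()]

def function_three(lines):
    # Step 3: count the remaining lines.
    return len(lines)

raw = "alpha\n\n  beta  \ngamma\n"

result = function_one(raw)
print(result)   # inspect the first intermediate value
result = function_two(result)
print(result)   # inspect the second intermediate value
result = function_three(result)
print(result)   # final result
```

Each print can be removed once the corresponding step is known to behave correctly.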
It's always better (for readability, testability and maintainability) to keep your functions as decoupled as possible, and to write them so that the output depends only on the input whenever possible.
So in your case, the best way is to write each function independently, i.e.:
def function_one(arg):
    do_something()
    return function_one_result
def function_two(arg):
    do_something_else()
    return function_two_result
def function_three(arg):
    do_yet_something_else()
    return function_three_result
Once you're there, you can of course directly chain the calls:
result = function_three(function_two(function_one(arg)))
but you can also use intermediate variables and try/except blocks if needed for logging / debugging / error handling etc:
r1 = function_one(arg)
logger.debug("function_one returned %s", r1)
try:
    r2 = function_two(r1)
except SomePossibleException as e:
    logger.exception("function_two raised %s for %s", e, r1)
    # either return, re-raise, ask the user what to do etc
    return 42 # when in doubt, always return 42 !
else:
    r3 = function_three(r2)
    print "Yay ! result is %s" % r3
As an extra bonus, you can now reuse these three functions anywhere, each on its own and in any order.
NB : of course there ARE cases where it just makes sense to call a function from another function... Like, if you end up writing:
result = function_three(function_two(function_one(arg)))
everywhere in your code AND it's not an accidental repetition, it might be time to wrap the whole in a single function:
def call_them_all(arg):
    return function_three(function_two(function_one(arg)))
Note that in this case it might be better to decompose the calls, as you'll find out when you have to debug it...
I'd do it this way:
def function_one(x):
    # do things
    output = x ** 1
    return output
def function_two(x):
    output = x ** 2
    return output
def function_three(x):
    output = x ** 3
    return output
Note that I have modified the functions to accept a single argument, x, and added a basic operation to each.
This has the advantage that each function is independent of the others (loosely coupled), which allows them to be reused in other ways. In the example above, function_two() returns the square of its argument, and function_three() the cube of its argument. Each can be called independently from elsewhere in your code, without being entangled in some hardcoded call chain such as you would have if you called one function from another.
You can still call them like this:
>>> x = function_one(3)
>>> x
3
>>> x = function_two(x)
>>> x
9
>>> x = function_three(x)
>>> x
729
which lends itself to error checking, as others have pointed out.
Or like this:
>>> function_three(function_two(function_one(2)))
64
if you are sure that it's safe to do so.
And if you ever wanted to calculate the square or cube of a number, you can call function_two() or function_three() directly (but, of course, you would name the functions appropriately).
With d6tflow you can easily chain together complex data flows and execute them. You can quickly load input and output data for each task. It makes your workflow very clear and intuitive.
import d6tflow
class Function_one(d6tflow.tasks.TaskCache):
    def run(self):
        function_one_output = do_things()
        self.save(function_one_output) # instead of return
@d6tflow.requires(Function_one)
class Function_two(d6tflow.tasks.TaskCache):
    def run(self):
        output_from_function_one = self.inputLoad() # load function input
        function_two_output = do_more_things()
        self.save(function_two_output)
@d6tflow.requires(Function_two)
class Function_three(d6tflow.tasks.TaskCache):
    def run(self):
        output_from_function_two = self.inputLoad()
        function_three_output = do_more_things()
        self.save(function_three_output)
d6tflow.run(Function_three()) # executes all functions
function_one_output = Function_one().outputLoad() # get function output
function_three_output = Function_three().outputLoad()
It has many more useful features like parameter management, persistence, intelligent workflow management. See https://d6tflow.readthedocs.io/en/latest/
Calling them as function_three(function_two(function_one(infile))) would be best: you do not need global variables, and each function is completely independent of the others.
Edited to add:
I would also say that function_three should not print anything. If you want to print the returned results, use:
print function_three(function_two(function_one(infile)))
or something like:
output = function_three(function_two(function_one(infile)))
print output
Use parameters to pass the values:
def function1():
    foo = do_stuff()
    return function2(foo)
def function2(foo):
    bar = do_more_stuff(foo)
    return function3(bar)
def function3(bar):
    baz = do_even_more_stuff(bar)
    return baz
def main():
    thing = function1()
    print thing
I am creating a word parsing class and I keep getting a
bound method Word_Parser.sort_word_list of <__main__.Word_Parser instance at 0x1037dd3b0>
error when I run this:
class Word_Parser:
    """docstring for Word_Parser"""
    def __init__(self, sentences):
        self.sentences = sentences
    def parser(self):
        self.word_list = self.sentences.split()
    def sort_word_list(self):
        self.sorted_word_list = self.word_list.sort()
    def num_words(self):
        self.num_words = len(self.word_list)
test = Word_Parser("mary had a little lamb")
test.parser()
test.sort_word_list()
test.num_words()
print test.word_list
print test.sort_word_list
print test.num_words
There's no error here. You're printing a function, and that's what functions look like.
To actually call the function, you have to put parens after that. You're already doing that above. If you want to print the result of calling the function, just have the function return the value, and put the print there. For example:
print test.sort_word_list()
On the other hand, if you want the function to mutate the object's state, and then print the state some other way, that's fine too.
Now, your code seems to work in some places, but not others; let's look at why:
parser sets a variable called word_list, and you later print test.word_list, so that works.
sort_word_list sets a variable called sorted_word_list, and you later print test.sort_word_list—that is, the function, not the variable. So, you see the bound method. (Also, as Jon Clements points out, even if you fix this, you're going to print None, because that's what sort returns.)
num_words sets a variable called num_words, and you again print the function—but in this case, the variable has the same name as the function, meaning that you're actually replacing the function with its output, so it works. This is probably not what you want to do, however.
(There are cases where, at first glance, that seems like it might be a good idea: you only want to compute something once, and then access it over and over again without constantly recomputing it. But this isn't the way to do it. Either use a @property, or use a memoization decorator.)
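As one possible shape for the "compute once, then reuse" idea, the sketch below uses functools.cached_property. That decorator only exists in Python 3.8+, so it does not apply to the Python 2 code in the question as written. The class name WordParser and the attribute layout are my own simplification.

```python
from functools import cached_property

class WordParser:
    def __init__(self, sentences):
        self.sentences = sentences

    @cached_property
    def word_list(self):
        # Computed on first access, then stored on the instance.
        return self.sentences.split()

    @cached_property
    def num_words(self):
        return len(self.word_list)

test = WordParser("mary had a little lamb")
# Accessed like plain attributes, without parentheses.
print(test.word_list)
print(test.num_words)
```

The cached value ends up in the instance's `__dict__` under the same name, which is the controlled version of what the original num_words method did by accident.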
This problem happens as a result of calling a method without brackets. Take a look at the example below:
class SomeClass(object):
    def __init__(self):
        print 'I am starting'
    def some_meth(self):
        print 'I am a method()'
x = SomeClass()
''' Leaving out the brackets after the method name prints the bound method object instead of calling it '''
print x.some_meth
''' However this is how it should be called and it does solve it '''
x.some_meth()
You have an instance method called num_words, but you also have a variable called num_words. They have the same name. When you run num_words(), the function replaces itself with its own output, which probably isn't what you want to do. Consider returning your values.
To fix your problem, change def num_words to something like def get_num_words and your code should work fine. Also, change print test.sort_word_list to print test.sorted_word_list.
For this you can use @property as a decorator, so that instance methods can be accessed like attributes. Each property has to return its value, and in Python 2 the class must inherit from object. For example:
class Word_Parser(object):
    def __init__(self, sentences):
        self.sentences = sentences
    @property
    def word_list(self):
        return self.sentences.split()
    @property
    def sorted_word_list(self):
        return sorted(self.word_list)
    @property
    def num_words(self):
        return len(self.word_list)
test = Word_Parser("mary had a little lamb")
print test.word_list
print test.sorted_word_list
print test.num_words
so you can access the attributes without calling (i.e., without the ()).
I think you meant print test.sorted_word_list instead of print test.sort_word_list.
In addition list.sort() sorts a list in place and returns None, so you probably want to change sort_word_list() to do the following:
self.sorted_word_list = sorted(self.word_list)
You should also consider either renaming your num_words() function, or changing the attribute that the function assigns to, because currently you overwrite the function with an integer on the first call.
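The difference between the two sorting calls is easy to check in isolation. This is a small Python 3 sketch using the sentence from the question.

```python
words = "mary had a little lamb".split()

# sorted() returns a new sorted list and leaves the original untouched.
new_list = sorted(words)
assert words == ["mary", "had", "a", "little", "lamb"]

# list.sort() sorts the list in place and returns None.
result = words.sort()
print(new_list)
print(result)
print(words)
```

So assigning the return value of `self.word_list.sort()` stores None, while the original list itself ends up sorted as a side effect.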
The underlying problem is that method and variable names shadow each other. In the current version sort_word_list() is a method and sorted_word_list is a variable, whereas num_words is both. Also, list.sort() sorts the list in place and returns None, whereas the sorted(list) function returns a new sorted list.
But I suspect this indicates a design problem. What's the point of calls like
test.parser()
test.sort_word_list()
test.num_words()
which don't do anything? You should probably just have the methods figure out whether the appropriate counting and/or sorting has already been done, do the count or sort if it hasn't, and then return the result.
E.g.,
def sort_word_list(self):
    # assumes __init__ sets self.sorted_word_list = None
    if self.sorted_word_list is None:
        self.sorted_word_list = sorted(self.word_list)
    return self.sorted_word_list
(Alternately, you could use properties.)
Your helpful comments led me to the following solution:
class Word_Parser:
    """docstring for Word_Parser"""
    def __init__(self, sentences):
        self.sentences = sentences
    def parser(self):
        self.word_list = self.sentences.split()
        word_list = []
        word_list = self.word_list
        return word_list
    def sort_word_list(self):
        self.sorted_word_list = sorted(self.sentences.split())
        sorted_word_list = self.sorted_word_list
        return sorted_word_list
    def get_num_words(self):
        self.num_words = len(self.word_list)
        num_words = self.num_words
        return num_words
test = Word_Parser("mary had a little lamb")
test.parser()
test.sort_word_list()
test.get_num_words()
print test.word_list
print test.sorted_word_list
print test.num_words
and returns:
['mary', 'had', 'a', 'little', 'lamb']
['a', 'had', 'lamb', 'little', 'mary']
5
Thank you all.
A bound method error also occurs (in a Django app, for instance) if you do something like the following:
class Products(models.Model):
    product_category = models.ForeignKey(ProductCategory, on_delete=models.PROTECT)
    def product_category(self):
        return self.product_category
That is, if you give a method the same name as a field.