As a simplified example, say I have the following if statement
if x > 5:
    score = 1
elif x <= 5:
    score = 2
How can I replace this with a dictionary? I want something like
score = {x > 5: 1, x <= 5: 2}[x]
My actual if statement is quite long, so I'd like to use this condensed construct.
Instead of a dictionary, you can use a list of tuples, where one item is a function and the other is a value. You loop through the list, calling each function; when the function returns a truthy value you return the corresponding value.
def greater_than_5(x):
    return x > 5
def less_or_equal_5(x):
    return x <= 5
scores = [(greater_than_5, 1), (less_or_equal_5, 2)]
for f, val in scores:
    if f(x):
        score = val
        break
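As a hedged sketch of the same idea, the predicates can be written inline as lambdas and the first match picked with next(). The names RULES and score_for are made up for this illustration, and returning None when nothing matches is an assumption about the desired fallback.

```python
# Ordered (predicate, value) pairs; the first predicate that returns
# a truthy result wins. RULES and score_for are illustrative names.
RULES = [
    (lambda x: x > 5, 1),
    (lambda x: x <= 5, 2),
]

def score_for(x, default=None):
    """Return the value paired with the first matching predicate."""
    return next((val for pred, val in RULES if pred(x)), default)
```

Adding a new rule is then a one-line change to the list, and the order of the list decides which rule wins when several match.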
Using if-else will be the most readable solution. It also avoids the problem that a dictionary would need an unbounded number of keys to express something like x > 5 (one entry for every number from 6 upward).
A dictionary only works if you know that the domain is limited (e.g., x is always less than 1000). The simple if statement will most likely be faster, too.
To express ranges, you can use this short notation:
if 4 <= x <= 10:
    ...
If you have a very complicated formula (lots of if-else statements) and a limited number of values, using a precomputed table makes sense, though. This is also what compilers do when they translate complicated switch-case statements in C-like languages and the values are dense: they implement them as a lookup table.
I would still recommend starting by writing the formula as if-else statements, and then using a function that computes the dictionary from the if-else code and the range of possible values. To fill the dictionary, iterate over the range of possible values:
def score(x):
    if x > 5:
        return 1
    elif x <= 5:
        return 2
    ...
d = dict((i, score(i)) for i in range(1, 100))
d[x]
I am trying to do two things in Python:
Simulate 100 random draws from a Poisson distribution. I have done this by:
sample100 = poisson.rvs(mu=5,size=100)
Take the above sample, and apply a UMP test I've generated to each individual observation (e.g., test the hypothesis against each individual observation). The test should accept the null hypothesis if the observation has a value < 8; reject with probability ~50% if the observation has value = 8; and reject if the observation has value > 8.
I cannot figure out how to do the second part of this. The function code I've made is:
def optionaltest(y,k,g):
    if (y > k):
        return 1
    if (y == k):
        if rand(uniform(0,1)) >= 0.4885: return 1
        else: return 0
    if (y < k):
        return 0
But there are two issues - apparently if (y==k) is invalid syntax. Second, even if I remove that part, I can't actually apply the function to sample100 since it is an array.
How can I modify this to make it work? Clearly, I'm very new to Python but I have been scouring the internet for hours. Perhaps I should change how I'm generating my sample data so I can apply a function to it? Maybe there's a way to apply a function to each element of an array? How do I make the test logic work when the output = k (which I will set to 8 in this case)?
EDIT/UPDATE:
Here's how I ended up doing it:
def optionaltest(y):
    if (y > 8):
        return 1
    if (y == 8):
        if np.random.uniform(0,1) >= 0.4885: return 1
        else: return 0
    if (y < 8):
        return 0
I was able to apply that test to my array data via:
results_sample100 = list(map(optionaltest, sample100))
cl.Counter(results_sample100)
This is invalid python syntax
if rand(uniform(0,1)) >= 0.4885 then 1
else 0
Instead, you could do this:
return 1 if rand(uniform(0,1)) >= 0.4885 else 0
You could also do something more verbose but potentially more straightforward (this is often a matter of taste), like this:
def optionaltest(y,k,g):
    if (y > k):
        return 1
    if (y == k):
        if rand(uniform(0,1)) >= 0.4885:
            return 1
        else:
            return 0
    if (y < k):
        return 0
Or even like this:
def optionaltest(y,k,g):
    if (y > k):
        return 1
    if (y == k) and rand(uniform(0,1)) >= 0.4885:
        return 1
    else:
        return 0
For this question:
Maybe there's a way to apply a function to each element of an array?
You can use a for-loop or map a function over a list:
results = []
for elem in somelist:
    results.append(my_function(elem))
Alternately:
results = list(map(my_function, somelist))
Your function takes three arguments, though, and it's not clear to me where those are coming from. Is your list a list of tuples?
The syntax error comes from the if ... then ... else form, which Python does not have.
In Python, if condition then ... else becomes:
if condition:
    pass
else:
    pass
To apply a function to each element of your list, you can also convert the list to a pandas DataFrame and call apply on the column. For example, if your list is named "data" and your function is "func", do this:
import pandas as pd
data = ['item_1', 'item_2', 'item_3']
df = pd.DataFrame(data, columns=['column_name'])
result = df['column_name'].apply(func)  # Series.apply calls func on each element
I need to replicate this same function but instead of having a list as a parameter I need a dictionary. The idea is that the calculation done by the function is done with the values, and the function returns the keys.
def funcion(dic, Sum):
    Subset = []
    def f(dic, i, Sum):
        if i >= len(dic): return 1 if Sum == 0 else 0
        count = f(dic, i + 1, Sum)
        count += f(dic, i + 1, Sum - dic[i])
        return count
    for i, x in enumerate(dic):
        if f(dic, i + 1, Sum - x) > 0:
            Subset.append(x)
            Sum -= x
    return Subset
The function works if I enter (300, 200, 100, 400), but I need to use something like {1: 300, 2: 200, 3: 100, 4: 400} as input.
So the calculation is done with the values, but it returns the keys that match the condition.
I'm trying to work with dic.keys() and dic.values(), but it's not working. Could you help me?
Thank you so much.
Your code isn't working with your dictionary because it's expecting to be able to index into dic with numeric indexes starting at 0 and going up to len(dic)-1. However, you've given your dictionary keys that start at 1 and go to len(dic). That means you need to change things up.
The first change is in the recursive f function, where you need the base case to trigger on i > len(dic) rather than using the >= comparison.
The next change is in the loop that calls f. Rather than using enumerate, which will generate indexes starting at 0 (and pair them with the keys of the dictionary, which is what you get when you directly iterate on it), you probably want to do something else.
Now, ideally, you'd want to iterate on dic.items(), which would give you index, value pairs just like your code expects. But depending on how the dictionary gets built, that might iterate over the values in a different order than you expect. In recent versions of Python, dictionaries maintain the order their keys were added in, so if you're creating the dictionary with {1:300, 2:200, 3:100, 4:400 }, you'll get the right order, but a mostly-equivalent dictionary like {3:100, 4:400, 1:300, 2:200 } would give its results in a different order.
So if you need to be resilient against dictionaries that don't have their keys in the right order, you probably want to directly generate the 1-len(dict) keys yourself with range, and then index to get the x value inside the loop:
for i in range(1, len(dic)+1): # Generate the keys directly from a range
    x = dic[i] # and do the indexing manually.
    if f(dic, i + 1, Sum - x) > 0: # The rest of the loop is the same as before.
        Subset.append(x)
        Sum -= x
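Putting both changes together, and appending the key i rather than the value x so that the function returns keys as the question asks, gives something like the sketch below. It assumes the keys are exactly 1..len(dic) and keeps the original exponential-time recursion; subset_keys and count_subsets are names chosen for this illustration.

```python
def subset_keys(dic, target):
    """Return keys whose values sum to target (keys assumed to be 1..len(dic))."""
    def count_subsets(i, remaining):
        # Number of subsets of the values at keys i..len(dic) that sum to remaining.
        if i > len(dic):
            return 1 if remaining == 0 else 0
        return (count_subsets(i + 1, remaining)
                + count_subsets(i + 1, remaining - dic[i]))

    keys = []
    for i in range(1, len(dic) + 1):
        x = dic[i]
        # Keep key i if the later values can still make up the rest.
        if count_subsets(i + 1, target - x) > 0:
            keys.append(i)
            target -= x
    return keys

chosen = subset_keys({1: 300, 2: 200, 3: 100, 4: 400}, 700)
```

Because the keys come from range rather than from iterating the dictionary, the result does not depend on the order in which the entries were inserted.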
I come primarily from Java and wanted to know if Python can use conditionals and different kinds of incrementing inside its for loops like Java and C can. Sorry if this seems like a simple question.
i.e.:
boolean flag = true;
for(int i = 1; i < 20 && flag; i *= 2) {
    //Code in here
}
Not directly. A for loop iterates over a pre-generated sequence, rather than generating the sequence itself. The naive translation would probably look something like
flag = True
i = 1
while i < 20:
    if not flag:
        break
    ...
    if some_condition:
        flag = False
    i *= 2
However, your code probably could execute the break statement wherever you set flag to False, so you could probably get rid of the flag altogether.
i = 1
while i < 20:
    ...
    if some_condition:
        break
    i *= 2
Finally, you can define your own generator to iterate over
def powers_of_two():
    i = 1
    while True:
        yield i
        i *= 2
for i in powers_of_two():
    ...
    if some_condition:
        break
The for loops in Python are not like for loops in C. They are like the for-each loops over iterables that were introduced in Java 5:
for (String name : new TreeSet<String>(nameList)) {
    // Your code here
}
If you want control over your iterator variable, then a while or for loop with a break in it is probably the cleanest way to achieve that kind of control.
This might be a good time to work through a tutorial on Python comprehensions. Even though they are not directly applicable to your question, they are the feature I have appreciated most since coming from Java about five years ago.
You can use range() if the step is a constant increment (like i++, i += 10, etc.). The syntax is:
range(start,stop,step)
range(start, stop, step) is used as a replacement for for (int i = start; i < stop; i += step). It doesn't work with multiplication, but you can still use it (with break) if you have something like i < stop && condition.
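For instance, a loop of the form for (int i = start; i < stop && condition; i += step) might be sketched as follows. The bounds, the step, and the stand-in condition (stop once a running total passes 20) are made up for this illustration.

```python
visited = []
total = 0

# range() supplies the start/stop/step part of the C-style loop header.
for i in range(0, 50, 10):
    # The "&& condition" part becomes an explicit check with break.
    if total > 20:
        break
    visited.append(i)
    total += i
```

The check sits at the top of the body so that, as in the C-style loop, the condition is tested before each iteration runs.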
The equivalent of the loop in your question would be:
for(int i=1;i<20;i*=2) // C/C++ loop
# Python
i = 1
while i < 20: # Equivalent Python loop
    # Code here
    i *= 2
If you want to use a flag as well as a condition, you can write:
// C/C++
bool flag = true;
for(int i=1;i<20&&flag;i*=2) // C/C++ loop
# Python
i, flag = 1, True
while flag and i < 20: # Equivalent Python loop
    # Code here
    i *= 2
Hope this helps !
In a sense, but it's not quite as simple as it is with JS and Java.
Here is your example written in Python using a while loop with two conditions. Note that in a Python while loop you cannot initialize or increment the index in the loop header; the increment goes at the end of the body.
boolean_flag = True
i = 1
while i < 20 and boolean_flag:
    # Code in here
    i *= 2
The answers above are good and efficient for what you ask, but here is how I would do it. Note that this steps through the even numbers below the limit (0, 2, 4, ...) rather than doubling i.
limit = 20
for i in range(0, limit // 2):
    c = i * 2
    if flag:
        # Do work.
        break
or to make it shorter:
limit = 20
for i in range(0, limit, 2):
    if flag:
        # Do work.
        break
First, Python has no increment or decrement operators like those in C++ or Java (e.g., x++ or --x). A for loop in Python works over an iterable (for example, a list or a string).
PYTHON FOR LOOPS:
A for loop is used for iterating over a sequence (that is, a list, a tuple, a dictionary, a set, or a string).
This is less like the for keyword in other programming languages, and works more like an iterator method as found in other object-orientated programming languages.
With the for loop we can execute a set of statements, once for each item in a list, tuple, set etc.
Example
Print each fruit in a fruit list:
fruits = ["apple", "banana", "cherry"]
for x in fruits:
    print(x)
will print:
apple
banana
cherry
Example
Do not print banana:
fruits = ["apple", "banana", "cherry"]
for x in fruits:
    if x == "banana":
        continue
    print(x)
PYTHON CONDITIONALS:
In python the keyword for false values is False, and that for true values is True
Like C++ or Java, you can use == to compare values. But unlike Java, where there is strict type-checking and the condition needs to be a Boolean Statement, in Python:
Almost any value is evaluated to True if it has some sort of content.
Any string is True, except empty strings.
Any number is True, except 0.
Any list, tuple, set, and dictionary are True, except empty ones.
In fact, there are not many values that evaluate to False, except empty values, such as (), [], {}, "", the number 0, and the value None. And of course the value False evaluates to False.
The following will return False:
bool(False)
bool(None)
bool(0)
bool("")
bool(())
bool([])
bool({})
One more kind of value, or object in this case, evaluates to False: an object made from a class whose __len__ method returns 0 or False:
class myclass():
    def __len__(self):
        return 0
myobj = myclass()
print(bool(myobj))
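A related detail, stated here as a general Python fact rather than something taken from the answer above: if a class defines __bool__, truth testing uses it in preference to __len__. A minimal sketch with made-up class names:

```python
class EmptyByLen:
    # No __bool__, so truthiness falls back to __len__.
    def __len__(self):
        return 0

class TruthyDespiteLen:
    # __bool__ is consulted first, so __len__ is ignored for truth tests.
    def __len__(self):
        return 0

    def __bool__(self):
        return True

by_len = bool(EmptyByLen())
by_bool = bool(TruthyDespiteLen())
```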
Use a while loop: put both the flag and the condition in the loop test, and do the increment inside the loop body.
i = 1
while flag and i < 20:
    # code here
    i = i*2
Sure, but you may need to do some things yourself.
Python provides the range() class, which produces an iterable sequence of values with an optional step increment.
for i in range(1, 20, 2):
    # do the things
    # here, i starts at 1 and increments by 2 each iteration; the loop stops once i >= 20
If you want to do something more complicated like i *= 2, you have two options:
use a while loop and increment the values yourself
write a custom generator like range() that implements such a thing.
Example generator:
def range_step_mult(start, stop, step):
    while start < stop:
        yield start
        start *= step
for i in range_step_mult(1, 100, 2):
    print(i)
Note the use of the yield keyword here. This is VERY important for performance over large ranges, especially if the iterable object is on the larger side. Using the keyword here allows Python to simply deal with one object at a time as opposed to generating all that stuff in memory before it even starts working on it.
You can use conditionals within the for loop, and you can use the break and continue keywords to control the loop flow to some level. That being said, the loop is generated outside of this block, so you can't change the step or something mid loop.
While loops are a different story entirely and you can alter the step as much as you want as you're the one incrementing it in the first place.
Example while loop:
i = 1
while i < 100:
    if i % 2 == 0:
        i *= 2
    else:
        i += 1
In common with several other answers, here is how I would actually translate that code:
boolean_flag = True
i = 1 # actually I wouldn't use i as a variable name, but whatever.
while i < 20 and boolean_flag:
    # code goes here
    i *= 2
I would also consider using a for loop, which might look something like this:
from itertools import takewhile, count
boolean_flag = True
for i in takewhile(lambda i: i < 20, map(lambda x: 2**x, count())):
    # code goes here
    if not boolean_flag:
        break
But having considered both, I prefer the while loop. And in practice I very rarely actually need a flag defined across a loop like that. Normally you can either break from the loop immediately you detect the condition that would cause you to set the flag, or else use logic like this:
while True:  # whatever the enclosing loop is
    boolean_flag = something()
    something_that_has_to_happen_after_regardless_of_flag_value()
    if not boolean_flag:
        break
The need for boolean "break" flags is mostly (not wholly) a result of trying to write all your loops with no break in them, but there's no particular benefit to doing that.
It might be possible to salvage the for loop, or at least learn a few things about Python, by playing around with other ways to write what comes after for. Like lambda i: i < 20 could be (20).__gt__, but that's breathtakingly ugly in its own way. Or map(lambda is always a warning sign, and in this case map(lambda x: 2**x, count()) could be replaced with (2**x for x in count()). Or you can use functools.reduce to change the exponentiation back to multiplication, but it's unlikely to be worth it. Or you can write a generator function, but that's a chunk more boilerplate.
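One way to read the idea of changing the exponentiation back to multiplication is sketched below. It uses itertools.accumulate, swapped in here for the functools.reduce mentioned above because accumulate yields every intermediate product lazily; that substitution is this sketch's assumption, not the answer's.

```python
from itertools import accumulate, chain, repeat, takewhile
from operator import mul

# 1, 1*2, 1*2*2, ... built by repeated multiplication, never exhausted.
doubling = accumulate(chain([1], repeat(2)), mul)

# takewhile plays the role of the "i < 20" loop condition.
values = list(takewhile(lambda i: i < 20, doubling))
```

In a real loop, the takewhile expression would go directly after for instead of being collected into a list.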
Supposing I know that the base-2 logarithm of 20 can only be so big, but I don't want to make myself a hostage to stupid off-by-one errors, I could write something like:
for i in (2**x for x in range(10)):
    if not i < 20:
        break
Or to get rid of all of the "clever" stuff:
for x in range(10):
    i = 2 ** x
    if not (i < 20 and boolean_flag):
        break
But again, this isn't really solving the basic issue that for loops are intended for when you have an iterable containing the right values, and in Python you need to pull several things together to come up with the right iterable for this case (especially if you want to write 20 rather than the logarithm of 20). And that's even before you deal with the flag. So, in general you use a for loop when you have something to iterate over, or can easily produce something to iterate over, and use a while loop for more general looping based on mutating local variables.
Frankly for those specific numbers you might as well write for i in (1, 2, 4, 8, 16) and have done with it!
I have a python dictionary named cdc_year_births.
For cdc_year_births, the keys are the unit (in this case the unit is a year), the values are the number of births in that unit:
print(cdc_year_births)
{2000: 4058814, 2001: 4025933, 2002: 4021726, 2003: 4089950, 1994: 3952767,
1995: 3899589, 1996: 3891494, 1997: 3880894, 1998: 3941553, 1999: 3959417}
I wrote a function that returns the maximum and minimum years and their births. When I started the function, I thought I'd hard code the max and min unit at 0 and 1000000000, respectively, and then iterate through the dictionary and compare each key's value to those hard coded values; if the conditions were met, I'd replace the max/min unit and the max/min birth.
But if the dictionary I used had negative values or values greater than 1000000000, this function wouldn't work, which is why I had to "load in" some actual values from the dictionary with the first loop, then loop over them again.
I built this function but could not get it to work properly:
def max_min_counts(data):
    max_min = {}
    for key,value in data.items():
        max_min["max"] = key,value
        max_min["min"] = key,value
    for key,value in data.items():
        if value >= max_min["max"]:
            max_min["max"]=key,value
        if value <= max_min["min"]:
            max_min["min"]=key,value
    return max_min
t=max_min_counts(cdc_year_births)
print(t)
It results in TypeError: unorderable types: int() >= tuple() for
if value >= max_min["max"]:
and
if value <= max_min["min"]:
I tried extracting the value from the tuple as described in Finding the max and min in dictionary as tuples python, but could not get this to work.
Can anyone help me make the second, shorter function work or show me how to write a better one?
Thank you very much in advance.
Your values are 2-tuples. You'll need one further level of indexing to get them to work:
if value >= max_min["max"][1]:
And,
if value <= max_min["min"][1]:
If you want to preset your max/min values, you can use float('inf') and -float('inf'):
max_min["max"] = (-1, -float('inf')) # Smallest value possible.
max_min["min"] = (-1, float('inf')) # Largest value possible.
You can do this efficiently using max, min, and operator.itemgetter to avoid a lambda:
from operator import itemgetter
max(cdc_year_births.items(), key=itemgetter(1))
# (2003, 4089950)
min(cdc_year_births.items(), key=itemgetter(1))
# (1997, 3880894)
Here's a slick way to compute the max and min with reduce:
from functools import reduce
reduce(lambda x, y: x if x[1] > y[1] else y, cdc_year_births.items())
# (2003, 4089950)
reduce(lambda x, y: x if x[1] < y[1] else y, cdc_year_births.items())
# (1997, 3880894)
items() yields the dictionary's (key, value) pairs as tuples, and the key argument tells max and min what to compare when picking the result.
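If only the year is needed, a shorter variant is to iterate over the keys and use the dictionary's own get method as the key function. This is a sketch on the data from the question; the variable names on the left are chosen for illustration.

```python
cdc_year_births = {
    2000: 4058814, 2001: 4025933, 2002: 4021726, 2003: 4089950,
    1994: 3952767, 1995: 3899589, 1996: 3891494, 1997: 3880894,
    1998: 3941553, 1999: 3959417,
}

# Iterating a dict yields its keys; key=... ranks each year by its births.
max_year = max(cdc_year_births, key=cdc_year_births.get)
min_year = min(cdc_year_births, key=cdc_year_births.get)

# Look the counts back up to rebuild (year, births) pairs.
max_pair = (max_year, cdc_year_births[max_year])
min_pair = (min_year, cdc_year_births[min_year])
```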
In case you're interested in a more functional programming-oriented solution (or just something with more independent component parts), allow me to suggest the following:
Establish a comparison function between entries
Yes, we can use </> to compare the values as we iterate through the dict, but, as will become evident in a moment, it'll be useful to have something which lets us keep track of the year associated with that number of births.
def comp_births(op, lpair, rpair):
    lyr, lbirths = lpair
    ryr, rbirths = rpair
    return rpair if op(rbirths, lbirths) else lpair
At the end of the day, op will end up being either the numerical greater than or the numerical less than, but adding this tuple business accomplishes our goal of keeping track of the year associated with the number of births. Further, by factoring op out into a function parameter, rather than hard-coding the operator, we open the door for reusing this code for both the "min" and "max" variations.
Construct your iteratees
Now, all we need to do to create a function that compares two year/num_births pairs is partially apply our comparison function:
from functools import partial
from operator import gt, lt
get_max = partial(comp_births, gt)
get_min = partial(comp_births, lt)
get_max((2003, 150), (2012, 400)) #=> (2012, 400)
Pipe in your data
So where do we find these year/num_births pairs? Turns out it's just cdc_year_births.items(). And since we're lazy, let's use a function to do the iteration for us (reduce):
from functools import reduce
yr_of_max_births, max_births = reduce(get_max, cdc_year_births.items())
yr_of_min_births, min_births = reduce(get_min, cdc_year_births.items())
You need to compare against the value, not the entire tuple:
if value >= max_min["max"][1]:
As for not using the built-in functions, are you averse to using other built-ins? For instance, you could use reduce with a simple function -- x if x[1] < y[1] else y -- to get the minimum of all the entries. You could also sort the entries with x[1] as the key, then take the first and last elements of the sorted list.
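The sorting approach might look like the sketch below, using the data from the question. Sorting does more work than a single pass over the entries, but it also gives the full ranking; by_births is a name chosen for this illustration.

```python
cdc_year_births = {
    2000: 4058814, 2001: 4025933, 2002: 4021726, 2003: 4089950,
    1994: 3952767, 1995: 3899589, 1996: 3891494, 1997: 3880894,
    1998: 3941553, 1999: 3959417,
}

# Sort the (year, births) pairs by births, smallest first.
by_births = sorted(cdc_year_births.items(), key=lambda pair: pair[1])

# The ends of the sorted list are the minimum and maximum entries.
min_pair, max_pair = by_births[0], by_births[-1]
```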
Yeah, I'm up to this exercise too.
Without using max and min functions (we haven't covered them yet in the course material) here's the hard way...
def minimax(dict):
    minimax_dict = {}
    if(len(dict) == 31):
        time = "day_of_month"
    elif(len(dict) == 12):
        time = "month"
    elif(len(dict) == 7):
        time = "day_of_week"
    else:
        time = 'year'
    min_time = "min_" + time
    max_time = "max_" + time
    for item in dict:
        if 'min_count' in minimax_dict:
            if dict[item] < minimax_dict['min_count']:
                minimax_dict['min_count'] = dict[item]
                minimax_dict[min_time] = item
        else:
            minimax_dict['min_count'] = dict[item]
            minimax_dict[min_time] = item
        if 'max_count' in minimax_dict:
            if dict[item] > minimax_dict['max_count']:
                minimax_dict['max_count'] = dict[item]
                minimax_dict[max_time] = item
        else:
            minimax_dict['max_count'] = dict[item]
            minimax_dict[max_time] = item
    return minimax_dict
#here's the test stuff...
min_max_dow_births = minimax(cdc_dow_births)
#min_max_dow_births
min_max_year_births = minimax(cdc_year_births)
#min_max_year_births
min_max_dom_births = minimax(cdc_dom_births)
#min_max_dom_births
min_max_month_births = minimax(cdc_month_births)
#min_max_month_births
I was trying to create a list comprehension from a function that I had and I came across an unexpected behavior. Just for a better understanding, my function gets an integer and checks which of its digits divides the integer exactly:
# Full function
divs = list()
for i in str(number):
    digit = int(i)
    if digit > 0 and number % digit == 0:
        divs.append(digit)
return len(divs)
# List comprehension
return len([x for x in str(number) if x > 0 and number % int(x) == 0])
The problem is that, if I give a 1012 as an input, the full function returns 3, which is the expected result. The list comprehension returns a ZeroDivisionError: integer division or modulo by zero instead. I understand that it is because of this condition:
if x > 0 and number % int(x) == 0
In the full function, the multiple condition is handled from the left to the right, so it is fine. In the list comprehension, I do not really know, but I was guessing that it was not handled in the same way.
Until I tried with a simpler function:
# Full function
positives = list()
for i in numbers:
    if i > 0 and 20 % i == 0:
        positives.append(i)
return positives
# List comprehension
return [i for i in numbers if i > 0 and 20 % i == 0]
Both of them worked. So I am thinking that maybe it has something to do with the number % int(x)? This is just curiosity on how this really works? Any ideas?
The list comprehension is different, because you compare x > 0 without converting x to int. In Py2, mismatched types will compare in an arbitrary and stupid but consistent way, which in this case sees all strs (the type of x) as greater than all int (the type of 0) meaning that the x > 0 test is always True and the second test always executes (see Footnote below for details of this nonsense). Change the list comprehension to:
[x for x in str(number) if int(x) > 0 and number % int(x) == 0]
and it will work.
Note that you could simplify a bit further (and limit redundant work and memory consumption) by importing a Py3 version of map at the top of your code (from future_builtins import map), and using a generator expression with sum, instead of a list comprehension with len:
return sum(1 for i in map(int, str(number)) if i > 0 and number % i == 0)
That only calls int once per digit, and constructs no intermediate list.
Footnote: 0 is a numeric type, and all numeric types are "smaller" than everything except None, so a str is always greater than 0. In non-numeric cases, it would be comparing the string type names, so dict < frozenset < list < set < str < tuple, except oops, frozenset and set compare "naturally" to each other, so you can have non-transitive relationships; frozenset() < [] is true, [] < set() is true, but frozenset() < set() is false, because the type specific comparator gets invoked in the final version. Like I said, arbitrary and confusing; it was removed from Python 3 for a reason.
You should say int(x) > 0 in the list comprehension
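For readers on Python 3, where comparing a str with an int raises TypeError instead of silently succeeding, the corrected function might look like the sketch below. count_dividing_digits is a name made up here, and the sketch assumes a non-negative integer input.

```python
def count_dividing_digits(number):
    """Count the digits of number that divide it exactly (zeros are skipped)."""
    return sum(1 for d in map(int, str(number)) if d > 0 and number % d == 0)

result = count_dividing_digits(1012)

# In Python 3 the original mistake fails loudly rather than comparing oddly.
try:
    "1" > 0
    mixed_comparison_allowed = True
except TypeError:
    mixed_comparison_allowed = False
```

Converting each digit once with map(int, ...) means the d > 0 guard and the modulo both see an int, so the short-circuit protects against division by zero as intended.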