How to avoid a KeyError when applying a function?


I have a function that looks at the sales quantities of each product in a transaction and returns a value based on the quantity.
I need the function to work for any number of products in a transaction; presently the maximum is 6 different products in a transaction, but I want the function to work even if that increases to say 9 products.
I defined the function to compare up to 20 products, but I get a KeyError when applying it as columns 'Product_Quantity_7' onwards do not exist.
Could someone please share a method by which I can make the function robust to any number of products in my dataset (I've attached a snapshot of my dataset as well)?
def trial_function(row, df):
    if row['Product_Quantity_1'] >= 6:
        return 6
    elif row['Product_Quantity_1'] >= 3:
        return 3
    elif row['Product_Quantity_2'] >= 6:
        return 6
    elif row['Product_Quantity_2'] >= 3:
        return 3
    elif row['Product_Quantity_3'] >= 6:
        return 6
    elif row['Product_Quantity_3'] >= 3:
        return 3
    elif row['Product_Quantity_4'] >= 6:
        return 6
    elif row['Product_Quantity_4'] >= 3:
        return 3
    elif row['Product_Quantity_5'] >= 6:
        return 6
    elif row['Product_Quantity_5'] >= 3:
        return 3
    elif row['Product_Quantity_6'] >= 6:
        return 6
    elif row['Product_Quantity_6'] >= 3:
        return 3
    elif row['Product_Quantity_7'] >= 6:
        return 6
    elif row['Product_Quantity_7'] >= 3:
        return 3
    elif row['Product_Quantity_8'] >= 6:
        return 6
    elif row['Product_Quantity_8'] >= 3:
        return 3
    elif row['Product_Quantity_9'] >= 6:
        return 6
    elif row['Product_Quantity_9'] >= 3:
        return 3
Here's what my dataset looks like
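One possible way to make the function robust is to loop over whichever quantity columns the row actually has, instead of naming them one by one. The sketch below assumes that every quantity column is named Product_Quantity_<n> with an integer suffix, and that a row in which no product reaches a quantity of 3 should return None (the original function leaves that case implicit). The names quantity_tier and prefix are illustrative and do not come from the question.

```python
import pandas as pd


def quantity_tier(row, prefix="Product_Quantity_"):
    """Return 6 or 3 for the first product whose quantity reaches that tier."""
    # Keep only the quantity columns present in this row, ordered by their
    # numeric suffix so the checks run in the same order as the if/elif chain.
    columns = sorted(
        (col for col in row.index if str(col).startswith(prefix)),
        key=lambda col: int(str(col)[len(prefix):]),
    )
    for col in columns:
        quantity = row[col]
        if quantity >= 6:
            return 6
        if quantity >= 3:
            return 3
    # No product reached a quantity of 3 (NaN quantities also end up here).
    return None


# Small made-up frame with only two product columns.
df = pd.DataFrame({
    "Product_Quantity_1": [1, 7, 2],
    "Product_Quantity_2": [4, 0, 1],
})
tiers = [quantity_tier(row) for _, row in df.iterrows()]
```

On a real dataset the same function could be passed to df.apply(quantity_tier, axis=1). Because it never refers to a column that does not exist, it should keep working if the number of products grows from 6 to 9 or beyond.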

Related

Why won't second for loop execute correctly?

I'm trying to write two for loops that will return a score for different inputs, and create a new field with the new score. The first loop works fine but the second loop never returns the correct score.
import pandas as pd
d = {'a':['foo','bar'], 'b':[1,3]}
df = pd.DataFrame(d)
score1 = df.loc[df['a'] == 'foo']
score2 = df.loc[df['a'] == 'bar']
for i in score1['b']:
    if i < 3:
        score1['c'] = 0
    elif i <= 3 and i < 4:
        score1['c'] = 1
    elif i >= 4 and i < 5:
        score1['c'] = 2
    elif i >= 5 and i < 8:
        score1['c'] = 3
    elif i == 8:
        score1['c'] = 4
for j in score2['b']:
    if j < 2:
        score2['c'] = 0
    elif j <= 2 and i < 4:
        score2['c'] = 1
    elif j >= 4 and i < 6:
        score2['c'] = 2
    elif j >= 6 and i < 8:
        score2['c'] = 3
    elif j == 8:
        score2['c'] = 4
print(score1)
print(score2)
When I run the script, it returns the following:
print(score1)
     a  b  c
0  foo  1  0
print(score2)
     a  b
1  bar  3
Why doesn't score2 create the new field "c" or a score?

Avoid using for loops to conditionally update DataFrame columns; they are not Python lists. Use vectorized pandas and NumPy methods such as numpy.select, which scales to millions of rows. These data science tools compute very differently from general-purpose Python:
import numpy

# LIST OF BOOLEAN CONDITIONS
conds = [
    score1['b'].lt(3),                            # EQUIVALENT TO < 3
    score1['b'].between(3, 4, inclusive="left"),  # EQUIVALENT TO >= 3 and < 4
    score1['b'].between(4, 5, inclusive="left"),  # EQUIVALENT TO >= 4 and < 5
    score1['b'].between(5, 8, inclusive="left"),  # EQUIVALENT TO >= 5 and < 8
    score1['b'].eq(8)                             # EQUIVALENT TO == 8
]
# LIST OF VALUES
vals = [0, 1, 2, 3, 4]
# VECTORIZED ASSIGNMENT
score1['c'] = numpy.select(conds, vals, default=numpy.nan)

# LIST OF BOOLEAN CONDITIONS
conds = [
    score2['b'].lt(2),
    score2['b'].between(2, 4, inclusive="left"),
    score2['b'].between(4, 6, inclusive="left"),
    score2['b'].between(6, 8, inclusive="left"),
    score2['b'].eq(8)
]
# LIST OF VALUES
vals = [0, 1, 2, 3, 4]
# VECTORIZED ASSIGNMENT
score2['c'] = numpy.select(conds, vals, default=numpy.nan)
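For reference, here is a self-contained sketch of the same numpy.select idea that can be run from top to bottom on the question's two-row frame. The helper name score_values and its edges parameter are made up for illustration, and the .copy() calls are an addition so that each slice is an independent frame before a column is added to it.

```python
import numpy as np
import pandas as pd


def score_values(values, edges):
    """Score values 0..4: below edges[0], three half-open ranges, then exactly 8."""
    b = np.asarray(values)
    low, mid, high = edges
    conds = [
        b < low,
        (b >= low) & (b < mid),
        (b >= mid) & (b < high),
        (b >= high) & (b < 8),
        b == 8,
    ]
    # Values that match no condition (for example above 8) become NaN.
    return np.select(conds, [0, 1, 2, 3, 4], default=np.nan)


df = pd.DataFrame({'a': ['foo', 'bar'], 'b': [1, 3]})
score1 = df.loc[df['a'] == 'foo'].copy()
score2 = df.loc[df['a'] == 'bar'].copy()
score1['c'] = score_values(score1['b'], edges=(3, 4, 5))
score2['c'] = score_values(score2['b'], edges=(2, 4, 6))
```

With this data, b = 1 falls below 3 and b = 3 falls in the range from 2 up to (but not including) 4, so both frames receive a 'c' column.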

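On the first iteration of the second for loop, j is 3, so none of your conditions is satisfied and 'c' is never assigned. Note also that the second loop tests i where it should test j. Adjust the ranges so that 3 falls into one of them, for example:
for j in score2['b']:
    if j < 3:
        score2['c'] = 0
    elif j >= 3 and j < 5:
        score2['c'] = 1
    elif j >= 5 and j < 7:
        score2['c'] = 2
    elif j >= 7 and j < 9:
        score2['c'] = 3
    elif j == 9:
        score2['c'] = 4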

Making a card game using nested if | Python

I have a homework assignment. As you can see below, there are some cards and each has a name. They are numbered from 0 to 35, from top left to bottom. The user enters a number and the program has to tell the name, color and number of the card. But there are some rules:
I can only use nested if statements and operators. I can't use while or other functions.
I can use at most 13 if statements
can't use 36 if, else swap
this is the image of cards
I am not sure about what to do after this...
num = int(input('Enter a number: '))
if num == 0 or num == 1 or num == 2 or num == 3:
    print("6")
if num == 4 or num == 5 or num == 6 or num == 7:
    print('7')
if num == 8 or num == 9 or num == 10 or num == 11:
    print('8')
if num == 12 or num == 13 or num == 14 or num == 15:
    print('9')
if num == 16 or num == 17 or num == 18 or num == 19:
    print('10')
if num == 20 or num == 21 or num == 22 or num == 23:
    print('vale')
if num == 24 or num == 25 or num == 26 or num == 27:
    print('queen')
if num == 28 or num == 29 or num == 30 or num == 31:
    print('king')
if num == 32 or num == 33 or num == 34 or num == 35:
    print('tus')

Here is what you are looking for in one line:
print(([str(i) for i in range(6, 11)] + ['vale', 'queen', 'king', 'tus'])[(int(input('Enter a number: '))) // 4])
The goal is to create a list of the available cards, then use Euclidean (integer) division by 4 to select the right card.
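The one-liner can be unpacked into a more readable sketch. It assumes, as the question's own code does, that each group of four consecutive numbers shares one rank. The names RANKS and card_rank are illustrative, and the mapping from the position within a group to a color is left out because the card image is not available here. Note that a list lookup may not satisfy the homework's nested-if rule; this only illustrates the integer-division idea.

```python
RANKS = ['6', '7', '8', '9', '10', 'vale', 'queen', 'king', 'tus']


def card_rank(num):
    """Return (rank, position) for a card number in 0..35."""
    if not 0 <= num < 4 * len(RANKS):
        raise ValueError("card number must be between 0 and 35")
    # Integer division picks the group of four; the remainder is the
    # position of the card inside that group (0, 1, 2 or 3).
    group, position = divmod(num, 4)
    return RANKS[group], position
```

For example, card_rank(21) gives ('vale', 1), because 21 // 4 is 5 and 21 % 4 is 1.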

Matplotlib and Pandas Plotting amount of numbers in certain range

I have a pandas DataFrame that looks like this:
I would like to create this kind of plot for every year [1...10], with the Score range [1...10].
That is, for every year the plot should show:
how many values fall in [0-1] in year 1
how many values fall in [2-3] in year 1
how many values fall in [4-5] in year 1
...
how many values fall in [6-7] in year 10
how many values fall in [8-9] in year 10
how many values equal [10] in year 10
I need some help. Thank you!

The following code works perfectly:
import matplotlib.pyplot as plt
import seaborn as sns

def visualize_yearly_score_distribution(ds, year):
    sns.set_theme(style="ticks")
    first_range = 0
    second_range = 0
    third_range = 0
    fourth_range = 0
    fifth_range = 0
    six_range = 0
    seven_range = 0
    eight_range = 0
    nine_range = 0
    last_range = 0
    score_list = []
    # Note: the strict inequalities below skip scores that are exact integers.
    for index, row in ds.iterrows():
        if row['Publish Date'] == year:
            if 0 < row['Score'] < 1:
                first_range += 1
            if 1 < row['Score'] < 2:
                second_range += 1
            if 2 < row['Score'] < 3:
                third_range += 1
            if 3 < row['Score'] < 4:
                fourth_range += 1
            if 4 < row['Score'] < 5:
                fifth_range += 1
            if 5 < row['Score'] < 6:
                six_range += 1
            if 6 < row['Score'] < 7:
                seven_range += 1
            if 7 < row['Score'] < 8:
                eight_range += 1
            if 8 < row['Score'] < 9:
                nine_range += 1
            if 9 < row['Score'] < 10:
                last_range += 1
    score_list.append(first_range)
    score_list.append(second_range)
    score_list.append(third_range)
    score_list.append(fourth_range)
    score_list.append(fifth_range)
    score_list.append(six_range)
    score_list.append(seven_range)
    score_list.append(eight_range)
    score_list.append(nine_range)
    score_list.append(last_range)
    range_list = ['0-1', '1-2', '2-3', '3-4', '4-5', '5-6', '6-7', '7-8', '8-9', '9-10']
    plt.pie([x*100 for x in score_list], labels=[x for x in range_list], autopct='%0.1f', explode=None)
    plt.title(f"Yearly Score Distribution for {str(year)}")
    plt.tight_layout()
    plt.legend()
    plt.show()
Thank you all for the kind comments :)
This case is closed.
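As a possible shorter alternative, the ten counters can be replaced by a single numpy.histogram call. This sketch assumes the same column names as the code above ('Publish Date' and 'Score'). The function name yearly_score_counts is made up for illustration. Unlike the loop with strict inequalities, it also counts scores that are exact integers.

```python
import numpy as np
import pandas as pd


def yearly_score_counts(ds, year, year_col="Publish Date", score_col="Score"):
    """Count the scores of one year in unit-wide bins from 0 to 10."""
    scores = ds.loc[ds[year_col] == year, score_col].to_numpy(dtype=float)
    edges = np.arange(0, 11)  # 0, 1, ..., 10 gives ten bins
    # numpy.histogram treats every bin as [low, high) except the last one,
    # which is closed on both sides, so a score of exactly 10 is still counted.
    counts, _ = np.histogram(scores, bins=edges)
    labels = [f"{low}-{low + 1}" for low in range(10)]
    return pd.Series(counts, index=labels)


# Made-up sample data: four scores in year 1, one in year 2.
sample = pd.DataFrame({
    "Publish Date": [1, 1, 1, 2, 1],
    "Score": [0.5, 1.0, 9.7, 3.3, 10.0],
})
year_one = yearly_score_counts(sample, 1)
```

The resulting Series could then be passed to plt.pie(year_one, labels=year_one.index) in place of score_list and range_list.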

Python dice outcome count

Write a function collect_sims(nsim,N,D,p=0.5,nmax=10000) that runs your run_sim function nsim times (with parameters N, D, p) and returns a numpy array of length nmax giving the number of times that the simulation took a specified number of steps to stop. For example, suppose nsim was 8 and successive runs of run_sim gave you 3,4,4,3,6,5,4,4. You would tabulate this as “two 3s, four 4s, one 5, one 6, zero 7s, zero 8s …”
def collect_sims(nsim, N, D, p=0.5, nmax=10000):
    run_sim(N=20, D=6, p=0.5, itmax=5000)
    onecount = 0
    twocount = 0
    threecount = 0
    fourcount = 0
    fivecount = 0
    sixcount = 0
for k in range (n):
    if D == 1:
        onecount += 1
    if D == 2:
        twocount += 1
    if D == 3:
        threecount += 1
    if D == 4:
        fourcount += 1
    if D == 5:
        fivecount += 1
    if D == 6:
        sixcount += 1
    return(k)
print(onecount, "1",twocount,"2",threecount,"3",fourcount,"4",fivecount,"5",sixcount,"6")
It says my 6 variables onecount, twocount, etc are not defined, how can I define them? Also, what can I do to fix my code?

I don't know why you are returning k.
Anyway, the problem is that onecount, twocount, etc. are in a different scope than the print. You can put the print() inside the function, or you can return a tuple with the counts.
Something like this:
def collect_sims(nsim, N, D, p=0.5, nmax=10000):
    run_sim(N=20, D=6, p=0.5, itmax=5000)
    onecount = 0
    twocount = 0
    threecount = 0
    fourcount = 0
    fivecount = 0
    sixcount = 0
    for k in range (n):
        if D == 1:
            onecount += 1
        if D == 2:
            twocount += 1
        if D == 3:
            threecount += 1
        if D == 4:
            fourcount += 1
        if D == 5:
            fivecount += 1
        if D == 6:
            sixcount += 1
    return(onecount, twocount, threecount, fourcount,fivecount,sixcount)

onecount, twocount, threecount, fourcount,fivecount,sixcount = collect_sims (...)
print(onecount, "1",twocount,"2",threecount,"3",fourcount,"4",fivecount,"5",sixcount,"6")
Different Solution
Maybe this other solution can help you:
https://stackoverflow.com/a/9744274/6237334

Indent your for loop: in the code you posted, it's back at the original indentation level (none, for the for statement). This ends your function, and the loop is in the main program. Your variables aren't defined yet (since they're not the same as the ones in the function), and your return is illegal.
Try this, perhaps?
def collect_sims(nsim, N, D, p=0.5, nmax=10000):
    run_sim(N=20, D=6, p=0.5, itmax=5000)
    onecount = 0
    twocount = 0
    threecount = 0
    fourcount = 0
    fivecount = 0
    sixcount = 0
    for k in range (n):
        if D == 1:
            onecount += 1
        if D == 2:
            twocount += 1
        if D == 3:
            threecount += 1
        if D == 4:
            fourcount += 1
        if D == 5:
            fivecount += 1
        if D == 6:
            sixcount += 1
    print(onecount, "1",twocount,"2",threecount,"3",fourcount,"4",fivecount,"5",sixcount,"6")

collect_sims()
I can't test this, since you didn't supply enough code. Also, note that I simply left the print statement in place as a debugging trace. You have to return an array, and you've made no attempt to do that yet. Your original code returned k, which is only the loop counter (n-1 once the loop finishes). This is not useful to the calling program.
FURTHER HELP
Learn to use a list of 6 elements for the counts, rather than six separate variables. Even better, put all of the die rolls into a list, and simply use the count function to determine how many of each you have.
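The advice above can be taken one step further with numpy.bincount, which tabulates how often each non-negative integer occurs. The sketch below is only an illustration: the real run_sim is not shown in the question, so the run_sim here is a hypothetical stand-in that returns a random step count, and tabulate_steps is a made-up helper name. Runs that take nmax steps or more are dropped so that the result has exactly nmax entries, which is an assumption about what the assignment wants.

```python
import random

import numpy as np


def run_sim(N, D, p=0.5, itmax=5000):
    """Hypothetical stand-in for the real run_sim: returns a step count."""
    steps = 1
    while random.random() > p and steps < itmax:
        steps += 1
    return steps


def tabulate_steps(step_counts, nmax):
    """Return an array where entry k is the number of runs that took k steps."""
    steps = np.asarray(step_counts, dtype=int)
    steps = steps[steps < nmax]  # keep the result at exactly nmax entries
    return np.bincount(steps, minlength=nmax)


def collect_sims(nsim, N, D, p=0.5, nmax=10000):
    """Run the simulation nsim times and tabulate the step counts."""
    return tabulate_steps([run_sim(N, D, p) for _ in range(nsim)], nmax)


# The example from the assignment: two 3s, four 4s, one 5, one 6.
example = tabulate_steps([3, 4, 4, 3, 6, 5, 4, 4], nmax=10)
```

Here example[3] is 2 and example[4] is 4, matching the tabulation described in the assignment text.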

Python: Maximum recursion depth while getting the str of an object

I'm making a program to get the amount of letters in a number:
def convert(number):
    lettercount = 0
    numstr = str(number)
    # One's places
    if len(numstr) is 1:
        if number == 1 or number == 2 or number == 6:
            lettercount += 3
        elif number == 4 or number == 5 or number == 9:
            lettercount += 4
        else:
            lettercount += 5
    # Ten's places
    elif len(numstr) is 2:
        if number == 10:
            lettercount += 3
        elif number == 11 or number == 12:
            lettercount += 6
        elif number == 15 or number == 16:
            lettercount += 7
        elif number == 13 or number == 14 or number == 19:
            lettercount += 8
        elif number == 17 or number == 18:
            lettercount += 9
        elif number == 20 or number == 30 or number == 40 or\
                number == 80 or number == 90:
            lettercount += 6
        else:
            lettercount += convert(int((numstr)[-1]))
            lettercount += convert(int(round(number, -1)))
    return lettercount

print "88 has %i letters in its name." % convert(88)
print "23 has %i letters in its name." % convert(23)
print "46 has %i letters in its name." % convert(46)
It works just fine and returns a correct response for the 88 and 23, but it gives a recursion depth error on 46. I'm confused; why does it happen on just 46?
Fixed code:
def convert(number):
    lettercount = 0
    numstr = str(number)
    # One's places
    if len(numstr) == 1:
        if number == 1 or number == 2 or number == 6:
            lettercount += 3
        elif number == 4 or number == 5 or number == 9:
            lettercount += 4
        else:
            lettercount += 5
    # Ten's places
    elif len(numstr) == 2:
        if number == 10:
            lettercount += 3
        # 60 and 70 need entries too; otherwise the recursive branch below
        # would call convert(60) or convert(70) forever.
        elif number == 40 or number == 50 or number == 60:
            lettercount += 5
        elif number == 11 or number == 12 or number == 20 or number == 30 or\
                number == 80 or number == 90:
            lettercount += 6
        elif number == 15 or number == 16 or number == 70:
            lettercount += 7
        elif number == 13 or number == 14 or number == 19:
            lettercount += 8
        elif number == 17 or number == 18:
            lettercount += 9
        else:
            lettercount += convert(int((numstr)[-1]))
            lettercount += convert((int(numstr) // 10) * 10)
    return lettercount

print "88 has %i letters in its name." % convert(88)
print "23 has %i letters in its name." % convert(23)
print "46 has %i letters in its name." % convert(46)

Because when you do
convert(int(round(number, -1)))
you are calling convert(50). Since 50 isn't covered by your if statements, it gets to the else again, and calls convert(50) again, and so forth.

The problem here is that round(46, -1) produces the value 50. When convert is called with the value 50, it reaches the exact same line:
    lettercount += convert(int(round(number, -1)))
The round(50, -1) call produces 50 again, so convert keeps calling itself with 50 until Python raises the maximum recursion depth error.
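A small check makes the difference between the three inputs visible. The helper names round_to_ten and floor_to_ten are made up for this illustration, and the snippet is written for Python 3, whereas the question's code uses Python 2 print statements. For these inputs the values are the same after the int() conversion.

```python
def round_to_ten(number):
    """What the original recursive call computes: nearest multiple of ten."""
    return int(round(number, -1))


def floor_to_ten(number):
    """What the fixed code computes: the tens part of the number."""
    return (number // 10) * 10


# 88 rounds to 90 and 23 rounds to 20, and both have their own branch in the
# original function. 46 rounds to 50, which has no branch, and 50 rounds to
# itself, so the recursion never reaches a base case.
rounded = {n: round_to_ten(n) for n in (88, 23, 46, 50)}
floored = {n: floor_to_ten(n) for n in (88, 23, 46)}
```

Flooring sends 46 to 40 rather than 50, so the recursive call always receives the tens part of the number it was given.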
