How to create a function based on another dataframe column being True? - python

I have a dataframe shown below:
  Name      X      Y
0    A  False   True
1    B   True   True
2    C   True  False
I want to create a function for example:
example_function("A") = "A is in Y"
example_function("B") = "B is in X and Y"
example_function("C") = "C is in X"
This is my code currently (incorrect and doesn't look very efficient):
def example_function(name):
    for name in df['Name']:
        if df['X'][name] == True and df['Y'][name] == False:
            print(str(name) + "is in X")
        elif df['X'][name] == False and df['Y'][name] == True:
            print(str(name) + "is in Y")
        else:
            print(str(name) + "is in X and Y")
I eventually want to add more Boolean columns so it needs to be scalable. How can I do this? Would it be better to create a dictionary, rather than a dataframe?
Thanks!

If you really want a function you could do:
def example_function(label):
    s = df.set_index('Name').loc[label]
    l = s[s].index.to_list()
    return f'{label} is in {" and ".join(l)}'
example_function('A')
'A is in Y'
example_function('B')
'B is in X and Y'
You can also compute all the solutions as dictionary:
s = (df.set_index('Name').replace({False: pd.NA}).stack()
       .reset_index(level=0)['Name']
     )
out = s.index.groupby(s)
output:
{'A': ['Y'], 'B': ['X', 'Y'], 'C': ['X']}
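On the dictionary-versus-DataFrame part of the question, here is a minimal sketch of a plain-dictionary version. The `membership` table and its layout are assumptions made up for illustration, not something from the original post. Adding another Boolean group only means adding another key to each inner dictionary, so the function itself does not change.

```python
# Hypothetical plain-dictionary version of the same data (names are illustrative)
membership = {
    "A": {"X": False, "Y": True},
    "B": {"X": True, "Y": True},
    "C": {"X": True, "Y": False},
}


def example_function(name, table=membership):
    flags = table[name]  # raises KeyError for an unknown name
    # keep only the groups whose flag is True, in insertion order
    groups = [group for group, is_member in flags.items() if is_member]
    return "{} is in {}".format(name, " and ".join(groups))
```

For a handful of names this is probably enough; a DataFrame is likely the better fit if the data is already in pandas or you need to filter and aggregate it in other ways.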

I think you can stay with a DataFrame, the same output can be obtained with a function like this:
import numpy as np

def func(name, df):
    # some checks to verify that the name is actually in the df
    occurrences_name = np.sum(df['Name'] == name)
    if occurrences_name == 0:
        raise ValueError('Name not found')
    elif occurrences_name > 1:
        raise ValueError('More than one name found')
    # select the row corresponding to the name you're looking for
    row = df.loc[df['Name'] == name].drop(columns='Name').iloc[0]
    # collect the labels of the columns that are True, then join them,
    # so that no separator is emitted before the first label
    true_columns = [column for column, value in row.items() if value]
    return '{} is in {}'.format(name, ', '.join(true_columns))
Of course, you can adapt this to the specific shape of your df; I'm assuming that the column containing the names is actually called 'Name'.

Related

dataframe reparsing new columns in a loop

My dataframe is supposed to create just one modified copy of each int or float column; however, it is also modifying the modified columns, and so on. I believe that when I write for column in data, it thinks there are more columns than are actually present. Is there any way to fix this problem? The error occurs at **
Here is my code:
class simple_math:
    def __init__(self, operand, operator):
        self.operand=operand
        self.operator=operator
        if self.operator == '+' or self.operator=='-' or self.operator=='/':
            print('this is correct character')
        else:
            print('You have entered the wrong character')
    def user_op(self, user_input):
        operand=self.operand
        operator=self.operator
        temp = operand
        if operator == '+':
            temp += user_input
            return temp
        if operator == '-':
            temp -= user_input
            return temp
        if operator == '/':
            temp /= user_input
            return temp
test_data=sns.load_dataset('titanic')
df=test_data
df2=pd.DataFrame()
i = 0
for columns in df:
    new_columns= []
    if df[columns].dtypes == float or df[columns].dtypes == bool:
        new_columns = df[columns]
        df2.insert(i, columns, new_columns)
        i=i+1
    else:
        pass
df2= df2.replace({True: 'TRUE', False: 'FALSE'})
df3 = df2.loc[df['fare']<70]
test_data = df3.dropna()
test_data
# ** the error occurs in the class below
class tester(simple_math):
    def applicator(self, data):
        data.reset_index(drop=True, inplace=True)
        df = data
        for columns in data:
            try:
                df['modified_%s' % columns]= simple_math.user_op(self, data[columns])
            except:
                print('unable to parse', columns)
                pass
        return df

Python: Function works when tested with an integer but not in within apply() lambda function

This is my function:
def recommendation(row):
    if row['fitness_discipline'] == 'Cycling':
        if (row.avg_workout_length > row.total_workout_length) & (row.next_class == 'last'):
            filtered_classes = classes[(classes.fitness_discipline != row.prev_class) & (classes.fitness_discipline != row.fitness_discipline)& (classes.class_length < row.avg_workout_length*0.5)].sort_values(by='times_taken',ascending=False)
            if filtered_classes.empty:
                reco= ''
            else:
                reco = filtered_classes.iloc[0]['class'] + ', ' + filtered_classes.iloc[1]['class'] + ', '+ filtered_classes.iloc[-1]['class']
            return reco
        elif (row.total_workout_length >= row.avg_workout_length) & (row.next_class == 'last'):
            filtered_classes = classes[(classes.fitness_discipline != row.prev_class) & (classes.fitness_discipline != row.fitness_discipline)& (classes.class_length <= 15)].sort_values(by='times_taken',ascending=False)
            if filtered_classes.empty:
                reco= ''
            else:
                reco = filtered_classes.iloc[0]['class'] + ', ' + filtered_classes.iloc[1]['class'] + ', '+ filtered_classes.iloc[-1]['class']
            return reco
        return None
    return None
The above is just for context as when I run it, it runs perfectly as it should. In other words when I test it:
recommendation(df.iloc[12345])
Where '12345' is a random index in dataframe "df", I get the exact output I want as a string.
Then, when I do:
df['new_column'] = df.apply(lambda x: recommendation(x),axis=1)
I expect a new column to be created in df that takes each row and applies the function, writing the result for each row to the 'new_column' column. But no matter what I try, it fails and I usually get
IndexError: single positional indexer is out-of-bounds
as an error.
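One plausible cause, which cannot be confirmed without the data: for some rows `filtered_classes` is not empty but has only one row, so `filtered_classes.iloc[1]` is out of bounds. A single test row can easily miss that case, while `apply` visits every row. The sketch below reproduces the error on a made-up one-row frame and shows a hypothetical helper, `pick_classes`, that skips positions that do not exist. The helper name and the sample data are assumptions for illustration only.

```python
import pandas as pd


def pick_classes(filtered_classes, positions=(0, 1, -1)):
    """Join the 'class' values at the given positions, skipping missing ones."""
    n = len(filtered_classes)
    if n == 0:
        return ''
    # Keep in-range positions only, convert negatives to their positive
    # equivalent, and drop duplicates while preserving order.
    valid = dict.fromkeys(p % n for p in positions if -n <= p < n)
    return ', '.join(filtered_classes['class'].iloc[list(valid)])


# Made-up frame with a single matching class
one_row = pd.DataFrame({'class': ['Yoga Flow'], 'times_taken': [7]})

try:
    one_row.iloc[1]  # same access pattern as filtered_classes.iloc[1]
except IndexError as exc:
    error_message = str(exc)
```

Inside `recommendation`, the `else` branches could then use `reco = pick_classes(filtered_classes)` instead of indexing three fixed positions.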

How to check if a function returns one type of data or another in python

I have a function that must return lists. However, if the conversion inside the try block shown below fails, it means the input is wrong, and in that case I return False. Here is the code:
def objective_function(fo):
    min_or_max = 0
    piso = 5
    i = 5
    C = []
    index = []
    if fo[0 : 5] == "minz=":
        min_or_max = 1
    elif fo[0 : 5] == "maxz=":
        min_or_max = 2
    while i < len(term):
        if fo[i] == "+" or fo[i] == "-":
            coefficient, j = separate_term(fo[piso : i])
            try:
                C.append(Fr(coefficient))
                index.append(int(j))
            except:
                return False
            piso = i
    print(C)
    return min_or_max, C
What I want is to know whether it returned False, so that I can print("It has an error"), and otherwise show the two values it returns. How would I do that?
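A minimal sketch of one way to do this: store the result in a variable, test it with `is False`, and only unpack it when it is not False. `parse_terms` and `describe` are hypothetical stand-ins written for this example, not the original `objective_function`; only the calling pattern is the point.

```python
def parse_terms(text):
    """Hypothetical stand-in: return (count, values), or False on bad input."""
    values = []
    for piece in text.split("+"):
        try:
            values.append(int(piece))
        except ValueError:
            return False
    return len(values), values


def describe(text):
    result = parse_terms(text)
    # `is False` matches only the False sentinel, never an empty list or 0
    if result is False:
        return "It has an error"
    count, values = result  # safe to unpack: result is a 2-tuple here
    return "{} terms: {}".format(count, values)
```

An alternative design is to let the function raise an exception (for example ValueError) instead of returning False, and wrap the call in try/except; that avoids mixing two return types.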

How to check for 3 in a row in tic tac toe game

I am making a tic tac toe game and trying to create a function that checks if 3 of the same spots in a row have the same input 'x' or '0'. I am having trouble with the three_in_row function I am trying to make to trigger game over. I am trying to figure out how to do this in a simple way so all rows or columns will be triggers if 3 X's or 0's are played... Here's what I have so far. This is in python 2.7.13
(this is only part of the code I think should be relevant)
def setup_board(size):
    board = []
    for row in range(size):
        board.append([])
        for col in range(size):
            board[row].append(empty)
    return board
def three_in_row(b):
    b[0][0] and b[0][1] and b[0][2] == 'x'
def game_over(b):
    if three_in_row(b) == True:
        print "Congratulations You Win!"
    else:
        return False
def tic_tac_toe():
    b = setup_board(3)
    run_game(b)
In my opinion, it might make more sense to store X's as +1 and O's as -1, so that you can easily do arithmetic to check if the game is over.
For example:
def three_in_row(b):
    xWin = check_winner(b,3)
    oWin = check_winner(b,-3)
    return xWin | oWin
def check_winner(b, valToCheck):
    foundWin = any(sum(r) in {valToCheck} for r in b) # check rows
    # now check columns
    for i in range(3):
        foundWin = foundWin | (sum([b[j][i] for j in range(3)]) == valToCheck)
    # now check diagonals
    foundWin = foundWin | (sum([b[i][i] for i in range(3)]) == valToCheck)
    foundWin = foundWin | (sum([b[i][2-i] for i in range(3)]) == valToCheck)
    return foundWin
Thanks to Blender for the following more succinct method:
def three_in_row(b):
    return any(sum(r) in {3, -3} for r in b)
def line_match(game):
    for i in range(3):
        set_r = set(game[i])
        if len(set_r) == 1 and game[i][0] != 0:
            return game[i][0]
    return 0
#transposed column function for future use
#def column(game):
#    trans = numpy.transpose(game)
#    for i in range(3):
#        set_r = set(trans[i])
#        if len(set_r) == 1 and trans[i][0] != 0:
#            return list(set_r)[0]
def diagonal_match(game):
    if game[1][1] != 0:
        if game[1][1] == game[0][0] == game[2][2]:
            return game[1][1]
        elif game[1][1] == game[0][2] == game[2][0]:
            return game[1][1]
    return 0
The correct syntax for the checks is either:
b[0][0] == 'x' and b[0][1] == 'x' and b[0][2] == 'x'
or (more succinctly):
b[0][0] == b[0][1] == b[0][2] == 'x'
You are also missing a return just before your check, like:
return b[0][0] == b[0][1] == b[0][2] == 'x'
Anyways, your code does not iterate over all the rows. A possible correction would be:
def three_in_row(b):
    for row in range(0, 3):
        if b[row][0] == b[row][1] == b[row][2] == 'x':
            return True
    return False
Writing a three_in_column(b) should be fairly easy (change b[row][n] to b[n][column]), and so is manually checking the two diagonals.
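For reference, here is a minimal sketch that combines the row, column, and diagonal checks into one function. `has_won` and its `mark` parameter are hypothetical names introduced for this example; it assumes a square board stored as a list of lists of strings, as in the question.

```python
def has_won(b, mark):
    """Return True if `mark` fills any complete row, column, or diagonal of b."""
    size = len(b)
    rows = [[b[r][c] for c in range(size)] for r in range(size)]
    columns = [[b[r][c] for r in range(size)] for c in range(size)]
    diagonals = [
        [b[i][i] for i in range(size)],             # top-left to bottom-right
        [b[i][size - 1 - i] for i in range(size)],  # top-right to bottom-left
    ]
    return any(
        all(cell == mark for cell in line)
        for line in rows + columns + diagonals
    )
```

Calling `has_won(b, 'x') or has_won(b, 'o')` then covers both players, and the code contains no syntax specific to Python 2 or Python 3.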

How do You change a variable to a string in python?

So I am trying to change a randomized variable to a string with a function, any ideas why this isn't working?
def letter(x):
    if x == 1:
        x = "A"
    elif x == 2:
        x = "C"
    elif x == 3:
        x = "G"
    elif x == 4:
        x = "T"
    else:
        print "Error"
randint18= random.randrange(1,5)
letter(randint18)
print randint18
You have to return the value from the function, and assign it to a variable.
def letter(x):
    ...
    return x
randint18 = random.randrange(1, 5)
result = letter(randint18)
print result
This isn't a proper answer (those have been provided already) but a suggestion for improving your code. I'd post it as a comment, but the code formatting there isn't good enough.
Why not use a dictionary for the mapping instead of a sequence of ifs? You could still place it in a function if you like:
letter = {1:'A', 2:'C', 3:'G', 4:'T'}
randint18 = random.randrange(1,5)
mapping = letter.get(randint18, 'Error')
print mapping
Mind you, a list would be even more efficient if the mapping started from zero:
letter = ['A', 'C', 'G', 'T']
randint18 = random.randrange(0,4)
try: # in case your random index were allowed to go past 3
    mapping = letter[randint18]
except IndexError:
    mapping = 'Error'
print mapping
You cannot alter the variable in place; you must return it and capture the returned value.
import random
def letter(x):
    if x == 1:
        x = "A"
    elif x == 2:
        x = "C"
    elif x == 3:
        x = "G"
    elif x == 4:
        x = "T"
    else:
        print "Error"
    return x # return it here
randint18= random.randrange(1,5)
randint18 = letter(randint18) # capture the returned value here
print randint18
There is a simpler way to achieve what you want, using a dictionary to map the values.
import random
def letter(x):
    mapd = {1:'A', 2:'C', 3:'G', 4:'T'}
    return mapd.get(x, None)
randint18= random.randrange(1,5)
randint18 = letter(randint18)
print randint18
You forgot to include a return in your function:
def letter(x):
    if x == 1:
        x = "A"
    elif x == 2:
        x = "C"
    elif x == 3:
        x = "G"
    elif x == 4:
        x = "T"
    else:
        print "Error"
    return x
randint18 = random.randrange(1,5)
returned_result = letter(randint18)
print returned_result
Add a return statement at the end of the function, and save its result when you call it:
return x
value_you_want = letter(randint18) ## the returned value is saved to value_you_want
Please note that variables defined inside a function are local to that function and cannot be accessed outside its scope. You were expecting the value of x to be available outside the function, which is not possible. To check, run your function and then try to access the variable x. It will give an error:
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    print x
NameError: name 'x' is not defined
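To make the scoping point concrete, here is a small sketch. `relabel` is a hypothetical function written for this example; it shows that assigning to a parameter only rebinds the local name, so the caller's variable changes only when the return value is assigned back.

```python
def relabel(x):
    x = "A"   # rebinds the local name x; the caller's variable is untouched
    return x


value = 1
relabel(value)          # return value is discarded
unchanged = value       # value still holds 1 at this point

value = relabel(value)  # capture the returned value to actually change it
```

After the last line `value` holds "A", while `unchanged` keeps the original integer.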
