Python Pulp - Number of Unique Teams Constraint - python

I am new to PuLP and have run into a problem when trying to write a conditional constraint. I have made a fantasy football optimizer that picks the optimal selection of 9 players. The solver currently works with position constraints, salary constraints, and more.
The last thing I need to add is a constraint requiring that the 9 selected players come from 8 unique teams. For example, the constraint marked ###Stack QB with 2 teammates in my code already puts a quarterback and a WR/TE on the same team, so everyone else should be on a different team from each other to give 8 unique team names.
Below are the code I have tried for this constraint, the head of the Excel file being optimized, and my code that works so far without the constraint of 8 unique team names among the 9 selected players.
I have tried this, but it doesn't work. I would really appreciate any help!
list_of_teams = raw_data['Team'].unique()
team_vars = pulp.LpVariable.dicts('team', list_of_teams, cat = 'Binary')
for team in list_of_teams:
    prob += pulp.lpSum([player_vars[i] for i in player_ids if raw_data['Team'][i] == team] + [-9*team_vars[team]]) <= 0
prob += pulp.lpSum([team_vars[t] for t in list_of_teams]) >= 8
import pandas as pd
import pulp

file_name = 'C:/Users/Michael Arena/Desktop/Football/Simulation.csv'
raw_data = pd.read_csv(file_name,engine="python",index_col=False, header=0, delimiter=",", quoting = 3)
player_ids = raw_data.index
player_vars = pulp.LpVariable.dicts('player', player_ids, cat='Binary')
prob = pulp.LpProblem("DFS Optimizer", pulp.LpMaximize)
prob += pulp.lpSum([raw_data['Projection'][i]*player_vars[i] for i in player_ids])
##Total Salary upper:
prob += pulp.lpSum([raw_data['Salary'][i]*player_vars[i] for i in player_ids]) <= 50000
##Total Salary lower:
prob += pulp.lpSum([raw_data['Salary'][i]*player_vars[i] for i in player_ids]) >= 49900
##Exactly 9 players:
prob += pulp.lpSum([player_vars[i] for i in player_ids]) == 9
##2-3 RBs:
prob += pulp.lpSum([player_vars[i] for i in player_ids if raw_data['Position'][i] == 'RB']) >= 2
prob += pulp.lpSum([player_vars[i] for i in player_ids if raw_data['Position'][i] == 'RB']) <= 3
##1 QB:
prob += pulp.lpSum([player_vars[i] for i in player_ids if raw_data['Position'][i] == 'QB']) == 1
##3-4 WRs:
prob += pulp.lpSum([player_vars[i] for i in player_ids if raw_data['Position'][i] == 'WR']) >= 3
prob += pulp.lpSum([player_vars[i] for i in player_ids if raw_data['Position'][i] == 'WR']) <= 4
##1-2 TE's:
prob += pulp.lpSum([player_vars[i] for i in player_ids if raw_data['Position'][i] == 'TE']) >= 1
# prob += pulp.lpSum([player_vars[i] for i in player_ids if raw_data['Position'][i] == 'TE']) <= 2
##1 DST:
prob += pulp.lpSum([player_vars[i] for i in player_ids if raw_data['Position'][i] == 'DST']) == 1
###Stack QB with 2 teammates
for qbid in player_ids:
    if raw_data['Position'][qbid] == 'QB':
        prob += pulp.lpSum([player_vars[i] for i in player_ids if
                            (raw_data['Team'][i] == raw_data['Team'][qbid] and
                             raw_data['Position'][i] in ('WR', 'TE'))] +
                           [-1*player_vars[qbid]]) >= 0
###Don't stack with opposing DST:
for dstid in player_ids:
    if raw_data['Position'][dstid] == 'DST':
        prob += pulp.lpSum([player_vars[i] for i in player_ids if
                            raw_data['Team'][i] == raw_data['Opponent'][dstid]] +
                           [8*player_vars[dstid]]) <= 8
###Stack QB with 1 opposing player:
for qbid in player_ids:
    if raw_data['Position'][qbid] == 'QB':
        prob += pulp.lpSum([player_vars[i] for i in player_ids if
                            (raw_data['Team'][i] == raw_data['Opponent'][qbid] and
                             raw_data['Position'][i] in ('WR', 'TE'))] +
                           [-1*player_vars[qbid]]) >= 0
prob.solve()

In Linear Programming terms
Let x_i = 1 if the i^th player is chosen, and 0 otherwise, i = 1....I.
Let t_i be the team of the i^th player, which is a constant.
Let t_j be the j^th unique team, also a constant, j = 1....T.
And let t_{ij} = 1 if t_i == t_j, and 0 otherwise. This is also a constant.
Then you can say that the total number of players selected from team t_j is (t_{1j}*x_1 + t_{2j}*x_2 + ... + t_{Ij}*x_I), which takes a value between 0 and I, logically.
Now, you can let the binary variable y_j = 1 only if some selected player comes from team t_j, like this:
(t_{1j}*x_1 + t_{2j}*x_2 + ... + t_{Ij}*x_I) >= y_j
This gives you the following situation:
If (t_{1j}*x_1 + t_{2j}*x_2 + ... + t_{Ij}*x_I) = 0, then y_j is 0;
If (t_{1j}*x_1 + t_{2j}*x_2 + ... + t_{Ij}*x_I) > 0, then y_j can be 0 or 1.
And now, if you add a constraint (y_1 + y_2 + ... + y_T) >= 8, that implies that (t_{1j}*x_1 + t_{2j}*x_2 + ... + t_{Ij}*x_I) > 0 for at least 8 different teams t_j.
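The direction of the linking inequality is what makes this work, and a small pure-Python check can illustrate it. The sketch below is not PuLP code; the helper names (`best_y_sum_upper_link`, `best_y_sum_big_m_link`) and the sample rosters are hypothetical. For a fixed roster, it computes the largest value (y_1 + ... + y_T) can reach under the constraint above (player count >= y_j), and under a big-M link of the form (player count <= 9*y_j) like the one attempted in the question, which only forces y_j up and so never limits the sum.

```python
from collections import Counter


def best_y_sum_upper_link(roster_teams, all_teams):
    """Largest sum of y_j when each y_j is binary and count_j >= y_j."""
    counts = Counter(roster_teams)
    # y_j can be 1 only for teams with at least one selected player.
    return sum(1 for team in all_teams if counts[team] >= 1)


def best_y_sum_big_m_link(roster_teams, all_teams):
    """Largest sum of y_j when each y_j is binary and count_j <= 9 * y_j."""
    # This link forces y_j = 1 for used teams but leaves y_j free otherwise,
    # so every y_j can be set to 1 regardless of the roster.
    return len(all_teams)


all_teams = [f"T{n}" for n in range(1, 13)]  # 12 hypothetical teams
stacked = ["T1", "T1", "T1", "T2", "T2", "T3", "T4", "T5", "T6"]  # 6 distinct teams
spread = ["T1", "T1", "T2", "T3", "T4", "T5", "T6", "T7", "T8"]   # 8 distinct teams

for name, roster in (("stacked", stacked), ("spread", spread)):
    print(name,
          best_y_sum_upper_link(roster, all_teams) >= 8,
          best_y_sum_big_m_link(roster, all_teams) >= 8)
# stacked False True
# spread True True
```

With the count >= y_j link, the "at least 8" requirement rejects the stacked roster; with the big-M link it accepts every roster, which would explain a constraint that appears to have no effect.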
In PuLP terms (something like this; I wasn't able to test it)
If player_vars is a binary variable equivalent to x_i:
teams = raw_data['Team'] # t_i
unique_teams = teams.unique() # t_j
player_in_team = teams.str.get_dummies() # t_{ij}
# Example output for `teams = pd.Series(['A', 'B', 'C', 'D', 'E', 'F', 'A', 'C', 'E'])`:
# A B C D E F
# 0 1 0 0 0 0 0
# 1 0 1 0 0 0 0
# 2 0 0 1 0 0 0
# 3 0 0 0 1 0 0
# 4 0 0 0 0 1 0
# 5 0 0 0 0 0 1
# 6 1 0 0 0 0 0
# 7 0 0 1 0 0 0
# 8 0 0 0 0 1 0
team_vars = pulp.LpVariable.dicts('team', unique_teams, cat='Binary') # y_j
for team in unique_teams:
    prob += pulp.lpSum(
        [player_in_team[team][i] * player_vars[i] for i in player_ids]
    ) >= team_vars[team]
prob += pulp.lpSum([team_vars[t] for t in unique_teams]) >= 8

Related

Matplotlib and Pandas Plotting amount of numbers in certain range

I have a pandas DataFrame that looks like this:
I want to create this kind of plot for every year [1...10], with scores in the range [1...10].
This means that for every year, the plot will show:
how many values fall in [0-1] in year 1
how many values fall in [2-3] in year 1
how many values fall in [4-5] in year 1
...
how many values fall in [6-7] in year 10
how many values fall in [8-9] in year 10
how many values fall in [10] in year 10
Need some help, thank you!
The following code works perfectly:
import matplotlib.pyplot as plt
import seaborn as sns

def visualize_yearly_score_distribution(ds, year):
    sns.set_theme(style="ticks")
    first_range = 0
    second_range = 0
    third_range = 0
    fourth_range = 0
    fifth_range = 0
    six_range = 0
    seven_range = 0
    eight_range = 0
    nine_range = 0
    last_range = 0
    score_list = []
    for index, row in ds.iterrows():
        if row['Publish Date'] == year:
            # Note: the strict comparisons skip scores that are exactly 0, 1, ..., 10.
            if 0 < row['Score'] < 1:
                first_range += 1
            if 1 < row['Score'] < 2:
                second_range += 1
            if 2 < row['Score'] < 3:
                third_range += 1
            if 3 < row['Score'] < 4:
                fourth_range += 1
            if 4 < row['Score'] < 5:
                fifth_range += 1
            if 5 < row['Score'] < 6:
                six_range += 1
            if 6 < row['Score'] < 7:
                seven_range += 1
            if 7 < row['Score'] < 8:
                eight_range += 1
            if 8 < row['Score'] < 9:
                nine_range += 1
            if 9 < row['Score'] < 10:
                last_range += 1
    score_list.append(first_range)
    score_list.append(second_range)
    score_list.append(third_range)
    score_list.append(fourth_range)
    score_list.append(fifth_range)
    score_list.append(six_range)
    score_list.append(seven_range)
    score_list.append(eight_range)
    score_list.append(nine_range)
    score_list.append(last_range)
    range_list = ['0-1', '1-2', '2-3', '3-4', '4-5', '5-6', '6-7', '7-8', '8-9', '9-10']
    plt.pie([x*100 for x in score_list], labels=[x for x in range_list], autopct='%0.1f', explode=None)
    plt.title(f"Yearly Score Distribution for {str(year)}")
    plt.tight_layout()
    plt.legend()
    plt.show()
Thank you all for the kind comments :)
This case is closed.
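As a complement to the counting loop above, here is a minimal sketch of the same binning in plain Python, without the plotting. The function name `count_scores_by_bin` and the sample scores are hypothetical. It assumes each bin includes its lower edge and that a score of exactly 10 belongs in the last bin, which differs from the strict `<` comparisons above, where a score of exactly 1, 2, and so on lands in no bin.

```python
def count_scores_by_bin(scores, n_bins=10):
    """Count scores in [0, 1), [1, 2), ..., [9, 10]; values outside 0-10 are ignored."""
    counts = [0] * n_bins
    for score in scores:
        if 0 <= score <= n_bins:
            # int() truncates, so 3.7 goes to bin 3; the top edge (10) goes in the last bin.
            index = min(int(score), n_bins - 1)
            counts[index] += 1
    return counts


sample_scores = [0.5, 1.0, 1.9, 3.7, 9.2, 10, 10.0, 11]
print(count_scores_by_bin(sample_scores))
# [1, 2, 0, 1, 0, 0, 0, 0, 0, 3]
```

The returned list has the same order as range_list, so it could be passed to plt.pie in place of score_list.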

Pandas count of sequence of positive and negative numbers

Problem is probably simple, but my brain doesn't work as expected.
Imagine you have this pandas Series:
y = pd.Series([5, 5 , -5 , -10, 7 , 7 ])
z = y * 0
I would like to have output:
1, 2 , -1 ,-2 ,1 ,2
My solution below:
for i, row in y.items():  # Series.iteritems() was removed in pandas 2.0
    if i == 0 and y[i] > 0:
        z[i] = 1
    elif i == 0:
        z[i] = -1
    elif y[i] >= 0 and y[i-1] >= 0:
        z[i] = 1 + z[i-1]
    elif y[i] < 0 and y[i-1] < 0:
        z[i] = -1 + z[i-1]
    elif y[i] >= 0 and y[i-1] < 0:
        z[i] = 1
    elif y[i] < 0 and y[i-1] >= 0:
        z[i] = -1
I would think there is a more Pythonic/pandas solution.
You can use np.sign() to check whether each number is positive or negative and compare it to the previous row using shift(). The cumulative sum of those sign changes labels each run, and cumcount() then numbers the rows within each run.
y = pd.Series([5, 5 , -5 , -10, 7 , 7 ])
parts = (np.sign(y) != np.sign(y.shift())).cumsum()
print((y.groupby(parts).cumcount() + 1) * np.sign(y))
# or print(y.groupby(parts).cumcount().add(1).mul(np.sign(y)))
Output
0 1
1 2
2 -1
3 -2
4 1
5 2
Sign changes are found by applying np.sign to the series and checking where the difference between consecutive signs is not 0. The cumulative sum of these flags labels each run of consecutive same-sign values as a group. Lastly, cumcount numbers the rows within each group, and multiplying by the sign makes the counts negative for negative runs:
signs = np.sign(y)
grouper = signs.diff().ne(0).cumsum()
result = y.groupby(grouper).cumcount().add(1).mul(signs)
where add(1) is because cumcount gives 0, 1, .. but we need 1 more.
>>> result
0 1
1 2
2 -1
3 -2
4 1
5 2
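For comparison, the same run-numbering idea can be sketched without pandas or NumPy. The function name `signed_run_lengths` is hypothetical, and the sketch assumes zero is treated as nonnegative, as in most branches of the loop in the question (np.sign would instead map zero to 0).

```python
def signed_run_lengths(values):
    """Number each value by its position within the current same-sign run."""
    result = []
    for value in values:
        sign = 1 if value >= 0 else -1
        if result and (result[-1] > 0) == (sign > 0):
            # Same sign as the previous value: extend the current run.
            result.append(result[-1] + sign)
        else:
            # First value or a sign change: start a new run at +1 or -1.
            result.append(sign)
    return result


print(signed_run_lengths([5, 5, -5, -10, 7, 7]))
# [1, 2, -1, -2, 1, 2]
```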

A test interview question I could not figure out

I wrote a piece of code in PyCharm
to solve this problem:
pick any 5 positive integers that add up to 100
such that, by addition, subtraction, or just using one of the five values,
you can make every number up to 100.
For example, with
1, 22, 2, 3, 4
for 1 I could give 1,
for 2 I could give 2,
and so on;
for 21 I could give 22 - 1,
for 25 I could give (22 + 4) - 1.
li = [1, 1, 1, 1, 1]
lists_of_li_that_pass_T1 = []
while True:
    if sum(li) == 100:
        list_of_li_that_pass_T1.append(li)
        if li[-1] != 100:
            li[-1] += 1
        else:
            li[-1] = 1
            if li[-2] != 100:
                li[-2] += 1
            else:
                li[-2] = 1
                if li[-3] != 100:
                    li[-3] += 1
                else:
                    li[-3] = 1
                    if li[-4] != 100:
                        li[-4] += 1
                    else:
                        li[-4] = 1
                        if li[-5] != 100:
                            li[-5] += 1
                        else:
                            break
    else:
        if li[-1] != 100:
            li[-1] += 1
        else:
            li[-1] = 1
            if li[-2] != 100:
                li[-2] += 1
            else:
                li[-2] = 1
                if li[-3] != 100:
                    li[-3] += 1
                else:
                    li[-3] = 1
                    if li[-4] != 100:
                        li[-4] += 1
                    else:
                        li[-4] = 1
                        if li[-5] != 100:
                            li[-5] += 1
                        else:
                            break
This should give me all the number combinations that add up to 100, out of the total 1*10 ** 10 candidates,
but it's not working. Please help me fix it so it prints all of the sets of integers.
I also can't think of what I would do next to get the sets that actually solve the problem.
After @JohnY's comments, I assume that the question is:
Find a set of 5 integers meeting the following requirements:
their sum is 100
any number in the [1, 100] range can be constructed using each element of the set at most once, with only additions and subtractions
A brute force way is certainly possible, but proving that any number can be constructed that way would be tedious. A divide and conquer strategy is possible instead: to construct all numbers up to n with a set of m numbers u_0, ..., u_{m-1}, it is enough to build all numbers up to (n+2)/3 with u_0, ..., u_{m-2} and use u_{m-1} = 2*n/3 (integer division in both cases). Any number in the ((n+2)/3, u_{m-1}) range can be written as u_{m-1} - x with x in the [1, (n+2)/3] range, and any number in the (u_{m-1}, n] range as u_{m-1} + y with y in the same low range.
So we can use here u_4 = 66 and find a way to build numbers up to 34 with 4 numbers.
Let us iterate: u_3 = 22 (which also keeps the total at 100) and build numbers up to 12 with 3 numbers.
One more step: u_2 = 8 and build numbers up to 4 with 2 numbers.
Ok: u_0 = 1 and u_1 = 3 give immediately:
1 = u_0
2 = 3 - 1 = u_1 - u_0
3 = u_1
4 = 3 + 1 = u_1 + u_0
Done.
Mathematical digression:
In fact u_0 = 1 and u_1 = 3 can build all numbers up to 4, so we can use u_2 = 9 to build all numbers up to 9+4 = 13. We can prove easily that the sequence u_i = 3^i verifies sum(u_i for i in [0, m-1]) = 1 + 3 + ... + 3^(m-1) = (3^m - 1)/(3 - 1) = (u_m - 1) / 2.
So we could use u_0 = 1, u_1 = 3, u_2 = 9, u_3 = 27 to build all numbers up to 40, and finally set u_4 = 60.
In fact, u_0 and u_1 can only be 1 and 3, and u_2 can be 8 or 9. Then if u_2 == 8, u_3 can be in the [22, 25] range, and if u_2 == 9, u_3 can be in the [21, 27] range. The high limit is given by the 3^i sequence, and the low limit is given by the requirement to build numbers up to 12 with 3 numbers, and up to 34 with 4.
No code was used, but I think this way is much quicker and less error prone. It is now possible to use Python to show that all numbers up to 100 can be constructed from one of those sets using the divide and conquer strategy.
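A minimal sketch of such a check, assuming each value may be added, subtracted, or left out exactly once. The function names `reachable` and `covers_1_to_100` are hypothetical; the two candidate sets, (1, 3, 8, 22, 66) and (1, 3, 9, 27, 60), both follow the construction described above and sum to 100.

```python
from itertools import product


def reachable(values):
    """Return every positive total obtainable by adding, subtracting, or skipping each value once."""
    totals = set()
    for signs in product((-1, 0, 1), repeat=len(values)):
        total = sum(s * v for s, v in zip(signs, values))
        if total > 0:
            totals.add(total)
    return totals


def covers_1_to_100(values):
    """True if the values sum to 100 and every integer in [1, 100] is reachable."""
    return sum(values) == 100 and set(range(1, 101)) <= reachable(values)


for candidate in [(1, 3, 8, 22, 66), (1, 3, 9, 27, 60), (20, 20, 20, 20, 20)]:
    print(candidate, covers_1_to_100(candidate))
# (1, 3, 8, 22, 66) True
# (1, 3, 9, 27, 60) True
# (20, 20, 20, 20, 20) False
```

With five values there are only 3**5 = 243 sign patterns per candidate, so the check is instant, unlike enumerating all 10**10 candidate lists.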

Variable takes negative value while it is restricted to be nonnegative

I am programming a vehicle routing problem in Python with PuLP. I got all my code in it, but for some reason I get a negative value for one of my decision variables, even though I restricted all of them to be nonnegative.
My code is as follows (traveltimes is a two-dimensional np array holding the travel time between each pair of customers (i,j), where c(i,j) = c(j,i) and c(i,i) = 0):
numVehicles = 2
numCustomers = 2
prob = LpProblem("DSP", LpMinimize)
var = [[[0 for k in range(numVehicles)] for j in range(numCustomers+1)] for i in range(numCustomers+1)]
for i in range(numCustomers+1):
    for j in range(numCustomers+1):
        for k in range(numVehicles):
            var[i][j][k] = LpVariable("x"+str(i)+","+str(j)+","+str(k), 0,1, cat='Binary')
# ADD OBJECTIVE
obj = ""
for i in range(numCustomers+1):
    for j in range(numCustomers+1):
        for k in range(numVehicles):
            obj += traveltimes[i][j]*var[i][j][k]
prob += obj
# ADD CONSTRAINTS
# All customers visited
for j in range(numCustomers+1):
    for k in range(numVehicles):
        nr = ""
        for i in range(numCustomers+1):
            nr += var[i][j][k]
        prob += nr == 1
# Enter each customer exactly once
for i in range(numCustomers+1):
    nr = ""
    for k in range(numVehicles):
        for j in range(1, numCustomers+1):
            nr += var[i][j][k]
    prob += nr == 1
# Leave each customer exactly once
for j in range(numCustomers+1):
    nr = ""
    for k in range(numVehicles):
        for i in range(1, numCustomers+1):
            nr += var[i][j][k]
    prob += nr == 1
# Per vehicle only one customer can be visited as first
nrFirst = ""
for k in range(numVehicles):
    for j in range(numCustomers+1):
        nrFirst += var[0][j][k]
    prob += nrFirst <= 1
# Max num vehicles
nrOut = ""
for k in range(numVehicles):
    for j in range(numCustomers+1):
        nrOut += var[0][j][k]
prob += nrOut <= numVehicles
# Restrict x(0,j,k) to be nonnegative
for j in range(numCustomers+1):
    for k in range(numVehicles):
        prob += var[0][j][k] >= 0
print(prob)
# Solve LP
prob.solve()
for v in prob.variables():
    print(v.name, "=", v.varValue)
print("objective=", value(prob.objective))
The first output is the formulation printed
MINIMIZE
1.731*x0,1,0 + 1.731*x0,1,1 + 2.983*x0,2,0 + 2.983*x0,2,1 + 1.731*x1,0,0 + 1.731*x1,0,1 + 9.375*x1,2,0 + 9.375*x1,2,1 + 2.983*x2,0,0 + 2.983*x2,0,1 + 9.375*x2,1,0 + 9.375*x2,1,1 + 0.0
SUBJECT TO
_C1: x0,0,0 + x1,0,0 + x2,0,0 = 1
_C2: x0,0,1 + x1,0,1 + x2,0,1 = 1
_C3: x0,1,0 + x1,1,0 + x2,1,0 = 1
_C4: x0,1,1 + x1,1,1 + x2,1,1 = 1
_C5: x0,2,0 + x1,2,0 + x2,2,0 = 1
_C6: x0,2,1 + x1,2,1 + x2,2,1 = 1
_C7: x0,1,0 + x0,1,1 + x0,2,0 + x0,2,1 <= 1
_C8: x1,1,0 + x1,1,1 + x1,2,0 + x1,2,1 <= 1
_C9: x2,1,0 + x2,1,1 + x2,2,0 + x2,2,1 <= 1
_C10: x0,0,0 + x0,1,0 + x0,2,0 <= 1
_C11: x0,0,0 + x0,0,1 + x0,1,0 + x0,1,1 + x0,2,0 + x0,2,1 <= 1
VARIABLES
0 <= x0,0,0 <= 1 Integer
0 <= x0,0,1 <= 1 Integer
0 <= x0,1,0 <= 1 Integer
0 <= x0,1,1 <= 1 Integer
0 <= x0,2,0 <= 1 Integer
0 <= x0,2,1 <= 1 Integer
0 <= x1,0,0 <= 1 Integer
0 <= x1,0,1 <= 1 Integer
0 <= x1,1,0 <= 1 Integer
0 <= x1,1,1 <= 1 Integer
0 <= x1,2,0 <= 1 Integer
0 <= x1,2,1 <= 1 Integer
0 <= x2,0,0 <= 1 Integer
0 <= x2,0,1 <= 1 Integer
0 <= x2,1,0 <= 1 Integer
0 <= x2,1,1 <= 1 Integer
0 <= x2,2,0 <= 1 Integer
0 <= x2,2,1 <= 1 Integer
It can clearly be observed that all variables are restricted to be an integer between 0 and 1 (thus binary). However, for some reason, I do get negative values for some variable(s), as can be seen below
x0,0,0 = 0.0
x0,0,1 = -1.0
x0,1,0 = 0.0
x0,1,1 = 1.0
x0,2,0 = 0.0
x0,2,1 = 1.0
x1,0,0 = 1.0
x1,0,1 = 1.0
x1,1,0 = 1.0
x1,1,1 = 0.0
x1,2,0 = 0.0
x1,2,1 = 0.0
x2,0,0 = 0.0
x2,0,1 = 1.0
x2,1,0 = 0.0
x2,1,1 = 0.0
x2,2,0 = 1.0
x2,2,1 = 0.0
objective= 11.159
Really looking forward to any suggestions on how to solve this problem, since I clearly do not want negative values!
As a few others have suggested, you should write a Minimal, Complete, and Verifiable Example.
That said, if constraints are being violated and you are sure you've implemented them correctly, I reckon you have an infeasible problem (i.e. if you looked at your constraints carefully, you would find a combination that makes solving impossible).
To check this, add:
print("Status:", LpStatus[prob.status])
just after prob.solve(). I reckon you'll find it's infeasible.
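One way to see why infeasibility is plausible here: in the printed formulation, _C3 to _C6 each require exactly one of their variables to be 1 (four ones in total), while _C7 to _C9 together allow at most three ones among those same twelve variables. The sketch below is a plain-Python brute force over those twelve binaries, not PuLP code. It assumes the printed model is the one actually solved, and the helper name `satisfies_c3_to_c9` is hypothetical.

```python
from itertools import product

origins = (0, 1, 2)    # index i
customers = (1, 2)     # index j (only j = 1, 2 appear in _C3 to _C9)
vehicles = (0, 1)      # index k
keys = [(i, j, k) for i in origins for j in customers for k in vehicles]


def satisfies_c3_to_c9(x):
    """Check constraints _C3 to _C9 of the printed model for one 0/1 assignment."""
    # _C3 to _C6: for each (j, k), exactly one origin i has x[i, j, k] = 1.
    for j in customers:
        for k in vehicles:
            if sum(x[i, j, k] for i in origins) != 1:
                return False
    # _C7 to _C9: for each origin i, at most one (j, k) has x[i, j, k] = 1.
    for i in origins:
        if sum(x[i, j, k] for j in customers for k in vehicles) > 1:
            return False
    return True


solutions = [
    bits for bits in product((0, 1), repeat=len(keys))
    if satisfies_c3_to_c9(dict(zip(keys, bits)))
]
print(len(solutions))
# 0
```

None of the 4096 assignments passes, so under that assumption no binary solution exists and the values reported by the solver should not be read as a valid answer.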
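prob += nr == 1
In plain Python, "+=" is augmented assignment,
and "==" checks for equality, which normally belongs in an "if" statement or a "while".
PuLP overloads both operators, though: "nr == 1" builds a constraint object rather than a boolean, and "prob +=" adds that constraint to the model, so this line is valid as written. A plain equality test would instead look like:
if total == 1: #execute what follows if total is equal to 1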

How do I add to a grid coordinate in python?

What I'm trying to do is have a 2D array and for every coordinate in the array, ask all the other 8 coordinates around it if they have stored a 1 or a 0. Similar to a minesweeper looking for mines.
I used to have this:
grid = []
for fila in range(10):
    grid.append([])
    for columna in range(10):
        grid[fila].append(0)
#edited
for fila in range(10):
    for columna in range(10):
        neighbour = 0
        for i in range(10):
            for j in range(10):
                if grid[fila + i][columna + j] == 1:
                    neighbour += 1
But something didn't work well. I also added print statements to try to find the error that way, but I still didn't understand why it only got through half of the for loop. So I changed the second for loop to this:
#edited
for fila in range(10):
    for columna in range(10):
        neighbour = 0
        if grid[fila - 1][columna - 1] == 1:
            neighbour += 1
        if grid[fila - 1][columna] == 1:
            neighbour += 1
        if grid[fila - 1][columna + 1] == 1:
            neighbour += 1
        if grid[fila][columna - 1] == 1:
            neighbour += 1
        if grid[fila][columna + 1] == 1:
            neighbour += 1
        if grid[fila + 1][columna - 1] == 1:
            neighbour += 1
        if grid[fila + 1][columna] == 1:
            neighbour += 1
        if grid[fila + 1][columna + 1] == 1:
            neighbour += 1
And got this error:
if grid[fila - 1][columna + 1] == 1:
IndexError: list index out of range
It seems like I can't add on the grid coordinates but I can subtract. Why is that?
Valid indices in Python are -len(grid) to len(grid)-1. Positive indices access elements by offset from the front, negative ones from the rear. Adding gives a range error if the index becomes greater than len(grid)-1, which is what you see. Subtracting does not give a range error unless the index drops below -len(grid). Although you do not check the lower bound, which is 0 (zero), it seems to work for you because small negative indices return values from the rear end. This is a silent error that leads to wrong neighbourhood results.
If you are computing offsets, you need to make sure your offsets are within the bounds of the lists you have. So if you have 10 elements, don't try to access the 11th element.
import collections
grid_offset = collections.namedtuple('grid_offset', 'dr dc')
Grid = [[0 for c in range(10)] for r in range(10)]
Grid_height = len(Grid)
Grid_width = len(Grid[0])
Neighbors = [
    grid_offset(dr, dc)
    for dr in range(-1, 2)
    for dc in range(-1, 2)
    if not dr == dc == 0
]
def count_neighbors(row, col):
    count = 0
    for nb in Neighbors:
        r = row + nb.dr
        c = col + nb.dc
        if 0 <= r < Grid_height and 0 <= c < Grid_width:
            # Add the value, or just add one?
            count += Grid[r][c]
    return count
Grid[4][6] = 1
Grid[5][4] = 1
Grid[5][5] = 1
for row in range(10):
    for col in range(10):
        print(count_neighbors(row, col), "", end='')
    print()
Prints:
$ python test.py
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 1 1 1 0 0
0 0 0 1 2 3 1 1 0 0
0 0 0 1 1 2 2 1 0 0
0 0 0 1 2 2 1 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
The error is exactly what it says, you need to check if the coordinates fit within the grid:
0 <= i < 10 and 0 <= j < 10
Otherwise you're trying to access an element that doesn't exist in memory, or an element that's not the one you're actually thinking about - Python handles negative indexes, they're counted from the end.
E.g. a[-1] is the last element, exactly the same as a[len(a) - 1].
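A short sketch of both behaviours described above, with a bounds-checked lookup as one possible guard. The helper name `get_cell` and its `default` parameter are hypothetical, not part of any library.

```python
row = [10, 20, 30]

# A negative index counts from the end instead of raising an error.
print(row[-1] == row[len(row) - 1])  # True

# An index past the end raises IndexError.
try:
    row[len(row)]
except IndexError as exc:
    print("IndexError:", exc)


def get_cell(grid, r, c, default=0):
    """Return grid[r][c], or default when (r, c) lies outside the grid."""
    if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
        return grid[r][c]
    return default


grid = [[1, 0], [0, 1]]
print(get_cell(grid, -1, 0), get_cell(grid, 1, 1), get_cell(grid, 2, 0))
# 0 1 0
```

Routing every neighbour lookup through a helper like this treats cells outside the grid as empty, so neither the silent wrap-around nor the IndexError can occur.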
