How to do majority voting on columns in pandas - python

I have a dataframe which has 10 different columns, A1, A2, ...,A10. These columns contain y or n. I'd like to create another column whose value is y if the majority of columns (A1, A2, ...,A10) are y and n otherwise. How can I do this?

Use DataFrame.mode:
df['majority'] = df.mode(axis=1)[0]
Example:
import numpy as np
import pandas as pd
np.random.seed(0)
df = pd.DataFrame(np.random.choice(['y', 'n'], size=(10, 10)))
print(df)
0 1 2 3 4 5 6 7 8 9
0 y n n y n n n n n n
1 n y y n y y y y y n
2 y n n y y n n n n y
3 n y n y n n y n n y
4 y n y n n n n n y n
5 y n n n n y n y y n
6 n y n y n y y y y y
7 n n y y y n n y n y
8 y n y n n n n n n y
9 n n y y n y y n n y
df['majority'] = df.mode(axis=1)[0]
print(df)
0 1 2 3 4 5 6 7 8 9 majority
0 y n n y n n n n n n n
1 n y y n y y y y y n y
2 y n n y y n n n n y n
3 n y n y n n y n n y n
4 y n y n n n n n y n n
5 y n n n n y n y y n n
6 n y n y n y y y y y y
7 n n y y y n n y n y n
8 y n y n n n n n n y n
9 n n y y n y y n n y n
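If you need to distinguish a true majority from a split decision, you could use numpy.where. A tied row has more than one mode, so it has more than one non-null value in the result of mode, e.g.:
mode = df.iloc[:, :10].mode(axis=1)  # use only the 10 vote columns
df['majority'] = np.where(mode.notna().sum(axis=1) == 1, mode[0], 'split')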
print(df)
0 1 2 3 4 5 6 7 8 9 majority
0 y n n y n n n n n n n
1 n y y n y y y y y n y
2 y n n y y n n n n y n
3 n y n y n n y n n y n
4 y n y n n n n n y n n
5 y n n n n y n y y n n
6 n y n y n y y y y y y
7 n n y y y n n y n y split
8 y n y n n n n n n y n
9 n n y y n y y n n y split
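If the rule really is "y only with a strict majority, n otherwise" (so that a tie counts as n), counting the y votes directly is another option. The following is a minimal sketch under that assumption; the small `votes` frame and its column names A1 to A4 are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: three rows of votes across four columns.
votes = pd.DataFrame(
    [list("yyyn"), list("ynnn"), list("yynn")],
    columns=["A1", "A2", "A3", "A4"],
)

# Count the 'y' votes per row and compare against half the number of columns.
y_count = votes.eq("y").sum(axis=1)
majority = np.where(y_count > votes.shape[1] / 2, "y", "n")

result = votes.assign(majority=majority)
print(result)
```

Unlike the mode-based approach, this never depends on how ties are ordered: a row only gets y when more than half of its columns are y.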

Related

Multiplication table with X's while using functions?

Comp sci student here,
Very lost on how to add those X's on a multiplication table like the added photo. https://i.stack.imgur.com/cdHoZ.png
How on earth would I add those X's while also using functions? Here's my code if this helps:
for i in range(1,11):
    for j in range(1,11):
        print(i * j, end='\t')
    print('')
The rule for the X is i > 3 and j > 2 and i * j != 81:
for i in range(1, 10):
    for j in range(1, 10):
        if i > 3 and j > 2 and i * j != 81:
            print('X', end='\t')
        else:
            print(i * j, end='\t')
    print()
1 2 3 4 5 6 7 8 9
2 4 6 8 10 12 14 16 18
3 6 9 12 15 18 21 24 27
4 8 X X X X X X X
5 10 X X X X X X X
6 12 X X X X X X X
7 14 X X X X X X X
8 16 X X X X X X X
9 18 X X X X X X 81
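Since the question asks about functions, one possible way to organise the same rule is sketched below. The names `cell` and `table_rows` are hypothetical, and the `i * j != 81` exception is hard-coded for the 9 × 9 table shown above.

```python
def cell(i, j):
    """Return 'X' for a masked cell, otherwise the product as a string."""
    if i > 3 and j > 2 and i * j != 81:
        return "X"
    return str(i * j)


def table_rows(size=9):
    """Build one tab-separated line per row of the table."""
    return [
        "\t".join(cell(i, j) for j in range(1, size + 1))
        for i in range(1, size + 1)
    ]


for line in table_rows():
    print(line)
```

Keeping the rule in its own function means the loop that prints the table does not need to change if the masking rule does.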

bomb including ordered pairs in python

I'm trying to do this task in python 3:
Get as many ordered pairs as the user wants, separated by spaces, like: (1,3) (5,6) ...
Print a 10 × 10 square made of Xs.
For each ordered pair given, print an O in its place instead.
Note: the origin (0,0) of the imaginary coordinate system on this square is the top-left corner.
well I wrote this code:
x = input()
L=(x.split())
for i in range(0,len(L)):
    for n in range (0,10):
        for m in range (0,10):
            if (m == int((L[i])[1]) and n == int((L[i])[3])):
                print("O", end=" ")
            else:
                print("X", end=" ")
        print()
but it has a Problem: it prints more than one Square. when I give two inputs, it Prints two Squares :(
like this:
(0,0) (3,5)
O X X X X X X X X X
X X X X X X X X X X
X X X X X X X X X X
X X X X X X X X X X
X X X X X X X X X X
X X X X X X X X X X
X X X X X X X X X X
X X X X X X X X X X
X X X X X X X X X X
X X X X X X X X X X
X X X X X X X X X X
X X X X X X X X X X
X X X X X X X X X X
X X X X X X X X X X
X X X X X X X X X X
X X X O X X X X X X
X X X X X X X X X X
X X X X X X X X X X
X X X X X X X X X X
X X X X X X X X X X
but it's Supposed to be like:
(0,0) (1,2) (3,3) (1,5) (8,9)
O X X X X X X X X X
X X X X X X X X X X
X O X X X X X X X X
X X X O X X X X X X
X X X X X X X X X X
X O X X X X X X X X
X X X X X X X X X X
X X X X X X X X X X
X X X X X X X X X X
X X X X X X X X O X
any help would be Appreciated.
I'm a Beginner :(
import ast
user_input = input()
ordered_pairs = [ast.literal_eval(i) for i in user_input.split(' ')]
l = [['X' for j in range(10)] for i in range(10)]
for x, y in ordered_pairs:
    l[y][x] = 'O'
print('\n', user_input, sep='')
for i in l:
    print(*i)
(0,0) (1,2) (3,3) (1,5) (8,9)
O X X X X X X X X X
X X X X X X X X X X
X O X X X X X X X X
X X X O X X X X X X
X X X X X X X X X X
X O X X X X X X X X
X X X X X X X X X X
X X X X X X X X X X
X X X X X X X X X X
X X X X X X X X O X
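The code in the question prints one square per pair because the loop over the pairs is the outermost loop. A minimal sketch of a variant that walks the grid once and checks each cell against a set of pairs might look like this; `render_grid` is a hypothetical helper name, and the input is assumed to be well-formed pairs such as `(1,2)` separated by whitespace.

```python
import ast


def render_grid(text, size=10):
    """Return the grid as a list of lines, with 'O' at every given (x, y) pair."""
    # ast.literal_eval turns the text "(1,2)" into the tuple (1, 2).
    pairs = {ast.literal_eval(token) for token in text.split()}
    return [
        " ".join("O" if (x, y) in pairs else "X" for x in range(size))
        for y in range(size)
    ]


grid = render_grid("(0,0) (1,2) (3,3) (1,5) (8,9)")
for line in grid:
    print(line)
```

Parsing whole tuples rather than single characters also lets the same sketch handle coordinates with more than one digit.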
Try this:
import numpy as np
import pandas as pd
x = "(0,0) (1,2) (3,3) (1,5) (8,9)"
x = [eval(i.replace("(", "").replace(")", "")) for i in x.split()]
X = np.array(x)
cols = X[:, 0].max()
rows = X[:, 1].max()
d = pd.DataFrame(np.zeros((max(rows, cols)+1, max(rows, cols)+1))).replace(0, "X")
d:
0 1 2 3 4 5 6 7 8 9
0 X X X X X X X X X X
1 X X X X X X X X X X
2 X X X X X X X X X X
3 X X X X X X X X X X
4 X X X X X X X X X X
5 X X X X X X X X X X
6 X X X X X X X X X X
7 X X X X X X X X X X
8 X X X X X X X X X X
9 X X X X X X X X X X
R = [i[1] for i in x]
C = [i[0] for i in x]
for i in range(len(R)):
    print("Row:", R[i], end="\t")
    print("Col:", C[i])
for i in range(len(R)):
    d.iloc[R[i], C[i]] = "O"
Row: 0 Col: 0
Row: 2 Col: 1
Row: 3 Col: 3
Row: 5 Col: 1
Row: 9 Col: 8
d:
0 1 2 3 4 5 6 7 8 9
0 O X X X X X X X X X
1 X X X X X X X X X X
2 X O X X X X X X X X
3 X X X O X X X X X X
4 X X X X X X X X X X
5 X O X X X X X X X X
6 X X X X X X X X X X
7 X X X X X X X X X X
8 X X X X X X X X X X
9 X X X X X X X X O X
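If NumPy is available anyway, the per-pair loop can also be replaced by a single indexed assignment. This is only a sketch: the pairs are hard-coded here instead of being read from input, and the 10 × 10 size is taken from the task description.

```python
import numpy as np

# Pairs are (x, y), i.e. (column, row), as in the question.
pairs = [(0, 0), (1, 2), (3, 3), (1, 5), (8, 9)]
xs, ys = zip(*pairs)

# Start from a 10 x 10 grid of 'X' and set every listed cell in one step.
grid = np.full((10, 10), "X")
grid[list(ys), list(xs)] = "O"

lines = [" ".join(row) for row in grid]
print("\n".join(lines))
```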

Merge multiple Series as a single column into a DataFrame

I have the following data frames:
A.
k m n
0 x x x
1 x x x
2 x x x
3 x x x
4 x x x
5 x x x
6 x x x
7 x x x
8 x x x
9 x x x
B1.
l i j
1 x 46 x
2 x 64 x
3 x 83 x
9 x 70 x
B2.
l i j
0 x 23 x
4 x 34 x
6 x 54 x
8 x 32 x
B3.
l i j
0 x 11 x
5 x 98 x
7 x 94 x
9 x 80 x
How can I add the column "i" (from data frames B1, B2, and B3) to the data frame A?
Regarding the duplicate values (e.g. index 9 in B1 and B3 & index 0 in B2 and B3), I want to keep the leftmost value from [B1, B2, B3] (e.g. 23 for index 0 & 70 for index 9).
A desired output would be:
k m n i
0 x x x 23
1 x x x 46
2 x x x 64
3 x x x 83
4 x x x 34
5 x x x 98
6 x x x 54
7 x x x 94
8 x x x 32
9 x x x 70
You can concat the Bx dataframes and use duplicated on the index to drop repeated index labels, keeping the first occurrence:
A['i'] = (pd.concat([B1, B2, B3])
            .loc[lambda x: ~x.index.duplicated(keep='first'), 'i'])
print(A)
k m n i
0 x x x 23
1 x x x 46
2 x x x 64
3 x x x 83
4 x x x 34
5 x x x 98
6 x x x 54
7 x x x 94
8 x x x 32
9 x x x 70
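Another way to express "keep the leftmost value" is to group the concatenated column by index label and take the first entry of each group. The sketch below rebuilds the frames from the question (with the placeholder x values) so that it runs on its own; it assumes the frames are concatenated in priority order B1, B2, B3.

```python
import pandas as pd

A = pd.DataFrame({"k": ["x"] * 10, "m": ["x"] * 10, "n": ["x"] * 10})
B1 = pd.DataFrame({"l": ["x"] * 4, "i": [46, 64, 83, 70], "j": ["x"] * 4}, index=[1, 2, 3, 9])
B2 = pd.DataFrame({"l": ["x"] * 4, "i": [23, 34, 54, 32], "j": ["x"] * 4}, index=[0, 4, 6, 8])
B3 = pd.DataFrame({"l": ["x"] * 4, "i": [11, 98, 94, 80], "j": ["x"] * 4}, index=[0, 5, 7, 9])

# concat keeps the B1, B2, B3 order, so first() picks the leftmost value per index label.
merged_i = pd.concat([B1["i"], B2["i"], B3["i"]]).groupby(level=0).first()

# Assignment aligns on the index of A.
A["i"] = merged_i
print(A)
```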

create a matrix from columns and horizontal lines

How can I create a matrix by using rows and columns?
When I print the matrix, the output should look like this:
O X X X X X X X X X X X X X X X
N X X X X X X X X X X X X X X X
M X X X X X X X X X X X X X X X
L X X X X X X X X X X X X X X X
K X X X X X X X X X X X X X X X
J X X X X X X X X X X X X X X X
I X X X X X X X X X X X X X X X
H X X X X X X X X X X X X X X X
G X X X X X X X X X X X X X X X
F X X X X X X X X X X X X X X X
E X X X X X X X X X X X X X X X
D X X X X X X X X X X X X X X X
C X X X X X X X X X X X X X X X
B X X X X X X X X X X X X X X X
A X X X X X X X X X X X X X X X
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
I think I need to use a list in a dictionary and use a matrix so that the "X"s can be edited later.
hall_dictionary = {}
hall_dictionary["merhaba"] = []
rows = 10
columns = 15
x = [[hall_dictionary["merhaba"] for i in range(columns)] for j in range(rows)]
You can encapsulate the whole data storage in a class. It handles all the bookkeeping, and you simply use A to ... and 1 to ... to change the X.
Internally it uses a simple one-dimensional list:
class Field:
    def __init__(self, rows, cols, init_piece="x"):
        self.rows = rows
        self.cols = cols
        self.field = [init_piece] * rows * cols

    def place_at(self, row, col, piece):
        """Changes one tile on the field. Does all the reverse-engineering to compute
        1-dim place of A..?,1..? given tuple of coords."""
        def validation():
            """Raises error when out of bounds."""
            error = []
            last_row = chr(ord("A") + self.rows - 1)
            # the row must be a single letter inside the field, not just any letter
            if not (isinstance(row, str) and len(row) == 1
                    and "A" <= row.upper() <= last_row):
                error.append("Use rows between A and {}".format(last_row))
            if not (0 < col <= self.cols):
                error.append("Use columns between 1 and {}".format(self.cols))
            if error:
                error = ["Invalid row/column: {}/{}".format(row,col)] + error
                raise ValueError('\n- '.join(error))
        validation()
        row = ord(row.upper()[0]) - ord("A")
        self.field[row * self.cols + col - 1] = piece

    def print_field(self):
        """Prints the playing field."""
        for c in range(self.rows - 1,-1,-1):
            ch = chr(ord("A") + c)
            print("{:<4} ".format(ch), end = "")
            print(("{:>2} " * self.cols).format(*self.field[c * self.cols:
                                                           (c + 1) * self.cols]))
        print("{:<4} ".format(""), end = "")
        print(("{:>2} " * self.cols).format(*range(1,self.cols + 1)))
Then you can use it like so:
rows = 10
cols = 15
f = Field(rows,cols)
f.print_field()
# this uses A...? and 1...? to set things
for r,c in [(0,0),("A",1),("ZZ",99),("A",99),("J",15)]:
    try:
        f.place_at(r,c,"i") # set to 'i'
    except ValueError as e:
        print(e)
f.print_field()
Output (before):
J x x x x x x x x x x x x x x x
I x x x x x x x x x x x x x x x
H x x x x x x x x x x x x x x x
G x x x x x x x x x x x x x x x
F x x x x x x x x x x x x x x x
E x x x x x x x x x x x x x x x
D x x x x x x x x x x x x x x x
C x x x x x x x x x x x x x x x
B x x x x x x x x x x x x x x x
A x x x x x x x x x x x x x x x
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Output (setting things && after):
Invalid row/column: 0/0
- Use rows between A and J
- Use columns between 1 and 15
Invalid row/column: ZZ/99
- Use rows between A and J
- Use columns between 1 and 15
Invalid row/column: A/99
- Use columns between 1 and 15
J x x x x x x x x x x x x x x i
I x x x x x x x x x x x x x x x
H x x x x x x x x x x x x x x x
G x x x x x x x x x x x x x x x
F x x x x x x x x x x x x x x x
E x x x x x x x x x x x x x x x
D x x x x x x x x x x x x x x x
C x x x x x x x x x x x x x x x
B x x x x x x x x x x x x x x x
A i x x x x x x x x x x x x x x
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Sounds like a 2-D array (similar to the answer to "How to define a two-dimensional array in Python"), so something like this:
import string
vertical = list(string.ascii_uppercase[0:15][::-1]) # ['O', 'N', ..., 'A']
columns = 15
hall_dictionary = {}
hall_dictionary["merhaba"] = [[x for x in range(columns)] for y in vertical]
for i in range(len(vertical)):
    for j in range(columns):
        hall_dictionary["merhaba"][i][j] = 'X'
Then you can index as desired:
hall_dictionary["merhaba"][0][1] # Always prints 'X'
Display the entire array:
for row in hall_dictionary["merhaba"]:
    print(row)
['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X']
... 15 rows ...
And assign or update values:
hall_dictionary["merhaba"][0][2] = 'O'
for row in hall_dictionary["merhaba"]:
    print(row)
['X', 'X', 'O', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X']
...
confirms element [0][2] has been updated.
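To get the letter-labelled layout from the question, one option is to key the dictionary by row letter instead of indexing rows by position. This is a rough sketch; `make_hall` and `format_hall` are hypothetical helper names, and the layout (rows O down to A, column numbers 0 to 14 on the last line) is taken from the desired output in the question.

```python
import string


def make_hall(rows=15, columns=15, fill="X"):
    """Map each row letter to its own independent list of seats."""
    return {letter: [fill] * columns for letter in string.ascii_uppercase[:rows]}


def format_hall(hall):
    """Return printable lines: highest row letter first, column numbers last."""
    lines = [
        letter + " " + " ".join(seats)
        for letter, seats in sorted(hall.items(), reverse=True)
    ]
    columns = len(next(iter(hall.values())))
    lines.append("  " + " ".join(str(c) for c in range(columns)))
    return lines


hall = make_hall()
hall["C"][2] = "O"  # edit a single seat by row letter and column number
hall_lines = format_hall(hall)
print("\n".join(hall_lines))
```

Each row gets its own list, so changing one seat does not affect the other rows.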
If you're interested, you can use a pandas DataFrame as well.
import pandas as pd
rows = 10
columns = 15
def indexToLetter(index: int):
    # Convert an integer index [0, ∞) to an
    # alphabetical index [A..Z, AA..ZZ, AAA...]
    index += 1  # work with 1-based positions
    ret = ''
    while index > 0:
        index, rem = divmod(index - 1, 26)
        ret = chr(ord('A') + rem) + ret
    return ret
# create the row labels
rLabels = [*map(indexToLetter, range(rows))]
# create the dataframe, note that we can simplify
# [['X' for i in range(columns)] for j in range(rows)]
# to [['X'] * columns] * rows
df = pd.DataFrame([['X'] * columns] * rows, index=rLabels)
print(df)
Output:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
A X X X X X X X X X X X X X X X
B X X X X X X X X X X X X X X X
C X X X X X X X X X X X X X X X
D X X X X X X X X X X X X X X X
E X X X X X X X X X X X X X X X
F X X X X X X X X X X X X X X X
G X X X X X X X X X X X X X X X
H X X X X X X X X X X X X X X X
I X X X X X X X X X X X X X X X
J X X X X X X X X X X X X X X X
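The output looks slightly ugly and might not be what you're looking for, but a DataFrame makes it very convenient to manipulate matrices and tables of data.
You can set individual cells with df.loc[row_label, column_label]. Chained indexing such as df[1][0] = 'O' (column first, then row position) is unreliable for assignment and does not modify df when pandas copy-on-write is enabled.
df.loc['A', 1] = 'O'
df.loc['C', 1] = 'O'
df.loc['D', 1] = 'O'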
print(df)
Output:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
A X O X X X X X X X X X X X X X
B X X X X X X X X X X X X X X X
C X O X X X X X X X X X X X X X
D X O X X X X X X X X X X X X X
E X X X X X X X X X X X X X X X
F X X X X X X X X X X X X X X X
G X X X X X X X X X X X X X X X
H X X X X X X X X X X X X X X X
I X X X X X X X X X X X X X X X
J X X X X X X X X X X X X X X X
Say, someone wants to book the entire row 'E' of the hall.
if any(df.loc['E'] == 'O'): # check if any seats were taken
    print('Error: some seats in Row <E> are taken.')
else:
    df.loc['E'] = 'O'
print(df)
Output:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
A X O X X X X X X X X X X X X X
B X X X X X X X X X X X X X X X
C X O X X X X X X X X X X X X X
D X O X X X X X X X X X X X X X
E O O O O O O O O O O O O O O O
F X X X X X X X X X X X X X X X
G X X X X X X X X X X X X X X X
H X X X X X X X X X X X X X X X
I X X X X X X X X X X X X X X X
J X X X X X X X X X X X X X X X
Note: you can also do df.iloc[4] to access row E. Want to access rows B to E? Use df.loc['B':'E'] or df.iloc[1:5].
You can also do the same with columns by accessing df[<column_index>] = 'O'.
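The same check-then-book pattern can be applied to a single seat. The sketch below is one way to wrap it in a function; `book_seat` is a hypothetical name, and the hall is rebuilt here with rows A to J and columns 0 to 14 so that the example runs on its own.

```python
import pandas as pd


def book_seat(hall, row, column):
    """Mark one seat as taken ('O'); raise ValueError if it is already taken."""
    if hall.loc[row, column] == "O":
        raise ValueError("Seat {}{} is already taken.".format(row, column))
    hall.loc[row, column] = "O"


# 10 rows labelled A..J, 15 columns labelled 0..14, all seats free ('X').
hall = pd.DataFrame([["X"] * 15 for _ in range(10)], index=list("ABCDEFGHIJ"))

book_seat(hall, "E", 3)
print(hall)
```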

How to convert row values to attributes (columns) in pandas

I have a dataset in pandas with column pid (patient id), and code (drug code), sorted in rows as the example shows. I need to convert them to 1 patient/row, and list all the drugs as attributes for each patient.
What I have now:
pid code
1 Az
1 Bn
2 Az
2 Bn
2 C4
3 Bn
3 C4
3 Dx
4 Az
4 Bn
4 Dx
4 E
5 C4
5 Dx
5 E
I need to convert it to:
pid Az Bn C4 Dx E
1 y y n n n
2 y y y n n
3 n y y y n
4 y y n y y
5 n n y y y
If I understand correctly, you can use crosstab:
pd.crosstab(df.pid,df.code).replace({1:'y',0:'n'})
Out[231]:
code Az Bn C4 Dx E
pid
1 y y n n n
2 y y y n n
3 n y y y n
4 y y n y y
5 n n y y y
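Note that crosstab counts occurrences, so a (pid, code) pair that appears more than once would produce a 2 that the 1/0 replacement leaves untouched. If such repeats are possible in the data (an assumption, since the sample has none), comparing the counts against zero is one way around it. The small frame below is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data in which patient 2 has the code C4 recorded twice.
df = pd.DataFrame({
    "pid": [1, 1, 2, 2, 2, 2],
    "code": ["Az", "Bn", "Az", "Bn", "C4", "C4"],
})

counts = pd.crosstab(df["pid"], df["code"])

# Any positive count becomes 'y', zero becomes 'n'.
wide = pd.DataFrame(
    np.where(counts > 0, "y", "n"),
    index=counts.index,
    columns=counts.columns,
)
print(wide)
```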
One way is to pivot your dataframe and fill the missing combinations with 'n':
new_df = df.assign(values='y').pivot(index='pid', columns='code', values='values').fillna('n')
>>> new_df
code Az Bn C4 Dx E
pid
1 y y n n n
2 y y y n n
3 n y y y n
4 y y n y y
5 n n y y y
Having fun!
Fun 1
Create a Series with a MultiIndex and unstack
pd.Series('y', df.values.T.tolist()).unstack(fill_value='n')
Az Bn C4 Dx E
1 y y n n n
2 y y y n n
3 n y y y n
4 y y n y y
5 n n y y y
Fun 2
Use defaultdict
from collections import defaultdict
d = defaultdict(dict)
for i, p, c in df.itertuples():
    d[c][p] = 'y'
pd.DataFrame(d).fillna('n')
Az Bn C4 Dx E
1 y y n n n
2 y y y n n
3 n y y y n
4 y y n y y
5 n n y y y
Fun 3
i, r = pd.factorize(df.pid)
j, c = pd.factorize(df.code)
e = np.empty((len(r), len(c)), str)
e.fill('n')
e[i, j] = 'y'
pd.DataFrame(e, r, c)
Az Bn C4 Dx E
1 y y n n n
2 y y y n n
3 n y y y n
4 y y n y y
5 n n y y y
