Counter set or rows with the same numbering based on condition

I have a dataset. One column holds True or False values for a certain condition. If a sequence of consecutive rows has the same value, those rows should share the same counter, and the counter should increase once the value has switched twice.
To make it clear, below is my code:
    import numpy as np

    c1 = [True,True,False,False,False,True,False,True,True,False,True]
    counter = 1
    switch = 0 #increase the counter when the vector has switched twice
    c2 = np.repeat(None, len(c1))
    c2[i]=counter
    for i in range(1,len(c1)):
        p = c1[i-1]
        x = c1[i]
        if p==x:
            counter=counter
            c2[i]=counter
        if p!=x :
            switch = switch + 1
            c2[i]=switch
        elif switch == 2:
            counter = counter + 1
            switch = 0 #reset the counter
    print(c2)
The actual output is
[None 1 1 1 1 2 3 4 1 5 6]
while the expected one should be
[None, 1,1,1,1,2,2,3,3,3,4]

    c1 = [True,True,False,False,False,True,False,True,True,False,True]
    res = []
    var = 1
    cur=c1[0]
    flag = 0
    res.append(None)
    for val in c1[1:]:
        if val==cur and flag == 0:
            res.append(var)
        elif val == cur and flag == 1:
            var+=1
            flag = 0
            res.append(var)
        elif val != cur and flag == 0:
            flag = 1
            res.append(var)
        elif val != cur and flag == 1:
            res.append(var)
        else:
            pass
    print(res)
Output: [None, 1, 1, 1, 1, 2, 2, 3, 3, 3, 4]
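The rule in the question (raise the counter on every second switch) can also be written without a flag, by counting the switches and integer-dividing by two. This is only a minimal sketch of that idea; the function name pair_switch_counter is made up for illustration and does not come from the answer above.

```python
def pair_switch_counter(flags):
    """Label each row with 1 + (number of value switches so far) // 2.

    The first row has no predecessor, so it is labelled None,
    matching the expected output in the question.
    """
    result = [None]
    switches = 0
    # Walk over consecutive pairs (previous value, current value).
    for prev, cur in zip(flags, flags[1:]):
        if prev != cur:
            switches += 1
        result.append(1 + switches // 2)
    return result


c1 = [True, True, False, False, False, True, False, True, True, False, True]
print(pair_switch_counter(c1))
# [None, 1, 1, 1, 1, 2, 2, 3, 3, 3, 4]
```

Because it compares each row with the one before it, this version does not depend on what the first value of the list happens to be.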

Related

Why won't second for loop execute correctly?

I'm trying to write two for loops that will return a score for different inputs, and create a new field with the new score. The first loop works fine but the second loop never returns the correct score.
    import pandas as pd
    d = {'a':['foo','bar'], 'b':[1,3]}
    df = pd.DataFrame(d)
    score1 = df.loc[df['a'] == 'foo']
    score2 = df.loc[df['a'] == 'bar']
    for i in score1['b']:
        if i < 3:
            score1['c'] = 0
        elif i <= 3 and i < 4:
            score1['c'] = 1
        elif i >= 4 and i < 5:
            score1['c'] = 2
        elif i >= 5 and i < 8:
            score1['c'] = 3
        elif i == 8:
            score1['c'] = 4
    for j in score2['b']:
        if j < 2:
            score2['c'] = 0
        elif j <= 2 and i < 4:
            score2['c'] = 1
        elif j >= 4 and i < 6:
            score2['c'] = 2
        elif j >= 6 and i < 8:
            score2['c'] = 3
        elif j == 8:
            score2['c'] = 4
    print(score1)
    print(score2)
When I run the script, it returns the following:
    print(score1)
         a  b  c
    0  foo  1  0
    print(score2)
         a  b
    1  bar  3
Why doesn't score2 create the new field "c" or a score?

Avoid for loops when conditionally updating DataFrame columns, which are not Python lists. Use the vectorized methods of pandas and NumPy, such as numpy.select, which scale to millions of rows. These data science tools compute very differently from general-purpose Python:
    import numpy

    # LIST OF BOOLEAN CONDITIONS
    conds = [
        score1['b'].lt(3),                            # EQUIVALENT TO < 3
        score1['b'].between(3, 4, inclusive="left"),  # EQUIVALENT TO >= 3 and < 4
        score1['b'].between(4, 5, inclusive="left"),  # EQUIVALENT TO >= 4 and < 5
        score1['b'].between(5, 8, inclusive="left"),  # EQUIVALENT TO >= 5 and < 8
        score1['b'].eq(8)                             # EQUIVALENT TO == 8
    ]
    # LIST OF VALUES
    vals = [0, 1, 2, 3, 4]
    # VECTORIZED ASSIGNMENT
    score1['c'] = numpy.select(conds, vals, default=numpy.nan)

    # LIST OF BOOLEAN CONDITIONS
    conds = [
        score2['b'].lt(2),
        score2['b'].between(2, 4, inclusive="left"),
        score2['b'].between(4, 6, inclusive="left"),
        score2['b'].between(6, 8, inclusive="left"),
        score2['b'].eq(8)
    ]
    # LIST OF VALUES
    vals = [0, 1, 2, 3, 4]
    # VECTORIZED ASSIGNMENT
    score2['c'] = numpy.select(conds, vals, default=numpy.nan)
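As a runnable check of the numpy.select approach on the question's own data, here is a minimal sketch. The .copy() call and the plain comparison operators (instead of between) are my own choices and not part of the answer above; .copy() is there so that the new column is added to an independent frame rather than to a slice of df.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": ["foo", "bar"], "b": [1, 3]})

# Copy the filtered rows so that adding a column does not target a slice of df.
score2 = df.loc[df["a"] == "bar"].copy()

b = score2["b"]
# One boolean Series per score band; the first matching condition wins.
conds = [
    b < 2,
    (b >= 2) & (b < 4),
    (b >= 4) & (b < 6),
    (b >= 6) & (b < 8),
    b == 8,
]
vals = [0, 1, 2, 3, 4]

# Rows matching no condition get NaN, which makes the column float.
score2["c"] = np.select(conds, vals, default=np.nan)
print(score2)
```

Here b is 3, which falls in the second band, so the row gets a score of 1.0.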

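On the first iteration of the second for loop, j is 3, so none of your conditions is satisfied: j <= 2 is false, and the remaining branches need j to be at least 4. The later branches also test i, left over from the first loop, instead of j. The version below shifts the bounds so that 3 falls into a branch and tests j throughout:
    for j in score2['b']:
        if j < 3:
            score2['c'] = 0
        elif j >= 3 and j < 5:
            score2['c'] = 1
        elif j >= 5 and j < 7:
            score2['c'] = 2
        elif j >= 7 and j < 9:
            score2['c'] = 3
        elif j == 9:
            score2['c'] = 4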

How can I replace values in a CSV column from a range?

I am attempting to change the values of two columns in my dataset from specific numeric values (2, 10, 25 etc.) to single values (1, 2, 3 or 4) based on the percentile of the specific value within the dataset.
Using the pandas quantile() function I have got the ranges I wish to replace between, but I haven't figured out a working method to do so.
age1 = datasetNB.Age.quantile(0.25)
age2 = datasetNB.Age.quantile(0.5)
age3 = datasetNB.Age.quantile(0.75)
fare1 = datasetNB.Fare.quantile(0.25)
fare2 = datasetNB.Fare.quantile(0.5)
fare3 = datasetNB.Fare.quantile(0.75)
My current solution attempt for this problem is as follows:
    for elem in datasetNB['Age']:
        if elem <= age1:
            datasetNB[elem].replace(to_replace = elem, value = 1)
            print("set to 1")
        elif (elem > age1) & (elem <= age2):
            datasetNB[elem].replace(to_replace = elem, value = 2)
            print("set to 2")
        elif (elem > age2) & (elem <= age3):
            datasetNB[elem].replace(to_replace = elem, value = 3)
            print("set to 3")
        elif elem > age3:
            datasetNB[elem].replace(to_replace = elem, value = 4)
            print("set to 4")
        else:
            pass
    for elem in datasetNB['Fare']:
        if elem <= fare1:
            datasetNB[elem] = 1
        elif (elem > fare1) & (elem <= fare2):
            datasetNB[elem] = 2
        elif (elem > fare2) & (elem <= fare3):
            datasetNB[elem] = 3
        elif elem > fare3:
            datasetNB[elem] = 4
        else:
            pass
What should I do to get this working?

pandas already has a function for this: pandas.qcut.
You can simply do
q_list = [0, 0.25, 0.5, 0.75, 1]
labels = range(1, 5)
df['Age'] = pd.qcut(df['Age'], q_list, labels=labels)
df['Fare'] = pd.qcut(df['Fare'], q_list, labels=labels)
Input
    import numpy as np
    import pandas as pd
    # Generate fake data for the sake of example
    df = pd.DataFrame({
        'Age': np.random.randint(10, size=6),
        'Fare': np.random.randint(10, size=6)
    })
    >>> df
       Age  Fare
    0    1     6
    1    8     2
    2    0     0
    3    1     9
    4    9     6
    5    2     2
Output
DataFrame after running the above code
    >>> df
      Age Fare
    0   1    3
    1   4    1
    2   1    1
    3   1    4
    4   4    3
    5   3    1
Note that in your specific case, since you want quartiles, you can just assign q_list = 4.
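To make the q=4 shortcut concrete, here is a minimal sketch that reuses the sample values shown above as fixed data, so the result is reproducible. The "_quartile" column names are made up for illustration; the answer itself overwrites the original columns.

```python
import pandas as pd

# Same values as the random sample shown above, hard-coded for reproducibility.
df = pd.DataFrame({
    "Age": [1, 8, 0, 1, 9, 2],
    "Fare": [6, 2, 0, 9, 6, 2],
})

labels = [1, 2, 3, 4]
for column in ["Age", "Fare"]:
    # q=4 asks qcut for quartiles directly, instead of listing the quantiles.
    quartile = pd.qcut(df[column], q=4, labels=labels)
    # qcut returns a categorical; convert to plain integers for later arithmetic.
    df[column + "_quartile"] = quartile.astype(int)

print(df)
```

Keeping the original columns next to the new ones makes it easy to check which quartile each value landed in. Note that qcut raises an error when two quantile edges coincide, which can happen with heavily repeated values.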

Sudoku brute force always returns None [duplicate]

This question already has answers here:
Why does my recursive function return None?
(4 answers)
Closed 1 year ago.
I have reviewed multiple questions regarding similar problems but have not been able to find a solution. I have to write a script for solving and displaying a sudoku puzzle in a text file. If there is no possible solution, it should return None, otherwise it should print the formatted and solved puzzle. My issue is that my solve() function always returns None, even for puzzles with simple solutions. I have tried to debug by following the logic and reviewing my syntax but I fear I am making a simple mistake I cannot see. My functions are
    def load_puzzle(path):
        with open(path) as fin:
            contents = fin.read()
        lines = contents.split('\n')
        puzzle = []
        for i in lines:
            token_strings = i.split(' ')
            token_ints = [int(x) for x in token_strings]
            puzzle.append(token_ints)
        return puzzle

    def display_puzzle(puzzle):
        for i in range(9):
            if i == 0 or i == 3 or i == 6:
                print('+-------+-------+-------+')
            row = '| '
            for j in range(9):
                if puzzle[i][j] == 0:
                    row = row + '. '
                else:
                    row = row + str(puzzle[i][j]) + ' '
                if j==2 or j==5 or j==8:
                    row = row + '| '
            print(row)
        print('+-------+-------+-------+')

    def get_next(row, col):
        if col < 8:
            #print('inside get_next') #DEBUG
            return row, col+1
        elif col == 8 and row < 8:
            return row+1, 0
        elif col == 8 and row == 8:
            return None, None

    def copy_puzzle(puzzle):
        new_puzzle = []
        for i in puzzle:
            new_puzzle.append(i.copy())
        return new_puzzle

    def get_options(puzzle, row, col):
        if puzzle[row][col] > 0:
            return None
        used = []
        for i in puzzle[row]:
            if i > 0:
                used.append(i)
        for i in range(9):
            if puzzle[i][col] > 0:
                used.append(puzzle[i][col])
        start_row = 3*int(row/3)
        start_col = 3*int(col/3)
        for i in range(2):
            for j in range(2):
                if puzzle[start_row + i][start_col +j] > 0:
                    used.append(puzzle[start_row + i][start_col +j])
        options = []
        for i in range(1, 10):
            if i not in used:
                options.append(i)
        return options

    def solve(puzzle, row=0, col=0):
        if puzzle[row][col] != 0:
            #print("here") # debugging
            next_row, next_col = get_next(row, col)
            #print(next_row, next_col) # debugging
            if next_row is None:
                return puzzle
            else:
                solve(puzzle, next_row, next_col)
        if puzzle[row][col] == 0:
            # print("there") # debugging
            options = get_options(puzzle, row, col)
            #print(options) # debugging
            if options == []:
                return None
            for i in options:
                new_puzzle = copy_puzzle(puzzle)
                new_puzzle[row][col] = i
                # display_puzzle(new_puzzle) # debugging
                result = solve(new_puzzle, row, col)
                if result is not None:
                    return result
The commented-out print() calls are ones I used to follow the loops and make sure the functions were operating as intended. As far as I can tell they were, but with so many loops, Jupyter Notebook began to print over itself and the display became indecipherable, and the function took an unreasonably long time to finish.
The initial test puzzle is a .txt file containing:
5 0 0 0 0 0 0 0 0
0 9 0 7 0 0 8 0 0
0 0 0 0 3 0 0 7 0
6 0 1 0 0 0 9 8 0
0 0 0 6 0 0 0 0 0
0 0 9 0 0 0 7 0 1
0 0 0 0 0 8 1 9 0
0 4 0 5 0 1 0 0 8
0 7 0 3 0 6 0 4 0

Solved:
There was a missing return statement before solve(puzzle, next_row, next_col).
Also, the ranges for the box check in get_options() should be range(3) instead of range(2).
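Put together, the two fixes could look like the following. This is a minimal sketch rather than the asker's final code: the set-based get_options and the inline row copy are my own simplifications, and the 3x3 box loop and the added return are the parts that correspond to the fixes described above.

```python
def get_next(row, col):
    """Return the next cell in row-major order, or (None, None) after the last one."""
    if col < 8:
        return row, col + 1
    if row < 8:
        return row + 1, 0
    return None, None


def get_options(puzzle, row, col):
    """Return the digits not yet used in the cell's row, column, or 3x3 box."""
    used = set(puzzle[row])
    used.update(puzzle[i][col] for i in range(9))
    start_row, start_col = 3 * (row // 3), 3 * (col // 3)
    for i in range(3):  # range(3), not range(2): the box is 3 cells wide
        for j in range(3):
            used.add(puzzle[start_row + i][start_col + j])
    return [n for n in range(1, 10) if n not in used]


def solve(puzzle, row=0, col=0):
    """Return a solved copy of the puzzle, or None if no solution exists."""
    if puzzle[row][col] != 0:
        next_row, next_col = get_next(row, col)
        if next_row is None:
            return puzzle
        # Without this return, the recursive result is discarded
        # and the caller receives None.
        return solve(puzzle, next_row, next_col)
    for option in get_options(puzzle, row, col):
        new_puzzle = [r.copy() for r in puzzle]
        new_puzzle[row][col] = option
        result = solve(new_puzzle, row, col)
        if result is not None:
            return result
    return None  # every option failed (or there were none)
```

Calling solve(load_puzzle(path)) with the question's load_puzzle would then return either the filled grid or None.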

Loop not functioning correctly

Working with data frames and this is the code I have for it.
    numbers = 3
    count=0
    A = 0
    B = 0
    C = 0
    for x in range(numbers):
        if str(data.iloc[count])== 'A':
            A += 1
        elif str(data.iloc[count])== 'B':
            B += 1
        elif str(data.iloc[count])== 'C':
            C += 1
    count +=1
    #this is to return the count to check if it works
    print A
    print B
    print C
But for some reason, when I run this code only the count for A increases.
For example, if the data at those indices were 'A', 'B', 'B', it still returns A = 3 and B = 0, where it should return A = 1, B = 2, and C = 0.
What am I doing wrong? Thanks again.

Since your count += 1 is not inside the for loop, it only runs once, after the loop has finished. It needs to be indented. Alternatively, you do not need a count variable at all, since x already runs through the range 0 to 2:
    numbers = 3
    A = 0
    B = 0
    C = 0
    for x in range(numbers):
        if str(data.iloc[x])== 'A':
            A += 1
        elif str(data.iloc[x])== 'B':
            B += 1
        elif str(data.iloc[x])== 'C':
            C += 1
    #this is to return the count to check if it works
    print A
    print B
    print C

This also worked
    count=0
    numbers = 3
    A = 0
    B = 0
    C = 0
    for x in range(numbers):
        count +=1
        if str(data.iloc[x])== 'A':
            A += 1
        elif str(data.iloc[x])== 'B':
            B += 1
        elif str(data.iloc[x])== 'C':
            C += 1
    #this is to return the count to check if it works
    print A
    print B
    print C
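If data is a pandas Series of labels, the same tally can be had without an explicit loop. This is a minimal sketch under that assumption (the question never shows what data looks like); the sample values are the 'A', 'B', 'B' case from the question.

```python
import pandas as pd

# Assumed stand-in for the question's `data`: a Series of single-letter labels.
data = pd.Series(["A", "B", "B"])

# value_counts tallies each distinct label; reindex adds labels that never
# occur (here "C") with a count of 0 instead of leaving them out.
counts = data.value_counts().reindex(["A", "B", "C"], fill_value=0)

print(counts["A"], counts["B"], counts["C"])
# 1 2 0
```

To look at only the first few rows, as the loop with numbers = 3 does, data.iloc[:3].value_counts() would restrict the tally in the same way.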

What am I missing with my approach to Euler project, p84?

I'm working my way through Project Euler. I have reached problem 84 and written a simulation of the game. When I run the simulation against the statistics given in the problem, I get the right sequence. When I run it with 2d4, I get wrong results; usually I get 101516. What am I missing?
Please note, I'm not looking for the solution or for you to fix my code. I only want to know where my algorithm is flawed.
    from random import randint
    import sys
    pos = 0
    doubles = 0
    CC = 0 #index of the next CC card (not in the original post; assumed to start at 0)
    CH = 0 #index of the next CH card (not in the original post; assumed to start at 0)
    csqs = [2,17,33] #The CC positions
    hsqs = [7,22,36] #The CH positions
    rounds = 0
    stop = 100000
    sqs = dict() #will store how many visits I had on each square
    for i in range (0,40): #initial values of sqs
        sqs[i] = 0

    def doCC(): # the CC cards. Picking a card at random or keeping them in order has no real effect on my result
        global pos,doubles
        global CC
        if CC == 0:
            pos = 0
        elif CC == 1:
            pos = 10
            doubles = 0
        CC += 1
        if CC == 16: CC = 0

    def doCH(): #CH cards
        global pos, doubles, CH
        if CH == 0: pos = 0
        elif CH == 1:
            pos = 10
            doubles = 0
        elif CH == 2: pos = 11
        elif CH == 3: pos = 24
        elif CH == 4: pos = 39
        elif CH == 5: pos = 5
        elif CH == 6 or CH == 7:
            if pos == 7:
                pos = 15
            elif pos == 22:
                pos = 25
            elif pos == 36:
                pos = 5
        elif CH ==8:
            if pos == 22: pos = 28
            else: pos = 12
        elif CH == 9:
            pos -= 3
        CH += 1
        if CH == 16: CH = 0

    while rounds < stop:
        d1 = randint(1,4)
        d2 = randint(1,4)
        if d1 == d2:
            doubles += 1 #counting doubles
        else:
            doubles = 0
        if doubles == 3:
            pos = 10
            doubles = 0
        pos += d1+d2
        if pos>= 40 : #you have just crossed go
            pos -= 40
        rounds += 1
        if rounds %10000 == 0: print rounds,
        if pos == 30: #g2j
            doubles = 0
            pos = 10
        if pos in hsqs:
            doCH()
        if pos in csqs:
            doCC()
        sqs[pos] += 1
    sys.stdout.flush()
    # Setting values
    m1 = 0
    m2 = 0
    m3 = 0
    v1 = 0
    v2 = 0
    v3 = 0
    su = 0
    for v in sqs:
        m = sqs[v]
        su += m
        if m > m1:
            m1 = m
            v1 = v
    for v in sqs:
        m = sqs[v]
        if m > m2 and m < m1:
            m2 = m
            v2 = v
    for v in sqs:
        m = sqs[v]
        if m > m3 and m < m2:
            m3 = m
            v3 = v
    for v in sqs:
        sqs[v] = sqs[v]*100.0/su
    print m1,m2,m3
    print v1,v2,v3
    print m1*100.0/su, m2*100.0/su, m3*100.0/su
    print sqs

When you roll three doubles in a row,
    if doubles == 3:
        pos = 10
        doubles = 0
    pos += d1+d2
you still move out of jail by d1+d2, but you shouldn't.
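The point can be illustrated in isolation from the rest of the simulation. The function below is a hypothetical helper (the name move and the constants are mine, not from the question) that handles only the dice step: on the third consecutive double the token goes to jail and the roll is not added.

```python
JAIL = 10
BOARD_SIZE = 40


def move(pos, doubles, d1, d2):
    """Apply one dice roll and return (new position, new doubles count).

    Card squares and the go-to-jail square are deliberately left out;
    this only covers the dice movement discussed above.
    """
    doubles = doubles + 1 if d1 == d2 else 0
    if doubles == 3:
        # Third double in a row: go straight to jail and do NOT add the roll.
        return JAIL, 0
    return (pos + d1 + d2) % BOARD_SIZE, doubles


print(move(5, 2, 3, 3))   # third double in a row -> (10, 0)
print(move(38, 0, 1, 2))  # normal move that wraps past GO -> (1, 0)
```

Returning early on the third double is what keeps the later addition of d1+d2 from moving the token off the jail square.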
