python Reverse Collatz Conjecture

The program should take a number of steps and a number, and then output how many unique sequences there are with exactly x steps that create the number.
Does someone know how I can save some memory? I need this to work for pretty large numbers within a 4-second limit.

    def IsaacRule(steps, number):
        if number in IsaacRule.numbers:
            return 0
        else:
            IsaacRule.numbers.add(number)
        if steps == 0:
            return 1
        counter = 0
        if ((number - 1) / 3) % 2 == 1:
            counter += IsaacRule(steps-1, (number - 1) / 3)
        if (number * 2) % 2 == 0:
            counter += IsaacRule(steps-1, number*2)
        return counter

    IsaacRule.numbers = set()
    print(IsaacRule(6, 2))

If someone knows a version with memoization, I would be thankful. Right now it works, but there is still room for improvement.

Baseline: IsaacRule(50, 2) takes 6.96s
0) Use the LRU Cache
This made the code take longer, and gave a different final result
1) Replace the if condition (number * 2) % 2 == 0 with True, since doubling an integer always gives an even number
IsaacRule(50, 2) takes 0.679s. Thanks Pm2Ring for this one.
2) Simplify ((number - 1) / 3) % 2 == 1 to number % 6 == 4 and use floor division where possible:
IsaacRule(50, 2) takes 0.499s
Truth table:
| n | n-1 | (n-1)/3 | (n-1)/3 % 2 | ((n-1)/3)%2 == 1 |
|---|-----|---------|-------------|------------------|
| 1 | 0 | 0.00 | 0.00 | FALSE |
| 2 | 1 | 0.33 | 0.33 | FALSE |
| 3 | 2 | 0.67 | 0.67 | FALSE |
| 4 | 3 | 1.00 | 1.00 | TRUE |
| 5 | 4 | 1.33 | 1.33 | FALSE |
| 6 | 5 | 1.67 | 1.67 | FALSE |
| 7 | 6 | 2.00 | 0.00 | FALSE |
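
The equivalence shown in the truth table can also be checked by brute force. This is a minimal sketch, not part of the original answer; the function names are hypothetical and the upper bound of 10,000 is an arbitrary choice:

```python
# Compare the original float-based test with the simplified modular test.
def original_condition(n):
    return ((n - 1) / 3) % 2 == 1

def simplified_condition(n):
    return n % 6 == 4

# Collect every n in the range where the two tests disagree.
mismatches = [n for n in range(1, 10_000)
              if original_condition(n) != simplified_condition(n)]
print(mismatches)  # an empty list means the two tests agree on this range
```

The check only covers the tested range. The general argument is that (n - 1) / 3 is an odd integer exactly when n - 1 = 3(2k + 1), i.e. n = 6k + 4.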
Code:

    def IsaacRule(steps, number):
        if number in IsaacRule.numbers:
            return 0
        else:
            IsaacRule.numbers.add(number)
        if steps == 0:
            return 1
        counter = 0
        if number % 6 == 4:
            counter += IsaacRule(steps-1, (number - 1) // 3)
        counter += IsaacRule(steps-1, number*2)
        return counter

3) Rewrite code using sets
IsaacRule(50, 2) takes 0.381s
This lets us take advantage of any optimizations made for sets. Basically I do a breadth first search here.
4) Break the cycle so we can skip keeping track of previous states.
IsaacRule(50, 2) takes 0.256s
We just need to add a check that number != 1 to break the only known cycle. This gives a speed up, but you need to add a special case if you start from 1. Thanks Paul for suggesting this!

    START = 2
    STEPS = 50

    # Special case since we broke the cycle
    if START == 1:
        START = 2
        STEPS -= 1

    current_candidates = {START}  # set of states that can be reached in `step` steps

    for step in range(STEPS):
        # Get all states that can be reached from current_candidates
        next_candidates = (
            {number * 2 for number in current_candidates if number != 1}
            | {(number - 1) // 3 for number in current_candidates if number % 6 == 4}
        )
        # Next step of BFS
        current_candidates = next_candidates

    # current_candidates is also defined when STEPS is 0
    print(len(current_candidates))
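
To try the breadth-first version on small inputs, the loop can be wrapped in a function. This is a minimal sketch; the name `count_sequences` is hypothetical and not from the original answer:

```python
def count_sequences(steps, start):
    """Count the distinct numbers reachable from `start` in exactly `steps` reverse-Collatz steps."""
    # Special case: the cycle 1 -> 2 -> 4 -> 1 is broken at 1, so start from 2 instead
    if start == 1:
        start = 2
        steps -= 1
    current = {start}
    for _ in range(steps):
        doubled = {n * 2 for n in current if n != 1}             # reverse of n -> n / 2
        reduced = {(n - 1) // 3 for n in current if n % 6 == 4}  # reverse of n -> 3n + 1
        current = doubled | reduced
    return len(current)

print(count_sequences(6, 2))  # 4
```

For 6 steps starting from 2, the frontier grows as {2}, {4}, {8, 1}, {16}, {32, 5}, {64, 10}, {128, 21, 20, 3}, which gives 4.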

Related

How should I solve logic error in timestamp using Python?

I have written a code to calculate a, b, and c. They were initialized at 0.
This is my input file
-------------------------------------------------------------
| Line | Time | Command | Data |
-------------------------------------------------------------
| 1 | 0015 | ACTIVE | |
| 2 | 0030 | WRITING | |
| 3 | 0100 | WRITING_A | |
| 4 | 0115 | PRECHARGE | |
| 5 | 0120 | REFRESH | |
| 6 | 0150 | ACTIVE | |
| 7 | 0200 | WRITING | |
| 8 | 0314 | PRECHARGE | |
| 9 | 0318 | ACTIVE | |
| 10 | 0345 | WRITING_A | |
| 11 | 0430 | WRITING_A | |
| 12 | 0447 | WRITING | |
| 13 | 0503 | WRITING | |
and the timestamps and commands are used to process the calculation for a, b, and c.

    import re

    count = {}
    timestamps = {}
    with open("page_stats.txt", "r") as f:
        for line in f:
            m = re.split(r"\s*\|\s*", line)
            if len(m) > 3 and re.match(r"\d+", m[1]):
                count[m[3]] = count[m[3]] + 1 if m[3] in count else 1
                #print(m[2])
                if m[3] in timestamps:
                    timestamps[m[3]].append(m[2])
                    #print(m[3], m[2])
                else:
                    timestamps[m[3]] = [m[2]]
                    #print(m[3], m[2])

    a = b = c = 0
    for key in count:
        print("%-10s: %2d, %s" % (key, count[key], timestamps[key]))
    if timestamps["ACTIVE"] > timestamps["PRECHARGE"]:  #line causing logic error
        a = a + 1
    print(a)

Before getting into the calculation, I assign the timestamps with respect to the commands. This is the output for this section.
ACTIVE : 3, ['0015', '0150', '0318']
WRITING : 4, ['0030', '0200', '0447', '0503']
WRITING_A : 3, ['0100', '0345', '0430']
PRECHARGE : 2, ['0115', '0314']
REFRESH : 1, ['0120']
To get a, the timestamp of ACTIVE must be greater than that of PRECHARGE, and the timestamp of WRITING must be greater than that of ACTIVE. (Lines 4, 6, and 7 contribute to the first a, and lines 8, 9, and 12 contribute to the second a.)
To get b, the timestamp of WRITING must be greater than that of ACTIVE. Lines that already contribute to a (lines 4, 6, 7, 8, 9, and 12) cannot be used to calculate b, so lines 1 and 2 contribute to b.
To get c, the remaining unused lines containing WRITING contribute to c.
The expected output:
a = 2
b = 1
c = 1
However, in my code, when I print a, it displays 0, which shows the logic has some error. Any suggestion to amend my code to achieve the goal? I have tried for a few days and the problem is not solved yet.

I made a function that returns the commands, in order, that match a pattern, with gaps allowed.
I also made a more compact version of your file reading.
There is probably a better way to divide the list into two parts; the difficulty was to accept only elements that are part of a complete match of the pattern. In this version I iterate over the elements twice.

    import re

    commands = list()
    with open("page_stats.txt", "r") as f:
        for line in f:
            m = re.split(r"\s*\|\s*", line)
            if len(m) > 3 and re.match(r"\d+", m[1]):
                _, line, time, command, data, _ = m
                commands.append((line,time,command))

    def search_pattern(pattern, iterable, key=None):
        iter = 0
        count = 0
        length = len(pattern)
        results = []
        sentinel = object()
        for elem in iterable:
            original_elem = elem
            if key is not None:
                elem = key(elem)
            if elem == pattern[iter]:
                iter += 1
                results.append((original_elem,sentinel))
                if iter >= length:
                    iter = iter % length
                    count += length
            else:
                results.append((sentinel,original_elem))
        matching = []
        nonmatching = []
        for res in results:
            first,second = res
            if count > 0:
                if second is sentinel:
                    matching.append(first)
                    count -= 1
                elif first is sentinel:
                    nonmatching.append(second)
            else:
                value = first if second is sentinel else second
                nonmatching.append(value)
        return matching, nonmatching

    pattern_a = ['PRECHARGE','ACTIVE','WRITING']
    pattern_b = ['ACTIVE','WRITING']
    pattern_c = ['WRITING']
    matching, nonmatching = search_pattern(pattern_a, commands, key=lambda t: t[2])
    a = len(matching)//len(pattern_a)
    matching, nonmatching = search_pattern(pattern_b, nonmatching, key=lambda t: t[2])
    b = len(matching)//len(pattern_b)
    matching, nonmatching = search_pattern(pattern_c, nonmatching, key=lambda t: t[2])
    c = len(matching)//len(pattern_c)
    print(f'{a=}')
    print(f'{b=}')
    print(f'{c=}')

Output:
a=2
b=1
c=1
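
As for why the question's code printed 0: `timestamps["ACTIVE"] > timestamps["PRECHARGE"]` compares two whole lists, and Python orders lists lexicographically, so the first differing pair of elements decides the result. A small illustration using the timestamp lists from the question's output (the variable names are chosen here for the example):

```python
active = ['0015', '0150', '0318']
precharge = ['0115', '0314']

# Lists compare element by element, so only the first pair decides here.
whole_list_result = active > precharge
first_pair_result = active[0] > precharge[0]
print(whole_list_result, first_pair_result)  # False False
```

Because '0015' sorts before '0115', the condition is False and `a` is never incremented, no matter what the later timestamps are.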

Pandas: how to incrementally add one to column while sum is less than corresponding column?

I am trying to increment a column by 1 while the sum of that column is less than or equal to a total supply figure. I also need that column to be less than the corresponding value in the 'allocation' column. The supply variable will be dynamic from 1-400 based on user input. Below is the desired output (Allocation Final column).
supply = 14
| rank | allocation | Allocation Final |
| ---- | ---------- | ---------------- |
| 1 | 12 | 9 |
| 2 | 3 | 3 |
| 3 | 1 | 1 |
| 4 | 1 | 1 |
Below is the code I have so far:

    import pandas as pd

    data = [[1.05493, 12], [.94248, 3], [.82317, 1], [.75317, 1]]
    df = pd.DataFrame(data, columns=['score', 'allocation'])
    df['rank'] = df['score'].rank()
    df['allocation_new'] = 0
    # static for testing
    supply = 14
    for index in df.index:
        while df.loc[index, 'allocation_new'] < df.loc[index, 'allocation'] and df.loc[index, 'allocation_new'].sum() < supply:
            df.loc[index, 'allocation_new'] += 1
    print(df)

This should do:

    import pandas as pd

    def allocate(df, supply):
        if supply > df['allocation'].sum():
            raise ValueError(f'Unachievable supply {supply}, maximal {df["allocation"].sum()}')
        under_alloc = pd.Series(True, index=df.index)
        df['allocation final'] = 0
        while (missing := supply - df['allocation final'].sum()) >= 0:
            assert under_alloc.any()
            if missing <= under_alloc.sum():
                # Fewer missing units than open rows: give one each to the first rows
                df.loc[df.index[under_alloc][:missing], 'allocation final'] += 1
                return df
            df.loc[under_alloc, 'allocation final'] = (
                df.loc[under_alloc, 'allocation final'] + missing // under_alloc.sum()
            ).clip(upper=df.loc[under_alloc, 'allocation'])
            under_alloc = df['allocation final'] < df['allocation']
        return df

At every iteration, we add the missing quotas to any rows that have not reached their allocation yet (split evenly and rounded down, that's missing // under_alloc.sum()), then use pd.Series.clip() to ensure we do not exceed the allocation.
If there are fewer missing quotas than ranks still available for allocation (e.g. run the same dataframe with supply=5 or 6), we allocate to the first ranks that are still under their allocation.

    >>> df = pd.DataFrame({'allocation': {0: 12, 1: 3, 2: 1, 3: 1}, 'rank': {0: 1, 1: 2, 2: 3, 3: 4}})
    >>> print(allocate(df, 14))
       allocation  rank  allocation final
    0          12     1                 9
    1           3     2                 3
    2           1     3                 1
    3           1     4                 1
    >>> print(allocate(df, 5))
       allocation  rank  allocation final
    0          12     1                 2
    1           3     2                 1
    2           1     3                 1
    3           1     4                 1

Here is a simpler version:
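
    def allocate(series, supply):
        # Added guard: without it the loop never ends when supply exceeds the total allocation
        if supply > series.sum():
            raise ValueError('supply exceeds the total allocation')
        allocated = 0
        values = [0] * len(series)
        while True:
            for i in range(len(series)):
                if allocated >= supply:
                    return values
                if values[i] < series.iloc[i]:
                    values[i] += 1
                    allocated += 1

    allocate(df['allocation'], 14)

Output:

    [9, 3, 1, 1]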

Nearest neighbors in a given range

I faced the problem of quickly finding the nearest neighbors in a given range.
Example of dataset:
id | string | float
0 | AA | 0.1
12 | BB | 0.5
2 | CC | 0.3
102| AA | 1.1
33 | AA | 2.8
17 | AA | 0.5
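For each row, print the number of other rows satisfying both of the following conditions:
the string field is equal to the current row's string
the float field is less than the current float, but not by more than del (current float - del <= float < current float)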
For this example with del = 1.5:
id | count
0 | 0
12 | 0
2 | 0
102| 2 (same string as rows id=0,33,17, but only rows id=0,17 qualify: 1.1-1.5<=0.1 and 1.1-1.5<=0.5)
33 | 0 (same string as rows id=0,102,17, but 2.8-1.5 is greater than 0.1, 1.1, and 0.5)
17 | 1
To solve this problem, I used the BallTree class with a custom metric, but on a large dataset it takes a very long time because of a reverse tree walk.
Can someone suggest other solutions, or a way to bring the speed of a custom metric up to that of the metrics from sklearn.neighbors.DistanceMetric?
My code:

    from sklearn.neighbors import BallTree

    delta = 1.5  # called "del" in the text; del is a reserved word in Python

    def distance(x, y):
        if x[0] == y[0] and x[1] > y[1]:
            return x[1] - y[1]
        else:
            return x[1] + y[1]

    tree2 = BallTree(X, leaf_size=X.shape[0], metric=distance)
    mas = tree2.query_radius(X, r=delta, count_only=True)
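
One possible alternative that avoids a custom metric is to group the rows by string, sort each group's floats once, and count with binary search. This is a minimal sketch, not from the original post and not benchmarked. It assumes the condition implied by the example (same string, and current float - del <= float < current float), and the names `rows`, `groups`, `counts`, and `delta` are chosen here for illustration:

```python
from bisect import bisect_left
from collections import defaultdict

# (id, string, float) rows from the example dataset
rows = [(0, "AA", 0.1), (12, "BB", 0.5), (2, "CC", 0.3),
        (102, "AA", 1.1), (33, "AA", 2.8), (17, "AA", 0.5)]
delta = 1.5  # "del" in the question; del is a reserved word in Python

# Group the float values by string and sort each group once.
groups = defaultdict(list)
for _, label, value in rows:
    groups[label].append(value)
for values in groups.values():
    values.sort()

# For each row, count same-string values v with value - delta <= v < value.
counts = {}
for row_id, label, value in rows:
    values = groups[label]
    counts[row_id] = bisect_left(values, value) - bisect_left(values, value - delta)

print(counts)  # {0: 0, 12: 0, 2: 0, 102: 2, 33: 0, 17: 1}
```

Sorting costs O(n log n) overall and each lookup is O(log n), so no pairwise distance function has to be called from Python.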

Passing value from previous result python

I want to evaluate how the gap of a variable changes from one time interval to the next.
Here is an example of the calculation:
Count | Gap | Gap Result | Evaluate
----------------------------------------
19 | 15-5 | 10 | 10
18 | 15-3 | 12 | 10-12 = -2
17 | 15-4 | 11 | 12-11 = 1
I have no idea how to express it. Please advise.

    number = [1, 2, 3, 4, 5, 6, 7]
    goal = 15
    count = 20

    def step(self):
        while count > 0:
            count -= 1
            gap = [goal - (random.choice(number))]
            previous_gap = gap from (count - 1)  # I don't know how to express this
            evaluate = previous_gap - gap

You'll need to store the previous gap too; set it to 0 to start with. You don't want a list, you are dealing with individual numbers here:

    import random

    number = [1, 2, 3, 4, 5, 6, 7]
    goal = 15
    count = 20
    previous_gap = evaluate = 0
    while count > 0:
        count -= 1
        gap = goal - random.choice(number)
        if previous_gap:
            evaluate = previous_gap - gap
        # remember the gap for the next step
        previous_gap = gap
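
To check the idea against the table in the question, the random draws can be replaced by the fixed values 5, 3, and 4 that the table uses. This is a minimal sketch; the names `draws` and `evaluations` are hypothetical, and `None` is used as the "no previous gap yet" marker instead of 0:

```python
goal = 15
draws = [5, 3, 4]  # fixed values from the question's table, instead of random.choice

previous_gap = None
evaluations = []
for draw in draws:
    gap = goal - draw
    if previous_gap is not None:
        # difference between the previous gap and the current one
        evaluations.append(previous_gap - gap)
    # remember the gap for the next step
    previous_gap = gap

print(evaluations)  # [-2, 1]
```

The gaps are 10, 12, and 11, so the two differences are 10 - 12 = -2 and 12 - 11 = 1, matching the Evaluate column.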

Gurobi: How can I sum just a part of a variable?

I have the following model:

    from gurobipy import *

    n_units = 1
    n_periods = 3
    n_ageclasses = 4
    # list(...) so that slicing and append also work on Python 3
    units = list(range(1, n_units + 1))
    periods = list(range(1, n_periods + 1))
    periods_plus1 = periods[:]
    periods_plus1.append(max(periods_plus1) + 1)
    ageclasses = list(range(1, n_ageclasses + 1))
    nothickets = ageclasses[1:]
    model = Model('MPPM')
    HARVEST = model.addVars(units, periods, nothickets, vtype=GRB.INTEGER, name="HARVEST")
    FOREST = model.addVars(units, periods_plus1, ageclasses, vtype=GRB.INTEGER, name="FOREST")
    model.addConstrs((quicksum(HARVEST[(k+1), (t+1), nothicket] for k in range(n_units) for t in range(n_periods) for nothicket in nothickets) == FOREST[unit, period+1, 1] for unit in units for period in periods if period < max(periods_plus1)), name="A_Thicket")

I have a problem with formulating the constraint. For every unit and every period, I want to sum only the nothickets part of the variable HARVEST. Concretely, I want x_{k=1,t=1,2} + x_{k=1,t=1,3} + x_{k=1,t=1,4}, and so on. This should result in only three ones per row of the constraint matrix, but with the formulation above I get nine.
I tried to use a for loop outside of the sum, but this results in another problem:

    for k in range(n_units):
        for t in range(n_periods):
            model.addConstrs((quicksum(HARVEST[(k+1), (t+1), nothicket] for nothicket in nothickets) == FOREST[unit,period+1, 1] for unit in units for period in periods if period < max(periods_plus1)), name="A_Thicket")

With this formulation I get this matrix:
[image: constraint matrix]
But what I want is:
row_idx | col_idx | coeff
0 | 0 | 1
0 | 1 | 1
0 | 2 | 1
0 | 13 | -1
1 | 3 | 1
1 | 4 | 1
1 | 5 | 1
1 | 17 | -1
2 | 6 | 1
2 | 7 | 1
2 | 8 | 1
2 | 21 | -1
Can anybody please help me to reformulate this constraint?

This worked for me:
model.addConstrs((HARVEST.sum(unit, period, '*') == ...
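
To see which entries each constraint row should contain, the desired sparsity pattern can be reproduced in plain Python, without calling Gurobi. This is a minimal sketch under one assumption: columns are numbered in creation order, HARVEST first and then FOREST, which is what the desired table in the question implies. The names `columns` and `triples` are hypothetical.

```python
from itertools import product

n_units, n_periods, n_ageclasses = 1, 3, 4
units = range(1, n_units + 1)
periods = range(1, n_periods + 1)
periods_plus1 = range(1, n_periods + 2)
ageclasses = range(1, n_ageclasses + 1)
nothickets = ageclasses[1:]

# Assumed column numbering: HARVEST variables first, then FOREST, in creation order.
columns = {}
for key in product(units, periods, nothickets):
    columns[("HARVEST",) + key] = len(columns)
for key in product(units, periods_plus1, ageclasses):
    columns[("FOREST",) + key] = len(columns)

# One row per (unit, period): sum HARVEST over nothickets only,
# minus FOREST[unit, period + 1, 1].
triples = []
for row, (unit, period) in enumerate(product(units, periods)):
    for nothicket in nothickets:
        triples.append((row, columns[("HARVEST", unit, period, nothicket)], 1))
    triples.append((row, columns[("FOREST", unit, period + 1, 1)], -1))

for triple in triples:
    print(triple)  # (row_idx, col_idx, coeff)
```

The point of the sketch is that the unit and period indices of HARVEST are tied to the unit and period of the row, and only the third index is summed over. That is what the wildcard in `HARVEST.sum(unit, period, '*')` expresses.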
