Efficient way to transpose the bits of an integer in Python?

Consider a 6-bit integer
x = a b c d e f
that should be transposed into three 2-bit integers as follows
x1 = a d
x2 = b e
x3 = c f
What is an efficient way to do this in Python?
I currently do it as follows:
bit_list = list(bin(x)[2:])  # chop off the '0b' prefix
# pad on the left if necessary, to make sure bit_list contains 6 bits
nb_of_bit_to_pad_on_the_left = 6 - len(bit_list)
for i in range(nb_of_bit_to_pad_on_the_left):
    bit_list.insert(0, '0')
# transposition: walk the bits in groups of 3, one bit per output integer
transpose = [[], [], []]
for bit in range(0, 6, 3):
    for dimension in range(3):
        transpose[dimension].append(bit_list[bit + dimension])
for i in range(3):
    bit_in_string = ''.join(transpose[i])
    transpose[i] = int(bit_in_string, 2)
But this is slow when transposing a 5*1e6-bit integer into one million 5-bit integers.
Is there a better method?
Or is there some bit-shift magic (<< / >>) that would be faster?
This question arose while trying to write a Python implementation of Skilling's Hilbert curve algorithm.

This should work:
mask = 0b100100
for i in range(2, -1, -1):
    tmp = x & mask
    print(((tmp >> 3 + i) << 1) + ((tmp & (1 << i)) >> i))
    mask >>= 1
The first mask extracts only a and d, then it is shifted to extract only b and e and then c and f.
In the print statement the numbers are of the form x00y00, 0x00y0 or 00x00y. The (tmp >> 3 + i) shifts them down to x (note that + binds tighter than >>, so this is tmp >> (3 + i)), and the << 1 then gives x0.
The ((tmp & (1 << i)) >> i) first reduces those numbers to y00, y0 or y and then right-shifts to obtain simply y. Summing the two parts gives the xy number you want.
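The same shift-and-mask idea can be generalized beyond 6 bits. The following is only a sketch: the function name transpose_bits and its parameters n_out and width are made up for illustration, and it assumes x fits in n_out * width bits. It walks the bits one at a time, so it is meant for small integers rather than multi-million-bit ones.

```python
def transpose_bits(x, n_out, width):
    """Split x (n_out * width bits) into n_out integers of `width` bits each.

    Counting from the most significant bit, bit p of x goes to output p % n_out.
    Hypothetical helper for illustration; assumes 0 <= x < 2 ** (n_out * width).
    """
    total = n_out * width
    outputs = [0] * n_out
    for p in range(total):
        # Extract bit p, counted from the left (most significant) end.
        bit = (x >> (total - 1 - p)) & 1
        # Append it as the new lowest bit of the matching output integer.
        outputs[p % n_out] = (outputs[p % n_out] << 1) | bit
    return outputs


# x = a b c d e f = 1 0 1 1 0 0  ->  x1 = ad = 11, x2 = be = 00, x3 = cf = 10
print(transpose_bits(0b101100, 3, 2))
```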

Slices will work if you're working with strings (bin(x)).
>>> HInt = 'ABCDEFGHIJKLMNO'
>>> x = []
>>> for i in [0, 1, 2]:
...     x.append(HInt[i::3])
...
>>> x[0]
'ADGJM'
>>> x[1]
'BEHKN'
>>> x[2]
'CFILO'
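Applied to the actual bit string of an integer, the slicing approach might look like the sketch below. The name transpose_bits_str and its parameters are hypothetical, and it assumes x fits in n_out * width bits so that zero-padding yields exactly that many characters.

```python
def transpose_bits_str(x, n_out, width):
    """Transpose the bits of x via string slicing (illustrative sketch)."""
    # Zero-pad on the left so the string has exactly n_out * width characters.
    bits = format(x, '0{}b'.format(n_out * width))
    # Every n_out-th character, starting at offset i, belongs to output i.
    return [int(bits[i::n_out], 2) for i in range(n_out)]


# x = a b c d e f = 1 0 1 1 0 0  ->  ad = 11, be = 00, cf = 10
print(transpose_bits_str(0b101100, 3, 2))
```

Because the padding, slicing and int(..., 2) conversion all run inside the interpreter's built-ins, this avoids a Python-level loop over individual bits.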

Related

How can I generate three random integers that satisfy some condition? [closed]

I'm a beginner in programming and I'm looking for a nice idea how to generate three integers that satisfy a condition.
Example:
We are given n = 30, and we've been asked to generate three integers a, b and c, so that 7*a + 5*b + 3*c = n.
I tried to use for loops, but it takes too much time and I have a maximum testing time of 1000 ms.
I'm using Python 3.
My attempt:
x = int(input())
c = []
k = []
w = []
for i in range(x):
    for j in range(x):
        for h in range(x):
            if 7*i + 5*j + 3*h == x:
                c.append(i)
                k.append(j)
                w.append(h)
if len(c) == 0:  # no solution was found
    print(-1)
else:
    print(str(k[0]) + ' ' + str(c[0]) + ' ' + str(w[0]))
First, let me note that your task is underspecified in at least two respects:
The allowed range of the generated values is not specified. In particular, you don't specify whether the results may include negative integers.
The desired distribution of the generated values is not specified.
Normally, if not specified, one might assume that a uniform distribution on the set of possible solutions to the equation was expected (since it is, in a certain sense, the most random possible distribution on a given set). But a (discrete) uniform distribution is only possible if the solution set is finite, which it won't be if the range of results is unrestricted. (In particular, if (a, b, c) is a solution, then so is (a, b + 3k, c − 5k) for any integer k.) So if we interpret the task as asking for a uniform distribution with unlimited range, it's actually impossible!
On the other hand, if we're allowed to choose any distribution and range, the task becomes trivial: just make the generator always return a = −n, b = n, c = n. Clearly this is a solution to the equation (since −7n + 5n + 3n = (−7 + 5 + 3)n = 1n), and a degenerate distribution that assigns all probability mass to a single point is still a valid probability distribution!
If you wanted a slightly less degenerate solution, you could pick a random integer k (using any distribution of your choice) and return a = −n, b = n + 3k, c = n − 5k. As noted above, this is also a solution to the equation for any k. Of course, this distribution is still somewhat degenerate, since the value of a is fixed.
If you want to let all return values be at least somewhat random, you could also pick a random h and return a = −n + h, b = n − 2h + 3k and c = n + h − 5k. Again, this is guaranteed to be a valid solution for any h and k, since it clearly satisfies the equation for h = k = 0, and it's also easy to see that increasing or decreasing either h or k will leave the value of the left-hand side of the equation unchanged.
In fact, it can be proved that this method can generate all possible solutions to the equation, and that each solution will correspond to a unique (h, k) pair! (One fairly intuitive way to see this is to plot the solutions in 3D space and observe that they form a regular lattice of points on a 2D plane, and that the vectors (+1, −2, +1) and (0, +3, −5) span this lattice.) If we pick h and k from some distribution that (at least in theory) assigns a non-zero probability to every integer, then we'll have a non-zero probability of returning any valid solution. So, at least for one somewhat reasonable interpretation of the task (unbounded range, any distribution with full support) the following code should solve the task efficiently:
from random import gauss
def random_solution(n):
    h = int(gauss(0, 1000))  # any distribution with full support on the integers will do
    k = int(gauss(0, 1000))
    return (-n + h, n - 2*h + 3*k, n + h - 5*k)
If the range of possible values is restricted, the problem becomes a bit trickier. On the positive side, if all values are bounded below (or above), then the set of possible solutions is finite, and so a uniform distribution exists on it. On the flip side, efficiently sampling this uniform distribution is not trivial.
One possible approach, which you've used yourself, is to first generate all possible solutions (assuming there's a finite number of them) and then sample from the list of solutions. We can do the solution generation fairly efficiently like this:
1. find all possible values of a for which the equation might have a solution,
2. for each such a, find all possible values of b for which there might still be a solution,
3. for each such (a, b) pair, solve the equation for c and check if it's valid (i.e. an integer within the specified range), and
4. if yes, add (a, b, c) to the set of solutions.
The tricky part is step 2, where we want to calculate the range of possible b values. For this, we can make use of the observation that, for a given a, setting c to its smallest allowed value and solving the equation gives an upper bound for b (and vice versa).
In particular, solving the equation for a, b and c respectively, we get:
a = (n − 5b − 3c) / 7
b = (n − 7a − 3c) / 5
c = (n − 7a − 5b) / 3
Given lower bounds on some of the values, we can use these solutions to compute corresponding upper bounds on the others. For example, the following code will generate all non-negative solutions efficiently (and can be easily modified to use a lower bound other than 0, if needed):
def all_nonnegative_solutions(n):
    a_min = b_min = c_min = 0
    a_max = (n - 5*b_min - 3*c_min) // 7
    for a in range(a_min, a_max + 1):
        b_max = (n - 7*a - 3*c_min) // 5
        for b in range(b_min, b_max + 1):
            if (n - 7*a - 5*b) % 3 == 0:
                c = (n - 7*a - 5*b) // 3
                yield (a, b, c)
We can then store the solutions in a list or a tuple and sample from that list:
from random import choice
solutions = tuple(all_nonnegative_solutions(30))
a, b, c = choice(solutions)
P.S. Apparently Python's random.choice is not smart enough to use reservoir sampling to sample from an arbitrary iterable, so we do need to store the full list of solutions even if we only want to sample from it once. Or, of course, we could always implement our own sampler:
from random import randrange
def reservoir_choice(iterable):
    r = None
    n = 0
    for x in iterable:
        n += 1
        # keep the n-th item with probability 1/n
        if randrange(n) == 0:
            r = x
    return r
a, b, c = reservoir_choice(all_nonnegative_solutions(30))
BTW, we could make the all_nonnegative_solutions function above a bit more efficient by observing that the (n - 7*a - 5*b) % 3 == 0 condition (which checks whether c = (n − 7a − 5b) / 3 is an integer, and thus a valid solution) is true for every third value of b. Thus, if we first calculated the smallest value of b that satisfies the condition for a given a (which can be done with a bit of modular arithmetic), we could iterate over b with a step size of 3 starting from that minimum value and skip the divisibility check entirely. I'll leave implementing that optimization as an exercise.
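One way the step-3 optimization could look is sketched below. The function name all_nonnegative_solutions_stride3 is made up here, and the lower bounds are fixed at 0 for brevity. The starting value of b follows from reducing the equation modulo 3: since 7 ≡ 1 and 5 ≡ −1 (mod 3), n − 7a − 5b ≡ n − a + b, which is 0 exactly when b ≡ a − n (mod 3).

```python
def all_nonnegative_solutions_stride3(n):
    """Yield all (a, b, c) >= 0 with 7*a + 5*b + 3*c == n, stepping b by 3.

    Illustrative sketch; assumes all lower bounds are 0.
    """
    for a in range(0, n // 7 + 1):
        b_max = (n - 7*a) // 5
        # Smallest non-negative b for which n - 7*a - 5*b is divisible by 3.
        b_start = (a - n) % 3
        for b in range(b_start, b_max + 1, 3):
            # Division is exact here, so no divisibility check is needed.
            yield (a, b, (n - 7*a - 5*b) // 3)


print(list(all_nonnegative_solutions_stride3(30)))
```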
import numpy as np
def generate_answer(n: int, low_limit: int, high_limit: int):
    while True:
        a = np.random.randint(low_limit, high_limit + 1, 1)[0]
        b = np.random.randint(low_limit, high_limit + 1, 1)[0]
        c = (n - 7 * a - 5 * b) / 3.0
        if int(c) == c and low_limit <= c <= high_limit:
            break
    return a, b, int(c)
if __name__ == "__main__":
    n = 30
    ans = generate_answer(low_limit=-5, high_limit=50, n=n)
    assert ans[0] * 7 + ans[1] * 5 + ans[2] * 3 == n
    print(ans)
If you select two of the numbers a, b, c, you know the third. In this case, I randomize ints for a, b, and I find c by c = (n - 7 * a - 5 * b) / 3.0.
Make sure c is an integer, and in the allowed limits, and we are done.
If it is not, randomize again.
If you want to generate all possibilities,
def generate_all_answers(n: int, low_limit: int, high_limit: int):
    results = []
    for a in range(low_limit, high_limit + 1):
        for b in range(low_limit, high_limit + 1):
            c = (n - 7 * a - 5 * b) / 3.0
            if int(c) == c and low_limit <= c <= high_limit:
                results.append((a, b, int(c)))
    return results
If third-party libraries are allowed, you can use SymPy's diophantine.diop_linear linear Diophantine equations solver:
from sympy.solvers.diophantine.diophantine import diop_linear
from sympy import symbols
from numpy.random import randint
n = 30
N = 8 # Number of solutions needed
# Unknowns
a, b, c = symbols('a, b, c', integer=True)
# Coefficients
x, y, z = 7, 5, 3
# Parameters of parametric equation of solution
t_0, t_1 = symbols('t_0, t_1', integer=True)
solution = diop_linear(x * a + y * b + z * c - n)
if None not in solution:
    for s in range(N):
        # -10000 and 10000 (min and max for t_0 and t_1)
        t_sub = [(t_0, randint(-10000, 10000)), (t_1, randint(-10000, 10000))]
        a_val, b_val, c_val = map(lambda t: t.subs(t_sub), solution)
        print('Solution #%d' % (s + 1))
        print('a =', a_val, ', b =', b_val, ', c =', c_val)
else:
    print('no solutions')
Output (random):
Solution #1
a = -141 , b = -29187 , c = 48984
Solution #2
a = -8532 , b = -68757 , c = 134513
Solution #3
a = 5034 , b = 30729 , c = -62951
Solution #4
a = 7107 , b = 76638 , c = -144303
Solution #5
a = 4587 , b = 23721 , c = -50228
Solution #6
a = -9294 , b = -106269 , c = 198811
Solution #7
a = -1572 , b = -43224 , c = 75718
Solution #8
a = 4956 , b = 68097 , c = -125049
Why your solution can't cope with large values of n
You probably know that everything inside a for loop over a range of n runs n times, so the loop multiplies the time taken by n.
For example, let's pretend (to keep things simple) that this runs in 4 milliseconds:
if 7*i + 5*j + 3*h == n:
    c.append(i)
    k.append(j)
    w.append(h)
then this will run in 4×n milliseconds:
for h in range(n):
    if 7*i + 5*j + 3*h == n:
        c.append(i)
        k.append(j)
        w.append(h)
Approximately:
n = 100 would take 0.4 seconds
n = 250 would take 1 second
n = 15000 would take 60 seconds
If you put that inside a for loop over a range of n then the whole thing will be repeated n times. I.e.
for j in range(n):
    for h in range(n):
        if 7*i + 5*j + 3*h == n:
            c.append(i)
            k.append(j)
            w.append(h)
will take 4n² milliseconds.
n = 30 would take 4 seconds
n = 50 would take 10 seconds
n = 120 would take 60 seconds
Putting it in a third for-loop will take 4n³ milliseconds.
n = 10 would take 4 seconds
n = 14 would take 10 seconds.
n = 24 would take 60 seconds.
Now, what if you halved the time of the original if to 2 milliseconds? Within the same 60 seconds, n could increase by 15000 in the first case, but only by about 7 (from 24 to 31) in the last case. The lesson here is that having fewer nested for-loops is usually much more important than speeding up what's inside them. As you can see in Gulzar's answer part 2, there are only two for loops, which makes a big difference. (This only applies if the loops are inside each other; if they are just one after another you don't have the multiplication problem.)
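To make the growth concrete, here is a small sketch that counts how often the innermost body runs, followed by a two-loop search that computes the third value instead of looping over it. Both function names are made up for illustration, and the search assumes that only non-negative solutions with a and b below n are wanted.

```python
from itertools import product


def count_body_runs(n, depth):
    """Number of times the body of `depth` nested range(n) loops executes."""
    return sum(1 for _ in product(range(n), repeat=depth))


def first_solution_two_loops(n):
    """Return some (a, b, c) >= 0 with 7*a + 5*b + 3*c == n, or None."""
    for a in range(n):
        for b in range(n):
            rest = n - 7*a - 5*b
            # c is determined by a and b, so no third loop is needed.
            if rest >= 0 and rest % 3 == 0:
                return a, b, rest // 3
    return None


for depth in (1, 2, 3):
    print(depth, count_body_runs(20, depth))
print(first_solution_two_loops(30))
```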
From my perspective, the last of the three numbers is never random. Say you generate a and b first; then c is not random, because it has to be calculated from the equation:
n = 7*a + 5*b + 3*c
c = (7*a + 5*b - n) / -3
This means that we need to generate two random values (a, b)
such that 7*a + 5*b - n is divisible by 3.
import random
n = 30
max_value = 1000000   # avoid shadowing the built-ins max and min
min_value = -1000000
while True:
    a = random.randint(min_value, max_value)
    b = random.randint(min_value, max_value)
    t = 7*a + 5*b - n
    if t % 3 == 0:
        break
c = t // -3  # exact integer division, since t is divisible by 3
print("A = " + str(a))
print("B = " + str(b))
print("C = " + str(c))
print("7A + 5B + 3C =>")
print("(7 * " + str(a) + ") + (5 * " + str(b) + ") + (3 * " + str(c) + ") = ")
print(7*a + 5*b + 3*c)

Circular shift of a bit in python (equivalent of Fortran's ISHFTC)

I want to achieve the equivalent of the ISHFTC function in Fortran using python.
What is the best way to do this?
For example,
x = '0100110'
s = int(x, 2)
s_shifted = ISHFTC(s,1,7) #shifts to left by 1
#binary representation of s_shifted should be 1001100
My attempt based on Circular shift in c
def ISHFTC(n, d, N):
    return (n << d) | (n >> (N - d))
However, this does not do what I want. Example,
ISHFTC(57,1,6) #57 is '111001'
gives 115 which is '1110011' whereas I want '110011'
Your attempted solution does not work because Python has unlimited size integers.
It works in C (for specific values of N, depending on the type used, typically something like 8 or 32), because the bits that are shifted out to the left are automatically truncated.
You need to do this explicitly in Python to get the same behaviour. Truncating a value to the lowest N bits can be done by using % (1 << N) (the remainder of dividing by 2**N).
Example: ISHFTC(57, 1, 6)
We want to keep the 6 bits inside |......| and truncate all bits to the left. The bits to the right are truncated automatically, because these are already the 6 least significant bits.
n                   |111001|
a = n << d         1|110010|
m = (1 << N)       1|000000|
b = a % m          0|110010|
c = n >> (N - d)    |000001|(11001)
result = b | c      |110011|
Resulting code:
def ISHFTC(n, d, N):
    return ((n << d) % (1 << N)) | (n >> (N - d))
    #        ^^^^^^ a
    #                   ^^^^^^ m
    #        ^^^^^^^^^^^^^^^^^ b
    #                               ^^^^^^^^^^^^ c
>>> ISHFTC(57, 1, 6)
51
>>> bin(_)
'0b110011'
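A variant of the same idea uses a bit mask instead of the modulo. The sketch below is an assumption-laden illustration rather than a faithful port of the Fortran intrinsic: the lowercase name ishftc is made up, n is assumed to be non-negative, a negative shift is treated as a right rotation, and bits above the lowest `size` bits are passed through unchanged.

```python
def ishftc(n, shift, size):
    """Rotate the lowest `size` bits of n left by `shift` (sketch, n >= 0)."""
    mask = (1 << size) - 1
    low = n & mask        # the bits that take part in the rotation
    high = n & ~mask      # everything above them is left untouched
    shift %= size         # also maps a negative shift to a right rotation
    rotated = ((low << shift) | (low >> (size - shift))) & mask
    return high | rotated


print(bin(ishftc(57, 1, 6)))         # 57 is 0b111001
print(bin(ishftc(0b0100110, 1, 7)))
```

Masking with (1 << size) - 1 does the same truncation as % (1 << size) in the answer above.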

Numpy: Conserving sum in average over two arrays of integers

I have two arrays of positive integers A and B that each sum to 10:
A = [1,4,5]
B = [5,5,0]
I want to write code (that works for arrays of any size and any sum) to calculate an array C, also of positive integers and also summing to 10, that is as close as possible to the element-wise average:
Pure average C = (A + B) / 2: C=[3,4.5,2.5]
Round C = np.ceil((A + B) / 2).astype(int): C=[3,5,3], (sum=11, incorrect!)
Fix the sum C = SOME CODE: c=[3,4,3], (sum=10, correct!)
Any value can be adjusted to make the sum correct, as long as all elements remain positive integers.
What should C = SOME CODE be?
Minimum reproducible example:
import numpy as np
A = np.array([1, 4, 5])
B = np.array([5, 5, 0])
C = np.ceil((A + B) / 2).astype(int)
print(np.sum(C))
11
This should give 10.
You can alternately ceil and floor the non-integer elements. This works for any shape/size and any sum value (in fact you do not need to know the sum at all; it is enough that A and B have the same sum):
C = (A + B) / 2
C_c = np.ceil(C)
C_c[np.flatnonzero([C!=C.astype(int)])[::2]] -= 1
print(C_c.sum())
#10.0
print(C_c.astype(int))
#[3 4 3]
OK, so based on what you're saying, this could work (A and B are the arrays from the question):
C = (A + B) // 2  # floor of the average: array([3, 4, 2])
curr_sum = sum(C)  # 9
adjust_amount = sum(A) - curr_sum  # 10 - 9 = 1
if adjust_amount > 0:
    C[-1] += adjust_amount  # array([3, 4, 3])
# Otherwise if it's negative just grab the largest and subtract to ensure you still remain >0
else:
    C[np.argmax(C)] += adjust_amount
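The alternating ceil/floor idea can also be done entirely in integer arithmetic. The sketch below uses plain Python lists to stay dependency-free, and the same operations (floor division, selecting the odd totals, bumping every other one) carry over to NumPy arrays. The function name average_conserving_sum is made up, and it assumes A and B have equal sums, which guarantees an even number of odd totals.

```python
def average_conserving_sum(A, B):
    """Integer 'average' of A and B that keeps the common sum (sketch)."""
    if sum(A) != sum(B):
        raise ValueError("A and B must have the same sum")
    totals = [a + b for a, b in zip(A, B)]
    C = [t // 2 for t in totals]  # floor of each element-wise average
    # Indices whose exact average ends in .5
    odd = [i for i, t in enumerate(totals) if t % 2]
    # Round every other one of those up; the rest stay rounded down.
    for i in odd[::2]:
        C[i] += 1
    return C


C = average_conserving_sum([1, 4, 5], [5, 5, 0])
print(C, sum(C))
```

Every element ends up within 0.5 of the true average. Which of the .5 elements get rounded up is arbitrary, so the result may differ from the [3, 4, 3] in the question while having the same sum.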

Is there another way to find item n in this recursive sequence?

The sequence is:
a_n = a_{n-1} + 2 * a_{n-2}
a_0 = 1, a_1 = 1. Find a_100.
The way I did it is making a list.
# List 'a' with a0 = 1, a1 = 1.
a = [1, 1]
# To get a100, use 'i' as the index into the list.
for i in range(2, 101):
    x = a[i-1] + (2 * a[i-2])
    print(str(len(a)) + ": " + str(x))
    # Save the new value to the list
    a.append(x)
Is there another way to do this that gives the value of a100 directly? Or the value of a10000? Storing the whole list will take up a lot of memory.
For this specific case, the sequence appears to be known as the Jacobsthal sequence. Wikipedia gives a closed form expression for a_n that can be expressed as follows:
def J(n):
    return (2**(n+1) - (1 if n % 2 else -1))//3
Slightly more generally, you can use fast matrix exponentiation to find a specific value of a_n in O(log n) matrix operations. The approach here is a slight modification of this.
import numpy as np
def a(n):
    mat = np.array([[1, 2], [1, 0]], dtype=object)  # object for large integers
    return np.linalg.matrix_power(mat, n)[0, 0]
Here is the value for a_1000:
7143390714575115472989500327066678737076032078036890716291669255802340340832907483287989192104639054183964486117020978834580968571282093623989718383132383202623045183216153990280716403374914094585302788102030983322387960844932511706110362630718041943047464318457694778440286554435082924558137112046251
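If NumPy is not available, the same matrix power can be computed with plain Python integers using exponentiation by squaring. This is a sketch with made-up helper names (mat_mul, a_n); it uses the same matrix [[1, 2], [1, 0]] as above and keeps only two 2×2 matrices in memory at any time.

```python
def mat_mul(p, q):
    """Multiply two 2x2 matrices given as nested tuples of ints."""
    return (
        (p[0][0]*q[0][0] + p[0][1]*q[1][0], p[0][0]*q[0][1] + p[0][1]*q[1][1]),
        (p[1][0]*q[0][0] + p[1][1]*q[1][0], p[1][0]*q[0][1] + p[1][1]*q[1][1]),
    )


def a_n(n):
    """a_n for a_n = a_{n-1} + 2*a_{n-2}, a_0 = a_1 = 1, in O(log n) products."""
    result = ((1, 0), (0, 1))  # identity matrix
    base = ((1, 2), (1, 0))
    while n:
        if n & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)  # square for the next binary digit of n
        n >>= 1
    return result[0][0]


print(a_n(100))
```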
This recurrence relation has a closed form solution:
a = lambda n: (2**(n+1) + (-1)**n)//3
Then
a(0) == 1
a(1) == 1
a(2) == 3
...
Use Wolfram Alpha to solve for the closed-form solution.
For a more general solution, sympy's rsolve can generate a formula for linear recurrences. And then use substitution to find particular values.
from sympy import rsolve, Function, symbols
f = Function('f')
n = symbols('n', integer=True)
T = f(n) - f(n - 1) - 2 * f(n - 2)
sol = rsolve(T, f(n), {f(0): 1, f(1): 1})
print(sol.factor())
for k in range(6):
    print(f'a[{10**k}] = {sol.subs({n:10**k})}')
This finds the formula: ((-1)**n + 2*2**n)/3 and substituting various values gives:
a[1] = 1
a[10] = 683
a[100] = 845100400152152934331135470251
a[1000] = 7143390714575115472989500327066678737076032078036890716291669255802340340832907483287989192104639054183964486117020978834580968571282093623989718383132383202623045183216153990280716403374914094585302788102030983322387960844932511706110362630718041943047464318457694778440286554435082924558137112046251
a[10000] = 133004207792050558992249477512239005588233122125746163656800596656862925534812977546133077893574630652662207529488060828477043275662758540783958572880642159719038200311958630178434977008440399303470333912787955410283390723070787364570060499107264165920603265965586728359610888385670810455396492683712749253768167310959162940311732477513236354819123587744628771837530938418912538404881523567277609841226375876393129759329405606403575118807097476182222626910170437663537354284534899796002239562111009728651821864438504041150546876053294654530715854971225081866915352569915012672229763876364337052864009432226144104897254261713969198460795185338846384904496294153746791718908836684855491928471402492019109286876187554942677494637811270493742797695615497592008325707648701382879948397411975000873285734944722272050706215467741789948589975038942085627076911593009914095042100740598303428022092134686210939717309765040069372305610440480299752446776767079470873361242815172724472670497376119046346076373700455008336040050132281745987061580787029631920486042634950322261479884716029821082511738977420225191373598689421314223291030818003754466249703388278539818739888608762690479183498456732381846252842888143995999179244405389125585586850955218501148491050484965227415295931558739077382821688613165420801317361188546437983172654430208389560906399085227534187902708556510993924603473650539217438826413238467482713628870553839126928797364022699821043888057814039422006025018822770264969295984768383035270068082072982144071689832171605168493242321989988938379586370977590812497129995193443814024675762887572114762078609321486558972315562935139761219006700486184989097003857563340672353252082596492857996938895641058713626394126572100971861180957464658187543063225221347209833214479053409260474855006038845449574803849839476117691437918170766030552699949740190867210237222054200679917839041562290259702727837489338965916841084290457658890129758135848621600629708312821695669337853515158918369176044845
99090827358327607311145704700506065400164526586785514617302254188281302685535172938965970009784445593131997924161090875584262602248970534271757827918474036922817159666073457645479797721100990086996148246631809842046103645478455250800241851505149187576887740797874187195112987924800865762440512367759907023068198581038345298256830912964615391929510632144672034080214910330858779357159414245558929061170945822567007313514409276959727327732103102944890874437957354081499958646666151187821572015407908429716866090505450005466559490856410166587392640154829574782514412057571343645656039081553195235917082324370960357975081345975714019208241045008362225535513352731779100379038105003677818345932796086474225126766610787543447696005152433715459704967280220123536564742545543604882702212692308056024281175802607700426526000495235781464187268985316355546978912530579053491968145752746720495213034211965438416298865678974339803258684849814383125421063166939821410053665460303868944551299858094210708807124261007787849536528397806251

Transform bits into byte series

Given a Python integer which is within the size of 4 bits, how does one transform it – with bitwise arithmetic instead of string processing – into an integer within the size of 4 bytes, for which each bit in the original corresponds to a byte which is the bit repeated 8 times?
For example: 0b1011 should become 0b11111111000000001111111111111111
With apologies to ncoghlan:
expanded_bits = [
0b00000000000000000000000000000000,
0b00000000000000000000000011111111,
0b00000000000000001111111100000000,
0b00000000000000001111111111111111,
0b00000000111111110000000000000000,
0b00000000111111110000000011111111,
0b00000000111111111111111100000000,
0b00000000111111111111111111111111,
0b11111111000000000000000000000000,
0b11111111000000000000000011111111,
0b11111111000000001111111100000000,
0b11111111000000001111111111111111,
0b11111111111111110000000000000000,
0b11111111111111110000000011111111,
0b11111111111111111111111100000000,
0b11111111111111111111111111111111,
]
Then just index this list with the nibble you want to transform:
>>> bin(expanded_bits[0b1011])
'0b11111111000000001111111111111111'
I'd just do a loop:
x = 0b1011
y = 0
for i in range(4):
    if x & (1 << i):
        y |= (255 << (i * 8))
print("%x" % y)
The following recursive solution uses only addition, left/right shift operators and bitwise & operator with integers:
def xform_rec(n):
    if n == 0:
        return 0
    else:
        if 0 == n & 0b1:
            return xform_rec(n >> 1) << 8
        else:
            return 0b11111111 + (xform_rec(n >> 1) << 8)
Or, as a one-liner:
def xform_rec(n):
    return 0 if n == 0 else (0 if 0 == n & 0b1 else 0b11111111) + (xform_rec(n >> 1) << 8)
Examples:
>>> print(bin(xform_rec(0b1011)))
0b11111111000000001111111111111111
>>> print(bin(xform_rec(0b0000)))
0b0
>>> print(bin(xform_rec(0b1111)))
0b11111111111111111111111111111111
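A loop-free alternative is sketched below, using two multiplications and one mask. The function name expand_nibble is made up, and it assumes 0 <= x < 16. Multiplying by 0x00204081 (that is, 1 + 2**7 + 2**14 + 2**21) places copies of x at offsets 0, 7, 14 and 21, so that bit i of x lands on bit 8*i; the mask keeps only those four bits, and multiplying by 0xFF stretches each kept bit into a full byte.

```python
def expand_nibble(x):
    """Expand each of the 4 bits of x into a byte of 8 equal bits (sketch)."""
    # Copies of x at bit offsets 0, 7, 14, 21 do not overlap, so no carries occur.
    spread = (x * 0x00204081) & 0x01010101  # bit i of x now sits at bit 8*i
    # Each byte is 0 or 1; times 0xFF turns it into 0x00 or 0xFF.
    return spread * 0xFF


print(bin(expand_nibble(0b1011)))
```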
