Create a list not that random - python

My question is about creating a semi-random list of letters in Python. I want the letter to be the same as the one two positions before it 20% of the time.
I do not want a purely random list, because then the letter at position n-2 will not match 20% of the time.
I was thinking about creating a first list with the letters I want and then creating a new list that takes letters from the first list at random, but I do not know how to add my 20% constraint.
Finally, I need it to be exactly 20% of the time.
A
B
B
C
B
A
A
C
A
Something like that, for example. Do you have any tips for me?

Try this:
from random import random, choice
letters = ['A', 'B', 'C']
prob = 0.2
target_length = 20
target = ""
while len(target) < target_length:
    if len(target) >= 2 and random() < prob:
        c = target[-2]
    else:
        c = choice(letters)
    target += c
print(target)
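Here I just decide between choosing a new letter from the set and reusing the letter two positions back in the existing sequence (the string target). Of course I have to handle the case where target has fewer than 2 letters, since there is then nothing to reuse. Note that a freshly chosen letter can also match by chance, so with only three letters the observed rate will be well above 20% (about 0.2 + 0.8/3).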

How about this:
import random
import string
alpha = string.ascii_uppercase
my_list = [random.choice(alpha) for _ in range(2)] # initializing; first two elements have to be random
n = 50 # select the length of the list; note that 2 elements are already inside!
for i in range(n):
    if random.random() < 0.2: # 20% of the time repeat a letter
        my_list.append(my_list[len(my_list)-2])
    else:
        my_list.append(random.choice(alpha)) # 80% of the time, get a new one
print(my_list) # -> ['O', 'M', 'O', 'F', 'B', 'F', 'S', 'G', ...
# (the second 'O' and the second 'F' repeat the letter two positions earlier)
And to check:
j = 0
for i, item in enumerate(my_list[2:], 2):
    if item == my_list[i-2]:
        j += 1
print(j/len(my_list)) # I got 23%. Random picks also match by chance (1/26), so for long lists the rate approaches about 0.2 + 0.8/26, i.e. 23%, rather than exactly 20%

You could try this: create a random list and find the indices that do not yet fulfil the condition. From those, randomly select as many as are needed to reach 20%, minus the number of indices already fulfilling it. Set those indices to the element two positions prior.
import random

lst = [random.randint(1, 10) for _ in range(20)]
# indices that already equal the element two positions earlier
same = [i for i in range(2, len(lst)) if lst[i] == lst[i-2]]
not_same = [i for i in range(2, len(lst)) if i not in same]
num_same = len(same)
# max(0, ...) guards against a negative sample size
make_same = random.sample(not_same, max(0, len(lst)//5 - num_same))
for i in make_same:
    lst[i] = lst[i-2]
Note, however, that there is a chance of getting slightly fewer or slightly more than 20%. If e.g. your list is [1,2,3,4,1] and you set 3 to 1, you gain two "same" elements instead of one, and if the list is [1,2,3,4,3] and you set the first 3 to 1, you gain one and you lose one.
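Since the question asks for exactly 20%, one possible way around the side effects described above is to decide the repeat positions first and then build the list left to right, forcing a match at the chosen positions and forcing a mismatch everywhere else. This is only a sketch of that idea, not something from the answers; the names exact_repeat_sequence and count_repeats are made up for illustration, and it assumes at least two distinct letters.

```python
import random

def exact_repeat_sequence(letters, length, fraction=0.2, rng=random):
    """Return a list in which exactly round(fraction * length) positions
    hold the same letter as the position two places earlier."""
    n_repeats = round(fraction * length)
    if len(letters) < 2 or n_repeats > max(0, length - 2):
        raise ValueError("need at least 2 letters and enough positions")
    # Positions 0 and 1 have no "two before", so repeats start at index 2.
    repeat_at = set(rng.sample(range(2, length), n_repeats))
    seq = []
    for i in range(length):
        if i in repeat_at:
            seq.append(seq[i - 2])  # forced repeat
        elif i >= 2:
            # forced non-repeat: exclude the letter two places earlier
            seq.append(rng.choice([c for c in letters if c != seq[i - 2]]))
        else:
            seq.append(rng.choice(letters))
    return seq

def count_repeats(seq):
    """Count positions i >= 2 with seq[i] == seq[i - 2]."""
    return sum(seq[i] == seq[i - 2] for i in range(2, len(seq)))

seq = exact_repeat_sequence(["A", "B", "C"], 20)
print("".join(seq), count_repeats(seq))  # the count is 4 for length 20
```

The trade-off is that the non-repeat positions are no longer drawn from the full letter set, so the letters are slightly less "free" than in the purely probabilistic answers.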

Related

How to find back to back same items in Python?

I basically want to count how many times "H" is printed three times in a row in this random sequence. How can I add code that detects its occurrence? Is it possible with the index method?
import random
prob= ["H","T"]
streakcount=0
x =[]
for excount in range(1000):
    y = random.choice(prob)
    x.append(y)
print(x)
Yes, you can index the previous elements in the list x and check whether they are also 'H'.
So, first add an if statement after y is assigned a random value and check whether that value is 'H'. If it is, index the two previous elements of x to see whether they are also 'H'. Finally, we need an accumulator to store the number of times this happens.
That's the gist of it. The only potential hiccup occurs at the very beginning of the for loop, when excount is 0 or 1. At that point, if an 'H' is chosen and we index x at excount-2, we either index the list from the end (with a negative number) or index a position that does not exist yet.
This can happen because we subtract 1 and 2 from excount before indexing x, so we just need to check that excount is >= 2 before testing for three H's in a row.
import random
prob= ["H","T"]
streakcount=0
x =[]
for excount in range(1000):
    y = random.choice(prob)
    if y == 'H' and excount >= 2:
        # check to see if the previous two elements were also 'H'
        if x[excount-1] == 'H' and x[excount-2] == 'H':
            streakcount += 1 # 3 H's in a row! increment counter
    x.append(y)
print(x)
print(streakcount)
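I think you can just convert the list to a string and then use count to get the number of streaks. Note that str.count counts non-overlapping occurrences, so a run of four H's is counted once here, whereas the index-based approach counts it twice.
import random
prob= ["H","T"]
streakcount=0
x =[]
for _ in range(1000):
    y = random.choice(prob)
    x.append(y)
strlist = ' '.join(map(str, x))
print(strlist.count("H H H"))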
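The answers in this thread do not all count the same thing: some count every window of three H's (overlapping), while str.count skips past each match (non-overlapping). The following sketch shows both on a tiny input so the difference is visible. The helper names are invented for this illustration, and the lookahead regex is just one way to get overlapping counts.

```python
import re

def count_non_overlapping(seq, run="HHH"):
    """Count occurrences of run the way str.count does (no overlaps)."""
    return "".join(seq).count(run)

def count_overlapping(seq, run="HHH"):
    """Count every start position at which run occurs."""
    # a zero-width lookahead matches at each start position of the run
    return len(re.findall(f"(?={re.escape(run)})", "".join(seq)))

flips = list("HHHHHT")
print(count_non_overlapping(flips))  # 1
print(count_overlapping(flips))      # 3
```

Which of the two is "right" depends on whether a streak of five H's should count as one streak or as three windows of three.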
The deque class from the collections module is useful for this use case. The deque is constructed with a maximum size of 3. As new values are appended to the deque, the oldest value is dropped. Therefore we can use the deque's count() method to check whether all three of the most recent values are 'H':
from random import choice
from collections import deque
ht = ['H', 'T']
streakcount = 0
d = deque([], 3)
for _ in range(1_000):
    d.append(choice(ht))
    if d.count('H') == 3:
        streakcount += 1
print(f'{streakcount=}')
Alternatively, scan the finished list and check each element together with its two neighbours:
import random
prob = ["H", "T"]
streakcount = 0
x = []
for excount in range(1000):
    y = random.choice(prob)
    x.append(y)
for i in range(len(x)):
    if i > 0 and i + 1 < len(x): # stay inside the list on both sides
        if x[i - 1] == 'H' and x[i] == 'H' and x[i + 1] == 'H':
            streakcount += 1
print(streakcount)

How to print out a 1000 new strings with a single replaced element based on a 1000-symbol string?

There will be a bit of molecular biology here.
So I need to generate 1000 mutant sequences based on a primary 1000-nucleotide sequence. Every following mutant sequence must have one random nucleotide switched to the other one of the same class (A to G and vice versa; T to C and vice versa) compared to the preceding sequence. Also, random.randint and random.seed(1) must be used.
Here's what I have so far:
import random
# below is the initial sequence
seq = 'CGCCTGTAATCCCAGCACTCTGGGAGGCAGAGGTGGGCCGATCACTTGAGGTCAGGAGTTCGAGACCAGCCTGGGCAACATGGTGAAACACCATCTCTACTAAAAACACAAAAATTAGCCAGGTGTGGTGGCAGGCACCTGCAGTCCCAGCTACTCCGGAGGCTGAGGCAGGAGAATTGCTCGAACCTGGGAGGCAGGGGTTGCAGTGAGCCGACATGGCGCCACTGCACTCCAGTCTGGGCGACAGAGTGAGACCCTATCTCAAAAAAAAAAAAAAAAAAAAAAGACCCAACTCAAGTATCATCTCCAGGAAGCCTTCCCCTACTCCCAGCAATTAAATGCTCCTCAGAGAATTCCCATTTTTGGTTTACTCTTTGGTTTACCTCCAGACAGGAAGCCCCCACTGACACTGTTGTAGTCCCAGGGTGCAACACAAAGCAGAGATCACAAGCTGAGTTTAATAATTGCTTGTGGAATACATGTCCCAAGCCACCTCCTGCAGGAAGCCCTTCCAGATGCCCATTCTAGCCAGTCTGGCTCTTTGCTTCCATACCTTCACAACACTTGTGCCTCCCCCAGGGCCTCTTTCTCATCTTGCTTTCTGGGGCAGCTGTGTGCACATTTGTCTGTGTGCAGCAACTCTCTAAGGCAGGGATTTTTACTCCTATTTTTGATGAGGGGAGCTGTGGCTCAGAGAGGTTGAATAACCTAAGGCCACACAGTGAGTGGCAGAGCCAGGAATGTGACTTGGGTCCATTTGAATCCAAAGTCCCTGTACTTTCCACTGCCCTACCTAGATGTCCCTGTACCTCCTATAAAATCAGCATGGAGCCTGGTGCCTGGTAGTCCCTACAAATATTCACAAATTGGAGCTTAGCTCAGCTCTCAGGCAAGGCCCAGGTCAAAAGGGCAGATACAGCTTTGGGACCTTAGTTGCCACCACATGCCATACCTTCTTCCCAGCAGAAGGACTCCCTCCAAGACAGGGTAGGGGTGGAGG'
n = 0
while n <= 1000: # setting up a cycle for 1000 mutations
    i = random.randint(0, 1001) # choosing random nucleotide to switch
    if seq[i] == 'A':
        print(seq.replace('A', 'G', 1)) # the third argument is supposed to show how many times a nucleotide must be switched but it doesn't work for some reason
    elif seq[i] == 'G':
        print(seq.replace('G', 'A', 1))
    elif seq[i] == 'C':
        print(seq.replace('C', 'T', 1))
    elif seq[i] == 'T':
        print(seq.replace('T', 'C', 1))
    n = n + 1
The main problems I encountered are getting the program to generate each new mutation from the previous sequence rather than the original one, and substituting only the one nucleotide at the chosen position.
You need to update the sequence of nucleotides in the loop. Strings cannot be changed, so I'd recommend just using a list of letters to start:
import random
# Since string cannot be changed/mutated,
# break up sequence into a list of strings
seq = list('CGCCTGTAATCCCAGCACTCTGGGAGGCAGAGGTGGGCCGATCACTTGAGGTCAGGAGTTCGAGACCAGCCTGGGCAACATGGTGAAACACCATCTCTACTAAAAACACAAAAATTAGCCAGGTGTGGTGGCAGGCACCTGCAGTCCCAGCTACTCCGGAGGCTGAGGCAGGAGAATTGCTCGAACCTGGGAGGCAGGGGTTGCAGTGAGCCGACATGGCGCCACTGCACTCCAGTCTGGGCGACAGAGTGAGACCCTATCTCAAAAAAAAAAAAAAAAAAAAAAGACCCAACTCAAGTATCATCTCCAGGAAGCCTTCCCCTACTCCCAGCAATTAAATGCTCCTCAGAGAATTCCCATTTTTGGTTTACTCTTTGGTTTACCTCCAGACAGGAAGCCCCCACTGACACTGTTGTAGTCCCAGGGTGCAACACAAAGCAGAGATCACAAGCTGAGTTTAATAATTGCTTGTGGAATACATGTCCCAAGCCACCTCCTGCAGGAAGCCCTTCCAGATGCCCATTCTAGCCAGTCTGGCTCTTTGCTTCCATACCTTCACAACACTTGTGCCTCCCCCAGGGCCTCTTTCTCATCTTGCTTTCTGGGGCAGCTGTGTGCACATTTGTCTGTGTGCAGCAACTCTCTAAGGCAGGGATTTTTACTCCTATTTTTGATGAGGGGAGCTGTGGCTCAGAGAGGTTGAATAACCTAAGGCCACACAGTGAGTGGCAGAGCCAGGAATGTGACTTGGGTCCATTTGAATCCAAAGTCCCTGTACTTTCCACTGCCCTACCTAGATGTCCCTGTACCTCCTATAAAATCAGCATGGAGCCTGGTGCCTGGTAGTCCCTACAAATATTCACAAATTGGAGCTTAGCTCAGCTCTCAGGCAAGGCCCAGGTCAAAAGGGCAGATACAGCTTTGGGACCTTAGTTGCCACCACATGCCATACCTTCTTCCCAGCAGAAGGACTCCCTCCAAGACAGGGTAGGGGTGGAGG')
for n in range(1000): # setting up a cycle for 1000 mutations
    i = random.randint(0, len(seq)-1) # choosing random nucleotide to switch
    print(i)
    print(seq[i]) # i-th nucleotide before the mutation
    if seq[i] == 'A':
        seq[i] = 'G'
    elif seq[i] == 'G':
        seq[i] = 'A'
    elif seq[i] == 'C':
        seq[i] = 'T'
    elif seq[i] == 'T':
        seq[i] = 'C'
    print(seq[i]) # i-th nucleotide after the mutation
    print(''.join(seq)) # join nucleotides into a string for printing
random.randrange() is more appropriate to choose a random index in a fixed list, and str.maketrans() and str.translate() are a faster and more straightforward way to translate one letter to another.
Make sure each mutation is applied to the result of the previous iteration. Strings are immutable, so use a mutable list:
import random
xlat = str.maketrans('AGCT','GATC')
#seq = list('CGCCTGTAATCCCAGCACTCTGGGAGGCAGAGGTGGGCCGATCACTTGAGGTCAGGAGTTCGAGACCAGCCTGGGCAACATGGTGAAACACCATCTCTACTAAAAACACAAAAATTAGCCAGGTGTGGTGGCAGGCACCTGCAGTCCCAGCTACTCCGGAGGCTGAGGCAGGAGAATTGCTCGAACCTGGGAGGCAGGGGTTGCAGTGAGCCGACATGGCGCCACTGCACTCCAGTCTGGGCGACAGAGTGAGACCCTATCTCAAAAAAAAAAAAAAAAAAAAAAGACCCAACTCAAGTATCATCTCCAGGAAGCCTTCCCCTACTCCCAGCAATTAAATGCTCCTCAGAGAATTCCCATTTTTGGTTTACTCTTTGGTTTACCTCCAGACAGGAAGCCCCCACTGACACTGTTGTAGTCCCAGGGTGCAACACAAAGCAGAGATCACAAGCTGAGTTTAATAATTGCTTGTGGAATACATGTCCCAAGCCACCTCCTGCAGGAAGCCCTTCCAGATGCCCATTCTAGCCAGTCTGGCTCTTTGCTTCCATACCTTCACAACACTTGTGCCTCCCCCAGGGCCTCTTTCTCATCTTGCTTTCTGGGGCAGCTGTGTGCACATTTGTCTGTGTGCAGCAACTCTCTAAGGCAGGGATTTTTACTCCTATTTTTGATGAGGGGAGCTGTGGCTCAGAGAGGTTGAATAACCTAAGGCCACACAGTGAGTGGCAGAGCCAGGAATGTGACTTGGGTCCATTTGAATCCAAAGTCCCTGTACTTTCCACTGCCCTACCTAGATGTCCCTGTACCTCCTATAAAATCAGCATGGAGCCTGGTGCCTGGTAGTCCCTACAAATATTCACAAATTGGAGCTTAGCTCAGCTCTCAGGCAAGGCCCAGGTCAAAAGGGCAGATACAGCTTTGGGACCTTAGTTGCCACCACATGCCATACCTTCTTCCCAGCAGAAGGACTCCCTCCAAGACAGGGTAGGGGTGGAGG')
#mutations = 1000
seq = list('CGCCTGTAAT')
mutations = 5
random.seed(1)
for _ in range(mutations):
    i = random.randrange(len(seq)) # equivalent to random.randint(0,len(seq)-1)
    seq[i] = seq[i].translate(xlat) # translate and replace at index
    print(''.join(seq))
Output:
CGTCTGTAAT
CGTCTGTAAC
CATCTGTAAC
CATCCGTAAC
CGTCCGTAAC
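The question requires random.randint and random.seed(1), and the asker's original attempt worked on a string rather than a list. As a rough sketch of how both constraints could be kept, the version below rebuilds the string by slicing around the chosen index instead of calling str.replace (which always changes the first matching letter, not the one at index i). The names TRANSITIONS, mutate_once and mutation_series are placeholders chosen for this example.

```python
import random

# each nucleotide maps to the other member of its class
TRANSITIONS = {"A": "G", "G": "A", "C": "T", "T": "C"}

def mutate_once(seq):
    """Return a copy of seq with one random nucleotide swapped within its class."""
    i = random.randint(0, len(seq) - 1)  # randint includes both endpoints
    return seq[:i] + TRANSITIONS[seq[i]] + seq[i + 1:]

def mutation_series(seq, n_mutations, seed=1):
    """Return n_mutations strings, each mutated from the previous one."""
    random.seed(seed)
    series = []
    for _ in range(n_mutations):
        seq = mutate_once(seq)  # carry the mutated string forward
        series.append(seq)
    return series

for mutant in mutation_series("CGCCTGTAAT", 5):
    print(mutant)
```

Slicing creates a new string on every step, which is fine for 1000 mutations of a 1000-character sequence; the list-based answers avoid that copying.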

Program in Python that converts sorted string array based on repeated elements

First time writing here, sorry if the explanation is unclear. I have an array called t, built from input strings and sorted alphabetically. I then want to make another array called new which counts the number of repeated elements; for example, ['a','a','b','c','s','s','s'] should give ['2','1','1','3']. The way I did it does not count the last element (it gives ['2','1','1']). Please help.
s = input("Enter words: ").split(" ")
length = len(s)
t = [None] * length
for i in range(length):
    t[i] = s[i]
#then did some code to sort array t, guess it's not so relevant to show here
repeatedcount = 1
j = 0
new = [None] * length
for i in range(1,length):
    if (t[i] == t[i-1]): #does not count the last time it repeats
        repeatedcount+=1
    else:
        new[j] = repeatedcount
        j += 1
        repeatedcount = 1
Use the Counter object from the collections module in Python's standard library:
from collections import Counter
mylist = ['a', 'b', 'c', 'a', 'b', 'b']
print(Counter(mylist))
>> Counter({'b': 3, 'a': 2, 'c': 1})
You can also use a dict:
a = ['a', 'a', 'b', 'c', 's', 's', 's'] # the sorted input list
counter = {}
for char in a:
    if char not in counter:
        counter[char] = 1
    else:
        counter[char] += 1
print(counter.values())
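The asker's loop loses the last run because a count is only stored when the value changes, and nothing changes after the final element. Two small sketches of how the run lengths of a sorted list could be collected instead: one with itertools.groupby, and one manual loop that updates the last count in place. The function names are made up for this example.

```python
from itertools import groupby

def run_lengths(sorted_items):
    """Length of each run of equal neighbours, using itertools.groupby."""
    return [len(list(group)) for _, group in groupby(sorted_items)]

def run_lengths_manual(sorted_items):
    """Same result with an explicit loop; the last run is never dropped."""
    counts = []
    for i, item in enumerate(sorted_items):
        if i > 0 and item == sorted_items[i - 1]:
            counts[-1] += 1   # still inside the current run
        else:
            counts.append(1)  # a new run starts here
    return counts

t = sorted(['s', 'a', 'b', 's', 'a', 'c', 's'])
print(run_lengths(t))         # [2, 1, 1, 3]
print(run_lengths_manual(t))  # [2, 1, 1, 3]
```

Unlike Counter, both versions return the counts in the order of the sorted list, which is the output format the question asks for.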

Index out of range error when creating a random list with no consecutive repetitions

I am trying to generate a list of 100 elements consisting of the numbers 1 to 4, randomly distributed but without consecutive repetitions. I do not need the numbers 1 to 4 to occur the same number of times; I want the list to be completely random except for having no consecutive repetitions. I wrote some code that seems to do that until it stops with a "list index out of range" error, and I cannot figure out why this error is happening.
from random import randint
guesses = []
for x in range (0, 99):
    guess = randint(1,4)
    guesses.append(guess)
    if x> 0 and guesses[x] == guesses[x-1]:
        guesses.remove(guess)
print(guesses)
It should look something like this:
123421342312321423124213...23142314213
Your problem is that x keeps increasing even when you remove a number, so it gets ahead of the length of the list. I would recommend using a while loop instead. Also, you should only add the number to your list if it is needed, instead of adding it and then removing it.
from random import randint
guesses = [randint(1,4)]
x = 1
while x < 100:
    guess = randint(1,4)
    if guess != guesses[x-1]:
        guesses.append(guess)
        x += 1
print(guesses)
Here is a solution using numpy:
from time import time
import numpy as np
def solve_random_non_consecutive(minValue,maxValue,size):
    # initial guess
    a = np.random.randint(minValue,maxValue,size)
    # indexes i where a[i] == a[i+1]
    x = np.where(np.diff(a) == 0)[0]
    # as long as we have consecutive duplicates
    while len(x) > 0:
        # rerandomize the duplicated indexes
        a[x] = np.random.randint(minValue,maxValue,len(x))
        # find all duplicates
        x = np.where(np.diff(a) == 0)[0]
    return a
s = time()
print(solve_random_non_consecutive(1,5,1000000))
print("Took %0.2fs to solve"%(time()-s)) # took ~ 0.17 seconds to generate 1MIL
# any of the solutions using iteration took ~ 10 seconds to generate 1 mil
One caveat: since it repopulates the data randomly, the running time may vary from run to run. Note that the upper bound of np.random.randint is exclusive, so (1, 5) yields the values 1 to 4.
Your code would work if instead of removing the element that matches the one before, you replace it until it doesn't:
from random import randint
guesses = [randint(1,4)]
for x in range (1, 100):
    guess = randint(1,4)
    guesses.append(guess)
    while guesses[x] == guesses[x-1]:
        guesses[x] = randint(1,4)
Two alternate ideas:
You could create a set of your choices:
{1, 2, 3, 4}
And then on each iteration ask for a random.choice from the set minus the last item. choice needs something indexable, so you need to convert to a list each time, but there might be ways to make that more efficient if this is a bottleneck:
from random import choice
choices = {1, 2, 3, 4}
l = [choice(list(choices))] # start with one random choice
for i in range(99):
    l.append(choice(list(choices - {l[-1]})))
This seems to be pretty uniform:
from collections import Counter
counts = Counter(l)
counts
Counter({3: 26, 2: 25, 1: 26, 4: 23})
Use Iterators
You can do this all with iterators that evaluate lazily, then just take an islice() of the length you want:
from random import randint
from itertools import tee, islice
# generator that makes random ints between start and stop (inclusive)
def randIt(start, stop):
    while True:
        yield randint(start, stop)
rands, prevs = tee(randIt(1, 4))
next(prevs)
# non_dupes is a generator that makes non-repeating rands
non_dupes = (r for r, i in zip(rands, prevs) if r!=i)
# use itertools islice or a loop to get the number you want
# or just call `next(non_dupes)` for one:
list(islice(non_dupes, 0, 100))
I had a similar problem this week. My solution was to adjust my counter (x, in your case) every time I removed an element, because the list gets shorter and the indices shift.
The problem is that once you remove an element, x becomes larger than the last valid index of your list,
so guesses[x] is out of bounds because x >= len(guesses).
You're only generating 99 elements. range(0, 99) goes from 0 to 98 inclusive, for a total of 99 elements.
Also, the part of your code which removes duplicate guesses needs to set x back to x - 1. This way the "counter" for each element you want to create is not 1 ahead of how many elements you actually have. In a for loop, reassigning x has no effect on the next iteration, so this requires a while loop.
Additionally, when you remove this element, remove() will remove the first instance of an object equal to the variable guess, not necessarily the one you just added. You should use .pop() instead.
# assumes `from random import randint` and `guesses = []` as in the question
x = 0
while x < 100:
    guess = randint(1,4)
    guesses.append(guess)
    if x> 0 and guesses[x] == guesses[x-1]:
        guesses.pop()
        x = x - 1
    x += 1
When you remove an element from the guesses list, its length decreases by one, so x must not advance on that iteration.
Use this code:
from random import randint
guesses = []
x = 0
while x < 100:
    guess = randint(1,4)
    guesses.append(guess)
    if x > 0 and guesses[x] == guesses[x-1]:
        guesses.pop()
    else:
        x += 1
print(guesses)
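All of the loop-based answers above work by drawing a number and rejecting it if it equals the previous one. A possible alternative that never rejects anything is to draw a non-zero offset and add it to the previous value modulo the size of the range, which picks uniformly among the other values. This is only a sketch of that idea; the function name and its parameters are assumptions made for the example.

```python
from random import randint

def no_consecutive_repeats(length, low=1, high=4):
    """Random ints in [low, high] where no value equals its predecessor."""
    span = high - low + 1
    if span < 2:
        raise ValueError("need at least two distinct values")
    if length <= 0:
        return []
    values = [randint(low, high)]
    while len(values) < length:
        # a step of 1..span-1 can never land back on the previous value
        step = randint(1, span - 1)
        values.append((values[-1] - low + step) % span + low)
    return values

guesses = no_consecutive_repeats(100)
print(guesses)
```

Because every draw is used, the loop runs exactly length - 1 times, and there is no index bookkeeping that could go out of range.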

generate a list of increasing elements

I am trying to write a section of a larger program which will generate a list of random integers. The first randomly generated list should have X elements; the program should then generate another list of random integers with X + Y elements, and so on, adding Y to the number of elements each time until I get to a specified point. Each generated list will also be sorted using the selection sort method. I am using several different sort methods (selection, bubble, merge, quick, radix...) to measure the execution time of each method for increasing input sizes. For the selection sort portion I have this so far, but the output I'm getting is 100 lists of 100 numbers. Clearly I'm still pretty new to Python.
Hoping for a breakthrough, thanks!
import time
import random
start_timeSelection = time.clock()
lst = []
count = 100
def selectionSort(lst):
    count = 100
    lst = [int(999*random.random()) for i in range(count)]
    for i in range(len(lst) - 1):
        currentMin = lst[i]
        currentMinIndex = i
        for j in range(i + 1, len(lst)):
            if currentMin > lst[j]:
                currentMin, currentMinIndex = lst[j], j
        if currentMinIndex != i:
            lst[currentMinIndex], lst[i] = lst[i], currentMin
        print(lst)
while count < 300:
    count += 100
    selectionSort(lst)
s = (time.clock() - start_timeSelection)
print("Selection Sort execution time is: ", s, "seconds")
Here is what you are looking for.
This code creates lists of random integers from 0 to 9 (you can change the range). It creates 10 lists (this number can also be changed). The length of each new list is the previous list's length plus a random increment:
from random import randint
X_plus_Y = 0
x = 0
while x < 10: #change 10 to make desired number of lists
    lst = []
    num_of_elements = randint(0,9) #change 9 for different random range
    X_plus_Y += num_of_elements
    print("add "+str(num_of_elements)+" equals " + str(X_plus_Y))
    for i in range(X_plus_Y):
        lst.append(randint(0,9))
    print(lst)
    print("\n")
    x += 1
hope this helped
Here's a short example that uses a generator and list comprehension to generate lists of random values with increasing length.
import random
def generate_list(start_len, incr_len):
    i = 0
    while True:
        yield [random.random() for j in range(start_len + i * incr_len)]
        i += 1
gl = generate_list(start_len=2, incr_len=3)
print(next(gl)) # [0.3401864808412862, 0.33105346208017106]
print(next(gl)) # [0.5075146706165449, 0.5802519757892776, 0.5244104797659368, 0.8235816542342208, 0.3669745504311662]
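To tie the pieces of the question together, here is one way the timing harness could look: the sort takes a list as its argument instead of generating one internally, and a separate loop creates lists of X, X + Y, X + 2Y, ... elements and times each sort. Treat it as a sketch under assumptions: the names selection_sort and time_sort and the parameter max_len are invented here, and time.perf_counter is used because time.clock was removed in Python 3.8.

```python
import random
import time

def selection_sort(values):
    """Sort a list in place with selection sort and return it."""
    for i in range(len(values) - 1):
        min_index = i
        for j in range(i + 1, len(values)):
            if values[j] < values[min_index]:
                min_index = j
        values[i], values[min_index] = values[min_index], values[i]
    return values

def time_sort(sort_fn, start_len, incr_len, max_len):
    """Time sort_fn on random lists of start_len, start_len + incr_len, ... elements."""
    timings = []
    for size in range(start_len, max_len + 1, incr_len):
        data = [random.randrange(1000) for _ in range(size)]
        started = time.perf_counter()
        sort_fn(data)
        timings.append((size, time.perf_counter() - started))
    return timings

for size, seconds in time_sort(selection_sort, start_len=100, incr_len=100, max_len=300):
    print(f"{size} elements: {seconds:.6f} s")
```

Passing the sort function as an argument means the same harness could be reused for the bubble, merge, quick and radix sorts mentioned in the question.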
