How can I find the longest sub-list of a list of numbers in which every pair of neighboring elements differs by exactly 1?
So "number neighbors" could be the elements 1 and 2, or the elements 7 and 6.
If the list is [7, 1, 2, 5, 7, 6, 5, 6, 3, 4, 2, 1, 0],
then the desired output is 4, because the longest such sub-list is [7, 6, 5, 6].
This is what I have so far. The for loop is clearly broken and I don't know how to fix it.
list = [7, 1, 2, 5, 7, 6, 5, 6, 3, 4, 2, 1, 0]
sublist = []
for i in list:
    if list[i] - list[i+1] == 1 or list[i] - list[i+1] == -1:
        sublist.append(i)
print(sublist)
print(len(sublist))
more-itertools makes it simple:
from more_itertools import split_when
lst = [7, 1, 2, 5, 7, 6, 5, 6, 3, 4, 2, 1, 0]
print(max(split_when(lst, lambda x, y: abs(x - y) != 1), key=len))
It's best to break these types of problems up into their parts.
The main problem here is to get all the runs of neighboring values:
def get_sequences_iter(an_array):
    # start a new sequence with first value (or empty)
    sequence = an_array[:1]
    # check each index with the value that comes before
    for idx in range(1,len(an_array)):
        if an_array[idx] - an_array[idx-1] in {1,-1}:
            # this is part of a run append it
            sequence.append(an_array[idx])
        else:
            # run is broken
            yield sequence
            # start a new run
            sequence = [an_array[idx]]
    #capture final sequence
    yield sequence
Once you have this, you can get all the runs in O(n) time:
sequences = get_sequences_iter([7, 1, 2, 5, 7, 6, 5, 6, 3, 4, 2, 1, 0])
for sequence in sequences:
    print(sequence)
Hopefully with this information you can figure out how to solve your actual question,
but just in case (the generator above is exhausted after the loop, so create a fresh one):
print(max(get_sequences_iter([7, 1, 2, 5, 7, 6, 5, 6, 3, 4, 2, 1, 0]), key=len))
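If only the longest run is needed, the same idea can be written as a single pass that never stores the other runs. This is a minimal sketch; the function name longest_neighbor_run is made up for illustration and does not come from either answer.

```python
def longest_neighbor_run(values):
    """Return the longest contiguous slice whose adjacent items differ by exactly 1."""
    if not values:
        return []
    best_start, best_len = 0, 1
    start = 0  # index where the current run begins
    for idx in range(1, len(values)):
        if abs(values[idx] - values[idx - 1]) != 1:
            start = idx  # run is broken, a new one starts here
        if idx - start + 1 > best_len:
            best_start, best_len = start, idx - start + 1
    return values[best_start:best_start + best_len]

run = longest_neighbor_run([7, 1, 2, 5, 7, 6, 5, 6, 3, 4, 2, 1, 0])
print(run, len(run))  # [7, 6, 5, 6] 4
```

On ties the first run of maximal length wins, because the comparison is strict.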
I have the following loop:
import random
distribution = []
sum = 0
while True:
    item = random.randint(1, 8)
    if sum + item == 812:
        distribution.append(item)
        break
    elif sum + item > 812:
        distribution.append(812 - sum)
        break
    else:
        distribution.append(item)
        sum = sum + item
The idea here is to create a list in which each item is a random number between 1 and 8 and the items sum to exactly 812. The challenge is that the last item can randomly overshoot, so a simple stop condition won't do.
I'm racking my brain on how to express this functionally, but have come up blank so far. Any ideas?
I am answering the title of this question:
What is the functional-style replacement of this loop?
import random

def random_list(sum_rest):
    item = random.randint(1, min(8, sum_rest))
    return ([] if sum_rest - item == 0 else random_list(sum_rest - item)) + [item]
This solution is based on the improvement proposed by @DollarAkshay. Once you get that, it is very straightforward: I call random_list recursively, decreasing the sum_rest parameter by the randomly chosen item.
You get the list by calling
random_list(812)
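A non-recursive way to express the same loop functionally is to treat the draws as an endless stream, accumulate running totals, and cut the stream once the target is reached. The sketch below is only an illustration: the name capped_distribution and the rng parameter are assumptions, and the last item is simply whatever is left to reach the total, which matches the behaviour of the loop in the question (including its skewed last item).

```python
import random
from itertools import accumulate, takewhile

def capped_distribution(total, low=1, high=8, rng=random):
    # Endless stream of draws: iter(callable, sentinel) never stops here
    # because randint never returns None.
    draws = iter(lambda: rng.randint(low, high), None)
    # Running totals that are still strictly below the target.
    sums = list(takewhile(lambda s: s < total, accumulate(draws)))
    # Differences between consecutive boundaries are the items; the last
    # item is the remainder needed to reach the target exactly.
    bounds = [0] + sums + [total]
    return [b - a for a, b in zip(bounds, bounds[1:])]

parts = capped_distribution(812)
print(sum(parts), len(parts))
```

Passing a seeded random.Random instance as rng makes the result reproducible.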
A disciplined (and practical) approach is to generate sequences, stopping when the sum is at least 812, and restarting if the sum is above 812. This is called "rejection sampling". It guarantees that you are picking sequences fairly, without skewing the randomness.
The chance that a running sum lands exactly on 812 is close to 1 divided by the mean draw (1/4.5, about 0.22), so you expect to generate roughly 4.5 samples in total before you get a good one.
import random
def seqsum(n):
    while True:
        S = 0
        s = []
        while S < n:
            s.append(random.randrange(8)+1)
            S += s[-1]
        if S == n:
            return s

for _ in range(10):
    print(seqsum(812))
Obviously this will be slower than skewing the last element or elements of the sequence to force the sum, but that's a trade-off for avoiding the skew.
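To gauge how many restarts the rejection loop needs, the probability that a running sum of uniform draws lands exactly on the target can be computed with a short recurrence. This is a sketch, not part of the original answer; hit_probability is a hypothetical helper name. For large targets the value approaches 1 divided by the mean draw, which is 1/4.5 for draws from 1 to 8.

```python
def hit_probability(target, low=1, high=8):
    """Probability that a running sum of uniform draws in [low, high] equals target at some point."""
    faces = high - low + 1
    p = [0.0] * (target + 1)
    p[0] = 1.0  # the sum starts at 0
    for k in range(1, target + 1):
        # The sum reaches k if it reached k - step and the next draw was step.
        p[k] = sum(p[k - step] for step in range(low, high + 1) if step <= k) / faces
    return p[target]

prob = hit_probability(812)
print(prob, 1 / prob)  # success chance per attempt, expected number of attempts
```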
Here's some code that prints out the distribution of the last element of the sequence, using this method, that of @hiro-protagonist and that of @DollarAkshay. It shows the frequencies of the numbers 1 to 8 in a sample of size 10000, printed as an array. You can see the heavy skew towards low numbers in the latter two methods.
def seqhiro(n):
    S = 0
    s = []
    while S < n:
        s.append(min(n - S, random.randint(1, 8)))
        S += s[-1]
    return s

def seqdollar(n):
    S = 0
    s = []
    while S < n:
        s.append(random.randint(1, min(8, n - S)))
        S += s[-1]
    return s

def sample(m):
    # last[k] counts how often k was the final element
    last = [0] * 9
    for _ in range(10000):
        s = m(812)
        last[s[-1]] += 1
    return last[1:]

print('rejection', sample(seqsum))
print('hiro', sample(seqhiro))
print('dollar', sample(seqdollar))
Output:
rejection [1234, 1308, 1280, 1178, 1246, 1247, 1257, 1250]
hiro [2226, 1904, 1727, 1319, 1077, 895, 560, 292]
dollar [5220, 1803, 938, 595, 475, 393, 319, 257]
Dollar Akshay's method produces 1 as the final number approximately 20 times as often as 8. hiro's method is better, but 1 still appears approximately 7.5 times as often as 8 as the final entry.
The method of this answer (rejection sampling) gives an approximately equal distribution for all final digits in this sample of 10000 runs.
Obviously it's possible to hide the skew of the fast methods, for example by shuffling the array before returning it, but the statistically correct method in this answer has nothing to hide.
One solution is to generate random numbers such that they never cross 812. For example, when your sum is 806 you can only generate numbers in the range [1, 6].
So you should change your random number generator to be
item = random.randint(1, min(8, 812 - sum))
Your entire program also becomes much shorter. Note: do not use the variable name sum, as it shadows a built-in function.
import random
distribution = []
list_sum = 0
while list_sum < 812:
    item = random.randint(1, min(8, 812 - list_sum))
    distribution.append(item)
    list_sum += item
This is exactly what your if statements do, just rewritten with min:
import random
MAX = 812
distribution = []
s = 0
while s < MAX:
    item = min(MAX - s, random.randint(1, 8))
    distribution.append(item)
    s += item
Note that this will slightly skew your distribution: the last number will tend to be smaller than the others. One way to hide that a little would be to shuffle the list afterwards.
To get a really uniform distribution you'd have to start over whenever you overshoot.
An answer that's similar to yours
import random
distribution = []
lsum = 0
while True:
    item = random.randint(1, 8)
    lsum = lsum + item
    distribution.append(item)
    if lsum > 812:
        lsum = lsum - item
        distribution.pop()
    if lsum == 812:
        break
Because you know the max value of items added, could you not just deterministically end with the last item as soon as you are within range of this max value at the end? In that case, you could do something like:
import random
def create_random_list(total_sum, min_value=1, max_value=8):
    distribution = []
    current_sum = 0
    done = False
    while not done:
        if total_sum - current_sum <= max_value:
            # within range of the target: finish with the exact remainder
            new_item = total_sum - current_sum
            done = True
        else:
            new_item = random.randint(min_value, max_value)
        distribution.append(new_item)
        current_sum += new_item
    return distribution

random_numbers = create_random_list(total_sum=812)
print(f"sum {sum(random_numbers)} of list {random_numbers}")
Note that I have renamed your variable sum to current_sum, because sum is a built-in function and therefore not a good variable name to pick.
The result looks like:
sum 812 of list [7, 5, 7, 4, 7, 7, 7, 1, 2, 6, 6, 4, 5, 3, 1, 4, 4, 2, 8, 7, 5, 5, 5, 5, 1, 1, 1, 4, 5, 5, 1, 1, 8, 8, 6, 5, 5, 5, 8, 2, 3, 7, 8, 6, 2, 6, 7, 4, 7, 7, 8, 7, 1, 4, 7, 2, 2, 6, 4, 4, 3, 4, 2, 6, 3, 3, 3, 4, 1, 3, 6, 6, 8, 5, 6, 3, 3, 7, 6, 8, 5, 3, 5, 4, 1, 7, 6, 5, 4, 1, 7, 1, 5, 1, 7, 3, 4, 2, 3, 3, 1, 4, 6, 5, 4, 1, 1, 1, 3, 7, 1, 3, 8, 8, 7, 4, 4, 8, 8, 8, 5, 6, 5, 6, 8, 5, 7, 2, 2, 8, 5, 1, 5, 4, 3, 2, 1, 1, 8, 8, 8, 8, 2, 1, 1, 4, 8, 4, 1, 2, 2, 2, 8, 5, 6, 5, 4, 1, 3, 4, 3, 3, 2, 2, 5, 7, 8, 1, 1, 8, 2, 7, 7, 2, 5, 1, 7, 7, 2, 3, 7, 7]
How can I replace the numbers that are greater than 9 with the sum of their digits?
Right now the list multipliedlist is
[1, 4, 3, 8, 5, 12, 7, 16, 2]
I need to change it to the following (e.g. 12 and 16 are replaced by 3 and 7):
[1, 4, 3, 8, 5, 3, 7, 7, 2]
I can use sum(map(int, str(number))) to add the digits, but how can I change the values in the same list by their index?
def check_id_valid(id_number):
    updatedid = map(int, str(id_number))
    multipliedlist = [i * 1 if j % 2 == 0 else i * 2 for j, i in enumerate(updatedid)]
    # for index, number in enumerate(multipliedlist):
    #     if multipliedlist[index] > 9:
    #         multipliedlist[index] = sum(map(int, str(number)))
    #     else:
    #         multipliedlist[index] == number #statement has no effect error.

print(check_id_valid(123456782))
I'm new to Python, so sorry if this is not explained as well as it should be.
I appreciate any help, thanks :)
Using a list comprehension
Ex:
data = [1, 4, 3, 8, 5, 12, 7, 16, 2]
print([sum(map(int, str(i))) if i > 9 else i for i in data])
Output:
[1, 4, 3, 8, 5, 3, 7, 7, 2]
Break your task into its constituent parts, namely:
- replacing a number with the sum of its digits
- doing that for a list of numbers.
def sum_digits(number):
    # Convert the number into a string (10 -> "10"),
    # iterate over its characters to convert each of them
    # back to an integer, then use the `sum()` builtin for
    # summing.
    return sum(int(digit_char) for digit_char in str(number))

def sum_all_digits(numbers):
    return [sum_digits(number) for number in numbers]

print(sum_all_digits([1, 4, 3, 8, 5, 12, 7, 16, 2]))
outputs the expected
[1, 4, 3, 8, 5, 3, 7, 7, 2]
To change values by index you can use the enumerate() function:
def sum_digits(n):
    r = 0
    while n:
        r, n = r + n % 10, n // 10
    return r

multipliedlist = [1, 4, 3, 8, 5, 12, 7, 16, 2]
for i, n in enumerate(multipliedlist):
    multipliedlist[i] = sum_digits(multipliedlist[i])
print(multipliedlist)
[1, 4, 3, 8, 5, 3, 7, 7, 2]
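Putting the pieces together for the function in the question, the doubling and the digit-sum step can be done in one helper. This is a minimal sketch that assumes every product has at most two digits (true when single digits are doubled), so sum(divmod(n, 10)) is enough. The name reduced_digits is hypothetical, and what check_id_valid should finally do with the list is left open because the question does not say.

```python
def reduced_digits(id_number):
    # Split the number into its decimal digits.
    digits = [int(ch) for ch in str(id_number)]
    # Keep digits at even positions, double digits at odd positions.
    doubled = [d if pos % 2 == 0 else d * 2 for pos, d in enumerate(digits)]
    # divmod(n, 10) gives (tens, ones); their sum is the digit sum for n < 100.
    return [sum(divmod(n, 10)) for n in doubled]

print(reduced_digits(123456782))  # [1, 4, 3, 8, 5, 3, 7, 7, 2]
```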
list = [0, 1, 2, 3, 4, 1, 5, 0, 6, 5, 7, 8, 9, 10, 11, 12, 13, 2]
The list is used "matrix-like", with three numbers per line:
1. 0 1 2
2. 3 4 1
3. 5 0 6
... and so on. I would like to write all those lines into a new list/matrix, but skip every line that repeats a number from a line already kept. The order of the lines has to be preserved.
So far I use this:
compa = [0,1,2,3,4,1,5,0,6,5,7,8,9,10,11,12,13,2] #the list to be used as base
temp = [0,1,2] #new list starts from the first element
temp2 = [12,13,2] #new list starts from the last element
Mischzahl = 3 #defines the number of elements in a line of the "matrix"
n = 0
while n < len(compa):
    for m in range(0,len(temp)):
        if temp[m] == compa[n]:
            n = (int(n/Mischzahl) + 1) * Mischzahl - 1 #calculates the "foul" line and sets n to the next line
            break
        if (n + 1) % Mischzahl == 0 and m == len(temp) - 1 : #if the end of temp is reached, the current line is transferred to temp.
            for p in range(Mischzahl):
                temp.append(compa[Mischzahl*int(n/Mischzahl) + p])
    n += 1
and the same backwards
n = len(compa) - 1
while n > 0: #same as above but starting from last element
    for m in range(len(temp2)):
        if temp2[m] == compa[n]:
            n = (int(n/Mischzahl) - 1) * Mischzahl + Mischzahl
            break
        if (n) % Mischzahl == 0 and m == len(temp2) - 1:
            for p in range(Mischzahl):
                temp2.append(compa[Mischzahl*int(n/Mischzahl) + p])
    n = n - 1
resulting output for temp and temp2:
[0, 1, 2, 3, 4, 1, 5, 0, 6, 5, 7, 8, 9, 10, 11, 12, 13, 2] #compa
[0, 1, 2, 5, 7, 8, 9, 10, 11] #temp
[12, 13, 2, 9, 10, 11, 5, 7, 8, 3, 4, 1] #temp2
Since this is the most time-consuming part of the script: Is there a more efficient way to do this? Any helpful advice or direction would be highly welcome.
You can define a function that iterates over the list in strides of a given length (3 in your case) and checks whether any element of the stride is already in a set of seen numbers. If none is, it extends the output list with the stride and updates the set.
from math import ceil

def unique_by_row(compa, stride_size=3, reverse=False):
    strides = ceil(len(compa)/stride_size)
    out = []
    check = set()
    it = range(strides)
    if reverse:
        it = reversed(it)
    for i in it:
        x = compa[stride_size*i:stride_size*(i+1)]
        if not check.intersection(x):
            out.extend(x)
            check.update(x)
    return out
Tests:
compa = [0, 1, 2, 3, 4, 1, 5, 0, 6, 5, 7, 8, 9, 10, 11, 12, 13, 2]
unique_by_row(compa)
# returns:
[0, 1, 2, 5, 7, 8, 9, 10, 11]
unique_by_row(compa, reverse=True)
# returns:
[12, 13, 2, 9, 10, 11, 5, 7, 8, 3, 4, 1]
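If the kept lines are needed as separate rows rather than one flat list, the same set-based check can return a list of rows. This is a sketch under the same rule as above (a row is dropped when it shares a number with any row already kept); rows_without_repeats is a made-up name.

```python
def rows_without_repeats(flat, width=3, reverse=False):
    # Cut the flat list into rows of `width` items.
    rows = [flat[i:i + width] for i in range(0, len(flat), width)]
    if reverse:
        rows.reverse()
    seen = set()
    kept = []
    for row in rows:
        # Keep the row only if none of its numbers has been seen before.
        if seen.isdisjoint(row):
            kept.append(row)
            seen.update(row)
    return kept

compa = [0, 1, 2, 3, 4, 1, 5, 0, 6, 5, 7, 8, 9, 10, 11, 12, 13, 2]
print(rows_without_repeats(compa))
print(rows_without_repeats(compa, reverse=True))
```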
I am trying to come up with a function that splits a list evenly into a number of parts that depends on its original length.
So, for example, if the returned dataset has 2000 items I would like to split it into 4, whereas if it has 1500 items I would split it into 3.
Then to call the function:
Thread_A_DATA, Thread_B_DATA = split_list( SQL_RETURN )
I would like to do something like the following:
if len(dataset) <= 1000:
    # Split in 2
    a, b = split_list(dataset, 2)
if len(dataset) > 1000 and len(dataset) <= 1500:
    # Split in 3
    a, b, c = split_list(dataset, 3)
# etc etc...
I've managed to split a dataset in half using this code found previously on Stack Overflow:
def split_list( a_list ):
    half = len( a_list ) // 2  # integer division, so the result can be used as a slice index
    return a_list[:half], a_list[half:]
But I can't work it out with 3,4 or 5 splits!
If anyone can help that would be great.
Thanks in advance.
As I understand the question, you don't want to split every 500 elements but instead split in 2 if there are at most 1000 elements, in 3 if at most 1500, in 4 for 2000, etc. But if there are 1700 elements, you would split in 4 groups of 425 elements (that's what I understand by "split evenly").
So, here's my solution:
def split_list(a_list, number_of_splits):
    step = len(a_list) / number_of_splits + (1 if len(a_list) % number_of_splits else 0)
    return [a_list[i*step:(i+1)*step] for i in range(number_of_splits)]

l = [1, 8, 2, 3, 4, 5, 6, 7, 1, 5, 3, 1, 2, 5]
print l
print split_list(l, 3)
print split_list(l, 2)
Output
[1, 8, 2, 3, 4, 5, 6, 7, 1, 5, 3, 1, 2, 5]
[[1, 8, 2, 3, 4], [5, 6, 7, 1, 5], [3, 1, 2, 5]]
[[1, 8, 2, 3, 4, 5, 6], [7, 1, 5, 3, 1, 2, 5]]
edit: Python 3 version:
def split_list(a_list, number_of_splits):
    step = len(a_list) // number_of_splits + (1 if len(a_list) % number_of_splits else 0)
    return [a_list[i*step:(i+1)*step] for i in range(number_of_splits)]

l = [1, 8, 2, 3, 4, 5, 6, 7, 1, 5, 3, 1, 2, 5]
print(l)
print(split_list(l, 3))
print(split_list(l, 2))
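With a rounded-up step, the last chunk can end up much shorter than the others (10 items in 4 parts gives sizes 3, 3, 3, 1). A variant that spreads the remainder so that chunk sizes differ by at most one is sketched below. Both names are hypothetical, and parts_for encodes an assumed reading of the question: one part per started block of 500 items, with at least 2 parts.

```python
from math import ceil

def parts_for(length, per_part=500):
    # Assumed rule: one part per started block of `per_part` items, at least 2 parts.
    return max(2, ceil(length / per_part))

def split_evenly(items, parts):
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # the first `extra` chunks get one more item
        chunks.append(items[start:start + size])
        start += size
    return chunks

data = list(range(1700))
print([len(c) for c in split_evenly(data, parts_for(len(data)))])  # [425, 425, 425, 425]
```

Because split_evenly always returns exactly `parts` chunks, unpacking such as a, b, c = split_evenly(dataset, 3) works for any list length.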
Python 3
def splitList(L):
    return [L[i:i+500] for i in range(0, len(L), 500)]
Python 2
def splitList(L):
    return [L[i:i+500] for i in xrange(0, len(L), 500)]
def split_it(a_list,size_of_split):
    return zip(*[iter(a_list)]*size_of_split)
is fun (Python 2):
print split_it(range(100),3) # splits it into groups of 3
Unfortunately this will truncate the end of the list if its length does not divide evenly by size_of_split. You can fix it like so (the check on rest avoids appending the whole list again when the length does divide evenly):
rest = len(a_list) % size_of_split
return zip(*[iter(a_list)]*size_of_split) + ([tuple(a_list[-rest:])] if rest else [])
If you wanted to cut it into, say, 7 pieces, you can find the size of the split by
split_size = len(a_list) / num_splits
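In Python 3, zip returns an iterator, so the list concatenation above does not carry over directly. A comparable fixed-size chunker that keeps the short tail can be sketched with itertools.islice; the name chunks_of is made up for illustration.

```python
from itertools import islice

def chunks_of(iterable, size):
    """Yield tuples of `size` items; the last tuple may be shorter."""
    it = iter(iterable)
    while True:
        chunk = tuple(islice(it, size))
        if not chunk:
            return  # iterator exhausted, nothing left to yield
        yield chunk

print(list(chunks_of(range(10), 3)))  # [(0, 1, 2), (3, 4, 5), (6, 7, 8), (9,)]
```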
Python 2.7
>>> import math
>>> lst = range(35)
>>> t = 3 # number of chunks to split into
>>> n = int(math.ceil(len(lst) / float(t)))
>>> res = [lst[i:i+n] for i in range(0, len(lst), n)]
>>> res
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23], [24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34]]