How do you sort all the values in a nested list so that each sublist keeps its original length, but the values move between sublists so that they are sorted overall, not just within each sublist individually?
For instance:
list1=[[0.10, 0.90, 0.20], [0.15, 0.80], [0.68, 0.08, 0.30]]
Becomes:
list1=[[0.08, 0.10, 0.15], [0.20, 0.30], [0.68, 0.80, 0.90]]
Any help is appreciated
This works.
list1=[[0.10, 0.90, 0.20], [0.15, 0.80], [0.68, 0.08, 0.30]]
list_lengths = [len(x) for x in list1]
flattened = [item for items in list1 for item in items]
items_sorted = sorted(flattened)
loc = 0
lists2 = []
for length in list_lengths:
    lists2.append(items_sorted[loc:loc+length])
    loc += length
print(lists2)
You need the length of each sublist at some point to build the final lists2. To get the values in the right order, you flatten and sort the list, then you add sublists to lists2 by slicing the sorted items.
Note that this will work for arbitrary length lists and tuples.
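As a small sketch of that last point, the same steps can be wrapped in a function. The name sort_across_sublists is made up for illustration, and rebuilding each sublist with type(sub)(...) is an assumption that holds for lists and tuples but not for every container type:

```python
def sort_across_sublists(nested):
    """Sort all values globally while keeping each sublist's length."""
    # Flatten and sort every value in one pass.
    flat = sorted(item for sub in nested for item in sub)
    result = []
    loc = 0
    for sub in nested:
        # Take the next len(sub) values and rebuild the same container type.
        result.append(type(sub)(flat[loc:loc + len(sub)]))
        loc += len(sub)
    return result


mixed = [[0.10, 0.90, 0.20], (0.15, 0.80), [], [0.68, 0.08, 0.30]]
print(sort_across_sublists(mixed))
# [[0.08, 0.1, 0.15], (0.2, 0.3), [], [0.68, 0.8, 0.9]]
```

Without the type(sub)(...) call, every sublist in the result would be a plain list, because slicing a sorted list always yields a list.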
You can use chain.from_iterable to chain the lists, sort them and create an iterator. Then you can just iterate over the original lists and create a result using next:
>>> from itertools import chain
>>> l = [[0.10, 0.90, 0.20], [0.15, 0.80], [0.68, 0.08, 0.30]]
>>> it = iter(sorted(chain.from_iterable(l)))
>>> [[next(it) for _ in l2] for l2 in l]
[[0.08, 0.1, 0.15], [0.2, 0.3], [0.68, 0.8, 0.9]]
I would use itertools for this and confine the whole thing inside one function:
import itertools
def join_sort_slice(iterable):
    it = iter(sorted(itertools.chain(*iterable)))
    output = []
    for i in map(len, iterable):
        output.append(list(itertools.islice(it, i)))
    return output
Use it:
lst = [[0.10, 0.90, 0.20], [0.15, 0.80], [0.68, 0.08, 0.30]]
join_sort_slice(lst)
# [[0.08, 0.1, 0.15], [0.2, 0.3], [0.68, 0.8, 0.9]]
The idea is to chain all sublists together and then sort the outcome. This sorted output is then sliced based on the lengths of the original list of lists.
I hope this helps.
Similar to @Evan's answer:
import itertools
import numpy as np
def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = itertools.tee(iterable)
    next(b, None)
    return zip(a, b)
list1=[[0.10, 0.90, 0.20], [0.15, 0.80], [0.68, 0.08, 0.30]]
# get the sizes of each of the sublists and where they start
sizes = [len(l) for l in list1]
sizes.insert(0,0)
offsets = np.cumsum(sizes)
# flatten and sort
flat_list = sorted(itertools.chain(*list1))
nested = [flat_list[begin:end] for begin, end in pairwise(offsets)]
print(nested)
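If pulling in numpy only for cumsum feels heavy, the offsets can probably be built with the standard library instead. The following is a sketch, not the answer's own code: it swaps np.cumsum for itertools.accumulate, and the names starts and ends are chosen here for illustration.

```python
from itertools import accumulate, chain

list1 = [[0.10, 0.90, 0.20], [0.15, 0.80], [0.68, 0.08, 0.30]]

# Running totals of the sublist lengths give each slice's end position.
ends = list(accumulate(len(sub) for sub in list1))   # [3, 5, 8]
# Each slice starts where the previous one ended; the first starts at 0.
starts = [0] + ends[:-1]                             # [0, 3, 5]

# flatten and sort, then cut the sorted values at the offsets
flat_list = sorted(chain.from_iterable(list1))
nested = [flat_list[b:e] for b, e in zip(starts, ends)]
print(nested)
# [[0.08, 0.1, 0.15], [0.2, 0.3], [0.68, 0.8, 0.9]]
```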
Another variation with itertools:
import itertools
list1=[[0.10, 0.90, 0.20], [0.15, 0.80], [0.68, 0.08, 0.30]]
sorted_l = sorted(itertools.chain.from_iterable(list1))
result = []
k=0
for i in (len(i) for i in list1):
    result.append(sorted_l[k:k+i])
    k=k+i
print(result)
The output:
[[0.08, 0.1, 0.15], [0.2, 0.3], [0.68, 0.8, 0.9]]
Related
I have this exercise and the goal is to solve it with complexity less than O(n^2).
You have an array of length N filled with event probabilities. Create another array in which, for each element i, you calculate the probability that all events up to position i happen.
I have coded this O(n^2) solution. Any ideas how to improve it?
probabilityTable = [0.1, 0.54, 0.34, 0.11, 0.55, 0.75, 0.01, 0.06, 0.96]
finalTable = list()
for i in range(len(probabilityTable)):
    finalTable.append(1)
    for j in range(i):
        finalTable[i] *= probabilityTable[j]
for item in finalTable:
    print(item)
probabilityTable = [0.1, 0.54, 0.34, 0.11, 0.55, 0.75, 0.01, 0.06, 0.96]
finalTable = probabilityTable.copy()
for i in range(1, len(probabilityTable)):
    finalTable[i] = finalTable[i] * finalTable[i - 1]
for item in finalTable:
    print(item)
new_probs = [probabilityTable[0]]
for prob in probabilityTable[1:]:
    # multiply, not add: the probability that all events happen is a product
    new_probs.append(new_probs[-1] * prob)
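The running product can also be written with the standard library. This is a sketch rather than any of the answers' own code: it uses itertools.accumulate with operator.mul, and the names running and exclusive are made up here. Note that running includes element i itself, while the loop in the question multiplies only the elements before i, so exclusive is added to reproduce that behaviour.

```python
from itertools import accumulate
from operator import mul

probabilityTable = [0.1, 0.54, 0.34, 0.11, 0.55, 0.75, 0.01, 0.06, 0.96]

# accumulate keeps a single running product, so each element is visited once: O(n).
running = list(accumulate(probabilityTable, mul))

# Same values shifted by one position, starting from 1, to match the question's
# original loop (product of the elements strictly before position i).
exclusive = [1] + running[:-1]

for item in running:
    print(item)
```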
I'm not sure what is wrong with this function, but I would appreciate any help I could get with it. I'm new to Python and a bit confused.
def summer(tables):
    """
    MODIFIES the table to add a column summing the previous elements in the row.
    Example: Suppose that a is
    [['First', 'Second', 'Third'], [0.1, 0.3, 0.5], [0.6, 0.2, 0.7], [0.5, 1.1, 0.1]]
    then place_sums(a) modifies the table a so that it is now
    [['First', 'Second', 'Third', 'Sum'],
    [0.1, 0.3, 0.5, 0.9], [0.6, 0.2, 0.7, 1.5], [0.5, 1.1, 0.1, 1.7]]
    Parameter table: the nested list to process
    """
    numrows = len(tables)
    sums = []
    for n in range(numrows):
        sums = [sum(item) for item in tables]
        return sums
This is what you are looking for. You don't need to create a new list; you just need to update your variable tables. Also, putting a return statement inside your loop makes it run only one iteration. You should look at how a for loop works and what the return statement actually does.
def summer(tables):
    """
    MODIFIES the table to add a column summing the previous elements in the row.
    Example: Suppose that a is
    [['First', 'Second', 'Third'], [0.1, 0.3, 0.5], [0.6, 0.2, 0.7], [0.5, 1.1, 0.1]]
    then place_sums(a) modifies the table a so that it is now
    [['First', 'Second', 'Third', 'Sum'],
    [0.1, 0.3, 0.5, 0.9], [0.6, 0.2, 0.7, 1.5], [0.5, 1.1, 0.1, 1.7]]
    Parameter table: the nested list to process
    """
    # The first row is the header, so it gets a label instead of a number.
    tables[0].append('Sum')
    for i in range(1, len(tables)):
        tables[i].append(sum(tables[i]))
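A short usage sketch may make the in-place behaviour easier to see. The function is restated under the name place_sums, borrowed from the docstring, and the integer rows are made-up data chosen so the sums are exact:

```python
def place_sums(table):
    """Append a 'Sum' column to table in place; the first row is the header."""
    table[0].append('Sum')
    for row in table[1:]:
        # sum(row) is evaluated before append runs, so the new cell is not counted.
        row.append(sum(row))


a = [['First', 'Second', 'Third'], [1, 3, 5], [6, 2, 7], [5, 11, 1]]
result = place_sums(a)
print(result)  # None: the function modifies a instead of returning a new table
print(a)
# [['First', 'Second', 'Third', 'Sum'], [1, 3, 5, 9], [6, 2, 7, 15], [5, 11, 1, 17]]
```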
Does anybody have an idea how to get the elements in a list whose values fall within a specific (from - to) range?
I need a loop to check whether a list contains elements in a specific range and, if it does, to save the biggest one in a variable.
Example:
list = [0.5, 0.56, 0.34, 0.45, 0.53, 0.6]
# range (0.5 - 0.58)
# biggest = 0.56
You could use a filtered comprehension to get only those elements in the range you want, then find the biggest of them using the built-in max():
lst = [0.5, 0.56, 0.34, 0.45, 0.53, 0.6]
biggest = max([e for e in lst if 0.5 < e < 0.58])
# biggest = 0.56
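One caveat: max() raises ValueError when no element falls inside the range. A minimal sketch of one way around that, assuming inclusive bounds are wanted and that None is an acceptable "nothing found" value (both are assumptions, not part of the question):

```python
lst = [0.5, 0.56, 0.34, 0.45, 0.53, 0.6]
low, high = 0.5, 0.58

# A generator expression avoids building a temporary list.
# default=None is returned when nothing falls inside the range.
biggest = max((e for e in lst if low <= e <= high), default=None)
print(biggest)  # 0.56

nothing = max((e for e in lst if 0.7 <= e <= 0.8), default=None)
print(nothing)  # None
```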
As an alternative to other answers, you can also use filter and lambda:
lst = [0.5, 0.56, 0.34, 0.45, 0.53, 0.6]
biggest = max([i for i in filter(lambda x: 0.5 < x < 0.58, lst)])
I suppose a normal if check would be faster, but I'll give this just for completeness.
Also, you should not use list = ..., as list is a built-in name in Python and the assignment shadows it.
You could also go about it a step at a time, as the approach may aid in debugging.
I used numpy in this case, which is also a helpful tool to put in your tool belt.
This should run as is:
import numpy as np
l = [0.5, 0.56, 0.34, 0.45, 0.53, 0.6]
a = np.array(l)
low = 0.5
high = 0.58
index_low = (a < high)
print(index_low)
a_low = a[index_low]
print(a_low)
index_in_range = (a_low >= low)
print(index_in_range)
a_in_range = a_low[index_in_range]
print(a_in_range)
a_max = a_in_range.max()
print(a_max)
I would like to know if there is an equivalent for pandas.Series.unique() when the series contains non-hashable elements (in my case, lists).
For instance, with
>> ds
XTR
s0b0_VARC-0.200 [0.05, 0.05]
s0b0_VARC-0.100 [0.05, 0.05]
s0b0_VARC0.000 [0.05, 0.05]
s0b0_VARC0.100 [0.05, 0.05]
s0b1_VARC-0.200 [0.05, 0.05]
s0b1_VARC0.000 [0.05, 0.05]
s0b1_VARC0.100 [0.05, 0.05]
s0b2_VARC-0.200 [0.05, 0.05]
s0b2_VARC-0.100 [0.06, 0.025]
s0b2_VARC0.000 [0.05, 0.05]
s0b2_VARC0.100 [0.05, 0.05]
I would like to get
>> ds.unique()
2
Thanks @Quang Hoang
Inspired by this SO answer, I wrote the following function (not sure how robust it is, though):
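import pandas as pd

def count_unique_values(series):
    try:
        # Lists are not hashable, so convert each element to a tuple first.
        tuples = [tuple(x) for x in series.values]
        series = pd.Series(tuples)
        nb = len(series.unique())
        print(nb)
    except TypeError:
        # Elements are not iterable (e.g. plain numbers): count them directly.
        nb = len(series.unique())
    return nb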
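The core of the function is the list-to-tuple conversion. A minimal sketch of just that step without pandas, assuming every element is a flat list of hashable values (the sample data below is abbreviated from the question):

```python
values = [[0.05, 0.05], [0.05, 0.05], [0.06, 0.025], [0.05, 0.05]]

# Lists are unhashable, but tuples are, so the tuples can be collected in a set.
unique_values = {tuple(v) for v in values}
print(len(unique_values))  # 2
```

With a Series, series.map(tuple).nunique() should give the same count under the same assumption.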
I am trying to select three random elements from within an array.
I currently have implemented:
result = np.random.uniform(np.min(dataset[:,1]), np.max(dataset[:,1]), size=3)
This returns three random floats between the min and max of the column. I am struggling to find a way to select random elements from the array itself, instead of random floats that may not exist as elements of the array.
I have also tried:
result = random.choice(dataset[:,0])
This only returns a single element. Is it possible to return three with this function?
You can use random.sample() if you want to sample without replacement, i.e. the same element can't be picked twice.
>>> import random
>>> l = [0.3, 0.2, 0.1, 0.4, 0.5, 0.6]
>>> random.sample(l, 3)
[0.3, 0.5, 0.1]
If you want to sample with replacement, you can use random.choices():
>>> import random
>>> l = [0.3, 0.2, 0.1, 0.4, 0.5, 0.6]
>>> random.choices(l, k=3)
[0.3, 0.5, 0.3]
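To make the difference between the two calls concrete, here is a small sketch. The seed is there only so the run is repeatable, and the variable names are chosen for illustration:

```python
import random

l = [0.3, 0.2, 0.1, 0.4, 0.5, 0.6]

random.seed(0)  # fixed seed only so the run is repeatable; the value is arbitrary

without_replacement = random.sample(l, 3)   # three different positions of l
with_replacement = random.choices(l, k=3)   # positions may repeat

# Because all values in l are distinct, sample() can never return duplicates.
print(len(set(without_replacement)) == 3)   # True
# Every picked value is an actual element of l, unlike random.uniform().
print(all(x in l for x in without_replacement + with_replacement))  # True
```

random.sample() raises ValueError if more elements are requested than the list contains, whereas random.choices() accepts any k.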
You can use random.choices instead:
result = random.choices(dataset[:,0], k=3)