Recursively dividing up a list, based on the endpoints - python

Here is what I am trying to do.
Take the list:
list1 = [0,2]
This list has start point 0 and end point 2.
Now, if we were to take the midpoint of this list, the list would become:
list1 = [0,1,2]
Now, if we were to recursively split up the list again (take the midpoints of the midpoints), the list would become:
list1 = [0,.5,1,1.5,2]
I need a function that will generate lists like this, preferably by keeping track of a variable. So, for instance, let's say there is a variable, n, that keeps track of something. When n = 1, the list might be [0,1,2] and when n = 2, the list might be [0,.5,1,1.5,2], and I am going to increment the value of n to keep track of how many times I have divided up the list.
I know you need to use recursion for this, but I'm not sure how to implement it.
Should be something like this:
def recursive(list1, a, b, n):
    """list1 is a list of values, a and b are the start
    and end points of the list, and n is an int representing
    how many times the list needs to be divided"""
    mid = len(list1) // 2
    # stuff
Could someone help me write this function? Not for homework, part of a project I'm working on that involves using mesh analysis to divide up a rectangle into parts.
This is what I have so far:
def recursive(a, b, list1, n):
    w = b - a
    mid = a + w / 2
    left = list1[0:mid]
    right = list1[mid:len(list1)-1]
    return recursive(a, mid, list1, n) + mid + recursive(mid, b, list1, n)
but I'm not sure how to incorporate n into here.
NOTE: The list1 would initially be [a,b] - I would just manually enter that but I'm sure there's a better way to do it.

You've generated some interesting answers. Here are two more.
My first uses an iterator to avoid
slicing the list and is recursive because that seems like the most natural formulation.
def list_split(orig, n):
    if not n:
        return orig
    else:
        li = iter(orig)
        this = next(li)
        result = [this]
        for nxt in li:
            result.extend([(this+nxt)/2, nxt])
            this = nxt
        return list_split(result, n-1)

for i in range(6):
    print(i, list_split([0, 2], i))
This prints
0 [0, 2]
1 [0, 1.0, 2]
2 [0, 0.5, 1.0, 1.5, 2]
3 [0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2]
4 [0, 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875, 1.0, 1.125, 1.25, 1.375, 1.5, 1.625, 1.75, 1.875, 2]
5 [0, 0.0625, 0.125, 0.1875, 0.25, 0.3125, 0.375, 0.4375, 0.5, 0.5625, 0.625, 0.6875, 0.75, 0.8125, 0.875, 0.9375, 1.0, 1.0625, 1.125, 1.1875, 1.25, 1.3125, 1.375, 1.4375, 1.5, 1.5625, 1.625, 1.6875, 1.75, 1.8125, 1.875, 1.9375, 2]
My second is based on the observation that recursion isn't necessary if you always start from two elements. Suppose those elements are mn and mx. After N applications of the split operation the list will contain 2**N+1 elements, so the numerical distance between adjacent elements will be (mx-mn)/(2**N).
Given this information it should therefore be possible to deterministically compute the elements of the array, or, even easier, to use numpy.linspace like this:
import numpy

def grid(emin, emax, N):
    return numpy.linspace(emin, emax, 2**N + 1)
This appears to give the same answers, and will probably serve you best in the long run.
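For reference, here is a sketch (mine, not part of the original answer) of that same deterministic computation in plain Python, without numpy; the name grid_py is made up:
def grid_py(mn, mx, N):
    # 2**N + 1 evenly spaced points from mn to mx, endpoints included
    step = (mx - mn) / 2 ** N
    return [mn + i * step for i in range(2 ** N + 1)]

print(grid_py(0, 2, 2))  # [0.0, 0.5, 1.0, 1.5, 2.0]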

You can use some arithmetic and slicing to figure out the size of the result, and fill it efficiently with values.
While not required, you can implement a recursive call by wrapping this functionality in a simple helper function, which checks what iteration of splitting you are on, and splits the list further if you are not at your limit.
def expand(a):
    """
    expands a list based on average values between every two values
    """
    o = [0] * ((len(a) * 2) - 1)                    # result holds originals plus midpoints
    o[::2] = a                                      # originals land on the even indices
    o[1::2] = [(x+y)/2 for x, y in zip(a, a[1:])]   # midpoints fill the odd indices
    return o

def rec_expand(a, n):
    if n == 0:
        return a
    else:
        return rec_expand(expand(a), n-1)
In action
>>> rec_expand([0, 2], 2)
[0, 0.5, 1.0, 1.5, 2]
>>> rec_expand([0, 2], 4)
[0,
0.125,
0.25,
0.375,
0.5,
0.625,
0.75,
0.875,
1.0,
1.125,
1.25,
1.375,
1.5,
1.625,
1.75,
1.875,
2]

You could do this with a for loop:
import numpy as np

def add_midpoints(orig_list, n):
    for i in range(n):
        new_list = []
        for j in range(len(orig_list)-1):
            new_list.append(np.mean(orig_list[j:(j+2)]))
        orig_list = orig_list + new_list
        orig_list.sort()
    return orig_list
add_midpoints([0,2],1)
[0, 1.0, 2]
add_midpoints([0,2],2)
[0, 0.5, 1.0, 1.5, 2]
add_midpoints([0,2],3)
[0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2]

You can also do this non-recursively, in a single list comprehension. What we're doing here is just making a binary scale between two numbers, like on most Imperial-system rulers.
def binary_scale(start, stop, level):
    length = stop - start
    scale = 2 ** level
    return [start + i * length / scale for i in range(scale + 1)]
In use:
>>> binary_scale(0, 10, 0)
[0.0, 10.0]
>>> binary_scale(0, 10, 2)
[0.0, 2.5, 5.0, 7.5, 10.0]
>>> binary_scale(10, 0, 1)
[10.0, 5.0, 0.0]

Fun with anti-patterns:
def expand(a, n):
    for _ in range(n):
        a[:-1] = sum(([a[i], (a[i] + a[i + 1]) / 2] for i in range(len(a) - 1)), [])
    return a

print(expand([0, 2], 2))
OUTPUT
% python3 test.py
[0, 0.5, 1.0, 1.5, 2]
%

Related

Most efficient way to convert list of values to probability distribution?

I have several lists that can only contain the following values: 0, 0.5, 1, 1.5
I want to efficiently convert each of these lists into probability mass functions. So if a list is as follows: [0.5, 0.5, 1, 1.5], the PMF will look like this: [0, 0.5, 0.25, 0.25].
I need to do this many times (and with very large lists), so avoiding looping will be optimal, if at all possible. What's the most efficient way to make this happen?
Edit: Here's my current system. This feels like a really inefficient/unelegant way to do it:
def get_distribution(samplemodes1):
    n, bin_edges = np.histogram(samplemodes1, bins = 9)
    totalcount = np.sum(n)
    bin_probability = n / totalcount
    bins_per_point = np.fmin(np.digitize(samplemodes1, bin_edges), len(bin_edges)-1)
    probability_perpoint = [bin_probability[bins_per_point[i]-1] for i in range(len(samplemodes1))]
    counts = Counter(samplemodes1)
    total = sum(counts.values())
    probability_mass = {k:v/total for k,v in counts.items()}
    #print(probability_mass)
    key_values = {}
    if(0 in probability_mass):
        key_values[0] = probability_mass.get(0)
    else:
        key_values[0] = 0
    if(0.5 in probability_mass):
        key_values[0.5] = probability_mass.get(0.5)
    else:
        key_values[0.5] = 0
    if(1 in probability_mass):
        key_values[1] = probability_mass.get(1)
    else:
        key_values[1] = 0
    if(1.5 in probability_mass):
        key_values[1.5] = probability_mass.get(1.5)
    else:
        key_values[1.5] = 0
    distribution = list(key_values.values())
    return distribution
Here are some solutions for you to benchmark:
Using collections.Counter
from collections import Counter

bins = [0, 0.5, 1, 1.5]
a = [0.5, 0.5, 1.0, 0.5, 1.0, 1.5, 0.5]
denom = len(a)
counts = Counter(a)
pmf = [counts[bin]/denom for bin in bins]
NumPy based solution
import numpy as np
bins = [0, 0.5, 1, 1.5]
a = np.array([0.5, 0.5, 1.0, 0.5, 1.0, 1.5, 0.5])
denom = len(a)
pmf = [(a == bin).sum()/denom for bin in bins]
but you can probably do better by using np.bincount() instead.
Further reading on this idea: https://thispointer.com/count-occurrences-of-a-value-in-numpy-array-in-python/
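For completeness, here is one way the np.bincount() idea might look. This is my sketch, not part of the original answer, and it assumes (as the question states) that the values are exact multiples of 0.5:
import numpy as np

bins = [0, 0.5, 1, 1.5]
a = np.array([0.5, 0.5, 1.0, 0.5, 1.0, 1.5, 0.5])
idx = (a * 2).astype(int)                       # map 0, 0.5, 1, 1.5 -> 0, 1, 2, 3
counts = np.bincount(idx, minlength=len(bins))  # count every bin in one pass
pmf = counts / len(a)                           # [0. 0.571... 0.285... 0.142...]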

list of floats in python [duplicate]

How do I iterate between 0 and 1 by a step of 0.1?
Trying range directly raises an error saying that the step argument cannot be zero:
for i in range(0, 1, 0.1):
    print(i)
Rather than using a decimal step directly, it's much safer to express this in terms of how many points you want. Otherwise, floating-point rounding error is likely to give you a wrong result.
Use the linspace function from the NumPy library (which isn't part of the standard library but is relatively easy to obtain). linspace takes a number of points to return, and also lets you specify whether or not to include the right endpoint:
>>> np.linspace(0,1,11)
array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])
>>> np.linspace(0,1,10,endpoint=False)
array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
If you really want to use a floating-point step value, use numpy.arange:
>>> import numpy as np
>>> np.arange(0.0, 1.0, 0.1)
array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
Floating-point rounding error will cause problems, though. Here's a simple case where rounding error causes arange to produce a length-4 array when it should only produce 3 numbers:
>>> numpy.arange(1, 1.3, 0.1)
array([1. , 1.1, 1.2, 1.3])
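One common workaround (a sketch, not from the original answer) is to generate integers with arange and scale afterwards, so no fractional step accumulates:
>>> np.arange(10, 13) / 10.0
array([1. , 1.1, 1.2])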
range() can only do integers, not floating point.
Use a list comprehension instead to obtain a list of steps:
[x * 0.1 for x in range(0, 10)]
More generally, a generator comprehension minimizes memory allocations:
xs = (x * 0.1 for x in range(0, 10))
for x in xs:
    print(x)
Building on 'xrange([start], stop[, step])', you can define a generator that accepts and produces any type you choose (stick to types supporting + and <):
>>> def drange(start, stop, step):
...     r = start
...     while r < stop:
...         yield r
...         r += step
...
>>> i0 = drange(0.0, 1.0, 0.1)
>>> ["%g" % x for x in i0]
['0', '0.1', '0.2', '0.3', '0.4', '0.5', '0.6', '0.7', '0.8', '0.9', '1']
Increase the magnitude of i for the loop and then reduce it when you need it.
for i * 100 in range(0, 100, 10):
    print i / 100.0
EDIT: I honestly cannot remember why I thought that would work syntactically
for i in range(0, 11, 1):
    print i / 10.0
That should have the desired output.
NumPy is a bit overkill, I think.
[p/10 for p in range(0, 10)]
[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
Generally speaking, to do a step-by-1/x up to y you would do
x=100
y=2
[p/x for p in range(0, int(x*y))]
[0.0, 0.01, 0.02, 0.03, ..., 1.97, 1.98, 1.99]
(1/x produced less rounding noise when I tested).
scipy has a built-in function arange which generalizes Python's range() constructor to satisfy your requirement of float handling.
from scipy import arange
Similar to R's seq function, this one returns a sequence in any order given the correct step value. The last value is equal to the stop value.
def seq(start, stop, step=1):
    n = int(round((stop - start)/float(step)))
    if n > 0:
        return [start + step*i for i in range(n+1)]
    elif n == 0:
        return [start]
    else:
        return []
Results
seq(1, 5, 0.5)
[1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
seq(10, 0, -1)
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
seq(10, 0, -2)
[10, 8, 6, 4, 2, 0]
seq(1, 1)
[1]
The range() built-in function returns a sequence of integer values, I'm afraid, so you can't use it to do a decimal step.
I'd say just use a while loop:
i = 0.0
while i <= 1.0:
    print i
    i += 0.1
If you're curious, Python is converting your 0.1 to 0, which is why it's telling you the argument can't be zero.
Here's a solution using itertools:
import itertools

def seq(start, end, step):
    if step == 0:
        raise ValueError("step must not be 0")
    sample_count = int(abs(end - start) / step)
    return itertools.islice(itertools.count(start, step), sample_count)
Usage Example:
for i in seq(0, 1, 0.1):
    print(i)
[x * 0.1 for x in range(0, 10)]
in Python 2.7x gives you the result of:
[0.0, 0.1, 0.2, 0.30000000000000004, 0.4, 0.5, 0.6000000000000001, 0.7000000000000001, 0.8, 0.9]
but if you use:
[round(x * 0.1, 1) for x in range(0, 10)]
it gives you the desired:
[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
import numpy as np

for i in np.arange(0, 1, 0.1):
    print i
Best Solution: no rounding error
>>> step = .1
>>> N = 10 # number of data points
>>> [ x / pow(step, -1) for x in range(0, N + 1) ]
[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
Or, for a set range instead of set data points (e.g. continuous function), use:
>>> step = .1
>>> rnge = 1 # NOTE range = 1, i.e. span of data points
>>> N = int(rnge / step)
>>> [ x / pow(step,-1) for x in range(0, N + 1) ]
[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
To implement a function: replace x / pow(step, -1) with f( x / pow(step, -1) ), and define f.
For example:
>>> import math
>>> def f(x):
...     return math.sin(x)
>>> step = .1
>>> rnge = 1 # NOTE range = 1, i.e. span of data points
>>> N = int(rnge / step)
>>> [ f( x / pow(step,-1) ) for x in range(0, N + 1) ]
[0.0, 0.09983341664682815, 0.19866933079506122, 0.29552020666133955, 0.3894183423086505,
0.479425538604203, 0.5646424733950354, 0.644217687237691, 0.7173560908995228,
0.7833269096274834, 0.8414709848078965]
And if you do this often, you might want to save the generated list r:
r = map(lambda x: x/10.0, range(0, 10))
for i in r:
    print i
more_itertools is a third-party library that implements a numeric_range tool:
import more_itertools as mit

for x in mit.numeric_range(0, 1, 0.1):
    print("{:.1f}".format(x))
Output
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
This tool also works for Decimal and Fraction.
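For example, here is a small sketch (mine) using Fraction steps, which avoids floating-point drift entirely:
from fractions import Fraction
import more_itertools as mit

for x in mit.numeric_range(Fraction(0), Fraction(1), Fraction(1, 10)):
    print(x)  # 0, 1/10, 1/5, 3/10, ... as exact rationals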
My versions use the original range function to create multiplicative indices for the shift. This allows the same syntax as the original range function.
I have made two versions, one using float, and one using Decimal, because I found that in some cases I wanted to avoid the roundoff drift introduced by the floating point arithmetic.
It is consistent with empty set results as in range/xrange.
Passing only a single numeric value to either function will return the standard range output to the integer ceiling value of the input parameter (so if you gave it 5.5, it would return range(6).)
Edit: the code below is now available as package on pypi: Franges
## frange.py
from math import ceil

# find best range function available to version (2.7.x / 3.x.x)
try:
    _xrange = xrange
except NameError:
    _xrange = range

def frange(start, stop = None, step = 1):
    """frange generates a set of floating point values over the
    range [start, stop) with step size step
    frange([start,] stop [, step ])"""
    if stop is None:
        for x in _xrange(int(ceil(start))):
            yield x
    else:
        # create a generator expression for the index values
        indices = (i for i in _xrange(0, int((stop-start)/step)))
        # yield results
        for i in indices:
            yield start + step*i
## drange.py
import decimal
from math import ceil

# find best range function available to version (2.7.x / 3.x.x)
try:
    _xrange = xrange
except NameError:
    _xrange = range

def drange(start, stop = None, step = 1, precision = None):
    """drange generates a set of Decimal values over the
    range [start, stop) with step size step
    drange([start,] stop, [step [,precision]])"""
    if stop is None:
        for x in _xrange(int(ceil(start))):
            yield x
    else:
        # find precision
        if precision is not None:
            decimal.getcontext().prec = precision
        # convert values to decimals
        start = decimal.Decimal(start)
        stop = decimal.Decimal(stop)
        step = decimal.Decimal(step)
        # create a generator expression for the index values
        # (the count must be a plain int for range/xrange)
        indices = (
            i for i in _xrange(
                0,
                int(((stop-start)/step).to_integral_value())
            )
        )
        # yield results
        for i in indices:
            yield float(start + step*i)
## testranges.py
import frange
import drange
list(frange.frange(0, 2, 0.5)) # [0.0, 0.5, 1.0, 1.5]
list(drange.drange(0, 2, 0.5, precision = 6)) # [0.0, 0.5, 1.0, 1.5]
list(frange.frange(3)) # [0, 1, 2]
list(frange.frange(3.5)) # [0, 1, 2, 3]
list(frange.frange(0,10, -1)) # []
Lots of the solutions here still had floating point errors in Python 3.6 and didn't do exactly what I personally needed.
The function below takes integers or floats, doesn't require imports and doesn't return floating point errors.
def frange(x, y, step):
    if int(x + y + step) == (x + y + step):
        r = list(range(int(x), int(y), int(step)))
    else:
        f = 10 ** (len(str(step)) - str(step).find('.') - 1)
        rf = list(range(int(x * f), int(y * f), int(step * f)))
        r = [i / f for i in rf]
    return r
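Example usage (my own addition, to show both branches):
print(frange(0, 1, 0.1))  # float step: [0.0, 0.1, 0.2, ..., 0.9]
print(frange(0, 10, 2))   # integer step: [0, 2, 4, 6, 8]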
Surprised no one has yet mentioned the recommended solution in the Python 3 docs:
See also:
The linspace recipe shows how to implement a lazy version of range that is suitable for floating-point applications.
Once defined, the recipe is easy to use and does not require numpy or any other external libraries, but functions like numpy.linspace(). Note that rather than a step argument, the third num argument specifies the number of desired values, for example:
print(linspace(0, 10, 5))
# linspace(0, 10, 5)
print(list(linspace(0, 10, 5)))
# [0.0, 2.5, 5.0, 7.5, 10]
I quote a modified version of the full Python 3 recipe from Andrew Barnert below:
import collections.abc
import numbers

class linspace(collections.abc.Sequence):
    """linspace(start, stop, num) -> linspace object
    Return a virtual sequence of num numbers from start to stop (inclusive).
    If you need a half-open range, use linspace(start, stop, num+1)[:-1].
    """
    def __init__(self, start, stop, num):
        if not isinstance(num, numbers.Integral) or num <= 1:
            raise ValueError('num must be an integer > 1')
        self.start, self.stop, self.num = start, stop, num
        self.step = (stop-start)/(num-1)
    def __len__(self):
        return self.num
    def __getitem__(self, i):
        if isinstance(i, slice):
            return [self[x] for x in range(*i.indices(len(self)))]
        if i < 0:
            i = self.num + i
        if i >= self.num:
            raise IndexError('linspace object index out of range')
        if i == self.num-1:
            return self.stop
        return self.start + i*self.step
    def __repr__(self):
        return '{}({}, {}, {})'.format(type(self).__name__,
                                       self.start, self.stop, self.num)
    def __eq__(self, other):
        if not isinstance(other, linspace):
            return False
        return ((self.start, self.stop, self.num) ==
                (other.start, other.stop, other.num))
    def __ne__(self, other):
        return not self == other
    def __hash__(self):
        return hash((type(self), self.start, self.stop, self.num))
This is my solution to get ranges with float steps.
Using this function it's not necessary to import numpy, nor install it.
I'm pretty sure that it could be improved and optimized. Feel free to do it and post it here.
from __future__ import division
from math import log

def xfrange(start, stop, step):
    old_start = start  # backup this value
    digits = int(round(log(10000, 10))) + 1  # get number of digits
    magnitude = 10 ** digits
    stop = int(magnitude * stop)   # convert from
    step = int(magnitude * step)   # 0.1 to 10 (e.g.)
    if start == 0:
        start = 10 ** (digits - 1)
    else:
        start = 10 ** (digits) * start
    data = []  # create array
    # calc number of iterations
    end_loop = int((stop - start) // step)
    if old_start == 0:
        end_loop += 1
    acc = start
    for i in xrange(0, end_loop):
        data.append(acc / magnitude)
        acc += step
    return data

print xfrange(1, 2.1, 0.1)
print xfrange(0, 1.1, 0.1)
print xfrange(-1, 0.1, 0.1)
The output is:
[1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0]
[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1]
[-1.0, -0.9, -0.8, -0.7, -0.6, -0.5, -0.4, -0.3, -0.2, -0.1, 0.0]
For the sake of completeness, a functional solution:
def frange(a, b, s):
    return [] if s > 0 and a > b or s < 0 and a < b or s == 0 \
        else [a] + frange(a + s, b, s)
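A usage example (mine); the endpoint is included when it lands exactly on a step:
print(frange(0, 1, 0.25))  # [0, 0.25, 0.5, 0.75, 1.0]
print(frange(5, 0, 1))     # [] -- step has the wrong sign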
You can use this function:
def frange(start, end, step):
    return map(lambda x: x*step, range(int(start*1./step), int(end*1./step)))
It can be done using the NumPy library. The arange() function allows steps in float. It returns a NumPy array, which can be converted to a list using tolist() for convenience.
import numpy as np

for i in np.arange(0, 1, 0.1).tolist():
    print i
Here both start and stop are inclusive, rather than one or the other (usually stop is excluded); no imports are needed, and it uses generators:
def rangef(start, stop, step, fround=5):
    """
    Yields sequence of numbers from start (inclusive) to stop (inclusive)
    by step (increment) with rounding set to n digits.
    :param start: start of sequence
    :param stop: end of sequence
    :param step: int or float increment (e.g. 1 or 0.001)
    :param fround: float rounding, n decimal places
    :return:
    """
    try:
        i = 0
        while stop >= start and step > 0:
            if i == 0:
                yield start
            elif start >= stop:
                yield stop
            elif start < stop:
                if start == 0:
                    yield 0
                if start != 0:
                    yield start
            i += 1
            start += step
            start = round(start, fround)
        else:
            pass
    except TypeError as e:
        yield "type-error({})".format(e)
    else:
        pass
# passing
print(list(rangef(-100.0,10.0,1)))
print(list(rangef(-100,0,0.5)))
print(list(rangef(-1,1,0.2)))
print(list(rangef(-1,1,0.1)))
print(list(rangef(-1,1,0.05)))
print(list(rangef(-1,1,0.02)))
print(list(rangef(-1,1,0.01)))
print(list(rangef(-1,1,0.005)))
# failing: type-error:
print(list(rangef("1","10","1")))
print(list(rangef(1,10,"1")))
Python 3.6.2 (v3.6.2:5fd33b5, Jul 8 2017, 04:57:36) [MSC v.1900 64 bit (AMD64)]
I know I'm late to the party here, but here's a trivial generator solution that's working in 3.6:
def floatRange(*args):
    start, step = 0, 1
    if len(args) == 1:
        stop = args[0]
    elif len(args) == 2:
        start, stop = args[0], args[1]
    elif len(args) == 3:
        start, stop, step = args[0], args[1], args[2]
    else:
        raise TypeError("floatRange accepts 1, 2, or 3 arguments. ({0} given)".format(len(args)))
    for num in start, step, stop:
        if not isinstance(num, (int, float)):
            raise TypeError("floatRange only accepts float and integer arguments. ({0} : {1} given)".format(type(num), str(num)))
    for x in range(int((stop-start)/step)):
        yield start + (x * step)
    return
then you can call it just like the original range()... there's no error handling, but let me know if there is an error that can be reasonably caught, and I'll update. Or you can update it; this is StackOverflow.
To counter the float precision issues, you could use the Decimal module.
This demands an extra effort of converting to Decimal from int or float while writing the code, but you can instead pass str and modify the function if that sort of convenience is indeed necessary.
from decimal import Decimal

def decimal_range(*args):
    zero, one = Decimal('0'), Decimal('1')
    if len(args) == 1:
        start, stop, step = zero, args[0], one
    elif len(args) == 2:
        start, stop, step = args + (one,)
    elif len(args) == 3:
        start, stop, step = args
    else:
        raise ValueError('Expected 1 to 3 arguments, got %s' % len(args))
    if not all([type(arg) == Decimal for arg in (start, stop, step)]):
        raise ValueError('Arguments must be passed as <type: Decimal>')
    # neglect bad cases
    if (start == stop) or (start > stop and step >= zero) or \
            (start < stop and step <= zero):
        return []
    current = start
    while abs(current) < abs(stop):
        yield current
        current += step
Sample outputs -
from decimal import Decimal as D
list(decimal_range(D('2')))
# [Decimal('0'), Decimal('1')]
list(decimal_range(D('2'), D('4.5')))
# [Decimal('2'), Decimal('3'), Decimal('4')]
list(decimal_range(D('2'), D('4.5'), D('0.5')))
# [Decimal('2'), Decimal('2.5'), Decimal('3.0'), Decimal('3.5'), Decimal('4.0')]
list(decimal_range(D('2'), D('4.5'), D('-0.5')))
# []
list(decimal_range(D('2'), D('-4.5'), D('-0.5')))
# [Decimal('2'),
# Decimal('1.5'),
# Decimal('1.0'),
# Decimal('0.5'),
# Decimal('0.0'),
# Decimal('-0.5'),
# Decimal('-1.0'),
# Decimal('-1.5'),
# Decimal('-2.0'),
# Decimal('-2.5'),
# Decimal('-3.0'),
# Decimal('-3.5'),
# Decimal('-4.0')]
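For instance, a hypothetical convenience wrapper accepting plain strings could be layered on top (my sketch, not part of the original answer):
def decimal_range_str(*args):
    # convert '0.1'-style string arguments to Decimal before delegating
    return decimal_range(*(Decimal(arg) for arg in args))

list(decimal_range_str('0', '0.3', '0.1'))
# [Decimal('0'), Decimal('0.1'), Decimal('0.2')]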
Add auto-correction for the possibility of an incorrect sign on step:
def frange(start, step, stop):
    step *= 2*((stop > start) ^ (step < 0)) - 1
    return [start + i*step for i in range(int((stop - start)/step))]
My solution:
def seq(start, stop, step=1, digit=0):
    x = float(start)
    v = []
    while x <= stop:
        v.append(round(x, digit))
        x += step
    return v
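Example (mine): rounding each value as it is appended keeps the accumulated float error out of the output:
print(seq(0, 1, 0.1, 1))
# [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]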
Here is my solution which works fine with float_range(-1, 0, 0.01) and works without floating point representation errors. It is not very fast, but works fine:
from decimal import Decimal

def get_multiplier(_from, _to, step):
    digits = []
    for number in [_from, _to, step]:
        pre = Decimal(str(number)) % 1
        digit = len(str(pre)) - 2
        digits.append(digit)
    max_digits = max(digits)
    return float(10 ** (max_digits))

def float_range(_from, _to, step, include=False):
    """Generates a range list of floating point values over the range [start, stop]
    with step size step
    include=True - allows to include the right value too if possible
    !! Works fine with floating point representation !!
    """
    mult = get_multiplier(_from, _to, step)
    # print mult
    int_from = int(round(_from * mult))
    int_to = int(round(_to * mult))
    int_step = int(round(step * mult))
    # print int_from, int_to, int_step
    if include:
        result = range(int_from, int_to + int_step, int_step)
        result = [r for r in result if r <= int_to]
    else:
        result = range(int_from, int_to, int_step)
    # print result
    float_result = [r / mult for r in result]
    return float_result

print float_range(-1, 0, 0.01, include=False)
assert float_range(1.01, 2.06, 5.05 % 1, True) ==\
[1.01, 1.06, 1.11, 1.16, 1.21, 1.26, 1.31, 1.36, 1.41, 1.46, 1.51, 1.56, 1.61, 1.66, 1.71, 1.76, 1.81, 1.86, 1.91, 1.96, 2.01, 2.06]
assert float_range(1.01, 2.06, 5.05 % 1, False)==\
[1.01, 1.06, 1.11, 1.16, 1.21, 1.26, 1.31, 1.36, 1.41, 1.46, 1.51, 1.56, 1.61, 1.66, 1.71, 1.76, 1.81, 1.86, 1.91, 1.96, 2.01]
I am only a beginner, but I had the same problem when simulating some calculations. Here is how I attempted to work this out, which seems to be working with decimal steps.
I am also quite lazy and so I found it hard to write my own range function.
Basically what I did is changed my xrange(0.0, 1.0, 0.01) to xrange(0, 100, 1) and used the division by 100.0 inside the loop.
I was also concerned whether there would be rounding mistakes, so I decided to test for them. Now I heard that if, for example, 0.01 from a calculation isn't exactly the float 0.01, comparing them should return False (if I am wrong, please let me know).
So I decided to test if my solution will work for my range by running a short test:
for d100 in xrange(0, 100, 1):
d = d100 / 100.0
fl = float("0.00"[:4 - len(str(d100))] + str(d100))
print d, "=", fl , d == fl
And it printed True for each.
Now, if I'm getting it totally wrong, please let me know.
The trick to avoid the round-off problem is to use a separate number to move through the range, one that starts half a step ahead of start.
# floating point range
def frange(a, b, stp=1.0):
    i = a + stp/2.0
    while i < b:
        yield a
        a += stp
        i += stp
Alternatively, numpy.arange can be used.
My answer is similar to others using map(), without need of NumPy, and without using lambda (though you could). To get a list of float values from 0.0 to t_max in steps of dt:
def xdt(n):
    return dt*float(n)

tlist = map(xdt, range(int(t_max/dt)+1))

Python - matrix multiplication code problem

I have this exercise where I get to build a simple neural network with one input layer and one hidden layer... I made the code below to perform a simple matrix multiplication, but it's not producing the same result as when I do the multiplication by hand. What am I doing wrong in my code?
#          toes  %win  #fans
ih_wgt = ([0.1, 0.2, -0.1],  # hid[0]
          [-0.1, 0.1, 0.9],  # hid[1]
          [0.1, 0.4, 0.1])   # hid[2]

#          hid[0] hid[1] hid[2]
ho_wgt = ([0.3, 1.1, -0.3],  # hurt?
          [0.1, 0.2, 0.0],   # win?
          [0.0, 1.3, 0.1])   # sad?

weights = [ih_wgt, ho_wgt]
def w_sum(a, b):
    assert(len(a) == len(b))
    output = 0
    for i in range(len(a)):
        output += (a[i] * b[i])
    return output

def vect_mat_mul(vec, mat):
    assert(len(vec) == len(mat))
    output = [0, 0, 0]
    for i in range(len(vec)):
        output[i] = w_sum(vec, mat[i])
        return output

def neural_network(input, weights):
    hid = vect_mat_mul(input, weights[0])
    pred = vect_mat_mul(hid, weights[1])
    return pred
toes = [8.5, 9.5, 9.9, 9.0]
wlrec = [0.65, 0.8, 0.8, 0.9]
nfans = [1.2, 1.3, 0.5, 1.0]
input = [toes[0],wlrec[0],nfans[0]]
pred = neural_network(input, weights)
print(pred)
the output of my code is:
[0.258, 0, 0]
The way I attempted to solve it by hand is as follows:
I multiplied the input vector [8.5, 0.65, 1.2] with the input weight matrix
ih_wgt = ([0.1, 0.2, -0.1],  # hid[0]
          [-0.1, 0.1, 0.9],  # hid[1]
          [0.1, 0.4, 0.1])   # hid[2]
which gives:
[0.86, 0.295, 1.23]
This output vector is then fed into the network as an input vector, which is then multiplied by the hidden weight matrix
ho_wgt = ([0.3, 1.1, -0.3],  # hurt?
          [0.1, 0.2, 0.0],   # win?
          [0.0, 1.3, 0.1])   # sad?
the correct output prediction:
[0.2135, 0.145, 0.5065]
Your help would be much appreciated!
You're almost there! Only a simple indentation thing is the reason:
def vect_mat_mul(vec, mat):
    assert(len(vec) == len(mat))
    output = [0, 0, 0]
    for i in range(len(vec)):
        output[i] = w_sum(vec, mat[i])
    return output  # <-- This one was inside the for loop
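As a cross-check (my addition, not part of the fix), the same two-layer computation can be written as a pair of matrix-vector products in NumPy, reusing the names from the question:
import numpy as np

ih = np.array(ih_wgt)           # 3x3 input-to-hidden weights
ho = np.array(ho_wgt)           # 3x3 hidden-to-output weights
x = np.array([8.5, 0.65, 1.2])  # input vector

hid = ih @ x    # [0.86, 0.295, 1.23]
pred = ho @ hid # [0.2135, 0.145, 0.5065]
print(pred)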

"Unsorting" a Quicksort

(Quick note! While I know there are plenty of options for sorting in Python, this code is more of a generalized proof-of-concept and will later be ported to another language, so I won't be able to use any specific Python libraries or functions.
In addition, the solution you provide doesn't necessarily have to follow my approach below.)
Background
I have a quicksort algorithm and am trying to implement a method to allow later 'unsorting' of the new location of a sorted element. That is, if element A is at index x and is sorted to index y, then the 'pointer' (or, depending on your terminology, reference or mapping) array changes its value at index x from x to y.
In more detail:
You begin the program with an array, arr, with some given set of numbers. This array is later run through a quick sort algorithm, as sorting the array is important for future processing on it.
The ordering of this array is important. As such, you have another array, ref, which contains the indices of the original array such that when you map the reference array to the array, the original ordering of the array is reproduced.
Before the array is sorted, the array and mapping looks like this:
arr = [1.2, 1.5, 1.5, 1.0, 1.1, 1.8]
ref = [0, 1, 2, 3, 4, 5]
--------
map(arr,ref) -> [1.2, 1.5, 1.5, 1.0, 1.1, 1.8]
You can see that index 0 of ref points to index 0 of arr, giving you 1.2. Index 1 of ref points to index 1 of arr, giving you 1.5, and so on.
When the algorithm is sorted, ref should be rearranged such that when you map it according to the above procedure, it generates the pre-sorted arr:
arr = [1.0, 1.1, 1.2, 1.5, 1.5, 1.8]
ref = [2, 3, 4, 0, 1, 5]
--------
map(arr,ref) -> [1.2, 1.5, 1.5, 1.0, 1.1, 1.8]
Again, index 0 of ref is 2, so the first element of the mapped array is arr[2]=1.2. Index 1 of ref is 3, so the second element of the mapped array is arr[3]=1.5, and so on.
The Issue
The current implementation of my code works great for sorting, but horrible for the remapping of ref.
Given the same array arr, the output of my program looks like this:
arr = [1.0, 1.1, 1.2, 1.5, 1.5, 1.8]
ref = [3, 4, 0, 1, 2, 5]
--------
map(arr,ref) -> [1.5, 1.5, 1.0, 1.1, 1.2, 1.8]
This is a problem because this mapping is definitely not equal to the original:
[1.5, 1.5, 1.0, 1.1, 1.2, 1.8] != [1.2, 1.5, 1.5, 1.0, 1.1, 1.8]
My approach has been this:
When elements a and b, at indices x and y in arr are switched,
Then set ref[x] = y and ref[y] = x.
This is not working and I can't think of another solution that doesn't need O(n^2) time.
Thank you!
Minimally Reproducible Example
testing = [1.5, 1.2, 1.0, 1.0, 1.2, 1.2, 1.5, 1.3, 2.0, 0.7, 0.2, 1.4, 1.2, 1.8, 2.0, 2.1]

# This is the 'map(arr,ref) ->' function
def print_links(a, b):
    tt = [a[b[i]-1] for i in range(0, len(a))]
    print("map(arr,ref) -> {}".format(tt))

    # This tests the re-mapping against an original copy of the array
    f = 0
    for i in range(0, len(testing)):
        if testing[i] == tt[i]:
            f += 1
    print("{}/{}".format(f, len(a)))
def quick_sort(arr, ref, first=None, last=None):
    if first == None:
        first = 0
    if last == None:
        last = len(arr)-1
    if first < last:
        split = partition(arr, ref, first, last)
        quick_sort(arr, ref, first, split-1)
        quick_sort(arr, ref, split+1, last)

def partition(arr, ref, first, last):
    pivot = arr[first]
    left = first+1
    right = last
    done = False
    while not done:
        while left <= right and arr[left] <= pivot:
            left += 1
        while arr[right] >= pivot and right >= left:
            right -= 1
        if right < left:
            done = True
        else:
            temp = arr[left]
            arr[left] = arr[right]
            arr[right] = temp
            # This is my attempt at preserving indices part 1
            temp = ref[left]
            ref[left] = ref[right]
            ref[right] = temp
    temp = arr[first]
    arr[first] = arr[right]
    arr[right] = temp
    # This is my attempt at preserving indices part 2
    temp = ref[first]
    ref[first] = ref[right]
    ref[right] = temp
    return right
# Main body of code
a = [1.5,1.2,1.0,1.0,1.2,1.2,1.5,1.3,2.0,0.7,0.2,1.4,1.2,1.8,2.0,2.1]
b = range(1,len(a)+1)
print("The following should match:")
print("a = {}".format(a))
a0 = a[:]
print("ref = {}".format(b))
print("----")
print_links(a,b)
print("\nQuicksort:")
quick_sort(a,b)
print(a)
print("\nThe following should match:")
print("arr = {}".format(a0))
print("ref = {}".format(b))
print("----")
print_links(a,b)
You can do what you ask, but when we have to do something like this in real life, we usually mess with the sort's comparison function instead of the swap function. Sorting routines provided with common languages usually have that capability built in so you don't have to write your own sort.
In this procedure, you sort the ref array (called order below) by the value of the arr element it points to. This generates the same ref array you already have, but without modifying arr.
Mapping with this ordering sorts the original array. You expected it to unsort the sorted array, which is why your code isn't working.
You can invert this ordering to get the ref array you were originally looking for, or you can just leave arr unsorted and map it through order when you need it ordered.
arr = [1.5, 1.2, 1.0, 1.0, 1.2, 1.2, 1.5, 1.3, 2.0, 0.7, 0.2, 1.4, 1.2, 1.8, 2.0, 2.1]
order = range(len(arr))
order.sort(key=lambda i:arr[i])
new_arr = [arr[order[i]] for i in range(len(arr))]
print("original array = {}".format(arr))
print("sorted ordering = {}".format(order))
print("sorted array = {}".format(new_arr))
ref = [0]*len(order)
for i in range(len(order)):
    ref[order[i]] = i
unsorted = [new_arr[ref[i]] for i in range(len(ref))]
print("unsorted after sorting = {}".format(unsorted))
Output:
original array = [1.5, 1.2, 1.0, 1.0, 1.2, 1.2, 1.5, 1.3, 2.0, 0.7, 0.2, 1.4, 1.2, 1.8, 2.0, 2.1]
sorted ordering = [10, 9, 2, 3, 1, 4, 5, 12, 7, 11, 0, 6, 13, 8, 14, 15]
sorted array = [0.2, 0.7, 1.0, 1.0, 1.2, 1.2, 1.2, 1.2, 1.3, 1.4, 1.5, 1.5, 1.8, 2.0, 2.0, 2.1]
unsorted after sorting = [1.5, 1.2, 1.0, 1.0, 1.2, 1.2, 1.5, 1.3, 2.0, 0.7, 0.2, 1.4, 1.2, 1.8, 2.0, 2.1]
You don't need to maintain a map of indices and elements; just sort the indices as you sort your array. For example:
unsortedArray = [1.5, 1.2, 2.1]
unsortedIndexes = [0, 1, 2]
sortedArray = [1.2, 1.5, 2.1]
Then you just swap 0 and 1 as you sort unsortedArray, getting sortedIndexes = [1, 0, 2], and you can get the original array back as sortedArray[1], sortedArray[0], sortedArray[2].
def inplace_quick_sort(s, indexes, start, end):
    if start >= end:
        return
    pivot = getPivot(s, start, end)  # it should be a func
    left = start
    right = end - 1
    while left <= right:
        while left <= right and customCmp(pivot, s[left]):
            # s[left] < pivot:
            left += 1
        while left <= right and customCmp(s[right], pivot):
            # pivot < s[right]:
            right -= 1
        if left <= right:
            s[left], s[right] = s[right], s[left]
            indexes[left], indexes[right] = indexes[right], indexes[left]
            left, right = left + 1, right - 1
    s[left], s[end] = s[end], s[left]
    indexes[left], indexes[end] = indexes[end], indexes[left]
    inplace_quick_sort(s, indexes, start, left-1)
    inplace_quick_sort(s, indexes, left+1, end)

def customCmp(a, b):
    return a > b

def getPivot(s, start, end):
    return s[end]
if __name__ == '__main__':
    arr = [1.5, 1.2, 1.0, 1.0, 1.2, 1.2, 1.5, 1.3, 2.0, 0.7, 0.2, 1.4, 1.2, 1.8, 2.0, 2.1]
    indexes = [i for i in range(len(arr))]
    inplace_quick_sort(arr, indexes, 0, len(arr)-1)
    print("sorted = {}".format(arr))
    ref = [0]*len(indexes)
    for i in range(len(indexes)):
        # the core point of Matt Timmermans' answer about how to construct the ref:
        # the value of indexes[i] is an index into the original array,
        # and i is the index in the sorted array,
        # so we get the map by ref[indexes[i]] = i
        ref[indexes[i]] = i
    unsorted = [arr[ref[i]] for i in range(len(ref))]
    print("unsorted after sorting = {}".format(unsorted))
It's not that horrible: you've merely reversed your reference usage. Your indices, ref, tell you how to build the sorted list from the original. However, you've used it in the opposite direction: you've applied it to the sorted list, trying to reconstruct the original. You need the inverse mapping.
Is that enough to get you to solve your problem?
I think you can just repair your ref array after the fact. From your code sample, just insert the following snippet after the call to quick_sort(a,b):
c = range(1, len(b)+1)
for i in range(0, len(b)):
    c[ b[i]-1 ] = i+1
The c array should now contain the correct references.
Stealing/rewording what @Prune writes: what you have in b is the forward transformation, the sorting itself. Applying it to a0 provides the sorted list (print_links(a0,b)).
You just have to revert it via looking up which element went to what position:
c = [b.index(i)+1 for i in range(1, len(a)+1)]
print_links(a, c)

Normalization VS. numpy way to normalize?

I'm supposed to normalize an array. I've read about normalization and come across this formula (min-max scaling):
(x - min(x)) / (max(x) - min(x))
I wrote the following function for it:
def normalize_list(list):
    max_value = max(list)
    min_value = min(list)
    for i in range(0, len(list)):
        list[i] = (list[i] - min_value) / (max_value - min_value)
That is supposed to normalize an array of elements.
Then I have come across this: https://stackoverflow.com/a/21031303/6209399
Which says you can normalize an array by simply doing this:
def normalize_list_numpy(list):
    normalized_list = list / np.linalg.norm(list)
    return normalized_list
If I normalize this test array test_array = [1, 2, 3, 4, 5, 6, 7, 8, 9] with my own function and with the numpy method, I get these answers:
My own function: [0.0, 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875, 1.0]
The numpy way: [0.059234887775909233, 0.11846977555181847, 0.17770466332772769, 0.23693955110363693, 0.29617443887954614, 0.35540932665545538, 0.41464421443136462, 0.47387910220727386, 0.5331139899831830]
Why do the functions give different answers? Are there other ways to normalize an array of data? What does numpy.linalg.norm(list) do? What am I getting wrong?
There are different types of normalization. You are using min-max normalization. The min-max normalization from scikit learn is as follows.
import numpy as np
from sklearn.preprocessing import minmax_scale

# your function
def normalize_list(list_normal):
    max_value = max(list_normal)
    min_value = min(list_normal)
    for i in range(len(list_normal)):
        list_normal[i] = (list_normal[i] - min_value) / (max_value - min_value)
    return list_normal

# scikit-learn version
def normalize_list_numpy(list_numpy):
    normalized_list = minmax_scale(list_numpy)
    return normalized_list

test_array = [1, 2, 3, 4, 5, 6, 7, 8, 9]
test_array_numpy = np.array(test_array)
print(normalize_list(test_array))
print(normalize_list_numpy(test_array_numpy))
Output:
[0.0, 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875, 1.0]
[0.0, 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875, 1.0]
MinMaxScaler uses exactly your formula for normalization/scaling:
http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.minmax_scale.html
@OuuGiii: NOTE: It is not a good idea to use Python built-in function names as variable names. list() is a Python built-in function, so its use as a variable name should be avoided.
The question/answer that you reference doesn't explicitly relate your own formula to the np.linalg.norm(list) version that you use here.
One NumPy solution would be this:
import numpy as np

def normalize(x):
    x = np.asarray(x)
    return (x - x.min()) / (np.ptp(x))

print(normalize(test_array))
# [ 0.     0.125  0.25   0.375  0.5    0.625  0.75   0.875  1.   ]
Here np.ptp is peak-to-peak, i.e. the
Range of values (maximum - minimum) along an axis.
This approach scales the values to the interval [0, 1] as pointed out by @phg.
The more traditional definition of normalization would be to scale to a 0 mean and unit variance:
x = np.asarray(test_array)
res = (x - x.mean()) / x.std()
print(res.mean(), res.std())
# 0.0 1.0
Or use sklearn.preprocessing.normalize as a pre-canned function.
Using test_array / np.linalg.norm(test_array) creates a result that is of unit length; you'll see that np.linalg.norm(test_array / np.linalg.norm(test_array)) equals 1. So you're talking about two different fields here, one being statistics and the other being linear algebra.
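A quick sketch (mine) to verify that unit-length claim:
import numpy as np

v = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=float)
u = v / np.linalg.norm(v)  # rescale to unit length
print(np.linalg.norm(u))   # 1.0 (up to floating-point rounding)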
The power of NumPy is its broadcasting property, which allows you to do vectorized array operations without explicit looping. So you do not need to write a function with an explicit for loop, which is slow and time-consuming, especially if your dataset is big.
The pythonic way of doing min-max normalization is
test_array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
normalized_test_array = (test_array - min(test_array)) / (max(test_array) - min(test_array))
output >> [ 0., 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875, 1. ]
