Cast each digit of a string to int - python

So I have a string, say "1234567", and my desired end result is a list of the form [1, 2, 3, 4, 5, 6, 7].
What I'm currently doing is this:
[int(x) for x in "1234567"]
What I'm wondering is if there is a better or more Pythonic way to do this? Possibly using built-ins or standard library functions.

You can use the map function:
map(int, "1234567")
or, for this particular string only, range happens to produce the same list (note that range does not convert the digits at all; it just generates the integers 1 through 7):
range(1, 8)
In Python 2, both return a plain list:
>>> map(int, "1234567")
[1, 2, 3, 4, 5, 6, 7]
>>> range(1,8)
[1, 2, 3, 4, 5, 6, 7]
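Note that in Python 3, map (and range) return lazy objects rather than lists, so wrap the call in list() to get the list you want:
>>> list(map(int, "1234567"))
[1, 2, 3, 4, 5, 6, 7]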

One way is to use map:
map(int, "1234567")

There isn't any 'more Pythonic' way to do it. AFAIK, whether you prefer map or a list comprehension is a matter of personal taste, though more people seem to prefer list comprehensions.
For what you are doing, though, if this is in performance-sensitive code, take a page from old assembly routines and use a lookup dict instead; it will be faster than int and not much more complicated:
In [1]: %timeit [int(x) for x in '1234567']
100000 loops, best of 3: 4.69 µs per loop
In [2]: %timeit map(int, '1234567')
100000 loops, best of 3: 4.38 µs per loop
# Create a lookup dict for each digit, instead of using the builtin 'int'
In [5]: idict = dict(('%d'%x, x) for x in range(10))
# And then, for each digit, just look up in the dict.
In [6]: %timeit [idict[x] for x in '1234567']
1000000 loops, best of 3: 1.21 µs per loop
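For reference, on Python 2.7+ the same lookup table can also be written as a dict comprehension (a purely stylistic variant, not from the original answer):
idict = {str(x): x for x in range(10)}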

Trying to update an array by changing its value in python [duplicate]

I have a list:
my_list = [1, 2, 3, 4, 5]
How can I multiply each element in my_list by 5? The output should be:
[5, 10, 15, 20, 25]
You can just use a list comprehension:
my_list = [1, 2, 3, 4, 5]
my_new_list = [i * 5 for i in my_list]
>>> print(my_new_list)
[5, 10, 15, 20, 25]
Note that a list comprehension is generally a more efficient way to write a for loop:
my_new_list = []
for i in my_list:
    my_new_list.append(i * 5)
>>> print(my_new_list)
[5, 10, 15, 20, 25]
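If you want to check the relative speed on your own machine, here is a minimal timeit sketch (the numbers vary with interpreter and hardware, so none are quoted here):
from timeit import timeit

my_list = [1, 2, 3, 4, 5]

def with_comprehension():
    return [i * 5 for i in my_list]

def with_loop():
    result = []
    for i in my_list:
        result.append(i * 5)
    return result

print(timeit(with_comprehension))  # list comprehension
print(timeit(with_loop))           # explicit loop with append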
As an alternative, here is a solution using the popular Pandas package:
import pandas as pd
s = pd.Series(my_list)
>>> s * 5
0     5
1    10
2    15
3    20
4    25
dtype: int64
Or, if you just want the list:
>>> (s * 5).tolist()
[5, 10, 15, 20, 25]
Finally, one could use map, although this is generally frowned upon.
my_new_list = map(lambda x: x * 5, my_list)
Using map, however, is generally less efficient. Per a comment from ShadowRanger on a deleted answer to this question:
The reason "no one" uses it is that, in general, it's a performance
pessimization. The only time it's worth considering map in CPython is
if you're using a built-in function implemented in C as the mapping
function; otherwise, map is going to run equal to or slower than the
more Pythonic listcomp or genexpr (which are also more explicit about
whether they're lazy generators or eager list creators; on Py3, your
code wouldn't work without wrapping the map call in list). If you're
using map with a lambda function, stop, you're doing it wrong.
And another one of his comments posted to this reply:
Please don't teach people to use map with lambda; the instant you
need a lambda, you'd have been better off with a list comprehension
or generator expression. If you're clever, you can make map work
without lambdas a lot, e.g. in this case, map((5).__mul__, my_list), although in this particular case, thanks to some
optimizations in the byte code interpreter for simple int math, [x * 5 for x in my_list] is faster, as well as being more Pythonic and simpler.
A blazingly faster approach is to do the multiplication in a vectorized manner instead of looping over the list. NumPy already provides a very simple and handy way for this:
>>> import numpy as np
>>>
>>> my_list = np.array([1, 2, 3, 4, 5])
>>>
>>> my_list * 5
array([ 5, 10, 15, 20, 25])
Note that this doesn't work with Python's native lists: multiplying a list by a number does not scale its items; it repeats the list that many times.
In [15]: my_list *= 1000
In [16]: len(my_list)
Out[16]: 5000
If you want a pure Python-based approach, using a list comprehension is basically the most Pythonic way to go.
In [6]: my_list = [1, 2, 3, 4, 5]
In [7]: [5 * i for i in my_list]
Out[7]: [5, 10, 15, 20, 25]
Besides the list comprehension, as a pure functional approach, you can also use the built-in map() function as follows:
In [10]: list(map((5).__mul__, my_list))
Out[10]: [5, 10, 15, 20, 25]
This code passes all the items of my_list to 5's __mul__ method and returns an iterator-like object (in Python 3.x). You can then convert the iterator to a list using the built-in list() function (in Python 2.x you don't need that, because map returns a list by default).
Benchmarks:
In [18]: %timeit [5 * i for i in my_list]
463 ns ± 10.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [19]: %timeit list(map((5).__mul__, my_list))
784 ns ± 10.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [20]: %timeit [5 * i for i in my_list * 100000]
20.8 ms ± 115 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [21]: %timeit list(map((5).__mul__, my_list * 100000))
30.6 ms ± 169 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [24]: arr = np.array(my_list * 100000)
In [25]: %timeit arr * 5
899 µs ± 4.98 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
You can do it in-place like so:
l = [1, 2, 3, 4, 5]
l[:] = [x * 5 for x in l]
This requires no additional imports and is very Pythonic.
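The difference from plain reassignment is that l[:] = ... mutates the existing list object in place, so any other references to the same list see the update. A quick demonstration:
l = [1, 2, 3, 4, 5]
alias = l                    # a second reference to the same list object
l[:] = [x * 5 for x in l]
print(alias)                 # [5, 10, 15, 20, 25] -- the alias sees the change
l = [x * 2 for x in l]
print(alias)                 # still [5, 10, 15, 20, 25]: plain assignment rebinds l to a new list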
Since I think you are new to Python, let's do it the long way: iterate through your list using a for loop, multiply each element, and append it to a new list.
Using a for loop:
lst = [5, 20, 15]
product = []
for i in lst:
    product.append(i * 5)
print(product)
Using a list comprehension; this is the same as the for loop above, but more 'Pythonic':
lst = [5, 20, 15]
prod = [i * 5 for i in lst]
print(prod)
With map (not as good, but another approach to the problem):
list(map(lambda x: x*5,[5, 10, 15, 20, 25]))
Also, if you happen to be using NumPy or NumPy arrays, you could use this:
import numpy as np
list(np.array(my_list) * 5)
from functools import partial as p
from operator import mul
map(p(mul,5),my_list)
is one way you could do it ... your teacher probably knows a much less complicated way that was probably covered in class
Multiplying each element in my_list by k:
k = 5
my_list = [1,2,3,4]
result = list(map(lambda x: x * k, my_list))
resulting in: [5, 10, 15, 20]
I found it interesting to use a list comprehension or map with just one object name, x.
Note that whenever x is reassigned, its id(x) changes, i.e. it points to a different object.
x = [1, 2, 3]
id(x)
2707834975552
x = [1.5 * x for x in x]
id(x)
2707834976576
x
[1.5, 3.0, 4.5]
list(map(lambda x : 2 * x / 3, x))
[1.0, 2.0, 3.0]
id(x) # not reassigned
2707834976576
x = list(map(lambda x : 2 * x / 3, x))
x
[1.0, 2.0, 3.0]
id(x)
2707834980928
var1 = [2, 4, 6, 8, 10, 12]
# For an integer multiplier, use int.__mul__ with map:
var2 = list(map(int(2).__mul__, var1))
# For a float multiplier:
var2 = list(map(float(2.5).__mul__, var1))
The best way is to use a list comprehension:
def map_to_list(my_list, n):
    # Multiply every value in my_list by n using a list comprehension.
    my_new_list = [i * n for i in my_list]
    return my_new_list

# To test:
print(map_to_list([1, 2, 3], -1))
Returns:
[-1, -2, -3]

flatten list of lists and scalars [duplicate]

So for a matrix, we have methods like numpy.flatten()
np.array([[1,2,3],[4,5,6],[7,8,9]]).flatten()
gives [1,2,3,4,5,6,7,8,9]
what if I wanted to get from np.array([[1,2,3],[4,5,6],7]) to [1,2,3,4,5,6,7]?
Is there an existing function that performs something like that?
With uneven lists, the array has object dtype (and is 1d, so flatten doesn't change it):
In [96]: arr=np.array([[1,2,3],[4,5,6],7])
In [97]: arr
Out[97]: array([[1, 2, 3], [4, 5, 6], 7], dtype=object)
In [98]: arr.sum()
...
TypeError: can only concatenate list (not "int") to list
The 7 element is the one causing problems. If I change it to a list:
In [99]: arr=np.array([[1,2,3],[4,5,6],[7]])
In [100]: arr.sum()
Out[100]: [1, 2, 3, 4, 5, 6, 7]
I'm using a trick here: the elements of the array are lists, and for lists [1,2,3] + [4,5,6] is concatenation.
The basic point is that an object array is not a 2d array. It is, in many ways, more like a list of lists.
chain
The best list flattener is chain
In [104]: list(itertools.chain(*arr))
Out[104]: [1, 2, 3, 4, 5, 6, 7]
though it too will choke on the integer 7 version.
concatenate and hstack
If the array is a list of lists (not the original mix of lists and scalar) then np.concatenate works. It iterates on the object just as though it were a list.
With the mixed original list, concatenate does not work, but hstack does:
In [178]: arr=np.array([[1,2,3],[4,5,6],7])
In [179]: np.concatenate(arr)
...
ValueError: all the input arrays must have same number of dimensions
In [180]: np.hstack(arr)
Out[180]: array([1, 2, 3, 4, 5, 6, 7])
That's because hstack first iterates through the list and makes sure all elements are atleast_1d. This extra iteration makes it more robust, but at a cost in processing speed.
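Conceptually, that behaviour is roughly equivalent to promoting each element yourself and then concatenating (a sketch of the idea, not NumPy's actual source):
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], 7], dtype=object)
# np.atleast_1d turns the bare scalar 7 into array([7]),
# so np.concatenate can then join everything:
flat = np.concatenate([np.atleast_1d(e) for e in arr])
print(flat)  # [1 2 3 4 5 6 7]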
time tests
In [170]: big1=arr.repeat(1000)
In [171]: timeit big1.sum()
10 loops, best of 3: 31.6 ms per loop
In [172]: timeit list(itertools.chain(*big1))
1000 loops, best of 3: 433 µs per loop
In [173]: timeit np.concatenate(big1)
100 loops, best of 3: 5.05 ms per loop
double the size
In [174]: big1=arr.repeat(2000)
In [175]: timeit big1.sum()
10 loops, best of 3: 128 ms per loop
In [176]: timeit list(itertools.chain(*big1))
1000 loops, best of 3: 803 µs per loop
In [177]: timeit np.concatenate(big1)
100 loops, best of 3: 9.93 ms per loop
In [182]: timeit np.hstack(big1) # the extra iteration hurts hstack speed
10 loops, best of 3: 43.1 ms per loop
The sum is quadratic in size; it is effectively doing:
res = []
for e in bigarr:
    res += e
res grows with each e, so every iteration step becomes more expensive.
chain times best overall.
You can write a custom flatten function using yield:
def flatten(arr):
    for i in arr:
        try:
            yield from flatten(i)
        except TypeError:
            yield i
Usage example:
>>> myarr = np.array([[1,2,3],[4,5,6],7])
>>> newarr = list(flatten(myarr))
>>> newarr
[1, 2, 3, 4, 5, 6, 7]
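One caveat worth noting (an addition, not part of the original answer): strings are iterable, and iterating a one-character string yields that same string, so the generator above recurses forever on string elements. A guarded variant:
def flatten(arr):
    for i in arr:
        # Treat strings as scalars: recursing into them never terminates.
        if isinstance(i, str):
            yield i
        else:
            try:
                yield from flatten(i)
            except TypeError:
                yield i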
You can use apply_along_axis here
>>> arr = np.array([[1,2,3],[4,5,6],[7]])
>>> np.apply_along_axis(np.concatenate, 0, arr)
array([1, 2, 3, 4, 5, 6, 7])
As a bonus, this is not quadratic in the number of lists either.

Fastest way to mix arrays in numpy?

a= array([1,3,5,7,9])
b= array([2,4,6,8,10])
I want to interleave pairs of arrays so that their elements alternate one by one.
Example: using a and b, it should result into
c= array([1,2,3,4,5,6,7,8,9,10])
I need to do that with pairs of long arrays (more than one hundred elements) on thousands of sequences. Any smarter ideas than picking element by element from each array?
thanks
c = np.empty(len(a)+len(b), dtype=a.dtype)
c[::2] = a
c[1::2] = b
(That assumes a and b have the same dtype.)
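If a and b may have different dtypes, one option is to let NumPy pick a common type with np.result_type (a small hedge on top of the answer above):
import numpy as np

a = np.array([1, 3, 5, 7, 9])
b = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

c = np.empty(len(a) + len(b), dtype=np.result_type(a, b))
c[::2] = a   # even positions take the elements of a
c[1::2] = b  # odd positions take the elements of b
print(c)     # [ 1.  2.  3.  4.  5.  6.  7.  8.  9. 10.]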
You asked for the fastest, so here's a timing comparison (vstack, ravel and empty are all numpy functions):
In [40]: a = np.random.randint(0, 10, size=150)
In [41]: b = np.random.randint(0, 10, size=150)
In [42]: %timeit vstack((a,b)).T.flatten()
100000 loops, best of 3: 5.6 µs per loop
In [43]: %timeit ravel([a, b], order='F')
100000 loops, best of 3: 3.1 µs per loop
In [44]: %timeit c = empty(len(a)+len(b), dtype=a.dtype); c[::2] = a; c[1::2] = b
1000000 loops, best of 3: 1.94 µs per loop
With vstack((a,b)).T.flatten(), a and b are copied to create vstack((a,b)), and then the data is copied again by the flatten() method.
ravel([a, b], order='F') is implemented as asarray([a, b]).ravel(order), which requires copying a and b, and then copying the result to create an array with order='F'. (If you do just ravel([a, b]), it is about the same speed as my answer, because it doesn't have to copy the data again. Unfortunately, order='F' is needed to get the alternating pattern.)
So the other two methods copy the data twice. In my version, each array is copied once.
This'll do it:
vstack((a,b)).T.flatten()
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
Using numpy.ravel:
>>> np.ravel([a, b], order='F')
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

Select elements row-wise based on single array

Say I have an array d of size (N,T), out of which I need to select elements using an index array of shape (N,), where the first element corresponds to the index into the first row, etc. How would I do that?
For example
>>> d
Out[748]:
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]])
>>> index
Out[752]: array([5, 6, 1], dtype=int64)
Expected Output:
array([[5],
       [6],
       [2]])
That is, an array containing the element at index 5 of the first row, index 6 of the second row, and index 1 of the third row.
Update
Since I will have a much larger N, I was interested in the speed of the different methods for higher N. With N = 30000:
>>> %timeit np.diag(e.take(index2, axis=1)).reshape(N*3, 1)
1 loops, best of 3: 3.9 s per loop
>>> %timeit e.ravel()[np.arange(e.shape[0])*e.shape[1]+index2].reshape(N*3, 1)
1000 loops, best of 3: 287 µs per loop
Finally, you suggest reshape(). As I want to keep it as general as possible (without hard-coding N), I instead use [:,np.newaxis] - it seems to increase the duration from 287 µs to 288 µs, which I'll take :)
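Spelled out, the shape-generic version the update describes would read:
result = d[np.arange(d.shape[0]), index][:, np.newaxis]  # shape (N, 1), no hard-coded N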
This might be ugly, but it is more efficient:
>>> d.ravel()[np.arange(d.shape[0])*d.shape[1]+index]
array([5, 6, 2])
EDIT
As pointed out by @deinonychusaur, the statement above can be written as cleanly as:
d[np.arange(index.size),index]
There might be nicer ways, but a combo of take, diag and reshape would do:
In [137]: np.diag(d.take(index, axis=1)).reshape(3, 1)
Out[137]:
array([[5],
[6],
[2]])
EDIT
Comparisons with @Emanuele Paolini's alternative, adding reshape to it to match the sought output:
In [142]: %timeit d.reshape(d.size)[np.arange(d.shape[0])*d.shape[1]+index].reshape(3, 1)
100000 loops, best of 3: 9.51 µs per loop
In [143]: %timeit np.diag(d.take(index, axis=1)).reshape(3, 1)
100000 loops, best of 3: 3.81 µs per loop
In [146]: %timeit d.ravel()[np.arange(d.shape[0])*d.shape[1]+index].reshape(3, 1)
100000 loops, best of 3: 8.56 µs per loop
This method is about twice as fast as both proposed alternatives.
EDIT 2: An even better method
Based on @Emanuele Paolini's version, but with a reduced number of operations, this outperforms all of the above on large arrays (10k rows by 100 columns):
In [199]: %timeit d[(np.arange(index.size), index)].reshape(index.size, 1)
1000 loops, best of 3: 364 µs per loop
In [200]: %timeit d.ravel()[np.arange(d.shape[0])*d.shape[1]+index].reshape(index.size, 1)
100 loops, best of 3: 5.22 ms per loop
So if speed is of the essence:
d[(np.arange(index.size), index)].reshape(index.size, 1)

How to find the cumulative sum of numbers in a list?

time_interval = [4, 6, 12]
I want to sum up the numbers like [4, 4+6, 4+6+12] in order to get the list t = [4, 10, 22].
I tried the following:
t1 = time_interval[0]
t2 = time_interval[1] + t1
t3 = time_interval[2] + t2
print(t1, t2, t3) # -> 4 10 22
If you're doing much numerical work with arrays like this, I'd suggest numpy, which comes with a cumulative sum function cumsum:
import numpy as np
a = [4,6,12]
np.cumsum(a)
#array([4, 10, 22])
NumPy is often faster than pure Python for this kind of thing; see a comparison with @Ashwini's accumu below:
In [136]: timeit list(accumu(range(1000)))
10000 loops, best of 3: 161 us per loop
In [137]: timeit list(accumu(xrange(1000)))
10000 loops, best of 3: 147 us per loop
In [138]: timeit np.cumsum(np.arange(1000))
100000 loops, best of 3: 10.1 us per loop
But of course if it's the only place you'll use numpy, it might not be worth having a dependence on it.
In Python 2 you can define your own generator function like this:
def accumu(lis):
    total = 0
    for x in lis:
        total += x
        yield total
In [4]: list(accumu([4,6,12]))
Out[4]: [4, 10, 22]
And in Python 3.2+ you can use itertools.accumulate():
In [1]: lis = [4,6,12]
In [2]: from itertools import accumulate
In [3]: list(accumulate(lis))
Out[3]: [4, 10, 22]
I did a benchmark of the top two answers with Python 3.4 and found that itertools.accumulate is faster than numpy.cumsum under many circumstances, often much faster. However, as you can see from the comments, this may not always be the case, and it's difficult to exhaustively explore all options. (Feel free to add a comment or edit this post if you have further benchmark results of interest.)
Some timings...
For short lists accumulate is about 4 times faster:
from timeit import timeit
def sum1(l):
    from itertools import accumulate
    return list(accumulate(l))
def sum2(l):
    from numpy import cumsum
    return list(cumsum(l))
l = [1, 2, 3, 4, 5]
timeit(lambda: sum1(l), number=100000)
# 0.4243644131347537
timeit(lambda: sum2(l), number=100000)
# 1.7077815784141421
For longer lists accumulate is about 3 times faster:
l = [1, 2, 3, 4, 5]*1000
timeit(lambda: sum1(l), number=100000)
# 19.174508565105498
timeit(lambda: sum2(l), number=100000)
# 61.871223849244416
If the numpy array is not cast to a list, accumulate is still about 2 times faster:
from timeit import timeit
def sum1(l):
    from itertools import accumulate
    return list(accumulate(l))
def sum2(l):
    from numpy import cumsum
    return cumsum(l)
l = [1, 2, 3, 4, 5]*1000
print(timeit(lambda: sum1(l), number=100000))
# 19.18597290944308
print(timeit(lambda: sum2(l), number=100000))
# 37.759664884768426
If you put the imports outside of the two functions and still return a numpy array, accumulate is still nearly 2 times faster:
from timeit import timeit
from itertools import accumulate
from numpy import cumsum
def sum1(l):
    return list(accumulate(l))
def sum2(l):
    return cumsum(l)
l = [1, 2, 3, 4, 5]*1000
timeit(lambda: sum1(l), number=100000)
# 19.042188624851406
timeit(lambda: sum2(l), number=100000)
# 35.17324400227517
Try the itertools.accumulate() function.
import itertools
list(itertools.accumulate([1,2,3,4,5]))
# [1, 3, 6, 10, 15]
Behold:
from functools import reduce  # needed on Python 3; reduce is a builtin on Python 2
a = [4, 6, 12]
reduce(lambda c, x: c + [c[-1] + x], a, [0])[1:]
Will output (as expected):
[4, 10, 22]
Assignment expressions from PEP 572 (new in Python 3.8) offer yet another way to solve this:
time_interval = [4, 6, 12]
total_time = 0
cum_time = [total_time := total_time + t for t in time_interval]
You can calculate the cumulative sum list in linear time with a simple for loop:
def csum(lst):
    s = lst.copy()
    for i in range(1, len(s)):
        s[i] += s[i-1]
    return s
time_interval = [4, 6, 12]
print(csum(time_interval)) # [4, 10, 22]
The standard library's itertools.accumulate may be a faster alternative (since it's implemented in C):
from itertools import accumulate
time_interval = [4, 6, 12]
print(list(accumulate(time_interval))) # [4, 10, 22]
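On Python 3.8+, accumulate also accepts an initial keyword argument, which seeds the running total (and adds one extra element at the front):
from itertools import accumulate
print(list(accumulate([4, 6, 12], initial=0)))  # [0, 4, 10, 22]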
Since Python 3.8 it's possible to use assignment expressions, so things like this become easier to implement:
nums = list(range(1, 10))
print(f'array: {nums}')
v = 0
cumsum = [v := v + n for n in nums]
print(f'cumsum: {cumsum}')
produces
array: [1, 2, 3, 4, 5, 6, 7, 8, 9]
cumsum: [1, 3, 6, 10, 15, 21, 28, 36, 45]
The same technique can be applied to find the cumulative product, mean, etc.
p = 1
cumprod = [p := p * n for n in nums]
print(f'cumprod: {cumprod}')
s = 0
c = 0
cumavg = [(s := s + n) / (c := c + 1) for n in nums]
print(f'cumavg: {cumavg}')
results in
cumprod: [1, 2, 6, 24, 120, 720, 5040, 40320, 362880]
cumavg: [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
First, you want a running list of subsequences:
subseqs = (seq[:i] for i in range(1, len(seq)+1))
Then you just call sum on each subsequence:
sums = [sum(subseq) for subseq in subseqs]
(This isn't the most efficient way to do it, because you're adding all of the prefixes repeatedly. But that probably won't matter for most use cases, and it's easier to understand if you don't have to think of the running totals.)
If you're using Python 3.2 or newer, you can use itertools.accumulate to do it for you:
sums = itertools.accumulate(seq)
And if you're using 3.1 or earlier, you can just copy the "equivalent to" source straight out of the docs (except for changing next(it) to it.next() for 2.5 and earlier).
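For reference, the "roughly equivalent" pure-Python version from the itertools documentation looks like this (paraphrased from the docs; the exact wording there may differ by Python version):
import operator

def accumulate(iterable, func=operator.add):
    # Pure-Python equivalent of itertools.accumulate, per the docs.
    it = iter(iterable)
    try:
        total = next(it)
    except StopIteration:
        return
    yield total
    for element in it:
        total = func(total, element)
        yield total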
If you want a Pythonic way without numpy that works in 2.7, this would be my way of doing it:
l = [1,2,3,4]
_d={-1:0}
cumsum=[_d.setdefault(idx, _d[idx-1]+item) for idx,item in enumerate(l)]
Now let's try it and test it against all the other implementations:
import functools, timeit, sys
L = list(range(10000))
if sys.version_info >= (3, 0):
    reduce = functools.reduce
    xrange = range
def sum1(l):
    cumsum = []
    total = 0
    for v in l:
        total += v
        cumsum.append(total)
    return cumsum
def sum2(l):
    import numpy as np
    return list(np.cumsum(l))
def sum3(l):
    return [sum(l[:i+1]) for i in xrange(len(l))]
def sum4(l):
    return reduce(lambda c, x: c + [c[-1] + x], l, [0])[1:]
def this_implementation(l):
    _d = {-1: 0}
    return [_d.setdefault(idx, _d[idx-1] + item) for idx, item in enumerate(l)]
# sanity check
sum1(L)==sum2(L)==sum3(L)==sum4(L)==this_implementation(L)
>>> True
# PERFORMANCE TEST
timeit.timeit('sum1(L)','from __main__ import sum1,sum2,sum3,sum4,this_implementation,L', number=100)/100.
>>> 0.001018061637878418
timeit.timeit('sum2(L)','from __main__ import sum1,sum2,sum3,sum4,this_implementation,L', number=100)/100.
>>> 0.000829620361328125
timeit.timeit('sum3(L)','from __main__ import sum1,sum2,sum3,sum4,this_implementation,L', number=100)/100.
>>> 0.4606760001182556
timeit.timeit('sum4(L)','from __main__ import sum1,sum2,sum3,sum4,this_implementation,L', number=100)/100.
>>> 0.18932826995849608
timeit.timeit('this_implementation(L)','from __main__ import sum1,sum2,sum3,sum4,this_implementation,L', number=100)/100.
>>> 0.002348129749298096
There could be many answers to this depending on the length of the list and the performance requirements. One very simple way, without worrying about performance, is this:
a = [1, 2, 3, 4]
a = [sum(a[0:x]) for x in range(1, len(a)+1)]
print(a)
[1, 3, 6, 10]
This uses a list comprehension and may work fairly well; it's just that I am summing over the subarray many times here. You could improve on this and make it simpler!
Cheers to your endeavor!
values = [4, 6, 12]
total = 0
sums = []
for v in values:
    total = total + v
    sums.append(total)
print 'Values: ', values
print 'Sums: ', sums
Running this code gives
Values: [4, 6, 12]
Sums: [4, 10, 22]
Try this:
result = []
acc = 0
for i in time_interval:
    acc += i
    result.append(acc)
l = [1, -1, 3]
cum_list = l
def sum_list(input_list):
    index = 1
    for i in input_list[1:]:
        cum_list[index] = i + input_list[index - 1]
        index = index + 1
    return cum_list
print(sum_list(l))
In Python 3, to find the cumulative sum of a list where the ith element
is the sum of the first i+1 elements from the original list, you may do:
a = [4, 6, 12]
b = []
for i in range(0, len(a)):
    b.append(sum(a[:i+1]))
print(b)
Or you may use a list comprehension:
b = [sum(a[:x+1]) for x in range(0, len(a))]
Output:
[4, 10, 22]
lst = [4, 6, 12]
[sum(lst[:i+1]) for i in xrange(len(lst))]
If you are looking for a more efficient solution (bigger lists?), a generator could be a good call (or just use numpy if you really care about performance).
def gen(lst):
    acu = 0
    for num in lst:
        yield num + acu
        acu += num
print list(gen([4, 6, 12]))
In [42]: a = [4, 6, 12]
In [43]: [sum(a[:i+1]) for i in xrange(len(a))]
Out[43]: [4, 10, 22]
This is slightly faster than the generator method above by @Ashwini for small lists:
In [48]: %timeit list(accumu([4,6,12]))
100000 loops, best of 3: 2.63 us per loop
In [49]: %timeit [sum(a[:i+1]) for i in xrange(len(a))]
100000 loops, best of 3: 2.46 us per loop
For larger lists, the generator is the way to go for sure...
In [50]: a = range(1000)
In [51]: %timeit [sum(a[:i+1]) for i in xrange(len(a))]
100 loops, best of 3: 6.04 ms per loop
In [52]: %timeit list(accumu(a))
10000 loops, best of 3: 162 us per loop
Somewhat hacky, but seems to work:
def cumulative_sum(l):
    y = [0]
    def inc(n):
        y[0] += n
        return y[0]
    return [inc(x) for x in l]
I did think that the inner function would be able to modify the y declared in the outer lexical scope, but that didn't work, so we play some nasty hacks with structure modification instead. It is probably more elegant to use a generator.
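In Python 3, the nonlocal statement makes the intended closure work directly, without the one-element-list hack (a cleaner variant of the same idea):
def cumulative_sum(l):
    total = 0
    def inc(n):
        nonlocal total  # rebind the enclosing variable instead of creating a local
        total += n
        return total
    return [inc(x) for x in l]

print(cumulative_sum([4, 6, 12]))  # [4, 10, 22]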
Without having to use NumPy, you can loop directly over the list and accumulate the sum along the way (Python 2, where range returns a list that supports item assignment). For example:
a = range(10)
i = 1
while (i > 0) & (i < 10):
    a[i] = a[i-1] + a[i]
    i = i + 1
print a
Results in:
[0, 1, 3, 6, 10, 15, 21, 28, 36, 45]
A pure Python one-liner for the cumulative sum:
cumsum = lambda X: X[:1] + cumsum([X[0]+X[1]] + X[2:]) if X[1:] else X
This is a recursive version inspired by recursive cumulative sums. Some explanations:
The first term, X[:1], is a list containing the first element and is almost the same as [X[0]] (except that it does not complain for empty lists).
The recursive cumsum call in the second term processes the current element X[1] and the remaining list, whose length is reduced by one on each call.
if X[1:] is shorthand for if len(X) > 1.
Test:
cumsum([4,6,12])
#[4, 10, 22]
cumsum([])
#[]
And similarly for the cumulative product:
cumprod = lambda X: X[:1] + cumprod([X[0]*X[1]] + X[2:]) if X[1:] else X
Test:
cumprod([4,6,12])
#[4, 24, 288]
Here's another fun solution. This takes advantage of the locals() dict of a comprehension, i.e. local variables generated inside the list comprehension scope:
>>> [locals().setdefault(i, (elem + locals().get(i-1, 0))) for i, elem
in enumerate(time_interval)]
[4, 10, 22]
Here's what locals() looks like on each iteration:
>>> [[locals().setdefault(i, (elem + locals().get(i-1, 0))), locals().copy()][1]
for i, elem in enumerate(time_interval)]
[{'.0': <enumerate at 0x21f21f7fc80>, 'i': 0, 'elem': 4, 0: 4},
{'.0': <enumerate at 0x21f21f7fc80>, 'i': 1, 'elem': 6, 0: 4, 1: 10},
{'.0': <enumerate at 0x21f21f7fc80>, 'i': 2, 'elem': 12, 0: 4, 1: 10, 2: 22}]
Performance is not terrible for small lists:
>>> %timeit list(accumulate([4, 6, 12]))
387 ns ± 7.53 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
>>> %timeit np.cumsum([4, 6, 12])
5.31 µs ± 67.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
>>> %timeit [locals().setdefault(i, (e + locals().get(i-1,0))) for i,e in enumerate(time_interval)]
1.57 µs ± 12 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
And it obviously falls flat for larger lists.
>>> l = list(range(1_000_000))
>>> %timeit list(accumulate(l))
95.1 ms ± 5.22 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
>>> %timeit np.cumsum(l)
79.3 ms ± 1.07 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
>>> %timeit np.cumsum(l).tolist()
120 ms ± 1.23 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
>>> %timeit [locals().setdefault(i, (e + locals().get(i-1, 0))) for i, e in enumerate(l)]
660 ms ± 5.14 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Even though the method is ugly and not practical, it sure is fun.
I think the code below is the easiest:
a = [1, 1, 2, 1, 2]
b = [a[0]] + [sum(a[0:i]) for i in range(2, len(a)+1)]
def cumulative_sum(lst):
    l = []
    for i in range(len(lst)):
        new_l = sum(lst[:i+1])
        l.append(new_l)
    return l

time_interval = [4, 6, 12]
print(cumulative_sum(time_interval))
Maybe a more beginner-friendly solution. You need to build a list of cumulative sums, which you can do with a for loop and the .append() method:
time_interval = [4, 6, 12]
cumulative_sum = []
new_sum = 0
for i in time_interval:
    new_sum += i
    cumulative_sum.append(new_sum)
print(cumulative_sum)
or, using the numpy module:
import numpy
time_interval = [4, 6, 12]
c_sum = numpy.cumsum(time_interval)
print(c_sum.tolist())
This would be Haskell-style:
def wrand(vtlg):
    def helpf(lalt, lneu):
        if not lalt == []:
            return helpf(lalt[1:], [lalt[0] + lneu[0]] + lneu)
        else:
            lneu.reverse()
            return lneu[1:]
    return helpf(vtlg, [0])
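Usage (the helper builds the running sums in reverse on top of a [0] seed, then flips them and drops the seed):
print(wrand([4, 6, 12]))  # [4, 10, 22]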
