loop through a list and use previous elements - python

I have a list that contains decimal numbers, however in this example I use ints:
my_list = [40, 60, 100, 240, ...]
I want to print the elements of the list in reverse order, then print a second line where every value is divided by 2, then a third line where each value of the previous line is divided by 3, and so on.
Output should be:
240 120 60 36
120 60 30 18 #previous number divided by 2
40 20 10 6 #previous number divided by 3
... ... ... ... #previous number divided by 4 ...
My solution is ugly: I can make a slice and reverse that list and make n for loops and append the result in a new list. But there must be a better way. How would you do that?

I'd write a generator to yield lists in turn:
def divider(lst, n):
    lst = [float(x) for x in lst[::-1]]
    for i in range(1, n+1):
        lst = [x/i for x in lst]
        yield lst
If we want to make it slightly more efficient, we could factor out the first iteration (division by 1) and yield it separately:
def divider(lst, n):
    lst = [float(x) for x in reversed(lst)]
    yield lst
    for i in range(2, n+1):
        lst = [x/i for x in lst]
        yield lst
*Note that in this context there isn't a whole lot of difference between lst[::-1] and reversed(lst). The former is typically a little faster, but the latter is a little more memory efficient. Choose according to your constraints.
Demo:
>>> def divider(lst,n):
...     lst = [float(x) for x in reversed(lst)]
...     yield lst
...     for i in range(2, n+1):
...         lst = [x/i for x in lst]
...         yield lst
...
>>> for lst in divider([40, 60, 100, 240], 3):
...     print(lst)
...
[240.0, 100.0, 60.0, 40.0]
[120.0, 50.0, 30.0, 20.0]
[40.0, 16.666666666666668, 10.0, 6.666666666666667]
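Because each row divides the previous one by i, row k is simply the reversed list divided by k! (1 * 2 * ... * k). A minimal sketch of that closed form, assuming Python 3; the name divider_closed_form is made up for this illustration and is not part of the answer above:

```python
from math import factorial

def divider_closed_form(lst, n):
    """Yield rows 1..n, where row k is the reversed list divided by k!."""
    base = [float(x) for x in reversed(lst)]
    for k in range(1, n + 1):
        divisor = factorial(k)  # 1 * 2 * ... * k, the cumulative divisor
        yield [x / divisor for x in base]

rows = list(divider_closed_form([40, 60, 100, 240], 3))
for row in rows:
    print(row)
```

Dividing the original values once per row keeps rounding error from building up across rows, at the cost of computing a factorial for each row.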

To print the columnar output you want, use format strings. You may have to tweak this to get the alignment and precision you want for your actual data:
def print_list(L):
    print(' '.join('{:>3d}'.format(i) for i in L))
Normally we could do the division with a recursive function, but a simple loop in which each iteration produces the list that the next iteration works on does the job too:
my_list = [40, 60, 100, 240, 36, 60, 120, 240]
maxdiv = 20
baselist = list(reversed(my_list))
for div in range(1, maxdiv+1):
    baselist = [i // div for i in baselist]  # integer division, so the values stay ints
    print_list(baselist)
Output:
240 120  60  36 240 100  60  40
120  60  30  18 120  50  30  20
 40  20  10   6  40  16  10   6
 10   5   2   1  10   4   2   1
  2   1   0   0   2   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
...
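For decimal data, as in the question, the 'd' format code raises an error on floats, so a float format code is needed instead. A minimal sketch, assuming a field width of 8 and two decimals suit your data (format_row and both defaults are illustrative choices, not part of the answer above):

```python
def format_row(values, width=8, precision=2):
    """Right-align each value in a fixed-width column with fixed decimals."""
    return ' '.join('{:>{w}.{p}f}'.format(v, w=width, p=precision) for v in values)

print(format_row([240.0, 100.0, 60.0, 40.0]))
print(format_row([40.0, 16.666666666666668, 10.0, 6.666666666666667]))
```

Widening the field or raising the precision only requires changing the two keyword arguments.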

max_n = 3
vals = [40, 60, 100, 240]
grid = [list(reversed(vals))]
for n in range(2, max_n + 1):
    grid.append([v/n for v in grid[-1]])
for g in grid:
    print(g)
# Output
[240, 100, 60, 40]
[120.0, 50.0, 30.0, 20.0]
[40.0, 16.666666666666668, 10.0, 6.666666666666667]

new_list = my_list[::-1]  # reverse the list
print('\t'.join(map(str, new_list)))
for _counter in range(2, 21):  # count from 2 to 20
    for _index in range(len(new_list)):  # iterate over the whole list
        new_list[_index] = new_list[_index] / _counter
    print('\t'.join(map(str, new_list)))
This produces output like the following (I used float instead of int):
240.0 100.0 60.0 40.0
120.0 50.0 30.0 20.0
40.0 16.6666666667 10.0 6.66666666667
10.0 4.16666666667 2.5 1.66666666667
2.0 0.833333333333 0.5 0.333333333333

my_list = [40, 60, 100, 240]
def dostuff(l, limit):
    print('\t'.join(map(str, reversed(l))))
    print('\n'.join(['\t'.join(map(str, [v/float(i) for v in reversed(l)])) for i in range(2, limit+1)]))
dostuff(my_list, 20)
Produces:
240 100 60 40
120.0 50.0 30.0 20.0
80.0 33.333333333333336 20.0 13.333333333333334
60.0 25.0 15.0 10.0
48.0 20.0 12.0 8.0
40.0 16.666666666666668 10.0 6.666666666666667
34.285714285714285 14.285714285714286 8.571428571428571 5.714285714285714
30.0 12.5 7.5 5.0
26.666666666666668 11.11111111111111 6.666666666666667 4.444444444444445
24.0 10.0 6.0 4.0
21.818181818181817 9.090909090909092 5.454545454545454 3.6363636363636362
20.0 8.333333333333334 5.0 3.3333333333333335
18.46153846153846 7.6923076923076925 4.615384615384615 3.076923076923077
17.142857142857142 7.142857142857143 4.285714285714286 2.857142857142857
16.0 6.666666666666667 4.0 2.6666666666666665
15.0 6.25 3.75 2.5
14.117647058823529 5.882352941176471 3.5294117647058822 2.3529411764705883
13.333333333333334 5.555555555555555 3.3333333333333335 2.2222222222222223
12.631578947368421 5.2631578947368425 3.1578947368421053 2.1052631578947367
12.0 5.0 3.0 2.0

Related

How to store each iteration's values in dataframe?

I want to take input in the form of lists and join each list into a string. How can I store the output as a dataframe column?
The input X is a dataframe and the column name is des:
X['des'] =
[5, 13]
[L32MM, 4MM, 2]
[724027, 40]
[58, 60MM, 36MM, 0, 36, 3]
[8.5, 45MM]
[5.0MM, 44MM]
[10]
This is my code:
def clean_text():
    for i in range(len(X)):
        str1 = " "
        print(str1.join(X['des'][i]))
m = clean_text
m()
And here is my output, but how can I turn it into a dataframe column?
5 13
L32MM 4MM 2
724027 40
58 60MM 36MM 0 36 3
8.5 45MM
5.0MM 44MM
10
Note that iterating in pandas is an antipattern. Whenever possible, use DataFrame and Series methods to operate on entire columns at once.
Series.str.join (recommended)
X['joined'] = X['des'].str.join(' ')
Output:
                          des               joined
0                     [5, 13]                 5 13
1             [L32MM, 4MM, 2]          L32MM 4MM 2
2                [724027, 40]            724027 40
3  [58, 60MM, 36MM, 0, 36, 3]  58 60MM 36MM 0 36 3
4                 [8.5, 45MM]             8.5 45MM
5               [5.0MM, 44MM]           5.0MM 44MM
6                        [10]                   10
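One caveat: per the pandas documentation, Series.str.join yields NaN for any list that contains a non-string element. If the lists in des mix numbers and strings (an assumption; the question does not show the element types), a sketch that converts each item first could look like this:

```python
import pandas as pd

# Hypothetical sample mirroring the question; the int elements are assumed.
X = pd.DataFrame({'des': [[5, 13], ['L32MM', '4MM', 2], [10]]})

# Convert every item to str before joining, so mixed-type lists still work.
X['joined'] = X['des'].map(lambda items: ' '.join(map(str, items)))
print(X)
```

This does run a Python function per row, so it is slower than Series.str.join on lists that already contain only strings.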
Loop (not recommended)
Iterate the numpy values and assign using DataFrame.loc:
for i, des in enumerate(X['des'].to_numpy()):
    X.loc[i, 'loop'] = ' '.join(des)
Or iterate via DataFrame.itertuples:
for row in X.itertuples():
    X.loc[row.Index, 'itertuples'] = ' '.join(row.des)
Or iterate via DataFrame.iterrows:
for i, row in X.iterrows():
    X.loc[i, 'iterrows'] = ' '.join(row.des)
Output:
                          des                 loop           itertuples             iterrows
0                     [5, 13]                 5 13                 5 13                 5 13
1             [L32MM, 4MM, 2]          L32MM 4MM 2          L32MM 4MM 2          L32MM 4MM 2
2                [724027, 40]            724027 40            724027 40            724027 40
3  [58, 60MM, 36MM, 0, 36, 3]  58 60MM 36MM 0 36 3  58 60MM 36MM 0 36 3  58 60MM 36MM 0 36 3
4                 [8.5, 45MM]             8.5 45MM             8.5 45MM             8.5 45MM
5               [5.0MM, 44MM]           5.0MM 44MM           5.0MM 44MM           5.0MM 44MM
6                        [10]                   10                   10                   10

create df from list comprehension within loop

I have the following code to create a df from a list comprehension within a loop. However, the output is not what I want.
I would like to create a new column for each group in the list. In this example, 3 groups implies 3 columns.
Input:
import pandas as pd
t = [x * .001 for x in range(2)]
l = [[10, 2, 40], [20, 4, 80], [30, 6, 160]]
tmp = pd.DataFrame([], dtype=object)
for i in range(len(l)):
    l1 = [l[i][1]*l[i][0]*l[i][2]*t[j] for j in range(len(t))]
    tmp = tmp.append(l1, ignore_index=False)
Output:
l = [[10, 2, 40], [20, 4, 80], [30, 6, 160]]
tmp=
0
0 0.0
1 0.8
0 0.0
1 6.4
0 0.0
1 28.8
Desired Output:
0.0 0.0 0.0
0.8 6.4 28.8
How can I get the above desired output?
I believe you can build a list of lists and then call the DataFrame constructor once, which improves performance:
t = [x * .001 for x in range(2)]
l = [[10, 2, 40], [20, 4, 80], [30, 6, 160]]
tmp = []
for i in range(len(l)):
    l1 = [l[i][1]*l[i][0]*l[i][2]*t[j] for j in range(len(t))]
    print(l1)
    tmp.append(l1)  # collect each group in a plain Python list
df = pd.DataFrame(tmp, dtype=object).T
print(df)
0 1 2
0 0 0 0
1 0.8 6.4 28.8
If you need to use DataFrame.append (deprecated in pandas 1.4 and removed in pandas 2.0):
t = [x * .001 for x in range(2)]
l = [[10, 2, 40], [20, 4, 80], [30, 6, 160]]
tmp = pd.DataFrame([], dtype=object)
for i in range(len(l)):
    l1 = [l[i][1]*l[i][0]*l[i][2]*t[j] for j in range(len(t))]
    print(l1)
    tmp = tmp.append([l1])
df = tmp.T
df.columns = range(len(df.columns))
print(df)
0 1 2
0 0.0 0.0 0.0
1 0.8 6.4 28.8
You can use concat instead of append:
for i in range(len(l)):
    l1 = [l[i][1]*l[i][0]*l[i][2]*t[j] for j in range(len(t))]
    l1 = pd.DataFrame(l1)
    tmp = pd.concat([tmp, l1], axis=1)
If you want to make your code a little cleaner and more readable, I suggest using a double list comprehension in combination with the numpy.prod and numpy.array functions.
import pandas as pd
import numpy as np
t = [x * .001 for x in range(2)]
l = [[10, 2, 40], [20, 4, 80], [30, 6, 160]]
tmp = pd.DataFrame(
    np.array(
        [
            np.prod(np.array(i)) * j
            for j in t
            for i in l
        ]
    ).reshape(len(t), len(l))
)
The result looks like this:
>>> print(tmp)
     0    1     2
0  0.0  0.0   0.0
1  0.8  6.4  28.8
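Another option, sketched here under the assumption of Python 3.8+ (for math.prod), is to build one column per group directly and hand the dict to the DataFrame constructor, which avoids appending inside a loop. The names columns, group and step are chosen for this example only:

```python
import math
import pandas as pd

t = [x * .001 for x in range(2)]
l = [[10, 2, 40], [20, 4, 80], [30, 6, 160]]

# One column per group: the product of the group's values scaled by each t.
columns = {k: [math.prod(group) * step for step in t] for k, group in enumerate(l)}
df = pd.DataFrame(columns)
print(df)
```

The dict keys become the column labels, so each group ends up in its own column without a transpose.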

Calculating grid values given the distance in python

I have a cell grid of big dimensions. Each cell has an ID (p1), a cell value (p3) and coordinates in actual measures (X, Y). This is what the first 10 rows/cells look like:
   p1   p2    p3    X  Y
0   0  0.0   0.0    0  0
1   1  0.0   0.0  100  0
2   2  0.0  12.0  200  0
3   3  0.0   0.0  300  0
4   4  0.0  70.0  400  0
5   5  0.0  40.0  500  0
6   6  0.0  20.0  600  0
7   7  0.0   0.0  700  0
8   8  0.0   0.0  800  0
9   9  0.0   0.0  900  0
Neighbouring cells of cell i in the p1 can be determined as (i-500+1, i-500-1, i-1, i+1, i+500+1, i+500-1).
For example: p1 of 5 has neighbours - 4,6,504,505,506. (these are the ID of rows in the upper table - p1).
What I am trying to do is:
For the chosen value/row i in p1, I would like to know all neighbours in the chosen distance from i and sum all their p3 values.
I tried to apply this solution (link), but I don't know how to incorporate the distance parameter. The cell value can be taken with df.iloc, but the steps before this are a bit tricky for me.
Can you give me any advice?
EDIT:
Using the solution from Thomas and having df called CO:
p3
0 45
1 580
2 12000
3 12531
4 22456
I'd like to add another column and use the values from p3 columns
CO['new'] = format(sum_neighbors(data, CO['p3']))
But it doesn't work. If I pass a number instead of a reference to the column CO['p3'], it works like a charm. How can I use the values from the p3 column automatically in the format function?
SOLVED:
It worked with:
CO['new'] = CO.apply(lambda row: sum_neighbors(data, row.p3), axis=1)
Solution:
import numpy as np
import pandas
# Generating toy data
N = 10
data = pandas.DataFrame({'p3': np.random.randn(N)})
print(data)
# Finding neighbours
get_candidates = lambda i: [i-500+1, i-500-1, i-1, i+1, i+500+1, i+500-1]
filter = lambda neighbors, N: [n for n in neighbors if 0<=n<N]
get_neighbors = lambda i, N: filter(get_candidates(i), N)
print("Neighbors of 5: {}".format(get_neighbors(5, len(data))))
# Summing p3 on neighbors
def sum_neighbors(data, i, col='p3'):
    return data.iloc[get_neighbors(i, len(data))][col].sum()
print("p3 sum on neighbors of 5: {}".format(sum_neighbors(data, 5)))
Output:
p3
0 -1.106541
1 -0.760620
2 1.282252
3 0.204436
4 -1.147042
5 1.363007
6 -0.030772
7 -0.461756
8 -1.110459
9 -0.491368
Neighbors of 5: [4, 6]
p3 sum on neighbors of 5: -1.1778133703169344
Notes:
I assumed p1 was range(N) as seemed to be implied (so we don't need it at all).
I don't think that 505 is a neighbour of 5 given the list of neighbors of i defined by the OP.
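The code above only looks at immediate neighbours. To honour a chosen distance, one option is to convert each ID into a (row, column) pair and scan a square window around it. This is a sketch under assumptions the question does not confirm: IDs run row by row with a fixed row width (500 in the question's formula), and distance means grid steps in any direction, so unlike the OP's list the cells directly above and below are included. The names neighbors_within and sum_within are made up for the example.

```python
def neighbors_within(i, distance, width, n_cells):
    """IDs of all cells within `distance` grid steps of cell i (i itself excluded)."""
    row, col = divmod(i, width)
    n_rows = -(-n_cells // width)  # ceiling division: number of (possibly partial) rows
    found = []
    for r in range(max(0, row - distance), min(n_rows - 1, row + distance) + 1):
        for c in range(max(0, col - distance), min(width - 1, col + distance) + 1):
            j = r * width + c
            if j != i and j < n_cells:
                found.append(j)
    return found

def sum_within(values, i, distance, width):
    """Sum the values of every cell within `distance` steps of cell i."""
    return sum(values[j] for j in neighbors_within(i, distance, width, len(values)))

# Toy 5 x 5 grid whose cell value equals its ID.
values = list(range(25))
print(neighbors_within(12, 1, 5, 25))  # [6, 7, 8, 11, 13, 16, 17, 18]
print(sum_within(values, 12, 1, 5))    # 96
```

With the DataFrame from the answer, values could be data['p3'].to_numpy() and width would be 500. Working in (row, column) space also stops neighbours from wrapping around the left and right edges of the grid.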

How to get the first digit of the fractional part of a decimal number?

In my pandas, data is like this:
(The original data is like 2.1, 3.7, 5.6, without the 0 following.)
I want to see the distribution of the first digit of the decimal part. (i.e, 6 for 4.6). How can I do it?
I thought about 15.1 % 1, but it returns 0.09999999999999964 instead.
For positive numbers, you could multiply first and then take the modulo:
x = 15.6
x *= 10 # 156
x %= 10 # 6
If they are negative, take the absolute value first:
def get(x):
    return (x * 10) % 10
x = -15.6
print(get(abs(x)))
A much cleaner way, as poke suggested:
abs(x * 10) % 10
If you have a dataframe called df, you could apply it like this:
df.apply( lambda x: abs(x * 10) % 10 )
You could try:
int(str(x).split('.')[1][0])
This would convert to a string, split on . and take the first character of the second part, then turn it back into an integer. The reason that you get the strange value is that 0.1 has no exact binary floating-point representation (in binary it is a repeating fraction), so 15.1 is stored as a nearby approximation.
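A sketch that sidesteps the binary rounding by going through decimal.Decimal built from str(x). In Python 3, str(x) gives the shortest decimal text that round-trips the float, so the digits you typed are the digits you get back. The name first_fraction_digit is invented for this example:

```python
from decimal import Decimal

def first_fraction_digit(x):
    """Return the first digit after the decimal point of x, ignoring the sign."""
    d = abs(Decimal(str(x)))  # e.g. Decimal('15.1'), exact in base 10
    return int(d * 10) % 10   # shift one place left, keep the units digit

for value in [15.1, 4.6, -9.8, 36.0, 0.5]:
    print(value, first_fraction_digit(value))
```

Unlike splitting the string on '.', this also copes with values whose text form uses exponent notation, such as 1e-07.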
You could also use:
int(x * 10.0) % 10
This would ensure that you had an integer (you might need to use the built-in round as well, since int truncates).
As an example:
>>> pf = pandas.DataFrame([0.5, 4.6, 7.2, 9.8, 36.0])
>>> pf
0
0 0.5
1 4.6
2 7.2
3 9.8
4 36.0
>>> pf[0]
0 0.5
1 4.6
2 7.2
3 9.8
4 36.0
Name: 0, dtype: float64
>>> pf.apply(lambda x: int(x[0]*10.0)%10)
0 5
dtype: int64
>>> pf.apply(lambda x: int(x[0]*10.0)%10, 1)
0 5
1 6
2 2
3 8
4 0
dtype: int64
>>>
Testing with negative numbers:
>>> pf = pandas.DataFrame([0.5, 4.6, 7.2, -9.8, 36.0])
>>> df = pf.apply(lambda x: int(x[0]*10.0)%10, 1)
>>> df
0 5
1 6
2 2
3 2
4 0
dtype: int64
>>> df = pf.apply(lambda x: int(abs(x[0])*10.0)%10, 1)
>>> df
0 5
1 6
2 2
3 8
4 0
dtype: int64
>>>
So our final answer is:
pf.apply(lambda x: int(abs(x[0])*10.0)%10, 1)
I also tried the string method:
>>> pf.apply(lambda x:int(str(x[0]).split('.')[1][0]), 1)
0 5
1 6
2 2
3 8
4 0
dtype: int64
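Since the goal is the distribution of that digit, a vectorised sketch may be preferable to apply. It assumes the column holds values with exactly one decimal place, as the question states, so rounding x * 10 recovers the intended integer; the Series s stands in for your column:

```python
import pandas as pd

s = pd.Series([0.5, 4.6, 7.2, -9.8, 36.0])

# Scale, round away the float noise, then keep the units digit.
digits = (s.abs() * 10).round().astype(int) % 10

# Count how often each digit occurs, ordered by digit.
distribution = digits.value_counts().sort_index()
print(distribution)
```

If the data can carry more than one decimal place, rounding would change the digit, so truncate instead of rounding in that case.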

Count number of 0s from [1,2,....num]

We are given a large number 'num', which can have up to 10^4 digits (num <= 10^10000). We need to count the zeros in the decimal representations of all the numbers from 1 up to 'num'.
eg:
countZeros('9') = 0
countZeros('100') = 11
countZeros('219') = 41
The only way I could think of is brute force, which is obviously too slow for large inputs.
I found the following Python code in this link, which does the job in O(L), L being the length of 'num'.
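def CountZeros(num):
    Z = 0  # zero digits seen so far in the prefix of num
    N = 0  # value of the prefix read so far
    F = 0  # zeros written when counting from 1 to N
    for j in range(len(num)):
        F = 10*F + N - Z*(9-int(num[j]))
        if num[j] == '0':
            Z += 1
        N = 10*N + int(num[j])
    return F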
I can't understand the logic behind it. Any kind of help will be appreciated.
from 0 - 9 : 0 zeros
from 10 - 99: 9 zeros ( 10, 20, ... 90)
--100-199 explained-----------------------
100, 101, ..., 109 : 11 zeros (two in 100)
110 - 199: 9 zeros (110, 120, ..., 190; this is just the same as 10 - 99). This is important.
Total: 20
------------------------------------------
100 - 999: 20 * 9 = 180
total up to 999 is: 180 + 9 = 189
CountZeros('999') -> 189
Continue this pattern and you might start to see the overall pattern and eventually the algorithm.
Does the following help your understanding?
>>> for i in range(10, 100, 10):
... print(CountZeros(str(i)))
...
1
2
3
4
5
6
7
8
9
>>>
What about this:
>>> CountZeros("30")
j Z N F
0 0 0 0
j Z N F
0 0 3 0
j Z N F
1 0 3 0
j Z N F
1 1 30 3
3
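One way to read the loop: after processing a prefix p of num, N holds p, Z holds the number of zero digits in p, and F holds the zeros written from 1 to p. Appending a digit d gives F(10p + d) = 10*F(p) + p - Z(p)*(9 - d), because every number q below p is extended by all ten digits (ten copies of its own zeros plus one new trailing zero), while p itself is extended only by the digits 0..d. This reading is a derivation, not something stated in the linked source, so the sketch below checks it against brute force; both function names are illustrative:

```python
def count_zeros_fast(num):
    """Zeros written when listing 1..num, where num is a decimal string."""
    zeros_in_prefix = prefix = total = 0
    for ch in num:
        d = int(ch)
        total = 10 * total + prefix - zeros_in_prefix * (9 - d)
        if d == 0:
            zeros_in_prefix += 1
        prefix = 10 * prefix + d
    return total

def count_zeros_brute(n):
    """Reference implementation: count '0' characters in 1..n directly."""
    return sum(str(k).count('0') for k in range(1, n + 1))

# Compare the two on every n up to 500.
assert all(count_zeros_fast(str(n)) == count_zeros_brute(n) for n in range(1, 501))
print(count_zeros_fast('100'), count_zeros_fast('219'))  # 11 41, as in the question
```

The brute-force version is only practical for small n, which is exactly why the linear pass over the digits matters for inputs with thousands of digits.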
