Modifying alternate indices of 3d numpy array - python

I have a numpy array with shape (140, 23, 2), being 140 frames, 23 objects, and x,y locations. The data has been generated by a GAN and when I animate the movement it's very jittery. I want to smooth it by replacing, for each object, the coordinates at every odd-numbered frame index with the midpoint of the even-numbered frames on either side of it, e.g.
x[1] = (x[0] + x[2]) / 2
x[3] = (x[2] + x[4]) / 2
Below is my code:
def smooth_coordinates(df):
    # df shape is (140, 23, 2)
    # iterate through each object (23)
    for j in range(len(df[0])):
        # iterate through 140 frames
        for i in range(len(df)):
            # if it's an even number and index allows at least 1 index after it
            if (i%2 != 0) and (i < (len(df[0])-2)):
                df[i][j][0] = ( (df[i-1][j][0]+df[i+1][j][0]) /2 )
                df[i][j][1] = ( (df[i-1][j][1]+df[i+1][j][1]) /2 )
    return df
Aside from it being very inefficient, my input df and output df are identical. Any suggestions for how to achieve this more efficiently?

import numpy as np
a = np.random.randint(100, size= [140, 23, 2]) # input array
b = a.copy()
i = np.ogrid[1: a.shape[0]-1: 2] # odd indices
i
>>> [ 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25,
27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51,
53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77,
79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103,
105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129,
131, 133, 135, 137]
(a == b).all() # testing for equality
>>> True
a[i] = (a[i-1] + a[i+1]) / 2 # averaging positions across frames
(a == b).all() # testing for equality again
>>> False
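The same averaging can also be written with plain slices instead of an index array. The following is only a sketch: the name smooth_odd_frames is made up for illustration, and it assumes a float result is wanted (the integer array above would truncate the averages). It returns a new array rather than modifying its input, so the input and output can be compared afterwards. The function in the question modifies df in place and returns that same object, which is one likely reason the two looked identical; its bound len(df[0])-2 also uses the object count (23) rather than the frame count.

```python
import numpy as np

def smooth_odd_frames(frames):
    """Return a float copy in which every interior odd frame is the
    midpoint of its two even-indexed neighbours."""
    out = np.array(frames, dtype=float)  # np.array copies, input stays untouched
    # out[:-2:2] holds the left neighbours, out[2::2] the right neighbours
    out[1:-1:2] = (out[:-2:2] + out[2::2]) / 2
    return out

a = np.random.randint(100, size=(140, 23, 2))  # stand-in for the GAN output
smoothed = smooth_odd_frames(a)
print(np.array_equal(smoothed[::2], a[::2]))   # even frames are left as they were
```

With 140 frames the last frame (index 139) has no right-hand neighbour, so it is left unchanged, as in the indexed version above.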

Related

Problem with adding elements from functions to list (too much memory is using?)

I started from this code:
import matplotlib.pyplot as plt
# parameters for Romeo and Julia; they must be greater than 0 for their feelings to stay constant
aR = 0.5
aL = 0.7
# pR, pL: Romeo's/Julia's responses to love
pR = 0.2
pL = 0.5
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100]
rom = []
jul = []
def Romeo(n):
    if n == 0:
        return 1
    return Romeo(n - 1)*aR
def Julia(n):
    if n == 0:
        return 1
    return Julia(n - 1)*aL
def alfa(n):
    if n == 0:
        return 1
    return aR*Romeo(n - 1) + pR*Julia(n - 1)
def beta(n):
    if n == 0:
        return 1
    return aL*Julia(n - 1) + pL*Romeo(n - 1)
j = 0
while j < 100:
    rom.append(alfa(j))
    j+=1
j = 0
while j < 100:
    jul.append(beta(j))
    j+=1
plt.plot(x, rom, label = "Romeo love")
plt.plot(x, jul, label = "Julia love")
plt.xlabel("Days")
plt.ylabel("Romeo love")
plt.title("Some graph")
plt.legend()
plt.show()
and replaced only the alfa and beta functions, giving this:
import matplotlib.pyplot as plt
# parameters for Romeo and Julia; they must be greater than 0 for their feelings to stay constant
aR = 0.5
aL = 0.7
# pR, pL: Romeo's/Julia's responses to love
pR = 0.2
pL = 0.5
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, ]
rom = []
jul = []
def Romeo(n):
    if n == 0:
        return 1
    return Romeo(n - 1)*aR
def Julia(n):
    if n == 0:
        return 1
    return Julia(n - 1)*aL
def alfa(n):
    if n == 0:
        return 1
    return round(aR*alfa(n - 1) + pR*beta(n - 1), 3)
def beta(n):
    if n == 0:
        return 1
    return round(aL*beta(n-1) + pL*alfa(n - 1), 3)
j = 0
while j < 100:
    rom.append(alfa(j))
    j+=1
j = 0
while j < 100:
    jul.append(beta(j))
    j+=1
plt.plot(x, rom, label = "Romeo love")
plt.plot(x, jul, label = "Julia love")
plt.xlabel("Days")
plt.ylabel("Romeo love")
plt.title("Some graph")
plt.legend()
plt.show()
Now PyCharm does not finish running it (it does not draw the graph), or it would take a very long time. Earlier this was not a problem.
I thought that the many digits after the decimal point could be the reason, so I rounded every number in the list, but that didn't solve the problem.
What did I change by replacing these functions? How can I fix it?
I'm pretty sure the problem is in appending the elements from the functions to the lists (the two while loops), but I do not know why.
The current recursive approach is wasteful.
For example, computing alfa(1) requires alfa(0) and beta(0).
When you move on to alfa(2), the code will first compute alfa(1) and beta(1). Then alfa(1) would call alfa(0) and beta(0), while beta(1) would separately call alfa(0), beta(0) again, without recycling what we have computed before. So you need 6 calls for alfa(2).
At alfa(3), you would compute alfa(2) and beta(2), each of which needs 6 calls; so you need 14 calls (if my math is not off).
Imagine how many computations you would need at n == 100; the answer is 2535301200456458802993406410750. Cumulatively, i.e., since you want to plot alfa(1), ..., alfa(100), you need 5070602400912917605986812821300 computations in total, only to produce a single list rom.
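The counts above can be checked with a few lines. This is a sketch only; calls_needed is a hypothetical helper, and it counts the calls made below alfa(n), not the top-level call itself, which is the convention that gives 6 for alfa(2) and 14 for alfa(3).

```python
def calls_needed(n):
    """Number of alfa/beta calls triggered below alfa(n) without any caching."""
    if n == 0:
        return 0
    # alfa(n) makes 2 direct calls (alfa(n-1), beta(n-1)), each with its own subtree
    return 2 + 2 * calls_needed(n - 1)

for n in (1, 2, 3):
    print(n, calls_needed(n))  # 2, 6 and 14 calls respectively

# The recurrence has the closed form 2**(n + 1) - 2
print(calls_needed(100) == 2**101 - 2)
print(sum(calls_needed(n) for n in range(1, 101)))  # cumulative total for n = 1..100
```

The count roughly doubles with every step, which is why the plot never appears.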
You can use memoization to remember the previously calculated results and recycle them.
In python, you can achieve this by using functools.lru_cache (python doc); put
from functools import lru_cache
at the beginning of your code and then put
@lru_cache()
before each function; e.g.,
@lru_cache()
def Romeo(n):
    if n == 0:
        return 1
    return Romeo(n - 1)*aR
You will see the graph almost immediately now.
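Applied to the two mutually recursive functions, the idea might look like the sketch below. It reuses the parameter values from the question and leaves out the plotting; maxsize=None (an unbounded cache) is a choice made here, not something the question or answer prescribes.

```python
from functools import lru_cache

aR, aL = 0.5, 0.7  # same parameter values as in the question
pR, pL = 0.2, 0.5

@lru_cache(maxsize=None)
def alfa(n):
    if n == 0:
        return 1
    return round(aR * alfa(n - 1) + pR * beta(n - 1), 3)

@lru_cache(maxsize=None)
def beta(n):
    if n == 0:
        return 1
    return round(aL * beta(n - 1) + pL * alfa(n - 1), 3)

rom = [alfa(j) for j in range(100)]
jul = [beta(j) for j in range(100)]
print(alfa.cache_info())  # each alfa(n) is computed only once
```

Because every value is stored the first time it is computed, building both lists takes a few hundred function calls instead of an astronomically large number.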

Remove elements in a list if difference with previous element less than value

Given a list of numbers in ascending order, I need to keep only those elements for which the difference from the previously kept element is greater than or equal to a certain value (10 in my case).
Given:
list = [10,15,17,21,34,36,42,67,75,84,92,94,103,115]
Goal:
list=[10,21,34,67,84,94,115]
You could use a while loop and a variable that tracks the index you are currently looking at. Starting at index 1, check whether the number at this index minus the number at the previous index is less than 10. If it is, delete the element at this index but keep the index counter the same, so that the next iteration looks at the number that has moved into this position. If the difference is 10 or more, increase the index to look at the next number. The extra print line in the loop only shows what is being compared; you can remove it.
nums = [10, 15, 17, 21, 34, 36, 42, 67, 75, 84, 92, 94, 103, 115]
index = 1
while index < len(nums):
    print(f"comparing {nums[index-1]} with {nums[index]} nums list {nums}")
    if nums[index] - nums[index - 1] < 10:
        del nums[index]
    else:
        index += 1
print(nums)
OUTPUT
comparing 10 with 15 nums list [10, 15, 17, 21, 34, 36, 42, 67, 75, 84, 92, 94, 103, 115]
comparing 10 with 17 nums list [10, 17, 21, 34, 36, 42, 67, 75, 84, 92, 94, 103, 115]
comparing 10 with 21 nums list [10, 21, 34, 36, 42, 67, 75, 84, 92, 94, 103, 115]
comparing 21 with 34 nums list [10, 21, 34, 36, 42, 67, 75, 84, 92, 94, 103, 115]
comparing 34 with 36 nums list [10, 21, 34, 36, 42, 67, 75, 84, 92, 94, 103, 115]
comparing 34 with 42 nums list [10, 21, 34, 42, 67, 75, 84, 92, 94, 103, 115]
comparing 34 with 67 nums list [10, 21, 34, 67, 75, 84, 92, 94, 103, 115]
comparing 67 with 75 nums list [10, 21, 34, 67, 75, 84, 92, 94, 103, 115]
comparing 67 with 84 nums list [10, 21, 34, 67, 84, 92, 94, 103, 115]
comparing 84 with 92 nums list [10, 21, 34, 67, 84, 92, 94, 103, 115]
comparing 84 with 94 nums list [10, 21, 34, 67, 84, 94, 103, 115]
comparing 94 with 103 nums list [10, 21, 34, 67, 84, 94, 103, 115]
comparing 94 with 115 nums list [10, 21, 34, 67, 84, 94, 115]
[10, 21, 34, 67, 84, 94, 115]
You could build up the list in a loop. Start with the first number in the list. Keep track of the last number chosen to be in the new list. Add an item to the new list only when it differs from the last number chosen by at least the target amount:
my_list = [10,15,17,21,34,36,42,67,75,84,92,94,103,115]
last_num = my_list[0]
new_list = [last_num]
for x in my_list[1:]:
    if x - last_num >= 10:
        new_list.append(x)
        last_num = x
print(new_list) #prints [10, 21, 34, 67, 84, 94, 115]
This problem can be solved fairly simply by iterating over your initial set of values and adding each one to a new list only when its difference from the last kept value meets your condition.
Additionally, by putting this functionality into a function, you can easily swap out the values or the minimum distance.
values = [10,15,17,21,34,36,42,67,75,84,92,94,103,115]
def foo(elements, distance):
    elements = sorted(elements) # sorting the user input
    new_elements = [elements[0]] # make a new list for output
    for element in elements[1:]: # Iterate over the remaining elements...
        if element - new_elements[-1] >= distance:
            # this is the condition you described above
            new_elements.append(element)
    return new_elements
print(foo(values, 10))
# >>> [10, 21, 34, 67, 84, 94, 115]
print(foo(values, 5))
# >>> [10, 15, 21, 34, 42, 67, 75, 84, 92, 103, 115]
A few other notes here...
I sorted the array before I processed it. You may not want to do that for your particular application, but it seemed to make sense, since your sample data was already sorted. In the case that you don't want to sort the data before you build the list, you can remove the sorted on the line that I commented above.
I named the function foo because I was lazy and didn't want to think about the name. I highly recommend that you give it a more descriptive name.

Recursion Error. Having trouble understanding the logic with recursive functions

from functools import lru_cache
@lru_cache(maxsize=1000)
def recursiveFunc(x):
    if x == 1:
        return 1
    elif x > 1 :
        return recursiveFunc(x) + recursiveFunc(x+1) #This is the part i'm having doubts about.
for x in range(1, 101):
    print(x, ":", recursiveFunc(x))
This function is supposed to generate consecutive numbers from 1 to 100 using recursion.
Recursion takes time to learn well; you have to visualize what the program is executing at every step. My advice is, the first few times, to draw the call stack for every call of the function.
The solution to your problem is:
def recursiveFunc(x):
    if x == 1:
        return 1
    elif x > 1 :
        return 1 + recursiveFunc(x-1) #This is the part I've changed.
for x in range(1, 101):
    print(x, ":", recursiveFunc(x))
Why doesn't yours work? When the function reaches the return statement, it first calls recursiveFunc(x) with exactly the same argument as before, which in turn calls recursiveFunc(x) again, and so on, so the recursion never ends.
Furthermore, if you add a call like recursiveFunc(x+1) and pass a positive x, you will never reach the base case x == 1, because x grows with every call.
Here I'll try to clear things up for you :)
Writing a function that lists numbers from 1 to n is simple.
If we tried running this function
def recursiveFunc(i):
    print(i)
    recursiveFunc(i+1)
recursiveFunc(1)
It would print out 1, then 2, 3.... But would never stop.
1
2
3
...
To fix this we add a second parameter, the upper limit n, and pass it along in every recursive call
def recursiveFunc(i, n):
    if i > n:
        return
    print(i)
    recursiveFunc(i+1, n)
recursiveFunc(1, 100)
This will escape the function when it passes n, in this case, 100
1
2
...
100
If you wanted to return the series rather than just print it out, you could do something like this:
def recursiveFunc(i, n):
    if i >= n:
        return str(i)
    return str(i) + ", " + str(recursiveFunc(i + 1, n))
print(recursiveFunc(1, 100))
Then the output would be
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100
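If a list of integers is more useful than a string, the same pattern can build one. This is a small sketch with a made-up name, count_up; it recurses downwards towards the base case, like the corrected recursiveFunc(x-1) version earlier in this thread.

```python
def count_up(n):
    """Return [1, 2, ..., n], built recursively (an empty list for n < 1)."""
    if n < 1:  # base case: nothing left to count, so the recursion stops
        return []
    # solve the smaller problem first, then append the current number
    return count_up(n - 1) + [n]

print(count_up(5))  # [1, 2, 3, 4, 5]
```

Each call moves one step closer to the base case, which is what guarantees that the recursion terminates.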

Pandas create random samples without duplicates

I have a pandas dataframe containing ~200,000 rows and I would like to create 5 random samples of 1000 rows each; however, I do not want any of these samples to contain the same row twice.
To create a random sample I have been using:
import numpy as np
rows = np.random.choice(df.index.values, 1000)
sampled_df = df.loc[rows]  # df.ix has been removed from pandas; .loc selects by label
However just doing this several times would run the risk of having duplicates. Would the best way to handle this be keeping track of which rows are sampled each time?
You can use df.sample.
A dataframe with 100 rows and 5 columns:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(100, 5), columns = list("abcde"))
Sample 5 rows:
df.sample(5)
Out[8]:
           a         b         c         d         e
84  0.012201 -0.053014 -0.952495  0.680935  0.006724
45 -1.347292  1.358781 -0.838931 -0.280550 -0.037584
10 -0.487169  0.999899  0.524546 -1.289632 -0.370625
64  1.542704 -0.971672 -1.150900  0.554445 -1.328722
99  0.012143 -2.450915 -0.718519 -1.192069 -1.268863
This ensures those 5 rows are different. If you want to repeat this process, I'd suggest sampling number_of_rows * number_of_samples rows. For example if each sample is going to contain 5 rows and you need 10 samples, sample 50 rows. The first 5 will be the first sample, the second five will be the second...
all_samples = df.sample(50)
samples = [all_samples.iloc[5*i:5*i+5] for i in range(10)]
You can set replace to False in np.random.choice
rows = np.random.choice(df.index.values, 1000, replace=False)
Take a look at the numpy.random docs.
For your solution:
import numpy as np
rows = np.random.choice(df.index.values, 1000, replace=False)
sampled_df = df.loc[rows]
This will make random choices without replacement.
If you want to generate multiple samples that have no elements in common, you will need to remove the chosen elements from the pool after each iteration. You can use numpy.setdiff1d for that.
import numpy as np
allRows = df.index.values
numOfSamples = 5
samples = list()
for i in range(numOfSamples):
    choices = np.random.choice(allRows, 1000, replace=False)
    samples.append(choices)
    allRows = np.setdiff1d(allRows, choices)
Here is a working example with a range of numbers between 0 and 100:
In [58]: import numpy as np
In [59]: allRows = np.arange(100)
In [60]: numOfSamples = 5
In [61]: samples = list()
In [62]: for i in range(numOfSamples):
   ....:     choices = np.random.choice(allRows, 5, replace=False)
   ....:     samples.append(choices)
   ....:     allRows = np.setdiff1d(allRows, choices)
   ....:
In [63]: samples
Out[63]:
[array([66, 24, 47, 31, 22]),
array([ 8, 28, 15, 62, 52]),
array([18, 65, 71, 54, 48]),
array([59, 88, 43, 7, 85]),
array([97, 36, 55, 56, 14])]
In [64]: allRows
Out[64]:
array([ 0, 1, 2, 3, 4, 5, 6, 9, 10, 11, 12, 13, 16, 17, 19, 20, 21,
23, 25, 26, 27, 29, 30, 32, 33, 34, 35, 37, 38, 39, 40, 41, 42, 44,
45, 46, 49, 50, 51, 53, 57, 58, 60, 61, 63, 64, 67, 68, 69, 70, 72,
73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 86, 87, 89, 90, 91,
92, 93, 94, 95, 96, 98, 99])
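An alternative to removing the chosen labels after every draw is to shuffle the labels once and cut the result into chunks. The sketch below is only one way to do that; the function name disjoint_index_samples and its seed parameter are invented for illustration, and it assumes the index labels are unique.

```python
import numpy as np

def disjoint_index_samples(index, n_samples, sample_size, seed=None):
    """Return n_samples arrays of labels from index, with no label used twice."""
    index = np.asarray(index)
    total = n_samples * sample_size
    if total > len(index):
        raise ValueError("not enough rows for that many disjoint samples")
    rng = np.random.default_rng(seed)
    # One shuffle of all labels, then cut the first `total` into equal chunks
    shuffled = rng.permutation(index)
    return np.split(shuffled[:total], n_samples)

samples = disjoint_index_samples(np.arange(200_000), n_samples=5, sample_size=1000, seed=0)
print(len(samples), len(samples[0]))
```

Each returned array could then be used to select rows, for example with df.loc[samples[0]] on a dataframe whose index holds these labels.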

2-dimensional Array decomposition in Python

I would appreciate your help with a translation to Python 3 that decomposes an input array of any size into smaller square arrays of length 4.
I have tried chunks and the array functions in numpy but they are useless for this.
Here is my code in Perl that works well, but I want to compare it with Python (in efficiency).
sub make_array {
    my $input = shift;
    my $result;
    my @parts = split '-', $input;
    $result = [];
    # Test for valid number of lines in inputs
    my $lines = scalar @parts;
    if($lines % $width){
        die "Invalid line count $lines not divisible by $width" ;
        # Or could pad here by adding an entire row of '0'.
    }
    # Chunk input lines into NxN subarrays
    # loop across all input lines in steps of N lines
    my $line_width = 0;
    for (my $nn=0;$nn<$lines;$nn+=$width){
        # make a temp array to handle $width rows of input
        my @temp = (0..$width-1);
        for my $ii (0..$width-1){
            my $p = $parts[$nn+$ii];
            my $padding_needed = length($p) % $width;
            if($padding_needed != 0) {
                print "'$p' is not divisible by correct width of $width, Adding $padding_needed zeros\n";
                for my $pp (0..$padding_needed){
                    $p .= "0";
                }
            }
            if($line_width == 0){
                $line_width = length($p);
            }
            $temp[$ii] = $p;
        }
        # now process temp array left to right, creating keys
        my $chunks = ($line_width/$width);
        if($DEBUG) { print "chunks: $chunks\n"; }
        for (my $zz =0;$zz<$chunks;$zz++){
            if($DEBUG) { print "zz:$zz\n"; }
            my $key;
            for (my $yy=0;$yy<$width;$yy++){
                my $qq = $temp[$yy];
                $key .= substr($qq,$zz*$width, $width) . "-";
            }
            chop $key; # lose the trailing '-'
            if($DEBUG) { print "Key: $key\n"; }
            push @$result, $key;
        }
    }
    if($DEBUG){
        print "Reformatted input:";
        print Dumper $result;
        my $count = scalar @$result;
        print "There are $count keys to check against the lookup table\n";
    }
    return $result;
}
As an example, I have the following 8 x 12 matrix (8 rows of 12 characters):
000011110011
000011110011
000011110011
000011110011
000011110011
000011110011
000011110011
000011110011
and I want it decomposed into 6 square submatrices of length 4:
0000 1111 0011
0000 1111 0011
0000 1111 0011
0000 1111 0011
0000 1111 0011
0000 1111 0011
0000 1111 0011
0000 1111 0011
The original matrix comes from a file (the program should read it from a text file) in the following format:
000011110011,000011110011,000011110011,000011110011,000011110011,000011110011,000011110011,000011110011
So the program needs to split it by hyphens and take each chunk as a row of the large matrix. The 6 submatrices should come in the same input format, hence the first one would be:
0000,0000,0000,0000
The program should decompose any input matrix into square matrices of length j, say 4, if the original matrix is of size not multiple of 4 then it should disregard the remaining chunks that couldn't form a 4x4 matrix.
Several large matrices of different sizes could come in the original input file, with line breaks as separators. For example, the original large matrix together with another matrix would look like the following in a text file:
000011110011,000011110011,000011110011,000011110011,000011110011,000011110011,000011110011,000011110011\n
0101,0101,0101,0101
The program should then retrieve the 2 sets of subarrays: one set of 6 arrays of 4x4 for the first matrix and a single 4x4 array for the second. A solution for the single-matrix case is of course fine.
This is easy with numpy. Suppose we have a 12x12 array;
In [1]: import numpy as np
In [2]: a = np.arange(144).reshape([-1,12])
In [3]: a
Out[3]:
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
[ 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23],
[ 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35],
[ 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47],
[ 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
[ 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71],
[ 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83],
[ 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95],
[ 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107],
[108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119],
[120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131],
[132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143]])
To select the top-left 4x4 array, use slicing:
In [4]: a[0:4,0:4]
Out[4]:
array([[ 0, 1, 2, 3],
[12, 13, 14, 15],
[24, 25, 26, 27],
[36, 37, 38, 39]])
The bottom-right sub-array is:
In [7]: a[8:12,8:12]
Out[7]:
array([[104, 105, 106, 107],
[116, 117, 118, 119],
[128, 129, 130, 131],
[140, 141, 142, 143]])
You can guess the rest...
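For the string format in the question, the same block slicing can be done directly on the text rows. The following is a rough sketch rather than a translation of the Perl code: the name decompose and its parameters are invented here, it assumes the comma-separated format shown in the sample data (pass sep="-" for hyphen-separated input), and it drops leftover rows and columns instead of padding them.

```python
def decompose(line, width=4, sep=","):
    """Split one serialized matrix into width x width blocks, in row-major order."""
    rows = [r for r in line.strip().split(sep) if r]
    blocks = []
    # Walk over complete bands of `width` rows; leftover rows are dropped
    for top in range(0, len(rows) - width + 1, width):
        band = rows[top:top + width]
        usable = min(len(r) for r in band)
        # Walk over complete groups of `width` columns; leftover columns are dropped
        for left in range(0, usable - width + 1, width):
            block = [r[left:left + width] for r in band]
            blocks.append(sep.join(block))
    return blocks

big = ",".join(["000011110011"] * 8)
for block in decompose(big):
    print(block)
```

A file containing several matrices, one per line, could be handled by calling decompose on each line in turn.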
