I need to split my DataFrame into two DataFrames based on index values:
df1 with this index: [5, 15, 22, 23, 24]
df2 with this index: [0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 18, 19, 20, 21, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54]
I have been unable to find a solution. Any help would be appreciated.
If the input is a list of index values, you can use Index.isin with boolean indexing (this also works correctly if some of the values do not exist in the original index):
idx = [5, 15, 22, 23, 24]
mask = df.index.isin(idx)
df1 = df[mask]
df2 = df[~mask]
A solution with DataFrame.loc also works, and the trailing : is optional, but all values must exist in the original index:
L1 = [5, 15, 22, 23, 24]
L2 = [0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 18, 19, 20,
21, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54]
df1 = df.loc[L1]
df2 = df.loc[L2]
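To make the difference between the two approaches concrete, here is a minimal sketch on a small made-up DataFrame (the column name "value" and the absent label 99 are assumptions for illustration only). It assumes a recent pandas version, where .loc raises KeyError if any label in the list is missing:

```python
import pandas as pd

# Toy frame with the default index 0..7 (illustrative data, not from the question)
df = pd.DataFrame({"value": range(8)})
idx = [1, 4, 6, 99]  # 99 is deliberately absent from the index

# Boolean mask: labels that are not in the index are simply ignored
mask = df.index.isin(idx)
df1 = df[mask]
df2 = df[~mask]

# .loc with a list of labels expects every label to exist
try:
    df.loc[idx]
    loc_failed = False
except KeyError:
    loc_failed = True
```

The mask version also gives you the complement for free via ~mask, so you do not have to write out the second index list by hand.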
You can use .loc:
df_1 = df.loc[[5, 15, 22, 23, 24], :]
df_2 = df.loc[[0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 18, 19, 20, 21, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54], :]
Here is the documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.html
I'm very new to coding, and I'm doing an assignment where I have to find the product of all even-indexed integers in a huge list:
number_list = [12, 41, 10, 34, 37, 2, 3, 8, 42, 46, 46, 27, 13, 49, 41, 2, 28, 21, 37, 27, 31, 19, 46, 7, 50, 1, 46, 45, 19, 10, 14, 8, 44, 14, 10, 4, 23, 29, 46, 18, 32, 40, 32, 7, 33, 45, 26, 24, 43, 45]
The question recommends using range(len(list)), which gives me range(0, 50), but I don't see how that's relevant. I managed to get the answer without using that method:
number_list = [12, 41, 10, 34, 37, 2, 3, 8, 42, 46, 46, 27, 13, 49, 41, 2, 28, 21, 37, 27, 31, 19, 46, 7, 50, 1, 46, 45, 19, 10, 14, 8, 44, 14, 10, 4, 23, 29, 46, 18, 32, 40, 32, 7, 33, 45, 26, 24, 43, 45]
result = 1
evenlist = number_list[::2]
for num in evenlist:
    result = result * num
How would range(len(list)) be useful here?
It might be something like this, where you access each element by its index in the list:
number_list = [12, 41, 10, 34, 37, 2, 3, 8, 42, 46, 46, 27, 13, 49, 41, 2, 28, 21, 37, 27, 31, 19, 46, 7, 50, 1, 46, 45, 19, 10, 14, 8, 44, 14, 10, 4, 23, 29, 46, 18, 32, 40, 32, 7, 33, 45, 26, 24, 43, 45]
result = 1
for idx in range(0, len(number_list), 2):
    result = result * number_list[idx]
I would add that @fixatd's answer is correct, but I would also caution you that the solution is not Pythonic. I realize your book or instructor wants the answer a certain way, but I'd like to elaborate and show you some better alternatives for when you're not tied to the solution they want to see.
For example, here is a more functional approach:
from operator import mul
from itertools import islice
from functools import reduce
number_list = [12, 41, 10, 34, 37, 2, 3, 8, 42, 46, 46, 27, 13, 49, 41, 2, 28, 21, 37, 27, 31, 19, 46, 7, 50, 1, 46, 45, 19, 10, 14, 8, 44, 14, 10, 4, 23, 29, 46, 18, 32, 40, 32, 7, 33, 45, 26, 24, 43, 45]
reduce(mul, islice(number_list, 0, None, 2))
218032559868925537961630414929920000
Alternatively, if you prefer fewer imports or a less functional style, you can loop like a native. In Python you'll often just loop over the iterable. The idiom is
for something in iterable:
It's typically less Pythonic to use len of something inside range just to loop over it. If you do happen to need the indices for some reason, use enumerate:
for index, item in enumerate(iterable):
The solution that you provided is actually quite nice and Pythonic, as opposed to what the assignment is requesting. Here's your solution slightly cleaned up:
result = 1
for num in number_list[::2]:
    result *= num
>>> result
218032559868925537961630414929920000
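As a small sketch of how these styles line up, the snippet below computes the same product three ways on a shortened list (the first five values of number_list, chosen only to keep the arithmetic easy to check). The variable names are illustrative, and math.prod assumes Python 3.8 or later:

```python
import math

number_list = [12, 41, 10, 34, 37]  # shortened sample for illustration

# 1. What the assignment hints at: step through the indices two at a time
by_index = 1
for i in range(0, len(number_list), 2):
    by_index *= number_list[i]

# 2. enumerate, for when you need both the index and the item
by_enumerate = 1
for i, num in enumerate(number_list):
    if i % 2 == 0:
        by_enumerate *= num

# 3. Slice plus math.prod (Python 3.8+)
by_prod = math.prod(number_list[::2])

print(by_index, by_enumerate, by_prod)  # 4440 4440 4440
```

All three pick elements 0, 2 and 4 (12, 10 and 37), so the choice is mostly about readability.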
Why are you interested in using range? Below are two other variants that solve your problem.
First way:
result = 1
for number in number_list[::2]:
    result *= number
Second way:
from functools import reduce
reduce(lambda a, b: a*b, number_list[::2])
Well, besides @fixatd's answer, you could also do it with reduce:
import operator
from functools import reduce
def prod(iterable):
    return reduce(operator.mul, iterable, 1)
result = prod(evenlist)
What you can do is iterate over the list by index, specifying a "jump" of two instead of one for each iteration using range(start, end, jump). When you write list[::2], you create a new list containing every other element and then iterate over it. Be careful, because that copy may take more resources for a larger input.
number_list = [12, 41, 10, 34, 37, 2, 3, 8, 42, 46, 46, 27, 13, 49, 41, 2, 28, 21, 37, 27, 31, 19, 46, 7, 50, 1, 46, 45, 19, 10, 14, 8, 44, 14, 10, 4, 23, 29, 46, 18, 32, 40, 32, 7, 33, 45, 26, 24, 43, 45]
prod = 1
for i in range(0, len(number_list), 2):
    prod *= number_list[i]
Suppose I have an array with shape (3, 4, 5) and want to slice along the second axis with an index array [2, 1, 0].
I find it hard to explain what I want in words, so please refer to the code and figure below:
>>> src = np.arange(3*4*5).reshape(3,4,5)
>>> index = [2,1,0]
>>> src
array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19]],
       [[20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34],
        [35, 36, 37, 38, 39]],
       [[40, 41, 42, 43, 44],
        [45, 46, 47, 48, 49],
        [50, 51, 52, 53, 54],
        [55, 56, 57, 58, 59]]])
>>> # what I need is:
array([[[10, 11, 12, 13, 14]],    # slice the 2nd row (index[0])
       [[25, 26, 27, 28, 29]],    # 1st row (index[1])
       [[40, 41, 42, 43, 44]]])   # 0th row (index[2])
src[np.arange(src.shape[0]), [2, 1, 0]]
# src[np.arange(src.shape[0]), [2, 1, 0], :]
array([[10, 11, 12, 13, 14],
[25, 26, 27, 28, 29],
[40, 41, 42, 43, 44]])
We need to compute the indices for axis=0:
>>> np.arange(src.shape[0])
array([0, 1, 2])
And we already have the indices for axis=1. We then take a full slice along axis=2 to extract our cross-section.
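Note that the indexing above returns shape (3, 5), while the question asks for (3, 1, 5). A minimal sketch of one way to get there, re-inserting the dropped axis with np.newaxis (the names picked and picked_3d are illustrative only):

```python
import numpy as np

src = np.arange(3 * 4 * 5).reshape(3, 4, 5)
index = [2, 1, 0]

# Pair block i on axis 0 with row index[i] on axis 1; axis 2 is taken whole
picked = src[np.arange(src.shape[0]), index]  # shape (3, 5)

# Re-insert a length-1 axis to match the layout asked for in the question
picked_3d = picked[:, np.newaxis, :]          # shape (3, 1, 5)

print(picked_3d.shape)
```

The two integer index arrays are matched element by element, which is why np.arange(src.shape[0]) is needed rather than a plain : on the first axis.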
You could do:
import numpy as np
arr = np.array([[[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34],
[35, 36, 37, 38, 39]],
[[40, 41, 42, 43, 44],
[45, 46, 47, 48, 49],
[50, 51, 52, 53, 54],
[55, 56, 57, 58, 59]]])
first, second = zip(*enumerate([2, 1, 0]))
result = arr[first, second, :]
print(result)
Output
[[10 11 12 13 14]
[25 26 27 28 29]
[40 41 42 43 44]]
How would one (efficiently) do the following:
x = np.arange(49)
x2 = np.reshape(x, (7,7))
x2
array([[ 0, 1, 2, 3, 4, 5, 6],
[ 7, 8, 9, 10, 11, 12, 13],
[14, 15, 16, 17, 18, 19, 20],
[21, 22, 23, 24, 25, 26, 27],
[28, 29, 30, 31, 32, 33, 34],
[35, 36, 37, 38, 39, 40, 41],
[42, 43, 44, 45, 46, 47, 48]])
From here I want to roll a couple of columns.
I want to roll the column 0, 7, 14, 21, etc. so that 14 comes to the top.
Then the same with the column 4, 11, 18, 25, etc. so that 39 comes to the top.
Result should be:
x2
array([[14, 1, 2, 3, 39, 5, 6],
[21, 8, 9, 10, 46, 12, 13],
[28, 15, 16, 17, 4, 19, 20],
[35, 22, 23, 24, 11, 26, 27],
[42, 29, 30, 31, 18, 33, 34],
[ 0, 36, 37, 38, 25, 40, 41],
[ 7, 43, 44, 45, 32, 47, 48]])
I looked up numpy.roll, and searched here and on Google, but couldn't find how to do this.
For horizontal rolls, I could do:
x3 = np.roll(x2[0], 3, axis=0)
x3
array([4, 5, 6, 0, 1, 2, 3])
But how do I return the full array with this roll change as a new copy?
Roll with a negative shift:
x2[:, 0] = np.roll(x2[:, 0], -2)
Roll with a positive shift:
x2[:, 4] = np.roll(x2[:, 4], 2)
gives:
>>>x2
array([[14, 1, 2, 3, 39, 5, 6],
[21, 8, 9, 10, 46, 12, 13],
[28, 15, 16, 17, 4, 19, 20],
[35, 22, 23, 24, 11, 26, 27],
[42, 29, 30, 31, 18, 33, 34],
[ 0, 36, 37, 38, 25, 40, 41],
[ 7, 43, 44, 45, 32, 47, 48]])
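The assignments above modify x2 in place. Since the question also asks for the result as a new copy, here is a minimal sketch that leaves the original untouched. roll_columns is a hypothetical helper written for this example, not a NumPy function:

```python
import numpy as np

def roll_columns(arr, shifts):
    """Return a copy of arr with selected columns rolled.

    shifts maps a column index to the shift passed to np.roll
    (hypothetical helper for illustration).
    """
    out = arr.copy()
    for col, shift in shifts.items():
        out[:, col] = np.roll(arr[:, col], shift)
    return out

x2 = np.arange(49).reshape(7, 7)
rolled = roll_columns(x2, {0: -2, 4: 2})

print(rolled[:, 0])  # [14 21 28 35 42  0  7]
print(x2[:, 0])      # original column is unchanged
```

np.roll itself always returns a new array, so the only copy needed is the one of the full matrix.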
Here's a way to roll multiple columns in one go with advanced-indexing -
# Params
cols = [0,4] # Columns to be rolled
dirn = [2,-2] # Offset with direction as sign
n = x2.shape[0]
x2[:,cols] = x2[np.mod(np.arange(n)[:,None] + dirn,n),cols]
Sample run -
In [45]: x2
Out[45]:
array([[ 0, 1, 2, 3, 4, 5, 6],
[ 7, 8, 9, 10, 11, 12, 13],
[14, 15, 16, 17, 18, 19, 20],
[21, 22, 23, 24, 25, 26, 27],
[28, 29, 30, 31, 32, 33, 34],
[35, 36, 37, 38, 39, 40, 41],
[42, 43, 44, 45, 46, 47, 48]])
In [46]: cols = [0,4,5] # Columns to be rolled
...: dirn = [2,-2,4] # Offset with direction as sign
...: n = x2.shape[0]
...: x2[:,cols] = x2[np.mod(np.arange(n)[:,None] + dirn,n),cols]
...:
In [47]: x2 # Three columns rolled
Out[47]:
array([[14, 1, 2, 3, 39, 33, 6],
[21, 8, 9, 10, 46, 40, 13],
[28, 15, 16, 17, 4, 47, 20],
[35, 22, 23, 24, 11, 5, 27],
[42, 29, 30, 31, 18, 12, 34],
[ 0, 36, 37, 38, 25, 19, 41],
[ 7, 43, 44, 45, 32, 26, 48]])
You have to overwrite the column
e.g.:
x2[:,0] = np.roll(x2[:,0], 3)
Here is a useful function for shifting a 2D array in all 4 directions (up, down, left, right):
def image_shift_roll(img, x_roll, y_roll):
    img_roll = img.copy()
    img_roll = np.roll(img_roll, -y_roll, axis=0)  # Positive y rolls up
    img_roll = np.roll(img_roll, x_roll, axis=1)  # Positive x rolls right
    return img_roll
I am using nested lists as a matrix representation. I created the following function for splitting square matrices of size 2^k into four equal parts (used for the Strassen algorithm):
import itertools
def splitmat(mat):
    n = len(mat)
    return map( \
        lambda (x,y):map(lambda z:z[y[0]:y[1]],mat[x[0]:x[1]]), \
        itertools.product([(0,n/2),(n/2,n)],repeat=2)
    )
Now I'm trying to find an inverse function that joins the four parts back into a full matrix. I could use two nested loops, but is there a more Pythonic way to achieve this? I would prefer not to use numpy, only built-in modules. Do you have any idea or hint how to achieve this?
Your inverse operation can be split into 2 simpler operations:
concatenate rows (like numpy.vstack)
concatenate columns (like numpy.hstack)
So, if you have a matrix divided into 4 submatrices:
M = |m1|m2|
    |m3|m4|
then M = vstack(hstack(m1, m2), hstack(m3, m4)).
These operations can be coded like this:
import itertools
import math
# iterators
def ihstack(*matrixes):
    return map(lambda rows: itertools.chain(*rows), zip(*matrixes))
def ivstack(*matrixes):
    return itertools.chain(*matrixes)
# main function
def squarejoin(*matrixes):
    size = int(math.sqrt(len(matrixes)))
    assert size ** 2 == len(matrixes), 'Incorrect number of matrices'
    return _matrixjoin(matrixes, size, size)
def _matrixjoin(matrixes, hsize, vsize):
    print(matrixes, hsize, vsize)
    return ivstack(*(ihstack(*itertools.islice(matrixes, i*hsize, (i+1)*hsize)) for i in range(vsize)))
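For the specific four-quadrant case, the same idea can be written with plain lists and no iterators. This is a minimal Python 3 sketch, not the code from the question or the answer: splitmat is rewritten here with list comprehensions, and joinmat is a hypothetical name for the inverse. It assumes the quadrants come in the order top-left, top-right, bottom-left, bottom-right:

```python
def splitmat(mat):
    """Split a square matrix (list of lists, even size) into four quadrants."""
    h = len(mat) // 2
    return [[row[c:c + h] for row in mat[r:r + h]]
            for r in (0, h) for c in (0, h)]

def joinmat(m1, m2, m3, m4):
    """Inverse of splitmat: glue rows side by side, then stack the halves."""
    top = [a + b for a, b in zip(m1, m2)]     # horizontal join of m1 | m2
    bottom = [c + d for c, d in zip(m3, m4)]  # horizontal join of m3 | m4
    return top + bottom                       # vertical join

mat = [list(range(i * 4, i * 4 + 4)) for i in range(4)]
parts = splitmat(mat)
print(joinmat(*parts) == mat)  # True
```

List concatenation does both jobs here: a + b joins two rows horizontally, and top + bottom stacks the two halves vertically.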
Here is an example program with three versions: a 2-loop implementation that works and is crystal clear in its intent, a 1-loop implementation that works and is, imho, slightly less clear, and finally a 0-loop (0 explicit loops, that is) implementation that, alas, is buggy.
My vote goes to the two loops. Further, I'd like to be shown what's wrong with my 0-loop attempt.
Code
import itertools
def pm(m):
    for row in m: print row
mat = []
n = 8
for i in range(n):
    mat.append(range(i*n, i*n+n))
# this is shorthand for your splitmat function
res = map(lambda (x,y):
          map(lambda z:z[y[0]:y[1]],mat[x[0]:x[1]]),
          itertools.product([(0,n/2),(n/2,n)],repeat=2))
pm(res)
print "\n2 cycles"
mat = []
for i, j in ((0,1),(2,3)):
    for a, b in zip(res[i],res[j]):
        mat.append(a+b)
pm(mat)
print "\n1 cycle"
mat = []
for i, j in ((0,1),(2,3)):
    map(lambda x: mat.append(x[0]+x[1]), zip(res[i],res[j]))
pm(mat)
print "\n0 cycles"
mat = map(lambda i_j:
          map(lambda x: x[0]+x[1], zip(res[i_j[0]],res[i_j[1]])), ((0,1),(2,3)))
pm(mat)
Output
[[0, 1, 2, 3], [8, 9, 10, 11], [16, 17, 18, 19], [24, 25, 26, 27]]
[[4, 5, 6, 7], [12, 13, 14, 15], [20, 21, 22, 23], [28, 29, 30, 31]]
[[32, 33, 34, 35], [40, 41, 42, 43], [48, 49, 50, 51], [56, 57, 58, 59]]
[[36, 37, 38, 39], [44, 45, 46, 47], [52, 53, 54, 55], [60, 61, 62, 63]]
2 cycles
[0, 1, 2, 3, 4, 5, 6, 7]
[8, 9, 10, 11, 12, 13, 14, 15]
[16, 17, 18, 19, 20, 21, 22, 23]
[24, 25, 26, 27, 28, 29, 30, 31]
[32, 33, 34, 35, 36, 37, 38, 39]
[40, 41, 42, 43, 44, 45, 46, 47]
[48, 49, 50, 51, 52, 53, 54, 55]
[56, 57, 58, 59, 60, 61, 62, 63]
1 cycle
[0, 1, 2, 3, 4, 5, 6, 7]
[8, 9, 10, 11, 12, 13, 14, 15]
[16, 17, 18, 19, 20, 21, 22, 23]
[24, 25, 26, 27, 28, 29, 30, 31]
[32, 33, 34, 35, 36, 37, 38, 39]
[40, 41, 42, 43, 44, 45, 46, 47]
[48, 49, 50, 51, 52, 53, 54, 55]
[56, 57, 58, 59, 60, 61, 62, 63]
0 cycles
[[0, 1, 2, 3, 4, 5, 6, 7], [8, 9, 10, 11, 12, 13, 14, 15], [16, 17, 18, 19, 20, 21, 22, 23], [24, 25, 26, 27, 28, 29, 30, 31]]
[[32, 33, 34, 35, 36, 37, 38, 39], [40, 41, 42, 43, 44, 45, 46, 47], [48, 49, 50, 51, 52, 53, 54, 55], [56, 57, 58, 59, 60, 61, 62, 63]]
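Judging from that last output, the 0-loop version is not wrong about the rows themselves: it returns one list per pair of quadrants (top half, bottom half), so the rows end up nested one level too deep. A minimal Python 3 sketch of one possible fix, flattening with itertools.chain.from_iterable (shown on a 4x4 example with illustrative names, using comprehensions in place of the nested map calls):

```python
from itertools import chain

# Quadrants of a 4x4 matrix: top-left, top-right, bottom-left, bottom-right
res = [
    [[0, 1], [4, 5]],
    [[2, 3], [6, 7]],
    [[8, 9], [12, 13]],
    [[10, 11], [14, 15]],
]

# Same structure as the 0-loop attempt: one list of joined rows per half
nested = [[a + b for a, b in zip(res[i], res[j])]
          for i, j in ((0, 1), (2, 3))]

# Flatten one level so the result is a list of rows again
mat = list(chain.from_iterable(nested))

for row in mat:
    print(row)
```

The 1-loop and 2-loop versions avoid the problem because they append every joined row to a single flat list.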