Converting Int64Index to list or accessing list of lists - python

I have a list of lists but because of the Int64Index I cannot access it. Is there a way to access individual values or make it into a normal list?
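import pandas as pd

data_exp = pd.read_csv(path+'/exp.csv')
exp_list=[]
for i in range (1,n+1):
    # Boolean mask: True for rows whose 'Set No.' equals i
    check=data_exp.apply(lambda x: True if x['Set No.']==i else False, axis=1)
    # 1-based row labels of the matching rows, wrapped in a list
    temp=[data_exp[check==True].index+1]
    exp_list.append(temp)
    del temp
display(exp_list)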
The for loop just sorts values based on a condition. The output is correct, but its format is problematic.
It gives me the following output:
[[Int64Index([8, 11, 17, 20, 21, 27, 29, 36, 37, 38], dtype='int64')],
[Int64Index([1, 3, 7, 10, 14, 31, 33, 34, 35], dtype='int64')],
[Int64Index([5, 9, 12, 15, 19, 23, 25, 26, 28, 32], dtype='int64')],
[Int64Index([2, 4, 6, 13, 16, 18, 22, 24, 30, 39, 40], dtype='int64')]]
Thanks in advance

I'm not quite sure what you're doing to get the list of Int64Indexes, but you can access the numpy array underlying the index with the values property:
from pandas import Int64Index
l = [[Int64Index([8, 11, 17, 20, 21, 27, 29, 36, 37, 38], dtype='int64')],
[Int64Index([1, 3, 7, 10, 14, 31, 33, 34, 35], dtype='int64')],
[Int64Index([5, 9, 12, 15, 19, 23, 25, 26, 28, 32], dtype='int64')],
[Int64Index([2, 4, 6, 13, 16, 18, 22, 24, 30, 39, 40], dtype='int64')]]
print(l[0][0].values[0])
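If a plain Python list is what you need, Index.tolist() does the conversion. A minimal sketch, assuming a made-up two-group sample and using pd.Index in place of Int64Index (which was removed in pandas 2.0):

```python
import pandas as pd

# Hypothetical sample mirroring the nested structure from the question.
l = [[pd.Index([8, 11, 17])], [pd.Index([1, 3, 7])]]

# Unpack the redundant inner list and convert each Index to a plain list.
plain = [idx.tolist() for (idx,) in l]
print(plain)        # [[8, 11, 17], [1, 3, 7]]
print(plain[0][1])  # 11
```

After this, plain[i][j] indexes like any ordinary list of lists.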


KeyError: 2 in qcut

I want to do bucketing on one of the columns of my dataframe. I have 2 columns, Category & Rank.
I want to do the bucketing within each category, so I first group by category and then use qcut on each group. After the groupby, my dataframe's groups look like this (final_data.groupby(['category'])['Ranks'].groups):
{'north': [1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 13, 14, 17, 18, 19, 21, 22, 23, 24, 25, 2, 3, 4, 5, 6, 7, 8, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 22, 23, 24, 25, 29, 30, 31, 32, 33, 34, 36, 37, 38, 39, 40, 41, 42, 43, 44], 'south': [0, 5, 12, 15, 16, 20, 26, 27, 0, 1, 9, 10, 21, 26, 27, 28, 35, 45, 46, 47]}
I'm applying this code to do the bucketing:
final_data.groupby(['category'])['Rank'].transform(lambda g: pd.qcut(g, q=[0.0, .1, .25, .5, .75, .9, 1.0], labels= ["Top 10", "11-25", "26-50", "50-75", "75-90" ,"Bottom10"]))
The above code throws KeyError: 2.

Divide Dataframe into 2 dataframe using index

I need to divide my dataframe into 2 dataframes based on its index:
Df1 with this index:[5, 15, 22, 23, 24]
Df2 with this index:[0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 18, 19, 20, 21, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54]
Unable to find solution! Any help would be appreciated
If the input is a list of index values, you can use Index.isin with boolean indexing (this also works correctly if some of the values do not exist in the original index):
idx = [5, 15, 22, 23, 24]
mask = df.index.isin(idx)
df1 = df[mask]
df2 = df[~mask]
A solution with DataFrame.loc also works, and the trailing : is optional, but every value must exist in the original index:
L1 = [5, 15, 22, 23, 24]
L2 = [0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 18, 19, 20,
21, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54]
df1 = df.loc[L1]
df2 = df.loc[L2]
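The difference between the two approaches shows up when a label is missing. A small sketch, assuming a hypothetical 10-row frame and pandas 1.0 or later (where .loc with a list raises on missing labels):

```python
import pandas as pd

# Hypothetical frame with index 0..9; label 99 does not exist in it.
df = pd.DataFrame({"a": range(10)})
idx = [2, 5, 99]

# isin silently ignores labels that are not in the index.
mask = df.index.isin(idx)
df1 = df[mask]
df2 = df[~mask]
print(df1.index.tolist())  # [2, 5]
print(len(df2))            # 8

# .loc with a list raises KeyError if any label is missing.
try:
    df.loc[idx]
    loc_failed = False
except KeyError:
    loc_failed = True
print(loc_failed)  # True
```

The mask version also gives you the complement for free via ~mask, so you do not need to spell out the second list of labels.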
You can use .loc:
df_1 = df.loc[[5, 15, 22, 23, 24], :]
df_2 = df.loc[[0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 18, 19, 20, 21, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54], :]
Here is the documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.html

How do I initialize a numpy array starting at a particular number?

I can initialize a numpy array and reshape it at the time of creation.
test = np.arange(32).reshape(4, 8)
which produces this:
array([[ 0, 1, 2, 3, 4, 5, 6, 7],
[ 8, 9, 10, 11, 12, 13, 14, 15],
[16, 17, 18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29, 30, 31]])
... but I'd like to know how to start the sequential numbering at a given point, say at 13 rather than at 0. How is that done in numpy?
I've looked for answers and found something somewhat similar but it seems there would be a numpy command to do this.
arange takes an optional start argument.
start = 13 # Any number works here
np.arange(start, start + 32).reshape(4, 8)
# array([[13, 14, 15, 16, 17, 18, 19, 20],
# [21, 22, 23, 24, 25, 26, 27, 28],
# [29, 30, 31, 32, 33, 34, 35, 36],
# [37, 38, 39, 40, 41, 42, 43, 44]])
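An equivalent way, sketched here only as an alternative, is to build the zero-based array first and add the offset afterwards; broadcasting applies the scalar to every element:

```python
import numpy as np

start = 13
a = np.arange(start, start + 32).reshape(4, 8)
b = np.arange(32).reshape(4, 8) + start  # same values via scalar broadcasting

print(np.array_equal(a, b))  # True
print(a[0, 0], a[-1, -1])    # 13 44
```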

Re-order a numpy array python

I have a big two-dimensional array like this:
array([[ 1, 2, 3, 4, 5, 6, 7, 8],
[ 9,10,11,12,13,14,15,16],
[17,18,19,20,21,22,23,24],
[25,26,27,28,29,30,31,32],
[33,34,35,36,37,38,39,40],
[41,42,43,44,45,46,47,48],
....])
and I need to convert it into:
array([ 1, 9,17, 2,10,18, 3,11,19, 4,12,20, 5,13,21, 6,14,22, 7,15,23, 8,16,24],
[25,33,41,26,34,42,27,35,43,28,36,44,29,37,45,30,38,46,31,39,47,32,40,48],
...
Note that this should only be a demonstration what it should do.
The original array contains only boolean values and has the size of 512x8. In my example, I order only 3 rows with 8 elements into one row but what I really need are respectively 32 rows with 8 elements.
I am really sorry, but after 30 minutes of writing, this is the only description I got of my problem. I hope it is enough.
I think you can achieve your desired result using two reshape operations and a transpose:
x = np.array([[ 1, 2, 3, 4, 5, 6, 7, 8],
[ 9,10,11,12,13,14,15,16],
[17,18,19,20,21,22,23,24],
[25,26,27,28,29,30,31,32],
[33,34,35,36,37,38,39,40],
[41,42,43,44,45,46,47,48]])
y = x.reshape(2, 3, 8).transpose(0, 2, 1).reshape(2, -1)
print(repr(y))
# array([[ 1, 9, 17, 2, 10, 18, 3, 11, 19, 4, 12, 20, 5, 13, 21, 6, 14,
# 22, 7, 15, 23, 8, 16, 24],
# [25, 33, 41, 26, 34, 42, 27, 35, 43, 28, 36, 44, 29, 37, 45, 30, 38,
# 46, 31, 39, 47, 32, 40, 48]])
To break that down a bit:
@hpaulj's first reshape operation gives us this:
x1 = x.reshape(2, 3, 8)
print(repr(x1))
# array([[[ 1, 2, 3, 4, 5, 6, 7, 8],
# [ 9, 10, 11, 12, 13, 14, 15, 16],
# [17, 18, 19, 20, 21, 22, 23, 24]],
# [[25, 26, 27, 28, 29, 30, 31, 32],
# [33, 34, 35, 36, 37, 38, 39, 40],
# [41, 42, 43, 44, 45, 46, 47, 48]]])
print(x1.shape)
# (2, 3, 8)
In order to get the desired output we need to 'collapse' this array along the second dimension (with size 3), then along the third dimension (with size 8).
The easiest way to achieve this sort of thing is to first transpose the array so that the dimensions you want to collapse along are ordered from first to last:
x2 = x1.transpose(0, 2, 1) # you could also use `x2 = np.rollaxis(x1, 1, 3)`
print(repr(x2))
# array([[[ 1, 9, 17],
# [ 2, 10, 18],
# [ 3, 11, 19],
# [ 4, 12, 20],
# [ 5, 13, 21],
# [ 6, 14, 22],
# [ 7, 15, 23],
# [ 8, 16, 24]],
# [[25, 33, 41],
# [26, 34, 42],
# [27, 35, 43],
# [28, 36, 44],
# [29, 37, 45],
# [30, 38, 46],
# [31, 39, 47],
# [32, 40, 48]]])
print(x2.shape)
# (2, 8, 3)
Finally I can use reshape(2, -1) to collapse the array over the last two dimensions. The -1 causes numpy to infer the appropriate size in the last dimension based on the number of elements in x.
y = x2.reshape(2, -1)
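The same reshape/transpose/reshape chain can be generalized to the 512x8 case from the question, where 32 rows are merged at a time. A rough sketch; the helper name interleave_rows and its group parameter are made up for illustration:

```python
import numpy as np

def interleave_rows(x, group):
    """Merge every `group` consecutive rows into one row, column by column."""
    n_rows, n_cols = x.shape
    if n_rows % group:
        raise ValueError("row count must be a multiple of group")
    # (blocks, group, cols) -> (blocks, cols, group) -> (blocks, cols * group)
    return x.reshape(-1, group, n_cols).transpose(0, 2, 1).reshape(-1, group * n_cols)

# The small example from the question: 6 rows, merged 3 at a time.
x = np.arange(1, 49).reshape(6, 8)
y = interleave_rows(x, 3)
print(y.shape)         # (2, 24)
print(y[0, :6])        # [ 1  9 17  2 10 18]

# The real use case: a 512x8 boolean array, merged 32 rows at a time.
big = np.zeros((512, 8), dtype=bool)
print(interleave_rows(big, 32).shape)  # (16, 256)
```

Using -1 for the block count lets NumPy infer how many groups there are, so the same function works for any row count that is a multiple of group.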
Looks like a starting point is to reshape it, for example
In [49]: x.reshape(2,3,8)
Out[49]:
array([[[ 1, 2, 3, 4, 5, 6, 7, 8],
[ 9, 10, 11, 12, 13, 14, 15, 16],
[17, 18, 19, 20, 21, 22, 23, 24]],
[[25, 26, 27, 28, 29, 30, 31, 32],
[33, 34, 35, 36, 37, 38, 39, 40],
[41, 42, 43, 44, 45, 46, 47, 48]]])
.ravel(order='F') doesn't get it right, so I think we need to swap some axes before flattening. It will need to be a copy.
Using @ali_m's transpose:
In [65]: x1=x.reshape(2,3,8)
In [66]: x1.transpose(0,2,1).flatten()
Out[66]:
array([ 1, 9, 17, 2, 10, 18, 3, 11, 19, 4, 12, 20, 5, 13, 21, 6, 14,
22, 7, 15, 23, 8, 16, 24, 25, 33, 41, 26, 34, 42, 27, 35, 43, 28,
36, 44, 29, 37, 45, 30, 38, 46, 31, 39, 47, 32, 40, 48])
oops - there's an inner layer of nesting that's easy to miss
array([1,9,17,2,10,18,3,11,19,4,12,20,5,13,21,6,14,22,7,15,23,8,16,24],
[25,33,41,26,34,42,27,35,43,28,36,44,29,37,45,30,38,46,31,39,47,32,40,48],
You are missing a [] set. So @ali_m got it right.
I'm tempted to delete this, but my trial and error might be instructive.

Dynamic Arrays in Python using numpy

travel_mat1= numpy.array([[23,23,20,24,28,12,17,10],[11,27,17,19,24,18,23,7],
[17,26,22,13,18,29,30,18],[22,21,28,7,18,29,30,18],[27,16,33,36,10,23,26,25],
[31,13,36,14,26,23,20,27],[34,7,33,20,35,17,14,24],[28,13,27,26,37,11,10,18],
[25,17,33,28,34,10,12,15]])
I need to change the size of the array dynamically with no loss of the actual data in the array. That is, I need a virtual dynamic array.
The above array travel_mat1 is a 9x8 matrix. So if I need an 8x7 matrix from travel_mat1, it should look like:
([[23,23,20,24,28,12,17],[11,27,17,19,24,18,23],[17,26,22,13,18,29,30],
[22,21,28,7,18,29,30], [27,16,33,36,10,23,26],[31,13,36,14,26,23,20],
[34,7,33,20,35,17,14],[28,13,27,26,37,11,10]]).
That is, I need to drop the last row and the last column in this case. How can I do this in Python?
You can use numpy.delete:
>>> numpy.delete(numpy.delete(travel_mat1, 8, 0), 7, 1)
array([[23, 23, 20, 24, 28, 12, 17],
[11, 27, 17, 19, 24, 18, 23],
[17, 26, 22, 13, 18, 29, 30],
[22, 21, 28, 7, 18, 29, 30],
[27, 16, 33, 36, 10, 23, 26],
[31, 13, 36, 14, 26, 23, 20],
[34, 7, 33, 20, 35, 17, 14],
[28, 13, 27, 26, 37, 11, 10]])
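Since the rows and columns being dropped are the trailing ones, basic slicing is another option. The sketch below assumes that a view onto the original data is acceptable; the original 9x8 array stays intact, so you can take a differently sized window later:

```python
import numpy as np

travel_mat1 = np.array([[23,23,20,24,28,12,17,10],[11,27,17,19,24,18,23,7],
                        [17,26,22,13,18,29,30,18],[22,21,28,7,18,29,30,18],
                        [27,16,33,36,10,23,26,25],[31,13,36,14,26,23,20,27],
                        [34,7,33,20,35,17,14,24],[28,13,27,26,37,11,10,18],
                        [25,17,33,28,34,10,12,15]])

rows, cols = 8, 7
sub = travel_mat1[:rows, :cols]  # a view; no data is copied or lost

print(sub.shape)          # (8, 7)
print(travel_mat1.shape)  # (9, 8)

# Same values as the nested numpy.delete call above.
same = np.array_equal(sub, np.delete(np.delete(travel_mat1, 8, 0), 7, 1))
print(same)  # True
```

Note that writing to sub also writes to travel_mat1 because they share memory; call sub.copy() if you need an independent array.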
