Naturally sorting a pandas DataFrame raises an error - python

I have a pandas DataFrame with the following index:
print(df.index)
MultiIndex(levels=[[u'Day 3', u'Day 4', u'Day 5', u'Day 7', u'Day 9'], [u'D1', u'D10', u'D11', u'D12', u'D2', u'D3', u'D4', u'D5', u'D6', u'D7', u'D8', u'D9'], [1.0, 2.0, 3.0]],
labels=[[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4], [0, 0, 0, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 10, 11, 11, 11, 1, 1, 1, 2, 2, 2, 3, 3, 3, 0, 0, 0, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 10, 11, 11, 11, 1, 1, 1, 2, 2, 2, 3, 3, 3, 0, 0, 0, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 10, 11, 11, 11, 1, 1, 1, 2, 2, 2, 3, 3, 3, 0, 0, 0, 4, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 10, 11, 11, 11, 1, 1, 1, 2, 2, 2, 3, 3, 3, 0, 0, 0, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 10, 11, 11, 11, 1, 1, 1, 2, 2, 2, 3, 3, 3], [1, 2, 0, 1, 2, 0, 0, 2, 1, 0, 1, 2, 2, 0, 1, 0, 2, 1, 1, 2, 0, 1, 0, 2, 2, 0, 1, 0, 1, 2, 2, 1, 0, 1, 2, 0, 0, 2, 1, 0, 2, 1, 2, 0, 1, 0, 2, 1, 1, 0, 2, 0, 1, 2, 0, 2, 1, 2, 0, 1, 0, 2, 1, 0, 2, 1, 2, 0, 1, 0, 2, 1, 2, 1, 0, 0, 2, 1, 1, 2, 0, 0, 2, 1, 0, 1, 2, 0, 1, 2, 2, 1, 0, 1, 0, 2, 1, 0, 2, 0, 1, 2, 2, 0, 1, 1, 0, 2, 1, 2, 0, 1, 1, 2, 0, 2, 1, 0, 1, 2, 0, 0, 1, 2, 0, 1, 2, 2, 1, 0, 1, 0, 2, 2, 0, 1, 0, 1, 2, 0, 2, 1, 2, 0, 1, 1, 2, 0, 0, 2, 1, 0, 2, 1, 0, 2, 1, 2, 1, 0, 0, 2, 1, 2, 0, 1, 2, 0, 1, 2, 1, 0, 1, 2, 0, 2, 1, 0, 1, 2, 0]],
names=[u'Interval', u'Device', u'Well'])
I am sorting it with the following:
df = df.reindex(index=natsorted(df.index))
With this particular df, however, it raises the following error:
raise Exception("cannot handle a non-unique multi-index!")
Exception: cannot handle a non-unique multi-index!
Any help would be greatly appreciated.

I made a minimal example and could reproduce your error. It seems to happen because the same index tuple (Day 3, D1, 1.0) appears twice in arrays, so the MultiIndex is not unique. If you remove one of the duplicates, it works fine.
import pandas as pd
import numpy as np
from natsort import natsorted
arrays = [[u'Day 3', u'Day 3', u'Day 4', u'Day 5', u'Day 7', u'Day 9', u'Day 3', u'Day 4', u'Day 5', u'Day 7', u'Day 9'],
[u'D1', u'D1', u'D10', u'D11', u'D12', u'D2', u'D3', u'D4', u'D5', u'D6', u'D7'],
[1.0, 1.0, 2.0, 3.0, 1.0, 2.0, 1.0, 2.0, 3.0, 1.0, 2.0]]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=[u'Interval', u'Device', u'Well'])
df = pd.Series(np.random.randn(len(arrays[0])), index=index)
print(df.index)
df = df.reindex(index=natsorted(df.index))  # reproduces the error because (Day 3, D1, 1.0) appears twice
Since you mentioned that you use several Excel files, this may be helpful: Merging multiple dataframes with non unique indexes
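If the duplicate rows need to be kept, one option is to locate them first and then reorder by position instead of by label, since reindex needs unique labels. The following is a minimal sketch under stated assumptions: the small Series is made up for illustration, and natural_key is a hypothetical helper standing in for natsort so that the snippet only depends on pandas and NumPy.

```python
import re

import numpy as np
import pandas as pd


def natural_key(value):
    """Hypothetical helper: split a label into text and integer chunks
    so that 'D10' sorts after 'D2'."""
    return [int(tok) if tok.isdigit() else tok
            for tok in re.split(r"(\d+)", str(value))]


# Illustrative data with one duplicated index tuple.
index = pd.MultiIndex.from_tuples(
    [("Day 3", "D10", 1.0), ("Day 3", "D2", 1.0),
     ("Day 3", "D2", 1.0), ("Day 10", "D1", 2.0)],
    names=["Interval", "Device", "Well"],
)
s = pd.Series(np.arange(len(index)), index=index)

# Check for duplicates and list the offending index entries.
print(s.index.is_unique)
dupes = s.index[s.index.duplicated(keep=False)]
print(dupes.tolist())

# Sort row positions by a natural key built from every index level,
# then reorder positionally with iloc, which tolerates duplicate labels.
order = sorted(range(len(s)),
               key=lambda i: [natural_key(level) for level in s.index[i]])
s_sorted = s.iloc[order]
print(s_sorted)
```

Reordering with iloc keeps both copies of the duplicated row; whether they should be kept, dropped, or aggregated depends on how the Excel files were merged.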

Related

prediction to actual label and export result to csv

import pandas as pd
import os
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
predictions = model.predict_generator(Br_test_generator, steps=test_steps_per_epoch)
predicted_classes = np.argmax(predictions, axis=1)
predicted_classes
output= array([3, 1, 0, 3, 5, 0, 0, 0, 6, 0, 0, 3, 6, 0, 1, 0, 0, 2, 2, 2, 2, 2,
1, 1, 0, 2, 2, 6, 0, 0, 0, 1, 1, 0, 0, 2, 0, 1, 1, 1, 1, 1, 1, 1,
6, 0, 5, 1, 3, 1, 0, 2, 2, 1, 1, 1, 1, 2, 2, 2, 4, 1, 5, 1, 0, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 5, 2, 2, 5, 2, 5, 5, 5, 2, 2, 2, 2,
1, 3, 5, 5, 2, 2, 5, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 2, 1, 2, 1, 5,
2, 2, 2, 5, 3, 1, 3, 3, 1, 3, 3, 3, 1, 1, 0, 1, 5, 0, 2, 5, 5, 4,
4, 4, 4, 4, 6, 4, 4, 4, 5, 0, 4, 4, 4, 4, 4, 5, 6, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 1, 2, 2, 5, 5, 6, 5,
5, 6, 1, 6, 4, 5, 4, 1, 4, 5, 0, 2, 5, 5, 5, 2, 2, 2, 6, 6, 5, 6,
6, 6, 6, 6, 4, 6, 2, 6, 6, 2, 0, 2, 5, 6, 6, 6, 4, 4, 0, 6])
true_classes = Bre_test_generator.classes
class_labels = list(Bre_test_generator.class_indices.keys())
class_labels
output=['1B', '2B', '3B', 'CA', 'FB', 'MB', 'NB']
I want to map my predicted_classes to the corresponding class_labels, and I also want to write the result to a CSV file.
The CSV should have two columns: the image ID and the predicted class label.
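A minimal sketch of one way to do this, assuming the predictions come out in the same order as the images. The filenames list below is hypothetical; with a Keras directory generator the IDs are typically taken from its filenames attribute, and the order only lines up if the generator was created with shuffle=False. The column names are also just illustrative.

```python
import numpy as np
import pandas as pd

class_labels = ['1B', '2B', '3B', 'CA', 'FB', 'MB', 'NB']

# First few predicted indices from the question.
predicted_classes = np.array([3, 1, 0, 3, 5])

# Hypothetical image IDs, one per prediction (assumed to be in prediction order).
filenames = ["img_001.png", "img_002.png", "img_003.png",
             "img_004.png", "img_005.png"]

# Translate each integer index into its class label.
predicted_labels = [class_labels[i] for i in predicted_classes]

results = pd.DataFrame({"image_id": filenames,
                        "predicted_label": predicted_labels})

# Without a path, to_csv returns the CSV text; pass a filename
# such as "predictions.csv" to write it to disk instead.
csv_text = results.to_csv(index=False)
print(csv_text)
```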

Counting occurrences of an item in an ndarray based on multiple conditions?

How do you specify multiple conditions in the np.count_nonzero function?
This is for counting the numbers inside an array that have a value between two values. I know you can subtract the outcomes of two individual count_nonzero lines. But I would like to know if there is an easy way to pass multiple conditions to np.count_nonzero.
import numpy as np
array = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0],
[0, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 0],
[0, 1, 1, 2, 3, 3, 3, 3, 3, 3, 3, 3, 2, 1, 1, 0],
[0, 1, 2, 3, 4, 4, 4, 4, 4, 4, 4, 4, 3, 2, 1, 0],
[0, 2, 3, 4, 5, 6, 6, 6, 6, 6, 6, 5, 4, 3, 2, 0],
[0, 2, 3, 4, 5, 6, 7, 8, 8, 7, 6, 5, 4, 3, 2, 0],
[0, 2, 3, 4, 5, 6, 6, 6, 6, 6, 6, 5, 4, 3, 2, 0],
[0, 1, 2, 3, 4, 4, 4, 4, 4, 4, 4, 4, 3, 2, 1, 0],
[0, 1, 1, 2, 3, 3, 3, 3, 3, 3, 3, 3, 2, 1, 1, 0],
[0, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 0],
[0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
# Count occurrences of values between 5 and 8 (i.e. 6, 7 and 8) in array.
result1 = np.count_nonzero(array <= 8)
result2 = np.count_nonzero(array <= 5)
result = result1 - result2
I would like to know if there is a way that looks something like:
np.count_nonzero(array >= 6 and array <= 8)
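Could this be what you are looking for?
np.count_nonzero(np.logical_and(array>=5, array<=8))
# 24 (both bounds are inclusive, so 5 is counted; use array>=6 to match the subtraction in the question)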
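The same condition can also be written with the element-wise & operator, which is equivalent to np.logical_and for boolean masks. The sketch below uses a small made-up array rather than the one from the question, and also shows why the `and` form from the question does not work.

```python
import numpy as np

arr = np.array([[0, 5, 6],
                [7, 8, 9]])

# Element-wise AND of two boolean masks. The parentheses are required
# because & binds more tightly than the comparison operators.
mask = (arr >= 6) & (arr <= 8)
count = np.count_nonzero(mask)
print(count)  # 3

# Python's `and` needs a single truth value for each operand, which
# NumPy refuses to provide for arrays with more than one element.
try:
    np.count_nonzero(arr >= 6 and arr <= 8)
except ValueError as exc:
    print("ValueError:", exc)
```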

Python: TypeError: Only 2-D and 3-D images supported with scikit-image regionprops

Given a numpy.ndarray of the kind
myarray=
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1])
I want to use scikit-image on the array (which is already labelled) to derive some properties.
This is what I do:
myarray.reshape((11,11))
labelled=label(myarray)
props=sk.measure.regionprops(labelled)
But then I get this error:
TypeError: Only 2-D and 3-D images supported.
The traceback points at the props line. What is the problem? The image I am passing to regionprops is already a 2D object.
Shape of myarray:
In [17]: myarray
Out[17]:
array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])
I tried this code and I got no errors:
import numpy as np
from skimage.measure import label, regionprops
myarray = np.random.randint(1, 4, (11,11), dtype=np.int64)
labelled = label(myarray)
props = regionprops(labelled)
Sample output:
In [714]: myarray
Out[714]:
array([[1, 2, 1, 1, 3, 3, 1, 1, 3, 3, 3],
[1, 1, 3, 1, 3, 2, 2, 2, 3, 3, 2],
[3, 3, 3, 1, 3, 3, 1, 1, 2, 3, 1],
[1, 3, 1, 1, 1, 2, 1, 3, 1, 3, 3],
[3, 2, 3, 3, 1, 1, 2, 1, 3, 2, 3],
[3, 2, 1, 3, 1, 1, 3, 1, 1, 2, 2],
[1, 3, 1, 1, 1, 1, 3, 3, 1, 2, 2],
[3, 3, 1, 1, 3, 2, 1, 2, 2, 2, 1],
[1, 1, 1, 3, 3, 2, 2, 3, 3, 3, 1],
[1, 2, 2, 2, 2, 2, 1, 3, 3, 2, 2],
[3, 2, 2, 3, 1, 3, 3, 1, 3, 3, 2]], dtype=int64)
In [715]: labelled
Out[715]:
array([[ 0, 1, 0, 0, 2, 2, 3, 3, 4, 4, 4],
[ 0, 0, 5, 0, 2, 6, 6, 6, 4, 4, 7],
[ 5, 5, 5, 0, 2, 2, 0, 0, 6, 4, 8],
[ 9, 5, 0, 0, 0, 10, 0, 4, 0, 4, 4],
[ 5, 11, 5, 5, 0, 0, 10, 0, 4, 12, 4],
[ 5, 11, 0, 5, 0, 0, 13, 0, 0, 12, 12],
[14, 5, 0, 0, 0, 0, 13, 13, 0, 12, 12],
[ 5, 5, 0, 0, 15, 12, 0, 12, 12, 12, 16],
[ 0, 0, 0, 15, 15, 12, 12, 17, 17, 17, 16],
[ 0, 12, 12, 12, 12, 12, 18, 17, 17, 19, 19],
[20, 12, 12, 21, 22, 17, 17, 18, 17, 17, 19]], dtype=int64)
In [716]: props[0].area
Out[716]: 1.0
In [717]: props[1].centroid
Out[717]: (1.0, 4.4000000000000004)
I noticed that when all the elements of myarray have the same value (as in your example), labelled is an array of zeros. I also read this in the regionprops documentation:
Parameters: label_image : (N, M) ndarray
Labeled input image. Labels with value 0 are ignored.
Perhaps you should use a myarray with more than one distinct value in order to get meaningful properties...
I was having this same issue. After checking Tonechas's answer, I realized I was importing label from scipy instead of skimage:
from scipy.ndimage.measurements import label
I just replaced it with
from skimage.measure import label, regionprops
and everything worked :)
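A likely reason the import matters: SciPy's label returns a tuple of (labelled array, number of features), not just the array, so passing its result straight to regionprops hands over something that is not a 2-D image. A minimal sketch, assuming SciPy is installed and using an all-ones array like the one in the question:

```python
import numpy as np
from scipy import ndimage

myarray = np.ones((11, 11), dtype=int)

# scipy.ndimage.label returns a tuple: (labelled array, number of features).
result = ndimage.label(myarray)
print(type(result))

# Unpack the tuple to get at the 2-D labelled array itself.
labelled, num_features = result
print(labelled.shape, num_features)
```

If the SciPy function is kept, unpacking the tuple and passing only the labelled array to regionprops should avoid the shape complaint.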

reshape numpy 3D array to 2D

I have a very big array with the shape = (32, 3, 1e6)
I need to reshape it to this shape = (3, 32e6)
On a small example, how do I go from this::
>>> m3_3_5
array([[[8, 4, 1, 0, 0],
[6, 8, 5, 5, 2],
[1, 1, 1, 1, 1]],
[[8, 7, 1, 0, 3],
[2, 8, 5, 5, 2],
[1, 1, 1, 1, 1]],
[[2, 4, 0, 2, 3],
[2, 5, 5, 3, 2],
[1, 1, 1, 1, 1]]])
to this::
>>> res3_15
array([[8, 4, 1, 0, 0, 8, 7, 1, 0, 3, 2, 4, 0, 2, 3],
[6, 8, 5, 5, 2, 2, 8, 5, 5, 2, 2, 5, 5, 3, 2],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])
I did try various combinations with reshape with no success::
>>> dd.T.reshape(3, 15)
array([[8, 8, 2, 6, 2, 2, 1, 1, 1, 4, 7, 4, 8, 8, 5],
[1, 1, 1, 1, 1, 0, 5, 5, 5, 1, 1, 1, 0, 0, 2],
[5, 5, 3, 1, 1, 1, 0, 3, 3, 2, 2, 2, 1, 1, 1]])
>>> dd.reshape(15, 3).T.reshape(3, 15)
array([[8, 0, 8, 2, 1, 8, 0, 8, 2, 1, 2, 2, 5, 2, 1],
[4, 0, 5, 1, 1, 7, 3, 5, 1, 1, 4, 3, 5, 1, 1],
[1, 6, 5, 1, 1, 1, 2, 5, 1, 1, 0, 2, 3, 1, 1]])
a.transpose([1,0,2]).reshape(3,15) will do what you want. (I am basically following the comments by @hpaulj.)
In [14]: a = np.array([[[8, 4, 1, 0, 0],
[6, 8, 5, 5, 2],
[1, 1, 1, 1, 1]],
[[8, 7, 1, 0, 3],
[2, 8, 5, 5, 2],
[1, 1, 1, 1, 1]],
[[2, 4, 0, 2, 3],
[2, 5, 5, 3, 2],
[1, 1, 1, 1, 1]]])
In [15]: a.transpose([1,0,2]).reshape(3,15)
Out[15]:
array([[8, 4, 1, 0, 0, 8, 7, 1, 0, 3, 2, 4, 0, 2, 3],
[6, 8, 5, 5, 2, 2, 8, 5, 5, 2, 2, 5, 5, 3, 2],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])
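For the large array in the question, the same idea can be written without hard-coding the sizes. This is a sketch that assumes the layout is always (blocks, rows, columns) and that the blocks should be laid side by side; the -1 lets NumPy work out the second dimension.

```python
import numpy as np

a = np.array([[[8, 4, 1, 0, 0],
               [6, 8, 5, 5, 2],
               [1, 1, 1, 1, 1]],
              [[8, 7, 1, 0, 3],
               [2, 8, 5, 5, 2],
               [1, 1, 1, 1, 1]],
              [[2, 4, 0, 2, 3],
               [2, 5, 5, 3, 2],
               [1, 1, 1, 1, 1]]])

# Bring the row axis (axis 1) to the front so that each output row
# collects the matching row of every block, then flatten the rest.
res = a.transpose(1, 0, 2).reshape(a.shape[1], -1)
print(res.shape)  # (3, 15)
print(res)
```

A plain reshape alone cannot do this because it never reorders elements in memory order; the transpose is what brings rows from different blocks next to each other.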
You can get the desired behavior with np.hstack:
# g is your (3,3,5) array from above
reshaped = np.hstack([g[i, :, :] for i in range(3)])  # np.hstack expects a sequence, so pass a list rather than a generator
reshaped_simpler = np.hstack(g)  # this produces equivalent output to the above statement
print(reshaped)  # shape (3, 15)
Output
array([[8, 4, 1, 0, 0, 8, 7, 1, 0, 3, 2, 4, 0, 2, 3],
[6, 8, 5, 5, 2, 2, 8, 5, 5, 2, 2, 5, 5, 3, 2],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])

convert matrix to image

How would I go about converting a list of lists of ints into a matrix plot in Python?
The example data set is:
[[3, 5, 3, 5, 2, 3, 2, 4, 3, 0, 5, 0, 3, 2],
[5, 2, 2, 0, 0, 3, 2, 1, 0, 5, 3, 5, 0, 0],
[2, 5, 3, 1, 1, 3, 3, 0, 0, 5, 4, 4, 3, 3],
[4, 1, 4, 2, 1, 4, 5, 1, 2, 2, 0, 1, 2, 3],
[5, 1, 1, 1, 5, 2, 5, 0, 4, 0, 2, 4, 4, 5],
[5, 1, 0, 4, 5, 5, 4, 1, 3, 3, 1, 1, 0, 1],
[3, 2, 2, 4, 3, 1, 5, 5, 0, 4, 3, 2, 4, 1],
[4, 0, 1, 3, 2, 1, 2, 1, 0, 1, 5, 4, 2, 0],
[2, 0, 4, 0, 4, 5, 1, 2, 1, 0, 3, 4, 3, 1],
[2, 3, 4, 5, 4, 5, 0, 3, 3, 0, 2, 4, 4, 5],
[5, 2, 4, 3, 3, 0, 5, 4, 0, 3, 4, 3, 2, 1],
[3, 0, 4, 4, 4, 1, 4, 1, 3, 5, 1, 2, 1, 1],
[3, 4, 2, 5, 2, 5, 1, 3, 5, 1, 4, 3, 4, 1],
[0, 1, 1, 2, 3, 1, 2, 0, 1, 2, 4, 4, 2, 1]]
To give you an idea of what I'm looking for, the function MatrixPlot in Mathematica gives me this image for this data set:
Thanks!
You may try
from pylab import *
A = rand(5,5)
figure(1)
imshow(A, interpolation='nearest')
grid(True)
Perhaps matshow() from matplotlib is what you need.
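A minimal sketch of the matshow approach, using a small made-up matrix instead of the full data set. The colormap name and the Agg backend are illustrative choices; the backend is only there so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

data = [[3, 5, 3, 5],
        [5, 2, 2, 0],
        [2, 5, 3, 1]]

fig, ax = plt.subplots()

# matshow draws one colored cell per matrix entry, with row 0 at the top.
image = ax.matshow(data, cmap="viridis")

# Add a color scale so the cell colors can be read back as values.
fig.colorbar(image, ax=ax)
```

Call fig.savefig with a filename to write the figure to disk, or drop the Agg line and call plt.show() to view it interactively.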
You can also use pyplot from matplotlib. The code follows:
from matplotlib import pyplot as plt
plt.imshow(
[[3, 5, 3, 5, 2, 3, 2, 4, 3, 0, 5, 0, 3, 2],
[5, 2, 2, 0, 0, 3, 2, 1, 0, 5, 3, 5, 0, 0],
[2, 5, 3, 1, 1, 3, 3, 0, 0, 5, 4, 4, 3, 3],
[4, 1, 4, 2, 1, 4, 5, 1, 2, 2, 0, 1, 2, 3],
[5, 1, 1, 1, 5, 2, 5, 0, 4, 0, 2, 4, 4, 5],
[5, 1, 0, 4, 5, 5, 4, 1, 3, 3, 1, 1, 0, 1],
[3, 2, 2, 4, 3, 1, 5, 5, 0, 4, 3, 2, 4, 1],
[4, 0, 1, 3, 2, 1, 2, 1, 0, 1, 5, 4, 2, 0],
[2, 0, 4, 0, 4, 5, 1, 2, 1, 0, 3, 4, 3, 1],
[2, 3, 4, 5, 4, 5, 0, 3, 3, 0, 2, 4, 4, 5],
[5, 2, 4, 3, 3, 0, 5, 4, 0, 3, 4, 3, 2, 1],
[3, 0, 4, 4, 4, 1, 4, 1, 3, 5, 1, 2, 1, 1],
[3, 4, 2, 5, 2, 5, 1, 3, 5, 1, 4, 3, 4, 1],
[0, 1, 1, 2, 3, 1, 2, 0, 1, 2, 4, 4, 2, 1]], interpolation='nearest')
plt.show()
The output would be:
