Scatter plot in python

I have an array X of shape 100x2 and the corresponding binary labels in a vector y (values in {1, -1}) of length 100. I would like to draw a scatter plot with the two features on the axes and the color of each data point determined by its label, e.g. red for -1 and yellow for 1.
I've been looking into matplotlib and its scatter function, but it seems to accept only a single feature vector and its label.
I would be grateful for any help.

You can do this easily with seaborn (or with matplotlib). The code is below.
I create a random integer array of shape 100x2 called X, and a random array of 0s and 1s of length 100 called Y:
>>> import numpy as np
>>> X = np.random.randint(100, size=(100, 2))
>>> Y = np.random.choice([0, 1], size=(100))
>>> X
array([[11, 47],
       [23,  2],
       [91, 14],
       [65, 32],
       [81, 78],
       ....
>>> Y
array([0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1,
       0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0,
       1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1,
       0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1])
Then use seaborn's scatterplot, passing the labels as hue:
import seaborn as sns
sns.scatterplot(x=X[:,0], y=X[:,1], hue=Y)
[Figure: output of sns.scatterplot]
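Since the question asked for labels in {-1, 1} drawn in red and yellow, here is a minimal matplotlib-only sketch of the same idea. The data is made up, and the names rng, point_colors and the output file name are just illustrative assumptions; the Agg backend is selected only so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))        # made-up features, shape (100, 2)
y = rng.choice([-1, 1], size=100)    # made-up labels in {-1, 1}

# Map each label to a color name: -1 -> red, 1 -> yellow
point_colors = np.where(y == -1, "red", "yellow").tolist()

fig, ax = plt.subplots()
# c accepts one color per point; the black edge keeps yellow points visible
ax.scatter(X[:, 0], X[:, 1], c=point_colors, edgecolors="black")
ax.set_xlabel("feature 1")
ax.set_ylabel("feature 2")
fig.savefig("scatter_by_label.png")
```

Passing a list of color names through c is what lets scatter color each point individually, so no loop over the two classes is needed.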

Related

How do I calculate the matrix exponential of a sparse matrix?

I'm trying to find the matrix exponential of a sparse matrix:
import numpy as np
b = np.array([[1, 0, 1, 0, 1, 0, 1, 1, 1, 0],
[1, 0, 0, 0, 1, 1, 0, 1, 1, 0],
[0, 1, 1, 0, 1, 1, 0, 0, 1, 1],
[0, 0, 0, 0, 0, 1, 1, 1, 0, 0],
[1, 1, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 1, 0, 0, 1, 0, 0, 1, 1],
[0, 0, 1, 0, 1, 0, 1, 1, 0, 0],
[1, 0, 0, 0, 1, 1, 0, 0, 1, 1],
[0, 0, 0, 0, 1, 0, 1, 1, 1, 0],
[0, 0, 0, 1, 0, 1, 1, 0, 0, 1]])
I can calculate this using scipy.linalg.expm, but it is slow for larger matrices.
from scipy.linalg import expm
S1 = expm(b)
Since this is a sparse matrix, I tried converting b to a scipy.sparse matrix and calling that function on the converted sparse matrix:
import scipy.sparse as sp
import numpy as np
sp_b = sp.csr_matrix(b)
S1 = expm(sp_b);
But I get the following error:
loop of ufunc does not support argument 0 of type csr_matrix which has no callable exp method
How can I calculate the matrix exponential of a sparse matrix?

You need to use scipy.sparse.linalg.expm for your sparse matrix instead of scipy.linalg.expm.
import scipy.sparse as sp
from scipy.sparse.linalg import expm
import numpy as np
b = np.array([[1, 0, 1, 0, 1, 0, 1, 1, 1, 0],
[1, 0, 0, 0, 1, 1, 0, 1, 1, 0],
[0, 1, 1, 0, 1, 1, 0, 0, 1, 1],
[0, 0, 0, 0, 0, 1, 1, 1, 0, 0],
[1, 1, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 1, 0, 0, 1, 0, 0, 1, 1],
[0, 0, 1, 0, 1, 0, 1, 1, 0, 0],
[1, 0, 0, 0, 1, 1, 0, 0, 1, 1],
[0, 0, 0, 0, 1, 0, 1, 1, 1, 0],
[0, 0, 0, 1, 0, 1, 1, 0, 0, 1]])
sp_b = sp.csr_matrix(b)
S1 = expm(sp_b);
Note: As you found, defining your matrix as a CSR matrix gives the warning "SparseEfficiencyWarning: spsolve is more efficient when sparse b is in the CSC matrix format". To get rid of this, you can do as the warning suggests, and define a CSC matrix if that makes sense for your application:
sp_b = sp.csc_matrix(b)
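As a quick sanity check, the sparse and dense routines can be compared on a small matrix. This is only a sketch: the 3x3 matrix a and the aliases dense_expm and sparse_expm are made up for illustration.

```python
import numpy as np
import scipy.sparse as sp
from scipy.linalg import expm as dense_expm
from scipy.sparse.linalg import expm as sparse_expm

# Small made-up matrix; CSC format avoids the SparseEfficiencyWarning
a = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 2.0],
              [0.0, 0.0, 0.0]])
sp_a = sp.csc_matrix(a)

S_sparse = sparse_expm(sp_a)   # the result is itself a sparse matrix
S_dense = dense_expm(a)        # dense reference computed from the same data

# Convert the sparse result to a dense array before comparing
print(np.allclose(S_sparse.toarray(), S_dense))
```

Note that the exponential of a sparse matrix is often much denser than the matrix itself, so the sparse routine mainly helps when the result stays reasonably sparse.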

Does anyone know how to see the data in a cluster after doing k-means clustering?

Is there any code to view the data in each cluster after doing k-means clustering in Python,
so that I can see which data points were assigned to which cluster, and why?
Can anyone help me with this?
The cluster file has a .File extension, so I am unable to open it.

It depends on how you are doing k-means. However, the attribute that holds the cluster assignments (or "labels") is:
KMeans().fit(X).labels_
Code: (source here)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
%matplotlib inline
X= -2 * np.random.rand(100,2)
X1 = 1 + 2 * np.random.rand(50,2)
X[50:100, :] = X1
plt.scatter(X[ : , 0], X[ :, 1], s = 50)
plt.show()
Kmean = KMeans(n_clusters=2).fit(X)
print(Kmean.labels_)
Output:
Kmean.labels_
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
If you were to put the X, X1 and labels_ into a dataframe, it would look like this:
X X1 Labels
0 -1.918458 -1.918458 0
1 -1.378906 -1.378906 0
2 -0.888738 -0.888738 0
3 -1.924301 -1.924301 0
4 -0.619357 -0.619357 0
.. ... ... ...
95 1.893219 1.893219 1
96 2.820921 2.820921 1
97 2.454180 2.454180 1
98 1.599229 1.599229 1
99 2.270729 2.270729 1
[100 rows x 3 columns]
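To actually inspect which rows ended up in which cluster, one option is to attach the labels to a DataFrame and group by them. The sketch below uses hand-written stand-ins: X plays the role of the data passed to KMeans, and labels plays the role of Kmean.labels_. The column names feature_0, feature_1 and cluster are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Stand-ins: X is the data that was clustered, labels is what
# Kmean.labels_ would hold (written by hand here for illustration)
X = np.array([[-1.9, -1.4], [-0.9, -1.9], [1.9, 2.8], [2.4, 1.6], [-0.6, -1.1]])
labels = np.array([0, 0, 1, 1, 0])

df = pd.DataFrame(X, columns=["feature_0", "feature_1"])
df["cluster"] = labels

# Show the rows that fell into each cluster
for cluster_id, members in df.groupby("cluster"):
    print(f"cluster {cluster_id}: {len(members)} points")
    print(members)

# Per-cluster feature means give a rough idea of what each cluster represents
cluster_means = df.groupby("cluster").mean()
print(cluster_means)
```

Comparing the per-cluster means (or other summary statistics) is a simple way to get at the "why": k-means assigns each point to the nearest cluster center, so clusters differ in the features where their means differ.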

Any predicted labels, or any other values that can be mapped to colors, can be visualized this way. All you need is to set the color argument to your color theme array indexed by your labels, like this (the relabeler only remaps the cluster numbers so that the predicted clusters get the same colors as the ground truth classes):
MyColorTheme = np.array(["darkgrey", "lightsalmon", "powderblue"])
MyRelabeler = np.choose(MyCluster.labels_, [2, 0, 1]).astype(np.int64)
plt.subplot(1, 2, 1)
plt.title("My Ground Truth Classification Module")
plt.scatter(x = MyDataFrame[["Petal Length"]], y = MyDataFrame[["Petal Width"]], c = MyColorTheme[MyData.target], s = 50)
plt.subplot(1, 2, 2)
plt.title("K clustering Classification Module")
plt.scatter(x = MyDataFrame[["Petal Length"]], y = MyDataFrame[["Petal Width"]], c = MyColorTheme[MyRelabeler], s = 50)

Plot csv file taking the first column as x axis and the others as y axis once more

Could you please elaborate on how to use the solution from here? I have the same problem: I want to use the first column as the x axis and the following columns as y axis values.
Basically, I want the result to look like a scatter plot, with the values of each row plotted at its x value. My code currently looks like this:
import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv(
"ORBGRAND_Hamming_4LWmax_63storeLWsuccess", sep=", ")
[plt.plot(data[0], data[x]) for x in range(1, len(filecsv[:, 0]))]
plt.grid(True)
plt.suptitle("(63,45) BCH", fontsize=40)
plt.grid(True)
plt.yscale('log')
plt.xticks(fontsize=20)
plt.yticks(fontsize=20)
plt.legend(loc='lower left')
plt.xlabel('$E_b/N_0$ (dB)', fontsize=20)
plt.ylabel('BLER', fontsize=20)
plt.legend(loc='lower left', prop={'size': 17})
plt.show()
My file looks like this:
0.000000, 0, 2, 0, 0, 1, 0, 0, 3, 0, 1
0.500000, 1, 0, 3, 3, 1, 0, 0, 3, 2, 1
1.000000, 4, 5, 0, 0, 0, 0, 0, 0, 0, 1
1.500000, 0, 0, 1, 3, 1, 0, 2, 0, 0, 0
2.000000, 1, 0, 0, 1, 0, 3, 0, 0, 0, 0
2.500000, 0, 0, 0, 0, 5, 1, 0, 1, 0, 1
3.000000, 3, 0, 1, 2, 0, 0, 0, 0, 1, 0
3.500000, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0
4.000000, 2, 2, 0, 0, 0, 0, 3, 0, 0, 0
4.500000, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0
5.000000, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0
5.500000, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0
6.000000, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0
I get the following error:
Expected 85 fields in line 3, saw 88. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.

This creates a scatter plot with the first column as x against all the following columns.
As commented above, your error is not from the plotting itself.
import pandas as pd
import matplotlib.pyplot as plt
# This is just a subset of your posted data
data = [[0.000000, 0, 2, 0, 0, 1, 0, 0, 3, 0, 1],
[0.500000, 1, 0, 3, 3, 1, 0, 0, 3, 2, 1],
[1.000000, 4, 5, 0, 0, 0, 0, 0, 0, 0, 1],
[1.500000, 0, 0, 1, 3, 1, 0, 2, 0, 0, 0],
[2.000000, 1, 0, 0, 1, 0, 3, 0, 0, 0, 0]]
# build the dataframe: the first column is "x", the rest are y1, y2, ...
column_names = [f'y{i}' for i in range(1, len(data[0]))]
df = pd.DataFrame(data, columns=["x"] + column_names)
# You didn't specify what kind of plot; I assumed you want everything in one figure.
# This plots all the y columns against the same x column
plt.figure()
for y in column_names:
    plt.scatter(df["x"], df[y], label=y)
plt.legend()
# Add your aesthetics here
plt.show()
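For reading the file itself, a possible approach is to let pandas split on a plain comma and skip the space that follows it, rather than using the two-character separator ", ". The sketch below assumes the file has no header row and that every row has the same number of fields, as in the excerpt posted; the error message in the question suggests the real file may contain rows of differing length, which this does not handle. The io.StringIO object stands in for the file on disk, and the output file name is made up.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for the file on disk: a few rows copied from the question
csv_text = """0.000000, 0, 2, 0, 0, 1, 0, 0, 3, 0, 1
0.500000, 1, 0, 3, 3, 1, 0, 0, 3, 2, 1
1.000000, 4, 5, 0, 0, 0, 0, 0, 0, 0, 1
"""

# header=None: there is no header row, so columns are numbered 0, 1, 2, ...
# skipinitialspace=True: ignore the space after each comma
df = pd.read_csv(io.StringIO(csv_text), header=None, skipinitialspace=True)

fig, ax = plt.subplots()
for col in df.columns[1:]:
    # column 0 is the shared x axis; every other column is one series
    ax.scatter(df[0], df[col], label=f"y{col}")
ax.set_xlabel("first column")
ax.legend()
fig.savefig("columns_vs_first.png")
```

With a real file, the path would replace io.StringIO(csv_text) in the read_csv call.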

How to find regions of ones surrounded by zeros in a numpy array?

I have a numpy array like
arr1 = np.array([1,1,1,1,0,0,1,1,1,0,0,0,1,1])
arr2 = np.array([1,1,1,1,0,0,0,1,1,1,1,1,1,1])
0-water
1-land
I want to find the indices of the island together with the water surrounding it.
For example, in arr1 water starts at index 4, and the island at indices 6 to 8 is surrounded by two strips of water. So the answer for arr1 is
[4,5,6,7,8,9,10,11]
but in the second case there is no land surrounded by water, so there is no output.

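The following approach pads the array with a one at the start and at the end, and then calculates the differences between consecutive elements: these are 1 when going from water to land, -1 when going from land to water, and 0 everywhere else. If there is more than one transition of each kind, the result runs from the first land-to-water transition up to the last water-to-land transition.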
The following code constructs a series of test cases and visualizes the function. It can serve as a test bed for different definitions of the desired outcome.
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
import numpy as np

def find_islands(arr):
    d = np.diff(np.pad(arr, pad_width=(1, 1), constant_values=1))
    land_starts, = np.nonzero(d == 1)
    land_ends, = np.nonzero(d == -1)
    if len(land_starts) > 1 and len(land_ends) > 1:
        return np.arange(arr.size)[land_ends[0]: land_starts[-1]]
    else:
        return None

def show_array(arr, y, color0='skyblue', color1='limegreen'):
    if arr is not None:
        plt.imshow(arr[np.newaxis, :], cmap=ListedColormap([color0, color1]), vmin=0, vmax=1,
                   extent=[0, arr.size, y, y + 2])

def mark_array(arr, y, color0='none', color1='crimson'):
    if arr is not None:
        pix = np.zeros(arr[-1] + 1)
        pix[arr] = 1
        show_array(pix, y, color0, color1)

tests = [np.array([1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1]),
         np.array([1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1]),
         np.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]),
         np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
         np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]),
         np.array([0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1]),
         np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
         np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0]),
         np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0]),
         np.array([0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0]),
         np.array([1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0])]

for arr, y in zip(tests, range(0, 1000, 5)):
    show_array(arr, y + 2)
    result = find_islands(arr)
    mark_array(result, y)

ax = plt.gca()
ax.relim()
ax.autoscale()
ax.axis('auto')
plt.show()
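For a check without any plotting, the function can be run directly on the two arrays from the question. This is the same find_islands logic, repeated only so that the snippet is self-contained.

```python
import numpy as np

def find_islands(arr):
    # Pad with land on both sides, then look at the jumps between neighbors
    d = np.diff(np.pad(arr, pad_width=(1, 1), constant_values=1))
    land_starts, = np.nonzero(d == 1)    # water -> land transitions
    land_ends, = np.nonzero(d == -1)     # land -> water transitions
    if len(land_starts) > 1 and len(land_ends) > 1:
        return np.arange(arr.size)[land_ends[0]: land_starts[-1]]
    return None

arr1 = np.array([1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1])
arr2 = np.array([1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1])

print(find_islands(arr1).tolist())  # [4, 5, 6, 7, 8, 9, 10, 11]
print(find_islands(arr2))           # None
```

For arr1 the transitions are at indices 4 and 9 (land to water) and 6 and 12 (water to land), so the slice runs from 4 up to, but not including, 12.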

Python Image.fromarray() doesn't accept my ndarray input which is built from a list

I'm trying to visualize a list of 2048280 integers which are either 1's or 0's. There is a function that outputs this list from a (width=1515 height=1352) image file. The function
test_results = [(numpy.argmax(SomeFunctionReturningAnArrayForEachGivenPixel))
for y in xrange(1352) for x in range(1532)]
returns an array of size 2048280 (=1515x1352), as expected. For each y, 1532 values of 1/0 are returned and stored in the array.
Now, when this "test_results" array is returned, I want to save it as an image. So I np.reshape() the array to size (1352,1515,1). All is fine. Logically, I should save this list as a grayscale image. I changed the ndarray data type to 'uint8' and multiplied the pixel values by 127 or 255.
But no matter what I do, the Image.fromarray() function keeps saying that either 'it cannot handle this data type' or 'too many dimensions' or simply gives an error. When I debug it into the Image functions, it looks like the Image library cannot retrieve the array's 'stride'!
All the examples on the net simply reshape the list into an array and save them as an image! Is there anything wrong with my list?
I have already tried various modes ('RGB' , 'L' , '1'). I also changed the data type of my array into uint8, int8, np.uint8(), uint32..
result=self.evaluate(test_data,box) #returns the array
re_array= np.asarray(result,dtype='uint8')
res2 = np.reshape(reray,(1352,1515,1))
res3 =(res2*255)
i = Image.fromarray(res3,'1') ## Raises the exception
i.save('me.png')

For a grayscale image, don't add the trivial third dimension to your array. Leave it as a two-dimensional array: res2 = np.reshape(reray, (1352, 1515)) (assuming reray is the one-dimensional array).
Here's a simple example that worked for me. data is a two-dimensional array with type np.uint8 containing 0s and 1s:
In [29]: data
Out[29]:
array([[0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1],
[0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0],
[1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1],
[1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0],
[1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1],
[1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0],
[0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0],
[1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0],
[1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1],
[0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0]], dtype=uint8)
Create an image from 255*data with mode 'L', and save it as a PNG file:
In [30]: img = Image.fromarray(255*data, mode='L')
In [31]: img.save('foo.png')
When I tried to create the image using mode='1', I wasn't able to get a correct PNG file. Pillow has some known problems with moving between numpy arrays and images with bit depth 1.
Another option is to use numpngw. (I'm the author of numpngw.) It allows you to save the data to a PNG file with bit depth 1:
In [40]: import numpngw
In [41]: numpngw.write_png('foo.png', data, bitdepth=1)
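Putting the Pillow route together, starting from a flat list as in the question, might look like the sketch below. The image size and the checkerboard values are made up, and the in-memory buffer is only a stand-in for a file name. The mode argument is left out here, relying on Pillow treating a two-dimensional uint8 array as 8-bit grayscale.

```python
import io

import numpy as np
from PIL import Image

width, height = 16, 12   # made-up size for illustration
# Flat list of 0s and 1s, one value per pixel, row by row
flat = [(x + y) % 2 for y in range(height) for x in range(width)]

# Reshape to (rows, columns) = (height, width), with no trailing axis of length 1
data = np.asarray(flat, dtype=np.uint8).reshape(height, width)

# A 2-D uint8 array is interpreted as 8-bit grayscale (mode 'L')
img = Image.fromarray(255 * data)

buffer = io.BytesIO()    # in-memory stand-in for a file path such as 'foo.png'
img.save(buffer, format="PNG")
```

Note that the array shape is (height, width), while img.size reports (width, height).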
