I have the following data, which needs to be linearly classified using least squares. I wanted to visualise the data by plotting the features coloured by class, but I got the following error when building the colour list colour_cond:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Note that data_t is made of 1s and 0s.
import numpy as np
import matplotlib.pyplot as plt
import glob
from scipy.io import loadmat
%matplotlib inline
data = glob.glob('Mydata_A.mat')
data_c1 = np.array([loadmat(entry, variable_names= ("X"), squeeze_me=True)["X"][:,0] for entry in data])
data_c2 = np.array([loadmat(entry, variable_names= ("X"), squeeze_me=True)["X"][:,1] for entry in data])
data_t = np.array([loadmat(entry, variable_names= ("T"), squeeze_me=True)["T"][:] for entry in data])
colour_cond=['red' if t==1 else 'blue' for t in data_t]
plt.scatter(data_c1,data_c2,colour=colour_cond)
plt.xlabel('X1')
plt.ylabel('X2')
plt.title('Training Data (X1,X2)')
plt.show()
Your problem is that the arrays data_c1, data_c2 and data_t seem to have more than one dimension. In the following line:
colour_cond=['red' if t==1 else 'blue' for t in data_t]
the variable t is not a scalar but a NumPy array, and the truth value of t == 1 is ambiguous for non-scalar NumPy objects. I suggest you ravel (i.e. flatten) all your arrays:
import glob
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import loadmat
%matplotlib inline
# glob returns the list of matching file names to iterate over
entries = glob.glob('Mydata_A.mat')
data_c1 = np.array([
    loadmat(entry, variable_names=("X"), squeeze_me=True)["X"][:, 0]
    for entry in entries]).ravel()
data_c2 = np.array([
    loadmat(entry, variable_names=("X"), squeeze_me=True)["X"][:, 1]
    for entry in entries]).ravel()
data_t = np.array([
    loadmat(entry, variable_names=("T"), squeeze_me=True)["T"][:]
    for entry in entries]).ravel()
colour_cond = ['red' if t==1 else 'blue' for t in data_t]
plt.scatter(data_c1, data_c2, color=colour_cond)
plt.xlabel('X1')
plt.ylabel('X2')
plt.title('Training Data (X1,X2)')
plt.show()
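The failure can be reproduced without the .mat file. The sketch below uses a small made-up label array (the values are an assumption, not the asker's data) with the extra leading dimension you get when a single loaded array is wrapped in a list. Iterating over it yields a whole row, while iterating over the ravelled array yields scalars.

```python
import numpy as np

# Hypothetical labels standing in for data_t; shape (1, 4), as produced by
# wrapping one loaded array in a list and calling np.array on it.
data_t = np.array([[1, 0, 0, 1]])

for t in data_t:
    # t is the whole row here, so t == 1 is an array of booleans
    print(t == 1)

try:
    colours = ['red' if t == 1 else 'blue' for t in data_t]
except ValueError as err:
    print("ValueError:", err)

# After ravel() each t is a scalar and the comparison is unambiguous
flat_t = data_t.ravel()
colours = ['red' if t == 1 else 'blue' for t in flat_t]
print(colours)

# Vectorised alternative that avoids the Python-level loop
colours_np = np.where(flat_t == 1, 'red', 'blue')
```

Either colours or colours_np can then be passed to plt.scatter through the color keyword.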
I am trying to build a histogram and here is my code:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
x = ['0','1','2','3','4','5','6','7','8','9','10','11','12','13','14','15','16','17','18','19','20','21','22','23','24','25','26','27','28','29','30','31','32','33','34','35','36','38','40','41','42','43','44','45','48','50','51','53','54','57','60','64','70','77','93','104','108','147'] #sample names
y = ['164','189','288','444','311','216','122','111','92','54','45','31','31','30','18','15','15','10','4','15','2','8','6','4','7','5','3','3','1','10','3','3','3','2','4','2','1','1','1','2','2','1','1','1','1','1','2','1','2','2','2','1','1','2','1','1','1','1']
plt.bar(x, y)
plt.xlabel('Number of Methods')
plt.ylabel('Variables')
plt.show()
Here is the histogram I obtain:
I would like the values in the y axis to be in an increasing order. This means that 1 should be first followed by 3, 5, 7, etc. How can I fix this?
They're not decreasing, they're in the order in which they are in the list, because the list items are strings. Try
x = [int(i) for i in x]
y = [int(i) for i in y]
to convert them to numbers before plotting.
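As a rough illustration of why the strings matter, the sketch below takes only the first four entries of the lists from the question. It shows that strings sort character by character while integers sort numerically, and that the conversion yields plain ints that matplotlib can place on a numeric axis.

```python
# A short excerpt of the question's lists (first four entries only)
x = ['0', '1', '2', '3']
y = ['164', '189', '288', '444']

# Strings compare character by character, so '10' sorts before '2'
print(sorted(['9', '10', '2']))   # lexicographic order
print(sorted([9, 10, 2]))         # numeric order

# Convert to numbers before handing the lists to plt.bar
x_num = [int(i) for i in x]
y_num = [int(i) for i in y]
print(x_num, y_num)
```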
I have a list containing array elements:
[array([2.40460915, 0.85513601]), array([1.80998096, 0.97406986]), array([2.14505475, 0.96109123]),
array([2.12467111, 0.93991277])]
I want to plot that list using matplotlib by iterating over the list and plotting the ith element with plt.scatter(x, y), where x is the first element of the array at position i and y is the second.
I am not very familiar with indexing in Python, and no matter what I try, I cannot get a plot.
for i in range(len(list)):
    # plt.scatter(x,y) for x,y as described above
Can anyone tell me an easy way to do this?
from numpy import array
import matplotlib.pyplot as plt
a = [array([2.40460915, 0.85513601]), array([1.80998096, 0.97406986]), array([2.14505475, 0.96109123]),
array([2.12467111, 0.93991277])]
# *i unpacks i into a tuple (i[0], i[1]), which is interpreted as (x,y) by plt.scatter
for i in a:
    plt.scatter(*i)
plt.show()
You can zip the unpacked values of numpy array a.
One-liner to plot as you want:
plt.scatter(*zip(*a))
which is equivalent to x,y=zip(*a); plt.scatter(x,y)
import numpy as np
import matplotlib.pyplot as plt
a=[np.array([2.40460915, 0.85513601]), np.array([1.80998096, 0.97406986]), np.array([2.14505475, 0.96109123]), np.array([2.12467111, 0.93991277])]
plt.scatter(*zip(*a)) #x,y=zip(*a)
plt.show()
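To make the zip(*a) idiom less opaque, here is a minimal sketch that uses plain tuples in place of the NumPy arrays (the values are rounded copies of the ones above, chosen only for illustration). It shows that zip(*a) transposes a list of points into one sequence of x values and one of y values.

```python
# Plain (x, y) pairs standing in for the NumPy arrays; values rounded for brevity
a = [(2.40, 0.86), (1.81, 0.97), (2.15, 0.96), (2.12, 0.94)]

# zip(*a) is zip(a[0], a[1], a[2], a[3]): it pairs up the first items,
# then the second items, i.e. it transposes the list of points
x, y = zip(*a)
print(x)
print(y)
```

plt.scatter(x, y) then draws all points in a single call, which is what plt.scatter(*zip(*a)) does in one step.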
This would do it:
import matplotlib.pyplot as plt
import numpy as np
a= [np.array([2.40460915, 0.85513601]),
np.array([1.80998096, 0.97406986]),
np.array([2.14505475, 0.96109123]),
np.array([2.12467111, 0.93991277])]
plt.scatter([i[0] for i in a], [i[1] for i in a]) # just this line here
plt.show()
There are many solutions to this question. Here are two that should be easy to follow:
Solution 1: many scatters
for i in range(len(data)):
    point = data[i]  # the ith element of data
    x = point[0]  # the first coordinate of the point, x
    y = point[1]  # the second coordinate of the point, y
    plt.scatter(x, y)  # plot the point
plt.show()
Solution 2: one scatter (recommended if you are not familiar with indexing)
x = []
y = []
for i in range(len(data)):
    point = data[i]
    x.append(point[0])
    y.append(point[1])
plt.scatter(x, y)
plt.show()
Try converting the list into a pandas DataFrame (replace a with your list of arrays):
data = pd.DataFrame(data=a)
and then plot the resulting data.
I wrote a simple function to plot log in python:
import matplotlib.pyplot as plt
import numpy as np
x = list(range(1, 10000, 1))
y = [-np.log(p/10000) for p in x]
plt.scatter(x, y) # also tried with plt.plot(x, y)
plt.show()
I just want to see how the plot looks.
fn.py:5: RuntimeWarning: divide by zero encountered in log
y = [-np.log(p/10000) for p in x]
I get the above error and on top of that I get a blank plot with even the ranges wrong.
Why is there a divide-by-zero warning when I am dividing by a nonzero number?
How can I correctly plot the function?
Although you have tagged the question python-3.x, it seems you are running Python 2.x, where p/10000 evaluates to 0 for every p < 10000 because the / operator performs integer division on two ints. np.log(0) then triggers the divide-by-zero warning. If that is the case, write 10000.0 instead of 10000 to get float division.
The .0 is not needed in Python 3, where / always performs float division, so your original code already works in Python 3.6.5. The following works in both versions:
import matplotlib.pyplot as plt
import numpy as np
x = list(range(1, 10000, 1))
y = [-np.log(p/10000.0) for p in x]
plt.scatter(x, y)
plt.show()
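The integer-division behaviour can be imitated in Python 3 with the // operator. The sketch below is only an illustration with an arbitrary value of p; it shows that floor division collapses the ratio to 0 (and np.log(0) is what emits the divide-by-zero warning), while true division keeps the fraction.

```python
import math

p = 5  # arbitrary sample value below 10000

# Floor division reproduces what Python 2's / does for two ints
print(p // 10000)

# True division (Python 3's /, or Python 2 with a float operand)
print(p / 10000.0)

# With a nonzero ratio the logarithm is finite
value = -math.log(p / 10000.0)
print(value)
```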
On a different note: You can simply use NumPy's arange to generate x and avoid the list completely and use vectorized operation.
x = np.arange(1, 10000)
y = -np.log(x/10000.0)
Why import numpy and then avoid using it? You could have simply done:
from math import log
import matplotlib.pyplot as plt
x = range(1, 10000)  # range works in both Python 2 and 3; xrange exists only in Python 2
y = [-log(p / 10000.0) for p in x]
plt.scatter(x, y)
plt.show()
If you're going to bring numpy into the picture, think about doing things in a numpy-like fashion:
import matplotlib.pyplot as plt
import numpy as np
f = lambda p: -np.log(p / 10000.0)
x = np.arange(1, 10000)
plt.scatter(x, f(x))
plt.show()
I have an array containing 5 different numbers:
array([2.40064633, 4.10132553, 8.59968518, 2.40290345, 1.39988773])
and I want to plot the lines on the x axis (parallel to the y axis) equal to each of these numbers i.e.
x = 2.4006463
x = 4.10132553 so on and so forth for all of the numbers in the array.
I tried using plot(x = array[...]) but without success.
Is there a clean way of doing this using numpy or matplotlib?
This will work:
import matplotlib.pyplot as plt
b = [2.40064633, 4.10132553, 8.59968518, 2.40290345, 1.39988773]
for l in b:
    plt.axvline(l)
plt.show()
Or, if it is a NumPy array:
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(1,4)
for l in x:
    plt.axvline(l)
plt.show()
Here is my take. It is quite similar to Rahul's, only with dashed lines.
import matplotlib.pyplot as plt
import numpy as np
xcoords = np.array([2.40064633, 4.10132553, 8.59968518, 2.40290345, 1.39988773])
for xc in xcoords:
    plt.axvline(x=xc, color='k', linestyle='--')
plt.show()
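If you would rather avoid the loop, one possible alternative (not taken from the answers above) is Axes.vlines, which accepts the whole array at once. This is a sketch under the assumption that full-height lines are wanted; the blended x-axis transform makes the y values refer to the axes height rather than to data units.

```python
import matplotlib.pyplot as plt
import numpy as np

xcoords = np.array([2.40064633, 4.10132553, 8.59968518, 2.40290345, 1.39988773])

fig, ax = plt.subplots()
# The blended transform interprets x in data units and y in axes units,
# so ymin=0 and ymax=1 span the full height of the axes
lines = ax.vlines(xcoords, 0, 1, transform=ax.get_xaxis_transform(),
                  colors='k', linestyles='--')
# plt.show()  # uncomment in an interactive session to display the figure
```

The call returns a single LineCollection, which is convenient if you later want to restyle or remove all the lines together.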