I have a list containing array elements:
[array([2.40460915, 0.85513601]), array([1.80998096, 0.97406986]), array([2.14505475, 0.96109123]),
array([2.12467111, 0.93991277])]
I want to plot that list using matplotlib by iterating over each element in the list and plotting the ith element with plt.scatter(x,y), where x is the first element of the array at the ith position and y is the second element.
I am not very familiar with how to do this indexing in Python, and no matter how I try to solve this, I cannot get a plot.
for i in range(len(list)):
    # plt.scatter(x,y) for x,y as described above
Can anyone tell me an easy way to do this?
from numpy import array
import matplotlib.pyplot as plt
a = [array([2.40460915, 0.85513601]), array([1.80998096, 0.97406986]), array([2.14505475, 0.96109123]),
     array([2.12467111, 0.93991277])]
# *i unpacks the array i into two positional arguments (i[0], i[1]), which plt.scatter takes as x and y
for i in a:
    plt.scatter(*i)
plt.show()
You can unpack the list a and zip its elements, which groups the x values together and the y values together.
One-liner to plot as you want:
plt.scatter(*zip(*a))
which is equivalent to x,y=zip(*a); plt.scatter(x,y)
import numpy as np
import matplotlib.pyplot as plt
a=[np.array([2.40460915, 0.85513601]), np.array([1.80998096, 0.97406986]), np.array([2.14505475, 0.96109123]), np.array([2.12467111, 0.93991277])]
plt.scatter(*zip(*a)) #x,y=zip(*a)
plt.show()
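To see what zip(*a) does without any plotting, here is a minimal sketch on a small made-up list of pairs (the name pairs and its values are assumptions for illustration, not data from the question). The inner * passes each pair as a separate argument, and zip regroups the items by position, which amounts to a transpose:

```python
# Hypothetical (x, y) pairs, used only to illustrate the unpacking.
pairs = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0)]

# zip(*pairs) is the same as zip((1.0, 10.0), (2.0, 20.0), (3.0, 30.0)):
# it groups all first items together and all second items together.
x, y = zip(*pairs)

print(x)  # (1.0, 2.0, 3.0)
print(y)  # (10.0, 20.0, 30.0)
```

The outer * in plt.scatter(*zip(*a)) then passes those two tuples as the x and y arguments.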
This would do it:
import matplotlib.pyplot as plt
import numpy as np
a= [np.array([2.40460915, 0.85513601]),
np.array([1.80998096, 0.97406986]),
np.array([2.14505475, 0.96109123]),
np.array([2.12467111, 0.93991277])]
plt.scatter([i[0] for i in a], [i[1] for i in a]) # just this line here
plt.show()
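If NumPy is already in use, another option (a sketch, not part of the answer above) is to stack the list into one 2-D array and slice its columns. The name points is an assumption made for illustration:

```python
import numpy as np

a = [np.array([2.40460915, 0.85513601]),
     np.array([1.80998096, 0.97406986]),
     np.array([2.14505475, 0.96109123]),
     np.array([2.12467111, 0.93991277])]

# Stack the four length-2 arrays into a single array of shape (4, 2).
points = np.array(a)

x = points[:, 0]  # first column: all x values
y = points[:, 1]  # second column: all y values

print(points.shape)  # (4, 2)
```

Calling plt.scatter(x, y) on these two columns should then draw all four points in one call.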
There are many solutions to this question. Here are two that should be easy to follow:
Solution 1: many scatters
for i in range(len(data)):
    point = data[i] #the ith element in data
    x = point[0] #the first coordinate of the point, x
    y = point[1] #the second coordinate of the point, y
    plt.scatter(x,y) #plot the point
plt.show()
Solution 2: one scatter (I recommend this if you are not familiar with indexing)
x = []
y = []
for i in range(len(data)):
    point = data[i]
    x.append(point[0])
    y.append(point[1])
plt.scatter(x,y)
plt.show()
Try converting the list into a pandas DataFrame (after import pandas as pd) with
data = pd.DataFrame(data=your_list)  # your_list is a placeholder for your list of arrays
and then plot the data.
I am trying to build a histogram and here is my code:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
x = ['0','1','2','3','4','5','6','7','8','9','10','11','12','13','14','15','16','17','18','19','20','21','22','23','24','25','26','27','28','29','30','31','32','33','34','35','36','38','40','41','42','43','44','45','48','50','51','53','54','57','60','64','70','77','93','104','108','147'] #sample names
y = ['164','189','288','444','311','216','122','111','92','54','45','31','31','30','18','15','15','10','4','15','2','8','6','4','7','5','3','3','1','10','3','3','3','2','4','2','1','1','1','2','2','1','1','1','1','1','2','1','2','2','2','1','1','2','1','1','1','1']
plt.bar(x, y)
plt.xlabel('Number of Methods')
plt.ylabel('Variables')
plt.show()
Here is the histogram I obtain:
I would like the values in the y axis to be in an increasing order. This means that 1 should be first followed by 3, 5, 7, etc. How can I fix this?
They're not decreasing, they're in the order in which they are in the list, because the list items are strings. Try
x = [int(i) for i in x]
y = [int(i) for i in y]
to convert them to numbers before plotting.
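Matplotlib treats a list of strings as category labels and places them in the order in which they first appear, which is why the axis is not sorted numerically. A small sketch of the conversion on a shortened, made-up sample (the values and the names x_str, y_str, x_num, y_num are illustrative, not the full lists from the question):

```python
# Shortened, made-up sample of string data.
x_str = ['0', '1', '2', '10']
y_str = ['164', '189', '288', '45']

# Convert every item to an integer so the values are treated as numbers.
x_num = [int(v) for v in x_str]
y_num = [int(v) for v in y_str]

# Strings compare character by character, so '45' counts as the largest.
print(max(y_str))  # 45
# Integers compare by value.
print(max(y_num))  # 288
```

Passing x_num and y_num to plt.bar gives a numeric axis instead of a categorical one.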
Here is my resulting plot below but I would like it to look like the truncated dendrograms in astrodendro such as this:
There is also a really cool looking dendrogram from this paper that I would like to recreate in matplotlib.
Below is the code for generating an iris data set with noise variables and plotting the dendrogram in matplotlib.
Does anyone know how to either: (1) truncate the branches like in the example figures; and/or (2) to use astrodendro with a custom linkage matrix and labels?
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
import astrodendro
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial import distance
def iris_data(noise=None, palette="hls", desat=1):
    # Iris dataset
    X = pd.DataFrame(load_iris().data,
                     index = [*map(lambda x:f"iris_{x}", range(150))],
                     columns = [*map(lambda x: x.split(" (cm)")[0].replace(" ","_"), load_iris().feature_names)])
    y = pd.Series(load_iris().target,
                  index = X.index,
                  name = "Species")
    # map_colors is a helper that is not defined in this snippet
    c = map_colors(y, mode=1, palette=palette, desat=desat)#y.map(lambda x:{0:"red",1:"green",2:"blue"}[x])
    if noise is not None:
        X_noise = pd.DataFrame(
            np.random.RandomState(0).normal(size=(X.shape[0], noise)),
            index=X.index,
            columns=[*map(lambda x:f"noise_{x}", range(noise))]
        )
        X = pd.concat([X, X_noise], axis=1)
    return (X, y, c)
def dism2linkage(DF_dism, method="ward"):
    """
    Input: A (m x m) dissimilarity Pandas DataFrame object where the diagonal is 0
    Output: Hierarchical clustering encoded as a linkage matrix
    Further reading:
    http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.cluster.hierarchy.linkage.html
    https://pypi.python.org/pypi/fastcluster
    """
    #Linkage Matrix
    # .values replaces as_matrix(), which was removed in pandas 1.0
    Ar_dist = distance.squareform(DF_dism.values)
    return linkage(Ar_dist,method=method)
# Get data
X_iris_with_noise, y_iris, c_iris = iris_data(50)
# Get distance matrix
df_dism = 1- X_iris_with_noise.corr().abs()
# Get linkage matrix
Z = dism2linkage(df_dism)
#Create dendrogram
with plt.style.context("seaborn-white"):
    fig, ax = plt.subplots(figsize=(13,3))
    D_dendro = dendrogram(
        Z,
        labels=df_dism.index,
        color_threshold=3.5,
        count_sort = "ascending",
        #link_color_func=lambda k: colors[k]
        ax=ax
    )
    ax.set_ylabel("Distance")
I'm not sure this really constitutes a practical answer, but it does allow you to generate dendrograms with truncated hanging lines. The trick is to generate the plot as normal, then manipulate the resulting matplotlib plot to recreate the lines.
I couldn't get your example to work locally, so I've just created a dummy dataset.
from matplotlib import pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
import numpy as np
a = np.random.multivariate_normal([0, 10], [[3, 1], [1, 4]], size=[5,])
b = np.random.multivariate_normal([0, 10], [[3, 1], [1, 4]], size=[5,])
X = np.concatenate((a, b),)
Z = linkage(X, 'ward')
fig = plt.figure()
ax = fig.add_subplot(1,1,1)
dendrogram(Z, ax=ax)
The resulting plot is the usual long-arm dendrogram.
Now for the more interesting bit. A dendrogram is made up of a number of LineCollection objects (one for each colour). To update the lines we iterate through these, extracting the details about their constituent paths, modifying these to remove any lines reaching to a y of zero, and then recreating a LineCollection for these modified paths.
The updated path is then added to the axes, and the original is removed.
The one tricky part is determining what height to draw to instead of zero. Since we are iterating over each dendrogram path, we don't know which point came before, so we have no idea where we are in the tree. However, we can exploit the fact that hanging lines hang vertically. Assuming no two lines share the same x, we can look up the other known y values for a given x and use them as the basis for the new y. The downside is that, to make sure we have this number, we have to pre-scan the data.
Note: If you can get dendrogram hanging lines on the same x, you would need to include the y and search for nearest y above this x to do this.
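The pre-scan idea can be sketched on plain coordinates, independent of matplotlib. This is only an illustration under the same assumption as above (no two hanging lines share an x); the function name truncate_hanging and the gap parameter are hypothetical names introduced here:

```python
def truncate_hanging(vertices, gap=0.5):
    """Return a copy of `vertices` with points at y == 0 lifted.

    `vertices` is a list of (x, y) points tracing one dendrogram link.
    """
    # Pre-scan: remember the lowest non-zero y seen at each x.
    y_at_x = {}
    for x, y in vertices:
        if y > 0 and y < y_at_x.get(x, float("inf")):
            y_at_x[x] = y

    # Second pass: replace y == 0 with (lowest y at that x) - gap,
    # never going below zero.
    result = []
    for x, y in vertices:
        if y == 0:
            y = max(0, y_at_x.get(x, 0) - gap)
        result.append((x, y))
    return result


# One U-shaped link: up from a leaf at x=5, across at height 2.0,
# and down to a leaf at x=15.
link = [(5.0, 0.0), (5.0, 2.0), (15.0, 2.0), (15.0, 0.0)]
print(truncate_hanging(link))
# [(5.0, 1.5), (5.0, 2.0), (15.0, 2.0), (15.0, 1.5)]
```

The two legs that reached down to zero now stop 0.5 below the horizontal bar, which is the same rule the matplotlib code applies to each path.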
import numpy as np
from matplotlib.path import Path
from matplotlib.collections import LineCollection
fig = plt.figure()
ax = fig.add_subplot(1,1,1)
dendrogram(Z, ax=ax)
for c in ax.collections[:]: # use [:] to get a copy, since we're adding to the same list
    paths = []
    for path in c.get_paths():
        segments = []
        y_at_x = {}
        # Pre-pass over all elements, to find the lowest y value at each x value.
        # We can use this to calculate where to cut our lines.
        for n, seg in enumerate(path.iter_segments()):
            x, y = seg[0]
            # Don't store if the y is zero, or if it's higher than the current low.
            if y > 0 and y < y_at_x.get(x, np.inf):
                y_at_x[x] = y
        for n, seg in enumerate(path.iter_segments()):
            x, y = seg[0]
            if y == 0:
                # If we know the last y at this x, use it - 0.5, limit > 0
                y = max(0, y_at_x.get(x, 0) - 0.5)
            segments.append([x,y])
        paths.append(segments)
    lc = LineCollection(paths, colors=c.get_colors()) # Recreate a LineCollection with the same params
    ax.add_collection(lc)
    c.remove() # Remove the original LineCollection (ax.collections.remove(c) fails on recent matplotlib)
The resulting dendrogram looks like this:
How can I make a scatter plot of a list of pairs in Python, with each axis of the plot representing one of the values in the pair? My list looks like this:
[(62725984, 63548262), (64797631, 64619047), (65069350, 65398449), (58960696, 57416785), (58760119, 58666604), (60470606, 61338129), (60728760, 59001882)]
This should be easy. You can extract the pair into two variables as follows:
x,y = zip(*<name_of_your_2d_list>)
You can also pass the same expression directly to the scatter function:
matplotlib.pyplot.scatter(*zip(*<name_of_your_2d_list>))
Try the following. It should work:
import matplotlib.pyplot, pylab
data = [(62725984, 63548262), (64797631, 64619047), (65069350, 65398449), (58960696, 57416785), (58760119, 58666604), (60470606, 61338129), (60728760, 59001882)]
matplotlib.pyplot.scatter(*zip(*data))
matplotlib.pyplot.show()
Try the code below:
import matplotlib.pyplot
import pylab
list1 = [(62725984, 63548262), (64797631, 64619047), (65069350, 65398449), (58960696, 57416785), (58760119, 58666604), (60470606, 61338129), (60728760, 59001882)]
list1 = list(zip(*list1))
pylab.scatter(list(list1[0]),list(list1[1]))
pylab.show()
You can use the function below.
import matplotlib.pyplot as plt
def scatter_plot(pairs):
    # collect the first and second item of every pair
    x = []
    y = []
    for i in pairs:
        x.append(i[0])
        y.append(i[1])
    plt.scatter(x,y)
    plt.show()
And simply use this function as below.
scatter_plot(list_of_list)
I have a text file that consists of 3 columns:
1st column contains the X coordinate
2nd column contains the Y coordinate
3rd column contains 0 or 1
So far I draw all the coordinates:
import matplotlib.pyplot as plt
import numpy as np
x, y = np.loadtxt("coordinates.txt",delimiter=' ',skiprows=1, usecols=(0,1),unpack=True)
plt.plot(x,y)
plt.show()
I want to draw only those coordinates where the value of the 3rd column is 1.
Please help me.
Hope this helps:
import matplotlib.pyplot as plt
import numpy as np
f = np.loadtxt('coordinates.txt',delimiter=' ',skiprows=1)
f = f[f[:,2] == 1]
x = f[:,0]
y = f[:,1]
plt.plot(x, y, 'ro')  # pass the arrays directly; wrapping them in lists creates one series per point
plt.show()
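The filtering step relies on a NumPy boolean mask. Here is a minimal sketch that can be run without the file, using an in-memory stand-in for coordinates.txt (the sample rows and the names text, mask and selected are assumptions for illustration):

```python
import io
import numpy as np

# Stand-in for coordinates.txt: a header row, then x, y and a 0/1 flag.
text = """x y flag
1.0 2.0 1
3.0 4.0 0
5.0 6.0 1
"""

data = np.loadtxt(io.StringIO(text), delimiter=' ', skiprows=1)

mask = data[:, 2] == 1   # boolean array, True where the flag is 1
selected = data[mask]    # keeps only the rows where mask is True

x = selected[:, 0]
y = selected[:, 1]
print(x)  # [1. 5.]
print(y)  # [2. 6.]
```

Only the rows flagged with 1 remain, so plotting x against y shows just those coordinates.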
The long way to do this is to use a loop that plots (let's say) dots based on their position within the list. It might still be helpful to you, considering your comments.
Based on your comments, the data you're dealing with is being treated as a string. Be sure to check the types of your data if you're planning to do any programming. https://www.tutorialspoint.com/python/python_variable_types.htm
import numpy as np
import matplotlib.pyplot as plt
data = np.loadtxt('coordinates.txt',delimiter=' ',skiprows=1)
x_data = data[:,0] # [every row, "1st" column]
y_data = data[:,1] # [every row, "2nd" column]
z_data = data[:,2] # [every row, "3rd" column]
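#check every number in z_data and, if it is equal to your desired condition,
#plot a blue circle ('bo') at the coordinates where that condition is satisfied (x_data[i], y_data[i])
#np.loadtxt parses the columns as floats by default, so compare with the number 1
for i in range(len(z_data)):
    if z_data[i] == 1:
        plt.plot(x_data[i], y_data[i], 'bo')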
You can also plot every dot, and make them different like this:
for i in range(len(z_data)):
    if z_data[i] == 1:
        plt.plot(x_data[i], y_data[i], 'bo') #ones are blue dots
    else:
        plt.plot(x_data[i], y_data[i], 'ro') #zeros are red dots
plt.show()
I would definitely recommend that you do some research on how to read data and how to deal with it when it's read (for example: converting strings to floats), because this is not the proper way to do this, but it will do the trick.
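As an alternative to one plot call per point, a single scatter call can take one colour per point. This is a sketch on made-up sample data, not the contents of coordinates.txt; the arrays and the colors list are assumptions for illustration, and the Agg backend is chosen only so the snippet runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Made-up sample standing in for the three columns of the file.
x = np.array([1.0, 3.0, 5.0, 7.0])
y = np.array([2.0, 4.0, 6.0, 8.0])
z = np.array([1.0, 0.0, 1.0, 0.0])

# One colour name per point: blue where the flag is 1, red otherwise.
colors = ['blue' if flag == 1 else 'red' for flag in z]

fig, ax = plt.subplots()
ax.scatter(x, y, c=colors)  # a single call draws all points
```

With an interactive backend, plt.show() would display the blue and red dots together.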
I have a NumPy array with the values 0, 1 and 2. I want to separate them into different arrays and plot them. How can I do that?
for i in range(2):
    if i==0:
        z = [i]
    elif i==1:
        y = [i]
    else:
        w = [i]
This is what I tried.
Just use the histogram function from pyplot:
import numpy as np
import matplotlib.pyplot as plt
y = np.random.randint(0,3,100)
plt.hist(y)
plt.show()
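If separate arrays are needed in addition to the histogram, boolean masks can split the values. A minimal sketch on a small made-up array (the names values, groups and counts are assumptions for illustration):

```python
import numpy as np

# Made-up sample containing only the values 0, 1 and 2.
values = np.array([0, 1, 2, 1, 0, 2, 2, 1, 0, 2])

# One array per distinct value, selected with a boolean mask.
groups = {int(v): values[values == v] for v in np.unique(values)}

# Number of occurrences of 0, 1 and 2, which is what the histogram shows.
counts = np.bincount(values)

print(groups[2])  # [2 2 2 2]
print(counts)     # [3 3 4]
```

Each entry of groups can then be plotted on its own, or counts can be passed to plt.bar.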