Asking for help with Python.
I have a pandas DataFrame where the rows are the "people" and the columns are the subjects. A value of 1 means that there is a relationship between the two, and 0 means there is none. That simple.
I want to plot this relationship as a binary matrix, with the people on the y-axis and the subjects on the x-axis.
The problem is that I can't make this plot "smaller", as in the photo of the objective. I always end up with the "trace".
Example code:
matrixNumpy = matrix.to_numpy()
fig=plt.figure(figsize=(20, 20))
fig.add_subplot(2, 4, 1)
plt.imshow(matrixNumpy, aspect='auto', interpolation='none', cmap='Greys')
Pandas
Objective
How it currently looks
New photos
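import pandas as pd
import matplotlib.pyplot as plt
from networkx.algorithms import bipartite  # assuming networkx's bipartite module

# G is the bipartite graph; Hash and assunto are its row and column nodes
# create matrix
matrix = bipartite.biadjacency_matrix(G, Hash, assunto).todense()
matrix = pd.DataFrame(matrix, index=Hash, columns=assunto)
matrix = matrix.squeeze()
matrix
# plot
matrixNumpy = matrix.to_numpy()
matrixNumpy.shape
fig, axes = plt.subplots(1,2, figsize=(15,15))
ax = axes[0]
ax.imshow(matrixNumpy, aspect='auto', cmap='Greys')
ax = axes[1]
# total_sort_mat is defined elsewhere (not shown)
ax.imshow(total_sort_mat(matrixNumpy), aspect='auto', cmap='Greys')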
TY
It's hard to copy data from a screenshot, so here is my attempt to help you out.
Since you end up with a 2D NumPy array, let's go with a toy example:
import numpy as np
import matplotlib.pyplot as plt
mat = np.random.choice([0, 1], size=(45,), p=[1./3, 2./3]).reshape((3,15))
If we plot this using aspect='auto', we get a result similar to the one you don't want:
plt.figure(figsize=(2,2))
plt.imshow(mat, aspect='auto', interpolation='none', cmap='Greys')
If you use aspect='equal', it returns
plt.imshow(mat, aspect='equal', interpolation='none', cmap='Greys')
Other possible reasons why it is not working:
Since you mentioned in your comment that you get an empty plot with aspect='auto', change figsize=(15,15) to a smaller value such as figsize=(1,1).
If you still get an empty plot after changing the figsize, the matrix may be too large to be rendered. Try plotting a small portion first.
If you are in a Jupyter notebook, check whether previously executed cells are affecting your variables.
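As a rough sketch of the "plot a small portion first" suggestion, the snippet below crops the matrix and sizes the figure from the cropped shape, so that each cell stays visible. The helper name plot_binary_block, its max_rows, max_cols and cell_inches parameters, and the random 500x40 matrix are assumptions made for illustration, not part of the question:

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_binary_block(mat, max_rows=50, max_cols=50, cell_inches=0.15):
    """Plot the top-left block of a 0/1 matrix with square cells (hypothetical helper)."""
    block = np.asarray(mat)[:max_rows, :max_cols]
    n_rows, n_cols = block.shape
    # the figure size grows with the block, so no cell collapses into a thin line
    fig, ax = plt.subplots(figsize=(n_cols * cell_inches, n_rows * cell_inches))
    ax.imshow(block, aspect='equal', interpolation='none',
              cmap='Greys', vmin=0, vmax=1)
    return fig, ax, block

# stand-in for matrixNumpy: 500 "people" x 40 "subjects"
rng = np.random.default_rng(0)
big = rng.integers(0, 2, size=(500, 40))
fig, ax, block = plot_binary_block(big)
```

Call plt.show() afterwards to display the figure. If the cropped block renders fine, you can raise max_rows and max_cols step by step until you find the size at which the plot breaks down.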
Related
I have a 2D array and its contents display correctly as an image when I simply use
img = plt.imshow(full2DArray)
but my problem is that the axes just naively show the number of rows and columns. For example if my 2D array is 53x53 then the axes will count 0-53 on the y-axis and 0-53 on the x-axis.
I need to show the exact same image but have the axes display a linear scale from -130 to +130 instead.
I have a similar answer to this question here, but to explain it for your case: take an array data = np.random.rand(53,53) filled with random values and plot it with imshow. You simply need to set the extent=[<xmin>,<xmax>,<ymin>,<ymax>] parameter, as in this example code:
import numpy as np
import matplotlib.pyplot as plt
data = np.random.rand(53,53)
print(data.shape) # Displays (53,53)
plt.figure()
plt.xlabel("x")
plt.ylabel("y")
plt.imshow(data, origin='lower', aspect='auto',
           extent=[-130, 130, -130, 130], cmap=plt.cm.jet)
plt.colorbar()
plt.show()
We get the following plot with your desired bounds:
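If I understand correctly, you need a predefined axis range instead of having pyplot infer it from the image.
Setting xlim before calling imshow fixes the axis range:
plt.xlim([-130, 130])
Similarly, you can call ylim for the y axis. Note that this only changes the visible axis range; the image itself is still drawn at its pixel coordinates (0 to 53), so use extent if the image should fill the new range.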
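To compare the two suggestions above (extent versus setting xlim first), here is a minimal sketch that draws both side by side and prints where the image actually ends up. The variable names are made up for this example:

```python
import numpy as np
import matplotlib.pyplot as plt

data = np.random.rand(53, 53)
fig, (ax_extent, ax_xlim) = plt.subplots(1, 2)

# Option 1: extent maps the image itself onto [-130, 130] in both directions
im_extent = ax_extent.imshow(data, origin='lower', aspect='auto',
                             extent=[-130, 130, -130, 130])

# Option 2: only the x limits are fixed; the image keeps its pixel coordinates
ax_xlim.set_xlim([-130, 130])
im_xlim = ax_xlim.imshow(data, origin='lower')

print("extent option:", im_extent.get_extent(), ax_extent.get_xlim())
print("xlim option:  ", im_xlim.get_extent(), ax_xlim.get_xlim())
```

Both axes should report x limits of -130 to 130, but only the first image is stretched over that range; in the second, the image still spans its 53 pixel columns inside the wider axis.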
I have got the following problem: I have a sequence of letters (a protein sequence) and I would like to give them a colored background based on a value (I have a matching array of numbers). The end result should look something like this:
I tried pyplot.matshow, stacking my array twice to get a 2D array.
import numpy as np
import matplotlib.pyplot as plt
figure = plt.figure()
axes = figure.add_subplot(111)
protein_seq='KALEPLMLVMGLISPLAT'
seq_markers= list(protein_seq)
# randomly generated array
data=np.random.rand(len(protein_seq))
data2d=[data,data]
# using the matshow() function
caxes = axes.matshow(data2d, cmap=plt.cm.Reds, vmin=0, vmax=2)
# figure.colorbar(caxes)
axes.set_xticklabels(seq_markers)
This gives
I am not sure how I get my labels on the matrix. I attempted using markers, but they tend to be small in a figure. Many thanks in advance!
You can pass the labels to sns.heatmap, which will also take care of choosing the text color depending on the cell's darkness.
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
figure = plt.figure()
axes = figure.add_subplot(111)
protein_seq = 'KALEPLMLVMGLISPLAT'
data = np.random.rand(len(protein_seq))
sns.heatmap(data=data.reshape(1, -1), annot=np.array(list(protein_seq)).reshape(1, -1), fmt='',
            xticklabels=[], yticklabels=[],
            cmap='Reds', vmin=0, vmax=2, square=True, ax=axes, cbar=False)
plt.show()
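If you would rather not depend on seaborn, a similar picture can be sketched with plain matplotlib by drawing the row with imshow and placing one text label per cell. This is a different technique from the sns.heatmap call above, and unlike seaborn it does not adapt the text color to the cell's darkness; the figure size and font size are arbitrary choices for the example:

```python
import numpy as np
import matplotlib.pyplot as plt

protein_seq = 'KALEPLMLVMGLISPLAT'
data = np.random.rand(len(protein_seq))

fig, ax = plt.subplots(figsize=(len(protein_seq) * 0.4, 0.6))
ax.imshow(data.reshape(1, -1), cmap='Reds', vmin=0, vmax=2, aspect='equal')

# imshow puts cell i at x=i, y=0, so each letter is centred on its own cell
for i, letter in enumerate(protein_seq):
    ax.text(i, 0, letter, ha='center', va='center', fontsize=12)

ax.set_xticks([])
ax.set_yticks([])
```

Call plt.show() to display the result.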
I have an MxN (say, 1000x50) array. I want to plot each 50-point line onto the same plot, and have a heatmap of their density.
Simply doing a plt.pcolor(data) is not what I want, since I don't want to plot the matrix.
This is what I want to plot, but as I said it doesn't provide me with the heatmap I need.
import numpy as np
import matplotlib.pyplot as plt
data = np.random.rand(1000, 50)
fig, ax = plt.subplots()
for i in range(0,1000):
    ax.plot(data[i], '.')
plt.show()
I would like a way of getting this together (I assume it will have something to do with histograms and binning?).
EDIT: simply adding an alpha value to the plot ( ax.plot(data[i], '.r', alpha=0.01)) achieves something similar to what I want. I would like, however, to have a heatmap with different colours.
As you already pointed out in your question, probably one of the simplest approaches involves histograms. A linear approximation of the histogram is probably enough for this application.
You can use np.histogram to calculate bin heights and edges and use scipy.interpolate.interp1d to obtain a function that provides an interpolation of the histogram. We can define a simple helper function to get the approximate density around each value in one column of the data array:
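import numpy as np
import scipy.interpolate as interp

def get_density(vals, bins=30, kind="linear"):
    # histogram normalized to a density, evaluated at the bin centers
    y, bin_edges = np.histogram(vals, bins=bins, density=True)
    x = (bin_edges[1:] + bin_edges[:-1])/2.
    # interpolate between the centers and extrapolate beyond the outermost ones
    f = interp.interp1d(x, y, kind=kind, fill_value="extrapolate")
    return f(vals)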
Then you can use any colormap you want to map the density to a color value. The easiest way to go from here is to use plt.scatter instead of plot, where you can provide a specific color for every data point.
I would do something like this:
fig, ax = plt.subplots()
for i in range(data.shape[1]):
    colors = plt.cm.viridis(get_density(data[:, i]))
    ax.scatter(i*np.ones(data.shape[0]), data[:, i], c=colors, marker='.')
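If a gridded heatmap is acceptable instead of colored points, the histogram idea can also be taken one step further. The sketch below is a variation rather than the method above: it bins every column on a shared set of value bins and shows the counts as an image. The bin count of 30 and the seeded random data are assumptions for the example:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
data = rng.random((1000, 50))

n_bins = 30
edges = np.linspace(data.min(), data.max(), n_bins + 1)

# one histogram per column, stacked so that counts[b, j] is the number of
# lines whose j-th point falls into value bin b
counts = np.stack(
    [np.histogram(data[:, j], bins=edges)[0] for j in range(data.shape[1])],
    axis=1,
)

fig, ax = plt.subplots()
im = ax.imshow(counts, origin='lower', aspect='auto', cmap='viridis',
               extent=[-0.5, data.shape[1] - 0.5, edges[0], edges[-1]])
fig.colorbar(im, ax=ax, label='points per bin')
```

Call plt.show() to display it. Because the bins are shared across columns, the colors are directly comparable from one x position to the next.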
I tried to make the title as clear as possible, although I am not sure it is completely clear.
I have three series of data (number of events over time). I would like to make a figure with subplots in which the three time series are represented. Attached is the best I could come up with. The last time series is significantly shorter, which is why it is not visible here.
I'm also adding the corresponding code so you can better understand what I'm trying to do and advise me on the proper/smart way to do so.
import numpy as np
import matplotlib.pyplot as plt
x=np.genfromtxt('nbr_lig_bound1.dat')
x1=np.genfromtxt('nbr_lig_bound2.dat')
x2=np.genfromtxt('nbr_lig_bound3.dat')
# doing so because imshow requires a 2D array
# best way I found and probably not the proper way to get it done
x=np.expand_dims(x, axis=0)
x=np.vstack((x,x))
x1=np.expand_dims(x1, axis=0)
x1=np.vstack((x1,x1))
x2=np.expand_dims(x2, axis=0)
x2=np.vstack((x2,x2))
fig, ax = plt.subplots(nrows=3, ncols=1, figsize=(6,6), sharex=True)
fig.subplots_adjust(hspace=0.001) # this seems to have no effect
# hoping that this would compensate for sharex shrinking my X range to
# the shortest array
ax[0].set_xlim(1,24)
ax[1].set_xlim(1,24)
ax[2].set_xlim(1,24)
p1=ax[0].imshow(x1[:,::10000], cmap='autumn_r')
p2=ax[1].imshow(x2[:,::10000], cmap='autumn_r')
p3=ax[2].imshow(x[:,::10000], cmap='autumn')
Here is what I could reach so far:
and here is a scheme of what I wish to have since I could not find it on the web. In short, I would like to remove the blank spaces around the plotted data in the two upper graphs. And as a more general question I would like to know if imshow is the best way of obtaining such plot (cf intended results below).
Using fig.subplots_adjust(hspace=0) sets the vertical (height) space between subplots to zero but doesn't adjust the vertical space within each subplot. By default, plt.imshow takes its aspect ratio from rc image.aspect, which is usually set so that pixels are square and images are reproduced accurately. To change this, use aspect='auto' and adjust the ylim of your axes accordingly.
For example:
import numpy as np
import matplotlib.pyplot as plt
# you don't need all the `expand_dims` and `vstack`ing. Use `reshape`
x0 = np.linspace(5, 0, 25).reshape(1, -1)
x1 = x0**6
x2 = x0**2
fig, axes = plt.subplots(3, 1, sharex=True)
fig.subplots_adjust(hspace=0)
for ax, x in zip(axes, (x0, x1, x2)):
    ax.imshow(x, cmap='autumn_r', aspect='auto')
    ax.set_ylim(-0.5, 0.5) # alternatively pass extent=[0, 1, 0, 24] to imshow
    ax.set_xticks([]) # remove all xticks
    ax.set_yticks([]) # remove all yticks
plt.show()
yields
To add a colorbar, I recommend looking at this answer which uses fig.add_axes() or looking at the documentation for AxesDivider (which I personally like better).
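For the fig.add_axes() route, a minimal sketch could look like the following. The toy rows, the shared vmin/vmax of 0 and 1, and the rectangle coordinates are assumptions chosen for this example; with your own data you would pick limits that cover all three series:

```python
import numpy as np
import matplotlib.pyplot as plt

# toy rows that all live in [0, 1], so one colour scale fits every strip
rows = [np.linspace(1, 0, 25).reshape(1, -1) ** p for p in (1, 2, 6)]

fig, axes = plt.subplots(3, 1, sharex=True)
fig.subplots_adjust(hspace=0, right=0.85)  # leave room on the right
for ax, row in zip(axes, rows):
    im = ax.imshow(row, cmap='autumn_r', aspect='auto', vmin=0, vmax=1)
    ax.set_yticks([])

# [left, bottom, width, height] in figure coordinates
cax = fig.add_axes([0.88, 0.11, 0.03, 0.77])
fig.colorbar(im, cax=cax)
```

Call plt.show() to display it. Sharing vmin and vmax matters here: the colorbar is built from the last image only, so it describes the other strips correctly only if they use the same color scale.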
I have a set of coordinates, say [(2,3),(45,4),(3,65)].
I need to plot them as a matrix. Is there any way to do this in matplotlib? I want it to look something like this: http://imgur.com/Q6LLhmk
Edit: My original answer used ax.scatter. There is a problem with this: If two points are side-by-side, ax.scatter may draw them with a bit of space in between, depending on the scale:
For example, with
data = np.array([(2,3),(3,3)])
Here is a zoomed-in detail:
So here is an alternative solution that fixes this problem:
import matplotlib.pyplot as plt
import numpy as np
data = np.array([(2,3),(3,3),(45,4),(3,65)])
N = data.max() + 5
# color the background white (1 is white)
arr = np.ones((N,N), dtype = 'bool')
# color the dots black (0)
arr[data[:,1], data[:,0]] = 0
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.imshow(arr, interpolation='nearest', cmap = 'gray')
ax.invert_yaxis()
# ax.axis('off')
plt.show()
No matter how much you zoom in, the adjacent squares at (2,3) and (3,3) will remain side-by-side.
Unfortunately, unlike ax.scatter, using ax.imshow requires building an N x N array, so it could be more memory-intensive than using ax.scatter. That should not be a problem unless data contains very large numbers, however.
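If the coordinates are large enough for the N x N array to become a concern, one possible workaround (a different technique from the imshow solution above) is to draw one unit square per point with a PatchCollection, which needs memory proportional to the number of points only. The padding of 5 mirrors the example above; everything else is an assumption for illustration:

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import PatchCollection
from matplotlib.patches import Rectangle

data = np.array([(2, 3), (3, 3), (45, 4), (3, 65)])

# one unit square per coordinate, centred on (x, y) like an imshow pixel
squares = [Rectangle((x - 0.5, y - 0.5), 1, 1) for x, y in data]

fig, ax = plt.subplots()
ax.add_collection(PatchCollection(squares, facecolor='black', edgecolor='none'))

# set the limits explicitly rather than relying on autoscaling
pad = 5
ax.set_xlim(-0.5, data[:, 0].max() + pad)
ax.set_ylim(-0.5, data[:, 1].max() + pad)
ax.set_aspect('equal')
```

Call plt.show() to display it. Since the squares are defined in data coordinates, adjacent points such as (2,3) and (3,3) should stay side-by-side when zooming, as with imshow.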