So, I have a numpy.ndarray called CT with shape (10, 500).
Each row is a function and defined over the x-variables called Gm. Gm is a numpy.ndarray with shape (1,500).
I need to graph the 10 functions in the CT matrix (as a function of Gm) in one graph and try the following:
# consumption functions over time
plt.figure(figsize=(10,10))
TimeSteps = CT.shape[0]
for t in range(0,TimeSteps):
    plt.plot(Gm,CT[t].reshape(1,DiscG),'go',label='t')
plt.show()
This works, but all graphs are shown with the same color (green) and it is not possible to distinguish if the graph is t = 0, 1, 2, etc.
Any idea how to get plt to choose a different color for each graph, and how to label the graphs and show the labels in a text box?
It is common courtesy when asking a question to include a minimal, verifiable example. The problems you describe are actually examples of the code working as intended, just not the way you want it to. Here is an example of scatter dots with different colors and different labels, as asked in your question and answered by me and @DavidG.
import matplotlib.pyplot as plt
import numpy as np
# dummy data
x = np.random.rand(10, 100)
fig, ax = plt.subplots()
for idx, xi in enumerate(x):
    ax.plot(xi, marker='o', label=idx)
ax.legend()
fig.show()
The colors here come from matplotlib's default color cycle. If you want to use specific colors or change the default cycle, please look at the documentation provided by matplotlib.
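As a rough sketch of changing the cycle, the snippet below samples one colour per series from a colormap and installs the result with ax.set_prop_cycle. The choice of the "viridis" colormap and the "series N" labels are assumptions made for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt

# dummy data: 10 series of 100 points each
x = np.random.rand(10, 100)

fig, ax = plt.subplots()

# Sample one colour per series from a colormap and use it as the colour cycle
cmap = plt.get_cmap("viridis")
colors = [cmap(v) for v in np.linspace(0, 1, len(x))]
ax.set_prop_cycle(color=colors)

for idx, xi in enumerate(x):
    ax.plot(xi, marker="o", label=f"series {idx}")

ax.legend()
```

Call plt.show() (or fig.savefig) afterwards to see the result. Any other colormap name, or a plain list of colour names, should work the same way.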
OK - found another simpler way ... simply to transpose the input:
plt.figure(figsize=(10,10))
plt.plot(Gm.transpose(),CT.transpose(),marker='o')
plt.show()
That way the whole function gets a unique color, and it seems resolved. So my initial guess running a for loop was too complicated.
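To also get the labels asked for in the question, one possible sketch is to keep the list of lines that plot returns and label each one. The data below is made up and only mimics the shapes from the question, (1, 500) for Gm and (10, 500) for CT.

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in data with the shapes from the question (the values are made up)
Gm = np.linspace(0.1, 10, 500).reshape(1, 500)
CT = np.array([np.sqrt(Gm[0]) * (t + 1) for t in range(10)])  # shape (10, 500)

fig, ax = plt.subplots(figsize=(10, 10))

# With 2D inputs, plot() draws one line per column,
# so transposing gives one line per row of CT
lines = ax.plot(Gm.T, CT.T, marker="o")

# plot() returns the lines in column order, so index t matches row t of CT
for t, line in enumerate(lines):
    line.set_label(f"t = {t}")

ax.legend(title="time step")
```

Each line picks up the next colour from the default cycle, and the legend provides the text box that identifies t = 0, 1, 2, and so on.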
I am currently taking a Matplotlib class. I was given an image and asked to recreate it as a 3D subplot, 4 times at 4 different angles. It's a linear plot. As the data changes, the plot changes color. Since I only have an image, I'm not certain where the color changes actually start. I don't want an exact answer, just an explanation of how this would work. I have found many methods for doing this with a small list, but this has 75 data points and I can't seem to do it without adding 75 entries.
I've also tried to understand cmap, but I am confused by it as well.
Also, it needs to be done without Seaborn.
This is part of the photo.
I am finding your question a little bit hard to understand. What I think you need is a function to map the input x/y argument onto a colour in your chosen colour map. See the below example:
import numpy as np
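import matplotlib.pyplot as plt
def number_to_colour(number, total_number):
    # Look up each integer in `number` in a table of `total_number` evenly spaced rainbow colours
    return plt.cm.rainbow(np.linspace(0,1.,total_number))[list(number)]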
x = np.arange(12)
y = x*-3.
z = x
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(x, y, z, c=number_to_colour(x, len(x)))
plt.show()
plt.cm.rainbow(np.linspace(0,1.,total_number)) creates an array of colours of length total_number, evenly spaced across the colour map (in this case rainbow). Modifying the indexing of this array (or changing np.linspace to another function with the desired scaling) should give you the colour scaling that you need.
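For the "4 subplots at 4 angles" part of the question, a minimal sketch could look like the following. It passes the values directly through c= together with cmap=, so no per-point colour list is needed for the 75 points. The data and the viewing angles are invented for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# 75 made-up data points along a straight line in 3D
n_points = 75
x = np.arange(n_points)
y = -3.0 * x
z = x

fig = plt.figure(figsize=(8, 8))

# Hypothetical viewing angles as (elevation, azimuth) in degrees
views = [(20, -60), (20, 30), (60, -60), (0, 0)]

for i, (elev, azim) in enumerate(views, start=1):
    ax = fig.add_subplot(2, 2, i, projection="3d")
    # c=z with cmap lets matplotlib map every value to a colour
    ax.scatter(x, y, z, c=z, cmap="rainbow")
    ax.view_init(elev=elev, azim=azim)
    ax.set_title(f"elev={elev}, azim={azim}")
```

Here the colour follows z. Passing a different array to c (for example a category index per point) changes where the colour transitions occur.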
I am trying to print about 42 plots in 7 rows, 6 columns, but the printed output in jupyter notebook, shows all the plots one under the other. I want them in (7,6) format for comparison. I am using matplotlib.subplot2grid() function.
Note: I do not get any error, and my code works, however the plots are one under the other, vs being in a grid/ matrix form.
Here is my code:
def draw_umap(n_neighbors=15, min_dist=0.1, n_components=2, metric='euclidean', title=''):
    fit = umap.UMAP(
        n_neighbors=n_neighbors,
        min_dist=min_dist,
        n_components=n_components,
        metric=metric
    )
    u = fit.fit_transform(df);
    plots = []
    plt.figure(0)
    fig = plt.figure()
    fig.set_figheight(10)
    fig.set_figwidth(10)
    for i in range(7):
        for j in range(6):
            plt.subplot2grid((7,6), (i,j), rowspan=7, colspan=6)
            plt.scatter(u[:,0], u[:,1], c= df.iloc[:,0])
            plt.title(title, fontsize=8)
n=range(7)
d=range(6)
for n in n_neighbors:
    for d in dist:
        draw_umap(n_neighbors=n, min_dist=d, title="n_neighbors={}".format(n) + " min_dist={}".format(d))
I did refer to this post to get the plots in a grid and followed the code.
I also referred to this post, and modified my code for size of the fig.
Is there a better way to do this using Seaborn?
What am I missing here? Please help!
Both questions that you have linked contain solutions that seem more complicated than necessary. Note that subplot2grid is useful only if you want to create subplots of varying sizes, which I understand is not your case. Also note that, according to the docs, using GridSpec (as demonstrated in the GridSpec demo) is generally preferred, and I would also recommend this only if you want to create subplots of varying sizes.
The simple way to create a grid of equal-sized subplots is to use plt.subplots which returns an array of Axes through which you can loop to plot your data as shown in this answer. That solution should work fine in your case seeing as you are plotting 42 plots in a grid of 7 by 6. But the problem is that in many cases you may find yourself not needing all the Axes of the grid, so you will end up with some empty frames in your figure.
Therefore, I suggest using a more general solution that works in any situation by first creating an empty figure and then adding each Axes with fig.add_subplot as shown in the following example:
import numpy as np # v 1.19.2
import matplotlib.pyplot as plt # v 3.3.4
# Create sample dataset
rng = np.random.default_rng(seed=1) # random number generator
nvars = 8
nobs = 50
xs = rng.uniform(size=(nvars, nobs))
ys = rng.normal(size=(nvars, nobs))
# Create figure with appropriate space between subplots
fig = plt.figure(figsize=(10, 8))
fig.subplots_adjust(hspace=0.4, wspace=0.3)
# Plot data by looping through arrays of variables and list of colors
colors = plt.get_cmap('tab10').colors
for idx, x, y, color in zip(range(len(xs)), xs, ys, colors):
    ax = fig.add_subplot(3, 3, idx+1)
    ax.scatter(x, y, color=color)
This could be done in seaborn as well, but I would need to see what your dataset looks like to provide a solution relevant to your case.
You can find a more elaborate example of this approach in the second solution in this answer.
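For completeness, the plt.subplots approach mentioned earlier could be sketched as follows for a 7 by 6 grid. The parameter values are hypothetical, and random points stand in for the real embedding (the output of fit_transform in the question), so this only illustrates the layout.

```python
import itertools
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=1)

# Hypothetical parameter grids: 7 values x 6 values = 42 panels
n_neighbors_values = [5, 10, 15, 20, 30, 50, 100]
min_dist_values = [0.0, 0.1, 0.25, 0.5, 0.8, 0.99]

# One figure, one Axes per parameter combination
fig, axes = plt.subplots(7, 6, figsize=(18, 21), sharex=True, sharey=True)

params = itertools.product(n_neighbors_values, min_dist_values)
for ax, (n, d) in zip(axes.flat, params):
    # Placeholder for the real 2D embedding of the data
    u = rng.normal(size=(50, 2))
    ax.scatter(u[:, 0], u[:, 1], s=5)
    ax.set_title(f"n_neighbors={n} min_dist={d}", fontsize=8)

fig.tight_layout()
```

The key difference from the code in the question is that the figure is created once, outside the loop, and each parameter combination draws into its own Axes.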
The color map in matplotlib allows marking "bad" values, i.e. NaNs, with a specific color. When we plot the color bar afterwards, this color is not included. Is there a preferred approach to have both the continuous color bar and a discrete legend for the specific color for bad values?
Edit:
Certainly, it's possible to make use of the "extend" functionality. However, this solution is not satisfactory. The function of the legend/colorbar is to clarify the meaning of colors to the user. In my opinion, this solution does not communicate that the value is a NaN.
Code example:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
data = np.random.rand(10, 10)
data[0:3, 0:3] = np.nan # some bad values for set_bad
colMap = cm.RdBu
colMap.set_bad(color='black')
plt.figure(figsize=(10, 9))
confusion_matrix = plt.imshow(data, cmap=colMap, vmin=0, vmax=1)
plt.colorbar(confusion_matrix)
plt.show()
Which produces:
A legend element could be created and used as follows:
from matplotlib.patches import Patch
legend_elements = [Patch(facecolor=colMap(np.nan), label='Bad values')]
plt.legend(handles=legend_elements)
You can do this using one of the approaches used for out-of-range plotting shown at https://matplotlib.org/3.1.1/tutorials/colors/colorbar_only.html#discrete-intervals-colorbar
Set the bad values to a sentinel value, e.g. -999, and use the keyword extend.
Another approach is to use masked plotting as shown here.
Another way could be to use cmap.set_bad(). An example can be found here.
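Putting the pieces together, a minimal sketch that shows both the continuous colorbar and a discrete legend entry for the bad values might look like this. Copying the colormap before calling set_bad, the legend placement, and the label texts are choices made for this example.

```python
import copy
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Patch

data = np.random.rand(10, 10)
data[0:3, 0:3] = np.nan  # some bad values

bad_color = "black"

# Work on a copy so the globally registered colormap is left untouched
cmap = copy.copy(plt.get_cmap("RdBu"))
cmap.set_bad(color=bad_color)

fig, ax = plt.subplots(figsize=(10, 9))

# Masking the NaNs makes it explicit which cells count as "bad"
image = ax.imshow(np.ma.masked_invalid(data), cmap=cmap, vmin=0, vmax=1)
fig.colorbar(image, ax=ax, label="value")

# A single patch in the bad-value colour acts as the discrete legend entry
nan_patch = Patch(facecolor=bad_color, edgecolor="grey", label="NaN (bad value)")
ax.legend(handles=[nan_patch], loc="upper center", bbox_to_anchor=(0.5, -0.05))
```

The colorbar then explains the continuous range, while the legend states explicitly that black cells are NaN.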
Let's say I want to visualize the functions f[n] = e^{-(x-n)^2}/n for n=1...10. Notice that these are not probability distributions.
(not actually the plot I want to do, but close enough).
I'd like to demonstrate it with something like a violin-plot (https://matplotlib.org/gallery/statistics/violinplot.html) where for each n I have a vertical line and I plot the function on both sides of the vertical line.
But violin plots seem to only be used for showing the locations of a sample of data. So all the tools for it require me to give it a data set. The data I want to plot isn't of that type - it's an actual known function.
[if you want more context this is related to an earlier question of mine - https://stats.stackexchange.com/questions/403359/visualizing-2d-data-when-one-dimension-is-discrete-and-the-other-continuous].
The question is a bit broad, so maybe this is not actually what you're looking for. But as I understand it, you just want to plot your function f(x,n) at different positions n and have x on the vertical axis.
import numpy as np
import matplotlib.pyplot as plt
f = lambda x, n: np.exp(-(x-n)**2)/n
x = np.linspace(-2,12,101)
ns = np.arange(1,11)
for n in ns:
    plt.fill_betweenx(x, -f(x,n)+n, f(x,n)+n, color="C0", alpha=0.5)
plt.xlabel("n")
plt.ylabel("x")
plt.xticks(ns)
plt.show()
IIUC, you want something like this:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({n: [np.exp(-(x-n)**2)/n for x in np.arange(-1,1,0.1)] for n in range(1,11)})
fig, ax = plt.subplots(1,1, figsize=(10,10))
ax.violinplot(df.T)
plt.show()
Output:
I have been given data for which I need to plot a histogram. So I used the pandas hist() function and plotted it using matplotlib. The code runs on a remote server, so I cannot view the plot directly and hence I save the image. Here is what the image looks like
Here is my code below
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df_hist = pd.DataFrame(np.array(raw_data)).hist(bins=5)  # raw_data is the data supplied to me
plt.savefig('/path/to/file.png')
plt.close()
As you can see the x axis labels are overlapping. So I used this function plt.tight_layout() like so
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df_hist = pd.DataFrame(np.array(raw_data)).hist(bins=5)
plt.tight_layout()
plt.savefig('/path/to/file.png')
plt.close()
There is some improvement now
But still the labels are too close. Is there a way to ensure the labels do not touch each other and there is fair spacing between them? Also I want to resize the image to make it smaller.
I checked the documentation here https://matplotlib.org/api/_as_gen/matplotlib.pyplot.savefig.html but not sure which parameter to use for savefig.
Since raw_data is not already a pandas dataframe there's no need to turn it into one to do the plotting. Instead you can plot directly with matplotlib.
There are many different ways to achieve what you'd like. I'll start by setting up some data which looks similar to yours:
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import gamma
raw_data = gamma.rvs(a=1, scale=1e6, size=100)
If we go ahead and use matplotlib to create the histogram we may find the xticks too close together:
fig, ax = plt.subplots(1, 1, figsize=[5, 3])
ax.hist(raw_data, bins=5)
fig.tight_layout()
The xticks are hard to read with all the zeros, regardless of spacing. So, one thing you may wish to do would be to use scientific formatting. This makes the x-axis much easier to interpret:
ax.ticklabel_format(style='sci', axis='x', scilimits=(0,0))
Another option, without using scientific formatting would be to rotate the ticks (as mentioned in the comments):
ax.tick_params(axis='x', rotation=45)
fig.tight_layout()
Finally, you also mentioned altering the size of the image. Note that this is best done when the figure is initialised. You can set the size of the figure with the figsize argument. The following would create a figure 5" wide and 3" in height:
fig, ax = plt.subplots(1, 1, figsize=[5, 3])
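Since the image is saved on a remote server, it may help to see how figsize and dpi together determine the pixel size of the saved file. The sketch below uses made-up gamma-distributed data from NumPy instead of scipy, and writes into an in-memory buffer only so that it runs anywhere. Passing a file path to savefig works the same way.

```python
import io
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=0)
raw_data = rng.gamma(shape=1.0, scale=1e6, size=100)  # made-up stand-in data

# 5 x 3 inches at 100 dots per inch gives a 500 x 300 pixel image
fig, ax = plt.subplots(figsize=(5, 3), dpi=100)
ax.hist(raw_data, bins=5)
ax.ticklabel_format(style="sci", axis="x", scilimits=(0, 0))
fig.tight_layout()

# Save into an in-memory buffer; a path such as 'file.png' could be used instead
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=100)
png_bytes = buffer.getvalue()
plt.close(fig)
```

Lowering dpi in savefig shrinks the image in pixels without changing the layout, whereas changing figsize changes how much room the tick labels have.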
I think the two best fixes were mentioned by Pam in the comments.
You can rotate the labels with
plt.xticks(rotation=45)
For more information, look here: Rotate axis text in python matplotlib
The real problem is too many zeros that don't provide any extra info. Numpy arrays are pretty easy to work with, so pd.DataFrame(np.array(raw_data)/1000).hist(bins=5) should remove three zeros from the x-axis tick labels. Then just add a 'kilo' to the axis label.
To change the size of the graph use rcParams.
from matplotlib import rcParams
rcParams['figure.figsize'] = 7, 5.75  # width and height in inches
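As a variation on the rescaling idea, the tick labels can be shown in thousands without altering the data by using a tick formatter. This is only a sketch: the helper name in_thousands, the axis label, and the stand-in data are invented for the example.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

rng = np.random.default_rng(seed=0)
raw_data = rng.gamma(shape=1.0, scale=1e6, size=100)  # made-up stand-in data

def in_thousands(value, pos):
    """Format a tick value in units of one thousand."""
    return f"{value / 1000:g}"

fig, ax = plt.subplots(figsize=(7, 5.75))
ax.hist(raw_data, bins=5)

# Only the tick labels change; the plotted data keeps its original scale
ax.xaxis.set_major_formatter(FuncFormatter(in_thousands))
ax.set_xlabel("value (thousands)")
```

The figure size is passed through figsize here, which affects only this figure, while rcParams changes the default for every figure created afterwards.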