pandas bar chart y-axis min max range in floating number - python

I have a target graph that I want to reproduce using Python's matplotlib.pyplot. However, the graph I get looks different.
How can I change the range of the y-axis? Instead of 0 to 13, I want it to range from min_value-1 to max_value+1, as floating-point values.

Something like this?
Just a simple example.
import numpy as np
import matplotlib.pyplot as plt
array = np.array([110,200,300])
ax = plt.gca()
minimum = array.min()-10
maximum = array.max()+10
ax.set_ylim([minimum,maximum])
plt.bar(range(len(array)), array)
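For the pandas case mentioned in the title, a minimal sketch along the same lines. It assumes the data lives in a pandas Series; the values and the name `s` are made up for illustration. The Axes limits are set after plotting, using floating-point bounds of min-1 and max+1.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data; replace with your own column.
s = pd.Series([10.2, 11.7, 12.9], index=["a", "b", "c"])

fig, ax = plt.subplots()
s.plot.bar(ax=ax)                          # draw the bars on our Axes
ax.set_ylim(s.min() - 1.0, s.max() + 1.0)  # float limits: min-1 .. max+1
```

Call plt.show() afterwards if you are not in a notebook.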

Related

Beginner question: Python scatter plot with normal distribution not plotting

I have an array of random integers for which I have calculated the mean and std, the standard deviation. Next I have an array of random numbers within the normal distribution of this (mean, std).
I want to plot now a scatter plot of the normal distribution array using matplotlib. Can you please help?
Code:
random_array_a = np.random.randint(2,15,size=75) #random array from [2,15)
mean = np.mean(random_array_a)
std = np.std(random_array_a)
sample_norm_distrib = np.random.normal(mean,std,75)
The scatter plot needs x and y axis...but what should it be?
I think what you may want is a histogram of the normal distribution:
import matplotlib.pyplot as plt
%matplotlib inline
plt.hist(sample_norm_distrib)
The closest thing you can do to visualise the distribution of your 1D output is a scatter plot where x and y are the same. This way you can see more accumulation of data in the high-probability areas. For example:
import numpy as np
import matplotlib.pyplot as plt
mean = 0
std = 1
sample_norm_distrib = np.random.normal(mean,std,7500)
plt.figure()
plt.scatter(sample_norm_distrib,sample_norm_distrib)
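Another option, sketched here as an assumption rather than something taken from the answers above, is to use the sample index as x and the sampled value as y. The fixed seed and the dashed mean line are only for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility
random_array_a = rng.integers(2, 15, size=75)  # integers in [2, 15)
mean = random_array_a.mean()
std = random_array_a.std()
sample_norm_distrib = rng.normal(mean, std, 75)

fig, ax = plt.subplots()
x = np.arange(sample_norm_distrib.size)    # 0, 1, ..., 74
points = ax.scatter(x, sample_norm_distrib)
ax.axhline(mean, linestyle="--")           # mark the mean for reference
ax.set_xlabel("sample index")
ax.set_ylabel("sampled value")
```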

Matplotlib: how to locate ticks and showing min and max of data

Good day,
I would like to dynamically locate my ticks and show the min and max of the data (which varies, so I really can't hardcode the conditions). I'm trying to use matplotlib.ticker functions and the best that I can find is MaxNLocator(), but unfortunately it does not consider the limits of my dataset.
What would be the best approach to my problem?
Thanks!
pseudocode as follows:
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator
data1 = range(5)
ax1 = plt.subplot(2,1,1)
ax1.plot(data1)
data2 = range(63)
ax2 = plt.subplot(2,1,2)
ax2.plot(data2)
ax1.xaxis.set_major_locator(MaxNLocator(integer=True))
ax2.xaxis.set_major_locator(MaxNLocator(integer=True))
plt.show()
and the output is:
Not sure about best approach, but one possible way to do this would be to create a list of numbers between your minimum and maximum using numpy.linspace(start, stop, num). The third argument passed to this lets you control the number of points generated. You can then round these numbers using a list comprehension, and then set the ticks using ax.set_xticks().
Note: This will produce unevenly distributed ticks in some cases, which may be unavoidable in your case.
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator
import numpy as np
data1 = range(5)
ax1 = plt.subplot(2,1,1)
ax1.plot(data1)
data2 = range(63) # max of this is 62, not 63 as in the question
ax2 = plt.subplot(2,1,2)
ax2.plot(data2)
ticks1 = np.linspace(min(data1),max(data1),5)
ticks2 = np.linspace(min(data2),max(data2),5)
int_ticks1 = [round(i) for i in ticks1]
int_ticks2 = [round(i) for i in ticks2]
ax1.set_xticks(int_ticks1)
ax2.set_xticks(int_ticks2)
plt.show()
This gives:
Update: This gives a maximum of 5 ticks; however, if the data covers a small range, say range(3), the number of ticks will be smaller. I have updated the creation of int_ticks1 and int_ticks2 so that only unique values are used, which avoids plotting the same tick repeatedly when the range is small.
Using the following data
data1 = range(3)
data2 = range(3063)
# below removes any duplicate ticks and keeps them in ascending order
int_ticks1 = sorted({int(round(i)) for i in ticks1})
int_ticks2 = sorted({int(round(i)) for i in ticks2})
This produces the following figure:
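The approach can be wrapped in a small helper. This is only a sketch: the name ticks_with_limits and its max_ticks parameter are made up here and are not part of Matplotlib. It assumes the values passed in are the x positions being plotted.

```python
import numpy as np
import matplotlib.pyplot as plt

def ticks_with_limits(values, max_ticks=5):
    """Return sorted, unique integer ticks that include min(values) and max(values)."""
    raw = np.linspace(min(values), max(values), max_ticks)
    return sorted({int(round(float(t))) for t in raw})

fig, (ax1, ax2) = plt.subplots(2, 1)
for ax, x in ((ax1, range(3)), (ax2, range(3063))):
    ax.plot(x, x)                         # x positions double as the data here
    ax.set_xticks(ticks_with_limits(x))   # first and last tick sit on min and max
```

Because the intermediate values are rounded, the spacing between ticks is not always equal.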

matplotlib turn an array into a parametric plot

Suppose I have the following script:
import numpy as np
import matplotlib.pyplot as plt
A = np.array([[1,1,1,0],[0,0,1,0],[0,1,0,0],[0,0,0,0]])
How can I plot just the values of A that are equal to 1, leaving the 0's blank? Basically I'm looking to plot just those points, and not as a pcolormesh or something similar.
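If you replace the values you do not want with NaN, they will not be drawn. The array has to be converted to float first, because an integer array cannot hold NaN.
B = A.astype(float)
B[B == 0] = np.nan
plt.plot(B, 'o')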
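A different sketch, which leaves A unmodified, is to look up the positions of the 1s and scatter those. Mapping columns to x and rows to y, with row 0 at the top, is an assumption about the desired layout.

```python
import numpy as np
import matplotlib.pyplot as plt

A = np.array([[1, 1, 1, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]])

rows, cols = np.nonzero(A == 1)  # indices of the entries equal to 1

fig, ax = plt.subplots()
ax.scatter(cols, rows)
ax.set_xlim(-0.5, A.shape[1] - 0.5)
ax.set_ylim(A.shape[0] - 0.5, -0.5)  # inverted so row 0 is drawn at the top
```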

Plotting random point on Function - Pandas

I want to graph a function 2D or 3D
for example a f(x) = sin(x)
Then randomly plot a certain amount of points
I am using IPython and I think this might be possible using Pandas
You can use np.random.uniform to generate a few random points along x-axis and calculate corresponding f(x) values.
import numpy as np
import matplotlib.pyplot as plt
# generate 20 points from uniform (-3,3)
x = np.random.uniform(-3, 3, size=20)
y = np.sin(x)
fig, ax = plt.subplots()
ax.scatter(x,y)
You should post example code so people can demonstrate it more easily.
(numpy.random.random(10)*x_scale)**2
Generate an array of random numbers between 0 and 1 and scale it as appropriate, for example to the interval (-10, 0):
10*numpy.random.random(100) - 10
Then pass this to any function that can calculate the value of f(x) for each element of the array.
Use reshape() if you need to play around with the layout of the array.
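Put together, the scaling idea might look like the sketch below. The function f, the interval (-10, 0) and the fixed seed are illustrative choices, not requirements.

```python
import numpy as np
import matplotlib.pyplot as plt

def f(x):
    return np.sin(x)

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility
lo, hi = -10.0, 0.0
x_rand = lo + (hi - lo) * rng.random(100)  # scale [0, 1) to [-10, 0)
y_rand = f(x_rand)                         # evaluate f at the random points

x_line = np.linspace(lo, hi, 400)
fig, ax = plt.subplots()
ax.plot(x_line, f(x_line))                 # the function itself
ax.scatter(x_rand, y_rand, color="red")    # random points on the curve
```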
If you want to use Pandas...
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
x=np.linspace(0,8)
y=np.sin(x)
DF=pd.DataFrame({'x':x,'y':y})
plot values:
DF.plot(x='x',y='y')
make a random index:
RandIndex=np.random.randint(0,len(DF),size=20)
use it to select from original DF and plot:
DF.iloc[RandIndex].plot(x='x',y='y',kind='scatter',s=120,ax=plt.gca())

Pyplot: using percentage on x axis

I have a line chart based on a simple list of numbers. By default the x-axis is just an increment of 1 for each value plotted. I would like it to be a percentage instead but can't figure out how. So instead of having an x-axis from 0 to 5, it would go from 0% to 100% (but keeping reasonably spaced tick marks). Code below. Thanks!
from matplotlib import pyplot as plt
from mpl_toolkits.axes_grid.axislines import Subplot
data=[8,12,15,17,18,18.5]
fig=plt.figure(1,(7,4))
ax=Subplot(fig,111)
fig.add_subplot(ax)
plt.plot(data)
The code below will give you a simplified x-axis which is percentage based. It assumes that your values are spaced equally between 0% and 100%.
It creates a perc array which holds evenly spaced percentages that can be used to plot with. It then adjusts the formatting for the x-axis so it includes a percentage sign using matplotlib.ticker.FormatStrFormatter. Unfortunately this uses the old-style (printf-style, %) string formatting, as opposed to the new style.
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.ticker as mtick
data = [8,12,15,17,18,18.5]
perc = np.linspace(0,100,len(data))
fig = plt.figure(1, (7,4))
ax = fig.add_subplot(1,1,1)
ax.plot(perc, data)
fmt = '%.0f%%' # Format you want the ticks, e.g. '40%'
xticks = mtick.FormatStrFormatter(fmt)
ax.xaxis.set_major_formatter(xticks)
plt.show()
This is a few months late, but I have created PR#6251 with matplotlib to add a new PercentFormatter class. With this class you can do as follows to set the axis:
import matplotlib.ticker as mtick
# Actual plotting code omitted
ax.xaxis.set_major_formatter(mtick.PercentFormatter(5.0))
This will display values from 0 to 5 on a scale of 0% to 100%. The formatter is similar in concept to what @Ffisegydd suggests doing except that it can take any arbitrary existing ticks into account.
PercentFormatter() accepts three arguments, xmax, decimals, and symbol. xmax allows you to set the value that corresponds to 100% on the axis (in your example, 5).
The other two parameters allow you to set the number of digits after the decimal point and the symbol. They default to None and '%', respectively. decimals=None will automatically set the number of decimal points based on how much of the axis you are showing.
Note that this formatter will use whatever ticks would normally be generated if you just plotted your data. It does not modify anything besides the strings that are output to the tick marks.
Update
PercentFormatter was accepted into Matplotlib in version 2.1.0.
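A complete sketch using the data from the question, assuming Matplotlib 2.1.0 or later. Setting decimals=0 is a choice made here for whole-number labels.

```python
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick

data = [8, 12, 15, 17, 18, 18.5]

fig, ax = plt.subplots(figsize=(7, 4))
ax.plot(data)  # x runs from 0 to len(data) - 1 = 5

# xmax is the data value that should be labelled 100%.
formatter = mtick.PercentFormatter(xmax=len(data) - 1, decimals=0)
ax.xaxis.set_major_formatter(formatter)
```

The tick positions are left to Matplotlib; only the label text changes.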
Totally late in the day, but I wrote this and thought it could be of use:
import pandas as pd
def transformColToPercents(x, rnd, navalue):
    # Returns a pandas Series, scaled from 0 to 100%, that can be put in a new dataframe column.
    # rnd = number of decimal places to round to
    # navalue = value used to replace NaN results
    hv = x.max(axis=0)
    lv = x.min(axis=0)
    pp = pd.Series(((x-lv)*100)/(hv-lv)).round(rnd)
    return pp.fillna(navalue)
df['new column'] = transformColToPercents(df['a'], 2, 0)
