Create a Student-Age graph in Python-Matplotlib - python

import matplotlib.pyplot as plt
x = ['Eric','Jhon','bill','Daniel']
y = [10, 17, 12.5, 20]
plt.plot(x,y)
plt.show()
When I run this code, I get this error:
ValueError: could not convert string to float:
I want all the names in list x on the x-axis, with the corresponding ages taken from the second list y, and I want to show them as a bar graph.
I also have one more question:
Would it be better in my case to build a list of (name, age) tuples, or is there some other approach that would be easier?

The error occurs because matplotlib is expecting numerical data but you're providing strings (the names).
What you can do instead is plot your data using some numerical data and then replace the ticks on the x-axis using plt.xticks as below.
import matplotlib.pyplot as plt
names = ['Eric','John','Bill','Daniel']
x = range(len(names))
y = [10, 17, 12.5, 20]
plt.plot(x, y)
plt.xticks(x, names)
plt.show()
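Since the question asks for a bar graph and about a list of (name, age) tuples, here is a minimal sketch of that variant. The `students` list and the variable names are only illustrative; the data is the same as above. It uses numeric positions plus tick labels, as in the answer, so it should not depend on the Matplotlib version (in Matplotlib 2.1 and later, strings can also be passed directly as x values).

```python
import matplotlib.pyplot as plt

# Hypothetical container: one (name, age) tuple per student
students = [('Eric', 10), ('John', 17), ('Bill', 12.5), ('Daniel', 20)]

# Split the tuples back into two parallel sequences
names, ages = zip(*students)

# Plot against numeric positions, then label the ticks with the names
positions = list(range(len(names)))
fig, ax = plt.subplots()
bars = ax.bar(positions, ages)
ax.set_xticks(positions)
ax.set_xticklabels(names)
ax.set_xlabel('Student')
ax.set_ylabel('Age')
```

Call plt.show() afterwards to display the figure. Keeping the data as tuples makes it harder for names and ages to get out of step, at the cost of one `zip(*...)` call before plotting.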

Related

How to change axis scale in python?

More specifically, how do I change it to look like this graph? I've tried using plt.yscale(), but to no avail, as it only accepts a fixed set of values, and I didn't get very far with plt.axis. This code is a simple attempt at a linear regression on the values shown below; my coefficients (for a function A+Bx) were A=38.99 and B=2.055.
X=np.array([2,4,6,8,10])
Y=np.array([42.0,48.4,51.3,56.3,58.6])
A, B=P.polyfit(X,Y,1)
plt.plot(X,Y,'o')
plt.plot(X,A+B*X)
plt.yscale('linear')
plt.show()
My graph comes out looking like this (graph2), which isn't wrong, but I got curious about how to make it look like the one above and just couldn't figure it out.
I'm using Matplotlib's Object-Oriented API.
import matplotlib as mpl
import matplotlib.pyplot as plt
# The next line is only needed when running in a Jupyter notebook
%matplotlib inline
import numpy as np
X = np.array([2,4,6,8,10])
Y = np.array([42.0,48.4,51.3,56.3,58.6])
A, B = np.polyfit(X,Y,1)  # np.polyfit returns the highest power first: A is the slope, B the intercept
fig, ax = plt.subplots()
ax.plot(X, Y, 'o')
ax.plot(X, B+A*X)
ax.xaxis.set_major_locator(mpl.ticker.FixedLocator([0, 2, 4, 6, 8, 10]))
ax.yaxis.set_major_locator(mpl.ticker.FixedLocator([40, 50, 60]))
ax.set(xlim=[0, 11], ylim=[40, 60], xlabel=r'Mass (kg) $->$', ylabel=r'Length (cm) $->$');
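Note that the names A and B swap roles between the question and this answer. A small sketch of why, assuming the question's `P` refers to `numpy.polynomial.polynomial` (the question does not show its imports, so that is a guess): the two polyfit functions return their coefficients in opposite order.

```python
import numpy as np
from numpy.polynomial import polynomial as P  # assumed meaning of "P" in the question

X = np.array([2, 4, 6, 8, 10])
Y = np.array([42.0, 48.4, 51.3, 56.3, 58.6])

# np.polyfit: coefficients from the highest power down -> (slope, intercept)
slope, intercept = np.polyfit(X, Y, 1)

# numpy.polynomial.polynomial.polyfit: lowest power first -> (intercept, slope)
intercept_p, slope_p = P.polyfit(X, Y, 1)

print(intercept, slope)
print(intercept_p, slope_p)
```

Both calls describe the same line, so either is fine as long as the intercept is added and the slope is multiplied by X when plotting.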

y axis has decreasing values instead of increasing ones for plt

I am trying to build a histogram and here is my code:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
x = ['0','1','2','3','4','5','6','7','8','9','10','11','12','13','14','15','16','17','18','19','20','21','22','23','24','25','26','27','28','29','30','31','32','33','34','35','36','38','40','41','42','43','44','45','48','50','51','53','54','57','60','64','70','77','93','104','108','147'] #sample names
y = ['164','189','288','444','311','216','122','111','92','54','45','31','31','30','18','15','15','10','4','15','2','8','6','4','7','5','3','3','1','10','3','3','3','2','4','2','1','1','1','2','2','1','1','1','1','1','2','1','2','2','2','1','1','2','1','1','1','1']
plt.bar(x, y)
plt.xlabel('Number of Methods')
plt.ylabel('Variables')
plt.show()
Here is the histogram I obtain:
I would like the values in the y axis to be in an increasing order. This means that 1 should be first followed by 3, 5, 7, etc. How can I fix this?
They're not decreasing, they're in the order in which they are in the list, because the list items are strings. Try
x = [int(i) for i in x]
y = [int(i) for i in y]
to convert them to numbers before plotting.
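As a rough illustration, here is the conversion applied to the first five entries of the lists from the question (shortened only to keep the example small). With numbers instead of strings, the y axis becomes an ordinary numeric axis that increases from 0.

```python
import matplotlib.pyplot as plt

# First five entries of the question's lists, still as strings
x = ['0', '1', '2', '3', '4']
y = ['164', '189', '288', '444', '311']

# Convert to integers so Matplotlib uses numeric axes instead of categories
x_num = [int(i) for i in x]
y_num = [int(i) for i in y]

fig, ax = plt.subplots()
ax.bar(x_num, y_num)
ax.set_xlabel('Number of Methods')
ax.set_ylabel('Variables')
```

Call plt.show() afterwards to display the figure.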

Plotting categorical variable against numeric variable in matplotlib

My DataFrame's structure
trx.columns
Index(['dest', 'orig', 'timestamp', 'transcode', 'amount'], dtype='object')
I'm trying to plot transcode (transaction code) against amount to see how much money is spent per transaction. I made sure to convert transcode to a categorical type, as seen below.
trx['transcode']
...
Name: transcode, Length: 21893, dtype: category
Categories (3, int64): [1, 17, 99]
The result I get from doing plt.scatter(trx['transcode'], trx['amount']) is
Scatter plot
While the above plot is not entirely wrong, I would like the X axis to contain just the three possible values of transcode [1, 17, 99] instead of the entire [1, 100] range.
Thanks!
In matplotlib 2.1 and later you can plot categorical variables by using strings. That is, if you provide the column for the x values as strings, matplotlib will treat them as categories.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame({"x" : np.random.choice([1,17,99], size=100),
"y" : np.random.rand(100)*100})
plt.scatter(df["x"].astype(str), df["y"])
plt.margins(x=0.5)
plt.show()
To obtain the same result in matplotlib <=2.0, one would plot against an index instead.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame({"x" : np.random.choice([1,17,99], size=100),
"y" : np.random.rand(100)*100})
u, inv = np.unique(df["x"], return_inverse=True)
plt.scatter(inv, df["y"])
plt.xticks(range(len(u)),u)
plt.margins(x=0.5)
plt.show()
The same plot can be obtained using seaborn's stripplot (after import seaborn as sns):
sns.stripplot(x="x", y="y", data=df)
And a potentially nicer representation can be done via seaborn's swarmplot:
sns.swarmplot(x="x", y="y", data=df)
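For the index-based approach used for matplotlib <=2.0 above, it may help to see what np.unique returns with return_inverse=True. This is a tiny sketch on a made-up array of transcodes, not the real data:

```python
import numpy as np

# Made-up sample containing the three transaction codes
codes = np.array([17, 1, 99, 17, 1])

# u holds the sorted unique values; inv holds, for every element of codes,
# the index of that element within u
u, inv = np.unique(codes, return_inverse=True)

print(u)       # the unique codes, sorted
print(inv)     # positions 0..2 to use as x coordinates
print(u[inv])  # indexing u with inv reconstructs the original array
```

Plotting against `inv` puts each point at position 0, 1 or 2, and `plt.xticks(range(len(u)), u)` then labels those positions with the actual codes.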

matplotlib: histogram is not displaying

I am trying to draw histogram but nothing appears in the Figure Window.
My code is below:
import numpy as np
import matplotlib.pyplot as plt
values = [1000000, 1525097, 2050194, 1095638, 1620736, 2145833, 1191277, 1716375, 1286916, 1382555]
plt.hist(values, 10, histtype = 'bar', facecolor = 'blue')
plt.ylabel("Values")
plt.xlabel("Bin Number")
plt.title("Histogram")
plt.axis([0,11,0,220000])
plt.show()
This is the output:
I am trying to achieve this plot
Any help would be much appreciated...
You are confusing what a histogram is. The histogram that can be produced with the given data is as given below.
A histogram basically counts how many given values fall within a given range.
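To make the counting concrete, here is a minimal sketch that uses np.histogram on the values from the question. With 10 bins, each count says how many of the ten values fall into that bin, so the counts are small integers rather than the values themselves.

```python
import numpy as np

values = [1000000, 1525097, 2050194, 1095638, 1620736,
          2145833, 1191277, 1716375, 1286916, 1382555]

# counts[i] is the number of values between edges[i] and edges[i + 1]
counts, edges = np.histogram(values, bins=10)

print(counts)        # how many values landed in each bin
print(counts.sum())  # every value is counted exactly once
```

This is why the y axis of the histogram only needs to reach a small number, while the x axis has to span the range of the values.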
You have given incorrect arguments to the axis() function. The upper limit should be 2200000; you missed a zero. You have also swapped the arguments: the limits of the x axis come first, followed by the limits of the y axis. This is the modified code:
import numpy as np
import matplotlib.pyplot as plt
values = [1000000, 1525097, 2050194, 1095638, 1620736, 2145833, 1191277, 1716375, 1286916, 1382555]
plt.hist(values, 10, histtype = 'bar', facecolor = 'blue')
plt.ylabel("Values")
plt.xlabel("Bin Number")
plt.title("Histogram")
plt.axis([0,2200000,0,11])
plt.show()
This is the histogram generated:
I finally achieved it...
Here is the code:
import numpy as np
import matplotlib.pyplot as plt
values = [1000000, 1525097, 2050194, 1095638, 1620736, 2145833, 1191277, 1716375, 1286916, 1382555]
strategy = [1,2,3,4,5,6,7,8,9,10]
value = np.array(values)
strategies = np.array(strategy)
plt.bar(strategy, values, .8)
plt.ylabel("Values")
plt.xlabel("Bin Number")
plt.title("Histogram")
plt.axis([1,11,0,2200000])
plt.show()
Output:

How to make X axis in matplotlib/pylab to NOT sort automatically the values?

Whenever I plot, the X axis is sorted automatically (for example, if I enter the values 3, 2, 4, the X axis is ordered from smallest to largest).
How can I make the axis keep the order in which I input the values, i.e. 3, 2, 4?
import pylab as pl
data = pl.genfromtxt('myfile.dat')
pl.axis('auto')
pl.plot(data[:,1], data[:,0])
I found one function, set_autoscalex_on(FALSE) but I'm not sure how to use it or whether it is what I want.
Thanks
You could provide a dummy x-range and then override the xtick labels. I do agree with the comments above questioning whether it's the best solution, but that's hard to judge without any context.
If you really want to, this might be an option:
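import numpy as np
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1,2, figsize=(10,4))
x = [2,4,3,6,1,7]
y = [1,2,3,4,5,6]
# left: x values used as real coordinates, so the axis is sorted
ax[0].plot(x, y)
# right: dummy positions 0..5, relabelled with the original x values
ax[1].plot(np.arange(len(x)), y)
ax[1].set_xticks(np.arange(len(x)))  # one tick per data point, so the labels line up
ax[1].set_xticklabels(x)
plt.show()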
edit: If you work with dates, why not plot the real dates on the axis (and perhaps format them by day of month, if you do want 29 30 1 2 etc. on the axis)?
Maybe you want to set the xticks:
import pylab as pl
data = pl.genfromtxt('myfile.dat')
pl.axis('auto')
xs = pl.arange(data.shape[0])
pl.plot(xs, data[:,0])
pl.xticks(xs, data[:,1])
Another option would be to work with datetimes. If you work with dates, you can use those as input to the plot command.
Working sample:
import random
import pylab as plt
import datetime
from matplotlib.dates import DateFormatter, DayLocator
fig, ax = plt.subplots(2,1, figsize=(6,8))
# Sample 1: use xticks
days = [29,30,31,1,2,3,4,5]
values = [random.random() for x in days]
xs = range(len(days))
plt.sca(ax[0])  # make the first subplot current (instead of passing an Axes to plt.axes)
plt.plot(xs, values)
plt.xticks(xs, days)
# Sample 2: Work with dates
date_strings = ["2013-01-30",
"2013-01-31",
"2013-02-01",
"2013-02-02",
"2013-02-03"]
dates = [datetime.datetime.strptime(x, "%Y-%m-%d") for x in date_strings]
values = [random.random() for x in dates]
plt.sca(ax[1])  # make the second subplot current (instead of passing an Axes to plt.axes)
plt.plot(dates,values)
ax[1].xaxis.set_major_formatter(DateFormatter("%b %d"))
ax[1].xaxis.set_major_locator(DayLocator())
plt.show()
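A further option, sketched here under the assumption that Matplotlib 2.1 or later is available: pass the x values as strings. Strings are treated as categories and placed along the axis in order of first appearance, so the input order is kept. The y values below are made up for illustration.

```python
import matplotlib.pyplot as plt

x = [3, 2, 4]      # values in the order they were entered
y = [10, 20, 15]   # made-up y values for illustration

# Converting to strings turns the x axis into a categorical axis,
# so the points sit at positions 0, 1, 2 labelled '3', '2', '4'
labels = [str(v) for v in x]

fig, ax = plt.subplots()
ax.plot(labels, y, marker='o')
```

Call plt.show() afterwards to display the figure. One caveat: a value that occurs more than once maps to a single category, so repeated x values share one position; in that case the dummy-range approach above is the safer choice.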
