Altair chart: Show less lines in the grid - python

I'm working on a chart using Altair, and I'm trying to figure out how to have fewer lines in the background grid. Is there a term for that background grid?
Here's a chart that looks like mine, that I took from the tutorial:
Let's say that I want to have half as many grid lines on the X axis. How could I do that?

Grid lines are drawn at the location of ticks, so to adjust the grid lines you can adjust the ticks. For example:
import altair as alt
import numpy as np
import pandas as pd
x = np.arange(100)
source = pd.DataFrame({
    'x': x,
    'f(x)': np.sin(x / 5)
})
# tickCount is a hint for the approximate number of ticks (and so grid lines)
alt.Chart(source).mark_line().encode(
    x=alt.X('x', axis=alt.Axis(tickCount=4)),
    y='f(x)'
)
You can see other tick-related properties in the documentation for alt.Axis.

Related

plot overlaps using matplotlib

I am learning matplotlib.
I am trying to plot the two series below in a single figure using matplotlib.
But they overlap.
Here is my code.
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
train_error = [0.26462888486225206, 0.26462383329393313, 0.2628674962680674, 0.2553700231555298, 0.17473717177688022, 0.14773444580059242, 0.1468299949185866, 0.1468235689407127, 0.1439370366766204]
test_error = [0.8438224756653776, 0.8442034650577578, 1.018608707726192, 4.853704454584892, 123.69312582226338, 798.4569874115062, 3205.5264038946007, 9972.587330411312, 10787335.618580218]
plt.plot(train_error)
plt.plot(test_error)
plt.show()
What am I doing wrong? Can anyone please help?
Use subplot.
See https://matplotlib.org/stable/gallery/subplots_axes_and_figures/subplots_demo.html for examples.
plt.subplot(1,2,1)
plt.plot(train_error)
plt.subplot(1,2,2)
plt.plot(test_error)
In plt.subplot(a,b,x), a and b are the number of rows and columns in the grid, and x is the index of the selected subplot, counting from 1, left to right and top to bottom.
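The same two-panel layout can also be written with the object-oriented interface. The sketch below assumes side-by-side panels are what is wanted. The error values are rounded from the lists in the question, and the log scale on the second panel is an optional extra, not part of the answer above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Values rounded from the lists in the question.
train_error = [0.2646, 0.2646, 0.2629, 0.2554, 0.1747, 0.1477, 0.1468, 0.1468, 0.1439]
test_error = [0.8438, 0.8442, 1.0186, 4.8537, 123.69, 798.46, 3205.5, 9972.6, 10787335.6]

# One row, two columns: each series gets its own axes and its own y range.
fig, (ax_train, ax_test) = plt.subplots(1, 2, figsize=(10, 4))

ax_train.plot(train_error)
ax_train.set_title("train error")

ax_test.plot(test_error)
ax_test.set_yscale("log")  # optional: test error spans about seven orders of magnitude
ax_test.set_title("test error")

fig.tight_layout()
# plt.show()  # uncomment when running interactively
```

With separate axes, the small changes in the training error stay visible instead of being flattened by the much larger test error.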

How do you change the spread of the Y axis of pandas box plot?

I am plotting 100 data points for 9 different groups. One group's data points are much larger than all the other groups so when I make a box graph using pandas only that group is shown, while all other groups are squashed to the bottom. Here is what it looks like now: [image: smushed box plot]
I would like the Y axis to be more spaced out so that I can see the other groups' box graphs. Here is similar data in a scatter plot that has the spacing I am looking for: [image: well spaced scatter plot]
Here is my code at the moment:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("residues.csv")
df.plot.box()
plt.show()
It looks like you want y to be log-scaled:
df.plot.box(logy=True)
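A minimal sketch of the log-scaled box plot, using synthetic data in place of residues.csv, which is not available here. The column names and distributions are made up for illustration. Note that the data must be strictly positive for a log axis to show it.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical stand-in for residues.csv: 100 points per group,
# with one group far larger than the others.
df = pd.DataFrame({
    "group_a": rng.lognormal(mean=0.0, sigma=0.5, size=100),
    "group_b": rng.lognormal(mean=1.0, sigma=0.5, size=100),
    "group_c": rng.lognormal(mean=8.0, sigma=0.5, size=100),
})

# logy=True switches the y axis to a logarithmic scale.
ax = df.plot.box(logy=True)
ax.set_ylabel("value (log scale)")
# plt.show()  # uncomment when running interactively
```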
Try this:
boxplot = df.boxplot(column=df.columns)
plt.show()
Reference
See the pandas documentation on boxplot: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.boxplot.html

Controlling Bin Widths in Altair

I have a set of numbers that I'd like to plot on a histogram.
Say:
import numpy as np
import matplotlib.pyplot as plt
my_numbers = np.random.normal(size = 1000)
plt.hist(my_numbers)
If I want to control the size and range of the bins I could do this:
plt.hist(my_numbers, bins=np.arange(-4,4.5,0.5))
Now, if I want to plot a histogram in Altair the code below will do, but how do I control the size and range of the bins in Altair?
import pandas as pd
import altair as alt
my_numbers_df = pd.DataFrame.from_dict({'Integers': my_numbers})
alt.Chart(my_numbers_df).mark_bar().encode(
alt.X("Integers", bin = True),
y = 'count()',
)
I have searched Altair's docs but all their explanations and sample charts (that I could find) just said bin = True with no further modification.
Appreciate any pointers :)
As demonstrated briefly in the Bin transforms section of the documentation, you can pass an alt.Bin() instance to fine-tune the binning parameters.
The equivalent of your matplotlib histogram would be something like this:
alt.Chart(my_numbers_df).mark_bar().encode(
alt.X("Integers", bin=alt.Bin(extent=[-4, 4], step=0.5)),
y='count()',
)
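To sanity-check that `extent=[-4, 4]` with `step=0.5` describes the same bins as `np.arange(-4, 4.5, 0.5)`, the edges and counts can be computed directly with NumPy. This is only a cross-check with an arbitrary fixed seed; it does not call Altair.

```python
import numpy as np

rng = np.random.default_rng(42)
my_numbers = rng.normal(size=1000)

# Same edges as the matplotlib call: -4.0, -3.5, ..., 4.0
edges = np.arange(-4, 4.5, 0.5)

# Samples outside [-4, 4] fall outside every bin and are not counted.
counts, returned_edges = np.histogram(my_numbers, bins=edges)

print(len(edges) - 1, "bins of width", edges[1] - edges[0])
print(counts)
```

The stop value of 4.5 in `np.arange` is needed because the stop is exclusive; the last edge produced is 4.0, which matches the upper end of the extent.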

Black bar covering my x labels for matplotlib plot?

I am trying to plot a figure and I am having a black box pop up on the bottom of the plot where the x labels should be. I tried this command from a similar question asked here in the past:
from matplotlib import rcParams
rcParams.update({'figure.autolayout': True})
But the problem was still the same. Here is my current code:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import rcParams
rcParams.update({'figure.autolayout': True})
df['date'] = df['date'].astype('str')
pos = np.arange(len(df['date']))
plt.bar(pos,df['value'])
ticks = plt.xticks(pos, df['value'])
And my plot is attached here. Any help would be great!
pos = np.arange(len(df['date'])) and ticks = plt.xticks(pos, df['value']) are causing the problem you are having. You are putting an xtick at every value you have in the data frame.
I don't know what your data looks like or what the most sensible approach is, but ticks = plt.xticks(pos[::20], df['value'].values[::20], rotation=90) will put a tick every 20 rows, which should make the plot more readable.
It actually is not a black bar, but rather all of your x-axis labels being crammed into too small a space. You can try rotating the axis labels to create more space, or just remove them altogether.
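The tick-thinning idea from the first answer can be sketched with made-up data. The DataFrame below is a hypothetical stand-in for the asker's `df`, and labelling the ticks with the `date` column (rather than `value`, as in the question) is an assumption about what was intended.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data: 100 daily values.
df = pd.DataFrame({
    "date": pd.date_range("2020-01-01", periods=100).strftime("%Y-%m-%d"),
    "value": np.random.default_rng(1).random(100),
})

pos = np.arange(len(df))
fig, ax = plt.subplots()
ax.bar(pos, df["value"])

# Label every 20th bar only, and rotate the labels so they do not collide.
step = 20
ax.set_xticks(pos[::step])
ax.set_xticklabels(df["date"].values[::step], rotation=90)

fig.tight_layout()
# plt.show()  # uncomment when running interactively
```

The step of 20 is arbitrary; a value that leaves roughly five to ten labels on the axis is usually readable.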

matplotlib figure horizontal axis label automatic alignment or rescale

I was trying to plot time series data using matplotlib. The problem is that there are too many observations, so the labels overlap and do not fit within the figure.
I am considering three solutions: shrinking the label font size, rotating the labels so they are vertical or slanted, or labelling only the first and last few observations with dots in between. The code below demonstrates the problem.
Can anyone help? Thanks
from datetime import date
import numpy as np
from pandas import date_range
import matplotlib.pyplot as plt
N = 100
data = np.array(np.random.randn(N))
time_index = date_range(date.today(), periods = len(data))
plt.plot(time_index, data)
For your simple plot, you could do
plt.xticks(rotation=90)
Alternatively, you could specify which ticks to display, and their labels, with
plt.xticks(<certain range of values>, <labels for those values>)
Edit:
Personally, I would switch to matplotlib's object-oriented interface.
f = plt.figure()
ax = f.add_subplot(111)
ax.plot(<stuff>)
ax.tick_params(axis='x', labelsize=8)
plt.setp(ax.xaxis.get_majorticklabels(), rotation=90)
# OR
xlabels = ax.get_xticklabels()
for label in xlabels:
    label.set_rotation(90)
plt.show()
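Putting the object-oriented pieces together for the time series in the question, a runnable sketch might look like this. It uses the `labelrotation` option of `tick_params` in place of the `setp` call or the explicit loop above; the start date, seed, and figure size are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

N = 100
data = np.random.default_rng(0).standard_normal(N)
time_index = pd.date_range("2024-01-01", periods=N)

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(time_index, data)

# Smaller, vertical tick labels on the x axis.
ax.tick_params(axis="x", labelsize=8, labelrotation=90)

# Make room for the rotated labels.
fig.tight_layout()
# plt.show()  # uncomment when running interactively
```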
