I'm currently doing some data visualization with Python, matplotlib and mplcursors, and I need to show several parameters and their values at the same time for a given time period.
Sample CSV data that was extracted from a system:
https://i.stack.imgur.com/fjd1d.png
My expected output would look like this:
https://i.stack.imgur.com/zXGXA.png
I found a similar question, but it relied on NumPy functions: "Add the vertical line to the hoverbox (see pictures)".
I'm hoping someone can suggest the best approach to my problem.
Code below:
import matplotlib.pyplot as plt
import numpy as np
import mplcursors
import pandas as pd
fig, ax=plt.subplots()
y1=ax.twinx()
y2=ax.twinx()
y2.spines.right.set_position(("axes", 1.05))
df=pd.read_csv(r"C:\Users\OneDrive\Desktop\sample.csv")
time=df['Time']
yd1=df['Real Power']
yd2=df['Frequency']
yd3=df['SOC']
l1=ax.plot(time,yd1,color='black', label='Real Power')
l2=y1.plot(time,yd2, color='blue', label='Frequency')
l3=y2.plot(time,yd3, color='orange', label='SOC')
df=pd.DataFrame(df)
arr=df.to_numpy()
print(arr)
def show_annotation(sel):
    x=sel.target[0]
    # sel.index may be fractional on a line, so round it to the nearest row
    annotation_str = str(df['Real Power'][int(round(sel.index))])
    #sel.annotation.set_text(annotation_str)
fig.autofmt_xdate()
cursor=mplcursors.cursor(hover=True)
cursor.connect('add', show_annotation)
plt.show()
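One way to get all three values into a single hover box, sketched under a few assumptions: the Time column is parsed as datetimes (for example with parse_dates=['Time'] in read_csv) and plotted with ax.plot, so the hovered x position is in matplotlib date units. The names sample, COLUMNS, nearest_row, hover_text and attach_hover are made up for this illustration, and the sample values are invented. The vertical line uses sel.extras, as in the linked hoverbox question.

```python
import numpy as np
import pandas as pd
import matplotlib.dates as mdates

# Hypothetical sample standing in for sample.csv (same column names as above).
sample = pd.DataFrame({
    "Time": pd.date_range("2023-01-01 00:00", periods=4, freq="min"),
    "Real Power": [10.0, 12.5, 11.0, 9.5],
    "Frequency": [50.0, 50.1, 49.9, 50.0],
    "SOC": [80.0, 79.5, 79.0, 78.5],
})
COLUMNS = ["Real Power", "Frequency", "SOC"]


def nearest_row(df, xnum, x):
    """Return the row whose numeric x position is closest to x."""
    return df.iloc[int(np.abs(xnum - x).argmin())]


def hover_text(row, columns):
    """Timestamp on the first line, then one 'name: value' line per column."""
    lines = [str(row["Time"])]
    lines += [f"{col}: {float(row[col]):g}" for col in columns]
    return "\n".join(lines)


def attach_hover(df, columns):
    """Connect the helpers to an mplcursors hover cursor."""
    import mplcursors  # imported here so the helpers above work without it

    # Same numeric units matplotlib uses when datetimes are passed to ax.plot.
    xnum = mdates.date2num(df["Time"].to_numpy())

    def show_annotation(sel):
        x = sel.target[0]
        sel.annotation.set_text(hover_text(nearest_row(df, xnum, x), columns))
        # Artists in sel.extras are removed together with the annotation.
        vline = sel.artist.axes.axvline(x, color="red", ls=":", lw=1)
        sel.extras.append(vline)

    cursor = mplcursors.cursor(hover=True)
    cursor.connect("add", show_annotation)
    return cursor


# Quick check of the text that would appear when hovering near the second sample.
xnum = mdates.date2num(sample["Time"].to_numpy())
print(hover_text(nearest_row(sample, xnum, xnum[1]), COLUMNS))
```

In the script above, this would be called as attach_hover(df, ['Real Power', 'Frequency', 'SOC']) after the three plot calls and before plt.show(). Looking up the nearest row avoids interpolation, so the box shows values that actually occur in the CSV.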
Related
I want to change the following graph to a line graph or another type that is easier to read.
import os
import itertools
import pvlib
import numpy as np
import pandas as pd
import matplotlib.style
import matplotlib as mpl
import matplotlib.pyplot as plt
from pvlib import clearsky, atmosphere, solarposition
from pvlib.location import Location
from pvlib.iotools import read_tmy3
tus = Location(28.6, 77.2, 'Asia/Kolkata', 216, 'Delhi')
times = pd.date_range(start='2016-01-01', end='2016-12-31', freq='1min', tz=tus.tz)
cs = tus.get_clearsky(times)
cs.plot();
plt.grid()
plt.ylabel('Irradiance $W/m^2$');
plt.title('Ineichen, climatological turbidity');
Irradiance data:
Also, how can I place the legend in a suitable position?
Pandas plots line graphs by default. The graph you are showing is actually a line graph; a full year at 1-minute resolution simply contains so many points that it looks like a solid area!
To see this, try plotting only the first 1440 rows (one day), i.e.:
cs.iloc[:1440].plot()
If you want to visualize annual data you need to do some kind of aggregation for it to make sense. For example, resample to average daily irradiance and plot this:
cs.resample('1d').mean().plot()
To move the legend, use the plt.legend command:
cs.resample('1d').mean().plot()
plt.legend(loc='upper right')
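To try the aggregation idea without pvlib, here is a small sketch on synthetic data. The frame cs_demo, its ghi column and the half-sine "daylight" curve are invented for illustration and are not pvlib output; only the resample and legend calls mirror the answer above.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the clear-sky frame: one week at 1-minute resolution.
times = pd.date_range("2016-01-01", periods=7 * 1440, freq="1min", tz="Asia/Kolkata")
minute_of_day = (times.hour * 60 + times.minute).to_numpy()

# Half-sine between 06:00 and 18:00, zero at night, peaking at 800 at noon.
ghi = np.clip(np.sin((minute_of_day - 360) / 720 * np.pi), 0, None) * 800
cs_demo = pd.DataFrame({"ghi": ghi}, index=times)

# Aggregate 1440 rows per day down to one row per day.
daily_mean = cs_demo.resample("1D").mean()
daily_max = cs_demo.resample("1D").max()

# Two thin lines instead of half a million points.
ax = daily_mean["ghi"].plot(label="daily mean")
daily_max["ghi"].plot(ax=ax, label="daily max")
ax.set_ylabel("Irradiance $W/m^2$")
ax.legend(loc="upper right")
```

Plotting the daily maximum next to the daily mean is just one possible choice; which aggregate is useful depends on what the annual plot is meant to show.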
I have a scatter plot I'm working with, and for some reason I'm not seeing all the x values on my graph.
#%%
from pandas import DataFrame, read_csv
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
file = r"re2.csv"
df = pd.read_csv(file)
#sns.set(rc={'figure.figsize':(11.7,8.27)})
g = sns.FacetGrid(df, col='city')
g.map(plt.scatter, 'type', 'price').add_legend()
This is an image of a small subset of my plots. You can see that Res is displayed; the middle position should show Con and the last should show Mlt. These are all defined in the type column of my data set but are not displayed.
Any clue how to fix?
Python is doing what you tell it to do. If you want to generate more interesting plots, pick different features, presumably ones that make more sense for plotting. See the generic example below.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_theme(style="darkgrid")
tips = sns.load_dataset("tips")
sns.relplot(x="total_bill", y="tip", hue="smoker", data=tips);
Personally, I like plotly plots, which are dynamic, more than I like seaborn plots.
https://plotly.com/python/line-and-scatter/
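One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: when every facet plots its own string categories, the categories may not line up across facets. A minimal sketch that sidesteps this fixes a single category order and plots integer codes, using plain matplotlib subplots instead of FacetGrid. The frame df_demo and its values are made up; only the column names city, type and price come from the question.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for re2.csv with the same three columns.
df_demo = pd.DataFrame({
    "city": ["A", "A", "A", "B", "B", "B"],
    "type": ["Res", "Con", "Mlt", "Con", "Mlt", "Mlt"],
    "price": [100, 150, 120, 90, 200, 210],
})

# Fix one category order for all facets, then plot the integer codes.
types = pd.Categorical(df_demo["type"], categories=["Res", "Con", "Mlt"])
labels = list(types.categories)
df_demo["type_code"] = types.codes

cities = sorted(df_demo["city"].unique())
fig, axes = plt.subplots(1, len(cities), sharex=True, sharey=True, squeeze=False)
for ax, city in zip(axes[0], cities):
    sub = df_demo[df_demo["city"] == city]
    ax.scatter(sub["type_code"], sub["price"])
    ax.set_title(f"city = {city}")
    # Same tick positions and labels in every facet, even for absent types.
    ax.set_xticks(range(len(labels)))
    ax.set_xticklabels(labels)
    ax.set_xlim(-0.5, len(labels) - 0.5)
```

City B has no Res rows here, yet its panel still reserves the first position for Res, so the three types sit in the same place in every panel.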
I want to replicate plots from this paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5000555/pdf/nihms774453.pdf I'm particularly interested in the plot on page 16, right panel. I tried to do this in matplotlib, but it seems to me that there is no way to access individual lines in a LineCollection.
I don't know how to change the color along each line according to the value at every index. I'd eventually like to get something like this: https://matplotlib.org/3.1.1/gallery/lines_bars_and_markers/multicolored_line.html but for every line, according to the data.
This is what I tried.
The data as a numpy array: https://pastebin.com/B1wJu9Nd
import pandas as pd, numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
from matplotlib import colors as mcolors
%matplotlib inline
base_range = np.arange(qq.index.max()+1)
fig, ax = plt.subplots(figsize=(12,8))
ax.set_xlim(qq.index.min(), qq.index.max())
# ax.set_ylim(qq.columns[0], qq.columns[-1])
ax.set_ylim(-5, len(qq.columns) +5)
line_segments = LineCollection([np.column_stack([base_range, [y]*len(qq.index)]) for y in range(len(qq.columns))],
cmap='viridis',
linewidths=(5),
linestyles='solid',
)
line_segments.set_array(base_range)
ax.add_collection(line_segments)
axcb = fig.colorbar(line_segments)
plt.show()
my result:
what I want to achieve:
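A possible direction, sketched on random data because the pastebin array is not reproduced here: split every horizontal line into short two-point segments, put all segments into one LineCollection, and pass one value per segment to set_array. The array values and its shape (4 lines, 50 points) are invented for this illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection

# Hypothetical stand-in for the data: rows are lines, columns are x positions.
rng = np.random.default_rng(0)
values = rng.random((4, 50))
n_lines, n_points = values.shape
x = np.arange(n_points)

segments = []
colors = []
for row in range(n_lines):
    # Points of a horizontal line at height y = row.
    pts = np.column_stack([x, np.full(n_points, row)])
    # Pair consecutive points: shape (n_points - 1, 2, 2).
    segments.append(np.stack([pts[:-1], pts[1:]], axis=1))
    # Color each segment by the value at its left end.
    colors.append(values[row, :-1])

segments = np.concatenate(segments)
colors = np.concatenate(colors)

fig, ax = plt.subplots(figsize=(12, 8))
lc = LineCollection(segments, cmap="viridis", linewidths=5)
lc.set_array(colors)  # one scalar per segment, mapped through the colormap
ax.add_collection(lc)
ax.set_xlim(x.min(), x.max())
ax.set_ylim(-1, n_lines)
fig.colorbar(lc, ax=ax)
```

The attempt above passes one whole line per collection entry, so each line can only get a single color. Splitting into segments is what lets the color vary along a line.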
I generate a plot using the following code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
index=pd.date_range('2018-01-01',periods=200)
data=pd.Series(np.random.randn(200),index=index)
plt.figure()
plt.plot(data)
Which gives me a plot, looking as follows:
It looks like Matplotlib has decided to format the x-ticks as %Y-%m (source)
I am looking for a way to retrieve this date format, ideally a function like ax.get_xtickformat() that would return %Y-%m. What is the smartest way to do this?
There is no built-in way to obtain the date format used to label the axes. The reason is that this format is determined at draw time and may even change as you zoom in or out of the plot.
However, you can still determine the format yourself. This requires drawing the figure first, so that the tick locations are fixed. You can then query the formats used by the automatic formatting and select the one that would be chosen for the current view.
Note that the following assumes that an AutoDateFormatter or a formatter subclassing this is in use (which should be the case by default).
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
index=pd.date_range('2018-01-01',periods=200)
data=pd.Series(np.random.randn(200),index=index)
plt.figure()
plt.plot(data)
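def get_fmt(axis):
    # Draw first so that the tick locations (and hence the format) are fixed.
    axis.axes.figure.canvas.draw()
    formatter = axis.get_major_formatter()
    # Note: _locator and _get_unit() are private matplotlib attributes.
    locator_unit_scale = float(formatter._locator._get_unit())
    fmt = next((fmt for scale, fmt in sorted(formatter.scaled.items())
                if scale >= locator_unit_scale),
               formatter.defaultfmt)
    return fmt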
print(get_fmt(plt.gca().xaxis))
plt.show()
This prints %Y-%m.
If you want to set the date format yourself, pass the format string to a DateFormatter, e.g. myFmt = DateFormatter("%d-%m-%Y"):
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter
index=pd.date_range('2018-01-01',periods=200)
data=pd.Series(np.random.randn(200),index=index)
fig, ax = plt.subplots()
ax.plot(index, data)
myFmt = DateFormatter("%d-%m-%Y")
ax.xaxis.set_major_formatter(myFmt)
fig.autofmt_xdate()
plt.show()
I have a pandas Series of complex numbers, which I would like to plot. Currently, I am looping through each point and assigning it a color. I would prefer to generate the plot without looping over each point. Using Series.plot() would be preferable, but converting the Series to numpy is OK.
Here is an example of what I currently have:
import pandas as pd
import numpy as np
from matplotlib import pyplot
s = pd.Series((1+np.random.randn(500)*0.05)*np.exp(1j*np.linspace(-np.pi, np.pi, 500)))
cmap = pyplot.cm.viridis
for i, val in enumerate(s):
    pyplot.plot(np.real(val), np.imag(val), 'o', ms=10, color=cmap(i/(len(s)-1)))
pyplot.show()
You can use pyplot.scatter, which allows coloring of points based on a value.
pyplot.scatter(np.real(s), np.imag(s), s=50, c=np.arange(len(s)), cmap='viridis')
Here, we set c to an increasing sequence to get the same result as in the question.
You can simply plot the real and imaginary part of the series without a loop.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
s = pd.Series((1+np.random.randn(500)*0.05)*np.exp(1j*np.linspace(-np.pi, np.pi, 500)))
plt.plot(s.values.real,s.values.imag, marker="o", ls="")
plt.show()
However, you need to use a scatter plot if you want to have different colors:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
s = pd.Series((1+np.random.randn(500)*0.05)*np.exp(1j*np.linspace(-np.pi, np.pi, 500)))
plt.scatter(s.values.real,s.values.imag, c = range(len(s)), cmap=plt.cm.viridis)
plt.show()