I am trying to make a live plot in matplotlib: the data comes from a CSV file that keeps being updated with new rows. So far, I have not managed to make the plot update continually.
I would also like old points to drop out of the plot as time passes.
Can someone help me, please?
This is my code:
import matplotlib.pyplot as plt
import matplotlib.animation as animation

fig = plt.figure()
ax1 = fig.add_subplot(1, 1, 1)

def animate(i):
    xs = []
    ys = []
    # Re-read the whole file on every frame; the with block closes it again
    with open('DATA.csv') as graph_data:
        for line in graph_data:
            if not line.strip():
                continue  # skip blank lines
            t, hrt = line.split(',')
            xs.append(float(t))
            ys.append(float(hrt))
    ax1.clear()
    ax1.plot(xs, ys, 'b', linewidth=0.5)

ani = animation.FuncAnimation(fig, animate, interval=8)
plt.show()
Use the approach from "How to update/refresh my graph (animated graph) after another cycle, with matplotlib?", but open the file outside of the function.
Inside the function, check whether there is new data. Polling a file for this is not ideal, but there are several ways to do it:
You could count the number of lines you have already read, reopen the file each time, and skip that many lines.
You could check the time stamp and read only the last line.
You could read all the data and plot everything again. This is fine for small amounts of data.
You could open the file in binary mode, keep track of the number of bytes you have read, and then read up to the next line separator.
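As a rough sketch of the "open the file once and only pick up new data" idea, the snippet below keeps a file handle open and returns only the complete rows appended since the previous call. It also stores the rows in a fixed-length deque, so the oldest points fall out automatically. The name read_new_rows, the window size of 3 and the temporary file standing in for DATA.csv are assumptions made for illustration, not part of the original code.

```python
import collections
import os
import tempfile


def read_new_rows(handle):
    """Return the (time, value) rows appended since the previous call."""
    rows = []
    while True:
        pos = handle.tell()
        line = handle.readline()
        if not line.endswith('\n'):
            # Nothing new, or a half-written line: rewind and retry next time
            handle.seek(pos)
            break
        if line.strip():
            t, value = line.split(',')
            rows.append((float(t), float(value)))
    return rows


# Demonstration: a temporary file plays the role of DATA.csv (assumption)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'DATA.csv')
    with open(path, 'w') as out:
        out.write('0,60\n1,62\n')

    window = collections.deque(maxlen=3)  # keep only the 3 newest points
    with open(path) as handle:
        window.extend(read_new_rows(handle))
        # Simulate the writer appending two more rows
        with open(path, 'a') as out:
            out.write('2,61\n3,65\n')
        window.extend(read_new_rows(handle))

    latest = list(window)
```

Inside animate, the contents of such a window could be passed to ax1.plot after ax1.clear(), instead of re-reading the whole file on every frame.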
Related
I have data for several subjects and would like to make a plot for each of them. The goal is to loop through the individual subjects, extract the data for the current subject and then plot it. Once the data of the first subject is plotted, the program should stop and wait for user input, so that the plot for that subject can be inspected in peace. How is this possible?
Note: I'm not interested in making subplots.
I'm working with PyCharm and the TkAgg backend.
import matplotlib.pyplot as plt
import numpy as np

data_per_subject = [np.array([1, 2]),
                    np.array([3, 4]),
                    np.array([5, 6])]

for data in data_per_subject:
    # open figure and plot data of current subject
    fig, ax = plt.subplots()
    plt.plot(data)

    # Here some magic has to take place, so that the program waits for my
    # input. The important thing is that I have to be able to inspect
    # the plot for the current subject while the program is waiting for my
    # input!

    # close figure
    plt.close()
I've tried different methods to save my plot, but everything I've tried has produced a blank image, and I'm now out of ideas. Does anyone have suggestions that could fix this? The code sample is below.
word_frequency = nltk.FreqDist(merged_lemmatizedTokens) #obtains frequency distribution for each token
print("\nMost frequent top-10 words: ", word_frequency.most_common(10))
word_frequency.plot(10, title='Top 10 Most Common Words in Corpus')
plt.savefig('img_top10_common.png')
I was able to save the NLTK FreqDist plot by first creating a figure object, then calling the plot function, and finally saving the figure object.
import matplotlib.pyplot as plt
from nltk.probability import FreqDist
fig = plt.figure(figsize = (10,4))
plt.gcf().subplots_adjust(bottom=0.15) # to avoid x-ticks cut-off
fdist = FreqDist(merged_lemmatizedTokens)
fdist.plot(10, cumulative=False)
plt.show()
fig.savefig('freqDist.png', bbox_inches = "tight")
I think you can try the following:
plt.ion()
word_frequency.plot(10, title='Top 10 Most Common Words in Corpus')
plt.savefig('img_top10_common.png')
plt.ioff()
plt.show()
This is because inside nltk's plot function, plt.show() is called and once the figure is closed, plt.savefig() has no active figure to save anymore.
The workaround is to turn interactive mode on, such that the plt.show() from inside the nltk function does not block. Then savefig is called with a current figure available and saves the correct plot. To then show the figure, interactive mode needs to be turned off again and plt.show() be called externally - this time in a blocking mode.
Ideally, nltk would rewrite their plotting function to allow setting the blocking status, to not show the plot and return the created figure, or to take an Axes as input to plot to. Feel free to reach out to them with this request.
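Another way to sidestep the blocking plt.show() is to draw the frequencies yourself on an Axes you own and save that figure explicitly. The following is only a minimal sketch under stated assumptions: the tokens list is made up, collections.Counter stands in for nltk's FreqDist (which is a Counter subclass, so most_common works the same way), the output path is a temporary one, and the Agg backend is selected purely so that the sketch runs without a display.

```python
import collections
import os
import tempfile

import matplotlib
matplotlib.use('Agg')  # assumption: headless backend, only so the sketch runs anywhere
import matplotlib.pyplot as plt

# Made-up tokens; in the question this would be merged_lemmatizedTokens
tokens = ['cat', 'dog', 'cat', 'bird', 'cat', 'dog']

# Counter stands in for nltk's FreqDist here
top = collections.Counter(tokens).most_common(2)
words = [word for word, _ in top]
counts = [count for _, count in top]

# Draw on a figure we hold a reference to, so nothing else can close it first
fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(range(len(words)), counts)
ax.set_xticks(range(len(words)))
ax.set_xticklabels(words, rotation=90)
ax.set_ylabel('Counts')
ax.set_title('Top 2 Most Common Words in Corpus')

out_path = os.path.join(tempfile.mkdtemp(), 'img_top_common.png')
fig.savefig(out_path, bbox_inches='tight')
plt.close(fig)
```

Because savefig is called on the figure object itself, the result does not depend on which figure pyplot currently considers active.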
I am trying to plot some data from a file. The file contains 13 columns, but I want to plot just the first and the fourth column. There is also more than one such file, and I want to plot them all on the same diagram. I managed to show the lines on the diagram, and I have added my plotting code below. The problem is that I want a different color for each file, but my code uses the same color for all of them. How can I correct it?
Thank you.
# gen_len is an array, the same for all files
# gen_number is an array that contains the data
# of the files
colors = "bgrcmyk"
index = 0
for gen in gen_number:
    plt.plot(gen, gen_len, color=colors[index])
    index = index + 1
plt.savefig('result.png')
plt.show()
A more elegant solution for reading in your files would be to use numpy's genfromtxt, which can import just your desired columns and also ignore lines starting with a certain character (the comments='#' keyword). I think this code does what you want:
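import numpy as np
import matplotlib.pyplot as plt
import sys

colors = "bgrcmyk"

# One file per command-line argument; sys.argv[0] is the script name
for i in range(1, len(sys.argv)):
    # Read only the first and fourth columns, skipping lines that start with '#'
    gen, gen_len = np.genfromtxt(sys.argv[i], usecols=(0, 3), unpack=True, comments='#')
    # Wrap around so that more than seven files do not raise an IndexError
    plt.plot(gen, gen_len, c=colors[(i - 1) % len(colors)])

plt.savefig('result.png')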
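To see what usecols and unpack do without needing real data files, here is a small self-contained sketch. The four-column sample text is invented for illustration (the real files have 13 columns), and itertools.cycle is shown as one possible way to hand out colors that never runs past the end of the color string.

```python
import io
import itertools

import numpy as np

# Invented sample standing in for one data file; the '#' line is skipped
sample = io.StringIO(
    "# gen a b len\n"
    "1 0 0 10.5\n"
    "2 0 0 11.0\n"
    "3 0 0 12.5\n"
)

# usecols picks the first and fourth columns; unpack=True returns one array per column
gen, gen_len = np.genfromtxt(sample, usecols=(0, 3), unpack=True, comments='#')

# cycle() restarts at 'b' after 'k', so any number of files gets a color
color_cycle = itertools.cycle("bgrcmyk")
file_colors = [next(color_cycle) for _ in range(9)]
```

With this approach each call to plt.plot could take next(color_cycle) as its color, and the eighth file would simply reuse 'b'.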
I'm required to use the information from a .sac file and plot it against a grid. I know that ObsPy can plot seismograms with st.plot(), but I can't seem to get the plot drawn against a grid. I've also tried following the example in "How do I draw a grid onto a plot in Python?", but I have trouble configuring my x axis to use UTCDateTime. I'm new to Python and to programming of this sort, so any advice or help would be greatly appreciated.
Various resources used:
"http://docs.obspy.org/tutorial/code_snippets/reading_seismograms.html"
"http://docs.obspy.org/packages/autogen/obspy.core.stream.Stream.plot.html#obspy.core.stream.Stream.plot"
The Stream's plot() method actually automatically generates a grid, e.g. if you take the default example and plot it via:
from obspy.core import read
st = read() # without filename an example file is loaded
tr = st[0] # we will use only the first channel
tr.plot()
You may want to play with the number_of_ticks, tick_format and tick_rotation parameters, as pointed out in http://docs.obspy.org/packages/autogen/obspy.core.stream.Stream.plot.html.
However if you want more control you can pass a matplotlib figure as input parameter to the plot() method:
from obspy.core import read
import matplotlib.pyplot as plt
fig = plt.figure()
st = read('/path/to/file.sac')
st.plot(fig=fig)
# at this point do whatever you want with your figure, e.g.
fig.gca().set_axis_off()
# finally display your figure
fig.show()
Hope it helps.
I have an in-house distributed computing library that we use all the time for parallel computing jobs. After the processes are partitioned, they run their data loading and computation steps and then finish with a "save" step. Usually this involves writing data to database tables.
But for a specific task, I need the output of each process to be a .png file with some data plots. There are 95 processes in total, so 95 .pngs.
Inside of my "save" step (executed on each process), I have some very simple code that makes a boxplot with matplotlib's boxplot function and some code that uses savefig to write it to a .png file that has a unique name based on the specific data used in that process.
However, I occasionally see output where it appears that two or more sets of data were written into the same output file, despite the unique names.
Does matplotlib use temporary file saves when making boxplots or saving figures? If so, does it always use the same temp file names (thus leading to over-write conflicts)? I have run my process using strace and cannot see anything that obviously looks like temp file writing from matplotlib.
How can I ensure that this will be threadsafe? I definitely want to conduct the file saving in parallel, as I am looking to expand the number of output .pngs considerably, so the option of first storing all the data and then just serially executing the plot/save portion is very undesirable.
It's impossible for me to reproduce the full parallel infrastructure we are using, but below is the function that gets called to create the plot handle, and then the function that gets called to save the plot. You should assume for the sake of the question that the thread safety has nothing to do with our distributed library. We know it's not coming from our code, which has been used for years for our multiprocessing jobs without threading issues like this (especially not for something we don't directly control, like any temp files from matplotlib).
import pandas
import numpy as np
import matplotlib.pyplot as plt

def plot_category_data(betas, category_name):
    """
    Function to organize beta data by date into vectors and pass to box plot
    code for producing a single chart of multi-period box plots.
    """
    beta_vector_list = []
    yms = np.sort(betas.yearmonth.unique())
    for ym in yms:
        beta_vector_list.append(betas[betas.yearmonth==ym].Beta.values.flatten().tolist())
    ###

    plot_output = plt.boxplot(beta_vector_list)
    axs = plt.gcf().gca()
    axs.set_xticklabels(betas.FactorDate.unique(), rotation=40, horizontalalignment='right')
    axs.set_xlabel("Date")
    axs.set_ylabel("Beta")
    axs.set_title("%s Beta to BMI Global"%(category_name))
    axs.set_ylim((-1.0, 3.0))
    return plot_output
### End plot_category_data

def save(self):
    """
    Make calls to store the plot to the desired output file.
    """
    out_file = self.output_path + "%s.png"%(self.category_name)
    fig = plt.gcf()
    fig.set_figheight(6.5)
    fig.set_figwidth(10)
    fig.savefig(out_file, bbox_inches='tight', dpi=150)
    print("Finished and stored output file %s"%(out_file))
    return None
### End save
Both of your functions call plt.gcf(), which depends on pyplot's global notion of the current figure. I would try creating a new figure with plt.figure() every time you plot and referencing that figure explicitly, so that you sidestep the issue entirely.
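A minimal sketch of that suggestion follows, assuming the plotting and saving can be combined into one helper. The function name save_boxplot, the sample data and the temporary output directory are hypothetical, and the Agg backend is selected only so the sketch runs without a display. Each call creates its own figure, draws on it through the Axes object, saves it, and closes it, so no plot relies on whatever figure happens to be current.

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')  # assumption: headless backend for worker processes
import matplotlib.pyplot as plt


def save_boxplot(vectors, title, out_file):
    """Draw a box plot on a brand-new figure and write it to out_file."""
    fig, ax = plt.subplots(figsize=(10, 6.5))
    ax.boxplot(vectors)
    ax.set_xlabel("Date")
    ax.set_ylabel("Beta")
    ax.set_title(title)
    fig.savefig(out_file, bbox_inches='tight', dpi=150)
    plt.close(fig)  # release the figure so nothing carries over to the next plot
    return out_file


# Demonstration with made-up data for two hypothetical categories
with tempfile.TemporaryDirectory() as tmp:
    first = save_boxplot([[0.1, 0.5, 0.9], [0.2, 0.4, 1.1]],
                         "A Beta to BMI Global", os.path.join(tmp, "A.png"))
    second = save_boxplot([[1.0, 1.2, 1.4], [0.8, 1.5, 2.0]],
                          "B Beta to BMI Global", os.path.join(tmp, "B.png"))
    sizes = [os.path.getsize(first), os.path.getsize(second)]

open_figures = plt.get_fignums()
```

After both calls no figures remain open, which is the property that keeps data from one plot out of the next one within the same process.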