I am currently making a project which requires real time monitoring of various quantities like temperature, pressure, humidity etc. I am following a approach of making individual arrays of all the sensors and ploting a graph using matplotlib and drwnow.
HOST = "localhost"
PORT = 4223
UID1 = "tsJ" # S1
from tinkerforge.ip_connection import IPConnection
from tinkerforge.bricklet_ptc import BrickletPTC
import numpy as np
import serial
import matplotlib
from matplotlib.ticker import ScalarFormatter, FormatStrFormatter
import matplotlib.pyplot as plt
from matplotlib import style
style.use('ggplot')
from drawnow import *
# creating arrays to feed the data
tempC1 = []
def makeafig():
# creating subplots
fig1 = plt.figure(1)
a = fig1.add_subplot(111)
#setting up axis label, auto formating of axis and title
a.set_xlabel('Time [s]', fontsize = 10)
a.set_ylabel('Temperature [°C]', fontsize = 10)
y_formatter = matplotlib.ticker.ScalarFormatter(useOffset=False)
a.yaxis.set_major_formatter(y_formatter)
title1 = "Current Room Temperature (Side1): " + str(temperature1/100) + " °C"
a.set_title(title1, fontsize = 10)
#plotting the graph
a.plot(tempC1, "#00A3E0")
#saving the figure
fig1.savefig('RoomTemperature.png', dpi=100)
while True:
ipcon = IPConnection() # Create IP connection
ptc1 = BrickletPTC(UID1, ipcon) # S1
ipcon.connect(HOST, PORT) # Connect to brickd
#setting the temperature from PTC bricklet
temperature1 = ptc1.get_temperature()
#processing data from a temperature sensor to 1st array
dataArray1=str(temperature1/100).split(',')
temp1 = float(dataArray1[0])
tempC1.append(temp1)
#making a live figure
drawnow(makeafig)
plt.draw()
This is the approach I found good on the internet and it is working. The only problem I am facing is It consumes more time if I made more arrays for other sensors and the plot being made lags from the real time when I compare it with a stopwatch.
Is there any good and efficient approach for obtaining live graphs that will be efficient with lot of sensors and doen't lag with real time.
Or any command to clear up the already plotted array values?
I'd be obliged if anyone can help me with this problem.
I'd like to ask that, does filling data in arrays continuosly makes the process slow or is it my misconception?
That one is easy to test; creating an empty list and appending a few thousand values to it takes roughly 10^-4 seconds, so that shouldn't be a problem. For me somewhat surprisingly, it is actually faster than creating and filling a fixed size numpy.ndarray (but that is probably going to depend on the size of the list/array).
I quickly played around with your makeafig() function, placing a = fig1.add_subplot(111) up to (including) a.plot(..) in a simple for i in range(1,5) loop, with a = fig1.add_subplot(2,2,i); that makes makeafig() about 50% slower, but differences are only about 0.1-0.2 seconds. Is that in line with the lag that you are experiencing?
That's about what I can test without the real data, my next step would be to time the part from ipcon=.. to temperature1=... Perhaps the bottleneck is simply the retrieval of the data? I'm sure that there are several examples on SO on how to time parts of Python scripts, for these kind of problems something like the example below should be sufficient:
import time
t0 = time.time()
# do something
dt = time.time() - t0
Related
I'm using Dask to read 500 parquet files and it does it much faster than other methods that I have tested.
Each parquet file contains a time column and many other variable columns.
My goal is to create a single plot that will have 500 lines of variable vs time.
When I use the following code, it works very fast compared to all other methods that I have tested but it gives me a single "line" on the plot:
import dask.dataframe as dd
import matplotlib.pyplot as plt
import time
start = time.time()
ddf = dd.read_parquet("results_parq/*.parquet")
plt.plot(ddf['t'].compute(),ddf['reg'].compute())
plt.show()
end = time.time()
print(end-start)
from my understanding, it happens because Dask just plots the following:
t
0
0.01
.
.
100
0
0.01
.
.
100
0
What I mean it plots a huge column instead of 500 columns.
One possible solution that I tried to do is to plot it in a for loop over the partitions:
import dask.dataframe as dd
import matplotlib.pyplot as plt
import time
start = time.time()
ddf = dd.read_parquet("results_parq/*.parquet")
for p in ddf.partitions:
plt.plot(p['t'].compute(),p['reg'].compute())
plt.show()
end = time.time()
print(end-start)
It does the job and the resulting plot looks like I want:
However, it results in much longer times.
Is there a way to do something like this but yet to use Dask multicore benefits? Like somehow use map_partitions for it?
Thank you
as a start, you cannot normally make matplotlib draw to the same figure from multiple processes, as the renderers aren't using shared memory. (neither should they from a programming point of view)
drawing 500 lines is a very simple task for matplotlib and the problem maybe not in matplotlib.
your dask workers are likely sending data sequentially to your main process, hence the slowdown. (each worker has to wait for master to request data then send it then wait for confirmation, then wait for next order to come, etc)
you can force them to send their data faster by prefetching all the data before you start plotting by matplotlib.
import numpy as np
ddf = dd.read_parquet("results_parq/*.parquet")
# compute length of each partition
lengths = ddf.map_partitions(len).compute()
# get all partitions at once
ddf2 = ddf.compute()
# calculate each parition start and end
lengths = list(lengths)
lengths.insert(0,0)
accumelated_lengths= np.cumsum(lengths)
# plot each partition
for i in range(len(accumelated_lengths)-1):
plt.plot(ddf2['t'][accumelated_lengths[i]:accumelated_lengths[i+1]],
ddf2['reg'][accumelated_lengths[i]:accumelated_lengths[i+1]])
plt.show()
Edit: making 500 calls to plt.plot is probably slowing you down too, you could use matplotlib.LineCollection instead, it takes 1/20 of the time for 500 lines.
# plot each partition
lines = []
fig, ax = plt.subplots()
for i in range(len(accumelated_lengths) - 1):
lines.append(tuple(zip(ddf2['a'][accumelated_lengths[i]:accumelated_lengths[i + 1]],
ddf2['b'][accumelated_lengths[i]:accumelated_lengths[i + 1]])))
coll = LineCollection(lines,colors = np.random.random([len(lines),3]))
ax.add_collection(coll)
ax.autoscale_view()
plt.show()
Using the below python(ubuntu) code and rtlsdr, I am able to plot a graph. Can anyone please tell me how to modify this code to plot the graph continuously in real time?
from pylab import *
from rtlsdr import *
sdr = RtlSdr()
sdr.sample_rate = 2.4e6
sdr.center_freq = 93.5e6
sdr.gain = 50
samples = sdr.read_samples(256*1024)
sdr.close()
psd(samples.real, NFFT=1024, Fs=sdr.sample_rate/1e6, Fc=sdr.center_freq/1e6)
xlabel('Frequency (MHz)')
ylabel('Relative power (dB)')
show()
Normally you can update a pyplot-generated plot by calling plot.set_xdata(), plot.set_ydata(), and plot.draw() (Dynamically updating plot in matplotlib), without having to recreate the entire plot. However, this only works for plots that directly draw a data series. The plot instance cannot automatically recalculate the spectral density calculated by psd().
You'll therefore need to call psd() again when you want to update the plot - depending on how long it takes to draw, you could do this in regular intervals of a second or less.
This might work:
from pylab import *
from rtlsdr import *
from time import sleep
sdr = RtlSdr()
sdr.sample_rate = 2.4e6
sdr.center_freq = 93.5e6
sdr.gain = 50
try:
while True: # run until interrupted
samples = sdr.read_samples(256*1024)
clf()
psd(samples.real, NFFT=1024, Fs=sdr.sample_rate/1e6, Fc=sdr.center_freq/1e6)
xlabel('Frequency (MHz)')
ylabel('Relative power (dB)')
show()
sleep(1) # sleep for 1s
except:
pass
sdr.close()
Edit: Of course, I'm not sure how read_samples runs; my example here assumed that it returns almost immediately. If it blocks for a long time while waiting for data, you may want to read less data at a time, and discard old data as you do so:
from collections import deque
max_size = 256*1024
chunk_size = 1024
samples = deque([], max_size)
while True:
samples.extend(sdr.read_samples(chunk_size))
# draw plot
Taking ideas from various sources, and combining with my own, I sought to create an animated maps showing the shading of countries based on some value in my data.
The basic process is this:
Run DB query to get dataset, keyed by country and time
Use pandas to do some data manipulation (sums, avgs, etc)
Initialize basemap object, then load Load external shapefile
Using the animation library, color the countries, one frame for each distinct "time" in the dataset.
Save as gif or mp4 or whatever
This works just fine. The problem is that it is extremely slow. I have potentially over 100k time intervals (over several metrics) I want to animate, and I'm getting an average time of 15s to generate each frame, and it gets worse the more frames there are. At this rate, it will potentially take weeks of maxing out the cpu and memory on my computer to generate a single animation.
I know that matplotlib isn't known for being very fast (examples: 1 and 2) But I read stories of people generating animations at 5+ fps and wonder what I'm doing wrong.
Some optimizations that I have done:
Only recolor the countries in the animate function. This takes on average ~3s per frame, so while it could be improved, it's not what takes the most time.
I use the blit option.
I tried using a smaller plots size and less detailed basemap, but the results were marginal.
Perhaps a less detailed shapefile would speed up the coloring of the shapes, but as I said before, that's only a 3s per frame improvement.
Here is the code (minus a few identifiable features)
import pandas as pd
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import time
from math import pi
from sqlalchemy import create_engine
from mpl_toolkits.basemap import Basemap
from matplotlib.patches import Polygon
from matplotlib.collections import PatchCollection
from geonamescache import GeonamesCache
from datetime import datetime
def get_dataset(avg_interval, startTime, endTime):
### SQL query
# Returns a dataframe with fields [country, unixtime, metric1, metric2, metric3, metric4, metric5]]
# I use unixtime so I can group by any arbitrary interval to get sums and avgs of the metrics (hence the param avg_interval)
return df
# Initialize plot figure
fig=plt.figure(figsize=(11, 6))
ax = fig.add_subplot(111, axisbg='w', frame_on=False)
# Initialize map with Robinson projection
m = Basemap(projection='robin', lon_0=0, resolution='c')
# Load and read shapefile
shapefile = 'countries/ne_10m_admin_0_countries'
m.readshapefile(shapefile, 'units', color='#dddddd', linewidth=0.005)
# Get valid country code list
gc = GeonamesCache()
iso2_codes = list(gc.get_dataset_by_key(gc.get_countries(), 'fips').keys())
# Get dataset and remove invalid countries
# This one will get daily aggregates for the first week of the year
df = get_dataset(60*60*24, '2016-01-01', '2016-01-08')
df.set_index(["country"], inplace=True)
df = df.ix[iso2_codes].dropna()
num_colors = 20
# Get list of distinct times to iterate over in the animation
period = df["unixtime"].sort_values(ascending=True).unique()
# Assign bins to each value in the df
values = df["metric1"]
cm = plt.get_cmap('afmhot_r')
scheme= cm(1.*np.arange(num_colors)/num_colors)
bins = np.linspace(values.min(), values.max(), num_colors)
df["bin"] = np.digitize(values, bins) - 1
# Initialize animation return object
x,y = m([],[])
point = m.plot(x, y,)[0]
# Pre-zip country details and shap objects
zipped = zip(m.units_info, m.units)
tbegin = time.time()
# Animate! This is the part that takes a long time. Most of the time taken seems to happen between frames...
def animate(i):
# Clear the axis object so it doesn't draw over the old one
ax.clear()
# Dynamic title
fig.suptitle('Num: {}'.format(datetime.utcfromtimestamp(int(i)).strftime('%Y-%m-%d %H:%M:%S')), fontsize=30, y=.95)
tstart = time.time()
# Get current frame dataset
frame = df[df["unixtime"]==i]
# Loop through every country
for info, shape in zipped:
iso2 = info['ISO_A2']
if iso2 not in frame.index:
# Gray if not in dataset
color = '#dddddd'
else:
# Colored if in dataset
color = scheme[int(frame.ix[iso2]["bin"])]
# Get shape info for country, then color on the ax subplot
patches = [Polygon(np.array(shape), True)]
pc = PatchCollection(patches)
pc.set_facecolor(color)
ax.add_collection(pc)
tend = time.time()
#print "{}%: {} of {} took {}s".format(str(ind/tot*100), str(ind), str(tot), str(tend-tstart))
print "{}: {}s".format(datetime.utcfromtimestamp(int(i)).strftime('%Y-%m-%d %H:%M:%S'), str(tend-tstart))
return None
# Initialize animation object
output = animation.FuncAnimation(fig, animate, period, interval=150, repeat=False, blit=False)
filestring = time.strftime("%Y%m%d%H%M%S")
# Save animation object as m,p4
#output.save(filestring + '.mp4', fps=1, codec='ffmpeg', extra_args=['-vcodec', 'libx264'])
# Save animation object as gif
output.save(filestring + '.gif', writer='imagemagick')
tfinish = time.time()
print "Total time: {}s".format(str(tfinish-tbegin))
print "{}s per frame".format(str((tfinish-tbegin)/len(df["unixtime"].unique())))
P.S. I know the code is sloppy and could use some cleanup. I'm open to any suggestions, especially if that cleanup would improve performance!
Edit 1: Here is an example of the output
2016-01-01 00:00:00: 3.87843298912s
2016-01-01 00:00:00: 4.08691620827s
2016-01-02 00:00:00: 3.40868711472s
2016-01-03 00:00:00: 4.21187019348s
Total time: 29.0233821869s
9.67446072896s per frame
The first first few lines represent the date being processed, and the runtime of each frame. I have no clue why the first one is repeated. The final line it the total runtime of the program divided by the number of frames. Note that the average time is 2-3x the individual times. This makes me think that there is something happening "between" the frames that is eating up a lot of time.
Edit 2: I ran some performance tests and determined that average time to generate each additional frame is greater than the last, proportional to the number of frames, indicating that this is an quadratic-time process. (or would it be exponential?) Either way, I'm very confused why this wouldn't be linear. If the dataset is already generated, and the maps take a constant time to regenerate, what variable is causing each extra frame to take longer than the previous?
Edit 3: I just made the realization that I have no idea how the animation function works. The (x,y) and point variables were taken from an example that was just plotting moving dots, so it makes sense in that context. A map... not so much. I tried returning something map related from the animate function and got better performance. Returning the ax object (return ax,) makes the procedure run in linear time... but doesn't write anything to the gif. Anybody have any idea what I need to return from the animate function to make this work?
Edit 4: Clearing the axis every frame lets the frames generate at a constant rate! Now I just have to work on general optimizations. I'll start with ImportanceOfBeingErnest's suggestion first. The previous edits are obsolete now.
**EDIT: I am receiving the following error message:*
"Error retrieving accessibility bus address: or.freedesktop.DBus.Error.ServiceUnknown: The name org.a11y.Bus was not provided by any .service files"
I wrote code for the PC platform using Python and it works just fine. I was looking to have this program run on a rasberry pi connected to a monitor.
I installed Raspbian and all of the libraries for the above needed functions.
The code is as follows:
############################################################
# Visualizing data from the Arduino serial monotor #
# and creating a graph in a nice format. Later work will #
# look into outputting this to a Android device #
############################################################
############################################################
############################################################
############################################################
#Becuase we can always use more time
import time
#Inport the fancy plotting library
import matplotlib.pyplot as plt
#Inport the serial library
import serial
#We want to do array math so we need Numpy
import numpy
#We also want to draw things so we need drawnow and
#everything that comes with it
from drawnow import *
############################################################
############################################################
#Creating an array to hold data for plotting. See several
#lines below for more information
tempArray = []
pressureArray = []
altitudeArray = []
#The object that is created to interact with the Serial Port
arduinoSerialInput = serial.Serial('/dev/ttyACM0', 230400)
time.sleep(3)
#Here we set the definition oto look for interactive mode
#and plot live data
plt.ion()
############################################################
############################################################
clippingCounter = 0
#Here we define a function to make a plot, or figure of
#the data aquired
def plotTempData():
plt.ylim(150,320)
plt.title('Plots from data of the BMP Sensor')
plt.grid(True)
plt.ylabel('Tempature')
plt.subplot(2, 2, 1)
plt.plot(tempArray, 'ro-', label='Degrees K', linewidth=2)
plt.legend(loc='upper left')
#plt2 = plt.twinx()
#plt2.plot(pressureArray, 'bx-')
def plotPresData():
#plt.ylim(260,320)
plt.title('Plots from data of the BMP Sensor')
plt.grid(True)
plt.ylabel('Pressure')
plt.subplot(2, 2, 2)
plt.plot(pressureArray, 'bo-', label='Pascals', linewidth=2)
plt.legend(loc='upper left')
plt.plot(pressureArray, 'bo-')
plt.show()
############################################################
############################################################
while (1==1):
#First things first, lets wait for data prior to reading
if (arduinoSerialInput.inWaiting()>0):
myArduinoData = arduinoSerialInput.readline()
# print myArduinoData -Line commented out, see below
#The arduino should be set up to proovide one string of data
#from here we will seperate that one string into two or more
#to make it easier to plot
#This will create an array seperated by the comma
#we specified in the Arduino IDE aasign it to a
#variable of our choice and convert the string
#back into a number using the float command
dataArray = myArduinoData.split(',')
temp = float(dataArray[0])
pressure = float(dataArray[1])
#Used to test the printing of values to output
#print temp, " , ", pressure, " , ", altitude
#We need to create an array to to hold the values in question
#Looking at the top, you will see the empty array we need to
#create that will hold the values we need to plot
tempArray.append(temp)
pressureArray.append(pressure)
plt.figure(1)
drawnow(plotTempData)
plt.pause(.000001)
#plt.figure(2)
drawnow(plotPresData)
plt.pause(.000001)
#We want to clip some data off since the graph keeps
#expanding and too many points are building up.
clippingCounter = clippingCounter + 1
if(clippingCounter>50):
tempArray.pop(0)
pressureArray.pop(0)
############################################################
############################################################
The error you mentioned is I believe an issue with 'at-spi2-core' module. The problem seems fixed on Ubuntu version 2.9.90-0ubuntu2, have you tried to upgrade the 'at-spi2-core' module? Alternatively forcing it to be enabled might solve your issue.
Information on ensuring the SPI is enabled is here :
http://www.raspberrypi-spy.co.uk/2014/08/enabling-the-spi-interface-on-the-raspberry-pi/
Would have added this in a comment as I am not 100% on this but I don't have the rep for that.
I've written a simple GUI in python using pylabs and tkinter based on an example found here:
http://hardsoftlucid.wordpress.com/various-stuff/realtime-plotting/
used for sine wave generation.
Except I tweaked it to pull data through suds from a server on the internet. It's not working as I exactly anticipated as the GUI is somewhat slow. I think its due to the timer. I just started learning how to use matplotlib functions yesterday so I'm not aware of how every function works.
How can I speed it up? Right now the data comes in at 2-3 seconds which is fine, but I just want to increase the GUI responsiveness.
Here is my code:
import numpy as np
from matplotlib import pyplot as plt
plt.ion() # set plot to animated
url = "http://10.217.247.36/WSDL/v4.0/iLON100.WSDL"
client = Client(url, username='ilon', password='ilon', location = 'http://10.217.247.36/WSDL/iLON100.WSDL')
read = client.factory.create('ns0:E_xSelect')
read['xSelect'] = """//Item[starts-with(UCPTname, "Net/MB485/MAIN POWER/Fb/PowerSum")]"""
ydata = [0] * 50
ax1=plt.axes()
# make plot
line, = plt.plot(ydata)
plt.ylim([10,40])
# start data collection
while True:
x = client.service.Read(read).Item[0].UCPTvalue[0].value #data stream
x = float(x)
ymin = float(min(ydata))-10
ymax = float(max(ydata))+10
plt.ylim([ymin,ymax])
ydata.append(x)
del ydata[0]
line.set_xdata(np.arange(len(ydata)))
line.set_ydata(ydata) # update the data
plt.draw() # update the plot