python plotting overrides data - python

I have a lot of binary and ASCII files in one folder, and I am reading them with the glob module. I process the binary data so that I can plot it. Finally, I try to plot the simplified binary data in one subplot and the plain ASCII data in another subplot. The problem is that the plots for the binary files come out correctly, but the ASCII data is simply overwritten each time, so every figure shows the same ASCII plot. Here is a simplified version of the code as an example:
import glob
import numpy as np
from struct import unpack
import matplotlib.pyplot as plt
chi = sorted(glob.glob('C:/Users/Desktop/bin/*.chi'))
for index,fh in enumerate(chi):
    data = np.genfromtxt(fh, dtype = float)
    x = [row[0] for row in data]
    y = [row[1] for row in data]
binary = sorted(glob.glob('C:/Users/Desktop/bin/*.bin'))
for count,FILE in enumerate(binary):
    F = open(FILE,'rb')
    B = unpack('f'*1023183, F.read(4*1023183))
    A = np.array(B).reshape(1043, 981)
    F.close()
    #a = something column 1 # some further processing
    #b = something column 2 # and generates 1D data
    fig = plt.figure(figsize=(11, 8.0))
    ax1 =fig.add_subplot(211,axisbg='w')
    ax1.plot(a,b)
    ax2 =fig.add_subplot(212, axisbg ='w')
    ax2.plot(x,y)
    plt.show()
Can somebody please explain why the files replace each other during plotting for only one set of data, while the other set is plotted correctly?

The structure of the loops in your example is not correct: the plot command must be inside the loop over the ASCII files, otherwise only the last file is plotted. Try it like this:
import glob
import numpy as np
from struct import unpack
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(11, 8.0))
chi = sorted(glob.glob('C:/Users/Desktop/bin/*.chi'))
for index,fh in enumerate(chi):
    data = np.genfromtxt(fh, dtype = float)
    x = [row[0] for row in data]
    y = [row[1] for row in data]
    ax1 =fig.add_subplot(211, axisbg ='w')
    ax1.plot(x,y)
binary = sorted(glob.glob('C:/Users/Desktop/bin/*.bin'))
for count,FILE in enumerate(binary):
    F = open(FILE,'rb')
    B = unpack('f'*1023183, F.read(4*1023183))
    A = np.array(B).reshape(1043, 981)
    F.close()
    #a = something column 1 # some further processing
    #b = something column 2 # and generates 1D data
    ax2 =fig.add_subplot(212,axisbg='w')
    ax2.plot(a,b)
plt.show()
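If the goal is instead one figure per file pair (the first .chi file with the first .bin file, and so on), a possible sketch is to walk both sorted lists together with zip. This assumes the two lists line up one-to-one after sorting. The helper names load_ascii, load_binary and plot_pair are hypothetical, np.fromfile is swapped in for the struct.unpack call, and the reduction of the binary array to 1D (column index against column mean) is only a placeholder for the processing that the question leaves out.

```python
import glob

import numpy as np
import matplotlib.pyplot as plt

ROWS, COLS = 1043, 981  # shape used in the question (1043 * 981 = 1023183 floats)


def load_ascii(path):
    """Read a two-column text file and return its columns as arrays."""
    data = np.genfromtxt(path, dtype=float)
    return data[:, 0], data[:, 1]


def load_binary(path, rows=ROWS, cols=COLS):
    """Read rows*cols native 32-bit floats and return them as a 2D array."""
    raw = np.fromfile(path, dtype=np.float32, count=rows * cols)
    return raw.reshape(rows, cols)


def plot_pair(a, b, x, y):
    """Draw one figure: processed binary data on top, ASCII data below."""
    fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(11, 8.0))
    ax1.plot(a, b)
    ax2.plot(x, y)
    return fig


if __name__ == "__main__":
    chi = sorted(glob.glob('C:/Users/Desktop/bin/*.chi'))
    binary = sorted(glob.glob('C:/Users/Desktop/bin/*.bin'))
    # zip pairs the i-th ASCII file with the i-th binary file
    for chi_path, bin_path in zip(chi, binary):
        x, y = load_ascii(chi_path)
        A = load_binary(bin_path)
        # Placeholder for the elided processing: column index vs. column mean
        a = np.arange(A.shape[1])
        b = A.mean(axis=0)
        plot_pair(a, b, x, y)
        plt.show()
```

Because x and y are read inside the same loop iteration that draws the figure, each figure shows the ASCII file that belongs to its binary file rather than the last one read.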

Related

Heatmap from 3D-data, with float-numbers

I am trying to generate a heatmap from 3D data in a CSV file. Each line of the file has the format x,y,z. The problem is that when I create an array to link the values, I can't use floats as indices. When I set the dtype to int in np.loadtxt(), the code works fine, but this gives only half the resolution that the CSV file contains. Is there another way of linking the values?
The code so far is:
import numpy as np
import seaborn as sb
import matplotlib.pyplot as plt
fname = 'test18.csv'
x, y, z = np.loadtxt(fname, delimiter=',', dtype=float).T
pltZ = np.zeros((y.max()+1, x.max()+1), dtype=float)
pltZ[y, x] = z
heat_map = sb.heatmap(pltZ, cmap=plt.cm.rainbow)
plt.show()
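One possible way to keep the float coordinates is to map each distinct coordinate to an integer position and index the array with those positions. The sketch below does this with np.unique and return_inverse. The function name grid_from_xyz is hypothetical, and the approach assumes that the x and y values lie on a grid and that repeated coordinates compare exactly equal.

```python
import numpy as np


def grid_from_xyz(x, y, z):
    """Map float coordinates to integer grid indices and fill a 2D array."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    # unique() returns the sorted distinct coordinates; return_inverse gives,
    # for every sample, the position of its coordinate in that sorted array
    xs, xi = np.unique(x, return_inverse=True)
    ys, yi = np.unique(y, return_inverse=True)
    grid = np.full((ys.size, xs.size), np.nan)  # NaN marks cells with no sample
    grid[yi, xi] = z
    return xs, ys, grid


# Made-up samples on a 2 x 2 grid with a spacing of 0.5
xs, ys, grid = grid_from_xyz([0.0, 0.5, 0.0, 0.5],
                             [0.0, 0.0, 0.5, 0.5],
                             [1.0, 2.0, 3.0, 4.0])
print(grid)
```

The returned grid can be passed to sb.heatmap in place of pltZ, with xs and ys available as tick labels if needed.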

How to use pandas with matplotlib to create 3D plots

I am struggling a bit with the pandas transformations needed to make data render in 3D in matplotlib. The data I have is usually in columns of numbers (usually time and some value). So let's create some test data to illustrate.
import pandas as pd
pattern = ("....1...."
"....1...."
"..11111.."
".1133311."
"111393111"
".1133311."
"..11111.."
"....1...."
"....1....")
# create the data and coords
Zdata = list(map(lambda d:0 if d == '.' else int(d), pattern))
Zinverse = list(map(lambda d:1 if d == '.' else -int(d), pattern))
Xdata = [x for y in range(1,10) for x in range(1,10)]
Ydata = [y for y in range(1,10) for x in range(1,10)]
# pivot the data into columns
data = [d for d in zip(Xdata,Ydata,Zdata,Zinverse)]
# create the data frame
df = pd.DataFrame(data, columns=['X','Y','Z',"Zi"], index=zip(Xdata,Ydata))
df.head(5)
Edit: This block of data is demo data that would normally come from a query on a database and may need more cleaning and transforms before plotting. In this case the data is already aligned and there are no problems aside from having one extra column we don't need (Zi).
So the numbers in pattern are transferred into height data in the Z column of df ('Zi' being the inverse image). With that as the data frame, I've struggled to come up with this pivot method, which takes 3 separate operations. I wonder if that can be done better.
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.cm as cm
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
Xs = df.pivot(index='X', columns='Y', values='X').values
Ys = df.pivot(index='X', columns='Y', values='Y').values
Zs = df.pivot(index='X', columns='Y', values='Z').values
ax.plot_surface(Xs,Ys,Zs, cmap=cm.RdYlGn)
plt.show()
Although I have something working, I feel there must be a better way than what I'm doing. On a big data set I would imagine doing 3 pivots is an expensive way to plot something. Is there a more efficient way to transform this data?
I guess you can avoid some steps during the preparation of the data by not using pandas (only NumPy arrays) and by using some convenience functions provided by NumPy, such as linspace and meshgrid.
I rewrote your code to do so, trying to keep the same logic and the same variable names:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
pattern = ("....1...."
"....1...."
"..11111.."
".1133311."
"111393111"
".1133311."
"..11111.."
"....1...."
"....1....")
# Extract the value according to your logic
Zdata = list(map(lambda d:0 if d == '.' else int(d), pattern))
# Assuming the pattern is always a square
size = int(len(Zdata) ** 0.5)
# Create a mesh grid for plotting the surface
Xdata = np.linspace(1, size, size)
Ydata = np.linspace(1, size, size)
Xs, Ys = np.meshgrid(Xdata, Ydata)
# Convert the Zdata to a numpy array with the appropriate shape
Zs = np.array(Zdata).reshape((size, size))
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# Plot the surface
ax.plot_surface(Xs, Ys, Zs, cmap=cm.RdYlGn)
plt.show()

Data Points not being plotted on a Matplotlib plot

Hello, I am attempting to write a program that plots graphs from various data sets in an Excel spreadsheet. (The x axis is a fixed set of values, while the data values can be selected from the other columns.) However, the plotted graph contains only the axes, while the data points are completely missing. The code I have used is as follows:
import xlrd
import matplotlib.pyplot as plt
from matplotlib.figure import *
loc = ("C:\\Users\\yeoho\\DCO_Raw_Data.xlsx")
wb = xlrd.open_workbook(loc)
sheet = wb.sheet_by_index(0)
sheet.cell_value(0,0)
x = [[sheet.cell_value(r,0)]for r in range(6,sheet.nrows)]
checkOn = True
while checkOn:
    FileName = [[sheet.cell_value(0,c)]for c in range(1,13)]
    print(FileName)
    print("Enter the Integer (1-n) corresponding to the file name that you would like to plot")
    z = int(input())
    y = [[sheet.cell_value(r,z)]for r in range(6,sheet.nrows)]
    fig = plt.figure()
    ax = fig.add_subplot(111)
    assert len(x) == len(y)
    for i in range(len(x)):
        plt.plot(x[i],y[i],color='black')
    plt.show()
    break
The code in lines 16-21 was taken from another Stack Overflow page: How to plot two lists of tuples with Matplotlib
The original code did not have a color parameter, but I have found that this is not the source of the issue.
I am unsure what the issue is here. Thank you for taking the time to read this; I hope you can help me with it.
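A likely explanation is that each plt.plot call in the loop receives exactly one point. With the default style, plot draws a line between points and no markers, so a single point leaves nothing visible. A minimal sketch of one alternative, flattening the nested lists and plotting them in a single call, is shown here. The helper names flatten and plot_column are hypothetical, and the values are made up in the same nested shape that the question builds from the sheet.

```python
import matplotlib.pyplot as plt


def flatten(column):
    """Turn [[v0], [v1], ...] into [v0, v1, ...]."""
    return [cell[0] for cell in column]


def plot_column(x, y):
    """Plot all points with one call so they are joined by a visible line."""
    fig, ax = plt.subplots()
    (line,) = ax.plot(flatten(x), flatten(y), color='black', marker='o')
    return fig, line


# Made-up values in the nested shape produced by the list comprehensions
x = [[0.0], [1.0], [2.0], [3.0]]
y = [[1.5], [2.5], [2.0], [3.5]]
fig, line = plot_column(x, y)
# plt.show()  # uncomment to display the figure interactively
```

Keeping the per-point loop and adding marker='o' to each call would also make the points visible, but they would not be connected.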

How to plot data from different files?

I'm trying to plot data from different text files.
I had to manipulate the data to construct the graph I want for just one document. All the other documents have the same format, but I can't see how to plot them all in one panel. The code I tried for looping over all the files was:
import numpy as np
import matplotlib.pyplot as plt
filenames=["b_10.txt","b_100.txt","b_500.txt","b_1000.txt"]
for i in filenames:
    with open(i) as f:
        data = f.read()
    data = data.split('\n')
    x = [row.split(' ')[0] for row in data]
    y = [row.split(' ')[-1] for row in data]
    x
    a=list(map(str.strip, y))
    trip_list = [item.strip('\tall\t') for item in y]
    yy = np.array(trip_list[1:12])
    yy
    xx= np.array(x[21:32])
    xx
    fig = plt.figure()
    plt.hold(True)
    plt.ylabel('Precisão Interpolada')
    plt.xlabel('Recall')
    plt.plot(xx,yy,'-',label="Precisão Interpolada vs Recall")
    plt.show()
It gave me an error:
ValueError: could not convert string to float:
and a blank panel
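The ValueError suggests that at least one of the strings handed to plt.plot is empty or non-numeric (splitting on '\n' typically leaves an empty last element), and calling plt.figure() inside the loop starts a new panel for every file. A minimal sketch of one way around both points follows: convert the tokens to floats explicitly, then draw every curve on a single axes. The helper names to_floats and plot_curves are hypothetical, and the numbers are made up because the layout of the b_*.txt files is not shown.

```python
import numpy as np
import matplotlib.pyplot as plt


def to_floats(tokens):
    """Convert string tokens to floats, skipping empty or non-numeric ones."""
    values = []
    for tok in tokens:
        try:
            values.append(float(tok.strip()))
        except ValueError:
            continue  # e.g. '' produced by a trailing newline
    return np.array(values)


def plot_curves(curves):
    """Draw every (label, x, y) triple on one shared axes."""
    fig, ax = plt.subplots()
    for label, x, y in curves:
        ax.plot(x, y, '-', label=label)
    ax.set_xlabel('Recall')
    ax.set_ylabel('Interpolated precision')
    ax.legend()
    return fig, ax


# Made-up tokens standing in for the values extracted from two files
recall = to_floats(['0.0', '0.5', '1.0', ''])
curves = [
    ('b_10.txt', recall, to_floats(['0.9', '0.6', '0.2', ''])),
    ('b_100.txt', recall, to_floats(['0.8', '0.5', '0.1', ''])),
]
fig, ax = plot_curves(curves)
# plt.show()  # uncomment to display the figure interactively
```

Silently skipping tokens can leave x and y with different lengths, so it is worth checking that both arrays have the same size before plotting.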

HDF5 file to diagram in python

I'm trying to generate some diagrams from an .h5 file, but I don't know how to do it.
I'm using PyTables, NumPy and matplotlib.
The HDF5 files I use contain 2 sets of data, i.e. 2 different curves.
My goal is to get diagrams like this one.
This is what I have managed to do so far:
import tables as tb
import numpy as np
import matplotlib.pyplot as plt
h5file = tb.openFile(args['FILE'], "a")
for group in h5file.walkGroups("/"):
    for array in h5file.walkNodes("/","Array"):
        if(isinstance(array.atom.dflt, int)):
            tab = np.array(array.read())
            x = tab[0]
            y = tab[1]
            plt.plot(x, y)
            plt.show()
The x and y values are correct, but I don't know how to use them, so the result is wrong: I get a triangle instead of what I want ^^
Thank you for your help
EDIT
I solved my problem.
Here is the code :
fig = plt.figure()
tableau = np.array(array.read())
x = tableau[0]
y = tableau[1]
ax1 = fig.add_subplot(211)
ax2 = fig.add_subplot(212)
ax1.plot(x)
ax2.plot(y)
plt.title(array.name)
plt.show()
