loop over many files and plot them - python

Thanks for taking the time to read this; it may be a simple question.
I have about 200 files like this one:
Output of SMC2FS2: FAS for file 20123427.CB2A.BHE.sac.smc
Nfreq_out =
8192
freq fas
0.0000000E+00 6.6406252E-03
2.4414062E-03 1.3868844E+04
4.8828125E-03 3.0740834E+04
7.3242188E-03 2.7857139E+04
9.7656250E-03 1.6535047E+04
1.2207031E-02 9.7825762E+03
1.4648438E-02 6.1421987E+03
1.7089844E-02 6.5783145E+03
1.9531250E-02 5.6137949E+03
2.1972656E-02 3.5297178E+03
To read them, skip the header and start the processing, I use:
import glob
import os
import matplotlib.pyplot as plt
#define the path where I have the 200 files
pato='D:\\Seismic_Inves\\flc_grant\\120427\\smc2fs\\smooth'
os.chdir(pato)
lista=[]
#list all files with "kono_"
for a in glob.glob('*kono_*'):
    lista.append(a)
#read and skip the header for all files
for archis in lista:
    with open(archis,'r') as leo:
        for _ in range(4):
            next(leo)
        #start the processing
        for line in leo:
            leo=[x.strip() for x in leo if x.strip()]
            leos=[tuple(map(float,x.split())) for x in leo[1:]]
            f=[x[0] for x in leos]
            fas=[x[1] for x in leos]
    plt.figure(1)
    plt.plot(f,fas,'r')
    plt.yscale('log')
    plt.xscale('log')
    plt.show()
As you can imagine, it is a plot of frequency vs. amplitude (a FAS plot).
The code works, but it opens a figure and plots just one file; I then need to close the figure before it plots the second file, and so on.
The question is:
How can I plot all the data (the 200 csv files) in just one figure?
To @GlobalTraveler, this is the result using your suggestion:
[Figure: FAS Konoomachi_smooth_data]

Add the block argument to show, i.e. plt.show(block=False), or move show outside the for loop.
However, in the grand scheme of things I would suggest moving the code to a more object-oriented approach. For example:
import glob
import os
from matplotlib.pyplot import subplots, show
#define the path where I have the 200 files
pato='D:\\Seismic_Inves\\flc_grant\\120427\\smc2fs\\smooth'
os.chdir(pato)
lista=[]
#list all files with "kono_"
for a in glob.glob('*kono_*'):
    lista.append(a)
#read and skip the header for all files
fig, ax = subplots() # open figure and create axis
for archis in lista:
    with open(archis,'r') as leo:
        for _ in range(4):
            next(leo)
        #start the processing
        for line in leo:
            leo=[x.strip() for x in leo if x.strip()]
            leos=[tuple(map(float,x.split())) for x in leo[1:]]
            f=[x[0] for x in leos]
            fas=[x[1] for x in leos]
    ax.plot(f,fas,'r') # plot on this axis
ax.set(**dict(xscale = 'log', yscale = 'log')) # format the axis
show() # show once, after all files have been plotted
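A more compact variant of the same idea, offered as a sketch rather than a definitive implementation: parse each file into two lists and draw every curve on one shared Axes. The names read_fas and plot_all, the header_lines=4 default, and the choice to drop non-positive frequencies (which a log axis cannot display) are assumptions made for illustration.

```python
import matplotlib.pyplot as plt


def read_fas(path, header_lines=4):
    """Return (freq, fas) lists from one SMC2FS2-style text file."""
    freq, fas = [], []
    with open(path) as handle:
        for _ in range(header_lines):   # skip the header block
            next(handle)
        for line in handle:
            parts = line.split()
            if len(parts) < 2:          # ignore blank or short lines
                continue
            f, a = float(parts[0]), float(parts[1])
            if f > 0:                   # assumption: drop f <= 0 for the log axis
                freq.append(f)
                fas.append(a)
    return freq, fas


def plot_all(paths):
    """Plot every file in `paths` on a single log-log Axes."""
    fig, ax = plt.subplots()
    for path in paths:
        freq, fas = read_fas(path)
        ax.plot(freq, fas, "r", linewidth=0.5)
    ax.set(xscale="log", yscale="log", xlabel="freq", ylabel="fas")
    return fig, ax
```

It could be called as plot_all(sorted(glob.glob('*kono_*'))) followed by a single plt.show(), so that only one window opens for all files.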
This is the result with your suggestion:
[Figure: FAS_konoomachi_smooth]

Related

Import Data in Python with Pandas just for specific rows

I am really new to Python and I hope this is the right community for my question. Sorry if it is not.
I am trying to import data from a .txt file with pandas.
The file looks like this:
# Raman Scattering Spectrum
# X-Axis: Frequency (cm-1)
# Y-Axis: Intensity (10-36 m2 cm/sr)
# Harmonic Data
# Peak information (Harmonic)
# X Y
# 20.1304976000 1.1465331676
# 25.5433266000 6.0306906544
...
# 3211.8081700000 0.3440113123
# 3224.5118500000 0.8814596030
# Plot Curve (Harmonic)
# X Y DY/DX
0.0000000000 8.4803414671 0.6546818124
8.0000000000 17.8239097502 2.0146387573
I already wrote this piece of code to import my data:
import pandas as pd
# import matplotlib as plt
# import scipy as sp
data = pd.read_csv('/home/andrea/Schreibtisch/raman_gauss.txt', sep='\t')
data
Now I just get one column.
If I try it with
pd.read_fwf(file)
I get 3 columns, but the x and y values from Plot Curve (Harmonic) are in one column.
Now I want to import the x, y and DY/DX values from Plot Curve (Harmonic) into different variables or containers as series.
The hard part for me is how to split x and y into 2 columns, and how to tell Python that the import should start 2 lines after the 'Plot Curve (Harmonic)' line.
My idea so far was to check all rows for the string 'Plot Curve (Harmonic)'. That gives a new series of True/False values. Then I need to find the line number where the search string is True, and start the import from that line...
I am too new to Python and not yet familiar enough with the documentation to find the command I must use.
Does anyone have tips for me, such as a command to use? And how do I split the columns?
Thank you very much!
You can read as follows.
Code
import pandas as pd
import re # Regex to parse header
def get_data(filename):
    # Find row containing 'Plot Curve (Harmonic)'
    with open(filename, 'r') as f:
        for i, line in enumerate(f):
            if 'Plot Curve (Harmonic)' in line:
                start_row = i
                # Parse header on next line
                header = re.findall(r'\S+', next(f))[1:]
                # [1:] to skip '#' at beginning of line
                break
        else:
            start_row = None # not found
    if start_row is None:
        return None # marker line not found, nothing to read
    # Use delimiter=r"\s+": since have multiple spaces between numbers
    # skiprows = start_row+2: to skip to data
    # (skip current and header row)
    # reference: https://thispointer.com/pandas-skip-rows-while-reading-csv-file-to-a-dataframe-using-read_csv-in-python/
    # names = header: assigns column names
    df = pd.read_csv(filename, delimiter=r"\s+", skiprows=start_row+2,
                     names = header)
    return df
Test
df = get_data('data.txt')
print(df)
data.txt file
# Raman Scattering Spectrum
# X-Axis: Frequency (cm-1)
# Y-Axis: Intensity (10-36 m2 cm/sr)
# Harmonic Data
# Peak information (Harmonic)
# X Y
# 20.1304976000 1.1465331676
# 25.5433266000 6.0306906544
...
# 3211.8081700000 0.3440113123
# 3224.5118500000 0.8814596030
# Plot Curve (Harmonic)
# X Y DY/DX
0.0000000000 8.4803414671 0.6546818124
8.0000000000 17.8239097502 2.0146387573
Output
X Y DY/DX
0 0.0 8.480341 0.654682
1 8.0 17.823910 2.014639
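If the only uncommented lines in the file are the curve data, as in the sample above, a shorter alternative may be to let pandas skip the '#' lines itself. This is a sketch under that assumption; the helper name read_curve and the hard-coded column names are illustrative and not part of the answer above. Any uncommented, non-numeric line in the real file would need separate handling.

```python
import io

import pandas as pd


def read_curve(source):
    """Read the uncommented 'Plot Curve' rows into a DataFrame."""
    return pd.read_csv(
        source,
        sep=r"\s+",                   # columns are separated by runs of spaces
        comment="#",                  # drop every line that starts with '#'
        header=None,                  # the header line is itself a comment
        names=["X", "Y", "DY/DX"],    # assumed column names, taken from the sample
    )


# In-memory stand-in for the tail of the sample file.
sample = io.StringIO(
    "# Plot Curve (Harmonic)\n"
    "# X Y DY/DX\n"
    "0.0000000000 8.4803414671 0.6546818124\n"
    "8.0000000000 17.8239097502 2.0146387573\n"
)
df = read_curve(sample)
print(df.shape)
```

The trade-off is that the column names are no longer read from the file, so this only fits files whose layout is known in advance.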
First: thank you very much for your answer. It helps me a lot.
I tried to use the comment function, but I cannot add a line break there.
I want to plot the data that I can now extract from the file, but when I add my standard plot code:
plt.plot(df.X, df.Y)
plt.legend(['simulated'])
plt.xlabel('raman_shift')
plt.ylabel('intensity')
plt.grid(True)
plt.show()
I now get the error:
TypeError Traceback (most recent call last)
<ipython-input-240-8594f8545868> in <module>
28 plt.plot(df.X, df.Y)
29 plt.legend(['simulated'])
---> 30 plt.xlabel('raman_shift')
31 plt.ylabel('intensity')
32 plt.grid(True)
TypeError: 'str' object is not callable
I have not changed anything in the label calls. In my other project these lines work well.
I also don't know how to read out the DY/DX column; the '/' cannot be used in the column name.
Do you have a tip for me, again? :)
Thanks.
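A column whose label is not a valid Python identifier, such as DY/DX, cannot be reached with attribute access (df.DY/DX is parsed as a division), but bracket indexing works for any label. As for the TypeError, one common cause, offered here only as a guess because the earlier notebook cells are not shown, is an assignment such as plt.xlabel = '...' earlier in the same session, which replaces the function with a string; restarting the kernel restores it. A minimal sketch, using the two rows from the output above as stand-in data:

```python
import pandas as pd

# Small stand-in for the DataFrame returned by get_data().
df = pd.DataFrame(
    {"X": [0.0, 8.0], "Y": [8.480341, 17.823910], "DY/DX": [0.654682, 2.014639]}
)

# Bracket access works for any column label, including ones with '/'.
slope = df["DY/DX"]

# Optional: rename the column so attribute access becomes possible.
# The new name 'dydx' is an arbitrary choice for this example.
renamed = df.rename(columns={"DY/DX": "dydx"})

print(slope.tolist())
print(renamed.dydx.tolist())
```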

Plotting data from multiple files containing multiple columns in one graph

I am quite new to python so sorry if my question is very basic.
I have a couple of data files (let's say 4), each containing 9 columns and n rows (I need to skip the first row because it is the name of each column). I would like to have the flexibility of plotting any two columns against each other and at the same time do it in one graph.
Let's say I want to take column 2 and 4 from all the data files and plot against each other and have them in one graph to compare them.
What is the general way to do it please?
I looked at a lot of different examples but I couldn't really find the one that addresses this specific case.
Below is a piece of code I have for plotting two columns against each other for one file:
from pylab import *
### processing function
def store(var,textFile):
    data=loadtxt(textFile,skiprows=1)
    it=[]
    eps=[]
    sig=[]
    tc=[]
    sc=[]
    te=[]
    se=[]
    ubf=[]
    for i in range(0,len(data)):
        it.append(float(data[i,1]))
        eps.append(float(data[i,0]))
        sig.append(float(data[i,4]))
        tc.append(float(data[i,6]))
        sc.append(float(data[i,2]))
        te.append(float(data[i,7]))
        se.append(float(data[i,3]))
        ubf.append(float(data[i,8]))
    var.append(it)
    var.append(eps)
    var.append(sig)
    var.append(tc)
    var.append(sc)
    var.append(te)
    var.append(se)
    var.append(ubf)
### data input
dataFile1='555_20K_tensionTestCentreCrack_L5a0_r0.01'
a1=[]
store(a1,dataFile1)
rcParams.update({'legend.numpoints':1,'font.size': 20,'axes.labelsize':25,'xtick.major.pad':10,'ytick.major.pad':10,'legend.fontsize':20})
lw=2
ms=10
### plots
crossSection=0.04
figure(0,figsize=(10,10))
ax1=subplot(1,1,1)
grid()
xlabel('iteration [-]')
ax1.plot(a1[0],[x/1e6 for x in a1[2]],'-k',linewidth=lw)
ylabel(r'$\sigma_1$ [MPa]')
#axis(ymin=0,ymax=10)
ax2 = ax1.twinx()
ax2.plot(a1[0],a1[7],'-r',linewidth=lw)
ylabel('unbForce [-]')
figure(1,figsize=(10,10))
ax1=subplot(1,1,1)
grid()
xlabel(r'$\varepsilon_1$ [millistrain]')
#axis(xmin=0,xmax=0.12)
plot([x*1e3 for x in a1[1]],[x/1e6 for x in a1[2]],'-k',linewidth=lw)
ylabel(r'$\sigma_1$ [MPa]')
#axis(ymin=0,ymax=10)
#savefig(dataFile1+'_sigVSeps.eps',dpi=1000,format='eps',transparent=False)
figure(2,figsize=(10,10))
ax1=subplot(1,1,1)
grid()
xlabel(r'$\varepsilon_1$ [millistrain]')
axis(xmin=0,xmax=0.12)
ax1.plot([x*1e3 for x in a1[1]],[x/1e6 for x in a1[2]],'-k',linewidth=lw)
ylabel(r'$\sigma_1$ [MPa]')
#axis(ymin=0,ymax=10)
ax2 = ax1.twinx()
ax2.plot([x*1e3 for x in a1[1]],a1[3],'-b',linewidth=lw)
ax2.plot([x*1e3 for x in a1[1]],a1[4],'-r',linewidth=lw)
ylabel('cumulative number of microcracks [-]')
legend(('tensile','shear'))
#savefig(dataFile1+'_sig&cracksVSeps.eps',dpi=1000,format='eps',transparent=False)
### show or save
show()
The names of the columns in the data file are it, eps, sig, tc, sc, te, se, ubf and i.
The data file name is 555_20K_tensionTestCentreCrack_L5a0_r0.01.
As you see using this code I am able to plot any two different columns against each other. But I can do it only for one data file.
How can I change this code in order to be able to call different data files?
I actually didn't write this piece of code myself, so I don't get what this line means:
ax1.plot(a1[0],[x/1e6 for x in a1[2]],'-k',linewidth=lw)
and also this line:
plot([x*1e3 for x in a1[1]],[x/1e6 for x in a1[2]],'-k',linewidth=lw)
Sorry again for my weird question.
Please tell me if you need more details about my case.
Thanks a lot
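One possible way to generalize this to several files, offered as a sketch: load each file with numpy.loadtxt, pick the two columns by index, and draw one line per file on the same Axes. The function name plot_columns and its parameters are hypothetical. Regarding the two lines asked about: [x/1e6 for x in a1[2]] is a list comprehension that divides every value of the third stored list (sig) by 1e6, and [x*1e3 for x in a1[1]] multiplies every eps value by 1e3; judging by the axis labels, these look like unit conversions to MPa and millistrain.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_columns(files, xcol, ycol, xscale=1.0, yscale=1.0):
    """Plot column `ycol` against column `xcol` for every file on one Axes.

    Column indices are zero-based; the first row of each file is skipped
    because it holds the column names.
    """
    fig, ax = plt.subplots(figsize=(10, 10))
    for path in files:
        data = np.loadtxt(path, skiprows=1, ndmin=2)
        # Scale factors play the role of the x*1e3 and x/1e6 comprehensions.
        ax.plot(data[:, xcol] * xscale, data[:, ycol] * yscale, label=str(path))
    ax.grid(True)
    ax.legend()
    return fig, ax
```

For example, plot_columns(['run1.txt', 'run2.txt'], xcol=0, ycol=4, xscale=1e3, yscale=1e-6) would compare the two columns that the original script stores as eps and sig across two hypothetical files; call plt.show() afterwards to display the figure.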

Plot a Wave files Audio Visually In Python

I am trying to figure out how to plot the audio of a wav file visually. In my code, if I do a waveFile.readframes(-1) I get the whole wav file plotted; the way my code works now, I just get a sliver (one frame!). I'd like to show 24 frames of audio on each image plotted from the wave file so I can animate it. Hopefully this is clear.
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
import wave , sys , os , struct
waveFile = wave.open('mono.wav','r')
length = waveFile.getnframes()
for i in range(0,length):
    print i # so we know where we are at.
    waveData = waveFile.readframes(i)
    fs = waveFile.getframerate()
    signal = np.fromstring(waveData, 'Int16')
    Time=np.linspace(0, len(signal)/fs, num=len(signal))
    plt.axis('off')
    plt.plot(Time,signal , 'w')
    plt.savefig('signal' + str(i) + '.png' , facecolor='none', edgecolor='none', transparent=True, frameon=False)
    plt.close
From the documentation:
Wave_read.readframes(n)
Reads and returns at most n frames of audio, as a string of bytes.
So to read a chunk of 24 frames you simply call
waveData = waveFile.readframes(24)
When you open the file in read ('r') mode, the file pointer starts out at the 0th frame. As you read frames from the file, you will advance the file pointer by the same number of frames. This means that calling waveFile.readframes(24) repeatedly will yield consecutive chunks of 24 frames until you hit the end of the file - there's no need to pass a changing index i.
To keep track of where you are within the file, you can call waveFile.tell(), and to skip forwards or backwards to the kth frame you can use waveFile.setpos(k).
By the way, this behaviour is very consistent with how standard file objects work in Python.
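The chunked reading described above can be sketched with the standard library alone. The generator name iter_chunks and its default of 24 frames are illustrative choices, not part of the wave module.

```python
import wave


def iter_chunks(path, frames_per_chunk=24):
    """Yield consecutive blocks of raw audio bytes from a WAV file."""
    with wave.open(path, "rb") as wav:
        while True:
            # Each call continues where the previous one stopped.
            chunk = wav.readframes(frames_per_chunk)
            if not chunk:          # empty bytes object: end of file reached
                break
            yield chunk
```

For 16-bit mono audio, each chunk could then be turned into samples with np.frombuffer(chunk, dtype=np.int16) before plotting; the last chunk may be shorter than the others.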
I recoded a bit. The answer above helped, but I needed to do more massaging. If you need to plot audio this way in real time, just adjust readframes to take in as many frames as you would like. To plot each chunk I wound up having to make separate plt.figure ids. This code snippet will get you where you want to go:
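import wave
import numpy as np
import matplotlib.pyplot as plt
wave_file = wave.open('mono.wav', 'r')
data_size = wave_file.getnframes()
sample_rate = wave_file.getframerate()
x = 0 # image counter (assumed to start at 0; not shown in the original snippet)
# xinch and yinch (figure size in inches) must be defined before this loop
while True:
    waveData = wave_file.readframes(10000)
    signal = np.frombuffer(waveData, dtype=np.int16) # np.fromstring is deprecated
    Time=np.linspace(0, len(signal), num=len(signal))
    fig = plt.figure(figsize=(xinch,yinch) , frameon=False)
    #fig = plt.figure(frameon=False)
    ax = fig.add_axes([0, 0, 1, 1])
    #ax.axis('off')
    plt.axis('off')
    line = plt.plot(Time,signal , 'w')
    plt.setp(line, linewidth=10)
    plt.savefig('signal' + str(x) + '.png')
    plt.close('all') # call close, so figures do not pile up in memory
    x+= 1
    if wave_file.tell() == data_size:
        break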
Will result in frames like this:

How can I create x = constant lines on a figure with adjacent labels (automatically)

I am trying to plot some reference lines, say x=1 and x=3, and to label each of those lines in the figure beside the line. I am aware this can be done manually by specifying the position, but in my case (I am plotting hundreds of these lines automatically) this is not feasible.
Is there a simple way to do this?
I have been trying to use a function found online with no luck.
Here is what I currently have. filename contains a time series; Yan_S and Yan_T are both files containing the values I want to plot and information for the associated labels.
################## DEFINE SPECTRAL PLOTTING ##################
# plot_spectra(filename, title, deltaT, taper_onoff)
# filename should include full path to .txt of amplitudes (no headers)
# title will be the title of the figure
# deltaT is the time sampling in seconds
# taper_onoff 1-on 0-off
# mode_label_onoff 1-on 0-off
def plot_spectra(filename, title, deltaT, mode_label_onoff):
    signal = np.genfromtxt(filename)
    time_vector = np.arange(signal.size)*deltaT
    spectra = np.fft.rfft(signal) # implement taper
    freq_mHz = np.fft.rfftfreq(time_vector.size, deltaT)*1.e3
    plt.ylabel(r'$\sqrt{P}$')
    plt.xlabel('mHz')
    #plt.plot(freq_mHz, abs(spectra.real))
    # PLOT MODE LABELS
    if mode_label_onoff != 0:
        Yan_S_Vals = Yan_S[4]
        Yan_T_Vals = Yan_T[4]
        Yan_S_Labels = ["" for x in range(len(Yan_S))]
        Yan_T_Labels = ["" for x in range(len(Yan_S))]
        for i in np.arange(0,len(Yan_S)):
            Yan_S_Labels[i] = str(Yan_S[0][i])+'S'+str(Yan_S[2][i])
            Yan_T_Labels[i] = str(Yan_S[0][i])+'S'+str(Yan_S[2][i])
        # Spheroidal Modes
        lineid_plot.plot_line_ids(freq_mHz, abs(spectra.real), Yan_S_Vals, Yan_S_Labels)
        # Toroidal Modes
        lineid_plot.plot_line_ids(freq_mHz, abs(spectra.real), Yan_T_Vals, Yan_T_Labels)
    plt.xlim(0, 10)
    #plt.savefig('/Users/Alex/Desktop/Python/'+title.split('.')[0]+'_spectra.svg')
    plt.savefig('/Users/Alex/Desktop/Python/'+title.split('.')[0]+'_spectra.png')
    plt.show()
    plt.close('all')
Any help would be great!
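One approach that needs only matplotlib itself, shown here as a sketch and not as a replacement for lineid_plot: draw each line with axvline and place the label with a blended transform, so that x is in data coordinates while y is a fraction of the axes height. The helper name label_vlines and its styling defaults are assumptions for illustration.

```python
import matplotlib.pyplot as plt


def label_vlines(ax, positions, labels, y=0.98):
    """Draw a vertical line at each x position and write its label beside it."""
    # x is given in data coordinates, y as a fraction of the axes height,
    # so the labels stay near the top of the plot whatever the y-limits are.
    transform = ax.get_xaxis_transform()
    for x, text in zip(positions, labels):
        ax.axvline(x, color="0.5", linewidth=0.8)
        ax.text(x, y, f" {text}", transform=transform,
                rotation=90, va="top", ha="left", fontsize=8)


fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3, 4], [0, 1, 0, 1, 0])
label_vlines(ax, [1, 3], ["x=1", "x=3"])
```

With hundreds of closely spaced lines the labels will overlap; this sketch does nothing to resolve such collisions.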

Quitting matplotlib.pyplot animation gracefully

I have a script that plots data of some photometry apertures, and I want to plot them in an xy plot. I am using matplotlib.pyplot with python 2.5.
The input data is stored in around 500 files and read. I am aware that this is not the most efficient way of inputting the data but that's another issue...
Example code:
import matplotlib.pyplot as plt
xcoords = []
ycoords = []
# lists are populated with data from first file
pltline, = plt.plot(xcoords, ycoords, 'rx')
# then loop populating the data from each file
for file in filelist:
    xcoords = [...]
    ycoords = [...]
    pltline.set_xdata(xcoords)
    pltline.set_ydata(ycoords)
    plt.draw()
As there are over 500 files, I will occasionally want to close the animation window in the middle of the plotting. My code to plot works but it doesn't exit very gracefully. The plot window does not respond to clicking the close button and I have to Ctrl+C out of it.
Can anyone help me find a way to close the animation window while the script is running whilst looking graceful (well more graceful than a series of python traceback errors)?
If you update the data and do the draw in a loop, you should be able to interrupt it. Here's an example (that draws a stationary circle and then moves a line around the perimeter):
from pylab import *
import time
data = [] # make the data
for i in range(1000):
    a = .01*pi*i+.0007
    m = -1./tan(a)
    x = arange(-3, 3, .1)
    y = m*x
    data.append((clip(x+cos(a), -3, 3),clip(y+sin(a), -3, 3)))
for x, y in data: # make a dynamic plot from the data
    try:
        plotdata.set_data(x, y)
    except NameError:
        ion()
        fig = figure()
        plot(cos(arange(0, 2.21*pi, .2)), sin(arange(0, 2.21*pi, .2)))
        plotdata = plot(x, y)[0]
        xlim(-2, 2)
        ylim(-2, 2)
    draw()
    time.sleep(.01)
I put in the time.sleep(.01) command to be extra sure that I could break the run, but in my tests (running Linux) it wasn't necessary.
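With a more recent matplotlib than the one available for Python 2.5, one way to let the window's close button end the loop is to check on each iteration whether the figure still exists, and to use plt.pause so the GUI gets time to process events. This is a sketch under those assumptions; animate and its parameters are hypothetical names.

```python
import matplotlib.pyplot as plt


def animate(frames, pause=0.01):
    """Draw each (x, y) pair in turn; stop early if the window is closed.

    Returns the number of frames that were actually drawn.
    """
    plt.ion()
    fig, ax = plt.subplots()
    (line,) = ax.plot([], [], "rx")
    drawn = 0
    for x, y in frames:
        if not plt.fignum_exists(fig.number):   # the user closed the window
            break
        line.set_data(x, y)
        ax.relim()                # recompute data limits for the new points
        ax.autoscale_view()
        plt.pause(pause)          # redraw and let the GUI handle events
        drawn += 1
    plt.ioff()
    return drawn
```

Because the loop exits through an ordinary break, the script can carry on or finish normally instead of ending in a traceback.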
