Good afternoon,
I am currently seeking to compare the voltage amplitude versus time for measurements from an oscilloscope. I am running my code from a Linux terminal and I am currently experiencing the following errors:
ValueError: Invalid number of FFT data points (0) specified.
NameError: name 'yf' is not defined
My code is posted below:
import csv
import matplotlib.pyplot as plt
import numpy as np
from scipy.fftpack import fft
sample_interval= -1
sample_num = -1
time = []
amplitude = []
with open('nofilter-1.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    for row in readCSV:
        time.append(row[3]);
        amplitude.append(row[4]);
        if(row ==1):
            sample_interval = row[1]
        if(row ==2):
            sample_num = row[1]
# sample spacing
print("syntax")
yf = fft(amplitude)
xf = np.linspace(0.0, 1.0/(2.0*sample_interval), sample_num/2)
fig, ax = plt.subplots()
ax.plot(xf, 2.0/sample_num * np.abs(yf[:sample_num//2]))
plt.show()
Am I running into any syntax errors, or have I defined a variable improperly?
Sorry all for the late reply! Here's a snippet of the .csv file that I am working with.
.csv file
As can be seen, columns one and three contain strings in some shape or form, which is why I noticed, after having it pointed out, that I may have mixed up the rows and columns in my code. I followed the advice of bobrobbob and had very little luck.
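Two things in the posted code are certain: `row` is a list, so `row == 1` and `row == 2` are never true and `sample_interval` and `sample_num` stay at -1, and the values appended to `time` and `amplitude` are strings, not numbers. The ValueError suggests that `fft` ended up with no usable samples. Below is a minimal sketch, not a drop-in fix, of one way to read the trace and compute a one-sided spectrum. It assumes time is in column 3 and voltage in column 4 (as in the posted code), derives the sample interval from the time column instead of a header cell, and swaps `scipy.fftpack.fft` for `numpy.fft.rfft`. The helper names `read_trace` and `one_sided_spectrum` and the demo data are hypothetical.

```python
import csv
import io
import math

import numpy as np


def read_trace(csvfile, time_col=3, amp_col=4):
    """Return (times, amplitudes) as float arrays, skipping rows that do not parse."""
    times, amps = [], []
    for row in csv.reader(csvfile):
        try:
            t = float(row[time_col])
            a = float(row[amp_col])
        except (ValueError, IndexError):
            continue  # header text, blank cells, or short rows
        times.append(t)
        amps.append(a)
    return np.array(times), np.array(amps)


def one_sided_spectrum(amplitude, sample_interval):
    """Return (frequencies, magnitudes) for a real-valued trace."""
    n = len(amplitude)
    if n == 0:
        raise ValueError("no samples were read; check the column indices")
    freqs = np.fft.rfftfreq(n, d=sample_interval)
    mags = 2.0 / n * np.abs(np.fft.rfft(amplitude))
    return freqs, mags


# Hypothetical file contents: one text header row, then a 1 V, 50 Hz sine
# sampled every millisecond, with time in column 3 and volts in column 4.
lines = ["Model,ABC,Units,Time,Volts"]
for k in range(1000):
    t = k * 1e-3
    lines.append("label,0,s,%.9e,%.9e" % (t, math.sin(2 * math.pi * 50 * t)))
times, amps = read_trace(io.StringIO("\n".join(lines)))

sample_interval = times[1] - times[0]
freqs, mags = one_sided_spectrum(amps, sample_interval)
peak = int(np.argmax(mags))
print(freqs[peak], mags[peak])
```

With a real file, `read_trace(open('nofilter-1.csv'))` would replace the `io.StringIO` demo, and `ax.plot(freqs, mags)` would draw the spectrum.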
I want to plot a graph by taking data from multiple CSV files and plotting it against the round number (= the number of CSV files for that graph). Suppose I have to take the max value of a particular column from each CSV and plot those values with the serial number of the CSV on the x axis. I am able to read CSV files and plot, but only from a single file. I am unable to do what is stated above.
Below is what I did:
import pandas as pd
import matplotlib.pyplot as plt
import glob
import numpy as np
csv = glob.glob("path" + "*csv")
csv.sort()
N = len(csv)
r = 0
for rno in range(1, N + 1):
    r += 1
    for f in csv:
        df = pd.read_csv(f)
        col1 = pd.DataFrame(df, columns=['col. name'])
        a = col1[:].to_numpy()
        Max = col1.max()
        plt.plot(r, Max)
plt.show()
If anyone has an idea it'd be helpful. Thank you.
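One way to get a single point per file is to loop over the files once, collect each file's maximum in a list, and plot the list against the file's serial number after the loop. The sketch below assumes every file has the column of interest. The helper name `max_per_file`, the column name `score`, and the demo files are hypothetical.

```python
import glob
import os
import tempfile

import pandas as pd


def max_per_file(pattern, column):
    """Return (round_numbers, maxima) with one entry per CSV file matching pattern."""
    files = sorted(glob.glob(pattern))
    # One maximum per file, in sorted file order
    maxima = [float(pd.read_csv(f)[column].max()) for f in files]
    rounds = list(range(1, len(files) + 1))
    return rounds, maxima


# Hypothetical demo: three small CSV files written to a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    for i, values in enumerate([[1, 5, 3], [7, 2, 4], [6, 9, 8]], start=1):
        path = os.path.join(tmp, f"round_{i}.csv")
        pd.DataFrame({"score": values}).to_csv(path, index=False)
    rounds, maxima = max_per_file(os.path.join(tmp, "*.csv"), "score")

print(rounds, maxima)
```

The two lists can then go to a single call such as `plt.plot(rounds, maxima, marker='o')` followed by `plt.show()`.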
I'm trying to solve a correlation problem where I need to find where a pattern sequence is found inside a signal sequence. At some point I was able to find the correct solution only to begin trying to optimize the code, and the code I had accomplished wasn't saved. Now the cross correlation function just won't solve correctly and I don't know why. I have restarted the kernel multiple times.
Here is the code and the links to the text files that contain the signal and the pattern.
https://drive.google.com/file/d/1tBzHMUfmcx_gGR0arYPaQ5GB9MybXKRv/view?usp=sharing
https://drive.google.com/file/d/1TeSe9t8TeVHEp2BxKXYz6Ndlpah--yLg/view?usp=sharing
import numpy as np
import matplotlib.pyplot as plt
patron = np.loadtxt('patron.txt', delimiter=',', skiprows=1)
senal = np.loadtxt('señal.txt', delimiter=',', skiprows=1)
Fs=100
ts = np.arange(0,len(senal))
plt.figure(figsize=(20,8))
plt.subplot(3,1,1)
plt.plot(ts,patron)
plt.subplot(3,1,2)
plt.plot(ts,senal)
corr = np.correlate(senal,patron,"same")
print(np.where(corr == np.amax(corr))) #this should be where correlation reaches its maximum value, and where the functions are most "similar"
plt.subplot(3,1,3)
plt.plot(ts,corr, 'r')
How do I know I had it right? I plotted the "senal" sequence shifted 799 places (the value I had when the code was right) with:
np.roll(senal,799)
plt.plot(senal)
which resulted in this graph. It looks pretty intuitive when it resulted in a maximum correlation at index 799:
Hello, I flipped the 'patron' and 'senal' arguments in the correlate function and it seems good:
import numpy as np
import matplotlib.pyplot as plt
patron = np.loadtxt('patron.txt', delimiter=',', skiprows=1)
senal = np.loadtxt('señal.txt', delimiter=',', skiprows=1)
Fs=100
ts = np.arange(0,len(senal))
plt.figure(figsize=(20,8))
plt.subplot(3,1,1)
plt.plot(ts,patron)
plt.subplot(3,1,2)
plt.plot(ts,senal)
corr = np.correlate(patron,senal,'same')
print(np.argmax(corr)) #this should be where correlation reaches its maximum value, and where the functions are most "similar"
plt.subplot(3,1,3)
plt.plot(corr, 'r')
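For reference, `np.correlate(a, v)` computes c[k] = sum over n of a[n+k] * conj(v[n]), so the argument order and the mode both change which index the peak lands on. Here is a minimal sketch with synthetic data (not the linked files), assuming the pattern is shorter than the signal: in "valid" mode, with the signal as the first argument, the index of the maximum is the offset at which the pattern starts. The names `pattern`, `signal`, and `true_offset` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic pattern hidden in low-level noise at a known offset
pattern = rng.standard_normal(200)
signal = 0.01 * rng.standard_normal(2000)
true_offset = 799
signal[true_offset:true_offset + pattern.size] += pattern

# "valid" keeps only the lags where the pattern fully overlaps the signal,
# so index k of the result corresponds to the pattern starting at signal[k]
corr = np.correlate(signal, pattern, mode="valid")
found_offset = int(np.argmax(corr))
print(found_offset)
```

When both sequences have the same length, as they appear to in the post, "valid" yields a single value, so "same" or "full" is needed and the peak index has to be converted back to a lag.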
I am really new in Python and I hope this is the right community for my question. Sorry if it is not.
I am trying to import data from a .txt file with pandas.
The file looks like this:
# Raman Scattering Spectrum
# X-Axis: Frequency (cm-1)
# Y-Axis: Intensity (10-36 m2 cm/sr)
# Harmonic Data
# Peak information (Harmonic)
# X Y
# 20.1304976000 1.1465331676
# 25.5433266000 6.0306906544
...
# 3211.8081700000 0.3440113123
# 3224.5118500000 0.8814596030
# Plot Curve (Harmonic)
# X Y DY/DX
0.0000000000 8.4803414671 0.6546818124
8.0000000000 17.8239097502 2.0146387573
I already wrote this piece of code to import my data:
import pandas as pd
# import matplotlib as plt
# import scipy as sp
data = pd.read_csv('/home/andrea/Schreibtisch/raman_gauss.txt', sep='\t')
data
Now I just get one column.
If I try it with
pd.read_fwf(file)
I get 3 columns, but the x and y values from Plot Curve (Harmonic) are in one column.
Now I want to import from Plot Curve (Harmonic) the x, y and DY/DX values in different variables or containers as series.
The hard part for me is how to split x and y into 2 columns, and how to tell Python that the import should start at the line number of Plot Curve (Harmonic) plus 2 lines.
I have thought about it, and my idea was to check all containers for the string 'Plot Curve (Harmonic)'. Then I get a new series with True or False. Then I need to read out which line number is True for the search word. And then I start the import from this line...
I am too much of a newbie in Python, and I am not yet familiar enough with the documentation to find the command I must use.
Does anyone have tips for me, a command or something? And how do I split the columns?
Thank you very much!
You can read as follows.
Code
import pandas as pd
import re  # Regex to parse header

def get_data(filename):
    # Find row containing 'Plot Curve (Harmonic)'
    with open(filename, 'r') as f:
        for i, line in enumerate(f):
            if 'Plot Curve (Harmonic)' in line:
                start_row = i
                # Parse header on next line
                header = re.findall(r'\S+', next(f))[1:]
                # [1:] to skip '#' at beginning of line
                break
        else:
            start_row = None  # not found
    if start_row is None:
        return None  # marker line missing, nothing to read
    # Use delimiter=r"\s+": since there are multiple spaces between numbers
    # skiprows = start_row+2: to skip to data
    #   (skip current and header row)
    # reference: https://thispointer.com/pandas-skip-rows-while-reading-csv-file-to-a-dataframe-using-read_csv-in-python/
    # names = header: assigns column names
    df = pd.read_csv(filename, delimiter=r"\s+", skiprows=start_row+2,
                     names=header)
    return df
Test
df = get_data('data.txt')
print(df)
data.txt file
# Raman Scattering Spectrum
# X-Axis: Frequency (cm-1)
# Y-Axis: Intensity (10-36 m2 cm/sr)
# Harmonic Data
# Peak information (Harmonic)
# X Y
# 20.1304976000 1.1465331676
# 25.5433266000 6.0306906544
...
# 3211.8081700000 0.3440113123
# 3224.5118500000 0.8814596030
# Plot Curve (Harmonic)
# X Y DY/DX
0.0000000000 8.4803414671 0.6546818124
8.0000000000 17.8239097502 2.0146387573
Output
     X          Y     DY/DX
0  0.0   8.480341  0.654682
1  8.0  17.823910  2.014639
First: Thank you very much for your answer. It helps me a lot.
I tried to use the comment function, but I cannot add a line break there.
I want to plot the data that I can now extract from the file, but when I add my standard plot code:
plt.plot(df.X, df.Y)
plt.legend(['simulated'])
plt.xlabel('raman_shift')
plt.ylabel('intensity')
plt.grid(True)
plt.show()
I now get the error:
TypeError Traceback (most recent call last)
<ipython-input-240-8594f8545868> in <module>
28 plt.plot(df.X, df.Y)
29 plt.legend(['simulated'])
---> 30 plt.xlabel('raman_shift')
31 plt.ylabel('intensity')
32 plt.grid(True)
TypeError: 'str' object is not callable
I have not changed anything in the label function. In my other project these lines work well.
I also don't know how to read out the DY/DX column; the '/' cannot be used in the column name.
Do you have a tip for me again? :)
Thanks.
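On the two follow-up points: the traceback says that `plt.xlabel` is a string at that moment, which happens when something earlier in the same session assigned to it (for example `plt.xlabel = 'raman_shift'` instead of calling it). If that is the case here, restarting the kernel restores the function. For the column, attribute access such as `df.X` only works for names that are valid Python identifiers, while bracket access works for any name. A small sketch using the two data rows from the question (the new name `dydx` is just an example):

```python
import pandas as pd

# The two data rows shown in the question
df = pd.DataFrame({
    "X": [0.0, 8.0],
    "Y": [8.4803414671, 17.8239097502],
    "DY/DX": [0.6546818124, 2.0146387573],
})

# Bracket access works for any column name, including ones with '/'
slope = df["DY/DX"]

# Optional: rename the column so attribute access works as well
renamed = df.rename(columns={"DY/DX": "dydx"})
print(slope.tolist(), renamed.dydx.tolist())
```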
I am having an issue that I really can not figure out (honestly I don't even know where to start).
I have a data set for which I calculate the mean using numpy, after which I need to plot a histogram using pyplot. The issue is that after importing matplotlib.pyplot the mean changes each time I run the script. If I comment out the "import matplotlib.pyplot as plt" line everything works fine though. Here is my code if you need to see it :
#!/usr/bin/env python
import csv
import numpy as np
import matplotlib.pyplot as plt
### READ DATA ###
table = []
with open ('data.csv', 'r') as f:
    reader = csv.reader(f)
    for row in reader:
        table.append(row)
f.close()
names = table[0]
data = np.array(table)
ind = 0
for n in names:
    if(n == "dataset8"):
        dataset8 = np.array(data[1:, ind], "int32")
    if (n == "dataset10"):
        dataset10 = np.array(data[1:,ind], "int32")
    ind += 1
### GET MEAN VALUE of datasets ###
print "avg dataset8 = " + str(np.mean(dataset8))
print "avg dataset10 = " + str(np.mean(dataset10))
np.mean(dataset8) is the value that changes every time I run the script (only when "import matplotlib.pyplot" is included), while np.mean(dataset10) works fine.
Does anyone have any ideas?
Tom
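The cause cannot be read off the posted code alone. One way to narrow it down, sketched below under the assumption that the file has a header row with the column names, is to convert every field explicitly with `int()` while reading instead of handing an array of strings to `np.array(..., "int32")`. If the mean is stable this way, the string-to-integer conversion is the place to look. The helper name `read_columns` and the sample values are hypothetical.

```python
import csv
import io

import numpy as np


def read_columns(csvfile, wanted):
    """Return {column name: int64 array} for the requested header names."""
    columns = {name: [] for name in wanted}
    for row in csv.DictReader(csvfile):
        for name in wanted:
            # Explicit conversion raises a clear error on a malformed field
            columns[name].append(int(row[name]))
    return {name: np.array(vals, dtype=np.int64) for name, vals in columns.items()}


# Hypothetical stand-in for data.csv
sample = io.StringIO("dataset8,dataset10\n1,10\n2,20\n3,30\n")
cols = read_columns(sample, ["dataset8", "dataset10"])
means = {name: float(np.mean(values)) for name, values in cols.items()}
print(means)
```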
To whomever,
I am having a graphing problem where it seems that previous data is being stacked on top of new data. I wanted to find a way to separate these so that I can get individual graphs per data set.
Briefly, before we get into the script, let me tell you what you're looking at. I have 8 data sets, each one named somethingsomethingsomething...n=0,1,...,7. So there are 8 different files with different sets of values for the wavelength (here I named it WL) and Stokes parameters (here I named them SI, SQ, SU, SV). I was told to make some graphs of them, so here we are.
The following is what I have:
the base
import matplotlib.pyplot as plt
import numpy as np
import scipy.constants as c
from scipy.interpolate import spline
import re
something to tell the program to not worry about random spaces in data set files
split_on_spaces = re.compile(" +").split
defining the arrays
WL = np.array([])
SI = np.array([])
SQ = np.array([])
SU = np.array([])
SV = np.array([])
code for data interpretation
with open('C:\\Users\\Schmidt\\Desktop\\Python\\Homework_4\\CoolStuffLivesHere\\stokes_profiles_1.txt') as f:
    for line in f:
        data=split_on_spaces(line.strip())
        if len(data) == 0:
            continue
        if len(data) != 5:
            sys.stderr.write("BAD LINE: {}".format(repr(line)))
            continue
        WL = np.append(WL, float(data[0]))
        SI = np.append(SI, data[1])
        SQ = np.append(SQ, data[2])
        SU = np.append(SU, data[3])
        SV = np.append(SV, data[4])
plotting sequence
plt.plot(WL,SI)
plt.show()
Then rinse and repeat for the other 3 parameters and then rinse and repeat for the other data sets as well. It works real fine for the first rendering. However for subsequent graphs it looks more like these: first example, second example.
So, in a nutshell, what line of code should I type in, and where, to resolve my graph stacking issue?
Without getting into subplots, you're just adding to the original plot. You need to close it if you want to re-use it.
i.e.
plt.plot(WL,SI)
plt.show()
plt.close()
plt.plot(WL,SQ)
Unless you want them on the same plot.
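Separately from closing the figure, two details in the question's loading code may contribute to the stacked look: the arrays are created once, so repeating the loop for another file keeps appending to them unless they are re-created, and only WL is converted with `float()`, so the Stokes columns are appended as strings. A minimal sketch of a loader that returns fresh float arrays on every call, assuming five whitespace-separated columns per line as in the question (the name `load_profile` and the demo files are hypothetical):

```python
import os
import tempfile

import numpy as np

COLUMNS = ("WL", "SI", "SQ", "SU", "SV")


def load_profile(path):
    """Return {column name: float array} for one file, built from scratch each call."""
    rows = []
    with open(path) as f:
        for line in f:
            fields = line.split()  # splits on any run of whitespace
            if len(fields) != len(COLUMNS):
                continue  # blank or malformed line
            rows.append([float(x) for x in fields])
    data = np.array(rows, dtype=float).reshape(-1, len(COLUMNS))
    return {name: data[:, i] for i, name in enumerate(COLUMNS)}


# Hypothetical demo: two small files with a blank line in the middle
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for n in range(2):
        path = os.path.join(tmp, f"stokes_demo_{n}.txt")
        with open(path, "w") as f:
            f.write("6301.0  1.0  0.1  0.2  0.3\n\n6301.1  0.9  0.0  0.1  0.2\n")
        paths.append(path)
    profiles = [load_profile(p) for p in paths]

print([len(p["WL"]) for p in profiles])
```

Each profile can then be drawn on its own figure, for example by calling `plt.figure()` before `plt.plot(p["WL"], p["SI"])` for every file and parameter.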