ValueError: Could not convert string to float: "nbformat":4

After a long calculation, I ended up with files that contain lines like the following.
(The values on each line are separated by "\t", and each line ends with "\n".)
0.0000008375000 829.685601736 555.939928236
0.0000008376000 829.511081539 555.889353246
0.0000008377000 829.336613968 555.838785601
0.0000008378000 829.162199002 555.7882253
0.0000008379000 828.987836621 555.737672342
0.0000008380000 828.813526805 555.687126727
0.0000008381000 828.639269533 555.636588453
Then I tried to plot these files. (The file names all start with P.)
fList = np.array(gl.glob("P*"))
for i in fList:
    f = open(i, "r")
    data = f.read()
    data = data.replace("\n", "\t")
    data = np.array(data.split("\t"))[:-1].reshape(-1,3)
    plt.plot(data[:,0], data[:,1], label=i)
Then I got the following error.
(The traceback points to the line plt.plot(data[:,0], data[:,1], label=i).)
ValueError: could not convert string to float: "nbformat": 4,
I've looked through other tutorials and walkthroughs, but I could not work out how to fix this. Any help or advice would be greatly appreciated.

You can use numpy directly to read each file into three arrays:
import numpy as np
import matplotlib.pyplot as plt
from glob import glob
fList = glob("P*")
for i in fList:
    x,y,z = np.loadtxt(i, unpack=True)
    plt.plot(x,y, label=i)
plt.legend()
plt.show()
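A note on the error text itself: "nbformat": 4 is a key that appears in Jupyter notebook (.ipynb) files, so one plausible cause, which the question does not confirm, is that glob("P*") also matched a notebook whose name starts with P. The sketch below assumes that is what happened and skips such files before loading. The helper name load_columns and the file names P001 and Plot.ipynb are made up for illustration.

```python
import tempfile
from pathlib import Path

import numpy as np


def load_columns(folder, pattern="P*"):
    """Load every matching data file, skipping Jupyter notebooks."""
    columns = {}
    for path in sorted(Path(folder).glob(pattern)):
        if path.suffix == ".ipynb":  # notebooks are JSON, not numeric tables
            continue
        columns[path.name] = np.loadtxt(path, unpack=True)
    return columns


# Build a throwaway folder with one data file and one notebook-like file.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "P001").write_text(
        "0.0000008375000\t829.685601736\t555.939928236\n"
        "0.0000008376000\t829.511081539\t555.889353246\n"
    )
    Path(tmp, "Plot.ipynb").write_text('{"nbformat": 4, "cells": []}')
    loaded = load_columns(tmp)

print(sorted(loaded))        # only the data file remains
print(loaded["P001"].shape)  # three columns, two rows each
```

If the files to plot share an extension, a narrower pattern such as "P*.txt" would achieve the same thing without an explicit check.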

Related

Python - ValueError: could not convert string to float

I am a beginner in Python and I'm trying to plot some data from a file. The code is the following:
import matplotlib.pyplot as plt
import pandas as pd
from scipy.signal import find_peaks
import os
dataFrame = pd.read_csv('soporte.txt', sep='\t',skiprows=1, encoding = 'utf-8-sig')
x = dataFrame['Wavelength nm.']
y = dataFrame['Abs.']
indices, _ = find_peaks(y, threshold=1)
plt.plot(x, y)
plt.show()
And I get the following error:
ValueError: could not convert string to float: '-0,04008'
I'll show you a piece of the file I am trying to work with:
"soporte.spc - RawData"
"Wavelength nm." "Abs."
180,0 -0,04008
181,0 -0,00084
182,0 -0,00746
183,0 0,00854
184,0 -0,01525
185,0 -0,00354
Thank you very much!!!

Use the decimal=',' option in pandas, i.e.,
dataFrame = pd.read_csv('soporte.txt', sep='\t',skiprows=1, encoding = 'utf-8-sig', decimal=',')
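As a small illustration of what the decimal option changes, the sketch below parses an in-memory copy of the sample lines from the question, once without and once with decimal=','. It assumes the columns in the real file are separated by tabs, as the sep='\t' argument in the question suggests.

```python
import io

import pandas as pd

# In-memory copy of the sample from the question (tab-separated by assumption).
raw = (
    '"soporte.spc - RawData"\n'
    '"Wavelength nm."\t"Abs."\n'
    "180,0\t-0,04008\n"
    "181,0\t-0,00084\n"
    "182,0\t-0,00746\n"
)

# Without decimal=',' the values stay as text such as '-0,04008'.
as_text = pd.read_csv(io.StringIO(raw), sep="\t", skiprows=1)

# With decimal=',' pandas treats the comma as the decimal separator.
as_numbers = pd.read_csv(io.StringIO(raw), sep="\t", skiprows=1, decimal=",")

print(as_text["Abs."].tolist())
print(as_numbers["Abs."].tolist())
```

Only the second frame holds floats, which is what find_peaks and plt.plot need.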

Syntax Errors Regarding Amplitude

Good afternoon,
I am currently seeking to compare the voltage amplitude versus time for measurements from an oscilloscope. I am running my code from a Linux terminal and I am currently experiencing the following errors:
ValueError: Invalid number of FFT data points (0) specified.
NameError: name 'yf' is not defined
My code is posted below:
import csv
import matplotlib.pyplot as plt
import numpy as np
from scipy.fftpack import fft
sample_interval= -1
sample_num = -1
time = []
amplitude = []
with open('nofilter-1.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    for row in readCSV:
        time.append(row[3]);
        amplitude.append(row[4]);
        if(row ==1):
            sample_interval = row[1]
        if(row ==2):
            sample_num = row[1]
# sample spacing
print("syntax")
yf = fft(amplitude)
xf = np.linspace(0.0, 1.0/(2.0*sample_interval), sample_num/2)
fig, ax = plt.subplots()
ax.plot(xf, 2.0/sample_num * np.abs(yf[:sample_num//2]))
plt.show()
Am I running into any syntax errors, or have I defined a variable improperly?

Sorry for the late reply, all! Here's a snippet of the .csv file that I am working with.
[Image: snippet of the .csv file]
As can be seen, columns one and three contain strings in some form. After this was pointed out, I noticed that I may have mixed up the rows and columns in my code. I followed bobrobbob's advice but had very little luck.
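Two things in the posted code can be checked without seeing the file. The test row == 1 compares a list with an integer, so it is never true and sample_interval and sample_num keep their initial value of -1. In addition, csv.reader yields strings, so the values need converting before an FFT. The sketch below shows one way to handle both. The CSV layout in it is invented, since the real file is only shown as an image: it assumes the sample interval and sample count sit in column 1 of the first two rows, and that time and amplitude sit in columns 3 and 4 of every row. It also uses numpy's rfft in place of scipy.fftpack.fft.

```python
import csv
import io

import numpy as np

# Invented stand-in for the oscilloscope export; the real layout is unknown.
csv_text = (
    "Sample Interval,0.001,s,0.000,0.0\n"
    "Record Length,8,points,0.001,1.0\n"
    ",,,0.002,0.0\n"
    ",,,0.003,-1.0\n"
    ",,,0.004,0.0\n"
    ",,,0.005,1.0\n"
    ",,,0.006,0.0\n"
    ",,,0.007,-1.0\n"
)

times, amplitudes = [], []
sample_interval = sample_num = None
# enumerate() supplies the row index that `row == 1` was meant to test.
for index, row in enumerate(csv.reader(io.StringIO(csv_text))):
    if index == 0:
        sample_interval = float(row[1])
    elif index == 1:
        sample_num = int(row[1])
    times.append(float(row[3]))       # csv yields strings, so convert
    amplitudes.append(float(row[4]))

# One-sided spectrum and the matching frequency axis in Hz.
spectrum = np.fft.rfft(amplitudes)
freqs = np.fft.rfftfreq(sample_num, d=sample_interval)
peak_hz = freqs[np.argmax(np.abs(spectrum))]
print(peak_hz)
```

The made-up signal repeats every four samples at a 1 ms interval, so the strongest bin should land near 250 Hz; with the real file, the row and column indices would need adjusting to match its actual layout.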

Save contour images generated in a loop as a single pdf file (2 images per page preferably)

I have written this code which will generate a number of contour plots, each of which corresponds to a single text file. I have multiple text files. Currently, I am able to generate all of the images separately in png format without any issues.
When I try to save the images to a pdf file, only the last image generated in the loop is saved. I tried using the PdfPages class. This question is similar to one that I posted before, but asks something different.
Issue: I want to be able to generate all of the images into a single pdf file automatically from Python. For example, if I have 100 text files, I want to save all 100 images to a single pdf file. Ideally, I would also like to have 2 images on each page of the pdf file. There are some questions on SO about this, but I couldn't find an appropriate solution for my issue. Since I have many cases for which I have to generate images, I want to save them as a single pdf file because that makes them easier to analyze. I would appreciate any suggestions or advice.
This is the link to the sample text file: Sample Text
from __future__ import print_function
import numpy as np
from matplotlib import pyplot as plt
from scipy.interpolate import griddata
from matplotlib.backends.backend_pdf import PdfPages
path = 'location of the text files'
FT_init = 5.4311
delt = 0.15
TS_init = 140
dj_length = 2.4384
def streamfunction2d(y,x,Si_f,q):
    with PdfPages('location of the generated pdf') as pdf:
        Stf= plt.contour(x,y,Si_f,20)
        Stf1 = plt.colorbar(Stf)
        plt.clabel(Stf,fmt='%.0f',inline=True)
        plt.figtext(0.37,0.02,'Flowtime(s)',style= 'normal',alpha=1.0)
        plt.figtext(0.5,0.02,str(q[p]),style= 'normal',alpha=1.0)
        plt.title('Streamfunction_test1')
        plt.hold(True)
        plt.tight_layout()
        pdf.savefig()
        path1 = 'location where the image is saved'
        image = path1+'test_'+'Stream1_'+str((timestep[p]))+'.png'
        plt.savefig(image)
        plt.close()
timestep = np.linspace(500,600,2)
flowtime = np.zeros(len(timestep))
timestep = np.array(np.round(timestep),dtype = 'int')
###############################################################################
for p in range(len(timestep)):
    if timestep[p]<TS_init:
        flowtime[p] = 1.1111e-01
    else:
        flowtime[p] = (timestep[p]-TS_init)*delt+FT_init
    q = np.array(flowtime)
    timestepstring=str(timestep[p]).zfill(4)
    fname = path+"ddn150AE-"+timestepstring+".txt"
    f = open(fname,'r')
    data = np.loadtxt(f,skiprows=1)
    data = data[data[:, 1].argsort()]
    data = data[np.logical_not(data[:,11]== 0)]
    Y = data[:,2] # Assigning Y to column 2 from the text file
    limit = np.nonzero(Y==dj_length)[0][0]
    Y = Y[limit:]
    Vf = data[:,11]
    Vf = Vf[limit:]
    Tr = data[:,9]
    Tr = Tr[limit:]
    X = data[:,1]
    X = X[limit:]
    Y = data[:,2]
    Y = Y[limit:]
    U = data[:,3]
    U = U[limit:]
    V = data[:,4]
    V = V[limit:]
    St = data[:,5]
    St = St[limit:]
    ###########################################################################
    ## Using griddata for interpolation from Unstructured to Structured data
    # resample onto a 300x300 grid
    nx, ny = 300,300
    # (N, 2) arrays of input x,y coords and dependent values
    pts = np.vstack((X,Y )).T
    vals = np.vstack((Tr))
    vals1 = np.vstack((St))
    # The new x and y coordinates for the grid
    x = np.linspace(X.min(), X.max(), nx)
    y = np.linspace(Y.min(), Y.max(), ny)
    r = np.meshgrid(y,x)[::-1]
    # An (nx * ny, 2) array of x,y coordinates to interpolate at
    ipts = np.vstack(a.ravel() for a in r).T
    Si = griddata(pts, vals1, ipts, method='linear')
    print(Ti.shape,"Ti_Shape")
    Si_f = np.reshape(Si,(len(y),len(x)))
    print(Si_f.shape,"Streamfunction Shape")
    Si_f = np.transpose(Si_f)
    streamfunction2d(y,x,Si_f,q)

Edit: As you mentioned, matplotlib can probably handle everything by itself using PdfPages. See this related answer. My original answer is a hack.
I think the error in your code is that you create a new PdfPages object each time you go through the loop, so each pass overwrites the file written by the previous one. My advice would be to create the PdfPages object once, before the loop, and pass it as an argument to your streamfunction2d function (using a with statement, as in the documentation, seems a good idea).
Example:
def streamfunction2d(y,x,Si_f,q,pdf):
    # (...)
    pdf.savefig(plt.gcf())
with PdfPages('output.pdf') as pdf:
    for p in range(len(timestep)):
        # (...)
        streamfunction2d(y,x,Si_f,q,pdf)
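For the "2 images per page" part of the question, one option is to draw two subplots on each figure and save one figure per page. The following is a minimal sketch along those lines, not the asker's actual pipeline: the helper name save_two_per_page is made up, and synthetic arrays stand in for the interpolated Si_f fields.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.backends.backend_pdf import PdfPages


def save_two_per_page(fields, pdf_path):
    """Write contour plots of the 2-D arrays in `fields`, two per PDF page."""
    with PdfPages(pdf_path) as pdf:          # opened once, outside the loop
        for start in range(0, len(fields), 2):
            chunk = fields[start:start + 2]
            fig, axes = plt.subplots(2, 1, figsize=(6, 8))
            for ax, field in zip(axes, chunk):
                contours = ax.contour(field, 20)
                fig.colorbar(contours, ax=ax)
            # Hide the unused axes when the number of fields is odd.
            for ax in axes[len(chunk):]:
                ax.set_visible(False)
            fig.tight_layout()
            pdf.savefig(fig)                 # each call appends one page
            plt.close(fig)
        return pdf.get_pagecount()


# Stand-in data: five synthetic fields instead of the real Si_f arrays.
grid_x, grid_y = np.meshgrid(np.linspace(0, 1, 30), np.linspace(0, 1, 30))
fields = [np.sin((k + 1) * grid_x) * np.cos(grid_y) for k in range(5)]
pdf_path = os.path.join(tempfile.mkdtemp(), "contours.pdf")
pages = save_two_per_page(fields, pdf_path)
print(pages)  # five plots at two per page give three pages
```

Passing explicit figure and axes objects around, rather than relying on plt's current figure, keeps each page independent and makes it easy to close figures as soon as they are saved.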
Original answer:
Here is a quick and dirty solution using the pdfunite software.
from matplotlib import pyplot as plt
import numpy as np
import subprocess
import os
X = np.linspace(0,1,100)
for i in range(10):
    # random plot
    plt.plot(X,np.cos(i*X))
    # Save each figure as a pdf file.
    plt.savefig("page_{:0}.pdf".format(i))
    plt.clf()
# Calling pdfunite to merge all the pages
subprocess.call("pdfunite page_*.pdf united.pdf",shell=True)
# Removing temporary files
for i in range(10):
    os.remove("page_{:0}.pdf".format(i))
It uses two things:
You can save your figures as pdf using matplotlib's savefig command.
You can call other programs using the subprocess library. I used pdfunite to merge all the pages. Be sure it is available on your machine!
If you want to have several graphs per page, you can use subplots.
Alternatively, you could use another Python library (such as pyPDF) to merge the pages, but it would require slightly more code. Here is an (untested) example:
from matplotlib import pyplot as plt
import numpy as np
from pyPdf import PdfFileWriter, PdfFileReader
# create an empty pdf file
output = PdfFileWriter()
X = np.linspace(0,1,100)
for i in range(10):
    # random plot
    plt.plot(X,np.cos(i*X))
    # Save each figure as a pdf file.
    fi = "page_{:0}.pdf".format(i)
    plt.savefig(fi)
    plt.clf()
    # add it to the end of the output
    # (open() is used here because file() exists only in Python 2)
    reader = PdfFileReader(open(fi, "rb"))
    output.addPage(reader.getPage(0))
# Save the resulting pdf file.
with open("document-output.pdf", "wb") as outputStream:
    output.write(outputStream)

Histograms in Python using matplotlib

I'm trying to make a histogram. I've been searching for the right code, but nothing I try ends up working. This is my code right now:
import matplotlib.pyplot as plt
import numpy as np
with open('gaubg.csv') as f:
    v = np.loadtxt(f, delimiter= ',', dtype="float", skiprows=1, usecols='None')
plt.hist(v, bins=100)
plt.xlabel("G-r0")
plt.ylabel('# of stars')
plt.title("Bottom half g-r0")
plt.show()
gaubg.csv is a csv file with about 600,000 data points (floats, not ints) that describe the color of stars. Every time I run this through Python, the following error message shows up:
Traceback (most recent call last):
  File "gaub.py", line 5, in <module>
    v = np.loadtxt(f, delimiter= ',', dtype="float", skiprows=1, usecols='None')
  File "/sdss/ups/prd/numpy/v1_6_1/Linux/lib/python2.7/sitepackages/numpy/lib/npyio.py", line 794, in loadtxt
    vals = [vals[i] for i in usecols]
TypeError: list indices must be integers, not str
I have no idea what that means. I've been trying to fix the code but I'm not sure how. If you could point out the obvious error(s) I'd be grateful!

usecols= 'None'
should be
usecols= None
Alternatively, you can omit the usecols argument altogether. Because you passed the string 'None', numpy iterated over its characters and tried to use each one as a column index, and a string is not a valid list index.
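To make the difference concrete, here is a small sketch using in-memory text in place of gaubg.csv (the values and header names are made up). It shows usecols=None, which reads every column, and a tuple of integer indices, which selects specific ones.

```python
import io

import numpy as np

# Made-up stand-in for gaubg.csv: one header row, one column of floats.
csv_text = "g_r0\n0.41\n0.52\n0.37\n"

# usecols=None is the default and means "read every column".
v = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1, usecols=None)

# To pick columns, pass integer indices rather than a string.
two_cols = "a,b\n1.0,10.0\n2.0,20.0\n"
second = np.loadtxt(io.StringIO(two_cols), delimiter=",", skiprows=1, usecols=(1,))

print(v)
print(second)
```

The resulting one-dimensional array v can be passed straight to plt.hist.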

Matplotlib pcolor

I am using Matplotlib to create an image from some data. All of the data falls in the range 0 to 1, and I am trying to color the data by value using a colormap. This works perfectly in Matlab, but after converting the code to Python I simply get a black square as the output. I believe this is because I am plotting the image incorrectly, so that all the data is plotted as 0. I have searched on this problem for several hours and tried plt.set_clim([0, 1]), but that didn't seem to do anything. I am new to Python and Matplotlib, although not new to programming (Java, JavaScript, PHP, etc.), and I cannot see where I am going wrong. If anybody can see anything glaringly incorrect in my code, I would be extremely grateful.
Thank you
from numpy import *
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.colors as myColor
e1cx=[]
e1cy=[]
e1cz=[]
print("Reading files...")
in_file = open("eigenvector_1_component_x.txt", "rt")
for line in in_file.readlines():
    e1cx.append([])
    for i in line.split():
        e1cx[-1].append(float(i))
in_file.close()
in_file = open("eigenvector_1_component_y.txt", "rt")
for line in in_file.readlines():
    e1cy.append([])
    for i in line.split():
        e1cy[-1].append(float(i))
in_file.close()
in_file = open("eigenvector_1_component_z.txt", "rt")
for line in in_file.readlines():
    e1cz.append([])
    for i in line.split():
        e1cz[-1].append(float(i))
in_file.close()
print("...done")
nx = 120
ny = 128
nz = 190
fx = zeros((nz,nx,ny))
fy = zeros((nz,nx,ny))
fz = zeros((nz,nx,ny))
z = 0
while z<nz-1:
    x = 0
    while x<nx:
        y = 0
        while y<ny:
            fx[z][x][y]=e1cx[(z*128)+y][x]
            fy[z][x][y]=e1cy[(z*128)+y][x]
            fz[z][x][y]=e1cz[(z*128)+y][x]
            y += 1
        x += 1
    z+=1
    if((z % 10) == 0):
        plt.figure(num=None)
        plt.axis("off")
        normals = myColor.Normalize(vmin=0,vmax=1)
        plt.pcolor(fx[z][:][:],cmap='spectral', norm=normals)
        filename = 'Imagex_%d' % z
        plt.savefig(filename)
        plt.colorbar(ticks=[0,2,4,6], format='%0.2f')

Although you have resolved your original issue and have code that works, I wanted to point out that both Python and numpy provide several tools that make code like this much simpler to write. Here are a few examples:
Loading data
Instead of building up lists by appending to the end of an empty one, it is often easier to generate them from other lists. For example, instead of
e1cx = []
for line in in_file.readlines():
    e1cx.append([])
    for i in line.split():
        e1cx[-1].append(float(i))
you can simply write:
e1cx = [[float(i) for i in line.split()] for line in in_file]
The syntax [x(y) for y in l] is known as a list comprehension. In addition to being more concise, it usually executes more quickly than the equivalent for loop.
However, for loading tabular data from a text file, it is even simpler to use numpy.loadtxt:
import numpy as np
e1cx = np.loadtxt("eigenvector_1_component_x.txt")
For more information, run
print(np.loadtxt.__doc__)
See also its slightly more sophisticated cousin, numpy.genfromtxt.
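As a quick check that the two loading approaches agree, the sketch below runs both on the same in-memory text. The three-column sample is made up and stands in for one of the eigenvector files.

```python
import io

import numpy as np

# Made-up stand-in for a whitespace-separated data file.
text = "0.1 0.2 0.3\n0.4 0.5 0.6\n"

# List comprehension: one inner list of floats per line.
rows = [[float(i) for i in line.split()] for line in io.StringIO(text)]

# np.loadtxt: the same values, returned as a 2-D array.
table = np.loadtxt(io.StringIO(text))

print(rows)         # [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
print(table.shape)  # (2, 3)
```

The list comprehension gives nested Python lists, while loadtxt gives an array that can be reshaped and sliced directly.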
Reshaping data
Now that we have our data loaded, we need to reshape it. The while loops you use work fine, but numpy provides an easier way. First, if you prefer to use your method of loading the data, then convert your eigenvector arrays into proper numpy arrays using e1cx = array(e1cx), etc.
The array class provides methods for rearranging how the data in an array is indexed without requiring it to be copied. The simplest method is array.reshape, which will do half of what your while loops do:
almost_fx = e1cx.reshape((nz,ny,nx))
Here, almost_fx is a rank-3 array indexed as almost_fx[iz,iy,ix]. One important thing to be aware of is that e1cx and almost_fx share their data. So, if you change e1cx[0,0], you will also change almost_fx[0,0,0].
In your code, you swapped the x and y locations. If this is indeed what you wanted to do, you can accomplish this with array.swapaxes:
fx = almost_fx.swapaxes(1,2)
Of course, you could always combine this into one line
fx = e1cx.reshape((nz,ny,nx)).swapaxes(1,2)
However, if you want the z-slices (fx[z,:,:]) to plot with x horizontal and y vertical, you probably do not want to swap the axes above. Just reshape and plot.
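The reshape and swapaxes steps can be verified on a small array. The sketch below uses tiny made-up dimensions in place of 190, 128 and 120, and checks both the index mapping used by the while loops and the fact that the results share data with the original array.

```python
import numpy as np

nz, ny, nx = 2, 3, 4  # small sizes for illustration only
# Same layout as the files: nz*ny rows, nx columns.
e1cx = np.arange(nz * ny * nx, dtype=float).reshape(nz * ny, nx)

almost_fx = e1cx.reshape((nz, ny, nx))  # indexed [iz, iy, ix]
fx = almost_fx.swapaxes(1, 2)           # indexed [iz, ix, iy]

# Row z*ny + y, column x of the file layout ends up at fx[z, x, y].
z, y, x = 1, 2, 3
print(fx[z, x, y] == e1cx[z * ny + y, x])  # True

# Both results are views, so writing to e1cx shows through.
e1cx[0, 0] = -1.0
print(almost_fx[0, 0, 0], fx[0, 0, 0])     # -1.0 -1.0
print(fx.shape)                            # (2, 4, 3)
```

If an independent array is needed, calling .copy() on the result breaks the link to the original data.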
Slicing arrays
Finally, rather than looping over the z-index and testing for multiples of 10, you can loop directly over a slice of the array using:
for fx_slice in fx[::10]:
    # plot fx_slice and save it
This indexing syntax is array[start:end:step], where start is included in the result and end is not. Leaving start blank implies 0, while leaving end blank implies the end of the array.
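The stepped slice can be tried on a small array in which every z-slice is filled with its own index, so it is easy to see which slices the loop visits. The array shape here is made up and much smaller than the one in the question.

```python
import numpy as np

# 35 z-slices of shape (2, 3); slice number z is filled with the value z.
fx = np.zeros((35, 2, 3))
for z in range(fx.shape[0]):
    fx[z] = z

# fx[::10] walks the first axis in steps of 10: z = 0, 10, 20, 30.
picked = [int(fx_slice[0, 0]) for fx_slice in fx[::10]]
print(picked)  # [0, 10, 20, 30]

# enumerate recovers the original z index from the position in the slice.
recovered = [i * 10 for i, _ in enumerate(fx[::10])]
print(recovered == picked)  # True
```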
Summary
In summary, your complete code (after introducing a few more Python idioms like enumerate) could look something like:
import numpy as np
from matplotlib import pyplot as pt
shape = (190,128,120)
fx = np.loadtxt("eigenvector_1_component_x.txt").reshape(shape).swapaxes(1,2)
for i,fx_slice in enumerate(fx[::10]):
    z = i*10
    pt.figure()
    pt.axis("off")
    # note: newer Matplotlib releases name this colormap 'nipy_spectral'
    pt.pcolor(fx_slice, cmap='spectral', vmin=0, vmax=1)
    pt.colorbar(ticks=[0,2,4,6], format='%0.2f')
    pt.savefig('Imagex_%d' % z)
Alternatively, if you want one pixel per element, you can replace the body of the for loop with
    z = i*10
    pt.imsave('Imagex_%d' % z, fx_slice, cmap='spectral', vmin=0, vmax=1)
