Error in Python 3.4 (Spyder) while plotting on Open Mandriva - python

I was trying to plot an IR spectrum from a CSV file, like this:
import matplotlib.pyplot as plt
file=open('261.1_2014-12-10t16-33-55.csv')
for line in file :
    data.append(line)
    pointset=data[6:]
    for point in pointset:
        res=point.split(',')
        h=float(res[0])
        wn.append(h)
        y=float(res[1])
        Ads.append(y)
plt.plot(wn,Ads)
plt.show()
But instead of a single line, I get a huge number of them.
The variables Ads and wn have many more entries than pointset and data.
What is wrong?

You are iterating over the lines in the file twice. For each line in the file, you iterate over each point in pointset, but pointset is just the set of all lines read so far except the first six.
I think this is what you want:
from matplotlib import pyplot as plt

wn, Ads = [], []  # wavenumbers and absorbances, filled in a single pass
with open('filename.csv') as file:
    for ii, line in enumerate(file):
        if ii >= 6:  # skip lines 0, 1, 2, 3, 4, 5
            fields = line.split(",")
            wn.append(float(fields[0]))
            Ads.append(float(fields[1]))
plt.plot(wn, Ads)
plt.show()
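To see why wn and Ads end up with far more entries than the file has data rows, here is a small sketch that counts the appended points for both loop shapes. The ten generated lines are made up and only stand in for the CSV file; nothing is plotted.

```python
# Made-up stand-in for the file: ten "x,y" lines, the first six play the header
lines = ["{},{}".format(i, i * 2) for i in range(10)]

# Nested version, as in the question: the inner loop re-reads every
# data line seen so far each time a new line arrives
data, nested = [], []
for line in lines:
    data.append(line)
    for point in data[6:]:
        nested.append(float(point.split(",")[0]))

# Single pass: each data line is converted exactly once
single = [float(line.split(",")[0]) for i, line in enumerate(lines) if i >= 6]

print(len(nested), len(single))  # prints: 10 4
```

With four data lines the nested version appends 1 + 2 + 3 + 4 = 10 points, so the curve is drawn over itself again and again, which is where the extra lines in the plot come from.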

Related

How to remove an extra character in some rows of a file in python

If this description is not enough, I can include a sample of my code to further see where my error is coming from. For the time being, I have a file that I am importing that looks like the following...
0.9,0.9,0.12
0.,0.75,0.16
0.7,0.75,0.24
0.7,0.75,0.32
0.5,0.,0.1
0.6,0.65,0.38
0.6,0.8,0.,
0.9,0.95,0.04
0.5,0.65,0.28
On the third to last row, there is a comma after the 0. of the last column, which looks like "0.,". Because of this, I get the error
Some errors were detected !
Line #7 (got 4 columns instead of 3)
For my code I use:
%matplotlib notebook
import numpy as np #import python tools
import matplotlib.pyplot as plt
from numpy import genfromtxt
from mpl_toolkits.mplot3d import Axes3D
file = 'graph'
info = genfromtxt(file, delimiter=',')
beaming = info[:,0]
albedo = info[:,2]
diameter = info[:,1]
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(diameter, albedo, beaming, c='r', marker='o')
ax.set_xlabel('diameter (km)')
ax.set_ylabel('albedo')
ax.set_zlabel('beaming parameter')
plt.show()
Maybe there is a way to ignore that comma and explicitly read in only the first 3 columns, or a way to get rid of the comma, but I'm unsure. Any suggestions help!
You could strip the trailing commas and save the result as a new file:
with open("file.txt") as fi, open("out.txt", "w") as fo:
    for line in fi:
        line = line.strip()
        if line.endswith(","):  # also safe for blank lines, unlike line[-1]
            line = line[:-1]
        fo.write(line + "\n")
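If you would rather not write a second file, the other idea from the question (read only the first three columns) can be sketched with the standard csv module. This is a minimal sketch, not a drop-in replacement for genfromtxt: the in-memory sample is made up to mirror the file in the question, and the result is plain lists rather than a NumPy array.

```python
import csv
import io

# Made-up sample shaped like the file in the question;
# the second row ends with a stray comma.
sample = io.StringIO("0.9,0.9,0.12\n0.6,0.8,0.,\n0.9,0.95,0.04\n")

rows = []
for row in csv.reader(sample):
    if not row:  # skip blank lines
        continue
    # Keep the first three fields only; the empty fourth field
    # produced by the trailing comma is ignored.
    rows.append([float(value) for value in row[:3]])

beaming = [r[0] for r in rows]
diameter = [r[1] for r in rows]
albedo = [r[2] for r in rows]
print(rows[1])
```

With a real file, replace the StringIO object with open('graph', newline=''). The three lists can be passed to ax.scatter in the same way as the array slices.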

Matplotlib: blank plot and window won't close

I'm trying to plot a curve using the data from a csv file using:
import matplotlib.pyplot as plt
from csv import reader
with open('transmission_curve_HST_ACS_HRC.F606W.csv', 'rw') as f:
    data = list(reader(f))
wavelength_list = [i[0] for i in data[1::]]
percentage = [i[1] for i in data[1::]]
plt.plot(wavelength_list, percentage)
plt.show()
But all it does is open a completely blank window, and I can't close it unless I close the terminal.
The csv file looks like this:
4565,"0,00003434405472044760"
4566,"0,00004045191689260860"
4567,"0,00004656394357747830"
4568,"0,00005267963655205460"
4569,"0,00005879949856084820"
Do you have any idea why?
You need to modify three things in your code:
Change 'rw' to 'r' when you read from the file
Iterate over all rows of data (data[1::] skips the first row, and the file shown below has no header)
Convert the numbers to float; the second column uses a decimal comma, so replace it with a point first
import matplotlib.pyplot as plt
from csv import reader
with open('transmission_curve_HST_ACS_HRC.F606W.csv', 'r') as f:
    data = list(reader(f))
# Convert both columns; strings on the x axis would be treated as categories
wavelength_list = [float(i[0]) for i in data]
percentage = [float(str(i[1]).replace(',','.')) for i in data]
plt.plot(wavelength_list, percentage)
plt.show()
Content of the csv file:
4564,"0,00002824029270045730"
4565,"0,00003434405472044760"
4566,"0,00004045191689260860"
4567,"0,00004656394357747830"
4568,"0,00005267963655205460"
4569,"0,00005879949856084820"
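The conversion step can be checked on its own, without opening a plot window. The sketch below parses two rows copied from the file above from an in-memory string; parse_decimal_comma is a hypothetical helper name, not something provided by the csv module.

```python
import csv
import io


def parse_decimal_comma(text):
    """Convert a string such as '0,000034' to a float (hypothetical helper)."""
    return float(text.replace(",", "."))


# Two rows in the same format as the CSV file above
sample = io.StringIO(
    '4564,"0,00002824029270045730"\n'
    '4565,"0,00003434405472044760"\n'
)

wavelengths, percentages = [], []
for wavelength, percentage in csv.reader(sample):
    # The csv reader removes the quotes, so the decimal comma stays inside one field
    wavelengths.append(float(wavelength))
    percentages.append(parse_decimal_comma(percentage))

print(wavelengths, percentages)
```

Because the second column is quoted, the csv reader keeps "0,0000…" as a single field, and only the decimal comma needs replacing.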

Converting strings to floats using a .csv file and generating a histogram

I've been working with .csv files to generate a histogram from the data. The data looks something like this:
102.919 103.36
102.602 103.05
104.106 104.57
108.791 109.26
104.045 104.52
104.324 104.77
105.106 105.57
102.619 103.08
102.124 102.6
Here's the code I have written
# histplot.py
import numpy as np
import matplotlib.pyplot as plt
import csv
with open('datafile.csv', 'rU') as data:
    reader = csv.DictReader(data, delimiter=' ', quoting=csv.QUOTE_NONNUMERIC)
    for line in reader:
        t = float(line)
        data.append(t)
reader.close()
# generate the histogram
hist, bin_edges=np.histogram(data, bins=50, range=[80,135])
# generate histogram figure
plt.hist(data, bin_edges)
plt.savefig('chart_file', format="pdf")
plt.show()
Running this code gives me the error ValueError: could not convert string to float: '102.919,103.36'
Can someone give me a few ideas on converting strings to floats when reading a CSV file?
Thank you in advance.
First of all, with open('datafile.csv', 'rU') as data: binds data to a file handle. You can iterate over this file handle, but you cannot append anything to it.
Second, csv.DictReader provides access to each row as a dictionary. In this case, I would recommend using csv.reader, which gives access to each row as a list.
Third, you cannot convert a whole row, be it a dictionary or a list, to a float. You can only do that with a single element of the row. (This is where the error comes from.) Converting to float isn't even necessary here, since quoting=csv.QUOTE_NONNUMERIC already makes the reader return unquoted fields as floats.
Now, you can simply append the elements row by row to an initially empty list and supply this list to the histogram function.
import numpy as np
import matplotlib.pyplot as plt
import csv
data = [] #create empty list
with open('datafile.csv', newline='') as f:  # the 'rU' mode was removed in Python 3.11
    reader = csv.reader(f, delimiter=' ', quoting=csv.QUOTE_NONNUMERIC)
    for line in reader:
        data.extend(line)
# generate the histogram
hist, bin_edges=np.histogram(data, bins=50, range=[80,135])
# generate histogram figure
plt.hist(data, bin_edges)
#plt.savefig('chart_file', format="pdf")
plt.show()
Let me just mention that the whole data reading can be done in a much simpler way, using numpy.loadtxt.
Also, plotting the histogram may be simplified, in case no further data processing needs to take place.
import numpy as np
import matplotlib.pyplot as plt
data = np.loadtxt('datafile.csv').flatten()
plt.hist(data, bins=50, range=[80,135])
plt.show()
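One detail worth checking: the error message quotes '102.919,103.36', which suggests the real file may be comma-separated even though the sample in the question is shown with spaces. Here is a minimal sketch under that assumption, reading from an in-memory sample instead of datafile.csv:

```python
import csv
import io

# Made-up two-row sample, comma-separated as the error message suggests
sample = io.StringIO("102.919,103.36\n102.602,103.05\n")

data = []
# QUOTE_NONNUMERIC makes the reader return every unquoted field as a float
for row in csv.reader(sample, delimiter=",", quoting=csv.QUOTE_NONNUMERIC):
    data.extend(row)

print(data)
```

If that is the case, the delimiter has to change in the other variants too; np.loadtxt accepts a delimiter=',' argument for the same purpose.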

How do I generate a histogram from a list of values using matplotlib?

So I've been trying to plot a histogram using Python with matplotlib.
I've got two datasets, basically the heights of a sample of men and women, as Python lists imported from a CSV file.
The code that I'm using:
import csv
import numpy as np
from matplotlib import pyplot as plt
men=[]
women=[]
with open('women.csv','r') as f:
    r1=csv.reader(f, delimiter=',')
    for row in r1:
        women+=[row[0]]
with open('men.csv','r') as f:
    r2=csv.reader(f, delimiter=',')
    for row in r2:
        men+=[row[0]]
fig = plt.figure()
ax = fig.add_subplot(111)
numBins = 20
ax.hist(men,numBins,color='blue',alpha=0.8)
ax.hist(women,numBins,color='red',alpha=0.8)
plt.show()
and the error that I get:
Traceback (most recent call last):
File "//MEME/Users/Meme/Miniconda3/Lib/idlelib/test.py", line 22, in <module>
ax.hist(men,numBins,color='blue',alpha=0.8)
File "\\MEME\Users\Meme\Miniconda3\lib\site-packages\matplotlib\__init__.py", line 1811, in inner
return func(ax, *args, **kwargs)
File "\\MEME\Users\Meme\Miniconda3\lib\site-packages\matplotlib\axes\_axes.py", line 5983, in hist
raise ValueError("color kwarg must have one color per dataset")
ValueError: color kwarg must have one color per dataset
NOTE: This assumes your files contain multiple comma-separated lines and that the first entry in each line is the height.
The bug is in how you append data to the women and men lists: row[0] is a string, not a number, which confuses matplotlib. I suggest you run this code before plotting to see what the lists contain:
import csv
import numpy as np
from matplotlib import pyplot as plt
men=[]
women=[]
with open('women.csv','r') as f:
    r1=csv.reader(f, delimiter=',')
    for row in r1:
        women+=[(row[0])]
with open('men.csv','r') as f:
    r2=csv.reader(f, delimiter=',')
    for row in r2:
        men+=[(row[0])]
fig = plt.figure()
ax = fig.add_subplot(111)
# Inspect the lists; the plotting calls stay commented out for now
print(men)
print(women)
#numBins = 20
#ax.hist(men,numBins,color='blue',alpha=0.8)
#ax.hist(women,numBins,color='red',alpha=0.8)
#plt.show()
A sample output will be
['1', '3', '3']
['2', '3', '1']
So in the loops, convert each string to a float or an integer, e.g. women += [float(row[0])] and men += [float(row[0])].
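The conversion can also be wrapped in a small function so that both files are read the same way. This is only a sketch: read_heights is a hypothetical helper, the in-memory samples are made up, and silently skipping rows that do not parse (such as a header) is an assumption that may not suit every file.

```python
import csv
import io


def read_heights(f):
    """Return the first column of a CSV file object as floats (hypothetical helper)."""
    heights = []
    for row in csv.reader(f, delimiter=","):
        if not row:  # blank line
            continue
        try:
            heights.append(float(row[0]))
        except ValueError:
            continue  # not a number, e.g. a header row
    return heights


# Made-up samples standing in for women.csv and men.csv
women = read_heights(io.StringIO("height,age\n162.5,31\n170.1,45\n"))
men = read_heights(io.StringIO("178.0,29\n\n183.4,52\n"))
print(women, men)
```

With real files, pass open('women.csv', newline='') and open('men.csv', newline='') instead; the resulting lists of floats can go straight into ax.hist.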

How do I make a histogram from a csv file which contains a single column of numbers in python?

I have a csv file (excel spreadsheet) of a column of roughly a million numbers. I want to make a histogram of this data with the frequency of the numbers on the y-axis and the number quantities on the x-axis. I know matplotlib can plot a histogram, but my main problem is converting the csv file from string to float since a string can't be graphed. This is what I have:
import matplotlib.pyplot as plt
import csv
with open('D1.csv', 'rb') as data:
    rows = csv.reader(data, quoting = csv.QUOTE_NONNUMERIC)
    floats = [[item for number, item in enumerate(row) if item and (1 <= number <= 12)] for row in rows]
plt.hist(floats, bins=50)
plt.title("histogram")
plt.xlabel("value")
plt.ylabel("frequency")
plt.show()
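You can do it in one line with pandas (quoting=2 is the value of csv.QUOTE_NONNUMERIC, and 'column_you_want' stands for the name of your column):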
import pandas as pd
pd.read_csv('D1.csv', quoting=2)['column_you_want'].hist(bins=50)
Okay I finally got something to work with headings, titles, etc.
import matplotlib.pyplot as plt
import pandas as pd
data = pd.read_csv('D1.csv', quoting=2)
data.hist(bins=50)
plt.xlim([0,115000])
plt.title("Data")
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.show()
My first problem was that matplotlib is necessary to actually show the graph. Also, I needed to assign the result of pd.read_csv('D1.csv', quoting=2) to data so that I could plot its histogram with data.hist.
Thank you all for the help.
pandas' read_csv is very powerful, but if your CSV file is simple (no headers, NaNs, or comments) you do not need pandas, as you can use NumPy:
import numpy as np
import matplotlib.pyplot as plt
data = np.loadtxt('D1.csv')
plt.hist(data, density=True, bins='auto')  # density replaces the removed normed argument
plt.show()
(In fact, loadtxt can deal with some headers and comments, but read_csv is more versatile.)
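As a rough illustration of that last remark, the sketch below lets loadtxt skip one header line and ignore '#' comments, then bins the values with np.histogram instead of plotting them. The in-memory sample is made up, and it assumes a single header row; a file with a different layout would need other arguments.

```python
import io

import numpy as np

# Made-up single-column sample with a header line and two comments
text = "value\n1.0\n2.5  # inline note\n# a full comment line\n4.0\n"

# skiprows drops the header; '#' is loadtxt's default comment marker
data = np.loadtxt(io.StringIO(text), skiprows=1)

# Same binning that plt.hist would apply, without opening a window
counts, edges = np.histogram(data, bins=3)
print(data, counts)
```

For a real file, pass the filename instead of the StringIO object; plt.hist(data, bins=3) would draw the same counts.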
