I'm trying to make a graph of the first column ('Time') of a csv file plotted against the second column ('Bid').
Here's what I have so far.
import pandas as pd
import datetime
import csv
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
headers = ['Time','Bid','Ask']
df = pd.read_csv('quotes_format.csv')
x = df['Time']
y = df['Bid']
plt.plot(x,y)
plt.gcf().autofmt_xdate()
plt.show()
The csv file looks something like this
This fails and returns exit code 1. How would I fix this so it would generate the graph I'm looking for?
You can set the name of each column in the dataframe with the names parameter of read_csv; this is meant for files that have no header row of their own.
headers = ['Time','Bid','Ask']
df = pd.read_csv('quotes_format.csv', names=headers)
Here is the documentation for the pandas read_csv function.
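A minimal sketch of what the names parameter does, assuming the file has no header row. The three quote lines below are made up for illustration, and io.StringIO stands in for 'quotes_format.csv'.

```python
import io
import pandas as pd

# Hypothetical headerless quotes data; the values are invented.
raw = io.StringIO(
    "09:30:00,1.1012,1.1014\n"
    "09:30:01,1.1013,1.1015\n"
    "09:30:02,1.1011,1.1013\n"
)

headers = ["Time", "Bid", "Ask"]

# With names=..., pandas treats every line as data and labels the columns itself.
df = pd.read_csv(raw, names=headers)

print(df.columns.tolist())
print(df["Bid"].tolist())
```

Once the columns are named, df['Time'] and df['Bid'] can be passed to plt.plot as in the question.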
Related
Hello, I cannot understand why this code does not select the rows between the two dates. It shows me the same dataset, starting from the first date in 2004. Here is my code:
import pandas as pd
from pandas import DataFrame
import datetime
from matplotlib import pyplot as plt
df1 = pd.read_csv('time_series_15min_singleindex.csv',header=0,index_col=0,parse_dates=True)
df = DataFrame(df1, columns=['utc_timestamp', 'DE_solar_generation_actual', 'DE_wind_onshore_generation_actual'])
df['utc_timestamp'] = pd.to_datetime(df['utc_timestamp'],utc=True)
start_date=pd.to_datetime('2008-12-31',utc=True)
end_date=pd.to_datetime('2009-01-01',utc=True)
df[df['utc_timestamp'].between(start_date,end_date)]
df.plot()
You forgot to assign the result back. Boolean indexing returns a new DataFrame and leaves df unchanged, so use:
df = df[df['utc_timestamp'].between(start_date,end_date)]
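A small self-contained sketch of the same idea, using synthetic 15-minute timestamps in place of the original CSV (the value column is hypothetical). It shows that the filter only takes effect on the object you assign it to.

```python
import pandas as pd

# Synthetic data: four days of 15-minute UTC timestamps (stand-in for the CSV).
df = pd.DataFrame({
    "utc_timestamp": pd.date_range("2008-12-30", periods=4 * 24 * 4,
                                   freq="15min", tz="UTC"),
})
df["value"] = range(len(df))

start_date = pd.to_datetime("2008-12-31", utc=True)
end_date = pd.to_datetime("2009-01-01", utc=True)

# between() is inclusive on both ends by default and returns a boolean mask.
mask = df["utc_timestamp"].between(start_date, end_date)

# Indexing with the mask builds a new DataFrame; df itself is not modified.
filtered = df[mask]

print(len(df), len(filtered))
```

Calling filtered.plot() then draws only the selected range.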
I am trying to make a simple bar graph out of a 2 column CSV file. One column is the x axis names, the other column is the actual data which will be used for the bars. The CSV looks like this:
count,team
21,group1
15,group2
63,group3
22,group4
42,group5
72,group6
21,group7
23,group8
24,group9
31,group10
32,group11
I am using this code:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df = pd.read_csv("sampleData.csv",sep=",").set_index('count')
d = dict(zip(df.index,df.values.tolist()))
df.plot.bar(x = 'count', y = 'team')
print(d)
However, I get an error
KeyError: 'count' from this line :
df.plot.bar(x = 'count', y = 'team')
I don't understand how there is an error for something that exists.
When you set count as the index, only a single column, team, is left in your DataFrame, so 'count' can no longer be looked up as a column. Don't set count as the index, and swap the x and y values when plotting the bar chart:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df = pd.read_csv("sampleData.csv", sep=",")
df.plot.bar(x = 'team', y = 'count')
Matplotlib solution
plt.bar(df['team'], df['count'])
plt.xticks(rotation=45) # Just rotating for better visualization
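To see why the KeyError appears, here is a short sketch using the first three rows of the sample CSV, read from an in-memory string instead of sampleData.csv. The Agg backend is an assumption made so the snippet runs without a display.

```python
import io
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import pandas as pd

raw = io.StringIO("count,team\n21,group1\n15,group2\n63,group3\n")
df = pd.read_csv(raw)

# After set_index('count'), 'count' is the index and no longer a column.
indexed = df.set_index("count")
print(indexed.columns.tolist())  # ['team']

# Plotting from the un-indexed frame works: team labels on x, counts as bar heights.
ax = df.plot.bar(x="team", y="count", legend=False)
heights = [patch.get_height() for patch in ax.patches]
print(heights)
```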
import sys
import ConfigParser
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import datetime as DT
import bokeh
sys.path.extend(['..\..\myProj\SOURCE'])
fullfilepath = "../../myProj/SOURCE/" + 'myparts.txt'
ohg_df = pd.read_csv(fullfilepath, sep="\t" )
temp_df = temp_df[['as_on_date', 'ohg_qty']]
temp_df = temp_df.sort(['as_on_date'], ascending=[1])
temp_df.set_index('as_on_date')
plt.plot(temp_df.index, temp_df.ohg_qty)
plt.show()
This is my dataframe after importing.
I am trying to plot the line graph with x axis as date mentioned in the dataframe.
Can someone guide me... I am new to pandas.
dataframe picture
output picture
Easier:
# Set index directly
ohg_df = pd.read_csv(fullfilepath, sep="\t", index_col='as_on_date')
# Convert string index to dates
ohg_df.index = pd.to_datetime(ohg_df.index)
# Get a column and plot it (taking a column keeps the index)
plt.plot(ohg_df.ohg_qty)
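A runnable sketch of the approach above, assuming a tab-separated file with as_on_date and ohg_qty columns. The rows and the part column are invented, and the Agg backend is assumed so the snippet runs without a display.

```python
import io
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical contents standing in for myparts.txt.
raw = io.StringIO(
    "part\tas_on_date\tohg_qty\n"
    "A1\t2016-01-03\t12\n"
    "A1\t2016-01-01\t10\n"
    "A1\t2016-01-02\t15\n"
)

# index_col picks the index column; parse_dates=True converts that index to dates.
ohg_df = pd.read_csv(raw, sep="\t", index_col="as_on_date", parse_dates=True)

# Sort by date so the line is drawn in chronological order.
ohg_df = ohg_df.sort_index()

fig, ax = plt.subplots()
ax.plot(ohg_df.index, ohg_df["ohg_qty"])
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
```

In an interactive session, plt.show() after this displays the figure.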
I want to convert that dataframe
into this dataframe and plot a matplotlib graph with the date along the x axis
changed dataframe
Use df.T.plot(kind='bar'):
import pandas as pd
import matplotlib.pyplot as plt
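# DataFrame.from_csv was removed in pandas 1.0; read_csv with these arguments matches its defaults
df = pd.read_csv('./housing_price_index_2010-11_100.csv', index_col=0, parse_dates=True)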
df.T.plot(kind='bar')
plt.show()
You can also assign the transpose to a new variable and plot that (what you asked about in the comment):
import pandas as pd
import matplotlib.pyplot as plt
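# DataFrame.from_csv was removed in pandas 1.0; read_csv with these arguments matches its defaults
df = pd.read_csv('./housing_price_index_2010-11_100.csv', index_col=0, parse_dates=True)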
df_transposed = df.T
df_transposed.plot(kind='bar')
plt.show()
Both give the same result:
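The housing CSV is not shown in the thread, so the sketch below uses a small made-up table with hypothetical region names and period labels, just to illustrate what .T does before plotting. The Agg backend is assumed so it runs without a display.

```python
import io
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import pandas as pd

# Invented wide-format data: one row per region, one column per period.
raw = io.StringIO(
    "region,2010-11,2011-12,2012-13\n"
    "North,100,104,109\n"
    "South,100,102,108\n"
)
df = pd.read_csv(raw, index_col=0)

# Transposing swaps rows and columns, so the periods become the index (x axis).
df_transposed = df.T
print(df_transposed)

# One group of bars per period, one bar per region within each group.
ax = df_transposed.plot(kind="bar")
```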
Hi all, I'm trying to work with a set of data that has two columns: one is names and the other is the number of births for each name. What I want to do is import a csv file, perform some basic operations on it, such as finding the baby name with the maximum number of births, and then plot the data in a bar graph. But when I have an index value for the dataframe, the bar graph prints that as the x axis instead of the names. So I removed the index, and now I get all kinds of errors. Below is my code, first the one with the index and then the one without. Thanks in advance. This is really driving me crazy.
import pandas as pd
import matplotlib.pyplot as plt
import pdb
import matplotlib as p
import os
from pandas import DataFrame
Location = os.path.join(os.path.sep,'Users', 'Mark\'s Computer','Desktop','projects','data','births1880.csv')
a = pd.read_csv(Location, index_col = False)
print(a) #print the dataframe just to see what I'm getting.
MaxValue = a['Births'].max()
MaxName = a['Names'][a['Births'] == MaxValue].values
print(MaxValue, ' ', MaxName)
a.plot(kind ='bar')
plt.show()
This code works but spits out a bar graph with the index as the x axis instead of the names?
import pandas as pd
import matplotlib.pyplot as plt
import pdb
import matplotlib as p
import os
from pandas import DataFrame
Location = os.path.join(os.path.sep,'Users', 'Mark\'s Computer','Desktop','projects','data','births1880.csv')
a = pd.read_csv(Location, index_col = True) #why is setting the index column to true removing it?
print(a) #print the dataframe just to see what I'm getting.
MaxValue = a['Births'].max()
MaxName = a['Names'][a['Births'] == MaxValue].values
print(MaxValue, ' ', MaxName)
a.plot(kind ='bar', x='Names', y = 'Births' )
plt.show()
edited for solution.
It would be nice if you had provided a sample csv file. I made one up, though it took me a while to figure out what format pandas expects.
I used a test.csv that looked like:
names,births
mike,3
mark,4
Then my python code:
import pandas
import numpy
import matplotlib.pyplot as plt
a = pandas.read_csv('test.csv', index_col = False)
a.plot(kind='bar')
indices = numpy.arange(len(a['names']))
plt.xticks(indices, a['names'].values)  # pandas centers the bars on integer positions
plt.show()
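A possible alternative, sketched under the assumption of a reasonably recent pandas: pass the name column as x and let pandas label the ticks itself. The two rows mirror the made-up test.csv above, read from an in-memory string, and the Agg backend is assumed so it runs without a display.

```python
import io
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import pandas as pd

raw = io.StringIO("names,births\nmike,3\nmark,4\n")
a = pd.read_csv(raw)

# x selects the column used for tick labels, y the column used for bar heights.
ax = a.plot(kind="bar", x="names", y="births", legend=False)

labels = [tick.get_text() for tick in ax.get_xticklabels()]
print(labels)
```

This avoids setting the tick positions by hand with plt.xticks.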