I tried to plot a histogram using the data from a dictionary. For example, if the dictionary has 9 entries, I can easily plot the histogram as below:
myDictionary = {'you': 27, 'apple': 1, 'mango': 72, 'watermelon': 62, 'juice': 33, 'peter': 36, 'vegetable': 20, 'meat': 12, 'egg': 9}
plt.bar(myDictionary.keys(), myDictionary.values(), width=0.5, color='g')
However, if there are 42000 entries in the dictionary, the plot takes too long to appear. In fact, I'm not sure whether it would appear at all, because I didn't wait for it to finish.
Is there a way to solve this problem? I'm not sure what other method can be used to get the histogram. Further, the x-axis will be very untidy because there are too many labels (42000).
I'd suggest extracting the values from your dictionary into a pandas Series. You can then plot this series directly; its hist() method bins the values instead of drawing one bar per key.
import matplotlib.pyplot as plt
import pandas as pd
myDictionary = {'you': 27, 'apple': 1, 'mango': 72, 'watermelon': 62, 'juice': 33, 'peter': 36, 'vegetable': 20, 'meat': 12, 'egg': 9}
# First, convert to a series
s = pd.Series(list(myDictionary.values()))
# Plot your series directly
s.hist()
plt.show()
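For the 42000-entry case, the same idea can be sketched with matplotlib alone: bin the values so that only a handful of bars are drawn, instead of one bar per key. This is a minimal sketch under stated assumptions: `big_dictionary` is a randomly generated stand-in for the real data, and the bin count of 20 is an arbitrary choice.

```python
import random

import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for an interactive window
import matplotlib.pyplot as plt

# Hypothetical stand-in for the real 42000-entry dictionary (random values).
random.seed(0)
big_dictionary = {"word%d" % i: random.randint(1, 100) for i in range(42000)}

fig, ax = plt.subplots()
# One bar per bin rather than one bar per key, so only 20 rectangles are drawn.
counts, bin_edges, patches = ax.hist(list(big_dictionary.values()), bins=20)
ax.set_xlabel("value")
ax.set_ylabel("number of keys")
```

The x-axis then shows the value range rather than 42000 key labels; call fig.savefig(...) or plt.show() afterwards to see the result.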
I have a DataFrame and I can save it as a png file. Now I want to change the background color of specific cells that meet a certain condition.
Conditions:
Numbers that are 80 or higher must get a green background.
Numbers below 80 must get a red background.
All column names and index cells need a black background with white text.
The following posts came close to what I want but didn't provide the answer I needed.
Post 1
Post 2
My code:
import matplotlib.pyplot as plt
from pandas.plotting import table  # was pandas.tools.plotting in old pandas versions
import pandas as pd
#My dataframe
df = pd.DataFrame({
'Weeks' : [201605, 201606, 201607, 201608],
'Computer1' : [50, 77, 96, 100],
'Computer2' : [50, 79, 100, 80],
'Laptop1' : [75, 77, 96, 95],
'Laptop2' : [86, 77, 96, 40],
'Phone' : [99, 99, 44, 85],
'Phone2' : [93, 77, 96, 25],
'Phone3' : [94, 91, 96, 33]
})
df2 = df.set_index('Weeks') #Makes the column 'Weeks' the index.
#Make a png file out of a dataframe.
plt.figure(figsize=(9,3))
ax = plt.subplot(211, frame_on=False) # no visible frame
ax.xaxis.set_visible(False) # hide the x axis
ax.yaxis.set_visible(False) # hide the y axis
table(ax, df2, rowLabels=df2.index, colLabels=df2.columns, loc='center', cellColours=None)
plt.savefig('mytable.png') #save it as a png.
This is how it currently looks:
This is how I want it to look
You can do something like this:
colors = df2.applymap(lambda x: 'green' if x >= 80 else 'red').reset_index().drop(['Weeks'], axis=1)
tbl = table(ax, df2, loc='center',
            cellColours=colors.to_numpy(),  # as_matrix() has been removed from pandas
            colColours=['black']*len(colors.columns),
            rowColours=['black']*len(colors))
Setting the index's text color:
for row in range(1, len(colors)+1):
    tbl.get_celld()[(row, -1)].get_text().set_color('white')
Setting the header's text color:
for col in range(len(colors.columns)):
    tbl.get_celld()[(0, col)].get_text().set_color('white')
plt.show()
Code (complete):
import matplotlib.pyplot as plt
from pandas.plotting import table  # was pandas.tools.plotting in old pandas versions
import pandas as pd
#My dataframe
df = pd.DataFrame({
'Weeks' : [201605, 201606, 201607, 201608],
'Computer1' : [50, 77, 96, 100],
'Computer2' : [50, 79, 100, 80],
'Laptop1' : [75, 77, 96, 95],
'Laptop2' : [86, 77, 96, 40],
'Phone' : [99, 99, 44, 85],
'Phone2' : [93, 77, 96, 25],
'Phone3' : [94, 91, 96, 33]
})
df2 = df.set_index('Weeks') #Makes the column 'Weeks' the index.
# DataFrame.applymap is named DataFrame.map in pandas >= 2.1
colors = df2.applymap(lambda x: 'green' if x >= 80 else 'red') \
            .reset_index().drop(['Weeks'], axis=1)
#print(colors)
plt.figure(figsize=(10,5))
ax = plt.subplot(2, 1, 1, frame_on=True) # the frame is hidden by ax.axis('off') below
#ax.xaxis.set_visible(False) # hide the x axis
#ax.yaxis.set_visible(False) # hide the y axis
# hide all axes
ax.axis('off')
# http://matplotlib.org/api/pyplot_api.html?highlight=table#matplotlib.pyplot.table
tbl = table(ax, df2,
            loc='center',
            cellLoc='center',
            cellColours=colors.to_numpy(),  # as_matrix() has been removed from pandas
            colColours=['black']*len(colors.columns),
            rowColours=['black']*len(colors),
            #fontsize=14
            )
# set text color for index (X, -1) and headers (0, X)
for key, cell in tbl.get_celld().items():
    if key[1] == -1 or key[0] == 0:
        cell.get_text().set_color('white')
    # remove grid lines
    cell.set_linewidth(0)
# save it as a png (do this before plt.show(), otherwise the saved file can be blank)
plt.savefig('mytable.png')
# display the table
plt.show()
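As a side note, the colour matrix can also be built without applymap. The sketch below uses numpy.where on a two-column subset of the question's data; the name `cell_colours` is just an illustrative choice, and the result is assumed to be passed on as the cellColours argument in the code above.

```python
import numpy as np
import pandas as pd

# Two columns of the question's data, indexed by week.
df2 = pd.DataFrame(
    {'Computer1': [50, 77, 96, 100], 'Laptop2': [86, 77, 96, 40]},
    index=pd.Index([201605, 201606, 201607, 201608], name='Weeks'),
)

# Element-wise rule: 'green' where the value is 80 or higher, 'red' otherwise.
cell_colours = np.where(df2.to_numpy() >= 80, 'green', 'red')
print(cell_colours)
```

The array has the same shape as the data, one colour name per cell, so it lines up with the table cells row by row.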
My problem is the following.
I have a pandas DataFrame containing the data of a "sample" in the first row and the data of the "controls" in all the other rows.
I would like to have a scatter plot (or any other kind of plot, to generalize the question) in which all the "controls" are in one color and the "sample" is in another. How can I do that? I have looked in the pandas documentation but couldn't find anything.
Here is what I have so far:
from pandas import *
from collections import OrderedDict
mydict = OrderedDict([
('sample', [454, 481, 160, 26, 17]),
('ctrl_1', [454, 470, 101, 10, 8]),
('ctrl_2', [454, 473, 110, 15, 9]),
('ctrl_3', [454, 472, 104, 19, 13]),
('ctrl_4', [454, 472, 105, 16, 13]),
('ctrl_5', [454, 466, 97, 15, 10]),
('ctrl_6', [454, 473, 110, 17, 10]),
('ctrl_7', [454, 465, 99, 15, 11]),
('ctrl_8', [454, 471, 107, 18, 12]),
('ctrl_9', [454, 471, 102, 15, 11]),
('ctrl_10', [454, 472, 116, 14, 9])
])
df = DataFrame.from_dict(mydict,orient='index')
df.columns=['A','B','C','D','E']
df.plot(kind='scatter',x='C',y='E',figsize=(10,10), color='blue')
I tried to split the DataFrame in two (controls and sample) and plot one on top of the other, but pandas raises an error (TypeError: There is no line property "y") when I try to scatter-plot a single point (is it a bug?).
sample = df.ix[0]
controls = df.ix[1:]
controls.plot(kind='scatter',x='C',y='E',figsize=(10,10), color='blue')
sample.plot(kind='scatter',x='C',y='E',figsize=(10,10), color='red')
Any suggestion?
You're getting a Series back from df.ix[0], which can't be drawn as a scatter plot. (I guess it could be a valid type in theory, but, as you say, it would only show 1 point.)
If you change your code slightly to make sample a DataFrame instead, it works. (I've also put both on the same plot by using the same axes. The slices below use .iloc, the positional indexer, because .ix has been removed from recent pandas versions.)
sample = df.iloc[:1]
controls = df.iloc[1:]
ax = controls.plot(kind='scatter',x='C',y='E',figsize=(10,10), color='blue')
sample.plot(ax=ax, kind='scatter',x='C',y='E',figsize=(10,10), color='red')
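Another way to get the same picture is to select the rows with a boolean mask and call matplotlib's scatter directly, once per group. This is only a sketch: it uses the sample row and the first three controls from the question, and the mask name `is_sample` is my own.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# The sample row plus the first three control rows from the question.
df = pd.DataFrame.from_dict({
    'sample': [454, 481, 160, 26, 17],
    'ctrl_1': [454, 470, 101, 10, 8],
    'ctrl_2': [454, 473, 110, 15, 9],
    'ctrl_3': [454, 472, 104, 19, 13],
}, orient='index', columns=['A', 'B', 'C', 'D', 'E'])

# Boolean mask: True only for the row labelled 'sample'.
is_sample = df.index == 'sample'

fig, ax = plt.subplots(figsize=(10, 10))
ax.scatter(df.loc[~is_sample, 'C'], df.loc[~is_sample, 'E'], color='blue', label='controls')
ax.scatter(df.loc[is_sample, 'C'], df.loc[is_sample, 'E'], color='red', label='sample')
ax.set_xlabel('C')
ax.set_ylabel('E')
ax.legend()
```

Selecting with df.loc and a mask always yields a Series of points, even when only one row matches, so the single-point case needs no special handling.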
I'm querying data from a simple sqlite3 DB, pulling a list of the number of connections per port observed on my system. I'm trying to graph this as a simple bar chart using matplotlib.
Thus far, I'm using the following code:
import matplotlib as mpl
mpl.use('Agg') # force no x11
import matplotlib.pyplot as plt
import sqlite3
con = sqlite3.connect('test.db')
cur = con.cursor()
cur.execute('''
SELECT dst_port, count(dst_port) as count from logs
where dst_port != 0
group by dst_port
order by count desc;
'''
)
data = cur.fetchall()
dst_ports, dst_port_count = zip(*data)
#dst_ports = [22, 53223, 40959, 80, 3389, 23, 443, 35829, 8080, 4899, 21320, 445, 3128, 44783, 4491, 9981, 8001, 21, 1080, 8081, 3306, 8002, 8090]
#dst_port_count = [5005, 145, 117, 41, 34, 21, 17, 16, 15, 11, 11, 8, 8, 8, 6, 6, 4, 3, 3, 3, 1, 1, 1]
print(dst_ports)
print(dst_port_count)
fig = plt.figure()
# aesthetics and data
plt.grid()
plt.bar(dst_ports, dst_port_count, align='center')
#plt.xticks(dst_ports)
# labels
plt.title('Number of connections to port')
plt.xlabel('Destination Port')
plt.ylabel('Connection Attempts')
# save figure
fig.savefig('temp.png')
When I run the above, the data is successfully retrieved from the DB and a graph is generated. However, the graph isn't what I was expecting. For example, on the x-axis, it plots all values between 0 and 5005. I'm looking for it to display only the values in dst_ports. I've tried using xticks but this doesn't work either.
I've included some sample data in the above code which I've commented out that may be useful.
In addition, here is an example of the graph output from the above code:
And also a graph when using xticks:
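The bars are being placed at the numeric port values. You need to create evenly spaced x data with np.arange() and use the ports only as tick labels: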
import matplotlib as mpl
import numpy as np
import matplotlib.pyplot as plt
dst_ports = [22, 53223, 40959, 80, 3389, 23, 443, 35829, 8080, 4899, 21320, 445, 3128, 44783, 4491, 9981, 8001, 21, 1080, 8081, 3306, 8002, 8090]
dst_port_count = [5005, 145, 117, 41, 34, 21, 17, 16, 15, 11, 11, 8, 8, 8, 6, 6, 4, 3, 3, 3, 1, 1, 1]
fig = plt.figure(figsize=(12, 4))
# aesthetics and data
plt.grid()
x = np.arange(1, len(dst_ports)+1)
plt.bar(x, dst_port_count, align='center')
plt.xticks(x, dst_ports, rotation=45)
# labels
plt.title('Number of connections to port')
plt.xlabel('Destination Port')
plt.ylabel('Connection Attempts')
Here is the output:
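On matplotlib versions that support categorical axes (2.1 and later, as far as I know), a similar result can be had by passing the ports as strings, so that each one is treated as a category and gets its own evenly spaced slot. A minimal sketch using the first seven entries of the sample data; the variable name `labels` is my own:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for an interactive window
import matplotlib.pyplot as plt

# First seven entries of the sample data from the question.
dst_ports = [22, 53223, 40959, 80, 3389, 23, 443]
dst_port_count = [5005, 145, 117, 41, 34, 21, 17]

# Strings are treated as categories, so the bars sit at positions 0, 1, 2, ...
labels = [str(port) for port in dst_ports]

fig, ax = plt.subplots(figsize=(12, 4))
bars = ax.bar(labels, dst_port_count, align='center')
ax.tick_params(axis='x', labelrotation=45)
ax.set_title('Number of connections to port')
ax.set_xlabel('Destination Port')
ax.set_ylabel('Connection Attempts')
```

The bars keep the order of the input lists, so the "order by count desc" from the query is preserved on the x-axis.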
What are my best options for creating a financial open-high-low-close (OHLC) chart in a high level language like Ruby or Python? While there seem to be a lot of options for graphing, I haven't seen any gems or eggs with this kind of chart.
http://en.wikipedia.org/wiki/Open-high-low-close_chart (but I don't need the moving average or Bollinger bands)
JFreeChart can do this in Java, but I'd like to make my codebase as small and simple as possible.
Thanks!
You can use matplotlib and the optional bottom parameter of matplotlib.pyplot.bar. You can then use line plots to indicate the opening and closing prices:
For example:
#!/usr/bin/env python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import lines
import random
deltas = [4, 6, 13, 18, 15, 14, 10, 13, 9, 6, 15, 9, 6, 1, 1, 2, 4, 4, 4, 4, 10, 11, 16, 17, 12, 10, 12, 15, 17, 16, 11, 10, 9, 9, 7, 10, 7, 16, 8, 12, 10, 14, 10, 15, 15, 16, 12, 8, 15, 16]
bases = [46, 49, 45, 45, 44, 49, 51, 52, 56, 58, 53, 57, 62, 63, 68, 66, 65, 66, 63, 63, 62, 61, 61, 57, 61, 64, 63, 58, 56, 56, 56, 60, 59, 54, 57, 54, 54, 50, 53, 51, 48, 43, 42, 38, 37, 39, 44, 49, 47, 43]
def rand_pt(bases, deltas):
    return [random.randint(base, base + delta) for base, delta in zip(bases, deltas)]
# randomly assign opening and closing prices
openings = rand_pt(bases, deltas)
closings = rand_pt(bases, deltas)
# First we draw the bars which show the high and low prices
# bottom holds the low price while deltas holds the difference
# between high and low.
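# width = 0 collapses each bar to a vertical line; edgecolor keeps that line
# visible on matplotlib versions that do not outline bars by default.
width = 0
ax = plt.axes()
rects1 = ax.bar(np.arange(50), deltas, width, color='r', edgecolor='r', bottom=bases)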
# Now draw the ticks indicating the opening and closing price
for opening, closing, bar in zip(openings, closings, rects1):
    x, w = bar.get_x(), 0.2
    args = {
    }
    ax.plot((x - w, x), (opening, opening), **args)
    ax.plot((x, x + w), (closing, closing), **args)
plt.show()
creates a plot like this:
Obviously, you'd want to package this up in a function that drew the plot using (open, close, min, max) tuples (and you probably wouldn't want to randomly assign your opening and closing prices).
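A rough sketch of such a function follows. The name `plot_ohlc`, its parameters, and the three example tuples are all made up for illustration, and it draws the high-low range with vlines and the open/close ticks with hlines rather than with zero-width bars.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for an interactive window
import matplotlib.pyplot as plt


def plot_ohlc(ax, records, tick_width=0.2, color='r'):
    """Draw one glyph per (open, close, min, max) tuple (hypothetical helper)."""
    xs = list(range(len(records)))
    opens = [r[0] for r in records]
    closes = [r[1] for r in records]
    lows = [r[2] for r in records]
    highs = [r[3] for r in records]
    # Vertical line from the low to the high price.
    ax.vlines(xs, lows, highs, colors=color)
    # Opening tick to the left of the line, closing tick to the right.
    ax.hlines(opens, [x - tick_width for x in xs], xs, colors=color)
    ax.hlines(closes, xs, [x + tick_width for x in xs], colors=color)


# Made-up (open, close, min, max) tuples, purely for demonstration.
records = [(47, 49, 46, 50), (52, 50, 49, 55), (50, 56, 45, 58)]

fig, ax = plt.subplots()
plot_ohlc(ax, records)
```

The x positions are simply 0, 1, 2, ...; real data would more likely use dates there.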
You can use Pylab (matplotlib.finance) with Python. Here are some examples: http://matplotlib.sourceforge.net/examples/pylab_examples/plotfile_demo.html . There is some good material specifically on this problem in Beginning Python Visualization.
Update: I think you can use matplotlib.finance.candlestick for the Japanese candlestick effect.
Have you considered using R and the quantmod package? It likely provides exactly what you need.
Some examples about financial plots (OHLC) using matplotlib can be found here:
finance demo
#!/usr/bin/env python
from pylab import *
from matplotlib.dates import DateFormatter, WeekdayLocator, HourLocator, \
DayLocator, MONDAY
from matplotlib.finance import quotes_historical_yahoo, candlestick,\
plot_day_summary, candlestick2
# (Year, month, day) tuples suffice as args for quotes_historical_yahoo
date1 = ( 2004, 2, 1)
date2 = ( 2004, 4, 12 )
mondays = WeekdayLocator(MONDAY) # major ticks on the mondays
alldays = DayLocator() # minor ticks on the days
weekFormatter = DateFormatter('%b %d') # Eg, Jan 12
dayFormatter = DateFormatter('%d') # Eg, 12
quotes = quotes_historical_yahoo('INTC', date1, date2)
if len(quotes) == 0:
    raise SystemExit
fig = figure()
fig.subplots_adjust(bottom=0.2)
ax = fig.add_subplot(111)
ax.xaxis.set_major_locator(mondays)
ax.xaxis.set_minor_locator(alldays)
ax.xaxis.set_major_formatter(weekFormatter)
#ax.xaxis.set_minor_formatter(dayFormatter)
#plot_day_summary(ax, quotes, ticksize=3)
candlestick(ax, quotes, width=0.6)
ax.xaxis_date()
ax.autoscale_view()
setp( gca().get_xticklabels(), rotation=45, horizontalalignment='right')
show()
finance work 2
Are you free to use JRuby instead of Ruby? That would let you use JFreeChart, and your code would still be in Ruby.
Please look at the Open Flash Chart embedding for WHIFF:
http://aaron.oirt.rutgers.edu/myapp/docs/W1100_1600.openFlashCharts
An example of a candle chart is right at the top. This would be especially good for embedding in web pages.
Open Flash Chart is a nice choice if you like the look of the examples. I've moved to a JavaScript/Canvas library, Flot, for HTML-embedded charts, as it is more customizable and I get the desired effect without much hacking (http://itprolife.worona.eu/2009/08/scatter-chart-library-moving-to-flot.html).
This is a stock chart I drew just days ago using Matplotlib. I've posted the source too, for your reference: StockChart_Matplotlib