df.plot fails after pandas upgrade to v 1.0.1 - python

I was using pandas 0.23.4 and just upgraded to 1.0.1.
I have code that generates a dataframe, which I plot as a stacked bar plot with df.plot(kind='bar') and as an area plot with df.plot.area(). It was working fine. After upgrading pandas, neither of the plot commands works. Here is an example:
import pandas as pd
df = pd.DataFrame()
df["col1"] = [0.7, 0.2, 0.1, 0.0]
df["col2"] = [0.1, 0.5, 0.2, 0.2]
df["col3"] = [0.1, 0.0, 0.1, 0.8]
df.plot.area()
This gives the error TypeError: float() argument must be a string or a number, not '_NoValueType'.
I don't know how to fix this. I would appreciate any help.
Thanks!
EDIT: Full error message:
Traceback (most recent call last):
File "<ipython-input-96-b436d7233c8a>", line 1, in <module>
df.plot.area()
File "C:\Users\Anaconda3\lib\site-packages\pandas\plotting\_core.py", line 1363, in area
return self(kind="area", x=x, y=y, **kwargs)
File "C:\Users\Anaconda3\lib\site-packages\pandas\plotting\_core.py", line 847, in __call__
return plot_backend.plot(data, kind=kind, **kwargs)
File "C:\Users\Anaconda3\lib\site-packages\pandas\plotting\_matplotlib\__init__.py", line 61, in plot
plot_obj.generate()
File "C:\Users\Anaconda3\lib\site-packages\pandas\plotting\_matplotlib\core.py", line 262, in generate
self._setup_subplots()
File "C:\Users\Anaconda3\lib\site-packages\pandas\plotting\_matplotlib\core.py", line 321, in _setup_subplots
axes = fig.add_subplot(111)
File "C:\Users\Anaconda3\lib\site-packages\matplotlib\figure.py", line 1257, in add_subplot
a = subplot_class_factory(projection_class)(self, *args, **kwargs)
File "C:\Users\Anaconda3\lib\site-packages\matplotlib\axes\_subplots.py", line 74, in __init__
self.update_params()
File "C:\Users\Anaconda3\lib\site-packages\matplotlib\axes\_subplots.py", line 136, in update_params
return_all=True)
File "C:\Users\Anaconda3\lib\site-packages\matplotlib\gridspec.py", line 467, in get_position
fig_bottom = fig_bottoms[rows].min()
File "C:\Users\Anaconda3\lib\site-packages\numpy\core\_methods.py", line 32, in _amin
return umr_minimum(a, axis, None, out, keepdims, initial)
TypeError: float() argument must be a string or a number, not '_NoValueType'

Okay, I rebooted my computer and now everything works. No idea what was wrong before!

Related

What do these networkx errors mean which I am getting after setting a filter on my pandas dataframe?

I have a source, target and weight dataframe called Gr_Team which looks like this (sample) -
Passer Receiver count
116643 102747 27
102826 169102 10
116643 102826 7
167449 102826 8
102747 167449 4
Each Passer and Receiver have their unique x,y co-ordinates which I have in a dictionary loc - {'102739': [32.733999999999995, 26.534], '102747': [81.25847826086964, 27.686739130434784], '102826': [68.09609195402302, 77.52206896551728]}
I plotted this using networkx:
G=nx.from_pandas_edgelist(Gr_Team, 'Passer', 'Receiver', create_using=nx.DiGraph())
nx.draw(G, loc, with_labels=False, node_color='red',
node_size=Gr_Team['count']*100,
width=Gr_Team['count'],
edge_color = Gr_Team["count"],
edge_cmap = cmap,
arrowstyle='->',
arrowsize=10,
vmin=vmin,
vmax=vmax,
font_size=10,
font_weight="bold",
connectionstyle='arc3, rad=0.1')
That worked without any issues and here's what I got:
However, as soon as I try to filter out all the rows with count value below a constant, let's say 3, using this Gr_Team = Gr_Team[Gr_Team["count"]>3], I get a key error and here's the entire error and traceback which I can't make anything out of:
Warning (from warnings module):
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python37\lib\site-packages\networkx\drawing\nx_pylab.py", line 676
if cb.iterable(node_size): # many node sizes
MatplotlibDeprecationWarning:
The iterable function was deprecated in Matplotlib 3.1 and will be removed in 3.3. Use np.iterable instead.
Traceback (most recent call last):
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python37\PassMapOptaF24Networkx.py", line 148, in <module>
font_weight="bold")#,
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python37\lib\site-packages\networkx\drawing\nx_pylab.py", line 128, in draw
draw_networkx(G, pos=pos, ax=ax, **kwds)
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python37\lib\site-packages\networkx\drawing\nx_pylab.py", line 280, in draw_networkx
edge_collection = draw_networkx_edges(G, pos, arrows=arrows, **kwds)
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python37\lib\site-packages\networkx\drawing\nx_pylab.py", line 684, in draw_networkx_edges
arrow_color = edge_cmap(color_normal(edge_color[i]))
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\core\series.py", line 868, in __getitem__
result = self.index.get_value(self, key)
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\core\indexes\base.py", line 4375, in get_value
tz=getattr(series.dtype, 'tz', None))
File "pandas\_libs\index.pyx", line 81, in pandas._libs.index.IndexEngine.get_value
File "pandas\_libs\index.pyx", line 89, in pandas._libs.index.IndexEngine.get_value
File "pandas\_libs\index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 987, in pandas._libs.hashtable.Int64HashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 993, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 1
I realized that doing only nx.draw(G, loc, with_labels=False, node_color='red') still worked but as soon as I try to pass node_size or edge_color, it hits the above error. From my understanding, the error is only when I'm using the dataframe Gr_Team in the keyword arguments.
I can't figure out why that's happening and why filtering breaks the code. Any help would be appreciated.
EDIT 1: Here's a gist of the entire code. I tried my best to keep it minimal. Here's the link to the csv file that needs to be read in as df. The line which produces the error is also in there, commented out.
These error lines give a good hint:
File "pandas\_libs\index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
KeyError: 1
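The problem seems to be that networkx reads the values passed for node_size, width and edge_color by position (the traceback shows edge_color[i]), while a pandas Series with an integer index treats i as a label. After filtering, the index of Gr_Team is no longer consecutive, so a label such as 1 can be missing. Resetting the index, using
Gr_Team = Gr_Team[Gr_Team["count"] > 3].reset_index(drop=True)
should solve the problem (drop=True only keeps the old index from being added as an extra column).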
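To see why the filter leads to KeyError: 1, here is a minimal sketch that needs only pandas. The rows are the sample from the question, except that the second count is changed to 2 (an assumption for illustration) so that the filter actually drops a row. The names filtered, renumbered and widths are hypothetical.

```python
import pandas as pd

# Sample rows from the question; the second count is changed to 2
# (an assumption for illustration) so that the filter removes a row.
gr_team = pd.DataFrame({
    "Passer":   [116643, 102826, 116643, 167449, 102747],
    "Receiver": [102747, 169102, 102826, 102826, 167449],
    "count":    [27, 2, 7, 8, 4],
})

filtered = gr_team[gr_team["count"] > 3]
print(list(filtered.index))  # the index labels now have a gap

# With an integer index, Series[i] looks up the label i, not the position i,
# so asking for the dropped label raises KeyError.
try:
    filtered["count"][1]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Option 1: renumber the rows so labels and positions agree again.
renumbered = filtered.reset_index(drop=True)

# Option 2: hand over plain positional values instead of a Series.
widths = filtered["count"].tolist()
```

Either renumbered["count"] or widths can then be indexed with 0, 1, 2, ... without hitting a missing label.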

statsmodels SARIMAX with exogenous variables matrices are different sizes

I'm running a SARIMAX model but running into problems with specifying the exogenous variables. In the first block of code (below) I specify one exogenous variable lesdata['LESpost'] and the model runs without a problem. However, when I add in another exogenous variable I end up with an error message (see stack trace).
ar = (1,0,1) # AR(1 3)
ma = (0) # No MA terms
mod1 = sm.tsa.statespace.SARIMAX(lesdata['emadm'], exog= (lesdata['LESpost'],lesdata['QOF']), trend='c', order=(ar,0,ma), mle_regression=True)
Traceback (most recent call last):
File "<ipython-input-129-d1300aeaeffc>", line 4, in <module>
mle_regression=True)
File "C:\Users\danie\Anaconda2\lib\site-packages\statsmodels\tsa\statespace\sarimax.py", line 510, in __init__
endog, exog=exog, k_states=k_states, k_posdef=k_posdef, **kwargs
File "C:\Users\danie\Anaconda2\lib\site-packages\statsmodels\tsa\statespace\mlemodel.py", line 84, in __init__
missing='none')
File "C:\Users\danie\Anaconda2\lib\site-packages\statsmodels\tsa\base\tsa_model.py", line 43, in __init__
super(TimeSeriesModel, self).__init__(endog, exog, missing=missing)
File "C:\Users\danie\Anaconda2\lib\site-packages\statsmodels\base\model.py", line 212, in __init__
super(LikelihoodModel, self).__init__(endog, exog, **kwargs)
File "C:\Users\danie\Anaconda2\lib\site-packages\statsmodels\base\model.py", line 63, in __init__
**kwargs)
File "C:\Users\danie\Anaconda2\lib\site-packages\statsmodels\base\model.py", line 88, in _handle_data
data = handle_data(endog, exog, missing, hasconst, **kwargs)
File "C:\Users\danie\Anaconda2\lib\site-packages\statsmodels\base\data.py", line 630, in handle_data
**kwargs)
File "C:\Users\danie\Anaconda2\lib\site-packages\statsmodels\base\data.py", line 80, in __init__
self._check_integrity()
File "C:\Users\danie\Anaconda2\lib\site-packages\statsmodels\base\data.py", line 496, in _check_integrity
super(PandasData, self)._check_integrity()
File "C:\Users\danie\Anaconda2\lib\site-packages\statsmodels\base\data.py", line 403, in _check_integrity
raise ValueError("endog and exog matrices are different sizes")
ValueError: endog and exog matrices are different sizes
Is there something obvious I am missing here? The variables are all of the same length and there are no missing data.
Thanks for reading and hope you can help !
Two-dimensional data needs to have observations in rows and variables in columns after numpy.asarray is applied.
exog = (lesdata['LESpost'],lesdata['QOF'])
Applying asarray to this tuple puts the variables in rows. That is the numpy default (it comes from its C, row-major origin), but it is not what statsmodels wants.
DataFrames are already shaped in the appropriate way, so one option is to use a DataFrame with the desired columns:
exog = lesdata[['LESpost', 'QOF']]
Another option for lists or tuples of array_likes is to use numpy.column_stack, e.g.
exog = np.column_stack((lesdata['LESpost'].values, lesdata['QOF'].values))
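The shape difference is easy to check with numpy alone. This is a small sketch with two made-up variables of five observations each (les_post and qof are hypothetical stand-ins for the columns in the question):

```python
import numpy as np

# Two hypothetical variables with five observations each.
les_post = np.array([0, 0, 1, 1, 1])
qof = np.array([0.5, 0.7, 0.2, 0.9, 0.4])

# asarray on a tuple stacks the variables as rows: one row per variable.
as_tuple = np.asarray((les_post, qof))

# column_stack places each variable in its own column: one row per observation.
stacked = np.column_stack((les_post, qof))

print(as_tuple.shape)
print(stacked.shape)
```

The first array has one row per variable, so its length along axis 0 no longer matches endog; the second has one row per observation.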

Memory error when computing eigenvalues in python

I am trying to find the eigenvalues of the adjacency matrix of a large graph (465,017 nodes, 834,797 edges) using the NetworkX adjacency_spectrum method. When I run the script I get a memory error.
Traceback (most recent call last):
File "5.py", line 19, in <module>
w=nx.adjacency_spectrum(G)
File "/home/aiym/anaconda3/lib/python3.5/site-packages/networkx/linalg/spectrum.py", line 75, in adjacency_spectrum
return eigvals(nx.adjacency_matrix(G,weight=weight).todense())
File "/home/aiym/anaconda3/lib/python3.5/site-packages/scipy/sparse/base.py", line 691, in todense
return np.asmatrix(self.toarray(order=order, out=out))
File "/home/aiym/anaconda3/lib/python3.5/site-packages/scipy/sparse/compressed.py", line 920, in toarray
return self.tocoo(copy=False).toarray(order=order, out=out)
File "/home/aiym/anaconda3/lib/python3.5/site-packages/scipy/sparse/coo.py", line 252, in toarray
B = self._process_toarray_args(order, out)
File "/home/aiym/anaconda3/lib/python3.5/site-packages/scipy/sparse/base.py", line 1009, in _process_toarray_args
return np.zeros(self.shape, dtype=self.dtype, order=order)
MemoryError
Can you help me fix this problem, or suggest other methods to compute eigenvalues without a memory error?
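The traceback shows adjacency_spectrum calling .todense() on the sparse adjacency matrix. A dense 465,017 x 465,017 array has about 2.2e11 entries, roughly 1.7 TB at 8 bytes each, which would explain the MemoryError. One possible way around it, sketched here under assumptions, is to compute only a few eigenvalues directly from the sparse matrix with scipy.sparse.linalg.eigsh. The sketch assumes an undirected graph (a symmetric matrix); for a directed graph scipy.sparse.linalg.eigs would be the analogous call. The 50-node path graph and the helper name top_eigenvalues are stand-ins for illustration, not part of the question's code.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

def top_eigenvalues(adj, k=3):
    """Return the k largest eigenvalues of a symmetric sparse matrix, ascending."""
    # eigsh works on the sparse matrix directly; it needs a floating-point dtype.
    vals = eigsh(adj.astype(float), k=k, which="LA", return_eigenvectors=False)
    return np.sort(vals)

# Stand-in for the real graph: adjacency matrix of an undirected path on 50 nodes.
n = 50
ones = np.ones(n - 1)
adj = sp.diags([ones, ones], offsets=[-1, 1], format="csr")

print(top_eigenvalues(adj, k=3))
```

With a NetworkX graph, the sparse matrix returned by nx.adjacency_matrix(G) could be passed in place of adj. This gives only k eigenvalues, not the full spectrum, which is usually the practical limit for a graph of this size.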

matplotlib text (and figtext) error

I'm trying to plot text values instead of symbols (for an MDS solution), and matplotlib.pyplot is giving me errors I don't understand. I've updated ipython and matplotlib to make sure it's not an old problem (or a problem with old versions), and I haven't been able to find any answers or reports of similar problems here (or elsewhere via google).
So, for example, after invoking ipython --pylab, if I type:
x = random.rand(4)
y = random.rand(4)
s = [str(i) for i in arange(4)+1]
text(x,y,s)
I get this error:
Traceback (most recent call last):
File "//anaconda/lib/python2.7/site-packages/matplotlib/artist.py", line 55, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "//anaconda/lib/python2.7/site-packages/matplotlib/figure.py", line 1034, in draw
func(*args)
File "//anaconda/lib/python2.7/site-packages/matplotlib/artist.py", line 55, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "//anaconda/lib/python2.7/site-packages/matplotlib/axes.py", line 2086, in draw
a.draw(renderer)
File "//anaconda/lib/python2.7/site-packages/matplotlib/artist.py", line 55, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "//anaconda/lib/python2.7/site-packages/matplotlib/text.py", line 547, in draw
bbox, info, descent = self._get_layout(renderer)
File "//anaconda/lib/python2.7/site-packages/matplotlib/text.py", line 287, in _get_layout
key = self.get_prop_tup()
File "//anaconda/lib/python2.7/site-packages/matplotlib/text.py", line 696, in get_prop_tup
x, y = self.get_position()
File "//anaconda/lib/python2.7/site-packages/matplotlib/text.py", line 684, in get_position
x = float(self.convert_xunits(self._x))
TypeError: only length-1 arrays can be converted to Python scalars
I get the same error if I try calling text with scalars rather than vectors/lists (e.g., text(x[0],y[0],s[0]), or any number of variants of the arguments to the text function). The same thing happens:
with figtext,
if I manually import matplotlib.pyplot as plt and call plt.text, and
if I explicitly make figure and subplot objects and/or call scatter(x,y) first.
Also, for what it's worth, once this problem occurs, the error message appears again if I manually resize the figure. Possibly related is the fact that changes to figures don't update automatically, but only after I plot in another subplot or manually resize the figure. But I digress.
I've got an updated installation of Anaconda on a Mac (with Mavericks), and, as mentioned above, I'm using iPython.
plt.text expects a single x, a single y, and a single string, not sequences. (See: http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.text )
Just use a loop.
For example:
import numpy as np
import matplotlib.pyplot as plt
x, y = np.random.rand(2,4)
s = [str(i) for i in np.arange(1, 5)]
fig, ax = plt.subplots()
text = [ax.text(*item) for item in zip(x, y, s)]
plt.show()

Drawing under a curve in matplotlib

For a subplot (self.intensity), I want to shade the area under the graph.
I tried this, hoping it was the correct syntax:
self.intensity.fill_between(arange(l,r), 0, projection)
I intend this to shade the area under the projection numpy array within the integer limits (l, r).
But it gives me an error. How do I do it correctly?
Here's the traceback:
Traceback (most recent call last):
File "/usr/lib/pymodules/python2.7/matplotlib/backends/backend_wx.py", line 1289, in _onLeftButtonDown
FigureCanvasBase.button_press_event(self, x, y, 1, guiEvent=evt)
File "/usr/lib/pymodules/python2.7/matplotlib/backend_bases.py", line 1576, in button_press_event
self.callbacks.process(s, mouseevent)
File "/usr/lib/pymodules/python2.7/matplotlib/cbook.py", line 265, in process
proxy(*args, **kwargs)
File "/usr/lib/pymodules/python2.7/matplotlib/cbook.py", line 191, in __call__
return mtd(*args, **kwargs)
File "/root/dev/spectrum/spectrum/plot_handler.py", line 55, in _onclick
self._call_click_callback(event.xdata)
File "/root/dev/spectrum/spectrum/plot_handler.py", line 66, in _call_click_callback
self.__click_callback(data)
File "/root/dev/spectrum/spectrum/plot_handler.py", line 186, in _on_plot_click
band_data = self._band_data)
File "/root/dev/spectrum/spectrum/plot_handler.py", line 95, in draw
self.intensity.fill_between(arange(l,r), 0, projection)
File "/usr/lib/pymodules/python2.7/matplotlib/axes.py", line 6457, in fill_between
raise ValueError("Argument dimensions are incompatible")
ValueError: Argument dimensions are incompatible
It seems like you are trying to fill the part of the projection from l to r. fill_between expects the x and y arrays to be of equal length, so you cannot pass arange(l,r) together with the full projection array to fill only part of the curve.
To get what you want, you can do either of the following:
1. Pass only the part of the projection that needs to be filled to fill_between, and draw the rest of the projection separately.
2. Pass a separate boolean array (the where argument) that defines the sections to fill in. See the documentation!
For the former method, see the example code below:
import numpy as np
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
t = np.arange(1, 100) / 50.
projection = np.sin(2 * np.pi * t)
# Draw the original curve
ax.plot(t, projection)
# Define the index range to fill in
l, r = 10, 50
# Fill between the curve and y=0 over that range only
ax.fill_between(t[l:r], projection[l:r])
plt.show()
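For the second option, a minimal sketch using the where argument of fill_between might look like this. It reuses the same curve and limits as the example for the first option. The way the mask is built and the Agg backend line are choices made for illustration, so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, chosen here only so no display is needed
import numpy as np
import matplotlib.pyplot as plt

t = np.arange(1, 100) / 50.0
projection = np.sin(2 * np.pi * t)

# Boolean mask with the same length as t: True where the fill should appear.
l, r = 10, 50
idx = np.arange(len(t))
mask = (idx >= l) & (idx < r)

fig, ax = plt.subplots()
# Draw the whole curve, then fill between it and y=0 only where the mask is True.
ax.plot(t, projection)
filled = ax.fill_between(t, projection, 0, where=mask)
```

Here x, y and the mask all have the same length, which is what fill_between checks. In an interactive session, drop the matplotlib.use line and call plt.show() at the end.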
