Creating a histogram and understanding both matplotlib axes and subplot - python

I've had some trouble piecing together the matplotlib documentation for my particular data set. I think I have a small enough block of code that shows everything I don't understand despite reading the documentation. The documentation I've been using for reference was initially this example for creating a line graph: https://matplotlib.org/gallery/text_labels_and_annotations/date.html
I've been trying to plot a numpy array, post_records, containing two columns. I'm working with social media data, so the first column is for post_ids and the second is for datetime_obj_col, which I managed to read from a CSV file using some scripting.
I managed to create a line graph with this data in matplotlib, but I don't quite know how to make a histogram.
Right now, nothing shows up when I run my program:
fig, ax = plt.subplots()
hist, bins, patch_lst = ax.hist(post_records[:,1], bins=range(31)) # thought that bins could be a sequence, wanted to create 31 bins for 31 total days in a month
ax.plot(hist, bins)
ax.set_xlabel('Days')
ax.set_ylabel('frequency')
ax.set_title(r'Histogram of Time')
plt.show() # shows nothing
What do I need to pass to ax.plot? I'm unclear about how to pass in my x dataset.
Why isn't the window showing?
Edit with how to replicate this:
def create_dataframe_of_datetime_objects_and_visualize():
    datetime_lst = [1521071920000000000, 1521071901000000000, 1521071844000000000, 1521071741000000000, 1521071534000000000] # to get this variable I loaded my original dataframe with 1980000, sliced the first 5 entries, then printed out the 'datetime_obj_col'. I can't exactly remember what this format is called, I think it's unix time.
    id_lst = [974013, 974072, 327212, 123890, 438201]
    for each in range(len(datetime_lst)):
        datetime_lst[each] = pd.to_datetime(datetime_lst[each], errors='coerce')
        datetime_lst[each] = datetime_lst[each].strftime("%d-%b-%y %H:%M:%S")
        datetime_lst[each] = pd.to_datetime(datetime_lst[each], errors='coerce', dayfirst=True, format="%d-%b-%y %H:%M:%S")
    datetime_lst = pd.Series(datetime_lst)
    df = pd.DataFrame({'tweet_id':id_lst, 'datetime_obj_col': datetime_lst})
    gb_var = df.groupby(df["datetime_obj_col"].dt.month)
    gb_var_count = gb_var.count()
    gb_var.plot(kind="bar")
    plt.show()
Note that I am not using a histogram anymore, but two errors should still come up, the following:
Traceback (most recent call last):
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\pandas\core
\groupby\groupby.py", line 918, in apply
result = self._python_apply_general(f)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\pandas\core
\groupby\groupby.py", line 936, in _python_apply_general
self.axis)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\pandas\core
\groupby\groupby.py", line 2273, in apply
res = f(group)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\pandas\core
\groupby\groupby.py", line 541, in f
return self.plot(*args, **kwargs)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\pandas\plot
ting_core.py", line 2941, in call
sort_columns=sort_columns, **kwds)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\pandas\plot
ting_core.py", line 1977, in plot_frame
**kwds)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\pandas\plot
ting_core.py", line 1804, in _plot
plot_obj.generate()
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\pandas\plot
ting_core.py", line 266, in generate
self._post_plot_logic_common(ax, self.data)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\pandas\plot
ting_core.py", line 405, in _post_plot_logic_common
self._apply_axis_properties(ax.yaxis, fontsize=self.fontsize)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\pandas\plot
ting_core.py", line 478, in _apply_axis_properties
labels = axis.get_majorticklabels() + axis.get_minorticklabels()
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
axis.py", line 1245, in get_majorticklabels
ticks = self.get_major_ticks()
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
axis.py", line 1396, in get_major_ticks
numticks = len(self.get_major_locator()())
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
dates.py", line 1249, in call
self.refresh()
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
dates.py", line 1269, in refresh
dmin, dmax = self.viewlim_to_dt()
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
dates.py", line 1026, in viewlim_to_dt
.format(vmin))
ValueError: view limit minimum 0.0 is less than 1 and is an invalid Matplotlib date value. This often happens if you pass a non-datetime value to an axis that has datetime units
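Judging by the traceback, the datetime column seems to end up on the y axis of a bar plot when .plot is called on the groupby object itself. A minimal sketch of one alternative, assuming the goal is one bar per month: count the rows per month first and plot that Series. The data are the five values from the snippet above; the names counts and ax are only illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# The five nanosecond timestamps and ids from the snippet above
df = pd.DataFrame({
    "tweet_id": [974013, 974072, 327212, 123890, 438201],
    "datetime_obj_col": pd.to_datetime([
        1521071920000000000,
        1521071901000000000,
        1521071844000000000,
        1521071741000000000,
        1521071534000000000,
    ]),
})

# Count rows per calendar month; the result is a Series of plain integers
counts = df.groupby(df["datetime_obj_col"].dt.month).size()

# Plot the counts, not the groupby object, so no datetime values reach the y axis
ax = counts.plot(kind="bar")
ax.set_xlabel("Month")
ax.set_ylabel("Number of posts")
# plt.show() would display the figure when an interactive backend is active
```

With these five rows everything falls in March, so the plot has a single bar.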
Edit:
This is starting to seem like a bug related specifically to trying to use hist() to plot a column of datetime objects.
I took data from post_records which is a loaded numpy array that stores a 2d data set of 198000+ post ids and datetime objects.
This is the code for a function called create_datetime_objects. It opens a csv file, "tweet_time_info_preprocessed.csv", which only has three columns: "tweet_id", "tweet_created_at_date", and "tweet_created_at_hour". The following code combines the tweet_created_at_date and tweet_created_at_hour columns into formatted datetime objects using the pandas to_datetime() function.
Csv file sample
def create_datetime_objects():
    post_and_datetime_lst = []  # one [post_id, datetime] pair per csv row
    with open("post_time_info_preprocessed.csv", 'r', encoding='utf8') as time_csv:
        mycsv = csv.reader(time_csv)
        progress = 0
        for row in mycsv:
            progress += 1
            if progress == 1: # header row
                continue
            if progress % 10000 == 0:
                print(progress)
            each_post_datetime_lst = []
            each_post_datetime_lst.append(row[0])
            time_str = str(row[1]) + " " + str(row[2])
            a_date_object = pd.to_datetime(time_str, dayfirst=True, format="%d-%b-%y %H:%M:%S")
            each_post_datetime_lst.append(a_date_object)
            post_and_datetime_lst.append(each_post_datetime_lst)
    numpy_arr_of_tweets_and_datetimes = np.array(post_and_datetime_lst)
    np.save(np_save_path, numpy_arr_of_tweets_and_datetimes)
then I have visualize_objects_histogram()
def visualize_objects_histogram():
    print("Visualizing timeplot as histogram")
    post_records= np.load("tweets_and_datetime_objects.npy")
    df = pd.DataFrame(data=post_records, columns=['post_id', 'datetime_obj_col'])
    df_sliced = df[0:5]
    print(df_sliced)
    fig, ax = plt.subplots()
    hist, bins, patch_lst = ax.hist(df_sliced['datetime_obj_col'], bins=range(5))
    ax.plot(hist, bins)
    ax.set_xlabel('Days')
    ax.set_ylabel('frequency')
    ax.set_title('Histogram of Time')
    plt.show()
So I sliced off 5 rows of the data frame and stored them in df_sliced. When I run this code, a blank white window appears. Printing df_sliced gives
tweet_id datetime_obj_col
0 974072352958042112 2018-03-14 23:58:40
1 974072272578166784 2018-03-14 23:58:21
2 974072032177598464 2018-03-14 23:57:24
3 974071601313533953 2018-03-14 23:55:41
4 974070732777914368 2018-03-14 23:52:14
And there's also an error message that comes with the blank white window. It's very long.
Exception in Tkinter callback
Traceback (most recent call last):
File "C:\Users\biney\AppData\Local\Programs\Python\Python36-32\lib\tkinter__i
nit__.py", line 1699, in call
return self.func(*args)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
backends_backend_tk.py", line 227, in resize
self.draw()
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
backends\backend_tkagg.py", line 12, in draw
super(FigureCanvasTkAgg, self).draw()
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
backends\backend_agg.py", line 433, in draw
self.figure.draw(self.renderer)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
artist.py", line 55, in draw_wrapper
return draw(artist, renderer, *args, **kwargs)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
figure.py", line 1475, in draw
renderer, self, artists, self.suppressComposite)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
image.py", line 141, in _draw_list_compositing_images
a.draw(renderer)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
artist.py", line 55, in draw_wrapper
return draw(artist, renderer, *args, **kwargs)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
axes_base.py", line 2607, in draw
mimage._draw_list_compositing_images(renderer, self, artists)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
image.py", line 141, in _draw_list_compositing_images
a.draw(renderer)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
artist.py", line 55, in draw_wrapper
return draw(artist, renderer, *args, **kwargs)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
axis.py", line 1190, in draw
ticks_to_draw = self._update_ticks(renderer)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
axis.py", line 1028, in _update_ticks
tick_tups = list(self.iter_ticks()) # iter_ticks calls the locator
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
axis.py", line 971, in iter_ticks
majorLocs = self.major.locator()
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
dates.py", line 1249, in call
self.refresh()
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
dates.py", line 1269, in refresh
dmin, dmax = self.viewlim_to_dt()
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
dates.py", line 1026, in viewlim_to_dt
.format(vmin))
ValueError: view limit minimum -0.19500000000000003 is less than 1 and is an invalid Matplotlib date value. This often happens if you pass a non-datetime value to an axis that has datetime units
This error message repeats 5 times with slightly different values for the "view limit", possibly one error message for each of my 5 records. I think the error messages are most closely related to the following online version of dates.py, though I could be wrong: https://fossies.org/linux/matplotlib/lib/matplotlib/dates.py (around line 1022; I'm going to check the actual file on my computer soon).
I'm going to try stuff from this post to see if it will help: Can Pandas plot a histogram of dates?
Edit 2:
The previous Stack Overflow post introduced me to two helpful methods, but they didn't work. I changed my visualize... function to the following
def visualize_datetime_objects_with_pandas():
    tweets_and_datetime_objects = np.load("tweets_and_datetime_objects.npy") # contains python datetime objects
    print("with pandas")
    print(tweets_and_datetime_objects.shape)
    df = pd.DataFrame(data=tweets_and_datetime_objects, columns=['tweet_id', 'datetimeobj'])
    pandas_freq_dict = df['datetimeobj'].value_counts().to_dict()
    #print(pandas_freq_dict)
    print(len(list(pandas_freq_dict.keys())))
    print(list(pandas_freq_dict.keys())[0])
    print(list(pandas_freq_dict.values())[1])
    plt.plot(pandas_freq_dict.keys(), pandas_freq_dict.values())
    #df = df.set_index('datetimeobj')
    # changing the index of this dataframe to a time index
    #df['datetimeobj'].plot(kind='line', style=['--'])
    plt.show()
It gives the following output/error message.
date-time temporal data visualization script
Visualizing timeplot as histogram
tweet_id datetime_obj_col
datetime_obj_col
14 5 5
tweet_id datetime_obj_col
datetime_obj_col
14 5 5
Traceback (most recent call last):
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\pandas\core
\groupby\groupby.py", line 918, in apply
result = self._python_apply_general(f)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\pandas\core
\groupby\groupby.py", line 936, in _python_apply_general
self.axis)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\pandas\core
\groupby\groupby.py", line 2273, in apply
res = f(group)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\pandas\core
\groupby\groupby.py", line 541, in f
return self.plot(*args, **kwargs)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\pandas\plot
ting_core.py", line 2941, in call
sort_columns=sort_columns, **kwds)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\pandas\plot
ting_core.py", line 1977, in plot_frame
**kwds)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\pandas\plot
ting_core.py", line 1804, in _plot
plot_obj.generate()
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\pandas\plot
ting_core.py", line 266, in generate
self._post_plot_logic_common(ax, self.data)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\pandas\plot
ting_core.py", line 405, in _post_plot_logic_common
self._apply_axis_properties(ax.yaxis, fontsize=self.fontsize)
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\pandas\plot
ting_core.py", line 478, in _apply_axis_properties
labels = axis.get_majorticklabels() + axis.get_minorticklabels()
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
axis.py", line 1245, in get_majorticklabels
ticks = self.get_major_ticks()
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
axis.py", line 1396, in get_major_ticks
numticks = len(self.get_major_locator()())
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
dates.py", line 1249, in call
self.refresh()
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
dates.py", line 1269, in refresh
dmin, dmax = self.viewlim_to_dt()
File "C:\Users\biney\AppData\Roaming\Python\Python36\site-packages\matplotlib\
dates.py", line 1026, in viewlim_to_dt
.format(vmin))
ValueError: view limit minimum 0.0 is less than 1 and is an invalid Matplotlib date value. This often happens if you pass a non-datetime value to an axis that has datetime units
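The error message hints that integer bin edges such as range(31) are being mixed with an axis that carries datetime units. One way to sidestep that, sketched here under the assumption that the goal is a count of posts per day of the month, is to histogram the integer day number instead of the datetime objects. The DataFrame below reuses the five timestamps printed in the question as a stand-in for post_records; day_of_month, counts and edges are illustrative names. Note that ax.hist already draws the bars, so no separate ax.plot call is needed.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for an interactive window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# The five timestamps printed in the question, standing in for post_records
df = pd.DataFrame({
    "post_id": [974013, 974072, 327212, 123890, 438201],
    "datetime_obj_col": pd.to_datetime([
        "2018-03-14 23:58:40",
        "2018-03-14 23:58:21",
        "2018-03-14 23:57:24",
        "2018-03-14 23:55:41",
        "2018-03-14 23:52:14",
    ]),
})

# Plain integers 1..31, so the axis never gets datetime units
day_of_month = df["datetime_obj_col"].dt.day

fig, ax = plt.subplots()
# 32 edges give 31 bins, one per possible day of the month
counts, edges, patches = ax.hist(day_of_month, bins=np.arange(1, 33))
ax.set_xlabel("Day of month")
ax.set_ylabel("Frequency")
ax.set_title("Histogram of posts per day")
# plt.show() would display the figure when an interactive backend is active
```

All five sample posts fall on the 14th, so only that bin is non-empty here; with the full data set each bin would hold the count for one day.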

Related

Strange error occurred in Python program using matplotlib and AutoDateLocator

I have created a Python program that uses matplotlib to plot data. It had been working fine until today, when an error occurred. The program uses AutoDateLocator and ConciseDateFormatter. As you can see in the last line of the error log below, the error is:
> IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
Since it would be really difficult to post more details about my program functionality as well as the data used, I was wondering if there is an obvious solution to the above-mentioned problem, or if you could guide me to where I should look for the problem.
Exception in Tkinter callback
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\lib\tkinter_init_.py", line 1883,
in call
return self.func(*args)
File "C:\Users\Nick\Desktop\Uni\TUC
studies\thesis\Code\python\test\myGUI_V3.py", line 841, in plotMeas
fig.tight_layout()
File
"C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\cbook\deprecation.py",
line 411, in
wrapper
return func(*inner_args, **inner_kwargs)
File
"C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\figure.py",
line 2613, in tight_layout
kwargs = get_tight_layout_figure(
File
"C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\tight_layout.py",
line 303, in get_tight_layout_figure
kwargs = auto_adjust_subplotpars(fig, renderer,
File
"C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\tight_layout.py",
line 84, in
auto_adjust_subplotpars
bb += [ax.get_tightbbox(renderer, for_layout_only=True)]
File
"C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\axes_base.py",
line 4155, in
get_tightbbox
bb_xaxis = self.xaxis.get_tightbbox(
File
"C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\axis.py", line
1109, in get_tightbbox
ticks_to_draw = self._update_ticks()
File
"C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\axis.py", line
1030, in _update_ticks
minor_labels = self.minor.formatter.format_ticks(minor_locs)
File
"C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\dates.py", line
797, in format_ticks
if len(np.unique(tickdate[:, level])) > 1:
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
Traceback (most recent call last):
File
"C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\backends\backend_qt5.py",
line 480, in
_draw_idle
self.draw()
File
"C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\backends\backend_agg.py",
line 407, in draw
self.figure.draw(self.renderer)
File
"C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\artist.py",
line 41, in draw_wrapper
return draw(artist, renderer, *args, **kwargs)
File
"C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\figure.py",
line 1863, in draw
mimage._draw_list_compositing_images(
File
"C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\image.py", line
131, in
_draw_list_compositing_images
a.draw(renderer)
File
"C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\artist.py",
line 41, in draw_wrapper
return draw(artist, renderer, *args, **kwargs)
File
"C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\cbook\deprecation.py",
line 411, in wrapper
return func(*inner_args, **inner_kwargs)
File
"C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\axes_base.py",
line 2747, in draw
mimage._draw_list_compositing_images(renderer, self, artists)
File
"C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\image.py", line
131, in
_draw_list_compositing_images
a.draw(renderer)
File
"C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\artist.py",
line 41, in draw_wrapper
return draw(artist, renderer, *args, **kwargs)
File
"C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\axis.py", line
1164, in draw
ticks_to_draw = self._update_ticks()
File
"C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\axis.py", line
1030, in _update_ticks
minor_labels = self.minor.formatter.format_ticks(minor_locs)
File
"C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\dates.py", line
797, in format_ticks
if len(np.unique(tickdate[:, level])) > 1:
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
As it turns out, the problem was in the setup of the AutoDateLocator. For this specific dataset, the minor ticks could not be set correctly with the values I had selected for minticks and maxticks and the tuples chosen for intervald. As a result, when the ConciseDateFormatter tried to use the locator for the minor ticks, the above-mentioned error occurred.
When I used a simple DateFormatter, there was no error, but neither the minor ticks nor their labels appeared on the plot.
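For comparison, here is a minimal sketch of one setup that does not depend on AutoDateLocator for the minor ticks. It is not the original program: the 6-hourly data (dates, values) are made up, and the choice of HourLocator for the minor ticks is an assumption. ConciseDateFormatter is attached only to the major AutoDateLocator, and the minor ticks come from a fixed-interval locator and stay unlabeled.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for an interactive window
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# Hypothetical data: one sample every 6 hours over 10 days
dates = [dt.datetime(2021, 1, 1) + dt.timedelta(hours=6 * i) for i in range(40)]
values = list(range(40))

fig, ax = plt.subplots()
ax.plot(dates, values)

# Major ticks: let AutoDateLocator choose, and pair it with ConciseDateFormatter
major_locator = mdates.AutoDateLocator(minticks=3, maxticks=8)
ax.xaxis.set_major_locator(major_locator)
ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(major_locator))

# Minor ticks: a fixed 6-hour grid; the default minor formatter leaves them unlabeled
ax.xaxis.set_minor_locator(mdates.HourLocator(byhour=range(0, 24, 6)))

fig.canvas.draw()  # force tick computation so any locator error would surface here
```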

Python error on data.plot(x=data.timestamp, style=".-")

In Python I am trying to use the following command to get timestamp info on my plot. I've imported the proper packages pandas and pylab. The data is all cleaned as well.
data.plot(x=data.timestamp, style=".-")
I keep getting a massive error with lots of different things. I am following along to https://www.youtube.com/watch?v=5XGycFIe8qE and it comes at 38 minutes. Here is the error I get: It's massive
Traceback (most recent call last):
File "", line 1, in
data.plot(x=data.timestamp, style=".-")
File "C:\Python3\lib\site-packages\pandas\plotting_core.py", line 2673, in call
sort_columns=sort_columns, **kwds)
File "C:\Python3\lib\site-packages\pandas\plotting_core.py", line 1900, in plot_frame
**kwds)
File "C:\Python3\lib\site-packages\pandas\plotting_core.py", line 1727, in _plot
plot_obj.generate()
File "C:\Python3\lib\site-packages\pandas\plotting_core.py", line 260, in generate
self._post_plot_logic_common(ax, self.data)
File "C:\Python3\lib\site-packages\pandas\plotting_core.py", line 395, in _post_plot_logic_common
self._apply_axis_properties(ax.yaxis, fontsize=self.fontsize)
File "C:\Python3\lib\site-packages\pandas\plotting_core.py", line 468, in _apply_axis_properties
labels = axis.get_majorticklabels() + axis.get_minorticklabels()
File "C:\Python3\lib\site-packages\matplotlib\axis.py", line 1188, in get_majorticklabels
ticks = self.get_major_ticks()
File "C:\Python3\lib\site-packages\matplotlib\axis.py", line 1339, in get_major_ticks
numticks = len(self.get_major_locator()())
File "C:\Python3\lib\site-packages\matplotlib\dates.py", line 1054, in call
self.refresh()
File "C:\Python3\lib\site-packages\matplotlib\dates.py", line 1074, in refresh
dmin, dmax = self.viewlim_to_dt()
File "C:\Python3\lib\site-packages\matplotlib\dates.py", line 832, in viewlim_to_dt
return num2date(vmin, self.tz), num2date(vmax, self.tz)
File "C:\Python3\lib\site-packages\matplotlib\dates.py", line 441, in num2date
return _from_ordinalf(x, tz)
File "C:\Python3\lib\site-packages\matplotlib\dates.py", line 256, in _from_ordinalf
dt = datetime.datetime.fromordinal(ix).replace(tzinfo=UTC)
ValueError: ordinal must be >= 1
I had a similar issue and I believe it is a version issue with pandas and the Jupyter Notebook that are different from what Quentin is using in his examples here: https://github.com/QCaudron/pydata_pandas and what you, and I, are using.
Try this:
data.plot('timestamp', style='.-')
or
data.plot(x='timestamp', style='.-')
Per the pandas docs https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.plot.html either a label or position should work.
DataFrame
x : label or position, default None
y : label or position, default None
Allows plotting of one column versus another
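To make the suggestion concrete, here is a small sketch with a made-up DataFrame. The column names timestamp and value and the numbers are assumptions for illustration, not the data from the video. The x argument is given as a column label rather than as the Series data.timestamp.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for an interactive window
import pandas as pd

# Hypothetical stand-in for the cleaned data set
data = pd.DataFrame({
    "timestamp": pd.date_range("2018-03-14", periods=5, freq="D"),
    "value": [3, 1, 4, 1, 5],
})

# Pass the column label, not the column itself
ax = data.plot(x="timestamp", y="value", style=".-")
```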

Out of memory with Python but OK with Matlab while displaying the same 2D data as image

I have a 2D array to display as image (it is 500 by 20 000).
Python:
import numpy as np
from matplotlib import pyplot as plt
spect_data = np.loadtxt('some_data.txt')
plt.figure(figsize=(12,9))
plt.imshow(spect_data,aspect='auto')
plt.colorbar()
plt.show()
Matlab:
spect_data=load('some_data.txt');
imagesc(spect_data)
Here's the error I get (sorry I wasn't clear about my problem the first time):
Traceback (most recent call last):
File
"C:\Users\User\Anaconda\lib\site-packages\IPython\core\formatters.py",
line 339, in call
return printer(obj)
File
"C:\Users\User\Anaconda\lib\site-packages\IPython\core\pylabtools.py",
line 228, in
png_formatter.for_type(Figure, lambda fig: print_figure(fig, 'png', **kwargs))
File
"C:\Users\User\Anaconda\lib\site-packages\IPython\core\pylabtools.py",
line 119, in print_figure
fig.canvas.print_figure(bytes_io, **kw)
File
"C:\Users\User\Anaconda\lib\site-packages\matplotlib\backend_bases.py",
line 2180, in print_figure
**kwargs)
File
"C:\Users\User\Anaconda\lib\site-packages\matplotlib\backends\backend_agg.py",
line 527, in print_png
FigureCanvasAgg.draw(self)
File
"C:\Users\User\Anaconda\lib\site-packages\matplotlib\backends\backend_agg.py",
line 474, in draw
self.figure.draw(self.renderer)
File
"C:\Users\User\Anaconda\lib\site-packages\matplotlib\artist.py", line
61, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File
"C:\Users\User\Anaconda\lib\site-packages\matplotlib\figure.py", line
1159, in draw
func(*args)
File
"C:\Users\User\Anaconda\lib\site-packages\matplotlib\artist.py", line
61, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File
"C:\Users\User\Anaconda\lib\site-packages\matplotlib\axes_base.py",
line 2324, in draw
a.draw(renderer)
File
"C:\Users\User\Anaconda\lib\site-packages\matplotlib\artist.py", line
61, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "C:\Users\User\Anaconda\lib\site-packages\matplotlib\image.py",
line 389, in draw
im = self.make_image(renderer.get_image_magnification())
File "C:\Users\User\Anaconda\lib\site-packages\matplotlib\image.py",
line 624, in make_image
transformed_viewLim)
File "C:\Users\User\Anaconda\lib\site-packages\matplotlib\image.py",
line 238, in _get_unsampled_image
x = (x * 255).astype(np.uint8)
MemoryError
I'm not sure this will solve your problem, but you seem to have the data in memory several times - as a numpy array, as a list of floats, and as a list of strings.
If you only need the numpy array, you could use
np.loadtxt
or
np.fromfile
if you need more control over how the data is read.
This assumes (you do not specify) that the data is in an ASCII file.
For a more specific answer, you should post your code so people can see what you are doing and where the problem might be.
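As a rough sketch of the np.loadtxt suggestion: its dtype parameter lets the text file be read straight into a single-precision array, which takes half the memory of the default float64. The tiny temporary file below stands in for some_data.txt and its contents are made up; whether this alone avoids the MemoryError above is not something the sketch can show.

```python
import os
import tempfile

import numpy as np

# Write a small made-up text file standing in for some_data.txt
rows, cols = 5, 8
path = os.path.join(tempfile.mkdtemp(), "some_data.txt")
np.savetxt(path, np.arange(rows * cols, dtype=float).reshape(rows, cols))

# Read it back as float32: 4 bytes per value instead of 8
spect_data = np.loadtxt(path, dtype=np.float32)
print(spect_data.shape, spect_data.dtype, spect_data.nbytes)
```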

plot dates and values in python

I have a file containing the data below. The first column holds dates. I need to plot all the other columns against the dates in the first column. I tried to use the code below, but I receive an error message. The code and the error are provided.
Data
2010-01-01,1.628,0.7063157895,0,0.9216842105
2010-01-03,1.602631579,0.6901052632,0,0.9125263158
2010-01-04,1.5818947369,0.6775789474,0,0.9043157895
2010-01-05,1.5755789473,0.6716842105,0,0.9038947368
2010-01-06,1.5605263158,0.6622105263,0,0.8983157895
2010-01-07,1.5611578948,0.6608421053,0,0.9003157895
2010-01-08,1.5598947369,0.6593684211,0,0.9005263158
2010-01-09,1.5576842105,0.6569473684,0,0.9007368421
2010-01-10,1.5462105263,0.6543157895,0,0.8918947368
2010-01-11,1.5656842105,0.6666315789,0,0.8990526316
2010-01-12,1.5517894736,0.6546315789,0,0.8971578947
2010-01-13,1.5558947368,0.6551578947,0,0.9007368421
2010-01-14,1.5638947369,0.6588421053,0,0.9050526316
2010-01-15,1.5375789474,0.6432631579,0,0.8943157895
2010-01-16,1.522631579,0.6352631579,0,0.8873684211
2010-01-17,1.5056842105,0.6254736842,0,0.8802105263
2010-01-18,1.4881052632,0.6157894737,0,0.8723157895
2010-01-19,1.4889842789,0.6251948052,0,0.8637894737
2010-01-20,1.4733383459,0.6182857143,0,0.8550526316
2010-01-21,1.4507368421,0.6009473684,0,0.8497894737
Code
import csv
import datetime as dt
import matplotlib.pyplot as plt
lis1=[]
lis2=[]
lis3=[]
lis4=[]
lis5=[]
with open('/home/omar/Desktop/finall.csv', 'rU') as f:
    reader=csv.reader(f, delimiter=',')
    for row in reader:
        lis1.append(dt.datetime.strptime(row[0],'%Y-%m-%d'))
        lis2.append(row[1])
        lis3.append(row[2])
        lis4.append(row[3])
        lis5.append(row[4])
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(lis1,lis2,lis3,lis4,lis5,'o-')
fig.autofmt_xdate()
plt.show()
Error
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/dist-packages/spyderlib/widge/externalshell/sitecustomize.py", line 540, in runfile execfile(filename, namespace)
File "/home/omar/python/plot_txt.py", line 37, in <module>
fig.autofmt_xdate()
File "/usr/lib/pymodules/python2.7/matplotlib/figure.py", line 431, in autofmt_xdate
for label in self.axes[0].get_xticklabels():
File "/usr/lib/pymodules/python2.7/matplotlib/axes.py", line 2614, in get_xticklabels
self.xaxis.get_ticklabels(minor=minor))
File "/usr/lib/pymodules/python2.7/matplotlib/axis.py", line 1161, in get_ticklabels
return self.get_majorticklabels()
File "/usr/lib/pymodules/python2.7/matplotlib/axis.py", line 1145, in get_majorticklabels
ticks = self.get_major_ticks()
File "/usr/lib/pymodules/python2.7/matplotlib/axis.py", line 1244, in get_major_ticks
numticks = len(self.get_major_locator()())
File "/usr/lib/pymodules/python2.7/matplotlib/dates.py", line 802, in __call__
self.refresh()
File "/usr/lib/pymodules/python2.7/matplotlib/dates.py", line 819, in refresh
dmin, dmax = self.viewlim_to_dt()
File "/usr/lib/pymodules/python2.7/matplotlib/dates.py", line 564, in viewlim_to_dt
return num2date(vmin, self.tz), num2date(vmax, self.tz)
File "/usr/lib/pymodules/python2.7/matplotlib/dates.py", line 311, in num2date
return _from_ordinalf(x, tz)
File "/usr/lib/pymodules/python2.7/matplotlib/dates.py", line 214, in _from_ordinalf
dt = datetime.datetime.fromordinal(ix)
ValueError: ordinal must be >= 1
>>> Traceback (most recent call last):
File "/usr/lib/pymodules/python2.7/matplotlib/backends/backend_qt4.py", line 374, in idle_draw
self.draw()
File "/usr/lib/pymodules/python2.7/matplotlib/backends/backend_qt4agg.py", line 154, in draw
FigureCanvasAgg.draw(self)
File "/usr/lib/pymodules/python2.7/matplotlib/backends/backend_agg.py", line 451, in draw
self.figure.draw(self.renderer)
File "/usr/lib/pymodules/python2.7/matplotlib/artist.py", line 55, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "/usr/lib/pymodules/python2.7/matplotlib/figure.py", line 1034, in draw
func(*args)
File "/usr/lib/pymodules/python2.7/matplotlib/artist.py", line 55, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "/usr/lib/pymodules/python2.7/matplotlib/axes.py", line 2086, in draw
a.draw(renderer)
File "/usr/lib/pymodules/python2.7/matplotlib/artist.py", line 55, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "/usr/lib/pymodules/python2.7/matplotlib/axis.py", line 1091, in draw ticks_to_draw = self._update_ticks(renderer)
File "/usr/lib/pymodules/python2.7/matplotlib/axis.py", line 945, in _update_ticks
tick_tups = [t for t in self.iter_ticks()]
File "/usr/lib/pymodules/python2.7/matplotlib/axis.py", line 889, in iter_ticks
majorLocs = self.major.locator()
File "/usr/lib/pymodules/python2.7/matplotlib/dates.py", line 802, in __call__
self.refresh()
File "/usr/lib/pymodules/python2.7/matplotlib/dates.py", line 819, in refresh
dmin, dmax = self.viewlim_to_dt()
File "/usr/lib/pymodules/python2.7/matplotlib/dates.py", line 564, in viewlim_to_dt
return num2date(vmin, self.tz), num2date(vmax, self.tz)
File "/usr/lib/pymodules/python2.7/matplotlib/dates.py", line 311, in num2date
return _from_ordinalf(x, tz)
File "/usr/lib/pymodules/python2.7/matplotlib/dates.py", line 214, in _from_ordinalf
dt = datetime.datetime.fromordinal(ix)
ValueError: ordinal must be >= 1
The error is in your plot command.
You need to repeat what is on the x axis for each field you plot on the y axis.
So change this:
ax.plot(lis1,lis2,lis3,lis4,lis5,'o-')
to this:
ax.plot(lis1,lis2,lis1,lis3,lis1,lis4,lis1,lis5,'o-')
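Try:
import csv
import datetime as dt
import matplotlib.pyplot as plt

lis1 = []  # dates (x axis)
lis2 = []
lis3 = []
lis4 = []
lis5 = []
# 'rU' is the Python 2 universal-newline mode; on Python 3 use open(path, newline='')
with open('/home/omar/Desktop/finall.csv', 'rU') as f:
    reader = csv.reader(f, delimiter=',')
    for row in reader:
        lis1.append(dt.datetime.strptime(row[0], '%Y-%m-%d'))
        # csv returns strings, so convert the values to numbers before plotting
        lis2.append(float(row[1]))
        lis3.append(float(row[2]))
        lis4.append(float(row[3]))
        lis5.append(float(row[4]))
fig = plt.figure()
ax = fig.add_subplot(111)
# plot takes repeated x, y, fmt groups; plot_date accepts only one x, y pair per call
ax.plot(lis1, lis2, 'o-',
        lis1, lis3, 'o-',
        lis1, lis4, 'o-',
        lis1, lis5, 'o-')
plt.show()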
See the documentation here: http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.plot
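An equivalent way to keep the x data explicit is one ax.plot call per series. This is only a rough sketch, assuming Python 3 and a made-up in-memory sample in place of finall.csv. The names SAMPLE_CSV, read_columns and plot_columns are hypothetical and used only for illustration:

```python
import csv
import datetime as dt
import io

# Hypothetical sample standing in for finall.csv: a date, then four numeric columns.
SAMPLE_CSV = """2013-01-01,1,10,100,1000
2013-01-02,2,20,200,2000
2013-01-03,3,30,300,3000
"""


def read_columns(fileobj):
    """Return (dates, series), where series is a list of four lists of floats."""
    dates = []
    series = [[], [], [], []]
    for row in csv.reader(fileobj, delimiter=','):
        dates.append(dt.datetime.strptime(row[0], '%Y-%m-%d'))
        for column, value in zip(series, row[1:5]):
            column.append(float(value))  # csv yields strings; convert to numbers
    return dates, series


def plot_columns(dates, series):
    """Draw every series against the same dates and return the figure."""
    # Imported here so read_columns() stays usable without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for values in series:
        ax.plot(dates, values, 'o-')  # the x data is passed with every series
    fig.autofmt_xdate()  # rotate the date tick labels so they do not overlap
    return fig


dates, series = read_columns(io.StringIO(SAMPLE_CSV))
```

Calling fig = plot_columns(dates, series) and then fig.savefig('out.png') (or plt.show() in an interactive session) should draw four lines that share the same date axis.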

Inexplicable bug in matplotlib and numpy : "IndexError: string index out of range"

I am currently dealing with an inexplicable bug in python+matplotlib+numpy. I'm running Ubuntu 13.04 x64 in a VirtualBox VM hosted by Windows 7 Pro x64, and I use Python 2.7 (installed from the Ubuntu repositories).
Consider the following script, which plots y as a function of x from a two-column file named data.txt:
#!/usr/bin/env python
import numpy as np
import matplotlib as mpl
mpl.rcParams['text.usetex']=True
import matplotlib.pyplot as plt
x, y = np.loadtxt("data.txt", unpack=True)
plt.plot(x, y)
plt.show()
And consider the following data.txt:
-49.5938 2.0
-40.6585 0.156267
-39.7351 0.0616373
-39.1767 0.00565121
-38.7736 -0.0346863
-38.4573 -0.0664897
-38.1965 -0.0929013
-37.9744 -0.115588
-37.7809 -0.135542
-37.6093 -0.153403
-37.4551 -0.169605
-37.3151 -0.18446
-37.1868 -0.198198
-37.0684 -0.210994
-36.9584 -0.222983
-36.8722 -0.232458
-36.8228 -0.237918
-36.7749 -0.243232
-36.7284 -0.248409
-36.6833 -0.253457
-36.6395 -0.258382
-36.5968 -0.263192
-36.5553 -0.267892
-36.5149 -0.272489
-36.4754 -0.276987
-36.437 -0.281392
-36.3994 -0.285707
-36.3628 -0.289937
-36.3269 -0.294086
-36.2919 -0.298157
-36.2577 -0.302154
-36.2241 -0.30608
-36.1913 -0.309937
-36.1591 -0.313729
-36.1276 -0.317458
-36.0966 -0.321126
-36.0663 -0.324736
-36.0366 -0.32829
-36.0074 -0.33179
-35.9787 -0.335238
-35.9505 -0.338635
-35.9228 -0.341984
-35.8956 -0.345286
-35.8688 -0.348542
-35.8425 -0.351755
-35.8166 -0.354925
-35.7912 -0.358053
-35.7661 -0.361142
-35.7414 -0.364192
-35.7171 -0.367205
-35.6931 -0.37018
-35.6695 -0.373121
-35.6463 -0.376027
-35.6234 -0.378899
-35.6008 -0.381739
-35.5785 -0.384547
-35.5565 -0.387324
-35.5348 -0.390071
-35.5134 -0.392789
-35.4923 -0.395479
-35.4714 -0.39814
-35.4508 -0.400775
-35.4305 -0.403383
-35.4104 -0.405965
-35.3906 -0.408522
-35.371 -0.411054
-35.3517 -0.413563
-35.3326 -0.416047
-35.3137 -0.418509
-35.295 -0.420948
-35.2765 -0.423366
-35.2582 -0.425762
-35.2402 -0.428137
-35.2223 -0.430491
-35.2047 -0.432825
-35.1872 -0.43514
-35.1699 -0.437435
-35.1528 -0.439711
-35.1358 -0.44197
-35.119 -0.444212
-35.1024 -0.44644
-35.0859 -0.448658
-35.0694 -0.450878
-35.0529 -0.453105
-35.0363 -0.455341
-35.0195 -0.457613
-35.0024 -0.459933
-34.9849 -0.462312
-34.9669 -0.464766
-34.9483 -0.467309
-34.9292 -0.469936
-34.9092 -0.472681
-34.8886 -0.475519
-34.8672 -0.47848
-34.8449 -0.481569
-34.8219 -0.484767
-34.7982 -0.48807
-34.7739 -0.491477
-34.7491 -0.494958
-34.7241 -0.49848
-34.6992 -0.502011
-34.6742 -0.505551
-34.6495 -0.509079
-34.625 -0.512581
-34.6007 -0.516063
-34.5767 -0.519519
-34.553 -0.522941
-34.5296 -0.526332
-34.5065 -0.529692
-34.4837 -0.533017
-34.4612 -0.536311
-34.439 -0.539573
-34.4171 -0.542805
-34.3954 -0.546006
-34.374 -0.549176
-34.3529 -0.552316
-34.3321 -0.555427
-34.3115 -0.55851
-34.2912 -0.561564
-34.2711 -0.564591
-34.2512 -0.567592
-34.2316 -0.570566
-34.2121 -0.573515
-34.193 -0.576439
-34.174 -0.579339
-34.1552 -0.582214
-34.1367 -0.585066
-34.1183 -0.587895
-34.1002 -0.590701
-34.0822 -0.593485
-34.0644 -0.596248
-34.0468 -0.598989
-34.0294 -0.601709
-34.0122 -0.604409
-33.9951 -0.607089
-33.9783 -0.609748
-33.9615 -0.612389
-33.945 -0.615011
-33.9286 -0.617614
-33.9123 -0.620199
-33.8962 -0.622766
-33.8803 -0.625315
-33.8645 -0.627847
-33.8489 -0.630361
-33.8334 -0.632859
-33.818 -0.635341
-33.8028 -0.637806
-33.7877 -0.640256
-33.7727 -0.64269
-33.7579 -0.645108
-33.7432 -0.647511
-33.7286 -0.6499
-33.7141 -0.652274
-33.6998 -0.654634
-33.6856 -0.65698
-33.6714 -0.659315
-33.6574 -0.661635
-33.6435 -0.663947
-33.6297 -0.666249
-33.6159 -0.668547
-33.6022 -0.670835
-33.5885 -0.673127
-33.5749 -0.67542
-33.5612 -0.677726
-33.5475 -0.680025
-33.5338 -0.682341
-33.5201 -0.684666
-33.5062 -0.687017
-33.4923 -0.689387
-33.4782 -0.691796
-33.4639 -0.694247
-33.4494 -0.696723
-33.4347 -0.69924
-33.4199 -0.7018
-33.4047 -0.704414
-33.3893 -0.707087
-33.3737 -0.709791
-33.3578 -0.712549
-33.3418 -0.71534
-33.3255 -0.718181
-33.3091 -0.721056
-33.2924 -0.723999
-33.2755 -0.726975
-33.2585 -0.729988
-33.2412 -0.733045
-33.2237 -0.736152
-33.2061 -0.739295
-33.1884 -0.742461
-33.1706 -0.745656
-33.1525 -0.748904
-33.1344 -0.752172
-33.1162 -0.755459
-33.098 -0.758756
-33.0798 -0.762079
-33.0616 -0.765406
-33.0434 -0.768737
-33.0252 -0.772066
-33.0072 -0.775386
-32.9892 -0.778712
-32.9713 -0.78203
-32.9535 -0.785343
-32.9358 -0.788648
-32.9182 -0.791942
-32.9007 -0.795223
-32.8833 -0.798485
-32.8661 -0.801738
-32.8489 -0.804987
-32.8319 -0.808216
-32.8151 -0.811425
-32.7984 -0.814617
-32.7817 -0.817795
-32.7653 -0.82096
-32.7489 -0.824109
-32.7327 -0.82724
-32.7166 -0.830353
-32.7007 -0.833449
-32.6848 -0.83653
-32.6691 -0.839596
-32.6536 -0.842642
-32.6381 -0.845671
-32.6228 -0.848682
-32.6076 -0.85168
-32.5925 -0.854664
-32.5776 -0.857631
-32.5627 -0.860578
-32.548 -0.863512
-32.5334 -0.866433
-32.5189 -0.869332
-32.5046 -0.872219
-32.4903 -0.87509
-32.4761 -0.877949
-32.4621 -0.880793
-32.4481 -0.883619
-32.4343 -0.886435
-32.4205 -0.889237
-32.4069 -0.892021
-32.3934 -0.894796
-32.3799 -0.897551
-32.3666 -0.900292
-32.3534 -0.903023
-32.3402 -0.905737
-32.3272 -0.908445
-32.3142 -0.911134
-32.3013 -0.913813
-32.2885 -0.916482
-32.2759 -0.919132
-32.2633 -0.921769
-32.2507 -0.924401
-32.2383 -0.92702
-32.2259 -0.929628
-32.2136 -0.932223
-32.2015 -0.934805
-32.1893 -0.937376
-32.1773 -0.939932
-32.1654 -0.942481
-32.1535 -0.94502
-32.1417 -0.947543
-32.13 -0.950056
-32.1183 -0.95256
-32.1068 -0.955052
-32.0952 -0.957536
-32.0838 -0.960005
-32.0725 -0.962469
-32.0612 -0.964924
-32.0499 -0.967368
-32.0387 -0.969807
-32.0276 -0.972236
-32.0165 -0.974663
-32.0055 -0.977076
-31.9946 -0.979485
When I execute the script, I get the following error:
Exception in Tkinter callback
Traceback (most recent call last):
File "/usr/lib/python2.7/lib-tk/Tkinter.py", line 1473, in __call__
return self.func(*args)
File "/usr/lib/pymodules/python2.7/matplotlib/backends/backend_tkagg.py", line 276, in resize
self.show()
File "/usr/lib/pymodules/python2.7/matplotlib/backends/backend_tkagg.py", line 348, in draw
FigureCanvasAgg.draw(self)
File "/usr/lib/pymodules/python2.7/matplotlib/backends/backend_agg.py", line 440, in draw
self.figure.draw(self.renderer)
File "/usr/lib/pymodules/python2.7/matplotlib/artist.py", line 54, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "/usr/lib/pymodules/python2.7/matplotlib/figure.py", line 1006, in draw
func(*args)
File "/usr/lib/pymodules/python2.7/matplotlib/artist.py", line 54, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "/usr/lib/pymodules/python2.7/matplotlib/axes.py", line 2086, in draw
a.draw(renderer)
File "/usr/lib/pymodules/python2.7/matplotlib/artist.py", line 54, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "/usr/lib/pymodules/python2.7/matplotlib/axis.py", line 1052, in draw
renderer)
File "/usr/lib/pymodules/python2.7/matplotlib/axis.py", line 1001, in _get_tick_bboxes
extent = tick.label1.get_window_extent(renderer)
File "/usr/lib/pymodules/python2.7/matplotlib/text.py", line 752, in get_window_extent
bbox, info = self._get_layout(self._renderer)
File "/usr/lib/pymodules/python2.7/matplotlib/text.py", line 313, in _get_layout
ismath=ismath)
File "/usr/lib/pymodules/python2.7/matplotlib/backends/backend_agg.py", line 200, in get_text_width_height_descent
renderer=self)
File "/usr/lib/pymodules/python2.7/matplotlib/texmanager.py", line 606, in get_text_width_height_descent
page = next(iter(dvi))
File "/usr/lib/pymodules/python2.7/matplotlib/dviread.py", line 71, in __iter__
have_page = self._read()
File "/usr/lib/pymodules/python2.7/matplotlib/dviread.py", line 126, in _read
byte = ord(self.file.read(1)[0])
IndexError: string index out of range
The interesting thing is that:
If I replace the last two values in the file with -32.0 -1.0, it works.
If I remove mpl.rcParams['text.usetex']=True, it also works.
I would be very grateful if anyone has an explanation for this very bizarre problem.
