I have a DataFrame called abc whose rows are times (y direction) and whose columns are dates (x direction); the cells hold the values.
I need to create a graph by selecting a certain time: the graph should plot the dates against the values in that row of the DataFrame. Attached is a photo of the data.
example of dataframe
So I need to be able to type in, say, 9am and get a time series graph with Jun, Jul, Aug, Sept, etc. on the x-axis and the values (1, 2, 6, 8, 9, etc.) on the y-axis.
A different graph should then appear if I change the time to, say, 11am.
This is very basic pandas functionality:
You can select along either axis with the .loc indexer (the older .ix indexer was deprecated in pandas 0.20 and removed in 1.0):
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(0, 10, (5, 4)), index=['9am', '10am', '11am', '12pm', '1pm'], columns=['jun', 'jul', 'aug', 'sept'])
# select the '9am' row and all (:) of the months
df.loc['9am', :].plot()
# select one month, rows '9am' through '11am' (label slices include both ends)
df.loc['9am':'11am', 'jun'].plot()
Calling plot() on the selection draws the desired graph.
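To get a different graph just by changing the time, the row selection can be wrapped in a small helper. The following is a minimal sketch, not the only way to do it: the function name plot_time and the random stand-in data are assumptions for illustration, and only the frame name abc comes from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the frame: times as rows, months as columns.
abc = pd.DataFrame(
    np.random.randint(0, 10, (5, 4)),
    index=["9am", "10am", "11am", "12pm", "1pm"],
    columns=["jun", "jul", "aug", "sept"],
)


def plot_time(frame, time_label):
    """Plot one row (a time of day) against the column labels (dates)."""
    if time_label not in frame.index:
        raise KeyError(f"{time_label!r} is not a row label in the frame")
    row = frame.loc[time_label]  # Series indexed by the column labels
    fig, ax = plt.subplots()     # new figure per call, so graphs do not overlap
    row.plot(ax=ax, marker="o", title=f"Values at {time_label}")
    ax.set_xlabel("date")
    ax.set_ylabel("value")
    return ax


ax = plot_time(abc, "9am")
```

Calling plot_time(abc, "11am") afterwards would produce a separate figure for the 11am row.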
Related
I'm trying to plot a bar chart of some de-identified transactional banking data using the pandas and matplotlib libraries.
The data looks like this:
The column named "day" stores the numbers of the days on which the transaction was made, the column named "tr_type" stores the numbers of transactions made on each day, and the column named "average_income" stores the average amount of incomes for each of the different types of transactions.
The task is to display, on one graph, the data from all three columns for the rows with the largest average income.
For definiteness, I took the top 5 rows of sorted data.
```
slised_two = sliced_df_new.sort_values('average_income', ascending=False).head(5)
slised_two = slised_two.set_index('day')
```
For convenience in further plotting, I set a column called "day" as an index. I get this:
Based on this data, I tried to build one graph, but, unfortunately, I did not achieve the result I wanted, because I had to build 2 graphs for normal data display.
```
axes = slised_two.plot.bar(rot=0, subplots=True)
axes[1].legend(loc=2)
```
So the question is: is it possible to build a bar chart in which the days are shown on the x-axis, the average income is shown on the y-axis, and the transaction number is written above each bar?
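One possible approach is to draw the bars with matplotlib directly and annotate each bar with its tr_type value. The sketch below assumes a frame shaped like the one described (indexed by day, with tr_type and average_income columns); the numbers and the name top5 are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical values standing in for the top-5 frame indexed by "day".
top5 = pd.DataFrame(
    {
        "tr_type": [7, 3, 1, 4, 2],
        "average_income": [910.0, 850.5, 790.0, 640.2, 600.0],
    },
    index=pd.Index([12, 45, 3, 88, 27], name="day"),
)

fig, ax = plt.subplots()
# Days are passed as strings so they are treated as categories, not numbers.
bars = ax.bar(top5.index.astype(str).tolist(), top5["average_income"])

# Write the transaction number just above the top of each bar.
for bar, tr in zip(bars, top5["tr_type"]):
    ax.annotate(
        f"tr_type {tr}",
        xy=(bar.get_x() + bar.get_width() / 2, bar.get_height()),
        xytext=(0, 3),
        textcoords="offset points",
        ha="center",
        va="bottom",
    )

ax.set_xlabel("day")
ax.set_ylabel("average_income")
```

This keeps everything on a single set of axes, so no subplots are needed.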
I'm trying to join the points in a figure with at least 70 subplots. It is not a scatter plot (I can't use one because the columns are not series). I tried marker = 'o-', but that doesn't work. The dates are in the format %mm-%yy, there are at least 6 different months (as a date column), and not every column (fund name) has data for every month. However, I want to join all the points from the same column even where dates are skipped.
I'm trying this, but it only joins points that fall in consecutive months.
df.plot(subplots = True, figsize = (20,20), layout = (14,5),legend=False,marker='o')
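Matplotlib breaks a line wherever it meets a NaN, which is why only consecutive months get joined. One way around this, sketched below under assumed data, is to drop the missing values column by column before plotting each subplot. The fund names, values, and the 1x2 layout are hypothetical; a real figure would use the 14x5 layout from the call above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data: monthly dates as the index, one column per fund, with gaps.
dates = pd.to_datetime(
    ["01-21", "02-21", "03-21", "04-21", "05-21", "06-21"], format="%m-%y"
)
df = pd.DataFrame(
    {
        "fund_a": [1.0, np.nan, 3.0, np.nan, np.nan, 6.0],
        "fund_b": [2.0, 2.5, np.nan, 4.0, 4.5, np.nan],
    },
    index=dates,
)

fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 4), sharex=True)
for ax, col in zip(axes.ravel(), df.columns):
    series = df[col].dropna()  # keep only the months that have data
    # With the NaNs removed, the line connects every remaining point.
    ax.plot(series.index, series.values, marker="o", linestyle="-")
    ax.set_title(col)
```

Interpolating the gaps would also join the points, but it would draw markers at months that have no data, so dropping the NaNs is the more literal reading of the question.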
I need to create a chart like this:
X Axis: Dates. I can present it either as a categorical axis (ordered somehow, even by another field) or as a continuous axis.
Y Axis: Time of day (from 00:00 to 23:59)
The bars represent occurrences of events.
Emphasis points:
I might have some parallel occurrences (as you can see on the last date visualized in the graph). I can separate them with a field.
An event might start at any time on any day and continue into the next day (as you can see on the second and third days above). I want such events to be visualized as in the chart above. (Solutions based on splitting them into 2 different events are not relevant here because of the final target: clustering.)
Data can be arranged in any order either in Python or Tableau.
I am new to Python and pandas.
I have a dataset loaded into Python as a DataFrame. The index of the DataFrame consists of timestamps in the format "2018-01-01 00:00:00". My dataset ranges from "2018-01-01 00:00:00" to "2018-12-31 23:59:59". The data column is named "X".
I can plot the entire dataset using matplotlib:
plt.plot(data.index, data["X"])
However, I want to plot different segments of the time series: 1 month, 6 months, 2 days, 3 seconds, etc.
What is the best way to do this?
Thanks
If you want to plot a month you could do
data.loc['2018-02',"X"].plot()
For 6 months (the slice includes both endpoints, so this covers February through July):
data.loc['2018-02':'2018-07',"X"].plot()
The same logic applies to other ranges.
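As a rough illustration of how the same slicing extends to shorter ranges such as 2 days or 3 seconds, here is a small sketch. It assumes the index is already a DatetimeIndex and uses a made-up three-day stand-in for the data rather than the full year; the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in: three days of one-second samples in a column "X".
index = pd.date_range(
    "2018-01-01 00:00:00", "2018-01-03 23:59:59", freq=pd.Timedelta(seconds=1)
)
data = pd.DataFrame({"X": np.arange(len(index))}, index=index)

# Two days: a partial-string slice includes every timestamp on both end dates.
two_days = data.loc["2018-01-01":"2018-01-02", "X"]

# Three seconds: full timestamps work the same way, endpoints included.
three_seconds = data.loc["2018-01-01 00:00:00":"2018-01-01 00:00:02", "X"]

# One hour: a single partial string selects everything within that hour.
one_hour = data.loc["2018-01-02 09", "X"]
```

Each result is a Series, so calling .plot() on it draws just that segment.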
You might need to do one more processing step on your index to ensure you're dealing with datetime objects rather than strings.
import pandas

# rebuild the index as real datetime objects, then plot
new_data = (
    data
    .assign(datetime=lambda df: pandas.to_datetime(df.index))
    .set_index('datetime')
)
new_data["X"].plot()
This should get us really close to what you want, but I haven't tested it on data with your date format.
I have a pandas dataframe that I want to use in a vincent visualization. I can visualize the data, however, the X axis should be displayed as dates and instead the dates are just given an integer index of 500, 1000, 1500, etc.
The dataframe looks like this:
       weight        date
0  125.200000  2013-11-18
Truncated for brevity.
My vincent code in my ipython notebook:
chart = vincent.Line(df[['weight']])
chart.legend(title='weight')
chart.axis_titles(x='Date', y='Weight')
chart.display()
How can I tell vincent that my dataframe contains dates such that the X axis labels are just like the dataframe's dates above, i.e. 2013-11-18?
OK, so here's what I did. I ran into this problem before with matplotlib, and it was so painful that I wrote a blog post about it (http://codrspace.com/szeitlin/biking-data-from-xml-to-plots-part-2/). Vincent is not exactly the same, but essentially you have to do 4 steps:
1. Convert your dates to datetime objects, if you haven't already:
df['date_objs'] = df['date'].apply(pandas.to_datetime)
2. Convert your datetime objects to whatever format you want.
3. Make your datetime objects into your index:
df.index = df['date_objs'].values.astype('M8[D]')
4. Tell vincent you want to plot your data (weight) as the y-axis. It will automatically use the index of your dataframe as the x-axis:
chart = vincent.Line(df[['weight']])
chart.axis_titles(x='dates', y='weight')
chart.display()
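The pandas side of this (getting the dates into the index) can be checked on its own before handing the frame to vincent. The sketch below is pandas-only and makes no claims about vincent's behaviour. The first row is the one shown in the question; the other two rows are made up for illustration.

```python
import pandas as pd

# First row is from the question; the remaining rows are hypothetical.
df = pd.DataFrame(
    {
        "weight": [125.2, 124.8, 124.5],
        "date": ["2013-11-18", "2013-11-19", "2013-11-20"],
    }
)

# Parse the strings into datetimes, then promote that column to the index.
df["date"] = pd.to_datetime(df["date"])
df = df.set_index("date")

# The index should now be a DatetimeIndex rather than the default 0, 1, 2, ...
is_datetime_index = isinstance(df.index, pd.DatetimeIndex)
```

After this step, df[['weight']] carries the dates in its index, which is what the charting call is expected to pick up for the x-axis.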