Appending data sets to an OpenPyxl Chart using a For-Loop - python

In Python, I have the ability to add series data to my chart object to plot as a line graph.
I'm using the following lines:
overall_stats_sheet2 = current_book.worksheets[0]
overall_chart_sheet = current_book.worksheets[1]
chart_object = charts.LineChart()
for x in top_down_reference_points[0]:
    chart_object.append(charts.Series(charts.Reference(overall_stats_sheet, (x,1), (x, overall_stats_sheet2.get_highest_column()+1)), title = 'Erasure Decodes'))
chart_object.drawing.top = 0
chart_object.drawing.left = 400
chart_object.drawing.width = 650
chart_object.drawing.height = 400
overall_chart_sheet.add_chart(chart_object)
top_down_reference_points[0] contains all of the row numbers that erasure decode exists on. In the example picture, the numbers are row 19 and row 39.
My for loop code currently iterates through those and appends them to the graph, but it creates a new legend label and line for each erasure-decode set. I want to combine all that data from the sheet and graph one line associated with all the erasure decode data. Is this possible?

It's not entirely clear from your code which cells you want in your chart and how. It may be as simple as creating a single series that refers to multiple cells. At the moment you're creating multiple series which is why you're seeing multiple items in the legend.
BTW, I strongly recommend you start using the 2.3 beta of openpyxl, which has much better chart support.
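One way to get a single line is sketched below, using the openpyxl 2.3+ chart API (`openpyxl.chart`) rather than the older `charts` module from the question. A `Reference` covers one contiguous range, so the sketch copies the scattered rows end to end into a helper column and builds one series from that. The sheet names, cell values and helper layout are made up for illustration; only rows 19 and 39 come from the question.

```python
from openpyxl import Workbook
from openpyxl.chart import LineChart, Reference, Series

wb = Workbook()
stats_sheet = wb.active
chart_sheet = wb.create_sheet("Charts")
helper_sheet = wb.create_sheet("ChartData")  # hypothetical scratch sheet

# Made-up data standing in for the two erasure-decode rows (19 and 39).
sample_rows = {19: [3, 5, 2, 8], 39: [7, 1, 4, 6]}
for row, values in sample_rows.items():
    for col, value in enumerate(values, start=1):
        stats_sheet.cell(row=row, column=col, value=value)

# Collect the values of every matching row, in row order.
combined = []
for row in sorted(sample_rows):
    for col in range(1, stats_sheet.max_column + 1):
        value = stats_sheet.cell(row=row, column=col).value
        if value is not None:
            combined.append(value)

# Write them as one contiguous column so a single Reference can cover them.
for i, value in enumerate(combined, start=1):
    helper_sheet.cell(row=i, column=1, value=value)

values_ref = Reference(helper_sheet, min_col=1, min_row=1, max_row=len(combined))
series = Series(values_ref, title="Erasure Decodes")

chart = LineChart()
chart.series.append(series)  # one series -> one line, one legend entry
chart_sheet.add_chart(chart, "A1")
```

Calling `wb.save(...)` afterwards would write the workbook; it is left out here so the sketch does not touch the file system.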

Related

PPTX Python - How to fix ValueError: chart data contains no categories for a LineChart?

I'm trying to replace data in an existing line chart in Python PPTX. Here's the code I'm using:
for chart in charts:
    chart_data = CategoryChartData()
    chart_index = list(charts).index(chart)
    scenario_no = chart_index + 1
    sc_df = wrk_df[wrk_df['Scenario No'] == scenario_no]
    for category in sc_df['categories'].tolist():
        chart_data.add_category(category)
    chart_data.add_series('Volume', sc_df['Series 1'].tolist())
    chart_data.add_series('Value', sc_df['Series 2'].tolist())
    chart.replace_data(chart_data)
Basically there are several charts on the slide, through which the code iterates and replaces the data. The charts themselves have a numeric x axis and two series.
When I run this code I get the following error:
ValueError: chart data contains no categories
I've already tried converting the new categories into string, however, it doesn't work with any data type.
I'm also able to print original category labels in the existing chart, which means it does have categories.
I can't think of what's going wrong here. Does anyone have a solution for this, or at least know why this is happening?
It turned out that the data being passed as categories was empty.
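A cheap way to surface that kind of problem early is to check the filtered frame before building the chart data. The sketch below is only an illustration: the helper name `scenario_categories` and the sample frame are made up, and the column names are taken from the question.

```python
import pandas as pd

def scenario_categories(wrk_df, scenario_no):
    """Return the category labels for one scenario, failing early if none match."""
    sc_df = wrk_df[wrk_df["Scenario No"] == scenario_no]
    categories = sc_df["categories"].tolist()
    if not categories:
        # An empty list here is what later makes replace_data() complain.
        raise ValueError(f"no rows found for scenario {scenario_no}")
    return categories

# Made-up frame; note that scenario 2 has no rows at all.
wrk_df = pd.DataFrame({
    "Scenario No": [1, 1, 3],
    "categories": [2019, 2020, 2019],
    "Series 1": [10, 12, 7],
    "Series 2": [100, 130, 80],
})
```

With this in place, `scenario_categories(wrk_df, 2)` raises immediately and names the scenario, instead of the error appearing later inside `replace_data`.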

Conditional Pandas combined with matplotlib

I have quite an interesting problem. I have a data frame containing data from a car, where each record is triggered by an event in the GPS. The data also contains the GPS coordinates. I have made a polygon in Matplotlib that I want to use to limit my GPS data. In other words, I am trying to filter out data that was recorded inside the polygon, which was created from GPS points. I have tried solving this by first adding a new column using a built-in condition in matplotlib (poly_path.contains_point(point)).
My GPS column looks like this:
GPS DATA:
GPS
(57.723124, 11.923557)
(57.724115, 11.933557)
(57.723124, 11.923557)
...
And I would like to add a new column using this condition.
GPS DATA:
GPS Is inside Polygon
(57.723124, 11.923557) True
(57.724115, 11.933557) False
(57.723124, 11.923557) True
...
And I have tried solving this problem by adding this line:
df1filt["Is inside Polygon"] = poly_path.contains_point(df1filt['GPS'])
However, this doesn't work.
Any ideas?
Thank you in advance!
Try:
df1filt["Is inside Polygon"] = df1filt['GPS'].apply(poly_path.contains_point)
Edit:
If the data type of the column is string, you need to create a cleaning function, apply it, then try my first solution.
E.g.
def clean_gps_col(text):
    text = text[1:-1]
    text = text.split(',')
    return (float(text[0]), float(text[1]))
df1filt['GPS_cleaned'] = df1filt['GPS'].apply(clean_gps_col)
Now try the first solution:
df1filt["Is inside Polygon"] = df1filt['GPS_cleaned'].apply(poly_path.contains_point)
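For reference, here is a small self-contained sketch of the same idea. The square polygon and the three points are made up; `poly_path` and `df1filt` mirror the names in the question. It also shows `Path.contains_points`, which tests all points in one call and may be faster on large frames.

```python
import numpy as np
import pandas as pd
from matplotlib.path import Path

# Made-up square polygon and points, standing in for the real GPS data.
poly_path = Path([(0, 0), (0, 10), (10, 10), (10, 0)])
df1filt = pd.DataFrame({"GPS": [(5.0, 5.0), (15.0, 5.0), (1.0, 9.0)]})

# Row by row: contains_point takes a single (x, y) pair.
df1filt["Is inside Polygon"] = df1filt["GPS"].apply(poly_path.contains_point)

# All at once: contains_points takes an (N, 2) array and returns a boolean array.
points = np.array(df1filt["GPS"].tolist())
inside = poly_path.contains_points(points)

# Keep only the rows recorded inside the polygon.
inside_only = df1filt[inside]
```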

Reformatting y axis values in a multi-line plot in Python

Updated with more info
I've seen this answered on here for single line plots, but I need help with a plot showing two variables, if that matters at all... I am fairly new to python in general. My line graph shows two different departments' funding over the years. I just want to reformat the y axis to display as a number in the hundreds of millions.
Using a csv for the general public funding report of Minneapolis.
msp_df = pd.read_csv('Minneapolis_Data_Snapshot_v2.csv',error_bad_lines=False)
msp_df.info()
Saved just the two depts I was interested in, to a dataframe.
CPED_df = (msp_df['Unnamed: 0'] == 'CPED')
msp_df.iloc[CPED_df.values]
police_df = (msp_df['Unnamed: 0'] == 'Police')
msp_df.iloc[police_df.values]
("test" is the new name of my data frame containing all the info as seen below.)
test = pd.DataFrame({'Year': range(2014,2021),
'CPED': msp_df.iloc[CPED_df.values].T.reset_index(drop=True).drop(0,0)[5].tolist(),
'Police': msp_df.iloc[police_df.values].T.reset_index(drop=True).drop(0,0)[4].tolist()})
(The numbers from the original dataset were being read as strings because of the commas, so I had to fix that first.)
test['Police2'] = test['Police'].str.replace(',','').astype(int)
test['CPED2'] = test['CPED'].str.replace(',','').astype(int)
And here is my code for the plot. It executes; I just want to reformat the y axis number scale. Right now it just shows up as a decimal. (I've already imported pandas, seaborn and matplotlib.)
plt.plot(test.Year, test.Police2, test.Year, test.CPED2)
plt.ylabel('Budget in Hundreds of Millions')
plt.xlabel('Year')
Current plot
Any help super appreciated! Thanks :)
The easiest way to reformat the y axis and force it to show certain values is to use
plt.yticks(ticks, labels)
For example, if you only want to display values from 0 to 1, you can do:
plt.yticks([0,0.2,0.5,0.7,1], ['a', 'b', 'c', 'd', 'e'])
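If the goal is to keep matplotlib's automatic tick positions and only change how the numbers are written, a tick formatter is another option. This is a sketch, not the answer's method: it uses `matplotlib.ticker.FuncFormatter`, the budget figures are made up, and `format_millions` is a hypothetical helper name.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

# Made-up budget figures in dollars, one per year.
years = list(range(2014, 2021))
police = [145e6, 150e6, 157e6, 163e6, 179e6, 184e6, 193e6]
cped = [90e6, 95e6, 98e6, 104e6, 110e6, 115e6, 120e6]

def format_millions(value, pos=None):
    """Render a raw dollar amount as whole millions, e.g. 150000000 as '150M'."""
    return f"{value / 1e6:.0f}M"

fig, ax = plt.subplots()
ax.plot(years, police, label="Police")
ax.plot(years, cped, label="CPED")
ax.yaxis.set_major_formatter(FuncFormatter(format_millions))
ax.set_ylabel("Budget (millions of dollars)")
ax.set_xlabel("Year")
ax.legend()
```

The formatter only changes the tick labels; the plotted data stays in raw dollars.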

Pandas Dataframes merging thru iterations. How to avoid lists and rows of headers

I only code once in a while and I'm still very much a beginner. This might be a silly question, but I've been stuck on it for too long now.
Background
I have a function (get_profiles) that plots points every 5m along one transect line (100m long) and extracts elevation (from a geotiff).
The arguments are:
dsm (digital surface model)
transect_file (geopackage, holds many LineStrings with different transect_ID)
transect_id (int, extracted from transect_file)
step (int, number of meters to extract elevation along transect lines)
The output for one transect line is a dataframe like in the picture, which is what I expected, and I like it!
However, the big issue comes when I iterate the function over the transect_ids (transect_file has 10 Shapely LineStrings), like this:
tr_list = np.arange(1,transect_file.shape[0]-1)
geodb_transects= []
for i in tr_list:
    temp=get_profiles(dsm,transect_file,i,5)
    geodb_transects.append(temp)
I get a list. The error might be here, but I don't know another way to do it.
type(geodb_transects)
output:list
And, what's worse, I get headers (distance, z, tr_id, date) every time a new iteration starts.
How do I get a clean pandas dataframe, just like the output of one iteration (20 rows), but with all the tr_id chunks of 20 rows each stacked together and without the repeated headers?
If your output is a DataFrame then you’re simply looking to concatenate the incremental DataFrame into some growing DataFrame.
It's not the most efficient, but something like:
import pandas
df = pandas.DataFrame()
for i in range(7):
    df = pandas.concat([df, df_ret_func(i)])
You may also be interested in the from_records function if you have a list of elements that are all records of the same form and can be converted into the rows of a DataFrame.
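A common variant is to keep collecting the per-transect frames in a list, as the question already does, and concatenate once at the end. In this sketch `get_profiles_stub` is a hypothetical stand-in for `get_profiles`, returning a tiny frame with made-up values.

```python
import pandas as pd

def get_profiles_stub(tr_id, n_points=3):
    """Hypothetical stand-in for get_profiles: one small frame per transect."""
    return pd.DataFrame({
        "distance": [5 * k for k in range(n_points)],
        "z": [0.1 * k for k in range(n_points)],
        "tr_id": tr_id,
    })

# Build the list of per-transect frames, then stack them in one call.
frames = [get_profiles_stub(i) for i in range(1, 4)]
geodb_transects = pd.concat(frames, ignore_index=True)
```

`ignore_index=True` gives the result one continuous row index instead of repeating 0, 1, 2 for each chunk. The column headers appear only once because the result is a single DataFrame rather than a list of them.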

How to plot two y-axes in Excel 2003 with Python

I'm trying to plot two data series with different y-axes in the same plot in Excel 2003 using Python and win32com.client. I started with VBA to try to get the code I needed. Here's what it looks like so far:
chart = xlApp.Charts.Add()
# This part successfully creates the first series I want
series = chart.SeriesCollection(1)
series.XValues = xlSheet.Range("L13:L200")
series.Values = xlSheet.Range("M13:M200")
# This is what I added to try to plot the second series
series.AxisGroup = xlPrimary
series2 = chart.SeriesCollection(2)
series2.XValues = xlSheet.Range("L13:L200")
series2.Values = xlSheet.Range("N13:N200")
series2.AxisGroup = xlSecondary
# The rest is for formatting it the way I want, but it doesn't work now that I'm
# trying to plot the second series. (It stops working when I add the last five lines of code.)
chart.Legend.Delete() # Delete legend; MUST BE DONE BEFORE CHART IS MOVED
series.Name = file
chart.Location(2, xlSheet.Name) # Copy chart to active worksheet
chart = xlSheet.Shapes(1)
chart.Top = 51
chart.Left = 240
chart.Width = 500
chart.Height = 350
This plots the first series, but as noted in the comments, no longer adds the title, moves the chart, deletes the legend or resizes the chart. It does nothing with the second series. It is also not generating an error.
