Applying formatting to a Doughnut chart done with python pptx - python

I am building an automated presentation with python-pptx. I want to apply character formatting to the data labels using the function below, but it does not work. I need help figuring out why.
def apply_data_labels(self, chart):
    plot = chart.plots[0]
    plot.has_data_labels = True
    for series in plot.series:
        values = series.values
        counter = 0
        for point in series.points:
            data_label = point.data_label
            data_label.has_text_frame = True
            data_label.text_frame.text = str(values[counter])
            data_label.font.size = Pt(22)
            data_label.font.color.rgb = RGBColor(255,160,122)
            counter = counter + 1

Try:
data_label.text_frame.paragraphs[0].font.size = Pt(22)
Things work a little differently when you're setting the font of an individual data label rather than all of them at once.

Related

How to plot a "heatmap" of thousands of timeseries in python?

I am looking for a way to visualize, for lack of a better word, the "density" or "heatmap" of some synthetic time series I have created.
I have a loop that creates a list holding the values of one time series. I don't think it matters, but just in case, here is the code. This is a Markov process: for each i, which represents the hour, I create a new value that depends on the previous i and state:
for x in range(10000):
    start_h = 0
    start_s = 1
    generated_values_list = []
    for i in range(start_h,120):
        if i>=24:
            i=i%24
        print(str(start_s)+" | " +str(i))
        pot_value_list = GMM_vals_container_workingdays_spring["State: "+ str(start_s)+", hour: "+str(i)]
        if len(pot_value_list)>50:
            actual_value = random.choice(pot_value_list)#
            #cdf, gmm_x, gmm = GMM_erstellen(pot_value_list,50)
            #actual_value = gmm.sample()[0][0][0]
            #print("made by GMM")
        else:
            actual_value = random.choice(pot_value_list)
            #print("made not by GMM")
        generated_values_list.append(actual_value)
        probabilities_next_state = TPMs_WD[i][start_s-1]
        next_state = random.choices(states,weights=probabilities_next_state)
        start_s = next_state[0]
    plt.plot(generated_values_list)
But - I think - the only part that matters is this:
for x in range(10000):
    #some code that creates the generated_values_list
    plt.plot(generated_values_list)
This creates, as expected, a picture like this:
It is not clear from this plot which paths are the most common, so I would like frequently hit values to appear more colorful and infrequent values to appear rather grey.
I think the seaborn library has something for that, but I don't understand the docs.
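One possible approach, which is not taken from the question and does not use seaborn: instead of drawing 10000 overlapping lines, count how often each (value bin, time step) cell is hit across all series and display the counts as an image. The sketch below does the counting in pure Python. The function name density_grid, the bin settings, and the random stand-in data are all assumptions for illustration.

```python
import random


def density_grid(series_list, value_min, value_max, n_bins):
    """Count hits per (value bin, time step) cell across all series."""
    n_steps = len(series_list[0])
    grid = [[0] * n_steps for _ in range(n_bins)]
    bin_width = (value_max - value_min) / n_bins
    for series in series_list:
        for step, value in enumerate(series):
            row = int((value - value_min) / bin_width)
            # Clamp so out-of-range values and value_max land in an edge bin.
            row = max(0, min(row, n_bins - 1))
            grid[row][step] += 1
    return grid


# Stand-in data: 1000 random series of 120 hourly values in [0, 1).
rng = random.Random(0)
all_series = [[rng.random() for _ in range(120)] for _ in range(1000)]
grid = density_grid(all_series, 0.0, 1.0, n_bins=20)
```

The resulting grid could then be drawn with, for example, plt.imshow(grid, aspect='auto', origin='lower'), so that the cell color encodes how often a value range was reached at each hour.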

How to read element in list item in Python?

I have the following output from a function and I need to read shape, labels, and domain from this stream.
[Annotation(shape=Rectangle(x=0.0, y=0.0, width=1.0, height=1.0), labels=[ScoredLabel(62282a1dc79ed6743e731b36, name=GOOD, probability=0.5143796801567078, domain=CLASSIFICATION, color=Color(red=233, green=97, blue=21, alpha=255), hotkey=ctrl+3)], id=622cc4d962f051a8f41ddf35)]
I need them as follows
shp = Annotation.shape
lbl = Annotation.labels
dmn = domain
It seems simple but I could not figure it out yet.
Given output as a list of Annotation objects:
output = [Annotation(...)]
you ought to be able to simply do:
shp = output[0].shape
lbl = output[0].labels
dmn = lbl[0].domain
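A self-contained sketch of the same access pattern follows. The dataclasses here are hypothetical stand-ins that mirror only the fields visible in the printed output; the real Annotation, Rectangle, and ScoredLabel classes come from whatever library produced that output and may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Rectangle:  # hypothetical stand-in
    x: float
    y: float
    width: float
    height: float


@dataclass
class ScoredLabel:  # hypothetical stand-in
    name: str
    probability: float
    domain: str


@dataclass
class Annotation:  # hypothetical stand-in
    shape: Rectangle
    labels: List[ScoredLabel] = field(default_factory=list)


output = [
    Annotation(
        shape=Rectangle(x=0.0, y=0.0, width=1.0, height=1.0),
        labels=[ScoredLabel(name="GOOD", probability=0.5143796801567078, domain="CLASSIFICATION")],
    )
]

# Index into the list first, then read attributes off the object.
for annotation in output:
    shp = annotation.shape
    lbl = annotation.labels
    # domain lives on each label, not on the annotation itself.
    domains = [label.domain for label in lbl]
```

Looping over output rather than hard-coding output[0] also covers the case where the function returns several annotations.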

Pulling PowerPoint Text Attributes Through Python

I am trying to pull in the attributes associated with my text in PowerPoint and am getting weird outputs... The output from shape.fill is not as expected. I am also curious to find the other attributes like shape.font and the position of the shape - is this possible?
Issue:
f = shape.fill
Output: <pptx.dml.fill.FillFormat object at 0x00000215C4D6DD90>
Code:
mylist = []
mylist2 = []
mylist3 = []
mylist4 = []
mylist5 = []
mylist6 = []
mylist7 = []
for eachfile in glob.glob(direct):
    s = 1
    file = os.path.basename(eachfile)
    try:
        prs = Presentation(eachfile)
        for slide in prs.slides:
            for shape in slide.shapes:
                if hasattr(shape, "text"):
                    x = nltk.word_tokenize(shape.text)
                    t = shape.text
                    f = shape.fill
                    print(f)
                    mylist4.append(file)
                    mylist5.append(t)
                    mylist7.append(f)
                    mylist6.append('Slide: ' + str(s))
                    # x = shape.text.split() #looks for words with punctuation included
                    for word in x:
                        word = word.lower()
                        if word in terms:
                            mylist.append("Slide " + str(s))
                            mylist2.append(file)
                            mylist3.append(word)
            s = s + 1
    except:
        pass
#mylist = list(dict.fromkeys(mylist))
d = {'FileName':mylist2,'Slide':mylist, 'Match':mylist3}
d2 = {'FileName':mylist4, 'Slide':mylist6, 'Text':mylist5, 'Color':mylist7}
search = phrases + terms
d3 = {'Text':search}
df = pd.DataFrame(d)
df = df.drop_duplicates()
<pptx.dml.fill.FillFormat object at 0x00000215C4D6DD90> is a Python object. You need to look up the documentation for this type of object and use its attributes and methods to get information out of it.
The only documentation I could find for this type of object is this one, although it is not "normal" documentation, just the source code. The functions you can use are defined inside the FillFormat class, starting with back_color(self, ...).
The API documentation describes what you should expect on any given attribute. For example, here: https://python-pptx.readthedocs.io/en/latest/api/dml.html#fillformat-objects
you can find out how to interrogate the FillFormat object that Shape.fill returns.
In many cases, things are substantially more complex than the common cases and the API will reflect that. For example, fills come in several varieties: an RGB color (most common), a pattern (repeated bitmap mask), an image (either tiled or fit in a variety of ways), and a "null" fill. Accommodating all these options requires you to learn more about PowerPoint than you probably originally wanted to know :)
The overall API documentation is here: https://python-pptx.readthedocs.io/en/latest/#api-documentation

Python-PPTX : Data Label Positions not working for Doughnut Chart

I have a Chart Placeholder, into which I have inserted a chart of chart_type 'DOUGHNUT'. I've added data labels to it and want to change their positions. For some reason, the method given in the documentation has no effect on my chart.
Here is my code, please help if I'm doing something wrong -
from pptx import Presentation
from pptx.chart.data import ChartData
from pptx.dml.color import RGBColor
from pptx.enum.chart import XL_CHART_TYPE, XL_LABEL_POSITION, XL_DATA_LABEL_POSITION, XL_TICK_MARK, XL_TICK_LABEL_POSITION
from pptx.util import Pt
chart_data = ChartData()
chart_data.add_series('', tuple(input_chart_data[x] for x in input_chart_data))
graphic_frame = content_placeholder.insert_chart(XL_CHART_TYPE.DOUGHNUT, chart_data)
chart = graphic_frame.chart
chart.has_legend = False
#Adding Data-Labels with custom text
chart.plots[0].has_data_labels = True
data_labels = chart.plots[0].data_labels
i = 0
series = chart.series[0]
for point in series.points:
    fill = point.format.fill
    fill.solid()
    fill.fore_color.rgb = RGBColor(<color_code>)
    point.data_label.has_text_frame = True
    #Assigning custom text for data label associated with each data-point
    point.data_label.text_frame.text = str(chart_data.categories[i].label) + "\n" + str(float(chart.series[0].values[i])) + "%"
    for run in point.data_label.text_frame.paragraphs[0].runs:
        run.font.size = Pt(10)
    i+=1
data_labels.position = XL_LABEL_POSITION.OUTSIDE_END
PowerPoint is finicky about where you place certain chart attributes and feels free to ignore them when it wants (although it does so consistently).
A quick option worth trying is to set the value individually, point-by-point in the series. So something like:
for point in series.points:
    point.data_label.position = XL_LABEL_POSITION.OUTSIDE_END
The most reliable method is to start by producing the effect you want by hand, using PowerPoint itself on an example chart, then inspecting the XML PowerPoint produces in the saved file, perhaps using opc-diag. Once you've identified what XML produces the desired effect (or discovered PowerPoint won't let you do it), then you can proceed to working out how to get the XML generated by python-pptx. That might make a good second question if you're able to get that far.
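If opc-diag is not at hand, the chart XML can also be pulled out with the standard library, since a .pptx file is a zip package. The sketch below assumes chart parts live under ppt/charts/ with names starting with "chart", which is where they are typically stored; the function name is made up for illustration.

```python
import posixpath
import zipfile


def chart_xml_parts(pptx_path):
    """Map each chart part name in a .pptx package to its XML text."""
    parts = {}
    with zipfile.ZipFile(pptx_path) as package:
        for name in package.namelist():
            base = posixpath.basename(name)
            # Keep only the chart parts themselves, e.g. ppt/charts/chart1.xml.
            if (
                posixpath.dirname(name) == "ppt/charts"
                and base.startswith("chart")
                and base.endswith(".xml")
            ):
                parts[name] = package.read(name).decode("utf-8")
    return parts
```

Running this on a file saved by PowerPoint and on one generated by python-pptx, then comparing the c:dLblPos elements in the two results, is one way to spot where the label-position XML differs.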
I made it work by writing the below code.
def apply_data_labels(self, chart):
    plot = chart.plots[0]
    plot.has_data_labels = True
    for series in plot.series:
        values = series.values
        counter = 0
        for point in series.points:
            data_label = point.data_label
            data_label.has_text_frame = True
            data_label.text_frame.text = str(values[counter])
            counter = counter + 1
The cause of the error is setting the label position: no matter what value you set, PowerPoint asks to repair the file. I will have to dig deeper to see why.
Also, to save you some time: the formatting (font color, size) doesn't work either.
If anybody has any leads, please help.
To add on Vibhanshu's response, I could get the formatting (font type, font color, size etc) to work using the following code:
for idx, point in enumerate(chart.series[0].points):
    # set position
    point.data_label.position = XL_LABEL_POSITION.OUTSIDE_END
    # set text
    point.data_label.has_text_frame = True
    point.data_label.text_frame.text = "This is an example"
    # set formatting
    for paragraph_idx, paragraph in enumerate(point.data_label.text_frame.paragraphs):
        paragraph.line_spacing = 0.6 # set paragraph line spacing
        for run in paragraph.runs:
            run.font.size = Pt(30) #set font size
            run.font.name = 'Poppins Medium' #set font name
            run.font.color.rgb = RGBColor.from_string("FF0000") #set font color

Why doesn't this code save my figures with titles?

I'm producing some figures with the following code:
def boxplot_data(self,parameters_file,figure_title):
    data = pandas.read_csv(parameters_file)
    header = data.keys()
    number_of_full_subplots = len(header)/16
    remainder = len(header)-(16*number_of_full_subplots)
    try:
        for i in range(number_of_full_subplots+1):
            fig =plt.figure(i)
            txt = fig.suptitle(figure_title+' (n='+str(len(data[header[0]]))+') '+'Page '+str(i)+' of '+str(number_of_full_subplots),fontsize='20')
            txt.set_text(figure_title+' (n='+str(len(data[header[0]]))+') '+'Page '+str(i)+' of '+str(number_of_full_subplots))
            for j in range(16):
                plt.ioff()
                plt.subplot(4,4,j)
                plt.boxplot(data[header[16*i+j]])
                plt.xlabel('')
                mng=plt.get_current_fig_manager()
                mng.window.showMaximized()
                plt.savefig(str(i)+'.png',bbox_inches='tight',orientation='landscape')
            plt.close(fig)
            plt.ion()
    except IndexError:
        txt = fig.suptitle(figure_title+' (n='+str(len(data[header[0]]))+') '+'Page '+str(i)+' of '+str(number_of_full_subplots),fontsize='20')
        txt.set_text(figure_title+' (n='+str(len(data[header[0]]))+') '+'Page '+str(i)+' of '+str(number_of_full_subplots))
        print '{} full figures were created and 1 partially filled \
figure containing {} subplots'.format(number_of_full_subplots,remainder)
This produces the figures and saves them to file in the intended layout. However, no matter what I do, the code seems to bypass the fig.suptitle line(s), so I can't give my figures a title. Apologies if there is a lot going on in this function that I haven't explained, but does anybody know why this code refuses to give my figures titles?
Your problem is not that suptitle is bypassed, but that you are never saving the figure that you call suptitle on. All your calls to savefig are within the inner loop and as such are saving only the subplots. You can actually watch this happening if you open the png file while your code is running - you see each of the 16 sub axes being added one by one.
Your code looks unnecessarily complicated. For instance, I don't think you need to use ion and ioff. Here is a simple example of how to do what I think you want, followed by a translation of your code to fit that. (Obviously I can't test it, because I don't have your data.)
import matplotlib.pyplot as plt
test_y=range(10)
test_x=[8,13,59,8,81,2,5,6,2,3]
def subplotsave_test():
    for i in range(5):
        fig = plt.figure(i)
        txt = fig.suptitle('Page '+str(i)+' of '+str(5),fontsize='20')
        for j in range(16):
            plt.subplot(4,4,j+1)
            plt.plot(test_y,test_x)
        plt.savefig(str(i)+'.png',bbox_inches='tight',orientation='landscape')
if __name__ == '__main__':
    subplotsave_test()
One tip that works for me: call plt.show() wherever you intend to save the figure, check that it looks the way you want, and then replace that call with plt.savefig().
Possible translation of your function:
def boxplot_data(self,parameters_file,figure_title):
    data = pandas.read_csv(parameters_file)
    header = data.keys()
    # Integer division so the page count stays an int
    number_of_full_subplots = len(header)//16
    remainder = len(header)-(16*number_of_full_subplots)
    for i in range(number_of_full_subplots+1):
        fig =plt.figure(i)
        fig.suptitle(figure_title+' (n='+str(len(data[header[0]]))+') '+'Page '+str(i)+' of '+str(number_of_full_subplots),fontsize='20')
        for j in range(16):
            plt.subplot(4,4,j+1)
            if 16*i + j < len(header):
                plt.boxplot(data[header[16*i+j]])
                plt.xlabel('')
                #You might want the showMaximized() call here - does nothing
                #on my machine but YMMV
            else:
                print('{} full figures were created and 1 partially filled \
figure containing {} subplots'.format(number_of_full_subplots,remainder))
                break
        # Save once per figure, after all subplots have been drawn
        plt.savefig(str(i)+'.png',bbox_inches='tight',orientation='landscape')
        plt.close(fig)
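The page bookkeeping in the function above could also be handled by slicing the column names into groups of 16 up front, which removes the else/break branch and gives 1-based page numbers. This is only a sketch of that idea: the paginate helper and the made-up column names are assumptions, not part of the original answer.

```python
def paginate(items, per_page=16):
    """Split items into consecutive pages of at most per_page entries."""
    return [items[start:start + per_page] for start in range(0, len(items), per_page)]


columns = ["col{}".format(n) for n in range(35)]  # hypothetical header names
pages = paginate(columns)

titles = []
for page_number, page in enumerate(pages, start=1):
    # One figure per page; page[j] would go into subplot position j + 1.
    titles.append("Page {} of {}".format(page_number, len(pages)))
```

With 35 columns this yields two full pages and one partial page of three plots, and each title can be passed to fig.suptitle before the single savefig call for that figure.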
