Python OpenPyxl inserting Histogram to Excel - python

I have been using OpenPyxl for creating Excel workbooks using data from other CSV files.
Currently I want to insert a histogram into the worksheet based on a numerical list that I have as variable x below.
I cannot find an efficient way to generate the histogram. The option I went with was to generate the histogram in matplotlib, save it to a file, and then place that image in the worksheet. This seems cumbersome, and I feel like I am missing some syntax for passing the plt figure directly to the img.
The option using Reference also seems imperfect, as I have vectors of length 10^6 and would rather not write them to this file.
import numpy as np
import openpyxl
from openpyxl.drawing.image import Image
import matplotlib.pyplot as plt
wb = openpyxl.Workbook()
ws = wb.active
x = np.random.rand(100)
# Histogram: save to a temporary PNG, then embed it at A1
plt.clf()
plt.hist(x, bins=10)
plt.savefig('temp1.png')
img = Image('temp1.png')
img.width, img.height = 300, 300
ws.add_image(img, 'A1')
# Line plot: same round trip through a second PNG, embedded at A15
plt.clf()
plt.plot(x)
plt.savefig('temp2.png')
img = Image('temp2.png')
img.width, img.height = 300, 300
ws.add_image(img, 'A15')
wb.save("trial.xlsx")
As you can see, this generates two .png files and overall seems unclean. I do not think the performance is taking much of a hit, but there are undoubtedly better solutions, and optimization is valued here.
I would treat answers of the form: "Swap from using OpenPyxl to ..." as a last resort only.
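One way to avoid the temporary files, sketched under the assumption that a recent openpyxl is in use with Pillow installed: savefig can write into an io.BytesIO buffer, and openpyxl.drawing.image.Image can, as far as I know, be given a file-like object in place of a filename. The helper name figure_to_image is made up for this illustration; the 300x300 size and the A1/A15 anchors come from the question.

```python
import io

import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt
import numpy as np
import openpyxl
from openpyxl.drawing.image import Image


def figure_to_image(fig, size=(300, 300)):
    """Render a matplotlib figure to an in-memory PNG and wrap it for openpyxl.

    Hypothetical helper, not part of either library.
    """
    buf = io.BytesIO()
    fig.savefig(buf, format="png")
    buf.seek(0)
    img = Image(buf)  # the buffer must stay open until wb.save() runs
    img.width, img.height = size
    return img


wb = openpyxl.Workbook()
ws = wb.active
x = np.random.rand(100)

# Histogram, anchored at A1
fig, ax = plt.subplots()
ax.hist(x, bins=10)
ws.add_image(figure_to_image(fig), "A1")
plt.close(fig)

# Line plot, anchored at A15
fig, ax = plt.subplots()
ax.plot(x)
ws.add_image(figure_to_image(fig), "A15")
plt.close(fig)

wb.save("trial.xlsx")
```

Only the PNG bytes end up in the workbook, so the 10^6-element vector itself is never written to the sheet.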

Related

Creating a table in python and printing to a PDF

I know some similar questions have been asked, but none have answered my question, or maybe my Python programming skills are not that great (they are not). Ultimately I'm trying to create a table that looks like the one below. All of the "Some values" cells will be filled with JSON data, which I do know how to import, but creating the table and then exporting it to a PDF using FPDF is what is stumping me. I've tried PrettyTable and wasn't able to achieve this. I also tried using HTML, but I really don't know enough HTML to build a table like this from scratch. If someone could help or point me in the right direction, it would be appreciated.
I would recommend using both the pandas library and Matplotlib.
Firstly, with pandas you can load data from JSON, either from a file or from a string, with the read_json(..) function (see the pandas documentation).
Something like:
import pandas as pd
df = pd.read_json("/path/to/my/file.json")
There is plenty of functionality within the pandas library to manipulate your DataFrame (your table) however you need.
Once you're done, you can then use Matplotlib to generate your PDF without needing any HTML.
This would then become something like:
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
df = pd.read_json("/path/to/my/file.json")
# Manipulate your dataframe 'df' here as required.
# Now use Matplotlib to write to a PDF
# Let's create a figure first
fig, ax = plt.subplots(figsize=(20, 10))
ax.axis('off')
the_table = ax.table(
    cellText=df.values,
    colLabels=df.columns,
    loc='center'
)
# Now let's create the PDF and write the table to it
pdf_pages = PdfPages("/path/to/new/table.pdf")
pdf_pages.savefig(fig)
pdf_pages.close()
Hope this helps.

Could not convert string to float -headers

Although there are many topics on this error, I can't find any with a solution that helps me. I have headers labelled bottom, left, and right in my .csv Excel file. When I try to plot them, I get a "could not convert string to float" error because of these headers. How could I solve this?
import matplotlib.pyplot as plt
import numpy as np
# Read the input data only once
Bottom, Left, Right = np.loadtxt("C:Data 2.csv", delimiter=",", skiprows=1, unpack=True)
# Create the figure and the first axis
fig, ax1 = plt.subplots()
# Plot in the first axis
ax1.plot(Bottom, Left, label='Pressure/area', color='b')
plt.show()
This is what the file looks like:
Another approach you can take is to read the CSV file through pandas using pd.read_csv, which treats the first row as column headers by default. This should help solve your problem.
For example, if my file path is 'example/path/pathtofile':
import pandas as pd
filepath = 'example/path/pathtofile'
data = pd.read_csv(filepath)
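A slightly fuller sketch of this suggestion, using a small in-memory CSV as a stand-in for the real file. The column names Bottom, Left, and Right follow the question; the numbers and the output file name are invented purely for illustration.

```python
import io

import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for the real .csv file; the values are made up.
csv_text = """Bottom,Left,Right
0.0,1.5,2.5
1.0,1.7,2.9
2.0,2.1,3.4
"""

# read_csv uses the first row as column names, so only numbers reach the plot
data = pd.read_csv(io.StringIO(csv_text))

fig, ax1 = plt.subplots()
ax1.plot(data["Bottom"], data["Left"], label="Pressure/area", color="b")
ax1.legend()
fig.savefig("pressure_area.png")  # hypothetical output name
```

With a real file, the io.StringIO(...) argument would simply be replaced by the file path.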

plotting using pandas in python

What I am trying to do is fairly basic; however, I am very new to Python and am having trouble.
Goal: plot the yellow highlighted row on the Y-axis and the "Time" column on the X-axis. (I highlighted the row myself; it will not be highlighted when I need to read the data.)
Here is a photo of the data, followed by the code that I have tried along with its error.
Code
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import style
style.use('ggplot')
#Reading CSV and converting it to a df(Data_Frame)
df1 = pd.read_csv('Test_Sheet_1.csv', skiprows = 8)
#Creating a list from df1 and labeling it 'Time'
Time = df1['Time']
print(Time)
#Reading CSV and converting it to a df(Data_Frame)
df2 = pd.read_csv('Test_Sheet_1.csv').T
#From here I need to know how to skip 4 lines.
#I need to skip 4 lines AFTER the transposition and then we can plot DID and Time
DID = df2['Parameters']
print(DID)
Error
As you can see from the code, right now I am just trying to print the data so that I can see it, and then I would like to put it onto a graph.
I think I need to use something like 'skiplines' after the transposition, so that Python knows where to read the "column" labeled Parameters (it is only a column after the transposition). However, I do not know how to skip lines after the transposition unless I transpose the data into a new Excel document, and that is not an option.
Any help is very much appreciated,
Thank you!
Update
This is the output I get when I add print(df2.columns.tolist())
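For what it is worth, skiprows only applies while read_csv is parsing the file, so rows cannot be skipped that way after .T. A transposed DataFrame can instead be sliced by position with .iloc. The sketch below assumes a made-up layout, since the real Test_Sheet_1.csv is not shown here: each CSV row holds one parameter, and the first four entries of each are to be discarded.

```python
import io

import pandas as pd

# Hypothetical stand-in for Test_Sheet_1.csv; names and values are invented.
csv_text = """Parameters,c1,c2,c3,c4,c5,c6
Time,0,0,0,0,1,2
DID,0,0,0,0,10,20
"""

# Use the first column as the index so that, after transposing,
# "Time" and "DID" become column labels.
df = pd.read_csv(io.StringIO(csv_text), index_col=0).T

# Drop the first four rows of the transposed frame by position.
df = df.iloc[4:]

print(df[["Time", "DID"]])
# Plotting would then be along the lines of: df.plot(x="Time", y="DID")
```

The number passed to .iloc and the column labels would need adjusting to the actual file.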

Pass array to an excel single cell from Python using xlwings

Is it possible to pass an array from Python into a single cell in Excel using xlwings? When I attempt to pass a NumPy array as a Range.value, Excel creates a horizontal range with each value in its own cell. My code is similar to this:
import xlwings as xw
import numpy as np
xlist = list()
xlist.append(1)
xlist.append(2)
wb = xw.Workbook('xlwingstest.xlsx')
xw.Range("sheet1", "a1", asarray= True).value = np.asarray(xlist)
I am trying to avoid using VBA, but suspect I may not have a choice.
Thanks
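A worksheet cell holds a single scalar (a number, a string, a boolean), so an array has to be turned into one value before it is written. One possible approach, sketched here without touching Excel, is to serialize the array to a JSON string and assign that string to the cell. Both helper names are hypothetical.

```python
import json

import numpy as np


def array_to_cell_text(arr):
    """Serialize an array to a JSON string that fits in one cell (hypothetical helper)."""
    return json.dumps(np.asarray(arr).tolist())


def cell_text_to_array(text):
    """Inverse of array_to_cell_text: parse the cell text back into an array."""
    return np.asarray(json.loads(text))


xlist = [1, 2]
cell_text = array_to_cell_text(xlist)
print(cell_text)  # [1, 2]

# The string, rather than the array, would then be assigned to the
# single-cell Range's .value in the xlwings code from the question.
```

The trade-off is that Excel sees plain text, so formulas cannot operate on the individual elements without parsing the string again.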

Plot multiple data using for loop, pyplot and genfromtxt

I am pretty sure this particular problem must have been addressed somewhere, but I cannot find it, so I am asking the question.
I have 66 files with data stored in one single column. I wish to plot all the data in a single plot. I'm used to doing this with bash, where acquiring and plotting data inside a loop is pretty trivial, but I can't figure out how to do it in Python.
thanks a lot for your help.
NM
Something like this should do it, although it will depend on how your data files are named.
import matplotlib.pyplot as plt
import numpy as np
fig,ax = plt.subplots()
# Lets say your files are called data-00.txt, data-01.txt etc.
for i in range(66):
    data=np.genfromtxt('data-{:02d}.txt'.format(i))
    ax.plot(data)
fig.savefig('my_fig.png')
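If the files do not follow a predictable numbering scheme, one option is to collect them with a glob pattern instead of building each name by hand. The sketch below first writes three small single-column files into a temporary directory so that it is self-contained; the data-*.txt pattern and the file contents are assumptions.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # off-screen backend
import matplotlib.pyplot as plt
import numpy as np

# Create a few stand-in data files (one column each); the contents are made up.
workdir = Path(tempfile.mkdtemp())
for i in range(3):
    np.savetxt(workdir / "data-{:02d}.txt".format(i), np.arange(5) * (i + 1))

fig, ax = plt.subplots()
# sorted() keeps the plotting order stable across platforms
for path in sorted(workdir.glob("data-*.txt")):
    data = np.genfromtxt(path)
    ax.plot(data, label=path.name)
ax.legend()
fig.savefig(workdir / "my_fig.png")
```

For the real data, workdir would point at the directory holding the 66 files and the file-creation loop would be dropped.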
