Creating a table in python and printing to a PDF - python

I know some similar questions have been asked but none have been able to answer my question or maybe my python programming skills are not that great(they are not). Ultimately I'm trying to creating a table to look like the one below, all of the "Some values" will be filled with JSON data which I do know how to import but creating the table to then export it to a PDF using FPDF is what is stumping me. I've tried pretty table and wasn't able to achieve this I tried using html but really I dont know too much html to build a table from scratch like this. so if some one could help or point in the right direction it would be appreciated.

I would recommend the using both the Pandas Library and MatplotLib
Firstly, with Pandas you can load data in from a JSON, either from a JSON file or string with the read_json(..) function documented here.
Something like:
import pandas as pd
df = pd.read_json("/path/to/my/file.json")
There is plenty of functionality withing the pandas library to manipulate your dataframe (your table) however you need.
Once you're done, you can then use MatplotLib to generate your PDF without needing any HTML
This would then become something like
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
df = pd.read_json("/path/to/my/file.json")
# Manipulate your dataframe 'df' here as required.
# Now use Matplotlib to write to a PDF
# Lets create a figure first
fig, ax = plt.subplots(figsize=(20, 10))
ax.axis('off')
the_table = ax.table(
cellText=df.values,
colLabels=df.columns,
loc='center'
)
# Now lets create the PDF and write the table to it
pdf_pages = PdfPages("/path/to/new/table.pdf")
pdf_pages.savefig(fig)
pdf_pages.close()
Hope this helps.

Related

How do I create a dataframe from a geojason file without using geopandas?

I'm looking to turn a geojason into a pandas dataframe that I can work with using python. However, for some reason, the geojason package will not install on my computer.
So wanted to know how I could turn a geojason file into a dataframe witout using the geojason package.
This is what I have so far
import json
import pandas as pd
with open('Local_Authority_Districts_(December_2020)_UK_BGC.geojson') as f:
data = json.load(f)
Here is a link to the geojason that I'm working with. I'm new to python so any help would be much appreciated. https://drive.google.com/file/d/1V4WljiJcASqq9ksh8CHM_2nBC0K2PR18/view?usp=sharing
You could use geopandas. It's as easy as this:
import geopandas as gpd
gdf = gpd.read_file('Local_Authority_Districts_(December_2020)_UK_BGC.geojson')
You can turn the resulting geodataframe into a regular dataframe with:
df = pd.DataFrame(gdf)

Plotly Choropleth Map not Displaying Correctly

First off, I'd like to start by saying that I am asking this question since I do not believe that it is related to the similarly titled question: plotly choropleth not plotting data, as I believe it has something to do with the custom GEOJSON boundaries and/or how I am accessing the data.
I use the crime.csv table from https://www.kaggle.com/ankkur13/boston-crime-data, and I am using a GEOJSON file from Boston Analytics (Fetched in the script). Currently, my code runs without errors, but it does not load the data onto the plot.
import pandas as pd
import plotly.graph_objs as go
from urllib.request import urlopen
import json
# Read Dataset
# Located at: https://www.kaggle.com/ankkur13/boston-crime-data
df = pd.read_csv("crime.csv")
with urlopen('http://bostonopendata-boston.opendata.arcgis.com/datasets/9a3a8c427add450eaf45a470245680fc_5.geojson?outSR={%22latestWkid%22:2249,%22wkid%22:102686}') as response:
pd_districts = json.load(response)
df_agg = df.groupby("DISTRICT").agg(CRIMES=("YEAR","count"))
df_agg.reset_index(inplace=True)
df_agg = df_agg[df_agg['DISTRICT'] != 'nan']
df['DISTRICT'] = df['DISTRICT'].apply(lambda x: str(x))
fig = go.Figure(go.Choroplethmapbox(geojson=pd_districts,
locations=df_agg['DISTRICT'].unique(),
z=df_agg['CRIMES'],
featureidkey="features.properties.DISTRICT")
)
fig.update_layout(mapbox_style="carto-positron")
fig.update_geos(fitbounds="locations")
fig.show()
I believe it has something to do with my featureidkey. However, I have tried multiple variants such as properties.DISTRICT, properties.ID, but to no avail.
I also ran a few sanity checks to make sure that the data was accessible, and here they are:
print(df_agg['DISTRICT'].unique())
print(pd_districts['features'][0]['properties']['ID'])
print(df['DISTRICT'].dtype)
Any help would be appreciated.

Import data tables in Python

I am new to Python, coming from MATLAB. In MATLAB, I used to create a variable table (copy from excel to MATLAB) in MATLAB and save it as a .mat file and whenever I needed the data from the MATLAB, I used to import it using:
A = importdata('Filename.mat');
[Filename is 38x5 table, see the attached photo]
Is there a way I can do this in Python? I have to work with about 35 such tables and loading everytime from excel is not the best way.
In order to import excel tables into your python environment you have to install pandas.
Check out the detailed guideline.
import pandas as pd
xl = pd.ExcelFile('myFile.xlsx')
I hope this helps.
Use pandas:
import pandas as pd
dataframe = pd.read_csv("your_data.csv")
dataframe.head() # prints out first rows of your data
Or from Excel:
dataframe = pd.read_excel('your_excel_sheet.xlsx')

How to Read a WebPage with Python and write to a flat file?

Very novice at Python here.
Trying to read the table presented at this page (w/ the current filters set as is) and then write it to a csv file.
http://www65.myfantasyleague.com/2017/options?L=47579&O=243&TEAM=DAL&POS=RB
I tried this next approach. It creates the csv file but does not fill it w/ the actual table contents.
Appreciate any help in advance. thanks.
import requests
import pandas as pd
url = 'http://www65.myfantasyleague.com/2017/optionsL=47579&O=243&TEAM=DAL&POS=RB'
csv_file='DAL.RB.csv'
pd.read_html(requests.get(url).content)[-1].to_csv(csv_file)
Generally, try to emphasize your problems better, try to debug and don't put everything in one line. With that said, your specific problem here was the index and the missing ? in the code (after options):
import requests
import pandas as pd
url = 'http://www65.myfantasyleague.com/2017/options?L=47579&O=243&TEAM=DAL&POS=RB'
# -^-
csv_file='DAL.RB.csv'
pd.read_html(requests.get(url).content)[1].to_csv(csv_file)
# -^-
This yields a CSV file with the table in it.

plotting using pandas in python

What i am trying to do is slightly basic, however i am very new to python, and am having trouble.
Goal: is to plot the yellow highlighted Row(which i have highlighted, however it will not be highlighted when i need to read the data) on the Y-Axis and plot the "Time" Column on the X-Axis.
Here is a photo of the Data, and then the code that i have tried along with its error.
Code
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import style
style.use('ggplot')
#Reading CSV and converting it to a df(Data_Frame)
df1 = pd.read_csv('Test_Sheet_1.csv', skiprows = 8)
#Creating a list from df1 and labeling it 'Time'
Time = df1['Time']
print(Time)
#Reading CSV and converting it to a df(Data_Frame)
df2 = pd.read_csv('Test_Sheet_1.csv').T
#From here i need to know how to skip 4 lines.
#I need to skip 4 lines AFTER the transposition and then we can plot DID and
Time
DID = df2['Parameters']
print(DID)
Error
As you can see from the code, right now i am just trying to print the Data so that i can see it, and then i would like to put it onto a graph.
I think i need to use the 'skiplines' function after the transposition, so that python can know where to read the "column" labeled parameters(its only a column after the Transposition), However i do not know how to use the skip lines function after the transposition unless i transpose it to a new Excel Document, but this is not an option.
Any help is very much appreciated,
Thank you!
Update
This is the output I get when I add print(df2.columns.tolist())

Categories