Converting CSV to HTML Table in Python


I'm trying to take data from a .csv file and import it into an HTML table using Python.
This is the csv file https://www.mediafire.com/?mootyaa33bmijiq
Context:
The csv is populated with data from a football team [Age group, Round, Opposition, Team Score, Opposition Score, Location]. I need to be able to select a specific age group and only display those details in separate tables.
This is all I've got so far....
infile = open("Crushers.csv","r")
for line in infile:
    row = line.split(",")
    age = row[0]
    week = row[1]
    opp = row[2]
    ACscr = row[3]
    OPPscr = row[4]
    location = row[5]
    if age == 'U12':
        print(week, opp, ACscr, OPPscr, location)

First install pandas:
pip install pandas
Then run:
import pandas as pd
columns = ['age', 'week', 'opp', 'ACscr', 'OPPscr', 'location']
df = pd.read_csv('Crushers.csv', names=columns)
# Change the column and value here to filter on whatever you need
age_15 = df[df['age'] == 'U15']
# Other examples:
bye = df[df['opp'] == 'Bye']
crushed_team = df[df['ACscr'] == '0']
crushed_visitor = df[df['OPPscr'] == '0']
# Experiment with other filters, then
# use .to_html() to get the table as HTML
print(crushed_visitor.to_html())
You'll get something like:
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th>age</th>
<th>week</th>
<th>opp</th>
<th>ACscr</th>
<th>OPPscr</th>
<th>location</th>
</tr>
</thead>
<tbody>
<tr>
<th>34</th>
<td>U17</td>
<td>1</td>
<td>Banyo</td>
<td>52</td>
<td>0</td>
<td>Home</td>
</tr>
<tr>
<th>40</th>
<td>U17</td>
<td>7</td>
<td>Aspley</td>
<td>62</td>
<td>0</td>
<td>Home</td>
</tr>
<tr>
<th>91</th>
<td>U12</td>
<td>7</td>
<td>Rochedale</td>
<td>8</td>
<td>0</td>
<td>Home</td>
</tr>
</tbody>
</table>

Firstly, install pandas:
pip install pandas
Then,
import pandas as pd
# Crushers.csv has no header row, so header=None keeps the first line as data
a = pd.read_csv("Crushers.csv", header=None)
# save as an HTML file named "Table.htm"
a.to_html("Table.htm")
# or assign the HTML to a variable (string)
html_file = a.to_html()

The function below takes a filename, optional headers, and an optional delimiter, converts the CSV to an HTML table, and returns it as a string.
If headers are not provided, it assumes the header row is already present in the CSV file.
def csv_to_html_table(fname, headers=None, delimiter=","):
    """Convert CSV file contents to an HTML-formatted table."""
    with open(fname) as f:
        content = f.readlines()
    # reading file content into list
    rows = [x.strip() for x in content]
    table = "<table>"
    # creating HTML header row if header is provided
    if headers is not None:
        table += "<tr>" + "".join(["<th>"+cell+"</th>" for cell in headers.split(delimiter)]) + "</tr>\n"
    else:
        # otherwise the first line of the file is the header row
        table += "<tr>" + "".join(["<th>"+cell+"</th>" for cell in rows[0].split(delimiter)]) + "</tr>\n"
        rows = rows[1:]
    # converting csv to html row by row
    for row in rows:
        table += "<tr>" + "".join(["<td>"+cell+"</td>" for cell in row.split(delimiter)]) + "</tr>" + "\n"
    table += "</table><br>"
    return table
In your case the function call looks like this. Note that it does not filter any entries; it converts the whole CSV file to an HTML table.
filename = "Crushers.csv"
myheader = 'age,week,opp,ACscr,OPPscr,location'
html_table = csv_to_html_table(filename, myheader)
Note: To filter out entries with certain values, add a conditional statement inside the for loop.
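The note above could be sketched roughly as follows. This is not the function from this answer but a small variant under a few assumptions: it uses the csv module instead of split, escapes the cell text, and takes a hypothetical keep callback (a name invented for this sketch) that decides which rows are included. The sample rows are made up in the same column order as Crushers.csv.

```python
import csv
import io
from html import escape


def csv_to_filtered_table(lines, headers, keep):
    """Build an HTML table from CSV lines, keeping rows where keep(row) is true.

    `lines` is any iterable of CSV text lines (an open file works too);
    `keep` is a hypothetical callback introduced for this sketch.
    """
    parts = ["<table>"]
    parts.append("<tr>" + "".join("<th>%s</th>" % escape(h) for h in headers) + "</tr>")
    for row in csv.reader(lines):
        # the conditional inside the loop is what does the filtering
        if row and keep(row):
            parts.append("<tr>" + "".join("<td>%s</td>" % escape(c) for c in row) + "</tr>")
    parts.append("</table>")
    return "\n".join(parts)


# made-up sample rows in the same column order as Crushers.csv
sample = io.StringIO("U12,1,Waterford,0,4,Home\nU13,1,Waterford,22,36,Home\n")
headers = ["age", "week", "opp", "ACscr", "OPPscr", "location"]
table = csv_to_filtered_table(sample, headers, keep=lambda row: row[0] == "U12")
print(table)
```

Passing the condition in as a function keeps the table-building code unchanged when the filter changes (a different age group, only away games, and so on).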

Before you begin printing the desired rows, output some HTML to set up an appropriate table structure.
When you find a row you want to print, output it in HTML table row format.
# begin the table
print("<table>")
# column headers: a <tr> row containing <th> cells
print("<tr>")
print("<th>Week</th>")
print("<th>Opp</th>")
print("<th>ACscr</th>")
print("<th>OPPscr</th>")
print("<th>Location</th>")
print("</tr>")
infile = open("Crushers.csv","r")
for line in infile:
    # strip the trailing newline so it does not end up in the last cell
    row = line.strip().split(",")
    age = row[0]
    week = row[1]
    opp = row[2]
    ACscr = row[3]
    OPPscr = row[4]
    location = row[5]
    if age == 'U12':
        print("<tr>")
        print("<td>%s</td>" % week)
        print("<td>%s</td>" % opp)
        print("<td>%s</td>" % ACscr)
        print("<td>%s</td>" % OPPscr)
        print("<td>%s</td>" % location)
        print("</tr>")
infile.close()
# end the table
print("</table>")

First some imports:
import csv
from html import escape
import io
Now the building blocks - let's make one function for reading the CSV and another function for making the HTML table:
def read_csv(path, column_names):
    with open(path, newline='') as f:
        # why newline='': see footnote at the end of https://docs.python.org/3/library/csv.html
        reader = csv.reader(f)
        for row in reader:
            record = {name: value for name, value in zip(column_names, row)}
            yield record
def html_table(records):
    # records is expected to be a list of dicts
    column_names = []
    # first detect all possible keys (field names) that are present in records
    for record in records:
        for name in record.keys():
            if name not in column_names:
                column_names.append(name)
    # create the HTML line by line
    lines = []
    lines.append('<table>\n')
    lines.append(' <tr>\n')
    for name in column_names:
        lines.append(' <th>{}</th>\n'.format(escape(name)))
    lines.append(' </tr>\n')
    for record in records:
        lines.append(' <tr>\n')
        for name in column_names:
            value = record.get(name, '')
            lines.append(' <td>{}</td>\n'.format(escape(value)))
        lines.append(' </tr>\n')
    lines.append('</table>')
    # join the lines to a single string and return it
    return ''.join(lines)
Now just put it together :)
records = list(read_csv('Crushers.csv', 'age week opp ACscr OPPscr location'.split()))
# Print first record to see whether we are loading correctly
print(records[0])
# Output:
# {'age': 'U13', 'week': '1', 'opp': 'Waterford', 'ACscr': '22', 'OPPscr': '36', 'location': 'Home'}
records = [r for r in records if r['age'] == 'U12']
print(html_table(records))
# Output:
# <table>
# <tr>
# <th>age</th>
# <th>week</th>
# <th>opp</th>
# <th>ACscr</th>
# <th>OPPscr</th>
# <th>location</th>
# </tr>
# <tr>
# <td>U12</td>
# <td>1</td>
# <td>Waterford</td>
# <td>0</td>
# <td>4</td>
# <td>Home</td>
# </tr>
# <tr>
# <td>U12</td>
# <td>2</td>
# <td>North Lakes</td>
# <td>12</td>
# <td>18</td>
# <td>Away</td>
# </tr>
# ...
# </table>
A few notes:
csv.reader works better than line splitting because it also handles quoted values, even quoted values that contain newlines.
html.escape is used to escape strings that could contain the characters < or >.
It is often easier to work with dicts than with tuples.
CSV files usually contain a header (a first line with column names) and can then be loaded easily with csv.DictReader; Crushers.csv has no header (the data starts on the very first line), so we build the dicts ourselves in the function read_csv.
Both functions, read_csv and html_table, are generalised so they can work with any data; the column names are not "hardcoded" into them.
Yes, you could use pandas read_csv and to_html instead :) But it is good to know how to do it without pandas in case you need some customization. Or just as a programming exercise.
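As a side note to the DictReader remark: csv.DictReader can also be used on a headerless file if the column names are passed in through its fieldnames argument. The sketch below uses that to produce one table per age group, which is what the question asks for. It is only a minimal illustration; group_by_age and table_for are hypothetical helper names, and the sample lines stand in for the real file.

```python
import csv
import io
from collections import defaultdict
from html import escape

FIELDS = ["age", "week", "opp", "ACscr", "OPPscr", "location"]


def group_by_age(lines):
    """Read headerless CSV lines and group the records by age group."""
    groups = defaultdict(list)
    # passing fieldnames tells DictReader that the first line is data, not a header
    for record in csv.DictReader(lines, fieldnames=FIELDS):
        groups[record["age"]].append(record)
    return groups


def table_for(records):
    """Render a list of dict records as a simple HTML table."""
    head = "".join("<th>%s</th>" % escape(name) for name in FIELDS)
    body = "".join(
        "<tr>%s</tr>" % "".join("<td>%s</td>" % escape(r[name] or "") for name in FIELDS)
        for r in records
    )
    return "<table><tr>%s</tr>%s</table>" % (head, body)


# sample lines standing in for Crushers.csv
sample = io.StringIO(
    "U13,1,Waterford,22,36,Home\n"
    "U12,1,Waterford,0,4,Home\n"
    "U12,2,North Lakes,12,18,Away\n"
)
tables = {age: table_for(records) for age, records in group_by_age(sample).items()}
for age, html_text in tables.items():
    print(age, html_text)
```

To use it on the real file, pass an open file object (opened with newline='') instead of the StringIO sample.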

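This should work as well. Note that HTML here comes from the third-party html package on PyPI, not from the standard-library html module:
from html import HTML
import csv
def to_html(csvfile):
    H = HTML()
    t = H.table(border='2')
    # header row
    r = t.tr
    with open(csvfile) as f:
        reader = csv.DictReader(f)
        for column in reader.fieldnames:
            r.td(column)
        for row in reader:
            # start a new table row and fill it in column order
            r = t.tr
            for column in reader.fieldnames:
                r.td(row[column])
    return t
Call the function by passing the path of the CSV file to it.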

Other answers are suggesting pandas, but that's probably overkill if formatting CSV to an HTML table is all you need. If you want to use an existing package just for this purpose, there's tabulate:
import csv
from tabulate import tabulate
with open("Crushers.csv") as file:
    reader = csv.reader(file)
    u12_rows = [row for row in reader if row[0] == "U12"]
print(tabulate(u12_rows, tablefmt="html"))

Related

python new line from array

I am using a Jinja template in the frontend and Python in the backend, where I have an array of strings:
@app.route('/', methods=['POST'])
def upload_image():
    match = ''
    if 'files[]' not in request.files:
        flash('No file part')
    #file stuff
    discription = ""
    for file in files:
        if file and allowed_file(file.filename):
            #OCR related stuffs
            for i in range(0, output_ocr_len):
                #preprocessing
                for index, row in df.iterrows():
                    if cord_len > 0:
                        height = result_img.shape[0]
                        width = result_img.shape[1]
                        for elements in range(0, cord_len):
                            char_length = []
                            predicted_pattern_category = numpy.append(
                                predicted_pattern_category, 'Misdirection')
                            [char_length.append(x)
                             for x in predicted_pattern_category if x not in char_length]
                            your_predicted_pattern_category = str(char_length)
                            char_length = []
                            predicted_pattern_type = numpy.append(
                                predicted_pattern_type, 'Visual Interference')
                            [char_length.append(x) for x in predicted_pattern_type if x not in char_length]
                            your_predicted_pattern_type = "" + str(char_length)
                            for i in range(0,len(char_length)):
                                print("from ML :-",char_length[i])
                                index_of_key = key.index(char_length[i])
                                discription = discription + "" + value[index_of_key]
                                print(discription)
                            match = str(match) + " "
                    else:
                        char_length = []
                        [char_length.append(x)
                         for x in predicted_pattern_category if x not in char_length]
                        your_predicted_pattern_category = str(char_length)
                        if len(your_predicted_pattern_category) < 3:
                            your_predicted_pattern_category=''
                        char_length = []
                        [char_length.append(x) for x in predicted_pattern_type if x not in char_length]
                        your_predicted_pattern_type = str(char_length)
                        if len(your_predicted_pattern_type) < 3:
                            your_predicted_pattern_type=''
                        for i in range(0,len(char_length)):
                            print("from ML :-",char_length[i])
                            index_of_key = key.index(char_length[i])
                            discription = discription + '\r\n' + value[index_of_key]
                            print(discription)
            return render_template('uploads/results.html',
                                   msg='Processed successfully!',
                                   match=match,
                                   discription=discription,
                                   filenames=output_results
                                   )
        else:
            return render_template('uploads/results.html',
                                   msg='Processed successfully!',
                                   match=match,
                                   filenames=file_names)
To display the description, I am using Jinja template:
<tr>
<th class="table-info">Description</th>
<td>{{ description }}</td>
</tr>
I want each description, whose content comes from the "value" variable, to be printed on its own line.
Currently the descriptions render run together rather than on separate lines:
happy: is an emotionis: an extensionmy mood: relies on people
What I want is every sentence on a new line:
happy: is an emotionis:
an extensionmy mood:
relies on people

From what I can tell, your output comes out on one line because you're repeatedly appending to the same variable. If you really want it all in one description variable and on separate lines, you need a line-break character, but browsers collapse plain newlines in HTML, so '\r\n' alone will not show up as a line break.
For your fundamental problem, tables are designed to have information on separate rows. If you want the descriptions to be on separate lines, I think separate rows is the way to go. I would personally do the for-loop in the template instead of the backend, passing the list of values to the template:
{% for value in values %}
<tr>
  <th class="table-info">Description</th>
  <td>{{ value }}</td>
</tr>
{% endfor %}
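If the descriptions have to stay in a single cell instead, one option is to keep them in a list and join them with <br> tags. A minimal sketch using only the standard library follows; descriptions_to_html is a hypothetical helper name, the sample strings are modelled on the output in the question, and in a Jinja template the result would have to be marked as safe (for example with the |safe filter) so that the tags are not escaped.

```python
from html import escape


def descriptions_to_html(values):
    """Escape each description and join them with <br> so each starts on a new line."""
    return "<br>\n".join(escape(v) for v in values)


# sample strings modelled on the output shown in the question
values = ["happy: is an emotion", "is: an extension", "my mood: relies on people"]
cell = descriptions_to_html(values)
print(cell)
```

Escaping each value before joining means only the <br> tags added here are treated as markup.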

Read and write in a JSON file using python 2.x

I am writing a dictionary to a JSON file, and I made a function to read the JSON and send it to an HTML file. My problem is that it's not actually writing the dictionary. I really need help with this, because after a lot of thinking and searching I can't find what I am doing wrong.
I am making the dictionary in a file called queries
dict_scan_data = scan_data.to_dict(orient='records')
data_load ={}
data_load['noscan'] = dict_scan_data
return json.dumps(data_load)
def updateJsonFiles():
    f = open('../site/json/data.json', 'w')
    f.write(calcProductionAsJSON())
    f.close()
# updateJsonFiles()
I made the read function like this:
import json
from queries import calcProductionAsJSON
def GetProductionTotals():
    """
    Return the production data as json
    """
    f = open('../site/json/data.json', 'r')
    data = f.read()
    f.close()
    return json.dumps(data)
def GetProductionTotalsLive():
    """
    Return the production data as json
    """
    return calcProductionAsJSON()
And in the html:
<tr ng-repeat="item in data">
<td>{{ item.Masina }}</td>
<td>{{ item.Productie }}</td>
<td>{{ item.Scanned }}</td>
<td>{{ item.Delta }}</td>
</tr>
I am very new to python, so sorry if this question may seem easy or silly

I think GetProductionTotals() should return json.loads(data) instead of json.dumps(data): json.dumps serializes an object to a JSON string, while json.loads parses a JSON string back into Python objects.

json.dumps makes a JSON string from a Python dict.
json.loads loads a JSON string into a dict.
Example (Python 2.7.6):
>>> import json
>>> d = {'a': 'foobar'}
>>> json_from_d = json.dumps(d)
>>> json_from_d
'{"a": "foobar"}'
>>>
>>> new_d_from_json = json.loads(json_from_d)
>>> new_d_from_json
{u'a': u'foobar'}
So in GetProductionTotals you should call json.loads(data)
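To see why json.dumps on the file contents is a problem, here is a small sketch of what happens when text that is already JSON is dumped a second time. It is written so that it also runs on Python 3; the record in data_load is made up, and only the key names come from the question.

```python
import json

data_load = {"noscan": [{"Masina": "M1", "Scanned": 3}]}  # made-up sample record

text = json.dumps(data_load)        # what gets written to data.json
double_encoded = json.dumps(text)   # what GetProductionTotals returns at the moment

# Parsing the double-encoded value once only gives back a string, not a dict
once = json.loads(double_encoded)
print(type(once).__name__)

# Parsing the file contents directly gives the original structure
parsed = json.loads(text)
print(parsed == data_load)
```

If the caller just needs the JSON text, returning the file contents unchanged would also avoid the double encoding.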

Loop in dictionary in HTML

I have a Python script creating a dictionary and passing it to a html page to generate a report.
in Python:
data_query= {}
data_query["service1"] = "value1"
data_query["service2"] = "value2"
return data_query
in HTML:
% for name, count in data_query:
<tr>
<td>${name}</td>
<td>${count}</td>
</tr>
% endfor
This does not work; it says that it does not return enough values.
I also tried (pointed out in a comment in the other question, that I deleted by mistake):
% for name, count in dict.iteritems():
It does not give any error, but does not work. Displays nothing.
${len(dict)}
gives the right dictionary length
${len(dict.iteritems())}
does not display anything and seem to have a weird effect on my table format.
Is there a way to correctly iterate over a dictionary in HTML to display both the key and the value?
EDIT: How I transfer the dictionary to the html page.
from mako.lookup import TemplateLookup
from mako.runtime import Context
from mako.exceptions import text_error_template
html_lookup = TemplateLookup(directories=[os.path.join(self.dir_name)])
html_template = html_lookup.get_template('/templates/report.html')
html_data = { 'data_queries' : data_queries }
html_ctx = Context(html_file, **html_data)
try:
    html_template.render_context(html_ctx)
except:
    print text_error_template().render(full=False)
    html_file.close()
    return
html_file.close()

% for name, count in dict.items():
<tr>
<td>${name}</td>
<td>${count}</td>
</tr>
% endfor
should probably work. Mako control lines are plain Python, so the method has to be called (in some other templating languages, such as Django's, you would leave the parentheses off). Alternatively,
% for name in dict:
<tr>
<td>${name}</td>
<td>${dict[name]}</td>
</tr>
% endfor
would likely also work.
As an aside, dict is a terrible variable name because it shadows the builtin dict (which might be part of your problem if that is actually your variable name).
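The difference between the two loops can be checked in plain Python, outside of any template. A minimal sketch using the dictionary from the question (on Python 2, iteritems() plays the role of items() here):

```python
data_query = {"service1": "value1", "service2": "value2"}

# Iterating over a dict yields only its keys...
keys = [name for name in data_query]

# ...so unpacking into two names needs .items()
pairs = [(name, count) for name, count in data_query.items()]

rows = "\n".join("<tr><td>%s</td><td>%s</td></tr>" % pair for pair in pairs)
print(rows)
```

The same applies inside a template control line: whatever is being iterated has to yield key/value pairs before it can be unpacked into two names.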

indexError, searching within

I am writing a program that will read a CSV file with data that looks like this:
"10724_artifact11679.jpg","H. 3 1/4 in. (8.26 cm)","10.210.114","This artwork is currently on display in Gallery 171","11679"
And write it into an HTML table. I only want the rows whose display field (index 3 in the row) says "This artwork is not on display", but I've been having issues with this set of data:
import csv
metlist4 = []
newList = csv.reader(open("v2img_10724_list.csv", 'r'))
for row in newList:
    metlist4.append(row)
artifact_template = """<td>
<div>
<img src= "%(image)s" alt = "artifact" />
<p>Dimensions: %(dimension)s </p>
<p>Accession #: %(accession)s </p>
<p>Display: %(display)s </p>
<p>index2: %(index2)s </p>
</div>
</td>"""
html_list = []
count = 5794
for artifact in metlist4:
    if artifact[3] in ["This artwork is not on display"]:
        artifactinfo = {}
        artifactinfo["image"]=artifact[0]
        artifactinfo["dimension"]=artifact[1]
        artifactinfo["accession"]=artifact[2]
        artifactinfo["display"]=artifact[3]
        artifactinfo["index2"]=count
        count = count + 1
        html_list.append(artifact_template % artifactinfo)
    else:
        pass
f = open("v3display_test.txt", "w")
f.write("\n".join(html_list))
f.close()
I get this error, but only when I run the entire metlist4...
File "/Users/Rose/Documents/workspace/METProjectFOREAL/src/no_display_Met4.py", line 34, in <module>
if artifact[3] in ["This artwork is not on display"]:
IndexError: list index out of range
if I run just a section, for example metlist4[0:500], the error does not occur. Any ideas or suggestions would be greatly appreciated!! Thanks!

There is at least one row that doesn't have a 4th element. Perhaps the line is empty.
Test for the length, and print the row to test:
if len(artifact) < 4:
    print 'short row', artifact
If it is an empty line, just skip it:
if not artifact: continue
You are using a lot of verbose and redundant code; there is no need to build a separate list when you can just loop over the csv.reader() object directly, and there is no need to add an empty else: pass block either.
Idiomatic Python code would be:
import csv
artifact_template = """<td>
<div>
<img src= "%(image)s" alt = "artifact" />
<p>Dimensions: %(dimension)s </p>
<p>Accession #: %(accession)s </p>
<p>Display: %(display)s </p>
<p>index2: %(index2)s </p>
</div>
</td>"""
html_list = []
fields = 'image dimension accession display'.split()
with open("v2img_10724_list.csv", 'rb') as inputfile:
    # the keyword argument is fieldnames; short rows get restval for the missing fields
    reader = csv.DictReader(inputfile, fieldnames=fields, restval='_ignored')
    for count, artifact in enumerate(reader, 5794):
        if artifact and artifact['display'] == "This artwork is not on display":
            artifact["index2"] = count
            html_list.append(artifact_template % artifact)
This uses csv.DictReader() to create the dictionaries per row, a with statement to ensure the file is closed when done, and enumerate() with a start value to track count.
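For anyone on Python 3, the length check can also be folded into a small generator that skips empty and short rows before the display field is looked at. This is only a sketch: rows_not_on_display and its min_fields parameter are names made up for the example, and apart from the first line (taken from the question) the sample rows are invented.

```python
import csv
import io

NOT_ON_DISPLAY = "This artwork is not on display"


def rows_not_on_display(lines, min_fields=4):
    """Yield complete rows whose 4th field says the work is not on display."""
    for row in csv.reader(lines):
        # empty lines come back as [] and short rows have no index 3, so skip both
        if len(row) < min_fields:
            continue
        if row[3] == NOT_ON_DISPLAY:
            yield row


sample = io.StringIO(
    '"10724_artifact11679.jpg","H. 3 1/4 in. (8.26 cm)","10.210.114",'
    '"This artwork is currently on display in Gallery 171","11679"\n'
    '\n'
    '"short.jpg","only two fields"\n'
    '"example.jpg","H. 1 in.","0.0.0","This artwork is not on display","1"\n'
)
matches = list(rows_not_on_display(sample))
print(matches)
```

With a real file, pass the open file object (opened with newline='') in place of the StringIO sample.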

How to create a dynamic table

I am creating a table with the following code, based on input provided in XML. It works fine, but I want to change the code so that the table is created dynamically: if I add more columns, the code should adjust automatically. Currently I have hardcoded that the table contains four columns. Please suggest what changes are needed to achieve this.
Input XML:-
<Fixes>
CR FA CL TITLE
409452 WLAN 656885 Age out RSSI values from buffer in Beacon miss scenario
12345,45678 BT 54567,34567 Test
379104 BT 656928 CR379104: BT doesn’t work that Riva neither sends HCI Evt for HID ACL data nor response to HCI_INQUIRY after entering into pseudo sniff subrating mode.
</Fixes>
Python code
crInfo = [ ]
CRlist = [ ]
CRsFixedStart=xmlfile.find('<Fixes>')
CRsFixedEnd=xmlfile.find('</Fixes>')
info=xmlfile[CRsFixedStart+12:CRsFixedEnd].strip()
for i in info.splitlines():
    index = i.split(None, 3)
    CRlist.append(index)
crInfo= CRlisttable(CRlist)
file.close()
def CRlisttable(CRlist,CRcount):
    #For logging
    global logString
    print "\nBuilding the CRtable\n"
    logString += "Building the build combo table\n"
    #print "CRlist"
    #print CRlist
    CRstring = "<table cellspacing=\"1\" cellpadding=\"1\" border=\"1\">\n"
    CRstring += "<tr>\n"
    CRstring += "<th bgcolor=\"#67B0F9\" scope=\"col\">" + CRlist[0][0] + "</th>\n"
    CRstring += "<th bgcolor=\"#67B0F9\" scope=\"col\">" + CRlist[0][1] + "</th>\n"
    CRstring += "<th bgcolor=\"#67B0F9\" scope=\"col\">" + CRlist[0][2] + "</th>\n"
    CRstring += "<th bgcolor=\"#67B0F9\" scope=\"col\">" + CRlist[0][3] + "</th>\n"
    CRstring += "</tr>\n"
    TEMPLATE = """
<tr>
<td><a href='http://prism/CR/{CR}'>{CR}</a></td>
<td>{FA}</td>
<td>{CL}</td>
<td>{Title}</td>
</tr>
"""
    for item in CRlist[1:]:
        CRstring += TEMPLATE.format(
            CR=item[0],
            FA=item[1],
            CL=item[2],
            Title=item[3],
        )
    CRstring += "\n</table>\n"
    #print CRstring
    return CRstring

Although I have some reservations about providing this, since you seem unwilling to even attempt it yourself, here's an example showing one way it could be done, in the hope that you'll at least be inclined to make the effort to study it and possibly learn a little something from it even though it's being handed to you...
with open('cr_fixes.xml') as file: # get some data to process
    xmlfile = file.read()
def CRlistToTable(CRlist):
    cols = CRlist[0] # first item is header-row of col names on the first line
    CRstrings = ['<table cellspacing="1" cellpadding="1" border="1">']
    # table header row
    CRstrings.append(' <tr>')
    for col in cols:
        CRstrings.append(' <th bgcolor="#67B0F9" scope="col">{}</th>'.format(col))
    CRstrings.append(' </tr>')
    # create a template for each table row
    TR_TEMPLATE = [' <tr>']
    # 1st col of each row is CR and handled separately since it corresponds to a link
    TR_TEMPLATE.append(
        ' <td>{{{}}}</td>'.format(*[cols[0]]*2))
    for col in cols[1:]:
        TR_TEMPLATE.append(' <td>{{}}</td>'.format(col))
    TR_TEMPLATE.append(' </tr>')
    TR_TEMPLATE = '\n'.join(TR_TEMPLATE)
    # then apply the template to all the non-header rows of CRlist
    for items in CRlist[1:]:
        CRstrings.append(TR_TEMPLATE.format(CR=items[0], *items[1:]))
    CRstrings.append("</table>")
    return '\n'.join(CRstrings) + '\n'
FIXES_START_TAG, FIXES_END_TAG = '<Fixes>, </Fixes>'.replace(',', ' ').split()
CRsFixesStart = xmlfile.find(FIXES_START_TAG) + len(FIXES_START_TAG)
CRsFixesEnd = xmlfile.find(FIXES_END_TAG)
info = xmlfile[CRsFixesStart:CRsFixesEnd].strip().splitlines()
# first line of extracted info is a blank-separated list of column names
num_cols = len(info[0].split())
# split non-blank lines of info into list of columnar data
# assuming last col is the variable-length title, comprising remainder of line
CRlist = [line.split(None, num_cols-1) for line in info if line]
# convert list into html table
crInfo = CRlistToTable(CRlist)
print crInfo
Output:
<table cellspacing="1" cellpadding="1" border="1">
<tr>
<th bgcolor="#67B0F9" scope="col">CR</th>
<th bgcolor="#67B0F9" scope="col">FA</th>
<th bgcolor="#67B0F9" scope="col">CL</th>
<th bgcolor="#67B0F9" scope="col">TITLE</th>
</tr>
<tr>
<td>409452</td>
<td>WLAN</td>
<td>656885</td>
<td>Age out RSSI values from buffer in Beacon miss scenario</td>
</tr>
<tr>
<td>12345,45678</td>
<td>BT</td>
<td>54567,34567</td>
<td>Test</td>
</tr>
<tr>
<td>379104</td>
<td>BT</td>
<td>656928</td>
<td>CR379104: BT doesnt work that Riva neither sends HCI Evt for HID ACL data nor
response to HCI_INQUIRY after entering into pseudo sniff subrating mode.</td>
</tr>
</table>

That doesn't look like an XML file - it looks like a tab delimited CSV document within a pair of tags.
I suggest looking into the csv module for parsing the input file, and then a templating engine like jinja2 for generating the HTML.
Essentially - read in the csv, check the length of the headers (gives you number of columns), and then pass that data into a template. Within the template, you'll have a loop over the csv structure to generate the HTML.
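As a rough illustration of that idea without a templating engine, the sketch below reads the rows with the csv module and lets the header row decide how many columns the table gets. It assumes the input really is tab-delimited (that is a guess about the file, as noted above); rows_to_table is a hypothetical helper name, and the sample rows are taken from the Fixes block in the question.

```python
import csv
import io
from html import escape


def rows_to_table(rows):
    """Render rows (first row = column names) as an HTML table of any width."""
    header, body = rows[0], rows[1:]
    out = ["<table>"]
    out.append("<tr>" + "".join("<th>%s</th>" % escape(h) for h in header) + "</tr>")
    for row in body:
        # pad short rows so every row has as many cells as the header
        cells = row + [""] * (len(header) - len(row))
        out.append("<tr>" + "".join("<td>%s</td>" % escape(c) for c in cells) + "</tr>")
    out.append("</table>")
    return "\n".join(out)


# assumed tab-delimited input; the rows come from the question's Fixes block
text = (
    "CR\tFA\tCL\tTITLE\n"
    "409452\tWLAN\t656885\tAge out RSSI values from buffer in Beacon miss scenario\n"
    "12345,45678\tBT\t54567,34567\tTest\n"
)
rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
table = rows_to_table(rows)
print(table)
```

Adding a fifth column to the input then only requires a fifth name in the header line; nothing in the function refers to a fixed column count.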
