Jupyter Notebook HTML export showing </div> and not formatting properly - python

My wife recently did a boot camp program for data analytics. She is trying to push her projects to GitHub for display. However, some of the Jupyter notebooks are not exporting to HTML properly or rendering properly on GitHub. Below are screenshots and code snippets.
I can't identify where I need to make changes to resolve this while still maintaining the comments and formatting. Most of her projects have this issue. I noticed the issue seems to start at but I have not been able to identify the specific issue. When I load up the HTML, I do notice that a lot of the conversion gets messed up, like so:
<p></div></p>
So it looks like some of the < /div > are getting messed up during the export/conversion to proper HTML.
When loaded up into the latest version of Jupyter Notebook...
jupyter core : 4.7.1
jupyter-notebook : 6.4.10
The display looks like this
However, when I export as HTML, or upload that to GitHub, it looks like this:
Here is the HTML code, as it looks in the Jupyter Notebook:
# <font color='#32cd32'><b><u> Reviewer Comment </u></b></font>
<div class="alert alert-success" >
**Hello! I am Larchenko Ksenia and I'm glad to review your project**!
You can find my comments in green, yellow and red boxes like these:
</div>
<div class="alert alert-success" >
**Success:** green color shows that everything is done perfectly;
</div>
<div class="alert alert-warning" >
**Remark & Recommendation:** yellow color highlights something to pay attention for;
</div>
<div class="alert alert-danger" >
**Needs fixing:** if the color is red, please rework.
</div>
<div class="alert alert-success" >
**Please, don't remove my comments:)**
</div>
And if I load up the raw ipynb and look at that section, it looks like this:
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Hello, my name is Ivan Alexeev and I am going to review your project.\n",
"\n",
"\n",
"There may be some shortcomings in the work that I will ask you to eliminate, you fix them and I check your decisions. You can find my comments in <font color='green'>green</font>, <font color='orange'>orange</font> or <font color='red'>red</font> boxes like this:\n",
"\n",
"\n",
"<div class=\"alert alert-success\" style=\"box-shadow: 4px 4px 4px\">\n",
"Success: if everything is done successfully\n",
"</div>\n",
"\n",
"\n",
"<div class=\"alert alert-warning\" style=\"box-shadow: 4px 4px 4px\">\n",
"Remark: if I can give some recommendations or additional information\n",
"</div>\n",
"\n",
"\n",
"<div class=\"alert alert-danger\" style=\"box-shadow: 4px 4px 4px\">\n",
"Need fixing: if the block requires some corrections. Work can't be accepted with the red comments\n",
"</div>\n",
"\n",
"\n",
"Thank you for taking time to complete this project, I appreciate the amount of work you've done! There are some issues that you need to work on, but overall it is a great start! \n",
"\n",
"\n",
"Please, don't delete my comments) \n"
]
}
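Since an .ipynb file is plain JSON, one way to narrow down which cells are affected is to list every markdown cell whose source contains a closing div tag. The following is only a diagnostic sketch, not a fix: the helper name markdown_cells_containing and the tiny in-memory notebook are made up for illustration, and a real file would be loaded with json.load instead.

```python
import json


def markdown_cells_containing(notebook, needle="</div>"):
    """Return (cell index, source text) for markdown cells containing `needle`."""
    hits = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        # The notebook JSON stores `source` as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        if needle in source:
            hits.append((index, source))
    return hits


# Hypothetical stand-in for the contents of a real .ipynb file.
sample_ipynb = r'''
{
 "cells": [
  {"cell_type": "code", "metadata": {}, "source": ["print('hi')\n"],
   "outputs": [], "execution_count": null},
  {"cell_type": "markdown", "metadata": {}, "source": [
    "<div class=\"alert alert-success\">\n",
    "Success: if everything is done successfully\n",
    "</div>\n"
  ]}
 ],
 "metadata": {}, "nbformat": 4, "nbformat_minor": 5
}
'''

notebook = json.loads(sample_ipynb)
hits = markdown_cells_containing(notebook)
for index, source in hits:
    print(index, repr(source))
```

Comparing the cells this reports against the ones that render badly may show whether the stray </div> always comes from the same kind of cell.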


How to add watermarks in all pages of Odoo Reports?

With the code below, the watermark only shows on the first page. I want to show the watermark on all pages.
<div class="watermark_report">
<img t-att-src="'data:image/png;base64,'+ doc.company_id.report_header_logo"/>
</div>
You already have the answer here:
Add this code for watermark in header of external layout. Its external id is report.external_layout_header:
<style>
.watermark {
position: absolute;
opacity: 0.25;
z-index: 1000;
transform: rotate(300deg);
-webkit-transform: rotate(300deg);
width: 150%;
}
</style>
<div class="watermark">
<p>WATERMARK</p>
<img t-att-src="'/module_name/static/src/img/image_name.png'" />
</div>
I have added an image stored as a file. If you are going to use a static image, I think this is the most appropriate way.
Note: Instead of using the CSS opacity property, you can use a PNG image with opacity and a transparent background. That's what I had to do.
Note 2: I am afraid this does not work in Odoo v11.
Update
This solution is only valid if you want to add the same image to all the reports.
There is a module developed by the OCA to add watermarks to the reports. A field appears in all reports where an image (A4 size) can be added. The module name is report_qweb_pdf_watermark

KO: Error when parsing JSON

I created a JSBin that demonstrates the problem: http://jsbin.com/kukehoj/1/edit?html,js,console,output
I'm creating my first REST-powered website. The backend is in Python (Django REST Framework), and seems to be working fine. I'm trying to make the front-end get the comments for the posts, but it's not working.
HTML Imports
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.1.1/jquery.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/knockout/3.4.1/knockout-min.js"></script>
scripts.js
function Comment(data) {
    this.body = ko.observable(data.responseText)
}
function Post(data) {
    this.title = ko.observable(data.title)
    this.body = ko.observable(data.body)
    var self = this;
    self.comments = ko.observableArray([])
    self.comments(($.map(data.comments, function(link) { // Map the data from
        return $.getJSON(link, function(data) { return new Comment(data)}) //These requests
    })))
}
function PostViewModel() {
    // Data
    var self = this;
    self.posts = ko.observableArray([])
    // Get the posts and map them to a mappedData array.
    $.getJSON("/router/post/?format=json", function(allData) {
        var mappedData = $.map(allData, function(data) { return new Post(data)})
        self.posts(mappedData)
    })
}
ko.applyBindings(new PostViewModel());
Server data:
[{ "title":"-->Title here<--",
"body":"-->Body here<--",
"comments":[
"http://127.0.0.1:8000/router/comment/6/?format=json",
"http://127.0.0.1:8000/router/comment/7/?format=json",
"http://127.0.0.1:8000/router/comment/8/?format=json",
"http://127.0.0.1:8000/router/comment/9/?format=json"]
}]
where each of the links leads to:
{"body":"-->Body here<--"}
index.html
<div class="col-lg-7" data-bind="foreach: { data: posts, as: 'posts' }">
<h3 data-bind="text: title"></h3>
<p data-bind="text: body"> </p>
<span data-bind="foreach: { data: comments(), as: 'comments' }">
<p data-bind="text: comments.body"></p>
</span>
</div>
(There is a lot more HTML, but I removed the irrelevant parts.)
Everything is working fine, except that the comments seem to be in the wrong format.
The chrome console shows JSON "responseText" bound to each of the comment object values.
I'm sorry if this is a stupid question, but I have tried everything - but it doesn't work. (I'm a noob)
There is nothing wrong with the sample code you provided, except that you have this.body = ko.observable(data.responseText) while the commentData object in your sample does not contain a responseText property. If you replace the commentData object with var commentData = {"responseText":"-->Body here<--"}, it works.
Note: this part
<span data-bind="foreach: { data: comments(), as: 'comments' }">
<p data-bind="text: comments.body"></p> // comments.body => body
</span>
in your question is wrong, but you have it correct in your sample code. It should be
<span data-bind="foreach: { data: comments(), as: 'comments' }">
<p data-bind="text: body"></p>
</span>
Here is a working version of your sample: https://jsfiddle.net/rnhkv840/26/
I assume you are using Django Rest Framework, so the JSON structure you get for your posts is done automatically by your serializer based on your model fields.
Back to the frontend, I have not used knockout js before, but what you require is to load the comments using another controller. Either you do it one by one using the links provided by your main resource (this can result in lots of queries sometimes), or you create a filter on your comments endpoint which will allow you to retrieve comments for a specific post.
Ever considered using the Django REST framework? It can help you serialize all your models with a simple viewset. Check out the docs.
So I found the actual problem. The way the JavaScript read the data from the server meant that, since there was only one value for each comment, the data property of a comment was the variable storing the body of the comment, not data.body.

Google Fusion Maps Info Window Dynamic Templating

I'm doing some web scraping with Python and the last step is to use Google Fusion maps, but as somebody who has never touched any CSS styling before, I have no idea how to do something probably incredibly simple: hide a column title in the info window if it's blank. Not all the data have entries in my Amenities column, so I would like that to be gone from the info window if it's blank.
I've read this (https://support.google.com/fusiontables/answer/3081246?hl=en&ref_topic=2575652), but it's complete gibberish to me at this stage.
This is the default HTML they provide for the info window (with my data):
<div class='googft-info-window'>
<b>Location:</b> {Location}<br>
<b>Movie Title:</b> {Movie Title}<br>
<b>Date:</b> {Date}<br>
<b>Amenities:</b> {Amenities}
</div>
This shot in the dark didn't work:
<div class='googft-info-window'>
<b>Location:</b> {Location}<br>
<b>Movie Title:</b> {Movie Title}<br>
<b>Date:</b> {Date}<br>
<b>{if Amenities.value}
Amenities:
{/if}Amenities:</b> {Amenities}<br>
</div>
This question isn't related to CSS, try this:
{template .contents}
<div class='googft-info-window'>
<b>Location:</b> {$data.value.Location}<br/>
<b>Movie Title:</b> {$data.value['Movie Title']}<br/>
<b>Date:</b> {$data.value.Date}<br/>
{if $data.value.Amenities}
<b>Amenities:</b>{$data.value.Amenities}<br/>
{/if}
</div>
{/template}

Programmatically delete everything before a HTML node?

I am trying to create a corpus of data from a set of .html pages I have stored in a directory.
These HTML pages have lots of info I don't need.
This info is all stored before the line
<div class="channel">
How can I programmatically remove all of the text before
<div class="channel">
in every HTML file in a folder?
Bonus question for a 50-point bounty:
How do I programmatically remove everything AFTER, for example,
<div class="footer">
?
So if my index.html was previously:
<head>
<title>This is bad HTML</title>
</head>
<body>
<h1> Remove me</h1>
<div class="channel">
<h1> This is the good data, keep me</h1>
<p> Keep this text </p>
</div>
<div class="footer">
<h1> Remove me, I am pointless</h1>
</div>
</body>
After my script runs, I want it to be:
<div class="channel">
<h1> This is the good data, keep me</h1>
<p> Keep this text </p>
</div>
This is a bit heavy on memory usage, but it works. Basically you open up the directory, get all ".html" files, read them into a variable, find the split point, store the before or after in a variable, and then overwrite the file.
There are probably better ways to do this, but it works.
import os

marker = '<div class="channel">'

# Collect every .html file in the current directory.
files = [name for name in os.listdir(".") if name.endswith('.html')]

for fileName in files:
    with open(fileName) as file:
        content = file.read()
    loc = content.find(marker)
    if loc == -1:
        continue  # marker not found: leave this file untouched
    newContent = content[loc:]  # keep the marker and everything after it
    with open(fileName, 'w') as file:
        file.write(newContent)
If you wanted to just keep up to a point:
newContent = content[:loc]  # the slice already stops before index loc, so no -1 is needed
Note that the things you're searching should be kept in a variable, and not hardcoded.
Also, this won't work recursively for file/folder structures, but you can find out how to modify it to do that very easily.
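The recursive variant mentioned above could look roughly like this. It is a sketch under assumptions rather than a drop-in script: the function names trim_html and trim_tree are invented here, the files are assumed to be UTF-8, and the footer marker is taken from the bonus question. It keeps only the text from the channel div up to (but not including) the footer div.

```python
from pathlib import Path

START_MARKER = '<div class="channel">'
END_MARKER = '<div class="footer">'


def trim_html(text, start=START_MARKER, end=END_MARKER):
    """Keep the text from `start` up to, but not including, `end`."""
    begin = text.find(start)
    if begin == -1:
        return text  # start marker missing: return the text unchanged
    stop = text.find(end, begin)
    if stop == -1:
        stop = len(text)  # no end marker: keep everything to the end
    return text[begin:stop]


def trim_tree(root="."):
    """Rewrite every .html file under `root`, including subfolders."""
    for path in Path(root).rglob("*.html"):
        original = path.read_text(encoding="utf-8")  # encoding is an assumption
        trimmed = trim_html(original)
        if trimmed != original:
            path.write_text(trimmed, encoding="utf-8")
```

Calling trim_tree("pages") would process a hypothetical pages folder and everything below it. Because the files are overwritten in place, it is worth trying this on a copy of the data first.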
Removing everything above and everything below means the only thing left should be this section:
<div class="channel">
<h1> This is the good data, keep me</h1>
<p> Keep this text </p>
</div>
Rather than thinking about removing the unwanted parts, it would be easier to just extract the wanted part.
You can easily extract the channel div using a DOM-based XML parser.
You've not mentioned a language in the question - the post is tagged with python so this answer might still be out of context, but I'll give a php solution that could likely easily be rewritten in another language.
$html='....'; // your page
$search='<div class="channel">';
$components = explode($search,$html); // [0 => before the string, 1 => after the string]
$result = $search.$components[1];
return $result;
To do the reverse is fairly easy too; simply take the value of $components[0] after altering $search to your <div class="footer"> value.
If you happen to have the $search string cropping up multiple times:
$html='....'; // your page
$search='<div class="channel">';
$components = explode($search,$html); // [0 => before the string, 1 => after the string]
unset($components[0]);
$result = $search.implode($search,$components);
return $result;
Someone who knows python better than I do feel free to rewrite and take the answer!
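A rough Python counterpart of the PHP above might look like this. It is a sketch, not a line-by-line port: it uses str.partition instead of explode, the function names keep_from and keep_before are invented for illustration, and returning the input unchanged when the marker is missing is a choice made here, not something the PHP does.

```python
def keep_from(html, search):
    """Return `search` and everything after its first occurrence."""
    _before, found, after = html.partition(search)
    # `found` is empty when `search` does not occur in `html`.
    return found + after if found else html


def keep_before(html, search):
    """Return everything before the first occurrence of `search`."""
    before, _found, _after = html.partition(search)
    return before


page = 'junk<div class="channel">good</div><div class="footer">bad</div>'
body = keep_before(keep_from(page, '<div class="channel">'), '<div class="footer">')
print(body)
```

Because partition splits only at the first occurrence, later repeats of the marker stay in the result, which covers the multiple-occurrence case without the unset/implode step.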

Extract artist and music From text (regex)

I have written the following regex, but it's not working. Can you please help me? Thank you :-)
track_desc = '''<img src="http://images.raaga.com/catalog/cd/A/A0000102.jpg" align="right" border="0" width="100" height="100" vspace="4" hspace="4" />
<p>
</p>
<p> Artist(s) David: <br/>
Music: Ramana Gogula<br/>
</p>'''
rx = "<p><\/p><p>Artist\(s\): (.*?)<br\/>Music: (.*?)<br\/><\/p>"
m = re.search(rx, track_desc)
Output Should be:
Artist(s) David
Music: Ramana Gogula
You were ignoring the whitespace:
<p>[\s\n\r]*Artist\(s\)[\s\n\r]*(.*?)[\s\n\r]*:[\s\n\r]*<br/>[\s\n\r]*Music:[\s\n\r]*(.*?)<br/>[\s\n\r]*</p>
Output is:
[1] => "David"
[2] => "Ramana Gogula"
(note that your regex didn't match the Artist(s) and Music: prefixes either)
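For reference, the whitespace-tolerant pattern can be tried in Python along these lines. This is a lightly simplified sketch: [\s\n\r]* is written as \s*, which already covers newlines and carriage returns, and an extra \s* is allowed before the second <br/>. The variable names are illustrative.

```python
import re

track_desc = '''<img src="http://images.raaga.com/catalog/cd/A/A0000102.jpg" align="right" border="0" width="100" height="100" vspace="4" hspace="4" />
<p>
</p>
<p> Artist(s) David: <br/>
Music: Ramana Gogula<br/>
</p>'''

# Raw strings keep the backslashes intact; \s* absorbs spaces and line breaks.
pattern = re.compile(
    r"<p>\s*Artist\(s\)\s*(.*?)\s*:\s*<br/>"
    r"\s*Music:\s*(.*?)\s*<br/>\s*</p>"
)

match = pattern.search(track_desc)
if match:
    artist, music = match.groups()
    print("Artist(s):", artist)
    print("Music:", music)
```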
However for production code I would not rely on such rather clumsy regex (and equally clumsily formatted HTML source).
Seriously though, ditch the idea of using regex for this if you aren't the slightest bit familiar with regex (which is how it looks). You're using the wrong tool and a badly formatted data source. Parsing HTML with regex is wrong in 9 out of 10 cases (see @bgporter's comment link) and doomed to fail. Apart from that, HTML is hardly ever an appropriate data source (unless there really, really is no alternative source).
import lxml.html as lh
import re
track_desc = '''
<img src="http://images.raaga.com/catalog/cd/A/A0000102.jpg" align="right" border="0" width="100" height="100" vspace="4" hspace="4" />
<p>
</p>
<p> Artist(s) David: <br/>
Music: Ramana Gogula<br/>
</p>
'''
tree = lh.fromstring(track_desc)
print(re.findall(r'Artist\(s\) (.+):\s*\nMusic: (.*\w)', tree.text_content()))
I see a few errors:
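the regex does not allow for line breaks: the text spans several lines, so the pattern has to match the newlines between the tags, for example with \s+ (note that flags=re.MULTILINE only changes how ^ and $ behave)
spaces are not taken into account
Artist(s) is not followed directly by :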
As the web page is rather strangely laid out, relying on a regex might be error-prone, and I wouldn't advise using it extensively.
Note, following seems to work:
rx='Artist(?:\(s\))?\s+(.*?)\<br\/>\s+Music:\s*(.*?)\<br'
print ("Art... : %s && Mus... : %s" % re.search(rx, track_desc,flags=re.MULTILINE).groups())
