I am using IronPython 2.7.9.0 on Grasshopper and Rhino to web scrape data from a specific widget on this link: https://vemcount.app/embed/widget/uOCRuLPangWo5fT?locale=en
The code I am using is as follows:
import urllib
import os
url = "https://vemcount.app/embed/widget/uOCRuLPangWo5fT?locale=en"  # the widget link above
web = urllib.urlopen(url)
html = web.read()
web.close()
The html output contains all the HTML code from this link except for the parts I need. When I inspect the page in Chrome, the elements I need have a "flex" button next to them, as in the following image.
image that summarizes the issue I am facing
Anything nested under a line with a "flex" button does not appear in the scraping result and comes back as a blank line.
This is the output html I get:
<!DOCTYPE html>
<html lang="en">
<head>
<title>Central Library - Duhig North & Link</title>
<meta charset="utf-8">
<meta name="google" content="notranslate">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<meta name="csrf-token" content="">
<link rel="stylesheet" href="/build/app.css?id=2fefc4f9faa59eebcb4b">
<link rel="stylesheet" href="https://vemcount.app/fonts/hamburg_serial/stylesheet.css">
<style>
#embed, #main {
height: 100vh;
}
.vue-grid-item {
margin-bottom: 0px !important;
}
.powered_by {
position: absolute;
bottom: 0px;
right: 0px;
background-color: rgba(0, 0, 0, 0.18);
color: #fff;
padding: 2px 5px;
font-size: 9px;
}
.powered_by:hover, .powered_by:link, .powered_by:visited {
text-decoration: none;
display: none;
}
.dashboard-widget .relative {
overflow: hidden !important;
}
</style>
<script>
window.App = {"socketAppKey":"eJSkWUHWpwolvjVcT2ZxUJZXnDpxtRljdZl74fKr","socketCluster":null,"socketHost":"websocket.vemcount.com","socketPort":443,"socketSecurePort":443,"socketDisableStats":true,"socketEncrypted":true,"locale":"en","settings":[{"name":"type","value":"{\"count_in\":\"column\"}"},{"name":"period","value":"[\"yesterday\"]"},{"name":"period_step","value":"hour"},{"name":"hide_datalabel","value":"0"},{"name":"currency","value":"AUD"},{"name":"show_days","value":"[0,1,2,3,4,5,6]"},{"name":"show_months","value":"[1,2,3,4,5,6,7,8,9,10,11,12]"},{"name":"show_hours_from","value":"00:00"},{"name":"show_hours_to","value":"23:45"},{"name":"data_heatmap","value":"blue"},{"name":"weather_metrics","value":"0"},{"name":"first_day_of_week","value":"1"},{"name":"time_format24","value":"time_format24"},{"name":"date_time_format","value":"2"},{"name":"number_grouping","value":","},{"name":"number_decimal","value":"."},{"name":"opening_hours_overlap","value":"0"},{"name":"data_output","value":"count_in"}],"sound":null};
</script>
<script src="/build/lang/en.js?v=2022.04.4"></script>
</head>
<body class="bg-transparent">
<main id="main">
<div id="embed" >
<div class="w-full h-full vue-grid-item cssTransforms" style="position: absolute;">
<live-inside :embedded="true" :widget="{"id":81438,"pane_id":4005,"title":"Central Library - Duhig North & Link","description":"Live occupancy \/ Seating capacity","x":0,"y":0,"w":2,"h":1,"bg_color":"red","text_color":"black","type":"live-inside","secret":"uOCRuLPangWo5fT","internal":"VRg4JTIRrtJ7Pwg","embeddable":1,"content":{"target":1100,"bidirectional":true,"target_enable":true,"prettify":false,"target_type":"donut","target_donut_hide_metric":false,"target_donut_target_hide_label":false,"target_visual_inside_text":null,"target_visual_available_text":null,"target_screen_ok_title":null,"target_screen_ok_text":null,"target_screen_ok_color":"#38A169","target_screen_ok_image":-1,"target_screen_warning_title":null,"target_screen_warning_pe</live-inside>
</div>
</div>
</main>
<a title=" Vemco Group A/S " class="powered_by" target="_blank"
href="http://vemcount.com">Powered by
<b>vemcount.com</b>
</a>
<script src="/build/manifest.js?id=7f2e9aa3431c681a4683"></script>
<script src="/build/vendor.js?id=19867aae3b960cda7d79"></script>
<script src="/build/embed.js?id=2ff0173dd78c5c1f99c6"></script>
</body>
</html>
As you can see, it is missing some lines, namely the ones that have a "flex" button next to them. (By the way, I have shortened the code so I don't reach the 30,000-character limit.)
I am interested in the number 311, which changes every 2 seconds on the live link and can be found in the HTML code as:
<span>311</span>
Is there a way I can get this value, as well as any other value, using IronPython?
P.S. I am a noob in actual coding, that's why I might have issues with terminologies, but have a fair background in visual scripting. Your help is much appreciated. Thanks.
In case you have the same question or are struggling with dynamic web scraping: the values are rendered by JavaScript after the page loads, so a plain HTTP fetch never contains them. You have to use CPython and install a browser-automation tool such as Playwright, or BeautifulSoup + Selenium.
I used Playwright, which is far more straightforward and has a very welcome inner_html() function that reads straight from the dynamically rendered "flex" HTML. Here is the code for reference.
# Part of the help to write this script came from https://stackoverflow.com/questions/64303326/using-playwright-for-python-how-do-i-select-or-find-an-element
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(slow_mo=1000)
    page = browser.new_page()
    page.goto('https://vemcount.app/embed/widget/uOCRuLPangWo5fT')
    central = page.query_selector("p.w-full span")
    print({'central': central.inner_html()})
    browser.close()
Next, I am trying to run the .py script remotely from Grasshopper through a batch file and read its output from a TXT or CSV file inside Grasshopper.
If there is a better way I am more than happy to hear your suggestions.
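For the hand-off through a file described above, here is a minimal sketch. It assumes the CPython script appends one "timestamp,value" line per reading and the Grasshopper side only needs the most recent value. The names append_reading and read_latest are hypothetical helpers, not part of Playwright or Grasshopper. Plain string handling is used instead of the csv module on the assumption that this keeps the reading half usable from IronPython 2.7 too.

```python
from datetime import datetime


def append_reading(csv_path, value):
    """Append one 'timestamp,value' line to the file (hypothetical helper)."""
    with open(csv_path, "a") as handle:
        handle.write("%s,%s\n" % (datetime.now().isoformat(), value))


def read_latest(csv_path):
    """Return the value from the last non-empty line, or None if there is none."""
    with open(csv_path) as handle:
        lines = [line for line in handle.read().splitlines() if line.strip()]
    if not lines:
        return None
    # Split only on the first comma so the value itself may contain commas.
    return lines[-1].split(",", 1)[1]
```

In the Playwright script, the value passed to append_reading would be central.inner_html(); the Grasshopper component would call read_latest on the same path.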
Yours,
A Beginner in Python. :)
My CSS file is at 'main/static/css/style.css' and my pages are in 'main/templates/main/pages.html'.
But the CSS is not applied on some pages; for example, the h2 tag does not pick up its style.
base.html
{% load static %} <link rel="stylesheet" href="{% static 'css/style.css' %}">
privacy.html
{% extends 'main/base.html' %}
{% block content %}
<div class="main">
<h1><b>Privacy</b></h1>
<div style="height:35px" aria-hidden="true"></div>
<div class="text_para">
<h2 class="h2_privacy">Mission</h2>
<p>paragraph</p>
.......
style.css
.main {
-ms-flex: 70%; /* IE10 */
flex: 70%;
background-color: white;
padding: 20px;
}
.text_para {
margin-right: 5%;
text-align: justify;
}
/*heading h2 for privacy page*/
.h2_privacy {
color: black;
font-family: 'Lucida Calligraphy', sans-serif;
}
First, make sure the CSS file is actually loading (for example, check the Network tab of the browser dev tools).
Second, make sure the element is in the page and that its class matches the selector in the CSS file.
Check these two things and update your question if you still can't find the problem.
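The second check can also be done roughly in code. Below is a minimal sketch, assuming you have the rendered HTML and the stylesheet as strings. The names ClassCollector and classes_without_rules are made up for illustration, and the regular expression is only a heuristic for finding class selectors, not a real CSS parser.

```python
import re
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every class name used in an HTML document."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def classes_without_rules(html_text, css_text):
    """Return class names used in the HTML that the CSS text never mentions."""
    collector = ClassCollector()
    collector.feed(html_text)
    # Heuristic: anything that looks like ".name" counts as a class selector.
    defined = set(re.findall(r"\.([A-Za-z_][\w-]*)", css_text))
    return collector.classes - defined
```

An empty result means every class in the page has at least one rule in the stylesheet, which points back at the first check (is the file really being loaded, and is it the current version).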
Sorry guys, I was inactive, and thank you.
And yes, I found the answer on Stack Overflow itself. I never close my tabs, so I think a normal refresh kept serving the cached stylesheet instead of reloading the page completely.
I just had to press Ctrl + F5 to bypass the cache, and the CSS loaded with the updates.
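A hard refresh only fixes your own browser. One common way to avoid stale stylesheets in general is to put a version token in the URL so that it changes whenever the file changes. Here is a minimal sketch of that idea, assuming you can read the CSS file from disk. versioned_url is a hypothetical helper, not a Django function. Django's ManifestStaticFilesStorage does something similar by putting a content hash in the file name.

```python
import hashlib


def versioned_url(url, file_path):
    """Append a short content hash to url so edits to the file change the URL."""
    with open(file_path, "rb") as handle:
        digest = hashlib.sha256(handle.read()).hexdigest()[:8]
    separator = "&" if "?" in url else "?"
    return "%s%sv=%s" % (url, separator, digest)
```

Because the browser caches by URL, a changed token makes it fetch the stylesheet again without anyone having to press Ctrl + F5.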
I am using Flask and therefore Jinja2 to dynamically create a web page.
When I use the render_template function the resulting web pages pick up my css styles normally and do as I expect with some strange exceptions.
For example, the following css and html code is working 100% as intended:
CSS:
.intro {
font-family: monospace;
font-size: 24px;
color: #455934;
padding-top: 30px;
padding-right: 50px;
padding-bottom: 5px;
padding-left: 50px;
}
HTML:
<p class ='intro'>
Hello
</p>
However, when using a Jinja2 template:
<p class ='intro'>
{{title}}
</p>
The css style for this class is ignored.
This does not happen with any other style in the same CSS file and template. For reference, the whole template looks like this, and the styles for the div containers, background, etc. work as intended:
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<link rel= "stylesheet" type= "text/css" href= "{{ url_for('static',filename='gemuese_style.css') }}">
<title>{{title}}</title>
</head>
<body>
<div class="outer">
<div class="middle">
<div class="inner">
<p class ='intro'>
{{title}}
</p>
<ul>
<li>{{result}}</a></li>
</ul>
</div>
</div>
</div>
</body>
</html>
Edit:
Some testing has shown me that whether the style works seems to depend on the variable Jinja renders. The template is rendered from the output of different functions, each of which returns a list containing a string and another list. The string is then used to fill in the {{title}} part where the CSS is failing.
For some functions it does work, but I am not sure why, because there is no difference in the return format.
Edit2:
Look at the difference between this rendered template and this one.
I am using django-easy_pdf to render PDFs for my reports and would like to know how to display a page footer.
In django-easy_pdf's source code, this piece of code is used to display the page number:
<div id="footerContent">
{%block page_foot%}
<pdf:pagenumber />
{%endblock%}
</div>
What I would like to know is:
1. How to display the footer properly
2. How to start the page numbering at 1
With the code as I copied it, the output does not appear as a footer and the page numbering starts at 0.
What am I missing?
UPDATE
I tried the following code from here, but I can't make it work, although it looks useful:
<html>
<head>
<style>
.footer { position: fixed; bottom: 0px; }
.pagenum:before { content: counter(page); }
</style>
</head>
<body>
<div class="footer">Page: <span class="pagenum"></span></div>
</body>
</html>
UPDATE 2
I now know what I did wrong: I was missing the @page CSS rule, which is why it was not working. I only had the @frame footer block.
The correct CSS:
@page {
    size: {{ pagesize }};
    margin: 1cm;
    @frame footer {
        -pdf-frame-content: footerContent;
        bottom: 0cm;
        margin-left: 18cm;
        margin-right: 0cm;
        height: 1cm;
    }
}
Then just use the footer markup as usual (the first snippet).
To display the footer correctly, make sure the CSS styles in your template include:
@frame footer {
    -pdf-frame-content: footerContent;
}
-pdf-frame-content should point to the id of your footer container.
You can then add the following code to your template:
{%block page_foot%}
<div style="display: block;margin: 0 auto;text-align: center;">
page: <pdf:pagenumber />
</div>
{%endblock%}
I am working through Learn Python The Hard Way and am currently on exercise 51, in which the student is asked to build some basic web applications using the web.py framework. The first study drill is to improve the quality of the HTML layouts so that the applications are built on well-formatted pages. I want to make a template layout that applies to all pages in the application and uses a CSS stylesheet for the formatting. I would like the CSS to be external rather than inside the HTML file.
For some reason, no matter how I format the path to 'main_layout.css', I cannot get the formatting changes to take effect. I have tried the path with and without a leading '/'. I have tried moving the CSS file into other folders (the root folder and the templates folder). I tried emptying my browser cache in case that was causing an issue. I tried accessing the 'static' directory and the 'main_layout.css' file directly through my browser, which worked in both cases: the file is there, but the page does not pick up the formatting from 'main_layout.css'.
I googled this issue, checked the Google group for web.py, and searched Stack Overflow. In all cases, the answers were about the path to the CSS file, which I believe I have fully explored and attempted to fix, to no avail. I have tried every suggestion I could find on the web, and I am stumped. My project layout and code are as follows:
/bin
    app.py
/ex51
/static
    main_layout.css
/templates
    hello_form.html
    index.html
    layout.html
/tests
app.py is written as follows:
import web

urls = (
    '/hello', 'Index'
)

app = web.application(urls, globals())
render = web.template.render('templates/', base="layout")

class Index(object):
    def GET(self):
        return render.hello_form()

    def POST(self):
        form = web.input(name="Nobody", greet="Hello")
        greeting = "%s, %s" % (form.greet, form.name)
        return render.index(greeting=greeting)

if __name__ == "__main__":
    app.run()
index.html written as follows:
$def with (greeting)

$if greeting:
    I just wanted to say <em style="color: green; font-size: 2em;">$greeting</em>
$else:
    <em>Hello</em>, world!
hello_form.html written as follows:
<h1>Fill out this form</h1>
<form action="/hello" method="POST">
A Greeting: <input type="text" name="greet">
<br/>
Your Name: <input type="text" name="name">
<br/>
<input type="submit">
</form>
main_layout.css written as follows:
html, body {
height: 100%;
}
.container {
width:800px;
}
.container #body_container {
margin: 10px auto;
padding-bottom: 50px;
min-height: 100%;
text-align: center;
overflow: auto;
}
.container #footer_container {
margin-top: -50px;
height: 50px;
}
and layout.html:
$def with (content)
<html>
<head>
<link rel="stylesheet" type="text/css" href="/static/main_layout.css" />
<title>This is My Page</title>
</head>
<body>
<div class="container" id="body_container">
$:content
</div>
<div class="container" id="footer_container">
Hello World
</div>
</body>
</html>
Thanks in advance for your help.
Edit: One additional bit of information--I am running this script from the PowerShell of my Windows 7 PC, and accessing it at http://localhost:8080/hello through Google Chrome.
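The problem is the octothorpe (#) in your selectors. In CSS, # is not a comment marker as it is in Python (which may be where the confusion comes from); it starts an ID selector. Because of the space, .container #body_container only matches an element with the id body_container nested inside an element with the class container. In layout.html the class and the id are on the same div, so the rule never matches. To comment something out in a CSS document, use /* ... */. Commenting out the ID leaves a plain .container rule, which does match. Like this: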
.container /*body_container*/ {
margin: 10px auto;
padding-bottom: 50px;
min-height: 100%;
text-align: center;
overflow: auto;
}