Python3.7 how to extract numerical value from list - python

I am doing some CTFs and I made this script:
import requests
page = requests.get("http://ctf.slothparadise.com/about.php").text
p_split = page.split("<p>")
p2_split = p_split[3].split("</p>")
print(p2_split)
My output from this is:
['You are the 135181th visitor to this page.\n Every thousandth visitor gets a prize.', '\n </div> <!-- /container -->\n </body>\n</html>\n']
How can I extract the value 135181 out of this list?

You can use a regex. This is especially easy because the page appears to keep the 'th' suffix even when the number ends in 1 or 2:
import re
import requests
page = requests.get("http://ctf.slothparadise.com/about.php").text
print(re.findall(r"\d+(?=th)", page))
output:
['135335']

To get this working for any value adapt this:
my_split = ['You are the 135181th visitor to this page.\n Every thousandth visitor gets a prize.', '\n </div> <!-- /container -->\n </body>\n</html>\n']
visitor_num = my_split[0].split('You are the ', 1)[1].split('th')[0]
print(visitor_num)
Actually, I can see similar solutions in the comments as well... hope this works for you!
For future reference, look up the split function and indexing; it's something you will definitely use again.
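For reference, a minimal sketch of the regex approach that does not depend on the suffix always being "th". It assumes the sentence keeps the wording shown in the question; the sample string and the name visitor_number are illustrative only.

```python
import re

# Sample text copied from the question's output; the live page may differ.
text = "You are the 135181th visitor to this page.\n Every thousandth visitor gets a prize."

# Capture the digits that sit between "You are the " and an ordinal suffix.
match = re.search(r"You are the (\d+)(?:st|nd|rd|th) visitor", text)
visitor_number = int(match.group(1)) if match else None
print(visitor_number)
```

Anchoring on the surrounding words avoids picking up other numbers that happen to be followed by "th" elsewhere on the page.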

Related

How to use Python Flask variables in Javascript

I have a Python variable whose value is a string of text and would like to edit that value via Javascript.
I have no idea how to go about doing this.
Attempts:
function changeValue(val) {
val = 'new text';
}
<textarea placeholder="some text">{{ changeValue({{ result }}) }}</textarea>
<textarea placeholder="some text">
{{ result }}
</textarea>
What I want: I have some text (result) being added and would like to check if the text is empty. If so, I want to show the placeholder text.
The issue: Although I can check if the value is empty, when I try to print that result out it reads none
Thanks to all!
You do not need to call the JavaScript function from the HTML file. There are several approaches you can take:
1. Store the variable in HTML metadata:
<meta id="result_var" data-result="{{ result }}">
And then get the data in JavaScript (data-* attributes are read through dataset, not value):
let result = document.getElementById("result_var").dataset.result;
2. Keep the variable in the tag where it's supposed to be and get it from there in JavaScript:
<textarea placeholder="some text" id="result-var">{{ result }}</textarea>
And then get it in JavaScript:
let result = document.getElementById("result-var").value;
3. Query it from your API: You can create a route in your Flask app that returns JSON data with the variable you need and then get that data to your JavaScript file by sending a request to your API.
4. Jinja format: I've seen solutions that involve just using the variable as if it was a jinja variable in JavaScript like this: let result = JSON.parse('{{ result | tojson }}');. But I haven't been able to get this working properly, not sure why.
I hope this helps!
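On the original goal (showing the placeholder when there is no text): a textarea only shows its placeholder when its content is empty, and Jinja renders a Python None as the text "None". Below is a minimal sketch of normalizing the value on the Python side before passing it to the template. The helper name textarea_value is hypothetical, not part of Flask.

```python
def textarea_value(result):
    """Return the text to put inside the <textarea>, or '' when there is none."""
    if result is None:
        # An empty string keeps the textarea empty, so the placeholder is shown.
        return ""
    # Strip surrounding whitespace so blank input also counts as empty.
    return str(result).strip()


print(repr(textarea_value(None)))
print(repr(textarea_value("  some result ")))
```

In the template, keep the tags tight around the value, since stray whitespace between the opening and closing textarea tags also counts as content and hides the placeholder.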

Getting specific part of <div> from webpage

I am currently using aiohttp and lxml to scrape webpages and return values. So far, I have
def get_sr(page, tree):
    sr = tree.xpath(".//div[@class='competitive-rank']/div/text()")[0]
    return sr
def get_icon_url(page, tree):
    url = tree.xpath('.//img[@class="player-portrait"]/@src')[0]
    return url
def get_sr_icon_url(page, tree):
    url = tree.xpath('.//div[@class="competitive-rank"]/img/@src')[0]
    return url
def get_level(page, tree):
    level = tree.xpath('.//div[@class="header-avatar"]/text()')[0]
    return level
The first 3 functions work perfectly, and yet the final function will not correctly get the text I am looking for. This:
<div class="header-avatar">
<img src="https://blzgdapipro-a.akamaihd.net/game/unlocks/0x0250000000001150.png" width="80" height="80">
<span>369</span>
</div>
Is the code block I am trying to get the number from. Currently, the number is 369 but it constantly changes. I have confirmed that the page and tree are correct through print statements, so instead it's an issue w/ the actual get_level method itself.
Help? Other pieces of code needed to determine issue?
Thank you for the help.
Try this:
level = tree.xpath('.//div[@class="header-avatar"]/span/text()')[0]
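Why the extra /span matters: text() on the div only returns the div's own text nodes, which here are just whitespace, while the number lives in the span child. The sketch below shows this with the standard library's xml.etree.ElementTree as a stand-in for lxml. The markup is a well-formed copy of the question's snippet wrapped in a body element, with a placeholder src value.

```python
import xml.etree.ElementTree as ET

# Well-formed stand-in for the question's markup (the <img> is self-closed here
# so the standard-library XML parser accepts it; the src value is a placeholder).
snippet = """
<body>
  <div class="header-avatar">
    <img src="icon.png" width="80" height="80" />
    <span>369</span>
  </div>
</body>
"""

root = ET.fromstring(snippet)
div = root.find(".//div[@class='header-avatar']")

# The div's own text is only the whitespace before <img>.
own_text = div.text.strip()

# The number is the text of the <span> child.
level = div.find("span").text
print(repr(own_text), level)
```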

Multiline posting in HTML

I'm totally new at web development and I am currently trying to create a little website. The goal of this site, is to show random quotes of some of my teachers. The main pages are actually working just fine (I can get random quotes of my whole database, and random quotes from every teacher). But, I wanted to show all the quotes on the same page, and it happens they just appear all on the same line... And it's quite embarrassing...
In my Python code, I used "\n" between each quote, so that each new one would start on a new line. But when I pass this string to my HTML code, it seems to have no effect: all the quotes just follow one another on one line.
I'm using a Flask application, and a python class:
for i in range(2, max):
    inte = inte + citation.ClasseCitations('Classe/citations.json','Classe/profs.json', prof, i).corps + ' \n '
return render_template("integrale.html", citation=inte, auteur=prof)
In my HTML file, I use citation like this:
<p>{{ citation }}</p>
Try this :
for i in range(2, max):
    inte = inte + citation.ClasseCitations('Classe/citations.json','Classe/profs.json', prof, i).corps + ' <br/> '
return render_template("integrale.html", citation=inte, auteur=prof)
I'm not able to comment, but try it with <br/> instead of \n; this could work.
I'd try using a list object within your flask-app.
Then in your html:
{% for quote in quotes %}
    {{ quote }} <br>
{% endfor %}
More on jinja2's for-loops
http://jinja.pocoo.org/docs/2.9/templates/#for-loop
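Some background on why "\n" has no effect: HTML collapses newlines inside a paragraph into ordinary spaces. Note also that Flask's Jinja templates autoescape by default, so a <br/> placed inside the string is displayed literally unless the value is marked safe (for example with the safe filter). Passing a list and looping in the template avoids that. If you do want to build the fragment in Python, here is a minimal sketch; the function name quotes_to_html and the sample quotes are hypothetical.

```python
from html import escape


def quotes_to_html(quotes):
    """Escape each quote and join them with <br> so each starts on a new line."""
    return "<br>\n".join(escape(q) for q in quotes)


# Hypothetical sample quotes, for illustration only.
sample = ["First quote", "Second <quote> & more"]
html_fragment = quotes_to_html(sample)
print(html_fragment)
```

Because the quotes are escaped before the <br> tags are added, the resulting fragment is the kind of value that can reasonably be rendered with the safe filter.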

Sleek way of un/commenting out html tags in markdown

I'm trying to find a nice way of wrapping html tags in html comments without writing 5 functions and 50 lines of code. Using an example code :
<section class="left span9">
### Test
</section>
I need to transform it to :
<!--<section class="left span9">-->
### Test
<!--</section>-->
I have a regex to find the tags
re.findall('<.*?>', str)
but in the last years I wasn't using lambdas too often so now I'm having a hard time getting it to work.
btw any ideas for the reverse of this process - decommenting the tags ?
You can comment/uncomment using simple replace like this
myString = '<section class="left span9">'
print(myString.replace("<", "<!--<").replace(">", ">-->"))
print(myString.replace("<!--", "").replace("-->", ""))
Output:
<!--<section class="left span9">-->
<section class="left span9">
Note: This works because a valid HTML document should have < and > only in the HTML tags. If they should appear, as they are, in the output, they have to be properly HTML escaped with &gt; and &lt;
Ok, so temporarily I've ended up using two functions and re.sub for that:
import re
def comment(match):
    # Wrap the matched tag in an HTML comment
    return '<!--' + match.group(0) + '-->'
def uncomment(html):
    # Strip every comment delimiter, leaving the tags in place
    return html.replace('<!--', '').replace('-->', '')
commented_html = re.sub('<.*?>', comment, html_string)
uncommented_html = uncomment(commented_html)
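A variation on the same idea, as a sketch rather than a drop-in replacement: the patterns below skip tags that are already wrapped, so commenting twice does not nest comments, and uncommenting only unwraps comments that contain a single tag, leaving ordinary comments alone. The function names are illustrative, and the patterns assume tags do not contain < or > inside attribute values.

```python
import re

# A tag that is not itself a comment opener and is not already wrapped in <!-- ... -->.
UNWRAPPED_TAG = re.compile(r"(?<!<!--)<[^<>!][^<>]*>")
# A comment whose whole content is exactly one tag.
WRAPPED_TAG = re.compile(r"<!--(<[^<>]+>)-->")


def comment_tags(text):
    """Wrap every bare tag in an HTML comment."""
    return UNWRAPPED_TAG.sub(lambda m: "<!--" + m.group(0) + "-->", text)


def uncomment_tags(text):
    """Unwrap comments that contain exactly one tag."""
    return WRAPPED_TAG.sub(r"\1", text)


source = '<section class="left span9">\n### Test\n</section>'
commented = comment_tags(source)
print(commented)
print(uncomment_tags(commented))
```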

Generating HTML documents in python

In Python, what is the most elegant way to generate HTML documents? I currently manually append all of the tags to a giant string, and write that to a file. Is there a more elegant way of doing this?
You can use yattag to do this in an elegant way. FYI I'm the author of the library.
from yattag import Doc
doc, tag, text = Doc().tagtext()
with tag('html'):
    with tag('body'):
        with tag('p', id = 'main'):
            text('some text')
        with tag('a', href='/my-url'):
            text('some link')
result = doc.getvalue()
It reads like html, with the added benefit that you don't have to close tags.
I would suggest using one of the many template languages available for python, for example the one built into Django (you don't have to use the rest of Django to use its templating engine) - a google query should give you plenty of other alternative template implementations.
I find that learning a template library helps in so many ways - whenever you need to generate an e-mail, HTML page, text file or similar, you just write a template, load it with your template library, then let the template code create the finished product.
Here's some simple code to get you started:
#!/usr/bin/env python
from django.template import Template, Context
from django.conf import settings
settings.configure() # We have to do this to use django templates standalone - see
# http://stackoverflow.com/questions/98135/how-do-i-use-django-templates-without-the-rest-of-django
# Our template. Could just as easily be stored in a separate file
template = """
<html>
<head>
<title>Template {{ title }}</title>
</head>
<body>
Body with {{ mystring }}.
</body>
</html>
"""
t = Template(template)
c = Context({"title": "title from code",
             "mystring": "string from code"})
print(t.render(c))
It's even simpler if you have templates on disk: check out the render_to_string function for django 1.7, which can load templates from disk from a predefined list of search paths, fill them with data from a dictionary and render to a string, all in one function call. (removed from django 1.8 on, see Engine.from_string for comparable action)
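As a dependency-free illustration of the same template idea, here is a minimal sketch using the standard library's string.Template. It is a stand-in, not Django's engine: it only does $name substitution, has no loops or conditionals, and does not escape values, so the sketch escapes them by hand with html.escape. The variable names mirror the Django example above.

```python
from html import escape
from string import Template

# The template text; it could just as easily be read from a separate file.
page = Template("""<html>
<head><title>Template $title</title></head>
<body>Body with $mystring.</body>
</html>""")

# Escape values by hand, since string.Template does no HTML escaping.
rendered = page.substitute(
    title=escape("title from code"),
    mystring=escape("string <from> code"),
)
print(rendered)
```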
If you're building HTML documents then I highly suggest using a template system (like jinja2) as others have suggested. If you're in need of some low level generation of html bits (perhaps as an input to one of your templates), then the xml.etree package is a standard python package and might fit the bill nicely.
import sys
from xml.etree import ElementTree as ET
html = ET.Element('html')
body = ET.Element('body')
html.append(body)
div = ET.Element('div', attrib={'class': 'foo'})
body.append(div)
span = ET.Element('span', attrib={'class': 'bar'})
div.append(span)
span.text = "Hello World"
if sys.version_info < (3, 0, 0):
    # python 2
    ET.ElementTree(html).write(sys.stdout, encoding='utf-8',
                               method='html')
else:
    # python 3
    ET.ElementTree(html).write(sys.stdout, encoding='unicode',
                               method='html')
Prints the following:
<html><body><div class="foo"><span class="bar">Hello World</span></div></body></html>
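If you need the markup as a string rather than written to stdout, a minimal Python 3 sketch of the same document uses ET.SubElement to create and attach each child in one step and ET.tostring to serialize. The variable name markup is illustrative.

```python
from xml.etree import ElementTree as ET

# SubElement creates the child and appends it to the parent in one call.
html = ET.Element("html")
body = ET.SubElement(html, "body")
div = ET.SubElement(body, "div", attrib={"class": "foo"})
span = ET.SubElement(div, "span", attrib={"class": "bar"})
span.text = "Hello World"

# encoding="unicode" makes tostring return a str instead of bytes.
markup = ET.tostring(html, encoding="unicode", method="html")
print(markup)
```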
There is also a nice, modern alternative: airium: https://pypi.org/project/airium/
from airium import Airium
a = Airium()
a('<!DOCTYPE html>')
with a.html(lang="pl"):
    with a.head():
        a.meta(charset="utf-8")
        a.title(_t="Airium example")
    with a.body():
        with a.h3(id="id23409231", klass='main_header'):
            a("Hello World.")
html = str(a) # casting to string extracts the value
print(html)
Prints such a string:
<!DOCTYPE html>
<html lang="pl">
<head>
<meta charset="utf-8" />
<title>Airium example</title>
</head>
<body>
<h3 id="id23409231" class="main_header">
Hello World.
</h3>
</body>
</html>
The greatest advantage of airium is that it also has a reverse translator, which builds Python code out of an HTML string. If you wonder how to implement a given HTML snippet, the translator gives you the answer right away.
Its repository contains tests with example pages translated automatically with airium in: tests/documents. A good starting point (any existing tutorial) - is this one: tests/documents/w3_architects_example_original.html.py
I would recommend using xml.dom to do this.
http://docs.python.org/library/xml.dom.html
Read this manual page, it has methods for building up XML (and therefore XHTML). It makes all XML tasks far easier, including adding child nodes, document types, adding attributes, creating texts nodes. This should be able to assist you in the vast majority of things you will do to create HTML.
It is also very useful for analysing and processing existing xml documents.
Here is a tutorial that should help you with applying the syntax:
http://www.postneo.com/projects/pyxml/
I am using the code snippet known as throw_out_your_templates for some of my own projects:
https://github.com/tavisrudd/throw_out_your_templates
https://bitbucket.org/tavisrudd/throw-out-your-templates/src
Unfortunately, there is no pypi package for it and it's not part of any distribution as this is only meant as a proof-of-concept. I was also not able to find somebody who took the code and started maintaining it as an actual project. Nevertheless, I think it is worth a try even if it means that you have to ship your own copy of throw_out_your_templates.py with your code.
Similar to the suggestion to use yattag by John Smith Optional, this module does not require you to learn any templating language and also makes sure that you never forget to close tags or quote special characters. Everything stays written in Python. Here is an example of how to use it:
html(lang='en')[
head[title['An example'], meta(charset='UTF-8')],
body(onload='func_with_esc_args(1, "bar")')[
div['Escaped chars: ', '< ', u'>', '&'],
script(type='text/javascript')[
'var lt_not_escaped = (1 < 2);',
'\nvar escaped_cdata_close = "]]>";',
'\nvar unescaped_ampersand = "&";'
],
Comment('''
not escaped "< & >"
escaped: "-->"
'''),
div['some encoded bytes and the equivalent unicode:',
'你好', unicode('你好', 'utf-8')],
safe_unicode('<b>My surrounding b tags are not escaped</b>'),
]
]
I am attempting to make an easier solution called PyperText, in which you can do stuff like this:
from PyperText.html import Script
from PyperText.htmlButton import Button
#from PyperText.html{WIDGET} import WIDGET; ex from PyperText.htmlEntry import Entry; variations shared in file
myScript=Script("myfile.html")
myButton=Button()
myButton.setText("This is a button")
myScript.addWidget(myButton)
myScript.createAndWrite()
I wrote a simple wrapper for the lxml module (should work fine with xml as well) that makes tags for HTML/XML-esque documents.
Really, I liked the format of the answer by John Smith but I didn't want to install yet another module to accomplish something that seemed so simple.
Example first, then the wrapper.
Example
from Tag import Tag
with Tag('html') as html:
    with Tag('body'):
        with Tag('div'):
            with Tag('span', attrib={'id': 'foo'}) as span:
                span.text = 'Hello, world!'
            with Tag('span', attrib={'id': 'bar'}) as span:
                span.text = 'This was an example!'
html.write('test_html.html')
Output:
<html><body><div><span id="foo">Hello, world!</span><span id="bar">This was an example!</span></div></body></html>
Output after some manual formatting:
<html>
<body>
<div>
<span id="foo">Hello, world!</span>
<span id="bar">This was an example!</span>
</div>
</body>
</html>
Wrapper
from dataclasses import dataclass, field
from lxml import etree
PARENT_TAG = None
@dataclass
class Tag:
    tag: str
    attrib: dict = field(default_factory=dict)
    parent: object = None
    _text: str = None
    @property
    def text(self):
        return self._text
    @text.setter
    def text(self, value):
        self._text = value
        self.element.text = value
    def __post_init__(self):
        self._make_element()
        self._append_to_parent()
    def write(self, filename):
        etree.ElementTree(self.element).write(filename)
    def _make_element(self):
        self.element = etree.Element(self.tag, attrib=self.attrib)
    def _append_to_parent(self):
        if self.parent is not None:
            self.parent.element.append(self.element)
    def __enter__(self):
        global PARENT_TAG
        if PARENT_TAG is not None:
            self.parent = PARENT_TAG
            self._append_to_parent()
        PARENT_TAG = self
        return self
    def __exit__(self, typ, value, traceback):
        global PARENT_TAG
        if PARENT_TAG is self:
            PARENT_TAG = self.parent
