Special Characters in URL while getting data using requests.get - python

I'm having trouble fetching content from IRIs that contain special characters. I'm working strictly with the requests module.
The following are some of the URLs that cause trouble:
https://cwur.org/2018-19/King's-College-London.php
https://cwur.org/2018-19/University-of-Wisconsin–Madison.php
import requests
res = requests.get('https://cwur.org/2018-19/University-of-São-Paulo.php')
res.text

To get a 200 response, pass a User-Agent in the headers.
import requests
headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11'}
res = requests.get('https://cwur.org/2018-19/University-of-São-Paulo.php', headers=headers)
print(res.status_code)
print("---" * 10)
print(res.text)
Output:
200
------------------------------
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags -->
<meta name="description" content="The Center for World University Rankings (CWUR) is a leading consulting organization and publisher of the largest academic ranking of global universities.">
<meta name="keywords" content="ranking, rankings, university, universities, college, colleges, 2021, 2020, 2019, 2018, 2017, 2016, 2015, 2014, 2013, 2012, world, top, best, global, Ranking universitario mundial, Classement mondial des universités , Weltweites Universitätsranking, Zentrum für weltweite Universitätsrankings , ××ר×× ×××× ××רס××××ת ××¢××××, ××ר×× ×××ר×× ×××× ××רס××××ת ××¢××××, ì¸ê³ ëíìì, ãä¸çã®å¤§å­¦ããã, ä¸ç大學æå中å¿, ì¸ê³ëíë­í¹ì¼í°,ä¸ç大学ã©ã³ã­ã³ã°ã»ã³ã¿ã¼, Ranking mundial universitário, РейÑинг ÑнивеÑÑиÑеÑов миÑа , ÑазÑабоÑки ÑейÑинга ÑнивеÑÑиÑеÑов миÑа, ÙرÙز ,تصÙÙ٠اÙجاÙعات اÙعاÙÙÙØ© ,تصÙÙÙ, اÙجاÙعات, جاÙعات, اÙعاÙÙ, تصÙÙ٠اÙجاÙعات, ÙرÙز تصÙÙ٠اÙجاÙعات اÙعاÙÙÙØ©, Ranking de universidades del mundo, subject, subjects, journal, journals, ranking by subjects, country ranking, country rankings">
<link rel="icon" type="image/png" href="../../favicon.png" />
<!-- Bootstrap core CSS -->
<link href="../../dist/css/bootstrap.min.css" rel="stylesheet">
<!-- IE10 viewport hack for Surface/desktop Windows 8 bug -->
<link href="../../assets/css/ie10-viewport-bug-workaround.css" rel="stylesheet">
<!-- Custom styles for this template -->
<link href="../../starter-template.css" rel="stylesheet">
<!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries -->
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
<style type="text/css">
/* CSS used here will be applied after bootstrap.css */
.navbar-custom {
color: #FFFFFF;
background-color: #222222;
border-color: #222222;
}
</style>
<title> University of São Paulo Ranking | CWUR World University Rankings 2018-2019</title>
</head>
<body>
<nav class="navbar navbar-inverse navbar-fixed-top">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar" aria-expanded="false" aria-controls="navbar">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<img src="../images/logo_944_400.png" height="50">
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav">
<li>About</li>
<li class="dropdown">
World University Rankings <span class="caret"></span>
<ul class="dropdown-menu">
<li class="dropdown-header">World University Rankings</li>
<li>2020-21</li>
<li>2019-20</li>
<li>2018-19</li>
<li>2017</li>
<li>2016</li>
<li>2015</li>
<li>2014</li>
<li>2013</li>
<li>2012</li>
<li role="separator" class="divider"></li>
<li class="dropdown-header">University Rankings by Country</li>
<li>2018-19</li>
<li>2017</li>
<li>2016</li>
<li>2015</li>
<li>2014</li>
<li role="separator" class="divider"></li>
<li>Rankings by Subject</li>
</ul>
</li>
<li class="dropdown">
Methodology <span class="caret"></span>
<ul class="dropdown-menu">
<li>World University Rankings</li>
<li>Subject Rankings</li>
</ul>
</li>
<li>Media</li>
</ul>
</div>
</div>
</nav>
<div class="container">
<div class="page-header">
<h4> University of São Paulo Ranking - CWUR World University Rankings 2018-2019</h4>
<!-- Go to www.addthis.com/dashboard to customize your tools -->
<div class="addthis_toolbox addthis_default_style addthis_32x32_style"> <a class="addthis_button_preferred_1"></a> <a class="addthis_button_preferred_2"></a> <a class="addthis_button_preferred_3"></a> <a class="addthis_button_preferred_4"></a><a class="addthis_button_compact"></a></div> </div>
<div class="row">
<div class="col-md-8">
<table class="table table-bordered table-hover">
<tr><td><b>Institution Name</b></td><td>University of São Paulo </td></tr>
<tr><td><b>Native Name</b></td><td>Universidade de São Paulo </td></tr>
<tr><td><b>Location</b></td><td>Brazil</td></tr>
<tr><td><b>World Rank</b></td><td>77</td></tr>
<tr><td><b>National Rank</b></td><td>1</td></tr>
<tr><td><b>Quality of Education Rank</b></td><td>583</td></tr>
<tr><td><b>Alumni Employment Rank</b></td><td>256</td></tr>
<tr><td><b>Quality of Faculty Rank</b></td><td>109</td></tr>
<tr><td><b>Research Output Rank</b></td><td>4</td></tr>
<tr><td><b>Quality Publications Rank</b></td><td>60</td></tr>
<tr><td><b>Influence Rank</b></td><td>162</td></tr>
<tr><td><b>Citations Rank</b></td><td>139</td></tr>
<tr><td><b>Overall Score</b></td><td>82.6</td></tr>
<tr><td><b>Domain</b></td><td>usp.br</td></tr>
</table>
</div>
<div class="col-md-4">
<div class="table-responsive">
<table class="table table-bordered table-hover">
<tr><td>Top 2000 Universities (2020-21)</td></tr>
<tr><td>Top 2000 Universities (2019-20)</td></tr>
<tr><td>Top 1000 Universities (2018-19)</td></tr>
<tr><td>Ranking by Country (2018-2019)</td></tr>
<tr><td>Top 1000 Universities (2017)</td></tr>
<tr><td>Ranking by Country (2017)</td></tr>
<tr><td>Rankings by Subject</td></tr>
<tr><td>Top 1000 Universities (2016)</td></tr>
<tr><td>Ranking by Country (2016)</td></tr>
<tr><td>Top 1000 Universities (2015)</td></tr>
<tr><td>Ranking by Country (2015)</td></tr>
<tr><td>Top 1000 Universities (2014)</td></tr>
<tr><td>Ranking by Country (2014)</td></tr>
</table>
</div>
</div>
</div>
<p>Copyright © 2012-2020 Center for World University Rankings</p>
</div>
<!-- Bootstrap core JavaScript
================================================== -->
<!-- Placed at the end of the document so the pages load faster -->
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.3/jquery.min.js"></script>
<script>window.jQuery || document.write('<script src="../../assets/js/vendor/jquery.min.js"><\/script>')</script>
<script src="../../dist/js/bootstrap.min.js"></script>
<!-- IE10 viewport hack for Surface/desktop Windows 8 bug -->
<script src="../../assets/js/ie10-viewport-bug-workaround.js"></script>
<!-- Go to www.addthis.com/dashboard to customize your tools -->
<script type="text/javascript" src="//s7.addthis.com/js/300/addthis_widget.js#pubid=ra-5316b43f5ee1fc57"></script>
</body>
</html>
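Note that the title and table cells in this output read "SÃ£o" rather than "São". That pattern is what UTF-8 bytes look like when they are decoded as ISO-8859-1, which is most likely what happened here, since requests falls back to ISO-8859-1 for text responses whose Content-Type header declares no charset. A minimal sketch of the effect using only the standard library; the byte string is an illustrative stand-in for res.content, not the real response:

```python
# Stand-in for res.content: the page title encoded as UTF-8 (illustrative, not fetched).
raw = "University of São Paulo".encode("utf-8")

# Decoding with the wrong charset reproduces the garbled text seen in the output.
garbled = raw.decode("iso-8859-1")

# Decoding with the charset the page declares in its <meta> tag gives the intended text.
clean = raw.decode("utf-8")

print(garbled)
print(clean)

# With requests, the equivalent fix would be to set the encoding before reading .text:
#     res.encoding = "utf-8"
#     print(res.text)
```

Setting res.encoding only changes how the body is decoded; it does not affect the request itself.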
Update:
If the URL contains UTF-8 bytes that were mis-decoded as ISO-8859-1, you can convert it back to a proper string:
import requests
headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11'}
url = "https://cwur.org/2018-19/University-of-S\xc3\xa3o-Paulo.php"
new_url = url.encode("iso-8859-1").decode()
res = requests.get(new_url, headers=headers)
print(res.status_code)
print("---" * 10)
print(res.text)
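If you would rather send an explicitly percent-encoded URL, the path can be encoded with the standard library before it is passed to requests.get. This is a minimal sketch, not the only way to do it: iri_to_uri is a hypothetical helper name, and it only encodes the path (host names and query strings are left untouched).

```python
from urllib.parse import quote, urlsplit, urlunsplit


def iri_to_uri(iri):
    """Percent-encode the non-ASCII and reserved characters in the path of an IRI."""
    parts = urlsplit(iri)
    # Keep "/" as the path separator and "%" so already-encoded paths are not encoded twice.
    path = quote(parts.path, safe="/%")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


for iri in (
    "https://cwur.org/2018-19/University-of-São-Paulo.php",
    "https://cwur.org/2018-19/University-of-Wisconsin–Madison.php",
):
    print(iri_to_uri(iri))
```

The result is a plain ASCII string, so it can be logged or compared without worrying about how the terminal or file encoding treats the special characters.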

I recommend storing the data you receive from the .get() method in a dictionary and then using the pprint module to display it neatly:
import requests
from pprint import pprint
url = 'https://cwur.org/2018-19/University-of-Wisconsin–Madison.php'
res = requests.get(url)
# printing the status code is also helpful to see if the call was successful
print("Status code:", res.status_code)
# res.json() only works when the server returns JSON; it raises an error for an HTML page
r_dict = res.json()
pprint(r_dict)
If you get a status code of 200, the call was successful. Here is more documentation on other status codes: link
Hope this helps you find the problem with your link.

Related

NC files downloaded using requests in python are corrupted

I want to download 10 years of MODIS monthly data, for which I have links as lines in a text file. I used the following code:
import requests
import os
import urllib
file = open("link.txt", "r")
Lines = file.readlines()
for line in Lines:
    filename = line.strip()
    print(filename)
    user, password = 'xx#gmail.com', 'yy#123'
    r = requests.get(filename, auth=(user, password), allow_redirects=True)
    open(str(filename[48:90]), 'wb').write(r.content)
The downloaded files are 11 KB each and corrupted. I can't figure out what the problem is. The content of an output file is shown below.
link.txt contains files like these:
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/AQUA_MODIS.20101201_20101231.L3b.MO.CHL.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/AQUA_MODIS.20110101_20110131.L3b.MO.CHL.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/AQUA_MODIS.20110201_20110228.L3b.MO.CHL.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/AQUA_MODIS.20110301_20110331.L3b.MO.CHL.nc
Output nc file looks like this:
<!DOCTYPE html>
<!--[if lt IE 7]><html class="no-js lt-ie9 lt-ie8 lt-ie7"> <![endif]-->
<!--[if IE 7]><html class="no-js lt-ie9 lt-ie8"> <![endif]-->
<!--[if IE 8]><html class="no-js lt-ie9"> <![endif]-->
<!--[if gt IE 8]><!--><html lang="en" class="no-js"><!--<![endif]-->
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<title>Earthdata Login</title>
<meta name="description" content="Earthdata Login">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<!-- Google Tag Manager -->
<script>(function(w,d,s,l,i){w[l]=w[l]||[];w[l].push(
{'gtm.start': new Date().getTime(),event:'gtm.js'}
);var f=d.getElementsByTagName(s)[0],
j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src=
'https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);
})(window,document,'script','dataLayer','GTM-WNP7MLF');</script>
<!-- End Google Tag Manager -->
<link href="https://cdn.earthdata.nasa.gov/eui/1.1.3/stylesheets/application.css" rel="stylesheet" />
<link rel="stylesheet" media="all" href="/assets/application-432b3917d4a41042c0fd963eba859548ef2993f5ed7a0dca4bdb446fdf807556.css" />
<!--[if IE 7]>
<link rel="stylesheet" href="/assets/font-awesome-ie7.min.css">
<![endif]-->
<link href="//netdna.bootstrapcdn.com/font-awesome/4.3.0/css/font-awesome.min.css" rel="stylesheet">
<link href='https://fonts.googleapis.com/css?family=Source+Sans+Pro:300,700' rel='stylesheet' type='text/css'>
<meta name="csrf-param" content="authenticity_token" />
<meta name="csrf-token" content="4cewX3n0xOGOTr5gWHgIlZctPtRm55YQ9lohmdPuvq1SfDO/rNhsQZZB3OUPRDLGsqQ/QG9uVqxV5yaLx9GqAQ==" />
<!-- Grid background: http://subtlepatterns.com/graphy/ -->
</head>
<body class="oauth authorize" data-turbolinks-eval=false>
<!-- Google Tag Manager (noscript) -->
<noscript>
<iframe src="https://www.googletagmanager.com/ns.html?id=GTM-WNP7MLF"
height="0" width="0" style="display:none;visibility:hidden"></iframe>
</noscript>
<!-- End Google Tag Manager (noscript) -->
<header id="earthdata-tophat2" style="height: 32px;"></header>
<!--[if lt IE 7]>
<p class="chromeframe">You are using an <strong>outdated</strong> browser. Please upgrade your browser or activate Google Chrome Frame to improve your experience.</p>
<![endif]-->
<div class="container">
<header role="banner">
<div id="masthead-logo">
<h1><a class="ir" href="/">Earthdata Login</a></h1>
<span class="eui-badge badge daac">Earthdata Login</span>
</div>
<a id="hamburger" href="#"><img title="Mobile Menu" alt="Three horizontal lines stacked" src="/assets/hamburger-68c8505066427f3e3f6ee40b24cfd3c9f7c0fe93ee298b9046564637262115fa.png" /></a>
<nav role="navigation" class="masthead">
<div id="hide">
<ul>
</ul>
</div>
</nav>
</header>
<section id="callout-login">
<div class="client-login">
<br>
<h3 class="client-description">
</h3>
</div>
<form id="login" action="/login" accept-charset="UTF-8" method="post"><input name="utf8" type="hidden" value="✓" autocomplete="off" /><input type="hidden" name="authenticity_token" value="amGKbzpZc+qTleRHBa9q8W4OvZNAUzhOfGu1G9gX0QnZ2gmP73XbSouahsJSk1CiS4e8B0na+PLf1rIJzCjFpQ==" autocomplete="off" />
<p><label for="username">Username</label><i class="fa fa-question-circle fa-question-circle--blue user-name" title="Login using either your Username or Email Address"></i><input type="text" name="username" id="username" autofocus="autofocus" class="default" /></p>
<p><label for="password">Password</label><br /><input type="password" name="password" id="password" /></p>
<p><input type="hidden" name="client_id" id="client_id" value="Z0u-MdLNypXBjiDREZ3roA" autocomplete="off" /></p>
<p><input type="hidden" name="redirect_uri" id="redirect_uri" value="https://oceandata.sci.gsfc.nasa.gov/ob/getfile/restrict" autocomplete="off" /></p> <p><input type="hidden" name="response_type" id="response_type" value="code" autocomplete="off" /></p>
<p><input type="hidden" name="state" id="state" autocomplete="off" /></p>
<p><input type="checkbox" name="stay_in" id="stay_in" value="1" checked="checked" /> <label for="stay_in">Stay signed in (this is a private workstation)</label></p>
<p class="button-with-notes">
<input type="submit" name="commit" value="Log in" class="eui-btn--round eui-btn--green" data-disable-with="Log in" />
<a class="eui-btn--round eui-btn--blue" href="/users/new?client_id=Z0u-MdLNypXBjiDREZ3roA&redirect_uri=https%3A%2F%2Foceandata.sci.gsfc.nasa.gov%2Fob%2Fgetfile%2Frestrict&response_type=code">Register</a>
</p>
<p class="form-instructions">
<em class="icon-question-sign"></em>
<a class="" href="/retrieve_info">I don’t remember my username</a>
<br /><em class="icon-question-sign"></em>
<a class="" href="/reset_passwords/new">I don’t remember my password</a>
<br />
<em class="icon-question-sign"></em>
<a href="javascript:feedback.showForm();" title = 'Need Help? Click on the Feedback button to request help'>Help</a>
</p>
</form>
<aside class="govt-msg">
<div class="nasa-logo"></div>
<p><strong>Why must I register?</strong></p>
<p>
The Earthdata Login provides a single mechanism for user registration and profile management for all EOSDIS system components (DAACs, Tools, Services).
Your Earthdata login also helps the EOSDIS program better understand the usage of EOSDIS services to improve user experience through customization of tools and improvement of services.
EOSDIS data are openly available to all and free of charge except where governed by international agreements.
</p>
</aside>
</section>
<section id="cta">
<h3>Get single sign-on access to all your favorite EOSDIS sites</h3>
<a class="eui-btn--round eui-btn--blue" href="/users/new?client_id=Z0u-MdLNypXBjiDREZ3roA&redirect_uri=https%3A%2F%2Foceandata.sci.gsfc.nasa.gov%2Fob%2Fgetfile%2Frestrict&response_type=code">Register for a Profile</a>
</section>
<div class="govt-warning eui-info-box">
<div class="warning-desktop">
<p><strong>By accessing and using this information system, you acknowledge and consent to the following:</strong></p>
You are accessing a U.S. Government information system, which includes:
(1) this computer;
(2) this computer network;
(3) all computers connected to this network including end user systems;
(4) all devices and storage media attached to this network or to any computer on this network; and
(5) cloud and remote information services. This information system is provided for U.S. Government-authorized use only.
You have no reasonable expectation of privacy regarding any communication transmitted through or data stored on this
information system. At any time, and for any lawful purpose, the U.S. Government may monitor, intercept, search, and
seize any communication or data transiting, stored on, or traveling to or from this information system. You are
NOT authorized to process classified information on this information system. Unauthorized or improper use of this
system may result in suspension or loss of access privileges, disciplinary action, and civil and/or criminal penalties.
</div>
<div class="warning-mobile">
<p><strong>By accessing and using this information system, you acknowledge and consent to the following:</strong></p>
You are accessing a U.S. Government information system, which includes:
(1) this computer;
(2) this computer network;
(3) all computers connected to this network including end user systems;
(4) all devices and storage media attached to this network or to any computer on this network; and
(5) cloud and remote information services. This information system is provided for U.S. Government-authorized use only.
Unauthorized or improper use of this system may result in suspension or loss of access privileges, disciplinary action,
and civil and/or criminal penalties. By using this information system, you acknowledge and consent to the terms
and conditions established in NASA policy and regulatory guidance for NASA IT Systems.
</div>
<div class="warning-mobile-mini">
<strong>
US Govt Property. Unauthorized use subject to prosecution. Use subject to monitoring per
NPD2810.
</strong>
</div>
</div>
</div>
<footer role="contentinfo">
<h3>For questions regarding the EOSDIS Earthdata Login, please contact Earthdata Support</h3>
<ul>
<li class="version badge eui-badge--md">V 4.158
</li>
<li>Home</li>
<li>Register</li>
<li>Documentation</li>
<li><a title="NASA Home" href="http://www.nasa.gov">NASA</a></li>
</ul>
<p>NASA Official: Stephen Berrick</p>
</footer>
<script src="/assets/application-15d4faf28e91715dccccf34f3e808b7a348298fb623e29d09281b30cc1f87492.js"></script>
<script type="text/javascript">
$(window).scroll(function(e){
parallax();
});
function parallax(){
var scrolled = $(window).scrollTop();
$('#content').css('background-position', 'right ' + -(scrolled*0.25)+'px ');
}
</script>
<script src="https://cdn.earthdata.nasa.gov/tophat2/tophat2.js" id="earthdata-tophat-script" data-show-fbm="true" data-show-status="true" data-status-api-url="https://status.earthdata.nasa.gov/api/v1/notifications"></script>
<script type="text/javascript" src="https://fbm.earthdata.nasa.gov/for/URS4/feedback.js"></script>
<script type="text/javascript">
feedback.init();
</script>
<script type="text/javascript">
setTimeout(function()
{var a=document.createElement("script"); var b=document.getElementsByTagName("script")[0];
a.src=document.location.protocol+"//dnn506yrbagrg.cloudfront.net/pages/scripts/0013/2090.js?"+Math.floor(new Date().getTime()/3600000);
a.async=true;a.type="text/javascript";b.parentNode.insertBefore(a,b)}
, 1);
</script>
<!-- BEGIN: DAP Google Analytics -->
<script language="javascript" id="_fed_an_ua_tag" src="https://dap.digitalgov.gov/Universal-Federated-Analytics-Min.js?agency=NASA&subagency=GSFC&dclink=true"></script>
<!-- END: DAP Google Analytics -->
</body>
</html>
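The saved file above is the Earthdata Login page, which suggests the request was not authenticated and the login HTML was written to disk under a .nc name. One way to catch this early is to inspect the first bytes before saving anything. A minimal sketch, assuming classic netCDF files start with b"CDF" and netCDF-4 files start with the HDF5 signature; looks_like_netcdf and filename_from_url are hypothetical helper names, and the login_page bytes are an illustrative stand-in for r.content:

```python
import posixpath
from urllib.parse import urlsplit

# First 8 bytes of every HDF5 file, which netCDF-4 files are built on.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_netcdf(data: bytes) -> bool:
    """Return True if the payload starts with a netCDF classic or HDF5 signature."""
    return data.startswith(b"CDF") or data.startswith(HDF5_SIGNATURE)


def filename_from_url(url: str) -> str:
    """Take the last path component instead of slicing by fixed character offsets."""
    return posixpath.basename(urlsplit(url).path)


url = "https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/AQUA_MODIS.20101201_20101231.L3b.MO.CHL.nc"
login_page = b"<!DOCTYPE html>\n<head>\n<title>Earthdata Login</title>"

print(filename_from_url(url))
print(looks_like_netcdf(login_page))
```

In the download loop, the check would go right after requests.get: write r.content to disk only when looks_like_netcdf(r.content) is true, and otherwise report that the server returned something other than the data file.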

Headless doesn't work using Playwright and BeautifulSoup 4

This code is working:
from playwright.sync_api import sync_playwright
from bs4 import BeautifulSoup
from datetime import datetime
import time
with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    page = browser.new_page()
    page.goto("https://www.apple.com/br/shop/product/MV7N2BE/A/airpods-com-estojo-de-recarga")
    html = page.content()
    soup = BeautifulSoup(html,'html.parser')
    valorAppleStore = soup.select("span.as-price-installments")[-2].get_text().replace(" à vista (10% de desconto)", '')
    print(valorAppleStore)
    browser.close()
But if I change to headless=True, the code raises an error:
Traceback (most recent call last):
File "c:/Users/ANDERSONCARVALHODELI/Documents/py/AirpodsPW.py", line 19, in <module>
valorAppleStore = soup.select("span.as-price-installments")[-2].get_text().replace(" à vista (10% de desconto)",
'')
IndexError: list index out of range
I fixed this using:
from playwright.sync_api import sync_playwright
from bs4 import BeautifulSoup
from datetime import datetime
import time
with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    page = browser.new_page()
    page.goto("https://www.apple.com/br/shop/product/MV7N2BE/A/airpods-com-estojo-de-recarga")
    time.sleep(1)
    browser.close()
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
    page.goto("https://www.apple.com/br/shop/product/MV7N2BE/A/airpods-com-estojo-de-recarga")
    html = page.content()
    soup = BeautifulSoup(html,'html.parser')
    valorAppleStore = soup.select("span.as-price-installments")[-2].get_text().replace(" à vista (10% de desconto)", '')
    print(valorAppleStore)
But I don't think this is the best approach. How do I fix this without first opening the browser with headless=False, so that I can stick to headless=True?
When I print(html) before soup=..., I see:
<!DOCTYPE html><html><head> <title>Page Not Found - Apple</title> <link rel="stylesheet" href="https://www.apple.com/wss/fonts?families=SF+Pro,v1|SF+Pro+Icons,v1"> <link rel="stylesheet" href="https://www.apple.com/v/errors/c/built/styles/main.built.css" type="text/css"> <link rel="stylesheet" href="https://www.apple.com/v/errors/c/built/styles/overview.built.css" type="text/css"> <link rel="stylesheet" href="https://store.storeimages.cdn-apple.com/4982/store.apple.com/shop/rs-external/rel/us/external.css"> <link rel="stylesheet" href="https://store.storeimages.cdn-apple.com/4982/store.apple.com/shop/rs-globalelements/dist/us/globalelements.css"> <style>.more::after{content: "";}a.pointer, a.more, a.block span.more, button.unbutton.more{padding-right: .7em; background-image: url(https://store.storeimages.cdn-apple.com/4982/store.apple.com/shop/rs-web/2/dist/assets/as-legacy/base/link/res/more.svg); background-repeat: no-repeat; background-position: 100% 50%; background-size: 5px 9px; zoom: 1;}.as-globalfooter-directory-column-section-list a{margin-bottom: .8em; display: block}.as-globalfooter-directory-column-section-list a:last-child{margin-bottom: 0;}.as-globalfooter-mini .as-globalfooter-mini-shop a{color: #06c;}.as-globalfooter .as-globalfooter-mini-legal-copyright, .as-footnotes .as-globalfooter-mini-legal-copyright, .as-globalfooter .as-globalfooter-mini-legal-link, .as-footnotes .as-globalfooter-mini-legal-link{top: -3px; position: relative; z-index: 1;}.as-globalfooter .as-globalfooter-directory+.as-globalfooter-mini, .as-footnotes .as-globalfooter-directory+.as-globalfooter-mini{padding-bottom: 26px;}.container{position: relative;}hr{display: inline-block; border: 0px; border-top: 0.1em solid #CCD2D9; width: 100%}</style></head><body class="page-overview"> <nav data-store-api="/shop/bag/status" id="ac-globalnav"> <div class="ac-gn-content"> <ul class="ac-gn-list"> <p class="ac-gn-link-text">Apple</p> <p class="ac-gn-link-text">Store</p> <p 
class="ac-gn-link-text">Mac</p> <p class="ac-gn-link-text">iPad</p> <p class="ac-gn-link-text">iPhone</p> <p class="ac-gn-link-text">Watch</p> <p class="ac-gn-link-text">AirPods</p> <p class="ac-gn-link-text">TV & Home</p>
<p class="ac-gn-link-text">Only on Apple</p> <p class="ac-gn-link-text">Accessories</p> <p class="ac-gn-link-text">Support</p> <li class="ac-gn-item ac-gn-item-menu ac-gn-search"> <a id="ac-gn-link-search" class="ac-gn-link ac-gn-link-search" href="/us/search" data-analytics-title="search" data-analytics-intrapage-link="" aria-label="Search apple.com" role="button" aria-haspopup="true"></a> </li> <p class="ac-gn-link-text">Shopping Bag</p> </ul> </div></nav> <div id="ac-gn-placeholder"> </div><main id="main" class="main" role="main" data-page-type="overview"> <h1 class="section-headline typography-headline">The page you’re looking for can’t be found.</h1> <aside id="search-wrapper" role="search" data-analytics-region="search" aria-hidden="false"> <form id="searchform-form" class="searchform" action="/us/search" method="get" data-suggestions-url="/search-services/suggestions/"><input id="searchform-input" type="text" class="form-textbox form-textbox-text form-icon-left" aria-labelledby="textbox_label" required="" aria-required="true" data-placeholder-long="Search for Products, Stores, and Help" autocorrect="off" autocapitalize="off" autocomplete="off"><span class="form-label" id="textbox_label" aria-hidden="true">Search apple.com</span> <div id="searchform-submit" class="form-icons-wrapper form-icons-wrapper-left form-icons-focusable" type="submit" aria-label="Submit"><button class="form-icons form-icons-search15"></button></div><div id="searchform-reset" class="button-reset form-icons-wrapper form-icons-focusable" type="reset" disabled="" aria-label="Clear Search"><button class="form-icons form-icons-small form-icons-clearsolid15 form-icon-reset"></button></div></form> </aside> <div class="cta-sitemap"> <div class="cta-sitemap"> Or see our site map </div></div></main> <footer class="as-globalfooter as-globalfooter-contained"> <div class="as-globalfooter-content"> <div class="as-globalfooter-breadcrumbs"> <p class="as-globalfooter-breadcrumbs-home-icon"></p><p 
class="as-globalfooter-breadcrumbs-home-label">Apple</p> <div class="as-globalfooter-breadcrumbs-path"> <ol class="as-globalfooter-breadcrumbs-list"> <li class="as-globalfooter-breadcrumbs-item breadcrumbs-title"> Page Not Found</li></ol> </div></div><nav class="as-globalfooter-directory with-5-columns"> <div class="as-globalfooter-directory-column"> <div class="as-globalfooter-directory-column-section"> <h3 class="as-globalfooter-directory-column-section-title">Shop and Learn</h3> <ul class="as-globalfooter-directory-column-section-list"> Store Mac iPad iPhone Watch AirPods TV & Home iPod touch AirTag Accessories Gift Cards </ul> </div></div><div class="as-globalfooter-directory-column"> <div class="as-globalfooter-directory-column-section"> <h3 class="as-globalfooter-directory-column-section-title">Services</h3> <ul class="as-globalfooter-directory-column-section-list"> Apple Music Apple TV+ Apple Fitness+ Apple News+ Apple Arcade iCloud Apple One Apple Card Apple Books Apple Podcasts App Store </ul> </div><div class="as-globalfooter-directory-column-section"> <h3 class="as-globalfooter-directory-column-section-title">Account</h3> <ul class="as-globalfooter-directory-column-section-list"> Manage Your Apple ID Apple Store Account iCloud.com </ul> </div></div><div class="as-globalfooter-directory-column"> <div class="as-globalfooter-directory-column-section"> <h3 class="as-globalfooter-directory-column-section-title">Apple Store</h3> <ul class="as-globalfooter-directory-column-section-list"> Find a Store Genius Bar Today at Apple Apple Camp Apple Store App Refurbished and Clearance Financing Apple Trade In Order Status Shopping Help </ul> </div></div><div class="as-globalfooter-directory-column"> <div class="as-globalfooter-directory-column-section"> <h3 class="as-globalfooter-directory-column-section-title">For Business</h3> <ul class="as-globalfooter-directory-column-section-list"> Apple and Business Shop for Business </ul> </div><div 
class="as-globalfooter-directory-column-section"> <h3 class="as-globalfooter-directory-column-section-title">For Education</h3> <ul class="as-globalfooter-directory-column-section-list"> Apple and Education Shop for K-12 Shop for College </ul> </div><div class="as-globalfooter-directory-column-section"> <h3 class="as-globalfooter-directory-column-section-title">For Healthcare</h3> <ul class="as-globalfooter-directory-column-section-list"> Apple in Healthcare Health on Apple Watch Health Records on iPhone </ul> </div><div class="as-globalfooter-directory-column-section"> <h3 class="as-globalfooter-directory-column-section-title">For Government</h3> <ul class="as-globalfooter-directory-column-section-list"> Shop for Government Shop for Veterans and Military </ul> </div></div><div class="as-globalfooter-directory-column"> <div class="as-globalfooter-directory-column-section"> <h3 class="as-globalfooter-directory-column-section-title">Apple Values</h3> <ul class="as-globalfooter-directory-column-section-list"> Accessibility Education Environment Inclusion and Diversity Privacy <a href="/racial-equity-justice-initiative/">Racial Equity
and Justice</a> Supplier Responsibility </ul> </div><div class="as-globalfooter-directory-column-section"> <h3 class="as-globalfooter-directory-column-section-title">About Apple</h3> <ul class="as-globalfooter-directory-column-section-list"> Newsroom Apple Leadership Career Opportunities Investors Ethics & Compliance Events Contact Apple </ul> </div></div></nav> <div class="as-globalfooter-mini"> <div class="as-globalfooter-mini-shop">More ways to shop:
Find an Apple Store or other retailer near you. <span>Or call 1-800-MY-APPLE.</span> </div><div class="as-globalfooter-mini-locale"> <a class="as-globalfooter-mini-locale-link" href="/choose-country-region/" title="Choose your country or region" aria-label="United States. Choose your country or region" data-analytics-title="choose your country">United States</a> </div><p class="as-globalfooter-mini-legal-copyright">Copyright © 2022 Apple Inc. All rights reserved. </p><a class="as-globalfooter-mini-legal-link" href="/legal/privacy/">Privacy Policy </a> <a class="as-globalfooter-mini-legal-link" href="/legal/internet-services/terms/site.html">Terms of Use </a> <a class="as-globalfooter-mini-legal-link" href="/us/shop/goto/help/sales_refunds">Sales
and Refunds </a> <a class="as-globalfooter-mini-legal-link" href="/legal/">Legal </a> <a class="as-globalfooter-mini-legal-link" href="/sitemap/">Site Map </a> </div></div></footer> <script src="https://www.apple.com/v/errors/c/built/scripts/main.built.js" type="text/javascript" charset="utf-8"></script></body></html>
First of all, Playwright already has a full suite of selectors that work on the live page, so to eliminate a dependency, speed up your scrape, use less code and avoid weird errors when the static HTML snapshot gets out of sync with the live page, I suggest skipping BS.
On to the main problem: you did well to print the HTML to see what sort of response you're dealing with. The 404 page indicates you've been detected as a bot when running headlessly. The same detection can also show up as a captcha, a Cloudflare browser check page, or some other "are you a robot?" notice.
As with everything in scraping, there's no one-size-fits-all solution, but one typical approach is to set a custom user agent string:
from playwright.sync_api import sync_playwright
with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    ua = (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/69.0.3497.100 Safari/537.36"
    )
    url = (
        "https://www.apple.com/br/shop/product/MV7N2BE/A/airpods-com-estojo-de-recarga"
    )
    page = browser.new_page(user_agent=ua)
    page.goto(url, wait_until="domcontentloaded")
    sel = "span.as-price-installments:last-child"
    text = (
        page.wait_for_selector(sel)
        .text_content()
        .replace("à vista (10% de desconto)", "")
        .strip()
    )
    print(text) # => R$ 1.399,50
    browser.close()
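Since a bot-detection response can look like an ordinary page, it can help to check what came back before selecting elements, so the failure is a clear message rather than an IndexError. Below is a rough sketch using only the standard library. page_title and looks_blocked are hypothetical helper names, and the marker strings are assumptions: "Page Not Found" comes from the HTML printed in the question, "Access Denied" is just an illustrative extra.

```python
from html.parser import HTMLParser


class _TitleParser(HTMLParser):
    """Collects the text inside the <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def page_title(html):
    parser = _TitleParser()
    parser.feed(html)
    return parser.title.strip()


def looks_blocked(html, markers=("Page Not Found", "Access Denied")):
    """Return True if the page title contains one of the (assumed) block markers."""
    title = page_title(html).lower()
    return any(marker.lower() in title for marker in markers)


sample = "<!DOCTYPE html><html><head> <title>Page Not Found - Apple</title></head><body></body></html>"
print(page_title(sample))
print(looks_blocked(sample))
```

In the scraper, calling looks_blocked(page.content()) right after page.goto(...) would let you retry with a different user agent or stop with a readable error.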

How to set header "Access-Control-Allow-Origin" to selenium Webdriver

I am a beginner with Python Selenium.
I'd appreciate any help.
My Selenium code:
from selenium import webdriver
user_agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) ' \
             'Chrome/80.0.3987.132 Safari/537.36'
chrome_option = webdriver.ChromeOptions()
chrome_option.add_argument('--no-sandbox')
chrome_option.add_argument('--disable-dev-shm-usage')
chrome_option.add_argument('--ignore-certificate-errors')
chrome_option.add_argument("--disable-blink-features=AutomationControlled")
chrome_option.add_argument(f'user-agent={user_agent}')
chrome_option.headless = True
driver = webdriver.Chrome(options = chrome_option)
driver.get("https://apps.apple.com/us/app/tiktok-make-your-day/id835599320#see-all/reviews")
I got the following error message:
[0708/203913.943:INFO:CONSOLE(0)] "Access to font at 'https://www.apple.com/ac/globalfooter/3/en_US/assets/ac-footer/legacy/appleicons_text.woff' from origin 'https://apps.apple.com' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.", source: https://apps.apple.com/us/app/tiktok-make-your-day/id835599320#see-all/reviews (0)
[0708/203914.012:INFO:CONSOLE(8199)] "Metrics config: No config provided via delegate or fetched via init(), using default/cached config values.", source: https://apps.apple.com/assets/vendor-5715c00de8dadd4a8dd6d176ecd12d82.js (8199)
[0708/203914.022:INFO:CONSOLE(8199)] "Metrics config: No config provided via delegate or fetched via init(), using default/cached config values.", source: https://apps.apple.com/assets/vendor-5715c00de8dadd4a8dd6d176ecd12d82.js (8199)
[0708/203914.027:INFO:CONSOLE(6946)] "ember-i18n has been deprecated in favor of ember-intl", source: https://apps.apple.com/assets/vendor-5715c00de8dadd4a8dd6d176ecd12d82.js (6946)
[0708/203915.024:INFO:CONSOLE(0)] "Access to font at 'https://www.apple.com/ac/globalfooter/3/en_US/assets/ac-footer/legacy/appleicons_text.ttf' from origin 'https://apps.apple.com' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.", source: https://apps.apple.com/us/app/tiktok-make-your-day/id835599320#see-all/reviews (0)
I look forward to hearing from you.
Regards!
It isn't clear from your question why you need the following argument:
--disable-blink-features=AutomationControlled
With a simple tweak, I can access the URL successfully as follows:
Code Block:
from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
driver.get('https://apps.apple.com/us/app/tiktok-make-your-day/id835599320#see-all/reviews')
print(driver.page_source)
Console Output:
<html lang="en-us" prefix="og: http://ogp.me/ns#" xml:lang="en-us"><head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1, viewport-fit=cover">
.
.
<meta name="description" content="‎Read reviews, compare customer ratings, see screenshots, and learn more about TikTok - Make Your Day. Download TikTok - Make Your Day and enjoy it on your iPhone, iPad, and iPod touch." id="ember15042807" class="ember-view">
<meta name="keywords" content="TikTok - Make Your Day, TikTok Inc., Entertainment, Photo & Video, ios apps, app, appstore, app store, iphone, ipad, ipod touch, itouch, itunes" id="ember15042809" class="ember-view">
<meta property="og:title" content="‎TikTok - Make Your Day" id="ember15042811" class="ember-view">
<meta property="og:description" content="‎TikTok is THE destination for mobile videos. On TikTok, short-form videos are exciting, spontaneous, and genuine. Whether you’re a sports fanatic, a pet enthusiast, or just looking for a laugh, there’s something for everyone on TikTok. All you have to do is watch, engage with what you like, skip wha…" id="ember15042813" class="ember-view">
<meta property="og:site_name" content="App Store" id="ember15042815" class="ember-view">
<meta property="og:url" content="https://apps.apple.com/us/app/tiktok-make-your-day/id835599320" id="ember15042817" class="ember-view">
<meta property="og:image" content="https://is2-ssl.mzstatic.com/image/thumb/Purple124/v4/b7/54/52/b7545217-dbd1-c219-ce00-00189f3b739e/AppIcon_TikTok-0-0-1x_U007emarketing-0-0-0-7-0-0-sRGB-0-0-0-GLES2_U002c0-512MB-85-220-0-0.png/1200x630wa.png" id="ember15042819" class="ember-view">
<meta property="og:image:alt" content="TikTok - Make Your Day on the App Store" id="ember15042821" class="ember-view">
<meta property="og:image:secure_url" content="https://is2-ssl.mzstatic.com/image/thumb/Purple124/v4/b7/54/52/b7545217-dbd1-c219-ce00-00189f3b739e/AppIcon_TikTok-0-0-1x_U007emarketing-0-0-0-7-0-0-sRGB-0-0-0-GLES2_U002c0-512MB-85-220-0-0.png/1200x630wa.png" id="ember15042823" class="ember-view">
<meta property="og:image:type" content="image/png" id="ember15042825" class="ember-view">
<meta property="og:image:width" content="1200" id="ember15042827" class="ember-view">
<meta property="og:image:height" content="630" id="ember15042829" class="ember-view">
<meta property="og:type" content="website" id="ember15042831" class="ember-view">
<meta property="og:locale" content="en_US" id="ember15042833" class="ember-view">
<meta property="fb:app_id" content="116556461780510" id="ember15042835" class="ember-view">
<meta name="twitter:title" content="‎TikTok - Make Your Day" id="ember15042837" class="ember-view">
<meta name="twitter:description" content="‎TikTok is THE destination for mobile videos. On TikTok, short-form videos are exciting, spontaneous, and genuine. Whether you’re a sports fanatic, a pet enthusiast, or just looking for a laugh, there’s something for everyone on TikTok. All you have to do is watch, engage with what you like, skip wha…" id="ember15042839" class="ember-view">
<meta name="twitter:site" content="#AppStore" id="ember15042841" class="ember-view">
<meta name="twitter:domain" content="AppStore" id="ember15042843" class="ember-view">
<meta name="twitter:image" content="https://is2-ssl.mzstatic.com/image/thumb/Purple124/v4/b7/54/52/b7545217-dbd1-c219-ce00-00189f3b739e/AppIcon_TikTok-0-0-1x_U007emarketing-0-0-0-7-0-0-sRGB-0-0-0-GLES2_U002c0-512MB-85-220-0-0.png/600x600wa.png" id="ember15042845" class="ember-view">
<meta name="twitter:image:alt" content="TikTok - Make Your Day on the App Store" id="ember15042847" class="ember-view">
<meta name="twitter:card" content="summary_large_image" id="ember15042849" class="ember-view">
<meta name="apple-itunes-app" content="app-id=375380948, app-argument=https://apps.apple.com/us/app/tiktok-make-your-day/id835599320" id="ember15042851" class="ember-view">
<script name="schema:software-application" id="ember15042853" class="ember-view" type="application/ld+json">{"#context":"http://schema.org","#type":"SoftwareApplication","name":"TikTok - Make Your Day","description":"TikTok is THE destination for mobile videos. On TikTok, short-form videos are exciting, spontaneous, and genuine. Whether you’re a sports fanatic, a pet enthusiast, or just looking for a laugh, there’s something for everyone on TikTok. All you have to do is watch, engage with what you like, skip what you don’t, and you’ll find an endless stream of short videos that feel personalized just for you. From your morning coffee to your afternoon errands, TikTok has the videos that are guaranteed to make your day.\n\nWe make it easy for you to discover and create your own original videos by providing easy-to-use tools to view and capture your daily moments. Take your videos to the next level with special effects, filters, music, and more. \n\n■ Watch endless amount of videos customized specifically for you\nA personalized video feed based on what you watch, like, and share. TikTok offers you real, interesting, and fun videos that will make your day.\n \n■ Explore videos, just one scroll away\nWatch all types of videos, from Comedy, Gaming, DIY, Food, Sports, Memes, and Pets, to Oddly Satisfying, ASMR, and everything in between.\n \n■ Pause recording multiple times in one video\nPause and resume your video with just a tap. Shoot as many times as you need.\n \n■ Be entertained and inspired by a global community of creators\nMillions of creators are on TikTok showcasing their incredible skills and everyday life. Let yourself be inspired.\n\n■ Add your favorite music or sound to your videos for free\nEasily edit your videos with millions of free music clips and sounds. 
We curate music and sound playlists for you with the hottest tracks in every genre, including Hip Hop, Edm, Pop, Rock, Rap, and Country, and the most viral original sounds.\n\n■ Express yourself with creative effects\nUnlock tons of filters, effects, and AR objects to take your videos to the next level.\n\n■ Edit your own videos \nOur integrated editing tools allow you to easily trim, cut, merge and duplicate video clips without leaving the app.\n\n* Any feedback? Contact us at feedback#tiktok.com or tweet us #tiktok_us","screenshot":["https://is3-ssl.mzstatic.com/image/thumb/Purple113/v4/ab/10/3d/ab103d22-efca-509b-312a-f2ff2feb819d/pr_source.png/300x0w.jpg","https://is3-ssl.mzstatic.com/image/thumb/Purple113/v4/d3/30/6b/d3306b31-f0e1-fed0-683c-44899a9c0b5d/pr_source.png/300x0w.jpg","https://is2-ssl.mzstatic.com/image/thumb/Purple123/v4/03/88/18/038818fa-44c6-9289-f9b7-fc1d67273fde/pr_source.png/300x0w.jpg","https://is5-ssl.mzstatic.com/image/thumb/Purple123/v4/67/ee/ad/67eead34-780b-5aae-4be7-574ccdd4010a/pr_source.png/300x0w.jpg","https://is5-ssl.mzstatic.com/image/thumb/Purple123/v4/71/6f/da/716fdafc-b9ef-856f-5ce1-5ecf5b05a815/pr_source.png/300x0w.jpg","https://is4-ssl.mzstatic.com/image/thumb/Purple113/v4/98/7a/12/987a1281-b961-75a0-8da7-1e123bbada56/pr_source.jpg/643x0w.jpg","https://is4-ssl.mzstatic.com/image/thumb/Purple113/v4/27/3f/cd/273fcda9-05c6-3d16-1e7d-465cf7d5aa25/pr_source.jpg/643x0w.jpg","https://is3-ssl.mzstatic.com/image/thumb/Purple113/v4/c3/e6/7b/c3e67b5b-9159-50e9-9b62-558aa32cd605/pr_source.jpg/643x0w.jpg","https://is3-ssl.mzstatic.com/image/thumb/Purple113/v4/6c/ff/97/6cff975c-dcb6-253c-a63b-f30c7d465efc/pr_source.jpg/643x0w.jpg","https://is2-ssl.mzstatic.com/image/thumb/Purple123/v4/12/3a/bf/123abfb8-fda2-85bb-6419-a58eefd2af7f/pr_source.jpg/643x0w.jpg"],"image":"https://is2-ssl.mzstatic.com/image/thumb/Purple124/v4/b7/54/52/b7545217-dbd1-c219-ce00-00189f3b739e/AppIcon_TikTok-0-0-1x_U007emarketing-0-0-0-7-0-0-sRGB-0-0-0-GLES2_U002c0-512MB-
85-220-0-0.png/1200x630wa.png","applicationCategory":"Entertainment","datePublished":"2014年4月1日","operatingSystem":"Requires iOS 9.3 or later. Compatible with iPhone, iPad, and iPod touch.","author":{"#type":"Person","name":"TikTok Inc.","url":"https://apps.apple.com/us/developer/tiktok-inc/id1039610913"},"aggregateRating":{"#type":"AggregateRating","ratingValue":4.7,"reviewCount":5397820},"offers":{"#type":"Offer","price":0,"priceCurrency":"USD","category":"free"}}
</script>
<meta name="apple:content_id" content="835599320" id="ember15042855" class="ember-view">
<meta name="ember-cli-head-end" content="">
<link rel="stylesheet" type="text/css" href="/global-elements/2014.4.0/en_US/ac-global-nav.d915b46b2869cd416cbafe206ca74838.css" data-global-elements-nav-styles="">
<link rel="stylesheet" type="text/css" href="/global-elements/2014.4.0/en_US/ac-global-footer.23e044b4f5b5dd393dc9767d96faf248.css" data-global-elements-footer-styles="">
<meta name="version" content="2026.2.0">
<link integrity="" rel="stylesheet" href="/assets/web-experience-app-f1c50a018dab4bdb8c454bab56ac60a1.css" data-rtl="/assets/web-experience-rtl-app-bd50312cdbe2a58d742a5dbf295ff43a.css">
<script charset="utf-8" src="https://apps.apple.com/assets/chunk.d1a0e2a45b401b141683.js"></script></head>
<body class="ember-application has-js no-touch">
<div id="ember-app">
<script type="x/boundary" id="fastboot-body-start"></script><aside id="ac-gn-segmentbar" class="ac-gn-segmentbar" lang="en-US" dir="ltr" data-strings="{ 'exit': 'Exit', 'view': '{%STOREFRONT%} Store Home', 'segments': { 'smb': 'Business Store Home', 'eduInd': 'Education Store Home', 'other': 'Store Home' } }">
</aside>
<input type="checkbox" id="ac-gn-menustate" class="ac-gn-menustate">
<nav id="ac-globalnav" class="js no-touch windows" role="navigation" aria-label="Global" data-hires="false" data-analytics-region="global nav" lang="en-US" dir="ltr" data-www-domain="www.apple.com" data-store-locale="us" data-store-root-path="/us" data-store-api="https://www.apple.com/[storefront]/shop/bag/status" data-search-locale="en_US" data-search-suggestions-api="https://www.apple.com/search-services/suggestions/" data-search-defaultlinks-api="https://www.apple.com/search-services/suggestions/defaultlinks/">
<div class="ac-gn-content">
<ul class="ac-gn-header">
<li class="ac-gn-item ac-gn-menuicon">
<label class="ac-gn-menuicon-label" for="ac-gn-menustate" aria-hidden="true">
<span class="ac-gn-menuicon-bread ac-gn-menuicon-bread-top">
<span class="ac-gn-menuicon-bread-crust ac-gn-menuicon-bread-crust-top"></span>
</span>
<span class="ac-gn-menuicon-bread ac-gn-menuicon-bread-bottom">
<span class="ac-gn-menuicon-bread-crust ac-gn-menuicon-bread-crust-bottom"></span>
</span>
</label>
<a href="#ac-gn-menustate" role="button" class="ac-gn-menuanchor ac-gn-menuanchor-open" id="ac-gn-menuanchor-open">
<span class="ac-gn-menuanchor-label">Global Nav Open Menu</span>
</a>
<a href="#" role="button" class="ac-gn-menuanchor ac-gn-menuanchor-close" id="ac-gn-menuanchor-close">
<span class="ac-gn-menuanchor-label">Global Nav Close Menu</span>
</a>
</li>
<li class="ac-gn-item ac-gn-apple">
<a class="ac-gn-link ac-gn-link-apple" href="https://www.apple.com/" data-analytics-title="apple home" id="ac-gn-firstfocus-small">
<span class="ac-gn-link-text">Apple</span>
</a>
</li>
<li class="ac-gn-item ac-gn-bag ac-gn-bag-small" id="ac-gn-bag-small">
<div class="ac-gn-bag-wrapper">
<a class="ac-gn-link ac-gn-link-bag" href="https://www.apple.com/us/shop/goto/bag" data-analytics-title="bag" data-analytics-click="bag" aria-label="Shopping Bag" data-string-badge="Shopping Bag with item count :">
<span class="ac-gn-link-text">Shopping Bag</span>
</a>
<span class="ac-gn-bag-badge">
<span class="ac-gn-bag-badge-separator"></span>
<span class="ac-gn-bag-badge-number"></span>
<span class="ac-gn-bag-badge-unit">+</span>
</span>
</div>
<span class="ac-gn-bagview-caret ac-gn-bagview-caret-large"></span>
</li>
</ul>
<div class="ac-gn-search-placeholder-container" role="search">
<div class="ac-gn-search ac-gn-search-small">
<a id="ac-gn-link-search-small" class="ac-gn-link" href="https://www.apple.com/us/search" data-analytics-title="search" data-analytics-click="search" data-analytics-intrapage-link="" aria-label="Search apple.com" role="button" aria-haspopup="true">
<div class="ac-gn-search-placeholder-bar">
<div class="ac-gn-search-placeholder-input">
<div class="ac-gn-search-placeholder-input-text" aria-hidden="true">
<div class="ac-gn-link-search ac-gn-search-placeholder-input-icon"></div>
<span class="ac-gn-search-placeholder">Search apple.com</span>
</div>
</div>
<div class="ac-gn-searchview-close ac-gn-searchview-close-small ac-gn-search-placeholder-searchview-close">
<span class="ac-gn-searchview-close-cancel" aria-hidden="true">Cancel</span>
</div>
</div>
</a>
</div>
</div>
<ul class="ac-gn-list">
<li class="ac-gn-item ac-gn-apple">
<a class="ac-gn-link ac-gn-link-apple" href="https://www.apple.com/" data-analytics-title="apple home" id="ac-gn-firstfocus">
<span class="ac-gn-link-text">Apple</span>
</a>
</li>
<li class="ac-gn-item ac-gn-item-menu ac-gn-mac">
<a class="ac-gn-link ac-gn-link-mac" href="https://www.apple.com/mac/" data-analytics-title="mac">
<span class="ac-gn-link-text">Mac</span>
</a>
</li>
<li class="ac-gn-item ac-gn-item-menu ac-gn-ipad">
<a class="ac-gn-link ac-gn-link-ipad" href="https://www.apple.com/ipad/" data-analytics-title="ipad">
<span class="ac-gn-link-text">iPad</span>
</a>
</li>
<li class="ac-gn-item ac-gn-item-menu ac-gn-iphone">
<a class="ac-gn-link ac-gn-link-iphone" href="https://www.apple.com/iphone/" data-analytics-title="iphone">
<span class="ac-gn-link-text">iPhone</span>
</a>
</li>
<li class="ac-gn-item ac-gn-item-menu ac-gn-watch">
<a class="ac-gn-link ac-gn-link-watch" href="https://www.apple.com/watch/" data-analytics-title="watch">
<span class="ac-gn-link-text">Watch</span>
</a>
</li>
<li class="ac-gn-item ac-gn-item-menu ac-gn-tv">
<a class="ac-gn-link ac-gn-link-tv" href="https://www.apple.com/tv/" data-analytics-title="tv">
<span class="ac-gn-link-text">TV</span>
</a>
</li>
<li class="ac-gn-item ac-gn-item-menu ac-gn-music">
<a class="ac-gn-link ac-gn-link-music" href="https://www.apple.com/music/" data-analytics-title="music">
<span class="ac-gn-link-text">Music</span>
</a>
</li>
<li class="ac-gn-item ac-gn-item-menu ac-gn-support">
<a class="ac-gn-link ac-gn-link-support" href="https://support.apple.com" data-analytics-title="support">
<span class="ac-gn-link-text">Support</span>
</a>
</li>
<li class="ac-gn-item ac-gn-item-menu ac-gn-search" role="search">
<a id="ac-gn-link-search" class="ac-gn-link ac-gn-link-search" href="https://www.apple.com/us/search" data-analytics-title="search" data-analytics-click="search" data-analytics-intrapage-link="" aria-label="Search apple.com" role="button" aria-haspopup="true"></a>
</li>
<li class="ac-gn-item ac-gn-bag" id="ac-gn-bag">
<div class="ac-gn-bag-wrapper">
<a class="ac-gn-link ac-gn-link-bag" href="https://www.apple.com/us/shop/goto/bag" data-analytics-title="bag" data-analytics-click="bag" aria-label="Shopping Bag" data-string-badge="Shopping Bag with item count : {%BAGITEMCOUNT%}">
<span class="ac-gn-link-text">Shopping Bag</span>
</a>
<span class="ac-gn-bag-badge" aria-hidden="true">
<span class="ac-gn-bag-badge-separator"></span>
<span class="ac-gn-bag-badge-number"></span>
<span class="ac-gn-bag-badge-unit">+</span>
</span>
</div>
<span class="ac-gn-bagview-caret ac-gn-bagview-caret-large"></span>
</li>
</ul>
.
.
.
</div><footer id="ac-globalfooter" class="js flexbox" role="contentinfo" lang="en-US" dir="ltr"><div class="ac-gf-content"><section class="ac-gf-footer">
<div class="ac-gf-footer-shop" x-ms-format-detection="none">
More ways to shop: Find an Apple Store or other retailer near you. <span class="nowrap">Or call 1-800-MY-APPLE.</span>
</div>
<div class="ac-gf-footer-locale">
<a class="ac-gf-footer-locale-link" href="https://www.apple.com/choose-country-region/" title="Choose your country or region" aria-label="Choose your country or region" data-analytics-title="choose your country"><span class="ac-gf-footer-locale-flag" data-hires="false"></span>Choose your country or region</a>
</div>
<div class="ac-gf-footer-legal">
<div class="ac-gf-footer-legal-copyright">Copyright © 2020 Apple Inc. All rights reserved.</div>
<div class="ac-gf-footer-legal-links">
<a class="ac-gf-footer-legal-link" href="https://www.apple.com/legal/privacy/" data-analytics-title="privacy policy">Privacy Policy</a>
<a class="ac-gf-footer-legal-link" href="https://www.apple.com/legal/internet-services/terms/site.html" data-analytics-title="terms of use">Terms of Use</a>
<a class="ac-gf-footer-legal-link" href="https://www.apple.com/us/shop/goto/help/sales_refunds" data-analytics-title="sales and refunds">Sales and Refunds</a>
<a class="ac-gf-footer-legal-link" href="https://www.apple.com/legal/" data-analytics-title="legal">Legal</a>
<a class="ac-gf-footer-legal-link" href="https://www.apple.com/sitemap/" data-analytics-title="site map">Site Map</a>
</div>
</div>
</section>
</div></footer>
.
.
.
</div>
<div id="modal-container"></div>
<script integrity="" src="/assets/vendor-5715c00de8dadd4a8dd6d176ecd12d82.js"></script><div id="ac-gn-viewport-emitter"> </div>
<script integrity="" src="/assets/web-experience-app-26d37fb2d982f3cfebb0a2498926aa6e.js"></script>
<script src="https://js-cdn.music.apple.com/-amp/v2/musickit.js"></script>
<div id="ember-basic-dropdown-wormhole"></div>
</body></html>

Error as I'm switching to peewee for flask app. 'peewee.IntegerField object' has no attribute 'flags'

I began switching from standard SQL in my Flask app to peewee and am getting a weird bug I can't seem to find any info about. My endpoints are working fine, but when I go to the landing page I get "jinja2.exceptions.UndefinedError: 'peewee.IntegerField object' has no attribute 'flags'"
This seems like some weird interaction between wtforms and peewee, but I can't seem to find similar issues. Thanks in advance.
Note everything is in one file
My Models:
class pipelineForm(FlaskForm):
    pipeline = IntegerField('Pipeline ID')
class Process(Model):
    pipeline_id = IntegerField()
    process_name= CharField(null = True)
    log= CharField(null = True)
    exit_code= IntegerField()
    started= CharField(null = True)
    finsihed= CharField(null = True)
    class Meta:
        database=db
End Point for Landing Page:
@app.route('/bloodhound', methods=['GET','POST'])
def index():
    form = pipelineForm()
    print(form.errors)
    if form.validate_on_submit():
        print(str(form.pipeline.data))
        return redirect(url_for('.display', pipelineId=form.pipeline.data))  # '<h1>' + str(form.pipeline.data) + '</h1>'
    return render_template('index.html',form=form)
Landing Page:
{% extends "bootstrap/base.html" %} {%import "bootstrap/wtf.html" as wtf%} {% block content %}
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags -->
<meta name="description" content="">
<meta name="author" content="">
<link rel="icon" href="../../favicon.ico">
<title>Narrow Jumbotron Template for Bootstrap</title>
<!-- Bootstrap core CSS -->
<link href="../../dist/css/bootstrap.min.css" rel="stylesheet">
<!-- IE10 viewport hack for Surface/desktop Windows 8 bug -->
<link href="../../assets/css/ie10-viewport-bug-workaround.css" rel="stylesheet">
<!-- Custom styles for this template -->
<link href="jumbotron-narrow.css" rel="stylesheet">
<!-- Just for debugging purposes. Don't actually copy these 2 lines! -->
<!--[if lt IE 9]><script src="../../assets/js/ie8-responsive-file-warning.js"></script><![endif]-->
<script src="../../assets/js/ie-emulation-modes-warning.js"></script>
<!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries -->
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
</head>
<nav class="navbar navbar-inverse">
<div class="container-fluid">
<div class="navbar-header">
<a class="navbar-brand" href="/bloodhound">Bloodhound</a>
</div>
<ul class="nav navbar-nav">
<li class="active">Home</li>
<li>Performance</li>
</ul>
</div>
</nav>
<body>
<div class="container">
<div class="jumbotron">
<h1>Welcome to Bloodhound!</h1>
<p class="lead">Enter a pipeline Id to get diagnostic information.</p>
<form class="input" method="POST" action="/bloodhound">
<div class="input-group" style="width: 300px;">
{{form.hidden_tag()}} {{wtf.form_field(form.pipeline)}}
<span class="input-group-btn" style="vertical-align: bottom;">
<button class="btn btn-default" type="submit" >Go!</button>
</span>
</div>
</form>
</div>
<footer class="footer">
<p>© 2017 MITRE.</p>
</footer>
</div>
<!-- /container -->
<!-- IE10 viewport hack for Surface/desktop Windows 8 bug -->
<script src="../../assets/js/ie10-viewport-bug-workaround.js"></script>
</body>
</html>
{% endblock %}
Full Stack Trace
Traceback (most recent call last):
  File "/home/patrick/enviroments/venv/lib/python3.5/site-packages/flask/app.py", line 1997, in __call__
    return self.wsgi_app(environ, start_response)
  File "/home/patrick/enviroments/venv/lib/python3.5/site-packages/flask/app.py", line 1985, in wsgi_app
    response = self.handle_exception(e)
  File "/home/patrick/enviroments/venv/lib/python3.5/site-packages/flask/app.py", line 1540, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/home/patrick/enviroments/venv/lib/python3.5/site-packages/flask/_compat.py", line 33, in reraise
    raise value
  File "/home/patrick/enviroments/venv/lib/python3.5/site-packages/flask/app.py", line 1982, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/patrick/enviroments/venv/lib/python3.5/site-packages/flask/app.py", line 1614, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/patrick/enviroments/venv/lib/python3.5/site-packages/flask/app.py", line 1517, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/patrick/enviroments/venv/lib/python3.5/site-packages/flask/_compat.py", line 33, in reraise
    raise value
  File "/home/patrick/enviroments/venv/lib/python3.5/site-packages/flask/app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/patrick/enviroments/venv/lib/python3.5/site-packages/flask/app.py", line 1598, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/patrick/process_tracker/api/tracker_api.py", line 201, in index
    return render_template('index.html',form=form)
  File "/home/patrick/enviroments/venv/lib/python3.5/site-packages/flask/templating.py", line 134, in render_template
    context, ctx.app)
  File "/home/patrick/enviroments/venv/lib/python3.5/site-packages/flask/templating.py", line 116, in _render
    rv = template.render(context)
  File "/home/patrick/enviroments/venv/lib/python3.5/site-packages/jinja2/environment.py", line 1008, in render
    return self.environment.handle_exception(exc_info, True)
  File "/home/patrick/enviroments/venv/lib/python3.5/site-packages/jinja2/environment.py", line 780, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/home/patrick/enviroments/venv/lib/python3.5/site-packages/jinja2/_compat.py", line 37, in reraise
    raise value.with_traceback(tb)
 File "/home/patrick/process_tracker/api/templates/index.html", line 1, in top-level template code
    {% extends "bootstrap/base.html" %} {%import "bootstrap/wtf.html" as wtf%} {% block content %}
  File "/home/patrick/enviroments/venv/lib/python3.5/site-packages/flask_bootstrap/templates/bootstrap/base.html", line 1, in top-level template code
    {% block doc -%}
  File "/home/patrick/enviroments/venv/lib/python3.5/site-packages/flask_bootstrap/templates/bootstrap/base.html", line 4, in block "doc"
    {%- block html %}
  File "/home/patrick/enviroments/venv/lib/python3.5/site-packages/flask_bootstrap/templates/bootstrap/base.html", line 20, in block "html"
    {% block body -%}
  File "/home/patrick/enviroments/venv/lib/python3.5/site-packages/flask_bootstrap/templates/bootstrap/base.html", line 23, in block "body"
    {% block content -%}
  File "/home/patrick/process_tracker/api/templates/index.html", line 58, in block "content"
    <!-- {{form.hidden_tag()}} {{wtf.form_field(form.pipeline)}} -->
  File "/home/patrick/enviroments/venv/lib/python3.5/site-packages/jinja2/runtime.py", line 553, in _invoke
    rv = self._func(*arguments)
  File "/home/patrick/enviroments/venv/lib/python3.5/site-packages/flask_bootstrap/templates/bootstrap/wtf.html", line 36, in template
    {% if field.flags.required and not required in kwargs %}
  File "/home/patrick/enviroments/venv/lib/python3.5/site-packages/jinja2/environment.py", line 430, in getattr
    return getattr(obj, attribute)
According to your error message, this field
class pipelineForm(FlaskForm):
    pipeline = IntegerField('Pipeline ID')
is of type peewee.IntegerField, but you want it to be the IntegerField type from WTForms. If you have both classes in the same file, you need to qualify the names:
import peewee
from wtforms import fields
class pipelineForm(FlaskForm):
    pipeline = fields.IntegerField('Pipeline ID')
class Process(peewee.Model):
    pipeline_id = peewee.IntegerField() # and so on
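To illustrate why the clash happens, here is a minimal sketch that needs neither library installed. The two namespaces and their classes are hypothetical stand-ins for wtforms and peewee, not the real APIs, and the import order in the comments is only an assumed example. The point is that a later binding of the same name silently replaces the earlier one, and that qualified names avoid this.

```python
from types import SimpleNamespace


class FormIntegerField:
    """Stand-in for the WTForms field: templates read `field.flags`."""

    def __init__(self, label=""):
        self.label = label
        self.flags = SimpleNamespace(required=False)


class DbIntegerField:
    """Stand-in for the peewee field: it has no `flags` attribute."""


# Hypothetical stand-ins for the two modules (assumption, not the real packages).
forms_lib = SimpleNamespace(IntegerField=FormIntegerField)
db_lib = SimpleNamespace(IntegerField=DbIntegerField)

# Comparable to importing IntegerField from the form library first and then
# importing the same name from the database library: the second binding wins.
IntegerField = forms_lib.IntegerField
IntegerField = db_lib.IntegerField

shadowed = IntegerField()
print(hasattr(shadowed, "flags"))  # False: a template reading .flags fails here

# Qualified names keep the two classes apart.
form_field = forms_lib.IntegerField("Pipeline ID")
db_field = db_lib.IntegerField()
print(hasattr(form_field, "flags"))  # True
```

In a real single-file app the same effect is achieved by importing the modules themselves and writing the module name in front of each field class.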

Get <span> value from web page and authentication without asking user and password

I have video surveillance software on Windows that offers remote access through the browser, with user/password authentication, on the PC's IP. (I can also view it remotely since I have a public IP.)
I checked the code in the browser UI and want to read these span values (liveCameraCount and totalCameraCount):
I use the code below, which runs but prints:
None
None
ON
while it should print:
2
2
ON
(I've also tried adding a time.sleep() to let the page load, but without success.)
from bs4 import BeautifulSoup
import urllib
import urllib2
import base64
url = "http://my.pc.ip.address:port"
username = "myuser"
password = "mypass"
handle = urllib2.Request(url)
authheader = "Basic %s" % base64.encodestring('%s:%s' % (username,password))
data = urllib.urlopen(url)
soup = BeautifulSoup(data, "html.parser")
cameras = soup.findAll('span')
for span in cameras:
    print span.string
I'm also trying to log in automatically, so that it doesn't ask every time:
Enter username [for my.pc.ip.address:port]:------
Enter password for username at [for my.pc.ip.address:port]:-----
EDIT 1:
OK, that's strange.
If I press F12 I can see the values inside, as in the image posted, but if I press Ctrl+U I see this code (without the values). I don't understand:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<script>
var isMobile = navigator.userAgent.match(/(iPhone|iPod|Android|BlackBerry)/);
if(isMobile){
window.location = '/mobile/';
}
</script>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Sighthound</title>
<link rel="stylesheet" href="/css/sighthound-desktop.css" />
<style type="text/css">
#liveTime{
margin-left: 6px;
}
.views {
margin-right: 6px;
}
</style>
<!-- Favicons and touch icons -->
<!-- For retina-display iPads -->
<link href="/img/apple-touch-icon-xlarge.png" rel="apple-touch-icon- precomposed" sizes="144x144" type="image/png"/>
<!-- For retina-display iPhones -->
<link href="/img/apple-touch-icon-large.png" rel="apple-touch-icon-precomposed" sizes="114x114" type="image/png"/>
<!-- For iPad 1 -->
<link href="/img/apple-touch-icon-medium.png" rel="apple-touch-icon-precomposed" sizes="72x72" type="image/png"/>
<!-- For iPhone 3G, iPod Touch and Android -->
<link href="/img/apple-touch-icon-small.png" rel="apple-touch-icon-precomposed" type="image/png"/>
<!-- For Nokia -->
<link href="/img/apple-touch-icon-small.png" rel="shortcut icon"/>
<!-- For everything else -->
<link href="/img/favicon.png" rel="shortcut icon" type="image/png"/>
<link href="/img/favicon.ico" rel="shortcut icon" type="image/x-icon"/>
</head>
<body>
<script type="text/javascript"> if (!window.console) console = {log: function() {}}; </script>
<script src="/js/jquery-2.0.3.min.js"></script>
<script src="/js/handlebars.js"></script>
<script src="/js/underscore-min.js"></script>
<script src="/js/handlebars-extras.js"></script>
<script src="/js/xmlrpc.js"></script>
<script src="/js/sighthoundxmlrpc.js"></script>
<script src="/js/sighthound.js"></script>
<script src="/js/purl.js"></script>
<script src="/js/moment.min.js"></script>
<script src="/js/camera_display.js"></script>
<script src="/js/all_cameras_control.js"></script>
<script src="/js/common.js"></script>
<div class="container">
<div class="header">
<div class="logo"><img src="img/logo.png" height="80" /></div><!-- End of Logo -->
<div class="pageNav buttonBar">
Cameras
Clips
</div><!-- End of Page Nav -->
<div class="on-off buttonBar">
<a id="allOffButton" href="#" class="button">Off</a>
<a id="allOnButton" href="#" class="button">On</a>
</div><!-- End of Buttons -->
<div class="cameras">
Cameras<br />
<strong> <span id="liveCameraCount"></span> / <span id="totalCameraCount"></span> <span class="expressive">ON</span></strong>
</div><!-- Camera -->
</div><!-- End of Header -->
<div class="content">
<div class="contextMenu">
<div id="liveTime" class="date"></div><!-- End of Date -->
<!--<div class="fullscreen buttonBar">
<img src="img/iconFullscreen.png" width="18" />
</div>--><!-- End of Context Fullscreen -->
<div class="views buttonBar">
<!--<a id="view1up" href="#" class="button viewButton"<img src="img/icon1upBlue.png" /></a>-->
<a id="view2up" href="#" class="button viewButton"><img src="img/icon2upBlue.png" /></a>
<a id="view3up" href="#" class="button viewButton"><img src="img/icon3upBlue.png" /></a>
<a id="view4up" href="#" class="button active viewButton"><img src="img/icon4upWhite.png" /></a>
</div><!-- End of Context Views -->
</div><!-- End of Context -->
<div id="cameraGrid"></div>
<script id="cameraGridTemplate" type="text/x-handlebars-template">
{{#everyNth cameras {(columns)}}}
{{#if isModZeroNotFirst}}
</div>
{{/if}}
{{#if isModZero}}
<div class="videos">
{{/if}}
<div class="video-{(columns)}up cameraVideo">
<a href="live.html?camera={{name}}&live={{live}}&cameraindex={{index}}">
<div class="videoImgContainer"
data-camera_index="{{index}}">
<!-- Image stream content is built by camera_display.js -->
</div>
<div class="cameraTitle">
<span>{{name}}<span>
</div>
</a>
</div>
{{#if isLast}}
</div>
{{/if}}
{{/everyNth}}
</script>
</div><!-- End of Content -->
</div><!-- End of Container -->
<script src="/js/index.js" type="text/javascript"></script>
</body>
</html>
You could try swapping string for text as follows:
data = """<strong>
<span id="liveCameraCount">2</span>
" / "
<span id="totalCameraCount">2</span>
<span class="expressive">ON</span>
</strong>
</div>
"""
from bs4 import BeautifulSoup
soup = BeautifulSoup(data, "html.parser")
for span in soup.findAll('span'):
    print(span.text)
This would display the following for your small example:
2
2
ON
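For your full HTML, the first two <span>s are empty, so empty strings will be printed. The values you see with F12 are filled in by the page's JavaScript after it loads, so they are not part of the source your script downloads.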
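A small check using only the Python 3 standard library can confirm what the downloaded source actually contains. The sketch below is an illustration rather than part of the original answer: the `SpanTextCollector` class and the `STATIC_SOURCE` constant are names introduced here, and the fragment is copied from the Ctrl+U view in the question. It collects the text inside each span that has an id (nested spans are not handled).

```python
from html.parser import HTMLParser

# Fragment of the page source as shown by Ctrl+U (values not yet filled in).
STATIC_SOURCE = (
    '<strong> <span id="liveCameraCount"></span> / '
    '<span id="totalCameraCount"></span> '
    '<span class="expressive">ON</span></strong>'
)


class SpanTextCollector(HTMLParser):
    """Collect the text found inside every <span> that has an id."""

    def __init__(self):
        super().__init__()
        self.texts = {}       # span id -> accumulated text
        self._current = None  # id of the span being read, if any

    def handle_starttag(self, tag, attrs):
        span_id = dict(attrs).get("id")
        if tag == "span" and span_id:
            self._current = span_id
            self.texts[span_id] = ""

    def handle_endtag(self, tag):
        if tag == "span":
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self.texts[self._current] += data


collector = SpanTextCollector()
collector.feed(STATIC_SOURCE)
collector.close()
print(collector.texts)  # {'liveCameraCount': '', 'totalCameraCount': ''}
```

If the dictionary values come back empty, as here, the numbers are not in the downloaded HTML at all, and no choice of parser attribute will recover them.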
