jQuery is not loading in Python Flask app - python

Hello, I don't know why jQuery is not loading in my Flask app, even though I have downloaded jQuery and added it to the js folder:
<!DOCTYPE html>
<html lang="eng">
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<link href="{{url_for('static', filename = 'css/bootstrap.min.css')}}" rel="stylesheet">
<link rel="shortcut icon" href="{{url_for('static', filename = 'myicon.ico' )}}">
<script type=text/javascript src="{{url_for('static', filename='js/jquery.min.js') }}"></script>
<script type="text/javascript" src="{{url_for('static', filename = 'js/bootstrap.min.js')}}"></script>

In the line that references jQuery, type=text/javascript should be type="text/javascript".

Related

PyScript: connecting to a Postgres database?

I'm testing PyScript for the purposes of submitting queries to a Postgres database, creating Pandas dataframes, and rendering some numerical results and visualizations in the browser.
I'm attempting to connect to a Postgres database hosted on Heroku.
My code is as follows:
index.html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<script type="text/javascript" src="https://cdn.bokeh.org/bokeh/release/bokeh-2.4.2.js"></script>
<script type="text/javascript" src="https://cdn.bokeh.org/bokeh/release/bokeh-widgets-2.4.2.min.js"></script>
<script type="text/javascript" src="https://cdn.bokeh.org/bokeh/release/bokeh-tables-2.4.2.min.js"></script>
<script type="text/javascript" src="https://cdn.jsdelivr.net/npm/@holoviz/panel@0.13.1/dist/panel.min.js"></script>
<link rel="stylesheet" href="https://pyscript.net/alpha/pyscript.css" />
<script defer src="https://pyscript.net/alpha/pyscript.js"></script>
<title>PyScript Demo</title>
<py-env>
- pandas
- panel
- psycopg2
</py-env>
</head>
<body>
<div id='output' class="text-3xl text-center"></div>
<button id="retrieve-data" class="mt-4 p-2 text-white bg-blue-600 rounded" pys-onClick="retrieve_data">Get Data</button>
<py-script src="./main.py"></py-script>
</body>
</html>
main.py
import panel as pn
import psycopg2 as pg
pn.extension()
connection = pg.connect(host="ec2-12-345-678-910.compute-1.amazonaws.com",
                        port=5432,
                        database="test_db",
                        user="test_user",
                        password="my_password_here")
var_1 = 'a test string...'
pyscript.write('output', var_1)
def retrieve_data(*args):
    pyscript.write('output', 'Foo')
In the web browser, I see the standard messages:
Loading runtime...
Runtime created...
Initializing components...
However, the page fails to load.
Is it possible to connect to Postgres using PyScript?

how to copy all the code of a URL with python

I want to copy all the code of a URL (http://modelseed.org/biochem/reactions/rxn00001) using Python 3.6, but I can only copy part of the code, and I don't know why.
So far, I have tried the "requests" module
import requests
page = requests.get("http://modelseed.org/biochem/reactions/rxn00001")
print(page.content)
and "urllib"
import urllib.request
site = urllib.request.urlopen("http://modelseed.org/biochem/reactions/rxn00001")
print(site.read())
The part of the code with the "Reaction Details" information, such as "Name", "ID" and "Abbreviation", is missing, but it is visible when I inspect the page in Chrome's developer tools.
The code I'm able to download using the two snippets above is:
<!DOCTYPE html>
<html lang="en" ng-app="ModelSEED">
<head>
<base href="/"/>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type"/>
<meta content="IE=edge" http-equiv="X-UA-Compatible"/>
<meta content="initial-scale=1, maximum-scale=1, user-scalable=no" name="viewport">
<meta content="The ModelSEED is a resource for the reconstruction, exploration, comparison, and analysis of metabolic models." name="description"/>
<link href="/img/ModelSEED-favicon.png?v=2.0" rel="shortcut icon"/>
<meta content="nconrad" name="author"/>
<title>
ModelSEED
</title>
<link href="components/angular-material/angular-material.css" rel="stylesheet"/>
<link href="components/bootstrap/dist/css/bootstrap.min.css" rel="stylesheet"/>
<!-- to be removed -->
<link href="components/font-awesome/css/font-awesome.min.css" rel="stylesheet"/>
<link href="icomoon/style.css" rel="stylesheet"/>
<link href="https://fonts.googleapis.com/icon?family=Material+Icons" rel="stylesheet"/>
<link href="http://fonts.googleapis.com/css?family=Montserrat:400,700" rel="stylesheet" type="text/css"/>
<link href="build/style.css" rel="stylesheet"/>
<!--<script src="https://cdn.socket.io/socket.io-1.3.7.js"></script>-->
<script src="build/site.js">
</script>
<!-- HTML5 Shim and Respond.js IE8 support of HTML5 elements and media queries -->
<!-- WARNING: Respond.js doesn't work if you view the page via file:// -->
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/libs/html5shiv/3.7.0/html5shiv.js"></script>
<script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
<![endif]-->
</meta>
</head>
<body>
<div style="height: 100%;" ui-view="">
</div>
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-67412611-1', 'auto');
ga('send', 'pageview');
</script>
</body>
</html>
Does anyone have a hint as to why the code between <div style="height: 100%;" ui-view=""> and </div> (just after <body> and before <script>) is not downloaded?
Thank you.
It's being inserted by JavaScript, so neither requests nor urllib will find it. You need a browser for this; try Selenium or PhantomJS.
Something like:
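from selenium import webdriver
url = "http://modelseed.org/biochem/reactions/rxn00001"
# Selenium 4.6+ finds a matching driver itself; older releases took the chromedriver path here
driver = webdriver.Chrome()
driver.get(url)
html = driver.page_source  # HTML as currently rendered by the browser
driver.quit()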
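To confirm that the missing section is filled in by JavaScript rather than lost during the download, one option is to check whether the ui-view container in the raw HTML is empty. The following is a minimal sketch using only the standard library; the names UiViewProbe and ui_view_is_empty and the sample string are made up for illustration, and in practice you would pass page.text from requests instead.

```python
from html.parser import HTMLParser


class UiViewProbe(HTMLParser):
    """Collect the text found inside the first element carrying a ui-view attribute."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # > 0 while we are inside the ui-view element
        self.found = False  # True once a ui-view element has been seen
        self.text = []

    def handle_starttag(self, tag, attrs):
        # Note: void tags such as <br> are not special-cased in this sketch.
        if self.depth:
            self.depth += 1
        elif not self.found and any(name == "ui-view" for name, _ in attrs):
            self.found = True
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.text.append(data)


def ui_view_is_empty(html):
    """Return True if the page has a ui-view element with no static content."""
    probe = UiViewProbe()
    probe.feed(html)
    return probe.found and not "".join(probe.text).strip()


sample = '<body><div style="height: 100%;" ui-view="">\n</div><script>ga("send");</script></body>'
print(ui_view_is_empty(sample))
```

If this reports an empty container for the downloaded page, the content is being rendered client-side, and a plain HTTP download will not contain it.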
Try getting this url instead: https://www.patricbrc.org/api/model_reaction/?http_accept=application/json&eq(id,rxn00001)

Download a HTML page locally (+CSS, +images)

I'd like to be able to download an HTML page (let's say this actual question!):
import urllib2  # Python 2; in Python 3 use urllib.request instead
f = urllib2.urlopen('https://stackoverflow.com/questions/33914277')
content = f.read() # soup = BeautifulSoup(content) could be useful?
g = open("mypage.html", 'w')
g.write(content)
g.close()
such that it is displayed the same way locally as online. Currently, here is the (bad) result:
(source: gget.it)
Thus, one needs to download the CSS and modify the HTML itself so that it points to this local CSS file... and the same for images, etc.
How can I do this? (I think there should be something simpler than this answer, which doesn't handle CSS, but how? Is there a library?)
Since browsers allow CSS and image files to be loaded from another origin, your local HTML can still refer to them while they stay on the remote server. The problem is unresolved URIs. In the HTML head section you have something like this:
<head>
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<link rel="stylesheet" type="text/css" href="/assets/8943fcf6/select.css" />
<link href="/css/media.css" rel="stylesheet" type="text/css">
<script type="text/javascript" src="/assets/jquery.yii.js"></script>
<script type="text/javascript" src="/assets/select.js"></script>
</head>
Obviously, /css/media.css is relative to a base address, e.g. http://example.com. To resolve it for a local file, the href value in your local copy of the HTML needs to become http://example.com/css/media.css. So you should parse the HTML and add the base to each such reference in the local copy:
<head>
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<link rel="stylesheet" type="text/css" href="http://example.com/assets/8943fcf6/select.css" />
<link href="http://example.com/css/media.css" rel="stylesheet" type="text/css">
<script type="text/javascript" src="http://example.com/assets/jquery.yii.js"></script>
<script type="text/javascript" src="http://example.com/assets/select.js"></script>
</head>
Use any means for that (JS, PHP...)
Update
Since the local file also contains image references throughout the body section, you'll need to resolve them too.
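One way to do the rewriting in Python is with urllib.parse.urljoin, which resolves a relative reference against a base address. The sketch below is an illustration under stated assumptions, not a complete page saver: absolutize_links is a hypothetical helper, it only handles double-quoted href and src attributes via a regular expression, and http://example.com stands in for the real site.

```python
import re
from urllib.parse import urljoin

# Matches href="..." or src="..." (double-quoted values only, for simplicity).
ATTR_RE = re.compile(r'\b(href|src)="([^"]*)"')


def absolutize_links(html, base_url):
    """Rewrite relative href/src values so they point at base_url."""
    def replace(match):
        attr, value = match.group(1), match.group(2)
        # urljoin leaves values that are already absolute untouched.
        return '%s="%s"' % (attr, urljoin(base_url, value))
    return ATTR_RE.sub(replace, html)


page = '<link href="/css/media.css" rel="stylesheet"><img src="img/logo.png">'
print(absolutize_links(page, "http://example.com/questions/1"))
```

Because the same function covers both href and src, it handles the head section and the image references in the body in one pass. A real HTML parser would be more robust than a regular expression for pages with unusual quoting.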

Obtaining static JSON data in D3 from Flask app

I have a Flask app that I am using to serve a D3 visualization on Heroku. I have the visualization working here: http://wsankey.github.io/dew_mvp/ and the Heroku app here: https://dew.herokuapp.com/. The problem is that I have static json data that is not being picked up by my javascript--there's not even an error being thrown. The map data is 'us.json' and my data is 'dewmvp1.json.'
Here is the relevant piece of the D3 javascript file:
$(document).ready(function() {
    //Reading map file and data
    queue()
        .defer(d3.json, "/static/us.json")
        .defer(d3.json, "/static/dewmvpv1.json")
        .await(ready);
And my app.py file where I thought I might route the data:
from flask import Flask, render_template
app = Flask(__name__)
#Main DEW page
@app.route('/')
def home():
    return render_template("index.html")
#About page
@app.route('/about')
def about():
    return render_template("about.html")
#Sending the us.json file
@app.route('/usmap/')
def usmap():
    return "<a href=%s>file</a>" % url_for('static', filename='us.json')
#Sending our data
@app.route('/data/')
def data():
    return "<a href=%s>file</a>" % url_for('static', filename='dewmvp1.json')
if __name__ == '__main__':
    app.run(debug=True)
And the relevant pieces of the index.html where my script.js is being loaded (to give you an idea of how I'm loading it):
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<link href="http://d3js.org/queue.v1.min.js" type="text/javascript">
<link href="{{url_for('static', filename='topojson.v1.min.js')}}" type="text/javascript">
<link href="{{url_for('static', filename='underscore-1.6.0.min.js')}}" type="text/javascript">
<link href="{{url_for('static', filename='jquery-1.10.2.min.js')}}" type="text/javascript">
<link href="{{url_for('static', filename='jquery-ui-1.10.4.js')}}" type="text/javascript">
<link href='https://fonts.googleapis.com/css?family=Arvo:400,700|PT+Sans+Caption' rel='stylesheet' type='text/css'>
<link rel="stylesheet" media="screen" href="{{url_for('static', filename='jquery-ui-1.10.4.css')}}">
<link rel="stylesheet" media="screen" href="{{url_for('static', filename='grid.css')}}">
<link rel="stylesheet" media="screen" href="{{url_for('static', filename='layout.css')}}">
<link rel="stylesheet" media="screen" href="{{url_for('static', filename='map.css')}}">
<link rel="stylesheet" media="screen" href="{{url_for('static', filename='normalize.css')}}">
<link rel="stylesheet" media="screen" href="{{url_for('static', filename='elements.css')}}">
<link rel="stylesheet" media="screen" href="{{url_for('static', filename='typography.css')}}">
<link href='https://fonts.googleapis.com/css?family=Raleway' rel='stylesheet' type='text/css'>
<link href='https://fonts.googleapis.com/css?family=Merriweather' rel='stylesheet' type='text/css'>
<link href="{{url_for('static', filename='d3.min.js')}}" type="text/javascript">
Thank you!
What was happening here was that I was using <link> in my index.html when I should have been using <script>. The JavaScript files were being linked but not executed. @user2569951 was definitely right to say that I needed the {{url_for() }} syntax to access the data, but the real issue was more basic. Specifically, where I had:
<link href="{{url_for('static', filename='d3.min.js')}}" type="text/javascript">
I needed instead:
<script src="{{url_for('static', filename='d3.min.js')}}"></script>
That was such a head banger for me but I certainly learned the difference between the tags. This fundamental knowledge is required when managing these files.
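A quick way to catch this kind of mistake is to scan a template for <link> tags that declare a JavaScript type, since a <link> does not execute the file it points to. Below is a small sketch using only the standard library; LinkScriptChecker and find_misused_links are hypothetical names, and the Jinja expressions are simply treated as opaque attribute text.

```python
from html.parser import HTMLParser


class LinkScriptChecker(HTMLParser):
    """Record the href of every <link> tag whose type is text/javascript."""

    def __init__(self):
        super().__init__()
        self.misused = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and (attrs.get("type") or "").lower() == "text/javascript":
            self.misused.append(attrs.get("href"))


def find_misused_links(template_source):
    """Return the href values of <link> tags that should probably be <script> tags."""
    checker = LinkScriptChecker()
    checker.feed(template_source)
    return checker.misused


template = """<link href="{{url_for('static', filename='d3.min.js')}}" type="text/javascript">
<link rel="stylesheet" href="{{url_for('static', filename='map.css')}}">
<script src="{{url_for('static', filename='jquery-1.10.2.min.js')}}"></script>"""

for href in find_misused_links(template):
    print("Use <script src=...> instead of <link> for:", href)
```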
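In your JavaScript, reference the map and data files the following way so that they are accessible. Note that the {{url_for(...)}} expressions are only expanded when the script is rendered by Jinja (for example, inline in index.html); a .js file served from the static folder is sent as-is.
$(document).ready(function() {
    //Reading map file and data
    queue()
        .defer(d3.json, "{{url_for('static', filename='us.json')}}")
        .defer(d3.json, "{{url_for('static', filename='dewmvpv1.json')}}")
        .await(ready)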

experiencing difficulty parsing a web page

I downloaded a web page and saved it in HTML format. I want to parse it and obtain the values of "fullname", "memberHeadline", "numberOfConnections", etc. I tried using BeautifulSoup in Python, but it is not working. I also tried:
>>> import json
>>> encoded_data = json.loads(f)
Traceback (most recent call last):
File "<pyshell#14>", line 1, in <module>
encoded_data = json.loads(f)
File "C:\Python27\lib\json\__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "C:\Python27\lib\json\decoder.py", line 365, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "C:\Python27\lib\json\decoder.py", line 383, in raw_decode
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
I am not clear on what the format of the file is. The content of the file is copied below.
<!DOCTYPE html>
<!--[if lt IE 7]>
<html lang="en" class="ie ie6 lte9 lte8 lte7 os-win">
<![endif]-->
<!--[if IE 7]>
<html lang="en" class="ie ie7 lte9 lte8 lte7 os-win">
<![endif]-->
<!--[if IE 8]>
<html lang="en" class="ie ie8 lte9 lte8 os-win">
<![endif]-->
<!--[if IE 9]>
<html lang="en" class="ie ie9 lte9 os-win">
<![endif]-->
<!--[if gt IE 9]>
<html lang="en" class="os-win">
<![endif]-->
<!--[if !IE]><!-->
<html lang="en" class="os-win">
<!--<![endif]-->
<head>
<meta name="lnkd-track-json-lib" content="http://s.c.lnkd.licdn.com/scds/concat/common/js?h=2jds9coeh4w78ed9wblscv68v-eo3jgzogk6v7maxgg86f4u27d&fc=2">
<meta name="lnkd-track-lib" content="http://s.c.lnkd.licdn.com/scds/concat/common/js?h=eo3jgzogk6v7maxgg86f4u27d&fc=2">
<meta name="treeID" content="yGlqHfV7FxMQvJqjACsAAA==">
<meta name="appName" content="profile">
<meta name="lnkd-track-error" content="/lite/ua/error?csrfToken=ajax%3A1584468784299534813&goback=%2Enpv_131506997_*1_*1_NAME*4SEARCH_9ikF_*1_en*4US_*1_*1_*1_123452511375704499972_1_63_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1">
<script src="http://static.licdn.com:80/scds/common/u/lib/fizzy/fz-1.3.3-min.js" type="text/javascript"></script>
<script type="text/javascript">
fs.config({
"failureRedirect": "http://www.linkedin.com/nhome/",
"uniEscape": true,
"xhrHeaders": {
"X-FS-Origin-Request": "/profile/view?id=131506997&authType=NAME_SEARCH&authToken=9ikF&locale=en_US&srchid=123452511375704499972&srchindex=1&srchtotal=63&trk=vsrp_people_res_name&trkInfo=VSRPsearchId%3A123452511375704499972%2CVSRPtargetId%3A131506997%2CVSRPcmpt%3Aprimary",
"X-FS-Page-Id": "nprofile-view"
}
});
</script>
<script type="text/javascript" src="http://s.c.lnkd.licdn.com/scds/concat/common/js?h=8swqmpmjehppqzovz8zzfvv9g-aef5jooigi7oiyblwlouo8z90-7tqheyb1qchwa8dejl8nvz7zd-10q339fub5b718xk0pv9lzhpl&fc=2"></script>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=9">
<meta name="pageImpressionID" content="3b47eb21-1db4-42c1-9d00-43b2918c4099">
<meta name="pageKey" content="nprofile_v2_view_fs">
<meta name="analyticsURL" content="/analytics/noauthtracker">
<link rel="openid.server" href="https://www.linkedin.com/uas/openid/authorize">
<link rel="apple-touch-icon-precomposed" href="/img/icon/apple-touch-icon.png">
<link rel="stylesheet" type="text/css" href="http://s.c.lnkd.licdn.com/scds/concat/common/css?h=3bifs78lai5i0ndyj1ew7316e-c8kkvmvykvq2ncgxoqb13d2by-cphg8n6ehozk6lgpbb36za2ap-2it1to3q1pt5evainys9ta07p-4uu2pkz5u0jch61r2nhpyyrn8-7poavrvxlvh0irzkbnoyoginp-4om4nn3a2z730xs82d78xj3be-3t9ar1pajet97hzt9uou74qbb-ct4kfyj4tquup0bvqhttvymms-58ujm6g9r0a3ok6mpq7cs25gn-9zbbsrdszts09by60it4vuo3q-8ti9u6z5f55pestwbmte40d9-5730os2tf3iaiql5c8fukzd2u-cxff0g818hf3ks7dzixd4lqcq-6ramlbadr9lh7v5r7vuc6t4ld-e5frmcn40t833k1adjvwkoyjq&fc=2">
<script type="text/javascript" src="http://s.c.lnkd.licdn.com/scds/concat/common/js?h=dfoaudjrk6rbf82f45bz5crwi-62og8s54488owngg0s7escdit-c8ha6zrgpgcni7poa5ctye7il-3ufb745s29q1ovtbq6htt6rwh-51dv6schthjydhvcv6rxvospp-e9rsfv7b5gx0bk0tln31dx3sq-2r5gveucqe4lsolc3n0oljsn1-8v2hz0euzy8m1tk5d6tfrn6j-d4jr8g8vadmx9i5wz2z6ck3pb-1wfd86vm2f60y6uu5isrw94q-ddqoqtn6xcqi8i0y3tsrdl533-81wr8bey9cjjn6rhvbv530cap-6iw3fvg61uute6gxy89acxi5d-5sxnyeselbctwpry658s2lkew-4c6mz6u5rinti47gswwanj74j-5eec9p1vamr86uabn13sngx92-6oxtrh5eu6olunu39xzqgp10i-cdr0psywot2inbx54hmajga3p-anaxa6l712w7m4gp8089vyb5m-3h7320kwlnqtbngzm67z2annq-80vg9koywz84zoon9sjflbru0-8cwe3ciy81r59l0q3usztbt2r-1na957r12xyfe317uma5pn9mc-57sur4cj634ll9tk38imgvc6g-8v6o0480wy5u6j7f3sh92hzxo-9puf8y7tgjvse2oqtgkdb4wcj-c9pibx8dlmicbwjh48g12z6bl-12xp8e6pputw80p9fcpzyy9m0-3xjyji4eyuzpbppt3cssr1oko-34nej6plgotmo4hbnvjthteuu-4nw8tqsdbe61ig2l9faf3qdi9&fc=2"></script>
<script type="text/javascript">
LI.define('UrlPackage');
LI.UrlPackage.containerCore = ["http://s.c.lnkd.licdn.com/scds/concat/common/js?h=d7z5zqt26qe7ht91f8494hqx5&fc=2"]
[0];
</script>
<script type="text/javascript" src="http://s.c.lnkd.licdn.com/scds/common/u/js/scds-hashes.js"></script>
<script type="text/javascript">
LI.JSContentBasePath = "http://s.c.lnkd.licdn.com/scds/concat/common/js?v=build-2000_1_28557-prod&fc=2";
LI.CSSContentBasePath = "http://s.c.lnkd.licdn.com/scds/concat/common/css?v=build-2000_1_28557-prod&fc=2";
LI.injectRelayHtmlUrl = "http://s.c.lnkd.licdn.com/scds/common/u/lib/inject/0.4.2/relay.html";
LI.injectRelaySwfUrl = "http://s.c.lnkd.licdn.com/scds/common/u/lib/inject/0.4.2/relay.swf";
LI.comboBaseUrl = "http://s.c.lnkd.licdn.com/scds/concat/common/css?v=build-2000_1_28557-prod&fc=2";
LI.staticUrlHashEnabled = "true";
</script>
<script type="text/javascript" src="http://s.c.lnkd.licdn.com/scds/concat/common/js?h=c19zsujfl1pg46iqy33ubhqc5-eu97293ov7e2qciy9zn7fyg55-4n3akxbgwkyp3he1eeb136xxq-aq8gt7g4x1o11fxmypuv7vfkb-8kh6sn7nciobs2crbqunav09q-6m96aslgoubdqpnadnimrxsuk&fc=2"></script>
<script type="text/javascript">
document.cookie = 'lang="v=2&lang=en-us"; domain=linkedin.com; version=0; path=/;';
</script>
<title>xyz abc | LinkedIn</title>
<link rel="stylesheet" type="text/css" href="http://s.c.lnkd.licdn.com/scds/concat/common/css?h=449dtfu96optpu75y189filyn-4lrmst05cep59hjxopm4xrj84-depvqaeschv5p2381431jub3f-ae142xp1b9qwrvanr8q32v931&fc=2">
<!--[if gte IE 9]>
<link rel="shortcut icon" type="image/ico" href="http://s.c.lnkd.licdn.com/scds/common/u/img/favicon_ie9.ico">
<![endif]-->
<!--[if (!IE)|(lt IE 9)]>
<link rel="shortcut icon" type="image/ico" href="http://s.c.lnkd.licdn.com/scds/common/u/img/favicon_v3.ico">
<![endif]-->
<script type="text/javascript" src="http://s.c.lnkd.licdn.com/scds/concat/common/js?h=6rei9ktvfprzc38327x3gt0u3-c61ck8yq8xgf9ji3h55bmaux8-e7bh8bocljccs3kjl5f03uw8j&fc=2"></script>
<script type="text/javascript" src="http://s.c.lnkd.licdn.com/scds/concat/common/js?h=bqzk7a8ih1jk3hy30mnqy9jxb-39gmm0e77xjrikdpkxed7aywi-d228wbwzysn60azozcfg7gzoa&fc=2"></script>"ind_lookup":"Financial Services","isShared":false,"logoId":"/p/3/000/10d/1c1/3af8941.png"},{"link_biz":"/company/axis\u002dmutual\u002dfund?trk=prof\u002dfollowing\u002dcompany\u002dlogo","universalName":"axis\u002dmutual\u002dfund","id":565614,"logo":"http://m.c.lnkd.licdn.com/media/p/2/000/036/30a/1bf5734.png","canonicalName":"Axis Mutual Fund","biz_follow":"/company/follow/submit?id=565614&csrfToken=ajax%3A1584468784299534813&goback=%2Enpv_131506997_*1_*1_NAME*4SEARCH_9ikF_*1_en*4US_*1_*1_*1_123452511375704499972_1_63_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1","ind_lookup":"Financial Services","isShared":false,"logoId":"/p/2/000/036/30a/1bf5734.png"},{"link_biz":"/company/uti\u002dmf?trk=prof\u002dfollowing\u002dcompany\u002dlogo","universalName":"uti\u002dmf","id":467803,"logo":"http://m.c.lnkd.licdn.com/media/p/3/000/02f/205/35eec6b.png","canonicalName":"UTI MF","biz_follow":"/company/follow/submit?id=467803&csrfToken=ajax%3A1584468784299534813&goback=%2Enpv_131506997_*1_*1_NAME*4SEARCH_9ikF_*1_en*4US_*1_*1_*1_123452511375704499972_1_63_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1","ind_lookup":"Financial Services","isShared":false,"logoId":"/p/3/000/02f/205/35eec6b.png"},{"link_biz":"/company/investment\u002ddata\u002dservices?trk=prof\u002dfollowing\u002dcompany\u002dlogo","universalName":"investment\u002ddata\u002dservices","id":92139,"logo":"http://m.c.lnkd.licdn.com/media/p/3/000/0ad/3d0/32f42f2.png","canonicalName":"Investment Data Services","biz_follow":"/company/follow/submit?id=92139&csrfToken=ajax%3A1584468784299534813&goback=%2Enpv_131506997_*1_*1_NAME*4SEARCH_9ikF_*1_en*4US_*1_*1_*1_123452511375704499972_1_63_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1","ind_lookup":"Financial 
Services","isShared":false,"logoId":"/p/3/000/0ad/3d0/32f42f2.png"}],"i18n_news":"News","lix_profile_showChannels":"control","i18n_unfollow":"Unfollow","isFollowing":true}},"BasicInfo":{"empty":{},"upsell":{"deferImg":true,"visible":true},"basic_info":{"showTopCardDetail":true,"visible":true,"phoneticname":"","i18n__Industry":"Industry","industry_pivot":"/search?search=&industry=43&sortCriteria=R&keepFacets=true&trk=prof\u002d0\u002dovw\u002dindustry","find_others_region":"Find other members in Mumbai Area, India","headline_highlight":"Executive at L&T Mutual Fund","i18n__find_others_in_industry":"Find other members in this industry","i18n_Edit":"Edit","location_highlight":"<strong class=\ "highlight\">Mumbai Area, India</strong>","deferImg":true,"industry_highlight":"Financial Services","industryID":43,"memberHeadline":"Executive at L&T Mutual Fund","i18n__Location":"Location","memberID":131506997,"location_pivot":"/search?search=&sortCriteria=R&keepFacets=true&facet_G=in%3A7150&trk=prof\u002d0\u002dovw\u002dlocation","fmt_location":"Mumbai Area, India","fullname":"xyz abc"}}`
This is an HTML file. You can tell because it starts with <!DOCTYPE html> (in fact, that means it's HTML5).
The reason you can't load it with json is that it's not JSON. The exception you get is expected, and your code should probably handle it, because being passed a file of the wrong type is the sort of thing that happens in practice.
You might like to use lxml, possibly with the BeautifulSoup backend, to parse this.
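Since the fields being asked about appear as "key":"value" pairs embedded in the page rather than as a standalone JSON document, one possible approach is to guard the json.loads call and fall back to pulling individual values out with a regular expression. This is only a sketch: extract_field is a hypothetical helper, it handles plain string and integer values only, and the sample text is a shortened excerpt of the file shown in the question.

```python
import json
import re


def extract_field(text, key):
    """Return the first string or integer value stored under key, or None."""
    pattern = r'"%s"\s*:\s*("(?:[^"\\]|\\.)*"|-?\d+)' % re.escape(key)
    match = re.search(pattern, text)
    if match is None:
        return None
    # The matched token is itself valid JSON, so json.loads also decodes escapes.
    return json.loads(match.group(1))


sample = ('<html>..."memberHeadline":"Executive at L&T Mutual Fund",'
          '"memberID":131506997,"fullname":"xyz abc"}}')

try:
    data = json.loads(sample)  # fails: the file as a whole is HTML, not JSON
except ValueError:
    data = None

print(data)
print(extract_field(sample, "fullname"))
print(extract_field(sample, "memberID"))
```

Catching ValueError works on both Python 2 and Python 3, because json.JSONDecodeError in Python 3 is a subclass of ValueError.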
The header is the doctype for HTML5 (this feature is similar to XHTML's). Unlike XHTML, HTML5 doesn't require you to close some tags, but in this case it looks more like a "hidden iframe streaming" technique, where the page keeps loading forever.
As for the parsing, I would recommend the answers to the question about parsing malformed HTML:
How to parse malformed HTML in python, using standard libraries
Regards
Luan
