How to get a specific value from a script inside an HTML file? - Python

I have an HTML file that contains several scripts.
The last <script></script> element contains a value that I would like to get.
I need to get the hash value found here:
extend(cur, { "hash": "13334a0e457f0793ec", "loginHost": "login", "sureBoxText": false, "strongCode": 0, "joinParams": false, "validationType": 3, "resendDelay": 120, "calledPhoneLen": 4, "calledPhoneExcludeCountries": [1, 49, 200] });
How can I do this? I've tried using soup, but I think I'm doing it wrong. I really need to complete this, so any help would be greatly appreciated.
I also tried using the re library, but I don't know how to use it.
For example:
re.search(html, "hash: (*?),")
Is there any way to do a search like this?

You can use .group() to access a captured group:
import re
data = """extend(cur, { "hash": "13334a0e457f0793ec", "loginHost": "login", "sureBoxText": false, "strongCode": 0, "joinParams": false, "validationType": 3, "resendDelay": 120, "calledPhoneLen": 4, "calledPhoneExcludeCountries": [1, 49, 200] });"""
print(re.search(r'{ "hash": "(.*?)",', data).group(1))
Output:
13334a0e457f0793ec
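Regular expression explanation: the pattern first matches the literal text { "hash": " and then (.*?) lazily captures as few characters as possible up to the next ", sequence. Calling .group(1) returns that captured text.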
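If more than one value is needed from the same call, one option is to capture the whole object literal and parse it as JSON. The following is a minimal sketch, assuming the argument passed to extend(cur, ...) is valid JSON (as it is in the question) and that the call sits in the last script element. The sample HTML and the variable names are made up for illustration.

```python
import json
import re

# Hypothetical page: several scripts, the last one holds the extend() call.
html = """<html><body>
<script>var a = 1;</script>
<script>extend(cur, { "hash": "13334a0e457f0793ec", "loginHost": "login", "strongCode": 0 });</script>
</body></html>"""

# Collect the body of every <script> element and keep the last one.
scripts = re.findall(r"<script[^>]*>(.*?)</script>", html, flags=re.DOTALL)
last_script = scripts[-1]

# Capture the object literal passed as the second argument to extend().
match = re.search(r"extend\(cur,\s*(\{.*\})\s*\)", last_script, flags=re.DOTALL)
params = json.loads(match.group(1))

hash_value = params["hash"]
print(hash_value)
```

Parsing the object this way also gives access to the other keys, such as params["loginHost"], without writing one pattern per key.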

Related

Using Python jsonpath_ng to Filter Lists of Objects

I had a look at other answers to similar questions, but they don't seem to work for me.
I have a simple requirement: filter a JSON list by a value in the objects of the list, i.e.
jsonpath_expression = parse("$.balances[?(@.asset=='BTC')].free")
This path works on https://jsonpath.com/ with the following JSON:
{
"makerCommission": 10,
"takerCommission": 10,
"buyerCommission": 0,
"sellerCommission": 0,
"canTrade": true,
"canWithdraw": true,
"canDeposit": true,
"brokered": false,
"accountType": "SPOT",
"balances": [
{
"asset": "BTC",
"free": "0.06437673",
"locked": "0.00000000"
},
{
"asset": "LTC",
"free": "0.00000000",
"locked": "0.00000000"
}
]
}
When I try this in Python, I get jsonpath_ng.exceptions.JsonPathLexerError: Error on line 1, col 11: Unexpected character: ?
I've tried quite a few variations based on other articles, and they produce various other jsonpath parse errors. This one looked promising, and I believe it aligns with my attempts.
Any ideas what I am doing wrong?
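One possible cause, offered as an assumption rather than a confirmed diagnosis: the basic parser imported with from jsonpath_ng import parse does not handle filter expressions such as [?(...)], whereas the extended parser in jsonpath_ng.ext is the one meant for them. Independently of the library, the same filter can be written in plain Python. Below is a minimal sketch using a trimmed copy of the JSON above; the helper name free_balances is hypothetical.

```python
import json

# Trimmed copy of the account JSON from the question.
account = json.loads("""
{
  "accountType": "SPOT",
  "balances": [
    {"asset": "BTC", "free": "0.06437673", "locked": "0.00000000"},
    {"asset": "LTC", "free": "0.00000000", "locked": "0.00000000"}
  ]
}
""")


def free_balances(data, asset):
    """Return the "free" value of every balance entry for the given asset."""
    return [entry["free"] for entry in data["balances"] if entry["asset"] == asset]


btc_free = free_balances(account, "BTC")
print(btc_free)
```

Like a JSONPath query, this returns a list, so an asset that does not appear yields an empty list rather than an error.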

Why SyntaxError: multiple statements found while compiling a single statement in Python IDE?

import pandas as pd
wine_dict = {
'red_wine': [3, 6, 5],
'white_wine':[5, 0, 10]
}
sales = pd.DataFrame(wine_dict, index=["adam", "bob", "charles"])
print(sales)
Please help me to run the code in my IDE.
The code you have pasted contains indented statements. Indents have a meaning in Python.
Leading whitespace (spaces and tabs) at the beginning of a logical line is used to compute the indentation level of the line, which in turn is used to determine the grouping of statements.
Your code works fine after removing all the indents.
Edit:
Your code after removing indents:
import pandas as pd
wine_dict = {
'red_wine': [3, 6, 5],
'white_wine':[5, 0, 10]
}
sales = pd.DataFrame(wine_dict, index=["adam", "bob", "charles"])
print(sales)
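You are running the code in the IDLE shell itself, which compiles one statement at a time, so pasting several statements at once raises this error. Open a new file, paste the code there, and then run it.
I hope it works.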
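The difference between the interactive prompt and a file can be reproduced with the built-in compile(). This is a small sketch, assuming the shell compiles pasted input in "single" mode as the standard interactive interpreter does. The source string and the file names passed to compile() are placeholders.

```python
# Two statements, as if pasted together at the interactive prompt.
source = "a = 1\nb = 2\n"

# "single" mode accepts exactly one statement, so this input is rejected.
try:
    compile(source, "<pasted>", "single")
    single_error = None
except SyntaxError as exc:
    single_error = exc.msg

print(single_error)

# "exec" mode is used for whole files and accepts any number of statements.
namespace = {}
exec(compile(source, "<file>", "exec"), namespace)
print(namespace["a"], namespace["b"])
```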

Trouble when storing API data in Python list

I'm struggling with the JSON data that I get from an API. I request several API URLs and store the data from each one in a list that starts out empty. I then want to pull out every "reputation" field, because that number is the only thing I'm interested in. See my code here:
import json
import requests
f = requests.get('my_api_url')
if f.ok:
    data = json.loads(f.content)
    url_list = []  # the list stores a number of urls that I want to request data from
    for items in data:
        url_list.append(items['details_url'])  # grab the urls that I want to enter
    total_url = []  # stores all data from all urls here
    for index in range(len(url_list)):
        url = requests.get(url_list[index])
        if url.ok:
            url_data = json.loads(url.content)
            total_url.append(url_data)
    print(json.dumps(total_url, indent=2))  # only want to see if it's working
Thus far I'm happy: I can request all the URLs and get the data. It's in the next step that I run into trouble. The above code outputs the following JSON data for me:
[
[
{
"id": 316,
"name": "storabro",
"url": "https://storabro.net",
"customer": true,
"administrator": false,
"reputation": 568
}
],
[
{
"id": 541,
"name": "sega",
"url": "https://wedonthaveanyyet.com",
"customer": true,
"administrator": false,
"reputation": 45
},
{
"id": 90,
"name": "Villa",
"url": "https://brandvillas.co.uk",
"customer": true,
"administrator": false,
"reputation": 6
}
]
]
However, I only want to print the reputation, and I cannot get it working. If I use print(total_url['reputation']) instead, it fails with "TypeError: list indices must be integers or slices, not str", and if I try:
for s in total_url:
    print(s['reputation'])
I get the same TypeError.
Feels like I've tried everything but I can't find any answers on the web that can help me, but I understand I still have a lot to learn and that my error will be obvious to some people here. It seems very similar to other things I've done with Python, but this time I'm stuck. To clarify, I'm expecting an output similar to: [568, 45, 6]
Perhaps I used the wrong way to do this from the beginning and that's why it's not working all the way for me. Started to code with Python in October and it's still very new to me but I want to learn. Thank you all in advance!
It looks like your total_url is a list of lists, so you might write a function like:
def get_reputations(data):
    for url in data:
        for obj in url:
            print(obj.get('reputation'))

get_reputations(total_url)
# output:
# 568
# 45
# 6
If you'd rather not work with a list of lists in the first place, you can call total_url.extend(url_data) instead of total_url.append(url_data) when building total_url.
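A minimal sketch of the flattening approach, using the sample data from the question in place of real HTTP responses. The name responses is a stand-in for the parsed result of each request.

```python
# Stand-in for the parsed JSON of each request (one list per URL).
responses = [
    [{"id": 316, "name": "storabro", "reputation": 568}],
    [{"id": 541, "name": "sega", "reputation": 45},
     {"id": 90, "name": "Villa", "reputation": 6}],
]

total_url = []
for url_data in responses:
    # extend() adds each object individually, so the result is a flat list.
    total_url.extend(url_data)

reputations = [obj["reputation"] for obj in total_url]
print(reputations)  # [568, 45, 6]
```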
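You can also use urlopen with json.loads and read the response yourself. This assumes that api_url is defined and that the API wraps its results in a 'response' key:
import json
from urllib.request import urlopen

def get_rep():
    response = urlopen(api_url)
    r = response.read().decode('utf-8')
    r_obj = json.loads(r)
    for item in r_obj['response']:
        print("Reputation: {}".format(item['reputation']))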

OSError: [Errno 22] when I try to .read() a json file

I am simply trying to read my json file in Python. I am in the correct folder when I do so; I am in Downloads, and my file is called 'Books_5.json'. However, when I try to use the .read() function, I get the error
OSError: [Errno 22] Invalid argument
This is my code:
import json
config = json.loads(open('Books_5.json').read())
This also raises the same error:
books = open('Books_5.json').read()
If it helps, this is a small snippet of what my data looks like:
{"reviewerID": "A10000012B7CGYKOMPQ4L", "asin": "000100039X", "reviewerName": "Adam", "helpful": [0, 0], "reviewText": "Spiritually and mentally inspiring! A book that allows you to question your morals and will help you discover who you really are!", "overall": 5.0, "summary": "Wonderful!", "unixReviewTime": 1355616000, "reviewTime": "12 16, 2012"}
{"reviewerID": "A2S166WSCFIFP5", "asin": "000100039X", "reviewerName": "adead_poet@hotmail.com \"adead_poet@hotmail.com\"", "helpful": [0, 2], "reviewText": "This is one my must have books. It is a masterpiece of spirituality. I'll be the first to admit, its literary quality isn't much. It is rather simplistically written, but the message behind it is so powerful that you have to read it. It will take you to enlightenment.", "overall": 5.0, "summary": "close to god", "unixReviewTime": 1071100800, "reviewTime": "12 11, 2003"}
I'm using Python 3.6 on MacOSX
It appears that this is some kind of bug that occurs when the file is too large (my file was ~10 GB). Once I used split to break the file into chunks of 200k lines, the .read() error went away. This is true even if the file is not in strict JSON format.
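Since the sample data has one JSON object per line, another way to avoid a single huge .read() is to parse the file line by line. This is a sketch under that assumption and has not been tested against the original 10 GB file. The helper name iter_reviews and the temporary sample file are made up for illustration.

```python
import json
import os
import tempfile

# Shortened stand-ins for two lines of the review file.
sample_lines = [
    '{"reviewerID": "A10000012B7CGYKOMPQ4L", "overall": 5.0, "summary": "Wonderful!"}',
    '{"reviewerID": "A2S166WSCFIFP5", "overall": 5.0, "summary": "close to god"}',
]


def iter_reviews(path):
    """Yield one parsed review per non-empty line without loading the whole file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "Books_sample.json")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(sample_lines) + "\n")
    summaries = [review["summary"] for review in iter_reviews(path)]

print(summaries)
```

Iterating over the file object reads one line at a time, so memory use stays roughly proportional to the longest line rather than to the file size.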
Your code looks fine; it looks like your JSON data is formatted incorrectly. Try the following. As others have suggested, it should be in the form [{},{},...].
[{"reviewerID": "A10000012B7CGYKOMPQ4L", "asin": "000100039X",
"reviewerName": "Adam", "helpful": [0, 0], "reviewText": "Spiritually and
mentally inspiring! A book that allows you to question your morals and will
help you discover who you really are!", "overall": 5.0, "summary":
"Wonderful!", "unixReviewTime": 1355616000, "reviewTime": "12 16, 2012"},
{"reviewerID": "A2S166WSCFIFP5", "asin": "000100039X", "reviewerName":
"adead_poet@hotmail.com \"adead_poet@hotmail.com\"", "helpful": [0, 2],
"reviewText": "This is one my must have books. It is a masterpiece of
spirituality. I'll be the first to admit, its literary quality isn't much.
It is rather simplistically written, but the message behind it is so
powerful that you have to read it. It will take you to enlightenment.",
"overall": 5.0, "summary": "close to god", "unixReviewTime": 1071100800,
"reviewTime": "12 11, 2003"}]
Your code and this data worked for me on Windows 7 and Python 2.7. That is different from your setup, but it should still work.
To read a JSON file, you can use the following example:
import json

with open('your_data.json') as data_file:
    data = json.load(data_file)
print(data)
print(data[0]['your_key'])  # get a value via its key
Also try converting your JSON objects into a list:
[
{'reviewerID': "A10000012B7CGYKOMPQ4L", ....},
{'asin': '000100039X', .....}
]

Python. How do I print a specific value from JSON when there are similar names?

Okay so my problem is that I need to print out one specific value from a json.
I've managed to print out all the values but not the specific one I want.
The json looks like this:
"apple": {
"stuff": 111,
"food": [
{
"money": 4000,
"time": 36,
},
{
"money": 12210,
"time": 94,
It continues like that with money and time.
So my problem is that when I do this:
ourResult = js['apple']['food']
for rs in ourResult:
    print(rs['time'])
I receive all the times. I only want the time that belongs to money: 12210, for example, but I don't know how to do that when there is a colon and a value.
I thank you for all the help in advance.
Well, you already know how to get the value of "time", so just do the same with "money" and check it's equal to 12210.
Edit
for rs in ourResult:
    if rs['money'] == 12210:
        print(rs['time'])
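The same check can be wrapped in a small helper. This is a self-contained sketch using the two entries shown in the question. The helper name time_for_money is hypothetical, and it returns None when no entry matches.

```python
# The two entries shown in the question, as a Python dict.
js = {
    "apple": {
        "stuff": 111,
        "food": [
            {"money": 4000, "time": 36},
            {"money": 12210, "time": 94},
        ],
    }
}


def time_for_money(entries, money):
    """Return the "time" of the first entry whose "money" matches, else None."""
    return next((e["time"] for e in entries if e["money"] == money), None)


wanted_time = time_for_money(js["apple"]["food"], 12210)
print(wanted_time)  # 94
```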
