Python function getBetweenHTML - python

This is a short question:
Where does the function getBetweenHTML() come from, e.g. urllib2 or something i am not sure.
Could anyone also give me an explanation of this function or better alternatives, thanks.
Code syntax:
import urllib2
url = urllib2.urlopen('http://google.com').read()
scraped = getBetweenHTML(url,'<div class="question">',"</div>)
print scraped
e.g. prints a question
Edit: I've found the solution, I actually made my own function, that's the reason I couldn't find it anywhere, I coded it myself in bs4, for anyone who needs it:
def getBetweenHTML(strSource, strStart,strEnd):
start = strSource.find(strStart) + len(strStart)
end = strSource.find(strEnd,start)
return strSource[start:end]

Related

TypeError: string indices must be integers when making rest api request

When I try to parse a rest api data,
it raises TypeError.
This is my code:
def get_contracts():
response_object = requests.get(
"https://testnet-api.phemex.com/md/orderbook?symbol=BTCUSD"
)
print(response_object.status_code)
for contract in response_object.json()["result"]["book"]:
print(contract["asks"])
get_contracts()
Any tip or solution will be very welcomed. Thanks in advance.
Edit/Update:
For some reason I am not able to select a specific key in the format above, its only possible if I do it like this:
data = response_object.json()['result']['book']['asks']
print(data)
I will try to work my code around that. Thanks for everyone who helped.
This code review may help you:
import requests
url = "https://testnet-api.phemex.com/md/orderbook?symbol=BTCUSD"
response_object = requests.get(url)
data = response_object.json()
# Printing your data helps to inspect the structure
# print(data)
# This is the list you are looking for:
asks = data['result']['book']['asks']
for ask in asks:
print(ask)
You need to iterate through asks, not book.
You have a nested dictionary where asks is a nested list.
If you simply click on the link you get getting, or print out your response_object.json() you would see the structure.
for foo in response_object.json()['result']['book']['asks']:
print(foo)
Although generally it's better to assign your response_object to a variable.
data = response_object.json()
for foo in data['result']['book']['asks']:
print(foo)
It looks like you are trying to access something that is not there, hence the KeyError.
I would debug, a simple print, the JSON object you are getting as answer and make sure that the keys you are trying to access are there.

Replace words from an API in python

I am using an API I found online for one of my scripts, and I am wondering if I can change one word from the API to something else. My code is:
import requests
people = requests.get('https://insult.mattbas.org/api/insult')
print("Welcome to the insult machine!\nType somebody you want to insult!")
b = input()
print(people.replace("You", b))
Is replace not a command? If so, what plugin and/or commands would I need to do it? Thanks!
The value returned from requests.get isn’t a string, it’s a response object and that class has no replace method.
Have a look at the structure of that class. For example, you can do r = requests.get(...) and r.text.replace(...).
In other words, you need to operate on the text part of the response object.

Parsing and appending a link

I have a link as input below,i need to parse this link and append "/#c/" as shown below,any inputs on how
this can be done?
INPUT:-https://link.com/617394/
OUTPUT:-https://link.com/#/c/617394/
Try something such as:
from urlparse import urlsplit, urlunsplit
s = 'https://link.com/617394/'
split = urlsplit(s)
new_url = urlunsplit(split._replace(path='/#/c' + split.path))
# https://link.com/#/c/617394/
"https://link.com/617394/".replace("m/6","m/#/c/6")
although I suspect your real problem is something else
"com/#/c/".join("https://link.com/617394/".split("com/"))
may be slightly more applicable to your actual problem statement (which I still dont know what that is)
my_link = "https://link.com/617394/"
print re.sub("\.(com|org|net)/",".\\1/#/c/",my_link)
maybe more of what your actually looking for ...
that urlsplit solution of #JonClements is pretty dang sweet too

how do I modify a url that I pick at random in python

I have an app that will show images from reddit. Some images come like this http://imgur.com/Cuv9oau, when I need to make them look like this http://i.imgur.com/Cuv9oau.jpg. Just add an (i) at the beginning and (.jpg) at the end.
You can use a string replace:
s = "http://imgur.com/Cuv9oau"
s = s.replace("//imgur", "//i.imgur")+(".jpg" if not s.endswith(".jpg") else "")
This sets s to:
'http://i.imgur.com/Cuv9oau.jpg'
This function should do what you need. I expanded on #jh314's response and made the code a little less compact and checked that the url started with http://imgur.com as that code would cause issues with other URLs, like the google search I included. It also only replaces the first instance, which could causes issues.
def fixImgurLinks(url):
if url.lower().startswith("http://imgur.com"):
url = url.replace("http://imgur", "http://i.imgur",1) # Only replace the first instance.
if not url.endswith(".jpg"):
url +=".jpg"
return url
for u in ["http://imgur.com/Cuv9oau","http://www.google.com/search?q=http://imgur"]:
print fixImgurLinks(u)
Gives:
>>> http://i.imgur.com/Cuv9oau.jpg
>>> http://www.google.com/search?q=http://imgur
You should use Python's regular expressions to place the i. As for the .jpg you can just append it.

Using Python Web GET data

I'm trying to pass information to a python page via the url. I have the following link text:
"<a href='complete?id=%s'>" % (str(r[0]))
on the complete page, I have this:
import cgi
def complete():
form = cgi.FieldStorage()
db = MySQLdb.connect(user="", passwd="", db="todo")
c = db.cursor()
c.execute("delete from tasks where id =" + str(form["id"]))
return "<html><center>Task completed! Click <a href='/chris'>here</a> to go back!</center></html>"
The problem is that when i go to the complete page, i get a key error on "id". Does anyone know how to fix this?
EDIT
when i run cgi.test() it gives me nothing
I think something is wrong with the way i'm using the url because its not getting passed through.
its basically localhost/chris/complete?id=1
/chris/ is a folder and complete is a function within index.py
Am i formatting the url the wrong way?
The error means that form["id"] failed to find the key "id" in cgi.FieldStorage().
To test what keys are in the called URL, use cgi.test():
cgi.test()
Robust test CGI script, usable as main program. Writes minimal HTTP headers and formats all information provided to the script in HTML form.
EDIT: a basic test script (using the python cgi module with Linux path) is only 3 lines. Make sure you know how to run it on your system, then call it from a browser to check arguments are seen on the CGI side. You may also want to add traceback formatting with import cgitb; cgitb.enable().
#!/usr/bin/python
import cgi
cgi.test()
Have you tried printing out the value of form to make sure you're getting what you think you're getting? You do have a little problem with your code though... you should be doing form["id"].value to get the value of the item from FieldStorage. Another alternative is to just do it yourself, like so:
import os
import cgi
query_string = os.environ.get("QUERY_STRING", "")
form = cgi.parse_qs(query_string)
This should result in something like this:
{'id': ['123']}
First off, you should make dictionary lookups via
possibly_none = my_dict.get( "key_name" )
Because this assigns None to the variable, if the key is not in the dict. You can then use the
if key is not None:
do_stuff
idiom (yes, I'm a fan of null checks and defensive programming in general...). The python documentation suggests something along these lines as well.
Without digging into the code too much, I think you should reference
form.get( 'id' ).value
in order to extract the data you seem to be asking for.

Categories