I'm new to Python, and I'm working in PyCharm to read data line by line from a webpage. For this task, I'm using the requests module. However, when I try to print the response object, all I see is "Process finished with exit code 0" and no object displayed.
Do I need to change some setting to be able to work with HTTP requests in Python?
Code:
import re
import requests

def find_phone_number(url='https://www.python-course.eu/barneyhouse.txt'):
    response = requests.get(url)
    return response
    print(find_phone_number(url='https://www.python-course.eu/barneyhouse.txt'))
You need to call the function and access the 'text' attribute of the response.
Also, in your code the print statement is indented inside the function, after the return, so it will never be run.
Here is an example of the code doing what I think you intended:
import re
import requests

def find_phone_number(url='https://www.python-course.eu/simpsons_phone_book.txt'):
    response = requests.get(url)
    return response

text_you_want = find_phone_number().text
print(text_you_want)
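Since the original goal was to read the data line by line, a possible next step is to split the text and scan each line; this is only a sketch, and the phone-number pattern is an illustrative guess at the file's format, not taken from the original post:

import re
import requests

response = requests.get('https://www.python-course.eu/simpsons_phone_book.txt')
for line in response.text.splitlines():
    # keep only lines that look like they contain a phone number
    # (the pattern here is a guess, adjust it to the actual file format)
    if re.search(r'\d{3}-\d{4}', line):
        print(line)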
Well, for starters, your print line is never reached: the last line is indented and therefore inside the function definition, after the return statement, so it never runs. The reason you keep getting Process finished with exit code 0 is that your function is never actually called at the top level. This should work:
import re
import requests

def find_phone_number(url='https://www.python-course.eu/barneyhouse.txt'):
    response = requests.get(url)
    return response

print(find_phone_number(url='https://www.python-course.eu/barneyhouse.txt'))
I'm calling a Python function and passing an HTTP request object as a parameter, but it's not working. I created the function in one view and called it from another, but the parameter fails.
Here's the function I'm calling
def load_colmeias(request):
    apiario = request.GET.get('apiario')
    if apiario != "":
        colmeias = Colmeia.objects.filter(apiario=apiario)
        return render(request, 'colmeias_choices.html', {'colmeias': colmeias})
    else:
        return render(request, 'colmeias_choices.html')
Here is how I call it:
load_colmeias(request)
But the following error occurs
NameError: name 'request' is not defined
I already imported the "urllib" and "requests" libraries, but it always gives the same error:
AttributeError: module has no attribute 'GET'
Can someone help me? I'm new to Python/Django and I'm still learning how to do things.
Check if you have requests installed:
import requests

r = requests.get("https://automatetheboringstuff.com/files/rj.txt")
print(len(r.text))
Now, check:
- In load_colmeias(request), make sure the parameter is actually named request (not requests) throughout.
- Make sure your own file is not named requests.py; otherwise import requests would import your own file.
request and requests are two different things: request (without the 's') is the view's parameter, while requests (with the 's') is a third-party library used to fetch data over HTTP. To use it you have to import the requests library.
If you want to call this view function, you have to make an HTTP request to it: expose it at a URL (create an API for it) and then make the request.
Your views.py file:
def load_colmeias(request):
    apiario = request.GET.get('apiario')
    if apiario != "":
        colmeias = Colmeia.objects.filter(apiario=apiario)
        return render(request, 'colmeias_choices.html', {'colmeias': colmeias})
    else:
        return render(request, 'colmeias_choices.html')
Your urls.py file:
from django.urls import path
from . import views

urlpatterns = [
    path('load_colmeias', views.load_colmeias),
]
Now your API is:
http://127.0.0.1:8000/load_colmeias?apiario=1234
Make sure the server is running before you make the request, that you use the correct port in the URL, and that you pass a valid value for apiario.
There is another way to call this function, using the requests library:
import requests
res = requests.get('http://127.0.0.1:8000/load_colmeias?apiario=1234')
print(res.text)
You can use this from any file to call your function; again, the server must be running and you must pass a valid value for apiario.
Well, if the function takes a request parameter, then you have to actually make a request; you can't call it the way you have shown.
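That said, if you only want to exercise the view directly from Python (for example in a test), one option not mentioned above is Django's RequestFactory, which builds the request object the view expects. A minimal sketch, assuming the view from the question:

from django.test import RequestFactory

factory = RequestFactory()
# build a GET request carrying the apiario query parameter
request = factory.get('/load_colmeias', {'apiario': '1234'})
response = load_colmeias(request)
print(response.status_code)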
I SOLVED THE PROBLEM!! Basically, I just created a function that calls another function and passed the same argument to both, as can be seen below:
def load_colmeias(request):
    return carregar_colmeia(request)
I don't know if it's recommended, but it solved my problem of having to rewrite the same code in multiple views.
I am extracting data from this API
I was able to save the JSON file on my local machine.
I want to run the requests for several stocks.
How do I do it?
I tried to play with for loops, but nothing good came out of it. I attached the code below.
The output is:
AAPL
[]
TSLA
[]
Thank you, Tal
try:
    # For Python 3.0 and later
    from urllib.request import urlopen
except ImportError:
    # Fall back to Python 2's urllib2
    from urllib2 import urlopen

import requests
import json
import time

def get_jsonparsed_data(url):
    """
    Receive the content of ``url``, parse it as JSON and return the object.

    Parameters
    ----------
    url : str

    Returns
    -------
    dict
    """

stock_symbol = ["AAPL", "TSLA"]
for symbol in stock_symbol:
    print(symbol)
    # Sending the API request
    r = requests.get('https://financialmodelingprep.com/api/v3/income-statement/symbol={stock_symbol}?limit=120&apikey={removed by me}')
    packages_JSON = r.json()
    print(packages_JSON)
    # Exporting the data into JSON file
    with open('stocks_data321.json', 'w', encoding='utf-8') as f:
        json.dump(packages_JSON, f, ensure_ascii=False, indent=4)
Querying multiple APIs iteratively will take a lot of time. Consider using threading or asyncio to make the requests simultaneously and speed up the process.
In a nutshell, you should do something like this for each API:
import threading

for provider in [...]:  # list of APIs to query
    t = threading.Thread(target=api_request_function, args=(provider, ...))
    t.start()
However, it's better to first read a good article on threading to understand the whats and whys of the approach.
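Separately from the threading point, the empty results in the question most likely come from the URL string never being formatted (it lacks the f prefix, so the literal text {stock_symbol} is sent to the server) and from the output file being overwritten on every pass. A hedged sketch of the corrected loop; the exact API path and the key are assumptions based on the question:

import json
import requests

API_KEY = "YOUR_API_KEY"  # placeholder; the real key was removed in the question
stock_symbols = ["AAPL", "TSLA"]

for symbol in stock_symbols:
    print(symbol)
    # the f-string substitutes each symbol into the URL
    url = (f"https://financialmodelingprep.com/api/v3/income-statement/"
           f"{symbol}?limit=120&apikey={API_KEY}")
    r = requests.get(url)
    packages_JSON = r.json()
    print(packages_JSON)
    # one file per symbol, so each iteration does not overwrite the previous one
    with open(f"stocks_data_{symbol}.json", "w", encoding="utf-8") as f:
        json.dump(packages_JSON, f, ensure_ascii=False, indent=4)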
I am trying to download a series of text files from different websites using urllib.request with Python. I want to extend the list of URLs without making the code long.
The working sequence is:
import urllib.request
url01 = 'https://web.site.com/this.txt'
url02 = 'https://web.site.com/kind.txt'
url03 = 'https://web.site.com/of.txt'
url04 = 'https://web.site.com/link.txt'
[...]
urllib.request.urlretrieve(url01, "Liste n°01.txt")
urllib.request.urlretrieve(url02, "Liste n°02.txt")
urllib.request.urlretrieve(url03, "Liste n°03.txt")
[...]
The number of files to download is increasing, and I want to keep the second part of the code short.
I tried:
i = 0
while i < 51:
    i = i + 1
    urllib.request.urlretrieve(i, "Liste n°0" + i + ".txt")
It doesn't work, and I am thinking that a while loop can be used for strings but not for requests.
So I was thinking of making it a function:
def newfunction(i):
    return urllib.request.urlretrieve(url"i", "Liste n°0" + 1 + ".txt")
But it seems I am missing a big chunk of it.
The single request works, but it seems I cannot scale it to a long list of URLs.
As a general suggestion, I'd recommend the requests module for Python, rather than urllib.
Based on that, some naive code for a possible function:
import requests

def get_file(site, filename):
    target = site + "/" + filename
    try:
        r = requests.get(target, allow_redirects=True)
        open(filename, 'wb').write(r.content)
        return r.status_code
    except requests.exceptions.RequestException as e:
        print("File not downloaded, error: {}".format(e))
You can then call the function, passing in parameters of site and file name:
get_file('https://web.site.com', 'this.txt')
The function will report the error, but not stop execution, if it cannot download a file. You could expand the exception handling to deal with files not being writable, but this should be a start.
It seems you're not casting the variable i to a string before concatenating it to the URL string; that may be why your code isn't working. The while-loop/for-loop approach shouldn't affect whether or not the requests get sent out. I recommend using the requests module for making requests as well; Mike's post covers roughly what the function should look like. I also recommend creating a Session object if you're going to be making a whole lot of requests in one piece of code. The Session object keeps the underlying TCP connection open while you make your requests, which should reduce latency, CPU usage, and network congestion (https://en.wikipedia.org/wiki/HTTP_persistent_connection#Advantages). The code would look something like this:
import requests

with requests.Session() as s:
    for i in range(10):
        s.get(str(i) + '.com')  # make request
        # write to file here
To cast to a string you would want something like this:
i = 0
while i < 51:
    i = i + 1
    # note: the first argument must still be a real URL, not the integer i
    urllib.request.urlretrieve(i, "Liste n°0" + str(i) + ".txt")
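For completeness, a minimal sketch of the pattern the question seems to be after, assuming the URLs are simply collected in a list (the list contents below reuse the question's placeholder addresses):

import urllib.request

urls = [
    'https://web.site.com/this.txt',
    'https://web.site.com/kind.txt',
    'https://web.site.com/of.txt',
    'https://web.site.com/link.txt',
]

for i, url in enumerate(urls, start=1):
    # zero-pad the counter so the names match Liste n°01.txt, Liste n°02.txt, ...
    urllib.request.urlretrieve(url, "Liste n°{:02d}.txt".format(i))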
I built this function to tell me whether there have been changes to a website. I'm not sure it works: I have tried it on a few websites that have not changed, and it gave me the wrong output. Where is the issue, and is there an issue at all?
This is the code:
I put the code into a function so that I could allow the user to input any site.
userurl = input("Please enter a valid url")

def checksite(userurl):
    change = False
    import time
    import urllib.request
    import io
    u = urllib.request.urlopen(userurl)
    webContent1 = u.read()
    time.sleep(60)
    u = urllib.request.urlopen(userurl)
    webContent2 = u.read()
    if webContent1 == webContent2:
        print("Everything is normal")
    elif webContent1 != webContent2:
        print("Warning, there has been a change to the webite!")
        change = True
    return change

checksite(userurl)
Try making a small HTML Hello World page. Many websites have dynamic content that changes each time you access it (and that might not necessarily be visible), which could explain your "incorrect" results.
I have tested your code and it works perfectly fine against a local Python web server.
I started one with
python -m http.server
after placing an index.html with some content in the same directory.
With your code:
import time
import urllib.request
import io

userurl = 'http://localhost:8000/index.html'

def checksite(userurl):
    change = False
    u = urllib.request.urlopen(userurl)
    webContent1 = u.read()
    print(webContent1)
    time.sleep(15)
    u = urllib.request.urlopen(userurl)
    webContent2 = u.read()
    print(webContent2)
    if webContent1 == webContent2:
        print("Everything is normal")
    elif webContent1 != webContent2:
        print("Warning, there has been a change to the webite!")
        change = True
    return change

checksite(userurl)
the output was:
b'<html>\n\t<title> Hello </title>\n\t<body>\n\t\tTesting, Webcontent1 \n\t</body>\n\t</html>\n\n'
b'<html>\n\t<title> Hello </title>\n\t<body>\n\t\tTesting, Webcontent2\n\t</body>\n\t</html>\n\n'
Warning, there has been a change to the webite!
[Finished in 17.5s]
Your code is perfectly fine.
To know if a website or a page has changed, you need to keep a backup of it somewhere; in your code you were effectively comparing the site to itself. I recommend using the requests library together with BeautifulSoup (bs4) and parsing the page line by line, comparing against the backup you have.
While the code finds the site unchanged (i.e. the backup shows the same lines as the site on the web), it keeps a flag set to true; if the site has changed, it breaks the loop and simply shows the line where the site changed.
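A minimal sketch of that idea, comparing the live page against a locally saved copy; the snapshot filename is an illustrative assumption, not from the answer:

import requests

SNAPSHOT_FILE = 'snapshot.html'  # hypothetical local backup of the page

def page_changed(url):
    live = requests.get(url).text
    try:
        with open(SNAPSHOT_FILE, 'r', encoding='utf-8') as f:
            saved = f.read()
    except FileNotFoundError:
        saved = None  # first run: nothing to compare against yet
    # save the current version as the new snapshot for next time
    with open(SNAPSHOT_FILE, 'w', encoding='utf-8') as f:
        f.write(live)
    return saved is not None and saved != live

print(page_changed('http://localhost:8000/index.html'))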
I'm using the requests module. I have a number of programs that would like to run a complex check on the result of a requests.get(url) call. I thought perhaps I could add this new function in a class inheriting from some part of requests, but the get call lives in an api.py file that contains just plain function definitions, no class declaration, so I can't figure out what my import or subclass definition should look like ("class Subclass(requests.api)" isn't working).
What I was thinking of ending up with:
r = requests.get(url)
r.my_check()
Is there a class-oriented way to accomplish this, or should I just write a function in a separate module of my own, pass it the results of the requests.get(url) call and be done with it?
Not saying it is a great idea, but ultimately I think you are just trying to dynamically add a method to the Response object?
import requests
from requests import Response

def my_method(self):
    print(self.content)

Response.my_method = my_method

r = requests.get('https://www.google.com')
r.my_method()
Gives...
b'<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><me
You can define your own function and attach it to the requests module at run-time.

import requests

def my_get(*args, **kwargs):
    # fetch the original response, then adjust it to your needs before returning it
    original_response = requests.get(*args, **kwargs)
    # ... modify original_response here ...
    return original_response

requests.my_get = my_get

# now you can use both of them
requests.get(url)     # the regular get function
requests.my_get(url)  # your own get function
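For a more class-oriented route than either answer above, a hedged sketch: subclass requests.Session and override its request method, so every response you get back has already been through your check (the check itself is a made-up placeholder here, not the questioner's actual logic):

import requests

class CheckedSession(requests.Session):
    def request(self, method, url, **kwargs):
        response = super().request(method, url, **kwargs)
        # my_check_passed is a hypothetical stand-in for the complex check the
        # question wants to run; here it just flags non-200 statuses
        response.my_check_passed = (response.status_code == 200)
        return response

session = CheckedSession()
r = session.get('https://www.google.com')
print(r.my_check_passed)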