Difference between urlopen(url) and requests.get(url) - python

My code:
r = requests.get('http://www.pythonchallenge.com/pc/def/banner.p')
t = urlopen('http://www.pythonchallenge.com/pc/def/banner.p')
print(r)
'Response [200]'
print(t)
'http.client.HTTPResponse object at 0x0430A370'
Why does requests.get only return the object instance (in this case the response code) while urlopen returns the actual object?
My question then would be: how can I use requests to return an object instead of the response code? (I want to deserialize the content using pickle.)

You are confusing what print displays with the object itself. requests.get does return the full response object. The developers of requests simply chose to have its printed representation show the status code. They could have shown anything: r.text, or r.raw, for example. It sounds like you were expecting to see the latter. The data is still on the object: r.content holds the raw bytes of the body, which is what you would pass to pickle.loads.
If you're interested, here is a bit more information about how developers can define what print displays: How to print a class or objects of class using print()?
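To make the distinction concrete, here is a minimal sketch of a class whose printed form differs from the data it carries. FakeResponse is a hypothetical stand-in, not the real requests.Response; it only mimics the status_code and content attributes so the example runs without a network call.

```python
import pickle


class FakeResponse:
    """Hypothetical stand-in for requests.Response (illustration only)."""

    def __init__(self, status_code, content):
        self.status_code = status_code
        self.content = content  # raw body bytes, like r.content in requests

    def __repr__(self):
        # print() falls back to __repr__ when __str__ is not defined
        return "<Response [%s]>" % self.status_code


# Pretend the server sent back a pickled list
body = pickle.dumps(["banner", 42])
r = FakeResponse(200, body)

print(r)                        # shows only the short summary from __repr__
data = pickle.loads(r.content)  # the full payload is still on the object
print(data)
```

The object passed to print and the object you deserialize from are the same; only the summary text is short.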

Related

How do I use the requests module within Pycharm?

I'm new to Python, and I'm working in Pycharm to read data line by line from a webpage. For this task, I'm attempting to use the requests module. However, when I try to print the response object, I see "Process finished with exit code 0" and no object displayed.
Do I need to create some sort of setting to be able to work with HTTP requests in Python?
Code:
import re
import requests
def find_phone_number(url='https://www.python-course.eu/barneyhouse.txt'):
    response = requests.get(url)
    return response
    print(find_phone_number(url='https://www.python-course.eu/barneyhouse.txt'))
You need to call the function and access the text attribute of the response.
Also, in your code the print statement is indented into the function body, after the return, so it will never be run.
Here is an example of the code doing what I think you intended:
import re
import requests
def find_phone_number(url='https://www.python-course.eu/simpsons_phone_book.txt'):
    response = requests.get(url)
    return response
text_you_want = find_phone_number().text
print(text_you_want)
Well, for starters, the last line of your script is indented and therefore inside the function definition. Because it comes after the return statement it can never execute, and nothing at module level ever calls find_phone_number(). That is why you keep getting "Process finished with exit code 0" and no output. This should work:
import re
import requests
def find_phone_number(url='https://www.python-course.eu/barneyhouse.txt'):
    response = requests.get(url)
    return response
print(find_phone_number(url='https://www.python-course.eu/barneyhouse.txt'))
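The effect of indentation described in these answers can be seen in a small sketch that needs no network access. The function names are made up for the illustration; the only point is that a statement placed after return inside the function body never executes, while the same kind of statement at module level does.

```python
def inside():
    return "response"
    print("never reached")  # indented: part of the function, after return


def outside():
    return "response"


result_inside = inside()  # returns quietly; the print above is skipped
print(outside())          # not indented: runs at module level
```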

Add function to module that doesn't have class instance

I'm using the requests module. I have a number of programs that would like to make a complex check on the results of a requests.get(url) call. I thought perhaps I could add this new function in a class that inherited from some part of requests. But the get call is in an api.py file that contains just static function definitions, no class declaration. So I can't figure out what my import or subclass definition should look like ("class Subclass(requests.api)" isn't working.)
What I was thinking of ending up with:
r = requests.get(url)
r.my_check()
Is there a class-oriented way to accomplish this, or should I just write a function in a separate module of my own, pass it the results of the requests.get(url) call and be done with it?
Not saying it is a great idea, but ultimately I think you are just trying to dynamically add a method to the Response object?
import requests
from requests import Response
def my_method(self):
    print(self.content)
Response.my_method = my_method
r = requests.get('https://www.google.com')
r.my_method()
Gives...
b'<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><me
You can define your own function and attach it to the requests module at run-time.
import requests
def my_get(*args, **kwargs):
    # A function attached to a module is called without a self argument.
    response = requests.get(*args, **kwargs)
    # Do what you want with the response here, maybe change it according to your needs, then return it.
    return response
requests.my_get = my_get
# now you can use both of them
requests.get(url) # regular get method
requests.my_get(url) # your own get method
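For comparison, the plain-function option mentioned in the question might look like the sketch below. my_check and the rules it applies (status 200 and a non-empty body) are assumptions chosen for illustration; a real check would encode whatever the programs actually need. The function only relies on the status_code and content attributes, so a SimpleNamespace stands in for a real response here.

```python
from types import SimpleNamespace


def my_check(response):
    """Hypothetical check: True if the status is 200 and the body is non-empty."""
    return response.status_code == 200 and len(response.content) > 0


# Stand-ins for what requests.get(url) would return
ok = SimpleNamespace(status_code=200, content=b"<html></html>")
missing = SimpleNamespace(status_code=404, content=b"")

print(my_check(ok))
print(my_check(missing))
```

With the real library the call would be my_check(requests.get(url)), and nothing inside requests has to be modified.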

How to convert suds object to xml string

This is a duplicate to this question:
How to convert suds object to xml
But the question has not been answered: "totxt" is not an attribute on the Client class.
Unfortunately I lack the reputation to add comments, so I ask again:
Is there a way to convert a suds object to its xml?
I ask this because I already have a system that consumes wsdl files and sends data to a webservice. But now the customers want to alternatively store the XML as files (to import them later manually). So all I need are 2 methods for writing data: One writes to a webservice (implemented and tested), the other (not implemented yet) writes to files.
If only I could make something like this:
xml_as_string = My_suds_object.to_xml()
The following code is just an example and does not run. It's not elegant either, but that doesn't matter. I hope you get the idea of what I want to achieve:
I have the function "write_customer_obj_webservice" that works. Now I want to write the function "write_customer_obj_xml_file".
import suds
def get_customer_obj():
    wsdl_url = r'file:C:/somepathhere/Customer.wsdl'
    service_url = r'http://someiphere/Customer'
    c = suds.client.Client(wsdl_url, location=service_url)
    customer = c.factory.create("ns0:CustomerType")
    return customer
def write_customer_obj_webservice(customer):
    wsdl_url = r'file:C:/somepathhere/Customer.wsdl'
    service_url = r'http://someiphere/Customer'
    c = suds.client.Client(wsdl_url, location=service_url)
    response = c.service.save(someparameters, None, None, customer)
    return response
def write_customer_obj_xml_file(customer):
    output_filename = r'C\temp\testxml'
    # The following line is the problem. "to_xml" does not exist and I can't find a way to do it.
    xml = customer.to_xml()
    fo = open(output_filename, 'a')
    try:
        fo.write(xml)
    except:
        raise
    else:
        response = 'All ok'
    finally:
        fo.close()
    return response
# Always get the customer object from the wsdl.
customer = get_customer_obj()
# Since customer is an object, setting its attributes is very easy. There are very complex objects in this system.
customer.name = "Doe J."
customer.age = 42
# Write the new customer to a webservice or store it in a file for later processing
if later_processing:
    response = write_customer_obj_xml_file(customer)
else:
    response = write_customer_obj_webservice(customer)
I found a way that works for me. The trick is to create the Client with the option "nosend=True".
In the documentation it says:
nosend - Create the soap envelope but don't send. When specified, method invocation returns a RequestContext instead of sending it.
The RequestContext object has the attribute envelope. This is the XML as string.
Some pseudo code to illustrate:
c = suds.client.Client(url, nosend=True)
customer = c.factory.create("ns0:CustomerType")
customer.name = "Doe J."
customer.age = 42
response = c.service.save(someparameters, None, None, customer)
print(response.envelope) # This prints the XML string that would have been sent.
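Once the envelope is available, writing it to a file is ordinary file handling. The sketch below assumes the envelope arrives either as bytes or as str (this may vary between suds versions) and that UTF-8 is the right encoding; write_envelope and the sample XML are made up for the example.

```python
import os
import tempfile


def write_envelope(envelope, path):
    """Write a SOAP envelope (bytes or str) to path, replacing any old file."""
    if isinstance(envelope, str):
        envelope = envelope.encode("utf-8")  # assumption: UTF-8 output
    with open(path, "wb") as fo:
        fo.write(envelope)


# Stand-in for response.envelope from a client created with nosend=True
sample = "<Envelope><Body><name>Doe J.</name></Body></Envelope>"
out_path = os.path.join(tempfile.mkdtemp(), "customer.xml")
write_envelope(sample, out_path)

with open(out_path, "rb") as fi:
    written = fi.read()
print(written)
```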
You have some issues in the write_customer_obj_xml_file function:
Fix the bad path:
output_filename = r'C:\temp\test.xml'
Regarding your comment 'The following line is the problem. "to_xml" does not exist and I can't find a way to do it.':
What is the type of customer? Check type(customer).
xml = customer.to_xml() # to be continued...
Why mode='a'? ('a' => append, 'w' => create + write)
Use a with statement (file context manager):
with open(output_filename, 'w') as fo:
    fo.write(xml)
You don't need to return a response string: use an exception handler instead. The exception to catch can be EnvironmentError.
Analysis
The following call:
customer = c.factory.create("ns0:CustomerType")
constructs a CustomerType on the fly and returns a CustomerType instance, customer.
I think you can introspect your customer object, try the following:
vars(customer) # display the object attributes
help(customer) # display an extensive help about your instance
Another way is to open the WSDL URLs by hand and look at the XML results.
From these you can obtain the full description of your CustomerType object.
And then?
Then, with the attribute list, you can create your own XML. Use an XML template and fill it with the object attributes.
You may also find a ready-made function (something like to_xml) that does the job for you, but I'm not sure its XML format would match your needs.
client = Client(url)
client.factory.create('somename')
# The last XML request by client
client.last_sent()
# The last XML response from Web Service
client.last_received()
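If you go the template route, the standard library can build the XML from the attribute values. This is only a sketch: the Customer element name and the flat dictionary of fields are assumptions, and a real CustomerType may be nested and namespaced, which this does not handle.

```python
import xml.etree.ElementTree as ET


def dict_to_xml(tag, fields):
    """Build <tag><key>value</key>...</tag> from a flat dict of fields."""
    root = ET.Element(tag)
    for key, value in fields.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Values as they might be read off the suds object's attributes
xml_text = dict_to_xml("Customer", {"name": "Doe J.", "age": 42})
print(xml_text)
```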

How do I return a urllib2 response object from within a Django view?

I know I can use the shortcuts module to make it easier, but just to see if I could do it manually I tried to create and return a response object myself. I could not get it to work:
import urllib2
def djangoview(request):
    data = '<byte string>'
    open('body.txt', 'wb').write(data)
    headers = {'Content-Type' : 'something', 'Accept' : 'somethingelse'}
    newresponse = urllib2.Request('file:body.txt', None, headers)
    return HttpResponse(newresponse)
I don't understand what you are trying to do. It's the contract of a view that it returns an instance of django.http.HttpResponse - you are simply not allowed to return anything else. Doing so is not a shortcut, it's a necessity.

urllib.request: any way to read from it without modifying the request object?

Given a standard urllib.request object, retrieved so:
req = urllib.urlopen('http://example.com')
If I read its contents via req.read(), afterwards the request object will be empty.
Unlike normal file-like objects, however, the request object does not have a seek method, for what I am sure are excellent reasons.
However, in my case I have a function, and I want it to make certain determinations about a request and then return that request "unharmed" so that it can be read again.
I understand that one option is to re-request it. But I'd like to be able to avoid making multiple HTTP requests for the same url & content.
The only other alternative I can think of is to have the function return a tuple of the extracted content and the request object, with the understanding that anything that calls this function will have to get the content in this way.
Is that my only option?
Delegate the caching to a BytesIO object (code not tested, just to give the idea; written for Python 3, where read() returns bytes):
from io import BytesIO
from urllib.request import urlopen
class CachedRequest(object):
    def __init__(self, url):
        self._request = urlopen(url)
        self._content = None
    def __getattr__(self, attr):
        # if attr is not defined in CachedRequest, then get it from
        # the request object.
        return getattr(self._request, attr)
    def read(self):
        if self._content is None:
            # first read: fetch the body once and keep it in memory
            self._content = BytesIO(self._request.read())
        return self._content.read()
    def seek(self, i):
        # nothing to rewind until the body has been read once
        if self._content is not None:
            self._content.seek(i)
If the code actually expects a real Request object (i.e. calls isinstance to check the type), then subclass Request and you don't even have to implement __getattr__.
Note that it is possible that a function checks for the exact class (in which case you can't do anything) or, if it's written in C, calls the method using C/API calls (in which case the overridden method won't be called).
Make a subclass of urllib2.Request that uses a cStringIO.StringIO to hold whatever gets read. Then you can implement seek and so forth. Actually you could just use a string, but that'd be more work.
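A lighter variant of the same idea is to read the body once and hand back an in-memory buffer, which does support seek. In this sketch buffer_body is a made-up helper, and a BytesIO stands in for the object urlopen would return, so no HTTP request is made.

```python
from io import BytesIO


def buffer_body(response):
    """Read the response body once and return it as a seekable BytesIO."""
    return BytesIO(response.read())


# Stand-in for urlopen('http://example.com'); anything with read() works
fake_response = BytesIO(b"<html>hello</html>")

body = buffer_body(fake_response)
first = body.read()   # inspect the content...
body.seek(0)          # ...then rewind
second = body.read()  # and read it again without a second request
print(first == second)
```

The trade-off is that the buffer carries only the body; headers and status stay on the original response object.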
