Scraping values from a request response

Unfortunately I can provide only the output of the request and not the full code, since it contains private information. When I print the response as text I get JSON, something like this:
{"paymentResource":{"paymentToken":"PAYID-MEJ------","intent":"authorize","redirectUrl":"https://www.paypal.com/checkoutnow?nolegacy=1\u0026token=EC-5JS-----2S","authenticateUrl":null}}
How can I extract that PayPal URL? I tried the following, but it didn't work (ppstep2 is the name of the response object):
content = ppstep2.json()
pp = content["redirectUrl"]
I only get this error while doing it:
pp = content["redirectUrl"]
KeyError: 'redirectUrl'

Your variable content is a dictionary.
To get the value for "redirectUrl" you can do this:
pp = content['paymentResource']['redirectUrl']
The KeyError was caused by omitting ['paymentResource']: redirectUrl is nested one level below the top of the dictionary.
I would also recommend reviewing Python dictionaries and the .get() method.
https://docs.python.org/3/library/stdtypes.html?highlight=dictionary%20get#dict.get

Try adding a print(json.dumps(content, indent=4)) before you try to access it and look at the output. You might spot why then.
redirectUrl isn't a key of content itself. It sits in the content['paymentResource'] dictionary of that response.
Using content['paymentResource']['redirectUrl'] should work.
Edit: If you want to try and get a value without ending up with an exception, try using .get():
# This will result in a KeyError as you experienced:
pp = content["redirectUrl"]
# This will instead set pp to None if 'redirectUrl' doesn't exist as a Key
pp = content.get("redirectUrl", None)
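As a minimal sketch of the nested lookup, the snippet below parses the sample response from the question with json.loads, which stands in for ppstep2.json() (assumed here to return the same dictionary). Chaining .get() calls returns None for a missing key instead of raising a KeyError.

```python
import json

# Sample response text from the question (placeholders left as posted).
raw = r'{"paymentResource":{"paymentToken":"PAYID-MEJ------","intent":"authorize","redirectUrl":"https://www.paypal.com/checkoutnow?nolegacy=1\u0026token=EC-5JS-----2S","authenticateUrl":null}}'

content = json.loads(raw)  # stands in for ppstep2.json()

# redirectUrl lives inside the nested "paymentResource" dictionary.
pp = content.get("paymentResource", {}).get("redirectUrl")

# Looking it up at the top level finds nothing, but does not raise.
missing = content.get("redirectUrl")

print(pp)
print(missing)
```

Note that the JSON escape \u0026 is decoded to a plain & once the text is parsed.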

Related

How to access keys in a request sent via POST method

In the web service posted below, I obtain the data posted from the front end using json.loads. The result of the print statement is shown in the sample data section below.
The problem arises when I try to read the values included in request.data, for example treatmentGeometryAsJSONInEPSG3857. The error I get when I run the code is:
treatmentAsGeoJSONInEPSG3857 = json.loads(request.data)['body'][config['FRONT_END_KEYS']['key_treatmentGeometryAsJSONInEPSG3857']]
TypeError: string indices must be integers
Please let me know how to access the contents of the posted request.
I referred to the following posts as well, but none of them answers my question:
https://stackoverflow.com/questions/10434599/get-the-data-received-in-a-flask-request
https://www.digitalocean.com/community/tutorials/processing-incoming-request-data-in-flask-de
code:
@app.route("/experimentExistenceCheck/", methods=['POST','OPTIONS'])
def experimentExistenceCheck():
    preFlight = FlaskAccessControlUtils.preFlightCheck(request)
    if preFlight: return preFlight
    data = json.loads(request.data)['body']
    print("--------->",data)
    treatmentAsGeoJSONInEPSG3857 = json.loads(request.data)['body'][config['FRONT_END_KEYS']['key_treatmentGeometryAsJSONInEPSG3857']]
    selectedSiteId = json.loads(request.data)['body'][config['FRONT_END_KEYS']['selectedSiteID']]
    threshold = json.loads(request.data)['body'][config['FRONT_END_KEYS']['threshold']]
    visualizationOperationID = json.loads(request.data)['body'][config['FRONT_END_KEYS']['visualizationOperationID']]
    print("--------->",treatmentAsGeoJSONInEPSG3857)
    print("--------->",selectedSiteId)
    print("--------->",threshold)
    print("--------->",visualizationOperationID)
sample data:
{"threshold":"1","visualizationOperationID":2,"treatmentGeometryAsJSONInEPSG3857":"{\"coordinates\":[[[745037.9841857546,6644742.79192291],[744938.5789015774,6644804.979114908],[744973.912533394,6644856.330659814],[745025.2640783006,6644838.428286361],[745022.4373877553,6644802.623539453],[744997.9394030292,6644768.703252909],[745037.9841857546,6644742.79192291]]],\"type\":\"Polygon\"}","selectedSiteID":"202108041239"}
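One plausible reading of "string indices must be integers" is that the 'body' value is itself a JSON string rather than a decoded object, so it needs a second json.loads before it can be indexed by key. The sketch below assumes exactly that. The payload is a hypothetical reconstruction built from the sample data, with a shortened coordinate list, and is not the actual request.

```python
import json

# Hypothetical payload: "body" holds a JSON *string*, and the geometry
# inside it is a JSON string as well (as in the sample data).
inner = {
    "threshold": "1",
    "visualizationOperationID": 2,
    "treatmentGeometryAsJSONInEPSG3857": json.dumps(
        {"coordinates": [[[745037.98, 6644742.79], [744938.57, 6644804.97]]],
         "type": "Polygon"}
    ),
    "selectedSiteID": "202108041239",
}
raw = json.dumps({"body": json.dumps(inner)})  # stands in for request.data

outer = json.loads(raw)
body = outer["body"]
if isinstance(body, str):
    # Indexing a str with a key raises the TypeError from the question,
    # so decode it into a dictionary first.
    body = json.loads(body)

threshold = body["threshold"]
site_id = body["selectedSiteID"]
# The geometry is stored as a string too and needs its own decode.
geometry = json.loads(body["treatmentGeometryAsJSONInEPSG3857"])

print(threshold, site_id, geometry["type"])
```

Decoding the request once and reusing the resulting dictionary also avoids calling json.loads(request.data) separately for every key.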

How to download list data from SharePoint Online to a csv (preferably) or json file?

I have accessed a list in SharePoint Online with Python and want to save the list data to a file (CSV or JSON) so I can transform it and sort some metadata for a migration.
I have full access to the SharePoint site I am connecting to (client ID, secret, ...).
from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.runtime.client_request import ClientRequest
from office365.sharepoint.client_context import ClientContext
I have set my settings:
app_settings = {
'url': 'https://company.sharepoint.com/sites/abc',
'client_id': 'id',
'client_secret': 'secret'
}
Connecting to the site:
context_auth = AuthenticationContext(url=app_settings['url'])
context_auth.acquire_token_for_app(client_id=app_settings['client_id'],
client_secret=app_settings['client_secret'])
ctx = ClientContext(app_settings['url'], context_auth)
Getting the lists and checking the titles:
lists = ctx.web.lists
ctx.load(lists)
ctx.execute_query()
for lista in lists:
    print(lista.properties["Title"]) # this gives me the titles of each list and it works.
lists is a ListCollection Object
From the previous code, I see that I want to get the list titled: "Analysis A":
a1 = lists.get_by_title("Analysis A")
ctx.load(a1)
ctx.execute_query() # a1 is a List item - non-iterable
Then I get the data in that list:
a1w = a1.get_items()
ctx.load(a1w)
ctx.execute_query() # a1w is a ListItemCollection - iterable
idea 1: df to json/csv
df1 = pd.DataFrame(a1w) # doesn't work
idea 2:
follow this link: How to save a Sharepoint list as a file?
I get an error while executing the json.loads command:
JSONDecodeError: Extra data: line 1 column 5 (char 4)
Alternatives:
I tried Shareplum, but I can't connect with it the way I did with office365-python-rest. My guess is that it has no authorisation option using a client ID and client secret (as far as I can see).
How would you do it? Or am I missing something?

Sample test demo for your reference.
import json
import pandas
context_auth = AuthenticationContext(url=app_settings['url'])
context_auth.acquire_token_for_app(client_id=app_settings['client_id'],
                                   client_secret=app_settings['client_secret'])
ctx = ClientContext(app_settings['url'], context_auth)
sp_list = ctx.web.lists.get_by_title("ListA") # named sp_list to avoid shadowing the built-in list
items = sp_list.get_items()
ctx.load(items)
ctx.execute_query()
dataList = []
for item in items:
    # Collect the fields of interest as one plain dictionary per list item
    dataList.append({"Title": item.properties["Title"], "Created": item.properties["Created"]})
    print("Item title: {0}".format(item.properties["Title"]))
pandas.read_json(json.dumps(dataList)).to_csv("output.csv", index=None, header=True)
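If pandas is not available, a list of dictionaries like the one built in the loop above can also be written with the standard csv module. This is only a sketch: the rows are made-up stand-ins for item.properties values, and rows_to_csv is a hypothetical helper name, not part of any SharePoint library.

```python
import csv
import io

# Stand-in rows; in the real script these come from item.properties.
data_list = [
    {"Title": "Item 1", "Created": "2021-08-04T12:39:00Z"},
    {"Title": "Item 2", "Created": "2021-08-05T09:00:00Z"},
]

def rows_to_csv(rows, fieldnames):
    """Serialize a list of dictionaries to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = rows_to_csv(data_list, ["Title", "Created"])
print(csv_text)
```

To write to disk instead, the same DictWriter can be pointed at a file opened with open("output.csv", "w", newline="").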

Idea 1
It's hard to tell what went wrong without the error trace, but I suspect it has to do with malformed data being passed as the argument. See here from the documentation to know exactly what's expected.
Do also consider updating your question with the relevant stack traces.
Idea 2
JSONDecodeError: Extra data: line 1 column 5 (char 4)
This error means that the string is not a single valid JSON document: the parser finished reading one value and then found more data after it. You can validate JSON strings by using this service. It often tells you the point of error, which you can then use to fix the problem manually.
The error commonly occurs when the input holds several JSON documents, for example one per line. You can avoid it by parsing each line as you go:
data_list = []
with open('file_name.json', 'r') as f:
    for line in f:
        data_list.append(json.loads(line))
Each line is then parsed as its own JSON document. Also see this related issue if nothing works.
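To illustrate where "Extra data" comes from, the sketch below feeds json.loads a string holding two JSON documents, which fails, and then parses the same text line by line. The sample text is invented for the demonstration and is not the SharePoint output.

```python
import json

# Two JSON documents in one string, one per line (invented sample).
text = '{"a": 1}\n{"b": 2}\n'

try:
    json.loads(text)  # one call expects exactly one JSON document
    error_message = None
except json.JSONDecodeError as exc:
    error_message = str(exc)

# Parsing each non-empty line separately works.
records = [json.loads(line) for line in text.splitlines() if line.strip()]

print(error_message)
print(records)
```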

How to convert suds object to xml string

This is a duplicate of this question:
How to convert suds object to xml
But that question has not been answered: "totxt" is not an attribute of the Client class.
Unfortunately I lack the reputation to add comments, so I am asking again:
Is there a way to convert a suds object to its xml?
I ask this because I already have a system that consumes wsdl files and sends data to a webservice. But now the customers want to alternatively store the XML as files (to import them later manually). So all I need are 2 methods for writing data: One writes to a webservice (implemented and tested), the other (not implemented yet) writes to files.
If only I could make something like this:
xml_as_string = My_suds_object.to_xml()
The following code is just an example and does not run. And it's not elegant. Doesn't matter. I hope you get the idea what I want to achieve:
I have the function "write_customer_obj_webservice" that works. Now I want to write the function "write_customer_obj_xml_file".
import suds
def get_customer_obj():
    wsdl_url = r'file:C:/somepathhere/Customer.wsdl'
    service_url = r'http://someiphere/Customer'
    c = suds.client.Client(wsdl_url, location=service_url)
    customer = c.factory.create("ns0:CustomerType")
    return customer
def write_customer_obj_webservice(customer):
    wsdl_url = r'file:C:/somepathhere/Customer.wsdl'
    service_url = r'http://someiphere/Customer'
    c = suds.client.Client(wsdl_url, location=service_url)
    response = c.service.save(someparameters, None, None, customer)
    return response
def write_customer_obj_xml_file(customer):
    output_filename = r'C\temp\testxml'
    # The following line is the problem. "to_xml" does not exist and I can't find a way to do it.
    xml = customer.to_xml()
    fo = open(output_filename, 'a')
    try:
        fo.write(xml)
    except:
        raise
    else:
        response = 'All ok'
    finally:
        fo.close()
    return response
# Get the customer object always from the wsdl.
customer = get_customer_obj()
# Since customer is an object, setting its attributes is very easy. There are very complex objects in this system.
customer.name = "Doe J."
customer.age = 42
# Write the new customer to a webservice or store it in a file for later processing
if later_processing:
    response = write_customer_obj_xml_file(customer)
else:
    response = write_customer_obj_webservice(customer)

I found a way that works for me. The trick is to create the Client with the option "nosend=True".
In the documentation it says:
nosend - Create the soap envelope but don't send. When specified, method invocation returns a RequestContext instead of sending it.
The RequestContext object has the attribute envelope. This is the XML as string.
Some pseudo code to illustrate:
c = suds.client.Client(url, nosend=True)
customer = c.factory.create("ns0:CustomerType")
customer.name = "Doe J."
customer.age = 42
response = c.service.save(someparameters, None, None, customer)
print(response.envelope) # This prints the XML string that would have been sent.

There are some issues in the write_customer_obj_xml_file function:
Fix the bad path:
output_filename = r'C:\temp\test.xml'
"The following line is the problem. "to_xml" does not exist and I can't find a way to do it."
What is the type of customer? Check type(customer).
xml = customer.to_xml() # to be continued...
Why mode='a'? ('a' => append, 'w' => create + write)
Use a with statement (file context manager):
with open(output_filename, 'w') as fo:
    fo.write(xml)
You don't need to return a response string: use exception handling instead. The exception to catch can be EnvironmentError.
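The file-writing advice above can be sketched as follows. This assumes the XML is already available as a string. write_xml_file, the sample XML, and the temporary path are hypothetical names chosen for illustration. In Python 3, EnvironmentError is an alias of OSError, so OSError is caught here.

```python
import os
import tempfile

def write_xml_file(xml, output_filename):
    """Write an XML string to a file; I/O errors propagate to the caller."""
    with open(output_filename, "w", encoding="utf-8") as fo:
        fo.write(xml)

# Hypothetical XML string and a throwaway target path for the demo.
sample_xml = "<Customer><name>Doe J.</name><age>42</age></Customer>"
target = os.path.join(tempfile.mkdtemp(), "test.xml")

try:
    write_xml_file(sample_xml, target)
    written = True
except OSError as exc:  # EnvironmentError is an alias of OSError in Python 3
    print("Could not write {0}: {1}".format(target, exc))
    written = False
```

The with statement closes the file even if write() fails, which replaces the try/finally bookkeeping in the original function.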
Analysis
The following call:
customer = c.factory.create("ns0:CustomerType")
constructs a CustomerType on the fly and returns a CustomerType instance, customer.
I think you can introspect your customer object. Try the following:
vars(customer) # display the object attributes
help(customer) # display extensive help about your instance
Another way is to open the WSDL URLs by hand and look at the XML results.
You may obtain the full description of your CustomerType object.
And then?
Then, with the list of attributes, you can create your own XML. Use an XML template and fill it with the object attributes.
You may also find a magic function (such as to_xml) that does the job for you, but I am not sure the XML format would match your needs.
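As a rough sketch of the "build your own XML from the attributes" idea, the snippet below uses the standard xml.etree.ElementTree module. The attribute dictionary stands in for whatever vars(customer) would reveal. The element names and the attributes_to_xml helper are assumptions, and the output is plain XML, not the namespaced SOAP envelope suds would produce.

```python
import xml.etree.ElementTree as ET

def attributes_to_xml(tag, attributes):
    """Build a flat XML element with one child per attribute."""
    root = ET.Element(tag)
    for name, value in attributes.items():
        child = ET.SubElement(root, name)
        child.text = str(value)
    # encoding="unicode" returns a str rather than bytes
    return ET.tostring(root, encoding="unicode")

# Hypothetical stand-in for the attributes of the suds customer object.
customer_attributes = {"name": "Doe J.", "age": 42}
xml_string = attributes_to_xml("Customer", customer_attributes)
print(xml_string)
```

Nested or list-valued attributes would need a recursive version of this helper.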

client = Client(url)
client.factory.create('somename')
# The last XML request by client
client.last_sent()
# The last XML response from Web Service
client.last_received()

Django QueryDict variable, passed via application-form-post, how to get variables?

I am trying to get some variables from a server that isn't mine, and I receive the POST like this:
<QueryDict: {"'data[id]': ['83A0C50B5A0A43AD8F60C1066B16A163'], 'data[status]': ['paid'], 'event': ['invoice.status_changed']": ['']}>
Here is the code:
def get_iugu_retorno(request):
    d1 = request.POST
    d2 = d1.get['data[id]']
I need to get data[id], data[status] and event, but Django appears to receive all of this information as a flattened string instead of a dict.
What is the best way to solve this?
I also tried to create a list:
d2 = d1.getlist('data')
and got nothing.
I'm using Django 1.8.

This isn't form-encoded data at all, so you can't access it as if it is. It appears to be a form of JSON. So you need to access the post body directly.
d1 = json.loads(request.body)
d2 = d1.get('data[id]')
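Since it is not obvious from the question which encoding the sender really uses, here is a hedged sketch that reads a raw body and handles both cases: it tries JSON first and falls back to form decoding with the standard library's parse_qs. parse_post_body is a hypothetical helper and the sample bodies are invented. In a Django view the input would be request.body.

```python
import json
from urllib.parse import parse_qs

def parse_post_body(body):
    """Return a flat dict from a raw POST body that is JSON or form-encoded."""
    text = body.decode("utf-8") if isinstance(body, bytes) else body
    try:
        parsed = json.loads(text)
        if isinstance(parsed, dict):
            return parsed
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        pass
    # Form-encoded fallback: keep the first value for each key.
    return {key: values[0] for key, values in parse_qs(text).items()}

form_body = b"data[id]=83A0C50B5A0A43AD8F60C1066B16A163&data[status]=paid&event=invoice.status_changed"
json_body = '{"data[id]": "83A0C50B5A0A43AD8F60C1066B16A163", "event": "invoice.status_changed"}'

from_form = parse_post_body(form_body)
from_json = parse_post_body(json_body)
print(from_form["data[id]"], from_form["data[status]"], from_form["event"])
```

In both cases 'data[id]' is just a literal key name. The brackets do not create a nested structure.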

You are right in the assumption that data[id] is the name of the key, and not a list, but you are simply accessing the QueryDict the wrong way (you wrote get[], but it should be get()). The following code works fine:
from django.http import QueryDict
qd = QueryDict('data[id]=83A0C50B5A0A43AD8F60C1066B16A163&data[status]=paid&event=invoice.status_changed=')
qd.get('data[id]')
>>> u'83A0C50B5A0A43AD8F60C1066B16A163'

Pyramid route matching and query parameters

I have a Pyramid web service, and code samples are as follows:
View declaration:
@view_config(route_name="services/Prices/GetByTicker/")
def GET(request):
    ticker = request.GET('ticker')
    startDate = request.GET('startDate')
    endDate = request.GET('endDate')
    period = request.GET('period')
Routing:
config.add_route('services/Prices/GetByTicker/', 'services/Prices/GetByTicker/{ticker}/{startDate}/{endDate}/{period}')
Now I know this is all screwed up but I don't know what the convention is for Pyramid. At the moment this works inasmuch as the request gets routed to the view successfully, but then I get a "Dictionary object not callable" exception.
The URL looks horrible:
#root/services/Prices/GetByTicker/ticker=APPL/startDate=19981212/endDate=20121231/period=d
Ideally I would like to be able to use a URL something like:
#root/services/Prices/GetByTicker/?ticker=APPL&startDate=19981212&endDate=20121231&period=d
Any Pyramid bods out there willing to take five minutes to explain what I'm doing wrong?

From your sample code, I think you are using URL Dispatch,
so the route should look like this:
config.add_route('services/Prices/GetByTicker/', 'services/Prices/GetByTicker/')
Then a URL like:
#root/services/Prices/GetByTicker/?ticker=APPL&startDate=19981212&endDate=20121231&period=d
will match it.
--edit--
You don't have to use a name like "services/Prices/GetByTicker" for route_name, and you can get the GET parameters with request.params['key'].
View declaration:
@view_config(route_name="services_Prices_GetByTicker")
def services_Prices_GetByTicker(request):
    ticker = request.params['ticker']
    startDate = request.params['startDate']
    endDate = request.params['endDate']
    period = request.params['period']
Routing:
config.add_route('services_Prices_GetByTicker', 'services/Prices/GetByTicker/')

The query string is turned into the request.GET dictionary. You are using parentheses to call the dictionary instead of accessing items with brackets. For a URL such as
#root/services/Prices/GetByTicker/?ticker=APPL&startDate=19981212&endDate=20121231&period=d
request.GET['ticker'] # -> 'APPL' or an exception if not available
request.GET.get('ticker') # -> 'APPL' or None if not available
request.GET.get('ticker', 'foo') # -> 'APPL' or 'foo' if not available
request.GET.getall('ticker') # -> ['APPL'] or [] if not available
The last option is useful if you expect ticker to be supplied multiple times.
request.params is a combination of request.GET and request.POST where the latter is a dictionary representing the request's body in a form upload.
Anyway, the answer is that request.GET('ticker') syntactically is not one of the options I mentioned, stop doing it. :-)
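The difference between calling and indexing can be reproduced without Pyramid. In this sketch a plain dict stands in for request.GET, which is an assumption: the real object is a multidict that also offers getall(), which a plain dict lacks.

```python
# A plain dict standing in for request.GET (assumption for illustration).
params = {"ticker": "APPL", "startDate": "19981212"}

try:
    params("ticker")  # parentheses try to *call* the dictionary
    call_error = None
except TypeError as exc:
    call_error = str(exc)

ticker = params["ticker"]                   # raises KeyError if missing
period = params.get("period")               # None if missing
period_default = params.get("period", "d")  # fallback value if missing

print(call_error)
print(ticker, period, period_default)
```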
