Python Requests post XML file not working - python

I'm trying to debug why I'm not able to POST an XML file to an API with requests.post(). The following wget command works fine:
wget -vv --no-check-certificate --post-file dynobjadd.xml \
"https://1.1.1.1/api/?type=user-id&action=set&key=$MYSUPERSECRETKEY=&file-name=dynobjadd.xml&client=wget" \
--no-http-keep-alive -O response.out
Successful wget output:
...
URI encoding = ‘UTF-8’
--2017-01-05 13:21:11-- https://1.1.1.1/api/?type=user-id&action=set&key=$MYSUPERSECRETKEY=&file-name=dynobjadd.xml&client=wget
Certificates loaded: 165
Connecting to 1.1.1.1:443... connected.
...
---request begin---
POST https://1.1.1.1/api/?type=user-id&action=set&key=$MYSUPERSECRETKEY=&file-name=dynobjadd.xml&client=wget HTTP/1.1
User-Agent: Wget/1.18 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: 1.1.1.1
Connection: Close
Content-Type: application/x-www-form-urlencoded
Content-Length: 175
---request end---
[writing BODY file dynobjadd.xml ... done]
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 200 OK
Server:
Date: Thu, 05 Jan 2017 20:21:12 GMT
Content-Type: application/xml; charset=UTF-8
Content-Length: 255
Connection: close
ETag: "437cf-12b-56e39c36"
Pragma: no-cache
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Expires: Thu, 19 Nov 1981 08:52:00 GMT
X-FRAME-OPTIONS: SAMEORIGIN
Set-Cookie: PHPSESSID=123123; path=/; secure; HttpOnly
---response end---
200 OK
Python code I am trying:
import requests
xml = open('dynobjadd.xml').read()
url = 'https://1.1.1.1/api/?type=user-id&action=set&key=$MYSUPERSECRETKEY=&file-name=dynobjadd.xml&client=requests'
r = requests.post(url, data=xml, verify=False)
r.content output:
<response status = 'error' code = '400'><result><msg>No file uploaded</msg></result></response>

You should use the 'files' argument.
From the Requests quickstart, http://docs.python-requests.org/en/master/user/quickstart/#post-a-multipart-encoded-file :
>>> url = 'http://httpbin.org/post'
>>> files = {'file': open('report.xls', 'rb')}
>>> r = requests.post(url, files=files)
>>> r.text
{
...
"files": {
"file": "<censored...binary...data>"
},
...
}
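Applied to the XML upload from the question, a minimal sketch might look like the following. It only prepares the request and sends nothing, so the resulting headers and body can be inspected first. The form field name "file", the shortened URL, and the placeholder XML bytes are assumptions for illustration; the API's documentation decides the real field name.

```python
import requests


def build_upload_request(url, xml_bytes, filename="dynobjadd.xml"):
    """Prepare (but do not send) a multipart POST carrying the XML as a file part."""
    # "file" is an assumed form field name, not one confirmed by the question.
    files = {"file": (filename, xml_bytes, "application/xml")}
    return requests.Request("POST", url, files=files).prepare()


# Placeholder payload; the real script would read dynobjadd.xml in binary mode.
prepared = build_upload_request(
    "https://1.1.1.1/api/?type=user-id&action=set",
    b"<example/>",
)

# requests generates the multipart boundary and puts it in the header.
print(prepared.headers["Content-Type"])
```

To actually send it, something like requests.Session().send(prepared, verify=False) can be used, which matches the verify=False in the original attempt.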

Report Portal - How to avoid unwanted logs in console?

I get the messages below for every test step, which is a bit annoying. I need to process the console logs in a different way.
send: b'PUT /api/v2/superadmin_personal/item/14278b98-4430-4d2e-8301-1e30501da3b3 HTTP/1.1\r\nHost: abc.lab.com:8080\r\nUser-Agent: python-requests/2.27.1\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\nAuthorization: Bearer 2c0717a7-b477-4e02-b1b5-df2a2757db70\r\nContent-Length: 137\r\nContent-Type: application/json\r\n\r\n'
send: b'{"endTime": "1646987482101", "status": "PASSED", "issue": null, "launchUuid": "f380b026-d7c9-4596-b80a-dcaec6fa82f2", "attributes": null}'
reply: 'HTTP/1.1 200 OK\r\n'
header: Cache-Control: no-cache, no-store, max-age=0, must-revalidate
header: Content-Type: application/json
header: Date: Fri, 11 Mar 2022 08:30:58 GMT
header: Expires: 0
header: Pragma: no-cache
header: X-Content-Type-Options: nosniff
header: X-Frame-Options: DENY
header: X-Xss-Protection: 1; mode=block
header: Content-Length: 93
You can set rp.http.logging=false in the reportportal.prop file or as a JVM parameter.
There is a common switch for all HTTP requests/responses Python sends:
from http.client import HTTPConnection
HTTPConnection.debuglevel = 0
Unfortunately, Python just uses print for this HTTP debug output (as here), ignoring its own logging framework. That is awkward, but it is how things stand. As a result, there is no straightforward way to choose which messages to log and which to skip; you can only turn console printing on or off for all HTTP requests.
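Since the switch is all-or-nothing, one way to limit its effect is to change it only around a specific block of code. The sketch below is a minimal illustration using only the standard library; the helper name http_debug is hypothetical. It sets the class-wide default, so a connection object that had set_debuglevel() called on it directly keeps its own value.

```python
from contextlib import contextmanager
from http.client import HTTPConnection


@contextmanager
def http_debug(level):
    """Temporarily set the class-wide http.client debug level, then restore it."""
    previous = HTTPConnection.debuglevel
    HTTPConnection.debuglevel = level
    try:
        yield
    finally:
        # Restore the earlier value even if the block raised an exception.
        HTTPConnection.debuglevel = previous


# Usage: wrap only the calls whose console output should be silenced.
with http_debug(0):
    pass  # HTTP calls made here would not print send:/reply:/header: lines
```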

Difference between Python "requests" and Linux "curl"

I have searched in several places, but nowhere have I found a satisfactory answer to this:
What are the differences between the Python "requests" module and the Linux "curl" command? Does "requests" use "curl" under the hood, or is it a totally different way of dealing with HTTP requests and responses?
For most requests, they both behave the same way (as they should), but sometimes I find a difference in the response, and it is really hard to figure out why.
E.g., using curl for a HEAD request:
curl --head https://historia.sherpadesk.com
HTTP/2 302
content-type: text/html; charset=utf-8
date: Mon, 28 Feb 2022 20:31:30 GMT
access-control-expose-headers: Request-Context
cache-control: private
location: /login/?ref=portal
set-cookie: ASP.NET_SessionId=nghpw4qp5cw2ntwmwfuxw3oi; path=/; HttpOnly; SameSite=Lax
content-length: 135
request-context: appId=cid-v1:d5f9900e-ecd4-442f-9e92-e11b4cdbc0c9
x-frame-options: SAMEORIGIN
x-xss-protection: 1
x-content-type-options: nosniff
strict-transport-security: max-age=31536000
and if I use -L to follow redirects,
curl --head https://historia.sherpadesk.com -L
HTTP/2 302
content-type: text/html; charset=utf-8
date: Mon, 28 Feb 2022 20:31:37 GMT
access-control-expose-headers: Request-Context
cache-control: private
location: /login/?ref=portal
set-cookie: ASP.NET_SessionId=trzp0bql4nibswux5z5wfayy; path=/; HttpOnly; SameSite=Lax
content-length: 135
request-context: appId=cid-v1:d5f9900e-ecd4-442f-9e92-e11b4cdbc0c9
x-frame-options: SAMEORIGIN
x-xss-protection: 1
x-content-type-options: nosniff
strict-transport-security: max-age=31536000
HTTP/2 302
content-type: text/html; charset=utf-8
date: Mon, 28 Feb 2022 20:31:38 GMT
access-control-expose-headers: Request-Context
location: https://app.sherpadesk.com/login/?ref=portal
content-length: 161
request-context: appId=cid-v1:d5f9900e-ecd4-442f-9e92-e11b4cdbc0c9
x-frame-options: SAMEORIGIN
x-xss-protection: 1
x-content-type-options: nosniff
strict-transport-security: max-age=31536000
HTTP/2 200
content-type: text/html; charset=utf-8
date: Mon, 28 Feb 2022 20:31:39 GMT
access-control-expose-headers: Request-Context
cache-control: no-store, no-cache
expires: -1
pragma: no-cache
set-cookie: ASP.NET_SessionId=aqmnxu2s3qkri3sravsrs1cq; path=/; HttpOnly; SameSite=Lax
content-length: 8935
request-context: appId=cid-v1:d5f9900e-ecd4-442f-9e92-e11b4cdbc0c9
x-frame-options: SAMEORIGIN
x-xss-protection: 1
x-content-type-options: nosniff
strict-transport-security: max-age=31536000
and here is the (debug) output when I use Python's requests module requests.head(url):
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): historia.sherpadesk.com:443
send: b'HEAD / HTTP/1.1\r\nHost: historia.sherpadesk.com\r\nUser-Agent: python-requests/2.26.0\r\nAccept-Encoding: gzip, deflate, br\r\nAccept: */*\r\nConnection: keep-alive\r\n\r\n'
reply: 'HTTP/1.1 403 Forbidden: Access is denied.\r\n'
header: Content-Length: 58
header: Content-Type: text/html
header: Date: Mon, 28 Feb 2022 20:36:18 GMT
header: X-Frame-Options: SAMEORIGIN
header: X-XSS-Protection: 1
header: X-Content-Type-Options: nosniff
header: Strict-Transport-Security: max-age=31536000
DEBUG:urllib3.connectionpool:https://historia.sherpadesk.com:443 "HEAD / HTTP/1.1" 403 0
INFO:root:URL: https://historia.sherpadesk.com/
INFO:root:<Response [403]>
This just results in a 403 response code. The response is the same whether allow_redirects is True or False. I have also tried using a proxy with the Python code, as I thought the site might be recognising Python's request as a bot and blocking it, but that also fails. Also, if that were the case, why does curl succeed?
So, my main question here is: what are the major differences between curl and requests that might cause different responses in certain cases? If possible, I would really like a thorough explanation that could help me debug and resolve these issues.
The two are separate HTTP clients (requests is built on urllib3, not on curl), but the problem here is related to the User-Agent header.
When I try with curl, specifying the python-requests user agent:
$ curl --head -A "python-requests/2.26.0" https://historia.sherpadesk.com/
HTTP/2 403
content-type: text/html
date: Mon, 28 Feb 2022 22:30:02 GMT
content-length: 58
x-frame-options: SAMEORIGIN
x-xss-protection: 1
x-content-type-options: nosniff
strict-transport-security: max-age=31536000
With curl default user agent:
$ curl --head https://historia.sherpadesk.com/
HTTP/2 302
...
Apparently, they have some type of website security that is blocking HTTP clients like python-requests, but not curl for some reason.
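On the Python side, the User-Agent can be overridden per session. The sketch below prepares the HEAD request without sending it, so the header that would go out can be checked. The value "curl/7.81.0" is a hypothetical stand-in for whatever string the server accepts; the source does not say which values pass.

```python
import requests

# Hypothetical User-Agent value, chosen only for illustration.
CUSTOM_USER_AGENT = "curl/7.81.0"

session = requests.Session()
# By default the session identifies itself as python-requests/<version>.
print(session.headers["User-Agent"])

session.headers.update({"User-Agent": CUSTOM_USER_AGENT})

# prepare_request merges the session headers into the request without sending it.
prepared = session.prepare_request(
    requests.Request("HEAD", "https://historia.sherpadesk.com/")
)
print(prepared.headers["User-Agent"])
```

Sending it for real would be session.head(url, allow_redirects=True); note that requests does not follow redirects for HEAD unless allow_redirects is set.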

Got 404 error during uploading file to YouTube

I have a basic script for uploading video files and a small pipeline to run it.
The script is from https://developers.google.com/youtube/v3/guides/uploading_a_video
It worked fine for the last couple of months, but I have started getting a 404 error from the API server.
Here is the full output:
/usr/bin/python2.7 push_video.py --file="/tmp/HofQ.mp4" --description="File name" --keywords="test" --category="22" --privacyStatus="private"
connect: (www.googleapis.com, 443)
send: 'GET /discovery/v1/apis/youtube/v3/rest HTTP/1.1\r\nHost: www.googleapis.com\r\naccept-encoding: gzip, deflate\r\nauthorization: Bearer ya29.a0AfH6SMCzOovQCiAa0I-Mrz7oD-wWeikotEGLIRzk2Z6D2N7umFciU5RDQWZMKtBOXu-7gI_-v_ArhcNtTE9kNPnYHVYi32697vPBUC3be0sAi-kPHN9Utpi00gS1KDpa5gko8ZR_D_euZSzM_3VJrinOMe1jWIsS-WY\r\nuser-agent: Python-httplib2/0.17.2 (gzip)\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Content-Type: application/json; charset=UTF-8
header: Vary: Origin
header: Vary: X-Origin
header: Vary: Referer
header: Content-Encoding: gzip
header: Date: Fri, 18 Sep 2020 09:35:20 GMT
header: Server: scaffolding on HTTPServer2
header: Cache-Control: private
header: X-XSS-Protection: 0
header: X-Frame-Options: SAMEORIGIN
header: X-Content-Type-Options: nosniff
header: Alt-Svc: h3-29=":443"; ma=2592000,h3-27=":443"; ma=2592000,h3-T051=":443"; ma=2592000,h3-T050=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
header: Transfer-Encoding: chunked
Uploading file...
connect: (youtube.googleapis.com, 443)
send: u'POST /upload/youtube/v3/videos?uploadType=resumable&alt=json&part=status%2Csnippet HTTP/1.1\r\nHost: youtube.googleapis.com\r\ncontent-length: 142\r\naccept-encoding: gzip, deflate\r\naccept: application/json\r\nuser-agent: (gzip)\r\nx-upload-content-length: 155903894\r\nx-upload-content-type: video/mp4\r\ncontent-type: application/json\r\nauthorization: Bearer xxxxxx_-v_xxxx\r\nx-goog-api-client: gdcl/1.8.0 gl-python/2.7.18rc1\r\n\r\n{"status": {"privacyStatus": "private"}, "snippet": {"tags": ["test"], "categoryId": "22", "description": "File name", "title": "Test Title"}}'
reply: 'HTTP/1.1 404 Not Found\r\n'
header: Content-Type: text/plain; charset=utf-8
header: X-GUploader-UploadID: ABg5-UyUv6D3yHDBjVs6znCaTrtwA5GthyyHgrqOZNzB2uRy_QnO10h40rBmFEJMBQQvzwKggt7J-k4ulclMI2e9H90emRvk-A
header: Content-Length: 9
header: Date: Fri, 18 Sep 2020 09:35:20 GMT
header: Server: UploadServer
header: Alt-Svc: h3-Q050=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-27=":443"; ma=2592000,h3-T051=":443"; ma=2592000,h3-T050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
An HTTP error 404 occurred:
Not Found
What I have tried:
Running from other hosts: same problem
Renewing authorization: same problem
Many thanks for any help.
YouTube fixed the issue by changing the rootUrl in https://www.googleapis.com/discovery/v1/apis/youtube/v3/rest from https://youtube.googleapis.com/ to https://www.googleapis.com/

Cannot retrieve CSP header with Python

I want to retrieve all headers from a certain site, in this example "https://www.facebook.com", as follows:
import urllib2
req = urllib2.Request('https://www.facebook.com/')
res = urllib2.urlopen(req)
print res.info()
res.close()
that results in this response:
X-XSS-Protection: 0
Pragma: no-cache
Cache-Control: private, no-cache, no-store, must-revalidate
X-Frame-Options: DENY
Strict-Transport-Security: max-age=15552000; preload
X-Content-Type-Options: nosniff
Expires: Sat, 01 Jan 2000 00:00:00 GMT
Set-Cookie: sb=1GyeWkJzGbmX-VUyBi26; expires=Thu, 05-Mar-2020 10:26:28 GMT; Max-Age=63071999; path=/; domain=.facebook.com; secure; httponly
Vary: Accept-Encoding
Content-Type: text/html; charset=UTF-8
X-FB-Debug: X9aSOOKs6/aER1yuY4iUUIZrj4yTKtKSUAZ/AFE37IieCe8O4MSsFc5xlQ0LoQyHnbrSL4DaYiTVUUkFZeDrsqqg==
Date: Tue, 06 Mar 2018 10:26:29 GMT
Connection: close
I can retrieve all headers except Content-Security-Policy (CSP).
But when I test the same site with the Geekflare CSP test, it successfully retrieves all headers, including the CSP one.
Seems like I forgot to set the User-Agent within the request.
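A minimal sketch of that fix, written for Python 3, where urllib.request replaces urllib2. The browser-like User-Agent string and the helper names build_request and fetch_csp are assumptions for illustration; which User-Agent values make the site return the CSP header is not stated in the source.

```python
from urllib.request import Request, urlopen

# Hypothetical browser-like User-Agent string, for illustration only.
BROWSER_UA = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)


def build_request(url, user_agent=BROWSER_UA):
    """Return a urllib Request that carries an explicit User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch_csp(url):
    """Fetch the URL and return its Content-Security-Policy header, or None."""
    with urlopen(build_request(url)) as res:
        return res.headers.get("Content-Security-Policy")


req = build_request("https://www.facebook.com/")
# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

Calling fetch_csp("https://www.facebook.com/") performs the actual request.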

Uploading file using requests python giving 500 error

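import sys
import requests
from requests import exceptions
def upload_document():  # enclosing function restored; its name is a placeholder
    request_url = "http://sandbox.api.hmhco.com/v1/documents"
    try:
        with open("C:\\Users\\Animesh\\Downloads\\assignment_4.pdf", "rb") as f:
            files = {"file": f}
            # Send the request while the file is still open
            r = requests.post(request_url, files=files)
        r.raise_for_status()
        return r.json()
    except exceptions.RequestException as e:
        print(e)
        sys.exit(-1)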
This gives a 500 status error. I used logging to get what is being sent:
send:
POST /v1/documents HTTP/1.1
Host: sandbox.api.hmhco.com
Vnd-HMH-Api-Key: some_api_key
Accept-Encoding: gzip, deflate
Content-Length: 56926
Accept: application/json
User-Agent: python-requests/2.9.1
Connection: keep-alive
Content-Type: multipart/form-data
Authorization: some_access_token
--e11c78c0aeeb4cafa9837388ab386660
Content-Disposition: form-data; name="file"; filename="assignment_4.pdf"
%PDF-1.5\n%\xc7\xec\x8f\xa2\n5.......afa9837388ab386660--'
reply: 'HTTP/1.1 500 Internal Server Error\r\n'
header: Access-Control-Allow-Headers: Authorization, Content-Type, Vnd-HMH-Api-Key
header: Access-Control-Allow-Methods: POST, GET, OPTIONS, DELETE, PUT, PATCH
header: Access-Control-Allow-Origin: *
header: Access-Control-Max-Age: 3600
header: Date: Tue, 29 Dec 2015 05:21:58 GMT
header: Server: nginx/1.4.6 (Ubuntu)
header: X-Application-Context: palantir:8070
header: Content-Length: 0
header: Connection: keep-alive
500 Server Error: Internal Server Error for url: http://sandbox.api.hmhco.com/v1/documents
But the corresponding cURL command works:
curl -v -X POST -H "Vnd-HMH-Api-Key:some_api_key" -H "accepts:application/json" -H "Content-Type:application/json" -H "Authorization:some_access_token" "http://sandbox.api.hmhco.com/v1/documents" -F file=@"C:\Users\Animesh\Downloads\assignment_4.pdf"
The verbose output gives:
> POST /v1/documents HTTP/1.1
> User-Agent: curl/7.37.0
> Host: sandbox.api.hmhco.com
> Accept: */*
> Vnd-HMH-Api-Key:some_api_key
> accepts:application/json
> Authorization:some_access_token
> Content-Length: 56982
> Expect: 100-continue
> Content-Type:multipart/form-data; boundary=------------------------20b5cc23cfa6d1d4
>
< HTTP/1.1 100 Continue
< HTTP/1.1 200 OK
< Access-Control-Allow-Headers: Authorization, Content-Type, Vnd-HMH-Api-Key
< Access-Control-Allow-Headers: accept, api_key, x-mashery-debug, Authorization, authCurrentDateTime, Content-Type
< Access-Control-Allow-Methods: POST, GET, OPTIONS, DELETE, PUT, PATCH
< Access-Control-Allow-Methods: POST, GET, OPTIONS, PUT, DELETE
< Access-Control-Allow-Origin: *
< Access-Control-Max-Age: 3600
< Access-Control-Request-Headers: accept, api_key, x-mashery-debug, Authorization, authCurrentDateTime, Content-Type
< Cache-Control: max-age=0, private, must-revalidate
< Content-Type: application/json;charset=UTF-8
< Date: Tue, 29 Dec 2015 05:28:40 GMT
< ETag: "e421273e09891349ef4808ca5677345c"
< P3P: CP="NOI ADM DEV PSAi COM NAV OUR OTRo STP IND DEM"
* Server nginx/1.4.6 (Ubuntu) is not blacklisted
< Server: nginx/1.4.6 (Ubuntu)
< Status: 200 OK
< User-Info: {"iss":"https://identity.api.hmhco.com","aud":"http://www.hmhco.com","iat":1451323840,"sub":"cn\u003dSauron Baraddur,uid\u003dsauron,uniqueIdentifier\u003dab1e436e-8177-4139-8e2d-52a6f7a8be27,dc\u003d1","http://www.imsglobal.org/imspurl/lis/v1/vocab/person":["Instructor"],"client_id":"ef5f7a03-58e8-48d7-a38a-abbd2696bdb6.hmhco.com","exp":1451409840}
< X-Application-Context: palantir:8070
< X-Rack-Cache: invalidate, pass
< X-Request-Id: a0078f7f5daf804affb98de6947170ca
< X-Runtime: 0.066617
< X-UA-Compatible: IE=Edge,chrome=1
< Content-Length: 452
< Connection: keep-alive
<
{"content_type":"application/pdf","created_at":"2015-12-29T00:28:40-05:00","file_tmp":"1451366920-13680-2938/assignment_4.pdf","original_filename":"assignment_4.pdf","secure_token":"7f395da4-5ccf-41a6-86a3-16a5af0de79","size":56774,"updated_at":"2015-12-29T00:2 ......
This gives response code 200 and a JSON response body:
{
"content_type": "application/pdf",
"created_at": "2015-12-28T14:13:25-05:00",
"file_tmp": "1451330005-13680-5506/assignment_4.pdf",
"original_filename": "assignment_4.pdf",
"secure_token": "9546cf79-0b00-4454-b577-6bdc8ddd7315",
"size": 56774,
"updated_at": "2015-12-28T14:13:25-05:00", ......
Change the line that sends the POST request to:
r = requests.post(request_url, files=files, headers={'content-type': 'application/json'})
In the sample above, curl makes the POST with 'content-type': 'application/json' and everything goes well, while your Python request does not set it. Setting the content-type in the Python code should fix your issue.
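The two captures above also differ in a detail that may be worth checking: the requests capture shows Content-Type: multipart/form-data with no boundary parameter, while the curl capture shows one. The sketch below only prepares requests and sends nothing. It illustrates how requests treats a Content-Type header supplied alongside files: it generates a boundary when left alone and keeps a caller-supplied value as given. The PDF bytes are a placeholder, and whether the missing boundary is what the server rejects is not confirmed by the source.

```python
import requests

URL = "http://sandbox.api.hmhco.com/v1/documents"
# Placeholder bytes standing in for the real PDF content.
PART = {"file": ("assignment_4.pdf", b"%PDF-1.5 placeholder", "application/pdf")}

# No Content-Type supplied: requests writes one that includes the boundary.
auto = requests.Request("POST", URL, files=PART).prepare()

# Content-Type supplied by the caller: requests keeps it exactly as given.
manual = requests.Request(
    "POST",
    URL,
    files=PART,
    headers={"Content-Type": "multipart/form-data"},
).prepare()

print(auto.headers["Content-Type"])
print(manual.headers["Content-Type"])
```

Comparing the prepared headers this way, before sending, is a cheap check on what any header override actually puts on the wire.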
