I'm creating a forum status grabber. But I want to use sockets to grab the data from the forum. So I am writing to the socket a header. But there is 400 error. So I made a test script to do checking but still I get errors.
import socket
s = socket.socket()
s.connect(("198.57.47.136", 80))
header = """
GET / HTTP/1.1\r\n
Host: httn
Connection: keep-alive\r\n
Cache-Control: max-age=0\r\n
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8\r\n
User-Agent: Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36 OPR/26.0.1656.60\r\n
Accept-Encoding: gzip, deflate, lzma, sdch\r\n
Accept-Language: en-GB,en-US;q=0.8,en;q=0.6\r\n
"""
s.send(header)
print s.recv(10000)
Which returns
HTTP/1.1 400 Bad Request
Server: nginx
Date: Thu, 01 Jan 2015 21:43:47 GMT
Content-Type: text/html
Content-Length: 166
Connection: close
<html>
<head><title>400 Bad Request</title></head>
<body bgcolor="white">
<center><h1>400 Bad Request</h1></center>
<hr><center>nginx</center>
</body>
</html>
A multi-line Python string adds an extra \n for every line. Note:
>>> s = '''
... Host: rile5.com\r\n
... '''
>>>
>>> s
'\nHost: rile5.com\r\n\n'
There is an extra first line and two \n for each line. This works, but not on the original IP address you used:
import socket
s = socket.socket()
s.connect(("rile5.com", 80))
header = b"""\
GET / HTTP/1.1\r
Host: rile5.com\r
Connection: keep-alive\r
Cache-Control: max-age=0\r
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8\r
User-Agent: Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36 OPR/26.0.1656.60\r
Accept-Encoding: gzip, deflate, lzma, sdch\r
Accept-Language: en-GB,en-US;q=0.8,en;q=0.6\r
\r
"""
s.sendall(header)
print(s.recv(10000))
Note the extra slash after the opening quotes. This suppresses the initial newline.
header = b"""\
Also note the extra blank line at the end. This is required so the server knows the header is complete.
Why not just use urllib.request?
Probably the problem is with the format of your request.
First, your HTTP request starts with a line feed. Also, the lines in a HTTP request must be separated by \r\n, while Python multiline strings only have \n. But since you have literals \r\n in some of them (not all) it is a mess.
Finally, the header must end with an empty line.
My advice is to use a list of strings without any line ending, and then join them:
header_lines = [
"GET / HTTP/1.1",
"Host: httn",
"Connection: keep-alive",
...
]
header = "\r\n".join(header_lines) + "\r\n\r\n"
Note that since str.join() does not add a final EOL, you have to add two of them to include the mandatory empty line.
Related
Hi I have a request as so
GET https://dropezy.goadda.in/master/v1/id/products/search?q=buah&storeId=619e02f0f580166d48277fc0&sortBy=relevance HTTP/2.0
accept: application/json
enro-api-key: rhcvwcdd-7aza-zvv7qskyu-4b8jm8kwppkf
cache-control: max-age=31536000
store-id: 619e02f0f580166d48277fc0
user-agent: Mozilla/5.0 (Linux; Android 11; sdk_gphone_x86 Build/RSR1.201013.001; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/83.0.4103.106 Mobile Safari/537.36
content-type: application/json; charset=utf-8
origin: https://www.dropezy.com
x-requested-with: com.enroco.dropezy
sec-fetch-site: cross-site
sec-fetch-mode: cors
sec-fetch-dest: empty
referer: https://www.dropezy.com/id/search/
accept-encoding: gzip, deflate
accept-language: en-US,en;q=0.9
content-length: 0
When I tried using request in python it returns message not found
This is my current code
url = 'https://dropezy.goadda.in/master/v1/id/products/search?q=buah&storeId=619e02f0f580166d48277fc0&sortBy=relevance HTTP/2.0'
accept='application/json'
jsonData = requests.post(url, accept).json()
Can somebody help? How to insert the other components such as user-agent, enro-api-key into requests?
The two requests aren't the same. The first is a GET request, but the second is a POST.
Use requests.get to make a GET request.
If you need to supply headers in a request, yes, you can do that: https://docs.python-requests.org/en/latest/user/quickstart/#custom-headers.
Generally, the docs for Requests are quite good: https://docs.python-requests.org/en/latest/
I want to handle multipart/form-data with python.
The request will look like this.
POST /upload HTTP/1.1
Host: localhost:8000
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:83.0) Gecko/20100101 Firefox/83.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Referer: http://localhost:8000/uplaod.html
Content-Type: multipart/form-data; boundary=---------------------------13077140074516507283069689500
Content-Length: 20970
Origin: http://localhost:8000
Connection: keep-alive
Upgrade-Insecure-Requests: 1
Cache-Control: max-age=0
-----------------------------13077140074516507283069689500
Content-Disposition: form-data; name="mylogo"; filename="logo.png"
Content-Type: image/png
image_byte_code
-----------------------------13077140074516507283069689500--
This will be received in bytes from client browser.
I need to decode it to string.
But when i decode it the image bytes get destroyed.
Which then when i save it it will not be an image.
I have so far been able to decode it and receive text files.
Using the email library that defaults with python can really help. you just split the header from the data then use this https://julien.danjou.info/handling-multipart-form-data-python/. Make sure you get the boundary data from the headers.
I'm trying to put variables into long string but it raises error. I've tried %s notation and now I'm trying to put there .format() notation but it doesn't work either.
Could you tell me what is the problem? Is it because the string is on multiple lines?
num = 5
r=requests.post("http://www.quoka.de/kleinanzeigen/nachhilfe/cat_0_ct_0_page_'{0}'.html".format(str(num)), headers=mLib.firebug_headers_to_dict("""POST /kleinanzeigen/nachhilfe/cat_0_ct_0_page_'{1}'.html HTTP/1.1
Host: www.quoka.de
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Referer: http://www.quoka.de/kleinanzeigen/nachhilfe/cat_0_ct_0_page_'{2}'.html
Cookie: QSESSID=cs9djh8q8c6mjgsme85s4mf7iq24pqrthag630ar6po9fp078e20; PARTNER=VIEW%02quoka%01COOKIEBEGIN%021438270171; QUUHS=QPV%027%01QABAFS%02A%01ARYSEARCHODER%02%7B%22search1%22%3A%22nachhilfe%22%7D; __utma=195481459.415565446.1438270093.1438270093.1438270093.1; __utmb=195481459.22.8.1438270093; __utmc=195481459; __utmz=195481459.1438270093.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmt=1; __utmt_t2=1; POPUPCHECK=1438356493224; axd=100086901728180087; __gads=ID=92bfc541911a1c81:T=1438270098:S=ALNI_MYuEdhQnu7sWAfK-fyKf1Ej93_9KA; crtg_rta=; OX_sd=5; OX_plg=wmp|pm; rsi_segs=L11278_10123|L11278_10571|L11278_11639|F12351_10001|F12351_0; PURESADSCL=done
Connection: keep-alive
""".format(str(num),str(num-1))))
ERROR:
""".format(str(num),str(num-1))))
IndexError: tuple index out of range
In the second multi-line string, you are giving indexes as - {1} and {2} at -
_page_'{1}'.html
and
_0_page_'{2}'.html
But you are only giving two variables as argument for the format() function.
That is why when format function tries to access variable at {2} (which is the third argument, as its 0 indexed) it gets error.
You should define them as {0} and {1} .
Example -
r=requests.post("http://www.quoka.de/kleinanzeigen/nachhilfe/cat_0_ct_0_page_'{0}'.html".format(str(num)), headers=mLib.firebug_headers_to_dict("""POST /kleinanzeigen/nachhilfe/cat_0_ct_0_page_'{0}'.html HTTP/1.1
Host: www.quoka.de
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Referer: http://www.quoka.de/kleinanzeigen/nachhilfe/cat_0_ct_0_page_'{1}'.html
Cookie: QSESSID=cs9djh8q8c6mjgsme85s4mf7iq24pqrthag630ar6po9fp078e20; PARTNER=VIEW%02quoka%01COOKIEBEGIN%021438270171; QUUHS=QPV%027%01QABAFS%02A%01ARYSEARCHODER%02%7B%22search1%22%3A%22nachhilfe%22%7D; __utma=195481459.415565446.1438270093.1438270093.1438270093.1; __utmb=195481459.22.8.1438270093; __utmc=195481459; __utmz=195481459.1438270093.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmt=1; __utmt_t2=1; POPUPCHECK=1438356493224; axd=100086901728180087; __gads=ID=92bfc541911a1c81:T=1438270098:S=ALNI_MYuEdhQnu7sWAfK-fyKf1Ej93_9KA; crtg_rta=; OX_sd=5; OX_plg=wmp|pm; rsi_segs=L11278_10123|L11278_10571|L11278_11639|F12351_10001|F12351_0; PURESADSCL=done
Connection: keep-alive
""".format(str(num),str(num-1))))
Your format string starts numbering at 1, not 0. You only provide 2 values for slots 0 and 1, but your template expects parameters numbered 1 and 2.
Note that you have two separate strings here. The first string uses {0}, the second uses {1} and {2}. Renumber those to {0} and {1} instead.
To reiterate, the first string is fine:
"http://www.quoka.de/kleinanzeigen/nachhilfe/cat_0_ct_0_page_'{0}'.html".format(str(num))
but you have to restart the numbering in the second string. In that string you have
POST /kleinanzeigen/nachhilfe/cat_0_ct_0_page_'{1}'.html HTTP/1.1
and
Referer: http://www.quoka.de/kleinanzeigen/nachhilfe/cat_0_ct_0_page_'{2}'.html
Simply use
POST /kleinanzeigen/nachhilfe/cat_0_ct_0_page_'{0}'.html HTTP/1.1
and
Referer: http://www.quoka.de/kleinanzeigen/nachhilfe/cat_0_ct_0_page_'{1}'.html
Note that those '..' single quotes are part of the string you are producing. I just tested the site and the URLs there do not use those quotes. You should probably remove them. You also don't have to call str() on everything, the template takes care of that for you:
num = 5
r = requests.post(
"http://www.quoka.de/kleinanzeigen/nachhilfe/cat_0_ct_0_page_{0}.html".format(num),
headers=mLib.firebug_headers_to_dict("""\
POST /kleinanzeigen/nachhilfe/cat_0_ct_0_page_{0}.html HTTP/1.1
Host: www.quoka.de
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Referer: http://www.quoka.de/kleinanzeigen/nachhilfe/cat_0_ct_0_page_{1}.html
Cookie: QSESSID=cs9djh8q8c6mjgsme85s4mf7iq24pqrthag630ar6po9fp078e20; PARTNER=VIEW%02quoka%01COOKIEBEGIN%021438270171; QUUHS=QPV%027%01QABAFS%02A%01ARYSEARCHODER%02%7B%22search1%22%3A%22nachhilfe%22%7D; __utma=195481459.415565446.1438270093.1438270093.1438270093.1; __utmb=195481459.22.8.1438270093; __utmc=195481459; __utmz=195481459.1438270093.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmt=1; __utmt_t2=1; POPUPCHECK=1438356493224; axd=100086901728180087; __gads=ID=92bfc541911a1c81:T=1438270098:S=ALNI_MYuEdhQnu7sWAfK-fyKf1Ej93_9KA; crtg_rta=; OX_sd=5; OX_plg=wmp|pm; rsi_segs=L11278_10123|L11278_10571|L11278_11639|F12351_10001|F12351_0; PURESADSCL=done
Connection: keep-alive
""".format(num, num - 1)
))
Furthermore, I strongly doubt you need to include the POST <path> HTTP/1.1line at all. The Host, Accept-Encoding and Connection headers will be taken care of by requests, and if you were to use a Session() object, so will the cookies. Try not to set all those headers explicitly. Setting the headers as a dictionary is perhaps more readable:
session = requests.Session()
url = 'http://www.quoka.de/kleinanzeigen/nachhilfe/cat_0_ct_0_page_{0}.html'
num = 5
r = session.post(
url.format(num),
headers={
'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0',
'Referer': url.format(num - 1),
}
)
Last but not least, are you sure you need to perform a POST here? You are not posting anything, and a GET works just fine too.
I'm Learning Python And for one of my project I need to POST data to server which uses AMF messaging.
Captured headers looks like this:
POST (info hided)/amfgateway.php HTTP/1.1
Host: (info hided)
Connection: keep-alive
Content-Length: 52
Origin: (info hided)
X-Requested-With: ShockwaveFlash/16.0.0.235
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36
Content-Type: application/x-amf
Accept: */*
Referer: (info hided)
Accept-Encoding: gzip, deflate
Accept-Language: lt,en-US;q=0.8,en;q=0.6,ru;q=0.4,pl;q=0.2
Cookie: (info hided)
bcAmfService.addFriend /1
Aa$
And it's not a problem for me to POST headers but how do I format data that is sended to server:
I know there is a PyAmf library and I looked at documentation but it's very abstract and for beginner like me it's hard to put pieces together in one code.
So how do I format this data in Python?
I am sending several post requests in Fiddler2 to check my site to make sure it is working properly. However, when I automate in Python to simulate this over several hours (I really don't want to spend 7 hours hitting space!).
This works in fiddler. I can create the account and perform related API commands. However in Python, nothing happens with this code:
def main():
import socket
from time import sleep
x = raw_input("Points: ")
x = int(x)
x = int(x/150)
for y in range(x):
new = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
new.connect(('example.com', 80))
mydata ="""POST http://www.example.com/api/site/register/ HTTP/1.1
Host: www.example.com
Connection: keep-alive
Content-Length: 191
X-NewRelic-ID: UAIFVlNXGwEFV1hXAwY=
Origin: http://www.example.com
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.154 Safari/537.36
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
Accept: application/json, text/javascript, */*; q=0.01
X-Requested-With: XMLHttpRequest
X-CSRFToken: CEC9EzYaQOGBdO9HGPVVt3Fg66SVWVXg
DNT: 1
Referer: http://www.example.com/signup
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-GB,en;q=0.8
Cookie: sessionid=sessionid; sb-closed=true; arp_scroll_position=600; csrftoken=2u92jo23g929gj2; __utma=912.1.1.2.5.; __utmb=9139i91; __utmc=2019199; __utmz=260270731.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=(not%20provided)
username=user&password=password&moredata=here """
new.send(mydata.encode('hex'))
print "Sent", y, "of", x
sleep(1)
print "Sent all!"
print "restarting"
main()
main()
I know I could use While True, but I intend to add more functions later to test more sites.
Why does this program not do anything to the API, when Fiddler2 can? I know it is my program, as I can send the exact same packet in fiddler (obviously pointing to the right place) and it works.
PS - If anyone does fix this, as its probably something really obvious, please can you only use modules that are bundled with Python. I cannot install modules from other places. Thanks!
HTTP requests are not as easy as you think they are. First of all this is wrong:
"""POST http://www.example.com/api/site/register/ HTTP/1.1
Host: www.example.com
Connection: keep-alive
...
"""
Each line in HTTP request has to end with CRLF (in Python with \r\n), i.e. it should be:
"""POST http://www.example.com/api/site/register/ HTTP/1.1\r
Host: www.example.com\r
Connection: keep-alive\r
...
"""
Note: LF = line feed = \n is there implicitly. Also you didn't see CR in your fiddler, because it's a white-space. But it has to be there (simple copy-paste won't work).
Also HTTP specifies that after headers there has to be CRLF as well. I.e. your entire request should be:
mydata = """POST http://www.example.com/api/site/register/ HTTP/1.1\r
Host: www.example.com\r
Connection: keep-alive\r
Content-Length: 191\r
X-NewRelic-ID: UAIFVlNXGwEFV1hXAwY=\r
Origin: http://www.example.com\r
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.154 Safari/537.36\r
Content-Type: application/x-www-form-urlencoded; charset=UTF-8\r
Accept: application/json, text/javascript, */*; q=0.01\r
X-Requested-With: XMLHttpRequest\r
X-CSRFToken: CEC9EzYaQOGBdO9HGPVVt3Fg66SVWVXg\r
DNT: 1\r
Referer: http://www.example.com/signup\r
Accept-Encoding: gzip,deflate,sdch\r
Accept-Language: en-GB,en;q=0.8\r
Cookie: sessionid=sessionid; sb-closed=true; arp_scroll_position=600; csrftoken=2u92jo23g929gj2; __utma=912.1.1.2.5.; __utmb=9139i91; __utmc=2019199; __utmz=260270731.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=(not%20provided)\r
\r
username=user&password=password&moredata=here"""
Warning: it should be exactly as I've written. There can't be any spaces in front of each line, i.e. this:
mydata = """POST http://www.example.com/api/site/register/ HTTP/1.1\r
Host: www.example.com\r
Connection: keep-alive\r
...
"""
is wrong.
Side note: you can move mydata to the top, outside of the loop. Unimportant optimization but makes your code cleaner.
Now you've said that the site you are using wants you to hex-encode HTTP request? It's hard for me to believe that (HTTP is a raw string by definition). Don't do that (and ask them to specify what exactly this hex-encoding means). Possibly they've meant that the URL should be hex-encoded (since it is the only hex-encoding that is actually used in HTTP)? In your case there is nothing to encode so don't worry about it. Just remove the .encode('hex') line.
Also Content-Length header is messed up. It should be the actual length of the content. So if for example body is username=user&password=password&moredata=here then it should be Content-Length: 45.
Next thing is that the server might not allow you making multiple requests without getting a response. You should use new.recv(b) where b is a number of bytes you want to read. But how many you should read? Well this might be problematic and that's where Content-Length header comes in. First you have to read headers (i.e. read until you read \r\n\r\n which means the end of headers) and then you have to read the body (based on Content-Length header). As you can see things are becoming messy (see: final section of my answer).
There are probably more issues with your code. For example X-CSRFToken suggests that the site uses CSRF prevention mechanism. In that case your request might not work at all (you have to get the value of X-CSRFToken header from the server).
Finally: don't use sockets directly. Httplib (http://docs.python.org/2/library/httplib.html) is a great (and built-in) library for making HTTP requests which will deal with all the funky and tricky HTTP stuff for you. Your code for example may look like this:
import httplib
headers = {
"Host": "www.example.com",
"X-NewRelic-ID": "UAIFVlNXGwEFV1hXAwY=",
"Origin": "http://www.example.com",
"Connection": "keep-alive",
"User-Agent": "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.154 Safari/537.36",
"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
"Accept": "application/json, text/javascript, */*; q=0.01",
"X-Requested-With": "XMLHttpRequest",
"X-CSRFToken": "CEC9EzYaQOGBdO9HGPVVt3Fg66SVWVXg",
"DNT": "1",
"Referer": "http://www.example.com/signup",
"Accept-Encoding": "gzip,deflate,sdch",
"Accept-Language": "en-GB,en;q=0.8",
"Cookie": "sessionid=sessionid; sb-closed=true; arp_scroll_position=600; csrftoken=2u92jo23g929gj2; __utma=912.1.1.2.5.; __utmb=9139i91; __utmc=2019199; __utmz=260270731.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=(not%20provided)"
}
body = "username=user&password=password&moredata=here"
conn = httplib.HTTPConnection("example.com")
conn.request("POST", "http://www.example.com/api/site/register/", body, headers)
res = conn.getresponse()
Note that you don't need to specify Content-Length header.