I am trying to record the HTTP GET/POST requests sent by my browser using the scotch library.
I am using their sample code: http://darcs.idyll.org/~t/projects/scotch/doc/recipes.html#id2
import scotch.proxy
app = scotch.proxy.ProxyApp()

import scotch.recorder
recorder = scotch.recorder.Recorder(app, verbosity=1)

try:
    from wsgiref.simple_server import WSGIServer, WSGIRequestHandler
    server_address = ('', 8000)
    httpd = WSGIServer(server_address, WSGIRequestHandler)
    httpd.set_app(app)
    while 1:
        httpd.handle_request()
finally:
    from cPickle import dump
    outfp = open('recording.pickle', 'w')
    dump(recorder.record_holder, outfp)
    outfp.close()
    print 'saved %d records' % (len(recorder.record_holder))
So I ran the above code, went over to Google Chrome, and visited a few sites to see if that would get recorded.
However, I do not see how the code should terminate; it seems an error has to occur in httpd.handle_request() for the loop to exit.
I tried a variation where I removed the try/finally and changed the while condition so the loop would run for 30 seconds, but that seems to run forever as well.
Any ideas on how to get this working? I am also open to other Python libraries for what I am trying to do: record my browser's GET/POST requests, including logons, and replay them from within Python.
Thanks.
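For the replay side, the pickle written in the finally block can be loaded back later. A minimal sketch (the filename matches the recipe; the list of strings below is a hypothetical stand-in for scotch's record_holder, whose real structure depends on the library):

```python
import pickle

# Hypothetical stand-in: record_holder's real structure depends on scotch.
records = ["GET http://example.com/", "POST http://example.com/login"]

# Save the way the recipe does, then load it back for replay/inspection.
with open("recording.pickle", "wb") as outfp:
    pickle.dump(records, outfp)

with open("recording.pickle", "rb") as infp:
    loaded = pickle.load(infp)

print("loaded %d records" % len(loaded))
```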
Correct me if I'm wrong, but you're trying to log your local browser's activity by setting up a local proxy. If that is the case, your browser needs to be configured to go through that proxy for your proxy server to log anything.
The code that you've provided sets a proxy server at localhost:8000, so you need to tell your browser about this. The actual setting depends on the browser, I'm sure you'd be able to google it easily.
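For a quick check without touching browser settings, you can point Python's own HTTP tooling at the proxy through the standard environment variables (a sketch; localhost:8000 is the address from the code above):

```python
import os
import urllib.request

# Point plain-HTTP traffic at the local proxy started by the question's code.
os.environ["http_proxy"] = "http://localhost:8000"

# urllib (and most HTTP tooling) picks the proxy up from the environment.
proxies = urllib.request.getproxies()
print(proxies.get("http"))
```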
When I asked you to check whether the code is running, I actually meant whether your local proxy receives any requests from the browser. Do you see the 'saved N records' printout from your code at some point?
So, I am trying to use the pyodata library in Python to access and download data from an OData service.
I tried accessing the Northwind sample service and it worked, so I guess the code I used is OK.
import requests
import pyodata

url_t = 'http://services.odata.org/V2/Northwind/Northwind.svc'

# connection set up
northwind = pyodata.Client(url_t, requests.Session())

# This prints out a single value from the table Customers
for customer in northwind.entity_sets.Customers.get_entities().execute():
    print(customer.CustomerID, ",", customer.CompanyName)
    break

# This will print out - ALFKI , Alfreds Futterkiste
I also tried connecting to the OData service in Excel to check that the code above returns the correct data, and it did.
[Screenshot: Excel OData connection]
Now, using the same code to connect to the data source I actually want to pull from did not work:
# this URL works when connecting from Excel, but the code below does not
url_1 = 'https://batch.decisionkey.npd.com/odata/dkusers'
session = requests.Session()
session.auth = (user_name, psw)
theservice = pyodata.Client(url_1, session)
The above code returns this error message (is it something about security?):
[Screenshot: error message]
Connecting to the data in Excel looks like this:
[Screenshot: Excel connection settings]
I am thinking it might be a security issue blocking me from accessing the data, or it could be something else. Please let me know if anything needs to be clarified. Thanks.
First time asking a question, so please let me know if I did anything wrong here. ^_^
You got HTTP 404 - Not Found.
The service "https://batch.decisionkey.npd.com/odata/dkusers" is not accessible from the outside world for me to try, so something more is happening at the networking level in the second picture of the Excel import.
You can forget about pyodata for the moment; for this problem it is just a wrapper around the HTTP networking layer, the Requests library. You need to find a way to initialize the Requests session so that the server returns HTTP 200 OK instead.
The Northwind example service is plain and simple, so there is no problem during initialization of pyodata.Client.
Refer to the Requests library documentation: https://docs.python-requests.org/en/latest/user/advanced/
# sample script
import requests

url_1 = 'https://batch.decisionkey.npd.com/odata/dkusers'
session = requests.Session()
session.auth = (user_name, psw)
# ??? maybe an SSL certificate needs to be provided?
# ?? or maybe you are behind some proxy that Excel uses but Python does not.. try ping in CMD
response = session.get(url_1)
print(response.text)
The pyodata documentation on initialization can also be useful, although you will not find the reason for the HTTP 404 there: https://pyodata.readthedocs.io/en/latest/usage/initialization.html
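Two common culprits when Excel succeeds but a plain Requests session fails are a corporate proxy and a private CA certificate. Both can be set on the session before it is handed to pyodata.Client; the paths and hosts below are hypothetical placeholders:

```python
import requests

session = requests.Session()
session.auth = ("user_name", "psw")  # as in the question

# Hypothetical: trust a corporate CA bundle instead of the default one.
session.verify = "/etc/ssl/certs/corporate-ca.pem"

# Hypothetical: route through the same proxy Excel might be using.
session.proxies = {"https": "http://proxy.example.com:8080"}

print(session.proxies["https"])
```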
I have a Flask app with code that processes data coming from a request to another web app hosted on a different server. It works just fine in development, and the library that makes the request can be called and used perfectly well from Python on our Windows server. However, when the library is called by the web app in production under mod_wsgi, it refuses to work: requests made by the server just... time out.
I have tried everything from moving the code into the file where it's used to switching from requests to urllib. Nothing helps; as long as they're made from mod_wsgi, all my requests time out.
Why is that? Is it some weird Apache configuration thing I'm unaware of?
I'm posting the library below (sorry, I have to censor it a bit, but I promise it works):
import requests
import re

class CannotAccessServerException(Exception):
    pass

class ServerItemNotFoundException(Exception):
    pass

class Service():
    REQUEST_URL = "http://server-ip/url?query={query}&toexcel=csv"

    @classmethod
    def fetch_info(cls, query):
        # Get approximate matches
        try:
            server_request = requests.get(cls.REQUEST_URL.format(query=query), timeout=30).content
        except:
            raise CannotAccessServerException
        # If you're getting ServerItemNotFoundException or funny values consistently,
        # maybe the server has changed their tables.
        server_regex = re.compile(r'^([\d\-]+);[\d\-]+;[\d\-]+;[\d\-]+;[\d\-]+;[\-"\w]+;[\w"\-]+;{query};[\w"\-]+;[\w"\-]+;[\w"\-]+;[\w"\-]+;[\w\s:"\-]+;[\w\s"\-]+;[\d\-]+;[\d\-]+;[\d\-]+;([\w\-]+);[\w\s"\-]+;[\w\-]+;[\w\s"\-]+;[\d\-]+;[\d\-]+;[\d\-]+;([\w\-]+);[\d\-]+;[\d\-]+;[\w\-]+;[\w\-]+;[\w\-]+;[\w\-]+;[\w\s"\-]+$'.format(query=query), re.MULTILINE)
        server_exact_match = server_regex.search(server_request.decode())
        if server_exact_match is None:
            raise ServerItemNotFoundException
        result_json = {
            "retrieved1": server_exact_match.group(1),
            "retrieved2": server_exact_match.group(2),
            "retrieved3": server_exact_match.group(3)
        }
        return result_json

if __name__ == '__main__':
    print(Service.fetch_info(99999))
PS: I know it times out because one of the things I tried was capturing the error raised by requests.get and returning its representation.
In case anybody's wondering: after lots of research, trying to run my module as a subprocess, and all kinds of experiments, I had to resort to replicating the entire dataset I needed to query from the remote server into my own database with a weekly crontab task, and then querying that instead.
So, yeah, I don't have a solution, to be frank, or an explanation of why this happens. But if it's happening to you, your best bet might sadly be to replicate the entire dataset on your own server.
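The weekly replication described above might look like this in a crontab; the script path, schedule, and log file are hypothetical:

```shell
# Every Sunday at 03:00, refresh the local copy of the remote dataset.
0 3 * * 0 /usr/bin/python3 /srv/app/replicate_dataset.py >> /var/log/replicate.log 2>&1
```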
I'm learning Python these days, and I have a rather basic question.
There's a remote server where a web server is listening for incoming traffic. If I enter a URI like the following in my browser, it performs certain processing on some files for me:
http://223.58.1.10:8000/processVideo?container=videos&video=0eb8c45a-3238-4e2b-a302-b89f97ef0554.mp4
My rather basic question is: how can I send the above dynamic parameters (i.e. container=container_name and video=video_name) to 223.58.1.10:8000/processVideo from a Python script?
I've seen ways to ping an IP, but my requirement is more than that. Guidance from experts would be very helpful. Thanks!
import requests

# params= puts the values in the query string (data= would put them in the request body)
requests.get("http://223.58.1.10:8000/processVideo",
             params={"video": "123asd", "container": "videos"})
I guess... it's not really clear what you are asking...
I'm sure you could do the same with just urllib and/or urllib2:
import urllib

# you should probably use urllib.urlencode here ...
some_url = "http://223.58.1.10:8000/processVideo?video=%s&container=%s" % (video_name, container_name)
urllib.urlopen(some_url).read()
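The inline comment is right that the parameters should be encoded rather than interpolated; in Python 3 the equivalent URL can be built with urllib.parse (the host and values are taken from the question):

```python
from urllib.parse import urlencode

base = "http://223.58.1.10:8000/processVideo"
params = {"container": "videos", "video": "0eb8c45a-3238-4e2b-a302-b89f97ef0554.mp4"}

# urlencode escapes each value, so the result is always a valid query string.
url = base + "?" + urlencode(params)
print(url)
```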
My end goal here is to turn on my TV using my Pi. I've already set up and configured everything I can think of; I can access the Pi remotely via HTTP, but I constantly get a 404 when trying to call a macro via the REST API. The script runs fine on its own; it just can't seem to be called over HTTP.
At this point I'd take any solution that can be executed via HTTP. PHP, CGI, etc., I don't care; I just need it to run alongside the current setup.
Added to the config file as follows:
myscript = /home/pi/harmony.py
harmony.py
import webiopi
import sys
import os

@webiopi.macro
def HarAll():
    os.system("/home/pi/Desktop/harmonycontrol/HarmonyHubControl em#i.l passwort start_activity 6463490")
When I attempt to access http://piaddress:8000/macros/HarAll I get a 404. I'm positive I'm missing a step here; for some reason webIOPi simply isn't adding the macro to the web server.
Got it figured out. This whole time I was trying to test it instead of just adding it to the app I made: I was sending an HTTP GET from the web browser instead of an HTTP POST. Works perfectly.
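The GET-vs-POST distinction behind the fix can be seen with the standard library alone: urllib.request.Request switches to POST as soon as a body is attached (piaddress is the question's placeholder host, and nothing is actually sent here):

```python
from urllib.request import Request

url = "http://piaddress:8000/macros/HarAll"

# No data -> GET; attaching a (possibly empty) body -> POST.
get_req = Request(url)
post_req = Request(url, data=b"")

print(get_req.get_method(), post_req.get_method())
```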
As far as I know, Bottle used with the CherryPy server should behave multi-threaded. I have a simple test program:
from bottle import Bottle, run
import time

app = Bottle()

@app.route('/hello')
def hello():
    time.sleep(5)

@app.route('/hello2')
def hello2():
    time.sleep(5)

run(app, host='0.0.0.0', server="cherrypy", port=8080)
When I call localhost:8080/hello by opening two tabs and refreshing them at the same time, they don't return together: one completes after 5 seconds and the other after 5 more.
But when I call /hello in one tab and /hello2 in another at the same time, they finish at the same time.
Why does Bottle not behave multi-threaded when the same endpoint is called twice? Is there a way to make it multi-threaded?
Python version: 2.7.6
Bottle version: 0.12.8
CherryPy version: 3.7.0
OS: Tried on both Ubuntu 14.04 64-Bit & Windows 10 64-Bit
I have already met this behaviour while answering another question, and it had me confused. If you search around for related questions, the list goes on and on.
The suspects were incorrect server-side handling of Keep-Alive, HTTP pipelining, cache policy and the like. But in fact it has nothing to do with the server side at all. Concurrent requests to the same URL are serialised because of the browser's cache implementation (Firefox, Chromium). The best answer I found before searching the bug trackers directly says:
Necko's cache can only handle one writer per cache entry. So if you make multiple requests for the same URL, the first one will open the cache entry for writing and the later ones will block on the cache entry open until the first one finishes.
Indeed, if you disable the cache in Firebug or DevTools, the effect goes away.
Thus, if your clients are not browsers (an API, for example), just ignore the issue. Otherwise, if you really need to make concurrent requests from one browser to the same URL (normal requests or XHRs), add a random query-string parameter to make the request URLs unique, e.g. http://example.com/concurrent/page?nocache=1433247395.
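The cache-busting suggestion can be wrapped in a tiny helper (a sketch; any unique value works, a millisecond timestamp is used here):

```python
import time

def nocache(url):
    # Append a unique query-string parameter so the browser cache
    # treats each request as a distinct URL.
    sep = '&' if '?' in url else '?'
    return "%s%snocache=%d" % (url, sep, int(time.time() * 1000))

print(nocache("http://example.com/concurrent/page"))
```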
It's almost certainly your browser that's serializing the requests. Try using two different ones, or better yet a real client. It doesn't reproduce for me using curl.