Python server can't access files with escaped characters in the URL

I have a Python server for MP3 streaming, built with the Twisted library.
If I request the file a.mp3, it works normally.
But a file such as 衝.mp3 doesn't work; the server says "File not found".
URL-escaped, this file name is %E8%A1%9D.mp3, but the server can't access it under that name.
If I use the Unicode escape instead of the symbol, as in \u885d.mp3, it still says "File not found".
Here is the code. Notice that I had to add request.path = request.path.replace('%20', ' ') because that's the only way it can access a file with spaces in its path. I don't believe that should be the normal behaviour.
import os
import urlparse
from twisted.internet import reactor
from twisted.web.resource import Resource
from twisted.web.server import Site
from twisted.web.static import File
class playMP3(Resource):
    isLeaf = True
    def render_GET(self, request):
        this = urlparse.urlparse(request.path)  # scheme, netloc, path, query
        root, ext = os.path.splitext(this.path)
        filename = os.path.basename(request.path)
        fileFolder = request.path.replace(filename, "")
        self.serverRoot = os.getcwd()
        print (request.path)
        if ext == ".mp3":
            request.path = request.path.replace('%20', ' ')
            thisFile = File(self.serverRoot + request.path)
            return File.render_GET(thisFile, request)
resource = playMP3()
factory = Site(resource)
reactor.listenTCP(8880, factory)
reactor.run()
I also tried request.path = urllib.unquote(request.path), but instead of decoding to 衝.mp3 the name becomes ÞíØ.mp3. Weird.
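One possible explanation, offered as a sketch rather than a confirmed diagnosis: %E8%A1%9D is the UTF-8 byte sequence for 衝, and unquoting it yields raw bytes that still have to be decoded as UTF-8. Read through a legacy single-byte code page such as cp850, those same three bytes look like ÞíØ. The example below uses Python 3's urllib.parse to make the two steps explicit; the helper name decode_request_path and the sample path are hypothetical.

```python
from urllib.parse import unquote_to_bytes


def decode_request_path(raw_path: str) -> str:
    """Turn a percent-escaped URL path into a Unicode file path."""
    # Step 1: undo the percent-escapes; this yields raw bytes.
    raw_bytes = unquote_to_bytes(raw_path)
    # Step 2: interpret those bytes as UTF-8 (an assumption: browsers
    # normally escape non-ASCII path characters as UTF-8).
    return raw_bytes.decode("utf-8")


escaped = "/music/%E8%A1%9D%20live.mp3"
decoded = decode_request_path(escaped)

# The same bytes read with a legacy single-byte code page give mojibake.
mojibake = unquote_to_bytes("%E8%A1%9D.mp3").decode("cp850")
```

In the Python 2 code from the question, the equivalent would presumably be urllib.unquote(request.path).decode('utf-8'), though that has not been tested against Twisted here. Unquoting also turns %20 into a space, so the manual replace would no longer be needed.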

Related

Python BaseHTTPRequestHandler throws Lookup error on anything but utf-8

I need to write a server program in Python that serves web pages and handles other GET and POST requests to and from the client. I'm new to servers in Python, so I looked up some examples, and after a while I had a basic request handler running with some routing to my pages as a start. Routing worked in the browser and the pages were displayed, but I only got text: no styles, no pictures. Then I looked a bit further and realised that I also needed to handle GET requests for the .css, .js and .jpg files. So I did that, and ended up with something like this:
from http.server import BaseHTTPRequestHandler
class Serv(BaseHTTPRequestHandler):
    def do_GET(self):
        # route incoming path to correct page
        if self.path in ("", "/"):
            self.path = "/my_site/index.html"
        # TODO do same for every page in the site
        if self.path == "/foo":
            self.path = "/my_site/fooandstuff.html"
        if self.path == "/bar":
            self.path = "/my_site/subdir/barfly.html"
        try:
            sendReply = False
            if self.path.endswith(".html"):
                mimetype = "text/html"
                sendReply = True
            if self.path.endswith(".jpg"):
                mimetype = "image/jpg"
                sendReply = True
            if self.path.endswith(".js"):
                mimetype = "application/javascript"
                sendReply = True
            if self.path.endswith(".css"):
                mimetype = "text/css"
                sendReply = True
            if sendReply == True:
                f = open(self.path[1:]).read()
                self.send_response(200)
                self.send_header('Content-type', mimetype)
                self.end_headers()
                self.wfile.write(f.encode(mimetype))
            return
        except IOError:
            self.send_error(404, "File not found %s" % self.path)
When I run this and request a page, I get the following LookupError:
File "d:/somedir/myfile.py", line 47, in do_GET
self.wfile.write(f.encode(mimetype))
LookupError: unknown encoding: text/html
If I change text/html to utf-8, that seems to "solve" the problem, but then I run into the same LookupError, this time for image/jpg, and so on. It seems as if wfile.write only accepts utf-8, although when I look around I see people passing file.read() straight to wfile.write:
wfile.write(file.read())
and for them it seems to work. Yet, when I do that, what I get is
File "C:\Users\myuser\AppData\Local\Programs\Python\Python37\lib\socketserver.py", line 799, in write
self._sock.sendall(b)
TypeError: a bytes-like object is required, not 'str'
What could cause this to happen?

For server handling in Python, you are better off looking at Flask.
Sample code will look like this:
from flask import Flask, render_template, url_for, request, redirect
import csv
app = Flask(__name__)
@app.route('/')
def my_home():
    return render_template('index.html')
@app.route('/<string:page_name>')
def html_page(page_name):
    return render_template(page_name)
Put all HTML files in a folder called templates, next to your server.py, and all CSS, JavaScript and other assets in a folder called static. Don't forget to change the paths in your CSS, JavaScript and HTML files.

The answer lay in how the image file is opened: it needs the extra argument "rb", like this:
if mimetype != "image/jpg":
    f = open(self.path[1:])
else:
    f = open(self.path[1:], "rb")
and then also:
if mimetype == "image/jpg":
    self.wfile.write(f.read())
else:
    self.wfile.write(f.read().encode("utf-8"))
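A MIME type is an HTTP header value, not a text encoding, which is why encode(mimetype) raised LookupError. A possible simplification of the fix above, sketched under the assumption that every static file can be sent unchanged, is to open all files in binary mode so that no encode() call is needed at all. The helper name load_static is hypothetical; mimetypes.guess_type is the standard-library lookup.

```python
import mimetypes
from pathlib import Path


def load_static(path: str):
    """Return (mimetype, body_bytes) for a file, ready for wfile.write()."""
    # guess_type maps the extension to a MIME type; fall back to a generic one.
    mimetype, _ = mimetypes.guess_type(path)
    if mimetype is None:
        mimetype = "application/octet-stream"
    # Reading as bytes works for text and images alike; no encode() is needed.
    body = Path(path).read_bytes()
    return mimetype, body
```

Inside do_GET, the result could then be used as self.send_header('Content-type', mimetype) followed by self.wfile.write(body).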

Runtime Error - web.config

I have created a website using Flask that takes in a string, creates a URL based on the string, parses the URL and then feeds the result back into the website. I created a function to do so and it works perfectly. However, when I call it from within my Flask program, it throws a runtime error that states:
An application error occurred on the server. The current custom error settings for this application prevent the details of the application error from being viewed remotely (for security reasons). It could, however, be viewed by browsers running on the local server machine.
Details: To enable the details of this specific error message to be viewable on remote machines, please create a customErrors tag within a "web.config" configuration file located in the root directory of the current web application. This customErrors tag should then have its "mode" attribute set to "Off".
I am not familiar with creating a web.config or how to implement this within my flask program. Any help would be appreciated.
Code:
Function that works when run on its own:
import requests
from bs4 import BeautifulSoup
def parse_wotc():
    set_list = []
    # Manually enter in value for test
    card_url = ('http://gatherer.wizards.com/Pages/Card/Details.aspx?name=' +
                'mountain')  # (replace mountain) card_name.replace(' ', '+')
    soup = BeautifulSoup(requests.get(card_url).text, 'html.parser')
    for image in soup.find_all('img'):
        if image.get('title') is not None:
            set_list.append(image.get('title'))
    print(set_list)
    return set_list
Web app code:
@app.route('/', methods=['GET', 'POST'])
def index():
    card_name = None
    card_url = '/static/images/card_back.jpg'
    if request.form.get('random_button'):
        card_url, card_name = random_card_image(list_card_names)
        # When the function is run here it gives the error
        parse_wotc(card_name)
def random_card_image(list_card_names):
    """This function will pull a random card name from the provided list and
    return to main program"""
    card_name = random.choice(list_card_names)
    card_url = ('http://gatherer.wizards.com/Handlers/Image.ashx?name=' +
                card_name.replace(' ', '+').lower() +
                '&type=card')
    return card_url, card_name

It took a couple of hours to determine what the issue was, but it is working now. I had made a text file with a list of card names to draw a random selection from, and each entry in that file carried a trailing \n. As a result, the URL being built contained a \n, which went unnoticed at the time and caused the error. I used rsplit() when creating the name list to remove the trailing \n, and now it works perfectly.
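As a minimal sketch of that fix, the snippet below strips the newline from each entry before any URL is built. It swaps in str.strip(), the more usual tool for this job, in place of the rsplit() mentioned above. The helper names and the sample file contents are hypothetical; the URL shape is copied from the question.

```python
import io


def load_card_names(lines):
    """Return card names with surrounding whitespace (including newlines) removed."""
    # Skip blank lines so an empty name never ends up in a URL.
    return [line.strip() for line in lines if line.strip()]


def card_image_url(card_name):
    # URL shape copied from the question; spaces become '+'.
    return ('http://gatherer.wizards.com/Handlers/Image.ashx?name='
            + card_name.replace(' ', '+').lower()
            + '&type=card')


# io.StringIO stands in for the card-name text file (hypothetical contents).
fake_file = io.StringIO("Mountain\nBlack Lotus\n\n")
names = load_card_names(fake_file)
urls = [card_image_url(name) for name in names]
```

Printing repr(card_url) while debugging makes a stray \n visible, which a plain print does not.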

Flask not writing to file

I've been meaning to log all the users that visit the site to a file.
I'm using Flask for the backend.
I have not been able to get Python to write to the file. I added exception handling to catch any errors that might be raised while writing, but no exceptions are raised.
Here is the part of the blueprint that should write to the file.
from .UserDataCache import UserDataCache
udc = UserDataCache()
@main.route('/')
def index():
    s = Suggestion.query.all()
    udc.writeUsertoFile()
    return render_template('suggestions.html', suggestions = s)
Here is the UserDataCache class:
from flask import request
from datetime import datetime
class UserDataCache():
    def __init__(self):
        pass
    def writeUsertoFile(self):
        try:
            with open("userData.txt", "a") as f:
                f.write(str(datetime.now()) + " " + request.remote_addr + " " + request.url + " " + request.headers.get('User-Agent') + "\n")
        except IOError as e:
            print(e)
        return

I recommend using an absolute path and verifying the permissions on that file. Something like /tmp/UserData.txt or another absolute path should work. The web server's user is what needs the permission to write to the file (www-data if you're using apache2 with Ubuntu, or check your web server's conf file to verify).
As far as why you're not seeing the exception you're catching, I see you're using print. If you're calling the app using a web browser, you'll need to send the error to something else, like a log file or flash it to the browser, or raise an error so it gets logged in the web server error log.
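Both suggestions can be sketched together as follows: resolve the log file to an absolute path, and report failures through the logging module rather than print. This is an illustration under assumptions, not the asker's code; the function name append_visit, its parameters, and the logger name "visits" are all hypothetical, and the request values are passed in as plain strings so that the snippet runs without Flask.

```python
import logging
import os
from datetime import datetime

logger = logging.getLogger("visits")


def append_visit(log_path, remote_addr, url, user_agent):
    """Append one visit record to log_path; return True on success."""
    # Resolve to an absolute path so the result does not depend on the
    # web server's current working directory.
    log_path = os.path.abspath(log_path)
    line = "%s %s %s %s\n" % (datetime.now(), remote_addr, url, user_agent)
    try:
        with open(log_path, "a") as f:
            f.write(line)
        return True
    except IOError:
        # logger.exception records the traceback through the configured
        # logging handlers instead of printing to an unseen console.
        logger.exception("could not write to %s", log_path)
        return False
```

In a Flask view this might be called as append_visit("/tmp/UserData.txt", request.remote_addr, request.url, request.headers.get('User-Agent')), with the path chosen so that the web server's user can write to it.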

Does your Python file name begin with an uppercase letter? If so, try changing it to lowercase.
I just ran into the same problem and copied exactly the same code into two .py files. The only difference is the file name: one is 'Flask_test.py' and the other is 'flask_for_test.py'. Oddly, 'Flask_test.py' works fine except that it cannot write to any file, while 'flask_for_test.py' works perfectly.
I don't know whether the form of the file name affects how Python behaves, but using a lowercase file name works for me.
By the way, none of the other solutions I found worked.

Why does SimpleHTTPServer redirect to ?querystring/ when I request ?querystring?

I like to use Python's SimpleHTTPServer for local development of all kinds of web applications which require loading resources via Ajax calls etc.
When I use query strings in my URLs, the server always redirects to the same URL with a slash appended.
For example, /folder/?id=1 redirects to /folder/?id=1/ with an HTTP 301 response.
I simply start the server using python -m SimpleHTTPServer.
Any idea how I could get rid of the redirecting behaviour? This is Python 2.7.2.

The right way to do this, so that the query parameters stay as they should, is to request the file name directly instead of letting SimpleHTTPServer redirect to your index.html.
For example, http://localhost:8000/?param1=1 triggers a redirect (301) and changes the URL to http://localhost:8000/?param1=1/, which corrupts the query parameter.
However, http://localhost:8000/index.html?param1=1 (making the index file explicit) loads correctly.
So simply not letting SimpleHTTPServer do a URL redirection solves the problem.

Okay. With the help of Morten I've come up with this, which seems to be all I need: Simply ignoring the query strings if they are there and serving the static files.
import SimpleHTTPServer
import SocketServer
PORT = 8000
class CustomHandler(SimpleHTTPServer.SimpleHTTPRequestHandler):
    def __init__(self, req, client_addr, server):
        SimpleHTTPServer.SimpleHTTPRequestHandler.__init__(self, req, client_addr, server)
    def do_GET(self):
        # cut off a query string
        if '?' in self.path:
            self.path = self.path.split('?')[0]
        SimpleHTTPServer.SimpleHTTPRequestHandler.do_GET(self)
class MyTCPServer(SocketServer.ThreadingTCPServer):
    allow_reuse_address = True
if __name__ == '__main__':
    httpd = MyTCPServer(('localhost', PORT), CustomHandler)
    httpd.allow_reuse_address = True
    print "Serving at port", PORT
    httpd.serve_forever()
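The code above targets Python 2.7, where the modules are called SimpleHTTPServer and SocketServer. For readers on Python 3, a rough equivalent might look like the sketch below, which uses the renamed http.server and socketserver modules. The names strip_query, QuerylessHandler, ReusableTCPServer and serve are hypothetical, and the server is wrapped in a function so that nothing starts listening until serve() is called.

```python
import http.server
import socketserver


def strip_query(path):
    """Drop everything from the first '?' onwards."""
    return path.split('?', 1)[0]


class QuerylessHandler(http.server.SimpleHTTPRequestHandler):
    def do_GET(self):
        # Serve the static file as if no query string had been sent.
        self.path = strip_query(self.path)
        super().do_GET()


class ReusableTCPServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True


def serve(port=8000):
    # Blocks until interrupted; call serve() explicitly to start the server.
    with ReusableTCPServer(('localhost', port), QuerylessHandler) as httpd:
        print("Serving at port", port)
        httpd.serve_forever()
```

Whether the redirect described in the question also occurs on a given Python 3 version has not been checked here; the sketch only mirrors the workaround.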

I'm not sure how the redirect is generated... I've tried implementing a very basic SimpleHTTPServer, and I don't get any redirects when using query string params.
Just do something like self.path.split("/") and process the path before handling the request?
This code does what you want I think:
import SocketServer
import SimpleHTTPServer
import os
import posixpath
import urllib
# ROUTES is not defined in the original answer; an empty tuple of
# (url_prefix, root_directory) pairs is assumed here.
ROUTES = ()
class CustomHandler(SimpleHTTPServer.SimpleHTTPRequestHandler):
    def folder(self):
        fid = self.uri[-1].split("?id=")[-1].rstrip()
        return "FOLDER ID: %s" % fid
    def get_static_content(self):
        # start from the requested path (this assignment was missing in the original)
        path = self.path
        # set default root to cwd
        root = os.getcwd()
        # look up routes and set root directory accordingly
        for pattern, rootdir in ROUTES:
            if path.startswith(pattern):
                # found match!
                path = path[len(pattern):]  # consume path up to pattern len
                root = rootdir
                break
        # normalize path and prepend root directory
        path = path.split('?', 1)[0]
        path = path.split('#', 1)[0]
        path = posixpath.normpath(urllib.unquote(path))
        words = path.split('/')
        words = filter(None, words)
        path = root
        for word in words:
            drive, word = os.path.splitdrive(word)
            head, word = os.path.split(word)
            if word in (os.curdir, os.pardir):
                continue
            path = os.path.join(path, word)
        return path
    def do_GET(self):
        path = self.path
        self.uri = path.split("/")[1:]
        actions = {
            "folder": self.folder,
        }
        resource = self.uri[0]
        if not resource:
            return self.get_static_content()
        action = actions.get(resource)
        if action:
            print "action from looking up '%s' is:" % resource, action
            return self.wfile.write(action())
        SimpleHTTPServer.SimpleHTTPRequestHandler.do_GET(self)
class MyTCPServer(SocketServer.ThreadingTCPServer):
    allow_reuse_address = True
httpd = MyTCPServer(('localhost', 8080), CustomHandler)
httpd.allow_reuse_address = True
print "serving at port", 8080
httpd.serve_forever()
Try it out:
HTTP GET /folder/?id=500x -> "FOLDER ID: 500x"
EDIT:
Okay, so if you haven't used the SimpleHTTPServer stuff before: you basically subclass the base request handler and implement do_GET(), do_PUT(), do_POST() etc.
What I usually do then is parse the request string (using re), pattern match to see whether I can find a request handler and, if not, handle the request as a request for static content if possible.
You say you want to serve static content if at all possible, so you should flip this pattern matching around: FIRST see if the request matches the file store and, if not, then match against handlers :)
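The flipped order described above can be sketched as a small lookup function: check the file store first, then fall back to regex-matched handlers. This is only an illustration in Python 3; the function name resolve, the ROUTES table, and the route pattern are hypothetical, and a real handler would go on to send the file or the handler result over self.wfile.

```python
import os
import re


def resolve(path, root, routes):
    """Return ('static', file_path), ('handler', result) or ('missing', None)."""
    root = os.path.abspath(root)
    # Ignore any query string when looking for a file on disk.
    clean = path.split('?', 1)[0].lstrip('/')
    candidate = os.path.normpath(os.path.join(root, clean))
    # Static content wins if the path names a real file under root.
    if candidate.startswith(root + os.sep) and os.path.isfile(candidate):
        return 'static', candidate
    # Otherwise try each (regex, handler) pair in order.
    for pattern, handler in routes:
        match = re.match(pattern, path)
        if match:
            return 'handler', handler(*match.groups())
    return 'missing', None


def folder(fid):
    return "FOLDER ID: %s" % fid


# Hypothetical route table: one regex per handler.
ROUTES = [(r'^/folder/\?id=(\w+)$', folder)]
```

The startswith check is there so that a request containing ".." cannot resolve to a file outside root.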

File upload with Django via PUT

I am trying to implement a function in Django to upload an image from a client (an iPhone app) to an Amazon S3 server. The iPhone app sends an HTTP request (method PUT) with the content of the image in the HTTP body. For instance, the client PUTs the image to the following URL: http://127.0.0.1:8000/uploadimage/sampleImage.png/
My function in Django looks like this to handle such a PUT request and save the file to S3:
import mimetypes
from boto.s3.connection import S3Connection
from boto.s3.key import Key
from django.conf import settings
from django.http import HttpResponse
def store_in_s3(filename, content):
    conn = S3Connection(settings.ACCESS_KEY, settings.PASS_KEY)  # gets access key and pass key from settings.py
    bucket = conn.create_bucket("somepicturebucket")
    k = Key(bucket)
    k.key = filename
    mime = mimetypes.guess_type(filename)[0]
    k.set_metadata("Content-Type", mime)
    k.set_contents_from_string(content)
    k.set_acl("public-read")
def upload_raw_data(request, name):
    if request.method == 'PUT':
        store_in_s3(name, request.raw_post_data)
        return HttpResponse('Upload of raw data to S3 successful')
    else:
        return HttpResponse('Upload not successful')
My problem is how to tell my function the name of the image. In my urls.py I have the following but it won't work:
url(r'^uploadrawdata/(\d+)/', upload_raw_data ),
Now as far as I'm aware, \d+ stands for digits, so it's obviously of no use here when I pass the name of a file. However, I was wondering whether this is the correct approach in the first place. I read this post here and it suggests the following line of code, which I don't understand at all:
file_name = path.split("/")[-1:][0]
Also, I have no clue what the rest of the code is all about. I'm a bit new to all of this, so any suggestions of how to simply upload an image would be very welcome. Thanks!

This question is not really about uploading, and the linked answer is irrelevant. If you want to accept a string rather than digits in the URL, in order to pass a filename, you can just use \w instead of \d in the regex.
Edit to clarify: sorry, I didn't realise you were trying to pass a whole file name plus extension. You probably want this:
r'^uploadrawdata/(.+)/$'
so that it matches any character. You should probably read an introduction to regular expressions, though.
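To see why the three patterns behave differently, they can be tried directly with the standard re module, outside Django. The sample path below is an assumption modelled on the question's URL, without the leading slash, which is how Django's regex-based url() patterns see it.

```python
import re

path = 'uploadrawdata/sampleImage.png/'

# \d+ only accepts digits, so a file name never matches.
digits_only = re.match(r'^uploadrawdata/(\d+)/', path)
# \w+ accepts letters, digits and underscores, but stops at the dot.
word_chars = re.match(r'^uploadrawdata/(\w+)/', path)
# .+ accepts any character, so the whole file name is captured.
anything = re.match(r'^uploadrawdata/(.+)/$', path)

filename = anything.group(1) if anything else None
```

The captured group is what Django passes to the view as its name argument.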
