Why video served through CGI is not played by the browser - python

I'm trying to create a Python CGI script that loads a video from a hidden location on my server and "prints" it to stdout so that an HTTP client can fetch its content.
Here is my python CGI script:
import os
import sys
filePath = "test_video.mp4"
print "Last-Modified: Fri, 24 Apr 2015 22:09:52 GMT"
print "Accept-Ranges: bytes"
print "Content-Length:", os.path.getsize(filePath)
print "Content-type: video/mp4\n"
sys.stdout.flush()
content = open(filePath, "rb").read()
sys.stdout.write(content)
sys.stdout.flush()
When I open the link that points directly to the video in Google Chrome, for example (http://192.168.0.4/~fccoelho/mpeg/test_video.mp4), the video plays perfectly. However, when I open (http://192.168.0.4/~fccoelho/mpeg/test.py), the CGI script serving the same file, the video is not played.
What is odd is that when I download both with wget, the files are exactly the same.
I have also checked the headers with curl:
curl -I http://192.168.0.4/~fccoelho/mpeg/test.py
HTTP/1.1 200 OK
Date: Fri, 24 Apr 2015 23:09:34 GMT
Server: Apache/2.4.7 (Ubuntu)
Accept-Ranges: bytes
Last-Modified: Fri, 24 Apr 2015 22:09:52 GMT
Content-Length: 32445874
Content-Type: video/mp4
curl -I http://192.168.0.4/~fccoelho/mpeg/test_video.mp4
HTTP/1.1 200 OK
Date: Fri, 24 Apr 2015 23:09:55 GMT
Server: Apache/2.4.7 (Ubuntu)
Last-Modified: Fri, 24 Apr 2015 22:09:52 GMT
ETag: "1ef15b2-5147fa7cfbd80"
Accept-Ranges: bytes
Content-Length: 32445874
Content-Type: video/mp4
And they seem to be compatible.
Could someone give a clue on why this approach is not working?
I have tried everything here in Firefox and it works.
But what scares me is how Google Chrome can behave differently even when receiving the same response.
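The most likely culprit: the script advertises Accept-Ranges: bytes but ignores the Range header that Chrome sends for video playback, always replying 200 with the whole file where Chrome expects a 206. Below is a minimal sketch of the range logic such a CGI script would need (Python 3; serve_file and the demo file are illustrative, not part of the original script — in a real CGI you would print the returned header lines, a blank line, then write the body to sys.stdout.buffer, and the client's Range header arrives in os.environ["HTTP_RANGE"]):

```python
# Sketch of range-request handling for a CGI video script.
# serve_file() returns the CGI header lines and the body bytes.
import os
import re

def serve_file(path, range_header=""):
    size = os.path.getsize(path)
    start, end = 0, size - 1
    m = re.match(r"bytes=(\d+)-(\d*)", range_header)
    headers = ["Accept-Ranges: bytes", "Content-Type: video/mp4"]
    if m:
        start = int(m.group(1))
        if m.group(2):
            end = min(int(m.group(2)), size - 1)
        # CGI signals the HTTP status via a "Status:" header line
        headers.insert(0, "Status: 206 Partial Content")
        headers.append("Content-Range: bytes %d-%d/%d" % (start, end, size))
    headers.append("Content-Length: %d" % (end - start + 1))
    with open(path, "rb") as f:
        f.seek(start)
        body = f.read(end - start + 1)
    return headers, body

# Demo against a small temp file standing in for the real video.
with open("demo.bin", "wb") as f:
    f.write(b"0123456789")
hdrs, body = serve_file("demo.bin", "bytes=2-5")
print(hdrs)
print(body)   # b'2345'
os.remove("demo.bin")
```

If the Range header is absent, the function falls back to serving the whole file with a normal 200-style response, which preserves the original behavior for clients that never ask for ranges.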

Related

How to print text from a website in Python?

This is the homework assignment and what it is asking:
Prompt the user for a URL so the program can read any web page.
You can use split('/') to break the URL into its component parts and extract the host name for the socket connect call.
Add error checking using try and except to handle the case where the user enters an improperly formatted or non-existent URL.
The code only needs to accept 300 bytes from the page - in the code in 12.2 it would be len(data) to determine how much data was read. In the while loop, mysock.recv only reads 128 bytes at a time.
When I put a website in, it does print what that website contains,
I need something like this for my output:
HTTP/1.1 200 OK
Date: Thu, 15 Apr 2021 20:46:29 GMT
Server: Apache/2.4.18 (Ubuntu)
Last-Modified: Sat, 13 May 2017 11:22:22 GMT
ETag: "a7-54f6609245537"
Accept-Ranges: bytes
Content-Length: 167
Cache-Control: max-age=0, no-cache, no-store, must-revalidate
Pragma: no-cache
Expires: Wed, 11 Jan 1984 05:00:00 GMT
Connection: close
Content-Type: text/plain
But soft what light through yonder window breaks
It is the east and Juliet is the sun
Arise fair sun and kill the envious moon
Who is already sick and pale with grief
Please help me
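A sketch of one way the pieces could fit together, following the hints above (fetch_first_bytes is an illustrative name; the throwaway local server at the end merely stands in for a real site — the assignment itself would call input() and pass a real URL):

```python
import socket
import threading
from http.server import HTTPServer, SimpleHTTPRequestHandler

def fetch_first_bytes(url, limit=300, chunk=128):
    """Open a raw socket to the URL's host, send a GET, and return at most
    `limit` bytes of the response (status line and headers included).
    Returns None for a badly formatted or unreachable URL."""
    try:
        parts = url.split("/")            # per the hint: host sits at index 2
        host = parts[2]
        path = "/" + "/".join(parts[3:])
        hostname, port = host, 80
        if ":" in host:
            hostname, port = host.split(":")
            port = int(port)
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.connect((hostname, port))
        sock.send(("GET %s HTTP/1.0\r\nHost: %s\r\n\r\n" % (path, host)).encode())
        received = b""
        while len(received) < limit:
            data = sock.recv(chunk)       # 128 bytes at a time, as in 12.2
            if len(data) < 1:
                break
            received += data
        sock.close()
        return received[:limit]
    except (IndexError, ValueError, OSError):
        return None

# The assignment would prompt interactively: url = input("Enter URL: ")
# Quick check against a throwaway local server standing in for a real site:
srv = HTTPServer(("127.0.0.1", 0), SimpleHTTPRequestHandler)
threading.Thread(target=srv.serve_forever, daemon=True).start()
first = fetch_first_bytes("http://127.0.0.1:%d/" % srv.server_port)
srv.shutdown()
print(first)
```

The try/except satisfies the error-checking requirement: a malformed URL raises IndexError at the split, and a non-existent host raises an OSError from the socket call, both of which are caught.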

How to get missing content length for a file from url?

I am trying to write a simple download manager in Python with concurrency. The aim is to use the Content-Length header from the response for a file URL, split the file into chunks, and then download those chunks concurrently. The idea works for all URLs that have a Content-Length header, but recently I came across some URLs that don't serve one.
https://filesamples.com/samples/audio/mp3/sample3.mp3
HTTP/1.1 200 OK
Date: Sat, 08 Aug 2020 11:53:15 GMT
Content-Type: audio/mpeg
Transfer-Encoding: chunked
Connection: close
Set-Cookie: __cfduid=d2a4be3535695af67cb7a7efe5add19bf1596887595; expires=Mon, 07-Sep-20 11:53:15 GMT; path=/; domain=.filesamples.com; HttpOnly; SameSite=Lax
Cache-Control: public, max-age=86400
Display: staticcontent_sol, staticcontent_sol
Etag: W/"5def04f1-19d6dd-gzip"
Last-Modified: Fri, 31 Jul 2020 21:52:34 GMT
Response: 200
Vary: Accept-Encoding
Vary: User-Agent,Origin,Accept-Encoding
X-Ezoic-Cdn: Miss
X-Middleton-Display: staticcontent_sol, staticcontent_sol
X-Middleton-Response: 200
CF-Cache-Status: HIT
Age: 24
cf-request-id: 046f8413ab0000e047449da200000001
Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
Server: cloudflare
CF-RAY: 5bf90932ae53e047-SEA
How can I get the content-length of the file without downloading the whole file?
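One common workaround when Content-Length is absent (here because the response is chunked): request a single byte with a Range header and read the total size from the Content-Range header, which has the form "bytes 0-0/<total>". A sketch (function names are illustrative, and this only works when the server honors range requests — the network call itself is not exercised below):

```python
import urllib.request

def total_from_content_range(value):
    """Parse the total size out of a Content-Range value such as
    'bytes 0-0/1699549'. Returns None if the size is absent or '*'."""
    if not value or "/" not in value:
        return None
    total = value.rsplit("/", 1)[1]
    return int(total) if total.isdigit() else None

def remote_size(url):
    """Return the file size without downloading the body, via a
    one-byte range GET. Assumes the server supports byte ranges."""
    req = urllib.request.Request(url, headers={"Range": "bytes=0-0"})
    with urllib.request.urlopen(req) as resp:
        return total_from_content_range(resp.headers.get("Content-Range"))

print(total_from_content_range("bytes 0-0/1699549"))   # 1699549
print(total_from_content_range("bytes 0-0/*"))         # None
```

If the server answers the range request with a plain 200 and no Content-Range, there is no way to learn the size up front, and the download manager has to fall back to a single sequential download.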

Huge gap between two methods of measuring request time via python-requests

I am trying to measure the response time of a certain request using python-requests.
import requests
import time
start = time.time()
r = requests.get("https://www.dl.soc.i.kyoto-u.ac.jp/index.php/members/")
end = time.time()
print(end - start)
print(r.elapsed.seconds)
It gave me a result of
64.67747116088867
0.631163
Could anyone please explain the reason of this huge gap? Thanks.
By the way, when I tried the same request in Google Chrome, the first result is actually the one I want.
I ran a test with an artificially delaying webserver:
nc -l 8080
Then in another terminal in a Python session:
import time, requests
a=time.time()
r = requests.get("http://localhost:8080/")
b=time.time()
print r.elapsed, b-a
Pasting this issued this HTTP request on the server terminal:
GET / HTTP/1.1
Host: localhost:8080
Connection: keep-alive
Accept-Encoding: gzip, deflate
Accept: */*
User-Agent: python-requests/2.9.1
I waited for 5 seconds, then I pasted this reply in the server:
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 12
Connection: keep-alive
Keep-Alive: timeout=15
Date: Thu, 04 Jan 2018 10:12:09 GMT
Server: Apache
Last-Modified: Wed, 09 Dec 2015 13:57:24 GMT
ETag: "28bd-52677784b6090"
Accept-Ranges: bytes
hello
I stated 12 bytes but only sent 6 (hello\n), so this was unfinished. I waited another five seconds, then pasted this text:
world
This finished the reply with the remaining six bytes (world\n). In the client terminal I saw the result appear:
0:00:05.185509 10.8904578686
So evidently r.elapsed is the Time-To-First-Byte (TTFB), while the call to requests.get() only returns after the whole message has been received (Time-To-Last-Byte, TTLB).

How to successfully download range of bytes instead of complete file using python?

I am trying to download only range of bytes of a file and I am trying the following process:
r = requests.get('https://stackoverflow.com', headers={'Range':'bytes=0-999'})
But it gives status code 200 as opposed to 206, and I get the entire file.
I tried following Python: How to download file using range of bytes? but it also gave me status code 200. What is the reason, and how do I download files partially using Python?
Headers for stackoverflow:
HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Last-Modified: Fri, 04 Aug 2017 05:28:29 GMT
X-Frame-Options: SAMEORIGIN
X-Request-Guid: 86fd186e-b5ac-472f-9d79-c45149343776
Strict-Transport-Security: max-age=15552000
Content-Length: 107699
Accept-Ranges: bytes
Date: Wed, 06 Sep 2017 11:48:16 GMT
Via: 1.1 varnish
Age: 0
Connection: keep-alive
X-Served-By: cache-sin18023-SIN
X-Cache: MISS
X-Cache-Hits: 0
X-Timer: S1504698496.659820,VS0,VE404
Vary: Fastly-SSL
X-DNS-Prefetch-Control: off
Set-Cookie: prov=6df82a45-2405-b1dd-1430-99e27827b360; domain=.stackoverflow.com; expires=Fri, 01-Jan-2055 00:00:00 GMT; path=/; HttpOnly
Cache-Control: private
This requires server-side support. Probably the server does not support it.
To find out, make a HEAD request and see if the response has Accept-Ranges.
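For illustration, here is a sketch of what a range-honoring exchange looks like, against a tiny local server that does support single byte ranges (RangeHandler and the demo data are invented for this example; stackoverflow.com's HTML endpoint apparently just ignores the Range header):

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

DATA = b"0123456789" * 100               # 1000 bytes of demo content

class RangeHandler(BaseHTTPRequestHandler):
    """Minimal handler that honors a single 'bytes=a-b' range."""
    def do_GET(self):
        rng = self.headers.get("Range")
        if rng and rng.startswith("bytes="):
            a, b = rng[6:].split("-")
            a, b = int(a), int(b)
            chunk = DATA[a:b + 1]
            self.send_response(206)       # Partial Content
            self.send_header("Content-Range", "bytes %d-%d/%d" % (a, b, len(DATA)))
        else:
            chunk = DATA
            self.send_response(200)       # no Range: send everything
        self.send_header("Content-Length", str(len(chunk)))
        self.end_headers()
        self.wfile.write(chunk)
    def log_message(self, *args):         # keep the demo quiet
        pass

srv = HTTPServer(("127.0.0.1", 0), RangeHandler)
threading.Thread(target=srv.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % srv.server_port

def fetch_range(url, first, last):
    """206 means the range was honored; 200 means the whole body came back."""
    req = urllib.request.Request(url, headers={"Range": "bytes=%d-%d" % (first, last)})
    with urllib.request.urlopen(req) as resp:
        return resp.status, resp.read()

status, data = fetch_range(url, 0, 99)
print(status, len(data))                  # 206 100 against this server
srv.shutdown()
```

Against a server that ignores Range (as in the question), the same fetch_range call would return (200, <full body>), so code relying on partial downloads should always check the status before assuming it got a slice.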

Python urllib open issue

I'm trying to fetch data from http://book.libertorrent.com/, but at the moment I'm failing because additional data (headers and chunked-encoding markers) is present in the response. My code is very simple:
response = urllib.urlopen('http://book.libertorrent.com/login.php')
f = open('someFile.html', 'w')
f.write(response.read())
read() returns:
Date: Fri, 09 Nov 2012 07:36:54 GMT
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
Connection: close
Cache-Control: no-cache, pre-check=0, post-check=0
Expires: 0
Pragma: no-cache
Set-Cookie: bb_test=973132321; path=/; domain=book.libertorrent.com
Content-Language: ru
1ec0
...Html...
0
And response.info() is empty.
Is there any way to correct the response?
Let's try this:
$ echo -ne "GET /index.php HTTP/1.1\r\nHost: book.libertorrent.com\r\n\r\n" | nc book.libertorrent.com 80 | head -n 10
HTTP/1.1 200 OK
WWW
Date: Sat, 10 Nov 2012 17:41:57 GMT
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
Content-Language: ru
1f57
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"><html dir="ltr">
See that "WWW" in the second line? That's no valid HTTP header, I'm guessing that's what's throwing off the response parser here.
By the way, python2 and python3 behave differently here:
python2 seems to immediately interpret anything after this invalid header as content;
python3 ignores all headers and continues reading the content after the double newline. Because the headers are ignored, so is the transfer encoding, and therefore the chunk sizes are interpreted as part of the body.
So in the end the problem is that the server is sending an invalid response, which should be fixed at the server's end.
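If the server cannot be fixed, one workaround is to bypass the header parser entirely: read the raw response off a socket, split it at the first blank line, and de-chunk the body by hand. A sketch (the demo server replays a malformed reply like the one above, with a bare "WWW" header line and a chunked body; dechunk assumes well-formed chunk framing):

```python
import socket
import threading

def dechunk(body):
    """Decode a chunked transfer-encoded body (the '1ec0 ... 0' framing)."""
    out = b""
    while body:
        line, _, body = body.partition(b"\r\n")
        size = int(line.split(b";")[0], 16)
        if size == 0:
            break
        out += body[:size]
        body = body[size + 2:]           # skip chunk data plus trailing CRLF
    return out

def raw_get(host, port, path="/"):
    """GET without any header parsing: read everything, split at the first
    blank line, and return (header_bytes, body_bytes)."""
    s = socket.create_connection((host, port))
    s.sendall(("GET %s HTTP/1.0\r\nHost: %s\r\n\r\n" % (path, host)).encode())
    raw = b""
    while True:
        data = s.recv(4096)
        if not data:
            break
        raw += data
    s.close()
    head, _, body = raw.partition(b"\r\n\r\n")
    return head, body

# Demo server that replays the broken reply from the question.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)

def serve_once():
    conn, _ = srv.accept()
    conn.recv(4096)
    conn.sendall(b"HTTP/1.1 200 OK\r\n"
                 b"WWW\r\n"              # the invalid header line
                 b"Transfer-Encoding: chunked\r\n"
                 b"Connection: close\r\n\r\n"
                 b"6\r\n<html>\r\n7\r\n</html>\r\n0\r\n\r\n")
    conn.close()

threading.Thread(target=serve_once, daemon=True).start()
head, body = raw_get("127.0.0.1", srv.getsockname()[1])
print(dechunk(body))                     # b'<html></html>'
```

This trades all of urllib's conveniences (redirects, cookies, encodings) for robustness against the one malformed header, so it is only worth doing for a server you cannot get repaired.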
