I would like to implement a server in Python that streams music in MP3 format over HTTP. I would like it to broadcast the music such that a client can connect to the stream and start listening to whatever is currently playing, much like a radio station.
Previously, I've implemented my own HTTP server in Python using SocketServer.TCPServer (yes I know BaseHTTPServer exists, just wanted to write a mini HTTP stack myself), so how would a music streamer be different architecturally? What libraries would I need to look at on the network side and on the MP3 side?
The mp3 format was designed for streaming, which makes some things simpler than you might have expected. The data is essentially a stream of audio frames with built-in boundary markers, rather than a file header followed by raw data. This means that once a client is expecting to receive audio data, you can just start sending it bytes from any point in an existing mp3 source, whether it be live or a file, and the client will sync up to the next frame it finds and start playing audio. Yay!
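To make the "sync up to the next frame" idea concrete, here is a minimal sketch of how a client (or a server that wants to start on a frame boundary) might look for the next frame header. The function name find_frame_sync is made up for illustration. It only checks the 11-bit sync pattern, so it can report false positives; a real decoder would also validate the rest of the header.

```python
def find_frame_sync(data: bytes, start: int = 0) -> int:
    """Return the index of the next candidate MP3 frame header, or -1.

    An MPEG audio frame header begins with 11 set bits: a 0xFF byte
    followed by a byte whose top three bits are all 1.
    """
    i = data.find(b"\xff", start)
    while i != -1 and i + 1 < len(data):
        if data[i + 1] & 0xE0 == 0xE0:
            return i
        # Not a sync word, keep scanning from the next byte.
        i = data.find(b"\xff", i + 1)
    return -1
```

A server could use something like this to trim the first chunk it sends to a new listener, although as noted above most clients will resync on their own.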
Of course, you'll have to give clients a way to set up the connection. The de-facto standard is the SHOUTcast (ICY) protocol. This is very much like HTTP, but with status and header fields just different enough that it isn't directly compatible with Python's built-in http server libraries. You might be able to get those libraries to do some of the work for you, but their documented interfaces won't be enough to get it done; you'll have to read their code to understand how to make them speak SHOUTcast.
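As a rough illustration of how close the handshake is to HTTP, here is a sketch of building the response a server might send before the audio bytes. The status line and the icy-name / icy-br header names are taken from common descriptions of the protocol, so treat them as assumptions and check them against the links below; build_icy_response is a hypothetical helper, not part of any library.

```python
def build_icy_response(name: str, bitrate_kbps: int) -> bytes:
    """Build a SHOUTcast-style response head (sketch, header names assumed)."""
    lines = [
        "ICY 200 OK",                  # status line differs from "HTTP/1.0 200 OK"
        "icy-name:" + name,            # station name shown by the player
        "icy-br:" + str(bitrate_kbps), # advertised bitrate in kbit/s
        "content-type:audio/mpeg",
        "",                            # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("latin-1", "replace")
```

After sending these bytes, the server would simply keep writing MP3 data to the same socket.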
Here are a few links to get you started:
https://web.archive.org/web/20220912105447/http://forums.winamp.com/showthread.php?threadid=70403
https://web.archive.org/web/20170714033851/https://www.radiotoolbox.com/community/forums/viewtopic.php?t=74
https://web.archive.org/web/20190214132820/http://www.smackfu.com/stuff/programming/shoutcast.html
http://en.wikipedia.org/wiki/Shoutcast
I suggest starting with a single mp3 file as your data source, getting the client-server connection setup and playback working, and then moving on to issues like live sources, multiple encoding bit rates, inband meta-data, and playlists.
Playlists are generally either .pls or .m3u files, and essentially just static text files pointing at the URL for your live stream. They're not difficult and not even strictly necessary, since many (most?) mp3 streaming clients will accept a live stream URL with no playlist at all.
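For reference, both playlist formats can be generated with a few lines. This is only a sketch: make_m3u and make_pls are hypothetical helper names, and the stream URL is whatever your server exposes. A Length of -1 in a .pls entry is the usual way to mark a stream of unknown duration.

```python
def make_m3u(stream_url: str) -> str:
    """A minimal .m3u playlist: one URL per line."""
    return stream_url + "\n"


def make_pls(stream_url: str, title: str) -> str:
    """A minimal single-entry .pls playlist pointing at a live stream."""
    lines = [
        "[playlist]",
        "NumberOfEntries=1",
        "File1=" + stream_url,
        "Title1=" + title,
        "Length1=-1",  # -1: duration unknown (live stream)
        "Version=2",
    ]
    return "\n".join(lines) + "\n"
```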
As for architecture, the field is pretty much wide open. You have as many options as there are for HTTP servers. Threaded? Worker processes? Event driven? It's up to you. To me, the more interesting question is how to share the data from a single input stream (the broadcaster) with the network handlers serving multiple output streams (the players). In order to avoid IPC and synchronization complications, I would probably start with a single-threaded event-driven design. In Python 2, a library like gevent will give you very good I/O performance while allowing you to structure your code in a very understandable way. In Python 3, I would prefer asyncio coroutines.
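One possible shape for the "single input, many outputs" problem in asyncio is a queue per listener. The sketch below is an assumption about how you might structure it, not a finished server: Broadcaster and demo are made-up names, and dropping the oldest chunk for a slow listener is just one policy among several.

```python
import asyncio


class Broadcaster:
    """Fan one input stream out to many listeners, one queue per listener."""

    def __init__(self, max_chunks: int = 64):
        self._max_chunks = max_chunks
        self._listeners = set()

    def subscribe(self) -> asyncio.Queue:
        """Register a new listener; it only sees chunks published from now on."""
        queue = asyncio.Queue(maxsize=self._max_chunks)
        self._listeners.add(queue)
        return queue

    def unsubscribe(self, queue: asyncio.Queue) -> None:
        self._listeners.discard(queue)

    def publish(self, chunk: bytes) -> None:
        """Hand the chunk to every listener without ever blocking the source."""
        for queue in self._listeners:
            if queue.full():
                queue.get_nowait()  # slow listener: drop its oldest chunk
            queue.put_nowait(chunk)


async def demo() -> list:
    station = Broadcaster()
    early = station.subscribe()
    station.publish(b"chunk-1")
    late = station.subscribe()  # joins mid-stream, like tuning in to a radio
    station.publish(b"chunk-2")
    return [early.get_nowait(), early.get_nowait(), late.get_nowait()]
```

Each connection handler would then loop on "await queue.get()" and write the chunk to its client, calling unsubscribe when the client disconnects.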
Since you already have good python experience (given you've already written an HTTP server) I can only provide a few pointers on how to extend the ground-work you've already done:
Prepare your server to handle request headers such as Accept-Encoding, Range, and TE (the transfer encodings the client accepts). An MP3-over-HTTP player (e.g. VLC) is nothing but an MP3 player that knows how to "speak" HTTP and "seek" to different positions in the file.
Use Wireshark or tcpdump to sniff the actual HTTP requests made by VLC when playing an MP3 over HTTP, so you know which request headers you'll be receiving and can implement them.
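Of the headers mentioned above, Range is the one that makes seeking work. Here is a minimal sketch of parsing a single-range value; parse_range is a hypothetical helper, and it deliberately ignores multi-range requests, which a real server would answer differently.

```python
def parse_range(header: str, size: int):
    """Parse a single 'bytes=...' Range value into an inclusive (start, end).

    Returns None if the header is missing, malformed, multi-range, or
    cannot be satisfied for a resource of the given size.
    """
    if not header or not header.startswith("bytes=") or size <= 0:
        return None
    spec = header[len("bytes="):].strip()
    if "," in spec or "-" not in spec:
        return None
    first, _, last = spec.partition("-")
    try:
        if first == "":
            # Suffix form "bytes=-N": the last N bytes of the file.
            length = int(last)
            if length <= 0:
                return None
            return max(size - length, 0), size - 1
        start = int(first)
        end = int(last) if last else size - 1
    except ValueError:
        return None
    if start < 0 or start >= size or end < start:
        return None
    return start, min(end, size - 1)
```

When this returns a tuple, the server would reply with status 206 and a Content-Range header; when it returns None, serving the whole file with a plain 200 is a reasonable fallback.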
Good luck with your project!
You'll want to look into serving m3u or pls files. That should give you a file format that players understand well enough to hit your http server looking for mp3 files.
A minimal m3u file would just be a simple text file with one song url per line. Assuming you've got the following URLs available on your server:
/playlists/<playlist_name/playlist_id>
/songs/<song_name/song_id>
You'd serve a playlist from the url:
/playlists/myfirstplaylist
And the contents of the resource would be just:
/songs/1
/songs/mysong.mp3
A player (like Winamp) will be able to open the URL to the m3u file on your HTTP server and will then start streaming the first song on the playlist. All you'll have to do to support this is serve the mp3 file just like you'd serve any other static content.
Depending on how many clients you want to support you may want to look into asynchronous IO using a library like Twisted to support tons of simultaneous streams.
Study these before getting too far:
http://wiki.python.org/moin/PythonInMusic
Specifically
http://edna.sourceforge.net/
You'll want to have a .m3u or .pls file that points at a static URI (e.g. http://example.com/now_playing.mp3), then give clients MP3 data starting wherever you are in the song when they ask for that file. There are probably a bunch of minor issues I'm glossing over here. However, as forest points out, you can at least start streaming the MP3 data from any byte.
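To work out "wherever you are in the song", one simple approach is to derive a byte offset from the elapsed time. The sketch below assumes a constant-bitrate file and ignores any ID3 tag at the start; live_offset and its parameters are illustrative names only.

```python
def live_offset(started_at: float, now: float, bitrate_kbps: int, file_size: int) -> int:
    """Approximate byte position in a constant-bitrate MP3 after some playtime.

    started_at and now are timestamps in seconds (e.g. from time.time()).
    """
    elapsed = max(now - started_at, 0.0)
    bytes_per_second = bitrate_kbps * 1000 // 8  # 128 kbit/s is 16000 bytes/s
    return min(int(elapsed * bytes_per_second), file_size)
```

The result will rarely land on a frame boundary, which is fine given that players resynchronize on the next frame.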
Related
I have Python code running on a server. If I upload a video from a mobile device to the server, how can I provide the path of that video to the Python code, given that I want every uploaded video to be processed by it?
I doubt that your explanation truly reflects what you need. First of all, a server accepts everything "as it is", as long as the input has the appropriate format for that specific server. In your case, the video might arrive as a stream, as binary data, or even as encoded data written to a socket on your server; the framework should not matter. So when you have a stream, you should be able to get it into your server to be processed. If that is where your problem lies, first look at how servers accept input. I assume you're knowledgeable about that. Let's say you have an nginx server on a Linux machine that also has Python installed. Your web server should then be configured to run Python (Django or something similar). Once the upload starts, the content can be handed to a synchronous or asynchronous process in Python (I don't think I need to explain how the RESTful model works over HTTP). Once you have the data (streamed or static/bulk), you should be able to do whatever you want with it.
I'm creating a simple app that can play audio files (currently only mp3 files) located on a webserver.
Currently, I'm using Python's SimpleHTTPServer server side, and the AVAudioPlayer for iOS.
It sort of works, since the file is streamed over HTTP instead of just being downloaded from the webserver. But I often experience that the playback of a file is suddenly restarted.
I'm considering using another method of streaming, eg. RTMP, but on the other hand I want to keep things simple. I'm wondering if another HTTP server might do the trick? Any other experiences/suggestions?
What happens when the playback is restarted? Print the HTTP URLs on the server. Does the player start from index=0, go to index=4000, then back to index=0 again?
I have an interesting project going on at our workplace. The task, that stands before us, is such:
Build a custom server using Python
It has a web server part, serving REST
It has a FTP server part, serving files
It has a SMTP part, which receives mail only
and last but not least, it has a background worker that manages low-level file I/O based on requests received from the above-mentioned services
Obviously the go-to place was the Twisted library/framework, which is an excellent networking tool. However, studying the docs further, a few things came up that I'm not sure about.
Having a Java background, I would solve the task (at least at the beginning) by spawning a separate thread for each service and going from there. Being in Python, however, I cannot do that for any reasonable purpose, as Python has the GIL. I'm not sure how Twisted handles this. I would expect that Twisted has a large part (if not the majority) of its code written in C, where the GIL is not an issue, but I couldn't find docs that explain this to my satisfaction.
So the most outstanding question is: given that Twisted uses the Reactor as its main design pattern, will it be able to:
Serve all those services needed
Do it in a non-blocking fashion (it should, according to docs, but if someone could elaborate, I'd be grateful)
Be able to serve a few hundred clients at once
Serve large file downloads in a reasonable way, meaning that it can serve multiple clients, using multiple services, downloading and uploading large files.
Large files being in the order of hundreds of MB, or a few GB. The size is not important, it's the time that the client has to stay connected to the server that matters.
Edit: I'm actually inclined to go the way of python multiprocessing, but not sure, whether that's a correct thing to do with Twisted etc.
Serve all those services needed
Yes.
Do it in a non-blocking fashion (it should, according to docs, but if someone could elaborate, I'd be grateful)
Twisted uses the common reactor model. I/O goes through your choice of poll, select, or whatever else is available to determine if data is ready. It handles only what is ready, and passes the data along to other stages of your app. This is how it is non-blocking.
I don't think it provides non-blocking disk I/O, but I'm not sure. That feature is not what most people need when they say non-blocking.
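To illustrate the reactor idea without Twisted itself, here is a small sketch using only the standard library's selectors module. It is not Twisted's implementation, just the same pattern in miniature: register a callback per socket, ask the OS which sockets are ready, and call back only for those. run_once and demo are made-up names.

```python
import selectors
import socket


def run_once(selector: selectors.BaseSelector, timeout: float = 0.1) -> int:
    """One reactor iteration: dispatch callbacks for sockets that are ready."""
    events = selector.select(timeout)
    for key, _mask in events:
        callback = key.data  # the callback registered alongside the socket
        callback(key.fileobj)
    return len(events)


def demo() -> bytes:
    received = []
    left, right = socket.socketpair()
    selector = selectors.DefaultSelector()
    try:
        right.setblocking(False)
        selector.register(
            right,
            selectors.EVENT_READ,
            lambda sock: received.append(sock.recv(1024)),
        )
        left.sendall(b"hi!")
        run_once(selector, timeout=1.0)  # recv is only called once data is ready
    finally:
        selector.close()
        left.close()
        right.close()
    return b"".join(received)
```

A real reactor runs this loop forever and also handles writes, timers, and errors, but the non-blocking property comes from exactly this "only touch what is ready" rule.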
Be able to serve a few hundred clients at once
Yes. No. Maybe. What are those clients doing? Is each hitting refresh every second on a browser making 100 requests? Is each one doing a numerical simulation of galaxy collisions? Is each sending the string "hi!" to the server, without expecting a response?
Twisted can easily handle 1000+ requests per second.
Serve large file downloads in a reasonable way, meaning that it can serve multiple clients, using multiple services, downloading and uploading large files.
Sure. For example, the original version of BitTorrent was written in Twisted.
I'm relatively new to twisted and I'm planning on using it to create a file downloader. It would accept a file url and a number of parts to download the file.
What I have in mind is to split the file into as many parts as the user specified, download each part through a Deferred, and assemble all the parts once they are done.
But do I need a protocol for each file to be downloaded, and have each protocol dispatch a Deferred to download each of the file's chunks?
Is there a Twisted component that can read a remote file and supports seeking? I really don't have any idea where to start.
If your mention of a URL implies that the protocol in use is HTTP (and I hope HTTP 1.1;-), then you could use twisted's relatively new HTTP 1.1 client (discussed at length here, and from the fact that the issue was marked as fixed 9 months ago I assume the client is finally in -- I have not checked that), using HTTP 1.1's range requests to get "slices" of the file.
If you're stuck with HTTP 1.0, or a not fully compliant server, you may be out of luck; if you really mean the "U" part of "URL", i.e., you need a Universal solution across all kinds of protocols, the problem of course becomes much, much harder.
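Independently of which HTTP client you end up using, the slicing itself is simple arithmetic. The sketch below assumes you already know the file size (for example from a Content-Length header); split_ranges and range_header are hypothetical helper names, and each resulting header would go on one request.

```python
def split_ranges(total_size: int, parts: int) -> list:
    """Split [0, total_size) into contiguous inclusive (start, end) byte ranges."""
    if total_size <= 0 or parts <= 0:
        raise ValueError("total_size and parts must be positive")
    parts = min(parts, total_size)  # never create empty slices
    base, extra = divmod(total_size, parts)
    ranges = []
    start = 0
    for i in range(parts):
        length = base + (1 if i < extra else 0)  # spread the remainder
        ranges.append((start, start + length - 1))
        start += length
    return ranges


def range_header(start: int, end: int) -> dict:
    """Build the HTTP/1.1 request header asking for one slice."""
    return {"Range": "bytes=%d-%d" % (start, end)}
```

A server that honors the header answers each request with status 206; if you get a 200 instead, it ignored the range and is sending the whole file.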
I'm writing an application that streams the output (by this I mean both sys.stdout and sys.stderr) of a Python script executed on the server, in real time, to the browser.
The users on the site will be allowed to select the script to run, execute and kill their chosen script, and change some parameters, so I will need a different thread per user on the site (user A can start, stop and change a script, whilst user B can do the same with a different script).
I know I need to use comet for the web clients, and seeing as the rest of the project is written in python, I'd like to use twisted for the server, however I'm not really sure of what I need to do next!
There are a daunting number of options (Divmod Mantissa, Divmod Nevow, twisted.web, STOMP, etc), and some are better documented than others, making the whole thing rather tricky!
I have a working demo using stompservice on orbited, using Orbited.TCPSocket for the javascript side of things, however I'm starting to think that STOMP's channel model isn't going to work for multithreaded, multi-running scripts (unless I open a new channel per run, but that seems like the wrong use of the channel model).
Can anyone point me in the right direction, or some sample code I can learn from?
Thanks!
Nevow Athena is a framework specifically for AJAX and COMET applications and in theory is exactly the sort of thing you are looking for.
However, I am not sure that it is well used or supported at this time - looking at mailing list traffic and google search results suggests that it may not be.
There are a couple of tutorials you could look at to help you decide on it:
one on the 'official' site: http://divmod.org/trac/wiki/DivmodNevow/Athena/Tutorials/LiveElement
and one other that I found:
http://divmodsphinx.funsize.net/nevow/chattutorial/part01/index.html
The code for the latter seems to be included in the Nevow distribution when you download it under /doc/listings/partxx (I think...)
You can implement a very simple "HTTP streaming" by keeping the http connection open and appending javascript chunks that update the dom contents. This works since the browser evaluates the "script" chunks as they arrive.
I wrote a blog entry a while ago with a running example using twisted and very few lines of javascript: Simple HTTP streaming with Twisted & Javascript
You can easily mix this pattern with a publisher/subscriber pattern to make it multiuser, etc. I use this pattern to watch live log streams via web.
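The chunks themselves can be as simple as the sketch below, which is independent of Twisted and not taken from the blog entry. It assumes the page contains an element with the given id; script_chunk is a made-up helper. The text is JSON-encoded so it is a valid JavaScript string, and "</" is escaped so the payload cannot close the script tag early.

```python
import json


def script_chunk(element_id: str, text: str) -> bytes:
    """Build one <script> chunk that replaces the text of a DOM element."""
    # json.dumps yields a valid JavaScript string literal; escaping "</"
    # keeps a payload containing "</script>" from ending the tag early.
    target = json.dumps(element_id).replace("</", "<\\/")
    payload = json.dumps(text).replace("</", "<\\/")
    chunk = "<script>document.getElementById(%s).textContent = %s;</script>\n" % (
        target,
        payload,
    )
    return chunk.encode("utf-8")
```

The server would write one such chunk to the open response each time there is new output, without ever finishing the request.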
An example of serving for long-polling clients with Twisted is slosh. This might not be what you want, but because it's not a large framework, it can help you figure out how to use Twisted.