Python Game Server - Optimizing networking

I have a Python game server running. The game is turn based. Players complain about short lag spikes that last several turns (i.e. their client is stuck waiting for a while and then suddenly receives several turns' worth of updates all at once). I'm hoping to find some way to improve the networking consistency, but I'm not sure what's left to be done.
Here's what I'm doing:
Asynchronous sockets
TCP_NODELAY flag is set
epoll for polling
Here's how I receive plays:
for (fileno, event) in events:
    if fileno == self.server_socket.fileno():
        self.add_new_client()
    elif event & select.EPOLLIN:
        c = self.clients[fileno]
        c.read()  # reads and processes all input and generates all output
    ...
And here's how I send updates:
if turn_finished:
    for user in self.clients.itervalues():
        for msg in user.queued_messages:
            msg = self.encode(msg)
            bytes_sent = user.socket.send(msg)
            ...
Whenever I write to or read from sockets, I check that all bytes were sent and log any socket errors. These things almost never show up in the logs.
Would it be better if I only did one socket.send() call?
Is there anything I can check or tweak on the linux (Ubuntu) host?
There seems to be some issue with data arriving late to the clients. Can anyone give any suggestions for debugging this issue?
About the messages sent:
If a player is idle, they typically receive 0-3 updates a turn. If a player is in the middle of action with other players, they typically receive 2-3 updates per other player on their screen. Updates are typically under 20 bytes long. There are a couple of updates (only sent to 1 player at a time) that are 256 and 500-600 bytes in size.
There are typically about 10-50 players active at a time, and no more than 10 on the same screen.
P.S. - I run with PyPy. I've profiled and everything looks good. All player moves are handled in <3 ms and the server idles the vast majority of the time.

It sounds like you need to debug the client and find out what it is doing while it is "stuck".
Only once you understand the cause will you be able to find a solution.
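On the question of whether a single socket.send() call would be better: one low-risk experiment is to coalesce each client's queued messages into one buffer and send it in one call per turn. With TCP_NODELAY set, many tiny send() calls tend to go out as many tiny packets, so batching may smooth delivery. The sketch below is only an illustration written for Python 3; the OutBuffer class and its method names are hypothetical and not part of the original server.

```python
import socket


class OutBuffer:
    """Hypothetical per-client output buffer for a non-blocking socket."""

    def __init__(self, sock):
        self.sock = sock
        self.pending = bytearray()

    def queue(self, data):
        # Append an already-encoded message; nothing is sent yet.
        self.pending += data

    def flush(self):
        # Try to send everything queued so far in a single send() call.
        if not self.pending:
            return 0
        try:
            sent = self.sock.send(self.pending)
        except BlockingIOError:
            # Kernel send buffer is full; keep the data for the next flush.
            return 0
        # Drop only the bytes the kernel accepted; keep the remainder.
        del self.pending[:sent]
        return sent
```

Calling flush() once per client at the end of a turn, and again whenever epoll reports the socket as writable while data is still pending, keeps partial sends from being lost.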

Related

Python pygame Client/Server runs slow

I found a basic Space Invaders pygame project on YouTube and modified it so that the server now does all the processing and drawing, while the client only sends keyboard input (everything runs on localhost). The problem is that the game is no longer responsive since I implemented this mechanism. There appears to be a delay of about 1 second between pressing a key and the ship actually moving (when starting the game from PyCharm; when started from cmd it's much worse).
I don't have any idea why this is happening because there isn't really anything heavy to process and I could really use your help.
I also monitored the Ethernet traffic in Wireshark and about 60-70 packets appear to be sent each second.
Here is the GitHub link with all the necessary things: https://github.com/PaaulFarcas/C-S-Game

I would expect this code in the main loop is the issue:
recv = conn.recv(661)
keys = pickle.loads(recv)
The socket function conn.recv(661) blocks until at least some data has arrived (it then returns up to 661 bytes, possibly fewer) or there is some socket event (like being closed). So your program is blocking every iteration of the main loop waiting for data to arrive, and a short read can also hand pickle.loads() an incomplete message.
You could try using socket.setblocking(False) as per the manual.
However I prefer to use the select module, as I like the better level of control it gives. Basically you can use it to know if any data has arrived on the socket (or if there's an error). This gives you a simple select-read-buffer type logic loop:
procedure receiveSocketData
    Use select on the socket, with an immediate timeout.
    Did select indicate any data arrived on my socket?
        Read the data, appending it to a Rx-buffer
        Does the Rx-buffer contain enough for a whole packet?
            take the packet-chunk from the head of the Rx-buffer
            decode & return it
        Else
            Keep the Rx-Buffer somewhere safe
            return None
    Did any errors happen on my socket
        clear Rx-Buffer
        close socket
        return error
I guess using an unknown-sized packet, you could try to un-pickle it, and return OK when successful... this is quite inefficient though. I would use a fixed size packet and the struct module to pack and unpack it in network-byte-order.
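The select-read-buffer loop above, combined with a fixed-size struct packet, could be sketched roughly as follows in Python 3. The packet layout (two unsigned 16-bit integers in network byte order) and the PacketReader name are assumptions made for illustration; the real game would define its own fields.

```python
import select
import socket
import struct

# Hypothetical packet layout: two unsigned shorts, network byte order.
PACKET = struct.Struct("!HH")


class PacketReader:
    """Accumulates bytes from a socket and yields whole fixed-size packets."""

    def __init__(self, sock):
        self.sock = sock
        self.rx = bytearray()

    def poll(self):
        """Return one decoded packet as a tuple, or None if none is complete."""
        # Timeout of 0 makes select return immediately.
        readable, _, exceptional = select.select([self.sock], [], [self.sock], 0)
        if exceptional:
            self.rx.clear()
            self.sock.close()
            raise ConnectionError("exceptional condition on socket")
        if readable:
            chunk = self.sock.recv(4096)
            if not chunk:
                # An empty read means the peer closed the connection.
                self.rx.clear()
                self.sock.close()
                raise ConnectionError("peer closed the connection")
            self.rx += chunk
        if len(self.rx) >= PACKET.size:
            # Take one packet from the head of the buffer and decode it.
            raw = bytes(self.rx[:PACKET.size])
            del self.rx[:PACKET.size]
            return PACKET.unpack(raw)
        return None
```

The main loop would call poll() once per frame and simply carry on drawing when it returns None, so a missing or partial packet never stalls the game.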

implementing a networked multiplayer game loop in twisted

I am working on a game project and decided to go with Twisted for the server part.
It's a multiplayer shooting game.
Now I want to integrate a main loop into the game (on the server side) to process input and physics (for bullets and players). The inputs are received from the clients through WebSockets.
I want the game loop to run the game at, let's say, 50 fps.
If I follow the method for implementing a game loop mentioned in this article, I have this code below:
def loop():
    previous = getCurrentTime()
    lag = 0.0
    while True:
        current = getCurrentTime()
        elapsed = current - previous
        previous = current
        lag += elapsed
        processInput()
        while lag >= MS_PER_UPDATE:
            update()
            lag -= MS_PER_UPDATE
        send_state_to_connected_clients()
In the article it mentions that:
If you’re making a game that runs in a web browser, you pretty much can’t write your own classic game loop. The browser’s event-based nature precludes it
Now I am having a difficult time understanding how this applies to Twisted, as it's also event based. (I think what it says is that the while True statement will block the reactor forever, so what can we do to implement our own loop in Twisted given it's event based?)
In the same article, towards the bottom, it mentions these points:
Use the platform’s event loop:
1. It’s simple. You don’t have to worry about writing and optimizing the core loop of the game
2. It plays nice with the platform. You don’t have to worry about explicitly giving the host time to process its own events, caching events, or otherwise managing the impedance mismatch between the platform’s input model and yours.
What I am looking for is a general approach towards implementing a game loop in Twisted (for a networked multiplayer game).
Should I use the built-in reactor by using LoopingCall to call my loop? How does it then handle the issues mentioned in the article?
Should I create my own loop somehow (e.g. by using threads/processes or some other construct to run the game loop separate from the reactor)?
Should I create my own reactor implementation somehow?

If I understand the problem accurately, you will have a Python server and players will play a real-time FPS in the browser. You will have to:
display the game in real-time
handle user events in the browser
send browser-event results to the server
parse the results on the server
send server events to the browser
handle server events in the browser
We already know that you are going to use WebSockets.
Display the game in real-time
You will need to display the graphics somewhere, maybe inside a canvas. You will need to implement lots of functions, like update health bar, display changes and so on. These will be triggered when you handle responses from the server.
Handle user events in the browser
If we assume that clicking is shooting, space is activate and so on, you will need some event handlers for those. In the browser you will need a browser-level validation. For instance, if the player intends to shoot, but there is no more ammo, then you do not even have to send a message to the server; you might just play a gun sound effect which signifies that shooting was unsuccessful. If, according to the data you have in the browser, you have ammo, then the direction you shoot at should be sent to the server.
Send browser-event results to the server
When an event occurs in the browser and is evaluated, then the results in many cases will be sent to the server, which will handle them and eventually send a response to the browser.
Parse the results on the server
The server will have an event loop and will receive WebSocket messages from the browsers of the players. For example, if the server receives a shoot event, then it will get the current coordinates of the player and the direction, calculate where the bullet goes and send a message to the players' browsers. If someone is hit, then the damage is calculated and the server determines whether the player dies. The players then get WebSocket messages from the server, and the browser plays the sound of the bullet along with the bullet's graphical display and potentially some blood, falling players and so on.
Send server events to the browser
The browsers will listen to WebSocket messages from the server and handle those.
Handle server events in the browser
We cannot trust user events, because some cheating could be involved, so when you shoot, the browser will handle the event and the server will receive a message. When the server sends a WebSocket message, the browsers will "believe" that the server sent an accurate response.
Technical needs
You will need a graphics API, user event listeners and WebSocket listeners in the browsers. On the server you will listen to client WebSocket messages.
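Coming back to the question's game loop: one way to keep the fixed-timestep idea without a blocking while True is to move the body of the loop into a function that the event loop calls periodically. The sketch below is framework-neutral Python 3; FixedStepLoop and its parameters are hypothetical names, and the clock is injectable only to make the behaviour easy to check.

```python
import time


class FixedStepLoop:
    """Runs update() in fixed steps, however irregularly tick() is called."""

    def __init__(self, update, step_s=0.02, clock=time.monotonic):
        self.update = update      # called once per simulated step
        self.step_s = step_s      # 0.02 s per step is 50 updates per second
        self.clock = clock
        self.previous = clock()
        self.lag = 0.0

    def tick(self):
        """Catch the simulation up to the present; return the steps run."""
        now = self.clock()
        self.lag += now - self.previous
        self.previous = now
        steps = 0
        while self.lag >= self.step_s:
            self.update(self.step_s)
            self.lag -= self.step_s
            steps += 1
        return steps
```

Under Twisted, tick() could then be scheduled with twisted.internet.task.LoopingCall(loop.tick).start(0.02), with input handled in the WebSocket callbacks and the state broadcast after each tick. Whether that is precise enough for a shooter is something to measure rather than assume.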

A Process to check if Infinite Loop is still running in Python3

I am unable to work out, in terms of general programming concepts, how to handle the following scenario:
Note: All data transmission in this scenario is done via UDP packets using the socket module of Python 3.
I have a server which sends a certain amount of data, say 300 packets, over a WiFi channel.
At the other end, I have a receiver which runs a decoding process on the data. This decoding process is essentially an infinite loop which returns a Boolean value, True or False, at every iteration depending on certain aspects which can be neglected for now.
A rough code snippet (Python 3) is as follows:
incomingPacket = next(bringNextFromBuffer)
if decoder.consume_data(incomingPacket):
    # this if condition is inside an infinite loop
    # unless the if condition becomes True,
    # keep consuming data in a forever for loop
    print("Data has been received")
Everything works at the moment since the server and client are in proximity and the data can be decoded. But in practical scenarios I want to monitor the loop mentioned above. For instance, if the loop is still in its forever (infinite) state after a certain amount of time, I would like to send something back to the server so that it starts sending the data again.
I am not very clear on the multithreading concept, but can I use a thread in this scenario?
For example:
Run a thread for a certain amount of time that keeps checking the decoder.consume_data() function; if the time expires and the output is still False, can I then send some kind of feedback to the server using struct.pack() over sockets?
Of course the networking logic need NOT be addressed for now. But is Python capable of MONITORING THIS INFINITE LOOP VIA A PARALLEL THREAD OR ANOTHER PROGRAMMING CONCEPT?
Caveats
Unfortunately the Receiver in question is a dumb receiver i.e. No user control is specified. Only thing Receiver can do is decode the data and perhaps send a Feedback to the Server stating whether the data is received or not and that is possible only when the above mentioned LOOP is completed.
What is a possible solution here?
(Would be happy to share more information on request)

Yes you can do this. Roughly it'll look like this:
from threading import Thread
from time import sleep
state = 'running'
def monitor():
    while True:
        if state == 'running':
            tell_client()
        sleep(1)  # to prevent too much happening here
Thread(target=monitor).start()
while state == 'running':
    receive_data()
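A slightly fuller variant of the same idea, as a sketch only: a threading.Event tells the monitor thread when decoding has finished, so the monitor stops on its own and only fires when the timeout really elapsed. The names run_with_watchdog, decode_step and on_timeout are hypothetical; decode_step stands in for one iteration of the question's loop (fetch a packet, call decoder.consume_data()) and on_timeout for whatever sends feedback to the server.

```python
import threading


def run_with_watchdog(decode_step, on_timeout, timeout_s):
    """Call decode_step() until it returns True.

    While decoding is still unfinished, on_timeout() is called from a
    monitor thread every timeout_s seconds.
    """
    done = threading.Event()

    def watchdog():
        # Event.wait() returns False when the timeout elapsed without
        # the event being set, i.e. the decoder is still looping.
        while not done.wait(timeout_s):
            on_timeout()

    monitor = threading.Thread(target=watchdog, daemon=True)
    monitor.start()
    try:
        while not decode_step():
            pass
    finally:
        # Wake the monitor immediately and let it exit.
        done.set()
        monitor.join()
```

Note that decode_step() must itself return regularly; if fetching the next packet blocks forever, the monitor still fires, but the main loop never notices the retransmitted data until that call returns.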

Why won't ZMQ drop messages?

I have an application which fetches messages from a ZeroMQ publisher, using a PUB/SUB setup. The reader is slow sometimes so I set a HWM on both the sender and receiver. I expect that the receiver will fill the buffer and jump to catch up when it recovers from processing slowdowns. But the behavior that I observe is that it never drops! ZeroMQ seems to be ignoring the HWM. Am I doing something wrong?
Here's a minimal example:
publisher.py
import zmq
import time
ctx = zmq.Context()
sock = ctx.socket(zmq.PUB)
sock.setsockopt(zmq.SNDHWM, 1)
sock.bind("tcp://*:5556")
i = 0
while True:
    sock.send(str(i))
    print i
    time.sleep(0.1)
    i += 1
subscriber.py
import zmq
import time
ctx = zmq.Context()
sock = ctx.socket(zmq.SUB)
sock.setsockopt(zmq.SUBSCRIBE, "")
sock.setsockopt(zmq.RCVHWM, 1)
sock.connect("tcp://localhost:5556")
while True:
    print sock.recv()
    time.sleep(0.5)

I believe there are a couple things at play here:
High Water Marks are not exact (see the last paragraph in the linked section) - typically this means the real queue size will be smaller than your listed number, I don't know how this will behave at 1.
Your PUB HWM will never drop messages... due to the way PUB sockets work, it will always immediately process the message whether there is an available subscriber or not. So unless it actually takes ZMQ .1 seconds to process the message through the queue, your HWM will never come into play on the PUB side.
What should be happening is something like the following (I'm assuming an order of operations that would allow you to actually receive the first published message):
Start up subscriber.py & wait a suitable period to make sure it's completely spun up (basically immediately)
Start up publisher.py
PUB processes and sends the first message, SUB receives and processes the first message
PUB sleeps for .1 seconds and processes & sends the second message
SUB sleeps for .5 seconds, the socket receives the second message but sits in queue until the next call to sock.recv() processes it
PUB sleeps for .1 seconds and processes & sends the third message
SUB is still sleeping for another .3 seconds, so the third message should hit the queue behind the second message, which would make 2 messages in the queue, and the third one should drop due to the HWM
... etc etc etc.
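The sequence above can be checked with a toy model: a bounded queue that rejects new messages when full, a publisher that sends every 0.1 s and a subscriber that reads every 0.5 s. This is plain Python 3 with no ZeroMQ involved, and it only models the idealized expectation; real sockets add buffering of their own (inside the library and in the TCP stack), which this model ignores entirely. The simulate function and its parameters are made up for this illustration.

```python
from collections import deque


def simulate(hwm, pub_every=1, sub_every=5, ticks=21):
    """Idealized model: one tick is 0.1 s, queue holds at most hwm messages."""
    queue = deque()
    received, dropped = [], []
    for tick in range(ticks):
        # Publisher: emit the next message; it is dropped if the queue is full.
        if tick % pub_every == 0:
            msg = tick // pub_every
            if len(queue) < hwm:
                queue.append(msg)
            else:
                dropped.append(msg)
        # Subscriber: wake up every sub_every ticks and take one message.
        if tick % sub_every == 0 and queue:
            received.append(queue.popleft())
    return received, dropped


received, dropped = simulate(hwm=1)
print(received)  # [0, 1, 6, 11, 16]
```

In this model the subscriber sees jumps in the numbering after its second read, which is the behaviour the question expected and did not observe.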
I suggest the following changes to help troubleshoot the issue:
Remove the HWM on your publisher... it does nothing but add a variable we don't need to deal with in your test case, since we never expect it to change anything. If you need it for your production environment, add it back in and test it in a high volume scenario later.
Change the HWM on your subscriber to 50. It'll make the test take longer, but you won't be at the extreme edge case, and since the ZMQ documentation states that the HWM isn't exact, the extreme edge cases could cause unexpected behavior. Mind you, I believe your test (being small numbers) wouldn't do that, but I haven't looked at the code implementing the queues so I can't say with certainty, and it may be possible that your data is small enough that your effective HWM is actually larger.
Change your subscriber sleep time to 3 full seconds... in theory, if your queue holds up to exactly 50 messages, you'll saturate that within two loops (just like you do now), and then you'll have to wait 2.5 minutes to work through those messages to see if you start getting skips, which after the first 50 messages should start jumping large groups of numbers. But I'd wait at least 5-10 minutes. If you find that you start skipping after 100 or 200 messages, then you're being bitten by the smallness of your data.
This of course doesn't address what happens if you still don't skip any messages... If you do that and still experience the same issue, then we may need to dig more into how high water marks actually work, there may be something we're missing.

I ran into exactly the same problem, and my demo is nearly the same as yours: neither the subscriber nor the publisher drops any messages after zmq.RCVHWM or zmq.SNDHWM is set to 1.
I worked around it by following the Suicidal Snail pattern for slow subscriber detection in Chapter 5 of the zguide. Hope it helps.
BTW: would you please let me know if you've solved the zmq HWM issue?

Interact with long running python process

I have a long running python process running headless on a raspberrypi (controlling a garden) like so:
from time import sleep
def run_garden():
    while True:
        # do work
        sleep(60)
if __name__ == "__main__":
    run_garden()
The 60 second sleep period is plenty of time for any changes happening in my garden (humidity, air temp, turn on pump, turn off fan etc), BUT what if I want to manually override these things?
Currently, in the "do work" step of my loop, I first call out to another server where I keep config variables, and I can update those config variables via a web console, but it lacks any sort of real-time feel, because it relies on the 60 second loop (e.g. you might update the web console, and then wait 45 seconds for the change to take effect).
The raspberryPi running run_garden() is dedicated to the garden and it is basically the only thing taking up resources. So i know i have room to do something, I just dont know what.
Once the loop picks up the fact that a config var has been updated, the loop could then do exponential backoff to keep checking for interaction, rather than wait 60 seconds, but it just doesnt feel like that is a whole lot better.
Is there a better way to basically jump into this long running process?

Listen on a socket in your main loop. Use a timeout (e.g. of 60 seconds, the time until the next garden update should be performed) on your socket read calls so you get back to your normal functionality at least every minute when there are no commands coming in.
If you need garden-tending updates to happen no faster than every minute you need to check the time since the last update, since read calls will complete significantly faster when there are commands coming in.
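A minimal sketch of that approach in Python 3, assuming commands arrive as short byte strings on an already-connected socket. The function name wait_for_command and the 1024-byte read size are illustrative choices, not part of the original setup.

```python
import socket
import time


def wait_for_command(sock, deadline):
    """Wait for a command until the deadline (a time.monotonic() value).

    Returns the received bytes, or None if the deadline passed first.
    """
    remaining = deadline - time.monotonic()
    if remaining <= 0:
        return None
    # The timeout shrinks as the deadline approaches, so a burst of
    # commands cannot postpone the regular garden update.
    sock.settimeout(remaining)
    try:
        return sock.recv(1024)
    except socket.timeout:
        return None
```

The main loop would set deadline = last_update + 60, call wait_for_command() repeatedly and apply any override it returns, and run the regular garden update only once the call returns None.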

Python's select module sounds like it might be helpful.
If you've ever used the unix analog (for example in socket programming maybe?), then it'll be familiar.
If not, here is the select section of a C sockets reference I often recommend. And here is what looks like a nice writeup of the module.
Warning: the first reference is specifically about C, not Python, but the concept of the select system call is the same, so the discussion might be helpful.
Basically, it allows you to tell it what events you're interested in (for example, socket data arrival, keyboard event), and it'll block either forever, or until a timeout you specify elapses.
If you're using sockets, then adding the socket and stdin to the list of events you're interested in is easy. If you're just looking for a way to "conditionally sleep" for 60 seconds unless/until a keypress is detected, this would work just as well.
EDIT:
Another way to solve this would be to have your raspberry-pi "register" with the server running the web console. This could involve a little bit extra work, but it would give you the realtime effect you're looking for.
Basically, the raspberry-pi "registers" itself, by alerting the server about itself, and the server stores the address of the device. If using TCP, you could keep a connection open (which might be important if you have firewalls to deal with). If using UDP you could bind the port on the device before registering, allowing the server to respond to the source address of the "announcement".
Once announced, when config. options change on the server, one of two things usually happen:
A) You send a tiny "ping" (in the general sense, not the ICMP host detection protocol) to the device alerting it that config options have changed. At this point the host would immediately request the full config. set, acquiring the update with it.
B) You send the updated config. option (or maybe the entire config. set) back to the device. This decreases the number of messages between the device and server, but would probably take more work as it seems like more a deviation from your current setup.

Why not use an event-based loop instead of sleeping for a certain amount of time?
That way your loop will only run when a change is detected, and it will always run when a change is detected (which is the point of your question?).
You can do such a thing by using:
python event objects
Just wait for one or all of your event objects to be triggered and run the loop. You can also wait for X events to be done, etc, depending if you expect one variable to be updated a lot.
Or even a system like:
broadcasting events
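Assuming "python event objects" refers to threading.Event from the standard library, the 60 second sleep could be replaced by a wait that ends early when another thread signals a change. This is a sketch in Python 3; config_changed and wait_for_change are hypothetical names.

```python
import threading

# Set by whichever thread learns about a config change (hypothetical).
config_changed = threading.Event()


def wait_for_change(timeout_s=60):
    """Sleep up to timeout_s seconds, waking early on a config change.

    Returns True if woken by a change, False if the timeout elapsed.
    """
    changed = config_changed.wait(timeout_s)
    # Reset the flag so the next wait blocks again.
    config_changed.clear()
    return changed
```

Replacing sleep(60) in run_garden() with wait_for_change(60) keeps the regular one-minute cycle while letting a listener thread call config_changed.set() to trigger an immediate pass.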
