I was wondering if it would be a good idea to use callLater in Twisted to keep track of auction endings. It would be a callLater on the order of hundreds of thousands of seconds; does that matter? It seems like it would be very convenient, but then again it seems like a horrible idea if the server crashes.
Keeping a database of when all the auctions are ending seems like the most secure solution, but checking the whole database each second to see if any auction has ended seems very expensive.
If the server crashes, maybe it could recreate all the callLaters from the database entries of auction end times when it restarts. Are there other potential concerns with such a model?
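For concreteness, the recovery step I have in mind looks roughly like this (just a sketch; get_pending_auctions() and close_auction() are placeholders for my own persistence code):

    import time
    from twisted.internet import reactor

    def schedule_pending_auctions():
        # Re-create one callLater per open auction from the database rows.
        now = time.time()
        for auction_id, end_timestamp in get_pending_auctions():
            delay = max(0, end_timestamp - now)
            reactor.callLater(delay, close_auction, auction_id)

    def close_auction(auction_id):
        # Mark the auction closed in the database, notify the winner, etc.
        ...

    reactor.callWhenRunning(schedule_pending_auctions)
    reactor.run()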
One of the Divmod projects, Axiom, might be applicable here. Axiom is an object database. One of its unexpected, useful features is a persistent scheduling system.
You schedule events using APIs provided by the database. When an event comes due, a callback you specified is called. The events persist across process restarts, since they're represented as database objects. Large numbers of scheduled events are supported, because it only does the work of tracking when the next event is due.
The canonical Divmod site went down some time ago (sadly the company is no longer an operating concern), but the code is all available at http://launchpad.net/divmod.org and the documentation is being slowly rehosted at http://divmod.readthedocs.org/.
So far I have investigated two different ways of persistently tracking player attribute skills in a game. These are mainly conceptual except for the threading option I came up with / found an example for.
The case:
I'm solo-developing a web game: a geopolitical simulator, but with a little twist compared to others out there, which I won't reveal.
I'm using a combination of Flask and SQLAlchemy, for which I have written routes, with templates extending a base dynamically.
Currently I'm running it locally in dev mode, with the intention of putting it behind a WSGI server and a reverse proxy like Nginx on a cloud-based Linux VM.
About the player attribute mechanics: a player will submit a POST request which specifies a few bits of information. First we want to know which skill (intelligence, endurance, etc.). Next we need to know which player, but all of this will be generated automatically; we can use Flask-Login's LoginManager with our nifty user_loader decorator and function to get the current user. We can use the user ID it provides to query the rest, namely what level the player is. We can specify the math used to decide the wait-time increase (in seconds) later.
The options:
Option 1:
As suggested by a colleague of mine: let the database manage the timing of the skills. When the user submits the form, we create a row in a new table that holds skill-upgrade information. We record the time the user submitted the form, multiply the current skill level by a factor of X amount of time, and put both pieces of data into the database. Then we create a new process that constantly checks this table. Using timedelta, we can check whether the time elapsed since the form was submitted is equal to or greater than the time the player must wait until the upgrade is complete.
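Roughly, I picture option 1 like this (a sketch only; the SkillUpgrade model, its columns, and apply_upgrade() are placeholder names):

    from datetime import datetime, timedelta

    def check_finished_upgrades(session):
        # Run periodically from the separate managing process.
        now = datetime.utcnow()
        for upgrade in session.query(SkillUpgrade).filter_by(completed=False):
            if now - upgrade.submitted_at >= timedelta(seconds=upgrade.wait_seconds):
                apply_upgrade(upgrade)        # bump the player's skill level
                upgrade.completed = True
        session.commit()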
Option 2:
Import threading and create a class which expects the same information as above supplied on init, then simply use time.sleep for X amount of time, fire the upgrade, and kill the thread when it's finished.
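Something like this is what I mean (again only a sketch with placeholder names; as noted in the answer below, these threads die with the process, so pending upgrades would be lost on a restart):

    import threading
    import time

    class SkillUpgradeThread(threading.Thread):
        def __init__(self, user_id, skill_name, wait_seconds):
            super().__init__(daemon=True)
            self.user_id = user_id
            self.skill_name = skill_name
            self.wait_seconds = wait_seconds

        def run(self):
            time.sleep(self.wait_seconds)
            apply_upgrade(self.user_id, self.skill_name)  # placeholder DB update

    SkillUpgradeThread(user_id=42, skill_name="endurance", wait_seconds=3600).start()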
I hope this all makes sense. I haven't written either yet because I am undecided about which is the most efficient way around it.
I'm looking for the most scalable solution (even if it's not an option listed here), but one that is also as practical as, or an improvement on, my concept of the skill-tracking mechanic.
I'm open to adding another lib to the package but I really would rather not.
I'll expand on my comment a little bit:
In terms of scalability:
What if the upgrade processes become very long? Hours or days?
What if you have a lot of users?
What if people disconnect and reconnect to sessions at different times?
Hopefully it is clear you cannot ensure a robust process with option 2. Threading and waiting will put a continuous and potentially limiting load on the server, and if the server fails, all of those threads are likely to be lost.
In terms of robustness:
On the other hand if you record all of the information to a database you have the facility to cross check the states of any items and perform upgrade/downgrade actions as deemed necessary by some form of task scheduler. This allows you to ensure that character states are always consistent with what you expect. And you only need one process to scan through the DB periodically and perform actions on all of the open rows flagged for an upgrade.
You could, if you wanted, also avoid a global task scheduler altogether. When a user performs an activity on the site a little task could run in the background (as a kind of decorator) that checks the upgrade status and if the time is right performs the DB activity, otherwise just passes. But a user would need to be actively in a session to make sure this happens, as opposed to the scheduled task above.
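A minimal sketch of that idea, assuming Flask-Login and SQLAlchemy as in your setup (app, db, current_user.pending_upgrades, and apply_upgrade are placeholders):

    from datetime import datetime
    from flask_login import current_user

    @app.before_request
    def settle_due_upgrades():
        # Piggy-back on any request made by a logged-in user.
        if not current_user.is_authenticated:
            return
        now = datetime.utcnow()
        for upgrade in current_user.pending_upgrades:
            if now >= upgrade.due_at:
                apply_upgrade(upgrade)        # placeholder for the DB update
        db.session.commit()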
In programming web applications, Django in particular, sometimes we have a set of actions that must all succeed or all fail (in order to ensure a predictable state of some sort). Now obviously, when we are working with the database, we can use transactions.
But in some circumstances, these (all or nothing) constraints are needed outside of a database context
(e.g. If payment is a success, we must send the product activation code or else risk customer complaints, etc)
But let's say that on some fateful day, the send_code() function just failed time and again due to some temporary network error (one that lasted for 1+ hours).
Should I log the error and fix the problem manually, e.g. send the mail myself?
Should I set up some kind of work queue, where when things fail, they just go back onto the end of the queue for future retry?
What if the logging/queueing systems also fail? (am I worrying too much at this point?)
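For the queue idea, I picture something roughly like this (just a sketch using Celery, which I haven't committed to; send_code() is my existing function and the retry numbers are arbitrary):

    from celery import shared_task

    @shared_task(bind=True, max_retries=10)
    def deliver_activation_code(self, order_id):
        try:
            send_code(order_id)                   # the flaky network call
        except (ConnectionError, TimeoutError) as exc:
            # Exponential backoff: 1 min, 2 min, 4 min, ...
            raise self.retry(exc=exc, countdown=60 * 2 ** self.request.retries)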
We use microservices in our company, and at least once a month one of our microservices goes down for a while. We have a Transaction model for the payment process, with statuses for every step that happens before we send a product to the user. If something goes wrong, or one of the connected microservices is down, we mark it with status=error and save it to the database. Then we use a cron job to find and finish those processes. You need to try something to begin with, and if it does not fit your needs, try something else.
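As a rough illustration, the cron-driven repair job looks something like this in our setup (model and field names here are simplified placeholders, not our real code):

    from myapp.models import Transaction   # hypothetical Django app

    def finish_failed_transactions():
        # Invoked by cron, e.g. through a Django management command.
        for tx in Transaction.objects.filter(status="error"):
            try:
                send_code(tx.order_id)      # retry the step that failed
                tx.status = "done"
            except Exception:
                pass                        # leave as "error"; the next run retries
            tx.save()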
We have a small amount of data that is almost never updated but is read frequently (site config and some selection items like state and county information). I think if I can move it into application memory instead of any database, our I/O performance will improve considerably.
But we have a lot of web servers, and I cannot figure out a good way to notify all the servers to reload this data.
You are likely looking for a cache pattern: Is there a Python caching library? You just need to ask how stale you can afford to be. If you are looking this up on every request, even a short-lived cache can massively improve performance. It's likely, though, that this information can live for minutes or hours without too much risk of being "stale".
If you can't live with a stale cache, I've implemented solutions that have a single database call, which keeps track of the last updated date for any of the cached data. This at least reduces the cache lookups to a single database call.
Be aware though, as soon as you are sharing updateable information, you have to deal with multi-threaded updates of shared state. Make sure you understand the implications of this. Hopefully your caching library handles this gracefully.
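For illustration, a tiny in-process cache with a TTL might look like this (load_site_config() is a placeholder for your single DB read; the lock guards against concurrent refreshes):

    import threading
    import time

    _lock = threading.Lock()
    _cache = {"data": None, "loaded_at": 0.0}
    TTL = 300  # how many seconds of staleness you can afford

    def get_site_config():
        with _lock:
            if _cache["data"] is None or time.time() - _cache["loaded_at"] > TTL:
                _cache["data"] = load_site_config()   # the single DB read
                _cache["loaded_at"] = time.time()
            return _cache["data"]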
Our situation is as follows:
We are working on a school project where the intention is that multiple teams walk around a city with smartphones and play a city game while walking.
As such, we can have 10 active smartphones walking around the city, all posting their location and requesting data from Google App Engine.
Someone is behind a web browser, watching all these teams walk around and sending them messages, etc.
We are using the datastore Google App Engine provides to store all the data these teams send and request, to store the messages and retrieve them, etc.
However, we soon found out we had hit our limit of reads and writes, so we searched for a way to retrieve periodic updates (which cost the most reads and writes) without using any of the limited resources Google provides. And obviously, because it's a school project, we don't want to pay for more reads and writes.
Storing this information in global variables seemed an easy and quick solution, which it was... but when we started to truly test, we noticed some of our data was missing and then reappearing. It turned out this was because there were so many requests being made to the cloud that a new instance was spawned, and instances don't keep these global variables persistent.
So our question is:
Can we somehow make sure these global variables are always the same on every running instance of Google App Engine?
OR
Can we limit the number of instances ever running to one, no matter how many requests are made?
OR
Is there perhaps a better way to store this data, without using the datastore and without using globals?
You should be using memcache. If you use the ndb (new database) library, you can automatically cache the results of queries. Obviously this won't improve your writes much, but it should significantly improve the numbers of reads you can do.
You need to back it with the datastore as data can be ejected from memcache at any time. If you're willing to take the (small) chance of losing updates you could just use memcache. You could do something like store just a message ID in the datastore and have the controller periodically verify that every message ID has a corresponding entry in memcache. If one is missing the controller would need to reenter it.
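A read-through pattern on App Engine might look roughly like this (the Message model and the 30-second expiry are illustrative):

    from google.appengine.api import memcache
    from google.appengine.ext import ndb

    class Message(ndb.Model):
        text = ndb.StringProperty()
        created = ndb.DateTimeProperty(auto_now_add=True)

    def get_recent_messages():
        messages = memcache.get("recent_messages")
        if messages is None:
            messages = Message.query().order(-Message.created).fetch(50)
            memcache.set("recent_messages", messages, time=30)  # short expiry
        return messages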
Interesting question. Some bad news first: I don't think there's a better way of storing the data; no, you won't be able to stop new instances from spawning; and no, you cannot make separate instances always have the same data.
What you could do is have the instances periodically sync themselves with a master record in the datastore. By choosing the frequency intelligently and downloading/uploading the information in one lump, you could limit the number of reads/writes to a level that works for you. This is firmly in kludge territory, though.
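A rough sketch of that kludge, with illustrative entity and property names: each instance keeps a module-level copy and refreshes it from a single master entity at most every N seconds.

    import time
    from google.appengine.ext import ndb

    class GameState(ndb.Model):
        payload = ndb.JsonProperty()

    _local = {"state": None, "synced_at": 0.0}
    SYNC_INTERVAL = 30  # seconds between datastore reads per instance

    def get_game_state():
        if time.time() - _local["synced_at"] > SYNC_INTERVAL:
            master = GameState.get_by_id("master") or GameState(id="master", payload={})
            _local["state"] = master.payload
            _local["synced_at"] = time.time()
        return _local["state"]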
Despite finding the quota for just about everything else, I can't find the limits for free read/write so it is possible that they're ludicrously small but the fact that you're hitting them with a mere 10 smartphones raises a red flag to me. Are you certain that the smartphones are being polled (or calling in) at a sensible frequency? It sounds like you might be hammering them unnecessarily.
Consider the Jabber (XMPP) protocol for communication between peers; the free quota for it is quite high.
First, definitely use memcache as Tim Delaney said. That alone will probably solve your problem.
If not, you should consider a push model. The advantage is that your clients won't be asking you for new data all the time, only when something has actually changed. If the update is small enough that you can deliver it in the push message, you won't need to worry about datastore reads on memcache misses, or any other duplicate work, for all those clients: you read the data once when it changes and push it out to everyone.
The first option for push is C2DM (Android) or APN (iOS). These are limited on the amount of data they send and the frequency of updates.
If you want to get fancier you could use XMPP instead. This would let you do more frequent updates with (I believe) bigger payloads but might require more engineering. For a starting point, see Stack Overflow questions about Android and iOS.
Have fun!
I'm engaged in developing a turn-based casual MMORPG game server.
The low-level engine (NOT written by us), which handles networking, multi-threading, timers, inter-server communication, the main game loop, etc., was written in C++. The high-level game logic is written in Python.
My question is about the data model design in our game.
At first we simply tried to load all of a player's data into RAM and a shared data cache server when the client logs in, with a timer scheduled to periodically flush data to the data cache server and the data cache server persisting the data into the database.
But we found this approach has some problems:
1) Some data needs to be saved or checked instantly, such as quest progress, level-ups, item & money gains, etc.
2) According to the game logic, we sometimes need to query an offline player's data.
3) Some global game-world data needs to be shared between different game instances, which may be running on different hosts or in different processes on the same host. This is the main reason we need a data cache server sitting between the game logic server and the database.
4) Players need to be able to switch freely between game instances.
Below are the difficulties we encountered in the past:
1) All data access operations should be asynchronous to avoid network I/O blocking the main game logic thread. We have to send a message to the database or cache server and then handle the data reply message in a callback function before continuing with the game logic. It quickly becomes painful to write moderately complex game logic that needs to talk to the DB several times, and the game logic ends up scattered across many callback functions, which makes it hard to understand and maintain.
2) The ad-hoc data cache server makes things more complex; it is hard for us to maintain data consistency and to update/load/refresh data effectively.
3) In-game data queries are inefficient and cumbersome; the game logic needs to query a lot of information such as inventory, item info, avatar state, etc. Some transaction mechanism is also needed, for example, if one step fails the entire operation should be rolled back. We tried to design a good data-model system in RAM, building a lot of complex indexes to ease the numerous information queries, adding transaction support, etc. I quickly realized what we were building was an in-memory database system; we were reinventing the wheel...
Finally I turned to Stackless Python, and we removed the cache server. All data is saved in the database, and the game logic server queries the database directly. With Stackless Python's micro-tasklets and channels, we can write game logic in a synchronous style. It is far easier to write and understand, and productivity has greatly improved.
In fact, the underlying DB access is still asynchronous: a client tasklet issues a request to a dedicated DB I/O worker thread and then blocks on a channel, but the main game logic as a whole is not blocked; other clients' tasklets are scheduled and run freely. When the DB reply arrives, the blocked tasklet is woken up and continues to run from its 'break point' (a continuation?).
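The pattern looks roughly like this (a simplified sketch; run_query() stands in for the real database call, and the exact cross-thread channel behaviour depends on the Stackless version):

    import Queue
    import threading
    import stackless

    db_requests = Queue.Queue()

    def db_worker():
        while True:
            reply_channel, sql, params = db_requests.get()
            reply_channel.send(run_query(sql, params))   # blocking DB I/O lives here

    def query(sql, params=()):
        # Called from game-logic tasklets; looks synchronous to the caller.
        reply = stackless.channel()
        db_requests.put((reply, sql, params))
        return reply.receive()   # only this tasklet blocks; others keep running

    threading.Thread(target=db_worker).start()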
With above design, I have some questions:
1) DB access will be more frequent than with the previous cached solution. Can the DB support such high-frequency query/update operations? Will a mature cache solution such as Redis or memcached be needed in the near future?
2) Are there any serious pitfalls in my design? Can you give me some better suggestions, especially on in-game data management patterns?
Any suggestion would be appreciated, thanks.
I've worked with one MMO engine that operated in a somewhat similar fashion. It was written in Java, however, not Python.
With regards to your first set of points:
1) async db access We actually went the other route, and avoided having a “main game logic thread.” All game logic tasks were spawned as new threads. The overhead of thread creation and destruction was completely lost in the noise floor compared to I/O. This also preserved the semantics of having each “task” as a reasonably straightforward method, instead of the maddening chain of callbacks that one otherwise ends up with (although there were still cases of this.) It also meant that all game code had to be concurrent, and we grew increasingly reliant upon immutable data objects with timestamps.
2) ad-hoc cache We employed a lot of WeakReference objects (I believe Python has a similar concept?), and also made use of a split between the data objects, e.g. “Player”, and the “loader” (actually database access methods) e.g. “PlayerSQLLoader;” the instances kept a pointer to their Loader, and the Loaders were called by a global “factory” class that would handle cache lookups versus network or SQL loads. Every “Setter” method in a data class would call the method changed, which was an inherited boilerplate for myLoader.changed (this);
In order to handle loading objects from other active servers, we employed “proxy” objects that used the same data class (again, say, “Player,”) but the Loader class we associated was a network proxy that would (synchronously, but over gigabit local network) update the “master” copy of that object on another server; in turn, the “master” copy would call changed itself.
Our SQL UPDATE logic had a timer. If the backend database had received an UPDATE of the object within the last ($n) seconds (we typically kept this around 5), it would instead add the object to a “dirty list.” A background timer task would periodically wake and attempt to flush any objects still on the “dirty list” to the database backend asynchronously.
Since the global factory maintained WeakReferences to all in-core objects, and would look for a single instantiated copy of a given game object on any live server, we would never attempt to instantiate a second copy of one game object backed by a single DB record, so the fact that the in-RAM state of the game might differ from the SQL image of it for up to 5 or 10 seconds at a time was inconsequential.
Our entire SQL system ran in RAM (yes, a lot of RAM) as a mirror to another server who tried valiantly to write to disc. (That poor machine burned out RAID drives on average of once every 3-4 months due to “old age.” RAID is good.)
Notably, the objects had to be flushed to database when being removed from cache, e.g. due to exceeding the cache RAM allowance.
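If you wanted something similar in Python, the write-coalescing part might look roughly like this (illustrative only; our system was Java, and flush_to_db() / last_flushed are placeholders):

    import threading
    import time

    WRITE_COALESCE_SECONDS = 5
    _dirty = set()
    _dirty_lock = threading.Lock()

    def changed(obj):
        # Called by every setter on a data object.
        if time.time() - obj.last_flushed < WRITE_COALESCE_SECONDS:
            with _dirty_lock:
                _dirty.add(obj)          # recently written; coalesce this update
        else:
            flush_to_db(obj)             # placeholder for the SQL UPDATE
            obj.last_flushed = time.time()

    def flush_dirty_periodically(interval=WRITE_COALESCE_SECONDS):
        # Run in a background thread or timer task.
        while True:
            time.sleep(interval)
            with _dirty_lock:
                batch = list(_dirty)
                _dirty.clear()
            for obj in batch:
                flush_to_db(obj)
                obj.last_flushed = time.time()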
3) in-memory database … I hadn't run across this precise situation. We did have “transaction-like” logic, but it all occurred on the level of Java getters/setters.
And, in regards to your latter points:
1) Yes, PostgreSQL and MySQL in particular deal well with this, particularly when you use a RAMdisk mirror of the database to attempt to minimize actual HDD wear and tear. In my experience, MMO's do tend to hammer the database more than is strictly necessary, however. Our “5 second rule”* was built specifically to avoid having to solve the problem “correctly.” Each of our setters would call changed. In our usage pattern, we found that an object typically had either 1 field changed, and then no activity for some time, or else had a “storm” of updates happen, where many fields changed in a row. Building proper transactions or so (e.g. informing the object that it was about to accept many writes, and should wait for a moment before saving itself to the DB) would have involved more planning, logic, and major rewrites of the system; so, instead, we bypassed the situation.
2) Well, there's my design above :-)
In point of fact, the MMO engine I'm presently working on uses even more reliance upon in-RAM SQL databases, and (I hope) will be doing so a bit better. However, that system is being built using an Entity-Component-System model, rather than the OOP model that I described above.
If you are already based on an OOP model, shifting to ECS is quite a paradigm shift, and if you can make OOP work for your purposes, it's probably better to stick with what your team already knows.
*- “the 5 second rule” is a colloquial US “folk belief” that after dropping food on the floor, it's still OK to eat it if you pick it up within 5 seconds.
It's difficult to comment on the entire design/data model without a greater understanding of the software, but it sounds like your application could benefit from an in-memory database.* Backing up such databases to disk is (relatively speaking) a cheap operation. I've found that it is generally faster to:
A) Create an in-memory database, create a table, insert a million** rows into the given table, and then back-up the entire database to disk
than
B) Insert a million** rows into a table in a disk-bound database.
Obviously, single record insertions/updates/deletions also run faster in-memory. I've had success using JavaDB/Apache Derby for in-memory databases.
*Note that the database need not be embedded in your game server.
**A million may not be an ideal size for this example.
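For illustration, here is what (A) might look like with SQLite's in-memory mode rather than JavaDB/Derby (Connection.backup requires Python 3.7+; the table and row count are arbitrary):

    import sqlite3

    mem = sqlite3.connect(":memory:")
    mem.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
    mem.executemany(
        "INSERT INTO events (payload) VALUES (?)",
        (("row %d" % i,) for i in range(1000000)),
    )
    mem.commit()

    disk = sqlite3.connect("events_backup.db")
    mem.backup(disk)   # one sequential dump of the whole in-memory database
    disk.close()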