So far I have investigated two different ways of persistently tracking player attribute skills in a game. These are mainly conceptual except for the threading option I came up with / found an example for.
The case:
I'm solo developing a web game. It's a geopolitical simulator, but with a little twist in comparison to others out there which I won't reveal.
I'm using a combination of Flask and SQLAlchemy, for which I have written routes and have templates extending a base template dynamically.
I'm currently running it in dev mode locally, with the intention of putting it behind a WSGI server and a reverse proxy like Nginx on a cloud-based Linux VM.
About the player attribute mechanics: a player will submit a POST request which will specify a few bits of information. First we want to know which skill (intelligence, endurance, etc.). Next we need to know which player, but all of this will be generated automatically: we can use Flask-Login's LoginManager to get the current user with our nifty user_loader decorator and function. We can use the user ID it provides to query the rest of it, namely what level the player is. We can specify later the math used to decide the wait time increase, in seconds.
The options:
Option 1:
As suggested by a colleague of mine. Allow the database to manage the timings of the skills. When the user submits the form, we will have created a new table to hold skill upgrade information. We take a note of what time the user submitted the form and also we multiply the current skill level by a factor of X amount of time and we put both pieces of data into the database. Then we create a new process that manages the constant checking of this table. Using timedelta, we can check if the amount of time that has elapsed since the form was submitted is equal to or greater than the time the player must wait until the upgrade is complete.
Option 2:
Import threading and create a class which expects the same information as above, supplied on init, and then simply use time.sleep for X amount of time, then fire the upgrade and kill the thread when it's finished.
I hope this all makes sense. I haven't written either yet because I am undecided about which is the most efficient way around it.
I'm looking for the most scalable solution (even if it's not an option listed here) but one that is also as practical or an improvement on my concept of the skill tracking mechanic.
I'm open to adding another lib to the package but I really would rather not.
I'll expand on my comment a little bit:
In terms of scalability:
What if the upgrade processes become very long? Hours or days?
What if you have a lot of users?
What if people disconnect and reconnect to sessions at different times?
Hopefully it is clear you cannot ensure a robust process with option 2. Threading and waiting will put a continuous and potentially limiting load on a server, and if the server fails all those threads are likely to be lost.
In terms of robustness:
On the other hand if you record all of the information to a database you have the facility to cross check the states of any items and perform upgrade/downgrade actions as deemed necessary by some form of task scheduler. This allows you to ensure that character states are always consistent with what you expect. And you only need one process to scan through the DB periodically and perform actions on all of the open rows flagged for an upgrade.
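As a rough illustration of that single scanning process, here is a minimal sketch using the standard-library sqlite3 module rather than SQLAlchemy. The table names, column names, and the SECONDS_PER_LEVEL factor are all assumptions made up for the example, and completion times are stored as plain epoch seconds to keep the comparison simple.

```python
import sqlite3

SECONDS_PER_LEVEL = 60  # hypothetical factor: wait time grows linearly with level


def init_db(conn):
    # Hypothetical schema: one table for current levels, one for queued upgrades.
    conn.execute(
        "CREATE TABLE player_skill ("
        "player_id INTEGER, skill TEXT, level INTEGER, "
        "PRIMARY KEY (player_id, skill))"
    )
    conn.execute(
        "CREATE TABLE skill_upgrade ("
        "id INTEGER PRIMARY KEY, player_id INTEGER, skill TEXT, "
        "completes_at REAL, done INTEGER DEFAULT 0)"
    )


def queue_upgrade(conn, player_id, skill, now):
    """Record when the upgrade will be complete; nothing waits or sleeps."""
    (level,) = conn.execute(
        "SELECT level FROM player_skill WHERE player_id = ? AND skill = ?",
        (player_id, skill),
    ).fetchone()
    completes_at = now + level * SECONDS_PER_LEVEL
    conn.execute(
        "INSERT INTO skill_upgrade (player_id, skill, completes_at) VALUES (?, ?, ?)",
        (player_id, skill, completes_at),
    )
    conn.commit()
    return completes_at


def apply_due_upgrades(conn, now):
    """One pass of the periodic scan: apply every open upgrade that is due."""
    due = conn.execute(
        "SELECT id, player_id, skill FROM skill_upgrade "
        "WHERE done = 0 AND completes_at <= ?",
        (now,),
    ).fetchall()
    for upgrade_id, player_id, skill in due:
        conn.execute(
            "UPDATE player_skill SET level = level + 1 "
            "WHERE player_id = ? AND skill = ?",
            (player_id, skill),
        )
        conn.execute("UPDATE skill_upgrade SET done = 1 WHERE id = ?", (upgrade_id,))
    conn.commit()
    return len(due)
```

A cron job or a small loop could call apply_due_upgrades(conn, time.time()) every few seconds. Because the state lives in the database, a restart loses nothing.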
You could, if you wanted, also avoid a global task scheduler altogether. When a user performs an activity on the site a little task could run in the background (as a kind of decorator) that checks the upgrade status and if the time is right performs the DB activity, otherwise just passes. But a user would need to be actively in a session to make sure this happens, as opposed to the scheduled task above.
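A minimal sketch of that decorator idea, in plain Python so it runs on its own. The LEVELS and PENDING structures are hypothetical in-memory stand-ins for database tables, and the `now` keyword exists only so the example can be exercised without waiting. In a Flask app the same check could probably live in a before_request hook instead.

```python
import functools
import time

# Hypothetical in-memory stand-ins for database tables.
LEVELS = {("alice", "endurance"): 2}
PENDING = [{"player": "alice", "skill": "endurance", "completes_at": 1500.0}]


def settle_due_upgrades(player, now):
    """Apply and remove every pending upgrade for `player` that is due."""
    still_pending = []
    for item in PENDING:
        if item["player"] == player and item["completes_at"] <= now:
            LEVELS[(player, item["skill"])] += 1
        else:
            still_pending.append(item)
    PENDING[:] = still_pending


def settles_upgrades(view):
    """Decorator: settle the caller's due upgrades before running the view."""
    @functools.wraps(view)
    def wrapper(player, *args, now=None, **kwargs):
        settle_due_upgrades(player, time.time() if now is None else now)
        return view(player, *args, **kwargs)
    return wrapper


@settles_upgrades
def show_profile(player):
    # The view always sees levels that are up to date for this player.
    return {skill: lvl for (name, skill), lvl in LEVELS.items() if name == player}
```

The trade-off is the one described above: upgrades are only settled when the player next does something, which is fine as long as nobody else needs to see the new level before then.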
Related
Hello, I don't think this is the right place for this question but I don't know where else to ask it. I want to make a website and an API for that website using the same SQLAlchemy database. Would just running them at the same time independently be safe, or would this cause corruption from two writes happening at the same time?
SQLA is a Python wrapper for SQL. It is not its own database. If you're running your website (perhaps Flask?) and managing your API from the same script, you can simply use the same reference to your instance of SQLA. Meaning, when you use SQLA to connect to a database and save to a variable, what is really happening is it saves the connection to a variable, and you continually reference that variable, as opposed to the more inefficient method of creating a new connection every time. So when you say
using the same SQLAlchemy database
I believe you are actually referring to the actual underlying database itself, not the SQLA wrapper/connection to it.
If your website and API are not running in the same script (or even if they are, depending on how your API handles simultaneous requests), you may encounter a race condition, which, according to Wikipedia, is defined as:
the condition of an electronics, software, or other system where the system's substantive behavior is dependent on the sequence or timing of other uncontrollable events. It becomes a bug when one or more of the possible behaviors is undesirable.
This may be what you are referring to when you mentioned
would this cause corruption from two write happening at the same time.
To avoid such situations, file-based databases rely on file locking. SQLite, for example, takes a lock on the database file while a connection is writing, and any other connection that tries to write at that moment has to wait or fails with a "database is locked" error. (Simply opening a file in Python, such as by using with open(filename):, does not by itself create a lock; the locking is done by SQLite.) This may be the real issue you might encounter when using two simultaneous connections to a SQLite db.
However, if you are using something like MySQL, where you connect to a SQL server process and NOT a file, your application never touches the database file directly, so it will not hit that file lock, and you may run into that nasty race condition in the following made-up scenario:
Stack Overflow queries the reputation of an account to see if it should be banned due to negative reputation.
AT THE EXACT SAME TIME, someone upvotes an answer made by that account, which would lift it just clear of the account ban threshold.
The outcome is now determined by the speed of execution of these 2 tasks.
If the upvoter has, say, a slow computer, and the "upvote" does not get processed by StackOverflow before the reputation query completes, the account will be banned. However, if there is some lag on Stack Overflow's end, and the upvote processes before the account query finishes, the account will not get banned.
The key concept behind this example is that all of these steps can occur within fractions of a second, and the outcome depends on the speed of execution on both ends.
To address the issue of data corruption: most databases have a system in place that properly orders database reads and writes. However, there are still semantic issues that may arise, such as the example given above.
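One common way to shrink the window for this kind of semantic race is to let the database do the check and the change in a single statement, instead of reading a value into the application and acting on it later. Below is a small sketch with the standard-library sqlite3 module. The account table and BAN_THRESHOLD are made up for the example. It does not remove the dependence on ordering (an upvote that arrives after the ban check still arrives too late), but it does guarantee that the ban decision is never based on a stale read.

```python
import sqlite3

BAN_THRESHOLD = 0  # hypothetical: accounts below this reputation get banned

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, reputation INTEGER, banned INTEGER DEFAULT 0)"
)
conn.execute("INSERT INTO account (id, reputation) VALUES (1, -1)")
conn.commit()


def upvote(conn, account_id):
    # Increment inside SQL rather than read-modify-write in Python,
    # so two concurrent upvotes cannot overwrite each other.
    with conn:
        conn.execute(
            "UPDATE account SET reputation = reputation + 1 WHERE id = ?",
            (account_id,),
        )


def ban_if_negative(conn, account_id):
    # The check and the action happen in one statement, so the ban is
    # decided on the reputation as it is at the moment of the update.
    with conn:
        cur = conn.execute(
            "UPDATE account SET banned = 1 WHERE id = ? AND reputation < ?",
            (account_id, BAN_THRESHOLD),
        )
    return cur.rowcount == 1
```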
Two applications can use the same database as the DB is a separate application that will be accessed by each flask app.
What you are asking can be done and is the methodology used by many large web applications, especially when the API is written in a different framework than the main application.
Since SQL databases are ACID compliant, they have a system in place to queue the multiple read/write requests put to them and perform them in the correct order while ensuring data reliability.
One question to ask, though, is whether it is useful to write two separate applications. For most Flask-only projects the best approach would be to separate the project using blueprints, having a "main" blueprint and an "api" blueprint.
In programming web applications, Django in particular, sometimes we have a set of actions that must all succeed or all fail (in order to ensure a predictable state of some sort). Now obviously, when we are working with the database, we can use transactions.
But in some circumstances, these (all or nothing) constraints are needed outside of a database context
(e.g. If payment is a success, we must send the product activation code or else risk customer complaints, etc)
But let's say on some fateful day, the send_code() function just failed time and again due to some temporary network error (that lasted for 1+ hours).
Should I log the error, and manually fix the problem, e.g. send the mail manually
Should I set up some kind of work queue, where when things fail, they just go back onto the end of the queue for future retry?
What if the logging/queueing systems also fail? (am I worrying too much at this point?)
We use microservices in our company and at least once a month one of our microservices is down for a while. We have a Transaction model for the payment process, with a status for every step that comes before we send a product to the user. If something goes wrong or one of the connected microservices is down, we mark the transaction with status=error and save it to the database. Then we use a cron job to find and finish those processes. Try something to begin with, and if it does not fit your needs, try something else.
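To make the status-plus-cron idea concrete, here is a minimal sketch in plain Python. The Transaction fields, the status names, and the max_attempts cap are assumptions for illustration, not our actual model. In a Django project the same fields would sit on a model and the retry function would be what the cron job invokes.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    order_id: int
    status: str = "paid"  # hypothetical statuses: paid -> sent, or error
    attempts: int = 0


def try_send(tx, send_code):
    """Attempt the non-database step and record the outcome on the row."""
    tx.attempts += 1
    try:
        send_code(tx.order_id)
    except Exception:
        tx.status = "error"  # left for the periodic job to pick up
    else:
        tx.status = "sent"


def retry_failed(transactions, send_code, max_attempts=5):
    """What the cron job would run: retry everything still marked as error."""
    retried = 0
    for tx in transactions:
        if tx.status == "error" and tx.attempts < max_attempts:
            try_send(tx, send_code)
            retried += 1
    return retried
```

Anything that reaches max_attempts stays in the error state, which is the point where logging and a manual fix take over.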
I'm developing a Python application that "talks" to the user, and performs tasks based on what the user says (e.g. User: "Do I have any new Facebook messages?", answer: "Yes, you have 2 new messages. Would you like to see them?"). Functionality like integration with Facebook or Twitter is provided by plugins. Based on predefined parsing rules, my application calls the plugin with the parsed arguments, and uses its response. The application needs to be able to answer multiple queries from different users at the same time (or practically the same time).
Currently, I need to call a function, "Respond", with the user input as argument. This has some disadvantages, however:
i) The application can only "speak when it is spoken to". It can't decide to query Facebook for new messages and tell the user about them without being told to do that.
ii) Having a conversation with multiple users at a time is very hard, because the application can only do one thing at a time: if Alice asks the application to check her Facebook for new messages, Bob can't communicate with the application.
iii) I can't develop (and use) plugins that take a lot of time to complete, e.g. download a movie, because the application isn't able to do anything while the previous task isn't completed.
Multithreading seems like the obvious way to go here, but I'm worried that creating and using 500 threads at a time would dramatically impact performance, so using one thread per query (a query is a statement from the user) doesn't seem like the right option.
What would be the right way to do this? I've read a bit about Twisted, and the "reactor" approach seems quite elegant. However, I'm not sure how to implement something like that in my application.
I didn't really understand what sort of application it's going to be, but I tried to answer your questions:
i) Create a thread that queries, and then sleeps for a while.
ii) Create a thread for each user, and close it when the user is gone.
iii) Create a thread that downloads and then stops.
After all, there aren't going to be 500 threads.
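If the thread count is still a worry, one option is to cap it with a pool instead of starting a thread per query. This is only a sketch with the standard-library concurrent.futures module. The respond function here is a placeholder for the real "Respond" function, and max_workers=8 is an arbitrary choice.

```python
from concurrent.futures import ThreadPoolExecutor


def respond(user, text):
    # Placeholder for the real "Respond" function and its plugin calls.
    return f"{user}: you said {text!r}"


def answer_all(queries, max_workers=8):
    """Answer many (user, text) queries with a bounded number of threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(respond, user, text) for user, text in queries]
        # Results come back in submission order; a slow query only delays
        # the workers it occupies, not the whole application.
        return [future.result() for future in futures]
```

With this shape, Alice's slow Facebook check occupies one worker while Bob's query runs on another, and the total number of threads never exceeds max_workers.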
Our situation is as follows:
We are working on a school project where the intention is that multiple teams walk around in a city with smartphones and play a city game while walking.
As such, we can have 10 active smartphones walking around in the city, all posting their location and requesting data from Google App Engine.
Someone is behind a web browser, watching all these teams walk around and sending them messages etc.
We are using the datastore the google appengine provides to store all the data these teams send and request, to store the messages and retrieve them etc.
However, we soon found out we were at our max limit of reads and writes, so we searched for a solution to be able to retrieve periodic updates (which cost the most reads and writes) without using any of the limited resources Google provides. And obviously, because it's a school project, we don't want to pay for more reads and writes.
Storing this information in global variables seemed an easy and quick solution, which it was... but when we started to truly test we noticed some of our data was missing and then reappearing. This turned out to be because there were so many requests being made to the cloud that a new instance was started, and instances don't keep these global variables persistent.
So our question is:
Can we somehow make sure these global variables are always the same on every running instance of Google App Engine?
OR
Can we limit the number of instances ever running to 1, no matter how many requests are made?
OR
Is there perhaps another way to store this data in a better way, without using the datastore and without using globals?
You should be using memcache. If you use the ndb (new database) library, you can automatically cache the results of queries. Obviously this won't improve your writes much, but it should significantly improve the numbers of reads you can do.
You need to back it with the datastore as data can be ejected from memcache at any time. If you're willing to take the (small) chance of losing updates you could just use memcache. You could do something like store just a message ID in the datastore and have the controller periodically verify that every message ID has a corresponding entry in memcache. If one is missing the controller would need to reenter it.
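The cache-backed-by-datastore pattern can be sketched without any App Engine APIs. Both classes below are hypothetical stand-ins (a counting dictionary for the datastore, a plain dictionary for memcache), meant only to show why reads drop and why an eviction is harmless as long as the datastore holds the real copy.

```python
class CountingStore:
    """Hypothetical stand-in for the datastore that counts reads and writes."""

    def __init__(self):
        self.data = {}
        self.reads = 0
        self.writes = 0

    def get(self, key):
        self.reads += 1
        return self.data.get(key)

    def put(self, key, value):
        self.writes += 1
        self.data[key] = value


class CachedStore:
    """Serve reads from a cache, falling back to the store on a miss."""

    def __init__(self, store):
        self.store = store
        self.cache = {}  # stand-in for memcache; entries may vanish at any time

    def put(self, key, value):
        self.store.put(key, value)  # the store stays the source of truth
        self.cache[key] = value

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        value = self.store.get(key)  # cache miss: one billed read
        if value is not None:
            self.cache[key] = value
        return value

    def evict(self, key):
        # Simulates memcache dropping an entry on its own.
        self.cache.pop(key, None)
```

With this shape, a hundred polls of the same team position cost zero store reads, and an eviction costs exactly one.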
Interesting question. Some bad news first: I don't think there's a better way of storing data; no, you won't be able to stop new instances from spawning; and no, you cannot make separate instances always have the same data.
What you could do is have the instances periodically sync themselves with a master record in the datastore. By choosing the frequency of this intelligently and downloading/uploading the information in one lump, you could limit the number of reads/writes to a level that works for you. This is firmly in kludge territory, though.
Despite finding the quota for just about everything else, I can't find the limits for free read/write so it is possible that they're ludicrously small but the fact that you're hitting them with a mere 10 smartphones raises a red flag to me. Are you certain that the smartphones are being polled (or calling in) at a sensible frequency? It sounds like you might be hammering them unnecessarily.
Consider the Jabber (XMPP) protocol for communication between peers. The free limits for it are quite high.
First, definitely use memcache as Tim Delaney said. That alone will probably solve your problem.
If not, you should consider a push model. The advantage is that your clients won't be asking you for new data all the time, only when something has actually changed. If the update is small enough that you can deliver it in the push message, you won't need to worry about datastore reads on memcache misses, or any other duplicate work, for all those clients: you read the data once when it changes and push it out to everyone.
The first option for push is C2DM (Android) or APN (iOS). These are limited on the amount of data they send and the frequency of updates.
If you want to get fancier you could use XMPP instead. This would let you do more frequent updates with (I believe) bigger payloads but might require more engineering. For a starting point, see Stack Overflow questions about Android and iOS.
Have fun!
I was wondering if it would be a good idea to use callLater in Twisted to keep track of auction endings. It would be a callLater on the order of hundreds of thousands of seconds, though does that matter? Seems like it would be very convenient. But then again it seems like a horrible idea if the server crashes.
Keeping a database of when all the auctions are ending seems like the most secure solution, but checking the whole database each second to see if any auction has ended seems very expensive.
If the server crashes, maybe the server can recreate all the callLater calls from database entries of auction end times. Are there other potential concerns for such a model?
One of the Divmod projects, Axiom, might be applicable here. Axiom is an object database. One of its unexpected, useful features is a persistent scheduling system.
You schedule events using APIs provided by the database. When the events come due, a callback you specified is called. The events persist across process restarts, since they're represented as database objects. Large numbers of scheduled events are supported, because the scheduler only has to keep track of when the next event is going to happen.
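The "only track the next event" idea can be illustrated without Axiom. The sketch below is not Axiom's API; it is a hypothetical heap-based scheduler in plain Python whose contents would be rebuilt from the stored auction end times after a restart. In a Twisted program, a single callLater set for next_due() could then replace one callLater per auction.

```python
import heapq


class Scheduler:
    """Keep events ordered by time so only the earliest one needs watching."""

    def __init__(self, stored_events=()):
        # After a restart, rebuild from persisted (end_time, auction_id) pairs.
        self._heap = list(stored_events)
        heapq.heapify(self._heap)

    def schedule(self, end_time, auction_id):
        heapq.heappush(self._heap, (end_time, auction_id))

    def next_due(self):
        """Time of the earliest pending event, or None if nothing is pending."""
        return self._heap[0][0] if self._heap else None

    def run_due(self, now, on_end):
        """Fire the callback for every event whose time has passed."""
        count = 0
        while self._heap and self._heap[0][0] <= now:
            _, auction_id = heapq.heappop(self._heap)
            on_end(auction_id)
            count += 1
        return count
```

This avoids scanning every auction each second: the work per tick is one comparison against the head of the heap, plus a pop for each auction that actually ended.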
The canonical Divmod site went down some time ago (sadly the company is no longer an operating concern), but the code is all available at http://launchpad.net/divmod.org and the documentation is being slowly rehosted at http://divmod.readthedocs.org/.