Django/Python - Updating the database every second

Django/Python - Updating the database every second - python

I'm working on creating a browser-based game in Django and Python, and I'm trying to come up with a solution to one of the problems I'm having.
Essentially, every second, multiple user variables need to be updated. For example, there's a currency variable that should increase by some amount every second, progressively getting larger as you level-up and all of that jazz.
I feel like it's a bad idea to do this with cronjobs (and from my research, other people think that too), so right now I'm thinking I should just create a thread that loops through all of the users in the database that performs these updates.
Am I on the right track here, or is there a better solution? In Django, how can I start a thread the second the server starts?
I appreciate the insight.

One of the possible solutions would be to use separate daemonized lightweight python script to perform all the in-game business logic and left django be just the frontend to your game. To bind them together you might pick any of high-performance asynchronous messaging library like ZeroMQ (for instance to pass player's actions to that script). This stack would also have a benefit of a frontend being separated and completely agnostic of a backend implementation.

Generally a better solution would be to not update everyone's currency every second. Instead, store the timestamp of the user's most recent transaction, their income rate, and their latest balance. With these three pieces of data, you can calculate their current balance whenever needed.
When the user makes a purchase for example, do the math to calculate what their new balance is. Some pseudocode:
def current_balance(user):
return user.latest_balance + user.income * (now() - user.last_timestamp)
def purchase(user, price):
if price > current_balance(user):
print("You don't have enough cash.")
return
user.balance = current_balance(user) - price
user.last_timestamp = now()
user.save()
print("You just bought a thing!")
With a system like this, your database will only get updated upon user interaction, which will make your system scale oodles better.

Related

Django - How to avoid query the same obj

I apologize for the extremely long message, but I'm new with this and need your knowledge and advice about Python and Django.
Basically, I am developing a small "questions and answers" game.I have a model with all the questions with their codes, etc.
Participants will login (they could be from 5 to 20 participants on each game), and on the screen, they will have an option that says "ASK A QUESTION". All participants can click the button at the same time, so how can I be sure that each user gets a different question?. Obviously I already thought, in placing true / false fields, if the question was already used, but the idea is to avoid duplication.
I come from JAVA, so with synchronized methods are not going to allow the method to be accessed at the same time, avoiding duplicate results. So is there something similar in Python? or is there a way to avoid that duplication?
Thanks a lot!

You could do this using transaction.atomic and a field used as a flag to indicate that a question has been given. (is_asked?)
def myview(request):
with transaction.atomic():
valid_questions = Question.objects.select_for_update()\
.filter(is_asked=False)
# some code here to get a valid question
# and return the question, then saving it as being
# is_asked = True before exiting the atomic block.
What transaction.atomic() guarantees is that the whole interaction is wrapped in a single transaction and select_for_update() locks the table until this transaction is done. That way, a concurrent click waits its turn and the database is in a consistent state before the next lookup fires.

How to count votes live for complex polls

I'm working on a project using Google app engine where I need to have users vote on a poll live, and they only have 15 seconds to submit their vote. I already have the delivery of the options working using Pusher.com, but I'm struggling to think of the right way to go about the voting.
A set of options is generated every 30-60seconds. After that votes are counted and a new set is delivered to the users, the old votes are useless and don't need to be stored. The number of options varies every time, usually around 5 but it could be up to 20 in rare occasions. Here comes the tricky part, there are also 2 sets of sub options which are all different for every main option. These however are secondary and only matter if the particular option won. Also not every main option has them. So a sample set could be this:
Option A
sub options:
- X
- Y
- Z
sub options 2
- F
- G
- H
Option B
sub options 2
- X
- Z
Option C
Option D
sub options
- sub X
- sub Y
- sub Z
At first I thought about using a simple database table but Google app engine has concurrent user limits and it gets expensive to go to the higher tiers where I would be wasting a bunch of resources like storage limit since I don't need to save the results. I need this to be scalable to a couple thousand concurrent users.
From what I read in the Googledocumentation of sharded counters, it seems like they can only be integers, so I can't store an array or string, which would be ideal. (an example of a single vote data would be {'option':'2', 'sub':'0', 'sub2':'1'}) I've been playing around here and the only idea I've come up with is to create an int counter for every possible vote combination but that just seems inefficient, there could often be over a hundred counters. Any idea of how I could set this up? Also there doesn't seem to be any documentation on how to delete a counter after the app is running.
I should also add that I'm a beginner self taught programmer and this is my first time stepping out of my comfort zone of PHP, javascript, and very simple python.
Thank you so much for taking the time to read this.

This may be a naive solution, but you can simply store the votes in memory in when using instances of basic or manual scaling, and push them to the datastore every few seconds, thus not encountering the maximum allowed concurrent updates.

SQLAlchemy(Postgresql) - Race Conditions

We are writing an inventory system and I have some questions about sqlalchemy (postgresql) and transactions/sessions. This is a web app using TG2, not sure this matters but to much info is never a bad.
How can make sure that when changing inventory qty's that i don't run into race conditions. If i understand it correctly if user on is going to decrement inventory on an item to say 0 and user two is also trying to decrement the inventory to 0 then if user 1s session hasn't been committed yet then user two starting inventory number is going to be the same as user one resulting in a race condition when both commit, one overwriting the other instead of having a compound effect.
If i wanted to use postgresql sequence for things like order/invoice numbers how can I get/set next values from sqlalchemy without running into race conditions?
EDIT: I think i found the solution i need to use with_lockmode, using for update or for share. I am going to leave open for more answers or for others to correct me if I am mistaken.
TIA

If two transactions try to set the same value at the same time one of them will fail. The one that loses will need error handling. For your particular example you will want to query for the number of parts and update the number of parts in the same transaction.
There is no race condition on sequence numbers. Save a record that uses a sequence number the DB will automatically assign it.
Edit:
Note as Limscoder points out you need to set the isolation level to Repeatable Read.

Setup the scenario you are talking about and see how your configuration handles it. Just open up two separate connections to test it.
Also read up on FOR UPDATE For Update and also on transaction isolation level Isolation Level

Django Models: Keep track of activity through related models?

I have something of a master table of Persons. Everything in my Django app some relates to one or more People, either directly or through long fk chains. Also, all my models have the standard bookkeeping fields 'created_at' and 'updated_at'. I want to add a field on my Person table called 'last_active_at', mostly for raw sql ordering purposes.
Creating or editing certain related models produces new timestamps for those objects. I need to somehow update Person.'last_active_at' with those values. Functionally, this isn't too hard to accomplish, but I'm concerned about undue stress on the app.
My two greatest causes of concern are that I'm restricted to a real db field--I can't assign a function to the Person table as a #property--and one of these 'activity' models receives and processes new instances from a foreign datasource I have no control over, sporadically receiving a lot of data at once.
My first thought was to add a post_save hook to the 'activity' models. Still seems like my best option, but I know nothing about them, how hard they hit the db, etc.
My second thought was to write some sort of script that goes through the day's activity and updates those models over the night. My employers a 'live'er stream, though.
My third thought was to modify the post_save algo to check if the 'updated_at' is less than half an hour from the Person's 'last_active_at', and not update the person if true.
Are my thoughts tending in a scalable direction? Are there other approaches I should pursue?

It is said that premature optimization is the mother of all problems. You should start with the dumbest implementation (update it every time), and then measure and - if needed - replace it with something more efficient.
First of all, let's put a method to update the last_active_at field on Person. That way, all the updating logic itself is concentrated here, and we can easily modify it later.
The signals are quite easy to use : it's just about declaring a function and registering it as a receiver, and it will be ran each time the signal is emitted. See the documentation for the full explanation, but here is what it might look like :
from django.db.models.signals import post_save
from django.dispatch import receiver
#receiver(post_save, sender=RelatedModel)
def my_handler(sender, **kwargs):
# sender is the object being saved
person = # Person to be updated
person.update_activity()
As for the updating itself, start with the dumbest way to do it.
def update_activity(self):
self.last_active_at = now()
Then measure and decide if it's a problem or not. If it's a problem, some of the things you can do are :
Check if the previous update is recent before updating again. Might be useless if a read to you database is not faster than a write. Not a problem if you use a cache.
Write it down somewhere for a deferred process to update later. No need to be daily : if the problem is that you have 100 updates per seconds, you can just have a script update the database every 10 seconds, or every minutes. You can probably find a good performance/uptodatiness trade-off using this technique.
These are just some though based on what you proposed, but the right choice depends on the kind of figures you have. Determine what kind of load you'll have, what kind of reaction time is needed for that field, and experiment.

Object oriented design for an investment/stock and options portfolio in Python

I'm a beginner/intermediate Python programmer but I haven't written an application, just scripts. I don't currently use a lot of object oriented design, so I would like this project to help build my OOD skills. The problem is, I don't know where to start from a design perspective (I know how to create the objects and all that stuff). For what it's worth, I'm also self taught, no formal CS education.
I'd like to try writing a program to keep track of a portfolio stock/options positions.
I have a rough idea about what would make good object candidates (Portfolio, Stock, Option, etc.) and methods (Buy, Sell, UpdateData, etc.).
A long position would buy-to-open, and sell-to-close while a short position has a sell-to-open and buy-to-close.
portfolio.PlaceOrder(type="BUY", symbol="ABC", date="01/02/2009", price=50.00, qty=100)
portfolio.PlaceOrder(type="SELL", symbol="ABC", date="12/31/2009", price=100.00, qty=25)
portfolio.PlaceOrder(type="SELLSHORT", symbol="XYZ", date="1/2/2009", price=30.00, qty=50)
portfolio.PlaceOrder(type="BUY", symbol="XYZ", date="2/1/2009", price=10.00, qty=50)
Then, once this method is called how do I store the information? At first I thought I would have a Position object with attributes like Symbol, OpenDate, OpenPrice, etc. but thinking about updating the position to account for sales becomes tricky because buys and sells happen at different times and amounts.
Buy 100 shares to open, 1 time, 1 price. Sell 4 different times, 4 different prices.
Buy 100 shares. Sell 1 share per day, for 100 days.
Buy 4 different times, 4 different prices. Sell entire position at 1 time, 1 price.
A possible solution would be to create an object for each share of stock, this way each share would have a different dates and prices. Would this be too much overhead? The portfolio could have thousands or millions of little Share objects. If you wanted to find out the total market value of a position you'd need something like:
sum([trade.last_price for trade in portfolio.positions if trade.symbol == "ABC"])
If you had a position object the calculation would be simple:
position.last * position.qty
Thanks in advance for the help. Looking at other posts it's apparent SO is for "help" not to "write your program for you". I feel that I just need some direction, pointing down the right path.
ADDITIONAL INFO UPON REFLECTION
The Purpose
The program would keep track of all positions, both open and closed; with the ability to see a detailed profit and loss.
When I think about detailed P&L I want to see...
- all the open dates (and closed dates)
- time held
- open price (closed date)
- P&L since open
- P&L per day
#Senderle
I think perhaps you're taking the "object" metaphor too literally, and so are trying to make a share, which seems very object-like in some ways, into an object in the programming sense of the word. If so, that's a mistake, which is what I take to be juxtapose's point.
This is my mistake. Thinking about "objects" a share object seems natural candidate. It's only until there may be millions that the idea seems crazy. I'll have some free coding time this weekend and will try creating an object with a quantity.

There are two basic precepts you should keep in mind when designing such a system:
Eliminate redundancy from your data. No redundancy insures integrity.
Keep all the data you need to answer any inquiry, at the lowest level of detail.
Based on these precepts, my suggestion is to maintain a Transaction Log file. Each transaction represents a change of state of some kind, and all the pertinent facts about it: when, what, buy/sell, how many, how much, etc. Each transaction would be represented by a record (a namedtuple is useful here) in a flat file. A years worth (or even 5 or 10 years) of transactions should easily fit in a memory resident list. You can then create functions to select, sort and summarize whatever information you need from this list, and being memory resident, it will be amazingly fast, much faster than a SQL database.
When and if the Transaction Log becomes too large or too slow, you can compute the state of all your positions as of a particular date (like year-end), use that for the initial state for the following period, and archive your old log file to disc.
You may want some auxiliary information about your holdings such as value/price on any particular date, so you can plot value vs. time for any or all holdings (There are on-line sources for this type of information, yahoo finance for one.) A master database containing static information about each of your holdings would also be useful.
I know this doesn't sound very "object oriented", but OO design could be useful to hide the detailed workings of the system in a TransLog object with methods to save/restore the data to/from disc (save/open methods), enter/change/delete a transaction; and additional methods to process the data into meaningful information displays.
First write the API with a command line interface. When this is working to your satisfaction, then you can go on to creating a GUI front end if you wish.
Good luck and have fun!

Avoid objects. Object oriented design is flawed. Think about your program as a collection of behaviors that operate on data (lists and dictionaries). Then group your related behaviors as functions in a module.
Each function should have clear input and outputs. Store your data globally in each module.
Why do it without objects? Because it maps closer to the problem space. Object oriented programming creates too much indirection to solve a problem. Unnecessary indirection causes software bloat and bugs.
A possible solution would be to create an object for each share of stock, this way each share would have a different dates and prices. Would this be too much overhead? The portfolio could have thousands or millions of little Share objects. If you wanted to find out the total market value of a position you'd need something like:
Yes it would be too much overhead. The solution here is you would store the data in a database. Finding the total market value of a position would be done in SQL unless you use a NOSQL scheme.
Don't try to design for all possible future outcomes. Just make your program work that way it needs to work now.

I think I'd separate it into
holdings (what you currently own or owe of each symbol)
orders (simple demands to buy or sell a single symbol at a single time)
trades (collections of orders)
This makes it really easy to get a current value, queue orders, and build more complex orders, and maps easily into data objects with a database behind them.

To answer your question: You appear to have a fairly clear idea of your data model already. But it looks to me like you need to think more about what you want this program to do. Will it keep track of changes in stock prices? Place orders, or suggest orders to be placed? Or will it simply keep track of the orders you've placed? Each of these uses may call for different strategies.
That said, I don't see why you would ever need to have an object for every share; I don't understand the reasoning behind that strategy. Even if you want to be able to track your order history in great detail, you could just store aggregate data, as in "x shares at y dollars per share, on date z".
It would make more sense to have a position object (or holding object, in Hugh's terminology) -- one per stock, perhaps with an .order_history attribute, if you really need a detailed history of your holdings in that stock. And yes, a database would definitely be useful for this kind of thing.
To wax philosophical for a moment: I think perhaps you're taking the "object" metaphor too literally, and so are trying to make a share, which seems very object-like in some ways, into an object in the programming sense of the word. If so, that's a mistake, which is what I take to be juxtapose's point.
I disagree with him that object oriented design is flawed -- that's a pretty bold pronouncement! -- but his answer is right insofar as an "object" (a.k.a. a class instance) is almost identical to a module**. It's a collection of related functions linked to some shared state. In a class instance, the state is shared via self or this, while in a module, it's shared through the global namespace.
For most purposes, the only major difference between a class instance and a module is that there can be many class instances, each one with its own independent state, while there can be only one module instance. (There are other differences, of course, but most of the time they involve technical matters that aren't very important for learning OOD.) That means that you can think about objects in a way similar to the way you think about modules, and that's a useful approach here.
**In many compiled languages, the file that results when you compile a module is called an "object" file. I think that's where the "object" metaphor actually comes from. (I don't have any real evidence of that! So anyone who knows better, feel free to correct me.) The ubiquitous toy examples of OOD that one sees -- car.drive(mph=50); car.stop(); car.refuel(unleaded, regular) -- I believe are back-formations that can confuse the concept a bit.

I would love to hear with what you came up with. I am ~ 4 months (part time) into creating an Order Handler and although its mostly complete, I still have the same questions as you as I'd like it to be made properly.
Currently, I save two files
A "Strategy Hit Log" where each buy/sell signal that comes from any strategy script is saved. For example: when the buy_at_yesterdays_close_price.py strategy is triggered, it saved that buy request in this file, and passes the request to the Order Handler
An "Order Log" which is a single DataFrame - this file fits the purpose you were focusing on.
Each request from a strategy pertains to a single underlying security (for example, AAPL stock) and creates an Order which is saved as a row in the DataFrame, containing columns for the Ticker, the Strategy name that spawned this Order as well as Suborder and Broker Suborder columns (explained below).
Each Order has a list of Suborders (dicts) stored in the Suborder column. For example: if you are bullish on AAPL, the suborders could be:
[
{'security': 'equity', 'quantity':10},
{'security': 'option call', 'quantity':10}
]
Each Order also has a list of Broker Suborders (dicts) stored in the Broker Suborder column. Each Broker Suborder is a request to the Broker to buy or sell a security, and is indexed with the "order ID" that the Broker provides for that request. Every new request to the Broker is a new Broker Suborder, and canceling that Broker Suborder is recorded in that dict. To record modifications to Broker Suborders, you cancel the old Broker Suborder and send and record a new one (its the same commission using IBKR).
Improvements
List of Classes instead of DataFrame: I think it'd be much more pythonic to save each Order as an instance of an Order_Class (instead of a row of a DataFrame), which has Suborder and Broker_Suborder attributes, both of which are also instances of Suborder_Class and Broker_Suborder_Class.
My current question is whether saving a list of classes as my record of all open and closed Orders is pythonic or silly.
Visualization Considerations: It seems like Orders should be saved in table form for easier viewing, but maybe it is better to save them in this "list of class instances" form and use a function to tabularize them at the time of viewing? Any input by anyone would be greatly appreciated. I'd like to move away from this and start playing with ML, but I don't want to leave the Order Handler unfinished.
Is it all crap?: should each Broker Suborder (buy/sell request to a Broker) be attached to an Order (which is just a specific strategy request from a strategy script) or should all Broker Suborders be recorded in chronological order and simply have a reference to the strategy-order (Order) that spawned the Broker Suborder? Idk... but I would love everyone's input.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.