Calculate worth of inventory using Python

I am trying to create a website using Django and Python 3 where users can add their MTG cards to their inventory. I have two tables: inventory_cards (12,000 rows), which holds the users' inventory data, and pricing_cards (65,000 rows), which holds the pricing data for each card.
The inventory_cards table contains user_id, card_id, nonfoil, foil.
The pricing_cards table contains card_id, date, nonfoil, foil.
I am trying to work out the total worth of a user's inventory; however, my code is either slow or super heavy on the database.
Method 1:
This method takes around 6 minutes but only hits the database twice. With this data being displayed on a website, it is not viable to have the page take over 6 minutes to load.
from decimal import Decimal

inventory_value = Decimal(0)
user_inventory = list(inventory_cards.objects.filter(user_id=request.user.id).values_list('card_id', 'nonfoil', 'foil'))
pricing = list(pricing_cards.objects.order_by('card_id', '-date').distinct('card_id').values_list('card_id', 'nonfoil', 'foil'))  # latest price row per card
combined_list = [x + y[1:] for x in user_inventory for y in pricing if x[0] == y[0]]
for i in combined_list:
    inventory_value += (Decimal(i[1]) * Decimal(i[3])) + (Decimal(i[2]) * Decimal(i[4]))
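The real cost in method 1 is the nested comprehension, which compares each of the ~12,000 inventory rows against all ~65,000 pricing rows in Python. Indexing the prices by card_id first makes each lookup constant-time; a minimal sketch reusing the two querysets above:

# build a card_id -> (nonfoil price, foil price) map once, then do O(1) lookups
price_map = {card_id: (nonfoil, foil) for card_id, nonfoil, foil in pricing}
inventory_value = Decimal(0)
for card_id, nonfoil_qty, foil_qty in user_inventory:
    nonfoil_price, foil_price = price_map.get(card_id, (0, 0))
    inventory_value += (Decimal(nonfoil_qty) * Decimal(nonfoil_price)
                        + Decimal(foil_qty) * Decimal(foil_price))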
Method 2:
This method takes around 15 seconds, but the database shows over 25,000 transactions. This is still too long for the page to load, and I can imagine the strain on the database with multiple users.
inventory_value = 0
user_inventory = inventory_cards.objects.filter(user_id=request.user.id).values('card_id', 'nonfoil', 'foil')
for i in user_inventory:
    # two pricing queries per inventory row
    nonfoil_value = i['nonfoil'] * float(pricing_cards.objects.filter(card_id=i['card_id']).values_list('nonfoil').get()[0])
    foil_value = i['foil'] * float(pricing_cards.objects.filter(card_id=i['card_id']).values_list('foil').get()[0])
    inventory_value = inventory_value + (nonfoil_value + foil_value)
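For reference, each iteration of that loop issues two pricing queries, which is where the roughly 25,000 transactions come from. One way to collapse this into a single pricing query, sketched against the same models (and ignoring the date dimension, just as method 2 does; with multiple dated rows per card you would first reduce to the latest, as method 1's distinct() does):

user_inventory = inventory_cards.objects.filter(user_id=request.user.id).values('card_id', 'nonfoil', 'foil')
# one query: only the pricing rows for cards this user actually owns
prices = {p['card_id']: p for p in pricing_cards.objects.filter(
    card_id__in=[i['card_id'] for i in user_inventory]).values('card_id', 'nonfoil', 'foil')}
inventory_value = 0
for i in user_inventory:
    p = prices[i['card_id']]
    inventory_value += i['nonfoil'] * float(p['nonfoil']) + i['foil'] * float(p['foil'])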
Is there a better way to perform this calculation? Is there a Python package I can install that performs this calculation better?

bb1950328 pointed me in the right direction. The code below runs a raw SQL query without going through the Django models. The webpage loads almost instantly.
from django.db import connections

with connections['default'].cursor() as cursor:
    cursor.execute("SELECT SUM((inventory_cards.nonfoil * pricing_cards.nonfoil) + (inventory_cards.foil * pricing_cards.foil)) FROM inventory_cards INNER JOIN pricing_cards ON pricing_cards.card_id = inventory_cards.card_id WHERE inventory_cards.user_id = %s", [request.user.id])
    inventory_value = cursor.fetchone()[0]
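For reference, a similar single-query aggregate can often be expressed through the ORM as well. This is only a sketch: it assumes (hypothetically) a ForeignKey named price from inventory_cards to pricing_cards, which the question does not show:

from django.db.models import DecimalField, ExpressionWrapper, F, Sum

# hypothetical: requires a ForeignKey `price` from inventory_cards to pricing_cards
inventory_value = inventory_cards.objects.filter(user_id=request.user.id).aggregate(
    total=Sum(ExpressionWrapper(
        F('nonfoil') * F('price__nonfoil') + F('foil') * F('price__foil'),
        output_field=DecimalField()))
)['total']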

Related

RFID MySQL query to Python

I have an old PHP system that uses RFID. We have a SQL query which converts the RFID codes. I am changing the PHP system to Odoo 11. Here is the SQL query:
UPDATE members set member_id=100000*floor(fob_id/65536)+(fob_id%65536)
It is basically the fob ID divided by 65536 and rounded down, multiplied by 100000, plus the remainder of the division.
But later they discovered that if a fob number ended with a 7 or 9, I think, it was being calculated wrong, so the floor part always rounds down and the % adds on the remainder after dividing. (I think that is how it worked.)
How can I get the same result in Python as the above query? (Python version 3.5)
Here is my code
@api.onchange('barcode2')
def _onchange_barcode2(self):
    self.barcode = self.barcode2
    a = self.barcode2 * 100000
    fob = (math.floor(a)) / 65536 + self.barcode2 % 65536
    self.memberid = fob
I have an RFID number of 0005225306, which should become 07947962, but with the above code I get 8,021,146.
Am I using the % operator correctly? If I input 0005225306 into my old PHP system, which uses MySQL UPDATE members set member_id=100000*floor(fob_id/65536)+(fob_id%65536), it gives the correct value of 07947962.
Any idea why my Python code is not getting the same value?
I have no idea about your barcode* fields and their data types, but take care when parsing them. Also, the order of those calculations is important:
@api.onchange('barcode2')
def _onchange_barcode2(self):
    # let's suppose the barcode fields are strings
    # because of leading zeros
    self.barcode = self.barcode2
    try:
        barcode_int = int(self.barcode2)
        fob = math.floor(barcode_int / 65536)
        fob = int((fob * 100000) + barcode_int % 65536)
        # let's suppose memberid is a char field
        self.memberid = '{:08d}'.format(fob)
    except (TypeError, ValueError):
        # handle parse problems, e.g. raise an Odoo Warning
        pass
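A quick sanity check of that order of operations, using the fob ID from the question:

import math

fob_id = 5225306
member_id = math.floor(fob_id / 65536) * 100000 + fob_id % 65536
print('{:08d}'.format(member_id))  # 07947962, matching the MySQL result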

Find all users x miles away in Flask + GeoAlchemy with ORM

I have created the following SQL query to find all users within a mile, and it seems to work fine:
SELECT * FROM user
WHERE ST_DWithin(
    user.location,
    ST_MakePoint(-2.242631, 53.480759)::geography,
    1609
);
However, I want to convert this into a Flask/SQLAlchemy/GeoAlchemy query.
Try something like this:
from geoalchemy2 import Geography
from sqlalchemy import cast, func
DISTANCE = 100  # 100 meters
db.session.query(User).filter(func.ST_DWithin(User.location, cast(func.ST_SetSRID(func.ST_MakePoint(-2.242631, 53.480759), 4326), Geography), DISTANCE)).all()
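Note that 4326 is the WGS84 SRID implied by the ::geography cast in the original SQL; the 1609 there was the search radius in metres (about one mile), not an SRID, so set DISTANCE = 1609 to reproduce the original query.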

Performance SQLAlchemy and or

I use the following SQLAlchemy code to retrieve some data from a database:
q = session.query(hd_tbl).\
    join(dt_tbl, hd_tbl.c['data_type'] == dt_tbl.c['ID']).\
    filter(or_(*[and_(hd_tbl.c['object_id'] == get_id(row['object']),
                      hd_tbl.c['data_type'] == get_id(row['type']),
                      hd_tbl.c['data_provider'] == get_id(row['provider']),
                      hd_tbl.c['data_account'] == get_id(row['account']))
                 for index, row in data.iterrows()])).\
    with_entities(hd_tbl.c['ID'], hd_tbl.c['object_id'],
                  hd_tbl.c['data_type'], hd_tbl.c['data_provider'],
                  hd_tbl.c['data_account'], dt_tbl.c['value_type'])
where hd_tbl and dt_tbl are two tables in a SQL database, and data is a pandas DataFrame typically containing around 1k-9k entries. hd_tbl currently contains around 90k rows.
The execution time seems to grow exponentially with the length of data. The corresponding SQL statement (generated by SQLAlchemy) looks as follows:
SELECT data_header.`ID`, data_header.object_id, data_header.data_type, data_header.data_provider, data_header.data_account, basedata_data_type.value_type
FROM data_header INNER JOIN basedata_data_type ON data_header.data_type = basedata_data_type.`ID`
WHERE data_header.object_id = %s AND data_header.data_type = %s AND data_header.data_provider = %s AND data_header.data_account = %s OR
data_header.object_id = %s AND data_header.data_type = %s AND data_header.data_provider = %s AND data_header.data_account = %s OR
...
data_header.object_id = %s AND data_header.data_type = %s AND data_header.data_provider = %s AND data_header.data_account = %s
The tables and columns are fully indexed, and performance is not satisfying. Currently it is way faster to read all the data of hd_tbl and dt_tbl into memory and merge them with pandas' merge function. However, this seems suboptimal. Does anyone have an idea how to improve the SQLAlchemy call?
EDIT:
I was able to improve performance significantly by using SQLAlchemy's tuple_ in the following way:
header_tuples = [tuple(int(y) for y in x) for x in data_as_int.values]
q = session.query(hd_tbl). \
    join(dt_tbl, hd_tbl.c['data_type'] == dt_tbl.c['ID']). \
    filter(tuple_(hd_tbl.c['object_id'], hd_tbl.c['data_type'],
                  hd_tbl.c['data_provider'],
                  hd_tbl.c['data_account']).in_(header_tuples)). \
    with_entities(hd_tbl.c['ID'], hd_tbl.c['object_id'],
                  hd_tbl.c['data_type'], hd_tbl.c['data_provider'],
                  hd_tbl.c['data_account'], dt_tbl.c['value_type'])
with the corresponding query:
SELECT data_header.`ID`, data_header.object_id, data_header.data_type, data_header.data_provider, data_header.data_account, basedata_data_type.value_type
FROM data_header INNER JOIN basedata_data_type ON data_header.data_type = basedata_data_type.`ID`
WHERE (data_header.object_id, data_header.data_type, data_header.data_provider, data_header.data_account) IN ((%(param_1)s, %(param_2)s, %(param_3)s, %(param_4)s), (%(param_5)s, ...))
I'd recommend you create a composite index on the fields object_id, data_type, data_provider, and data_account, in the same order in which they appear in your WHERE condition. It may speed up your requests at the cost of some disk space.
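With SQLAlchemy's Table metadata, that could look roughly like this (the index name is hypothetical, and an engine bound to the database is assumed):

from sqlalchemy import Index

# columns in the same order as the WHERE condition / tuple_ filter above
composite_idx = Index('ix_data_header_lookup',
                      hd_tbl.c['object_id'], hd_tbl.c['data_type'],
                      hd_tbl.c['data_provider'], hd_tbl.c['data_account'])
composite_idx.create(bind=engine)  # assumes `engine` is connected to the database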
Also, you may use several consecutive small SQL requests instead of one large query with a complex OR condition. Accumulate the extracted data on the application side or, if the amount is large enough, in fast temporary storage (a temporary table, NoSQL, etc.).
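A hedged sketch of that batching idea against the tuple_ query from the EDIT above, where q_base stands for the same query without its IN filter (the name and batch size are arbitrary):

BATCH = 1000  # arbitrary chunk size
results = []
for start in range(0, len(header_tuples), BATCH):
    chunk = header_tuples[start:start + BATCH]
    results.extend(q_base.filter(
        tuple_(hd_tbl.c['object_id'], hd_tbl.c['data_type'],
               hd_tbl.c['data_provider'],
               hd_tbl.c['data_account']).in_(chunk)).all())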
In addition, you may check the MySQL configuration and increase the values related to memory per thread, per request, etc. It is a good idea to check whether your composite index fits into the available memory; if it does not, it will be of little use.
I guess DB tuning may help a lot here. Otherwise you may analyze your application's architecture to get more significant results.

sqlalchemy query using joinedload exponentially slower with each new filter clause

I have this SQLAlchemy query:
query = session.query(Store).options(joinedload('salesmen').
                                     joinedload('comissions').
                                     joinedload('orders')).\
    filter(Store.store_code.in_(selected_stores))
stores = query.all()
for store in stores:
    for salesman in store.salesmen:
        for comission in salesman.comissions:
            pass  # generate html for comissions for each salesman in each store
# print html document using PySide
This was working perfectly; however, I then added two new filter clauses:
filter(Comissions.payment_status == 0).\
filter(Order.order_date <= self.dateEdit.date().toPython())
If I add just the first filter, the application hangs for a couple of seconds; if I add both, the application hangs indefinitely.
What am I doing wrong here? How do I make this query fast?
Thank you for your help
EDIT: This is the SQL generated. Unfortunately the class and variable names are in Portuguese; I translated them to English so it would be easier to understand,
so Loja = Store, Vendedores = Salesmen, Pedido = Order, Comissao = Commission.
Query generated:
SELECT "Loja"."CodLoja", "Vendedores_1"."CodVendedor", "Vendedores_1"."NomeVendedor", "Vendedores_1"."CodLoja", "Vendedores_1"."PercentualComissao",
"Vendedores_1"."Ativo", "Comissao_1"."CodComissao", "Comissao_1"."CodVendedor", "Comissao_1"."CodPedido",
"Pedidos_1"."CodPedido", "Pedidos_1"."CodLoja", "Pedidos_1"."CodCliente", "Pedidos_1"."NomeCliente", "Pedidos_1"."EnderecoCliente", "Pedidos_1"."BairroCliente",
"Pedidos_1"."CidadeCliente", "Pedidos_1"."UFCliente", "Pedidos_1"."CEPCliente", "Pedidos_1"."FoneCliente", "Pedidos_1"."Fone2Cliente", "Pedidos_1"."PontoReferenciaCliente",
"Pedidos_1"."DataPedido", "Pedidos_1"."ValorProdutos", "Pedidos_1"."ValorCreditoTroca",
"Pedidos_1"."ValorTotalDoPedido", "Pedidos_1"."Situacao", "Pedidos_1"."Vendeu_Teflon", "Pedidos_1"."ValorTotalTeflon",
"Pedidos_1"."DataVenda", "Pedidos_1"."CodVendedor", "Pedidos_1"."TipoVenda", "Comissao_1"."Valor", "Comissao_1"."DataPagamento", "Comissao_1"."StatusPagamento"
FROM "Comissao", "Pedidos", "Loja" LEFT OUTER JOIN "Vendedores" AS "Vendedores_1" ON "Loja"."CodLoja" = "Vendedores_1"."CodLoja"
LEFT OUTER JOIN "Comissao" AS "Comissao_1" ON "Vendedores_1"."CodVendedor" = "Comissao_1"."CodVendedor" LEFT OUTER JOIN "Pedidos" AS "Pedidos_1" ON "Pedidos_1"."CodPedido" = "Comissao_1"."CodPedido"
WHERE "Loja"."CodLoja" IN (:CodLoja_1) AND "Comissao"."StatusPagamento" = :StatusPagamento_1 AND "Pedidos"."DataPedido" <= :DataPedido_1
Your FROM clause is producing a Cartesian product: Comissao and Pedidos each appear twice, once un-aliased for filtering the result and once aliased for eagerly loading the relationships.
To stop this, use contains_eager instead of joinedload in your options. This makes SQLAlchemy look for the related attributes in the query's columns instead of constructing an extra join. You will also need to join explicitly along the relationship chain in your query, e.g.:
query = session.query(Store)\
    .join(Store.salesmen)\
    .join('comissions')\
    .join('orders')\
    .options(contains_eager('salesmen').
             contains_eager('comissions').
             contains_eager('orders'))\
    .filter(Store.store_code.in_(selected_stores))\
    .filter(Comissions.payment_status == 0)\
    .filter(Order.order_date <= self.dateEdit.date().toPython())
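One side effect worth knowing: with contains_eager, the filters also shape the loaded collections, so each salesman.comissions will only contain the unpaid commissions with qualifying order dates rather than all of them, which is usually exactly what a report like this wants.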

database field value that matches every query

I would like to insert records into a SQLite database with field values such that any query specifying a value for that field does not disqualify the record.
Make   Model    Engine   Parameter
Ford   *        *        1
Ford   Taurus   *        2
Ford   Escape   *        3
So a query = (database.table.Make == 'Ford') & (database.table.Model == 'Taurus') would return the first two records.
EDIT: thanks to woot, I decided to use the following: (database.table.Make.belongs('Ford','')) & (database.table.Model.belongs('Taurus','')), which is the syntax for the IN operator in web2py.
Are you looking for something like this? It won't perform well due to the ORs if you have a lot of rows.
SELECT *
FROM Cars
WHERE ( Cars.Make = 'Ford' OR Cars.Make = '*' )
AND ( Cars.Model = 'Taurus' OR Cars.Model = '*' )
Here is a SQL Fiddle example.
If you meant to use NULL, you can just replace that and replace the OR condition with OR Cars.Make IS NULL, etc.
Or to make it maybe a little less verbose:
SELECT *
FROM Cars
WHERE Cars.Make IN ('Ford','*')
AND Cars.Model IN ('Taurus','*')
But you wouldn't be able to use NULL in this case and would have to use the * token.
SQL Fiddle
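For a self-contained check of that pattern, here is a small sqlite3 session built from the example table in the question:

import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE Cars (Make TEXT, Model TEXT, Engine TEXT, Parameter INT)")
conn.executemany("INSERT INTO Cars VALUES (?, ?, ?, ?)",
                 [('Ford', '*', '*', 1),
                  ('Ford', 'Taurus', '*', 2),
                  ('Ford', 'Escape', '*', 3)])
rows = conn.execute("SELECT * FROM Cars "
                    "WHERE Make IN ('Ford', '*') AND Model IN ('Taurus', '*')").fetchall()
print(rows)  # the first two records, as expected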
