How to query conditional attributes in DynamoDB - python

I'm trying to query all the Food values in the "Categories" attribute along with "review_count" attribute values of at least 100. This is my first time scanning tables in DynamoDB through Python, and I need to use the table.scan function. This is what I have tried so far.
resp = table.scan(FilterExpression='(categories = cat1) AND' + '(review_count >= 100)',
                  ExpressionAttributeValues={
                      ':cat1': 'Food',
                  })
Any help would be greatly appreciated. Thanks

Assuming the table name is test:
A FilterExpression can't contain constants; it should only reference table attributes like categories and review_count, plus placeholders like :cat1 and :rc. So the literal 100 can be replaced with a placeholder :rc.
All placeholders must start with :, so cat1 should be :cat1.
table = dynamodb.Table('test')
response = table.scan(
    FilterExpression='categories = :cat1 AND review_count >= :rc',
    ExpressionAttributeValues={
        ':cat1': 'Food',
        ':rc': 100,
    }
)
data = response['Items']
data = response['Items']
An important point to note on scan, from the documentation:
A single Scan operation reads up to the maximum number of items set (if using the Limit parameter) or a maximum of 1 MB of data, and then applies any filtering to the results using FilterExpression.
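Because of that 1 MB page limit, a scan that should see every item has to follow LastEvaluatedKey across pages. A minimal sketch of that loop (the pagination protocol is standard boto3 Scan behaviour; the helper name scan_all is my own):

```python
# Sketch: keep scanning while DynamoDB reports more data to read.
# `table` is any object with a boto3-style .scan(**kwargs) method,
# e.g. dynamodb.Table('test').
def scan_all(table, **scan_kwargs):
    items = []
    response = table.scan(**scan_kwargs)
    items.extend(response.get('Items', []))
    while 'LastEvaluatedKey' in response:
        # resume the scan from where the previous page stopped
        response = table.scan(ExclusiveStartKey=response['LastEvaluatedKey'],
                              **scan_kwargs)
        items.extend(response.get('Items', []))
    return items
```

The same FilterExpression and ExpressionAttributeValues arguments shown above can be passed through as scan_kwargs; the filter is applied per page, so a page can legitimately come back with zero items and still have more data behind it.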

Related

How to use variable column name in filter in Django ORM?

I have two tables, BloodBank(id, name, phone, address) and BloodStock(id, a_pos, b_pos, a_neg, b_neg, bloodbank_id). I want to fetch all the columns from the two tables where a variable column name (say bloodgroup), holding values like a_pos or a_neg, is greater than 0. How can I write the ORM for this?
SQL query is written like this to get the required results.
sql="select * from public.bloodbank_bloodbank as bb, public.bloodbank_bloodstock as bs where bs."+blood+">0 and bb.id=bs.bloodbank_id order by bs."+blood+" desc;"
cursor = connection.cursor()
cursor.execute(sql)
bloodbanks = cursor.fetchall()
You could be more specific in your questions, but I believe you have a variable called blood which contains the string name of the column and that the columns a_pos, b_pos, etc. are numeric.
You can use a dictionary to create keyword arguments from strings:
filter_dict = {'bloodstock__' + blood + '__gt': 0}
bloodbanks = Bloodbank.objects.filter(**filter_dict)
This will get you Bloodbank objects that have a related bloodstock with a greater than zero value in the bloodgroup represented by the blood variable.
Note that the way I have written this, you don't get the bloodstock columns selected, and you may get duplicate bloodbanks. If you want to eliminate duplicate bloodbanks you can add .distinct() to your query. The bloodstocks are available for each bloodbank instance using .bloodstock_set.all().
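The **filter_dict unpacking itself is plain Python, independent of Django: a dict's string keys become keyword-argument names. A minimal illustration with a stub standing in for QuerySet.filter:

```python
# filter_stub is a hypothetical stand-in for QuerySet.filter;
# it just reports the keyword arguments it received.
def filter_stub(**kwargs):
    return kwargs

blood = 'a_pos'  # the variable column name from the question
filter_dict = {'bloodstock__' + blood + '__gt': 0}
received = filter_stub(**filter_dict)
```

Here received is {'bloodstock__a_pos__gt': 0}, exactly as if you had written filter_stub(bloodstock__a_pos__gt=0) literally.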
The ORM will generate SQL using a join. Alternatively, you can do an EXISTS in the where clause and no join.
from django.db.models import Exists, OuterRef
filter_dict = {blood + '__gt': 0}
exists = Exists(Bloodstock.objects.filter(
    bloodbank_id=OuterRef('id'),
    **filter_dict
))
bloodbanks = Bloodbank.objects.filter(exists)
There will be no need for a .distinct() in this case.

Assign elements while you loop inside a function or with append automatically

I was wondering if you could help me with my problem.
I am trying to create a function that returns data frames based on a SQL query.
My function is:
def request_data(query, conn):
    # pd.read_sql_query executes the query itself, so no cursor is needed
    data = pd.read_sql_query(query, conn)
    return data
When I apply my function with the append method, I get what I expect as a result:
Tables_to_initialize = ['sales', 'stores', 'prices']
Tables = []
for x in Tables_to_initialize:
    Tables.append(request_data("SELECT * FROM {i} WHERE d_cr_bdd = "
                               "(SELECT MAX(d_cr_bdd) FROM {i});".format(i=x), conn))
But,
Tables is a list that contains all the data frames returned by my query. What I really want is to assign every element of the list to its name; for example,
Tables_to_initialize[0] = 'sales'
and I want Tables[0] to be sales as an object (data frame).
Is there any method to assign the objects inside the function, or with append automatically? Or any other solution?
I really appreciate your help
Best regards,
To get a list of data frames for the given queries:
tables = [request_data("SELECT * FROM {i} WHERE d_cr_bdd = "
                       "(SELECT MAX(d_cr_bdd) FROM {i});".format(i=i), conn)
          for i in Tables_to_initialize]
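Since the goal was to reach each frame by its table name rather than by list position, a dict comprehension keyed by name may fit better. A sketch, with a stub in place of the question's request_data (the real one returns a DataFrame from pd.read_sql_query):

```python
# Stub for illustration only: the real request_data(query, conn)
# from the question returns pd.read_sql_query(query, conn).
def request_data(query, conn):
    return query

tables_to_initialize = ['sales', 'stores', 'prices']
tables = {
    name: request_data(
        "SELECT * FROM {i} WHERE d_cr_bdd = "
        "(SELECT MAX(d_cr_bdd) FROM {i});".format(i=name),
        None,
    )
    for name in tables_to_initialize
}
```

Then tables['sales'] is the frame for sales, with no need to remember positions. Creating actual variables named sales, stores, etc. at runtime (e.g. via globals()) is possible but generally discouraged; a dict expresses the same mapping explicitly.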

Django: queryset "loosing" a value

I have 2 models, one with a list of clients and the other with a list of sales.
My intention is to add sales rank value to the clients queryset.
all_clients = Contactos.objects.values("id", "Vendedor", "codigo", 'Nombre', "NombrePcia", "Localidad", "FechaUltVenta")
sales = Ventas.objects.all()
Once loaded, I aggregate the sales per client by summing the subtotal values of their sales, then order the result by total sales.
sales_client = sales.values('cliente').annotate(fact_total=Sum('subtotal'))
client_rank = sales_client.order_by('-fact_total')
Then I rank those clients and store the value under a "Rank" key in the same client_rank queryset.
a = 0
for rank in client_rank:
    a = a + 1
    rank['Rank'] = a
Everything fine up to now. When I print the results in the template I get the expected values in the "client_rank" queryset: "client name" + "total sales per client" + "Rank".
{'cliente': '684 DROGUERIA SUR', 'fact_total': Decimal('846470'), 'Rank': 1}
{'cliente': '699 KINE ESTETIC', 'fact_total': Decimal('418160'), 'Rank': 2}
etc....
The problem starts here
First, take into account that not all the clients in the "all_clients" queryset have actual sales in the "sales" queryset. So I must find which ones do have sales, assign them the "Rank" value, and assign a standard value to the ones who don't.
for subject in all_clients:
    subject_code = str(subject["codigo"])
    try:
        selected_subject = ranking_clientes.get(cliente__icontains=subject_code)
        subject['rank'] = selected_subject['Rank']
    except:
        subject['rank'] = "Some value"
The try always fails because "selected_subject" doesn't seem to have the "Rank" value. If I print "selected_subject" I get the following:
{'cliente': '904 BAHIA BLANCA BASKET', 'fact_total': Decimal('33890')}
Any clues on why I'm losing the "Rank" value? The original "client_rank" queryset still has that value included.
Thanks!
I presume that ranking_clientes is the same as client_rank.
The problem is that .get will always do a new query against the database. This means that any modifications you made to the dictionaries returned in the original query will not have been applied to the result of the get call.
You would need to iterate through your query to find the one you need:
selected_subject = next(client for client in ranking_clientes if subject_code in client['cliente'])
Note, this is pretty inefficient if you have a lot of clients. I would rethink your model structure. Alternatively, you could look into using a database function to return the rank directly as part of the original query.
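The underlying behaviour is plain Python, not Django-specific: the loop mutated particular dict objects, while .get() re-queries and builds fresh ones. A sketch with a function standing in for the database round trip:

```python
# fetch_ranking is a hypothetical stand-in for the queryset: every
# call returns brand-new dicts, just as every .get() call hits the
# database again.
def fetch_ranking():
    return [{'cliente': '684 DROGUERIA SUR', 'fact_total': 846470},
            {'cliente': '699 KINE ESTETIC', 'fact_total': 418160}]

ranking = fetch_ranking()
for pos, row in enumerate(ranking, start=1):
    row['Rank'] = pos          # mutates only these dict objects

fresh = fetch_ranking()        # like .get(): a fresh fetch, no 'Rank' key
```

The dicts in ranking carry 'Rank'; the dicts in fresh do not, which is exactly why the .get() result in the question lacked it.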

Using ? within INSERT INTO of a executemany() in Python with Sqlite

I am trying to submit data to a Sqlite db through python with executemany(). I am reading data from a JSON file and then placing it into the db. My problem is that the JSON creation is not under my control and depending on who I get the file from, the order of values is not the same each time. The keys are correct so they correlate with the keys in the db but I can't just toss the values at the executemany() function and have the data appear in the correct columns each time.
Here is what I need to be able to do.
keyTuple = (name, address, telephone)
listOfTuples = [(name1, address1, telephone1),
                (name2, address2, telephone2),
                (...)]
cur.executemany("INSERT INTO myTable(?,?,?)", keyTuple,
                "VALUES(?,?,?)", listOfTuples)
The problem I have is that some JSON files have the order "name, telephone, address" or some other order. I need to be able to put my keyTuple into the INSERT portion of the command so my relations stay straight no matter what order the JSON file comes in, without completely rebuilding listOfTuples. I know there has to be a way, but what I have written doesn't match the right syntax for the INSERT portion. The VALUES line works just fine; it uses each element in listOfTuples.
Sorry if I am not asking with the correct verbiage. FNG here and this is my first post. I have looked all over the web, but it only produces examples of using ? in the VALUES portion, never in the INSERT INTO portion.
You cannot use SQL parameters (?) for table/column names.
But when you already have the column names in the correct order, you can simply join them to splice them into the SQL command string:
>>> keyTuple = ("name", "address", "telephone")
>>> "INSERT INTO MyTable(" + ",".join(keyTuple) + ")"
'INSERT INTO MyTable(name,address,telephone)'
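Since identifiers spliced into SQL bypass parameter binding, it is safest to whitelist them first. A sketch putting the pieces together (insert_rows and the ALLOWED set are my own names; the table and columns match the question):

```python
import sqlite3

# Column names cannot be bound as ? parameters, so validate them
# against a known set before splicing them into the statement.
ALLOWED = {'name', 'address', 'telephone'}

def insert_rows(conn, table, key_tuple, rows):
    if not set(key_tuple) <= ALLOWED:
        raise ValueError('unexpected column name: %r' % (key_tuple,))
    cols = ','.join(key_tuple)
    marks = ','.join('?' * len(key_tuple))
    conn.executemany(
        'INSERT INTO {0}({1}) VALUES({2})'.format(table, cols, marks), rows)

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE myTable(name TEXT, address TEXT, telephone TEXT)')
# this JSON file happened to order its keys as name, telephone, address
insert_rows(conn, 'myTable', ('name', 'telephone', 'address'),
            [('Ann', '555-0100', '1 Main St')])
```

Because the column list and the value tuples share the same key order, the data lands in the right columns regardless of how the JSON ordered its keys.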
Try this. For example, if you have a table named products with the following fields:
Prod_Name Char( 30 )
UOM Char( 10 )
Reference Char( 10 )
Const Float
Price Float
list_products = [('Garlic', '5 Gr.', 'Can', 1.10, 2.00),
                 ('Beans', '8 On.', 'Bag', 1.25, 2.25),
                 ('Apples', '1 Un.', 'Unit', 0.25, 0.30),
                 ]
c.executemany('Insert Into products Values (?,?,?,?,?)', list_products)

How to achieve unique _id value in MongoDB?

I am using Python 2.7, PyMongo, and MongoDB. I'm trying to get rid of the default _id values in MongoDB. Instead, I want certain fields of the documents to serve as _id.
For example:
{
"_id" : ObjectId("568f7df5ccf629de229cf27b"),
"LIFNR" : "10099",
"MANDT" : "100",
"BUKRS" : "2646",
"NODEL" : "",
"LOEVM" : ""
}
I would like to concatenate LIFNR+MANDT+BUKRS as 100991002646 and hash it to achieve uniqueness, then store it as the new _id.
But how far does hashing help with unique ids, and how do I achieve this?
I understand that the default hash function in Python gives different results on different machines (32-bit vs. 64-bit). If that is true, how should I go about generating _ids?
I need LIFNR+MANDT+BUKRS to be used, however. Thanks in advance.
First, you can't update the _id field. Instead you should create a new field and set its value to the concatenated string. To get the concatenated value you can use the .aggregate() method, which provides access to the aggregation pipeline. The only stage in the pipeline is the $project stage, where you use the $concat operator to build the concatenated string.
From there you then iterate the cursor and update each document using "bulk" operations.
bulk = collection.initialize_ordered_bulk_op()
count = 0
cursor = collection.aggregate([
    {"$project": {"value": {"$concat": ["$LIFNR", "$MANDT", "$BUKRS"]}}}
])
for item in cursor:
    bulk.find({'_id': item['_id']}).update_one({'$set': {'id': item['value']}})
    count = count + 1
    if count % 200 == 0:
        bulk.execute()
        # a Bulk object can only be executed once, so start a new batch
        bulk = collection.initialize_ordered_bulk_op()
if count % 200 != 0:
    bulk.execute()
MongoDB 3.2 deprecates Bulk() and its associated methods so you will need to use the bulk_write() method.
from pymongo import UpdateOne

requests = []
for item in cursor:
    requests.append(UpdateOne({'_id': item['_id']}, {'$set': {'id': item['value']}}))
collection.bulk_write(requests)
Your documents will then look like this:
{'BUKRS': '2646',
'LIFNR': '10099',
'LOEVM': '',
'MANDT': '100',
'NODEL': '',
'_id': ObjectId('568f7df5ccf629de229cf27b'),
'id': '100991002646'}
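On the hashing part of the question: if a hashed key is preferred over the raw concatenation, hashlib is deterministic across machines and Python builds, unlike the built-in hash(), which varies per platform and is randomized per process. A sketch using the sample document's values:

```python
import hashlib

# hashlib digests are stable everywhere, so the same document always
# yields the same key; the built-in hash() does not guarantee this.
doc = {'LIFNR': '10099', 'MANDT': '100', 'BUKRS': '2646'}
raw = doc['LIFNR'] + doc['MANDT'] + doc['BUKRS']
hashed_id = hashlib.sha1(raw.encode('utf-8')).hexdigest()
```

Note that hashing buys nothing for uniqueness here: the concatenation is already unique if the field combination is, and a hash only adds a (tiny) collision risk plus opacity. Using the raw string as the key, as the answer above does, is the simpler choice.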
