Sample document:
{
"_id":"ADANIGREEN",
"longcount":6,"
shortcount":0,
"trend":"Y",
"shortdate":[{"$date":"2020-07-13T00:00:00.000Z"}],
"longdate":[{"$date":"2020-07-20T00:00:00.000Z"}]
}
I need to query the document like the SQL query below:
select _id from sample_document where longdate='2020-07-20T00:00:00.000'
Thanks in advance.
You may try using the $in operator, which matches documents where the field (or any element of an array field) equals any of the values in the given array. Since longdate stores BSON dates, compare against a date rather than a string:
db.sample.find( { longdate: { $in: [ISODate("2020-07-20T00:00:00.000Z")] } } )
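If you are querying from Python instead of the shell, a minimal pymongo sketch of the same query could look like this (the database and collection names are assumptions; the comparison value must be a datetime because longdate holds BSON dates):
from datetime import datetime
from pymongo import MongoClient

client = MongoClient()                    # assumes a local MongoDB instance
coll = client["test"]["sample_document"]  # hypothetical database/collection names

# BSON dates map to Python datetime objects, so query with a datetime, not a string.
target = datetime(2020, 7, 20)
for doc in coll.find({"longdate": {"$in": [target]}}, {"_id": 1}):
    print(doc["_id"])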
I'm trying to figure out how to work better with JSON in Postgres.
I have a file that stores information about many tables (structure and values). The file is periodically updated, which may mean changes in the data as well as in the table structures, so the tables are effectively dynamic.
As a result, for each table I have a JSON structure (the key is the column name, the value is the field type, string or number only) and a list of JSON records.
Something like this (the actual structure does not matter):
{
'table_name': 'table1',
'columns': {
'id': 'int',
'data1': 'string',
'data2': 'string'
},
'values': [
[1, 'aaa', 'bbb'],
[2, 'ccc', 'ddd']
]
}
At first I wanted to make a real table for each table in the file, truncate it when updating the data, and drop it if the table structure changes. The second option I'm testing now is a single table with JSON data:
CREATE TABLE IF NOT EXISTS public.data_tables
(
id integer NOT NULL,
table_name character varying(50),
row_data jsonb,
CONSTRAINT data_tables_pkey PRIMARY KEY (id)
)
And now there is the question of how to properly work with the JSON:
1. query row_data directly, e.g. row_data->>'id' = '1', with a hash index on the 'id' key;
2. use jsonb_populate_record with custom types for each table (yes, I would need to recreate them each time the table structure changes);
3. or probably some other way to work with it?
The first option is the easiest and fast because of the indexes, but there is no data type control and you have to repeat the JSON extraction in every query.
The second option is more difficult to implement, but easier to use in queries; I can even create views for each table with jsonb_populate_record. But as far as I can see, indexes won't work through the JSON function?
Perhaps there is a better way? Or is recreating the tables not such a bad option?
Firstly, your JSON string is not in the correct format. Here is a corrected sample JSON string:
{
"table_name": "table1",
"columns": {
"id": "integer",
"data1": "text",
"data2": "text"
},
"values": [
{
"id": 1,
"data1": "aaa",
"data2": "bbb"
},
{
"id": 2,
"data1": "ccc",
"data2": "ddd"
}
]
}
I wrote a sample function for you, but only for creating the tables from the JSON. You can write the SQL for the insert process in a similar way; it is not difficult.
Sample Function:
CREATE OR REPLACE FUNCTION dynamic_create_table()
RETURNS boolean
LANGUAGE plpgsql
AS $function$
declare
    rec record;
begin
    FOR rec IN
        -- Build one column list per table from the 'columns' object in row_data.
        select
            t1.table_name,
            string_agg(t2.pkey || ' ' || t2.pval || ' NULL', ', ') as sql_columns
        from data_tables t1
        cross join jsonb_each_text(t1.row_data->'columns') t2(pkey, pval)
        group by t1.table_name
    loop
        -- Create the real table with the columns described in the JSON.
        execute 'create table ' || rec.table_name || ' (' || rec.sql_columns || ')';
    END loop;
    return true;
END;
$function$;
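For the insert side that the answer leaves as an exercise, a minimal Python sketch using psycopg2 could load the JSON file into data_tables like this (the connection string, file name, and id value are assumptions):
import json

import psycopg2
from psycopg2.extras import Json

# Hypothetical connection string and file name.
conn = psycopg2.connect("dbname=mydb user=postgres")
with conn, conn.cursor() as cur, open("tables.json") as f:
    spec = json.load(f)
    # Store the whole table specification as jsonb in row_data.
    cur.execute(
        "INSERT INTO public.data_tables (id, table_name, row_data) VALUES (%s, %s, %s)",
        (1, spec["table_name"], Json(spec)),
    )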
For example, I have a collection of documents: [{id:1, title:'A'}, {id:2, title:'B'}, ...]
I want to fetch documents based on some conditions, and only get the values of the fields I want instead of the whole object. In SQL, this can be done by SELECT title FROM documents WHERE year = 2020
Can I achieve similar results in MongoDB with PyMongo?
Try a projection like this (this example uses the PHP driver syntax):
$cursor = $db->inventory->find(
['status' => 'A'],
['projection' => ['item' => 1, 'status' => 1]]
);
.find() in MongoDB has the syntax .find(filter_part, projection_part). You can try the code below in pymongo:
Code:
# As .find() returns a cursor, you can iterate over it to get the values out.
# In the projection, _id is included by default, so we need to exclude it explicitly.
# The projection can use True/False or 1/0.
for eachDocument in collection.find({"year": 2020}, {"title": True, "_id": False}):
    print(eachDocument)
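If you only want the bare field values rather than one-field documents, a small follow-up sketch, assuming the same collection handle:
# Collect only the title values into a plain Python list.
titles = [doc["title"] for doc in collection.find({"year": 2020}, {"title": True, "_id": False})]
print(titles)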
I started using DynamoDB recently and I am having problems fetching data by multiple keys.
I am trying to get multiple items from a table.
My table schema is defined as follows:
{
"AttributeDefinitions": [
{
"AttributeName": "id",
"AttributeType": "S"
},
{
"AttributeName": "date",
"AttributeType": "S"
}
],
"KeySchema": [
{
"AttributeName": "id",
"KeyType": "HASH"
},
{
"AttributeName": "date",
"KeyType": "RANGE"
}
],
...
}
I have a filter list of ids and a date range for each id:
[
{ "id": "abc", "start_date": "24/03/2020", "end_date": "26/03/2020" },
{ "id": "def", "start_date": "10/04/2020", "end_date": "20/04/2020" },
{ "id": "ghi", "start_date": "11/04/2020", "end_date": "11/04/2020" }
]
I need to fetch all items that match the filter list.
The problem is that I cannot use Query as KeyConditionExpression only accepts a single partition key (and I need to match it to the entire filter list)
The condition must perform an equality test on a single partition key value.
I cannot use BatchGetItem as it requires the exact key (and I need a date range for my sort key Key('date').between(start_date, end_date))
Keys - An array of primary key attribute values that define specific items in the table. For each primary key, you must provide all of the key attributes. For example, with a simple primary key, you only need to provide the partition key value. For a composite key, you must provide both the partition key value and the sort key value.
I am kind of lost...
Is there a way to fetch by multiple keys with a range query (by a single request - not multiple requests from a loop)?
Would you suggest any table changes?
You need to make one query per unique id. Each of these queries should include a key condition expression that has equality on the id partition key and range of values on the date sort key, like this:
#id = :id AND #date BETWEEN :startdate AND :enddate
Don't use scan for this. As your table grows, performance will decline.
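A minimal boto3 sketch of that approach could look like the following (the table name is an assumption, and the dates are written in ISO-8601 form because BETWEEN on strings compares lexicographically, so DD/MM/YYYY values would not sort correctly):
import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('tablename')  # hypothetical table name

filters = [
    {'id': 'abc', 'start_date': '2020-03-24', 'end_date': '2020-03-26'},
    {'id': 'def', 'start_date': '2020-04-10', 'end_date': '2020-04-20'},
]

items = []
for f in filters:
    # One Query per unique id: equality on the partition key, BETWEEN on the sort key.
    response = table.query(
        KeyConditionExpression=Key('id').eq(f['id'])
        & Key('date').between(f['start_date'], f['end_date'])
    )
    items.extend(response['Items'])
print(items)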
You can use table.scan to get multiple records; see the boto3 documentation.
Here's some example code:
import boto3
from boto3.dynamodb.conditions import Attr

# Get the service resource.
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('tablename')

response = table.scan(
    FilterExpression=Attr('first_name').begins_with('J') & Attr('account_type').eq('super_user')
)
items = response['Items']
print(items)
Referring to this post, https://stackoverflow.com/a/70494101/7706503, you could try PartiQL to get items by multiple partition keys in one query.
I'm trying to get Mongo to remove documents with the TTL feature, however without success. I have tried many things, but Mongo doesn't seem to clean up.
My index:
{
"v" : 1,
"key" : {
"date" : 1
},
"name" : "date_1",
"ns" : "history.history",
"expireAfterSeconds" : 60
}
The date value from document:
"date" : "2016-09-29 11:08:46.461207",
Output from db.serverStatus().metrics.ttl:
{ "deletedDocuments" : NumberLong(0), "passes" : NumberLong(29) }
Time output from db.serverStatus():
"localTime" : ISODate("2016-09-29T11:19:45.345Z")
The only thing I suspect is the way I insert the value from Python; it could be wrong in some way. I have a JSON document which contains the following element:
"date": str(datetime.utcnow()),
Any clues where the problem might lie?
Thanks,
Janis
As you have guessed, the problem is in how you insert the date value. I'll quote the docs:
If the indexed field in a document is not a date or an array that
holds a date value(s), the document will not expire.
You are casting the date to a string. If you are using the pymongo driver, it will handle datetimes nicely and convert them to MongoDB's native Date type.
This way, the following should work:
"date": datetime.utcnow()
I am using Python 2.7, PyMongo and MongoDB. I'm trying to get rid of the default _id values in MongoDB; instead, I want the values of certain fields to serve as the _id.
For example:
{
"_id" : ObjectId("568f7df5ccf629de229cf27b"),
"LIFNR" : "10099",
"MANDT" : "100",
"BUKRS" : "2646",
"NODEL" : "",
"LOEVM" : ""
}
I would like to concatenate LIFNR+MANDT+BUKRS as 100991002646, hash it to achieve uniqueness, and store it as the new _id.
But how well does hashing help with unique ids? And how do I achieve it?
I understand that Python's default hash function gives different results on different machines (32-bit / 64-bit). If that is true, how would I go about generating the _ids?
In any case, I need LIFNR+MANDT+BUKRS to be used. Thanks in advance.
First, you can't update the _id field. Instead, you should create a new field and set its value to the concatenated string. To return the concatenated value you need to use the .aggregate() method, which provides access to the aggregation pipeline. The only stage in the pipeline is the $project stage, where you use the $concat operator, which concatenates strings and returns the concatenated string.
From there you then iterate the cursor and update each document using "bulk" operations.
bulk = collection.initialize_ordered_bulk_op()
count = 0
cursor = collection.aggregate([
    {"$project": {"value": {"$concat": ["$LIFNR", "$MANDT", "$BUKRS"]}}}
])
for item in cursor:
    bulk.find({'_id': item['_id']}).update_one({'$set': {'id': item['value']}})
    count = count + 1
    if count % 200 == 0:
        bulk.execute()
        bulk = collection.initialize_ordered_bulk_op()  # a bulk object cannot be reused after execute()
if count % 200 != 0:
    bulk.execute()  # flush the remaining operations
MongoDB 3.2 deprecates Bulk() and its associated methods, so you will need to use the bulk_write() method instead.
from pymongo import UpdateOne

requests = []
for item in cursor:
    requests.append(UpdateOne({'_id': item['_id']}, {'$set': {'id': item['value']}}))
collection.bulk_write(requests)
Your documents will then look like this:
{'BUKRS': '2646',
'LIFNR': '10099',
'LOEVM': '',
'MANDT': '100',
'NODEL': '',
'_id': ObjectId('568f7df5ccf629de229cf27b'),
'id': '100991002646'}
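As for the hashing concern in the question: Python's built-in hash() is indeed not stable across machines or interpreter runs, so if you still want a fixed-length _id, a deterministic digest from hashlib could be used on the concatenated value instead. A purely illustrative sketch with the same three fields:
import hashlib

doc = {'LIFNR': '10099', 'MANDT': '100', 'BUKRS': '2646'}
key = doc['LIFNR'] + doc['MANDT'] + doc['BUKRS']   # '100991002646'
# hashlib digests are stable across machines and interpreter runs, unlike hash().
stable_id = hashlib.md5(key.encode('utf-8')).hexdigest()
print(stable_id)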