Mongo TTL cleanup is not working - python

I'm trying to get MongoDB to remove documents with the TTL feature, but without success. I have tried a number of things, but Mongo never seems to clean up.
My index:
{
    "v" : 1,
    "key" : {
        "date" : 1
    },
    "name" : "date_1",
    "ns" : "history.history",
    "expireAfterSeconds" : 60
}
The date value from document:
"date" : "2016-09-29 11:08:46.461207",
Output from db.serverStatus().metrics.ttl:
{ "deletedDocuments" : NumberLong(0), "passes" : NumberLong(29) }
Time output from db.serverStatus():
"localTime" : ISODate("2016-09-29T11:19:45.345Z")
The only thing I suspect is the way I insert the value from Python; it could be wrong in some way. I have a JSON document which contains the following element:
"date": str(datetime.utcnow()),
Any clues where the problem might lay?
Thanks,
Janis

As you have guessed, the problem is in how you insert the date value. I'll quote the docs:
If the indexed field in a document is not a date or an array that
holds a date value(s), the document will not expire.
You are casting the date to a string. If you are using the pymongo driver, it will handle datetime objects nicely and convert them to MongoDB's native Date type.
This way, the following should work:
"date": datetime.utcnow()

Related

Usage of postgres jsonb

I'm trying to figure out how to work better with JSON in Postgres.
I have a file that stores information about many tables (structure and values). The file is updated periodically, which may mean changes to the data as well as to the table structures, so in effect I have dynamic tables.
As a result, for each table I have a JSON structure (the key is the column name, the value is the field type (string or number only)) and a list of JSON records.
Something like this (the actual structure does not matter):
{
    'table_name': 'table1',
    'columns': {
        'id': 'int',
        'data1': 'string',
        'data2': 'string'
    },
    'values': [
        [1, 'aaa', 'bbb'],
        [2, 'ccc', 'ddd']
    ]
}
At first I wanted to create a real table for each table in the file, truncate it when updating the data, and drop it if the table structure changes. The second option, which I'm testing now, is a single table with JSON data:
CREATE TABLE IF NOT EXISTS public.data_tables
(
    id integer NOT NULL,
    table_name character varying(50),
    row_data jsonb,
    CONSTRAINT data_tables_pkey PRIMARY KEY (id)
)
And now there is the question of how to properly work with json:
directly query row_data like row_data->>'id' = 1 with hash index for 'id' key
use jsonb_populate_record with custom types for each table (yes, I need to recreate them each time table structure will change)
probably some other way to work with it?
The first option is the easiest and fast because of the index, but there is no data type control and you have to spell out the key access in every query.
The second option is more difficult to implement but easier to use in queries. I can even create views for each table with jsonb_populate_record. But as far as I can see, indexes won't work through the JSON function?
Perhaps there is a better way? Or is recreating tables not such a bad option?
Firstly, your JSON is not quite in the right shape (single quotes, and the values are positional arrays rather than objects keyed by column). Here is a corrected sample:
{
    "table_name": "table1",
    "columns": {
        "id": "integer",
        "data1": "text",
        "data2": "text"
    },
    "values": [
        {
            "id": 1,
            "data1": "aaa",
            "data2": "bbb"
        },
        {
            "id": 2,
            "data1": "ccc",
            "data2": "ddd"
        }
    ]
}
I wrote a sample function for you, but only for creating the tables from the JSON. You can write the SQL for the insert step the same way; it is not difficult.
Sample Function:
CREATE OR REPLACE FUNCTION dynamic_create_table()
RETURNS boolean
LANGUAGE plpgsql
AS $function$
declare
    rec record;
begin
    FOR rec IN
        select
            t1.table_name,
            string_agg(t2.pkey || ' ' || t2.pval || ' NULL', ', ') as sql_columns
        from data_tables t1
        cross join jsonb_each_text(t1.row_data->'columns') t2(pkey, pval)
        group by t1.table_name
    loop
        execute 'create table ' || rec.table_name || ' (' || rec.sql_columns || ')';
    END loop;
    return true;
END;
$function$;
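As a rough illustration of that insert step, here is a Python sketch using psycopg2; the connection string and file name are hypothetical, and it assumes the corrected JSON shape above where each value row is an object keyed by column name:
import json

import psycopg2
from psycopg2 import sql

# Hypothetical connection string and input file; adjust to your environment
conn = psycopg2.connect("dbname=mydb user=postgres")
with open("table1.json") as f:
    entry = json.load(f)

columns = list(entry["columns"].keys())

# Build "INSERT INTO table1 (id, data1, data2) VALUES (%s, %s, %s)" safely
insert = sql.SQL("INSERT INTO {} ({}) VALUES ({})").format(
    sql.Identifier(entry["table_name"]),
    sql.SQL(", ").join([sql.Identifier(c) for c in columns]),
    sql.SQL(", ").join([sql.Placeholder() for _ in columns]),
)

with conn, conn.cursor() as cur:
    for row in entry["values"]:
        cur.execute(insert, [row[c] for c in columns])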

MongoDB - find document based on date stored as array

Sample document:
{
    "_id": "ADANIGREEN",
    "longcount": 6,
    "shortcount": 0,
    "trend": "Y",
    "shortdate": [{"$date": "2020-07-13T00:00:00.000Z"}],
    "longdate": [{"$date": "2020-07-20T00:00:00.000Z"}]
}
I need to query the documents the way the SQL query below would:
select _id from sample_document where longdate='2020-07-20T00:00:00.000'
Thanks in advance.
You may try the $in operator, which matches when the field, or any element of an array field, equals one of the listed values. Since longdate holds Date values, compare against a date rather than a string:
db.sample.find( { longdate: { $in: [ISODate("2020-07-20T00:00:00.000Z")] } } )
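The equivalent from pymongo, as a minimal sketch (the database and collection names here are assumed), passes a real datetime for the same reason:
from datetime import datetime

from pymongo import MongoClient

client = MongoClient()
coll = client.test.sample  # assumed database/collection names

# $in matches if any element of the longdate array equals one of the listed values
target = datetime(2020, 7, 20)
for doc in coll.find({"longdate": {"$in": [target]}}, {"_id": 1}):
    print(doc["_id"])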

DynamoDB - fetching items by multiple keys

I started using DynamoDB recently and I am having problems fetching data by multiple keys.
I am trying to get multiple items from a table.
My table schema is defined as follows:
{
    "AttributeDefinitions": [
        {
            "AttributeName": "id",
            "AttributeType": "S"
        },
        {
            "AttributeName": "date",
            "AttributeType": "S"
        }
    ],
    "KeySchema": [
        {
            "AttributeName": "id",
            "KeyType": "HASH"
        },
        {
            "AttributeName": "date",
            "KeyType": "RANGE"
        }
    ],
    ...
}
I have a filter list of ids and a date range for each id:
[
{ "id": "abc", "start_date": "24/03/2020", "end_date": "26/03/2020" },
{ "id": "def", "start_date": "10/04/2020", "end_date": "20/04/2020" },
{ "id": "ghi", "start_date": "11/04/2020", "end_date": "11/04/2020" }
]
I need to fetch all items that match the filter list.
The problem is that I cannot use Query as KeyConditionExpression only accepts a single partition key (and I need to match it to the entire filter list)
The condition must perform an equality test on a single partition key value.
I cannot use BatchGetItem as it requires the exact key (and I need a date range for my sort key Key('date').between(start_date, end_date))
Keys - An array of primary key attribute values that define specific items in the table. For each primary key, you must provide all of the key attributes. For example, with a simple primary key, you only need to provide the partition key value. For a composite key, you must provide both the partition key value and the sort key value.
I am kind of lost...
Is there a way to fetch by multiple keys with a range query (by a single request - not multiple requests from a loop)?
Would you suggest any table changes?
You need to make one query per unique id. Each of these queries should include a key condition expression that has equality on the id partition key and range of values on the date sort key, like this:
#id = :id AND #date BETWEEN :startdate AND :enddate
Don't use scan for this. As your table grows, performance will decline.
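A minimal boto3 sketch of that approach (the table name is a placeholder; one Query call per id from the filter list):
import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('my-table')  # placeholder table name

filters = [
    {"id": "abc", "start_date": "24/03/2020", "end_date": "26/03/2020"},
    {"id": "def", "start_date": "10/04/2020", "end_date": "20/04/2020"},
]

items = []
for f in filters:
    # Equality on the partition key, range on the sort key
    response = table.query(
        KeyConditionExpression=Key('id').eq(f['id'])
        & Key('date').between(f['start_date'], f['end_date'])
    )
    items.extend(response['Items'])
Note that BETWEEN compares strings lexicographically, so ISO-8601 dates (YYYY-MM-DD) sort correctly while DD/MM/YYYY dates do not.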
You can use table.scan to get multiple records. See documentation here.
Here's an example:
import boto3
from boto3.dynamodb.conditions import Attr

# Get the service resource.
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('tablename')

response = table.scan(
    FilterExpression=Attr('first_name').begins_with('J') & Attr('account_type').eq('super_user')
)
items = response['Items']
print(items)
Referring to this post https://stackoverflow.com/a/70494101/7706503, you could try PartiQL to fetch items for multiple partition keys in one request.
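Roughly, that might look like the sketch below using boto3's low-level client; the table name and ids are placeholders, and the PartiQL statement is only illustrative, so check the syntax against the DynamoDB PartiQL reference:
import boto3

client = boto3.client('dynamodb')

# Illustrative PartiQL statement; the ? placeholders are bound via Parameters
response = client.execute_statement(
    Statement='SELECT * FROM "my-table" WHERE id IN [?, ?, ?]',
    Parameters=[{'S': 'abc'}, {'S': 'def'}, {'S': 'ghi'}],
)
items = response['Items']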

Create a JSON format using loops

I'm in the process of creating a table in AWS DynamoDB. All the documentation demonstrates the required JSON format entirely by hand... in my case I want to create several tables, each with several columns, and it seems inefficient to do this manually when I already know my column headers and their data types...
The boto3 website has a guide with the following snippet:
import boto3

# Get the service resource.
dynamodb = boto3.resource('dynamodb')

# Create the DynamoDB table.
table = dynamodb.create_table(
    TableName='users',
    KeySchema=[
        {
            'AttributeName': 'username',
            'KeyType': 'HASH'
        },
        {
            'AttributeName': 'last_name',
            'KeyType': 'RANGE'
        }
    ],
    AttributeDefinitions=[
        {
            'AttributeName': 'username',
            'AttributeType': 'S'
        },
        {
            'AttributeName': 'last_name',
            'AttributeType': 'S'
        },
    ],
    ProvisionedThroughput={
        'ReadCapacityUnits': 5,
        'WriteCapacityUnits': 5
    }
)
Now I'm wondering: of course, if you had hundreds of columns/AttributeTypes in your data, you wouldn't want to sit there typing them all in. How can I automate this process with a loop? I have a general idea, but I'm coming from Java and am not yet proficient enough in Python to see the solution in this case.
Could anyone help? Thanks!
EDIT:
I worded this question horribly, and had gotten too bogged down in the documentation to understand what I was asking about. I wanted a solution for automating the addition of data to a DynamoDB table using loops. I explain in my answer below.
So, unbeknownst to me at the time, the snippet shown in my question is only about defining the key schema, i.e. your primary/composite keys. What I wanted to do was add actual data to the table, and the examples of this in the boto3 documentation were all done manually.
To answer my own question: first, you obviously need to create and define the key schema, and that takes no time at all to do manually using the template shown in the question.
Note that Boto3 will not accept float values... my solution was to convert them to str. Boto3 recommends using Decimal, but Decimal(str(value)) would not work, as for some reason it was being passed as the string Decimal(value) (can anyone explain?):
Passing a Decimal(str(value)) to a dictionary for raw value
This is how I used pandas to import data from Excel and then automated putting that data into my table:
import math

import boto3
import pandas as pd

# pandas reads the Excel document into a dataframe
df = pd.read_excel(fileRoute, dtype=object)

# the column headers are allocated to the keys variable
keys = df.columns.values

# create a dynamodb resource and link it to an existing table in DynamoDB
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table(name)

# batch_writer() allows the addition of many items at once to the table
with table.batch_writer() as batch:
    dfvalues = df.values
    # loop through the values to generate a dictionary for every row in the table
    for sublist in dfvalues:
        dicts = {}
        for ind, value in enumerate(sublist):
            if type(value) is float:
                if math.isnan(value):  # you might want to avoid 'NaN' types
                    continue
                value = str(value)  # convert float values to str
            dicts[keys[ind]] = value
        batch.put_item(
            Item=dicts  # add the item to the DynamoDB table
        )

How to parse JSON strings stored as an array in DB

From a JSON file, the data is loaded into a database, and some fields are stored in a single array. How can I validate the strings that are mapped to this array?
How can I write code for this?
For example:
From this JSON, the strings below are stored in the DB as an array.
From JSON:
{
    "roadTrip" : [{
        "firstStop" : "Place1",
        "SecondStop" : "Place2",
        "currency" : "DOLLAR"
    }]
}
To DB:
roadTrip: Array
    "firstStop" : "Place1",
    "SecondStop" : "Place2",
    "currency" : "DOLLAR"
