Nested dictionary to database tables with sqlalchemy - python

Assume a database with tables G, P, and C. A one-to-many relationship exists between G and P, as well as between P and C.
I now have a nested dictionary:
db_entry = {
    "G_key_1": {
        "G_col_1": 11,
        "G_col_2": 12,
        "G_Ps": [
            {
                "P_col_1": 21,
                "P_col_2": 22,
                "P_Cs": [
                    {
                        "C_col_1": 31,
                        "C_col_2": 32,
                    }
                ]
            }
        ]
    }
}
Is there a nice way to add the db_entry nested dictionary to the DB? At the moment I just loop over all the children, which works because the dictionary has only 3 levels. That approach would not scale to dictionaries with more than 3 levels.
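One possible approach is to build the object tree recursively, so the depth of the dictionary no longer matters. The following is a minimal sketch, not a definitive implementation: the dataclasses G, P, and C are plain stand-ins for the mapped classes, and CHILD_CLASSES and build are hypothetical names introduced for illustration. With real SQLAlchemy models whose relationships are named G_Ps and P_Cs, adding the root objects to a session should also persist the children through the default save-update cascade.

```python
from dataclasses import dataclass, field

# Plain stand-ins for the mapped classes (assumption: real SQLAlchemy
# models with relationships named G_Ps and P_Cs would replace them).
@dataclass
class C:
    C_col_1: int
    C_col_2: int

@dataclass
class P:
    P_col_1: int
    P_col_2: int
    P_Cs: list = field(default_factory=list)

@dataclass
class G:
    G_col_1: int
    G_col_2: int
    G_Ps: list = field(default_factory=list)

# Hypothetical lookup: which class builds the children under each list key.
CHILD_CLASSES = {"G_Ps": P, "P_Cs": C}

def build(cls, data):
    """Recursively turn a nested dict into an object tree of any depth."""
    kwargs = {}
    for key, value in data.items():
        if key in CHILD_CLASSES:
            # Descend into the child list, one object per child dict.
            kwargs[key] = [build(CHILD_CLASSES[key], child) for child in value]
        else:
            kwargs[key] = value
    return cls(**kwargs)

db_entry = {
    "G_key_1": {
        "G_col_1": 11,
        "G_col_2": 12,
        "G_Ps": [
            {"P_col_1": 21, "P_col_2": 22,
             "P_Cs": [{"C_col_1": 31, "C_col_2": 32}]},
        ],
    }
}

# The outer keys ("G_key_1") are only labels here and are not stored.
roots = [build(G, g_data) for g_data in db_entry.values()]
```

With a session available, something like session.add_all(roots) followed by session.commit() would then replace the hand-written loops. Supporting a fourth level only requires one more entry in CHILD_CLASSES.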

Related

How can I transpose a list of documents in MongoDB?

I have a document like:
{
    "_id": "6345e01473144cec0073ea95",
    "results": [
        {"total_cost": 10, "total_time": 20},
        {"total_cost": 30, "total_time": 40}
    ]
}
And I want to 'transpose' the list of documents to get:
{
    "total_cost": [10, 30],
    "total_time": [20, 40]
}
How can I find an object by ID, and then apply a transform on a list of documents with a mongo aggregation?
Every question/answer I have seen doing this has been for multiple documents, however this is for a single document with a list field.
(I am using MongoEngine/Pymongo so I have included the python tag)
Simply access the fields with dot notation. You can think of the projection of each field as an individual array, i.e. results.total_cost is an array with content [10, 30].
db.collection.aggregate([
  {
    $project: {
      total_cost: "$results.total_cost",
      total_time: "$results.total_time"
    }
  }
])
Here is the Mongo Playground with your reference.
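Since the question mentions Pymongo and finding the document by ID, here is a rough Python sketch of the same idea. The transpose helper is a hypothetical name and only mirrors locally what the dot-notation projection produces. The pipeline list is plain data that could be passed to a collection's aggregate method; the $match stage assumes the _id is stored as the string shown in the question (if it is stored as an ObjectId, it would need to be wrapped accordingly).

```python
doc = {
    "_id": "6345e01473144cec0073ea95",
    "results": [
        {"total_cost": 10, "total_time": 20},
        {"total_cost": 30, "total_time": 40},
    ],
}

def transpose(records):
    """Turn a list of dicts into a dict of lists (local equivalent of the projection)."""
    out = {}
    for record in records:
        for key, value in record.items():
            out.setdefault(key, []).append(value)
    return out

transposed = transpose(doc["results"])
print(transposed)  # {'total_cost': [10, 30], 'total_time': [20, 40]}

# Aggregation pipeline as plain Python data: select one document by ID,
# then project each nested field as an array.
pipeline = [
    {"$match": {"_id": doc["_id"]}},  # assumption: _id stored as a string
    {"$project": {
        "_id": 0,
        "total_cost": "$results.total_cost",
        "total_time": "$results.total_time",
    }},
]
```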

How do you convert JSON output to a data frame in Python?

I need to convert this JSON output to a data frame in Python:
print(resp2)
{
    "totalCount": 1,
    "nextPageKey": null,
    "result": [
        {
            "metricId": "builtin:tech.generic.cpu.usage",
            "data": [
                {
                    "dimensions": [
                        "process_345678"
                    ],
                    "dimensionMap": {
                        "dt.entity.process_group_instance": "process_345678"
                    },
                    "timestamps": [
                        1642021200000,
                        1642024800000,
                        1642028400000
                    ],
                    "values": [
                        10,
                        15,
                        12
                    ]
                }
            ]
        }
    ]
}
Output needs to be like this:
metricId                        dimensions      timestamps     values
builtin:tech.generic.cpu.usage  process_345678  1642021200000  10
builtin:tech.generic.cpu.usage  process_345678  1642024800000  15
builtin:tech.generic.cpu.usage  process_345678  1642028400000  12
I have tried this:
print(pd.json_normalize(resp2, "data"))
I get invalid syntax, any ideas?
Take a look at the examples for json_normalize: they use a list of dictionaries whose keys are the column names you want, one dictionary per row. Nested objects are flattened into dot-notation columns, but nested arrays are not expanded into one row per element.
Therefore, parse the data into a flat list, then you can use from_records.
import pandas as pd

data = []
for r in resp2['result']:
    metricId = r['metricId']
    for d in r['data']:
        dimension = d['dimensions'][0]  # unclear why this is an array
        timestamps = d['timestamps']
        values = d['values']
        # one row per (timestamp, value) pair
        for t, v in zip(timestamps, values):
            data.append({'metricId': metricId, 'dimensions': dimension, 'timestamps': t, 'values': v})
df = pd.DataFrame.from_records(data)
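If resp2 is still raw JSON text rather than a parsed dictionary, it would need to go through json.loads first (which also turns null into None). The sketch below shows that step together with the same flattening written as a generator; raw and iter_rows are hypothetical names, and the payload is a trimmed copy of the one in the question.

```python
import json

# Trimmed copy of the payload from the question, as raw JSON text.
raw = """
{
  "totalCount": 1,
  "nextPageKey": null,
  "result": [
    {
      "metricId": "builtin:tech.generic.cpu.usage",
      "data": [
        {
          "dimensions": ["process_345678"],
          "timestamps": [1642021200000, 1642024800000, 1642028400000],
          "values": [10, 15, 12]
        }
      ]
    }
  ]
}
"""

def iter_rows(payload):
    """Yield one flat dict per (dimension, timestamp, value) combination."""
    for result in payload["result"]:
        for series in result["data"]:
            for dimension in series["dimensions"]:
                for t, v in zip(series["timestamps"], series["values"]):
                    yield {
                        "metricId": result["metricId"],
                        "dimensions": dimension,
                        "timestamps": t,
                        "values": v,
                    }

rows = list(iter_rows(json.loads(raw)))
```

The resulting rows list has the same shape as the data list above, so it can be handed to pd.DataFrame.from_records in the same way.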

Is it possible to use wildcards for field names in MongoDB?

I have a set of field names as follows:
"field0.registers.hilo"
"field0.registers.lllo"
...
"field1.registers.hilo"
"field1.registers.lllo"
...
"field2.registers.hilo"
"field2.registers.lllo"
...
"fieldn.registers.hilo"
"fieldn.registers.lllo"
...
Is there a way to indicate the fields in mongodb with the index to range from 0 to n succinctly without having to expand it all out beforehand?
something like this example for project:
{ $project: { "fieldn.registers.hilo": 1, "fieldn.registers.lllo": 1 } }
For now, I am fully expanding all the project fields from 0 to n in python before interfacing with the collection using pymongo.
is it possible to use wildcards for field names in mongodb?
No.
If your data is in this structure, refactor it to use lists. That's exactly what lists are designed for.
Taking the refactored example below, use $elemMatch to project only the array elements needed:
from pymongo import MongoClient

db = MongoClient()['mydatabase']
db.register.insert_many([{
    'registers': [
        {
            'field': 0,
            'hilo': 1,
            'lllo': 2
        },
        {
            'field': 1,
            'hilo': 2,
            'lllo': 3
        },
        {
            'field': 2,
            'hilo': 3,
            'lllo': 4
        }
    ]}])
print(db.register.find_one({}, {'registers': {'$elemMatch': {'field': 1}}}))
prints:
{'_id': ObjectId('60b64e57c3214d73c390557b'), 'registers': [{'field': 1, 'hilo': 2, 'lllo': 3}]}
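The refactoring itself could be done in Python before re-inserting the documents. This is a minimal sketch under one assumption: that the dotted names in the question ("field0.registers.hilo") correspond to nested sub-documents keyed field0, field1, and so on. The names FIELD_KEY, to_register_list, and old_doc are hypothetical.

```python
import re

# Matches top-level keys such as "field0", "field12" and captures the index.
FIELD_KEY = re.compile(r"^field(\d+)$")

def to_register_list(doc):
    """Convert fieldN.registers sub-documents into one 'registers' list."""
    registers = []
    for key, value in doc.items():
        match = FIELD_KEY.match(key)
        if match:
            # Keep the numeric index as a regular, queryable field.
            registers.append({"field": int(match.group(1)), **value["registers"]})
    registers.sort(key=lambda item: item["field"])
    return {"registers": registers}

# Assumed shape of a document in the original layout.
old_doc = {
    "field0": {"registers": {"hilo": 1, "lllo": 2}},
    "field1": {"registers": {"hilo": 2, "lllo": 3}},
    "field2": {"registers": {"hilo": 3, "lllo": 4}},
}

new_doc = to_register_list(old_doc)
```

Here new_doc has the same structure as the document inserted in the example above, so the $elemMatch projection applies to it unchanged.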

Create a new dictionary from existing with new keys

I have a dictionary d, and I want to modify its keys and create a new dictionary. What is the best way to do this?
Here's my existing code:
import json

d = json.loads("""{
    "reference": "DEMODEVB02C120001",
    "business_date": "2019-06-18",
    "final_price": 40,
    "products": [
        {
            "quantity": 4,
            "original_price": 10,
            "final_price": 40,
            "id": "123"
        }
    ]
}""")
d2 = {
    'VAR_Reference': d['reference'],
    'VAR_date': d['business_date'],
    'VAR_TotalPrice': d['final_price']
}
Is there a better way to map the values, using another mapping dictionary or a file where the mapping can be kept?
For example, something like this:
d3 = {
    'reference': 'VAR_Reference',
    'business_date': 'VAR_date',
    'final_price': 'VAR_TotalPrice'
}
Appreciate any tips or hints.
You can use a dictionary comprehension to iterate over your original dictionary and fetch your new keys from the mapping dictionary. Note that d3.get(key) returns None for keys that are absent from the mapping, such as products.
{d3.get(key): value for key, value in d.items()}
You can also iterate over d3 and get the final dictionary, which only contains the mapped keys (thanks @IcedLance for the suggestion).
{value: d.get(key) for key, value in d3.items()}
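To make the difference between the two comprehensions concrete, here is a small runnable sketch using the data from the question (with the products entry shortened). The names by_source, by_mapping, and filtered are hypothetical; filtered is an extra variant that skips unmapped keys instead of producing a None key.

```python
d = {
    "reference": "DEMODEVB02C120001",
    "business_date": "2019-06-18",
    "final_price": 40,
    "products": [{"quantity": 4, "id": "123"}],
}
d3 = {
    "reference": "VAR_Reference",
    "business_date": "VAR_date",
    "final_price": "VAR_TotalPrice",
}

# Iterating over d: "products" has no mapping, so it ends up under the key None.
by_source = {d3.get(key): value for key, value in d.items()}

# Iterating over d3: only the mapped keys appear in the result.
by_mapping = {value: d.get(key) for key, value in d3.items()}

# Variant: iterate over d but skip keys that have no mapping.
filtered = {d3[key]: value for key, value in d.items() if key in d3}

print(by_mapping)
# {'VAR_Reference': 'DEMODEVB02C120001', 'VAR_date': '2019-06-18', 'VAR_TotalPrice': 40}
```

Because d3 is plain data, it could also be kept in a JSON file and loaded with json.load, which covers the "mapping kept in a file" part of the question.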

Google sheets python script conditional formatting custom formula

I want to use a script to format the background color for cells in two separate ranges based on values in a single range.
Ranges to be formatted:
AU6:BL405
BN6:CE405
Based on value = 1 in range:
A6:R405
I'm hoping that using a script will make the file more efficient compared to setting up the conditional formatting using the custom formula formatting function in Sheets.
I would appreciate any help!
Here is the code I've come up with so far:
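SHEET_ID = "xxx"  # spreadsheet ID (placeholder)
# SHEETS is assumed to be an authorized Sheets API service object.

# GridRange indexes are zero-based: start is inclusive, end is exclusive.
myRangeA = {  # AU6:BL405
    "startRowIndex": 5,
    "endRowIndex": 405,
    "startColumnIndex": 46,
    "endColumnIndex": 64,
}
myRangeB = {  # BN6:CE405
    "startRowIndex": 5,
    "endRowIndex": 405,
    "startColumnIndex": 65,
    "endColumnIndex": 83,
}
reqs = [
    {"addConditionalFormatRule": {
        "index": 0,
        "rule": {
            "ranges": [myRangeA, myRangeB],
            "booleanRule": {
                "format": {"backgroundColor": {"red": 0.8}},
                "condition": {
                    "type": "CUSTOM_FORMULA",
                    "values": [{"userEnteredValue": "=A6=1"}],
                },
            },
        },
    }},
]
SHEETS.spreadsheets().batchUpdate(spreadsheetId=SHEET_ID,
                                  body={"requests": reqs}).execute()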
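Translating A1 notation into the zero-based, end-exclusive indexes that a GridRange expects is easy to get wrong by one. The helper below is a small sketch for doing that conversion; column_to_index and a1_to_grid_range are hypothetical names, and the helper only handles simple ranges of the form AU6:BL405 (no sheet name, no open-ended ranges).

```python
import re

def column_to_index(letters):
    """Convert column letters to a zero-based index (A -> 0, Z -> 25, AA -> 26)."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1

def a1_to_grid_range(a1):
    """Convert a range like 'AU6:BL405' to GridRange-style index fields."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+):([A-Za-z]+)(\d+)", a1)
    if match is None:
        raise ValueError(f"unsupported A1 range: {a1!r}")
    start_col, start_row, end_col, end_row = match.groups()
    return {
        "startRowIndex": int(start_row) - 1,         # inclusive, zero-based
        "endRowIndex": int(end_row),                 # exclusive
        "startColumnIndex": column_to_index(start_col),
        "endColumnIndex": column_to_index(end_col) + 1,
    }

range_a = a1_to_grid_range("AU6:BL405")
range_b = a1_to_grid_range("BN6:CE405")
print(range_a)
# {'startRowIndex': 5, 'endRowIndex': 405, 'startColumnIndex': 46, 'endColumnIndex': 64}
```

The two resulting dictionaries can be used as the entries of the ranges list in the request; a sheetId key would still have to be added if the target is not the default sheet.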
