Partition Key bad request when adding data to a container - python

I've been working through the Azure Cosmos DB tutorial code for Python to understand it better, and I'm currently stuck on a partition key issue when trying to add data.
This code creates the container object with a partition key path of /account_number.
container = db.create_container(id=CONTAINER_ID, partition_key=PartitionKey(path='/account_number'), offer_throughput=400)
Here is some dummy data:
def get_sales_order(item_id):
    order1 = {'id' : item_id,
              'account_number' : 'Account1',
              'purchase_order_number' : 'PO18009186470',
              'order_date' : datetime.date(2005,1,10).strftime('%c'),
              'subtotal' : 419.4589,
              'tax_amount' : 12.5838,
              'freight' : 472.3108,
              'total_due' : 985.018,
              'items' : [
                  {'order_qty' : 1,
                   'product_id' : 100,
                   'unit_price' : 418.4589,
                   'line_price' : 418.4589
                  }
              ],
              'ttl' : 60 * 60 * 24 * 30
             }
    return order1
To add the dummy data to the database, I run the following:
sales_order = get_sales_order("SalesOrder1")
container.create_item(body = sales_order)
When I run this I get the following error message.
(BadRequest) Message: {"Errors":["The partition key supplied in x-ms-partitionkey header has fewer components than defined in the collection."]}
After looking at the documentation for the create_item method, I noticed the initial_headers param. So I created a dict with the necessary info as seen here:
headers = {}
headers["x-ms-documentdb-partitionkey"]= "account_number"
Next I ran the following code, adding the "x-ms-documentdb-partitionkey" header:
container.create_item(body = sales_order, initial_headers = headers)
However I still receive the same error. Help is very much appreciated.

Related

S3 Select Query JSON for nested value when keys are dynamic

I have a JSON object in S3 which follows this structure:
<code> : {
<client>: <value>
}
For example,
{
"code_abc": {
"client_1": 1,
"client_2": 10
},
"code_def": {
"client_2": 40,
"client_3": 50,
"client_5": 100
},
...
}
I am trying to retrieve the numerical value with an S3 Select query, where the "code" and the "client" are populated dynamically with each query.
So far I have tried:
sql_exp = f"SELECT * from s3object[*][*] s where s.{proc}.{client_name} IS NOT NULL"
sql_exp = f"SELECT * from s3object s where s.{proc}[*].{client_name}[*] IS NOT NULL"
I also tried without the asterisk inside the square brackets, but nothing works. I get the following error (column X depends on the length of the query string):
ClientError: An error occurred (ParseUnexpectedToken) when calling the SelectObjectContent operation: Unexpected token found LITERAL:UNKNOWN at line 1, column X
Within the function defining the object, I have:
resp = s3.select_object_content(
    Bucket=<bucket>,
    Key=<filename>,
    ExpressionType="SQL",
    Expression=sql_exp,
    InputSerialization={'JSON': {"Type": "Document"}},
    OutputSerialization={"JSON": {}},
)
Is there something off in the way I define the object serialization? How can I fix the query so that I can retrieve the desired numerical value on the fly when I provide "code" and "client"?
I did some tinkering based on the documentation, and it works!
I need to access the single event in the EventStream (resp) as follows:
event_stream = resp['Payload']
# unpack successful query response
for event in event_stream:
    if "Records" in event:
        output_str = event["Records"]["Payload"].decode("utf-8")  # bytes to string
        output_dict = json.loads(output_str)  # string to dict
Now the correct SQL expression is:
sql_exp= f"SELECT s['{code}']['{client}'] FROM S3Object s"
where I have gotten (dynamically) my values for code and client beforehand.
For example, based on the dummy JSON structure above, if code = "code_abc" and client = "client_2", I want this S3 Select query to return the value 10.
The f-string resolves to sql_exp = "SELECT s['code_abc']['client_2'] FROM S3Object s", and when we read the response we retrieve output_dict = {'client_2': 10}. (I'm not sure whether there is a clean way to get the value by itself without the client key; this is what it looks like in the documentation as well.)
So, the final step is to retrieve value = output_dict['client_2'], which in our case is equal to 10.
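The steps above can be pulled together into two small helpers. This is only a sketch: build_expression and read_value are hypothetical names, fake_stream stands in for resp['Payload'] so the example runs without AWS access, and the single-quote check is an assumed precaution rather than something S3 Select requires. Because a Records payload may hold partial results, the sketch joins all chunks before parsing.

```python
import json


def build_expression(code, client):
    """Build the S3 Select expression for one code/client pair."""
    for key in (code, client):
        # Assumption: keys never legitimately contain a single quote.
        if "'" in key:
            raise ValueError(f"unsupported key: {key!r}")
    return f"SELECT s['{code}']['{client}'] FROM S3Object s"


def read_value(event_stream, client):
    """Join all Records payloads, parse the first JSON line, return the value."""
    chunks = [
        event["Records"]["Payload"]
        for event in event_stream
        if "Records" in event
    ]
    text = b"".join(chunks).decode("utf-8")
    first_line = text.strip().splitlines()[0]
    return json.loads(first_line)[client]


# Stand-in for resp['Payload']; a real stream would come from boto3.
fake_stream = [
    {"Records": {"Payload": b'{"client_'}},
    {"Records": {"Payload": b'2":10}\n'}},
    {"Stats": {}},
    {"End": {}},
]

sql_exp = build_expression("code_abc", "client_2")
value = read_value(fake_stream, "client_2")
```

With a real response, read_value(resp['Payload'], client) would take the place of the manual loop.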

FastAPI - How to set correct response_model to return n documents from MongoDB?

Working on a FastAPI project using a MongoDB backend (using motor-asyncio). I will caveat all of this by saying that I'm very new to both FastAPI and motor (previously just worked with PyMongo).
I'm working on an endpoint that is supposed to return n documents from Mongo, contingent on the params set. I am able to get this to work if I don't set a response model in the function definition, and simply do the following:
@ScoresRouter.get("/getScores")
# omitted further query params for simplicity. there are a few of them which then alter the query
# but this is the simplest version
async def get_scores(members: list[str] | None = Query(default=None)):
    c = coll.find({'membershipNumber' : {'$in' : members}}, {'_id' : 0, 'membershipNumber' : 1, 'score' : 1, 'workplace' : 1, 'created_at' : 1})
    out = []
    async for doc in c:
        out.append(doc)
    return out
But, I want to use the proper pydantic response_model syntax. So, I've defined the following:
class ScoreOut(BaseModel):
    membershipNumber : str
    score : float | None
    historic_data : list | None
    created_at : datetime | None
    region : str | None
    district : str | None
    branch : str | None
    workplace : str | None

class ScoresOut(BaseModel):
    result : List[ScoreOut]
This is in accordance with what the data looks like in my target DB collection, which is this (this is copied from the mongo shell, not python):
mydb:PRIMARY> db.my_coll.findOne();
{
"_id" : ObjectId("1234"),
"membershipNumber" : "M123456"
"score" : 8.3,
"workplace" : "W0943294",
"created_at" : ISODate("2022-07-09T23:00:04.070Z"),
"end_date" : ISODate("2022-07-09T00:00:00Z"),
"historical_data" : [
{
"score" : 0,
"created_at" : ISODate("2022-05-10T16:50:19.136Z"),
"end_date" : ISODate("2020-01-08T00:00:00Z")
},
{
"score" : 0,
"end_date" : ISODate("2020-01-15T00:00:00Z"),
"created_at" : ISODate("2022-05-10T16:55:21.644Z")
}
]
}
Now, I change the route/function definition as follows:
async def get_scores(members: list[str] | None = Query(default=None),
                     response_model=ScoresOut,
                     response_model_exclude_unset=True):
    c = coll.find({'membershipNumber' : {'$in' : members}}, {'_id' : 0, 'membershipNumber' : 1, 'score' : 1, 'workplace' : 1, 'created_at' : 1})
    out = []
    async for doc in c:
        out.append(doc)
    return out
And it no longer works. On the swagger-GUI I get a rather uninformative Internal Server Error, but in my terminal I'm getting this error:
pydantic.error_wrappers.ValidationError: 1 validation error for ScoresOut
response
value is not a valid dict (type=type_error.dict)
I imagine I somehow have to tell my function to wrap out in the ScoresOut response model, although a lot of tutorials I've seen do not do this step: they simply output an object which appears to match the response_model they've defined, and it somehow just works.
I wonder if this has something to do with Mongo's rather difficult bson datatypes, and converting them to something FastAPI/pydantic understand? I doubt it though, because if I don't define a response-model and simply return the out object, it works, and it looks as it would if I'd print the list of dicts in Python.
Any help with this would be hugely appreciated.
Since you're returning a list and not a single object, you'll have to tell FastAPI that it can expect a list of objects compatible with ScoreOut to be returned:
@ScoresRouter.get("/getScores", response_model=List[ScoreOut])
(side note: generally I recommend using just /scores as the path, as GET is implied by the HTTP method you're using)
If your mongodb library returns objects (instead of dictionaries), you'll need to configure your model to load its values from property lookups as well (.foo):
class ScoresOut(BaseModel):
    result : List[ScoreOut]

    class Config:
        orm_mode = True
And as long as your MongoDB library returns a plain iterable, you don't have to create the list and iterate over it yourself. Note that response_model and response_model_exclude_unset are arguments to the route decorator, not parameters of the function:
@ScoresRouter.get("/getScores", response_model=List[ScoreOut], response_model_exclude_unset=True)
async def get_scores(members: list[str] | None = Query(default=None)):
    return coll.find({'membershipNumber' : {'$in' : members}}, {'_id' : 0, 'membershipNumber' : 1, 'score' : 1, 'workplace' : 1, 'created_at' : 1})
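The question consumes the Motor cursor with async for, which suggests it is an async iterable rather than a plain one, so it probably has to be collected into a list before it is returned. A minimal sketch of that step, assuming a hypothetical fake_cursor async generator in place of coll.find(...) and made-up sample documents. The second helper wraps the list under a result key, which is the dict shape a model like ScoresOut validates against.

```python
import asyncio


async def fake_cursor():
    """Hypothetical stand-in for the async cursor returned by coll.find(...)."""
    for doc in (
        {"membershipNumber": "M123456", "score": 8.3},
        {"membershipNumber": "M654321", "score": 7.1},
    ):
        yield doc


async def collect(cursor):
    """Drain an async iterable into a plain list of documents."""
    return [doc async for doc in cursor]


async def collect_wrapped(cursor):
    """Same as collect(), but wrapped as {'result': [...]}."""
    return {"result": await collect(cursor)}


docs = asyncio.run(collect(fake_cursor()))
wrapped = asyncio.run(collect_wrapped(fake_cursor()))
```

Returning the bare list would pair with a list-typed response model, while the wrapped dict would pair with a single wrapper model.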

How to print a value in Python that is inside a Json in MongoDB object

I have this Json, and I need a value from a rate.
"_id" : ObjectId("5addb57d0043582d48ba898a"),
"success" : true,
"timestamp" : 1524477784,
"base" : "EUR",
"date" : "2018-04-23",
"rates" : {
"AED" : 4.492662,
"AFN" : 85.576329,
"ALL" : 128.39508,
"AMD" : 586.837094,
"ANG" : 2.177608,
"AOA" : 267.092358,
"ARS" : 24.678283,
"AUD" : 1.602032,
"AWG" : 2.177639,
"AZN" : 2.079155,
"BAM" : 1.958775,
"BBD" : 2.446786,
"BDT" : 101.517146,
"BGN" : 1.943843,
"BHD" : 0.460968,
"NOK" : 9.626194,
}
And this is my Python code
import pymongo
uri = "mongodb://127.0.0.1:27017"
client = pymongo.MongoClient(uri)
database = client['db']
collection = database['currency']
collection2 = database['countries']
p = str(input('Insert the country: ')).capitalize()
if p=='Norway':
    currency = collection.find_one({'NOK'})
    print(currency)
I want to print the value inside the NOK key. How can I do it?
Thanks in advance.
I think you can get it with:
currency = collection.find_one({"rates.NOK": {"$exists": True}})
print(currency['rates']['NOK'])
What you're trying to do is fetch a value from a dictionary that is nested inside another dictionary. PyMongo returns documents as dictionaries.
Note that the collection variable is the collection itself, not a document, so fetch the document first with find_one().
One way of fetching this data is to index the dictionary with the key you want. Because the "NOK" key sits under the "rates" key, look up "rates" first and then look up "NOK" in the dictionary that comes back, so the code looks like this:
document = collection.find_one()
currency = document["rates"]["NOK"]
Another way of doing this is with a for loop. It is a lot longer but may help you understand.
for key in document:
    if key == "rates":
        for keyRates in document[key]:
            if keyRates == "NOK":
                currency = document[key][keyRates]
Hope this helps
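The nested lookup can be tried without a database. In this sketch the document dict is a trimmed stand-in for what find_one() would return, and get_rate is a hypothetical helper that also covers the cases where no document matched or the currency code is absent.

```python
# Stand-in for the dict that collection.find_one() would return.
document = {
    "base": "EUR",
    "date": "2018-04-23",
    "rates": {"AED": 4.492662, "NOK": 9.626194},
}


def get_rate(doc, code):
    """Return the rate for a currency code, or None if it is absent."""
    if doc is None:  # find_one() returns None when nothing matches
        return None
    return doc.get("rates", {}).get(code)


nok = get_rate(document, "NOK")
missing = get_rate(document, "XXX")
print(nok)
```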

Smartsheet API change? Bad Request Error related to Index format type

I have a piece of software that utilizes the Smartsheet API (specifically the Python SDK). I recently realized that the Smartsheet connectivity of the software has been broken since mid-January and as far as I can tell it’s due to a change on the Smartsheet side of things. I even rolled back to a version I know to have worked previously (the version that resulted from help in this related post).
Here is the code I use to access the sheet:
# Now query the Smartsheet server
try:
    # Load entire sheet
    sheet = ss.Sheets.get_sheet(sheet_id, include=['format'],
                                page_size=500)
    logging.info("Loaded "+str(len(sheet.rows))+
                 " rows from sheet: "+sheet.name)
    column_map = {}
    # Build column map for later reference; translates
    # column names to column id
    for column in sheet.columns:
        column_map[column.title] = column.id

    # Helper function to find a cell in a row of a Smartsheet object
    def get_cell_by_column_name(row, column_name):
        column_id = column_map[column_name]
        return row.get_column(column_id)

    # Find first row that hasn't already been written to
    newrowindex = [row.row_number
                   for row in sheet.rows
                   if get_cell_by_column_name(row,
                       'Column4').display_value is not None][-1]
    # Build the row to update
    newRow = ss.models.Row()
    newRow.id = sheet.rows[newrowindex].id
    for col in sheet.columns:
        newCell = ss.models.Cell()
        newCell.column_id = col.id
        newCell.value = df[sheet.rows[2].get_column(
            col.id).display_value
            ].loc[1]
        oldformat = sheet.rows[newrowindex].get_column(
            col.id).format
        if oldformat is None:
            oldformat = ',,,,,,,,,,,,,,,'
        # Drop the last three format fields and append the new ones
        newformat = oldformat.split(',')[:-3]
        newformat = "".join([ii+',' for ii in
                             newformat])[:-1]
        newformat = (newformat + df_fmts[
            sheet.rows[2].get_column(
                col.id).display_value
            ].loc[1])
        newCell._format = newformat
        newRow.cells.append(newCell)
    newRow.cells[0].value = sheet.rows[newrowindex].cells[0].value
    newRow.cells[2].value = fieldinputs.projectid
    result = ss.Sheets.update_rows(sheet_id,[newRow])
    smshtexportButton.button_type = "success"
    smshtexportButton.label = "Successfully Published!"
Here is the resulting error log:
Request: {
command: GET https://api.smartsheet.com/2.0/sheets/7120922902587268?include=format&pageSize=500&page=1
}
2018-02-16 09:53:00,378 Loaded 205 rows from sheet: Bulk Update
2018-02-16 09:53:01,126 Request: {
command: PUT https://api.smartsheet.com/2.0/sheets/7120922902587268/rows
}
2018-02-16 09:53:01,134 Response: {
status: 400 Bad Request
content: {
{
"detail": {
"index": 0,
"rowId": 6710889537660804
},
"errorCode": 1008,
"message": "Unable to parse request. The following error occurred: Index value '11' is invalid for format type DECIMAL_COUNT",
"refId": "1upixmmo45bp3"
}
}
The last successful update the software made to the 'Bulk Update' sheet was 1/16/18 at 3:37pm PST. I initially suspected that I had made changes to the df and df_fmts dataframes in the code above, but upon reverting back to the same version of the code used on 1/16, I still get the Bad Request error.
Any help would be greatly appreciated.
In the Smartsheet app, the maximum number of decimal places you can set is 5. Anything greater than that will return an error like the one you are seeing. According to the error, you are setting the decimal places to 11, which is unsupported.
I suggest looking at your code that sets the format and how it is getting to 11. Then make sure the decimal place never goes above 5.
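One way to enforce that limit is to clamp the relevant field of the comma-separated format string before sending the update. A minimal sketch: clamp_format_field is a hypothetical helper, and which position in the string holds the decimal count is an assumption that should be checked against the Smartsheet format documentation.

```python
MAX_DECIMAL_PLACES = 5  # limit quoted in the answer above


def clamp_format_field(fmt, position, maximum=MAX_DECIMAL_PLACES):
    """Cap the numeric field at `position` in a comma-separated format string."""
    fields = fmt.split(',')
    # Empty or non-numeric fields are left untouched.
    if position < len(fields) and fields[position].isdigit():
        fields[position] = str(min(int(fields[position]), maximum))
    return ','.join(fields)


# Short made-up format string with the value 11 in position 2.
clamped = clamp_format_field(',,11,', 2)
```

Logging the format string just before update_rows would also show how the value 11 ends up in that field.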

Mongoengine, retrieving only some of a MapField

For example, in MongoDB:
> db.test.findOne({}, {'mapField.FREE':1})
{
"_id" : ObjectId("4fb7b248c450190a2000006a"),
"mapField" : {
"BOXFLUX" : {
"a" : "f",
}
}
}
The 'mapField' field is a MongoEngine MapField.
It holds a lot of keys and data, but here I retrieved only 'BOXFLUX'.
The equivalent query does not work in MongoEngine.
For example:
BoxfluxDocument.objects( ~~ querying ~~ ).only('mapField.BOXFLUX')
As you can see, neither only('mapField.BOXFLUX') nor only('mapField__BOXFLUX') works.
Both retrieve all of the 'mapField' data, including 'BOXFLUX'.
How can I retrieve only one key of a MapField?
I see there is a ticket for this: https://github.com/hmarr/mongoengine/issues/508
It works for me. Here's an example test case:
def test_only_with_mapfields(self):
    class BlogPost(Document):
        content = StringField()
        author = MapField(field=StringField())

    BlogPost.drop_collection()
    post = BlogPost(content='Had a good coffee today...',
                    author={'name': "Ross", "age": "20"}).save()
    obj = BlogPost.objects.only('author__name',).get()
    self.assertEqual(obj.author['name'], "Ross")
    self.assertEqual(obj.author.get("age", None), None)
Try this:
query = BlogPost.objects({your: query})
if name:
    query = query.only('author__'+name)
else:
    query = query.only('author')
I found my mistake! I called only() twice.
For example:
BlogPost.objects.only('author').only('author__name')
I spent a whole day trying to find out what was wrong with MongoEngine.
So my (wrong) conclusion was to use:
BlogPost.objects()._collection.find_one(~~ filtering query ~~, {'author.'+ name:1})
But as you know, that returns just raw data, not a MongoEngine query.
After this code, I cannot run any MongoEngine methods.
In my case, I have to build the query depending on some conditions.
So, in my humble opinion, it would be great if a later only() call overrode the only() calls made before it.
I hope this feature gets integrated in the next version. Right now, I have to write duplicated code.
Not this code:
query = BlogPost.objects()
query( query~~).only('author')
if name:
    query = query.only('author__'+name)
But this code:
query = BlogPost.objects()
query( query~~).only('author')
if name:
    query = BlogPost.objects().only('author__'+name)
So I think the second one looks dirtier than the first.
Of course, the first version returns all of the author data, because only('author') applies instead of only('author__name').
