IN Query not working for Amazon DynamoDB - python

I would like to retrieve items whose attribute value is present in a list of values I provide. Below is the query I use for searching. Unfortunately, the response returns an empty list of items. I don't understand why this is the case and would like to know the correct query.
def search(self, src_words, translations):
    entries = []
    query_src_words = [word.decode("utf-8") for word in src_words]
    params = {
        "TableName": self.table,
        "FilterExpression": "src_word IN (:src_words) AND src_language = :src_language AND target_language = :target_language",
        "ExpressionAttributeValues": {
            ":src_words": {"SS": query_src_words},
            ":src_language": {"S": config["source_language"]},
            ":target_language": {"S": config["target_language"]}
        }
    }
    page_iterator = self.paginator.paginate(**params)
    for page in page_iterator:
        for entry in page["Items"]:
            entries.append(entry)
    return entries
Below is the table that I would like to query. For example, if query_src_words contains [soccer ball, dog], then only the row with entry_id=2 should be returned.
Any insights would be much appreciated.

I think this is because query_src_words contains "soccer_ball" (with an underscore), while the database has "soccer ball" (without an underscore).
Change "soccer_ball" to "soccer ball" in query_src_words and it should work fine.
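Separately from the underscore mismatch, it may be worth checking how the IN operand is passed. As far as I know, DynamoDB's IN operator compares the attribute against a comma-separated list of individual values (up to 100), so binding one string set (SS) to a single placeholder compares a string with a set and is unlikely to match. A minimal sketch, assuming the same low-level attribute-value format as in the question; build_in_filter and the :w0, :w1 placeholder names are hypothetical and not part of boto3:

```python
def build_in_filter(attribute, values, prefix=":w"):
    """Return (expression, attribute_values) with one placeholder per value.

    Hypothetical helper: it only builds strings and dicts, it does not
    call DynamoDB.
    """
    placeholders = [f"{prefix}{i}" for i in range(len(values))]
    expression = f"{attribute} IN ({', '.join(placeholders)})"
    attribute_values = {ph: {"S": v} for ph, v in zip(placeholders, values)}
    return expression, attribute_values


expression, attr_values = build_in_filter("src_word", ["soccer ball", "dog"])

# Combine with the other conditions from the question.
filter_expression = (
    expression
    + " AND src_language = :src_language AND target_language = :target_language"
)
print(filter_expression)
print(attr_values)
```

The resulting attr_values would then be merged into ExpressionAttributeValues together with :src_language and :target_language before calling paginate.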

Related

S3 Select Query JSON for nested value when keys are dynamic

I have a JSON object in S3 which follows this structure:
<code> : {
<client>: <value>
}
For example,
{
    "code_abc": {
        "client_1": 1,
        "client_2": 10
    },
    "code_def": {
        "client_2": 40,
        "client_3": 50,
        "client_5": 100
    },
    ...
}
I am trying to retrieve the numerical value with an S3 Select query, where the "code" and the "client" are populated dynamically with each query.
So far I have tried:
sql_exp = f"SELECT * from s3object[*][*] s where s.{proc}.{client_name} IS NOT NULL"
sql_exp = f"SELECT * from s3object s where s.{proc}[*].{client_name}[*] IS NOT NULL"
I also tried both without the asterisk inside the square brackets, but nothing works. I get:
ClientError: An error occurred (ParseUnexpectedToken) when calling the SelectObjectContent operation: Unexpected token found LITERAL:UNKNOWN at line 1, column X (depending on the length of the query string)
Within the function defining the object, I have:
resp = s3.select_object_content(
    Bucket=<bucket>,
    Key=<filename>,
    ExpressionType="SQL",
    Expression=sql_exp,
    InputSerialization={'JSON': {"Type": "Document"}},
    OutputSerialization={"JSON": {}},
)
Is there something off in the way I define the object serialization? How can I fix the query so I can retrieve the desired numerical value on the fly when I provide "code" and "client"?
I did some tinkering based on the documentation, and it works!
I need to access the single event in the EventStream (resp) as follows:
event_stream = resp['Payload']
# unpack successful query response
for event in event_stream:
    if "Records" in event:
        output_str = event["Records"]["Payload"].decode("utf-8")  # bytes to string
        output_dict = json.loads(output_str)  # string to dict
Now the correct SQL expression is:
sql_exp= f"SELECT s['{code}']['{client}'] FROM S3Object s"
where I have gotten (dynamically) my values for code and client beforehand.
For example, based on the dummy JSON structure above, if code = "code_abc" and client = "client_2", I want this S3 Select query to return the value 10.
The f-string resolves to sql_exp = "SELECT s['code_abc']['client_2'] FROM S3Object s", and when we read the event stream in resp, we get output_dict = {'client_2': 10}. (I am not sure whether there is a clean way to get the value by itself without the client key; this is how it looks in the documentation as well.)
So, the final step is to retrieve value = output_dict['client_2'], which in our case is equal to 10.
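The steps above can be put together as a small sketch. It assumes the record may arrive split across several Records events, so it joins the payload bytes before decoding. build_expression, read_select_value and fake_stream are hypothetical names, and fake_stream is a hand-written stand-in for resp['Payload'] so the example runs without AWS access:

```python
import json


def build_expression(code, client):
    # Bracket notation with quoted keys, as in the working query above.
    # Keys containing a single quote would need escaping first.
    return f"SELECT s['{code}']['{client}'] FROM S3Object s"


def read_select_value(event_stream, client):
    # Join every Records payload before decoding, in case the response
    # is delivered in more than one chunk.
    chunks = [e["Records"]["Payload"] for e in event_stream if "Records" in e]
    output_str = b"".join(chunks).decode("utf-8")  # bytes to string
    output_dict = json.loads(output_str)           # string to dict
    return output_dict[client]


# Stand-in for resp["Payload"]; the event shapes here are illustrative only.
fake_stream = [
    {"Records": {"Payload": b'{"client_2"'}},
    {"Records": {"Payload": b":10}\n"}},
    {"Stats": {}},
    {"End": {}},
]

sql_exp = build_expression("code_abc", "client_2")
value = read_select_value(fake_stream, "client_2")
print(sql_exp)
print(value)
```

With a real response, resp['Payload'] would be passed in place of fake_stream.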

pymongo query for all items containing a unique identifier

I have a mongo collection with data structured in the following way:
content: {'description': {'text': [{'_date': '2019-05-21', '_sectionId': 'a13a', '_objectId': 'f637cee'},
                                   {'_date': '2019-05-21', '_objectId': '8b2ed183', '_source': 'f637cee'},
                                   { etc....},
                                   {'_date': '2019-05-21', '_sectionId': 'a13a', '_objectId': 'XXXcee'}]
                         },
          'client': {.....},
         }
I am looking for a way to query the collection and get a list of tuples:
given a section ID, I would like to get the corresponding '_objectId' values.
In this case the result would be:
('a13a','f637cee'), ('a13a','XXXcee')
I started to do something like this:
import pymongo
myclient = pymongo.MongoClient(mongoconnection)
print('databases names:')
myclient.list_database_names()
# getting the collection:
mydb = myclient["clients"]
query = {'content.description.text._sectionId': 'a13a'}
cur = mydb.find(query)
But I don't know how to extract the information from the cursor.
Some help?
Note the info might be nested in different places, i.e. there are more nodes preceding "content" that can vary.
Thanks a lot
Use the second parameter of find() (the projection) to return only the required fields.
Example:
query = {'content.description.text._sectionId': 'a13a'}
cur = mydb.find(query, { "_id": 0, "_sectionId": 1, "_objectId": 1 })
print([tuple(i.values()) for i in cur])
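Note that find() returns whole matching documents rather than individual array elements, and the fields of interest are nested under content.description.text, so the tuples may still need to be assembled on the client side. A minimal sketch, assuming each document has the shape shown in the question; section_object_pairs is a hypothetical helper and docs is a plain list standing in for the cursor:

```python
def section_object_pairs(documents, section_id):
    """Collect (_sectionId, _objectId) tuples from each document's text list."""
    pairs = []
    for doc in documents:
        # Walk down to the nested list, tolerating missing levels.
        entries = doc.get("content", {}).get("description", {}).get("text", [])
        for entry in entries:
            if entry.get("_sectionId") == section_id:
                pairs.append((entry["_sectionId"], entry.get("_objectId")))
    return pairs


# Stand-in for the cursor returned by find(query).
docs = [{"content": {"description": {"text": [
    {"_date": "2019-05-21", "_sectionId": "a13a", "_objectId": "f637cee"},
    {"_date": "2019-05-21", "_objectId": "8b2ed183", "_source": "f637cee"},
    {"_date": "2019-05-21", "_sectionId": "a13a", "_objectId": "XXXcee"},
]}}}]

pairs = section_object_pairs(docs, "a13a")
print(pairs)  # [('a13a', 'f637cee'), ('a13a', 'XXXcee')]
```

With a live connection, the cursor itself could be passed in place of docs. If the nodes preceding "content" vary, the path lookup would need to be adapted accordingly.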

Access one item from a dict and store it into a variable

I am trying to get all the "uuid" values from an API, and the issue is that they are stored in a dict (I think). Here is how the API response looks:
{"guild": {
    "_id": "5eba1c5f8ea8c960a61f38ed",
    "name": "Creators Club",
    "name_lower": "creators club",
    "coins": 0,
    "coinsEver": 0,
    "created": 1589255263630,
    "members": [
        {
            "uuid": "db03ceff87ad4909bababc0e2622aaf8",
            "rank": "Guild Master",
            "joined": 1589255263630,
            "expHistory": {
                "2020-06-01": 280,
                "2020-05-31": 4701,
                "2020-05-30": 0,
                "2020-05-29": 518,
                "2020-05-28": 1055,
                "2020-05-27": 136665,
                "2020-05-26": 34806
            }
        }
    ]
}
}
Now I am interested in the "uuid" part. Note that there can be multiple players, anywhere from 1 to 100, and I need every UUID.
This is what I have done in Python so far to get the UUIDs displayed on the website:
try:
    f = requests.get(
        "https://api.hypixel.net/guild?key=[secret]&id=" + guild).json()
    guildName = f["guild"]["name"]
    guildMembers = f["guild"]["members"]
    members = client.getPlayer(uuid=guildMembers)  # this converts UUID to player names
    # I need to store all uuid's in variables and put them at "guildMembers"
That gives me all the UUID entries, and I will use client.getPlayer(uuid=---) to convert each UUID into a player name, so I have to loop over every UUID and pass it to client.getPlayer(uuid=---). But first I need to save the UUIDs in variables. In my HTML file I have been using members.uuid to access the UUID, but I don't know how to do the .uuid part in Python.
If you need anything else, just comment :)
List comprehension is a powerful concept:
members = [client.getPlayer(member['uuid']) for member in guildMembers]
Edit:
If you want to insert the names back into your data (in guildMembers),
use a dictionary comprehension with {uuid: member_name,} format:
members = {member['uuid']: client.getPlayer(uuid=member['uuid']) for member in guildMembers}
Then you can update guildMembers with your results (each member is itself a dict, so assign to it directly):
for member in guildMembers:
    member['name'] = members[member['uuid']]
Assuming that guild is the main dictionary in which a key called members exists with a list of "sub dictionaries", you can try
uuid = list()
for x in guild['members']:
    uuid.append(x['uuid'])
uuid now has all the uuids
If I understood the situation correctly, you just need to loop through all the received UUIDs and get each player's data. Something like this:
f = requests.get("https://api.hypixel.net/guild?key=[secret]&id=" + guild).json()
guildName = f["guild"]["name"]
guildMembers = f["guild"]["members"]
guildMembersData = dict()  # Here we will save each member's data from the getPlayer method
for guildMember in guildMembers:
    uuid = guildMember["uuid"]
    memberData = client.getPlayer(uuid=uuid)
    guildMembersData[uuid] = memberData  # reuse the result instead of calling getPlayer twice
print(guildMembersData)  # Here will be the players' data.

How to save None/Null if Index out of Range (Python/Django)

I am trying to loop through a list of objects, pull out their attributes and add them to a dictionary. In this list of objects some of the data was previously populated, but sometimes there will be null or blank values. When the loop runs into a blank value it throws an "Index out of Range" error.
obj = Idea.objects.get(name=idea_name)
new_obj = []
plan = ProductPlan.objects.all()
for product in plan:
    answers = product.question.answer_set.filter(idea=obj.id)
    new_plan = {"title": product.title, "answer": answers[0]}
    print new_plan
    new_obj.append(new_plan)
return render(request, 'idea.html', {"new_obj": new_obj, "obj":obj})
If there is no item at that index, how do I just store it as empty?
answers = product.question.answer_set.filter(idea=obj.id)
answer = answers[0] if answers.exists() else None
new_plan = {"title": product.title, "answer": answer}
In case you don't know, exists() efficiently tests whether a queryset is empty or not. Check the Django documentation for more info.
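The same "first item or None" idea can be sketched without Django. Django's QuerySet.first() also returns None when nothing matches, which may be the simpler option on a real queryset. The first_or_none helper and the plans data below are hypothetical and only illustrate the pattern on plain lists:

```python
def first_or_none(items):
    """Return the first element of any iterable, or None if it is empty."""
    return next(iter(items), None)


# Plain-Python stand-ins for the product plans and their answers.
plans = [
    {"title": "Pricing", "answers": ["Freemium"]},
    {"title": "Marketing", "answers": []},
]

new_obj = [
    {"title": plan["title"], "answer": first_or_none(plan["answers"])}
    for plan in plans
]
print(new_obj)
```

The second entry ends up with "answer": None instead of raising an IndexError.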

Mongoengine, retrieving only some of a MapField

For example, in MongoDB:
> db.test.findOne({}, {'mapField.FREE':1})
{
    "_id" : ObjectId("4fb7b248c450190a2000006a"),
    "mapField" : {
        "BOXFLUX" : {
            "a" : "f",
        }
    }
}
The 'mapField' field is a Mongoengine MapField.
It has a lot of keys and data, but here I retrieved only 'BOXFLUX'.
The equivalent query does not work in MongoEngine.
For example:
BoxfluxDocument.objects( ~~ querying ~~ ).only('mapField.BOXFLUX')
As you can see, neither only('mapField.BOXFLUX') nor only('mapField__BOXFLUX') works.
It retrieves all of the 'mapField' data, including 'BOXFLUX'.
How can I retrieve only one field of a MapField?
I see there is a ticket for this: https://github.com/hmarr/mongoengine/issues/508
It works for me. Here's an example test case:
def test_only_with_mapfields(self):
    class BlogPost(Document):
        content = StringField()
        author = MapField(field=StringField())
    BlogPost.drop_collection()
    post = BlogPost(content='Had a good coffee today...',
                    author={'name': "Ross", "age": "20"}).save()
    obj = BlogPost.objects.only('author__name',).get()
    self.assertEqual(obj.author['name'], "Ross")
    self.assertEqual(obj.author.get("age", None), None)
Try this:
query = BlogPost.objects({your: query})
if name:
    query = query.only('author__'+name)
else:
    query = query.only('author')
I found my mistake! I called only() twice.
For example:
BlogPost.objects.only('author').only('author__name')
I spent a whole day trying to find out what was wrong with Mongoengine.
So my wrong conclusion was to query the raw collection:
BlogPost.objects()._collection.find_one(~~ filtering query ~~, {'author.'+ name:1})
But as you know, that returns just raw data, not a mongoengine query.
After this code, I cannot run any mongoengine methods on the result.
In my case, I have to build the query depending on some conditions,
so in my humble opinion it would be great if a later only() call overrode the earlier ones.
I hope this feature will be integrated in the next version. Right now, I have to write duplicate code.
Not this code:
query = BlogPost.objects()
query( query~~).only('author')
if name:
    query = query.only('author__'+name)
But this code:
query = BlogPost.objects()
query( query~~).only('author')
if name:
    query = BlogPost.objects().only('author__'+name)
I think the second one looks dirtier than the first.
Of course, the first version returns all the data, as if it used only('author') rather than only('author__name').
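One way to avoid both the duplicated query and the chained only() calls might be to decide the field path first and then call only() a single time. A minimal sketch; only_field is a hypothetical helper, and the mongoengine call appears only in a comment because it needs the BlogPost document and a database connection:

```python
def only_field(base, key=None):
    """Return the field path to pass to a single only() call."""
    return f"{base}__{key}" if key else base


# Intended use, assuming the BlogPost document from the test case above:
#     query = BlogPost.objects(...).only(only_field("author", name))
with_key = only_field("author", "name")
without_key = only_field("author")
print(with_key)
print(without_key)
```

This keeps the conditional logic in plain Python, so the queryset is built once regardless of whether name is set.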
