How to query whether a string field contains one of an array's items? - python

I have documents like this:
{
name: '...'
}
I want to query for documents whose names contain one of:
cities = ['a', 'b', 'c']
Of course it's easy to check for exact match like this:
col_areas = db['areas']
col_areas.find({'name': {'$in': cities}})
I want to use $regex with each item of cities. How do I do that?
I have also tried:
cities_query = []
for c in cities:
    cities_query.append('/^%s/' % c)
results = col_areas.find({'name': {'$in': cities_query}})

Maybe there is a better way.
Sample
db.col_areas.aggregate([
    {
        $project: {
            'name': 1
        }
    },
    {
        $match: {
            $or: [
                { 'name': { $regex: 'a' } }
            ]
        }
    }
])
Add one $or entry per city and adjust the pattern as needed.
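For the Python side of the question, here is a minimal sketch of building such a filter from the cities list. It only constructs the query dictionaries; the helper names contains_any_query and contains_any_in_query are made up for illustration, and the find call is shown as a comment because it needs a live connection.

```python
import re

cities = ['a', 'b', 'c']


def contains_any_query(field, items):
    """Build a filter matching documents whose `field` contains any item."""
    # re.escape keeps characters such as '.' or '(' from being read as regex syntax.
    pattern = '|'.join(re.escape(item) for item in items)
    return {field: {'$regex': pattern}}


def contains_any_in_query(field, items):
    """Alternative shape: $in with one compiled pattern per item."""
    return {field: {'$in': [re.compile(re.escape(item)) for item in items]}}


query = contains_any_query('name', cities)
# With a live collection this would be: results = col_areas.find(query)
```

Note that an empty items list produces an empty pattern, which matches every string, so that case is probably worth guarding against before querying.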

Related

need to turn JSON values into keys

I have some json that I would like to transform from this:
[
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
...
{
"name":"fieldN",
"intValue":"N"
}
]
into this:
{ "field1" : "1",
"field2" : "2",
...
"fieldN" : "N",
}
For each pair, I need to change the value of the name field to a key, and the values of the intValue field to a value. This doesn't seem like flattening or denormalizing. Are there any tools that might do this out-of-the-box, or will this have to be brute-forced? What's the most pythonic way to accomplish this?
parameters = [ # assuming this is loaded already
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
{
"name":"fieldN",
"intValue":"N"
}
]
field_int_map = dict()
for p in parameters:
    field_int_map[p['name']] = p['intValue']
yields {'field1': '1', 'field2': '2', 'fieldN': 'N'}
or as a dict comprehension
field_int_map = {p['name']:p['intValue'] for p in parameters}
This works to combine the name attribute with the intValue as key:value pairs, but the result is a dictionary instead of the original input type which was a list.
Use dictionary comprehension:
json_dct = {"parameters":
[
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
{
"name":"fieldN",
"intValue":"N"
}
]}
dct = {d["name"]: d["intValue"] for d in json_dct["parameters"]}
print(dct)
# {'field1': '1', 'field2': '2', 'fieldN': 'N'}

Elegant way of iterating list of dict python

I have a list of dictionaries as below. I need to iterate over the list and replace the contents of parameters in each entry of sections with an empty dictionary.
input = [
{
"category":"Configuration",
"sections":[
{
"section_name":"Global",
"parameters":{
"name":"first",
"age":"second"
}
},
{
"section_name":"Operator",
"parameters":{
"adrress":"first",
"city":"first"
}
}
]
},
{
"category":"Module",
"sections":[
{
"section_name":"Global",
"parameters":{
"name":"first",
"age":"second"
}
}
]
}
]
Expected Output:
[
{
"category":"Configuration",
"sections":[
{
"section_name":"Global",
"parameters":{}
},
{
"section_name":"Operator",
"parameters":{}
}
]
},
{
"category":"Module",
"sections":[
{
"section_name":"Global",
"parameters":{}
}
]
}
]
My current code looks like below:
category_list = []
for categories in input:
    sections_list = []
    category_name_dict = {"category": categories["category"]}
    for sections_dict in categories["sections"]:
        section = {}
        section["section_name"] = sections_dict['section_name']
        section["parameters"] = {}
        sections_list.append(section)
    category_name_dict["sections"] = sections_list
    category_list.append(category_name_dict)
Is there a more elegant and more performant way to compute this? Keys such as category, sections, section_name, and parameters are constants.
The easier way is not to rebuild the dictionaries without the parameters, but to reset parameters in place in every section:
for value in input:
    for section in value['sections']:
        section['parameters'] = {}
Elegance is in the eye of the beholder, but rather than creating empty lists and dictionaries and then filling them, why not do it in one go with a list comprehension?
category_list = [
    {
        **category,
        "sections": [
            {
                **section,
                "parameters": {},
            }
            for section in category["sections"]
        ],
    }
    for category in input
]
This is more efficient and (in my opinion) makes it clearer that the intention is to change a single key.
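One practical difference between the two answers is whether the original list is modified. The sketch below, using a shortened sample and the made-up function names clear_in_place and cleared_copy, shows that the comprehension builds new dictionaries while the loop mutates the existing ones.

```python
import copy

data = [
    {
        "category": "Module",
        "sections": [
            {"section_name": "Global",
             "parameters": {"name": "first", "age": "second"}},
        ],
    },
]


def clear_in_place(categories):
    """Reset parameters on the existing section dictionaries."""
    for category in categories:
        for section in category["sections"]:
            section["parameters"] = {}
    return categories


def cleared_copy(categories):
    """Build new category and section dictionaries with empty parameters."""
    return [
        {
            **category,
            "sections": [
                {**section, "parameters": {}}
                for section in category["sections"]
            ],
        }
        for category in categories
    ]


original = copy.deepcopy(data)
rebuilt = cleared_copy(data)
unchanged_after_copy = (data == original)  # the input still has its parameters
clear_in_place(data)                       # now the input itself is cleared
```

If other code still needs the original parameters, the comprehension is the safer choice; if not, the in-place loop avoids allocating new dictionaries.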

Avoid iterating too much time - Algorithm construction

I have a list - memory_per_instance - which looks like the following:
[
{
'mem_used': '14868480',
'rsrc_name': 'node-5b5cf484-g582f'
},
{
'mem_used': '106618880',
'rsrc_name': 'infrastructure-656cf59bbb-xc6bb'
},
{
'mem_used': '27566080',
'rsrc_name': 'infrastructuret-l6fl'
},
{
'mem_used': '215556096',
'rsrc_name': 'node-62lnc'
}
]
Here we can see that there are two resource groups, node and infrastructure.
I would like to create an array whose entries contain the name of the resource group (node or infrastructure) and a mem_used equal to the sum of mem_used over that group.
I was already able to differentiate the two groups with a regex.
From here, how can I create an array - memory_per_group - with a result such as
[
{
'mem_used': '230424576',
'rsrc_name': 'node'
},
{
'mem_used': '134184960',
'rsrc_name': 'infrastructure'
},
]
I could store the name of the rsrc in a tmp variable, so something like:
memory_per_pod_group = []
for item in memory_per_pod_instance:
    tmp_rsrc = item['rsrc_name']
    if(item['rsrc_name'] == tmp_rsrc):
        memory_per_pod_group.append({'rsrc_name':get_group(tmp_rsrc, pod_hash_map), 'mem_used':mem_used})
        memory_per_pod_instance.remove(item)
pprint.pprint(memory_per_pod_group)
But then I would iterate through the list a non-negligible number of times.
Is there a more efficient way?
Well, sure. You only need one iteration:
data = [
{
'mem_used': '14868480',
'rsrc_name': 'node-5b5cf484-g582f'
},
{
'mem_used': '106618880',
'rsrc_name': 'infrastructure-656cf59bbb-xc6bb'
},
{
'mem_used': '27566080',
'rsrc_name': 'infrastructuret-l6fl'
},
{
'mem_used': '215556096',
'rsrc_name': 'node-62lnc'
}
]
def get_group(item):
    # The group name is everything before the first '-'
    rsrc_name = item['rsrc_name']
    index = rsrc_name.index('-')
    return rsrc_name[0:index]

def summary(items):
    # Sum mem_used per group in a single pass
    totals = {}
    for item in items:
        group = get_group(item)
        if group not in totals:
            totals[group] = 0
        totals[group] += int(item['mem_used'])
    result = []
    for rsrc_name, mem_used in totals.items():
        result.append({ 'rsrc_name': rsrc_name, 'mem_used': str(mem_used) })
    return result

if __name__ == '__main__':
    print(summary(data))
Result:
[{'rsrc_name': 'node', 'mem_used': '230424576'}, {'rsrc_name': 'infrastructure', 'mem_used': '106618880'}, {'rsrc_name': 'infrastructuret', 'mem_used': '27566080'}]
Note that get_group might be too simple for your use case: the result has three groups because one of the resources has the prefix 'infrastructuret', with a "t" at the end.
You could just iterate through it a single time, check each name with a simple startswith, and add directly to the dictionary key that you want.
Something like
memory_total = {'node': 0, 'infrastructure': 0}
for item in memory_per_instance:
    if item['rsrc_name'].startswith('node'):
        memory_total['node'] += int(item['mem_used'])
    elif item['rsrc_name'].startswith('infrastructure'):
        memory_total['infrastructure'] += int(item['mem_used'])
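The two ideas above (a single pass, plus prefix matching so that 'infrastructuret-l6fl' lands in the infrastructure group) can be combined. This is only a sketch: KNOWN_GROUPS, group_of, and memory_per_group are hypothetical names, and the prefix list is an assumption about which groups exist.

```python
from collections import defaultdict

# Assumed list of group prefixes; extend it to match the real resource names.
KNOWN_GROUPS = ("node", "infrastructure")

memory_per_instance = [
    {'mem_used': '14868480', 'rsrc_name': 'node-5b5cf484-g582f'},
    {'mem_used': '106618880', 'rsrc_name': 'infrastructure-656cf59bbb-xc6bb'},
    {'mem_used': '27566080', 'rsrc_name': 'infrastructuret-l6fl'},
    {'mem_used': '215556096', 'rsrc_name': 'node-62lnc'},
]


def group_of(rsrc_name):
    """Return the first known prefix the name starts with, else the name itself."""
    for prefix in KNOWN_GROUPS:
        if rsrc_name.startswith(prefix):
            return prefix
    return rsrc_name


def memory_per_group(instances):
    """Sum mem_used per group in one pass over the list."""
    totals = defaultdict(int)
    for item in instances:
        totals[group_of(item['rsrc_name'])] += int(item['mem_used'])
    # Convert back to the list-of-dicts shape, with mem_used as a string.
    return [{'mem_used': str(total), 'rsrc_name': name}
            for name, total in totals.items()]


result = memory_per_group(memory_per_instance)
```

With the sample data this gives the two groups from the question, with the totals 230424576 for node and 134184960 for infrastructure.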

using $split for twice in mongodb using python

Say, to keep it simple, that my document in MongoDB looks like this:
{'status': {'tat': 'a, b <b>, c, d <d>' } }
I want to separate them and print it like
{bbced_name : 'a'},
{bbced_name : 'b'},
{bbced_name : 'c'},
{bbced_name : 'd'},
Therefore I try to split the data twice. First I split the text on the comma separator, then I split again on the separator <:
#the first split
project = { "$project" : { "bcced_name" : {
"$split" :
["$status.tat", ", "]
}
}
}
unwind = {"$unwind" : "$bcced_name"}
#the second split
project2= {"$project" : { "bbced_name2" : {
"$split" :
["$cced_name", "<"]
}
}
}
unwind2 = {"unwind" : "$bbced2"}
cur = collection.aggregate([project, unwind, project2, unwind2])
Can I use $split twice in one pipeline? The first split works well, but the second doesn't.
You can use the aggregation below in MongoDB 3.4.
Use $split to create an array of string values, followed by $map to output a $substrCP value from the start of each string to the delimiter <.
Each substring's end position is calculated by iterating over the string with $range and $filter to find the location of the " <" string.
db.collection_name.aggregate(
[{"$project":
{"bcced_name":
{"$map":{
"input":{"$split":["$status.tat",", "]},
"as":"tat",
"in":{
"$cond":[
{"$eq":[{"$strLenCP":"$$tat"},1]},
"$$tat",
{
"$substrCP":[
"$$tat",
0,
{
"$arrayElemAt":[
{"$filter":{
"input":{"$range":[0,{"$strLenCP":"$$tat"},1]},
"as":"r",
"cond":{"$eq":[{"$substrCP":["$$tat","$$r",2]}," <"]}}
},
0]
}
]
}
]
}
}
}
}
},
{"$unwind": "$bcced_name"}
])
Update: (Use $indexOfCP)
db.collection_name.aggregate(
[{"$project":
{"bcced_name":
{"$map":{
"input":{"$split":["$status.tat",", "]},
"as":"tat",
"in":{
"$cond":[
{"$eq":[{"$strLenCP":"$$tat"},1]},
"$$tat",
{
"$substrCP":[
"$$tat",
0,
{ "$indexOfCP": [ "$$tat", " <" ] }
]
}
]
}
}
}
}
},
{"$unwind": "$bcced_name"}
])
{"$project":{
"bcced_name":
{"$map":{
"input":{"$split":["$status.tat",", "]},
"as":"tat",
"in":{
"$cond":[
{"$gt":[{"$indexOfCP":["$$tat","<"]},0]},
{"$arrayElemAt" : [{"$split":["$$tat", "<"]}, 0]},
"$$tat"
]
}
}
}
}
}
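Since the question uses Python, here is a sketch of how a double $split could be written as a pymongo pipeline. It relies on $split returning the whole string as a single-element array when the delimiter is absent, so no $cond is needed. The names build_pipeline and split_names are made up for illustration, and split_names is only a plain-Python mirror of the intended result, not something MongoDB runs.

```python
def build_pipeline():
    """Aggregation stages: split on ', ', then keep the text before ' <'."""
    return [
        {"$project": {
            "bcced_name": {
                "$map": {
                    "input": {"$split": ["$status.tat", ", "]},
                    "as": "tat",
                    # Second split: element 0 is the part before ' <',
                    # or the whole string when ' <' does not occur.
                    "in": {"$arrayElemAt": [{"$split": ["$$tat", " <"]}, 0]},
                }
            }
        }},
        {"$unwind": "$bcced_name"},
    ]


def split_names(tat):
    """Plain-Python equivalent of the two splits, for checking expectations."""
    return [part.split(" <")[0] for part in tat.split(", ")]


pipeline = build_pipeline()
# With a live collection this would be: cur = collection.aggregate(pipeline)
names = split_names('a, b <b>, c, d <d>')
```

Splitting on " <" (with the leading space) rather than "<" avoids a trailing space in values such as "b ".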

Logic for building converter using python dictionary values

I have a slice of JSON like this, loaded into a Python dictionary (size_dict):
{
"sizeOptionName":"XS",
"sizeOptionId":"1528",
"sortOrderNumber":"7017"
},
{
"sizeOptionName":"S",
"sizeOptionId":"1529",
"sortOrderNumber":"7047"
},
{
"sizeOptionName":"M",
"sizeOptionId":"1530",
"sortOrderNumber":"7095"
}
and I have products with size Id (dictionary_prod):
{
"catalogItemId":"7627712",
"catalogItemTypeId":"3",
"regularPrice":"0.0",
"sizeDimension1Id":"1528",
"sizeDimension2Id":"0",
}
I need to produce output like this for any product:
result_dict = {'variant':
[{"catalogItemId":"7627712", ...some other info...,
'sizeName': 'XS', 'sizeId': '1528'}]}
So I need to convert the size ID and add it to the new result object.
What is the most Pythonic way to do this?
I don't know how to get the right data from size_dict:
if int(dictionary_prod['sizeDimension1Id']) > 0:
    (result_dict['variant']).append('sizeName': size_dict???)
As Tommy mentioned, this is best done by mapping the size IDs to their respective dictionaries.
size_dict = \
[
{
"sizeOptionName":"XS",
"sizeOptionId":"1528",
"sortOrderNumber":"7017"
},
{
"sizeOptionName":"S",
"sizeOptionId":"1529",
"sortOrderNumber":"7047"
},
{
"sizeOptionName":"M",
"sizeOptionId":"1530",
"sortOrderNumber":"7095"
}
]
size_id_map = {size["sizeOptionId"] : size for size in size_dict}
production_dict = \
[
{
"catalogItemId":"7627712",
"catalogItemTypeId":"3",
"regularPrice":"0.0",
"sizeDimension1Id":"1528",
"sizeDimension2Id":"0",
}
]
def make_variant(idict):
    odict = idict.copy()
    size_id = odict.pop("sizeDimension1Id")
    odict.pop("sizeDimension2Id")
    odict["sizeName"] = size_id_map[size_id]["sizeOptionName"]
    odict["sizeId"] = size_id
    return odict
result_dict = \
{
"variant" : [make_variant(product) for product in production_dict]
}
print(result_dict)
Your question is a little confusing, but it looks like you have a list (size_dict) of dictionaries that contain some information, and you want to look up the particular element that contains the sizeOptionName you are interested in so that you can read off its sizeOptionId.
So first you could organise your size_dict as a dictionary rather than a list, i.e.
sizeDict = {"XS":{
"sizeOptionName":"XS",
"sizeOptionId":"1528",
"sortOrderNumber":"7017"
}, "S": {
"sizeOptionName":"S",
"sizeOptionId":"1529",
"sortOrderNumber":"7047"
}, ...
You could then read off the sizeOptionId you need by doing:
sizeDict[sizeNameYouAreLookingFor]["sizeOptionId"]
Alternatively, you could keep your current structure and just search the list of dictionaries that is size_dict.
So:
for elem in size_dict:
    if elem["sizeOptionName"] == sizeNameYouAreLookingFor:
        OptionID = elem["sizeOptionId"]
Or perhaps you are asking something else?
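If the list structure is kept, the search can be written as a small helper that also handles the case where nothing matches (for example a size ID of "0"). This is a sketch only; find_size is a made-up name, and the data is the sample from the question.

```python
size_dict = [
    {"sizeOptionName": "XS", "sizeOptionId": "1528", "sortOrderNumber": "7017"},
    {"sizeOptionName": "S", "sizeOptionId": "1529", "sortOrderNumber": "7047"},
    {"sizeOptionName": "M", "sizeOptionId": "1530", "sortOrderNumber": "7095"},
]


def find_size(sizes, key, value):
    """Return the first size entry whose `key` equals `value`, or None."""
    return next((size for size in sizes if size.get(key) == value), None)


by_id = find_size(size_dict, "sizeOptionId", "1528")    # look up by ID
by_name = find_size(size_dict, "sizeOptionName", "S")   # look up by name
missing = find_size(size_dict, "sizeOptionId", "0")     # no such size
```

This scans the list on every call, so for many lookups a dictionary keyed by ID, as in the first answer, is likely the better fit.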
