I wrote the logic below to add a list to a dictionary in Python, and it works fine:
>>> dict ={}
>>> dict ={"key":[]}
>>> dict['key']
[]
>>> dict['key'].append('a')
>>> dict['key']
['a']
When I try to implement the same thing while parsing JSON and building a dictionary dynamically based on certain values, it fails to add to the list and reports NoneType:
import json
json_data="""{
"data":[
{
"package_path":"/bin/bash",
"package_type":"rpm"
},
{
"package_path":"com.test3",
"package_type":"java"
},
{
"package_path":"com.test",
"package_type":"java"
}
]
}
"""
j_dict = json.loads(json_data)
dict2 = {}
for vuln in j_dict['data']:
    package_type = vuln['package_type']
    package_path = vuln['package_path']
    if package_type in dict2:
        dict2 ={package_type:[]}
    else:
        dict2[package_type].append(package_path)
It throws this error:
% python ~/Desktop/test.py
Traceback (most recent call last):
File "test.py", line 30, in <module>
dict2[package_type].append(package_path)
KeyError: u'rpm'
I expect output like:
dict2 = {"java": ['com.test3', 'com.test'], "rpm": ['/bin/bash']}
You can use dict.setdefault to create an empty list if the key is not found in the dictionary. For example:
import json
json_data="""{
"data":[
{
"package_path":"/bin/bash",
"package_type":"rpm"
},
{
"package_path":"com.test3",
"package_type":"java"
},
{
"package_path":"com.test",
"package_type":"java"
}
]
}
"""
j_dict = json.loads(json_data)
out = {}
for vuln in j_dict['data']:
    out.setdefault(vuln['package_type'], []).append(vuln['package_path'])
print(out)
Prints:
{'rpm': ['/bin/bash'], 'java': ['com.test3', 'com.test']}
defaultdict can be used here.
import json
from collections import defaultdict
json_data = """{
"data":[
{
"package_path":"/bin/bash",
"package_type":"rpm"
},
{
"package_path":"com.test3",
"package_type":"java"
},
{
"package_path":"com.test",
"package_type":"java"
}
]
}
"""
j_dict = json.loads(json_data)
data = defaultdict(list)
for entry in j_dict['data']:
    data[entry['package_type']].append(entry['package_path'])
print(data)
Output:
defaultdict(<class 'list'>, {'rpm': ['/bin/bash'], 'java': ['com.test3', 'com.test']})
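If the defaultdict wrapper in the printed output is unwanted, the result can be converted to a plain dict once grouping is done. A minimal sketch, assuming a hypothetical rows list that mirrors the JSON from the question:

```python
from collections import defaultdict

# Hypothetical sample rows mirroring the JSON in the question.
rows = [
    {"package_path": "/bin/bash", "package_type": "rpm"},
    {"package_path": "com.test3", "package_type": "java"},
    {"package_path": "com.test", "package_type": "java"},
]

grouped = defaultdict(list)
for row in rows:
    grouped[row["package_type"]].append(row["package_path"])

# dict() makes a shallow copy with the same keys and lists,
# but without the default factory or the defaultdict repr.
plain = dict(grouped)
print(plain)
```

After the conversion, looking up a missing key in plain raises KeyError again instead of silently creating an empty list.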
You have your if statement slightly wrong:
if package_type in dict2:
    dict2 ={package_type:[]}
else:
    dict2[package_type].append(package_path)
Should be:
if package_type not in dict2:
    dict2[package_type] = []
dict2[package_type].append(package_path)
You were saying, if a key was in the dictionary, replace the whole dictionary with a single key and a list as the value.
In the else clause you were saying, if the key doesn't exist, fetch the value for that key, assume it's a list, and append to it, which fails with a KeyError.
This idiom of either adding a value if it doesn't exist or using the existing value is so common that there are a couple of different ways of doing it built into Python and its libraries.
dict.setdefault will either set a new default value for a key, or return the existing value if it exists. I find its call syntax ugly:
dict2.setdefault(package_type, []).append(package_path)
This sets the key's value to [] if it doesn't exist, returns it, and then you can append to the list as it exists in the dictionary.
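To make that behaviour concrete, here is a small sketch (the groups name is only for illustration) showing that setdefault hands back the very list stored in the dictionary, both when it creates the entry and when the entry already exists:

```python
groups = {}

# First call: "java" is missing, so [] is stored under it and returned.
first = groups.setdefault("java", [])
first.append("com.test3")

# Second call: the key exists, so the stored list is returned unchanged
# and the [] passed as the default is discarded.
second = groups.setdefault("java", [])
second.append("com.test")

print(second is first)
print(groups)
```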
An alternative that I prefer is to use collections.defaultdict, which is a dictionary that automatically creates a default value when a key doesn't already exist:
from collections import defaultdict
dict2 = defaultdict(list)
dict2[package_type].append(package_path)
I have a dictionary with a parent-key and its value is a dict. I want to extract a key,val pair from a list of dict.
given:
{"Premier" : {}}
I want to extract:
all_compseasons = content: [
{
label: "2019/20",
id: 274
},
{
label: "2018/19",
id: 210
}]
So to get:
{"Premier" :
{"2019/20" : 274,
"2018/19" : 210
}
}
I can't seem to find a good way to do it. I've tried the code below, based on other examples of this problem, but it doesn't work.
compseasons = {}
for comp in all_compseasons:
    competition_id = 'Premier'
    index = competition_id
    compseasons[index]comp['label'] = comp['id']
You're very close. Dictionary keys need to be referenced with surrounding [], so comp['label'] should be [comp['label']]. You can also just use the given dictionary {"Premier" : {}} instead of creating a new one with compseasons = {}, as long as the "Premier" key exists before you assign into it.
Working solution:
d = {"Premier": {}}
all_compseasons = [{"label": "2019/20", "id": 274}, {"label": "2018/19", "id": 210}]
for comp in all_compseasons:
    d["Premier"][comp["label"]] = comp["id"]
print(d)
# {'Premier': {'2019/20': 274, '2018/19': 210}}
You made a mistake in how you declared compseasons and in how you access the value of the Premier key, which is itself a dictionary.
First, declaring compseasons = {"Premier" : {}} means you won't get a KeyError when you access compseasons[index], since Premier has already been inserted as a key.
Second, since the value of Premier is itself a dictionary, the inner key must also be enclosed in [], which gives compseasons[index][comp['label']] = comp['id'].
all_compseasons = [
    {
        'label': "2019/20",
        'id': 274
    },
    {
        'label': "2018/19",
        'id': 210
    }]
compseasons = {"Premier" : {}}
for comp in all_compseasons:
    competition_id = 'Premier'
    index = competition_id
    compseasons[index][comp['label']] = comp['id']
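When the whole label-to-id mapping is available up front, the loop above can also be written as a dict comprehension. This is only a sketch of an equivalent form, using the same sample data:

```python
all_compseasons = [
    {"label": "2019/20", "id": 274},
    {"label": "2018/19", "id": 210},
]

# Build the inner mapping in one pass, then attach it under the parent key.
compseasons = {
    "Premier": {comp["label"]: comp["id"] for comp in all_compseasons}
}
print(compseasons)
# {'Premier': {'2019/20': 274, '2018/19': 210}}
```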
I have a DynamoDB table with an attribute containing a nested map, and I would like to update a specific inventory item selected by a filter expression that matches a single item in this map.
How do I write an update expression that sets the location to "in place three" for the item with name=opel whose tags include "x1" (and possibly also "f3")?
This should update only the location attribute of the first list element.
{
"inventory": [
{
"location": "in place one", # I want to update this
"name": "opel",
"tags": [
"x1",
"f3"
]
},
{
"location": "in place two",
"name": "abc",
"tags": [
"a3",
"f5"
]
}],
"User" :"test"
}
Updated Answer - based on updated question statement
You can update attributes in a nested map using update expressions such that only a part of the item would get updated (ie. DynamoDB would apply the equivalent of a patch to your item) but, because DynamoDB is a document database, all operations (Put, Get, Update, Delete etc.) work on the item as a whole.
So, in your example, assuming User is the partition key and that there is no sort key (I didn't see any attribute that could be a sort key in that example), an Update request might look like this:
table.update_item(
Key={
'User': 'test'
},
UpdateExpression="SET #inv[0].#loc = :locVal",
ExpressionAttributeNames={
'#inv': 'inventory',
'#loc': 'location'
},
ExpressionAttributeValues={
':locVal': 'in place three',
},
)
That said, you do have to know what the item schema looks like and which attributes within the item should be updated exactly.
DynamoDB does NOT have a way to operate on sub-items. Meaning, there is no way to tell DynamoDB to execute an operation such as "update item, set 'location' property of elements of the 'inventory' array that have a property of 'name' equal to 'opel'".
This is probably not the answer you were hoping for, but it is what's available today. You may be able to get closer to what you want by changing the schema a bit.
If you need to reference the sub-items by name, perhaps storing something like:
{
"inventory": {
"opel": {
"location": "in place one", # I want to update this
"tags": [ "x1", "f3" ]
},
"abc": {
"location": "in place two",
"tags": [ "a3", "f5" ]
}
},
"User" :"test"
}
Then your query would be:
table.update_item(
Key={
'User': 'test'
},
UpdateExpression="SET #inv.#brand.#loc = :locVal",
ExpressionAttributeNames={
'#inv': 'inventory',
'#loc': 'location',
'#brand': 'opel'
},
ExpressionAttributeValues={
':locVal': 'in place three',
},
)
But YMMV, as even this has limitations: you are limited to identifying inventory items by name (i.e. you still can't say "update inventory with tag 'x1'").
Ultimately you should carefully consider why you need Dynamo to perform these complex operations for you as opposed to you being specific about what you want to update.
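Being specific can be done on the client side: read the item, find the position of the matching list element in Python, and only then build the index-based update shown earlier in this answer. The helper below is a hypothetical sketch (build_location_update is not part of boto3) and assumes the item has already been fetched, for example with get_item:

```python
def build_location_update(item, name, new_location):
    """Return update_item keyword arguments for the first inventory entry
    whose "name" matches, or None if no entry matches.

    `item` is assumed to be the dict already read from the table;
    nothing in this function talks to DynamoDB.
    """
    for index, entry in enumerate(item.get("inventory", [])):
        if entry.get("name") == name:
            return {
                "Key": {"User": item["User"]},
                # The list index is written directly into the expression text.
                "UpdateExpression": f"SET #inv[{index}].#loc = :locVal",
                "ExpressionAttributeNames": {"#inv": "inventory", "#loc": "location"},
                "ExpressionAttributeValues": {":locVal": new_location},
            }
    return None


# Sample item shaped like the one in the question.
item = {
    "User": "test",
    "inventory": [
        {"location": "in place one", "name": "opel", "tags": ["x1", "f3"]},
        {"location": "in place two", "name": "abc", "tags": ["a3", "f5"]},
    ],
}
kwargs = build_location_update(item, "opel", "in place three")
print(kwargs["UpdateExpression"])
```

The result could then be passed on as table.update_item(**kwargs). Note that the list may change between the read and the write, so the index can go stale; this sketch does not guard against that.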
You can update the nested map as follows:
First create an empty item attribute of type map. In this example, graph is the empty item attribute.
dynamoTable = dynamodb.Table('abc')
dynamoTable.put_item(
    Item={
        'email': email_add,
        'graph': {},
    }
)
Then update the nested map as follows:
brand_name = 'opel'
dynamoTable = dynamodb.Table('abc')
dynamoTable.update_item(
    Key={
        'email': email_add,
    },
    UpdateExpression="set #Graph.#brand = :name",
    ExpressionAttributeNames={
        # must match the map attribute created by put_item above
        '#Graph': 'graph',
        '#brand': str(brand_name),
    },
    ExpressionAttributeValues={
        ':name': {
            "location": "in place two",
            'tag': {
                'graph_type': 'a3',
                'graph_title': 'f5'
            }
        }
    }
)
Updating Mike's answer because that way doesn't work any more (at least for me).
It works like this now (note the UpdateExpression and ExpressionAttributeNames):
table.update_item(
Key={
'User': 'test'
},
UpdateExpression="SET inv.#brand.loc = :locVal",
ExpressionAttributeNames={
'#brand': 'opel'
},
ExpressionAttributeValues={
':locVal': 'in place three',
},
)
Whatever goes in Key={} is always the partition key (and the sort key, if any).
EDIT:
It seems this approach only works with properties nested two levels deep. In that case you would use ExpressionAttributeNames only for the "middle" property (in this example, #brand in inv.#brand.loc). I'm not yet sure what the actual rule is.
A DynamoDB UpdateExpression does not search the database for matching items the way SQL does (where you can update every row that matches some condition). To update an item you first need to identify it by its primary key (the partition key, or the partition key plus sort key). If many items match your criteria, you need to update them one by one.
The remaining problem when updating nested objects is building the UpdateExpression, ExpressionAttributeValues and ExpressionAttributeNames to pass to the DynamoDB update API.
I use a recursive function to update nested objects in DynamoDB. You asked for Python but I use JavaScript; I think the code is easy to follow and to port to Python:
https://gist.github.com/crsepulv/4b4a44ccbd165b0abc2b91f76117baa5
/**
* Recursive function that builds the UpdateExpression, ExpressionAttributeValues & ExpressionAttributeNames needed to update a nested object in DynamoDB.
* All levels of the nested object must already exist in DynamoDB; this only updates the value, it does not create the branch.
* Only works with objects of objects; not tested with arrays.
* @param obj the object to update.
* @param k the seed; any value works, it only matters on the last iteration.
*/
function getDynamoExpression(obj, k) {
const key = Object.keys(obj);
let UpdateExpression = 'SET ';
let ExpressionAttributeValues = {};
let ExpressionAttributeNames = {};
let response = {
UpdateExpression: ' ',
ExpressionAttributeNames: {},
ExpressionAttributeValues: {}
};
//https://stackoverflow.com/a/16608074/1210463
/**
* true when input is object, this means on all levels except the last one.
*/
if (((!!obj) && (obj.constructor === Object))) {
response = getDynamoExpression(obj[key[0]], key);
UpdateExpression = 'SET #' + key + '.' + response['UpdateExpression'].substring(4); //substring deletes 'SET ' for the mid level values.
ExpressionAttributeNames = {['#' + key]: key[0], ...response['ExpressionAttributeNames']};
ExpressionAttributeValues = response['ExpressionAttributeValues'];
} else {
UpdateExpression = 'SET = :' + k;
ExpressionAttributeValues = {
[':' + k]: obj
}
}
//removes trailing dot on the last level
if (UpdateExpression.indexOf(". ")) {
UpdateExpression = UpdateExpression.replace(". ", "");
}
return {UpdateExpression, ExpressionAttributeValues, ExpressionAttributeNames};
}
//you can try many levels.
const obj = {
level1: {
level2: {
level3: {
level4: 'value'
}
}
}
}
I had the same need.
Hope this code helps. You only need to invoke compose_update_expression_attr_name_values passing the dictionary containing the new values.
import random
def compose_update_expression_attr_name_values(data: dict) -> (str, dict, dict):
    """ Constructs UpdateExpression, ExpressionAttributeNames, and ExpressionAttributeValues for updating an entry of a DynamoDB table.
    :param data: the dictionary of attribute_values to be updated
    :return: a tuple (UpdateExpression: str, ExpressionAttributeNames: dict(str: str), ExpressionAttributeValues: dict(str: str))
    """
    # prepare recursion input
    expression_list = []
    value_map = {}
    name_map = {}
    # navigate the dict and fill expressions and dictionaries
    _rec_update_expression_attr_name_values(data, "", expression_list, name_map, value_map)
    # compose update expression from single paths
    expression = "SET " + ", ".join(expression_list)
    return expression, name_map, value_map
def _rec_update_expression_attr_name_values(data: dict, path: str, expressions: list, attribute_names: dict,
                                            attribute_values: dict):
    """ Recursively navigates the input and injects contents into expressions, names, and attribute_values.
    :param data: the data dictionary with updated data
    :param path: the navigation path in the original data dictionary to this recursive call
    :param expressions: the list of update expressions constructed so far
    :param attribute_names: a map associating "expression attribute name identifiers" to their actual names in ``data``
    :param attribute_values: a map associating "expression attribute value identifiers" to their actual values in ``data``
    :return: None, since ``expressions``, ``attribute_names``, and ``attribute_values`` get updated during the recursion
    """
    for k in data.keys():
        # generate non-ambiguous identifiers
        rdm = random.randrange(0, 1000)
        attr_name = f"#k_{rdm}_{k}"
        while attr_name in attribute_names.keys():
            rdm = random.randrange(0, 1000)
            attr_name = f"#k_{rdm}_{k}"
        attribute_names[attr_name] = k
        _path = f"{path}.{attr_name}"
        # recursion
        if isinstance(data[k], dict):
            # recursive case
            _rec_update_expression_attr_name_values(data[k], _path, expressions, attribute_names, attribute_values)
        else:
            # base case
            attr_val = f":v_{rdm}_{k}"
            attribute_values[attr_val] = data[k]
            expression = f"{_path} = {attr_val}"
            # remove the initial "."
            expressions.append(expression[1:])
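To see what such a helper produces, here is a compact sketch of the same idea. It is not the function above: build_update is a hypothetical name, and placeholders are numbered with a counter instead of random numbers so that the output is reproducible.

```python
def build_update(data):
    """Flatten a nested dict into (UpdateExpression, names, values).

    Placeholders are numbered with a counter, which is an assumption of
    this sketch; the answer above uses random numbers instead.
    """
    names, values, clauses = {}, {}, []

    def walk(node, path):
        for key, value in node.items():
            name = f"#n{len(names)}"
            names[name] = key
            if isinstance(value, dict):
                # Nested map: extend the path and keep descending.
                walk(value, path + [name])
            else:
                # Leaf: emit one "path = :value" clause.
                val = f":v{len(values)}"
                values[val] = value
                clauses.append(".".join(path + [name]) + f" = {val}")

    walk(data, [])
    return "SET " + ", ".join(clauses), names, values


expression, names, values = build_update(
    {"inventory": {"opel": {"location": "in place three"}}, "status": "moved"}
)
print(expression)  # SET #n0.#n1.#n2 = :v0, #n3 = :v1
print(names)
print(values)
```

The three results map onto the UpdateExpression, ExpressionAttributeNames and ExpressionAttributeValues arguments of update_item. As with the JavaScript version earlier in this thread, the intermediate maps are assumed to exist already, and an empty input yields an empty SET clause that would need to be rejected before sending.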
I have a large database of the following type:
data = {
"2": {"overall": 172, "buy": 172, "name": "ben", "id": 2, "sell": 172},
"3": {"overall": 173, "buy": 173, "name": "dan", "id": 3, "sell": 173},
"4": {"overall": 174, "buy": 174, "name": "josh", "id": 4, "sell": 174},
...
and so on for about 10k rows.
Then I wrote a loop to check whether specific names appear inside this dict.
I used the following loop:
items = ["ben","josh"]
Database = dict()
Database = {"Buying_Price": "", "Selling_Price": ""}
for masterkey, mastervalue in data.items():
    if mastervalue['name'] in items:
        Database["Name"] = Database["Name"].append(mastervalue['name'])
        Database["Buying_Price"] = Database["Buying_Price"].append(mastervalue['buy'])
        Database["Selling_Price"] = Database["Selling_Price"].append(mastervalue['sell'])
However, I'm getting the next error:
Database["Buying_Price"] = Database["Buying_Price"].append(mastervalue['buy_average'])
AttributeError: 'str' object has no attribute 'append'
My goal is to obtain a dict named Database with 2 keys, Buying_Price and Selling_Price, where each one holds the following:
Buying_Price = {"ben":172,"josh":174}
Selling_Price = {"ben":172,"josh":174}
Thank you.
There are a couple of issues with the code you posted, so we'll go line by line and fix them:
items = ["ben", "josh"]
Database = dict()
Database = {"Buying_Price": "", "Selling_Price": ""}
for masterkey, mastervalue in data.items():
    if mastervalue['name'] in items:
        Database["Name"] = Database["Name"].append(mastervalue['name'])
        Database["Buying_Price"] = Database["Buying_Price"].append(mastervalue['buy_average'])
        Database["Selling_Price"] = Database["Selling_Price"].append(mastervalue['sell_average'])
In Python, you don't need to declare an object's type before assigning its value, so Database = dict() is redundant: the next line already defines it as a dictionary.
You intend to aggregate the results of the if statement, so both Buying_Price and Selling_Price should be defined as lists, not strings. You can do that either with the [] literal or by calling list().
According to your data structure, there are no buy_average and sell_average keys, only buy and sell, so make sure you use the correct keys.
You don't need to re-assign the list when using append(); it is a method of the list object and updates it in place.
You never defined Name in your Database object, yet you are trying to append values to it.
Overall, the code should roughly look like this:
items = ["ben","josh"]
Database = {"Buying_Price": [], "Selling_Price": [], "Name": []}
for masterkey, mastervalue in data.items():
    if mastervalue['name'] in items:
        Database["Name"].append(mastervalue['name'])
        Database["Buying_Price"].append(mastervalue['buy'])
        Database["Selling_Price"].append(mastervalue['sell'])
It sounds like you want a nested dict.
items = ["ben", "josh"]
new_database = {"Buying_Price": {}, "Selling_Price": {}}
for key, row in data.items():
    name = row["name"]
    if name in items:
        new_database["Buying_Price"][name] = row["buy"]
        new_database["Selling_Price"][name] = row["sell"]
In Database = {"Buying_Price": "", "Selling_Price": ""}, you are defining the value for the key Buying_Price as "", which is a string. You are then calling the list method .append() on a string, hence the error 'str' object has no attribute 'append'.
We don't know exactly what output you want, but given how you process your data, I suggest you use:
Database = {"Name" : [], "Buying_Price": [], "Selling_Price": []}
instead of the original...
Database = {"Buying_Price": "", "Selling_Price": ""}
This way you can append Name, Buying_Price, and Selling_Price together, and look up the matching entries in all three lists using a single index.
I also didn't notice at first that you are appending to your dict incorrectly.
.append() works in place, meaning that you should do:
Database["Name"].append(mastervalue['name'])
instead of
Database["Name"] = Database["Name"].append(mastervalue['name'])
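The reason the re-assignment is harmful, and not just redundant, is that list.append returns None. A short sketch with throwaway names (prices, database) shows the effect:

```python
prices = []
result = prices.append(172)  # append mutates the list in place...
print(result)                # ...and returns None
print(prices)                # [172]

# Re-assigning the return value therefore throws the list away:
database = {"Name": []}
database["Name"] = database["Name"].append("ben")
print(database["Name"])      # None, so a further .append() would raise AttributeError
```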
I want to transform a dictionary into a string. What would be a beginner-level question is complicated by a few rules that I have to adhere to:
There is a list of known keys that must come out in a particular, arbitrary order
Each of the known keys is optional, i.e. it may not be present in the dictionary
It is guaranteed that at least one of the known keys will be present in the dictionary
The dictionary may contain additional keys; they must come after the known keys and their order is not important
I cannot make assumptions about the order in which keys will be added to the dictionary
What is the Pythonic way of processing some dictionary keys before others?
So far, I have the following function:
def format_data(input_data):
    data = dict(input_data)
    output = []
    for key in ["title", "slug", "date", "modified", "category", "tags"]:
        if key in data:
            output.append("{}: {}".format(key.title(), data[key]))
            del data[key]
    if data:
        for key in data:
            output.append("{}: {}".format(key.title(), data[key]))
    return "\n".join(output)
data = {
"tags": "one, two",
"slug": "post-title",
"date": "2017-02-01",
"title": "Post Title",
}
print(format_data(data))
data = {
"format": "book",
"title": "Another Post Title",
"date": "2017-02-01",
"slug": "another-post-title",
"custom": "data",
}
print(format_data(data))
Title: Post Title
Slug: post-title
Date: 2017-02-01
Tags: one, two
Title: Another Post Title
Slug: another-post-title
Date: 2017-02-01
Custom: data
Format: book
While this function does provide the expected results, it has some issues that make me think there might be a better approach. Namely, the output.append() line is duplicated, and the input data structure is copied to allow its modification without side effects.
To sum up, how can I process some keys in a particular order and before other keys?
I suggest that you simply run a pair of list comprehensions: one for the desired keys, and one for the rest. Concatenate them in the desired order in bulk, rather than one at a time. This reduces the critical step to a single command to build output.
The first comprehension looks for desired keys in the dict; the second looks for any dict keys not in the "desired" list.
def format_data(input_data):
    data = dict(input_data)
    key_list = ["title", "slug", "date", "modified", "category", "tags"]
    output = ["{}: {}".format(key.title(), data[key]) for key in key_list if key in data] + \
             ["{}: {}".format(key.title(), data[key]) for key in data if key not in key_list]
    return "\n".join(output)
I'd suggest list comprehensions and pop():
def format_data(input_data):
    data = dict(input_data)
    keys = ["title", "slug", "date", "modified", "category", "tags"]
    output = ['{}: {}'.format(key.title(), data.pop(key)) for key in keys if key in data]
    output.extend(['{}: {}'.format(key.title(), val) for key, val in data.items()])
    return "\n".join(output)
To the concern about deleting during iteration - note that the iteration is over the list of keys, not the dictionary being evaluated, so I wouldn't consider that a red flag.
Edited completely, now that I think I see what you mean: the code below takes a list of primary keys (you can pass them in or set them in a config file) and places them at the beginning of the dictionary.
Try this:
from collections import OrderedDict
data = {'aaa': 'bbbb',
'custom': 'data',
'date': '2017-02-01',
'foo': 'bar',
'format': 'book',
'slug': 'another-post-title',
'title': 'Another Post Title'}
def format_data(input_data):
    primary_keys = ["title", "slug", "date", "modified", "category", "tags"]
    # list() is needed on Python 3, where keys() returns a view, not a list
    data = OrderedDict((k, input_data.get(k)) for k in primary_keys + list(input_data.keys()))
    output = []
    for key, value in data.items():
        if value:
            output.append("{}: {}".format(key.title(), value))
    return "\n".join(output)
print(format_data(data))
Title: Another Post Title
Slug: another-post-title
Date: 2017-02-01
Aaa: bbbb
Custom: data
Foo: bar
Format: book
Find the difference between the known keys and the keys in the input dictionary; Use itertools.chain to iterate over both sets of keys; catch KeyErrors for missing keys and just pass. No need to copy the input and no duplication.
import itertools
def format_data(input_data):
    known_keys = ["title", "slug", "date", "modified", "category", "tags"]
    xtra_keys = set(input_data.keys()).difference(known_keys)
    output = []
    for key in itertools.chain(known_keys, xtra_keys):
        try:
            # read from the argument, not from a global named data
            output.append("{}: {}".format(key.title(), input_data[key]))
        except KeyError:
            pass
    return '\n'.join(output)
data = {"tags": "one, two",
"slug": "post-title",
"date": "2017-02-01",
"title": "Post Title",
"foo": "bar"}
>>> print(format_data(data))
Title: Post Title
Slug: post-title
Date: 2017-02-01
Tags: one, two
Foo: bar
>>>
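Another way to get "known keys first, everything else after" without copying the input or duplicating the formatting line is to sort the keys by their position in the known list. This is only a sketch of an alternative, not one of the approaches posted above; it assumes Python 3.7+, where dicts keep insertion order, and the known_keys parameter is introduced here for illustration:

```python
def format_data(input_data, known_keys=("title", "slug", "date", "modified", "category", "tags")):
    # Map each known key to its position; unknown keys all share the last rank.
    rank = {key: position for position, key in enumerate(known_keys)}
    # sorted() is stable, so unknown keys keep their original relative order.
    ordered = sorted(input_data, key=lambda key: rank.get(key, len(rank)))
    return "\n".join("{}: {}".format(key.title(), input_data[key]) for key in ordered)


data = {
    "format": "book",
    "title": "Another Post Title",
    "date": "2017-02-01",
    "slug": "another-post-title",
    "custom": "data",
}
print(format_data(data))
```

For this input the known keys come out as Title, Slug, Date, followed by Format and Custom in the order they were inserted.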