Combining YAML Objects via anchors and overriding object values

I am trying to reduce the headache of copying the same configuration data (stored in a YAML file) by using anchors in YAML. The example YAML looks like:
profiles:
  home: &home
    key1: value1
    object1:
      subKey1: subVal1
      subKey2: subVal2
      complexObject:
        something: value
        someOtherThing: value
  work:
    <<: *home
    object1:
      subKey2: completelyDifferentValue # something like this ?!
      complexObject.something: notValue # or something like this ?
The equivalent JSON for the above YAML is
{
  "profiles": {
    "home": {
      "key1": "value1",
      "object1": {
        "subKey1": "subVal1",
        "subKey2": "subVal2",
        "complexObject": {
          "something": "value",
          "someOtherThing": "value"
        }
      }
    },
    "work": {
      "key1": "value1",
      "object1": {
        "subKey2": "completelyDifferentValue",
        "complexObject.something": "notValue"
      }
    }
  }
}
Whereas what I wanted was:
{
  "profiles": {
    "home": {
      "key1": "value1",
      "object1": {
        "subKey1": "subVal1",
        "subKey2": "subVal2",
        "complexObject": {
          "something": "value",
          "someOtherThing": "value"
        }
      }
    },
    "work": {
      "key1": "value1",
      "object1": {
        "subKey1": "subVal1",
        "subKey2": "completelyDifferentValue",
        "complexObject": {
          "something": "notValue",
          "someOtherThing": "value"
        }
      }
    }
  }
}
(note the additional subKey1, which was removed in the actual result)
The YAML config file will have objects inside objects, and the idea is to have one parent object and then just copy it and modify a few keys (inside the child objects).
I understand that the YAML spec might not be very helpful directly in this case and would appreciate any workarounds in Python via PyYAML (or some other library) as well!

Due to the bad influence of Java, it is a common misconception that these two YAML structures are equivalent:
a.b: c
a:
  b:
    c
They are not. A period in YAML is a content character just like a, making the first YAML have a key named a.b which does not imply a nested mapping.
Now about merging: Anchors and aliases exist to be able to serialize arbitrary, possibly cyclic, graphs. Recursive descent (as needed for a deep merge) needs to be wary of such cycles, which is why I assume << is specified not to do this.
What << actually does is that this specific sequence of characters is assigned the tag !!merge. The YAML processor then implements merging as "for every mapping that has a key with tag !!merge, pull the unknown key-value pairs from that key's value(s) into the current mapping".
The problem for you is that while libraries like PyYAML allow you to register custom constructors for user-defined tags, these can only produce a value for the tagged item. However, !!merge influences the mapping around the tagged value, so its semantics cannot easily be reproduced and extended via custom constructors.
You can, however, simply override PyYAML's merge implementation. For this, inherit from SafeConstructor, FullConstructor or UnsafeConstructor depending on your needs, reimplement flatten_mapping, then define a loader (see here) that uses your constructor. Theoretically, besides deep merging, you can also implement periods-as-nested-mappings here, but I advise against it. These would then only work at places where you do merging, and not elsewhere, which is counter-intuitive.
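As a simpler alternative to overriding flatten_mapping, the deep merge can also be done after loading, on the plain dicts that the YAML loader returns. The following is only a minimal sketch of that swapped-in approach, not PyYAML's own merge: deep_merge and work_overrides are hypothetical names, the overrides are kept in a separate mapping instead of using <<, and the data is assumed to contain no cycles.

```python
import copy


def deep_merge(base, override):
    """Return a new dict: `base` with `override` applied recursively."""
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Both sides are mappings: merge them key by key.
            result[key] = deep_merge(result[key], value)
        else:
            # Scalars, lists and new keys simply replace the old value.
            result[key] = copy.deepcopy(value)
    return result


# The "home" profile from the question, as a YAML loader would return it.
home = {
    "key1": "value1",
    "object1": {
        "subKey1": "subVal1",
        "subKey2": "subVal2",
        "complexObject": {"something": "value", "someOtherThing": "value"},
    },
}

# Hypothetical: only the keys that differ for the "work" profile.
work_overrides = {
    "object1": {
        "subKey2": "completelyDifferentValue",
        "complexObject": {"something": "notValue"},
    }
}

work = deep_merge(home, work_overrides)
print(work)
```

Because the result is built from a deep copy, the home profile stays untouched, and nested keys that are not overridden (such as subKey1) survive in work.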

Related

Write JSON for corresponding XML

I want to write JSON that can be converted to XML of the following fixed format:
<function>foo
<return>uint32_t</return>
<param>count
<type>uint32_t</type>
</param>
</function>
I have tried multiple ways to write JSON that converts to the format above, but none is quite right: foo and count should not need a separate key of their own, yet without one they are orphan values.
Tried ways:
Way 1:
{
"function" :
{"foo":
{"return":"uint32_t"},
"param":
{"count":
{"type":"uint32_t"}
}
}
}
Way 2:
{
"function" :
["foo",{"return":"uint32_t"}],
"param":
["count",{"type":"uint32_t"}]
}
Way 3 (but I do not need the name tag):
{
"function":
{"name": "foo",
"return": "uint32_t",
"param": "count",
"type": "uint32_t"
}
}
For generating output and testing please use:
JSON to XML convertor
I would appreciate your help. Later on, I have a script that converts the formatted Excel to C header files.

It is very rare for a JSON-to-XML conversion library to give you precise control over the XML that is generated, or conversely, for an XML-to-JSON converter to give you precise control over the JSON that is generated. It's basically not possible because the data models are very different.
Typically you have to accept what the JSON-to-XML converter gives you, and then use XSLT to transform it into the flavour of XML that you actually want.
(Consider using the json-to-xml() conversion function in XSLT 3.0 and then applying template rules to the result.)
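If XSLT is not an option, another route is to skip the generic converter and write a small purpose-built one. The sketch below uses Python's xml.etree.ElementTree instead of XSLT, so it is a different technique from the one suggested above. It assumes a hypothetical JSON convention in which a "#text" key holds the leading text of an element (foo, count); build and TEXT_KEY are illustrative names, and repeated elements (JSON arrays) are not handled.

```python
import json
import xml.etree.ElementTree as ET

TEXT_KEY = "#text"  # hypothetical convention: leading text of an element


def build(tag, value):
    """Build an Element named `tag` from a JSON value."""
    elem = ET.Element(tag)
    if isinstance(value, dict):
        for key, child in value.items():
            if key == TEXT_KEY:
                # Text that sits directly inside the element, before children.
                elem.text = str(child)
            else:
                elem.append(build(key, child))
    else:
        elem.text = str(value)
    return elem


source = json.loads("""
{
  "function": {
    "#text": "foo",
    "return": "uint32_t",
    "param": {"#text": "count", "type": "uint32_t"}
  }
}
""")

# The document is expected to have exactly one top-level key (the root tag).
root_tag, root_value = next(iter(source.items()))
xml_text = ET.tostring(build(root_tag, root_value), encoding="unicode")
print(xml_text)
```

The output is a single line without indentation; the element nesting and the orphan text values match the target format from the question.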

MongoDB Update with Array Filters [duplicate]

I am trying to update a value in the nested array but can't get it to work.
My object is like this
{
  "_id": {
    "$oid": "1"
  },
  "array1": [
    {
      "_id": "12",
      "array2": [
        {
          "_id": "123",
          "answeredBy": [], // need to push "success"
        },
        {
          "_id": "124",
          "answeredBy": [],
        }
      ],
    }
  ]
}
I need to push a value to "answeredBy" array.
In the below example, I tried pushing "success" string to the "answeredBy" array of the "123 _id" object but it does not work.
callback = function(err,value){
  if(err){
    res.send(err);
  }else{
    res.send(value);
  }
};
conditions = {
  "_id": 1,
  "array1._id": 12,
  "array2._id": 123
};
updates = {
  $push: {
    "array2.$.answeredBy": "success"
  }
};
options = {
  upsert: true
};
Model.update(conditions, updates, options, callback);
I found this link, but its answer only says I should use an object-like structure instead of arrays. This cannot be applied in my situation. I really need my objects to be nested in arrays.
It would be great if you can help me out here. I've been spending hours to figure this out.
Thank you in advance!

General Scope and Explanation
There are a few things wrong with what you are doing here. The first is your query conditions: you are referring to several _id values where you should not need to, and at least one of them is not on the top level.
In order to get into a "nested" value, and presuming that the _id value is unique and would not appear in any other document, your query form should be like this:
Model.update(
  { "array1.array2._id": "123" },
  { "$push": { "array1.0.array2.$.answeredBy": "success" } },
  function(err,numAffected) {
    // something with the result in here
  }
);
Now that would actually work, but really it is only a fluke that it does as there are very good reasons why it should not work for you.
The important reading is in the official documentation for the positional $ operator under the subject of "Nested Arrays". What this says is:
The positional $ operator cannot be used for queries which traverse more than one array, such as queries that traverse arrays nested within other arrays, because the replacement for the $ placeholder is a single value
Specifically what that means is the element that will be matched and returned in the positional placeholder is the value of the index from the first matching array. This means in your case the matching index on the "top" level array.
So if you look at the query notation as shown, we have "hardcoded" the first ( or 0 index ) position in the top level array, and it just so happens that the matching element within "array2" is also the zero index entry.
To demonstrate this you can change the matching _id value to "124" and the result will $push a new entry onto the element with _id "123", as they are both in the zero index entry of "array1" and that is the value returned to the placeholder.
So that is the general problem with nesting arrays. You could remove one of the levels and you would still be able to $push to the correct element in your "top" array, but there would still be multiple levels.
Try to avoid nesting arrays as you will run into update problems as is shown.
The general case is to "flatten" the things you "think" are "levels" and actually make these "attributes" on the final detail items. For example, the "flattened" form of the structure in the question should be something like:
{
  "answers": [
    { "by": "success", "type2": "123", "type1": "12" }
  ]
}
Or even when accepting the inner array is $push only, and never updated:
{
  "array": [
    { "type1": "12", "type2": "123", "answeredBy": ["success"] },
    { "type1": "12", "type2": "124", "answeredBy": [] }
  ]
}
Both of these lend themselves to atomic updates within the scope of the positional $ operator.
MongoDB 3.6 and Above
From MongoDB 3.6 there are new features available to work with nested arrays. This uses the positional filtered $[<identifier>] syntax in order to match the specific elements and apply different conditions through arrayFilters in the update statement:
Model.update(
  {
    "_id": 1,
    "array1": {
      "$elemMatch": {
        "_id": "12", "array2._id": "123"
      }
    }
  },
  {
    "$push": { "array1.$[outer].array2.$[inner].answeredBy": "success" }
  },
  {
    "arrayFilters": [{ "outer._id": "12" }, { "inner._id": "123" }]
  }
)
The "arrayFilters" option, as passed to .update() or even the .updateOne(), .updateMany(), .findOneAndUpdate() or .bulkWrite() methods, specifies the conditions to match on the identifiers given in the update statement. Any elements that match the given conditions will be updated.
Because the structure is "nested", we actually use "multiple filters" as is specified with an "array" of filter definitions as shown. The marked "identifier" is used in matching against the positional filtered $[<identifier>] syntax actually used in the update block of the statement. In this case inner and outer are the identifiers used for each condition as specified with the nested chain.
This new expansion makes the update of nested array content possible, but it does not really help with the practicality of "querying" such data, so the same caveats apply as explained earlier.
What you typically really "mean" is to express these as "attributes". Even if your brain initially thinks "nesting", that is usually just a reaction to how you believe the "previous relational parts" come together. In reality you need more denormalization.
Also see How to Update Multiple Array Elements in mongodb, since these new update operators actually match and update "multiple array elements" rather than just the first, which has been the previous action of positional updates.
NOTE Somewhat ironically, since this is specified in the "options" argument for .update() and like methods, the syntax is generally compatible with all recent release driver versions.
However, this is not true of the mongo shell. Because of the way the method is implemented there (ironically, for backward compatibility), the arrayFilters argument is not recognized and is removed by an internal method that parses the options in order to deliver "backward compatibility" with prior MongoDB server versions and a "legacy" .update() API call syntax.
So if you want to use the command in the mongo shell or other "shell based" products ( notably Robo 3T ) you need a latest version from either the development branch or production release as of 3.6 or greater.
See also positional all $[] which also updates "multiple array elements" but without applying to specified conditions and applies to all elements in the array where that is the desired action.
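For readers working from Python rather than Node, the same filtered positional update can be sketched with PyMongo, whose update_one() accepts the filters through the array_filters keyword. This is a minimal sketch, assuming MongoDB 3.6 or later and a document shaped like the one in the question; push_answer is a hypothetical helper name, and the collection passed in is assumed to be a pymongo Collection.

```python
def push_answer(collection, doc_id, outer_id, inner_id, answer):
    """Push `answer` onto answeredBy of one element nested two arrays deep.

    `collection` is assumed to be a pymongo Collection (MongoDB 3.6+).
    """
    return collection.update_one(
        # Select the document itself.
        {"_id": doc_id},
        # $[outer] and $[inner] are placeholders resolved by array_filters.
        {"$push": {"array1.$[outer].array2.$[inner].answeredBy": answer}},
        # One filter per identifier used in the update path.
        array_filters=[{"outer._id": outer_id}, {"inner._id": inner_id}],
    )
```

A call such as push_answer(db.mycollection, 1, "12", "123", "success") would mirror the Model.update() statement above, where db.mycollection is a placeholder for your own collection.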

I know this is a very old question, but I just struggled with this problem myself, and found, what I believe to be, a better answer.
A way to solve this problem is to use sub-documents. This is done by nesting schemas within your schemas:
// child schemas are defined first so they exist when the parent embeds them
Array2Schema = new mongoose.Schema({
  answeredBy: [...]
})
Array1Schema = new mongoose.Schema({
  array2: [Array2Schema]
})
MainSchema = new mongoose.Schema({
  array1: [Array1Schema]
})
This way the object will look like the one you show, but now each array is filled with sub-documents. This makes it possible to dot your way into the sub-document you want. Instead of using .update, you then use .find or .findOne to get the document you want to update.
Main.findOne({ _id: 1 })
  .exec(function(err, result) {
    // .id() looks up a sub-document by its _id
    result.array1.id(12).array2.id(123).answeredBy.push('success')
    result.save(function(err) {
      console.log(result)
    });
  })
I haven't used the .push() function this way myself, so the syntax might not be right, but I have used both .set() and .remove(), and both work perfectly fine.

How to copy a python script which includes dictionaries to a new python script?

I have a python script which contains dictionaries and is used as input from another python script which performs calculations. I want to use the first script which is used as input, to create more scripts with the exact same structure in the dictionaries but different values for the keys.
Original Script: Car1.py
Owner = {
"Name": "Jim",
"Surname": "Johnson",
}
Car_Type = {
"Make": "Ford",
"Model": "Focus",
"Year": "2008"
}
Car_Info = {
"Fuel": "Gas",
"Consumption": 5,
"Max Speed": 190
}
I want to be able to create more input files with identical format but for different cases, e.g.
New Script: Car2.py
Owner = {
"Name": "Nick",
"Surname": "Perry",
}
Car_Type = {
"Make": "BMW",
"Model": "528",
"Year": "2015"
}
Car_Info = {
"Fuel": "Gas",
"Consumption": 10,
"Max Speed": 280
}
So far, I have only seen answers that print just the keys and the values to a new file, but not the actual name of the dictionary as well. Can someone provide some help? Thanks in advance!

If you really want to do it that way (not recommended, for the reasons stated in the comment by spectras, and because there are good alternatives) and import your input Python file:
This question has answers on how to read the dictionaries' names out of the imported module (using the module's __dict__ while filtering for variables that do not start with "__").
Then get the new values for the dictionary entries and construct the new dicts.
Finally, you need to write an exporter that takes care of storing the data in a Python-readable form, just like you would construct a normal text file.
I do not see any advantage over just storing it in a storage format.
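For completeness, the exporter step can be sketched as follows. This is one possible approach, assuming the new values are already available as plain dicts: render_module is a hypothetical helper that uses pprint.pformat to produce Python source text with one assignment per dictionary name (sort_dicts requires Python 3.8 or later).

```python
import pprint


def render_module(dicts):
    """Return Python source text that defines each named dict."""
    parts = []
    for name, data in dicts.items():
        # pformat yields a valid Python literal; keep the key order as given.
        parts.append(f"{name} = {pprint.pformat(data, sort_dicts=False)}\n")
    return "\n".join(parts)


# Values for the new input file, keyed by the variable name to generate.
car2 = {
    "Owner": {"Name": "Nick", "Surname": "Perry"},
    "Car_Type": {"Make": "BMW", "Model": "528", "Year": "2015"},
    "Car_Info": {"Fuel": "Gas", "Consumption": 10, "Max Speed": 280},
}

source = render_module(car2)
print(source)

# To create the file itself:
# with open("Car2.py", "w") as f:
#     f.write(source)
```

The generated text can be imported like the original Car1.py, although a data format such as JSON would avoid generating code at all.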

Read the file with something like
with open('yourfile.py', 'r') as f:
    text = f.read().split('\n')
and then interpret the list of strings you get. After that you can save it with something like
with open('newfile.py', 'w') as new_text:
    new_text.write('\n'.join(text))  # put back the newlines that split() removed
As spectras said earlier, this is not ideal, but if that's what you want to do, go for it.

Python: Mutability and dictionaries in config

I want to keep some large, static dictionaries in config to keep my main application code clean. Another reason for doing that is so the dicts can be occasionally edited without having to touch the application.
I thought a good solution was using a json config a la:
http://www.ilovetux.com/Using-JSON-Configs-In-Python/
JSON is a natural, readable format for this type of data. Example:
{
  "search_dsl_full": {
    "function_score": {
      "boost_mode": "avg",
      "functions": [
        {
          "filter": {
            "range": {
              "sort_priority_inverse": {
                "gte": 200
              }
            }
          },
          "weight": 2.4
        }
      ],
      "query": {
        "multi_match": {
          "fields": [
            "name^10",
            "search_words^5",
            "description",
            "skuid",
            "backend_skuid"
          ],
          "operator": "and",
          "type": "cross_fields"
        }
      },
      "score_mode": "multiply"
    }
  }
}
The big problem is, when I import it into my python app and set a dict equal to it like this:
import json
with open("config.json", "r") as fin:
    config = json.load(fin)
...
def create_query():
    query_dsl = config['search_dsl_full']
    return query_dsl
and then later, only when a certain condition is met, I need to update that dict like this:
if (special condition is met):
    query_dsl['function_score']['query']['multi_match']['operator'] = 'or'
Since query_dsl is a reference, it updates the config dictionary too. So when I call the function again, it reflects the updated-for-special-condition version ("or") rather than the desired config default ("and").
I realize this is a newb issue (yes, I'm a python newb), but I can't seem to figure out a 'pythonic' solution. I'm trying to not be a hack.
Possible options:
When I set query_dsl equal to the config dict, use copy.deepcopy()
Figure out how to make all nested slices of the config dictionary immutable
Maybe find a better way to accomplish what I'm trying to do? I'm totally open to this whole approach being a preposterous newbie mistake.
Any help appreciated. Thanks!
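The first option listed above (copy.deepcopy) can be sketched like this. It is a minimal illustration, not a recommendation over the other options: the config dict is inlined and trimmed instead of being loaded from config.json, and the special parameter is a hypothetical stand-in for the real condition.

```python
import copy

# Trimmed stand-in for the loaded JSON config.
config = {
    "search_dsl_full": {
        "function_score": {
            "query": {"multi_match": {"operator": "and"}}
        }
    }
}


def create_query(special=False):
    # Deep copy so edits to the query never reach the shared config.
    query_dsl = copy.deepcopy(config["search_dsl_full"])
    if special:
        query_dsl["function_score"]["query"]["multi_match"]["operator"] = "or"
    return query_dsl


special_query = create_query(special=True)
default_query = create_query()
```

Each call returns an independent structure, so the "or" set for the special case does not leak into later calls or into config itself.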

How to define and select groups of values using configobj?

I would like to define several groups of values where the values of a particular group are used if that group is selected.
Here's an example to make that clearer:
[environment]
type=prod
[prod]
folder=data/
debug=False
[dev]
folder=dev_data/
debug=True
Then to use it:
print(config['folder'])  # prints 'data/' because config['environment']['type'] == 'prod'
Is there a natural or idiomatic way to do this in configobj or otherwise?
Additional Info
My current thoughts are overwriting or adding to the resulting config object using some logic post parsing the config file. However, this feels contrary to the nature of a config file, and feels like it would require somewhat complex logic to validate.

I know this is maybe not exactly what you're searching for, but have you considered using json for easy nested access?
For example, if your config file looks like
{
  "environment": {
    "type": "prod"
  },
  "[dev]": {
    "debug": "True",
    "folder": "dev_data/"
  },
  "[prod]": {
    "debug": "False",
    "folder": "data/"
  }
}
you can access it with [dev] or [prod] key to get your folder:
>>> config = json.loads(config_data)
>>> config['[dev]']['folder']
'dev_data/'
>>> config['[prod]']['folder']
'data/'
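Building on that, picking the group named in the environment section could look like the sketch below. It assumes the JSON layout shown above, including the bracketed "[prod]" and "[dev]" key names; active_settings is a hypothetical helper name.

```python
import json

config_data = """
{
  "environment": {"type": "prod"},
  "[dev]": {"debug": "True", "folder": "dev_data/"},
  "[prod]": {"debug": "False", "folder": "data/"}
}
"""

config = json.loads(config_data)


def active_settings(config):
    """Return the section named by config['environment']['type']."""
    env = config["environment"]["type"]
    # Section keys in this layout are wrapped in brackets, e.g. "[prod]".
    return config[f"[{env}]"]


settings = active_settings(config)
print(settings["folder"])
```

Switching "type" to "dev" in the environment section makes the same call return the dev values, so the rest of the application only ever reads from settings.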
