str.format a list by joining its values - python

Say I have a dictionary:
data = {
"user" : {
"properties" : ["i1", "i2"]
}
}
And the following string:
txt = "The user has properties {user[properties]}"
I want to have:
txt.format(**data)
to equal:
The user has properties i1, i2
I believe to achieve this, I could subclass the formatter used by str.format but I am unfortunately unsure how to proceed. I rarely subclass standard Python classes. Note that writing {user[properties][0]}, {user[properties][1]} is not an ideal option for me here. I don't know how many items are in the list so I would need to do a regex to identify matches, then find the relevant value in data and replace the matched text with {user[properties][0]}, {user[properties][1]}. str.format takes care of all the indexing from the string's value so it is very practical.

Just join the items in data["user"]["properties"]
txt = "The user has properties {properties}"
txt.format(properties = ", ".join(data["user"]["properties"]))
Here you have a live example

I ended up using the jinja2 package for all of my formatting needs. It's extremely powerful and I really recommend it!

Related

Write a custom JSON interpreter for a file that looks like json but isnt using Python

What I need to do is to write a module that can read and write files that use the PDX script language. This language looks alot like json but has enough differences that a custom encoder/decoder is needed to do anything with those files (without a mess of regex substitutions which would make maintenance hell). I originally went with just reading them as txt files, and use regex to find and replace things to convert it to valid json. This lead me to my current point, where any additions to the code requires me to write far more code than I would want to, just to support some small new thing. So using a custom json thing I could write code that shows what valid key:value pairs are, then use that to handle the files. To me that will be alot less code and alot easier to maintain.
So what does this code look like? In general it looks like this (tried to put all possible syntax, this is not an example of a working file):
#key = value # this is the definition for the scripted variable
key = {
# This is a comment. No multiline comments
function # This is a single key, usually optimize_memory
# These are the accepted key:value pairs. The quoted version is being phased out
key = "value"
key = value
key = #key # This key is using a scripted variable, defined either in the file its in or in the `scripted_variables` folder. (see above for example on how these are initially defined)
# type is what the key type is. Like trigger:planet_stability where planet_stability is a trigger
key = type:key
# Variables like this allow for custom names to be set. Mostly used for flags and such things
[[VARIABLE_NAME]
math_key = $VARIABLE_NAME$
]
# this is inline math, I dont actually understand how this works in the script language yet as its new. The "<" can be replaced with any math symbol.
# Valid example: planet_stability < #[ stabilitylevel2 + 10 ]
key < #[ key + 10 ]
# This is used alot to handle code blocks. Valid example:
# potential = {
# exists = owner
# owner = {
# has_country_flag = flag_name
# }
# }
key = {
key = value
}
# This is just a list. Inline brackets are used alot which annoys me...
key = { value value }
}
The major differences between json and PDX script is the nearly complete lack of quotations, using an equals sign instead of a colon for separation and no comma's at the end of the lines. Now before you ask me to change the PDX code, I cant. Its not mine. This is what I have to work with and cant make any changes to the syntax. And no I dont want to convert back and forth as I have already mentioned this would require alot of work. I have attempted to look for examples of this, however all I can find are references to convert already valid json to a python object, which is not what I want. So I cant give any examples of what I have already done, as I cant find anywhere to even start.
Some additional info:
Order of key:value pairs does not technically matter, however it is expected to be in a certain order, and when not in that order causes issues with mods and conflict solvers
bool properties always use yes or no rather than true or false
Lowercase is expected and in some cases required
Math operators are used as separators as well, eg >=, <= ect
The list of syntax is not exhaustive, but should contain most of the syntax used in the language
Past work:
My last attempts at this all revolved around converting it from a text file to a json file. This was alot of work just to get a small piece of this to work.
Example:
potential = {
exists = owner
owner = {
is_regular_empire = yes
is_fallen_empire = no
}
NOR = {
has_modifier = resort_colony
has_modifier = slave_colony
uses_habitat_capitals = yes
}
}
And what i did to get most of the way to json (couldnt find a way to add quotes)
test_string = test_string.replace("\n", ",")
test_string = test_string.replace("{,", "{")
test_string = test_string.replace("{", "{\n")
test_string = test_string.replace(",", ",\n")
test_string = test_string.replace("}, ", "},\n")
test_string = "{\n" + test_string + "\n}"
# Replace the equals sign with a colon
test_string = test_string.replace(" =", ":")
This resulted in this:
{
potential: {
exists: owner,
owner: {
is_regular_empire: yes,
is_fallen_empire: no,
},
NOR: {
has_modifier: resort_colony,
has_modifier: slave_colony,
uses_habitat_capitals: yes,
},
}
}
Very very close yes, but in no way could I find a way to add the quotations to each word (I think I did try a regex sub, but wasnt able to get it to work, since this whole thing is just one unbroken string), making this attempt stuck and also showing just how much work is required just to get a very simple potential block to mostly work. However this is not the method I want anymore, one because its alot of work and two because I couldnt find anything to finish it. So a custom json interpreter is what I want.
The classical approach (potentially leading to more code, but also more "correctness"/elegance) is probably to build a "recursive descent parser", from a bunch of conditionals/checks, loops and (sometimes recursive?) functions/handlers to deal with each of the encountered elements/characters on the input stream. An implicit parse/call tree might be sufficient if you directly output/print the JSON equivalent, or otherwise you could also create a representation/model in memory for later output/conversion.
Related book recommendation could be "Language Implementation Patterns" by Terence Parr, me avoiding to promote my own interpreters and introductory materials :-) In case you need further help, maybe write me?

Django model filter targeting JSONField where the keys contain hyphen / dash

I am trying to apply a model filter on a JSONField BUT the keys in the JSON are UUIDs.
So when is do something like...
MyModel.objects.filter(data__8d8dd642-32cb-48fa-8d71-a7d6668053a7=‘bob’)
... I get a compile error. The hyphens in the UUID are the issue.
Any clues if there is an escape char or another behaviour to use? My database is PostgreSQL.
Update 1 - now with added JSON
{
‘8d8dd642-32cb-48fa-8d71-a7d6668053a7’: ’8d8dd642-32cb-48fa-8d71-a7d6668053a7’,
‘9a2678c4-7a49-4851-ab5d-6e7fd6d33d72’: ‘John Smith’,
‘9933ae39-1a27-4477-a9f4-3d1839f93fb4’: ‘Employee’
}
I was having this same issue where I couldn't use __contains, and found that you can use **kwargs unpacking to get it to work, which allows you to pass the filter as a string (this is also useful if you were to need a dynamic filter):
kwargs = {
'data__8d8dd642-32cb-48fa-8d71-a7d6668053a7': 'bob'
}
MyModel.objects.filter(**kwargs)
That's looks difficult and I'm not 100% convinced what I have will work.
You can try using the JsonField contains lookup. The lookup is explained more under the docs for HStoreField since it's shared functionality.
This would look like this:
MyModel.objects.filter(data__contains={'8d8dd642-32cb-48fa-8d71-a7d6668053a7': 'bob'})
I think this will allow you to circumvent the fact that the lookups need to be valid Python variable names.
If you also want to search wild card (using the example from above)
This will basically search LIKE %bob%
kwargs = {
'data__8d8dd642-32cb-48fa-8d71-a7d6668053a7__icontains': 'bob'
}
MyModel.objects.filter(**kwargs)

convert string representation of a dict inside json dict value to dict

Hello all and sorry if the title was worded poorly. I'm having a bit of trouble wrapping my head around how to solve this issue I have encountered. I would have liked to simply pass a dict as the value for this key in my json obj but sadly I have to pass it as a string. So, I have a json dict object that looks like this
data = {"test": "Fuzz", "options": "'{'size':'Regular','connection':'unconnected'}'"}. Obviously, I would prefer that the second dict value weren't a string representation of a dictionary but rather a dictionary. Is the best route here to just strip the second and second to last single quotes for the data[options] or is there a better alternative?
Sorry for any confusion. This is how the json object looks after I perform
json.dump(data, <filename>)
The value for options can be thought of as another variable say x and it's equivalent to '{'size':'Regular','connection':'unconnected'}'
I could do x[1:-1] but I'm not sure if that is the most pythonic way to do things here.
import ast
bad_string_dict = "'{'size':'Regular','connection':'unconnected'}'"
good_string_dict = bad_string_dict.strip("'")
good_dict = ast.literal_eval(good_string_dict)
print(good_dict)
You will have to strip quotation mark, no other way around
Given OP's comments I suggest the following:
Set the environment variable to a known data format (example: json/yaml/...), not a specific language (python)
Use the json module (or the format you've chosen) to load the data
The data should look like this:
raw_data = {"test": "Fuzz", "options": "{\"size\": \"Regular\", \"connection\": \"unconnected\"}"}
And the code should look like this:
raw_options = raw_data['options']
options = json.loads(raw_options)
data = {**raw_data, 'options': options}

reformat string of index field when creating RethinkDB index

I am trying to create an index in RethinkDB.
The documents look like this:
{ "pinyin" : "a1 ai3"}
To make searching easier, I would like to preprocess the index entries and remove spaces and numbers, the entry thus should simply be "aai" in this case. What I tried are various variants of the following:
r.index_create('pinyin', lambda doc: doc['pinyin'].replace("1", "")).run()
This is a most simple case to build from, but even here I get an error
Expected 2 arguments but found 3 in:
r.table('collection').index_create('pinyin', lambda var_7: var_7['pinyin'].replace('1', ''))
It's obvious that I do not understand what's going on. Can anybody help? I gather that the lambda expression has to follow python syntax, but since it will be used on the server has to be JavaScript??
I do not have experience using rethinkdb. But if you would like to remove all numbers and space from a string value the below snippet might help.
import re
doc = { "pinyin" : "a1 ai3"}
removeIntSpace = lambda doc: re.sub("\d", "x", doc['pinyin'].replace(" ", ""))
print removeIntSpace(doc)
#r.index_create('pinyin', removeIntSpace(doc)).run()
Output:
axaix
OK, I figured it out and am posting here for others who might run into the same problem. This is how the index can be created in the data explorer of the rethinkdb admin console:
r.db("dics").table("collection").indexCreate('pinyin', r.row("pinyin").split("").filter(function(char) { return r.expr(["1", "2", "3", "4", " "]).contains(char).not(); }).reduce(function(a, b) { return a.add(b); }))
A corresponding python version would use an anonymous function with lambda in the filter part.

Find documents which contain a particular value - Mongo, Python

I'm trying to add a search option to my website but it doesn't work. I looked up solutions but they all refer to using an actual string, whereas in my case I'm using a variable, and I can't make those solutions work. Here is my code:
cursor = source.find({'title': search_term}).limit(25)
for document in cursor:
result_list.append(document)
Unfortunately this only gives back results which match the search_term variable's value exactly. I want it to give back any results where the title contains the search term - regardless what other strings it contains. How can I do it if I want to pass a variable to it, and not an actual string? Thanks.
You can use $regex to do contains searches.
cursor = collection.find({'field': {'$regex':'regular expression'}})
And to make it case insensitive:
cursor = collection.find({'field': {'$regex':'regular expression', '$options'‌​:'i'}})
Please try cursor = source.find({'title': {'$regex':search_term}}).limit(25)
$text
You can perform a text search using $text & $search. You first need to set a text index, then use it:
$ db.docs.createIndex( { title: "text" } )
$ db.docs.find( { $text: { $search: "search_term" } } )
$regex
You may also use $regex, as answered here: https://stackoverflow.com/a/10616781/641627
$ db.users.findOne({"username" : {$regex : ".*son.*"}});
Both solutions compared
Full Text Search vs. Regular Expressions
... The regular expression search takes longer for queries with just a
few results while the full text search gets faster and is clearly
superior in those cases.

Categories