Dictionary creation code. What is going on here most likely? - python

I am looking at this code:
DICT_IDS = dict(x.split('::')
                for x in object.method()
                ['ids_comma_separated'].split(','))
DICT_ATTRS = dict(x.split('::')
                  for x in object.method()
                  ['comma_separated_key_value_pairs'].split(','))
So each constant will ultimately refer to a dictionary, but what is going on inside the constructors?
Does this occur first:
x.split('::')
for x in object.method()
So x must be a string that is split on the ::, right?
EDIT
Oh....
for x in object.method()
['ids_comma_separated'].split(',')
is executed first. x is probably another dictionary that we key into using ids_comma_separated whose value is a string that needs to be split on the , like "cat,dog, mouse" into a list. So x is going to be a list?

It is just parsing values like this into a dict:
'ids_comma_separated': "somekey::somevalue,anotherkey::anothervalue"
from a method (object.method()) that returns a dictionary:
class object:
    def method():
        return {
            'ids_comma_separated': "somekey::somevalue,anotherkey::anothervalue"
        }
DICT_IDS = dict(x.split('::')
                for x in object.method()
                ['ids_comma_separated'].split(','))
DICT_IDS
# {'somekey': 'somevalue', 'anotherkey': 'anothervalue'}
The part inside dict() is a generator expression, but the line breaks make that a little hard to see:
(x.split('::') for x in object.method()['ids_comma_separated'].split(','))
In each iteration, x is a string such as somekey::somevalue, which is split once again, this time on '::', into a [key, value] pair.
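To make the evaluation order concrete, the one-liner can be unrolled into separate steps. This is only a sketch: the sample string is the one used in the answer, and the names raw, items, pairs and result are made up for illustration.

```python
# Sample value, standing in for object.method()['ids_comma_separated']
raw = "somekey::somevalue,anotherkey::anothervalue"

# Step 1: split on ',' to get one string per key/value pair
items = raw.split(",")

# Step 2: split each item on '::' to get a two-element [key, value] list
pairs = [x.split("::") for x in items]

# Step 3: dict() accepts any iterable of two-element sequences
result = dict(pairs)
print(result)
```

The original code does the same thing, except that step 2 is a generator expression, so the pairs are produced lazily while dict() consumes them.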

Related

Python: understanding lambda operations in a function

Suppose I have a function designed to find the smallest and largest y values in a list of dictionaries.
s1 = [
    {'x':10, 'y':8.04},
    {'x':8, 'y':6.95},
    {'x':13, 'y':7.58},
    {'x':9, 'y':8.81},
    {'x':11, 'y':8.33},
    {'x':14, 'y':9.96},
    {'x':6, 'y':7.24},
    {'x':4, 'y':4.26},
    {'x':12, 'y':10.84},
    {'x':7, 'y':4.82},
    {'x':5, 'y':5.68},
]
def range_y(list_of_dicts):
    y = lambda dict: dict['y']
    return y(min(list_of_dicts, key=y)), y(max(list_of_dicts, key=y))
range_y(s1)
This works and gives the intended result.
What I don't understand is the y before the (min(list_of_dicts, key=y). I know I can find the min and max with min(list_of_dicts, key=lambda d: d['y'])['y'] where the y parameter goes at the end (obviously swapping min for max).
Can someone explain to me what is happening in y(min(list_of_dicts, key=y)) with the y and the parenthetical?
y is a function, where the function is defined by the lambda statement. The function accepts a dictionary as an argument, and returns the value at key 'y' in the dictionary.
min(list_of_dicts, key=y) returns the dictionary from the list with the smallest value under key 'y'.
So, putting it together, you get the value at key 'y' in the dictionary that has the smallest value under key 'y' of all dictionaries in the list.
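The two steps can be seen separately in a short sketch. The three-item list is a shortened version of s1 from the question, and the name smallest is made up for illustration.

```python
s1 = [{'x': 10, 'y': 8.04}, {'x': 4, 'y': 4.26}, {'x': 12, 'y': 10.84}]

# y is an ordinary function: it takes a dict and returns its 'y' value
y = lambda d: d['y']

# Step 1: min() compares the dicts by y(d) but returns the whole dict
smallest = min(s1, key=y)
print(smallest)      # {'x': 4, 'y': 4.26}

# Step 2: calling y on that dict pulls out just the 'y' value
print(y(smallest))   # 4.26
```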
I know I can find the min and max with min(list_of_dicts, key=lambda d: d['y'])['y'] ...
It's exactly the same as that, but the function y does the indexing. It's a bit shorter and DRYer to write it that way.
Note that named lambdas are generally bad practice, although this case isn't too bad. Best practice is to use operator.itemgetter:
import operator
y = operator.itemgetter('y')
However, you can do it better by using generator expressions to get the min/max y-values directly, instead of their containing dicts. Then the indexing expression is written only twice, which makes the function y practically pointless.
return min(d['y'] for d in list_of_dicts), max(d['y'] for d in list_of_dicts)
I'm declaring Abuse of Lambda. Whenever you see a lambda assigned to a variable, you need to ask why give a name to an anonymous function? And when that function lacks a clear name, why make this hard? The function could be rewritten as follows:
def get_y(d):
    """Return "y" item from collection `d`"""
    return d['y']
def range_y(list_of_dicts):
    return get_y(min(list_of_dicts, key=get_y)), get_y(max(list_of_dicts, key=get_y))
In fact, there is a function in the standard library that does this, so this may be more expected:
import operator
def range_y(list_of_dicts):
    get_y = operator.itemgetter("y")
    return get_y(min(list_of_dicts, key=get_y)), get_y(max(list_of_dicts, key=get_y))
But there is a more straightforward way to write this. itemgetter is useful as a key in the min/max searches, but only confuses things once you've selected the dicts.
def range_y(list_of_dicts):
    get_y = operator.itemgetter("y")
    return min(list_of_dicts, key=get_y)["y"], max(list_of_dicts, key=get_y)["y"]
But since all you care about is the min/max "y", extract those values and work with them from the beginning.
def range_y(list_of_dicts):
    y_vals = [d["y"] for d in list_of_dicts]
    return min(y_vals), max(y_vals)

Split string list into dictionary keys in python

I have a string 'request.context.user_id' and I want to split the string by '.' and use each element in the list as a dictionary key. Is there a way to do this for lists of varying lengths without trying to hard code all the different possible list lengths after the split?
parts = string.split('.')
if len(parts)==1:
    data = [x for x in logData if x[parts[0]] in listX]
elif len(parts)==2:
    data = [x for x in logData if x[parts[0]][parts[1]] in listX]
else:
    print("Add more hard code")
listX is a list of string values that should be retrieved by x[parts[0]][parts[1]].
logData is a list obtained from reading a json file and then the list can be read into a dataframe using json_normalize... the df portion is provided to give some context about its structure.. a list of dicts:
import json
from pandas.io.json import json_normalize
with open(project_root+"filename") as f:
    logData = json.load(f)
df = json_normalize(logData)
If you want arbitrary counts, that means you need a loop. You can index repeatedly to drill through layers of dictionaries.
parts = "request.context.user_id".split(".")
logData = [{"request": {"context": {"user_id": "jim"}}}]
listX = ["jim"]
def generate(logData, parts):
    for x in logData:
        ref = x
        # ref will be, successively, x, then the 'request' dictionary, then the
        # 'context' dictionary, then the 'user_id' value 'jim'.
        for key in parts:
            ref = ref[key]
        if ref in listX:
            yield x
data = list(generate(logData, parts))  # [{'request': {'context': {'user_id': 'jim'}}}]
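The same drilling loop can also be written with functools.reduce and operator.getitem from the standard library. This is a sketch of an alternative, not part of the answer above: the helper name get_nested, the second log entry and the names log_data, wanted and matches are made up for illustration.

```python
from functools import reduce
import operator

def get_nested(d, parts):
    """Follow the keys in parts through nested dictionaries."""
    # reduce computes getitem(getitem(getitem(d, parts[0]), parts[1]), ...)
    return reduce(operator.getitem, parts, d)

log_data = [
    {"request": {"context": {"user_id": "jim"}}},
    {"request": {"context": {"user_id": "ann"}}},
]
wanted = ["jim"]
parts = "request.context.user_id".split(".")

# Keep the entries whose nested value is one of the wanted strings
matches = [x for x in log_data if get_nested(x, parts) in wanted]
print(matches)
```

Like the explicit loop, this raises KeyError if any key along the path is missing.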
I just realized in the comments you said that you didn't want to create a new dictionary but access an existing one x via chaining up the parts in the list.
(3.b) Use a for loop to get/set the value at the end of the key path
In case you only want to read the value at the end of the path:
import copy
def get_val(key_list, dict_):
    reduced = copy.deepcopy(dict_)
    for i in range(len(key_list)):
        reduced = reduced[key_list[i]]
    return reduced
# this solution isn't mine, see the link below
def set_val(dict_, key_list, value_):
    for key in key_list[:-1]:
        dict_ = dict_.setdefault(key, {})
    dict_[key_list[-1]] = value_
get_val()
Where key_list is the result of string.split('.') and dict_ is the x dictionary in your case.
You can leave out the copy.deepcopy() part, that's just for paranoid peeps like me - the reason is the python dict is not immutable, thus working on a deepcopy (a separate but exact copy in the memory) is a solution.
set_val() As I said it's not my idea, credit to #Bakuriu
dict.setdefault(key, default_value) will take care of non-existing keys in x.
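A short usage sketch of set_val() shows how setdefault() fills in missing levels. The function body is repeated so the example runs on its own; the sample dictionary x and its 'method' entry are made up for illustration.

```python
def set_val(dict_, key_list, value_):
    # Walk down to the parent of the last key, creating empty dicts as needed
    for key in key_list[:-1]:
        dict_ = dict_.setdefault(key, {})
    dict_[key_list[-1]] = value_

x = {"request": {"method": "GET"}}   # note: no 'context' level yet
set_val(x, "request.context.user_id".split("."), "jim")
print(x)
# {'request': {'method': 'GET', 'context': {'user_id': 'jim'}}}
```

The existing 'request' dictionary is reused, and only the missing 'context' level is created.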
(3) evaluating a string as code with eval() and/or exec()
So here's an ugly unsafe solution:
def chainer(key_list):
    new_str = ''
    for key in key_list:
        new_str = "{}['{}']".format(new_str, key)
    return new_str
x = {'request': {'context': {'user_id': 'is this what you are looking for?'}}}
keys = 'request.context.user_id'.split('.')
chained_keys = chainer(keys)
# quite dirty but you may use eval() to evaluate a string
print( eval("x{}".format(chained_keys)) )
# will print
is this what you are looking for?
which is the innermost value of the mockup x dict
I assume you could use this in your code like this
data = [x for x in logData if eval("x{}".format(chained_keys)) in listX]
# or in python 3.x with f-string
data = [x for x in logData if eval(f"x{chained_keys}") in listX]
...or something similar.
Similarly, you can use exec() to execute a string as code if you wanted to write to x, though it's just as dirty and unsafe.
exec("x{} = '...or this, maybe?'".format(chained_keys))
print(x)
# will print
{'request': {'context': {'user_id': '...or this, maybe?'}}}
(2) An actual solution could be a recursive function as so:
def nester(key_list):
    if len(key_list) == 0:
        return 'value' # can change this to whatever you like
    else:
        return {key_list.pop(0): nester(key_list)}
keys = 'request.context.user_id'.split('.')
# ['request', 'context', 'user_id']
data = nester(keys)
print(data)
# will result
{'request': {'context': {'user_id': 'value'}}}
(1) A solution with a list comprehension that splits the string by '.' and uses each element in the list as a dictionary key
data = {}
parts = 'request.context.user_id'.split('.')
if parts: # one or more items
    [data.update({part: 'value'}) for part in parts]
print(data)
# the result
{'request': 'value', 'context': 'value', 'user_id': 'value'}
You can overwrite the values in data afterwards.
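For this flat variant, the built-in dict.fromkeys() gives the same result without using a list comprehension for its side effects. This is a sketch of an alternative rather than part of the answer above.

```python
parts = 'request.context.user_id'.split('.')

# Build a dict with every part as a key and the same placeholder value
data = dict.fromkeys(parts, 'value')
print(data)
# {'request': 'value', 'context': 'value', 'user_id': 'value'}
```

One caveat: every key gets the very same value object, so a mutable default such as [] would be shared between all keys.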

Optimize a dictionary key conditional

I would like to optimize this piece of code. I'm sure there is a way to write it in a single line:
if 'support' in dictionary:
    x = dictionary['support']
else:
    x = []
Use the dictionary's get() method:
x = dictionary.get('support', [])
If 'support' is not a key in the dictionary, get() returns its second argument, here an empty list.
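A small sketch of both cases, using made-up sample data (the 'name' entry and the list of support channels are assumptions for illustration):

```python
dictionary = {'name': 'server-1'}

# Key missing: get() falls back to the default and does not modify the dict
missing = dictionary.get('support', [])
print(missing)                      # []
print('support' in dictionary)      # False

# Key present: get() returns the stored value and ignores the default
dictionary['support'] = ['email', 'phone']
present = dictionary.get('support', [])
print(present)                      # ['email', 'phone']
```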

Substitution in function based on input parameter

Say I have multiple lists called data_w1, data_w2, data_w3, ..., data_wn. I have a function that takes an integer band as an input, and I'd like the function to operate only on the corresponding list.
I am familiar with string substitution, but I'm not dealing with strings here, so how would I do this? Some pseudocode:
def my_function(band):
    wband_new = []
    for entry in data_wband:
        # do stuff
        wband_new.append( #new stuff )
    return wband_new
But doing the above doesn't work as expected because I get the errors that anything with wband in it isn't defined. How can I do this?
Not exactly sure what you're asking, but if you mean to have lists 1, 2, ..., n then an integer i and you want to get the i'th list, simply have a list of lists and index the outer list with the integer i (in your case called band).
l = [data_w1, data_w2, data_w3]
list_to_operate_on = l[band - 1]  # band counts from 1, list indices from 0
func(list_to_operate_on)
Suppose you have your data variables in the script before the function. What you need to do is substitute data_wband with globals()['data_w'+str(band)]:
data_w1 = [1,2,3]
data_w2 = [4,5,6]
def my_function(band):
    wband_new = []
    for entry in globals()['data_w'+str(band)]:
        # do stuff
        wband_new.append(entry)  # replace entry with the new stuff
    return wband_new
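If you control how the data is stored, a dictionary keyed by band number avoids both the globals() lookup and any off-by-one indexing. This is a sketch under that assumption: the name data_by_band is made up, and doubling each entry is only a placeholder for the real "do stuff" step.

```python
# One dictionary instead of separate data_w1, data_w2, ... variables
data_by_band = {
    1: [1, 2, 3],
    2: [4, 5, 6],
}

def my_function(band):
    wband_new = []
    for entry in data_by_band[band]:
        # placeholder for the real processing
        wband_new.append(entry * 2)
    return wband_new

print(my_function(2))   # [8, 10, 12]
```

An unknown band raises KeyError, which is usually easier to diagnose than a missing global name.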

Choose list which is returned by def

I have a definition to separate some coordinates on specific properties.
For this separation I use one definition, and within the definition I have 9 lists (different criteria). For the output I want only the list I specify; otherwise I cannot use it for plotting.
def sorteerCord(cord):
    tweestijging=[]
    stijginggelijk=[]
    stijgingdaling=[]
    tweedaling=[]
    dalinggelijk=[]
    dalingstijging=[]
    tweegelijk=[]
    gelijkstijging=[]
    gelijkdaling=[]
    y=0
    while y<len(cord):
        lijst=cord[y]
        if (lijst[1]-lijst[0])>0.5:
            if (lijst[2]-lijst[1])>0.5:
                tweestijging.append(y)
            if (lijst[2]-lijst[1])<=0.5 and (lijst[2]-lijst[1])>=-0.5:
                stijginggelijk.append(y)
            if (lijst[2]-lijst[1])<-0.5:
                stijgingdaling.append(y)
        if (lijst[1]-lijst[0])<-0.5:
            if (lijst[2]-lijst[1])>0.5:
                dalingstijging.append(y)
            if (lijst[2]-lijst[1])<=0.5 and (lijst[2]-lijst[1])>=-0.5:
                dalinggelijk.append(y)
            if (lijst[2]-lijst[1])<-0.5:
                tweedaling.append(y)
        if (lijst[1]-lijst[0])<=0.5 and (lijst[1]-lijst[0])>=-0.5:
            if (lijst[2]-lijst[1])>0.5:
                gelijkstijging.append(y)
            if (lijst[2]-lijst[1])<=0.5 and (lijst[2]-lijst[1])>=-0.5:
                tweegelijk.append(y)
            if (lijst[2]-lijst[1])<-0.5:
                gelijkdaling.append(y)
        y=y+1
    print raw_input()
    return raw_input()
Is there a way to specify in my def which list is the output, like def sorteerCord(cord, output=tweestijging)?
I am guessing that in the last two lines you want the user to input which output list to use, but I am not quite sure. You could use a dictionary to map input strings to variables.
Something like:
def sorteerCord(cord, output):
    # all of your separation code
    outputmap = { 'tweestijging': tweestijging,
                  'gelijkstijging' : gelijkstijging,
                  # and more of those
                }
    return outputmap[ output ]
And then call:
sorteerCord(cord, 'gelijkstijging')
You could of course also opt for returning all of the lists or keep them in a dictionary instead:
output = { 'tweestijging': [],
           'gelijkstijging': [],
           # etc
         }
# code to manipulate lists goes here
return output
Then select one afterwards using the same technique.
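A compact sketch of the keep-them-in-a-dictionary idea is shown below. It is not the asker's code: the helper trend(), the function name sorteer_cord and the tuple keys such as ('stijging', 'gelijk') are made up for illustration, and the 0.5 threshold is taken from the question. Each coordinate index is filed under the pair (first change, second change).

```python
def trend(delta, tol=0.5):
    """Classify a difference as 'stijging' (rise), 'daling' (fall) or 'gelijk' (equal)."""
    if delta > tol:
        return 'stijging'
    if delta < -tol:
        return 'daling'
    return 'gelijk'

def sorteer_cord(cord):
    output = {}
    for index, lijst in enumerate(cord):
        key = (trend(lijst[1] - lijst[0]), trend(lijst[2] - lijst[1]))
        # setdefault creates the list the first time a category is seen
        output.setdefault(key, []).append(index)
    return output

cord = [[0, 1, 2], [0, 1, 1], [2, 1, 0], [0, 0, 0]]
groups = sorteer_cord(cord)
print(groups.get(('stijging', 'stijging'), []))   # [0]
print(groups.get(('daling', 'stijging'), []))     # [] (no such coordinates)
```

Selecting a list for plotting is then a single lookup, and get() with an empty-list default covers categories that never occurred.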
