All permutations of string using formatting - python

I have a string template and I want to generate filenames from it. It uses percent formatting with named placeholders right now, and there can be any number of parts to be replaced.
template = "image_%(uval)02d_%(vval)02d.%(frame)04d.tif"
I have an object containing the keys for placeholders, and lists of values:
params = {
"uval": [1,2],
"vval": [1,2],
"frame": [10,11]
}
And I want to generate permutations with formatting:
[
"image_01_01.0010.tif",
"image_01_01.0011.tif",
"image_01_02.0010.tif",
"image_01_02.0011.tif",
"image_02_01.0010.tif",
"image_02_01.0011.tif",
"image_02_02.0010.tif",
"image_02_02.0011.tif"
]
So I tried this:
def permutations(template, params):
    # loop through params, each time replacing expanded with the
    # new list of resolved filenames.
    expanded = [template]
    for param in params:
        newlist = []
        for filename in expanded:
            for number in params[param]:
                newlist.append(filename % {param: number})
        expanded = newlist
    return expanded
print permutations(template, params)
And the problem is:
newlist.append(filename % {param: number})
KeyError: 'uval'
Because it replaces one key at a time, only one key is supplied in each iteration, so the placeholders for the other keys raise the error. Ideally, while replacing one key it should leave the rest of the template untouched.
It works fine if there's only one placeholder of course:
template = "image.%(frame)04d.tif"
params = {"frame": [10, 11]}
print permutations(template, params)
Result: ['image.0010.tif', 'image.0011.tif']
I don't mind using a different system, but ideally I want the template string to be expressive and easy to reason about.
Ideas welcome

I'd use itertools.product to select the parameters, and for each combination, build a single dictionary to use in a formatting step that replaces all the placeholders at once:
import itertools
def permutations(template, params):
    for vals in itertools.product(*params.values()):
        substitution_dict = dict(zip(params, vals))
        yield template % substitution_dict
This is a generator function, so it returns an iterator rather than a list of results. In order to print it, you'll need to pass the iterator to list first. But if your real code is going to do something else (like looping over the results in a for loop, doing something with each one), you may not need to create the list at all. You can just loop on the iterator from the generator function directly.
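As a minimal Python 3 sketch of this approach, using the template and params from the question: it assumes Python 3.7 or later, where dictionaries keep insertion order, so the filenames come out in the order listed in the question. The name expand_template is only illustrative.

```python
import itertools

def expand_template(template, params):
    """Yield the template formatted with every combination of parameter values."""
    keys = list(params)
    # product() varies the last key fastest, like nested for loops
    for vals in itertools.product(*(params[k] for k in keys)):
        yield template % dict(zip(keys, vals))

template = "image_%(uval)02d_%(vval)02d.%(frame)04d.tif"
params = {"uval": [1, 2], "vval": [1, 2], "frame": [10, 11]}

# list() is only needed because we want all results at once
filenames = list(expand_template(template, params))
for name in filenames:
    print(name)
```

Because every placeholder is filled in a single formatting step, no KeyError can occur as long as params has a key for each placeholder in the template.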

Related

To use the parameter as a part of the name for a newly created variable

I wonder if there is a way to create variables automatically using strings, e.g. I have the following code (which does not work properly in Python):
def function(lst1, string1):
    lst2 = 'processed_' + string1
    lst2 = [] #here I created a string called lst2, but I want to use the string as the variable name.
    for i in range(len(lst1)):
        if abs(lst1[i]) >= 0.0001 :
            lst2.append(i)
    return lst2
function(list1, 'price') # list1 is a list which contains the index for column numbers, e.g., [1,2,3]
function(list1, 'promotion')
function(list1, 'calendar')
I would expect that with the function I would be able to create lists such as processed_price, processed_promotion, and processed_calendar, and the function will return these lists.
However, the code above does not work in Python. How should I write the code properly to achieve this goal?
getattr(object, name, [default])
setattr(object, name, value)
To get or set an attribute of an object whose name is given as a string, use one of the above as appropriate. However, any name that comes from user input can be a source of injection attacks: the user could supply a name that you did not expect, and because the name is valid the user gets access to data they should not have access to.
So it is usually advisable to use the user input as a key into a dictionary you define.
dictionary = {
'apple': 'my_value'
}
dictionary[user_input] = 'their_value'
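Applied to the question, the dictionary idea might look like the following Python 3 sketch. The helper name significant_indices and the sample data in columns are assumptions made for illustration; only the 0.0001 threshold and the processed_ prefix come from the question.

```python
def significant_indices(values, threshold=0.0001):
    """Return the indices whose absolute value reaches the threshold."""
    return [i for i, value in enumerate(values) if abs(value) >= threshold]

# Hypothetical sample data, one list of numbers per name
columns = {
    "price": [0.5, 0.0, -2.0],
    "promotion": [0.0, 0.0, 1.0],
    "calendar": [0.0, 0.00005, 0.0],
}

# One dictionary entry per name instead of one dynamically named variable
processed = {
    "processed_" + name: significant_indices(values)
    for name, values in columns.items()
}

print(processed["processed_price"])
```

Looking up processed["processed_price"] then plays the role of the variable processed_price, without creating names at runtime.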

Python substitute elements inside a list

I have the following code that is filtering and printing a list. The final output is json that is in the form of name.example.com. I want to substitute that with name.sub.example.com but I'm having a hard time actually doing that. filterIP is a working bit of code that removes elements entirely and I have been trying to re-use that bit to also modify elements, it doesn't have to be handled this way.
def filterIP(fullList):
    regexIP = re.compile(r'\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}$')
    return filter(lambda i: not regexIP.search(i), fullList)
def filterSub(fullList2):
    regexSub = re.compile(r'example\.com, sub.example.com')
    return filter(lambda i: regexSub.search(i), fullList2)
groups = {key : filterSub(filterIP(list(set(items)))) for (key, items) in groups.iteritems() }
print(self.json_format_dict(groups, pretty=True))
This is what I get without filterSub
"type_1": [
"server1.example.com",
"server2.example.com"
],
This is what I get with filterSub
"type_1": [],
This is what I'm trying to get
"type_1": [
"server1.sub.example.com",
"server2.sub.example.com"
],
The statement:
regexSub = re.compile(r'example\.com, sub.example.com')
doesn't do what you think it does. It creates a compiled regular expression that matches the string "example.com" followed by a comma, a space, the string "sub", an arbitrary character, the string "example", an arbitrary character, and the string "com". It does not create any sort of substitution.
Instead, you want to write something like this, using the re.sub function to perform the substitution and using map to apply it:
def filterSub(fullList2):
    regexSub = re.compile(r'example\.com')
    return map(lambda i: re.sub(regexSub, "sub.example.com", i),
               filter(lambda i: re.search(regexSub, i), fullList2))
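For reference, here is a self-contained Python 3 sketch of the same filter-then-substitute idea, written with list comprehensions. The function names and the sample hosts list are assumptions for illustration, and unlike the question's pattern the dots in the IP regex are escaped and the domain pattern is anchored to the end of the name.

```python
import re

IP_RE = re.compile(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$')
DOMAIN_RE = re.compile(r'example\.com$')

def filter_ip(names):
    """Drop entries that end in something that looks like an IPv4 address."""
    return [name for name in names if not IP_RE.search(name)]

def add_subdomain(names):
    """Keep names ending in example.com and rewrite them to sub.example.com."""
    return [DOMAIN_RE.sub('sub.example.com', name)
            for name in names if DOMAIN_RE.search(name)]

# Hypothetical input list
hosts = ['server1.example.com', '10.0.0.1', 'server2.example.com']
result = add_subdomain(filter_ip(hosts))
print(result)
```

Note that a name that already ends in sub.example.com would be rewritten again, so this sketch assumes the input contains only unmodified names.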
If the examples are all truly as simple as those you listed, a regex is probably overkill. A simple solution would be to use string .split and .join. This would likely give better performance.
First split the url at the first period:
url = 'server1.example.com'
split_url = url.split('.', 1)
# ['server1', 'example.com']
Then you can use the sub to rejoin the url:
subbed_url = '.sub.'.join(split_url)
# 'server1.sub.example.com'
Of course you can do the split and the join at the same time
'.sub.'.join(url.split('.', 1))
Or create a simple function:
def sub_url(url):
    return '.sub.'.join(url.split('.', 1))
To apply this to the list you can take several approaches.
A list comprehension:
subbed_list = [sub_url(url)
               for url in url_list]
Map it:
subbed_list = map(sub_url, url_list)
Or my favorite, a generator:
gen_subbed = (sub_url(url)
              for url in url_list)
The last looks like a list comprehension but gives the added benefit that you don't rebuild the entire list. It processes the elements one item at a time as the generator is iterated through. If you decide you do need the list later you can simply convert it to a list as follows:
subbed_list = list(gen_subbed)
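The lazy behaviour of the generator can be seen in a small Python 3 sketch; the contents of url_list are assumed sample data. One caveat it illustrates: a generator can be consumed only once, so items already taken from it do not appear in a later list() call.

```python
def sub_url(url):
    """Insert 'sub' after the first label of the host name."""
    return '.sub.'.join(url.split('.', 1))

# Hypothetical sample data
url_list = ['server1.example.com', 'server2.example.com']

gen_subbed = (sub_url(url) for url in url_list)

first = next(gen_subbed)   # only the first element is processed here
rest = list(gen_subbed)    # the remaining elements are processed now

print(first)
print(rest)
```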

Having problems in extracting duplicates

I am stumped with this problem, and no matter how I get around it, it is still giving me the same result.
Basically, supposedly I have 2 groups - GrpA_null and GrpB_null, each having 2 meshes in them and are named exactly the same, brick_geo and bars_geo
- Result: GrpA_null --> brick_geo, bars_geo
But for some reason, when the code below (which I presume is the part causing the problem) is run, the program reports that GrpA_null has the same duplicates as GrpB_null, probably because both groups reference brick_geo and bars_geo. As soon as the code is run, my child geometry gets a numerical suffix:
- Result: GrpA_null --> brick_geo0, bars_geo0, GrpB_null1 --> brick_geo, bars_geo1
So I tried to modify the code so that, as long as the parents (GrpA_null and GrpB_null) are different, it does not touch the children.
Could someone kindly advise me on this?
def extractDuplicateBoxList(self, inputs):
    result = {}
    for i in range(0, len(inputs)):
        print '<<< i is : %s' %i
        for n in range(0, len(inputs)):
            print '<<< n is %s' %n
            if i != n:
                name = inputs[i].getShortName()
                # Result: brick_geo
                Lname = inputs[i].getLongName()
                # Result: |GrpA_null|concrete_geo
                if name == inputs[n].getShortName():
                    # If list already created as result.
                    if result.has_key(name):
                        # Make sure its not already in the list and add it.
                        alreadyAdded = False
                        for box in result[name]:
                            if box == inputs[i]:
                                alreadyAdded = True
                        if alreadyAdded == False:
                            result[name].append(inputs[i])
                    # Otherwise create a new list and add it.
                    else:
                        result[name] = []
                        result[name].append(inputs[i])
    return result
There are a couple of things you may want to be aware of. First and foremost, indentation matters in Python. I don't know if the indentation of your code as is is as intended, but your function code should be indented further in than your function def.
Secondly, I find your question a little difficult to understand. But there are several things which would improve your code.
In the collections module, there is (or should be) a type called defaultdict. This type is similar to a dict, except for it having a default value of the type you specify. So a defaultdict(int) will have a default of 0 when you get a key, even if the key wasn't there before. This allows the implementation of counters, such as to find duplicates without sorting.
from collections import defaultdict
counter = defaultdict(int)
for item in items:
    counter[item] += 1
This brings me to another point. Python for loops implement a for-each structure. You almost never need to enumerate your items in order to then access them. So, instead of
for i in range(0,len(inputs)):
you want to use
for input in inputs:
and if you really need to enumerate your inputs
for i,input in enumerate(inputs):
Finally, you can iterate and filter through iterable objects using list comprehensions, dict comprehensions, or generator expressions. They are very powerful. See Create a dictionary with list comprehension in Python
Try this code out, play with it. See if it works for you.
from collections import defaultdict
def extractDuplicateBoxList(self, inputs):
    counts = defaultdict(int)
    for input in inputs:
        counts[input.getShortName()] += 1
    dup_shns = set([k for k,v in counts.items() if v > 1])
    # test each element i itself, not the leftover loop variable input
    dups = [i for i in inputs if i.getShortName() in dup_shns]
    return dups
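If the result should be a dictionary keyed by short name, as in the question's code, the same idea can be sketched in Python 3 with defaultdict(list). The FakeNode class is a hypothetical stand-in for the real scene objects; it only assumes they expose getShortName() and getLongName() as in the question.

```python
from collections import defaultdict

class FakeNode:
    """Hypothetical stand-in for the objects passed in as inputs."""
    def __init__(self, long_name):
        self._long_name = long_name

    def getLongName(self):
        return self._long_name

    def getShortName(self):
        # the short name is the last component of the long name
        return self._long_name.rsplit('|', 1)[-1]

def extract_duplicates(inputs):
    """Group inputs by short name and keep only names used more than once."""
    groups = defaultdict(list)
    for node in inputs:
        groups[node.getShortName()].append(node)
    return {name: nodes for name, nodes in groups.items() if len(nodes) > 1}

nodes = [
    FakeNode('|GrpA_null|brick_geo'),
    FakeNode('|GrpA_null|bars_geo'),
    FakeNode('|GrpB_null|brick_geo'),
    FakeNode('|GrpB_null|bars_geo'),
    FakeNode('|GrpB_null|roof_geo'),
]
dups = extract_duplicates(nodes)
for name, members in dups.items():
    print(name, [m.getLongName() for m in members])
```

This makes a single pass over the inputs instead of comparing every pair.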
I was about to make the same remarks as bitsplit, but he has already made them.
So for the moment I will just give you code that I think does exactly the same as yours, based on those remarks and on the dictionary's get method:
from collections import defaultdict
def extract_Duplicate_BoxList(self, inputs):
    result = defaultdict(list)
    for i,A in enumerate(inputs):
        print '<<< i is : %s' %i
        name = A.getShortName() # Result: brick_geo
        Lname = A.getLongName() # Result: |GrpA_null|concrete_geo
        for n in (j for j,B in enumerate(inputs)
                  if j!=i and B.getShortName()==name):
            print '<<< n is %s' %n
            if A not in result.get(name,[]):
                result[name].append(A)
    return result
Secondly, as bitsplit said, I find your question hard to understand.
Could you give more information on the elements of inputs?
Your explanations about GrpA_null and GrpB_null, the names and the meshes are unclear.
EDIT:
If my reduction/simplification is correct, then what you essentially do is compare each pair of elements A and B of inputs (with A != B) and record A in the dictionary result under its short name (only once) if A and B have the same short name.
I think this code can still be reduced to just:
def extract_Duplicate_BoxList(inputs):
    result = defaultdict(list)
    for i,A in enumerate(inputs):
        print '<<< i is : %s' %i
        # Note: this groups every input by short name; a name with a single entry has no duplicate.
        result[A.getShortName()].append(A)
    return result
This may do what you're looking for, if I understand the question correctly: it seems you want to compare the sub-hierarchies of different nodes to see whether they have the same names.
import maya.cmds as cmds
def child_nodes(node):
    ''' returns a set with the relative paths of all <node>'s children'''
    root = cmds.ls(node, l=True)[0]
    children = cmds.listRelatives(node, ad=True, f=True)
    return set( [k[len(root):] for k in children])
child_nodes('group1')
# Result: set([u'|pCube1|pCubeShape1', u'|pSphere1', u'|pSphere1|pSphereShape1', u'|pCube1']) #
# note the returns are NOT valid maya paths, since i've removed the root <node>,
# you'd need to add it back in to actually access a real shape here:
all_kids = child_nodes('group1')
real_children = ['group1' + n for n in all_kids ]
Since the returns are sets, you can test to see if they are equal, see if one is a subset or superset of the other, see what they have in common and so on:
# compare children
child_nodes('group1') == child_nodes('group2')
# group1's children are a superset of group2's:
child_nodes('group1').issuperset(child_nodes('group2'))
Iterating over a bunch of nodes is easy:
# collect all the child sets of a bunch of nodes:
kids = dict ( (k, child_nodes(k)) for k in cmds.ls(*nodes))
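The set comparisons themselves can be tried outside Maya. The following Python 3 sketch uses hard-coded path lists as an assumed stand-in for what cmds.listRelatives would return, and the helper relative_children is hypothetical.

```python
def relative_children(root, full_paths):
    """Strip the root prefix from each full path so groups can be compared."""
    return {path[len(root):] for path in full_paths
            if path.startswith(root + '|')}

# Hypothetical full paths, standing in for the output of cmds.listRelatives
group1_paths = ['|group1|pCube1', '|group1|pCube1|pCubeShape1', '|group1|pSphere1']
group2_paths = ['|group2|pCube1', '|group2|pCube1|pCubeShape1']

kids1 = relative_children('|group1', group1_paths)
kids2 = relative_children('|group2', group2_paths)

same = kids1 == kids2                # are the hierarchies identical?
superset = kids1.issuperset(kids2)   # does group1 contain everything in group2?
only_in_group1 = kids1 - kids2       # children missing from group2

print(same, superset, sorted(only_in_group1))
```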

Choose list which is returned by def

I have a function that separates some coordinates based on specific properties.
For this separation I use one function, and within it I have 9 lists (one per criterion). For the output I want only the list that I specify; otherwise I cannot use it for plotting.
def sorteerCord(cord):
    tweestijging=[]
    stijginggelijk=[]
    stijgingdaling=[]
    tweedaling=[]
    dalinggelijk=[]
    dalingstijging=[]
    tweegelijk=[]
    gelijkstijging=[]
    gelijkdaling=[]
    y=0
    while y<len(cord):
        lijst=cord[y]
        if (lijst[1]-lijst[0])>0.5:
            if (lijst[2]-lijst[1])>0.5:
                tweestijging.append(y)
            if (lijst[2]-lijst[1])<=0.5 and (lijst[2]-lijst[1])>=-0.5:
                stijginggelijk.append(y)
            if (lijst[2]-lijst[1])<-0.5:
                stijgingdaling.append(y)
        if (lijst[1]-lijst[0])<-0.5:
            if (lijst[2]-lijst[1])>0.5:
                dalingstijging.append(y)
            if (lijst[2]-lijst[1])<=0.5 and (lijst[2]-lijst[1])>=-0.5:
                dalinggelijk.append(y)
            if (lijst[2]-lijst[1])<-0.5:
                tweedaling.append(y)
        if (lijst[1]-lijst[0])<=0.5 and (lijst[1]-lijst[0])>=-0.5:
            if (lijst[2]-lijst[1])>0.5:
                gelijkstijging.append(y)
            if (lijst[2]-lijst[1])<=0.5 and (lijst[2]-lijst[1])>=-0.5:
                tweegelijk.append(y)
            if (lijst[2]-lijst[1])<-0.5:
                gelijkdaling.append(y)
        y=y+1
    print raw_input()
    return raw_input()
Is there a way to specify in my def which list is returned, something like def sorteerCord(cord, output=tweestijging)?
I am guessing that in the last two lines you want the user to input which output list to use, but I am not quite sure. You could use a dictionary to map input strings to variables.
Something like:
def sorteerCord(cord, output):
    # all of your separation code
    outputmap = { 'tweestijging': tweestijging,
                  'gelijkstijging' : gelijkstijging,
                  # and more of those
                }
    return outputmap[ output ]
And then call:
sorteerCord(cord, 'gelijkstijging')
You could of course also opt for returning all of the lists or keep them in a dictionary instead:
output = { 'tweestijging': [],
           'gelijkstijging': [],
           # etc
         }
# code to manipulate lists goes here
return output
Then select one afterwards using the same technique.
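Putting the dictionary idea together, a runnable Python 3 sketch could look like this. The helper trend, the tolerance parameter and the sample cord data are assumptions for illustration; the category names and the 0.5 threshold follow the question's code.

```python
def trend(delta, tolerance=0.5):
    """Classify one step as rising, falling or flat (names as in the question)."""
    if delta > tolerance:
        return "stijging"
    if delta < -tolerance:
        return "daling"
    return "gelijk"

def sorteer_cord(cord, output=None):
    """Group the indices of cord by the trend of their two steps."""
    groups = {}
    for index, lijst in enumerate(cord):
        first, second, third = lijst[:3]
        a, b = trend(second - first), trend(third - second)
        # two equal trends are named "twee..." in the question
        key = "twee" + a if a == b else a + b
        groups.setdefault(key, []).append(index)
    if output is None:
        return groups            # all lists, keyed by category name
    return groups.get(output, [])

# Hypothetical sample coordinates
cord = [(0, 1, 2), (0, 1, 1), (0, 0, 0), (2, 1, 0)]
print(sorteer_cord(cord, "tweestijging"))
```

Calling sorteer_cord(cord) without a second argument returns every category, which is convenient when several lists are plotted.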

How to compare an element of a tuple (int) to determine if it exists in a list

I have the two following lists:
# List of tuples representing the index of resources and their unique properties
# Format of (ID,Name,Prefix)
resource_types=[('0','Group','0'),('1','User','1'),('2','Filter','2'),('3','Agent','3'),('4','Asset','4'),('5','Rule','5'),('6','KBase','6'),('7','Case','7'),('8','Note','8'),('9','Report','9'),('10','ArchivedReport',':'),('11','Scheduled Task',';'),('12','Profile','<'),('13','User Shared Accessible Group','='),('14','User Accessible Group','>'),('15','Database Table Schema','?'),('16','Unassigned Resources Group','#'),('17','File','A'),('18','Snapshot','B'),('19','Data Monitor','C'),('20','Viewer Configuration','D'),('21','Instrument','E'),('22','Dashboard','F'),('23','Destination','G'),('24','Active List','H'),('25','Virtual Root','I'),('26','Vulnerability','J'),('27','Search Group','K'),('28','Pattern','L'),('29','Zone','M'),('30','Asset Range','N'),('31','Asset Category','O'),('32','Partition','P'),('33','Active Channel','Q'),('34','Stage','R'),('35','Customer','S'),('36','Field','T'),('37','Field Set','U'),('38','Scanned Report','V'),('39','Location','W'),('40','Network','X'),('41','Focused Report','Y'),('42','Escalation Level','Z'),('43','Query','['),('44','Report Template ','\\'),('45','Session List',']'),('46','Trend','^'),('47','Package','_'),('48','RESERVED','`'),('49','PROJECT_TEMPLATE','a'),('50','Attachments','b'),('51','Query Viewer','c'),('52','Use Case','d'),('53','Integration Configuration','e'),('54','Integration Command','f'),('55','Integration Target','g'),('56','Actor','h'),('57','Category Model','i'),('58','Permission','j')]
# This is a list of resource ID's that we do not want to reference directly, ever.
unwanted_resource_types=[0,1,3,10,11,12,13,14,15,16,18,20,21,23,25,27,28,32,35,38,41,47,48,49,50,57,58]
I'm attempting to compare the two in order to build a third list containing the 'Name' of each unique resource type that currently exists in unwanted_resource_types. e.g. The final result list should be:
result = ['Group','User','Agent','ArchivedReport','Scheduled Task','...','...']
I've tried the following that (I thought) should work:
result = []
for res in resource_types:
    if res[0] in unwanted_resource_types:
        result.append(res[1])
and when that failed to populate result I also tried:
result = []
for res in resource_types:
    for type in unwanted_resource_types:
        if res[0] == type:
            result.append(res[1])
also to no avail. Is there something I'm missing? I believe this would be the right place to use a list comprehension, but I don't fully understand those yet (the Python docs are a bit too succinct for me in this case).
I'm also open to completely rethinking this problem, but I do need to retain the list of tuples as it's used elsewhere in the script. Thank you for any assistance you may provide.
Your resource types are using strings, and your unwanted resources are using ints, so you'll need to do some conversion to make it work.
Try this:
result = []
for res in resource_types:
    if int(res[0]) in unwanted_resource_types:
        result.append(res[1])
or using a list comprehension:
result = [item[1] for item in resource_types if int(item[0]) in unwanted_resource_types]
The numbers in resource_types are numbers contained within strings, whereas the numbers in unwanted_resource_types are plain numbers, so your comparison is failing. This should work:
result = []
for res in resource_types:
    if int( res[0] ) in unwanted_resource_types:
        result.append(res[1])
The problem is that your triples contain strings and your unwanted resources contain numbers, change the data to
resource_types=[(0,'Group','0'), ...
or use int() to convert the strings to ints before comparison, and it should work. Your result can be computed with a list comprehension as in
result=[rt[1] for rt in resource_types if int(rt[0]) in unwanted_resource_types]
If you change ('0', ...) into (0, ... you can leave out the int() call.
Additionally, you may change the unwanted_resource_types variable into a set, like
unwanted_resource_types=set([0,1,3, ... ])
to improve speed (if speed is an issue, else it's unimportant).
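A short Python 3 sketch combining the int() conversion with a set, using only a small excerpt of the question's data for brevity:

```python
# Excerpt of the question's data: (ID, Name, Prefix), with IDs stored as strings
resource_types = [
    ('0', 'Group', '0'),
    ('1', 'User', '1'),
    ('2', 'Filter', '2'),
    ('3', 'Agent', '3'),
    ('10', 'ArchivedReport', ':'),
]

# A set gives constant-time membership tests on average
unwanted_resource_types = {0, 1, 3, 10}

result = [name for res_id, name, _prefix in resource_types
          if int(res_id) in unwanted_resource_types]
print(result)
```

The names come out in the order of resource_types, since that is the list being iterated.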
The one-liner:
result = map(lambda x: dict(map(lambda a: (int(a[0]), a[1]), resource_types))[x], unwanted_resource_types)
without any explicit loop does the job.
Ok - you don't want to use this in production code - but it's fun. ;-)
Comment:
The inner dict(map(lambda a: (int(a[0]), a[1]), resource_types)) creates a dictionary from the input data:
{0: 'Group', 1: 'User', 2: 'Filter', 3: 'Agent', ...
The outer map chooses the names from the dictionary.
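A more readable variant of the same lookup idea, sketched in Python 3, builds the dictionary once rather than once per element. The sample data is a small assumed excerpt, and the ID 99 is a made-up value included to show that unknown IDs are skipped.

```python
# Excerpt of the question's data: (ID, Name, Prefix)
resource_types = [
    ('0', 'Group', '0'),
    ('2', 'Filter', '2'),
    ('3', 'Agent', '3'),
    ('10', 'ArchivedReport', ':'),
]
unwanted_resource_types = [10, 0, 3, 99]  # 99 is a hypothetical unknown ID

# Build the ID -> name mapping a single time
names_by_id = {int(res_id): name for res_id, name, _prefix in resource_types}

# Look each unwanted ID up, skipping IDs that have no entry
result = [names_by_id[i] for i in unwanted_resource_types if i in names_by_id]
print(result)
```

Here the result follows the order of unwanted_resource_types rather than the order of resource_types.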
