I have the list below, and I would like to split it into new lists so that all strings starting with the same string end up in the same list.
Eg:-
list1 = ["glibc-2.11.3/include/sys/file.h", "glibc-2.11.3/include/sys/ioctl.h", "glibc-2.11.3/lib/crtn.o", "linux-libc-headers-2.6.32/asm-generic/bitsperlong.h" , "linux-libc-headers-2.6.32/asm-generic/bitsperlong.h", "test-3.7.10/asm/posix_types.h", "test-3.7.10/dsm/posix_types.h"]
Here is my try:-
list1 = ["glibc-2.11.3/include/sys/file.h", "glibc-2.11.3/include/sys/ioctl.h", "glibc-2.11.3/lib/crtn.o", "linux-libc-headers-2.6.32/asm-generic/bitsperlong.h" , "linux-libc-headers-2.6.32/asm-generic/bitsperlong.h"]
element = list1[0].split("/")[0]
newlist = []
for i in list1:
    if i.startswith(element):
        newlist.append(i)
print(newlist)
o/p:- ['glibc-2.11.3/include/sys/file.h', 'glibc-2.11.3/include/sys/ioctl.h', 'glibc-2.11.3/lib/crtn.o']
I get the first set of paths that start with the same string, but I need to loop over the remaining sets as well.
Basically, on the first iteration I expect to get all paths that start with glibc-2.11.3, on the second iteration all paths that start with linux-libc-headers-2.6.32, and so on. I then need to perform a check on each returned set of paths that share a prefix. Please help!
Use a dictionary to keep track of your filepaths
list1 = ["glibc-2.11.3/include/sys/file.h", "glibc-2.11.3/include/sys/ioctl.h", "glibc-2.11.3/lib/crtn.o", "linux-libc-headers-2.6.32/asm-generic/bitsperlong.h" , "linux-libc-headers-2.6.32/asm-generic/bitsperlong.h", "test-3.7.10/asm/posix_types.h", "test-3.7.10/dsm/posix_types.h"]
directories = {}
for filepath in list1:
    key = filepath.split("/")[0]
    directories.setdefault(key, []).append(filepath)
print(directories)
Outputs:
{'glibc-2.11.3': ['glibc-2.11.3/include/sys/file.h',
'glibc-2.11.3/include/sys/ioctl.h',
'glibc-2.11.3/lib/crtn.o'],
'linux-libc-headers-2.6.32': ['linux-libc-headers-2.6.32/asm-generic/bitsperlong.h',
'linux-libc-headers-2.6.32/asm-generic/bitsperlong.h'],
'test-3.7.10': ['test-3.7.10/asm/posix_types.h',
'test-3.7.10/dsm/posix_types.h']}
list(directories.values()) would give you the list of lists you were trying to create, but you don't need to build it: you can loop over directories.values() (or directories.items(), if you also want the key) the same way you would loop over a list of lists.
directories.setdefault(key, []) is a compact way of saying: give me the list stored under this key, or, if there is no list there yet, create an empty one, store it under the key, and give me that. See the documentation for dict.setdefault.
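As a rough sketch of how the grouped paths could then be processed one group at a time, the same grouping can also be written with collections.defaultdict. The function name group_by_top_dir and the shortened paths list are illustrative assumptions, and the print call only stands in for whatever check the question has in mind.

```python
from collections import defaultdict

# Shortened sample data; the full list from the question works the same way.
paths = [
    "glibc-2.11.3/include/sys/file.h",
    "glibc-2.11.3/lib/crtn.o",
    "linux-libc-headers-2.6.32/asm-generic/bitsperlong.h",
    "test-3.7.10/asm/posix_types.h",
]


def group_by_top_dir(filepaths):
    """Group paths by the text before the first '/' (hypothetical helper)."""
    groups = defaultdict(list)
    for filepath in filepaths:
        key = filepath.split("/", 1)[0]
        groups[key].append(filepath)
    return dict(groups)


groups = group_by_top_dir(paths)
for top_dir, members in groups.items():
    # Placeholder for the per-group check mentioned in the question.
    print(top_dir, len(members))
```

defaultdict(list) creates the empty list on first access, which is the same effect setdefault(key, []) has in the loop above.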
I searched for a while but I can't find a solution to my problem. I'm still new to Python, so I sometimes struggle with obvious things... Thanks in advance for your advice!
I have a list containing objects and duplicates of these objects. Both follow specific naming patterns: objects_ext and duplicatedObject_SREF_ext. What I want is this: if there is a duplicated object in my list, check whether the original object is also in the list, and if it is, remove the duplicated object from the list.
I tried to use the remove() method, since each name can occur only once in the list, but it doesn't work. Here is my code:
rawSelection = [u'crapacruk_high', u'doubidou_high', u'blahbli_high', u'crapacruk_SREF_high', u'doubidou_SREF_high', u'blahbli_SREF_high']
# objects with '_SREF_' in their names are the duplicated ones
for obj in rawSelection:
    if '_SREF_' in str(obj):
        rawName = str(obj).split('_')
        rootName = rawName[0]
        defName = rootName + '_' + '_'.join(rawName[2:])
        if defName in rawSelection:
            rawSelection.remove(obj)
# Always returns:
# [u'crapacruk_high', u'doubidou_high', u'blahbli_high', u'doubidou_SREF_high']
# Instead of:
# [u'crapacruk_high', u'doubidou_high', u'blahbli_high']
Edit: Oh, I forgot to say that the duplicated object must be removed from the list only if the original one is in it too.
The problem is that you're mutating the same list you're iterating over.
When you remove u'crapacruk_SREF_high' from the list, everything after it shifts one position to the left (this happens in the C implementation of the list), so u'doubidou_SREF_high' now sits at the index the loop has just processed. When the loop advances to the next index, obj becomes u'blahbli_SREF_high', and u'doubidou_SREF_high' is never visited.
To fix this, iterate over a copy of the list:
for obj in rawSelection[:]:
    ...
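The skipping can be reproduced on a tiny list. This is only an illustrative sketch: the names list and both helper functions are made up for the demonstration and are not taken from the question.

```python
def remove_while_iterating(names):
    """Remove '_SREF' items while looping over the same list (buggy)."""
    visited = []
    for name in names:
        visited.append(name)
        if name.endswith("_SREF"):
            names.remove(name)
    return visited


def remove_via_copy(names):
    """Loop over a shallow copy so removals cannot shift the iteration."""
    visited = []
    for name in names[:]:
        visited.append(name)
        if name.endswith("_SREF"):
            names.remove(name)
    return visited


buggy = ["a", "a_SREF", "b_SREF", "c_SREF"]
buggy_visited = remove_while_iterating(buggy)
# "b_SREF" is never visited, so it survives in the list.

fixed = ["a", "a_SREF", "b_SREF", "c_SREF"]
fixed_visited = remove_via_copy(fixed)
```

In the buggy version the loop only ever sees "a", "a_SREF" and "c_SREF"; in the copy-based version every element is visited exactly once.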
You can turn the for loop from for obj in rawSelection: to for obj in list(rawSelection):. This should fix your issue as it iterates over the copy of the list. The way you do it, you modify the list while iterating over it, leading to problems.
rawSelection = [u'crapacruk_high', u'doubidou_high', u'blahbli_high', u'crapacruk_SREF_high', u'doubidou_SREF_high', u'blahbli_SREF_high']
for obj in list(rawSelection):
    if '_SREF_' in str(obj):
        rawName = str(obj).split('_')
        rootName = rawName[0]
        defName = rootName + '_' + '_'.join(rawName[2:])
        if defName in rawSelection:
            rawSelection.remove(obj)
print(rawSelection)
Break the problem up into subtasks:
def get_orig_name(name):
    if '_SREF_' in name:
        return '_'.join(name.split('_SREF_'))
    else:
        return name
Then just construct a new list with no dups:
rawSelection = [u'crapacruk_high',
u'doubidou_high',
u'blahbli_high',
u'crapacruk_SREF_high',
u'doubidou_SREF_high',
u'blahbli_SREF_high']
uniqueList = [n for n in rawSelection
              if ('_SREF_' not in n) or (get_orig_name(n) not in rawSelection)]
print(uniqueList)
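You could use filter to get quite a clean solution (here raw_selection is the list called rawSelection in the question).
def non_duplicate(s):
    return not ('_SREF_' in s and s.replace('_SREF', '') in raw_selection)
# In Python 3, filter returns an iterator, so wrap it in list().
filtered_selection = list(filter(non_duplicate, raw_selection))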
This will do what you want (note that it doesn't matter what order the items appear in):
rawSelection = list({i.replace('_SREF', '') for i in rawSelection})
This works by iterating through the original list and removing the '_SREF' substring from each item. Each edited string is collected by a set comprehension (that's what the {} brackets mean: a new set object is being created). Then the set is turned back into a list.
This works because a set can't contain duplicate items, so when a duplicate is added, the set is simply left unchanged. Note that the order of the original items is not preserved.
EDIT: as @PeterDeGlopper pointed out in the comments, this does not satisfy the constraint that the _SREF_ item gets removed only if the original appears. For that, we'll do the following:
no_SREF_Set = {i for i in rawSelection if '_SREF_' not in i}
rawSelection = list({i.replace('_SREF', '') if i.replace('_SREF', '') in no_SREF_Set else i for i in rawSelection})
You can combine this into a one-liner, but it's a little long for my taste:
rawSelection = list({i.replace('_SREF', '') if i.replace('_SREF', '') in {i for i in rawSelection if '_SREF_' not in i} else i for i in rawSelection})
This works by creating a set of the items that don't contain '_SREF_', and then creating a new list (similar to the above) in which '_SREF' is stripped from an item only if the version without it appears in no_SREF_Set.
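If the original order matters, one possible variation keeps the set only for fast membership tests and builds the result with an ordinary loop. This is a sketch rather than part of the answer above; the function name drop_sref_duplicates and its marker parameter are assumptions chosen for illustration.

```python
def drop_sref_duplicates(names, marker="_SREF_"):
    """Drop a marked duplicate only when its original is also present.

    The relative order of the surviving items is preserved.
    """
    originals = {n for n in names if marker not in n}
    kept = []
    for n in names:
        # 'foo_SREF_high' maps back to 'foo_high'.
        if marker in n and n.replace(marker, "_", 1) in originals:
            continue
        kept.append(n)
    return kept


raw_selection = [
    "crapacruk_high", "doubidou_high", "blahbli_high",
    "crapacruk_SREF_high", "doubidou_SREF_high", "blahbli_SREF_high",
]
cleaned = drop_sref_duplicates(raw_selection)

# A duplicate whose original is missing is kept.
orphan_kept = drop_sref_duplicates(["x_SREF_high", "y_high"])
```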
In web2py I have been trying to break down this list comprehension so I can do what I like with the categories it creates. Any ideas as to what this breaks down to?
def menu_rec(items):
    return [(x.title, None, URL('shop', 'category', args=pretty_url(x.id, x.slug)), menu_rec(x.children)) for x in items or []]
In addition the following is what uses it:
response.menu = [(SPAN('Catalog', _class='highlighted'), False, '',
menu_rec(db(db.category).select().as_trees()) )]
So far I've come up with:
def menu_rec(items):
    for x in items:
        return (x.title, None, URL('shop', 'category', args=pretty_url(x.id, x.slug)), menu_rec(x.children))
I've got other variations of this, but every variation gives me back only one category, whereas the original gives me all the categories.
Can anyone see where I'm messing this up? Any and all help is appreciated, thank you.
A list comprehension builds a list by appending:
def menu_rec(items):
    result = []
    for x in items or []:
        url = URL('shop', 'category', args=pretty_url(x.id, x.slug))
        menu = menu_rec(x.children)  # recursive call
        result.append((x.title, None, url, menu))
    return result
I've added two local variables to break up the long line somewhat, and to show how it recursively calls itself.
Your version returned directly out of the for loop, during the first iteration, and never built up a list.
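The difference can be tried outside web2py with a small stand-in. In this sketch Node is a hypothetical namedtuple that replaces the category rows, and the URL part of each tuple is left out, so it only illustrates the control flow, not the real menu structure.

```python
from collections import namedtuple

# Hypothetical stand-in for a category row; not part of web2py.
Node = namedtuple("Node", ["title", "children"])


def menu_comprehension(items):
    """Recursive list comprehension, shaped like the original menu_rec."""
    return [(x.title, menu_comprehension(x.children)) for x in items or []]


def menu_loop(items):
    """Equivalent explicit loop: append every item, return once at the end."""
    result = []
    for x in items or []:
        result.append((x.title, menu_loop(x.children)))
    return result


def menu_early_return(items):
    """Mirrors the bug: returns during the first iteration."""
    for x in items or []:
        return (x.title, menu_early_return(x.children))


tree = [Node("Books", [Node("Fiction", None)]), Node("Music", None)]
full_menu = menu_loop(tree)
truncated = menu_early_return(tree)
```

With this sample tree, the early-return version never reaches "Music", which matches the symptom of getting back a single category.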
You don't want to return inside the loop. Instead, append to a list and then return the list:
def menu_rec(items):
    result = []
    for x in items or []:  # `or []` as in the original comprehension, so a falsy items (e.g. None) gives an empty list
        result.append((x.title, None, URL('shop', 'category', args=pretty_url(x.id, x.slug)), menu_rec(x.children)))
    return result
If you return inside the loop, the function exits during the first iteration and hands back only that one value. Appending each tuple to a list and returning the list after the loop ensures that the result is returned only once all the values have been added.
I have the two following lists:
# List of tuples representing the index of resources and their unique properties
# Format of (ID,Name,Prefix)
resource_types=[('0','Group','0'),('1','User','1'),('2','Filter','2'),('3','Agent','3'),('4','Asset','4'),('5','Rule','5'),('6','KBase','6'),('7','Case','7'),('8','Note','8'),('9','Report','9'),('10','ArchivedReport',':'),('11','Scheduled Task',';'),('12','Profile','<'),('13','User Shared Accessible Group','='),('14','User Accessible Group','>'),('15','Database Table Schema','?'),('16','Unassigned Resources Group','#'),('17','File','A'),('18','Snapshot','B'),('19','Data Monitor','C'),('20','Viewer Configuration','D'),('21','Instrument','E'),('22','Dashboard','F'),('23','Destination','G'),('24','Active List','H'),('25','Virtual Root','I'),('26','Vulnerability','J'),('27','Search Group','K'),('28','Pattern','L'),('29','Zone','M'),('30','Asset Range','N'),('31','Asset Category','O'),('32','Partition','P'),('33','Active Channel','Q'),('34','Stage','R'),('35','Customer','S'),('36','Field','T'),('37','Field Set','U'),('38','Scanned Report','V'),('39','Location','W'),('40','Network','X'),('41','Focused Report','Y'),('42','Escalation Level','Z'),('43','Query','['),('44','Report Template ','\\'),('45','Session List',']'),('46','Trend','^'),('47','Package','_'),('48','RESERVED','`'),('49','PROJECT_TEMPLATE','a'),('50','Attachments','b'),('51','Query Viewer','c'),('52','Use Case','d'),('53','Integration Configuration','e'),('54','Integration Command','f'),('55','Integration Target','g'),('56','Actor','h'),('57','Category Model','i'),('58','Permission','j')]
# This is a list of resource ID's that we do not want to reference directly, ever.
unwanted_resource_types=[0,1,3,10,11,12,13,14,15,16,18,20,21,23,25,27,28,32,35,38,41,47,48,49,50,57,58]
I'm attempting to compare the two in order to build a third list containing the 'Name' of each resource type whose ID appears in unwanted_resource_types. For example, the final result list should be:
result = ['Group','User','Agent','ArchivedReport','Scheduled Task','...','...']
I've tried the following that (I thought) should work:
result = []
for res in resource_types:
    if res[0] in unwanted_resource_types:
        result.append(res[1])
and when that failed to populate result I also tried:
result = []
for res in resource_types:
    for type in unwanted_resource_types:
        if res[0] == type:
            result.append(res[1])
also to no avail. Is there something I'm missing? I believe this would be the right place to use a list comprehension, but I don't fully understand those yet (the Python docs are a bit too succinct for me in this case).
I'm also open to completely rethinking this problem, but I do need to retain the list of tuples as it's used elsewhere in the script. Thank you for any assistance you may provide.
Your resource types are using strings, and your unwanted resources are using ints, so you'll need to do some conversion to make it work.
Try this:
result = []
for res in resource_types:
    if int(res[0]) in unwanted_resource_types:
        result.append(res[1])
or using a list comprehension:
result = [item[1] for item in resource_types if int(item[0]) in unwanted_resource_types]
The numbers in resource_types are numbers contained within strings, whereas the numbers in unwanted_resource_types are plain numbers, so your comparison is failing. This should work:
result = []
for res in resource_types:
    if int(res[0]) in unwanted_resource_types:
        result.append(res[1])
The problem is that your triples contain strings and your unwanted resources contain numbers. Either change the data to
resource_types=[(0,'Group','0'), ...
or use int() to convert the strings to ints before comparison, and it should work. Your result can be computed with a list comprehension as in
result=[rt[1] for rt in resource_types if int(rt[0]) in unwanted_resource_types]
If you change ('0', ...) into (0, ... you can leave out the int() call.
Additionally, you may change the unwanted_resource_types variable into a set, like
unwanted_resource_types=set([0,1,3, ... ])
to improve speed (if speed is an issue, else it's unimportant).
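A minimal sketch of both points together, using a shortened copy of the data (the four tuples and the three unwanted IDs below are an excerpt chosen for illustration, and the name unwanted is an assumption):

```python
# Shortened excerpt of the question's data.
resource_types = [
    ("0", "Group", "0"),
    ("1", "User", "1"),
    ("2", "Filter", "2"),
    ("3", "Agent", "3"),
]
unwanted_resource_types = [0, 1, 3]

# A string never compares equal to an int, so this test is always False.
string_matches = [rid for rid, _name, _prefix in resource_types
                  if rid in unwanted_resource_types]

# Convert the ID to int and look it up in a set.
unwanted = set(unwanted_resource_types)
result = [name for rid, name, _prefix in resource_types
          if int(rid) in unwanted]
print(result)
```

The set makes each membership test independent of how many unwanted IDs there are; with lists this small the difference is negligible.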
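The one-liner:
result = list(map(lambda x: dict(map(lambda a: (int(a[0]), a[1]), resource_types))[x], unwanted_resource_types))
does the job without any explicit loop. (In Python 3, map returns an iterator, hence the list() call.)
OK, you don't want to use this in production code, not least because it rebuilds the dictionary for every element, but it's fun. ;-)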
Comment:
The inner dict(map(lambda a: (int(a[0]), a[1]), resource_types)) creates a dictionary from the input data:
{0: 'Group', 1: 'User', 2: 'Filter', 3: 'Agent', ...
The outer map chooses the names from the dictionary.
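The same idea reads more easily if the dictionary is built once and then indexed. This is a sketch on a shortened excerpt of the data; names_by_id is an assumed name, and the `if rid in names_by_id` guard is an addition that skips unknown IDs instead of raising KeyError.

```python
# Shortened excerpt of the question's data.
resource_types = [
    ("0", "Group", "0"),
    ("1", "User", "1"),
    ("2", "Filter", "2"),
    ("3", "Agent", "3"),
]
unwanted_resource_types = [3, 0, 99]  # 99 has no matching resource type

# Build the ID -> name lookup once.
names_by_id = {int(rid): name for rid, name, _prefix in resource_types}

# The result follows the order of unwanted_resource_types.
result = [names_by_id[rid] for rid in unwanted_resource_types
          if rid in names_by_id]
print(result)
```

Unlike the filtering comprehension over resource_types, this variant orders the names by their position in unwanted_resource_types.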