I searched for a while but couldn't find a solution to my problem. I'm still new to Python, so I sometimes struggle with obvious things... Thanks in advance for your advice!
I have a list containing objects and duplicates of these objects. Both have specific names: objects_ext and duplicatedObject_SREF_ext. For each duplicated object in my list, I want to check whether the original object is also in the list and, if it is, remove the duplicated object from the list.
I tried to use the remove() method, as there can only be one occurrence of each name in the list, but it doesn't work. Here is my code:
rawSelection = [u'crapacruk_high', u'doubidou_high', u'blahbli_high', u'crapacruk_SREF_high', u'doubidou_SREF_high', u'blahbli_SREF_high']
# objects with '_SREF_' in their names are the duplicated ones
for obj in rawSelection:
    if '_SREF_' in str(obj):
        rawName = str(obj).split('_')
        rootName = rawName[0]
        defName = rootName + '_' + '_'.join(rawName[2:])
        if defName in rawSelection:
            rawSelection.remove(obj)
# Always returns:
# [u'crapacruk_high', u'doubidou_high', u'blahbli_high', u'doubidou_SREF_high']
# Instead of:
# [u'crapacruk_high', u'doubidou_high', u'blahbli_high']
Edit: Oh, forgot to say that the duplicated object must be removed from list only if the original one is in it too.
The problem is that you're mutating the same list you're iterating over.
When you remove u'crapacruk_SREF_high' from the list, everything after it shifts one position to the left (this happens at the C level inside the list implementation), so the slot at the current index now holds u'doubidou_SREF_high'. When the loop then advances to the next index, obj becomes u'blahbli_SREF_high', and u'doubidou_SREF_high' is never visited.
To fix this, iterate over a copy of the list while removing from the original:
for obj in rawSelection[:]:
    ...
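To see the skipping in isolation, here is a minimal sketch that runs the same removal logic twice: once over the list being mutated and once over a copy. The function names and the shortened sample names are made up for illustration.

```python
def remove_dups_in_place(names):
    """Remove '_SREF_' duplicates while iterating over the same list (skips items)."""
    names = list(names)  # private copy so the caller's list is untouched
    for name in names:   # iterating over the very list that is being mutated
        if '_SREF_' in name and name.replace('_SREF', '') in names:
            names.remove(name)
    return names


def remove_dups_over_copy(names):
    """Same logic, but the loop walks a snapshot of the list."""
    names = list(names)
    for name in names[:]:  # names[:] is a shallow copy; removals do not shift it
        if '_SREF_' in name and name.replace('_SREF', '') in names:
            names.remove(name)
    return names


selection = ['a_high', 'b_high', 'c_high',
             'a_SREF_high', 'b_SREF_high', 'c_SREF_high']
skipped = remove_dups_in_place(selection)
cleaned = remove_dups_over_copy(selection)
print(skipped)  # 'b_SREF_high' survives because the loop stepped over it
print(cleaned)
```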
You can turn the for loop from for obj in rawSelection: to for obj in list(rawSelection):. This should fix your issue as it iterates over the copy of the list. The way you do it, you modify the list while iterating over it, leading to problems.
rawSelection = [u'crapacruk_high', u'doubidou_high', u'blahbli_high', u'crapacruk_SREF_high', u'doubidou_SREF_high', u'blahbli_SREF_high']
for obj in list(rawSelection):
    if '_SREF_' in str(obj):
        rawName = str(obj).split('_')
        rootName = rawName[0]
        defName = rootName + '_' + '_'.join(rawName[2:])
        if defName in rawSelection:
            rawSelection.remove(obj)
print(rawSelection)
Break the problem up into subtasks
def get_orig_name(name):
    if '_SREF_' in name:
        return '_'.join(name.split('_SREF_'))
    else:
        return name
Then just construct a new list with no dups
rawSelection = [u'crapacruk_high',
                u'doubidou_high',
                u'blahbli_high',
                u'crapacruk_SREF_high',
                u'doubidou_SREF_high',
                u'blahbli_SREF_high']
uniqueList = [n for n in rawSelection
              if ('_SREF_' not in n) or (get_orig_name(n) not in rawSelection)]
print(uniqueList)
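You could use filter to get quite a clean solution (raw_selection here is the list called rawSelection in the question).
def non_duplicate(s):
    return not ('_SREF_' in s and s.replace('_SREF', '') in raw_selection)
filtered_selection = list(filter(non_duplicate, raw_selection))  # list() is needed on Python 3, where filter returns an iterator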
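Each of the approaches above tests membership against the list itself, which is a linear scan per item. A possible variant, sketched here under the assumption that the names are hashable strings, builds a set once and filters against it while keeping the original order. The function name drop_sref_duplicates and the extra 'orphan_SREF_high' entry are made up for illustration.

```python
def drop_sref_duplicates(names):
    """Return names without '_SREF_' entries whose original is also present."""
    present = set(names)  # one-time build; membership tests are then O(1) on average
    return [n for n in names
            if '_SREF_' not in n or n.replace('_SREF', '') not in present]


raw_selection = ['crapacruk_high', 'doubidou_high', 'blahbli_high',
                 'crapacruk_SREF_high', 'doubidou_SREF_high', 'blahbli_SREF_high',
                 'orphan_SREF_high']  # no 'orphan_high', so this one must stay
result = drop_sref_duplicates(raw_selection)
print(result)
```

Because this builds a new list instead of removing from the old one, the skipped-element problem cannot occur.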
This will do what you want (note that it doesn't matter what order the items appear in):
rawSelection = list({i.replace('_SREF', '') for i in rawSelection})
This works by iterating through the original list and removing the '_SREF' substring from each item. The edited strings are collected by a set comprehension (that's what the {} brackets mean: a new set object is being created). Then the set object is turned back into a list object.
This works because set objects can't contain duplicate items, so an attempt to add a duplicate is silently ignored. Note that the order of the original items is not preserved.
EDIT: as @PeterDeGlopper pointed out in the comments, this does not satisfy the constraint that the _SREF_ item gets removed only if the original appears. For that, we'll do the following:
no_SREF_Set = {i for i in rawSelection if '_SREF_' not in i}
rawSelection = list({i.replace('_SREF', '') if i.replace('_SREF', '') in no_SREF_Set else i for i in rawSelection})
You can combine this into a one-liner, but it's a little long for my taste:
rawSelection = list({i.replace('_SREF', '') if i.replace('_SREF', '') in {i for i in rawSelection if '_SREF_' not in i} else i for i in rawSelection})
This works by creating a set of the items that don't have '_SREF_', and then creating a new list (similar to the above) that strips '_SREF' from an item only if the version without '_SREF' appears in no_SREF_Set.
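The difference between the two set-based versions only shows up when a '_SREF_' item has no original in the list. A small sketch with made-up names ('zed_high' is deliberately absent), using sorted() so the result does not depend on set ordering:

```python
names = ['a_high', 'a_SREF_high', 'zed_SREF_high']  # 'zed_high' is not in the list

# First version: strips '_SREF' unconditionally, so the orphan is renamed.
naive = sorted({n.replace('_SREF', '') for n in names})

# Constrained version: strips '_SREF' only when the original is present.
originals = {n for n in names if '_SREF_' not in n}
constrained = sorted({n.replace('_SREF', '')
                      if n.replace('_SREF', '') in originals else n
                      for n in names})

print(naive)        # the orphan shows up as 'zed_high'
print(constrained)  # the orphan keeps its '_SREF_' name
```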
def mutations(list_a, string1, name, list_b):
    """ (list of str, str, list of str, list of str) -> NoneType
    """
    dna = list_a
    for i in range(len(list_b)):
        strand = dna[:dna.index(list_b[i])]
        string1 = string1[string1.index(list_b[i]):]
        dna[strand+string1]
>>>dna=['TGCAGAATTCGGTT','ACGTCCCGGGTTGC']
>>>mutations(dna,'CCCGGGGAATTCTCGC',['EcoRI','SmaI'],['GAATTC','CCCGGG'])
>>>mutated
>>>['TGCAGAATTCTCGC','ACGTCCCGGGGAATTCTCGC']
It's supposed to modify the first parameter. So basically I'm trying to modify list_a so that it becomes ['TGCAGAATTCTCGC','ACGTCCCGGGGAATTCTCGC']. However, I get an error saying:
strand=dna[:dna.index(string1[i])].
ValueError: 'GAATTC' is not in list
Also, is there a way to make the function leave the list unmodified if the sequence does not exist?
Well, if I understand you correctly, you want to check whether each element in list_a contains its corresponding element from list_b. If so, you want to modify the element from list_a by replacing the rest of the string (including the list_b element) with the part of a control string that also contains the element from list_b, right?!
Ideally you would put this in your question!!
A way of doing this would be as follows:
def mut(list_a, control, list_b):
    check_others = False
    for i in range(len(list_a)):  # run through list_a (use xrange in Python 2.x)
        if i == len(list_b):  # once we are past the end of list_b we will
                              # check all elements of list_b (see below)
            check_others = True
        if not check_others:  # this is the normal 1 to 1 match.
            if list_b[i] in list_a[i]:  # if the element from list_b is in it
                # correct the element
                list_a[i] = list_a[i][:list_a[i].index(list_b[i])] +\
                            control[control.index(list_b[i]):]
        else:  # this happens once we are past the end of list_b
            for j in range(len(list_b)):  # run through list_b from the start
                if list_b[j] in list_a[i]:
                    list_a[i] = list_a[i][:list_a[i].index(list_b[j])] +\
                                control[control.index(list_b[j]):]
                    break  # only the first match with an element in list_b is used!
As described in the comments, dna is a list, not a string, so finding a substring won't quite work how you want.
dna=list_a is unnecessary, and dna[strand+string1] doesn't modify a list, so not sure what you were trying to accomplish there.
All in all, I know the following code doesn't get the output you are expecting (or maybe it does), but hopefully it sets you on the more correct path.
(I removed name because it was not used)
def mutations(mutated, clean, recognition):
    """ (list of str, str, list of str) -> NoneType
    """
    # Loop over 'mutated' list. We need the index to update the list
    for i, strand in enumerate(mutated):
        # Loop over 'recognition' list
        for rec in recognition:
            # Find the indices in the two strings
            strand_idx = strand.find(rec)
            clean_idx = clean.find(rec)
            # Check that 'rec' existed in both strings
            # (find returns -1 when missing; 0 is a valid match at the start)
            if strand_idx >= 0 and clean_idx >= 0:
                # both are found, so get the substrings
                strand_str = strand[:strand_idx]
                clean_str = clean[clean_idx:]
                # debug these values
                print(rec, (strand_idx, strand_str,), (clean_idx, clean_str, ))
                # update 'mutated' like this
                mutated[i] = strand_str + clean_str
And the output. (Both dna elements changed; 'CCCGGG' sits at index 0 of the clean string, which is why the check must accept 0.)
dna = ['TGCAGAATTCGGTT', 'ACGTCCCGGGTTGC']
mutations(dna, 'CCCGGGGAATTCTCGC', ['GAATTC', 'CCCGGG'])
print(dna)  # ['TGCAGAATTCTCGC', 'ACGTCCCGGGGAATTCTCGC']
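For the follow-up question about leaving a strand alone when the sequence does not exist, one possible sketch is a small helper built on str.find, which returns -1 when the substring is missing and 0 when it matches at the very start. The helper name splice_at_site and the 'AAAA' strand are made up for illustration.

```python
def splice_at_site(strand, clean, site):
    """Return strand up to site, followed by clean from site onward.

    If site is missing from either string, strand is returned unchanged.
    """
    strand_idx = strand.find(site)
    clean_idx = clean.find(site)
    if strand_idx == -1 or clean_idx == -1:  # -1 means "not found"; 0 is a real match
        return strand
    return strand[:strand_idx] + clean[clean_idx:]


clean = 'CCCGGGGAATTCTCGC'
first = splice_at_site('TGCAGAATTCGGTT', clean, 'GAATTC')
second = splice_at_site('ACGTCCCGGGTTGC', clean, 'CCCGGG')  # site is at index 0 of clean
untouched = splice_at_site('AAAA', clean, 'GAATTC')         # site absent from the strand
print(first, second, untouched)
```

Returning a new string keeps the helper free of side effects; the caller decides whether to write the result back into the list.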
I've been programming for less than four weeks and have run into a problem that I cannot figure out. I'm trying to append a string value to an existing key that already has a string stored in it, but if any value already exists in the key I get "'str' object has no attribute 'append'".
I've tried turning the value to list but this also does not work. I need to use the .append() attribute because update simply replaces the value in clientKey instead of appending to whatever value is already stored. After doing some more research, I understand now that I need to somehow split the value stored in clientKey.
Any help would be greatly appreciated.
from time import gmtime, strftime

data = {}
while True:
    clientKey = input().upper()
    refDate = strftime("%Y%m%d%H%M%S", gmtime())
    refDate = refDate[2:]
    ref = clientKey + refDate
    if clientKey not in data:
        data[clientKey] = ref
    elif ref in data[clientKey]:
        print("That invoice already exists")
    else:
        data[clientKey].append(ref)
        break
You can't .append() to a string because a string is not mutable. If you want your dictionary value to be able to contain multiple items, it should be a container type such as a list. The easiest way to do this is just to add the single item as a list in the first place.
if clientKey not in data:
    data[clientKey] = [ref]  # single-item list
Now you can data[clientKey].append() all day long.
A simpler approach for this problem is to use collections.defaultdict. This automatically creates the item when it's not there, making your code much simpler.
from collections import defaultdict
data = defaultdict(list)
# ... same as before up to your if
if clientKey in data and ref in data[clientKey]:
    print("That invoice already exists")
else:
    data[clientKey].append(ref)
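As a rough, self-contained sketch of the defaultdict approach without the input loop, the check-and-append step can be wrapped in a function. The name add_invoice, the 'ACME' key, and the reference strings are hypothetical.

```python
from collections import defaultdict


def add_invoice(data, client_key, ref):
    """Record ref under client_key; return False if it was already recorded."""
    if ref in data[client_key]:  # defaultdict(list) creates an empty list on first access
        return False
    data[client_key].append(ref)
    return True


data = defaultdict(list)
first = add_invoice(data, 'ACME', 'ACME240101120000')
repeat = add_invoice(data, 'ACME', 'ACME240101120000')  # same ref again
second = add_invoice(data, 'ACME', 'ACME240101120001')
print(first, repeat, second, dict(data))
```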
You started with a string value, and you cannot call .append() on a string. Start with a list value instead:
if clientKey not in data:
    data[clientKey] = [ref]
Now data[clientKey] references a list object with one string in it. List objects do have an append() method.
If you want to keep appending to the string instead, you can use data[clientKey] += ref.
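One thing to keep in mind when choosing between the two: the in operator behaves differently on a string and on a list. On a string it is a substring test, so concatenated references can produce matches that no single reference would. A tiny sketch with made-up references:

```python
as_string = 'ACME1'
as_string += 'ACME12'     # references run together: 'ACME1ACME12'

as_list = ['ACME1']
as_list.append('ACME12')  # references stay separate items

probe = 'E1A'             # not a real reference, just a fragment spanning two of them
string_hit = probe in as_string  # substring test
list_hit = probe in as_list      # whole-item test
print(string_hit, list_hit)
```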
In web2py I have been trying to break down this list comprehension so I can do what I like with the categories it creates. Any ideas as to what this breaks down to?
def menu_rec(items):
    return [(x.title, None, URL('shop', 'category', args=pretty_url(x.id, x.slug)), menu_rec(x.children)) for x in items or []]
In addition the following is what uses it:
response.menu = [(SPAN('Catalog', _class='highlighted'), False, '',
menu_rec(db(db.category).select().as_trees()) )]
So far I've come up with:
def menu_rec(items):
    for x in items:
        return (x.title, None, URL('shop', 'category', args=pretty_url(x.id, x.slug)), menu_rec(x.children))
I've got other variations of this, but every variation only gives me back one category, whereas the original gives me all the categories.
Can anyone see where I'm messing this up at? Any and all help is appreciated, thank you.
A list comprehension builds a list by appending:
def menu_rec(items):
    result = []
    for x in items or []:
        url = URL('shop', 'category', args=pretty_url(x.id, x.slug))
        menu = menu_rec(x.children)  # recursive call
        result.append((x.title, None, url, menu))
    return result
I've added two local variables to break up the long line somewhat, and to show how it recursively calls itself.
Your version returned directly out of the for loop, during the first iteration, and never built up a list.
You don't want to return inside the loop. Instead, append to a list and then return the list:
def menu_rec(items):
    result = []
    for x in items or []:
        # append takes a single argument, so the four values are wrapped in a tuple
        result.append((x.title, None, URL('shop', 'category', args=pretty_url(x.id, x.slug)), menu_rec(x.children)))
    return result
If you return inside the loop, the function returns after only the first iteration. Instead, keep adding to a list and return that list at the end. This ensures that the result is returned only once all the values have been added, instead of returning just one value.
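The equivalence can be checked outside web2py with a stand-in for the category rows. In this sketch, Node is a hypothetical namedtuple with only title and children, and the tuples are shortened to two fields; it is not the web2py data model.

```python
from collections import namedtuple

# Hypothetical stand-in for a category row: just a title and its children.
Node = namedtuple('Node', ['title', 'children'])


def menu_comprehension(items):
    return [(x.title, menu_comprehension(x.children)) for x in items or []]


def menu_loop(items):
    result = []
    for x in items or []:  # "items or []" turns None into an empty list
        result.append((x.title, menu_loop(x.children)))
    return result


def menu_early_return(items):
    for x in items or []:
        return (x.title, menu_early_return(x.children))  # exits on the first item


tree = [Node('Books', [Node('Fiction', None)]), Node('Music', None)]
from_comprehension = menu_comprehension(tree)
from_loop = menu_loop(tree)
from_early_return = menu_early_return(tree)
print(from_loop)
print(from_early_return)  # only 'Books' and its first child; 'Music' is lost
```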
seen2 = set()
def eliminate_abs(d):  ## remove all entries that connect to the abstraction node, type(d) = list
    def rec(x):
        if x not in seen2:
            seen2.add(x)
            a = x.hypernyms()
            if len(a) != 0:
                kk = a[0]
                if re.search('abstraction', str(kk)):
                    syns.remove(ii)
                else:
                    rec(kk)
    for ii in d:  ## type(ii) = <class 'nltk.corpus.reader.wordnet.Synset'>
        rec(ii)
eliminate_abs(syns)
The list "syns" will eventually be converted into a tree, but I first need to remove all of the items which ultimately connect to the abstraction node. What I want this function to do is recursively look through all of the hypernyms for each item in "syns" and, if "abstraction" is ever found, remove the original term from "syns". For some reason this is only removing some of them.
Since you are mucking with syns while iterating over it, you should iterate over a slice of syns, i.e. make a copy of the list and iterate over the copy:
for ii in d[:]:
    rec(ii)
Figured it out. It works fine, but there are a bunch of repeats in syns, so all of them after the first one get skipped. Removing
if not x in seen2:
    seen2.add(x)
makes it work fine.
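Both issues (removing while iterating, and a shared "seen" set that hides repeats) can be sidestepped by building a new list. The sketch below uses a plain dictionary as a hypothetical stand-in for WordNet, mapping each term to its first hypernym; parent_of, reaches, and the sample terms are made up and are not part of NLTK.

```python
# Hypothetical stand-in for WordNet: each term maps to its first hypernym.
parent_of = {
    'dog': 'animal',
    'animal': 'entity',
    'plan': 'idea',
    'idea': 'abstraction',
    'abstraction': 'entity',
}


def reaches(term, target):
    """Follow the parent chain from term; return True if target is encountered."""
    seen = set()  # per-call guard against cycles, not shared between terms
    current = parent_of.get(term)
    while current is not None and current not in seen:
        if current == target:
            return True
        seen.add(current)
        current = parent_of.get(current)
    return False


terms = ['dog', 'plan', 'idea', 'plan']  # note the repeated 'plan'
kept = [t for t in terms if not reaches(t, 'abstraction')]
print(kept)
```

Because the guard set is local to each call, a repeated term is evaluated again instead of being skipped.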
I have a list of dictionaries called lod. All dictionaries have the same keys but different values. I am trying to update one specific value in the list of values for the same key in all the dictionaries.
I am attempting to do it with the following for loop:
for i in range(len(lod)):
    a = lod[i][key][:]
    a[p] = a[p] + lov[i]
    lod[i][key] = a
What's happening is that each dictionary is getting updated len(lod) times, so lod[0][key][p] is supposed to have lov[0] added to it but instead it is getting lov[0]+lov[1]+... added to it.
What am I doing wrong?
Here is how I declared the list of dicts:
lod = [{} for _ in range(len(dataul))]
for j in range(len(dataul)):
    for i in datakl:
        rrdict[str.split(i,',')[0]]=list(str.split(i,',')[1:len(str.split(i,','))])
    lod[j]=rrdict
The problem is in how you created the list of dictionaries. You probably did something like this:
list_of_dicts = [{}] * 20
That's actually the same dict 20 times. Try doing something like this:
list_of_dicts = [{} for _ in range(20)]
Without seeing how you actually created it, this is only an example solution to an example problem.
To know for sure, print this:
[id(x) for x in list_of_dicts]
If you created it with the * 20 approach, the id is the same for every dict. With the list comprehension, each id is unique.
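A short sketch of the difference, using three dictionaries instead of twenty; the variable names are made up for illustration:

```python
shared = [{}] * 3                  # three references to one and the same dict
separate = [{} for _ in range(3)]  # three independent dicts

shared[0]['k'] = 1    # visible through every slot of shared
separate[0]['k'] = 1  # visible only in the first dict

distinct_shared = len({id(d) for d in shared})
distinct_separate = len({id(d) for d in separate})
print(shared, distinct_shared)
print(separate, distinct_separate)
```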
This is where the trouble starts: lod[j] = rrdict. lod itself is created properly with different dictionaries. Unfortunately, afterwards every reference to the original dictionaries in the list gets overwritten with a reference to rrdict. So in the end, the list contains only references to one single dictionary. Here is a more Pythonic and readable way to solve your problem:
lod = [{} for _ in range(len(dataul))]
for rrdict in lod:
    for line in datakl:
        splt = line.split(',')
        rrdict[splt[0]] = splt[1:]
You created the list of dictionaries correctly, as per the other answer.
However, when you fill in the individual dictionaries, you overwrite every entry of the list with the same object.
Removing noise from your code snippet:
lod = [{} for _ in range(whatever)]
for j in range(whatever):
    # rrdict = lod[j]  # Uncomment this as a possible fix.
    for i in range(whatever):
        rrdict[somekey] = somevalue
    lod[j] = rrdict
Assignment on the last line throws away the empty dict that was in lod[j] and inserts a reference to the object represented by rrdict.
I'm not sure what your code does, but see the commented-out line; it might be the fix you are looking for.
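Putting the pieces together, here is a minimal end-to-end sketch that builds one fresh dictionary per entry and then applies the update loop from the question. The sample lines in datakl, the increments in lov, the helper name build_row, and the conversion of values to int are all assumptions made for illustration.

```python
datakl = ['a,1,2', 'b,3,4']  # hypothetical "key,value,value" lines
lov = [10, 20]               # hypothetical per-dictionary increments


def build_row(lines):
    """Build a brand-new dict from 'key,v1,v2,...' lines (values converted to int)."""
    row = {}
    for line in lines:
        key, *values = line.split(',')
        row[key] = [int(v) for v in values]
    return row


# One independent dict (with independent value lists) per entry.
lod = [build_row(datakl) for _ in lov]

key, p = 'a', 0
for i, row in enumerate(lod):
    row[key][p] += lov[i]  # touches only this row's list

print(lod)
```

Since every row owns its own lists, each increment lands in exactly one dictionary.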