I have created a function which randomly generates a list of the letters "a", "b", "c", and "d". I would like to create a new list which is the same as the first list but with any letters/items which are the same as the previous letter/item removed. Where I am having problems is referring to the previous letter in the list.
For example, if:
letterlist = ['a','a','a','b','b','a','b']
then the output should be:
nodupeletterlist = ['a','b','a','b']
The problem is that nodupeletterlist is the same as letterlist - meaning it's not removing items which are the same as the last - because I am getting the function to refer to the previous item in letterlist wrong. I have tried using index and enumerate, but I am obviously using them wrong because I'm not getting the correct results. Below is my current attempt.
import random

def rdmlist(letterlist, nodupeletterlist):
    for item in range(20):
        rng = random.random()
        if rng < 0.25:
            letterlist.append("a")
        elif 0.25 <= rng and rng < 0.5:
            letterlist.append("b")
        elif 0.5 <= rng and rng < 0.75:
            letterlist.append("c")
        else:
            letterlist.append("d")
    for letter in letterlist:
        if letter != letterlist[letterlist.index(letter)-1]:
            nodupeletterlist.append(letter)
        else:
            pass
    return

letterlist1 = []
nodupeletterlist1 = []
rdmlist(letterlist1, nodupeletterlist1)
EDIT:
This is what I ended up using. I used this solution simply because I understand how it works. The answers below may provide more succinct or pythonic solutions.
for index, letter in enumerate(letterlist, start=0):
    if 0 == index:
        nodupeletterlist.append(letter)
    else:
        pass
for index, letter in enumerate(letterlist[1:], start = 1):
    if letter != letterlist[index-1]:
        nodupeletterlist.append(letter)
    else:
        pass
padded = [None] + letterlist  # padded[i] is the item just before letterlist[i]
for i, letter in enumerate(letterlist):
    if letter != padded[i]:
        nodupeletterlist.append(letter)
You can use itertools.groupby:
import itertools
nodupeletterlist = [k for k, _ in itertools.groupby(letterlist)]
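As a quick illustration of why this works, here is a small sketch using the example list from the question. groupby yields one (key, group) pair per run of equal consecutive items, so keeping only the keys collapses each run to a single item. The name runs is only there for illustration.

```python
import itertools

letterlist = ['a', 'a', 'a', 'b', 'b', 'a', 'b']

# One (key, run length) pair per run of equal neighbours.
runs = [(key, len(list(group))) for key, group in itertools.groupby(letterlist)]
print(runs)  # [('a', 3), ('b', 2), ('a', 1), ('b', 1)]

# Keeping only the keys drops the consecutive repeats.
nodupeletterlist = [key for key, _ in itertools.groupby(letterlist)]
print(nodupeletterlist)  # ['a', 'b', 'a', 'b']
```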
Solution without using itertools, as requested in the comments:
def nodupe(letters):
    if not letters:
        return []
    r = [letters[0]]
    for ch in letters[1:]:
        if ch != r[-1]:
            r.append(ch)
    return r

nodupeletterlist = nodupe(letterlist)
A fixed version of the proposed "working solution":
def nodupe(letters):
    if not letters:
        return []
    r = [letters[0]]
    r += [l for i, l in enumerate(letters[1:]) if l != letters[i]]
    return r

nodupeletterlist = nodupe(letterlist)
You can also simplify your random generator a bit, by using random.choices:
import random
chars = 'abcd'
letterlist = random.choices(chars, k=20)
or by using random.randint:
import random
start, end = ord('a'), ord('d')
letterlist = [chr(random.randint(start, end)) for _ in range(20)]
Here's what I came up with. Using random.choices() would be better than what I have below, but it's the same idea, and it doesn't involve itertools.
>>> li_1 = [random.choice("abcdefg") for i in range(20)]
>>> li_1
['c', 'e', 'e', 'g', 'b', 'd', 'b', 'g', 'd', 'c', 'e', 'g', 'e', 'c', 'd',
'e', 'e', 'f', 'd', 'd']
>>>
>>> li_2 = [li_1[i] for i in range(len(li_1))
... if not i or i and li_1[i - 1] != li_1[i]]
>>> li_2
['c', 'e', 'g', 'b', 'd', 'b', 'g', 'd', 'c', 'e', 'g', 'e', 'c',
'd', 'e', 'f', 'd']
The problem with the way that you are using letterlist.index(letter)-1 is that list.index(arg) returns the index of the first occurrence of arg in the list, in this case the letter. This means that if you have list = ["a", "b", "a"] and you run list.index("a"), it will always return 0, even for the second "a".
A way to do what you intend to (removing consecutive repetitions of letters) would be:
nodupeletterlist.append(letterlist[0])
for idx in range(1, len(letterlist)):
    if letterlist[idx] != letterlist[idx-1]:
        nodupeletterlist.append(letterlist[idx])
Do This:
L1 = ['a','a','a','b','b','c','d']
L2 = []
L2.append(L1[0])
for i in range(1,len(L1)):
    if L1[i] != L1[i-1]:
        L2.append(L1[i])
This copies the first item, then appends each later item only if it differs from the item before it. Note that set() is not suitable here: it keeps only unique values and does not preserve order, so non-adjacent repeats would be lost as well.
I hope this helps...
I want to know how to return values without breaking a loop in Python.
Here is an example:
def myfunction():
    list = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
    print(list)
    total = 0
    for i in list:
        if total < 6:
            return i #get first element then it breaks
            total += 1
        else:
            break

myfunction()
return only gets the first element and then leaves the loop. I don't want that; I want to return multiple elements until the end of the loop.
How can I resolve this? Is there any solution?
You can create a generator for that, so you could yield values from your generator (your function would become a generator after using the yield statement).
See the topics below to get a better idea of how to work with it:
Generators
What does the "yield" keyword do?
yield and Generators explained
An example of using a generator:
def simple_generator(n):
    i = 0
    while i < n:
        yield i
        i += 1

my_simple_gen = simple_generator(3) # Create a generator
first_return = next(my_simple_gen) # 0
second_return = next(my_simple_gen) # 1
Also you could create a list before the loop starts and append items to that list, then return that list, so this list could be treated as list of results "returned" inside the loop.
Example of using list to return values:
def get_results_list(n):
    results = []
    i = 0
    while i < n:
        results.append(i)
        i += 1
    return results

first_return, second_return, third_return = get_results_list(3)
NOTE: With the list approach, unpacking into separate names only works if you know how many values the function puts in the results list; otherwise you get a "too many values to unpack" (or "not enough values to unpack") error.
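If the number of results is not known in advance, one way around the unpacking problem is to keep the list whole or, in Python 3, to use a starred name. The sketch below reuses get_results_list from above; the names results, first and rest are only illustrative.

```python
def get_results_list(n):
    results = []
    i = 0
    while i < n:
        results.append(i)
        i += 1
    return results

# Keep the list whole and index or loop over it as needed.
results = get_results_list(5)
print(results)      # [0, 1, 2, 3, 4]

# Python 3 only: a starred name absorbs however many values remain.
first, *rest = get_results_list(5)
print(first, rest)  # 0 [1, 2, 3, 4]
```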
Using a generator is one possible way:
def myfunction():
    l = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
    total = 0
    for i in l:
        if total < 6:
            yield i #yields an item and saves function state
            total += 1
        else:
            break

g = myfunction()
Now you can access all elements returned with yield i by calling next() on it:
>>> v = next(g)
>>> print(v)
a
>>> v = next(g)
>>> print(v)
b
Or, in a for loop by doing:
>>> for returned_val in myfunction():
...     print(returned_val)
a
b
c
d
e
f
What you want is most easily expressed with list slicing:
>>> l = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']
>>> l[:6]
# ['a', 'b', 'c', 'd', 'e', 'f']
Alternatively create another list which you will return at the end of the function.
def myfunction():
    l = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']
    ret = []
    total = 0
    for i in l:
        if total < 6:
            total += 1
            ret.append(i)
    return ret

myfunction()
# ['a', 'b', 'c', 'd', 'e', 'f']
A yield statement, which turns the function into a generator, is what you want.
What does the "yield" keyword do in Python?
Then call next() on the generator to get each value produced by the loop.
gen = my_func()
var = next(gen)  # in Python 2 only, gen.next() also works
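Here is a minimal sketch of that pattern, assuming Python 3. The body of my_func is made up for illustration. It also shows the optional default argument of next(), which avoids a StopIteration error once the generator is exhausted.

```python
def my_func():
    # Hypothetical generator body, used only for this illustration.
    for letter in ['a', 'b', 'c']:
        yield letter

gen = my_func()           # create the generator once and keep a reference to it
print(next(gen))          # a
print(next(gen))          # b
print(next(gen))          # c
print(next(gen, 'done'))  # done (the default is returned when nothing is left)

# Collect everything at once when the values are not needed one by one.
print(list(my_func()))    # ['a', 'b', 'c']
```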
I was working on 10.15 in the book Think Python and wrote the following code:
def turn_str_to_list(string):
    res = []
    for letter in string:
        res.append(letter)
    return res

def sort_and_unique (t):
    t.sort()
    for i in range (0, len(t)-2, 1):
        for j in range (i+1, len(t)-1, 1):
            if t[i]==t[j]:
                del t[j]
    return t

line=raw_input('>>>')
t=turn_str_to_list(line)
print t
print sort_and_unique(t)
I used a nested 'for' loop to eliminate any duplicated elements in the sorted list.
However, when I ran it, I kept getting wrong output.
If I input 'committee', the output is ['c', 'e', 'i', 'm', 'o', 't', 't'], which is wrong because it still contains a double 't'.
I tried different inputs: sometimes the program misses duplicated letters in the middle of the list, and it always misses the ones at the end.
What am I missing? Thanks.
Your program isn't removing all the duplicate letters because the use of del t[j] inside the nested for-loops causes it to skip letters.
I added some prints to help illustrate this:
def sort_and_unique (t):
    t.sort()
    for i in range (0, len(t)-2, 1):
        print "i: %d" % i
        print t
        for j in range (i+1, len(t)-1, 1):
            print "\t%d %s len(t):%d" % (j, t[j], len(t))
            if t[i]==t[j]:
                print "\tdeleting %c" % t[j]
                del t[j]
    return t
Output:
>>>committee
['c', 'o', 'm', 'm', 'i', 't', 't', 'e', 'e']
i: 0
['c', 'e', 'e', 'i', 'm', 'm', 'o', 't', 't']
    1 e len(t):9
    2 e len(t):9
    3 i len(t):9
    4 m len(t):9
    5 m len(t):9
    6 o len(t):9
    7 t len(t):9
i: 1
['c', 'e', 'e', 'i', 'm', 'm', 'o', 't', 't']
    2 e len(t):9
    deleting e
    3 m len(t):8
    4 m len(t):8
    5 o len(t):8
    6 t len(t):8
    7 t len(t):8
i: 2
['c', 'e', 'i', 'm', 'm', 'o', 't', 't']
    3 m len(t):8
    4 m len(t):8
    5 o len(t):8
    6 t len(t):8
i: 3
['c', 'e', 'i', 'm', 'm', 'o', 't', 't']
    4 m len(t):8
    deleting m
    5 t len(t):7
    6 t len(t):7
i: 4
['c', 'e', 'i', 'm', 'o', 't', 't']
    5 t len(t):7
i: 5
['c', 'e', 'i', 'm', 'o', 't', 't']
i: 6
['c', 'e', 'i', 'm', 'o', 't', 't']
['c', 'e', 'i', 'm', 'o', 't', 't']
Whenever del t[j] is called, the list becomes one element shorter, but the inner for-loop keeps advancing j.
For example:
i=1, j=2, t = ['c', 'e', 'e', 'i', 'm', 'm', 'o', 't', 't']
It sees that t[1] == t[2] (both 'e'), so it removes t[2].
Now t = ['c', 'e', 'i', 'm', 'm', 'o', 't', 't']
However, the code continues with i=1, j=3, which compares 'e' to 'm' and skips over 'i'.
Lastly, it does not catch the last two 't's because by the time i=5, len(t) is 7, so the inner for-loop runs over range(6,6,1), which is empty, and its body is never executed.
In python you could make use of the inbuilt data structures and library functions like set() & list()
Your turn_str_to_list() can be done with list(). Maybe you know this but wanted to do it on your own.
Using the list() and set() APIs:
line=raw_input('>>>')
print list(set(line))
Your sort_and_unique() has O(n^2) complexity. One way to make it cleaner:
def sort_and_unique2(t):
    t.sort()
    res = []
    for i in t:
        if i not in res:
            res.append(i)
    return res
This is still O(n^2), since the lookup (i not in res) takes linear time, but the code looks a bit cleaner. Deleting from a list is O(n), whereas append is O(1), so it is cheaper to append to a new list instead. See this page for the complexities of the list API: https://wiki.python.org/moin/TimeComplexity
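Because the list is sorted, equal items end up next to each other, so the linear lookup can be avoided by comparing each item only with the last one kept. The sketch below is one possible variant, not part of the original answer; the name sort_and_unique3 is made up. After the O(n log n) sort, the loop itself is O(n).

```python
def sort_and_unique3(t):
    res = []
    for item in sorted(t):
        # In sorted order duplicates are adjacent, so comparing with the
        # last kept item is enough; no scan of res is needed.
        if not res or item != res[-1]:
            res.append(item)
    return res

print(sort_and_unique3(list("committee")))
# ['c', 'e', 'i', 'm', 'o', 't']
```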
You can try the following code snippet
s = "committee"
res = sorted((set(list(s))))
Solution explained:
>>> word = "committee"
Turn string to list of characters:
>>> clst = list(word)
>>> clst
['c', 'o', 'm', 'm', 'i', 't', 't', 'e', 'e']
Use set to get only unique items:
>>> unq_clst = set(clst)
>>> unq_clst
{'c', 'e', 'i', 'm', 'o', 't'}
It turns out (thanks Blckknght) that the list step is not necessary, and we can do it this way:
>>> unq_clst = set(word)
>>> unq_clst
{'c', 'e', 'i', 'm', 'o', 't'}
Both set and list take an iterable as their parameter, and iterating over a string yields one character at a time.
Sort it:
>>> sorted(unq_clst)
['c', 'e', 'i', 'm', 'o', 't']
One line version:
>>> sorted(set("COMMITTEE"))
['C', 'E', 'I', 'M', 'O', 'T']
Here you go:
In [1]: word = 'committee'
In [3]: word_ = set(word)
In [4]: word_
Out[4]: {'c', 'e', 'i', 'm', 'o', 't'}
The standard way to get the unique elements in Python is to use a set. The set constructor accepts any iterable. A string is a sequence of characters (Unicode code points), so it qualifies.
If you have further problems, do leave a comment.
So you want an explanation of what is wrong in your code. Here it is:
Before we dive into coding, make test case(s)
It makes our coding faster if we have test cases at hand from the very beginning.
For testing I will make a small utility function:
def textinout(text):
    return "".join(sort_and_unique(list(text)))
This allows quick test like:
>>> textinout("committee")
'ceimot'
and another helper function for readable error traces:
def checkit(textin, expected):
    msg = "For input '{textin}' we expect '{expected}', got '{result}'"
    result = textinout(textin)
    assert result == expected, msg.format(textin=textin, expected=expected, result=result)
And make the test case function:
def testit():
    checkit("abcd", 'abcd')
    checkit("aabbccdd", 'abcd')
    checkit("a", 'a')
    checkit("ddccbbaa", 'abcd')
    checkit("ddcbaa", 'abcd')
    checkit("committee", 'ceimot')
Let us run the first test with the existing function:
def sort_and_unique (t):
    t.sort()
    for i in range (0, len(t)-2, 1):
        for j in range (i+1, len(t)-1, 1):
            if t[i]==t[j]:
                del t[j]
    return t
Now we can test it:
testit()
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-11-291a15d81032> in <module>()
----> 1 testit()
<ipython-input-4-d8ad9abb3338> in testit()
1 def testit():
2 checkit("abcd", 'abcd')
----> 3 checkit("aabbccdd", 'abcd')
4 checkit("a", 'a')
5 checkit("ddccbbaa", 'abcd')
<ipython-input-10-620ac3b14f51> in checkit(textin, expected)
2 msg = "For input '{textin}' we expect '{expected}', got '{result}'"
3 result = textinout(textin)
----> 4 assert result == expected, msg.format(textin=textin, expected=expected, result=result)
AssertionError: For input 'aabbccdd' we expect 'abcd', got 'abcdd'
Reading the last line of the error trace, we can see what is wrong.
General comments to your code
Accessing list members via index
In most cases this is not efficient and it makes the code hard to read.
Instead of:
lst = ["a", "b", "c"]
for i in range(len(lst)):
    itm = lst[i]
    # do something with the itm
You should use:
lst = ["a", "b", "c"]
for itm in lst:
    # do something with the itm
    print itm
If you need to access a subset of a list, use slicing
Instead of:
for i in range (0, len(lst)-2, 1):
    itm = lst[i]
Use:
for itm in lst[:-2]:
    # do something with the itm
    print itm
If you really need to know the position of the processed item for inner loops, use enumerate:
Instead of:
lst = ["a", "b", "c", "d", "e"]
for i in range(0, len(lst)):
    for j in range (i+1, len(lst)-1, 1):
        itm_i = lst[i]
        itm_j = lst[j]
        # do something
Use enumerate, which turns each list item into a tuple (index, item):
lst = ["a", "b", "c", "d", "e"]
for i, itm_i in enumerate(lst):
    for itm_j in lst[i+1:-1]:
        print itm_i, itm_j
        # do something
Manipulating a list which is processed
You are looping over a list and suddenly delete an item from it. Modifying a list during iteration is generally best avoided. If you have to do it, think twice and take care, for example by iterating backward so that you do not modify the part that is about to be processed in a later iteration.
As an alternative to deleting an item from the iterated list, you can note your findings (such as duplicated items) in another list and use it after you are out of the loop.
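As a minimal sketch of the "iterate backward" idea mentioned above (the function name unique_sorted_in_place is made up for illustration), deleting while walking from the end toward the start is safe because a deletion only shifts the elements behind the current position, which have already been handled:

```python
def unique_sorted_in_place(items):
    items.sort()
    # Walk from the last index down to 1; deleting items[i] only shifts
    # elements at positions greater than i, which were already checked.
    for i in range(len(items) - 1, 0, -1):
        if items[i] == items[i - 1]:
            del items[i]
    return items

print(unique_sorted_in_place(list("committee")))
# ['c', 'e', 'i', 'm', 'o', 't']
```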
How your code could be rewritten
def sort_and_unique (lst):
    lst.sort()
    to_remove = []
    for i, itm_i in enumerate(lst[:-2]):
        for j, itm_j in enumerate(lst[i+1: -1]):
            if itm_i == itm_j:
                to_remove.append(itm_j)
    # now we are out of loop and can modify the lst
    # note, we loop over one list and modify another, this is safe
    for itm in to_remove:
        lst.remove(itm)
    return lst
Reading the code, the problem becomes apparent: you never touch the last item in the sorted list. That is why "t" is not removed, as it is alphabetically the last item after sorting.
So your code could be corrected this way:
def sort_and_unique (lst):
    lst.sort()
    to_remove = []
    for i, itm_i in enumerate(lst[:-1]):
        for j, itm_j in enumerate(lst[i+1:]):
            if itm_i == itm_j:
                to_remove.append(itm_j)
    for itm in to_remove:
        lst.remove(itm)
    return lst
From now on, the code passes the test cases above, as you can confirm by calling testit():
>>> testit()
Silent test output is what we were hoping for.
Having the test function makes further code modification easy, as it is quick to check whether things still work as expected.
Anyway, the code can be shortened by getting the tuples (itm_i, itm_j) using zip:
def sort_and_unique (lst):
    lst.sort()
    to_remove = []
    for itm_i, itm_j in zip(lst[:-1], lst[1:]):
        if itm_i == itm_j:
            to_remove.append(itm_j)
    for itm in to_remove:
        lst.remove(itm)
    return lst
Test it:
>>> testit()
or using list comprehension:
def sort_and_unique (lst):
    lst.sort()
    to_remove = [itm_j for itm_i, itm_j in zip(lst[:-1], lst[1:]) if itm_i == itm_j]
    for itm in to_remove:
        lst.remove(itm)
    return lst
Test it:
>>> testit()
Since a list comprehension (using []) builds the complete list before any of its values are used, we can remove another line:
def sort_and_unique (lst):
    lst.sort()
    for itm in [itm_j for itm_i, itm_j in zip(lst[:-1], lst[1:]) if itm_i == itm_j]:
        lst.remove(itm)
    return lst
Test it:
>>> testit()
Note that, so far, the code still reflects your original algorithm; only two bugs were removed:
- the list being iterated over is no longer modified during iteration
- the last item of the list is now taken into account
I'm trying to write a function that generates the nCk combinations of a list in Python.
For example, for pairs from the list:
['a', 'b', 'c']
output should be:
[['a','b'],['a','c'],['b','c']]
however I'm getting no output
here's my attempt:
def chose(elements, k):
    output = []
    for i in range(len(elements)):
        if k == 1:
            output.append(elements[i])
        for c in chose(elements[i+1:], k-1):
            output.append(elements[i])
            output.append(c)
    return output

print chose(['a', 'b', 'c'],2)
Can you kindly tell me what is wrong with the function?
Use itertools.combinations if you want to find all combinations:
from itertools import combinations
a = ['a', 'b', 'c']
result = [list(i) for i in combinations(a,2)]
The documentation for combinations(), including a roughly equivalent pure-Python implementation, can be found in the official itertools documentation.
Update
This function should do what you want:
def chose(elements, k):
    output = []
    if k == 1:
        return [[i] for i in elements]
    else:
        for i in range(len(elements)):
            head = elements[i]
            tails = chose(elements[i+1:], k-1)
            output += [[head] + tail for tail in tails]
        return output

print chose(['a','b','c'], 2)
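As a sanity check, the recursive function can be compared against itertools.combinations. The sketch below is a Python 3 restatement of the same recursion (written with enumerate; the test list letters is made up) and asserts that both approaches agree for several values of k.

```python
from itertools import combinations

def chose(elements, k):
    # Base case: every single element is a combination of size 1.
    if k == 1:
        return [[e] for e in elements]
    output = []
    for i, head in enumerate(elements):
        # Only elements after position i may follow head, which avoids repeats.
        for tail in chose(elements[i+1:], k-1):
            output.append([head] + tail)
    return output

letters = ['a', 'b', 'c', 'd']
for k in (1, 2, 3):
    expected = [list(c) for c in combinations(letters, k)]
    assert chose(letters, k) == expected

print(chose(['a', 'b', 'c'], 2))  # [['a', 'b'], ['a', 'c'], ['b', 'c']]
```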
You can use a powerset without using any imports:
def power_set(items,k):
    n = len(items)
    for i in xrange(2**n):
        combo = []
        for j in xrange(n):
            if (i >> j) % 2 == 1:
                combo.append(items[j])
        if len(combo) == k:
            yield combo

print(list(power_set(['a', 'b', 'c'],2)))
[['a', 'b'], ['a', 'c'], ['b', 'c']]
I have a list:
l=['a','>>','b','>>','d','e','f','g','>>','i','>>','>>','j','k','l','>>','>>']
I need to extract all the neighbors of '>>' and split them into groups wherever they are separated by elements that are neither '>>' nor neighbors of '>>'.
For the example list the expected outcome would be:
[['a', 'b', 'd'], ['g', 'i', 'j'], ['l']]
I have tried quite a few things, but all the simple ones have failed one way or another. At the moment the only code that seems to work is this:
def func(L,N):
    outer=[]
    inner=[]
    for i,e in enumerate(L):
        if e!=N:
            try:
                if L[i-1]==N or L[i+1]==N:
                    inner.append(e)
                elif len(inner)>0:
                    outer.append(inner)
                    inner=[]
            except IndexError:
                pass
    if len(inner):
        outer.append(inner)
    return outer

func(l,'>>')
Out[196]:
[['a', 'b', 'd'], ['g', 'i', 'j'], ['l']]
Although it seems to work, I am wondering if there is a better, cleaner way to do it.
I would argue that the most pythonic and easy to read solution would be something like this:
import itertools

def neighbours(items, fill=None):
    """Yield the elements with their neighbours as (before, element, after).
    neighbours([1, 2, 3]) --> (None, 1, 2), (1, 2, 3), (2, 3, None)
    """
    before = itertools.chain([fill], items)
    after = itertools.chain(items, [fill]) #You could use itertools.zip_longest() later instead.
    next(after)
    return zip(before, items, after)

def split_not_neighbour(seq, mark):
    """Split the sequence on each item where the item is not the mark, or next
    to the mark.
    split_not_neighbour([1, 0, 2, 3, 4, 5, 0], 0) --> (1, 2), (5)
    """
    output = []
    for items in neighbours(seq):
        if mark in items:
            _, item, _ = items
            if item != mark:
                output.append(item)
        else:
            if output:
                yield output
                output = []
    if output:
        yield output
Which we can use like so:
>>> l = ['a', '>>', 'b', '>>', 'd', 'e', 'f', 'g', '>>', 'i', '>>', '>>',
... 'j', 'k', 'l', '>>', '>>']
>>> print(list(split_not_neighbour(l, ">>")))
[['a', 'b', 'd'], ['g', 'i', 'j'], ['l']]
Note the neat avoidance of any direct indexing.
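To see what the neighbours() helper hands to the splitting function, here is a small sketch (Python 3) that prints the (before, element, after) triples for a short, made-up list. Note that it assumes items is a re-iterable sequence such as a list, because it is iterated three times.

```python
import itertools

def neighbours(items, fill=None):
    # Pad one copy at the front and one at the back, then drop the first
    # element of the back-padded copy so it runs one step ahead.
    before = itertools.chain([fill], items)
    after = itertools.chain(items, [fill])
    next(after)
    return zip(before, items, after)

triples = list(neighbours(['a', '>>', 'b']))
print(triples)
# [(None, 'a', '>>'), ('a', '>>', 'b'), ('>>', 'b', None)]
```

Every triple that contains the mark belongs to an element that either is the mark or sits next to it, which is exactly the test the splitting function applies.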
Edit: A more elegant version.
def split_not_neighbour(seq, mark):
    """Split the sequence on each item where the item is not the mark, or next
    to the mark.
    split_not_neighbour([1, 0, 2, 3, 4, 5, 0], 0) --> (1, 2), (5)
    """
    neighboured = neighbours(seq)
    for near_mark, items in itertools.groupby(neighboured, key=lambda x: mark in x):
        if near_mark:  # skip the runs that are neither the mark nor next to it
            yield [item for _, item, _ in items if item != mark]
Here is one alternative:
import itertools

def func(L, N):
    def key(i_e):
        i, e = i_e
        # i < len(L) - 1 guards L[i+1] against an IndexError on the last item
        return e == N or (i > 0 and L[i-1] == N) or (i < len(L) - 1 and L[i+1] == N)
    outer = []
    for k, g in itertools.groupby(enumerate(L), key):
        if k:
            outer.append([e for i, e in g if e != N])
    return outer
Or an equivalent version with a nested list comprehension:
def func(L, N):
    def key(i_e):
        i, e = i_e
        return e == N or (i > 0 and L[i-1] == N) or (i < len(L) - 1 and L[i+1] == N)
    return [[e for i, e in g if e != N]
            for k, g in itertools.groupby(enumerate(L), key) if k]
You can simplify it like this
l = ['']+l+['']
stack = []
connected = last_connected = False
for i, item in enumerate(l):
    if item in ['','>>']: continue
    connected = l[i-1] == '>>' or l[i+1] == '>>'
    if connected:
        if not last_connected:
            stack.append([])
        stack[-1].append(item)
    last_connected = connected
My naive attempt:
things = (''.join(l)).split('>>')
output = []
inner = []
for i in things:
    if not i:
        continue
    i_len = len(i)
    if i_len == 1:
        inner.append(i)
    elif i_len > 1:
        inner.append(i[0])
        output.append(inner)
        inner = [i[-1]]
output.append(inner)
print output # [['a', 'b', 'd'], ['g', 'i', 'j'], ['l']]
Something like this:
l=['a','>>','b','>>','d','e','f','g','>>','i','>>','>>','j','k','l','>>','>>']
l= filter(None,"".join(l).split(">>"))
lis=[]
for i,x in enumerate(l):
    if len(x)==1:
        if len(lis)!=0:
            lis[-1].append(x[0])
        else:
            lis.append([])
            lis[-1].append(x[0])
    else:
        if len(lis)!=0:
            lis[-1].append(x[0])
            lis.append([])
            lis[-1].append(x[-1])
        else:
            lis.append([])
            lis[-1].append(x[0])
            lis.append([])
            lis[-1].append(x[-1])
print lis
output:
[['a', 'b', 'd'], ['g', 'i', 'j'], ['l']]
or:
l=['a','>>','b','>>','d','e','f','g','>>','i','>>','>>','j','k','l','>>','>>']
l= filter(None,"".join(l).split(">>"))
lis=[[] for _ in range(len([1 for x in l if len(x)>1])+1)]
for i,x in enumerate(l):
    if len(x)==1:
        for y in reversed(lis):
            if len(y)!=0:
                y.append(x)
                break
        else:
            lis[0].append(x)
    else:
        if not all(len(x)==0 for x in lis):
            for y in reversed(lis):
                if len(y)!=0:
                    y.append(x[0])
                    break
            for y in lis:
                if len(y)==0:
                    y.append(x[-1])
                    break
        else:
            lis[0].append(x[0])
            lis[1].append(x[-1])
print lis
output:
[['a', 'b', 'd'], ['g', 'i', 'j'], ['l']]
Another method, superimposing the original list on a shifted copy of itself:
import copy
lis_dup = copy.deepcopy(lis)
lis_dup.insert(0,'')
prev_in = 0
tmp=[]
res = []
for (x,y) in zip(lis,lis_dup):
    if '>>' in (x,y):
        if y!='>>' :
            if y not in tmp:
                tmp.append(y)
        elif x!='>>':
            if x not in tmp:
                print 'x is ' ,x
                tmp.append(x)
            else:
                if prev_in ==1:
                    res.append(tmp)
                    prev_in =0
                    tmp = []
        prev_in = 1
    else:
        if prev_in == 1:
            res.append(tmp)
            prev_in =0
            tmp = []
res.append(tmp)
print res