Flattening a list recursively [duplicate] - python

This question already has answers here:
Flatten an irregular (arbitrarily nested) list of lists
(51 answers)
Closed 6 months ago.
I am trying to flatten lists recursively in Python. I have this code:
def flatten(test_list):
    #define base case to exit recursive method
    if len(test_list) == 0:
        return []
    elif isinstance(test_list,list) and type(test_list[0]) in [int,str]:
        return [test_list[0]] + flatten(test_list[1:])
    elif isinstance(test_list,list) and isinstance(test_list[0],list):
        return test_list[0] + flatten(test_list[1:])
    else:
        return flatten(test_list[1:])
I am looking for a very basic method to recursively flatten a list of varying depth that does not use any for loops either.
My code does not pass these tests:
flatten([[[[]]], [], [[]], [[], []]]) # empty multidimensional list
flatten([[1], [2, 3], [4, [5, [6, [7, [8]]]]]]) # multiple nested list
What is wrong with the code, and how can I fix it?

This handles both of your cases, and I think will solve the general case, without any for loops:
def flatten(S):
    if S == []:
        return S
    if isinstance(S[0], list):
        return flatten(S[0]) + flatten(S[1:])
    return S[:1] + flatten(S[1:])
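As for what goes wrong in the question's version: its third branch returns test_list[0] as it is instead of flattening it, so anything nested more than one level deep survives, and its else branch silently drops any item that is not an int, str, or list. A minimal sketch that runs the recursive version against the two test cases from the question (the expected values in the comments are what a full flatten should give, not output quoted from the original post):

```python
def flatten(S):
    # Base case: nothing left to flatten.
    if S == []:
        return S
    # First element is itself a list: flatten it, then flatten the rest.
    if isinstance(S[0], list):
        return flatten(S[0]) + flatten(S[1:])
    # First element is an atom: keep it and flatten the rest.
    return S[:1] + flatten(S[1:])

empty_result = flatten([[[[]]], [], [[]], [[], []]])
nested_result = flatten([[1], [2, 3], [4, [5, [6, [7, [8]]]]]])
print(empty_result)   # []
print(nested_result)  # [1, 2, 3, 4, 5, 6, 7, 8]
```

Each call slices the list, so very deep or very long inputs can hit Python's recursion limit; for the sizes in the question that is not a concern.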

li=[[1,[[2]],[[[3]]]],[['4'],{5:5}]]
flatten=lambda l: sum(map(flatten,l),[]) if isinstance(l,list) else [l]
print(flatten(li))

Here's a possible solution without any loops or list comprehensions, just using recursion:
def flatten(test_list):
    if isinstance(test_list, list):
        if len(test_list) == 0:
            return []
        first, rest = test_list[0], test_list[1:]
        return flatten(first) + flatten(rest)
    else:
        return [test_list]

Well, if you want it a lisp way, let's have it.
atom = lambda x: not isinstance(x, list)
nil = lambda x: not x
car = lambda x: x[0]
cdr = lambda x: x[1:]
cons = lambda x, y: x + y
flatten = lambda x: [x] if atom(x) else x if nil(x) else cons(*map(flatten, [car(x), cdr(x)]))

Related

Avoid computing the same expression twice in list comprehension [duplicate]

This question already has answers here:
Python list comprehension - want to avoid repeated evaluation
(12 answers)
Closed 3 years ago.
I am using a function in a list comprehension and an if function:
new_list = [f(x) for x in old_list if f(x) !=0]
It annoys me that the expression f(x) is computed twice in each loop.
Is there a way to do it in a cleaner way? Something along the lines of storing the value or including the if statement at the beginning of the list comprehension.
Compute the results beforehand and iterate over them
new_list = [x for x in map(f, old_list) if x !=0]
Moreover, since map computes the result per element when the element is accessed this is just one loop.
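A small sketch to check that claim, assuming Python 3 (where map is lazy). The call_log list and the body of f are made up for illustration; they only exist to count how often f runs:

```python
call_log = []

def f(x):
    # Hypothetical stand-in for an expensive function; records each call.
    call_log.append(x)
    return x * 2

old_list = [0, 1, 2, 3]

# map() yields f(x) one element at a time while the comprehension iterates,
# so f runs exactly once per element of old_list.
new_list = [y for y in map(f, old_list) if y != 0]

print(new_list)       # [2, 4, 6]
print(len(call_log))  # 4
```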
you could use a generator expression (in order to avoid creating an unnecessary list) inside your list comprehension:
new_list = [fx for fx in (f(x) for x in old_list) if fx != 0]
starting from python 3.8 you will be able to do this:
new_list = [fx for x in old_list if (fx := f(x)) != 0]
in Python 3.8 we'll have the "walrus operator" and be able to do just that!
[y for x in old_list if (y := f(x)) != 0]
You could use a filter to remove results:
def f(x):
    return x * 2
old_list = [0, 1, 2, 3]
new_list = filter(lambda x: x != 0, [f(x) for x in old_list])
for x in new_list:
    print(x)
Alternatively you could memoize the function so as to prevent ever having to compute the same result twice:
def f(x):
    return x * 2
def memoize(f):
    memo = {}
    def helper(x):
        if x not in memo:
            print("Computing result for %s" % x)
            memo[x] = f(x)
        return memo[x]
    return helper
memF = memoize(f)
old_list = [0, 1, 2, 3]
new_list = [memF(x) for x in old_list if memF(x) != 0]
for x in new_list:
    print(x)
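If you would rather not write the cache by hand, the standard library's functools.lru_cache does the same job. This is a sketch of that alternative, not the code from the answer above, and it assumes the arguments to f are hashable:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def f(x):
    # Stand-in for an expensive computation; results are cached per argument.
    return x * 2

old_list = [0, 1, 2, 3]
new_list = [f(x) for x in old_list if f(x) != 0]

print(new_list)               # [2, 4, 6]
print(f.cache_info().misses)  # 4  (f's body ran once per distinct argument)
print(f.cache_info().hits)    # 3  (the repeated calls were served from the cache)
```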

Python syntax help: 'return [x] if [x != e] for [e] in [array]'

I am new to Python and wanted to explore its pseudocode-like syntax to solve the following problem:
# x is 0, 1 or 2
arr = [0, 1, 2]
I want to return any element in arr that is not equal to x
My intuition:
return x if x != element for element in arr
I have tried to complete the conditional with an else clause. Still, the syntax is invalid
What is my mistake? What is a correct one-line solution (if any exists)?
Thanks!
return [element for element in arr if element != x]
https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions
If you want to return one element from your list matching a certain condition (in this case the condition is !=x), you can use next.
return next(item for item in arr if item!=x)
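One thing to watch with next: if no element matches, it raises StopIteration. Passing a second argument gives it a default instead. A minimal sketch (the function name first_not_equal and the default of None are my own choices, not from the question):

```python
def first_not_equal(arr, x, default=None):
    # Hypothetical helper: first element of arr that differs from x,
    # or `default` when nothing qualifies (including an empty arr).
    return next((item for item in arr if item != x), default)

arr = [0, 1, 2]
print(first_not_equal(arr, 0))     # 1
print(first_not_equal(arr, 1))     # 0
print(first_not_equal([0, 0], 0))  # None
```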
You're pretty close. Just put the if last. This will return a generator:
def f(ls, e):
    return (x for x in ls if x != e)
You can also return a list instead:
def f(ls, e):
    return [x for x in ls if x != e]
You can use a conditional (ternary) expression, as shown below:
def f(x,arr):
    return x if x in arr else None
Or you can use a list comprehension if you want to return all elements of arr that are contained in x, as shown below:
def f(x,arr):
    return [y for y in arr if y in x] #here x is a list
I would use a list comprehension:
return [a for a in arr if a != x]

Function that takes a list of strings and returns another specific list

I need to create a function that takes a list of words and then I want to check all the strings in that list and return another list of strings where the string first and last character are the same.
For example, given input_list = ['a','aa','aba','abb'] output should be ['aa','aba'].
Try the following:
def myfunc(lst):
    return [item for item in lst if len(item) > 1 and item[0] == item[-1]]
>>> myfunc(['a','aa','aba','abb'])
['aa', 'aba']
Just check the length is > 1, and see if the first char x[0] is equal to the last char x[-1]:
print(list(filter(lambda x: len(x) > 1 and x[0] == x[-1], lst)))
['aa', 'aba']
Or if you want a function:
f = lambda l:list(filter(lambda x: len(x) > 1 and x[0] == x[-1], l))
print(f(lst))
The typical way to approach this is to filter the existing list rather than build a different one, and to define a function (either regular or lambda) that expresses what to filter on. That way your code is clear and easy to test and maintain:
filteredList = filter(lambda x: len(x) > 1 and x[0] == x[-1], myList)
#or:
def starts_and_ends_with_same_char(subject):
    return len(subject) > 1 and subject[0] == subject[-1]
filteredList = filter(starts_and_ends_with_same_char, myList)
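Keep in mind that in Python 3 filter returns a lazy iterator rather than a list, so wrap it in list() if you need to index it, print it, or iterate over it more than once. A short sketch using the sample input from the question (the snake_case variable names are mine):

```python
def starts_and_ends_with_same_char(subject):
    # Require at least two characters, as in the answers above.
    return len(subject) > 1 and subject[0] == subject[-1]

my_list = ['a', 'aa', 'aba', 'abb']

# filter() is lazy in Python 3; list() materializes the result.
filtered_list = list(filter(starts_and_ends_with_same_char, my_list))
print(filtered_list)  # ['aa', 'aba']
```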
Golfing a little:
>>> [s for s in lst if s[1:] and s[0] == s[-1]]
['aa', 'aba']

How to add the list elements in python [duplicate]

This question already has answers here:
sum of nested list in Python
(14 answers)
Closed 8 years ago.
for example I have a list with numbers like this:
a = [10,[20,30],40]
or
b = [[10,20],30]
Now I have to add all the elements in the above lists.
so that if add the first list then I should get the answer as follows: 10+20+30+40 = 100.
and for the second one b as follows: 10+20+30 = 60.
The solution is to be expressed as a function.
I have tried this one but it can be used for only adding if there is no nested list.
def sum(t):
    total = 0
    for x in t:
        total = total+x
    return total
Now can anyone help me solve this kind of problem in python programming.
Thank you in advance!!!!!
You can use reduce (in Python 3 it has to be imported from functools). Note that this flattens only one level of nesting:
from functools import reduce
x = reduce(lambda prev,el: prev+([x for x in el] if type(el) is list else [el]), x, [])
And use its result to feed your loop.
def sum(t):
    t = reduce(lambda prev,el: prev+([x for x in el] if type(el) is list else [el]), t, [])
    total = 0
    for x in t:
        total = total+x
    return total
You can recursively flatten into a single list:
def flatten(lst, out=None):
    if out is None:
        out = []
    for item in lst:
        if isinstance(item, list):
            flatten(item, out)
        else:
            out.append(item)
    return out
Now you can just use sum:
>>> sum(flatten([10, [20, 30], 40]))
100
You need to define a recursion to handle the nested lists:
rec = lambda x: sum(map(rec, x)) if isinstance(x, list) else x
rec, applied on a list, will return the sum (recursively), on a value, return the value.
result = rec(a)
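For reference, here is the same recursion written as a named function and run on both lists from the question. The name nested_total is made up for this sketch, and it assumes every non-list item is a number:

```python
def nested_total(x):
    # A list contributes the sum of its (recursively totalled) items;
    # anything else is treated as a number and returned unchanged.
    if isinstance(x, list):
        return sum(map(nested_total, x))
    return x

a = [10, [20, 30], 40]
b = [[10, 20], 30]
print(nested_total(a))  # 100
print(nested_total(b))  # 60
```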
Seems like the best approach would be to iterate over the top-level list and check each element's type (using isinstance(item, type)). If it's an integer, add it to the total; otherwise, if it's a list, iterate over that list.
Making your function recursive would make it most usable.
Edit: For anybody stumbling upon this question, here's an example.
def nested_sum(input_list):
    total = 0
    for element in input_list:
        if isinstance(element, int):
            total += element
        elif isinstance(element, list):
            total += nested_sum(element)
        else:
            raise TypeError
    return total
Usage:
my_list = [72, 5, [108, 99, [8, 5], 23], 44]
print(nested_sum(my_list))  # prints 364

Checking if all elements in a list are unique

What is the best way (best as in the conventional way) of checking whether all elements in a list are unique?
My current approach using a Counter is:
>>> from collections import Counter
>>> x = [1, 1, 1, 2, 3, 4, 5, 6, 2]
>>> counter = Counter(x)
>>> for values in counter.values():
...     if values > 1:
...         pass  # do something
Can I do better?
Not the most efficient, but straightforward and concise:
if len(x) > len(set(x)):
    pass # do something
Probably won't make much of a difference for short lists.
Here is a two-liner that will also do early exit:
>>> def allUnique(x):
...     seen = set()
...     return not any(i in seen or seen.add(i) for i in x)
...
>>> allUnique("ABCDEF")
True
>>> allUnique("ABACDEF")
False
If the elements of x aren't hashable, then you'll have to resort to using a list for seen:
>>> def allUnique(x):
...     seen = list()
...     return not any(i in seen or seen.append(i) for i in x)
...
>>> allUnique([list("ABC"), list("DEF")])
True
>>> allUnique([list("ABC"), list("DEF"), list("ABC")])
False
An early-exit solution could be
def unique_values(g):
    s = set()
    for x in g:
        if x in s:
            return False
        s.add(x)
    return True
However, for small inputs, or when exiting early is not the common case, I would expect len(x) != len(set(x)) to be the fastest method.
for speed:
import numpy as np
x = [1, 1, 1, 2, 3, 4, 5, 6, 2]
np.unique(x).size == len(x)
How about adding all the entries to a set and checking its length?
len(set(x)) == len(x)
Alternative to a set, you can use a dict.
len({}.fromkeys(x)) == len(x)
Another approach entirely, using sorted and groupby:
from itertools import groupby
is_unique = lambda seq: all(sum(1 for _ in x[1])==1 for x in groupby(sorted(seq)))
It requires a sort, but exits on the first repeated value.
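To make the groupby trick a bit more concrete, here is a sketch that first shows the groups it produces and then the check itself. group_sizes is a helper I made up for illustration; the elements need to be sortable:

```python
from itertools import groupby

def group_sizes(seq):
    # Hypothetical helper: (value, count) pairs for the sorted input.
    # After sorting, equal items are adjacent, so each value forms one group.
    return [(key, sum(1 for _ in group)) for key, group in groupby(sorted(seq))]

def is_unique(seq):
    # all() stops at the first group that holds more than one item.
    return all(sum(1 for _ in group) == 1 for _, group in groupby(sorted(seq)))

print(group_sizes([3, 1, 3, 2]))  # [(1, 1), (2, 1), (3, 2)]
print(is_unique([3, 1, 3, 2]))    # False
print(is_unique([3, 1, 2]))       # True
```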
Here is a recursive O(N^2) version for fun:
def is_unique(lst):
    if len(lst) > 1:
        return is_unique(lst[1:]) and (lst[0] not in lst[1:])
    return True
Here is a recursive early-exit function:
def distinct(L):
    # lists with fewer than two items cannot contain a duplicate
    if len(L) < 2:
        return True
    H = L[0]
    T = L[1:]
    if H in T:
        return False
    else:
        return distinct(T)
It's fast enough for me, without using weird (slow) conversions, while keeping a functional-style approach.
All the answers above are good, but I prefer the all_unique example from 30 seconds of python.
Use set() on the given list to remove duplicates, then compare its length with the length of the list.
def all_unique(lst):
    return len(lst) == len(set(lst))
It returns True if all the values in a flat list are unique, False otherwise.
x = [1, 2, 3, 4, 5, 6]
y = [1, 2, 2, 3, 4, 5]
all_unique(x) # True
all_unique(y) # False
I've compared the suggested solutions with perfplot and found that
len(lst) == len(set(lst))
is indeed the fastest solution. If there are early duplicates in the list, there are some constant-time solutions which are to be preferred.
Code to reproduce the plot:
import perfplot
import numpy as np
import pandas as pd
def len_set(lst):
    return len(lst) == len(set(lst))
def set_add(lst):
    seen = set()
    return not any(i in seen or seen.add(i) for i in lst)
def list_append(lst):
    seen = list()
    return not any(i in seen or seen.append(i) for i in lst)
def numpy_unique(lst):
    return np.unique(lst).size == len(lst)
def set_add_early_exit(lst):
    s = set()
    for item in lst:
        if item in s:
            return False
        s.add(item)
    return True
def pandas_is_unique(lst):
    return pd.Series(lst).is_unique
def sort_diff(lst):
    return not np.any(np.diff(np.sort(lst)) == 0)
b = perfplot.bench(
    setup=lambda n: list(np.arange(n)),
    title="All items unique",
    # setup=lambda n: [0] * n,
    # title="All items equal",
    kernels=[
        len_set,
        set_add,
        list_append,
        numpy_unique,
        set_add_early_exit,
        pandas_is_unique,
        sort_diff,
    ],
    n_range=[2**k for k in range(18)],
    xlabel="len(lst)",
)
b.save("out.png")
b.show()
How about this:
from collections import Counter
def is_unique(lst):
    if not lst:
        return True
    else:
        return Counter(lst).most_common(1)[0][1]==1
If and only if you have the data processing library pandas in your dependencies, there's an already implemented solution which gives the boolean you want :
import pandas as pd
pd.Series(lst).is_unique
You can use Yan's syntax (len(x) > len(set(x))), but instead of set(x), define a function:
def f5(seq, idfun=None):
    # order preserving
    if idfun is None:
        def idfun(x): return x
    seen = {}
    result = []
    for item in seq:
        marker = idfun(item)
        # in old Python versions:
        # if seen.has_key(marker)
        # but in new ones:
        if marker in seen: continue
        seen[marker] = 1
        result.append(item)
    return result
and do len(x) > len(f5(x)). This will be fast and is also order preserving.
Code there is taken from: http://www.peterbe.com/plog/uniqifiers-benchmark
Using a similar approach in a Pandas dataframe to test if the contents of a column contains unique values:
if tempDF['var1'].size == tempDF['var1'].unique().size:
    print("Unique")
else:
    print("Not unique")
For me, this is instantaneous on an int variable in a dataframe containing over a million rows.
This does not fully fit the question, but if you google the task I had, this question ranks first, so it may be of interest to others as an extension of the question. If you want to check, for each list element, whether it is unique or not, you can do the following:
import timeit
import numpy as np
def get_unique(mylist):
    # sort the list and keep the index
    sort = sorted((e,i) for i,e in enumerate(mylist))
    # check for each element if it is similar to the previous or next one
    isunique = [[sort[0][1],sort[0][0]!=sort[1][0]]] + \
               [[s[1], (s[0]!=sort[i-1][0])and(s[0]!=sort[i+1][0])]
                for [i,s] in enumerate (sort) if (i>0) and (i<len(sort)-1) ] +\
               [[sort[-1][1],sort[-1][0]!=sort[-2][0]]]
    # sort indices and booleans and return only the boolean
    return [a[1] for a in sorted(isunique)]
def get_unique_using_count(mylist):
    return [mylist.count(item)==1 for item in mylist]
mylist = list(np.random.randint(0,10,10))
%timeit for x in range(10): get_unique(mylist)
%timeit for x in range(10): get_unique_using_count(mylist)
mylist = list(np.random.randint(0,1000,1000))
%timeit for x in range(10): get_unique(mylist)
%timeit for x in range(10): get_unique_using_count(mylist)
For short lists, get_unique_using_count, as suggested in some answers, is fast. But once your list is longer than about 100 elements, the count approach takes quite long, so the approach shown in get_unique is much faster even though it looks more complicated.
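If the elements are hashable, a possible middle ground is to count everything once with collections.Counter and then look each count up, which avoids both the sort and the repeated list.count calls. This is a sketch of that idea, not something benchmarked here, and the function name is made up:

```python
from collections import Counter

def get_unique_using_counter(mylist):
    # One pass to count every element, then one lookup per element.
    # Assumes the elements are hashable.
    counts = Counter(mylist)
    return [counts[item] == 1 for item in mylist]

flags = get_unique_using_counter([3, 1, 3, 2])
print(flags)  # [False, True, False, True]
```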
If the list is sorted anyway, you can use:
not any(sorted_list[i] == sorted_list[i + 1] for i in range(len(sorted_list) - 1))
Pretty efficient, but not worth sorting for this purpose though.
For beginners:
def AllDifferent(s):
    for i in range(len(s)):
        for i2 in range(len(s)):
            if i != i2:
                if s[i] == s[i2]:
                    return False
    return True
