python array creation syntax [for in range] - python

I came across the following syntax to create a python array. It is strange to me.
Can anyone explain it to me? And how should I learn this kind of syntax?
[str(index) for index in range(100)]

First of all, this is not an array. This is a list. Python does have built-in arrays, but they are rarely used (google the array module, if you're interested). The structure you see is called list comprehension. This is the fastest way to do vectorized stuff in pure Python. Let's get through some examples.
Simple list comprehensions are written this way:
[item for item in iterable] - this will build a list containing all items of an iterable.
Actually, you can do something with each item using an expression or a function: [item**2 for item in iterable] - this will square each element, or [f(item) for item in iterable] - f is a function.
You can even add if and else statements like this [number for number in xrange(10) if not number % 2] - this will create a list of even numbers; ['even' if not number % 2 else 'odd' for number in range(10)] - this is how you use else statements.
You can nest list comprehensions [[character for character in word] for word in words] - this will create a list of lists. List comprehensions are similar to generator expressions, so you should google Python docs for additional information.
List comprehensions and generator expressions are among the most powerful and valuable Python features. Just start an interactive session and play for a while.
P.S.
There are other types of comprehensions that create sets and dictionaries. They use the same concept. Google them for additional information.

List comprehension itself is concept derived from mathematics' set comprehension, where to get new set, you specify parent set and the rule to filter out its elements.
In its simplest but full form list comprehension looks like this:
[f(i) for i in range(1000) if i % 2 == 0]
range(1000) - set of values you iterates through. It could be any iterable (list, tuple etc). range is just a function, which returns list of consecutive numbers, e.g. range(4) -> [0, 1, 2, 3]
i - variable will be assigned on each iteration.
if i%2 == 0 - rule condition to filter values. If condition is not True, resulting list will not contain this element.
f(i) - any python code or function on i, result of which will be in resulting list.
For understand concept of list comprehensions, try them out in python console, and look at output. Here is some of them:
[i for i in [1,2,3,4]]
[i for i in range(10)]
[i**2 for i in range(10)]
[max(4, i) for i in range(10)]
[(1 if i>5 else -1) for i in range(10)]
[i for i in range(10) if i % 2 == 0]

I recommend you to unwrap all comprehensions you face into for-loops to better understand their mechanics and syntax until you get used to them. For example, your comprehension can be unwrapped this way:
newlist = []
for index in range(100)
newlist.append(str(index))
I hope it's clear now.

Related

How do I sum the first two values in each tuple in a list of tuples in Python?

I searched through stackoverflow but couldn't find an answer or adapt similar codes, unfortunately.
The problem is that I have a list of tuples:
tupl = [(5,3,33), (2,5,2), (4,1,7)]
and I should use list comprehension to have this output:
[8,7,5]
The code that I wrote is a bit dump and is:
sum_tupl = []
tupl = [(5,3,33), (2,5,2), (4,1,7)]
sum_tupl = [tupl[0][0]+tupl[0][1] for tuple in tupl]
sum_tupl
but instead of doing what I want it to do, it returns
[8,8,8]
which is the sum of the first couple executed three times.
I tried using a variant:
tupl = [(5,3,33), (2,5,2), (4,1,7)]
sum_tupl = [sum(tupl[0],tupl[1]) for tuple in tupl]
sum_tupl
(which is missing something) but to no avail.
Note: When you're beginning python, it's much easier to use loops instead of list comprehensions because then you can debug your code more easily, either by using print() statements or by stepping through it using a debugger.
Now, on to the answer:
When you do for x in y, with a list y, the individual items go in x. So for your code for tuple in tupl, you shouldn't use tupl because that is the entire list. You need to use the individual item in the list, i.e. tuple.
Note that it's not a good idea to name a variable tuple because that's already a builtin type in python.
You need:
tupl = [(5,3,33), (2,5,2), (4,1,7)]
sum_tupl = [t[0]+t[1] for t in tupl]
Which gives the list
sum_tupl: [8, 7, 5]
If you have more elements you want to sum, it makes more sense to use the sum() function and a slice of t instead of writing out everything. For example, if you wanted to sum the first 5 elements, you'd write
sum_tupl = [sum(t[0:5]) for t in tupl]
instead of
sum_tupl = [t[0]+t[1]+t[2]+t[3]+t[4] for t in tupl]
Each loop iteration you're accessing the first element of your tupl list, instead of using the current element. This is what you want:
sum_tupl = [t[0] + t[1] for t in tupl]
I prefer slicing:
>>> [x + y for x, y, *_ in tupl]
[8, 7, 5]
>>>

Python List Indexing or Appending?

What is the best way to add values to a List in terms of processing time, memory usage and just generally what is the best programming option.
list = []
for i in anotherArray:
list.append(i)
or
list = range(len(anotherArray))
for i in list:
list[i] = anotherArray[i]
Considering that anotherArray is for example an array of Tuples. (This is just a simple example)
It really depends on your use case. There is no generic answer here as it depends on what you are trying to do.
In your example, it looks like you are just trying to create a copy of the array, in which case the best way to do this would be to use copy:
from copy import copy
list = copy(anotherArray)
If you are trying to transform the array into another array you should use list comprehension.
list = [i[0] for i in anotherArray] # get the first item from tuples in anotherArray
If you are trying to use both indexes and objects, you should use enumerate:
for i, j in enumerate(list)
which is much better than your second example.
You can also use generators, lambas, maps, filters, etc. The reason all of these possibilities exist is because they are all "better" for different reasons. The writters of python are pretty big on "one right way", so trust me, if there was one generic way which was always better, that is the only way that would exist in python.
Edit: Ran some results of performance for tuple swap and here are the results:
comprehension: 2.682028295999771
enumerate: 5.359116118001111
for in append: 4.177091988000029
for in indexes: 4.612594166001145
As you can tell, comprehension is usually the best bet. Using enumerate is expensive.
Here is the code for the above test:
from timeit import timeit
some_array = [(i, 'a', True) for i in range(0,100000)]
def use_comprehension():
return [(b, a, i) for i, a, b in some_array]
def use_enumerate():
lst = []
for j, k in enumerate(some_array):
i, a, b = k
lst.append((b, a, i))
return lst
def use_for_in_with_append():
lst = []
for i in some_array:
i, a, b = i
lst.append((b, a, i))
return lst
def use_for_in_with_indexes():
lst = [None] * len(some_array)
for j in range(len(some_array)):
i, a, b = some_array[j]
lst[j] = (b, a, i)
return lst
print('comprehension:', timeit(use_comprehension, number=200))
print('enumerate:', timeit(use_enumerate, number=200))
print('for in append:', timeit(use_for_in_with_append, number=200))
print('for in indexes:', timeit(use_for_in_with_indexes, number=200))
Edit2:
It was pointed out to me the the OP just wanted to know the difference between "indexing" and "appending". Really, those are used for two different use cases as well. Indexing is for replacing objects, whereas appending is for adding. However, in a case where the list starts empty, appending will always be better because the indexing has the overhead of creating the list initially. You can see from the results above that indexing is slightly slower, mostly because you have to create the first list.
Best way is list comprehension :
my_list=[i for i in anotherArray]
But based on your problem you can use a generator expression (is more efficient than list comprehension when you just want to loop over your items and you don't need to use some list methods like indexing or len or ... )
my_list=(i for i in anotherArray)
I would actually say the best is a combination of index loops and value loops with enumeration:
for i, j in enumerate(list): # i is the index, j is the value, can't go wrong

python list.iteritems replacement

I've got a list in which some items shall be moved into a separate list (by a comparator function). Those elements are pure dicts. The question is how should I iterate over such list.
When iterating the simplest way, for element in mylist, then I don't know the index of the element. There's no .iteritems() methods for lists, which could be useful here. So I've tried to use for index in range(len(mylist)):, which [1] seems over-complicated as for python and [2] does not satisfy me, since range(len()) is calculated once in the beginning and if I remove an element from the list during iteration, I'll get IndexError: list index out of range.
Finally, my question is - how should I iterate over a python list, to be able to remove elements from the list (using a comparator function and put them in another list)?
You can use enumerate function and make a temporary copy of the list:
for i, value in enumerate(old_list[:]):
# i == index
# value == dictionary
# you can safely remove from old_list because we are iterating over copy
Creating a new list really isn't much of a problem compared to removing items from the old one. Similarly, iterating twice is a very minor performance hit, probably swamped by other factors. Unless you have a very good reason to do otherwise, backed by profiling your code, I'd recommend iterating twice and building two new lists:
from itertools import ifilter, ifilterfalse
l1 = list(ifilter(condition, l))
l2 = list(ifilterfalse(condition, l))
You can slice-assign the contents of one of the new lists into the original if you want:
l[:] = l1
If you're absolutely sure you want a 1-pass solution, and you're absolutely sure you want to modify the original list in place instead of creating a copy, the following avoids quadratic performance hits from popping from the middle of a list:
j = 0
l2 = []
for i in range(len(l)):
if condition(l[i]):
l[j] = l[i]
j += 1
else:
l2.append(l[i])
del l[j:]
We move each element of the list directly to its final position without wasting time shifting elements that don't really need to be shifted. We could use for item in l if we wanted, and it'd probably be a bit faster, but when the algorithm involves modifying the thing we're iterating over, I prefer the explicit index.
I prefer not to touch the original list and do as #Martol1ni, but one way to do it in place and not be affected by the removal of elements would be to iterate backwards:
for i in reversed(range(len()):
# do the filtering...
That will affect only the indices of elements that you have tested/removed already
Try the filter command, and you can override the original list with it too if you don't need it.
def cmp(i): #Comparator function returning a boolean for a given item
...
# mylist is the initial list
mylist = filter(cmp, mylist)
mylist is now a generator of suitable items. You can use list(mylist) if you need to use it more than once.
Haven't tried this yet but.. i'll give it a quick shot:
new_list = [old.pop(i) for i, x in reversed(list(enumerate(old))) if comparator(x)]
You can do this, might be one line too much though.
new_list1 = [x for x in old_list if your_comparator(x)]
new_list2 = [x for x in old_list if x not in new_list1]

Python list of tuples to list of int

So, I have x=[(12,), (1,), (3,)] (list of tuples) and I want x=[12, 1, 3] (list of integers) in best way possible? Can you please help?
You didn't say what you mean by "best", but presumably you mean "most pythonic" or "most readable" or something like that.
The list comprehension given by F3AR3DLEGEND is probably the simplest. Anyone who knows how to read a list comprehension will immediately know what it means.
y = [i[0] for i in x]
However, often you don't really need a list, just something that can be iterated over once. If you've got a billion elements in x, building a billion-element y just to iterate over it one element at a time may be a bad idea. So, you can use a generator expression:
y = (i[0] for i in x)
If you prefer functional programming, you might prefer to use map. The downside of map is that you have to pass it a function, not just an expression, which means you either need to use a lambda function, or itemgetter:
y = map(operator.itemgetter(0), x)
In Python 3, this is equivalent to the generator expression; if you want a list, pass it to list. In Python 2, it returns a list; if you want an iterator, use itertools.imap instead of map.
If you want a more generic flattening solution, you can write one yourself, but it's always worth looking at itertools for generic solutions of this kind, and there is in fact a recipe called flatten that's used to "Flatten one level of nesting". So, copy and paste that into your code (or pip install more-itertools) and you can just do this:
y = flatten(x)
If you look at how flatten is implemented, and then at how chain.from_iterable is implemented, and then at how chain is implemented, you'll notice that you could write the same thing in terms of builtins. But why bother, when flatten is going to be more readable and obvious?
Finally, if you want to reduce the generic version to a nested list comprehension (or generator expression, of course):
y = [j for i in x for j in i]
However, nested list comprehensions are very easy to get wrong, both in writing and reading. (Note that F3AR3DLEGEND, the same person who gave the simplest answer first, also gave a nested comprehension and got it wrong. If he can't pull it off, are you sure you want to try?) For really simple cases, they're not too bad, but still, I think flatten is a lot easier to read.
y = [i[0] for i in x]
This only works for one element per tuple, though.
However, if you have multiple elements per tuple, you can use a slightly more complex list comprehension:
y = [i[j] for i in x for j in range(len(i))]
Reference: List Comprehensions
Just do this:
x = [i[0] for i in x]
Explanation:
>>> x=[(12,), (1,), (3,)]
>>> x
[(12,), (1,), (3,)]
>>> [i for i in x]
[(12,), (1,), (3,)]
>>> [i[0] for i in x]
[12, 1, 3]
This is the most efficient way:
x = [i for i, in x]
or, equivalently
x = [i for (i,) in x]
This is a bit slower:
x = [i[0] for i in x]
you can use map function....
map(lambda y: y[0], x)

How does `sum` flatten lists?

A multidimensional list like l=[[1,2],[3,4]] could be converted to a 1D one by doing sum(l,[]). How does this happen?
(This doesn't work directly for higher multidimensional lists, but it can be repeated to handle those cases. For example if A is a 3D-list, then sum(sum(A),[]),[]) will flatten A to a 1D list.)
If your list nested is, as you say, "2D" (meaning that you only want to go one level down, and all 1-level-down items of nested are lists), a simple list comprehension:
flat = [x for sublist in nested for x in sublist]
is the approach I'd recommend -- much more efficient than summing would be (sum is intended for numbers -- it was just too much of a bother to somehow make it block all attempts to "sum" non-numbers... I was the original proposer and first implementer of sum in the Python standard library, so I guess I should know;-).
If you want to go down "as deep as it takes" (for deeply nested lists), recursion is the simplest way, although by eliminating the recursion you can get higher performance (at the price of higher complication).
This recipe suggests a recursive solution, a recursion elimination, and other approaches
(all instructive, though none as simple as the one-liner I suggested earlier in this answer).
sum adds a sequence together using the + operator. e.g sum([1,2,3]) == 6. The 2nd parameter is an optional start value which defaults to 0. e.g. sum([1,2,3], 10) == 16.
In your example it does [] + [1,2] + [3,4] where + on 2 lists concatenates them together. Therefore the result is [1,2,3,4]
The empty list is required as the 2nd paramter to sum because, as mentioned above, the default is for sum to add to 0 (i.e. 0 + [1,2] + [3,4]) which would result in unsupported operand type(s) for +: 'int' and 'list'
This is the relevant section of the help for sum:
sum(sequence[, start]) -> value
Returns the sum of a sequence of
numbers (NOT strings) plus the value
of parameter 'start' (which defaults
to 0).
Note
As wallacoloo comented this is not a general solution for flattening any multi dimensional list. It just works for a list of 1D lists due to the behavior described above.
Update
For a way to flatten 1 level of nesting see this recipe from the itertools page:
def flatten(listOfLists):
"Flatten one level of nesting"
return chain.from_iterable(listOfLists)
To flatten more deeply nested lists (including irregularly nested lists) see the accepted answer to this question (there are also some other questions linked to from that question itself.)
Note that the recipe returns an itertools.chain object (which is iterable) and the other question's answer returns a generator object so you need to wrap either of these in a call to list if you want the full list rather than iterating over it. e.g. list(flatten(my_list_of_lists)).
For any kind of multidiamentional array, this code will do flattening to one dimension :
def flatten(l):
try:
return flatten(l[0]) + (flatten(l[1:]) if len(l) > 1 else []) if type(l) is list else [l]
except IndexError:
return []
It looks to me more like you're looking for a final answer of:
[3, 7]
For that you're best off with a list comprehension
>>> l=[[1,2],[3,4]]
>>> [x+y for x,y in l]
[3, 7]
I wrote a program to do multi-dimensional flattening using recursion. If anyone has comments on making the program better, you can always see me smiling:
def flatten(l):
lf=[]
li=[]
ll=[]
p=0
for i in l:
if type(i).__name__=='list':
li.append(i)
else:
lf.append(i)
ll=[x for i in li for x in i]
lf.extend(ll)
for i in lf:
if type(i).__name__ =='list':
#not completely flattened
flatten(lf)
else:
p=p+1
continue
if p==len(lf):
print(lf)
I've written this function:
def make_array_single_dimension(l):
l2 = []
for x in l:
if type(x).__name__ == "list":
l2 += make_array_single_dimension(x)
else:
l2.append(x)
return l2
It works as well!
The + operator concatenates lists and the starting value is [] an empty list.

Categories