Python list of tuples to list of int - python

So, I have x=[(12,), (1,), (3,)] (a list of tuples) and I want x=[12, 1, 3] (a list of integers), done in the best way possible. Can you please help?

You didn't say what you mean by "best", but presumably you mean "most pythonic" or "most readable" or something like that.
The list comprehension given by F3AR3DLEGEND is probably the simplest. Anyone who knows how to read a list comprehension will immediately know what it means.
y = [i[0] for i in x]
However, often you don't really need a list, just something that can be iterated over once. If you've got a billion elements in x, building a billion-element y just to iterate over it one element at a time may be a bad idea. So, you can use a generator expression:
y = (i[0] for i in x)
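To make the trade-off concrete, here is a small sketch using the list from the question (the variable names are only for illustration). It shows that a generator expression produces values on demand and can be consumed only once:

```python
x = [(12,), (1,), (3,)]

# A generator expression builds no intermediate list.
gen = (i[0] for i in x)

first_total = sum(gen)   # consumes the generator
second_total = sum(gen)  # the generator is exhausted, so nothing is left to sum

print(first_total)   # 16
print(second_total)  # 0
```

If you need to iterate more than once, build the list instead.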
If you prefer functional programming, you might prefer to use map. The downside of map is that you have to pass it a function, not just an expression, which means you either need to use a lambda function, or itemgetter:
y = map(operator.itemgetter(0), x)
In Python 3, this is equivalent to the generator expression; if you want a list, pass it to list. In Python 2, it returns a list; if you want an iterator, use itertools.imap instead of map.
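A minimal sketch of the map version, assuming Python 3 (where map returns a lazy iterator) and the sample list from the question:

```python
import operator

x = [(12,), (1,), (3,)]

# In Python 3, map() returns an iterator; list() materializes it.
y = list(map(operator.itemgetter(0), x))

# The lambda form is equivalent, just a little more verbose.
z = list(map(lambda t: t[0], x))

print(y)  # [12, 1, 3]
```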
If you want a more generic flattening solution, you can write one yourself, but it's always worth looking at itertools for generic solutions of this kind, and there is in fact a recipe called flatten that's used to "Flatten one level of nesting". So, copy and paste that into your code (or pip install more-itertools) and you can just do this:
y = flatten(x)
If you look at how flatten is implemented, and then at how chain.from_iterable is implemented, and then at how chain is implemented, you'll notice that you could write the same thing in terms of builtins. But why bother, when flatten is going to be more readable and obvious?
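For reference, the recipe in the itertools documentation is a thin wrapper around chain.from_iterable, so a self-contained sketch looks roughly like this (the sample data is from the question; mixed is an extra illustration):

```python
from itertools import chain

def flatten(list_of_iterables):
    """Flatten one level of nesting (mirrors the itertools recipe)."""
    return chain.from_iterable(list_of_iterables)

x = [(12,), (1,), (3,)]
y = list(flatten(x))
print(y)  # [12, 1, 3]

# Unlike i[0], this also copes with tuples of other lengths.
mixed = list(flatten([(1, 2), (3,), ()]))
print(mixed)  # [1, 2, 3]
```

Note that flatten returns an iterator, so wrap it in list() if you need a list.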
Finally, if you want to reduce the generic version to a nested list comprehension (or generator expression, of course):
y = [j for i in x for j in i]
However, nested list comprehensions are very easy to get wrong, both in writing and reading. (Note that F3AR3DLEGEND, the same person who gave the simplest answer first, also gave a nested comprehension and got it wrong. If he can't pull it off, are you sure you want to try?) For really simple cases, they're not too bad, but still, I think flatten is a lot easier to read.

y = [i[0] for i in x]
This only works for one element per tuple, though.
However, if you have multiple elements per tuple, you can use a slightly more complex list comprehension:
y = [i[j] for i in x for j in range(len(i))]
Reference: List Comprehensions

Just do this:
x = [i[0] for i in x]
Explanation:
>>> x=[(12,), (1,), (3,)]
>>> x
[(12,), (1,), (3,)]
>>> [i for i in x]
[(12,), (1,), (3,)]
>>> [i[0] for i in x]
[12, 1, 3]

This is the most efficient way:
x = [i for i, in x]
or, equivalently
x = [i for (i,) in x]
This is a bit slower:
x = [i[0] for i in x]
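Speed aside, the two forms also differ in how strict they are. A small sketch (the list named longer is made up for illustration): unpacking insists on exactly one element per tuple, while indexing quietly takes the first one.

```python
x = [(12,), (1,), (3,)]

unpacked = [i for (i,) in x]  # requires exactly one element per tuple
indexed = [i[0] for i in x]   # only looks at the first element

longer = [(12, 99), (1,), (3,)]  # hypothetical input with a 2-tuple

try:
    [i for (i,) in longer]
    unpack_error = None
except ValueError as exc:
    unpack_error = exc        # too many values to unpack

first_items = [i[0] for i in longer]  # silently ignores the extra 99
print(first_items)  # [12, 1, 3]
```

Which behaviour you want depends on whether unexpected input should fail loudly.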

You can use the map function:
map(lambda y: y[0], x)

Related

How do I sum the first two values in each tuple in a list of tuples in Python?

I searched through stackoverflow but couldn't find an answer or adapt similar codes, unfortunately.
The problem is that I have a list of tuples:
tupl = [(5,3,33), (2,5,2), (4,1,7)]
and I should use list comprehension to have this output:
[8,7,5]
The code that I wrote is a bit dumb:
sum_tupl = []
tupl = [(5,3,33), (2,5,2), (4,1,7)]
sum_tupl = [tupl[0][0]+tupl[0][1] for tuple in tupl]
sum_tupl
but instead of doing what I want it to do, it returns
[8,8,8]
which is the sum of the first couple executed three times.
I tried using a variant:
tupl = [(5,3,33), (2,5,2), (4,1,7)]
sum_tupl = [sum(tupl[0],tupl[1]) for tuple in tupl]
sum_tupl
(which is missing something) but to no avail.
Note: When you're beginning python, it's much easier to use loops instead of list comprehensions because then you can debug your code more easily, either by using print() statements or by stepping through it using a debugger.
Now, on to the answer:
When you do for x in y, with a list y, the individual items go in x. So for your code for tuple in tupl, you shouldn't use tupl because that is the entire list. You need to use the individual item in the list, i.e. tuple.
Note that it's not a good idea to name a variable tuple because that's already a builtin type in python.
You need:
tupl = [(5,3,33), (2,5,2), (4,1,7)]
sum_tupl = [t[0]+t[1] for t in tupl]
Which gives the list
sum_tupl: [8, 7, 5]
If you have more elements you want to sum, it makes more sense to use the sum() function and a slice of t instead of writing out everything. For example, if you wanted to sum the first 5 elements, you'd write
sum_tupl = [sum(t[0:5]) for t in tupl]
instead of
sum_tupl = [t[0]+t[1]+t[2]+t[3]+t[4] for t in tupl]
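Here is a runnable sketch of the slice version with the data from the question. One thing worth knowing is that a slice never raises IndexError, so asking for more elements than a tuple has simply sums whatever is there (whereas t[4] would fail on these three-element tuples):

```python
tupl = [(5, 3, 33), (2, 5, 2), (4, 1, 7)]

# Sum the first two elements of each tuple.
first_two = [sum(t[:2]) for t in tupl]
print(first_two)  # [8, 7, 5]

# Slicing past the end is allowed: each tuple only has three elements,
# so this sums all of them.
first_five = [sum(t[:5]) for t in tupl]
print(first_five)  # [41, 9, 12]
```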
Each loop iteration you're accessing the first element of your tupl list, instead of using the current element. This is what you want:
sum_tupl = [t[0] + t[1] for t in tupl]
I prefer unpacking:
>>> [x + y for x, y, *_ in tupl]
[8, 7, 5]
>>>

Understanding this line: list_of_tuples = [(x,y) for x, y, label in data_one]

As you've already understood I'm a beginner and am trying to understand what the "Pythonic way" of writing this function is built on.
I know that other threads might include a partial answer to this, but I don't know what to look for since I don't understand what is happening here.
This line is a code that my friend sent me, to improve my code which is:
import numpy as np
#load_data:
def load_data():
    data_one = np.load ('/Users/usr/... file_name.npy')
    list_of_tuples = []
    for x, y, label in data_one:
        list_of_tuples.append( (x,y) )
    return list_of_tuples
print(load_data())
The "improved" version:
import numpy as np
#load_data:
def load_data():
    data_one = np.load ('/Users/usr.... file_name.npy')
    list_of_tuples = [(x,y) for x, y, label in data_one]
    return list_of_tuples
print(load_data())
I wonder:
What is happening here?
Is it a better or worse way? Since it is "Pythonic", I assume it wouldn't work in other languages, so perhaps it's better to get used to the more general way?
list_of_tuples = [(x,y) for x, y, label in data_one]
(x, y) is a tuple.
This is a list comprehension
  [(x,y) for x, y, label in data_one]
# ^                                 ^
# |      ^comprehension syntax^     |
# begin list                 end list
data_one is an iterable and is necessary for a list comprehension. Under the covers they are loops and must iterate over something.
x, y, label in data_one tells me that I can "unpack" these three items from every element that is delivered by the data_one iterable. This is just like a local variable of a for loop, it changes upon each iteration.
In total, this says:
Make a list of tuples that look like (x, y) where I get x, y, and label from each item delivered by the iterable data_one. Put each x and y into a tuple inside a list called list_of_tuples. Yes I know I "unpacked" label and never used it, I don't care.
Both ways are correct and work. You could probably relate the first way with the way things are done in C and other languages. This is, you basically run a for loop to go through all of the values and then append it to your list of tuples.
The second way is more pythonic but does the same. If you take a look at [(x,y) for x, y, label in data_one] (this is a list comprehension) you will see that you are also running a for loop on the same data but your result will be (x, y) and all of those results will form a list. So it achieves the same thing.
The third way (added as a response of the comments) uses a slice method.
I've prepared a small example similar to yours:
data = [(1, 2, 3), (2, 3, 4), (4, 5, 6)]
def load_data():
    list_of_tuples = []
    for x, y, label in data:
        list_of_tuples.append((x,y))
    return list_of_tuples
def load_data_2():
    return [(x,y) for x, y, label in data]
def load_data_3():
    return [t[:2] for t in data]
They all do the same thing and return [(1, 2), (2, 3), (4, 5)] but their runtime is different. This is why a list comprehension is a better way to do this.
When I run the first method, load_data(), I get:
%%timeit
load_data()
1000000 loops, best of 3: 1.36 µs per loop
When I run the second method load_data_2() I get:
%%timeit
load_data_2()
1000000 loops, best of 3: 969 ns per loop
When I run the third method load_data_3() I get:
%%timeit
load_data_3()
1000000 loops, best of 3: 981 ns per loop
The second way, list comprehension, is faster!
The "improved" version uses a list comprehension. This makes the code declarative (describing what you want) rather than imperative (describing how to get what you want).
The advantages of declarative programming are that the implementation details are mostly left out, and the underlying classes and data-structures can perform the operations in an optimal way. For example, one optimisation that the python interpreter could make in your example above, would be to pre-allocate the correct size of the array list_of_tuples rather than having to continually resize the array during the append() operation.
To get you started with list comprehensions, I'll explain the way I normally start to write them. For a list L write something like this:
output = [x for x in L]
For each element in L, a variable is extracted (the centre x) and can be used to form the output list (the x on the left). The above expression effectively does nothing: output ends up as a copy of L. Imperatively, it is akin to:
output = []
for x in L:
    output.append(x)
From here though, you could realise that each x is actually a tuple that could be unpacked using tuple assignment:
output = [x for x, y, label in L]
This will create a new list, containing only the x element from each tuple in the list.
If you wanted to pack a different tuple in the output list, you just pack it on the left-hand side:
output = [(x,y) for x, y, label in L]
This is basically what you end up with in your optimised version.
You can do other useful things with list comprehensions, such as only inserting values that conform to a specific condition:
output = [(x,y) for x, y, label in L if x > 10]
Here is a useful tutorial about list comprehensions that you might find interesting: http://treyhunner.com/2015/12/python-list-comprehensions-now-in-color/
The action is essentially the same. In newer Python interpreters the scope of the variables in the list comprehension is narrower (x can't be seen outside the comprehension).
list_of_tuples = []
for x, y, label in data_one:
    list_of_tuples.append( (x,y) )
list_of_tuples = [(x,y) for x, y, label in data_one]
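The scoping remark can be checked directly. This is a minimal sketch, assuming Python 3 (in Python 2, list comprehension variables leak into the enclosing scope); the function names, variable names and sample data are made up for illustration:

```python
def loop_version(data):
    pairs = []
    for x, y, label in data:
        pairs.append((x, y))
    # After a for loop, the loop variables are still bound.
    return pairs, label

def comprehension_version(data):
    pairs = [(px, py) for px, py, plabel in data]
    try:
        plabel  # not visible outside the comprehension in Python 3
        leaked = True
    except NameError:
        leaked = False
    return pairs, leaked

sample = [(1, 2, "a"), (3, 4, "b")]
print(loop_version(sample))           # ([(1, 2), (3, 4)], 'b')
print(comprehension_version(sample))  # ([(1, 2), (3, 4)], False)
```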
This kind of action occurs often enough that Python developers thought it worth while to use special syntax. There's a map(fn, iterable) function that does something similar, but I think the list comprehension is clearer.
Python developers like this syntax enough to extend it to generators and dictionaries and sets. And they allow nesting and conditional clauses.
Both forms use tuple unpacking x,y,label in data_one.
What are both of these clips doing? data_one apparently is a list of tuples (or sublists) with 3 elements. This code is creating a new list with 2 element tuples - 2 out of the 3 elements. I think it's easier to see that in the list comprehension.
It's wise to be familiar with both. Sometimes the action is too complicated to cast in the comprehension form.
Another feature of the comprehension - it doesn't allow side effects (or at least it is trickier to incorporate them). That may be a defect in some cases, but generally it makes the code clearer.
This is called a list comprehension. It's similar to a loop and can often accomplish the same task, but will generate a list with the results. The general format is [operation for variable in iterable]. For example,
[x**2 for x in range(4)] would result in [0, 1, 4, 9].
They can also be made more complicated (like yours above is) by using multiple functions, variables, and iterables in one list comprehension. For example,
[(x,y) for x in range(5) for y in range(10)].
You can find more reading on this here.

python array creation syntax [for in range]

I came across the following syntax to create a python array. It is strange to me.
Can anyone explain it to me? And how should I learn this kind of syntax?
[str(index) for index in range(100)]
First of all, this is not an array. This is a list. Python does have built-in arrays, but they are rarely used (google the array module, if you're interested). The structure you see is called list comprehension. This is the fastest way to do vectorized stuff in pure Python. Let's get through some examples.
Simple list comprehensions are written this way:
[item for item in iterable] - this will build a list containing all items of an iterable.
Actually, you can do something with each item using an expression or a function: [item**2 for item in iterable] - this will square each element, or [f(item) for item in iterable] - f is a function.
You can even add conditions. A trailing if filters items: [number for number in range(10) if not number % 2] - this will create a list of even numbers. A conditional expression (if/else) placed in front chooses the value instead: ['even' if not number % 2 else 'odd' for number in range(10)].
You can nest list comprehensions [[character for character in word] for word in words] - this will create a list of lists. List comprehensions are similar to generator expressions, so you should google Python docs for additional information.
List comprehensions and generator expressions are among the most powerful and valuable Python features. Just start an interactive session and play for a while.
P.S.
There are other types of comprehensions that create sets and dictionaries. They use the same concept. Google them for additional information.
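As a quick taste of those other comprehensions, here is a small sketch (the words list is made up for illustration):

```python
words = ["apple", "bob", "cat", "apple"]

# Set comprehension: curly braces, duplicates collapse.
lengths = {len(word) for word in words}

# Dict comprehension: key: value pairs inside curly braces.
length_by_word = {word: len(word) for word in words}

print(lengths == {3, 5})        # True
print(length_by_word["apple"])  # 5
```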
The list comprehension is a concept derived from set comprehension (set-builder notation) in mathematics: to get a new set, you specify a parent set and a rule for filtering its elements.
In its simplest but full form list comprehension looks like this:
[f(i) for i in range(1000) if i % 2 == 0]
range(1000) - the values you iterate over. It can be any iterable (list, tuple, etc.). range is just a function that produces consecutive numbers, e.g. list(range(4)) -> [0, 1, 2, 3] (in Python 2, range returns a list directly).
i - the variable assigned on each iteration.
if i % 2 == 0 - the condition used to filter values. If the condition is not true, the resulting list will not contain this element.
f(i) - any Python expression or function call on i, whose result goes into the resulting list.
To understand the concept of list comprehensions, try them out in the Python console and look at the output. Here are some of them:
[i for i in [1,2,3,4]]
[i for i in range(10)]
[i**2 for i in range(10)]
[max(4, i) for i in range(10)]
[(1 if i>5 else -1) for i in range(10)]
[i for i in range(10) if i % 2 == 0]
I recommend that you unwrap every comprehension you come across into a for loop until you get used to the mechanics and syntax. For example, your comprehension can be unwrapped this way:
newlist = []
for index in range(100):
    newlist.append(str(index))
I hope it's clear now.

How to use pop for multidimensional array in pythonic way

I want to remove the list items found in list B from the list A. This is the function I wrote:
def remove(A,B):
    to_remove=[];
    for i in range(len(A)):
        for j in range(len(B)):
            if (B[j]==A[i]):
                to_remove.append(i);
    for j in range(len(to_remove)):
        A.pop(to_remove[j]);
Is this the normal way to do it ? Although, this works completely fine (if typos, I don't know), I think there might be more pythonic way to do it. Please suggest.
Convert B to a set first and then create a new list from A using a list comprehension:
s = set(B)
A = [item for item in A if item not in s]
Item lookup in a set is an O(1) operation.
If you don't want to change the id() of A, then:
A[:] = [item for item in A if item not in s]
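The difference between the two assignments only shows up when something else refers to the same list. A small sketch (alias and the sample values are made up for illustration):

```python
A = [1, 2, 3, 4, 5]
B = [2, 4]
alias = A  # a second reference to the same list object
s = set(B)

A[:] = [item for item in A if item not in s]  # mutate in place
print(alias)       # [1, 3, 5]
print(alias is A)  # True

A = [item for item in A if item != 3]  # rebind A to a brand-new list
print(A)      # [1, 5]
print(alias)  # [1, 3, 5], the old object is untouched
```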
First, note that your function doesn't work right. Try this:
A = [1, 2, 3]
B = [1, 2, 3]
remove(A, B)
You'll get an IndexError, because the correct indices to delete change each time you do a .pop().
You'll doubtless get answers recommending using sets, and that's indeed much better if the array elements are hashable and comparable, but in general you may need something like this:
def remove(A, B):
    A[:] = [avalue for avalue in A if avalue not in B]
That works for any kinds of array elements (provided only they can be compared for equality), and preserves the original ordering. But it takes worst-case time proportional to len(A) * len(B).
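To illustrate why the set-free version is sometimes needed, here is a sketch with unhashable elements (lists of lists; the sample values are made up). Building a set fails, but the equality-based version still works:

```python
def remove(A, B):
    """Remove from A, in place, every element that equals something in B."""
    A[:] = [avalue for avalue in A if avalue not in B]

A = [[1, 2], [3, 4], [5, 6]]
B = [[3, 4]]

try:
    set(B)  # lists are unhashable, so this raises TypeError
    set_worked = True
except TypeError:
    set_worked = False

remove(A, B)
print(set_worked)  # False
print(A)           # [[1, 2], [5, 6]]
```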
List comprehension to the rescue:
[item for item in A if item not in B]
This however creates a new list. You can return the list from the function.
Or, if you are OK with losing any duplicates in list A, or there are no duplicates, you can use set difference:
return list(set(A) - set(B))
One caveat is, this won't preserve the order of elements in A. So, if you want elements in order, this is not what you want. Use the 1st approach instead.
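A short sketch of that caveat (sample values made up for illustration): the comprehension keeps order and duplicates, while the set difference keeps neither.

```python
A = [3, 1, 3, 2, 1]
B = [2]

kept_order = [item for item in A if item not in B]
print(kept_order)  # [3, 1, 3, 1]

as_set = list(set(A) - set(B))
# Duplicates are gone and the order is not guaranteed, so sort to compare.
print(sorted(as_set))  # [1, 3]
```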
What about list comprehension?
def remove(removeList, fromList):
    return [x for x in fromList if x not in removeList]
Also, to make the lookups faster, you can build a set from removeList, which keeps only its unique elements:
def remove(removeList, fromList):
    removeSet = set(removeList)
    return [x for x in fromList if x not in removeSet]
>>> print(remove([1,2,3], [1,2,3,4,5,6,7]))
[4, 5, 6, 7]
And, of course, you can use the built-in filter function, though some will say that it's non-pythonic and that you should use a list comprehension instead. Either way, here is an example:
def remove(removeList, fromList):
    removeSet = set(removeList)
    return filter(lambda x : x not in removeSet, fromList)
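One caveat if you are on Python 3: filter returns a lazy iterator there rather than a list, so you need to wrap it in list() to get the same result as the comprehension. A sketch under that assumption:

```python
def remove(removeList, fromList):
    removeSet = set(removeList)
    # list() is needed in Python 3, where filter() returns an iterator.
    return list(filter(lambda x: x not in removeSet, fromList))

result = remove([1, 2, 3], [1, 2, 3, 4, 5, 6, 7])
print(result)  # [4, 5, 6, 7]
```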

Swap values in a tuple/list inside a list in python?

I have a list of tuples like this:
[('foo','bar'),('foo1','bar1'),('foofoo','barbar')]
What is the fastest way in python (running on a very low cpu/ram machine) to swap values like this...
[('bar','foo'),('bar1','foo1'),('barbar','foofoo')]
I am currently using:
for x in mylist:
    self.my_new_list.append(((x[1]),(x[0])))
Is there a better or faster way???
You could use map:
map (lambda t: (t[1], t[0]), mylist)
Or list comprehension:
[(t[1], t[0]) for t in mylist]
List comprehensions are generally preferred, and are reportedly faster than map when a lambda is needed. Note, however, that a list comprehension is evaluated eagerly: the whole list is built as soon as the expression runs. If you're worried about memory consumption, use a generator expression instead:
g = ((t[1], t[0]) for t in mylist)
# call next() when you need a value
next(g)
There are some more details here: Python List Comprehension Vs. Map
You can use reversed like this:
tuple(reversed((1, 2))) == (2, 1)
To apply it to a list, you can use map or a list/generator comprehension:
map(tuple, map(reversed, tuples)) # map
[tuple(reversed(x)) for x in tuples] # list comprehension
(tuple(reversed(x)) for x in tuples) # generator comprehension
If you're interested primarily in runtime speed, I can only recommend that you profile the various approaches and pick the fastest.
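If you want to profile, something along these lines is a reasonable starting point. It is only a sketch: the list size and repeat count are arbitrary assumptions, and the numbers will differ between machines and Python versions, so no timings are shown.

```python
import timeit

# Hypothetical test data: the question's list, repeated to get measurable times.
mylist = [("foo", "bar"), ("foo1", "bar1"), ("foofoo", "barbar")] * 1000

candidates = {
    "index": lambda: [(t[1], t[0]) for t in mylist],
    "unpack": lambda: [(b, a) for a, b in mylist],
    "slice": lambda: [t[::-1] for t in mylist],
    "reversed": lambda: [tuple(reversed(t)) for t in mylist],
}

# All candidates must agree before their timings are worth comparing.
expected = candidates["index"]()
assert all(fn() == expected for fn in candidates.values())

timings = {name: timeit.timeit(fn, number=50) for name, fn in candidates.items()}
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name:10s} {seconds:.4f} s")
```

Run it on the low-powered machine itself, since the ranking there may not match a desktop.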
A fancy way:
[t[::-1] for t in mylist]
With a list comprehension, I find it more elegant and understandable to unpack into separate variables rather than index a single variable, as in the solution provided by @iabdalkader:
[(b, a) for a, b in mylist]
To modify the current list in-place, the most efficient way is:
my_list[:] = [(y, x) for x, y in my_list]
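This assigns to a slice that covers the entire list, so the existing list object is updated in place and every other reference to it sees the change. Note that the comprehension on the right-hand side still builds a temporary list before the assignment. See also this answer.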
