I noticed a strange difference between two list constructors that I believed to be equivalent.
Here is a small example:
hello = 'Hello World'
first = list(hello)
second = [hello]
print(first)
print(second)
This code will produce the following output:
['H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd']
['Hello World']
So, the difference is quite clear between the two constructors... And, I guess that this could be generalized to other constructors as well, but I fail to understand the logic behind it.
Can somebody shed some light on this?
The list() constructor takes at most one argument, which must be an iterable. It returns a new list whose elements are the elements of the given iterable. Since strings are iterable (character by character), a list of individual characters is returned.
[] takes as many "arguments" as you like, each being a single element in the list; the items are not "evaluated" or iterated, they are taken as is.
Everything as documented.
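A small sketch of that rule, using a few made-up values (the variable names are only for illustration): list() iterates over its argument, while brackets store each expression as one element.

```python
# list() walks over whatever iterable it is given.
from_string = list('abc')        # one element per character
from_tuple = list((1, 2, 3))     # one element per tuple item
from_range = list(range(3))      # one element per number produced

# Brackets never iterate: every expression becomes exactly one element.
literal = ['abc', (1, 2, 3), range(3)]

print(from_string)   # ['a', 'b', 'c']
print(from_tuple)    # [1, 2, 3]
print(from_range)    # [0, 1, 2]
print(len(literal))  # 3
```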
The first just transforms the string "Hello World" (a sequence of characters) into a list of its characters:
first = list(hello)
The second creates a list containing the elements inside the brackets:
second = [hello]
In the second case, for example, you could also do:
second = [hello, 'hi', 'world']
and as output of the print you will get
['Hello World', 'hi', 'world']
Your "first" uses the list constructor, which takes hello, treats it as an iterable, and converts it to a list. That is why each character is separate.
Your "second" creates a new list with the string as its only element.
You are assuming that list(hello) should create a list containing one element, the object referred to by hello. That's not true; by that logic you would expect list(5) to return [5], whereas it actually raises a TypeError because an integer is not iterable. list takes a single iterable argument (a list, a tuple, a string, a dict, etc.) and returns a list whose elements are taken from the given iterable.
The bracket notation, however, is not limited to containing a single item. Each comma-separated object is treated as a distinct element for the new list.
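To make the list(5) case concrete, here is a minimal sketch (the variable names are mine, and the exact error message may differ between Python versions):

```python
error = None
try:
    list(5)                      # an int is not iterable
except TypeError as exc:
    error = str(exc)             # wording varies by version

wrapped = [5]                    # brackets accept any object as-is
keys = list({'a': 1, 'b': 2})    # iterating a dict yields its keys
several = [5, 'five', 5.0]       # one element per comma-separated item

print(error is not None)  # True
print(wrapped)            # [5]
print(keys)               # ['a', 'b']
```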
The most important distinction between these two behaviours comes when you work with iterators. Python 3 changed functions like map and zip to return lazy iterators instead of lists.
Since map returns an iterator:
a = list(map(lambda x: str(x), [1, 2, 3]))
print(a)
The result is:
['1', '2', '3']
But if we do:
a = [map(lambda x: str(x), [1, 2, 3])]
print(a)
The result is:
[<map object at 0x00000209231CB2E8>]
It is obvious that the 2nd case is in most situations undesirable and not expected.
P.S.
If you are in Python 2, then do at the beginning: from itertools import imap as map
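One more thing worth knowing, sketched here under the assumption of Python 3: an iterator such as the one map returns can only be consumed once, so list() on it a second time gives an empty list.

```python
numbers = [1, 2, 3]

lazy = map(str, numbers)     # nothing has been converted yet
first_pass = list(lazy)      # consumes the iterator
second_pass = list(lazy)     # the iterator is already exhausted

boxed = [map(str, numbers)]       # a one-element list holding the iterator itself
unpacked = [*map(str, numbers)]   # star-unpacking iterates, much like list()

print(first_pass)   # ['1', '2', '3']
print(second_pass)  # []
print(len(boxed))   # 1
print(unpacked)     # ['1', '2', '3']
```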
first = list(hello)
converts a string into a list.
second = [hello]
This places an item into a new list. It is not a constructor.
Related
Why do these two operations (append() resp. +) give different results?
>>> c = [1, 2, 3]
>>> c
[1, 2, 3]
>>> c += c
>>> c
[1, 2, 3, 1, 2, 3]
>>> c = [1, 2, 3]
>>> c.append(c)
>>> c
[1, 2, 3, [...]]
>>>
In the last case there's actually an infinite recursion. c[-1] and c are the same. Why is it different with the + operation?
To explain "why":
The + operation adds the array elements to the original array. The array.append operation inserts the array (or any object) into the end of the original array, which results in a reference to self in that spot (hence the infinite recursion in your case with lists, though with arrays, you'd receive a type error).
The difference here is that the + operation acts specially when you add an array (it is overloaded like other operators, see this chapter on sequences) by concatenating the elements. The append method, however, does literally what you ask: it appends the object you give it (the array or any other object), instead of taking its elements.
An alternative
Use extend() if you want to use a function that acts similar to the + operator (as others have shown here as well). It's not wise to do the opposite: to try to mimic append with the + operator for lists (see my earlier link on why). More on lists below:
Lists
[edit] Several commenters have suggested that the question is about lists and not about arrays. The question has changed, though I should've included this earlier.
Most of the above about arrays also applies to lists:
The + operator concatenates two lists together. The operator will return a new list object.
list.append does not join one list with another, but appends a single object (which here is a list) to the end of your current list. Appending c to itself therefore makes the list contain a reference to itself, which is the infinite recursion you see.
As with arrays, you can use list.extend to extend a list with another list (or iterable). This changes your current list in place, as opposed to +, which returns a new list.
Little history
For fun, a little history: the array module was born in Python in February 1993. It might surprise you, but arrays were added well after sequences and lists came into existence.
The concatenation operator + is a binary infix operator which, when applied to lists, returns a new list containing all the elements of each of its two operands. The list.append() method is a mutator on list which appends its single object argument (in your specific example the list c) to the subject list. In your example this results in c appending a reference to itself (hence the infinite recursion).
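The self-reference can be checked directly. A minimal sketch, using the list from the question:

```python
c = [1, 2, 3]
c.append(c)                    # the last element is now the list itself

length = len(c)                # only one element was added
same_object = c[-1] is c       # the last element and c are the same object
deeper = c[-1][-1][-1] is c    # you can follow the reference as often as you like
shown = repr(c)                # Python prints [...] instead of recursing forever

print(length)       # 4
print(same_object)  # True
print(deeper)       # True
print(shown)        # [1, 2, 3, [...]]
```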
An alternative to '+' concatenation
The list.extend() method is also a mutator method which concatenates its sequence argument with the subject list. Specifically, it appends each of the elements of sequence in iteration order.
An aside
Being an operator, + returns the result of the expression as a new value. Being a non-chaining mutator method, list.extend() modifies the subject list in-place and returns nothing.
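A short sketch of that difference (variable names are illustrative): + hands back a new list, while extend() changes the list in place and returns None, which is why writing a = a.extend(b) is a common mistake.

```python
a = [1, 2]
b = [3, 4]

combined = a + b        # a new list; a and b are untouched
a_before = list(a)      # copy of a before mutation, for comparison

result = a.extend(b)    # mutates a in place and returns None

print(combined)   # [1, 2, 3, 4]
print(a_before)   # [1, 2]
print(a)          # [1, 2, 3, 4]
print(result)     # None
```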
Arrays
I've added this due to the potential confusion which Abel's answer above may cause by mixing the discussion of lists, sequences and arrays.
Arrays were added to Python after sequences and lists, as a more efficient way of storing arrays of integral data types. Do not confuse arrays with lists. They are not the same.
From the array docs:
Arrays are sequence types and behave very much like lists, except that the type of objects stored in them is constrained. The type is specified at object creation time by using a type code, which is a single character.
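To illustrate the type constraint, here is a small sketch with the standard array module, assuming the 'i' (signed int) type code. Appending anything that is not an integer, including the array itself, is rejected with a TypeError, while + still concatenates.

```python
from array import array

ints = array('i', [1, 2, 3])
ints.append(4)                 # fine: 4 fits the 'i' type code

rejected_array = False
try:
    ints.append(ints)          # an array is not an integer
except TypeError:
    rejected_array = True

rejected_float = False
try:
    ints.append(1.5)           # neither is a float
except TypeError:
    rejected_float = True

doubled = ints + ints          # concatenation works as it does for lists

print(rejected_array, rejected_float)  # True True
print(doubled.tolist())                # [1, 2, 3, 4, 1, 2, 3, 4]
```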
append appends a single element to a list. If you want to extend the list with the elements of another list, you need to use extend.
>>> c = [1, 2, 3]
>>> c.extend(c)
>>> c
[1, 2, 3, 1, 2, 3]
Python lists are heterogeneous, that is, the elements in the same list can be any type of object. The expression c.append(c) appends the object c, whatever it may be, to the list. In this case it makes the list itself a member of the list.
The expression c += c extends c with its own elements and rebinds the name c to the result (for lists, += works in place, like extend). The overloaded + operator, by contrast, is defined on lists to create a new list whose contents are the elements of the first list followed by the elements of the second list.
So these are really just different expressions used to do different things by design.
The method you're looking for is extend(). From the Python documentation:
list.append(x)
Add an item to the end of the list; equivalent to a[len(a):] = [x].
list.extend(L)
Extend the list by appending all the items in the given list; equivalent to a[len(a):] = L.
list.insert(i, x)
Insert an item at a given position. The first argument is the index of the element before which to insert, so a.insert(0, x) inserts at the front of the list, and a.insert(len(a), x) is equivalent to a.append(x).
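The equivalences quoted from the documentation can be checked side by side. A minimal sketch with throwaway lists:

```python
# append(x) versus a[len(a):] = [x]
a = [1, 2, 3]
b = [1, 2, 3]
a.append(4)
b[len(b):] = [4]

# extend(L) versus a[len(a):] = L
c = [1, 2, 3]
d = [1, 2, 3]
c.extend([4, 5])
d[len(d):] = [4, 5]

# insert at the front, and insert at len(a) acting like append
e = [1, 2, 3]
e.insert(0, 0)
e.insert(len(e), 4)

print(a == b)  # True
print(c == d)  # True
print(e)       # [0, 1, 2, 3, 4]
```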
You should use extend():
>>> c=[1,2,3]
>>> c.extend(c)
>>> c
[1, 2, 3, 1, 2, 3]
other info: append vs. extend
See the documentation:
list.append(x)
Add an item to the end of the list; equivalent to a[len(a):] = [x].
list.extend(L)
Extend the list by appending all the items in the given list; equivalent to a[len(a):] = L.
c.append(c) "appends" c to itself as an element. Since a list is a reference type, this creates a recursive data structure.
c += c is equivalent to extend(c), which appends the elements of c to c.
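One way to see that += works in place is to keep a second name for the same list. A sketch (alias and other are names I made up for the demonstration):

```python
c = [1, 2, 3]
alias = c              # second name for the same list object
c += c                 # in place, like c.extend(c)
still_same = alias is c

d = [1, 2, 3]
other = d
d = d + d              # builds a new list and rebinds d
rebound = d is not other

print(still_same, alias)  # True [1, 2, 3, 1, 2, 3]
print(rebound, other)     # True [1, 2, 3]
```

So after c += c every name that referred to the list sees the change, whereas d = d + d leaves the old object alone.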
Let's say I have a string
str1 = "TN 81 NZ 0025"
two = first2(str1)
print(two) # -> TN
How do I get the first two letters of this string? I need the first2 function for this.
It is as simple as string[:2]. A function can be easily written to do it, if you need.
Writing that function is just as simple:
def first2(s):
    return s[:2]
In general, you can get the characters of a string from i until j with string[i:j].
string[:2] is shorthand for string[0:2]. This works for lists as well.
Learn about Python's slice notation at the official tutorial
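Applied to the string from the question, a few slices look like this. This is only a sketch; the split() line is an alternative I am adding for this particular space-separated format, not something the question asked for.

```python
str1 = "TN 81 NZ 0025"

first_two = str1[:2]     # same as str1[0:2]
middle = str1[3:5]       # characters at index 3 and 4
last_four = str1[-4:]    # negative indices count from the end
parts = str1.split()     # splitting on whitespace also isolates the pieces

print(first_two)  # TN
print(middle)     # 81
print(last_four)  # 0025
print(parts)      # ['TN', '81', 'NZ', '0025']
```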
t = "your string"
Play with the first N characters of a string with
def firstN(s, n=2):
    return s[:n]
which is by default equivalent to
t[:2]
Here's what the simple function would look like:
def firstTwo(string):
    return string[:2]
In Python, strings are sequences of characters. They are not of type list, but they are list-like (i.e. they can be indexed and sliced like a list). More formally, they are known as sequences (see http://docs.python.org/2/library/stdtypes.html#sequence-types-str-unicode-list-tuple-bytearray-buffer-xrange):
>>> a = 'foo bar'
>>> isinstance(a, list)
False
>>> isinstance(a, str)
True
Since strings are sequences, you can use slicing to access parts of them, written sequence[start_index:end_index] (see Explain Python's slice notation). For example:
>>> a = [1,2,3,4]
>>> a[0]
1 # first element, NOT a sequence.
>>> a[0:1]
[1] # a slice from first to second, a list, i.e. a sequence.
>>> a[0:2]
[1, 2]
>>> a[:2]
[1, 2]
>>> x = "foo bar"
>>> x[0:2]
'fo'
>>> x[:2]
'fo'
When omitted, the start position defaults to 0 and the end position defaults to len(sequence).
In the olden C days, a string was an array of characters; the whole issue of dynamic vs. static lists sounds like legend now, see Python List vs. Array - when to use?
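Note that none of the slicing examples above raises an exception when the string is too short: s[:2] simply returns as many characters as are available. Only indexing a single position, such as s[2], raises an IndexError.
Another approach is to use
'yourstring'.ljust(100)[:100].strip()
This gives you at most the first 100 characters, with leading and trailing whitespace removed.
You might therefore get a shorter string if the original starts with spaces or if the last characters kept are spaces.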
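How slicing and indexing behave on strings that are too short can be checked with a quick sketch like the following (the sample strings are made up):

```python
short = "T"

first_two = short[:2]       # slicing past the end is fine
from_empty = ""[:2]         # even on an empty string

index_failed = False
try:
    short[1]                # indexing past the end is not
except IndexError:
    index_failed = True

padded = "TN".ljust(5)[:5]  # ljust pads with spaces up to the given width

print(repr(first_two))   # 'T'
print(repr(from_empty))  # ''
print(index_failed)      # True
print(repr(padded))      # 'TN   '
```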
For completeness: Instead of using def you could give a name to a lambda function:
first2 = lambda s: s[:2]
Please consider the two snippets of code (notice the distinction between string and integer):
a = []
a[:] = '1'
and
a = []
a[:] = 1
In the first case a is ['1']. In the second, I get the error TypeError: can only assign an iterable. Why would using '1' over 1 be fundamentally different here?
Assigning to a slice requires an iterable on the right-hand side.
'1' is iterable, while 1 is not. Consider the following:
In [7]: a=[]
In [8]: a[:]='abc'
The result is:
In [9]: a
Out[9]: ['a', 'b', 'c']
As you can see, the list gets each character of the string as a separate item. This is a consequence of the fact that iterating over a string yields its characters.
If you want to replace a range of a's elements with a single scalar, simply wrap the scalar in an iterable of some sort:
In [11]: a[:]=(1,) # single-element tuple
In [12]: a
Out[12]: [1]
This also applies to strings (provided the string is to be treated as a single item and not as a sequence of characters):
In [17]: a[:]=('abc',)
In [18]: a
Out[18]: ['abc']
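The same rule holds for slices in the middle of a list. A minimal sketch, with snapshots taken after each step so the intermediate states are visible:

```python
a = [0, 1, 2, 3]

a[1:3] = 'xy'            # the string is iterated: two items replace two items
after_string = list(a)

a[1:3] = ['xy']          # a one-element list: one item replaces two items
after_wrapped = list(a)

failed = False
try:
    a[:] = 1             # an int cannot be iterated
except TypeError:
    failed = True

a[:] = [1]               # wrapping the scalar works

print(after_string)   # [0, 'x', 'y', 3]
print(after_wrapped)  # [0, 'xy', 3]
print(failed)         # True
print(a)              # [1]
```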
'1' is a string, and strings are iterable; a string behaves like a list of characters. a[:]='1' replaces the contents of the list a with the characters of the string '1'. But 1 is an integer, which is not iterable.
Python does not convert between the two types implicitly.
Example:
print(bool(1 == '1')) # --> False
What is this called in python:
[('/', MainPage)]
Is that an array .. of ... erhm one dictionary?
Is that
()
A tuple? ( or whatever they call it? )
It's a list with a single tuple.
Since no one has answered this bit yet:
A tuple? ( or whatever they call it? )
The word "tuple" comes from maths. In maths, we might talk about (ordered) pairs, if we're doing 2d geometry. Moving to three dimensions means we need triples. In higher dimensions, we need quadruples, quintuples, and, uh, whatever the prefix is for six, and so on. This starts to get to be a pain, and mathematicians also love generalising ("let's work in n dimensions today!"), so they started using the term "n-tuple" for an ordered list of n things (usually numbers).
After that, a bit of natural laziness is all you need to drop the "n-" and we end up with tuples.
Note that this:
("is not a tuple")
A tuple is defined by the commas, except in the case of the zero-length tuple. This:
"is a tuple",
because of the comma at the end. The parentheses just enforce grouping (again, except in the case of a zero-length tuple).
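You can confirm this with type(). A quick sketch reusing the strings from above:

```python
not_a_tuple = ("is not a tuple")   # parentheses only group; this is a str
a_tuple = "is a tuple",            # the trailing comma makes the tuple
also_a_tuple = ("is a tuple",)     # the usual way to write it
empty = ()                         # the one case where parentheses alone suffice
pair = 1, 2                        # commas without parentheses

print(type(not_a_tuple).__name__)  # str
print(type(a_tuple).__name__)      # tuple
print(a_tuple == also_a_tuple)     # True
print(type(empty).__name__)        # tuple
print(pair)                        # (1, 2)
```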
That's a list of tuples.
This is a list of integers: [1, 2, 3, 4, 5]
This is also a list of integers: [1]
This is a (string, integer) tuple: ("hello world", 42)
This is a list of (string, integer) tuples: [("a", 1), ("b", 2), ("c", 3)]
And so is this: [("a", 1)]
In Python, there's not much difference between lists and tuples. However, they are conceptually different. An easy way to think of it is that a list contains lots of items of the same type (homogeneous) , and a tuple contains a fixed number of items of different types (heterogeneous). An easy way to remember this is that lists can be appended to, and tuples cannot, because appending to a list makes sense and appending to a tuple doesn't.
Python doesn't enforce these distinctions -- in Python, you can concatenate tuples with + (which builds a new tuple rather than changing the old one), or store heterogeneous types in a list.
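A small sketch of what that looks like in practice (names are illustrative): + on tuples gives you a new tuple, there is no append method, and lists happily hold mixed types.

```python
t = (1, 2)
before = t               # keep a reference to the original tuple
t = t + (3,)             # builds a new tuple and rebinds t

can_append = hasattr(t, 'append')   # tuples have no append method
mixed = [1, 'two', 3.0]             # a list with heterogeneous items

print(before)      # (1, 2)
print(t)           # (1, 2, 3)
print(can_append)  # False
print(mixed)       # [1, 'two', 3.0]
```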
Yes, it's a tuple.
They look like this:
()
(foo,)
(foo, bar)
(foo, bar, baz)
etc.
[('/', MainPage)]
That's a list consisting of a two element tuple.
()
That's a zero element tuple.
It is a list of tuple(s). You can verify that with:
x = [('/', MainPage)]
print(type(x)) # <class 'list'>
print(type(x[0])) # <class 'tuple'>
You can build a dictionary from this kind of structure (possibly with more tuples inside the list) with this code:
my_dict = dict(x) # x = [('/', MainPage)]
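Here is a self-contained sketch of that idea. The MainPage and AboutPage classes are empty stand-ins I am assuming for the handler objects, since the question does not show them.

```python
class MainPage:      # hypothetical placeholder for the real handler class
    pass

class AboutPage:     # second hypothetical handler, for illustration only
    pass

routes = [('/', MainPage), ('/about', AboutPage)]

table = dict(routes)             # each (key, value) tuple becomes one entry
paths = [path for path, handler in routes]   # tuples unpack naturally in loops

print(type(routes).__name__)     # list
print(type(routes[0]).__name__)  # tuple
print(table['/'] is MainPage)    # True
print(paths)                     # ['/', '/about']
```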
It is a list of tuples containing one tuple.
A tuple is just like a list except that it is immutable, meaning that it can't be changed once it's created. You can't add, remove, or change elements in a tuple. If you want your tuple to be different, you have to create a new tuple with the new data. This may sound like a pain but in reality tuples have many benefits both in code safety and speed.
It's a list of just one tuple. That tuple has two elements, a string and the object MainPage whatever it is.
Both lists and tuples are ordered groups of objects; it doesn't matter what kind of objects, as they can be heterogeneous in both cases.
The main difference between lists and tuples is that tuples are immutable, just like strings.
For example we can define a list and a tuple:
>>> L = ['a', 1, 5, 'b']
>>> T = ('a', 1, 5, 'b')
We can modify elements of L simply by assigning them a new value:
>>> print(L)
['a', 1, 5, 'b']
>>> L[1] = 'c'
>>> print(L)
['a', 'c', 5, 'b']
This is not true for tuples:
>>> print(T)
('a', 1, 5, 'b')
>>> T[1] = 'c'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
This is because they are immutable.
Tuples' elements may be mutable, and you can modify them, for example:
>>> T = (3, ['a', 1, 2], 'lol')
>>> T[1]
['a', 1, 2]
>>> T[1][0] = 'b'
>>> T
(3, ['b', 1, 2], 'lol')
But the list we edited is still the same object; we did not replace the tuple's element.
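The "same object" part can be verified with is. A minimal sketch (inner is a name I introduce to keep a handle on the list):

```python
inner = ['a', 1, 2]
T = (3, inner, 'lol')

T[1][0] = 'b'                  # mutates the list that the tuple refers to
still_same = T[1] is inner     # the tuple still points at the very same list

replace_failed = False
try:
    T[1] = ['other']           # rebinding a slot of the tuple is not allowed
except TypeError:
    replace_failed = True

print(T)               # (3, ['b', 1, 2], 'lol')
print(still_same)      # True
print(replace_failed)  # True
```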
I'm looking to take a string and create a list of strings that build up the original string.
e.g.:
"asdf" => ["a", "as", "asd", "asdf"]
I'm sure there's a "pythonic" way to do it; I think I'm just losing my mind. What's the best way to get this done?
One possibility:
>>> st = 'asdf'
>>> [st[:n+1] for n in range(len(st))]
['a', 'as', 'asd', 'asdf']
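Two variations on the same idea, as a sketch: shifting the range so the + 1 disappears, and itertools.accumulate, which by default adds each item to the running total and so concatenates the characters of a string. The accumulate version is an alternative I am suggesting, not part of the answer above.

```python
from itertools import accumulate

st = 'asdf'

by_slicing = [st[:n + 1] for n in range(len(st))]      # the comprehension from above
by_range = [st[:n] for n in range(1, len(st) + 1)]     # same result, end index directly
by_accumulate = list(accumulate(st))                   # running concatenation

print(by_slicing)     # ['a', 'as', 'asd', 'asdf']
print(by_range)       # ['a', 'as', 'asd', 'asdf']
print(by_accumulate)  # ['a', 'as', 'asd', 'asdf']
```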
If you're going to be looping over the elements of your "list", you may be better off using a generator expression rather than a list comprehension:
>>> import re
>>> text = "I'm a little teapot."
>>> textgen = (text[:i + 1] for i in range(len(text)))
>>> textgen
<generator object <genexpr> at 0x0119BDA0>
>>> for item in textgen:
...     if re.search("t$", item):
...         print(item)
I'm a lit
I'm a litt
I'm a little t
I'm a little teapot
>>>
This code never creates a list object, nor does it ever (modulo garbage collection) hold more than one extra string at a time (in addition to text).