"in" statement behavior in lists vs. strings - python

In Python, asking if a substring exists in a string is pretty straightforward:
>>> their_string = 'abracadabra'
>>> our_string = 'cad'
>>> our_string in their_string
True
However, checking if these same characters are "in" a list fails:
>>> ours, theirs = map(list, [our_string, their_string])
>>> ours in theirs
False
>>> ours, theirs = map(tuple, [our_string, their_string])
>>> ours in theirs
False
I wasn't able to find any obvious reason why checking for elements "in" an ordered (even immutable) iterable would behave differently than a different type of ordered, immutable iterable.

For container types such as lists and tuples, x in container checks whether x is equal to one of the items in the container. Thus with ours in theirs, Python checks whether the list ours is itself an item of theirs; it is not (the items of theirs are single characters), so the result is False.
Remember that a list can contain a list (e.g. [['a','b','c'], ...]):
>>> ours = ['a','b','c']
>>> theirs = [['a','b','c'], 1, 2]
>>> ours in theirs
True
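If what you actually want is the list analogue of the substring test, i.e. whether the items of ours appear consecutively inside theirs, you have to write it yourself. Here is a minimal sketch; the helper name contains_sublist is made up for this example and is not part of the standard library:

```python
def contains_sublist(needle, haystack):
    """Return True if the items of needle appear consecutively in haystack."""
    needle, haystack = list(needle), list(haystack)
    n = len(needle)
    # Compare needle against every window of the same length in haystack.
    # An empty needle matches, mirroring '' in 'abc' being True.
    return any(haystack[i:i + n] == needle
               for i in range(len(haystack) - n + 1))


print(contains_sublist(list('cad'), list('abracadabra')))    # True
print(contains_sublist(tuple('cab'), tuple('abracadabra')))  # False
```

This scans every window, so it is fine for short sequences but not tuned for large ones.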

Are you looking to see if 'cad' is in any of the strings in a list of strings? That would look something like:
stringsToSearch = ['blah', 'foo', 'bar', 'abracadabra']
if any('cad' in s for s in stringsToSearch):
    # 'cad' was in at least one string in the list
    print("found")
else:
    # none of the strings in the list contain 'cad'
    print("not found")

From the Python documentation, https://docs.python.org/2/library/stdtypes.html for sequences:
x in s        True if an item of s is equal to x, else False    (1)
x not in s    False if an item of s is equal to x, else True    (1)
(1) When s is a string or Unicode string object, the in and not in operations act like a substring test.
For user-defined classes, the __contains__ method implements the in test. list and tuple implement the basic notion; str adds the notion of 'substring', which makes it a special case among the basic sequences.


How can I assert that the argument only contains listed integers in Python

I have a function in Python that should only accept integers, either a single one or a list of them. How can I define this?
def autoInt(integers):
    assert int(integers)
    assert len(integers) > 0
This fails as I cannot have a list. I'm sure it's something easy.
TypeError: int() argument must be a string or a number, not 'list'
Edit: I have been tasked so that this method can ONLY accept integers in a list.
That depends on what passes as an integer by your definition. For example, do instances of bool count? Does the float 1.0?
Anyway - you can combine the all builtin with a generator expression.
>>> a = [1,2,True]
>>> all(isinstance(x, int) for x in a)
True
As a sidenote: rigorously checking argument types is not something Python programmers do when there's no specific reason. A better approach is usually to write clear docstrings and/or type hints.
Here is an answer which explains how to do the latter. Apart from that, there's usually a "garbage in -> garbage (or error) out" mentality.
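If you do need a strict check, a small validating function is one option. The sketch below makes some assumptions: the name require_int_list is made up, bool values are rejected even though bool is a subclass of int, and exceptions are raised instead of using assert (assert statements are skipped when Python runs with -O):

```python
def require_int_list(integers):
    """Raise if integers is not a non-empty list of ints (bools rejected)."""
    if not isinstance(integers, list):
        raise TypeError("expected a list, got %s" % type(integers).__name__)
    if not integers:
        raise ValueError("expected at least one integer")
    for item in integers:
        # bool is a subclass of int, so it has to be excluded explicitly.
        if isinstance(item, bool) or not isinstance(item, int):
            raise TypeError("not an integer: %r" % (item,))
    return integers


print(require_int_list([1, 2, 3]))  # [1, 2, 3]
```

Whether True and 1.0 should pass is a policy decision; adjust the isinstance test to match your definition of "integer".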
You can test a list of integers like this:
assert(all(isinstance(item, int) for item in integers))
From How to test if every item in a list of type 'int'?
>>> my_list = [1, 2, 3.25]
>>> all(isinstance(item, int) for item in my_list)
False
>>> other_list = range(3)
>>> all(isinstance(item, int) for item in other_list)
True
>>>
Source: https://stackoverflow.com/a/6009630/6748523

my_set.copy().add( something ) why return None [duplicate]

What am I doing wrong here?
a = set().add(1)
print a # Prints `None`
I'm trying to add the number 1 to the empty set.
It is a convention in Python that methods that mutate a container in place (not only sequences, but also sets and dicts) generally return None.
Consider:
>>> a_list = [3, 2, 1]
>>> print a_list.sort()
None
>>> a_list
[1, 2, 3]
>>> a_dict = {}
>>> print a_dict.__setitem__('a', 1)
None
>>> a_dict
{'a': 1}
>>> a_set = set()
>>> print a_set.add(1)
None
>>> a_set
set([1])
Some may consider this convention "a horrible misdesign in Python", but the Design and History FAQ gives the reasoning behind this design decision (with respect to lists):
Why doesn’t list.sort() return the sorted list?
In situations where performance matters, making a copy of the list
just to sort it would be wasteful. Therefore, list.sort() sorts the
list in place. In order to remind you of that fact, it does not return
the sorted list. This way, you won’t be fooled into accidentally
overwriting a list when you need a sorted copy but also need to keep
the unsorted version around.
In Python 2.4 a new built-in function – sorted() – has been added.
This function creates a new list from a provided iterable, sorts it
and returns it.
Your particular problems with this feature come from a misunderstanding of good ways to create a set rather than a language misdesign. As Lattyware points out, in Python versions 2.7 and later you can use a set literal a = {1} or do a = set([1]) as per Sven Marnach's answer.
Parenthetically, I like Ruby's convention of placing an exclamation point after methods that mutate objects, but I find Python's approach acceptable.
The add() method adds an element to the set, but it does not return the set again -- it returns None.
a = set()
a.add(1)
or better
a = set([1])
would work.
Because add() modifies your set in place and returns None:
>>> empty = set()
>>> print(empty.add(1))
None
>>> empty
set([1])
Another way to do it that is relatively simple would be:
a = set()
a = a | {1}
This creates the union of your set a and a set with 1 as its only element.
print(a) then yields {1}, because a now has all elements of both a and {1}.
You should do this:
a = set()
a.add(1)
print a
Notice that you're assigning to a the result of adding 1, and the add operation, as defined in Python, returns None - and that's what is getting assigned to a in your code.
Alternatively, you can do this for initializing a set:
a = set([1, 2, 3])
The add method updates the set, but returns None.
a = set()
a.add(1)
print a
You are assigning the value returned by set().add(1) to a. This value is None, as add() does not return any value; instead it acts in place on the set.
What you wanted to do was this:
a = set()
a.add(1)
print(a)
Of course, this example is trivial, but Python does support set literals, so if you really wanted to do this, it's better to do:
a = {1}
print(a)
The curly brackets denote a set. Be warned, though, that {} denotes an empty dict, not an empty set, because curly brackets are used for both dicts and sets; dicts are distinguished by the colon that separates keys and values.
As an alternative to a = set() | {1}, consider the in-place operator:
a = set()
a |= {1}
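To put the variants from the answers above side by side, here is a short sketch (the variable names are only illustrative):

```python
a = set().add(1)      # add() mutates the temporary set and returns None
print(a)              # None

b = set()
b.add(1)              # mutate first, then use the name bound to the set
c = {1}               # set literal (Python 2.7+)
d = set() | {2}       # union builds a new set and returns it
d |= {3}              # in-place union updates d

print(b == c)         # True
print(sorted(d))      # [2, 3]
```

The operators return a set because they are expressions that produce a value, whereas add() exists only for its side effect.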

What does the builtin function any() do?

I did some google searching on how to check if a string has any elements of a list in it and I found this bit of code that works:
if any(i in string for i in list):
I know this works, but I don't really know why. Could you share some insight?
As the docs for any say:
Return True if any element of the iterable is true. If the iterable is empty, return False. Equivalent to:
def any(iterable):
    for element in iterable:
        if element:
            return True
    return False
So, this is equivalent to:
for element in (i in string for i in list):
    if element:
        return True
return False
… which is itself effectively equivalent to:
for i in list:
    element = i in string
    if element:
        return True
return False
If you don't understand the last part, first read the tutorial section on list comprehensions, then skip ahead to iterators, generators, and generator expressions.
If you want to really break it down, you can do this:
elements = []
for i in list:
    elements.append(i in string)
for element in elements:
    if element:
        return True
return False
That still isn't exactly the same, because a generator expression builds a generator, not a list, but it should be enough to get you going until you read the tutorial sections.
But meanwhile, the point of having any and comprehensions and so on is that you can almost read them as plain English:
if any(i in string for i in list): # Python
if any of the i's is in the string, for each i in the list: # pseudo-English
i in string for i in list
This produces an iterable of booleans indicating whether each item in list is in string. Then you check whether any item in this iterable of bools is true.
In effect, you're checking whether any of the items in the list are substrings of string.
What's going on here with:
if any(i in string for i in list):
is best explained by illustrating:
>>> xs = ["Goodbye", "Foo", "Balloon"]
>>> s = "Goodbye World"
>>> [i in s for i in xs]
[True, False, False]
>>> any([i in s for i in xs])
True
If you read the any documentation you'll note:
any(iterable) Return True if any element of the iterable is true.
If the iterable is empty, return False. Equivalent to:
The list comprehension should be more obvious as it constructs a list of i in s for each element of xs.
Basically (in English), you are checking whether any of the substrings exists in the search string (the haystack).
It's important to note as well that any() will short-circuit and stop at the first True(ish) value it finds. any() can be implemented in pure Python like this:
def any(iterable):
    for x in iterable:
        if x:
            return True
    return False
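The short-circuiting only pays off when any() is given a lazy iterable such as a generator expression. The sketch below makes this visible with a hypothetical helper, noisy_contains, that records every substring test it performs:

```python
def noisy_contains(needle, haystack, log):
    """Substring test that records every needle it was asked about."""
    log.append(needle)
    return needle in haystack


words = ["Goodbye", "Foo", "Balloon"]
text = "Goodbye World"

# Generator expression: any() stops pulling items after the first True.
lazy_log = []
lazy = any(noisy_contains(w, text, lazy_log) for w in words)

# List comprehension: every test runs before any() sees the list.
eager_log = []
eager = any([noisy_contains(w, text, eager_log) for w in words])

print(lazy, lazy_log)    # True ['Goodbye']
print(eager, eager_log)  # True ['Goodbye', 'Foo', 'Balloon']
```

Both calls give the same answer; the difference is only in how much work is done first.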

How to get the first 2 letters of a string in Python?

Let's say I have a string
str1 = "TN 81 NZ 0025"
two = first2(str1)
print(two) # -> TN
How do I get the first two letters of this string? I need the first2 function for this.
It is as simple as string[:2]. A function can be easily written to do it, if you need.
Even the function is as simple as:
def first2(s):
    return s[:2]
In general, you can get the characters of a string from index i up to, but not including, index j with string[i:j].
string[:2] is shorthand for string[0:2]. This works for lists as well.
Learn about Python's slice notation at the official tutorial
t = "your string"
Play with the first N characters of a string with
def firstN(s, n=2):
    return s[:n]
which is by default equivalent to
t[:2]
Here's what the simple function would look like:
def firstTwo(string):
    return string[:2]
In Python, strings behave like lists of characters, but they are not of type list; they are merely list-like (i.e. they can be treated like a list in many ways). More formally, both are sequence types (see http://docs.python.org/2/library/stdtypes.html#sequence-types-str-unicode-list-tuple-bytearray-buffer-xrange):
>>> a = 'foo bar'
>>> isinstance(a, list)
False
>>> isinstance(a, str)
True
Since strings are sequences, you can use slicing to access parts of them, written sequence[start_index:end_index] (see Explain Python's slice notation). For example:
>>> a = [1,2,3,4]
>>> a[0]
1 # first element, NOT a sequence.
>>> a[0:1]
[1] # a slice from first to second, a list, i.e. a sequence.
>>> a[0:2]
[1, 2]
>>> a[:2]
[1, 2]
>>> x = "foo bar"
>>> x[0:2]
'fo'
>>> x[:2]
'fo'
When a bound is omitted, slice notation uses 0 as the start position and len(sequence) as the end position.
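The defaults, and what happens when the slice reaches past the end of the sequence, can be checked directly. A short sketch using the string from the question:

```python
s = "TN 81 NZ 0025"

print(s[:2] == s[0:2])        # True: an omitted start means 0
print(s[3:] == s[3:len(s)])   # True: an omitted end means len(s)

# Slices are clamped to the sequence, so short input is not an error.
print(repr("T"[:2]))          # 'T'
print(repr(""[:2]))           # ''

# Indexing a single position, by contrast, does raise on short input.
try:
    "T"[1]
except IndexError as exc:
    print("IndexError:", exc)
```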
In the olden C days, a string was an array of characters; the whole issue of dynamic vs. static lists sounds like legend now. See Python List vs. Array - when to use?
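Note that none of the previous examples raise an exception if your string is not long enough: slicing simply returns whatever characters are available. Only indexing a single position (for example s[5]) raises IndexError on a short string.
Another approach is to use
'yourstring'.ljust(100)[:100].strip().
This pads the string to 100 characters, takes the first 100 chars, and strips the padding again.
Because strip() removes whitespace from both ends, you might get a shorter string if your string begins or ends with spaces.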
For completeness: Instead of using def you could give a name to a lambda function:
first2 = lambda s: s[:2]

Is there a short contains function for lists?

Given a list xs and a value item, how can I check whether xs contains item (i.e., if any of the elements of xs is equal to item)? Is there something like xs.contains(item)?
For performance considerations, see Fastest way to check if a value exists in a list.
Use:
if my_item in some_list:
    ...
Also, inverse operation:
if my_item not in some_list:
    ...
It works fine for lists, tuples, sets and dicts (check keys).
Note that this is an O(n) operation in lists and tuples, but an O(1) operation on average in sets and dicts.
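When the same collection is searched many times, converting it to a set once can be worthwhile. A small sketch with made-up data showing that the answers are identical, and that a dict membership test only looks at keys:

```python
allowed_list = ["red", "green", "blue"]
allowed_set = set(allowed_list)          # one-time O(n) conversion
prices = {"apple": 3, "pear": 5}

queries = ["green", "pink", "blue"]

# Same answers either way; the set version hashes instead of scanning.
from_list = [q in allowed_list for q in queries]
from_set = [q in allowed_set for q in queries]
print(from_list == from_set)   # True
print(from_set)                # [True, False, True]

# For a dict, `in` looks at keys only; check values explicitly if needed.
print("apple" in prices)           # True
print(3 in prices)                 # False
print(3 in prices.values())        # True
```

Set and dict lookups require hashable items, so this does not work for a list of lists, for example.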
In addition to what others have said, you may also be interested to know that in works by calling the container's __contains__ method (list.__contains__ for a list). You can define this method on any class you write, which can be extremely handy for using Python to its full extent.
A dumb use may be:
>>> class ContainsEverything:
...     def __init__(self):
...         return None
...     def __contains__(self, *elem, **k):
...         return True
...
>>> a = ContainsEverything()
>>> 3 in a
True
>>> a in a
True
>>> False in a
True
>>> False not in a
False
I came up with this one liner recently for getting True if a list contains any number of occurrences of an item, or False if it contains no occurrences or nothing at all. Using next(...) gives this a default return value (False) and means it should run significantly faster than running the whole list comprehension.
list_does_contain = next((True for item in list_to_test if item == test_item), False)
The list method index raises ValueError if the item is not present, and returns the index of the item in the list if it is present (it is str.find, not list.index, that returns -1 for a missing substring). Alternatively, in an if statement you can do the following:
if myItem in list:
    pass  # do things
You can also check if an element is not in a list with the following if statement:
if myItem not in list:
    pass  # do things
There is also the list method:
[2, 51, 6, 8, 3].__contains__(8)
# Out[33]: True
[2, 51, 6, 3].__contains__(8)
# Out[33]: False
There is another method that uses index. list.index raises ValueError when the item is missing, so you can catch that; the main drawback is that it is more verbose than the in operator.
items = [5, 4, 3, 1]
try:
    items.index(2)
    # code for when the item is in the list
    print("present")
except ValueError:
    # code for when the item is not in the list
    print("not present")
Output:
not present
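If you need the position as well as the yes/no answer, the try/except can be wrapped once in a helper. A minimal sketch; index_or_default is a hypothetical name, and returning -1 for a missing item is a choice made here, not something list.index does:

```python
def index_or_default(items, target, default=-1):
    """Return the position of target in items, or default if it is absent."""
    try:
        return items.index(target)
    except ValueError:
        # list.index raises ValueError instead of returning a sentinel.
        return default


numbers = [5, 4, 3, 1]
print(index_or_default(numbers, 3))   # 2
print(index_or_default(numbers, 2))   # -1
```

Catching ValueError specifically, rather than using a bare except, avoids hiding unrelated errors.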
