Class pattern is matching the wrong cases - python

I'm writing an object serializer but am having issues where the class patterns are not matching the expected cases:
def dump_obj(x):
    match(x):
        case list():
            emit('L')
            dump_obj(len(x))
            for elem in x:
                dump_obj(elem)
        case Iterable():
            emit('I')
            dump_obj((type(x), list(x)))
        case tuple():
            emit('T')
            dump_obj(list(x))
        case str():
            emit('S')
            dump_obj(len(x))
            emit(x)
        case int():
            emit('D')
            emit(str(x))
        case _:
            raise TypeError(f'Unknown obj {x!r}')
When I call dump_obj() with a tuple, it gives an infinite recursion on the I-case for iterables rather than matching the T-case for tuples.
When I call dump_obj() with a list subclass, it is matching the L-case for lists instead of the intended I-case for iterables.

First problem: Ordering
The cases are not independent of one another. They are tested from the top-down (like a long if/elif chain) and the first to match wins.
In the example, the specific match tests like list, tuple, and str need to come before more general matches like Iterable. Otherwise, with the current code, a tuple input like (10, 20, 30) will match the I-case instead of the intended T-case.
Second problem: Specificity
A class pattern performs an isinstance() check, which matches both the type itself and any of its subclasses. To restrict a case to an exact type match, add a guard:
    case list() if type(x) == list:
        ...
Putting it all together
With both solutions applied, here is the new code:
from collections.abc import Iterable

def dump_obj(x):
    match(x):
        case list() if type(x) == list:    # <-- Added guard
            emit('L')
            dump_obj(len(x))
            for elem in x:
                dump_obj(elem)
        case tuple() if type(x) == tuple:  # <-- Added guard
            emit('T')
            dump_obj(list(x))
        case str() if type(x) == str:      # <-- Added guard
            emit('S')
            dump_obj(len(x))
            emit(x)
        case Iterable():                   # <-- Move after list, tuple, str
            emit('I')
            dump_obj((type(x).__name__, list(x)))
        case int():
            emit('D')
            emit(str(x))
        case _:
            raise TypeError(f'Unknown obj {x!r}')
Sample runs
Here we show that the two problematic cases work as expected.
>>> dump_obj((10, 20)) # Tuple of integers
T
L
D
2
D
10
D
20
>>> class List(list):
... pass
...
>>> dump_obj(List((30, 40))) # List subclass
I
T
L
D
2
S
D
4
List
L
D
2
D
30
D
40

Related

python, printing longest length of string in a list

My question is to write a function which returns the longest string and ignores any non-strings, and if there are no strings in the input list, then it should return None.
my answer:
def longest_string(x):
for i in max(x, key=len):
if not type(i)==str:
continue
if
return max
longest_string(['cat', 'dog', 'horse'])
I'm a beginner so I have no idea where to start. Apologies if this is quite simple.
This is how I would do it:
def longest_string(x):
    Strings = [i for i in x if isinstance(i, str)]
    return(max(Strings, key=len)) if Strings else None
Based on your code:
def longest_string(x):
    l = 0
    r = None
    for s in x:
        if isinstance(s, str) and len(s) > l:
            l = len(s)
            r = s
    return r
print(longest_string([None, 'cat', 1, 'dog', 'horse']))
# horse
def longest_string(items):
    try:
        return max([x for x in items if isinstance(x, str)], key=len)
    except ValueError:
        return None
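def longest_string(items):
    # Build a list, not a generator: a generator object is always truthy,
    # so the emptiness check below would never fall through to None.
    strings = [s for s in items if isinstance(s, str)]
    longest = max(strings, key=len) if strings else None
    return longest
print(longest_string(['cat', 'dog', 'horse']))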
Your syntax is wrong (second-to-last line: if with no condition) and you are returning max which you did not define manually. In actuality, max is a built-in Python function which you called a few lines above.
In addition, you are not looping through all strings, you are looping through the longest string. Your code should instead be
def longest_string(l):
    strings = [item for item in l if type(item) == str]
    if len(strings):
        return max(strings, key=len)
    return None
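To make the "looping through the longest string" point concrete, here is a small sketch using the list from the question. It only illustrates what the original for loop was iterating over:

```python
words = ['cat', 'dog', 'horse']

# max() already returns the single longest string ...
longest = max(words, key=len)

# ... so looping over that result visits its characters, not the list items.
chars = [ch for ch in longest]

print(longest)  # horse
print(chars)    # ['h', 'o', 'r', 's', 'e']
```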
You're on the right track. You could iterate over the list and check whether each item is the longest so far:
def longest_string(x):
    # Handle the case of an empty input list
    if len(x) == 0:
        return None
    current_longest = ""
    # Iterate over the items
    for i in x:
        # Skip non-strings
        if type(i) != str:
            continue
        # If the current string is longer than the longest so far, replace it
        if len(i) > len(current_longest):
            current_longest = i
    # This condition handles inputs where none of the elements are strings, which should return None
    if len(current_longest) > 0:
        return current_longest
    else:
        return None
Since you are a beginner, I recommend starting with Python's built-in functions for filtering and managing lists. They keep the logic simple and leave less room for bugs.
def longest_string(x):
    x = filter(lambda obj: isinstance(obj, str), x)
    longest = max(list(x), key=lambda obj: len(obj), default=None)
    return longest
That said, you were on the right track. Just avoid using the names of Python built-ins (such as max, type, list, etc.) as variable names.
EDIT: I see a lot of answers using one-liner conditionals, list comprehension, etc. I think those are fantastic solutions, but for the level of programming the OP is at, my answer attempts to document each step of the process and be as readable as possible.
First of all, I would highly suggest defining the type of the x argument in your function.
For example; since I see you are passing a list, you can define the type like so:
def longest_string(x: list):
....
This not only makes it more readable for potential collaborators but helps enormously when creating docstrings and/or combined with using an IDE that shows type hints when writing functions.
Next, I highly suggest you break down your "specs" into some pseudocode, which is enormously helpful for taking things one step at a time:
returns the longest string
ignores any non-strings
if there are no strings in the input list, then it should return None.
So to elaborate on those "specifications" further, we can write:
Return the longest string from a list.
Ignore any element from the input arg x that is not of type str
if no string is present in the list, return None
From here we can proceed to writing the function.
def longest_string(x: list):
    # Immediately verify the input is the expected type. If not, return None (or raise an exception)
    if type(x) != list:
        return None  # input should always be a list
    # Create an empty list to add all strings to
    str_list = []
    # Loop through the list
    for element in x:
        # Check the type. If it is not a string, skip to the next element
        if type(element) != str:
            continue
        # At this point in the loop the element has passed our type check, and is a string.
        # Add the element to our str_list
        str_list.append(element)
    # We should now have a list of strings.
    # However, we should handle the edge case where the input contains no strings at all,
    # which would mean we now have an empty str_list. Let's check that
    if not str_list:  # an empty list evaluates to False, so this is basically saying "if str_list is empty"
        return None
    # If the program has not hit one of the return statements yet, we have a list with at least 1 string.
    # You can check with a simple print statement (e.g. print(str_list), print(len(str_list)))
    # Now we can check for the longest string.
    # We can use the max() function for this operation
    longest_string = max(str_list, key=len)
    # Return the longest string!
    return longest_string

Python: general iterator or pure function for testing any condition across list

I would like to have a function AllTrue that takes three arguments:
List: a list of values
Function: a function to apply to all values
Condition: something to test against the function's output
and return a boolean of whether or not all values in the list match the criteria.
I can get this to work for basic conditions as follows:
def AllTrue(List, Function = "Boolean", Condition = True):
    flag = True
    condition = Condition
    if Function == "Boolean":
        for element in List:
            if element != condition:
                flag = False
                break
    else:
        Map = map(Function, List)
        for m in Map:
            if m != condition:
                flag = False
                break
    return flag
Since Python doesn't have a function meant explicitly for returning whether something is True, I just make the default "Boolean". One could clean this up by defining TrueQ to return True if an element is True and then mapping TrueQ over the List.
The else handles queries like:
l = [[0,1], [2,3,4,5], [6,7], [8,9],[10]]
AllTrue(l, len, 2)
#False
testing if all elements in the list are of length 2. However, it can't handle more complex conditions like >/< or compound conditions like len > 2 and element[0] == 15
How can one do this?
Cleaned up version
def TrueQ(item):
    return item == True
def AllTrue(List, Function = TrueQ, Condition = True):
    flag = True
    condition = Condition
    Map = map(Function, List)
    for m in Map:
        if m != condition:
            flag = False
            break
    return flag
and then just call AllTrue(List,TrueQ)
Python already has built-in the machinery you are trying to build. For example to check if all numbers in a list are even the code could be:
if all(x%2==0 for x in L):
...
if you want to check that all values are "truthy" the code is even simpler:
if all(L):
...
Note that in the first version the code is also "short-circuited", in other words the evaluation stops as soon as the result is known. In:
if all(price(x) > 100 for x in stocks):
...
the function price will be called until the first stock is found with a lower or equal price value. At that point the search will stop because the result is known to be False.
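The short-circuit behavior is easy to observe by recording which items actually get looked up. This is only a sketch: the prices dictionary and the looked_up list are made-up names and data, not anything from the question:

```python
# Made-up prices, purely for illustration.
prices = {"AAA": 150, "BBB": 90, "CCC": 200}
looked_up = []


def price(symbol):
    # Record every lookup so we can see where all() stopped.
    looked_up.append(symbol)
    return prices[symbol]


stocks = ["AAA", "BBB", "CCC"]
result = all(price(x) > 100 for x in stocks)

print(result)     # False
print(looked_up)  # ['AAA', 'BBB']
```

"CCC" is never priced, because the answer was already known after "BBB".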
To check that all lengths are 2 in the list L the code is simply:
if all(len(x) == 2 for x in L):
...
i.e. more or less a literal translation of the request. No need to write a function for that.
If this kind of test is a "filter" that you want to pass as a parameter to another function then a lambda may turn out useful:
def search_DB(test):
    for record in database:
        if test(record):
            result.append(record)
    ...
search_DB(lambda rec: all(len(x) == 2 for x in rec.strings))
I want a function that takes a list, a function, and a condition, and tells me if every element in the list matches the condition. i.e. foo(List, Len, >2)
In Python >2 is written lambda x : x>2.
There is (unfortunately) no metaprogramming facility in Python that would allow to write just >2 or things like ·>2 except using a string literal evaluation with eval and you don't want to do that. Even the standard Python library tried going down that path (see namedtuple implementation in collections) but it's really ugly.
I'm not saying that writing >2 would be a good idea, but that it would be nice to have a way to do that in case it was a good idea. Unfortunately to have decent metaprogramming abilities you need a homoiconic language representing code as data and therefore you would be programming in Lisp or another meta-language, not Python (programming in Lisp would indeed be a good idea, but for reasons unknown to me that approach is still unpopular).
Given that, the function foo to be called like
foo(L, len, lambda x : x > 2)
is just
def foo(L, f=lambda x : x, condition=lambda x: x):
    return all(condition(f(x)) for x in L)
but no Python programmer would write such a function, because the original call to foo is actually more code and less clear than inlining it with:
all(len(x) > 2 for x in L)
and requires you to also learn about this thing foo (that does what all and a generator expression would do, just slower, with more code and more obfuscated).
You are reinventing the wheel. Just use something like this:
>>> l = [[0,1], [2,3,4,5], [6,7], [8,9],[10]]
>>> def all_true(iterable, f, condition):
...     return all(condition(f(e)) for e in iterable)
...
>>> def cond(x): return x == 2
...
>>> all_true(l, len, cond)
False
You can define a different function to check a different condition:
>>> def cond(x): return x >= 1
...
>>> all_true(l, len, cond)
True
>>>
And really, having your own function that does this seems like overkill. For example, to deal with your "complex condition" you could simply do something like:
>>> l = [[0,2],[0,1,2],[0,1,3,4]]
>>> all(len(sub) > 2 and sub[0] == 5 for sub in l)
False
>>> all(len(sub) > 1 and sub[0] == 0 for sub in l)
True
>>>
I think the ideal solution in this case may be:
def AllTrue(List, Test = lambda x:x):
    return all(Test(x) for x in List)
This thereby allows complex queries like:
l = [[0, 1], [1, 2, 3], [2, 5]]
AllTrue(l, lambda x: len(x) > 2 and x[0] == 1)
To adhere to Juanpa's suggestion, here it is using Python naming conventions, as an extension of what I posted in the question, now with the ability to handle simple conditions like x > value.
from operator import gt
def all_true(a_list, a_function, an_operator, a_value):
    a_map = map(a_function, a_list)
    return all(an_operator(m, a_value) for m in a_map)
l = [[0,2],[0,1,2],[0,1,3,4]]
all_true(l, len, gt, 2)
# False (the first sublist has length 2, which is not greater than 2)
Note: this works for single conditions, but not for complex conditions like
len > 2 and element[0] == 5
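One possible way around that limitation is to accept any number of predicate functions and require every element to satisfy all of them. This is just a sketch, and all_match is a hypothetical name rather than anything from the answers above:

```python
def all_match(items, *predicates):
    """Return True if every item satisfies every predicate.

    Predicates are tried in order for each item, and evaluation stops
    at the first failure because all() short-circuits.
    """
    return all(pred(item) for item in items for pred in predicates)


good = [[5, 1, 2], [5, 0, 0, 1]]
bad = [[0, 2], [0, 1, 2]]

print(all_match(good, lambda x: len(x) > 2, lambda x: x[0] == 5))  # True
print(all_match(bad, lambda x: len(x) > 2, lambda x: x[0] == 5))   # False
```

In practice a single lambda such as lambda x: len(x) > 2 and x[0] == 5 passed to all() does the same job with less machinery.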

Python newbie clarification about tuples and strings

I just learned that I can check if a substring is inside a string using:
substring in string
It looks to me that a string is just a special kind of tuple where its elements are chars. So I wonder if there's a straightforward way to search a slice of a tuple inside a tuple. The elements in the tuple can be of any type.
tupleslice in tuple
Now my related second question:
>>> tu = 12 ,23, 34,56
>>> tu[:2] in tu
False
I gather that I get False because (12, 23) is not an element of tu. But then, why does substring in string work? Is there syntactic sugar hidden behind the scenes?
A string is not a type of tuple. In fact, they belong to different classes. How an in expression is evaluated depends on the __contains__() magic method defined in each class.
Read How do you set up the contains method in python, may be you will find it useful. To know about magic functions in Python, read: A Guide to Python's Magic Methods
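As a rough illustration of how __contains__() drives the in operator, here is a sketch of a tuple subclass that also treats a tuple operand as a contiguous slice to search for. SliceTuple is a made-up name for this example, not a standard class:

```python
class SliceTuple(tuple):
    """Hypothetical tuple subclass whose `in` also matches contiguous slices."""

    def __contains__(self, item):
        # Ordinary element membership, as a plain tuple would do it.
        is_element = super().__contains__(item)
        if isinstance(item, tuple):
            n = len(item)
            # Compare every window of length n against the candidate slice.
            is_slice = any(self[i:i + n] == item
                           for i in range(len(self) - n + 1))
            return is_slice or is_element
        return is_element


tu = SliceTuple((12, 23, 34, 56))
print((12, 23) in tu)  # True
print((23, 12) in tu)  # False
print(34 in tu)        # True
```

With a plain tuple, (12, 23) in tu is False because tuple.__contains__() only compares against individual elements.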
A string is not just a special kind of tuple. They have many similar properties (in particular, both are sequences and therefore iterable), but they are distinct types and each defines the behavior of the in operator differently. See the docs on this here: https://docs.python.org/3/reference/expressions.html#in
To solve your problem of finding whether one tuple is a sub-sequence of another tuple, writing an algorithm like in your answer would be the way to go. Try something like this:
def contains(inner, outer):
    inner_len = len(inner)
    for i, _ in enumerate(outer):
        outer_substring = outer[i:i+inner_len]
        if outer_substring == inner:
            return True
    return False
This is how I accomplished to do my first request, however, it's not straightforward nor pythonic. I had to iterate the Java way. I wasn't able to make it using "for" loops.
def tupleInside(tupleSlice):
    i, j = 0, 0
    while j < len(tu):
        t = tu[j]
        ts = tupleSlice[i]
        print(t, ts, i, j)
        if ts == t:
            i += 1
            if i == len(tupleSlice):
                return True
        else:
            j -= i
            i = 0
        j += 1
    return False
tu = tuple('abcdefghaabedc')
print(tupleInside(tuple(input('Tuple slice: '))))
Try just playing around with tuples and slices. In this case it's pretty easy because your slice is essentially indexing.
>>> tu = 12 ,23, 34,56
>>> tu
(12, 23, 34, 56) #a tuple of ints
>>> tu[:1] # a tuple with an int in it
(12,)
>>> tu[:1] in tu #checks for a tuple against int. no match.
False
>>> tu[0] in tu #checks for int against ints. matched!
True
>>> #you can see as we iterate through the values...
>>> for i in tu:
...     print(""+str(tu[:1])+" == " + str(i))
...
(12,) == 12
(12,) == 23
(12,) == 34
(12,) == 56
Slicing a tuple returns another tuple, which is a container, so in compares that whole container against each element of tu and finds no match. You need to index further to compare by values rather than containers. Slicing a string returns a string, and the in operator on strings searches for a substring, which is why substring in string works.
Just adding to Cameron Lee's answer so that it accepts inner containing a single integer.
def contains(inner, outer):
    try:
        inner_len = len(inner)
        for i, _ in enumerate(outer):
            outer_substring = outer[i:i+inner_len]
            if outer_substring == inner:
                return True
        return False
    except TypeError:
        return inner in outer
contains(4, (3,1,2,4,5)) # returns True
contains((4), (3,1,2,4,5)) # returns True

Measure a length of tuples or string

I have following string and I want to convert it to array/list so I can measure its length.
a="abc,cde,ert,ert,eee"
b="a", "b", "c"
The expected length for a should be 1 and the expected length for b should be 3.
a is a string, b is a tuple. You can try something like this:
def length_of_str_or_tuple(obj):
    if isinstance(obj, basestring):  # Python 2 only; use str in Python 3
        return 1
    return len(obj)
Although what you're doing is really weird and you should probably rethink your approach.
You can use something like this:
>>> a="abc,cde,ert,ert,eee"
>>> b="a", "b", "c"
>>> 1 if isinstance(a, str) else len(a)
1
>>> 1 if isinstance(b, str) else len(b)
3
>>>
In the above code, the conditional expression uses isinstance to test whether the object is a string. It evaluates to 1 if so and to the object's length if not.
Note that in Python 2.x, you should use isinstance(item, basestring) in order to handle both unicode and str objects.
There's a crude way to do this: check which is a string and which a tuple:
x = {}
for item in (a,b):
    try:
        item.find('')
        x[item] = 1
    except AttributeError:
        x[item] = len(item)
Since a tuple object doesn't have an attribute find, it will raise an exception.
To measure the length of the string:
len(a.split())
for the tuple:
len(list(b))
combine the previous answers to test for tuple or list and you would get what you want, or use:
if type(x) is tuple:
    len(list(x))
else:
    a = x.split("\"")
    len(a)

Check if a sublist contains an item

I have a list of lists in Python. As illustrated below, I want to check if one of the sublists contains an item. The following attempt fails. Does anyone know of a simple way -- without me writing my own for loop?
>>> a = [[1,2],[3,4],[5,6],7,8,9]
>>> 2 in a
I was hoping for True but the return was False
>>> a = [[1,2],[3,4],[5,6],7,8,9]
>>> any(2 in i for i in a)
True
For a list that contains some lists and some integers, you need to test whether the element i is a list before testing whether the search target is in i.
>>> any(2 in i for i in a if isinstance(i, list))
True
>>> any(8 in i for i in a if isinstance(i, list))
False
If you don't check whether i is a list, then you'll get an error like below. The accepted answer is wrong, because it gives this error.
>>> any(8 in i for i in a)
Traceback (most recent call last):
File "<pyshell#3>", line 1, in <module>
any(8 in i for i in a)
File "<pyshell#3>", line 1, in <genexpr>
any(8 in i for i in a)
TypeError: argument of type 'int' is not iterable
I think this type of situation is where we can take some inspiration from functional programming by delegating the evaluation of the boolean expression to its own function. This way, if you need to change the behaviour of your bool condition, you only need to change that function definition!
Let's say you want to check sublists AND also int that happen to be in the top level. We can define a function that returns a boolean when performing a comparison on a single list element:
def elem(a, b):
    '''
    Defines if an object b matches a.
    '''
    return (isinstance(b, int) and a == b) or (isinstance(b, list) and a in b)
Note here that this function says nothing about our list - the argument b in our usage is just a single element within the list, but we could just as easily call it just to compare two values. Now we have the following:
>>> a = [[1,2],[3,4],[5,6],7,8,9]
>>> any(elem(2, i) for i in a)
True
>>> any(elem(8, i) for i in a)
True
>>> any(elem(10, i) for i in a)
False
Bingo! Another benefit of this type of definition is that it allows you to partially apply functions, and gives you the ability to assign names to searches for only one type of number:
>>> from functools import partial
>>> contains2 = partial(elem, 2)
>>> any(map(contains2, a))
True
>>> b = [[1],[3,4],[5,6],7,8,9]
>>> any(map(contains2, b))
False
In my opinion, this makes for more readable code at the cost of a bit of boilerplate and the need to know what map does - since you can make your variable names sensical rather than a jungle of temporary list comprehension variables. I don't particularly care if the functional approach is less Pythonic - Python is a multiparadigm language and I think it looks better this way, plain and simple. But that's personal choice - it's up to you.
Now let's say that our situation has changed and we now want to check only the sublists - it's not enough for an occurrence in a top level. That's okay because now all we need to change is our definition of elem. Let's see:
def elem(a, b):
    return isinstance(b, list) and a in b
We've just removed the possibility of a match in the case that b is a top level int! If we run this now:
>>> a = [[1,2],[3,4],[5,6],7,8,9,"a",["b","c"]]
>>> any(elem(2, i) for i in a)
True
>>> any(elem(8, i) for i in a)
False
I'll illustrate one final example that really drives home how powerful this type of definition is. Suppose we have a list of arbitrarily deeply nested lists of integers. How do we check if an integer is in any of the levels?
We can take a recursive approach - and it doesn't take much modification at all:
def elem(a, b):
    return (isinstance(b, int) and a == b) or \
        (isinstance(b, list) and any(map(partial(elem, a), b)))
Because we've used this recursive definition that is defined to act on a single element, all the previous lines of code used still work:
>>> d = [1, [2, [3, [4, 5]]]]
>>> any(elem(1, i) for i in d)
True
>>> any(elem(4, i) for i in d)
True
>>> any(elem(10, i) for i in d)
False
>>> any(map(contains2, d))
True
Of course, given that this function is now recursive, we can really just call it directly:
>>> elem(4, d)
True
But the point remains that this modular approach has allowed us to alter the functionality by only changing the definition of elem without touching our main script, which means fewer TypeErrors and quicker refactoring.
I don't think there's any way of doing the test without a loop of some kind.
Here's a function that uses a straightforward for loop to explicitly check for an object within a sublist:
def sublist_contains(lst, obj):
    for item in lst:
        try:
            if obj in item:
                return True
        except TypeError:
            pass
    return False
Of course, that doesn't test if the object is in the top level list, nor will it work if there is more than one level of nesting. Here's a more general solution using recursion, which puts the loop in a generator expression that's passed to the built-in function any:
def nested_contains(lst, obj):
    return any(item == obj or
               isinstance(item, list) and nested_contains(item, obj)
               for item in lst)
A simple way to do this is:
a = [[1,2],[3,4],[5,6],7,8,9]
# Skip the top-level integers; 2 in 7 would raise a TypeError
result = [2 in i for i in a if isinstance(i, list)]
True in result  # --> True
