Python C-API: Using `PySequence_Length` with dictionaries

Python C-API: Using `PySequence_Length` with dictionaries - python

I'm trying to use PySequence_Length to get the length of a Python dictionary in C. I realize I can use PyDict_Size, but I'm interested in using a more generic function in certain contexts.
PyObject* d = PyDict_New();
Py_ssize_t res = PySequence_Length(d);
printf("Result : %ld\n", res);
if (res == -1) PyErr_Print();
This fails, and prints the error:
TypeError: object of type 'dict' has no len()
My question is: why does this fail? Although Python dictionary objects don't support the Sequence protocol, the documentation for PySequence_Length says:
Py_ssize_t PySequence_Length(PyObject *o)
Returns the number of objects in sequence o on success, and -1 on
failure. For objects that do not provide sequence protocol, this is
equivalent to the Python expression len(o).
Since a dictionary type does have a __len__ attribute, and since the expression len(d) (where d is a dictionary) properly returns the length in Python, I don't understand why PySequence_Length should fail in C.
Any explanation? Is the documentation incorrect here?

The documentation is misleading, yes. A dict is not a sequence, even though it does implement some parts of the sequence protocol (for containment tests, which are part of the sequence protocol.) This particular distinction in the Python/C types API is unfortunate, but it's an artifact of a design choice made decades ago. The documentation reflects that distinction, albeit in an equally awkward way. What it tries to say is that for Python classes it's the same thing as len(o), regardless of what the Python class pretends to be. For C types, if the type does not implement the sequence version of the sizefunc, PySequence_Length() will raise an exception without even considering whether the type has the mapping version of the sizefunc.
If you are not entirely sure whether you have a sequence or not, you should use PyObject_Size() instead. In fact, there's very little reason to call PySequence_Length(); normally you either know the type (because you just created it, and you can call a type-specific length function or macro like PyList_GET_SIZE()) or you don't even know if it'll be a sequence.

Related

I'm trying to validate a python class property, but it isn't raising the error I tried to raise. How should I correct this? [duplicate]

How do I check if an object is of a given type, or if it inherits from a given type?
How do I check if the object o is of type str?
Beginners often wrongly expect the string to already be "a number" - either expecting Python 3.x input to convert type, or expecting that a string like '1' is also simultaneously an integer. This is the wrong canonical for those questions. Please carefully read the question and then use How do I check if a string represents a number (float or int)?, How can I read inputs as numbers? and/or Asking the user for input until they give a valid response as appropriate.

Use isinstance to check if o is an instance of str or any subclass of str:
if isinstance(o, str):
To check if the type of o is exactly str, excluding subclasses of str:
if type(o) is str:
See Built-in Functions in the Python Library Reference for relevant information.
Checking for strings in Python 2
For Python 2, this is a better way to check if o is a string:
if isinstance(o, basestring):
because this will also catch Unicode strings. unicode is not a subclass of str; both str and unicode are subclasses of basestring. In Python 3, basestring no longer exists since there's a strict separation of strings (str) and binary data (bytes).
Alternatively, isinstance accepts a tuple of classes. This will return True if o is an instance of any subclass of any of (str, unicode):
if isinstance(o, (str, unicode)):

The most Pythonic way to check the type of an object is... not to check it.
Since Python encourages Duck Typing, you should just try...except to use the object's methods the way you want to use them. So if your function is looking for a writable file object, don't check that it's a subclass of file, just try to use its .write() method!
Of course, sometimes these nice abstractions break down and isinstance(obj, cls) is what you need. But use sparingly.

isinstance(o, str) will return True if o is an str or is of a type that inherits from str.
type(o) is str will return True if and only if o is a str. It will return False if o is of a type that inherits from str.

After the question was asked and answered, type hints were added to Python. Type hints in Python allow types to be checked but in a very different way from statically typed languages. Type hints in Python associate the expected types of arguments with functions as runtime accessible data associated with functions and this allows for types to be checked. Example of type hint syntax:
def foo(i: int):
return i
foo(5)
foo('oops')
In this case we want an error to be triggered for foo('oops') since the annotated type of the argument is int. The added type hint does not cause an error to occur when the script is run normally. However, it adds attributes to the function describing the expected types that other programs can query and use to check for type errors.
One of these other programs that can be used to find the type error is mypy:
mypy script.py
script.py:12: error: Argument 1 to "foo" has incompatible type "str"; expected "int"
(You might need to install mypy from your package manager. I don't think it comes with CPython but seems to have some level of "officialness".)
Type checking this way is different from type checking in statically typed compiled languages. Because types are dynamic in Python, type checking must be done at runtime, which imposes a cost -- even on correct programs -- if we insist that it happen at every chance. Explicit type checks may also be more restrictive than needed and cause unnecessary errors (e.g. does the argument really need to be of exactly list type or is anything iterable sufficient?).
The upside of explicit type checking is that it can catch errors earlier and give clearer error messages than duck typing. The exact requirements of a duck type can only be expressed with external documentation (hopefully it's thorough and accurate) and errors from incompatible types can occur far from where they originate.
Python's type hints are meant to offer a compromise where types can be specified and checked but there is no additional cost during usual code execution.
The typing package offers type variables that can be used in type hints to express needed behaviors without requiring particular types. For example, it includes variables such as Iterable and Callable for hints to specify the need for any type with those behaviors.
While type hints are the most Pythonic way to check types, it's often even more Pythonic to not check types at all and rely on duck typing. Type hints are relatively new and the jury is still out on when they're the most Pythonic solution. A relatively uncontroversial but very general comparison: Type hints provide a form of documentation that can be enforced, allow code to generate earlier and easier to understand errors, can catch errors that duck typing can't, and can be checked statically (in an unusual sense but it's still outside of runtime). On the other hand, duck typing has been the Pythonic way for a long time, doesn't impose the cognitive overhead of static typing, is less verbose, and will accept all viable types and then some.

In Python 3.10, you can use | in isinstance:
>>> isinstance('1223', int | str)
True
>>> isinstance('abcd', int | str)
True

isinstance(o, str)
Link to docs

You can check for type of a variable using __name__ of a type.
Ex:
>>> a = [1,2,3,4]
>>> b = 1
>>> type(a).__name__
'list'
>>> type(a).__name__ == 'list'
True
>>> type(b).__name__ == 'list'
False
>>> type(b).__name__
'int'

For more complex type validations I like typeguard's approach of validating based on python type hint annotations:
from typeguard import check_type
from typing import List
try:
check_type('mylist', [1, 2], List[int])
except TypeError as e:
print(e)
You can perform very complex validations in very clean and readable fashion.
check_type('foo', [1, 3.14], List[Union[int, float]])
# vs
isinstance(foo, list) and all(isinstance(a, (int, float)) for a in foo)

I think the cool thing about using a dynamic language like Python is you really shouldn't have to check something like that.
I would just call the required methods on your object and catch an AttributeError. Later on this will allow you to call your methods with other (seemingly unrelated) objects to accomplish different tasks, such as mocking an object for testing.
I've used this a lot when getting data off the web with urllib2.urlopen() which returns a file like object. This can in turn can be passed to almost any method that reads from a file, because it implements the same read() method as a real file.
But I'm sure there is a time and place for using isinstance(), otherwise it probably wouldn't be there :)

The accepted answer answers the question in that it provides the answers to the asked questions.
Q: What is the best way to check whether a given object is of a given type? How about checking whether the object inherits from a given type?
A: Use isinstance, issubclass, type to check based on types.
As other answers and comments are quick to point out however, there's a lot more to the idea of "type-checking" than that in python. Since the addition of Python 3 and type hints, much has changed as well. Below, I go over some of the difficulties with type checking, duck typing, and exception handling. For those that think type checking isn't what is needed (it usually isn't, but we're here), I also point out how type hints can be used instead.
Type Checking
Type checking is not always an appropriate thing to do in python. Consider the following example:
def sum(nums):
"""Expect an iterable of integers and return the sum."""
result = 0
for n in nums:
result += n
return result
To check if the input is an iterable of integers, we run into a major issue. The only way to check if every element is an integer would be to loop through to check each element. But if we loop through the entire iterator, then there will be nothing left for intended code. We have two options in this kind of situation.
Check as we loop.
Check beforehand but store everything as we check.
Option 1 has the downside of complicating our code, especially if we need to perform similar checks in many places. It forces us to move type checking from the top of the function to everywhere we use the iterable in our code.
Option 2 has the obvious downside that it destroys the entire purpose of iterators. The entire point is to not store the data because we shouldn't need to.
One might also think that checking if checking all of the elements is too much then perhaps we can just check if the input itself is of the type iterable, but there isn't actually any iterable base class. Any type implementing __iter__ is iterable.
Exception Handling and Duck Typing
An alternative approach would be to forgo type checking altogether and focus on exception handling and duck typing instead. That is to say, wrap your code in a try-except block and catch any errors that occur. Alternatively, don't do anything and let exceptions rise naturally from your code.
Here's one way to go about catching an exception.
def sum(nums):
"""Try to catch exceptions?"""
try:
result = 0
for n in nums:
result += n
return result
except TypeError as e:
print(e)
Compared to the options before, this is certainly better. We're checking as we run the code. If there's a TypeError anywhere, we'll know. We don't have to place a check everywhere that we loop through the input. And we don't have to store the input as we iterate over it.
Furthermore, this approach enables duck typing. Rather than checking for specific types, we have moved to checking for specific behaviors and look for when the input fails to behave as expected (in this case, looping through nums and being able to add n).
However, the exact reasons which make exception handling nice can also be their downfall.
A float isn't an int, but it satisfies the behavioral requirements to work.
It is also bad practice to wrap the entire code with a try-except block.
At first these may not seem like issues, but here's some reasons that may change your mind.
A user can no longer expect our function to return an int as intended. This may break code elsewhere.
Since exceptions can come from a wide variety of sources, using the try-except on the whole code block may end up catching exceptions you didn't intend to. We only wanted to check if nums was iterable and had integer elements.
Ideally we'd like to catch exceptions our code generators and raise, in their place, more informative exceptions. It's not fun when an exception is raised from someone else's code with no explanation other than a line you didn't write and that some TypeError occured.
In order to fix the exception handling in response to the above points, our code would then become this... abomination.
def sum(nums):
"""
Try to catch all of our exceptions only.
Re-raise them with more specific details.
"""
result = 0
try:
iter(nums)
except TypeError as e:
raise TypeError("nums must be iterable")
for n in nums:
try:
result += int(n)
except TypeError as e:
raise TypeError("stopped mid iteration since a non-integer was found")
return result
You can kinda see where this is going. The more we try to "properly" check things, the worse our code is looking. Compared to the original code, this isn't readable at all.
We could argue perhaps this is a bit extreme. But on the other hand, this is only a very simple example. In practice, your code is probably much more complicated than this.
Type Hints
We've seen what happens when we try to modify our small example to "enable type checking". Rather than focusing on trying to force specific types, type hinting allows for a way to make types clear to users.
from typing import Iterable
def sum(nums: Iterable[int]) -> int:
result = 0
for n in nums:
result += n
return result
Here are some advantages to using type-hints.
The code actually looks good now!
Static type analysis may be performed by your editor if you use type hints!
They are stored on the function/class, making them dynamically usable e.g. typeguard and dataclasses.
They show up for functions when using help(...).
No need to sanity check if your input type is right based on a description or worse lack thereof.
You can "type" hint based on structure e.g. "does it have this attribute?" without requiring subclassing by the user.
The downside to type hinting?
Type hints are nothing more than syntax and special text on their own. It isn't the same as type checking.
In other words, it doesn't actually answer the question because it doesn't provide type checking. Regardless, however, if you are here for type checking, then you should be type hinting as well. Of course, if you've come to the conclusion that type checking isn't actually necessary but you want some semblance of typing, then type hints are for you.

To Hugo:
You probably mean list rather than array, but that points to the whole problem with type checking - you don't want to know if the object in question is a list, you want to know if it's some kind of sequence or if it's a single object. So try to use it like a sequence.
Say you want to add the object to an existing sequence, or if it's a sequence of objects, add them all
try:
my_sequence.extend(o)
except TypeError:
my_sequence.append(o)
One trick with this is if you are working with strings and/or sequences of strings - that's tricky, as a string is often thought of as a single object, but it's also a sequence of characters. Worse than that, as it's really a sequence of single-length strings.
I usually choose to design my API so that it only accepts either a single value or a sequence - it makes things easier. It's not hard to put a [ ] around your single value when you pass it in if need be.
(Though this can cause errors with strings, as they do look like (are) sequences.)

If you have to check for the type of str or int please use instanceof. As already mentioned by others the explanation is to also include sub classes. One important example for sub classes from my perspective are Enums with data type like IntEnum or StrEnum. Which are a pretty nice way to define related constants. However, it is kind of annoying if libraries do not accept those as such types.
Example:
import enum
class MyEnum(str, enum.Enum):
A = "a"
B = "b"
print(f"is string: {isinstance(MyEnum.A, str)}") # True
print(f"is string: {type(MyEnum.A) == str}") # False!!!
print(f"is string: {type(MyEnum.A.value) == str}") # True

In Python, you can use the built-in isinstance() function to check if an object is of a given type, or if it inherits from a given type.
To check if the object o is of type str, you would use the following code:
if isinstance(o, str):
# o is of type str
You can also use type() function to check the object type.
if type(o) == str:
# o is of type str
You can also check if the object is a sub class of a particular class using issubclass() function.
if issubclass(type(o),str):
# o is sub class of str

A simple way to check type is to compare it with something whose type you know.
>>> a = 1
>>> type(a) == type(1)
True
>>> b = 'abc'
>>> type(b) == type('')
True

I think the best way is to typing well your variables. You can do this by using the "typing" library.
Example:
from typing import NewType
UserId = NewType ('UserId', int)
some_id = UserId (524313`)
See https://docs.python.org/3/library/typing.html.

different behaviour of unhashable typeError between cpp and python

It is ok to insert a vector to a set in cpp, which not works in python however. Codes shown below:
// OK in cpp
set<vector<int>> cpp_set;
cpp_set.insert(vector<int>{1,2,3});
// not OK in python
py_set = set()
py_set.add([1,2,3]) # TypeError: unhashable type: 'list'
py_set.add({1,2,3}) # TypeError: unhashable type: 'set'
Even adding a set to a set is not allowed, and I don't understand why a set is unhashable.
Why does the difference happens? And is it a permitted good way to use in cpp? I mean adding a vector to a set.

Python's set is a hash table, and has no hashing functions for set and list.
std::set is not a hash table but a sequence ordered by an ordering relation (std::less by default).
You can use std::set with any type where you can define a strict weak ordering.
Try std::unordered_set in C++ and you will encounter problems.

Elements of a container modeling a mathematical set should always be immutable, regardless of language, which is important for the guarantees/invariants that are typically associated with such a type. The difference between Python and C++ here is that Python requires an immutable type up front, while C++ just uses const to make one immutable.
Notes:
The elements of a C++ set<T> are actually T const.
Of course, C++ has const_cast, so it's not as immutable as it may seem.
Storing an element in a set creates a copy. Modifying the original doesn't change the copy.
For a C++ hash set (std::unordered_set), you have the additional requirement that you must be able to hash the elements. The hash of an element must remain the same over time as well, which is why it has similar limitations.

Defining typed dictionaries in Python [duplicate]

How do I check if an object is of a given type, or if it inherits from a given type?
How do I check if the object o is of type str?
Beginners often wrongly expect the string to already be "a number" - either expecting Python 3.x input to convert type, or expecting that a string like '1' is also simultaneously an integer. This is the wrong canonical for those questions. Please carefully read the question and then use How do I check if a string represents a number (float or int)?, How can I read inputs as numbers? and/or Asking the user for input until they give a valid response as appropriate.

Use isinstance to check if o is an instance of str or any subclass of str:
if isinstance(o, str):
To check if the type of o is exactly str, excluding subclasses of str:
if type(o) is str:
See Built-in Functions in the Python Library Reference for relevant information.
Checking for strings in Python 2
For Python 2, this is a better way to check if o is a string:
if isinstance(o, basestring):
because this will also catch Unicode strings. unicode is not a subclass of str; both str and unicode are subclasses of basestring. In Python 3, basestring no longer exists since there's a strict separation of strings (str) and binary data (bytes).
Alternatively, isinstance accepts a tuple of classes. This will return True if o is an instance of any subclass of any of (str, unicode):
if isinstance(o, (str, unicode)):

The most Pythonic way to check the type of an object is... not to check it.
Since Python encourages Duck Typing, you should just try...except to use the object's methods the way you want to use them. So if your function is looking for a writable file object, don't check that it's a subclass of file, just try to use its .write() method!
Of course, sometimes these nice abstractions break down and isinstance(obj, cls) is what you need. But use sparingly.

isinstance(o, str) will return True if o is an str or is of a type that inherits from str.
type(o) is str will return True if and only if o is a str. It will return False if o is of a type that inherits from str.

After the question was asked and answered, type hints were added to Python. Type hints in Python allow types to be checked but in a very different way from statically typed languages. Type hints in Python associate the expected types of arguments with functions as runtime accessible data associated with functions and this allows for types to be checked. Example of type hint syntax:
def foo(i: int):
return i
foo(5)
foo('oops')
In this case we want an error to be triggered for foo('oops') since the annotated type of the argument is int. The added type hint does not cause an error to occur when the script is run normally. However, it adds attributes to the function describing the expected types that other programs can query and use to check for type errors.
One of these other programs that can be used to find the type error is mypy:
mypy script.py
script.py:12: error: Argument 1 to "foo" has incompatible type "str"; expected "int"
(You might need to install mypy from your package manager. I don't think it comes with CPython but seems to have some level of "officialness".)
Type checking this way is different from type checking in statically typed compiled languages. Because types are dynamic in Python, type checking must be done at runtime, which imposes a cost -- even on correct programs -- if we insist that it happen at every chance. Explicit type checks may also be more restrictive than needed and cause unnecessary errors (e.g. does the argument really need to be of exactly list type or is anything iterable sufficient?).
The upside of explicit type checking is that it can catch errors earlier and give clearer error messages than duck typing. The exact requirements of a duck type can only be expressed with external documentation (hopefully it's thorough and accurate) and errors from incompatible types can occur far from where they originate.
Python's type hints are meant to offer a compromise where types can be specified and checked but there is no additional cost during usual code execution.
The typing package offers type variables that can be used in type hints to express needed behaviors without requiring particular types. For example, it includes variables such as Iterable and Callable for hints to specify the need for any type with those behaviors.
While type hints are the most Pythonic way to check types, it's often even more Pythonic to not check types at all and rely on duck typing. Type hints are relatively new and the jury is still out on when they're the most Pythonic solution. A relatively uncontroversial but very general comparison: Type hints provide a form of documentation that can be enforced, allow code to generate earlier and easier to understand errors, can catch errors that duck typing can't, and can be checked statically (in an unusual sense but it's still outside of runtime). On the other hand, duck typing has been the Pythonic way for a long time, doesn't impose the cognitive overhead of static typing, is less verbose, and will accept all viable types and then some.

In Python 3.10, you can use | in isinstance:
>>> isinstance('1223', int | str)
True
>>> isinstance('abcd', int | str)
True

isinstance(o, str)
Link to docs

You can check for type of a variable using __name__ of a type.
Ex:
>>> a = [1,2,3,4]
>>> b = 1
>>> type(a).__name__
'list'
>>> type(a).__name__ == 'list'
True
>>> type(b).__name__ == 'list'
False
>>> type(b).__name__
'int'

For more complex type validations I like typeguard's approach of validating based on python type hint annotations:
from typeguard import check_type
from typing import List
try:
check_type('mylist', [1, 2], List[int])
except TypeError as e:
print(e)
You can perform very complex validations in very clean and readable fashion.
check_type('foo', [1, 3.14], List[Union[int, float]])
# vs
isinstance(foo, list) and all(isinstance(a, (int, float)) for a in foo)

I think the cool thing about using a dynamic language like Python is you really shouldn't have to check something like that.
I would just call the required methods on your object and catch an AttributeError. Later on this will allow you to call your methods with other (seemingly unrelated) objects to accomplish different tasks, such as mocking an object for testing.
I've used this a lot when getting data off the web with urllib2.urlopen() which returns a file like object. This can in turn can be passed to almost any method that reads from a file, because it implements the same read() method as a real file.
But I'm sure there is a time and place for using isinstance(), otherwise it probably wouldn't be there :)

The accepted answer answers the question in that it provides the answers to the asked questions.
Q: What is the best way to check whether a given object is of a given type? How about checking whether the object inherits from a given type?
A: Use isinstance, issubclass, type to check based on types.
As other answers and comments are quick to point out however, there's a lot more to the idea of "type-checking" than that in python. Since the addition of Python 3 and type hints, much has changed as well. Below, I go over some of the difficulties with type checking, duck typing, and exception handling. For those that think type checking isn't what is needed (it usually isn't, but we're here), I also point out how type hints can be used instead.
Type Checking
Type checking is not always an appropriate thing to do in python. Consider the following example:
def sum(nums):
"""Expect an iterable of integers and return the sum."""
result = 0
for n in nums:
result += n
return result
To check if the input is an iterable of integers, we run into a major issue. The only way to check if every element is an integer would be to loop through to check each element. But if we loop through the entire iterator, then there will be nothing left for intended code. We have two options in this kind of situation.
Check as we loop.
Check beforehand but store everything as we check.
Option 1 has the downside of complicating our code, especially if we need to perform similar checks in many places. It forces us to move type checking from the top of the function to everywhere we use the iterable in our code.
Option 2 has the obvious downside that it destroys the entire purpose of iterators. The entire point is to not store the data because we shouldn't need to.
One might also think that checking if checking all of the elements is too much then perhaps we can just check if the input itself is of the type iterable, but there isn't actually any iterable base class. Any type implementing __iter__ is iterable.
Exception Handling and Duck Typing
An alternative approach would be to forgo type checking altogether and focus on exception handling and duck typing instead. That is to say, wrap your code in a try-except block and catch any errors that occur. Alternatively, don't do anything and let exceptions rise naturally from your code.
Here's one way to go about catching an exception.
def sum(nums):
"""Try to catch exceptions?"""
try:
result = 0
for n in nums:
result += n
return result
except TypeError as e:
print(e)
Compared to the options before, this is certainly better. We're checking as we run the code. If there's a TypeError anywhere, we'll know. We don't have to place a check everywhere that we loop through the input. And we don't have to store the input as we iterate over it.
Furthermore, this approach enables duck typing. Rather than checking for specific types, we have moved to checking for specific behaviors and look for when the input fails to behave as expected (in this case, looping through nums and being able to add n).
However, the exact reasons which make exception handling nice can also be their downfall.
A float isn't an int, but it satisfies the behavioral requirements to work.
It is also bad practice to wrap the entire code with a try-except block.
At first these may not seem like issues, but here's some reasons that may change your mind.
A user can no longer expect our function to return an int as intended. This may break code elsewhere.
Since exceptions can come from a wide variety of sources, using the try-except on the whole code block may end up catching exceptions you didn't intend to. We only wanted to check if nums was iterable and had integer elements.
Ideally we'd like to catch exceptions our code generators and raise, in their place, more informative exceptions. It's not fun when an exception is raised from someone else's code with no explanation other than a line you didn't write and that some TypeError occured.
In order to fix the exception handling in response to the above points, our code would then become this... abomination.
def sum(nums):
"""
Try to catch all of our exceptions only.
Re-raise them with more specific details.
"""
result = 0
try:
iter(nums)
except TypeError as e:
raise TypeError("nums must be iterable")
for n in nums:
try:
result += int(n)
except TypeError as e:
raise TypeError("stopped mid iteration since a non-integer was found")
return result
You can kinda see where this is going. The more we try to "properly" check things, the worse our code is looking. Compared to the original code, this isn't readable at all.
We could argue perhaps this is a bit extreme. But on the other hand, this is only a very simple example. In practice, your code is probably much more complicated than this.
Type Hints
We've seen what happens when we try to modify our small example to "enable type checking". Rather than focusing on trying to force specific types, type hinting allows for a way to make types clear to users.
from typing import Iterable
def sum(nums: Iterable[int]) -> int:
result = 0
for n in nums:
result += n
return result
Here are some advantages to using type-hints.
The code actually looks good now!
Static type analysis may be performed by your editor if you use type hints!
They are stored on the function/class, making them dynamically usable e.g. typeguard and dataclasses.
They show up for functions when using help(...).
No need to sanity check if your input type is right based on a description or worse lack thereof.
You can "type" hint based on structure e.g. "does it have this attribute?" without requiring subclassing by the user.
The downside to type hinting?
Type hints are nothing more than syntax and special text on their own. It isn't the same as type checking.
In other words, it doesn't actually answer the question because it doesn't provide type checking. Regardless, however, if you are here for type checking, then you should be type hinting as well. Of course, if you've come to the conclusion that type checking isn't actually necessary but you want some semblance of typing, then type hints are for you.

To Hugo:
You probably mean list rather than array, but that points to the whole problem with type checking - you don't want to know if the object in question is a list, you want to know if it's some kind of sequence or if it's a single object. So try to use it like a sequence.
Say you want to add the object to an existing sequence, or if it's a sequence of objects, add them all
try:
my_sequence.extend(o)
except TypeError:
my_sequence.append(o)
One trick with this is if you are working with strings and/or sequences of strings - that's tricky, as a string is often thought of as a single object, but it's also a sequence of characters. Worse than that, as it's really a sequence of single-length strings.
I usually choose to design my API so that it only accepts either a single value or a sequence - it makes things easier. It's not hard to put a [ ] around your single value when you pass it in if need be.
(Though this can cause errors with strings, as they do look like (are) sequences.)

If you have to check for the type of str or int please use instanceof. As already mentioned by others the explanation is to also include sub classes. One important example for sub classes from my perspective are Enums with data type like IntEnum or StrEnum. Which are a pretty nice way to define related constants. However, it is kind of annoying if libraries do not accept those as such types.
Example:
import enum
class MyEnum(str, enum.Enum):
A = "a"
B = "b"
print(f"is string: {isinstance(MyEnum.A, str)}") # True
print(f"is string: {type(MyEnum.A) == str}") # False!!!
print(f"is string: {type(MyEnum.A.value) == str}") # True

In Python, you can use the built-in isinstance() function to check if an object is of a given type, or if it inherits from a given type.
To check if the object o is of type str, you would use the following code:
if isinstance(o, str):
# o is of type str
You can also use type() function to check the object type.
if type(o) == str:
# o is of type str
You can also check if the object is a sub class of a particular class using issubclass() function.
if issubclass(type(o),str):
# o is sub class of str

A simple way to check type is to compare it with something whose type you know.
>>> a = 1
>>> type(a) == type(1)
True
>>> b = 'abc'
>>> type(b) == type('')
True

I think the best way is to typing well your variables. You can do this by using the "typing" library.
Example:
from typing import NewType
UserId = NewType ('UserId', int)
some_id = UserId (524313`)
See https://docs.python.org/3/library/typing.html.

What is Python's sequence protocol?

Python does a lot with magic methods and most of these are part of some protocol. I am familiar with the "iterator protocol" and the "number protocol" but recently stumbled over the term "sequence protocol". But even after some research I'm not exactly sure what the "sequence protocol" is.
For example the C API function PySequence_Check checks (according to the documentation) if some object implements the "sequence protocol". The source code indicates that this is a class that's not a dict but implements a __getitem__ method which is roughly identical to what the documentation on iter also states:
[...]must support the sequence protocol (the __getitem__() method with integer arguments starting at 0).[...]
But the requirement to start with 0 isn't something that's "implemented" in PySequence_Check.
Then there is also the collections.abc.Sequence type, which basically says the instance has to implement __reversed__, __contains__, __iter__ and __len__.
But by that definition a class implementing the "sequence protocol" isn't necessarily a Sequence, for example the "data model" and the abstract class guarantee that a sequence has a length. But a class just implementing __getitem__ (passing the PySequence_Check) throws an exception when using len(an_instance_of_that_class).
Could someone please clarify for me the difference between a sequence and the sequence protocol (if there's a definition for the protocol besides reading the source code) and when to use which definition?

It's not really consistent.
Here's PySequence_Check:
int
PySequence_Check(PyObject *s)
{
if (PyDict_Check(s))
return 0;
return s != NULL && s->ob_type->tp_as_sequence &&
s->ob_type->tp_as_sequence->sq_item != NULL;
}
PySequence_Check checks if an object provides the C sequence protocol, implemented through a tp_as_sequence member in the PyTypeObject representing the object's type. This tp_as_sequence member is a pointer to a struct containing a bunch of functions for sequence behavior, such as sq_item for item retrieval by numeric index and sq_ass_item for item assignment.
Specifically, PySequence_Check requires that its argument is not a dict, and that it provides sq_item.
Types with a __getitem__ written in Python will provide sq_item regardless of whether they're conceptually sequences or mappings, so a mapping written in Python that doesn't inherit from dict will pass PySequence_Check.
On the other hand, collections.abc.Sequence only checks whether an object concretely inherits from collections.abc.Sequence or whether its class (or a superclass) is explicitly registered with collections.abc.Sequence. If you just implement a sequence yourself without doing either of those things, it won't pass isinstance(your_sequence, Sequence). Also, most classes registered with collections.abc.Sequence don't support all of collections.abc.Sequence's methods. Overall, collections.abc.Sequence is a lot less reliable than people commonly expect it to be.
As for what counts as a sequence in practice, it's usually anything that supports __len__ and __getitem__ with integer indexes starting at 0 and isn't a mapping. If the docs for a function say it takes any sequence, that's almost always all it needs. Unfortunately, "isn't a mapping" is hard to test for, for reasons similar to how "is a sequence" is hard to pin down.

For a type to be in accordance with the sequence protocol, these 4 conditions must be met:
Retrieve elements by index
item = seq[index]
Find items by value
index = seq.index(item)
Count items
num = seq.count(item)
Produce a reversed sequence
r = reversed(seq)

How to use a .NET method which modifies in place in Python?

I am trying to use a .NET dll in Python. In a .NET language the method requires passing it 2 arrays by reference which it then modifies:
public void GetItems(
out int[] itemIDs,
out string[] itemNames
)
How can I use this method in Python using the Python for .NET module?
Edit: Forgot to mention this is in CPython not IronPython.
Additional info.
When I do the following:
itemIDs = []
itemNames = []
GetItems(itemIDs, itemNames)
I get an output like:
(None, <System.Int32[] at 0x43466c0>, <System.String[] at 0x43461c0>)
Do I just need to figure out how to convert these back into python types?

PythonNet doesn't document this quite as clearly as IronPython, but it does almost the same thing.
So, let's look at the IronPython documentation for ref and out parameters:
The Python language passes all arguments by-value. There is no syntax to indicate that an argument should be passed by-reference like there is in .NET languages like C# and VB.NET via the ref and out keywords. IronPython supports two ways of passing ref or out arguments to a method, an implicit way and an explicit way.
In the implicit way, an argument is passed normally to the method call, and its (potentially) updated value is returned from the method call along with the normal return value (if any). This composes well with the Python feature of multiple return values…
In the explicit way, you can pass an instance of clr.Reference[T] for the ref or out argument, and its Value field will get set by the call. The explicit way is useful if there are multiple overloads with ref parameters…
There are examples for both. But to tailor it to your specific case:
itemIDs, itemNames = GetItems()
Or, if you really want:
itemIDsRef = clr.Reference[Array[int]]()
itemNamesRef = clr.Reference[Array[String]]()
GetItems(itemIDs, itemNames)
itemIDs, itemNames = itemIDsRef.Value, itemNamesRef.Value
CPython using PythonNet does basically the same thing. The easy way to do out parameters is to not pass them and accept them as extra return values, and for ref parameters to pass the input values as arguments and accept the output values as extra return values. Just like IronPython's implicit solution. (Except that a void function with ref or out parameters always returns None before the ref or out arguments, even if it wouldn't in IronPython.) You can figure it out pretty easily by inspecting the return values. So, in your case:
_, itemIDs, itemNames = GetItems()
Meanwhile, the fact that these happen to be arrays doesn't make things any harder. As the docs explain, PythonNet provides the iterable interface for all IEnumerable collections, and the sequence protocol as well for Array. So, you can do this:
for itemID, itemName in zip(itemIDs, itemNames):
print itemID, itemName
And the Int32 and String objects will be converted to native int/long and str/unicode objects just as if they were returned directly.
If you really want to explicitly convert these to native values, you can. map or a list comprehension will give you a Python list from any iterable, including a PythonNet wrapper around an Array or other IEnumerable. And you can explicitly make a long or unicode out of an Int32 or String if you need to. So:
itemIDs = map(int, itemIDs)
itemNames = map(unicode, itemNames)
But I don't see much advantage to doing this, unless you need to, e.g., pre-check all the values before using any of them.

I have managed to use the method
bool XferData(ref byte[] buf, ref int len) from C# library CyUSB.dll
with the following code:
>>> xferLen = 2;
>>> outData=[10, 0]
>>> inData=[]
>>> n, outData, xferLen = XferData(outData, xferLen)
>>> print n, outData[0], outData[1], xferLen
True 10 0 2
Hope this helps someone.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Python C-API: Using `PySequence_Length` with dictionaries - python

Related

I'm trying to validate a python class property, but it isn't raising the error I tried to raise. How should I correct this? [duplicate]

different behaviour of unhashable typeError between cpp and python

Defining typed dictionaries in Python [duplicate]

What is Python's sequence protocol?

How to use a .NET method which modifies in place in Python?

Categories

Resources