I've seen np.int0 used for converting bounding box floating point values to int in OpenCV problems.
What exactly is np.int0?
I've seen np.uint8, np.int32, etc. I can't seem to find np.int0 in any online documentation. What kind of int does this cast arguments to?
int0 is an alias for intp; this, in turn, is
Integer used for indexing (same as C ssize_t; normally either int32 or int64)
-- Numpy docs: basic types
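A quick way to confirm the alias on your own build (a sketch that assumes a NumPy release old enough to still ship the 0-suffixed aliases; recent releases deprecate and remove them):

import numpy as np

print(np.int0 is np.intp)          # True
print(np.dtype(np.intp).itemsize)  # 8 on a 64-bit build, 4 on a 32-bit one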
On a 64-bit platform it works out to be an alias for int64 (on a 32-bit platform it would be int32); try this from either Python 2 or Python 3:
>>> import numpy
>>> numpy.int0 is numpy.int64
True
Here is some more information:
# get complete list of datatype names
>>> np.sctypeDict.keys()
# returns the default integer type (here `int64`)
>>> np.sctypeDict['int0']
<class 'numpy.int64'>
# verify
>>> arr = np.int0([1, 2, 3])
>>> arr.nbytes
24
Related
I need to create a very large numpy array that will hold non-negative integer values. I know in advance what the largest integer will be, so I want to try to use the smallest datatype possible. So far I have the following:
>>> import numpy as np
>>> def minimal_type(max_val, types=[np.uint8, np.uint16, np.uint32, np.uint64]):
...     '''Finds the minimal data type needed to correctly store the given max_val.
...     Returns None if none of the provided types are sufficient.
...     '''
...     for t in types:
...         if max_val <= np.iinfo(t).max:
...             return t
...     return None
...
>>> print(minimal_type(42))
<class 'numpy.uint8'>
>>> print(minimal_type(255))
<class 'numpy.uint8'>
>>> print(minimal_type(256))
<class 'numpy.uint16'>
>>> print(minimal_type(4200000000))
<class 'numpy.uint32'>
>>>
Is there a numpy builtin way to achieve this functionality?
It's numpy.min_scalar_type. Examples from the docs:
>>> np.min_scalar_type(10)
dtype('uint8')
>>> np.min_scalar_type(-260)
dtype('int16')
>>> np.min_scalar_type(3.1)
dtype('float16')
>>> np.min_scalar_type(1e50)
dtype('float64')
>>> np.min_scalar_type(np.arange(4,dtype='f8'))
dtype('float64')
You might not be interested in the behavior for floats, but I'm including it anyway for other people who come across the question, particularly since the use of float16 and the lack of float->int demotion might be surprising.
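For the use case in the question, the result can be fed straight into an array constructor; a minimal sketch, with max_val and n made up for illustration:

import numpy as np

max_val = 4200000000
n = 10**6
arr = np.zeros(n, dtype=np.min_scalar_type(max_val))
print(arr.dtype)  # uint32, since max_val is a non-negative Python int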
Not to be confused with the inverse task, which is covered plenty elsewhere.
I am looking for something like np.dtype(7.7) == np.float.
The motivation is to be able to handle any array-like input just like numpy itself. To construct the output or temporary data, I sometimes want to use the input type if possible.
Edit: Maybe that was a bad (too specific) example; I know that np.float happens to be just an alias for the builtin float. I was thinking more along the following lines.
myInput = something
# required to have a homogeneous data type in the documentation of my function;
# maybe constrained to float, int, string, lists of same length and type,
# but I would like to handle as much of that as numpy itself can handle
numInput = len(myInput)
numOutput = numInput // 2  # just for example
myOutput = np.empty(shape=(numOutput), dtype=???)
for i in range(numOutput):
    myOutput[i] = whatever  # maybe just a copy, hence the same data type
numpy.float is just the regular Python float type (the alias was deprecated in NumPy 1.20 and removed in 1.24). It's not a NumPy dtype, and it's almost certainly not what you need:
>>> import numpy
>>> numpy.float is float
True
If you want the dtype NumPy would coerce your scalar to, just make an array and get its dtype:
>>> numpy.array(7.7).dtype
dtype('float64')
If you want the type NumPy uses for scalars of this dtype, access the dtype's type attribute:
>>> numpy.array(7.7).dtype.type
<class 'numpy.float64'>
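Applied to the pattern sketched in the question, that might look like the following (a hedged sketch; myInput and the element-wise copy are stand-ins for whatever the function really does):

import numpy as np

myInput = [1.5, 2.5, 3.5, 4.5]
arr = np.asarray(myInput)             # let NumPy infer a dtype for the input
myOutput = np.empty(len(arr) // 2, dtype=arr.dtype)
for i in range(len(myOutput)):
    myOutput[i] = arr[i]              # just a copy, hence the same dtype
print(myOutput.dtype)                 # float64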
You could simply use np.float64(original_float), or whatever numpy type you wish to convert your variable to.
For the record, this code works:
import numpy as np

val = 7.7
if isinstance(val, float):
    val = np.float64(val)
if isinstance(val, np.float64):
    print("Success!")  # prints: Success!
Hope this helps.
Edit:
I just saw @user2357112 supports Monica's comment on your question, and it's important to note that np.float effectively acts the same way as float. The implementation I provided is oriented towards specific numpy types like np.float32 or np.float64, the one I used in the test code. But if I performed the same test with just np.float, this would be the result:
import numpy as np

val = 7.7
if isinstance(val, float):
    if isinstance(val, np.float):
        print("Success!")  # prints: Success!
Thus proving that, from the interpreter's point of view, float and np.float are exactly the same type.
I am trying to convert a numpy array of 64 bit integers into an array of standard python integers (i.e., variables of type int).
In my naive thinking I believed that np.int64 represents the 64 bit integer and int represents the standard python integer but that doesn't seem to be correct:
import numpy as np

a = np.arange(2)
print(a.dtype)     # int64
print(type(a[0]))  # <class 'numpy.int64'>

b = a.astype(int)
print(b.dtype)     # int64
print(type(b[0]))  # <class 'numpy.int64'>

c = a.astype('int')
print(c.dtype)     # int64
print(type(c[0]))  # <class 'numpy.int64'>
One thing that works of course is:
d = a.tolist()
print(type(d[0])) # int
Is it possible to have a numpy array with numbers of type int or does numpy require the variables to be of its equivalent np.int datatypes?
This is merely a repost of the comments to close the question.
The built-in int isn't a valid dtype. It will be converted to
whatever np.int_ is, which may be np.int64 or np.int32 depending
on your platform. You can, but you must use dtype=object, which
essentially removes the advantages of numpy to give you a Python list.
-- juanpa.arrivillaga
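A small sketch illustrating that comment (the variable names are mine):

import numpy as np

a = np.arange(2)
e = a.astype(object)   # dtype=object boxes each element as a Python int
print(e.dtype)         # object
print(type(e[0]))      # <class 'int'>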
I'm looking at a third-party lib that has the following if-test:
if isinstance(xx_, numpy.ndarray) and xx_.dtype is numpy.float64 and xx_.flags.contiguous:
    xx_[:] = ctypes.cast(xx_.ctypes._as_parameter_, ctypes.POINTER(ctypes.c_double))
It appears that xx_.dtype is numpy.float64 always fails:
>>> xx_ = numpy.zeros(8, dtype=numpy.float64)
>>> xx_.dtype is numpy.float64
False
What is the correct way to test that the dtype of a numpy array is float64 ?
This is a bug in the lib.
dtype objects can be constructed dynamically, and NumPy does so all the time. There's no guarantee anywhere that they're interned, so constructing a dtype that already exists won't necessarily give you back the same object.
On top of that, np.float64 isn't actually a dtype; it's a… I don't know what these types are called, but the types used to construct scalar objects out of array bytes, which are usually found in the type attribute of a dtype, so I'm going to call it a dtype.type. (Note that np.float64 subclasses both NumPy's numeric tower types and Python's numeric tower ABCs, while np.dtype of course doesn't.)
Normally, you can use these interchangeably; when you use a dtype.type—or, for that matter, a native Python numeric type—where a dtype was expected, a dtype is constructed on the fly (which, again, is not guaranteed to be interned), but of course that doesn't mean they're identical:
>>> np.float64 == np.dtype(np.float64) == np.dtype('float64')
True
>>> np.float64 == np.dtype(np.float64).type
True
The dtype.type usually will be identical if you're using builtin types:
>>> np.float64 is np.dtype(np.float64).type
True
But two dtypes are often not:
>>> np.dtype(np.float64) is np.dtype('float64')
False
But again, none of that is guaranteed. (Also, note that np.float64 and float use the exact same storage, but are separate types. And of course you can also make a dtype('f8'), which is guaranteed to work the same as dtype(np.float64), but that doesn't mean 'f8' is, or even ==, np.float64.)
So, it's possible that constructing an array by explicitly passing np.float64 as its dtype argument will mean you get back the same instance when you check the dtype.type attribute, but that isn't guaranteed. And if you pass np.dtype('float64'), or you ask NumPy to infer it from the data, or you pass a dtype string for it to parse like 'f8', etc., it's even less likely to match. More importantly, you definitely will not get np.float64 back as the dtype itself.
So, how should it be fixed?
Well, the docs define what it means for two dtypes to be equal, and that's a useful thing, and I think it's probably the useful thing you're looking for here. So, just replace the is with ==:
if isinstance(xx_, numpy.ndarray) and xx_.dtype == numpy.float64 and xx_.flags.contiguous:
However, to some extent I'm only guessing that's what you're looking for. (The fact that it's checking the contiguous flag implies that it's probably going to go right into the internal storage… but then why isn't it checking C vs. Fortran order, or byte order, or anything else?)
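As a quick illustration, dtype equality is spelling-insensitive where identity is not:

import numpy as np

print(np.dtype('f8') == np.float64)  # True
print(np.dtype('f8') == 'float64')   # True
print(np.dtype('f8') is np.float64)  # False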
Try:
x = np.zeros(8, dtype=np.float64)
print(x.dtype is np.dtype(np.float64))
is tests for the identity of two objects, i.e. whether they have the same id(). It is used, for example, to test is None, but it can give misleading results when testing integers or strings. In this case there's a further problem: x.dtype and np.float64 are not even the same class.
isinstance(x.dtype, np.dtype) # True
isinstance(np.float64, np.dtype) # False
x.dtype.__class__ # numpy.dtype
np.float64.__class__ # type
np.float64 is actually a class: calling np.float64() produces 0.0, while x.dtype() raises an error.
In my interactive tests:
x.dtype is np.dtype(np.float64)
returns True. But I don't know if that's universally the case, or just the result of some sort of local caching. The dtype documentation mentions a dtype attribute:
dtype.num A unique number for each of the 21 different built-in types.
Both dtypes give 12 for this num.
x.dtype == np.float64
tests True.
Also, using type works:
x.dtype.type is np.float64 # True
When I import ctypes and do the cast (with your xx_) I get an error:
ValueError: setting an array element with a sequence.
I don't know enough of ctypes to understand what it is trying to do. It looks like it is doing a type conversion of the data pointer of xx_; xx_.ctypes._as_parameter_ is the same number as xx_.__array_interface__['data'][0].
In the numpy test code I find these dtype tests:
issubclass(arr.dtype.type, (nt.integer, nt.bool_))
assert_(dat.dtype.type is np.float64)
assert_equal(A.dtype.type, np.unicode_)
assert_equal(r['col1'].dtype.kind, 'i')
numpy documentation also talks about
np.issubdtype(x.dtype, np.float64)
np.issubsctype(x, np.float64)
both of which use issubclass.
Further tracing of the C code suggests that x.dtype == np.float64 is evaluated as:
x.dtype.num == np.dtype(np.float64).num
That is, the scalar type is converted to a dtype, and the .num attributes are compared. The code is in scalarapi.c, descriptor.c, and multiarraymodule.c under numpy/core/src/multiarray.
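A quick check of what that boils down to (a sketch; 12 is the num reported for float64 above):

import numpy as np

x = np.zeros(8, dtype=np.float64)
print(x.dtype.num)                              # 12
print(np.dtype(np.float64).num)                 # 12
print(x.dtype.num == np.dtype(np.float64).num)  # True, what == reduces to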
I'm not sure when this API was introduced, but at least as of 2022 it looks like you can use numpy.issubdtype for the type checking part and therefore write:
if isinstance(arr, numpy.ndarray) and numpy.issubdtype(arr.dtype, numpy.floating):
...
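Note that numpy.floating matches any float width; if the code really requires float64 specifically, pass that instead. A short sketch of the difference:

import numpy as np

arr32 = np.zeros(3, dtype=np.float32)
arr64 = np.zeros(3, dtype=np.float64)
print(np.issubdtype(arr32.dtype, np.floating))  # True
print(np.issubdtype(arr32.dtype, np.float64))   # False
print(np.issubdtype(arr64.dtype, np.float64))   # True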
I'm using Numpy and Python. I need to copy data, WITHOUT numeric conversion between np.uint64 and np.float64, e.g. 1.5 <-> 0x3ff8000000000000.
I'm aware of float.hex, but the output format is a long way from uint64:
In [30]: a=1.5
In [31]: float.hex(a)
Out[31]: '0x1.8000000000000p+0'
I'm also aware of various string input routines for the other direction.
Can anybody suggest more direct methods? After all, it's just a simple copy and type change, but Python/NumPy seem really rigid about converting the data on the way.
Use an intermediate array and the frombuffer method to "cast" one array type into the other:
>>> v = 1.5
>>> fa = np.array([v], dtype='float64')
>>> ua = np.frombuffer(fa, dtype='uint64')
>>> ua[0]
4609434218613702656 # 0x3ff8000000000000
Since frombuffer creates a view into the original buffer, this is efficient even for reinterpreting data in large arrays.
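The same trick runs in reverse to recover the float (a short sketch):

import numpy as np

ua = np.array([0x3ff8000000000000], dtype='uint64')
fa = np.frombuffer(ua, dtype='float64')  # reinterpret the same 8 bytes
print(fa[0])  # 1.5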
So, what you need is to see the 8 bytes that represent the float64 in memory as an integer number (rendering that int64 as a hexadecimal string is a separate matter; it is just a representation).

The Struct and Union functionality that comes bundled with the stdlib's ctypes may be nice for you, with no need for numpy. ctypes has a Union type that works much like C language unions and allows you to do this:
>>> import ctypes
>>> class Conv(ctypes.Union):
...     _fields_ = [("float", ctypes.c_double), ("int", ctypes.c_uint64)]
...
>>> c = Conv()
>>> c.float = 1.5
>>> print(hex(c.int))
0x3ff8000000000000
The built-in "hex" function is a way to get the hexadecimal representation of the number.
You can use the struct module as well: pack the number to bytes as a double, and unpack those bytes as an int. I think it is both less readable and less efficient than using a ctypes Union:
>>> import struct
>>> hex(struct.unpack("<Q", struct.pack("<d", 1.5))[0])
'0x3ff8000000000000'
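And the same struct trick in reverse, from the integer bits back to the float (a short sketch):

import struct

bits = 0x3ff8000000000000
print(struct.unpack("<d", struct.pack("<Q", bits))[0])  # 1.5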
Since you are using numpy, however, you can simply change the array's dtype "on the fly" and manipulate the whole array as integers with zero copying:
>>> import numpy
>>> x = numpy.array((1.5,), dtype=numpy.double)
>>> x[0]
1.5
>>> x.dtype = numpy.dtype("uint64")
>>> x[0]
4609434218613702656
>>> hex(x[0])
'0x3ff8000000000000'
This is by far the most efficient way of doing it, whatever your purpose in getting at the raw bytes of the float64 numbers.
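If you would rather not mutate x in place, ndarray.view gives the same zero-copy reinterpretation; a minimal sketch:

import numpy as np

x = np.array((1.5,), dtype=np.double)
u = x.view(np.uint64)         # zero-copy view of the same 8 bytes
print(hex(u[0]))              # 0x3ff8000000000000
print(u.view(np.float64)[0])  # 1.5, round-tripped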