How does Python convert data types?

As the title says, how does converting between data types work? For example, in Python, when you take an integer or floating-point number and turn it into a string, what goes on behind the scenes to do this kind of conversion? My hypothesis was that it reads the actual bytes, then goes into memory and makes a new variable that's a string.

I am sharing an idea that you can extend further:
There is no fixed limit on how long an integer value can be; it is bounded only by the amount of memory your system has, so an integer can be as long as you need it to be.
A prefix can be prepended to an integer literal to indicate its base (0x for hexadecimal, 0b for binary):
Code : print(0x10)
Output : 16
Code : print(0b10)
Output : 2
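
To tie this back to the question, here is a minimal sketch, assuming Python 3 syntax (the variable names are only illustrative). It suggests that a conversion does not reinterpret the raw bytes of the number; str() and int() build a new object from the value, and the original object is left untouched.

```python
# Converting a number to a string formats its value as text in a NEW object.
n = 16
s = str(n)               # decimal digits of the value
f_text = str(2.5)        # same idea for a float

# Going the other way, int() parses digits in the base you name.
from_hex = int("10", 16)
from_bin = int("10", 2)
from_float_text = float("2.5")

print(type(n).__name__, type(s).__name__)   # the int is unchanged; s is a separate str
print(s, f_text, from_hex, from_bin, from_float_text)
```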

Related

Big Binary Code into File in Python

I have been working on a program and have been trying to take a large binary value (held as a string of bits) and pack it into a file. I have tried for days to make this work. Here is the code I wrote to pack the large binary string.
binaryRecieved="11001010101....(Shortened)"
f=open(fileName,'wb')
m=long(binaryRecieved,2)
struct.pack('i',m)
f.write(struct.pack('i',m))
f.close()
quit()
I am left with the error
struct.pack('i',x)
struct.error: integer out of range for 'i' format code
My integer is out of range, so I was wondering if there is a different way of going about this.
Thanks

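Convert your bit string to a byte string: see, for example, the question Converting bits to bytes in Python. Then write the bytes to the file. Note that struct.pack('c', ...) packs only a single one-byte value, so for a whole byte string either write it directly or use a counted format such as '%ds' % len(bytestring).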

For encoding m in big-endian order (like "ten" being written as "10" in normal decimal use) use:
def as_big_endian_bytes(i):
    out=bytearray()
    while i:
        out.append(i&0xff)
        i=i>>8
    out.reverse()
    return out
For encoding m in little-endian order (like "ten" being written as "01" in normal decimal use) use:
def as_little_endian_bytes(i):
    out=bytearray()
    while i:
        out.append(i&0xff)
        i=i>>8
    return out
Both functions work on numbers, as you do in your question, so the returned bytearray may be shorter than expected (because leading zeroes do not matter for numbers).
For an exact representation of a binary-digit string (which is only possible if its length is divisible by 8) you would have to do:
def as_bytes(s):
    assert len(s)%8==0
    out=bytearray()
    # step through every 8-bit group, including the last one
    for i in range(0,len(s),8):
        out.append(int(s[i:i+8],2))
    return out
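
As an alternative to the loops above, here is a hedged sketch assuming Python 3, where int.to_bytes exists. The helper name bits_to_bytes is hypothetical and not part of the original answer. Passing the length explicitly keeps the leading zero bytes that a plain number would otherwise lose.

```python
def bits_to_bytes(bits):
    """Convert a string of '0'/'1' characters to bytes, keeping leading zeros.

    Hypothetical helper for illustration; assumes Python 3.
    """
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    if not bits:
        return b""
    # int(bits, 2) parses the digits; to_bytes pads to the exact byte count.
    return int(bits, 2).to_bytes(len(bits) // 8, byteorder="big")


packed = bits_to_bytes("0000000111111111")
print(packed)
```

The result can be written straight to a file opened in 'wb' mode with f.write(packed); no struct call is needed for this case.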

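In struct.pack you have used 'i', which represents a fixed-size C int (typically 4 bytes), so its range is limited. As your code states, your value is much longer than that. Note that 'd' is not a safe substitute for 'i': it packs the value as a double-precision float, which keeps only about 53 bits of precision, so a long bit string would not round-trip exactly. 'q' gives you 8 bytes; for arbitrary lengths, convert the value to bytes first.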
See Python struct for more information.
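
The size limits can be checked directly. This is a small sketch assuming Python 3 and the standard-size little-endian formats ('<i', '<q', '<d'); the variable names are illustrative only.

```python
import struct

# With the '<' prefix, 'i' is always 4 bytes and 'q' is always 8 bytes.
int_size = struct.calcsize("<i")
long_long_size = struct.calcsize("<q")

# A value just past the signed 32-bit range triggers the error from the question.
try:
    struct.pack("<i", 2**31)
    overflowed = False
except struct.error:
    overflowed = True

# 'q' holds larger values, but it is still a fixed size.
packed_q = struct.pack("<q", 2**40)

# 'd' silently rounds large integers, so the exact bits are lost.
big = 2**60 + 1
roundtrip = int(struct.unpack("<d", struct.pack("<d", big))[0])

print(int_size, long_long_size, overflowed, len(packed_q), roundtrip == big)
```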

How does Python know which number type to use in order to Multiply arbitrary two numbers?

In C, I have to choose a proper type, such as int, float, or long, for simple arithmetic like multiplying two numbers. Otherwise, it will give me an incorrect answer.
But Python basically gives me the correct answer automatically.
I have tried to debug a simple 987*456 calculation to see the source code.
I set a breakpoint at that line in PyCharm, but I cannot step into the source code; it just finishes right away.
How can I see the source code? Is it possible? Or how does Python do that multiplication?
I mean, how does Python handle the different number types in the results of
98*76 or 987654321*123457789? Does Python detect some out-of-range error and try another number type?

I mean, how does Python handle the different number types in the results of 98*76 or 987654321*123457789? Does Python detect some out-of-range error and try another number type?
Pretty much. In CPython 2, the source code for integer multiplication can be found in intobject.c. It multiplies the integers as C longs, then casts the operands to doubles and multiplies those. If the results are close, the long multiplication didn't overflow. If the results are very different, it switches to Python longs, which use a bignum representation.
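
The check described above can be imitated in pure Python. This is only a rough sketch under stated assumptions: it is written for Python 3, the function name multiply_like_python2 is hypothetical, the wraparound of a C long is emulated with a bit mask, and the tolerance factor of 32 is my recollection of the CPython 2 source rather than something confirmed in this thread.

```python
def multiply_like_python2(a, b, bits=64):
    """Return (product, overflowed), imitating the described overflow check.

    Hypothetical illustration; `bits` stands in for the width of a C long.
    """
    # Emulate signed C multiplication that wraps around on overflow.
    wrapped = (a * b) & ((1 << bits) - 1)
    if wrapped >= 1 << (bits - 1):
        wrapped -= 1 << bits

    # Multiply again as doubles; this is approximate but never wraps.
    approx = float(a) * float(b)
    if float(wrapped) == approx:
        return wrapped, False

    # A small difference is rounding noise; a large one means overflow.
    # The factor 32 is an assumption (see the lead-in).
    if 32.0 * abs(float(wrapped) - approx) <= abs(approx):
        return wrapped, False

    # Fall back to arbitrary-precision arithmetic.
    return a * b, True


print(multiply_like_python2(98, 76))
print(multiply_like_python2(987654321, 123457789, bits=32))
```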

The type promotion for mixed arithmetic is:
integer -> long -> float
The narrower type is converted to the wider type, and the multiplication is carried out.
https://docs.python.org/2/library/stdtypes.html#numeric-types-int-float-long-complex
Some examples to see what happens:
987*456 = 450072
987*456L = 450072L
987*456.0 = 450072.0
I hope I understood your question.
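
For comparison, here is the same experiment as a short sketch assuming Python 3, where the separate long type and the L suffix no longer exist, so the chain is effectively int -> float -> complex.

```python
# The narrower operand is converted to the wider type before multiplying.
results = [
    987 * 456,          # int * int
    987 * 456.0,        # int * float
    987 * (456 + 0j),   # int * complex
]

for value in results:
    print(type(value).__name__, value)
```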

Variables are nothing but reserved memory locations to store values. This means that when you create a variable you reserve some space in memory.
Based on the data type of a variable, the interpreter allocates memory and decides what can be stored in the reserved memory. Therefore, by assigning different data types to variables, you can store integers, decimals or characters in these variables.
Python variables do not have to be explicitly declared to reserve memory space. The declaration happens automatically when you assign a value to a variable. The equal sign (=) is used to assign values to variables.
The operand to the left of the = operator is the name of the variable and the operand to the right of the = operator is the value stored in the variable.

Python : Do not include "L" at the end of the outcome for : randint(100000000000000000000, 999999999999999999999)

So far this is what I found:
from random import randint
randint(100000000000000000000, 999999999999999999999)
the output is:
922106555361958347898L
but I do not want that L there.
I can only use this as an int if there is no "L" at the end of it.
UPDATE
Would it be a better idea to generate two small numbers and then combine them, if the goal is simply to have a random number that is 30 digits long?

The reason there's an L there is because this is too large to fit into an int,* so it's a long. See Numeric Types — int, long, float, complex in the docs for more details.
So, why do you get that L, and how do you get rid of it?
Python has two different ways to turn a value into a string representation:
repr either returns the canonical source-code representation, or something like <__main__.Eggs at 0x10fbd8908>.
str returns a human-friendly representation.
For example, for strings, str('abc') is abc, while repr('abc') is 'abc'.
And for longs, str(1L) is 1, while repr(1L) is 1L.
When you just type an expression at the interactive prompt, it uses repr. But when you use the print statement, it uses str. So, if you want to see the value without the L, just print it:
print randint(100000000000000000000, 999999999999999999999)
If you want to, e.g., save the string in a variable or write it to a file, you have to call str explicitly.
But if you just want to use it as a number, you don't have to worry about this at all; it's a number, and int and long values can be intermixed freely (as of Python 2.3 or so).
And if you're trying to store it in a MySQL database, whichever MySQL interface you use won't care whether you're giving it int values or long, as long as they fit into the column type.**
Or you could upgrade to Python 3.x, where there is no separate long type anymore (all integers are int, no matter how big) and no L suffix.
* The exact cutoff isn't documented anywhere, but at least for CPython, it's whatever fits into a C long on your platform. So, on most 64-bit platforms, the max value is (1<<63)-1; on the other 64-bit platforms, and all 32-bit platforms, it's (1<<31)-1. You can see for yourself on your platform by printing sys.maxint. At any rate, your number takes 70 bits, so unless someone ports Python 2.x to a platform with 128-bit C longs, it won't fit.
** Note that your values are too big to fit into even a MySQL BIGINT, so you're going to be using either DECIMAL or NUMERIC. Depending on which interface you're using, and how you've set things up, you may have to convert to and from strings manually. But you can do that with the str and int functions, without worrying about which values fit into the int type and which don't.
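
The str/repr distinction is easy to try out. This is a minimal sketch assuming Python 3, where the L suffix is gone, so str and repr agree for integers but still differ for strings. The value is the one from the question.

```python
text = "abc"
print(str(text))    # human-friendly form, no quotes
print(repr(text))   # source-code form, with quotes

big = 922106555361958347898
print(str(big), repr(big))   # identical in Python 3
print(type(big).__name__)    # a plain int, however large
print(big.bit_length())      # number of bits the value needs

# Converting to a string and back gives the same number.
assert int(str(big)) == big
```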

If you're on the interactive prompt, explicitly print the value. The repr of the value has an L, but the str of the value doesn't.
>>> 922106555361958347898
922106555361958347898L
>>> print 922106555361958347898
922106555361958347898

The output in the REPL has an L suffixed; if you print the value, it is not displayed.
>>> from random import randint
>>> print randint(100000000000000000000, 999999999999999999999)
106315199286113607384
>>>

Full Hexadecimal Representation in Python?

I've been writing an RSA implementation in Python and have now successfully encrypted a message with it; however, when it prints out the cipher, every time it comes out as '0x1L'. I believe this is a long number, but I do not know how to show the full number. If my code is required, I will post it in the comments section (it is quite long).
Thanks

Python's representation of your result as 0x1L indicates the number's type and value - however, it doesn't truncate the number at all. While it is stored as a long internally, its value is still just 1 in this case.
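
A quick way to convince yourself that hex does not truncate is sketched below, assuming Python 3 (so there is no L suffix in the output). The values are illustrative and unrelated to the asker's RSA code.

```python
small = 1
large = 2**70

# hex() always shows every digit of the value.
small_hex = hex(small)
large_hex = hex(large)

# format() gives the same digits without the 0x prefix.
large_digits = format(large, "x")

print(small_hex, large_hex, large_digits)
```

If the output is always 0x1, the computed value really is 1, which points at the arithmetic rather than at the printing.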

floats inside tuples changing values when accessed

So I have a list of tuples of two floats each. Each tuple represents a range. I am going through another list of floats which represent values to be fit into the ranges. All of these floats are < 1 but positive, so precision matters. One of my tests to determine if a value fits into a range is failing when it should pass. If I print the value and the range that is causing problems I can tell this much:
curValue = 0.00145000000671
range = (0.0014500000067055225, 0.0020968749796738849)
The conditional that is failing is:
if curValue > range[0] and ... blah :
# do some stuff
From the values given by curValue and range, the test should clearly pass (don't worry about what is in the conditional). Now, if I print explicitly what the value of range[0] is I get:
range[0] = 0.00145000000671
Which would explain why the test is failing. So my question, then, is: why is the float changing when it is accessed? It has decimal digits available up to a certain precision when part of a tuple, and a different precision when accessed. Why would this be? What can I do to ensure my data maintains a consistent amount of precision across my calculations?

The float doesn't change. The built-in numeric types are all immutable. The cause of what you're observing is that:
print range[0] uses str on the float, which (up until very recent versions of Python) printed fewer digits of a float.
Printing a tuple (be it with repr or str) uses repr on the individual items, which gives a much more accurate representation (again, this isn't true anymore in recent releases, which use a better algorithm for both).
As for why the condition doesn't work out the way you expect, it's probably the usual culprit: the limited precision of floats. Try print repr(curValue), repr(range[0]) to see what Python decided was the closest possible representation of each float.
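
The effect can be reproduced in a small sketch, assuming Python 3. Since Python 3 prints floats in full, the old 12-significant-digit str output is imitated here with format(x, ".12g"); that format is my assumption about what Python 2 used. The variable names are illustrative.

```python
low = 0.0014500000067055225   # range[0] from the question
other = 0.00145000000671      # a different float that looks the same when shortened

# Shortened to 12 significant digits, the two values are indistinguishable.
short_low = format(low, ".12g")
short_other = format(other, ".12g")
print(short_low, short_other)

# repr shows enough digits to tell them apart.
print(repr(low), repr(other))

# The stored values differ, so comparisons between them are well defined.
print(low == other, other > low)
```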

Floats have limited precision. Even if you enter pi as a constant to 100 decimals, only the first 15 to 17 significant digits are kept. The same is happening to you. This is because Python floats are normally IEEE 754 doubles, which have 53 bits of mantissa; that limits your precision (and in unexpected ways, because it's in base 2).
Please note, 0.00145000000671 isn't the exact value as stored by Python. Python only displays a few decimals of the complete stored float if you use print. If you want to see exactly how Python stores the float, use repr.
If you want better precision use the decimal module.
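
Here is a brief sketch of what the decimal module buys you, assuming Python 3. The values are built from strings so that no binary rounding happens before Decimal sees them; the variable names are illustrative.

```python
from decimal import Decimal

# Binary floats cannot represent 0.1 or 0.2 exactly.
float_sum_matches = (0.1 + 0.2 == 0.3)

# Decimal stores the decimal digits exactly as written.
decimal_sum_matches = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

# Range checks then behave the way the printed digits suggest.
low = Decimal("0.0014500000067055225")
value = Decimal("0.00145000000671")
in_range = value > low

print(float_sum_matches, decimal_sum_matches, in_range)
```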

It isn't changing per se. Python is doing its best to store the data as a float, but that number is too precise for float, so Python modifies it before it is even accessed (in the very process of storing it). Funny how something so small is such a big pain.
You need to use an arbitrary-precision or fixed-point module such as Simple Python Fixed Point or the decimal module.

Not sure it would work in this case, because I don't know if Python's limiting in the output or in the storage itself, but you could try doing:
if curValue - range[0] > 0 and...
