Python: Convert string to instance

I am new to Python and I have been stuck for hours with this problem... I don't know how to convert a variable (type string) to another variable (type instance).
>>> from Crypto.PublicKey import RSA
>>> from Crypto import Random
>>> randomValue = Random.new().read
>>> priv = RSA.generate(512, randomValue)
After these lines of code, "priv" is created, and this has type "instance".
And I had to convert this "priv" to type string using str(priv).
>>> convertedToStr = str(priv)
>>> type(convertedToStr)
<type 'str'>
Now, I need to convert it back to 'instance' and want to get the same thing in value and type as the original "priv". Assume that I cannot use "priv" anymore, and I need to convert "convertedToStr" (type string) into "convertedToStr" (type instance).
Is this ever possible?
Note: The reason I am doing this complex thing is that I have client and server sides, and when one side sends a message to the other using sendall(var), it does not allow me to send a variable of type 'instance'. So I had to convert it to a string before sending it. Now I want to use it on the receiver side as a variable of type 'instance', but I do not know how to convert it back.

The instance type is used for instances of old-style classes in Python 2. You may want to look at priv.__class__ instead of type(priv) to find out what class it actually has. I expect you'll find that its class is Crypto.PublicKey.RSA._RSAObject, since that's what the generate function is documented to return.
I don't have the Crypto package installed, so I don't actually know what string you get when you call str on a private key instance. You might be able to parse the string and then call the function Crypto.PublicKey.RSA.construct with appropriate values to reconstruct the key object.
But I think that is doing more work than necessary. Instead of calling str on the key, you should instead call its exportKey method. Then, after you send the string you get back to the other system, you can pass it to Crypto.PublicKey.RSA.importKey.
Note that sending a private key over a network may expose it to eavesdropping, making it useless! You probably shouldn't do it unless the connection between your two systems is encrypted with some other system. Your system is only as secure as its weakest link.

The instance type is nothing specific. You can make a custom class and instantiate it, and it will have type instance:
>>> class x:
...     y = 1
...
>>> type(x())
<type 'instance'>
You can't convert arbitrary things to a string by calling str() and be guaranteed useful results; it merely asks the object to return a string, which could say anything at all. In this case you asked for a 512-bit RSA private key, and the str() output is only about 45 characters long, which is nowhere near enough information to get the full object state back.
The general problem you're trying to solve is serialization/deserialization, and it's the topic of many modules, libraries and protocols - but luckily RSA keys are easy to convert to useful text and back again (not all objects are).
>>> out = priv.exportKey()
>>> new = RSA.importKey(out)
>>> new == priv
True
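The same export/import pattern applies to objects you define yourself. As a minimal sketch in Python 3, assuming a hypothetical Point class and JSON as the wire format (neither comes from the question), a round trip over bytes might look like this:

```python
import json


class Point:
    """Hypothetical stand-in for an object that must cross a socket."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def export(self):
        # Serialize the full object state to bytes, suitable for sendall().
        return json.dumps({"x": self.x, "y": self.y}).encode("utf-8")

    @classmethod
    def import_(cls, data):
        # Rebuild an equivalent object from the bytes produced by export().
        fields = json.loads(data.decode("utf-8"))
        return cls(fields["x"], fields["y"])


original = Point(3, 4)
wire = original.export()        # what the sender would pass to sendall()
restored = Point.import_(wire)  # what the receiver would reconstruct
```

Unlike str(), the exported bytes are defined to carry everything needed to rebuild the object, which is the role exportKey and importKey play for RSA keys.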
NB. When I tried your code, it complained that 512-bit keys are weak and refused to generate them, insisting on 1024 bits or more. You are possibly on an older version, but you should specify a longer key length.

Related

Namespacing issue in Python

I am using a program where Python is the native scripting language. Unfortunately, it has a native function with a parameter named bytes. This causes a problem when I try to use the actual bytes built-in, because Python thinks I am referencing that parameter instead. Here is what I mean; one object has the following built-in code:
def receive(row, table, message, bytes):
    # This is defined in the GUI
So, row, table, message, and bytes are all passed in as arguments, effectively overwriting the name bytes. So if I were to say bytes(something).decode() I get a TypeError: 'bytes' object is not callable
Is there any way to get out of this jam?

Use a different name for the fourth parameter (if you can change the signature of the function):
def receive(row, table, message, bytes_):
    # This is defined in the GUI

Your problem is similar to this one. Just use from builtins import bytes as _bytes; this will let you do _bytes(something).decode().
Renaming the fourth argument is the better solution, though.

Dealing with ctypes and ASCII strings when porting Python 2 code to Python 3

I got fed up last night and started porting PyVISA to Python 3 (progress here: https://github.com/thevorpalblade/pyvisa).
I've gotten it to the point where everything works, as long as I pass device addresses (well, any string really) as an ASCII string rather than the default unicode string. For example,
HP = vida.instrument(b"GPIB::16") works, whereas
HP = vida.instrument("GPIB::16") does not, raising a ValueError.
Ideally, the end user should not have to care about string encoding.
Any suggestions as to how I should approach this? Something in the ctypes type definitions perhaps?
As it stands, the relevant ctypes type definition is:
ViString = _ctypes.c_char_p

ctypes, like most things in Python 3, intentionally doesn't automatically convert between unicode and bytes. That's because in most use cases, that would just be asking for the same kind of mojibake or UnicodeEncodeError disasters that people switched to Python 3 to avoid.
However, when you know you're only dealing with pure ASCII, that's another story. You have to be explicit—but you can factor out that explicitness into a wrapper.
As explained in Specifying the required argument types (function prototypes), in addition to a standard ctypes type, you can pass any class that has a from_param classmethod—which normally returns an instance of some type (usually the same type) with an _as_parameter_ attribute, but can also just return a native ctypes-type value instead.
class Asciifier(object):
    @classmethod
    def from_param(cls, value):
        if isinstance(value, bytes):
            return value
        else:
            return value.encode('ascii')
This may not be the exact rule you want. For example, it'll fail on bytearray (just as c_char_p will) even though that could be converted quietly to bytes… but then you wouldn't want to implicitly convert an int to bytes. Anyway, whatever rule you decide on should be easy to code.
Here's an example (on OS X; you'll obviously have to change how libc is loaded for linux, Windows, etc., but you presumably know how to do that):
>>> libc = CDLL('libSystem.dylib')
>>> libc.atoi.argtypes = [Asciifier]
>>> libc.atoi.restype = c_int
>>> libc.atoi(b'123')
123
>>> libc.atoi('123')
123
>>> libc.atoi('１２３') # Unicode fullwidth digits
ArgumentError: argument 1: <class 'UnicodeEncodeError'>: 'ascii' codec can't encode character '\uff10' in position 0: ordinal not in range(128)
>>> libc.atoi(123)
ArgumentError: argument 1: <class 'AttributeError'>: 'int' object has no attribute 'encode'
Obviously you can catch the exception and raise a different one if those aren't clear enough for your use case.
You can similarly write a Utf8ifier, or an Encodifier(encoding, errors=None) class factory, or whatever else you need for some particular library and stick it in the argtypes the same way.
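A minimal sketch of such a class factory follows. The factory body and the generated class name are assumptions for illustration, and the default for errors is 'strict' rather than None because str.encode does not accept None for that argument.

```python
def Encodifier(encoding, errors="strict"):
    """Build an argtypes-compatible class that encodes str with `encoding`."""

    class _Encoder(object):
        @classmethod
        def from_param(cls, value):
            # bytes pass through untouched; anything else must offer .encode()
            if isinstance(value, bytes):
                return value
            return value.encode(encoding, errors)

    # Purely cosmetic: give the generated class a recognizable name.
    _Encoder.__name__ = "Encodifier_" + encoding.replace("-", "_")
    return _Encoder


Utf8ifier = Encodifier("utf-8")
Asciifier = Encodifier("ascii")

encoded = Utf8ifier.from_param("caf\u00e9")
```

Either generated class can then be placed in a function's argtypes list in the same way as the hand-written Asciifier.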
If you also want to auto-decode return types, see Return types and errcheck.
One last thing: When you're sure the data are supposed to be UTF-8, but you want to deal with the case where they aren't in the same way Python 2.x would (by preserving them as-is), you can even do that in 3.x. Use the aforementioned Utf8ifier as your argtype, and a decoder errcheck, and use errors='surrogateescape'. See here for a complete example.

Creating a customized language using Python

I have started playing with Sage recently, and I've come to suspect that the standard Python int is wrapped in a customized class called Integer in Sage. If I type in type(1) in Python, I get <type 'int'>, however, if I type in the same thing in the sage prompt I get <type 'sage.rings.integer.Integer'>.
If I wanted to replace Python int (or list or dict) with my own custom class, how might it be done? How difficult would it be (e.g. could I do it entirely in Python)?

As an addendum to the other answers: when running any code, Sage has a preprocessing step which converts the Sage-Python to true Python (which is then executed). This is done by the preparse function, e.g.
sage: preparse('a = 1')
'a = Integer(1)'
sage: preparse('2^40')
'Integer(2)**Integer(40)'
sage: preparse('F.<x> = PolynomialRing(ZZ)')
"F = PolynomialRing(ZZ, names=('x',)); (x,) = F._first_ngens(1)"
This step is precisely what allows the transparent use of Integers (in place of ints) and the other non-standard syntax (like the polynomial ring example above and [a..b] etc).
As far as I understand, this is the only way to completely transparently use replacements for the built-in types in Python.

You are able to subclass all of Python's built-in types. For example:
class MyInt(int):
    pass
i = MyInt(2)
# i is now an instance of MyInt, but will still behave entirely like an integer.
However, you need to explicitly construct each integer as a MyInt. type(1) will still be int; to get MyInt you need type(MyInt(1)).
Hopefully that's close to what you're looking for.

In the case of Sage, it's easy. Sage has complete control of its own REPL (read-evaluate-print loop), so it can parse the commands you give it and make the parts of your expression into whatever classes it wants. It is not so easy to have standard Python automatically use your integer type for integer literals, however. Simply reassigning the built-in int() to some other type won't do it. You could probably do it with an import filter, that scans each file imported for (say) integer literals and replaces them with MyInt(42) or whatever.
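The rewriting step such an import filter would perform can be sketched with the standard ast module (Python 3.8 or later). This is not a full import hook; it only rewrites and executes a source string, and the names MyInt, WrapIntLiterals and run_with_myint are hypothetical.

```python
import ast


class MyInt(int):
    pass


class WrapIntLiterals(ast.NodeTransformer):
    """Replace every integer literal N with the call MyInt(N)."""

    def visit_Constant(self, node):
        # bool is a subclass of int, so exclude True/False explicitly.
        if isinstance(node.value, int) and not isinstance(node.value, bool):
            call = ast.Call(
                func=ast.Name(id="MyInt", ctx=ast.Load()),
                args=[node],
                keywords=[],
            )
            return ast.copy_location(call, node)
        return node


def run_with_myint(source):
    """Parse `source`, wrap its integer literals, execute it, return its namespace."""
    tree = WrapIntLiterals().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    namespace = {"MyInt": MyInt}
    exec(compile(tree, "<rewritten>", "exec"), namespace)
    return namespace


ns = run_with_myint("a = 1\nb = a + 2\nc = 'text'")
```

Here ns["a"] is a MyInt, while ns["b"] falls back to a plain int because MyInt does not override arithmetic; a real replacement type would need to define those operators as well.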

Listing all possible values for SOAP enumeration with Python SUDS

I'm connecting with a SUDS client to a SOAP Server whose wsdl contains many enumerations like the following:
</simpleType>
<simpleType name="FOOENUMERATION">
<restriction base="xsd:string">
<enumeration value="ALPHA"/><!-- enum const = 0 -->
<enumeration value="BETA"/><!-- enum const = 1 -->
<enumeration value="GAMMA"/><!-- enum const = 2 -->
<enumeration value="DELTA"/><!-- enum const = 3 -->
</restriction>
</simpleType>
In my client I am receiving sequences which contain elements of these various enumeration types. Given a member variable, I need to know all of its possible enumeration values. Basically I need a function which takes an instance of one of these enums and returns a list of strings containing all the possible values.
When I have an instance, running:
print type(foo.enumInstance)
I get:
<class 'suds.sax.text.Text'>
I'm not sure how to get the actual simpleType name from this, and then get the possible values from that short of parsing the WSDL myself.
Edit: I've discovered a way to get the enumerations given the simpleType name, as below, so my problem narrows down to finding the type name for a given variable, given that type(x) returns suds.sax.text.Text instead of the real name:
for l in client.factory.create('FOOENUMERATION'):
    print l[0]

If you know the name of the enum you want, you should be able to treat the enumeration object suds gives you like a dictionary, and do a direct lookup with that name. For example, if your enumeration type is called SOAPIPMode and you want the enum named STATIC_MANUAL in that enumeration:
soapIPMode = client.factory.create('SOAPIPMode')
staticManual = soapIPMode['STATIC_MANUAL']
The resulting value is of type suds.sax.text.Text which acts like a string.
You can also iterate over the enumeration type as if it were an array:
for i in range(len(soapIPMode)):
    process(soapIPMode[i])

I have figured out a rather hacky way to pull this off, but hopefully someone still has a better answer for me. For some reason objects returned from the server have enums with the suds.sax.text.Text type, but those created with the factory have types related to the enum, so this works:
def printEnums(obj, field):
    a = client.factory.create(str(getattr(client.factory.create(str(obj.__class__).replace('suds.sudsobject.', '')), field).__class__).replace('suds.sudsobject.', ''))
    for i in a:
        print i[0]
Then I can do:
printEnums(foo,'enumInstance')
and get a listing of the possible values for foo.enumInstance even if foo was returned from the server rather than created by a factory, since I factory-create a new object of the same type as the one passed in. Still, I can't imagine that this mess is the correct or best way to do this.

See if you can feed in the WSDL into the ElementTree component on Python and use it to obtain the enumerations.
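A minimal sketch of that idea, in Python 3, is shown below. It assumes the simpleType elements live in the standard XML Schema namespace and uses an inline fragment modelled on the question in place of the real WSDL; the function name enumeration_values is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace: the standard XML Schema namespace.
XSD = "{http://www.w3.org/2001/XMLSchema}"

# Stand-in for the real WSDL text, modelled on the fragment in the question.
SAMPLE = """
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <simpleType name="FOOENUMERATION">
    <restriction base="xsd:string">
      <enumeration value="ALPHA"/>
      <enumeration value="BETA"/>
      <enumeration value="GAMMA"/>
      <enumeration value="DELTA"/>
    </restriction>
  </simpleType>
</schema>
"""


def enumeration_values(xml_text):
    """Map each named simpleType to the list of its enumeration values."""
    root = ET.fromstring(xml_text)
    result = {}
    for simple_type in root.iter(XSD + "simpleType"):
        name = simple_type.get("name")
        values = [e.get("value") for e in simple_type.iter(XSD + "enumeration")]
        if name and values:
            result[name] = values
    return result


enums = enumeration_values(SAMPLE)
```

This still leaves the original difficulty of finding the type name for a value returned by the server, but it lists every enumeration in the document without going through the suds factory.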

Accessing Object Memory Address

When you call the object.__repr__() method in Python you get something like this back:
<__main__.Test object at 0x2aba1c0cf890>
Is there any way to get a hold of the memory address if you overload __repr__(), other than calling super(Class, obj).__repr__() and regexing it out?

The Python manual has this to say about id():
Return the "identity" of an object. This is an integer (or long integer) which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value. (Implementation note: this is the address of the object.)
So in CPython, this will be the address of the object. No such guarantee for any other Python interpreter, though.
Note that if you're writing a C extension, you have full access to the internals of the Python interpreter, including access to the addresses of objects directly.

You could reimplement the default repr this way:
def __repr__(self):
    return '<%s.%s object at %s>' % (
        self.__class__.__module__,
        self.__class__.__name__,
        hex(id(self))
    )

Just use
id(object)

There are a few issues here that aren't covered by any of the other answers.
First, id only returns:
the “identity” of an object. This is an integer (or long integer) which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value.
In CPython, this happens to be the pointer to the PyObject that represents the object in the interpreter, which is the same thing that object.__repr__ displays. But this is just an implementation detail of CPython, not something that's true of Python in general. Jython doesn't deal in pointers, it deals in Java references (which the JVM of course probably represents as pointers, but you can't see those—and wouldn't want to, because the GC is allowed to move them around). PyPy lets different types have different kinds of id, but the most general is just an index into a table of objects you've called id on, which is obviously not going to be a pointer. I'm not sure about IronPython, but I'd suspect it's more like Jython than like CPython in this regard. So, in most Python implementations, there's no way to get whatever showed up in that repr, and no use if you did.
But what if you only care about CPython? That's a pretty common case, after all.
Well, first, you may notice that id is an integer;* if you want that 0x2aba1c0cf890 string instead of the number 46978822895760, you're going to have to format it yourself. Under the covers, I believe object.__repr__ is ultimately using printf's %p format, which you don't have from Python… but you can always do this:
format(id(spam), '#010x' if sys.maxsize.bit_length() <= 32 else '#018x')
* In 3.x, it's an int. In 2.x, it's an int if that's big enough to hold a pointer (which it may not be because of signed number issues on some platforms) and a long otherwise.
Is there anything you can do with these pointers besides print them out? Sure (again, assuming you only care about CPython).
All of the C API functions take a pointer to a PyObject or a related type. For those related types, you can just call PyFoo_Check to make sure it really is a Foo object, then cast with (PyFoo *)p. So, if you're writing a C extension, the id is exactly what you need.
What if you're writing pure Python code? You can call the exact same functions with pythonapi from ctypes.
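As a minimal CPython-only sketch, the C API function PyObject_Size can be called through ctypes.pythonapi with an object's id passed as a raw pointer. The variable names are illustrative, and the call is only safe while the object is still alive.

```python
import ctypes

# Py_ssize_t PyObject_Size(PyObject *o) from the CPython C API.
py_object_size = ctypes.pythonapi.PyObject_Size
py_object_size.argtypes = [ctypes.c_void_p]   # pass the address as a raw pointer
py_object_size.restype = ctypes.c_ssize_t

items = ["a", "b", "c"]
address = id(items)               # in CPython, the address of the PyObject
length = py_object_size(address)  # same result as len(items)
```

Passing an address that no longer refers to a live object is undefined behaviour and can crash the interpreter, so this is best kept to debugging and experiments.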
Finally, a few of the other answers have brought up ctypes.addressof. That isn't relevant here. This only works for ctypes objects like c_int32 (and maybe a few memory-buffer-like objects, like those provided by numpy). And, even there, it isn't giving you the address of the c_int32 value, it's giving you the address of the C-level int32 that the c_int32 wraps up.
That being said, more often than not, if you really think you need the address of something, you didn't want a native Python object in the first place, you wanted a ctypes object.

Just in response to Torsten, I wasn't able to call addressof() on a regular python object. Furthermore, id(a) != addressof(a). This is in CPython, don't know about anything else.
>>> from ctypes import c_int, addressof
>>> a = 69
>>> addressof(a)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: invalid type
>>> b = c_int(69)
>>> addressof(b)
4300673472
>>> id(b)
4300673392

You can get something suitable for that purpose with:
id(self)

With ctypes, you can achieve the same thing with
>>> import ctypes
>>> a = (1,2,3)
>>> ctypes.addressof(a)
3077760748L
Documentation:
addressof(C instance) -> integer
Return the address of the C instance internal buffer
Note that in CPython, currently id(a) == ctypes.addressof(a), but ctypes.addressof should return the real address for each Python implementation, if
ctypes is supported
memory pointers are a valid notion.
Edit: added information about interpreter-independence of ctypes

I know this is an old question but if you're still programming, in python 3 these days... I have actually found that if it is a string, then there is a really easy way to do this:
>>> spam.upper
<built-in method upper of str object at 0x1042e4830>
>>> spam.upper()
'YO I NEED HELP!'
>>> id(spam)
4365109296
string conversion does not affect location in memory either:
>>> spam = {437 : 'passphrase'}
>>> object.__repr__(spam)
'<dict object at 0x1043313f0>'
>>> str(spam)
"{437: 'passphrase'}"
>>> object.__repr__(spam)
'<dict object at 0x1043313f0>'

You can get the memory address/location of any object by using the 'partition' method of the built-in 'str' type.
Here is an example of using it to get the memory address of an object:
Python 3.8.3 (default, May 27 2020, 02:08:17)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> object.__repr__(1)
'<int object at 0x7ca70923f0>'
>>> hex(int(object.__repr__(1).partition('object at ')[2].strip('>'), 16))
'0x7ca70923f0'
>>>
Here, I call the built-in object class's __repr__ method with an object such as 1 as the argument to get the default repr string. I then partition that string on 'object at ', which returns a tuple of the text before the separator, the separator itself, and the text after it. Since the memory address comes after 'object at ', it ends up in the last part.
That last part is the third item of the tuple, so I access it with index 2. It still has a closing angle bracket as a suffix, so I use the strip method to remove it. Finally, I convert the resulting string into an integer with base 16 and turn it back into a hex string.

While it's true that id(object) gets the object's address in the default CPython implementation, this is generally useless... you can't do anything with the address from pure Python code.
The only time you would actually be able to use the address is from a C extension library... in which case it is trivial to get the object's address since Python objects are always passed around as C pointers.

If the __repr__ is overloaded, you may consider __str__ to see the memory address of the variable.
Here are the details of __repr__ versus __str__ by Moshe Zadka on Stack Overflow.

There is a way to recover the object from the value returned by id(). Here is the TL;DR:
ctypes.cast(memory_address,ctypes.py_object).value
source
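A short round trip illustrating that cast, as a CPython-only sketch with illustrative variable names:

```python
import ctypes

original = {"answer": 42}
memory_address = id(original)  # in CPython, the address of the object

# Reinterpret the integer as a PyObject* and fetch the object it points to.
recovered = ctypes.cast(memory_address, ctypes.py_object).value

same_object = recovered is original
```

This only works while the original object is still alive; casting the address of an object that has already been freed can crash the interpreter.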
