What is md5_hash property of BlobInfo object intended for? - python

If I call
blobstore.BlobInfo.properties()
the function returns
set(['filename', 'creation', 'content_type', 'md5_hash', 'size'])
but if I call
a = blobstore.BlobInfo.all()
obj = a.fetch(1)[0]
print obj.md5_hash
the call raises an exception:
AttributeError: md5_hash
What is md5_hash property of BlobInfo object intended for?
P.S. I want to check whether an uploaded file already exists in the Blobstore.

A cryptographic hash function can be used for many things:
to provide an integrity check value for the file/blob to detect changes
to provide a unique identifier for a file/blob used to refer to the contents
to enable fast lookup of the contents of a hash table
to enable fast searching for duplicate files
etc.
The "intended" use of course depends on what application the blobstore is supporting - are you building a shopping cart, or a data cache, or a map-reduce processing application, or what?

The code you show works fine for me, on shell.appspot.com:
>>> from google.appengine.ext import blobstore
>>> blobstore.BlobInfo.properties()
set(['filename', 'creation', 'content_type', 'md5_hash', 'size'])
>>> o = blobstore.BlobInfo.all().get()
>>> o.md5_hash
u'5d41402abc4b2a76b9719d911017c592'
You must be doing something different from what's in your sample code. Can you paste your exact code and the complete stack trace?

You probably have BlobInfo objects that don't have an md5_hash written, including the first result returned by blobstore.BlobInfo.all().
You can check easily in your dev server's interactive console:
from google.appengine.ext import blobstore
query1 = blobstore.BlobInfo.all()
query2 = blobstore.BlobInfo.gql("WHERE md5_hash != ''")
print query1.count(), query2.count()
# for me this returns '100 85'

Related

Hazelcast and python there is no suitable de-serializer for type -120

Hello, I guess I have a problem with the client and member config; which config should I use? As you can see, I am inserting JSON as data. When I call get_data it returns with no problem, but when I try to use a SQL predicate it gives me the error: "hazelcast.errors.HazelcastSerializationError: Exception from server: com.hazelcast.nio.serialization.HazelcastSerializationException: There is no suitable de-serializer for type -120. This exception is likely caused by differences in the serialization configuration between members or between clients and members."
@app.route('/insert_data/<database_name>/<collection_name>', methods=['POST'])
def insert_data(database_name, collection_name):
    client = hazelcast.HazelcastClient(cluster_members=[url])
    dbname_map = client.get_map(f"{database_name}-{collection_name}").blocking()
    if request.json:
        received_json_data = request.json
        received_id = received_json_data["_id"]
        del received_json_data["_id"]
        dbname_map.put(received_id, received_json_data)
        client.shutdown()
        return jsonify()
    else:
        client.shutdown()
        abort(400)
@app.route('/get_data/<database_name>/<collection_name>', methods=['GET'])
def get_all_data(database_name, collection_name):
    client = hazelcast.HazelcastClient(cluster_members=[url])
    dbname_map = client.get_map(f"{database_name}-{collection_name}").blocking()
    entry_set = dbname_map.entry_set()
    datas = []
    for key, value in entry_set:
        value['_id'] = key
        datas.append(value)
    client.shutdown()
    return jsonify({"Result": datas})
@bp.route('/get_query/<database_name>/<collection_name>/<name>', methods=['GET'])
def get_query_result(database_name, collection_name, name):
    client = hazelcast.HazelcastClient(cluster_members=[url])
    predicate_map = client.get_map(f"{database_name}-{collection_name}").blocking()
    predicate = and_(sql(f"name like {name}%"))
    entry_set = predicate_map.values(predicate)
    # entry_set = predicate_map.entry_set(predicate)
    send_all_data = ""
    for x in entry_set:
        send_all_data += x.to_string()
        send_all_data += "\n"
    print(send_all_data)
    # print("Retrieved %s values whose age is less than 30." % len(result))
    # print("Entry is", result[0].to_string())
    # value = predicate_map.get(70)
    # print(value)
    return jsonify()
I tried to change hazelcast.xml according to hazelcast-full-example.xml, but I can't start Hazelcast after the changes. Do I really have to configure serialization? Hazelcast version: 4.1, Python: 3.9.
This is most likely happening because you are putting entries of type dict into the map. You didn't specify a serializer for dictionaries, so the client falls back to its default serializer, pickle. However, pickle serialization is Python-specific, so the server members cannot deserialize the values and throw this exception.
There are possible solutions to that; see the https://hazelcast.readthedocs.io/en/stable/serialization.html chapter for details.
I think the most appropriate solution for your use case would be Portable serialization, which does not require a configuration change or code on the server side. See https://hazelcast.readthedocs.io/en/stable/serialization.html#portable-serialization.
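A minimal sketch of what that could look like for this map data, assuming a hazelcast-python-client 4.x API; the Customer class, its fields, and the factory/class IDs are illustrative, not taken from the question:
import hazelcast
from hazelcast.serialization.api import Portable

class Customer(Portable):
    FACTORY_ID = 1
    CLASS_ID = 1

    def __init__(self, name=None, age=0):
        self.name = name
        self.age = age

    def write_portable(self, writer):
        # Fields are written individually, so members can evaluate
        # predicates against them without a Python deserializer.
        writer.write_string("name", self.name)
        writer.write_int("age", self.age)

    def read_portable(self, reader):
        self.name = reader.read_string("name")
        self.age = reader.read_int("age")

    def get_factory_id(self):
        return Customer.FACTORY_ID

    def get_class_id(self):
        return Customer.CLASS_ID

client = hazelcast.HazelcastClient(
    cluster_members=[url],  # url is the placeholder from the question
    portable_factories={Customer.FACTORY_ID: {Customer.CLASS_ID: Customer}},
)
With values stored as Portable, SQL predicates such as sql("name like foo%") can be evaluated on the members against the written fields.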
BTW, client objects are quite heavyweight, so you shouldn't be creating them on demand like this. You can construct one once at application startup and share it freely across your endpoints and business logic, since it is thread-safe. The same applies to the map proxies you get from the client; they can be re-used as well.
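For example, a sketch (url stands for the member address placeholder used in the question):
import hazelcast

# Create the client once at module level and share it across all endpoints.
# Don't call client.shutdown() inside request handlers; shut it down once
# when the application exits.
client = hazelcast.HazelcastClient(cluster_members=[url])
Each handler can then call client.get_map(...) directly instead of constructing and shutting down a client per request.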

How to convert suds object to xml string

This is a duplicate to this question:
How to convert suds object to xml
But the question has not been answered: "totxt" is not an attribute on the Client class.
Unfortunately I lack the reputation to add comments, so I ask again:
Is there a way to convert a suds object to its XML?
I ask this because I already have a system that consumes WSDL files and sends data to a webservice. But now the customers want to alternatively store the XML as files (to import them later manually). So all I need are two methods for writing data: one that writes to a webservice (implemented and tested), and one that writes to files (not implemented yet).
If only I could make something like this:
xml_as_string = My_suds_object.to_xml()
The following code is just an example and does not run. It's not elegant, but that doesn't matter; I hope you get the idea of what I want to achieve:
I have the function write_customer_obj_webservice that works. Now I want to write the function write_customer_obj_xml_file.
import suds

def get_customer_obj():
    wsdl_url = r'file:C:/somepathhere/Customer.wsdl'
    service_url = r'http://someiphere/Customer'
    c = suds.client.Client(wsdl_url, location=service_url)
    customer = c.factory.create("ns0:CustomerType")
    return customer

def write_customer_obj_webservice(customer):
    wsdl_url = r'file:C:/somepathhere/Customer.wsdl'
    service_url = r'http://someiphere/Customer'
    c = suds.client.Client(wsdl_url, location=service_url)
    response = c.service.save(someparameters, None, None, customer)
    return response

def write_customer_obj_xml_file(customer):
    output_filename = r'C\temp\testxml'
    # The following line is the problem. "to_xml" does not exist and I can't find a way to do it.
    xml = customer.to_xml()
    fo = open(output_filename, 'a')
    try:
        fo.write(xml)
    except:
        raise
    else:
        response = 'All ok'
    finally:
        fo.close()
    return response

# Get the customer object always from the wsdl.
customer = get_customer_obj()

# Since customer is an object, setting its attributes is very easy. There are very complex objects in this system.
customer.name = "Doe J."
customer.age = 42

# Write the new customer to a webservice or store it in a file for later processing.
if later_processing:
    response = write_customer_obj_xml_file(customer)
else:
    response = write_customer_obj_webservice(customer)
I found a way that works for me. The trick is to create the Client with the option "nosend=True".
In the documentation it says:
nosend - Create the soap envelope but don't send. When specified, method invocation returns a RequestContext instead of sending it.
The RequestContext object has the attribute envelope. This is the XML as string.
Some pseudo code to illustrate:
c = suds.client.Client(url, nosend=True)
customer = c.factory.create("ns0:CustomerType")
customer.name = "Doe J."
customer.age = 42
response = c.service.save(someparameters, None, None, customer)
print response.envelope # This prints the XML string that would have been sent.
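Putting the two together, write_customer_obj_xml_file could look roughly like this (an untested sketch; wsdl_url, service_url, and someparameters are the placeholders from the question):
def write_customer_obj_xml_file(customer):
    # Build the SOAP envelope without sending it.
    c = suds.client.Client(wsdl_url, location=service_url, nosend=True)
    request_context = c.service.save(someparameters, None, None, customer)
    # request_context.envelope is the XML string that would have been sent.
    with open(r'C:\temp\test.xml', 'w') as fo:
        fo.write(request_context.envelope)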
You have some issues in the write_customer_obj_xml_file function:
Fix bad path:
output_filename = r'C:\temp\test.xml'
The following line is the problem. "to_xml" does not exist and I can't find a way to do it.
What's the type of customer? type(customer)?
xml = customer.to_xml() # to be continued...
Why mode='a'? ('a' => append, 'w' => create + write)
Use a with statement (file context manager).
with open(output_filename, 'w') as fo:
    fo.write(xml)
You don't need to return a response string: handle errors with exceptions instead. The exception to catch can be EnvironmentError.
Analysis
The following call:
customer = c.factory.create("ns0:CustomerType")
constructs a CustomerType on the fly, and returns a CustomerType instance, customer.
I think you can introspect your customer object, try the following:
vars(customer) # display the object attributes
help(customer) # display an extensive help about your instance
Another way is to try the WSDL URLs by hand, and see the XML results.
You may obtain the full description of your CustomerType object.
And then?
Then, with the attributes list, you can create your own XML. Use an XML template and fill it with the object attributes.
You may also find a magic function (like to_xml) which does the job for you. But I'm not sure the XML format would match your needs.
client = Client(url)
client.factory.create('somename')
# The last XML request by client
client.last_sent()
# The last XML response from Web Service
client.last_received()

Switch collection in mongoengine for find query

I've read the mongoengine documentation about switching the collection to save a document, tested this code, and it worked:
from mongoengine.context_managers import switch_collection

class Group(Document):
    name = StringField()

Group(name="test").save()  # saves in the default collection

with switch_collection(Group, 'group2000') as Group:
    Group(name="hello Group 2000 collection!").save()  # saves in the group2000 collection
But the problem is that when I want to find the saved document, switch_collection doesn't work at all:
with switch_collection(Group, 'group2000') as GroupT:
    GroupT.objects.get(name="hello Group 2000 collection!")  # should find it in the group2000 collection
As of mongoengine==0.10.0, mongoengine.context_managers.switch_collection(cls, collection_name)
(used as with switch_collection(Group, 'group1') as Group: in the example)
doesn't work inside functions; it raises an UnboundLocalError. A simple workaround with existing resources is:
To get:
new_group = Group.switch_collection(Group(),'group1')
from mongoengine.queryset import QuerySet
new_objects = QuerySet(Group,new_group._get_collection())
Use new_objects.all() to get all objects etc.
To save:
group_obj = Group()
group_obj.switch_collection('group2')
group_obj.save()
Although Prachetos Sadhukhan's answer works for me, I prefer to get the collection directly, not relying on the private _get_collection method:
from mongoengine import connection
new_group_collection = connection.get_db()['group1']
from mongoengine.queryset import QuerySet
new_objects = QuerySet(Group, new_group_collection)
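Usage is then like any other queryset; a small sketch (the name value is the one saved earlier in this thread):
# Query the alternate collection through the manually built QuerySet.
for group in new_objects.filter(name="hello Group 2000 collection!"):
    print(group.name)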

how to enable connectiondraining with python boto modify_lb_attribute

I have been trying to enable ELB connection draining using the modify_lb_attribute method in the Python boto module; however, I haven't been able to get it working. According to the documentation here http://boto.readthedocs.org/en/latest/ref/elb.html I should be able to call it like this:
modify_lb_attribute(load_balancer_name, attribute, value)
Here is an example:
modify_lb_attribute('my-elb', 'connectionDraining', 120)
When I do this, however, I receive the following error:
File "/Library/Python/2.7/site-packages/boto/ec2/elb/__init__.py", line 421, in modify_lb_attribute
value.enabled and 'true' or 'false'
AttributeError: 'NoneType' object has no attribute 'enabled'
I have been able to get it to work successfully with crossZoneLoadBalancing.
For example this works:
modify_lb_attribute('my-elb', 'crossZoneLoadBalancing', 'true')
Any help or suggestions would be appreciated.
Thanks
Working syntax for instantiating a ConnectionDrainingAttribute and passing it to a load balancer:
from boto.ec2.elb.attributes import ConnectionDrainingAttribute
import boto.ec2.elb

connection = boto.ec2.elb.connect_to_region("region")

cda = ConnectionDrainingAttribute(connection)
cda.enabled = True
cda.timeout = 120

connection.modify_lb_attribute(
    load_balancer_name='my-elb',
    attribute='connectionDraining',
    value=cda
)
More information about the ConnectionDrainingAttribute class can be found in the boto docs.
When you modify the connectionDraining attribute of a load balancer, there are actually two values you can supply. The first is a boolean indicating whether you are enabling or disabling the connection draining feature. The second is an integer indicating the timeout which obviously applies only if connection draining is being enabled.
To allow you to specify both of these values, boto defines a ConnectionDrainingAttribute class in boto.ec2.elb.attributes. You must pass an instance of this class as the value to modify_lb_attribute, e.g.:
from boto.ec2.elb.attributes import ConnectionDrainingAttribute
cda = ConnectionDrainingAttribute()
cda.enabled = True
cda.timeout = 120
...
modify_lb_attribute('my-elb', 'connectionDraining', cda)

How to find user id from session_data from django_session table?

In the django_session table, session_data is stored after being pickled with Python's pickle module and then base64-encoded with Python's base64 module.
I got the decoded pickled session_data.
session_data from django_session table:
gAJ9cQEoVQ9fc2Vzc2lvbl9leHBpcnlxAksAVRJfYXV0aF91c2VyX2JhY2tlbmRxA1UpZGphbmdvLmNvbnRyaWIuYXV0aC5iYWNrZW5kcy5Nb2RlbEJhY2tlbmRxBFUNX2F1dGhfdXNlcl9pZHEFigECdS5iZmUwOWExOWI0YTZkN2M0NDc2MWVjZjQ5ZDU0YjNhZA==
after decoding it with base64.decode(session_data):
\x80\x02}q\x01(U\x0f_session_expiryq\x02K\x00U\x12_auth_user_backendq\x03U)django.contrib.auth.backends.ModelBackendq\x04U\r_auth_user_idq\x05\x8a\x01\x02u.bfe09a19b4a6d7c44761ecf49d54b3ad
I want to find out the value of auth_user_id from auth_user_idq\x05\x8a\x01\x02u.
I had trouble with Paulo's method (see my comment on his answer), so I ended up using this method from a scottbarnham.com blog post:
from django.contrib.sessions.models import Session
from django.contrib.auth.models import User
session_key = '8cae76c505f15432b48c8292a7dd0e54'
session = Session.objects.get(session_key=session_key)
uid = session.get_decoded().get('_auth_user_id')
user = User.objects.get(pk=uid)
print user.username, user.get_full_name(), user.email
NOTE: the format has changed since this original answer; for Django 1.4 and above, see the update below.
import pickle
data = pickle.loads(base64.decode(session_data))
>>> print data
{'_auth_user_id': 2L, '_auth_user_backend': 'django.contrib.auth.backends.ModelBackend',
'_session_expiry': 0}
[update]
My base64.decode requires filename arguments, so then I tried base64.b64decode, but this returned "IndexError: list assignment index out of range".
I really don't know why I used the base64 module, I guess because the question featured it.
You can just use the str.decode method:
>>> pickle.loads(session_data.decode('base64'))
{'_auth_user_id': 2L, '_auth_user_backend': 'django.contrib.auth.backends.ModelBackend',
'_session_expiry': 0}
I found a work-around (see answer below), but I am curious why this doesn't work.
Loading pickled data from user sources (cookies) is a security risk, so the session_data format was changed after this question was answered (I should track down the specific issue in Django's bug tracker and link it here, but my pomodoro break is over).
The format now (since Django 1.4) is "hash:json-object", where the first 40 bytes are a crypto-signature hash and the rest is a JSON payload. For now you can ignore the hash (it allows checking whether the data was tampered with by some cookie hacker).
>>> json.loads(session_data.decode('base64')[41:])
{u'_auth_user_backend': u'django.contrib.auth.backends.ModelBackend',
u'_auth_user_id': 1}
If you want to learn more about it and see how encode and decode work, here is the relevant code.
By the way, the version of Django I use is 1.9.4.
django/contrib/sessions/backends/base.py
class SessionBase(object):
    def _hash(self, value):
        key_salt = "django.contrib.sessions" + self.__class__.__name__
        return salted_hmac(key_salt, value).hexdigest()

    def encode(self, session_dict):
        "Returns the given session dictionary serialized and encoded as a string."
        serialized = self.serializer().dumps(session_dict)
        hash = self._hash(serialized)
        return base64.b64encode(hash.encode() + b":" + serialized).decode('ascii')

    def decode(self, session_data):
        encoded_data = base64.b64decode(force_bytes(session_data))
        try:
            # could produce ValueError if there is no ':'
            hash, serialized = encoded_data.split(b':', 1)
            expected_hash = self._hash(serialized)
            if not constant_time_compare(hash.decode(), expected_hash):
                raise SuspiciousSession("Session data corrupted")
            else:
                return self.serializer().loads(serialized)
        except Exception as e:
            # ValueError, SuspiciousOperation, unpickling exceptions. If any of
            # these happen, just return an empty dictionary (an empty session).
            if isinstance(e, SuspiciousOperation):
                logger = logging.getLogger('django.security.%s' %
                                           e.__class__.__name__)
                logger.warning(force_text(e))
            return {}
django/core/signing.py
class JSONSerializer(object):
    """
    Simple wrapper around json to be used in signing.dumps and
    signing.loads.
    """
    def dumps(self, obj):
        return json.dumps(obj, separators=(',', ':')).encode('latin-1')

    def loads(self, data):
        return json.loads(data.decode('latin-1'))
Let's focus on SessionBase's encode function:
serialize the session dictionary to JSON
compute a salted HMAC hash of the serialized data
concatenate hash + b":" + serialized data and base64-encode the result
So decode is the inverse.
We can simplify the decode function into the following code:
import json
import base64
session_data = 'YTUyYzY1MjUxNzE4MzMxZjNjODFiNjZmZmZmMzhhNmM2NWQzMTllMTp7ImNvdW50Ijo0fQ=='
encoded_data = base64.b64decode(session_data)
hash, serialized = encoded_data.split(b':', 1)
json.loads(serialized.decode('latin-1'))
And that's what session.get_decoded() does.
from django.conf import settings
from django.contrib.auth.models import User
from django.utils.importlib import import_module  # on Django >= 1.9, use the stdlib importlib instead

def get_user_from_sid(session_key):
    # Let the configured session engine do the decoding.
    django_session_engine = import_module(settings.SESSION_ENGINE)
    session = django_session_engine.SessionStore(session_key)
    uid = session.get('_auth_user_id')
    return User.objects.get(id=uid)
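For instance, a sketch reusing the example session key from earlier in this thread:
user = get_user_from_sid('8cae76c505f15432b48c8292a7dd0e54')
print(user.username, user.email)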
I wanted to do this in pure Python with the latest version of Django (2.0.5). This is what I did:
>>> import base64
>>> x = base64.b64decode('OWNkOGQxYjg4NzlkN2ZhOTc2NmU1ODY0NWMzZmQ4YjdhMzM4OTJhNjp7Im51bV92aXNpdHMiOjJ9')
>>> print(x)
b'9cd8d1b8879d7fa9766e58645c3fd8b7a33892a6:{"num_visits":2}'
>>> import json
>>> data = json.loads(x[41:])
>>> print(data)
{'num_visits': 2}
I just had to solve something like this on a Django install. I knew the ID (36) of the user and wanted to delete the session data for that specific user. I wanted to put this code out as a prototype to build on for finding a user in session data:
from django.contrib.sessions.models import Session

TARGET_USER = 36  # edit this to match the target user
TARGET_USER = str(TARGET_USER)  # the decoded id is stored as a string

for session in Session.objects.all():
    raw_session = session.get_decoded()
    uid = raw_session.get('_auth_user_id')
    if uid == TARGET_USER:  # this could also check membership in a list of users
        print(session)
        # session.delete()  # uncomment to delete session data associated with the user
Hope this helps anyone out there.
