Celery not accepting pickle even after allowing it - python

I'm trying to write a celery application that passes numpy arrays (or any arbitrary objects) to the workers. As far as I can tell, this requires serialization to occur via pickle (NB: I'm aware of the security implications but this isn't a concern in this case).
However, even after trying every possible way I could find to allow pickle as a serializer, I keep getting the following kombu exception:
kombu.exceptions.ContentDisallowed: Refusing to deserialize untrusted
content of type pickle (application/x-python-serialize)
My current files are:
# tasks.py
from celery import Celery

app = Celery(
    'tasks',
    broker='redis://localhost',
    accept_content=['pickle'],
    task_serializer='pickle'
)

@app.task
def adding(x, y):
    return x + y

if __name__ == '__main__':
    import numpy as np
    adding.apply_async((np.array([1]), np.array([1])), serializer='pickle')
In addition I have a config file:
# celeryconfig.py
print('configuring...')
accept_content = ['pickle', 'application/x-python-serialize']
task_serializer = 'pickle'
result_serializer = 'pickle'
from kombu import serialization
serialization.register_pickle()
serialization.enable_insecure_serializers()
However, if I run the worker (celery -A tasks worker --loglevel=info) and then execute the code that makes an async call (python tasks.py), I get the following traceback. Am I missing something?
[2018-06-16 11:46:23,617: CRITICAL/MainProcess] Unrecoverable error: ContentDisallowed('Refusing to deserialize untrusted content of type pickle (application/x-python-serialize)',)
Traceback (most recent call last):
File "/opt/anaconda/envs/Python3/lib/python3.6/site-packages/celery/worker/worker.py", line 205, in start
self.blueprint.start(self)
File "/opt/anaconda/envs/Python3/lib/python3.6/site-packages/celery/bootsteps.py", line 119, in start
step.start(parent)
File "/opt/anaconda/envs/Python3/lib/python3.6/site-packages/celery/bootsteps.py", line 369, in start
return self.obj.start()
File "/opt/anaconda/envs/Python3/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", line 322, in start
blueprint.start(self)
File "/opt/anaconda/envs/Python3/lib/python3.6/site-packages/celery/bootsteps.py", line 119, in start
step.start(parent)
File "/opt/anaconda/envs/Python3/lib/python3.6/site-packages/celery/worker/consufrom celery import Celery
mer/consumer.py", line 598, in start
c.loop(*c.loop_args())
File "/opt/anaconda/envs/Python3/lib/python3.6/site-packages/celery/worker/loops.py", line 91, in asynloop
next(loop)
File "/opt/anaconda/envs/Python3/lib/python3.6/site-packages/kombu/asynchronous/hub.py", line 354, in create_loop
cb(*cbargs)
File "/opt/anaconda/envs/Python3/lib/python3.6/site-packages/kombu/transport/redis.py", line 1040, in on_readable
self.cycle.on_readable(fileno)
File "/opt/anaconda/envs/Python3/lib/python3.6/site-packages/kombu/transport/redis.py", line 337, in on_readable
chan.handlers[type]()
File "/opt/anaconda/envs/Python3/lib/python3.6/site-packages/kombu/transport/redis.py", line 724, in _brpop_read
self.connection._deliver(loads(bytes_to_str(item)), dest)
File "/opt/anaconda/envs/Python3/lib/python3.6/site-packages/kombu/transport/virtual/base.py", line 983, in _deliver
callback(message)
File "/opt/anaconda/envs/Python3/lib/python3.6/site-packages/kombu/transport/virtual/base.py", line 633, in _callback
return callback(message)
File "/opt/anaconda/envs/Python3/lib/python3.6/site-packages/kombu/messaging.py", line 624, in _receive_callback
return on_m(message) if on_m else self.receive(decoded, message)
File "/opt/anaconda/envs/Python3/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", line 572, in on_task_received
callbacks,
File "/opt/anaconda/envs/Python3/lib/python3.6/site-packages/celery/worker/strategy.py", line 136, in task_message_handler
if body is None and 'args' not in message.payload:
File "/opt/anaconda/envs/Python3/lib/python3.6/site-packages/kombu/message.py", line 207, in payload
return self._decoded_cache if self._decoded_cache else self.decode()
File "/opt/anaconda/envs/Python3/lib/python3.6/site-packages/kombu/message.py", line 192, in decode
self._decoded_cache = self._decode()
File "/opt/anaconda/envs/Python3/lib/python3.6/site-packages/kombu/message.py", line 197, in _decode
self.content_encoding, accept=self.accept)
File "/opt/anaconda/envs/Python3/lib/python3.6/site-packages/kombu/serialization.py", line 253, in loads
raise self._for_untrusted_content(content_type, 'untrusted')
kombu.exceptions.ContentDisallowed: Refusing to deserialize untrusted content of type pickle (application/x-python-serialize)

For anyone coming to this question:
The answer was to load the configuration explicitly via the app.config_from_object method; Celery does not pick up celeryconfig.py on its own:
import celeryconfig
app.config_from_object(celeryconfig)
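Putting it together, a minimal sketch of the fixed tasks.py (assuming the celeryconfig.py shown above sits in the same directory):
# tasks.py
from celery import Celery
import celeryconfig

app = Celery('tasks', broker='redis://localhost')
app.config_from_object(celeryconfig)  # loads the pickle settings so the worker accepts them

@app.task
def adding(x, y):
    return x + y

if __name__ == '__main__':
    import numpy as np
    adding.apply_async((np.array([1]), np.array([1])), serializer='pickle')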

Related

Why am I getting pymongo.errors.AutoReconnect: connection pool paused

I'm getting this AutoReconnect error, and there are around 100 connections in the logs during this call to .objects(). Here is the document:
class NotificationDoc(Document):
    patient_id = StringField(max_length=32)
    type = StringField(max_length=32)
    sms_sent = BooleanField(default=False)
    email_sent = BooleanField(default=False)
    sending_time = DateTimeField(default=datetime.utcnow)

    def __unicode__(self):
        return "{}_{}_{}".format(self.patient_id, self.type, self.sending_time.strftime('%Y-%m-%d'))
and this is the query call that causes the issue:
docs = NotificationDoc.objects(sending_time__lte=from_date)
This is the complete stacktrace. How can I understand what is going on?
pymongo.errors.AutoReconnect: connection pool paused
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/celery/app/trace.py", line 451, in trace_task
R = retval = fun(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/celery/app/trace.py", line 734, in __protected_call__
return self.run(*args, **kwargs)
File "/app/src/reminder/configurations/app/worker.py", line 106, in schedule_recall
schedule_recall_use_case.execute()
File "/app/src/reminder/domain/notification/recall/use_cases/schedule_recall.py", line 24, in execute
notifications_sent = self.notifications_provider.find_recalled_notifications_for_date(no_recalls_before_date)
File "/app/src/reminder/data_providers/database/odm/repositories.py", line 42, in find_recalled_notifications_for_date
if NotificationDoc.objects(sending_time__lte=from_date).count() > 0:
File "/usr/local/lib/python3.8/site-packages/mongoengine/queryset/queryset.py", line 144, in count
return super().count(with_limit_and_skip)
File "/usr/local/lib/python3.8/site-packages/mongoengine/queryset/base.py", line 423, in count
count = count_documents(
File "/usr/local/lib/python3.8/site-packages/mongoengine/pymongo_support.py", line 38, in count_documents
return collection.count_documents(filter=filter, **kwargs)
File "/usr/local/lib/python3.8/site-packages/pymongo/collection.py", line 1502, in count_documents
return self.__database.client._retryable_read(
File "/usr/local/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1307, in _retryable_read
with self._secondaryok_for_server(read_pref, server, session) as (
File "/usr/local/lib/python3.8/contextlib.py", line 113, in __enter__
return next(self.gen)
File "/usr/local/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1162, in _secondaryok_for_server
with self._get_socket(server, session) as sock_info:
File "/usr/local/lib/python3.8/contextlib.py", line 113, in __enter__
return next(self.gen)
File "/usr/local/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1099, in _get_socket
with server.get_socket(
File "/usr/local/lib/python3.8/contextlib.py", line 113, in __enter__
return next(self.gen)
File "/usr/local/lib/python3.8/site-packages/pymongo/pool.py", line 1371, in get_socket
sock_info = self._get_socket(all_credentials)
File "/usr/local/lib/python3.8/site-packages/pymongo/pool.py", line 1436, in _get_socket
self._raise_if_not_ready(emit_event=True)
File "/usr/local/lib/python3.8/site-packages/pymongo/pool.py", line 1407, in _raise_if_not_ready
_raise_connection_failure(
File "/usr/local/lib/python3.8/site-packages/pymongo/pool.py", line 250, in _raise_connection_failure
raise AutoReconnect(msg) from error
pymongo.errors.AutoReconnect: mongo:27017: connection pool paused
It turned out that Celery was leaving connections open, so here is the solution:
from mongoengine import connect, disconnect
from celery.signals import task_prerun

@task_prerun.connect
def on_task_init(*args, **kwargs):
    disconnect(alias='default')
    connect(db, host=host, port=port, maxPoolSize=400, minPoolSize=200, alias='default')
This could be related to the fact that PyMongo is not fork-safe, which matters if you're using a process pool in any way; that includes server software like uWSGI and even some configurations of popular ASGI servers.
Every time you run a query in PyMongo, that MongoClient object becomes fork-unsafe; a MongoClient object that has never run any queries is still fork-safe. Database and Collection objects created from it reference back to their parent MongoClient without any liveness check.
I do see you are using MongoEngine, which uses PyMongo underneath. I don't know the semantics of that particular library, but I would assume they are the same.
In short: you must re-create your connections as part of your forking process.
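For Celery specifically, one way to do that (a sketch, not from the original answer; db, host and port are the same placeholders as above) is the worker_process_init signal, which fires once in every freshly forked worker process:
from celery.signals import worker_process_init
from mongoengine import connect, disconnect

@worker_process_init.connect
def reconnect_mongo(**kwargs):
    # Drop the client inherited from the parent process and open a
    # fresh connection that belongs to this worker process only.
    disconnect(alias='default')
    connect(db, host=host, port=port, alias='default')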
pip install -U pymongo; this has already been fixed upstream, see https://github.com/mongodb/mongo-python-driver/pull/944

Celery ignoring config for serializer?

I have a Celery app (4.4.6) that uses dataclasses. Since JSON can't serialize/deserialize dataclasses, I have forced the use of pickle throughout (I'm aware of the risk, but I think it's mitigated by the way the app is deployed). However, I am getting errors from within the app, from kombu, saying TypeError: Object of type ResourceGroup is not JSON serializable. Everything else is working, so in general it must be using pickle fine, but in this one case it isn't; and nothing in the stack trace that comes with the exception mentions my code. The software is structured as tasks that dynamically create other tasks with delay() as they perform discovery. Almost all the tasks are passed this ResourceGroup object, and they run fine, except one (I think, judging by the frequency of these errors and the logging I get for completed tasks).
This is how I'm configuring Celery in my worker. Is there some other setting I need to set to really really make it use pickle in all situations? (alternatively, is there a JSON serializer/deserializer that can reconstitute dataclasses?)
class CeleryConfig:
    task_serializer = 'pickle'
    result_serializer = 'pickle'
    event_serializer = 'pickle'
    accept_content = ['pickle']
    result_accept_content = ['pickle']

app = Celery('collector',
             backend='redis://' + Config.REDIS_HOST,
             broker='redis://' + Config.REDIS_HOST,
             include=['tasks'])
app.config_from_object(CeleryConfig)
Update:
Here's a full example of one of the exceptions:
[2020-07-23 05:02:10,621: DEBUG/MainProcess] pidbox received method active() [reply_to:{'exchange': 'reply.celery.pidbox', 'routing_key': 'cde2e89b-bb81-3c19-8491-b57b072e5f29'} ticket:6c2cc493-0e2d-4a85-821b-350bdc4bceeb]
[2020-07-23 05:02:10,621: ERROR/MainProcess] Control command error: EncodeError(TypeError('Object of type ResourceGroup is not JSON serializable'))
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/kombu/serialization.py", line 50, in _reraise_errors
yield
File "/usr/local/lib/python3.8/site-packages/kombu/serialization.py", line 221, in dumps
payload = encoder(data)
File "/usr/local/lib/python3.8/site-packages/kombu/utils/json.py", line 69, in dumps
return _dumps(s, cls=cls or _default_encoder,
File "/usr/local/lib/python3.8/json/__init__.py", line 234, in dumps
return cls(
File "/usr/local/lib/python3.8/json/encoder.py", line 199, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/usr/local/lib/python3.8/json/encoder.py", line 257, in iterencode
return _iterencode(o, 0)
File "/usr/local/lib/python3.8/site-packages/kombu/utils/json.py", line 59, in default
return super(JSONEncoder, self).default(o)
File "/usr/local/lib/python3.8/json/encoder.py", line 179, in default
raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type ResourceGroup is not JSON serializable
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/celery/worker/pidbox.py", line 46, in on_message
self.node.handle_message(body, message)
File "/usr/local/lib/python3.8/site-packages/kombu/pidbox.py", line 145, in handle_message
return self.dispatch(**body)
File "/usr/local/lib/python3.8/site-packages/kombu/pidbox.py", line 112, in dispatch
self.reply({self.hostname: reply},
File "/usr/local/lib/python3.8/site-packages/kombu/pidbox.py", line 149, in reply
self.mailbox._publish_reply(data, exchange, routing_key, ticket,
File "/usr/local/lib/python3.8/site-packages/kombu/pidbox.py", line 280, in _publish_reply
producer.publish(
File "/usr/local/lib/python3.8/site-packages/kombu/messaging.py", line 167, in publish
body, content_type, content_encoding = self._prepare(
File "/usr/local/lib/python3.8/site-packages/kombu/messaging.py", line 252, in _prepare
body) = dumps(body, serializer=serializer)
File "/usr/local/lib/python3.8/site-packages/kombu/serialization.py", line 221, in dumps
payload = encoder(data)
File "/usr/local/lib/python3.8/contextlib.py", line 131, in __exit__
self.gen.throw(type, value, traceback)
File "/usr/local/lib/python3.8/site-packages/kombu/serialization.py", line 54, in _reraise_errors
reraise(wrapper, wrapper(exc), sys.exc_info()[2])
File "/usr/local/lib/python3.8/site-packages/vine/five.py", line 194, in reraise
raise value.with_traceback(tb)
File "/usr/local/lib/python3.8/site-packages/kombu/serialization.py", line 50, in _reraise_errors
yield
File "/usr/local/lib/python3.8/site-packages/kombu/serialization.py", line 221, in dumps
payload = encoder(data)
File "/usr/local/lib/python3.8/site-packages/kombu/utils/json.py", line 69, in dumps
return _dumps(s, cls=cls or _default_encoder,
File "/usr/local/lib/python3.8/json/__init__.py", line 234, in dumps
return cls(
File "/usr/local/lib/python3.8/json/encoder.py", line 199, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/usr/local/lib/python3.8/json/encoder.py", line 257, in iterencode
return _iterencode(o, 0)
File "/usr/local/lib/python3.8/site-packages/kombu/utils/json.py", line 59, in default
return super(JSONEncoder, self).default(o)
File "/usr/local/lib/python3.8/json/encoder.py", line 179, in default
raise TypeError(f'Object of type {o.__class__.__name__} '
kombu.exceptions.EncodeError: Object of type ResourceGroup is not JSON serializable
I agree that ResourceGroup isn't JSON serializable. I just don't understand why it's trying! :-)
Update 2:
I got the celery worker running with concurrency=1 in the debugger, and a breakpoint in the JSON serializer (which should never be hit) does get triggered. Reading back through kombu, the first mention of a serializer in the stack seems to be Node.reply() in pidbox.py, where self.mailbox.serializer is None (and therefore defaults to JSON). I don't see where that Mailbox object is originally created, though.
Update 3:
The Control object appears to always create a Mailbox that accepts and sends only JSON, regardless of any config:
"""Worker remote control client."""
Mailbox = Mailbox
def __init__(self, app=None):
self.app = app
self.mailbox = self.Mailbox(
app.conf.control_exchange,
type='fanout',
accept=['json'],
producer_pool=lazy(lambda: self.app.amqp.producer_pool),
queue_ttl=app.conf.control_queue_ttl,
reply_queue_ttl=app.conf.control_queue_ttl,
queue_expires=app.conf.control_queue_expires,
reply_queue_expires=app.conf.control_queue_expires,
)
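To the parenthetical question above (a JSON serializer/deserializer that can reconstitute dataclasses): nothing built-in does this, but a minimal round-trip sketch, using a hypothetical stand-in for ResourceGroup, looks like this:
import dataclasses
import json

@dataclasses.dataclass
class ResourceGroup:  # hypothetical stand-in for the real class
    name: str
    size: int

# Serialize: dataclasses.asdict turns the instance into a plain dict
payload = json.dumps(dataclasses.asdict(ResourceGroup(name='rg1', size=3)))

# Deserialize: rebuild the dataclass from the decoded dict
restored = ResourceGroup(**json.loads(payload))
This covers flat dataclasses only; nested dataclasses need a recursive rebuild, and it does not change the pidbox reply path, which is hard-coded to JSON as shown above.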

Setup of Flask architecture for machine learning pipeline

I want to set up a machine learning pipeline that is callable via Flask, but I am facing some issues. These links are the documentation I have read so far:
https://exploreflask.com/en/latest/views.html#view-decorators
https://flask.palletsprojects.com/en/1.1.x/api/#flask.Flask
Let me explain the pipeline I have in mind:
pull a dataframe from a PostgreSQL database
encode said dataframe to make it ready for most algorithms
split up the data
feed to a pipeline and determine accuracy
store the model in a pickle file
What is working so far:
All parts are working as a regular script
I can just slap all the steps into one huge flask file with one decorator and it would run as well (my emergency solution)
The File Structure
The encoder script:
# Flask main thread
# makes Flask start this part as an application and not as a module
app = Flask('encoder_module')

@app.route('/df_encoder')
def df_encoder(rng=4):
    # ... encoding stuff ...
    return df
The pipeline script (random forest regressor here):
app = Flask('pipeline_module')

@app.route('/pipeline_rfr')
def pipeline_rfr():
    # ... pipeline stuff ...
    return grid_search_rfr
The pickle module:
app = Flask('pickle_module')

@app.route('/store_reg_pickle')
def store_pickle():
    """
    Usage of a pickle model - storage of a trained model
    """
    model = grid_search_rfr
    # specify the file name as a string
    model_file = "regression_model"
    with open(model_file, mode='wb') as m_f:
        pickle.dump(model, m_f)
    print(f"Model saved in: {os.getcwd()}")
    return model_file
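For completeness, loading the model back is the mirror image (a minimal sketch, assuming the same regression_model file):
import pickle

with open("regression_model", mode='rb') as m_f:
    model = pickle.load(m_f)  # restores the trained regressor object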
The Main Flask File
# packages
from flask import Flask
from encoder_main_thread import df_encoder
from rfr_pipeline_function import pipeline_rfr
from pickle_call import store_pickle

app = Flask(__name__.split('.')[0])

@app.route('/regression_pipe')
@df_encoder
@pipeline_rfr
@store_reg_pickle
def regression_pipe():
    return 'pipeline done'
The problem is that the return value of the encoder cannot be a dataframe; Flask views may only return a string, tuple, etc.
Is there a workaround for this?
I actually want a seamless handoff of the dataframe to the pipeline, eventually storing the model in a pickle file that is saved in the folder.
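For illustration, the "emergency solution" mentioned above would look roughly like this: one route that chains the steps as plain function calls, so the dataframe never has to cross a Flask response boundary (a sketch, assuming the three step functions are refactored to take and return the intermediate objects instead of being routes themselves):
from flask import Flask
from encoder_main_thread import df_encoder
from rfr_pipeline_function import pipeline_rfr
from pickle_call import store_pickle

app = Flask(__name__)

@app.route('/regression_pipe')
def regression_pipe():
    df = df_encoder()                   # dataframe stays an in-process object
    grid_search_rfr = pipeline_rfr(df)  # fit the pipeline on the encoded data
    model_file = store_pickle(grid_search_rfr)
    return f'pipeline done, model stored in {model_file}'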
For some reason it cannot detect the pickle file import and throws the following error:
Use a production WSGI server instead.
* Debug mode: off
Traceback (most recent call last):
File "C:\ANACONDA3\Scripts\flask-script.py", line 9, in <module>
sys.exit(main())
File "C:\ANACONDA3\lib\site-packages\flask\cli.py", line 967, in main
cli.main(args=sys.argv[1:], prog_name="python -m flask" if as_module else None)
File "C:\ANACONDA3\lib\site-packages\flask\cli.py", line 586, in main
return super(FlaskGroup, self).main(*args, **kwargs)
File "C:\ANACONDA3\lib\site-packages\click\core.py", line 782, in main
rv = self.invoke(ctx)
File "C:\ANACONDA3\lib\site-packages\click\core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "C:\ANACONDA3\lib\site-packages\click\core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "C:\ANACONDA3\lib\site-packages\click\core.py", line 610, in invoke
return callback(*args, **kwargs)
File "C:\ANACONDA3\lib\site-packages\click\decorators.py", line 73, in new_func
return ctx.invoke(f, obj, *args, **kwargs)
File "C:\ANACONDA3\lib\site-packages\click\core.py", line 610, in invoke
return callback(*args, **kwargs)
File "C:\ANACONDA3\lib\site-packages\flask\cli.py", line 848, in run_command
app = DispatchingApp(info.load_app, use_eager_loading=eager_loading)
File "C:\ANACONDA3\lib\site-packages\flask\cli.py", line 305, in __init__
self._load_unlocked()
File "C:\ANACONDA3\lib\site-packages\flask\cli.py", line 330, in _load_unlocked
self._app = rv = self.loader()
File "C:\ANACONDA3\lib\site-packages\flask\cli.py", line 388, in load_app
app = locate_app(self, import_name, name)
File "C:\ANACONDA3\lib\site-packages\flask\cli.py", line 240, in locate_app
__import__(module_name)
File "C:\Users\bill-\OneDrive\Dokumente\Docs Bill\TA_files\functions_scripts_storage\flask_test\flask_regression_pipeline.py", line 18, in <module>
@store_reg_pickle
NameError: name 'store_reg_pickle' is not defined
If you wish, I could upload the entire scripts, but that is a lot to look through; since everything works as one long regular piece of code, the mistake must be somewhere in my Flask setup.

I wrote a project with tornado, but this exception is always in my log file

This is the error log:
[I 160308 11:09:59 web:1908] 200 GET /admin/realtime (117.93.180.216) 107.13ms
[E 160308 11:09:59 http1connection:54] Uncaught exception
Traceback (most recent call last):
File "/usr/local/lib/python3.4/dist-packages/tornado/http1connection.py", line 238, in _read_message
delegate.finish()
File "/usr/local/lib/python3.4/dist-packages/tornado/httpserver.py", line 290, in finish
self.delegate.finish()
File "/usr/local/lib/python3.4/dist-packages/tornado/web.py", line 1984, in finish
self.execute()
File "/usr/local/lib/python3.4/dist-packages/blueware-1.0.10/blueware/hooks/framework_tornado/web.py", line 480, in _bw_wrapper__RequestDispatcher_execute
future = wrapped(*args, **kwargs)
File "/usr/local/lib/python3.4/dist-packages/tornado/web.py", line 2004, in execute
**self.handler_kwargs)
File "/usr/local/lib/python3.4/dist-packages/blueware-1.0.10/blueware/hooks/framework_tornado/web.py", line 448, in _bw_wrapper_RequestHandler___init___
return wrapped(*args, **kwargs)
File "/usr/local/lib/python3.4/dist-packages/tornado/web.py", line 185, in init
self.initialize(**kwargs)
File "/usr/local/lib/python3.4/dist-packages/tornado/web.py", line 2714, in wrapper
self.redirect(url)
File "/usr/local/lib/python3.4/dist-packages/tornado/web.py", line 671, in redirect
self.finish()
File "/usr/local/lib/python3.4/dist-packages/blueware-1.0.10/blueware/hooks/framework_tornado/web.py", line 309, in _bw_wrapper_RequestHandler_finish_
return wrapped(*args, **kwargs)
File "/usr/local/lib/python3.4/dist-packages/tornado/web.py", line 934, in finish
self.flush(include_footers=True)
File "/usr/local/lib/python3.4/dist-packages/tornado/web.py", line 870, in flush
for transform in self._transforms:
TypeError: 'NoneType' object is not iterable
[I 160308 11:10:00 web:1908] 200 GET /admin/order?order_type=1&order_status=1&page=0&action=allreal (49.89.27.173) 134.53ms
Can anyone tell me how to solve this problem? Thank you very much
I assume that OneAPM (the blueware agent) is compatible with your Python and Tornado versions; however, it can be tricky.
Solution
Move self.redirect(url) from your handler's initialize method to the get method, like this:
class MyHandler(tornado.web.RequestHandler):
    def get(self):
        self.redirect('/some_url')
or use RedirectHandler.
Every action that could finish the request needs to be called in the context of an HTTP-verb method (get, post, put, and so on). A common mistake is doing authentication/authorization in __init__ or initialize.
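For the RedirectHandler route, a minimal sketch (the URLs are placeholders):
import tornado.web

app = tornado.web.Application([
    # Redirect "/" to "/some_url" without writing a custom handler
    (r"/", tornado.web.RedirectHandler, {"url": "/some_url"}),
])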
More detail
In Tornado's source there is a note about _transforms, which is initialized in the constructor with None and set in _execute (oversimplifying: after the headers have been received).
A transform modifies the result of an HTTP request (e.g., GZip encoding).
Applications are not expected to create their own OutputTransforms
or interact with them directly; the framework chooses which transforms
(if any) to apply.
Reproduce
Sample that triggers this error. I'm including this only as a cross-check that blueware is not the cause:
import tornado.ioloop
import tornado.web

class SomeHandler(tornado.web.RequestHandler):
    def initialize(self, *args, **kwargs):
        url = '/some'
        self.redirect(url)
        # ^ this is wrong

    def get(self):
        # redirect should be here
        self.write("Hello")

def make_app():
    return tornado.web.Application([
        (r"/", SomeHandler),
    ])

if __name__ == "__main__":
    app = make_app()
    app.listen(8888)
    tornado.ioloop.IOLoop.current().start()
And stacktrace:
ERROR:tornado.application:Uncaught exception
Traceback (most recent call last):
File "/tmp/py3/lib/python3.4/site-packages/tornado/http1connection.py", line 238, in _read_message
delegate.finish()
File "/tmp/py3/lib/python3.4/site-packages/tornado/httpserver.py", line 289, in finish
self.delegate.finish()
File "/tmp/py3/lib/python3.4/site-packages/tornado/web.py", line 2022, in finish
self.execute()
File "/tmp/py3/lib/python3.4/site-packages/tornado/web.py", line 2042, in execute
**self.handler_kwargs)
File "/tmp/py3/lib/python3.4/site-packages/tornado/web.py", line 183, in __init__
self.initialize(**kwargs)
File "test.py", line 8, in initialize
self.redirect(url)
File "/tmp/py3/lib/python3.4/site-packages/tornado/web.py", line 666, in redirect
self.finish()
File "/tmp/py3/lib/python3.4/site-packages/tornado/web.py", line 932, in finish
self.flush(include_footers=True)
File "/tmp/py3/lib/python3.4/site-packages/tornado/web.py", line 868, in flush
for transform in self._transforms:
TypeError: 'NoneType' object is not iterable

google-api-python-client broken because of OAuth2?

I am trying to check whether a certain dataset exists in BigQuery using the Google API client in Python. It always worked until the last update, where I got this strange error I don't know how to fix:
Traceback (most recent call last):
File "/root/miniconda/lib/python2.7/site-packages/dsUtils/bq_utils.py", line 106, in _get
resp = bq_service.datasets().get(projectId=self.project_id, datasetId=self.id).execute(num_retries=2)
File "/root/miniconda/lib/python2.7/site-packages/oauth2client/util.py", line 140, in positional_wrapper
return wrapped(*args, **kwargs)
File "/root/miniconda/lib/python2.7/site-packages/googleapiclient/http.py", line 755, in execute
method=str(self.method), body=self.body, headers=self.headers)
File "/root/miniconda/lib/python2.7/site-packages/googleapiclient/http.py", line 93, in _retry_request
resp, content = http.request(uri, method, *args, **kwargs)
File "/root/miniconda/lib/python2.7/site-packages/oauth2client/client.py", line 598, in new_request
self._refresh(request_orig)
File "/root/miniconda/lib/python2.7/site-packages/oauth2client/client.py", line 864, in _refresh
self._do_refresh_request(http_request)
File "/root/miniconda/lib/python2.7/site-packages/oauth2client/client.py", line 891, in _do_refresh_request
body = self._generate_refresh_request_body()
File "/root/miniconda/lib/python2.7/site-packages/oauth2client/client.py", line 1597, in _generate_refresh_req
uest_body
assertion = self._generate_assertion()
File "/root/miniconda/lib/python2.7/site-packages/oauth2client/service_account.py", line 263, in _generate_ass
ertion
key_id=self._private_key_id)
File "/root/miniconda/lib/python2.7/site-packages/oauth2client/crypt.py", line 97, in make_signed_jwt
signature = signer.sign(signing_input)
File "/root/miniconda/lib/python2.7/site-packages/oauth2client/_pycrypto_crypt.py", line 101, in sign
return PKCS1_v1_5.new(self._key).sign(SHA256.new(message))
File "/root/miniconda/lib/python2.7/site-packages/Crypto/Signature/PKCS1_v1_5.py", line 112, in sign
m = self._key.decrypt(em)
File "/root/miniconda/lib/python2.7/site-packages/Crypto/PublicKey/RSA.py", line 174, in decrypt
return pubkey.pubkey.decrypt(self, ciphertext)
File "/root/miniconda/lib/python2.7/site-packages/Crypto/PublicKey/pubkey.py", line 93, in decrypt
plaintext=self._decrypt(ciphertext)
File "/root/miniconda/lib/python2.7/site-packages/Crypto/PublicKey/RSA.py", line 235, in _decrypt
r = getRandomRange(1, self.key.n-1, randfunc=self._randfunc)
File "/root/miniconda/lib/python2.7/site-packages/Crypto/Util/number.py", line 123, in getRandomRange
value = getRandomInteger(bits, randfunc)
File "/root/miniconda/lib/python2.7/site-packages/Crypto/Util/number.py", line 104, in getRandomInteger
S = randfunc(N>>3)
File "/root/miniconda/lib/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 202, in read
return self._singleton.read(bytes)
File "/root/miniconda/lib/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 178, in read
return _UserFriendlyRNG.read(self, bytes)
File "/root/miniconda/lib/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 137, in read
self._check_pid()
File "/root/miniconda/lib/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 153, in _check_pid
raise AssertionError("PID check failed. RNG must be re-initialized after fork(). Hint: Try Random.atfork()")
AssertionError: PID check failed. RNG must be re-initialized after fork(). Hint: Try Random.atfork()
Does someone understand what is happening?
Note that I also get this error with other components, like GCStorage.
Note also that I use the following code to load my Google credentials:
from oauth2client.client import GoogleCredentials

def get_credentials(credentials_path):  # my JSON credentials path
    logger.info('Getting credentials...')
    try:
        os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = credentials_path
        credentials = GoogleCredentials.get_application_default()
        return credentials
    except Exception as e:
        raise e
So if anyone knows a better way to load my Google credentials from my JSON service account file, one that would avoid the error, please tell me.
It looks like the error is in the PyCrypto module, which appears to be used under the hood by Google's OAuth2 implementation. If your code is calling os.fork() at some point, you may need to call Crypto.Random.atfork() afterward in both the parent and child process in order to update the module's internal state.
See here for PyCrypto docs; search for "atfork" for more info:
https://github.com/dlitz/pycrypto
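As a minimal sketch of that fix (assuming your code, or a library it uses, forks via os.fork()):
import os
from Crypto import Random

pid = os.fork()
# Both the parent and the child must re-initialize the RNG state;
# atfork() runs in each process after the fork returns.
Random.atfork()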
This question and answer might also be relevant:
PyCrypto : AssertionError("PID check failed. RNG must be re-initialized after fork(). Hint: Try Random.atfork()")
