Why does Airflow show error messages twice in the log?

I am using an AWS MWAA (Managed Workflows for Apache Airflow) environment with Airflow version 2.2.2.
Whenever an exception is raised inside a task, regardless of the operator (PythonOperator, a custom operator, anything else), it is shown twice in the log output:
[2023-02-15, 11:30:04 UTC] {{taskinstance.py:1429}} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=run_pi_edits_mhp
AIRFLOW_CTX_TASK_ID=test_failure
AIRFLOW_CTX_EXECUTION_DATE=2023-02-15T11:30:00.598379+00:00
AIRFLOW_CTX_DAG_RUN_ID=manual__2023-02-15T11:30:00.598379+00:00
[2023-02-15, 11:30:04 UTC] {{taskinstance.py:1703}} ERROR - Task failed with exception
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1332, in _run_raw_task
self._execute_task_with_callbacks(context)
File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1458, in _execute_task_with_callbacks
result = self._execute_task(context, self.task)
File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1514, in _execute_task
result = execute_callable(context=context)
File "/usr/local/lib/python3.7/site-packages/airflow/operators/python.py", line 151, in execute
return_value = self.execute_callable()
File "/usr/local/lib/python3.7/site-packages/airflow/operators/python.py", line 162, in execute_callable
return self.python_callable(*self.op_args, **self.op_kwargs)
File "/usr/local/airflow/dags/pi_edits_poc.py", line 16, in test_raise
raise Exception('Test_error_msg')
Exception: Test_error_msg
[2023-02-15, 11:30:04 UTC] {{taskinstance.py:1280}} INFO - Marking task as FAILED. dag_id=run_pi_edits_mhp, task_id=test_failure, execution_date=20230215T113000, start_date=20230215T113004, end_date=20230215T113004
[2023-02-15, 11:30:04 UTC] {{standard_task_runner.py:91}} ERROR - Failed to execute job 404 for task test_failure
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/airflow/task/task_runner/standard_task_runner.py", line 85, in _start_by_fork
args.func(args, dag=self.dag)
File "/usr/local/lib/python3.7/site-packages/airflow/cli/cli_parser.py", line 48, in command
return func(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/airflow/utils/cli.py", line 92, in wrapper
return f(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/airflow/cli/commands/task_command.py", line 292, in task_run
_run_task_by_selected_method(args, dag, ti)
File "/usr/local/lib/python3.7/site-packages/airflow/cli/commands/task_command.py", line 107, in _run_task_by_selected_method
_run_raw_task(args, ti)
File "/usr/local/lib/python3.7/site-packages/airflow/cli/commands/task_command.py", line 184, in _run_raw_task
error_file=args.error_file,
File "/usr/local/lib/python3.7/site-packages/airflow/utils/session.py", line 70, in wrapper
return func(*args, session=session, **kwargs)
File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1332, in _run_raw_task
self._execute_task_with_callbacks(context)
File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1458, in _execute_task_with_callbacks
result = self._execute_task(context, self.task)
File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1514, in _execute_task
result = execute_callable(context=context)
File "/usr/local/lib/python3.7/site-packages/airflow/operators/python.py", line 151, in execute
return_value = self.execute_callable()
File "/usr/local/lib/python3.7/site-packages/airflow/operators/python.py", line 162, in execute_callable
return self.python_callable(*self.op_args, **self.op_kwargs)
File "/usr/local/airflow/dags/pi_edits_poc.py", line 16, in test_raise
raise Exception('Test_error_msg')
Exception: Test_error_msg
[2023-02-15, 11:30:05 UTC] {{local_task_job.py:154}} INFO - Task exited with return code 1
[2023-02-15, 11:30:05 UTC] {{local_task_job.py:264}} INFO - 0 downstream tasks scheduled from follow-on schedule check
Ideally, I do not want to see the second error message, the one from {{standard_task_runner.py:91}} ERROR.
Could someone tell me how to do that?
I have tried different operator types and raised different exception types; the result is the same.
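One generic way to hide a specific log record is a filter from Python's standard logging module. The sketch below only shows the filtering logic on hand-built records. The class name DropRunnerTraceback and the helper make_record are hypothetical, and whether such a filter can be attached to the task log handler in MWAA is an assumption that the post does not confirm.

```python
import logging


class DropRunnerTraceback(logging.Filter):
    """Drop ERROR records that were emitted from standard_task_runner.py."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Returning False discards the record; all other records pass through.
        is_runner_error = (
            record.filename == "standard_task_runner.py"
            and record.levelno >= logging.ERROR
        )
        return not is_runner_error


def make_record(pathname: str, level: int, msg: str) -> logging.LogRecord:
    """Build a record as if it had been logged from the given source file."""
    return logging.LogRecord("demo", level, pathname, 0, msg, None, None)


log_filter = DropRunnerTraceback()
first = make_record(
    "/x/airflow/models/taskinstance.py", logging.ERROR, "Task failed with exception"
)
second = make_record(
    "/x/airflow/task/task_runner/standard_task_runner.py",
    logging.ERROR,
    "Failed to execute job",
)

# Only the record from taskinstance.py survives the filter.
kept = [r.getMessage() for r in (first, second) if log_filter.filter(r)]
print(kept)
```

In a real deployment the filter would be added to a handler with handler.addFilter(DropRunnerTraceback()); which handler to use depends on the logging configuration of the environment.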

Related

Celery TypeError: unhashable type: 'dict'

I'm trying to run celery, and can't run it because of the following exception:
[2023-02-14 11:25:11,689: CRITICAL/MainProcess] Unrecoverable error: TypeError("unhashable type: 'dict'")
Traceback (most recent call last):
File "/Users/shira/PycharmProjects/demo/venv/lib/python3.10/site-packages/celery/worker/worker.py", line 203, in start
self.blueprint.start(self)
File "/Users/shira/PycharmProjects/demo/venv/lib/python3.10/site-packages/celery/bootsteps.py", line 116, in start
step.start(parent)
File "/Users/shira/PycharmProjects/demo/venv/lib/python3.10/site-packages/celery/bootsteps.py", line 365, in start
return self.obj.start()
File "/Users/shira/PycharmProjects/demo/venv/lib/python3.10/site-packages/celery/worker/consumer/consumer.py", line 332, in start
blueprint.start(self)
File "/Users/shira/PycharmProjects/demo/venv/lib/python3.10/site-packages/celery/bootsteps.py", line 116, in start
step.start(parent)
File "/Users/shira/PycharmProjects/demo/venv/lib/python3.10/site-packages/celery/worker/consumer/consumer.py", line 628, in start
c.loop(*c.loop_args())
File "/Users/shira/PycharmProjects/demo/venv/lib/python3.10/site-packages/celery/worker/loops.py", line 94, in asynloop
update_qos()
File "/Users/shira/PycharmProjects/demo/venv/lib/python3.10/site-packages/kombu/common.py", line 435, in update
return self.set(self.value)
File "/Users/shira/PycharmProjects/demo/venv/lib/python3.10/site-packages/kombu/common.py", line 428, in set
self.callback(prefetch_count=new_value)
File "/Users/shira/PycharmProjects/demo/venv/lib/python3.10/site-packages/celery/worker/consumer/tasks.py", line 43, in set_prefetch_count
return c.task_consumer.qos(
File "/Users/shira/PycharmProjects/demo/venv/lib/python3.10/site-packages/kombu/messaging.py", line 558, in qos
return self.channel.basic_qos(prefetch_size,
File "/Users/shira/PycharmProjects/demo/venv/lib/python3.10/site-packages/amqp/channel.py", line 1894, in basic_qos
return self.send_method(
File "/Users/shira/PycharmProjects/demo/venv/lib/python3.10/site-packages/amqp/abstract_channel.py", line 79, in send_method
return self.wait(wait, returns_tuple=returns_tuple)
File "/Users/shira/PycharmProjects/demo/venv/lib/python3.10/site-packages/amqp/abstract_channel.py", line 99, in wait
self.connection.drain_events(timeout=timeout)
File "/Users/shira/PycharmProjects/demo/venv/lib/python3.10/site-packages/amqp/connection.py", line 525, in drain_events
while not self.blocking_read(timeout):
File "/Users/shira/PycharmProjects/demo/venv/lib/python3.10/site-packages/amqp/connection.py", line 531, in blocking_read
return self.on_inbound_frame(frame)
File "/Users/shira/PycharmProjects/demo/venv/lib/python3.10/site-packages/amqp/method_framing.py", line 77, in on_frame
callback(channel, msg.frame_method, msg.frame_args, msg)
File "/Users/shira/PycharmProjects/demo/venv/lib/python3.10/site-packages/amqp/connection.py", line 537, in on_inbound_method
return self.channels[channel_id].dispatch_method(
File "/Users/shira/PycharmProjects/demo/venv/lib/python3.10/site-packages/amqp/abstract_channel.py", line 156, in dispatch_method
listener(*args)
File "/Users/shira/PycharmProjects/demo/venv/lib/python3.10/site-packages/amqp/channel.py", line 1629, in _on_basic_deliver
fun(msg)
File "/Users/shira/PycharmProjects/demo/venv/lib/python3.10/site-packages/kombu/messaging.py", line 626, in _receive_callback
return on_m(message) if on_m else self.receive(decoded, message)
File "/Users/shira/PycharmProjects/demo/venv/lib/python3.10/site-packages/celery/worker/consumer/consumer.py", line 591, in on_task_received
strategy = strategies[type_]
TypeError: unhashable type: 'dict'
I tried uninstalling celery and stopping the RabbitMQ process, and I searched online, but I didn't find any solution.
I am running simple, basic celery code with only one function ("add", with no dictionary involved).
I thought there might be an issue with the libraries I import.
I put a breakpoint where the exception is thrown.
I found out that I had sent an invalid task a few days earlier, so I needed to remove it from the queue so that other tasks could be processed.
I used this command:
celery -A tasks purge
This solved the issue for me :)
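The last frame of the traceback is a dictionary lookup, strategies[type_], so the error is consistent with a queued message whose task name was a dict instead of a string. A minimal sketch of that failure mode with a plain dict; the registry contents and the helper lookup_error are hypothetical and are not celery's code.

```python
# Hypothetical registry that maps task names (strings) to handlers.
strategies = {"tasks.add": lambda a, b: a + b}


def lookup_error(task_name):
    """Return a description of the lookup error, or None if the lookup works."""
    try:
        strategies[task_name]
    except (TypeError, KeyError) as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


# A string key is hashable, so the lookup succeeds.
print(lookup_error("tasks.add"))

# A dict cannot be hashed, so using it as a key raises TypeError.
print(lookup_error({"task": "tasks.add"}))
```

Because the bad message stays in the queue, every worker start hits the same lookup again, which is why purging the queue cleared the error.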

Any conda command shows this error report

$ conda info --envs
# >>>>>>>>>>>>>>>>>>>>>> ERROR REPORT <<<<<<<<<<<<<<<<<<<<<<
Traceback (most recent call last):
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/exceptions.py", line 1082, in __call__
return func(*args, **kwargs)
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/cli/main.py", line 87, in _main
exit_code = do_call(args, p)
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/cli/conda_argparse.py", line 84, in do_call
return getattr(module, func_name)(args, parser)
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/cli/main_info.py", line 317, in execute
info_dict = get_info_dict(args.system)
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/cli/main_info.py", line 135, in get_info_dict
_supplement_index_with_system(virtual_pkg_index)
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/core/index.py", line 164, in _supplement_index_with_system
dist_name, dist_version = context.os_distribution_name_version
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/auxlib/decorators.py", line 268, in new_fget
cache[inner_attname] = func(self)
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/base/context.py", line 863, in os_distribution_name_version
from conda._vendor.distro import id, version
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/_vendor/distro.py", line 1084, in <module>
_distro = LinuxDistribution()
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/_vendor/distro.py", line 599, in __init__
self._lsb_release_info = self._get_lsb_release_info() \
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/_vendor/distro.py", line 943, in _get_lsb_release_info
raise subprocess.CalledProcessError(code, cmd, stdout, stderr)
subprocess.CalledProcessError: Command 'lsb_release -a' returned non-zero exit status 1.
`$ /home/user/miniconda3/bin/conda info --envs`
An unexpected error has occurred. Conda has prepared the above report.
If submitted, this report will be used by core maintainers to improve
future releases of conda.
Would you like conda to send this report to the core maintainers?
[y/N]: y
Traceback (most recent call last):
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/exceptions.py", line 1082, in __call__
return func(*args, **kwargs)
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/cli/main.py", line 87, in _main
exit_code = do_call(args, p)
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/cli/conda_argparse.py", line 84, in do_call
return getattr(module, func_name)(args, parser)
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/cli/main_info.py", line 317, in execute
info_dict = get_info_dict(args.system)
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/cli/main_info.py", line 135, in get_info_dict
_supplement_index_with_system(virtual_pkg_index)
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/core/index.py", line 164, in _supplement_index_with_system
dist_name, dist_version = context.os_distribution_name_version
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/auxlib/decorators.py", line 268, in new_fget
cache[inner_attname] = func(self)
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/base/context.py", line 863, in os_distribution_name_version
from conda._vendor.distro import id, version
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/_vendor/distro.py", line 1084, in <module>
_distro = LinuxDistribution()
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/_vendor/distro.py", line 599, in __init__
self._lsb_release_info = self._get_lsb_release_info() \
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/_vendor/distro.py", line 943, in _get_lsb_release_info
raise subprocess.CalledProcessError(code, cmd, stdout, stderr)
subprocess.CalledProcessError: Command 'lsb_release -a' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/user/miniconda3/bin/conda", line 13, in <module>
sys.exit(main())
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/cli/main.py", line 155, in main
return conda_exception_handler(_main, *args, **kwargs)
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/exceptions.py", line 1374, in conda_exception_handler
return_value = exception_handler(func, *args, **kwargs)
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/exceptions.py", line 1085, in __call__
return self.handle_exception(exc_val, exc_tb)
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/exceptions.py", line 1129, in handle_exception
return self.handle_unexpected_exception(exc_val, exc_tb)
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/exceptions.py", line 1144, in handle_unexpected_exception
self._execute_upload(error_report)
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/exceptions.py", line 1306, in _execute_upload
'User-Agent': self.user_agent,
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/exceptions.py", line 1104, in user_agent
return context.user_agent
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/auxlib/decorators.py", line 268, in new_fget
cache[inner_attname] = func(self)
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/base/context.py", line 805, in user_agent
builder.append("%s/%s" % self.os_distribution_name_version)
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/auxlib/decorators.py", line 268, in new_fget
cache[inner_attname] = func(self)
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/base/context.py", line 863, in os_distribution_name_version
from conda._vendor.distro import id, version
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/_vendor/distro.py", line 1084, in <module>
_distro = LinuxDistribution()
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/_vendor/distro.py", line 599, in __init__
self._lsb_release_info = self._get_lsb_release_info() \
File "/home/user/miniconda3/lib/python3.9/site-packages/conda/_vendor/distro.py", line 943, in _get_lsb_release_info
raise subprocess.CalledProcessError(code, cmd, stdout, stderr)
subprocess.CalledProcessError: Command 'lsb_release -a' returned non-zero exit status 1.
Hi, I keep getting this ERROR REPORT for any conda command (conda info, conda install, and so on).
I removed and reinstalled Anaconda and got the same error, so I removed everything and installed Miniconda, then reinstalled Miniconda, and finally also reinstalled Ubuntu.
None of the above clearly solved the problem; the ERROR REPORT comes and goes.
Any help would be appreciated.
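Every traceback above ends in the same place: conda runs lsb_release -a and that command exits with status 1. A small diagnostic sketch, assuming Python 3.7 or newer, that runs the same command outside conda and shows its exit code and stderr; the function name run_lsb_release is hypothetical.

```python
import subprocess


def run_lsb_release():
    """Run `lsb_release -a` and return (exit_code, stdout, stderr).

    exit_code is None when the command could not be started or timed out.
    """
    try:
        proc = subprocess.run(
            ["lsb_release", "-a"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except OSError as exc:
        # Covers a missing binary (FileNotFoundError) and permission problems.
        return None, "", str(exc)
    except subprocess.TimeoutExpired:
        return None, "", "lsb_release timed out"
    return proc.returncode, proc.stdout, proc.stderr


code, out, err = run_lsb_release()
print("exit code:", code)
print("stderr:", err.strip())
```

If the exit code printed here is also non-zero, the stderr text should say why lsb_release itself is failing, independent of conda.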

Airflow's `CreateEMRJobFlowOperator` Doesn't Have Configuration For `AutoTerminationPolicy`

Description
I'm currently hitting this error:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1138, in _run_raw_task
self._prepare_and_execute_task_with_callbacks(context, task)
File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1311, in _prepare_and_execute_task_with_callbacks
result = self._execute_task(context, task_copy)
File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1341, in _execute_task
result = task_copy.execute(context=context)
File "/usr/local/airflow/dags/plugins/operators/shippo_emr_operators.py", line 133, in execute
return super().execute(context)
File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/providers/amazon/aws/operators/emr_create_job_flow.py", line 81, in execute
response = emr.create_job_flow(job_flow_overrides)
File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/emr.py", line 88, in create_job_flow
response = self.get_conn().run_job_flow(**config)
File "/usr/local/airflow/.local/lib/python3.7/site-packages/botocore/client.py", line 357, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/usr/local/airflow/.local/lib/python3.7/site-packages/botocore/client.py", line 649, in _make_api_call
api_params, operation_model, context=request_context)
File "/usr/local/airflow/.local/lib/python3.7/site-packages/botocore/client.py", line 697, in _convert_to_request_dict
api_params, operation_model)
File "/usr/local/airflow/.local/lib/python3.7/site-packages/botocore/validate.py", line 293, in serialize_to_request
raise ParamValidationError(report=report.generate_report())
botocore.exceptions.ParamValidationError: Parameter validation failed:
Unknown parameter in input: "AutoTerminationPolicy", must be one of: Name, LogUri, LogEncryptionKmsKeyId, AdditionalInfo, AmiVersion, ReleaseLabel, Instances, Steps, BootstrapActions, SupportedProducts, NewSupportedProducts, Applications, Configurations, VisibleToAllUsers, JobFlowRole, ServiceRole, Tags, SecurityConfiguration, AutoScalingRole, ScaleDownBehavior, CustomAmiId, EbsRootVolumeSize, RepoUpgradeOnBoot, KerberosAttributes, StepConcurrencyLevel, ManagedScalingPolicy, PlacementGroupConfigs
This happens after adding AutoTerminationPolicy to job_flow_overrides:
EmrCreateOrUseJobFlowOperator(
    task_id=name,
    job_flow_overrides={"AutoTerminationPolicy": 60},
    aws_conn_id=aws_conn_id,
    emr_conn_id=emr_conn_id,
    dag=dag,
)
I can clearly see that AutoTerminationPolicy is a configurable value according to AWS's botocore library for the version of botocore that we're running.
Does anyone understand why I'm hitting this validation error?
Package Versions
boto3==1.17.54
boto==2.49.0
botocore==1.20.54
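The error message itself lists the parameter names that the installed botocore accepts for this call, and AutoTerminationPolicy is not among them. A minimal sketch of that kind of name check, using the list copied from the traceback above. It only illustrates the comparison and is not botocore's validation code; unknown_parameters is a hypothetical helper.

```python
# Parameter names copied from the "must be one of" list in the error message.
ALLOWED = {
    "Name", "LogUri", "LogEncryptionKmsKeyId", "AdditionalInfo", "AmiVersion",
    "ReleaseLabel", "Instances", "Steps", "BootstrapActions",
    "SupportedProducts", "NewSupportedProducts", "Applications",
    "Configurations", "VisibleToAllUsers", "JobFlowRole", "ServiceRole",
    "Tags", "SecurityConfiguration", "AutoScalingRole", "ScaleDownBehavior",
    "CustomAmiId", "EbsRootVolumeSize", "RepoUpgradeOnBoot",
    "KerberosAttributes", "StepConcurrencyLevel", "ManagedScalingPolicy",
    "PlacementGroupConfigs",
}


def unknown_parameters(overrides):
    """Return the top-level keys that are not in the accepted list."""
    return sorted(set(overrides) - ALLOWED)


# "Name" is accepted; "AutoTerminationPolicy" is reported as unknown.
overrides = {"Name": "demo", "AutoTerminationPolicy": 60}
print(unknown_parameters(overrides))
```

So the rejection comes from the set of names known to the botocore that is actually loaded at runtime, which is worth comparing against the version whose documentation lists the parameter.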

Python Redis - RuntimeError pubsub connection not set

Version: redis-py=3.1.0 and redis=3.2.10
Platform: Python 2.7.5 / CentOS Linux release 7.4.1708 (Core)
Infrastructure:
Two machines (worker1, worker2) running celery worker services with the default concurrency (=8).
One dedicated machine (redis1) running the redis server.
Issue:
After the workers have been running for some time, the worker on machine1 suddenly dies because of a RuntimeError raised after it loses its pubsub connection.
machine1.worker.log
[2019-02-01 13:43:39,477: CRITICAL/MainProcess] Unrecoverable error: RuntimeError(u'pubsub connection not set: did you forget to call subscribe() or psubscribe()?',)
Traceback (most recent call last):
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/worker/worker.py", line 205, in start
self.blueprint.start(self)
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/bootsteps.py", line 119, in start
step.start(parent)
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/bootsteps.py", line 369, in start
return self.obj.start()
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/worker/consumer/consumer.py", line 322, in start
blueprint.start(self)
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/bootsteps.py", line 119, in start
step.start(parent)
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/worker/consumer/consumer.py", line 598, in start
c.loop(*c.loop_args())
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/worker/loops.py", line 91, in asynloop
next(loop)
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/asynchronous/hub.py", line 354, in create_loop
cb(*cbargs)
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/transport/redis.py", line 1047, in on_readable
self.cycle.on_readable(fileno)
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/transport/redis.py", line 344, in on_readable
chan.handlers[type]()
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/transport/redis.py", line 674, in _receive
ret.append(self._receive_one(c))
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/transport/redis.py", line 685, in _receive_one
response = c.parse_response()
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/redis/client.py", line 3032, in parse_response
'pubsub connection not set: '
RuntimeError: pubsub connection not set: did you forget to call subscribe() or psubscribe()?
At the same time, I noticed that the worker running on machine2 was also unable to connect to redis. It eventually recovered, reconnected to redis, and resumed receiving the queued tasks.
machine2.worker.log
[2019-02-01 14:43:41,722: WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/worker/consumer/consumer.py", line 322, in start
blueprint.start(self)
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/bootsteps.py", line 119, in start
step.start(parent)
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/worker/consumer/mingle.py", line 40, in start
self.sync(c)
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/worker/consumer/mingle.py", line 44, in sync
replies = self.send_hello(c)
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/worker/consumer/mingle.py", line 57, in send_hello
replies = inspect.hello(c.hostname, our_revoked._data) or {}
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/app/control.py", line 143, in hello
return self._request('hello', from_node=from_node, revoked=revoked)
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/app/control.py", line 95, in _request
timeout=self.timeout, reply=True,
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/app/control.py", line 454, in broadcast
limit, callback, channel=channel,
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/pidbox.py", line 315, in _broadcast
serializer=serializer)
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/pidbox.py", line 290, in _publish
serializer=serializer,
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/messaging.py", line 181, in publish
exchange_name, declare,
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/messaging.py", line 203, in _publish
mandatory=mandatory, immediate=immediate,
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/transport/virtual/base.py", line 605, in basic_publish
message, exchange, routing_key, **kwargs
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/transport/virtual/exchange.py", line 151, in deliver
exchange, message, routing_key, **kwargs)
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/transport/redis.py", line 781, in _put_fanout
dumps(message),
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/redis/client.py", line 2716, in publish
return self.execute_command('PUBLISH', channel, message)
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/redis/client.py", line 775, in execute_command
return self.parse_response(connection, command_name, **options)
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/redis/client.py", line 789, in parse_response
response = connection.read_response()
File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/redis/connection.py", line 636, in read_response
raise e
ConnectionError: Error while reading from socket: (u'Connection closed by server.',)
[2019-02-01 14:44:50,237: INFO/MainProcess] Received task: c1_cip_middleware.tasks.validate_purchases.run_purchases_validation[a411dd90-ab50-4101-becb-90adda3663a2]
Questions:
Under what circumstances is this RuntimeError raised, so that the worker enters an unrecoverable state and must be stopped?
I am unsure what the root cause could be, especially since one worker managed to recover while the other one just died.

Celery worker can't complete job when running as daemon

I have a celery setup with Django and redis.
When I run celery from the command line as a user, for example celery multi start 123_work -A 123 --pidfile="/var/log/celery/%n.pid" --logfile="/var/log/celery/%n.log" --workdir="/data/ports/dj_dois" --loglevel=INFO, the jobs work fine, but if I run celery via celeryd or supervisor, some jobs give me an error:
[2015-12-28 09:10:59,229: ERROR/MainProcess] Unrecoverable error: UnpicklingError('NEWOBJ class argument has NULL tp_new',)
Traceback (most recent call last):
File "/usr/local/lib/python3.4/dist-packages/celery/worker/__init__.py", line 206, in start
self.blueprint.start(self)
File "/usr/local/lib/python3.4/dist-packages/celery/bootsteps.py", line 123, in start
step.start(parent)
File "/usr/local/lib/python3.4/dist-packages/celery/bootsteps.py", line 374, in start
return self.obj.start()
File "/usr/local/lib/python3.4/dist-packages/celery/worker/consumer.py", line 278, in start
blueprint.start(self)
File "/usr/local/lib/python3.4/dist-packages/celery/bootsteps.py", line 123, in start
step.start(parent)
File "/usr/local/lib/python3.4/dist-packages/celery/worker/consumer.py", line 821, in start
c.loop(*c.loop_args())
File "/usr/local/lib/python3.4/dist-packages/celery/worker/loops.py", line 76, in asynloop
next(loop)
File "/usr/local/lib/python3.4/dist-packages/kombu/async/hub.py", line 328, in create_loop
next(cb)
File "/usr/local/lib/python3.4/dist-packages/celery/concurrency/asynpool.py", line 258, in _recv_message
message = load(bufv)
_pickle.UnpicklingError: NEWOBJ class argument has NULL tp_new
[2015-12-28 09:10:59,317: ERROR/MainProcess] Task db_select_task[dd5af67d-6bbe-49bb-8f13-59d0a0a9717b] raised unexpected: WorkerLostError('Worker exited prematurely: exitcode 0.',)
Traceback (most recent call last):
File "/usr/local/lib/python3.4/dist-packages/celery/worker/__init__.py", line 206, in start
self.blueprint.start(self)
File "/usr/local/lib/python3.4/dist-packages/celery/bootsteps.py", line 123, in start
step.start(parent)
File "/usr/local/lib/python3.4/dist-packages/celery/bootsteps.py", line 374, in start
return self.obj.start()
File "/usr/local/lib/python3.4/dist-packages/celery/worker/consumer.py", line 278, in start
blueprint.start(self)
File "/usr/local/lib/python3.4/dist-packages/celery/bootsteps.py", line 123, in start
step.start(parent)
File "/usr/local/lib/python3.4/dist-packages/celery/worker/consumer.py", line 821, in start
c.loop(*c.loop_args())
File "/usr/local/lib/python3.4/dist-packages/celery/worker/loops.py", line 76, in asynloop
next(loop)
File "/usr/local/lib/python3.4/dist-packages/kombu/async/hub.py", line 328, in create_loop
next(cb)
File "/usr/local/lib/python3.4/dist-packages/celery/concurrency/asynpool.py", line 258, in _recv_message
message = load(bufv)
_pickle.UnpicklingError: NEWOBJ class argument has NULL tp_new
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.4/dist-packages/billiard/pool.py", line 1175, in mark_as_worker_lost
human_status(exitcode)),
billiard.exceptions.WorkerLostError: Worker exited prematurely: exitcode 0.
My celery version:
software -> celery:3.1.19 (Cipater) kombu:3.0.32 py:3.4.2
billiard:3.3.0.22 py-amqp:1.4.8
platform -> system:Linux arch:64bit, ELF imp:CPython
loader -> celery.loaders.default.Loader
settings -> transport:amqp results:disabled
Python - 3.4
Django - 1.8.7
Redis server v=2.8.17
Example of a job that gives me an error:
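# Imports assumed from usage; they were not shown in the original post.
import cx_Oracle
from celery import shared_task
from sqlalchemy import pool

@shared_task(name='db_select_task')
def db_select_task(arg1, arg2):
    # Wrap the cx_Oracle module in a connection pool.
    conn_pool = pool.manage(cx_Oracle)
    db = conn_pool.connect("user/pass@db")
    try:
        cursor = db.cursor()
        ports = {}
        t = tech
        cursor.execute("sql")
        data = cursor.fetchall()
    except Exception:
        return ('Error: with db')
    finally:
        cursor.close()
        db.close()
    return data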
The problem was with the Oracle paths for the celeryd daemon. Just add the additional exports to the celeryd config.
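A quick way to confirm this kind of difference is to check, from inside the daemonized worker, whether the expected environment variables are set. The sketch below assumes the variable names ORACLE_HOME and LD_LIBRARY_PATH; the answer above does not say which exports were missing, so treat those names and the helper missing_env_vars as hypothetical.

```python
import os

# Assumed variable names; the original answer does not list the exports.
ASSUMED_ORACLE_VARS = ("ORACLE_HOME", "LD_LIBRARY_PATH")


def missing_env_vars(names=ASSUMED_ORACLE_VARS, env=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Simulated daemon environment in which only ORACLE_HOME was exported.
daemon_env = {"ORACLE_HOME": "/opt/oracle"}
print(missing_env_vars(env=daemon_env))
```

Calling missing_env_vars() with no arguments inside a task and logging the result shows what the daemon actually sees, which can differ from an interactive shell.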
