I'm using Django 1.4.3 with Python 2.7, Celery 3.0.1 and django-celery 3.0.17 on Ubuntu 13.04.
I have some tasks set up to run time-consuming processes. If I queue them with Celery they do not behave properly; if I run them without queuing, everything behaves perfectly. Any thoughts as to why this would be the case?
To provide some motivation for my problem: I need to clone company contracts. Each contract has multiple offers associated with it, each offer has multiple offer fields, and each offer field has multiple values. I need to clone everything.
Here is an example of what I'm doing.
def clone_contract(self, contract_id, contract_name):
    old_contract = models.Contract.objects.get(pk=contract_id)
    contract_dict = dict()
    for attr in old_contract._meta.fields:
        contract_dict[attr.name] = getattr(old_contract, attr.name)
    del contract_dict['id']
    contract_dict['name'] = contract_name
    new_contract = contracts_models.Contract(**contract_dict)
    new_contract.save()
    contracts_tasks.clone_offers.delay(new_contract, old_contract)
@task(name='Clone Offers')
def clone_offers(new_contract, old_contract):
    for offer in old_contract.offer_set.all():
        offer_dict = dict()
        for attr in offer._meta.fields:
            offer_dict[attr.name] = getattr(offer, attr.name)
        del offer_dict['id']
        del offer_dict['contract']
        offer_dict['contract_id'] = new_contract.pk
        new_offer = contracts_models.Offer(**offer_dict)
        new_offer.save()
        clone_offer_fields(new_offer, offer)
def clone_offer_fields(new_offer, old_offer):
    offer_fields = models.OfferField.objects.filter(offer=old_offer)
    for offer_field in offer_fields:
        initial = dict()
        for attr in offer_field._meta.fields:
            initial[attr.name] = getattr(offer_field, attr.name)
        initial['offer'] = new_offer
        del initial['id']
        new_offer_field = contracts_models.OfferField(**initial)
        new_offer_field.save()
        model = models.OfferFieldValue
        values = model.objects.filter(**{'field': offer_field})
        clone_model(new_offer_field, model, 'field', values)
def clone_model(new_obj, model, fk_name, values):
    for value in values:
        initial = dict()
        for attr in value._meta.fields:
            initial[attr.name] = getattr(value, attr.name)
        del initial['id']
        initial[fk_name] = new_obj
        new_value = model(**initial)
        new_value.save()
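As an aside (the fix that actually worked for me is in EDIT1 below, but this is a standard Celery recommendation): task arguments are serialized when a task is queued, so passing Django model instances can hand the worker stale state. Passing primary keys and re-fetching inside the task is the usual pattern; a sketch against the code above:

# Sketch: pass primary keys instead of model instances to .delay()
contracts_tasks.clone_offers.delay(new_contract.pk, old_contract.pk)

@task(name='Clone Offers')
def clone_offers(new_contract_id, old_contract_id):
    # Re-fetch inside the task so the worker sees current database state
    new_contract = contracts_models.Contract.objects.get(pk=new_contract_id)
    old_contract = models.Contract.objects.get(pk=old_contract_id)
    ...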
From what I've observed, clone_offers works but clone_offer_fields does not, and again, only when clone_offers is called as clone_offers.delay(). If I call clone_offers without .delay() (not queuing it), everything works perfectly.
Unfortunately I'm unable to log from queued tasks (nothing seems to be written to the log file), so I can't troubleshoot within the code.
Is there an issue calling functions within a queued task? I'm pretty sure I've done this before with no problems. (Edit: Answered below)
Any suggestions would be greatly appreciated.
EDIT1:
I decided to test this by throwing all the methods together into one. I was 99% sure this wouldn't be the problem, but thought it'd be better to check to make sure. It makes no difference if I use a single massive method.
The problem involved Celery hijacking the default logging. I implemented the solution given in Django Celery Logging Best Practice.
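For reference, a minimal sketch of one common form of that fix (my paraphrase of the linked answer, assuming Celery 3.0; CELERYD_HIJACK_ROOT_LOGGER and get_task_logger are the relevant pieces):

# settings.py: keep Celery from replacing the root logger's handlers
CELERYD_HIJACK_ROOT_LOGGER = False

# tasks.py: use Celery's per-task logger so messages reach the worker log
from celery import task
from celery.utils.log import get_task_logger

logger = get_task_logger(__name__)

@task(name='Clone Offers')
def clone_offers(new_contract, old_contract):
    logger.info('Cloning offers for contract %s', new_contract.pk)
    ...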
I have a Flyte task function like this:
@task
def do_stuff(framework_obj):
    framework_obj.get_outputs()  # This calls Types.Blob.fetch(some_uri)
Trying to load a blob URI using flytekit.sdk.types.Types.Blob.fetch, but getting this error:
ERROR:flytekit: Exception when executing No temporary file system is present. Either call this method from within the context of a task or surround with a 'with LocalTestFileSystem():' block. Or specify a path when calling this function. Note: Cleanup is not automatic when a path is specified.
I can confirm I can load blobs using with LocalTestFileSystem(): in tests, but when actually trying to run a workflow I'm not sure why I'm getting this error, as the function that calls the blob processing is decorated with @task, so it's definitely a Flyte task. I also confirmed that the task node exists on the Flyte web console.
What path is the error referencing and how do I call this function appropriately?
Using Flyte Version 0.16.2
Could you please give a bit more information about the code? Is this flytekit version 0.15.x? I'm a bit confused, since that version shouldn't have the @task decorator; it should only have @python_task, which is an older API. If you want to use the new Python native typing API you should install flytekit==0.17.0 instead.
Also, could you point to the documentation you're looking at? We've updated the docs a fair amount recently, so maybe there's some confusion around that. These are the examples worth looking at. There are also two new Python classes, FlyteFile and FlyteDirectory, that have replaced the Blob class in flytekit (though Blob remains what the IDL type is called).
(I would've left this as a comment but I don't have the reputation yet.)
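To illustrate the newer API (a sketch of my own, assuming flytekit 0.17+; not from the original post): a task can take a FlyteFile input, which is os.PathLike and downloads lazily when opened:

from flytekit import task
from flytekit.types.file import FlyteFile

@task
def read_file(f: FlyteFile) -> str:
    # Opening the FlyteFile triggers the download of the underlying blob
    with open(f, 'r') as fh:
        return fh.read()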
Some code to help with fetching outputs and reading from a file output
# NOTE: the imports below are my best guess for flytekit ~0.17;
# the original snippet omitted them.
from flytekit import task
from flytekit.clients.friendly import SynchronousFlyteClient
from flytekit.core.context_manager import FlyteContext
from flytekit.core.type_engine import TypeEngine
from flytekit.models.core.identifier import WorkflowExecutionIdentifier
from flytekit.types.file import FlyteFile

@task
def task_file_reader():
    client = SynchronousFlyteClient("flyteadmin.flyte.svc.cluster.local:81", insecure=True)
    exec_id = WorkflowExecutionIdentifier(
        domain="development",
        project="flytesnacks",
        name="iaok0qy6k1",
    )
    data = client.get_execution_data(exec_id)
    lit = data.full_outputs.literals["o0"]
    ctx = FlyteContext.current_context()
    ff = TypeEngine.to_python_value(ctx, lv=lit,
                                    expected_python_type=FlyteFile)
    with open(ff, 'rb') as fh:
        print(fh.readlines())
I raised a feature request on the CDK GitHub account recently and was pointed in the direction of core.Token as being pretty much the exact functionality I was looking for. I'm now having some issues implementing it and getting similar errors. Here's the feature request I raised previously: https://github.com/aws/aws-cdk/issues/3800
So my current code looks something like this:
fargate_service = ecs_patterns.LoadBalancedFargateService(
    self, "Fargate",
    cluster=cluster,
    memory_limit_mib=core.Token.as_number(ssm.StringParameter.value_from_lookup(self, parameter_name='template-service-memory_limit')),
    execution_role=fargate_iam_role,
    container_port=core.Token.as_number(ssm.StringParameter.value_from_lookup(self, parameter_name='port')),
    cpu=core.Token.as_number(ssm.StringParameter.value_from_lookup(self, parameter_name='template-service-container_cpu')),
    image=ecs.ContainerImage.from_registry(ecrRepo)
)
When I try to synthesise this code I get the following error:
jsii.errors.JavaScriptError:
Error: Resolution error: Supplied properties not correct for "CfnSecurityGroupEgressProps"
fromPort: "dummy-value-for-template-service-container_port" should be a number
toPort: "dummy-value-for-template-service-container_port" should be a number.
Object creation stack:
To me it seems to get past the validation requiring a number to be passed into the FargateService, but when it then tries to create the dependent resources ("CfnSecurityGroupEgressProps") it can't resolve the dummy string as a number. I'd appreciate any help solving this, or alternative suggestions for passing in values from AWS Systems Manager parameters instead (I thought it might be possible to parse the values in here via a file pulled from S3 during the build pipeline, or something along those lines, but that seems hacky).
With some help I think we've cracked this!
The problem was that I was passing ssm.StringParameter.value_from_lookup; the solution is to provide the token with ssm.StringParameter.value_for_string_parameter. When this is synthesised it stores the token, and then upon deployment the value stored in Systems Manager Parameter Store is substituted.
(We also came up with another approach for achieving something similar, which we're probably going to use over the SSM approach; I've detailed it below the code snippet if you're interested.)
See the complete code below:
from aws_cdk import (
    aws_ec2 as ec2,
    aws_ssm as ssm,
    aws_iam as iam,
    aws_ecs as ecs,
    aws_ecs_patterns as ecs_patterns,
    core,
)

class GenericFargateService(core.Stack):
    def __init__(self, scope: core.Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        containerPort = core.Token.as_number(ssm.StringParameter.value_for_string_parameter(
            self, 'template-service-container_port'))

        vpc = ec2.Vpc(
            self, "cdk-test-vpc",
            max_azs=2
        )

        cluster = ecs.Cluster(
            self, 'cluster',
            vpc=vpc
        )

        fargate_iam_role = iam.Role(
            self, "execution_role",
            assumed_by=iam.ServicePrincipal("ecs-tasks"),
            managed_policies=[iam.ManagedPolicy.from_aws_managed_policy_name("AmazonEC2ContainerRegistryFullAccess")]
        )

        fargate_service = ecs_patterns.LoadBalancedFargateService(
            self, "Fargate",
            cluster=cluster,
            memory_limit_mib=1024,
            execution_role=fargate_iam_role,
            container_port=containerPort,
            cpu=512,
            image=ecs.ContainerImage.from_registry("000000000000.dkr.ecr.eu-west-1.amazonaws.com/template-service-ecr")
        )

        fargate_service.target_group.configure_health_check(
            path=self.node.try_get_context("health_check_path"), port="9000")

app = core.App()
GenericFargateService(app, "generic-fargate-service", env={'account': '000000000000', 'region': 'eu-west-1'})
app.synth()
Solutions to problems are like buses: apparently you spend ages waiting for one and then two arrive together. And I think this new bus is the option we're probably going to run with.
The plan is to have developers provide an override for the cdk.json file within their code repos, which can then be passed into the CDK pipeline where the generic code will be synthesised. This file will contain some "context"; the context will then be used within the CDK to set our variables for the LoadBalancedFargate service.
I've included some code snippets for setting cdk.json file and then using its values within code below.
Example cdk.json:
{
    "app": "python3 app.py",
    "context": {
        "container_name": "template-service",
        "memory_limit": 1024,
        "container_cpu": 512,
        "health_check_path": "/gb/template/v1/status",
        "ecr_repo": "000000000000.dkr.ecr.eu-west-1.amazonaws.com/template-service-ecr"
    }
}
Python example for assigning context to variables:
memoryLimitMib = self.node.try_get_context("memory_limit")
I believe we could also assign some default values for anything not provided by the developer in their cdk.json file; since try_get_context returns None for missing keys, a simple fallback works, as sketched below.
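A minimal sketch of that fallback (my illustration; try_get_context returns None for missing keys, so no try/except is actually needed):

# Fall back to defaults when the developer's cdk.json omits a key
memory_limit = self.node.try_get_context("memory_limit") or 1024
container_cpu = self.node.try_get_context("container_cpu") or 512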
I hope this post has provided some useful information to those looking for ways to create a generic template for deploying CDK code! I don't know if we're doing the right thing here, but this tool is so new that it feels like some common patterns don't exist yet.
I have created a pretty simple MapReduce pipeline, but I am getting a cryptic error:
PipelineSetupError: Error starting production.cron.pipelines.ItemsInfoPipeline(*(), **{})#a741186284ed4fb8a4cd06e38921beff:
when I try to start it. This is the pipeline code:
class ItemsInfoPipeline(base_handler.PipelineBase):
    """
    """
    def run(self):
        output = yield mapreduce_pipeline.MapreducePipeline(
            job_name="items_job",
            mapper_spec="production.cron.mappers.items_info_mapper",
            input_reader_spec="mapreduce.input_readers.DatastoreInputReader",
            mapper_params={
                "input_reader": {
                    "entity_kind": "production.models.Transaction"
                }
            }
        )
        yield ItemsInfoStorePipeline(output)


class ItemsInfoStorePipeline(base_handler.PipelineBase):
    """
    """
    def run(self, statistics):
        print statistics
        return "OK"
Of course I have double-checked that the mapper path is right. Also note that ItemsInfoStorePipeline does not do anything yet, because I am focusing on getting the pipeline started, which is not happening.
It is all triggered by a Flask view, the following:
class ItemsInfoMRJob(views.MethodView):
    """
    It's based on transactions.
    """
    def get(self):
        """
        :return:
        """
        pipeline = ItemsInfoPipeline()
        pipeline.start()
        redirect_url = "%s/status?root=%s" % (pipeline.base_path, pipeline.pipeline_id)
        return flask.redirect(redirect_url)
I am using GoogleAppEngineMapReduce==1.9.22.0
Thanks for any help.
UPDATE
The above code works once deployed.
UPDATE 2
Apparently more people are dealing with this:
https://github.com/GoogleCloudPlatform/appengine-mapreduce/issues/103
I'm updating this. I have a code base that uses pipelines and works fine on OSX. I had another developer on OSX for whom nothing I do seems to get it working; he gets this:
Encountered unexpected error from ProtoRPC method implementation: PipelineSetupError
I've tried swapping versions around and making our machines match perfectly, and it continues to happen. I finally broke down and built an Ubuntu image in Docker, doing my best to match our versions of App Engine and its libraries perfectly.
It also refuses to start, with the same message. I started working through the libraries, uncommenting the part that swallows the error, but that turned into a long rabbit hole, because lots of code above it also seems to swallow whatever is going on.
I'm working on a little program that uses libsecret. This program should be able to create a Secret.Service...
from gi.repository import Secret
service = Secret.Service.get_sync(Secret.ServiceFlags.LOAD_COLLECTIONS)
... get a specific Collection from that Service...
# 2 is the index of a custom collection I created, not the default one.
collection = service.get_collections()[2]
... and then list all the Items inside that Collection, here by just printing their labels.
# I'm just printing a single label here for testing, I'd need all of course.
print(collection.get_items()[0].get_label())
An important detail is that the Collection may initially be locked, so I need to include code that checks for that possibility and tries to unlock the Collection.
# The unlock method returns a tuple (number_of_objs_unlocked, [list_of_objs])
collection = service.unlock_sync([collection])[1][0]
This is important because the code I currently have can do all I need when the Collection is initially unlocked. However, if the Collection is initially locked, even after I unlock it I can't get the labels from the Items inside. What I can do is disconnect() the Service, recreate the Service, and get the now unlocked Collection; this way I am able to read the label on each Item. Another interesting detail is that, after the labels are read once, I no longer require the Service reconnection to access them. This seems quite inelegant, so I started looking for a different solution.
I realized that the Collection inherits from Gio.DBusProxy, and this class caches the data from the object it accesses. So I'm assuming that is the problem: I'm not updating the cache. This is strange, though, because the documentation states that Gio.DBusProxy should be able to detect changes on the original object, but that's not happening.
Now I don't know how to update the cache on that class. I've taken a look at some Seahorse (another application that uses libsecret) Vala code, which I wasn't able to completely decipher (I can't code Vala), but it mentioned the Object.emit() method; I'm still not sure how I could use that method to achieve my goal. From the documentation (https://lazka.github.io/pgi-docs/Secret-1/#) I found another promising method, Object.notify(), which seems to be able to send notifications of changes that would enable cache updates, but I also haven't been able to properly use it yet.
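One avenue that might be worth trying (speculation on my part, not a confirmed fix): Secret.Collection has a load_items_sync() method, which may refresh the proxy's cached item list after unlocking, without tearing down the whole Service:

# Speculative: reload the collection's items after unlocking instead of
# disconnecting and rebuilding the whole service.
collection = service.unlock_sync([collection])[1][0]
collection.load_items_sync(None)  # cancellable=None
print(collection.get_items()[0].get_label())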
I also posted on the gnome-keyring mailing list about this...
https://mail.gnome.org/archives/gnome-keyring-list/2015-November/msg00000.html
... with no answer so far, and found a bugzilla report on gnome.org that mentions this issue...
https://bugzilla.gnome.org/show_bug.cgi?id=747359
... with no solution so far (7 months) either.
So if someone could shine some light on this problem, that would be great. Otherwise some inelegant code will unfortunately find its way into my little program.
Edit-0:
Here is some code to replicate the issue in Python3.
This snippet creates a collection 'test_col', with one item 'test_item', and locks the collection. Note libsecret will prompt you for the password you want for this new collection:
#!/usr/bin/env python3
from gi import require_version
require_version('Secret', '1')
from gi.repository import Secret

# Create schema
args = ['com.idlecore.test.schema']
args += [Secret.SchemaFlags.NONE]
args += [{'service': Secret.SchemaAttributeType.STRING,
          'username': Secret.SchemaAttributeType.STRING}]
schema = Secret.Schema.new(*args)

# Create 'test_col' collection
flags = Secret.CollectionCreateFlags.COLLECTION_CREATE_NONE
collection = Secret.Collection.create_sync(None, 'test_col', None, flags, None)

# Create item 'test_item' inside collection 'test_col'
attributes = {'service': 'stackoverflow', 'username': 'xor'}
password = 'password123'
value = Secret.Value(password, len(password), 'text/plain')
flags = Secret.ItemCreateFlags.NONE
Secret.Item.create_sync(collection, schema, attributes,
                        'test_item', value, flags, None)

# Lock collection
service = collection.get_service()
service.lock_sync([collection])
Then we need to restart the gnome-keyring-daemon, you can just logout and back in or use the command line:
gnome-keyring-daemon --replace
This will set up your keyring so we can try opening a collection that is initially locked. We can do that with the following code snippet. Note that you will be prompted once again for the password you set previously:
#!/usr/bin/env python3
from gi import require_version
require_version('Secret', '1')
from gi.repository import Secret

# Get the service
service = Secret.Service.get_sync(Secret.ServiceFlags.LOAD_COLLECTIONS)

# Find the correct collection
for c in service.get_collections():
    if c.get_label() == 'test_col':
        collection = c
        break

# Unlock the collection and show the item label; note that it's empty.
collection = service.unlock_sync([collection])[1][0]
print('Item Label:', collection.get_items()[0].get_label())

# Do the same thing again, and it works.
# It's necessary to disconnect the service to clear the cache,
# otherwise we keep getting the same empty label.
service.disconnect()

# Get the service
service = Secret.Service.get_sync(Secret.ServiceFlags.LOAD_COLLECTIONS)

# Find the correct collection
for c in service.get_collections():
    if c.get_label() == 'test_col':
        collection = c
        break

# No need to unlock again, just show the item label
print('Item Label:', collection.get_items()[0].get_label())
This code attempts to read the item label twice: once the normal way, which fails (you should see an empty string), and then using a workaround that disconnects the service and reconnects again.
I came across this question while trying to update a script I use to retrieve passwords from my desktop on my laptop, and vice versa.
The clue was in the documentation for Secret.ServiceFlags; there are two:
OPEN_SESSION = 2
    establish a session for transfer of secrets while initializing the Secret.Service
LOAD_COLLECTIONS = 4
    load collections while initializing the Secret.Service
I think for a Service that both loads collections and allows transfer of secrets (including item labels) from those collections, we need to use both flags.
The following code (similar to your mailing list post, but without a temporary collection set up for debugging) seems to work. It gives me the label of an item:
from gi.repository import Secret

service = Secret.Service.get_sync(Secret.ServiceFlags.OPEN_SESSION |
                                  Secret.ServiceFlags.LOAD_COLLECTIONS)
collections = service.get_collections()
unlocked_collection = service.unlock_sync([collections[0]], None)[1][0]
unlocked_collection.get_items()[0].get_label()
I have been doing this:
print(collection.get_locked())
if collection.get_locked():
    service.unlock_sync([collection])  # unlock_sync expects a list of objects
I don't know if it is going to work, though, because I have never hit a case where I have something that is locked. If you have a piece of sample code where I can create a locked instance of a collection, then maybe I can help.
The code is like this (I'm using Flask and Flask-Cache, but this might be a general problem):
@cache.memoize(500000)
def big_foo(a, b):
    return a + b + random.randrange(0, 1000)
If I run it in a Python interpreter, I can always get the same result by calling big_foo(1, 2).
But if I add this function to the application, serve it with mod_wsgi in daemon mode, and then make requests from a browser (big_foo is called within the view function handling the request), I find the result is not the same each time.
I think the results are different each time because mod_wsgi uses multiple processes to launch the app. Each process might have its own cache, and the cache can't be shared between processes.
Is my guess right? If so, how can I assign one and only one cache for global access? If not, what is wrong with my code?
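One way to check the guess (a small diagnostic sketch, not part of the original app): log the process id next to the memoized result; if different requests print different pids, each worker process is keeping its own cache:

import os
import random

@cache.memoize(500000)
def big_foo(a, b):
    # If requests report different pids, each process has a separate cache
    print("big_foo computed in pid", os.getpid())
    return a + b + random.randrange(0, 1000)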
Following is the config used for Flask-Cache:
UPLOADS_FOLDER = "/mnt/Storage/software/x/temp/"

class RadarConfig(object):
    UPLOADS_FOLDER = UPLOADS_FOLDER
    ALLOWED_EXTENSIONS = set(['bed'])
    SECRET_KEY = "tiananmen"
    DEBUG = True
    CACHE_TYPE = 'simple'
    CACHE_DEFAULT_TIMEOUT = 5000000
    BASIS_PATH = "/mnt/Storage/software/x/NMF_RESULT//p_NMF_Nimfa_NMF_Run_30632__metasites_all"
    COEF_PATH = "/mnt/Storage/software/x/NMF_RESULT/MCF7/p_NMF_Nimfa_NMF_Run_30632__metasample_all"
    MASK_PATH = "/mnt/Storage/software/x/NMF_RESULT/dhsHG19.bed"
Here's your problem: CACHE_TYPE = 'simple'. From SimpleCache documentation:
Simple memory cache for single process environments. This class exists mainly for the development server and is not 100% thread safe.
For production, better-suited backends are memcached, redis and filesystem, since they're designed to work in concurrent environments.
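For instance, pointing Flask-Cache at a shared backend looks roughly like this (a sketch, assuming a Redis instance on the default port; CACHE_REDIS_URL is a Flask-Cache config key):

class RadarConfig(object):
    # ... other settings unchanged ...
    CACHE_TYPE = 'redis'  # one cache shared by all mod_wsgi processes
    CACHE_REDIS_URL = 'redis://localhost:6379/0'
    CACHE_DEFAULT_TIMEOUT = 5000000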