What is the best way to use conditional services in Bonobo? - python

I'd like to use Bonobo to move data from one Postgres database to another on different services. I have the connections configured and would like to use one during extraction and one during loading.
Here is my testing setup:
source_connection_config_env = 'DEV'
source_connection_config = get_config(source_connection_config_env)
target_connection_config_env = 'TRAINING'
target_connection_config = get_target_connection_config(target_connection_config_env)
...
def get_services(**options):
    if connection == 'source':
        return {
            'sqlalchemy.engine': create_postgresql_engine(**{
                'host': source_connection_config.source_postres_connection['HOST'],
                'name': source_connection_config.source_postres_connection['DATABASE'],
                'user': source_connection_config.source_postres_connection['USER'],
                'pass': source_connection_config.source_postres_connection['PASSWORD']
            })
        }
    if connection == 'target':
        return {
            'sqlalchemy.engine': create_postgresql_engine(**{
                'host': target_connection_config.target_postres_connection['HOST'],
                'name': target_connection_config.target_postres_connection['DATABASE'],
                'user': target_connection_config.target_postres_connection['USER'],
                'pass': target_connection_config.target_postres_connection['PASSWORD']
            })
        }
I'm not sure where the best place to change connections is, or how to actually go about it.
Thanks in advance!

As far as I understand, you want to use both the source and the target connection in the same graph (I hope I got this right).
In that case the conditional cannot work, because it returns only one of them.
Instead, I'd return both, named differently:
def get_services(**options):
    return {
        'engine.source': create_postgresql_engine(**{...}),
        'engine.target': create_postgresql_engine(**{...}),
    }
And then use different connections in the transformations:
graph.add_chain(
    Select(..., engine='engine.source'),
    ...,
    InsertOrUpdate(..., engine='engine.target'),
)
Note that service names are just strings; no convention or naming pattern is enforced. The 'sqlalchemy.engine' name is only the default, and you don't have to use it as long as you configure your transformations with the names you actually use.
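To make the "return both, named differently" idea concrete, here is a minimal sketch that runs without a database. The `make_engine` stand-in, the two config dictionaries, and the extra parameters on `get_services` are assumptions for illustration only; in a real project the factory would be the `create_postgresql_engine` helper from the question.

```python
def make_engine(**config):
    """Hypothetical stand-in for create_postgresql_engine.

    Returns a descriptive string instead of a real SQLAlchemy engine.
    """
    return "postgresql://{user}@{host}/{name}".format(**config)


def get_services(source_config, target_config, engine_factory=make_engine):
    # Build both engines up front and register each under its own name,
    # so every transformation can pick the one it needs by name.
    return {
        "engine.source": engine_factory(**source_config),
        "engine.target": engine_factory(**target_config),
    }


# Hypothetical connection settings, for illustration only.
source = {"host": "dev-db", "name": "app", "user": "etl", "pass": "secret"}
target = {"host": "training-db", "name": "app", "user": "etl", "pass": "secret"}

services = get_services(source, target)
for name, engine in services.items():
    print(name, "->", engine)
```

Because both entries live in the same dictionary, no conditional is needed: the choice of connection moves into each transformation's `engine=` argument.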

Related

Unable to read environment variables in Django using django_configurations package

I was using django-environ to manage env variables, everything was working fine, recently I moved to django-configurations.
My settings inherit configurations.Configuration but I am having trouble getting values from .env file. For example, while retrieving DATABASE_NAME it gives the following error:
TypeError: object of type 'Value' has no len()
I know the code below returns a values.Value instance instead of a string, but I am not sure why it does so. The same happens with every other env variable.
My .env file is as follows:
DEBUG=True
DATABASE_NAME='portfolio_v1'
SECRET_KEY='your-secrete-key'
My settings.py file is as follows:
...
from configurations import Configuration, values
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': values.Value("DATABASE_NAME", environ=True),
...
I have verified that my .env file exists and is on a valid path.
I spent more time on the above issue and found what was missing.
Prefixing .env variables is mandatory in django-configurations by default.
When dealing with dict keys, we have to pass the environ_name kwarg to the Value instance.
NOTE: .env variables must be prefixed with DJANGO_ even if you provide environ_name. To override the prefix, provide environ_prefix, e.g.:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': values.Value(environ_name="DATABASE_NAME"), # provide DJANGO_DATABASE_NAME='portfolio_v1' in .env file
Other use cases are:
VAR = values.Value() # works, provided DJANGO_VAR='var_value'
VAR = values.Value(environ_prefix='MYSITE') # works, provided MYSITE_VAR='var_value'
CUSTOM_DICT = {
    'key_1': values.Value(environ_required=True), # doesn't work
    'key_2': values.Value(environ_name='other_key'), # works if provided DJANGO_key_2='value_2' in .env
}
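The prefix rule described above can be illustrated in plain Python. This is only a sketch of the naming convention, not django-configurations code: `read_prefixed` and `fake_env` are hypothetical names, and the helper simply joins a prefix and a variable name before looking the result up in a mapping.

```python
import os


def read_prefixed(name, prefix="DJANGO", environ=None):
    """Hypothetical helper: look up PREFIX_NAME in an environment mapping."""
    environ = os.environ if environ is None else environ
    key = f"{prefix}_{name}" if prefix else name
    return environ.get(key)


# Stand-in for the variables loaded from a .env file.
fake_env = {
    "DJANGO_DATABASE_NAME": "portfolio_v1",
    "MYSITE_VAR": "var_value",
}

print(read_prefixed("DATABASE_NAME", environ=fake_env))          # default DJANGO prefix
print(read_prefixed("VAR", prefix="MYSITE", environ=fake_env))   # overridden prefix
print(read_prefixed("DATABASE_NAME", prefix="", environ=fake_env))  # no prefix: not found
```

The last call returns None because the mapping only contains the prefixed key, which mirrors why an unprefixed DATABASE_NAME entry in .env is not picked up.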
You are using django-configurations in the wrong way.
See the source code of the Value class:
class Value:
    @property
    def value(self):
        ...
    def __init__(self, default=None, environ=True, environ_name=None,
                 environ_prefix='DJANGO', environ_required=False,
                 *args, **kwargs):
        ...
So the first positional argument is the default value, not "DATABASE_NAME", and the environment variable in your .env file should start with DJANGO_.
Then, to read the value, use the value property, so your settings file should look like:
...
from configurations import Configuration, values
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': values.Value("DEFAULT_VAL").value,
        # No need for environ=True since it is the default
...

Python: Help converting a messy if-elif ladder within a loop into a dictionary switch case for creating grpc ssl certificates

I'm having trouble cleaning up some of my code.
In general terms, the Python code is meant to read two JSON files. The first JSON file includes the addresses for each microservice; the second includes a list of the services that each service communicates with. The purpose of this code is to create grpc ssl certificates for each microservice by looping over these two JSON files.
I have written the code, but it relies on if statements and is very messy, and I am struggling to clean it up using dictionaries.
Below I will list a sample of the two jsons I have described above:
ServicesAddresses.json
[
    {
        "service": "service-A",
        "address": ["localhost1"]
    },
    {
        "service": "service-B",
        "address": ["localhost2"]
    }
]
ServicesUsed.json
[
    {
        "service": "service-A",
        "services_used": ["service-B", "service-C"]
    },
    {
        "service": "service-B",
        "services_used": ["service-C", "service-D"]
    }
]
Here is the code I use to loop over the first JSON and assign the addresses to variables:
for address in addressData:
    if address["service"] == "service-A":
        addresses.serviceA = address["address"]
    elif address["service"] == "service-B":
        addresses.serviceB = address["address"]
Finally, here is the code that loops over the second JSON and generates the ssl certificates using a function called cert_create, which takes the address of each service as input:
for service in runningData:
    if service["service"] == "service-A" and service["services_used"] == ["service-B", "service-C"]:
        os.chdir('/certs/service-A/service-B')
        cert_create(str(addresses.serviceA))
        os.chdir('/certs/service-A/service-C')
        cert_create(str(addresses.serviceA))
    elif service["service"] == "service-B" and service["services_used"] == ["service-C", "service-D"]:
        os.chdir('/certs/service-B/service-C')
        cert_create(str(addresses.serviceB))
        os.chdir('/certs/service-B/service-D')
        cert_create(str(addresses.serviceB))
As you can see, with a large number of services this logic can become quite a monstrosity. The issue is that, with my limited experience of Python, I don't see how a switch statement built from dictionaries could simplify this logic while keeping the same assignments and function calls as the if statements.
Any ideas? I know this is quite basic, but I feel that in a different language such as Go or Java I would have been able to make this code a lot cleaner.
Maybe you need an additional data structure (like the dict below) or a modification of the existing addresses object, so you can get an address by name.
Of course, if your address data comes from user input, consider sanitizing/checking it.
services_addresses = {
    'service-A': str(addresses.serviceA),
    'service-B': str(addresses.serviceB),
    'service-C': str(addresses.serviceC),
}
for service in runningData:
    service_name = service["service"]
    for service_name_used in service["services_used"]:
        os.chdir(f'/certs/{service_name}/{service_name_used}')
        # here comes the difference - you need an additional dict
        cert_create( services_addresses[service_name] )
        # or a modification of the addresses object ?..
        # cert_create( addresses.getServiceAddress(service_name) )
import os
running_data = [
    {
        "service": "service-A",
        "services_used": ["service-B", "service-C"]
    },
    {
        "service": "service-B",
        "services_used": ["service-C", "service-D"]
    },
]
address_data = [
    {
        "service": "service-A",
        "address": ["localhost1"]
    },
    {
        "service": "service-B",
        "address": ["localhost2"]
    },
]
def get_service_address(service):
    return [i.get("address")[0] for i in address_data if i.get("service") == service][0]
for data in running_data:
    service = data.get("service")
    used = data.get("services_used")
    if service is None or used is None:
        raise ValueError("Missing information")
    for u in used:
        print(os.path.join("certs", service, u))
        print(get_service_address(service))
Of course, replace the prints with os.chdir and cert_create.
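The two suggestions above can be combined by building the lookup dictionary once, directly from the address list, instead of filling it in by hand or rescanning the list for every service. The following is a minimal sketch under that assumption; `address_by_service` and `plan_certificates` are hypothetical names, and the function only returns the (directory, address) pairs rather than calling os.chdir and cert_create, so it can be run safely.

```python
address_data = [
    {"service": "service-A", "address": ["localhost1"]},
    {"service": "service-B", "address": ["localhost2"]},
]
running_data = [
    {"service": "service-A", "services_used": ["service-B", "service-C"]},
    {"service": "service-B", "services_used": ["service-C", "service-D"]},
]

# One pass over the address list gives direct lookups by service name.
# Like the answer above, this takes the first entry of each address list.
address_by_service = {
    entry["service"]: entry["address"][0] for entry in address_data
}


def plan_certificates(running, addresses, root="/certs"):
    """Return one (directory, address) pair per certificate to create."""
    plan = []
    for entry in running:
        name = entry["service"]
        for used in entry["services_used"]:
            plan.append((f"{root}/{name}/{used}", addresses[name]))
    return plan


certificate_plan = plan_certificates(running_data, address_by_service)
for directory, address in certificate_plan:
    print(directory, address)
```

In the real script, each pair would be consumed with os.chdir(directory) followed by cert_create(address). Adding a service then only requires new JSON entries, not new branches.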

pysaml2 - AuthnContextClassRef, PasswordProtectedTransport

I am struggling to understand how to configure pysaml2 and add the AuthnContext to my request.
I have an SP and need to add the following element to the request when the client performs the login request:
<samlp:RequestedAuthnContext>
    <saml:AuthnContextClassRef>
        urn:oasis:names:tc:SAML:2.0:ac:classes:PasswordProtectedTransport
    </saml:AuthnContextClassRef>
</samlp:RequestedAuthnContext>
I have tried everything I could think of, and I believe it is possible to add this to my requests, because in https://github.com/IdentityPython/pysaml2/blob/master/src/saml2/samlp.py
I can see:
AUTHN_PASSWORD = "urn:oasis:names:tc:SAML:2.0:ac:classes:Password"
AUTHN_PASSWORD_PROTECTED = \
    "urn:oasis:names:tc:SAML:2.0:ac:classes:PasswordProtectedTransport"
I just do not know how to reference that. I have a simple configuration like this:
"service": {
    "sp": {
        "name": "BLABLA",
        "allow_unsolicited": true,
        "want_response_signed": false,
        "logout_requests_signed": true,
        "endpoints": {
            "assertion_consumer_service": ["https://mywebste..."],
            "single_logout_service": [["https://mywebste...", "urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect"]]
        },
        "requestedAuthnContext" : true
    }
}
Anyone know how to add the above config?
I struggle to understand how to build the config dictionary, even by reading their docs. Any ideas?
I am happy to add the "PasswordProtectedTransport" directly in the code if the config does not allow that.. But I am not sure how to do it.
Thanks,
R
At some point your client calls create_authn_request(...) (or prepare_for_authenticate(...), or prepare_for_negotiated_authenticate(...)).
You should pass the extra argument requested_authn_context.
The requested_authn_context is an object of type saml2.samlp.RequestedAuthnContext that contains the wanted AuthnContextClassRef.
...
from saml2.saml import AUTHN_PASSWORD_PROTECTED
from saml2.saml import AuthnContextClassRef
from saml2.samlp import RequestedAuthnContext
requested_authn_context = RequestedAuthnContext(
    authn_context_class_ref=[
        AuthnContextClassRef(AUTHN_PASSWORD_PROTECTED),
    ],
    comparison="exact",
)
req_id, request = create_authn_request(
    ...,
    requested_authn_context=requested_authn_context,
)

Organizing my config variable for webapp2

For simplicity, I think I need to rewrite this as a single statement:
config = {'webapp2_extras.jinja2': {'template_path': 'templates',
                                    'filters': {
                                        'timesince': filters.timesince,
                                        'datetimeformat': filters.datetimeformat},
                                    'environment_args': {'extensions': ['jinja2.ext.i18n']}}}
config['webapp2_extras.sessions'] = \
    {'secret_key': 'my-secret-key'}
Then I want to know where to put it if I use multiple files with multiple request handlers. Should I just put it in one file and import it into the others? Since the session key is secret, what is your recommendation for handling it in source control? Should I always change the secret before or after committing?
Thank you
Just add 'webapp2_extras.sessions' to your dict initializer:
config = {'webapp2_extras.jinja2': {'template_path': 'templates',
                                    'filters': {
                                        'timesince': filters.timesince,
                                        'datetimeformat': filters.datetimeformat},
                                    'environment_args': {'extensions': ['jinja2.ext.i18n']}},
          'webapp2_extras.sessions': {'secret_key': 'my-secret-key'}}
This would be clearer if the nesting were explicit, though:
config = {
    'webapp2_extras.jinja2': {
        'template_path': 'templates',
        'filters': {
            'timesince': filters.timesince,
            'datetimeformat': filters.datetimeformat
        },
        'environment_args': {'extensions': ['jinja2.ext.i18n']},
    },
    'webapp2_extras.sessions': {'secret_key': 'my-secret-key'}
}
I would recommend storing those in a datastore Entity for more flexibility and caching them in the instance memory at startup.
You could also consider having a config.py file excluded from the source control, if you want to get things done quickly.
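One way the "config.py excluded from source control" suggestion might look is sketched below. The module name `local_config`, the `SECRET_KEY` attribute, and the `load_secret_key` helper are all assumptions for illustration; the idea is only that the tracked code imports the secret from an untracked module and falls back to a placeholder when that module is absent.

```python
import importlib


def load_secret_key(module_name="local_config", default="change-me"):
    """Read SECRET_KEY from an untracked module, falling back to a default.

    `module_name` is a hypothetical file (e.g. local_config.py) that is
    listed in the ignore file of the version control system.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return default
    return getattr(module, "SECRET_KEY", default)


# Only the sessions entry is shown; the jinja2 entry stays as in the answer.
config = {
    "webapp2_extras.sessions": {"secret_key": load_secret_key()},
}
print(sorted(config))
```

With this layout the shared `config` dictionary can live in one tracked module that the request handler files import, while the real key never enters the repository.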

Changing how returned output is displayed in Django

I recently started learning Python/Django, and in an attempt to speed up my learning and at the same time do something constructive, I've started my own personal project.
I've got the latest Django/Python/Jinja2 installed together with the Python Battle.net API egg.
Currently I'm querying for a "character" and I'm trying to change the output of a returned value. Here's the function from my views:
def viewCharacter(request, name):
    character = get_object_or_404(Member, name=name)
    info = Character('EU', 'Auchindoun', name, fields=[Character.GUILD])
    ctx = { 'character': character, 'info': info, 'guildname': 'Guild Name' }
    return render_to_response("roster/viewCharacter.html", ctx, request)
Now, in my template, I've tried "translating" info.class_ (which returns a numeric value) from its numeric value to a string (the class name), but I always get error messages saying that info.class_ cannot be used in if/for statements, or other errors. (I tried comparing it to a two-tuple.)
I really can't find a way to do this online, so I've come to the one place that has helped me the most in my learning process.
Any help would be most appreciated!
- Nieeru
If you really need to use a class name in the template, try using this template filter, or just look it up in the view and pass it in the context :)
Is there any reason you can't add another variable to the context like so:
ctx = { 'character': character, 'info': info, 'class': repr(info.class_), 'guildname': 'Guild Name' }
EDIT: With the additional information you provided, here is my new suggestion.
Change:
ctx = { 'name': name, 'character': character, 'info': info, 'class': repr(info.class_), 'guildname': 'Team-Potato' }
to:
ctx = { 'name': name, 'character': character, 'info': info, 'className': classnameDict[info.class_], 'guildname': 'Team-Potato' }
This simply does the class lookup in the view. Now add it to your template using:
{{ className }}
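The answer uses classnameDict without defining it. A minimal sketch of what it could look like follows; the ID-to-name pairs are illustrative assumptions and should be checked against the values the Battle.net API actually returns, and `class_name_for` is a hypothetical helper that avoids a KeyError for unknown IDs.

```python
# Illustrative mapping from numeric class IDs to display names (assumed values).
classnameDict = {
    1: "Warrior",
    2: "Paladin",
    3: "Hunter",
    4: "Rogue",
    5: "Priest",
    6: "Death Knight",
    7: "Shaman",
    8: "Mage",
    9: "Warlock",
    11: "Druid",
}


def class_name_for(class_id):
    """Return the display name for a class ID, or 'Unknown' if it is missing."""
    return classnameDict.get(class_id, "Unknown")


print(class_name_for(8))
print(class_name_for(99))
```

In the view, 'className': class_name_for(info.class_) would then replace the direct dictionary indexing, so an unexpected ID renders as "Unknown" instead of raising an error.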
