I am trying to use map/reduce to find duplicated data in CouchDB.
The map function looks like this:
function(doc) {
    if (doc.coordinates) {
        emit({
            twitter_id: doc.id_str,
            text: doc.text,
            coordinates: doc.coordinates
        }, 1);
    }
}
And the reduce function is:
function(keys, values, rereduce) { return sum(values); }
I want the sum of the values for each distinct key, but it just adds everything together and I get this result:
<Row key=None, value=1035>
Is this a problem with the group option? How can I set it to true?
Assuming you're using the couchdb package from PyPI, pass the query options you require to the view method as keyword arguments. Unpacking a dictionary with ** is a convenient way to do that.
For example:
import couchdb
# the design doc and view name of the view you want to use
ddoc = "my_design_document"
view_name = "my_view"
# your server
server = couchdb.Server("http://localhost:5984")
db = server["aCouchDatabase"]
# naming convention when passing a ddoc and view to the view method
view_string = ddoc + "/" + view_name
# query options
view_options = {"reduce": True,
                "group": True,
                "group_level": 2}
# call the view; the options are unpacked into keyword arguments
results = db.view(view_string, **view_options)
for row in results:
    # do something
    pass
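To see what the group option changes, here is a small pure-Python sketch (not CouchDB code) that mimics the reduce step over emitted rows. The sample rows and the helper names reduce_all and reduce_grouped are made up for illustration; CouchDB itself does this on the server.

```python
from collections import defaultdict

# Hypothetical (key, value) rows, as a map function might emit them.
rows = [
    (("1", "hello"), 1),
    (("2", "world"), 1),
    (("1", "hello"), 1),
]

def reduce_all(rows):
    """Without grouping: a single row with key None and the grand total."""
    return [(None, sum(value for _, value in rows))]

def reduce_grouped(rows):
    """With grouping: one row per distinct key."""
    totals = defaultdict(int)
    for key, value in rows:
        totals[key] += value
    return sorted(totals.items())

print(reduce_all(rows))
print(reduce_grouped(rows))
```

In the grouped result, any key whose total is greater than 1 was emitted more than once, which is the duplicate check the question is after.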
I have a function that accepts variables, and I need to make the test below pass. The test requires that all the variables used in the formulas can be found in the response data from the API.
How can I add variables to the API? I searched the local files and looked through the API documentation, but found no functions for adding variables.
Test function:
def test_variables_and_functions_for_formulas_exist(connector):
    """Test variables and functions for formula.
    Make sure all the vars for the formulas CAN BE found in response data from API.
    """
    formulas = {
        field["id"]: field["formula"]
        for field in field_lists[connector]
        if field.get("formula") is not None
    }
    field_response_names = {
        field.get("upstream_api_response_name", field["id"])
        for field in field_lists[connector]
    }
    # field upstream API name in a formula can be combined with a table
    # (endpoint) name if the former is unique not per field list but per table
    table_field_response_names = {
        (f"{field.get('table')}_{field.get('upstream_api_response_name', field['id'])}")
        for field in field_lists[connector]
        if field.get("table") is not None
    }
    formula_parser = py_expression_eval.Parser()
    formula_parser.functions = dict(formula_parser.functions, **FORMULA_FUNCTIONS)
    for field, formula_definition in formulas.items():
        formula = formula_parser.parse(formula_definition)
        formula_vars = formula.variables()
        non_existent_vars = list(
            set(formula_vars) - field_response_names - table_field_response_names
        )
        assert len(non_existent_vars) == 0, (
            f"Names {non_existent_vars} for field {field} are missing from "
            "the lists of upstream_api_response_name's and available functions"
        )
I think this is called "positional arguments", or the opposite, "keyword arguments". I have written a script with Python, Django and REST framework that takes parameters from the user and feeds them to the Amazon API. Here is the code:
page = request.query_params.get('page')
search = request.query_params.get('search')
search_field = request.query_params.get('search_field')
search_index = request.query_params.get('search_index')
response = amazon.ItemSearch(Keywords=search, SearchIndex=search_index, ResponseGroup="ItemAttributes,Offers,Images,EditorialReview,Reviews")
In this code the Keywords=search part varies. It could instead be Actor=search, Title=search, or Publisher=search. I am not sure how to make the Keywords part dynamic so that it follows the user's input, such as Actor, Title, or Publisher.
I think you should build up a dictionary of your kwargs and then use the ** syntax to expand it for the function call. I'm going to give an example that presumes the search_field query param selects which keyword is searched:
search = request.query_params.get('search')
search_field = request.query_params.get('search_field')
search_index = request.query_params.get('search_index')
kwargs = {
    'SearchIndex': search_index,
    'ResponseGroup': 'ItemAttributes,Offers,Images,EditorialReview,Reviews'
}
# Presuming search_field is the key, i.e. "Keywords", or "Actor".
# With search_field == "Keywords" this is the same as kwargs['Keywords'] = search
kwargs[search_field] = search
response = amazon.ItemSearch(**kwargs)
I have used a dictionary instead. This is how I have done it:
page = request.query_params.get('page')
search = request.query_params.get('search')
search_field = request.query_params.get('search_field')
search_index = request.query_params.get('search_index')
# named search_kwargs rather than dict, to avoid shadowing the built-in dict type
search_kwargs = {'SearchIndex': search_index, 'ResponseGroup': 'ItemAttributes,Offers,Images,EditorialReview,Reviews'}
search_kwargs[search_field] = search
response = amazon.ItemSearch(**search_kwargs)
I just passed a dictionary with my dynamic variable as a key.
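One caveat with the dictionary approach: search_field comes straight from the request, so an unexpected value would become an arbitrary keyword argument, or overwrite SearchIndex. A minimal sketch of guarding against that follows. The whitelist ALLOWED_SEARCH_FIELDS, the helper build_search_kwargs and the stand-in item_search (used here in place of amazon.ItemSearch) are all hypothetical names for illustration.

```python
# Hypothetical whitelist of keyword names the search call should accept.
ALLOWED_SEARCH_FIELDS = {"Keywords", "Actor", "Title", "Publisher"}

def item_search(**kwargs):
    """Stand-in for amazon.ItemSearch; it just returns what it received."""
    return kwargs

def build_search_kwargs(search_field, search, search_index):
    """Build the keyword arguments, rejecting unexpected field names."""
    if search_field not in ALLOWED_SEARCH_FIELDS:
        raise ValueError("unsupported search field: {!r}".format(search_field))
    kwargs = {"SearchIndex": search_index}
    kwargs[search_field] = search
    return kwargs

response = item_search(**build_search_kwargs("Actor", "Tom Hanks", "DVD"))
print(response)
```

In a view, the ValueError could be turned into a 400 response instead of being passed on to the API call.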
A simple query looks like this:
User.query.filter(User.name == 'admin')
In my code, I need to check the parameters that are passed in and then filter the results from the database based on them.
For example, if the User table contains columns like username, location and email, the request parameters can contain any one of them or a combination of them. Instead of checking each parameter as shown below and chaining filters, I'd like to create one dynamic query that can be passed to a single filter call. I'd like a separate function that evaluates all the parameters and generates the query; once it is generated, I can pass that query object along and get the desired result. I want to avoid raw SQL, as it defeats the purpose of using an ORM.
if location:
    User.query.filter(User.name == 'admin', User.location == location)
elif email:
    User.query.filter(User.email == email)
You can apply filter to the query repeatedly:
query = User.query
if location:
    query = query.filter(User.location == location)
if email:
    query = query.filter(User.email == email)
If you only need exact matches, there’s also filter_by:
criteria = {}
# If you already have a dict, there are easier ways to get a subset of its keys
if location:
    criteria['location'] = location
if email:
    criteria['email'] = email
query = User.query.filter_by(**criteria)
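The comment about subsetting an existing dict can be sketched like this. build_criteria and the sample request_params are hypothetical names, and the example stops short of touching the database so that it runs on its own.

```python
def build_criteria(params, allowed_fields):
    """Keep only the allowed fields that have a non-empty value."""
    return {
        field: params[field]
        for field in allowed_fields
        if params.get(field)
    }

# Hypothetical request parameters: 'email' is empty and 'page' is not a column.
request_params = {"location": "Berlin", "email": "", "page": "2"}
criteria = build_criteria(request_params, ["username", "location", "email"])
print(criteria)
```

The resulting dict could then be passed on as User.query.filter_by(**criteria).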
If you don’t like those for some reason, the best I can offer is this:
from sqlalchemy.sql.expression import and_
def get_query(table, lookups, form_data):
    # .get() skips fields that are absent from form_data instead of raising KeyError
    conditions = [
        getattr(table, field_name) == form_data[field_name]
        for field_name in lookups if form_data.get(field_name)
    ]
    return table.query.filter(and_(*conditions))
get_query(User, ['location', 'email', ...], form_data)
This is a late answer, but if anyone is still looking, sqlalchemy-json-querybuilder can be useful. It can be installed with:
pip install sqlalchemy-json-querybuilder
For example:
filter_by = [{
    "field_name": "SomeModel.field1",
    "field_value": "somevalue",
    "operator": "contains"
}]
order_by = ['-SomeModel.field2']
results = Search(session, "pkg.models", (SomeModel,), filter_by=filter_by,
                 order_by=order_by, page=1, per_page=5).results
https://github.com/kolypto/py-mongosql/
MongoSQL is a query builder that uses JSON as the input.
Capable of:
Choosing which columns to load
Loading relationships
Filtering using complex conditions
Ordering
Pagination
Example:
{
    project: ['id', 'name'],    // Only fetch these columns
    sort: ['age+'],             // Sort by age, ascending
    filter: {
        // Filter condition
        sex: 'female',          // Girls
        age: { $gte: 18 },      // Age >= 18
    },
    join: ['user_profile'],     // Load the 'user_profile' relationship
    limit: 100,                 // Display 100 per page
    skip: 10,                   // Skip first 10 rows
}
I am trying to fetch the Auto Scaling groups whose Application tag has the value 'CCC'.
The list is as below (Application tag value, then ASG name):
gweb    prd-dcc-eap-w2
gweb    prd-dcc-emc
gweb    prd-dcc-ems
CCC     dev-ccc-wer
CCC     dev-ccc-gbg
CCC     dev-ccc-wer
The script below produces output that includes one ASG without the CCC tag.
#!/usr/bin/python
import boto3
client = boto3.client('autoscaling', region_name='us-west-2')
response = client.describe_auto_scaling_groups()
ccc_asg = []
all_asg = response['AutoScalingGroups']
for i in range(len(all_asg)):
    all_tags = all_asg[i]['Tags']
    for j in range(len(all_tags)):
        if all_tags[j]['Key'] == 'Name':
            asg_name = all_tags[j]['Value']
            # print asg_name
        if all_tags[j]['Key'] == 'Application':
            app = all_tags[j]['Value']
            # print app
        if all_tags[j]['Value'] == 'CCC':
            ccc_asg.append(asg_name)
print(ccc_asg)
The output I am getting is:
['prd-dcc-ein-w2', 'dev-ccc-hap', 'dev-ccc-wfd', 'dev-ccc-sdf']
However, 'prd-dcc-ein-w2' is an ASG with a different tag, 'gweb', and the last one in the CCC ASG list (dev-ccc-msp-agt-asg) is missing. I need output like this:
dev-ccc-hap-sdf
dev-ccc-hap-gfh
dev-ccc-hap-tyu
dev-ccc-mso-hjk
Am I missing something?
In boto3 you can use paginators with JMESPath filtering to do this effectively and more concisely.
From the boto3 docs:
JMESPath is a query language for JSON that can be used directly on
paginated results. You can filter results client-side using JMESPath
expressions that are applied to each page of results through the
search method of a PageIterator.
When filtering with JMESPath expressions, each page of results that is
yielded by the paginator is mapped through the JMESPath expression. If
a JMESPath expression returns a single value that is not an array,
that value is yielded directly. If the result of applying the JMESPath
expression to a page of results is a list, then each value of the list
is yielded individually (essentially implementing a flat map).
Here is how it looks in Python code, using the CCP value mentioned for the Application tag of the Auto Scaling group:
import boto3
client = boto3.client('autoscaling')
paginator = client.get_paginator('describe_auto_scaling_groups')
page_iterator = paginator.paginate(
    PaginationConfig={'PageSize': 100}
)
filtered_asgs = page_iterator.search(
    'AutoScalingGroups[] | [?contains(Tags[?Key==`{}`].Value, `{}`)]'.format(
        'Application', 'CCP')
)
for asg in filtered_asgs:
    print(asg['AutoScalingGroupName'])
Elaborating on Michal Gasek's answer, here's an option that filters ASGs based on a dict of tag:value pairs.
import boto3
def get_asg_name_from_tags(tags):
    asg_name = None
    client = boto3.client('autoscaling')
    while True:
        paginator = client.get_paginator('describe_auto_scaling_groups')
        page_iterator = paginator.paginate(
            PaginationConfig={'PageSize': 100}
        )
        # one JMESPath filter stage per tag:value pair
        jmespath_filter = 'AutoScalingGroups[]'
        for tag in tags:
            jmespath_filter = ('{} | [?contains(Tags[?Key==`{}`].Value, `{}`)]'.format(jmespath_filter, tag, tags[tag]))
        filtered_asgs = page_iterator.search(jmespath_filter)
        # next(iterator) works on both Python 2 and 3; raises StopIteration if nothing matches
        asg = next(filtered_asgs)
        asg_name = asg['AutoScalingGroupName']
        try:
            asgX = next(filtered_asgs)
            asgX_name = asgX['AutoScalingGroupName']
            raise AssertionError('multiple ASG\'s found for {} = {},{}'
                                 .format(tags, asg_name, asgX_name))
        except StopIteration:
            break
    return asg_name
For example:
asg_name = get_asg_name_from_tags({'Env':env, 'Application':'app'})
It expects exactly one result and checks this by calling next() a second time. The StopIteration is the "good" case, which then breaks out of the loop.
I got it working with the script below.
#!/usr/bin/python
import boto3
client = boto3.client('autoscaling', region_name='us-west-2')
response = client.describe_auto_scaling_groups()
ccp_asg = []
all_asg = response['AutoScalingGroups']
for i in range(len(all_asg)):
    all_tags = all_asg[i]['Tags']
    app = False
    asg_name = ''
    for j in range(len(all_tags)):
        # exact matches; `in ('CCP')` would be a substring test against a plain string
        if all_tags[j]['Key'] == 'Application' and all_tags[j]['Value'] == 'CCP':
            app = True
        # note: this relies on the Application tag being listed before the Name tag
        if app:
            if all_tags[j]['Key'] == 'Name':
                asg_name = all_tags[j]['Value']
                ccp_asg.append(asg_name)
print(ccp_asg)
Feel free to ask if you have any doubts.
The right way to do this isn't via describe_auto_scaling_groups at all but via describe_tags, which lets the filtering happen on the server side.
You can construct a filter that asks for uses of the Application tag with any of a number of values:
Filters=[
    {
        'Name': 'key',
        'Values': [
            'Application',
        ]
    },
    {
        'Name': 'value',
        'Values': [
            'CCC',
        ]
    },
],
Your results (in Tags in the response) are then all the cases where a matching tag is applied to an Auto Scaling group. You will have to make the call multiple times, passing back NextToken every time there is one, to go through all the pages of results.
Each result includes the ID of the ASG that the matching tag is applied to. Once you have all the ASG IDs you are interested in, you can call describe_auto_scaling_groups to get their names.
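A minimal sketch of the describe_tags approach, using a boto3 paginator so that NextToken is followed automatically. The helper name asg_ids_with_tag is made up, and the client is passed in as a parameter so the function can also be tried with a stub. It assumes each tag description in the response carries a ResourceId field identifying the group.

```python
def asg_ids_with_tag(client, key, value):
    """Return the ResourceId of every ASG tag that matches key=value.

    `client` is expected to behave like boto3.client('autoscaling').
    """
    paginator = client.get_paginator('describe_tags')
    pages = paginator.paginate(
        Filters=[
            {'Name': 'key', 'Values': [key]},
            {'Name': 'value', 'Values': [value]},
        ]
    )
    # The paginator requests the following pages for us, one at a time.
    return [tag['ResourceId'] for page in pages for tag in page['Tags']]

# Usage with a real client (needs boto3 and AWS credentials):
#   import boto3
#   print(asg_ids_with_tag(boto3.client('autoscaling'), 'Application', 'CCC'))
```

The returned IDs could then be passed to describe_auto_scaling_groups if more details about each group are needed.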
Yet another solution, in my opinion simple enough to extend:
import boto3
client = boto3.client('autoscaling')
search_tags = {"environment": "stage"}
filtered_asgs = []
response = client.describe_auto_scaling_groups()
for group in response['AutoScalingGroups']:
    flattened_tags = {
        tag_info['Key']: tag_info['Value']
        for tag_info in group['Tags']
    }
    # keep the group only if every search tag is present with the same value
    if search_tags.items() <= flattened_tags.items():
        filtered_asgs.append(group)
print(filtered_asgs)
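The subset test at the heart of this approach can be tried without AWS access. In the sketch below, has_tags is a hypothetical helper and the two groups are made-up data in the shape that describe_auto_scaling_groups returns.

```python
def has_tags(group, search_tags):
    """True if every key/value pair in search_tags is a tag on the group."""
    flattened_tags = {tag['Key']: tag['Value'] for tag in group['Tags']}
    # dict.items() views are set-like in Python 3, so <= is a subset test.
    return search_tags.items() <= flattened_tags.items()

# Made-up groups for illustration.
groups = [
    {'AutoScalingGroupName': 'web-stage',
     'Tags': [{'Key': 'environment', 'Value': 'stage'},
              {'Key': 'Application', 'Value': 'web'}]},
    {'AutoScalingGroupName': 'web-prod',
     'Tags': [{'Key': 'environment', 'Value': 'prod'}]},
]
matches = [g['AutoScalingGroupName'] for g in groups
           if has_tags(g, {'environment': 'stage'})]
print(matches)
```

Adding more pairs to the search dict narrows the match, since all of them must be present on the group.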
I have a Django application that uses a third-party API and needs to pass it several arguments, such as client_id and user_id. I currently define these values as variables at the top of my file, but I'd like to store them in an object instead.
My current setup looks something like this:
user_id = 'ID HERE'
client_id = 'ID HERE'
api_key = 'ID HERE'
class Social(LayoutView, TemplateView):
    def grab_data(self):
        authenticate_user = AuthenticateService(client_id, user_id)
I want the default values set up as an object:
SERVICE_CONFIG = {
    'user_id': 'ID HERE',
    'client_id': 'ID HERE'
}
So that I can access them in my classes like so:
authenticate_user = AuthenticateService(SERVICE_CONFIG.client_id, SERVICE_CONFIG.user_id)
I've tried SERVICE_CONFIG.client_id and SERVICE_CONFIG['client_id'], as well as setting up the values as a mixin, but I can't figure out how to access them.
Python is not JavaScript. That's a dictionary, not an object. You access dictionary entries with the item syntax, not the attribute syntax:
AuthenticateService(SERVICE_CONFIG['client_id'], SERVICE_CONFIG['user_id'])
You can use a class, an instance, or a function object to store data as attributes:
class ServiceConfig:
    user_id = 1
    client_id = 2
ServiceConfig.user_id # => 1
service_config = ServiceConfig()
service_config.user_id # => 1
service_config = lambda: 0
service_config.user_id = 1
service_config.client_id = 2
service_config.user_id # => 1
Normally a dict is the simplest way to store data, but in some cases the better readability of attribute access is preferable, and then you can use the examples above. The lambda is the shortest option but is more confusing for someone reading your code, so the first two approaches are preferable.
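If attribute access is the goal, the standard library's types.SimpleNamespace is another option, not mentioned above, that avoids the lambda trick. A minimal sketch, reusing the placeholder values from the question:

```python
from types import SimpleNamespace

# Placeholder values, as in the question.
SERVICE_CONFIG = SimpleNamespace(user_id='ID HERE', client_id='ID HERE')

print(SERVICE_CONFIG.client_id)          # attribute access
print(vars(SERVICE_CONFIG)['user_id'])   # the same data viewed as a dict
```

Unlike a class with class attributes, this keeps the values on a single instance, which can still be viewed as a dict through vars().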