I think this is called "positional arguments", or the opposite, keyword arguments. I have written a script in Python with Django and REST framework that has to take in parameters from the user and feed them to the Amazon API. Here is the code:
page = request.query_params.get('page')
search = request.query_params.get('search')
search_field = request.query_params.get('search_field')
search_index = request.query_params.get('search_index')
response = amazon.ItemSearch(Keywords=search, SearchIndex=search_index, ResponseGroup="ItemAttributes,Offers,Images,EditorialReview,Reviews")
In this code the text Keywords=search varies: it could be Actor=search, Title=search, or Publisher=search. I am not sure how to make the Keywords part dynamic so that it changes to the user's input, such as Actor, Title, or Publisher.
I think you should build up a dictionary of your kwargs and then use the ** syntax to expand it for the function call. I'm going to give an example that presumes the search_field query param changes the search:
search = request.query_params.get('search')
search_field = request.query_params.get('search_field')
search_index = request.query_params.get('search_index')
kwargs = {
    'SearchIndex': search_index,
    'ResponseGroup': 'ItemAttributes,Offers,Images,EditorialReview,Reviews'
}
# Presuming search_field is the key, i.e. "Keywords", or "Actor"
kwargs[search_field] = search
# e.g. kwargs['Keywords'] = search
response = amazon.ItemSearch(**kwargs)
I have used a dictionary instead. This is how I have done it:
page = request.query_params.get('page')
search = request.query_params.get('search')
search_field = request.query_params.get('search_field')
search_index = request.query_params.get('search_index')
params = {'SearchIndex': search_index, 'ResponseGroup': 'ItemAttributes,Offers,Images,EditorialReview,Reviews'}
params[search_field] = search
response = amazon.ItemSearch(**params)
I just passed a dictionary with my dynamic variable as a key.
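One thing worth adding: since search_field comes straight from the request, you may want to validate it before expanding it into the call. A minimal sketch, assuming the allowed Amazon field names are Keywords, Actor, Title and Publisher (adjust the whitelist to whatever the API actually accepts):

ALLOWED_SEARCH_FIELDS = {'Keywords', 'Actor', 'Title', 'Publisher'}  # assumed field names

search_field = request.query_params.get('search_field', 'Keywords')
if search_field not in ALLOWED_SEARCH_FIELDS:
    search_field = 'Keywords'  # fall back to a safe default

params = {
    'SearchIndex': search_index,
    'ResponseGroup': 'ItemAttributes,Offers,Images,EditorialReview,Reviews',
    search_field: search,
}
response = amazon.ItemSearch(**params)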
I want to build a function that reads a URL from a txt file, saves it to a variable, then adds some values inside the URL between other values.
Example of the URL: https://domains.livedns.co.il/API/DomainsAPI.asmx/NewDomain?UserName=apidemo#livedns.co.il&Password=demo
Let's say I want to inject some values between UserName and Password, save it into the file again, and use it later.
I started to write the function and played with the urllib parser, but I still don't understand how to do that.
What I have tried so far:
import os
from urllib.parse import urlsplit

def dlastpurchase():
    if os.path.isfile("livednsurl.txt"):
        with open("livednsurl.txt", "r") as apikeyfile:
            apikey = apikeyfile.read()
        url_parse = urlsplit(apikey)
        print(url_parse.geturl())

dlastpurchase()
Thanks in advance for any tips and help.
Here is a slightly more complex example that I believe you will find interesting and enjoy improving (while it takes care of some scenarios, it might be lacking in others). It is also written as functions to enable reuse in other cases. Here we go.
Assuming we have a text file named 'urls.txt' that contains this URL:
https://domains.livedns.co.il/API/DomainsAPI.asmx/NewDomain?UserName=apidemo#livedns.co.il&Password=demo
from urllib.parse import urlparse, parse_qs, urlunparse

filename = 'urls.txt'
A function to parse the URL and return its query parameters, as well as the parsed URL object, which will be used to reconstruct the URL later on:
def parse_url(url):
    """Parse a given url and return its query parameters.
    Args:
        url (string): url string to parse
    Returns:
        query_parameters (dictionary): dictionary containing the query parameters as keys
        parsed (tuple): the named tuple returned by urlparse
    """
    try:
        # parse the url and get the query parameters from there
        parsed = urlparse(url)
        # parse the queries and return the dictionary containing them
        query_result = parse_qs(parsed.query)
        return (query_result, parsed)
    except ValueError as err:
        print('something failed !!!')
        print(err)
        return False
A function to add a new query parameter or to replace an existing one:
def insert_or_replace_word(query_dic, word, value):
    """Insert a value for a query parameter of a url, or replace an existing one.
    Args:
        query_dic (dictionary): the dictionary containing the query parameters
        word (string): the query parameter to replace or insert values for
        value (string): the value to insert or use as replacement
    Returns:
        query_dic (dictionary): the dictionary after the insertion or replacement
    """
    query_dic[word] = value
    return query_dic
A function to format the query parameters, getting them ready to reconstruct the new URL:
def format_query_strings(query_dic):
    """Format the final query dictionary into a query string, ready to be used to construct the new url.
    Args:
        query_dic (dictionary): final query dictionary after insertion or update
    """
    final_string = ''
    for key, value in query_dic.items():
        # unfortunately, query params from parse_qs come wrapped in lists, so unwrap them before creating the final string
        if type(value) == list:
            query_string = '{0}={1}'.format(key, value[0])
        else:
            query_string = '{0}={1}'.format(key, value)
        final_string += '{0}&'.format(query_string)
    # remove the extra & appended by the last iteration of the loop above
    if final_string.endswith('&'):
        final_string = final_string[:-1]
    return final_string
We check everything works by reading in the text file, performing the above operations, and then saving the new URL to a new file:
with open(filename) as url_file:
    lines = url_file.readlines()
    for line in lines:
        query_params, parsed = parse_url(line.strip())
        new_query_dic = insert_or_replace_word(query_params, 'UserName', 'newUsername')
        final = format_query_strings(new_query_dic)
        # here you have to pass an iterable of length 6 in order to reconstruct the url
        new_url_object = [parsed.scheme, parsed.netloc, parsed.path, parsed.params, final, parsed.fragment]
        # this reconstructs the new url
        new_url = urlunparse(new_url_object)
        # create a new file and append the link inside of it
        with open('new_urls.txt', 'a') as new_file:
            new_file.write(new_url)
            new_file.write('\n')
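As a side note, the standard library can also rebuild the query string for you: urllib.parse.urlencode with doseq=True handles the lists that parse_qs produces. A minimal sketch of the same round trip under that approach:

from urllib.parse import urlparse, parse_qs, urlencode, urlunparse

url = 'https://domains.livedns.co.il/API/DomainsAPI.asmx/NewDomain?UserName=apidemo#livedns.co.il&Password=demo'
parsed = urlparse(url)
query = parse_qs(parsed.query)
query['UserName'] = 'newUsername'          # insert or replace a parameter
new_query = urlencode(query, doseq=True)   # flattens the lists parse_qs produces
new_url = urlunparse(parsed._replace(query=new_query))
print(new_url)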
You don't have to use fancy tools to do that. Just split the URL on the "?" character, then split the second part on the "&" character. Add your new params to the list you have, and merge them back with the base URL.
url = "https://domains.livedns.co.il/API/DomainsAPI.asmx/NewDomain?UserName=apidemo#livedns.co.il&Password=demo"
base, params = url.split("?")
params = params.split("&")
params.insert(2, "new_user=yololo&new_passwd=hololo")
for param in params:
base += param + "&"
base = base.strip("&")
print(base)
I did it like this since you asked for inserting at a specific location, but URL params do not depend on order, so you can just append to the end of the URL for ease. Or you can edit the parameters in the list I showed.
I am trying to convert boto3 DynamoDB conditional expressions (using types from boto3.dynamodb.conditions) to their string representation. Of course this could be hand-coded, but naturally I would prefer to find something developed by AWS itself.
Key("name").eq("new_name") & Attr("description").begins_with("new")
would become
"name = 'new_name' and begins_with(description, 'new')"
I have been checking the boto3 and botocore code, but so far no success; I assume it must exist somewhere in the codebase...
In the boto3.dynamodb.conditions module there is a class called ConditionExpressionBuilder. You can convert a condition expression to a string by doing the following:
from boto3.dynamodb.conditions import Key, Attr, ConditionExpressionBuilder

condition = Key("name").eq("new_name") & Attr("description").begins_with("new")
builder = ConditionExpressionBuilder()
expression = builder.build_expression(condition, is_key_condition=True)
expression_string = expression.condition_expression
expression_attribute_names = expression.attribute_name_placeholders
expression_attribute_values = expression.attribute_value_placeholders
I'm not sure why this isn't documented anywhere; I just found it by randomly looking through the source code at the bottom of this page: https://boto3.amazonaws.com/v1/documentation/api/latest/_modules/boto3/dynamodb/conditions.html
Unfortunately, this doesn't work for the paginator format string notation, but it should work for the Table.query() format.
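For reference, here is a minimal sketch of plugging those pieces into a resource-level Table.query() call (the table name and key schema are assumptions; a mixed Key/Attr condition like the one above would need its Attr part passed separately as a FilterExpression):

import boto3
from boto3.dynamodb.conditions import Key, ConditionExpressionBuilder

table = boto3.resource("dynamodb").Table("my-table")  # assumed table name

builder = ConditionExpressionBuilder()
expression = builder.build_expression(Key("name").eq("new_name"), is_key_condition=True)

response = table.query(
    KeyConditionExpression=expression.condition_expression,
    ExpressionAttributeNames=expression.attribute_name_placeholders,
    ExpressionAttributeValues=expression.attribute_value_placeholders,
)
print(response["Items"])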
Following @Brian's answer with ConditionExpressionBuilder, I had to add DynamoDB's {'S': 'value'} type notation before executing the query.
I changed it with expression_attribute_values[':v0'] = {'S': pk_value}, where :v0 is the first Key/Attr in the condition. I'm not sure, but the same should work for subsequent values (:v0, :v1, :v2...).
Here is the full code, using pagination to retrieve only part of the data:
from boto3.dynamodb.conditions import Attr, Key, ConditionExpressionBuilder
from typing import Optional, List
import boto3
client_dynamodb = boto3.client("dynamodb", region_name="us-east-1")
def get_items(pk_value: str, pagination_config: dict = None) -> Optional[List]:
    if pagination_config is None:
        pagination_config = {
            # Return only the first page of results when no pagination config is provided
            'PageSize': 300,
            'StartingToken': None,
            'MaxItems': None,
        }
    condition = Key("pk").eq(pk_value)
    builder = ConditionExpressionBuilder()
    expression = builder.build_expression(condition, is_key_condition=True)
    expression_string = expression.condition_expression
    expression_attribute_names = expression.attribute_name_placeholders
    expression_attribute_values = expression.attribute_value_placeholders
    # Changed here to make it compatible with dynamodb typing
    expression_attribute_values[':v0'] = {'S': pk_value}
    paginator = client_dynamodb.get_paginator('query')
    page_iterator = paginator.paginate(
        TableName="TABLE_NAME",
        IndexName="pk_value_INDEX",
        KeyConditionExpression=expression_string,
        ExpressionAttributeNames=expression_attribute_names,
        ExpressionAttributeValues=expression_attribute_values,
        PaginationConfig=pagination_config
    )
    resp = {}
    for page in page_iterator:
        resp = page
        break
    if ("Items" not in resp) or (len(resp["Items"]) == 0):
        return None
    return resp["Items"]
EDIT:
I used this question to get a string representation for a DynamoDB Resource query, which is not compatible (yet) with dynamodb conditions, but then I found a better solution in this boto3 GitHub issue: https://github.com/boto/boto3/issues/2300
Replace the paginator with the one from the resource's meta client:
dynamodb_resource = boto3.resource("dynamodb")
paginator = dynamodb_resource.meta.client.get_paginator('query')
And now I can simply use Attr and Key
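A minimal sketch of what that looks like end to end (the table, index, and key names reuse the ones above and are assumptions); the resource's meta client has the handlers that translate Key/Attr conditions registered, so no placeholder plumbing is needed:

import boto3
from boto3.dynamodb.conditions import Key

dynamodb_resource = boto3.resource("dynamodb")
paginator = dynamodb_resource.meta.client.get_paginator('query')

page_iterator = paginator.paginate(
    TableName="TABLE_NAME",
    IndexName="pk_value_INDEX",
    KeyConditionExpression=Key("pk").eq("some-pk-value"),
)
for page in page_iterator:
    print(page["Items"])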
I'm working on a small project retrieving information about books from the Google Books API using Python 3. For this I make a call to the API, read out the variables, and store them in a list. For a search like "linkedin" this works perfectly. However, when I enter "Google", it reads the second title from the JSON input. How can this happen?
Please find my code below (Google_Results is the class I use to initialize the variables):
import requests
def Book_Search(search_term):
    parms = {"q": search_term, "maxResults": 3}
    r = requests.get(url="https://www.googleapis.com/books/v1/volumes", params=parms)
    print(r.url)
    results = r.json()

    i = 0
    for result in results["items"]:
        try:
            isbn13 = str(result["volumeInfo"]["industryIdentifiers"][0]["identifier"])
            isbn10 = str(result["volumeInfo"]["industryIdentifiers"][1]["identifier"])
            title = str(result["volumeInfo"]["title"])
            author = str(result["volumeInfo"]["authors"])[2:-2]
            publisher = str(result["volumeInfo"]["publisher"])
            published_date = str(result["volumeInfo"]["publishedDate"])
            description = str(result["volumeInfo"]["description"])
            pages = str(result["volumeInfo"]["pageCount"])
            genre = str(result["volumeInfo"]["categories"])[2:-2]
            language = str(result["volumeInfo"]["language"])
            image_link = str(result["volumeInfo"]["imageLinks"]["thumbnail"])
            dict = Google_Results(isbn13, isbn10, title, author, publisher, published_date, description, pages, genre,
                                  language, image_link)
            gr.append(dict)
            print(gr[i].title)
            i += 1
        except:
            pass
    return
gr = []
Book_Search("Linkedin")
I am a beginner to Python, so any help would be appreciated!
It does so because there is no publisher entry in the volumeInfo of the first entry; this raises a KeyError, and your except captures it. If you're going to work with fuzzy data, you have to account for the fact that it will not always have the expected structure. For simple cases you can rely on dict.get() and its default argument to return a 'valid' default entry when a key is missing.
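For instance, a quick sketch of that pattern applied to the fields above:

volume_info = result.get("volumeInfo", {})            # {} when the key is missing
publisher = volume_info.get("publisher", "Unknown")   # fall back to a default value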
Also, there are a few conceptual problems with your function: it relies on a global gr, which is bad design; it shadows the built-in dict type; and it captures all exceptions, guaranteeing that you cannot exit your code even with a SIGINT... I'd suggest converting it to something a bit more sane:
def book_search(search_term, max_results=3):
    results = []  # a list to store the results
    parms = {"q": search_term, "maxResults": max_results}
    r = requests.get(url="https://www.googleapis.com/books/v1/volumes", params=parms)
    try:  # just in case the server doesn't return valid JSON
        for result in r.json().get("items", []):
            if "volumeInfo" not in result:  # invalid entry - missing volumeInfo
                continue
            result_dict = {}  # a dictionary to store our discovered fields
            result = result["volumeInfo"]  # all the data we're interested in is in volumeInfo
            isbns = result.get("industryIdentifiers", None)  # capture ISBNs
            if isinstance(isbns, list) and isbns:
                for i, t in enumerate(("isbn10", "isbn13")):
                    if len(isbns) > i and isinstance(isbns[i], dict):
                        result_dict[t] = isbns[i].get("identifier", None)
            result_dict["title"] = result.get("title", None)
            authors = result.get("authors", None)  # capture authors
            if isinstance(authors, list) and authors:
                result_dict["author"] = str(authors)[2:-2]  # strip the list repr, as in your original
            result_dict["publisher"] = result.get("publisher", None)
            result_dict["published_date"] = result.get("publishedDate", None)
            result_dict["description"] = result.get("description", None)
            result_dict["pages"] = result.get("pageCount", None)
            genres = result.get("categories", None)  # capture genres
            if isinstance(genres, list) and genres:
                result_dict["genre"] = str(genres)[2:-2]  # strip the list repr, as in your original
            result_dict["language"] = result.get("language", None)
            result_dict["image_link"] = result.get("imageLinks", {}).get("thumbnail", None)
            # make sure Google_Results accepts keyword arguments like title, author...
            # and make them optional as they might not be in the returned result
            gr = Google_Results(**result_dict)
            results.append(gr)  # add it to the results list
    except ValueError:
        return None  # invalid response returned, you may raise an error instead
    return results  # return the results
Then you can easily retrieve as much info as possible for a term:
gr = book_search("Google")
And it will be far more tolerant of data omissions, provided that your Google_Results type makes most of the entries optional.
Following @Coldspeed's recommendation, it became clear that missing information in the JSON caused the exception to be raised. Since I only had a pass statement there, it skipped the entire result. Therefore I will have to adapt the try/except statements so errors get handled properly.
Thanks for the help guys!
I was trying to fetch auto scaling groups with the Application tag value 'CCC'.
The list (Application tag, ASG name) is as below:
gweb: prd-dcc-eap-w2
gweb: prd-dcc-emc
gweb: prd-dcc-ems
CCC: dev-ccc-wer
CCC: dev-ccc-gbg
CCC: dev-ccc-wer
The script I coded below gives output which includes one ASG without the CCC tag.
#!/usr/bin/python
import boto3

client = boto3.client('autoscaling', region_name='us-west-2')
response = client.describe_auto_scaling_groups()
ccc_asg = []
all_asg = response['AutoScalingGroups']
for i in range(len(all_asg)):
    all_tags = all_asg[i]['Tags']
    for j in range(len(all_tags)):
        if all_tags[j]['Key'] == 'Name':
            asg_name = all_tags[j]['Value']
            # print(asg_name)
        if all_tags[j]['Key'] == 'Application':
            app = all_tags[j]['Value']
            # print(app)
            if all_tags[j]['Value'] == 'CCC':
                ccc_asg.append(asg_name)
print(ccc_asg)
The output which I am getting is as below:
['prd-dcc-ein-w2', 'dev-ccc-hap', 'dev-ccc-wfd', 'dev-ccc-sdf']
Whereas 'prd-dcc-ein-w2' is an ASG with a different tag, 'gweb', and the last one (dev-ccc-msp-agt-asg) in the CCC ASG list is missing. I need the output as below:
dev-ccc-hap-sdf
dev-ccc-hap-gfh
dev-ccc-hap-tyu
dev-ccc-mso-hjk
Am I missing something?
In boto3 you can use Paginators with JMESPath filtering to do this very effectively and in a more concise way.
From the boto3 docs:
JMESPath is a query language for JSON that can be used directly on
paginated results. You can filter results client-side using JMESPath
expressions that are applied to each page of results through the
search method of a PageIterator.
When filtering with JMESPath expressions, each page of results that is
yielded by the paginator is mapped through the JMESPath expression. If
a JMESPath expression returns a single value that is not an array,
that value is yielded directly. If the result of applying the JMESPath
expression to a page of results is a list, then each value of the list
is yielded individually (essentially implementing a flat map).
Here is how it looks in Python code, with the mentioned CCC value for the Application tag of the Auto Scaling group:
import boto3

client = boto3.client('autoscaling')

paginator = client.get_paginator('describe_auto_scaling_groups')
page_iterator = paginator.paginate(
    PaginationConfig={'PageSize': 100}
)

filtered_asgs = page_iterator.search(
    'AutoScalingGroups[] | [?contains(Tags[?Key==`{}`].Value, `{}`)]'.format(
        'Application', 'CCC')
)

for asg in filtered_asgs:
    print(asg['AutoScalingGroupName'])
Elaborating on Michal Gasek's answer, here's an option that filters ASGs based on a dict of tag:value pairs.
def get_asg_name_from_tags(tags):
    asg_name = None
    client = boto3.client('autoscaling')
    while True:
        paginator = client.get_paginator('describe_auto_scaling_groups')
        page_iterator = paginator.paginate(
            PaginationConfig={'PageSize': 100}
        )
        filter_expr = 'AutoScalingGroups[]'
        for tag in tags:
            filter_expr = '{} | [?contains(Tags[?Key==`{}`].Value, `{}`)]'.format(filter_expr, tag, tags[tag])
        filtered_asgs = page_iterator.search(filter_expr)
        asg = next(filtered_asgs)
        asg_name = asg['AutoScalingGroupName']
        try:
            asgX = next(filtered_asgs)
            asgX_name = asgX['AutoScalingGroupName']
            raise AssertionError('multiple ASG\'s found for {} = {},{}'
                                 .format(tags, asg_name, asgX_name))
        except StopIteration:
            break
    return asg_name
e.g.:
asg_name = get_asg_name_from_tags({'Env':env, 'Application':'app'})
It expects there to be only one result and checks this by trying to use next() to get another. The StopIteration is the "good" case, which then breaks out of the paginator loop.
I got it working with the script below.
#!/usr/bin/python
import boto3

client = boto3.client('autoscaling', region_name='us-west-2')
response = client.describe_auto_scaling_groups()
ccc_asg = []
all_asg = response['AutoScalingGroups']
for i in range(len(all_asg)):
    all_tags = all_asg[i]['Tags']
    app = False
    asg_name = ''
    for j in range(len(all_tags)):
        if all_tags[j]['Key'] == 'Application' and all_tags[j]['Value'] == 'CCC':
            app = True
        if app:
            if all_tags[j]['Key'] == 'Name':
                asg_name = all_tags[j]['Value']
                ccc_asg.append(asg_name)
print(ccc_asg)
Feel free to ask if you have any doubts.
The right way to do this isn't via describe_auto_scaling_groups at all but via describe_tags, which will allow you to make the filtering happen on the server side.
You can construct a filter that asks for tags with the key Application and any of a number of values:
Filters=[
    {
        'Name': 'key',
        'Values': [
            'Application',
        ]
    },
    {
        'Name': 'value',
        'Values': [
            'CCC',
        ]
    },
],
And then your results (in Tags in the response) are all the times when a matching tag is applied to an autoscaling group. You will have to make the call multiple times, passing back NextToken every time there is one, to go through all the pages of results.
Each result includes an ASG ID that the matching tag is applied to. Once you have all the ASG IDs you are interested in, then you can call describe_auto_scaling_groups to get their names.
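A minimal sketch of that flow (a paginator takes care of the NextToken handling; note that for Auto Scaling tags the ResourceId field carries the group name):

import boto3

client = boto3.client('autoscaling', region_name='us-west-2')

# Server-side filtering: only tags with key Application and value CCC come back.
paginator = client.get_paginator('describe_tags')
pages = paginator.paginate(
    Filters=[
        {'Name': 'key', 'Values': ['Application']},
        {'Name': 'value', 'Values': ['CCC']},
    ]
)

asg_names = set()
for page in pages:
    for tag in page['Tags']:
        asg_names.add(tag['ResourceId'])  # the Auto Scaling group the tag is attached to

print(sorted(asg_names))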
Yet another solution, in my opinion simple enough to extend:
import boto3

client = boto3.client('autoscaling')
search_tags = {"environment": "stage"}
filtered_asgs = []
response = client.describe_auto_scaling_groups()

for group in response['AutoScalingGroups']:
    flattened_tags = {
        tag_info['Key']: tag_info['Value']
        for tag_info in group['Tags']
    }
    if search_tags.items() <= flattened_tags.items():
        filtered_asgs.append(group)

print(filtered_asgs)
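The search_tags.items() <= flattened_tags.items() check works because dict items views support set-style subset comparison in Python 3; for example:

# True: every searched tag pair is present in the group's tags
print({"environment": "stage"}.items() <= {"environment": "stage", "Name": "x"}.items())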
I am trying to use map/reduce to find duplicated data in CouchDB.
The map function is like this:
function(doc) {
  if (doc.coordinates) {
    emit({
      twitter_id: doc.id_str,
      text: doc.text,
      coordinates: doc.coordinates
    }, 1);
  }
}
and the reduce function is:
function(keys, values, rereduce) { return sum(values); }
I want to find the sum of the data with the same key, but it just adds everything together and I get this result:
<Row key=None, value=1035>
Is this a problem with group? How can I set it to true?
Assuming you're using the couchdb package from PyPI, you'll need to pass all of the options you require to the view method; you can collect them in a dictionary and expand it with ** when calling.
for example:
import couchdb

# the design doc and view name of the view you want to use
ddoc = "my_design_document"
view_name = "my_view"

# your server
server = couchdb.Server("http://localhost:5984")
db = server["aCouchDatabase"]

# naming convention when passing a ddoc and view to the view method
view_string = ddoc + "/" + view_name

# query options
view_options = {"reduce": True,
                "group": True,
                "group_level": 2}

# call the view, expanding the options as keyword arguments
results = db.view(view_string, **view_options)
for row in results:
    # do something
    pass
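Continuing the snippet, for the duplicate-hunting use case from the question: with group=True each row carries the grouped key and the reduced count, so a sketch of the final step (field names as in the map function above) might be:

for row in results:
    if row.value > 1:  # emitted by more than one document
        print("duplicate key:", row.key, "count:", row.value)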