I am using Slack webhooks to send messages to my Slack channel.
The problem is that it keeps posting messages every few minutes.
Here is what I did:
I created a simple function under a util folder.
import json
import time
import requests

def send_to_slack(text):
    url = "https://hooks.slack.com/services/your/slack/URL"
    task_slack_alert(text, url, is_error=False, args=None)

def task_slack_alert(msg, url, is_error=False, args=None):
    slack_msg = ":red_circle: Task Failed" if is_error else ":green_heart: Task Message"
    details = """*Task*: {task}
    *Dag*: {dag}
    *Execution Time*: {exec_ts}""".format(
        task=args["task"],
        dag=args["dag"],
        exec_ts=args["ts"],
    ) if args else ""
    message = {'text': slack_msg + "\n" + details + "\n" + msg}
    response = requests.post(url=url, data=json.dumps(message))
    time.sleep(1)
    print(f"Slack response {response}")
    if response.status_code != 200:
        print(f"Error sending chat message. Got: {response.status_code}")
In my DAG (which is under another folder) I call the function.
The DAG copies data from Oracle to a Snowflake DB, and this works without the Slack part.
Inside my DAG I do the following:
x = {'key1': ['value1', 'value 2', … 'value10']}

send_to_slack('My test message from python')

default_args = {...
    'on_failure_callback': send_to_slack, }

with DAG('my_dag',
         default_args=default_args,
         catchup=False) as dag:

    parallel = 4
    start = DummyOperator(task_id='start')
    tasks = []
    i = 0
    for s in x.keys():
        for t in x.get(s):
            task = OracleToSnowflakeOperator(
                task_id=s + '_' + t,
                source_oracle_conn_id=source_oracle_conn_id,
                source_schema=schema,
                source_table=table, …
            )
            if i <= parallel:
                task.set_upstream(start)
            else:
                task.set_upstream(tasks[i - (parallel + 1)])
            i = i + 1
            tasks.append(task)
I know that if I define the function inside the same DAG file, it will be called every time the DAG is parsed.
That's not my case, so what's wrong?
Thanks
You're calling the function send_to_slack at the top level of your DAG file, which means it runs every time the scheduler parses your DAG (every few minutes).
You should either:
Use the Slack operator that comes with Airflow, put it downstream of your OracleToSnowflakeOperator, and treat it like any other operator, or
Edit your OracleToSnowflakeOperator, which I assume is a custom one, and put the logic to call Slack in there (use the Slack hook).
Basically, you should encapsulate the call to Slack inside a custom operator or use the standard Slack operator provided; don't put it at the top level of your DAG definition.
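For the first option, a minimal sketch under some assumptions (Airflow 1.10's contrib import path, an HTTP connection named slack_conn whose host holds the webhook URL, and the tasks list built in your loop) could look like this, instead of calling send_to_slack at parse time:

from airflow.contrib.operators.slack_webhook_operator import SlackWebhookOperator

notify_slack = SlackWebhookOperator(
    task_id='notify_slack',
    http_conn_id='slack_conn',   # connection whose host is the webhook URL (illustrative name)
    message='Oracle to Snowflake copy finished',
    username='airflow',
    trigger_rule='all_done',     # fire whether the upstream tasks succeeded or failed
    dag=dag,
)

# Wire it after the copy tasks created in the loop above.
for t in tasks:
    t.set_downstream(notify_slack)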
Ideally I want to grab the token one time (1 request) and then pass that token into the other 2 requests as they execute. When I run this code through Locust though...
from locust import HttpUser, constant, task
import urllib3

urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

class ProcessRequests(HttpUser):
    host = 'https://hostURL'
    wait_time = constant(1)

    def on_start(self):
        tenant_id = "tenant123"
        client_id = "client123"
        secret = "secret123"
        scope = "api://123/.default"
        body = "grant_type=client_credentials&client_id=" + client_id + "&client_secret=" + secret + "&scope=" + scope
        tokenResponse = self.client.post(
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token",
            body,
            headers={"ContentType": "application/x-www-form-urlencoded"}
        )
        response = tokenResponse.json()
        responseToken = response['access_token']
        self.headers = {'Authorization': 'Bearer ' + responseToken}

    @task
    def get_labware(self):
        self.client.get("/123", name="Labware", headers=self.headers)

    @task
    def get_instruments(self):
        self.client.get("/456", name="Instruments", headers=self.headers)
It ends up firing off multiple token requests that don't stop.
Any ideas how to fix this so the token request only runs once?
In your case it runs once per user, so my expectation is that you spawned 24 users and the number of Labware and Instruments requests is at least twice as high, so it works exactly as described in the documentation:
Users (and TaskSets) can declare an on_start method and/or on_stop method. A User will call its on_start method when it starts running, and its on_stop method when it stops running. For a TaskSet, the on_start method is called when a simulated user starts executing that TaskSet, and on_stop is called when the simulated user stops executing that TaskSet (when interrupt() is called, or the user is killed).
If you want to get the token only once and then share it across all the virtual users, you can go for the workaround from this question:
from locust import events

@events.test_start.add_listener
def _(environment, **kwargs):
    global token
    token = get_token(environment.host)
and put what you currently have in on_start() into this get_token() function.
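Putting it together, a minimal sketch under those assumptions (the token endpoint and credentials are copied from the question; the requests-based get_token() helper is illustrative):

import requests
from locust import HttpUser, constant, task, events

token = None

def get_token(host):
    # One real token request, made once before any users are spawned.
    # `host` is unused here because the token endpoint differs from the test host.
    resp = requests.post(
        "https://login.microsoftonline.com/tenant123/oauth2/v2.0/token",
        data="grant_type=client_credentials&client_id=client123"
             "&client_secret=secret123&scope=api://123/.default",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
    return resp.json()["access_token"]

@events.test_start.add_listener
def _(environment, **kwargs):
    global token
    token = get_token(environment.host)

class ProcessRequests(HttpUser):
    host = 'https://hostURL'
    wait_time = constant(1)

    def on_start(self):
        # Every simulated user reuses the token fetched once at test start.
        self.headers = {'Authorization': 'Bearer ' + token}

    @task
    def get_labware(self):
        self.client.get("/123", name="Labware", headers=self.headers)

    @task
    def get_instruments(self):
        self.client.get("/456", name="Instruments", headers=self.headers)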
More information: Locust with Python: Introduction, Correlating Variables, and Basic Assertions
from coinbase.wallet.client import Client
from telegram import ParseMode
from telegram.ext import CommandHandler, Defaults, Updater

COINBASE_KEY = 'xxxxxxxxxxxx'
COINBASE_SECRET = 'xxxxxxxxxxxx'
TELEGRAM_TOKEN = 'xxxxxxxxxxxx'

coinbase_client = Client(COINBASE_KEY, COINBASE_SECRET)

# if __name__ == '__main__':
updater = Updater(token=TELEGRAM_TOKEN, defaults=Defaults(parse_mode=ParseMode.HTML))
dispatcher = updater.dispatcher

dispatcher.add_handler('start', startCommand)  # Accessed via /start
dispatcher.add_handler('alert', priceAlert)  # Accessed via /alert

updater.start_polling()  # Start the bot
updater.idle()  # Wait for the script to be stopped, this will stop the bot

def startCommand(update, context):
    context.bot.send_message(chat_id=update.effective_chat.id, text='Hello there!')

def priceAlert(update, context):
    if len(context.args) > 2:
        crypto = context.args[0].upper()
        sign = context.args[1]
        price = context.args[2]
        context.job_queue.run_repeating(priceAlertCallback, interval=15, first=15, context=[crypto, sign, price, update.message.chat_id])
        response = f"⏳ I will send you a message when the price of {crypto} reaches £{price}, \n"
        response += f"the current price of {crypto} is £{coinbase_client.get_spot_price(currency_pair=crypto + '-GBP')['amount']}"
    else:
        response = '⚠️ Please provide a crypto code and a price value: \n<i>/price_alert {crypto code} {> / <} {price}</i>'
    context.bot.send_message(chat_id=update.effective_chat.id, text=response)

def priceAlertCallback(context):
    crypto = context.job.context[0]
    sign = context.job.context[1]
    price = context.job.context[2]
    chat_id = context.job.context[3]
    send = False
    spot_price = coinbase_client.get_spot_price(currency_pair=crypto + '-GBP')['amount']
    if sign == '<':
        if float(price) >= float(spot_price):
            send = True
    else:
        if float(price) <= float(spot_price):
            send = True
    if send:
        response = f'👋 {crypto} has surpassed £{price} and has just reached <b>£{spot_price}</b>!'
        context.job.schedule_removal()
        context.bot.send_message(chat_id=chat_id, text=response)
I get an error from the code above. I have also tried changing the position of the def, but it still shows an error. How do I solve this?
It is the code for a Telegram bot, and it keeps showing me a NameError. I have already installed python3 and pip, but it's still not solved.
Python reads files top to bottom. So when you call dispatcher.add_handler('start', startCommand), the function startCommand is not yet known. Move the part
updater = Updater(token=TELEGRAM_TOKEN, defaults=Defaults(parse_mode=ParseMode.HTML))
dispatcher = updater.dispatcher
dispatcher.add_handler('start', startCommand) # Accessed via /start
dispatcher.add_handler('alert', priceAlert) # Accessed via /alert
updater.start_polling() # Start the bot
updater.idle() # Wait for the script to be stopped, this will stop the bot
below the callback definitions.
Apart from that, add_handler needs a Handler as argument, in your case something like add_handler(CommandHandler('start', startCommand)). Please see the PTB tutorial as well as the examples.
Disclaimer: I'm the current maintainer of the python-telegram-bot library.
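Putting both points together, a minimal sketch of the corrected ordering (handler callbacks defined first, then registered with CommandHandler objects, reusing the names from the question):

def startCommand(update, context):
    context.bot.send_message(chat_id=update.effective_chat.id, text='Hello there!')

def priceAlert(update, context):
    ...  # as in the question

updater = Updater(token=TELEGRAM_TOKEN, defaults=Defaults(parse_mode=ParseMode.HTML))
dispatcher = updater.dispatcher

dispatcher.add_handler(CommandHandler('start', startCommand))  # Accessed via /start
dispatcher.add_handler(CommandHandler('alert', priceAlert))    # Accessed via /alert

updater.start_polling()  # Start the bot
updater.idle()           # Wait for the script to be stopped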
Try
dispatcher.add_handler('start', startCommand()) # Accessed via /start
dispatcher.add_handler('alert', priceAlert()) # Accessed via /alert
You will also need to add the two arguments required by both functions.
dispatcher.add_handler('start', startCommand(update, context))
dispatcher.add_handler('alert', startCommand(update, context))
I'm not exactly sure what data the two functions take in but I'm going to guess that it is whatever the bot is returning.
I'm trying to use cloud functions to update data by calling an external API once a day.
So far I have:
Cloud Schedule set to invoke Function 1
Function 1 - loop over items and create a task for each item
Task - invoke Function 2 with data provided by function 1
Function 2 - call external API to get data and update our db
The issue is that there are ~2k items to update daily and a Cloud Function times out before it can do that, which is why I put them in a queue. But even placing the items in the queue takes too long for the Cloud Function, so it times out before it can add them all.
Is there a simple way to bulk add multiple tasks to a queue at once?
Failing that, a better solution to all of this?
All written in Python.
Code for function 1:
from google.cloud import tasks_v2

def refresh(request):
    for i in items:
        # Create a client.
        client = tasks_v2.CloudTasksClient()

        # TODO(developer): Uncomment these lines and replace with your values.
        project = 'my-project'
        queue = 'refresh-queue'
        location = 'europe-west2'
        name = i['name'].replace(' ', '')
        url = f"https://europe-west2-my-project.cloudfunctions.net/endpoint?name={name}"

        # Construct the fully qualified queue name.
        parent = client.queue_path(project, location, queue)

        # Construct the request body.
        task = {
            "http_request": {  # Specify the type of request.
                "http_method": tasks_v2.HttpMethod.GET,
                "url": url,  # The full url path that the task will be sent to.
            }
        }

        # Use the client to build and send the task.
        response = client.create_task(request={"parent": parent, "task": task})
Answering your question "Is there a simple way to bulk add multiple tasks to a queue at once?": as per the public documentation, the best approach is to implement a double-injection pattern.
For this you will have a new queue where you add a single task that contains the data for many tasks of the original queue; on the receiving end of this queue you will have a service that takes the data from that task and creates one task per entry on the second queue.
Additionally, I suggest you apply the 500/50/5 pattern to a cold queue. This will help both the task queue and the Cloud Function service ramp up at a safe rate.
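A hypothetical sketch of that double-injection pattern, reusing the project and region from the question (the fanout-queue name, the fanout endpoint, and the items list are illustrative and assumed to be available as in the question):

import json
from google.cloud import tasks_v2

client = tasks_v2.CloudTasksClient()

def refresh(request):
    # Scheduler-triggered function: enqueue ONE task carrying all the names.
    parent = client.queue_path('my-project', 'europe-west2', 'fanout-queue')
    body = json.dumps({"names": [i['name'].replace(' ', '') for i in items]}).encode()
    task = {
        "http_request": {
            "http_method": tasks_v2.HttpMethod.POST,
            "url": "https://europe-west2-my-project.cloudfunctions.net/fanout",
            "headers": {"Content-Type": "application/json"},
            "body": body,
        }
    }
    client.create_task(request={"parent": parent, "task": task})

def fanout(request):
    # Receiving end: create one task per name on the original refresh queue.
    parent = client.queue_path('my-project', 'europe-west2', 'refresh-queue')
    for name in request.get_json()["names"]:
        url = f"https://europe-west2-my-project.cloudfunctions.net/endpoint?name={name}"
        task = {"http_request": {"http_method": tasks_v2.HttpMethod.GET, "url": url}}
        client.create_task(request={"parent": parent, "task": task})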
Chris32's answer is correct, but one thing I noticed in your code snippet is that you should create the client outside the for loop.
def refresh(request):
    # Create a client.
    client = tasks_v2.CloudTasksClient()

    # TODO(developer): Uncomment these lines and replace with your values.
    project = 'my-project'
    queue = 'refresh-queue'
    location = 'europe-west2'

    for i in items:
        name = i['name'].replace(' ', '')
        url = f"https://europe-west2-my-project.cloudfunctions.net/endpoint?name={name}"

        # Construct the fully qualified queue name.
        parent = client.queue_path(project, location, queue)

        # Construct the request body.
        task = {
            "http_request": {  # Specify the type of request.
                "http_method": tasks_v2.HttpMethod.GET,
                "url": url,  # The full url path that the task will be sent to.
            }
        }

        # Use the client to build and send the task.
        response = client.create_task(request={"parent": parent, "task": task})
In App Engine I would do client = tasks_v2.CloudTasksClient() outside of def refresh, at the file level, but I don't know if that matters for Cloud Functions.
Second thing:
Modify "Function 2" to take multiple 'names' instead of just one. Then in "Function 1" you can send 10 names to "Function 2" at a time:
BATCH_SIZE = 10  # send 10 names to Function 2

def refresh(request):
    # Create a client.
    client = tasks_v2.CloudTasksClient()
    # ...
    for i in range(0, len(items), BATCH_SIZE):
        items_batch = items[i:i + BATCH_SIZE]
        names = ','.join([item['name'].replace(' ', '') for item in items_batch])
        url = f"https://europe-west2-my-project.cloudfunctions.net/endpoint?names={names}"
        # Construct the fully qualified queue name.
        # ...
If those 2 quick fixes don't do it, then you'll have to split "Function 1" into "Function 1A" and "Function 1B".
Function 1A:
BATCH_SIZE = 100  # send 100 names to Function 1B

def refresh(request):
    client = tasks_v2.CloudTasksClient()
    for i in range(0, len(items), BATCH_SIZE):
        items_batch = items[i:i + BATCH_SIZE]
        names = ','.join([item['name'].replace(' ', '') for item in items_batch])
        url = f"https://europe-west2-my-project.cloudfunctions.net/endpoint-for-function-1b?names={names}"
        # send the task.
        response = client.create_task(request={
            "parent": client.queue_path('my-project', 'europe-west2', 'refresh-queue'),
            "task": {
                "http_request": {"http_method": tasks_v2.HttpMethod.GET, "url": url}
            }})
Function 1B:
BATCH_SIZE = 10  # send 10 names to Function 2

def refresh(request):
    # set `names` equal to the query param `names`, split on ','
    client = tasks_v2.CloudTasksClient()
    for i in range(0, len(names), BATCH_SIZE):
        names_batch = ','.join(names[i:i + BATCH_SIZE])
        url = f"https://europe-west2-my-project.cloudfunctions.net/endpoint-for-function-2?names={names_batch}"
        # send the task.
        response = client.create_task(request={
            "parent": client.queue_path('my-project', 'europe-west2', 'refresh-queue'),
            "task": {
                "http_request": {"http_method": tasks_v2.HttpMethod.GET, "url": url}
            }})
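For completeness, a hypothetical sketch of the modified "Function 2" that accepts the comma-separated names parameter built above (update_item() is a placeholder for the external API call and DB update, which are not shown in the question):

def endpoint(request):
    names = request.args.get('names', '').split(',')
    for name in names:
        update_item(name)  # hypothetical: call the external API and update the DB for one item
    return f"updated {len(names)} items", 200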
I am trying to create a SlackWebhookOperator using my own HTTP connection, but it still tries to use http_default.
failed_alert = SlackWebhookOperator(
    task_id='slack_test',
    http_conn_id='slack_conn',
    webhook_token=slack_webhook_token,
    message=slack_msg,
    username='airflow')
failed_alert.execute(context=context)
[2019-07-21 13:14:57,415] {__init__.py:1625} ERROR - Failed at executing callback
[2019-07-21 13:14:57,415] {__init__.py:1626} ERROR - The conn_id http_default isn't defined
I think it's a known issue with 1.10.3: https://github.com/apache/airflow/pull/5066
My workaround is this:
import json
import requests

def task_fail_slack_alert_hook(url, context):
    """This is a webhook utility which will push an error message to a given Slack channel using a URL."""
    slack_msg = """
        :red_circle: Task Failed.
        *Task*: {task}
        *Dag*: {dag}
        *Execution Time*: {exec_date}
        *Log Url*: {log_url}
        <!channel>
        """.format(
        task=context.get("task_instance").task_id,
        dag=context.get("task_instance").dag_id,
        ti=context.get("task_instance"),
        exec_date=context.get("execution_date"),
        log_url=context.get("task_instance").log_url,
    )
    slack_data = {"text": slack_msg}
    return requests.post(
        url,
        data=json.dumps(slack_data),
        headers={"Content-Type": "application/json"},
    )
You will have to put the whole webhook URL in the host though, rather than splitting host and password up.
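For example, a minimal sketch of wiring this in as a failure callback (the connection name slack_conn and the use of functools.partial are illustrative, not part of the original workaround):

from functools import partial
from airflow.hooks.base_hook import BaseHook

# The host field of this connection holds the full webhook URL, as noted above.
slack_webhook_url = BaseHook.get_connection('slack_conn').host

default_args = {
    'on_failure_callback': partial(task_fail_slack_alert_hook, slack_webhook_url),
}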
You could also have a look at the slack client instead
I'm trying to get two basic Lambdas working on the Python 2.7 runtime for SQS message processing. One Lambda reads from SQS and invokes another Lambda, passing data to it via the context. I'm able to invoke the other Lambda, but the user context is empty in it. This is my code for the SQS reader Lambda:
import boto3
import base64
import json
import logging

messageDict = {'queue_url': 'queue_url',
               'receipt_handle': 'receipt_handle',
               'body': 'messageBody'}
ctx = {
    'custom': messageDict,
    'client': 'SQS_READER_LAMBDA',
    'env': {'test': 'test'},
}
payload = json.dumps(ctx)
payloadBase64 = base64.b64encode(payload)

client = boto3.client('lambda')
client.invoke(
    FunctionName='LambdaWorker',
    InvocationType='Event',
    LogType='None',
    ClientContext=payloadBase64,
    Payload=payload
)
And this is how I'm trying to inspect and print the contents of the context variable inside the invoked Lambda, so I can check the logs in CloudWatch:
import inspect

memberList = inspect.getmembers(context)
for a in memberList:
    logging.error(a)
The problem is that nothing works and CloudWatch shows the user context is empty:
('client_context', None)
I've tried example1, example2, example3, example4
Any ideas?
I gave up trying to pass the data through the context. However, I was able to pass the data through the Payload param:
client.invoke(
    FunctionName='LambdaWorker',
    InvocationType='Event',
    LogType='None',
    Payload=json.dumps(payload)
)
And then read it from the event parameter inside the invoked Lambda:
ctx = json.dumps(event)
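For reference, a minimal sketch of the receiving side under that approach (the handler name is illustrative; the field names mirror the ctx dict from the question):

def lambda_handler(event, context):
    # The data sent as Payload arrives here as `event`.
    custom = event.get('custom', {})
    print(custom.get('queue_url'), custom.get('receipt_handle'), custom.get('body'))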
The code in the question is very close. The only issue is the InvocationType:
This will work with the code in your question:
client.invoke(
    FunctionName='LambdaWorker',
    InvocationType='RequestResponse',
    LogType='None',
    ClientContext=payloadBase64
)
However, this changes the invocation to synchronous, which may be undesirable. The reason for this behavior is not clear.
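On the receiving side, the forwarded data can then be read from the context object; a minimal sketch (handler name illustrative, attributes as documented for the Lambda Python context object):

def lambda_handler(event, context):
    # client_context is only populated when the caller sets ClientContext.
    if context.client_context is not None:
        print(context.client_context.custom)  # expected: the messageDict payload
        print(context.client_context.env)     # expected: {'test': 'test'}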