Boto / CloudWatch recover instance alarm - python

I've been banging my head against a wall trying to make this work.
I'm attempting to use Python/boto to create a CloudWatch alarm that recovers a failed EC2 instance.
I'm having difficulty getting the ec2:RecoverInstance action to work. I suspect my topic isn't set up correctly.
topics = sns_conn.get_all_topics()
topic = topics[u'ListTopicsResponse']['ListTopicsResult']['Topics'][0]['TopicArn']
# arn:aws:sns:us-east-1:*********:CloudWatch

status_check_failed_alarm = boto.ec2.cloudwatch.alarm.MetricAlarm(
    connection=cw_conn,
    name=_INSTANCE_NAME + "RECOVERY-High-Status-Check-Failed-Any",
    metric='StatusCheckFailed',
    namespace='AWS/EC2',
    statistic='Average',
    comparison='>=',
    description='status check for %s %s' % (_INSTANCE, _INSTANCE_NAME),
    threshold=1.0,
    period=60,
    evaluation_periods=5,
    dimensions={'InstanceId': _INSTANCE},
    # alarm_actions=[topic],
    ok_actions=[topic],
    insufficient_data_actions=[topic])

# other action ARNs I've tried:
# status_check_failed_alarm.add_alarm_action('arn:aws:sns:us-east-1:<acct#>:ec2:recover')
# status_check_failed_alarm.add_alarm_action('arn:aws:sns:us-east-1:<acct#>:ec2:RecoverInstances')
status_check_failed_alarm.add_alarm_action('ec2:RecoverInstances')
cw_conn.put_metric_alarm(status_check_failed_alarm)
Any suggestions would be highly appreciated.
Thank you.
--Mike

I think the issue is that these alarm actions do not have <acct> in the ARN. The CLI reference documents the valid ARNs:
Valid Values: arn:aws:automate:region:ec2:stop | arn:aws:automate:region:ec2:terminate | arn:aws:automate:region:ec2:recover
I would think it is easier to pull the metric from AWS and create the alarm from it rather than constructing it from the ground up, e.g. (untested code):
topics = sns_conn.get_all_topics()
topic = topics[u'ListTopicsResponse']['ListTopicsResult']['Topics'][0]['TopicArn']

metric = cloudwatch_conn.list_metrics(dimensions={'InstanceId': _INSTANCE},
                                      metric_name="StatusCheckFailed")[0]
alarm = metric.create_alarm(name=_INSTANCE_NAME + "RECOVERY-High-Status-Check-Failed-Any",
                            description='status check for {} {}'.format(_INSTANCE, _INSTANCE_NAME),
                            alarm_actions=[topic, 'arn:aws:automate:us-east-1:ec2:recover'],
                            ok_actions=[topic],
                            insufficient_data_actions=[topic],
                            statistic='Average',
                            comparison='>=',
                            threshold=1.0,
                            period=60,
                            evaluation_periods=5)
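If you are on boto3 rather than the legacy boto shown above, a minimal hedged sketch of the same alarm (untested; it assumes the _INSTANCE and _INSTANCE_NAME variables from the question):

import boto3

cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')
cloudwatch.put_metric_alarm(
    AlarmName=_INSTANCE_NAME + "RECOVERY-High-Status-Check-Failed-Any",
    MetricName='StatusCheckFailed',
    Namespace='AWS/EC2',
    Statistic='Average',
    ComparisonOperator='GreaterThanOrEqualToThreshold',
    Threshold=1.0,
    Period=60,
    EvaluationPeriods=5,
    Dimensions=[{'Name': 'InstanceId', 'Value': _INSTANCE}],
    # the recover action uses the account-less "automate" ARN
    AlarmActions=['arn:aws:automate:us-east-1:ec2:recover'],
)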

Related

Log Application Insights customMetrics once instead of every export_interval - python

I have a Spark stream that reads data from an Azure Data Lake, applies some transformations, then writes into Azure Synapse (DW).
I want to log some metrics for each batch processed, but I don't want to duplicate logs from each batch.
Is there any way to log only once instead of on every export_interval?
Example:
autoloader_df = (
    spark.readStream.format("cloudFiles")
    .options(**stream_config["cloud_files"])
    .option("recursiveFileLookup", True)
    .option("maxFilesPerTrigger", sdid_workload.max_files_agg)
    .option("pathGlobfilter", "*_new.parquet")
    .schema(stream_config["schema"])
    .load(stream_config["read_path"])
    .withColumn(stream_config["file_path_column"], input_file_name())
)

stream_query = (
    autoloader_df.writeStream.format("delta")
    .trigger(availableNow=True)
    .option("checkpointLocation", stream_config["checkpoint_location"])
    .foreachBatch(
        lambda df_batch, batch_id: ingestion_process(
            df_batch, batch_id, sdid_workload, stream_config, logger=logger
        )
    )
    .start()
)
where the ingestion process is as follows:
def ingestion_process(df_batch, batch_id, sdid_workload, stream_config, **kwargs):
    logger: AzureLogger = kwargs.get("logger")
    iteration_start_time = datetime.utcnow()
    sdid_workload.ingestion_iteration += 1
    general_transformations(sdid_workload)
    log_custom_metrics(sdid_workload)
In log_custom_metrics I'm using:
exporter = metrics_exporter.new_metrics_exporter(connection_string=appKey, export_interval=12)
view_manager.register_exporter(exporter)
I don't want duplicated logs.
If anyone stumbles upon this post:
I was able to find a workaround in this topic:
https://github.com/census-instrumentation/opencensus-python/issues/1070
Other related topics:
https://github.com/census-instrumentation/opencensus-python/issues/1029
https://github.com/census-instrumentation/opencensus-python/issues/963
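For context, the duplication discussed in those issues typically comes from creating and registering a new exporter on every batch. A hedged sketch of the create-once pattern (my reading, not necessarily the exact workaround from the linked issue; appKey and log_custom_metrics are the names from the question):

from opencensus.ext.azure import metrics_exporter
from opencensus.stats import stats as stats_module

view_manager = stats_module.stats.view_manager

# Created and registered exactly once at module import, not per batch.
exporter = metrics_exporter.new_metrics_exporter(connection_string=appKey,
                                                 export_interval=60)
view_manager.register_exporter(exporter)

def log_custom_metrics(sdid_workload):
    # record measurements against the already-registered views here;
    # do NOT call new_metrics_exporter() on every invocation
    ...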

Is there a thing in discord.py similar to streamTime in discord.js?

I'm using the youtube_dl package for the play-music command.
I've since been working on rewind & forward commands and have implemented a basic seek command using ffmpeg options, so the only thing left is to find the current position of the track being played by the bot, so that I can seek (position +-) from that point. The only way I've figured out is to count the progression of the track like this:
async def count_progress(self):
    try:
        if not self.on_count:
            self.on_count = True
            while self.is_playing:
                await asyncio.sleep(0.99)
                self.queue._queue[0].progress += 1
            self.on_count = False
    except (AttributeError, IndexError):
        self.on_count = False
I found that discord.js has something called streamTime. Is there anything similar in discord.py? If not, is there a better way than just counting the progress?
Update: I had forgotten about this post I made, but since then I have found a really nice solution to this problem.
What I did was make a custom class that counts how much audio the player has read. (Thanks to this issue I made.)
class CalculableAudio(discord.PCMVolumeTransformer):
    def __init__(self, original, start, volume: float):
        self.played = start
        super().__init__(original, volume=volume)

    def read(self) -> bytes:
        self.played += 20  # each read() returns one 20 ms frame, so this counts milliseconds
        return super().read()
Then whenever I want to find the seconds played, I just need to do this:
seconds_played = ctx.voice_client.source.played // 1000
A few things right here at your disposal!
TL;DR - No, such a thing never existed in discord.py and never will.
In discord.js v12 there used to be a StreamDispatcher with a StreamDispatcher.streamTime property, which was removed in v13 and is being re-added as per this commit on the official discord.js repository.
As for your primary issue: discord.py has been discontinued and will not be receiving any updates from its developer, and it never had a method to access the stream time of the client's stream (voice / video - not applicable to bot users) anyway. You may refer to the documentation's VoiceClient section and verify for yourself that no such attribute or method existed at any point.
Although Python itself is at your service! You can save the time playback starts and subtract it from the time right now:
import datetime

# When the command is first used, save the start time to a database
# or simply as a variable :P
starttime = datetime.datetime.utcnow()

@client.command()
async def playtime(ctx):
    time = datetime.datetime.utcnow() - starttime
    time = str(time).split(".")[0]
    embed = discord.Embed(title="Total playtime", description=f"**time** = {time}")
    await ctx.send(embed=embed)
I found this piece of code in player.py (one of the modules of discord.py), which is responsible for playing audio:
while not self._end.is_set():
    # are we paused?
    if not self._resumed.is_set():
        # wait until we aren't
        self._resumed.wait()
        continue

    # are we disconnected from voice?
    if not self._connected.is_set():
        # wait until we are connected
        self._connected.wait()
        # reset our internal data
        self.loops = 0
        self._start = time.perf_counter()

    self.loops += 1
    data = self.source.read()

    if not data:
        self.stop()
        break

    play_audio(data, encode=not self.source.is_opus())
    next_time = self._start + self.DELAY * self.loops
    delay = max(0, self.DELAY + (next_time - time.perf_counter()))
    time.sleep(delay)
Therefore, if you do:
loop_count = ctx.voice_client._player.loops
you get the number of loops the playing thread has gone through, which can be used to represent the time position of the audio:
time_position_in_seconds = loop_count // 50
The floor division by 50 converts the loop count into seconds, since the loop runs every 20 ms, i.e. 50 times per second.
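Putting that together, a small hedged helper (note that _player is a private attribute of VoiceClient, so this can break between library versions):

def current_position_seconds(voice_client):
    # relies on the private _player attribute shown above
    player = getattr(voice_client, "_player", None)
    if player is None:
        return 0
    # each loop plays one 20 ms frame, so 50 loops == 1 second
    return player.loops // 50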

How to query the Google My Business API for Insights

I have created a report where I also want to include all the insights of a Google My Business account.
I have already been approved and have access to the GMB API with no problem. Now that I have full access, how do I successfully query it to get insight information? I have access to a team that works with PHP or Python, so I want to know what I should give them so they can start querying successfully. Can anyone help?
Download the PHP client library from here.
Here is a sample function to get location insights.
Parameters required:
locationNames should be provided as input
startTime and endTime max difference should be 18 months, e.g. (2020-01-01T15:01:23Z, 2021-01-01T15:01:23Z)
public function getLocationInsights($accountName, $parameters) {
    // Replace getClientService with a method having an access token
    $service = $this->getClientService();
    $insightReqObj = new Google_Service_MyBusiness_ReportLocationInsightsRequest();
    $locationNames = $parameters['locationNames'];
    // At least one location is mandatory
    if ($locationNames && is_array($locationNames) && count($locationNames) <= 10) {
        $insightReqObj->setLocationNames($locationNames);
    }
    $basicReqObj = new Google_Service_MyBusiness_BasicMetricsRequest();

    // datetime range is mandatory
    // TODO :: validate to not allow more than 18 months difference
    $timeRangObj = new Google_Service_MyBusiness_TimeRange();
    $timeRangObj->setStartTime($parameters['startTime']);
    $timeRangObj->setEndTime($parameters['endTime']);

    $metricReqObj = new Google_Service_MyBusiness_MetricRequest();
    $metricReqObj->setMetric('ALL');

    $basicReqObj->setMetricRequests(array($metricReqObj));
    $basicReqObj->setTimeRange($timeRangObj);
    $insightReqObj->setBasicRequest($basicReqObj);

    $allInsights = $service->accounts_locations->reportInsights($accountName, $insightReqObj);
    return $allInsights;
}
I work with Java to do the same thing.
Mine is something like this:
ReportLocationInsightsRequest content = new ReportLocationInsightsRequest();
content.setFactory(JSON_FACTORY);
content.setLocationNames(locationNames); // your location names, as a List<String>

BasicMetricsRequest basicRequest = new BasicMetricsRequest();
List<MetricRequest> metricRequests = new ArrayList<MetricRequest>();
MetricRequest metricR = new MetricRequest();
String metric = "ALL";
metricR.setMetric(metric);
metricRequests.add(metricR);
basicRequest.setMetricRequests(metricRequests); // attach the metric requests

TimeRange timeRange = new TimeRange();
timeRange.setStartTime("Desired startTime");
timeRange.setEndTime("Desired endTime");
basicRequest.setTimeRange(timeRange);
content.setBasicRequest(basicRequest);

try {
    MyBusiness.Accounts.Locations.ReportInsights locationReportInsight =
        mybusiness.accounts().locations().reportInsights(accountName, content);
    ReportLocationInsightsResponse response = locationReportInsight.execute();
    System.out.println("response is = " + response.toPrettyString());
} catch (Exception e) {
    System.out.println(e);
}
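Since the asker mentioned a Python team, here is a hedged Python sketch of the same reportInsights call (untested; the v4 discovery document URL and the service-account auth flow are assumptions, so adjust them to whatever Google provided with your API access):

from google.oauth2 import service_account
from googleapiclient.discovery import build

creds = service_account.Credentials.from_service_account_file(
    'service_account.json',
    scopes=['https://www.googleapis.com/auth/business.manage'])

# The GMB v4 API is not in the public discovery index, so the discovery
# URL below is an assumption; use the one from your API-access approval.
service = build(
    'mybusiness', 'v4', credentials=creds,
    discoveryServiceUrl='https://developers.google.com/my-business/samples/mybusiness_google_rest_v4p9.json')

body = {
    'locationNames': ['accounts/{accountId}/locations/{locationId}'],
    'basicRequest': {
        'metricRequests': [{'metric': 'ALL'}],
        'timeRange': {
            'startTime': '2020-01-01T15:01:23Z',
            'endTime': '2021-01-01T15:01:23Z',
        },
    },
}
insights = service.accounts().locations().reportInsights(
    name='accounts/{accountId}', body=body).execute()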

Python multi-threading method

I've heard that Python multi-threading is a bit tricky, and I am not sure of the best way to implement what I need. Let's say I have a function called IO_intensive_function that makes an API call which may take a while to get a response.
Say the process of queuing jobs can look something like this:

import thread  # Python 2; in Python 3 this module is _thread

for job_args in jobs:
    thread.start_new_thread(IO_intensive_function, (job_args,))
Would IO_intensive_function now just execute its task in the background and allow me to queue more jobs?
I also looked at this question; it seems the approach there is to just do the following:

from multiprocessing.dummy import Pool as ThreadPool

pool = ThreadPool(2)
results = pool.map(IO_intensive_function, jobs)
As I don't need those tasks to communicate with each other, the only goal is to send my API requests as fast as possible. Is this the most efficient way? Thanks.
Edit:
The way I am making the API request is through a Thrift service.
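For reference, a minimal sketch of the same fan-out using the standard library's concurrent.futures, the usual Python 3 idiom (IO_intensive_function and jobs are the names from the question; max_workers is arbitrary):

from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(max_workers=8) as executor:
    # map schedules every job on the pool and yields results in input order
    results = list(executor.map(IO_intensive_function, jobs))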
I had to write code to do something similar recently. I've tried to make it generic below. Note I'm a novice coder, so please forgive the inelegance. What you may find valuable, however, is some of the error handling I found it necessary to embed to capture disconnects, etc.
I also found it valuable to perform the JSON processing in a threaded manner. You have the threads working for you, so why go "serial" again for a processing step when you can extract the info in parallel?
It is possible I mis-coded something in making it generic. Please don't hesitate to ask follow-ups and I will clarify.
import time

import requests
from multiprocessing.dummy import Pool as ThreadPool
from src_code.config import Config

with open(Config.API_PATH + '/api_security_key.pem') as f:
    my_key = f.read().rstrip("\n")

base_url = "https://api.my_api_destination.com/v1"
headers = {"Authorization": "Bearer %s" % my_key}

# shared per-call arguments; each worker gets [base_url, headers, tag]
itm = list()
itm.append(base_url)
itm.append(headers)

def call_API(call_var):
    base_url = call_var[0]
    headers = call_var[1]
    call_specific_tag = call_var[2]
    endpoint = f'/api_path/{call_specific_tag}'

    # retry the request up to 3 times, pausing between failed attempts
    connection_tries = 0
    for i in range(3):
        try:
            dat = requests.get((base_url + endpoint), headers=headers).json()
        except:
            connection_tries += 1
            print(f'Call for {call_specific_tag} failed after {i} attempt(s). '
                  f'Pausing for 240 seconds.')
            time.sleep(240)
        else:
            break

    vars_to_capture_01 = list()
    vars_to_capture_02 = list()
    try:
        if 'record_id' in dat:
            vars_to_capture_01.append(dat['record_id'])
            vars_to_capture_02.append(dat['second_item_of_interest'])
        else:
            vars_to_capture_01.append(call_specific_tag)
            print(f'Call specific tag {call_specific_tag} is unavailable. Successful pull.')
            vars_to_capture_02.append(-1)
    except:
        print(f'{call_specific_tag} is unavailable. Unsuccessful pull.')
        vars_to_capture_01.append(call_specific_tag)
        vars_to_capture_02.append(-1)
        time.sleep(240)

    pack = list()
    pack.append(vars_to_capture_01)
    pack.append(vars_to_capture_02)
    return pack

vars_to_capture_01 = list()
vars_to_capture_02 = list()

i = 0
max_i = len(all_tags)  # all_tags: the full list of call-specific tags
while i < max_i:
    # work through the tags in chunks of up to 10, one thread per call
    ind_rng = range(i, min((i + 10), (max_i)), 1)
    itm_lst = (itm.copy())
    call_var = [itm_lst + [all_tags[q]] for q in ind_rng]
    # packed = call_API(call_var[0])  # for testing the function without pooling
    pool = ThreadPool(len(call_var))
    packed = pool.map(call_API, call_var)
    pool.close()
    pool.join()
    for pack in packed:
        try:
            vars_to_capture_01.append(pack[0][0])
        except:
            print(f'Unpacking error for {all_tags[i]}.')
        vars_to_capture_02.append(pack[1][0])
    i += 10  # advance to the next chunk so the loop terminates
For network API requests you can also use asyncio. Have a look at this article for an example of how to implement it: https://realpython.com/python-concurrency/#asyncio-version
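A minimal hedged sketch of that approach (it assumes the third-party aiohttp package; the URLs are placeholders):

import asyncio
import aiohttp

async def fetch(session, url):
    # one coroutine per request; awaiting the response frees the event
    # loop to progress the other requests
    async with session.get(url) as resp:
        return await resp.json()

async def main(urls):
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*(fetch(session, u) for u in urls))

results = asyncio.run(main(["https://api.example.com/item/1",
                            "https://api.example.com/item/2"]))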

How can I get all information about my vCenters in VMware vSphere with pyVmomi

I can currently get info on all VMs, all hosts, all "clusters" (datacenters?), and all datastores. Is there any other managed object type that would provide useful information?
My objective is to give my company info about the machines and their utilization, but my side project is to gather as much information as possible so that I can pair it with known outages and use machine learning to detect when systems or applications are likely to fail.
To be clear, right now this is how I gather information on VMs. (The print statements are just there so I can see what I want to put into a database and what I don't have a use for.)
def vm_and_up():
    viewTypeComputeResource = [vim.ComputeResource]
    # create container view
    containerView = content.viewManager.CreateContainerView(container, viewTypeComputeResource, recursive)
    clusters = containerView.view
    for cluster in clusters:
        print(cluster.name)
        print(cluster.summary)
        hosts = cluster.host
        host_count = 0
        for hosts2 in hosts:
            host_count = host_count + 1
            hosts_list.append(hosts2.name)
            # print(hosts2.name)
            vms = hosts2.vm
            # print('Cluster: ' + cluster.name)
            # print('Host: ' + hosts2.name)
            # vm_count = 0
            for vm in vms:
                print('Capability: ' + str(vm.capability))
                print('Datastore: ' + str(vm.datastore))
                # print('Config: ' + str(vm.config))
                print('GuestDiskInfo: ' + str(vm.guest.disk))
                print('GuestFullName: ' + str(vm.guest.guestFullName))
                print('GuestHostName: ' + str(vm.guest.hostName))
                print('GuestIpAddress: ' + str(vm.guest.ipAddress))
                print('GuestNic: ' + str(vm.guest.net))
                print('ResourcePool: ' + str(vm.resourcePool))
                print('Runtime: ' + str(vm.runtime))
                print('Layout: ' + str(vm.layout))
                # print(vm.config.hardware)
Similarly, for datacenters I use:
def see_datacenters():
    viewTypeDatacenter = [vim.Datacenter]
    containerviewDatacenter = content.viewManager.CreateContainerView(container, viewTypeDatacenter, recursive)
    datacenters = containerviewDatacenter.view
    print(len(datacenters))
    for data_center in datacenters:
        print('name: ' + str(data_center.name))
        print('overallStatus: ' + str(data_center.overallStatus))
        print('configStatus: ' + str(data_center.configStatus))
        print('configIssue: ' + str(data_center.configIssue))
        print('vmFolder: ' + str(data_center.vmFolder))
        print('recentTask: ' + str(data_center.recentTask))
        print('configuration: ' + str(data_center.configuration))
        print(data_center.datastore)
and for datastores:

def see_datastores():
    global datastore_summary_raw
    viewTypeDatastore = [vim.Datastore]
    # create container view
    containerViewDatastore = content.viewManager.CreateContainerView(container, viewTypeDatastore, recursive)
    datastores = containerViewDatastore.view
    print(len(datastores))
    print(datastores)
    for datastore in datastores:
        print('name: ' + str(datastore.name))
        print('browser: ' + str(datastore.browser))
        print('capability: ' + str(datastore.capability))
        print('info: ' + str(datastore.info))
        print('iormConfiguration: ' + str(datastore.iormConfiguration))
        print('summary: ' + str(datastore.summary))
        print('overallStatus: ' + str(datastore.overallStatus))
        datastore_summary_raw = datastore.summary
For each of these, I am creating a container view, using view types like [vim.Datastore], [vim.Datacenter], and [vim.ComputeResource].
Are there any other major ones I should concern myself with? In particular, managed object types:
https://vdc-download.vmware.com/vmwb-repository/dcr-public/6b586ed2-655c-49d9-9029-bc416323cb22/fa0b429a-a695-4c11-b7d2-2cbc284049dc/doc/index-mo_types.html
or
https://pubs.vmware.com/vi-sdk/visdk250/ReferenceGuide/index-mo_types.html
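A hedged sketch of other managed object types that often carry useful utilization data, using the same CreateContainerView pattern as above (content, container, and recursive are the variables from the question's code):

from pyVmomi import vim

view_types = [
    vim.HostSystem,              # per-host hardware, sensors, runtime
    vim.ClusterComputeResource,  # DRS/HA settings, cluster summary
    vim.ResourcePool,            # CPU/memory reservations and usage
    vim.Network,                 # networks / port groups
    vim.StoragePod,              # datastore clusters
]
for view_type in view_types:
    view = content.viewManager.CreateContainerView(container, [view_type], recursive)
    for obj in view.view:
        print(type(obj).__name__, obj.name)
    view.Destroy()

For the time-series utilization side of the machine-learning goal, content.perfManager (vim.PerformanceManager) is also worth a look.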
