Checking STOP time of EC2 instance with boto3 - python

Python 2.7
Boto3
I'm trying to get a timestamp of when the instance was stopped OR the time the last state transition took place OR a duration of how long the instance has been in the current state.
My goal is to test if an instance has been stopped for x hours.
For example,
instance = ec2.Instance('myinstanceID')
if int(instance.state['Code']) == 80:
    stop_time = instance.state_change_time()  # Dummy method.
Or something similar to that.
I see that boto3 has a launch_time attribute, and lots of ways to analyze state changes using state_transition_reason and state_reason, but I'm not seeing anything regarding the state-transition timestamp.
I've got to be missing something.
Here are the Boto3 docs for the Instance "state" attributes...
state (dict) --
    The current state of the instance.
    Code (integer) -- The low byte represents the state. The high byte is an opaque internal value and should be ignored.
        0 : pending
        16 : running
        32 : shutting-down
        48 : terminated
        64 : stopping
        80 : stopped
    Name (string) -- The current state of the instance.
state_reason (dict) --
    The reason for the most recent state transition.
    Code (string) -- The reason code for the state change.
    Message (string) -- The message for the state change.
        Server.SpotInstanceTermination : A Spot instance was terminated due to an increase in the market price.
        Server.InternalError : An internal error occurred during instance launch, resulting in termination.
        Server.InsufficientInstanceCapacity : There was insufficient instance capacity to satisfy the launch request.
        Client.InternalError : A client error caused the instance to terminate on launch.
        Client.InstanceInitiatedShutdown : The instance was shut down using the shutdown -h command from the instance.
        Client.UserInitiatedShutdown : The instance was shut down using the Amazon EC2 API.
        Client.VolumeLimitExceeded : The limit on the number of EBS volumes or total storage was exceeded. Decrease usage or request an increase in your limits.
        Client.InvalidSnapshot.NotFound : The specified snapshot was not found.
state_transition_reason (string) --
    The reason for the most recent state transition. This might be an empty string.

The EC2 instance has an attribute StateTransitionReason which also includes the time the transition happened. Use Boto3 to get the time the instance was stopped:
print status['StateTransitionReason']
...
User initiated (2016-06-23 23:39:15 GMT)
The code below prints the stopped time and the current time. Use Python to parse the times and find the difference; a parsing sketch follows the output below. Not very difficult if you know Python.
import boto3
import re

client = boto3.client('ec2')
rsp = client.describe_instances(InstanceIds=['i-03ad1f27'])
if rsp:
    status = rsp['Reservations'][0]['Instances'][0]
    if status['State']['Name'] == 'stopped':
        stopped_reason = status['StateTransitionReason']
        # The HTTP Date header of the API response doubles as "current time".
        current_time = rsp['ResponseMetadata']['HTTPHeaders']['date']
        # StateTransitionReason looks like: "User initiated (2016-06-23 23:39:15 GMT)"
        stopped_time = re.findall(r'.*\((.*)\)', stopped_reason)[0]
        print 'Stopped time:', stopped_time
        print 'Current time:', current_time
Output
Stopped time: 2016-06-23 23:39:15 GMT
Current time: Tue, 20 Dec 2016 20:33:22 GMT
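To turn those two strings into a duration, here is a minimal sketch using the example values above (both timestamps are GMT, so naive datetimes are safe to compare):
from datetime import datetime

stopped_time = '2016-06-23 23:39:15 GMT'        # from StateTransitionReason
current_time = 'Tue, 20 Dec 2016 20:33:22 GMT'  # from the HTTP Date header

# Both are GMT, so drop the zone name and compare naive datetimes.
stopped_at = datetime.strptime(stopped_time.replace(' GMT', ''), '%Y-%m-%d %H:%M:%S')
now = datetime.strptime(current_time.replace(' GMT', ''), '%a, %d %b %Y %H:%M:%S')

hours_stopped = (now - stopped_at).total_seconds() / 3600.0
print('Stopped for %.1f hours' % hours_stopped)
if hours_stopped >= 4:  # e.g. test for "stopped for at least 4 hours"
    print('Instance has exceeded the stopped-time threshold')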

You might consider using AWS Config to view the configuration history of the instances.
AWS Config is a fully managed service that provides you with an AWS resource inventory, configuration history, and configuration change notifications to enable security and governance
The get-resource-config-history command can return information about an instance, so it probably has Stop & Start times. It will take a bit of parsing to extract the details.
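For example, a rough boto3 sketch, assuming AWS Config is already recording EC2 instances in this account/region (the instance ID is the one from the answer above):
import json
import boto3

config = boto3.client('config')
history = config.get_resource_config_history(
    resourceType='AWS::EC2::Instance',
    resourceId='i-03ad1f27')
for item in history['configurationItems']:
    # Each configuration item is a point-in-time JSON snapshot of the instance.
    state = json.loads(item['configuration'])['state']['name']
    print item['configurationItemCaptureTime'], state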

Related

How can I make this web3 Python script faster?

I want to make a Python script (for BSC) which keeps track of the balance of a particular token in a wallet. I need the script to be very fast. Currently, with the code below, it takes about 6 seconds for the script to detect the token entering the wallet. Is there a faster, more efficient way to do it? (I added the sleep call to act as some kind of buffer. Don't know if it's a good idea though?)
Edit: removed the sleep call but it still takes 6s.
import json
import time

from web3 import Web3

bsc = "https://bsc-dataseed.binance.org/"
web3 = Web3(Web3.HTTPProvider(bsc))
print(web3.isConnected())

main_address = "wallet to be tracked"
contract_address = "token contract address"
abi = json.loads('the abi')
contract = web3.eth.contract(address=contract_address, abi=abi)

balanceOfToken = contract.functions.balanceOf(main_address).call()
print(web3.fromWei(balanceOfToken, 'ether'))

# Poll until more than 0.5 tokens arrive in the wallet.
x = 0
while True:
    balanceOfToken = contract.functions.balanceOf(main_address).call()
    # balanceOfToken is in wei, so compare against a wei threshold.
    if balanceOfToken > web3.toWei(0.5, 'ether'):
        break
    time.sleep(1.1)
    x += 1
    print(f"Still looking {x}")

second_address = "the other wallet address"
main_key = "private key of first wallet"
nonce = web3.eth.getTransactionCount(main_address)
token_tx = contract.functions.transfer(second_address, balanceOfToken).buildTransaction({
    'chainId': 56, 'gas': 90000, 'gasPrice': web3.toWei('5', 'gwei'), 'nonce': nonce
})
signed_tx = web3.eth.account.signTransaction(token_tx, main_key)
web3.eth.sendRawTransaction(signed_tx.rawTransaction)
print(web3.fromWei(contract.functions.balanceOf(main_address).call(), 'ether'),
      contract.functions.name().call())
Key to answering your question is: What takes 6 seconds?
Running the code from start to finish?
If I run the code on my laptop - using the same node - the code executes in 0.45-0.55s. So perhaps it is not the code itself, but your connection to the node that is slowing down calls or broadcasting the transaction? If so, maybe trying another node will speed up execution. See Binance's docs for alternatives or check a 3rd party provider.
Unlikely, but it could also be the lack of available processing power on your laptop (?)
Starting the code until the transaction shows up in the block?
The code takes c. 0.5s to run. Add the 3s target block time on BSC and you are already at 3.5s, assuming there's space in the block (/your fee is sufficient to be included) and assuming it gets broadcast and picked up immediately. I am unsure what the lower bound should be, but it will take a couple of seconds.
PS. As Mikko Ohtamaa mentioned in a comment (Aug 17 '21): "Instead of polling, you can subscribe to all new blocks and filter out events in the block yourself. (..)" To do this, you can have a look at filtering in web3py.
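A rough sketch of that approach, reusing web3, contract and main_address from the question's code, and assuming the token emits the standard ERC-20 Transfer event (note that some public RPC endpoints do not support the underlying eth_newFilter call):
import time

transfer_filter = contract.events.Transfer.createFilter(
    fromBlock='latest',
    argument_filters={'to': main_address})

while True:
    # Fires as soon as a matching Transfer lands in a new block.
    for event in transfer_filter.get_new_entries():
        print('Received', web3.fromWei(event['args']['value'], 'ether'))
    time.sleep(1)  # BSC's block time is ~3s, so 1s polling is plenty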
You can make it faster by running an Ethereum node locally. Thus, you have 100% of the Ethereum node server capacity and there is no network delay. More information here.

Python with Azure blob service

I'm trying to use the Azure blob service to upload video files to the cloud.
I'm trying to figure out what happens if my internet connection were to suddenly go out in the middle of a transfer.
There seem to be no exceptions thrown when the connection goes out.
from azure.common import AzureException
from azure.storage.blob import AppendBlobService, BlockBlobService, ContentSettings

try:
    self.append_blob_service.append_blob_from_path(
        self.container_name, blob_name, upload_queue.get(timeout=3))
except AzureException as ae:
    print("hey i caught something")  # <-- this line never seems to run
If I put the internet back on, the blob seems to upload itself after about 30 minutes. I can't find any information about this in the docs. How long does the append_blob_from_path function keep trying?
There are LinearRetry, ExponentialRetry, NoRetry and Custom retry policies.
The default is LinearRetry, which makes a maximum of 5 attempts, 5 seconds apart. So if your net connection was down for < 25 seconds, your upload will continue.
I am not sure your internet connection was down for 30 minutes; in that case it should have thrown an exception.
PS: You can look up the corresponding C# documentation for Retry policies.
The Python SDK for Azure Storage is open source: https://github.com/Azure/azure-storage-python
If we look at the calls made from append_blob_from_path(), we can see the following things:
There is a default socket timeout:
# Socket timeout in seconds
DEFAULT_SOCKET_TIMEOUT = 20
Ultimately it uses functions from StorageClient (AppendBlobService -> BaseBlobService -> StorageClient), and StorageClient uses:
self.retry = ExponentialRetry().retry
ExponentialRetry has the following constructor:
def __init__(self, initial_backoff=15, increment_base=3, max_attempts=3,
             retry_to_secondary=False, random_jitter_range=3):
    '''
    Constructs an Exponential retry object. The initial_backoff is used for
    the first retry. Subsequent retries are retried after initial_backoff +
    increment_power^retry_count seconds. For example, by default the first retry
    occurs after 15 seconds, the second after (15+3^1) = 18 seconds, and the
    third after (15+3^2) = 24 seconds.

    :param int initial_backoff:
        The initial backoff interval, in seconds, for the first retry.
    :param int increment_base:
        The base, in seconds, to increment the initial_backoff by after the
        first retry.
    :param int max_attempts:
        The maximum number of retry attempts.
    :param bool retry_to_secondary:
        Whether the request should be retried to secondary, if able. This should
        only be enabled if RA-GRS accounts are used and potentially stale data
        can be handled.
    :param int random_jitter_range:
        A number in seconds which indicates a range to jitter/randomize for the back-off interval.
        For example, a random_jitter_range of 3 results in the back-off interval x to vary between x+3 and x-3.
    '''
There is also a RetryContext, which is used by the _retry() function to decide whether a retry is needed.
If you enable INFO-level logging in your code, you will see all retries:
import logging

# Basic configuration: configure the root logger, including 'azure.storage'
logging.basicConfig(format='%(asctime)s %(name)-20s %(levelname)-5s %(message)s',
                    level=logging.INFO)
To summarize:
You have 20 seconds of socket timeout plus a dynamic retry interval (starting at 15 seconds and randomly jittered on each attempt), with 3 attempts in total. You can see exactly what is happening when you enable INFO-level logging.
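If that window is too long for your use case, both the retry policy and the socket timeout can be overridden. A minimal sketch, with placeholder credentials:
from azure.storage.blob import AppendBlobService
from azure.storage.common.retry import LinearRetry

# Fail fast: 2 attempts, 5 seconds apart, with a 10-second socket timeout
# instead of the default 20 seconds.
service = AppendBlobService(account_name='myaccount',
                            account_key='<key>',
                            socket_timeout=10)
service.retry = LinearRetry(backoff=5, max_attempts=2).retry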

I need to scrape logs from CloudWatch Logs and load them to S3, and from S3 to a data warehouse

I have several Lambda functions. I need to scrape the logs generated by all of my Lambda functions and load them into our internal data warehouse. I thought of these solutions.
Have a Lambda function subscribed to my Lambda functions' CloudWatch log groups that polishes the log messages and pushes them to S3.
Pros: Works and is simple to implement.
Cons: There is no way for me to "replay". Say my exporter failed for some reason; I wouldn't be able to replay this action.
Have a Lambda function that runs every 10 minutes or so, creates export tasks, and scrapes logs from CloudWatch into S3.
import boto3

client = boto3.client('logs')
response = client.create_export_task(
    taskName='export_task',
    logGroupName='/aws/lambda/<lambda_function_1>',
    fromTime=from_time,
    to=to_time,
    destination='<application_logs>',
    destinationPrefix='<lambda_function_1>'
)
response = client.create_export_task(
    taskName='export_task',
    logGroupName='/aws/lambda/<lambda_function_2>',
    fromTime=from_time,
    to=to_time,
    destination='<application_logs>',
    destinationPrefix='<lambda_function_2>'
)
The second create_export_task call fails here:
An error occurred (LimitExceededException) when calling the
CreateExportTask operation: Resource limit exceeded.
I can't create multiple export tasks. Is there a way to address this?
From AWS docs: One active (running or pending) export task at a time, per account. This limit cannot be changed.
You can use the code below to check whether the status has changed to 'COMPLETED':
import time
from datetime import datetime, timedelta

import boto3

client = boto3.client('logs')

# Time window: the previous day, converted to epoch milliseconds.
unix_start = datetime(1970, 1, 1)
today = datetime.utcnow()
yesterday = today - timedelta(days=1)

response = client.create_export_task(
    taskName='export_cw_to_s3',
    logGroupName='/ecs/',
    logStreamNamePrefix=org_id,  # org_id: your log stream prefix, defined elsewhere
    fromTime=int((yesterday - unix_start).total_seconds() * 1000),
    to=int((today - unix_start).total_seconds() * 1000),
    destination='test-bucket',
    destinationPrefix=f'random-string/{today.year}/{today.month}/{today.day}/{org_id}')
taskId = response['taskId']
status = 'RUNNING'
while status in ['RUNNING', 'PENDING']:
    time.sleep(1)  # avoid hammering the API while polling
    response_desc = client.describe_export_tasks(taskId=taskId)
    status = response_desc['exportTasks'][0]['status']['code']
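Putting this together for the original two log groups: a rough sketch (reusing the question's placeholder names and its from_time/to_time variables) that serializes the exports, waiting for each task to leave the active states before starting the next:
import time

import boto3

client = boto3.client('logs')

def export_and_wait(log_group, prefix, from_time, to_time):
    # Start one export task, then block until it is no longer active.
    task_id = client.create_export_task(
        taskName='export_task',
        logGroupName=log_group,
        fromTime=from_time,
        to=to_time,
        destination='<application_logs>',
        destinationPrefix=prefix)['taskId']
    while True:
        code = client.describe_export_tasks(
            taskId=task_id)['exportTasks'][0]['status']['code']
        if code not in ('RUNNING', 'PENDING'):
            return code
        time.sleep(5)

for name in ('<lambda_function_1>', '<lambda_function_2>'):
    export_and_wait('/aws/lambda/' + name, name, from_time, to_time)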
I came across the same error message, and the reason is that you can only have one running or pending export task per account at a given time; hence this task is failing. From the AWS docs: One active (running or pending) export task at a time, per account. This limit cannot be changed.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/cloudwatch_limits_cwl.html
Sometimes one CreateExportTask stays in the pending state for a long time, preventing other Lambda functions with the same task from running. You can find this task and cancel it, allowing the other functions to run.

Boto3 and AWS Lambda - deleting snapshots older than

I'm currently utilising AWS Lambda to create snapshots of my database and delete snapshots older than 6 days. I'm using the Boto3 library to interface with the AWS API. I'm using a CloudWatch rule to trigger the deletion code every day.
Normally this is working fine, but I've come across an issue where at the start of the month (first 6 days) the delete script does not appear to delete any snapshots, even though snapshots older than 6 days exist.
The code is below:
import json
import boto3
from datetime import datetime, timedelta, tzinfo

class Zone(tzinfo):
    def __init__(self, offset, isdst, name):
        self.offset = offset
        self.isdst = isdst
        self.name = name
    def utcoffset(self, dt):
        return timedelta(hours=self.offset) + self.dst(dt)
    def dst(self, dt):
        return timedelta(hours=1) if self.isdst else timedelta(0)
    def tzname(self, dt):
        return self.name

UTC = Zone(10, False, 'UTC')

# Setting retention period of 6 days (computed once, when the module loads)
retentionDate = datetime.now(UTC) - timedelta(days=6)

def lambda_handler(event, context):
    print("Connecting to RDS")
    rds = boto3.setup_default_session(region_name='ap-southeast-2')
    client = boto3.client('rds')
    snapshots = client.describe_db_snapshots(SnapshotType='manual')
    print('Deleting all DB Snapshots older than %s' % retentionDate)
    for i in snapshots['DBSnapshots']:
        if i['SnapshotCreateTime'] < retentionDate:
            print('Deleting snapshot %s' % i['DBSnapshotIdentifier'])
            client.delete_db_snapshot(DBSnapshotIdentifier=i['DBSnapshotIdentifier'])
The code looks perfectly fine and you are following the documentation.
I would simply add
print(i['SnapshotCreateTime'], retentionDate)
in the for loop; the logs will quickly tell you what's going on at the beginning of each month.
By the way, are you using RDS on AWS? RDS supports automatic snapshot creation and also lets you define a retention period, so there may be no need for custom Lambda scripts.
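If you do rely on RDS automated snapshots instead, the retention period is a single API call; a hedged sketch using the question's region (the instance identifier is a placeholder):
import boto3

client = boto3.client('rds', region_name='ap-southeast-2')
# Keep automated backups for 6 days; RDS expires them itself afterwards.
client.modify_db_instance(
    DBInstanceIdentifier='mydbinstance',  # placeholder identifier
    BackupRetentionPeriod=6,
    ApplyImmediately=True)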
Due to the distributed nature of CloudWatch Events and the target services, the delay between the time the scheduled rule is triggered and the time the target service actually executes the target might be several seconds. Your scheduled rule will be triggered within that minute, but not on the precise 0th second.
In that case, the current time at execution may slip by a few seconds, and the retention date shifts with it. The drift should be minimal, but there is still a chance of a missed deletion; going by that, the subsequent run should delete the ones missed in the earlier run.

GAE: Task on backend instance killed without warning

TL;DR:
How can I work around this bug in Appengine: sometimes is_shutting_down returns False, and in a second or two, the instance is shut down?
Details
I have a backend instance on a Google Appengine application (Python). The backend instance is used to generate reports, which sometimes takes minutes or even hours to finish.
To deal with unexpected shutdowns, I am watching for runtime.is_shutting_down() and store the report's intermediate state into DB when is_shutting_down returns True.
Here's the portion of code where I check it:
from google.appengine.api import runtime
# ...

def my_report_function():
    # ...
    # Check if we should interrupt and reschedule to avoid a timeout error.
    duration_sec = time.time() - start
    too_long = MAX_SEC < duration_sec
    is_shutting_down = runtime.is_shutting_down()
    log.debug('Does this report iteration need to wrap it up soon? '
              'Too long? %s (%s sec). Shutting down? %s'
              % (too_long, duration_sec, is_shutting_down))
    if too_long or is_shutting_down:
        # save the state of the report, reschedule the next iteration, and return
Sometimes it works, but sometimes I see the following in the Appengine log:
D 2013-06-20 18:41:56.893 Does this report iteration need to wrap it up soon? Too long? False (348.865950108 sec). Shutting down? False
E 2013-06-20 18:42:00.248 Process terminated because the backend took too long to shutdown.
Clearly, the 30-second timeout has not passed between the time when I checked the value returned by runtime.is_shutting_down(), and when Appengine killed the backend.
Does anybody know why this is happening, and whether there is a workaround for this?
Thank you in advance!
There is demo code from Google I/O here: http://backends-io.appspot.com/
The included counter_v3_with_write_behind.py demonstrates a pattern:
On '/_ah/start', set a shutdown hook via
runtime.set_shutdown_hook(something_to_save_progress_and_requeue_task)
It looks like your code asks 'are you shutting down right now? if not, go do something that may take a while'. This pattern should instead listen for 'shut down ASAP or you lose everything'.
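A minimal sketch of that pattern on the legacy Python runtime; save_report_state and the resume URL are hypothetical stand-ins for your own code:
from google.appengine.api import runtime, taskqueue

def _shutdown_hook():
    # Called when the backend receives its shutdown notice: persist the
    # report's intermediate state, then re-enqueue the remaining work.
    save_report_state()                   # hypothetical persistence call
    taskqueue.add(url='/reports/resume')  # hypothetical resume endpoint

# Register the hook when the instance starts (the /_ah/start handler):
runtime.set_shutdown_hook(_shutdown_hook)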
