We are trying to use boto3 to query our AWS instances and get all security groups which have rules with no description. I have not been able to figure out how to retrieve the Rule description. I have included an image of the column I want to retrieve below.
I wanted to share the solution I came up with late in the evening, piecing together bits from other solutions, primarily from @Abdul Gill and his ec2_sg_rules script.
import boto3

# Module-level placeholders; only cidr_block is actually used below
cidr_block = ""
ip_protocol = ""
from_port = ""
to_port = ""
from_source = ""
description = ""

# Only look at security groups whose name contains "ssh"
sg_filter = [{'Name': 'group-name', 'Values': ['*ssh*']}]

print("%s,%s,%s" % ("Group-Name", "Group-ID", "CidrIp"))

ec2 = boto3.client('ec2')
sgs = ec2.describe_security_groups(Filters=sg_filter)["SecurityGroups"]
for sg in sgs:
    group_name = sg['GroupName']
    group_id = sg['GroupId']
    # print("%s,%s" % (group_name, group_id))
    # Inbound permissions
    inbound = sg['IpPermissions']
    for rule in inbound:
        # Is the source an IPv4 range?
        if len(rule['IpRanges']) > 0:
            for ip_range in rule['IpRanges']:
                cidr_block = ip_range['CidrIp']
                # The Description key is absent when the rule has no description
                if 'Description' not in ip_range:
                    # Skip 10.x.x.x ranges; startswith avoids matching e.g. 192.168.10.0/24
                    if not cidr_block.startswith('10.'):
                        print("%s,%s,%s" % (group_name, group_id, cidr_block))
print('\n')
Hopefully this helps someone else.
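The script above checks only IPv4 ranges on inbound rules. As a sketch of how the same check could be extended, the helper below walks a describe_security_groups response and also inspects IPv6 ranges, prefix lists, and referenced security groups, each of which carries its own optional Description key. The function name rules_without_description and the sample data are hypothetical; the helper works on the plain response dictionary, so it can be tried without AWS credentials.

```python
# Each kind of rule source lives under its own key in an IpPermissions entry,
# and each has a different field that identifies the source.
SOURCE_FIELDS = {
    "IpRanges": "CidrIp",
    "Ipv6Ranges": "CidrIpv6",
    "PrefixListIds": "PrefixListId",
    "UserIdGroupPairs": "GroupId",
}


def rules_without_description(security_groups, directions=("IpPermissions",)):
    """Return (group_name, group_id, source) for each rule source lacking a description."""
    missing = []
    for sg in security_groups:
        for direction in directions:
            for rule in sg.get(direction, []):
                for key, id_field in SOURCE_FIELDS.items():
                    for source in rule.get(key, []):
                        # Treat an absent or empty Description the same way
                        if not source.get("Description"):
                            missing.append(
                                (sg["GroupName"], sg["GroupId"], source.get(id_field))
                            )
    return missing


# Hypothetical sample shaped like describe_security_groups()["SecurityGroups"]
sample = [
    {
        "GroupName": "ssh-bastion",
        "GroupId": "sg-0123",
        "IpPermissions": [
            {
                "IpRanges": [
                    {"CidrIp": "203.0.113.0/24"},
                    {"CidrIp": "198.51.100.7/32", "Description": "office"},
                ],
                "Ipv6Ranges": [{"CidrIpv6": "2001:db8::/32"}],
            }
        ],
    }
]

result = rules_without_description(sample)
for row in result:
    print("%s,%s,%s" % row)
```

Passing directions=("IpPermissions", "IpPermissionsEgress") would cover outbound rules as well.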
The following code gives the following Print Output:
-------Recognizing business card #1--------
Contact First Name: Chris has confidence: 1.0
Contact Last Name: Smith has confidence: 1.0
The code that provides the above output is:
bcUrl = "https://raw.githubusercontent.com/Azure/azure-sdk-for-python/master/sdk/formrecognizer/azure-ai-formrecognizer/samples/sample_forms/business_cards/business-card-english.jpg"
poller = form_recognizer_client.begin_recognize_business_cards_from_url(bcUrl)
business_cards = poller.result()
for idx, business_card in enumerate(business_cards):
    print("--------Recognizing business card #{}--------".format(idx+1))
    contact_names = business_card.fields.get("ContactNames")
    if contact_names:
        for contact_name in contact_names.value:
            print("Contact First Name: {} has confidence: {}".format(
                contact_name.value["FirstName"].value, contact_name.value["FirstName"].confidence
            ))
            print("Contact Last Name: {} has confidence: {}".format(
                contact_name.value["LastName"].value, contact_name.value["LastName"].confidence
            ))
I am trying to refactor the code so that it outputs the results to a dataframe, as follows:
import pandas as pd
field_list = ["FirstName", "LastName"]
df = pd.DataFrame(columns=field_list)
bcUrl = "https://raw.githubusercontent.com/Azure/azure-sdk-for-python/master/sdk/formrecognizer/azure-ai formrecognizer/samples/sample_forms/business_cards/business-card-english.jpg"
for blob in container.list_blobs():
    blob_url = container_url + "/" + blob.name
    poller = form_recognizer_client.begin_recognize_business_cards_from_url(bcUrl)
    business_cards = poller.result()
    print("Scanning " + blob.name + "...")
    for idx, business_card in enumerate(business_cards):
        single_df = pd.DataFrame(columns=field_list)
        for field in field_list:
            entry = business_card.fields.get(field)
            if entry:
                single_df[field] = [entry.value]
        single_df['FileName'] = blob.name
        df = df.append(single_df)
df = df.reset_index(drop=True)
df
However, my code does not provide any output.
Can someone take a look and let me know why I'm not getting any output?
When I tried to connect to Blob Storage, I got the same kind of error. I followed the steps below to connect to Blob Storage, and I also deleted the .json and other .fott files so that only the PDFs remained in the container. After that, the same code ran without any problem. The references below have detailed information.
Install packages
Connect to Azure Storage Container
Enable Cognitive Services
Send files to Cognitive Services
Reference:
https://www.youtube.com/watch?v=hQ2NeO4c9iI&t=458s
Azure Databricks and Form Recognizer - Invalid Image or password protected - Stack Overflow
https://github.com/tomweinandy/form_recognizer_demo
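Judging from the working snippet in the question, FirstName and LastName are not top-level fields: they sit inside each item of ContactNames, so business_card.fields.get("FirstName") would return None and no column ever gets filled. The loop also passes bcUrl rather than blob_url to the client. The sketch below shows one way to flatten the nested fields into rows; the helper name cards_to_rows and the stand-in objects are hypothetical, built with SimpleNamespace so that the logic can be run without an Azure resource.

```python
from types import SimpleNamespace


def cards_to_rows(business_cards, file_name, field_list=("FirstName", "LastName")):
    """Flatten the ContactNames entries of each card into one dict per contact."""
    rows = []
    for card in business_cards:
        contact_names = card.fields.get("ContactNames")
        if not contact_names:
            continue
        for contact in contact_names.value:
            row = {"FileName": file_name}
            for field in field_list:
                entry = contact.value.get(field)
                row[field] = entry.value if entry else None
            rows.append(row)
    return rows


def _field(value):
    # Hypothetical stand-in for a recognized form field (has .value and .confidence)
    return SimpleNamespace(value=value, confidence=1.0)


fake_card = SimpleNamespace(
    fields={
        "ContactNames": _field(
            [_field({"FirstName": _field("Chris"), "LastName": _field("Smith")})]
        )
    }
)

rows = cards_to_rows([fake_card], "business-card-english.jpg")
print(rows)
```

Calling pd.DataFrame(rows) on the collected list would then build the frame in one step, which also avoids DataFrame.append (removed in pandas 2.0).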
Currently, I'm making two calls to AWS EC2 using boto3 to fetch the subnet IDs whose Name tag starts with org-production-* and org-non-production-*. How can I combine these two functions in Python and still be able to access the subnet IDs in all_prod_subnets and all_non_prod_subnets? I would like to avoid code duplication by making just one call to AWS EC2, getting all subnets, then iterating over them and filtering the response based on the tag name.
def get_all_production_subnets_from_accounts():
    subnet = vpc_client.describe_subnets(
        Filters=[{'Name': 'tag:Name', 'Values': ['org-production-*']}])['Subnets']
    if len(subnet) > 0:
        # print([s['SubnetId'] for s in subnet])
        all_prod_subnets = [ s['SubnetId'] for s in subnet ]
        print("[DEBUG]Queried Subnet ID's of Production are: {}".format(all_prod_subnets))
        return all_prod_subnets
    else:
        return None

def get_all_nonproduction_subnets_from_acccounts():
    subnet = vpc_client.describe_subnets(
        Filters=[{'Name': 'tag:Name', 'Values': ['org-non-production-*']}])['Subnets']
    if len(subnet) > 0:
        # print([s['SubnetId'] for s in subnet])
        all_non_prod_subnets = [ s['SubnetId'] for s in subnet ]
        print("[DEBUG]Queried Subnet ID's of Non-Production are: {}".format(all_non_prod_subnets))
        return all_non_prod_subnets
    else:
        return None
One way would be as follows:
def get_all_subnets_from_connectivity():
    subnets_found = {}
    # define subnet types of interest
    subnets_found['org-production'] = []
    subnets_found['org-non-production'] = []
    results = vpc_client.describe_subnets()
    for subnet in results['Subnets']:
        if 'Tags' not in subnet:
            continue
        for tag in subnet['Tags']:
            if tag['Key'] != 'Name': continue
            for subnet_type in subnets_found:
                if subnet_type in tag['Value']:
                    subnets_found[subnet_type].append(subnet['SubnetId'])
    return subnets_found

all_subnets = get_all_subnets_from_connectivity()
print(all_subnets)
The example output:
{
'org-production': ['subnet-033bad31433b55e72', 'subnet-019879db91313d56a'],
'org-non-production': ['subnet-06e3bc20a73b55283']
}
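The function above fetches every subnet and filters on the client side. Since a filter's Values list accepts several patterns, a possible variation is to let EC2 do the filtering in a single call and then split the result by prefix. This is only a sketch: split_subnets_by_prefix and the FakeVpcClient stand-in are hypothetical names, and the fake client takes the place of boto3.client('ec2') so the logic can be run offline.

```python
PREFIXES = ("org-production-", "org-non-production-")


def split_subnets_by_prefix(vpc_client, prefixes=PREFIXES):
    """Make one describe_subnets call and group subnet IDs by Name-tag prefix."""
    response = vpc_client.describe_subnets(
        Filters=[{"Name": "tag:Name", "Values": [p + "*" for p in prefixes]}]
    )
    found = {prefix: [] for prefix in prefixes}
    for subnet in response["Subnets"]:
        # Pick the Name tag value, or "" when the subnet has no Name tag
        name = next(
            (t["Value"] for t in subnet.get("Tags", []) if t["Key"] == "Name"), ""
        )
        for prefix in prefixes:
            if name.startswith(prefix):
                found[prefix].append(subnet["SubnetId"])
    return found


class FakeVpcClient:
    """Hypothetical stand-in that returns a canned describe_subnets response."""

    def describe_subnets(self, Filters):
        return {
            "Subnets": [
                {"SubnetId": "subnet-033bad31433b55e72",
                 "Tags": [{"Key": "Name", "Value": "org-production-a"}]},
                {"SubnetId": "subnet-06e3bc20a73b55283",
                 "Tags": [{"Key": "Name", "Value": "org-non-production-a"}]},
                {"SubnetId": "subnet-019879db91313d56a",
                 "Tags": [{"Key": "Name", "Value": "org-production-b"}]},
            ]
        }


all_subnets = split_subnets_by_prefix(FakeVpcClient())
print(all_subnets)
```

In an account with many subnets the response may be paginated, in which case a paginator obtained with get_paginator('describe_subnets') would probably be needed instead of a single call.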
I am trying to get the list of all unused AMIs using boto3.
creating variables
import boto3
REGION = 'us-east-1'
OWNER_ID = 'XXXXXXXXXXXX'
ec2_client = boto3.client('ec2', REGION)
ec2_resource = boto3.resource('ec2', REGION)
get AMIs ID by instances
def get_used_amis_by_instances():
    reservations = ec2_client.describe_instances(
        Filters=[
            {
                'Name': 'owner-id',
                'Values': [
                    OWNER_ID,
                ]
            },
        ])['Reservations']
    amis_used_list = []
    for reservation in reservations:
        ec2_instances = reservation['Instances']
        for ec2 in ec2_instances:
            ImageId = ec2['ImageId']
            if ImageId not in amis_used_list:
                amis_used_list = amis_used_list + [ImageId]
    return amis_used_list
get the all AMIs ID
I just need the list of AMIs ID
def get_all_amis():
    amis = ec2_resource.images.filter(Owners = [OWNER_ID])
    all_amis = []
    for ami in amis.all():
        if ami.id not in all_amis:
            all_amis = all_amis + [ami.id]
    return all_amis
get unused AMIs.
Using the previous methods I have the all_amis and all_used_amis.
def get_unused_amis(all_amis, all_used_amis):
    unused_amis_list = []
    for ami_id in all_amis:
        if ami_id not in all_used_amis:
            if ami_id not in unused_amis_list:
                unused_amis_list = unused_amis_list + [ami_id]
    return unused_amis_list
Output the results
def deregister_amis(event, context):
    all_amis = get_all_amis()
    print("All AMIs: %d" % len(all_amis) )
    all_used = get_used_amis_by_instances()
    print("All used AMIs: %d" % len(all_used) )
    all_unused = get_unused_amis(all_amis, all_used)
    print("All unused AMIs: %d" % len(all_unused) )
My problem
The deregister_amis is returning
All AMIs: 201
All used AMIs: 102
All unused AMIs: 140
I expected 99 unused AMIs, and I don't see where my error is. Perhaps the used value of 102 is wrong. The total of 201 is correct, but for the other two values I may be doing something wrong or missing something. Let me know if you can see where my error is, because I can't.
One AMI can be used by many instances. With a list, the same AMI may be added more than once (when multiple instances use the same AMI). A list is fine if you want the number of running instances, but not the number of AMIs.
You should use set() in the first place, not a list. Just convert all your lists to set(). In addition, sets support subtraction, union, etc., so you really don't need the redundant get_unused_amis() function.
It is very simple with sets:
unused_ami = all_amis - all_used_amis
quick hack
def deregister_amis(event, context):
    all_amis = set(get_all_amis())
    print("All AMIs: %d" % len(all_amis) )
    all_used = set(get_used_amis_by_instances())
    print("All used AMIs: %d" % len(all_used) )
    all_unused = all_amis - all_used
    print("All unused AMIs: %d" % len(all_unused) )
On second thought, I think the quick hack is better. Retaining the lists gives you one extra advantage: you can check how many AMIs are used more than once.
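The set difference also helps explain the numbers in the question. all_amis - all_used removes only those used AMIs that are also in all_amis, so the result is 99 only if every used AMI is owned by the account. One possible explanation (an assumption, since the thread does not confirm it) is that some instances run AMIs not owned by OWNER_ID, such as public or already deregistered images: 201 - 140 = 61 used AMIs would then be owned, and the remaining 41 would not. The sketch below reproduces those counts with made-up IDs.

```python
# Made-up IDs for illustration only; no AWS calls are involved.
all_amis = {"ami-%03d" % i for i in range(201)}              # AMIs owned by the account
owned_and_used = {"ami-%03d" % i for i in range(61)}         # used AMIs that are owned
foreign_and_used = {"ami-ext-%02d" % i for i in range(41)}   # used AMIs that are not owned
all_used = owned_and_used | foreign_and_used

# Set difference drops only the used AMIs that appear in all_amis
unused = all_amis - all_used
# The reverse difference reveals used AMIs the account does not own
used_but_not_owned = all_used - all_amis

print(len(all_amis), len(all_used), len(unused), len(used_but_not_owned))
```

Printing all_used - all_amis for the real data would show whether this is what happens in the account.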
I have code which prints the public IPs for the running instances:
regions = ['us-east-1','us-west-1','us-west-2','eu-west-1','sa-east-1','ap-southeast-1','ap-southeast-2','ap-northeast-1']
for region in regions:
    client = boto3.client('ec2',aws_access_key_id=ACCESS_KEY,aws_secret_access_key=SECRET_KEY,region_name=region,)
    addresses_dict = client.describe_addresses()
    for eip_dict in addresses_dict['Addresses']:
        if 'PrivateIpAddress' in eip_dict:
            print(eip_dict['PublicIp'])
This is fine. Now I also want to print the tag name and store it in another dict. I know this can be done by:
regions = ['us-east-1','us-west-1','us-west-2','eu-west-1','sa-east-1','ap-southeast-1','ap-southeast-2','ap-northeast-1']
for region in regions:
    client = boto3.client('ec2',aws_access_key_id=ACCESS_KEY,aws_secret_access_key=SECRET_KEY,region_name=region,)
    dex_dict = client.describe_tags()
    for dexy_dict in dex_dict['Tags']:
        print(dexy_dict['Value'])
The problem is: how do I combine these in one function and use two dicts, one to store the IPs and another to store the tag names? Please help.
Try the following code, it will give you a dictionary where the key is the InstanceId and the value is a list of [PublicIP, Name].
import boto3

def instance_info():
    instance_information = {}
    ip_dict = {}
    client = boto3.client('ec2')
    # Elastic IPs: map each associated instance to [PublicIp]
    addresses_dict = client.describe_addresses().get('Addresses')
    for address in addresses_dict:
        if address.get('InstanceId'):
            instance_information[address['InstanceId']] = [address.get('PublicIp')]
    # Fetch only "Name" tags so that other tag values are not appended
    dex_dict = client.describe_tags(
        Filters=[{'Name': 'key', 'Values': ['Name']}]).get('Tags')
    for dex in dex_dict:
        if instance_information.get(dex['ResourceId']):
            instance_information[dex['ResourceId']].append(dex.get('Value'))
    # Build PublicIp -> Name ('' when the instance has no Name tag)
    for instance in instance_information:
        if len(instance_information[instance]) == 2:
            ip_dict[instance_information[instance][0]] = instance_information[instance][1]
        else:
            ip_dict[instance_information[instance][0]] = ''
    return instance_information, ip_dict
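The function above queries only the default region, while the question loops over several and asks for two dicts. A sketch of a per-region variant follows; collect_ips_and_names and its client_factory parameter are hypothetical, and the factory is injected so that the logic can be exercised with a stand-in instead of a live boto3 client.

```python
def collect_ips_and_names(regions, client_factory):
    """Return two dicts keyed by instance ID: public IPs and Name tag values."""
    ips = {}    # InstanceId -> PublicIp
    names = {}  # InstanceId -> Name tag value
    for region in regions:
        client = client_factory(region)
        for address in client.describe_addresses().get("Addresses", []):
            # Only Elastic IPs that are attached to an instance
            if address.get("InstanceId"):
                ips[address["InstanceId"]] = address["PublicIp"]
        tags = client.describe_tags(
            Filters=[{"Name": "key", "Values": ["Name"]}]
        ).get("Tags", [])
        for tag in tags:
            if tag["ResourceId"] in ips:
                names[tag["ResourceId"]] = tag["Value"]
    return ips, names


class FakeClient:
    """Hypothetical stand-in returning canned responses per region."""

    DATA = {
        "us-east-1": ("i-aaa", "198.51.100.10", "web-1"),
        "eu-west-1": ("i-bbb", "203.0.113.20", "db-1"),
    }

    def __init__(self, region):
        self.instance_id, self.public_ip, self.name = self.DATA[region]

    def describe_addresses(self):
        return {"Addresses": [
            {"InstanceId": self.instance_id, "PublicIp": self.public_ip},
            {"PublicIp": "192.0.2.99"},  # unattached address, ignored
        ]}

    def describe_tags(self, Filters):
        return {"Tags": [
            {"ResourceId": self.instance_id, "Key": "Name", "Value": self.name},
        ]}


ips, names = collect_ips_and_names(["us-east-1", "eu-west-1"], FakeClient)
print(ips)
print(names)
```

With boto3, the factory could be something like lambda region: boto3.client('ec2', region_name=region).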
I'm using YouTube API with Python. I can already gather all the comments of a specific video, including the name of the authors, the date and the content of the comments.
I can also, with a separate piece of code, extract the personal information (age, gender, interests, ...) of a specific author.
But I cannot use them in one place, i.e. I need to gather all the comments of a video, with the names of the comments' authors and the personal information of all those authors.
Below is the code that I developed. But I get a 'RequestError' which I don't know how to handle, and I don't know where the problem is.
import gdata.youtube
import gdata.youtube.service
yt_service = gdata.youtube.service.YouTubeService()
f = open('test1.csv','w')
f.writelines(['UserName',',','Age',',','Date',',','Comment','\n'])
def GetAndPrintVideoFeed(string1):
    yt_service = gdata.youtube.service.YouTubeService()
    user_entry = yt_service.GetYouTubeUserEntry(username = string1)
    X = PrintentryEntry(user_entry)
    return X

def PrintentryEntry(entry):
    # print required fields where we know there will be information
    Y = entry.age.text
    return Y

def GetComment(next1):
    yt_service = gdata.youtube.service.YouTubeService()
    nextPageFeed = yt_service.GetYouTubeVideoCommentFeed(next1)
    for comment_entry in nextPageFeed.entry:
        string1 = comment_entry.author[0].name.text.split("/")[-1]
        Z = GetAndPrintVideoFeed(string1)
        string2 = comment_entry.updated.text.split("/")[-1]
        string3 = comment_entry.content.text.split("/")[-1]
        f.writelines( [str(string1),',',Z,',',string2,',',string3,'\n'])
    next2 = nextPageFeed.GetNextLink().href
    GetComment(next2)

video_id = '8wxOVn99FTE'
comment_feed = yt_service.GetYouTubeVideoCommentFeed(video_id=video_id)
for comment_entry in comment_feed.entry:
    string1 = comment_entry.author[0].name.text.split("/")[-1]
    Z = GetAndPrintVideoFeed(string1)
    string2 = comment_entry.updated.text.split("/")[-1]
    string3 = comment_entry.content.text.split("/")[-1]
    f.writelines( [str(string1),',',Z,',',string2,',',string3,'\n'])
next1 = comment_feed.GetNextLink().href
GetComment(next1)
I think you need a better understanding of the YouTube API and how everything fits together. I've written wrapper classes which can handle multiple types of feeds or entries and which "fix" gdata's inconsistent parameter conventions.
Here are some snippets showing how the scraping/crawling can be generalized without too much difficulty.
I know this isn't directly answering your question. It's more of a high-level design, but it's worth thinking about if you're going to be pulling a large amount of YouTube/gdata data.
def get_feed(thing=None, feed_type=api.GetYouTubeUserFeed):
    if feed_type == 'user':
        feed = api.GetYouTubeUserFeed(username=thing)
    if feed_type == 'related':
        feed = api.GetYouTubeRelatedFeed(video_id=thing)
    if feed_type == 'comments':
        feed = api.GetYouTubeVideoCommentFeed(video_id=thing)
    feeds = []
    entries = []
    while feed:
        feeds.append(feed)
        feed = api.GetNext(feed)
    [entries.extend(f.entry) for f in feeds]
    return entries
...
def myget(url,service=None):
    def myconverter(x):
        logfile = url.replace('/',':')+'.log'
        logfile = logfile[len('http://gdata.youtube.com/feeds/api/'):]
        my_logger.info("myget: %s" % url)
        if service == 'user_feed':
            return gdata.youtube.YouTubeUserFeedFromString(x)
        if service == 'comment_feed':
            return gdata.youtube.YouTubeVideoCommentFeedFromString(x)
        if service == 'comment_entry':
            return gdata.youtube.YouTubeVideoCommentEntryFromString(x)
        if service == 'video_feed':
            return gdata.youtube.YouTubeVideoFeedFromString(x)
        if service == 'video_entry':
            return gdata.youtube.YouTubeVideoEntryFromString(x)
    return api.GetWithRetries(url,
        converter=myconverter,
        num_retries=3,
        delay=2,
        backoff=5,
        logger=my_logger
    )
mapper={}
mapper[api.GetYouTubeUserFeed]='user_feed'
mapper[api.GetYouTubeVideoFeed]='video_feed'
mapper[api.GetYouTubeVideoCommentFeed]='comment_feed'
https://gist.github.com/2303769 data/service.py (routing)
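The while feed: loop in get_feed is the key idea: follow next-page links iteratively until none remain, instead of recursing as GetComment does in the question (where GetNextLink() presumably returns None on the last page, so .href would fail). A library-agnostic sketch of that loop follows; iter_entries, fetch_next, and the stand-in pages are hypothetical and do not use gdata.

```python
def iter_entries(first_page, fetch_next):
    """Yield entries from every page, following next links until none remain."""
    page = first_page
    while page is not None:
        for entry in page["entries"]:
            yield entry
        # fetch_next returns None when there is no further page, ending the loop
        page = fetch_next(page)


# Hypothetical stand-in for paged comment feeds
pages = {
    "p1": {"entries": ["c1", "c2"], "next": "p2"},
    "p2": {"entries": ["c3"], "next": None},
}


def fetch_next(page):
    # dict.get returns None for a missing key, including the None "next" marker
    return pages.get(page["next"])


comments = list(iter_entries(pages["p1"], fetch_next))
print(comments)
```

Because the loop checks for a missing next page before using it, the last page ends the iteration cleanly rather than raising an error.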