Azure Functions IP addresses out of range - python

I have an Azure Function that performs calculations and stores and reads data in my own Cosmos DB and in one external database via a REST API.
From the Azure Portal, I can see the "outboundIpAddresses" and "possibleOutboundIpAddresses" (subscriptions > {your subscription} > providers > Microsoft.Web > sites), 12 IP addresses in total. When I run the function locally (VS Code), everything goes smoothly. However, when I deploy the function, I get the following error:
Result: Failure Exception: CosmosHttpResponseError: (Forbidden) Request originated from client IP <IP-address> through public internet. This is blocked by your Cosmos DB account firewall settings
This is self-explanatory in itself, but the problem is that the IP address mentioned in the error message belongs to neither "outboundIpAddresses" nor "possibleOutboundIpAddresses". And almost every time the function gets triggered, the client IP in the error message changes.
Do you have any ideas why this happens and how to solve the issue?

Is your function app on the Consumption plan? If so, when a function app running on the Consumption plan is scaled, a new range of outbound IP addresses may be assigned. When running on the Consumption plan, you may need to whitelist the entire data center.
On a further note, if you are on an App Service plan, you have the option of assigning a dedicated IP address.
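As a side note, you can also pull the same two properties programmatically instead of digging through the portal. A minimal sketch, assuming the azure-identity and azure-mgmt-web packages and placeholder names for the subscription, resource group and function app:

from azure.identity import DefaultAzureCredential
from azure.mgmt.web import WebSiteManagementClient

# Placeholder subscription ID, resource group and app name
client = WebSiteManagementClient(DefaultAzureCredential(), "<subscription-id>")
site = client.web_apps.get("<resource-group>", "<function-app-name>")

# Comma-separated strings of the current and possible outbound IPs
print(site.outbound_ip_addresses)
print(site.possible_outbound_ip_addresses)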

Related

Python Redis: Not able to connect to AWS Redis cluster from local machine or server

I have the following script
import redis
client = redis.Redis.from_url('redis://xxx.amazonaws.com:6379')
client.ping()
This works when I run it on a throwaway EC2 instance.
However, when I run it locally or on a local server, I get:
redis.exceptions.ConnectionError: Error 11 connecting to xxx.amazonaws.com:6379. Resource temporarily unavailable.
Is this something to do with the VPC? If so, what is the way around it?
Thanks
ElastiCache Redis is a VPC-only service, i.e. you can only connect to it from resources within your VPC, such as an EC2 instance or a Lambda function.
If you want to connect from outside, you will first need something that gives you VPC access, like an AWS VPN or a Transit Gateway.
I think this link discusses it - Accessing ElastiCache resources from outside AWS - https://docs.aws.amazon.com/AmazonElastiCache/latest/mem-ug/accessing-elasticache.html#access-from-outside-aws
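If you only need ad-hoc access from a local machine for debugging, one common workaround (a sketch, not something the linked page prescribes; host names are placeholders) is to forward the Redis port through an EC2 instance that already sits in the VPC and point redis-py at the local end of the tunnel:

# On the local machine, open the tunnel first (placeholder host names):
#   ssh -N -L 6379:xxx.amazonaws.com:6379 ec2-user@my-bastion-public-dns
import redis

# Connect to the local end of the tunnel instead of the cluster endpoint
client = redis.Redis.from_url('redis://localhost:6379')
print(client.ping())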
To add more context if you may be unaware,
VPC - Virtual Private Cloud; basically an environment for all your cloud resources. Every region you choose to work with has a default VPC. These default VPCs have specific IP address blocks associated with them. When you create a resource within a VPC, one of those IP addresses gets associated with your resource.
Subnets - These are partitions of your VPC. By default, the subnets within a VPC are spread across the different Availability Zones (AZs) of that AWS region. E.g. N. Virginia has 6 AZs, meaning there are 6 distinct locations where your resource can be placed. In the default VPCs, each subnet represents one of those locations, so when you select a subnet in a default VPC, you're basically selecting your AZ.
NOTE - In a custom-made VPC, you can have multiple subnets in the same AZ. That's entirely up to how you design it.
If you're new to all this, you might want to consider going through AWS docs - https://docs.aws.amazon.com/vpc/latest/userguide/how-it-works.html
They can be very comprehensive. Get some popcorn. :)
Cheers.

AWS Lambda Snowflake Python Connector hangs attempting to connect

I have a small AWS Lambda function that looks like this:
It grabs the credentials for connecting to Snowflake from SSM Parameter Store and then calls snowflake.connector.connect. It's obviously meant to go grab data from my Snowflake data warehouse. However, the code hangs and never finishes the snowflake.connector.connect call.
I believe my subnets and networking are set up properly:
Just to test and develop, I set my security group to allow all inbound and outbound traffic on all ports.
I have my Lambda running in a private subnet, and a route table that directs 0.0.0.0/0 to the NAT Gateway instance. In my code, I print(requests.get('http://216.58.192.142')) just to prove that I do indeed have internet connectivity.
I have many large dependencies that don't fit in the 200MB deploy package for Lambdas, so I have my dependencies mounted in an EFS file system at /mnt/efs path, and I add /mnt/efs/python to my PYTHONPATH in the code before I start to import those dependencies.
import json
import sys

# /mnt/efs/python is added to the import path so the large dependencies
# mounted on EFS can be imported (as described above)
sys.path.insert(0, "/mnt/efs/python")

import boto3
import requests
import snowflake.connector

print("Getting snowflake creds")
session = boto3.session.Session()
ssm = session.client("ssm")
obj = ssm.get_parameter(Name="snowflake", WithDecryption=True)
sf_creds = json.loads(obj.get("Parameter").get("Value"))

def get_data(event, context):
    print(requests.get('http://216.58.192.142'))
    print("Executing")
    print("got parameter, connecting")
    con = snowflake.connector.connect(
        user=sf_creds["USER"],
        password=sf_creds["PASSWORD"],
        account=sf_creds["ACCOUNT"],
        ocsp_response_cache_filename="/tmp/ocsp_response_cache"
    )
    print("connected")
When I run this same exact code locally on my MacBook, I am able to connect fairly quickly, within a second or two. My snowflake-python-connector version is 2.3.2.
However, no matter how long I wait, the connect method hangs when it executes in the AWS Lambda function. I'm really not sure what is going on.
I've verified that
the AWS Lambda function is connected to the internet (it receives a [200 OK] from the requests.get call).
security groups are as permissive as possible (allow all traffic on all ports both inbound and outbound)
I have not touched the NACL
Really, I'm at a loss as to why this is happening, especially given that the code works fine on my local machine - could someone try to point me in the right direction?
The Lambda can respond with 200, but there can still be an exception in the logs. Check the CloudWatch logs of this Lambda.
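As a debugging aid (a sketch, not part of the answer above): snowflake-connector-python accepts a login_timeout argument, so bounding the login attempt turns the silent hang into an explicit exception in the CloudWatch logs.

import snowflake.connector

# Bound the login attempt (value chosen arbitrarily for illustration) so a
# blocked network path raises an error instead of hanging until the Lambda times out.
con = snowflake.connector.connect(
    user=sf_creds["USER"],
    password=sf_creds["PASSWORD"],
    account=sf_creds["ACCOUNT"],
    login_timeout=30,
)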

AWS: Lambda submits a Batch job via Python boto3 client but times out before receiving a response

I have a Lambda function that has a Python handler that submits a job to AWS Batch via boto3 client:
import hashlib
import logging
import os
import time

import boto3

logger = logging.getLogger()
logger.setLevel(logging.INFO)

client = boto3.client('batch', 'us-east-1')

def handle_load(event, context):
    hasher = hashlib.sha1()
    hasher.update(str(time.time()).encode())
    job_name = f"job-{hasher.hexdigest()[:10]}"
    job_queue = os.environ.get("job_queue")
    job_definition = os.environ.get("job_definition")
    logger.info(f"Submitting job named '{job_name}' to queue '{job_queue}' "
                f"with definition '{job_definition}'")
    response = client.submit_job(
        jobName=job_name,
        jobQueue=job_queue,
        jobDefinition=job_definition,
    )
    logger.info(f"Submission successful, job ID: {response['jobId']}")
I can see this Lambda function submit the Batch job in the CloudWatch logs, but it always times out before the response comes back. I never see these jobs show up in the queue, so I'm not sure where things go after they are submitted. The Lambda always times out before the response arrives, and I have little else to go on.
I have successfully added a job to the queue via AWS CLI, using the same queue and definition ARNs that are used in the Lambda's Python code. This job can be seen in the queue under the runnable tab (presumably the job will be started at some point in the near future).
The job submission with the AWS CLI comes back instantly, so there must be something amiss in the Lambda configuration that is preventing the job submission. Perhaps I'm not using the correct role for the Lambda that submits the job, or some other permission is amiss and causing the timeout? The Lambda's role allows the batch:SubmitJob action on all resources.
If an AWS Lambda function is not connected to a VPC, then by default it is connected to the Internet. This means it can call AWS API functions, which reside on the Internet.
If your Lambda function is configured to use a VPC, it will not have Internet access by default. This is good for connecting to other resources in a VPC, but if you wish to communicate with an AWS service, you'll need either:
A NAT Gateway in a public subnet, with the Lambda function connected to a private subnet that has a Route Table rule that points to the NAT Gateway, or
A VPC endpoint that connects to the desired service. Unfortunately, AWS Batch does not have a VPC Endpoint.
So, if your Lambda function does not need to connect to other resources in the VPC, you can disconnect it and it should work. Otherwise, use a NAT Gateway.
Based on the comments: a Lambda in a VPC does not have access to the internet. You need to set up an internet gateway in a public subnet and a NAT gateway in the private subnet that holds your Lambda to be able to reach the AWS Batch endpoints. Alternatively, you can use a VPC interface endpoint for AWS Batch. From the docs:
Connect your function to private subnets to access private resources. If your function needs internet access, use NAT. Connecting a function to a public subnet does not give it internet access or a public IP address.
You also need to add permissions to your Lambda's execution role so it can create a network interface in the VPC:
ec2:CreateNetworkInterface
ec2:DescribeNetworkInterfaces
ec2:DeleteNetworkInterface
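On top of the networking fix, a small debugging sketch (not part of the answers above): giving the Batch client short botocore timeouts makes a missing network path fail fast with a connection error in the logs, instead of the Lambda silently hitting its own timeout.

import boto3
from botocore.config import Config

# Fail fast if the Batch endpoint is unreachable from the VPC, instead of
# retrying until the Lambda itself times out (values are illustrative).
client = boto3.client(
    'batch',
    'us-east-1',
    config=Config(connect_timeout=5, read_timeout=5, retries={'max_attempts': 1}),
)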

connection times out when trying to connect to mongodb atlas with python

I'm trying to connect to my MongoDB Atlas cluster, but I keep getting timed out as soon as I try to do something with my db.
The db I use was created in the mongo shell, and I checked that it and the collection exist in MongoDB Compass.
ERROR
pymongo.errors.ServerSelectionTimeoutError: projekt-shard-00-01-rk7ft.mongodb.net:27017: timed out,projekt-shard-00-00-rk7ft.mongodb.net:27017: timed out,projekt-shard-00-02-rk7ft.mongodb.net:27017: timed out
CODE
client = MongoClient("""mongodb://user:password@projekt-shard-00-00-rk7ft.mongodb.net:27017,projekt-shard-00-01-rk7ft.mongodb.net:27017,projekt-shard-00-02-rk7ft.mongodb.net:27017/projekt?ssl=true&replicaSet=projekt-shard-0&authSource=admin""")
client.projekt.category.insert_one({"type": "pants"}).inserted_id
So the problem is with your IP address:
Go to the Network Access panel in MongoDB Atlas
In the IP Access List section, you will find all your IP addresses
Click on the edit tab for the IP address you are currently using
There, change the setting to ALLOW ACCESS FROM ANYWHERE
That's it, it will work!
I was having this issue for hours. It's odd that it seems to be a connection issue, but it's not throwing a bad auth or anything, just this timeout. The client object seems to be actually created (I could print its properties). I kept playing around and this somehow worked:
In the MongoDB GUI, navigate to Database Access
Add a test user with the same read/write permissions to everything as the initial user created upon setup
Change the connection string in Python to the new user's username + password
Run the code
For me it finally connected and inserted successfully. After this, the original user's connection string now worked, so I deleted the test user.
I can't identify the root cause of this issue, but it seems like the Database Users table just needed some kind of action performed on it to refresh and begin accepting user connections.
For anybody looking for a solution: if you are trying to access an Atlas instance from out in the wild, check the "Network Access" tab, as I think you have to whitelist either all IP addresses or specific ones.
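Whichever of the fixes above applies, a quick way to confirm connectivity (a sketch; the URI is a placeholder, and the mongodb+srv form needs the dnspython package) is to ping the cluster with a short server selection timeout so a blocked connection fails in seconds rather than after the default 30:

from pymongo import MongoClient
from pymongo.errors import ServerSelectionTimeoutError

# Placeholder URI - substitute your own Atlas connection string
client = MongoClient("mongodb+srv://user:password@cluster0.example.mongodb.net/projekt",
                     serverSelectionTimeoutMS=5000)
try:
    client.admin.command("ping")
    print("connected")
except ServerSelectionTimeoutError as exc:
    print("cannot reach cluster:", exc)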

authGSSServerInit extremely slow

I am implementing a single sign-on mechanism for a Flask server running on Ubuntu 16.04 that authenticates users against an Active Directory server in the Windows domain.
When I run the example app from https://github.com/mkomitee/flask-kerberos/tree/master/example on the Flask server, I can access the Flask server from a client computer that's logged in; the server correctly negotiates access and returns the name of the logged-in user. However, this is very slow, taking about two minutes.
Following the steps of what happens in flask-kerberos, I found that the process stalls at the authGSSServerInit step. I can reproduce the behaviour using the following minimal program:
import kerberos
rc, state = kerberos.authGSSServerInit("HTTP@flaskserver.mydomain.local")
The initialisation finishes successfully, but it takes about two minutes again.
I have successfully registered the service principal (HTTP/flaskserver.mydomain.local) on the AD server and exported the keytab to the Flask server. I can get a ticket granting ticket on the Flask server using kinit -k HTTP/flaskserver.mydomain.local. I can also verify passwords in Python using the kerberos library:
import kerberos
kerberos.checkPassword('username', 'password', 'HTTP/flaskserver.mydomain.local', 'MYDOMAIN.LOCAL')
This runs correctly and almost instantly.
What could be the cause for the delay in running kerberos.authGSSServerInit? How do I debug this?
The delay was caused by a failing reverse DNS lookup for the hostname. host flaskserver correctly returned the IP, but host <ip-of-flaskserver> returned a Host <ip-of-flaskserver>.in-addr.arpa not found: 2(SERVFAIL).
As described at https://web.mit.edu/kerberos/krb5-1.13/doc/admin/princ_dns.html, disabling the reverse DNS lookup in the krb5.conf solved the problem:
[libdefaults]
rdns = false
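If you want to check from Python whether reverse DNS is the culprit before touching krb5.conf, a small sketch using only the standard library (the IP is a placeholder):

import socket

# A slow or failing reverse lookup here points at the same problem the
# Kerberos library runs into (replace with the Flask server's IP).
print(socket.gethostbyaddr("10.0.0.5"))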
