This question was asked before but never answered well:
I need beautifulsoup4 to scrape through a website's HTML and get information. I want to use that information in my Alexa skill.
How do I import/use bs4 in my Alexa developer console?
I've already read how to make a deployment package (https://docs.aws.amazon.com/lambda/latest/dg/lambda-python-how-to-create-deployment-package.html), but I don't understand how to download/zip bs4.
I am new to Python, AWS and Alexa developer console, so I am sorry if that question is very easy to answer.
I tried to create a zipped folder named lambda and upload it under "Upload code", but running my skill with import bs4 just raises an error.
I am trying to scrape a website but am stuck with "errorMessage": "Unable to import module 'lambda_function': No module named 'requests'",
How do I import a module in AWS Lambda when it throws the "Unable to import module" error?
Disclaimer: I am not running on an EC2 instance.
I need to put x.text in an S3 bucket.
The code is below:
import requests
x = requests.get('https://w3schools.com/python/demopage.htm')
print(x.text)
You have 2 choices for packaging extra python dependencies:
Run a pip install and zip the contents along with your Lambda function. Upload this Zip to your Python function.
Create a Lambda layer that you can then use with your Lambda function.
Regarding requests: it is no longer included in the base Lambda setup.
AWS has a blog post that explains how to include these files in your codebase.
It also lists the ARNs of public Lambda layers containing the requests dependency, although those layers cap the SDK at a slightly older version.
To use external packages in AWS Lambda, as stated in the official documentation, you should package your dependencies along with your code and upload it all together.
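For the S3 part of the question, here is a minimal sketch, not a definitive implementation. It assumes boto3 is available (the Lambda Python runtime ships it) and fetches the page with the standard-library urllib.request, so nothing extra has to be packaged. The helper names fetch_text and store_text, the bucket name and the key are hypothetical placeholders.

```python
import urllib.request


def fetch_text(url, timeout=10):
    """Download a page and return its body decoded as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def store_text(s3_client, bucket, key, text):
    """Write text to s3://bucket/key using the client's put_object call."""
    return s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=text.encode("utf-8"),
        ContentType="text/html; charset=utf-8",
    )


def lambda_handler(event, context):
    import boto3  # imported here so the helpers above stay usable without it

    text = fetch_text("https://w3schools.com/python/demopage.htm")
    # "my-example-bucket" and the key are placeholders, not real resources.
    store_text(boto3.client("s3"), "my-example-bucket", "demopage.htm", text)
    return {"statusCode": 200, "length": len(text)}
```

The function's execution role would also need s3:PutObject permission on the target bucket.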
I need beautifulsoup4 to scrape through a website's HTML and get information. I want to use that information in my Alexa skill.
How do I import/use bs4 in my Alexa developer console?
I've already read how to make a deployment package (https://docs.aws.amazon.com/lambda/latest/dg/lambda-python-how-to-create-deployment-package.html), but I don't understand how to download/zip bs4.
I am new to Python, AWS and Alexa developer console, so I am sorry if that question is very easy to answer.
Kind regards,
Dany
I think you'll find all you need here at this documentation url: https://docs.aws.amazon.com/lambda/latest/dg/lambda-python-how-to-create-deployment-package.html
I've got a Python script for an AWS Lambda function that does HTTP POST requests to another endpoint. Since Python's urllib2.request, https://docs.python.org/2/library/urllib2.html, can only handle data in the standard application/x-www-form-urlencoded format and I want to post JSON data, I used the Requests library, https://pypi.org/project/requests/2.7.0/.
The Requests library wasn't available in the Python runtime environment at AWS Lambda, so it had to be imported via from botocore.vendored import requests. So far, so good.
Today, I get a deprecation warning on that:
DeprecationWarning: You are using the post() function from 'botocore.vendored.requests'.
This is not a public API in botocore and will be removed in the future.
Additionally, this version of requests is out of date. We recommend you install the
requests package, 'import requests' directly, and use the requests.post() function instead.
This was mentioned in this blog post from AWS too: https://aws.amazon.com/blogs/developer/removing-the-vendored-version-of-requests-from-botocore/.
Unfortunately, changing from botocore.vendored import requests into import requests results in the following error:
No module named 'requests'
Why is requests not available for the Python runtime at AWS Lambda? And how can I use / import it?
I succeeded in sending HTTP POST requests using the urllib3 library, which is available at AWS Lambda without any additional installation steps.
import json
import urllib3

# url and some_data_structure are defined elsewhere in the function
http = urllib3.PoolManager()
response = http.request('POST',
                        url,
                        body=json.dumps(some_data_structure),
                        headers={'Content-Type': 'application/json'},
                        retries=False)
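As a related sketch: in Python 3 the standard library's urllib.request can also post JSON if you encode the body yourself and set the Content-Type header, so no third-party package is strictly required. The helper names build_json_post and post_json are made up for illustration.

```python
import json
import urllib.request


def build_json_post(url, payload):
    """Return a urllib Request that POSTs payload as a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(url, payload, timeout=10):
    """Send the request and return (status code, response body as text)."""
    request = build_json_post(url, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

Note that urlopen raises urllib.error.HTTPError for 4xx/5xx responses, so a real handler would probably want to catch that.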
Check out the instructions here: https://docs.aws.amazon.com/lambda/latest/dg/python-package.html#python-package-dependencies
All you need to do is download the requests module locally, then include it in your Lambda function deployment package (ZIP archive).
Example (if all your Lambda function consisted of was a single Python module + requests module):
$ pip install --target ./package requests
$ cd package
$ zip -r9 ${OLDPWD}/function.zip .
$ cd $OLDPWD
$ zip -g function.zip lambda_function.py
$ aws lambda update-function-code --function-name my-function --zip-file fileb://function.zip
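If the zip command is not available (for example on Windows), the two zip steps above can be approximated with Python's zipfile module. This is only a sketch: build_deployment_zip is a hypothetical helper, and it assumes the output zip is written outside the package directory.

```python
import os
import zipfile


def build_deployment_zip(package_dir, handler_file, zip_path):
    """Zip the contents of package_dir plus handler_file at the archive root."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(package_dir):
            for name in files:
                full_path = os.path.join(root, name)
                # Store paths relative to package_dir so packages sit at the root.
                archive.write(full_path, os.path.relpath(full_path, package_dir))
        # The handler module goes at the root as well, next to the packages.
        archive.write(handler_file, os.path.basename(handler_file))
    return zip_path
```

After pip install --target ./package requests, calling build_deployment_zip("package", "lambda_function.py", "function.zip") should produce an archive with the same layout as the shell version.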
Answer 2020-06-18
I found a nice and easy way to use requests inside AWS Lambda functions!
Open this link and find the region that your function is using:
https://github.com/keithrozario/Klayers/tree/master/deployments/python3.8/arns
Open the .csv related to your region and search for the requests row.
This is the ARN related to requests library:
arn:aws:lambda:us-east-1:770693421928:layer:Klayers-python38-requests:6
So now in your lambda function, add a layer using the ARN found.
Note: make sure your Lambda function's runtime is python3.8.
If you are using the Serverless Framework, specify the plugin in serverless.yml:
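If you would rather attach the layer from a script than from the console, something like the following should work. It is a sketch that assumes a boto3 Lambda client; attach_layer is a hypothetical helper. The layer list is read first because update_function_configuration replaces the whole list rather than appending to it.

```python
def attach_layer(lambda_client, function_name, layer_arn):
    """Add layer_arn to a function's layers, keeping the ones already attached."""
    config = lambda_client.get_function_configuration(FunctionName=function_name)
    current = [layer["Arn"] for layer in config.get("Layers", [])]
    if layer_arn in current:
        return current
    updated = current + [layer_arn]
    # update_function_configuration replaces the whole list, so pass every ARN.
    lambda_client.update_function_configuration(
        FunctionName=function_name, Layers=updated
    )
    return updated


# Example call (needs boto3 and AWS credentials; "my-function" is a placeholder):
# import boto3
# attach_layer(
#     boto3.client("lambda"),
#     "my-function",
#     "arn:aws:lambda:us-east-1:770693421928:layer:Klayers-python38-requests:6",
# )
```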
plugins:
- serverless-python-requirements
At the directory root, create a requirements.txt file:
requests==2.22.0
The plugin will then install requests and any other packages listed in that file when you deploy.
requests is NOT part of core python.
See https://docs.aws.amazon.com/en_pv/lambda/latest/dg/lambda-python-how-to-create-deployment-package.html about packaging a Lambda having external dependencies (in your case the requests library)
Amazon's Serverless Application Model (SAM) provides a build command that can bundle arbitrary Python dependencies into the deployment artifact.
To be able to use the requests package in your code, add the dependency to your requirements.txt file:
requests==2.22.0
then run sam build to get an artifact that vendors requests. By default, your artifacts will be saved to the .aws-sam/build directory but another destination directory can be specified with the --build-dir option.
Consult SAM's documentation for more info.
Here's my redneck solution that works with any library, using an AWS Lambda Layer:
This has the advantage that you don't have to trust any 3rd party layers, because you can easily make it yourself.
Go to your local python's Lib/site-packages (python install location or your venv)
Copy whichever libraries you need (e.g. "requests") into a folder named "python"
Zip this folder
Create an AWS Lambda Layer, and upload the zip into it
Add this layer in your lambda function
Import your libraries as usual, and keep coding as if nothing happened
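The copy-and-zip steps can be scripted too. This is a rough sketch with a hypothetical helper, build_layer_zip, which writes the chosen packages under a top-level python/ folder inside the zip, the layout that Lambda layers expect for Python.

```python
import os
import zipfile


def build_layer_zip(site_packages, libraries, zip_path):
    """Zip the named packages from site_packages under a top-level python/ folder."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for library in libraries:
            library_dir = os.path.join(site_packages, library)
            for root, _dirs, files in os.walk(library_dir):
                for name in files:
                    full_path = os.path.join(root, name)
                    relative = os.path.relpath(full_path, site_packages)
                    # Everything lives under python/ so Lambda adds it to sys.path.
                    archive.write(full_path, os.path.join("python", relative))
    return zip_path
```

Keep in mind that requests pulls in other packages (such as urllib3, certifi and idna), so those folders need to be listed as well, and packages with compiled extensions have to match the Lambda platform.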
pip install requests
and then
import requests
to use.
I want to run BeautifulSoup and Selenium WebDriver in AWS Lambda; my runtime environment is Python 3.6. Is this possible, and if so, how? My intention is to scrape data from a webpage using Beautiful Soup 4 and Selenium (since the data is generated dynamically by JavaScript).
Yes, it's possible. You need to package a headless Chrome binary and chromedriver along with all the Python packages you need. You'll also need to set several options in Selenium's Chrome web driver to make it work.
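To give an idea of what "several options" can look like, here is a hedged sketch. The flag list is an assumption based on Chrome command-line switches commonly used for headless runs in containers, not the exact set from the tutorial, and configure_chrome_options and the binary path are hypothetical.

```python
# Assumed set of Chrome switches for a sandboxed, display-less environment.
HEADLESS_FLAGS = [
    "--headless",
    "--no-sandbox",
    "--disable-gpu",
    "--single-process",
    "--disable-dev-shm-usage",
    "--window-size=1280,1696",
]


def configure_chrome_options(options, chrome_binary):
    """Apply the headless flags and the bundled binary path to a ChromeOptions object."""
    options.binary_location = chrome_binary
    for flag in HEADLESS_FLAGS:
        options.add_argument(flag)
    return options


# Example use with Selenium (the path below is a placeholder for wherever the
# headless Chrome binary ends up in your deployment package or layer):
# from selenium import webdriver
# options = configure_chrome_options(webdriver.ChromeOptions(), "/opt/headless-chromium")
```

How the driver itself is created differs between Selenium versions, so that part is left out here.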
I wrote a step-by-step tutorial after spending several frustrating weeks trying to deploy it.
You will need to create a deployment package and upload it to Lambda if you are going to use dependencies outside the standard library.
I have a write-up about using BS4 and Lambda together. I did not use Selenium within Lambda, but I do have extensive Selenium experience. You will not be able to execute commands within a browser using Lambda. You are going to need a remote server stood up, running Selenium Server. Download Selenium and the webdrivers on the machine you wish to do the web scraping from and start the .jar file; it will open a port on that machine, which Selenium will communicate with.
Considering that you will need a machine, probably running Windows, to fire up a browser and scrape these pages, you probably don't need Lambda in the end.