How to create a CSV file inside an Azure repo using Python

I'm using Azure DevOps for the first time and am just trying to create a CSV file from a Python script.
The Python script, main.py:
# importing pandas as pd
import pandas as pd

# lists of names, degrees, and scores
nme = ["aparna", "pankaj", "sudhir", "Geeku"]
deg = ["MBA", "BCA", "M.Tech", "MBA"]
scr = [90, 40, 80, 98]

# dictionary of lists (named data to avoid shadowing the built-in dict)
data = {'name': nme, 'degree': deg, 'score': scr}

df = pd.DataFrame(data)
print(df)

# saving the dataframe to the current working directory
df.to_csv('file.csv', header=False, index=False)
print("CSV file created")
Output: the DataFrame prints as expected, and running the script locally creates file.csv in the same folder.
What I did: I went to Repos, created a new repo called myTest, and uploaded the Python file there.
Then I went to Pipelines, selected "Azure Repos Git" -> "myTest" -> "Python package", edited the YAML file, and clicked "Save and run".
azure-pipelines.yml file content:
trigger:
- main

pool:
  vmImage: ubuntu-latest
strategy:
  matrix:
    Python37:
      python.version: '3.7'

steps:
- task: UsePythonVersion@0
  inputs:
    versionSpec: '$(python.version)'
  displayName: 'Use Python $(python.version)'

- script: |
    python -m pip install --upgrade pip
    pip install pandas
  displayName: 'Install dependencies'

- script: |
    pip install pytest pytest-azurepipelines
    pytest
    python main.py
  displayName: 'pytest'
The pipeline ran successfully, but I didn't see any CSV file created in the repo.
Can somebody help me solve this? Is it possible to create a CSV file inside an Azure repo?
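For context: the pipeline only checks out a copy of the repo onto the build agent, so file.csv is written to the agent's working directory and discarded with it; nothing is pushed back to the repo automatically. Below is a minimal, hedged sketch of one common workaround, committing the generated file back with git. The branch, email, and step names are illustrative, and the pipeline's build service identity needs Contribute permission on the repo.

steps:
- checkout: self
  persistCredentials: true  # keep the pipeline's OAuth token available for git push

- script: |
    python main.py
    git config user.email "pipeline@example.com"   # illustrative identity
    git config user.name "Azure Pipeline"
    git add file.csv
    git commit -m "Add generated CSV [skip ci]"    # [skip ci] avoids retriggering the pipeline
    git push origin HEAD:main
  displayName: 'Run script and push CSV back to the repo'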

Related

Python - Build and release an Artifact with AzureDevOps

I'm trying to create an Azure DevOps Pipeline in order to build and release a Python package under the Azure DevOps Artifacts section.
I've started by creating a feed called "utils"; then I created my package and structured it like this:
.
├── src
│   ├── __init__.py
│   └── class.py
├── test
│   ├── __init__.py
│   └── test_class.py
├── .pypirc
├── azure-pipelines.yml
├── pyproject.toml
├── requirements.txt
└── setup.cfg
And this is the content of files:
.pypirc
[distutils]
Index-servers =
    prelios-utils

[utils]
Repository = https://pkgs.dev.azure.com/OMIT/_packaging/utils/pypi/upload/
pyproject.toml
[build-system]
requires = [
    "setuptools>=42",
    "wheel"
]
build-backend = "setuptools.build_meta"
setup.cfg
[metadata]
name = my_utils
version = 0.1
author = Walter Tranchina
author_email = walter.tranchina@OMIT.com
description = A package containing [...]
long_description = file: README.md
long_description_content_type = text/markdown
url = OMIT.com
project_urls =
classifiers =
    Programming Language :: Python :: 3
    License :: OSI Approved :: MIT License
    Operating System :: OS Independent

[options]
package_dir =
    = src
packages = find:
python_requires = >=3.7
install_requires =

[options.packages.find]
where = src
azure-pipelines.yml
trigger:
- main

pool:
  vmImage: 'ubuntu-latest'
strategy:
  matrix:
    Python38:
      python.version: '3.8'

steps:
- task: UsePythonVersion@0
  inputs:
    versionSpec: '$(python.version)'
  displayName: 'Use Python $(python.version)'

- script: |
    python -m pip install --upgrade pip
  displayName: 'Install dependencies'

- script: |
    pip install twine wheel
  displayName: 'Install buildtools'

- script: |
    pip install pytest pytest-azurepipelines
    pytest
  displayName: 'pytest'

- script: |
    python -m build
  displayName: 'Artifact creation'

- script: |
    twine upload -r utils --config-file ./.pypirc dist/*
  displayName: 'Artifact Upload'
The problem I'm facing is that the pipeline gets stuck in the Artifact Upload stage for hours without completing.
Can someone please help me understand what's wrong?
Thanks!
[UPDATE]
I've updated my yml file as suggested in the answers:
- task: TwineAuthenticate@1
  displayName: 'Twine Authenticate'
  inputs:
    artifactFeed: 'utils'
And now I have this error:
2022-05-19T09:20:50.6726960Z ##[section]Starting: Artifact Upload
2022-05-19T09:20:50.6735745Z ==============================================================================
2022-05-19T09:20:50.6736081Z Task : Command line
2022-05-19T09:20:50.6736434Z Description : Run a command line script using Bash on Linux and macOS and cmd.exe on Windows
2022-05-19T09:20:50.6736788Z Version : 2.201.1
2022-05-19T09:20:50.6737008Z Author : Microsoft Corporation
2022-05-19T09:20:50.6737375Z Help : https://learn.microsoft.com/azure/devops/pipelines/tasks/utility/command-line
2022-05-19T09:20:50.6737859Z ==============================================================================
2022-05-19T09:20:50.8090380Z Generating script.
2022-05-19T09:20:50.8100662Z Script contents:
2022-05-19T09:20:50.8102321Z twine upload -r utils --config-file ./.pypirc dist/*
2022-05-19T09:20:50.8102824Z ========================== Starting Command Output ===========================
2022-05-19T09:20:50.8129029Z [command]/usr/bin/bash --noprofile --norc /home/vsts/work/_temp/706c12ef-da25-44b0-b1fc-5ab83e7e0bf9.sh
2022-05-19T09:20:51.1178721Z Uploading distributions to
2022-05-19T09:20:51.1180490Z https://pkgs.dev.azure.com/OMIT/_packaging/utils/pypi/upload/
2022-05-19T09:20:27.0860014Z Traceback (most recent call last):
2022-05-19T09:20:27.0861203Z File "/opt/hostedtoolcache/Python/3.8.12/x64/bin/twine", line 8, in <module>
2022-05-19T09:20:27.0862081Z sys.exit(main())
2022-05-19T09:20:27.0863965Z File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/twine/__main__.py", line 33, in main
2022-05-19T09:20:27.0865080Z error = cli.dispatch(sys.argv[1:])
2022-05-19T09:20:27.0866638Z File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/twine/cli.py", line 124, in dispatch
2022-05-19T09:20:27.0867670Z return main(args.args)
2022-05-19T09:20:27.0869183Z File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/twine/commands/upload.py", line 198, in main
2022-05-19T09:20:27.0870362Z return upload(upload_settings, parsed_args.dists)
2022-05-19T09:20:27.0871990Z File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/twine/commands/upload.py", line 127, in upload
2022-05-19T09:20:27.0873239Z repository = upload_settings.create_repository()
2022-05-19T09:20:27.0875392Z File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/twine/settings.py", line 329, in create_repository
2022-05-19T09:20:27.0876447Z self.username,
2022-05-19T09:20:27.0877911Z File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/twine/settings.py", line 131, in username
2022-05-19T09:20:27.0879043Z return cast(Optional[str], self.auth.username)
2022-05-19T09:20:27.0880583Z File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/twine/auth.py", line 34, in username
2022-05-19T09:20:27.0881640Z return utils.get_userpass_value(
2022-05-19T09:20:27.0883208Z File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/twine/utils.py", line 248, in get_userpass_value
2022-05-19T09:20:27.0884302Z value = prompt_strategy()
2022-05-19T09:20:27.0886234Z File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/twine/auth.py", line 85, in username_from_keyring_or_prompt
2022-05-19T09:20:27.0887440Z return self.prompt("username", input)
2022-05-19T09:20:27.0888964Z File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/twine/auth.py", line 96, in prompt
2022-05-19T09:20:27.0890017Z return how(f"Enter your {what}: ")
2022-05-19T09:20:27.0890786Z EOFError: EOF when reading a line
2022-05-19T09:20:27.1372189Z ##[error]Bash exited with code 'null'.
2022-05-19T09:20:27.1745024Z ##[error]The operation was canceled.
2022-05-19T09:20:27.1749049Z ##[section]Finishing: Artifact Upload
Seems like twine is waiting for something... :/
I guess this is because you are missing a Python Twine Upload Authenticate task.
- task: TwineAuthenticate@1
  inputs:
    artifactFeed: 'MyTestFeed'
If you are using a project level feed, the value of artifactFeed should be {project name}/{feed name}.
If you are using an organization level feed, the value of artifactFeed should be {feed name}.
A simpler way is to click the gray "Settings" button under the task and select your feed from the drop-down list.
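For instance (MyProject and MyTestFeed are placeholder names):

# project-scoped feed: {project name}/{feed name}
- task: TwineAuthenticate@1
  inputs:
    artifactFeed: 'MyProject/MyTestFeed'

# organization-scoped feed: {feed name}
- task: TwineAuthenticate@1
  inputs:
    artifactFeed: 'MyTestFeed'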
I've found the solution after many attempts...
First I created a Service Connection for Python package uploads in Azure DevOps, containing a previously generated API key.
Then I edited the YAML file:
- task: TwineAuthenticate@1
  displayName: 'Twine Authenticate'
  inputs:
    pythonUploadServiceConnection: 'PythonUpload'

- script: |
    python -m twine upload --skip-existing --verbose -r utils --config-file $(PYPIRC_PATH) dist/*
  displayName: 'Artifact Upload'
The key was using the variable $(PYPIRC_PATH), which is automatically set by the previous task. The local .pypirc file is ignored by the process, so it can be deleted!
Hope it helps!

Why does Streamlit not find my python file?

I have Streamlit working in terminal i.e. the following runs in terminal:
$ streamlit hello
I am trying to create an app with the online tutorial but encounter an error - see below
https://docs.streamlit.io/en/latest/tutorial/create_a_data_explorer_app.html#let-s-put-it-all-together
I have saved the following as uber_pickups.py:
import streamlit as st
import pandas as pd
import numpy as np
st.title('Uber pickups in NYC')
(base) lf-mac-0250:~ alastairhayes$ streamlit hello
You can now view your Streamlit app in your browser.
Local URL: http://localhost:8501
Network URL: http://172.20.10.2:8501
^C Stopping...
(base) lf-mac-0250:~ alastairhayes$ streamlit run uber_pickups.py
Usage: streamlit run [OPTIONS] TARGET [ARGS]...
Error: Invalid value: File does not exist: uber_pickups.py
Where am I going wrong?
I have python 3.7.6
Many thanks!
If you're using Windows OS, you may try the steps below:
First, you need to download Anaconda:
https://www.anaconda.com/
Open the Anaconda PowerShell Prompt and type the following:
conda list
pip uninstall streamlit
Then, create a new environment, and test out the following:
pip install streamlit
streamlit hello
Finally, you can test out streamlit on your code, replace your_app with the actual name of your file. Make sure you are in the same directory as your source code file.
streamlit run your_app.py
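For example, a hedged sketch assuming the script was saved in ~/projects/streamlit-demo (the path is illustrative; the "File does not exist" error in the question simply means uber_pickups.py is not in the current working directory):

cd ~/projects/streamlit-demo        # the directory that actually contains the file
streamlit run uber_pickups.py
# or point streamlit at the file with a full path:
streamlit run ~/projects/streamlit-demo/uber_pickups.py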

Is it possible to write files to your machine (local drive) using GitLab CI?

I'm trying to run a Python script, using GitLab CI, which will create a Pandas dataframe and write this as a .csv file on my machine.
As a test script I've created the following do_stuff_2.py file:
import datetime
import pandas as pd
import numpy as np
current_time = datetime.datetime.now()
print(f'Hello.\nCurrent date/time is:{current_time}')
df = pd.DataFrame(np.random.randint(0,100,size=(10, 4)), columns=list('ABCD'))
print(df)
df.to_csv('C:\\<USER_PATH>\\Desktop\\df_out.csv', index = False)
This should:
print the current time
generate a dataframe with 10 rows and 4 columns which is randomly populated with values between 0 and 100
print said dataframe
save the dataframe to the local drive
When the CI pipeline is executed I get no errors and the first 3 steps run successfully.
I have a .gitlab-ci.yml file with the following:
stages:
  - build

build:
  stage: build
  image: python:3.6
  script:
    - echo "Running python..."
    - pip install -r requirements.txt
    - python do_stuff_2.py
and a requirements.txt file:
numpy
pandas
It looks like I've got everything set up correctly, as the time is being displayed and the print function returns the dataframe. However, no file is written to the specified location. When I run the script locally everything works as expected and the dataframe is saved on my desktop as df_out.csv.
I'm using Python 3.6, on a Windows 10 machine.
Is there an alternate way to do this from within a CI pipeline in GitLab?
You need to install the gitlab-runner on your local machine; a sketch of registering one is shown below.
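A hedged sketch, assuming GitLab.com and a shell executor (the registration token is a placeholder taken from your project's Settings > CI/CD > Runners page):

# install per https://docs.gitlab.com/runner/install/ and then:
gitlab-runner register \
  --non-interactive \
  --url https://gitlab.com/ \
  --registration-token <YOUR_PROJECT_TOKEN> \
  --executor shell \
  --description "local-runner"

With a shell executor on your own machine, files written by the job land directly on your local drive.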
If you can't, you can use the artifacts: keyword to upload the result of your script to the GitLab server and download it afterwards from the UI. Your .gitlab-ci.yml will look like:
stages:
  - build

build:
  stage: build
  image: python:3.6
  script:
    - echo "Running python..."
    - pip install -r requirements.txt
    - python do_stuff_2.py
  artifacts:
    paths:
      - df_out.csv
and your code must change to:
df.to_csv('df_out.csv', index=False)

How to create a Python executable using Docker

I have a working Python script, EIA.py, which extracts information from the EIA.gov website and downloads the necessary information into an Excel file in my laptop's C:/Python folder. However, when I build this file into an image and run it using the docker run command, it gives me the following error:
FileNotFoundError: [Errno 2] No such file or directory: 'C:/Python/Sid.xls'
I am not adding any file; Python should instead create an Excel file with the contents extracted from the website.
Following is my Dockerfile:
FROM python
VOLUME ["C:/Sid"]
WORKDIR /app
COPY . /app
RUN pip install EIA-python
RUN pip install requests
RUN pip install pandas
RUN pip install xlwt
RUN python /app/EIA.py
Following is my Python code:
import eia
import pandas as pd
api_key = "mykey"
api = eia.API(api_key)
series_storage = api.data_by_series(series='NG.NW2_EPG0_SWO_R48_BCF.W')
df1 = pd.DataFrame(series_storage)
df1.reset_index(inplace=True)
df1.columns = ['Date', 'Value']
df1['Date'] = pd.to_datetime(df1['Date'].str[:-3], format='%Y %m%d')
df1.to_excel("C:/Python/Sid.xls")
Docker containers do not have persistent storage by default. To save a file locally from a container, you can either bind-mount a host folder or create a Docker volume. Volumes are the preferred mechanism for persisting data, as they are completely managed by Docker itself. Check out here for more info.
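For instance, a hedged sketch of the bind-mount approach (the image name my-eia-image and the /app/output path are illustrative):

# in EIA.py, write to a container-internal path instead of C:/Python:
df1.to_excel("/app/output/Sid.xls")

# then map a host folder onto that path at run time:
docker run --rm -v C:\Python:/app/output my-eia-image

Note that the script would also need to run at container start (e.g. via CMD) rather than at build time (RUN python /app/EIA.py) for the mount to be in effect.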

AWS Lambda/Python3 can't import numpy

I have a python3.6 script that uses sqlalchemy, pandas and numpy. To get this working on AWS Lambda, I took the following steps.
Created a new, clean directory
Create a new virtualenv
Create a holding directory (mkdir dist)
Install packages pip install sqlalchemy numpy pandas
Navigate to packages cd env/lib/python3.6/site-packages
Zip packages to holding directory zip -r path/dist/Transfer.zip .
Navigate to root
Zip python file zip -g dist/Transfer.zip my_python.py
Upload to S3
Point Lambda > Configuration > Code entry type > "Upload a file from S3" at the path to my file
Set Handler to my_python.lambda_handler
Save and test
I always get the error:
{
  "errorMessage": "Unable to import module 'my_python'"
}
with the logs showing:
Unable to import module 'heap_consolidation_lambda': Missing required dependencies ['numpy']
Why can it not see numpy? FWIW, numpy is the third import, so apparently it has no issues with sqlalchemy and pandas.
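For what it's worth, a common cause of this error is that numpy's compiled extensions were installed on a non-Linux machine, so the zip contains binaries that won't load on Lambda's Amazon Linux runtime. A hedged sketch of one way to pull Linux-compatible wheels instead (the flags are standard pip options; the dist target directory matches the steps above):

pip install \
    --platform manylinux2014_x86_64 \
    --only-binary=:all: \
    --target dist \
    sqlalchemy numpy pandas
cd dist && zip -r ../Transfer.zip . && cd ..
zip -g Transfer.zip my_python.py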
