Loading data into a PeopleSoft database with ExcelToCI - Python

I have been tasked with building an ETL job that takes financial CSV data from an asset management program, transforms it, and delivers it to our PeopleSoft Financials system.
I am using Talend and also writing some Python scripts. This program will run once a week. The PeopleSoft team insists on using this "Excel to CI" .xlsm file, which is an Excel workbook with macros and VBA code. This thing is a nightmare to work with and isn't supported by Talend or fully compatible with the Python openpyxl package.
Is there a better way to push (CSV) data into a PeopleSoft database while still executing the business logic it is supposed to enforce?

PeopleTools Integration Broker allows you to create web services that can invoke a CI. You could then invoke the service from Python.
https://docs.oracle.com/cd/E41633_01/pt853pbh1/eng/pt/tibr/concept_UnderstandingCreatingComponentInterface-BasedServices-076354.html
Another alternative is to develop an Application Engine program that reads the CSV file and invokes the CI from PeopleCode.
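For the Integration Broker route, the Python side reduces to a plain HTTP call once the CI is exposed as a service operation. Below is a minimal sketch using the requests library; the endpoint URL, payload field names, and basic-auth credentials are placeholders, since the real values come from the service definition your PeopleSoft team generates.

    import csv
    import requests

    # Hypothetical REST listening connector URL for a CI-based service operation;
    # the real URL, operation name, and JSON shape depend on your service definition.
    IB_URL = "https://psoft.example.com/PSIGW/RESTListeningConnector/PSFT_EP/ASSET_CI.v1/"

    def push_rows(csv_path):
        with open(csv_path, newline="") as f:
            for row in csv.DictReader(f):
                payload = {
                    "ASSET_ID": row["asset_id"],   # field names are assumptions
                    "ACQ_DATE": row["acquired"],
                    "COST": row["cost"],
                }
                resp = requests.post(
                    IB_URL,
                    json=payload,
                    auth=("INTEGRATION_USER", "********"),  # example basic auth
                    timeout=30,
                )
                resp.raise_for_status()  # surface IB/CI errors per row

    if __name__ == "__main__":
        push_rows("assets.csv")

For a weekly batch it is usually worth logging per-row failures and continuing, so one bad record does not stop the whole load.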

Related

How to schedule a Python/R script to run on a Power BI dataset

I already have an R script that dumps the data from a given Power BI dataset to a .csv file on my local desktop, but I want to schedule this script to run every day.
How can this be achieved?
Can it be achieved without the gateway tool used in blog posts like https://community.powerbi.com/t5/Community-Blog/Schedule-Automated-Data-Exports-from-Power-BI-using-a-simple-R/ba-p/1606313
You can use some sort of task scheduler, depending on the OS you're using, or you can download the On-Premises Data Gateway in personal mode and run the script through that on the Power BI Service.
This gateway is available only when using your account, which works well if your R script is embedded inside Power Query: with each refresh of the dataset, the script runs as well and your .csv is updated.
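If you go the task-scheduler route, the export can stay a plain script that the scheduler invokes once a day. A minimal sketch, assuming the existing R script lives at a placeholder path and Rscript is on the PATH; the commented schtasks/cron lines show one way to register the daily run.

    import subprocess

    # Wrapper the OS scheduler calls once a day; the script path is a placeholder.
    R_SCRIPT = r"C:\exports\dump_dataset.R"

    def run_export():
        # Rscript must be on PATH; check=True raises if the R script fails.
        subprocess.run(["Rscript", R_SCRIPT], check=True)

    if __name__ == "__main__":
        run_export()

    # Windows Task Scheduler, daily at 07:00:
    #   schtasks /Create /SC DAILY /ST 07:00 /TN "PowerBIExport" /TR "python C:\exports\run_export.py"
    # Linux/macOS cron (crontab -e):
    #   0 7 * * * /usr/bin/python3 /home/me/exports/run_export.py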

Can Google Apps Script call a Python script?

I have a Python script that uses a Python API to fetch data from a data provider and, after manipulating the data, writes some of it to Google Sheets (via the Google Sheets Python API). So my workflow to update the data is to open the Python file in VS Code, run it from there, then switch to the spreadsheet to see the updated data.
Is there a way to call this Python script from Google Sheets using Google Apps Script? If so, that would be more efficient; I could link the GAS script to a macro button on the spreadsheet.
Apps Script runs in the cloud on Google's servers rather than on your computer, and has no access to local resources such as Python scripts on your system.
To call a resource in the cloud, use the URL Fetch Service.
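In practice that means the Python logic has to be reachable over HTTP. One minimal sketch of the Python side, assuming you wrap the script in a small Flask app and host it somewhere Apps Script can reach (Cloud Run, a small VM, etc.); Apps Script would then call it with UrlFetchApp.fetch(). The route name and payload are placeholders.

    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/refresh", methods=["POST"])
    def refresh():
        # Placeholder for the existing data-provider fetch + Sheets update logic.
        params = request.get_json(silent=True) or {}
        # ... fetch from the data provider, manipulate, write to Google Sheets ...
        return jsonify({"status": "ok", "received": params})

    if __name__ == "__main__":
        # Local testing only; put a proper WSGI server in front of it in the cloud.
        app.run(host="0.0.0.0", port=8080)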
Currently, I am using the following hack.
The Apps Script behind a command button in the Google Sheet writes parameters to a sheet called Parameters. A Python function on my local machine checks that Parameters sheet in the workbook every 5 seconds. If there are no parameters, it exits; if there are parameters, it executes the main code.
When the code is deployed on a service account, the polling portion remains inactive and the Apps Script calls the Python code on the service account directly.
There are several reasons why I need to call a Python function on my LOCAL machine from the Google Sheet. One is that debugging is easier on the local machine and cumbersome on the service account. Another is that certain files sit on local machines and we do not want to move them to the workspace, yet the sheet needs data from those files.
This is a hack, and I am looking for a better approach than this "Python code keeps polling" method.
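For reference, here is a minimal sketch of the polling side of this hack, assuming the gspread library with a service-account credential; the workbook and sheet names, credential path, and main_code function are placeholders.

    import time
    import gspread

    def main_code(rows):
        # Placeholder for the real work driven by the parameters.
        print("running with", rows)

    def poll_parameters():
        gc = gspread.service_account(filename="creds.json")   # assumed credential file
        ws = gc.open("MyWorkbook").worksheet("Parameters")     # assumed workbook/sheet names
        while True:
            rows = [r for r in ws.get_all_values() if any(r)]
            if rows:
                main_code(rows)
                ws.clear()   # consume the parameters so the next check finds nothing
            time.sleep(5)    # the 5-second interval described above

    if __name__ == "__main__":
        poll_parameters()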

Can you automate an ETL-to-dashboard pipeline using Python/SQL/R and Power BI?

Here's what I want to automate:
get data from an API,
do some cleaning and wrangling in Python,
save as Excel,
load to Power BI, and
produce a very simple dashboard.
I've been searching for ages on Microsoft forums and here to find out if there's a way to run Power BI via a script, like a PowerShell script, for instance. I'm comfortable with writing a script for the first 3 steps, but I have not been able to find a solution for steps 4-5. It has to be loaded into Power BI because that's what clients are most familiar with. The ideal result would be a script that I can then package as an executable file and send to a non-technical person so they can run it at their leisure.
Although I'd prefer a solution in Python, if it's possible in SQL or R I'd also be very happy. I've tried all the in-Power BI options for using Python scripts, but I have found them limited and difficult to use. I've packaged all my ETL functions into a .py script, imported it into Power BI, and run the functions there, but it's still not really fully automated and wouldn't land well with non-technical folks. Thanks in advance! (I am working on a PC with Power BI Desktop.)

Suggestions for running a Python script on AWS

I currently have a Python project that reads data from an Excel file, transforms and formats it, performs intensive calculations on the formatted data, and generates an output. This output is written back to the same Excel file.
The script is run as a PyInstaller EXE, which packages all the required libraries along with the code itself so that users are not required to prepare an environment to run the script.
Both the script EXE and the Excel file sit on the user's machine.
I need some suggestions on how this entire workflow could be achieved using AWS, e.g. which AWS services would be required.
Any inputs would be appreciated.
One option would be to use S3 to store the input and output files. You could create a Lambda function (or functions) that does the computing work and writes the updated file back to S3.
You would need to include the Python dependencies in the deployment zip that you push to AWS Lambda, or create a Lambda layer that has the dependencies.
You could build triggers to run on things like S3 events (a file being added to S3 triggers the Lambda), on a schedule (an EventBridge rule invokes the Lambda according to a specific schedule), or on demand using an API (such as an API Gateway that users can invoke via a web browser or HTTP request). It just depends on your needs.
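A minimal sketch of that shape, assuming an S3-triggered Lambda with boto3 and pandas available in the deployment package or a layer; the bucket names and the calculation itself are placeholders.

    import io
    import boto3
    import pandas as pd

    s3 = boto3.client("s3")
    OUTPUT_BUCKET = "my-output-bucket"  # placeholder

    def handler(event, context):
        # Triggered by an S3 "object created" event on the input bucket.
        record = event["Records"][0]["s3"]
        bucket = record["bucket"]["name"]
        key = record["object"]["key"]

        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        df = pd.read_excel(io.BytesIO(body))          # needs openpyxl in the package/layer

        # ... intensive calculations on df go here ...

        out = io.BytesIO()
        df.to_excel(out, index=False)                 # needs openpyxl or xlsxwriter
        s3.put_object(Bucket=OUTPUT_BUCKET, Key=f"results/{key}", Body=out.getvalue())
        return {"status": "done", "input": key}

Keep Lambda's 15-minute timeout and memory limits in mind for the intensive-calculation step; if runs are longer than that, a container-based option such as Fargate or AWS Batch may be a better fit.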

Python AWS Glue log says "Considering file without prefix as a python extra file" for uploaded Python zip packages

In AWS Glue, I have a small pandas job that reads data from an XLSX file and writes it out as CSV. As per the Glue Python instructions, I zipped the required libraries and provided them as packages to the Glue job at execution time.
Question: What do the following logs convey?
Considering file without prefix as a python extra file s3://raw-data/sampath/scripts/s3fs/fsspec.zip
Considering file without prefix as a python extra file s3://raw-data/sampath/scripts/s3fs/jmespath.zip
Considering file without prefix as a python extra file s3://raw-data/sampath/scripts/s3fs/s3fs.zip
....
Please elaborate with an example.
In Python shell jobs, you should add external libraries as an .egg file, not a .zip file; the .zip file is for Spark jobs.
I also wrote a small shell script to deploy a Python shell job without the manual steps of creating the egg file, uploading it to S3, and deploying via CloudFormation; the script does all of that automatically. You may find the code at https://github.com/fatangare/aws-python-shell-deploy. The sample job takes a CSV file and converts it into an Excel file using the pandas and xlsxwriter libraries.
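If you build the .egg by hand instead, the usual pattern is a tiny setup.py next to your library code plus a bdist_egg build; a minimal sketch, where the package name, version, and paths are placeholders.

    # setup.py: placed next to the package directory (assumed here to be "mylibs/")
    from setuptools import setup, find_packages

    setup(
        name="mylibs",          # placeholder package name
        version="0.1",
        packages=find_packages(),
    )

    # Build and upload (run from a shell; the exact egg filename depends on your Python version):
    #   python setup.py bdist_egg
    #   aws s3 cp dist/mylibs-0.1-py3.9.egg s3://raw-data/sampath/scripts/
    # Then point the Glue job's Python library path setting at that S3 egg.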
