Automate multiple dependent Python programs

I have multiple Python scripts. Each script depends on another, i.e. the first script uses the output of the second script, the second script uses the output of the third, and so on. Is there any way I can link up the scripts so that I can automate the whole process? I came across the Talend Data Integration tool, but I can't figure out how to use it. Any reference or help would be highly useful.

You did not state what operating system/platform you are using, but the problem seems like a good fit for make.
You specify dependencies between files in your Makefile, along with rules on how to generate one file from the others.
Example:
# file-1 depends on input-file, and is generated via f1-from-input.py
file-1: input-file
	f1-from-input.py --input input-file --output file-1
# file-2 depends on file-1, and is generated via f2-from-f1.py
file-2: file-1
	f2-from-f1.py < file-1 > file-2
# And so on
For documentation, check out the GNU Make Manual, or one of the million tutorials on the internet.
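If make is not available on your platform, the core rule it applies (rebuild a file only when it is missing or older than the files it depends on) can be approximated in Python. This is only a minimal sketch of that idea, not a replacement for make; the function names needs_rebuild, build and python_command are made up for this illustration.

```python
import os
import subprocess
import sys


def needs_rebuild(target, dependencies):
    """Return True if target is missing or older than any dependency."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_mtime for dep in dependencies)


def python_command(script, *args):
    """Build a command that runs script with the current Python interpreter."""
    return [sys.executable, script, *args]


def build(target, dependencies, command):
    """Run command (a list of strings) only when target is out of date.

    Returns True if the command was run, False if target was up to date.
    """
    if not needs_rebuild(target, dependencies):
        return False
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
    return True
```

With the file names from the Makefile above, the first rule would correspond to something like build("file-1", ["input-file"], python_command("f1-from-input.py", "--input", "input-file", "--output", "file-1")). Calling the rules in dependency order gives the same effect as the chain in the Makefile.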

I found this link, which shows how to call a Python script from Talend and use its output (I'm not sure whether it waits for the script to finish).
The main idea is to run the Python script from Talend Studio by using the tSystem component.

Related

Python command to execute non-Python (MQL5) files?

I have a collection of expert advisor (EA) scripts written in the MQL5 programming language for the stock/forex trading platform, MetaTrader5. The extension of these files is mq5. I am looking for a way to programmatically run these MQL5 files from my Python script on a regular basis. The EAs do some price transformations, eventually saving a set of csv files that will later be read by my Python script to apply Machine Learning models on them.
My first natural choice was the Python API for MetaTrader5. However, according to its documentation, it "is designed for convenient and fast obtaining of exchange data via interprocessor communication directly from the MetaTrader 5 terminal" and as such, it doesn't provide the functionality I need to be able to run MQL scripts using Python.
I have found some posts here on SO (such as #1, #2) about executing non-python files using Python but those posts seemed to always come with the precondition that they already had Python code written in them, only the extension differed - this is different from my goal.
I then came across Python's subprocess module and started experimenting with that.
print(os.path.isfile(os.path.join("path/to/dir","RSIcalc.mq5")))
with open(os.path.join("path/to/dir","RSIcalc.mq5")) as f:
    subprocess.run([r"C:\Program Files\MetaTrader 5\terminal64.exe", f], capture_output=True)
The print statement returns True, so the mq5 file exists in the specified location. Then the code opens the MetaTrader5 terminal but nothing else happens, the EA doesn't get executed, process finishes immediately after that.
Am I even on the right track for what I'm trying to achieve here? If yes, what might be the solution for me to run these MQL5 scripts programmatically from Python?
Edit:
I use Windows 10 64-bit.
subprocess is indeed the right module for what you want to achieve. But let's look at what you're doing here:
with open(os.path.join("path/to/dir","RSIcalc.mq5")) as f:
You're creating a file object called f, which is used to read or write the contents of a file. If you do print(f) you'll see that it's a Python object that, converted to a string, looks like <_io.TextIOWrapper name='RSIcalc.mq5' mode='r' encoding='UTF-8'>. It is extremely unlikely that such a string is what you want to pass as a command-line parameter to your terminal executable, which is what happens when you include it in your call to subprocess.run().
What you likely want to do is this:
full_path = os.path.abspath(os.path.join("path/to/dir","RSIcalc.mq5"))
result = subprocess.run([r"C:\Program Files\MetaTrader 5\terminal64.exe", full_path], capture_output=True)
Now, this assumes your terminal64 can execute arbitrary scripts passed as parameters. This may or may not be true - you might need extra parameters like "-f" before passing the file path, or you might have to feed script contents through the stdin pipe (unlikely, on Windows, but who knows). That's for you to figure out, but my code above should probably be your starting point.
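To make the difference concrete, here is a small sketch that builds the argument list from path strings only and leaves room for extra flags. The helper names build_command and run_script are invented for this example, and whether terminal64.exe accepts a script path (or needs additional switches) is still an open assumption that you have to check against the MetaTrader documentation.

```python
import os
import subprocess


def build_command(executable, script_dir, script_name, extra_args=()):
    """Return the argument list for subprocess.run().

    Every element is a plain string: the executable, any extra flags,
    and the absolute path of the script (not an open file object).
    """
    script_path = os.path.abspath(os.path.join(script_dir, script_name))
    return [executable, *extra_args, script_path]


def run_script(executable, script_dir, script_name, extra_args=()):
    """Run the executable on the script and capture stdout/stderr as text."""
    command = build_command(executable, script_dir, script_name, extra_args)
    return subprocess.run(command, capture_output=True, text=True)
```

For example, run_script(r"C:\Program Files\MetaTrader 5\terminal64.exe", "path/to/dir", "RSIcalc.mq5") returns a CompletedProcess whose returncode, stdout and stderr you can inspect to see how the terminal reacted.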
I don't think you need to pass a file object to subprocess.run(). In my experience, a program will run a file when the path to the file is provided as a command-line argument. Try this:
subprocess.run([r"C:\Program Files\MetaTrader 5\terminal64.exe", os.path.join("path/to/dir", "RSIcalc.mq5")], capture_output=True)
This is roughly the same as typing "C:\Program Files\MetaTrader 5\terminal64.exe" path\to\dir\RSIcalc.mq5 in your terminal (the quotes are needed there because the executable path contains spaces).

How can one download the outputs of historical Azure ML experiment Runs via the python API

I'm trying to write a script which can download the outputs from an Azure ML experiment Run after the fact.
Essentially, I want to know how I can get a Run by its runId property (or some other identifier).
I am aware that I have access to the Run object when I create it for the purposes of training. What I want is a way to recreate this Run object later in a separate script, possibly from a completely different environment.
What I've found so far is a way to get a list of ScriptRun objects from an experiment via the get_runs() function. But I don't see a way to use one of these ScriptRun objects to create a Run object representing the original Run and allowing me to download the outputs.
Any help appreciated.
I agree that this could probably be better documented, but fortunately, it's a simple implementation.
This is how you get a run object for an already submitted run with azureml-sdk>=1.16.0 (for the older approach, see my answer here):
from azureml.core import Workspace
ws = Workspace.from_config()
run = ws.get_run('YOUR_RUN_ID')
Once you have the run object, you can call methods like:
.get_file_names() to see what files are available (the logs in azureml-logs/ and logs/azureml/ will also be listed)
.download_file() to download an individual file
.download_files() to download all files that match a given prefix (or all the files)
See the Run object docs for more details.
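Putting the methods above together, a download helper could look roughly like the sketch below. It is written against any object that offers get_file_names() and download_file(), so it can be tried without a workspace. The helper name download_outputs, the "outputs/" prefix and the "downloaded" target directory are assumptions made for this example, and the output_file_path keyword should be checked against the Run docs for your SDK version.

```python
import os


def download_outputs(run, prefix="outputs/", target_dir="downloaded"):
    """Download every file of `run` whose name starts with `prefix`.

    `run` is expected to behave like the run object above: it must provide
    get_file_names() and download_file(name, output_file_path=...).
    Returns the list of local paths that were requested.
    """
    os.makedirs(target_dir, exist_ok=True)
    downloaded = []
    for name in run.get_file_names():
        if not name.startswith(prefix):
            continue  # skip logs and anything outside the chosen prefix
        local_path = os.path.join(target_dir, os.path.basename(name))
        run.download_file(name, output_file_path=local_path)
        downloaded.append(local_path)
    return downloaded
```

With the run from ws.get_run('YOUR_RUN_ID'), calling download_outputs(run) would then fetch only the files under the chosen prefix. If you simply want everything under a prefix, .download_files() already does this in one call.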

Python script orchestrator and simultaneous script execution

Context
I'm working on a Data Science project in which I run a data analysis task on a dataset (let's call it the original dataset) and create a processed dataset (let's call this one result). Users can query the result dataset by creating different plots in a Dash application. The system also makes some predictions on an attribute of this dataset using ML models. Everything will run on an external VM of my company.
What is my current "code"
Currently I have these python scripts that create the result dataset (except the Dashboard one):
concat.py (simply concatenates some files)
merger.py (merges different files in the project directory)
processer1.py (processes the first file needed for the analysis)
processer2.py (processes a second file needed for the analysis)
Dashboard.py (the Dash application)
ML.py (runs a classic ML task, creates a report and an updated result dataset with some predictions)
What I should obtain
I'm interested in creating this kind of solution, which will run on the VM:
Dashboard.py runs 24/7 and depends on the existence of the "result" dataset; without it, the dashboard is useless.
Every time there's a change in the project directory (new files are added every month), the system triggers the execution of concat.py, merger.py, processer1.py and processer2.py. Maybe a Python script and the watchdog package can help to create this trigger mechanism? I'm not sure.
Once the execution above is done, ML.py is executed on the "result" dataset, and the updated dataset is uploaded to the dashboard.
Dashboard.py is restarted with the new CSV file.
I would like some help understanding which technologies are necessary to get what I want. Something like an example or maybe a source, so I can fully understand and apply what is right. I know that I may have to use a Python script to orchestrate the whole system, maybe the same script that observes the directory, or maybe not.
The most important thing is that the dashboard is always running. This is what creates the need to run things simultaneously. It only needs to be restarted once the "result" CSV dataset is completed and uploaded; I think keeping the service continuous is best for the users.
The users will feed the dashboard with new files in the observed directory. The automation needs "triggers" to execute the code, since the users are not skilled and will not be allowed to use the shell on the VM (I suppose). Alternatively, I could set up a recurring execution instead, for example every month.
The company won't grant me another VM or similar, even if it's needed, so I have to do everything with a single VM.
Premise
This is the first time that I have to put something "in production", and I have no experience at all. Could anyone help me find the best approach? Thanks in advance.
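One way to picture the trigger mechanism described in this question is a small polling loop that uses only the standard library: it compares snapshots of the watched directory and, when something changed, runs the scripts in order. This is only a sketch under several assumptions: the script order in PIPELINE is guessed from the description, the names snapshot, run_pipeline and watch are invented here, and the dashboard is assumed to run as a separate long-lived process that is restarted by other means. The watchdog package mentioned in the question could replace the polling part.

```python
import os
import subprocess
import sys
import time

# Assumed execution order, taken from the script list in the question.
PIPELINE = ["concat.py", "merger.py", "processer1.py", "processer2.py", "ML.py"]


def snapshot(directory):
    """Map each file name in directory to its last modification time."""
    return {
        entry.name: entry.stat().st_mtime
        for entry in os.scandir(directory)
        if entry.is_file()
    }


def run_pipeline(scripts, cwd=None):
    """Run the scripts one after another; stop at the first failure."""
    for script in scripts:
        subprocess.run([sys.executable, script], check=True, cwd=cwd)


def watch(directory, scripts, interval=60, max_checks=None):
    """Poll directory every `interval` seconds and run scripts on any change.

    max_checks limits the number of polls (None means run forever).
    """
    previous = snapshot(directory)
    checks = 0
    while max_checks is None or checks < max_checks:
        time.sleep(interval)
        current = snapshot(directory)
        if current != previous:
            run_pipeline(scripts)
            # Take a fresh snapshot in case the pipeline wrote into the directory.
            previous = snapshot(directory)
        checks += 1
```

Because the watcher and the dashboard are separate processes, the dashboard keeps serving users while the pipeline runs; only the final restart with the new CSV file interrupts it.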

Python interactive library(pymidas) session warning

I'm using Python 2.7 and a library called pymidas.
Within my Python script I call the library with the following command:
from pymidas import midas
midas.do('INDISK/FITS test.fits test.bdf')
All the rest of my code does exactly what I want, but whenever the script imports midas I first get a welcome message from (py)midas, which is fine with me, and then it asks me whether I want a parallel or a new session.
Sadly, this step needs human interaction to select parallel mode. Reading the midas documentation, I found that midas has an option (-P) which does exactly what I need: it forces midas to start directly in parallel mode without asking any questions.
Does anybody know how to achieve this in my Python script?
Thanks!
At the end of your script, add:
midas.do('.exit')
This ensures you don't get asked the next time you run the script.

Trying to automate the fpga build process in Xilinx using python scripts

I want to automate the entire process of creating ngs, bit and mcs files in Xilinx and have these files automatically associated with certain folders in the SVN repository. What I need to know is whether there is a log file, created behind the scenes by the Xilinx GUI, which records all the commands I run, e.g. open project, load file, synthesize, etc.
The other part that I have not been able to find is a log file that records the entire process of synthesis, map, place and route, and generating the programming file, and especially any errors that the tool encountered during these processes.
If any of you can point me to such files, if they exist, it would be great. I haven't gotten much out of my search, but maybe I didn't look hard enough.
Thanks!
Well, it is definitely a nice project idea, but a good amount of work. There's always a reason why an IDE was built. A simple search yields the "Command Line Tools User Guide" for various versions of Xilinx ISE, for example the one for 14.3, with 380 pages about:
Overview and list of features
Input and output files
Command line syntax and options
Report and message information
ISE is a GUI for various command line executables. Most of them are located in the subfolder 14.5/ISE_DS/ISE/bin/lin/ (in this case: Linux executables for version 14.5) of your ISE installation root. You can review your current parameters for each action by right-clicking the item in the process tree and selecting "Process properties".
On the Python side, consider using the subprocess module:
The subprocess module allows you to spawn new processes, connect to their input/output/error pipes, and obtain their return codes.
Is this the entry point you were looking for?
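As a rough illustration of the subprocess approach, the sketch below runs a list of command-line steps in order, appends each command and its output to one combined log file, and stops at the first step that returns a non-zero exit code. The helper names run_step and run_flow are made up for this example, and the actual executables and their options are deliberately not shown; take those from the Command Line Tools User Guide for your ISE version.

```python
import subprocess


def run_step(command, log_path):
    """Run one command-line step and append its output to the log.

    stderr is merged into stdout so errors appear in order in the log.
    Returns the exit code of the command.
    """
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    with open(log_path, "a", encoding="utf-8") as log:
        log.write("$ " + " ".join(command) + "\n")
        log.write(result.stdout)
        log.write(f"[exit code {result.returncode}]\n")
    return result.returncode


def run_flow(steps, log_path):
    """Run the steps in order; return the first non-zero exit code, else 0."""
    for command in steps:
        code = run_step(command, log_path)
        if code != 0:
            return code
    return 0
```

Each entry in steps is a list of strings, for example the executable followed by its options, exactly as you would type them on the command line.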
As phineas said, what you are trying to do is quite an undertaking.
I've been there and done that, and there are countless challenges along the way. For example, if you want to move generated files to specific folders, how do you classify these files in order to figure out which files are which? I've created a project called X-MimeTypes that attempts to classify the files, but you then need a tool to parse the EDA mime type database and use that to determine which files are which.
However, there is hope, so to answer the two main questions you've pointed out:
You want to be able to automatically move generated files to predetermined paths. From what you are saying, it seems like you want to do this to make the versioning process easier? There is already a tool that does this for you based on "design structures" that you create and that can be shared within a team. The tool is called Scineric Workspace, so check it out. It also has built-in Git and SVN support, which ignores files according to the design structure, and in most cases it filters out everything generated by vendor tools without you having to worry about it.
You are looking for a log file that shows all commands that were run. As phineas said, you can check out the Command Line Tools User Guides for ISE, but be aware that the commands have changed again in Vivado. The log file of each process also usually states the exact command, with its parameters, that was called. This should be close to the top of the report. If you are looking for one log file that contains everything, that does not exist. Again, Scineric Workspace supports invoking flows from major vendors (ISE, Vivado, Quartus), and it produces one log file for all processes together while still allowing each process to create its own log file. Errors, warnings, etc. are also marked properly in this big report. Scineric has a Tcl shell mode as well, so your Python tool can run it in the background and parse the complete log file it creates.
If you have more questions on the above, I will be happy to help.
Hope this helps,
Jaco
