Is there any way to execute Azure Machine Learning models from inside a Python or R script?
My requirement is to run a large but changing number of ML algorithms on multiple given data sets. I could hard-wire all data sets to all algorithms, but that would be complex and inflexible.
So the idea was to wire the data sets into the Azure-embedded Python script, write the program logic in Python, and start the ML algorithms from within the Python script (e.g. in a for loop).
Thank you for any hints!
In my experience, executing Azure ML models from inside a Python or R script is possible in principle. You can follow the official tutorials for Python or R to learn how to use the Execute Python Script or Execute R Script module to implement what you want in Azure ML Studio.
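The loop the question describes would live inside the module's required entry point. A minimal sketch, assuming a hypothetical `apply_model` and model names standing in for your own algorithms (the `azureml_main` signature is what the Execute Python Script module expects):

```python
# Sketch of an Execute Python Script module body. Azure ML Studio wires up
# to two pandas DataFrames into azureml_main and expects a sequence of
# DataFrames back; apply_model and the model names are hypothetical.
import pandas as pd

def apply_model(name, frame):
    # hypothetical stand-in: score one data set with one algorithm
    return frame.assign(model=name, score=1.0)

def azureml_main(dataframe1=None, dataframe2=None):
    results = []
    for name in ["model_a", "model_b"]:  # loop over your ML algorithms
        results.append(apply_model(name, dataframe1))
    return (pd.concat(results, ignore_index=True),)
```

The for loop here is the "start the ML algorithms from within" part of the question; each algorithm sees the same input DataFrame wired in from the experiment canvas.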
Here's what I want to automate:
1. get data from an API,
2. do some cleaning and wrangling in Python,
3. save as Excel,
4. load to PowerBI, and
5. produce a very simple dashboard.
I've been searching for ages on Microsoft forums and here to find out if there's a way to run PowerBI via a script, like a PowerShell script, for instance. I'm comfortable with writing a script for the first 3 steps, but I have not been able to find a solution for steps 4-5. It has to be loaded to PowerBI because that's what clients are most familiar with. The ideal result would be a script that I can then package as an executable file and send to a non-technical person so they can run it at their leisure.
Although I'd prefer a solution in Python, if it's possible in SQL or R, I'd also be very happy. I've tried all the in-PowerBI options for using Python scripts, but I have found them limited and difficult to use. I've packaged all my ETL functions into a py script, then imported it to PowerBI and ran functions therein, but it's still not really fully automated, and wouldn't land well with non-technical folks. Thanks in advance! (I am working on PC, PowerBI Desktop)
I've got the following setup:
A VB.NET Web-Service is running, and it needs to regularly call a Python script with a machine learning model to predict some stuff. To do this, my Web-Service generates a file with input for Python and runs the Python script as a subprocess. The script makes predictions and returns them, via standard output, back to the Web-Service.
The problem is that the script needs a few seconds to import all the machine learning libraries and load the saved model from disk. This takes much longer than the actual prediction. During this time the Web-Service is blocked by the running subprocess. I have to reduce this time drastically.
What I need is a solution to either:
1. Improve the library and model loading time.
2. Let the Python script communicate with the VB.NET Web-Service and keep Python running all the time, with the imports and ML model already loaded.
Not sure I understood the question but here are some things I can think of.
If this is a network thing, you should have Python compress the code before sending it over the web.
Or if you're able to, use multithreading while reading the file from the web.
Maybe you should upload some more code so we can help you better.
I've found what I needed.
I've used web.py to convert the Python script into a Web-Service, and now the VB.NET and Python Web-Services can communicate. Python is running all the time, so there is no delay for loading libraries and data each time a calculation has to be done.
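The same pattern can be sketched with nothing but the standard library, for anyone who wants to see the shape of it; `load_model` and `predict` are placeholders for the real ML code, and web.py wraps this same mechanic for you:

```python
# Minimal sketch of a long-running prediction service. The expensive work
# (imports, model deserialization) happens once at startup, so each request
# only pays for the prediction itself.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def load_model():
    # placeholder for the slow imports + loading the saved model from disk
    return {"bias": 1.0}

MODEL = load_model()  # paid once, when the service starts

def predict(model, features):
    # placeholder prediction: sum of the features plus a bias term
    return sum(features) + model["bias"]

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers["Content-Length"])
        features = json.loads(self.rfile.read(length))
        result = predict(MODEL, features)
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(json.dumps({"prediction": result}).encode())

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), PredictHandler).serve_forever()
```

The VB.NET side then issues an HTTP POST instead of spawning a subprocess, and is never blocked by the import/load step again.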
I created a Machine Learning classifier with Python, using word2vec and I want to create an API to use it in production.
What's the easiest way to do that?
I heard of AWS Lambda and Microsoft Azure Machine Learning Studio but I am not sure it would work with word2vec.
For example, with AWS Lambda, would I need to reload the libraries each time? (That takes a while.)
And can I install any Python package with Microsoft Azure Machine Learning Studio and choose any kind of machine (word2vec needs a lot of RAM)?
Thanks
According to the official document Execute Python machine learning scripts in Azure Machine Learning Studio, there are limitations on customizing the Python installation (item No. 4, quoted below):
Inability to customize Python installation. Currently, the only way to add custom Python modules is via the zip file mechanism described earlier. While this is feasible for small modules, it is cumbersome for large modules (especially those with native DLLs) or a large number of modules.
Unfortunately, a Python package like word2vec, which includes C modules, cannot be custom-installed on Azure ML Studio.
The only workaround is to create a VM, install word2vec for Python on it, and expose it as a Python webservice that the Execute Python Script module of Azure ML Studio calls over the network. As the answer to Azure ML Execute Python Module: Network I/O Disabled? says, Azure ML Studio now supports network I/O for the Execute Python Script module.
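For reference, the calling side inside the Execute Python Script module might look roughly like this; the endpoint URL and the JSON contract of the word2vec service are assumptions for illustration, and note that Studio's runtime is Python 2, where `urllib.request` is `urllib2`:

```python
# Sketch of an Execute Python Script module that delegates word2vec to a
# webservice hosted on your own VM. SERVICE_URL and the JSON request/response
# shape are hypothetical -- adapt them to whatever service you actually build.
import json
import urllib.request  # urllib2 in Studio's Python 2 runtime

import pandas as pd

SERVICE_URL = "http://<your-vm>/word2vec"  # hypothetical endpoint

def build_payload(texts):
    # JSON body the hypothetical word2vec service expects
    return json.dumps({"texts": texts}).encode("utf-8")

def azureml_main(dataframe1=None, dataframe2=None):
    texts = dataframe1["text"].tolist()
    req = urllib.request.Request(
        SERVICE_URL,
        data=build_payload(texts),
        headers={"Content-Type": "application/json"},
    )
    vectors = json.loads(urllib.request.urlopen(req).read())
    return (pd.DataFrame({"text": texts, "vector": vectors}),)
```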
I have a few problems which may apply well to the Map-Reduce model. I'd like to experiment with implementing them, but at this stage I don't want to go to the trouble of installing a heavyweight system like Hadoop or Disco.
Is there a lightweight Python framework for map-reduce which uses the regular filesystem for input, temporary files, and output?
A Coursera course on big data suggests these lightweight Python map-reduce frameworks:
http://code.google.com/p/octopy/
https://github.com/michaelfairley/mincemeatpy
To get you started very quickly, try this example:
https://github.com/michaelfairley/mincemeatpy/zipball/v0.1.2
(hint: for [server address] in this example use localhost)
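For orientation, the computational model these frameworks implement fits in a dozen lines of plain Python, with `itertools.groupby` playing the role of the shuffle phase. This toy runs in a single process and is only meant to show the shape of the computation the frameworks distribute:

```python
# Toy single-process illustration of the map-reduce model.
from itertools import groupby
from operator import itemgetter

def map_reduce(data, mapper, reducer):
    # map: emit (key, value) pairs for every input record
    pairs = [kv for record in data for kv in mapper(record)]
    # shuffle: group the pairs by key (sort first; groupby needs sorted input)
    pairs.sort(key=itemgetter(0))
    # reduce: fold each key's values into a single result
    return {key: reducer(key, [v for _, v in group])
            for key, group in groupby(pairs, key=itemgetter(0))}

# classic word count
lines = ["the quick brown fox", "the lazy dog"]
counts = map_reduce(
    lines,
    mapper=lambda line: [(word, 1) for word in line.split()],
    reducer=lambda word, ones: sum(ones),
)
# counts["the"] == 2
```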
http://pythonhosted.org/mrjob/ is great for getting started quickly on your local machine; basically, all you need is a simple:
pip install mrjob
http://jsmapreduce.com/ -- in-browser mapreduce; in Python or Javascript; nothing to install
Check out Apache Spark. It is written in Scala and runs on the JVM, but it also has a Python API. You can try it locally on your machine and then, when you need to, easily distribute your computation over a cluster.
MockMR - https://github.com/sjtrny/mockmr
It's meant for educational use. Does not currently operate in parallel but accepts standard Python objects as IO.
So this was asked ages ago, but I worked on a full implementation of mapreduce over the weekend: remap.
https://github.com/gtoonstra/remap
Pretty easy to install with minimal dependencies; if all goes well, you should be able to complete a test run in 5 minutes.
The entire processing pipeline works, but submitting and monitoring jobs is still being worked on.
Let me explain what I'm trying to achieve. In the past, while working on the Java platform, I used to write Java code (say, to push or pull data from a MySQL database), then create a WAR file, which essentially bundles all the class files, supporting files, etc., and put it under a servlet container like Tomcat; this becomes a web service that can be invoked from any platform.
In my current scenario, the majority of the work is done in Java, but the Natural Language Processing (NLP)/Machine Learning (ML) part is done in Python using the NLTK, SciPy, NumPy etc. libraries. I'm trying to use the services of this Python engine from the existing Java code. Integrating the Python code into Java through something like Jython is not straightforward (as far as I know, Jython does not support calling any Python module that has C-based extensions), so I thought the next option would be to make it a web service, similar to what I had done with Java web services in the past.
Now comes the actual crux of the question: how do I run the ML engine as a web service and call it from any platform, which in my current scenario happens to be Java? I looked around the web for options and found things like CherryPy, Werkzeug etc., but was not able to find the right approach, or any sample code showing how to invoke an NLTK-Python script and serve the result over the web, eventually replicating the functionality a Java web service provides.
In the Python-NLTK code, the ML engine trains on a large corpus (this takes 3-4 minutes), and we don't want the Python code to go through this step every time a method is invoked. If I make it a web service, the training will happen only once, when the service starts, and then the service is ready to be invoked using the already trained engine.
Now coming back to the problem: I'm pretty new to web services in Python and would appreciate any pointers on how to achieve this. Also, any pointers on calling NLTK-based Python scripts from Java without the web services approach, in a way that can be deployed on production servers with good performance, would be helpful and appreciated. Thanks in advance.
Just for a note, I'm currently running all my code on a Linux machine with Python 2.6, JDK 1.6 installed on it.
One method is to build an XML-RPC server, but you may wish to fork a new process for each connection to prevent the server from seizing up. I have written a detailed tutorial on how to go about this: https://speakerdeck.com/timclicks/case-studies-of-python-in-parallel?slide=68.
An NLTK-based system tends to be slow to respond per request, but good throughput can be achieved given enough RAM.
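A minimal sketch of that XML-RPC idea with the standard library (Python 3 shown; the same API exists in Python 2 as `SimpleXMLRPCServer`). Here `train_classifier` and `classify` are hypothetical stand-ins for the NLTK training and prediction code:

```python
# Sketch of an XML-RPC service wrapping a trained classifier. The expensive
# training runs once at startup; every RPC call reuses the trained engine.
from xmlrpc.server import SimpleXMLRPCServer

def train_classifier():
    # placeholder for the 3-4 minute NLTK corpus-training step
    return {"positive": {"good", "great"}, "negative": {"bad"}}

CLASSIFIER = train_classifier()  # paid once, when the service starts

def classify(text):
    words = set(text.lower().split())
    score = (len(words & CLASSIFIER["positive"])
             - len(words & CLASSIFIER["negative"]))
    return "positive" if score >= 0 else "negative"

if __name__ == "__main__":
    server = SimpleXMLRPCServer(("localhost", 8000), allow_none=True)
    server.register_function(classify, "classify")
    server.serve_forever()
```

From Java, the running service can then be called with any XML-RPC client library (e.g. Apache XML-RPC) without re-training on every call.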