Convert data values from Python to C - python

For this project I am working with libsvm.
I have a python file that is able to output a list of feature vectors and I have a C executable that takes 2 csv files, a list of feature vectors and the svm model, as arguments and outputs a prediction in the form of a csv file.
Now, I would like to change the C file such that it takes in the list output from the python file as its input arguments to make a prediction. This is because I am having to run the python code and C in real time. Therefore, latency will be an issue if I am having to write to a csv file with python and read the file in C.
I have tried searching for things like Cython, the subprocess module, and argparse. However, it seems like these are used to execute Python functions in C and C in Python. Can someone please help me understand how to transfer data from Python to C? Thank you.

It will depend on the amount of latency you can accept.
I am not familiar with libsvm, so I will assume you are able to read your feature vectors in both of my solutions.
First solution (easier but slower) would be to make a little Python-to-C bridge as follows:
#!/usr/bin/python
def print_vector(vec):
    first = True
    print("[")
    for i in vec:
        if first:
            first = False
        else:
            print(',')
        print(i)
    print("]")
and then to parse it from standard input in C with the <string.h> and <stdlib.h> functions, using atoi (or strtod, if your features are floating-point).
You will then execute your command as follow:
./my_python_program | ./my_c_program
The second solution (much faster but much harder) is to implement a pipe connection between your Python program and your C program, or even a TCP socket connection.
You can, for instance, have a look at the Linux documentation if you are under Linux.
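For the first solution, the plumbing on the Python side can also be done explicitly with the subprocess module instead of a shell pipe. Here is a minimal sketch; the executable name and the exact wire format are placeholders, not something from the question, and a trivial stand-in command is used so the example is self-contained:

```python
import subprocess

def send_vector(vec, command="./my_c_program"):
    """Serialize a feature vector and pipe it to an external program's stdin.

    `command` is a placeholder for the real C executable; the format here
    (one value per line, bracket-delimited) is only an illustrative choice.
    """
    payload = "[\n" + ",\n".join(str(v) for v in vec) + "\n]\n"
    result = subprocess.run(
        command, input=payload, capture_output=True, text=True
    )
    return result.stdout

# Stand-in "C program": cat simply echoes stdin back to stdout.
echoed = send_vector([1.5, 2.0, 3.25], command="cat")
print(echoed)
```

This avoids touching the filesystem entirely, which is the main source of the latency described in the question.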

Related

Taking Input from an external file in Python

If you have ever come across a situation where you had to test your program against a very large pile of input, you have probably wondered whether there is any way to shortcut it.
Certain methods come in very handy when you have to test your program's time and space complexity while processing a large input.
You can't always type in a large input manually, so here is a method by which you can provide input to your program using an external txt file.
Below is my answer :)
You just write a simple program and then run it from the command line on any platform, such as Windows or Linux.
python program.py < input.txt > output.txt
< and > are redirection operators which simply redirect stdin and stdout to input.txt and output.txt, respectively. This is the easiest way.
Alternatively, you can do the following
import sys
sys.stdin = open('input.txt', 'r')
sys.stdout = open('output.txt', 'w')
Alternatively you can do the following
input_file = open('input.txt', 'r')
output_file = open('output.txt', 'w')
n = input_file.read()
output_file.write(n)
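A slightly more robust variant of the same idea (my own suggestion, not part of the original answer) uses with blocks, so the files are closed automatically even if an error occurs:

```python
# Create a sample input file, then copy it to output.txt.
# The `with` blocks close both files automatically.
with open("input.txt", "w") as f:
    f.write("1 2 3\n")

def copy_file(src="input.txt", dst="output.txt"):
    with open(src, "r") as fin, open(dst, "w") as fout:
        data = fin.read()
        fout.write(data)
    return data

print(copy_file())
```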
I prefer method 1, as it is simple, needs no file handling in the program itself, and helps a lot in contests like Google Code Jam and the Facebook Hacker Cup. Hope it helps.

What are the modules supported by python to generate a c code?

Are there any particular modules in python which provide the generation of c file and write into this c file ? Currently I am parsing an xml file with python and I want to use the parsed information to create a desired data structure from the python script into a c file.
For example, I saw a reference at the link https://www.codeproject.com/Articles/571645/Really-simple-Cplusplus-code-generation-in-Python . Is there a similar module for C code generation?
From what you describe, you just have to generate a source code file that you will then use in your own project.
You don't need any special Python library to do that.
The linked article does nothing special: it has the C++ strings for each line hardcoded into the Python code, which linearly generates all the parts of the output. There is nothing particular to "creating a C++ file" about it - just put whatever you want in your final file into strings, and interpolate your data values using the string format method or the newer f-strings.
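A minimal sketch of that f-string approach (the struct name and field list here are made up for illustration; in your case they would come from the parsed XML):

```python
# Generate a C struct definition from parsed (name, c_type) pairs.
# This field list stands in for data extracted from your XML file.
fields = [("id", "int"), ("weight", "double"), ("label", "char *")]

def make_struct(name, fields):
    body = "\n".join(f"    {ctype} {fname};" for fname, ctype in fields)
    return f"typedef struct {{\n{body}\n}} {name};\n"

c_source = make_struct("Record", fields)
print(c_source)
```

The same pattern scales to whole files: build each declaration as a string and write them out in order.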
However, if you want a nicer and more maintainable solution, you could use a template library for Python, like "jinja2". Once you have a jinja2 template for your desired C file, all you need to do is call Jinja's render methods, passing in the custom data you want in the final file, and save the result to a file.
That is not specific to C, though: it is about creating a structured text file from data you have, and Jinja is a nice tool for that.
People are mentioning "Cython" around: Cython is a solution to convert Python-like code into C - that is, if you want a program you wrote in Python to become a C program. (Although it is seldom used like that - normally the C generation and build steps are transparent, and one just cares about the final executable program or module when using Cython.)
From my reading of your question, Cython is not what you want at this moment.

Can't pass data from octave to python through command line arguments

I am calculating HoG feature descriptors in Octave and then I am trying to cluster those data in Python using scikit-learn.
For testing my code in Python I am trying to pass a 4000x2 data to Python.
I am calling the Python script from Octave using
system('python filename.py data')
and then trying to get the data using
sys.argv
but I am getting the second argument as the string 'data', not the 4000x2 matrix that I am passing from Octave.
What should I do so that I can get the original data in Python and not just the string 'data'?
There's a python command built into octave.
Alternatively, I would save as a .mat file, and open this in your python script using scipy.io.loadmat.
There's also eval_py and python_cmd from the symbolic package, but I'm not sure if this is appropriate for your particular use-case. The most general, matlab-compatible, and recommended way to do this would be the .mat one.
I am not an expert on Octave, but the answer is most likely something like this:
command=sprintf("python filename.py %s",data)
system(command)
Be careful: the length of the command line is limited in most operating systems.
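If you do go the command-line route, the Python side then has to parse the values back out of the argument strings. A minimal sketch (the flattened row-major layout is my assumption, not something stated in the question):

```python
def parse_matrix(args, ncols=2):
    """Parse flattened numeric command-line arguments into rows of ncols.

    In the real script, `args` would be sys.argv[1:]; here we pass a list
    directly so the sketch is self-contained.
    """
    values = [float(a) for a in args]
    return [values[i:i + ncols] for i in range(0, len(values), ncols)]

# What sys.argv[1:] might look like for a 2x2 matrix passed as
# "python filename.py 1.0 2.0 3.0 4.0".
rows = parse_matrix(["1.0", "2.0", "3.0", "4.0"])
print(rows)
```

For a 4000x2 matrix, though, this risks hitting the command-line length limit mentioned above, which is another reason to prefer the .mat-file route.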

Passing Arrays from Python to Fortran (and back)

Background:
My program currently assembles arrays in Python. These arrays are connected to a front-end UI and as such have interactive elements (i.e. user specified values in array elements). These arrays are then saved to .txt files (depending on their later use). The user must then leave the Python program and run a separate Fortran script which simulates a system based on the Python output files. While this only takes a couple of minutes at most, I would ideally like to automate the process without having to leave my Python UI.
Assemble Arrays (Python) -> Edit Arrays (Python) -> Export to File (Python)
-> Import File (Fortran) -> Run Simulation (Fortran) -> Export Results to File (Fortran)
-> Import File to UI, Display Graph (Python)
Question:
Is this possible? What are my options for automating this process? Can I completely remove the repeated export/import of files altogether?
Edit:
I should also mention that the Fortran script uses LAPACK; I don't know if that makes a difference.
You do not have to pass arrays to Fortran code using text files. If you create an entry point to the Fortran code as a subroutine, you can pass all the numpy arrays using f2py. You should be aware of that if you added the f2py tag yourself. Just use any of the numerous tutorials, for example https://github.com/thehackerwithin/PyTrieste/wiki/F2Py or http://www.engr.ucsb.edu/~shell/che210d/f2py.pdf .
The way back is the same, the Fortran code just fills any of the intent(out) or intent(inout) arrays and variables with the results.
I love the Python+Fortran stack. :)
When needing close communication between your Python front-end and Fortran engine, a good option is to use the subprocess module in Python. Instead of saving the arrays to a text file, you'll keep them as arrays. Then you'll execute the Fortran engine as a subprocess within the Python script. You'll pipe the Python arrays into the Fortran engine and then pipe the results out to display.
This solution will require changing the file I/O in both the Python and Fortran codes to writing and reading to/from a pipe (on the Python side) and from/to standard input and output (on the Fortran side), but in practice this isn't too much work.
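On the Python side, that piping can be sketched with the subprocess module. Here a trivial stand-in command replaces the real Fortran engine, and a plain one-number-per-line text format is assumed; both are illustrative choices, not details from the question:

```python
import subprocess

def run_engine(array, command=("sort", "-n")):
    """Pipe an array to an external engine via stdin and read back stdout.

    `command` is a stand-in (numeric sort); the real Fortran executable
    would read the same one-number-per-line format from standard input.
    """
    payload = "\n".join(str(x) for x in array) + "\n"
    result = subprocess.run(
        command, input=payload, capture_output=True, text=True, check=True
    )
    return [float(line) for line in result.stdout.split()]

print(run_engine([3, 1, 2]))
```

On the Fortran side, the matching change is to read from standard input (unit 5) instead of opening a named file.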
Good luck!

Hadoop: Process image files in Python code

I'm working on a side project where we want to process images in a hadoop mapreduce program (for eventual deployment to Amazon's elastic mapreduce). The input to the process will be a list of all the files, each with a little extra data attached (the lat/long position of the bottom left corner - these are aerial photos)
The actual processing needs to take place in Python code so we can leverage the Python Image Library. All the Python streaming examples I can find use stdin and process text input. Can I send image data to Python through stdin? If so, how?
I wrote a Mapper class in Java that takes the list of files and saves the names, the extra data, and the binary contents to a sequence file. I was thinking maybe I need to write a custom Java mapper that takes in the sequence file and pipes it to Python. Is that the right approach? If so, what should the Java code to pipe the images out and the Python code to read them in look like?
In case it's not obvious, I'm not terribly familiar with Java OR Python, so it's also possible I'm just biting off way more than I can chew with this as my introduction to both languages...
There are a few possible approaches that I can see:
Use both the extra data and the file contents as input to your python program. The tricky part here will be the encoding. I frankly have no idea how streaming works with raw binary content, and I'm assuming the basic answer is "not well." The main issue is that the stdin/stdout communication between processes is very text-based, relying on delimiting input with tabs and newlines, and things like that. You would need to worry about the encoding of the image data, and probably have some sort of pre-processing step, or a custom InputFormat, so that you could represent the image as text.
Use only the extra data and the file location as input to your python program. Then the program can independently read the actual image data from the file. The hiccup here is making sure that the file is available to the python script. Remember this is a distributed environment, so the files would have to be in HDFS or somewhere similar, and I don't know if there are good libraries for reading files from HDFS in python.
Do the Java-Python interaction yourself. Write a Java mapper that uses the Runtime class to start the Python process itself. This way you get full control over exactly how the two worlds communicate, but obviously it's more code and a bit more involved.
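For the binary-over-stdin part of the first approach, the Python side would read raw bytes rather than text. Here is a sketch of one possible framing (length-prefixed records, which is my own choice, not a Hadoop or streaming convention):

```python
import io
import struct

def read_records(stream):
    """Read length-prefixed binary records from a byte stream:
    a 4-byte big-endian length, then that many payload bytes
    (e.g. raw image data)."""
    records = []
    while True:
        header = stream.read(4)
        if len(header) < 4:
            break
        (length,) = struct.unpack(">I", header)
        records.append(stream.read(length))
    return records

# In the real job this would read sys.stdin.buffer; here we demo with BytesIO.
demo = io.BytesIO(struct.pack(">I", 3) + b"JPG" + struct.pack(">I", 2) + b"hi")
print(read_records(demo))
```

The Java side would have to write the same framing (length, then bytes) to the process's stdin for this to work.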
