Cython macro definition in structure - python

I'm using Cython to expose a C structure to Python, but the structure contains some macro definitions, including function-like macros. I just don't know how to declare this structure in Cython.
typedef struct _SparMat {
    int m, n;
    int *rvec;
    int *ridx;
    double *rval;
    int *cvec;
    int *cidx;
    double *cval;
    int nnz;
    int bufsz;
    int incsz;
    int flag;
#define MAT_ROWBASE_INDEX (0x00000001)
#define MAT_ROWBASE_VALUE (0x00000002)
#define MAT_COLBASE_INDEX (0x00000004)
#define MAT_COLBASE_VALUE (0x00000008)
#define CSR_INDEX(flag) ((flag) & MAT_ROWBASE_INDEX)
#define CSR_VALUE(flag) ((flag) & MAT_ROWBASE_VALUE)
#define CSC_INDEX(flag) ((flag) & MAT_COLBASE_INDEX)
#define CSC_VALUE(flag) ((flag) & MAT_COLBASE_VALUE)
} SparMat, *matptr;
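For later readers, a sketch of how this can be declared: Cython never runs the C preprocessor, so the macros can simply be declared inside the cdef extern block as ordinary constants and functions, and the C compiler expands them at build time. The header name sparmat.h below is my assumption:

# sparmat.pxd -- a minimal sketch; "sparmat.h" is an assumed header name
cdef extern from "sparmat.h":
    ctypedef struct SparMat:
        int m, n
        int *rvec
        int *ridx
        double *rval
        int *cvec
        int *cidx
        double *cval
        int nnz
        int bufsz
        int incsz
        int flag
    ctypedef SparMat* matptr

    # Object-like macros can be declared as anonymous enum constants...
    enum:
        MAT_ROWBASE_INDEX
        MAT_ROWBASE_VALUE
        MAT_COLBASE_INDEX
        MAT_COLBASE_VALUE

    # ...and function-like macros as ordinary functions.
    int CSR_INDEX(int flag)
    int CSR_VALUE(int flag)
    int CSC_INDEX(int flag)
    int CSC_VALUE(int flag)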

Related

PyImport_Import segmentation fault after reading in TSV with C++

I am using C++ as a wrapper around a Python module. First, I read in a TSV file, cast it as a numpy array, import my Python module, and then pass the numpy array to Python for further analysis. When I first wrote the program, I was testing everything using a randomly generated array, and it worked well. However, once I replaced the randomly generated array with the imported TSV array, I got a segmentation fault when I tried to import the Python module. Here is some of my code:
#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION
#define PY_SSIZE_T_CLEAN
#include <python3.8/Python.h>
#include "./venv/lib/python3.8/site-packages/numpy/core/include/numpy/arrayobject.h"
#include <stdio.h>
#include <iostream>
#include <stdlib.h>
#include <random>
#include <fstream>
#include <sstream>

int main(int argc, char* argv[]) {
    setenv("PYTHONPATH", ".", 0);
    Py_Initialize();
    import_array();
    static const int numberRows = 1000;
    static const int numberColumns = 500;
    npy_intp dims[2]{ numberRows, numberColumns };
    static const int numberDims = 2;
    double(*c_arr)[numberColumns]{ new double[numberRows][numberColumns] };

    // ***********************************************************
    // THIS PART OF THE CODE GENERATES A RANDOM ARRAY AND WORKS WITH THE REST OF THE CODE
    // // initialize random number generation
    // typedef std::mt19937 MyRNG;
    // std::random_device r;
    // MyRNG rng{r()};
    // std::lognormal_distribution<double> lognormalDistribution(1.6, 0.25);
    // // populate array
    // for (int i=0; i < numberRows; i++) {
    //     for (int j=0; j < numberColumns; j++) {
    //         c_arr[i][j] = lognormalDistribution(rng);
    //     }
    // }
    // ***********************************************************

    // ***********************************************************
    // THIS PART OF THE CODE INGESTS AN ARRAY FROM TSV AND CAUSES CODE TO FAIL AT PyImport_Import
    std::ifstream data("data.mat");
    std::string line;
    int row = 0;
    int column = 0;
    while (std::getline(data, line)) {
        std::stringstream lineStream(line);
        std::string cell;
        while (std::getline(lineStream, cell, '\t')) {
            c_arr[row][column] = std::stod(cell);
            column++;
        }
        row++;
        column = 0;
        if (row > numberRows) {
            break;
        }
    }
    // ***********************************************************
    PyArrayObject *npArray = reinterpret_cast<PyArrayObject*>(
        PyArray_SimpleNewFromData(numberDims, dims, NPY_DOUBLE, reinterpret_cast<void*>(c_arr))
    );
    const char *moduleName = "cpp_test";
    PyObject *pname = PyUnicode_FromString(moduleName);
    // ***********************************************************
    // CODE FAILS HERE - SEGMENTATION FAULT
    PyObject *pyModule = PyImport_Import(pname);
    // .......
    // THERE IS MORE CODE BELOW NOT INCLUDED HERE
}
So, I'm not sure why the code fails when I ingest data from a TSV file but not when I use randomly generated data.
EDIT: (very stupid mistake incoming) I used the conditional row > numberRows as the stopping condition in the while loop, which permits one extra iteration: the loop writes to c_arr[numberRows], one row past the end of the array (valid indices run from 0 to numberRows - 1), and the resulting memory corruption only surfaced later, at PyImport_Import. Once I changed that conditional to row == numberRows, everything worked. Who knew being specific about rows when building an array was so important? I'll leave this up as a testament to stupid programming mistakes; maybe someone will learn a little something from it.
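For reference, a sketch of the read loop with both indices bounded, which avoids the overrun no matter how many lines or columns the file contains:

// Bound row and column before each write; valid indices are
// 0..numberRows-1 and 0..numberColumns-1.
while (row < numberRows && std::getline(data, line)) {
    std::stringstream lineStream(line);
    std::string cell;
    column = 0;
    while (column < numberColumns && std::getline(lineStream, cell, '\t')) {
        c_arr[row][column] = std::stod(cell);
        column++;
    }
    row++;
}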
Note that you don't have to use built-in arrays for storing the values in 2D form; you can also use a dynamically sized container like std::vector, as shown below. The advantage of std::vector is that you don't have to know the number of rows and columns in your input file (data.mat) beforehand, so you don't have to allocate memory for the rows and columns up front: you can add the values dynamically.
#include <iostream>
#include <vector>
#include <string>
#include <sstream>
#include <fstream>

int main() {
    std::string line;
    double word;
    std::ifstream inFile("data.mat");
    // create/use a std::vector instead of a built-in array
    std::vector<std::vector<double>> vec;
    if (inFile)
    {
        while (getline(inFile, line, '\n'))
        {
            // create a temporary vector that will contain all the columns
            std::vector<double> tempVec;
            std::istringstream ss(line);
            // read word by word (or double by double)
            while (ss >> word)
            {
                //std::cout << "word: " << word << std::endl;
                // add the word to the temporary vector
                tempVec.push_back(word);
            }
            // now all the words from the current line have been added to the temporary vector
            vec.emplace_back(tempVec);
        }
    }
    else
    {
        std::cout << "file cannot be opened" << std::endl;
    }
    inFile.close();
    // let's check the elements of the 2D vector to confirm it contains all the right rows and columns
    for (std::vector<double> &newvec : vec)
    {
        for (const double &elem : newvec)
        {
            std::cout << elem << " ";
        }
        std::cout << std::endl;
    }
    return 0;
}
Since you didn't provide a data.mat file, I created an example one and used it to confirm that the program prints back all the rows and columns correctly.
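One caveat if the data is then handed to NumPy as in the question: PyArray_SimpleNewFromData expects one contiguous buffer, so the nested vector has to be flattened first. A sketch, reusing vec and the NumPy headers from the question's listing and assuming every row has the same number of columns:

// Flatten the rows x columns vector into contiguous storage for NumPy.
std::vector<double> flat;
for (const std::vector<double> &rowVec : vec)
    flat.insert(flat.end(), rowVec.begin(), rowVec.end());

npy_intp dims[2] = { static_cast<npy_intp>(vec.size()),
                     static_cast<npy_intp>(vec.empty() ? 0 : vec[0].size()) };
PyObject *npArray = PyArray_SimpleNewFromData(2, dims, NPY_DOUBLE, flat.data());
// NumPy does not copy the buffer here, so flat must outlive npArray.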

In mat2py how to fix error AttributeError: 'numpy.ndarray' object has no attribute 'get_svm_type'

I am trying to convert Matlab code to Python code. I have installed the libsvm library with sudo apt-get install python-libsvm, but I am getting an error.
import scipy.io
from svmutil import svm_predict

def SVM_LIBSVM(VECTOR_TRAINING_LABELS, VECTOR_TRAINING_POINTS, VECTOR_TESTING_LABELS, VECTOR_TESTING_POINTS):
    mat = scipy.io.loadmat('model.mat')
    model = mat['model']
    predicted_label = list()
    predicted_label = svm_predict(VECTOR_TESTING_LABELS, VECTOR_TESTING_POINTS, model)
    return predicted_label
I am getting the following error:
File "/home/optimum/mat2py/svmutil.py", line 193, in svm_predict
svm_type = m.get_svm_type()
AttributeError: 'numpy.ndarray' object has no attribute 'get_svm_type'
I'm assuming that the model object comes from the MATLAB libsvm interface, which would make it a MATLAB struct and not an svm_model object. AFAIK there is no direct conversion, but you should be able to read the fields from the struct and use them to build a Python svm_model object (not 100% on this; I haven't used the Python API much).
model_parameters = model['Parameters']
model_nr_class = model['nr_class']
model_total_sv = model['totalSV']
model_rho = model['rho']
model_label = model['Label']
model_sv_indices = model['sv_indices']
model_prob_a = model['ProbA']
model_prob_b = model['ProbB']
model_n_sv = model['nSV']
model_sv_coef = model['sv_coef']
model_svs = model['SVs']
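A scipy detail that matters when unpacking these: loadmat returns a MATLAB struct as a 1x1 record array, so each of the lookups above still yields a nested array that needs indexing. Passing squeeze_me=True and struct_as_record=False makes the fields directly accessible as attributes (the field names assume the model was saved by the MATLAB libsvm interface):

import scipy.io

# squeeze_me drops the 1x1 wrapper dimensions; struct_as_record=False
# returns the MATLAB struct as an object with attribute access.
mat = scipy.io.loadmat('model.mat', squeeze_me=True, struct_as_record=False)
model = mat['model']
print(model.rho)       # constants in the decision functions
print(model.nr_class)  # number of classes
print(model.totalSV)   # total number of support vectors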
The most painful way to generate this, if you can't find a way in Python, would be to move the MATLAB struct values into a ctypes struct and then use toPyModel (from the Python API) to convert it to a Python svm_model. For reference, this is the underlying C struct:
struct svm_model
{
    struct svm_parameter param; /* parameter */
    int nr_class;               /* number of classes, = 2 in regression/one class svm */
    int l;                      /* total #SV */
    struct svm_node **SV;       /* SVs (SV[l]) */
    double **sv_coef;           /* coefficients for SVs in decision functions (sv_coef[k-1][l]) */
    double *rho;                /* constants in decision functions (rho[k*(k-1)/2]) */
    double *probA;              /* pairwise probability information */
    double *probB;
    int *sv_indices;            /* sv_indices[0,...,nSV-1] are values in [1,...,num_training_data] to indicate SVs in the training set */
    /* for classification only */
    int *label;                 /* label of each class (label[k]) */
    int *nSV;                   /* number of SVs for each class (nSV[k]) */
                                /* nSV[0] + nSV[1] + ... + nSV[k-1] = l */
    /* XXX */
    int free_sv;                /* 1 if svm_model is created by svm_load_model */
                                /* 0 if svm_model is created by svm_train */
};
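Sketched below is roughly what that route looks like for the scalar and per-class fields pulled out above. It is deliberately incomplete (building SV and sv_coef, the genuinely painful part, is only indicated by a comment) and every size here is an assumption, so treat it as a starting point rather than working code:

from ctypes import POINTER, c_double, c_int, cast, pointer
from svm import svm_model, toPyModel  # ships with the libsvm Python bindings

m = svm_model()
m.nr_class = int(model_nr_class)
m.l = int(model_total_sv)
k = m.nr_class
# rho holds k*(k-1)/2 pairwise constants; label and nSV hold one entry per class.
rho = (c_double * (k * (k - 1) // 2))(*map(float, model_rho.ravel()))
label = (c_int * k)(*map(int, model_label.ravel()))
n_sv = (c_int * k)(*map(int, model_n_sv.ravel()))
m.rho = cast(rho, POINTER(c_double))
m.label = cast(label, POINTER(c_int))
m.nSV = cast(n_sv, POINTER(c_int))
# m.param, m.SV (svm_node**) and m.sv_coef (double**) still need to be filled
# in from model_parameters, model_svs and model_sv_coef before the model is usable.
py_model = toPyModel(pointer(m))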

Read string in HDF5/C++

I have stored a string (and a vector) in my HDF5 archive, for example with the Python interface:
import h5py
file = h5py.File("example.h5","w")
file['/path/to/vector'] = [0., 1., 2.]
file['/path/to/string'] = 'test'
Now I want to read the string into a std::string. I know how to read the vector (see below), but I have absolutely no idea how to read the string. What I particularly don't understand is how to allocate the result, as:
The H5Cpp library does not seem to use the STL containers, but rather raw pointers, requiring pre-allocation.
This is somewhat contradicted by the observation that HDFView indicates the dimension size to be 1 and the type to be "String, length = variable".
Here is how I read the vector:
#include "H5Cpp.h"
#include <vector>
#include <iostream>
int main()
{
// open file
H5::H5File fid = H5::H5File("example.h5",H5F_ACC_RDONLY);
// open dataset
H5::DataSet dataset = fid.openDataSet("/path/to/vector");
H5::DataSpace dataspace = dataset.getSpace();
H5T_class_t type_class = dataset.getTypeClass();
// check data type
if ( type_class != H5T_FLOAT )
throw std::runtime_error("Unable to read, incorrect data-type");
// check precision
// - get storage type
H5::FloatType datatype = dataset.getFloatType();
// - get number of bytes
size_t precision = datatype.getSize();
// - check precision
if ( precision != sizeof(double) )
throw std::runtime_error("Unable to read, incorrect precision");
// get the size
// - read rank (a.k.a number of dimensions)
int rank = dataspace.getSimpleExtentNdims();
// - allocate
hsize_t hshape[rank];
// - read
dataspace.getSimpleExtentDims(hshape, NULL);
// - total size
size_t size = 0;
for ( int i = 0 ; i < rank ; ++i ) size += static_cast<size_t>(hshape[i]);
// allocate output
std::vector<double> data(size);
// read data
dataset.read(const_cast<double*>(data.data()), H5::PredType::NATIVE_DOUBLE);
// print data
for ( auto &i : data )
std::cout << i << std::endl;
}
(compiled with h5c++ -std=c++14 so.cpp)
I have found a solution:
#include "H5Cpp.h"
#include <vector>
#include <iostream>
int main()
{
// open file
H5::H5File fid = H5::H5File("example.h5",H5F_ACC_RDONLY);
// open dataset, get data-type
H5::DataSet dataset = fid.openDataSet("/path/to/string");
H5::DataSpace dataspace = dataset.getSpace();
H5::StrType datatype = dataset.getStrType();
// allocate output
std::string data;
// read output
dataset.read(data, datatype, dataspace);
std::cout << data << std::endl;
}

SWIG c++ vector access in python

This may be a noob question but here it goes. I have wrapped a 3D vector into a Python module using SWIG. Everything has compiled, and I can import the module and perform actions with it. What I can't figure out is how to access the vector from Python to store and change values in it. How do I store and change my vector values in Python? My code is below; it was written to test whether the STL algorithm header works with SWIG. It does seem to work, but I need to be able to put values into my vector from Python.
header.h
#ifndef HEADER_H_INCLUDED
#define HEADER_H_INCLUDED
#include <vector>
using namespace std;

struct myStruct{
    int vecd1, vecd2, vecd3;
    vector<vector<vector<double> > > vec3d;
    void vecSizer();
    void deleteDuplicates();
    double vecSize();
    void run();
};
#endif // HEADER_H_INCLUDED
main.cpp
#include "header.h"
#include <vector>
#include <algorithm>
void myStruct::vecSizer()
{
vec3d.resize(vecd1);
for(int i = 0; i < vec3d.size(); i++)
{
vec3d[i].resize(vecd2);
for(int j = 0; j < vec3d[i].size(); j++)
{
vec3d[i][j].resize(vecd3);
}
}
}
void myStruct::deleteDuplicates()
{
vector<vector<vector<double> > >::iterator it;
sort(vec3d.begin(),vec3d.end());
it = unique(vec3d.begin(),vec3d.end());
vec3d.resize(distance(vec3d.begin(), it));
}
double myStruct::vecSize()
{
return vec3d.size();
}
void myStruct::run()
{
vecSizer();
deleteDuplicates();
vecSize();
}
From the Python interpreter (Ubuntu terminal):
>>> import test              # import the SWIG-generated module
>>> x = test.myStruct()      # create an instance of myStruct
>>> x.vecSize()              # should be 0 since the vector dimensions are not initialized
0.0
>>> x.vec3d                  # see if vec3d exists and is of the correct type
<Swig Object of type 'vector< vector< vector< double > > > *' at 0x7fe6a483c8d0>
Thanks in advance!
It turns out that, by default, SWIG wraps the nested vector as an opaque pointer object, so in this form you cannot store or change its values from Python; SWIG has to be told to generate real container wrappers.
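For what it's worth, the opaque pointer is only SWIG's fallback for types it has no wrappers for. The std_vector.i library that ships with SWIG generates full, mutable proxy classes; here is a sketch of an interface file (the %template names are my own, and the module name test matches the session above):

/* test.i -- sketch: expose the nested vectors as mutable Python proxies */
%module test
%{
#include "header.h"
%}

%include "std_vector.i"

/* Instantiate a proxy class for each level of nesting. */
%template(DoubleVector)   std::vector<double>;
%template(DoubleVector2D) std::vector<std::vector<double> >;
%template(DoubleVector3D) std::vector<std::vector<std::vector<double> > >;

%include "header.h"

With those templates in place, the generated proxies behave like Python sequences (len(), indexing, iteration, push_back), and a vector built in Python can be assigned to x.vec3d through the generated setter.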

creating numpy array in c extension segfaults

I'm just trying to start off by creating a numpy array before I even start to write my extension. Here is a super simple program:
#include <stdio.h>
#include <iostream>
#include "Python.h"
#include "numpy/npy_common.h"
#include "numpy/ndarrayobject.h"
#include "numpy/arrayobject.h"

int main(int argc, char * argv[])
{
    int n = 2;
    int nd = 1;
    npy_intp size = {1};
    PyObject* alpha = PyArray_SimpleNew(nd, &size, NPY_DOUBLE);
    return 0;
}
This program segfaults on the PyArray_SimpleNew call and I don't understand why. I'm trying to follow some previous questions (e.g. numpy array C api and C array to PyArray). What am I doing wrong?
Typical usage of PyArray_SimpleNew is, for example:
int nd = 2;
npy_intp dims[] = {3,2};
PyObject *alpha = PyArray_SimpleNew(nd, dims, NPY_DOUBLE);
Note that the value of nd must not exceed the number of elements of array dims[].
ALSO: The extension must call import_array() to set up the C API's function-pointer table. E.g. in Cython:
import numpy as np
cimport numpy as np

np.import_array()  # so numpy's C API won't segfault

cdef make_array():
    cdef np.npy_intp element_count = 100
    return np.PyArray_SimpleNew(1, &element_count, np.NPY_DOUBLE)
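Applied to the standalone program in the question, a sketch with both missing initialization steps; _import_array() is the int-returning form of the same macro, which is easier to call from a function that returns int:

#include <Python.h>
#include "numpy/arrayobject.h"

int main(int argc, char *argv[])
{
    Py_Initialize();              /* the interpreter must be running first */
    if (_import_array() < 0) {    /* load numpy's C-API function table */
        PyErr_Print();
        return 1;
    }
    npy_intp size = 1;
    PyObject *alpha = PyArray_SimpleNew(1, &size, NPY_DOUBLE);
    Py_XDECREF(alpha);
    Py_Finalize();
    return 0;
}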
