I have a problem with calling Fortran code including NetCDF subroutines from Python using F2PY. I manage to compile the code OK but when I call it from
Python I always get a 'Memory Fault' error.
For testing purposes I am using a code from the Unidata website:
http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-f90.
I included the code below:
subroutine simple_xy_wr
use netcdf
implicit none
! This is the name of the data file we will create.
character (len = *), parameter :: FILE_NAME = "simple_xy.nc"
! We are writing 2D data, a 12 x 6 grid.
integer, parameter :: NDIMS = 2
integer, parameter :: NX = 6, NY = 12
! When we create netCDF files, variables and dimensions, we get back
! an ID for each one.
integer :: ncid, varid, dimids(NDIMS)
integer :: x_dimid, y_dimid
! This is the data array we will write. It will just be filled with
! a progression of integers for this example.
integer :: data_out(NY, NX)
! Loop indexes, and error handling.
integer :: x, y
! Create some pretend data. If this was not an example program, we
! would have some real data to write, for example, model output.
do x = 1, NX
do y = 1, NY
data_out(y, x) = (x - 1) * NY + (y - 1)
end do
end do
! Always check the return code of every netCDF function call. In
! this example program, wrapping netCDF calls with "call check()"
! makes sure that any return which is not equal to nf90_noerr (0)
! will print a netCDF error message and exit.
! Create the netCDF file. The nf90_clobber parameter tells netCDF to
! overwrite this file, if it already exists.
call check(nf90_create(FILE_NAME, NF90_CLOBBER, ncid) )
! Define the dimensions. NetCDF will hand back an ID for each.
call check(nf90_def_dim(ncid, "x", NX, x_dimid) )
call check( nf90_def_dim(ncid, "y", NY, y_dimid) )
! The dimids array is used to pass the IDs of the dimensions of
! the variables. Note that in fortran arrays are stored in
! column-major format.
dimids = (/ y_dimid, x_dimid /)
! Define the variable. The type of the variable in this case is
! NF90_INT (4-byte integer).
call check( nf90_def_var(ncid, "data", NF90_INT, dimids, varid) )
! End define mode. This tells netCDF we are done defining metadata.
call check( nf90_enddef(ncid) )
! Write the pretend data to the file. Although netCDF supports
! reading and writing subsets of data, in this case we write all the
! data in one operation.
call check( nf90_put_var(ncid, varid, data_out) )
! Close the file. This frees up any internal netCDF resources
! associated with the file, and flushes any buffers.
call check( nf90_close(ncid) )
print *, "*** SUCCESS writing example file simple_xy.nc! "
contains
subroutine check(status)
integer, intent ( in) :: status
if(status /= nf90_noerr) then
print *, trim(nf90_strerror(status))
stop 2
end if
end subroutine check
end subroutine simple_xy_wr
I compile this code as follows using F2PY:
> f2py -c -L/gfortran-4.4.5/lib -lnetcdff \
> -I/gfortran-4.4.5/include \
> --fcompiler=gfortran \
> --f90flags="-g -fconvert=little-endian -frecord-marker=8 -fdefault-real-8 -Wall -O -fbounds-check -ffixed-line-length-none -ffree-line-length-none -fbacktrace -fcheck=all" \
> -m __test__ test2.f90 --debug
>
> f2py -h __test__.pyf -m __test__ test2.f90 --overwrite-signature
In python I do the following:
import __test__
__test__.simple_xy_wr()
which invariably results in a Memory Fault(coredump) error and nothing else, i.e. I get no other information. Having used print statements it seems that the error occurs at the
call check(nf90_create(FILE_NAME, NF90_CLOBBER, ncid) )
I then replaced this line with (having defined status as an integer):
status=nf90_create(FILE_NAME, NF90_CLOBBER, ncid)
and I still get the Memory Fault(coredump) error. So it seems that it is definitely the call to NF90_create that results in the error.
I have another code (more complicated) where exactly the same error occurs at the first call to a NF90 subroutine.
Any ideas what causes this error and how I can get around it? I am not even sure how to debug this anymore. I had a look at the Unidata webpages but am none the wiser.
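When a crash inside a compiled extension prints nothing beyond "Memory Fault", Python's standard faulthandler module can at least dump the Python-level traceback at the moment of the fault. The following is a minimal sketch, assuming Python 3.3 or later; call_with_crash_trace is a hypothetical helper name, and the lambda is only a stand-in for __test__.simple_xy_wr:

```python
import faulthandler
import tempfile


def call_with_crash_trace(func, log_file):
    """Call func() while faulthandler writes fatal-signal tracebacks to log_file.

    log_file must be a real file object (faulthandler needs its file
    descriptor) and must stay open while faulthandler is enabled.
    """
    faulthandler.enable(file=log_file, all_threads=True)
    try:
        return func()
    finally:
        # Reached only if func() returns or raises normally; on a hard
        # crash the traceback has already been written to log_file.
        faulthandler.disable()


# Stand-in callable; in the real case this would be __test__.simple_xy_wr.
with tempfile.TemporaryFile() as log:
    outcome = call_with_crash_trace(lambda: "no crash", log)
```

This only shows where Python was when the fault happened; it does not show the Fortran frame, so a native debugger is still needed to see inside nf90_create.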
I am trying to compile a piece of old Fortran code with f2py so that it can be called within Python.
However, the part with external function wouldn’t work.
Here is an MWE, first I have the test.f:
function f(x)
implicit double precision (a-z)
f = x * x
return
end function f
subroutine gauss(fun)
implicit double precision (a-h, j-z)
! external fun
x = 1.5
write(*,*) fun(x)
return
end subroutine gauss
which is later compiled with makefile:
f2py -c --quiet --fcompiler=gnu95 \
--f90flags="-Wtabs" \
-m test \
test.f
Lastly it is called from Python:
import test
f = lambda x: x
test.gauss(test.f)
and gives the error TypeError: test.gauss() 1st argument (fun) can’t be converted to double.
In a second attempt, I uncomment the line external fun in the subroutine gauss and get the following error message during compilation:
/tmp/tmpet9sk3e9/src.linux-x86_64-3.7/testmodule.c: In function ‘cb_fun_in_gauss__user__routines’:
/tmp/tmpet9sk3e9/src.linux-x86_64-3.7/testmodule.c:313:8: error: variable or field ‘return_value’ declared void
I am running out of ideas, any help will be greatly appreciated!
I have (partly) figured it out: f2py needs you to explicitly specify the use of the function.
Here is how:
function f(x)
implicit double precision (a-z)
f = x * x
return
end function f
subroutine gauss(fun)
implicit double precision (a-h, j-z)
!! <-------- updates starts HERE
external fun
!f2py real*8 y
!f2py y = fun(y)
double precision fun
!! <-------- updates ends HERE
x = 1.5
write(*,*) fun(x)
return
end subroutine gauss
then compile it using
main:
f2py -c --quiet --fcompiler=gnu95 \
--f90flags="-Wtabs" \
-m test \
test.f
You can test with
import test
f = lambda x: x
test.gauss(test.f)
test.gauss(f)
and see that the external function works for both Fortran and Python functions.
Two side notes:
Although the documentation says !f2py intent(callback) fun is necessary, I find the code can work without it.
You can specify multiple external functions using the same syntax, you can even use the same dummy input variable y for all of them (maybe not a good practice though).
I am trying to wrap some Fortran code using f2py, but running into the following problem. Let's say that I have a minimum working example in Fortran as,
MODULE Circle
!---------------------------------------------------------------------
!
! Module containing definitions of variables needed to
! compute the area of a circle of radius r
!
!---------------------------------------------------------------------
REAL, PARAMETER :: Pi = 3.1415927
REAL :: radius
END MODULE Circle
I save this file as circlemodule.F90 and run,
python -m numpy.f2py -c circlemodule.F90 -m circlemodule
Then f2py runs without errors and I obtain a file circlemodule.cp37-win_amd64.pyd. If I now run the simple script
import circlemodule
print(circlemodule.__doc__)
in the same directory I get the expected output and everything seems fine. However here is my issue, if I take the above example and add an extra line:
MODULE Circle
!---------------------------------------------------------------------
!
! Module containing definitions of variables needed to
! compute the area of a circle of radius r
!
!---------------------------------------------------------------------
REAL, ALLOCATABLE, DIMENSION(:,:) :: B
REAL, PARAMETER :: Pi = 3.1415927
REAL :: radius
END MODULE Circle
The code that I actually want to wrap contains a lot of allocatable definitions like this one, and if I am reading this correctly f2py should be able to handle these according to the example at the bottom of this webpage: https://numpy.org/doc/stable/f2py/python-usage.html However when I try to wrap this code using the exact same steps as before, and run the python script to import, I get the following,
ImportError: DLL load failed: The specified module could not be found.
Does anyone have any idea why this happens? Somehow the module f2py creates is now no longer recognized by python. I am using python 3.7.9 and my fortran compiler is ifort 2021. Thanks!
EDIT: It might be interesting to note that in the output of f2py the following line is present
getarrdims:warning: assumed shape array, using 0 instead of ':'
Perhaps this is a clue to the problem?
I've found that the function genfromtxt from numpy in python is very slow.
Therefore I decided to wrap a module with f2py to read my data. The data is a matrix.
subroutine genfromtxt(filename, nx, ny, a)
implicit none
character(100):: filename
real, dimension(ny,nx) :: a
integer :: row, col, ny, nx
!f2py character(100), intent(in) ::filename
!f2py integer, intent(in) :: nx
!f2py integer, intent(in) :: ny
!f2py real, intent(out), dimension(ny,nx) :: a
!Opening file
open(5, file=filename)
!read data again
do row = 1, ny
read(5,*) (a(row,col), col =1,nx) !reading line by line
end do
close (5)
end subroutine genfromtxt
The length of the filename is fixed to 100 because f2py can't deal with dynamic sizes. The code works for filenames shorter than 100 characters; otherwise the code in Python crashes.
This is called in python as:
import Fmodules as modules
w_map=modules.genfromtxt(filename,100, 50)
How can I do this dynamically without passing nx, ny as parameters nor fixing the filename length to 100?
The filename length issue is easily dealt with: you make filename: character(len_filename):: filename, have len_filename as a parameter of the fortran function, and use the f2py intent(hide) to hide it from your Python interface (see the code below).
Your issue of not wanting to have to pass in nx and ny is more complicated, and is essentially the problem of how to return an allocatable array from f2py. I can see two methods, both of which require a (small and very light) Python wrapper to give you a nice interface.
I haven't actually addressed the problem of how to read the csv file - I've just generated and returned some dummy data. I'm assuming you have a good implementation for that.
Method 1: module level allocatable arrays
This seems to be a fairly standard method - I found at least one newsgroup post recommending this. You essentially have an allocatable global variable, initialise it to the right size, and write to it.
module mod
real, allocatable, dimension(:,:) :: genfromtxt_output
contains
subroutine genfromtxt_v1(filename, len_filename)
implicit none
character(len_filename), intent(in):: filename
integer, intent(in) :: len_filename
!f2py intent(hide) :: len_filename
integer :: row, col
! for the sake of a quick demo, assume 5*6
! and make it all 2
if (allocated(genfromtxt_output)) deallocate(genfromtxt_output)
allocate(genfromtxt_output(1:5,1:6))
do row = 1,5
do col = 1,6
genfromtxt_output(row,col) = 2
end do
end do
end subroutine genfromtxt_v1
end module mod
The short Python wrapper then looks like this:
import _genfromtxt
def genfromtxt_v1(filename):
    _genfromtxt.mod.genfromtxt_v1(filename)
    return _genfromtxt.mod.genfromtxt_output.copy()
    # copy is needed, otherwise subsequent calls overwrite the data
The main issue with this is that it won't be thread safe: if two Python threads call genfromtxt_v1 at very similar times, the data could get overwritten before you've had a chance to copy it out.
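One way to narrow that window, sketched here under assumptions, is to hold a lock around both the Fortran call and the copy. The _FakeMod class below is a hypothetical stand-in for _genfromtxt.mod so that the sketch runs without the compiled extension; genfromtxt_v1_locked is likewise an illustrative name:

```python
import threading

import numpy as np

_lock = threading.Lock()


class _FakeMod:
    """Hypothetical stand-in for _genfromtxt.mod (no compiled code needed)."""

    genfromtxt_output = None

    def genfromtxt_v1(self, filename):
        # Mimics the demo Fortran routine: a 5x6 single-precision array of 2s.
        self.genfromtxt_output = np.full((5, 6), 2.0, dtype=np.float32)


mod = _FakeMod()


def genfromtxt_v1_locked(filename):
    # The lock covers both the call and the copy, so no other thread using
    # this wrapper can overwrite the module-level array in between.
    with _lock:
        mod.genfromtxt_v1(filename)
        return mod.genfromtxt_output.copy()


out = genfromtxt_v1_locked("dummy.txt")
```

This only protects callers that go through the wrapper; anything calling the Fortran routine directly bypasses the lock.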
Method 2: Python callback function that saves the data
This is slightly more involved and a horrendous hack of my own invention. You pass your array to a callback function which then saves it. The Fortran code looks like:
subroutine genfromtxt_v2(filename,len_filename,callable)
implicit none
character(len_filename), intent(in):: filename
integer, intent(in) :: len_filename
!f2py intent(hide) :: len_filename
external callable
real, allocatable, dimension(:,:) :: result
integer :: row, col
integer :: rows,cols
! for the sake of a quick demo, assume 5*6
! and make it all 2
rows = 5
cols = 6
allocate(result(1:rows,1:cols))
do row = 1,rows
do col = 1,cols
result(row,col) = 2
end do
end do
call callable(result,rows,cols)
deallocate(result)
end subroutine genfromtxt_v2
You then need to generate the "signature file" with f2py -m _genfromtxt -h _genfromtxt.pyf genfromtxt.f90 (assuming genfromtxt.f90 is the file with your Fortran code). You then modify the "user routines block" to clarify the signature of your callback:
python module genfromtxt_v2__user__routines
interface genfromtxt_v2_user_interface
subroutine callable(result,rows,cols)
real, dimension(rows,cols) :: result
integer :: rows
integer :: cols
end subroutine callable
end interface genfromtxt_v2_user_interface
end python module genfromtxt_v2__user__routines
(i.e. you specify the dimensions). The rest of the file is left unchanged.
Compile with f2py -c genfromtxt.f90 _genfromtxt.pyf
The small Python wrapper is
import _genfromtxt
def genfromtxt_v2(filename):
    class SaveArrayCallable(object):
        def __call__(self,array):
            self.array = array.copy() # needed to avoid data corruption
    f = SaveArrayCallable()
    _genfromtxt.genfromtxt_v2(filename,f)
    return f.array
I think this should be thread safe: although I think Python callback functions are implemented as global variables, by default f2py doesn't release the GIL, so no Python code can be run between the global being set and the callback being run. I doubt you care about thread safety for this application though...
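To see why the callback copies the array, here is a small simulation that needs no compiled code. fake_fortran_routine is a hypothetical stand-in for genfromtxt_v2: it hands its buffer to the callback and then overwrites it, which imitates the memory being reused after the Fortran deallocate. The copy flag is added only for the comparison:

```python
import numpy as np


class SaveArrayCallable:
    def __init__(self, copy=True):
        self.copy = copy
        self.array = None

    def __call__(self, array):
        # With copy=False we keep a reference to the caller's buffer.
        self.array = array.copy() if self.copy else array


def fake_fortran_routine(callback):
    """Stand-in for genfromtxt_v2: pass a temporary buffer to the callback."""
    result = np.full((5, 6), 2.0, dtype=np.float32)
    callback(result)
    result[:] = -1.0  # simulate the buffer being reused after deallocate


safe = SaveArrayCallable(copy=True)
unsafe = SaveArrayCallable(copy=False)
fake_fortran_routine(safe)
fake_fortran_routine(unsafe)
```

After both calls, safe.array still holds the 2s, while unsafe.array reflects whatever was written to the buffer afterwards.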
I think you can just use:
open(5, file=trim(filename))
to deal with filenames shorter than the length of filename. (i.e. you could make filename much longer than it needs to be and just trim it here)
I don't know of any nice clean way to remove the need to pass nx and ny to the Fortran subroutine. Perhaps if you can determine the size and shape of the data file programmatically (e.g. read the first line to find nx, then call some function or make a first pass over the file to count the lines), then you could allocate your a array after finding those values. That would slow everything down though, so it may be counterproductive.
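The first pass could also be done on the Python side before calling the Fortran routine. A minimal sketch, assuming whitespace-separated columns and the same number of columns on every row; text_matrix_shape is a hypothetical helper name:

```python
import os
import tempfile


def text_matrix_shape(path):
    """Return (ny, nx): non-empty line count and column count of the first row."""
    ny = 0
    nx = 0
    with open(path) as fh:
        for line in fh:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            if ny == 0:
                nx = len(fields)  # columns taken from the first data row
            ny += 1
    return ny, nx


# Small demonstration file: 3 rows, 4 columns.
with tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False) as tmp:
    tmp.write("1 2 3 4\n5 6 7 8\n9 10 11 12\n")
    demo_path = tmp.name

shape = text_matrix_shape(demo_path)
os.remove(demo_path)
```

With the question's wrapper, the result would then be passed along as modules.genfromtxt(filename, nx, ny). The extra pass reads the whole file once more, which is the slowdown mentioned above.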
I have a (for me) very weird segmentation error. At first, I thought it was interference between my 4 cores due to OpenMP. Removing OpenMP from the equation is not what I want, but it turns out that when I do, the segfault still occurs.
What's weird is that if I add a print or write anywhere within the inner-do, it works.
subroutine histogrambins(rMatrix, N, L, dr, maxBins, bins)
implicit none;
double precision, dimension(N,3), intent(in):: rMatrix;
integer, intent(in) :: maxBins, N;
double precision, intent(in) :: L, dr;
integer, dimension(maxBins, 1), intent(out) :: bins;
integer :: i, j, b;
double precision, dimension(N,3) :: cacheParticle, cacheOther;
double precision :: r;
do b= 1, maxBins
bins(b,1) = 0;
end do
!$omp parallel do &
!$omp default(none) &
!$omp firstprivate(N, L, dr, rMatrix, maxBins) &
!$omp private(cacheParticle, cacheOther, r, b) &
!$omp shared(bins)
do i = 1,N
do j = 1,N
!Check the pair distance between this one (i) and its (j) closest image
if (i /= j) then
!should be faster, because it doesn't have to look for matrix indices
cacheParticle(1, :) = rMatrix(i,:);
cacheOther(1, :) = rMatrix(j, :);
call inbox(cacheParticle, L);
call inbox(cacheOther, L);
call closestImage(cacheParticle, cacheOther, L);
r = sum( (cacheParticle - cacheOther) * (cacheParticle - cacheOther) ) ** .5;
if (r /= r) then
! r is NaN
bins(maxBins,1) = bins(maxBins,1) + 1;
else
b = floor(r/dr);
if (b > maxBins) then
b = maxBins;
end if
bins(b,1) = bins(b,1) + 1;
end if
end if
end do
end do
!$omp end parallel do
end subroutine histogramBins
I enabled --debug-capi in the f2py command:
f2py --fcompiler=gfortran --f90flags="-fopenmp -fcheck=all" -lgomp --debug-capi --debug -c -m modulename module.f90
Which gives me this:
debug-capi:Fortran subroutine histogrambins(rmatrix,&n,&l,&dr,&maxbins,bins)'
At line 320 of file mol-dy.f90
Fortran runtime error: Aborted
It also does a load of other checking, listing arguments given and other subroutines called and so on.
Anyway, the two subroutines called in are both non-parallel subroutines. I use them in several other subroutines and I thought it best not to call a parallel subroutine with the parallel code of another subroutine. So, at the time of processing this function, no other function should be active.
What's going on here? How can adding "print *, ;" cause a segfault to go away?
Thank you for your time.
It's not unusual for print statements to affect a segfault, either creating or removing it. The reason is that they change the way memory is laid out: room is made for the string being printed, or for temporary strings if you're doing some formatting. That change can be sufficient to cause a bug to either appear as a crash for the first time, or to disappear.
I see you're calling this from Python. If you're using Linux - you could try following a guide to using a debugger with Fortran called from Python and find the line and the data values that cause the crash. This method also works for OpenMP. You can also try using GDB as the debugger.
Without the source code to your problem, I don't think you're likely to get an "answer" to the question - but hopefully the above ideas will help you to solve this yourself.
Using a debugger is (in my experience) considerably less likely to have this now-you-see-it-now-you-don't behaviour than with print statements (almost certainly so if only using one thread).
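One data value worth inspecting under the debugger is the bin index. In the posted subroutine b = floor(r/dr) and only the upper end is clamped; since bins is declared dimension(maxBins, 1) with the default lower bound of 1, any pair closer than dr gives b = 0. Whether this is what triggers the reported abort is an assumption, not something confirmed in the thread. A Python mirror of that index logic (bin_index is a hypothetical name):

```python
import math


def bin_index(r, dr, max_bins):
    """Mirror of the Fortran logic: floor(r/dr), clamped only from above."""
    b = math.floor(r / dr)
    if b > max_bins:
        b = max_bins
    return b


# A pair distance smaller than dr maps to index 0, which a 1-based
# Fortran array bins(1:maxBins, 1) does not have.
close_pair = bin_index(0.05, 0.1, 10)

# A large distance is clamped to the last bin, as intended.
far_pair = bin_index(5.0, 0.1, 10)
```

If the debugger does show b = 0 at the failing line, adding a lower bound of 1 to the clamp would be the matching change on the Fortran side.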
I have been struggling with this issue for a while now, and search queries / applicable documentation did not yield any viable results either; hence posting it here.
What do I want to accomplish:
I have a program written in FORTRAN77 which takes some arguments and returns a double precision array of fixed length.
I wish to use another programming language to call this function (for now I use Python for testing purposes, but this will be subject to change, so f2py isn't an option here).
The FORTRAN routine can be summarized to be something as follows:
FUNCTION TST (IIN)
IMPLICIT LOGICAL (A-Z)
cGCC$ ATTRIBUTES DLLEXPORT, STDCALL :: TST
INTEGER II,IIN
DOUBLE PRECISION, DIMENSION(3) :: TST
C
C Test function:
DO 1001 II = 1,3
TST(II) = II * 1.11D0 + IIN
1001 CONTINUE
RETURN
END
This is compiled with gcc-fortran as follows:
gfortran -static -c -fdollar-ok -fno-align-commons TEST.for
gfortran -shared -mrtd -static -o TEST.dll TEST.def TEST.o
Where TEST.def maps TST to tst_
No problem thus far, however in python the issue arises "how do I call this function and handle the return value?"
Using the 'depwalker'-tool; the TST function apparently expects 2 arguments (tst_#8). Besides an integer, I assume this should be either a pointer to the output array or its length.
My python code is as follows:
import ctypes as cs
import numpy as np
#import dll library hooks:
tdll = cs.WinDLL(pathToDLL)
#test:
tdll.TST.restype = None
tdll.TST.argtypes = [cs.POINTER(cs.c_long), cs.POINTER(cs.c_double*3)]
#start testing:
Ar = [0.0, 0.0, 0.0]
_A = np.array(Ar).ctypes.data_as(cs.POINTER(cs.c_double*len(Ar)))
_L = cs.c_long(3)
tdll.TST(cs.byref(_L), _A)
The problem is that this code (together with any variants of it that I try) produces an error: OSError: exception: access violation writing 0x0000000F. If I try to pass the first argument by value, it results in OSError: exception: access violation reading 0x0000000F
Can someone point me in the right direction here?
After some tooling around; also thanks to the valuable suggestions by Vladimir F, a solution to the issue as described above has been found.
Some slight changes on the python side of things were required in order to make it work:
import ctypes as cs
import numpy as np
#import dll library handle:
tdll = cs.WinDLL(pathToDLL)
#specify result and argument types
tdll.TST.restype = None
tdll.TST.argtypes = [cs.POINTER(cs.POINTER(cs.c_double*3)), cs.POINTER(cs.c_long)]
#call the dll function 'TST':
Ar = (cs.c_double*3)()
_A = cs.pointer(Ar)
tdll.TST(cs.byref(_A), cs.byref(cs.c_long(3)))
result = Ar[:]
Hopefully this post will be of help to someone else.
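If the result is wanted as a NumPy array rather than a list, the ctypes buffer can be wrapped directly. A minimal sketch that runs without the DLL: fake_tst is a hypothetical stand-in that writes through the pointer-to-pointer argument the way the solution above implies TST does, and cs.pointer is used in place of cs.byref so the stand-in can dereference its arguments in pure Python:

```python
import ctypes as cs

import numpy as np


def fake_tst(out_pp, iin_p):
    """Stand-in for the DLL's TST: fill the 3-element output buffer."""
    out = out_pp.contents.contents  # pointer -> pointer -> c_double array
    for i in range(3):
        # Same formula as the Fortran loop: TST(II) = II * 1.11D0 + IIN
        out[i] = (i + 1) * 1.11 + iin_p.contents.value


Ar = (cs.c_double * 3)()
_A = cs.pointer(Ar)
fake_tst(cs.pointer(_A), cs.pointer(cs.c_long(3)))

# View the ctypes buffer as float64 and copy it so the result owns its data.
result = np.frombuffer(Ar, dtype=np.float64).copy()
```

With the real DLL, only the call line changes back to tdll.TST(cs.byref(_A), cs.byref(cs.c_long(3))); the np.frombuffer line stays the same.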