Calling Fortran subroutines from Rust
I’ve been working on some Rust bindings to the COIN HSL Archive - a Fortran package containing solvers for linear equations and finding matrix scaling factors. HSL make some of the best algorithms in this domain, and are they commonly used in large scale nonlinear optimization packages such as Ipopt. The algorithms in the free archive package have largely been superseded by more modern alternatives, but these cost money to acquire. The archive package on the other hand is free for personal use subject to acquiring an individual license directly from HSL.
If you are interested in the specifics of the algorithms:
ma27
- solves a sparse symmetric system of linear equations, where the matrix need need be definite.ma28
- solves a sparse system of linear equations.mc19
- calculates scaling factors for a sparse matrix.
There are a lot of well-studied scientific packages in Fortran, and as discussed to some extent in the last post, it would be great to bring more to the scientific compute ecosystem in Rust! I hadn’t expected to need to Fortran FFI in any of my Rust projects, so I thought it would make the subject of a good blog post. This is by no means a complete guide to this subject - I know it is a strict subset of what would be generally required to call any given Fortran library, but I hope my experience will be useful to someone nonetheless.
C bindings?
When investigating solutions for a new problem, it can be worth stopping to think if you even have a problem at all.
Some Fortran libraries already ship with C bindings. This may be an easier method of integration that worrying about Fortran specifically. Rust/C interop is a well-trodden path - the Rust book has a descriptive FFI section and the crates ecosystem features mature crates for working with C interfaces, such as bindgen
.
If you have the source code available for the library you wish to integrate and at least Fortran 2003 available (which seems likely if you are using Rust), then you can make use of the bind(C)
attribute to generate a C-interoperable interface for the library. For example:
1
2
3
4
! You can even pass a custom symbol name to bind(c)!
function my_function() bind(c, "my_special_symbol_name")
! implementation here...
end function my_function
With this, you may follow the more common Rust/C interop approach if desired. If you are authoring the Fortran library yourself, I’d recommend adding C bindings purely to increase its consumability regardless of which language your callers are using. C is the lingua franca of foreign function interfaces and nearly all commercially viable programming languages support calling extern "C"
functions one way or another.
Calling Fortran subroutines from Rust requires only a little more consideration than these cases, so if you don’t have C bindings available (or don’t want to use them) then let’s look at some of the things you can do to make Rust and Fortran work together.
Basic Fortran FFI
The first step in most language interop journeys is understanding the foreign function interface (FFI). For the COIN HSL archive, this is relatively simple as the entire public interface of the package consists of Fortran subroutines.
Subroutines in Fortran cannot return values, they manage all data as parameters so we only need concern ourselves with representing the data in the argument list and the symbol name. The arguments to the public API subroutines of the COIN HSL Archive happen to only use C-compatible integer and floating point scalar and array types, so the former problem is solved for us in this instance with primitive types.
In a more general FFI case, we would potentially need to find a set of structured types and a long list of functions with return values to bind to our crate. This would require us to consider the memory representation of the structured types (layout, alignment, etc.) and potentially the lifetimes of returned values. Moreover, Fortran also has a modules feature which can further complicate name mangling. These are not a problem for the library in question today though, I will simply point out that those features exist and that I (perhaps somewhat luckily) will not be covering them, for now at least.
A first subroutine
The first simple rule we can learn is that the default name mangling for a Fortran subroutine in gfortran
appends an underscore (_
).
Secondly (and this tripped me up), all functions and subroutines in Fortran take their arguments by reference. From an FFI perspective, this means we need to pass raw pointers to our Rust data. There is a telltale sign that you have made this mistake for integral types: you will get segfaults when you call an erroneously-bound Fortran function and see the value you passed appear as a memory address in tools like valgrind
.
Using these two simple rules, if we have a subroutine that looks like this:
1
2
3
4
5
6
subroutine square(x, r)
implicit none
integer (4), intent(in) :: x
integer (4), intent(out) :: r
r = x * x
end subroutine square
Assuming that the subroutine was compiled into a library called myfortranlib
, then we can bind to it using a Rust function like this:
1
2
3
4
#[link(name = "myfortranlib")]
unsafe extern "C" {
fn square_(x: *const i32, r: *mut i32);
}
Verbose Fortran syntax aside, that’s simple enough! We can add many function signatures to this block if required.
Side note: whilst it may look tempting to derive the mutability of the Rust arguments from those intent
specifiers in the subroutine dummy argument list, it is possible to violate these. The intent specifiers simply provide hints to the Fortran compiler about how the arguments should be used and if ignored could cause undefined behaviour in your bindings. For const-correctness, you should refer to the documentation for the code that you binding to get the mutability of the data in the subroutine argument list.
Building up the parameters
Whilst the simple square
subroutine is fine for a quick concept demonstration, the actual COIN HSL Archive functions take a lot of inputs (so much so that I had to silence clippy
at the root level of the crate). These inputs range from scalars, arrays, out parameters, and special arrays used to provide configuration options for the algorithms.
Focussing on the last of those input types, the algorithm configuration options are stored in integer and real-valued arrays of length 30 and 5 respectively - no structured types here! Let’s have a look at a more typical, but still slightly cut down, example of a function that more closely resembles the APIs of the COIN HSL Archive package. We’ll assume that we have real-valued matrix non-zeroes in an in/out parameter called a
, the indices nonzero elements are stored in the integer arrays irn
and icn
, and lastly we have the integer and real-valued control variables in icntl
and cntl
respectively. How would our bindings to this look in Rust?
1
2
3
4
5
6
7
8
9
10
#[link(name = "coinhsl")] // for example
unsafe extern "C" {
fn example_matrix_function_(
a: *mut f64,
irn: *const i32,
icn: *const i32,
icntl: *const i32,
cntl: *const f64,
);
}
Uh-oh, because everything in Fortran is a reference, this is starting to get opaque. At a glance, it is not straightforward to see which arguments are supposed to be scalars, vectors, perhaps raw memory addresses, or maybe even optional values. Of course, when writing FFI bindings in any language, we do not simply expose the functions from another language, we provide idiomatic wrappers using the conventions of our desired language. Now that we can represent our Fortran subroutines, let’s focus on consuming them nicely because we certainly don’t want to ship this.
Leveraging idiomatic Rust
Ok, so we can call our Fortran subroutines from Rust - great! Although, that example_matrix_function_
doesn’t look like something I would actually want to consume as a Rust developer. It’s implicitly unsafe
and the arguments are pointers, which is going to require me to take care to not violate the guarantees of safe Rust. Thankfully, Rust has more than a few idioms that can make this can make this more ergonomic for callers
For starters, we can just lift the pointer arguments into slices and then call the Fortran binding converting them to pointers.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
pub fn example_matrix_function(
a: &mut [f64],
irn: &[i32],
icn: &[i32],
icntl: &[i32; 30],
cntl: &[f64; 5],
) {
unsafe {
example_matrix_function_(
a.as_mut_ptr(),
irn.as_ptr(),
icn.as_ptr(),
icntl.as_ptr(),
cntl.as_ptr(),
);
}
}
This is the minimum viable binding in my opinion - it’s true to the original function but removes the potentially unsafe pointer semantics. We can take this a few steps further though.
Newtype idiom
We’ve already seen that the COIN HSL Archive library uses fixed-length control arrays for configuring the algorithms. Perhaps today we would use a structured type to represent this configuration data with names, giving us the benefit of a strong type for the data contained within those array arguments. Extolling the virtues of leveraging the type system in your programs is probably not required for a Rust audience, let’s see how we can apply it to our bindings.
Newtype idiom “gives compile time guarantees that the right type of value is supplied to a program”. This is perfect for our integer control structure array, which could easily get jumbled up with the many other arrays we must pass to the Fortran APIs. Let’s prevent that mistake at compile time with the newtype idiom!
1
pub struct Icntl([i32; 30]);
Essentially, we use a tuple struct
to create an entirely new type (importantly not just a type alias) containing a single member: our integer array of length 30.
This new type has no functionality defined for it though because we’ve hidden the array as an implicit private field. We can remedy that by implementing index operations and some typical conversions via standard Rust traits. Note that I am using the core::
namespace rather than std::
so I can keep this crate no_std
compatible.
1
2
3
4
5
impl core::ops::Index<usize> for Icntl { /* --snip-- */ }
impl core::ops::IndexMut<usize> for Icntl { /* --snip-- */ }
impl AsRef<[i32]> for Icntl { /* --snip-- */ }
impl AsMut<[i32]> for Icntl { /* --snip-- */ }
impl IntoIterator for Icntl { /* --snip-- */ }
Applying the newtype pattern to all of our control array types, we get a moderate improvement in expressiveness. Perhaps the name Icntl
could be improved, but it comes from the COIN HSL Library itself so at least it is consistent. Substituting our fixed-length slice arguments with our new control types gives us the following work-in-progress function:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
pub fn example_matrix_function(
a: &mut [f64],
irn: &[i32],
icn: &[i32],
icntl: &Icntl,
cntl: &Cntl,
) {
unsafe {
example_matrix_function_(
a.as_mut_ptr(),
irn.as_ptr(),
icn.as_ptr(),
icntl.as_ref().as_ptr(),
cntl.as_ref().as_ptr(),
);
}
}
Accepting anything slice-like
Let’s finish up by making the matrix non-zero values and indices generic in types that can be converter to slices.
This can be achieved by taking arguments for contiguous arrays as impl AsMut<[T]>
and impl AsRef<[T]>
for mutable and immutable arrays respectively. Now, any type that satisfies these traits can be passed to our function and the compiler will generate many implementations for each combination of argument types that are actually invoked by our consumer (this is called monomorphization).
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
pub fn example_matrix_function(
mut a: impl AsMut<[f64]>,
irn: impl AsRef<[i32]>,
icn: impl AsRef<[i32]>,
icntl: &Icntl,
cntl: &Cntl,
) {
let a = a.as_mut();
let irn = irn.as_ref();
let icn = icn.as_ref();
unsafe {
example_matrix_function_(
a.as_mut_ptr(),
irn.as_ptr(),
icn.as_ptr(),
icntl.as_ref().as_ptr(),
cntl.as_ref().as_ptr(),
);
}
}
Now users can provide any type that be converted to a slice, even deciding whether or not the input is borrowed or moved. This will allow consumers to bring their own linear algebra types whilst still maintaining control of ownership, which can have performance implications in scientific codes.
There is a small issue here though. We just used the newtype idiom to prevent the control arrays getting mixed up with the other arguments, but because the control arrays themselves implement AsRef
and AsMut
they can accidentally be passed in place of the other arguments once more! There are clearly some implementation trade-offs to be made here: we could lose the named control types, not implement AsRef
and AsMut
for them, or leave it as-is with reduced benefit. In the end I chose the latter so as to not lose the expressive power of the generic arguments.
Conclusion
The few Fortran/Rust interop techniques contained herein were enough to get my bindings to the COIN HSL Archive package up and running. With this, I can run all of the documented Fortran examples through my Rust functions and get the same answers. Result!
There are definitely some areas I did not touch on. Older Fortran codes use a so-called “common block” which is a kind of global variable store shared between functions. Actually, some of the COIN HSL Archive subroutines use these, it is just not relevant for my usage so I have escaped that consideration for now. Furthermore, as previously mentioned, more modern Fortran codes are organised into modules which will have nontrivial implications for our Rust bindings code. It feels cool to be able to solve an immediate practical problem but come away knowing you still have a lot more to learn.
Looking further into the future, a dream world of Rust/Fortran tooling would probably consist of build script support for Fortran toolchains, similar to how we have the bindgen
and cc
crates for C header parsing and in-crate compilation support.