Some initial experiments were performed here, but since they used different codes with somewhat different discretizations, the results can't quite be compared directly. This, then, is a follow-up benchmark study that tests Matlab, Octave, Julia, and Fortran head to head, solving an identical problem (the Poisson equation on a unit square) with codes that are as identical as possible. The Matlab and Octave code is taken directly from the FEATool Matlab FEM solver production code. The Fortran code uses the FEAT2D library, which is featured in the FeatFlow FEM CFD solver and is a very good FEM reference implementation. The Julia code is a direct port of the Fortran code. Note that the codes have not been simplified or optimized for the test case, but more or less have all the features of a production FEM code, such as support for unstructured grids, different FEM shape functions, cubature rules, variable equation coefficients, etc.

The time required to assemble the system matrix is investigated first since, apart from solving the linear systems, assembly is typically one of the most CPU-intensive parts of a FEM code. The figure below shows the resulting timings for uniform grids with sizes from 1/32 all the way to 1/1024. All runs were performed on a simple laptop with up-to-date versions of all runtimes, and the Fortran code was compiled with the Intel Fortran compiler.
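To make concrete what "assembling the system matrix" involves, here is a minimal sketch of a loop-based assembly for the Poisson stiffness matrix on a uniform triangulation of the unit square. It is not the FEATool, FEAT2D, or Julia code from the benchmark: the function names (`unit_square_mesh`, `assemble_stiffness`), the choice of linear (P1) triangles, and the dictionary-based sparse storage are all assumptions made for illustration only.

```python
def unit_square_mesh(n):
    """Nodes and triangles for an n-by-n uniform grid on the unit square.

    Hypothetical helper: each square cell is split into two triangles
    along the lower-left to upper-right diagonal.
    """
    h = 1.0 / n
    nodes = [(i * h, j * h) for j in range(n + 1) for i in range(n + 1)]
    tris = []
    for j in range(n):
        for i in range(n):
            a = j * (n + 1) + i  # lower left
            b = a + 1            # lower right
            c = a + n + 1        # upper left
            d = c + 1            # upper right
            tris.append((a, b, d))
            tris.append((a, d, c))
    return nodes, tris


def assemble_stiffness(nodes, tris):
    """Element loop that accumulates P1 stiffness entries.

    The matrix is stored as a dict keyed by (row, col), a simple stand-in
    for the sparse formats a production code would use.
    """
    K = {}
    for tri in tris:
        (x1, y1), (x2, y2), (x3, y3) = (nodes[v] for v in tri)
        # Components of the (unscaled) gradients of the three basis functions.
        b = (y2 - y3, y3 - y1, y1 - y2)
        c = (x3 - x2, x1 - x3, x2 - x1)
        area = 0.5 * abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))
        for p in range(3):
            for q in range(3):
                key = (tri[p], tri[q])
                value = (b[p] * b[q] + c[p] * c[q]) / (4.0 * area)
                K[key] = K.get(key, 0.0) + value
    return K


if __name__ == "__main__":
    nodes, tris = unit_square_mesh(32)
    K = assemble_stiffness(nodes, tris)
    print(len(nodes), "nodes,", len(tris), "triangles,", len(K), "stored entries")
```

Wrapping the call to `assemble_stiffness` in `time.perf_counter()` measurements for a sequence of grid sizes would give timings of the same kind as those plotted here, although an interpreted pure-Python loop like this one will be far slower than any of the four benchmarked codes.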

What one can see is that the Matlab, Julia, and Fortran codes all have almost identical assembly times. Octave is the slow outlier, which can partly be attributed to the lack of JIT compilation. Matlab has a hard-to-shake reputation for often being downright slow, and one might indeed be surprised and skeptical to see that it performed just as fast as Julia and Fortran. In this case this is due to a heavily vectorized and optimized Matlab assembly routine. Thus it is indeed possible to write Matlab code that is both complex and high performance, in this case gaining speed at the cost of memory. However, what one cannot see from the graphs is the development effort required. The Julia code was a quick and easy straight port of the Fortran code, effectively identical, warts and all. Writing fast and efficient Matlab code is unfortunately a non-trivial and time-consuming task, not to mention the reduced readability and maintainability of the vectorized code.

Moving on to the solver timings below, the results are neither surprising nor that interesting. Matlab, Octave, and Julia all, for good reason, use Tim Davis's UMFPACK as the default direct sparse linear solver. The Fortran FeatFem code employs a geometric multigrid solver. Thus what one sees here has little to do with the programming languages themselves and simply reflects direct solver versus multigrid.
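For readers unfamiliar with geometric multigrid, the idea can be sketched in a few lines. The example below is a deliberately simplified stand-in, not the FeatFem solver: it assumes a 1D finite difference Poisson problem with zero boundary values, weighted Jacobi smoothing, full-weighting restriction, and linear interpolation, and all function names are hypothetical.

```python
def residual(u, f, h):
    """r = f - A u for the 1D operator (-u[i-1] + 2 u[i] - u[i+1]) / h^2."""
    n = len(u) - 1
    r = [0.0] * (n + 1)
    for i in range(1, n):
        r[i] = f[i] - (2.0 * u[i] - u[i - 1] - u[i + 1]) / (h * h)
    return r


def smooth(u, f, h, sweeps=2, omega=2.0 / 3.0):
    """Weighted Jacobi sweeps, updating u in place."""
    n = len(u) - 1
    for _ in range(sweeps):
        old = u[:]
        for i in range(1, n):
            jacobi = 0.5 * (old[i - 1] + old[i + 1] + h * h * f[i])
            u[i] = (1.0 - omega) * old[i] + omega * jacobi
    return u


def restrict(r):
    """Full-weighting restriction from the fine grid to the coarse grid."""
    nc = (len(r) - 1) // 2
    rc = [0.0] * (nc + 1)
    for i in range(1, nc):
        rc[i] = 0.25 * r[2 * i - 1] + 0.5 * r[2 * i] + 0.25 * r[2 * i + 1]
    return rc


def prolong(ec):
    """Linear interpolation from the coarse grid to the fine grid."""
    nc = len(ec) - 1
    e = [0.0] * (2 * nc + 1)
    for i in range(nc + 1):
        e[2 * i] = ec[i]
    for i in range(nc):
        e[2 * i + 1] = 0.5 * (ec[i] + ec[i + 1])
    return e


def v_cycle(u, f, h):
    """One V-cycle; len(u) - 1 must be a power of two, boundary values zero."""
    n = len(u) - 1
    if n == 2:
        u[1] = 0.5 * h * h * f[1]  # coarsest grid: one unknown, solved exactly
        return u
    smooth(u, f, h)
    rc = restrict(residual(u, f, h))
    ec = v_cycle([0.0] * len(rc), rc, 2.0 * h)
    e = prolong(ec)
    for i in range(1, n):
        u[i] += e[i]
    smooth(u, f, h)
    return u


if __name__ == "__main__":
    n = 64
    h = 1.0 / n
    f = [1.0] * (n + 1)
    u = [0.0] * (n + 1)
    for cycle in range(10):
        v_cycle(u, f, h)
        print(cycle + 1, max(abs(r) for r in residual(u, f, h)))
```

Each V-cycle costs work proportional to the number of unknowns, whereas a sparse direct factorization typically grows faster than that with problem size, which is why the comparison says more about the algorithms than about the languages.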

Now, considering how well Julia performed for matrix assembly, it is quite plausible that rewriting the multigrid solver in Julia would not only be easy but could also perform just as well. This is in contrast to Matlab and Octave, for which efficiently vectorizing a multigrid solver would indeed be a difficult and non-trivial task.

Lastly, examining the total simulation timings without the solver, but including grid generation, matrix pointer computation, matrix and right-hand side assembly, boundary conditions, and sparse matrix allocation, one can quite easily see the potential of Julia. Looking at the complete set of graphs, it is only in the grid generation and matrix pointer calculations that Julia falls behind Fortran, and it is very likely that this could be improved as well, considering the quick and dirty port.

To sum up, this little exercise has shown that Julia can be just as fast as a fast Fortran code and, perhaps more importantly, that writing fast Julia code is significantly easier than writing fast Matlab and Octave code. Thus Julia seems to have the potential to be a one-stop solution for at least some number crunching applications, no longer requiring a switch between Matlab/Octave for easy development and a rewrite in Fortran for speed. At least to this *programmer*, Julia seems to warrant a more serious and in-depth look.

For the interested reader, the complete benchmark suite is available on GitHub.