Cleanscape FortranLint Static Source Code Analysis Tool / White Paper / A "Lint" for Fortran Programs

Overview

Cleanscape FortranLint is a Fortran source code analyzer. Like a Fortran compiler, it detects errors in individual routines, but it also detects inconsistencies across groups of subroutines, which a Fortran compiler usually does not do. It generates a call tree from a group of routines to show how the routines call one another. It also reports unused variables, functions, and subroutines, and more importantly, variables which are referenced but not set. It is this last feature that I find most interesting.

Since a Fortran compiler is usually limited to analyzing only one subroutine at a time, inter-subroutine problems and uninitialized variables like those that can be located by Cleanscape FortranLint are not seen until load time, or worse yet, at execution time.

Background

The idea of a program like Cleanscape FortranLint, which checks Fortran code the way LINT checks 'C' code, is very appealing. The output from such a program is much more useful than the error and warning messages that a good compiler can generate, because Cleanscape FortranLint looks at all the routines that make up a program (or library) at once. Thus it generates not only the error and warning messages that a good compiler generates, but also errors or warnings about inconsistencies in the way routines call one another and in how common blocks are used.

It is not surprising that a number of different programs have been written to try to perform this task. In addition to Cleanscape FortranLint, which is the commercial product sold by Cleanscape being reviewed here, I know of several other programs that perform similar checking.

The first is Ftnchek v2.7, written by Robert Moniot, which is free software available on a number of anonymous ftp servers, including research.att.com.

Another is FORCHEK. FTNCHEK was initially known as "FORCHEK", but this conflicted with the name of a program from a European site, leading to confusion, and the name was changed. I know nothing about the other FORCHEK other than the fact that it obviously exists or existed. [Ed: this software was reviewed in the May/June 1991 issue of the Fortran Journal.]

There is also the program ANALYZ, written by a group at CSC as part of an Air Force contract. Since the group that wrote ANALYZ has now disbanded, it is not clear what the current availability of this product is. However, it is available at a number of government computer centers.

Finally there is TOOLPACK, which I know can do some analysis of Fortran codes, but I have never used this package.

Running Cleanscape FortranLint

Cleanscape FortranLint has both a command line and an X11 interface. Since I did most of my testing remotely and from scripts, I used the command line interface exclusively.

Cleanscape FortranLint uses a floating license manager, so although Cleanscape FortranLint was installed on another machine in the office, there was no problem running it on my own workstation, where I had the codes I wanted to test. The license manager has one unpleasant "feature" that I have not seen in other license managers. When it is started, it waits for some random amount of time, up to perhaps 10 minutes (the exact maximum is not documented), before it will let a copy of Cleanscape FortranLint execute. This would be only a minor inconvenience if the license manager were started from rc.local and so was always available, but in our test installation we did not bother to do that and simply started it by hand. Needless to say, we had several power outages while Cleanscape FortranLint was being tested, and each time I not only had to restart the license manager, but also had plenty of time to go out to the candy machine, talk to folks in the hall, etc., before I could actually use the code. It is not clear what additional protection this delay gives the license manager, but it is most certainly a negative from this user's perspective.

A first pass at the test codes

The tests of Cleanscape FortranLint reported here used the 54 Quick Checks in the SLATEC mathematical library (*). I extracted the routines in each of the 54 tests and gave them to Cleanscape FortranLint. The command line arguments were:

flint -afgs -Stestxx testxx.in

where xx was the test number.

(*) The SLATEC Math Library is a cooperative effort of a group of the National Laboratories (AF Phillips, Sandia, LANL, LLNL, NIST, NERSC, ORNL) and currently consists of just under 300,000 lines of mathematical library code, all in the public domain. In addition to the library, there are 54 Quick Checks, consisting of 60,000 additional lines of code, that test a good portion of the library routines.

Cleanscape FortranLint itself ran relatively fast, churning through the 493,000 lines of code in the 54 tests in 5 minutes and 27 seconds of user time (10m 33s clock time) on a SparcStation 1. This is probably a worst case, as the executable was being NFS mounted from one machine and the test codes were being NFS mounted from another. I am sure that Cleanscape FortranLint would have run even faster if everything had been on local disks, but the amount of time it took even in this case was not objectionable. For any one of the Quick Checks, the analysis took 10 to 20 seconds of clock time.

The result of this run was 11,000 lines of Cleanscape FortranLint output for the 54 tests. Examination of the output showed two "problems" which were repeated in each of the 54 sets of output.

First, I had forgotten to uncomment one of the sets of machine constants in the routines I1MACH, R1MACH, and D1MACH. This rightly caused Cleanscape FortranLint to say that some variables were undefined. I went back and uncommented the SUN variables in each of the appropriate routines.

Secondly, Cleanscape FortranLint gave warning messages about variables that were passed into routines in the error handler, but never used. These arguments are in fact being passed through these routines to stub routines which can be replaced by the user to gain control after an error has occurred.

I did not want to turn off these error messages in total (after all, this is the sort of thing that you would want to find out about if it occurred somewhere else).

The most expedient solution seemed to be just to remove these routines from the code passed to Cleanscape FortranLint and accept the much smaller complaint about the routines being missing. Of course, if one of the error handler routines were being called incorrectly, I would not see that.

It probably would have been better to modify the error handler routines so that the variables appeared to be used, but this appeared to involve too much work at the time.

Alternatively, there is a way to tell Cleanscape FortranLint how the arguments of a subroutine are used, and I could have used this feature to write some pseudo routines to replace the error handler.

Iterating on the test codes

With the above changes to the test codes, the number of lines of output from Cleanscape FortranLint was reduced to 7,293, not as large a saving as I had expected. This time, the problem was that I was getting error messages of the form

PORT ERROR #340- ANSI-F77 does not support Z'3F' style hex constants.

for all the constants being set in D1MACH and R1MACH.

True enough, this is not an allowed construct. However, in a mathematical library, it is desirable to have some constants describing the machine correct to the low order bit, and the conversion from decimal notation to binary is notoriously bad in many compilers (and often different in the compiler than in the run-time environment). The routines I1MACH, R1MACH and D1MACH contain these constants, and out of necessity have constants for each machine type in a format that the machine's compiler is known to accept. The routines would be much smaller (and more understandable) if the Fortran standard specified some general way to include these constants, but it doesn't.

Again, the simplest solution seemed to be to just trim these routines from the tests and try again. This reduced the output to 5,800 lines.

At this point, I noted that I had "missed" one of the Error Handler routines in my first pass, and that it was now appearing in the output from a number of tests as having unused arguments. It was removed, and the tests rerun again.

At this point, the amount of output from Cleanscape FortranLint had shrunk to the point where one could actually examine it in detail. On average, there were only 100 lines per test. In practice, most tests had virtually no output, while a few had quite a bit.

The first thing I noticed when reading the vastly reduced output was that Cleanscape FortranLint was giving a bogus syntax error message in a number of routines:

SYNTAX ERROR #48 unclosed block if.

These routines contain IFs, but no BLOCK IFs. I was highly suspicious that the line that was causing the error was

IF (INCX .EQ. INCY) IF (INCX-1) 10,20,30

Sure enough, removing this line from the code caused the error to go away. This error was reported to Cleanscape, and it appears that it was not an error in Cleanscape FortranLint as such, but rather a misunderstanding on their part of the Fortran standard. It will be fixed.
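For readers who have not met this construct, the following is a minimal sketch of a logical IF whose controlled statement is an arithmetic IF. The program name and the messages are made up for illustration, and free-form source is used only for readability; the BLAS routines themselves are fixed-form FORTRAN 77. A current compiler may warn that the arithmetic IF is obsolete.

```fortran
program nested_if_demo
  implicit none
  integer :: incx, incy

  incx = 1
  incy = 1

  ! Logical IF controlling an arithmetic IF: no block IF is opened here.
  ! The arithmetic IF runs only when incx equals incy, and then branches
  ! on the sign of (incx - 1): negative -> 10, zero -> 20, positive -> 30.
  if (incx .eq. incy) if (incx - 1) 10, 20, 30

  ! Reached only when the increments differ.
  print *, 'unequal increments'
  stop
10 print *, 'equal increments less than 1'
  stop
20 print *, 'both increments equal 1'
  stop
30 print *, 'equal increments greater than 1'
end program nested_if_demo
```

With both increments set to 1, as here, control passes to label 20. Since the statement contains no THEN, there is no block IF to close.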

I am always surprised when I find that a compiler, or now Cleanscape FortranLint, has a problem with one of the BLAS routines (or for that matter with any of the other routines in the SLATEC library). Surely this public domain code is, or should be, one of the prime sources of Fortran source used to test these products.

Other errors

Cleanscape FortranLint found several other errors in the Quick Checks, several of which were inherited from the BLAS2 and BLAS3 Quick Checks. These are known problems, but we have chosen not to make our own fixes and will instead wait for a new release of the tests from the authors.

Cleanscape FortranLint also issued the warning:

INTERFACE FYI #121- common block /MPCOM/ member names differ (compared to initial use in routine DQDOTA).

I personally do not like having common blocks where variables are renamed (and almost always bring in commons with include statements so that they are guaranteed to be the same). So this warning is interesting. Cleanscape FortranLint does check the length of the labeled commons and issues a different error message if the length of a common block changes.
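As an illustration of the situation this FYI describes, here is a minimal sketch in which two program units name the members of the same common block differently. The block name /DEMO/ and all variable names are invented for the example; they are not taken from the SLATEC source.

```fortran
program common_demo
  implicit none
  real :: a, b
  common /demo/ a, b        ! first use: members are called a and b

  a = 1.0
  b = 2.0
  call show
end program common_demo

subroutine show
  implicit none
  real :: x, y
  common /demo/ x, y        ! same storage and length, different names
  print *, x, y             ! x shares storage with a, y with b
end subroutine show
```

This is legal Fortran, since common block members are associated by position rather than by name, which is why a note rather than an error seems the right level for the message.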

Cleanscape FortranLint reports unused variables (variables that are set but then unused), and also variables that are used but never set. There is one case, however, where the output could be improved. Instead of the message:

USAGE WARNING #127- local variable MODEW is set but never referenced.

I would prefer it if there were a separate message for returned arguments that are unused. This would allow me to accept some warnings without having to go back and look at the code. It is common not to use all the values returned from a math library routine, but it is almost always an error to set a local variable in a subroutine and then not use it.

Fortran "lawyer" mode

There are several fine points in the Fortran standard that cause almost all codes of any size to be nonstandard. With the possible exception of the Burroughs 5500 series of machines, these constructs have worked on all Fortran compilers that I am aware of. Cleanscape FortranLint is good about giving warning messages about these infractions, the first of which is shown in the following example:

CALL SCOPY (N, 1.E0, 0, WS(N1), 1)

INTERFACE ERROR #257- R*4 expression passed to
dummy arg which is a R*4 array.

The standard says that if the subroutine expects an array, you have to pass it an array (and conversely, if it expects a scalar, you can't pass it an array and expect to get the first element). As noted, most Fortran compilers will generate the correct results if this part of the standard is violated; and if your program was not written by a novice who was liable to make an error in this case, I would expect that Error #257 would be one of the first that I would turn off. In addition, Error #251 is similar, but reads "R*4 variable" rather than "R*4 expression". There are similar error messages for other types.

(Note that in the above example the routine SCOPY is being used to distribute the value 1.0 into N positions of an array by setting the increment on the first "array" to zero.)
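For comparison, one way to get the same fill without violating the rule is to pass a one-element array instead of a scalar expression. The sketch below assumes a simplified stand-in for SCOPY, here called STRIDED_COPY (a made-up name), which handles only non-negative increments; it is not the BLAS routine, and free-form source is used only for readability.

```fortran
program fill_demo
  implicit none
  real :: one(1), ws(5)

  one(1) = 1.0
  ws = 0.0

  ! The source is a one-element array read with increment 0, so the
  ! same element is copied into each of the 5 positions of ws.
  call strided_copy(5, one, 0, ws, 1)
  print *, ws
end program fill_demo

! Hypothetical stand-in for SCOPY; assumes incx >= 0 and incy >= 0.
subroutine strided_copy(n, sx, incx, sy, incy)
  implicit none
  integer :: n, incx, incy
  real :: sx(*), sy(*)
  integer :: i, ix, iy

  ix = 1
  iy = 1
  do i = 1, n
     sy(iy) = sx(ix)
     ix = ix + incx
     iy = iy + incy
  end do
end subroutine strided_copy
```

Here an array is passed where an array is expected, so a message like Error #257 should not apply, at the cost of one extra declaration.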

Along the same lines, Cleanscape FortranLint reports

INTERFACE ERROR #254 CHAR*2 array passed to a CHAR*(*) dummy arg.

I don't understand this error message at all and would consider it bogus.

Comparisons with other code analyzers

While reviewing Cleanscape FortranLint, I ran the same codes through both FTNCHEK and ANALYZ. Some of the comparisons are interesting. First, Cleanscape FortranLint found all the real errors in these codes, and produced one or two orders of magnitude less output for the user to have to sort through. For TEST5 in the SLATEC Quick Checks (which was rather average), here is a comparison of the amount of output generated by each of the three analyzers:

Analyzer (TEST5)	Bytes	Lines
ANALYZ	1,294,026	13,432
FTNCHEK	304,088	6,008
FLINT	4,603	125

ANALYZ's output includes a listing of the code with the errors/warnings interspersed with the code. Both FTNCHEK and Cleanscape FortranLint just give line number references to the location of the problems. In addition, ANALYZ's output is formatted for a line printer, while the other two keep the line length down to closer to 80 columns.

ANALYZ had a number of problems. It generated so much output that I ran out of space on the disk partition of the server I was running on.

It did not recognize that variables EQUIVALENCEd to variables initialized in DATA statements were initialized, thus giving bogus messages saying that these variables were used before they were set, and then declaring the original variable unused.
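The pattern in question looks roughly like the sketch below, which is modeled loosely on the way the machine-constant routines set a real value through an integer bit pattern. The names and the constant are chosen for illustration only, and the example assumes 32-bit default INTEGER and IEEE single-precision REAL.

```fortran
program equivalence_demo
  implicit none
  integer :: ibits(1)
  real :: rval(1)

  ! rval and ibits share the same storage.
  equivalence (rval(1), ibits(1))

  ! Only ibits appears in a DATA statement, but this also gives rval
  ! its initial bit pattern (hexadecimal 3F800000 under the stated
  ! assumptions).
  data ibits(1) / 1065353216 /

  ! rval is "used" without ever being assigned directly.
  print *, rval(1)
end program equivalence_demo
```

Under those assumptions the value read back through rval should be 1.0. An analyzer that ignores the EQUIVALENCE would see rval as used before set and ibits as never used, which matches the pair of bogus messages described above.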

By default, ANALYZ gives a Cross Reference in each subroutine showing where each variable is defined and used. ANALYZ generates messages under the categories of ANSI, WARNINGS, SERIOUS, and FATAL. It did not find any ANSI or FATAL errors. It found several SERIOUS errors, some of which had to do with its mis-parsing of a logical expression, and the others due to getting confused when a variable was initialized in a data statement with 40 digits of accuracy.

ANALYZ generated many WARNING messages, being very paranoid about mixed mode expressions, and the modification of variables set in DATA statements. Unfortunately, the latter led to a large number of occurrences of the WARNING:

WARNING ***** Initialize local variable FIRST was modified. *****

where the variable FIRST is used in an IF (FIRST) THEN to protect initialization code that only needs to be executed once.
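A minimal sketch of this idiom follows. The routine name SCALED and the variable FACTOR are invented for the example; only the use of FIRST follows the description above.

```fortran
program first_flag_demo
  implicit none
  real :: y

  call scaled(3.0, y)     ! first call runs the one-time setup
  call scaled(4.0, y)     ! later calls skip it
  print *, y
end program first_flag_demo

subroutine scaled(x, y)
  implicit none
  real :: x, y
  real, save :: factor
  logical, save :: first
  data first /.true./

  ! One-time initialization, guarded by a flag set in a DATA statement.
  if (first) then
     factor = 2.0
     first = .false.      ! the deliberate modification that gets flagged
  end if
  y = factor * x
end subroutine scaled
```

Modifying FIRST is the whole point of the idiom, so a warning on every such routine is noise rather than information.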

ANALYZ also complained about a variable not being set in a case where you would in fact have to follow the program flow to make any decision.

FTNCHEK has two major problems relative to Cleanscape FortranLint. Like ANALYZ, it is much more verbose, so you have to hunt through one or two orders of magnitude more output to find an error. There is always the feeling that you may miss something important in the mass of warnings. It does not (currently) have the ability to turn off individual error messages. I ended up writing some rather oblique SED scripts to remove multi-line messages from the output so that I could see what was left.

Its other fault is that it does not trace argument (and common) usage through multiple levels of subroutines as Cleanscape FortranLint does. Rather, it makes assumptions as to how arguments (and common) are used. This removes the ability to find a large class of errors that Cleanscape FortranLint can locate.

FTNCHEK gave warnings about all constants in DATA statements that were given to more precision than could be stored in a REAL*4. It would have been really nice to be able to turn off this message, as the constants are known to be longer than necessary: the code in question runs not only on workstations but also on CRAYs, so the constants need to be given to enough precision that they will be correct with both 32-bit and 64-bit single precision, and with both 64-bit and 128-bit double precision.

Overall, FTNCHEK did not give any incorrect error messages, but its levels of error checking never quite seem appropriate. If you really want to see "everything", then you get warnings like the above about more precision than needed. If you request fewer messages, it's not clear what you are not seeing. Being able to turn off individual messages, as Cleanscape FortranLint can, would be a nice addition.

Conclusion

There are many features of Cleanscape FortranLint that I do not have space to discuss here. It has a "library" option, where you can first analyze a library of subroutines, and then generate a library file that can be used in place of all of the source from these subroutines in later analysis.

Since I usually organize my codes so that any routine that is used in more than one place is pushed into a library, this is a very nice feature.

As mentioned at the start, in addition to the command line interface, there is an X11 interface to Cleanscape FortranLint. This would be convenient when working with an individual code.

Finally, to repeat my prejudice, I would like any analyzer to just tell me what is wrong and not bother me with additional warnings. Cleanscape FortranLint is far from perfect in this respect, but it is much better than anything else available. In addition, its ability to selectively turn off individual error messages makes it possible to reduce the output incrementally once you are sure that certain types of warnings can be ignored.

I would recommend Cleanscape FortranLint to anyone involved with large Fortran codes, be they their own, or something that they have inherited and now need to understand and debug/modify.

About the author

Reg Clemens has a Ph.D. in Physics and has been involved in computational physics since the late 1960s. He is the DOD representative to the SLATEC Math Libraries Committee, and over the years he has authored several large hydrodynamic and EMP codes. He currently spends half of his time in the Consulting office at Phillips Lab, Kirtland Air Force Base, providing scientific programming support.
