
Early bug eradication techniques at NCAR speed development, improve software quality

Static Source Code Analysis Case Study

Overview

At the National Center for Atmospheric Research (NCAR) in Boulder, Colorado, some of the largest and most complex FORTRAN programs in the world are developed to run weather modeling simulations on the center's supercomputers. One of the key problems facing developers at NCAR is that the complexity and length of the programming required to build these packages increase the opportunity for small coding inconsistencies (or bugs) to go undetected. Once these errors become part of the compiled package, they sit like coding time bombs until the right combination of user and hardware interactions triggers a critical problem.

Finding these errors used to mean assigning valuable human resources to review each line of code manually until the problem could be detected. Recently, the Scientific Computing Division (SCD) at NCAR has begun using a new approach that gives pre-compile source code analysis capability to the developers who contribute to its Community Climate Model (CCM). This solution has substantially reduced the amount of time programmers must spend debugging applications and has given developers the opportunity to quickly analyze how subtle errors can bloom into large problems.


Background

NCAR is operated by a consortium of universities, and its main mission is to provide the tools and services the academic community needs to study the atmosphere. Much of the work at NCAR revolves around creating large computer-driven simulations that model different aspects of the Earth's atmosphere.

Because of the complexity of the computations required to do even basic atmospheric modeling, these programs are run on the class of hardware called supercomputers. To put the computational demands of this type of development into perspective, approximately 70 percent of the time on NCAR supercomputers is allocated to problems whose execution requires more than one hour. Problems of this scale cannot be handled by any other class of machine.

One of the largest development programs at NCAR is the ongoing work on the CCM. This large global climate model is distributed to researchers throughout the world. The "Community" in the name indicates that the model is provided free of charge or licensing fees to participating researchers. The base program, minus the hundreds of associated subroutines, is 50,000 lines of FORTRAN code. The size of the programs developed at NCAR, many of which are developed over a period of years and revised regularly, means that the possibility of human error affecting the integrity of the code is a constant concern.

SCD is charged with supporting the users and developers of the large climate models that run on NCAR's supercomputers. Part of this support involves making recommendations on tools and techniques that can enhance the quality and efficiency of development projects. For a number of years, finding a pre-compile source code analysis tool had been a high priority at NCAR.


Code Analysis Options

One solution that had addressed these problems with some success was brought to NCAR from NOAA by Linda Bath, lead programmer on the Community Climate Model core group, about 10 years ago when she first moved to NCAR. This product provided global cross-referencing of a large code base, so the location of common variable changes could be determined and the interactions taking place among subroutines could be checked. What the product could not do was handle all syntax problems or analyze arguments for correctness.

When NCAR changed operating systems to UNICOS, the UNIX version of the CRAY operating system, the product was not ported, and the task of providing a replacement fell to the consulting unit at SCD. This was approximately two years ago, and the product SCD finally settled on was Cleanscape FortranLint from Information Processing Techniques (Cleanscape) in Palo Alto, CA. Cleanscape FortranLint is a static source code analysis utility, an evolutionary development of the original UNIX "lint" concept, but designed to analyze FORTRAN code instead of C.

Because it was difficult to use, the free UNIX lint utility was never fully adopted by UNIX programmers. Cleanscape took the idea of a static source code analyzer and enhanced it to produce a tool that could provide extensive source code analysis for FORTRAN. Unlike standard debuggers, which operate only on runtime units (executables), source code analyzers provide pre-compile error checking of a program at the source level. By the time an executable is generated, most of the information about the source code has been lost. A good source code analyzer keeps track of all information about every symbol in the code and can therefore generate intelligent diagnostics about the analyzed code.
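The value of cross-unit symbol tracking can be shown with a small invented fragment (not CCM code). A FORTRAN 77 compiler checks each program unit in isolation, so the type mismatch below compiles cleanly and only misbehaves at runtime; an analyzer that retains symbol information across units can report the mismatch before the executable is ever built:

      program demo
      real temp
      temp = 273.15
c     passes a REAL actual argument to an INTEGER dummy argument;
c     each unit compiles cleanly because units are checked separately
      call report(temp)
      end

      subroutine report(kount)
      integer kount
c     kount now holds the bit pattern of a REAL, garbage at runtime
      print *, kount
      return
      end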

When NCAR first tried Cleanscape FortranLint, the product was available only for the Sun operating system, not for the CRAY. NCAR therefore moved the FORTRAN source code from the Cray to a Sun, where SCD, the sole license holder, ran the tool on the CCM. SCD then passed the output along to other developers, giving them indirect access to the tool; at that time, no one could use it interactively. Subsequently, the product was introduced for the UNICOS operating system on the CRAY.


A Typical Problem

The types of problems that can be located and solved with a static source code analyzer are best illustrated by examining how they would have been solved using the manual approach to analysis. The following problem occurred during one Community Climate Model project before the introduction of Cleanscape FortranLint:

In anticipation of the development of a multitasked CCM2, the model development group built a prototype version of CCM1 that was partially multitasked. This required "scoping" all variables, a process by which each variable in the code is determined to be either "local" (confined to a certain range in the code) or "global" (allowed to operate outside any pre-defined range). Variables that could not be safely shared between processes were placed on the FORTRAN stack, as sketched below.
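As a hedged sketch of what scoping means in practice (the names physcon, gravit, and column are invented for illustration, not taken from CCM1), a variable in a common block is global and shared by every task, while variables declared inside a subroutine are local and can safely live on each task's stack:

      program demo
      real gravit
      common /physcon/ gravit
      gravit = 9.8
      call column(5)
      end

      subroutine column(nlev)
      integer nlev, k
      real gravit, work(100)
      common /physcon/ gravit
c     gravit is global: one copy shared by all tasks (read-only here)
c     work and k are local: each task gets its own stack copy, so
c     concurrent calls to column do not interfere with one another
      do 10 k = 1, nlev
         work(k) = gravit * real(k)
 10   continue
      print *, work(nlev)
      return
      end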

Since CCM1 was heavily dependent on named common variables, rewriting the code in this way involved removing many variables from common and passing them through long, complicated argument lists. Programmers who used common variables extensively would spend most of their time debugging their misuse. The reasons for moving to the argument list method stem from the lack of tracking ability in the FORTRAN code itself: it is very easy to change the value of a common variable without realizing that you have done so.
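The hazard is easy to reproduce with a contrived fragment (the names state, tref, and adjust are hypothetical): any routine that declares the common block can overwrite a shared variable, and nothing at the call site reveals the side effect:

      program main
      real tref
      common /state/ tref
      tref = 288.0
c     the bare call gives no hint that tref is about to change
      call adjust
      print *, tref
      end

      subroutine adjust
      real tref
      common /state/ tref
c     silently modifies a value every other routine depends on
      tref = tref - 1.0
      return
      end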

In an attempt to make the FORTRAN code more accountable in terms of its data structures, NCAR is moving to route more of the model's data flow through argument lists. The argument list approach is also an attempt to enforce some standardization in how the model's parameterizations are implemented. This is called making components "plug compatible", and it means scientists working on their own packages can plug their modules into the global model without spending months learning the specific parameterizations of the model.
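A minimal sketch of plug compatibility (the interface below is invented for illustration, not the CCM's actual parameterization interface): if every convection scheme accepts the same explicit argument list, the driver can swap one implementation for another without any other changes:

c     any scheme with this exact argument list is plug compatible:
c     the driver can call either one without modification
      subroutine convect1(t, q, nlev, dt)
      integer nlev
      real t(nlev), q(nlev), dt
c     (scheme 1 physics would go here)
      return
      end

      subroutine convect2(t, q, nlev, dt)
      integer nlev
      real t(nlev), q(nlev), dt
c     (scheme 2 physics would go here)
      return
      end

Because the interface is explicit, a static analyzer can also verify that every caller supplies arguments of the correct number and type.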

This experience (pre-Cleanscape FortranLint) was instructive to programmers at NCAR. It showed them that passing variables through argument lists was just as difficult to debug manually as checking for named common variable errors. It took weeks to correct this aspect of the code, and it was at this time that the group initiated a search for a new tool.


The Automated Solution

SCD made the programming group aware of the source code analyzer it had been testing, and Cleanscape FortranLint was used on the new version of the CCM2 model. Its ability to automate the checking of argument list length and type made the project much less time-consuming and much easier. Analysis that had required weeks when done manually was completed with Cleanscape FortranLint in a matter of minutes.
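For example (a hypothetical fragment, not CCM code), dropping one argument from a long list is exactly the kind of defect that is invisible to a unit-at-a-time compiler but is reported immediately by a cross-unit check of argument list length and type:

      program driver
      integer nlev
      real t(10), q(10)
      nlev = 10
c     one argument short: dt is missing, yet this unit compiles
      call physics(t, q, nlev)
      end

      subroutine physics(t, q, nlev, dt)
      integer nlev
      real t(nlev), q(nlev), dt
c     referencing dt here is undefined behavior at runtime
      print *, dt
      return
      end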

The CCM2 model, though multitasked, still uses about 35 common blocks, and Cleanscape FortranLint's cross-referencing has been quite useful in keeping track of the common block variables. The product also offers a call tree that documents each version of the Climate Model.


Conclusion

It is clear that even on relatively small sections of code, 3,000 lines or so, the ability to bypass a manual assessment of code integrity and rely on automated checking and analysis can speed up the quality assurance aspect of a project while providing programmers with insights into the subtle problems that may affect their more complicated packages.

Copyright © 2002-2023 Cleanscape Software International