
Reliability and Quality Management Models:


With the advent of the computer age, computers, and the software running on them, play a vital role in our daily lives. Household appliances such as washing machines, telephones, TVs, and watches are having their analog and mechanical parts replaced by CPUs and software. With continuously falling cost and improved control, processors and software-controlled systems offer compact design, flexible handling, rich features and competitive cost.

People used to believe that "software never breaks". Intuitively, unlike mechanical parts such as bolts and levers, or electronic parts such as transistors and capacitors, software will stay "as is" unless there are problems in the hardware that change the storage content or the data path. Software does not age, rust, wear out, deform or crack. There is no environmental constraint for software to operate as long as the hardware processor it runs on can operate. Furthermore, software has no shape, color, material or mass. It cannot be seen or touched, yet it is crucial to system functionality.

Until proven wrong, optimists would think that once software runs correctly, it will be correct forever. A series of tragedies and chaos caused by software proved this wrong. These events will always have their place in history.

Software can make decisions, but it can be just as unreliable as human beings. The British destroyer HMS Sheffield was sunk because its radar system identified an incoming missile as "friendly". Defense systems have matured to the point that they will not mistake the rising moon for incoming missiles, but objects such as gas-field fires and descending space junk can still trigger false alarms.

Software can also have small, unnoticeable errors or drifts that culminate in a disaster. On February 25, 1991, during the Gulf War, a truncation error of 0.000000095 second in every tenth of a second, accumulated over 100 hours, caused a Patriot missile battery to fail to intercept a Scud missile. 28 lives were lost.

The current wave of drone missiles striking accurately at Taliban targets shows that there can be no room for error. Moreover, fixing problems does not necessarily make the software more reliable; on the contrary, new and more serious problems may arise.

Reliability Model

According to ANSI, Software Reliability is defined as: the probability of failure-free software operation for a specified period of time in a specified environment. Although Software Reliability is defined as a probabilistic function and comes with the notion of time, we must note that, unlike traditional Hardware Reliability, Software Reliability is not a direct function of time. Electronic and mechanical parts may become "old" and wear out with time and usage, but software will not rust or wear out during its life cycle. Software will not change over time unless intentionally changed or upgraded.
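To make the definition concrete, here is a minimal sketch of the classical constant-failure-rate interpretation, where reliability over a period t is R(t) = e^(-lambda*t). The failure rate used below is purely hypothetical.

```python
import math

def reliability(failure_rate: float, t: float) -> float:
    """Probability of failure-free operation for a period t,
    assuming a constant failure rate (exponential model).
    The rate is a hypothetical illustration, not a property
    of any real system."""
    return math.exp(-failure_rate * t)

# Hypothetical system observing 0.001 failures per hour:
r_100 = reliability(0.001, 100)  # probability of surviving 100 hours
```

For software, unlike hardware, such a rate is driven by the usage profile and residual design faults rather than by physical aging, which is exactly the distinction the paragraph above draws.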

Software Reliability is an important attribute of software quality, together with functionality, usability, performance, serviceability, capability, installability, maintainability, and documentation. Software Reliability is hard to achieve because of the complexity of software. Since any system of high complexity is hard to bring to a given level of reliability, system developers tend to push complexity into the software layer, encouraged by the rapid growth of system size and the ease of doing so by upgrading the software. For example, large next-generation aircraft will have over one million source lines of software on board; next-generation air traffic control systems will contain between one and two million lines; the upcoming International Space Station will have over two million lines on board and over ten million lines of ground support software; several major life-critical defense systems will have over five million source lines of software. While the complexity of software is inversely related to software reliability, it is directly related to other important factors in software quality, especially functionality, capability, etc. Emphasizing these features tends to add more complexity to software.

Software failure mechanisms

Software failures may be due to errors, ambiguities, oversights or misinterpretation of the specification that the software is supposed to satisfy, carelessness or incompetence in writing code, inadequate testing, incorrect or unexpected usage of the software, or other unforeseen problems. While it is tempting to draw an analogy between Software Reliability and Hardware Reliability, software and hardware have basic differences in their failure mechanisms. Hardware faults are mostly physical faults, while software faults are design faults, which are harder to visualize, classify, detect, and correct. Design faults are closely related to fuzzy human factors and the design process, of which we do not yet have a solid understanding. In hardware, design faults may also exist, but physical faults usually dominate. In software, we can hardly find a strict counterpart for the hardware "manufacturing" process, other than the simple action of uploading software modules into storage and starting execution. Trying to achieve higher reliability by simply duplicating the same software modules will not work, because design faults cannot be masked off by voting.

A partial list of the distinct characteristics of software compared to hardware is given below.

Failure cause: Software defects are mainly design defects.
Wear-out: Software does not have energy related wear-out phase. Errors can occur without warning.
Repairable system concept: Periodic restarts can help fix software problems.
Time dependency and life cycle: Software reliability is not a function of operational time.
Environmental factors: Do not affect software reliability, except that they might affect program inputs.
Reliability prediction: Software reliability cannot be predicted from any physical basis, since it depends completely on human factors in design.
Redundancy: Cannot improve Software reliability if identical software components are used.
Interfaces: Software interfaces are purely conceptual, as opposed to visual.
Failure rate motivators: Usually not predictable from analyses of separate statements.
Built with standard components: Well-understood and extensively-tested standard parts will help improve maintainability and reliability. But in the software industry, we have not observed this trend. Code reuse has been around for some time, but only to a very limited extent. Strictly speaking, there are no standard parts for software, except some standardized logic structures.

Software Reliability Models

Most software models contain the following parts: assumptions, factors, and a mathematical function that relates the reliability with the factors. The mathematical function is usually higher order exponential or logarithmic.

Software modeling techniques can be divided into two subcategories: prediction modeling and estimation modeling. Both kinds of modeling techniques are based on observing and accumulating failure data and analyzing it with statistical inference. The major differences between the two kinds of models are shown in Table 1.



DATA REFERENCE
  Prediction models: use historical data.
  Estimation models: use data from the current software development effort.

WHEN USED IN DEVELOPMENT CYCLE
  Prediction models: usually made prior to the development or test phases; can be used as early as the concept phase.
  Estimation models: usually made later in the life cycle (after some data have been collected); not typically used in the concept or development phases.

TIME FRAME
  Prediction models: predict reliability at some future time.
  Estimation models: estimate reliability at either the present or some future time.

Table 1: Difference between software reliability prediction models and software reliability estimation models

Representative prediction models include Musa's Execution Time Model and the Rome Laboratory models TR-92-51 and TR-92-15. Using prediction models, software reliability can be predicted early in the development phase and enhancements can be initiated to improve the reliability.
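As an illustration, the following is a sketch of Musa's basic execution-time model, in which the expected number of failures experienced after execution time tau is mu(tau) = nu0 * (1 - e^(-lam0*tau/nu0)), with nu0 the total expected number of failures and lam0 the initial failure intensity. The parameter values below are hypothetical.

```python
import math

def expected_failures(tau: float, nu0: float, lam0: float) -> float:
    """Musa basic execution-time model: mean failures experienced after
    tau units of execution time, where nu0 is the total expected number
    of failures and lam0 is the initial failure intensity."""
    return nu0 * (1.0 - math.exp(-lam0 * tau / nu0))

def failure_intensity(tau: float, nu0: float, lam0: float) -> float:
    """Current failure intensity, which decays as failures are
    experienced and the underlying faults are repaired."""
    return lam0 * math.exp(-lam0 * tau / nu0)

# Hypothetical project: 100 expected failures, 5 failures/CPU-hour initially.
mu_10 = expected_failures(10, 100, 5.0)    # failures expected in 10 CPU-hours
lam_10 = failure_intensity(10, 100, 5.0)   # intensity after 10 CPU-hours
```

Note that the model's time variable is execution (CPU) time, not calendar time, consistent with the earlier observation that software reliability is not a direct function of operational time.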

Representative estimation models include the exponential distribution models, the Weibull distribution model, Thompson and Chelson's model, etc. The exponential models and the Weibull distribution model are usually classified as classical fault count/fault rate estimation models, while Thompson and Chelson's model belongs to the Bayesian fault rate estimation models.

The field has matured to the point that software reliability models can be applied in practical situations and give meaningful results; on the other hand, no one model is best in all situations. Because of the complexity of software, any model has to make extra assumptions, and only limited factors can be taken into consideration. Most software reliability models ignore the software development process and focus on the results: the observed faults and/or failures. By doing so, complexity is reduced and abstraction is achieved; however, the models tend to specialize, applying to only a portion of situations and a certain class of problems. We have to carefully choose the model that suits our specific case, and the modeling results cannot be blindly believed and applied.

Software Reliability Metrics

Measurement is commonplace in other engineering fields, but not in software engineering. Measuring software reliability remains a difficult problem because we do not have a good understanding of the nature of software. There is no clear definition of which aspects are related to software reliability, and we cannot find a suitable way to measure software reliability or most of the aspects related to it. Even the most obvious product metrics, such as software size, have no uniform definition.

If we cannot measure reliability directly, it is tempting to measure something related to reliability that reflects its characteristics. The current practices of software reliability measurement can be divided into four categories:

Product metrics

Software size is thought to be reflective of complexity, development effort and reliability. Lines of Code (LOC), or LOC in thousands (KLOC), is an intuitive initial approach to measuring software size, but there is no standard way of counting. Typically, source code is counted (SLOC, KSLOC) and comments and other non-executable statements are not. This method cannot faithfully compare software not written in the same language, and the advent of code reuse and code generation techniques also casts doubt on this simple method.

The function point metric is a method of measuring the functionality of a proposed software development based upon a count of inputs, outputs, master files, inquiries, and interfaces. The method can be used to estimate the size of a software system as soon as these functions can be identified. It is a measure of the functional complexity of the program; it measures the functionality delivered to the user and is independent of the programming language. It is used primarily for business systems; it is not proven in scientific or real-time applications.

Complexity is directly related to software reliability, so representing complexity is important. Complexity-oriented metrics determine the complexity of a program's control structure by simplifying the code into a graphical representation. A representative metric is McCabe's Complexity Metric.
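As a rough sketch of how such a metric can be computed, the following simplified take on McCabe's cyclomatic complexity counts decision points in Python source rather than building the full control-flow graph; it is an illustration, not a complete implementation.

```python
import ast

# Node types treated as decision points in this simplified sketch.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)

def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: 1 plus the number of decision
    points; boolean operators add one path per extra operand."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            complexity += len(node.values) - 1
    return complexity

sample = """
def classify(x):
    if x < 0:
        return "negative"
    elif x == 0:
        return "zero"
    return "positive"
"""
```

The `classify` function above has three independent paths, so the sketch reports a complexity of 3; values above roughly 10 are conventionally taken as a signal that a module is hard to test and more fault-prone.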

Test coverage metrics are a way of estimating fault and reliability by performing tests on software products, based on the assumption that software reliability is a function of the portion of software that has been successfully verified or tested.

Project management metrics

Researchers have realized that good management can result in better products. Research has demonstrated that a relationship exists between the development process and the ability to complete projects on time and within the desired quality objectives. Costs increase when developers use inadequate processes. Higher reliability can be achieved by using a better development process, risk management process, configuration management process, etc.

Process metrics

Based on the assumption that the quality of the product is a direct function of the process, process metrics can be used to estimate, monitor and improve the reliability and quality of software. ISO 9000 certification, or the "quality management standards", is the generic reference for a family of standards developed by the International Organization for Standardization (ISO).

Fault and failure metrics

The goal of collecting fault and failure metrics is to be able to determine when the software is approaching failure-free execution. Minimally, both the number of faults found during testing (i.e., before delivery) and the failures (or other problems) reported by users after delivery are collected, summarized and analyzed to achieve this goal. Test strategy is highly related to the effectiveness of fault metrics, because if the testing scenario does not cover the full functionality of the software, the software may pass all tests and yet be prone to failure once delivered. Usually, failure metrics are based upon customer information regarding failures found after release of the software. The failure data collected are then used to calculate failure density, Mean Time Between Failures (MTBF) or other parameters to measure or predict software reliability.
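A minimal sketch of how such field data might be turned into the metrics mentioned above; the figures are invented purely for illustration.

```python
def failure_density(defects_found: int, size_kloc: float) -> float:
    """Defects per thousand lines of code (KLOC)."""
    return defects_found / size_kloc

def mtbf(operating_hours: float, failure_count: int) -> float:
    """Mean Time Between Failures computed from field data:
    total operating time divided by the number of failures."""
    return operating_hours / failure_count

# Hypothetical field data for one release:
density = failure_density(45, 120.0)  # 45 post-release defects in 120 KLOC
mean_time = mtbf(8760.0, 12)          # 12 failures over a year of operation
```

Comparing these numbers across releases, rather than reading them in isolation, is what makes them useful for tracking whether reliability is improving.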

Software Reliability Improvement Techniques

Good engineering methods can largely improve software reliability.

Before the deployment of software products, testing, verification and validation are necessary steps. Software testing is heavily used to trigger, locate and remove software defects. Software testing is still in its infancy; testing is crafted to suit specific needs in various software development projects in an ad-hoc manner. Various analysis tools such as trend analysis, fault-tree analysis, Orthogonal Defect Classification and formal methods can also be used to minimize the possibility of defects occurring after release and therefore improve software reliability.

After deployment of the software product, field data can be gathered and analyzed to study the behavior of software defects. Fault tolerance and fault/failure forecasting techniques provide helpful guidelines to minimize fault occurrence or the impact of faults on the system.


Software reliability is a key part in software quality. The study of software reliability can be categorized into three parts: modeling, measurement and improvement.

Software reliability modeling has matured to the point that meaningful results can be obtained by applying suitable models to the problem. Many models exist, but no single model can capture all of the necessary software characteristics. Assumptions and abstractions must be made to simplify the problem, and no single model is universal to all situations.

Software reliability measurement is naive. Measurement is far from being as commonplace in software engineering as it is in other engineering fields. "How good is the software, quantitatively?" As simple as the question is, there is still no good answer. Software reliability cannot be directly measured, so other related factors are measured to estimate software reliability and compare it among products. Development process, faults and failures found are all factors related to software reliability.

Software reliability improvement is hard. The difficulty of the problem stems from an insufficient understanding of software reliability and, in general, of the characteristics of software. Until now there has been no good way to conquer the complexity problem of software. Complete testing of a moderately complex software module is infeasible, so defect-free software products cannot be assured. Realistic constraints of time and budget severely limit the effort put into software reliability improvement.

As more and more software creeps into embedded systems, we must make sure it does not embed disasters. If not considered carefully, software reliability can be the reliability bottleneck of the whole system. Ensuring software reliability is no easy task. As hard as the problem is, promising progress is still being made toward more reliable software, as more standard components and better processes are introduced into the software engineering field.

Quality Management Models

The purpose of quality management models is to assess the quality of a software product, the number of defects, or to estimate the mean time to next failure when development work is complete. Quality management models must provide early signs of warning or of improvement so that timely actions can be planned and implemented. To be helpful to a development organization, a quality management model must cover the early development phases.

The most important principle in software engineering is "do it right the first time". This principle speaks to the importance of managing quality throughout the development process. The interpretation of the principle, in the context of software quality management, is:

a. The best scenario is to prevent errors from being injected into the development process.

b. When errors are introduced, improve the front end of the development process to remove them as early as possible.

c. If the project is beyond the design and code phases, unit tests and any additional tests by the developers serve as gatekeepers that keep errors from escaping the front-end process before the code is integrated into the configuration management system.

The Rayleigh Model framework is a good overall model for quality management.


The salient feature of the Rayleigh model is its focus on early defect detection and removal, in line with the preceding items. Based on the model, if the error injection rate is reduced, the entire area under the Rayleigh curve becomes smaller, leading to a smaller projected field defect rate. Ideally, most errors should be found before formal testing, since higher defect numbers in formal testing lead to more field defects. The iceberg analogy describes the relationship between the testing defect rate and the field defect rate: the defects found in testing are the tip of the iceberg, and the size of the iceberg is equivalent to the amount of error injection. By the time formal testing starts, the iceberg is already formed and its size determined: the larger the tip, the larger the entire iceberg. To reduce the submerged part, extra effort must be applied to expose more of the iceberg above the water.

A Rayleigh model derived from a previous release or from historical data can be used to track the pattern of defect removal of the project under development. If the current pattern is more front-loaded than the model would predict, it is a positive sign, and vice versa. If the tracking is via a time series such as months or weeks (instead of development phases), early estimation of model parameters can be performed once enough data points are available. Quality projections based on early data are not as reliable as the final estimate at the end of the development cycle, but the data points surely indicate the direction of the quality, so that timely action can be taken.
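The following sketch shows how tracking against a Rayleigh curve might look, using the common parameterization in which the cumulative defects expected by time t are K * (1 - e^(-t^2 / (2*tm^2))), where K is the total defect estimate and tm the time at which the defect removal rate peaks. Both parameter values below are hypothetical.

```python
import math

def rayleigh_cdf(t: float, total_defects: float, peak_time: float) -> float:
    """Cumulative defects expected by time t under the Rayleigh model;
    peak_time is when the defect removal rate peaks."""
    return total_defects * (1.0 - math.exp(-(t * t) / (2.0 * peak_time ** 2)))

def latent_defects(test_end: float, total_defects: float, peak_time: float) -> float:
    """Defects projected to escape to the field if the process
    stops at time test_end."""
    return total_defects - rayleigh_cdf(test_end, total_defects, peak_time)

# Hypothetical project: 500 defects estimated in total, removal peaking at month 4.
found_by_month_8 = rayleigh_cdf(8, 500, 4)  # defects expected found by month 8
escaped = latent_defects(8, 500, 4)         # projected field defects
```

Comparing the actual monthly defect arrivals against `rayleigh_cdf` increments shows whether the current pattern is more or less front-loaded than the historical model predicts.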

The Rayleigh framework serves as the basis for a quality improvement strategy, especially with respect to defect prevention and early defect detection and removal. As an in-process tool, it provides data that indicate the direction of the quality. Comparison with the models of previous products will show

• whether defect removal is more or less front-end loaded
• whether quality is likely to be similar to, better than, or worse than previous products
• what quality actions should be taken
For each direction, actions are formulated and implemented. For example, to facilitate early defect removal, implemented actions include a focus on:
• Design review
• Code Review
• Code inspection (DR/CI) process
• Deployment of moderator training
• Inspection checklists
• Use of in-process escape measurements to track the effectiveness of reviews and inspections
• Use of mini-builds to flush out defects by developers before integration into the configuration management system

The actions for defect prevention, or to reduce error injection, include the following:

• Implementation of a defect prevention process
• Use of CASE tools for development
• Improved communication between interface-owning groups

Reliability Growth Model

Reliability growth models fitted to previous products can be used to track the defect rates of the current product. To show significant quality improvement, the current defect arrival rate must fall below the model curve. Comparing against a model allows quality actions to be identified and implemented, and models can be used to determine the end date of testing. Other models should be used in conjunction, because reliability growth models do not cover the front end of the process.

Criteria for the model evaluation

The most important criteria for evaluating reliability models are predictive validity, simplicity and quality of assumptions, in that order of importance. With regard to Quality Management Models, we propose:

a. Timeliness: raising a red flag early allows more time to react and recover

b. Scope of process coverage: the model should address each phase and the quality of the stage deliverables

c. Capability: the ability of the model to provide information for planning and managing software development

Orthogonal Defect Classification (ODC)

Orthogonal Defect Classification (ODC) is a method for in-process quality management based on defect cause analysis. The ODC method asserts that a set of mutually independent (orthogonal) cause categories can be developed which can be used across phases of development and across products, and that the distribution of these defect types is associated with process phases. The primary concept of this method is to assess the state of the development process by examining the distribution of defect types. There are eight defect types:

• Function
• Interface
• Checking
• Assignment
• Timing/Serialization
• Build /package/Merge
• Documentation
• Algorithm

The authors associate the defect types with development activities in the following manner:

Defect Type    Association
Function (Missing or incorrect function)   Design phase
Interface defects   Low Level Design
Checking   Low Level Design or implementation
Assignment   Code phase
Timing and Serialization   Low Level Design
Build/package/Merge   Library Tools
Documentation   Publication
Algorithm   Low Level Design

The authors of ODC provide many examples. In one, a high percentage of the defect type "function" was found at a late stage in the development cycle. Specifically, the defect discovery time was classified into four periods, the last of which corresponded approximately to the system test phase. In the last period the number of defects found almost doubled, and the percentage of defect type "function" increased to almost 50%. Since "function" defects are supposed to be found earlier (during the design phase), the observed distribution indicated a clear departure from the expected process behavior.
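A sketch of the kind of distribution analysis described above: grouping defect records by discovery period and computing the percentage of each ODC type per period, so that a late-period spike in "function" defects stands out. The defect log below is invented for illustration.

```python
from collections import Counter

def type_distribution(defects):
    """Percentage of each ODC defect type, grouped by discovery period.
    `defects` is an iterable of (period, defect_type) records."""
    by_period = {}
    for period, dtype in defects:
        by_period.setdefault(period, Counter())[dtype] += 1
    return {
        period: {t: 100.0 * n / sum(counts.values()) for t, n in counts.items()}
        for period, counts in by_period.items()
    }

# Invented defect log; "function" defects dominating the last period would
# signal design problems surfacing late, as in the example above.
log = [(1, "assignment"), (1, "checking"), (2, "function"),
       (2, "interface"), (3, "function"), (3, "function")]
dist = type_distribution(log)
```

In practice the distributions would be compared against the expected profile for each process phase, and a departure like the one above would trigger a review of the design rather than more of the same testing.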

In addition to defect type analysis, the ODC method includes the defect trigger to improve testing effectiveness. A defect trigger is a condition that allows a defect to surface. By capturing information on defect triggers during testing, and for field defects reported by customers, the test team can improve its test planning and test cases to maximize defect discovery.

The attributes classified by ODC when a defect is opened include the following:

• Activity: The specific activity that exposed the defect. For example, during a system test, a defect occurs when one clicks a button to select the Summary View option. The phase is system test, but the activity is function test, because the defect was surfaced by performing a function-test-type activity.

• Trigger: The environment or condition that had to exist for the defect to surface.

• Impact: The effect the defect would have had on the customers had it escaped to the field, or the effect it would have had if not found during development.

When a defect fix is made, the following attributes are added:

• Target: What is being fixed: design, code, documentation, and so forth?

• Defect type: The nature of the correction made

• Defect qualifier (applies to the defect type): Captures whether the implementation element was nonexistent, wrong or irrelevant

• Source: The origin of the design/code that contained the defect

• Age: The history of the design/code that contained the defect

The ODC method has been applied to many projects and successful results have been reported. The most significant contribution of ODC seems to be in providing data-based assessments leading to improvement of test effectiveness.


Quality Management Models are most valuable for monitoring and managing the quality of software while it is under development. Unlike reliability models, Quality Management Models should focus on:

• Timeliness: of quality indications
• Scope of coverage: of the various phases of the development process
• Capability: providing information about the different dimensions of quality through indicators and attributes

The Rayleigh model provides a nice framework for quality management covering the entire development process. Within the overall Rayleigh framework, sub-models such as the effort/outcome model, the PTR sub-model, the PTR arrival and backlog projection models, and the reliability growth models can be used together with related in-process metrics.

To implement the above models, a good tracking and reporting system and a set of related in-process metrics are important. Defect cause and defect type analysis, such as the ODC method, can lead to more insights and, therefore, effective improvement actions.

The above models can lead to a better understanding of the project.
Copyright © 2015

Review Questions
1. What is a reliability model?
2. Explain briefly the software failure mechanisms.
3. Sketch the bathtub curve for hardware reliability and explain it.
4. How many categories are there in software reliability measurement? Explain.
5. Draw a diagram of the Rayleigh model framework and explain it in detail.
6. Explain in detail the PTR model.
