
Reliability and Quality Management Models:


With the advent of the computer age, computers, and the software running on them, play a vital role in our daily lives. Household appliances such as washing machines, telephones, TVs, and watches are having their analog and mechanical parts replaced by CPUs and software. With continuously falling cost and improved control, processors and software-controlled systems offer compact design, flexible handling, rich features and competitive cost.

People used to believe that "software never breaks". Intuitively, unlike mechanical parts such as bolts and levers, or electronic parts such as transistors and capacitors, software will stay "as is" unless there are problems in the hardware that change the storage content or the data path. Software does not age, rust, wear out, deform or crack. There is no environmental constraint for software to operate as long as the hardware processor it runs on can operate. Furthermore, software has no shape, color, material or mass. It cannot be seen or touched, yet it is crucial to system functionality.

Until proven wrong, optimists would think that once software runs correctly, it will be correct forever. A series of tragedies and chaos caused by software proved this wrong. These events will always have their place in history.

Software can make decisions, but it can be just as unreliable as human beings. The British destroyer HMS Sheffield was sunk because its radar system identified an incoming missile as "friendly". Defense systems have matured to the point that they will not mistake the rising moon for incoming missiles, but objects such as gas-field fires and descending space junk can still trigger false alarms.

Software can also have small, unnoticeable errors or drifts that culminate in a disaster. On February 25, 1991, during the Gulf War, a truncation error of 0.000000095 second in every tenth of a second, accumulated over 100 hours, caused a Patriot missile battery to fail to intercept a Scud missile. 28 lives were lost.

The current wave of drone missiles striking accurately at Taliban targets shows that there can be no room for error. Moreover, fixing problems does not necessarily make the software more reliable; on the contrary, new and more serious problems may arise.

Reliability Model

According to ANSI, Software Reliability is defined as: the probability of failure-free software operation for a specified period of time in a specified environment. Although Software Reliability is defined as a probabilistic function and comes with the notion of time, we must note that, unlike traditional Hardware Reliability, Software Reliability is not a direct function of time. Electronic and mechanical parts may become "old" and wear out with time and usage, but software will not rust or wear out during its life cycle. Software will not change over time unless intentionally changed or upgraded.
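To make the definition concrete, here is a minimal sketch of the classical constant-failure-rate interpretation, where reliability over a period t is R(t) = e^(-lambda*t). The failure rate used below is purely hypothetical.

```python
import math

def reliability(failure_rate: float, t: float) -> float:
    """Probability of failure-free operation for a period t,
    assuming a constant failure rate (exponential model).
    The rate is a hypothetical illustration, not a property
    of any real system."""
    return math.exp(-failure_rate * t)

# Hypothetical system observing 0.001 failures per hour:
r_100 = reliability(0.001, 100)  # probability of surviving 100 hours
```

For software, unlike hardware, such a rate is driven by the usage profile and residual design faults rather than by physical aging, which is exactly the distinction the paragraph above draws.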

Software Reliability is an important attribute of software quality, together with functionality, usability, performance, serviceability, capability, installability, maintainability, and documentation. Software Reliability is hard to achieve because of the complexity of software. Since any system of high complexity is hard to bring to a given level of reliability, system developers tend to push complexity into the software layer, encouraged by the rapid growth of system size and the ease of doing so by upgrading the software. For example, large next-generation aircraft will have over one million source lines of software on board; next-generation air traffic control systems will contain between one and two million lines; the upcoming International Space Station will have over two million lines on board and over ten million lines of ground support software; several major life-critical defense systems will have over five million source lines of software. While the complexity of software is inversely related to software reliability, it is directly related to other important factors in software quality, especially functionality, capability, etc. Emphasizing these features tends to add more complexity to software.

Software failure mechanisms

Software failures may be due to errors, ambiguities, oversights or misinterpretation of the specification that the software is supposed to satisfy, carelessness or incompetence in writing code, inadequate testing, incorrect or unexpected usage of the software, or other unforeseen problems. While it is tempting to draw an analogy between Software Reliability and Hardware Reliability, software and hardware have basic differences in their failure mechanisms. Hardware faults are mostly physical faults, while software faults are design faults, which are harder to visualize, classify, detect, and correct. Design faults are closely related to fuzzy human factors and the design process, of which we do not yet have a solid understanding. In hardware, design faults may also exist, but physical faults usually dominate. In software, we can hardly find a strict counterpart for the hardware "manufacturing" process, other than the simple action of uploading software modules into storage and starting execution. Trying to achieve higher reliability by simply duplicating the same software modules will not work, because design faults cannot be masked off by voting.

A partial list of the distinct characteristics of software compared to hardware is given below.

Failure cause: Software defects are mainly design defects.
Wear-out: Software does not have energy related wear-out phase. Errors can occur without warning.
Repairable system concept: Periodic restarts can help fix software problems.
Time dependency and life cycle: Software reliability is not a function of operational time.
Environmental factors: Do not affect software reliability, except that they might affect program inputs.
Reliability prediction: Software reliability cannot be predicted from any physical basis, since it depends completely on human factors in design.
Redundancy: Cannot improve Software reliability if identical software components are used.
Interfaces: Software interfaces are purely conceptual, as opposed to visual.
Failure rate motivators: Usually not predictable from analyses of separate statements.
Built with standard components: Well-understood and extensively-tested standard parts will help improve maintainability and reliability. But in the software industry, we have not observed this trend. Code reuse has been around for some time, but only to a very limited extent. Strictly speaking, there are no standard parts for software, except some standardized logic structures.

Software Reliability Models

Most software models contain the following parts: assumptions, factors, and a mathematical function that relates the reliability with the factors. The mathematical function is usually higher order exponential or logarithmic.

Software modeling techniques can be divided into two subcategories: prediction modeling and estimation modeling. Both kinds of modeling techniques are based on observing and accumulating failure data and analyzing it with statistical inference. The major differences between the two kinds of models are shown in Table 1.



DATA REFERENCE
  Prediction models: use historical data.
  Estimation models: use data from the current software development effort.

WHEN USED IN DEVELOPMENT CYCLE
  Prediction models: usually made prior to the development or test phases; can be used as early as the concept phase.
  Estimation models: usually made later in the life cycle (after some data have been collected); not typically used in the concept or development phases.

TIME FRAME
  Prediction models: predict reliability at some future time.
  Estimation models: estimate reliability at either the present or some future time.

Table 1: Difference between software reliability prediction models and software reliability estimation models

Representative prediction models include Musa's Execution Time Model and the Rome Laboratory models TR-92-51 and TR-92-15. Using prediction models, software reliability can be predicted early in the development phase and enhancements can be initiated to improve the reliability.
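As an illustration, the following is a sketch of Musa's basic execution-time model, in which the expected number of failures experienced after execution time tau is mu(tau) = nu0 * (1 - e^(-lam0*tau/nu0)), with nu0 the total expected number of failures and lam0 the initial failure intensity. The parameter values below are hypothetical.

```python
import math

def expected_failures(tau: float, nu0: float, lam0: float) -> float:
    """Musa basic execution-time model: mean failures experienced after
    tau units of execution time, where nu0 is the total expected number
    of failures and lam0 is the initial failure intensity."""
    return nu0 * (1.0 - math.exp(-lam0 * tau / nu0))

def failure_intensity(tau: float, nu0: float, lam0: float) -> float:
    """Current failure intensity, which decays as failures are
    experienced and the underlying faults are repaired."""
    return lam0 * math.exp(-lam0 * tau / nu0)

# Hypothetical project: 100 expected failures, 5 failures/CPU-hour initially.
mu_10 = expected_failures(10, 100, 5.0)    # failures expected in 10 CPU-hours
lam_10 = failure_intensity(10, 100, 5.0)   # intensity after 10 CPU-hours
```

Note that the model's time variable is execution (CPU) time, not calendar time, consistent with the earlier observation that software reliability is not a direct function of operational time.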

Representative estimation models include the exponential distribution models, the Weibull distribution model, Thompson and Chelson's model, etc. The exponential models and the Weibull distribution model are usually classified as classical fault count/fault rate estimation models, while Thompson and Chelson's model belongs to the Bayesian fault rate estimation models.

The field has matured to the point that software reliability models can be applied in practical situations and give meaningful results; on the other hand, no one model is best in all situations. Because of the complexity of software, any model has to make extra assumptions, and only limited factors can be taken into consideration. Most software reliability models ignore the software development process and focus on the results: the observed faults and/or failures. By doing so, complexity is reduced and abstraction is achieved; however, the models tend to specialize, applying to only a portion of situations and a certain class of problems. We have to carefully choose the model that suits our specific case, and the modeling results cannot be blindly believed and applied.

Software Reliability Metrics

Measurement is commonplace in other engineering fields, but not in software engineering. Measuring software reliability remains a difficult problem because we do not have a good understanding of the nature of software. There is no clear definition of which aspects are related to software reliability, and we cannot find a suitable way to measure software reliability or most of the aspects related to it. Even the most obvious product metrics, such as software size, have no uniform definition.

If we cannot measure reliability directly, it is tempting to measure something related to reliability that reflects its characteristics. The current practices of software reliability measurement can be divided into four categories:

Product metrics

Software size is thought to be reflective of complexity, development effort and reliability. Lines of Code (LOC), or LOC in thousands (KLOC), is an intuitive initial approach to measuring software size, but there is no standard way of counting. Typically, source code is counted (SLOC, KSLOC) and comments and other non-executable statements are not. This method cannot faithfully compare software not written in the same language, and the advent of code reuse and code generation techniques also casts doubt on this simple method.

The function point metric is a method of measuring the functionality of a proposed software development based upon a count of inputs, outputs, master files, inquiries, and interfaces. The method can be used to estimate the size of a software system as soon as these functions can be identified. It is a measure of the functional complexity of the program; it measures the functionality delivered to the user and is independent of the programming language. It is used primarily for business systems; it is not proven in scientific or real-time applications.

Complexity is directly related to software reliability, so representing complexity is important. Complexity-oriented metrics determine the complexity of a program's control structure by simplifying the code into a graphical representation. A representative metric is McCabe's Complexity Metric.
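As a rough sketch of how such a metric can be computed, the following simplified take on McCabe's cyclomatic complexity counts decision points in Python source rather than building the full control-flow graph; it is an illustration, not a complete implementation.

```python
import ast

# Node types treated as decision points in this simplified sketch.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)

def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: 1 plus the number of decision
    points; boolean operators add one path per extra operand."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            complexity += len(node.values) - 1
    return complexity

sample = """
def classify(x):
    if x < 0:
        return "negative"
    elif x == 0:
        return "zero"
    return "positive"
"""
```

The `classify` function above has three independent paths, so the sketch reports a complexity of 3; values above roughly 10 are conventionally taken as a signal that a module is hard to test and more fault-prone.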

Test coverage metrics are a way of estimating fault and reliability by performing tests on software products, based on the assumption that software reliability is a function of the portion of software that has been successfully verified or tested.

Project management metrics

Researchers have realized that good management can result in better products. Research has demonstrated that a relationship exists between the development process and the ability to complete projects on time and within the desired quality objectives. Costs increase when developers use inadequate processes. Higher reliability can be achieved by using a better development process, risk management process, configuration management process, etc.

Process metrics

Based on the assumption that the quality of the product is a direct function of the process, process metrics can be used to estimate, monitor and improve the reliability and quality of software. ISO 9000 certification, or the "quality management standards", is the generic reference for a family of standards developed by the International Organization for Standardization (ISO).

Fault and failure metrics

The goal of collecting fault and failure metrics is to be able to determine when the software is approaching failure-free execution. Minimally, both the number of faults found during testing (i.e., before delivery) and the failures (or other problems) reported by users after delivery are collected, summarized and analyzed to achieve this goal. Test strategy is highly related to the effectiveness of fault metrics, because if the testing scenario does not cover the full functionality of the software, the software may pass all tests and yet be prone to failure once delivered. Usually, failure metrics are based upon customer information regarding failures found after release of the software. The failure data collected are then used to calculate failure density, Mean Time Between Failures (MTBF) or other parameters to measure or predict software reliability.
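A minimal sketch of how such field data might be turned into the metrics mentioned above; the figures are invented purely for illustration.

```python
def failure_density(defects_found: int, size_kloc: float) -> float:
    """Defects per thousand lines of code (KLOC)."""
    return defects_found / size_kloc

def mtbf(operating_hours: float, failure_count: int) -> float:
    """Mean Time Between Failures computed from field data:
    total operating time divided by the number of failures."""
    return operating_hours / failure_count

# Hypothetical field data for one release:
density = failure_density(45, 120.0)  # 45 post-release defects in 120 KLOC
mean_time = mtbf(8760.0, 12)          # 12 failures over a year of operation
```

Comparing these numbers across releases, rather than reading them in isolation, is what makes them useful for tracking whether reliability is improving.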

Software Reliability Improvement Techniques

Good engineering methods can largely improve software reliability.

Before the deployment of software products, testing, verification and validation are necessary steps. Software testing is heavily used to trigger, locate and remove software defects. Software testing is still in its infancy; testing is crafted to suit specific needs in various software development projects in an ad-hoc manner. Various analysis tools such as trend analysis, fault-tree analysis, Orthogonal Defect Classification and formal methods can also be used to minimize the possibility of defects occurring after release and therefore improve software reliability.

After deployment of the software product, field data can be gathered and analyzed to study the behavior of software defects. Fault tolerance and fault/failure forecasting techniques provide helpful guidelines to minimize fault occurrence or the impact of faults on the system.


Software reliability is a key part in software quality. The study of software reliability can be categorized into three parts: modeling, measurement and improvement.

Software reliability modeling has matured to the point that meaningful results can be obtained by applying suitable models to the problem. Many models exist, but no single model can capture all of the necessary software characteristics. Assumptions and abstractions must be made to simplify the problem, and no single model is universal to all situations.

Software reliability measurement is naive. Measurement is far from being as commonplace in software engineering as it is in other engineering fields. "How good is the software, quantitatively?" As simple as the question is, there is still no good answer. Software reliability cannot be directly measured, so other related factors are measured to estimate software reliability and compare it among products. Development process, faults and failures found are all factors related to software reliability.

Software reliability improvement is hard. The difficulty of the problem stems from an insufficient understanding of software reliability and, in general, of the characteristics of software. Until now there has been no good way to conquer the complexity problem of software. Complete testing of a moderately complex software module is infeasible, so defect-free software products cannot be assured. Realistic constraints of time and budget severely limit the effort put into software reliability improvement.

As more and more software creeps into embedded systems, we must make sure it does not embed disasters. If not considered carefully, software reliability can be the reliability bottleneck of the whole system. Ensuring software reliability is no easy task. As hard as the problem is, promising progress is still being made toward more reliable software, as more standard components and better processes are introduced into the software engineering field.

Quality Management Models

The purpose of quality management models is to assess the quality of a software product, the number of defects, or to estimate the mean time to next failure when development work is complete. Quality management models must provide early signs of warning or of improvement so that timely actions can be planned and implemented. To be helpful to a development organization, a quality management model must cover the early development phases.

The most important principle in software engineering is "do it right the first time". This principle speaks to the importance of managing quality throughout the development process. The interpretation of the principle, in the context of software quality management, is:

a. The best scenario is to prevent errors from being injected into the development process.

b. When errors are introduced, improve the front end of the development process to remove them as early as possible.

c. If the project is beyond the design and code phases, unit tests and any additional tests by the developers serve as gatekeepers that keep errors from escaping the front-end process before the code is integrated into the configuration management system.

The Rayleigh Model framework is a good overall model for quality management.


The salient feature of the Rayleigh model is its focus on early defect detection and removal, in line with the preceding items. Based on the model, if the error injection rate is reduced, the entire area under the Rayleigh curve becomes smaller, leading to a smaller projected field defect rate. Ideally, most errors should be found before formal testing, since higher defect numbers in formal testing lead to more field defects. The iceberg analogy describes the relationship between the testing defect rate and the field defect rate: the defects found in testing are the tip of the iceberg, and the size of the iceberg is equivalent to the amount of error injection. By the time formal testing starts, the iceberg is already formed and its size determined: the larger the tip, the larger the entire iceberg. To reduce the submerged part, extra effort must be applied to expose more of the iceberg above the water.

A Rayleigh model derived from a previous release or from historical data can be used to track the pattern of defect removal of the project under development. If the current pattern is more front-loaded than the model would predict, it is a positive sign, and vice versa. If the tracking is via a time series such as months or weeks (instead of development phases), early estimation of model parameters can be performed once enough data points are available. Quality projections based on early data are not as reliable as the final estimate at the end of the development cycle, but the data points surely indicate the direction of the quality, so that timely action can be taken.
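The following sketch shows how tracking against a Rayleigh curve might look, using the common parameterization in which the cumulative defects expected by time t are K * (1 - e^(-t^2 / (2*tm^2))), where K is the total defect estimate and tm the time at which the defect removal rate peaks. Both parameter values below are hypothetical.

```python
import math

def rayleigh_cdf(t: float, total_defects: float, peak_time: float) -> float:
    """Cumulative defects expected by time t under the Rayleigh model;
    peak_time is when the defect removal rate peaks."""
    return total_defects * (1.0 - math.exp(-(t * t) / (2.0 * peak_time ** 2)))

def latent_defects(test_end: float, total_defects: float, peak_time: float) -> float:
    """Defects projected to escape to the field if the process
    stops at time test_end."""
    return total_defects - rayleigh_cdf(test_end, total_defects, peak_time)

# Hypothetical project: 500 defects estimated in total, removal peaking at month 4.
found_by_month_8 = rayleigh_cdf(8, 500, 4)  # defects expected found by month 8
escaped = latent_defects(8, 500, 4)         # projected field defects
```

Comparing the actual monthly defect arrivals against `rayleigh_cdf` increments shows whether the current pattern is more or less front-loaded than the historical model predicts.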

The Rayleigh framework serves as the basis for a quality improvement strategy, especially with respect to defect prevention and early defect detection and removal. As an in-process tool, it provides data that indicate the direction of the quality. Comparison with the models of previous products will show

• whether defect removal is more or less front-end loaded
• whether quality is likely to be similar to, better than, or worse than previous products
• what quality actions should be taken
For each direction, actions are formulated and implemented. For example, to facilitate early defect removal, implemented actions include a focus on:
• Design review
• Code Review
• Code inspection (DR/CI) process
• Deployment of moderator training
• Inspection checklists
• Use of in-process escape measurements to track the effectiveness of reviews and inspections
• Use of mini-builds to flush out defects by developers before integration into the configuration management system

The actions for defect prevention, or to reduce error injection, include the following:

• Implementation of a defect prevention process
• Use of CASE tools for development
• Improved communication between interface-owning groups

Reliability Growth Model

Reliability growth models fitted to previous products can be used to track the defect rates of the current product. To show significant quality improvement, the current defect arrival rate must fall below the model curve. Comparing against a model allows quality actions to be identified and implemented, and models can be used to determine the end date of testing. Other models should be used in conjunction, because reliability growth models do not cover the front end of the process.

Criteria for the model evaluation

The most important criteria for evaluating reliability models are predictive validity, simplicity and quality of assumptions, in that order of importance. With regard to Quality Management Models, we propose:

a. Timeliness: raising a red flag early allows more time to react and recover

b. Scope of process coverage: the model should address each phase and the quality of the stage deliverables

c. Capability: the ability of the model to provide information for planning and managing software development

Orthogonal Defect Classification (ODC)

Orthogonal Defect Classification (ODC) is a method for in-process quality management based on defect cause analysis. The ODC method asserts that a set of mutually independent (orthogonal) cause categories can be developed which can be used across phases of development and across products, and that the distribution of these defect types is associated with process phases. The primary concept of this method is to assess the state of the development process by examining the distribution of defect types. There are eight defect types:

• Function
• Interface
• Checking
• Assignment
• Timing/Serialization
• Build /package/Merge
• Documentation
• Algorithm

The authors associate the defect types with development activities in the following manner:

Defect Type    Association
Function (Missing or incorrect function)   Design phase
Interface defects   Low Level Design
Checking   Low Level Design or implementation
Assignment   Code phase
Timing and Serialization   Low Level Design
Build/package/Merge   Library Tools
Documentation   Publication
Algorithm   Low Level Design

The authors of ODC provide many examples. In one, a high percentage of the defect type "function" was found at a late stage in the development cycle. Specifically, the defect discovery time was classified into four periods, the last of which corresponded approximately to the system test phase. In the last period the number of defects found almost doubled, and the percentage of defect type "function" increased to almost 50%. Since "function" defects are supposed to be found earlier (during the design phase), the observed distribution indicated a clear departure from the expected process behavior.
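A sketch of the kind of distribution analysis described above: grouping defect records by discovery period and computing the percentage of each ODC type per period, so that a late-period spike in "function" defects stands out. The defect log below is invented for illustration.

```python
from collections import Counter

def type_distribution(defects):
    """Percentage of each ODC defect type, grouped by discovery period.
    `defects` is an iterable of (period, defect_type) records."""
    by_period = {}
    for period, dtype in defects:
        by_period.setdefault(period, Counter())[dtype] += 1
    return {
        period: {t: 100.0 * n / sum(counts.values()) for t, n in counts.items()}
        for period, counts in by_period.items()
    }

# Invented defect log; "function" defects dominating the last period would
# signal design problems surfacing late, as in the example above.
log = [(1, "assignment"), (1, "checking"), (2, "function"),
       (2, "interface"), (3, "function"), (3, "function")]
dist = type_distribution(log)
```

In practice the distributions would be compared against the expected profile for each process phase, and a departure like the one above would trigger a review of the design rather than more of the same testing.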

In addition to defect type analysis, the ODC method includes the defect trigger to improve testing effectiveness. A defect trigger is a condition that allows a defect to surface. By capturing information on defect triggers during testing, and for field defects reported by customers, the test team can improve its test planning and test cases to maximize defect discovery.

The attributes classified by ODC when a defect is opened include the following:

• Activity: The specific activity that exposed the defect. For example, during a system test, a defect occurs when one clicks a button to select the Summary View option. The phase is system test, but the activity is function test, because the defect was surfaced by performing a function-test-type activity.

• Trigger: The environment or condition that had to exist for the defect to surface.

• Impact: The effect the defect would have had on the customers had it escaped to the field, or the effect it would have had if not found during development.

When a defect fix is made, the following attributes are added:

• Target: What is being fixed: design, code, documentation, and so forth?

• Defect type: The nature of the correction made

• Defect qualifier (applies to the defect type): Captures whether the implementation element was nonexistent, wrong or irrelevant

• Source: The origin of the design/code that contained the defect

• Age: The history of the design/code that contained the defect

The ODC method has been applied to many projects and successful results have been reported. The most significant contribution of ODC seems to be in providing data-based assessments leading to improvement of test effectiveness.


Quality Management Models are most valuable for monitoring and managing the quality of software while it is under development. Unlike reliability models, Quality Management Models should focus on:

• Timeliness: of quality indications
• Scope of coverage: of the various phases of the development process
• Capability: providing information about the different dimensions of quality through indicators and attributes

The Rayleigh model provides a nice framework for quality management covering the entire development process. Within the overall Rayleigh framework, sub-models such as the effort/outcome model, the PTR sub-model, the PTR arrival and backlog projection models, and the reliability growth models can be used together with related in-process metrics.

To implement the above models, a good tracking and reporting system and a set of related in-process metrics are important. Defect cause and defect type analysis, such as the ODC method, can lead to more insights and, therefore, effective improvement actions.

The above models can lead to a better understanding of the project.
Copyright © 2015

Review Questions
1. What is a reliability model?
2. Explain briefly the software failure mechanisms.
3. Sketch the bathtub curve for hardware reliability and explain it.
4. How many categories are there in software reliability measurement? Explain.
5. Draw a diagram of the Rayleigh model framework and explain it in detail.
6. Explain in detail the PTR model.
