Metrics
A metric is a measure for quantitatively assessing, controlling or selecting a person, process, event, or institution, along with the procedures to carry out measurements and the procedures for interpreting the assessment in the light of previous or comparable assessments.
Metrics are usually specialized by subject area, in which case they are valid only within a certain domain and cannot be directly benchmarked or interpreted outside it. This factor severely limits the applicability of metrics, for instance in comparing performance across domains. The prestige attached to them may be said to relate to a ‘quantifiability fallacy’, the erroneous belief that if a conclusion is reached by quantitative measurement, it must be vindicated, irrespective of what parameters or purpose the investigation is supposed to have.
WHY METRICS?
Before we consider software measurement and the collection of software metrics data, we need to ask ourselves: why are we doing this? Do organizations need data on their software projects and management? Does the software industry need data about itself? There is management pressure to build systems faster, better and at minimum cost. The return on investment that an organization gets from the money it spends on software development has come under increased scrutiny from senior business executives and directors. Consequently, software development projects now have to operate like other parts of the organization, being aware of their performance, their contribution to the organization’s success, and their opportunities for improvement. How can a manager achieve this without performance data? The following quotes by Lord Kelvin and Tom DeMarco underline the importance of measurement.
“When you can measure what you are speaking about and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind: it may be the beginning of knowledge, but you have scarcely in your thoughts advanced to the stage of science.” - Lord Kelvin
“You cannot control what you cannot measure.” - Tom DeMarco
So what is it that managers need to know? Here are some of the questions that managers need answered about their projects:
• How do I know if my internal operation is performing satisfactorily?
• How do I decide whether I should outsource some or all of my operations?
• How do I know if my outsourcer is performing?
• What are the risk factors I should consider in a project?
• What questions should I ask to ensure that a project proposal is realistic?
• How do I know if a project is healthy? i.e. what should I be worrying about?
None of these questions can be answered without a sound software metrics system in place.
A properly established metric helps achieve missions, visions, goals and objectives. Measurement data is most reliable when it is generated as a byproduct of producing a product or service. Metrics can be used to gauge the status, effectiveness and efficiency of processes, customer satisfaction, and product quality, and as a tool for management. Related topics include measurement concepts, the use of metrics in a software development environment, variation, process capability, risk management, the ways measurement can be used and how to implement an effective metrics program.
Measurement Concepts
To effectively measure, one needs to know the basic concepts of measurement. This section provides those basic measurement concepts.
Standard Units of Measure
A measure is a single quantitative attribute of an entity. It is the basic building block for a measurement program. Examples of measures are lines of code (LOC), work effort, or number of defects. Quantitative measures should be expressed in numbers. For example, the measure LOC refers to the “number” of lines and work effort refers to the “number” of hours, days, or months.
Measurement cannot be used effectively until the standard units of measure have been defined. For example, talking about lines of code does not make sense until the measure LOC has been defined. Lines of code may mean LOC written, executable LOC written, or non-compound LOC written. If a line of code contains a compound statement (such as a nested IF statement two levels deep), it could be counted as one or two lines of code. Additionally, organizations may use weighting factors; for example, one verb would be weighted as more complete than other verbs in the same programming language.
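To illustrate why the definition of LOC matters, here is a minimal Python sketch that counts the same source text under three different definitions. The function name count_loc, the three definitions, and the sample program are assumptions made for illustration; they are not a standard counting rule, and a real organization would document its own.

```python
def count_loc(source: str) -> dict:
    """Count lines of code under three illustrative (hypothetical) definitions."""
    lines = source.splitlines()
    # Definition 1: every physical line, including blanks and comments.
    physical = len(lines)
    # Definition 2: lines that contain at least one non-whitespace character.
    non_blank = [ln for ln in lines if ln.strip()]
    # Definition 3: a rough stand-in for "executable" lines, taken here as
    # non-blank lines that are not pure '#' comments (an assumption).
    executable = [ln for ln in non_blank if not ln.strip().startswith("#")]
    return {
        "physical": physical,
        "non_blank": len(non_blank),
        "executable": len(executable),
    }


SAMPLE = """# compute absolute value
def absolute(x):

    if x < 0:
        return -x
    return x
"""

print(count_loc(SAMPLE))
```

The same six-line file yields three different LOC values, which is why two teams cannot compare LOC figures until they agree on the unit.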
Standard units of measure are the base on which all measurement exists. Measurement programs typically have between five and fifty standard units.
A metric is a derived (calculated or composite) unit of measurement that cannot be directly observed, but is created by combining or relating two or more measures. A metric normalizes data so that comparison is possible. Since metrics are combinations of measures, they can add more value in understanding or evaluating a process than plain measures. Examples of metrics are mean time to failure and actual effort compared to estimated effort.
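As a rough sketch of how a metric is derived from measures, the two example metrics can be computed in Python as follows. The function names, the choice of a simple ratio for effort comparison, and the input values are assumptions for illustration only; organizations may define these metrics differently.

```python
def effort_ratio(actual_hours: float, estimated_hours: float) -> float:
    """Actual effort compared to estimated effort, expressed as a ratio.

    A value above 1.0 means the work took longer than estimated.
    """
    return actual_hours / estimated_hours


def mean_time_to_failure(operating_hours: float, failures: int) -> float:
    """Mean time to failure: total operating time divided by failure count."""
    return operating_hours / failures


# Hypothetical measures; each is a plain count or duration on its own.
print(effort_ratio(actual_hours=120, estimated_hours=100))
print(mean_time_to_failure(operating_hours=1000, failures=4))
```

Each input is a directly observable measure, while each result is a metric that only exists once two measures are related to each other.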
Objective and Subjective Measurement
Objective measurement uses hard data that can be obtained by counting, stacking, weighing, timing, etc. Examples include number of defects, hours worked, or completed deliverables. An objective measurement should result in identical values for a given measure when measured by two or more qualified observers.
Subjective data is normally observed or perceived. It is a person’s perception of a product or activity, and includes personal attitudes, feelings and opinions, such as how easy a system is to use, or the skill level needed to execute the system. With subjective measurement, even qualified observers may determine different values for a given measure, since their subjective judgment is involved in arriving at the measured value. The reliability of subjective measurement can be improved through the use of guidelines, which define the characteristics that make the measurement result one value or another.
Objective measurement is more reliable than subjective measurement, but as a general rule, subjective measurement is considered more important. The more difficult something is to measure, the more valuable it is. For example, it is more valuable to know how effective a person is in performing a job (subjective measurement) than to know that they got to work on time (objective measurement). The following are a few other examples of objective and subjective measures:
The size of a software program measured in LOC is an objective product measure. Any informed person, working from the same definition of LOC, should obtain the same measure value for a given program.
The classification of software as user-friendly is a subjective product measure. On a scale of 1 to 5, customers of the software would likely rate the product differently. The reliability of the measure could be improved by providing customers with a guideline that describes how having or not having a particular attribute affects the rating.
Development time is an objective process measure. Level of programmer experience is a subjective process measure.
Types of Measurement Data
Before measurement data is collected and used, the type of information involved must be considered. It should be collected for a specific purpose. Usually the data is used in a process model, used in other calculations, or subjected to statistical analyses. Statisticians recognize four types of measured data, which are summarized in the table below.
Data Type | Possible Operations | Description of Data
Nominal | = ≠ | Categories
Ordinal | < > | Rankings
Interval | + - | Differences
Ratio | / | Absolute zero
Nominal Data
This data can be categorized. For example, a program can be classified as database software, operating system, etc. Nominal data cannot be subjected to arithmetic operations of any type, and the values cannot be ranked in any “natural order”. The only possible operation is to determine whether something is the same type as something else. Nominal data can be objective, depending on the rules for classification.
Ordinal Data
This data can be ranked, but differences or ratios between values are not meaningful. For example, programmer experience level may be measured as low, medium, or high. For ordinal data to be used in an objective measurement, the criteria for placement in the various categories must be well defined; otherwise, it is subjective.
Interval Data
This data can be ranked and can exhibit meaningful differences between values. Interval data has no absolute zero, and ratios of values are not necessarily meaningful. For example, a program with a complexity value of 6 is four units more complex than a program with a complexity of 2, but it is probably not meaningful to say that the first program is three times as complex as the second. T.J. McCabe’s complexity metric is an example of an interval scale.
Ratio Data
This data has an absolute zero, and meaningful ratios can be calculated. Measuring program size by LOC is an example. A program of 2,000 lines can be considered twice as large as a program of 1,000 lines.
It is important to understand the measurement scale associated with a given measure or metric. Many proposed measurements use values from an interval, ordinal, or nominal scale. If the values are to be used in mathematical equations designed to represent a model of the software process, measurements associated with a ratio scale are preferred, since the ratio scale allows mathematical operations to be meaningfully applied.
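The relationship between the four scale types and the operations that are meaningful on each can be sketched in Python. This is only an illustration: the names Scale, MIN_SCALE and is_meaningful are hypothetical, and the sketch assumes the scales are cumulative, so that each scale permits everything the weaker scales permit.

```python
from enum import IntEnum


class Scale(IntEnum):
    """The four measurement scales, ordered from weakest to strongest."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# Weakest scale on which each kind of operation is meaningful.
MIN_SCALE = {
    "equality": Scale.NOMINAL,     # same category or not
    "ranking": Scale.ORDINAL,      # less than, greater than
    "difference": Scale.INTERVAL,  # addition and subtraction
    "ratio": Scale.RATIO,          # division, "twice as large"
}


def is_meaningful(operation: str, scale: Scale) -> bool:
    """Return True if the operation is meaningful for data on this scale."""
    return scale >= MIN_SCALE[operation]


# Complexity on an interval scale: differences are fine, ratios are not.
print(is_meaningful("difference", Scale.INTERVAL))  # True
print(is_meaningful("ratio", Scale.INTERVAL))       # False
# LOC on a ratio scale: ratios are meaningful.
print(is_meaningful("ratio", Scale.RATIO))          # True
```

A check like this could guard a model's equations against inputs whose scale does not support the arithmetic being applied.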
Measures of Central Tendency
The measures of central tendency are the mean, median and mode. The mean is the average of the items in the population; the median is the item for which half the items in the population are below it and half are above it; and the mode is the item that is repeated most frequently.
For example, if a population of numbers is 1, 2, 2, 3, 4, 5, and 11:
The mean is “4” because 1 + 2 + 2 + 3 + 4 + 5 + 11 = 28 and 28 / 7 = 4.
The median is “3” because there are three values lower and three values higher than 3.
The mode is “2” because that is the item with the most occurrences.
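The worked example above can be checked with Python's standard statistics module. This is a minimal sketch; the variable names are chosen for illustration.

```python
import statistics

# The population from the worked example.
population = [1, 2, 2, 3, 4, 5, 11]

mean_value = statistics.mean(population)      # sum of items / number of items
median_value = statistics.median(population)  # middle item of the sorted data
mode_value = statistics.mode(population)      # most frequently occurring item

print(mean_value, median_value, mode_value)  # 4 3 2
```

Note how the single large value 11 pulls the mean above the median, which is one reason to report more than one measure of central tendency.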