human factors research
The central approach of human factors is the application of relevant information about human characteristics
and behavior to the design of objects, facilities, and environments that people use. Most relevant human
factors information is based on experimentation and observation. Research plays a central role in this
regard.
1. Descriptive studies
• Generally seek to describe a population (usually people) in terms of certain attributes.
• Examples would include an anthropometric survey of truck drivers wherein the height, weight,
arm length, etc. of the drivers were measured and tabulated; a survey of hearing loss among rock
band members; or a survey of opinions of train riders.
2. Experimental research
• The purpose is to assess the effects of one or more variables on behavior or
performance.
• Examples include assessing the effects of various levels of environmental heat on mental
performance, or assessing the effects of various types of seats on perceived comfort.
3. Evaluation research
• Tests the effect of a system on human behavior. It is similar to experimental research in assessing
the effects of "something"; however, the "something" is usually a complex system.
• Is generally more global and comprehensive than experimental research. A system is evaluated
by comparison with its goals; both intended consequences and unintended outcomes must be
assessed. Often includes a cost-benefit analysis.
• Examples include evaluating a new training program, evaluating a new design for rapid transit
vehicles, and evaluating a new computer information management system.
Several fundamental decisions must be made in order to plan and execute the research properly: picking a
research setting, selecting variables, choosing a sample of subjects, and deciding how the data will be
collected and analyzed.
Research setting
The researcher must decide whether to carry out the study in a laboratory setting, in the real world
(i.e., field research), or using simulations of the real world.
Laboratory setting:
Advantage: experimental control. Extraneous variables can be controlled, the experiment can be
replicated almost at will, and data collection can be made more precise.
Disadvantage: some realism and generalizability are sacrificed.
Field research:
Advantage: realism in terms of relevant task variables, environmental constraints, and subject
characteristics (including motivation), so there is a better chance that the results can be generalized
to the real-world operational environment.
Disadvantage: cost, safety hazards for subjects, and lack of experimental control (there is often no
opportunity to replicate the experiment a sufficient number of times, many variables cannot be held
constant, and often certain data cannot be collected because the process would be too disruptive).
IME 135 Ergonomics
Engr. Mary Grace O. Catong, IE, EnP, MS MSE
Simulation is an attempt to combine the generalizability of field research with the control of laboratory
research.
a. Physical simulation is usually constructed of hardware and represents some system, procedure, or
environment. It can range from very simple items (such as a picture of a control panel) to extremely
complex configurations (such as a moving-base jumbo jet flight simulator with elaborate
out-of-cockpit visual display capabilities).
b. Computer simulation involves modeling a process or series of events in a computer. By changing the
parameters, the model can be run and predicted results can be obtained. Developing an accurate
computer model requires a thorough understanding of the system being modeled and usually
requires the modeler to make some simplifying assumptions about how the real-world system
operates.
Types of Variables
Sampling
The critical concern when selecting a sample is that the information obtained from the sample be
generalizable to some hypothetical population. A sample must be representative of the population.
Representative means that the sample should contain all the relevant aspects of the population in the
same proportion as found in the population. For example, if in the population of coal miners 30% are
under 21 years of age, 40% are between 21 and 40, and 30% are over 40 years of age, then the sample—
to be representative—should also contain the same percentages of each age group. A sample that is not
representative is said to be biased.
The sample must be selected randomly from the population. Random selection occurs when each
member of the population has an equal chance of being included in the sample.
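The coal miner example can be sketched in Python. This is only an illustration under stated assumptions: the function names, the population of 1,000 ages, and the sample size of 100 are all hypothetical, and proportional stratified sampling is one common way (not the only way) to obtain a sample that is both random and representative. Within each age group members are drawn at random, and because every group is sampled at the same rate, each member of the population has the same chance of being selected.

```python
import random


def age_group(age):
    """Assign an age to one of the three groups used in the coal miner example."""
    if age < 21:
        return "under 21"
    if age <= 40:
        return "21 to 40"
    return "over 40"


def proportional_stratified_sample(population, stratum_of, sample_size, seed=None):
    """Randomly sample each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        # Rounding can make the total differ slightly from sample_size
        # when the proportions do not divide evenly.
        quota = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, quota))
    return sample


# Hypothetical population: 30% under 21, 40% aged 21 to 40, 30% over 40.
ages = [19] * 300 + [30] * 400 + [50] * 300
sample = proportional_stratified_sample(ages, age_group, sample_size=100, seed=1)

counts = {}
for age in sample:
    counts[age_group(age)] = counts.get(age_group(age), 0) + 1
print(counts)  # {'under 21': 30, '21 to 40': 40, 'over 40': 30}
```

A simple random sample of the whole population would give these proportions only approximately; stratifying first guarantees them.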
Data Collection
Observers must be trained in what to observe and how to record it. In field research, extra attention must
be paid to designing data collection procedures and devices that will work in the often unpredictable and
unaccommodating field setting.
Data Analysis
Analyze the data to see what relationships there are between and among the independent and dependent
variables using appropriate statistical analyses.
Standard deviation--a measure for quantifying the degree of variability among cases, such as errors made,
heights of people, or scores on tests.
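As a small worked example, the standard deviation can be computed directly from its definition. The error counts below are made-up numbers, and the function name `std_dev` is a hypothetical helper. The `sample` flag chooses between dividing by n - 1 (the usual choice when the cases are a sample) and dividing by n (when the cases are the whole population).

```python
import math


def std_dev(values, sample=True):
    """Square root of the average squared deviation from the mean."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((v - mean) ** 2 for v in values)
    divisor = n - 1 if sample else n
    return math.sqrt(squared_deviations / divisor)


# Hypothetical number of errors made by eight operators.
errors_made = [2, 4, 4, 4, 5, 5, 7, 9]

print(std_dev(errors_made, sample=False))  # 2.0
print(round(std_dev(errors_made), 2))      # 2.14
```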
Correlation--a coefficient of correlation is a measure of the degree of relationship between two variables.
It can range from +1.00 (a perfect positive correlation) through zero (absence of any relationship) to
-1.00 (a perfect negative correlation). A positive correlation between two variables indicates that high
values on one variable tend to be associated with high values on the other.
For example, height and weight are positively correlated. A negative correlation between two variables
indicates that high values on one variable tend to be associated with low values of the other variable.
An example is stimulus intensity and reaction time, which tend to be negatively correlated.
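The notes do not say which coefficient is meant. The sketch below assumes the Pearson product-moment coefficient, which is the most common one. The intensity and reaction-time values are invented for illustration only, and `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    covariation = sum(a * b for a, b in zip(dx, dy))
    return covariation / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


# Hypothetical data: as stimulus intensity rises, reaction time (ms) falls.
intensity = [1, 2, 3, 4, 5]
reaction_time_ms = [320, 300, 290, 270, 260]

r = pearson_r(intensity, reaction_time_ms)
print(r < 0)  # True: the two variables are negatively correlated
```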
Statistical significance--refers to the probability that a result at least as large as the one obtained
could have occurred by chance alone. If a difference is significant at the 1 percent level, this means
that the obtained difference is of such a magnitude that it could have occurred by chance only 1 time
out of 100.
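One way to make "could have occurred by chance" concrete is a permutation test. That technique is not named in these notes and is used here only as an illustration. The scores are invented and `permutation_p_value` is a hypothetical helper. The idea is that if the two conditions really made no difference, every way of splitting the pooled scores into two groups would be equally likely, so we count how often a split produces a difference in means at least as large as the one observed.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference between group means."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    extreme = 0
    splits = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        difference = abs(sum_a / n_a - (total - sum_a) / n_b)
        if difference >= observed - 1e-12:  # small tolerance for float error
            extreme += 1
        splits += 1
    return extreme / splits


# Hypothetical task scores under two conditions.
condition_a = [1, 2, 3, 4, 5]
condition_b = [6, 7, 8, 9, 10]

p = permutation_p_value(condition_a, condition_b)
print(round(p, 4))  # 0.0079, i.e. significant at the 1 percent level
```

Enumerating every split is only practical for small groups; with larger samples a random subset of splits, or a conventional test such as the t test, would normally be used instead.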
Percentiles--correspond to the value of a variable below which a specific percentage of the group falls.
For example, the 5th percentile standing height for males is 162 cm. This means that only 5 percent of
males are shorter than 162 cm. The 50th percentile male height is 173 cm, which is the same as the median,
since 50 percent of males are shorter than this value and 50 percent are taller. The 95th percentile is 185
cm, meaning that 95 percent of males are shorter than this height. The concept of percentile is especially
important in using anthropometric (body dimension) data for designing objects, workstations, and
facilities.
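A percentile can be read off a set of measurements once they are sorted. The sketch below assumes the "nearest-rank" convention, which is one of several definitions in use. The 20 heights are invented and are not the survey data behind the figures quoted above; `nearest_rank_percentile` is a hypothetical helper name.

```python
import math


def nearest_rank_percentile(values, p):
    """Smallest value such that at least p percent of the cases are at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical standing heights (cm) of 20 people.
heights_cm = [160, 162, 165, 167, 168, 170, 171, 172, 173, 173,
              174, 175, 176, 177, 178, 180, 182, 184, 185, 190]

print(nearest_rank_percentile(heights_cm, 5))   # 160
print(nearest_rank_percentile(heights_cm, 50))  # 173
print(nearest_rank_percentile(heights_cm, 95))  # 185
```

In design work the 5th and 95th percentiles are often used together, so that a workstation accommodates roughly the middle 90 percent of users on that dimension.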
The criterion (or dependent variable) as used in research is a measure of the possible effects of the
independent variable.
Types of criteria: In general terms the criteria used in human factors research and systems development
are of two types: human criteria and system criteria.
Human Criteria. There are four relatively different types of human criteria: (1) human performance
measures, (2) physiological indices, (3) subjective responses, and (4) accident frequency.
In a strict sense human performance must be considered in terms of various sensory, mental, and motor
activities. In specific work situations, however, it is usually difficult, if not impossible, to measure human
performance strictly in terms of human activity, since such performance usually is inextricably
intertwined with the performance characteristics of the physical equipment being used. Thus, the typing
performance of a typist is not entirely a function of the typist but also in part the consequence of the
typewriter (its make, condition, etc.).
Human performance measures are usually frequency measures (e.g., number of targets detected, number
of keystrokes made, or number of times the "help" screen was used), intensity measures (e.g., torque
produced on a steering wheel), latency measures (e.g., reaction time or delay in switching from one
activity to another), or duration measures (e.g., time to log on to a computer system or time on target in
a tracking task).
For some purposes indices of various physiological conditions are pertinent criteria. Such possible
indices include heart rate, blood pressure, composition of the blood, galvanic skin response, brain waves,
respiration rate, skin temperature, blood sugar, and many other measures. Some of these and other
physiological variables are used as indices of the physiological effects on people of various methods of
work, of work performed with equipment of various designs, of work periods, and of work performed under
various environmental situations (such as heat and cold).
Physiological indices are often used to measure strain in humans resulting from physical or mental work
and from environmental influences, such as heat, vibration, noise, and acceleration. Physiological indices
can be classified by the major biological systems of the body: cardiovascular (e.g., heart rate or blood
pressure), respiratory (e.g., respiration rate or oxygen consumption), nervous (e.g., electric brain
potentials or muscle activity), sensory (e.g., visual acuity, blink rate, or hearing acuity), and blood
chemistry (e.g., catecholamines).
For some purposes the subjective responses of people can serve as appropriate criteria; examples are
ratings of the performance of individuals, of alternative design features of a system, of the judged
importance of different types of information for use in a system, and of the comfort of seats.
For still other purposes accident or injury frequency may serve as appropriate criteria. For example, the
number of injuries or deaths per million miles traveled gives a comparison (in terms of this criterion) of
various types of transportation systems, such as commercial airlines, railways, buses and automobiles.
System Criteria. Basically, system criteria are those that relate to the performance of the system (or
subsystem or component thereof) or, in other words, those that reflect something about the degree to
which the system (or subsystem or component) achieves what it is intended to achieve. For example, a
computer keyboard might be evaluated in terms of such criteria as number and accuracy of data entries
made per unit of time, and an earth-moving vehicle might be evaluated in terms of the amount of earth
moved per unit of time. Other examples of system criteria are the anticipated life of a system, ease of
operation or use, maintainability, reliability, operating cost, and human resource requirements. Some
such criteria are rather strictly mechanistic in the sense that they reflect essentially engineering
performance (e.g., the maximum rpm of an engine), whereas others reflect more the performance of the
system as it is used by the people involved in it (such as errors in cards punched).
The two classes of criteria do not form a neat dichotomy, but rather tend to form a continuum ranging from
strictly mechanistic system criteria at one end to strictly behavioral criteria at the other end.
Criteria used in research investigations generally should fulfill certain requirements, namely validity,
reliability, freedom from contamination, and sensitivity of measurement.
Reliability. This refers to the consistency or stability of the measurements of a variable over time or
across representative samples.
Validity.
The validity of a criterion refers to the extent to which the measure in question is considered to be a
relevant or pertinent index of the criterion in mind, such as system performance, quality of work, comfort
in seating, or job satisfaction.
For example, body temperature is easy to measure and very reliable, but it would not be a good measure
of mental workload. Heart rate variability is also reliable and easy to measure, and it would be a good
indicator of mental workload.
Freedom from Contamination. A criterion measure should not be influenced by variables that are
extraneous to the variable that is being measured.
Sensitivity of Measurement.
A dependent variable must have sensitivity so that one can distinguish between different levels of the
independent variables. A criterion measure should be measured in units that are commensurate with the
anticipated differences among subjects.
One type of measure is the capacity a driver has to perform two tasks at the same time, or dual-task
capacity. To investigate these types of problems Michon (1967) used a tapping task. The idea is that the
driver can establish an even rhythm in verbal tapping, saying “ta-ta-ta” at a rate of one “ta” per second.
But when traffic conditions become difficult the rhythm becomes irregular, because there is not enough
mental capacity for both driving and tapping. If this particular measure can distinguish between the
levels of difficulty of driving in different road or traffic environments, then it is a sensitive measure.
Example:
Suppose that a human factors specialist in King Arthur's court were commanded to assess the combat
skills of the Knights of the Roundtable. To do this, the specialist might have each knight shoot a single
arrow at a target and record the distance off target as the measure of combat skill. If all the knights were
measured one day and again the next, quite likely the scores would be quite different on the two days.
The best archer on the first day could be the worst on the second day. We would say that the measure
was unreliable. Much, however, could be done to improve the reliability of the measure, including having
each knight shoot 10 arrows each day and using the average distance off target as the measure, being
sure all the arrows were straight and the feathers set properly, and performing the archery inside the
castle to reduce the variability in wind, lighting, and other conditions that could change a knight's
performance from day to day.
Correlating the sets of scores from the two days would yield an estimate of the reliability of the measure.
Generally speaking, test-retest reliability correlations around .80 or above are considered satisfactory,
although with some measures we have to be satisfied with lower levels.
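The test-retest check can be sketched as follows. The scores are invented: each is assumed to be one knight's average distance off target over 10 arrows on that day, in arbitrary units. The helper names and the use of the Pearson coefficient are assumptions made for illustration; only the .80 benchmark comes from the text above.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    covariation = sum(a * b for a, b in zip(dx, dy))
    return covariation / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def is_satisfactory(reliability, benchmark=0.80):
    """Compare a test-retest correlation with the rule-of-thumb benchmark."""
    return reliability >= benchmark


# Hypothetical average distance off target for five knights on two days.
day_1 = [12.0, 30.5, 8.2, 22.1, 15.4]
day_2 = [14.1, 27.9, 9.0, 25.3, 13.8]

reliability = pearson_r(day_1, day_2)
print(round(reliability, 2), is_satisfactory(reliability))
```

If single-arrow scores were used instead of averages, the same calculation would be expected to give a much lower correlation, which is the sense in which that measure is unreliable.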
In our Knights of the Roundtable example, accuracy in shooting arrows at stationary targets would have
only slight validity as a measure of actual combat skill.
In our example of the knights, wind conditions, illumination, and quality of the arrows could be sources
of contamination because they could affect accuracy yet are unrelated to the concept being measured,
namely combat skill.
To continue with our example of the knights, if the distance off target were measured to the nearest yard,
it is possible that few, if any, differences between the knights’ performance would have been found. The
scale (to the nearest yard) would have been too gross to detect the subtle differences in skill between the
archers.
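The effect of a scale that is too coarse can be shown with a few made-up numbers. The distances below are hypothetical, and `distinct_levels` is an illustrative helper that counts how many different scores remain after rounding to a given unit.

```python
def distinct_levels(scores, unit):
    """Count the different values left after rounding each score to the nearest `unit`."""
    return len({round(score / unit) for score in scores})


# Hypothetical average distances off target for five knights, in yards.
distances_yd = [0.10, 0.18, 0.25, 0.31, 0.42]

print(distinct_levels(distances_yd, 1.0))     # 1: to the nearest yard, all knights look the same
print(distinct_levels(distances_yd, 1 / 36))  # 5: to the nearest inch, every knight differs
```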