Winter 2002 - HaPI Database
The BEHAVIORAL MEASUREMENT Letter
Behavioral Measurement Database Services
Vol. 7, No. 2, Winter 2002

Enriching the health and behavioral sciences by broadening instrument access

Introduction to This Issue

This issue of The Behavioral Measurement Letter has three featured articles: one describes a process through which measurement instruments may be improved in various ways and outlines a model for revising instruments; another discusses self-estimated intelligence, gender differences in overestimations and underestimations of intelligence, and possible explanations for and consequences of these differences; the third presents ten very practical guidelines -- a list of "do's and don'ts" -- for selecting measurement instruments that will best meet one's needs.

In "Refinements to the Lubben Social Network Scale: The LSNS-R," James Lubben, Melanie Gironda, and Alex Lee describe in detail the process of successive changes, and analyses thereof, through which they modified the Lubben Social Network Scale to improve it in various ways. The process involved performing principal component analysis to determine which items in the instrument contribute to the underlying factors being measured by the instrument, and thus which items best measure what they want the instrument to measure. This analytical process resulted in a "cleaner and meaner" version of the LSNS, the LSNS-R. In addition, by detailing the process of successive modifications and mathematical analyses that produced the LSNS-R, the authors outline a model for instrument revision that is broadly applicable.

In a piece titled "Self-Estimated Intelligence," Adrian Furnham from London University reviews literature pertaining to estimation of intelligence by oneself and others, and then compares, contrasts, and offers explanations for the reported findings. The literature he reviewed shows that there is a gender difference in intelligence estimation -- females tend to underestimate their IQ and that of other females, while males, on the other hand, tend to overestimate their IQ and that of other males -- and that this gender difference exists across cultures and nationalities, across socioeconomic classes, and across age groups. He then explores possible explanations for the gender difference and suggests possible life consequences of over- and under-estimating one's abilities.

Our regular contributor, Fred Bryant, presents a set of guidelines for choosing measurement tools, "Ten Commandments for Selecting Self-Report Instruments." Some of the guidelines may seem to be common sense, and many readers will be familiar with their basic content. Those experienced in instrument selection may see mistakes they've made in the past (and vowed never to make again). In any case, all of our readers, from neophytes to seasoned instrument users, will find value in the piece -- if not something they hadn't known previously, then, certainly, practical reminders of the numerous subtleties, cautions, and pitfalls that should be attended to while selecting a measurement instrument.

Address comments and suggestions to The Editor, The Behavioral Measurement Letter, Behavioral Measurement Database Services, PO Box 110287, Pittsburgh, PA 15232-0787. If warranted and as space permits, your communication may appear as a letter to the editor. Whether published or not, your feedback will be attended to and appreciated. We also accept short manuscripts for The BML. Submit, at any time, a brief article, opinion piece, or book review on a BML-relevant topic to The Editor at the above address. Each submission will be given careful consideration for possible publication.

HaPI reading . . .

Al K. DeRoy, Editor

Staff
Director: Evelyn Perloff, PhD
Newsletter Editor: Al K. DeRoy, PhD
Database Coordinator: Barbara L. Brooks, MLS
Indexer/Reviewer: Jacqueline A. Reynolds, BA
Analyst: Janelle Price-Kelly, BA
Computer Consultant: Alfred A. Cecchetti, MS
Measurement Consultant: Fred B. Bryant, PhD
Information Specialist Consultant: Ellen Detlefson, PhD
Business Services Manager: Diane Cadwell
Copy Assessor: Tracey Handlovic, BA
File Processor: Betty Hubbard
Phone: (412) 687-6850
Fax: (412) 687-5213
E-mail: [email protected]

"Since the measuring device has been constructed by the observer ... we have to remember that what we observe is not nature in itself but nature exposed to our method of questioning."
Werner Karl Heisenberg

Refinements to the Lubben Social Network Scale: The LSNS-R

James Lubben, Melanie Gironda, and Alex Lee

Increased interest in social support networks during the past decade spawned the development of many new social network scales. However, psychometric properties of these instruments are generally inadequately reported, and there are few reported attempts to refine them. The work presented here addresses both of these concerns in examining the psychometric properties of the Lubben Social Network Scale (Lubben, 1988) and creating a revised version, the LSNS-R. In addition, the analytic plan of procedure used in this work is a model that could be employed to evaluate and refine other social and behavioral measures.

O'Reilly (1988) lamented that most social support network assessment instruments have inadequate clarity of definition and also lack reported reliability and validity statistics. Similarly, a common criticism of many studies that examine social support networks is that they employ instruments with unknown or unreported psychometric properties (Winemiller, Mitchell, Sutliff, & Cline, 1993). The work discussed below addresses these concerns by examining psychometric properties of the Lubben Social Network Scale (Lubben, 1988) and putting forth a refinement of the LSNS, the LSNS-R (i.e., the "revised" version of the LSNS).

Given growing consensus on the importance of social support networks to health and well-being, developing and using consistent tools to conduct assessments of such networks are becoming ever more crucial to gerontological research and geriatric practice (House, Landis, & Umberson, 1988; Steiner, Raube, Stuck, Aronow, Draper, Rubenstein, & Beck, 1996; Glass, Mendes de Leon, Seeman, & Berkman, 1997).

Lubben Social Network Scale

The Lubben Social Network Scale (LSNS; Lubben, 1988) has been used in a wide array of studies, and in both research and practice settings, since it was first reported more than a decade ago (Lubben; Lubben, Weiler, & Chi, 1989; Siegel, 1990; Mor-Barak & Miller, 1991; Mor-Barak, Miller, & Syme, 1991; Potts, Hurwicz, Goldstein, & Berkanovic, 1992; Hurwicz & Berkanovic, 1993; Rubenstein, L. et al., 1994; Rubenstein, R., Lubben, & Mintzer, 1994; Dorfman, Walters, Burke, Hardin, & Karanik, 1995; Luggen & Rini, 1995; Lubben & Gironda, 1997; Okwumabua, Baker, Wong, & Pilgrim, 1997; Mor-Barak, 1997; Gironda, Lubben, & Atchison, 1998; Chou & Chi, 1999; Martire, Schulz, Mittelmark, & Newsom, 1999; Mistry, Rosansky, McGuire, McDermott, & Jarvik, 2001).

The LSNS total scale score is computed by summing ten equally weighted items that quantify structural and functional aspects of primary social relationships. Scores for each LSNS item range from zero to five, with lower scores indicating smaller networks. The scale has been found to have relatively good internal consistency among a widely diverse set of study populations (α = 0.70). Factor analyses on the LSNS suggest that it measures three different types of social networks: family networks, friendship networks, and interdependent relationships (Lubben, 1988; Lubben & Gironda, 1997).
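The total-score rule just described (ten equally weighted items, each scored zero to five, lower totals indicating smaller networks) can be sketched in code. This is an illustrative sketch only; the function name and validation are ours, not the published scoring key.

```python
# Sketch of LSNS-style total scoring: ten equally weighted items,
# each already coded 0-5, summed to a 0-50 total where lower scores
# indicate smaller social support networks. Illustrative only.
def lsns_total(responses):
    """Sum ten item scores, each expected to lie in 0..5."""
    if len(responses) != 10:
        raise ValueError("the LSNS has exactly ten items")
    for score in responses:
        if not 0 <= score <= 5:
            raise ValueError("each item score must be between 0 and 5")
    return sum(responses)

print(lsns_total([3, 4, 2, 5, 1, 0, 3, 2, 4, 5]))  # 29
```

Because the items are equally weighted, no item carries more influence on the total than any other, which is what makes the alpha and factor analyses below meaningful comparisons across scale variants.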
Further, the LSNS has been employed in a variety of ways, including use as a control variable as well as an outcome variable in health and social science studies. It also has been used as a screening tool for health risk appraisals and as a "gold" standard by which to evaluate other social network assessment instruments.

Methodology

The main purpose of the work presented here is to address deficits that became apparent in the original LSNS as it was used with diverse populations over the past decade. But because the LSNS was found to have relatively stable reliability and validity across this wide array of settings, any proposed modifications were not to jeopardize the relatively strong psychometric properties of the original LSNS.

The LSNS was developed as an adaptation of the Berkman-Syme Social Network Index (BSSNI; Berkman & Syme, 1979). Whereas the LSNS was developed specifically for use among elderly populations, the BSSNI was initially developed for a study of an adult population that purposefully excluded older persons. The LSNS is based on items borrowed from questionnaires used in the original epidemiological study for which the BSSNI was constructed. However, the LSNS excluded BSSNI items dealing with secondary social relationships (viz., group and church membership) because these organizational participation items showed limited variance when used with older populations, especially those having large numbers of frail elderly persons (Lubben, 1988). In contrast, the LSNS elaborated on an array of items dealing with the nature of relationships with family and friends, in view of the growing body of empirical data suggesting that the structure and functions of kinship and friendship networks are particularly salient to the health and well-being of older persons.

Reliability is a fundamental issue in psychological measurement (Nunnally, 1978).
One important type of measurement reliability is internal consistency, i.e., the extent to which items within a scale relate to the latent variable being measured (DeVellis, 1991; Streiner & Norman, 1995). Cronbach's (1951) coefficient alpha was chosen to examine the internal consistency of the LSNS and of modifications designed to improve upon the original version. The acceptable range of coefficient alpha values employed here was 0.70 to 0.90 (Nunnally; DeVellis) because assessment instruments with reliability scores higher than 0.90 are likely to suffer from excessive redundancy, whereas those with alpha less than 0.70 are likely to be unreliable (Streiner & Norman). A further test of item homogeneity used was the item-total test score correlation (DeVellis; Streiner & Norman). Here acceptable values of the item-total score correlation were 0.20 and greater (Streiner & Norman).

Principal component analysis looks for underlying (latent) components that account for most of the variance of a scale (Stevens, 1992). Principal component analysis with varimax rotation was used here to explore the component structure of various versions of the LSNS to see if the modified versions conformed in actuality to the hypothesized structure. Although more sophisticated methods exist to examine factor or latent variable structures, such as maximum likelihood factor analysis and confirmatory factor analysis, many scholars contend that principal component analysis is both adequate and more practical than more sophisticated techniques, for principal component analysis is mathematically easier to manage, easier to interpret, and yields results similar to those from maximum likelihood factor analysis (Nunnally, 1978; Stevens). The size of the sample used in the analyses discussed below is adequate to conduct principal component analysis according to general sample size guidelines (Stevens; Guadagnoli & Velicer, 1988).

Four Objectives Used in Refining the LSNS

The work to refine the LSNS has four principal objectives. One was to distinguish between and better specify the nature of family and friendship social networks. A second was to replace, where feasible, items in the original LSNS that have small statistical variance. A third objective was to disaggregate "double-barreled" questions. The fourth was to produce a parsimonious instrument to encourage and facilitate its use in research and practice settings where time constraints or other issues preclude using longer social support network instruments.

Plan and Procedures

Production of the LSNS-R progressed along a series of four analytical steps to address the objectives stated above. In each step, alpha reliability coefficients and the results of a series of principal component factor analyses were examined to determine whether and the extent to which items corresponded to the latent structural components of family networks and friendship networks. The four steps are summarized in Figure 1. In the first step, reliability statistics were obtained for the original LSNS administered to the sample described below. These values then served as reference points for comparison with values obtained for subsequent modifications.
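The two internal-consistency criteria described above (coefficient alpha in the 0.70 to 0.90 band, and corrected item-total correlations of 0.20 or greater) can be sketched as follows. This is a minimal numpy illustration on fabricated data, not the authors' analysis; the function and variable names are ours.

```python
import numpy as np

def cronbach_alpha(x):
    """Cronbach's alpha for a respondents-by-items matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total)."""
    k = x.shape[1]
    item_vars = x.var(axis=0, ddof=1)
    total_var = x.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def corrected_item_total(x):
    """Correlate each item with the sum of the REMAINING items,
    so the item being tested does not inflate its own correlation."""
    total = x.sum(axis=1)
    return np.array([np.corrcoef(x[:, j], total - x[:, j])[0, 1]
                     for j in range(x.shape[1])])

# Fabricated data: five items that each reflect one latent variable
# plus independent noise, for 200 simulated respondents.
rng = np.random.default_rng(0)
latent = rng.normal(size=200)
items = np.column_stack([latent + rng.normal(size=200) for _ in range(5)])

print(round(float(cronbach_alpha(items)), 2))
print(corrected_item_total(items).round(2))
```

With this construction, alpha should land comfortably inside the 0.70-0.90 band and every corrected item-total correlation should clear the 0.20 floor, mirroring the acceptance criteria the authors applied at each revision step.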
In the second analytical step, two items from the original LSNS scale, L9 ("Helps others with various tasks") and L10 ("Living arrangements"), were dropped because they demonstrated limited response variation among a number of sample groups, including the present one. Furthermore, neither of these items helps to distinguish between family networks and friendship networks better than the other items in the original LSNS. With regard to the first two objectives above, it should be noted that social and behavioral measures are purposely designed to discriminate among groups for a certain construct, and so lack of variation within a given item limits a scale's ability to identify and discriminate among variations (DeVellis, 1991; Streiner & Norman, 1995). Thus, eliminating items with limited statistical variance generally increases a scale's overall sensitivity and specificity, and that in turn improves its effectiveness in measuring constructs of interest (McDowell & Newell, 1987; DeVellis).

The L9 item was originally included in the LSNS in part because social exchange theory suggests that a reciprocal social relationship is stronger than one that is unidirectional (Jung, 1990; Burgess & Huston, 1979). Thus, rather than only capturing what others do for the older person being assessed, it is desirable to include items that also assess what the older person does for other people, i.e., items should be included to assess reciprocity of social support within kinship and friendship networks. However, in past studies L9 generally demonstrated insufficient item variance and thus was a good candidate for elimination or replacement.

The L10 item on living arrangements also had not worked out well over time. When the original LSNS was constructed, both living arrangements and marital status were common items included in measures of social support networks. It therefore seemed entirely appropriate to include an item merging these two related constructs. However, the L10 item has been the worst performing item on the LSNS across different settings. Part of the problem has been scoring it, which is constrained by the limited number of response options available as well as by disagreements among scorers in assigning ordinal weights to specific response options. Perhaps most important, marital status and living arrangements are generally not malleable nor appropriate for intervention, so items concerning them should not be included in any case.

In the third analytical step, two "double-barreled" questions were disaggregated. Double-barreled items are those in which two different questions are contained in one item. Such items often confuse respondents because they are not sure as to which aspect of the double-barreled question they should respond (DeVellis, 1991; Streiner & Norman, 1995). Disaggregating double-barreled questions as per the third objective not only helps respondents in answering, it allows researchers to determine the extent to which each part of the original question helps to define a particular construct.

The two items, L3 (relatives) and L4 (friends), ask, "How many (relatives) (friends) do you feel close to? That is, how many of them do you feel at ease with, can talk to about private matters, or call on for help?" In this step, L3 and L4 were each recast as two distinct questions. One asks, "How many (family members) (friends) do you feel at ease with such that you can talk to them about private matters?" whereas the other asks, "How many (family members) (friends) do you feel close to such that you can call on them for help?" The first of these substitute questions examines somewhat intangible or expressed support, whereas the other taps into more tangible support, such as help with running an errand. Both types of support have been suggested as important aspects of social support networks (Litwak, 1985; Sauer & Coward, 1985).

In the fourth step, items that identify both the targets and sources of respondents' confidant relationships were constructed and tested. Here the two confidant relationship items in the original LSNS (L7 and L8) were recast to distinguish between confidant relationships with family members and those with friends. These changes recognize that confidant relationships with family members may serve different functions than confidant relationships with friends (Keith, Hill, Goudy, & Power, 1984). The final result of the four-step process is a revised version of the LSNS, the LSNS-R.

Figure 1. Plan of Analysis

Step 1: Analyze original LSNS
  Original LSNS items:
  L1  Family: Number seen or heard from per month
  L2  Family: Frequency of contact with family member most in contact
  L3  Family: Number feel close to, talk about private matters, call on for help
  L4  Friends: Number feel close to, talk about private matters, call on for help
  L5  Friends: Number seen or heard from per month
  L6  Friends: Frequency of contact with friend most in contact
  L7  Confidant: Has someone to talk to when have important decision to make
  L8  Confidant: Others talk to respondent when they have important decision to make
  L9  Helps others
  L10 Living arrangements

Step 2: Eliminate items with limited variation
  Items eliminated: L9 and L10

Step 3: Uncouple double-barreled questions
  Items modified: L3 and L4 each split into two separate questions
  L3A Family: Number feel at ease with whom you can talk about private matters
  L3B Family: Number feel close to whom you can call on for help
  L4A Friends: Number feel at ease with whom you can talk about private matters
  L4B Friends: Number feel close to whom you can call on for help

Step 4: Distinguish between source and target confidant relationships with family and friends
  Items modified: L7 and L8 each split into separate questions for family and friends
  L7A Family: Respondent functions as confidant to other family members
  L7B Friends: Respondent functions as confidant to friends
  L8A Family: Respondent has family confidant
  L8B Friends: Respondent has friend who is a confidant
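The analytic machinery applied to each of these step products (principal component extraction, retention of components with eigenvalues over one, and varimax rotation) can be sketched as follows. This is a minimal numpy illustration on simulated two-cluster data, not the authors' software or results; all names here are ours.

```python
import numpy as np

def pca_loadings(x, n_components=None):
    """Principal components of the item correlation matrix.
    Loadings are eigenvectors scaled by sqrt(eigenvalue); by default,
    components are retained by the Kaiser eigenvalue > 1 rule."""
    r = np.corrcoef(x, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(r)
    order = np.argsort(eigvals)[::-1]          # eigh returns ascending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    k = n_components or int((eigvals > 1.0).sum())
    return eigvecs[:, :k] * np.sqrt(eigvals[:k]), eigvals

def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation of a loading matrix."""
    p, k = loadings.shape
    rot, d = np.eye(k), 0.0
    for _ in range(max_iter):
        lam = loadings @ rot
        u, s, vt = np.linalg.svd(loadings.T @ (
            lam ** 3 - lam @ np.diag((lam ** 2).sum(axis=0)) / p))
        rot = u @ vt
        d_new = s.sum()
        if d_new < d * (1 + tol):              # negligible improvement
            break
        d = d_new
    return loadings @ rot

# Simulated data: three "family-like" items and three "friend-like"
# items, driven by two uncorrelated latent variables plus noise.
rng = np.random.default_rng(1)
fam, fri = rng.normal(size=(2, 300))
x = np.column_stack(
    [fam + rng.normal(scale=0.8, size=300) for _ in range(3)] +
    [fri + rng.normal(scale=0.8, size=300) for _ in range(3)])

loadings, eigvals = pca_loadings(x)            # Kaiser rule keeps 2
rotated = varimax(loadings)
print(rotated.round(2))
```

After rotation, each simulated item should load heavily on exactly one of the two components and weakly on the other, the "clean factor" pattern the authors look for in Tables 2 through 5.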
Data Source

The data are from a survey of older white, non-Hispanic Americans in Los Angeles County, California, done between June and November 1993. A self-weighting, multistage probability sample was selected from 861 census tracts in the area in which the white, non-Hispanic population exceeded any other single racial or ethnic group. This sampling strategy ensured a high level of homogeneity in the sample. The first three sampling stages were: (a) random selection of tracts, (b) random selection of blocks, and (c) random selection of households within selected blocks. Households were then contacted by telephone to determine the age and ethnicity of household members. All white, non-Hispanic persons aged 65 or over in each household were thus identified, and then potential participants were randomly selected from this pool. Of the 265 older persons thus selected, 76 percent agreed to be interviewed, resulting in a final sample of 201. The sample included 130 women (65%) and 71 men (35%) and had a mean age of 75.3. Additional details on the sample are reported elsewhere (Villa, Wallace, Moon, & Lubben, 1997; Moon, Lubben, & Villa, 1998; Pourat, Lubben, Wallace, & Moon, 1999; Pourat, Lubben, Yu, & Wallace, 2000).

Factor Analysis

Principal component factor analyses were performed for each step of the revision to explore for latent factors and to determine whether the final modified version has latent structural components corresponding to both kinship and friendship networks. The number of factors found in each step of the analysis was determined by considering factors with eigenvalues over one (Kaiser, 1960) and by identifying the elbow in the scree plot (Cattell, 1966). The factor loadings were subjected to varimax rotation.

Results

Cronbach Alpha Values for the Products of Each Step

Table 1 presents internal consistency values for the products of the four analytical steps. The Cronbach alpha value for the original 10-item LSNS scale administered to the present sample (α = 0.66) is slightly lower than those previously reported (Lubben, 1988; Lubben & Gironda, 1997) and below the desired standard for internal consistency. The Cronbach alpha values increased in each subsequent step in the analysis, indicating that each successive modification contributed to improving the final product's internal consistency.

Table 1
Cronbach's Alpha Value by Product of Each Step of Analysis

Step                                              Alpha
1. Original LSNS; 10-item scale                    .66
2. Items L9 and L10 dropped; 8-item scale          .67
3. L3 & L4 split; 10-item scale                    .73
4. L7 & L8 split; 12-item scale (LSNS-R)           .78

Although the variant produced in Step 2 has two fewer items than the original LSNS, its alpha value is slightly higher, suggesting that dropping items L9 and L10 was appropriate. Further, the greatly improved alpha value obtained for the variant produced in Step 3 indicates that disaggregation of "double-barreled" questions (items L3 and L4) was quite beneficial. In Step 4, the product resulting from including items that distinguish between source and target confidant relationships with family members and those with friends further increased internal consistency.

Table 2 shows the rotated factor matrix for the original LSNS administered to the present sample. Although previous studies have reported three factors (Lubben, 1988; Lubben & Gironda, 1997), the rotated factor structure for the LSNS here showed a two-factor solution, with one factor consisting largely of family-related items and the other consisting primarily of items concerning friendships. However, the former, i.e., the family factor, also incorporates items concerning confidant relationships and the "helps others" item.
Both confidant items clearly load onto the family factor in this step, but the "living arrangements" item cross-loads on both the family and friend factors.

Table 2
Step 1: Original 10-Item LSNS Factor Matrix

Item                                                                  Family Factor   Friend Factor
L3   Family: discuss private matters/call on for help                     .7298           .1242
L8   Is confidant                                                         .6743           .0377
L1   Family: number in contact                                            .6633          -.0639
L2   Family: frequency of contact with family member most in contact      .6629          -.0761
L9   Helps others                                                         .5702          -.0269
L7   Has confidant                                                        .5391           .2047
L4   Friends: discuss private matters/call on for help                    .2606           .7807
L5   Friends: number in contact                                           .2072           .7520
L6   Friends: frequency of contact with friend most in contact           -.1082           .5225
L10  Living arrangements                                                 -.3490           .4461

In the second step, items L9 and L10 were eliminated as planned due to their general poor performance in previously discussed studies. Similar problems are demonstrated in the current study by the heavy cross-loading found for L10 in Step 1. As in Step 1, a two-factor structure was found (Table 3), and the "confidant" items were found to clearly load on the family factor. No heavy cross-loading was found for the scale variant produced in this step.

Table 3
Step 2: Factor Matrix after Eliminating Items L9 and L10

Item                                                                  Family Factor   Friend Factor
L3   Family: discuss private matters/call on for help                     .7722           .1255
L2   Family: frequency of contact with family member most in contact      .7393          -.1103
L1   Family: number in contact                                            .6807          -.0218
L8   Is confidant                                                         .6455           .0850
L7   Has confidant                                                        .5622           .2040
L4   Friends: discuss private matters/call on for help                    .2105           .8162
L5   Friends: number in contact                                           .1514           .7810
L6   Friends: frequency of contact with friend most in contact           -.1230           .5558

Table 4 shows the factor structure found in Step 3. In this step, the double-barreled items L3 and L4 ("talk about private matters" and "call on for help") were disaggregated. Principal component factor analysis performed here indicated a three-factor solution, with items loading on a family factor, a friendship factor, and a confidant factor. Generally the factors are clean (i.e., for each item, there is predominant loading on one factor and little loading on the others). However, the family "frequency of contact" item (L2) cross-loads on both the family and confidant factors.

Table 4
Step 3: Factor Matrix after Disaggregating Items L3 and L4

Item                                                    Family Factor   Friend Factor   Confidant Factor
L1   Family: number in contact                              .8133           .0029            .0604
L3A  Family: discuss private matters                        .8020           .0959            .2442
L3B  Family: call on for help                               .7775           .1226            .2545
L4B  Friends: call on for help                              .0522           .7984            .1707
L5   Friends: number in contact                             .1873           .7653           -.0319
L4A  Friends: discuss private matters                       .1632           .7504            .0523
L6   Friends: frequency of contact                         -.1817           .4877            .0685
L7   Has confidant                                          .0425           .1524            .8276
L8   Is confidant                                           .2365           .1647            .6689
L2   Family: frequency of contact                           .4253          -.1524            .6166

Step 4 involved distinguishing family and friends as both possible sources and possible targets of confidant relationships, and resulted in the LSNS-R. Principal component factor analysis in this step (Table 5) revealed a single, clean family factor and two friendship factors. The friendship confidant items (L7B, L8B) and the frequency of contact with a friend item (L6) constitute one of the friendship factors, while the remaining friendship items make up a second friendship factor. The item on being able to talk to a friend about private matters (L4A) loads on both friendship factors.

Table 5
Step 4: LSNS-R Factor Matrix

Item                                                                  Family Factor   Friend Factor A   Friend Factor B
L3B  Family: call on for help                                             .7600          -.0352             .0487
L8A  Family: has confidant                                                .7402           .0848             .1907
L7A  Family: is confidant                                                 .7358           .2490            -.1428
L3A  Family: discuss private matters                                      .7345          -.0292             .0915
L2   Family: frequency of contact with family member most in contact      .7134           .0514             .0839
L1   Family: number in contact                                            .6712          -.1576             .0922
L8B  Friends: has confidant                                               .2327           .8800             .0594
L7B  Friends: is confidant                                               -.0024           .8488             .2711
L6   Friends: frequency of contact with friend most in contact           -.0384           .5493             .4920
L5   Friends: number in contact                                           .2320           .1279             .8467
L4B  Friends: call on for help                                           -.1438           .1140             .7663
L4A  Friends: discuss private matters                                     .1565           .1856             .6028

Item-Total Scale Correlations

Item-total scale correlational analysis yielded coefficients ranging from 0.27 to 0.75, indicating that LSNS-R items are sufficiently homogeneous and without excessive redundancy. All internal reliability coefficients fell within the acceptable range suggested by Streiner and Norman (1995). The correlation coefficient between the original LSNS and the LSNS-R was 0.68.

Conclusion

As gerontologists and geriatricians begin to identify means to increase active life expectancy rather than mere life expectancy, it is likely that older persons' social support networks will be shown to be necessary to healthy aging. This means that there is increasing need for a variety of reliable and valid social support scales for use in research and practice settings. The work discussed here should be viewed as part of an ongoing pursuit of such well-constructed social integration scales, for improved measures of social support networks are essential to understanding better the reported link between social integration and health. Such improved knowledge thus will enhance future gerontological research, geriatric care, and the quality of life of the elderly.

From applied research and clinical perspectives, there is growing pressure to develop short and efficient scales. Some elderly populations are unable to complete long questionnaires, and time constraints in most clinical practice settings necessitate use of efficient and effective screening tools. Shorter scales require less time and energy of both the administrator and respondent. Thus, parsimonious and effective screening tools are needed that are acceptable to elders, researchers, and health care providers as well.

For more basic researchers, somewhat longer research instruments are desirable, because having a larger number of items as well as better clarity of concepts generally contributes to a scale's reliability and facilitates the analysis and appraisal of subtle differences. But rather than attempting to design a single social support network scale for use with all elderly populations and for all research purposes, it seems far more practical to design measurement instruments for specific populations along with clear indications of how they should be used (Mitchell & Trickett, 1980). Use of such well-targeted social support network assessment instruments will yield more valid research results than use of less well-targeted measurement tools.

In summary, development and validation of social support network instruments are cumulative and ongoing processes. They require testing and retesting with diverse populations, in various research and practice settings, and using both psychometric and practical standards to assess their actual utility. In addition, future analyses of these scales should include assessment of their sensitivity to various differences within and between groups, for example, cultural and socio-demographic differences, or differences in levels of health and functional status that might affect response patterns.
The four-step process described above continues in the tradition of instrumentation refinement, tailoring new measurement tools to address special needs as well as offering a revised version of the LSNS incorporating modifications that greatly enhance its psychometric properties. This work also offers a paradigm that could be employed to evaluate and refine other social support network measurement tools. The original LSNS was designed specifically for an elderly population. Although it has proven adaptable to a variety of settings, some deficiencies have been noted over the past ten years of use. The LSNS-R addresses these problems, resulting in an improved measure of social support networks. The refinements are theory driven and involved reworking items in the original LSNS so that the revised scale can better measure the distinct aspects of kinship and friendship networks (Lubben & Gironda, in press). An abbreviated version of the LSNS-R has been developed and is reported elsewhere (Lubben & Gironda, 2000). This six-item scale (the LSNS-6) can be especially suitable in practice settings as a screening tool for social isolation or for more general use in those research settings where longer social support network scales cannot be accommodated. For those social and behavioral researchers desiring more extensive inquiry into the nature of social relationships of the elderly, an 18-item version of the LSNS has also been developed (Pourat, et al., 1999; Pourat, et al., 2000). The major advantage of the LSNS-18 over the LSNS-R is that the former distinguishes friendship ties with neighbors from those with friends who do not live in close proximity to the respondent. Such distinctions are desirable for exploring a growing number of social and behavioral research questions regarding the functioning of social support networks. Vol. 7, No.2, Winter 2002 References Berkman, L.P., & Syme, S.L. (1979). 
Social networks, host resistance, and mortality: A nine-year follow-up study of Alameda County residents. American Journal of Epidemiology, 109, 186-204.

Burgess, R.L., & Huston, T.L. (Eds.) (1979). Social exchange in developing relationships. New York: Academic Press.

Cattell, R.B. (1966). The meaning and strategic use of factor analysis. In R.B. Cattell (Ed.), Handbook of multivariate experimental psychology. Chicago: Rand McNally.

Chou, K.L., & Chi, I. (1999). Determinants of life satisfaction in Hong Kong Chinese elderly: A longitudinal study. Aging and Mental Health, 3, 328-335.

Cronbach, L.J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297-334.

DeVellis, R.F. (1991). Scale development: Theory and applications. Newbury Park, CA: Sage.

Dorfman, R., Lubben, J.E., Mayer-Oakes, A., Atchison, K.A., Schweitzer, S.O., Dejong, P., et al. (1995). Screening for depression among the well elderly. Social Work, 40, 295-304.

Luggen, A.S., & Rini, A.G. (1995). Assessment of social networks and isolation in community-based elderly men and women. Geriatric Nursing, 16, 179-181.

Gironda, M.W., Lubben, J.E., & Atchison, K.A. (1998). Social support networks of elders without children. Journal of Gerontological Social Work, 27, 63-84.

Martire, L.M., Schulz, R., Mittelmark, M.B., & Newsom, J.T. (1999). Stability and change in older adults' social contact and social support: The Cardiovascular Health Study. Journals of Gerontology: Series B: Psychological and Social Sciences, 54B, S302-S311.

Glass, T.A., Mendes de Leon, C.F., Seeman, T.E., & Berkman, L.F. (1997). Beyond single indicators of social networks: A LISREL analysis of social ties among the elderly. Social Science and Medicine, 44, 1503-1507.

McDowell, I., & Newell, C. (1987). Measuring health: A guide to rating scales and questionnaires. New York: Oxford University Press.
Guadagnoli, E., & Velicer, W. (1988). Relation of sample size to the stability of component patterns. Psychological Bulletin, 103, 265-275.

Mistry, R., Rosansky, J., McGuire, J., McDermott, C., Jarvik, L., and the UPBEAT Collaborative Group (2001). Social isolation predicts re-hospitalization in a group of older American veterans enrolled in the UPBEAT Program. International Journal of Geriatric Psychiatry, 16, 950-959.

House, J.S., Landis, K.R., & Umberson, D. (1988). Social relationships and health. Science, 241, 540-545.

Hurwicz, M.L., & Berkanovic, E. (1993). The stress process in rheumatoid arthritis. Journal of Rheumatology, 20, 11.

Mitchell, R., & Trickett, E. (1980). Social networks as mediators of social support: An analysis of the effects and determinants of social networks. Community Mental Health Journal, 16, 27-44.

Jung, J. (1990). The role of reciprocity in social support. Basic and Applied Social Psychology, 11, 243-253.

Moon, A., Lubben, J.E., & Villa, V.M. (1998). Awareness and utilization of community long term care services by elderly Korean and non-Hispanic White Americans. Gerontologist, 38, 309-316.

Kaiser, H.F. (1960). The application of electronic computers to factor analysis. Educational and Psychological Measurement, 20, 141-151.

Keith, P.M., Hill, K., Goudy, W.J., & Power, E.A. (1984). Confidants and well-being: A note on male friendship in old age. Gerontologist, 24, 318-320.

Mor-Barak, M.E. (1997). Major determinants of social networks in frail elderly community residents. Home Health Care Services Quarterly, 16, 121-137.

Litwak, E. (1985). Helping the elderly: The complementary role of informal networks and formal systems. New York: Guilford.

Mor-Barak, M.E., Miller, L.S., & Syme, L.S. (1991). Social networks, life events, and health of the poor frail elderly: A longitudinal study of the buffering versus the direct effect. Family Community Health, 14(2), 1-13.

Lubben, J.E. (1988). Assessing social networks among elderly populations.
Family Community Health, 11(3), 42-52.

Mor-Barak, M.E., & Miller, L.S. (1991). A longitudinal study of the causal relationship between social networks and health of the poor frail elderly. Journal of Applied Gerontology, 10, 293-310.

Lubben, J.E., & Gironda, M. (1997). Social support networks among older people in the United States. In H. Litwin (Ed.), The social networks of older people. Westport, CT: Praeger.

Nunnally, J.C. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill.

Lubben, J.E., & Gironda, M.W. (2000). Social support networks. In D. Osterweil, J. Beck, & K. Brummel-Smith (Eds.), Comprehensive geriatric assessment: A guide for healthcare providers. New York: McGraw-Hill.

Okwumabua, J.O., Baker, P.M., Wong, S.P., & Pilgrim, B.O. (1997). Characteristics of depressive symptoms in elderly urban and rural African Americans. Journals of Gerontology: Biological Sciences and Medical Sciences, 52A, M241-M246.

Lubben, J.E., & Gironda, M.W. (in press). Centrality of social ties to the health and well-being of older adults. In B. Berkman & L. Harootyan (Eds.), Gerontological social work in the emerging health care world. New York: Springer.

O'Reilly, P. (1988). Methodological issues in social support and social network research. Social Science & Medicine, 26, 863-873.

Potts, M.K., Hurwicz, M.L., Goldstein, M.S., & Berkanovic, E. (1992). Social support, health-promotive beliefs, and preventive health behaviors among the elderly. Journal of Applied Gerontology, 11, 425-440.

Lubben, J.E., Weiler, P.G., & Chi, I. (1989). Gender and ethnic differences in the health practices of the elderly poor. Journal of Clinical Epidemiology, 42, 725-733.

James Lubben, DSW, MPH is Professor of Social Welfare and Urban Planning at the University of California, Los Angeles (UCLA). Both his DSW and MPH are from the University of California, Berkeley.
He is Principal Investigator for the Hartford Doctoral Fellows Program in Geriatric Social Work, a program administered by the Gerontological Society of America, and a consultant to the World Health Organization-Kobe Centre on health and welfare systems development for aging societies. Dr. Lubben's research examines social behavioral determinants of vitality in old age, with a particular focus on the roles of social support networks.

Melanie Gironda, PhD, MSW is a Lecturer in the Department of Social Welfare at UCLA where she teaches courses on social gerontology and research on aging. She received both her PhD and MSW from UCLA. Dr. Gironda is Deputy Program Director of the Hartford Doctoral Fellows Program in Geriatric Social Work. Her research examines loneliness in various populations of the elderly, with a special focus on the nature of social support networks of older adults without children.

Pourat, N., Lubben, J., Yu, H., & Wallace, S. (2000). Perceptions of health and use of ambulatory care: Differences between Korean and White elderly. Journal of Aging and Health, 12(1), 112-134.

Pourat, N., Lubben, J., Wallace, S., & Moon, A. (1999). Predictors of use of traditional Korean healers among elderly Koreans in Los Angeles. Gerontologist, 39, 711-719.

Rubenstein, L.Z., Aronow, H.U., Schloe, M., Steiner, A., Alessi, C.A., Yuhas, K.E., Gold, M., Kemp, K., Nisenbaum, R., Stuck, A., & Beck, J.C. (1994). A home-based geriatric assessment, follow-up and health promotion program: Design, methods, and baseline findings from a 3-year randomized clinical trial. Aging, 6, 105-120.

Rubenstein, R.L., Lubben, J.E., & Mintzer, J.E. (1994). Social isolation and social support: An applied perspective. Journal of Applied Gerontology, 13, 58-72.

Sauer, W.J., & Coward, R.T. (1985). Social support networks and the care of the elderly: Theory, research and practice. New York: Springer.

Siegel, J.M. (1990).
Stressful life events and use of physician services among the elderly: The moderating role of pet ownership. Journal of Personality and Social Psychology, 58, 1081-1086.

Stevens, J. (1992). Applied multivariate statistics for the social sciences (2nd ed.). Hillsdale, NJ: Erlbaum.

Steiner, A., Raube, K., Stuck, A.E., Aronow, H.U., Draper, D., Rubenstein, L.Z., & Beck, J. (1996). Measuring psychosocial aspects of well-being in older community residents: Performance of four short scales. Gerontologist, 36, 54-62.

Streiner, D.L., & Norman, G.R. (1995). Health measurement scales: A practical guide to their development and use (2nd ed.). New York: Oxford University Press.

Villa, V.M., Wallace, S.P., Moon, A., & Lubben, J.E. (1997). A comparative analysis of chronic disease prevalence among older Koreans and non-Hispanic Whites. Journal of Family and Community Health, 20, 1-12.

Winemiller, D.R., Mitchell, M.E., Sutliff, J., & Cline, D.J. (1993). Measurement strategies in social support: A descriptive review of the literature. Journal of Clinical Psychology, 49, 638-648.

Alex E. Y. Lee, PhD, MSW is Assistant Professor in the Department of Social Work and Psychology at the National University of Singapore where he teaches courses on social work and gerontology. He received both his PhD and MSW from UCLA. His current research concerns social service delivery systems for the elderly and family-based gerontological counseling for Asian elders.

In small proportions, we just beauties see,
And in short measures life may perfect be.
- Ben Jonson

Self-Estimated Intelligence

Adrian Furnham

Do beliefs about one's ability affect scores on an ability test? Is it important to believe that you are bright to do well on an ability/intelligence test? What do people think about their own intelligence? How do they estimate their own intelligence and that of their relatives? What determines their self-estimates?

Studies of self-estimated intelligence date back nearly a quarter of a century (Hogan, 1978). More recently, Beloff (1992) provoked a great deal of interest in the effects of gender differences on self-estimated intelligence. Among her Scottish undergraduates, she found a six-point difference, with males estimating their score significantly higher than females. Additional work led her to conclude that "The younger women see themselves as intellectually inferior compared to young men.... Women see equality with their mothers, men with their fathers, and men superior to their mothers. Mothers therefore come out as inferior to fathers. This pattern has been consistent each year." Beloff argued that the modesty training girls receive in socialization accounts for these data.

Beloff's research stimulated others to replicate these gender-related differences in self-estimated intelligence (Bennett, 1996, 1997, 2000; Byrd & Stacey, 1993; Furnham & Rawles, 1995, 1999). These studies were in four related research areas: (a) studies to replicate and explain these differences in various countries and relate national/cultural variables to them; (b) research to see if these gender-related differences can be replicated not only for overall (general) intelligence, but also for more specific and multiple intelligences; (c) work to examine whether these differences occur in estimations of the intelligence of others, notably male and female members of one's immediate family (grandparents, parents, siblings, children); and (d) research to examine the relationship between estimated intelligence and "actual" intelligence scores.

Table 1. Results of Studies Where Participants Made an Overall IQ (g) Rating on Themselves and Others
(Ratings: Women / Men / Difference)

Beloff (1992), Scotland (women N = 502; men N = 265)
  Self     120.5 / 126.9 /  6.4
  Mother   119.9 / 118.7 / -1.2
  Father   127.7 / 125.2 / -2.5

Byrd & Stacey (1993), New Zealand (women N = 105; men N = 112)
  Self     121.9 / 121.5 / -0.4
  Mother   114.5 / 106.5 / -9.0
  Father   127.9 / 122.3 / -5.6
  Sister   118.2 / 110.5 / -7.7
  Brother  114.1 / 116.0 /  1.9

Bennett (1996), Scotland (women N = 96; men N = 48)
  Self     109.4 / 117.1 /  7.7

Reilly & Mulhern (1995), Ireland (women N = 80; men N = 45)
  Self      105.3 / 113.9 /  8.6
  Measured  106.9 / 106.1 / -0.8

Furnham & Rawles (1995), England (women N = 161; men N = 84)
  Self     118.48 / 123.31 /  6.17
  Mother   108.70 / 109.12 /  0.72
  Father   114.18 / 116.09 /  1.91

Furnham & Rawles (1996), England (women N = 140; men N = 53)
  Self     116.64 / 120.50 /  3.9

Furnham & Gasson (1998), England (women N = 112; men N = 72)
  Self                      103.84 / 107.99 /  4.15
  Male child (1st child)    107.69 / 109.70 /  2.01
  Female child (1st child)  102.57 / 102.36 / -0.21

Furnham & Reeves (1998), England (women N = 84; men N = 72)
  Self                   104.84 / 110.15 /  5.31
  Male child (1st son)   116.09 / 114.32 / -1.77

Furnham (2001) reported on various such studies, all but one of which showed significant gender differences in self-rated overall IQ, ranging from 3.9 to 8.6 points. Students, for example, tended to rate their IQs at around 120 - about one-and-a-half standard deviations above the mean (Furnham, Clark, & Bailey, 1999) - whereas nonstudent British adults believed their IQ to be around a half standard deviation above the norm (Furnham, 2000).

Multiple Intelligence

Over the past decade there have been many attempts to redefine intelligence and to define different types of intelligence. Thus we have emotional intelligence as well as practical intelligence, for example, but perhaps the idea that has appealed most to lay people (though not to academics) is Gardner's (1999) multiple intelligence theory.
Gardner (1983) initially argued that there were seven types of intelligence:

• Verbal or linguistic intelligence (the ability to use words),
• Logical or mathematical intelligence (the ability to reason logically, to solve number problems),
• Spatial intelligence (the ability to find one's way around one's environment and form mental images),
• Musical intelligence (the ability to perceive and produce pitch and rhythm),
• Body-kinetic intelligence (the ability to carry out motor movements, for example, in performing surgery, ballet dancing, or playing sports),
• Interpersonal intelligence (the ability to understand people),
• Intrapersonal intelligence (the ability to understand oneself and develop a sense of one's identity).

He suggested that the first two are the types valued at school, the next three are valuable in the arts, and the last two constitute a sort of "emotional intelligence."

Beginning with Beloff (1992), a number of these studies looked at estimates of relations' IQs (Furnham, Fong, & Martin, 1999). These studies showed that people believe there are clear generational effects in IQ - they believe that they are a little brighter than their mothers and certainly much brighter than their grandparents, while parents tend to believe their children are brighter than they are. In general, a half standard deviation (6-8 points) difference was found in estimated IQ between generations. Gender differences were also found for estimated IQ of relatives (Flynn, 1987). Thus people believe their grandfathers are brighter than their grandmothers, their fathers brighter than their mothers, their brothers brighter than their sisters, and their sons brighter than their daughters. Interestingly, for parents estimating their children, the results were stronger for first-borns compared to those born later, indicating the possible working of the principle of primogeniture.
These results have been shown to be cross-culturally robust: the gender difference effect has been demonstrated in Asia (Japan, Hong Kong), Africa (Uganda, South Africa), Europe (Belgium, Britain, but not Slovakia), and America (Furnham et al., 2002), as well as in New Zealand (Furnham & Ward, 2001) and Iran (Furnham, Shahidi, & Baluch, 2002). Thus there seems to be a robust and cross-culturally valid finding that there is a clear, consistent gender difference in self-rating of overall intelligence, with males rating themselves and their male relations higher than females rate themselves and their female relations.

At least eight studies have examined gender differences in estimates of Gardner's (1983, 1999) seven multiple intelligences. The results have been consistent and give an important clue to gender differences in overall performance. Most people rate their own interpersonal and intrapersonal intelligence as very high, about 1 SD (standard deviation) above the mean, and their musical and body-kinetic intelligence as strictly average, around 100.

It is also interesting to note that it is very rare for people to rate their scores, or indeed those of relations, as below average (<100 IQ points). For example, it was found that students tended to believe their IQ to be about one-and-a-half standard deviations above the mean.

Reilly and Mulhern (1995) note that IQ-estimates research should not be based on the "assumption that gender differences at group level represent a generalized tendency on the part of either gender to either overconfidence or lack of confidence with regard to their own intelligence" (Reilly & Mulhern). Studies do show, however, a tendency for males to overestimate and females to underestimate their score [?reference(s)], but this is related, in part, to the IQ test used.
Of the remaining three types of intelligence in the Gardner model - verbal, mathematical, and spatial - gender differences were found only in mathematical and spatial intelligence, particularly in spatial intelligence, where females typically rate themselves six to ten points lower than do males. No gender differences were found in self-estimations of verbal intelligence (Furnham, 2001).

Some studies have asked participants to estimate their overall (g, general intelligence) score first and then their scores on the seven specific multiple intelligences. This has made it possible to regress the seven multiple intelligence estimates simultaneously on the overall intelligence estimate. These regression studies show that people believe that mathematical (logical), then spatial, and then verbal intelligence are the primary contributors to overall IQ. This, in turn, allowed for testing the hypothesis that lay conceptions of intelligence are male normative, in the sense that the abilities men tend to be better at are those that most people consider to be the essence of intelligence. These studies showed that lay people tend to conflate mathematical/spatial and overall intelligence, thereby explaining the consistent gender differences in estimates of overall score. This means that quite possibly the often-observed and debated gender difference in spatial IQ accounts for the difference in overall IQ.

Some researchers have tried to understand and then increase the correlation between self-estimates and test scores by using more tests on bigger populations, yet the correlations remained the same, around r = 0.30 (Borkenau & Liebler, 1993). Perhaps the explanation should be sought in motivational factors that may be involved in the self-estimation of intelligence and that may lead to serious distortions in estimated scores. Thus a close examination of the conditions and instructions under which participants self-estimate their intelligence may provide a clue as to how they make their self-estimates.
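The regression approach described above - regressing the seven specific estimates simultaneously on the overall (g) estimate - can be sketched as follows. All data here are synthetic: the overall estimate is constructed so that mathematical and spatial ratings dominate it, mimicking the male-normative pattern the studies report; the weights, noise level, and sample size are assumptions, not values from the literature.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300

# Synthetic self-estimates for Gardner's seven intelligences (illustrative only).
specific = rng.normal(100, 10, size=(n, 7))
names = ["verbal", "mathematical", "spatial", "musical",
         "body-kinetic", "interpersonal", "intrapersonal"]

# Construct an overall (g) estimate dominated by mathematical and spatial
# ratings -- an assumed pattern, built in by hand for the demonstration.
weights = np.array([0.20, 0.45, 0.35, 0.0, 0.0, 0.0, 0.0])
overall = specific @ weights + rng.normal(0, 3, n)

# Simultaneous multiple regression (OLS via least squares, with intercept).
X = np.column_stack([np.ones(n), specific])
beta, *_ = np.linalg.lstsq(X, overall, rcond=None)

# Coefficients recovered largest-first: mathematical, then spatial, then verbal.
for name, b in sorted(zip(names, beta[1:]), key=lambda t: -abs(t[1])):
    print(f"{name:15s} {b: .2f}")
```

Because the data were generated with mathematical and spatial ratings carrying the most weight, the fitted coefficients reproduce exactly the ordering the regression studies report for real respondents.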
For example, if social norms and conventions in part dictate how people respond, then anonymity in responding may reduce the distortions in self-estimates.

From Whence the Differences?

Gender differences in self-estimated IQ need to be explained because the issue is so frequently discussed, there has been an academic consensus for many years to the effect that gender differences are minimal, and, since before the Second World War, test constructors have been careful to produce IQ tests that minimize gender differences.

Correlations Between Self-Estimated and Test-Measured IQ

Are self-estimates accurate? In other words, is the correlation between (valid) IQ test scores and self-estimated scores high, low, or "on the mark"? This is an important question because some psychologists have suggested that, if these scores correlate highly, self-estimates may serve as useful measures at a fraction of the time, effort, and cost of administering and scoring IQ tests.

Essentially there are three positions on the gender differences in estimates issue (Furnham, 2001). The "feminist" position was clearly articulated by Beloff (1992) in her first study, where she suggested that these differences are erroneous and simply due to attribution errors. One explanation she offers is a gender difference in modesty and humility: "Modesty-training is given to girls. Modesty and humility are likely to be connected to the over-estimates of women and for women" (Beloff).

Various studies have found that correlations of actual and self-estimated IQ are around r = 0.30 and that therefore self-estimates cannot serve as proxy measures of IQ (Paulus, Lysy, & Yik, 1998). One study, however, examined the effect of outliers and concluded that, if these outliers are removed, self-estimated and actual IQ correlate highly (r > 0.90).
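The outlier effect just described is easy to demonstrate. The data below are simulated, not drawn from any of the cited studies: a well-correlated core sample plus a handful of extreme self-estimates is enough to depress the Pearson correlation substantially, and trimming those cases restores it.

```python
import numpy as np

rng = np.random.default_rng(1)

# Core sample: simulated self-estimates that track measured IQ closely.
n = 95
measured = rng.normal(100, 15, n)
estimated = measured + rng.normal(0, 4, n)

# A handful of extreme outliers (wildly inflated or deflated self-estimates).
out_measured = np.array([85.0, 90.0, 95.0, 100.0, 105.0])
out_estimated = np.array([200.0, 20.0, 195.0, 15.0, 210.0])

x = np.concatenate([measured, out_measured])
y = np.concatenate([estimated, out_estimated])

r_all = np.corrcoef(x, y)[0, 1]          # full sample, outliers included
r_trim = np.corrcoef(measured, estimated)[0, 1]  # outliers removed

print(f"with outliers:    r = {r_all:.2f}")
print(f"outliers removed: r = {r_trim:.2f}")
```

Five aberrant cases out of a hundred are enough to pull the correlation down sharply, which is why a modest published r can coexist with a strong underlying relationship.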
Thus the importance of studies of self-estimated intelligence may lie not only in exploring lay theories of intelligence, but also in understanding the self-fulfilling nature of self-evaluation of ability. "Because of the serious implications of under-estimations for self-confidence and psychological health, more attention should be devoted to the investigation of gender differences in the accuracy of self-evaluations. Such research will not only elucidate the underlying processes of self-evaluation biases and therefore be of theoretical interest, but will also be of practical value by suggesting ways of eliminating women's underestimation of their performance" (Beyer).

Another explanation Beloff proposes is that, because IQ is correlated with occupational grade and men tend to occupy certain higher status occupations more than women (for political rather than ability reasons, in her view), females tend to believe that they are less intelligent than males. There seem to be no direct data to test Beloff's assumptions, and they look a little outdated, particularly in the case of America, where students still show the same gender difference pattern despite their more equal socialization and occupational choice. However, unusual data from Slovakia may be evidence for this position. Slovakian females awarded themselves higher overall (g) and verbal scores than did equivalent Belgian and British student groups (Furnham, Rakow, Samany-Schuller, & De Fruyt, 1999). The authors offered the following possible explanation for this unique group of confident females: It may well be that, under the pressure of the socialist governments of Eastern Europe, the role of females in society was somewhat different from that of women in capitalist Western Europe - the former took a more active role in the economy and were socialized differently in school.
In fact, Slovakian society attached high prestige to education, and the government made a consistent effort to improve the position of women in society by encouraging them to obtain educational qualifications, facilitating their employment in traditionally male-dominated occupations, and even mandating a given percentage of women in its parliament. Another explanation they present is that, among the various nationalities studied, the Slovakian women had the most experience in taking intelligence tests and were therefore presumably more likely to recognize that gender differences in intelligence are actually very small (Furnham et al.).

Beyer (1998) reviewed studies concluding that individuals make poor self-evaluators. In one such study, the correlation between medical students' self-rated knowledge and exam grades was found to be almost exactly zero (r = -0.01). In another study Beyer reviewed, the correlation between self-perceptions of one's own physical attractiveness and judges' ratings of that attractiveness was 0.22. Another study she considered found a correlation of 0.32 between how people judged their own performance on a test of managerial skill and how independent experts rated it. "Interestingly, outside evaluators seem to be better assessors of a target's performance than the target her/himself" (Beyer).

Thus it seems that while self-estimates of intelligence may not be useful as proxy IQ tests, various studies (Paulus et al., 1998) have shown that they could be very useful in explaining some of the variability in actual test results through the self-expectation effect that Beyer (1999) discussed. The work of Dweck and colleagues is particularly salient here (Dweck & Bempechat, 1983; Mueller & Dweck, 1998). Dweck argued that lay persons tend either to believe that intelligence is malleable (incremental lay theorists) or that it is fixed (entity lay theorists).
These beliefs logically relate to goal setting, motivation, and attribution for success and failure. Further, they relate to one's perceptions of others. While these beliefs are not necessarily related to actual ability, they can have powerful, even paradoxical, behavioral impacts. Thus Mueller and Dweck pointed out that if a young person believes him/herself to be intelligent, entity theory predicts that they are likely to be less motivated to work hard to achieve their goals.

The interaction between beliefs about one's level of intelligence on the one hand and the changeability of intelligence on the other is potentially very important. Thus entity lay theorists who believe themselves to be below average on overall or specific intelligences may shun intellectual tasks, set themselves low goals, and become depressed and ineffectual. In contrast, entity lay theorists who rate themselves way above average may be complacent and lazy, believing their "natural wit" sufficient to let them succeed at most academic and work assignments. However, lay incrementalists who are low self-raters may, if so moved, be prepared to undertake tasks or training that they believe will increase their intelligence. High self-rating incrementalists, in contrast, may also believe that they need to work hard to maintain their levels of intelligence.

Consequences of Gender Differences in Ability Estimates

In a series of studies, Beyer (1990) demonstrated gender differences in self-expectations, self-evaluations, and performance on ability-related tasks. Her results support the "male hubris, female humility" thesis of Beloff. Further, she argued that gender differences in self-evaluations affect expectancies of success and failure and, ultimately, performance on ability-related tasks.

Conclusion

The measurement of intelligence has always been a controversial topic. So, too, have the topics of lay people's beliefs about their own intelligence and the extent to which it can be changed. Research relating to these topics has yielded evidence of a "self-Pygmalion effect": beliefs about the nature of intelligence and one's own intelligence level can profoundly influence not only how one performs on a test, but also one's personal motivations in academic and work settings throughout life (Spitz, 1999).

References

Beloff, H. (1992). Mother, father and me: Our IQ. The Psychologist, 5, 309-311.

Bennett, M. (1996). Men's and women's self-estimates of intelligence. Journal of Social Psychology, 136, 411-412.

Bennett, M. (1997). Self-estimates of ability in men and women. Journal of Social Psychology, 137, 540-541.

Bennett, M. (2000). Gender differences in the self-estimation of ability. Australian Journal of Psychology, 52, 23-28.

Beyer, S. (1990). Gender differences in the accuracy of self-evaluation of performance. Journal of Personality and Social Psychology, 59, 960-970.

Beyer, S. (1998). Gender differences in self-perception and negative recall bias. Sex Roles, 38, 103-133.

Beyer, S. (1999). Gender differences in the accuracy of grade expectations and evaluations. Sex Roles, 41, 279-296.

Borkenau, P., & Liebler, A. (1993). Convergence of stranger ratings of personality and intelligence with self-ratings, partner-ratings, and measured intelligence. Journal of Personality and Social Psychology, 65, 546-553.

Byrd, M., & Stacey, B. (1993). Bias in IQ perception. The Psychologist, 6, 16.

Dweck, C., & Bempechat, J. (1983). Children's theories of intelligence: Consequences for learning. In S. Paris, G. Olson, & H. Stevenson (Eds.), Learning and motivation in the classroom. Hillsdale, NJ: Erlbaum.

Flynn, J. (1987). Massive IQ gains in 14 nations: What IQ tests really measure. Psychological Bulletin, 101, 171-191.

Furnham, A. (2001). Self-estimates of intelligence: Culture and gender differences in self and other estimates of general (g) and multiple intelligences. Personality and Individual Differences, 31, 1381-1405.

Furnham, A., Clark, K., & Bailey, K. (1999). Sex differences in estimates of multiple intelligences. European Journal of Personality, 13, 247-259.

Furnham, A., Fong, G., & Martin, N. (1999). Sex and cross-cultural differences in the estimated multi-faceted intelligence quotient score for self, parents, and siblings. Personality and Individual Differences, 26, 1025-1034.

Furnham, A., Rakow, T., Samany-Schuller, I., & De Fruyt, F. (1999). European differences in self-perceived multiple intelligences. European Psychologist, 4, 131-138.

Furnham, A., & Rawles, R. (1995). Sex differences in the estimation of intelligence. Journal of Behavior and Personality, 10, 741-748.

Furnham, A., & Rawles, R. (1999). Correlations between self-estimated and psychometrically measured IQ. Journal of Social Psychology, 139, 405-410.

Furnham, A., Shahidi, S., & Baluch, B. (2002, in press). Sex and culture differences in self-perceived and family estimated multiple intelligence: A British-Iranian comparison. Journal of Cross-Cultural Psychology.

Furnham, A., & Ward, C. (2001, in press). Sex differences, test experience and the self-estimation of multiple intelligence. New Zealand Journal of Psychology.

Gardner, H. (1983). Frames of mind: A theory of multiple intelligences. New York: Basic Books.

Gardner, H. (1999). Intelligence reframed. New York: Basic Books.

Hogan, H. (1978). IQ self-estimates of males and females. Journal of Social Psychology, 106, 137-138.

Mueller, C., & Dweck, C. (1998). Praise for intelligence can undermine children's motivation and performance. Journal of Personality and Social Psychology, 75, 33-52.

Paulus, D., Lysy, D., & Yik, M. (1998). Self-report measures of intelligence: Are they useful as proxy IQ tests? Journal of Personality, 66, 523-555.

Reilly, J., & Mulhern, G. (1995). Gender difference in self-estimated IQ: The need for care in interpreting group data. Personality and Individual Differences, 18, 189-192.

Spitz, H. (1999). Beleaguered Pygmalion: A history of the controversy over claims that teacher expectancy raises intelligence. Intelligence, 27, 199-234.

Dr. Adrian Furnham is Professor of Psychology at London University (UCL), where he has taught for 20 years. He holds doctoral degrees from both Oxford University and UCL. He is the author of 36 books and 450 peer-reviewed papers. Dr. Furnham is currently working on a book about managerial incompetence.

HaPI Advisory Board

Aaron T. Beck, MD - University of Pennsylvania School of Medicine
Timothy C. Brock, PhD - Ohio State University, Psychology
William C. Byham, PhD - Development Dimensions International
Donald Egolf, PhD - University of Pittsburgh, Communication
Sandra J. Frawley, PhD - Yale University School of Medicine, Medical Informatics
David F. Gillespie, PhD - Washington University, George Warren Brown School of Social Work
Robert C. Like, MD, MS - University of Medicine and Dentistry of New Jersey, Robert Wood Johnson Medical School
Joseph D. Matarazzo, PhD - Oregon Health Sciences University
Vickie M. Mays, PhD - University of California at Los Angeles, Psychology
Michael S. Pallak, PhD - Behavioral Health Foundation
Kay Pool, President - Pool, Heller & Milne, Inc.
Ora Lea Strickland, PhD - Emory University, Woodruff School of Nursing
Gerald Zaltman, PhD - Harvard University Graduate School of Business Administration
Stephen J. Zyzanski, PhD - Case Western Reserve University School of Medicine

Every tool carries with it the spirit by which it has been created.
- Werner Karl Heisenberg
Ten Commandments for Selecting Self-Report Instruments

Fred B. Bryant

Selecting appropriate measurement instruments is among the tasks researchers most frequently face. Yet, surprisingly little has been written about how best to go about the process of instrument selection. Given the prevalence of self-report methods of measurement in the social sciences, the task of selecting an instrument most often involves choosing from among a set of seemingly relevant questionnaires, surveys, inventories, checklists, and scales. Although it is difficult to devise a universally applicable set of rules for selecting measurement instruments, it is possible to suggest some general guidelines that researchers can use in choosing appropriate self-report measures. What follows, then, is a set of precepts and principles for selecting instruments for research purposes, along with concrete examples illustrating each. Note that the order of presentation is not intended to reflect the guidelines' relative importance in the process of instrument selection. In fact, each principle is essential in selecting the right measurement tool for the job.

For example, a researcher who wishes to measure depression in college students might locate dozens of potentially useful self-report instruments designed to assess this construct. Indeed, the September 2001 release of the Health and Psychosocial Instruments (HaPI) database contains 105 primary records of self-report instruments with the term "depression" in the title, excluding measures developed for use with children or the elderly, those written in foreign languages, and those assessing attitudes toward, knowledge of, or reactions to depression rather than depression per se. The seemingly appropriate instruments thus identified include the Beck Depression Inventory (BDI; Beck, 1987), Hamilton Rating Scale for Depression (HAM-D; Hamilton, 1960), Center for Epidemiologic Studies Depression Scale (CES-D; National Institute of Mental Health, 1977), and Self-Rating Depression Scale (SDS; Zung, 1965), to name just a few. How should the researcher decide which to use?

One strategy for selecting instruments is to employ only those most commonly used in published studies. Not only is this strategy simple and straightforward, but some researchers follow it in the hope of increasing the likelihood that their research will be published. However, it limits conceptual definitions to those created within the theoretical frameworks of commonly used instruments. Over time, this effectively constricts the generalizability of research on these constructs. Further, all measurement instruments tap irrelevancies that have nothing to do with the constructs they are intended to assess. Using only a single measure of a particular research construct makes it impossible to know the degree to which the irrelevancies in the measure affect the obtained results. Moreover, diversity in operationalization helps us better understand not only what we're measuring, but also what we're not measuring. Thus, in the long run, employing only the most commonly used instruments limits and weakens the body of scientific knowledge.

1. Before choosing an instrument, define the construct you want to measure as precisely as feasible.

Ideally, researchers should not rely merely on a label or descriptive term to represent the construct they wish to assess, but should instead define the construct of interest clearly and precisely in theory-relevant terms at the outset. Being unable to specify beforehand what it is you want to measure makes it hard to know whether or not a particular instrument is an appropriate measure of the target construct (Bryant, 2000). Potentially useful strategies for defining research constructs are to draw on published literature reviews that synthesize available theories concerning a particular construct, or to review the published literature on one's own in search of alternative theoretical definitions. If you can find alternative conceptual definitions of the target construct, then you can choose from among them a particular conception that resonates with your own thinking. The process of explicitly conceptualizing the construct that you wish to measure is known as "preoperational explication" (Cook & Campbell, 1979).

Imagine, for example, that a clinical researcher wants to use a self-report measure of shyness. A wide variety of potentially relevant measures can be found in the Health and Psychosocial Instruments database, depending on how the researcher conceptually defines shyness. Is shyness (for which there are 30 primary records in the database) synonymous with introversion (30 primary records), timidity (17 primary records), emotional insecurity (2 primary records), social anxiety (63 primary records), social fear (1 primary record), social phobia (29 primary records), social avoidance (4 primary records), or social isolation (76 primary records)? The clearer and more precise the initial conceptual definition, the easier it will be to find appropriate measurement tools. An added benefit of precisely specifying target constructs at the outset is that it helps to focus the research.

Although precise preoperational explication is the ideal when selecting measures, in actual practice it is often difficult beforehand to specify a clear conceptual definition of the target construct. Many times the published literature does not provide explicit alternatives and this, then, forces researchers to explicate constructs on their own -- the equivalent of trying to define an unknown word without having a dictionary. In actual practice, researchers often begin by selecting a particular instrument that appears useful, thus adopting by default the conceptual definition of the target construct that underlies the chosen instrument. Truly, therefore, an available instrument often dictates one's conceptual definition post hoc.

2. Choose the instrument that is designed to tap an underlying construct whose conceptual definition most closely matches your own.

Carefully consider the theoretical framework on which the originators based their instrument. Select an instrument that stems from a theory that defines the construct the same way you do, or at least in a way that does not contradict your conceptual orientation, for the theoretical underpinnings of the instrument should be compatible with your own conceptual framework.

Consider a sociologist and a psychologist, each of whom wants to measure guilt. The most appropriate self-report instrument in each case will be the one whose underlying conceptual definition most closely corresponds to that of the researcher. The sociologist, on the one hand, might be studying people's reactions to homeless adults from a macro-level, sociocultural perspective. If so, then she might begin by defining guilt to be a prosocial emotion experienced when one perceives oneself as being better off than another person who is disadvantaged. Consistent with this conceptualization is Montada and Schneider's (1989) three-item measure of "existential guilt," conceived as prosocial feelings in reaction to the socially disadvantaged. The psychologist, on the other hand, might be studying personality from a micro-level, individual perspective. If so, then she might begin by defining guilt to be a dispositional feeling of regret or culpability in reaction to perceived personal or moral transgressions. Consistent with this conceptual definition is Kugler and Jones' (1992) 98-item Guilt Inventory, which includes a separate subscale specifically tapping personal guilt as an unpleasant emotional trait. Clearly, instruments must have underlying conceptual definitions that match your own conceptual framework (Brockway & Bryant, 1998).

3. Never decide to use a particular instrument based solely on its title.

Just because the name of an instrument includes the target construct does not guarantee that it either defines this construct the same way you do or even measures the construct at all. Don't let the title lead you to select an inappropriate instrument.

As a case in point, consider a developmental psychologist who wants to measure parents' psychological involvement in their families. Based on its promising title, the researcher decides to use the Parent/Family Involvement Index (PFII; Cone, DeLawyer, & Wolfe, 1985). After obtaining the instrument and inspecting its constituent items, the researcher realizes to his chagrin that the PFII requires a knowledgeable informant (e.g., a teacher or teacher's aide) to indicate whether or not the particular parent of a school-aged child has engaged in each of 63 specific educational activities. Based on an underlying conceptualization of family involvement in psychological terms, a more appropriate instrument would be a measure developed by Yogev and Brett (1985) that assesses parental involvement in terms of the degree to which parents identify psychologically with and are committed to family roles. This example shows clearly that you can't judge an instrument by its title any more than you can judge a book by its cover (Brockway & Bryant, 1997).

4. Choose an instrument for which there is evidence of reliability and validity.

Reliability is measurement that is accurate and consistent. Good reliability in measurement strengthens observed statistical relationships -- the more reliable the instrument, the smaller will be the error in measurement, and the closer observed results will be to actual results. For example, imagine a medical researcher who wants to determine whether an experimental antipyretic agent reduces fever more rapidly than available antipyretics, but who is using an unreliable thermometer that gives different readings over time, even when body temperature is stable. These inconsistencies in measurement make it nearly impossible to assess temperature accurately, and greatly decrease the likelihood of finding any experimental effects.

Data supporting the validity of an instrument increase one's confidence that it really measures what it is designed to measure. For example, the medical researcher referred to above can be more confident that the thermometer actually measures temperature if its readings correlate highly with those of infrared-telemetry or body-scanning devices. Although instrument developers sometimes report reliability and validity data, such empirical evidence is often available only in published studies that have used the given measure. As a rule, avoid judging the validity of an instrument by the content of its constituent items. What an instrument appears to measure "on its face" (i.e., face validity) is not necessarily what it actually measures. As in the case of an instrument's title, what you see is not necessarily what you get.

Judging the quality of research evidence concerning measurement reliability and validity can be difficult and confusing. There are various types of reliability (e.g., internal consistency, split-half, parallel-forms, interrater, test-retest) and of reliability coefficients (e.g., Cronbach's alpha, coefficient kappa, intraclass correlation, KR-20). Similarly, there are numerous types of validity (e.g., construct, concurrent, criterion-referenced, convergent, discriminant). Thus a host of specialized statistical tools has been developed to quantify both reliability (Strube, 2000) and validity (Bryant, 2000). Given the numerous types of reliability and reliability coefficients, validity, and tools used to assess reliability and validity, instrument selection requires at least a basic understanding of psychometrics and of principles of scale validation in order to make informed judgments of instrument quality. When there is no evidence concerning an instrument's reliability or validity, measurement becomes a "shot in the dark."

5. Given a choice among alternatives, select an instrument whose breadth of coverage matches your own conceptual scope.

If you define your target construct as having a wide range of requisite constituent components, then choose an instrument whose items tap a broad spectrum of content relating to those components. On the other hand, if you define your target construct in a way that specifies a narrower set of conceptual components, then choose an instrument that has a more restrictive and specific content.

Breadth of coverage varies widely across available instruments. For example, to measure coronary-prone Type A behavior, alternatives include: the Student Jenkins Activity Survey (Glass, 1977) that taps the Type A components of hard-driving competitiveness and speed-impatience; the Type A Self-Rating Inventory (Blumenthal, Herman, O'Toole, Haney, Williams, & Barefoot, 1985) that taps hard-drivingness and extraversion; the Type A/Type B Questionnaire (Eysenck & Fulker, 1983) that taps tension, ambition, and activity; the Time Urgency and Perpetual Activation Scale (Wright, 1988) that taps activity and time urgency; and the Self-Assessment Questionnaire (Thoresen, Friedman, Powell, Gill, & Ulmer, 1985) that taps hostility, time urgency, and competitiveness. Clearly, your choice of instrument depends on the specific components of Type A behavior that you want to investigate.
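The reliability coefficients named under guideline 4 can be made concrete. Cronbach's alpha, for instance, needs nothing more than item-level scores: it compares the sum of the item variances with the variance of the total score. Below is a minimal sketch using made-up ratings, not data from any instrument discussed here:

```python
# Cronbach's alpha for a k-item scale: a minimal illustrative sketch.
# The scores below are hypothetical, not from any published instrument.

def cronbach_alpha(items):
    """items: list of k lists, each holding one item's scores
    across the same set of respondents."""
    k = len(items)
    n = len(items[0])

    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    # Each respondent's total score is the sum of their item scores.
    totals = [sum(item[j] for item in items) for j in range(n)]
    item_var_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))

# Four hypothetical 5-point items answered by six respondents.
scores = [
    [4, 3, 5, 2, 4, 3],
    [4, 2, 5, 3, 4, 2],
    [3, 3, 4, 2, 5, 3],
    [5, 3, 5, 2, 4, 3],
]
print(round(cronbach_alpha(scores), 3))  # -> 0.913
```

With real data one would normally let a statistics package compute alpha (along with kappa, intraclass correlations, and the like), but the arithmetic itself is this small.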
The choice between general versus specific measures of a given construct may also depend on your particular research question. Consider the process of coping, for example. Numerous instruments exist for measuring people's general style of coping in response to stress. However, if you want to study coping in relation to a specific problem or stressor, a host of other measures have been developed to assess individual differences in coping with such specific concerns as arthritis, asthma, cancer, chronic pain, diabetes, heart disease, hypertension, stroke, multiple sclerosis, spinal cord injury, HIV/AIDS, rape, sexual abuse, sexual harassment, pregnancy, miscarriage, childbirth, economic pressure, job stress, unemployment, depression, bereavement, natural disasters, prison confinement, test anxiety, and war trauma, to name just a few. Compared to a broad-band measure of general coping, a narrow-band measure of coping that is specific to a particular stressor would be expected to show stronger relationships with reactions to that specific stressor.

6. Select an instrument that provides a level of measurement appropriate to your research goals.

Some instruments are based on a theoretical model in which the underlying construct is assumed to be "unitary." Such instruments provide only a general, global "total score" that summarizes the overall level of responses. Other instruments are based on a theoretical framework in which the latent construct is considered to be "multidimensional." Such instruments provide multiple "subscale scores," each of which taps a separate dimension of the underlying construct. Thus, if you want to gather global summary information about a target construct, then use a unitary instrument. If you want to gather information about multiple facets of a target construct, then use a multidimensional instrument.

Imagine two nursing researchers, each of whom wishes to measure patients' life satisfaction. One seeks a global summary of patients' overall life satisfaction, whereas the other wants to compare levels of satisfaction across important aspects of patients' lives. The first researcher could use the five-item Satisfaction with Life Scale (Diener, Emmons, Larsen, & Griffin, 1985) to obtain a global total score. The second researcher could use the 66-item Quality of Life Index (Ferrans & Powers, 1985) to obtain individual scores for four separate satisfaction subscales: Health/Functioning, Socioeconomic, Psychological/Spiritual, and Family.

7. Choose an instrument with a time frame and response format that meet your needs.

Don't use a "trait" measure (i.e., an instrument that defines the underlying construct as a stable disposition) to assess a transient, situational "state" that you expect to change over time. And don't modify the labels on a response scale (e.g., from "rarely" to "never") or the time frame of measurement (e.g., from "in general" or "on average" to "during the past day" or "in the past hour") unless you have no other recourse. Substantial changes in an instrument's response format or time frame can compromise its construct validity and therefore require revalidation.

In measuring hostility, for example, the choice of an appropriate instrument would depend on whether you conceive of hostility as a transitory, variable state or a stable, dispositional trait. An appropriate tool for measuring "state" hostility might be the 35-item Current Mood Scale (Anderson, Deuser, & DeNeve, 1995), which is designed to assess situational hostility, whereas an appropriate tool for measuring "trait" hostility might be the 50-item Cook-Medley Hostility Scale (Cook & Medley, 1954), which is based on the Minnesota Multiphasic Personality Inventory (MMPI; Hathaway & McKinley, 1989) and conceptualizes hostility as a personality trait. Clearly, when selecting instruments you need to distinguish carefully between states and traits.

8. Match the reading level required to understand the items in the instruments you select to the age of the intended respondents.

In studying children or adolescents, for example, avoid using an instrument designed for use with adults. When in doubt, use word-processing or linguistic software to determine the reading ability level required to understand an instrument's constituent items.

Imagine a researcher who wants to measure depression in children. Depending on the average age of the subjects, the researcher could choose from a variety of different self-report instruments specifically designed to assess depression in children of various ages, including those with a first-grade reading level (Children's Depression Inventory; Kovacs, 1985), children age 7-13 (Negative Affect Self-Statement Questionnaire; Ronan, Kendall, & Rowe, 1994), children age 8-12 (Childhood Depression Assessment Tool; Brady, Nelms, Albright, & Murphy, 1984), or children age 8-13 (Depression Self-Rating Scale; Asarnow, Carlson, & Guthrie, 1987). In any case, the researcher studying childhood depression should avoid adopting an instrument designed to tap depression in adults.

9. Never use an instrument unless you know how you'll score it and how you'll analyze it.

This rule may seem self-evident, but it is sometimes violated unintentionally. No matter how interesting or important an instrument seems, it is useless unless you can convert responses to it into meaningful data. Sometimes the scoring rules are difficult to obtain or hard to follow, particularly when an instrument consists of multiple composite subscales, reverse-scored items, or item-specific scoring weights. This suggests that researchers should first make sure they know how to score an instrument before they administer it.

Consider the SF-12 (Ware, Kosinski, & Keller, 1996), a 12-item self-report instrument designed to measure functional health status. At first glance, it might appear simple enough to score this instrument by simply summing or averaging responses to its constituent items. But the test manual for the SF-12 (Ware, 1993) specifies a complex set of mathematical computations designed to weight and combine the 12 item scores to produce composite scores reflecting mental, physical, and total functioning. Users cannot score the SF-12 correctly unless they have access to the detailed scoring instructions contained in the test manual. Clearly, then, administering an instrument is one thing, but scoring it can be an entirely different matter.

10. Rather than choosing only one measure, when feasible use multiple measures of the construct you wish to assess.

A central tenet of classical measurement theory is that any single way of measuring a construct has unavoidable idiosyncrasies that are unique to the measure and have nothing to do with the underlying conceptual variable. By studying what multiple measures of the same construct have in common, researchers can converge or triangulate on the referent construct. Using multiple measures also allows an assessment of the generalizability of results across alternative operational or conceptual definitions, to probe the generality versus specificity of effects. And in the long run, using multiple instruments will advance our understanding of the targeted construct much further than simply using a single instrument.

Even when following the ten guidelines for instrument selection discussed above, you will still sometimes face difficult, highly subjective decisions in choosing appropriate measures. For example, which mood measure is more appropriate: one that uses a seven-point response scale, or one that uses a four-point response scale; one that includes a specific label for each individual point on its response scale, or one that has labels only at its endpoints; one that assesses the frequency with which respondents experience certain feelings, or one that taps the percentage of time respondents experience certain feelings? Given the subjectivity of such decisions, it makes all the more sense to use multiple measures whenever possible so as to evaluate the generalizability of results across alternative operational definitions of the same underlying construct.

To recap, I have suggested "Ten Commandments" for selecting self-report instruments:

1. Before choosing an instrument, define the construct you want to measure as precisely as feasible.
2. Choose the instrument that is designed to tap an underlying construct whose conceptual definition most closely matches your own.
3. Never decide to use a particular instrument based solely on its title.
4. Choose an instrument for which there is evidence of reliability and validity.
5. Given a choice among alternatives, select an instrument whose breadth of coverage matches your own conceptual scope.
6. Select an instrument that provides a level of measurement appropriate to your research goals.
7. Choose an instrument with a time frame and response format that meet your needs.
8. Match the reading level required to understand the items in the instruments you select to the age of the intended respondents.
9. Never use an instrument unless you know how you'll score it and how you'll analyze it.
10. Rather than choosing only one measure, when feasible use multiple measures of the construct you wish to assess.

Following these guidelines will help you to select instruments wisely.

Addendum: Obtaining Instruments Once Identified

Adhering to these ten guiding principles can help you identify appropriate measurement instruments, but then you need to obtain them and permission to use them. Indeed, physical availability may ultimately dictate instrument choice. An instrument may be unavailable for a variety of reasons, including copyright restrictions, being out-of-print, or death of the primary author. Given such obstacles, rather than trying to contact the original developer to obtain copies of an instrument, it may be best to contact BMDS, Behavioral Measurement Database Services, creator of the HaPI database, to secure permission to use an instrument, and to obtain a hardcopy of it along with any scoring instructions. For a reasonable fee, BMDS will perform these services.

References

Anderson, C.A., Deuser, W.E., & DeNeve, K.M. (1995). Hot temperatures, hostile affect, hostile cognition, and arousal: Tests of a general model of affective aggression. Personality and Social Psychology Bulletin, 21, 434-448.

Asarnow, J.R., Carlson, G.A., & Guthrie, D. (1987). Coping strategies, self-perceptions, hopelessness, and perceived family environments in depressed and suicidal children. Journal of Consulting and Clinical Psychology, 55, 361-366.

Beck, A.T. (1987). Beck Depression Inventory (BDI). San Antonio, TX: Psychological Corporation.

Blumenthal, J.A., Herman, S., O'Toole, L.C., Haney, T.L., Williams, R.B., Jr., & Barefoot, J.C. (1985). Development of a brief self-report measure of the Type A (coronary prone) behavior pattern. Journal of Psychosomatic Research, 29, 265-274.

Brady, M.A., Nelms, B.C., Albright, A.V., & Murphy, C.M. (1984). Childhood depression: Development of a screening tool. Pediatric Nursing, 10, 222-225, 227.

Brockway, J.H., & Bryant, F.B. (1997). Teaching the process of instrument selection in family research. Family Science Review, 10, 182-194.

Brockway, J.H., & Bryant, F.B. (1998). You can't judge a measure by its label: Teaching students how to locate, evaluate, and select appropriate instruments. Teaching of Psychology, 25, 121-123.

Bryant, F.B. (2000). Assessing the validity of measurement. In L.G. Grimm & P.R. Yarnold (Eds.), Reading and understanding more multivariate statistics (pp. 99-146). Washington, DC: American Psychological Association.

Cone, J.D., DeLawyer, D.D., & Wolfe, V.V. (1985). Assessing parent participation: The Parent/Family Involvement Index. Exceptional Children, 51, 417-424.

Cook, T.D., & Campbell, D.T. (1979). Quasi-experimentation: Design & analysis issues for field settings. Chicago: Rand McNally.

Cook, W.W., & Medley, D.M. (1954). Proposed hostility and pharisaic-virtue scales for the MMPI. Journal of Applied Psychology, 38, 414-419.

Diener, E., Emmons, R.A., Larsen, R.J., & Griffin, S. (1985). The Satisfaction with Life Scale. Journal of Personality Assessment, 49, 71-75.

Eysenck, H., & Fulker, D. (1983). The components of Type A behavior and its genetic determinants. Personality and Individual Differences, 4, 499-505.

Ferrans, C.E., & Powers, M.J. (1985). Quality of Life Index: Development and psychometric properties. Advances in Nursing Science, 8, 15-24.

Glass, D.C. (1977). Behavior patterns, stress, and coronary disease. Hillsdale, NJ: Lawrence Erlbaum.

Hamilton, M. (1960). A rating scale for depression. Journal of Neurology, Neurosurgery, and Psychiatry, 23, 56-62.

Hathaway, S.R., & McKinley, J.C. (1989). Minnesota Multiphasic Personality Inventory (MMPI). Minneapolis, MN: NCS Assessments.

Kovacs, M. (1985). The Children's Depression Inventory (CDI). Psychopharmacology Bulletin, 21, 995-998.

Kugler, K., & Jones, W.H. (1992). On conceptualizing and assessing guilt. Journal of Personality and Social Psychology, 62, 318-327.

Montada, L., & Schneider, A. (1989). Justice and emotional reactions to the disadvantaged. Social Justice Research, 3, 313-344.

National Institute of Mental Health (1977). Center for Epidemiologic Studies Depression Scale (CES-D). Rockville, MD: Author.

Ronan, K.R., Kendall, P.C., & Rowe, M. (1994). Negative affectivity in children: Development and validation of a self-statement questionnaire. Cognitive Therapy and Research, 18, 509-528.

Strube, M.J. (2000). Reliability and generalizability theory. In L.G. Grimm & P.R. Yarnold (Eds.), Reading and understanding more multivariate statistics (pp. 23-66). Washington, DC: American Psychological Association.

Thoresen, C.E., Friedman, M., Powell, L.H., Gill, J.J., & Ulmer, D. (1985). Altering the Type A behavior pattern in postinfarction patients. Journal of Cardiopulmonary Rehabilitation, 5, 258-266.

Ware, J.E., Jr. (1993). SF-12 Health Survey (SF-12). Boston, MA: Medical Outcomes Trust.

Ware, J., Jr., Kosinski, M., & Keller, S.D. (1996). A 12-Item Short-Form Health Survey: Construction of scales and preliminary tests of reliability and validity. Medical Care, 34, 220-233.

Wright, L. (1988). The Type A behavior pattern and coronary artery disease: Quest for the active ingredients and the elusive mechanism. American Psychologist, 43, 2-14.

Yogev, S., & Brett, J. (1985). Patterns of work and family involvement among single- and dual-earner couples. Journal of Applied Psychology, 70, 754-768.

Zung, W.W.K. (1965). A Self-Rating Depression Scale. Archives of General Psychiatry, 12, 371-379.

Fred Bryant is Professor of Psychology at Loyola University, Chicago. He has roughly 90 professional publications in the areas of social psychology, personality psychology, measurement, and behavioral medicine. In addition, he has co-edited 6 books, including Methodological Issues in Applied Social Psychology (1993; New York: Plenum). Dr. Bryant has extensive consulting experience in a wide variety of applied settings, including work as a research consultant for numerous marketing firms, medical schools, and public school systems; a methodological expert for the U.S. Government Accounting Office; and an expert witness in several federal court cases involving social science research evidence. He is currently on the Editorial Board of the journal Basic and Applied Social Psychology. His current research interests include happiness, the measurement of cognition and emotion, and structural equation modeling.

Remember me, I'm HaPI at BMDS!

HaPI Thoughts

Search for measurement instruments in the Health and Psychosocial Instruments (HaPI) database, with over 105,000 records of measurement instruments, online or on CD-ROM! Produced by BMDS, Behavioral Measurement Database Services.

SUBSCRIBE TO HaPI! Call Ovid Technologies at 800-950-2035 ext. 6472 for pricing and to order today!

Obtain copies of measurement instruments from BMDS Instrument Delivery! Call BMDS at 412-687-6850, fax 412-687-5213, or e-mail [email protected] ($20/$30 processing/handling fee).

Good News! Ovid Technologies now comprises both the Ovid and SilverPlatter platforms. Health and Psychosocial Instruments (HaPI) continues on the Ovid platform and can be found in the categories of Medicine & Allied Health and Social Sciences.

Measurement Assistance? Yes indeed! It's just a phone call away, and with a smile you get searching tips and suggestions for quick retrieval of the instruments you need. Call Evelyn Perloff, PhD, for Measurement Assistance from BMDS at 412-687-6850.
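A brief postscript on guideline 8: the "linguistic software" it mentions typically implements standard readability formulas, such as the Flesch-Kincaid grade level, which is computed from average sentence length and average syllables per word. The sketch below uses a crude vowel-group syllable heuristic and an invented sample item, so its output is only a rough approximation of what a commercial tool would report:

```python
# Flesch-Kincaid grade level: a rough sketch of the readability check
# that word-processing or linguistic software performs. The syllable
# counter is a crude vowel-group heuristic; results are approximate.
import re

def count_syllables(word):
    # Count groups of consecutive vowels; strip a trailing silent 'e'.
    word = word.lower()
    if word.endswith("e") and len(word) > 2:
        word = word[:-1]
    return max(1, len(re.findall(r"[aeiouy]+", word)))

def fk_grade(text):
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words) - 15.59)

# A hypothetical depression-style item, not drawn from any instrument.
sample = "I feel sad. I do not enjoy things I used to enjoy."
print(round(fk_grade(sample), 1))  # -> 1.5, i.e., roughly first-grade level
```

Such a check is no substitute for piloting items with respondents of the intended age, but it is a quick first screen for wording that is pitched too high.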
The Behavioral Measurement Letter
Behavioral Measurement Database Services

In This Issue:

• Introduction to This Issue, Al K. DeRoy, 1
• Refinements to the Lubben Social Network Scale: The LSNS-R, James Lubben, Melanie Gironda, and Alex Lee, 2
• Self-Estimated Intelligence, Adrian Furnham, 12
• Ten Commandments for Selecting Self-Report Instruments, Fred B. Bryant, 18
• HaPI Thoughts, 25