There has been an increased interest in utilizing self-presentation theory (^{14,22} ) to understand behavior in physical activity environments (^{21} ). For example, researchers have examined the construct of social physique anxiety in physical activity environments because it may have important implications for understanding exercise and sport behavior. Social physique anxiety has been related to avoidance of public exercise settings, preferences for exercise settings differentially emphasizing one’s physique, and self-reported leisure time physical activity (^{7,20,37} ), and it may influence an individual’s willingness to engage in some types of physical activity (^{21} ). Social physique anxiety also has been related to self-presentation issues such as physical attractiveness, physical self-presentation confidence, satisfaction with body size and weight, and weight control (^{7,11,27,29} ), and it even may detract from the positive affective responses associated with exercise (^{21} ).

The majority of research examining the construct of social physique anxiety has employed the Social Physique Anxiety Scale (SPAS)—a 12-item, unidimensional measure of anxiety related to the perceived negative evaluation of one’s physique by others (^{15} ). The factorial validity of the SPAS has been scrutinized (^{12,13,28,29,34} ), and it is not well established. Researchers initially reported that a one-factor model adequately described responses to the SPAS using exploratory factor analysis (^{15} ), and the one-factor model was further supported using confirmatory factor analysis (CFA; 29). Subsequent research tested the fit of three different models (one-factor unidimensional model, two-factor uncorrelated model, and two-factor higher-order model) to the SPAS utilizing CFA. Results indicated that the higher-order model, which consisted of two first-order factors (physique presentation comfort and expectations of negative physical evaluation) subordinate to a second-order factor (social physique anxiety), represented the best fit to the SPAS (^{13} ). The two-factor higher-order model was further supported by other research (^{34} ), and it has been reported to be invariant across gender (^{12} ).

Although the two-factor higher-order model has received empirical support, it also has been questioned as a valid solution to the SPAS (^{10,13,28} ). The multidimensional model to the SPAS may be conceptually flawed (^{10,13,28} ). The higher-order model was not identified according to the three-indicator rule (4, p. 247), and it was not supported in an analysis of factorial invariance (^{12} ); the test of invariant factor structure resulted in a larger chi-square value than the test of invariant variance-covariance matrices. The two factors in the higher-order model even may represent method variance (i.e., error) rather than true score variance, because all positively worded items load on one factor and all negatively worded items load on the other factor. Marsh (^{26} ) and Tomás and Oliver (^{39} ) have reported a similar phenomenon with a measure of global self-esteem and utilized CFA methodology to test whether two-factor models associated with positively and negatively worded items are substantively meaningful or methodological artifacts.

Rather than testing the fit of various models to the 12-item SPAS, some researchers have opted to modify the number of SPAS items in an attempt to improve the factorial validity (^{28,29} ). For example, Martin and colleagues (^{28} ) recently provided a nine-item unidimensional solution to the SPAS. The nine-item solution was generated based on conceptual and statistical arguments. Two of the original SPAS items did not appear to possess social evaluation components and correlated strongly with measures of body satisfaction; one of the original items (item 2) has consistently demonstrated a weak factor loading (^{12,13,29,34} ). The three items were removed and the remaining nine items were subjected to CFA to test the unidimensional model in four separate samples of females. Results of the CFAs provided support for the nine-item unidimensional solution in each of the four samples. Accordingly, Martin et al. (28, p. 359) suggested that the nine-item unidimensional model was “more parsimonious and conceptually clear than a two-factor model” and recommended that others should begin to utilize the nine-item unidimensional version of the SPAS.

The recommendation of Martin et al. (^{28} ) may be premature. Martin et al. (^{28} ) reported relative indices of fit but omitted other indices that would provide additional information needed to support the unidimensional model more completely. The CFAs were based on the responses of only women, and the extent to which the proposed nine-item model fits the responses of men is unknown. The invariance of the nine-item unidimensional model across gender also should be examined. Researchers have reported gender differences in mean SPAS scores (^{15,27} ), but inadequate attention has been directed toward potential gender differences in the factor structure. An invariance analysis will test the extent to which the factor structure underlying responses to the SPAS is equivalent across females and males (^{25} ). The invariance analysis illustrates whether responses to the SPAS have the same or different meaning for women and men, and it ultimately may contribute to a meaningful comparison and interpretation of mean SPAS scores across gender.

Further examination of the factorial validity and invariance of the SPAS is important because the understanding of social physique anxiety in physical activity environments is partly dependent upon a structurally valid and stable measure. The construct validity of the SPAS also requires further evaluation because establishing evidence of score meaning is a continual and evolving process (^{32} ). The present study, therefore, had five purposes that further tested the factorial validity, factorial invariance, and construct validity of the SPAS. We first tested whether a two-factor correlated solution to the original 12-item version of the SPAS was substantively meaningful or a methodological artifact representing positively and negatively worded items. The test was performed using the CFA methodology described by Marsh (^{26} ) and Tomás and Oliver (^{39} ). The two-factor correlated model was tested because it is statistically equivalent to the higher-order model, but unlike the higher-order model it is identified in covariance modeling. The second purpose involved assessing the factorial validity of the nine-item unidimensional model to the SPAS in a sample of women and men. The third purpose was to examine whether modifying the number of items based on both standardized residuals and item content would improve the factorial validity of the SPAS. The fourth purpose was to evaluate the factorial invariance of the SPAS across gender according to the multi-step procedure described by Jöreskog and Sörbom (^{19} ). The fifth purpose involved performing correlation analyses between SPAS scores and measures of theoretically related constructs to provide convergent and discriminant evidence of construct validity (^{32} ).

METHODS
Subjects
The Institutional Review Board approved the procedures for this study and all subjects signed an informed consent document before data collection. Participants (N = 312) were female (N = 146) and male (N = 166) college students with a mean age of 22.2 yr (SD = 4.0). The mean age of the women was 21.7 yr (SD = 3.9). The mean age of the men was 22.7 yr (SD = 4.1). Students were recruited from a Liberal Education lecture class (i.e., Sport in American Society; N = 103) because it was convenient and provided access to a large sample on a single occasion. Students also were recruited from a variety of exercise classes: weight training (N = 82), swimming (N = 61), aerobics (N = 30), aquatone (N = 22), and walking (N = 14). The participants were recruited to represent college-aged and physically active groups because the SPAS has most often been employed and psychometrically evaluated in these populations (e.g., 13,15,28). Employing a sample of subjects from different populations would not have allowed for a direct comparison of our results to previous research testing the factorial validity of the SPAS.

Instrumentation
The 12-item SPAS was employed as originally developed by Hart et al. (^{15} ) with the item two modification recommended by Eklund et al. (^{12} ). The 12 items were rated on a 5-point Likert-type scale with the response options of 1) not at all, 2) slightly, 3) moderately, 4) very, and 5) extremely. Items 1, 5, 8, and 11 were reverse-scored. Responses from all items were employed when testing the 12-item SPAS; only responses to items 3, 4, 6, 7, 8, 9, 10, 11, and 12 were employed in the CFA on the nine-item unidimensional model. The internal consistency of the 12- and 9-item versions of SPAS scores has ranged between 0.88 and 0.90 (^{15,27–29,34} ). The 8-wk test-retest reliability of the 12-item SPAS has been estimated to be 0.82 (^{15} ). The convergent validity of SPAS scores has been supported by positive correlations with measures of social anxiety (i.e., interaction anxiousness and fear of negative evaluation), public self-consciousness, weight and body shape satisfaction, and percent body fat (^{7,11,15,34} ). Negative correlations with measures of body cathexis and multidimensional body-esteem (i.e., physical attractiveness, physical condition, sexual attractiveness, upper-body strength, weight concern) also have supported the convergent validity of the SPAS (^{15,34} ). SPAS scores have differentiated between individuals reporting high and low levels of discomfort, negative thoughts, and stress during physique evaluation (^{15} ), which provides evidence of its discriminant validity. High scores on the SPAS appear to reflect anxiety and discomfort related to the perceived negative evaluation of one’s physique by others.
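The scoring rule described above can be illustrated with a short sketch. It assumes a summed composite, which the text does not state explicitly, and the function name, constant names, and example respondent are hypothetical.

```python
# Illustrative scoring helper; the response vector below is made up for the example.
SPAS_REVERSED = {1, 5, 8, 11}                     # reverse-scored items named in the text
NINE_ITEM_SET = (3, 4, 6, 7, 8, 9, 10, 11, 12)    # items used for the nine-item model


def score_spas(responses, items=range(1, 13), scale_min=1, scale_max=5):
    """Sum the selected items after reverse-scoring the positively worded ones.

    `responses` maps item number (1-12) to the raw rating (1-5).
    A summed composite is an assumption made for this sketch.
    """
    total = 0
    for item in items:
        raw = responses[item]
        if not scale_min <= raw <= scale_max:
            raise ValueError(f"item {item}: rating {raw} is off the scale")
        # Reverse-scoring mirrors the rating around the scale midpoint: 1<->5, 2<->4.
        total += (scale_min + scale_max - raw) if item in SPAS_REVERSED else raw
    return total


# Hypothetical respondent who answers 2 ("slightly") to every item.
example = {item: 2 for item in range(1, 13)}
twelve_item_total = score_spas(example)
nine_item_total = score_spas(example, items=NINE_ITEM_SET)
print(twelve_item_total, nine_item_total)
```

For this respondent, the four reverse-scored items each contribute 4 and the remaining items each contribute 2.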

The eight-item surveillance subscale of the Objectified Body Consciousness Scale (S-OBCS; 31) was employed to measure the degree to which an individual monitors bodily appearance. Because social physique anxiety is a self-presentation construct that involves anxiety related to the perceived negative evaluation of one’s physique by others, SPAS scores should theoretically exhibit a moderate, positive correlation with monitoring bodily appearance. The eight items were rated on a seven-point scale anchored by strongly agree (1) and strongly disagree (7). Items 1, 2, 3, 4, 7, and 8 were reverse-scored. The internal consistency of the S-OBCS has been estimated to range between 0.76 and 0.89 (^{31} ). The 2-wk test-retest reliability has been estimated to be 0.79 (^{31} ). Surveillance subscale scores have exhibited appropriate correlations with measures of appearance orientation, body-esteem, public body consciousness, and public self-consciousness in young women (^{31} ), which supports its construct validity. High scores on the S-OBCS reflect the tendency to monitor and think about the appearance of one’s body.

The 22-item Physical Self-Efficacy Scale (PSES; 36) was employed to measure the two constructs of perceived physical ability (PPA) and physical self-presentational confidence (PSPC). Based on previous research (^{27,29} ), we anticipated that social physique anxiety would moderately and negatively correlate with PPA and PSPC. The 22 items were rated on a 6-point Likert scale anchored by strongly agree (1) and strongly disagree (6). The PPA subscale consisted of 10 items (1, 2, 4, 6, 8, 12, 13, 19, 21, and 22); the PSPC subscale consisted of 12 items (3, 5, 7, 9, 10, 11, 14, 15, 16, 17, 18, and 20). Items 1, 3, 4, 9, 11, 14, 17, 19, 20, 21, and 22 were reverse-scored. The PPA and PSPC subscale scores were reported to be positively correlated (r = 0.26) and could be summed to form an overall PSES score (^{36} ). We did not employ the overall PSES score in this study because it would contain information that was redundant to the correlations between SPAS scores and PPA and PSPC subscale scores. The internal consistency of the PSPC and PPA subscales was originally estimated to range between 0.74 and 0.85 (^{36} ), although some studies have found the PSPC to possess internal consistency estimates below 0.70 (^{27,29,30} ). The 6-wk test-retest reliability of the PPA and PSPC subscales has been estimated to range between 0.69 (PSPC) and 0.85 (PPA; 36). The construct validity of the PPA and PSPC subscales has been established via negative correlations with measures of anxiety, self-consciousness, and social anxiety and positive correlations with measures of physical self-concept, self-esteem, and sensation seeking (^{36} ). Predictive validity has been established via prediction of physical fitness, self-reported physical activity, sport participation, and performance on tests of reaction time and motor coordination (^{9,36,38} ). High PPA scores reflect perceived competence in performing tasks involving physical skills; high scores on the PSPC reflect confidence in displaying physical skills and having them evaluated by others.

The 13-item version C short-form of the Marlowe-Crowne Social Desirability Scale (SDS-C) was employed to measure social desirability (^{35} ). We measured social desirability because it may be a component of self-presentation (^{33} ) and therefore correlate positively with social physique anxiety scores. We also were interested in determining whether SPAS scores were influenced by social desirability response biases. Items were rated “true” or “false” and responses to items 1, 2, 3, 4, 6, 8, 11, and 12 were reverse-scored. Responses were summed to form an overall score. The internal consistency has been estimated to be 0.76 (^{35} ). The mean item to short-form scale correlation was reported to be 0.38, which was consistent with the original findings for the complete Marlowe-Crowne scale (^{35} ). Scores on the SDS-C have been correlated with the original Marlowe-Crowne Social Desirability Scale (r = 0.93) and with the Edwards Social Desirability Scale (r = 0.41; 35). High SDS-C scores reflect an increased tendency for social desirability response bias.

Procedure
Data were collected from intact lecture and physical activity classes. Course instructors were initially contacted to schedule a date for data collection. On the date of data collection, instructors introduced the researchers and the researchers then described the study as “an inquiry into the attitudes of exercisers” and provided a brief overview of the methods. Students who were willing to participate signed an informed consent document and completed a demographic questionnaire, SPAS, S-OBCS, PSES, and SDS-C.

Data Analysis
The fit of the models for describing responses to the SPAS was examined using CFA with maximum likelihood estimation in LISREL 8.30 (Scientific Software International, Inc., Chicago, IL). More detailed descriptions of CFA methodology have been provided elsewhere (^{3,4,19,40} ). Briefly, researchers using CFA postulate an “a priori” structure linking observed variables to latent factors and then the structure is tested for its ability to fit the data. The a priori structure consists of factor loadings (relationship between measured variables and latent variables), factor covariances and variances (relationship between latent variables), uniquenesses (combination of specific and error variance associated with each measured variable), and in some cases correlations among uniquenesses (systematic covariation between items that is not explained by the latent variables). Maximum likelihood was selected to estimate the parameters in the models because it has resulted in accurate absolute and relative fit indices with ordered categorical data of varying degrees of kurtosis (^{17} ). The CFAs were over-identified according to the necessary (i.e., there were more elements in the covariance matrix than estimated parameters allowing for a model that can be uniquely determined) and sufficient guidelines of Bollen (^{4} ). The sample size in this study was adequate to estimate the various models based on two criteria: 1) a total sample size larger than 300 and 2) ratio of total sample size to number of freely estimated parameters greater than 5:1 and approximating 10:1 (^{3,4} ).
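The necessary condition for identification and the sample-size ratio can be checked with simple counting. The sketch below is illustrative only: it assumes simple structure and that each factor is scaled by fixing its variance to 1 (fixing one loading per factor instead gives the same count), and the function name is hypothetical.

```python
def cfa_counts(n_items, n_factors=1, n_correlated_uniquenesses=0):
    """Return (unique covariance elements, free parameters, degrees of freedom).

    Assumes each item loads on exactly one factor and each factor variance is
    fixed to 1, so the free parameters are the loadings, the uniquenesses,
    the factor covariances, and any correlated uniquenesses.
    """
    elements = n_items * (n_items + 1) // 2
    factor_covariances = n_factors * (n_factors - 1) // 2
    free = 2 * n_items + factor_covariances + n_correlated_uniquenesses
    return elements, free, elements - free


N = 312  # total sample size reported in the Subjects section

specifications = {
    "12-item, one factor": dict(n_items=12),
    "12-item, two correlated factors": dict(n_items=12, n_factors=2),
    "9-item, one factor": dict(n_items=9),
}
for label, kwargs in specifications.items():
    elements, free, df = cfa_counts(**kwargs)
    print(f"{label}: {elements} elements, {free} free parameters, "
          f"df = {df}, N per parameter = {N / free:.1f}")
```

Under these assumptions the nine-item model has 45 covariance elements, 18 free parameters, and 27 degrees of freedom, and each specification listed has more than 10 participants per free parameter.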

Both absolute and relative fit indices were examined to assess whether the model was reasonable (^{3,4,16,19} ). The chi-square statistic assessed absolute fit of the specified model to the data, but it is sensitive to sample size and assumes the correct model. Accordingly, Jöreskog (^{18} ) suggested that the chi-square to degrees of freedom ratio (χ^{2} / df) also should be employed to judge model fit. The χ^{2} / df should be less than three and close to two as an estimate of acceptable fit (^{4,6,40} ). The root mean square error of approximation (RMSEA) represents closeness of fit, and values should approximate 0.05 to demonstrate an acceptable and close fit (^{5,16} ). The 90% confidence interval (CI) around the RMSEA point estimate should contain 0.05 and/or zero to indicate the possibilities of close and/or exact fit (^{5} ). The Goodness of Fit Index (GFI) represents the relative amount of variances and covariances reproduced by the specified model compared to the saturated model. The Non-normed Fit Index (NNFI) tests the fit of the specified model to the null model (i.e., no structure) for describing the variance-covariance among items; NNFI values can be above one. Minimally acceptable fit was based on threshold GFI and NNFI values of 0.90 (^{2,4} ), and values approximating 0.95 were indicative of good fit (^{16} ). The standardized root mean square residual (SRMR) is the average of the standardized residuals between the specified and obtained variance-covariance matrices. The SRMR value should be less than 0.05 and close to zero (^{4,40} ). The LISREL estimates of item loadings, standard errors, t-values, and squared multiple correlations (SMC) also were inspected for appropriate sign and/or magnitude. The standardized residuals were examined to characterize the model fit and identify possible areas of model modification.
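Two of these indices can be recovered directly from the chi-square value, its degrees of freedom, and the sample size. The sketch below assumes the common point-estimate formula RMSEA = sqrt(max(χ² − df, 0) / (df (N − 1))); some programs divide by N instead, and the function names are hypothetical. The example values are the rounded nine-item results reported later in the text.

```python
import math


def chi_square_ratio(chi_square, df):
    """Chi-square to degrees-of-freedom ratio."""
    return chi_square / df


def rmsea(chi_square, df, n):
    """RMSEA point estimate, assuming the (n - 1) form of the formula."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


def screen_fit(chi_square, df, srmr, nnfi, gfi):
    """Compare indices with the crisp cut-offs named in the text."""
    return {
        "chi2/df < 3": chi_square_ratio(chi_square, df) < 3,
        "SRMR < 0.05": srmr < 0.05,
        "NNFI >= 0.90": nnfi >= 0.90,
        "GFI >= 0.90": gfi >= 0.90,
    }


# Rounded values reported for the nine-item model (N = 312).
ratio = chi_square_ratio(96, 27)
point_estimate = rmsea(96, 27, 312)
flags = screen_fit(96, 27, srmr=0.04, nnfi=0.93, gfi=0.94)
print(f"chi2/df = {ratio:.2f}, RMSEA = {point_estimate:.2f}")
print(flags)
```

With these inputs the ratio and RMSEA reproduce the reported 3.56 and 0.09, and only the χ² / df criterion is flagged as not met among the four crisp cut-offs.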

Methodological artifact.
To determine whether the two-factor model to the 12-item SPAS represented a substantively meaningful solution or a methodological artifact, we employed a procedure outlined by Marsh (^{26} ) and Tomás and Oliver (^{39} ). This procedure involved testing four nested models. Model 1 involved a single-factor solution to the SPAS representing social physique anxiety. Model 2 posited a two-factor model to the SPAS representing physique presentation comfort (items 1, 5, 8, and 11) and expectations of negative physique evaluation (items 2, 3, 4, 6, 7, 9, 10, and 12) as previously reported by Eklund et al. (^{12} ). Models 3 and 4 posited a single-factor model of social physique anxiety plus the different method effects. Model 3 posited correlated uniquenesses among residual variances of positively worded items (items 1, 5, 8, and 11); model 4 posited correlated uniquenesses among residual variances of negatively worded items (items 2, 3, 4, 6, 7, 9, 10, and 12). Model 4, more specifically, posited that there was systematic residual covariation among negatively worded items that was not explained by the global social physique anxiety factor. Chi-square difference tests were employed to compare the nested models.
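A chi-square difference test subtracts the chi-square and degrees of freedom of the less restricted model from those of the more restricted model and refers the difference to a chi-square distribution. The sketch below uses only the Python standard library; the function names are hypothetical, and the model values passed to the difference test are invented for illustration rather than taken from Table 3.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df (closed form)."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum_{i<k} y^i / i!, with k = df / 2.
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= y / i
            total += term
        return math.exp(-y) * total
    # Odd df: erfc(sqrt(y)) + exp(-y) * sum_{i<k} y^(i+1/2) / Gamma(i+3/2), k = (df-1)/2.
    total = sum(y ** (i + 0.5) / math.gamma(i + 1.5) for i in range((df - 1) // 2))
    return math.erfc(math.sqrt(y)) + math.exp(-y) * total


def chi2_difference_test(chi2_restricted, df_restricted, chi2_general, df_general):
    """Return (delta chi-square, delta df, p-value) for two nested models."""
    delta_chi2 = chi2_restricted - chi2_general
    delta_df = df_restricted - df_general
    if delta_df <= 0:
        raise ValueError("the restricted model must have more degrees of freedom")
    return delta_chi2, delta_df, chi2_sf(delta_chi2, delta_df)


# Hypothetical chi-square values for a one-factor model (df = 54) and a
# two-factor model (df = 53); these numbers are NOT from the study.
delta_chi2, delta_df, p_value = chi2_difference_test(60.0, 54, 50.0, 53)
print(f"delta chi2 = {delta_chi2:.1f}, delta df = {delta_df}, p = {p_value:.4f}")
```

As a check on the tail-probability helper, `chi2_sf(21, 14)` rounds to 0.10, which agrees with the P value reported for the seven-item model in the Results.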

Model modification.
Model modification was conducted via an iterative process that involved removing a single item and then rerunning the CFA. Items were removed based on large standardized residuals (i.e., greater than ± 2) and substantive arguments concerning item content (i.e., redundancy and salience). Large standardized residuals identified pairs of items that were either over- or under-predicted by the model (^{4,19} ). One of the two items was removed based on redundant content and/or content that was not equally salient across gender. The CFA was then reperformed to determine whether the modification resulted in an improved fit. This process was continued until a reasonable model was generated as indicated by the absolute and relative fit indices, but modifications were only made when substantively appropriate as recommended by Jöreskog and Sörbom (^{19} ). We also utilized the Akaike Information Criterion (AIC; 1) and the Expected Cross-Validation Index (ECVI; 5) to test modifications because chi-square difference tests cannot be legitimately performed on nonnested models. The AIC value was computed based on the chi-square value for the model plus two times the number of estimated parameters (^{40} ). The ECVI is a single sample estimate that indicates how well the current solution would fit in an independently drawn sample (^{5} ). The AIC and ECVI are not normed on a zero to one scale; reductions in AIC and ECVI values in comparison to other competing models demonstrated an improved and more parsimonious fit of a model (^{40} ).

Factorial invariance.
Factorial invariance was examined through a multi-step procedure outlined by Jöreskog and Sörbom (^{19} ). The invariance routine involved initial CFAs on the female and male data to determine whether the model was tenable in each group separately. The next analysis assessed the invariance of the variance-covariance matrices (Equal Sigmas). The analysis of Equal Sigmas tested whether the variance-covariance underlying the SPAS item responses was invariant across women and men. When the test of Equal Sigmas was rejected, the remainder of the invariance routine examined group differences in the factor structure; the search for group differences also followed a failure to reject the Equal Sigmas hypothesis similar to follow-up ANOVAs with a nonsignificant MANOVA. The final portion of the invariance routine involved four hierarchical CFAs. The CFAs were hierarchical because each successive analysis contained the previous restriction(s) plus one additional restriction. The first CFA tested the equality of the factor structure across groups (i.e., same dimensions or location of fixed, free, and constrained parameters; model 1). The subsequent two CFAs tested the invariance of the factor loadings (i.e., equality of coefficients linking the observed and latent variable; model 2) and factor variance (i.e., equality of factor variance; model 3) across groups. The final CFA was most restrictive, and it tested the invariance of item uniquenesses across groups (i.e., equality of measurement and specific error variance associated with each item; model 4). See Bollen (^{4} ) and Jöreskog and Sörbom (^{19} ) for more detailed descriptions of the invariance analyses. Chi-square difference tests were employed to determine when the scale was no longer invariant across gender. We also utilized the AIC and ECVI to compare invariance across nonnested models.
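The hierarchy of hypotheses in this routine can be summarized compactly. The notation below is conventional for multi-group CFA and is not taken from the original text: Σ is the population variance-covariance matrix, Λ the vector of factor loadings, φ the factor variance, Θ the diagonal matrix of uniquenesses, and the subscripts f and m denote women and men.

```latex
\begin{align*}
H_{\Sigma}&:\ \Sigma_{f} = \Sigma_{m} && \text{(Equal Sigmas)}\\
\text{Model 1}&:\ \Sigma_{g} = \Lambda_{g}\,\phi_{g}\,\Lambda_{g}^{\top} + \Theta_{g},\quad g \in \{f, m\} && \text{(same one-factor form)}\\
\text{Model 2}&:\ \text{Model 1 and } \Lambda_{f} = \Lambda_{m} && \text{(invariant loadings)}\\
\text{Model 3}&:\ \text{Model 2 and } \phi_{f} = \phi_{m} && \text{(invariant factor variance)}\\
\text{Model 4}&:\ \text{Model 3 and } \Theta_{f} = \Theta_{m} && \text{(invariant uniquenesses)}\\
\Delta\chi^{2}_{k} &= \chi^{2}_{k+1} - \chi^{2}_{k},\qquad \Delta df_{k} = df_{k+1} - df_{k} && (k = 1, 2, 3)
\end{align*}
```

Because each model adds one set of equality constraints to the previous model, adjacent models are nested and each Δχ² can be referred to a chi-square distribution with Δdf degrees of freedom.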

RESULTS
Descriptive Statistics
The means, standard deviations, skewness, and kurtosis values for the 12 items on the SPAS are reported in Table 1. The skewness and kurtosis values for all SPAS items were within an acceptable range (i.e., ± 1.96). Examination of multivariate normality using PRELIS 2.20 (Scientific Software International, Inc., Chicago, IL) yielded Mardia’s (^{24} ) coefficient values (normalized estimate) for skewness and kurtosis of 11.34 and 11.14, respectively. It should be noted, however, that the magnitude of Mardia’s coefficient is positively related to sample size (^{4,24} )^{1} and the reported values were only employed for descriptive purposes (^{40} ).

Table 1: Means, standard deviations, skewness, and kurtosis values for the 12 items comprising the SPAS.

The composite mean scores and standard deviations from three different versions of the SPAS and from the S-OBCS, PSES, and SDS-C are provided in Table 2. The composite mean scores and standard deviations are presented for the total sample and for the samples of women and men separately.

Table 2: Composite mean scores and standard deviations from three versions of the SPAS and from the S-OBCS, PSES, and SDS-C; the mean scores and standard deviations are presented for the total sample and for the samples of women and men separately.

CFA on the 12-Item SPAS: Test of One and Two-factor Models
Results of the CFAs testing whether the two-factor solution to the 12-item SPAS represented substantively meaningful factors or methodological artifact are reported in Table 3. The one-factor model representing a global social physique anxiety factor (model 1) did not fit the data as well as the two-factor model positing physique presentation comfort (i.e., positively worded items) and expectations of negative physique evaluation (i.e., negatively worded items) factors (model 2). The one-factor model with correlated uniquenesses among negatively worded items (model 4) fit better than the one-factor model with correlated uniquenesses among positively worded items (model 3). This was consistent with Marsh’s (^{26} ) claim that method effects are primarily associated with negatively worded items. The one-factor model with correlated uniquenesses among negatively worded items (model 4) also fit better than the two-factor model (model 2). This suggests that SPAS items tap one substantively meaningful construct and substantively irrelevant methodological effects related to item wording.

Table 3: Results of the CFAs testing whether the two-factor model to the 12-item SPAS was substantively meaningful or a methodological artifact.

CFA on the Nine-Item SPAS
Results of the CFA on the nine-item unidimensional model indicated that it represented an acceptable but perhaps not optimal fit to the SPAS responses of women and men (χ^{2} = 96, df = 27, χ^{2} / df = 3.56, RMSEA = 0.09 [90% CI = 0.07–0.11], SRMR = 0.04, NNFI = 0.93, GFI = 0.94). The chi-square value was significant and it suggested that the solution should be rejected. The χ^{2} / df exceeded three, and it also suggested that the model did not represent a reasonable fit to the data. The RMSEA value exceeded the acceptable threshold (i.e., 0.05) and the 90% CI around the RMSEA point estimate did not include 0.05 or zero. The SRMR, NNFI, and GFI provided some support for the nine-item model. The SRMR was less than the threshold value of 0.05 and the NNFI and GFI values exceeded 0.90, but not the 0.95 cut-off recommended by Hu and Bentler (^{16} ). The LISREL estimates of item loadings, standard errors, t-values, and squared multiple correlations (SMC) were of the expected sign and/or magnitude.

Model Modifications
We inspected the standardized residuals to determine whether the nine-item model to the SPAS could be modified to improve the fit. There were five standardized residuals greater than positive two and four standardized residuals that exceeded negative two. Three of the standardized residuals were larger than positive three; one of the standardized residuals exceeded negative three. Multiple large residuals were observed for item 6 (2 exceeding ± 2.00), item 11 (2 exceeding ± 2.00), and item 12 (2 exceeding ± 2.00). The content of these items was examined to determine whether there could be substantive justification for removing an item. Items 6 (“Unattractive features of my physique/figure make me nervous in certain social settings”) and 11 (“I usually feel relaxed when it is obvious that others are looking at my physique/figure”) appeared to be redundant in content. The wording of item 12 (i.e., “When in a bathing suit, I often feel nervous about the shape of my body”) may not be equally salient for women and men. This observation was partially supported by a comparison of mean scores for item 12 across gender; women reported a significantly higher mean score than men, t(309) = 6.62, P < 0.0001, 95% CI = 0.67–1.24.

Considering that item 8 (“I am comfortable with how fit my body appears to others”) appeared to tap the same content as item 11 but it demonstrated a better fit in the model (i.e., no large standardized residuals), we removed item 11 and reran the CFA on the remaining eight items. The eight-item model represented a good but not optimal fit to the SPAS (χ^{2} = 55, df = 20, χ^{2} / df = 2.75, RMSEA = 0.07 [90% CI = 0.05–0.10], SRMR = 0.04, NNFI = 0.96, GFI = 0.96). The standardized residuals suggested that item 12 was still problematic (i.e., 5 standardized residuals exceeding ± 2.00) and it was removed. The CFA was reperformed on the remaining seven items. The fit of the seven-item model to the SPAS was supported by both absolute and relative fit indices (χ^{2} = 21, df = 14, P = 0.10, χ^{2} / df = 1.5, RMSEA = 0.04 [90% CI = 0.00–0.07], SRMR = 0.03, NNFI = 0.99, GFI = 0.98). There was only one large standardized residual in the seven-item model.

We compared the AIC and ECVI values for the nine-, eight-, and seven-item unidimensional models. The AIC was reduced from 132.48 for the nine-item model to 87.85 for the eight-item model and to 49.09 for the seven-item model. The ECVI was reduced from 0.43 for the nine-item model to 0.28 for the eight-item model and to 0.16 for the seven-item model. The reduction in AIC and ECVI values across models suggested that the seven-item model to the SPAS represented an improved and more parsimonious fit than the nine- and eight-item models.
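The reported AIC and ECVI values can be checked against their definitions. The sketch below assumes the model AIC in the chi-square metric (chi-square plus twice the number of free parameters), a single-sample ECVI equal to AIC / (N − 1), and two free parameters per item for a one-factor model; the function names are hypothetical. Because the chi-square values in the text are rounded, agreement is expected only to within about one unit.

```python
def aic(chi_square, n_free_parameters):
    """Model AIC in the chi-square metric."""
    return chi_square + 2 * n_free_parameters


def ecvi_from_aic(aic_value, n):
    """Single-sample ECVI, assuming ECVI = AIC / (n - 1)."""
    return aic_value / (n - 1)


N = 312  # total sample size

# name: (rounded chi-square, reported AIC, reported ECVI, assumed free parameters)
reported = {
    "nine-item": (96, 132.48, 0.43, 18),
    "eight-item": (55, 87.85, 0.28, 16),
    "seven-item": (21, 49.09, 0.16, 14),
}

for name, (chi_square, aic_reported, ecvi_reported, q) in reported.items():
    print(f"{name}: AIC from rounded chi-square = {aic(chi_square, q)} "
          f"(reported {aic_reported}); "
          f"ECVI from reported AIC = {ecvi_from_aic(aic_reported, N):.2f} "
          f"(reported {ecvi_reported})")
```

Under these assumptions each recomputed AIC falls within one unit of the reported value, and each ECVI computed from the reported AIC rounds to the reported ECVI.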

Multigroup Invariance Analysis
The invariance analysis was performed on both the nine- and seven-item solutions to the SPAS. The nine-item model was examined for invariance because some indices from the CFA provided evidence of an acceptable fit to the SPAS and Martin et al. (^{28} ) have previously advocated the nine-item model. The seven-item model was examined for invariance because the CFA provided strong support of a fit to SPAS responses.

Nine-item model.
Table 4 contains the fit indices for each step of the invariance analysis. The nine-item model received acceptable, but not unanimous, support in the separate samples of women and men. Some indices suggested questionable fit of the model (i.e., χ^{2}, RMSEA, and 90% CI around the RMSEA); other indices (i.e., χ^{2} / df, SRMR, NNFI, and GFI) suggested that the nine-item unidimensional model was acceptable for summarizing the SPAS responses of both the females and the males. The test of Equal Sigmas was not rejected, and it indicated that the variance-covariance matrix was equivalent across groups. To examine possible differences in the factor structure across groups, models one through four were compared using chi-square difference tests. The first chi-square difference test was not significant and suggested that the one-factor structure and pattern of loadings were invariant across gender. The equivalence of the factor loadings across groups (i.e., equivalent coefficients linking observed variables to the latent variable) has been reported to be the minimal condition of factorial invariance (^{25} ). The second and third chi-square difference tests were significant and suggested that the factor variance and item uniquenesses were not equivalent across gender. The lack of equivalent factor variance was manifest in a larger LISREL parameter estimate for women than for men, which suggested that the one-factor model accounted for more variance among responses to the nine-item SPAS for women than men. The lack of invariant uniquenesses was not manifest in a clear pattern of gender differences; the estimates of uniquenesses for the women were both larger and smaller than the estimates for the men.

Table 4: Results of the CFAs testing the factorial invariance of the nine-item unidimensional model to the SPAS across gender.

The estimates of item loadings from model 2 are presented in Table 5 because it was not significantly different from model 1. The item loadings for women and men combined are common metric completely standardized; item loadings for women and men separately are within group completely standardized. The within group loadings for women were an average of 0.10 higher than the loadings for men. This difference seemed to be related to moderately large LISREL estimates of the standard errors (range = 0.10–0.13).

Table 5: Factor loadings for the nine-item unidimensional solution to the SPAS based on model 2 in the invariance routine.

We encountered a possible problem when testing the invariance of the nine-item model. The test of invariant factor structure (model 1) resulted in a substantially larger chi-square value than the test of invariant variance-covariance matrices (Equal Sigmas; see Table 4). One would expect that applying structure to the variance-covariance matrix would result in a slight, but not a substantial, increase in the chi-square value compared with the test of Equal Sigmas. The substantial increase in the chi-square value runs counter to the excellent example of an invariance routine provided by Jöreskog and Sörbom (19, p. 284, Table 9.2). Jöreskog and Sörbom (^{19} ) reported a dramatic reduction in the chi-square value when the model was applied to the variance-covariance matrix. The substantial increase in the chi-square value, therefore, seemed to indicate that the nine-item unidimensional model may be problematic. This problem is identical to that observed in the invariance analysis of the two-factor higher-order model (^{12} ), and it suggested that the nine-item model may not be an entirely acceptable solution to the SPAS.

Seven-item model.
The results of the CFAs to test the factorial invariance of the seven-item unidimensional model are presented in Table 6. The relative and absolute fit indices suggested that the seven-item unidimensional model represented a good fit in the separate samples of women and men. The Equal Sigmas hypothesis was not rejected, indicating that the variance-covariance matrix was invariant across gender. Based on the model comparisons in Table 6, the first chi-square difference test was not significant and suggested that the one-factor structure and pattern of loadings were invariant across gender (i.e., equivalent coefficients linking observed variables to the latent variable). The second chi-square difference test was significant. This finding suggested that the factor variance was not invariant between women and men: the LISREL estimate of factor variance was larger for women than men. The larger factor variance suggested that the model accounted for more variance among responses to the seven-item SPAS for women than men. The third chi-square difference test was not significant, indicating that the item uniquenesses were invariant across gender; that is, there was equivalence of measurement and specific error variance for each item across women and men.

Table 6: Results of the CFAs testing the factorial invariance of the seven-item unidimensional model to the SPAS across gender.

The estimates of item loadings representing common metric and within group completely standardized solutions from model 2 are presented in Table 7; model 2 was not significantly different from model 1. The within group loadings for women were an average of 0.11 higher than for men. Again, the LISREL estimates of standard errors were moderately large in magnitude (range = 0.11–0.13).

Table 7: Factor loadings for the seven-item unidimensional solution to the SPAS based on model 2 in the invariance routine.

We examined the AIC and ECVI values across the series of invariance analyses on the seven- and nine-item unidimensional models to the SPAS. As illustrated in Table 8 , the AIC and ECVI values were lower for the seven-item model compared to the nine-item model in all of the analyses of the invariance routine. Based on the AIC and ECVI values, the seven-item model to the SPAS appeared to possess improved factorial invariance and parsimony compared with the nine-item model.
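For readers unfamiliar with these indices, the sketch below shows how AIC and ECVI are commonly computed from a model chi-square. It assumes the widely cited single-sample definitions, AIC = χ² + 2t and ECVI = (χ² + 2t)/(N − 1), where t is the number of freely estimated parameters and N is the sample size; the exact values printed by LISREL for multi-group analyses may be scaled differently. The function names and all input values are hypothetical and are not taken from Table 8.

```python
def model_aic(chi2: float, n_free_params: int) -> float:
    """AIC under the assumed definition: chi-square plus 2 * free parameters."""
    return chi2 + 2 * n_free_params


def ecvi(chi2: float, n_free_params: int, n_obs: int) -> float:
    """ECVI under the assumed definition: AIC divided by (N - 1)."""
    if n_obs < 2:
        raise ValueError("n_obs must be at least 2")
    return model_aic(chi2, n_free_params) / (n_obs - 1)


# Hypothetical single-group example: a one-factor model with seven items
# and the factor variance fixed to 1 has 7 loadings + 7 uniquenesses = 14
# free parameters; the nine-item analogue has 18.
for label, chi2, t in [("seven-item", 50.0, 14), ("nine-item", 95.0, 18)]:
    print(label, model_aic(chi2, t), round(ecvi(chi2, t, 301), 3))
```

Lower values on both indices are taken to indicate a more parsimonious model that is more likely to cross-validate.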

Table 8: AIC and ECVI values from the factorial invariance routine on the seven- and nine-item unidimensional models to the SPAS across gender.

Internal Consistency
The internal consistency of the nine-item and seven-item models to the SPAS was estimated using coefficient alpha (^{8} ). The estimates of internal consistency for the nine-item and seven-item models were 0.67 and 0.72, respectively. The increase in coefficient alpha with the seven-item SPAS suggested that it represented a unidimensional scale to a greater extent than did the nine-item SPAS.
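Coefficient alpha can be computed directly from item-level responses, as in the minimal sketch below. It uses the standard formula α = [k/(k − 1)][1 − (Σ item variances)/(variance of total scores)], where k is the number of items. The function name and the response matrix are hypothetical and are used for illustration only; they are not data from the present study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Coefficient alpha for a list of respondents' item-score lists."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    items = list(zip(*rows))  # transpose: one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical responses: five respondents by three items.
responses = [
    [3, 4, 3],
    [2, 2, 3],
    [5, 4, 4],
    [1, 2, 1],
    [4, 5, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```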

Examination of Construct Validity
Table 9 presents the Pearson product-moment correlations performed on scores from the nine- and seven-item unidimensional SPAS and scores from the S-OBCS, PSES, and SDS-C. Significant negative correlations were observed between nine-item SPAS scores and PPA, PSPC, and SDS-C; SPAS and S-OBCS scores were positively correlated. Scores on the seven-item SPAS also were negatively correlated with PPA, PSPC, and SDS-C scores and positively correlated with S-OBCS scores. Scores from the nine- and seven-item versions of the SPAS were strongly correlated (r = 0.98, P < 0.0001).
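The Pearson product-moment correlation reported here can be sketched as follows. The function name and the two score vectors are hypothetical and serve only to illustrate the computation; they are not scores from the present sample.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scale scores for six respondents.
spas_scores = [21, 30, 18, 27, 35, 24]
other_scores = [40, 52, 35, 47, 58, 45]
print(round(pearson_r(spas_scores, other_scores), 3))
```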

Table 9: Correlations between scores from the nine-item and seven-item unidimensional SPAS models and other theoretically related measures.

DISCUSSION
This study tested whether the previously reported two-factor solution to the 12-item SPAS (^{12,13,34} ) was substantively meaningful or a methodological artifact. We then tested the nine-item unidimensional model reported by Martin et al. (^{28} ) for describing the SPAS responses of college-aged women and men. Using a data-driven model modification strategy supplemented by substantive arguments about item content, this study also generated a unidimensional solution to the SPAS that consisted of seven items. Both models were tested for factorial invariance across gender and examined for construct validity via correlations with measures of other theoretically related constructs (^{32} ).

12-Item SPAS: One or Two Factors?
Results of the CFA procedure outlined by Marsh (^{26} ) and Tomás and Oliver (^{39} ) indicated the existence of a single factor (social physique anxiety) rather than two factors (physique presentation comfort or positively worded items and expectations of negative physique evaluation or negatively worded items) underlying responses to the SPAS. The inclusion of substantively irrelevant methodological effects in the one-factor model, however, was necessary to achieve a good fit. The method effect was primarily associated with the negatively worded items. Our results were similar to the findings of Marsh (^{26} ) and Tomás and Oliver (^{39} ) when examining a global measure of self-esteem. These results reinforce the need to model method effects in factorial validity studies (^{26} ). When method effects are not modeled in factorial validity studies with both positively and negatively worded items, the underlying structure of the responses may be obscured by substantively irrelevant method variance (^{26} ).

Nine-Item Model
The CFAs provided some support for the nine-item unidimensional model to the SPAS. The nine-item unidimensional model represented a reasonable, but not optimal, fit as illustrated by the CFA on responses of both women and men. The invariance routine also supported the nine-item unidimensional model in separate samples of women and men. We did, however, obtain some evidence of questionable fit for the nine-item model in the series of invariance tests described by Jöreskog and Sörbom (^{19} ). Although the factor structure and pattern of loadings appeared to be invariant across gender, the test of invariant factor structure resulted in a substantially larger chi-square value than the invariance analysis of the variance-covariance matrix (Equal Sigmas). This was not consistent with the example of an invariance analysis provided by Jöreskog and Sörbom (19, p. 284, Table 9.2) and suggested that the nine-item unidimensional model may be problematic and did not represent an optimal solution to the SPAS. The possibility of a reasonable fit to the nine-item SPAS cannot be completely overlooked based on the acceptable fit indices (i.e., NNFI and GFI) obtained in all CFAs. Researchers should be cautious when utilizing the nine-item unidimensional version of the SPAS in examinations of physique-related anxiety until further research supports the recommendation proposed by Martin et al. (^{28} ) that others should begin to utilize the nine-item model in research on social physique anxiety.

Seven-Item Model
The data-generated, seven-item solution represented a very convincing fit to the SPAS. To create the seven-item solution, two items (i.e., items 11 and 12) were removed based on empirical findings (i.e., large standardized residuals) that were supported by substantive arguments (i.e., redundant or gender-specific content). The CFA on the remaining seven items provided absolute and relative fit indices that were indicative of a strong fit of the model to the SPAS. The reductions in AIC and ECVI values supported the possibility of an improved and more parsimonious fit when the seven-item model was compared to the nine- and eight-item solutions. The invariance routine even demonstrated that the variance-covariance matrix, factor structure, pattern of loadings, and uniquenesses were invariant across gender. It seems that the latent construct of social physique anxiety is measured similarly for female and male college-aged students when employing the seven-item SPAS. The invariance of the seven-item SPAS may add to the ability to make a meaningful and interpretable comparison of mean scores between female and male college-aged students on the construct of social physique anxiety.

Construct Validity of the SPAS
The construct validity of scores on the nine- and seven-item unidimensional solutions to the SPAS was examined to help provide evidence of convergent and discriminant validity (^{32} ). Scores from both unidimensional models to the SPAS were correlated with measures of other theoretically related constructs. Pearson product-moment correlations indicated that SPAS scores were positively related to S-OBCS scores. The positive correlation between SPAS and S-OBCS scores is theoretically reasonable. It suggests that individuals who are prone to physique-related anxiety also tend to monitor the appearance of their body and vice versa. As expected, SPAS scores were negatively related to PPA and PSPC scores. Other researchers also have reported negative relationships between SPAS scores and PPA and PSPC scores (^{27,29} ). It is not surprising that individuals with high social physique anxiety also report low levels of perceived physical ability and self-presentational confidence. Scores on these measures should covary and are consistent with the predictions of self-presentational theory applied to physical activity (^{21,27,29} ). There was a weak negative correlation between SPAS and SDS-C scores. The weak negative correlation indicates that SPAS scores were not overly influenced or contaminated by social desirability and/or response distortion. This finding is noteworthy because scores on the Marlowe-Crowne SDS (the parent form of the SDS-C) have been shown to tap impression management aspects of socially desirable responding (^{33} ). The correlation obtained in the present study is evidence of modest discriminant validity between an impression-related construct (i.e., social physique anxiety) and a general measure of impression management (i.e., SDS-C).

Future Research
Studies are needed to further test the factorial validity of the nine- and seven-item solutions to the SPAS using more diverse samples. This study and others (^{13,15,28} ) have primarily focused on college-aged and physically active groups, which represent a narrow range of the possible populations of relevance. Additionally, the seven-item model needs further testing because it might be over-fitted. The model modifications also may have capitalized on chance because a test of the seven-item model using an independent sample was not performed (^{23} ). There also is a need to examine the factorial invariance of the SPAS across other groups (e.g., individuals differing in fitness, age, and culture) to further establish the context in which differences in mean scores are meaningful and interpretable. Future research is needed to test the relationships between social physique anxiety and exercise adherence/mental-health benefits. The results of the present study also suggested future lines of research examining the direction of causality between social physique anxiety and other constructs (i.e., body consciousness and physical self-efficacy).

One important issue that needs to be addressed is whether the content of the SPAS, not the number of items, is reasonable to measure social-physique anxiety. The large positive correlation between scores from the nine- and seven-item models suggests that the same information was captured with both versions of the SPAS. The important concern is whether the domain of social physique anxiety can be adequately measured by seven or even nine items. The SPAS items were generated to tap only a general form of social physique anxiety (i.e., perceived negative evaluation of one’s physique by others; 15). This may be a limitation of the SPAS. From a self-presentational perspective (^{14,21,22} ), social anxiety related to one’s physique may involve many other factors including activity or environmental specific influences, body-part specific components, self-presentational motives, and significance of other factors. Future researchers might generate additional items to form a broader measure that more adequately samples the possible domain of social physique anxiety.

SUMMARY
The present study indicated that the two-factor model to the SPAS represented a methodological artifact, and the one-factor unidimensional model was a more appropriate solution to the SPAS. The results also provided some support for the factorial validity and invariance of the nine-item unidimensional solution to the SPAS reported by Martin et al. (^{28} ). In comparison, stronger support was obtained for a modified seven-item unidimensional model to the SPAS that demonstrated factorial validity and invariance of variance-covariance matrix, factor structure, pattern of loadings, and uniquenesses across gender. The observed correlations with measures of other theoretically related constructs provided convergent and discriminant evidence of construct validity for both the nine- and seven-item unidimensional solutions to the SPAS (^{32} ). Rather than overlook the nine-item solution in favor of the seven-item model, we recommend that future researchers exercise caution when interpreting SPAS scores until more research has demonstrated which solution is most stable and replicable (^{23} ). It also may be profitable to explore a broader domain of physique-related anxiety than presently defined by the SPAS.

The authors would like to thank Dr. Rod K. Dishman, Dr. Jeri Benson, and two anonymous reviewers for providing comments and suggestions that helped to improve the quality of this manuscript.

REFERENCES
1. Akaike, H. Factor analysis and AIC. Psychometrika 52:317–332, 1987.

2. Bentler, P. M., and D. G. Bonett. Significance tests and goodness of fit in the analysis of covariance structures. Psychol. Bull. 88:588–606, 1980.

3. Bentler, P. M., and C. Chou. Practical issues in structural modeling. Soc. Methods Res. 16:78–117, 1987.

4. Bollen, K. A. Structural Equations with Latent Variables. New York: John Wiley & Sons, Inc, 1989, pp. 1–514.

5. Browne, M. W., and R. Cudeck. Alternative ways of assessing model fit. In: Testing Structural Equation Models, K. A. Bollen and J. S. Long (Eds.). Newbury Park, CA: Sage Publications, Inc, 1993, pp. 136–162.

6. Carmines, E., and J. McIver. Analyzing models with unobserved variables: analysis of covariance structures. In: Social Measurement: Current Issues, G. Bohrnstedt and E. Borgatta (Eds.). Beverly Hills, CA: Sage, 1981, pp. 65–115.

7. Crawford, S., and R. C. Eklund. Social physique anxiety, reasons for exercise, and attitudes toward exercise settings. J. Sport Exerc. Psychol. 16:70–82, 1994.

8. Cronbach, L. J. Coefficient alpha and internal structure of tests. Psychometrika 16:297–334, 1951.

9. Dishman, R. K., C. R. Darracott, and L. T. Lambert. Failure to generalize determinants of self-reported physical activity to a motion sensor. Med. Sci. Sports Exerc. 24:904–910, 1992.

10. Eklund, R. C. With regard to the Social Physique Anxiety Scale (conceptually speaking). J. Sport Exerc. Psychol. 20:225–227, 1998.

11. Eklund, R. C., and S. Crawford. Active women, social physique anxiety, and exercise. J. Sport Exerc. Psychol. 16:431–448, 1994.

12. Eklund, R. C., B. Kelley, and P. Wilson. The Social Physique Anxiety Scale: men, women, and the effects of modifying item 2. J. Sport Exerc. Psychol. 19:188–196, 1997.

13. Eklund, R. C., D. Mack, and E. Hart. Factorial validity of the Social Physique Anxiety Scale for females. J. Sport Exerc. Psychol. 18:281–295, 1996.

14. Goffman, E. The Presentation of Self in Everyday Life. New York: Anchor Books, 1959, pp. 1–259.

15. Hart, E. A., M. R. Leary, and W. J. Rejeski. The measurement of social physique anxiety. J. Sport Exerc. Psychol. 11:94–104, 1989.

16. Hu, L., and P. M. Bentler. Cutoff criteria for fit indices in covariance structure analysis: conventional criteria versus new alternatives. Struct. Equat. Model 6:1–55, 1999.

17. Hutchinson, S. R., and A. Olmos. Behavior of descriptive fit indices in confirmatory factor analysis using ordered categorical data. Struct. Equat. Model 5:344–364, 1998.

18. Jöreskog, K. G. A general approach to confirmatory maximum likelihood factor analysis. Psychometrika 34:183–202, 1969.

19. Jöreskog, K. G., and D. Sörbom. LISREL® 8: User’s Reference Guide. Chicago, IL: Scientific Software International, Inc., 1996, pp. 1–378.

20. Lantz, C. D., C. J. Hardy, and B. E. Ainsworth. Social physique anxiety and perceived exercise behavior. J. Sport Behav. 20:83–93, 1997.

21. Leary, M. R. Self-presentational processes in exercise and sport. J. Sport Exerc. Psychol. 14:339–351, 1992.

22. Leary, M. R., and R. M. Kowalski. Social anxiety. New York: Guilford Press, 1995, pp. 1–244.

23. MacCallum, R. C., M. Roznowski, and L. B. Necowitz. Model modifications in covariance structure analysis: the problem of capitalization on chance. Psychol. Bull. 111:490–504, 1992.

24. Mardia, K. V. Measures of multivariate skewness and kurtosis with application. Biometrika 57:519–530, 1970.

25. Marsh, H. W. Confirmatory factor analysis models of factorial invariance: a multifaceted approach. Struct. Equat. Model 1:5–34, 1994.

26. Marsh, H. W. Positive and negative global self-esteem: a substantively meaningful distinction or artifactors? J. Pers. Soc. Psychol. 70:810–819, 1996.

27. Martin, K. A., and D. Mack. Relationships between self-presentation and sport competition trait anxiety: a preliminary study. J. Sport Exerc. Psychol. 18:75–82, 1996.

28. Martin, K. A., W. J. Rejeski, M. R. Leary, E. McAuley, and S. Bane. Is the Social Physique Anxiety Scale really multidimensional? Conceptual and statistical arguments for a unidimensional model. J. Sport Exerc. Psychol. 19:359–367, 1997.

29. McAuley, E., and G. Burman. The Social Physique Anxiety Scale: construct validity in adolescent females. Med. Sci. Sports Exerc. 25:1049–1053, 1993.

30. McAuley, E., and D. Gill. Reliability and validity of the Physical Self-Efficacy Scale in a competitive sport setting. J. Sport Psychol. 5:410–418, 1983.

31. McKinley, N. M., and J. S. Hyde. The Objectified Body Consciousness Scale: Development and validation. Psychol. Women Q. 20:181–215, 1996.

32. Messick, S. Validity of psychological assessment: validation of inferences from persons’ responses and performances as scientific inquiry into score meaning. Am. Psychol. 50:741–749, 1995.

33. Paulhus, D. L. Measurement and control of response bias. In: Measures of Personality and Social Psychological Attitudes, J. P. Robinson, P. R. Shaver, and L. S. Wrightsman (Eds.). San Diego: Academic Press, 1991, pp. 17–59.

34. Petrie, T. A., N. Diehl, R. L. Rogers, and C. L. Johnson. The Social Physique Anxiety Scale: reliability and construct validity. J. Sport Exerc. Psychol. 18:420–425, 1996.

35. Reynolds, W. M. Development of reliable and valid short forms of the Marlowe-Crowne social desirability scale. J. Clin. Psychol. 38:119–125, 1982.

36. Ryckman, R. M., M. A. Robbins, B. Thornton, and P. Cantrell. Development and validation of a Physical Self-Efficacy Scale. J. Pers. Soc. Psychol. 42:891–900, 1982.

37. Spink, K. S. Relation of anxiety about social physique to location of participation in physical activity. Percept. Mot. Skills 74:1075–1078, 1992.

38. Thornton, B., R. M. Ryckman, M. A. Robbins, J. Donolli, and G. Biser. Relationship between perceived physical ability and indices of actual physical fitness. J. Sport Exerc. Psychol. 9:295–300, 1987.

39. Tomás, J. M., and A. Oliver. Rosenberg’s Self-Esteem Scale: two factors or method effects. Struct. Equat. Model 6:84–98, 1999.

40. Ullman, J. B. Structural equation modeling. In: Using Multivariate Statistics, 3rd Ed., B. G. Tabachnick and L. S. Fidell (Eds.). New York: Harper Collins College Publishers, 1996,