Psychometric Properties of the Acceptance and Action Questionnaire–II in Colombia

Introduction and Background

The Acceptance and Action Questionnaire–II (AAQ-II; Bond et al., 2011) is one of the most widely used instruments for assessing experiential avoidance and psychological inflexibility, constructs that are central to Acceptance and Commitment Therapy (ACT). The AAQ-II was developed as a substantial improvement over its predecessor (AAQ-I), which exhibited questionable psychometric properties, particularly a complex factor structure and limited internal consistency. The revised version demonstrated a clear unidimensional structure and solid internal consistency in English-speaking samples.

Experiential avoidance, defined as the process of avoiding or escaping from aversive internal experiences (thoughts, emotions, sensations), has been proposed as a transdiagnostic mechanism common to multiple psychological disorders. The AAQ-II operationalizes this construct through seven items that assess both the degree of resistance to internal experiences and the functional impact of psychological inflexibility.

Although the AAQ-II has been translated into multiple languages, including Spanish (Ruiz et al., 2013), its psychometric characteristics have been debated in the scientific literature. Some authors have raised concerns about possible criterion contamination, arguing that the items include descriptions of negative affectivity that might conflate the measurement of experiential avoidance with general psychological distress. However, more recent research has indicated that AAQ-II scores explain variance beyond affect measures, remain temporally stable despite symptom fluctuations, and predict therapeutic outcomes independently of depressive or anxious symptoms.

A critical aspect not previously explored systematically was the measurement invariance of the AAQ-II across clinical and nonclinical samples, as well as between genders. Additionally, no psychometric data for the AAQ-II had been published for the Colombian population, which motivated this investigation.

Objectives

Primary objective: To analyze the psychometric properties and factor structure of the Spanish AAQ-II in Colombian samples.

Secondary objectives: (1) to examine measurement invariance across clinical and nonclinical samples, and (2) to evaluate measurement invariance between genders.

Method

Design and Participants

The study adopted an instrumental validation design with four independent samples:

Sample 1 (N = 762): Undergraduate students from seven universities in Bogotá. Age range 18–63 years (M = 21.16, SD = 3.76). Forty-six percent studied Psychology. Gender composition: 62% women. Regarding history of psychological/psychiatric treatment: 26% had received some form of treatment; 4.3% were in active treatment; 2.9% were taking psychotropic medication.

Sample 2 (N = 724; 74.4% women): Age range 18–88 years (M = 26.11, SD = 8.93). Recruited through an anonymous online survey; all participants resided in Colombia. History of treatment: 45%; current treatment: 8.4%; psychotropic medication: 5.4%. This sample represented a more diverse population with a higher prevalence of psychological difficulties than Sample 1.

Sample 3 (N = 277): Clinical patients referred from private practice in Bogotá. Gender composition: 64% women. Age range 18–67 years (M = 28.4, SD = 11.33). Diagnoses: 88.4% emotional disorders (depression and anxiety), 11.6% sexual disorders. Psychotropic medication: 6.1%.

Sample 4 (N = 11; 2 men): Participants in a single-session ACT study for repetitive negative thinking (RNT). Age range 18–32 years (M = 22.18, SD = 4.40). Used to assess sensitivity to change.

Total sample size for factor structure analyses (Samples 1–3 combined): N = 1,759.

Instruments

Acceptance and Action Questionnaire–II (AAQ-II): Seven items rated on a 7-point Likert scale (1 = never true; 7 = always true). Participants rate how true each statement is for them; the statements concern avoidance of internal experiences and its functional impact. Score range: 7–49, with higher scores indicating greater psychological inflexibility.
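The scoring rule described above can be sketched in a few lines. This is a minimal illustration that assumes the total is the plain sum of the seven ratings with no reverse-keyed items, which is what the stated 7–49 range implies; the function name `score_aaq2` and the example responses are hypothetical.

```python
from typing import Sequence

N_ITEMS = 7                      # the AAQ-II has seven items
MIN_RATING, MAX_RATING = 1, 7    # 1 = never true, 7 = always true


def score_aaq2(responses: Sequence[int]) -> int:
    """Return the AAQ-II total score (7-49) as the sum of the seven ratings.

    Assumption: no item is reverse-keyed, so the total is a plain sum.
    """
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    for rating in responses:
        if not MIN_RATING <= rating <= MAX_RATING:
            raise ValueError(
                f"rating {rating} is outside {MIN_RATING}-{MAX_RATING}"
            )
    return sum(responses)


# Hypothetical respondent, for illustration only.
example_total = score_aaq2([3, 2, 4, 3, 2, 3, 3])
print(example_total)
```

Higher totals indicate greater psychological inflexibility; the validation checks simply guard against incomplete or out-of-range protocols.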

Instruments for convergent and discriminant validity:

DASS-21 (Depression, Anxiety and Stress Scales–21): Measure of emotional symptoms.
GHQ-12 (General Health Questionnaire–12): Assessment of general psychological distress.
DAS-R (Dysfunctional Attitude Scale–Revised): Measure of dysfunctional beliefs.
SWLS (Satisfaction with Life Scale): Assessment of life satisfaction.
MAAS (Mindful Attention Awareness Scale): Measure of present-moment awareness.

Internal consistency values (α) were reported for all instruments in each sample.

Data Analysis

Confirmatory factor analysis (CFA): Conducted with LISREL 8.71 using weighted least squares (WLS) estimation with polychoric correlations, given that the Likert data were treated as ordinal. Two models were compared: (a) a simple unidimensional model, and (b) a unidimensional model with correlated error terms for Items 1 and 4, following the specification of Bond et al. (2011), who identified a residual covariance between these items in English-language samples.
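The two competing models can be written compactly. The notation below is an assumption (conventional LISREL-style symbols that do not appear in the summary): x_i is the response to item i (strictly, the latent continuous response underlying the ordinal rating), ξ is the experiential avoidance factor, λ_i is a loading, and δ_i is an error term. Fixing the factor variance to 1 is one common way to identify such a model.

```latex
% One-factor model for the seven AAQ-II items (i = 1, ..., 7)
x_i = \lambda_i \xi + \delta_i, \qquad \operatorname{Var}(\xi) = 1

% Model (a): all error covariances are fixed to zero
\operatorname{Cov}(\delta_i, \delta_j) = 0 \quad \text{for all } i \neq j

% Model (b): as (a), except that one error covariance is freely estimated
\operatorname{Cov}(\delta_1, \delta_4) = \theta_{14}
```

Model (b) estimates exactly one more parameter than model (a), so the two models are nested and differ by a single degree of freedom, which is what makes a chi-square difference test applicable.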

Fit indices evaluated: RMSEA (Root Mean Square Error of Approximation), CFI (Comparative Fit Index), NNFI (Non-Normed Fit Index), ECVI (Expected Cross-Validation Index). Adequate fit criteria: RMSEA < 0.08, CFI > 0.90, NNFI > 0.90.

Model comparison: Via likelihood ratio test (chi-square difference).

Measurement invariance: Multiple-group analysis examining invariance across the three independent samples (1–3) and between genders. Baseline models were compared with models in which parameters were constrained to be equal across groups. Chen (2007) criteria: ΔRMSEA < 0.01, ΔCFI > −0.01, ΔNNFI > −0.01.

Internal consistency: Cronbach's alpha with 95% confidence intervals via SPSS 19. Corrected item-total correlations.

Discriminant validity: Student's t-tests comparing groups by GHQ-12 cutoff score.

Convergent validity: Pearson correlations with related constructs.

Sensitivity to change: Paired t-test on pre- and post-intervention scores in Sample 4, with Cohen's d as the effect size.

Results

Factor Structure

Analyses were conducted with the combined sample (N = 1,759).

Simple unidimensional model: RMSEA = 0.097 [95% CI: 0.087–0.110], CFI = 0.97, NNFI = 0.95, ECVI = 0.16 [95% CI: 0.13–0.19], χ²(14) = 247.23. CFI and NNFI met the adequacy criteria, but RMSEA exceeded the 0.08 cutoff, leaving room for improvement.

Unidimensional model with correlated errors (Items 1 and 4): RMSEA = 0.069 [95% CI: 0.059–0.081], CFI = 0.98, NNFI = 0.97, ECVI = 0.087 [95% CI: 0.069–0.110], χ²(13) = 123.18. Fit improved substantially, with RMSEA now below the 0.08 cutoff.

Chi-square difference: χ²diff = 124.05 (df = 1, p < 0.001), confirming statistical superiority of the correlated-error model.
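The chi-square difference reported above can be reproduced from the two model statistics. The sketch below uses only the standard library and relies on the fact that a chi-square variate with one degree of freedom is a squared standard normal, so its upper-tail probability is erfc(√(x/2)); the helper name `chi2_sf_df1` is hypothetical, and the closed form is valid for df = 1 only.

```python
import math


def chi2_sf_df1(x: float) -> float:
    """Upper-tail p-value of a chi-square variate with 1 degree of freedom.

    For df = 1 the variate is a squared standard normal, so
    P(X > x) = erfc(sqrt(x / 2)). Not valid for other df.
    """
    if x < 0:
        raise ValueError("chi-square statistic must be non-negative")
    return math.erfc(math.sqrt(x / 2.0))


# Values reported for the combined sample.
chi2_simple, df_simple = 247.23, 14            # simple unidimensional model
chi2_correlated, df_correlated = 123.18, 13    # Items 1 and 4 errors correlated

delta_chi2 = chi2_simple - chi2_correlated
delta_df = df_simple - df_correlated
if delta_df != 1:
    raise ValueError("the closed form used here only holds for df = 1")

p_value = chi2_sf_df1(delta_chi2)
print(f"delta chi2 = {delta_chi2:.2f}, df = {delta_df}, p = {p_value:.3g}")
```

For differences with more than one degree of freedom, a general chi-square survival function from a statistics library would be needed instead.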

Standardized factor loadings: All items showed satisfactory loadings, ranging from 0.75 to 0.85: Item 1 (λ = 0.75), Item 2 (λ = 0.75), Item 3 (λ = 0.78), Item 4 (λ = 0.83), Item 5 (λ = 0.81), Item 6 (λ = 0.80), Item 7 (λ = 0.85). The variance of the experiential avoidance latent factor was fixed to 1.00 for identification.

Measurement Invariance

Invariance across clinical and nonclinical samples (Samples 1–3):

Baseline multiple-group model: χ²(39) = 161.59, RMSEA = 0.073, CFI = 0.98, NNFI = 0.97.

Constrained-parameters model: χ²(53) = 184.88, Δχ²(14) = 23.29 (p > 0.05), ΔRMSEA = 0.008, ΔCFI = 0.00, ΔNNFI = 0.01.

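Conclusion: All Chen (2007) criteria were met and the chi-square difference was nonsignificant, supporting invariance of the constrained parameters across the three samples.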

Invariance between genders (Samples 1, 2, and 3 combined):

Baseline multiple-group model: χ²(26) = 134.04, RMSEA = 0.069, CFI = 0.99, NNFI = 0.98.

Constrained-parameters model: χ²(33) = 140.23, Δχ²(7) = 6.19 (p > 0.05), ΔRMSEA = 0.008, ΔCFI = 0.00, ΔNNFI = 0.00.

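Conclusion: All Chen (2007) criteria were met and the chi-square difference was nonsignificant, supporting invariance of the constrained parameters between men and women.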
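The decision rule applied in both invariance tests can be sketched as follows. The cutoffs are the ones stated in the Method section; the class and function names are hypothetical, and the sign convention (each Δ computed as constrained minus baseline for CFI and NNFI, with ΔRMSEA taken as reported) is an assumption, since the summary does not spell it out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FitChange:
    """Change in fit indices from the baseline to the constrained model."""
    delta_rmsea: float
    delta_cfi: float
    delta_nnfi: float


def meets_chen_criteria(change: FitChange) -> bool:
    """Apply the cutoffs as stated in the Method section."""
    return (
        change.delta_rmsea < 0.01
        and change.delta_cfi > -0.01
        and change.delta_nnfi > -0.01
    )


# Changes reported in the Results.
across_samples = FitChange(delta_rmsea=0.008, delta_cfi=0.00, delta_nnfi=0.01)
between_genders = FitChange(delta_rmsea=0.008, delta_cfi=0.00, delta_nnfi=0.00)

# The chi-square differences follow directly from the reported model statistics.
delta_chi2_samples = 184.88 - 161.59   # df = 53 - 39 = 14
delta_chi2_genders = 140.23 - 134.04   # df = 33 - 26 = 7

for label, change in [("samples", across_samples), ("genders", between_genders)]:
    print(label, "invariance supported:", meets_chen_criteria(change))
```

Keeping the three index changes in one record makes it harder to apply a cutoff to the wrong index when several comparisons are run.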

Internal Consistency

Sample 1: α = 0.88 [95% CI: 0.87–0.89]. Mean score M = 19.99 (SD = 8.37). Corrected item-total correlations: range 0.63–0.69.

Sample 2: α = 0.91 [95% CI: 0.90–0.92]. Mean score M = 22.86 (SD = 9.51). Corrected item-total correlations: range 0.67–0.78.

Sample 3: α = 0.90 [95% CI: 0.88–0.92]. Mean score M = 29.67 (SD = 10.27). Corrected item-total correlations: range 0.66–0.77.

Overall sample (N = 1,759): α = 0.91 [95% CI: 0.90–0.92]. Mean score M = 22.69 (SD = 9.74).

Cronbach's alpha coefficients ranged from 0.88 to 0.91 across samples, indicating strong internal consistency.
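For readers who want to see how these two statistics are obtained from raw responses, here is a minimal standard-library sketch of Cronbach's alpha and corrected item-total correlations. The study computed them in SPSS 19; the function names and the small data matrix below are hypothetical and only illustrate the formulas (confidence intervals are not computed).

```python
import math
from statistics import variance
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items matrix of ratings."""
    k = len(rows[0])                         # number of items
    items = list(zip(*rows))                 # transpose: items-by-respondents
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1.0 - item_variance_sum / total_variance)


def corrected_item_total(rows: Sequence[Sequence[float]]) -> List[float]:
    """Correlate each item with the sum of the remaining items."""
    items = list(zip(*rows))
    totals = [sum(row) for row in rows]
    return [
        pearson_r(col, [total - x for total, x in zip(totals, col)])
        for col in items
    ]


# Hypothetical ratings from five respondents on seven items.
ratings = [
    [2, 3, 2, 3, 2, 2, 3],
    [5, 4, 5, 5, 6, 5, 5],
    [1, 2, 1, 1, 2, 1, 2],
    [4, 4, 3, 4, 4, 5, 4],
    [6, 7, 6, 6, 7, 6, 7],
]
print(round(cronbach_alpha(ratings), 3))
print([round(r, 2) for r in corrected_item_total(ratings)])
```

The correction in the item-total correlation removes each item from the total before correlating, so an item is not credited for correlating with itself.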

Gender Differences

Sample 1: Men (M = 18.92, SD = 8.04) < Women (M = 20.64, SD = 8.50), t(760) = −2.76, p = 0.006, significant difference.

Sample 2: Men versus Women, t(722) = 1.22, p = 0.22, no significant difference.

Sample 3: Men versus Women, t(275) = −1.52, p = 0.13, no significant difference.

Discriminant Validity

Sample 1: Participants with GHQ-12 ≥ 12 (M = 24.88, SD = 8.25) scored significantly higher on AAQ-II than those with GHQ-12 < 12 (M = 16.83, SD = 6.79), t(760) = 14.00, p < 0.001. Mean difference = 8.05 points.

Between-sample comparison: Sample 3 (clinical) scored significantly higher than Sample 1, t(1037) = −12.33, p < 0.001, and Sample 2, t(999) = −9.86, p < 0.001, demonstrating that the instrument adequately discriminates between clinical and nonclinical populations.

Convergent Validity

Pearson correlations were calculated (all p < 0.001) between AAQ-II and related constructs:

DASS-21: Positive correlations with the Depression, Anxiety, and Stress subscales (r = 0.49–0.73 across subscales).
GHQ-12: Positive correlation r = 0.55–0.60 with general psychological distress.
DAS-R: Positive correlation r = 0.42 with dysfunctional attitudes.
SWLS: Negative correlations r = −0.42 to −0.57 with life satisfaction.
MAAS: Negative correlation r = −0.31 with present-moment awareness.

All correlation patterns were consistent with theoretical hypotheses about experiential avoidance.

Sensitivity to Change

Sample 4 (N = 11): Pre-post assessment of single-session ACT intervention.

Baseline: M = 29.09 (SD = 6.14)
6-week post-intervention: M = 18.82 (SD = 6.57)
Difference: Δ M = 10.27 points
Paired t-test: t(10) = 7.13, p < 0.001
Effect size: Cohen's d = 2.16 (very large effect)

This result indicates that the AAQ-II is sensitive to changes induced by the ACT intervention.
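The reported figures are mutually consistent, as a short check shows. The summary does not say which variant of Cohen's d was used; the sketch below assumes the paired-design version d_z = t / √n, which lands close to the reported value. The function name is hypothetical.

```python
import math


def cohens_dz_from_t(t_statistic: float, n_pairs: int) -> float:
    """Cohen's d for paired data (d_z), recovered from the paired t statistic.

    Since t = mean_diff / (sd_diff / sqrt(n)), it follows that
    d_z = mean_diff / sd_diff = t / sqrt(n).
    """
    if n_pairs < 2:
        raise ValueError("need at least two pairs")
    return t_statistic / math.sqrt(n_pairs)


# Values reported for Sample 4.
n_pairs = 11
t_paired = 7.13
mean_pre, mean_post = 29.09, 18.82

mean_change = mean_pre - mean_post               # reduction in AAQ-II score
d_z = cohens_dz_from_t(t_paired, n_pairs)
implied_sd_of_change = mean_change / d_z         # SD of difference scores

print(f"mean change = {mean_change:.2f}")
print(f"d_z = {d_z:.2f}")
print(f"implied SD of change scores = {implied_sd_of_change:.2f}")
```

The small gap between this d_z and the reported d = 2.16 is within what rounding of the published t statistic would produce, though a different formula cannot be ruled out.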

Discussion

This study confirmed that the Spanish AAQ-II has good psychometric properties in the Colombian population, replicating findings from the original English-language validation (Bond et al., 2011) and the previous Spanish validation (Ruiz et al., 2013).

The unidimensional structure was confirmed, with the specification of correlated error terms between Items 1 and 4 providing superior fit. This residual correlation suggests some degree of item dependence beyond the latent construct, possibly reflecting semantic overlap or a minor unmeasured dimension. Regardless, the model showed adequate to good fit on all primary indices (RMSEA = 0.069, CFI = 0.98, NNFI = 0.97).

Internal consistency was strong (overall α = 0.91), exceeding conventional thresholds for good reliability (α > 0.80). Values were stable across clinical and nonclinical samples (α = 0.88–0.91), suggesting robust reliability.

Measurement invariance was supported across two important dimensions: (a) between clinical and nonclinical samples, and (b) between genders. These findings have important implications for comparative research and clinical applications, because they permit score differences between these groups to be interpreted as true differences in the underlying construct rather than measurement artifacts.

Discriminant validity was supported, with the AAQ-II significantly differentiating between individuals with elevated psychological distress (GHQ-12 ≥ 12) and those without such distress, as well as between clinical and nonclinical samples. The large differences between Sample 3 (clinical) and Samples 1–2 (nonclinical) support the scale's ability to distinguish clinical from nonclinical groups.

Convergent validity was consistent with theory, with positive moderate to strong correlations with psychopathology measures (DASS-21, GHQ-12, DAS-R) and negative correlations with well-being and present-moment awareness (SWLS, MAAS). These patterns confirm that experiential avoidance appropriately covaries with related but distinct constructs.

Sensitivity to change was supported by a very large effect size (d = 2.16) following the ACT intervention. Although the sample was small (N = 11), the reduction in scores was substantial. This is consistent with the view that psychological inflexibility, as measured by the AAQ-II, is a transdiagnostic process of change targeted by ACT.

Limitations

The authors acknowledged several methodological limitations: (1) Systematic diagnostic information was not collected for Sample 3, limiting conclusions about performance in specific disorders. (2) The instruments used for convergent/divergent validity (DASS-21, SWLS, MAAS) had not been formally validated in the Colombian population, potentially affecting interpretation of the correlational patterns. (3) Samples were predominantly young and educated, particularly Sample 1 (university students), limiting generalization to more diverse and less educated populations. (4) Sample 4 was too small to support firm conclusions regarding sensitivity to change, although the effects were large.

Conclusions and Implications

The AAQ-II proved to be a reliable and valid instrument for measuring experiential avoidance and psychological inflexibility in the Colombian context. Its unidimensional structure, strong internal consistency, measurement invariance, discriminant and convergent validity, and sensitivity to change make it an appropriate tool for research and clinical practice. The measurement invariance findings are particularly relevant for studies comparing clinical with nonclinical groups and men with women.

Future studies are recommended with specific diagnostic samples, formal validation of related measures in Colombia, and evaluations in demographically more diverse populations.
