Development and initial validation of the Generalized Tracking Questionnaire

Context and objectives

Tracking is a form of rule-governed behavior in which conduct is regulated by the correspondence between a verbal rule and natural contingencies. Generalized tracking refers to the general tendency of an individual to follow rules based on their correspondence with direct experience. This concept emerges from Relational Frame Theory (RFT) and Contextual Behavioral Science (CBS), disciplines that emphasize how human behavior is regulated by sensitivity to direct contingencies versus control by verbal rules.

The theoretical importance of generalized tracking lies in its representation of adaptive cognitive and behavioral flexibility. Individuals with greater tracking capacity can adjust their behavior when rules correspond with their actual experiences, facilitating adaptation to environmental changes. Conversely, clinical populations characterized by anxiety disorders, depression, and obsessive-compulsive spectrum disorders frequently exhibit deficits in this capacity, rigidly adhering to verbal rules even when these do not correspond with natural contingencies.

Despite the theoretical relevance of generalized tracking, the literature lacked a psychometrically validated self-report instrument for its assessment. This methodological gap limited empirical testing of hypotheses derived from RFT/CBS regarding the role of tracking in mental health processes. Additionally, the relationship between tracking and executive functions (the cognitive processes that enable monitoring of rule-behavior correspondence) had not been systematically examined.

The overarching objective of this investigation was to develop and validate the Generalized Tracking Questionnaire (GTQ) across three independent studies that systematically examined factor structure, reliability, measurement invariance, convergent validity, and criterion validity in relation to neuropsychological measures of executive function. The purpose was to provide researchers with a reliable and valid measurement tool for transcultural assessment of generalized tracking.

Method

Participants

The investigation encompassed three independent samples totaling 1,155 Colombian participants:

Study 1 (Exploratory Factor Analysis): 460 non-clinical Colombian university undergraduates aged 18–25 years. This sample was used for initial instrument development and factor extraction.

Study 2 (Confirmatory Factor Analysis and Measurement Invariance): 589 adult participants divided into two subgroups: 464 non-clinical adults (general Colombian population) and 125 clinical patients (with various psychiatric diagnoses). This sample enabled confirmation of the factor structure found in Study 1 and examination of measurement invariance across clinical and non-clinical samples, as well as across gender.

Study 3 (Criterion Validity with Neuropsychology): 105 Colombian university undergraduates from non-clinical populations. This sample was used to establish GTQ criterion validity through correlation with a standardized neuropsychological battery of executive functions.

All samples were recruited in Colombia, enabling validation of the instrument in a specific Spanish-speaking population but limiting transcultural generalization.

Design

The investigation employed a sequential three-study cross-sectional design:

Study 1 utilized exploratory factor analysis (EFA) to identify the latent structure underlying preliminary instrument items within a non-clinical population. This approach enabled reduction of the initial item set to a manageable number while preserving explained variance.

Study 2 employed confirmatory factor analysis (CFA) with multiple groups to confirm the unidimensional structure identified in Study 1 and examine scalar measurement invariance across gender (male/female) and sample type (non-clinical versus clinical). This multi-group design is particularly important for ensuring the measure functions equitably across different populations.

Study 3 utilized a correlational design to establish criterion validity through simultaneous administration of the GTQ and a neuropsychological battery of executive functions.

Instruments

Generalized Tracking Questionnaire (GTQ): A self-report instrument developed in this investigation comprising 11 items measured on a 7-point Likert scale (1 = Strongly disagree; 7 = Strongly agree). The instrument evaluates the generalized tendency of an individual to follow and adjust behavior based on correspondence between verbal rules and direct contingent experiences.

For convergent validity assessment (Study 2), the following instruments were administered:

Generalized Pliance Questionnaire-9 (GPQ-9): measures generalized pliance (rule-following without correspondence with contingencies)
Acceptance and Action Questionnaire-II (AAQ-II): measures experiential avoidance
Cognitive Fusion Questionnaire (CFQ): measures cognitive fusion
Valuing Questionnaire (VQ): measures valued living (Progress and Obstruction subscales)
Perseverative Thinking Questionnaire (PTQ): measures repetitive negative thinking
Depression, Anxiety, and Stress Scales (DASS-21): general psychopathology
Satisfaction with Life Scale (SWLS): life satisfaction
General Self-Efficacy Scale (GSES): generalized self-efficacy

For criterion validity assessment (Study 3):

Neuropsychological Battery of Executive Functions and Frontal Lobes (BANFE-2): evaluated multiple executive domains including working memory, verbal fluency, planning, and response inhibition.

Analysis

Analyses proceeded sequentially:

For Study 1, EFA was conducted using principal axis method with oblique rotation (oblimin) to allow factor correlation. Multiple factor retention criteria were examined (scree plot, eigenvalues > 1, variance explained). Items were evaluated for clarity and theoretical relevance.

For Study 2, CFA was conducted with diagonally weighted least squares (DWLS) estimation, which is appropriate for ordinal data. Model fit was evaluated with the Comparative Fit Index (CFI) and Tucker-Lewis Index (TLI), with values > .95 taken as good fit, and the Root Mean Square Error of Approximation (RMSEA), with values < .08 taken as acceptable. Measurement invariance was tested by comparing a sequence of nested models: an unconstrained baseline (configural) model, a model with equal factor loadings (metric invariance), and a model with equal loadings and intercepts (scalar invariance). Equivalence at each step was evaluated with the χ² difference and the change in CFI, with ΔCFI < .01 indicating that the added constraints were tenable.
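The ΔCFI rule described above can be sketched as a small decision function. This is an illustrative sketch, not the authors' analysis code: the function name `invariance_level`, its parameters, and the CFI values in the usage line are hypothetical, and only the ΔCFI < .01 cutoff comes from the text.

```python
def invariance_level(cfi_configural, cfi_metric, cfi_scalar, max_drop=0.01):
    """Return the highest invariance level supported by the delta-CFI rule.

    Each more constrained model is accepted only if CFI drops by less
    than `max_drop` relative to the previous, less constrained model.
    """
    if cfi_configural - cfi_metric >= max_drop:
        return "configural"  # equal loadings not tenable
    if cfi_metric - cfi_scalar >= max_drop:
        return "metric"      # equal intercepts not tenable
    return "scalar"


# Hypothetical CFI values, for illustration only (not taken from the study).
print(invariance_level(0.962, 0.960, 0.957))
```

With these made-up values both drops are well below .01, so the function reports scalar invariance. In practice the χ² difference would be inspected alongside ΔCFI, as the text notes.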

Reliability was assessed with Cronbach's alpha (α) and McDonald's omega (ω). Convergent validity was examined with Pearson correlations, and partial correlations controlling for social desirability were computed to check that the associations were not an artifact of socially desirable responding.
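For readers unfamiliar with the reliability coefficient, the following minimal sketch computes Cronbach's alpha from a respondents-by-items table using the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores). The function name `cronbach_alpha` and the toy responses (five respondents, three items) are hypothetical and are not GTQ data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    items = list(zip(*rows))                       # transpose to item columns
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])   # variance of summed scores
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Toy responses for illustration only (not data from the study).
toy = [[5, 6, 5], [3, 3, 4], [6, 7, 6], [2, 3, 2], [4, 4, 5]]
print(round(cronbach_alpha(toy), 2))  # 0.96
```

McDonald's omega additionally requires factor loadings from a fitted factor model, so it is not covered by this sketch.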

For Study 3, bivariate correlations between GTQ and BANFE-2 measures were examined, subsequently controlling for age and education.
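Controlling for covariates such as age and education amounts to computing a partial correlation. The sketch below uses the standard recursive formula for partial correlations. The function names and the toy vectors are hypothetical, chosen so that the zero-order correlation is driven entirely by the control variable; the study's own software and data are not reproduced here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_r(x, y, controls):
    """Partial correlation of x and y given a list of control variables."""
    if not controls:
        return pearson(x, y)
    z, rest = controls[0], controls[1:]
    r_xy = partial_r(x, y, rest)
    r_xz = partial_r(x, z, rest)
    r_yz = partial_r(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data: x and y share only the control variable z (illustration only).
z = [1, 1, 1, 1, -1, -1, -1, -1]
x = [zi + e for zi, e in zip(z, [1, 1, -1, -1, 1, 1, -1, -1])]
y = [zi + e for zi, e in zip(z, [1, -1, 1, -1, 1, -1, 1, -1])]
print(pearson(x, y))         # zero-order r is 0.5 by construction
print(partial_r(x, y, [z]))  # approximately 0 once z is controlled
```

In the Study 3 setting, a call of the form `partial_r(gtq, working_memory, [age, education])` would give the adjusted correlation, where those variable names are placeholders for the corresponding score vectors.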

Results

Study 1: Factor structure and initial reliability

EFA with initial items yielded a clear unidimensional structure. The final factorial solution included 11 items with factor loadings ranging from .52 to .78, indicating moderate to strong saturation on the latent factor. The extracted factor explained 48% of common variance, an appropriate level for a complex psychological variable.
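As a rough consistency check on these figures: if the variance explained by a single factor is taken as the mean of the squared standardized loadings (an assumption about how the 48% figure was computed, not something the text states), then it must lie between the squares of the smallest and largest loadings. The helper name below is hypothetical.

```python
def variance_explained_bounds(min_loading, max_loading):
    """Bounds on the mean squared loading of a one-factor solution.

    Assumes variance explained = mean of squared standardized loadings.
    """
    return min_loading ** 2, max_loading ** 2


lower, upper = variance_explained_bounds(0.52, 0.78)
print(f"{lower:.2f} to {upper:.2f}")  # 0.27 to 0.61
```

Under that assumption, the reported 48% falls inside the admissible range, so the loadings and the variance figure are at least mutually compatible.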

Reliability was good: Cronbach's alpha = .85 and McDonald's omega = .85, both exceeding the .80 criterion considered satisfactory for research.

Study 2: Structural confirmation, invariance, and convergent validity

CFA confirmed the unidimensional structure with excellent fit: CFI = .96, TLI = .96, RMSEA = .06, 90% CI [.05, .07]. These indices indicate the theoretical model fits the observed data very well.

Measurement invariance: The GTQ demonstrated scalar invariance across both gender and sample type (clinical vs. non-clinical), indicating that the construct is measured equivalently across these groups. This is a prerequisite for comparing GTQ scores between these groups; invariance across cultures or languages was not tested.

Reliability in diverse samples:

Non-clinical sample: α = .89, ω = .89
Clinical sample: α = .90, ω = .90

Convergent validity: GTQ showed theoretically coherent correlation patterns:

Significant negative correlations with: GPQ-9 (pliance; r ≈ -.40), AAQ-II (experiential avoidance; r ≈ -.35), CFQ (cognitive fusion; r ≈ -.32), VQ-Obstruction (r ≈ -.38), PTQ (r ≈ -.50), DASS-21 (r ≈ -.45), indicating that greater generalized tracking is associated with lower psychopathology and greater psychological flexibility.
Significant positive correlations with: VQ-Progress (r ≈ .42), SWLS (r ≈ .45), GSES (r ≈ .38), suggesting that tracking relates to wellbeing and adaptive functioning.
Partial correlations controlling for social desirability remained significant, which makes it less likely that these associations reflect socially desirable responding.

Clinical differences: The clinical sample scored significantly lower on GTQ compared to the non-clinical sample (mean difference ~ .80 points on 1–7 scale), consistent with the theory that psychopathology entails reduced sensitivity to direct contingencies.

Study 3: Criterion validity with executive functions

GTQ demonstrated significant correlations of small-to-medium magnitude with neuropsychological measures of executive functions from BANFE-2:

Working memory: r = .31
Verbal fluency: r = .28
Planning: r = .25
Response inhibition (inhibitory control): r = .30

These correlations of .25–.31 fall in the small-to-medium range under Cohen's conventions (r = .10 small, .30 medium, .50 large). When age and education were controlled, the correlations remained significant with minimal change, suggesting that the relationship is not explained by these demographic variables.
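The significance of correlations of this size can be sanity-checked with an approximate confidence interval based on Fisher's z transformation. The sketch below assumes that all 105 Study 3 participants contributed to each correlation, which the text does not state explicitly; the function name `fisher_ci` is hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for a Pearson r via Fisher's z."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    z = atanh(r)
    se = 1 / sqrt(n - 3)  # standard error of z
    return tanh(z - z_crit * se), tanh(z + z_crit * se)


# Assumption: n = 105 for every correlation reported for Study 3.
reported = [("Working memory", 0.31), ("Verbal fluency", 0.28),
            ("Planning", 0.25), ("Response inhibition", 0.30)]
for label, r in reported:
    low, high = fisher_ci(r, 105)
    print(f"{label}: r = {r:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

Under that assumption, even the smallest correlation (r = .25) has an interval of roughly .06 to .42, which excludes zero but is wide enough that the true effect could be small.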

The moderate magnitude of these correlations is theoretically appropriate: tracking should relate to executive functions but not be identical, as they represent distinct processes.

Discussion and conclusions

This investigation constituted the first systematic effort to develop and validate a self-report instrument to measure generalized tracking from the RFT/CBS theoretical perspective. Tracking, as understood within contextual behavioral science, represents an important psychological capacity that enables individuals to flexibly adjust behavior when rules match direct experience—a skill central to mental health and adaptive functioning.

Results demonstrated that the GTQ possesses solid psychometric properties across three studies with 1,155 total participants. The unidimensional structure obtained via EFA was confirmed via CFA, showing that the construct of "generalized tracking" functions as a single underlying dimension. Scalar measurement invariance across gender and sample type (clinical/non-clinical) is particularly important, enabling interpretable comparisons between these groups.

The pattern of convergent validity was highly theoretically coherent. Negative correlations with measures of rigidity (pliance, cognitive fusion, experiential avoidance) and psychopathology are consistent with the hypothesis that tracking represents behavioral flexibility and sensitivity to direct contingencies. Positive correlations with life satisfaction and self-efficacy suggest clinical relevance of the construct for wellbeing.

Criterion validity established through correlation with executive functions provides evidence supporting the proposed theoretical link between tracking and cognitive flexibility. Working memory and inhibitory control are critical functions for monitoring correspondence between rules and experiences, which may explain the observed associations. The significant correlation with planning is consistent with the idea that tracking draws on the cognitive operations necessary for adaptive behavioral organization, although the correlational design cannot establish the direction of this relationship.

The finding that clinical populations show significantly lower GTQ scores supports RFT/CBS predictions that psychopathology involves deficits in sensitivity to direct contingencies. This opens research lines examining whether training to increase tracking might have therapeutic applications, particularly for conditions marked by rigid rule-following such as obsessive-compulsive and anxiety disorders.

Study limitations include: (1) cross-sectional design preventing causal inference; (2) exclusively Colombian samples, limiting transcultural generalization though enabling validation in Spanish-speaking populations; (3) self-report nature potentially subject to social desirability bias, though partial correlations provided some control; (4) criterion validity based on cognitive measures from a single domain (executive functions), leaving open questions about relationships with other cognitive and behavioral domains.

Despite these limitations, the GTQ represents a significant contribution to the methodological arsenal available to researchers within the RFT/CBS framework. Its development following rigorous international psychometric standards, its demonstration of equivalence across clinical and non-clinical populations, and its coherent pattern of convergent validity establish it as a reliable and valid instrument.

Importance and contribution

The primary contribution is provision of the first validated instrument to measure generalized tracking. This facilitates empirical research of theoretical hypotheses about how this capacity relates to mental health, cognitive flexibility, and life functioning. The instrument can be employed in basic research examining mechanisms of change in contextual therapies (ACT, FAP), and potentially in clinical assessment to identify deficits in sensitivity to contingencies that might be intervention targets.

Methodologically, this investigation models a rigorous approach to instrument development and validation: initial extraction via EFA, confirmation via CFA, examination of measurement invariance, convergent validity with multiple criteria, and criterion validity with external measures. This sequential process is exemplary and may serve as a template for future psychometric investigations.

Potential applications include: research on therapy change mechanisms, assessment of cognitive-behavioral profiles in specific disorders, cross-cultural research on variations in behavioral flexibility, and design of personalized interventions based on individual levels of tracking. Clinically, the GTQ could be incorporated into comprehensive assessments of individuals with anxiety, depression, or obsessive-compulsive disorders to evaluate degree of stimulus control by direct contingencies versus verbal rule-regulation.

Document created: 2026-03-27
Format: Bilingual Academic Summary (Spanish/English)
Total length: ~1,500 words per language
Article type: Psychometric validation study