Research Methodology

    What is Factor Analysis in Research? A Complete Guide

    Factor analysis is a statistical technique used to identify underlying latent factors from a set of observed variables. This complete guide covers exploratory factor analysis (EFA), confirmatory factor analysis (CFA), how to run them in SPSS, and how to interpret results in your thesis.

    Shruti Sharma
    30 May 2026 · 12 min read
    Thesis Ace Writers

    Meet the Expert

    Shruti Sharma

    Academic Writing Coach & Research Communication Specialist

    • Guided 300+ PhD scholars through EFA, CFA, SEM, and scale development
    • Expertise in SPSS, AMOS, and SmartPLS for factor analysis and structural modelling
    • Specialises in writing measurement and methodology chapters for management and social science theses

    Factor analysis is a multivariate statistical method that identifies a smaller set of underlying latent factors from a larger set of observed variables. It is a standard tool for scale development, construct validation, and data reduction in quantitative research, and is used extensively in management, psychology, education, and marketing studies.

    If you are developing or validating a questionnaire for your thesis, or building a structural equation model, factor analysis is almost certainly part of your analytical toolkit. Understanding the difference between exploratory and confirmatory factor analysis — and when to use each — is crucial for a credible quantitative study.

    What Factor Analysis Does

    Imagine you have 20 questionnaire items measuring employee well-being. Factor analysis might reveal that these 20 items actually cluster into 4 underlying factors: physical health, psychological safety, work-life balance, and social support. These 4 factors (also called latent variables or constructs) explain the patterns of correlation among the 20 observed items.
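    The clustering idea can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses 6 items and 2 factors rather than 20 items and 4 factors, the loading values are hypothetical, and NumPy is assumed to be available. Items that share a latent factor end up correlating with each other, while items from different factors do not.

```python
import numpy as np

rng = np.random.default_rng(42)   # fixed seed so the sketch is reproducible
n = 2000                          # hypothetical number of respondents

# Two uncorrelated latent factors (never observed directly in real data).
factors = rng.standard_normal((n, 2))

# Hypothetical loading pattern: items 0-2 load on factor 0, items 3-5 on factor 1.
loadings = np.array([
    [0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
    [0.0, 0.8], [0.0, 0.7], [0.0, 0.6],
])

# Observed item = loading * factor + unique noise, scaled so item variance is about 1.
unique_sd = np.sqrt(1.0 - (loadings ** 2).sum(axis=1))
items = factors @ loadings.T + rng.standard_normal((n, 6)) * unique_sd

corr = np.corrcoef(items, rowvar=False)
within = corr[:3, :3][np.triu_indices(3, k=1)]   # correlations among items 0-2
between = corr[:3, 3:].ravel()                   # correlations across the two blocks

print("mean within-block r :", round(float(within.mean()), 2))
print("mean between-block r:", round(float(between.mean()), 2))
```

    Factor analysis works backwards from a correlation matrix like `corr` to an estimate of the loading pattern that produced it.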

    EFA vs CFA — Key Differences

    • EFA purpose: explore the factor structure. Used when the factor structure is unknown; lets the data determine the groupings.
    • CFA purpose: confirm the factor structure. Tests a pre-specified model against the data; theory-driven.
    • EFA prior theory: not required. No prior model; suitable for early-stage scale development.
    • CFA prior theory: required. Requires a theoretical model specifying which items load on which factor.
    • EFA software: SPSS. Run via Analyze → Dimension Reduction → Factor in SPSS.
    • CFA software: AMOS / R / SmartPLS. Requires SEM software; output includes model fit indices (CFI, RMSEA).

    Prerequisites: KMO and Bartlett's Test

    Before running EFA, check whether your data is suitable for factor analysis:

    Test | What It Checks | Acceptable Result
    KMO Measure of Sampling Adequacy | Whether inter-item correlations are suitable for factoring | KMO ≥ 0.70 (0.60 minimum)
    Bartlett's Test of Sphericity | Whether the correlation matrix differs from an identity matrix (i.e., items are correlated) | p < 0.05 (significant)
    Communalities | Proportion of each item's variance explained by extracted factors | ≥ 0.40 for each item
    Sample size | Adequate N for a stable factor solution | Minimum 5–10 participants per item; N ≥ 200 preferred
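
    For readers who want to see what SPSS computes behind these two checks, the following is a minimal sketch using the standard formulas for the overall KMO and Bartlett's chi-square. It assumes NumPy is available; the function name `kmo_and_bartlett` and the simulated data are hypothetical. It returns the test statistic and degrees of freedom only; the p-value would come from a chi-square distribution (for example `scipy.stats.chi2.sf`, if SciPy is installed).

```python
import numpy as np

def kmo_and_bartlett(data):
    """Return (overall KMO, Bartlett chi-square, degrees of freedom)."""
    data = np.asarray(data, dtype=float)
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)

    # Partial correlations are derived from the inverse correlation matrix.
    inv = np.linalg.inv(corr)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale

    off = ~np.eye(p, dtype=bool)              # mask selecting off-diagonal cells
    r2 = (corr[off] ** 2).sum()               # squared correlations
    q2 = (partial[off] ** 2).sum()            # squared partial correlations
    kmo = r2 / (r2 + q2)

    # Bartlett: H0 says the correlation matrix is an identity matrix.
    _, logdet = np.linalg.slogdet(corr)
    chi_square = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    df = p * (p - 1) // 2
    return kmo, chi_square, df

# Hypothetical data: 300 respondents, 6 items driven by one common factor.
rng = np.random.default_rng(0)
factor = rng.standard_normal((300, 1))
demo = 0.7 * factor + np.sqrt(1 - 0.49) * rng.standard_normal((300, 6))

kmo, chi_square, df = kmo_and_bartlett(demo)
print(f"KMO = {kmo:.2f}, Bartlett chi-square = {chi_square:.1f}, df = {df}")
```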

    How to Run EFA in SPSS (Step-by-Step)

    1. Go to Analyze → Dimension Reduction → Factor
    2. Move all items into the Variables box
    3. Click Descriptives → tick KMO and Bartlett's test of sphericity and Coefficients
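    4. Click Extraction → Method: Principal axis factoring (Principal components is also offered, but it is a data reduction method rather than a latent variable model); Extract: Based on Eigenvalue, eigenvalues greater than 1; tick Scree plot
    5. Click Rotation → Method: Direct Oblimin (if factors may correlate) or Varimax (if factors are assumed orthogonal)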
    6. Click Options → tick Suppress small coefficients (absolute value below 0.30 or 0.40)
    7. Click OK
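
    To make the extraction step less of a black box, here is a minimal sketch of iterated principal axis factoring on a correlation matrix. It is not SPSS's implementation and it omits rotation; the function name and the one-factor example loadings are hypothetical, and NumPy is assumed.

```python
import numpy as np

def principal_axis_factoring(corr, n_factors, max_iter=200, tol=1e-6):
    """Iterated principal axis factoring; returns (loadings, communalities)."""
    corr = np.asarray(corr, dtype=float)
    # Starting communalities: squared multiple correlation of each item.
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(corr))
    for _ in range(max_iter):
        reduced = corr.copy()
        np.fill_diagonal(reduced, h2)            # communalities replace the 1s
        eigvals, eigvecs = np.linalg.eigh(reduced)
        order = np.argsort(eigvals)[::-1][:n_factors]
        vals = np.clip(eigvals[order], 0.0, None)
        loadings = eigvecs[:, order] * np.sqrt(vals)
        new_h2 = (loadings ** 2).sum(axis=1)
        converged = np.max(np.abs(new_h2 - h2)) < tol
        h2 = new_h2
        if converged:
            break
    return loadings, h2

# Hypothetical one-factor population: the correlation between two items
# is the product of their loadings.
true_loadings = np.array([0.8, 0.7, 0.6, 0.5, 0.4])
pop_corr = np.outer(true_loadings, true_loadings)
np.fill_diagonal(pop_corr, 1.0)

est_loadings, communalities = principal_axis_factoring(pop_corr, n_factors=1)
print(np.round(np.abs(est_loadings[:, 0]), 3))   # sign of a factor is arbitrary
```

    Unlike principal components, the diagonal of the matrix being decomposed holds communalities rather than 1s, so only shared variance is factored.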

    Interpreting EFA Output

    SPSS Output Table | What to Look For
    KMO and Bartlett's Test | KMO ≥ 0.70; Bartlett's p < 0.05
    Total Variance Explained | Cumulative % variance explained by retained factors (aim for ≥ 50–60%)
    Scree Plot | Elbow point indicates number of factors to retain
    Pattern Matrix (oblique rotation) or Rotated Factor Matrix (Varimax) | Factor loadings for each item on each factor (≥ 0.40 or 0.50 preferred)
    Communalities | Extraction values ≥ 0.40 for each item
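
    The arithmetic behind the Total Variance Explained table can be sketched in a few lines. The eigenvalues below are hypothetical values for an 8-item scale, and the sketch uses initial eigenvalues only (SPSS also reports extraction and rotation sums of squared loadings, which are not reproduced here).

```python
def summarize_eigenvalues(eigenvalues):
    """Return rows of (factor, eigenvalue, % variance, cumulative %, kaiser_keep)."""
    total = sum(eigenvalues)      # equals the number of items for a correlation matrix
    rows = []
    cumulative = 0.0
    for number, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        percent = 100.0 * value / total
        cumulative += percent
        rows.append((number, value, percent, cumulative, value > 1.0))
    return rows

# Hypothetical initial eigenvalues for 8 items (they sum to 8).
eigs = [3.2, 1.8, 0.9, 0.7, 0.5, 0.4, 0.3, 0.2]
table = summarize_eigenvalues(eigs)

n_kaiser = sum(1 for row in table if row[4])        # factors with eigenvalue > 1
cumulative_at_kaiser = table[n_kaiser - 1][3]

for number, value, percent, cumulative, keep in table:
    print(f"{number}: eigenvalue={value:.2f}  %var={percent:.1f}  cum%={cumulative:.1f}  keep={keep}")
```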

    Factor Loading Interpretation

    Factor Loading | Interpretation
    ≥ 0.70 | Strong loading: item is an excellent indicator of the factor
    0.50–0.69 | Moderate loading: item is a good indicator
    0.40–0.49 | Acceptable loading: borderline; review conceptually
    0.32–0.39 | Weak loading: consider dropping the item
    < 0.32 | Drop the item from the scale
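
    When screening many items, the bands above can be applied programmatically. A minimal sketch follows; the function name is hypothetical, and the lower bound of each band is used as the cut-off so that values such as 0.695 still fall into a category.

```python
def classify_loading(loading):
    """Map a factor loading to the interpretation bands in the table above."""
    size = abs(loading)           # the sign only shows direction, not strength
    if size >= 0.70:
        return "strong"
    if size >= 0.50:
        return "moderate"
    if size >= 0.40:
        return "acceptable"
    if size >= 0.32:
        return "weak"
    return "drop"

# Hypothetical loadings for five items on one factor.
for value in (0.72, -0.55, 0.45, 0.35, 0.20):
    print(value, "->", classify_loading(value))
```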

    Confirmatory Factor Analysis (CFA) and Model Fit

    CFA is run in AMOS, R (lavaan), or SmartPLS. It tests whether your pre-specified factor structure fits the observed data. Key model fit indices:

    Fit Index | Acceptable Value
    CFI (Comparative Fit Index) | ≥ 0.90 (≥ 0.95 preferred)
    TLI (Tucker-Lewis Index) | ≥ 0.90
    RMSEA (Root Mean Square Error of Approximation) | ≤ 0.08 (≤ 0.06 preferred)
    SRMR (Standardised Root Mean Square Residual) | ≤ 0.08
    Chi-square / df ratio (CMIN/DF) | ≤ 3.0 (≤ 5.0 acceptable)
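
    Several of these indices are simple functions of the model and baseline (independence model) chi-square values, which can help when checking software output by hand. The sketch below uses the common textbook formulas with N − 1 in the RMSEA denominator (some programs use N instead). The input values are hypothetical, and SRMR is left out because it needs the full residual matrix.

```python
import math

def fit_indices(chi2_model, df_model, chi2_baseline, df_baseline, n):
    """Compute CMIN/DF, RMSEA, CFI and TLI from chi-square statistics."""
    excess_model = max(chi2_model - df_model, 0.0)
    excess_base = max(chi2_baseline - df_baseline, excess_model, 0.0)

    rmsea = math.sqrt(excess_model / (df_model * (n - 1)))
    cfi = 1.0 - excess_model / excess_base if excess_base > 0 else 1.0

    ratio_base = chi2_baseline / df_baseline
    ratio_model = chi2_model / df_model
    tli = (ratio_base - ratio_model) / (ratio_base - 1.0)

    return {"cmin_df": ratio_model, "rmsea": rmsea, "cfi": cfi, "tli": tli}

# Hypothetical CFA output: model chi-square 180 on 100 df,
# baseline chi-square 2100 on 120 df, sample size 300.
fit = fit_indices(180.0, 100, 2100.0, 120, 300)
for name, value in fit.items():
    print(f"{name}: {value:.3f}")
```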

    EFA Before CFA: The Recommended Sequence

    In doctoral research, the recommended sequence for scale development is: (1) Conduct EFA on a portion of your sample to explore the factor structure; (2) Conduct CFA on the remaining portion (or a new sample) to validate the structure. If your scale is well-established in the literature, you may proceed directly to CFA without EFA. Always justify your choice in your methodology chapter.
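
    One simple way to create the two portions is a random split of respondent indices. The sketch below is illustrative only: the function name, the 50/50 split, and the seed are assumptions, and a real study should report how the split was made.

```python
import random

def split_sample(n_respondents, efa_fraction=0.5, seed=2024):
    """Randomly assign respondent indices to an EFA subsample and a CFA subsample."""
    indices = list(range(n_respondents))
    random.Random(seed).shuffle(indices)      # seeded so the split is reproducible
    cut = int(round(n_respondents * efa_fraction))
    return sorted(indices[:cut]), sorted(indices[cut:])

# Hypothetical sample of 400 respondents.
efa_rows, cfa_rows = split_sample(400)
print(len(efa_rows), "rows for EFA;", len(cfa_rows), "rows for CFA")
```

    Each subsample still has to satisfy the sample-size guidance on its own, so the split is only sensible when the full sample is large enough.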

    Rotation Methods in EFA: Oblimin vs Varimax

    • Varimax (orthogonal rotation): Assumes factors are uncorrelated. Tends to produce a simpler factor structure that is easier to interpret. Use when factors are theoretically independent.
    • Oblimin / Promax (oblique rotation): Allows factors to correlate. More realistic for social science constructs (e.g., motivation and satisfaction are likely correlated). Generally recommended when factors are expected to be related.
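
    To show what an orthogonal rotation actually does to a loading matrix, here is a minimal sketch of the widely used SVD-based varimax iteration. It is raw varimax without Kaiser normalisation, so results can differ slightly from SPSS output. The function name and the example loadings are hypothetical, and NumPy is assumed.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Raw varimax rotation; returns (rotated loadings, rotation matrix)."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = L @ rotation
        col_ss = (rotated ** 2).sum(axis=0)               # column sums of squares
        target = rotated ** 3 - rotated @ np.diag(col_ss) / p
        u, s, vt = np.linalg.svd(L.T @ target)
        rotation = u @ vt                                 # nearest orthogonal matrix
        new_criterion = s.sum()
        if criterion > 0 and new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return L @ rotation, rotation

# Hypothetical simple structure, deliberately mixed by a 30-degree rotation
# to mimic an unrotated solution.
simple = np.array([[0.8, 0], [0.7, 0], [0.6, 0], [0, 0.8], [0, 0.7], [0, 0.6]])
theta = np.deg2rad(30)
mix = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
unrotated = simple @ mix

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 2))   # columns may be swapped or sign-flipped
```

    The rotation changes how variance is distributed across factors but leaves each item's communality unchanged.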

    Common Mistakes in Factor Analysis

    • Skipping KMO and Bartlett's test: Always confirm data suitability before running factor analysis
    • Using principal components analysis (PCA) when you should use EFA: PCA is a data reduction technique, not a latent variable model — use principal axis factoring for construct validation
    • Deleting items without theoretical justification: Statistical output alone is insufficient — always consider item content and theory
    • Running EFA and CFA on the same sample: This inflates fit; use different subsamples or separate studies
    • Ignoring cross-loadings: Items that load strongly on two or more factors are ambiguous and should be revised or removed
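
    Cross-loadings are easy to miss in a large pattern matrix, so a small screening helper can be useful. The sketch below flags items using the rules of thumb quoted in this guide (primary loading of at least 0.40, secondary loadings below 0.30). The function name and the loadings are hypothetical, and flagged items still need a conceptual review before any deletion.

```python
def find_problem_items(loadings, primary_min=0.40, secondary_max=0.30):
    """Flag items with no salient loading or with a sizeable second loading."""
    flagged = {}
    for item, row in loadings.items():
        sizes = sorted((abs(value) for value in row), reverse=True)
        primary = sizes[0]
        secondary = sizes[1] if len(sizes) > 1 else 0.0
        if primary < primary_min:
            flagged[item] = "no salient loading"
        elif secondary >= secondary_max:
            flagged[item] = "cross-loading"
    return flagged

# Hypothetical two-factor pattern matrix.
pattern = {
    "Q1": [0.72, 0.10],
    "Q2": [0.55, 0.41],
    "Q3": [0.28, 0.22],
    "Q4": [0.05, 0.66],
}
problems = find_problem_items(pattern)
print(problems)
```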

    Frequently Asked Questions

    What is factor analysis in research?

    Factor analysis is a multivariate statistical technique used to identify a smaller number of underlying latent factors (unobserved variables) that explain the patterns of correlation among a larger set of observed variables (items). In research, it is primarily used for: (1) data reduction, i.e. reducing many variables to a few meaningful factors; and (2) scale development and validation, i.e. grouping questionnaire items that measure the same underlying construct. Factor analysis is widely used in psychology, management, education, marketing, and health research.

    What is the difference between EFA and CFA?

    Exploratory Factor Analysis (EFA) is used when you do not have a prior theory about which items belong to which factors; it lets the data reveal the factor structure. Confirmatory Factor Analysis (CFA) is used when you have a pre-specified model (based on theory or a prior EFA) and want to test whether the observed data fit that model. EFA is typically used in scale development; CFA is used in scale validation and in the measurement model step of SEM (Structural Equation Modelling).

    What is an acceptable factor loading?

    Factor loadings represent the strength of the relationship between an observed variable and an underlying factor: a correlation in an orthogonal solution, and a regression-type weight in the pattern matrix of an oblique solution. General guidelines: ≥ 0.70 = strong loading; ≥ 0.50 = moderate loading (a commonly recommended minimum for inclusion); ≥ 0.40 = acceptable but borderline; ≥ 0.32 = weak but may be reported; < 0.32 = item should generally be dropped. In practice, most researchers require factor loadings ≥ 0.40 or ≥ 0.50 and cross-loadings (item loading on more than one factor) below 0.30.

    What are KMO and Bartlett's test?

    The KMO (Kaiser-Meyer-Olkin) measure of sampling adequacy indicates whether the correlation matrix is suitable for factor analysis. KMO values: ≥ 0.90 = Marvelous; ≥ 0.80 = Meritorious; ≥ 0.70 = Middling (acceptable); ≥ 0.60 = Mediocre; ≥ 0.50 = Miserable; < 0.50 = Unacceptable. Bartlett's Test of Sphericity tests whether the correlation matrix is an identity matrix (all variables uncorrelated). It should be statistically significant (p < 0.05) to proceed with factor analysis.

    How many factors should I extract?

    Several criteria guide the number of factors to extract: (1) Kaiser's criterion: retain factors with eigenvalues greater than 1; (2) Scree plot: look for the 'elbow' where the slope flattens; (3) Parallel analysis: compare eigenvalues to those from random data (the most rigorous of these methods); (4) Theoretical interpretability: factors must make conceptual sense. Kaiser's criterion is most commonly used but tends to over-extract; parallel analysis is considered more accurate.
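
    SPSS's Factor dialog does not offer parallel analysis, so it is usually run with separate syntax or in another tool. The sketch below shows the basic idea (Horn's approach on the full correlation matrix, with a 95th-percentile threshold). The function name, number of simulations, and simulated data are assumptions for illustration, and NumPy is assumed.

```python
import numpy as np

def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Return (number of factors to retain, observed eigenvalues, random thresholds)."""
    data = np.asarray(data, dtype=float)
    n, p = data.shape
    observed = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]

    # Eigenvalues from random normal data with the same shape as the real data.
    rng = np.random.default_rng(seed)
    random_eigs = np.empty((n_sims, p))
    for sim in range(n_sims):
        noise = rng.standard_normal((n, p))
        eigs = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))
        random_eigs[sim] = np.sort(eigs)[::-1]
    threshold = np.percentile(random_eigs, percentile, axis=0)

    # Keep factors until an observed eigenvalue no longer beats its threshold.
    n_retain = 0
    for obs, thr in zip(observed, threshold):
        if obs <= thr:
            break
        n_retain += 1
    return n_retain, observed, threshold

# Hypothetical data: 500 respondents, 6 items generated from 2 factors.
rng = np.random.default_rng(1)
pattern = np.array([[0.8, 0], [0.7, 0], [0.6, 0], [0, 0.8], [0, 0.7], [0, 0.6]])
scores = rng.standard_normal((500, 2))
noise_sd = np.sqrt(1.0 - (pattern ** 2).sum(axis=1))
sample = scores @ pattern.T + rng.standard_normal((500, 6)) * noise_sd

n_retain, observed, threshold = parallel_analysis(sample)
print("factors to retain:", n_retain)
```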
