Can we use the same data for EFA and CFA? #Instrument_Development

I just reviewed several articles in BNJ in which the authors tried to develop or modify an instrument. Unfortunately, I found that the authors used exactly the same data for Exploratory Factor Analysis (EFA) and Confirmatory Factor Analysis (CFA). So, I need to reject the articles.

We can’t use the same data for both EFA and CFA. The reason is very simple. We explore the factor(s) with EFA; with CFA, on the other hand, we confirm the latent structure(s) of the scale. So, if we try to verify the factor(s) we discovered with EFA using the same data, the CFA will most likely give good fit indices, because the data will tend to conform to a structure that was derived from those same data in the first place. Therefore, we have to use different data, obtained from a similar sample, for CFA.
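If you cannot collect a second sample, one common way to get "different data from a similar sample" is to randomly split a single, sufficiently large sample into two non-overlapping halves, running EFA on one and CFA on the other. Here is a minimal sketch of that idea, using only the Python standard library. The function name `split_sample`, the 50/50 split, the seed, and the 600 respondent IDs are all assumptions made up for illustration; the actual EFA and CFA would still be run in your usual statistics software.

```python
import random


def split_sample(respondent_ids, efa_fraction=0.5, seed=42):
    """Randomly split respondents into an EFA subset and a CFA subset.

    Hypothetical helper for illustration: the two subsets never overlap,
    so the CFA is not run on the data that produced the EFA structure.
    """
    ids = list(respondent_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    cut = int(len(ids) * efa_fraction)
    return ids[:cut], ids[cut:]


# Assumed example: 600 respondents with IDs 1..600.
efa_ids, cfa_ids = split_sample(range(1, 601))
print(len(efa_ids), len(cfa_ids))  # prints: 300 300
```

Each half must still be large enough for its own analysis, so this only works when the total sample is big to begin with.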

Another solution is to do EFA only. That’s it, and it is acceptable.

But if you want to make a great instrument, CFA is recommended after EFA, with a different data set.

The next question is: when can you do CFA only? If you are just translating or cross-culturally adapting an instrument, CFA alone is acceptable (in my opinion).

However, EFA is suggested if you want to remove or add one or more items in an instrument, because doing so may or may not change the latent variable (construct).

You may need to understand the terms adopt, adapt, modify, and translate as they apply to an instrument (I talk about this on my YouTube channel, check it out). Then you will know when you should do EFA and CFA rationally. Arguments for and against are welcome, as long as logical or critical reasons are provided.

I hope it helps! Buzz me if you have another question, opinion, or perspective. 

Don’t forget to cite!

This is an open access article distributed under Creative Commons Attribution 4.0 International (CC BY 4.0)

7 thoughts on “Can we use the same data for EFA and CFA? #Instrument_Development”

  1. Dear Joko,
    Thank you so much for your clear and informative article. I have noticed that EFA run on a large sample (S1) gives one set of results, but when I rerun the EFA on a sub-sample (S2) drawn from S1, the results change, and sometimes even the factor structure is quite different between the two. We have observed this frequently, even when controlling for gender and age. Can you explain why? And if this is the case, which results should I consider?
    Thank you
    Donata

    1. Hi,

      Thank you for your interest. The differences you observed between the EFA results for the large sample (S1) and the sub-sample (S2) can be due to several factors:
      1) Sampling variability: When you take a sub-sample from a larger sample, there is inherent variability in the composition of the sub-sample. The sub-sample might not fully capture the diversity and range of the constructs being measured, leading to different factor patterns.
      2) Random chance: In small sub-samples, there is a higher likelihood of chance correlations influencing the factor extraction process. As a result, the factor structure may differ from the wider sample due to random variations.
      3) Subgroup differences: Even when controlling for gender and age, there might be other relevant subgroup differences within the sub-sample. Factors might emerge differently for different subsets of the data due to variations in characteristics, experiences, or other relevant factors.
      4) Measurement error: If the sub-sample has a smaller number of observations, it is more susceptible to the impact of measurement error. This can lead to instability in the factor structure, resulting in different results compared to the wider sample.

      Anyway, if you have access to additional data, it is better to conduct CFA to confirm the factors. But if you don’t have additional data, just use the EFA results from S1. There is no need to run EFA on S2.

      Regards,
      Joko
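
The point about sampling variability and chance correlations in the reply above can be illustrated with a small simulation. This is only a sketch under assumed values, not an analysis of anyone's real data: two simulated item scores with a true correlation of 0.5, a "large sample" of 1000, and repeated sub-samples of 50 and 500. All names and numbers here are hypothetical. Because EFA works from the correlation matrix, correlations that move around a lot from one small sub-sample to the next can easily lead to a different factor structure.

```python
import math
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation between two equally long sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_items(n, rho, rng):
    """Simulate n pairs of item scores whose true correlation is rho."""
    pairs = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        pairs.append((z1, rho * z1 + math.sqrt(1 - rho ** 2) * z2))
    return pairs


def r_spread(sample, subsample_size, draws, rng):
    """SD of the correlation across repeated random sub-samples."""
    rs = []
    for _ in range(draws):
        sub = rng.sample(sample, subsample_size)  # without replacement
        xs, ys = zip(*sub)
        rs.append(pearson_r(xs, ys))
    return statistics.stdev(rs)


rng = random.Random(0)               # fixed seed for reproducibility
s1 = simulate_items(1000, 0.5, rng)  # the assumed "large sample" S1
spread_small = r_spread(s1, 50, 200, rng)
spread_large = r_spread(s1, 500, 200, rng)
print(f"SD of r with n=50: {spread_small:.3f}, with n=500: {spread_large:.3f}")
```

The correlation estimated from sub-samples of 50 varies much more from draw to draw than the one estimated from sub-samples of 500, which is why the results from the full S1 are the more trustworthy ones.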

  2. Thank you for your clear information. Exactly! It doesn’t make any sense to explore factors and then confirm them in the same data! Could you please point me to some specific references for this? I am looking for a well-known statistics reference to explain to my supervisors that EFA and CFA require separate data sets, but they insist it is possible to do both with the same data set! I am having trouble finding the reference. I am sure I read it in Tabachnick and Fidell, but I cannot find it again!

  3. Hello,
    I am facing a problem and hoping for a solution.
    I conducted an exploratory factor analysis and found that the two items of my only dependent variable were grouped with the items of one of my independent variables (6 IVs in total). So, my dependent variable disappeared at the EFA stage. What can I do now to build an SEM model connecting the dependent and independent variables? Can I separate those two items during my CFA? If so, how would I justify it? I don’t want to change my model.

    Please, help.
