Grade-Related Differential Item Functioning in General English Proficiency Test-Kids Listening

Differential Item Functioning (DIF) analysis is an indispensable methodology for detecting item and test bias in language testing. This study investigated grade-related DIF in the listening section of the General English Proficiency Test-Kids (GEPT-Kids). Quantitative data were test scores collected from 791 test takers (Grade 5 = 398; Grade 6 = 393) in eight Chinese-speaking cities, and qualitative data were expert judgments collected from two primary school English teachers in Guangdong province.
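
For readers interested in how such an analysis is run in practice, the sketch below illustrates, in R, the five DIF methods named in the abstract using the "difR" and "difNLR" packages. It is a minimal sketch rather than the authors' code: the data frame gept, its item columns, and its grade column are hypothetical placeholders for the scored GEPT-Kids listening responses and the Grade 5/Grade 6 grouping variable.

library(difR)    # Lord's chi-square, Raju's area, Mantel-Haenszel, logistic regression
library(difNLR)  # nonlinear regression (NLR) DIF method

# Hypothetical data: one row per test taker, 0/1-scored listening items plus a grade column
resp  <- gept[, grep("^item", names(gept))]  # item response matrix
group <- gept$grade                          # "5" = reference group, "6" = focal group

# 2PL-IRT-based methods
lord <- difLord(resp, group = group, focal.name = "6", model = "2PL")
raju <- difRaju(resp, group = group, focal.name = "6", model = "2PL")

# Observed-score methods
mh <- difMH(resp, group = group, focal.name = "6")
lr <- difLogistic(resp, group = group, focal.name = "6", type = "both")

# Nonlinear regression method
nlr <- difNLR(resp, group = group, focal.name = "6", model = "2PL", type = "all")

# Compare which items each method flags as functioning differentially across grades
flagged <- list(Lord = lord$DIFitems, Raju = raju$DIFitems,
                MH = mh$DIFitems, LR = lr$DIFitems, NLR = nlr$DIFitems)

# ICCs for flagged items can be drawn with the objects' plot() methods,
# e.g. plot(nlr, item = nlr$DIFitems), or explored interactively in the
# ShinyItemAnalysis app via ShinyItemAnalysis::startShinyItemAnalysis().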

Full description

Bibliographic Details
Main Authors: Linyu Liao, Don Yao
Format: article
Language: EN
Published: Frontiers Media S.A., 2021
Subjects: grade, DIF, GEPT-Kids, listening, mixed-methods approach
Online Access: https://doaj.org/article/dacf8148c83c470f93b8831c9fe3efbf
id oai:doaj.org-article:dacf8148c83c470f93b8831c9fe3efbf
record_format dspace
spelling oai:doaj.org-article:dacf8148c83c470f93b8831c9fe3efbf (record updated 2021-12-01T02:05:14Z). Grade-Related Differential Item Functioning in General English Proficiency Test-Kids Listening. ISSN: 1664-1078. DOI: 10.3389/fpsyg.2021.767244. Published online: 2021-11-01. Authors: Linyu Liao, Don Yao. Publisher: Frontiers Media S.A. Subjects: grade, DIF, GEPT-Kids, listening, mixed-methods approach, Psychology (BF1-990). Language: EN. Source: Frontiers in Psychology, Vol 12 (2021). Record: https://doaj.org/article/dacf8148c83c470f93b8831c9fe3efbf. Full text: https://www.frontiersin.org/articles/10.3389/fpsyg.2021.767244/full. Journal TOC: https://doaj.org/toc/1664-1078.
institution DOAJ
collection DOAJ
language EN
topic grade
DIF
GEPT-Kids
listening
mixed-methods approach
Psychology
BF1-990
description Differential Item Functioning (DIF) analysis is an indispensable methodology for detecting item and test bias in language testing. This study investigated grade-related DIF in the listening section of the General English Proficiency Test-Kids (GEPT-Kids). Quantitative data were test scores collected from 791 test takers (Grade 5 = 398; Grade 6 = 393) in eight Chinese-speaking cities, and qualitative data were expert judgments collected from two primary school English teachers in Guangdong province. Two R packages, "difR" and "difNLR", were used to perform five types of DIF analysis on the test scores (the two-parameter item response theory [2PL IRT] based Lord's chi-square and Raju's area tests, and the Mantel-Haenszel [MH], logistic regression [LR], and nonlinear regression [NLR] methods), which together identified 16 DIF items. The ShinyItemAnalysis package was then used to draw item characteristic curves (ICCs) for the 16 items in RStudio, which revealed four different types of DIF effect. In addition, the two experts identified reasons or sources for the DIF effect of four items. The study may therefore shed light on the sustainable development of test fairness in language testing: methodologically, its mixed-methods sequential explanatory design can guide further test fairness research that uses flexible methods to achieve its purposes; practically, the results indicate that DIF does not necessarily imply bias but instead serves as an alarm that calls test developers' attention to further examination of the appropriateness of test items.
format article
author Linyu Liao
Don Yao
author_sort Linyu Liao
title Grade-Related Differential Item Functioning in General English Proficiency Test-Kids Listening
title_sort grade-related differential item functioning in general english proficiency test-kids listening
publisher Frontiers Media S.A.
publishDate 2021
url https://doaj.org/article/dacf8148c83c470f93b8831c9fe3efbf
_version_ 1718405930747428864