Grade-Related Differential Item Functioning in General English Proficiency Test-Kids Listening
Differential Item Functioning (DIF) analysis is an indispensable methodology for detecting item and test bias in language testing. This study investigated grade-related DIF in the listening section of the General English Proficiency Test-Kids (GEPT-Kids). Quantitative data were test scores from 791 test takers (398 in Grade 5 and 393 in Grade 6) across eight Chinese-speaking cities; qualitative data were expert judgments from two primary school English teachers in Guangdong province. Two R packages, “difR” and “difNLR”, were used to perform five types of DIF analysis on the test scores (Lord’s chi-square and Raju’s area tests based on the two-parameter item response theory [2PL IRT] model, the Mantel-Haenszel [MH] test, logistic regression [LR], and nonlinear regression [NLR]), which together flagged 16 DIF items. The ShinyItemAnalysis package was then used in RStudio to draw item characteristic curves (ICCs) for the 16 items, which revealed four distinct types of DIF effect. In addition, the two experts identified plausible sources of the DIF effect for four of the items. The study may thus inform the sustainable development of test fairness in language testing: methodologically, it adopted a mixed-methods sequential explanatory design that can guide further test fairness research; practically, the results indicate that DIF does not necessarily imply bias but rather serves as an alarm prompting test developers to further examine the appropriateness of flagged items.
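The abstract names the difR and difNLR R packages and five DIF methods. The sketch below is a minimal, hypothetical illustration of how such an analysis is commonly set up with those packages; the object names `resp` (a 0/1 item-response matrix for the listening items) and `grade` (a Grade 5/Grade 6 grouping vector with "G6" taken as the focal group), as well as the specific argument settings, are assumptions for illustration and are not taken from the article.

```r
# Hypothetical sketch of the five DIF analyses named in the abstract.
# Assumptions: `resp` is a 0/1 item-response matrix; `grade` codes Grade 5 vs. Grade 6,
# with "G6" treated as the focal group.
library(difR)
library(difNLR)

# 2PL-IRT-based methods
lord <- difLord(Data = resp, group = grade, focal.name = "G6", model = "2PL")
raju <- difRaju(Data = resp, group = grade, focal.name = "G6", model = "2PL")

# Observed-score methods
mh <- difMH(Data = resp, group = grade, focal.name = "G6")
lr <- difLogistic(Data = resp, group = grade, focal.name = "G6", type = "both")

# Nonlinear regression DIF (difNLR)
nlr <- difNLR(Data = resp, group = grade, focal.name = "G6", model = "2PL", type = "all")

# Items flagged by each method
lapply(list(Lord = lord, Raju = raju, MH = mh, LR = lr, NLR = nlr),
       function(fit) fit$DIFitems)
```

Flagging items across several methods in this way, and then inspecting the intersection or union of the flagged sets, is one common way to arrive at a pooled list such as the 16 items reported in the abstract; the exact flagging criteria used by the authors are not specified in this record.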
Saved in:
Main Authors: | Linyu Liao, Don Yao |
---|---|
Format: | article |
Language: | EN |
Published: | Frontiers Media S.A., 2021 |
Subjects: | grade; DIF; GEPT-Kids; listening; mixed-methods approach; Psychology |
Online Access: | https://doaj.org/article/dacf8148c83c470f93b8831c9fe3efbf |
id |
oai:doaj.org-article:dacf8148c83c470f93b8831c9fe3efbf |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:dacf8148c83c470f93b8831c9fe3efbf (2021-12-01T02:05:14Z). Grade-Related Differential Item Functioning in General English Proficiency Test-Kids Listening. ISSN 1664-1078. DOI 10.3389/fpsyg.2021.767244. Published 2021-11-01. https://doaj.org/article/dacf8148c83c470f93b8831c9fe3efbf. https://www.frontiersin.org/articles/10.3389/fpsyg.2021.767244/full. https://doaj.org/toc/1664-1078. Authors: Linyu Liao, Don Yao. Publisher: Frontiers Media S.A. Subjects: grade; DIF; GEPT-Kids; listening; mixed-methods approach; Psychology (BF1-990). Language: EN. Source: Frontiers in Psychology, Vol 12 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
grade; DIF; GEPT-Kids; listening; mixed-methods approach; Psychology; BF1-990 |
spellingShingle |
grade; DIF; GEPT-Kids; listening; mixed-methods approach; Psychology; BF1-990; Linyu Liao; Don Yao; Grade-Related Differential Item Functioning in General English Proficiency Test-Kids Listening |
description |
Differential Item Functioning (DIF) analysis is an indispensable methodology for detecting item and test bias in language testing. This study investigated grade-related DIF in the listening section of the General English Proficiency Test-Kids (GEPT-Kids). Quantitative data were test scores from 791 test takers (398 in Grade 5 and 393 in Grade 6) across eight Chinese-speaking cities; qualitative data were expert judgments from two primary school English teachers in Guangdong province. Two R packages, “difR” and “difNLR”, were used to perform five types of DIF analysis on the test scores (Lord’s chi-square and Raju’s area tests based on the two-parameter item response theory [2PL IRT] model, the Mantel-Haenszel [MH] test, logistic regression [LR], and nonlinear regression [NLR]), which together flagged 16 DIF items. The ShinyItemAnalysis package was then used in RStudio to draw item characteristic curves (ICCs) for the 16 items, which revealed four distinct types of DIF effect. In addition, the two experts identified plausible sources of the DIF effect for four of the items. The study may thus inform the sustainable development of test fairness in language testing: methodologically, it adopted a mixed-methods sequential explanatory design that can guide further test fairness research; practically, the results indicate that DIF does not necessarily imply bias but rather serves as an alarm prompting test developers to further examine the appropriateness of flagged items. |
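The description also reports that ICCs for the 16 flagged items were drawn in RStudio with the ShinyItemAnalysis package. As a hedged illustration only: the snippet below draws group-specific characteristic curves via the plot() method of the difNLR fit from the earlier sketch (the `nlr` object is that assumed fit) and mentions the ShinyItemAnalysis interactive app as an alternative; the exact plotting workflow used by the authors is not specified in this record.

```r
# Illustrative only: `nlr` is the assumed difNLR fit from the earlier sketch,
# and at least one item is assumed to have been flagged (nlr$DIFitems is numeric).
library(difNLR)
library(ShinyItemAnalysis)

# Group-specific item characteristic curves (ICCs) for the flagged items
plot(nlr, item = nlr$DIFitems)

# Alternatively, the ShinyItemAnalysis app offers interactive DIF and ICC modules:
# startShinyItemAnalysis()
```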
format |
article |
author |
Linyu Liao; Don Yao |
author_facet |
Linyu Liao; Don Yao |
author_sort |
Linyu Liao |
title |
Grade-Related Differential Item Functioning in General English Proficiency Test-Kids Listening |
title_short |
Grade-Related Differential Item Functioning in General English Proficiency Test-Kids Listening |
title_full |
Grade-Related Differential Item Functioning in General English Proficiency Test-Kids Listening |
title_fullStr |
Grade-Related Differential Item Functioning in General English Proficiency Test-Kids Listening |
title_full_unstemmed |
Grade-Related Differential Item Functioning in General English Proficiency Test-Kids Listening |
title_sort |
grade-related differential item functioning in general english proficiency test-kids listening |
publisher |
Frontiers Media S.A. |
publishDate |
2021 |
url |
https://doaj.org/article/dacf8148c83c470f93b8831c9fe3efbf |
work_keys_str_mv |
AT linyuliao graderelateddifferentialitemfunctioningingeneralenglishproficiencytestkidslistening AT donyao graderelateddifferentialitemfunctioningingeneralenglishproficiencytestkidslistening |
_version_ |
1718405930747428864 |