Assessments are, at their core, exercises in measurement, whether they test an individual's knowledge, competency, or aptitude. Assigning a numerical value is a credible and convenient way of gauging students' learning progress and comparing them with one another.
However, the efficacy of such assessments is questionable. The same test paper is designed for all students in a class, which fails to take into account the varying needs and capabilities of each one. An above-average student may get bored attempting questions that are too easy, while a student who is lagging behind may become frustrated attempting questions that are too difficult.
Thus, instead of promoting learning, this one-size-fits-all exercise demotivates children and erodes their attention and interest. In the era of personalised learning, where instruction is tailored to the requirements of each student, personalised assessment could not have stayed far behind. The difficulty level of questions should be matched to the abilities of the student, keeping them motivated, which is essential to attain better learning outcomes.
Item Response Theory (IRT) has its philosophical roots in Classical Test Theory (CTT) and makes up for that theory's limitations. CTT considers only the total score a student has managed to obtain in an exam; scoring is based on predefined weightage and does not take into account the difficulty level of each question. CTT works fine for average students, but not for weak or strong ones. IRT, by contrast, analyses the response to each particular item, i.e., each question, to estimate the student's latent ability. The aim of the theory is to provide a framework for analysing how well a test works as a whole, and also how well each single question in the test paper works.
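The limitation described above can be made concrete with a toy example (the weights and difficulty values below are hypothetical, chosen only for illustration): two students who answer the same number of questions correctly receive identical CTT scores, even if one solved only the easy items and the other only the hard ones.

```python
# CTT scores ignore which items were answered correctly, only how many.
weights = [1, 1, 1, 1]               # predefined, equal weightage per question
difficulty = [-1.0, 0.0, 1.0, 2.0]   # hypothetical: easy ... hard

student_a = [1, 1, 0, 0]  # solved only the two easy items
student_b = [0, 0, 1, 1]  # solved only the two hard items

def ctt_score(responses):
    """Classical total score: weighted sum of correct responses."""
    return sum(w * r for w, r in zip(weights, responses))

print(ctt_score(student_a), ctt_score(student_b))  # both score 2
```

IRT would instead infer a higher latent ability for the second student, because answering harder items correctly carries more evidential weight.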
How does IRT Work?
Looking into the Graduate Record Examinations (GRE) and the Graduate Management Admission Test (GMAT) illustrates IRT in practice: both use adaptive formats that draw on IRT, selecting subsequent questions on the basis of the test-taker's performance so far.
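A minimal sketch of that adaptive loop, under simplifying assumptions, might look as follows. Real adaptive engines estimate ability with maximum-likelihood or Bayesian methods and use a probabilistic response model; here a fixed, shrinking step and a deterministic response stand in for both. All names and values are illustrative, not drawn from any actual GRE/GMAT implementation.

```python
def run_adaptive_test(true_ability, item_bank, n_items=5):
    """Toy adaptive test.

    Repeatedly pick the unused item whose difficulty is closest to the
    current ability estimate, then nudge the estimate up on a correct
    answer and down on an incorrect one, halving the step each round.
    """
    estimate, step = 0.0, 1.0
    available = list(item_bank)
    for _ in range(min(n_items, len(available))):
        item = min(available, key=lambda b: abs(b - estimate))
        available.remove(item)
        correct = true_ability >= item  # deterministic stand-in for a response model
        estimate += step if correct else -step
        step /= 2  # shrink the adjustment as confidence grows
    return estimate

# With a small bank of item difficulties, the estimate moves toward
# the examinee's true ability after only a handful of questions.
print(run_adaptive_test(1.2, [-2, -1, 0, 1, 2]))
```

The key idea the sketch captures is that each question is chosen to be maximally informative at the current estimate, which is why adaptive tests can place examinees accurately with far fewer items than a fixed paper.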
What’s so Great About It?
The IRT model estimates the probability of different responses to items by individuals. The three parameters of IRT used to analyse large sets of response data are:
1. Discrimination parameter, which measures how well an item differentiates a top-performing student from a weak one
2. Difficulty parameter, which determines the level of question best suited to the needs and ability of each student
3. Pseudo-guessing parameter, which accounts for the effect of guessing on the probability of a correct response in a multiple-choice question.
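These three parameters come together in the three-parameter logistic (3PL) model, a standard formulation of IRT. The sketch below shows how the probability of a correct response depends on the examinee's latent ability; the specific parameter values are hypothetical.

```python
import math

def p_correct(theta, a, b, c):
    """Three-parameter logistic (3PL) IRT model.

    theta : examinee's latent ability
    a     : discrimination parameter (slope of the curve)
    b     : difficulty parameter (ability at the curve's midpoint)
    c     : pseudo-guessing parameter (lower asymptote)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# A hard item (b = 1.5) with four answer options (c = 0.25):
# an average examinee (theta = 0) is barely above the guessing floor,
# while a strong examinee (theta = 2) is likely to answer correctly.
print(round(p_correct(0.0, 1.2, 1.5, 0.25), 3))  # → 0.356
print(round(p_correct(2.0, 1.2, 1.5, 0.25), 3))  # → 0.734
```

Fitting these parameters to a large set of observed responses is what lets IRT judge both the quality of each question and the ability of each student.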
Initiatives so Far
One of the first ed-techs to adopt IRT for the K-12 sector, Next Education uses the model in its assessments.
The Union government has also taken heed of these technological advancements and initiated constructive measures. The Department of Educational Measurement and Evaluation of the National Council of Educational Research and Training (NCERT) has been conducting achievement surveys in the country under the Sarva Shiksha Abhiyan to measure learning outcomes. Recently, it decided to use IRT to develop multiple test booklets, analyse the data, and report the outcomes. The scope for improving IRT is immense, and research and innovation would go a long way.
(This article is authored by Beas Dev Ralhan, CEO & Co-founder, Next Education India Pvt. Ltd)