Diploma Exams

Maintaining Consistent Standards Over Time Initiative

Why has an initiative to maintain consistent standards over time been established?

In May 2003, Learner Assessment announced an initiative designed to maintain consistent standards on diploma examinations over time. The initiative was introduced to ensure fairness to students regardless of when they write a diploma examination, and equal opportunity in relation to scholarships and entrance to postsecondary institutions. It will allow Alberta Education to report changes in student achievement on diploma examinations from administration to administration and from year to year more meaningfully than has been possible in the past. It will also make it possible to equate students’ marks on diploma examinations so that they accurately reflect achievement regardless of which examination form was written.

The first diploma examinations to be affected by this initiative were the Social Studies 30 and Social Studies 33 diploma examinations administered in 2004. In January 2005, Chemistry 30, Physics 30, and Pure Mathematics 30 joined this initiative, and Applied Mathematics 30, Biology 30, English Language Arts 30-1, and English Language Arts 30-2 joined in January 2006.

Similar approaches for maintaining consistent standards over time are used in the Provincial Achievement Testing Program as well as in many national and international tests.

How do the diploma examinations administered in the initiative compare with the examinations that were administered prior to the initiative?

The diploma examinations administered since the initiative to maintain consistent standards over time was introduced are exactly the same in design as the diploma examinations administered in 2004. Curriculum coverage is the same, the number and the design of the writing assignments on Part A: Written Response are the same, and the number and the style of the questions on Part B: Machine-Scored are the same. The time allotted for writing each part is the same.

What is the purpose of the anchor items?

To conduct this initiative, it was necessary that at least 20% of the questions on the Part B component of an examination be re-used in another administration. These items are called anchor items, and they are selected from the items of previously secured diploma examinations. By comparing student results on the anchor items and the unique items (those that appear on a single exam) of any particular diploma examination with the results from the baseline examination, Alberta Education can determine whether that examination was more or less difficult than the baseline examination. Student scores on that examination can then be equated to the baseline examination to remove any influence that differences in the difficulty of the two examinations may have had on student scores.

How is equating done?

Each diploma examination is designed and developed according to a published blueprint that determines the makeup of an entire examination. Once it is established, the blueprint typically remains unchanged through the life of a particular program of studies so each examination administered is designed consistently through time. The anchor set of items, mentioned earlier, is selected to be representative of the entire examination, and these items are embedded throughout each exam.

When two groups of students write the same set of anchor items contained in two different forms of an examination, the following process occurs. The averages that the two groups attain on the common anchor set are compared. This tells us about the nature of the two populations of students. For example, if the averages on the anchor sets for the students writing each form of an examination were almost identical and showed no practically significant difference, this would tell us that the characteristics of the populations writing the two forms are essentially the same. As a result, any differences seen in student performance on the unique items of the two forms would be due to differing item difficulties between the forms, not differences in the populations writing. The relative difficulties of the forms are then determined.
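The comparison described above can be sketched in a few lines of Python. This is only an illustration, not Alberta Education's procedure: the function name, the per-student percent scores, and the tolerance used to stand in for "no practically significant difference" are all assumptions made for the example.

```python
from statistics import mean


def compare_forms(anchor_a, unique_a, anchor_b, unique_b, tolerance=1.0):
    """Compare two groups of writers using anchor and unique item results.

    Each argument is a list of per-student percent scores on that item set.
    `tolerance` is a hypothetical threshold (in percentage points) standing in
    for "no practically significant difference" on the anchor items.
    """
    anchor_gap = mean(anchor_b) - mean(anchor_a)
    unique_gap = mean(unique_b) - mean(unique_a)
    return {
        # Similar anchor averages suggest the two populations are alike.
        "populations_similar": abs(anchor_gap) <= tolerance,
        "anchor_gap": anchor_gap,
        # If the populations are alike, this gap reflects form difficulty.
        "unique_gap": unique_gap,
    }


# Toy data: both groups average 70 on the anchors, but group B averages
# 5 points lower on its unique items, suggesting form B was harder.
result = compare_forms(
    anchor_a=[70, 80, 60], unique_a=[65, 75, 55],
    anchor_b=[71, 79, 60], unique_b=[60, 70, 50],
)
print(result)
```

When the anchor averages do differ, the gap on the unique items mixes population differences with form difficulty, which is why a formal equating procedure is needed rather than a simple subtraction.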

Eight equating methods are used, including equipercentile equating with various smoothing values. Based on the convergence and trends found among these methods, an equating table is developed, which adjusts the students’ total scores to the same metric, or standard, found in the baseline examination.
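As a rough illustration of one of the methods named above, the following Python sketch performs unsmoothed equipercentile equating for a single score. It is not the procedure Alberta Education uses: it assumes the two groups of writers are equivalent, omits smoothing and the other equating methods, and uses hypothetical function names and made-up score lists.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(sorted_scores, score):
    """Mid-percentile rank (0-100): the share of scores below `score`
    plus half the share of scores exactly equal to it."""
    below = bisect_left(sorted_scores, score)
    at_or_below = bisect_right(sorted_scores, score)
    return 100.0 * (below + at_or_below) / (2 * len(sorted_scores))


def score_at_rank(sorted_scores, rank):
    """Score at a percentile rank, interpolating linearly between
    neighbouring values in the sorted list."""
    position = rank / 100.0 * (len(sorted_scores) - 1)
    low = int(position)
    high = min(low + 1, len(sorted_scores) - 1)
    step = sorted_scores[high] - sorted_scores[low]
    return sorted_scores[low] + (position - low) * step


def equate_score(score, new_form_scores, baseline_scores):
    """Map a score on the new form to the baseline score that has the
    same percentile rank (assumes equivalent groups, no smoothing)."""
    new_sorted = sorted(new_form_scores)
    base_sorted = sorted(baseline_scores)
    rank = percentile_rank(new_sorted, score)
    return score_at_rank(base_sorted, rank)


# Toy data: the new form's scores sit 5 points below the baseline's,
# as if the new form were uniformly harder.
baseline = list(range(20, 90))
new_form = list(range(15, 85))
print(round(equate_score(50, new_form, baseline)))
```

Repeating this calculation for every possible total score would produce an equating table of the kind the text describes. Real score distributions are lumpy, which is one reason smoothing is applied before the table is built.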

This process is not the same as the common notion of scaling examination scores, whereby a uniform amount is added to, or removed from, each examination score. Typically, the numerical adjustments that result from the equating process do not produce a uniform adjustment across all total test scores. It should be noted that test scores of 0 and 100 are never adjusted, because a student cannot have a negative score, nor can a score greater than 100 exist. A student achieving 100% has not necessarily demonstrated an upper limit to his or her achievement, and to move the mark downward would not be fair.
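The contrast between uniform scaling and an equating table can be made concrete with a small Python sketch. The table values below are invented purely for illustration and do not come from any actual diploma examination; the function names are likewise hypothetical.

```python
def uniform_scale(score, shift, max_score=100):
    """The 'common notion' of scaling: add the same shift to every score,
    clamped so the result stays within the valid range."""
    return max(0, min(max_score, score + shift))


def apply_equating_table(score, table, max_score=100):
    """Look up the equated score. Scores of 0 and `max_score` are
    returned unchanged, as described in the text."""
    if score in (0, max_score):
        return score
    return table[score]


# Hypothetical excerpt of an equating table (raw score -> equated score).
ILLUSTRATIVE_TABLE = {30: 33, 50: 54, 70: 72, 90: 91}

for raw in (0, 30, 50, 70, 90, 100):
    equated = apply_equating_table(raw, ILLUSTRATIVE_TABLE)
    print(raw, equated, equated - raw, uniform_scale(raw, 3))
```

In this made-up excerpt the adjustment is 3 points at a raw score of 30, 4 points at 50, and 1 point at 90, whereas uniform scaling would apply the same shift everywhere.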

Adjustments may be upward or downward, depending on the differences in difficulty between the baseline examination and the subsequent examination. The degree of adjustment also varies.

It is critical to remember that the equating process is designed to remove the possibility of varying examination difficulty as a factor in large-scale assessments. As a result, fairness to all students over time is ensured.

Are the scores of students who rewrite a diploma examination adjusted in any way?

The initiative to maintain consistent standards on diploma examinations over time requires that a number of machine-scored questions be unique to each administration and that a number of machine-scored questions (the anchor items) be common across administrations to allow for comparison of the difficulty of the examinations.

Therefore, a rewriter’s score on an examination may be prorated if the examination contains items that the rewriter was administered at their first writing. Students who rewrite a diploma examination after the baseline examination will be scored in the same way as first-time writers on the unique items of the rewritten exam. However, their total score on the Part B component will be adjusted based on the ratio of first-time writers’ scores on the total Part B to first-time writers’ scores on the unique items.

The process of prorating takes into account the fact that the anchor items may differ in average difficulty from the unique items. By prorating rewriters’ scores in this manner, Alberta Education can ensure fairness to students who are rewriting diploma examinations that contain anchor items, while retaining fairness to first-time writers.
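The prorating rule described above can be read as a simple ratio calculation, sketched here in Python. The source does not say whether the ratio uses raw marks or percentages, or how the result is rounded or capped, so the use of mean raw marks and the cap at the Part B maximum are assumptions, as are the function name and the numbers.

```python
def prorate_part_b(rewriter_unique_raw, first_time_mean_total,
                   first_time_mean_unique, part_b_max):
    """Estimate a rewriter's total Part B score from their unique-item score.

    The ratio of first-time writers' mean total Part B score to their mean
    unique-item score scales the rewriter's unique-item score up to the
    full Part B. Capping at `part_b_max` is an assumption of this sketch.
    """
    ratio = first_time_mean_total / first_time_mean_unique
    return min(rewriter_unique_raw * ratio, part_b_max)


# Hypothetical Part B with 50 items, 40 of them unique. First-time writers
# average 32.0 overall and 25.0 on the unique items; the rewriter scored
# 30 on the unique items.
print(prorate_part_b(30, 32.0, 25.0, part_b_max=50))
```

Because the ratio is computed from first-time writers' actual results, it reflects any difference in average difficulty between the anchor items and the unique items, rather than assuming every item is worth the same.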

How does this initiative affect the release of items from diploma examinations?

The implementation of the initiative to maintain consistent standards on diploma examinations over time has meant that diploma examinations are not released in their entirety immediately after administration. However, Learner Assessment releases significant numbers of diploma examination items on an annual basis. Booklets of released items from each of the diploma examinations are provided to schools in print form annually.

Is the Results Statement sent to students affected by this initiative?

Where equating is used with a diploma examination, the Results Statement sent to students reports the equated diploma examination mark for each exam written, as well as the school-awarded mark and the final course mark. In addition, the Results Statement provides the raw score on the written-response component of each diploma exam written.

How does this initiative affect the multiyear reports for diploma examinations?

Once the examinations from all administrations in a given school year are equated, results on the multiyear reports will be directly comparable to results in subsequent years.

Updated August 2008