A variety of methods are available for student assessment, including global faculty ratings, structured oral examinations, standardized patient simulations, patient management problems, computer-based simulations, free-response questions (essay and short answer), and various forms of multiple-choice questions. Each method has inherent strengths and weaknesses in its reproducibility, validity, and utility. The purpose of this paper is to discuss the use of extended matching questions as an alternative to multiple-choice or free-response questions in student assessment.

Free-response questions are commonly believed to test important higher-order skills, whereas multiple-choice questions are thought to assess only knowledge of isolated facts or, as Newble et al. stated, "a combination of what the student knows, partially knows, can guess, or is cunning enough to surmise from cues in the questions." Some of the flaws in multiple-choice questions can be overcome by following important construction principles. As commonly used, however, multiple-choice examinations often place undue emphasis on recall and encourage students to learn in the same mode. On the positive side, scoring reproducibility for multiple-choice questions is excellent, and many topic areas can be sampled in a short time.

Validity is the degree to which a test measures the learning outcomes it purports to measure. Because students can usually answer a multiple-choice item much more quickly than an essay question, a test built from multiple-choice items can cover a relatively broad sample of the course material, which increases the validity of the assessment.

This study compared two types of teacher-made in-class tests (multiple-choice and short-answer) with a no-test (control) condition to determine their relative effectiveness as aids to retention learning, that is, learning that is still retained weeks after the initial instruction and testing have occurred. The investigation involved instruction via self-paced texts, initial testing of learning, and delayed testing 3 weeks later. The delayed tests, which included both previously tested information and novel information that had not been tested before, provided the experimental data for the study.

One method of testing that has received little attention in the literature, yet is popular in many educational settings, is the use of short-answer test items. Short-answer items are relatively easy to prepare and may be scored more quickly than essay items. They are not as objective as multiple-choice items because they sometimes do not give adequate information to evoke the desired response, even from students who know the subject well. Despite this limitation, they may be useful on teacher-made tests because there is good evidence to suggest that many teachers are not capable of writing truly clear and effective multiple-choice items. Since many teachers do use short-answer items, their usefulness in promoting retention learning is worthy of research.

Multiple-choice tests, take-home tests, and post-test reviews have all been shown to promote retention learning in previous studies. However, announcing an upcoming test did not have a positive effect on retention learning unless a test was actually given. It appears that increased studying in anticipation of a test did not result in better retention; only the act of taking the test increased retention. No studies were found that investigated the effects of short-answer tests on retention learning, which is the thrust of this research. Research on the effects of tests on retention learning within the context of technology education classes, and on the value of the learning time they consume, is limited to the studies cited above.

In addition to studying the relative gains in retention learning that students acquire while taking a test, an effort was made here to determine whether information that had been studied but did not appear on the immediate posttest was retained as well as the material that was on the test. This study also examined whether multiple-choice and short-answer tests differ in their effectiveness for promoting retention of both tested and untested material. The research questions posed and addressed by this study were:

Free-response questions are not without disadvantages. They require students to guess, to some degree, what the author intended and what the grader (sometimes not the author) will reward. This ambiguity can reduce the reliability and validity of scores. Some students might simply be "lucky" or "unlucky" in guessing exactly what the author had in mind or what the grader will expect. A second disadvantage is that free-response items usually sample a relatively small portion of the topic area. A third disadvantage is that free-response questions must be hand-scored, which is cumbersome, time-consuming, and resource-intensive. Most importantly, subjectivity in grading can reduce score reliability and validity, particularly for longer essay questions. In one study, the reliability correlation for essay questions scored by the same grader at six-month intervals was only 0.35. Scoring reliability is less of a problem with short-answer questions. Methods are available to minimize subjectivity and eliminate bias in scoring free-response items, but such methods also increase the cost of scoring.
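To make the notion of a scoring reliability correlation concrete, the following minimal sketch computes the correlation between two sets of scores that the same grader assigned to the same essays on two occasions. It assumes that the coefficient is a Pearson correlation, which the study referred to above does not necessarily specify. The function name `pearson_r` and the score lists are hypothetical and invented purely for illustration; they are not data from any study.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    # Sum of cross-products of deviations from each mean.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    # Sums of squared deviations for each scoring occasion.
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined when a list has no variation")
    return cov / sqrt(var_a * var_b)


# Hypothetical data (invented): one grader scores the same six essays
# twice, six months apart, on a 10-point scale.
first_scoring = [7, 5, 8, 6, 9, 4]
second_scoring = [6, 7, 5, 8, 7, 5]

r = pearson_r(first_scoring, second_scoring)
print(f"intra-rater reliability r = {r:.2f}")
```

A coefficient near 1 would indicate that the grader ranked the essays almost identically on both occasions, whereas a low value, such as the 0.35 reported above, indicates that a student's score depended heavily on when the essay happened to be graded.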

