The assessment of clinical skills
‘Examinations are formidable even to the best prepared, for the greatest fool may ask more than the wisest man can answer’
Charles Caleb Colton (1780–1832)
This section is a condensed guide to clinical assessments presently popular in the UK. It is meant as a complement to the Workplace-based assessment module and the ‘Assessment’ guide in its ‘Explore around this topic’ section.
- List the clinical skills assessments you are familiar with; what are their strengths and weaknesses?
- What would you consider the most limiting aspect of assessing clinical skills?
A clinical graduating examination was first introduced in the UK by Professor Sir George Paget at Cambridge in the 1840s. Until the subsequent introduction of the Objective Structured Clinical Examinations (OSCEs) in the 1970s (Harden and Gleeson, 1979) very little changed. Most more established clinicians will remember the ‘challenges’ of the long case, short case and viva voce.
The long case
While the traditional long case had relatively good face and construct validity, its weaknesses far outweighed these relative strengths. It was largely unobserved, statistically unreliable and often poorly structured.
In attempts to strengthen the format, Gleeson (1992) suggested a more structured approach, the ‘objective structured long examination record’ (OSLER), but this is still rarely used.
Wass and colleagues (2001a, 2001b and 2004) advocated that the traditional format could be improved if: (a) the interaction were observed (see also Newble, 1991); (b) sampling involved 8–10 different cases per student; (c) testing times comparable to an OSCE were employed; and (d) one observer, rather than two, was used.
McKinley et al. (2005) have since advocated using a sequential approach to this and similar assessments, allowing institutions to exclude clearly competent students from further testing and to concentrate on borderline and poorly performing students.
The short case
The traditional ‘short case’ assessment consisted of a candidate examining a series of ‘real’ patients, observed by two examiners. The candidate was then asked a series of questions around their findings, the differential diagnosis, causes and management. Whilst this mode of assessment has several appealing features (going some way to explaining its continued use in many postgraduate examinations), it has largely been replaced by the introduction of the OSCE.
Objective structured clinical examination (OSCE)
The OSCE consists of a circuit made up of a number of cubicles or ‘stations’ through which each candidate must pass. At each station the candidate is required to perform a given task. This is usually observed and assessed by an examiner, but may also include unobserved stations relating to previous tasks or related data. Tasks may include communication skills, history taking, informed consent, clinical examination of real and simulated patients, clinical procedures performed on manikins and data interpretation. Stations may assess various attitudes and behaviours, something the older assessments often failed to address. Each student within a given circuit is assessed on the same task by the same examiner. To improve objectivity, the observer is provided with a detailed, itemised checklist on which to mark the candidate’s performance. More recently, global rating scales have been advocated and used. Despite fears that their use would be a retrograde step towards the old subjective marking system, they have been shown to be as reliable as their detailed counterparts (Allen et al., 1998).
While often viewed as the ‘gold standard’ of clinical assessment, the OSCE has weaknesses, and it remains an assessment in evolution. Debate continues about the maximal duration of the stations, the minimum testing time required for a reliable examination, and the number of stations and tasks assessed. In most undergraduate examinations, these details are governed by the practical considerations of the number of students to be examined, the facilities available and, perhaps most importantly, the financial constraints. The administration, logistics and practicalities of running an undergraduate OSCE are comprehensively described by Feather and Kopelman (1997). The experience of the authors is that OSCEs are approximately 30–50% more expensive than traditional examinations. However, this must be set against their reliability, which is far superior to the traditional short case.
At present, most UK undergraduate high-stakes OSCEs consist of 20 to 30 stations, each of approximately 5–10 minutes in length. It should be stressed that OSCEs of less than two and a half hours become increasingly unreliable and should not be used summatively. However, the 5–10 minute station format limits the type of task assessed and is probably more applicable to the assessment of students in the earlier years, when single tasks need to be assessed in isolation to ensure competence. At graduation, one is more interested in a holistic approach, and extended or paired stations may be used to assess a range of skills within a single clinical scenario.
All postgraduate teachers and trainers (certainly in the UK) will now be familiar with the newer workplace-based assessments recently introduced into postgraduate training. These include the mini-CEX (mini-Clinical Evaluation Exercise; see Norcini et al. (2003)), Direct Observation of Procedural Skills (DOPS) and multi-source feedback (MSF).
But this is for another module. For a detailed description of their application and the theory behind them see the Workplace-based assessment module.