Assessment Issues: marking to the full range

This part of our assessment resource is intended to provide some background to discussions about assessing across the full range of available marks. It is divided into four sections:

1. Issues around marking to the full range

… in some subjects marks within 10 per cent of the maximum obtainable are not uncommon, while in other subjects such marks are never given, and yet in others, practice will be changing, sometimes slowly, sometimes rapidly. There is an increasing trend for external examiners to encourage the greater use of more of the whole range of marks available to examiners, and thus to change marking practices. This trend may be supported at an institutional level by the development of generic marking criteria.

QAA briefing paper Quality matters, The classification of degree awards, April 2007

The topic of marking to the full range has been equated with ‘dumbing down’ and ‘grade inflation’ in the media and has consequently received a very bad press recently. See, for instance, Geoffrey Alderman writing in the Times in June 2008, where he claims that “students who would once have been awarded respectable lower seconds are now awarded upper seconds and even firsts”. There have been some clumsy attempts to improve the number of ‘good’ degrees awarded in certain institutions, which have been seized upon as examples of poor practice, for example in this article by Phil Baty from December 2006.

Actually, the debate is, and should be, about how we determine student performance in a reliable way. We have criterion-referenced assessment in UK Higher Education. If we had norm-referenced assessment, we would only ever award a certain proportion of each degree classification. We don’t do this because it doesn’t fairly represent student achievement. With criterion referencing, everyone who achieves a particular standard receives the same classification. It is therefore theoretically possible for all students to fail, or for all to achieve first class degrees, if they achieve the marks associated with that classification. But what should those marks represent? Anyone who works with staff on assessment issues will have heard at least one person say “first class honours should be publishable work” or “I can’t mark above 75%, nobody does in my subject area”. An analysis by Yorke, Bridges and Woolf (2000) showed that there were very large variations between subject areas in the extent to which the full range of marks was used. Mathematical subjects tended to use the full range of marks available, while English stuck mainly to a range of 50%-70%, and Business Studies was somewhere in between.
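The difference between the two approaches can be made concrete with a small sketch. This is an illustration only: the class boundaries (70/60/50/40) are the conventional UK ones and are assumed here rather than taken from the text, and the norm-referencing quotas are entirely hypothetical.

```python
# Assumed class boundaries (conventional UK values, not stated in this resource).
CRITERION_BOUNDARIES = [
    (70, "First"),
    (60, "Upper second"),
    (50, "Lower second"),
    (40, "Third"),
]

# Hypothetical quotas for a norm-referenced scheme, as whole percentages
# of the cohort, listed from the best class downwards.
NORM_QUOTAS = [
    ("First", 10),
    ("Upper second", 40),
    ("Lower second", 40),
    ("Third", 10),
]


def classify_by_criterion(marks):
    """Give each mark the class whose boundary it reaches, whatever the cohort did."""
    results = []
    for mark in marks:
        label = "Fail"
        for threshold, name in CRITERION_BOUNDARIES:
            if mark >= threshold:
                label = name
                break
        results.append(label)
    return results


def classify_by_norm(marks):
    """Rank the cohort and hand out classes in fixed proportions, best first."""
    ranked = sorted(range(len(marks)), key=lambda i: marks[i], reverse=True)
    results = [None] * len(marks)
    position = 0
    cumulative = 0
    for name, percent in NORM_QUOTAS:
        cumulative += percent
        cutoff = cumulative * len(marks) // 100
        while position < cutoff:
            results[ranked[position]] = name
            position += 1
    return results


# A hypothetical strong cohort: every script meets the assumed First boundary.
cohort = [72, 73, 74, 75, 76, 77, 78, 79, 80, 81]
print(classify_by_criterion(cohort))
print(classify_by_norm(cohort))
```

For this cohort the criterion-referenced function awards the same class to every student, while the norm-referenced one spreads the same marks across all four classes, which is the unfairness the paragraph above describes.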

Another study by Chapman (1997) showed that Mathematics departments themselves varied very greatly in the proportion of ‘good’ (First and upper Second) degrees they awarded – some departments gave good degrees to about 60% of their students, while others were around the 25% mark (many other subjects are covered in the study).

These figures on their own tell us nothing useful about individual marking practices, because we don’t know anything about the abilities or previous knowledge of the students or indeed which universities gave out which percentage of good degrees. They just tell us that there is considerable variation in the ways that different institutions and different subject areas approach assessment - and also significant variation in the method for calculation of classification, which can have as great an impact (see Yorke, M, Barnett, G, Bridges, P, Evanson, P, Haines, C, Jenkins, D, Knight, P, Scurry, D, Stowell, M and Woolf, H (2002)). This isn’t necessarily a bad thing: what is important is that the people actually doing the marking and allocating grades share a clear idea of what is needed for a student to achieve a particular grade and that they are able to apply these ideas consistently and reliably. Individual markers may vary as much as teams from different disciplines. Left to our own devices, some of us are ‘hawkish’ – looking to be severe – while others are ‘doveish’ – looking to be kind. Some of us are ‘restrained’ – using a small range of marks – while others are ‘theatrical’ – using a large range. (These terms were probably first coined by Wakeford and Roberts (1984).) Although the shift towards assessment based on the achievement of learning outcomes (linked with set criteria to establish the extent of that achievement across the full range) is sometimes criticised for being too mechanical, or too focused on the quantifiable, it has tried to encourage a more systematic, objective and shared approach to the marking process.
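One rough way to see whether a marker leans ‘hawkish’ or ‘doveish’, ‘restrained’ or ‘theatrical’, is to compare the average and the spread of the marks they give to the same set of scripts. The sketch below is only an illustration: the function name `marker_profile` and the marks are hypothetical, and it is not the method used in any of the studies cited above.

```python
from statistics import mean, pstdev


def marker_profile(marks):
    """Summarise one marker's marks for a shared set of scripts."""
    return {
        "mean": mean(marks),      # lower than colleagues suggests 'hawkish', higher 'doveish'
        "spread": pstdev(marks),  # small suggests 'restrained', large 'theatrical'
        "lowest": min(marks),
        "highest": max(marks),
    }


# Hypothetical marks from two markers for the same five scripts.
restrained_marker = [55, 58, 60, 62, 65]
theatrical_marker = [35, 50, 60, 72, 88]

print(marker_profile(restrained_marker))
print(marker_profile(theatrical_marker))
```

A comparison like this says nothing about which marker is right; it only gives a team something concrete to discuss when working towards a shared view of the criteria.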

The next pages look at sharing a view and moving towards more reliable marking.
