Prologue
I studied economics as an undergraduate at Trinity College, Cambridge. Economics is a dull subject, but a 1970s TV show had persuaded me to aim for a career in London's Square Mile. Three years at The Bank of New York and five at Mellon Bank made me realise that TV shows are not the best career guides.
I moved to Western Australia in 1989, where I worked as International Credit Manager for the titanium dioxide producer Tiwest. Credit management is even more boring than banking, but the perks were great: monthly trips over East and frequent sojourns to South East Asia. Nevertheless, I had this nagging feeling that there was more to life than balance sheets and dollar signs.
After a period of head scratching I ended up working as a volunteer for various charities in Calcutta, India. The first was Future Hope, run by Tim Grandage, who is a sort of latter-day Dr Barnardo. The second was the Loreto Day School, Sealdah, where I taught mathematics and English to a small group who had fallen behind in their classes. Finally, I worked for Mother Teresa's order, the Missionaries of Charity, teaching at an informal school for street children.
A psychometric conundrum
In the major cities of India in the 1990s there were many street children, and there were many charities offering them ad hoc assistance. In addition to food, clothing and shelter, some tried to offer educational assistance, often in the form of informal classes on the premises of the charity or some other shelter. A longer-term goal was to get at least some of the children back into regular school. A major obstacle was persuading a school that a child from the street could keep up with a regular class. The conundrum was how to demonstrate to a school the learning potential of a child who could not read or write.
Learning Potential Assessment
When I first stumbled across this term in the psychometric literature, I thought it would provide the answers to all my questions. This did not turn out to be the case, but I found the history of the term quite interesting, especially the work of the Russian psychologist Lev Vygotsky and his Zone of Proximal Development.
From my reading of the literature (references are given in my thesis and published papers), Lev Vygotsky was the first psychologist to develop an interactive assessment technique, as distinct from the passive pencil-and-paper tests developed by Binet, Terman and others, and he did this long before the advent of computers. Unfortunately the procedure was fraught with problems.
At the time of implementation there is the high cost of one-on-one testing, and there are problems with standardization when different people administer the test. Then there are issues of meta-testing: testing the validity of the test itself. First you have to define what constitutes the achievement or realization of learning potential, and over what time scale; then you have to figure out a method of measuring it. Finally, you have to track down the original test candidates after the chosen time period (which presumably would be measured in years) and apply the validity or vindication test to them. To my knowledge, this was never systematically done.
Computer-based assessment
I set out to address some of the practical problems with Vygotsky's procedure by developing a computer-based interactive (arithmetic) test. I felt this would overcome the cost barrier associated with manually administered interactive tests, and I hoped it would also overcome the standardization issue.
I had initially been drawn to Vygotsky's work because he seemed to be addressing a similar problem. I was interested in assessing the learning potential of street children in India, or anywhere else, or frankly of any children disadvantaged for cultural or other reasons by conventional pencil-and-paper tests. Vygotsky was interested in assessing the learning potential of children in what he called Central Asia. Geography is not my top subject, but I believe this would now be the countries of the former Soviet Union with 'stan' in their names. At the time (the 1930s) city people believed children from these regions to be intrinsically inferior to their urban counterparts. Vygotsky believed the problem lay less with the children than with the method of assessment, and his interactive procedure was his way of addressing it.
At the time I was beginning my research (the late 1990s), the literature and even the popular press were full of references to the poor results of Australian Aboriginal children in numeracy and other government tests. I set out to demonstrate that this too was related to the traditional paper-based methods of assessment, and that a computer-based test would do a better job. My doctoral thesis title (An investigation into the psychometric properties of a computer based interactive arithmetic test: Estimates of reliability and validity and a comparison of cultural bias in its results with that found in the results of a written arithmetic test) was so long that the master of ceremonies at the degree ceremony needed two breaths to read it out. But the nub of it was that my computer-based test did indeed exhibit less bias against Aboriginal children from the North West Kimberley region of Western Australia.
Rasch Probabilistic Models
A key component of Vygotsky's interactive procedure was repeated sittings of similar but not identical tests. I used a computer to generate the test items: each of a set of algorithms built items from random numbers, but to a prescribed level of difficulty. It was important that the test items were both sufficiently different that candidates could not simply memorize the answers, and sufficiently similar for the tests to be of an equivalent standard.
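To make the idea concrete, here is a minimal sketch of one such algorithm. It is an illustration only, with invented names and structure rather than my actual thesis code: the algorithm fixes the structure of an item (here, two-digit addition with no carrying), which pins down its difficulty, while the random numbers vary its surface form from sitting to sitting.

```python
import random

def two_digit_addition_no_carry(rng: random.Random) -> tuple[str, int]:
    """Generate a two-digit addition item that never requires carrying.

    The fixed structure (no carry) is what sets the difficulty level;
    the random digits make each generated item superficially different.
    """
    # Choose digits so that neither column sum reaches 10 (no carrying).
    tens_a = rng.randint(1, 8)
    tens_b = rng.randint(1, 9 - tens_a)
    ones_a = rng.randint(0, 9)
    ones_b = rng.randint(0, 9 - ones_a)
    a = 10 * tens_a + ones_a
    b = 10 * tens_b + ones_b
    return f"{a} + {b} = ?", a + b

rng = random.Random(1)  # seeding a sitting makes it reproducible
question, answer = two_digit_addition_no_carry(rng)
print(question, answer)  # e.g. "62 + 15 = ?" 77
```

Two items from such a generator are rarely the same (so answers cannot be memorized), yet every item it produces demands exactly the same skill (so repeated sittings are of an equivalent standard).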
The need to address these twin aims led me to the work of the Danish mathematician Georg Rasch. Rasch worked with the Danish government to develop standardized tests. Much of his work related to reading tests, but his methodology has been generalized to cover almost any topic.
Rasch and Vygotsky worked in similar fields in a similar period of history (the early 20th century), but their approaches could not have been more different. Vygotsky's methodology and theory were essentially qualitative. The arguments and methods of Rasch were purely mathematical.
Rasch essentially took the view that the outcome of any student/item interaction was random, just like the toss of a coin. When you toss a coin, there are two possible sources of bias: the coin itself might be weighted, or you might be tossing it in a way that favours one outcome over the other. Likewise, in any student/item interaction Rasch identified two sources of bias. One lay in the item itself, which he called item difficulty; the other lay in the student, which he called student ability. The essence of his methodology was to separate these parameters, so as to estimate student ability without reference to the particular test items, and item difficulty without reference to the particular students addressing them.
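In modern notation (my summary rather than Rasch's original formulation), the dichotomous Rasch model expresses this on a log-odds scale: the probability of a correct response depends only on the difference between student ability and item difficulty. A minimal sketch:

```python
import math

def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model.

    P(correct) = exp(ability - difficulty) / (1 + exp(ability - difficulty)).
    When ability equals difficulty the student has a 50/50 chance, like
    tossing a fair coin; any gap between the two parameters weights the
    coin one way or the other.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# A student one logit above an item's difficulty succeeds about 73% of the time.
print(rasch_probability(ability=1.0, difficulty=0.0))  # ~0.731
```

The separation property follows from this form: a student's raw score is a sufficient statistic for ability, so ability can be estimated without knowing which particular items were answered correctly, and likewise for item difficulty.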
For my doctoral research, I used the Winsteps computer program to estimate the difficulty of the items generated by each algorithm in my interactive test, and the ability of the students sitting it. Winsteps is designed to be used with dichotomous item tests, but after consultation with the software's authors I converted my scoring rates into conceptual scores from a series of dichotomous item tests.
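The details of that conversion are beyond this prologue, but the kind of encoding I mean looks roughly like the following (this is my reconstruction for illustration, not the documented procedure): a raw score of k correct out of n attempted on one algorithm's items becomes a response string for a conceptual n-item dichotomous test.

```python
def score_to_response_string(correct: int, attempted: int) -> str:
    """Encode a raw score as a conceptual dichotomous response string.

    For example, 7 correct out of 10 attempted becomes '1111111000'.
    Under the Rasch model only the raw score matters, not which items
    were answered correctly, so the ordering of 1s and 0s is arbitrary.
    This encoding is a reconstruction, not the documented procedure.
    """
    return "1" * correct + "0" * (attempted - correct)

print(score_to_response_string(7, 10))  # 1111111000
```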
I have since revisited the chapters of Rasch's book which deal with reading rates, and the current version of my software applies the arguments in those chapters to use the scoring rates directly to estimate item difficulty and student ability.
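The model in those chapters is, as I read it, a multiplicative Poisson model: the number of successes in a fixed time interval is Poisson distributed, with a mean given by the ratio of the student's ability to the task's difficulty. A minimal sketch, with parameter names and scaling of my own choosing rather than Rasch's notation:

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(K = k) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def expected_rate(ability: float, difficulty: float) -> float:
    """Mean count per time interval under a multiplicative
    Rasch-style Poisson model: ability divided by difficulty.
    The parameterization here is illustrative only.
    """
    return ability / difficulty

lam = expected_rate(ability=12.0, difficulty=3.0)   # mean of 4 per interval
print(poisson_pmf(4, lam))  # probability of exactly 4 successes, ~0.195
```

The attraction of working with rates directly is that the observed counts feed the estimation as they stand, without any intermediate conversion to conceptual dichotomous scores.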