
September 26, 2009

Walden University’s College of Education produces teachers who are more effective in improving pupils’ reading fluency. Really?

A glossy advertisement on the back of the latest issue of Educational Researcher (the official journal of the American Educational Research Association, AERA, no less) grabbed my attention. Apparently, and as the headline exclaims: “New study shows that students of Walden teachers make greater gains in reading fluency.”

The claim is based upon research commissioned by Walden University’s Richard W. Riley College of Education and Leadership that compared the effectiveness of teachers who had graduated with its master's degree with that of teachers who had graduated with master's degrees elsewhere. As the glossy advert went on to explain:

“In a unique collaboration with Tacoma Public Schools in Tacoma, Washington, researchers compared the reading fluency of students taught by Walden Master’s-educated teachers with students taught by non-Walden Master’s-educated teachers. The study revealed that students of teachers who graduated from Walden’s Elementary Reading and Literacy programme had gains in reading fluency that were on average 4.8 words per minutes, or 14%, greater than students of non-Walden Master’s-educated teachers.”

This is a huge claim. It is not surprising that Walden's College of Education chose to buy a glossy advert on the back of the prestigious AERA magazine to publicise it. What College wouldn’t want to let the world know that its master's degree is proven to be more effective than others? Students will clearly want to graduate from Walden given that a Walden degree is evidence that you are a more effective teacher. The advert encourages readers to visit their website at http://www.WaldenU.edu/tacoma for more information on the research. Fortunately, the full report of the research is available to download from the website and can also be downloaded directly from here: http://bit.ly/2IZDLm

So, are the claims in the advertisement true? Well, the research that lies behind these findings is based on a relatively small sample (the main element of which compares the reading scores of children taught by just 35 graduates from Walden with those taught by 35 graduates of other programmes). However, the findings are statistically significant so we can be sufficiently confident that the differences between the two groups are unlikely to have occurred by chance. Moreover, the researchers use appropriate statistical techniques – hierarchical linear modelling – for analysing the data they have (nearly 4,000 pupils clustered in 70 classes).

Interestingly, the researchers are a little more cautious in their own interpretation of the findings. As they explain in the executive summary: “Limitations on the research design do not allow for a claim of causation between the completion of the Walden degree and teaching effectiveness. However, [the findings] ... provide suggestive evidence that the program may indeed improve the effectiveness of elementary literacy instruction” (p. 3).

Of course, everything rests on these ‘limitations’ which, not surprisingly, fail to get a mention in the glossy advert and which the researchers do not seem to consider serious enough to stop them claiming that they have “suggestive evidence” that the Walden programme “is making teachers more effective at reading and language arts instruction” (p. 21). Well, here are the main limitations, taken directly from the research report (pp. 22-23):
 
  1. “While we were able to use matching to control for differences in teacher experience between the Walden and the control group samples, we did not have information on teachers’ credentials, prior education (i.e., bachelor’s degree institution and major field of study), or professional development/training experiences. It is plausible that any differences in student reading gains are not due to Walden’s M.S. in Education program, but due to systematic differences in these other factors between Walden teachers and the comparison group teachers.”
  2. The inference from the estimated effect is the difference in earning a Walden M.S. in Education degree with a specialization in Elementary Reading and Literacy relative to earning any other type of master’s degree (as represented in the control group). It is plausible that teachers who seek out specialized degrees in elementary literacy instruction are more likely to be successful at reading instruction than those who seek out degrees in other areas. In fact, they may pursue the degree because they have higher self-efficacy as it relates to literacy instruction. Consequently, the estimated effect of the Walden program may stem from this self-selection and the unobserved differences in reading instruction effectiveness between those who sought out the ERL program and those who did not.
  3. The samples were too small to control for “school effects” (i.e., the effects on student achievement that are common to all students within a given school). Therefore, it is possible that the difference in performance between Walden teachers and non-Walden teachers is due to the programs and policies used in the schools where they teach rather than to their own classroom instruction.
  4. “While we were able to control for some student demographic characteristics, there were a number of unobserved factors that might also explain these differences, for example students’ socioeconomic status or home circumstances.”

Three of the four limitations (1, 3 and 4) are significant, but they are to be expected from a research design such as this, where it is simply not possible for students to be randomly assigned to the main and control groups. As the researchers quite rightly point out, the positive gains found among the pupils taught by Walden graduates could be due to a range of unidentified systematic differences between these graduates and their comparators. This is why the researchers quite rightly state that it is not possible to make “a claim of causation between the completion of the Walden degree and teaching effectiveness.” It is also why they present their research as “suggestive evidence”.

All of the above is quite reasonable and to be expected with a pragmatic evaluation of this type. However, it is the second limitation that is much more problematic and represents a fundamental flaw in the research design. Interestingly, it is hidden away in the body of the report and not mentioned at all in either the Executive Summary or the main Conclusions. Not surprisingly, it doesn’t feature at all in the glossy advertisement.

And yet, this second limitation completely undermines the validity of the claims being made. In essence, we’re not comparing “like with like” at all. Rather, we’re comparing teachers who have taken a master’s degree with a specialisation in elementary reading and literacy with teachers who have simply taken generic master’s degrees. There is thus no way of knowing whether the additional gains made in reading fluency among the pupils taught by the Walden graduates (which are actually fairly small, by the way, and not consistent across year groups) were due to the effectiveness of the Walden programme itself (i.e. compared to other specialist elementary reading and literacy master’s programmes) or simply to the Walden graduates having had more specialist training in elementary reading and literacy.

This is a crucial point. Remember that the headline in the glossy advert claimed that: “New study shows that students of Walden teachers make greater gains in reading fluency.” This is clearly misleading as it encourages the reader to believe that there is evidence that the Walden programme is more effective than other comparable specialist programmes. As it is, the study provides no evidence at all that Walden teachers are any more effective in producing gains in reading fluency than teachers with equivalent specialist qualifications from any other College.

     

September 19, 2009

Why does the UK Government, with £6m at its disposal, also find it so difficult to do a simple evaluation?

This week, the Home Office published the findings of the first phase of its £6 million evaluation of Blueprint, a multi-component school-based drug education programme targeted at secondary school children in Years 7 and 8. The reports are available at: http://bit.ly/22SLI

With such resources at its disposal one would expect a rigorous evaluation with some clear evidence of whether the programme is effective or not (initially in relation to children’s levels of drug awareness and, in the longer-term, their attitudes and behaviour). After all, undertaking an evaluation isn’t rocket science. You invite a number of schools to take part, you randomly split them into two groups – one that will deliver the programme and one that will act as a control/comparison group – and then you just collect some data from all the children before the programme starts and then again at the end. If the children in the programme schools have shown progress (in terms of awareness, attitudes and/or behaviour) above and beyond those in the control group then you have strong evidence that the programme has been effective.
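The random split at the heart of this design takes only a few lines to carry out. What follows is a minimal sketch rather than the procedure used in any particular trial: the function name `allocate_schools`, the placeholder school names and the seed value are all assumptions made purely for illustration.

```python
import random

def allocate_schools(schools, seed):
    """Randomly split schools into a programme group and a control group."""
    rng = random.Random(seed)   # a recorded seed lets the allocation be audited later
    shuffled = list(schools)    # copy, so the original list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical placeholder names; a real trial would use the recruited schools.
schools = [f"School {i:02d}" for i in range(1, 51)]
programme, control = allocate_schools(schools, seed=2009)
print(len(programme), "programme schools,", len(control), "control schools")
```

The point of letting the random number generator decide is that nobody, however well-intentioned, gets to influence which schools end up in which group.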

Unfortunately, the research team responsible for the evaluation of the Blueprint programme failed to follow even this simple design. They were advised to use 50 schools in order to generate sufficient data to detect any effects that might be associated with the programme. However, they felt that the use of such a sample size was “a very large step for an improvement in the limited UK evidence based” (p. 32) and thus, presumably, a step too far. This is just nonsense. Only this summer we (the Centre for Effective Education) published the results of a randomised controlled trial of a pupil mentoring scheme involving 50 schools and over 800 children (the full report is available from our website at: http://www.qub.ac.uk/cee). Moreover, we’re just writing up another trial involving 80 preschool settings and 1,500 3-4 year old children and their parents.

Instead, the research team referred to guidance from the Medical Research Council that, in the evaluation of complex interventions, a “cumulative approach” is required “to understanding how outcomes are achieved, moving from theory, to modelling, to an exploratory trial to a definitive trial” (p. 32). This is indeed an eminently sensible and pragmatic approach to take and one we have also adopted ourselves. Most recently we have just completed an “efficacy test” of an early childhood programme in 10 preschool playgroups (5 delivering the pilot programme and 5 acting as a control group).

However, and curiously, the “exploratory trial” the research team chose to conduct for the Blueprint programme involved 30 schools: clearly too large for a proper exploratory trial and insufficient for a full-blown study. Unfortunately, the problems don’t stop here. Inexplicably, the research team decided to select only six of the 30 schools to act as a comparison (control) group, and then decided not to select them randomly but to hand-pick them. As it turned out, the characteristics of these six comparison schools proved to be significantly different to those of the remaining 23 schools (one dropped out) delivering the programme and so they cannot now be used for any meaningful comparisons at all.

The catalogue of errors involved in this trial is well outlined by Ben Goldacre in the latest entry in his commendable “Bad Science” column in The Guardian, see: http://bit.ly/ECcq5. It is just astounding that the Home Office could have ended up with such a half-baked evaluation, especially given the amount of funding they set aside for this and the clear advice they were given as well as the expertise at their disposal (see Goldacre’s column for more details).

I have previously asked the question “why some educational researchers find it so difficult to do a simple evaluation” (see: http://bit.ly/6tfJ). Then, I used an example of a small evaluation conducted by a couple of educational researchers that was reported at the BERA Conference. That was bad enough, reflecting, as I argued, a more general lack of competence among sections of the British educational research community in conducting simple evaluations of the effectiveness of educational programmes and interventions. However, this present example is simply in a different league. What hope can we have for the future when even the New Labour government – the self-styled proponents of evidence-based policy – can’t undertake a simple evaluation for themselves?

 

September 07, 2009

Why do some educational researchers find it so difficult to do a simple evaluation?

Here's an example of an evaluation of an educational programme taken from a paper presented last week at the British Educational Research Association Annual Conference at a session I attended. To maintain anonymity I will keep the description of the study fairly vague. The point is not to be critical of the specific authors of the paper, for they are far from the only ones to adopt this type of approach, but to raise a more general point about the nature of educational research.

The paper described what was actually a very interesting educational initiative that attempted to motivate children through the use of a particular strategy. The presenters clearly knew their subject area and provided a convincing case theoretically for why the use of that strategy may help to motivate children. They also described a pilot scheme where this approach was trialled for a short period of time. However, the evaluation that was undertaken of the effectiveness of the strategy, and that the presenters then went on to report, was unfortunately probably one of the worst examples of an evaluation I have seen.

Part of the evaluation involved the teachers rating the children's levels of motivation into four categories (‘very motivated’, ‘engaged’, ‘somewhat engaged’ and ‘negative’) for the 63 who participated in the pilot scheme. The results were presented in a table, reproduced below exactly as it appeared in the paper, with the children being broken down by their entering grade (year group):


Entering | Very      | Engaged | Somewhat | Negative
Grade    | Motivated |         | Engaged  |
-----------------------------------------------------
2        |     5     |    7    |    3     |    3
3 or 4   |     8     |   13    |    4     |    1
5 or 6   |     2     |    6    |    4     |    1
7+       |           |    1    |    1     |    4
-----------------------------------------------------
Total    |    15     |   27    |   12     |    9
 

The presenters interpreted these data as follows: “The teachers’ descriptions indicated that 15 of them were very motivated by [using the strategy], 27 were somewhat motivated, 12 were not very engaged, and 9 found it to be a negative experience. In general, in this population, students aged 8-11 years-old [i.e. those in entry grades 3 or 4] were more likely to be motivated by [the strategy] than younger or older students.”

Now, there are three main problems with this interpretation of the data that should be apparent to anyone who has done even an elementary course in educational research methods:

  1. There are no pre-test scores. How, therefore, can we tell whether the children’s levels of motivation have actually changed at all during the course of the pilot scheme?
     
  2. There’s no comparison or control group. Even if we had pre-test scores and we could see that the children’s motivations had increased over the course of the pilot scheme, how do we know that this improvement was down to them participating in the pilot scheme and not due to something else?
     
  3. As regards the claim that the use of the strategy was more effective for the middle band of children (i.e. those with entering grades 3-4), how do we know that the differences between the differing bands of children were due to the programme rather than just down to random variation?

As it happens, the query raised in the last point can be answered very quickly with the use of a simple statistical test (a Fisher’s exact test in this instance). In this case, and by conflating the oldest two bands so that we are comparing the ‘3-4’ group with their younger counterparts (‘2’) and older counterparts (‘5+’), such a test gives us a significance level of p=0.275. What this tells us, in essence, is that even if there were actually no underlying age differences, there would still be a fair chance (a 27.5% chance to be precise) of seeing differences at least as marked as those in this present sample simply through random variation. With odds like this, how can we have any confidence in these claims?
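For readers who would like to check a figure like this for themselves, here is a rough sketch of an exact test on the table above. It makes some assumptions that should be stated plainly: that the test was run on the 3 x 4 table of age band by teacher rating (with the ‘5 or 6’ and ‘7+’ rows added together), and that a brute-force enumeration in plain Python is an acceptable stand-in for whatever statistics package one would normally use. The function names are illustrative only.

```python
from math import exp, lgamma

def compositions(total, caps):
    """Yield each way to split `total` into counts no larger than `caps`."""
    if len(caps) == 1:
        if total <= caps[0]:
            yield (total,)
        return
    for first in range(min(total, caps[0]) + 1):
        for rest in compositions(total - first, caps[1:]):
            yield (first,) + rest

def table_log_weights(row_totals, col_totals, log_fact):
    """Yield log(1 / product of cell factorials) for every table with these margins."""
    if len(row_totals) == 1:
        # The last row is forced by whatever is left of the column totals.
        yield -sum(log_fact[c] for c in col_totals)
        return
    for row in compositions(row_totals[0], col_totals):
        row_lw = -sum(log_fact[c] for c in row)
        remaining = [c - r for c, r in zip(col_totals, row)]
        for rest_lw in table_log_weights(row_totals[1:], remaining, log_fact):
            yield row_lw + rest_lw

def fisher_exact_rxc(observed):
    """Exact p-value: total probability of tables no more likely than the observed one."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    log_fact = [lgamma(k + 1) for k in range(sum(row_totals) + 1)]
    obs_lw = -sum(log_fact[c] for row in observed for c in row)
    total = as_extreme = 0.0
    for lw in table_log_weights(row_totals, col_totals, log_fact):
        weight = exp(lw - obs_lw)   # probability relative to the observed table
        total += weight
        if lw <= obs_lw + 1e-9:     # small tolerance for floating-point ties
            as_extreme += weight
    return as_extreme / total

# Columns: very motivated, engaged, somewhat engaged, negative.
ratings_by_band = [
    [5, 7, 3, 3],    # entering grade 2
    [8, 13, 4, 1],   # entering grades 3 or 4
    [2, 7, 5, 5],    # entering grades 5 or 6 and 7+ combined (assumed conflation)
]
p_value = fisher_exact_rxc(ratings_by_band)
print(f"p = {p_value:.3f}")
```

The enumeration visits every table with the same row and column totals, so it may take a few seconds to run. If the assumption about how the rows were combined is right, the result should be comparable to the significance level quoted above.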

The presenters attempted to justify their approach by arguing that it is difficult to isolate the effects of the strategy used and that it was not possible to organize and conduct a randomized controlled trial. However, such arguments are difficult to defend. In fact the present pilot scheme, which ran for just a few weeks, was ideally placed to have been evaluated using a small, pragmatic trial. For example, the children taking part could have been randomly organized into two groups, with one group participating in the scheme initially and the other group acting as a control but possibly getting to participate in the scheme at a later stage (i.e. being a ‘delayed control group’). This way, nobody loses out in the long run. Then, with the children organized into two groups, they just needed to have their motivations tested at the beginning of the pilot scheme and then again at the end. Et voilà: a pragmatic randomized trial that would provide strong evidence of whether this pilot scheme was being effective in increasing the motivation of the children taking part.
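The comparison that such a design makes possible is equally simple. The sketch below shows the basic calculation only, the average gain in the programme group minus the average gain in the delayed control group. The function names are hypothetical and the scores are invented purely to show the arithmetic; they are not data from the study described here, and a real analysis would go on to test whether the difference could have arisen by chance.

```python
from statistics import mean

def mean_gain(pre, post):
    """Average change in score from the pre-test to the post-test."""
    return mean(after - before for before, after in zip(pre, post))

def programme_effect(programme_pre, programme_post, control_pre, control_post):
    """Gain in the programme group over and above the gain in the control group."""
    return mean_gain(programme_pre, programme_post) - mean_gain(control_pre, control_post)

# Invented motivation scores for illustration only (one entry per child).
programme_pre, programme_post = [2, 3, 2, 1], [3, 4, 3, 3]
control_pre, control_post = [2, 2, 3, 1], [2, 3, 3, 2]

effect = programme_effect(programme_pre, programme_post, control_pre, control_post)
print("Difference in mean gains:", effect)
```

Because both groups are measured before and after, any change that would have happened anyway shows up in the control group too and is subtracted out.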

So if randomized trials are so simple to organize and run then why do researchers still opt, with depressing frequency, for flawed evaluative designs like this? I have offered some possible answers to this question in my editorial for the first issue of the new journal Effective Education which can be accessed free online at: http://www.informaworld.com/effectiveeducation. Whatever the reason, it is surely a telling indictment that studies like the one described here are still being produced when so much commitment has been expressed, and efforts made, to building research capacity in education. Teaching the basics of evaluative research designs should be a core element of all undergraduate and postgraduate research training. After all, doesn’t the question of whether an educational programme is effective or not represent one of the basic and fundamental questions that educational research should be seeking to answer? The fact that educational researchers are routinely failing to receive basic training in simple evaluative techniques is therefore indefensible.

CEE team win national prize for poster presentation

A research team from the Centre for Effective Education has won the prize for ‘best poster’ at the British Educational Research Association Annual Conference. The prize, sponsored by the CfBT Education Trust, was awarded to Professor Paul Connolly, Dr Emma Larkin and Dr Susan Kehoe for their poster reporting the findings of the evaluation they have recently completed of the effects of the children’s television series, Sesame Tree, on young children’s attitudes and awareness in Northern Ireland.

The BERA Conference is the largest annual gathering of educational researchers within the UK and this year attracted over 800 delegates at its meeting at the University of Manchester between 2-5 September. The prize was awarded during a packed plenary session and the poster was particularly commended for “excelling at communicating the findings of a complex research study in a clear and highly accessible way for policy makers and practitioners.”

Speaking of the prize, Professor Connolly said: “we were delighted to have received this prestigious award. Much of the credit for the poster is due to Emma and Susan who spent a lot of time planning very carefully how to present the findings.”

He went on to add: “This prize means a lot to us at the Centre for Effective Education where we pride ourselves on undertaking strong and scientifically-robust research but where we are also committed to ensuring that the findings are reported in an accessible and relevant way so that they contribute to policy and practice.”

The poster reported on two linked studies that were conducted during 2008 into the effects of Sesame Tree – the Northern Ireland version of the popular US-based Sesame Street – on the attitudes and awareness of 5-6 year olds. The first study comprised a cluster randomized controlled trial involving 20 primary schools and 440 children whereas the second study comprised a naturalistic longitudinal survey of a separate sample of 697 children from 37 primary schools selected randomly from across Northern Ireland.

The prize-winning poster will be on display shortly in the reception area of the School of Education (69-71 University Street). To download a copy of the handout associated with the poster please follow this link: http://www.paulconnolly.net/publications/pdf_files/SesameBeraPoster.pdf

September 01, 2009

CEE researchers to present five papers at BERA

Researchers from the Centre for Effective Education are due to present five papers at the British Educational Research Association Annual Conference to be held on 2-5 September at the University of Manchester. BERA is the largest gathering of educational researchers within the UK, attracting up to 1,000 delegates. The papers to be presented report the findings of four different studies that the Centre has been running over the last year:

  • “The effects of the children’s television series Sesame Tree on young children’s social attitudes and cultural awareness” Paul Connolly, Emma Larkin and Susan Kehoe (12.30-2.30pm Thursday and 2.00-3.00pm Friday 4 September, Poster Presentation, University Place Theatre Foyer – Level 1)
  • “A qualitative evaluation of a mentoring reading programme for 9-10 year olds in Northern Ireland” Oscar Odena, Sarah Miller and Susan Kehoe (4.30-6.00pm, Thursday 3 September, Session 4.17, room: University Place 3.205)
  • “Educational attainment, well being and economic disadvantage: a survey of primary school pupils in Northern Ireland” Sarah Miller, Laura Lundy and Lisa Maguire (9.00-10.30am, Friday 4 September, Session 5.19, Room: University Place 3.212)
  • “A need to belong: an epidemiological study of the experiences and needs of minority ethnic children in Northern Ireland” Liam O’hare, Andy Biggart and Paul Connolly (3.00-4.30pm, Friday 4 September, Session 6.19, Room: Roscoe 3.4)
  • “The place of randomised controlled trials in educational research: a case study” Paul Connolly and Sarah Miller (9.00-10.30am, Saturday 5 September, Session 8.05, Room: Roscoe 3.2)

For more information on the BERA Conference, and to view the full programme, please visit: http://www.beraconference.co.uk/scipro.html. For more information on any of the papers listed above, please contact the lead author. Their contact details can be found on the Centre website at: http://www.qub.ac.uk/cee

