Article – The great delusions

M. Scott Peck opens ‘The Road Less Traveled’ with: “Life is difficult”. What is most surprising is that, for many people, this is a revelation! Go to any business networking event, or meet a potential client, especially during the current economic situation, and they will be moaning incessantly about the enormity of their problems, burdens or difficulties, as if life should be easy.

Perhaps you are struggling on your journey to achieving your ‘success’, and you may be suffering the consequences of one or more of the nine common delusions about achieving success. How much you believe your ‘success’ is down to what you do (cause), and how much to external forces over which you have little or no control (effect), determines where you might be:

It’s impossible!

Particularly for those just embarking on their journey, ‘success’ is a place far away. We may have wonderful dreams about it and a delightfully crafted goal. But as the days, weeks and months go by and ‘success’ doesn’t appear to be any closer, many people throw in the towel. More budding entrepreneurs than I can recall have given up – life without a salary is just too tough.

When we’ve given up because ‘success’ is impossible, we’ll then criticize it. Anyone who achieves success whom we deem less worthy is the subject of our scorn and contempt – “they don’t deserve it!”.

It’s a mystery to me…

If we survive the ‘impossible’ stage, we see others achieving while success continues to elude us, so we search for the secret.

We need to find the magic formula, the silver bullet or the golden key.

We return to that bookshop to find ‘the’ book that will change our lives. So many promise that you can achieve success in business, life, management, health or diet, and they are snapped up.

Business people are constantly looking for quick fixes to problems:

  • To sell more, we need the sales messages and techniques that instantly convert a cold call into a lifelong customer.

  • To produce more, we need the unique leadership skills that magically and massively increase performance.

  • To maintain shareholder value we need to increase profitability by increasing sales and reducing costs simultaneously. Either that or we cook the books to make it look as though we did.

Lady luck?

OK, so there’s no absolute secret to success. Sure we can learn from others, but they didn’t really do it instantly, it took time. But essentially, they were in the right place at the right time. No more than luck.

So if success is down to luck – all I can do is hope for it. One day my ship will come in. Next year, when the current economic crisis is over. The dice will fall my way.

May as well buy lottery tickets.

If you’ve waited for ‘lady luck’ long enough and are still on the journey, by now you may believe that luck only comes to those who create it for themselves.

All I need is a break!

If only…

Everyone has a story about someone they know who got their break. The telephone sales guy who was spotted in a mall by a movie producer and became an instant star. The busker in the subway ‘found’ by the record label. The crazy inventor who made gold from apple seeds.

But if all you do is wait for it, when your opportunity comes your way, you won’t be ready for it.

So you’ve not had fortune turn up on your doorstep. The 43 steps to instant success didn’t quite work out as expected. That anticipated call from the client you’ve not met didn’t come. Your website is getting plenty of ‘hits’ but turning those into business isn’t quite happening.

What I need is leverage.

We look for an angle to exploit or for leverage over someone else.

They’re successful. They do the same thing as me. Surely I can hang onto their coat tails and ride along until I’m on my feet, then I can set up on my own again, take the best customers with me and …

All I need to do is work harder!

OK, so you’re in charge of the situation now. It’s not about luck or any special formula. It’s all about hard work.

The best thing about working hard and producing results is that it feels rewarding.

Talk to anyone who has achieved success in their business, and I’ll bet they worked hard for it. They just kept going. Putting everything on the line and never giving up.

So that’s the real secret? Well, yes and no. Those people you know who are really successful in their business or career. How’s the rest of their life? Is there a chance that they are neglecting important relationships? I know of no-one on their death bed saying “I wish I’d spent more time at the office.”

Hard work itself doesn’t bring success – you may be in a dead-end job, or your fabulous new product will remain unwanted forever.

So I haven’t attended the right event yet…

Most people take the middle road towards their success: a route that depends much on self-effort, yet recognizes that the outside world has a role in their success too.

A huge number of people believe that success is an event, so they schedule for it. They attend the seminar by one of those fabulous speakers and just know that after this, they will have both the secrets of success and have made connections with like-minded people who will help each other achieve success.

The most common form of event in companies is the ‘training event’. Apparently, the two-day workshop on strategic business leadership is going to equip you with all the knowledge, experience and determination to make your business the incredible success it deserves to be.

That ‘rah-rah’ motivational event might just be the tipping point of a decision to move on, but success is a process not an event.

I just need better connections…

This is the massively growing space for business people.

We’ve all heard the phrase, “it’s not what you know, it’s who you know”. So we network for success. No longer is this the restricted domain of the ‘old school tie’, the golf club or the Masons. Networking is accessible to all – and the world becomes your oyster.

New technologies allow us to easily expand our network beyond any previous borders. I can network with people across the globe and in my local chapter – over breakfast, lunch, coffee, in a virtual world, in a chat room, a forum. And surely, if I connect with enough people, I’ll get to meet the ‘who you know’ that is going to make that difference.

The right relationships certainly help in achieving your ‘success’ but connections alone neither improve life nor guarantee ‘success’.

Remember Billy Carter? No?

No-one can network himself to success unless he has something to offer in the first place.

So I just need to be recognized…

As we network with more and more people to increase our visibility, we want to be recognized by more and more people for our talents, our specialness, our difference. So we strive for success by being recognized.

For the great business people, it might be the cover of Time magazine. For the scientist or academic, maybe the Nobel prize. For the writer, the Pulitzer. For the movie star, an Oscar. For the musician, a Grammy.

Most people would settle for a lot less. Walking into a room full of people and being called by name to come over and ‘let me introduce you to…’ A client who recommends you to a friend. A collaborator who endorses you. A boss who thanks you.

So what’s the answer? Keep on keeping on.

Copyright John Kenworthy ©2009 All Rights Reserved

Evaluating Management Development

There are several issues involved in evaluating management development, more so when the evaluation also considers the method of management development that this paper explores. Through a review of the literature, this paper will address the high level issues and establish the basis for a suitable model of evaluation that will measure the effectiveness of the method(s) employed in a managerial development intervention or event.

The paper commences by reviewing why we should evaluate; what should be evaluated; and lastly how to evaluate. The paper then considers suitable models and methods of evaluation to measure the effectiveness of a management development intervention and the method employed in undertaking the intervention with particular reference to experiential learning and the use of computer-based simulations during a training intervention.


A number of authors consider the need and reasons for evaluation, though all tend to fall into four broad categories identified by Mark Easterby-Smith. He notes four general purposes of evaluation (p. 14):

  1. Proving: the worth and impact of training. Designed to demonstrate conclusively that something has happened as a result of training or developmental activities.

  2. Improving: A formative purpose to explicitly discover what improvements to a training programme are needed.

  3. Learning: Where the evaluation itself is or becomes an integral part of the learning of a training programme.

  4. Controlling: Quality aspects in the broadest sense, both in terms of quality of content and delivery to established standards.

More recent literature uses different terminology, which can be placed into these four broad categories.

Russell does not include a Learning purpose explicitly but suggests that the evaluation of performance interventions produces the following benefits:

  • Determines if performance gaps still exist or have changed (Proving)

  • Determines whether the performance intervention achieved its goals (Controlling)

  • Assesses the value of the performance intervention against organizational results (Proving)

  • Identifies future changes or improvements to the intervention (Improving)

According to Russell, whichever evaluation model is chosen, “the evaluation should focus on how effective the performance intervention was at meeting the organization’s goals”.

Russ-Eft and Preskill review evaluation across the entire organisation, not just in the training or development arena, and cite six distinct reasons to evaluate:


  • Ensures quality (Controlling)

  • Contributes to increased organisation members’ knowledge (Learning)

  • Helps prioritise resources (Improving)

  • Helps plan and deliver organisational initiatives (Improving)

  • Helps organisation members be accountable (Controlling)

  • Findings can help convince others of the need or effectiveness of various organisational initiatives (Proving)

They add a seventh reason: gaining experience in evaluation, as it is ‘fast becoming a marketable skill’.

Increasingly in the business and Human Resources media and the professional bodies representing training and development professionals (for example the American Society for Training and Development, ASTD), there is a call for ‘better’ evaluation of training and development interventions. The basic principle driving this is for training to demonstrate its worth to organisations, whether that be its attributable Return on Investment (ROI) or its value in improving the performance of individuals (such as productivity gains or reduced accidents) or of the organisation (such as more efficient use of resources or demonstrable improvement in quality). Essentially, training and development cost time and money and need to be shown to be worthwhile.


Exactly what is to be measured as part of the evaluation is an especially problematic area. Aspects of behaviour or reaction that are relatively easy to measure are usually trivial. Measuring changes in behaviour, for example, may require the observations to be reduced to simpler and more trivial characteristics. Alternatively, these characteristics may be assessed by the individual’s subordinates – however, the purist would no doubt claim that such holistic judgements are of dubious validity. The problem is that the general requirement for quantitative measurement tends to produce a trivialization in the focus of the evaluation.

According to Bedingham, ultimately “the only criteria that make sense are those which are related to on-the-job behaviour change”. Bedingham advocates the use of research-based 360° questionnaires that objectively measure competency sets and skills applicable to most organisations, functions or disciplines. The feedback is taken immediately prior to the event and the results are made available to trainees during their training, allowing individuals to see easily how they actually do something and the relevance of the training. They can then start transferring the learned skills immediately on return to the workplace.

Managerial effectiveness and competency

David McClelland is often cited as the father of the modern competency movement. In 1973, he challenged the then orthodoxy of academic aptitude and knowledge content tests as being unable to predict performance or success in life and as being biased against minorities and women. Identified through patterns of behaviour, competencies are characteristics of people that differentiate performance in a specific job or role. Competencies distinguish well between roles at the same level in different functions and also between roles at different levels (even in the same function), often by the number of competencies required to define the role. A competency model for a middle manager is usually defined within ten to twelve competencies, of which two are relatively unique to a given role. Kelner suggests that competency models for senior executives require fifteen to eighteen competencies, up to half of which are unique to the model.

Kelner cites a 1996 unpublished paper by the late David McClelland, a meta-analysis of executives assessed on competencies, in which McClelland found that only eight competencies could consistently predict performance in any executive with 80 percent accuracy.

The first scientifically valid set of scaled competencies (competencies that have sets of behaviours ordered into levels of sophistication or complexity) was developed by Hay/McBer. The competencies found to be the most critical for effective managers include:

  • Achievement Orientation

  • Developing Others

  • Directiveness

  • Impact and Influence

  • Interpersonal Understanding

  • Organisational Awareness

  • Team Leadership

This set of characteristics, or individual competencies, that a manager brings to the job needs to match the job well; otherwise additional effort may be necessary to carry out the job, or the manager may not be able to use certain managerial styles effectively. These are in turn affected by the organisational climate and the actual requirements of the job.

Managerial effectiveness is the combination of these four critical factors: Organisational Climate, Managerial Styles, Job Requirements and the Individual Competencies that the manager brings to the job. Reddin points out that ‘managerial effectiveness’ is not a quality but a statement about the relationship between a manager’s behaviour and some task situation. According to Reddin, it is therefore not possible to discover as fact the qualities of effectiveness which would then be of self-evident value.

If, however, the Hay/McBer competencies are able to predict performance in any executive with 80 percent accuracy, and these competencies are ‘trainable’, then any training programme designed to develop managerial effectiveness in any role can be evaluated by assessing the changes in the participant’s behaviour that demonstrate the competency.

Learning outcomes

Most researchers conceptualize learning as a multidimensional construct. There is considerable commonality across different attempts to classify types of learning outcomes. One synthesis of this earlier work proposes three broad-based categories of learning outcomes: skill-based, cognitive and affective.

Skill-based learning outcomes address technical or motor skills. A number of game-based instructional programmes have been used for practice and drill of technical skills. For example, one study investigated the effect of the use of an aviation computer game by military trainees on subsequent test flights.

Cognitive learning outcomes include three subcategories:

  • Declarative knowledge. The learner is typically required to reproduce or recognise some item of information. For example, one study demonstrated that students who played a computer game focusing on Newtonian principles were able to answer questions on motion and force more accurately than those who did not play the game.

  • Procedural knowledge. Requires a demonstration of the ability to apply knowledge, general rules, or skills to a specific case. For example, one study found that students using a variable payoff electronics game achieved higher scores on electronics troubleshooting tasks than students who received standard drill and practice.

  • Strategic knowledge. Requires application of learned principles to different contexts or deriving new principles for general or novel situations (referred to by others as Constructivist Learning). For example, one study found that use of a computer game to improve practical reasoning skills of students led to improvements in critical thinking.

Affective learning outcomes refer to attitudes, including feelings of confidence, self-efficacy, preferences and dispositions. Some research has shown that games can influence attitudes.

General approaches to evaluation ‘Schools’

Easterby-Smith groups the major approaches to evaluation and classifies them along two dimensions: the scientific-constructivist dimension and the research-pragmatic dimension.

The scientific-constructivist dimension represents two distinct paradigms. According to Filstead, these paradigms represent distinct, largely incompatible, ways of seeing and understanding the world, yet in practice most evaluations contain elements of each point of view. The scientific approach favours the use of quantitative techniques involving the attempt to operationalise all variables in measurable terms, normally analysed by statistical techniques in order to establish absolute criteria.

The scientific approach contrasts greatly with ‘constructivist’ or ‘naturalistic’ methods, which emphasise the collection of the different views of various stakeholders before data collection begins. The process continues with reviewing largely, but not exclusively, qualitative data along with the views of various stakeholders.

Research-Pragmatic dimension

This dimension represents the contrasting styles of how the evaluation is conducted. Easterby-Smith described the two extremes as ‘Evaluation’ (with a capital E) and ‘evaluation’ (lower case), representing the Research and Pragmatic styles respectively.

Research styles stress the importance of rigorous procedures, that the direction and emphasis of the evaluation study should be guided by theoretical considerations and that these considerations are aimed at producing enduring generalizations, and knowledge about the learning and developmental process involved. The researcher in this instance should be independent and maintain an objective view of the courses under investigation without becoming personally involved.

The pragmatic approach, in contrast, emphasizes the reduction of data collection and other time-consuming aspects of the evaluation study to the minimum possible. When working within companies, the researcher is dependent on the cooperation and time given by managers and other informants, who may well have other more important priorities than the researcher’s study.

Easterby-Smith combines the dimensions into a useful matrix. The arrows show the influence of one ‘school’ on the development of the linked ‘school’.



Experimental Research

Experimental research has its roots in traditional social science research methodology, and several authors are cited in the literature as the best known representatives of this ‘school’. The emphasis in experimental research is:

  1. Determining the effects of training and other forms of development

  2. Demonstrating that any observed changes in behaviour or state can be attributed to the training, or treatment, that was provided.

The emphasis is on theoretical considerations, preordinate designs and quantitative measurement, and on comparing the effects of different treatments. In a training evaluation study this would involve at least two groups being evaluated before and after a training intervention. One group receives a training ‘treatment’ whilst the other group does not. The evaluation would measure the differences between the groups in specific quantifiable aspects related to their work.
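The two-group, before-and-after comparison described above can be sketched as a small calculation. This is a minimal illustration in Python, not a method taken from the literature reviewed here: the scores, the group sizes and the function name `training_effect` are hypothetical, and it assumes a single quantifiable work measure recorded for every participant before and after the intervention.

```python
from statistics import mean


def training_effect(trained_pre, trained_post, control_pre, control_post):
    """Mean gain of the trained group minus mean gain of the control group.

    Hypothetical helper for illustration only.
    """
    trained_gain = mean(trained_post) - mean(trained_pre)
    control_gain = mean(control_post) - mean(control_pre)
    return trained_gain - control_gain


# Hypothetical scores on one quantifiable work measure (higher is better).
trained_pre = [52, 48, 55, 50]
trained_post = [61, 57, 63, 59]
control_pre = [51, 49, 54, 50]
control_post = [53, 50, 56, 53]

effect = training_effect(trained_pre, trained_post, control_pre, control_post)
print(f"Estimated gain attributable to training: {effect:.2f}")
```

Subtracting the control group's gain removes change that would have occurred without the training. With groups this small the figure is indicative only.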


Evaluation models and taxonomies

Kirkpatrick Model

Donald Kirkpatrick created the most familiar evaluation taxonomy, a four-step approach to evaluation now referred to as a model of four levels of evaluation. It is one of the most widely accepted and implemented models used to evaluate training interventions. Kirkpatrick’s four levels measure the following:

  1. Reaction to the intervention

  2. Learning attributed to the intervention

  3. Application of behaviour changes on the job

  4. Business results realised by the organisation

Russ-Eft and Preskill note that the ubiquity of Kirkpatrick’s model stems from its simplicity and understandability. Having reviewed 57 journal articles in the training, performance, and psychology literature that discussed training evaluation models, they found that 44 (77%) of these included Kirkpatrick’s model. They also note that only in recent years (1996) have several alternative models been developed.

In an article written in 1977, Donald Kirkpatrick considered how the evaluation at his four levels provided evidence or proof of training effectiveness. Proof of effectiveness requires an experimental design using a control group to eliminate possible factors affecting the measurement of outcomes from a training program. Without such a design, i.e. when evaluating only the changes in a group participating in a training program, an evaluation can provide evidence of training effectiveness but not proof.

Brinkerhoff model

Brinkerhoff’s model has its roots in evaluating training and HRD interventions. Brinkerhoff’s cyclical model consists of six stages grouped into the following four stages of performance intervention:

  • Performance Analysis

  • Design

  • Implementation

  • Evaluation

Brinkerhoff’s model addresses the need for evaluation throughout the entire human performance intervention process.

The six stages:

  1. Goal setting – identify business results and performance needs and determine if the problem is worth addressing

  2. Program design – evaluates all types of interventions that may be appropriate

  3. Program implementation – evaluates the implementation and addresses its success

  4. Immediate outcomes – focuses on learning that takes place during the intervention

  5. Intermediate outcomes – focuses on the after-effects of the intervention some time following the intervention

  6. Impacts and worth – how the intervention has impacted the organization, the desired business results and whether it has addressed the original performance need or gap.

Holton’s HRD Evaluation Research and Measurement Model

Holton identified three outcomes of training (learning, individual performance, and organisational results) that are affected by primary and secondary influences. Learning is affected by trainees’ reactions, their cognitive ability, and their motivation to learn. The outcome of individual performance is influenced by motivation to transfer their learning, the programme’s design, and the condition of training transfer. Organisational results are affected by the expectations for return on investment, the organisation’s goals, and external events and factors. Holton’s model bears great similarity to Kirkpatrick’s Levels 2, 3 and 4.

The model is more testable than others in that it is the only one that identifies specific variables that affect training’s impact through identification of various objects, relationships, influencing factors, predictions and hypotheses. Russ-Eft and Preskill view Holton’s model as being related to a theory-driven approach to evaluation.



There are a number of reasons why the experimental research approach may not work as well as it might, particularly with management training where sample sizes are limited and especially where training and development activities are secondary to the main objectives of the organization. Easterby-Smith discusses four main reasons:

Sample sizes

Sample sizes need to be large if the statistical techniques essential to experimental research are to detect statistically significant effects when evaluating management training and development. This is particularly difficult in this field, where group sizes are often fewer than ten and rarely greater than 30.
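To give a sense of scale, the sketch below applies the standard normal-approximation formula for comparing two group means, n = 2(z₁ + z₂)² / d² participants per group, where z₁ is the critical value for a two-sided significance level, z₂ is the value for the desired power and d is the standardised difference between the groups. The formula is a general statistical result, not one given by Easterby-Smith, and the significance level, power, effect sizes and the function name `participants_per_group` are assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def participants_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed in each of two groups to detect a
    standardised mean difference (normal approximation, two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Assumed small, medium and large standardised effects.
for d in (0.2, 0.5, 0.8):
    print(d, participants_per_group(d))
```

Under these assumptions even a large effect calls for about 25 participants in each group, more than most management training groups provide once a control group is also required.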

Control Groups

There are special problems in achieving genuine control groups. In one study, those selected to receive training were ‘closer’ to their bosses than those who were not selected, rather than being chosen by more objective or even random selection criteria.

Further, does ‘no training’ have a negligible effect on the ‘control group’? Indeed Guba and Lincoln now include those who are left out as one of the main groups of stakeholders.


Causality

The intention of experimental design is primarily to demonstrate causality between the training intervention and any subsequent outcomes. However, it is often hard to isolate the intervention from other influences on the manager’s behaviour. It may be possible to reduce the ‘clutter’ of other influences for training interventions that have a specific, identifiable skills focus, but interventions of a more complex nature may be subject to myriad external influences between evaluations.

Time and effort

Kirkpatrick identifies that an experimental design to ‘prove’ that training is effective requires additional steps, including the evaluation of a control group. This applies particularly to his Level 3 (Behaviour) and Level 4 (Results), where any post-test evaluation must be undertaken long enough after the training event for the participant to have had an opportunity to change behaviour, or for results to be realisable.


In response to the problems above with experimental research, evaluators usually try to increase the sample size. The Illuminative evaluation school instead takes issue with the comparative and quantitative aspects of experimental research, and tends to concentrate on the views of different people in a more qualitative way. However, this school has been noted to be more costly than anticipated, and this style of evaluation has also been rejected by sponsors.


Easterby-Smith notes three main features of the systems model ‘school’: starting with the objectives, an emphasis on identifying the outcomes of training, and a stress on providing feedback about these outcomes to those people involved in providing inputs to the training.

Additionally, evaluation is defined as the ‘assessment of the total value of a training system, training course or programme in social as well as financial terms’. Critical of this definition, Hamblin suggests that it is restrictive because evaluation has to be related to objectives, whilst being over-ambitious because the total value of training is evaluated in ‘social as well as financial’ terms. However, Hamblin’s own cycle of evaluation depends heavily on the formulation of objectives, either as a starting point or as a product of the evaluation process.

Another important feature of Hamblin’s work is the emphasis on measurement of outcomes from training at different levels. It assumes that any training event will, or can, lead to a chain of consequences, each of which may be seen as causing the next consequence. Hamblin stresses that it would be unwise to conclude from an observed change at one of the higher levels of effect that this was due to a particular training intervention, unless one has followed the chain of causality through the intervening levels of effect. Should a change in job behaviour, for example, be observed, the constructivist take on this would possibly be to ask the individual for his or her own views of why they were now behaving in a different way and then compare this interpretation with the views of one or two close colleagues or subordinates.

Hamblin’s Five-Level Model

Hamblin, also widely referenced, devised a five-level model similar to Kirkpatrick’s. Hamblin adds a fifth level that measures “ultimate value variables” of “human good” (economic outcomes). This also can be viewed as falling into the tradition of the behavioural objectives approach.
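Hamblin’s caution about following the chain of causality through the intervening levels can be expressed as a simple check. The sketch below is an illustration only: the level names are those commonly given for Hamblin’s model rather than ones listed in this paper, and the function `highest_attributable_level` and the findings are hypothetical.

```python
# Level names as commonly given for Hamblin's model; this paper does not list them.
LEVELS = ["reactions", "learning", "job behaviour", "organisation", "ultimate value"]


def highest_attributable_level(observed_change):
    """Return the highest level whose change can be traced back through every
    lower level, or None if no change was observed at the first level."""
    attributable = None
    for level in LEVELS:
        if not observed_change.get(level, False):
            break  # the chain of causality is broken at this level
        attributable = level
    return attributable


# Hypothetical findings: an organisational change was observed, but no change
# in job behaviour, so the chain stops at learning.
findings = {
    "reactions": True,
    "learning": True,
    "job behaviour": False,
    "organisation": True,
}
print(highest_attributable_level(findings))
```

In this hypothetical case the organisational change cannot be credited to the training, because the intervening job-behaviour level shows no change.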



The third main feature of the systems model is the stress on feedback to trainers and other decision makers in the training process. This features significantly in work that takes a very pragmatic view of evaluation, suggesting that it should be of help to the trainer in making decisions about a particular programme as it is happening. Rackham subsequently makes a further distinction between assisting decisions that can be made about current programmes, and feedback that can contribute to decisions about future programmes. This Rackham began to appreciate after attempting to improve the amount of learning achieved in successive training programmes by feeding back to the trainers data about the reactions and learning achieved in earlier programmes. What Rackham noticed was that the process of feedback from one course to the next resulted in clear improvements when the programmes were non-participative in nature, but that there were no apparent improvements in programmes that involved a lot of participation.

The idea of feedback as an important aspect of evaluation was developed further by Burgoyne and Singh. They distinguish between evaluation as feedback and feedback adding to the body of knowledge. The former they saw as providing transient and perishable data relating directly to decision-making, and the latter as generating permanent and enduring knowledge about education and training processes.

Burgoyne and Singh relate evaluative feedback to a range of different decisions about training in the broad sense:

  1. Intra-method decisions: about how particular methods are handled, for example ‘lectures’ may vary from straight delivery to lively debates.

  2. Method decisions: for example, whether to employ a lecture, a case study, or a simulation in order to introduce the topic.

  3. Programme decisions: about the design of a particular programme, whether it should be longer or shorter, more or less structured, taught by insiders or visiting speakers.

  4. Strategy decisions: about the optimum use of resources, and about the way the training institution might be organized.

  5. Policy decisions: about the overall provision of funding and resources, and the role of the institution as a whole, whether for example, a management training college should see itself as an agent for change or as something that oils the wheels of change that are already taking place.


The systems model has been widely accepted, especially in the UK, but there are a number of problems and limitations that should be understood. According to Easterby-Smith, the main limitation is that feedback (i.e. data provided from evaluation of what has happened in the past) can only contribute marginally to decisions about what should happen in the future. Because it carries the legacy of the past training event, feedback can highlight incremental improvements based on the previous design, but cannot signal when radical change is needed.

The emphasis on outcomes provides a good and logical basis for evaluation but it represents a mechanistic view of learning. In the extreme, this suggests that learning consists of facts and knowledge being placed in people’s heads and that this becomes internalized and then gradually incorporated in people’s behavioural responses.

The emphasis on starting with objectives brings us to a classic critique of the systems approach. Just whose objectives are they? It has been questioned by MacDonald-Ross whether there is any particular value in specifying formal objectives at all, since among other things, this might place undue restrictions on the learning that could be achieved from a particular educational or training experience.


This leads to the next major evaluation ‘school’: goal-free evaluation. This starts with the assumption that the evaluator should avoid consideration of formal objectives in carrying out his or her work. Scriven proposed the radical view that the evaluator should take no notice of the formal objectives of a programme when carrying out an investigation of it.

Goal-free evaluation leans more towards the constructivist method: the evaluator should avoid discussing or even seeing the published objectives of the programme, and should instead discover from participants what happened and what was learned.


According to Easterby-Smith, this approach includes responsive evaluation and utilization-focused evaluation. He cites Stake, who contrasts responsive evaluation with the preordinate approach of experimental research. The latter requires the design to be clearly specified and determined before evaluation begins; it makes use of ‘objective’ measures, evaluates these against criteria determined by programme staff, and produces research-type reports. In contrast, responsive evaluation is concerned with programme activities rather than intentions, and takes account of different value perspectives. In addition, Stake stresses the importance of attempting to respond to the audience’s requirements for information (in contrast to the tendency of some goal-free evaluators to distance themselves from some of the principal stakeholders). Stake positions this method more pragmatically by recognizing that ‘different styles of evaluation will serve different purposes’. He also recognizes that preordinate evaluations may be preferable to responsive evaluations under certain circumstances.

Guba and Lincoln take this method further with what they call responsive constructivist evaluation. They recommend starting with the identification of stakeholders and their concerns, and arranging for these concerns to be exchanged and debated before the collection of further data.

Patton stresses the importance of identifying the motives of key decision makers before deciding what kind of information needs to be collected. He recognises that some stakeholders have more influence than others (something Guba and Lincoln argue should not be the case), but goes further by concentrating on the uses to which the subsequent information might be put.

However, interventionalist evaluation risks being so flexible, because it considers the views of all stakeholders and the uses of the subsequent information, that its form adapts and changes to every passing whim and circumstance, thereby producing results that are weak and inconclusive. It may also be that the evaluator becomes too close to the programme and stakeholders, something that goal-free evaluators set out to avoid, leading to a reduction of impartiality and credibility.


Many researchers measuring the effects of training have looked at one or more of the outcomes identified by Kirkpatrick: reactions, learning, behaviour, and results.

Evaluation of trainee reactions to learning has yielded mixed results, yet it is often the only formal evaluation of a training programme and is relied upon to assess learning and behaviour change. It may be reasonable to assume that enjoyment is a precursor to learning, and that if trainees enjoy the training they are likely to learn, but such an assumption is not supported by a meta-analytic study combining the results of several other studies.

Kirkpatrick’s second level, learning, is the most commonly used measurement after trainee reactions to assess the impact of training. Studies investigating the relationship between learning and work behaviour have also shown mixed results and offer little concrete evidence to support the notion that increased learning from a training programme relates to organisational performance.

Training transfer is defined as applying the knowledge, skills, and attitudes acquired during training to the work setting. Most organisations are genuinely interested in this aspect of the effectiveness of training events and programmes – yet the paucity of research dedicated to transfer of training belies the importance of the issue. Some research in this area has focused on comparing alternative conditions for training transfer. A typical design of such research compares groups who receive different training methods and/or a ‘control group’ that receives no training. Such studies do not all use the same research design and the results are inconsistent. However, the research does indicate that some form of post-training transfer strategy facilitates training transfer.

Evaluating the results of training in terms of business results, financial results and return on investment (ROI) is much discussed in the popular literature, most of which offers anecdotal evidence or conjecture about the necessity of evaluating training’s return on investment and about the methods that trainers might use to implement such an evaluation. Solid research on this topic is not, however, so voluminous. Mosier proposes a number of capital budgeting techniques that could be applied to evaluating training, whilst recognising that there are common reasons why such financial analyses are rarely conducted or reported:

  • It is difficult to quantify or establish a monetary value for many aspects of training

  • No usable cost-benefit tool is readily available

  • The time lag between training and results is problematic

  • Training managers and trainers are not familiar with financial models.

Little progress appears to have been made in this area since Mosier wrote.
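
As a hedged illustration of the kind of capital budgeting calculation referred to above, the following Python sketch computes a simple return on investment, a net present value and a payback period for a training programme. The choice of these three measures, the function names and all of the figures are assumptions made for illustration; they are not taken from Mosier, and placing a monetary value on the benefits remains the difficult step noted in the list above.

```python
def roi_percent(total_benefits, total_costs):
    """Simple ROI: net benefit as a percentage of cost."""
    return 100.0 * (total_benefits - total_costs) / total_costs


def net_present_value(discount_rate, cash_flows):
    """Discount yearly cash flows back to year 0.

    cash_flows[0] is the year-0 flow (normally the negative training cost);
    later entries are the estimated yearly benefits.
    """
    return sum(flow / (1.0 + discount_rate) ** year
               for year, flow in enumerate(cash_flows))


def payback_period(cash_flows):
    """First year in which cumulative cash flow is no longer negative.

    Returns None if the cost is never recovered within the horizon.
    """
    cumulative = 0.0
    for year, flow in enumerate(cash_flows):
        cumulative += flow
        if cumulative >= 0:
            return year
    return None


if __name__ == "__main__":
    # Hypothetical programme: costs 100,000 now, estimated to return
    # 40,000 a year for three years.
    flows = [-100_000, 40_000, 40_000, 40_000]
    print("ROI %:", roi_percent(sum(flows[1:]), -flows[0]))
    print("NPV at 10%:", round(net_present_value(0.10, flows), 2))
    print("Payback year:", payback_period(flows))
```

With these invented figures the simple ROI is positive while the net present value at a 10% discount rate is slightly negative, which illustrates how the time lag between training and results can change the conclusion.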



Reviewing the history and development of training evaluation research shows that there are many variables that ultimately affect how trainees learn and transfer their learning in the workplace. Russ-Eft and Preskill suggest that a comprehensive evaluation of learning, performance, and change would include the representation of a considerable number of variables. Such an approach, whilst laudable in terms of purist academic research, is likely to cause another problem: that of collecting data to demonstrate the affects and effects of so many independent variables and factors. Thus, we need to recognise that there is a trade-off between the cost and duration of a research design and the quality of the information which it generates.

Hamblin points out that a considerable amount of evaluation research has been done. This research has been carried out with a great variety of focal theories, usually looking for consistent relationships between educational methods and learning outcomes, using a variety of observational methods but with a fairly consistent and stable background theory. However, the underlying theory has been taken from behaviouralist psychology, summed up as the ‘patient’ view: here the essential idea is that what the subject (patient) does (behaviour or response) is a consequence of what has been done to him (treatment or stimulus).

Another approach, according to Burgoyne, which researchers appear to take to avoid confronting value issues is to hold that all value questions can ultimately be reduced to questions of fact. This usually takes the form of regarding ‘managerial effectiveness’ as a personal quality which is potentially objectively measurable, and therefore a quality the possession of which could be assessed as a matter of fact. In practice this approach has proved elusive. Seashore et al. felt that the existence of the concept of ‘managerial effectiveness’ as a unitary trait could be confirmed if they found high intercorrelations between five important aspects of managerial behaviour: high overall performance ratings, high productivity, low accident record, low absenteeism and few errors.
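
The test described for Seashore et al. can be sketched as a check on pairwise correlations. The Python below is a minimal illustration only: the scores are invented, the 0.7 threshold for a ‘high’ intercorrelation is an assumption, and measures such as accidents, absenteeism and errors are assumed to have been reverse-scored so that a higher number always means better performance.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (non-zero variance assumed)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def intercorrelations(measures):
    """Correlation for every pair of named measures."""
    return {(a, b): pearson(measures[a], measures[b])
            for a, b in combinations(measures, 2)}


def looks_unitary(measures, threshold=0.7):
    """True only if every pair of measures correlates at or above the threshold."""
    return all(r >= threshold for r in intercorrelations(measures).values())


if __name__ == "__main__":
    # Hypothetical scores for six managers on the five aspects (higher = better).
    scores = {
        "performance_rating": [3, 4, 2, 5, 4, 1],
        "productivity":       [2, 5, 3, 4, 4, 2],
        "safety":             [4, 3, 5, 2, 3, 4],
        "attendance":         [3, 4, 3, 5, 2, 2],
        "accuracy":           [5, 2, 4, 3, 3, 1],
    }
    for pair, r in intercorrelations(scores).items():
        print(pair, round(r, 2))
    print("Supports a single trait:", looks_unitary(scores))
```

Five measures give ten pairs to inspect; a single weak or negative pair is enough to undermine the idea of one underlying trait.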

There are no fixed rules about the style of evaluation used for a particular evaluative purpose. However, it has been suggested that studies aimed at fulfilling the purpose of proving will tend to be located towards the ‘research’ end of the dimension, and studies aimed at improving will tend to be located near the ‘pragmatic’ end. On the methodological dimension there may be more concern with proving at the ‘scientific’ end, and learning at the ‘constructivist’ end.



Margaret Gredler notes that a major design weakness of most studies evaluating simulation-based training and development is that they are compared to regular classroom instruction. However, the instructional goals for each can differ. Similarly, many studies show measurement problems in the nature of the post-tests used.

The problems highlighted above regarding the choice of evaluation methodology are further compounded by the lack of well-designed research studies in the development and use of games and simulations. Much of the published literature consists of anecdotal reports and testimonials that provide sketchy descriptions of the game or simulation and report only on perceived student reactions.

Most of the research, as noted by Pierfy, is flawed by basic weaknesses in both design and measurement. Some studies implemented games or simulations that were brief treatments of less than 40 minutes and assessed effects weeks later. Intervening instruction in these cases, however, contaminated the results.


Mckenna cites many references to papers criticising the evaluation of learning effectiveness using Interactive Multimedia and many researchers ask for more research to quantitatively evaluate the real benefits of simulations and games-based learning.

The question of whether simulation-based learning requires new and different assessment techniques beyond those in use today remains relatively unexplored. Although there have been some attempts at constructing theoretical frameworks for the evaluation of virtual worlds, very few working examples or reports on the practical use of these frameworks exist.

Whitelock et al. argue that effective evaluation methods need to be established to discover if conceptual learning takes place in virtual environments. In practice, however, the assessment of virtual environments and simulations has been focused primarily on their usefulness for training and less on their efficacy for supporting learning, especially in domains with a high conceptual and social content.

Hodges suggests that simulations (especially Virtual Reality) prove most valuable in job training where hands-on practice is essential but actual equipment cannot often be used. Greeno et al. suggest that knowledge of how to perform a task is embedded in the contextual environment. Druckman and Bjork suggest that only when a task is learned in its intended performance situations can the learned skills be used in those situations.

Since the early days of simulation and gaming as a method of teaching, there have been calls for hard evidence that supports the teaching effectiveness of simulations. Their paper states that

“despite the extensive literature, it remains difficult, if not impossible, to support objectively even the most fundamental claims for the efficacy of games as a teaching pedagogy. There is relatively little hard evidence that simulations produce learning or that they are superior to other methodologies.”

They go on to review the reasons as being traceable to the selection of dependent variables and to the lack of rigour with which investigations have been conducted.


One of the major problems of simulations is how to “evaluate the training effectiveness [of a simulation]”. Although researchers have lauded the benefits of simulation for more than 40 years, very few of these claims are supported by substantial research.

Many of the above-cited authors attribute the lack of progress in simulation evaluation to poorly designed studies and the difficulties inherent in creating an acceptable methodology of evaluation.

There are a number of empirical studies that have examined the effects of game-based instructional programs on learning. For example both Ricci, et al. and Whitehall and McDonald found that instruction incorporating game features led to improved learning. The rationale for these positive results varied, given the different factors examined in these studies. Whitehall and McDonald argued that the incorporation of a variable payoff schedule within the simulation led to increased risk taking among students, resulting in greater persistence and improved performance. Ricci, et al. proposed that instruction incorporating game features enhanced student motivation, leading to greater attention to training content and greater retention.

However, although anecdotal evidence suggests that students seem to prefer games over other, more traditional methods of instruction, reviews have reported mixed results regarding the training effectiveness of games and simulations. Pierfy evaluated the results of 22 simulation-based training game effectiveness studies to determine patterns of training effectiveness. Of these, 21 studies had comparable assessments. Three of the studies reported results favouring the effectiveness of games, and three reported results favouring the effectiveness of conventional teaching. The remaining 15 studies reported no significant differences. Eleven of the studies tested for retention of learning: eight of these indicated that retention was superior for games, and the remaining three yielded no significant result. Seven of the eight studies assessing student preference for training games over classroom instruction reported greater interest in simulation game activities than in conventional teaching methods. More recently, Druckman concluded that games seem to be effective in enhancing motivation and increasing student interest in subject matter, yet the extent to which this translates into more effective learning is less clear.
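
The counts reported for Pierfy’s review are easier to compare as proportions. The short Python sketch below simply restates the figures given in the paragraph above; the variable and function names are illustrative.

```python
def shares(counts):
    """Convert a dict of counts into a dict of proportions of the total."""
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Immediate learning outcome in the 21 studies with comparable assessments.
effectiveness = {
    "favoured games": 3,
    "favoured conventional teaching": 3,
    "no significant difference": 15,
}

# Retention of learning in the 11 studies that tested for it.
retention = {
    "retention superior for games": 8,
    "no significant result": 3,
}

# Student preference: 7 of the 8 studies that assessed it favoured games.
preference_for_games = 7 / 8

if __name__ == "__main__":
    for label, share in shares(effectiveness).items():
        print(f"{label}: {share:.0%}")
    for label, share in shares(retention).items():
        print(f"{label}: {share:.0%}")
    print(f"studies reporting a preference for games: {preference_for_games:.0%}")
```

Set out this way, the pattern is that most studies found no difference in immediate learning, while the retention and preference results lean towards games.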


This section reviews the literature on simulation evaluation, with the aim of developing a coherent framework for pursuing the evaluation problem. It deals specifically with simulation-based training programmes, although much of it applies, perhaps with different terminology, to all training delivery methods. Three prominent constructs appear in the literature: fidelity, verification, and validation. Validation is the most important criterion: it asks whether the simulation is valid as a method of teaching, and whether the simulation programme, as a training intervention, produces (or helps to produce) learning and transfer of learning. Yet fidelity and verification are easier to evaluate (though not necessarily to measure objectively) and often distract evaluators from the trickier issue of validation.

Simulation Fidelity

Fidelity is the level of realism that a simulation presents to a learner. Hays and Singer describe fidelity as “how similar a training situation must be, relative to the operational situation, in order to train most efficiently”. Fidelity focuses on the equipment that is used to simulate a particular learning environment.

In more technologically sophisticated simulations, such as those that use virtual reality, the construct of fidelity has an additional dimension: that of presence (the degree to which an individual believes that they are immersed within the virtual world or simulation).

The degree of fidelity or presence in a learning environment is difficult to measure. Much research during the 1960s and 1970s studied the relationship between fidelity and its effects on training and education. These studies, according to Feinstein and Cannon, found that a higher level of fidelity does not translate into more effective training or enhanced learning. In fact, it may be that lower levels of fidelity, combined with an effective ‘human virtual environment interface’ (navigational simplicity) and a significant degree of presence, can assist trainees in acquiring knowledge or skills within the simulation.

Simulation Verification

Verification is the process of assessing that a model is operating as intended. Verification “is a process designed to see if we have built the model right”. During the process, simulation developers need to test and debug errors in the simulation (now usually software errors) through a series of alpha and beta tests under different conditions, verifying that the model works as intended. Often, developers are distracted by this process, producing what appear to be brilliant models that work ‘correctly’ but with no appreciation of the educational effect and hence their validity. In this sense, verification can be a trap, notwithstanding its critical status as a necessary condition of validity.


Building on the work of Feinstein and Cannon, Cannon and Burns, and others, there are three main questions beyond the general issues of evaluation noted above:

  1. Are we measuring the ‘right’ thing? Validity of the constructs

  2. Does the simulation provide the opportunity to learn the ‘right’ thing? Verification of the simulation and the appropriate fidelity for the content and audience.

  3. Does the method of using a simulation deliver the learning? Validity as a method of developing.


Evaluation of simulations for learning outcomes

An analysis of the simulation literature identifies the learning outcomes that instructors adopt as they strive to educate business students. These learning outcomes have been advanced as targeting the skills and knowledge needed by practising managers. Simulation researchers have speculated that the method is an effective pedagogy for achieving many of these outcomes. The table below identifies these learning outcomes, where:

P= measurement of learner perceptions of learning outcomes

O= an objective measurement of learning outcomes


It is clear from the table above that the vast majority of evaluations have relied on learners’ perceptions of their learning outcomes. Objective measurements (for any learning intervention) are more difficult; however, there appears to be a need to bring in more objective measures to help understand whether simulations are an effective method for people to learn business management skills.


This paper commenced with three key questions:

  1. Why evaluate?

  2. What to evaluate? and

  3. How to evaluate?

Why evaluate?

Any deliberate learning intervention costs an organisation and its individuals time and money. The value placed on this may vary from person to person; however, there is a value, and hence it is worth ensuring that the value of the outcome is greater than or equal to the value of the input.

What to evaluate?

What to evaluate is more difficult to answer. In business and especially management, behavioural competencies as learning outcomes are increasingly recognised as a valid and useful measurement.

How to evaluate?

How to evaluate poses significant problems for the researcher. The training community is most familiar with Kirkpatrick’s four levels of evaluation and most frequently measures Level 1 (reactions to the training intervention) as a proxy for all learning outcomes achieved. The basic premise is that if trainees enjoy the training, then they will learn from it. This may be true; however, it does not mean that the trainees have learned what was intended, let alone what was needed. This suggests that a goal-free evaluation approach may be suitable to establish what was learned, regardless of what was intended in the intervention.

Greater objectivity about on-the-job behaviour change may be obtained through the use of a suitable instrument and 360º assessment. However, the researcher needs to be aware of Wimer’s warning that 360º feedback can have detrimental effects, especially when the feedback is particularly negative or not handled sensitively.

Evaluation style

Choosing an evaluation style requires the researcher to understand clearly the intentions of the evaluation. In order to demonstrate the effectiveness of using simulations as a method of developing managerial competencies, this researcher is interested in ‘proving’ (rather than learning) the effectiveness of the delivery method and in ‘proving’ (rather than improving) the development intervention. The experimental research approach to evaluation appears to be the most objective and scientific for this purpose.

Researcher involvement and bias

There are significant problems to consider. One is the researcher’s involvement, whether direct or not: the very fact of measuring participants in a programme may have a direct impact on the achievement of the learning outcomes. However, if this effect is the same for all delivery methods being evaluated, the comparison is made on the same basis.

Sample size

The next and very significant issue is that of sample size. It is recognised that in order to achieve statistical validity, the sample size will need to be sufficiently large – of the order of 100 or more participants.
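
One way to see where a figure of this order can come from is a standard power calculation for comparing the means of two groups. The Python sketch below uses the usual normal approximation; the effect size (a standardised mean difference of 0.5), the 5% two-sided significance level and the 80% power are assumptions chosen for illustration and do not come from the text above.

```python
from math import ceil
from statistics import NormalDist


def participants_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate group size for a two-sided comparison of two means.

    Normal approximation: n = 2 * ((z_(1 - alpha/2) + z_power) / d) ** 2,
    where d is the standardised difference between the group means.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1.0 - alpha / 2.0)
    z_power = standard_normal.inv_cdf(power)
    return ceil(2.0 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    n = participants_per_group(0.5)  # assumed 'medium' effect size
    print("per group:", n, "total for two groups:", 2 * n)
```

Under these assumed values the approximation gives 63 participants per group, or 126 across a trained group and a control group, which is consistent with a requirement of the order of 100 or more. Smaller expected effects would push the number up sharply.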

Control group and other influences

Issues of the suitability of the control group need to be considered as well. It can be expected that individuals will react differently to the same training intervention. Such aspects of the individual as their preferred learning style, their educational history, perhaps gender, perhaps race, perhaps cultural heritage, their age, etc. may all play a part in whether they, as an individual, learn and change behaviour as a result or as a consequence of a particular learning intervention.

The design of a suitable evaluation research method will need to consider all of these issues.

Dr. John Kenworthy

About the author

John is the Chief Coaching Officer of CELSIM, the Leadership and Simulations arm of Corporate Edge Group based in Singapore.

John earned his Doctorate at the Henley Management College, graduating in 2006. He has a BSC in Hotel and Catering Studies from Manchester Polytechnic and an MBA in Technology Management from the Open University. His research and business interests are in the use of computer-based simulations to develop managerial competence. He is an active member of ABSEL (Association for Business Simulations and Experiential Learning) and AECT (Association for Educational Communications and Technology).

Dr. John Kenworthy, Chief Coaching Officer at CELSIM,

Is leadership competence or competency?

The GAINMORE Advantage framework is a practical development model that considers the hierarchical nature of people’s behaviours within teams and within organizations.

But just what is it that we are developing? Is it character, attributes, traits, competence or competency? There is a lack of agreement in the academic world on the terminology and on whether some aspects are genetic, leading us to the recurring question, “are leaders born or made?”. Here, I want to address one part of the debate: the distinction between competence and competency.

Competence and competency

The concept of competence remains one of the most diffuse terms in the organisational and occupational literature (Nordhaug and Gronhaug, 1994). Exactly what does an author mean when using any of the terms of competence?

The concept of individual competence is widely used in human resource management (Boyatzis, 1982, Schroder, 1989, Burgoyne, 1993). This refers to a set of skills that an individual must possess in order to be capable of satisfactorily performing a specified job. Although the concept is well developed, there is continuing debate about its precise meaning.

Others take a job-based competence view that according to Robotham and Jubb (1996) can be applied to any type of business where the competence-based system is based on identifying a list of key activities (McAuley, 1994) and behaviours identified through observing managers in the course of doing their job.

A useful view is to look at competence to mean a skill and the standard of performance, whilst competency refers to behaviour by which it is achieved (Rowe, 1995). That is, competence describes what people do and competency describes how people do it.

Rowe (1995, p16) further distinguishes the attributes an individual exhibits as “morally based” behaviours – these are important drivers of behaviours but especially difficult to measure – and “intellectually based” behaviours as capabilities or competencies. Capabilities are distinguished as these refer to development behaviours – i.e. are graded to note development areas to improve behaviours in how people undertake particular tasks.

Young (2002) develops on a similar theme and builds on Sarawano’s (1993) model, linking competency and competence to performance and identifies competency as a personal characteristic (motives, traits, image/role and knowledge) and how the individual behaves (skill). Competence is what a manager is required to do – the job activities (functions, tasks). These in turn lead to performance of the individual [manager].

Jacobs (1989) considers a distinction between hard and soft competences. Soft competences refer to such items as creativity and sensitivity, and comprise more of the personal qualities that lie behind behaviour. These items are viewed as being conceptually different from hard competences, such as the ability to be well organised. Jacobs’ distinction fits neatly into Young’s model, with hard competences referring to identifiable behaviours, and soft competences to the personal characteristics of the individual.

Further distinctions relate to the usefulness of measuring competenc[i]es. Cockerill et al. (1995) define threshold and high-performance competences. Threshold competences are units of behaviour which are used by job holders, but which are not considered to be associated with superior performance. They can be thought of as defining the minimum requirements of a job. High performance competences, in contrast, are behaviours that are associated with individuals who perform their jobs at a superior level.

In the UK, the Constable and McCormick Report (1987) suggested that the skill base within UK organisations could no longer keep pace with the then developing business climate. In response, the Management Charter Initiative sought to create a standard model where competence is recognised in the form of job-specific outcomes. Thus, competence is judged on performance of an individual in a specific job role. The competences required in each job role are defined through means of a functional analysis – a top-down process resulting in four levels of description:

  • Key purpose
  • Key role
  • Units of competence
  • Elements of competence

Elements are broken down into performance criteria, which describe the characteristics of competent performance, and range statements, which specify the range of situations or contexts in which the competence should be displayed.

The MCI model now includes personal competence, missing in the original, addressing some of the criticisms levelled at the MCI standards. However, the model still tends to ignore personal behaviours which may underpin some performance characteristics, particularly in the area of management, where recent work has indicated the importance of behavioural characteristics such as self-confidence, sensitivity, proactivity and stamina.

The US approach to management competence, on the other hand, has focused heavily on behaviours. Boyatzis (1982) identifies a number of behaviours useful for specifying behavioural competence. Schroder (1989) also offers insights into the personal competencies which contribute to effective professional performance.

Personal competencies and their identifying behaviours form the backbone of many company-specific competency frameworks and are used extensively in assessment centres for selection purposes. This is because behavioural (or personal) competence may be a better predictor of capability – i.e. the potential to perform in future posts – than functional competence – which attests to competence in current post. The main weakness of the personal competence approach, according to Cheetham and Chivers (1996), is that it doesn’t define or assure effective performance within the job role in terms of the outcomes achieved.

In his seminal work “The Reflective Practitioner”, Schon (1983) attempts to define the nature of professional practice. He challenges the orthodoxy of technical rationality – the belief that professionals solve problems by simply applying specialist or scientific knowledge. Instead, Schon offers a new epistemology of professional practice of ‘knowing-in-action’ – a form of acquired tacit knowledge – and ‘reflection’ – the ability to learn through and within practice. Schon argues that reflection (both reflection in action and reflection about action) is vital to the process professionals go through in reframing and resolving day-to-day problems that are not answered by the simple application of scientific or technical principles.

Schon (1983) does not offer a comprehensive model of professional competence, rather he argues that the primary competence of any professional is the ability to reflect – this being key to acquiring all other competencies in the cycle of continuous improvement.

There are criticisms of competency-based approaches to management and these tend to argue that managerial tasks are very special in nature, making it impossible to capture and define the required competences or competencies (Wille, 1989). Other writers argue that management skills and competences are too complex and varied to define (Hirsh, 1989, Canning, 1990) and it is an exercise in futility to try and capture them in a mechanistic, reductionist way (Collin, 1989). Burgoyne (1988) suggests that the competence-based approach places too much emphasis on the individual and neglects the importance of organisational development in making management development effective. It has also been argued that generic lists of managerial competences cannot be applied across the diversity of organisations (Burgoyne, 1989b, Canning, 1990).

Linking competency models to organisation outcomes

Some writers have identified competencies that are considered to be generic and overarching across all occupations. Reynolds and Snell (1988) identify ‘meta-qualities’ – creativity, mental agility and balanced learning skill – that they believe reinforce other qualities. Hall (1986) uses the term ‘meta-skills’ for skills in acquiring other skills. Linstead (1991) and Nordhaug and Gronhaug (1994) use the term ‘meta-competencies’ to describe similar characteristics. The concept of meta-competence falls short of providing a holistic, workable model, but it does suggest that there are certain key competencies that overarch a whole range of others.

There is, however, some doubt about the practicability of breaking down the entity of management into its constituent behaviours (Burgoyne, 1989a). This suggests that the practice of management is an activity that should be considered only from a holistic viewpoint.

Baker et al. (1997) link the various types of competence by first establishing a hierarchy of congruence as a backbone to the model. In broad terms, they describe the congruence of an entity as the degree of match or fit between some external driver to the entity and the response of that entity to the driver. This method enables them to take into consideration the idea that management, as an entity, and the individuals who perform the function do so within a particular environment. Measurement of congruence, or goodness of fit, has been attempted in studies of operations (Cleveland et al., 1989, Vickery, 1991). Baker et al.’s hierarchy is shown in Figure 1 below, with four levels of congruence: 1) Organisation level, 2) Core business process level, 3) Sub-process within core process level, and 4) Individuals level.

At the organisation level, there is congruence when a firm adopts a strategy that is consistent with the competitive priorities derived from the firm’s business environment. The strategy, in turn, determines the operational priorities of the firm, following Platts and Gregory (1990), Baker et al. (1997) using their own terminology, consider these operational priorities to drive the core processes of the firm. These, in turn, can be broken down into a number of sub-processes – and congruence is needed between the sub-processes and the core processes. At the individual level, the skills and knowledge should also match the priorities driven by the sub-processes.

This hierarchical model follows the traditional view that structure follows strategy (Vickery, 1991; Cleveland et al., 1989; Kim and Arnold, 1992). Others hold that competences are part of the structure of the firm and should influence strategy making. Bhattacharaya and Gibbons (1996) point out that Prahalad and Hamel (1990) and Stalk et al. (1992) take this approach.

The hierarchical model has been tested by analysing case studies of seventeen manufacturing plants that won Best Factory Awards in the UK during the period 1993-95 (Cranfield) and established benchmarks. Baker et al. (1997) found some direct cause-effect links between enabling competences at the sub-process level and competitive performance at the core process level. However, they also found many ‘best practices’, such as employee empowerment and team working, that were harder to link to specific competitive competences.

This model provides an insightful way to break down the complex issue of how individual performance influences the competitive competences of the firm. However, Baker et al.’s research is limited to the manufacturing sector, where core processes are often easier to identify and define, with a clear delineation of individual effort, technology and product. It also rests on the premise that structure follows strategy, whereas most firms will already have a structure and will be adapting their strategies continuously as the external environment changes.


Figure 1. Hierarchical model of competence (Baker et al., 1997)

Cheetham and Chivers (1996) describe a model of competence that draws together two apparently disparate views of competence: the ‘outcomes’ approach and the ‘reflective practitioner’ approach (Schon, 1983, 1987).

Their focus was to determine how professionals maintain and develop their professionalism. In drawing together their model, they consider the key influences of different approaches and writers. The core components of the model are Knowledge/cognitive competence, Functional competence, Personal or behavioural competence and Values/ethical competence, with overarching meta-competencies that include communication, self-development, creativity, analysis and problem-solving. Reflection in and about action (Schon, 1983) surrounds the model, thereby bringing the outcomes and reflective practitioner approaches together in one model, shown in Figure 2 below.

Cheetham and Chivers’ model of professional competence is useful in bringing the concept of individual competence to bear on the competence of the organisation in a non-manufacturing context. It still falls short, however, of providing a generic model: one that links an individual’s behaviour with the business results of an organisation across industries.


Figure 2. Model of professional competence (Cheetham and Chivers, 1996)

Young (2002) neatly creates a generic model by extending his individual model to the organisational perspective. He adopts the concept of core competence, as articulated by Prahalad and Hamel (1990) and further developed by Stalk et al. (1992) and Tampoe (1994), and suggests that the collection of individual competences within the organisation creates the organisational core competence.

This model provides a way to understand how developing competency (personal characteristics and behaviours) at the individual level enables an individual to demonstrate competence (the functions and tasks of the job), which in turn cascades through the hierarchy of the organisation (core competence and other activities supporting the organisation) to deliver business results.


Figure 3. Individual variables of competency, competence and performance and organisation core competence (adapted from Young, 2002)

Please contact the author for full bibliography
