
RESEARCH ARTICLE

Child development assessment: Practitioner input in the
revision for Griffiths III

Elizabeth M. Green1,2 | Louise Stroud1,2 | Candice Marx2 |

Johan Cronje2

1Association for Research in Infant and Child

Development, Birmingham, UK

2Department of Psychology, Nelson Mandela

University, Port Elizabeth, South Africa


Elizabeth M. Green, Association for Research

in Infant and Child Development,

Birmingham, UK.


Funding information

Association for Research in Infant and Child

Development, Grant/Award Number:



Introduction: The input from practitioners in developmental assessment test revision

is a crucial and leading component of the project. This paper highlights six key phases

of the Griffiths III revision process and the value of having a guiding plan that

includes test practitioner input.

Methods: The revision of the Griffiths III consisted of six separate phases that were

supported by practitioner and user input and feedback. These six phases and practi-

tioner views ensured that the necessary core constructs and new areas for item devel-

opment were included in the revised version. These processes also underscored the

construct development and task review, item design, piloting and standardization of

the revised version, as well as its production, release and subsequent training methods.

Results: The six guiding phases provided a methodologically robust frame to the revi-

sion process. Practitioners valued an overall developmental measure with discrete

data about and within the ‘avenues of learning’ allowing them to analyse a child’s

strengths and weaknesses. Communication with practitioners across the world dem-

onstrated the wide disparity of culture and environments that the Griffiths Scales are

deployed in. It is not possible to design a revised scale that is appropriate for all areas

of use, so in this revision process, it was decided to design the scales as culturally fair

as possible and support practitioners in other countries to translate and validate the

scales for use.

Conclusions: The revision of the Griffiths III found test users to be valuable sources

of information on the basis of their experiences with the test and professional knowl-

edge. Creating a continuous feedback mechanism within a phased process provided

opportunities for the revision team to engage meaningfully with the data being

obtained as well as test users to advance the scope and quality of the test. Revision

teams are encouraged to consider the process and engagement methods explored in

this study during their projects.


child development, early assessment, measurement

Received: 16 March 2018 Revised: 26 June 2020 Accepted: 7 July 2020

DOI: 10.1111/cch.12796

682 © 2020 John Wiley & Sons Ltd Child Care Health Dev. 2020;46:682–


According to the International Test Commission (ITC) (2013, 2015,

2013, 2017), test developers should have a guiding plan during test

development. For test revision, test developers are able to draw on

the knowledge and experience of practitioners and to develop and

revise tests that meet the needs of those who employ them in daily

practice (Adams, 2000). If the test has previous versions, a body of

research evidence and feedback from registered practitioners of the

test as well as the test market in general is available to provide insight

into areas of the test that may need amendment.

Tests require revision for a number of reasons including outdated

key test components (Adams, 2000); advances in measurement theory,

psychometric practice and norm development (King, 2006); and

changes in test performance, such as the Flynn effect (Flynn, 1984;

Trahan, Stuebing, Fletcher, & Hiscock, 2014), which may reduce the overall

difficulty of tests. In addition, concern about the effects of time on the

validity of interpretations of test results is evident in industry publica-

tions. The European Federation of Psychologists’ Associations

(EFPA) (2013a, b) labels a test as inadequate if the normative and

standardization information is 20 years or older.

Tests of child development require timely revision. Improved nutri-

tion, health care, child-rearing practices and education are cited as pos-

sible causes for the Flynn effect (Strauss, Spreen, & Hunter, 2000).

Child development is a dynamic, moving target, so a critical look at the

underpinning theories, philosophies and principles of development

ensures that these stay relevant. One challenge when revising or

updating a test is the balance between modernization and retaining the

original spirit of the measure. Another challenge is ensuring that the

test remains fit for purpose. Tests of child development are often stan-

dardized on a population of typically developing children, yet they are

used mostly to assess children whose development is thought to be

atypical in order that the test may discriminate between typical and

atypical development. The shape of the normal distribution curve

(i.e. the bell curve) provides sparse comparison data from typically

developing children at the lower end of the curve (−2 SD to −3 SD). This

means that for the 2.5% of children whose performance falls in the

lower tail of the bell curve, when compared against a norm group of

typically developing children, the specific degree of impairment cannot

be determined with any degree of confidence from the normed data for

the test floor. Including some atypically developing children in the sam-

ple in order to improve the test floor is not appropriate. Leaders Project

(2013), in their test review of the Bayley (2006), concluded that the

inclusion of children with clinical diagnoses in the main standardization

sample had not been helpful, limiting the test’s ability to diagnose chil-

dren with mild impairments. It may take years after publication for

research to be conducted with a revised test on specific and non-

heterogeneous clinical populations. Research should draw on samples

where true comparison may be made, and efforts should be made to

produce research that will maximize the usability and generalizability of

findings (Oliveri, Lawless, & Young, 2015).
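To make the sparsity at the test floor concrete, the following minimal sketch computes the share of a norm group expected to fall between −2 SD and −3 SD. It assumes an idealized standard normal distribution of performance. The sample size of 426 is the standardization sample reported later in this paper; treating it as one pooled group, rather than as a sample spread across age bands, is a simplifying assumption made purely for illustration.

```python
from statistics import NormalDist

# Idealized standard normal distribution of test performance (an assumption).
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_between(lower_sd: float, upper_sd: float) -> float:
    """Proportion of a normal population scoring between two SD cut-points."""
    return standard_normal.cdf(upper_sd) - standard_normal.cdf(lower_sd)


share_below_minus_2 = standard_normal.cdf(-2.0)        # everything below -2 SD
share_minus_3_to_minus_2 = share_between(-3.0, -2.0)   # the band named in the text

# Illustrative only: pooled sample size, ignoring the split across age bands.
sample_size = 426
expected_children_in_band = sample_size * share_minus_3_to_minus_2

print(f"Below -2 SD: {share_below_minus_2:.2%}")
print(f"Between -3 SD and -2 SD: {share_minus_3_to_minus_2:.2%}")
print(f"Expected children in that band (n={sample_size}): {expected_children_in_band:.1f}")
```

Under these assumptions, only about 2% of a typically developing norm group falls in this band, which is roughly nine children in a pooled sample of this size and far fewer at any single age. The 2.5% figure quoted above corresponds to the conventional cut-point of about −1.96 SD rather than exactly −2 SD.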

Test scales are a manifestation of latent constructs and are typi-

cally used to capture a behaviour, a feeling or an action that cannot be

captured in a single variable or item (Boateng, Neilands, Frongillo,

Melgar-Quinonez, & Young, 2018, p. 148). This means that, in terms of

validity, the highest level of evidence of the coverage of a construct is

likely to be offered by experts and practitioners working in the field of

the test. However, in spite of the ITC’s (2017) comment on the recipro-

cal relationship called for between test developers and test users, there

is scant evidence in the literature about how test developers can draw

on the knowledge and experience of practitioners. Adams (2000)

described the challenges that can emerge on the basis of an incomplete

understanding of each other’s needs and a failure to fully appreciate

the potential contributions of the other. Butcher (2000) saw one of the

greatest values of user feedback as providing awareness of practical

issues to the test developer. Gregory (2015) noted that feedback from

examiners is a potentially valuable source of information that is nor-

mally overlooked by test developers.

In 2011, the Association for Research in Infant and Child

Development (ARICD) started the revision process of the Griffiths

Scales (Griffiths, 1970; 1986). Previous restandardizations had

included clarifications and amendments without modernization. The

two Griffiths Scales, Birth to Two Years (Huntley, 1996) and the

Griffiths Mental Development Scales—Extended Revised (GMDS-ER)

(Luiz et al., 2006) for children 2 to 8 years, had differences in test

organization (e.g. number of subscales) and in scoring. Although

there was a need for a continuous version, it was not clear what

changes would be needed to update the scales to meet current

needs, good test specifications and modern developmental research.


Developmental concerns about a child can arise by a number of

different routes, and further evaluation is often required to identify

potential difficulties that may necessitate intervention or special edu-

cation services (Marlow, 2018; Sharma, 2011). Four comprehensive

standardized assessment measures with different theoretical back-

grounds were in use in 2011 (Bedford, Walton, & Ahn, 2013). The

Battelle Developmental Inventory, Second Edition (BDI-2) (Newborg,

2006) measures a child’s progress sequentially along a developmental

continuum of critical skills and behaviours from simple to complex

through both global domains and discrete skill sets. The Bayley Scales

of Infant and Toddler Development, Third Edition (Bayley-III) (2006) is

eclectic. It has been developed from a variety of different scales of

infant development and infant and toddler research (Bayley, 2006)

and was formulated on the principle that it measures underlying traits

or latent factors. Confirmatory factor analysis (CFA) demonstrated

construct validity by evaluating relationships between test scores and

different underlying traits/factors. The authors concluded that the

test scores best modelled three underlying traits: motor, language and

cognitive factors (Sun et al., 2019). The Mullen Scales of Early Learn-

ing (MSEL) (1995) have a theoretical foundation based on the con-

cepts of neurodevelopment and intrasensory and intersensory

learning. The Griffiths Scales have five avenues of learning: locomotor,

personal–social, hearing and speech, eye and hand coordination, and

performance. As well as normed comparison against a standardized

population, a child’s developmental profile can be produced for dis-

tinct avenues of learning.


This paper highlights six key phases of the Griffiths III revision

process and the value of having a guiding plan that includes test prac-

titioner input.


2.1 | Phase 1: Literature review, stakeholder
feedback and market research

A comprehensive literature review from the start of the millennium

revealed little focused research on the assessment of children’s devel-

opment. Research focused mainly on cognitive development, with

particular emphasis on memory and working memory capacity, speed

of information processing, logical reasoning and attention within this

developmental domain (Best & Miller, 2010; Fuchs et al., 2010; Haden

et al., 2011; Pellegrini, 2009). In addition, there has been an extensive

focus on the social development of children, specifically on parent–

child relationships, attachment theories and children’s social judge-

ment. Other important areas are neurodevelopmental milestone

achievement (especially the visual and hearing systems); cognitive

development; working and incidental memory; attention and reason-

ing skills; behavioural manifestations of development; movement and

development; comprehensive understanding of the development of

attention; development; and multimedia literacy (Best & Miller, 2010;

Case, Demetriou, Platsidou, & Kazi, 2001; Demetriou & Kazi, 2006).

The six areas underpinning the original Griffiths Scales (Griffiths,

1954) (viz. locomotor, personal–social, language, eye and hand coordi-

nation, performance and practical reasoning) remain important. The

literature review indicated limited literature, particularly from the

United Kingdom, in relation to the need, cost, efficacy and benefits of

developmental testing including the content of such tests. Nor was

there literature on what is needed to update a developmental test

(Aylward, 2009) (Figure 1).

The GMDS-ER has been translated into Italian, French,

Portuguese, Chinese and Russian since its publication in English in

2006. Research studies since 2010 confirm the use of the Griffiths

Scales in assessing children in special populations such as HIV-

exposed or infected children (Lowick, Sawry, & Meyers, 2012; Perez

et al., 2015; Springer, Laughton, Tomlinson, Harvey, & Esser, 2012),

aboriginal infants (McDonald, Comino, Knight, & Webster, 2012;

McDonald, Webster, Knight, & Comino, 2014), measuring the effects

of various surgical procedures or noxious environments or treatments

(Battaglia et al., 2012; Ebbink, Aarsen, Van Gelder, & Van Den

Hout, 2013; Hemels et al., 2012; Laughton et al., 2012; Ostrea

et al., 2012; Rahkonen et al., 2012; Van Der Aa et al., 2013; Peroni

et al., 2014; Van Dyk, Ramanjam, Church, Koren, & Donald, 2014),

genetic groups such as Duchenne muscular dystrophy (Bargagna,

Bozza, Purpura, & Luongo, 2012; Colombo et al., 2014; Chieffo

et al., 2015; Pane et al., 2012, 2013) and multiple births (Tsekoura,

Beli, Boutopoulou, & Orfanidou, 2012).

Once revised, the Griffiths Scales are likely to continue

to be used widely internationally. Consequently, consideration needed

to be given to cultural issues. These may be country specific or related

to low income economies. Leung and Barnett (2008) stated that there

is a great need for culturally sensitive and appropriate psychological

assessment where relevant issues include competence of administrators,

test selection, contextual relevance of item content, adaptation

and translation, administration, and assessment and interpretation of

performance. It was decided to concentrate on the production of a

developmental test that focused on core, universal aspects of child

development with an initial standardization in the United Kingdom

and Republic of Ireland on children who spoke English at home but

with sufficient cross-cultural sensitivity built in. The Griffiths III can

then be adapted as necessary in different countries and contexts

where it is used. The Guidelines for Translating and Adapting Tests

(2nd ed.) of the ITC (2017) could serve as a valuable resource in the

adaptation of the Griffiths III in various countries and contexts.

FIGURE 1 The GMDS Revision—setting the landscape summarizes the six phases of the revision process. Reproduced with permission from
ARICD from Stroud et al. (2016)


A qualitative, exploratory, descriptive research approach was used

to explore the opinions and attitudes of child development specialists in

order to develop a practitioner perspective on the structure and content

of the next version. Descriptive research can be an effective analysis of

nonquantified topics and issues. In this research approach, qualitative

data take the form of text, written words, phrases or symbols describing

or representing people, their experience, actions, thoughts, knowledge

and opinions. In contrast to quantitative research that relies on the use

of statistics and measurements, qualitative research is naturalistic, par-

ticipatory and interpretative (Kerlinger & Lee, 2000). An exploratory

study is relevant because it serves as an exploration of a relatively

unknown research area, that is, the revision of the Griffiths Scales.

Therefore, in order to clarify what practitioners think the Griffiths

Scales should include in the 21st century, an exploratory qualitative

descriptive approach was chosen with the following aims:

• to establish current thinking regarding child development and its


• to consider new constructs needed for developmental assessment

in addition to existing constructs;

• to agree on the geographical area for initial standardization of the

revised version of the Griffiths Scales;

• and to establish a detailed description of the structure and content

of the new revised version of the Griffiths Scales.

There were several stages in the present research. Findings from

each sequential stage influenced the design of subsequent stages, as

shown in Figure 2.

2.1.1 | Stage 1: Avenues of learning workshop

Nineteen experienced practitioners (nine paediatricians, eight psychol-

ogists and two allied health professionals) attended a workshop. The

discussion resulted in qualitative ‘sticky note’ data in response to the

following questions:

FIGURE 2 Flow chart showing sequence of
research stages in Phase 1


What are the ideal components of a developmental test for


What is the new child development knowledge that is not

in current tests?

What new advances/knowledge do we not want in devel-

opmental assessment?

What areas of emotional/social development would test

users like included in developmental assessment?

2.1.2 | Stage 2: Qualitative interviews

The revision team used critical analysis of the literature review and

the sticky note workshop data to guide the identification of both

open-ended and individually specific interview questions for nine

expert practitioners, two of whom had attended an earlier workshop.

Two thirds of this expert group had not used the existing Griffiths

Scales and were chosen to provide an opportunity to supplement the

views of regular Griffiths users. The expert practitioners worked in a

range of clinical and research environments and had varied psycholog-

ical, medical and allied health professional training. Open-ended ques-

tions posed to these experts by two of the authors included the

following: why they use or do not use the Griffiths Scales and

whether they identified key constructs or areas in child development

that the Griffiths Scales or other developmental scales are not ade-

quately tapping as important aspects for developmental assessment.

Thematic analysis of interview scripts and synthesis of thematic data

from earlier phases led to the description of core guiding principles to

underlie the development of the new edition of the Griffiths Scales.

2.1.3 | Stage 3: Questionnaire

A practitioner questionnaire was developed with an open-ended

question format to test the core guiding principles of the Griffiths

Scales and to obtain data on the requirements of these practitioners.

Questionnaires were sent to 432 registered practitioners of the

Griffiths Scales in 17 countries for whom there was an available email

address. The questionnaire consisted of three sections:

Section A required biographical information (name, qualifi-

cation and address) from the respondents and a yes/no

response to a question about current use of the Griffiths


Section B listed 20 questions for practitioners currently

using the Griffiths Scales, testing their opinions on the core

guiding principles of the Scales.

Section C listed 5 questions for Griffiths registered

practitioners not currently using the Griffiths Scales, testing

reasons for their non-use of the measure, in particular, and

their opinion on the future role of developmental testing in


With the use of the script completed by the practitioners, itera-

tive analysis was performed for identifiable patterns or themes. The

qualitative data were coded separately per question for the users and

nonusers (Braun & Clarke, 2006). Similar responses were grouped

together into categories. This was done by making use of direct

quotes or interpreting common ideas (Aronson, 1994; Nowell, Norris,

White, & Moules, 2017).

2.2 | Phase 2: Construct development and task review

A team of seven practitioners (four psychologists and three paediatricians)

was established to provide both individual leadership for each of the

‘avenues of learning’ (that is, the domains that would be the focus of

the new subscales of the revised measure) and mechanisms to

provide continuity across the process and maintain financial integrity.

Updated subscale definitions were agreed to inform task development.

Current items were examined according to updated theory and current

clinical practice. Fine-grained analysis of the existing items used under-

pinning constructs derived from the literature review and Phase 1. Gaps

were identified using the constructs and a construct map.

2.3 | Phase 3: Item design

Subscale leaders identified constructs relevant to their subscale,

designed a statement of purpose for that subscale, analysed existing

Griffiths Scale tasks to identify subscale suitability, identified

gaps/overlaps and developed new tasks. Critical markers were

established. Cross-subscale cohesion was checked by the full practitioner

team. Percentages of achievement for each age year were calculated

for every task using data from the GMDS-ER. Normally, items

with a difficulty level of 20% to 80% are included in a test

(Boopathiraj & Chellamani, 2013, p. 191), but a cut-off of 10% was

used to ensure that an adequate ceiling was achieved at 6 years.
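The item-difficulty screen described in this phase can be sketched as follows. This is a minimal illustration, not the revision team's procedure: the task names and pass/fail records are hypothetical (they are not GMDS-ER data), and reading the 10% cut-off as a lowered bound on the usual 20% to 80% band is an assumption, since the text does not spell out how the cut-off was applied.

```python
from collections import defaultdict


def make_records(task, age_year, n_passed, n_total):
    """Build hypothetical (task, age_year, passed) rows for illustration."""
    return [(task, age_year, i < n_passed) for i in range(n_total)]


# Hypothetical data, not taken from the GMDS-ER.
records = (
    make_records("task_A", 6, 9, 10)
    + make_records("task_B", 6, 5, 10)
    + make_records("task_C", 6, 1, 10)
)


def percent_achieved(rows):
    """Percentage of children passing each task within each age year."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for task, age_year, ok in rows:
        total[(task, age_year)] += 1
        passed[(task, age_year)] += int(ok)
    return {key: 100.0 * passed[key] / total[key] for key in total}


def retained(rates, lower, upper=80.0):
    """Tasks whose percentage achieved lies within [lower, upper]."""
    return sorted(task for (task, _age), pct in rates.items() if lower <= pct <= upper)


rates = percent_achieved(records)
conventional = retained(rates, lower=20.0)  # the usual 20% to 80% band
relaxed = retained(rates, lower=10.0)       # assumed reading of the 10% cut-off

print("Retained at 20% to 80%:", conventional)
print("Retained at 10% to 80%:", relaxed)
```

In this toy data set, the hardest task (passed by 1 child in 10 at age 6) is dropped under the conventional band but kept under the relaxed one, which is the kind of difficult item needed to provide a ceiling for the oldest children.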

2.4 | Phase 4: Piloting and standardization

Pilot testing was arranged to check task constructs, gather statistical

information and make further refinements in order to produce a final

standardization version. The test was piloted in South Africa with a

culturally diverse team of practitioners who provided feedback that

allowed for refinement of item instructions and scoring.


For the final standardization sample, a strategy was developed to

accommodate a continuous norming solution. The key descriptive

indicators used to select and classify children were geographic loca-

tion, gender, age, urban–rural and socio-economic status by using the

Indices of Deprivation (Department for Communities and Local

Government, 2014). Targets were set for geographic locations, with

suggested month age targets for each area.

2.5 | Phase 5: Production and release of Griffiths III

The test kit was produced, the record book finalized and three

parts of the manual were drafted and finalized with publication in

July 2016.

2.6 | Phase 6: Training

E-learning modules were set up for a Griffiths III conversion course

for registered Griffiths practitioners and Griffiths III Part I for new

users. A 3-day face-to-face course was designed and refined to pro-

duce a final recommended model. Additional training was provided

(both e-learning modules and face to face) for Griffiths tutors with a

registration process for Griffiths III tutors.


3.1 | Phase 1 results

Stage 1. Nineteen experienced practitioners produced sticky note

data as shown in Table S1. Questions that arose during the

workshop included the following: are feelings and emotions

part of a cognitive scale? should self-help items be included

in a Developmental Quotient? and do we need preterm


Stage 2. Themes emerging from the expert qualitative interviews are

shown in Table S2.

Table 1 provides details of the Core Guiding Principles identified

with supportive evidence from the literature review, from thematic

analyses of both sticky note workshop data and the qualitative interviews

with the expert practitioners, and from the revision team’s expert

clinical knowledge.

TABLE 1 Identified core guiding principles for development of new Griffiths Scales

Identified principle | Supportive evidence

1. The core of the GMDS should remain the core; in other words, it needs to answer the clinician’s question of ‘is this child developing like other children?’ | Literature review (areas are still the main areas); Thematic Analysis

2. The underlying premise of the test should remain the structured observation of children using play. | Ruth Griffiths (free behaviour of children in a semistructured way but with rigorous control of conditions); Thematic Analysis

3. The purpose of the GMDS is to measure general development. | ARICD revision team; Expert Interviews; The Market Gap; Thematic Analysis

4. The breadth of the GMDS remains important, as one test cannot measure … | ARICD revision team; Expert Interviews; Thematic Analysis

5. Specialists for particular contexts can adapt the test, such as in practice and for … | Expert Interviews; Thematic Analysis

6. The GMDS must be able to identify ‘flags’ in development, which could be analysed … | Sticky Note data; ARICD revision team; Thematic Analysis

7. The GMDS must be usable by both the practitioner and the researcher for their respective purposes. | Literature Review; Expert Interviews; Thematic Analysis

8. The main structure of the test should remain with the possibility of developing a supplementary set of GMDS scales that cover second order factors such as working memory, processing speed, attention, and socio-emotional and behavioural aspects of development. | Expert Interviews; Other Psychometric Measures; Thematic Analysis

Abbreviations: ARICD, Association for Research in Infant and Child Development; GMDS, Griffiths Mental Development Scales.


Stage 3. Completed questionnaires were received from 85 respon-

dents (a 20% response rate). All respondents were paediatri-

cians or psychologists: 52 used the Griffiths Scales regularly

(regular users), and 33 did not use the scales in their current

work (nonusers). Respondents worked in 15 different coun-

tries in Europe, Australasia, Africa and Asia.

Regular users reported using the scales for developmental assessment;

assessment of special groups; diagnosis, intervention, planning and

monitoring; assessment linked to the school context; and training, research,

supervision and clinical trials. There was a balance of opinion (yes = 19,

no = 16, no clear response = 15) from regular users of the Griffiths

Scales on the need to incorporate more screening elements. Nonusers

cited practical reasons such as insufficient time allocated by managers;

work in inappropriate service; and wrong age range of the scales.

Both regular users and nonusers were asked: ‘Will there be a

place in your professional work for developmental testing (as distinct

from cognitive or physical testing) in future?’ A large proportion of

both regular users of the scales (46 of the 51; 90%), and nonusers

(21 of the 29; 72%) stated that a place remained for developmental

testing in their work. The overarching themes from the questionnaire

analysis were as follows:

• To revise the Griffiths Scales according to a criterion referenced (CRT) construction process with a subsequent normative approach.

• To reduce the test ceiling from 8 to 6 years.

• To retain the traditional Griffiths Scales assessment format of structured observation of children at play.

• To return to the original Griffiths Scales age group structured format.

• To merge two subscales and create a new subscale of cognitive functioning.

• To incorporate memory tasks across all subscales.
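The percentages reported for the Stage 3 questionnaire can be re-derived from the counts given in the text. The short check below uses only figures stated in this paper; the variable names are illustrative.

```python
def percent(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


# Counts as reported in the text.
questionnaires_sent = 432
respondents = 85
regular_users, nonusers = 52, 33

response_rate = percent(respondents, questionnaires_sent)
users_seeing_a_place = percent(46, 51)     # regular users answering the question
nonusers_seeing_a_place = percent(21, 29)  # nonusers answering the question

print(f"Response rate: {response_rate:.1f}%")
print(f"Regular users seeing a place for developmental testing: {users_seeing_a_place:.1f}%")
print(f"Nonusers seeing a place for developmental testing: {nonusers_seeing_a_place:.1f}%")
print(f"Groups sum to respondents: {regular_users + nonusers == respondents}")
```

The rounded values agree with the reported 20%, 90% and 72%. The denominators for the final question (51 and 29) are smaller than the group sizes (52 and 33), which would be consistent with some respondents not answering that item, although the paper does not state this.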

3.2 | Phase 3 results

An experimental version of the new scales was constructed.

3.3 | Phase 4 results

The final normative and standardization sample comprised 426 chil-

dren from the United Kingdom and Republic of Ireland. Further details

are included in Stroud et al. (2016). Raw scores obtained on Griffiths

III were transposed to scaled scores, developmental quotients (per

subscale as well as a general developmental quotient), percentiles, sta-

nines and developmental age equivalent (per raw score).
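For readers unfamiliar with these derived scores, the sketch below shows how a standardized score relates to a quotient, a percentile and a stanine. It is an illustration under stated assumptions, not the Griffiths III norming procedure: the quotient scale (mean 100, SD 15) is a common convention that this paper does not confirm, the stanine rule is a widely used approximation, and the continuous norming strategy mentioned in Phase 4 is not modelled here.

```python
from statistics import NormalDist

QUOTIENT_MEAN = 100.0  # assumed convention, not stated in the source
QUOTIENT_SD = 15.0     # assumed convention, not stated in the source


def derived_scores(z: float) -> dict:
    """Convert a standardized score (z) into three common derived scores."""
    quotient = QUOTIENT_MEAN + QUOTIENT_SD * z
    percentile = 100.0 * NormalDist().cdf(z)
    # Stanines have mean 5 and SD 2, bounded to the range 1..9 (approximation).
    stanine = min(9, max(1, round(2.0 * z + 5.0)))
    return {"quotient": quotient, "percentile": percentile, "stanine": stanine}


example = derived_scores(-1.0)  # a child one SD below the mean for their age
print(example)
```

Under these assumptions, a child scoring one SD below the age mean would receive a quotient of 85, a percentile of about 16 and a stanine of 3.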


Phase 1 of this revision was designed so that the test users could

have the opportunity to have a reciprocal relationship with the test

development team, as recommended by the ITC (2017). By fostering

this relationship, test developers are able to draw on the knowledge

and experience of users and to develop and revise tests that meet

the needs of those who employ them in daily practice (Adams, 2000). In

later phases, the test development team members were both practi-

tioners and previous tutors of the Griffiths Scales. The results from

the Avenues of Learning practitioner workshop and the qualitative

interviews with the expert practitioners provided a wide range of

possible future inclusions for the Griffiths III. Some of these data

assisted in the clarification of the necessary core constructs, as well

as new areas for item development. Practitioners valued an overall

developmental measure with discrete data about and within the

‘avenues of learning’, allowing them to analyse a child’s strengths

and weaknesses.

A number of broad, fundamental questions were examined by

the practitioners in the research. Do the Griffiths Scales tap into

the ‘right’ child development areas? How ‘big’ (out of the box) should

the thinking be in restandardizing the Scales? The inclusion of prac-

titioners in all parts of a scale revision is unusual and is likely to

add to the test’s validity. Practitioners clarified the need for for-

mal assessment of social and emotional development and also a

reduction as much as possible of scoring relying on behaviour

which was reported rather than observed. The revision should set

the Griffiths Scales apart from other developmental assessment

measures and retain its unique quality valued by practitioners as

they have been involved throughout the revision process. Practi-

tioners offered valuable input on the sensitivity and specificity to

identify where development deviates from the norm. It is impor-

tant to recognize that once developmental tasks have been identi-

fied and established, and once sensitive …
