
Young L2 learners' narrative discourse

2015
978-3-8233-7903-4
Gunter Narr Verlag 
Christine Möller

How do text and discourse abilities develop in second language learners? This monograph addresses this question, which has so far remained largely unanswered, with respect to early foreign language acquisition. It examines narrative texts produced by elementary school students taking part in an English-language immersion program in Germany, an intensive form of CLIL or bilingual teaching. Starting from a psycholinguistic model of discourse production, the volume considers the development of the narratives' coherence at the macro level of textuality and the development of cohesion at the micro level. The analytical models chosen are based on story grammar and on cohesive devices as described by Halliday and Hasan (1976). The volume is one of the first studies of coherence and cohesion in second language acquisition and, to date, the only such study in the context of early foreign language teaching. The main target groups are linguists interested in language acquisition, young children's narratives, and text/discourse linguistics, as well as language teaching specialists interested in early foreign language teaching and bilingual education.

Young L2 learners’ narrative discourse
Coherence and cohesion
Christine Möller

Multilingualism and Language Teaching, Volume 3
Edited by Thorsten Piske (Erlangen), Silke Jansen (Erlangen) and Martha Young-Scholten (Newcastle)

Bibliographic information of the Deutsche Nationalbibliothek: The Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de.

© 2015 · Narr Francke Attempto Verlag GmbH + Co. KG
Dischingerweg 5 · D-72070 Tübingen

This work, including all of its parts, is protected by copyright. Any use beyond the narrow limits of copyright law without the publisher's consent is prohibited and punishable by law. This applies in particular to reproduction, translation, microfilming, and storage and processing in electronic systems.

Printed on chlorine-free bleached, acid-free book paper.

Internet: www.narr.de
E-Mail: info@narr.de

Printed in Germany
ISSN 2197-6384
ISBN 978-3-8233-6903-5

For my big and little storytellers

Table of contents

Detailed table of contents
Acknowledgements
Abbreviations
1 Introduction
1.1 What is (narrative) discourse?
1.2 Why (narrative) discourse?
1.3 The present study: Goals and outline
2 Approaches to narrative discourse
2.1 Coherence: Story grammar
2.2 Cohesion
2.3 Summary
3 Narrative production: From cognition to coherence and cohesion
3.1 A simplified model of narrative discourse production
3.2 Top-down versus bottom-up organization: Narrative coherence and cohesion
3.3 The story grammar approach to narrative coherence
3.4 Cohesion: From references to lexical cohesion
3.5 Narrative production: Summary and research questions
4 The development of storytelling
4.1 The development of narrative coherence in L1 acquisition
4.2 The development of narrative coherence in L2 and bilingual acquisition
4.3 The development of cohesion in L1 acquisition
4.4 The development of cohesion in L2 and bilingual acquisition
4.5 The relationship between coherence and cohesion
4.6 Summary and hypotheses
5 Research design
5.1 Participants and data collection
5.2 Method of analysis
5.3 Addressing the dangers of the comparative fallacy
6 The development of coherence: Results
6.1 Total number of narrative components
6.2 Individual narrative components
6.3 Index of global narrative structure
6.4 Narrative coherence: Summary
7 The development of cohesion: Results
7.1 Overall cohesive density
7.2 Cohesive density: Subcategories
7.3 Contribution of the subcategories to the overall cohesive density
7.4 Cohesion results: Summary
8 The relationship between (the development of) coherence and cohesion
9 General discussion
9.1 Coherence and cohesion: Similarities and differences
9.2 The relationship between (the development of) L2 cohesion and coherence
9.3 Learner variables and their impact: Grade, sex and preschool experience
9.4 Interindividual variation
9.5 Coherence and cohesion: Cognitive and linguistic development
9.6 Limitations of the study
9.7 Conclusions for the effectiveness of immersion programs
10 Bibliography
11 Appendix
11.1 Individual cohesion scores ordered by frequency
11.2 Individual cohesion scores ordered by frequency and sex
11.3 Individual cohesion scores by frequency and experience group
11.4 Individual coherence and cohesion scores
12 List of figures
13 List of tables

Detailed table of contents

Acknowledgements
Abbreviations
1 Introduction
1.1 What is (narrative) discourse?
1.2 Why (narrative) discourse?
1.3 The present study: Goals and outline
2 Approaches to narrative discourse
2.1 Coherence: Story grammar
2.2 Cohesion
2.3 Summary
3 Narrative production: From cognition to coherence and cohesion
3.1 A simplified model of narrative discourse production
3.2 Top-down versus bottom-up organization: Narrative coherence and cohesion
3.3 The story grammar approach to narrative coherence
3.3.1 Story grammar
3.3.2 Story grammar: Narrative components
3.4 Cohesion: From references to lexical cohesion
3.4.1 References
3.4.2 Substitution and ellipsis
3.4.3 Connectives
3.4.4 Lexical cohesion
3.5 Narrative production: Summary and research questions
4 The development of storytelling
4.1 The development of narrative coherence in L1 acquisition
4.2 The development of narrative coherence in L2 and bilingual acquisition
4.3 The development of cohesion in L1 acquisition
4.4 The development of cohesion in L2 and bilingual acquisition
4.5 The relationship between coherence and cohesion
4.6 Summary and hypotheses
5 Research design
5.1 Participants and data collection
5.1.1 The Kiel Immersion Project
5.1.2 Participants
5.1.3 Data collection: Materials and procedure
5.2 Method of analysis
5.2.1 Narrative coherence: The structure of “Frog, where are you?”
5.2.2 Cohesion
5.2.2.1 Coding into clauses
5.2.2.2 Cohesion methodology
5.2.2.3 References
5.2.2.4 Substitution and ellipsis
5.2.2.5 Connectives
5.2.2.6 Lexical cohesion
5.2.3 Statistical methods
5.3 Addressing the dangers of the comparative fallacy
6 The development of coherence: Results
6.1 Total number of narrative components
6.1.1 Total number of components: Observed results
6.1.1.1 Overall results
6.1.1.2 Total number of components by sex
6.1.1.3 Total number of components by experience group
6.1.2 Total number of components: Statistical results
6.1.3 Summary: Total number of narrative components
6.2 Individual narrative components
6.2.1 Individual narrative components: Observed and statistical results
6.2.1.1 Individual narrative components by sex
6.2.1.2 Individual narrative components by experience group
6.2.2 Summary: Individual narrative components
6.3 Index of global narrative structure
6.3.1 Index construction and reliability
6.3.2 Global narrative index results
6.3.2.1 Overall results
6.3.2.2 Index by sex
6.3.2.3 Index by experience group
6.3.3 Summary: Narrative index
6.4 Narrative coherence: Summary
7 The development of cohesion: Results
7.1 Overall cohesive density
7.1.1 Observed results overall cohesive density
7.1.2 Overall cohesive density by sex
7.1.3 Overall cohesive density by experience group
7.1.4 Overall cohesive density: Statistical results
7.1.5 Summary: Overall cohesive density
7.2 Cohesive density: Subcategories
7.2.1 References
7.2.1.1 Overall reference density
7.2.1.2 Reference density by sex
7.2.1.3 Reference density by experience group
7.2.1.4 Referential density: Statistical results
7.2.1.5 Summary: Referential density
7.2.2 Connectives
7.2.2.1 Overall connective density
7.2.2.2 Connective density by sex
7.2.2.3 Connective density by experience group
7.2.2.4 Connective density: Statistical results
7.2.2.5 Summary: Connective density
7.2.3 Substitution and ellipsis
7.2.3.1 Substitutions
7.2.3.2 Overall ellipsis density
7.2.3.3 Ellipsis density by sex
7.2.3.4 Ellipsis density by experience group
7.2.3.5 Ellipsis density: Statistical results
7.2.3.6 Summary: Substitution and ellipsis density
7.2.4 Lexical cohesion
7.2.4.1 Overall lexical density
7.2.4.2 Lexical density by sex
7.2.4.3 Lexical density by experience group
7.2.4.4 Lexical density: Statistical results
7.2.4.5 Summary: Lexical density
7.3 Contribution of the subcategories to the overall cohesive density
7.3.1 Overall contribution
7.3.2 Contribution by sex
7.3.3 Contribution by experience group
7.3.4 Qualitative changes
7.3.5 Summary: Contribution of the subcategories to cohesive density
7.4 Cohesion results: Summary
8 The relationship between (the development of) coherence and cohesion
9 General discussion
9.1 Coherence and cohesion: Similarities and differences
9.2 The relationship between (the development of) L2 cohesion and coherence
9.3 Learner variables and their impact: Grade, sex and preschool experience
9.4 Interindividual variation
9.5 Coherence and cohesion: Cognitive and linguistic development
9.6 Limitations of the study
9.7 Conclusions for the effectiveness of immersion programs
10 Bibliography
11 Appendix
11.1 Individual cohesion scores ordered by frequency
11.2 Individual cohesion scores ordered by frequency and sex
11.3 Individual cohesion scores by frequency and experience group
11.4 Individual coherence and cohesion scores
11.4.1 Coherence scores: Overall measures
11.4.2 Coherence scores: Individual components
11.4.3 Cohesion scores
12 List of figures
13 List of tables

Acknowledgements

First of all, very special thanks to Henning Wode, without whom I would never have set foot on immersion territory, and to all teachers, students, and parents of Claus-Rixen-Schule! There are too many other people to thank each of them individually, but here are at least some of those who deserve a very special mention: Anna Zaunbauer, who read it all and, moreover, helped out whenever I had any statistics questions; her help was invaluable; Andreas Rohde, who was the second to read the manuscript in its entirety and gave important feedback; Claudia Claridge, who answered many major and minor questions over the years and always had an open ear for more; Jack Chambers, “the great encourager”, who read and commented on parts of the manuscript; Manfred Pienemann, for helping out when help was most needed; and finally Petra Jaecks and Christiane Gross, who rechecked the statistics in the final manuscript.
Many thanks also to the great many others who, in some way or other, contributed to the completion of this book or to making it possible in the first place: Sumru Akcan, Susanne Bendixen, Catherine Bennewitz, Petra Burmeister, Ping Deters, Johanna Gerwin, Swantje Hachmann, Kristin Kersten, Peter Kriwy, Normand Labrie, Christina Lupa, Matthias Meyer, Thorsten Piske, Ruth Pasternak, David Reuter, Bianca Sauer, Astrid Schmidt, Anja Steinlen, Stefanie Tacken, and Natalie Vellmer.

Last but not least, many thanks to my mother and especially to my husband for his seemingly endless patience, support, and encouragement.

Abbreviations

Bili: Bilingual experience group, i.e. children who attended a bilingual English-German preschool before joining the English immersion program.
IM: Immersion
L1: First language learned by an individual
L2: Any language learned after the L1
Mono: Monolingual experience group, i.e. children who attended a monolingual German preschool before joining the English immersion program.
SD: Standard deviation
SLA: Second language acquisition

1 Introduction

It is somewhat striking that so much research effort and interest has been focused on trying to understand how children come to learn the sounds, words, and syntax necessary to produce sentences in their native language, given that very little real language use is confined to the sentence level. (Pan & Snow 1999: 229)

Even though the situation has changed somewhat and a substantial body of research on phenomena above the sentence level is now available, the majority of these studies have been conducted on monolingual speakers and first language (L1) learners. Far fewer studies have targeted bilinguals and second language (L2) learners, and still fewer have addressed learners in immersion (IM) programs.¹ By looking at the development of narrative discourse in an English immersion program at an elementary school in Germany, the present study seeks to contribute to filling this gap.

1.1 What is (narrative) discourse?
Discourse can be defined as the “use of language beyond a single sentence” (Bamberg & Moissinac 2003: 395). This definition encompasses not only the written and oral modes but also a broad range of discourse types, from conversation to more specific genres such as narratives (ibid.). Discourse used in this sense refers to the same phenomenon that other authors describe as text.² Halliday and Hasan, for example, define text as “any passage, spoken or written, of whatever length, that does form a unified whole” (1976: 1). In the present study, discourse and text will be used synonymously in the sense conveyed by both quotations. However, both terms will be used to refer to extended discourse (Pan & Snow 1999), i.e. a sequence larger than just a couple of sentences.

Narrative as a particular discourse type includes subgenres such as (fairytale/make-believe/fictional) stories and personal narratives, i.e. narratives about personal experience (e.g. Bamberg & Moissinac 2003, Sperry & Sperry 1996, Hicks 1991). Consequently, narrative discourse can be defined as any spoken or written piece of extended discourse associated with the particular discourse genre narrative.³,⁴

¹ I will be using the term second language to refer to any language learned in addition to the first one, regardless of context.
² Yet other authors distinguish between discourse as a more dynamic and text as a static entity (e.g. Cutting 2002: 2, Johnstone 2002: 2, Hoey 1996, Clark 1994). Both views differ in turn from discourse in a Foucaultian tradition, where it is seen as a thematically driven, superordinate communicative entity realized by a network of singular texts (e.g. Warnke 2008).
³ This definition of narrative is kept deliberately vague. For discussions of the ongoing debate about what characterizes narratives (as well as stories) see, for example, Herman (2009), Renkema (2004: 191ff.), Richardson (2000), Boueke et al. (1995) and Stein (1982). The present study follows Berman and Slobin’s (1994) approach of defining narrative/story simply as the participants’ productions in response to the picture-elicited storytelling task used for data collection (cf. ch. 5).
⁴ In the following, the terms narrative, narrative discourse and narrative text will be used interchangeably.

1.2 Why (narrative) discourse?

As the initial quote from Pan and Snow already indicates, discourse is an important part of human communication. Accordingly, discourse features are included in all influential models of language competence or performance (e.g. Bachman 1990) and in all major language assessment frameworks (e.g. Cambridge Certificate of Proficiency in English, TOEFL or the Common European Framework of Reference for Languages (Council of Europe 2001)), even if there is no general agreement on what exactly discourse competence encompasses and how to test it.

But what makes discourse special? The rules of grammar operate on a very local level, predominantly on clauses and sentences, and not usually across sentence boundaries. The rules of discourse production, on the other hand, (also) operate on larger stretches of spoken or written text. Since each rule system functions at a different level, local grammaticality is a priori unrelated to discourse requirements (cf. e.g. Givón 1995).⁵ A series of sentences as in example (1.1), for instance, would more likely be accepted as discourse (here: a story) than (1.2), even though the morphosyntax is target-like in (1.2) but not in (1.1):

(1.1) Boy go school. Friend bad, take bike. Boy cry, no bike.
(1.2) The boy goes to school every day. He owns a bike. The boy’s brother also has a bike.

That is, discourse production, be it in an L1 or L2, may involve the formation of grammatical sentences, but more importantly it requires discourse-specific abilities. These range from the “social-cognitive sensitivity to communicative setting” (Berman 2008: 763), including familiarity with different genres, to command of the linguistic means of connecting stretches of speech or writing and the cognitive abilities to pre-plan for this linguistic as well as for a content-related, structural connectedness (Berman 2001).

In studying the (development of) discourse abilities of monolingual normally developing, brain-damaged or language-impaired adults and children (e.g. Manolitsi & Botting 2011, Epstein & Phillips 2009, Reilly et al. 2004 & 1998, Norbury & Bishop 2003, Manhardt & Rescorla 2002, Berman & Slobin 1994, Joanette & Brownell 1990), the study of narratives has proved especially useful, since, due to their strong socio-cultural importance (e.g. Bamberg & Moissinac 2003, Stein & Policastro 1984), narratives occur in conversation from a relatively early age (e.g. Nelson 1986, cf. also ch. 4).⁶ Because of this, narratives are often considered the most important discourse genre (e.g. Bamberg & Moissinac 2003, Reilly et al. 1998).

Discourse competence deserves special attention in language acquisition studies, since it is a valuable asset in any educational context; comprehending and producing various discourse types is an important part of most curricula, be it “merely” oral and written stories or more specifically academic discourse types such as oral and written expository texts. Oral discourse skills in turn, especially oral narrative competence, have been found to be a significant predictor of literacy-related skills (e.g. Reese et al. 2010, Chang 2006, Griffin et al. 2004, Blankman et al. 2002, Dickinson & McCabe 2001, Snyder & Downey 1991, Norris & Bruning 1988).⁷ However, aspects of oral narrative competence can even be indicative of academic achievements unrelated to literacy: Fazio and colleagues (1996), for example, found that story retelling was the best single kindergarten predictor of the future academic status of their participants receiving academic remediation, while O’Neill and colleagues (2004) showed that the use of connectives is related to later mathematical achievement.

⁵ Even if the production of syntactically and morphologically target-like clauses and sentences facilitates understanding.
⁶ As opposed to other discourse types, e.g. expository texts, which are introduced only later in formal schooling and whose development lags behind accordingly (Berman 2008, Berman & Verhoeven 2002: 18).
⁷ This can be attributed to oral narratives’ conceptual closeness to written discourse, i.e. their making use of many (linguistic) features otherwise associated with a written discourse style (Koch & Oesterreicher 1994). It should be kept in mind, however, that any discourse type’s linguistic and content structure follows norms determined, in an educational context, by the respective (educated) majority culture and may therefore not apply to all parts of a population (cf. Gumperz et al. 1984, Scollon & Scollon 1984, Michaels & Collins 1984).

1.3 The present study: Goals and outline

The importance of (oral) discourse competence is, of course, not limited to monolingual education. On the contrary, immersion students and other L2 learners face an even greater challenge than monolinguals when asked to produce discourse in their L2, since even for young learners there may be a gap between cognitive and linguistic skills. At the same time, immersion has been found to have an especially positive effect on participants’ L2 conversational skills and willingness to communicate (e.g. Wode 2009: 38, Baker & MacIntyre 2003, Johnson & Swain 1997, Harley et al. 1990; cf. also Smit 2008, Lazaruk 2007, Genesee 1987). But what about the production of make-believe stories, which are considered a very challenging type of narrative discourse (cf. Berman 2004: 264ff.), since they require a largely autonomous construction of text? How does this type of discourse develop in an immersion program?

The present study investigates fictional adventure stories produced by 59 first and fourth graders (mean age 6;8 and 9;8) in an early partial immersion program in the north of Germany, in which all subjects besides German language arts are taught in English. Learner variables collected were grade (first vs. fourth), sex (male vs. female), and L2 preschool experience (monolingual German vs. German-English bilingual group). Participants’ stories were obtained through a picture-elicited oral storytelling task administered at the end of both school years. These stories were then analyzed in terms of two main discourse features: cohesion and coherence, i.e. the linguistic connectedness of stretches of speech, for example via references or ellipses, and content connectedness through a global organizational structure following an underlying narrative schema. It will be argued, furthermore, that these two measures represent aspects of participants’ linguistic and cognitive development, respectively. In addition to investigating the development of cohesion and coherence from first to fourth grade, this study explores differences attributable to participants’ sex and/or preschool experience.

It should be emphasized that my study first and foremost gives a quantitative account of how coherence and cohesion develop and not a fine-grained analysis of developmental steps. At the same time, however, the detailed description of the categories of analysis in the methods chapter (ch. 5.2) presents the results of an initial in-depth qualitative analysis: Participants’ texts were analyzed in detail as to the linguistic options they chose to realize the analysis categories provided by the two underlying frameworks (story grammar and Halliday and Hasan’s (1976) approach to cohesive devices; cf. especially ch. 2 and 3).
The qualitative results obtained were then quantified, and the result of this latter step is described in the actual results section(s) of the present work. In addition to contributing to the investigation of (the development of) narrative discourse produced by young L2 learners in an immersion setting, the present study also has a more concrete goal in relation to IM programs in general: Even though bilingual education is a very old phenomenon in Europe, over time monolingual L1 education came to be seen as “natural” in most European countries and it was only in the last third of the 20th century that this view started to change again (cf. Möller 2013 & 2009). Immense progress has been made especially in the last ten years, which shows in Germany, for example, in an increase in immersion and other bilingual education programs as well as in the introduction of at least some foreign language teaching in elementary school. Nevertheless, prejudices and reservations on the part of parents and policymakers persist, as shown, for example, in recurring discussions on the importance of German as a national language (e.g. Spiegel Online 2008 & 2008a, Welt Online 2008). Therefore, the present study has a threefold aim:
1. To investigate how linguistic and content organization of (narrative) discourse develop over the four-year duration of an early partial immersion program.
2. To relate this development to participants’ cognitive and linguistic development.
3. To relate the overall results to the effectiveness of the program.
These three goals will be pursued as follows: In chapter 2 the two frameworks used in the present study, story grammar (coherence) and Halliday and Hasan’s (1976) approach to cohesion, will be discussed critically in the light of the wide range of approaches to studying (narrative) discourse production and its development.
It will be shown that these two approaches can make an important contribution to studying the development of narrative discourse, even in the light of more recent approaches, and that they are well suited for the purposes of the present study. A simplified model of (narrative) discourse production will be outlined in chapter 3 in order to illustrate the challenges involved in telling a story from a picture book. This model leads up to a detailed description of the two fundamental dimensions of texts investigated in my study, namely coherence and cohesion. Coherence will be defined as a text’s organization structure reflecting an underlying narrative schema. Cohesion, on the other hand, will be defined as the use of linguistic means to connect clauses and sentences into stretches of discourse. By extension, coherence will be defined as a cognitive measure and cohesion as a linguistic measure. Both discourse measures will be described in detail and several research questions will be posed in relation to them. In chapter 4 I will give an overview of the findings of previous studies with respect to narrative development. Studies will be presented on monolingual L1 learners and, in a separate section, on L2 and bilingual learners. The overview of prior research will show that coherence and cohesion, as they were defined in chapter 3, can be studied fruitfully (a) within my participants’ age range and (b) in L2 data. From the findings presented it will be argued that the analysis of L2 coherence allows insights into speakers’ cognitive development, since the studies presented show that the coherence of L2 productions and its development do not differ from those evident in L1 narratives once the speaker has acquired the necessary linguistic means in the L2. Chapter 4 concludes with several research hypotheses regarding the questions raised in the previous chapter. The study’s research design will be presented in chapter 5.
Here, a short overview will be given of the immersion project in which my data was collected and then I will describe the participants and the data collection. Following this I will describe in detail the method of analysis for coherence (narrative components and their realizations) and cohesion (the subcategories of cohesive devices and their realizations) and will introduce the statistical methods that were applied. In the last section of this chapter I will address the dangers of the comparative fallacy with respect to my analysis. In chapter 6 I will present the results obtained for narrative coherence with respect to the number of components produced and the individual narrative components identified in the task material. Then, complementing the latter analyses, the construction of a newly created index of global narrative structure will be described, and the results obtained with the help of this index will be given. Chapter 6 concludes with a summary of the coherence results. The results obtained for narrative cohesion will be presented in chapter 7. First of all, I will give the results obtained for the texts’ overall cohesive density and then those for the subcategories. After that the subcategories’ degree of contribution to the overall cohesion of participants’ stories will be described. Chapter 7 ends with a summary of the cohesion results. Chapter 8 first of all details the results of a correlation analysis investigating the relationship between coherence and cohesion. In addition to this, the relationship between the development of these measures from first to fourth grade is explored. In the last chapter, chapter 9, I will summarize and discuss the results of my study: I will address similarities and differences between coherence and cohesion results plus the relationship between the two measures.
Then I will focus on two recurring themes of my study, namely the influence of the learner variables of grade, sex and L2 preschool experience, and participants’ interindividual differences. In the light of my results I will then discuss the validity of coherence and cohesion as a cognitive and a linguistic measure, respectively, as well as their usefulness for studying L2 data. After that I will address some general limitations of the present study. Chapter 9 ends with a discussion of conclusions regarding the effectiveness of the current program in particular and immersion teaching in general.

2 Approaches to narrative discourse

A multitude of approaches to discourse analysis and, more specifically, to the analysis of narrative exists. Their use differs greatly, however, between the fields interested in narratives: psycholinguistics and sociolinguistics, (cognitive) psychology, anthropology, sociology and/or literary studies, to name but a few. Even though some approaches are of limited use for child first and/or second language development studies within the age range covered in the present study (i.e. age six to ten), for example critical discourse analysis (e.g. Wodak 2005, Fairclough 2003, 1995, 1992 & 1989, Wodak et al. 1990), many others promise important insights (for narratives see, for example, the overviews in Bamberg & Moissinac 2003, Bamberg 1997, and Toolan 1988; for a general overview of text and discourse analytical approaches see e.g. Schubert 2012, Hyland & Paltridge 2011, Becker Bryant 2009, Esser 2009, Johnstone 2008, Jaworski & Coupland 2006, Trappes-Lomax 2006, Gee 2005: 118ff. & 182ff., Renkema 2004). Selecting the method of analysis is of course inextricably tied to a study’s research interest, which in turn mandates the data to be collected. As described in chapter 1, the present research monograph is the result of a study carried out within the framework of a larger project, the Kiel Immersion Project (cf. ch. 
5 for a more detailed description), which sought to investigate the L2 development of German children in English immersion. Different linguistic areas were tackled, mainly morphology (e.g. Kersten 2009, Kersten et al. 2002) and the lexicon (e.g. Tiefenthal 2009, Rohde 2005, Rohde & Tiefenthal 2002). The present study is the first to investigate the elementary-school data of the project developmentally from the perspective of text/discourse analysis; it builds on two earlier small-scale studies conducted in the project (Möller 2003, Maschewski 2002). Data collection in the elementary-school part of the Kiel Immersion Project aimed for one specific text type, namely monologic oral stories elicited with the help of a picture book 1 or, as Berman most adequately defined it, “the verbalization of graphic representations of non-veridical, fictive sequences of events” (2004: 262) told to an adult listener (cf. also ch. 5). This type of discourse genre is of special importance in the educational context, since it is included in most curricula and can be related to, for example, the acquisition of literacy (cf. ch. 1). Other types of oral or written discourse data, e.g. conversational or personal narratives, were not elicited in the project. 2 The data elicitation method and material chosen in the Kiel Immersion Project therefore severely limit the choice of the method of analysis. Interactional frameworks (e.g. Quasthoff & Becker 2005, Quasthoff 1997, Hausendorf & Quasthoff 1996), for example, which require a different kind of data and data collection, need to be ruled out from the start.
1 Frog, where are you? (Meyer 1969).
2 Likewise, the only learner variables collected were age, sex and whether the child had previous experience with the L2 by having attended a bilingual kindergarten before joining the elementary school immersion program (cf. ch. 5).
The same is true, for example, for multimodality approaches (cf. 
O’Halloran 2011, Renkema 2004), since the necessary information was not collected as part of the overall project. There is agreement, however, that texts such as the narratives collected in the present study share two fundamental dimensions, namely coherence and cohesion (e.g. Schubert 2012, Renkema 2004, Graesser et al. 2003, Hickmann 2003 & 1996, Gernsbacher & Givón 1995). These two dimensions are the research constructs to be investigated. The two frameworks selected for studying these constructs are (a) story grammar (coherence) and (b) Halliday and Hasan’s (1976) classical categories of cohesion, which were updated on the basis of more recent studies. Both frameworks will be presented in depth in the next chapter (ch. 3) in relation to the general concepts of coherence and cohesion and a simple production model for storytelling. Story grammar (e.g. Mandler & Johnson 1977, Thorndyke 1977; cf. also the 1982 special issue on stories in the Journal of Pragmatics) and Halliday and Hasan’s approach to cohesion both date back to the 1970s. So why go back to these two frameworks? Story grammar and the classical Hallidayan framework for cohesion were selected, first of all, because both of them can be applied to narrative texts, cohesion being of course not limited to any specific text type. Secondly, these two approaches capture two structural extremes of text, namely the microstructure of linguistic text organization and the macrostructure of a text’s content organization, i.e. the most global and the most local organizational levels of a (narrative) text (cf. also ch. 3). Many other options available for analysis do not operate on these two “extreme” levels of narrative organization but instead on intermediate levels and/ or they cover only parts of the two approaches used in the present work, for example the study of coherence relations (e.g. 
Sanders & Spooren 2009, Taboada 2009 & 2006, Spooren & Sanders 2008, Taboada & Mann 2006a & 2006b, Asher & Lascarides 1998, Knott & Sanders 1998, Sanders et al. 1992, Mann & Thompson 1988). Yet other approaches pursue a completely different aim, even if they operate in some respects on similar structural levels, for example studies investigating form-function relationships, such as the ones conducted and triggered by Berman, Slobin and colleagues (e.g. Strömqvist & Verhoeven 2004, Verhoeven & Strömqvist 2001, Berman & Slobin 1994, Bamberg 1987), or studies investigating stories’ affective potential (e.g. Jose & Brewer 1990, Brewer & Lichtenstein 1982; cf. also the overviews in Boueke et al. 1995 and Yussen & Ozcan 1996).

2.1 Coherence: Story grammar

The two most influential approaches in the study of coherence as the global organization of narrative discourse have been highpoint analysis and story grammar (Peterson & McCabe 1983: 3) and most recent acquisition studies on story structure still build largely on one or the other (e.g. Augst et al. 2007, Boueke et al. 1995, Berman & Slobin 1994). Even though there is a certain similarity between the two frameworks in that both analyze stories with the help of predetermined structural components, they have important differences. First of all, highpoint analysis (Labov 1999 [1972], Labov & Waletzky 1967) was developed from narratives about personal experience and it serves mainly sociolinguistic and anthropological purposes (cf. Renkema 2004: 193, Toolan 1988: 46ff.). Story grammar, on the other hand, was developed on the basis of make-believe stories (cf. e.g. Mandler & Johnson 1977, Thorndyke 1977). It is a cognitive framework postulating a relationship between observed text structure and underlying cognitive structures, which thus caters to fields with a cognitive interest, e.g. psycholinguistics, cognitive linguistics, cognitive psychology or artificial intelligence research.
The second important difference relates to the way stories are structured. Highpoint analysis postulates and investigates a small number of linearly ordered narrative components in the surface structure of a narrative. Story grammars, on the other hand, typically include a larger number of components, which are seen as connected through membership in an underlying hierarchical organization structure, i.e. the story schema, even if in the surface structure of a given text the individual components may occur in a linear order. Story grammar approaches have largely been neglected in (second) language and multilingual acquisition studies, even though reduced designs following the same or at least a similar logic as the original story grammar framework have been applied successfully in a small number of studies (e.g. Becker 2005, Montanari 2004, Akinçi et al. 2001, Kupersmitt & Berman 2001, Stavans 2001, Boueke et al. 1995) mainly following Berman and Slobin’s (1994b) lead. In clinical linguistics, in contrast, “[n]arratives have become a common feature of clinical assessment and intervention” (Schneider et al. 2005) and story-grammar-based approaches are used frequently in current studies in this field (e.g. Manolitsi & Botting 2011, Epstein & Phillips 2009, Reilly et al. 2004 & 1998, Norbury & Bishop 2003, Manhardt & Rescorla 2002). 3 Story grammar approaches even form the foundation for several clinical test batteries currently in use, e.g. the Index of Narrative Complexity (INC; Petersen et al. 2008), the Edmonton Narrative Norms Instrument (ENNI; Schneider et al. 2006 & 2005), the Test of Narrative Language (TNL; Gilliam & Pearson 2004) or the Strong Narrative Assessment Procedure (SNAP; Strong 1998). Despite the framework’s comparatively long history, story grammars are therefore rightfully included among the many current research paradigms in text/ discourse analysis (e.g. Bamberg & Moissinac 2003: 407ff.; cf. 
also Hickmann 2003: 86ff., Heinemann & Heinemann 2002: 90ff.). Story grammar has been criticized in many respects right from its beginnings (e.g. Wilensky 1983 & 1982, Black & Wilensky 1982, de Beaugrande 1982, Brewer & Lichtenstein 1982 & 1981) and criticism has largely remained the same over the years (cf. for example Boueke et al. 1995: 56ff. & 67ff.). Some of the main points of criticism will be discussed in the following.
3 Some of these even use the same elicitation material as the present study, e.g. Reilly et al. 2004, Norbury and Bishop 2003, Manhardt and Rescorla 2002, Strong 1998.
Story grammars have been criticized, first of all, for not being able to represent all possible stories while also allowing the formation of non-stories (e.g. Becker 2005, Black & Wilensky 1982, Calfee 1982). This is supposed to result from (a) failing to include story-specific properties and, at least partly, from (b) missing an evaluative or affective component. The perceived lack of a specific “storiness” property in most story grammars can be defined in relation to what has been called, for example, the criterion of reportability (Labov & Waletzky 1967) and the criterion of discontinuity (Boueke et al. 1995: 84). Boueke and colleagues, for example, claim that an element of discontinuity is constitutive for a story, i.e. a story needs to contain a contrastive relation between the normal course of events and the actual episodic structure, which makes the story interesting and worth telling (1995: 136ff.). 4 However, maintaining that story grammars do not include any such element of discontinuity, anything “unusual” worth telling (cf. Becker 2005, Quasthoff 1980), falls short of understanding one of story grammars’ central components, namely the INITIATING EVENT (cf. ch. 3). The INITIATING EVENT marks the beginning of a story’s problem-resolution structure (e.g. 
Berman & Slobin 1994b, Stein & Policastro 1984: 118, Peterson & McCabe 1983: 67, Mandler & Johnson 1977: 119) and it consists of external or internal events leading to a change in the habitual state previously described in the setting (Stein & Glenn 1979: 63, Mandler & Johnson 1977: 115; cf. also ch. 3). The INITIATING EVENT thus marks exactly the departure from the “normal course of events” that Boueke et al. and others call for. While the lack of a specific evaluative/affective component, on the other hand, cannot be refuted, story grammars do incorporate evaluative/affective statements. In the SIMPLE REACTION component, for example, statements about story characters’ emotional states in relation to the initiating event are included as one possible realization (cf. Shapiro & Hudson 1991, Stein & Glenn 1979: 64, Mandler & Johnson 1977: 119). 5 The same is true for the ENDING, where evaluative or affective statements in relation to the end of the plot are listed as one of the possible realizations. It is certainly true, however, that evaluative/affective components and statements are not attributed the same importance as in other approaches (cf. for example Augst 2010, Augst et al. 2007, Becker 2005, Boueke et al. 1995: 92ff., Jose & Brewer 1990) 6.
4 Cf. also Quasthoff (1980).
5 It has to be remembered that this component is often realized implicitly, i.e. it does not need to be overtly expressed in the story (Stein & Glenn 1979: 64, Mandler & Johnson 1977: 121).
6 Boueke et al. (1995: 138ff.) also showed, in a study on L1 German speakers tested at the end of kindergarten (age 5;7 to 5;11), in the middle of elementary school (7;7 to 7;11) and at the end of elementary school (9;7 to 9;11), that the number of children using global affective markings and the degree of affective marking increase significantly with age. Similarly, Augst et al. (2007) and Augst (2010) found an age-related increase in local affective markers in studies on three age groups (2007; grades two, three and four of elementary school) and six age groups (2010; additionally grades five, six and nine), respectively. It could be argued that in all three studies task instructions biased the results with respect to affective marking, since they asked participants to tell “a really gripping and exciting or funny story” (Boueke et al. 1995: 124; my translation) and to think of “an interesting story” (Augst et al. 2007: 46; my translation). Other studies also show, however, that affective/emotive marking needs to be included in a full account of the development of storytelling. Nevertheless, this is outside the scope of my study, which is limited to investigating coherence and cohesion.
Another frequent point of criticism in relation to story grammars has been their perceived staticness (e.g. Calfee 1982, Black & Wilensky 1979). However, story grammarians have rejected this criticism early on as a misunderstanding of story grammars’ representation as tree diagrams, which were used analogously to the well-known representations in generative syntax in order to enhance comprehensibility of the approach (e.g. Mandler 1982: 433). At the same time a certain staticness (in the sense of prototypicality) does not necessarily have to be seen as a deficit when analyzing texts such as the ones obtained in the present work. Culture-specific, prototypical structures for different text types (stories among them) have always been an important part of school curricula, with their roots going back to classical rhetoric (Ludwig 1984 and 1988 in Boueke et al. 1995: 68), and even today these basic structures are defined in a relatively static way in school contexts. 7 Furthermore, a comparatively static narrative structure is the rule rather than the exception when children learn to produce stories themselves (cf. ch. 4). At the same time story grammar has an important advantage over other approaches, even the influential model by Labov and Waletzky (1967), namely its hierarchical organization structure (as opposed to a linear one), which is able to represent the global organization of a text instead of suggesting a mere linear succession of components. It has been argued, however, that in practice story grammar analyses do not live up to their promise (Boueke et al. 1995: 64). That is, they fail to investigate the hierarchical organization characteristic of this approach, since they merely count the number of narrative components and lose sight of their function in the story’s global, hierarchical organization. This “practice problem” can easily be avoided, though, in a study on picture-elicited stories (as opposed to conversational narratives or story-stem elicitations), such as the present one: first of all, by a careful selection of the elicitation material and, secondly, by an equally careful definition of the individual components for the analysis, which keeps in mind each component’s place in the story’s global structure (cf. ch. 5.2). Even though allowing the deletion of individual story grammar categories may be problematic in some cases (cf. Boueke et al.’s (1995: 58ff.) criticism of Peterson and McCabe (1983)), it then also becomes possible to reduce the investigation to a smaller number of components representing the minimal requirements for a hierarchical structure, while eliminating elaborations of the plot which may be representative of other aspects of participants’ storytelling ability besides the ability to tell a globally organized story.
7 Augst (2010), for example, describes a well-accepted five-step structure (“struktureller Fünf-Schritt”) for (written) make-believe stories: Einleitung-Planbruch-Spannung-Pointe-Schluss.
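The difference between merely counting narrative components and taking their place in a hierarchical schema into account can be made concrete with a minimal Python sketch. The schema below is a hypothetical, heavily simplified illustration: the node labels EPISODE, ATTEMPT and OUTCOME, the two invented stories and the helper names are assumptions made for this example only, and they are not the categories of analysis used in the present study (cf. ch. 5.2).

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    """One category in a (toy) hierarchical story schema."""
    label: str
    children: List["Node"] = field(default_factory=list)

    def leaves(self) -> List[str]:
        """Terminal components in their canonical linear (surface) order."""
        if not self.children:
            return [self.label]
        result: List[str] = []
        for child in self.children:
            result.extend(child.leaves())
        return result

    def is_complete(self, told: Set[str]) -> bool:
        """True if every terminal component below this node was realized."""
        return all(leaf in told for leaf in self.leaves())


# Hypothetical toy schema; the labels are illustrative assumptions.
episode = Node("EPISODE", [Node("INITIATING EVENT"), Node("ATTEMPT"), Node("OUTCOME")])
story = Node("STORY", [Node("SETTING"), episode, Node("ENDING")])

# Two invented stories that realize the same number of components.
story_a = {"SETTING", "ATTEMPT", "ENDING"}
story_b = {"INITIATING EVENT", "ATTEMPT", "OUTCOME"}

for name, told in [("A", story_a), ("B", story_b)]:
    # Same component count, but a different status of the higher-order node.
    print(name, len(told), episode.is_complete(told))
```

Both invented stories contain three components, so a pure count cannot tell them apart. Only story B realizes every component of the higher-order EPISODE node, which is the kind of information a count loses and a hierarchically informed analysis retains.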
Another advantage of story grammar is that it provides a framework connecting surface narrative structures to underlying cognitive structures. Even though story grammars’ psychological validity has been disputed in the past (e.g. Wilensky 1982), the existence of a story schema has been well demonstrated by now (cf. ch. 3 and also the overview in Hickmann 2003: 90). Story grammar is therefore particularly suitable for studying narratives within a production model such as the one outlined in the next chapter. A well-accepted, seemingly alternative approach rooted in cognitive science is propositional analysis based on the construction-integration model (e.g. van Dijk & Kintsch 1983, van Dijk 1980; cf. also Peterson and McCabe (1991) for an application to L2 narrative productions). It is suitable for narratives (as well as for other types of text/discourse) and could claim to provide a fuller account of the cognitive base of storytelling, while also avoiding the staticness of story grammars. Propositional analysis, essentially a bottom-up approach, aims to identify individual propositions on the microstructure of discourse and to deduce from these micropropositions, via transformational rules, the discourse macrostructure (i.e. a text’s macropropositions), which can broadly be defined as the theme or topic of a text. The level of coherence analysis as conducted with the help of story grammar would correspond to yet another level of propositional analysis, however, namely to “(schematic) superstructures” (van Dijk 1980: 108), i.e. “conventionalized schemas that provide the global form for the macrostructural content of a discourse” (Renkema 2004: 97). As van Dijk states, “perhaps the most characteristic example of such a conventional, schematic superstructure is that of narrative. A narrative structure is a global schema expressed by stories.” (1980: 109).
In other words, “[a] story is a discourse which expresses a macrostructure which is organized by a narrative schema” (ibid.). Van Dijk and Kintsch’s narrative superstructure explicitly builds on earlier structural work on narrative, which they “attempted to extend […] by a text grammatical and generative framework, an action-theoretical foundation, distinctions between several levels of description, and later a cognitive component” (van Dijk 1980: 113). In this context, “narrative grammars” (1980: 113) are mentioned specifically as one of the foundations of their theoretical approach. It comes as no surprise, therefore, that the superstructure of narratives is defined in a strikingly similar way to story grammar approaches, namely as a number of basic narrative categories (such as setting, complication and resolution), which are related through membership in a hierarchical network of higher-order and lower-order categories, which, in turn, makes up the global schematic structure of narrative discourse (cf. van Dijk 1980: 112ff.). Equally similar to story grammar, these basic categories are to a large degree bound by a conventionalized canonical order and they are, consequently, also of a comparatively static nature (ibid.). Story grammar approaches and propositional analysis should therefore not be seen as competing approaches.
Moreover, van Dijk and Kintsch never state how exactly to approach the analysis of superstructures, even if they propose “first to define or derive, by specific formation rules, a specific schema (e.g., a narrative structure) after which the category slots are filled with macropropositions” (van Dijk 1980: 122) in the analysis of a given text. Therefore, the integration of superstructures in their model is important in a theoretical vein but the model does not lend itself directly to the analysis of such structures.
Quite the contrary applies to story grammar, which provides a detailed account of the individual constituents of the global narrative schema and thereby enables the researcher to easily operationalize these constituents for a given text. Story grammar approaches generally lack, on the other hand, embedding in a broader model, i.e. a broader text processing framework connecting the global structure to lower level constituents and structures. With respect to narrative texts the story grammar approach could therefore be viewed as one possible specification of the superstructure postulated in the construction-integration model. Last but not least, story grammar is well-suited for the present study, since it is able to capture children’s story structures well (Yussen & Ozcan 1996: 32) and, as I will show in chapter 5, the data elicitation material has a close correspondence with a prototypical story structure as propagated by story grammar. From the points made above it is clear that story grammar can (still) make a significant contribution to the field of language acquisition. At the same time going back to the original story grammar framework is a logical step, since newer approaches with a revised structure, for example in reaction to criticism about affective marking (e.g. Boueke et al. 1995 and, based on this study, Augst 2010 & 2007), lack empirical evidence for their model’s psychological validity. Boueke et al. (1995), for example, explicitly claim psychological validity for their model since, as they say, it is based on story grammar. However, the way their model is structured is sufficiently different from the original framework to warrant further testing, especially also in comprehension studies, before being able to claim any such validity. The psychological validity of story grammars, however, has been well established empirically (cf. ch. 3).

2.2 Cohesion

Halliday and Hasan’s (1976) approach to cohesion was chosen for several reasons.
First of all, it constitutes, even today, the most comprehensive and elaborate framework for cohesion. As such, Halliday and Hasan’s framework continues to be classified as one of the standard approaches, if not the standard approach in text analysis (cf. Schubert 2012, Esser 2009, Johnstone 2008: 118ff., Renkema 2004). More recent approaches either focus on one of Halliday and Hasan’s categories or cut across these categories; they do not, however, accommodate all of Halliday and Hasan’s textual means of cohesion. More recent approaches to lexical cohesion, for example, focus on lexical cohesive chains or on collocation only (e.g. Hoey 2005 & 1996). More recent developmental studies of reference (e.g. Becker 2005, Hickmann 2003, Nistov 2001, Severing & Verhoeven 2001, Boueke et al. 1995), on the other hand, take into account personal references, substitutions, and ellipses to account for reference strategies relating to animate entities, but they do not include comparative references. Therefore, other approaches should rather be seen as complementing the original framework from additional (e.g. more properly functional) perspectives or extending and redefining parts of the 1976 model. No single one of the newer approaches, however, captures references, connectives, substitution, ellipses and lexical means in one integrative framework. Secondly, Halliday and Hasan’s categories of analysis are well defined and thus rather well operationalizable also for L2 speakers. 8 Moreover, several recent studies have successfully applied the Hallidayan cohesion framework to developmental and L2 data (e.g. Diehr 2011, Manhardt & Rescorla 2002, Bae 2001).
Last but not least, using Halliday and Hasan’s approach ensured continuity with earlier studies within the Kiel Immersion Project, be it the two pilot studies conducted in the elementary school part of the project (Möller 2003, Maschewski 2002), which the present study builds on, or the studies within the high school part of the project, which had already been concluded (e.g. Burmeister & Daniel 2002, Mukherjee 1999, Claussen 1997, Krohn 1996).

2.3 Summary

In the present chapter the range of approaches to studying (narrative) discourse production and its development was presented. Then the two frameworks used in the present study, story grammar (coherence) and Halliday and Hasan’s (1976) approach to cohesion, were discussed critically. It was shown that the two approaches are far from outdated, even though they were developed comparatively early in the study of (narrative) discourse. Furthermore, it was shown that they are well suited for the purposes of the present study, even in the light of more recent approaches.
8 The lexical cohesion category of collocation is comparatively difficult to operationalize, however, as Hasan (1984) herself argues; she even proposes to disregard it in analyses of lexical cohesion. Collocation has been included in recent studies (e.g. Bae 2001), though, and it is one of the major sources for intersentential cohesion. Therefore it is not feasible to simply leave out this category of cohesive ties.

3 Narrative production: From cognition to coherence and cohesion

Puzzling involves putting pieces of a jigsaw together in order to form a unified whole. One piece is connected to the next piece, the next to another and from the related pieces the jigsaw is formed. In the result of the puzzling process the pieces of the jigsaw are hardly noticeable, as the interrelated pieces have become an interrelated representation. 
(Louwerse 2004: 3; my emphases)

This jigsaw analogy, used in its original context to illustrate the processes involved in text comprehension, also applies to the production of text/discourse and it raises the following questions concerning discourse production:

• What (linguistic and cognitive) processes are involved in this “puzzling”?
• What are the “connections” between “pieces” that make up “a unified whole”, i.e. a text?

In the present study, participants were asked to tell a story from a picture book (cf. ch. 5.1.3 for more details). In the following I will therefore present a simplified model of narrative discourse production focusing on the task faced by the participants. As the model will show, (narrative) text can be described as the result of two different organizational processes: bottom-up and top-down. These two processes, I will argue, correspond to two fundamental dimensions recognized for (narrative) texts, namely cohesion and coherence (ch. 3.2). Then I will give an in-depth description of the approach to narrative coherence chosen for the present study, story grammar, which is based on schema theory and recognizes an underlying story organization with several different component nodes related through membership in a hierarchical network (ch. 3.3). Narrative coherence is further explained to reflect (an aspect of) participants’ cognitive development. Finally, I will outline the theoretical basis for analyzing the narrative cohesion of participants’ texts (ch. 3.4), i.e. their use of linguistic devices to create semantic links between individual textual elements. Cohesion is further explained to reflect (an aspect of) participants’ linguistic development. I conclude this chapter by giving a short summary and stating the three main research questions pursued in the present study (ch. 3.5).
3.1 A simplified model of narrative discourse production

General agreement exists that texts are embedded in a communicative situation involving a producer and a (potential) receiver. 1 Fig. 3.1 shows a simplified model of the processing steps involved in telling a story from a picture book (the content material) to a listener. Due to the focus of the present study on production, the receiver side is not depicted. Text processing by the receiver is largely analogous, however, to the producer’s processing of the elicitation material described in the present section (cf. Eysenck 2001, Cutler & Clifton 2000, Kintsch 1998). Additionally, the model focuses on the most important steps and processes only, while leaving out many others, especially in regard to perceptual processing, linguistic en- or decoding, and knowledge storage or access. 2 At the same time the model treats discourse production as a static, one-directional sequence of operations. In reality, a cyclic operation with constant feedback should be assumed due to limited working memory capacity (cf. Kintsch 1998, Rickheit et al. 1995) and several processes need to be considered largely parallel (cf. Eysenck 2001: 3). Similarly, instead of a monologue, an (at least partly) interactive process is closer to reality. This is also the case in the present study, where some interaction occurs, even if task and interviewer instructions deliberately aim for a monologue (cf. ch. 5.1.3).

1 Receivers do not necessarily have to be present, e.g. in an asynchronous communicative situation such as between writer and reader.

† Processing repeatedly passes through a ‘general filter’ (GF1 & GF2), which corresponds to producers’ knowledge base; in the encoding phase it additionally passes through a language-specific ‘linguistic filter’ (LF) (see text for further explanations).

As Fig.
3.1 shows, producers first of all need to process the content material cognitively in order to form a coherent mental representation of the content, which can serve as a conceptual basis for the subsequent narration (e.g. Kintsch 1998 & 1995, Trabasso et al. 1995). 3 In this phase the producer corresponds to a receiver or comprehender of the elicitation material.

Fig. 3.1 A simplified model of discourse production in a picturebook-elicited storytelling task

2 For more information about perceptual and linguistic processing see, e.g., Herrmann & Fiebach 2004, Eysenck 2001, Levelt 1999, Perfetti 1999. For additional information about memory storage and/or access see, e.g., Schermer 2006: 116ff., Schneider & Büttner 2002, Eysenck 2001: 157ff. See also Foltz (2003) for an overview of existing models and challenges to modelling discourse processes.
3 In telling a story from memory, on the other hand, producers would need to access the relevant memory structures.

Thus, producers need to process the content material perceptually, i.e. bottom-up or locally (e.g. Eysenck 2001: 382, Singer 1990: 109, Rumelhart 1980: 42). 4 At the same time all information from the content material is influenced by producers’ knowledge base, i.e. any previous knowledge they bring to the task, which corresponds to a top-down or global processing of the content material (e.g. Eysenck 2001: 387, Kellogg 1995: 50f., Rumelhart 1980: 41f.). 5 The degree to which both processing modes (top-down and bottom-up) interact, i.e. the degree of parallel processing, depends on the experience with a task (Eysenck 2001: 3). In each processing stage shown in the model, i.e. both in the producer-internal formation of a mental representation and in the encoding of this mental representation, producers’ knowledge base is represented as a general filter (GF1 & GF2). The model’s GFs, i.e.
the knowledge base, encompass general (world) knowledge stored in long-term memory, situational knowledge referring to the immediate communicative context as well as producers’ immediate motivations and goals (cf. Eysenck 2001: 200, Perfetti 1999: 169, Kintsch 1995: 140, Kayser 1989: 346). 6 General knowledge comprises all of a person’s previously acquired knowledge. It is based primarily on personal experience but also on mediating sources such as books or TV (e.g. Sodian 2002: 447ff., Nelson 1986a: 5). 7 General knowledge thus includes knowledge as diverse as theory of mind or conversational principles (e.g. Sodian 2002: 456ff., Levelt 1999: 84f., Grice 1975). Additionally, more abstract cognitive structures are formed through abstraction processes working on the representations of “real” experiences in memory: Schemata (e.g. Goswami 1998: 281, Singer 1990, Nelson 1986, Mandler & Goodman 1982). 8 Schemata in turn also become part of a person’s general knowledge base. Situational knowledge (also contextual knowledge), the second type of knowledge forming part of producers’ knowledge base (GF1 & GF2), includes interpersonal knowledge, i.e. knowledge about the interlocutor (the “target audience”) and the social relationship to her or him, as well as knowledge about the physical environment (e.g. Crystal 2003: 104). However, situational knowledge cannot neatly be separated from the general knowledge base; general knowledge about social hierarchies in a specific culture is needed, for example, for the even more specific contextual knowledge about the relationship between interlocutors.

4 Also data-driven or outside-in processing (cf. Kellogg 1995: 50f., Anderson 1994, Singer 1990, Pearson 1982, Rumelhart 1980: 41f.).
5 Also hypothesis-driven, schema-based, conceptual, knowledge-driven or inside-out processing (cf. Kellogg 1995: 50f., Anderson 1994, Singer 1990: 108f., Pearson 1982, Rumelhart 1980: 41f.).
6 Speakers’ goals are not usually transparent. In the set-up of the present study, for example, the default speaker goal is assumed to be “follow the task instructions” (cf. ch. 5.1.3). The possibility remains, though, that speakers follow (additional) different goals such as “tell the story in time for the next break”.
7 For an alternative stance, i.e. innate theories, see Sodian (2002: 448).
8 The terminology varies (cf. Kellogg 1995: 166), e.g. also cognitive model (Ungerer & Schmid 2006) or global patterns (Schubert 2008: 72, de Beaugrande & Dressler 1981: 90f.).

As explained above, schemata also form part of producers’ knowledge base represented by the model’s GFs. In discourse production and reception schemata play important roles and they form the theoretical background for the analysis of narrative coherence conducted in the present study (cf. ch. 3.3). In the following I will therefore describe the role and characteristics of schemata in more detail before moving back to the description of the model. A schema is, in its most general sense, “an organized knowledge structure” (Singer 1990: 6). Schemata range from semantic networks for simple objects, e.g. chair, to very complex entities such as stories (Schermer 2006: 161, Pearson 1982). Accordingly, different types of schemata have come to be distinguished, which can be embedded within each other. Best known are frames and scripts (cf. Kellogg 1995: 170ff., Rumelhart 1980, Schank & Abelson 1977, Minsky 1975). Frames refer to concept knowledge and stereotypical characteristics of concepts, e.g. STUDENT. 9 Scripts, on the other hand, refer to knowledge about people’s stereotypical roles and activities in reference to certain situations, e.g. Schank’s (1975) well-known RESTAURANT script. Schema can thus be seen as a cover term with different subtypes, which corresponds to the usage adopted in the present study. Schemata are organized in networks together with several subschemata.
These networks are structured hierarchically in that the more specific information is embedded under nodes 10 containing the more general information (e.g. Ungerer & Schmid 2006: 50, Nelson 1986a: 8). A RESTAURANT script, for example, would incorporate ORDERING, PAYING, WAITER and several other subschema nodes (Schank 1975: 264ff.). All nodes are associated with default values, i.e. prototypical instances (e.g. Smith 1991: 510f., Pearson 1982: 28, Rumelhart 1980). A RUNNING RACE schema, for example, would contain a node PARTICIPANTS and without information to the contrary we would assume these participants to possess attributes such as human and athletic. However, schema nodes are also variable in nature, i.e. they have an attribute-value format (Smith 1991: 511): Once a particular schema is activated, the nodes’ default values can be replaced by more specific ones on the basis of the incoming information (e.g. Pearson 1982, Rumelhart 1980). If contextual evidence stated, for example, that a turtle and a rabbit participated in the race, the default value for PARTICIPANTS would be substituted by turtle and rabbit.

9 Schemata will be written in capital letters.
10 A variety of different terms is used to refer to schemata and their parts, the most general ones being variables (e.g. Rumelhart 1980) or constituents (e.g. Mandler & Goodman 1982, Thorndyke 1977). The term node emphasizes the closeness to a connectionist model of the mind (Ungerer & Schmid 2006: 50). Other terms used are slots (e.g. Pearson 1982, Minsky 1975), roles (e.g. Schank & Abelson 1977, Schank 1975), and attributes (cf. Schermer 2006: 148f., Smith 1991: 511).

Qualitatively, schemata are a function of the (amount of) experience they are based on (e.g. Johnson-Laird 1991: 483f., Singer 1990: 107f., Fivush & Slackman 1986). Consequently, schemata are person- as well as (sub-)culture-specific (e.g. Ungerer & Schmid 2006: 51ff., Brewer 1985, Scollon & Scollon 1984).
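The attribute-value format of schema nodes described in this section can be pictured with a small data structure. The following Python sketch is an illustration only and not part of the model used in the present study; the class `Schema` and its methods `fill` and `instantiate` are hypothetical names chosen for this example. It uses the RUNNING RACE schema from the text: a node keeps its default value until incoming information supplies a more specific one.

```python
from dataclasses import dataclass, field


@dataclass
class Schema:
    """Hypothetical toy model of a schema with attribute-value nodes."""
    name: str
    defaults: dict                                # node -> default (prototypical) value
    values: dict = field(default_factory=dict)    # node -> value taken from the input

    def fill(self, node, value):
        """Replace a node's default with a value from the incoming information."""
        if node not in self.defaults:
            raise KeyError(f"{self.name} has no node {node}")
        self.values[node] = value

    def instantiate(self):
        """Return every node with its specific value, or its default if none was given."""
        return {node: self.values.get(node, default)
                for node, default in self.defaults.items()}


# RUNNING RACE example from the text: default participants are human and athletic.
race = Schema("RUNNING RACE", {"PARTICIPANTS": ("human", "athletic")})
before = race.instantiate()

# Contextual evidence states that a turtle and a rabbit take part.
race.fill("PARTICIPANTS", ("turtle", "rabbit"))
after = race.instantiate()

print(before)
print(after)
```

The sketch deliberately leaves out the hierarchical embedding of subschemata and any experience-based differences between persons or cultures; it only shows the replacement of a default value.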
Renkema (2004: 231) gives the HOUSE schema as an example: In Western culture, general agreement would be to include windows, different rooms, a kitchen, furniture etc., while this might not be the case in other cultures. At the same time, most people in Western culture would have different associations with the concept HOUSE, for example, a skyscraper or a farmhouse. Additionally, factors such as age, for example, may influence schemata even within the same culture. Children’s STORYTELLING schema, for instance, may not necessarily correspond to that of adults, since early parent-child storytelling mainly consists in interactive picture description (Cook-Gumperz & Green 1984). Schemata are involved in several ways in discourse comprehension, recall, and production. First of all, schemata relevant to the content material, e.g. about prototypical events or story structures, are called up in producers’ memory based on a bottom-up processing of the content material (e.g. Singer 1990: 108f., Pearson 1982: 28ff., Rumelhart 1980). In the set-up of the present study this includes not only the picture book but also the task instructions (cf. ch. 5.1.3). The incoming information is then processed top-down, i.e. it is filtered as to correspondence with and relevance for existing schemata (e.g. Schermer 2006: 162, Singer 1990: 109, Pearson 1982: 30, Rumelhart 1980); that is, once a schema has been selected, the incoming information is processed in the light of this schema, seeking to attribute a specific value to each of its (sub-)nodes. This process facilitates the mental organization and storage of the information (e.g. Goswami 1998: 281, Kellogg 1995: 147ff.). In the present study, for example, participants were asked to look through a picture book and then to “tell me [the interviewer] the story” (cf. ch. 5.1.3). Based on these task instructions, they must have expected, for example, that the booklet actually contains pictures. 
Similarly, if they had accessed a relevant STORY schema, they probably sought to identify one or more protagonists as well as a temporal or causal sequence of events. Participants would most certainly ignore, on the other hand, the paper quality or whether the interviewer was wearing a white or a blue shirt, since this information is irrelevant with respect to task and content material. Additionally, schema-based processing enables inferences (e.g. Anderson 1994: 474, Mandler & Goodman 1982, Pearson 1982: 28). 11 This includes bridging inferences used to connect current and preceding parts of content material (e.g. Renkema 2004: 137f., Collins et al. 1980: 386), reconstructive inferences (e.g. Anderson 1994: 474, Pearson 1982: 29), i.e. using default values in order to fill nodes not specified by the incoming information, as well as elaborative inferences, i.e. inferences that go beyond the information literally stated without being essential to the coherence of a message (e.g. Renkema 2004: 137f., Anderson 1994, Singer 1990: 175ff.). 12 In the present study, for example, participants need to use schemata and other types of general knowledge to decode the contents of each individual picture of the task material (cf. ch. 5.2.1) and connect them to the content of previous and subsequent pictures. They need to infer, for example, that one of the protagonists, a frog, escapes and that this escape triggers a reaction in the other two protagonists (a boy and a dog), namely to go looking for the frog.

11 This is especially important since a large amount of information is left to be inferred and never realized explicitly in communication (Pearson 1982: 29).

Finally, schemata also serve as production plans for linguistic encoding on the part of the producer (e.g. Berman 2001a, Boueke et al. 1995, Bamberg & Marchman 1991, Singer 1990: 109f., Kintsch & van Dijk 1978: 376). This involves different types of schemata at different levels of abstraction.
A variety of different schemata would therefore need to be activated in a study such as the present one, for example schemata relevant to the content material, such as GOING TO SLEEP or ANIMALS OF THE FOREST (cf. Wilkins in Berman & Slobin 1994: 21f.), or on a more abstract level a STORYTELLING script specifying the details of the communicative situation. Very importantly, producers also need to activate a genre-specific STORY schema, which provides the abstract content components to be covered and their order, e.g. SETTING and ENDING (cf. ch. 3.3). Evidence for the use of schemata as production plans comes primarily from recall experiments: In retelling a non-canonical story, for example, producers were found to restructure the original input and follow a classic STORY schema (Mandel & Johnson 1984). In another recall experiment, which involved a text with very general wording, participants were found to perform better if the global theme of the passage was known to them beforehand, since this allowed them to call up the relevant schemata for (a) input processing and storage and (b) for recall production (Bransford & Johnson 1972). These results are in agreement with the findings of Bartlett’s (1932) groundbreaking study, which showed that participants in a recall experiment restructured the original content along schemata pertaining to their own cultural background. To sum up, the model presented in Fig. 3.1 outlines that producers process the content material cognitively (under the influence of the task instructions) and “make sense” of it with the help of their knowledge base, represented as the incoming information passing through GF1, where any missing information is filled in. The result is a coherent mental representation on the part of the producer, which is a combination of content material and producers’ additional knowledge sources (e.g. Kintsch 1995, Trabasso et al. 1995). 
13 Up to this point, processing is identical to that of a receiver of (visual) information; in the next processing step, however, the producer’s mental representation needs to be encoded linguistically. As stated above, schemata act as production plans for this linguistic encoding, i.e. the encoding is organized top-down. 14 In the model, the use of schemata as production plans is represented by processing again passing through the general filter (GF2). At the same time, producers also need to pre-plan their linguistic output with respect to receivers’ needs, i.e. they need to try to predict the nature of the mental representation a receiver would form on the basis of the linguistic material (i.e. the text) received (e.g. Berman 2001a, Lenk 1998, Givón 1995, Kreyß 1995). Linguistic encoding can only begin if producers find that there is agreement between this presumed mental representation and their own. Otherwise, the planned linguistic output needs to be revised (Kreyß 1995). 15

12 For further details on different types of inferences see e.g. Singer 1990: 167ff., van de Velde 1989, Anderson & Pearson 1984.
13 In Kintschean terms, the situational model, i.e. a combination of text base and additional knowledge sources (e.g. Kintsch 1995, van Dijk & Kintsch 1983).
14 Cf. also Levelt’s macroplanning (2000: 91ff.).

While all processing described so far (and thus its success) depends on general cognition, the encoding into speech (or writing) is language-specific, since different languages require different linguistic choices in the encoding process, even if one assumes the cognitive basis, i.e. the underlying mental representation, to be similar across languages (e.g. Berman & Slobin 1994, Hickmann 1991). 16 Linguistic encoding is conducted bottom-up by combining sounds into words, words into sentences, sentences into discourse etc. (e.g. Sodian 2002: 483, Khodadady & Herriman 2000: 204ff., Rumelhart 1975: 235).
17 In the model, the content of the mental representation is therefore shown as passing through a language-specific linguistic filter (LF), which contains knowledge about well-formed texts and their parts, i.e. knowledge about phonological structures, about the grammatical structure of words, clauses and sentences as well as knowledge about the linguistic means of connecting clauses, sentences, and longer stretches of speech. Consequently, successful encoding is at least to some extent dependent on producers’ linguistic abilities in the respective language (be it an L1 or L2). As the model presented in Fig. 3.1 indicates, cognitive (GF2) and linguistic processing (LF) are intimately connected in the encoding phase just as cognitive and perceptual processing were in the decoding phase (Eysenck 2001: 2ff., Levelt 1999: 112ff., Lenk 1998, Rickheit et al. 1995, Bamberg 1987). Producers not only have to pre-plan, for example, but they also need to constantly monitor their own output against receiver needs and update their predictions regarding the mental representation formed by the receiver. Additionally, producers must monitor their output against genre-specific schemata and other general, pragmatic and contextual requirements, which includes a constant awareness of the linguistic context already created (e.g. Berman 2001a, Lenk 1998, Hudson & Shapiro 1991). As the model shows, the linguistic encoding of producers’ mental representation finally results in a text (here: a story), which can then be processed by the receiver (here: the listener). 18 Texts as a “product” of the processing and encoding illustrated above also form the material for the analysis conducted in the present study. A “text as a product” view, i.e. disregarding any performance aspects, is common practice in studying discourse (cf. Berman & Slobin 1994a: 24). Nevertheless, it should be kept in mind that such a view of stories is a researcher-induced artificial reduction (ibid.; cf. also Boueke et al. 1995: 16). In the next section, the two text construction principles described above (bottom-up and top-down) will be set in relation to two fundamental dimensions of (narrative) discourse, namely coherence and cohesion, which are the basic research constructs of the present study.

15 In reality, this is often done online, which is evident, for example, in self-repairs and comprehension checks such as “Do you know what I mean?”
16 As explained above, the intention is to present a very general processing model and thus no distinction is made between different types of linguistic processing modes such as vocabulary-driven versus grammar-driven (cf. the overview in Louwerse 2004).
17 Cf. also Levelt’s microplanning (2000: 91ff.).

3.2 Top-down versus bottom-up organization: Narrative coherence and cohesion

Narrative texts share two fundamental dimensions with other types of text: Coherence and cohesion (e.g. Schubert 2012, Renkema 2004, Graesser et al. 2003, Hickmann 2003 & 1996, Gernsbacher & Givón 1995). Coherence and cohesion work together in making a text a unified, meaningful whole which is more than the sum of its sentences (e.g. Hoey 1996, Cook 1989, Halliday & Hasan 1976) and they correspond to the two different organization principles described in the last section, namely the top-down/global organization of the content structure, and the bottom-up/local organization of the linguistic structure (e.g. Hickmann 2003: 86ff., Karmiloff-Smith 1985: 62). There is of course more to stories and storytelling besides coherence and cohesion. Additional dimensions include, for example, a text’s entertaining function, evaluative dimension or given-new structure (cf. Augst 2010, Berman 2008, Augst et al. 2007, Bamberg & Moissinac 2003, Becker 2001, Boueke et al. 1995, Brewer 1985, de Beaugrande & Dressler 1981). However, the present study focuses on coherence and cohesion as the two most fundamental dimensions of text.
Texts are constructed top-down in that their content is organized with the help of a schema-based, global organization structure (e.g. Kellogg 1995: 310). This underlying organizational structure corresponds to a text’s coherence (e.g. Hickmann 1995: 201, Berman & Slobin 1994b: 40ff., Tannen 1984: xiv). 19 In principle, a distinction needs to be made between structural (or formal) and thematic coherence (van Dijk 1980), 20 the latter emphasizing that a text’s content is organized around content schemata relating to one overall theme (Berman & Slobin 1994b: 40), the former focusing on abstract schematic components determining structural well-formedness (Hickmann 1995: 201ff.).

18 Again, this is a parallel process and the sequential description is owed to explanatory clarity.
19 No general agreement exists on the concept of coherence (e.g. Graesser et al. 2004, Bublitz et al. 1999, Gernsbacher & Givón 1995, Cook 1989, Hasan 1984). In spite of this lack of agreement, there seems to be a minimal consensus, however, that coherence is (at least partly) attributable to a text’s underlying organizational structure (Hickmann 2003 & 1995, Shapiro & Hudson 1997).
20 Also superstructure vs. (semantic) macrostructure (van Dijk 1980).

For the analysis of narrative productions, however, this distinction is not feasible, since the two are intimately connected: The analysis of a text’s formal organization can only be conducted by having recourse to its content structure (cf. Shapiro & Hudson 1997, Berman & Slobin 1994b: 43), while the formal structure dictates to a large extent the content structure (van Dijk 1980: 122). Similarly, the content structure is centered around a global theme, which is at the same time inherent in a text’s formal organization, i.e. the formal components could not form a coherent text without a content-based connection (cf. ch. 3.3.2 and 5.2.1). Thus, this theoretical distinction will not be pursued further in the present study.
Instead, the cover term narrative coherence will be used in order to emphasize the interdependence of these two aspects of coherence. As already stated in the previous chapter, the two most influential approaches in the study of coherence as the global organization of narrative discourse have been highpoint analysis and story grammar (Peterson & McCabe 1983: 3). Despite a certain similarity between the two approaches in that both analyze stories with the help of predetermined structural components, they have important differences: Highpoint analysis (Labov 1999 [1972], Labov & Waletzky 1967) was developed from narratives about personal experience and it serves mainly sociolinguistic and anthropological purposes (e.g. Renkema 2004: 193, Toolan 1988: 46ff.). Story grammar, on the other hand, which is the approach to narrative coherence chosen for the present study, was developed on the basis of fairytales (e.g. Mandler & Johnson 1977, Thorndyke 1977, Bartlett 1932). It is a cognitive framework postulating a relationship between observed text structure and underlying cognitive structures; a more detailed overview of the theoretical assumptions underlying story grammar will be given in the following section (ch. 3.3). Recent language acquisition studies drawing on a cognitive, schema-based approach to narrative (e.g. Stein & Albro 1997, Berman & Slobin 1994, Bamberg & Marchman 1990, Peterson & McCabe 1983) usually draw on the story grammar(s) developed in the 1970s, mostly employing reduced designs. 21 Linguistically, texts are constructed bottom-up. That is, sounds are combined to words and words to sentences according to language-specific phonological, morphological and syntactic rules. Above the sentence level, however, i.e. in combining sentences to discourse, a discourse-specific linguistic rule system is at work: Cohesion. 
The notion of cohesion has been accepted as well-defined and useful for the study of text since Halliday and Hasan’s seminal work Cohesion in English, which was published in 1976 (e.g. Schubert 2012, Renkema 2004: 103, Martin 2001: 35, Bublitz et al. 1999).

21 As already explained in chapter 2, story grammar approaches have largely been neglected in (second) language acquisition studies, even though they have been and are used very productively in language impairment studies, for example (e.g. Manolitsi & Botting 2011, Epstein & Phillips 2009, Reilly et al. 2004 & 1998, Norbury & Bishop 2003, Manhardt & Rescorla 2002).

Cohesion can be defined as a textual, surface quality, accomplished through the use of cohesive devices (e.g. Halliday & Hasan 1997 [1985] & 1976, Hickmann 1996: 201ff., Hoey 1996). Cohesive devices establish relationships of meaning, i.e. ties, between elements in the text by linking a presupposing with a presupposed element (Halliday & Hasan 1976: 4). These cohesive ties create texture, i.e. they linguistically establish the textual quality of a stretch of sentences (e.g. ibid.: 10). 22 Thus, cohesion can also be defined as “the set of possibilities that exist in the language for making text hang together” (Halliday & Hasan 1976: 18). Cohesion is a phenomenon that applies to texts in general, so that no qualitative difference can be made between cohesive devices in narratives as opposed to other genres, even though there may be a genre-related influence on the specific choice and number of cohesive devices (e.g. Berman 2008, Norment 1995). In strongly co-text-dependent genres such as the one in the present study, for example, the use of cohesive devices is expected to be relatively high (Hickmann 2003: 43f.). Contextual factors, such as task and task material differences, may equally influence the use of cohesive devices (Berman 2004, Hickmann 2004).
In Cohesion in English, Halliday and Hasan (1976) identify five main types of cohesive devices: references, connectives, substitution, ellipsis, and lexical cohesion; these categories will be described in more detail in ch. 3.4. Following Halliday and Hasan’s lead, several other linguistic variables were also found to contribute to cohesion, e.g. tense-aspect marking, punctuation, and intonation (cf. Hickmann 2003 & 1996, Bamberg 1987, Gumperz et al. 1984, de Beaugrande & Dressler 1981). The contribution which these latter variables make to cohesion will not be discussed further, however, since the present study is based on the traditional categories established by Halliday and Hasan (cf. ch. 3.4), which can be considered the “standard” categories in discourse cohesion (cf. Schubert 2008, Renkema 2004). 23

22 As stated in ch. 1, Halliday and Hasan use the term text for “any passage, spoken or written, of whatever length, that does form a unified whole” (1976: 1).
23 See, for example, Hickmann (2004 & 2003) for a definition of cohesion as a cluster of more functionally motivated categories such as marking information status.

No agreement exists regarding the relationship between coherence and cohesion (cf. Hickmann 2004: 286ff., ibid. 2003: 93ff.). Furthermore, any comparison is made difficult by the many distinct operationalizations of the two variables. Even so, many researchers regard coherence as a superordinate quality and cohesion as contributing to it (e.g. Halliday & Hasan 1997 [1985] & 1976, Hellmann 1995, Cook 1989, Tannen 1984). Thus, it is argued that by indicating the relationship between elements in the text, cohesive ties contribute to textual coherence. Others, especially within a strictly functionalist tradition, argue that there is strong interrelatedness and a far more complex relationship between these two discourse measures. Hickmann, for example, states that “discourse cohesion could be partially constitutive of how children construct complex forms of cognitive macrostructures” (2003: 336), while at the same time she sees a potential impact of story structure on the use of cohesive devices (ibid.: 93ff.; cf. also Hickmann 2004, Berman & Slobin 1994, Bamberg & Marchman 1991 & 1990). Cognitively-oriented theories also assume an influence of cohesion on coherence but in a different vein; they claim that cohesive devices serve as “processing instructions”, i.e. as linguistic cues signaling the receiver to build a coherent mental representation and how to do this (e.g. Louwerse 2004, Knott & Sanders 1998, Segal & Duchan 1997, Givón 1995, Sanford & Moxey 1995). However, conflicting evidence exists regarding the number and types of processing cues needed by receivers (Kamalski et al. 2008) and thus this line of research can be pursued only within a receiver-centered perspective. At the same time there seems to be agreement that cohesion markers are neither always necessary (example (3.1)) nor sufficient (example (3.2)) for creating coherence, even though the occurrence of some cohesive ties is the norm in naturally occurring texts (e.g. Martin 2001: 44, Hoey 1996: 12, Hellmann 1995, Tannen 1984). Vice versa, coherence cannot be considered the (only) determining factor for the use of cohesive devices. Thus, example (3.1) would be considered a coherent, albeit short sequence of utterances, even if it contains virtually no cohesive devices. Example (3.2), on the other hand, contains a lexical tie (bone - bones) as well as a referential tie (they - bones) but this sequence of utterances would not easily be interpreted as coherent.

(3.1) Eric sat down and relaxed. At flight altitude the attendant offered beverages. 24
(3.2) The dog was eating a bone. Strong bones are important. They are white.
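The notion of a cohesive tie as a link between a presupposing and a presupposed element can be made concrete with a small annotation sketch. The Python code below is an illustration only and not the coding scheme of the present study; the names `CohesiveTie`, `TIES`, `COHERENT` and `count_by_kind` are hypothetical. The ties are entered by hand exactly as they are named in the discussion of examples (3.1) and (3.2), nothing is detected automatically, and (3.1) is simply treated as having no annotated ties, in line with the statement that it contains virtually no cohesive devices.

```python
from dataclasses import dataclass

# The five main types of cohesive devices in Halliday & Hasan (1976).
KINDS = ("reference", "connective", "substitution", "ellipsis", "lexical cohesion")


@dataclass(frozen=True)
class CohesiveTie:
    """One tie: a presupposing element linked to a presupposed element."""
    kind: str
    presupposing: str
    presupposed: str

    def __post_init__(self):
        if self.kind not in KINDS:
            raise ValueError(f"unknown kind of cohesive device: {self.kind}")


# Hand-entered annotations (assumption: only the ties named in the text).
TIES = {
    "3.1": [],
    "3.2": [
        CohesiveTie("lexical cohesion", "bones", "bone"),
        CohesiveTie("reference", "they", "bones"),
    ],
}

# Coherence judgements as stated in the text; they are not computed from the ties.
COHERENT = {"3.1": True, "3.2": False}


def count_by_kind(ties):
    """Count the annotated ties per type of cohesive device."""
    counts = {}
    for tie in ties:
        counts[tie.kind] = counts.get(tie.kind, 0) + 1
    return counts


for label in ("3.1", "3.2"):
    print(label, count_by_kind(TIES[label]), "coherent:", COHERENT[label])
```

Keeping the coherence judgement separate from the tie counts mirrors the point made above: the number of cohesive ties does not by itself determine whether a sequence of utterances is coherent.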
3.3 The story grammar approach to narrative coherence 25

3.3.1 Story grammar

Most researchers “agree in assuming the existence of a pre-linguistic representation of a general organization common to all narrative” (Gombert 1992: 146), i.e. a narrative schema. In fact, general schema theory is founded mainly on Bartlett’s discovery that stories are encoded and memorized with the help of a culturally determined narrative schema (1932; cf. also Schermer 2006: 160ff., Solso 2005: 305ff.). 26

24 The example shows, however, that coherence can influence the use of cohesive devices: The use of the definite article in the attendant can be explained by activation of the FLYING ON A PLANE schema (cf. Ungerer & Schmidt 2006: 212).
25 Unless stated differently, the term coherence will in the following be used in the sense of narrative coherence described above.
26 Bartlett’s work was at least the most influential one. The German psychologist Otto Selz would need to be credited with the first schema theory in the 20th century, published in 1922, but his work went largely unnoticed in Europe at the time due to his Jewish origins and it was noticed in the anglophone world only after 1945 (Lück & Miller 2005, Rieger 1987, Kintsch 1982: 319).

Just as with other types of schemata (cf. ch. 3.1), narrative schemata are based on experience, i.e. they are “constructed through an inductive, inferential, and abstractive process, resulting from repeated exposure” (Stein & Policastro 1984: 124). Thus, story schemas are grounded in world knowledge, e.g. about cause and effect or about social interactions, as well as knowledge about different narrative genres based on experience with reading, being told and telling stories or even being taught prototypical story structures at school (e.g. Hudson & Shapiro 1991: 89, Stein 1982, Mandler & Johnson 1977). Consequently, story schemas are not only culture- but also genre-specific, i.e. they also vary among text genres (e.g.
Shapiro & Hudson 1991: 960, Brewer 1985, Scollon & Scollon 1984). Actual stories can be described as “surface realizations” of the underlying narrative schema, i.e. the schema’s abstract nodes are filled with the specific content of a particular story (Nelson & Gruendel 1986: 40, Thorndyke 1977: 83).

Story grammar, in turn, is a formal rule system (Mandler & Goodman 1982: 507), which was developed, starting in the 1960s, analogously to generative grammar (Berman 1995: 288) in order to describe regularities in the surface structure of stories, which reveal the underlying story schema (e.g. Mandler & Johnson 1977, Thorndyke 1977, cf. also ch. 3.3.2). Story grammar thus operates under the following assumptions (e.g. Stein & Albro 1997, Stein & Glenn 1979, Mandler & Johnson 1977, Thorndyke 1977):
• Stories have some kind of internal structure which is relatively stable within a culture and narrative genre.
• This internal structure can be described as an abstract, hierarchical network of higher- and lower-order components (nodes) realized linguistically in the surface structure of a text.
• This network of nodes, in turn, reflects the mental organization of story components in processors, i.e. their narrative schema. 27

Story grammar approaches have repeatedly shown the psychological validity of story schemata, especially through recall and comprehension studies (e.g. Stein & Policastro 1984, Mandler & Goodman 1982; cf. also ch. 3.1). Story grammar research on narrative production, on the other hand, is a comparatively recent approach (Bamberg & Moissinac 2003: 408) and it should be noted that the schemata used for understanding and production are not necessarily the same; schemata could be used successfully in comprehension, for example, while not being under operative control for production (e.g. Nelson 1986a: 12, Mandler & Johnson 1977). Story grammar analysis is used in the present study to analyze the narrative coherence of the stories elicited, i.e.
their content structure. Narrative coherence serves as an indicator of cognitive development, since the structure of participants’ texts reflects their cognitive pre-planning abilities in the sense of having command over a narrative schema. The successful encoding of schema nodes is directly related to participants’ linguistic abilities, however, and differences in linguistic proficiency, e.g. between adults and children or between L1 and L2 speakers, can therefore influence any results obtained for narrative coherence (cf. also ch. 3.1). 28 In the present study the linguistic threshold for considering component nodes as realized was set very low in order to do justice to any such potential gap between cognitive and linguistic abilities; this will become clear in ch. 5.2.1.

27 Just like memory structures in general, story schemata are hypothetical constructs, i.e. they are not accessible to direct observation (e.g. Schermer 2006: 12f.). Instead, conclusions are drawn from input/output to processes and structures in the mind (e.g. Eikmeyer et al. 1995). See also the similar discussion on accessing competence in an L2 through performance measures (Ellis 1994: 12f.).

3.3.2 Story grammar: Narrative components 29

Prototypical stories, as described by story grammars, have a problem-resolution structure and are goal-based, i.e. a problem arises, the protagonist forms a goal to solve the problem, carries out one or several attempts to do so and in the end either succeeds or fails. 30 This prototypical structure is operationalized with the help of narrative components: According to story grammar approaches to narrative coherence, a story has a deep structure consisting of a number of narrative components, which are either terminal nodes, i.e. states or events directly encoded in the surface structure of the story, 31 or higher-order nodes, which have terminal nodes attached to them (e.g. Stein & Glenn 1979, Mandler & Johnson 1977, Thorndyke 1977; cf.
also the previous section). A story minimally consists of a SETTING 32 component followed by an episode (e.g. Westby 1984: 111, Stein & Glenn 1979: 59, Mandler & Johnson 1977: 117); this is depicted in Fig. 3.2. A single episode includes an INITIATING EVENT, a development section, which incorporates the main body of the story, and an ENDING (Fig. 3.2). The development section in turn consists of a complex reaction, triggered by the initiating event, which can be subdivided into a SIMPLE REACTION and a GOAL-PLAN or simply GOAL. These are followed by a pursuit of the GOAL, i.e. the goal-path, which is composed of an ATTEMPT and its OUTCOME. In the following, I will describe the terminal nodes of the story schema, as identified by most story grammars for a single episode, in more detail.

28 Again, this relates to the fundamental methodological question whether underlying competences can be tapped through performance measures (Ellis 1994: 12f.; cf. also Chafe 1980).
29 The story grammar presented in the following is the result of my in-depth comparison of different earlier story grammar versions and it represents a synthesis of these earlier versions.
30 For better readability, I will be using the singular form protagonist to stand for one or more protagonists, whether acting in unison or pursuing different goals.
31 States and events can be either external or internal (cf. Peterson & McCabe 1983: 69f., Mandler & Johnson 1977: 115).
32 In the following, terminal nodes, as identified by story grammar and illustrated in Fig. 3.2, will be written in capital letters.

† Terminal nodes are written in capital letters. (Note to Fig. 3.2)

The SETTING is one of the integral constituents of story grammars (e.g. Stein 1982, Mandler & Johnson 1977, Rumelhart 1975). 33 It describes the state of affairs prior to the INITIATING EVENT, i.e. it provides information on initial time and location, participating characters and props, as well as additional background details (e.g.
Shapiro & Hudson 1991, Mandler & Johnson 1977: 118). The minimal requirement for a setting is the mention of one animate character (Stein & Policastro 1984: 119, Mandler & Johnson 1977: 118), which is why the character introduction is also labeled major setting as compared to other minor setting information (Stein & Glenn 1979: 62). The general function of the SETTING is to provide a contextual embedding for the listener in which to integrate the ensuing episode (cf. Berman 2001a: 1, Peterson & McCabe 1991: 41). 34 This listener orientation requires the narrator to be aware of the contextual information a listener needs in order to build an initial representation in which he or she can successfully integrate the following episode(s) (cf. Berman 2001a: 2, Peterson & McCabe 1983: 54). In other words, the development of the SETTING must be seen in relationship with theory-of-mind development, i.e. the awareness of shared as opposed to unshared knowledge. As such, the development of the SETTING forms part of children’s cognitive development. Further evidence for this comes from recall studies: Character introduction, for example, which is the minimal requirement for a SETTING, seems to “act as a marker for the initiation of the [story] recall schema” (Stein & Glenn 1979: 62) just like beginning formulae such as once upon a time in English or es war einmal in German.

Fig. 3.2 Narrative components of a single episode structure

33 In reference to highpoint analysis (Labov [1972] 1999, Labov & Waletzky 1967) the SETTING is sometimes also called orientation.
34 This study looks at oral narratives, thus the addressee is always referred to as a listener. The explanations given apply just the same to a reader.

The INITIATING EVENT (e.g. Berman & Slobin 1994b, Stein & Policastro 1984: 118, Peterson & McCabe 1983: 67, Mandler & Johnson 1977: 119) marks the beginning of a story’s problem-resolution structure.
35 It consists of one or more events that make some change in the environment of the protagonist and typically triggers a complex reaction in the protagonist. Thus, the main function of the INITIATING EVENT is to evoke an emotional response in the protagonist (SIMPLE REACTION), which leads to the formation of the GOAL-PLAN. 36 In a simple episode, INITIATING EVENTs are external or internal events, i.e. natural occurrences, action(s) performed by one of the story characters or thoughts about and perceptions of external events, which lead to changes in the habitual states described in the SETTING (Stein & Glenn 1979: 63, Mandler & Johnson 1977: 115). 37 In more complex episodes, the INITIATING EVENT itself can consist of one or more subordinate episodes (e.g. Mandler & Johnson 1977: 115ff.).

The SIMPLE REACTION, which directly follows the INITIATING EVENT, is often implicit, i.e. not overtly expressed in the story (Stein & Glenn 1979: 64, Mandler & Johnson 1977: 121). It refers to the internal state of the protagonist in reaction to the INITIATING EVENT, i.e. any descriptions of emotional responses (e.g. sadness) or statements about a character’s thoughts (cognitions) (Shapiro & Hudson 1991, Stein & Glenn 1979: 64, Mandler & Johnson 1977: 119). The primary function of the SIMPLE REACTION is to motivate the GOAL-PLAN, which in turn results in the goal-path, i.e. action(s) by the protagonist (Stein & Policastro 1984: 144).

The GOAL-PLAN, or simply GOAL, consists of the protagonist’s internal plan to overcome the problem raised by the INITIATING EVENT, i.e. statements referring to desires, intentions or strategies of the protagonist, which motivate the following plan application sequence, namely the goal-path (e.g. Trabasso & Rodkin 1994: 106, Shapiro & Hudson 1991, Stein & Glenn 1979). Just like the SIMPLE REACTION, the GOAL is often left to be inferred from the story context (e.g. Stein & Glenn 1979: 65, Mandler & Johnson 1977: 121).
The goal-path induced by the complex reaction consists of an ATTEMPT and an OUTCOME, i.e. one or several actions carried out in order to achieve the GOAL and the subsequent result(s) of these actions (e.g. Stein & Policastro 1984: 118, Mandler & Johnson 1977: 123). 38 If the ATTEMPT is followed by a successful OUTCOME, i.e. the goal has been achieved and the problem resolved, the ENDING of the episode is reached (Stein & Glenn 1979: 58ff., Mandler & Johnson 1977: 123). The goal-path can be recursive, though: If the first attempt to reach the GOAL is marked by failure, the goal is reinstated and motivates another attempt. 39 Thus, several attempts and their outcomes can follow each other until the overall goal is finally reached. As long as they are reinstantiations of the same main goal, these attempts are all part of the same ATTEMPT section (Trabasso & Rodkin 1994: 99ff., Mandler & Johnson 1977: 123, Thorndyke 1977: 80).

35 The INITIATING EVENT is also referred to as beginning (e.g. Mandler & Goodman 1982) or onset of the plot (e.g. Berman & Slobin 1994b).
36 Less typically, a SIMPLE REACTION is triggered, which is followed directly by an action (cf. Mandler & Johnson 1979: 119).
37 See, for example, Boueke et al.’s (1995) and Quasthoff’s (1980) call for a disturbance in the normal course of events as a criterion for storiness (cf. the story grammar criticism in ch. 2).
38 The OUTCOME is also referred to as consequence (e.g. Stein & Policastro 1984, Peterson & McCabe 1983).

The ENDING concludes an episode (e.g. Mandler & Goodman 1982: 509, Thorndyke 1977). 40 Even though considerable redundancy between OUTCOME and ENDING may occur in stories with a single goal-path (cf. Mandler & Johnson 1977: 123), the two components can normally be clearly distinguished: While an OUTCOME is the immediate result of an ATTEMPT, the ENDING is connected to the overall GOAL and therefore to the episode as a whole (Mandel & Johnson 1984: 647, Mandler & Johnson 1977: 123).
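The component structure described in this section can be pictured as a small hierarchical data structure. The following Python sketch is purely illustrative and is not the coding procedure used in the present study (cf. ch. 5.2.1): the class and function names are hypothetical, the sample story content is invented, and counting a terminal node as realized as soon as any surface text is attached to it is a simplifying assumption. In line with the description above, repeated attempts at the same main goal are counted as a single ATTEMPT section.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Terminal nodes of a single-episode story, in canonical order.
TERMINAL_NODES = (
    "SETTING", "INITIATING EVENT", "SIMPLE REACTION",
    "GOAL", "ATTEMPT", "OUTCOME", "ENDING",
)

@dataclass
class GoalPathCycle:
    """One attempt and its immediate outcome (hypothetical helper type)."""
    attempt: Optional[str] = None
    outcome: Optional[str] = None

@dataclass
class Episode:
    initiating_event: Optional[str] = None
    simple_reaction: Optional[str] = None   # often left implicit
    goal: Optional[str] = None              # often left implicit
    goal_path: List[GoalPathCycle] = field(default_factory=list)  # may recur
    ending: Optional[str] = None

@dataclass
class Story:
    setting: Optional[str] = None
    episode: Episode = field(default_factory=Episode)

def realized_components(story: Story) -> List[str]:
    """Return the terminal nodes that have surface text (assumed criterion)."""
    ep = story.episode
    present = {
        "SETTING": story.setting is not None,
        "INITIATING EVENT": ep.initiating_event is not None,
        "SIMPLE REACTION": ep.simple_reaction is not None,
        "GOAL": ep.goal is not None,
        # Recursive attempts at the same goal form one ATTEMPT section.
        "ATTEMPT": any(c.attempt is not None for c in ep.goal_path),
        "OUTCOME": any(c.outcome is not None for c in ep.goal_path),
        "ENDING": ep.ending is not None,
    }
    return [name for name in TERMINAL_NODES if present[name]]

# Invented sample story with two goal-path cycles and implicit reaction/goal.
story = Story(
    setting="A boy had a frog.",
    episode=Episode(
        initiating_event="One night the frog ran away.",
        goal_path=[
            GoalPathCycle("He looked in his boot.", "The frog was not there."),
            GoalPathCycle("He looked behind a log.", "He found the frog."),
        ],
        ending="They went home together.",
    ),
)

print(realized_components(story))
```

A count such as `len(realized_components(story))` would then correspond to a simple quantitative coherence score, while the list itself shows which individual components are present.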
The ENDING wraps up the story and can include statements referring to one or several of the following: Successful attainment of the goal, emotional and cognitive responses of one or several story characters to the final state of affairs (i.e. goal attainment), actions resulting from these psychological states, or long-term consequences occurring as a result of goal attainment (e.g. Stein & Policastro 1984: 118, Pradl 1979: 21, Mandler & Johnson 1977: 123). Long-term consequences can take the form of morals or genre-related linguistic formulae such as they lived happily ever after.

The task material in the present study (cf. ch. 5.1.3) can be described as a single-episode story with a recursive attempt section (cf. ch. 5.2.1 below). 41 Consequently, only the narrative components of a minimal story containing a single episode were described above, even though stories are generally far more complex. Nevertheless, even complex stories are variations of the basic one-episode story structure in that they may include several such episodes either following each other or embedded in the one-episode structure (see e.g. Bamberg & Marchman 1990, Stein & Glenn 1979, Mandler & Johnson 1977).

Based on the theoretical assumptions outlined so far, the coherence analysis conducted in the present study addresses the following questions:
• How coherent are participants’ stories as measured by the number of narrative components?
• Are there any qualitative differences in coherence as measured by the frequency of individual components?
• Are there quantitative and/or qualitative differences in coherence attributable to grade, sex or L2 preschool experience?

39 Reinstantiation can involve the creation of subgoals adapted to the specific circumstances described at that point in the narrative, i.e. local goal-plans (Trabasso & Rodkin 1994: 93).
40 The ENDING is also referred to as reaction (e.g. Stein & Policastro 1984).
41 But cf.
Bamberg & Marchman (1990, 1991, 1994) for a description as a multiple-episode story.

3.4 Cohesion: From references to lexical cohesion

As explained in ch. 3.2, cohesion is a linguistic phenomenon. Any development in the use of cohesive devices therefore reflects a linguistic development. The present study traces the development of the five “classical” categories of cohesion going back to Halliday and Hasan (1976), namely references, connectives, substitution, ellipsis, and lexical cohesion. Each one of these categories will be described in more detail in the following sections.

First of all, however, a general distinction should be made according to the phoricity of a cohesive device. That is, cohesive devices can refer either to elements in the extralinguistic context (exophora) or in the linguistic context (endophora) (e.g. Hickmann 2003 & 1991, Halliday & Hasan 1976). Even though the linguistic forms used for referring to linguistic and situational context are the same, exophoric reference does not directly contribute to cohesion (Halliday & Hasan 1976: 37). 42 The difference in phoricity corresponds to the developmental challenge children face: While even very small children are able to use cohesive devices exophorically (cf. ch. 4), they need to master the intralinguistic use of these cohesive devices, i.e. they need to learn “to use language as its own context” (Hickmann 1991: 157).

Additionally, endophoric cohesive relations can be either anaphoric, i.e. they point towards an antecedent in the preceding text (examples (3.3) and (3.4)), or cataphoric, i.e. they point towards a referent in the following text (3.5). 43 However, cataphoric ties are comparatively rare and seldom cohesive (Halliday & Hasan 1976: 56).

(3.3) The cat ran away and it was never seen again.
(3.4) The cat ran away and Ø was never seen again.
(3.5) It was gone. The cat had finally run away.
The following questions will be investigated with regard to the overall use of cohesive devices:
• How cohesive are participants’ stories as measured by the use of cohesive devices?
• Are there any qualitative differences as measured by the use of different types of cohesive devices?
• Are there quantitative and/or qualitative differences in cohesion attributable to grade, sex or L2 preschool experience?

Similar questions will need to be answered for each of the subtypes of cohesion under investigation. However, these additional questions will be presented in the corresponding sections below.

42 Most cohesive devices, i.e. substitution, ellipsis, connectives and lexical cohesion, are essentially endophoric relations, however, so that the distinction between endophora and exophora mainly becomes important for references (cf. Halliday & Hasan 1976).
43 The term anaphora as it is used here refers to a relationship of co-reference, i.e. between one textual element and its antecedent (Crystal 2003: 24, Bamberg 1986: 231). In a more limited use of the term, it can be used to describe a relationship between a nominal expression and a pronoun (Bamberg 1986: 231).

3.4.1 References

Like all other cohesive devices, references depend on something else for their interpretation, but “[i]n the case of reference the information to be retrieved is the […] identity of the particular thing or class of things that is being referred to” (Halliday & Hasan 1976: 31), i.e. the referent. Three types of references can be distinguished, namely personal, demonstrative and comparative references (ibid.: 31ff.). Personal references are established through pronominal markers referring to the identity of people, objects, and events. They are expressed linguistically with the help of personal and possessive pronouns, e.g. she, mine, his. Personal reference devices between clauses include relative pronouns, e.g. who or which (Quirk et al. 1985: 365).
Demonstrative references, on the other hand, identify a referent by place or time on a temporal and spatial proximity scale through the use of demonstrative pronouns and adverbs, e.g. this, here, now, 44 as well as the definite article the. Comparative references, finally, are established by means of identity, similarity or difference expressed through adjectives 45 and adverbs such as same, other, better, more or similarly.

The analysis of the use of references in the present study addresses the following questions:
• How often do participants use referential ties? 46
• Are there differences in the use of references attributable to grade, sex or L2 preschool experience?

3.4.2 Substitution and ellipsis

Substitution and ellipsis are two closely related processes, which rely on structural relations for reference to a presupposed item (Halliday & Hasan 1976: 90), i.e. they act as a grammatical signal indicating that a presupposed item must be recovered from the preceding text (ibid.: 308). While reference describes a relationship between meanings, substitution and ellipsis thus describe a relationship between linguistic items (Halliday & Hasan 1976: 89).

44 Also then as a temporal adverb, e.g. In my twenties I was pretty chubby. I loved cheeseburgers (back) then. The demonstratives then and now have to be distinguished from the homonymous temporal connectives (cf. Halliday & Hasan 1976: 261).
45 Halliday and Hasan’s use of adjective includes what other authors would rather label determiners and semi-determiners (e.g. other, cf. Biber et al. 1999: 280, Greenbaum 1996: 213).
46 In the following, the terms reference, connective etc. are used synonymously with referential ties, connective ties etc., since only cohesive devices forming ties were included in the analysis.

Substitution refers to a presupposed element (a word or a group of words) by replacing it with a substitute item.
Ellipsis, on the other hand, can be defined as substitution by zero (Halliday & Hasan 1976: 88ff.) or grammatical omission (Quirk et al. 1985: 883) in that a sentence’s grammatical structure contains an empty slot, which needs to be filled with a presupposed item from a neighboring part of the text (ibid.: 861, Halliday & Hasan 1976: 142ff.). Semantically, substitution and ellipsis express general, “class” identity but at the same time non-identity of the actual referent, as opposed to reference, which presupposes the exact identity of referents (Halliday & Hasan 1976: 324). Thus, substitution and ellipsis are used to add a contrast to the antecedent (ibid.: 314ff.), e.g. boots - new ones (example (3.6)).

Substitution and ellipsis each have three subtypes: Nominal, verbal, and clausal. In nominal substitution the head of a nominal group 47 is substituted by one(s) (Halliday & Hasan 1976: 91ff.), as in example (3.6), or an entire nominal group by same, typically accompanied by the (ibid.: 105ff.). 48 In nominal ellipsis the head of a nominal group is omitted in the presupposing structure; its function can then be taken over by a modifier, as in (3.7). Also, an entire nominal group can be omitted, as in (3.8).

(3.6) I lost my boots. Now I’ve bought new ones.
(3.7) I used to have three red blouses. Now I’ve got two Ø.
(3.8) The little frog got out of his glass and Ø ran away.

Verbal substitution means substituting a lexical verb (and possibly additional elements) by a form of do (ibid.: 112ff.), while in verbal ellipsis the lexical verb (lexical ellipsis as in (3.9)) or finite auxiliary and subject are omitted from the verbal group (operator ellipsis as in (3.10)) and left to be inferred from the preceding text (ibid.: 170ff.). 49 In both types of verbal ellipsis additional elements can be left out.

(3.9) The frog may have escaped. Or he may not (have) Ø.
(3.10) The boy had not found the frog and instead Ø gone home.
Clausal substitution is realized by so and not presupposing an entire clause (ibid.: 130ff.; example (3.11)). In clausal ellipsis, on the other hand, either subject plus finite element of the verbal group (modal ellipsis as in (3.12)) or the remainder of the verbal group as well as any complements and adjuncts (propositional ellipsis; example (3.13)) are omitted (ibid.: 196ff.). Modal ellipsis typically includes operator ellipsis in the verbal group, propositional ellipsis includes lexical ellipsis (ibid.: 199). In other types of clausal ellipsis an entire clause is left out, e.g. in yes/no ellipsis as in (3.14). All forms of clausal ellipsis typically occur in question-answer sequences (cf. ibid.: 196ff.).

(3.11) Mary will drive tomorrow. If not, I will.
(3.12) A: What are you doing? B: Ø reading a book.
(3.13) You can come to my house tomorrow, if you want Ø.
(3.14) A: Have you done your homework? B: Yes Ø.

47 Halliday’s nominal group is defined as any constituent able to function as a subject or complement (Bloor & Bloor 1995: 259).
48 A related process, namely replacement through a general word such as thing, is covered by lexical cohesion in Halliday and Hasan’s system, cf. ch. 3.4.4.
49 For a definition of the Hallidayan verbal group cf. Bloor and Bloor (1995).

The analysis of the use of substitution and ellipsis in the present study addresses the following questions:
• How often do participants use ties by substitution/ellipsis?
• Are there differences in the use of substitution/ellipsis attributable to grade, sex or L2 preschool experience?

3.4.3 Connectives

Connectives make explicit the semantic relationship between one clause or sentence and another and thereby relate even text sequences which are structurally unconnected, i.e. not part of the same grammatical construction (Halliday & Hasan 1976: 226ff.). Formally, this can be accomplished with the help of conjunctions, e.g. and, simple or compound adverbs, e.g.
then, finally or afterwards, prepositional phrases (with or without that), e.g. as a result (of that) or on the contrary, and other complex constructions, e.g. one day later.

Semantically, most connectives express one of four basic relationships, namely additive, temporal, causal and adversative (e.g. Biber et al. 1999: 79ff., Quirk et al. 1985: 915ff., Halliday & Hasan 1976: 226ff.). 50 Additive connectives make explicit the underlying structural relationship of coordination, which joins linguistic elements to function as one complex element of structure, in the sense of one element being an addition to the other. Temporal connectives express a relationship of sequence or simultaneity, while causal connectives mark a relationship of causality and adversative connectives express a contrast. There is no simple correspondence between the form and meaning of a connective, however. The coordinating conjunction and, for example, can express several types of relations besides the additive one, e.g. a temporal relation (I bought some milk and went home) or an adversative relation (I asked him to help me and he did not want to) (cf. Quirk et al. 1985: 930ff.).

In addition to formal and semantic criteria, connectives can be distinguished according to the syntactic role of the elements they link, i.e. coordination in a paratactic or subordination in a hypotactic construction. But, for example, expresses an adversative relation between two coordinate clauses, while because marks a causal relationship between a main clause and the subordinate clause it introduces.

50 No general agreement exists on the number of categories and subcategories. Other authors recognize additional categories, for example concessive and conditional (e.g. Biber et al. 1999, Quirk et al. 1985), which Halliday and Hasan would subsume under the category adversative and causal, respectively. Halliday and Hasan, on the other hand, recognize an additional “residual” category continuatives (1976: 267ff.).
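The four-way semantic classification can be illustrated with a small lookup sketch in Python. It is only an illustration under stated assumptions and not the coding scheme of the present study: the dictionary maps the connective forms mentioned in this section to a single default reading, the function name and the sample clauses are invented, and the lookup deliberately ignores the point made above that one form (e.g. and) can express several relations, which in practice requires a decision in context.

```python
import re
from collections import Counter

# Assumed default reading per form; real coding must check each use in context.
DEFAULT_READING = {
    "and": "additive",
    "then": "temporal",
    "finally": "temporal",
    "afterwards": "temporal",
    "one day later": "temporal",
    "because": "causal",
    "as a result": "causal",
    "but": "adversative",
    "on the contrary": "adversative",
}

def count_connectives(clauses):
    """Count candidate connectives per semantic category (naive form matching)."""
    counts = Counter()
    for clause in clauses:
        text = clause.lower()
        for form, category in DEFAULT_READING.items():
            # Whole-word match so that e.g. "then" is not found inside "the".
            pattern = r"\b" + re.escape(form) + r"\b"
            counts[category] += len(re.findall(pattern, text))
    return counts

# Invented sample clauses.
clauses = [
    "The boy looked in the boot and then he called the frog",
    "but the frog was gone because the window was open",
]
counts = count_connectives(clauses)
print(dict(counts))
```

Because the matching is purely formal, every hit is only a candidate: a coder would still have to decide, for instance, whether a given and is additive, temporal or adversative, and whether then is a connective or a demonstrative adverb.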
In studying the use of connectives it is crucial to keep in mind that the semantic relations between clauses and sentences do not have to be made explicit (e.g. de Beaugrande & Dressler 1981, Mandler & Johnson 1977, Halliday & Hasan 1976: 229). 51 Two sequentially stated event clauses, for example, are generally assumed to reflect the order of occurrence of these events, i.e. listeners automatically infer that the event stated first also occurred first; an explicit linguistic marking of this relationship is therefore only necessary if the sequence is out of order (e.g. Hickmann 2003: 95, Peterson & McCabe 1991: 31). Concerning the development of the use of connectives this means that a lack of connectives does not automatically indicate the lack of the underlying concept. Even children at age three, for example, are able to attribute actions to goals of the person acting (Sodian 2002: 457), i.e. to infer causal relationships. Nevertheless, children at that age rarely use connectives at all, and explicitly causal ones even less (cf. ch. 4).

The analysis of the use of connectives in the present study addresses the following questions:
• How often do participants use connective ties?
• Are there differences in the use of connective ties attributable to grade, sex or L2 preschool experience?

3.4.4 Lexical cohesion

Lexical cohesion is the most frequent type of cohesion in texts and thus also carries most of their cohesive load (e.g. Bae 2001, Hoey 1996, Hasan 1984). It can be defined as “the cohesive effect achieved by the selection of vocabulary” (Halliday & Hasan 1976: 274), i.e. ties created through the lexical (rather than textual) relations between nouns, adjectives, lexical verbs and, to a lesser extent, also adverbs (e.g. Beigman Klebanov & Shamir 2005, Hoey 1996: 6ff.). Such relations between lexical items can involve a general semantic relationship, e.g. a hyponymic relation as between frog and animal, or an instantial one, i.e.
a relationship created by the content of the specific text (Hasan 1984: 202, Halliday & Hasan 1976: 289). Such an instantial lexical relationship is shown in (3.15), where dog and Bello denote the same referent just as frog and Bingo denote the same referent.

(3.15) 1[One night, Tim # had a dog, 2[names Bello,] and a frog # 3[who called # Bingo.] # 4[they were very happy] 5[that they had b/ Bingo.] (male fourth grader C5-G4-3)

Two large subcategories of lexical ties can be distinguished: Reiteration and collocation (Halliday & Hasan 1976: 274ff.). Reiteration refers, first of all, to ties created by repetition, which range from simple reproductions, e.g. boy - boy, to the complex repetition of lexical items that share a lexical morpheme but are not formally identical, e.g. boy - boyish (Halliday & Hasan 1997 [1985]: 81, Hoey 1996: 55). Reiteration also includes ties by synonymy and near-synonymy, recurrence to a hyperonym or the use of a general noun such as thing or man (Renkema 2004: 105, Halliday & Hasan 1976: 277ff.). At the same time the relationship between two lexical items tied by reiteration can be described on a continuum ranging from mere identity of form (homonymy) to identity of referent (Halliday & Hasan 1976: 283). Non-identity of the referent is usually compensated for by a higher number of additional lexical ties than in instances with identical reference (ibid.: 283).

51 See also the distinction Halliday and Hasan make between external connectives, which make explicit relations between events, processes or states, and internal connectives, which are language-internal structuring devices (1976: 239ff.).

Collocation refers to the cohesive effect between lexical items that “are in some way associated with each other in the language” (Halliday & Hasan 1976: 285), i.e. lexical items which regularly co-occur. This ranges from being systematically connected through semantic relations such as co-hyponymy, (co-)meronymy and opposition (e.g.
antonymy, converses), i.e. those relations not included in the reiteration category, to habitual co-occurrence with no readily specifiable sense relation, e.g. laugh - joke or cut - knife (ibid.: 285). Collocation in this latter sense of co-occurrence may in most cases be attributable to relatedness through schema membership (cf. ch. 3.1), e.g. a TELLING A JOKE schema and a CUTTING schema in the case of the examples given. The strength of the cohesive effect achieved by collocation is determined by the lexical items’ proximity in the lexical system, their distance in the text at hand, and their overall frequency in the language system (ibid.: 290).

The analysis of the use of lexical cohesion in the present study addresses the following questions:
• How often do participants use lexical ties?
• Are there differences in the use of lexical ties attributable to grade, sex or L2 preschool experience?

3.5 Narrative production: Summary and research questions

As explained in the previous sections, two fundamental dimensions of (narrative) discourse can be distinguished: Coherence and cohesion. Cohesive devices are text-based, linguistic devices, so that the development of their use reflects a development in linguistic abilities. Coherence, on the other hand, which in the most general terms refers to the content structure of a text, is based on narrative schemata. The development of narrative coherence therefore reflects a development in cognitive abilities. Although it may not immediately be evident how L2 data could reflect cognitive development, the overview of previous research given in ch. 4 will show that children’s narrative organization, and thus their underlying schema, evolves very strongly within the age range of the present study’s participants. Additionally, it will be shown that this development is evident not only in an L1 but also in an L2 within the age range investigated.
By exploring the coherence and cohesion of immersion students’ L2 narratives I will thus trace aspects of their cognitive and linguistic development from first to fourth grade. That is, two main research questions are addressed in the present study:
• Is there a cognitive development from first to fourth grade as measured by L2 narrative coherence?
• Is there a linguistic development from first to fourth grade as measured by L2 narrative cohesion?
The research questions presented in the preceding sections should be seen as outlining necessary steps towards answering these main questions. Additionally, a possible relationship between coherence and cohesion will be explored, i.e. the following question will be addressed:
• Is there a relationship between (the development of) L2 cohesion and coherence?

4 The development of storytelling

Compared to well-researched areas in (second) language acquisition, such as morphosyntax or the lexicon, few studies have been conducted on narrative coherence and cohesion and their development. Furthermore, the majority of the latter investigated L1 speakers only; that is, the field of narrative coherence and cohesion is under-researched not only for L1 but even more so for L2 and bilingual data. In order to obtain as complete a picture as possible of the development of coherence and cohesion, the following overview will therefore give a synthesis of findings (organized on a developmental time line wherever possible) on L1 as well as L2 and bilingual speakers; this is especially important since participants of the present study are child L2 learners, whose L1 development must be considered far from concluded. What the present chapter will show is that children’s stories become more coherent within the age range investigated in the present study, be it in the L1 or an L2. Narrative schemata seem to be transferable from one language to the other, dependent only on the availability of the necessary linguistic means.
At the same time children’s use of cohesive devices increases with age and becomes more sophisticated. Moreover, the development of the use of cohesive devices is parallel in the L1 and the L2.

Sex and preschool experience will not be taken into account for the following reasons: Even though sex is generally included among the potential factors of influence on (second) language acquisition, its impact is far from clear (cf. for example the overview in Ellis 1994) and the influence of sex on cohesion and coherence has rarely been investigated. In those instances where it was considered, sex was typically found to be of no influence on overall coherence and cohesion (e.g. Allen et al. 1994, Stenning & Mitchell 1985, Peterson & McCabe 1983, Botvin & Sutton-Smith 1977), even though some studies found at least minor qualitative differences (e.g. Stenning & Mitchell 1985, Peterson & McCabe 1983, cf. also the overview in Hendrickson & Shapiro 2001). With respect to preschool experience, there is an even more severe lack of previous studies: To the best of my knowledge, no study has so far compared young L2 learners with and without L2 preschool experience, so no prior findings are available. Additional influence factors, for example comprehension skills (Cain 2003), or mothers’ encouragement in mother-child storytelling activities (Kang et al. 2009), will also not be covered in the developmental overview, since they were not investigated as part of the Kiel Immersion project and are thus not available for the present study. It should be kept in mind for the interpretation of the results, however, that variables such as these might also have been found to influence results had they been investigated.

In the following sections I will give an overview of previous findings on the development of narrative coherence in L1 learners’ discourse (ch. 4.1) as well as in that of L2 and bilingual learners (ch. 4.2).
Then I will summarize findings on the development of cohesion in L1 learners’ discourse (ch. 4.3) and in that of L2 and bilingual learners (ch. 4.4). From this overview it will become clear that, due to the age range under investigation, a development in the coherence and cohesion of participants’ discourse is to be expected in the present study, irrespective of whether L2 or L1 data is analyzed. After the developmental overview of coherence and cohesion, an overview of findings on the relationship between these two discourse characteristics will be given (ch. 4.5), which will show that the results of earlier studies follow no consistent pattern. The chapter concludes with a summary in which several hypotheses are formulated with respect to the research questions raised in the previous chapter.

4.1 The development of narrative coherence in L1 acquisition

Children’s earliest narrative productions seem to be accounts of personal experiences (personal narratives) in reaction to adult questions; that is, at age two, children are already able to relate personal narratives if heavily prompted and encouraged by their parents, i.e. “storytelling” tends to be a joint activity of parental questions and child answers (McCabe & Peterson 1991a). Between ages three and four children’s narrative productions remain short strings of events without any global organization structure, which are, in spontaneous production, most often either (again) accounts of personal experiences or general event representations, i.e. scripts (Berman 2004: 266ff., Hudson & Shapiro 1991: 121ff., McCabe & Peterson 1991a: 246f., Nelson & Gruendel 1986: 41ff., Seidman et al. 1986: 167ff.). With respect to fictional stories it was found that at age three children are already able to respond to a picture-elicited storytelling task, even if they still have difficulties in concentrating and need a very high degree of interviewer support (Berman & Slobin 1994b: 58ff.).
The resulting story productions consist of a very limited number of narrative components and mainly correspond to descriptions of the event content of some of the pictures, i.e. two or three events connected by and then (e.g. Berman 2004: 266ff., Berman & Slobin 1994b: 46, Berman 1988, Peterson & McCabe 1983). While there is no evidence for an underlying story schema in production at this age, this fact does not exclude the possibility that children are using a story schema for comprehension and recall (Trabasso et al. 1995, Seidman et al. 1986; cf. also the overview in Hudson & Shapiro 1991: 100f.). From age three to four onward, children’s personal narratives and fantasy stories gradually develop from being descriptions of isolated objects, events or states into event sequences. Especially from age five there is also increasing evidence that some children arrange their stories hierarchically, i.e. in relation to an overall goal-plan, even if such a global organization remains the exception (e.g. Boueke et al. 1995, Berman & Slobin 1994, Hudson & Shapiro 1991, Berman 1988, Seidman et al. 1986: 167f., Peterson & McCabe 1983). At around age eight to nine, the development of scripts and personal narratives seems to be relatively complete (Hudson & Shapiro 1991: 123f., Peterson & McCabe 1983). Children’s ability to tell a globally organized fictional story, on the other hand, continues to develop. At around age nine to ten, a large number of children’s fictional stories finally start to resemble those of adults, whose stories are organized in relation to a global narrative structure reflecting the use of a narrative schema (e.g. Augst 2010, Augst et al. 2007, Shapiro & Hudson 1997, Stein & Albro 1997, Boueke et al. 1995, Berman & Slobin 1994b: 58, Berman 1988). 1 This increase in stories with a global organization is reflected by an increase in the overall number of narrative components as well as in the frequency of the individual components (e.g. Reilly et al.
1998, Berman & Slobin 1994b: 53f., Hudson & Shapiro 1991, Berman 1988). Some narrative components seem to be easier to acquire than others in the sense that they are produced from an earlier age. However, research results for individual components are not easy to compare, due to differing definitions and approaches to narrative structure (but cf. the developmental overview in Hudson & Shapiro 1991: 89ff.): Berman and Slobin and their colleagues (1994), for example, investigated (among other components) the development of the onset, which in their study corresponds to one part of what was outlined as story grammar’s INITIATING EVENT in ch. 3.3.2, in the story productions of both children (aged three to nine) and adults. They found that across five languages the large majority of children (78%) had acquired the onset by age five. Hudson and Shapiro (1991), on the other hand, found that their problem component, which corresponds to story grammar’s INITIATING EVENT, was realized by only around 34% of first graders (mean age 6;7). A percentage of realizations comparable to Berman and Slobin, namely over 70%, was not reached before third grade (mean age 8;7), i.e. almost four years later. Previous studies only seem to fully agree that by age four to five children include at least a primitive SETTING or orientation component, whose realization and complexity increase with age (e.g. Berman 2001, Peterson & McCabe 1983, Pradl 1979). As opposed to children, adults’ narrative productions all have a globally organized structure (Lanza 2001, Berman & Slobin 1994b: 75). Additionally, they are much more elaborate and detailed, e.g. in references to inner states (Berman & Slobin 1994b: 80). Further developments into adult age include the “individualization” of stories (Berman & Slobin 1994, Berman 1988). As Berman and Slobin (1994b: 74f.) note, narrative productions at age nine (i.e.
their oldest group of child participants) can best be described as stereotypical because there is so little individual variation. Adult stories, on the other hand, are very heterogeneous. 2

1 Note that the studies by Augst (2010), Augst et al. (2007) and Boueke et al. (1995) investigated written texts.
2 Children’s more stereotypical view of stories is also reflected in their story goodness ratings: Stein and Policastro (1984: 151), for example, found that adults (elementary school teachers) more readily included non-prototypical passages, e.g. non-goal-based reactive sequences, into the story category than did seven- to eight-year-olds.

To sum up, children’s storytelling abilities evolve continuously, from scriptlike personal narratives and picture descriptions (up to about age 3) to event sequences (from about age 3), and finally to globally organized narratives (from about age 5). While the structure of scripts, i.e. general event representations, and personal narratives can be considered acquired by around age eight, the ability to tell conventionally structured fantasy stories is a relatively late development. That is, the use of a global organization structure (story schema) can be found in the majority of children’s stories around age nine but it continues to develop into adulthood, accompanied by a greater individualization of storytelling styles. If children’s data is regarded in its own right, it can thus be argued that children successively acquire different types of story schemas. That is, very young children have acquired a “joint activity” schema, which corresponds to the way in which caregivers seem to engage in a storytelling task (e.g. Berman & Slobin 1994b: 60, Trabasso & Rodkin 1994: 104, McCabe & Peterson 1991a; see also Kang et al. 2009). Preschool and early school children, on the other hand, have acquired a descriptive model of storytelling (Stenning & Mitchell 1985: 275, Cook-Gumperz & Green 1984).
Older children, finally, start to acquire an adultlike schema, which involves a global organization structure as described, for example, by story grammars.

4.2 The development of narrative coherence in L2 and bilingual acquisition

The results obtained in studies focusing on the narrative coherence of child L2 learners and/or bilinguals confirm the age-related use and development found in studies on monolinguals. That is, the development of a global narrative structure begins at around age five and the number of narrative components as well as the frequency of the individual components increase with age (e.g. Montanari 2004, Akinçi et al. 2001, Kupersmitt & Berman 2001, Severing & Verhoeven 2001). However, most studies on narrative development in an L2 focused on participants older than age five, so there is a severe lack of studies on young L2 learners just as on early bilinguals. Whether narrative training is received in an L1 or L2, e.g. due to being enrolled in an immersion program, does not seem to have adverse effects on the development of storytelling abilities (Almgren et al. 2008 & 2007, Hüttner & Rieder-Bünemann 2007, Laurén 1998). Almgren et al. (2007), for example, found that L1 Spanish students in a Basque immersion program, who had been trained in storytelling via their L2 Basque, were able to transfer this knowledge to narrative production in their L1 Spanish. Several studies have shown, however, that a certain linguistic threshold must be passed to ensure that once the cognitive schema is available so are the linguistic means to express it in the L2 or, in the case of very early bilinguals, in both languages (e.g. Almgren et al. 2008 & 2007, Hüttner & Rieder-Bünemann 2007, Montanari 2004, Severing & Verhoeven 2001). Severing and Verhoeven (2001), for example, investigated the narrative productions of L1 Papiamento speakers educated in Dutch (mean age 5;2 to 10;4).
On the one hand they found a consistently higher performance in the L1, but on the other hand they also found that after four years of schooling the differences between participants’ L1 and L2 story productions had started to become smaller. This trend towards a similar performance in the L1 and L2 is supported by the results of studies on very early bilinguals, which observed almost no difference in coherence between bilinguals’ two languages once the necessary linguistic means had been acquired (e.g. Akinçi et al. 2001, Kupersmitt & Berman 2001, Lanza 2001, Viberg 2001). The impact of a linguistic threshold is even more evident in older L2 learners, who can be expected to have full command of a story schema as well as, from their L1 experience, a general understanding of how to express such structures linguistically. Myles (2003), for example, studied the narrative development of British L2 French learners after one (8th grade) and two (9th grade) years of classroom instruction. She found that not only the number of propositions, but also the frequency of the individual narrative components increased as a function of the L2 exposure, even though her eighth and ninth graders should already have had a well-developed narrative schema available for use. A comparison of Myles’ results with the studies on young L1 and L2 learners leads to the conclusion that the time of exposure (as well as related factors such as the amount of input) becomes the crucial factor for older learners, while in younger learners an overlap of age-related cognitive and linguistic developments with an exposure-related L2 development must be expected. Further confirmation for the impact of linguistic proficiency on coherence in L2 narratives comes from studies conducted on immersion students, who receive a much higher amount of language input than L2 learners in regular programs.
Immersion students have been found, first of all, to produce more coherent stories than students enrolled in regular foreign language teaching (Hüttner & Rieder-Bünemann 2007). At the same time immersion students have been found to perform similarly to monolingual comparison groups. Laurén (1998), for example, studied L2 Swedish stories produced by Finnish fifth graders, who had been enrolled in a Swedish immersion program for five years, preceded by one year in an immersion kindergarten. Laurén found that the immersion group’s mean number of components was only slightly lower than that of the L1 Swedish comparison group, i.e. both L1 and L2 speakers produced comparably coherent stories.

In summary, the development of coherence in L2 learners and bilinguals parallels the development found in L1 acquisition, i.e. a maturational and/or exposure-related increase is to be expected in the overall number of components as well as in the frequency of individual components. Once a narrative schema has been acquired, it can be used for narrative productions in any language, if L2 learners have acquired the necessary linguistic means to express the schema in the L2.

4.3 The development of cohesion in L1 acquisition

Children have been found to use cohesive devices in conversations and personal narratives from age two onwards and in fictional stories from age three (Berman & Slobin 1994b: 67, Peterson & Dodsworth 1991; cf. also the overviews in Shapiro & Hudson 1991 and Bamberg 1987). However, few studies on L1 and L2 narratives have investigated the full range of cohesive devices going back to Halliday and Hasan (1976) (most studies focused instead on character references and/or connectives), and even fewer have done so for oral data.
The overview of cohesion in L1 narrative productions given in the present section and the subsequent overview of cohesion in L2 narratives will therefore include findings from studies not only on oral but also on written data, even though oral and written stories are not necessarily directly comparable. A notable exception to studies covering only references and/or connectives is the study by Peterson and Dodsworth (1991), who studied the development of cohesion in two-year-olds’ personal narratives over a span of one and a half years. They found that even at age two children were able to use most of the subcategories differentiated by Halliday and Hasan (1976), the exceptions being comparative references, nominal ellipses, and substitutions, and by age three all of them. Peterson and Dodsworth also found that the overall number of cohesive ties increased significantly over the duration of their study, an increase mainly attributable to a rise in the use of personal references and conjunctions. The (sub-)categories of ellipsis and substitution, on the other hand, decreased or remained fairly stable. Additionally, Peterson and Dodsworth found a stable order of frequency for the five types of cohesion distinguished by Halliday and Hasan: Across all narrative productions lexical ties were the most frequently used, followed by references and conjunctions. Ellipses and especially substitutions, on the other hand, were used very infrequently. This order of frequency was confirmed in studies on the written L1 narratives of children (e.g. Cameron et al. 1995, Spiegel & Fitzgerald 1990, Crowhurst 1987, Yde & Spoelders 1985) as well as, for example, in adults’ essay writing (Neuner 1987), i.e. it seems to be largely medium- and genre-independent.
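The notion of an order of frequency across the five types of cohesion can be made concrete with a small tally. The following is a minimal sketch only, assuming that each cohesive tie in a story has already been hand-annotated with one of Halliday and Hasan's five categories; the function name `frequency_order` and the sample annotations are invented for illustration and do not reproduce the procedure or data of any study cited here.

```python
from collections import Counter

# Halliday and Hasan's (1976) five types of cohesion.
CATEGORIES = ("lexical", "reference", "conjunction", "ellipsis", "substitution")


def frequency_order(ties):
    """Return (category, count, share) tuples, most frequent category first."""
    counts = Counter(ties)
    unknown = set(counts) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown tie categories: {sorted(unknown)}")
    total = sum(counts.values())
    # sorted() is stable, so categories with equal counts keep the order above.
    ranked = sorted(CATEGORIES, key=lambda c: counts[c], reverse=True)
    return [(c, counts[c], counts[c] / total if total else 0.0) for c in ranked]


# Invented annotations for one short story (illustration only, not study data).
sample_ties = (
    ["lexical"] * 6 + ["reference"] * 3 + ["conjunction"] * 2 + ["ellipsis"]
)

for category, count, share in frequency_order(sample_ties):
    print(f"{category:<12} {count:>2}  {share:.0%}")
```

Reporting shares alongside raw counts is a deliberate choice in this sketch: stories differ in length, so proportions are easier to compare across participants than absolute numbers of ties.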
Research on written narratives has shown that by around age eight to nine (at the latest) a ceiling is reached in the number of cohesive devices: Studies comparing eight- to nine-year-olds with older peers found no further significant increase in cohesiveness or even a decrease in some of the subcategories (e.g. Manhardt & Rescorla 2002, Spiegel & Fitzgerald 1990; but cf. Yde & Spoelders 1985). Instead, a qualitative shift in the subcategories of cohesive devices was found for older children, for example a decrease of repetitions as opposed to an increase of synonyms and collocations within the lexical cohesion category (Crowhurst 1987). In oral narratives the same phenomenon has been shown for connectives (e.g. Berman & Slobin 1994d: 609f., Berman 1988; cf. also below). Most previous studies on cohesion have, as already mentioned, focused on connectives and references. With respect to connectives it was found that, although two- and three-year-olds already use connectivity markers in their narratives, their number is comparatively low (e.g. Berman & Slobin 1994, Peterson & Dodsworth 1991, Berman 1988). In the following years, i.e. up to around age nine, children’s use of connectives increases (e.g. Augst et al. 2007: 62ff., Hendrickson & Shapiro 2001, Berman & Slobin 1994c: 177ff., Strömqvist & Day 1993, Hudson & Shapiro 1991, Bennett-Kastor 1986). Berman and Slobin (1994c: 175ff.), for example, found that at age three their monolingual English-speaking participants marked under 50% of the relations between clauses, while five- and nine-year-olds did so in over 85%. However, around age ten (at the latest) the number of connectives seems to decrease again (e.g. Hickmann 2003: 287f., Boueke et al. 1995, Berman & Slobin 1994d: 609f., Berman 1988). With age the use of connectives also becomes more sophisticated.
This means, first of all, that children’s use of connectives shifts from a pragmatic to a discourse function and, secondly, that the variety and complexity of connectives increases, be it in oral (e.g. Verhoeven et al. 2002: 148ff., Berman & Slobin 1994c: 175ff., Strömqvist & Day 1993, Hudson & Shapiro 1991) or written narratives (e.g. Augst et al. 2007: 62ff., Verhoeven et al. 2002: 148ff., Boueke et al. 1995, Crowhurst 1987, Fitzgerald & Spiegel 1986, McCutchen & Perfetti 1982: 128). Thus, three-year-olds typically produce utterance-initial deictics (e.g. here) indicative of a picture description mode or they use the coordinator and with a discourse-pragmatic function, i.e. to indicate that “more is to come” (Berman & Slobin 1994c: 175f., Berman 1988). Nevertheless, even very young children are able to produce structurally subordinating and semantically temporal, causal or adversative connectives, especially if such relationships are encouraged by the elicitation material (Shapiro & Hudson 1997, Berman & Slobin 1994c: 176). In the following years children’s preference gradually shifts to temporal expressions of sequentiality, especially and then (Berman & Slobin 1994c: 177ff.; cf. also Augst et al. 2007: 348ff., Boueke et al. 1995: 149ff.). By age nine children still rely heavily on this temporal sequencing but they also produce more advanced connectives, e.g. connectives expressing causal and adversative relations or syntactically complex ones (e.g. ibid.). Adults, finally, use explicit expressions for temporal sequence (such as and then) only marginally, since they assume temporal sequence to be the default relation between clauses (Berman & Slobin 1994d: 610, Peterson & McCabe 1991: 47). Instead, adults produce an even larger variety of connective devices, e.g. more subordinators, and they integrate connectives more flexibly into the sentence structure, e.g. by using connectives such as finally also in non-clause-initial position (Verhoeven et al.
2002: 149f., Berman & Slobin 1994c: 180ff.). These maturational changes in the use of connectives are indicative of a development similar to the one found for narrative structure; that is, children’s use of connectives reflects a gradual movement away from a simple picture-by-picture description towards, first of all, a temporal structuring of events and, finally, a thematically-motivated global structure (Augst et al. 2007: 348ff., Boueke et al. 1995: 152ff., Aksu-Koç & von Stutterheim 1994: 419, Berman & Slobin 1994c: 175, Strömqvist & Day 1993).

Not only the use of connectives but also that of references becomes more sophisticated with age. This has been shown especially for personal references, where the changes reflect the developments found for connectives, i.e. from (a) a deictic to an intralinguistic use of referential devices and (b) towards global organization strategies (e.g. Hickmann 2003 & 1991, Shapiro & Hudson 1997 & 1991, Boueke et al. 1995: 143ff., Wigglesworth 1990, Bamberg 1987 & 1986, Karmiloff-Smith 1985). Three-year-olds use referential devices mainly deictically but children’s references gradually become intralinguistic (Hickmann 2003 & 1991, O’Neill & Holmes 2002, Karmiloff-Smith 1985). Once referential devices are used intralinguistically, children pass through several stages characterized by different referential strategies (e.g. Hickmann 2003 & 1991, Shapiro & Hudson 1997 & 1991, Boueke et al. 1995: 143ff., Bamberg 1987 & 1986, Karmiloff-Smith 1985): The intralinguistic use of referential devices is first of all dominated by the immediate linguistic context, i.e. children use a locally-motivated strategy (predominant from about age 3 to 5). This evolves into strict adherence to a thematic subject strategy (Karmiloff-Smith 1985), i.e. a globally-motivated use of referential devices (predominant from about age 5 to 7), and then into an adult-like use, which integrates local and global strategies (from about age 8).
Evidence on the respective age ranges is controversial, though: While Karmiloff-Smith (1985), for example, found that the thematic subject strategy develops relatively late, other studies report that it is already the most common strategy around age four and five (Shapiro & Hudson 1991, Bamberg 1987 & 1986). These differences in results could, however, be attributable to differences in task condition, task material or language typology (cf. Berman 2004, Shapiro & Hudson 1997 & 1991, Bamberg 1986, and the discussions in Hickmann 2004: 299ff. & 1996: 205ff.). There is general agreement at least that children’s reference system continues to develop even beyond age eight (e.g. Hickmann 2003 & 1991, Shapiro & Hudson 1997: 40, Wigglesworth 1990, Karmiloff-Smith 1985).

To sum up, some cohesive devices have been observed in children’s narratives from age two onwards. The number and type of cohesive ties increase with age while at the same time the use of cohesive devices becomes more sophisticated. However, evidence especially from written narratives indicates a ceiling in the number of cohesive ties, which seems to be reached at around age eight to nine. Further development seems to be limited to qualitative changes within the subcategories.

4.4 The development of cohesion in L2 and bilingual acquisition

Studies conducted on the use of cohesive devices by L2 or bilingual subjects parallel the results obtained in studies on monolinguals with respect to the overall order of frequency and the general development of the use of cohesive devices; that is, first of all, the order of frequency of the categories distinguished by Halliday and Hasan (1976) was found to be the same, be it in studies on child L2 learners (Bae 2001) or adult L2 speakers (Norment 1995, but cf. Kang 2005). Bae (2001), for example, investigated the use of all five categories of cohesion in the written L2 stories of Korean first and second graders enrolled in immersion and English-only programs.
Bae found, firstly, that the children were able to use all five types of cohesive devices. Just as in studies on L1 speakers, however, participants most frequently used lexical ties, which accounted for 56% of their cohesive devices, followed by references (32%) and (coordinating) conjunctions (12%). The use of ellipses and substitutions was again found to be marginal (both below 1%). With respect to the development of the use of cohesive devices, it was found that the number and variety of connectives increases with age also in an L2 (Bae 2001, Severing & Verhoeven 2001), and respectively in both languages in the case of very early bilinguals (Montanari 2004, Aarsen et al. 2001), just as L2 and bilingual speakers’ reference strategies become more sophisticated (Montanari 2004, Severing & Verhoeven 2001). In older L2 learners similar developments take place (Myles 2003, Norment 1995), which are again rather attributable to factors such as exposure time and/or input (and the corresponding proficiency level), while in young learners’ L2 productions these factors seem to overlap with maturational effects (cf. also ch. 4.2). Even though L2 learners’ general development in the use of cohesive devices parallels that of L1 learners, older L2 learners have typically been found to perform differently from monolingual speakers of the respective target language both in oral and written narratives (e.g. Kang 2005 & 2004, Myles 2003, Reid 1992). L2 speakers/writers have been found, for example, to overuse coordinate conjunctions (Reid 1992) and nominal referential devices (Kang 2004). However, learners’ use of L2 cohesive devices has also been found to differ from their L1 productions (e.g. Kang 2005, Norment 1995), and thus, the differences between L2 learners’ use of cohesive devices and that of monolingual controls seem to be attributable to an intermediary stage in their interlanguage.
The results obtained by Laurén (1998: 528) point in the same direction: Laurén analyzed the use of anaphoric references by Finnish children attending the fifth year of a Swedish immersion program, which had been preceded by another year in an immersion kindergarten. She found that almost all immersion students used the same linguistic means in Swedish as the L1 comparison group, even though these linguistic means are different from those available in Finnish.

To summarize, the general order of frequency and the development of cohesion in young L2 learners’ and bilinguals’ narratives parallel those found in L1 acquisition: The number and variety of ties increases with age and their use becomes more sophisticated. In the case of older learners the same developmental process has been found, with time of exposure rather than age as the determining factor. At the same time L2 learners perform differently from L1 speakers as long as they are still reorganizing their interlanguage system.

4.5 The relationship between coherence and cohesion

As the developmental overview of coherence and cohesion has shown, a number of studies have investigated the development of these two discourse characteristics in children’s narrative productions. Far fewer L1 and particularly L2 studies, however, have explored the relationship between the coherence and cohesion of children’s texts, and these studies have had conflicting results (cf. also ch. 3.2). Additionally, a comparison of findings is made difficult by the varying operationalizations of both coherence (e.g. as a discourse-based measure such as story grammar vs. external coherence ratings) and cohesion (e.g. as the number of ties vs. reference strategies). Studies using two discourse-based measures typically found a relationship between coherence and at least some cohesion measures, usually references and connectives (e.g. Augst et al. 2007, Cain 2003, Shapiro & Hudson 1997 & 1991, Boueke et al.
1995, Spiegel & Fitzgerald 1990, Stenning & Mitchell 1985). This latter finding may be biased, however, by the general preference for investigating connectives and references, which was discussed earlier in this chapter. Shapiro and Hudson (1997, 1991), for example, investigated the development of coherence and cohesion in the oral narratives of preschoolers (mean age 4;8) and first graders (mean age 6;8). They operationalized coherence with the help of an overall complexity measure for narrative structure (based on the number and types of episodic components) and cohesion with the help of a subordination index, the subtypes of connectives, and participants’ reference strategies. Shapiro and Hudson found that, regardless of age, children producing structurally more complex narratives also used more complex language (i.e. subordination) and more sophisticated cohesive devices. 3 Similarly, Cain (2003), who studied coherence (operationalized as completeness of the event structure) and connectives in oral stories produced by six- to eight-year-olds, found a positive correlation between story coherence and the use of sophisticated connectives. However, not all findings confirm this pattern unambiguously. Stenning and Mitchell (1985), for example, studied the oral story productions of children aged five to ten. They found that children using sophisticated connectives had a tendency to score better for story content and answers to explanatory questions. At the same time, however, Stenning and Mitchell also found strong interindividual variation, i.e. many of their child participants who told sophisticated stories used unsophisticated connectives. Additionally, reference strategies were not found to be indicative of the two coherence measures. Several studies, especially on written narrative productions, operationalized coherence with the help of rating scales and compared raters’ coherence scores to participants’ use of cohesive ties. Fitzgerald and Spiegel (1986), for example, investigated the relationship between coherence and cohesion in third and sixth graders’ written stories. Across grades, they found a trend towards a negative
Fitzgerald and Spiegel (1986), for example, investigated the relationship between coherence and cohesion in third and sixth graders’ written stories. Across grades, they found a trend towards a negative correlation between coherence ranking and the number of cohesive ties, i.e. stories with fewer ties were given higher coherence rankings.

To the best of my knowledge, only one study on L2 learners has explored the relationship between coherence and cohesion, namely Bae (2001), and this study also used a rating scale approach. Bae investigated narrative texts written by first and second grade L2 English learners (cf. last section) and found that the number of at least some cohesive ties was related to coherence scores. That is, the number of both lexical and referential ties was a significant predictor for coherence ratings. Connectives, ellipses and substitution, on the other hand, had no predictive value. The difference between Bae’s and Fitzgerald and Spiegel’s results could of course be attributable to differences between L1 and L2 speakers. However, a much larger number of comparable studies is needed to make any claims regarding differences and/or similarities between L1 and L2 studies, especially when keeping in mind that, as described earlier, even largely comparable L1 studies have yielded somewhat contradictory results.

To sum up, this overview has shown that even studies with comparable approaches do not yield consistent results with respect to the relationship between coherence and cohesion in children’s narrative productions. However, L1 studies show at least a trend towards a correlation between coherence and the use of sophisticated connectives. The only prior research on L2 data, which used a different approach from that employed in the present study, found a relationship between lexical and referential ties and coherence ratings.

3 Augst et al. (2007: 62ff.) came up with a similar result for subordination in L1 stories written by German speakers.
4.6 Summary and hypotheses

As this review of earlier research has shown, children’s L1 narrative productions become more coherent with age; that is, they are increasingly built around a global problem-resolution structure (top-down organization), which is indicative of an underlying narrative schema, instead of being a simple chaining of events (bottom-up organization). Independent of the L1 investigated, this development seems to be most prominent up to the age of nine, but it continues beyond that age.

With respect to the use and development of cohesive ties in the L1, this overview has shown that even three-year-olds are able to produce almost all categories of cohesive devices distinguished by Halliday and Hasan (1976). With age the number of cohesive devices increases and their use becomes more sophisticated in a way that reflects the development from local to global organization found for narrative coherence. However, at around age eight to nine a ceiling in the number of cohesive devices may be reached, after which the development is mainly limited to qualitative changes within the individual categories. The overview also presented evidence for a general order of frequency, i.e. lexical ties make the strongest contribution to cohesion, followed by reference and connectives, while ellipsis and substitution contribute only marginally.

This review of earlier studies has also shown that the coherence of L2 productions and its development do not differ from those evident in L1 narratives once the speaker has acquired the necessary linguistic means of expression in the L2. Instead, narrative schemata as production plans seem to be transferable from one language to another, at least within the same cultural context. With respect to cohesion in L2 learners’ narratives previous studies showed, first of all, that the same order of frequency holds true. Secondly, the general development of cohesion parallels that of L1 learners, i.e. a maturational and/or exposure-related increase in the number and sophistication of cohesive devices is to be expected. In addition to this, L2 learners’ use of cohesive devices typically differs not only from L1 comparison groups (at least before full native-like control has been obtained) but also from learners’ use in their L1.

Regarding the relationship between coherence and cohesion in narrative productions it was shown that no clear picture emerges for either L1 or L2 discourse. Research on L1 data found a tendency, however, towards a correlation between at least some cohesion measures and coherence. The only study on L2 data found a relationship between coherence ratings and the number of both lexical and referential ties.

Based on the developmental overview given in the previous sections the following basic conclusions can be drawn for the design of the present study:

1. The participants’ age range (mean age 6;8 to 9;8) is suitable to study a development in narrative coherence and cohesion.
2. This development in narrative coherence and cohesion can also be traced in L2 data (keeping in mind a potential linguistic threshold).

With respect to the research questions presented in the previous chapter several hypotheses can now be formulated. First of all, regarding the impact of the three learner variables under investigation, it is assumed that:

1. Grade/age has a significant influence on the narrative coherence and cohesion of participants’ stories.
2. Sex does not have a significant influence on the narrative coherence or cohesion of participants’ stories.

As indicated at the beginning of the chapter, no previous studies are available on the influence of L2 preschool experience on coherence and cohesion, so no hypothesis will be put forward. Secondly, regarding a grade/age-related development, several additional hypotheses can be formulated:

3. Participants’ stories become more coherent from first to fourth grade as measured by the number of narrative components.
4. There are qualitative differences in narrative coherence between grades as measured by differences in frequency among the individual narrative components.
5. Participants’ stories become more cohesive from first to fourth grade as measured by the number of cohesive ties.
6. There are no qualitative differences in cohesion between grades as measured by the frequency order of the subcategories of cohesion (lexical ties > references > connectives > ellipses and substitutions).

No hypotheses will be put forward with respect to an increase or decrease of the individual categories of cohesion from first to fourth grade, i.e. whether qualitative differences in cohesion are to be expected as measured by each category’s number of ties in first and fourth grade. Also, due to the controversial findings of previous studies, no hypothesis will be formulated with respect to a relationship between coherence and cohesion.

5 Research Design

5.1 Participants and data collection

5.1.1 The Kiel Immersion Project

In Germany, the provision of foreign language teaching differs between the sixteen different states (Bundesländer), since responsibility for the education system is determined by Germany’s federal structure.¹ In Schleswig-Holstein, Germany’s northernmost Bundesland, a project brought into being by Henning Wode from Kiel University seeks to enable participants to accomplish as part of their formal schooling something EU policy-makers have called for on various occasions, namely to reach “meaningful communicative competence in at least two other languages in addition to his or her mother tongue” (Commission of the European Communities 2003: 4).
Since this requires a coordination of language teaching in primary and secondary education, the Kiel Immersion Project combines a bilingual preschool, an IM elementary school and bilingual wings at secondary schools. The present study deals with data from the elementary school of the project only and thus only this part of the project will be described in more detail in the following; however, a short overview of the preschool program will also be given, since some of the participants in the present study attended the bilingual preschool beforehand.²

1 Children generally begin primary schooling (elementary school) at age six to seven. Secondary school starts at age ten to eleven. Children usually attend preschool (age 2-3 to 6-7) before entering primary school; this is, however, not legally binding. For more details on the German education system as well as the variation between the different Bundesländer cf. Kultusministerkonferenz (2011).
2 For the secondary school part cf. e.g. Burmeister & Daniel 2002, Wode 1998, 1995 & 1994. For a short overview of the Kiel project cf. also Möller 2009.

The bilingual preschool of the project (ages 3-6) was set up near Kiel in 1996 (cf. Wode 2009 & 2000, Burmeister & Pasternak 2004, Kersten et al. 2002, Rohde & Tiefenthal 2002). From the start, at least one group per year was attended to by a team of one German-speaking and one English-speaking caregiver. By 2005, the preschool had three caregivers with English as their L1 and three of the five preschool groups were educated bilingually.³ Bilingual education in the preschool follows the one person-one language principle, i.e. German- and English-speaking caregivers consistently use only their respective L1 when speaking to the children.

3 Personal communication Mrs. Devich-Henningsen (head of the preschool).

A neighboring elementary school started a partial English IM program in August 1999 (cf. Kersten 2009, Wode 2009, Burmeister & Pasternak 2004, Wode et al. 2003, Kersten et al. 2002). Each school year, one IM class is offered, with the exception of the 2005/2006 school year, in which an even higher demand than usual led to two first grade IM classes being established.⁴ The IM elementary school continues L2 English education for children from the bilingual preschool, but it is also open to children from monolingual preschools. All subjects except German language arts are taught in English, i.e. about 70% of the curriculum, while the academic curriculum corresponds to that of the monolingual peer classes, as is the rule in immersion programs (Swain & Lapkin 2006, Johnson & Swain 1997). In first grade, the children are allowed to speak German, although the use of English is highly encouraged; from second grade on the children are expected to use English only in class. As is typical for immersion classes, the formal correction of errors and explanations of grammar are kept to a minimum during the teaching of the academic subjects. Similarly, subject matter tests are taken in the L2, even though only the academic content is graded.

Due to the German teacher-training system, all IM teachers in Altenholz have a teaching certificate for both elementary and secondary schools, although they originally have a specialization for English at secondary school.⁵ ⁶ Native-speaker input comes from assistant teachers and from a changing number of English L1 students whose parents work in Kiel for some time.⁷ Most of the teaching material must be designed by the teachers, e.g. by translating German text books or creating new teaching materials with the help of pictures and corresponding English texts, since L1 English text books are linguistically too difficult at the beginning and do not correspond to the German academic curriculum.

4 The success of the IM project in Altenholz also inspired an increasing number of elementary schools in Schleswig-Holstein and the neighbouring Bundesland Hamburg to set up immersion wings following the Altenholz model (cf. http://www.fmks-online.de, for example).
5 I.e. the teachers are certified as Grund- und Hauptschullehrer.
6 Until recently, the possibility of majoring in English for primary school was an exception in the regular teacher-training programs at universities and applied universities in Germany. Instead, trained primary school teachers were able to qualify for teaching English by obtaining a supplementary certificate (C1-Schein).
7 One assistant teacher per year is shared by all grades. (S)he teaches 14 hours per week.

All parts of the Kiel project were evaluated by Henning Wode and his group at Kiel University, the preschool project being evaluated from 1997 to 2001 (e.g. Tiefenthal 2009, Wode 2009, 2004 & 2000, Rohde 2005, Rohde & Tiefenthal 2002). Research results show that, even at the end of preschool, the children hardly use the L2 because there is no real necessity: All caregivers, even the ones with L1 English, understand German; nevertheless, the L2 input in preschool is important, since it is responsible for the development of strong receptive abilities (cf. also the results of the Early Language and Intercultural Acquisition Studies project, e.g. Kersten 2010).

Research results at the elementary school (e.g. Wode 2009, Burmeister & Pasternak 2004, Wode et al. 2003), obtained from a picture-elicited storytelling task since 1999 (cf. ch. 5.1.3), show an impressive rise in the children’s productive abilities during the first year of elementary school. The receptive abilities acquired in preschool seem to form the basis for this development in grade one, where classroom activities begin to heavily encourage the use of the L2. At the end of fourth grade the children are able to communicate freely and efficiently in English. L1 tests at the elementary school show that, even though the L2 develops very strongly and most of the school day is conducted in English, the L1 suffers in no way (e.g. Gebauer et al. 2012, Wode 2009, Zaunbauer et al. 2005, von Berg 2005, Bachem 2004, Wode et al. 2003).

While the elementary school students’ linguistic proficiency and development in areas such as syntax, verbal morphology or phonology have been covered by a number of studies (e.g. Kersten 2009, Wode 2009, Kersten et al. 2002), children’s discourse abilities have not yet been investigated on a larger scale. That is, only two studies have been conducted on discourse production measures and both of these were on cohesion in first and second grade (Möller 2003, Maschewski 2002). No attempt has so far been made to study coherence and/or the relationship between coherence and cohesion. The present study seeks to fill this gap.

5.1.2 Participants

The data used in the present study is a subsection of a larger amount of data collected in the Kiel Immersion Project since 1999 (cf. the previous section). The learner variables collected as part of the project were grade, age in years, sex, and L2 preschool experience. Other measures, such as parental demographics or socio-economic data, were not collected.⁸ A distinction is made between three experience groups with respect to L2 preschool experience: children with prior L2 English experience from attending the bilingual preschool described in the previous section (bili group), children who attended a regular monolingual German preschool (mono group), and children who attended a monolingual German group in the preschool that also offers the bilingual groups (MB group).⁹ Since the latter preschool has a partly “open concept,” i.e. children are, at least during some part of the day, relatively free to move around between groups, the MB children have all had some exposure to the L2; however, to what extent their previous L2 experience equals that of the bili group cannot be confirmed.
At the same time the MB group is very small (N=3 in first grade and N=4 in fourth grade) and its usefulness for separate, especially statistical, analyses is thus strongly limited. Consequently, the MB group was excluded from the present study.

8 However, see Zaunbauer and Möller (2007) for data from other cohorts of the same program. According to their teachers, the socioeconomic status of all cohorts used in the present study is roughly comparable.
9 It should be noted that neither the exact amount of experience nor information on the exact time of exposure in the bilingual preschool group were tested or collected. It is one of the limitations of the present study that children with bilingual preschool experience are treated as one homogeneous group simply based on the fact that they had some exposure to English in preschool, while there might well be interindividual differences attributable to distinct amounts of (time of) exposure.

The present study makes use of data collected at the end of the first and at the end of the fourth school year; it combines longitudinal data from the first cohort of participants¹⁰ in the IM program with cross-sectional data from the fifth and eighth cohort; an overview of the different cohorts is given in Tab. 5.1. All participants (N=59) have German as their L1 or stronger language and, according to the teachers, all of them were normally-developing. The first cohort and the fifth cohort were taught by the same teacher, cohort eight by a different teacher.

Tab. 5.1 Descriptive statistics of the study’s participants

Cohort (Grade)   Total N   Male   Female   M†   B    Mean age (SD)   Start (school year)   Tested
1 (1)            13        3      10       5    8    6;9 (0.28)      1999                  5/2000
1 (4)            13        3      10       5    8    9;9 (0.28)      1999                  5/2003
8 (1)            18        7      11       7    11   6;8 (0.43)      2006                  6/2007
5 (4)            15        4      11       7    8    9;7 (0.46)      2003                  5/2007
Total/Mean       59        17     42       24   35   8;3 (1.55)      -                     -

† ‘B’ denotes children with bilingual preschool experience (‘bili’ group), ‘M’ children without L2 preschool experience (‘mono’ group). ‘SD’ denotes the standard deviation.

In terms of participants’ exposure time to the L2, cohort one had an exposure time of approximately nine months, and cohort eight an exposure time of approximately ten months by the end of first grade; that is, the mean time of exposure for the first grade children is 9.5 months.¹¹ The mean time of exposure in fourth grade was 41.5 (cohort 1) and 40.5 months (cohort 5), which makes for a mean exposure time in fourth grade of 41 months.¹² As Tab. 5.1 shows, the cohorts of the longitudinal and the cross-sectional data set are comparable in terms of all measures. Additionally, there were no statistically significant differences in age (χ²(1) = 0.37, ns), sex (χ²(1) = 0.45, ns) or preschool experience (χ²(1) = 1.0, ns) between the two first grade cohorts or between the two fourth grade cohorts (age χ²(1) = 0.33, ns; sex χ²(1) = 1.0, ns; preschool experience χ²(1) = 0.72, ns). This observed and statistical likeness justified combining the different data sets for the present study. To ensure further comparability, all statistical analyses were also carried out separately for the longitudinal and the cross-sectional data set using paired-sample and independent-sample tests, respectively. The results corresponded to those obtained when combining the cohorts.

10 Cohort one children who had been to the bilingual preschool were already subjected to tests in the preschool part of the Kiel IM project.
11 Cohort one entered school at the end of August 1999 and was tested in mid-May 2000; cohort eight entered school in mid-August 2006 and was tested in mid-June 2007.
12 Cohort five entered school in mid-August 2003. For both cohorts three summer vacations, a total of six weeks per year in the German school system, were deducted. Other vacations and holidays are more variable and were thus disregarded.

In the following, the descriptive statistics will be given for each cohort that participated in the study. Tab. 5.2 shows the participant information collected for children from the first cohort. Besides the children pertaining to the small MB group, several other participants originally tested had to be excluded from the analysis. Two children of cohort one were not included: child 9, who left the class after second grade, and child 12, due to inappropriate interviewer instructions in first grade, which could have led to a biased understanding of the task as a picture description.

Tab. 5.2 Participants of cohort one

Participant No.   Sex      English experience†   Age in grade 1 (May 2000)   Age in grade 4 (May 2003)
1                 Male     B                     7                           10
2                 Male     B                     7                           10
3                 Female   M                     7                           10
4                 Female   M                     7                           10
6                 Female   B                     7                           10
7                 Female   M                     7                           10
8                 Female   M                     6                           9
10                Female   M                     7                           10
13                Male     B                     7                           10
14                Female   B                     7                           10
15                Female   B                     7                           10
16                Female   B                     7                           10
17                Female   B                     7                           10

† ‘B’ denotes children with bilingual preschool experience (‘bili’ group), ‘M’ children without L2 preschool experience (‘mono’ group). ‘SD’ denotes the standard deviation. Child 10 had a bilingual Polish-German background (cf. Kersten 2009). Participants of this cohort are coded as C1-G1-number (first grade) and C1-G4-number (fourth grade).

Tab. 5.3 gives the participant information collected for children from the fifth cohort. Five children of the fifth cohort had to be excluded from the analysis: child 13 because he had lived in the United States for four years, children 23 and 24 because they had joined the class only in third grade, and child 5 because he did not do the second task (B-task, cf. ch. 5.1.3 for a description of the two tasks conducted). Child 22 had to be excluded due to biased interviewer instructions (similar to child 12 in cohort one).¹³

13 Additionally, three children who had been tested in first grade had left the class: Two moved away and one skipped second grade.

Tab. 5.3 Participants of cohort five (fourth grade)

Participant No.   Sex      English experience†   Age in grade 4 (May 2007)
3                 Male     B                     10
6                 Female   B                     10
7                 Female   B                     9
8                 Female   B                     10
9                 Female   M                     10
10                Male     M                     10
11                Female   M                     9
12                Female   M                     10
14                Female   B                     9
15                Female   B                     9
17                Female   M                     10
18                Female   B                     10
19                Male     M                     10
20                Female   B                     10
21                Male     M                     10

† ‘B’ denotes children with bilingual preschool experience (‘bili’ group), ‘M’ children without L2 preschool experience (‘mono’ group). ‘SD’ denotes the standard deviation. Participants of this cohort are coded as C5-G4-number.

Tab. 5.4 gives the participant information collected for children from the eighth cohort. Five children of cohort eight also had to be excluded: child 12 refused to do the B-task after a very strenuous A-task; child 17 had to be excluded due to potentially suggestive interviewer contributions; children 19 and 23 had both lived in the US for a prolonged period of time (4½ years and 1 year, respectively) and were thus not included in the analysis.¹⁴ One child (22) refused to participate in the interviews altogether.

In addition to the background variables grade, age, sex, and experience group, a short interview was conducted with the cohorts tested in 2007, namely cohorts five and eight, before the actual storytelling task. In this interview the children were asked about their experience with the story genre and storytelling.¹⁵ All children reported at least some experience with stories or story-like genres, ranging from TV shows with a story format to parents’ bedtime stories or the children’s own reading. The amount and type of experience varied, however, especially as a function of age.

14 Additionally, child 23 had joined the class just two days prior to the interviews.
15 These interviews have not been evaluated systematically so far, i.e. no systematic quantitative analysis has been carried out, so the results given here need to be considered as somewhat impressionistic.

Tab. 5.4 Participants of cohort eight (first grade)

Participant No.   Sex      English experience†   Age in grade 1 (June 2007)
1                 Male     B                     7
2                 Female   M                     7
3                 Male     B                     7
4                 Female   B                     6
5                 Male     B                     7
6                 Female   M                     7
7                 Male     M                     7
8                 Female   M                     6
9                 Female   B                     7
10                Female   M                     7
11                Male     B                     6
13                Female   B                     7
14                Female   B                     7
15                Male     B                     7
16                Female   M                     6
18                Female   M                     7
20                Male     B                     7
21                Female   B                     7

† ‘B’ denotes children with bilingual preschool experience (‘bili’ group), ‘M’ children without L2 preschool experience (‘mono’ group). ‘SD’ denotes the standard deviation. For child 3 some input of (a) further language(s) including English cannot be excluded, since one of his parents is from an African country. Participants of this cohort are coded as C8-G1-number.

In Schleswig-Holstein, different text genres and especially stories are also part of the curriculum for L1 German classes throughout primary school (Ministerium für Bildung 1997: 51ff.). This ranges from comprehension activities, such as reading a book in class and talking about its content, to students’ own text production, e.g. in the form of oral (and later also written) personal or fictional narratives, as well as activities centering on constructing texts from sentences. According to the four cohorts’ teachers, such systematic activities centering on texts and especially text production are to be expected mainly from second grade onwards. Nevertheless, it is to be expected that, apart from their varying home experience with stories, all participants of the present study have had a significant school exposure to narrative structures and cohesive devices; this relates especially to the fourth graders.
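The participant counts and exposure figures reported with Tab. 5.1 can be retraced with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis: the row values and exposure months are copied from the text above, while the names `TAB_5_1`, `EXPOSURE_MONTHS`, `column_totals`, `rows_are_consistent` and `mean_exposure` are hypothetical and introduced only for illustration. The grade means are assumed to be unweighted means of the two cohort values, which is what the reported figures of 9.5 and 41 months suggest.

```python
from statistics import mean

# Rows of Tab. 5.1 as reported in the text:
# (cohort, grade, total N, male, female, mono, bili)
TAB_5_1 = [
    (1, 1, 13, 3, 10, 5, 8),
    (1, 4, 13, 3, 10, 5, 8),
    (8, 1, 18, 7, 11, 7, 11),
    (5, 4, 15, 4, 11, 7, 8),
]

# Approximate L2 exposure in months per cohort, keyed by grade,
# as stated in the text (assumption: simple unweighted averaging).
EXPOSURE_MONTHS = {1: [9.0, 10.0], 4: [41.5, 40.5]}


def column_totals(rows):
    """Sum the five count columns (total, male, female, mono, bili)."""
    return tuple(sum(row[i] for row in rows) for i in range(2, 7))


def rows_are_consistent(rows):
    """Check that sex and preschool groups each add up to the row total."""
    return all(
        male + female == total and mono + bili == total
        for _, _, total, male, female, mono, bili in rows
    )


def mean_exposure(grade):
    """Unweighted mean of the two cohort exposure values for a grade."""
    return mean(EXPOSURE_MONTHS[grade])


if __name__ == "__main__":
    print(column_totals(TAB_5_1))
    print(rows_are_consistent(TAB_5_1))
    print(mean_exposure(1), mean_exposure(4))
```

Note that the total of 59 in Tab. 5.1 is the sum over all four rows, so the longitudinal cohort one enters the count once for each of its two test times.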
5.1.3 Data collection: Materials and procedure

From 2000 to 2007 picture-elicited oral storytelling tasks were administered once a year at the elementary school of the Kiel IM Project (e.g. Wode 2009). Participants’ L1 and L2 were tested on separate days with the L2 test being conducted before the L1 test. For all tests children were collected in their respective classrooms during regular teaching hours and tested individually in a separate room at the school. Since the present study deals exclusively with participants’ L2 data, only this part of the testing will be described in the following.

The picture book Frog, where are you? (Mayer 1969; henceforth frog story), which has been used successfully in quite a number of studies (e.g. Akinçi et al. 2001, Severing & Verhoeven 2001, Berman & Slobin 1994, Bamberg & Marchman 1991; cf. also Strömqvist & Verhoeven (2004: 486ff.) for an overview), was used as elicitation material for the L2 data. The book, in black-and-white pictures, tells the story of a little boy and his dog, who embark on a search for a lost frog (cf. ch. 5.2.1 for a detailed description of the story’s structure).¹⁶ In terms of task difficulty, a picture book elicitation task can be classified as presenting a great challenge to the age range under investigation as compared to, for example, personal experience stories, but a lesser challenge than expository text or a film-elicited narrative (Berman 2008, Berman 2004: 267; cf. also the developmental overview in ch. 4). At the same time picture book elicitation has a decisive advantage for the researcher: It leads to a high degree of comparability in the results as compared to, for example, a mere picture stimulus or narratives of personal experience.

The L2 data collection involved two different parts, task A and task B, which were conducted one immediately following the other on the same day.
Both the A-task and the B-task have in common that participants were first of all encouraged and given sufficient time to look through the elicitation material in order to allow the formation of a mental representation of the events depicted. Additionally, children were allowed to look at the pictures while telling the story in order to avoid problems attributable to working memory constraints rather than cognitive or linguistic processing. Differences in the set-up and function of the two tests will be described in the following.

16 For additional information on the frog story see especially Berman and Slobin 1994.

In task A the interviewer was known to the participants as a speaker of German and English. Participants were instructed to look through the storybook carefully and then tell the story to the interviewer in English. At the same time the children were told explicitly that the interviewer could help with vocabulary questions if requested. Interviewer instructions were to help only when asked and to otherwise give exclusively non-specific prompts encouraging the storytelling. In task A interviewer and participant looked at the pictures together during storytelling; the children turned the pages at their own pace. Task A thus serves two main purposes: First of all, it familiarizes participants with the test situation and with the elicitation material. Secondly, the children are supported linguistically while they make a first attempt at encoding their mental representation of the story content. Since the interviewer can help to eliminate temporary linguistic difficulties, participants’ linguistic processing load is lessened.

In task B participants were asked to tell the same story to a second interviewer, whom they believed to speak and understand English only. The children were given time to look through the picture book once more before starting with task B.
During the test they were again allowed to look at the pictures while telling the story and to turn the pages at their own pace. However, participants were instructed not to let the second interviewer see the pictures. The interviewer in turn was instructed not to let the child show her or him the pictures and to give only non-specific prompts. In the B-task participants can focus on optimizing their linguistic expression of the content, since the cognitive load of conceptualizing the story has been greatly reduced through the A-task; that is, children have already had sufficient opportunity to form a coherent mental representation of the story. Additionally, they have also had an opportunity to practice the linguistic encoding of this representation. However, in the B-task the children also need to cognitively and linguistically process an information gap between themselves and the interviewer, i.e. they have to modify their linguistic encoding according to the needs of an uninformed listener.

As the description of the test design shows, the set-up serves to optimize conditions for participants’ story production. That is, task A serves as a trial run allowing participants to optimize their performance in task B. For this reason, the order of the A-task and the B-task was not counterbalanced and only the stories obtained from task B were analyzed in the present study.¹⁷ All interviews were recorded and then transcribed verbatim using the transcription conventions given below.

1, 2 etc.   Participant numbers
#           Short pause (500 msec-5.0 sec)
##          Long pause (5.5-10 sec)
###         Very long pause (> 10 sec)
<...>       Transcriber comments
x           Incomprehensible (roughly approximating the number of syllables)
/           Speaker interrupts or corrects his or her utterance
italics     German words
,           Slightly rising or falling intonation (as usually before a subordinate clause)
.           Strongly falling intonation (as usually in signalling the end of an utterance)
?           Strongly rising intonation (as in signalling a question)
!           Intonation corresponding to an exclamation (e.g. frog!)
:           Sound prolonged (e.g. and the:n, psh:t!)
ehm         Any type of hesitator (German äh, hmm etc.)
hehe        Laughter
|           Overlap (more than one person speaking at the same time)
B           Task B
I           Interviewer known to the children as speaking both German and English (interviewer in task A; may still be in the room and interact in task B)
IE          Interviewer known to the children as a native speaker of English, who does not understand or speak German (interviewer in task B)

17 Similarly, L1 and L2 tests were not counterbalanced because this would have given children telling the story first in the L1 and then in the L2 two trial runs.

5.2 Method of analysis

In the following subsections, the operationalization of the categories of analysis provided by story grammar (coherence) and Halliday and Hasan’s approach to cohesion (1976) will be described in detail. As stated earlier, this description of the method of analysis corresponds to a large degree to a qualitative results chapter: The operationalizations described are the result of an initial in-depth analysis of the transcripts based on the categories provided by both frameworks as well as on the operationalization of similar categories in related, earlier studies (e.g. Bae 2001, Berman & Slobin 1994). As will become clear, the linguistic barrier for accepting options chosen by the participants was set as low as possible for both coherence and cohesion in order to do justice to their status as L2 learners.

5.2.1 Narrative coherence: The structure of “Frog, where are you?”

As Fig. 5.1 shows, the task material “Frog, where are you?” (Mayer 1969), i.e. the frog story, has an excellent fit with the prototypical episodic structure described by story grammars (cf. ch. 3.3.2). Before describing the individual components and their operationalization in detail, however, some clarification is in order as to the relationship between the (realization of) the narrative components and the task material. It could be assumed that a mere picture description would satisfy the requirements for the individual components, i.e. that there is no necessity for macrostructural (pre-)planning in telling the story. In fact, this is not the case. The content of each narrative component follows from the pictures and their sequence but there is no one-to-one relationship between narrative component(s) and what is depicted.¹⁸ As Berman and Slobin state quite nicely, “the structure of Frog, where are you? requires speakers to recall the progression and the outcome of the plot as represented in pictures other than the one(s) they happen to describe at any given moment” (1994a: 41). That is, participants need to interpret the pictures in relation to an overall plot and infer relevant information from the individual pictures.¹⁹ At the same time missing components should not be attributed to linguistic difficulties, since (a) the linguistic threshold for realizing the individual components was set very low and (b) the children had a “trial run” in the A-task (cf. ch. 5.1.3), where vocabulary questions could be and were asked when necessary.²⁰

Fig. 5.1 The structure of the frog story in narrative components as identified by story grammars († Terminal nodes taken into account for the analysis are shaded in grey)

The SETTING follows from the first picture of the frog story: At night, a boy and a dog are sitting in a bedroom; they are looking at a frog in a glass jar.
As explained in chapter 3.3.2, the minimal requirement for a SETTING in any story is the mention of one animate character, i.e. a protagonist. In the frog story this corresponds to three protagonists: Boy, dog, and frog. 21 Thus, participants were credited with the SETTING component if they made explicit reference to all three protagonists at the beginning of the story. The SETTING can serve as an illustrative example that even where a mere picture description could be suspected, which is more easily true for the SETTING than for the other components, this is not in fact the case: To the adult onlooker it is clear that all three protagonists reappear later in the story and thus need to be mentioned right at the beginning. So why do not all of the children mention the three protagonists? Three children left out the boy, three of them left out the dog, all children mentioned the frog. Participants could have opted to tell the story as having only two protagonists, e.g. boy and frog or dog and frog, in which case the mention of only two protagonists in the SETTING would have been systematic and thus acceptable. However, all participants mentioned the three protagonists at some point in their storytelling, even if this was not the case in the SETTING. Again, linguistic difficulties should be ruled out as a possible explanation. Instead, the conclusion is that the children do not see the relevance of mentioning all three protagonists at this stage of storytelling development, even though in this particular instance the necessary information is more readily available from the picture than for some of the other components.

18 With the possible exception of the SETTING, but cf. the discussion below.
19 This is in line with children’s use of inferences when looking at a picture book. Boueke et al. (1995: 156ff.) were able to demonstrate in a picture-elicited storytelling task that the number of inferences as well as their sophistication increases from the last kindergarten year to the last year of elementary school (ca. age 5;7 to 9;11). Moreover, they found that the degree of sophisticated inferences is related to the degree to which a story is globally organized.
20 Even though the A-task results were not included in the present study, their evaluation within the Kiel Immersion Project showed that participants made active use of the opportunity to ask for missing vocabulary and subsequently used newly acquired vocabulary in task B.
21 The frog story contains several characters with their individual perspectives and thus several story strands (cf. Trabasso & Rodkin 1994: 91). The two main story strands, which overlap often but not always, relate to the boy and the dog (cf. Bamberg & Marchman 1994: 558f., Bamberg 1987: 23). The narrative components investigated in this study are based on the boy’s experiences, goals, adventures etc., since he is the main protagonist (cf. Trabasso & Rodkin 1994: 91) and, additionally, since no participant consistently chose a dog perspective.

The INITIATING EVENT follows from the second and/or third picture, namely the frog’s escape (e.g. Reilly et al. 1998). In the task material, the second picture shows the frog on the rim of the glass jar with one leg outside and the other one inside the jar; the boy and the dog are sleeping and the window is partly open. Participants need to infer that the frog escapes, first from the jar and then through the window, by relating pictures two and three: In the third picture daylight is coming in through the window and the boy and the dog are lying on the bed looking at an empty jar; the boy’s face indicates an emotional reaction to the empty jar. In order to receive credit for the INITIATING EVENT, participants needed to make reference to the frog escaping, to the result of this action, i.e. the jar being empty or the frog being away (picture 3), or to the boy’s discovery of the escape (cf.
Berman & Slobin 1994: 52f., Kupersmitt & Berman 2001). Participants were credited with reference to the boy (and/or dog) discovering the escape if they used a verb of perception or cognition, the clearest case being a verb of perception, e.g. see in example (5.1). The verb look, in contrast, can refer to both a physical and a mental activity, even if its core meaning relates to the domain of physical activity (Biber et al. 1999: 361). Therefore, a form of look was only accepted as encoding the discovery if a description of the result of the escape followed, e.g. in (5.2). In cases where look was preceded by a description of the result, as in example (5.3), however, it more likely indicates the beginning of the search and was thus not credited as the discovery but as an ATTEMPT (see below). Another option considered to be a realization of the discovery was the use of direct speech such as in example (5.4), which indicates the result of a thought process.

(5.1) 5[at the next morning, the little boy saw] 6[the glass was empty.] (C1-G4-1) 22
(5.2) 7[and the boy, waking, up,] 8[and look,] hä::? # 9[where is the frog?] # 10[he is # he is not in the glass!] (C1-G1-06)
(5.3) 3[the frog run away.] # 4[tomorrow, the frog are not here.] # 5[the boy # look to the frog.] (C8-G1-14)
(5.4) 3[the <de> frog # hopp out the </ de/ > ### glass.] # 4[the boy’s shout] 5[where is my frog?] (C8-G1-21)

22 As explained above, references to individual participants are coded as follows: Cohort-grade-participant number. C1-G1-1, for example, refers to child one from cohort one in grade one, C1-G4-1 to the same child in grade four (longitudinal data set).

The narrative component SIMPLE REACTION, i.e. an internal response, follows from the third picture of the story, in which the boy’s face clearly indicates an emotional reaction, even if the exact emotion, e.g. surprise or sadness, and its source are open to interpretation.
Consequently, the SIMPLE REACTION was considered realized if explicit reference was made to the boy’s (and/or dog’s) emotional or cognitive reaction to the frog’s escape, i.e. either the description of an emotional state, as in example (5.5), or the reference to a thought process, as in (5.6). A third possibility accepted was the use of interjections such as oh no in example (5.7), which have the sole purpose of expressing an emotion. Exclamations such as in (5.8), on the other hand, are the result of a cognitive process but do not include any direct linguistic expression of a thought process or emotion. Thus, they were considered part of the INITIATING EVENT (discovery).

(5.5) 5[in the morning, # the boy # are s/ is scared,] 6[becau:se, the frog is away!] (C1-G4-13)
(5.6) 6[in the morning, 7[when he woke up,] he looked to his glass,] 8[and thought,] 9[where is my frog?] 10[is he gone?] (C5-G4-19)
(5.7) 3[then the little children # stand up] 4[and # sayd … # oh no!] # 5[where’s I/ <?> the frog!] ## 6[fro-og, where are you?] (C8-G1-20)
(5.8) 3[the <de> frog # hop out the </ de/ > ### glass.] # 4[the boy’s shout] 5[where is my frog?] (C8-G1-21)

Two GOALs can be assumed to motivate the goal-path and thus a globally organized narration of the frog story: On the one hand the goal “recover or replace the frog” and on the other hand the goal “find the frog” (cf. Trabasso & Rodkin 1994: 94). 23 Either GOAL by itself is enough to act as a global organizing theme for the frog story (cf. Berman & Slobin 1994: 46ff., Bamberg & Marchman 1991: 281), even though it can be concluded from the picture sequence that merely finding the frog is not the actual end of the story. If both goals are realized in a narration, the higher-order goal “recover or replace the frog” (GOAL 1) motivates the secondary goal “find the frog” (GOAL 2), which in turn motivates attempts in particular locations (Trabasso & Rodkin 1994: 94).
More specifically, the secondary goal “find the frog” motivates local goal-plans, i.e. the goal “find the frog” is reinstated in particular locations, thus motivating local attempts (Trabasso & Rodkin 1994: 94). The lower-order goal “find the frog” (GOAL 2) was considered realized explicitly if participants used a verb or phrase expressing an intention, desire or strategy to find the frog as part of the ATTEMPT section (i.e. after the initiating event (cf. Fig. 5.1)), e.g. an adverbial to-phrase expressing purpose, either overtly marked by in order or not (cf. Biber et al. 1999: 827). Thus, examples (5.9) and (5.10) were considered a realization of GOAL 2, while (5.11) was coded as an ATTEMPT:

(5.9) 15[the dog ehm put his muzzle into the glass jar,] 16[and wanted to look] 17[if there was any frog in ere/ in there.] (C1-G4-16)
(5.10) 38[now, they go to the wood, and ehm look/ ehm to look for Bingo.] (C1-G4-14)
(5.11) 11[He looked in ea/ each sh/ shoe, and everywhere.] 12[the dog put his hehehe head in the glass,] # 13[but he can’t find him.] (C5-G4-19)

23 Trabasso and Rodkin (1994) postulate only the higher-order goal “get the frog back”. For the present analysis, this higher-order goal was seen as too restrictive, since different interpretations of the ending are possible: Recovery of the original frog or replacement by a new one (e.g. Berman & Slobin 1994). Thus, even though the differentiation between higher-order goal and subgoal is taken over from Trabasso and Rodkin, their higher-order goal is extended to “recover or replace the frog.”

The same criteria were applied for an explicit realization of the higher-order goal “recover or replace the frog” (GOAL 1), i.e. participants were given credit if they used a verb or phrase expressing an intention, desire or strategy to recover or replace the frog. Goals can also be realized implicitly, however, so that they must be inferred from a combination of other narrative components (cf. ch. 3.3.2).
GOAL 1 was therefore considered realized implicitly if participants produced the ENDING, i.e. encoded a recovery or replacement of the frog at the end of the story, in addition to the INITIATING EVENT and at least one ATTEMPT, as in example (5.12). GOAL 2 was considered realized implicitly if participants encoded the INITIATING EVENT and at least two ATTEMPTs, thereby indicating a sustained search (e.g. Akinçi et al. 2001, Trabasso & Rodkin 1994, Berman 1988).

(5.12) 3[at night 4[when the boy s/ slept] the frog crept out of his glass] 5[and 6[when the boy wakes up] the/ # the frog was away.] … 12[a:nd: the boy get out] 13[and # cried] 14[frog, where are you,] … 27[And then the boy took one of the/ of the children] 28[and # got away, at home.] (C1-G4-2)

The distinction between explicit and implicit realizations of GOAL statements is actually scalar: In addition to linguistic expressions directly encoding the main character’s desire, intention etc., various other possibilities were exploited by the children to indicate the underlying goal(s), for example the frequently used exclamation or question frog where are you (subgoal), exclamations such as please come back (higher-order goal) or outcome statements such as “but he [the boy] can’t find him [the frog]” (C5-G4-19). Since these expressions involve at least some inference on the part of the listener, though, they cannot be considered explicit linguistic realizations of the protagonist’s goal(s) and thus were also coded as implicit realizations, but only if they had not already been given credit as ATTEMPT or OUTCOME (see below).

The GOAL is followed by what can be identified as a plan application sequence consisting of several ATTEMPTs to find the frog in different locations and their OUTCOMEs, i.e. a recursive goal-path. Only the very last attempt is successful; all other attempts inside and outside the house are met with failure.
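The component-based criteria for an implicit GOAL realization given earlier in this section can be sketched as a simple decision rule. The following Python fragment is an illustration only and not part of the original coding procedure: the function name and its parameters are hypothetical, and it assumes that the INITIATING EVENT, the ATTEMPTs and the ENDING have already been coded by hand. It deliberately leaves aside the goal-indicating expressions (e.g. frog where are you), which were also counted as implicit realizations.

```python
def implicit_goals(has_initiating_event: bool,
                   n_attempts: int,
                   has_ending: bool) -> dict:
    """Sketch of the component-based criteria for implicit GOAL realization.

    All three arguments are assumed to come from manual coding of a transcript.
    """
    # GOAL 1 ("recover or replace the frog"):
    # INITIATING EVENT + at least one ATTEMPT + ENDING
    goal_1 = has_initiating_event and n_attempts >= 1 and has_ending
    # GOAL 2 ("find the frog"):
    # INITIATING EVENT + at least two ATTEMPTs (sustained search)
    goal_2 = has_initiating_event and n_attempts >= 2
    return {"GOAL 1": goal_1, "GOAL 2": goal_2}


# Hypothetical codings, for illustration only
one_attempt_with_ending = implicit_goals(True, 1, True)
three_attempts_no_ending = implicit_goals(True, 3, False)
no_initiating_event = implicit_goals(False, 4, True)
```

Under these assumptions, a narration with a single ATTEMPT and an ENDING would receive credit for GOAL 1 only, whereas a narration with several ATTEMPTs but no ENDING would receive credit for GOAL 2 only.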
Following Bamberg and Marchman (1990, 1994), seven attempts were identified in the picture book (cf. Fig. 5.1).

The initial attempt (ATTEMPT 1), which instantiates the search (cf. Bamberg & Marchman 1994: 563), can be inferred from the fourth picture (“bedroom attempt”): The boy is looking into a boot, while the dog has its head in the jar the frog had been kept in. 24 Participants were credited with the first attempt if, in relation to the fourth picture, they either made explicit reference to the search on the part of the boy or the dog, as in (5.13), mentioned activities not depicted, e.g. look + T-shirt, or used generalizations of search activities depicted or not depicted, e.g. look + all over, as in example (5.14), or look in + plural noun, as in (5.15) (cf. Bamberg & Marchman 1990: 80ff.). 25

(5.13) 6[they searched everywhere.] (C5-G4-11)
(5.14) 12[then # they looked all! over, in the boo:ts, in the gla:ss,] (C1-G4-13)
(5.15) 11[He look in he # s <his> boots,] # 12[and the frog are not there.] (C1-G1-6)
(5.16) 9[The boy looking in/ in the <but cf. A!> # shoe,] 10[and the dog in a glass.] (C1-G1-14)

24 Picture four is the most likely choice to signal the instantiation of the search; nevertheless, there are other acceptable options: Bamberg and Marchman (1990: 90), for example, found that several of their participants encoded picture four as a dressing scene and picture five as the instantiation of the search.
25 If participants’ utterances were not clearly attributable to picture four or picture five, they were coded as ATTEMPT 1, i.e. as referring to picture four.

Look + preposition (in/into/to/out etc.) was only credited as a realization of ATTEMPT 1 if additional mention was made of the frog or of generalizations of the search such as the ones described above, since otherwise this corresponds to a mere picture description, as in example (5.16). The same is true for expressions involving shout, scream etc. only.

Failure of the initial attempt (OUTCOME 1) needs to be inferred from the continuation of the search in the fifth picture, where the boy shouts out of the window (“window attempt”, pic. 5). The picture sequence allows the conclusion that this second attempt also fails; it ends with the boy and the dog outside below the window with the jar broken into pieces. The pictures can then be interpreted as the search continuing outside (“outside attempt”, pic. 8) and in the woods: The boy is depicted as what can be interpreted as shouting into a hole in the ground and getting his nose bitten by a gopher (“gopher attempt”, pic. 9-10), as shouting or looking into a hole in a tree and falling down the tree as an owl comes out of the hole (“owl attempt”, pic. 11 & 12). From the next pictures follows the “deer attempt” (pic. 13-17): The boy climbs on a rock (pic. 13 & 14), ends up on the head of a deer (pic. 15) and is carried away by the deer, which drops him into a lake (pic. 16 & 17). Finally, the protagonist(s) make(s) an attempt to find the frog behind a log in the lake (“lake attempt”, pic. 19-21). That is, the boy and the dog are depicted in the lake close to a log; the boy has one hand raised to his ear, indicating that he is listening to something (pic. 20). Then the boy and the dog are depicted next to the log, the boy has one finger raised to his lips, i.e. he is telling the dog to be quiet (pic. 20). Finally, boy and dog are shown from behind with their heads on the other side of the log (pic. 21). The pictures allow the conclusion that the lake attempt, which is the last attempt in the plan application sequence, is finally met with success and thus completes the overall search theme (cf. Berman & Slobin 1994: 46, Bamberg & Marchman 1990: 74); the protagonists are shown looking at two big frogs and several smaller frogs, i.e. a “frog family” (pic. 22 & 23). All attempts succeeding the instantiation of the search, i.e.
ATTEMPTs 2 to 7, were coded following the same guidelines as ATTEMPT 1. OUTCOMEs 1 to 6 were not considered, since they follow automatically from their corresponding attempts even if no explicit mention of failure is made, e.g. due to a subsequent attempt (as in (5.17)) or because the child simply continues with the storytelling, and since they serve no additional function in the story.

(5.17) 13[and they open the window] 14[and shout/ shouted,] 15[frog, where are you?] … 22[and they got to the forest] 23[where they shouted] # 24[frog! <fro-og> where are you?] (C1-G4-4)

The CONSEQUENCE (corresponding to OUTCOME 7) was coded, however, due to its function of concluding the search theme. Participants were credited with the CONSEQUENCE if they identified the frog as being the one from the beginning of the story, either in relating the content of pictures 22 and 23 (explicit realization) or as part of the ending (implicit realization).

The task material ends with picture 24: The boy and the dog are depicted at a little distance from the frog family, walking; the boy has a frog in his hand and is waving. Participants were credited with the ENDING component if they made explicit reference to either a recovery or a replacement of the original frog, i.e. the boy either takes his initial frog with him or a new frog to replace the one he lost. Participants were credited with realizing the recovery option if they explicitly identified the frog as being the original one and made reference to the boy taking the frog away with him. The replacement option, on the other hand, was considered realized if reference was made to the boy taking a frog away with him, which had either simply not been identified as the original frog or had been explicitly introduced as a different one.
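The two ENDING options described above amount to a small decision rule, which can be sketched as follows. This is a hypothetical illustration rather than the procedure actually used: the function and parameter names are invented here, and both inputs are assumed to be manual judgements about a given transcript.

```python
from typing import Optional


def classify_ending(boy_takes_frog_away: bool,
                    frog_identified_as_original: bool) -> Optional[str]:
    """Sketch of the ENDING criteria; returns None if no credit is given."""
    if not boy_takes_frog_away:
        # Without reference to the boy taking a frog with him,
        # neither option counts as realized.
        return None
    if frog_identified_as_original:
        return "recovery"
    # The frog was not identified as the original one,
    # or was explicitly introduced as a different frog.
    return "replacement"


# Hypothetical codings, for illustration only
recovery_case = classify_ending(True, True)
replacement_case = classify_ending(True, False)
no_ending_case = classify_ending(False, True)
```

Note that, in this sketch, identifying the frog as the original one without any reference to the boy taking it away yields no ENDING credit; such an identification would instead be relevant for the CONSEQUENCE.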
Two general problems occurred in the analysis of narrative coherence: First of all, as in any analysis involving spoken and especially spoken learner language, some degree of interpretation was involved in judging whether something was to be given credit or not. As a general rule, the interpretation was kept to a minimum and critical cases were discussed with other linguists. 26 A second problem in the analysis was the occurrence of German expressions in the L2 English data (cf. also ch. 5.2.2.1). Although narrative coherence, i.e. the realization of narrative components, is considered to be a cognitive phenomenon, it is only accessible via its linguistic expression, and thus the central question is whether participants are able to express the narrative components in their L2. 27 As a general rule, participants were not credited with a narrative component if the crucial element of the clause was in German, as for the ENDING in example (5.18) and the INITIATING EVENT in (5.19):

(5.18) 21[and then have the boy a little frog # wieder.] (C8-G1-2)
(5.19) 5[the glass # is leer,] (C8-G1-8)
(5.20) 3[and the frog # riß <?> aus/ out the glass.] (C8-G1-2)

26 No inter- or intrarater agreement was assessed; this is clearly one of the limitations of the present study.

They were given credit, on the other hand, if the narrative component could still be considered accomplished when leaving out the German expression, as for the INITIATING EVENT in example (5.20). Sometimes it was even necessary to resort to intonation in order to decide whether to include or exclude a clause:

(5.21) 10[the boy are ruft, <Engl. pronunciation> fro:g! fro:g!] (C1-G1-4)
(5.22) 10[and the boy rufing # the frog.] (C1-G1-7)

In the examples, C1-G1-4 (5.21) uses direct speech and thus conveys the sense of calling, even if the German verb is discarded. In the case of C1-G1-7’s utterance (5.22), on the other hand, only a pointless combination of two noun phrases would remain.
27 A separation into linguistic, cognitive, and communicative abilities as well as their development is actually rather artificial and belies the strong interconnectedness of these abilities, especially in relation to narration (e.g. Berman 2001a: 25, Bamberg 1987: 13; cf. also ch. 2).

5.2.2 Cohesion

The number of cohesive devices is to a large degree influenced by text length, since the number of ties increases with each additional clause. To exclude the influence of this factor the transcripts were first coded into clauses and then the proportion of cohesive devices per clause, i.e. the text’s cohesive density, was used as a measure for cohesion. The rationale for the choice of the clause as the basic unit of analysis and the coding method are both described in the next section. The method used for the analysis of cohesive devices is described in the ensuing sections.

5.2.2.1 Coding into clauses

The transcripts were coded into clauses following the semantico-syntactic definition used by Berman and Slobin (1994): Any unit containing “a predicate that expresses a single situation (activity, event, or state), including finite and nonfinite verbs as well as predicate adjectives” (Berman et al. 1986 in Berman & Slobin 1994: 657).

The clause was chosen as the basic unit of analysis because it is a comparatively well-defined grammatical unit. 28 That is, clauses allow a rather precise allocation of cohesive devices and their respective anaphoric elements even in spoken language. At the same time coding into clauses allows the measurement of participants’ text length and thus the calculation of the texts’ cohesive density. An additional advantage of clauses is that they are semantically close to propositions, whose usefulness for studies of text production and processing has been amply demonstrated (e.g. Kintsch 1998 & 1974, Reilly et al. 1998, Mandel & Johnson 1984, Thorndyke 1977).
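As a minimal illustration of the cohesive density measure introduced at the start of section 5.2.2, the following Python sketch divides the number of cohesive devices by the number of coded clauses. The function name and the counts in the example are hypothetical and serve only to show why a per-clause measure is independent of text length.

```python
def cohesive_density(n_cohesive_devices: int, n_clauses: int) -> float:
    """Cohesive devices per coded clause (sketch; both counts come from manual coding)."""
    if n_clauses <= 0:
        raise ValueError("a transcript needs at least one coded clause")
    return n_cohesive_devices / n_clauses


# Invented counts: a shorter and a longer story
short_story = cohesive_density(18, 24)
long_story = cohesive_density(45, 60)
```

In this invented case the longer story contains more than twice as many cohesive devices in absolute terms, yet both stories have the same density, which is the effect of text length that the measure is meant to exclude.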
While propositions are located on a conceptual, cognitive level, however, clauses are a linguistic unit realized in speech or writing; it can be said that clauses express propositions (cf. Cruse 2004: 22, Huddleston & Pullum 2002: 34, Biber et al. 1999: 122, Kintsch 1998: 54ff., Reilly et al. 1998, Mandel & Johnson 1984, Thorndyke 1977, Kintsch 1974: 5). Due to being centered around the verb as expressing one state, event or activity, clauses thus also allow a rather precise allocation of the conceptual units of discourse realizing individual narrative components.

Following Berman and Slobin’s coding (1994: 660ff.) for comparability, lexical verbs similar in function to modals and aspectual verbs, e.g. want or start, were coded together with the verbs they modified if they occurred in utterances with the same subject, as in example (5.23). If, on the other hand, they introduced utterances with a different subject, they were coded as belonging to two different clauses (5.24). Similarly, narrator comments, i.e. metacomments on story content, as in (5.25), or on the communicative situation, were coded together with the clause they commented on. If they formed a clause of their own they were coded accordingly but did not count towards total text length. 29

(5.23) 19[because they wanted to find Spot.] (C1-G4-7)
(5.24) 50[and he wants] to/ # to/ 51[the deer to let the boy down.] (C5-G4-15)
(5.25) 66[and it look like # ehm the dog was ill,] (C5-G4-24)

28 Especially compared to sentences (cf. Quirk et al. 1985: 47, Hunt 1965). For a discussion of the (dis-)advantages of several units of analysis, particularly with regard to measuring text length, cf. also Möller (2009a).
29 In this they were treated like crucial plot information, cf. below.

Several coding problems arose due to the fact that this study looks at spoken language and, moreover, L2 learner data. These were unintelligible passages, German words in the L2 English transcripts, dysfluencies, verbal interaction with the interviewer, and verbless clauses. In the following, the coding procedure for each of these problem areas will be described.

Utterances with unintelligible passages were coded into clauses if only optional elements, i.e. modifiers (e.g. Quirk et al. 1985), were unintelligible and the overall meaning of the utterance could still easily be inferred, as in example (5.26). Otherwise these passages were not coded.

(5.26) 32[and find a Loch on X/ X.]< hit/ him ? > (C8-G1-1)

In the L2 transcripts, utterances with predominantly German words were not coded as clauses, since the question was whether the children were able to tell a coherent and cohesive story in English; thus, utterances consisting of more than 50% German words were discarded. First, the total number of words in the utterance as well as the total number of German words were counted. Articles, prepositions and conjunctions were not included, and phrasal verbs were treated as one unit. Words which were not completely target-like in their realization but sufficiently similar to their L2 targets, e.g. been or also beens (‘bees’, German Bienen), were treated as English words, even if they could also have been considered instances of code-switching to German. Words that were simply “anglicized” in pronunciation, on the other hand, were counted as German. Then the number of German words in an utterance was divided by the total number of words in the utterance and the result was checked against the 50% threshold.

Dysfluencies such as (self-)repairs, repeats and hesitations (for the terminology cf. Biber et al. 1999) occurred very frequently in the data; they are a performance phenomenon that is typical for spoken language (cf. Biber et al. 1999: 1039) and even more so for learner language. Where simple repeats occurred within clauses, they were disregarded for the analysis.
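The counting procedure for German words described above can be sketched in a few lines of Python. The sketch rests on several assumptions that are not part of the original study: each word is represented as a hand-tagged token, the class `Token`, the tag names and the function names are hypothetical, and an utterance without any countable words is simply kept.

```python
from dataclasses import dataclass

# Word classes left out of the count (articles, prepositions, conjunctions).
EXCLUDED_CLASSES = {"article", "preposition", "conjunction"}


@dataclass(frozen=True)
class Token:
    form: str        # a phrasal verb is entered as ONE token, e.g. "look for"
    word_class: str  # hypothetical manual tag, e.g. "noun", "verb", "article"
    is_german: bool  # manual judgement; merely anglicized words count as German


def german_share(utterance):
    """Proportion of German words among the counted words of an utterance."""
    counted = [t for t in utterance if t.word_class not in EXCLUDED_CLASSES]
    if not counted:
        return 0.0  # assumption: nothing to count, so nothing to discard
    return sum(t.is_german for t in counted) / len(counted)


def is_coded(utterance, threshold=0.5):
    """Keep the utterance unless MORE than 50% of its counted words are German."""
    return german_share(utterance) <= threshold


# Example (5.19) from the text: "the glass # is leer"
ex_5_19 = [
    Token("the", "article", False),
    Token("glass", "noun", False),
    Token("is", "verb", False),
    Token("leer", "adjective", True),
]

# Invented utterance, for illustration only
invented = [
    Token("the", "article", False),
    Token("boy", "noun", False),
    Token("ruft", "verb", True),
    Token("laut", "adverb", True),
]
```

With the tags assumed here, one of the three counted words in (5.19) is German, so the utterance stays below the threshold, while two of the three counted words in the invented utterance are German, so it would be discarded. Dysfluencies are left aside in this sketch; simple repeats within clauses, as noted, were disregarded for the analysis.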
The treatment of simple repeats is illustrated in example (5.27), where the child uses repeats and hesitators to extend processing time and indicate that it has not yet finished the utterance (cf. Biber et al. 1999: 1054f.). 30

(5.27) 5[that the frog is: # ehm # {the frog is: } # e:hm outgo # of the glass.] (C1-G1-18)

When retrace-and-repair sequences occurred in the utterances, e.g. the self-repair in (5.28), only the reformulation was coded. 31 As a general rule, only clause elements that the speaker retraced explicitly were excluded from analysis. Thus, if the speaker repaired the connective in a retrace-and-repair sequence, the clause was coded from the new connective on, e.g. in example (5.29) where the simple sequential connective then is replaced by the more specific, punctiliar connective (cf. Halliday & Hasan 1976) in this moment.

(5.28) # ehm, then he shaked the tree/ also 44[then the dog shaked the tree,] (C5-G4-21)
(5.29) then/ ehm 42[in this moment # a/ all bees was/ were/ were following,] (C1-G4-13)

If, on the other hand, element(s) following the connective were repaired but there was no reformulation of the connective itself, the clause was still coded from the connective on; this was done because retrace-and-repair sequences involve “backtracking to the initial part of a clause” (Biber et al. 1999: 1063), e.g. a connective, as opposed to a complete abandonment of the original utterance.

30 (Parts of) utterances that did not enter in the analysis are enclosed in {…}.
31 The terms repair, recast, reformulation and retrace-and-repair-sequence will be used interchangeably for any sequence where part(s) of an utterance are reformulated by the speaker thereby “retracing” or cancelling the original part(s).

Self-repairs sometimes led to utterances with apparently incoherent syntactic structure, and some degree of interpretation was necessary to decide which parts of the utterance were repaired and what the final intended structure was.
This process will be demonstrated using example (5.30): First, the speaker self-repairs the verb form cames and finishes the originally planned utterance with the particle out. The original structure of the utterance is left intact; the reformulation is accommodated in the existential clause structure.

(5.30) 52[then there cames an owl/ came an owl out/ # <breathes in deeply> flew out,] (C1-G4-17)

The next self-repair, though, allows for two interpretations:

1. The speaker not only reformulates the original verb into a more specific verb of motion but also retraces the existential clause structure. This means that a syntactic blend is created, i.e. the utterance starts with a different grammatical structure than that with which it finishes (cf. Biber et al. 1999: 1065); only the connective would still be accommodated in the new structure and thus, the clause would enter the analysis as [then an owl flew out].
2. The participant again repairs the verb only, as a kind of afterthought; evidence for this is the short hesitation pause and the parallelism of flew out to the retraced construction came […] out. This would mean that the original structure remains intact and the clause enters the analysis as [then there flew an owl out].

Following the general rule established for retrace-and-repair sequences, namely only to exclude from analysis clause elements that the speaker retraced explicitly, interpretation (2) was chosen.

The interviews show a great awareness on the part of the participants, even in first grade, as to the stereotypical roles in a storytelling context, i.e. an active storyteller and a comparatively passive listener.
This set-up was of course encouraged through the instructions given to the interviewers, who were told to use nonspecific prompts such as mhm or okay and in general to focus on the non-verbal aspects of their role as “fascinated listeners.” Nevertheless, some verbal interaction between participants and interviewers was unavoidable; its complete absence would have been as unnatural as in any other communicative context. The question was then if, and how, to code utterances resulting from an interaction with the interviewer. Such interaction was sometimes initiated by the interviewer and sometimes by the participating children; interviewer-initiated sequences resulted from prompts as well as from clarification requests, whereas participant-initiated interaction sequences usually followed vocabulary-related questions.

First of all, interviewer-initiated sequences and their coding will be described: Prompts used by the interviewers can be categorized as being either non-specific or specific. Non-specific prompts are general interviewer contributions, which serve to encourage the child in telling the story and, if necessary, to encourage it to tell more; this includes non-verbal and verbal (e.g. oh, mhm) expressions of interest, surprise etc., as well as questions such as What is happening? More specific prompts, such as And how about the dog?, were used only if a child was very reluctant to speak at all. Additionally, the interviewers made use of non-specific and specific clarification requests; non-specific clarification requests used were, for example, requests to repeat an utterance due to background noise or because the child had mumbled part of an utterance. Specific clarification requests, on the other hand, were direct reactions to something the child had said, e.g. a German word, a non-target-like structure or an obvious mismatch in content with the previous utterances (example 5.31).
(5.31) Participant 16: the frog <dog> ehm took his tongue a/ and # ehm
Interviewer: The frog or the dog?
Participant 16: Ehm 29[the dog took his tongue] 30[and the/ # wish/ wif/ the wif/ wish <from German wischen> ehm over his cheeks.] (C5-G4-16)

Children’s reactions to non-specific and specific prompts were coded like normal utterances, except when they were mere repetitions of what the child had said before. For all other interaction sequences, coding was based on the children’s verbal reaction, again excluding simple repetitions. When children’s utterances did not function as an answer to an interviewer question, but rather seemed attributable to a lag in the child’s processing of its originally planned utterance, they were coded as normal utterances. More often, though, interaction with the interviewer resulted in recasts. As a general rule, only the recast was coded if it presented an “improved” version of the original utterance, e.g. by clarifying the actor or by replacing a German with an English expression, as in examples (5.31) and (5.32), where recasts follow a clarification request. In this case, the original utterance was not coded.

(5.32) Participant 9: … then # <they, cf. below> fall in the Wasser,
Interviewer: Wh/ what?
Participant 9: 16[Then they </ ði/ > fall in the water,] (C8-G1-9)

Participant-initiated interaction sequences usually followed vocabulary questions and involved either a recast following successful negotiation for a vocabulary item or a recast without any actual negotiation taking place. 32 Passages with reformulations resulting from successful negotiation were coded as follows: If the child managed to extract, for example, an English word it was missing via negotiation and interaction with the interviewer, the interaction sequence between “missing piece” and unfinished clause was disregarded and the whole unit coded as one clause. This is shown in (5.33), where {…} encloses sequences that did not enter into the analysis.
(5.33) Participant 6: 34[and in one tree there was {a bee # heap. </ hi/ > haive? </ eı/ > a ha/ a have oder? # bee ehm/ # a bee # heap? # ehm why don’t/ it’s called, ehm a bee heath? </ i/ >
Interviewer: Oh, a beehive.}
Participant 6: A beehive.] (C1-G4-6)

32 Negotiations for vocabulary items took place in English. Interviewers were instructed not to answer German vocabulary questions, since this would have made their role as monolingual speakers of English questionable.

Interaction sequences without an actual negotiation for vocabulary usually involved the child asking for a German word and the interviewer reacting with a general expression of incomprehension. The child then self-repaired and continued the original utterance, replacing the German with an English expression (example 5.34). Again, the recast was coded and entered into the analysis, while the German expression was disregarded.

(5.34) Participant 10: 27[and the dog # ehm # ehm # {bellt?
Interviewer: I am sorry? I don’t understand.
Participant 10: X <bi/ ? >} bark at the # beehive,] (C5-G4-10)

Verbless clauses were also coded. In this I again followed Berman and Slobin (1994: 661). Assumption of an ellipsis was avoided, however, except for clear cases of gapping (cf. Huddleston & Pullum 2002: 1337f.) as in example (5.35) below; that is to say, coordinated noun phrases, adverbs etc. were treated as one unit, since they syntactically function as one, even if semantically the respective predicate refers to both elements and two propositions could be argued for, e.g. in (5.36). This line of argumentation also holds true for existential clauses, as in (5.37), where a singular predicate is followed by two coordinated noun phrases as the notional subject. In written language this would be considered ungrammatical and an ellipsis would need to be assumed. In spoken language, however, the verb is often singular, even if the following notional subject is plural (cf.
Biber et al. 2002: 236), so these instances were also coded as one clause. 33

(5.35) 7[the boy looked in his boots,] 8[the # dog in the glass] (C1-G4-2)
(5.36) 8[and on the ground lays a t-shirt and a sock.] (C1-G4-7)
(5.37) 1[Once there was a boy and a dog,] 2[they had a frog in a glass,] # 3[and they were/ # were/ were nice to the frog,] (C1-G4-4)

Verbless sentence elements introduced by with were coded as clauses if they had an S(V)A or S(V)C structure, i.e. a normal clause structure with verbal ellipsis, and if, additionally, they functioned as sentence adjuncts, i.e. grammatically separate from other clause elements (cf. Quirk et al. 1985: 478 & 511ff., Berman & Slobin 1994: 661). With functions as a subordinator in these cases and not as a preposition (Quirk et al. 1985: 705); compare e.g. examples (5.38) and (5.39).

(5.38) 81[the dog <boy> goes in the water again] 82[with the frog on his hand.] (C5-G4-9)
(5.39) 47[there they found two fro/ two big frogs with lit/ with many little small frogs.] (C1-G4-14)

33 This also does justice to Berman and Slobin’s (1994: 657) criterion of a unified predicate expressing a single situation (see above).

Verbless clauses were also coded if verb or auxiliary were missing due to a deficient structure attributable to learner language, and if the meaning was easily recoverable by adding a form of copular be, as in (5.40).

(5.40) 9[the dog falling down.] # 10[the glass broken.] (C1-G1-17)

Due to the nature of the storytelling task as an oral performance, the children sometimes “acted out” the verb instead of mentioning it explicitly. These cases were also coded as clauses (5.41). Finally, crucial plot information was coded even without the presence of a verb or an easily reconstructable ellipsis (5.42).

(5.41) 8[the boy, # fro: g! fro: g! ] (C1-G1-4)
(5.42) 38[there was two frogs,] # 39[and # he/ the one was frogfrog.] # and he’s/ <his> # 40[and he’s babies.]
(C8-G1-5)

5.2.2.2 Cohesion methodology

The present study’s cohesion analysis was based on the standard categories in discourse analysis, which all go back to the seminal work by Halliday and Hasan (1976 & 1985, Hasan 1984); thus, for each participant the use of references, substitution, ellipsis, connectives, and lexical cohesion was coded. The study takes into account intra- as well as interclausal ties in order to capture all possible links, even if interclausal (and even more so intersentential) ties are more salient in contributing to cohesion. 34 All potentially cohesive items were extracted from the individual transcripts and coded according to (a) the subcategory of cohesion, e.g. reference or connective, and (b) the most immediate presupposed item as well as the clause this presupposed item occurred in. 35 Additionally, each tie’s phoricity was coded, i.e. whether references were endophoric or exophoric and anaphoric or cataphoric. Endo- and anaphoric reference was considered the “default case,” i.e. it was first of all assumed that a relationship exists with some preceding element in the text rather than looking for possible relationships with subsequent text parts or outside the text. If both interpretations were possible, ties were coded as endo-/anaphoric.

34 The intraclausal cohesive devices coded were references and lexical ties; the connectives and or but as phrasal coordinators were not coded. The overall number of intraclausal devices was very low.

35 Cohesive items are very often part of cohesive chains involving repeated references to the same referent, several repetitions of the same lexical item, etc. The degree of their cohesiveness is of course also determined through membership, i.e. embedding, in such a chain (cf. Halliday & Hasan 1976: 330). However, to preserve some degree of clarity in the analysis only the tie with the most immediate presupposed item is recorded.
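The coding scheme described above can be pictured as one record per tie. The following Python sketch is purely illustrative: the class and field names are hypothetical and not taken from the study, and it only assumes what the text states, namely that each tie stores its subcategory, its most immediate presupposed item with the corresponding clause, and its phoricity, with endophoric and anaphoric as the default case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CohesiveTie:
    """One coded tie; all field names are illustrative assumptions."""
    item: str                                 # potentially cohesive item, e.g. "they"
    subcategory: str                          # e.g. "reference" or "connective"
    clause: int                               # clause in which the item occurs
    presupposed_item: Optional[str] = None    # most immediate presupposed item
    presupposed_clause: Optional[int] = None  # clause of the presupposed item
    endophoric: bool = True                   # default case: relation inside the text
    anaphoric: bool = True                    # default case: relation to preceding text

    def is_intraclausal(self) -> bool:
        """True if the item and its presupposed item occur in the same clause."""
        return self.presupposed_clause == self.clause


# Example (5.37): "they" in clause 2 presupposes "a boy and a dog" in clause 1.
tie = CohesiveTie("they", "reference", 2, "a boy and a dog", 1)
print(tie.is_intraclausal())
```

Keeping phoricity as two separate flags mirrors the two distinctions made in the text (endophoric vs. exophoric, anaphoric vs. cataphoric); other representations would be equally possible.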
German words were excluded from the analysis even if they were “anglicized” in pronunciation or had an English homophone which was, however, contextually infelicitous, e.g. / fil/ in example (5.43), which could be interpreted as either German fiel (i.e. English fell) or English feel. 36 Non-target-like English words were coded if it was possible to infer the intended meaning due to closeness to the target word, e.g. beens in example (5.44), or due to the linguistic context.

(5.43) 9[the dog fiel </ fil/ > vor <=from> the window.] (C1-G1-10)
(5.44) 18[and the beens: # fly after the dog,] (C1-G4-2)

36 Indefinite or definite articles used together with German nouns, on the other hand, were included in the analysis of demonstrative references, e.g. the Eul (C8-G1-11).

After the classification into subtypes, which will be described subsequently, the following measures were calculated for each subcategory of cohesive devices:
• Total number of occurrences
• Number of occurrences per clause (density).
Additionally, participants’ overall cohesive density, i.e. the mean number of ties per clause, was calculated. Since only endophoric and anaphoric ties regularly contribute to cohesion (Halliday & Hasan 1976), exophoric and cataphoric ties were excluded from these final steps of the analysis.

5.2.2.3 References

References were classified according to the semantic relationship they encode, and thus, personal, demonstrative and comparative references were coded (Tab. 5.5).

Tab. 5.5 Subtypes of references included in the analysis
Type | Subtype | Further subtype | Example
References | personal | personal pronoun | he, him, they, them, it…
 | | possessive pronoun | his…
 | | relative pronoun | who, which, that
 | demonstrative | adverb | there, here
 | | definite article | the
 | | demonstrative proper | this, that
 | | relative pronoun | where (~in which)
 | comparative | | more, similar, other

The category of personal references comprises personal (independent of case), possessive and relative pronouns. Left dislocations (also pronoun copies) such as in example (5.45) (cf. also Bamberg 1994: 226f., Biber et al. 2002: 956) were excluded from the overall count of (personal) references, since they are an oblique repetition.

(5.45) 5[The # boy, he's # very very # tired.] (C1-G1-3)

Since the present study’s cohesion analysis is based on the clause, relative pronouns (e.g. who, which) were also considered. In contrast to interrogative pronouns (cf. Halliday & Hasan 1976: 309), relative pronouns contribute to cohesion in a clausal analysis, since they refer back to a referent in the text (cf. Quirk et al. 1985: 365), as in (5.46).

(5.46) 1[once there was a bo: y,] 2[who had a f/ frog in a gla/ ehm ehm in a glass,] (C1-G4-6)

Similarly, possessive pronouns were coded as personal references, even if their referent was in the same clause. In such cases the use of the possessive is strongly dependent on clausal structure and it could therefore be considered non-cohesive; nevertheless, there is still a cohesion-building relationship of co-reference.

Classified as demonstrative references were the adverbs here and there, the determiners this and that, the relative pronoun where indicating location analogously to spatial there (cf. Quirk et al. 1985: 442), and the definite article. 37 The definite article the contributes to cohesion by indicating that “the noun it modifies has a specific referent, and that the information required for identifying this referent is available” (Halliday & Hasan 1976: 74). Thus, it signals that this information is recoverable either endophorically, i.e. from the preceding text (anaphora) or from within the same nominal group (cataphora) as in the hole of the owl, or on the other hand extralinguistically, e.g. in generalized exophoric uses such as in the sun (Halliday & Hasan 1976: 71ff.).
In the analysis, both the cataphoric noun-phrase-initial use of the definite article, as in the hole of the owl (Halliday & Hasan 1976: 72), and any (generalized) exophoric uses were, of course, excluded as being non-cohesive (cf. ch. 5.2.2.2). The cohesive, i.e. anaphoric, usage of the definite article involves in its clearest case the repetition of the item referred to (5.47), but often also use of a synonym, near-synonym, a hyponym (5.48) or even a meronym (Halliday & Hasan 1976: 72).

(5.47) The boy climbed on a tree. Then the boy looked into a hole.
(5.48) The boy looked at the frog. Then the frog ran away. The kid was very sad.

The indefinite article, on the other hand, indicates that an entity’s referent is not recoverable from the preceding text. Any use of the indefinite article needs to be considered an exophoric demonstrative reference, which is non-cohesive and was therefore excluded from the analysis. Instances of all, some or both as non-specific determiners were coded analogously. 38

37 Demonstrative that is not to be confounded with that acting as a relative pronoun such as in: 3[and he had a dog] 4[that he called Lucky.] (C1-G4-07). In the example, that was coded analogously to which as a personal reference. Additionally, demonstrative and factual connective that were discriminated (cf. the section on connectives).

38 If both was used as a universal pronoun (cf. Quirk et al. 1985: 381), it was coded in the same manner but with a subsequent nominal ellipsis.

The classification of there was not always straightforward, since there occurs quite frequently in the data but not in its function as spatial adverb. Instead, it most often functions as existential there, e.g. in (5.49). In the data, there was only classified as a spatial adverb if it carried tonic prominence and allowed no combination with here (5.50) (cf. Quirk et al. 1985: 1402ff.); otherwise there was excluded from the analysis as being non-cohesive.
(5.49) 1[there was a boy,] 2[he had a dog and a frog.] (C1-G4-2) (5.50) 7[and the boy is looking in the glass,] 8[and the # frog is not there.] (C1- G1-16) Comparative references occurred very infrequently. The ones identified in the data were another, which was split up for coding into indefinite article an and comparative reference other, the comparative adverb again (analogously to more) and the adjective next, which was considered a comparative reference expressing difference. However, most occurrences of these comparative expressions were non-cohesive, since they had an implicit (exophoric) reference point (cf. Quirk et al. 1985: 530, Halliday & Hasan 1976: 81 & 324). 5.2.2.4 Substitution and ellipsis Substitutions and ellipses occurred only very marginally in the data. Substitutions which occurred were either nominal (one) or verbal (do); there were no instances of clausal substitution (so, not). For the analysis, substitute one, as in (5.51), needed to be differentiated from cardinal numeral one, as in (5.52), which was coded as an instance of lexical cohesion, and one functioning as an indefinite article (5.53), which was excluded as being an exophoric demonstrative reference (cf. Halliday & Hasan 1976: 98ff.). The following criteria were used: as opposed to the cardinal numeral, substitute one does not contrast with other numerals, it can have either singular or plural form (one-ones), and the substituted item cannot be added (5.54). In this latter criterion substitute one also contrasts with the indefinite article one, which is often accompanied by an ellipsis, as in (5.52), but to which the ellipted item can always be added. At the same time, the substitute’s plural form (ones) contrasts with the plural form of the indefinite article (some). Most importantly, however, substitute one is always accompanied by some contrast to the lexical item substituted, e.g. 
another in example (5.51); the expression encoding this contrast is phonologically salient, while substitute one is not (cf. Halliday & Hasan 1976: 95ff.). In this it resembles the indefinite article but contrasts with the numeral, which is always phonologically salient.

(5.51) 51[and there’s sit his frog with another one.] (C5-G4-19)
(5.52) 79[and then … very much # frog childs ehm jumped out of # the grass.] # 80[and one # was jumping to the boy.] (C5-G4-21)
(5.53) 1[one night, a little boy had # a frog under his glass.] (C5-G4-19)
(5.54) Which toy? *The red one toy.

Categories of ellipses encountered were nominal, verbal, and clausal. Any omission of (proper) noun, pronoun or (coordinated) noun phrase(s) was classified as nominal ellipsis, while omissions of the verb or unified predicates were coded as being instances of verbal ellipsis (example (5.55)). Ellipses involving a combination of nominal and verbal element(s), e.g. a verb phrase including a prepositional phrase (5.56), were classified as clausal and thus also coded. Other omissions, e.g. of that or which, were not considered.

(5.55) 65[then the deer be#guns <began> to run,] 66[and the dog in front of the deer.] (C1-G4-7)
(5.56) 17[then the dog fall out # the # window,] 18[and the boy also.] (C8-G1-1)

5.2.2.5 Connectives

All interclausal connectives were coded. Formally, this includes conjunctions (e.g. and), adverbs (e.g. then), prepositional phrases (e.g. in the morning), complex items (one day) or a combination of these forms. Semantically, Halliday and Hasan’s (1976: 226) classification was used and the coding thus included additive, temporal, causal and adversative connectives (Tab. 5.6). 39 Three categories were added, however, namely conditional if, which Halliday and Hasan include under the general heading of causal (1976: 259), the “neutral” factual connective that, which introduces a specifying complement clause (cf. Biber et al. 1999: 658), and the subordinator with (cf. Quirk et al. 1985: 705).

Tab. 5.6 Subtypes of connectives included in the analysis
Type | Subtype | Examples
Connective | additive | and, or, also, and/but … also/too/as well †
 | temporal | then, now, once, first, and then/now, (and/but) in the morning, (and/but) one day, next day, suddenly, when, as…
 | adversative | but, while, except
 | causal | because, cause, for, (and) so
 | conditional | if
 | factual | that
 | subordinator ‘with’ | with, e.g. with (the dog {being} on his bed)
† And… also, and… too etc. are considered parallel constructions of additive connectives and thus coded as one unit (Halliday & Hasan 1976: 246).

39 A more detailed formal and semantic analysis should be conducted in the future, as this study only considers the overall category connective. Additionally, a detailed functional analysis would be desirable, since form and function cannot simply be equated (cf. ch. 3.4). Unfortunately, this is outside the scope of the present study.

Combinations of and or but plus a subsequent connective were coded as one unit, e.g. and/but then as a temporal connective, since and (as well as to a lesser extent but) often functions as a discourse marker indicating that more is to be said (cf. Renkema 2004: 168, Berman & Slobin 1994c: 178, Halliday & Hasan 1976: 261).

Any complex connective item such as in the morning (e.g. C1-G4-1) was first of all coded as one unit according to its function as a connective, e.g. the temporal connective in the morning. As a second step, the lexical items forming part of it were analyzed for lexical ties, e.g. an oppositional relation such as between morning and night. This seems justified since the use of a complex item represents a conscious lexical choice in expressing the semantic relationship between clauses much more specifically than if a more general connective is used, compare e.g. in the morning with then.
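The rule that and or but plus a subsequent connective counts as one unit can be illustrated with a small lookup. This is only a sketch under stated assumptions: the table below is a hypothetical subset of the examples in Tab. 5.6, the function name is invented for illustration, and the context-dependent decisions described in the text (e.g. telling factual that apart from demonstrative or relative that) are not modelled.

```python
# Hypothetical subset of Tab. 5.6; a real coding scheme would need more entries.
CONNECTIVE_SUBTYPES = {
    "and": "additive", "or": "additive", "also": "additive",
    "then": "temporal", "now": "temporal", "suddenly": "temporal",
    "when": "temporal", "in the morning": "temporal", "one day": "temporal",
    "but": "adversative", "while": "adversative", "except": "adversative",
    "because": "causal", "cause": "causal", "so": "causal",
    "if": "conditional",
    "that": "factual",
    "with": "subordinator 'with'",
}

# "and"/"but" are treated as discourse markers when another connective follows.
LEADING_MARKERS = ("and", "but")


def classify_connective(connective: str) -> str:
    """Return the semantic subtype of a (possibly combined) connective."""
    text = " ".join(connective.lower().split())
    if text in CONNECTIVE_SUBTYPES:
        return CONNECTIVE_SUBTYPES[text]
    for marker in LEADING_MARKERS:
        prefix = marker + " "
        if text.startswith(prefix):
            rest = text[len(prefix):]
            if rest in CONNECTIVE_SUBTYPES:
                # The combination is one unit, classified by its second part.
                return CONNECTIVE_SUBTYPES[rest]
    raise KeyError(f"unlisted connective: {connective!r}")


print(classify_connective("and then"))
print(classify_connective("but in the morning"))
```

Under these assumptions, and on its own stays additive, while and then and but in the morning are both returned as temporal.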
5.2.2.6 Lexical cohesion

The intra- and interclausal lexical relations of nouns, lexical verbs and adjectives were coded; morphological variants were considered together with their base forms. If a lexical item had no relationship to any preceding element, it was excluded as being non-cohesive. Two large categories of lexical cohesion, as distinguished by Halliday and Hasan (1976), were included in the analysis, namely reiteration and collocation (cf. ch. 3.4); in the following, I will describe these two categories and the way they were operationalized for the analysis. After that, some general coding problems and their solutions will be addressed.

The category of reiteration comprises ties through simple repetition, synonymy, near-synonymy, and hyponymy. The instances of ties by synonymy and near-synonymy identified and coded are listed in Tab. 5.7; included in this category are instantial synonyms, e.g. frog-Bingo or dog-Bello (C5-G4-3; cf. also Hasan 1984).

Tab. 5.7 Synonymy and near-synonymy included in the analysis
Synonyms | look-peep; spring-jump; tiny-little-small; great-big; earth-ground; instantial synonyms, e.g. dog-Bello
Near-synonyms | look-see; see-peep; look-watch; cry-scream-shout-yell-say-call; hear-listen; bowl-glass

The occurrences of hyponymy identified are listed in Tab. 5.8. Only the relationship hypernym-hyponym was coded as reiteration; co-hyponyms were included in the collocation category described below.

Tab. 5.8 Hyponymy included in the analysis
Hypernym | Hyponyms
animal | dog, bees, owl, mouse, deer, hamster, bird
pet | dog, frog
kid/child | boy

The second large category of lexical cohesion distinguished by Halliday and Hasan, namely collocation, comprises lexical items tied by opposition, co-hyponymy, 40 (co-)meronymy and general co-occurrence (1976; cf. also ch. 3.4). The oppositional relations identified and coded are shown in Tab. 5.9. Only one instance of a co-hyponymic sequence, namely cardinal numerals, was coded as such.

Tab. 5.9 Oppositional relations included in the analysis †
Term(s) | Opposed term(s)
jump/run/climb/creep/go/hop/fall/spring/swim/fly | stand/sit/lie/stop/stay
stand | sit/lie
cry/scream/shout/yell/say/call | hear/listen
take | give
wake up | sleep
morning/day | night/evening
come | go
moon | sun
big/great/large | little/small/tiny
happy | sad/unhappy
girl | boy
† Slashes indicate equivalent alternatives. No distinction was made between different types of opposition such as antonyms and converses.

All other ties through collocation, including all instances of (co-)meronymy, were operationalized as the word fields and lexical sets 41 listed in Tab. 5.10, which incorporate lexical as well as psycholinguistic relations in the sense of belonging to a common schema (“general co-occurrence”). 42

40 This is taken to include sequences such as cardinal numerals, which can be described as co-hyponyms of number.

41 Lexical sets are defined as being related “either based on association and intuition, or on objectively verifiable relationships captured by encyclopedic knowledge” (Lipka 2002: 173); they play a dominant role in many texts (cf. Schubert 2012: 54).

42 (Co-)meronymy was not coded as a separate lexical relation, since meronymic relations are often difficult to delimit (cf. Cruse 2004, Löbner 2002). The same is true for the (co-)hyponymy of verbs.

Tab. 5.10 Word fields included in the analysis
No. | Word field | Members †
1 | Going to bed and getting up | night/morning or day, sleep/wake up, bed, evening; bed dream; sleep dream; night dark; night dream, moon, pyjama, sleep; moon/sun shine; night, light, moonlight, evening; morning sun, breakfast
2 | Forest/woods | wood, tree, twig, bushes, (tree) trunk, log, stem, leaves
3 | Animals | dog, frog, bees, owl, mouse, deer, hamster
4 | Verbs of motion | jump, run, climb, creep, go (in the sense of movement), hop, spring, crabble, fall, fly, swim
5 | Verbs of “perception” | see, look, peep hear, listen smell
6 | Frogs and their habitat | frog, pond, lake, water, sea; wet, swim, shower, pond, lake, water, sea
7 | Animals and their associated activities | fly bees, flies, bird; owl fly; skunk stink; bees honey, sting
8 | Sounds, sound production and perception | noise, sound, shout, loud, quiet, cry, hear, say, scream, yell, make pst, sing, specific sounds such as a quak, to quack, bark
9 | Mental verbs | think, believe, know
10 | Body parts of animate beings | head, nose, hand, face, arms, face, tongue, cheeks, tummy, feet, paws, antlers, horns, muzzle, skin, horns
11 | Animals and the sounds they make | frog quak; dog bark
12 | Emotions and their expression | unfriendly/friendly, nice, sad, unhappy/happy, angry, laugh, afraid, smile, cry, kiss, like, love, surprised
13 | House and furniture | house, window, windowsill, room, bed, chair, door, carpet
14 | Family | baby, parent, father, mother, dad, married, man, wife, children, kid, baby, woman, girlfriend
15 | Temporal expressions | time, moment, night, morning, evening
† In the coding procedure more specific lexical relations, i.e. (near-)synonymy, hyponymy and oppositional relations, overruled the less specific one of membership in a word field. Slashes indicate a relationship of opposition between the terms separated by them in a word field, while terms separated by commas are in a relationship of interchangeability, i.e. any two of them would have been coded as being related.
Separation by a semi-colon indicates that only the terms in this respective chain were considered interchangeable in coding. Lexical field 15 was created for lexical cohesion not covered by the categories oppositional relation or lexical field 1.

Finally, some further decisions need to be outlined regarding the coding of lexical ties: First of all, the analysis was limited to relationships between individual lexical items. Relationships between phrases and lexical items were not coded, although, for example, an opposition between could not come out anymore and free (5.57) is in principle not different from an opposition such as stuck - free, which would have been coded.

(5.57) 12[but he could not come out anymore.] …17[the glass exploded,] 18[and the dog was free.] (C1-G4-10)

Secondly, lexical verbs similar to modals and aspectuals were excluded from the analysis, e.g. let (5.58) or want, since their function is to modify the meaning of the following lexical verb (cf. ch. 5.2.2.1).

(5.58) 23[and the deer get the boy] 24[and # let him fell in a pond.] (C1-G4-2)

Similarly, forms of be and lexical verb do were excluded due to their minimal semantic content (cf. Halliday & Hasan 1976: 125f.), the same applying to go in fixed expressions such as go to sleep, since in contrast to go to bed or go home no interpretation of movement in space is possible. Have was only coded where it could be replaced with possess or own.

Compounds were treated as one unit where a relationship of hyponymy was identifiable as the most immediate lexical relation (e.g. mouse hole - hole). Otherwise the most immediate relationship of one of their parts was coded. This allowed for ties through repetition (e.g. mouse hole - mouse) or membership in a lexical field (e.g. mouse hole - dog as common members of the lexical field animal).

Finally, the story’s title Frog where are you? and its variants, e.g. where are you, frog?, were split up into their individual elements, except when they were used with the actual function of a title at the beginning of a story, as in (5.59).

(5.59) 1[Frog, where are you?] # 2[once upon a time…] (C1-G4-3)

In the latter case they were excluded as being an exophoric and thus non-cohesive reference to the picture book. Subsequent mentions, however, were always coded according to their individual elements, even if an exophoric use referring to the title could not be excluded for any of the occurrences.

5.2.3 Statistical methods

Several standard descriptive measures were used to interpret the data. For all coherence and cohesion measures, two measures of central tendency are given, namely the mean and the median. The mean marks “the centre of a distribution of scores,” i.e. it gives a “hypothetical estimate of the ‘typical’ score” (Field 2009: 789); however, the mean is not always representative, since it is easily influenced by extreme values (outliers). Therefore, the median is also given, which indicates “the middle score of a set of ordered observations” (Field 2009: 789), i.e. 50% of all scores lie above and 50% below the median. Another measure of central tendency, which was of interest for some of the coherence data, is the mode. It indicates the most frequent score in a set of data (e.g. Field 2009: 790); more than one mode can exist, however, in any data set.

Multiple descriptive statistics were used to shed light on the variability of the data set, i.e. as a measure for participants’ interindividual differences: standard deviation, variability ratio, minimum and maximum score, and the scoring range. The standard deviation constitutes “an estimate of the average variability (spread) of a set of data” (Field 2009: 794). However, its meaningfulness depends to a large degree on the respective group mean.
A group with a high standard deviation and a low mean is more heterogeneous, for example, than another group with an equally high standard deviation but a higher mean, and therefore the variability ratio was introduced as a relative measure for interindividual differences; it is defined as a data set’s standard deviation divided by the corresponding mean, e.g. the standard deviation of the total number of components in first grade divided by the respective first grade mean. Maximum score and minimum score, as well as the corresponding range, which is equivalent to the maximum minus the minimum score (e.g. Field 2009: 792), are also given as measures for variability.

In addition to the descriptive statistics, several inferential statistical tests were used for the interpretation of the results. These tests were performed with PASW Statistics 17.0. Three-way factorial analyses of variance (ANOVAs) were conducted in order to establish main and interaction effects of the independent variables grade, sex, and L2 experience on the dependent variables, namely participants’ total number of narrative components and their stories’ cohesive densities. ANOVAs serve to determine significant differences between several means on the basis of within-group and between-group variances (Mackey & Gass 2005: 274). 43 Results are reported in the form of the F-value, which represents the ratio of between-groups over within-group variance (Hatch & Lazaraton 1991: 315) and which must be larger than one to indicate at least some difference among groups (ibid.: 321). A basic 2x2x2 design over all participants was carried out with grade (first, fourth), sex (female, male) and L2 preschool experience (mono, bili, cf. ch. 5.1.2) as between-subjects factors. Further analyses of variance with reduced designs, e.g. excluding statistical outliers, were carried out where necessary; these are reported in the corresponding chapters.
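The descriptive measures introduced at the beginning of this section can be sketched with Python's standard library. This is an illustration only, not the procedure used in the study (which relied on PASW Statistics): the function name and the score lists are invented, and the use of the sample standard deviation (n − 1) is an assumption.

```python
import statistics


def describe(scores):
    """Central tendency and variability measures for one group of scores."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # assumption: sample standard deviation (n - 1)
    return {
        "mean": mean,
        "median": statistics.median(scores),
        "modes": statistics.multimode(scores),  # more than one mode is possible
        "sd": sd,
        "variability_ratio": sd / mean,  # standard deviation divided by the mean
        "min": min(scores),
        "max": max(scores),
        "range": max(scores) - min(scores),
    }


# Two invented groups with the same standard deviation but different means.
low_mean_group = describe([1, 3, 5])
high_mean_group = describe([11, 13, 15])
print(low_mean_group["variability_ratio"], high_mean_group["variability_ratio"])
```

Both invented groups have a standard deviation of 2, but the variability ratio is larger for the group with the lower mean, which is the relative heterogeneity the measure is meant to capture.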
If the ANOVA results showed any significant interaction effects, simple effects analyses (Field 2009: 442f.) or, alternatively, less conservative independent-samples t-tests were conducted in order to locate the source of these interaction effects. As a measure for effect size, i.e. as an indicator for the degree of association, partial eta-squared (reported as partial η²) is presented if the F-values were statistically significant. Any measure for effect size or degree of association indicates “how much of the variability in the data can be accounted for by the independent variable” (Hatch & Lazaraton 1991: 266); thus, partial eta-squared can be interpreted “as the proportion of the total variability of the dependent variable which is explained by the variation in the independent variable” (Porte 2002: 235). A partial eta-squared value of 0.45, for example, means that 45% of the variability in the sample can be explained by the independent variable under investigation (analogous to eta-squared, cf. Hatch & Lazaraton 1991: 266); at the same time it indicates that another 55% of the variability is left unexplained. Where simple effects analyses were conducted to locate the source of interaction effects, r-squared (reported as r²) will be given for these results as the measure for effect size; the meaning and function of r-squared is analogous to partial eta-squared.

43 ANOVAs should not be conducted, however, when the data is not normally distributed (e.g. Field 2009: 359f.). This assumption was tested with a one-sample Kolmogorov-Smirnov test. If the data was not found to be normally distributed, a non-parametric alternative was chosen. Neither should ANOVAs be conducted if the underlying assumption of the homogeneity of variances is violated (e.g. Field 2009: 359f.); this was tested with a Levene test beforehand. If the inhomogeneity was barely significant, however, the ANOVA was still conducted.
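As a worked illustration of the effect-size measure, the sketch below uses the commonly given textbook definition of partial eta-squared, i.e. the effect sum of squares divided by the sum of effect and error sums of squares, together with the equivalent expression in terms of F and its degrees of freedom. The function names and all numbers are invented for illustration and are not results of the study.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Partial eta-squared from the effect and error sums of squares."""
    return ss_effect / (ss_effect + ss_error)


def partial_eta_squared_from_f(f_value, df_effect, df_error):
    """Equivalent form based on the F-value and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Invented numbers: SS_effect = 45 (df = 1), SS_error = 55 (df = 11),
# hence F = (45 / 1) / (55 / 11) = 9.
print(partial_eta_squared(45, 55))
print(partial_eta_squared_from_f(9, 1, 11))
```

Both calls yield the value 0.45 used as an example in the text, so the second form can be used to recover the effect size from a reported F-value when the sums of squares are not available.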
Pearson chi-square tests are used to test the statistical significance of observed differences in frequency variables (e.g. Hatch & Lazaraton 1991: 394, Woods et al. 1986: 139), namely variables measuring how often a variable is present in the data (nominal variables) as opposed to score data (ordinal and interval variables), which show how much of a variable is present (Hatch & Lazaraton 1991: 62).⁴⁴ Chi-square was therefore used to test differences in the realization of the individual narrative components regarding grade, sex, and experience group. Chi-square tests indicate whether two variables are statistically independent by comparing observed frequencies with the ones to be expected in case of independence (e.g. Porte 2002: 136, Hatch & Lazaraton 1991: 393ff., Woods et al. 1986: 139ff.). The chi-square value (reported as χ²) thus indicates the extent of the difference between observed and expected frequencies. The higher the χ²-value, the more likely it is that the variables tested are not independent of each other. In practice, this would mean, for example, that the observed frequencies of a narrative component are dependent on grade, i.e. that a component’s realization differs significantly between grade one and four.

Additionally, phi (reported as Φ) is presented as an indicator of the strength of association for chi-square tests, since the value of chi-square only indicates whether there is a relationship between variables, while phi serves to determine how strong this relationship is in a 2x2 design (e.g. Field 2009: 695, Hatch & Lazaraton 1991: 415f.). Phi is defined as the square root of χ² / N, i.e. Φ = √(χ² / N), in which N is the total number of observations (e.g. Field 2009: 791). Phi follows a similar logic as (partial) eta-squared described above, but it takes values between zero and one, a value of zero indicating total statistical independence and a value of one indicating total dependence (e.g. Kähler 2010: 109).
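The comparison of observed with expected frequencies and the definition of phi can be sketched for a 2x2 table as follows. This is a minimal illustration under stated assumptions: the function name and the counts are hypothetical, no continuity correction is applied, and all row and column totals are assumed to be greater than zero. It does not reproduce the PASW procedure used in the study.

```python
import math


def chi_square_2x2(table):
    """Return (chi2, phi, expected) for a 2x2 table of observed frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected frequency under independence: row total * column total / N
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    phi = math.sqrt(chi2 / n)  # strength of association, as defined in the text
    return chi2, phi, expected


# Hypothetical counts: rows = grade 1 / grade 4, columns = realized / not realized
observed = [[10, 20], [20, 10]]
chi2, phi, expected = chi_square_2x2(observed)
print(f"chi-square = {chi2:.2f}, phi = {phi:.2f}")
```

In this made-up table every expected frequency is 15, so the observed counts deviate by 5 in each cell.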
In case of expected cell frequencies below five, as well as in case of a strongly unequal sampling distribution, chi-square needs to be complemented with a Fisher’s exact test in order to reliably determine significance (e.g. Field 2009: 690, Hatch & Lazaraton 1991: 409). Fisher’s exact test follows the same logic as chi-square, but it also accommodates small and unequally distributed sample sizes. The significance level of differences tested with Fisher’s exact test will be reported as Fisher’s p in the results section. Where expected cell frequencies dropped below one, however, the chi-square results were discarded entirely, because in such cases the test cannot be considered reliable anymore (cf. SPSS 15.0 Manual), even if Fisher’s exact test is applied additionally to determine significance. Thus, no statistical test was applied in such cases.

⁴⁴ Nominal variables are categorical variables, i.e. coded dichotomously using, for example, one and zero to represent “realized” and “not realized”.

A Mann-Whitney U-test (reported as U) was used to compare score data obtained from a newly created index of global narrative structure (cf. ch. 6.3).⁴⁵ The Mann-Whitney test indicates whether there is a significant difference in ranks between two groups (e.g. Hatch & Lazaraton 1991: 270ff.). Since the Mann-Whitney test is a non-parametric test based on the median, normal distribution of the data is not required, and this distinguishes the test from its more commonly used parametric counterpart, the two-sample t-test (ibid.). Eta-squared (represented as η²), which is defined as z² / (N − 1) (Hatch & Lazaraton 1991: 279), was calculated as a measure of strength of association for this statistical test.

Cronbach’s alpha, a coefficient of consistency, was used to assess the test reliability of the newly created index (cf. ch. 6.3).
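The handling of low expected cell frequencies described above amounts to a simple decision rule, which can be sketched in Python. The function names, return labels and example counts are hypothetical. The sketch covers only the two thresholds named in the text (expected frequencies below five and below one); the additional criterion of a strongly unequal sampling distribution is not modelled, and all row and column totals are assumed to be greater than zero.

```python
def expected_frequencies(table):
    """Expected cell frequencies under independence for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def choose_test(table):
    """Select a procedure based on the smallest expected cell frequency."""
    smallest = min(min(row) for row in expected_frequencies(table))
    if smallest < 1:
        return "no test"  # results discarded entirely
    if smallest < 5:
        return "chi-square with Fisher's exact test"
    return "chi-square"


# Hypothetical 2x2 tables (rows = groups, columns = realized / not realized)
print(choose_test([[10, 20], [20, 10]]))  # all expected frequencies are 15
print(choose_test([[1, 9], [4, 16]]))     # smallest expected frequency is about 1.7
print(choose_test([[0, 10], [1, 30]]))    # smallest expected frequency is about 0.2
```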
After the construction of a scale or an index according to validity considerations, Cronbach’s alpha, which is based on inter-item correlation, can be used to determine the reliability of test items in terms of internal consistency (e.g. Mackey & Gass 2005: 128ff., Schmitt 1996, Cortina 1993). Thus, the alpha coefficient indicates on a range from zero to one to what degree a combination of test items is able to measure a single construct (cf. Eckstein 2008: 293ff., Rietveld & van Hout 1993: 203).⁴⁶ Individual test items can be included or eliminated based on considerations about their contribution to the overall alpha score. Even though no standard critical alpha coefficient exists, an alpha value around 0.7 or higher is generally considered acceptable, especially if a small number of items is tested (Eckstein 2008: 293ff., Schmitt 1996, Cortina 1993).

Pearson product-moment correlation coefficients were calculated to investigate the relationship between narrative coherence and cohesion (cf. ch. 8). Correlation indicates whether two variables are linearly related, and the correlation coefficient additionally indicates how strong this relationship is (cf. Hatch & Lazaraton 1991: 427). For Pearson correlations, this linear relationship is indicated by the correlation coefficient r, which lies between minus one and one. A coefficient of zero indicates that there is no relationship, a coefficient of one a perfect correlation. A coefficient between minus one and zero indicates a negative correlation, which means that the two variables behave in opposite ways, i.e. one variable has a low value if the other one has a high value and vice versa. A coefficient between zero and one consequently indicates a positive correlation in the sense that both variables behave in the same way, having either both high or both low values. The closer the correlation coefficient is to plus/minus one, the stronger the relationship investigated (Hatch & Lazaraton 1991: 433). Additionally, the determination coefficient r² was calculated as a measure for the strength of the relationship; it indicates how much of the variance of one variable is explained by the other, i.e. the degree to which the two variables provide the same information (e.g. Field 2009: 179, Hatch & Lazaraton 1991: 441).

Pearson correlation requires, however, that the data is normally distributed (e.g. Field 2009: 177), which was not the case for the narrative index scores in fourth grade (cf. ch. 6.3). Here the Spearman rank-order correlation coefficient (rₛ) was calculated instead (cf. Field 2009: 179ff.); it is interpreted analogously to r.

Several significant relationships between a coherence and a cohesion measure were found (cf. ch. 1). In order to exclude a possible mediating effect of a third variable, namely text length, these pairs of variables were additionally submitted to a partial correlation analysis. Partial correlations, which have the same underlying assumptions as Pearson correlations (e.g. Field 2009: 186ff.), control the relationship between two variables for the influence of one or more additional ones and thus yield “a measure of the unique relationship between [the two variables]” (Field 2009: 188).

⁴⁵ As opposed to chi-square, which is used for nominal or frequency data, e.g. yes vs. no answers (cf. above).

⁴⁶ If test items are positively correlated, otherwise no lower limit exists (Eckstein 2008: 293).

5.3 Addressing the dangers of the comparative fallacy

Any study on L2 acquisition must take care to avoid what Bley-Vroman (1983) called the comparative fallacy, which addresses the dangers of comparing learner languages with the respective target languages (i.e. learner language systems with L1 language systems), especially with respect to defining analytical categories through target-language standards.
Bley-Vroman states that

    [f]or example, any study which classifies interlanguage (IL) data according to a target language (TL) scheme or depends on the notion of obligatory context or binary choice will likely fail to illuminate the structure of the IL. Of course any study which uses a target language scheme to preselect data for investigation (such as a study which begins with a corpus of errors) is even more liable to obscure the phenomenon under investigation. (1983: 15)

At the same time Bley-Vroman’s justified methodological criticism cannot obscure the fact that “[g]iven that SLA is the study of how people acquire a second language, any SLA study implicitly has a built-in notion of interlanguage with a target language lurking in the background” (Park 2004: 3). Consequently, other researchers recognize a target-language perspective as one of several possible and legitimate perspectives on studying learner language (e.g. Cook 1999: 190, Ellis 1994).⁴⁷ Only a combination of all of these perspectives can, however, come even close to a full description of a system as complex and variable as that of learner language(s).

⁴⁷ The reader is referred to Kersten (2009: 60ff.) for a recent discussion of different possible approaches to studying learner language (interlanguage, L1, and L2 perspective).

Since any research perspective carries its own methodological problems (e.g. Kersten 2009: 60ff., Foster-Cohen 2001, Lakshmanan & Selinker 2001), however, I will opt for a “pragmatic” stance with respect to my analysis. That is, I will not contribute further to the debate on which perspectives are (not) appropriate and under which circumstances. Instead, I will describe very briefly how my analysis aimed to at least reduce the danger of falling prey to the comparative fallacy, and the reader is asked to form his or her own opinion as to how far this was successful.
Just as for the theoretical basis put forward for the analysis of coherence and cohesion, my method of analysis could best be described in terms of a combination of top-down and bottom-up processes: Both the coherence and the cohesion analysis make use of certain general categories, namely the narrative components described in ch. 5.2.1 and the categories of cohesion going back to Halliday and Hasan (1976; cf. ch. 5.2.2.2). All of these categories have proved useful in previous studies on L1 and/or L2 learners (cf. ch. 4). The first step in setting up the analysis grid for both coherence and cohesion was, however, to systematically evaluate how the study’s participants realized these general categories linguistically, i.e. which options they used in their L2 productions; for both coherence and cohesion, options other than target-like ones were also coded.

For the analysis of coherence the overall criterion applied was communicative success, i.e. whether the interlocutor (who did pretend, however, to be a native speaker of English) would be able to comprehend the speaker and fill each narrative component of the story schema with content. At the same time only a minimum of target-likeness was required, especially with respect to syntax, morphology and phonology, i.e. the interlocutor’s purported “comprehension threshold” was set as low as possible. This process was aided by the researcher, i.e. the author, being a very advanced, but nevertheless non-native speaker of English herself (cf. Lakshmanan & Selinker 2001: 415).

For the analysis of cohesion a similar path was pursued. That is, the general categories of cohesion were applied to the data, but their linguistic realization was not measured exclusively by L1 standards: One child, for example, produced “and on the day”, which was coded as a complex connective, even if an (adult) native speaker would most probably not have produced such an expression in the same linguistic context.
Similarly, some children systematically pronounced the definite article “the” as /se/, but these non-target-like forms were of course still coded as definite articles. At the same time all instances of realizations of the cohesion categories were classified, regardless of whether these would, for example, constitute an over- or underuse when compared to L1 speakers.

Since any linguistic study is to a certain degree an artifact of its methodology, the most important precaution (not only against the comparative fallacy) was to give a detailed description of my method of analysis in the previous sections of the present chapter. This was done to ensure, first of all, that a replication of the study is possible; secondly, this detailed description also serves to allow readers to form their own opinion about any possible bias in results attributable to the method of analysis, such as the degree to which my study is subject to the comparative fallacy.

6 The development of coherence: Results

6.1 Total number of narrative components

This section aims to answer the questions (a) how coherent participants’ stories are as measured by the number of narrative components realized in first and fourth grade and (b) whether there are any quantitative differences attributable to grade, sex or experience. To this end I will describe, first of all, the observed results for the overall number of narrative components in first and fourth grade as well as the results obtained for the two background variables included in the study, i.e. participants’ sex and L2 preschool experience. After that I will present the corresponding statistical results.

As stated in ch. 5.1.2, possible differences between the longitudinal and cross-sectional data sets were explored for all overall measures of coherence (and cohesion) as well as for randomly selected individual variables. Since no significant differences were found, no further distinction will be made between cohorts in the discussion of the results.
6.1.1 Total number of components: Observed results

The narrative components actually realized by participants corresponded to a very large extent to the 14 narrative components identified in ch. 5.2.1, namely:

• Setting
• Initiating event
• Simple reaction
• Goals 1 and 2
• Attempts 1 to 7
• Consequence
• Ending.

The only exceptions to the pre-identified components were produced by two children, C8-G1-13 and C8-G1-20, who realized two attempts in addition to those previously identified. These results speak for the validity of the approach.

6.1.1.1 Overall results

The result obtained for the total number of narrative components in first and fourth grade is shown in Fig. 6.1;¹ the corresponding descriptive statistics are given in Tab. 6.1.

¹ Only the components previously identified (cf. ch. 5.2.1) were considered.

Tab. 6.1 Descriptive statistics total number of narrative components by grade

            N     Mean   Standard    Median   Mode   Minimum   Maximum
                         deviation
Grade 1     31    4.7    3.0         5        1      1         10
Grade 4     28    9.9    1.7         10       10     6         12
Total       59    7.2    3.6         8        10     1         12

All first graders are able to realize at least one component and on average they produce 4.7 narrative components. None of the first graders, however, produces more than 10 of the 14 components tested. At the same time participants’ interindividual differences are very large in first grade, which is evident from a standard deviation of three components, i.e. a variability of 64% in relation to the mean (variability ratio), and a range of nine from minimum (1 component) to maximum score (10). This strong heterogeneity of results is confirmed by a look at Fig. 6.2, which presents the distribution of the number of components realized in first and fourth grade; the corresponding descriptive statistics are given in Tab. 6.2.

[Fig. 6.1 Mean number of narrative components by grade. Grade 1: 4.7; Grade 4: 9.9; increase: 5.2.]

Tab. 6.2 Distribution of the total number of component scores by grade

Total number of     Grade 1                  Grade 4
narrative           N     % within grade     N     % within grade
components
1                   6     19                 0     0
2                   4     13                 0     0
3                   5     16                 0     0
4                   0     0                  0     0
5                   2     6                  0     0
6                   5     16                 1     4
7                   3     10                 2     7
8                   2     6                  2     7
9                   1     3                  4     14
10                  3     10                 9     32
11                  0     0                  4     14
12                  0     0                  6     21
Total               31    100                28    100

The distribution shows that in terms of frequency first graders most often realize one component (19%, i.e. 6 out of 31 first graders). Slightly less often participants produce three (16%, 5 of 31) and six components (16%, 5 of 31), even less often two components (13%, 4 of 31). All other scores are each realized by 10% or fewer of the first graders.² At the same time the values below the mean, i.e. less than 4.7, are more evenly distributed: The realization of one, two or three components already accounts for 48% of the results, while values higher than the mean, i.e. from 4.7 to 10 components, are realized mainly by a strong clustering of scores at 6 components (16%). The highest score (10 components) is achieved by only 10% of the first graders (3 of 31, namely C1-G1-16, C8-G1-13 and -21). One could also say that the distribution shows a division of first graders into minimal (1 to 3 components) and relatively proficient storytellers (6 or more components).

[Fig. 6.2 Distribution of the total number of component scores by grade. x-axis: total number of components realized; y-axis: % of grade; series: first graders, fourth graders.]

Fourth grade participants realize an average of 9.9 and minimally 6 components. That is, the mean more than doubles from first to fourth grade, from 4.7 to 9.9 components or from 34% to 71% of the 14 components tested. At the same time the minimum score increases very strongly (from 1 to 6 components). However, even in fourth grade the highest score is 12, so that none of the participants in either grade produced all 14 narrative components (Tab. 6.1 and 6.2, Fig. 6.2).

Additionally, participants’ performance becomes far more homogeneous from first to fourth grade (Fig. 6.1, Tab. 6.1); thus, their interindividual differences decrease from a variability ratio of 64% of the mean in first to only 17% in fourth grade. Similarly, the range of scores from minimum to maximum decreases from 9 to 6. This development is reflected in Fig. 6.2, which shows low frequencies below 10 components and a clustering of observations above that.

Taking a closer look at the distribution in first and fourth grade (Fig. 6.2, Tab. 6.2), an overlap of first and fourth graders’ scores can be observed, which ranges from 6 to 10 components (fourth grade lowest to first grade highest score). This overlap includes 45% of all first graders (14 of 31) and 64% of fourth graders (18 of 28). Scores below 6, on the other hand, are exclusive to first graders and scores above 10 to fourth graders. Judging from the frequency of realizations, however, values between 1 and 6 components seem to be most characteristic for first grade; together they account for 71% of all observations in first grade. In fourth grade, on the other hand, values of nine components or more seem to be most typical, since results of fewer than 9 components are produced much less frequently and they are more spread out. Together, the frequencies obtained for 9 components or more account for 82% of all observations, while the frequencies below 9 components account for only 18% of the fourth graders (5 of 28).

To summarize, a comparison of the overall number of components by grade showed that all participants were able to produce at least one component. At the same time a dramatic increase in the number of components, as measured by the group difference, was found from first to fourth grade, paired with a strong decrease in interindividual variation.
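The overlap figures reported above can be recomputed from the frequencies in Tab. 6.2. The following Python sketch is only an illustration; the function and variable names are hypothetical, and the overlap is assumed to run from the higher of the two group minima to the lower of the two group maxima, as described in the text.

```python
# Frequencies copied from Tab. 6.2 (score -> number of children).
GRADE_1 = {1: 6, 2: 4, 3: 5, 4: 0, 5: 2, 6: 5, 7: 3, 8: 2, 9: 1, 10: 3, 11: 0, 12: 0}
GRADE_4 = {1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 1, 7: 2, 8: 2, 9: 4, 10: 9, 11: 4, 12: 6}


def observed_scores(distribution):
    """Scores that at least one child actually achieved."""
    return [score for score, count in distribution.items() if count > 0]


def overlap_range(first, second):
    """Return (low, high) of the scores shared by both groups' ranges, or None."""
    low = max(min(observed_scores(first)), min(observed_scores(second)))
    high = min(max(observed_scores(first)), max(observed_scores(second)))
    return (low, high) if low <= high else None


def share_in_range(distribution, low, high):
    """Return (children inside the range, group size, proportion)."""
    total = sum(distribution.values())
    inside = sum(count for score, count in distribution.items() if low <= score <= high)
    return inside, total, inside / total


low, high = overlap_range(GRADE_1, GRADE_4)
for label, dist in (("Grade 1", GRADE_1), ("Grade 4", GRADE_4)):
    inside, total, share = share_in_range(dist, low, high)
    print(f"{label}: {inside} of {total} ({share:.0%}) score {low} to {high}")
```

The same helper can be applied to any other range mentioned in the text, for instance scores of 9 or more in fourth grade.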
A score of 6 components or fewer was found to be most typical for first graders, while a score of 9 components or more was identified as most typical for fourth graders. Almost half of all first graders, however, were found to produce a number of components which could also have been produced by a fourth grader.

² However, in absolute numbers the differences are relatively small. The difference between 6% and 10%, for example, corresponds to one child.

6.1.1.2 Total number of components by sex

Fig. 6.3 and Tab. 6.3 show the results obtained in first and fourth grade as a function of participants’ sex.

Tab. 6.3 Descriptive statistics total number of components by sex and grade

                     N     Mean   Standard    Median   Mode   Minimum   Maximum
                                  deviation
Grade 1   Male       10    5.2    2.9         6        6      1         10
          Female     21    4.4    3.1         3        1      1         10
          Total      31    4.7    3.0         5        1      1         10
Grade 4   Male       7     10.4   1.6         10       12     8         12
          Female     21    9.8    1.7         10       10     6         12
          Total      28    9.9    1.7         10       10     6         12

Only comparatively small differences exist between male and female first graders. Female first graders produce an average of 4.4 and male first graders a slightly higher mean of 5.2 components. Still, male participants seem to perform slightly better. This becomes clear by a look at the median: The male median is 6, indicating that half of the male group realize 6 or more components, while for female participants the median lies at 3 components (Tab. 6.3, cf. also Fig. 6.3). Similarly, the mode, i.e. the number of components realized most frequently, is 6 components for male and 1 for female first graders; this is reflected in Fig. 6.4, which shows the distribution of scores for male and female first graders (cf. Tab. 6.4 for the corresponding values).

[Fig. 6.3 Mean number of narrative components by sex and grade. Male: Grade 1 = 5.2, Grade 4 = 10.4, increase = 5.2; female: Grade 1 = 4.4, Grade 4 = 9.8, increase = 5.4.]

Tab. 6.4 Distribution of the total number of component scores by sex and grade (% = % within sex)

Total number of     Grade 1                      Grade 4
narrative           Male         Female          Male         Female
components          N     %      N     %         N     %      N     %
1                   1     10     5     24        0     0      0     0
2                   1     10     3     14        0     0      0     0
3                   2     20     3     14        0     0      0     0
4                   0     0      0     0         0     0      0     0
5                   0     0      2     10        0     0      0     0
6                   3     30     2     10        0     0      1     5
7                   1     10     2     10        0     0      2     10
8                   1     10     1     5         1     14     1     5
9                   0     0      1     5         1     14     3     14
10                  1     10     2     10        2     29     7     33
11                  0     0      0     0         0     0      4     19
12                  0     0      0     0         3     43     3     14
Total               10    100    21    100       7     100    21    100

[Fig. 6.4 Distribution of the total number of components by sex in first grade. x-axis: total number of components realized; y-axis: % of sex in first grade; series: male first graders, female first graders.]

The variability ratio also shows somewhat of an advantage for male first graders, since the ratio of standard deviation to mean is 70% for female first graders and 56% for males, i.e. a little above and below, respectively, the average variability ratio of 64% in first grade. However, the range from minimum to maximum score is the same for both groups.

Both sexes follow the general trend of a steep increase in the number of components and a decrease in interindividual variation. Consequently, their fourth grade results are again quite similar, even more so than in first grade, with males producing a mean number of 10.4 and females a mean number of 9.8 components. At the same time the female group’s variability ratio decreases to 17% and the male group’s to 15%. In fourth grade, while no difference in median can be observed, a difference in modes can still be seen with 12 components for male fourth graders (43%, 3 of 7) but 10 components for female fourth graders (33%, 7 of 21) (Tab. 6.3). This is reflected in Fig. 6.5, which shows the distribution of male and female fourth grade scores; the corresponding values are given in Tab. 6.4.
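The first grade descriptive statistics by sex can be recomputed from the frequency distribution in Tab. 6.4. The following Python sketch is an illustration only; it assumes that the reported standard deviations are sample standard deviations (n − 1 in the denominator), and the helper name `expand` is hypothetical.

```python
import statistics

# First grade frequencies copied from Tab. 6.4 (score -> number of children).
MALE_G1 = {1: 1, 2: 1, 3: 2, 6: 3, 7: 1, 8: 1, 10: 1}
FEMALE_G1 = {1: 5, 2: 3, 3: 3, 5: 2, 6: 2, 7: 2, 8: 1, 9: 1, 10: 2}


def expand(distribution):
    """Turn a score -> count mapping into a sorted list of individual scores."""
    return [score for score, count in sorted(distribution.items()) for _ in range(count)]


for label, dist in (("Male", MALE_G1), ("Female", FEMALE_G1)):
    scores = expand(dist)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (assumption)
    print(
        f"{label}: N={len(scores)}, mean={mean:.1f}, SD={sd:.1f}, "
        f"median={statistics.median(scores)}, mode={statistics.mode(scores)}, "
        f"variability ratio={sd / mean:.0%}"
    )
```

Under this assumption the recomputed means, standard deviations, medians and modes agree with the rounded values in Tab. 6.3. Variability ratios recomputed from unrounded values can differ by a percentage point from ratios of the rounded table values.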
The distribution also shows that the lowest fourth grade scores, namely 6 (5%, C5-G4-20) and 7 components (10%, 2 of 21), are produced exclusively by female participants. At the same time 14% of females (3 of 21) score below an overlap in scores (ranging from 8 to 10 components), which includes, on the other hand, all male fourth graders (cf. also each group’s range); thus, there seems to be a slight male advantage not only in first but also in fourth grade. However, this finding would need to be substantiated in a larger data set due to the likelihood of a group size effect.³

[Fig. 6.5 Distribution of total number of narrative component scores by sex in fourth grade. x-axis: total number of components realized; y-axis: % of sex in fourth grade; series: male fourth graders, female fourth graders.]

In regard to male and female results in first as compared to fourth grade, both groups show some overlap. Male participants’ first and fourth grade distribution of the total number of component scores is depicted in Fig. 6.6, the corresponding female distribution in Fig. 6.7.

Male first and fourth grade results overlap in the range of 8 to 10 components (male fourth grade minimum to first grade maximum). Only 20% of male first graders, however, realize between 8 and 10 components, which corresponds to one male first grader scoring 8 and another one 10 components. These latter two male first graders seem to be precocious in storytelling abilities in comparison to their peers, since all remaining male first graders produce fewer than 8 components. Fifty-seven percent of male fourth graders (4 of 7), on the other hand, realize between 8 and 10 components and the other 43% more than 10.
This means that scores below 8 components are exclusive to first and scores above 10 exclusive to fourth graders, which in turn indicates that, even though some male first graders do already perform like fourth graders, the great majority of them do not. Thus, the distribution of scores clearly confirms a quantitative difference between first and fourth grade results.

³ In absolute numbers, for example, three male (43%) and three female fourth graders (14%) realize twelve components.

[Fig. 6.6 Distribution of the total number of components by grade for male participants. x-axis: total number of components realized; y-axis: % of sex; series: male first graders, male fourth graders.]

With respect to the female participants, the difference between first and fourth grade results is a little less pronounced (Fig. 6.7); female results overlap in a somewhat larger range, namely between 6 and 10 components (female fourth grade minimum to first grade maximum). Thirty-eight percent of female first graders (8 of 21) produce a total number of components within this range. However, none of these scores is realized by more than just one or two of them. The majority of female fourth graders (67%, 14 of 21), on the other hand, produce results within the overlapping range; 33% of them (7 of 21) score higher. Even so, the three lowest fourth grade scores, namely 6, 7 and 8 components, are produced by only 19% of the female fourth graders (4 of 21) and each one of these results by only 5% to 10% of them, i.e. one or two participant(s), respectively. The distribution thus seems to indicate a tendency of female fourth graders to realize 9 or even 10 or more components; together, these results account for a full 81% of the group. The only scores achieved by more than one or two female first graders, on the other hand, are 1, 2, and 3 components; together, their frequencies already account for 52% of the first graders.
Thus, a result of 6 or even 3 and fewer components seems to be more typical for female first graders, while 9 or even 10 and more components are more typical for female fourth graders.

[Fig. 6.7 Distribution of the total number of components by grade for female participants. x-axis: total number of narrative components; y-axis: % of sex; series: female first graders, female fourth graders.]

To sum up, male and female results closely resembled each other in both grades as well as in their development from first to fourth grade. Both groups’ mean strongly increased from first to fourth grade and their results became more homogeneous, i.e. they followed the general developmental trends described in the previous section. Male and female distributions both indicated an overlap between first and fourth grade scores, even though the difference between the two grades was a little more pronounced for the male participants. The respective distributions also indicated that scores of 6 components or fewer are typical for female and 8 or fewer for male first graders. In fourth grade, scores of 9 or more components were found to be typical for female and scores of 10 or more for male participants.

6.1.1.3 Total number of components by experience group

The total number of components as a function of L2 preschool experience and grade is shown in Fig. 6.8; the corresponding descriptive statistics are given in Tab. 6.5.

† ‘Mono’ denotes children with exclusively monolingual preschool experience, ‘bili’ those with bilingual preschool experience.

In first grade the bilingual experience group clearly outperforms the monolingual one: On average, children with bilingual preschool experience realize more than twice as many components (6.1 vs. 2.5). A similar advantage is evident in the distribution of scores (cf. also the respective modes), which is depicted in Fig. 6.9; the corresponding values are given in Tab. 6.6.
Almost half of all first graders with monolingual preschool experience (42%, 5 of 12) realize only 1 component, while those with bilingual preschool experience most often realize 3 (21%, 4 of 19) or even 6 components (21%).

[Fig. 6.8 Mean number of components by experience group and grade. Mono: Grade 1 = 2.5, Grade 4 = 9.8, increase = 7.3; bili: Grade 1 = 6.1, Grade 4 = 10.0, increase = 3.9.]

Tab. 6.5 Descriptive statistics total number of narrative components by experience group and grade

                     N     Mean   Standard    Median   Mode†   Minimum   Maximum
                                  deviation
Grade 1   Mono       12    2.5    1.8         2        1       1         6
          Bili       19    6.1    2.8         6        3       1         10
          Total      31    4.7    3.0         5        1       1         10
Grade 4   Mono       12    9.8    1.6         10       10      7         12
          Bili       16    10     1.8         10       12      6         12
          Total      28    9.9    1.7         10       10      6         12

† More than one mode exists. ‘Mono’ denotes children with exclusively monolingual preschool experience, ‘bili’ those with bilingual preschool experience.

[Fig. 6.9 Distribution of the total number of components by experience group in first grade. x-axis: total number of components realized; y-axis: % of experience group; series: mono first graders, bili first graders.]

Tab. 6.6 Distribution of the total number of components by experience group and grade (% = % within experience group)

Total number of     Grade 1                      Grade 4
narrative           Mono†        Bili            Mono         Bili
components          N     %      N     %         N     %      N     %
1                   5     42     1     5         0     0      0     0
2                   3     25     1     5         0     0      0     0
3                   1     8      4     21        0     0      0     0
4                   0     0      0     0         0     0      0     0
5                   2     17     0     0         0     0      0     0
6                   1     8      4     21        0     0      1     6
7                   0     0      3     16        2     17     0     0
8                   0     0      2     11        0     0      2     13
9                   0     0      1     5         1     8      3     19
10                  0     0      3     16        6     50     3     19
11                  0     0      0     0         1     8      3     19
12                  0     0      0     0         2     17     4     25
Total               12    100    19    100       12    100    16    100

† ‘Mono’ denotes children with exclusively monolingual preschool experience, ‘bili’ those with bilingual preschool experience.
The frequency distribution also shows an almost steady decline in the percentage of mono first graders realizing more than one component: 75% of the mono first graders (9 out of 12) realize 1, 2 or 3 components. Only 25% (3 children), however, produce more components, namely five or six. The highest mono score of 6 components is reached by only 8% of them, i.e. one child (C1-G1-10). At the same time, none of the mono first graders realizes 4 components, and this seems to split them into a low and a high performance subgroup; thus, a score of 3 components or fewer seems to be more typical for first graders without prior L2 preschool experience.

The bilingual experience group has a larger range of scores, from a minimum of 1 to a maximum of 10 components, and its scores are more evenly distributed across that range. This is especially evident in the absolute frequencies (cf. Tab. 6.6). At the same time none of the bili first graders realizes 4 or 5 components, and this also seems to split the bili group up into a low and a high performance subgroup: 32% of the bili children (6 of 19) realize 1, 2 or 3 components, while 68% (13 of 19) realize 6 or more; thus, a score of 6 or more components may be even more typical for those first graders with L2 preschool experience and a score of 3 components or fewer for first graders from monolingual preschool groups.

The interindividual differences, as measured by the variability ratio, confirm the more even distribution in the bili group: The variability ratio is much higher for the mono (72%) than for the bili group (45%), which means that the results achieved by the mono group are more heterogeneous, even if nominally the bili group’s standard deviation is larger. However, both groups have a comparatively large interindividual variation in first grade.

Both experience groups follow the general developmental trends described earlier, i.e. the mean number of components strongly increases.
However, their degree of improvement differs. The monolingual group has a much lower mean score in first grade than does the bilingual group, but its performance also improves much more dramatically (Fig. 6.8, Tab. 6.5). More specifically, the monolingual group’s fourth grade mean is almost four times higher than its first grade mean (2.5 vs. 9.8). The bilingual group’s fourth grade mean, on the other hand, is “only” about 1.5 times higher (6.1 vs. 10). As a result of these unequal increase rates, virtually no difference between monolingual and bilingual preschool experience group can be observed in fourth grade with respect to the mean number of components. At the same time the interindividual differences decrease in both experience groups, from a variability ratio of 45% of the mean in first to 18% in fourth grade for the bili group and from 72% to 16% for the mono group (cf. also Tab. 6.5); thus, bili fourth graders have a marginally higher variation in scores. This is also reflected by their slightly larger range (6 components as opposed to 5 components in the mono group, cf. Tab. 6.5) and the frequency distribution, which is given in Fig. 6.10.⁴ The frequency distribution shows that the results of the bili fourth graders are again not only more varied but also more evenly distributed, while the mono group’s results are dominated by a cluster of scores at ten components (50%, 6 of 12). Overall, however, the differences in distribution between mono and bili fourth graders are marginal.

⁴ However, the lowest score of 6 components is realized by only one child and none of the bili children in fourth grade realizes 7 components (cf. Tab. 6.6); thus, a score of 8 or more components may be more typical for this group.

† ‘Mono’ denotes children with exclusively monolingual preschool experience, ‘bili’ those with bilingual preschool experience.
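The variability ratios quoted above can be recomputed from Tab. 6.5. The following is a minimal sketch which assumes that the variability ratio is the standard deviation expressed as a percentage of the mean; the function name `variability_ratio` and the dictionary layout are illustrative and not part of the study. Because the table values are rounded, a recomputed ratio may differ from the reported one by a percentage point (46% instead of the reported 45% for bili first graders).

```python
# Assumption: variability ratio = standard deviation as a percentage of the mean.
def variability_ratio(sd: float, mean: float) -> float:
    """Return the standard deviation as a percentage of the mean."""
    return 100.0 * sd / mean


# (standard deviation, mean) per experience group and grade, taken from Tab. 6.5
tab_6_5 = {
    ("mono", 1): (1.8, 2.5),
    ("bili", 1): (2.8, 6.1),
    ("mono", 4): (1.6, 9.8),
    ("bili", 4): (1.8, 10.0),
}

# Round to whole percentages, as in the running text
ratios = {
    key: round(variability_ratio(sd, mean))
    for key, (sd, mean) in tab_6_5.items()
}

for (group, grade), ratio in ratios.items():
    print(f"{group} grade {grade}: {ratio}%")
```

Read this way, the larger standard deviation of the bili first graders (2.8 vs. 1.8) is offset by their much higher mean, which is why their relative variation is lower than that of the mono group.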
A final comparison must be made between each experience group’s first and fourth grade distribution; to this end, the distribution of the total number of narrative components as a function of grade and L2 preschool experience is depicted in Fig. 6.11 for the mono and in Fig. 6.12 for the bili group. Mono first and fourth grade results are clearly distinguishable: Mono first graders score 6 or fewer components, while mono fourth graders score 7 or more components; there is no overlap in scores. With respect to the bili group’s first and fourth grade scores a different picture emerges: While scores of 3 components or fewer are exclusive to bili first graders and scores of 11 components or more to bili fourth graders, there is an overlap in results ranging from 6 to 10 components, i.e. from the bili group’s fourth grade minimum to the first grade maximum. Sixty-eight percent of first (13 of 19) and 56% of fourth graders (9 of 16) achieved a total number of components within this range. The one bili fourth grader (6%, C5-G4-20) producing only 6 components could be considered a statistical outlier, however, since all other bili fourth graders realized 8 or more components, so the overlap is actually smaller and ranges from 8 to 10 components with 32% of bili first (6 of 19) and 50% of fourth graders (8 of 16) achieving a total number of components within this range. Nevertheless, a substantial overlap remains, which indicates that many of the bili first graders already perform like fourth graders, while this is not the case for any of the mono first graders. Fig. 
6.10 Distribution of the total number of components by experience group in fourth grade † [Bar chart; x-axis: total number of components realized (1 to 14); y-axis: % of experience group; series: mono and bili fourth graders; values as in Tab. 6.6.]

Fig. 6.11 Distribution of the total number of components by grade for participants with exclusively monolingual preschool experience [Bar chart; x-axis: total number of components realized (1 to 14); y-axis: % of experience group; series: mono first and mono fourth graders; values as in Tab. 6.6.]

Fig. 6.12 Distribution of the total number of components by grade for children with bilingual preschool experience [Bar chart; x-axis: total number of narrative components realized (1 to 14); y-axis: % of experience group; series: bili first and bili fourth graders; values as in Tab. 6.6.]

To sum up, a comparison of the total number of components by experience group found that both groups under investigation largely followed the overall trends identified earlier in this section; that is, their mean number of components strongly increased from first to fourth grade, while the interindividual variation decreased. However, experience group was found to influence the results in several respects: bili first graders outperformed mono first graders, with a higher mean and smaller interindividual differences. The distribution of scores underlined the large difference between experience groups in first grade in that a score of 3 components or fewer was found most typical for mono and a score of 6 or more most typical for bili first graders. However, the mono participants’ mean score increased far more from first to fourth grade (this was supported by a look at the distribution, which showed a stronger overlap of scores between first and fourth grade for the bili group), and their interindividual variation decreased more markedly.
As a result, virtually no differences attributable to L2 preschool experience were found by the end of fourth grade. This lack of significant differences also held true for the distribution, which indicated that 7 or more components were most typical for mono and 8 or more most typical for bili fourth graders.

6.1.2 Total number of components: Statistical results

A factorial analysis of variance was conducted to statistically test main and interaction effects of grade, sex, and L2 experience group on the total number of components realized (cf. ch. 5.2.3). There was a very highly significant main effect of grade (F(1, 51)=60.38, p<0.001, partial η²=0.54), which confirmed the observed difference between the mean number of realizations in grade one and grade four described above. This effect of grade accounted for 54% of the overall variance. In addition, a significant main effect of experience group (F(1, 51)=6.95, p<0.05, partial η²=0.12) was found, which accounted, however, for only 12% of the overall variance. Participants’ sex was not a significant influence factor (F(1, 51)=0.09, ns). The analysis also showed a significant interaction effect of grade and experience (F(1, 51)=7.14, p<0.05, partial η²=0.12), which qualified the two main effects; that is, it indicated that the effect of experience group was not the same for the two grades. To further explore the difference between mono and bili results, independent samples t-tests were conducted between the mono and the bili group for first and fourth grade. They confirmed that the experience group effect is attributable to the strong difference found for first grade (t(29)=3.85, p<0.05, η²=0.33), where experience group accounts for 33% of the variation, whereas by grade four there is no statistically significant difference attributable to L2 preschool experience (t(26)=0.26, ns).
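The effect sizes reported above are consistent with the standard conversions from the test statistics. The sketch below assumes the textbook formulas, partial η² = (F · df_effect) / (F · df_effect + df_error) and η² = t² / (t² + df); whether the statistics software used in the study applies exactly these formulas is an assumption, and the function names are illustrative. Note that the rounded t-value yields about 0.34 rather than the reported 0.33, a discrepancy that presumably stems from rounding.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def eta_squared_from_t(t_value: float, df: int) -> float:
    """Eta squared from a t statistic and its degrees of freedom."""
    return t_value ** 2 / (t_value ** 2 + df)


# F values reported in the text, each with df = (1, 51)
reported_f = {
    "grade": 60.38,
    "experience group": 6.95,
    "grade x experience group": 7.14,
}

effect_sizes = {
    name: round(partial_eta_squared(f, 1, 51), 2)
    for name, f in reported_f.items()
}

# First grade t-test between the mono and the bili group, t(29)=3.85
eta_first_grade = eta_squared_from_t(3.85, 29)

for name, size in effect_sizes.items():
    print(f"{name}: partial eta squared = {size}")
print(f"first grade, experience group: eta squared = {eta_first_grade:.3f}")
```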
Thus, the statistical analysis fully confirmed the observed results in that it showed a very strong effect of grade on the total number of components as well as a strong effect of L2 experience on first grade results. It also confirmed that in fourth grade experience group is not a significant factor anymore. Additionally, the statistical analysis corroborated the observed finding that participants’ sex does not have any significant influence on the total number of components realized.

6.1.3 Summary: Total number of narrative components

The present section was aimed at answering the question of how coherent participants’ stories are, as measured by the number of narrative components realized in first and fourth grade, and whether there are any quantitative differences attributable to grade, sex or experience. Two hypotheses had been formulated on the basis of previous studies (cf. ch. 4.6):

1. Participants’ stories become more coherent from first to fourth grade as measured by the number of narrative components.
2. There are qualitative differences in narrative coherence between grades as measured by differences in frequency among the individual narrative components.

First of all, it was found that all first graders were able to produce at least 1 component and that none of the participants produced all 14 components; even in fourth grade the highest score was 12. Apart from this, two developmental trends were identified in the data, namely (1) a strong increase in the mean number of narrative components from first to fourth grade and (2) a strong decrease in interindividual variation. Thus, the participants’ mean number of components doubled from 4.7 in first to 9.9 components in fourth grade; the statistical analysis showed that this increase was very highly significant and explained slightly more than half of the variation in results (54%).
At the same time the heterogeneity of results, as measured by the variability ratio, decreased to roughly one fourth of its first grade value (from 64% to 14% of the mean). With respect to an influence of sex and experience group the following was found: No significant differences, either observed or statistical, were found between male and female participants in either grade. At the same time both groups’ results followed the overall pattern of an increase in mean score and a decrease in the heterogeneity of results from first to fourth grade. This was also true for the two experience groups under investigation; however, a comparison between the mono and the bili group found that L2 preschool experience had a significant impact on first grade results, where the bili group produced a significantly higher mean number of components (6.1 vs. 2.5) and performed more homogeneously. The statistical analysis showed that this significant impact of L2 preschool experience accounts for 33% of the variation in first grade. Due to the mono group’s stronger increase in the mean number of components and decrease in interindividual variation, however, the differences attributable to the experience group had disappeared by the end of fourth grade; this was confirmed by the statistical analysis.

The distribution of first and fourth grade scores also qualified the overall results, since it showed a large overlap between first and fourth grade scores. That is, 45% of first graders produced results that could equally have been produced by a fourth grader, since these lay at or above the fourth grade minimum. However, this overlap was again influenced by L2 preschool experience; that is, the mono group’s results showed no overlap at all, while there was a strong overlap between bili first and fourth grade results. Thus, the overlap of first and fourth grade scores was found to be entirely attributable to participants with L2 preschool experience.
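The overlap figures can be checked against the frequency distribution in Tab. 6.6. The following sketch is only an illustration: it encodes the non-zero cells of that table as dictionaries and counts the first graders whose score reaches the lowest fourth grade score of the respective comparison group. The helper names `share_at_or_above` and `merged` are hypothetical.

```python
from collections import Counter

# Score -> number of children, non-zero cells of Tab. 6.6
mono_g1 = {1: 5, 2: 3, 3: 1, 5: 2, 6: 1}
bili_g1 = {1: 1, 2: 1, 3: 4, 6: 4, 7: 3, 8: 2, 9: 1, 10: 3}
mono_g4 = {7: 2, 9: 1, 10: 6, 11: 1, 12: 2}
bili_g4 = {6: 1, 8: 2, 9: 3, 10: 3, 11: 3, 12: 4}


def merged(a: dict, b: dict) -> dict:
    """Add up two score distributions."""
    return dict(Counter(a) + Counter(b))


def share_at_or_above(first_grade: dict, fourth_grade: dict) -> float:
    """Percentage of first graders scoring at or above the fourth grade minimum."""
    threshold = min(fourth_grade)
    reached = sum(n for score, n in first_grade.items() if score >= threshold)
    return 100.0 * reached / sum(first_grade.values())


overlap_all = share_at_or_above(merged(mono_g1, bili_g1), merged(mono_g4, bili_g4))
overlap_mono = share_at_or_above(mono_g1, mono_g4)
overlap_bili = share_at_or_above(bili_g1, bili_g4)

print(f"all first graders:  {overlap_all:.0f}%")
print(f"mono first graders: {overlap_mono:.0f}%")
print(f"bili first graders: {overlap_bili:.0f}%")
```

The three values correspond to the 45% overall overlap, the absence of any overlap within the mono group and the 68% overlap within the bili group described in the text.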
With the help of the respective distributions it was also established that a score of 6 components or fewer is most typical for first graders and a score of 9 or more for fourth graders. Male and female participants’ distributions roughly agreed with this pattern, while L2 preschool experience was again found to have a significant impact in first grade, which qualifies the overall results: The majority of bili first graders (68%) produced 6 or more components, while only about one third of them produced 3 components or fewer. Mono first graders, on the other hand, realized 6 components or fewer and most typically 3 or fewer (75%). In fourth grade, both experience groups roughly corresponded to the overall pattern and were very similar to each other.

To sum up, the mean coherence of participants’ stories, as measured by the total number of components, increased significantly from first to fourth grade and at the same time the interindividual differences in coherence became much smaller. Narrative coherence and its increase were qualified by a strong effect of experience group in that most first graders with bilingual preschool experience produced stories in the range of fourth graders, while none of the children without prior experience did so. However, children without prior L2 experience had caught up by the end of fourth grade due to a comparatively stronger increase rate. Sex was not found to have any significant influence on coherence.

6.2 Individual narrative components

This section aims to answer the question whether there are any qualitative differences in coherence as measured by the frequency of the 14 individual components under investigation and whether these differences are attributable to grade, sex or L2 preschool experience.

6.2.1 Individual narrative components: Observed and statistical results

Fig. 6.13 shows all 14 narrative components and their relative frequencies in first and in fourth grade in the order of occurrence in the story; Tab.
6.7 gives the corresponding values. All of the narrative components are realized by at least one of the children as early as in first grade. At the same time there are large differences in frequency among the components. Only SETTING (81%) and INITIATING EVENT (81%) are produced by the majority of first graders. All 12 other components are observed in around 50% or fewer cases. GOAL 2 (“find the frog”) is realized by half of the first graders (52%), albeit only implicitly,⁵ while the remaining components are produced by only up to 45% of them. Around 20% to 45% of first graders realize (in descending order) ATTEMPT 3 (45%), CONSEQUENCE (39%), ATTEMPTs 1 (39%), 2 (29%), 4 and 6 (both 19%). Even fewer first graders produce the ENDING (16%), GOAL 1 (“recover/replace the frog”, 16%), ATTEMPT 5 (16%) and SIMPLE REACTION (13%) components. ATTEMPT 7 (“lake attempt”, 3%) is realized least often, namely by only one child (C8-G1-21).

⁵ I.e. as deducible from the realization of INITIATING EVENT and two ATTEMPTs.

Tab. 6.7 Frequencies of the 14 individual narrative components by grade

| Component | Grade 1 (N=31) N | Grade 1 (N=31) % | Grade 4 (N=28) N | Grade 4 (N=28) % |
|---|---|---|---|---|
| Setting | 25 | 81 | 27 | 96 |
| Initiating event | 25 | 81 | 28 | 100 |
| Simple reaction | 4 | 13 | 12 | 43 |
| Goal 1 | 5 | 16 | 26 | 93 |
| Goal 2 | 16 | 52 | 27 | 96 |
| Attempt 1 | 12 | 39 | 28 | 100 |
| Attempt 2 | 9 | 29 | 22 | 79 |
| Attempt 3 | 14 | 45 | 27 | 96 |
| Attempt 4 | 6 | 19 | 12 | 43 |
| Attempt 5 | 5 | 16 | 10 | 36 |
| Attempt 6 | 6 | 19 | 14 | 50 |
| Attempt 7 | 1 | 3 | 1 | 4 |
| Consequence | 12 | 39 | 18 | 64 |
| Ending | 5 | 16 | 26 | 93 |

Fig. 6.13 Frequencies of the 14 individual narrative components by grade [Bar chart; x-axis: narrative components in their order of occurrence in the story; y-axis: % of participants; series: grade 1 and grade 4; values as in Tab. 6.7.]

As stated in ch. 4, some narrative components seem to be easier to acquire than others in the sense that they are produced from an earlier age. My results support this finding: Most first graders describe (a) who they are talking about, i.e.
the protagonists, and (b) the event that sets up the “problem” to be solved in the ensuing development sequence of the story (cf. ch. 3.3), i.e. the escape of the frog. Even though they use few other narrative components, there is evidence for an emerging problem-resolution structure besides the realization of the initiating event: The narrative components following SETTING and INITIATING EVENT in frequency are GOAL 2, which is inferable in half of the stories, ATTEMPTs 3 and 1, which correspond to the story’s two possibilities for encoding the beginning of the goal-path (i.e. the recursive ATTEMPT sequence), and CONSEQUENCE, which is one of the two possibilities to end the goal-path. Thus, first graders seem to be on the right track, even though the latter narrative components are encoded by a comparatively low percentage of them.

The frequency of all narrative components besides ATTEMPT 7 increases from first to fourth grade, albeit to very different degrees; this was already evident from Fig. 6.13. However, the increase rates are somewhat conditioned by the components’ first grade frequency: SETTING, for example, is produced by 81% of first graders already, so there is not much room for increase in this instance. Fig. 6.14 gives the individual components’ increase rates, i.e. the difference between their first and fourth grade frequency. The great majority of these observed increase rates were found to be either statistically significant or at least indicating a trend towards significance.

Fig. 6.14 Increase in frequency from first to fourth grade for each narrative component [Bar chart; x-axis: narrative components ordered by their occurrence in the story; y-axis: % of participants. Values: Setting 15%, Initiating event 19%, Simple reaction 30%, Goal 1 77%, Goal 2 44%, Attempt 1 61%, Attempt 2 50%, Attempt 3 51%, Attempt 4 24%, Attempt 5 20%, Attempt 6 31%, Attempt 7 1%, Consequence 25%, Ending 77%.]

The following order of increase rates emerges from Fig.
6.14: The strongest increase rates observed, all of them statistically very highly significant, were found for GOAL 1 (77%, χ²(1)=34.73, Fisher’s p<0.001, Φ=0.77), ENDING (77%, χ²(1)=34.73, Fisher’s p<0.001, Φ=0.77), ATTEMPT 1 (61%, χ²(1)=25.31, Fisher’s p<0.001, Φ=0.66), ATTEMPT 3 (51%, χ²(1)=18.24, Fisher’s p<0.001, Φ=0.56), ATTEMPT 2 (50%, χ²(1)=14.48, Fisher’s p<0.001, Φ=0.5) and GOAL 2 (44%, χ²(1)=14.95, Fisher’s p<0.001, Φ=0.5). Additionally, the respective effect sizes, ranging from phi-values of 0.5 to 0.77, show the important effect of grade; the phi-values indicate that the relationship between grade and the individual narrative components is at least moderately strong and often very strong. Most of the components with lower increase rates, however, still differed significantly between first and fourth grade. Thus, first and fourth grade results of ATTEMPT 6 (31%, χ²(1)=6.17, Fisher’s p<0.05, Φ=0.32), SIMPLE REACTION (30%, χ²(1)=6.68, Fisher’s p<0.05, Φ=0.34) and INITIATING EVENT (19%, χ²(1)=6.03, Fisher’s p<0.05, Φ=0.32) were statistically significant, even if the effect sizes, ranging from 0.32 to 0.34, indicated a more modest impact of grade. A trend toward significance was found for CONSEQUENCE (25%, χ²(1)=3.85, Fisher’s p=0.07, ns) and ATTEMPT 4 (24%, χ²(1)=3.83, Fisher’s p=0.09, ns). Only the observed increase in SETTING (15%, χ²(1)=3.50, Fisher’s p=0.11, ns) and ATTEMPT 5 (20%, χ²(1)=2.98, Fisher’s p=0.23, ns) was not statistically significant.

In fourth grade, large differences in frequency remain between the individual components (Fig. 6.13, Tab. 6.7); that is, INITIATING EVENT and ATTEMPT 1 are realized by all fourth graders (100%) and a further 5 narrative components by over 90% of them, namely SETTING, GOAL 2, ATTEMPT 3 (all three 96%), GOAL 1 and ENDING (both 93%).⁶ In absolute numbers, 96% correspond to merely one and 93% to two of the fourth graders not realizing the respective component; that is, half of the 14 narrative components are realized by almost all fourth graders. The remaining 7 components do not show any similarities in frequency: 79% of the fourth graders realize the “window attempt”, i.e. ATTEMPT 2. Half of the fourth graders or more produce CONSEQUENCE (64%) and ATTEMPT 6 (50%), but not even half of them realize SIMPLE REACTION (43%) and ATTEMPTs 4 (43%) and 5 (36%). Narrative component ATTEMPT 7 is again realized by only one participant (4%, C5-G4-10).

⁶ The high frequency of goals 1 and 2 is attributable to their implicit realization as inferable from these components; only GOAL 2 is also realized explicitly and then by a mere 10 of the 28 fourth graders, i.e. 36% of them.

As discussed in ch. 4, some narrative components seem to be easier to acquire than others in the sense that they are produced from an earlier age. The order of difficulty that emerges from my results in first and fourth grade is shown in Tab. 6.8. As Tab. 6.8 shows, if a certain interchangeability in ATTEMPTs 2 to 7 as well as CONSEQUENCE and ENDING is allowed for, the order of difficulty is not very different between the two grades; the main difference seems to be a quantitative one: Almost all fourth graders (over 90%) describe not only, as most first graders do, (a) the protagonists (SETTING) and (b) the problem that triggers the attempt sequence (INITIATING EVENT) but also further components indicative of a global narrative structure in the sense of the narrative schema outlined in ch. 3. Thus, the great majority of fourth graders (over 90%) additionally realize (c) the initiation of the search theme and the repeated attempt sequence (ATTEMPTs 1 & 3) and (d) the completion of the search theme (ENDING).
Additionally, the underlying GOALs motivating the story characters can be inferred in almost all of the stories, even if they are still rarely realized explicitly.

Tab. 6.8 Order of difficulty for first and fourth graders

| Order | Grade 1 | Grade 4 |
|---|---|---|
| 1 | Setting † | Initiating event |
| 2 | Initiating event | Attempt 1 |
| 3 | Goal 2 | Setting |
| 4 | Attempt 3 | Goal 2 |
| 5 | Attempt 1 | Attempt 3 |
| 6 | Consequence | Goal 1 |
| 7 | Attempt 2 | Ending |
| 8 | Attempt 4 | Attempt 2 |
| 9 | Attempt 6 | Consequence |
| 10 | Goal 1 | Attempt 6 |
| 11 | Ending | Simple reaction |
| 12 | Attempt 5 | Attempt 4 |
| 13 | Simple reaction | Attempt 5 |
| 14 | Attempt 7 | Attempt 7 |

† Frequencies of 90% or above are shaded in dark grey, frequencies of 75% or above in lighter grey and those of 50% or above in very light grey. Components with identical frequencies share one box.

However, even in fourth grade only 7, i.e. half of all components, are produced by the great majority of participants: What does this mean for a global narrative structure? As opposed to the components with a frequency of 90% or more, the remaining 7 components can be considered optional and therefore dispensable, and thus, no CONSEQUENCE is necessary if the ENDING (which is additionally more salient) is realized. Similarly, ATTEMPTs 1 and 3 initiate the search inside and outside respectively, and participants may see the search theme as sufficiently established after mentioning these two components (cf. Bamberg & Marchman 1994). Therefore, any additional mentions of the search would become a matter of personal style and choice to elaborate rather than of necessity for a global story structure and theme. Similarly, SIMPLE REACTION is a category that is largely optional and usually left implicit even in adult stories (e.g. Stein & Glenn 1979: 64; cf. also ch. 3). Thus, the great majority of fourth graders was found to produce the components most important for a globally structured narrative, while leaving out components which fulfill more of an elaborative function.
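The order of difficulty in Tab. 6.8 follows directly from the frequencies in Tab. 6.7. As a minimal sketch, the components can be sorted by the number of children realizing them; the tie-breaking rule used here (tied components keep their story order) is an assumption made for illustration, since the printed table places tied components in one shared box, and the function name `order_of_difficulty` is hypothetical.

```python
# Components in their order of occurrence in the story
components = [
    "Setting", "Initiating event", "Simple reaction", "Goal 1", "Goal 2",
    "Attempt 1", "Attempt 2", "Attempt 3", "Attempt 4", "Attempt 5",
    "Attempt 6", "Attempt 7", "Consequence", "Ending",
]

# Number of children realizing each component (Tab. 6.7)
grade_1 = [25, 25, 4, 5, 16, 12, 9, 14, 6, 5, 6, 1, 12, 5]
grade_4 = [27, 28, 12, 26, 27, 28, 22, 27, 12, 10, 14, 1, 18, 26]


def order_of_difficulty(names: list, counts: list) -> list:
    """Rank components from most to least frequently realized.

    sorted() is stable, so components with identical counts keep
    their story order (an assumption for tied ranks).
    """
    ranked = sorted(zip(names, counts), key=lambda pair: -pair[1])
    return [name for name, _ in ranked]


order_g1 = order_of_difficulty(components, grade_1)
order_g4 = order_of_difficulty(components, grade_4)

for rank, (first, fourth) in enumerate(zip(order_g1, order_g4), start=1):
    print(f"{rank:2d}  {first:<17} {fourth}")
```

Under this tie-breaking rule the fourth grade column matches Tab. 6.8 exactly, while in the first grade column the tied components GOAL 1, ENDING and ATTEMPT 5 (5 children each) appear in a slightly different sequence.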
To sum up, all 14 narrative components were already produced in first grade but almost all components’ frequency increased from first to fourth grade, and thus, only 2 components were realized by the majority of first graders (over 75%) but 7 by almost all fourth graders (over 90%). The statistical analysis showed that the increase in frequency was significant for most of the components. With the help of the respective frequencies, an order of difficulty of realization was established, which showed that the development from first to fourth grade went hand in hand with a development towards a global narrative structure; that is, the beginnings of such a structure were already discernible in first grade but only in fourth grade were the necessary components produced by the great majority of participants.

6.2.1.1 Individual narrative components by sex

The realization of the individual components by male and female participants in first and fourth grade is presented in Fig. 6.15 and the corresponding values in Tab. 6.9. As Fig. 6.15 shows, some observable differences exist; these will be described in the following. However, none of these observed differences were statistically significant. Neither sex performs very differently from the mean in the realization of the individual narrative components in first grade; however, female first graders produce all of the narrative components, while none of the males realize ATTEMPTs 5 and 7. Clearly more male than female first graders, on the other hand, realize SETTING (90% vs. 76%), INITIATING EVENT (90% vs. 76%), SIMPLE REACTION (30% vs. 5%), GOAL 2 (60% vs. 48%), ATTEMPT 1 (50% vs. 33%), ATTEMPT 3 (60% vs. 38%), and ATTEMPT 6 (30% vs. 14%).⁷,⁸

⁷ The expression clearly more is used in the following to indicate that observed results differ by 10% or more. This does not necessarily mean that the difference is also statistically significant.
⁸ The statistical results are: SETTING (χ²(1)=0.83, ns), INITIATING EVENT (χ²(1)=0.83, ns), GOAL 2 (χ²(1)=0.42, ns), ATTEMPT 1 (χ²(1)=0.79, ns), ATTEMPT 3 (χ²(1)=1.31, ns), and ATTEMPT 6 (χ²(1)=1.07, ns). The SIMPLE REACTION component’s result (χ²(1)=3.84, Fisher’s p=0.09) could be interpreted as showing a trend towards significance.

Tab. 6.9 Frequency of each narrative component by sex and grade

| Component | Grade 1 Male (N=10) N | Grade 1 Male % | Grade 1 Female (N=21) N | Grade 1 Female % | Grade 4 Male (N=7) N | Grade 4 Male % | Grade 4 Female (N=21) N | Grade 4 Female % |
|---|---|---|---|---|---|---|---|---|
| Setting | 9 | 90 | 16 | 76 | 7 | 100 | 20 | 95 |
| Initiating event | 9 | 90 | 16 | 76 | 7 | 100 | 21 | 100 |
| Simple reaction | 3 | 30 | 1 | 5 | 4 | 57 | 8 | 38 |
| Goal 1 | 1 | 10 | 4 | 19 | 7 | 100 | 19 | 90 |
| Goal 2 | 6 | 60 | 10 | 48 | 7 | 100 | 20 | 95 |
| Attempt 1 | 5 | 50 | 7 | 33 | 7 | 100 | 21 | 100 |
| Attempt 2 | 3 | 30 | 6 | 29 | 5 | 71 | 17 | 81 |
| Attempt 3 | 6 | 60 | 8 | 38 | 7 | 100 | 20 | 95 |
| Attempt 4 | 2 | 20 | 4 | 19 | 4 | 57 | 8 | 38 |
| Attempt 5 | 0 | 0 | 5 | 24 | 1 | 14 | 9 | 43 |
| Attempt 6 | 3 | 30 | 3 | 14 | 4 | 57 | 10 | 48 |
| Attempt 7 | 0 | 0 | 1 | 5 | 1 | 14 | 0 | 0 |
| Consequence | 3 | 30 | 9 | 43 | 5 | 71 | 13 | 62 |
| Ending | 2 | 20 | 3 | 14 | 7 | 100 | 19 | 90 |

Fig. 6.15 Frequency of each narrative component by sex in first grade [Bar chart; x-axis: narrative components ordered by occurrence in the story; y-axis: % of sex; series: male first graders, female first graders, first grade mean; values as in Tab. 6.9.]

Slightly more male participants produced ATTEMPT 4 (20% vs. 19%) and ENDING (20% vs. 14%).⁹,¹⁰ Female first graders, on the other hand, realize ATTEMPT 5 (24% vs. 0%) clearly more often, and GOAL 1 (19% vs. 10%) and CONSEQUENCE (43% vs. 30%) slightly more often.¹¹ Virtually no difference was found for ATTEMPT 2, and ATTEMPT 7 must be disregarded because it is realized by only one first grader in the data set. Although male first graders seem to have somewhat of an advantage on some of the components important for a global narrative structure (SETTING, INITIATING EVENT, ATTEMPT 1 and/or 3), the difficulty of realization in first grade, which is shown in Tab. 6.10, is quite similar for both sexes.¹²

Tab.
6.10 Order of difficulty for first graders by sex

| Order | Male first graders | Female first graders |
|---|---|---|
| 1 | Setting † | Setting |
| 2 | Initiating event | Initiating event |
| 3 | Goal 2 | Goal 2 |
| 4 | Attempt 3 | Consequence |
| 5 | Attempt 1 | Attempt 3 |
| 6 | Consequence | Attempt 1 |
| 7 | Simple reaction | Attempt 2 |
| 8 | Attempt 2 | Attempt 5 |
| 9 | Attempt 6 | Attempt 4 |
| 10 | Ending | Goal 1 |
| 11 | Attempt 4 | Attempt 6 |
| 12 | Goal 1 | Ending |
| 13 | (Attempt 5) | Simple reaction |
| 14 | (Attempt 7) | Attempt 7 |

† Frequencies of 90% or above are shaded in dark grey, frequencies of 75% or above in lighter grey and those of 50% or above in very light grey. Components with identical frequencies share one box; parentheses indicate that components were not produced by any of the group members.

That is, the majority of male and female first graders realize SETTING and INITIATING EVENT and approximately half of them realize GOAL 2, while half of the male first graders or a little more also produce ATTEMPTs 3 and 1. All other components are realized by fewer than half of the males and females; in this they follow the overall results described above. The remaining components are realized in a similar order of difficulty, which also corresponds closely to the overall order of difficulty described earlier. The only notable exception, if one assumes a certain degree of interchangeability in the ATTEMPTs, is SIMPLE REACTION (30% vs. 5%, i.e. 7th rank vs. 13th rank).

⁹ Slightly more is used to refer to a difference of less than 10%.

¹⁰ The statistical results are: ATTEMPT 4 (χ²(1)=0.004, ns), ENDING (χ²(1)=0.16, ns).

¹¹ The statistical results are: ATTEMPT 5 (χ²(1)=2.84, ns), GOAL 1 (χ²(1)=0.41, ns), CONSEQUENCE (χ²(1)=0.47, ns).

¹² As explained in ch. 5.2.1, either the first or the third attempt can serve to introduce the search theme. The ENDING component was not included since it can be compensated for by CONSEQUENCE, which is realized slightly more often by female than by male first graders.
Both male and female participants increase their performance from first to fourth grade for virtually all narrative components. The only exception is ATTEMPT 7, which is produced by a female participant in first and a male participant in fourth grade. The increase rates of both sexes roughly follow the overall increase rates discussed earlier, i.e. they vary strongly from component to component; thus in fourth grade both female and male participants’ performance is again very close to the mean; this is evident from Fig. 6.16, which shows the frequency of each narrative component by sex in fourth grade.

Fig. 6.16 Frequency of each narrative component by sex in fourth grade [Bar chart; x-axis: narrative components ordered by occurrence in the story; y-axis: % of sex; series: male fourth graders, female fourth graders, fourth grade mean; values as in Tab. 6.9.]

Female participants’ performance resembles the mean in all but ATTEMPT 5, while male fourth graders realize several components not only slightly more often than the mean but also slightly more often than female fourth graders. That is, male fourth graders clearly more often realize SIMPLE REACTION (57% vs. 38%), ATTEMPT 4 (57% vs. 38%), and ATTEMPT 7 (14% vs. 0%). They slightly more often produce SETTING (100% vs. 95%), GOAL 1 (100% vs. 90%), GOAL 2 (100% vs. 95%), ATTEMPT 3 (100% vs. 95%), ATTEMPT 6 (57% vs. 48%), CONSEQUENCE (71% vs. 62%), and ENDING (100% vs. 90%).¹³ Female fourth graders, on the other hand, realize only ATTEMPT 5 (43% vs. 14%) clearly more often and ATTEMPT 2 (81% vs. 71%) slightly more often than their male counterparts.¹⁴ However, in absolute numbers group differences for the crucial components often translate into one or two participants of the respective group (not) realizing the corresponding component (Tab. 6.9) and thus these differences do not seem systematic.
This is especially true since, on the two remaining components (INITIATING EVENT, ATTEMPT 1), which are important for a globally structured narrative, the performance of both groups is the same. At the same time the order of difficulty, as measured by the frequency of realizations, is again very similar for both sexes (Tab. 6.11).

Tab. 6.11 Order of difficulty for fourth graders by sex

| Order | Male fourth graders | Female fourth graders |
|---|---|---|
| 1 | Setting † | Initiating event |
| 2 | Initiating event | Attempt 1 |
| 3 | Goal 1 | Setting |
| 4 | Goal 2 | Goal 2 |
| 5 | Attempt 1 | Attempt 3 |
| 6 | Attempt 3 | Goal 1 |
| 7 | Ending | Ending |
| 8 | Attempt 2 | Attempt 2 |
| 9 | Consequence | Consequence |
| 10 | Attempt 6 | Attempt 6 |
| 11 | Simple reaction | Attempt 5 |
| 12 | Attempt 4 | Simple reaction |
| 13 | Attempt 5 | Attempt 4 |
| 14 | Attempt 7 | (Attempt 7) |

† Frequencies of 90% or above are shaded in dark grey, frequencies of 75% or above in lighter grey and those of 50% or above in very light grey. Components with identical frequencies share one box; parentheses indicate that components were not produced by any of the group members.

To sum up, somewhat of a male advantage was found in both grades with respect to the realization of the individual narrative components. However, any differences between the groups were not systematic and thus were negligible, even more so when keeping in mind a possible group size effect. The statistical analysis also supported the conclusion that the differences between male and female participants are negligible, since none of the differences tested were significant.

¹³ The statistical results are: SIMPLE REACTION (χ²(1)=0.78, ns), ATTEMPT 4 (χ²(1)=0.78, ns), ATTEMPT 6 (χ²(1)=0.19, ns) and CONSEQUENCE (χ²(1)=0.21, ns). SETTING, GOAL 1 and 2, ATTEMPT 3, ATTEMPT 7 and ENDING could not be tested reliably due to expected cell frequencies below one (cf. ch. 5.2.3).

¹⁴ ATTEMPT 5 (χ²(1)=1.87, ns), ATTEMPT 2 (χ²(1)=0.28, ns).
6.2.1.2 Individual narrative components by experience group

The realization of each narrative component by experience group in first grade is shown in Fig. 6.17; the corresponding values are given in Tab. 6.12. All narrative components are produced by at least some of the bili first graders, while none of the mono first graders realize SIMPLE REACTION, GOAL 1, ATTEMPTs 4, 6 and 7 and the ENDING. Additionally, Fig. 6.17 and Tab. 6.12 show that the higher mean number of components achieved by first graders with bilingual preschool experience (cf. ch. 6.1.1.3) is due to a consistently higher number of realizations; thus, the bili group outperforms the mono group on all but the SETTING, which is realized slightly more often by the mono group (mono 83% vs. bili 79%); statistically, this difference was also not significant (χ²(1)=0.76, ns). Only a small observable difference between the two experience groups was found for ATTEMPT 7 (bili 5% vs. mono 0%) and CONSEQUENCE (bili 42% vs. mono 33%). The difference in ATTEMPT 7 corresponds to one bilingual first grader realizing this component, however, which cannot be considered representative, while the difference in CONSEQUENCE (χ²(1)=0.63, Fisher’s p=0.72, ns) was not statistically significant.

[Fig. 6.17 Frequency of each narrative component in first grade by experience group †. x-axis: narrative components ordered by occurrence in the story; y-axis: % of experience group (0–100%); series: mono first graders, bili first graders, first grade mean.]
† ‘Mono’ denotes children with exclusively monolingual preschool experience, ‘bili’ those with bilingual preschool experience.

Tab. 6.12 Frequency of each narrative component by experience group and grade

| Component | Grade 1 Mono (N=12)† N | % | Grade 1 Bili (N=19) N | % | Grade 4 Mono (N=12) N | % | Grade 4 Bili (N=16) N | % |
|---|---|---|---|---|---|---|---|---|
| Setting | 10 | 83 | 15 | 79 | 12 | 100 | 15 | 94 |
| Initiating event | 6 | 50 | 19 | 100 | 12 | 100 | 16 | 100 |
| Simple reaction | 0 | 0 | 4 | 21 | 4 | 33 | 8 | 50 |
| Goal 1 | 0 | 0 | 5 | 26 | 10 | 83 | 16 | 100 |
| Goal 2 | 3 | 25 | 13 | 68 | 12 | 100 | 15 | 94 |
| Attempt 1 | 2 | 17 | 10 | 53 | 12 | 100 | 16 | 100 |
| Attempt 2 | 2 | 17 | 7 | 37 | 9 | 75 | 13 | 81 |
| Attempt 3 | 2 | 17 | 12 | 63 | 12 | 100 | 15 | 94 |
| Attempt 4 | 0 | 0 | 6 | 32 | 6 | 50 | 6 | 38 |
| Attempt 5 | 1 | 8 | 4 | 21 | 7 | 58 | 3 | 19 |
| Attempt 6 | 0 | 0 | 6 | 32 | 4 | 33 | 10 | 63 |
| Attempt 7 | 0 | 0 | 1 | 5 | 1 | 8 | 0 | 0 |
| Consequence | 4 | 33 | 8 | 42 | 7 | 58 | 11 | 69 |
| Ending | 0 | 0 | 5 | 26 | 10 | 83 | 16 | 100 |

† ‘Mono’ denotes children with exclusively monolingual preschool experience, ‘bili’ those with bilingual preschool experience.

All components besides SETTING, ATTEMPT 7 and CONSEQUENCE are thus realized clearly more often by the bili group, i.e. by at least 10% more of the bili participants. Over 30% more bili participants realize INITIATING EVENT (100% of the bili vs. 50% of the mono group), GOAL 2 (68% vs. 25%), ATTEMPT 1 (53% vs. 17%), ATTEMPT 3 (63% vs. 17%), ATTEMPT 4 (32% vs. 0%), and ATTEMPT 6 (32% vs. 0%). At least 20% more bilis than monos produce SIMPLE REACTION (21% vs. 0%), GOAL 1 (26% vs. 0%), ATTEMPT 2 (37% vs. 17%) and ENDING (26% vs. 0%), while over 10% more of the bilis realize ATTEMPT 5 (21% vs. 8%). Statistically, the difference between mono and bili group was testable for only 5 of these components, mostly because the remaining 6 components were not produced by any of the mono first graders. The difference in realizations was found to be statistically significant for INITIATING EVENT (χ²(1)=11.78, Fisher’s p<0.01, Φ=0.62), GOAL 2 (χ²(1)=5.55, Fisher’s p<0.05, Φ=0.42), and ATTEMPT 3 (χ²(1)=6.42, Fisher’s p<0.05, Φ=0.46). The respective effect sizes, ranging from a moderately strong 0.42 to a strong 0.62, confirm the importance of experience group in first grade.
The statistical results for ATTEMPT 1 (χ²(1)=4.01, Fisher’s p=0.065, ns), ATTEMPTs 4 and 6 (both χ²(1)=4.70, Fisher’s p=0.06, ns) indicated a trend towards significance, while the results for SIMPLE REACTION (χ²(1)=2.90, ns), GOAL 1 (χ²(1)=3.77, ns), ATTEMPT 2 (χ²(1)=1.45, ns), ATTEMPT 5 (χ²(1)=0.88, ns) and ENDING (χ²(1)=3.77, ns) were non-significant.

The order of difficulty that emerges for mono and bili first graders is shown in Tab. 6.13. Four components are realized by the majority of bili first graders, namely SETTING (79%), INITIATING EVENT (100%), GOAL 2 (68%), and ATTEMPT 3 (63%). ATTEMPT 1 (53%) is realized by a little more than half of the bili first graders. In the mono group, the SETTING (83%) is also realized by the majority of participants and the INITIATING EVENT by at least half of them (50%). All other components were produced by 25% or less of the mono first graders. Thus, bili first graders show the beginnings of a global narrative structure: The majority of first graders with bilingual preschool experience describe the protagonists, the problem triggering the global problem-resolution structure, and at least one of the two attempts which can be considered the beginning of a global search theme (ATTEMPT 1 and 3). [15] In the case of the mono group, there is no comparably conclusive evidence for an emerging global narrative structure. Additionally, first graders with bilingual preschool experience tell more elaborate, more detailed stories, i.e. they incorporate not only more (cf. ch. 6.1.1.3) but also a greater variety of narrative components.

[15] These two attempts are realized clearly more often than any of the other attempts, which could either be explained through greater saliency in the elicitation material or, more likely, by being considered more important for the story.

Tab. 6.13 Order of difficulty for first graders by experience group

| Order | Mono first graders † | Bili first graders |
|---|---|---|
| 1 | Setting | Initiating event |
| 2 | Initiating event | Setting |
| 3 | Consequence | Goal 2 |
| 4 | Goal 2 | Attempt 3 |
| 5 | Attempt 1 | Attempt 1 |
| 6 | Attempt 2 | Consequence |
| 7 | Attempt 3 | Attempt 2 |
| 8 | Attempt 5 | Attempt 4 |
| 9 | (Simple reaction) | Attempt 6 |
| 10 | (Goal 1) | Ending |
| 11 | (Attempt 4) | Goal 1 |
| 12 | (Attempt 6) | Simple reaction |
| 13 | (Attempt 7) | Attempt 5 |
| 14 | (Ending) | Attempt 7 |

† ‘Mono’ denotes children with exclusively monolingual preschool experience, ‘bili’ those with bilingual preschool experience. Frequencies of 90% or above are shaded in dark grey, frequencies of 75% or above in lighter grey and those of 50% or above in very light grey. Components with identical frequencies share one box; parentheses indicate that components were not produced by any of the group members.

Both experience groups follow the general trend of a higher number of realizations in fourth grade for most of the narrative components. Exceptions are mostly attributable to realizations of 100% in first grade (INITIATING EVENT, SETTING), which leave no room for an increase, or to small changes of one participant more or less in first or fourth grade (ATTEMPT 7, ATTEMPT 4). The fourth grade results by experience group are depicted in Fig. 6.18; for the corresponding values see Tab. 6.12.

The results of both groups closely resemble each other in fourth grade, i.e. the differences have become quite small compared to first grade, even if a certain advantage of the bili group (over 10% more realizations) can be observed for five of the 14 components, namely SIMPLE REACTION (bili 50% vs. mono group 33%), GOAL 1 (100% vs. 83%), ATTEMPT 6 (63% vs. 33%), CONSEQUENCE (69% vs. 58%) and ENDING (100% vs. 83%). A slightly higher performance of the mono group can be observed for ATTEMPT 4 (mono 50% vs. bili 38%) and ATTEMPT 5 (58% vs. 19%). The largest differences are thus found for ATTEMPT 5 (monos 39% more) and ATTEMPT 6 (bilis 30% more).
For all 7 other components, there are either no observable differences or these differences are below 10% and thus negligible. The statistical analysis confirms this similarity in results since none of the differences tested were statistically significant. [16]

[16] SIMPLE REACTION (χ²(1)=0.78, ns), ATTEMPT 2 (χ²(1)=0.16, ns), ATTEMPT 4 (χ²(1)=0.44, ns), ATTEMPT 6 (χ²(1)=2.33, ns), CONSEQUENCE (χ²(1)=0.32, ns). Even the result for ATTEMPT 5 missed qualifying as statistically significant, albeit only barely (χ²(1)=4.68, Fisher’s p=0.05, ns). SETTING, GOAL 1 and 2, ATTEMPT 3 and 7, ENDING could not be tested reliably due to expected cell frequencies below one, INITIATING EVENT and ATTEMPT 1 because they are a constant.

[Fig. 6.18 Frequency of each narrative component by experience group in fourth grade †. x-axis: narrative components ordered by occurrence in the story; y-axis: % of experience group (0–100%); series: mono fourth graders, bili fourth graders, fourth grade mean.]
† ‘Mono’ denotes children with exclusively monolingual preschool experience, ‘bili’ those with bilingual preschool experience.

Tab. 6.14 Order of difficulty for fourth graders by experience group

| Order | Mono fourth graders † | Bili fourth graders |
|---|---|---|
| 1 | Setting | Initiating event |
| 2 | Initiating event | Goal 1 |
| 3 | Goal 2 | Attempt 1 |
| 4 | Attempt 1 | Ending |
| 5 | Attempt 3 | Setting |
| 6 | Goal 1 | Goal 2 |
| 7 | Ending | Attempt 3 |
| 8 | Attempt 2 | Attempt 2 |
| 9 | Consequence | Consequence |
| 10 | Attempt 5 | Attempt 6 |
| 11 | Attempt 4 | Simple reaction |
| 12 | Simple reaction | Attempt 4 |
| 13 | Attempt 6 | Attempt 5 |
| 14 | Attempt 7 | (Attempt 7) |

† ‘Mono’ denotes children with exclusively monolingual preschool experience, ‘bili’ those with bilingual preschool experience. Frequencies of 90% or above are shaded in dark grey, frequencies of 75% or above in lighter grey and those of 50% or above in very light grey. Components with identical frequencies share one box; parentheses indicate that components were not produced by any of the group members.

The order of difficulty (as measured by frequency of realization) for both mono and bili fourth graders is given in Tab. 6.14. It shows some minor differences between fourth graders with and without bilingual preschool experience with regard to both the ranking of components and their frequency of realization. These differences again seem negligible, however, since the same components are included in the seven most frequent realizations and the differences in frequency translate to an absolute number of only one or two participants in a group (not) realizing a component. A frequency of 94% (SETTING, GOAL 2, ATTEMPT 3) in the bili group, for example, means that one bili fourth grader did not realize the respective component; a frequency of 83% (GOAL 1, ENDING) in the mono group corresponds to two mono fourth graders not realizing the component in question.

In summary, L2 preschool experience was found to have a significant impact on first grade results, where bili participants clearly outperformed mono participants; that is, all narrative components were produced by at least one of the bili first graders, while 6 of 14 components (43%) were not produced by any of the mono first graders. Qualitatively, only bili first graders showed the beginnings of a global narrative structure and they also told more elaborate stories in terms of the variety of components used. The two experience groups followed the overall trend of an increase in the frequencies of realization and by the end of fourth grade not only any quantitative but also any qualitative differences had become negligible.
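The first-grade 2×2 comparisons reported in this section can be retraced from the counts in Tab. 6.12. The following is a minimal sketch, assuming an uncorrected Pearson chi-square and Φ = √(χ²/N) as the effect size; the function names are illustrative and not taken from the study, and Fisher’s exact test is not reproduced here.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table
    [[a, b], [c, d]]: rows = groups, columns = realized / not realized."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero; the test is undefined")
    return n * (a * d - b * c) ** 2 / denom

def phi(chi2, n):
    """Effect size for a 2x2 table (assumed definition: sqrt(chi2 / N))."""
    return math.sqrt(chi2 / n)

def min_expected(a, b, c, d):
    """Smallest expected cell frequency under independence."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    return min(r * k / n for r in rows for k in cols)

# First-grade group sizes and realization counts (mono, bili) from Tab. 6.12.
N_MONO, N_BILI = 12, 19
REALIZED = {
    "INITIATING EVENT": (6, 19),
    "GOAL 2": (3, 13),
    "ATTEMPT 3": (2, 12),
    "ATTEMPT 7": (0, 1),
}

def cells(component):
    """Return the four cell counts (mono yes, mono no, bili yes, bili no)."""
    mono, bili = REALIZED[component]
    return mono, N_MONO - mono, bili, N_BILI - bili

if __name__ == "__main__":
    for name in REALIZED:
        table = cells(name)
        chi2 = chi_square_2x2(*table)
        print(name, round(chi2, 2), round(phi(chi2, sum(table)), 2),
              round(min_expected(*table), 2))
```

Under these assumptions the sketch returns the χ² and Φ values given above for INITIATING EVENT, GOAL 2 and ATTEMPT 3, and it shows an expected cell frequency below one for ATTEMPT 7, which is the situation in which the text reports that a component could not be tested reliably.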
6.2.2 Summary: Individual narrative components This section aimed to answer the question of whether there are any qualitative differences in coherence—as measured by the frequency of the 14 individual components under investigation—attributable to grade, sex or L2 preschool experience. It was found that all narrative components were realized by at least one participant even in first grade and that there was a significant observable as well as a statistically significant increase in the frequency of realization from first to fourth grade for almost all components. An order of difficulty of realization based on the components’ frequency showed that only SETTING and INITIATING EVENT (both 81%) were produced by the majority of first graders, while the great majority of fourth graders produced additional components necessary for a global narrative structure, especially ATTEMPTs 1 (100%) and 3 (96%), and ENDING (93%). Thus, the respective orders of difficulty indicated the beginnings of a global structure in first grade. Only the combination of components realized by the great majority of fourth graders, however, actually reflected a global narrative structure, i.e. only fourth graders were found to produce coherent stories. Participants’ sex did not have any significant effect on frequencies or order of difficulty, i.e. male and female participants performed very similarly to each other (as well as to the overall results) in each grade and they followed the overall pattern of an increase in frequencies from first to fourth grade. L2 preschool experience, on the other hand, was found to have a profound influence; however, this influence was limited to first grade. Thus, all narrative components were produced by at least one bili first grader, while 6 components were realized by none of the mono first graders. Qualitatively, SETTING (79%), INITIATING EVENT (100%), ATTEMPT 1 (53%) and ATTEMPT 3 (63%) were produced by over 50% of bili first graders. 
Only SETTING (83%) and INITIATING EVENT (50%), on the other hand, were realized by 50% or more of mono first graders; that is, the evidence for an emerging global narrative structure in first grade is mainly attributable to first graders with L2 preschool experience. In terms of the development from first to fourth grade both experience groups followed the overall trend of an increase in frequency and by the end of fourth grade any significant influence of L2 preschool experience had disappeared, i.e. the performance of all three groups was largely the same with respect to the components’ frequencies and the order of difficulty.

To sum up, a general trend of increase in the frequency of realization was found for almost all components. Participants’ sex was of no significant influence but L2 preschool experience significantly influenced the first grade results and thereby qualified the overall findings. By the end of fourth grade this influence had disappeared and the performance for both experience groups was the same.

6.3 Index of global narrative structure

In order to more closely investigate participants’ qualitative development in the sense of a global narrative structure, which seems to be in its initial stages by the end of first grade and largely developed by grade four (cf. the previous section), a weighted index, the (global) narrative index, was constructed. This index of global narrative structure was defined as the sum of the five narrative components most important for representing the narrative structure of the task material, i.e. the frog story (cf. ch. 5.2.1). In the following I will briefly describe the criteria applied in index construction and then the results obtained.

6.3.1 Index construction and reliability

First of all, the SETTING (mention of all three protagonists) was included, as it is one of the integral parts of a story according to story grammar (cf. ch. 3.3.2).
Secondly, a key focus in the analysis of global structure lies in whether participants establish and conclude a problem-resolution structure around which the story is organized. Since it marks the beginning of this structure, the INITIATING EVENT (escape of the frog) was also incorporated in the index. Additionally, the instantiation of the search was included as a crucial indicator for the global problem-resolution structure. Since participants have several possibilities for encoding the first attempt (cf. ch. 5.2.1), though, this component was redefined as realization of either ATTEMPT 1, 2 or 3, which are all legitimate options for encoding the initiation of the search. A second ATTEMPT component was added to the index as an indicator for a sustained search motif (cf. Akinçi et al. 2001, Kupersmitt & Berman 2001, Berman & Slobin 1994, Bamberg & Marchman 1990), and it was defined as any attempt following the first attempt realized. However, this component does not have the same weight for global structuring as the other components do, since the search theme becomes the default case once it has been instantiated (cf. Bamberg & Marchman 1990, 1991, 1994). Consequently, the second attempt was assigned simple weight, while all other components were assigned double weight. Finally, a RESOLUTION component was introduced. As pointed out earlier, considerable overlap can exist between OUTCOME and ENDING in one-episode stories such as the frog story, while at the same time, both the higher-order goal “recover or replace the frog” and the lower-order goal “find the frog” (cf. ch. 5.2.1) can serve as the global organizing theme of the frog story. Consequently, either CONSEQUENCE (finding the frog) or ENDING (recovering/replacing the frog) can conclude the story theme. Therefore, outcome and ending were combined into the RESOLUTION (cf. Berman & Slobin 1994), which was defined as the realization of CONSEQUENCE, ENDING or both.
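The scoring rules just described can be sketched as a small function. This is an illustration under stated assumptions, not the procedure actually used in the study: double weight is taken as 2 points and simple weight as 1 point (which yields the maximum of nine), and the subsequent attempt is only credited if a first attempt (ATTEMPT 1, 2 or 3) has been realized. The function and constant names are hypothetical.

```python
FIRST_ATTEMPTS = {"ATTEMPT 1", "ATTEMPT 2", "ATTEMPT 3"}
ALL_ATTEMPTS = [f"ATTEMPT {i}" for i in range(1, 8)]  # story order

def narrative_index(realized):
    """Score one narrative on the global narrative index.

    `realized` is any iterable of story grammar component names,
    e.g. {"SETTING", "INITIATING EVENT", "ATTEMPT 2", "ENDING"}.
    """
    realized = set(realized)
    score = 0
    if "SETTING" in realized:
        score += 2  # double weight
    if "INITIATING EVENT" in realized:
        score += 2  # double weight
    # Attempts the child realized, in the order they occur in the story.
    attempts = [a for a in ALL_ATTEMPTS if a in realized]
    first = next((a for a in attempts if a in FIRST_ATTEMPTS), None)
    if first is not None:
        score += 2  # first attempt: double weight
        # Assumption: any realized attempt after the first one counts.
        if attempts[attempts.index(first) + 1:]:
            score += 1  # subsequent attempt: simple weight
    if realized & {"CONSEQUENCE", "ENDING"}:
        score += 2  # resolution: double weight
    return score

if __name__ == "__main__":
    full_story = {"SETTING", "INITIATING EVENT", "ATTEMPT 1",
                  "ATTEMPT 3", "CONSEQUENCE", "ENDING"}
    print(narrative_index(full_story))
    print(narrative_index(full_story - {"SETTING"}))
```

A narrative containing all five index components reaches the full score, while one lacking only the SETTING, or only a subsequent attempt, loses two points or one point respectively.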
The narrative components composing the global narrative index and their assigned weight are summarized in Tab. 6.15. The index was found to be reliable (Cronbach’s α=0.71) and thus the global narrative structure of participants’ texts was measured by their score on a scale from one to nine, where nine points signal a fully developed global narrative structure. Additionally, the index was found to correlate very strongly with the total number of components realized (r=0.90, p<0.01) with an effect size of r²=0.81, which means that this “reduced” version of story grammar can be used as a very representative measure also for the total number of components.

Tab. 6.15 Index construction: Components and their assigned weight

| Index component | Story grammar component | Weight assigned |
|---|---|---|
| Setting | SETTING | double |
| Initiating event | INITIATING EVENT | double |
| First attempt | ATTEMPT 1 or 2 or 3 | double |
| Subsequent attempt | Any following attempt | single |
| Resolution | CONSEQUENCE and/or ENDING | double |

6.3.2 Global narrative index results

This section complements the results presented for the total number of components and the individual narrative components by focusing on a combination of components representative for a globally organized and thus coherent story; that is, it is investigated whether there are any differences in global narrative structure, as measured by the index score, attributable to grade, sex or L2 experience.

6.3.2.1 Overall results

Fig. 6.19 shows the mean index score in first and fourth grade; Tab. 6.16 gives the descriptive statistics for the narrative index results in both grades. First graders score an average of 5.8 points on the index. By the end of fourth grade the mean index score has risen to 8.9, which basically corresponds to a full score.
Statistically, this increase of the mean score from first to fourth grade is very highly significant and the effect size corroborates the strong influence of grade (3.1 points; U=123.50, p<0.001, η²=0.48), since it indicates that grade as a single factor explains 48% of the variation in results. If one looks at the mean index score as a function of the maximal score of 9 (which would indicate a fully developed global narrative structure), first graders thus realize on average 64% of a global narrative structure, while fourth graders have completed their development (99%).

[Fig. 6.19 Mean narrative index score by grade. y-axis: index score (0–9); bars: Grade 1 (5.8), Grade 4 (8.9), increase (3.1).]

Tab. 6.16 Descriptive statistics narrative index by grade

| | Total N | Mean | Standard Deviation | Median | Mode | Minimum | Maximum |
|---|---|---|---|---|---|---|---|
| Grade 1 | 31 | 5.8 | 2.6 | 6 | 9 | 2 | 9 |
| Grade 4 | 28 | 8.9 | 0.4 | 9 | 9 | 7 | 9 |
| Total | 59 | 7.3 | 2.4 | 9 | 9 | 2 | 9 |

At the same time first and fourth graders differ with respect to their interindividual variation, which is quite high in first grade, as measured by a standard deviation of 2.6 points, a variability ratio of 45% and a range of seven points from the lowest (2 points) to the highest score (9), but which by the end of fourth grade has decreased to a standard deviation of merely 0.4 or 4% of the mean and a range of two (cf. Tab. 6.16). This larger homogeneity in fourth grade is illustrated in Fig. 6.20, which presents the distribution of scores in first and fourth grade; Tab. 6.17 gives the corresponding values.

[Fig. 6.20 Distribution of narrative index scores by grade. x-axis: index score (1–9); y-axis: % of participants in grade (0–100); series: % of first graders, % of fourth graders.]

Tab. 6.17 Distribution of narrative index scores by grade

| Narrative index score | Grade 1 N | Grade 1 % within grade | Grade 4 N | Grade 4 % within grade |
|---|---|---|---|---|
| 2 | 6 | 19 | 0 | 0 |
| 4 | 6 | 19 | 0 | 0 |
| 5 | 1 | 3 | 0 | 0 |
| 6 | 3 | 10 | 0 | 0 |
| 7 | 7 | 23 | 1 | 4 |
| 8 | 0 | 0 | 1 | 4 |
| 9 | 8 | 26 | 26 | 93 |
| Total | 31 | 100 | 28 | 100 |

The distribution shows that first graders’ scores are distributed relatively evenly on a range from 2 to 9 with a little over half of them (52%, 16 of 31) scoring 2, 4, 5 or 6 points on the index and a little less than half of them (48%, 15 of 31) scoring 7 or 9 points. [17] Twenty-six percent of first graders (8 participants) are already able to produce a fully globally organized narrative (as indicated by the maximum score of 9 points) and another 23% come close to it. In fourth grade 93% of participants reach the maximum score; only two children produce lower scores, namely 7 and 8 points (both 4%). That is, frequency distribution and mean score show that some first graders already perform like fourth graders with respect to a global structure but that a global organization of narratives has become the rule only in fourth grade. Since the present study only compares first and fourth grade data, it is, of course, not possible to draw any conclusion as to when exactly a global structure becomes predominant.

[17] Cf. also the median in first grade.

To sum up, mean score and distribution showed that first graders’ global narrative organization is in most cases still developing, even if there are large differences among them with roughly a quarter of first graders already producing a fully developed narrative structure.
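The descriptive statistics in Tab. 6.16 can be recomputed from the score distribution in Tab. 6.17. The sketch below assumes the sample standard deviation (n−1 in the denominator) and takes the variability ratio to be the standard deviation divided by the mean; both are assumptions about how the values were obtained, and the helper names are illustrative. Ratios computed from unrounded values may differ by about one percentage point from those quoted in the text.

```python
import statistics

# Frequency of each index score per grade, copied from Tab. 6.17.
DISTRIBUTION = {
    "Grade 1": {2: 6, 4: 6, 5: 1, 6: 3, 7: 7, 9: 8},
    "Grade 4": {7: 1, 8: 1, 9: 26},
}

def expand(freq):
    """Turn {score: count} into a sorted list of individual scores."""
    return [score for score, n in sorted(freq.items()) for _ in range(n)]

def describe(freq):
    """Descriptive statistics for one group of index scores."""
    scores = expand(freq)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample SD (assumption)
    return {
        "n": len(scores),
        "mean": mean,
        "sd": sd,
        "median": statistics.median(scores),
        "mode": statistics.mode(scores),
        "min": min(scores),
        "max": max(scores),
        "variability_ratio": sd / mean,  # SD relative to the mean
    }

if __name__ == "__main__":
    for grade, freq in DISTRIBUTION.items():
        stats = describe(freq)
        print(grade, {k: round(v, 2) for k, v in stats.items()})
```

Rounded to one decimal place, the means, standard deviations, medians and modes obtained this way agree with Tab. 6.16.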
The participants’ mean index score was found to increase strongly from first to fourth grade (the significance of this increase was confirmed by the statistical analysis), while their interindividual differences decreased very strongly. As a result, almost all fourth graders produced the highest index score, which means that the development of a global narrative structure can be seen as concluded by the end of fourth grade.

6.3.2.2 Index by sex

The mean index score by sex and grade is shown in Fig. 6.21; the corresponding descriptive statistics are given in Tab. 6.18. Male and female first graders perform very similarly, and thus, there is neither a significant observable nor a statistically significant difference between the two groups’ mean score in first grade (male mean 6.2 vs. female mean 5.6; U=92.00, ns). Great similarity also exists in the interindividual variation as measured by the standard deviation, 2.4 versus 2.7. The lowest (2 points) and highest (9 points) scores, as well as the scoring range (7 points), are even identical. The variability ratio shows, on the other hand, that the male group (39%) is a little more homogeneous than the female group (48%). Most likely, however, this finding is attributable to a group size effect.

Tab. 6.18 Descriptive statistics narrative index by sex and grade

| Narrative index | | Total N | Mean | Standard Deviation | Median | Mode † | Min. | Max. |
|---|---|---|---|---|---|---|---|---|
| Grade 1 | Male | 10 | 6.2 | 2.4 | 6.5 | 9 | 2 | 9 |
| | Female | 21 | 5.6 | 2.7 | 6.0 | 2 | 2 | 9 |
| | Total | 31 | 5.8 | 2.6 | 6.0 | 9 | 2 | 9 |
| Grade 4 | Male | 7 | 9.0 | 0 | 9.0 | 9 | 9 | 9 |
| | Female | 21 | 8.9 | 0.5 | 9.0 | 9 | 7 | 9 |
| | Total | 28 | 8.9 | 0.4 | 9.0 | 9 | 7 | 9 |

† More than one mode exists.

The frequency distribution, which is shown in Fig. 6.22 (the corresponding statistics are given in Tab. 6.19), underlines this similarity between male and female first graders: Both groups’ scores are very similarly distributed and any differences can be considered negligible.
[Fig. 6.21 Mean narrative index score by sex and grade. y-axis: index score (0–9); male: Grade 1 6.2, Grade 4 9, increase 2.8; female: Grade 1 5.6, Grade 4 8.9, increase 3.3.]

[Fig. 6.22 Distribution of narrative index scores by sex in first grade. x-axis: index score (1–9); y-axis: % of sex (0–100); series: male first graders, female first graders.]

Tab. 6.19 Distribution of narrative index scores by sex and grade (% = % within grade)

| Narrative index score | Grade 1 Male N | % | Grade 1 Female N | % | Grade 4 Male N | % | Grade 4 Female N | % |
|---|---|---|---|---|---|---|---|---|
| 2 | 1 | 10 | 5 | 24 | 0 | 0 | 0 | 0 |
| 4 | 2 | 20 | 4 | 19 | 0 | 0 | 0 | 0 |
| 5 | 1 | 10 | 0 | 0 | 0 | 0 | 0 | 0 |
| 6 | 1 | 10 | 2 | 10 | 0 | 0 | 0 | 0 |
| 7 | 2 | 20 | 5 | 24 | 0 | 0 | 1 | 5 |
| 8 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 5 |
| 9 | 3 | 30 | 5 | 24 | 7 | 100 | 19 | 90 |
| Total | 10 | 100 | 21 | 100 | 7 | 100 | 21 | 100 |

Both sexes’ performance follows the general trend of a significant increase of the mean index score (cf. Fig. 6.21, Tab. 6.18). The two groups’ mean scores rise a little differently from each other due to the female group’s lower mean in first grade, but the increase is statistically (very highly) significant for both male (U=10.50, p<0.05, η²=0.45) and female (U=60.00, p<0.001, η²=0.49) participants. At the same time the interindividual variation drops from a standard deviation of 2.4 to 0, i.e. from a variability ratio of 39% to 0%, for male and from 2.7 to 0.5, i.e. 48% to 6%, for the female group. Consequently, the performance of male and female participants is even more similar in fourth grade: Again, there is neither a significant observable nor a statistically significant difference between the two groups’ mean scores (9.0 vs. 8.9; U=66.50, ns) and their interindividual variation is also very similar. The only observable differences between female and male fourth graders are their minimum and maximum scores and therefore also the range of scores (2 and 0, respectively). [18] The frequency distribution, given in Fig. 6.23, shows that the only two fourth graders scoring fewer than 9 points are both females.
Male fourth graders, therefore, seem to perform slightly better. In absolute numbers, however, this means that two of the females do not score at ceiling. Keeping in mind the different group sizes, this “male advantage” seems again attributable to a group size effect rather than a true difference between male and female participants.

[18] At the same time median and mode are identical, however, (both 9 points) for the two groups.

[Fig. 6.23 Distribution of narrative index scores by sex in fourth grade. x-axis: index score (1–9); y-axis: % of sex (0–100); series: male fourth graders, female fourth graders.]

To sum up, neither significant observable nor statistically significant differences between male and female participants were found for either grade. Both sexes did, however, follow the general trends of an increase in the mean index score, which was statistically significant, and a decrease in interindividual variation.

6.3.2.3 Index by experience group

The mean index scores obtained by the two different experience groups in first and fourth grade are shown in Fig. 6.24; the corresponding descriptive statistics are given in Tab. 6.20. First graders with bilingual preschool experience clearly outperform those without such prior experience: The bilingual group’s mean index score is 6.8 as opposed to 4.3 for the mono group. Statistically, this difference is very highly significant and has a solid effect size (U=50.50, p<0.01, η²=0.23). The respective mean scores indicate that bili first graders realize on average 76% but mono first graders merely 48% of a fully global narrative structure (indicated by 9 points on the index). That is, bili first graders already realize large parts of a global narrative structure. Mono first graders, on the other hand, realize some of the important components but do not yet seem to use them in their function for globally organizing a narrative. At the same time the mono group’s interindividual variation is again larger, since they reach a standard deviation of 2.5 and a variability ratio of 60%, while the bili group’s standard deviation is 2.2 and its variability ratio only 32%.

[Fig. 6.24 Mean narrative index score by experience group and grade †. y-axis: index score (0–9); mono: Grade 1 4.3, Grade 4 9.0, increase 4.7***; bili: Grade 1 6.8, Grade 4 8.8, increase 2***.]
† ‘Mono’ denotes children with exclusively monolingual preschool experience, ‘bili’ those with bilingual preschool experience.

Tab. 6.20 Descriptive statistics narrative index by experience group and grade

| Narrative index | | Total N | Mean | Standard Deviation | Median | Mode | Min. | Max. |
|---|---|---|---|---|---|---|---|---|
| Grade 1 | Mono † | 12 | 4.3 | 2.5 | 4 | 2 | 2 | 9 |
| | Bili | 19 | 6.8 | 2.2 | 7 | 9 | 2 | 9 |
| | Total | 31 | 5.8 | 2.6 | 6 | 9 | 2 | 9 |
| Grade 4 | Mono | 12 | 9.0 | 0 | 9 | 9 | 9 | 9 |
| | Bili | 16 | 8.8 | 0.5 | 9 | 9 | 7 | 9 |
| | Total | 28 | 8.9 | 0.4 | 9 | 9 | 7 | 9 |

† ‘Mono’ denotes children with exclusively monolingual preschool experience, ‘bili’ those with bilingual preschool experience.

The bili group’s advantage in first grade is corroborated by median and mode (cf. Tab. 6.20) as well as by the frequency distribution in first grade, which is shown in Fig. 6.25. The corresponding values are given in Tab. 6.21.

[Fig. 6.25 Distribution of narrative index scores by experience group in first grade †. x-axis: index score (1–9); y-axis: % of experience group (0–100); series: % of mono first graders, % of bili first graders.]

Tab. 6.21 Distribution of narrative index scores by experience group and grade (% = % within grade)

| Narrative index score | Grade 1 Mono † N | % | Grade 1 Bili N | % | Grade 4 Mono N | % | Grade 4 Bili N | % |
|---|---|---|---|---|---|---|---|---|
| 2 | 5 | 42 | 1 | 5 | 0 | 0 | 0 | 0 |
| 4 | 3 | 25 | 3 | 16 | 0 | 0 | 0 | 0 |
| 5 | 0 | 0 | 1 | 5 | 0 | 0 | 0 | 0 |
| 6 | 1 | 8 | 2 | 11 | 0 | 0 | 0 | 0 |
| 7 | 2 | 17 | 5 | 26 | 0 | 0 | 1 | 6 |
| 8 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 6 |
| 9 | 1 | 8 | 7 | 37 | 12 | 100 | 14 | 88 |
| Total | 12 | 100 | 19 | 100 | 12 | 100 | 16 | 100 |

† ‘Mono’ denotes children with exclusively monolingual preschool experience, ‘bili’ those with bilingual preschool experience.

The distribution demonstrates that over one third of bili first graders (37%, i.e. 7 of 19) scored the highest possible number of points on the index, namely 9, as opposed to only one child of the mono group (8%, 1 of 12). The two highest scores taken together, i.e. 7 and 9 points, are obtained by the majority of bili first graders, namely 63% (12 of 19) as compared to only 25% of mono first graders (3 of 12). Almost half of the mono group (42%, 5 of 12), on the other hand, score the lowest number of points observed, namely 2, as opposed to only one bili first grader (5%, 1 of 19). At the same time the two lowest scores together (2 and 4 points) account for the majority of mono first graders’ results, namely 67% (8 of 12), but for only 21% of the bili group (4 of 19). Consequently, an index score between two and four, i.e. a largely undeveloped global structure, seems to be more typical for mono first graders, while a score between 7 and 9, i.e. an (almost) fully developed global organization, seems to be more typical for bili first graders. Thus, the distribution of index scores confirms the conclusions drawn from the mean.

Both experience groups follow the general trend of an increase in mean score and a decrease in interindividual variation from first to fourth grade (Fig. 6.24 and Tab. 6.20). The mono group’s mean index score increases more strongly (+4.7); statistically this increase was also very highly significant (U=6.00, p<0.001, η²=0.76). The bili group had a comparatively lower increase (+2), which was, nevertheless, statistically highly significant (U=65.50, p<0.01, η²=0.31). These differences in increase can be explained by the first grade scores: Due to a much lower first grade score, the mono group has a larger necessity, and additionally more room, for increase.
This is also reflected in the effect sizes, which show that grade explains 31% of the variation in the bili but 76% in the mono index scores.

As a result of the different increase rates, the two experience groups perform very similarly in fourth grade (Fig. 6.24 and Tab. 6.20). The mean score of the mono group is at ceiling (9 points on the index), which means that all of its members realize a fully global narrative structure. The bili group has a slightly lower mean score of 8.8 points, however, which corresponds to “only” 98% of a global narrative structure. This lower mean score is explainable through the bili fourth graders’ interindividual differences, which are reflected by the frequency distribution given in Fig. 6.26. As the frequency distribution shows, 100% of the mono participants score 9 points on the index as opposed to only 88% of the bilis. More specifically, 6% of the bili participants score 7 and 6% score 8 points. That is, only the bili group has any variation at all in fourth grade (SD=0.5, 6% of the mean). In absolute numbers, however, the 12% of bili fourth graders not scoring 9 points correspond to only two of the 16 bili fourth graders (C1-G4-15 and C5-G4-20). [19] In combination with the fact that the difference between mono and bili fourth graders is not statistically significant (U=84.00, ns) this leads to the conclusion that there is no significant difference in fourth grade results attributable to L2 preschool experience.

[19] C1-G4-15, who scored 7 points, was not given credit for the SETTING since the dog was not mentioned. C5-G4-20 did not produce a second attempt and thus scored eight points.
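The U values reported for the experience groups can be retraced from the score distribution in Tab. 6.21. The following sketch assumes that the Mann-Whitney U statistic is obtained by counting pairwise comparisons (ties counted as one half) and that the smaller of the two possible U values is the one reported; significance levels and η² are not computed, and all names are illustrative.

```python
def expand(freq):
    """Turn {score: count} into a list of individual scores."""
    return [score for score, n in sorted(freq.items()) for _ in range(n)]

def mann_whitney_u(x, y):
    """Return the smaller Mann-Whitney U of two samples.

    Each pair (xi, yj) contributes 1 if xi > yj, 0.5 if tied, 0 otherwise.
    """
    u_x = sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0
              for xi in x for yj in y)
    u_y = len(x) * len(y) - u_x
    return min(u_x, u_y)

# Index score frequencies copied from Tab. 6.21.
MONO_G1 = expand({2: 5, 4: 3, 6: 1, 7: 2, 9: 1})
BILI_G1 = expand({2: 1, 4: 3, 5: 1, 6: 2, 7: 5, 9: 7})
MONO_G4 = expand({9: 12})
BILI_G4 = expand({7: 1, 8: 1, 9: 14})

if __name__ == "__main__":
    print("mono vs. bili, grade 1:", mann_whitney_u(MONO_G1, BILI_G1))
    print("mono vs. bili, grade 4:", mann_whitney_u(MONO_G4, BILI_G4))
    print("mono, grade 1 vs. 4:", mann_whitney_u(MONO_G1, MONO_G4))
    print("bili, grade 1 vs. 4:", mann_whitney_u(BILI_G1, BILI_G4))
```

Under these assumptions the four comparisons yield the U values quoted in this section (50.5, 84.0, 6.0 and 65.5).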
To sum up, L2 preschool experience was again found to have a significant impact on first but not on fourth grade results: Mean score and distribution showed that the majority of bili first graders was able to realize either a fully global narrative structure or large parts of it, while this was not the case for the mono group. At the same time the bili group’s results were more homogeneous. Both groups’ index score followed the overall trend of an increase in mean score and homogeneity of results. Due to different increase rates, the influence of L2 preschool experience had disappeared by the end of fourth grade; that is, both experience groups told stories with a fully global narrative structure in fourth grade. The statistical analysis confirmed the significant bili advantage in first grade and both groups’ significant increase in mean score from first to fourth grade, as well as the lack of significant differences in fourth grade.

[Fig. 6.26 Distribution of narrative index scores by experience group in fourth grade †. x-axis: index score (1 to 9); y-axis: % of experience group. Mono fourth graders: 100% at 9 points; bili fourth graders: 6% at 7, 6% at 8, 88% at 9 points.]

6.3.3 Summary: Narrative index

This section presented the results for a narrative index indicating the degree of coherence of participants’ stories, i.e. the degree to which the stories are organized along a global narrative structure, on a range from 1 to 9 points. The purpose of this index was to complement the results obtained for both the total number of components and the individual narrative components by focusing on a small selection of components crucial for producing a coherent story. It was investigated whether there were any differences in global narrative structure, as measured by the index score, attributable to grade, sex or experience group.
Grade was found to have a significant influence with respect to mean index score and interindividual differences: Observed and statistical results showed that fourth graders obtained a significantly higher mean score than first graders (8.9 vs. 5.8). At the same time participants’ interindividual variation was significantly lower in fourth grade. However, a remarkable overlap between first and fourth grade scores was observed, which showed that quite a large number of first graders were able to produce an (almost) fully global narrative structure; that is, 48% of the first graders achieved an index score also obtained by at least one of the fourth graders; more than half of these first graders even realized the maximum score of 9 points on the index.

Sex did not significantly influence participants’ index results, i.e. neither substantial observed differences nor statistically significant ones were found between male and female participants in either grade. At the same time both sexes followed the general trend of an increase in mean score and a decrease in interindividual variation from first to fourth grade.

L2 preschool experience, on the other hand, had a strong influence on participants’ score. However, this influence was limited to first grade. That is, the results by experience group qualified the overall results in that an (almost) fully global narrative was produced mainly by first graders with L2 preschool experience. More specifically, seven of the eight first graders (88%) achieving 9 points and five of the seven first graders (71%) achieving 7 points belonged to the bili group. Merely one mono first grader, on the other hand, obtained 9 points and only two of them obtained 7.

To sum up, the narrative index results showed that both first and fourth graders were able to produce fully coherent narratives. However, almost exclusively first graders with bilingual preschool experience were able to produce fully globally organized stories, i.e.
an index score of 9 points. It is only by the end of fourth grade that a global organization has become the rule for all participants.

6.4 Narrative coherence: Summary

In the previous sections the results of three different measures were described, which were used to investigate the global narrative structure of participants’ stories, i.e. their coherence: the total number of components in participants’ stories, the frequency of each of the 14 narrative components under investigation, and an index of narrative structure consisting of a reduced number of components crucial for a globally organized story. All results were examined as to a possible influence of grade, sex, and L2 preschool experience. Based on the findings of previous studies the following hypotheses (cf. ch. 4.6) had been put forward with respect to the research questions (cf. ch. 3.3) addressed in the analysis of narrative coherence:

1. Grade/age has a significant influence on the narrative coherence of participants’ stories.
2. Participants’ stories become more coherent from first to fourth grade as measured by the number of narrative components.
3. There are qualitative differences in narrative coherence between grades as measured by differences in frequency among the individual narrative components.
4. Sex does not have a significant influence on the narrative coherence of participants’ stories.

First and foremost it was found that all first graders were able to produce at least one narrative component (minimum score) and that none of the first or fourth graders realized all 14 components; the highest score was 12. At the same time each of the 14 narrative components was produced by at least one participant in first as well as in fourth grade. All three measures were strongly influenced by grade.
That is, participants produced significantly more and a greater variety of components in fourth than in first grade, the great majority of narrative components was produced more often by fourth graders, and participants’ index scores were significantly higher in fourth grade. Hypotheses (1), (2), and (3) were thus confirmed. At the same time the interindividual variation (measurable for the total number of components and the index score) was significantly lower in fourth than in first grade. Qualitatively, the narrative index and an order of difficulty based on the frequency of the individual components showed that the coherence of participants’ stories was only beginning to emerge in first grade. By the end of fourth grade, however, it was fully developed for almost all participants. Consequently, a developmental pattern emerges: The coherence of participants’ stories strongly increases from first to fourth grade, and by the end of fourth grade the development of coherence can be seen as concluded. However, a closer look at the distributions of total number of components and index score qualified this overall pattern, since there was a large overlap between first and fourth grade scores; that is, almost half of all first graders (45%) produced stories which were as coherent as those of one or more fourth graders and 26% of first graders told fully coherent stories (as measured by an index score of 9 points).

Participants’ sex did not have any significant influence on coherence results. That is, male and female participants performed very similarly in both first grade and fourth grade regarding all three measures of coherence. Additionally, they followed the overall developmental pattern described above. Hypothesis (4) was thus also confirmed.

L2 preschool experience, on the other hand, was found to have a significant influence on all three coherence measures in first grade, which qualifies the overall results.
That is, first graders with L2 preschool experience (the bili group) produced a significantly higher number of components, realized almost every component more often, and obtained a significantly higher index score than the first graders without prior L2 experience (the mono group). At the same time bili first graders’ results showed less interindividual variation. This means that the overlap of results between first grade and fourth grade described above is largely attributable to the first grade results of children with L2 preschool experience. Thus, first graders with bilingual preschool experience produced coherent stories significantly more often than those without, and a very substantial number even produced stories as (fully) coherent as those of the fourth graders.²⁰ Due to distinct increase rates, however, these differences have evened out by the end of fourth grade so that by then coherent stories have become the rule for all participants regardless of experience group.

²⁰ Sixty-three percent as measured by the index and 69% as measured by the total number of components and thus the majority of bili first graders. If “atypical” fourth graders’ scores are disregarded, however, the percentages decrease to 37% and 32% respectively.

7 The development of cohesion: Results

7.1 Overall cohesive density

The present section is aimed at answering the question of how cohesive participants’ stories are, as measured by their overall degree of cohesive density, and whether there are any differences in cohesion attributable to grade, sex or L2 preschool experience. The observed results will be described first, followed by the statistical results. Finally, the results obtained will be summarized. As stated in ch. 5, possible differences between the longitudinal and the cross-sectional data set were explored for all measures of cohesion and coherence.
No significant differences were found, so the respective first and fourth grade cohorts were treated as one first/fourth grade data set and cohorts are not differentiated in the discussion of any of the cohesion results in this section or the following ones.

7.1.1 Observed results overall cohesive density

Participants’ mean cohesive density in first and fourth grade as well as the corresponding increase, i.e. the group difference between grade one and four, is depicted in Fig. 7.1; the descriptive statistics are given in Tab. 7.1.

[Fig. 7.1 Mean cohesive density by grade. y-axis: no. of ties per clause. Grade 1: 3.45; Grade 4: 3.91; increase: 0.46.]

All first graders are able to produce at least some cohesive ties; this is indicated by the first grade minimum score of 2.17 ties per clause. On average, first graders produce 3.45 cohesive devices per clause (or approximately 7 in 2 clauses) which refer to a preceding textual element.¹ At the same time the interindividual variation, as measured by the standard deviation, is low compared to the variation found for coherence, namely 0.53 or only 15% in relation to the mean (variability ratio). The range (2.27) from lowest to highest score is quite high, on the other hand, i.e. the first grader with the highest cohesive density produces more than twice as many cohesive devices per clause as the first grader with the lowest score.

Tab. 7.1 Descriptive statistics cohesive density by grade

Cohesive density   Total N   Mean   Standard Deviation   Median   Min.   Max.
Grade 1            31        3.45   0.53                 3.46     2.17   4.44
Grade 4            28        3.91   0.32                 3.88     3.43   4.65
Total              59        3.67   0.50                 3.73     2.17   4.65

From first to fourth grade participants’ overall cohesive density increases by 0.46, i.e. fourth graders produce roughly 5 cohesive devices more in 10 clauses. As a result, fourth graders average a cohesive density of 3.91, i.e. they produce roughly 4 cohesive ties per clause.
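The two measures of interindividual variation used here, the variability ratio (standard deviation relative to the mean) and the range (maximum minus minimum), can be recomputed directly from Tab. 7.1. The following Python sketch is only an illustration; the class and attribute names are hypothetical and not part of the study.

```python
from dataclasses import dataclass


@dataclass
class GroupStats:
    """Descriptive statistics for one group, copied from Tab. 7.1."""
    n: int
    mean: float
    sd: float
    minimum: float
    maximum: float

    @property
    def variability_ratio(self) -> float:
        # Standard deviation in relation to the mean
        return self.sd / self.mean

    @property
    def score_range(self) -> float:
        # Distance between the highest and the lowest score
        return self.maximum - self.minimum


grade1 = GroupStats(n=31, mean=3.45, sd=0.53, minimum=2.17, maximum=4.44)
grade4 = GroupStats(n=28, mean=3.91, sd=0.32, minimum=3.43, maximum=4.65)

for label, group in (("Grade 1", grade1), ("Grade 4", grade4)):
    print(f"{label}: variability ratio {group.variability_ratio:.0%}, "
          f"range {group.score_range:.2f}")
print(f"Increase in mean: {grade4.mean - grade1.mean:.2f}")
```

Expressing the standard deviation relative to the mean makes the variation comparable across measures with different scales, which is presumably why the ratio is reported alongside the raw standard deviation.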
At the same time the interindividual variation decreases from 0.53 or 15% of the mean in first grade to an even lower 0.32 or 8% of the mean in fourth grade. This further reduction of the interindividual variation is also evident from the range of results, which decreases from 2.27 to 1.22.

Fig. 7.2 shows the distribution of participants’ individual cohesive density scores in first and fourth grade; the corresponding values are given in Tab. 11.1 and Tab. 11.2 in the Appendix (ch. 11.1). As Fig. 7.2 shows, the first grade curve is much steeper than the one in fourth grade, reflecting the higher degree of variation in first grade. At the same time the distribution qualifies the observed difference in mean cohesive density between first grade and fourth grade. That is, there is an overlap of scores between 3.43 (fourth grade minimum) and 4.44 ties per clause (first grade maximum). Ninety-three percent of fourth graders (26 of 28) as compared to 58% of first graders (18 of 31) obtain a cohesive density within this range. At the same time scores below 3.43 are realized exclusively by first graders (42%, 13 of 31) and the only two participants (7%, C5-G4-12 and C5-G4-18) with a higher cohesive density than the overlap are fourth graders. Thus, the distribution indicates that the main difference between first and fourth grade cohesive density lies in the 42% of first graders producing cohesive density scores below the overlap.

¹ In the following, the terms cohesive device and (cohesive) tie will be used interchangeably, since only cohesive devices forming cohesive ties were considered for the analysis.

An additional comparison of minimum and maximum scores in first and in fourth grade shows that the minimum score increases much more strongly (by 1.26) than the maximum score (by 0.21).
This leads to the conclusion that changes in cohesive density from first to fourth grade mainly concern an increase in the lower scoring ranges, while there is a comparative stability in the higher ranges.

To sum up, fourth graders’ stories were found to have a higher mean cohesive density and a smaller amount of interindividual variation, even if the results were comparatively homogeneous in both grades. The distribution of the individual results showed that more than half of all first graders produced a cohesive density that could also have been produced by a fourth grader. At the same time the distribution indicated that the major change in cohesive density from first to fourth grade concerns the lower density scores, which increase markedly, while the higher scoring ranges largely remain the same.

[Fig. 7.2 Distribution of participants’ individual cohesive density scores by grade. x-axis: individual scores in increasing order; y-axis: no. of ties per clause; series: Grade 1, Grade 4.]

7.1.2 Overall cohesive density by sex

Male and female participants’ mean cohesive density in first and in fourth grade as well as the corresponding increase is depicted in Fig. 7.3; the descriptive statistics are given in Tab. 7.2.

Tab. 7.2 Descriptive statistics cohesive density by sex and grade

Cohesive density (no. of ties per clause)

                   Total N   Mean   Standard Deviation   Median   Minimum   Maximum
Grade 1   Male     10        3.65   0.45                 3.57     2.85      4.44
          Female   21        3.36   0.55                 3.40     2.17      4.42
          Total    31        3.45   0.53                 3.46     2.17      4.44
Grade 4   Male     7         3.92   0.27                 3.78     3.64      4.34
          Female   21        3.90   0.34                 3.88     3.43      4.65
          Total    28        3.91   0.32                 3.88     3.43      4.65

In first grade male participants produce, on average, 0.29 more ties per clause, or about 3 more in 10 clauses, than female participants. However, the median indicates that 50% of the male participants actually score 3.57 or higher, while 50% of females score 3.40 or higher, i.e.
the median points to the difference between sexes being smaller than what is indicated by the mean. At the same time male and female first graders’ interindividual variation is very similar and, just as for the overall results, comparatively low. Thus, male results have a standard deviation of 0.45 or 12% of the mean, while females show a marginally higher variation of 0.55 or 16% of the mean. The same is evident in the range of scores, which is 1.59 for male and 2.25 for female first graders. However, the wider range of scores for female first graders is most likely attributable to their larger group size.

[Fig. 7.3 Mean cohesive density by grade and sex. y-axis: no. of ties per clause. Male: Grade 1 3.65, Grade 4 3.92, increase 0.27; female: Grade 1 3.36, Grade 4 3.90, increase 0.54.]

The distribution of participants’ individual cohesive density scores in first grade as a function of sex is depicted in Fig. 7.4; the corresponding values are given in Tab. 11.3 and Tab. 11.4 in the Appendix (ch. 11.2). The curves confirm the slightly larger interindividual variation in female first graders’ results described above; however, again they show a great similarity in results in terms of an overlap between the cohesive density scores of male and female first graders. This overlap ranges from the male minimum of 2.85 to the female maximum score of 4.42 and it includes 90% of male first graders (9 of 10) as well as 86% of female first graders (18 of 21). Three female first graders (14%) produce a cohesive density below the overlap,² while one male first grader (10%, C1-G1-2) scores marginally above the overlap (4.44). However, these differences seem negligible, since the group sizes are so different. The large overlap and the relatively few participants scoring below or above the overlap both seem rather to indicate that male and female participants perform identically.
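The overlap used throughout this chapter is the interval shared by two groups’ score ranges: it runs from the higher of the two minimum scores to the lower of the two maximum scores. As a minimal sketch (the function names are hypothetical; the values are the first grade figures from Tab. 7.2 and the counts given above), this can be written as:

```python
def overlap(range_a, range_b):
    """Return the interval shared by two (min, max) score ranges, or None."""
    low = max(range_a[0], range_b[0])    # higher of the two minimum scores
    high = min(range_a[1], range_b[1])   # lower of the two maximum scores
    return (low, high) if low <= high else None


def share(count, group_size):
    """Percentage of a group, rounded to whole percent."""
    return round(100 * count / group_size)


male_g1 = (2.85, 4.44)     # male first graders: minimum, maximum
female_g1 = (2.17, 4.42)   # female first graders: minimum, maximum

shared = overlap(male_g1, female_g1)
male_share = share(9, 10)      # male first graders within the overlap
female_share = share(18, 21)   # female first graders within the overlap

print(shared, male_share, female_share)
```

Note that the boundaries of the overlap depend only on the most extreme score of each group, so a single unusually high or low score can shift them considerably, especially in small groups.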
Both the male and female group’s mean density increases from first to fourth grade; however, female participants’ cohesive density increases by 0.54, i.e. by around 5 cohesive devices per 10 clauses, while males’ cohesive density increases by only half as much, namely by 0.27, i.e. by close to 3 cohesive devices per 10 clauses. As a result, male and female fourth graders’ results are even more similar (Fig. 7.3, Tab. 7.2); the difference between male and female mean cohesive density is now only 0.02, i.e. 2 cohesive devices in 100 clauses.

² C1-G1-4 achieved a cohesive density of 2.17, C8-G1-18 scored 2.62, and C8-G1-21 scored 2.66.

[Fig. 7.4 Distribution of participants’ individual cohesive density scores by sex in first grade. x-axis: individual scores in increasing order; y-axis: no. of ties per clause; series: male first graders, female first graders.]

As the mean score increases, both groups’ interindividual variation becomes even smaller. Thus, the male group’s standard deviation decreases from 0.45 or 12% of the mean in first to 0.27 or merely 7% in fourth grade, while for the female group it decreases from 0.55 or 16% of the mean to 0.34 or 9%. The same is indicated by the range of scores, which drops from 1.59 in first grade to 0.70 in fourth grade for male participants and from 2.25 to 1.22 for female participants. Standard deviation, variability ratio and range indicate that by the end of fourth grade female participants continue to have a slightly larger variation in results. Altogether, however, these differences are as negligible as in first grade, especially when keeping in mind the large difference in group size.

Fig. 7.5 gives the distribution of participants’ individual cohesive density scores in fourth grade as a function of sex; the corresponding values are given in Tab. 11.5 and Tab. 11.6 in the Appendix (ch. 11.2). Again, there is a strong overlap between male and female cohesive density.
Thus, all male fourth graders (7 of 7) and 62% of female fourth graders (13 of 21) achieve a cohesive density between 3.64 (male fourth grade minimum) and 4.34 ties per clause (male fourth grade maximum). Of the remaining female fourth graders, 24% (5 of 21) score below that range, while 14% (3 of 21) score above the overlap.³ However, keeping in mind a possible group size effect, this difference in the distribution of scores again needs to be interpreted as indicating no distinction between the two sexes.

³ Up to 0.21 ties per clause or roughly 2 in 10 clauses fewer and, respectively, up to 0.31 ties per clause, i.e. about 3 ties, more per 10 clauses.

[Fig. 7.5 Distribution of participants’ individual cohesive density scores by sex in fourth grade. x-axis: individual scores in increasing order; y-axis: no. of ties per clause; series: male fourth graders, female fourth graders.]

Fig. 7.6 shows the distribution of male and female individual cohesive density scores in first and in fourth grade. A comparison of both sexes’ first and fourth grade individual scores reflects, first of all, an increase in each sex’s minimum cohesive density from first to fourth grade (cf. also Tab. 7.2). At the same time the curves show that there are considerable overlaps between first grade and fourth grade for both sexes. Male participants’ cohesive density overlaps in the range of 3.64 (fourth grade minimum) to 4.34 ties per clause (fourth grade maximum). All seven fourth graders score within this range and also 40% of the first graders (4 of 10). Half of the male first graders (50%, 5 of 10) obtain a cohesive density below the overlap, while one of them (10%) scores above it. Female participants’ cohesive density, on the other hand, overlaps between 3.43 (fourth grade minimum) and 4.42 ties per clause (first grade maximum). Eighty-six percent of female fourth graders (18 of 21) and 48% of first graders (10 of 21) score in this range.
The remaining 52% of female first graders (11 of 21) score below the overlap, while the remaining three fourth graders (14%) score above it. Thus, a comparison of either sex’s first and fourth grade curves confirms the general trend described in the last section, namely a reduction in lower scores with a substantial overlap between first and fourth graders in the upper scoring ranges.

[Fig. 7.6 Distribution of participants’ individual cohesive density scores by sex and grade. x-axis: individual scores in increasing order; y-axis: no. of ties per clause; series: male first graders, female first graders, male fourth graders, female fourth graders.]

To sum up, only negligible differences in cohesive density were observed between male and female participants in either grade, which are most likely attributable to group size effects. At the same time the analysis confirmed the pattern identified for the overall results with respect to mean score, interindividual variation, and the distribution of scores. That is, each sex’s mean score increased from first to fourth grade and the interindividual variation, which was again comparatively low even in first grade, decreased. At the same time the distribution of male and female participants’ cohesive density scores showed that quite a large percentage of first graders, males as well as females, produced stories which were as cohesive as those of their fourth grade peers and that for both sexes the main difference between grades involved a disappearance of the lower first grade density scores.

7.1.3 Overall cohesive density by experience group

Participants’ mean cohesive density by grade and experience group is given in Fig. 7.7 together with its increase from first to fourth grade; the corresponding descriptive statistics are given in Tab. 7.3.

† ‘Bili’ denotes children with bilingual preschool experience, ‘mono’ those with exclusively monolingual preschool experience.
First graders with L2 preschool experience achieve a somewhat higher cohesive density than those without such prior experience; thus, the bili group achieves an average of 3.59 ties per clause or roughly 36 per 10 clauses, while the mono group produces 3.24 ties per clause or roughly 32 per 10 clauses, a difference of 0.35 cohesive devices per clause or almost 4 in 10 clauses.

[Fig. 7.7 Cohesive density by experience group and grade †. y-axis: no. of ties per clause. Mono: Grade 1 3.24, Grade 4 3.87, increase 0.63; bili: Grade 1 3.59, Grade 4 3.93, increase 0.34.]

Tab. 7.3 Descriptive statistics cohesive density by experience group and grade

Cohesive density (no. of ties per clause)

                   Total N   Mean   Standard Deviation   Median   Minimum   Maximum
Grade 1   Mono†    12        3.24   0.63                 3.15     2.17      4.42
          Bili     19        3.59   0.43                 3.67     2.66      4.44
          Total    31        3.45   0.53                 3.46     2.17      4.44
Grade 4   Mono     12        3.87   0.30                 3.83     3.58      4.65
          Bili     16        3.93   0.34                 3.89     3.43      4.64
          Total    28        3.91   0.32                 3.88     3.43      4.65

As the median shows, mono and bili results are slightly skewed, however, and the bili group’s advantage is a little more pronounced than the mean suggests. At the same time mono first graders’ results are a little more heterogeneous than those of the bilis. The bili group’s lower heterogeneity is evident from the standard deviation, which is 0.63 or 19% of the mean for the mono group as opposed to 0.43 or 12% for the bili group, and from the range of cohesive density scores, which is also higher for mono than bili first graders (2.25 vs. 1.78).

[Fig. 7.8 Distribution of participants’ individual cohesive density scores by experience group in first grade †. x-axis: individual scores in increasing order; y-axis: no. of ties per clause; series: mono first graders, bili first graders.]

The distribution of mono and bili first graders’ individual cohesive density scores is depicted in Fig. 7.8; the corresponding values are given in Tab. 11.7 and Tab. 11.8 in the Appendix (ch. 11.3). There is a strong overlap between mono and bili first graders, which ranges from 2.66 (bili minimum) to 4.42 ties per clause (mono maximum). Eighty-three percent of mono (10 out of 12) and 95% of bili first graders (18 out of 19) score within this range. The two remaining mono first graders (17%, C1-G1-4 & C8-G1-18) score (slightly) below this range (2.17 and 2.62 respectively). The remaining bili first grader (5%, C1-G1-2), on the other hand, scores slightly above the overlap and at the same time achieves the highest cohesive density of all first graders, namely 4.44 ties per clause. Thus, the distribution of individual cohesive density results indicates at best a marginal advantage of first graders with L2 preschool experience over those without.

From first to fourth grade both groups’ mean cohesive density increases (Fig. 7.7, Tab. 7.3), but the rate of increase varies: The mono group’s mean increases comparatively strongly, namely by 0.63 ties per clause, while the bili group’s mean density increases by only about half as much, namely by 0.34 ties per clause. As a result, the performance of both experience groups is almost the same in fourth grade with respect to their mean cohesive density: Bili fourth graders achieve a mean cohesive density of 3.93 ties per clause, i.e. approximately 39 per 10 clauses, while mono fourth graders achieve a marginally lower mean of 3.87 ties per clause, i.e. also approximately 39 per 10 clauses. The difference between the mono group and the bili group has thus shrunk to a negligible 0.06 ties per clause, i.e. merely 6 cohesive devices in 100 clauses. Both groups’ interindividual variation becomes even lower from first to fourth grade.
The mono group’s standard deviation decreases from 0.63 ties per clause (19% of the mean) in first to merely 0.30 (8%) in fourth grade and the bili group’s standard deviation from 0.43 (12% of the mean) to 0.34 (9%). This drop in interindividual variation is also evident in the range of scores: The bili group’s range decreases from 1.78 in first to 1.21 in fourth grade, and the mono group’s range from 2.25 to 1.07 ties per clause. This means that by the end of fourth grade the differences in interindividual variation are also negligible.

Fig. 7.9 shows the distribution of participants’ individual cohesive density scores by experience group in fourth grade; the corresponding values are given in Tab. 11.9 and Tab. 11.10 in the Appendix (ch. 11.3). In fourth grade there is again a strong overlap of cohesive density scores between mono and bili participants. Ninety-two percent of mono fourth graders (11 of 12) and 88% of bili fourth graders (14 of 16) achieve a cohesive density between 3.58 (mono fourth grade minimum) and 4.64 ties per clause (bili fourth grade maximum). The remaining two bili fourth graders (13%, C5-G4-6 and C1-G4-17) realize cohesive density scores below the overlap (3.43 and 3.49 respectively), while the remaining mono fourth grader (8%, C5-G4-12) achieves a cohesive density that is very slightly above the overlap (4.65). Thus, the distribution of cohesive density results in fourth grade also shows virtually no difference between mono and bili participants.

The distribution of mono and bili participants’ individual cohesive density scores in first and fourth grade is depicted in Fig. 7.10. Fig.
7.10 shows that the distribution of scores by grade and experience group follows the general trend of a disappearance of the lower first grade scores as opposed to a relative stability in the upper first grade ranges (cf. also the respective minimum scores). However, this development is far more pronounced for the mono than for the bili group, confirming that the mono group’s cohesive density increases more strongly from first to fourth grade than does the bili group’s. Thus, there is an overlap between mono first and fourth graders’ cohesive density ranging from 3.58 (fourth grade minimum) to 4.42 ties per clause (first grade maximum). Only two of the 12 mono first graders (17%, C8-G1-16 & C1-G1-8) achieve a cohesive density within the overlapping range, though, as compared to 92% of the fourth graders (11 of 12). The remaining 83% of mono first graders (10 of 12) score up to 1.41 ties fewer per clause; the one remaining fourth grader (8%, C5-G4-12) achieves a cohesive density above the overlap (4.65). Bili first and fourth grade cohesive density results, on the other hand, overlap in the range of 3.43 (fourth grade minimum) to 4.44 ties per clause (first grade maximum). A full 68% of bili first graders (13 of 19) and 94% of the fourth graders (15 of 16) achieve a cohesive density within this range. The remaining 32% of bili first graders (6 of 19) achieve a cohesive density below the overlap, while the one remaining fourth grader (6%, C5-G4-18) scores higher (4.64).

[Fig. 7.9 Distribution of participants’ individual cohesive density scores by experience group in fourth grade †. x-axis: individual scores in increasing order; y-axis: no. of ties per clause; series: mono fourth graders, bili fourth graders.]
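The claim that the development is more pronounced for the mono group can be checked against Tab. 7.3 by comparing how far each group’s mean, minimum and maximum move from first to fourth grade. The Python sketch below is an illustration only; the dictionary layout and function names are assumptions, while the figures are taken from Tab. 7.3.

```python
# Mean, minimum and maximum cohesive density per group and grade (Tab. 7.3)
TAB_7_3 = {
    "mono": {"grade1": {"mean": 3.24, "min": 2.17, "max": 4.42},
             "grade4": {"mean": 3.87, "min": 3.58, "max": 4.65}},
    "bili": {"grade1": {"mean": 3.59, "min": 2.66, "max": 4.44},
             "grade4": {"mean": 3.93, "min": 3.43, "max": 4.64}},
}


def shifts(group):
    """Change from first to fourth grade for mean, minimum and maximum."""
    g1, g4 = TAB_7_3[group]["grade1"], TAB_7_3[group]["grade4"]
    return {key: round(g4[key] - g1[key], 2) for key in ("mean", "min", "max")}


def gap(grade):
    """Difference between bili and mono mean density in a given grade."""
    return round(TAB_7_3["bili"][grade]["mean"] - TAB_7_3["mono"][grade]["mean"], 2)


for group in ("mono", "bili"):
    print(group, shifts(group))
print("gap in grade 1:", gap("grade1"), "gap in grade 4:", gap("grade4"))
```

In both groups the minimum moves far more than the maximum, and the mono minimum moves most of all, which is the pattern described above: the lower scores disappear while the upper range stays largely stable.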
To sum up, participants with and without L2 preschool experience performed similarly in first and in fourth grade regarding the cohesive density of their stories, and they followed the general developmental trends outlined earlier. That is, both groups’ mean cohesive densities increased from first to fourth grade and their interindividual variation, which was already low in first grade, decreased even further. The distribution of participants’ individual cohesive density scores showed a substantial overlap between mono and bili participants in both grades, confirming the similarity in their results. The respective distributions also showed an overlap between first and fourth grade results for each group. While the majority of first graders with L2 preschool experience produced stories as cohesive as their fourth grade peers, however, this was the case for only a small percentage of mono participants. Consequently, the large overlap between first and fourth graders found for the overall results seems to be attributable especially to the bili group. In their development from first to fourth grade both groups again followed the overall trend, i.e. an increase mainly in the lower scores and a relative stability in the upper scoring ranges, even if this development was far more pronounced for the mono group.

[Fig. 7.10 Distribution of participants’ individual cohesive density scores by experience group and grade †. x-axis: individual scores in increasing order; y-axis: no. of ties per clause; series: mono first graders, bili first graders, mono fourth graders, bili fourth graders.]

7.1.4 Overall cohesive density: Statistical results

A factorial ANOVA was conducted to statistically test main and interaction effects of grade, sex, and L2 experience group on overall cohesive density (cf. ch. 5.2.3).
There was a significant main effect of grade (F(1, 51)=7.21, p<0.05, partial η²=0.12), but no main effects of sex (F(1, 51)=0.35, ns) or experience group (F(1, 51)=1.54, ns). There were no interaction effects either. Thus, the statistical analysis found that grade is a significant factor in explaining variance in cohesive density, although it accounts for only 12% of that variance and leaves the remaining 88% unexplained. The analysis also showed that sex and L2 preschool experience do not contribute significantly to explaining the variability in cohesive density.

7.1.5 Summary: Overall cohesive density

Section 7.1 and its subsections aimed to answer the question of how cohesive participants’ stories are (as measured by their degree of cohesive density) and whether there are any differences in cohesion attributable to grade, sex or L2 preschool experience. Based on the findings of previous studies (cf. ch. 4) the following hypotheses had been put forward with respect to these research questions (cf. ch. 3.5):

1. Grade/age has a significant influence on the cohesion of participants’ stories.
2. Sex does not have a significant influence on the cohesion of participants’ stories.
3. Participants’ stories become more cohesive from first to fourth grade as measured by the number of cohesive devices.

It is now possible to discuss to what extent these hypotheses are confirmed. The observed results first and foremost showed that all participants, even all first graders, were able to produce at least some cohesive ties. The lowest cohesive density (produced by a first grader) was 2.17, i.e. approximately 2 ties per clause. At the same time the interindividual differences in cohesive density were found to be very low in both first grade and fourth grade compared to those found for coherence.
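As a plausibility check on the effect size reported for grade in section 7.1.4, partial η² can be recovered from the F value and its degrees of freedom via the standard identity partial η² = (F · df_effect) / (F · df_effect + df_error). The sketch below is only an illustration of that identity; the function name is hypothetical and this is not necessarily how the value was computed in the study.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared derived from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of grade on overall cohesive density: F(1, 51) = 7.21
eta_grade = partial_eta_squared(7.21, 1, 51)
print(round(eta_grade, 2))  # 0.12
```

The result agrees with the reported value of 0.12, i.e. about 12% of the variance not attributable to the other factors.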
With respect to an influence of grade the observed results showed two developmental trends, namely:

• an increase in cohesive density, and
• a (further) decrease in variation.

Thus, participants’ mean cohesive density increased from 3.45 in first grade to 3.91 in fourth grade. That is, first graders produced approximately 7 ties in 2 clauses, while fourth graders produced approximately 4 ties in 1 or 8 ties in 2 clauses. This increase was found to be statistically significant, even if grade explained a relatively low percentage of the variance in results, namely 12%. As opposed to the increase in mean cohesive density, participants’ standard deviation decreased from 0.53 or 15% of the mean to 0.32 or 8% of the mean. Male and female participants, as well as the mono and bili experience groups, all followed this development. At the same time the distribution of participants’ individual density scores showed a large overlap between first grade and fourth grade, which helps to explain the low effect size; that is, a large part of first graders’ stories (58%) were as cohesive as those produced by fourth graders. The distribution showed that the major change in cohesive density from first to fourth grade concerned the disappearance of the lower first grade scores, also indicated by a higher minimum score in fourth grade. [4] Besides other possible factors not investigated in my study, this could point to stylistic variations or simply individual preferences as factors in explaining variance in cohesion scores once a certain minimum of cohesive devices has been reached. With respect to the influence of sex or experience group, there were neither significant observed nor statistically significant differences between groups in either grade regarding mean cohesive density, interindividual variation or the distribution of participants’ individual density scores.
However, L2 preschool experience had a significant influence on the overlap between first and fourth grade scores. That is, 68% of the bili first graders but only 17% of the mono first graders produced stories which were as cohesive as those of their fourth grade peers. To sum up, all three hypotheses were confirmed. The cohesion of participants’ stories generally increased from first to fourth grade, while the interindividual variation in cohesion decreased. At the same time a substantial number of first graders produced stories as cohesive as those produced by some fourth graders, the main change from first to fourth grade consisting of a reduction of the lower first grade scores. Sex and L2 preschool experience were not found to influence cohesive density except for a larger overlap of the bili group’s first and fourth grade cohesive density as opposed to a more pronounced difference between first and fourth grade cohesion on the part of the mono group.

[4] However, it should be kept in mind for all cohesive density results that the interindividual differences between first grade and fourth grade are often quite small, mostly describable only as a difference per 10 or even 100 clauses.

7.2 Cohesive density: Subcategories

The present section aims to answer the question of how often participants use the subcategories of cohesion (i.e. references, connectives, substitutions, ellipses and lexical cohesion) cohesively in first and in fourth grade. That is, the degree of the subcategories’ density in first and in fourth grade is investigated as well as any possible changes between first grade and fourth grade. Background variables taken into account besides grade are again sex and experience group. In the following, the observable and statistical results as well as a short summary will be given for each subcategory, followed by a summary covering all of the subcategories.
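To make the notion of subcategory density concrete, the following minimal sketch computes ties per clause for each subcategory of one story and sorts the subcategories by density. It assumes that overall cohesive density is the total number of cohesive ties divided by the number of clauses and that the five subcategories are exhaustive; the function names and all counts are hypothetical and not taken from the study's data.

```python
SUBCATEGORIES = ("reference", "connective", "substitution", "ellipsis", "lexical")


def densities(tie_counts, n_clauses):
    """Return (per-subcategory densities, overall density) in ties per clause."""
    if n_clauses <= 0:
        raise ValueError("a story needs at least one clause")
    per_category = {c: tie_counts.get(c, 0) / n_clauses for c in SUBCATEGORIES}
    overall = sum(tie_counts.get(c, 0) for c in SUBCATEGORIES) / n_clauses
    return per_category, overall


def frequency_order(per_category):
    """Subcategories sorted from highest to lowest density."""
    return sorted(per_category, key=per_category.get, reverse=True)


# Hypothetical story with 20 clauses (counts invented for illustration)
counts = {"reference": 24, "connective": 10, "substitution": 1,
          "ellipsis": 2, "lexical": 40}
per_category, overall = densities(counts, 20)
print(per_category["reference"])   # 1.2
print(overall)                     # 3.85
print(frequency_order(per_category))
```

Working with densities rather than raw counts keeps stories of different lengths comparable.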
By the end of this section, it will not only be possible to confirm or disprove the hypotheses restated in the last chapter with respect to the subcategories of cohesion, but also to shed light on the fourth hypothesis set up for cohesion, namely that there are no qualitative differences in cohesion between grades as measured by the frequency order of the subcategories of cohesion (cf. ch. 4.6).

7.2.1 References

7.2.1.1 Overall reference density

The present section analyzes how often participants use referential ties in first and in fourth grade and whether there are any differences attributable to grade, sex or L2 preschool experience. [5] Tab. 7.4 gives the descriptive statistics for participants’ reference density in first and fourth grade.

[5] The terms reference and referential tie will be used synonymously in the following, since only references functioning cohesively were included in the analysis.

Tab. 7.4 Descriptive statistics reference density
Reference density (No. of referential ties per clause)

| | Total N | Mean | Standard Deviation | Median | Minimum | Maximum |
|---|---|---|---|---|---|---|
| Grade 1 | 31 | 1.14 | 0.23 | 1.16 | 0.57 | 1.65 |
| Grade 4 | 28 | 1.21 | 0.17 | 1.22 | 0.76 | 1.62 |
| Total | 59 | 1.17 | 0.20 | 1.18 | 0.57 | 1.65 |

All participants produce at least some referential ties. This is evident from the minimum scores, which show that even first graders produce at least 0.57 referential ties per clause (minimum score), i.e. approximately 1 reference per 2 clauses. On average, first graders use 1.14 referential ties per clause, i.e. approximately 1 reference per clause. First graders’ interindividual differences, as measured by the standard deviation, amount to 0.23 or a variability ratio of 20% in relation to the mean, which is slightly higher than the overall interindividual variation described above. At the same time their results have a range of 1.08 references per clause from minimum (0.57) to maximum score (1.65), i.e. the highest reference density achieved is more than twice the lowest (Tab. 7.4).
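The two variation measures used here, the variability ratio (standard deviation relative to the mean) and the range from minimum to maximum score, can be reproduced from the summary values in Tab. 7.4 with a few lines. The function names are hypothetical; the figures are those of the table.

```python
def variability_ratio(standard_deviation, mean):
    """Standard deviation expressed as a fraction of the mean."""
    return standard_deviation / mean


def score_range(minimum, maximum):
    """Distance between the lowest and the highest score."""
    return maximum - minimum


# Grade 1 (Tab. 7.4): mean 1.14, SD 0.23, minimum 0.57, maximum 1.65
print(round(variability_ratio(0.23, 1.14) * 100))  # 20
print(round(score_range(0.57, 1.65), 2))           # 1.08

# Grade 4 (Tab. 7.4): mean 1.21, SD 0.17, minimum 0.76, maximum 1.62
print(round(variability_ratio(0.17, 1.21) * 100))  # 14
print(round(score_range(0.76, 1.62), 2))           # 0.86
```

Expressing the standard deviation relative to the mean makes variation comparable across measures with different means, e.g. reference density and overall cohesive density.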
Participants’ mean reference density increases only marginally from first to fourth grade; that is, fourth graders produce only 7 more references per 100 clauses (+0.07), namely a mean of 1.21 references per clause. As opposed to participants’ mean density, their interindividual variation decreases to 0.17 or 14% of the mean. Similarly, the range of reference density results decreases to 0.86, which is attributable to an increase in the minimum score as opposed to an almost constant maximum. Fig. 7.11 shows the distribution of participants’ individual reference density scores in first and in fourth grade in increasing order; the corresponding values are given in Tab. 11.1 and Tab. 11.2 in the Appendix (ch. 11.1). There is a large overlap in results between first grade and fourth grade, which ranges from 0.76 (fourth grade minimum) to 1.62 ties per clause (fourth grade maximum). All 28 fourth graders and also 90% of the first graders (28 of 31) obtain a reference density within this range. Of the remaining three first graders, one (3%, C8-G1-2) obtains a substantially lower reference density (0.57), a second first grader (3%, C1-G1-4) a marginally lower one (0.74) and a third first grader (3%, C8-G1-16) [6] even obtains a marginally higher density than the overlap (1.65). Both grades are influenced by somewhat extreme values. Thus, not only one first grader (3%, 0.57; C8-G1-2) but also one fourth grader (4%, 0.76; C1-G4-17) achieves an extremely low reference density (compared to their respective peers). Even if one or both of these values are removed, however, the number of first graders scoring below the overlap remains small (max. 16% or 5 of 31).

[6] As the following sections will show, this participant can be considered a statistical outlier.

Fig. 7.11 Distribution of participants’ individual reference density scores by grade (axis labels: no. of referential ties per clause; individual scores in increasing order; series: grade 1, grade 4)
In total, the distribution indicates that there are no “typical” first and fourth grade scores but that the performance of the majority of first and fourth graders is the same when it comes to reference density. To sum up, participants’ mean reference density was found to increase and their interindividual variation to decrease from first to fourth grade. However, the mean increased only marginally and the distribution of participants’ individual scores showed almost no differences between first grade and fourth grade; that is, almost all first graders were likely to produce a reference density that could also have been produced by a fourth grader. Thus, the results showed that the difference between participants’ first and fourth grade reference density is negligible.

7.2.1.2 Reference density by sex

Tab. 7.5 gives the reference density results obtained by male and female participants in first and fourth grade.

Tab. 7.5 Descriptive statistics reference density by sex and grade
Reference density (No. of referential ties per clause)

| Grade | Sex | Total N | Mean | Standard Deviation | Median | Minimum | Maximum |
|---|---|---|---|---|---|---|---|
| Grade 1 | Male | 10 | 1.21 | 0.20 | 1.19 | 0.85 | 1.56 |
| Grade 1 | Female | 21 | 1.11 | 0.24 | 1.11 | 0.57 | 1.65 |
| Grade 1 | Total | 31 | 1.14 | 0.23 | 1.16 | 0.57 | 1.65 |
| Grade 4 | Male | 7 | 1.19 | 0.13 | 1.13 | 1.04 | 1.36 |
| Grade 4 | Female | 21 | 1.22 | 0.18 | 1.22 | 0.76 | 1.62 |
| Grade 4 | Total | 28 | 1.21 | 0.17 | 1.22 | 0.76 | 1.62 |

Male first graders achieve a mean density of 1.21 referential ties per clause, while their female counterparts’ mean density is a little lower with 1.11 ties per clause, i.e. female first graders produce roughly 1 reference fewer in 10 clauses. At the same time male first graders’ interindividual variation is a little lower: male results have a standard deviation of 0.20, which corresponds to a variability ratio of 17%, and female results one of 0.24, i.e. a variability ratio of 22%. This difference in variation is also expressed in the range of scores, which is 0.71 for male and 1.08 for female first graders. Fig.
7.12 gives the distribution of individual reference density scores for male and female first graders; the corresponding values are given in Tab. 11.3 and Tab. 11.4 in the Appendix (ch. 11.2). Fig. 7.12 shows an overlap of reference density scores between 0.85 (male minimum) and 1.56 references per clause (male maximum). Consequently, all male first graders score within this range. 81% of females (17 of 21) also do so. Three of the remaining female first graders (14%) score below (from 0.57 to 0.76) and one (5%, C8-G1-16) above the overlap (1.65). However, Fig. 7.12 also shows that this last female first grader needs to be considered a statistical outlier due to her extreme value; that is, the overlap between male and female first graders actually ranges from 0.85 to 1.34. Thus, 80% of the male first graders (8 of 10) would be considered as scoring within the overlap, while 20% of them (2 of 10) score even higher; female results remain as described for the original range. An interpretation in terms of a slight male advantage is made difficult, however, by the large difference in group size. Consequently, the distribution mainly confirms the very small difference in interindividual variation between male and female first graders outlined above. Only the female participants’ reference density increases from first to fourth grade, namely by 0.11 references per clause or approximately 1 tie in 10 clauses (Tab. 7.5). As a result, male and female mean density are even more similar in fourth than in first grade; the difference is now merely 0.03 or three referential ties in 100 clauses. At the same time female fourth grade results are only marginally more heterogeneous than those of their male counterparts. This similarity is indicated by a female standard deviation of 0.18 or 15% of the mean as opposed to 0.13 or 11% of the mean for male fourth graders. Similarly, females’ scoring range from minimum to maximum is 0.86 as opposed to 0.32 for the male fourth graders. Fig.
7.13 shows the distribution of male and female participants’ individual reference density scores in first and fourth grade, i.e. on the one hand the differences and/or similarities between male and female participants in each grade and, on the other hand, the differences between each sex’s first and fourth grade results. The corresponding numerical values are given in Tab. 11.3 to Tab. 11.6 in the Appendix (ch. 11.2). Male and female results again overlap to a high degree; in fourth grade this overlap ranges from a reference density of 1.04 (male minimum) to 1.36 referential ties per clause (male maximum) and thus again includes all male fourth graders. 76% of female fourth graders (16 of 21) also achieve a reference density in this range, while two female fourth graders (10%) score below and three of them (14%) up to 0.26 above the overlap. Keeping in mind the respective group sizes, this means that there is again little difference between the two groups.

Fig. 7.12 Distribution of participants’ individual reference density scores by sex in first grade (axis labels: no. of referential ties per clause; individual scores in increasing order; series: male first graders, female first graders)

With respect to the distribution of each sex’s reference density in first as opposed to fourth grade the following can be observed: Male first and fourth grade results overlap in a range from 1.04 (male fourth grade minimum) to 1.36 referential ties per clause (male fourth grade maximum). All seven male fourth graders and 60% of the male first graders (6 of 10) score within this range. Of the remaining male first graders, 20% (2 of 10) achieve a reference density below the overlap, while another 20% realize a density above the overlap. [7] Female first and fourth grade results overlap more strongly in a range from 0.76 (fourth grade minimum) to 1.62 (fourth grade maximum), which includes all fourth graders (100%) and 86% of the first graders (18 of 21).
Two female first graders (10%) achieve a (marginally) lower reference density (0.57 and 0.74 respectively), while one female first grader (5%, C8-G1-16) achieves a marginally higher score (1.65). [8] These results indicate that male and female participants follow the general trend of largely similar results for reference density in first and fourth grade.

[7] C8-G1-7 scored 1.48 (+0.11) and C1-G1-2 scored 1.56 references per clause (+0.19), which corresponds to approximately 1 and 2 references more in 10 clauses.

[8] As described earlier, however, this latter first grader needs to be considered a statistical outlier. If she is excluded, an overlap in the range of 0.76 to 1.34 referential ties per clause can be identified. This overlap both includes and excludes the same number of female first graders as the original one, but means that four of the 21 fourth graders (19%) score (slightly) higher. Similarly, female fourth grader C1-G4-17 seems to produce an exceptionally low reference density compared to her peers (0.76, i.e. the female minimum in fourth grade). If her result is also disregarded, the overlap between first and fourth grade ranges from 0.96 to 1.34. It then includes 80% of the remaining fourth graders as well as 80% of the remaining first graders (both 16 of 20). 20% of the fourth graders (4 of 20) and 5% of the first graders (1 of 20) then score higher, while 18% of the first graders (4 of 22) score lower than the overlap. Any comparison of females’ first and fourth grade distribution thus shows mainly one overall trend, namely that of a substantial overlap.

Fig. 7.13 Distribution of participants’ individual reference density scores by sex and grade (axis labels: no. of referential ties per clause; individual scores in increasing order; series: male first graders, female first graders, male fourth graders, female fourth graders)

To sum up, no significant observable differences in reference density were found between male and female participants in either grade. Both groups followed the overall trends identified for reference density: largely similar mean densities in first and fourth grade, a strong overlap of first and fourth grade scores, and a decrease from an already comparatively low interindividual variation to an even lower one in fourth grade.

7.2.1.3 Reference density by experience group

Tab. 7.6 gives the reference density results obtained in first and fourth grade by participants with bilingual English-German preschool experience and children with monolingual German preschool experience.
Mono and bili first graders achieve a very similar reference density; that is, bili first graders produce an average of 1.15 referential ties per clause and thus only 0.02 references per clause (two ties in 100 clauses) more than the mono group, whose mean reference density is 1.13 per clause.

Tab. 7.6 Descriptive statistics reference density by experience group and grade
Reference density (No. of referential ties per clause)

| Grade | Experience group | Total N | Mean | Standard Deviation | Median | Minimum | Maximum |
|---|---|---|---|---|---|---|---|
| Grade 1 | Mono† | 12 | 1.13 | 0.32 | 1.20 | 0.57 | 1.65 |
| Grade 1 | Bili | 19 | 1.15 | 0.15 | 1.15 | 0.85 | 1.56 |
| Grade 1 | Total | 31 | 1.14 | 0.23 | 1.16 | 0.57 | 1.65 |
| Grade 4 | Mono | 12 | 1.23 | 0.16 | 1.23 | 0.96 | 1.62 |
| Grade 4 | Bili | 16 | 1.20 | 0.17 | 1.19 | 0.76 | 1.49 |
| Grade 4 | Total | 28 | 1.21 | 0.17 | 1.22 | 0.76 | 1.62 |

† ‘Mono’ denotes children with exclusively monolingual preschool experience, ‘bili’ those with bilingual preschool experience.

The two experience groups differ, on the other hand, in their degree of interindividual variation. Mono first graders have a comparatively large variation with a standard deviation of 0.32, which corresponds to a variability ratio of 28% of the mean, while bili first graders’ results have a standard deviation of only 0.15 or 13% of the mean.
Similarly, the mono group has a comparatively large range of scores with a difference of 1.08 ties per clause between the lowest and highest reference density of a mono first grader. The bili group, on the other hand, obtains a range of 0.71 ties per clause between minimum and maximum score. The difference between the mono and the bili group’s heterogeneity is reflected in Fig. 7.14, which shows the distribution of mono and bili first graders’ individual reference density results. The corresponding values are given in Tab. 11.7 and Tab. 11.8 in the Appendix (ch. 11.3). The mono first graders’ larger heterogeneity is evident in their steeper curve. At the same time there is a large overlap between mono and bili results, which ranges from 0.85 (bili minimum) to 1.56 (bili maximum). All bili first graders score within this range as compared to 67% of the mono group (8 of 12). Twenty-five percent of the mono first graders (3 of 12) achieve lower reference density scores, [9] while the one remaining mono first grader (8%, C8-G1-16), the one already identified as a statistical outlier, achieves a referential density above the overlap (1.65). Thus, the distribution seems to indicate that the mono group tends a little more often to score in the lower ranges.

[9] C8-G1-18 produces 0.76 referential ties per clause, i.e. approximately 1 in 10 clauses fewer, C1-G1-4 scores 0.74, i.e. also roughly 1 in 10 clauses fewer, and C8-G1-2 produces 0.57, i.e. 0.28 or approximately 3 references fewer in 10 clauses.

Fig. 7.14 Distribution of participants’ individual reference density scores by experience group in first grade † (axis labels: no. of referential ties per clause; individual scores in increasing order; series: mono first graders, bili first graders)
† ‘Mono’ denotes children with exclusively monolingual preschool experience, ‘bili’ those with bilingual preschool experience.
Both experience groups’ mean reference density increases from first to fourth grade, albeit only slightly. The mono group has the comparatively stronger increase with a mean reference density of 1.23 references per clause in fourth grade (+0.1). The bili fourth graders, on the other hand, obtain a mean reference density of 1.20 per clause (+0.05). As a result of these different increase rates, mono and bili fourth graders’ reference density is again very similar, i.e. they differ by only 0.03 or three references in 100 clauses. The interindividual variation of the mono experience group decreases from first to fourth grade (Tab. 7.6). Thus, the mono group’s standard deviation is 0.16 in fourth grade, which corresponds to 13% of the mean (less than half as much as in first grade), and its range of results has decreased to 0.66. This difference in range is largely due to a higher minimum score in fourth grade as opposed to a similar (even slightly lower) maximum score. The bili group’s standard deviation (and correspondingly the range), on the other hand, rises slightly from 0.15 in first grade to 0.17 references per clause in fourth grade. In terms of the variability ratio, however, the bili group’s fourth (14%) and first grade variation (13%) are only marginally different. Thus, it can be said that the interindividual variation of the mono group decreases, while that of the bili group remains constant. As a result, mono group and bili group have roughly the same variation in fourth grade (13% and 14% of the mean, respectively).

Fig. 7.15 shows the distribution of mono and bili participants’ individual density scores as a function of grade, i.e. on the one hand the differences and/or similarities between mono and bili participants in first and fourth grade and, on the other hand, the difference between each experience group’s first and fourth grade results. The corresponding values are given in Tab. 11.7 to Tab. 11.10 in the Appendix (ch. 11.3). A large overlap in both groups’ fourth grade results is evident; it ranges from 0.96 (mono minimum) to 1.49 references per clause (bili maximum). Ninety-two percent of mono fourth graders (11 of 12) as well as 94% of bili fourth graders (15 of 16) achieve a reference density within this range. The remaining mono fourth grader (8%, C5-G4-12) obtains a somewhat higher referential density (1.62), while the remaining bili fourth grader (6%, C1-G4-17) achieves a density below the overlap (0.76). Thus, the distribution confirms the mono and bili groups’ overall similarity of results and interindividual variation in fourth grade.

Fig. 7.15 Distribution of participants’ individual reference density scores by experience group and grade † (axis labels: no. of referential ties per clause; individual scores in increasing order; series: mono first graders, bili first graders, mono fourth graders, bili fourth graders)
† ‘Mono’ denotes children with exclusively monolingual preschool experience, ‘bili’ those with bilingual preschool experience.

As to each experience group’s first and fourth grade results, the following can be observed: There is a large overlap between mono first and fourth grade scores, which ranges from 0.96 (mono fourth grade minimum) to 1.62 (mono fourth grade maximum). All mono fourth graders obtain a reference density within this range as opposed to only 58% of the first graders (7 of 12). Of the remaining mono first graders, 33% (4 of 12) score lower (one of them marginally), while one first grader (8%) scores higher (1.65; C8-G1-16, i.e. the outlier already mentioned). That is, the main difference between the mono group’s first and fourth grade results is an increase in the lower first grade scores (minimum density) as was already observed for participants’ overall cohesive density.
An even larger overlap between first grade and fourth grade can be observed for the bili group. This overlap ranges from 0.85 (bili first grade minimum) to 1.49 referential ties per clause (bili fourth grade maximum). Ninety-five percent of bili first graders (18 of 19) and 94% of bili fourth graders (15 of 16) score within this range. One first grader (5%, C1-G1-2) produces more referential ties (1.56) and one fourth grader (6%, C1-G4-17) fewer (0.76) than the range of the overlap. However, this latter bili fourth grader seems to produce an exceptionally low reference density compared to her peers. If her result is excluded, bili first and fourth grade scores overlap from 1.04 to 1.49; all bili fourth graders are included in this range, while 21% of the first graders score below the overlap. Thus, a comparison of the bili group’s first and fourth grade distribution does not initially reveal any difference between the two grades. However, this changes once the outlier is disregarded. The bili first and fourth grade curves then point to a trend similar to that found for the mono group, namely an increase of the minimum score as the main difference between the first and the fourth grade distribution (paired with a strong overlap). To sum up, the difference in mean reference density between children with and without L2 preschool experience was found to be negligible both in first grade and in fourth grade. With respect to participants’ interindividual variation in first grade, however, the mono group was found to be more heterogeneous than the bili group. From first to fourth grade both groups’ mean reference density increased. At the same time the interindividual differences of the mono group decreased, while the variation of the bili group remained fairly stable. As a result, the fourth grade interindividual variation differed very little between the two experience groups.
The only effect attributable to L2 preschool experience was found in the distribution of participants’ individual results, which showed that mono first graders were more likely to produce low reference density scores. The distribution also showed a more pronounced difference between the mono group’s first and fourth grade curves, i.e. a stronger increase of the mono group’s minimum reference density. The bili curves, on the other hand, showed a similar difference between first and fourth grade only after a fourth grade outlier was removed from the analysis.

7.2.1.4 Referential density: Statistical results

A factorial ANOVA including all participants found no significant main effects of grade (F(1, 51)=0.05, ns), sex (F(1, 51)=1.51, ns) or experience group (F(1, 51)=0.91, ns), and no significant interaction effects. That is, the statistical analysis showed that neither grade, nor sex, nor L2 preschool experience is a statistically significant factor for the variability in referential density results. The interaction between sex and grade could, however, be interpreted as showing a trend toward significance (F(1, 51)=3.12, p=0.083). A second ANOVA was conducted without outlier C8-G1-16. There were again no significant main effects of grade (F(1, 50)=0.001, ns), sex (F(1, 50)=2.33, ns), or experience group (F(1, 50)=0.68, ns). However, there was a significant interaction effect between grade and sex (F(1, 50)=4.43, p<0.05). A simple effects analysis showed that this was attributable to a significant difference between male and female participants in first (p=0.04, r²=0.09) but not in fourth grade (p=0.418) and a significant difference between first grade and fourth grade for female (p=0.024, r²=0.09) but not male participants (p=0.685). To sum up, experience group is not a significant factor in explaining variability in referential density.
Neither are grade and sex by themselves significant factors, but they do influence some of the subgroups; that is, male first graders produce a significantly higher mean than female first graders, and only the female mean reference density increases significantly from first to fourth grade.

7.2.1.5 Summary: Referential density

Section 7.2.1 aimed to answer the question of how often participants use referential ties and whether there are any differences attributable to grade, sex or L2 preschool experience. It was found that all participants used at least some references; even first graders produced a minimum reference density of roughly one reference in two clauses. At the same time participants’ interindividual variation was slightly higher than that found for overall cohesive density but still low compared to, for example, the coherence results, even if all subgroups’ results were influenced by statistical outliers. No significant observed or statistically significant differences were found between first grade and fourth grade with respect to participants’ mean reference density; it increased only marginally from first to fourth grade. [10] Thus, first graders produced an average of 1.14 and fourth graders 1.21 referential ties per clause, a difference of merely 0.07 or seven references in 100 clauses. A comparison of participants’ distribution curves reflected the small difference between grades, since it showed a large overlap between first and fourth grade scores: Almost all first graders (90% with and 84% without outliers) achieved a reference density which could also have been produced by a fourth grader. Consequently, no “typical” first or fourth grade reference density could be identified. While the mean density remained almost the same, participants’ interindividual variation, which was already comparatively low in first grade, decreased even further by the end of fourth grade (from a standard deviation of 0.23 or 20% of the mean to 0.17 or 14% of the mean).
The distribution of participants’ scores also reflected the decrease in variation, since it showed that the main difference between participants’ first and fourth grade distribution curves consisted of a higher minimum score in fourth grade. L2 preschool experience was not found to have any significant observable or statistically significant influence on participants’ mean reference density, i.e. mono and bili mean densities were virtually the same in both grades. Participants’ sex, on the other hand, was found to be of influence in interaction with grade, i.e. male first graders produced a statistically significantly higher mean and only the female mean differed significantly between grades. With respect to participants’ interindividual differences, male and female participants performed very similarly in both grades, even if the females’ variation was slightly higher in both grades (mainly due to two extreme values). Some differences in variation were observed between mono and bili participants, however; that is, mono first graders’ results differed twice as much as those of their bili peers (the mono group’s standard deviation was 0.32 and the variability ratio 28% as opposed to the bili group’s standard deviation of 0.15 and variability ratio of 13%). Almost all subgroups adhered to the general developmental trends identified for reference density, namely only a marginal increase in mean density and a decrease in variation. The only exceptions were the mean density of the male participants, which decreased a little, and the interindividual variation of the bili group, which remained largely stable. The distribution of participants’ individual scores also showed no difference between males and females in first or fourth grade or in regard to either group’s difference between first and fourth grade overlap. 
Both sexes followed the general trend of a strong overlap between first and fourth grade scores and an increase in minimum score as the major change from first to fourth grade. L2 preschool experience, on the other hand, did seem to have some effect on the distribution of individual scores. Thus, 25% of mono first graders (3 of 12) performed below a common overlap, which means that mono first graders were somewhat more likely than bili first graders to produce low referential density scores. By the end of fourth grade this difference between mono and bili participants had disappeared. However, the mono first and fourth grade distributions were also observed to differ more than those of the bili participants, which is again attributable to the greater likelihood of the mono first graders to produce lower scores. One could also say that the mono participants’ minimum reference density increased more strongly.

[10] As will be described below, the subgroup of female participants is the only exception.

To sum up, neither grade, nor sex, nor experience group alone was found to have a significant impact on reference density and its development. Instead, first and fourth graders were found to be virtually indistinguishable with respect to their use of referential ties, a finding that was confirmed by a large overlap in the distribution of individual scores. The only differences found between first grade and fourth grade were a slightly lower interindividual variation in fourth grade and a higher minimum score, which reflected the lower variation. The influence of sex and grade interacted significantly in that (a) male first graders produced a statistically significantly higher mean, and (b) only female participants’ mean increased statistically significantly from first to fourth grade; other than that, both sexes performed very similarly.
L2 preschool experience was mainly found to be responsible for a lower interindividual variation in first grade, which was reflected by a higher first grade minimum on the part of the bili group; other than that the performance of the two experience groups was largely the same.

7.2.2 Connectives

7.2.2.1 Overall connective density

The present section is aimed at answering the question of how often participants use connective ties and whether there are any differences attributable to grade, sex or L2 preschool experience. [11] Tab. 7.7 gives the descriptive statistics for participants’ connective density in first and in fourth grade. As the minimum score shows, not all first graders produce connective ties, even if they achieve a mean density of 0.45 connectives per clause. That is, on average, first graders link almost every two clauses with a connective. At the same time the variation in connective density is quite large. This is reflected, on the one hand, by the range of connective density from minimum (0) to maximum score (0.97), which indicates that some first graders produce almost one connective per clause while others do not realize any at all. Additionally, the standard deviation is very large with 0.28 connectives per clause or a variability ratio of 62%.

[11] The terms connective and connective tie will be used synonymously in the following, since only connectives functioning cohesively were included in the analysis.

Tab. 7.7 Descriptive statistics: connective density by grade
Connective density (no. of connective ties per clause)

           Total N   Mean   Standard Deviation   Median   Minimum   Maximum
Grade 1    31        0.45   0.28                 0.48     0         0.97
Grade 4    28        0.63   0.13                 0.62     0.46      0.91
Total      59        0.53   0.24                 0.56     0         0.97

From first to fourth grade participants’ mean connective density increases to 0.63 connective ties per clause, i.e. fourth graders realize 0.18 connectives more per clause or almost 2 connectives per 10 clauses.
Thus, not only do all fourth graders produce connectives (minimum score 0.46) but on average they connect at least every two clauses with a connective. One could also say that they link every two clauses with a connective and produce roughly 1 additional connective every 10 clauses. While the mean connective density increases, the interindividual differences drop sharply from first to fourth grade: Fourth graders’ results have a standard deviation of 0.13 or 21% of the mean. [12] Similarly, the range of connective density results decreases from 0.97 to 0.45, i.e. to less than half the first grade range. As the first and fourth grade minimum scores indicate, this decrease in range seems to be due to an increase in the minimum (from 0 to 0.46) as opposed to a fairly constant, even slightly lower maximum score in fourth grade (0.97 vs. 0.91).

Fig. 7.16 shows the distribution of individual connective density results in first and in fourth grade; the corresponding values are given in Tab. 11.1 and Tab. 11.2 in the Appendix (ch. 11.1). The distribution of participants’ individual connective density scores shows, first of all, that 2 of the 31 first graders (6%, C8-G1-8 and C8-G1-16) do not produce any connective ties, while the great majority of them (94%) produce at least some such ties, even if the steepness of the curve simultaneously points to the large interindividual differences. The distribution also reflects that all fourth graders produce at least some connectives and that their minimum density lies much higher than in first grade (cf. the fourth grade minimum score). At the same time there is an overlap of first and fourth grade results in the range of 0.46 (fourth grade minimum) to 0.91 connectives per clause (fourth grade maximum). All fourth graders achieve a connective density within this range as opposed to 58% of first graders (18 of 31). Of the remaining first graders producing connectives, 35% (11 of 31) obtain a connective density below that range. [13] One first grader (3%, C1-G1-8) produces a higher number of connective ties per clause, namely 0.97, which is also the highest score of all participants. Thus, the distribution confirms the assumption made above that the main difference between first and fourth grade is the increase of the lower scoring ranges as opposed to fairly constant results in the upper ones.

[12] However, this variability ratio is still high compared to the fourth grade variation found for overall cohesive density (8%) and it is also a little higher than the fourth grade variation in reference density (14%).

[13] Between 0.11 and 0.40 connectives fewer per clause.

To sum up, all fourth but not all first graders produced connectives and the mean number of connectives increased from first to fourth grade, while the variation, which was very high in first grade, decreased very strongly. The distribution of participants’ individual connective density scores showed a substantial overlap between first and fourth grade results. Together with the minimum and maximum scores it also showed that the main difference between the two grades involved an increase in the lower scoring ranges as opposed to a relative stability in the upper ones, just as was found for participants’ overall cohesive density.

7.2.2.2 Connective density by sex

Tab. 7.8 gives the connective density results obtained by male and female participants in first and fourth grade. All male first graders realize at least some connective ties, i.e. a minimum of 0.1 per clause or 1 tie every 10 clauses. Not all female first graders, however, produce connective ties; this is evident from their minimum score of zero. Male first graders achieve a mean density of 0.51 connectives per clause, while female first graders obtain a slightly lower connective density of 0.42 connectives per clause. That is, male first graders realize 0.09 connective ties more per clause or approximately 1 in 10 clauses.
At the same time both groups’ interindividual variation is large: Male first graders’ standard deviation amounts to 0.27 or 53% of the mean, and female first graders’ results vary even more, with a standard deviation of 0.28 or 67% of the mean. Similarly, both groups’ range between minimum and maximum score is large. Male first graders’ results have a range of 0.77 between their group members’ minimum and maximum connective density; female first graders’ results again differ even more, with a range of 0.97.

Fig. 7.16 Distribution of participants’ individual connective density scores by grade (y-axis: no. of connective ties per clause; x-axis: individual scores in increasing order; series: Grade 1, Grade 4)

Tab. 7.8 Descriptive statistics: connective density by sex and grade
Connective density (no. of connective ties per clause)

Grade     Sex      Total N   Mean   Standard Deviation   Median   Minimum   Maximum
Grade 1   Male     10        0.51   0.27                 0.56     0.10      0.87
          Female   21        0.42   0.28                 0.47     0         0.97
          Total    31        0.45   0.28                 0.48     0         0.97
Grade 4   Male     7         0.65   0.12                 0.67     0.49      0.79
          Female   21        0.62   0.13                 0.58     0.46      0.91
          Total    28        0.63   0.13                 0.62     0.46      0.91

Fig. 7.17 Distribution of participants’ individual connective density scores by sex in first grade (y-axis: no. of connective ties per clause; x-axis: individual scores in increasing order; series: male first graders, female first graders)

Fig. 7.17 shows the distribution of male and female participants’ individual connective density scores in first grade; the corresponding values are given in Tab. 11.3 and Tab. 11.4 in the Appendix (ch. 11.2). The comparatively strong steepness of the distribution curves reflects, first of all, both sexes’ strong interindividual variation. At the same time a large overlap in results is evident, ranging from 0.1 (male minimum) to 0.87 (male maximum). The overlap encompasses all ten male and the majority of female first graders (81%, 17 of 21). However, three female first graders (14%) achieve a connective density below the overlap and, as Fig.
7.17 shows, two of those (10%) produce no connectives at all. The remaining female first grader (5%) obtains a connective density above the overlap. Keeping in mind the distinct group sizes the distribution thus shows no significant differences between male and female first graders with respect to their connective density scores. The mean connective density for both male and female participants increases from first to fourth grade (Tab. 7.8). While the male group’s density increases by 0.14 connectives per clause, i.e. 1 connective more in 10 plus an additional 1 in 25 clauses, females’ mean density increases by 0.20 or 2 connectives in 10 clauses. As a result, male and female fourth graders achieve an even more similar mean connective density than in first grade (0.65 and 0.62 ties per clause respectively). That is, both male and female fourth graders use a connective slightly more often than every two clauses and they produce an average of approximately 6 connectives in 10 clauses. At the same time all male and all female fourth graders produce at least some connectives; this is again evident from the minimum score, which is now above zero not only for the male but also for the female group. While both male and female mean densities increase, the heterogeneity in their results decreases. The standard deviation for female results drops from 0.28 to 0.13, i.e. from a variability ratio of 67% to 21%, while male participants’ standard deviation decreases from 0.27 to 0.12, i.e. from a variability ratio of 53% to 18%. Similarly, the range of results from minimum to maximum score drops from 0.97 to 0.45 for the female group and from 0.77 to 0.30 for the male group. Consequently, the interindividual differences in fourth grade are very similar for both males and females. Fig. 7.18 shows the distribution of male and female participants’ individual connective density scores in first and in fourth grade, i.e. 
on the one hand the differences and/or similarities between male and female participants in grades one and four and, on the other hand, the differences between the two sexes’ first and fourth grade results. The corresponding numerical values are given in Tab. 11.3 to Tab. 11.6 in the Appendix (ch. 11.2). Male and female fourth grade curves again indicate a large overlap in connective density results. This overlap ranges from 0.49 (male minimum) to 0.79 (male maximum) and, consequently, includes all male fourth graders. Fewer, but still the majority, of the female fourth graders (76%, i.e. 16 of 21) achieve a connective density within this range. Fourteen percent of female fourth graders (3 of 21) score only negligibly lower (up to 0.03 connectives per clause), while 10% of them (2 of 21) achieve a connective density which lies a little above the overlapping range (0.86 and 0.91 respectively). All in all, however, these differences in fourth grade are negligible, especially when keeping in mind the large difference in group size.

As to the difference of either sex’s connective density between first grade and fourth grade, the following can be observed: The steepness of the curve lessens for both sexes from first to fourth grade, which reflects the decrease in interindividual variation described earlier. The curves also reflect that both sexes’ fourth grade minimum scores are much higher than those of first grade (cf. Tab. 7.8). Nevertheless, an overlap between first grade and fourth grade can be observed for both sexes. That is, female first and fourth grade connective density results overlap in the range of 0.46 (fourth grade minimum) to 0.91 connectives per clause (fourth grade maximum). All female fourth graders score within this range but so do approximately half of the first graders (48%, 10 of 21). Almost half of the female first graders (48%, 10 of 21), however, obtain a lower connective density, two of these scoring zero (10%).
One female first grader (5%, C1-G1-8) produces slightly more connectives per clause (0.97). Male first and fourth grade results, on the other hand, overlap between 0.49 (fourth grade minimum) and 0.79 (fourth grade maximum), which also includes all of the fourth graders and a little more than half of the first graders’ results (60%, i.e. 6 of 10). Three of the remaining male first graders (30%) produce substantially fewer connectives per clause than the overlap, while one other first grader (10%) produces 0.11 more. [14] All in all, this comparison between each sex’s first and fourth grade results indicates that fourth graders do not necessarily achieve a higher connective density. Also, it confirms that the general developmental pattern of an increase in the lower scoring ranges, coupled with relatively little change in the (first grade) upper ranges, applies to both male and female results.

[14] The three male first graders scoring below the overlap produced between 0.28 and 0.39 connective ties fewer.

Fig. 7.18 Distribution of participants’ individual connective density scores by sex and grade (y-axis: no. of connective ties per clause; x-axis: individual scores in increasing order; series: male first graders, female first graders, male fourth graders, female fourth graders)

To sum up, male and female participants performed quite similarly in both grades, especially when keeping in mind a likely group size effect. In first grade slight differences were observed in that female first graders produced slightly fewer connective ties per clause and their results varied slightly more. Due to distinct increase rates with regard to mean density and decrease rates with regard to interindividual variation, however, there were almost no differences between the two sexes in fourth grade.
The respective distributions confirmed not only these findings but also that the general development observed for connective density, namely an elimination of the lower first grade scores paired with a substantial overlap of first and fourth grade scores, holds true for both sexes’ results.

7.2.2.3 Connective density by experience group

Participants’ connective density results by experience group and grade are given in Tab. 7.9. Bili first graders achieve a mean connective density of 0.53 and thus realize 0.21 connectives per clause (or 2 per 10 clauses) more than mono first graders, who reach a mean connective density of 0.32. That is, bili first graders produce approximately one connective every two clauses but mono first graders produce one connective every three clauses. At the same time all bili but not all mono first graders use at least some connective ties; this is evident from the two groups’ minimum scores.

Tab. 7.9 Descriptive statistics: connective density by experience group and grade
Connective density (no. of connective ties per clause)

Grade     Group    Total N   Mean   Standard Deviation   Median   Minimum   Maximum
Grade 1   Mono†    12        0.32   0.31                 0.21     0         0.97
          Bili     19        0.53   0.23                 0.55     0.06      0.87
          Total    31        0.45   0.28                 0.48     0         0.97
Grade 4   Mono     12        0.62   0.13                 0.58     0.46      0.86
          Bili     16        0.63   0.13                 0.64     0.47      0.91
          Total    28        0.63   0.13                 0.62     0.46      0.91

† ‘Mono’ denotes children with exclusively monolingual preschool experience, ‘bili’ those with bilingual preschool experience.

Mono first graders also show an extremely high variation in results with a standard deviation of 0.31, which corresponds to a variability ratio of 97%, and a range of 0.97 connectives per clause from minimum to maximum score. Bili first graders’ results are also very heterogeneous but still less so than those of the monos: The standard deviation for bili first graders is 0.23 or 43% of the mean and the range of scores is 0.81.
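The variability ratio used throughout this chapter is the standard deviation expressed as a percentage of the mean (commonly called the coefficient of variation), and the range is simply maximum minus minimum. As a minimal sketch, the following recomputes both from the summary values printed in Tab. 7.9; the function and variable names are illustrative assumptions and not part of the study’s own analysis.

```python
# Summary statistics as printed in Tab. 7.9 (connective ties per clause).
# Keys are (grade, experience group); values are (mean, sd, minimum, maximum).
TAB_7_9 = {
    ("Grade 1", "mono"): (0.32, 0.31, 0.00, 0.97),
    ("Grade 1", "bili"): (0.53, 0.23, 0.06, 0.87),
    ("Grade 4", "mono"): (0.62, 0.13, 0.46, 0.86),
    ("Grade 4", "bili"): (0.63, 0.13, 0.47, 0.91),
}


def variability_ratio(mean, sd):
    """Standard deviation as a percentage of the mean."""
    return 100 * sd / mean


def score_range(minimum, maximum):
    """Distance between lowest and highest score, rounded to two decimals."""
    return round(maximum - minimum, 2)


for (grade, group), (mean, sd, lowest, highest) in TAB_7_9.items():
    ratio = variability_ratio(mean, sd)
    spread = score_range(lowest, highest)
    print(f"{grade} {group}: variability ratio {ratio:.0f}%, range {spread:.2f}")
```

Because the inputs are themselves rounded to two decimals, the recomputed percentages can differ by a point from figures derived from unrounded data.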
Fig. 7.19 shows the distribution of mono and bili participants’ individual connective density scores as a function of experience group in first grade; the corresponding numerical values are given in Tab. 11.7 and Tab. 11.8 in the Appendix (ch. 11.3). The steepness of both groups’ curves first of all reflects their large interindividual variations. The curves also show an overlap of results from 0.1 (mono minimum for the mono participants actually producing connectives) to 0.87 connectives per clause (bili maximum). Ninety-five percent of the bili first graders (18 of 19) achieve a connective density within this range as compared to 75% of the mono first graders (9 of 12). The remaining bili first grader (5%, 1 of 19) scores slightly lower (0.06). Two mono first graders (17%, 2 of 12) do not produce any connectives at all, [15] while one mono first grader (8%; C1-G1-8) achieves a connective density above the overlap (0.97). However, the two highest mono scores (0.81 and 0.97) seem to be exceptional for this group, since the next score below them (0.48) amounts to only about half of this connective density, meaning that scores up to 0.48 seem more typical for the mono group. [16] Consequently, the overlap between mono and bili first graders would actually need to be defined as 0.1 (mono minimum, cf. above) to 0.48 (highest mono score excluding the two outliers). This new overlap would include all eight mono first graders who produce connectives but only 32% of bili first graders (6 of 19); 68% of the bili participants (13 of 19) would score above the overlap. The first grade distribution of connective density scores without mono outliers would therefore need to be interpreted as indicating an advantage of the bili group. [17]

[15] C8-G1-8 and C8-G1-16 produced zero connectives in 12 and 23 clauses, respectively.

[16] This skewed distribution of the mono group’s connective density scores is also evident from their median (0.21) as opposed to the mean (0.32) (cf. Tab. 7.9).

Fig. 7.19 Distribution of participants’ individual connective density scores by experience group in first grade (y-axis: no. of connective ties per clause; x-axis: individual scores in increasing order; series: mono first graders, bili first graders)

The mean connective densities of both the mono group and the bili group increase to different degrees from first to fourth grade. The mono group’s mean connective density rises by 0.3 connectives per clause, and thus more strongly than the bili group’s mean, which increases by only 0.1. As a result, mono and bili fourth graders perform almost indistinguishably; they produce a mean connective density of 0.62 and 0.63, respectively. In other words, both groups realize approximately three connective ties in five clauses; one could also say they connect every two clauses with a connective and produce roughly 1 additional connective tie in 10 clauses. At the same time not only all bili but also all mono fourth graders use at least some connective ties; this is evident from a rise in the mono minimum score from zero in first to 0.46 in fourth grade.

Moreover, both experience groups’ results become more homogeneous from first to fourth grade. The mono group’s standard deviation drops from 0.31 or 97% of the mean in first to 0.13 or 21% of the mean in fourth grade. Relatedly, their range of scores decreases from 0.97 to 0.4. The bili group’s standard deviation drops from 0.23 or 43% of the mean in first grade, likewise to 0.13 or 21% of the mean in fourth grade, and their scoring range decreases from 0.81 to 0.44. Consequently, the interindividual variation in fourth grade is the same for both mono group and bili group.

Fig. 7.20 shows the distribution of mono and bili participants’ individual connective density scores in increasing order as a function of grade, i.e.
on the one hand, the differences and/or similarities between mono and bili participants in first and in fourth grade and, on the other hand, the differences between the two experience groups’ first and fourth grade results. The corresponding values are given in Tab. 11.7 to Tab. 11.10 in the Appendix (ch. 11.3). The fourth grade curves show an overlap between the mono and the bili scores, which ranges from 0.47 (bili minimum) to 0.86 (mono maximum). It includes 92% of the mono fourth graders (11 of 12) and 94% of the bili fourth graders (15 of 16), i.e. the large majority of both experience groups. Only one mono fourth grader (8%, C1-G4-3) achieves a marginally lower connective density (0.46) and one bili fourth grader (6%, C5-G4-18) a marginally higher connective density (0.91); that is, the fourth grade distribution of participants’ connective density results shows no difference between mono and bili participants.

[17] However, such an advantage would need to be confirmed by a larger number of participants, since only 10 mono but still 19 bili first graders remain and thus the possibility of a group size effect cannot be excluded.

A comparison of both experience groups’ first and fourth grade curves reflects, first of all, the strong decrease in interindividual variation described above, since the fourth grade curves are much less steep than their first grade counterparts. The curves also indicate that the decrease in variability is again mainly due to a strong increase in minimum scores (cf. also Tab. 7.9). At the same time the distribution shows an overlap between first and fourth grade results for both groups. Thus, the mono results overlap between 0.46 (fourth grade minimum) and 0.86 (fourth grade maximum). All mono fourth graders score within this range, while only 25% of the first graders (3 of 12) do so.
Sixty-seven percent of mono first graders (8 of 12) achieve a lower connective density; the one remaining first grader (8%) scores higher than the overlap. Bili first and fourth grade results overlap in the very similar range of 0.47 (fourth grade minimum) to 0.87 (first grade maximum). Almost all bili fourth graders (94%, 15 of 16) achieve a connective density within that range, and one fourth grader (6%; C5-G4-18) obtains a higher density. However, the majority of bili first graders, namely 74% (14 of 19), also score within this range; 26% of them (5 of 19) obtain a lower connective density. This means that both mono and bili results follow the general trend of an overlap between first grade and fourth grade and an increase in the lower scoring ranges with little change in the upper ranges. For the mono group this trend is more pronounced, however, due to larger differences between the first and fourth grade scores.

Fig. 7.20 Distribution of participants’ individual connective density scores by experience group and grade (y-axis: no. of connective ties per clause; x-axis: individual scores in increasing order; series: mono first graders, bili first graders, mono fourth graders, bili fourth graders)

To sum up, all bili but not all mono first graders produced connectives. Additionally, bili first graders produced a higher mean connective density and their variation in results was much lower than that of the mono first graders. The distribution of the individual scores also pointed towards an advantage of the bili group in that, once two extreme values of the mono group were disregarded, most bili first graders scored above a common overlap. Mono and bili participants followed the general trend of an increase in mean connective density and a (strong) decrease in interindividual variation, even if the rate of increase/decrease was stronger for the mono group.
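The overlap ranges reported in this section run from the higher of two groups’ minimum scores to the lower of their maximum scores. The sketch below illustrates that rule with the minima and maxima printed in Tab. 7.9; the helper names are hypothetical, and the study’s additional adjustments (for instance, disregarding zero scores or outliers before defining an overlap) are deliberately not modelled.

```python
def overlap(range_a, range_b):
    """Return the (low, high) interval shared by two (min, max) ranges, or None."""
    low = max(range_a[0], range_b[0])
    high = min(range_a[1], range_b[1])
    return (low, high) if low <= high else None


def position(score, shared):
    """Classify a score as 'below', 'within' or 'above' a shared range."""
    low, high = shared
    if score < low:
        return "below"
    if score > high:
        return "above"
    return "within"


# Minimum and maximum connective densities as printed in Tab. 7.9.
mono_g1, mono_g4 = (0.00, 0.97), (0.46, 0.86)
bili_g1, bili_g4 = (0.06, 0.87), (0.47, 0.91)

print("mono vs. bili, fourth grade:", overlap(mono_g4, bili_g4))
print("mono, first vs. fourth grade:", overlap(mono_g1, mono_g4))
print("bili, first vs. fourth grade:", overlap(bili_g1, bili_g4))
print("score 0.97 relative to the mono overlap:",
      position(0.97, overlap(mono_g1, mono_g4)))
```

Counting how many participants fall below, within or above an overlap would additionally require the individual scores listed in the Appendix, which are not reproduced here.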
In fourth grade mono and bili participants performed very similarly in regard to mean connective density and variation; this similarity was confirmed by the distribution curves. Mono and bili groups’ first and fourth grade distributions showed that they follow the general trend of an overlap of first and fourth grade results and an increase mainly in the lower density ranges. However, the difference between first and fourth grade was found to be more pronounced for the mono group.

7.2.2.4 Connective density: Statistical results

A factorial ANOVA was conducted to statistically test main and interaction effects of grade, sex, and L2 experience group on the overall connective density (cf. ch. 5.2.3). The inhomogeneity of variance was barely significant (F(7, 51)=2.24, p=0.046) when all participants were included, so that the ANOVA could still be conducted. There was a (highly) significant main effect of grade (F(1, 51)=11.61, p<0.01, partial η²=0.19), which accounted for 19% of the variance, and also a main effect of experience group (F(1, 51)=4.62, p<0.05, partial η²=0.08), but this explained only 8% of the variance. The interaction between grade and experience group just missed significance (F(1, 51)=3.62, p=0.06, ns) and there was no main effect of sex (F(1, 51)=0.21, ns) or any other interaction effects.

A second ANOVA was conducted without outlier C1-G1-8. Excluding C1-G1-8 first of all had the effect that the assumption of the homogeneity of variance was not violated anymore. The ANOVA results without C1-G1-8 confirmed the (very highly) significant main effects of grade (F(1, 50)=15.71, p<0.001, partial η²=0.24) and experience group (F(1, 50)=6.67, p<0.05, partial η²=0.19). However, both factors now accounted for a larger part of the variance in connective density results.
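For readers who wish to relate the reported F ratios to the reported effect sizes, partial η² can be recovered from an F ratio and its degrees of freedom via the standard identity partial η² = (F · df_effect) / (F · df_effect + df_error). The following is a minimal sketch of that identity applied to the F ratios quoted above; the function name and dictionary labels are assumptions made for illustration, and the computation is not taken from the study itself.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F ratios as quoted in the text: (F, df_effect, df_error).
reported_f = {
    "grade, all participants": (11.61, 1, 51),
    "experience group, all participants": (4.62, 1, 51),
    "grade x experience group, all participants": (3.62, 1, 51),
    "grade, without C1-G1-8": (15.71, 1, 50),
    "experience group, without C1-G1-8": (6.67, 1, 50),
}

for label, (f_value, df_effect, df_error) in reported_f.items():
    implied = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{label}: implied partial eta squared = {implied:.2f}")
```

Since the F ratios are themselves rounded, implied values need not match printed effect sizes exactly; for F(1, 50)=6.67 the identity gives roughly 0.12.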
Again, there was no significant main effect of sex (F(1, 50)=0.08, ns); however, the two significant main effects were now qualified by a significant interaction effect between grade and experience group (F(1, 50)=5.34, p<0.05, partial η²=0.1). Simple effects analyses (cf. ch. 5.2.3) showed that experience group had an impact on the first but not on the fourth grade results, i.e. there was a significant difference between the mono group and the bili group in first (p=0.015, r²=0.11) but not in fourth grade (p=0.623). Experience group explained only 11% of the variability in first grade, however. At the same time grade did not influence both experience groups to the same degree. More specifically, the difference between first and fourth grade connective density was significant for the mono group (p<0.001, r²=0.27), where it explained 27% of the variability in results, but not for the bili group (p=0.162).

To summarize, the statistical results confirm the observed result that grade is a significant factor for variance in connective density, even though its effect is limited and leaves a large amount of variability unexplained. At the same time the statistical analysis confirms that the importance of grade does not apply to the same degree to all subgroups under investigation. The statistical analysis supported the observed finding that L2 preschool experience (mono vs. bili) is a significant factor in explaining variation in connective density results, but that the influence of experience group is limited to first grade.

7.2.2.5 Summary: Connective density

The present section was aimed at answering the question of how often participants use connective ties and whether there are any differences attributable to grade, sex or L2 preschool experience. First of all, it was found that almost all participants produced connectives.
That is, all fourth graders and 94% of first graders (29 of 31) produced at least some connective ties; only two first graders (6%) did not use any connectives. Also, participants’ interindividual variation in both grades was larger than that obtained for overall cohesive density and reference density (the standard deviation was 0.28 or 62% of the mean in first grade and 0.13 or 21% of the mean in fourth grade). With respect to an influence of grade it was found that participants’ mean connective density differed statistically significantly between first and fourth grade, even if the effect size was again low compared to the coherence results (grade explained, depending on the set-up of the statistical test, up to 24% of the variance in results). Thus, first graders produced a mean number of 0.45 connective ties per clause, i.e. they connected roughly every two clauses by a connective, while fourth graders produced an average of 0.63 connectives per clause, i.e. they connected every two clauses with a connective and produced approximately one additional connective in 10 clauses. However, the effect of grade was qualified by experience group. That is, grade significantly affected the mono results, where it explained 27% of the variance in results. The observed increase in the bili results, on the other hand, was not statistically significant. Grade also influenced the interindividual variation in results. That is, participants’ results became far more homogeneous from first to fourth grade. Sex did not have any significant effect on connective density, even if all male but not all female first graders produced connective ties. Participants’ L2 preschool experience, on the other hand, was also found to exercise a significant observable as well as statistically significant influence on first grade results. 
That is, bili first graders produced a significantly higher number of connective ties per clause (0.53) than mono first graders (0.32), even if the effect size was quite small (experience group explained 11% of the variance in results). At the same time all bili but not all mono first graders produced connective ties and the mono group’s results had a much larger interindividual variation (97% vs. 43%). By the end of fourth grade these differences had disappeared, however, so that mono group and bili group produced an almost identical number of connective ties (0.62 vs. 0.63) and showed the same degree of variation in results (21%).

The distributions of participants’ individual connective density scores showed an overlap between first and fourth grade results for all subgroups; that is, over half of the first graders (58%) produced a connective density which might just as well have been produced by a fourth grader. The distributions also showed that the main distinction between first grade and fourth grade was an increase in participants’ lower scoring ranges. These findings were again influenced by L2 preschool experience (the difference between grades was more pronounced for mono than for bili participants) but they were not influenced by sex.

To sum up, four general trends were found with respect to participants’ connective density, which resemble those found for overall cohesive density: A (statistically significant) increase in mean density, a decrease in interindividual variation, a substantial overlap between participants’ first and fourth grade results, and an increase in the lower scores from first to fourth grade, which was paired with a relative stability in the upper ranges. As discussed, these trends were influenced to differing degrees by experience group, so that neither grade nor experience group alone is able to explain a large proportion of the variation in results.

7.2.3 Substitution and ellipsis

As described in ch.
3.4, substitutions and ellipses are two closely related processes and so they will be covered in the same section. Consequently, the present section aims to answer the question of how often participants use ties by substitution or ellipsis and whether there are any differences attributable to grade, sex or L2 preschool experience.

7.2.3.1 Substitutions

Ties by substitution occur only marginally in the data. None of the first graders produce any such ties and only four of 31 fourth graders (13%) do so. C5-G4-3 (male, bili) uses one substitution, C5-G4-6 (female, bili) two, C5-G4-19 (male, mono) one and C5-G4-20 (female, bili) another one. These five substitutions correspond to a mean number of 0.002 substitutions per clause in fourth grade, i.e. merely 2 substitutions in 1000 clauses. Due to their extremely limited use, substitutions were not investigated further with respect to effects of sex or experience group and they were not submitted to any statistical test. That is, the analysis of substitutions found that the number of ties by substitution increased from first to fourth grade, but that their overall use remained marginal.

7.2.3.2 Overall ellipsis density

Tab. 7.10 gives the descriptive statistics for participants’ ellipsis density in first and in fourth grade. As Tab. 7.10 shows, not all first graders produce ties by ellipsis (this is evident from the minimum score of 0) and altogether the number of ellipses produced is quite low (87 in a total of 936 clauses).¹⁸

Tab. 7.10 Descriptive statistics ellipsis density by grade
Ellipsis density (No. of ties by ellipsis per clause)
           Total N   Mean   Standard Deviation   Median   Minimum   Maximum
Grade 1    31        0.09   0.09                 0.07     0         0.38
Grade 4    28        0.15   0.05                 0.16     0.03      0.23
Total      59        0.12   0.08                 0.12     0         0.38

First graders’ mean ellipsis density is 0.09, i.e. they produce roughly 1 ellipsis in 10 clauses.
However, the median in first grade (0.07) is even lower than the mean, which indicates that the mean may be slightly misleading as a measure of central tendency; this question will be addressed again together with the distribution of participants’ individual ellipsis density scores. At the same time first graders’ interindividual variation is extremely high. This is indicated by a standard deviation of 0.09, i.e. a variability ratio of 100%, and a range of 0.38 between the minimum and maximum ellipsis densities obtained in first grade.

Participants’ mean ellipsis density increases comparatively strongly (by ca. 67%) from first to fourth grade so that the mean density rises to 0.15 or 3 ellipses in 20 clauses. Even if the mean number of ellipses is still low, e.g. compared to the results obtained for referential ties, the minimum score (0.03) indicates that all fourth graders produce at least some ellipses. This change also contributes to the decrease in interindividual variation from first to fourth grade. Thus, the standard deviation drops from 0.09 to 0.05 or from 100% to 33% of the mean. At the same time the range drops to 0.2 in fourth as opposed to 0.38 in first grade.

Fig. 7.21 shows the distribution of participants’ individual ellipsis density scores in increasing order as a function of grade; the corresponding values are given in Tab. 11.1 and Tab. 11.2 in the Appendix (ch. 11.1). The first grade curve shows that 26% of the first graders (8 of 31) do not produce any ellipses at all. It is also evident that one first grader (C8-G1-2) distorts the results and needs to be considered a statistical outlier; she obtains the highest score of all participants and produces almost twice as many ellipses per clause (0.38) as the second highest score obtained by a first grader (0.21).¹⁹ These extremes were already indicated by the difference between mean and median described above, and illustrate (as well as being largely responsible for) the very strong interindividual variation in first grade.

18 The terms ellipsis and tie by ellipsis will be used synonymously in the following, since only ellipses functioning cohesively were included in the analysis.
19 C8-G1-2 is a female first grader without monolingual German preschool experience.

The distribution curve also reflects that, as opposed to first graders, all fourth graders realize at least some ellipses. At the same time a substantial overlap in ellipsis density results is evident, which ranges from 0.03 (fourth grade minimum and lowest first grade result) to 0.23 (fourth grade maximum). All fourth graders score within this range but so do 70% of the first graders (22 of 31). Twenty-six percent of the first graders (8 of 31) score below the range, namely zero, while first grade outlier C8-G1-2 (3%) scores clearly above.²⁰ Thus, the distribution indicates that for ellipsis density, too, the main change between first grade and fourth grade relates to the first grade lower scores, in this case the disappearance of zero scores. However, it is not possible to generally associate a high(er) ellipsis density with fourth rather than first graders, since the results of the first graders who do produce ellipses could just as easily stem from fourth graders.

To sum up, all fourth graders produced ties by ellipsis, but not all first graders did so. At the same time the mean number of ties increased from first to fourth grade, even if their number was low in both grades compared to, for example, references and connectives. A strong decrease in interindividual variation accompanied the increase in mean ellipsis density, but nevertheless, the variation in results was found to remain high even in fourth grade.
The distribution of participants’ individual scores showed a large overlap between first grade and fourth grade and also that the main difference between the two grades involved a reduction in the lower scoring ranges, i.e. a higher minimum score in fourth grade caused by the disappearance of zero scores.

20 However, C8-G1-2 does not influence the overlap as such.

Fig. 7.21 Distribution of participants’ individual ellipsis density scores by grade (y-axis: No. of ties by ellipsis per clause; x-axis: individual scores in increasing order; series: Grade 1, Grade 4)

7.2.3.3 Ellipsis density by sex

Tab. 7.11 gives the ellipsis density results obtained by male and female participants in first and fourth grade. As Tab. 7.11 shows, neither all male nor all female first graders produce an ellipsis; both groups’ minimum score is zero. While both groups scarcely use ties by ellipsis, female first graders have a marginally higher mean ellipsis density with 0.1, as opposed to male first graders, whose mean density is 0.07. That is, female first graders produce 1 ellipsis in 10 clauses, while male first graders produce approximately one ellipsis in 20 clauses plus an additional 1 in 50 clauses. However, not only is this difference relatively small (0.03, i.e. 3 ties in 100 clauses) but the female group’s median indicates that the distribution of their results is somewhat skewed and the mean thus somewhat misleading. A comparison of both groups’ medians indicates that their results are even more similar than the mean suggests. At the same time both groups’ interindividual variation is very high. Thus, male results have a standard deviation of 0.07, which corresponds to a full 100% of the mean, while the female standard deviation is 0.09, which corresponds to a variability ratio of 90%. Similarly, male participants’ first grade results have a range of 0.17; the female range (0.38) amounts to approximately twice as much.

Tab.
7.11 Descriptive statistics ellipsis density by sex and grade
Ellipsis density (No. of ties by ellipsis per clause)
                   Total N   Mean   Standard Deviation   Median   Min.   Max.
Grade 1   Male     10        0.07   0.07                 0.07     0      0.17
          Female   21        0.10   0.09                 0.08     0      0.38
          Total    31        0.09   0.09                 0.07     0      0.38
Grade 4   Male     7         0.16   0.05                 0.18     0.05   0.21
          Female   21        0.14   0.05                 0.15     0.03   0.23
          Total    28        0.15   0.05                 0.16     0.03   0.23

Fig. 7.22 gives the distribution of male and female participants’ individual ellipsis density scores in first grade in increasing order; the corresponding numerical values are given in Tab. 11.3 and Tab. 11.4 in the Appendix (ch. 11.2). The distribution of the individual scores shows that 19% of female (4 of 21) and 40% of male first graders (4 of 10) do not produce any ellipses at all. At the same time the steepness of the curves illustrates the large interindividual variation in both groups of first graders. In spite of the large interindividual variation within the two groups, however, there is a large overlap of scores between male and female first graders, ranging from 0.06 (male minimum for first graders producing ellipses) to 0.17 (male maximum); 60% of male (6 of 10) and 43% of female first graders (9 of 21) score within this range. Besides the four males (40%) and four females (19%) who do not use any ellipses and are therefore below the overlap, 14% of the female first graders (3 of 21) score above the overlap. The latter group includes C8-G1-2, who was already identified as a statistical outlier in the previous section.²¹ The distribution of individual scores thus reveals only negligible differences between the two sexes.

Both groups’ mean ellipsis density increases from first to fourth grade. Male participants’ mean density rises a little more (+0.09) than that of the females (+0.04) so that the two groups’ fourth grade results are even more similar: Male (0.16) and female (0.14) mean ellipsis densities differ only by two ellipses in 100 clauses.
At the same time both sexes produce at least some ellipses (male minimum 0.05, female minimum 0.03). While their mean ellipsis density increases, both groups’ interindividual variation decreases and they remain very similar in this respect also. Thus, females’ standard deviation drops from 0.09 in first to 0.05 in fourth grade or from 90% to 36% of the mean. The male standard deviation decreases from 0.07 to 0.05 and in relation to the mean from 100% to 31%. The scoring range for the males decreases only marginally (-0.01) to 0.16, while the female scoring range drops by 0.18, from 0.38 in first to 0.2 in fourth grade. This development is, however, strongly influenced by statistical outlier C8-G1-2. Without this extreme value, the female scoring range would also be only marginally different between first grade (0.21) and fourth grade (0.2).

21 It has now become clear that participant C8-G1-2 is responsible for the difference between female mean and median in first grade, which was discussed above. If the female mean density in first grade is calculated without C8-G1-2, it amounts to 0.08 and is thus clearly closer to the male mean.

Fig. 7.22 Distribution of participants’ individual ellipsis density scores by sex in first grade (y-axis: No. of ties by ellipsis per clause; x-axis: individual scores in increasing order; series: male first graders, female first graders)

Fig. 7.23 shows the distribution of participants’ individual ellipsis density scores as a function of sex and grade and in increasing order, i.e. on the one hand the differences and/or similarities between male and female participants in first and in fourth grade and, on the other hand, the differences between each sex’s first and fourth grade results. The corresponding numerical values are given in Tab. 11.3 to Tab. 11.6 in the Appendix (ch. 11.2).
There is a large overlap of male and female ellipsis density scores in fourth grade, which ranges from 0.05 (male minimum) to 0.21 (male maximum). This range includes all male fourth grade results and 90% of the female fourth graders (19 of 21). One female (5%) obtains a slightly lower score than the overlap (0.03) and another scores slightly higher (0.23). That is, there are no significant observable differences in the distribution of ellipsis density scores between male and female fourth graders.

Fig. 7.23 Distribution of participants’ individual ellipsis density scores by sex and grade (y-axis: No. of ties by ellipsis per clause; x-axis: individual scores in increasing order; series: male first graders, female first graders, male fourth graders, female fourth graders)

With respect to each sex’s first and fourth grade results, the following can be observed: Male first and fourth grade results overlap in the range between 0.05 (fourth grade minimum) and 0.17 (first grade maximum). More than half of the male first graders (60%, 6 of 10) as well as almost half of the male fourth graders (43%, 3 of 7) score within this range. The remaining 40% of male first graders (4 of 10) produce no ellipsis at all. The remaining 57% of male fourth graders (4 of 7), on the other hand, achieve an ellipsis density above the overlap. Similarly, there is an overlap of female first and fourth grade results in the range between 0.03 (fourth grade minimum and first grade lowest result) and 0.23 (fourth grade maximum). This range includes all female fourth graders’ results and 76% of the female first graders (16 of 21). Nineteen percent of the female first graders (4 of 21), on the other hand, are not included, since they do not produce any ellipses. One female first grader (5%), namely statistical outlier C8-G1-2, scores higher than the overlap. Thus, male and female distributions do not differ significantly in either grade.
Instead, a large similarity can be observed: With respect to each group’s first and fourth grade results, female results adhere to the general pattern observed, namely a strong overlap between first and fourth grade scores, an increase in the lower densities, which is marked by the loss of zero scores, and a relative stability in the upper scoring ranges. Male results also adhere to the general trends, but their results overlap a little less and show a more pronounced advantage of the fourth graders. Confirmation of this trend, however, would require that more male first graders be investigated.

To sum up, male and female participants were found to perform very similarly regarding the ellipsis density of their narratives. Both sexes followed the overall trends identified for ellipsis density, namely an increase in mean density coupled with a decrease in interindividual variation. The distribution of both sexes’ results showed a substantial overlap between first and fourth grade scores, although the difference between the two grades was more pronounced for male participants. The distribution also showed that the main difference between first and fourth grade is an increase in the lower scores, more specifically the elimination of zero scores, just as was identified for the overall ellipsis density in the last section.

7.2.3.4 Ellipsis density by experience group

Participants’ ellipsis density results by experience group and grade are given in Tab. 7.12.

Tab. 7.12 Descriptive statistics ellipsis density by experience group and grade
Ellipsis density (No.
of ties by ellipsis per clause)
                   Total N   Mean   Standard Deviation   Median   Minimum   Maximum
Grade 1   Mono†    12        0.09   0.12                 0.04     0         0.38
          Bili     19        0.09   0.06                 0.08     0         0.21
          Total    31        0.09   0.09                 0.07     0         0.38
Grade 4   Mono     12        0.14   0.05                 0.16     0.05      0.20
          Bili     16        0.15   0.05                 0.16     0.03      0.23
          Total    28        0.15   0.05                 0.16     0.03      0.23
† ‘Mono’ denotes children with exclusively monolingual preschool experience, ‘bili’ those with bilingual preschool experience.

As the respective minimum densities show, neither all mono nor all bili first graders produce ellipses. Also, both groups achieve the same mean number of ellipses per clause (0.09), which corresponds to approximately 1 ellipsis in 10 clauses. As the mono group’s median shows, however, the distribution of scores seems to be skewed and, consequently, the mean is somewhat misleading; that is, the mono group actually performs lower than the mean suggests and therefore also lower than the bili group. Even though both experience groups have a high interindividual variation in first grade, the mono group’s variation is larger than that of the bili group. That is, the mono group reaches a standard deviation of 0.12 or 130% of the mean, while the bili group’s standard deviation is “only” 0.06 or 67% of the mean. Similarly, the mono group has a higher range from minimum (0) to maximum score (0.38) than the bili group (from 0 to 0.21).

Fig. 7.24 shows the distribution of mono and bili first graders’ individual ellipsis density scores in increasing order; the corresponding values are given in Tab. 11.7 and Tab. 11.8 in the Appendix (ch. 11.3). As the distribution of both groups’ individual scores indicates, the mono group’s results in first grade are distorted by the previously identified outlier C8-G1-2. If she is eliminated, the group’s mean decreases to 0.06 and is thus slightly lower than that of the bili group.
At the same time the mono group’s standard deviation decreases to 0.07 and the range to 0.21; these values are almost identical to the bili group’s results. Nevertheless, the mono group’s variability ratio (117%) remains much higher than that of the bili group. As the individual density scores depicted in Fig. 7.24 also show, 42% of the mono (5 of 12) but only 16% of the bili first graders (3 of 19) do not produce ellipses at all, i.e. only 58% of the mono (7 of 12) but 84% of the bili first graders (16 of 19) use any ellipses in their narratives. For the mono and bili participants who do produce ellipses, there is an overlap of results in the range of 0.04 (bili minimum) to 0.21 (bili maximum). All bili first graders who produce ellipses are included in this range, i.e. 84% of the bili group, and 42% of the respective mono participants (5 of 12). One mono first grader (8%) scores marginally lower (0.03), while the remaining one (8%), statistical outlier C8-G1-2, produces the highest result of all participants in either grade (0.38). The distribution thus indicates that, on the one hand, there are large similarities between the two groups but that, on the other hand, bili first graders have somewhat of an advantage over mono first graders with respect to producing ties by ellipsis.

Fig. 7.24 Distribution of participants’ individual ellipsis density scores by experience group in first grade (y-axis: No. of ties by ellipsis per clause; x-axis: individual scores in increasing order; series: mono first graders, bili first graders)

The two experience groups’ mean ellipsis density increases to almost the same degree from first to fourth grade (0.05 and 0.06).²² Therefore, the mono group and the bili group again perform almost identically in their mean densities (0.14 and 0.15 respectively); the difference is merely 1 ellipsis in 100 clauses. At the same time all fourth graders, irrespective of experience group, produce at least some ellipses.
As the mean density increases, the interindividual variation, as measured by the standard deviation, decreases very strongly, especially for the mono group, even if it remains high compared to some of the results described earlier in this chapter. The mono group’s standard deviation in fourth grade has thus dropped to only 0.05 or 36% of the mean, which is almost identical to the bili fourth graders’ standard deviation of 0.05 or 33% of the mean. The range of scores, on the other hand, only decreases for the mono group: The bili group’s scoring range remains fairly stable (0.21 in first and 0.2 in fourth grade), while the mono group’s range drops sharply (from 0.38 to 0.15).²³ As a result, the bili fourth graders’ range of ellipsis density scores is marginally larger than that of the mono group (by 0.06).

Fig. 7.25 shows the distribution of mono and bili participants’ individual ellipsis density scores in increasing order as a function of grade, i.e. on the one hand the differences and/or similarities between mono and bili participants in both first grade and fourth grade and, on the other hand, the differences between each experience group’s first and fourth grade results. The corresponding values are given in Tab. 11.7 to Tab. 11.10 in the Appendix (ch. 11.3). The fourth grade curves show an overlap between the mono and the bili scores from 0.05 (mono minimum) to 0.2 (mono maximum) ellipses per clause. This range includes all mono and 81% of bili fourth graders (13 of 16). Two bili fourth graders (13%) obtain marginally higher ellipsis density scores (0.21 and 0.23), while one bili fourth grader (6%) scores marginally lower than the overlap (0.03). That is, no significant difference is observable between mono and bili fourth graders in terms of the distribution of results.
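The overlap comparisons used throughout this chapter amount to sorting one group’s individual scores into those below, within and above the scoring range of a reference group. The following Python sketch illustrates that procedure under stated assumptions: the function name and the six example scores are hypothetical and do not reproduce the Appendix values; only the range bounds (0.05 and 0.2, the mono fourth grade minimum and maximum) are taken from the text.

```python
def classify_against_range(scores, low, high):
    """Count scores below, within, and above the closed range [low, high]."""
    counts = {"below": 0, "within": 0, "above": 0}
    for score in scores:
        if score < low:
            counts["below"] += 1
        elif score > high:
            counts["above"] += 1
        else:
            counts["within"] += 1
    return counts


# Hypothetical individual ellipsis densities (NOT the study's data).
example_scores = [0.03, 0.08, 0.12, 0.16, 0.21, 0.23]

# Reference range: mono fourth grade minimum and maximum as reported in the text.
counts = classify_against_range(example_scores, low=0.05, high=0.20)

# Express each count as a share of the group, rounded to whole percent.
shares = {key: round(100 * n / len(example_scores)) for key, n in counts.items()}
print(counts, shares)
```

With real data, the list of scores would be one subgroup’s individual densities and the bounds the minimum and maximum of the comparison group.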
22 Keeping in mind, however, the one extreme value in the mono first graders’ results, the increase of the mono group’s mean ellipsis density is actually larger than suggested by the difference in mean between first and fourth grade. Without the outlier, the increase amounts to 0.08.
23 When disregarding the mono outlier in first grade, however, the mono group’s range decreases from 0.21 to 0.15, i.e. by only 0.06.

With respect to the mono and the bili group’s first and fourth grade results, the following can be observed: both groups’ curves start out higher in fourth grade (cf. also the respective minimum scores), and there is an overlap of each group’s first and fourth grade ellipsis density scores. Bili results overlap between 0.04 (first grade lowest score) and 0.21 (first grade maximum) with 88% of fourth graders (14 of 16) but also 84% of first graders (16 of 19) achieving an ellipsis density within this range. Three first graders (16%) score zero, and one fourth grader (6%) obtains a marginally lower density than the overlap (0.03). One last bili fourth grader (6%) scores marginally higher (0.23). Mono first and fourth grade results, on the other hand, overlap between 0.05 (fourth grade minimum and first grade lowest density) and 0.2 (fourth grade maximum). This range includes all fourth graders but only 33% of the mono first graders (4 of 12). Fifty percent of the mono first graders (6 of 12) obtain a lower ellipsis density; five of them (42% of the mono first graders) score zero. Two mono first graders (17%) obtain a higher density (0.21 and 0.38). Thus, the comparison between first and fourth grade results shows that both experience groups follow the trend of an increase in the lower scores with a comparative stability in the higher density results. This phenomenon, which was first described for participants’ overall cohesive density scores, had also been observed for references and connectives.
However, the difference between first and fourth grade results is again quite pronounced for the mono group, while the great majority of bili first and fourth graders achieve similar ellipsis density scores.

Fig. 7.25 Distribution of participants’ individual ellipsis density scores by experience group and grade (y-axis: No. of ties by ellipsis per clause; x-axis: individual scores in increasing order; series: mono first graders, bili first graders, mono fourth graders, bili fourth graders)

To sum up, neither all bili nor all mono first graders produced ellipses, while in fourth grade both experience groups used at least some ellipses. In each grade mono and bili participants achieved the same or only marginally different mean ellipsis densities and both groups followed the general trend of an increase in mean ellipsis density from first to fourth grade. However, some differences were observed between experience groups in regard to their interindividual variation. That is, the mono first graders’ results were much more heterogeneous than those of the bili first graders, not only due to a statistical outlier. The mono and the bili group’s variation followed the general trend of a decrease in variation from first to fourth grade and, since the mono group’s variability decreased more strongly, there was almost no difference between the two groups by the end of fourth grade. The distribution of mono and bili participants’ individual ellipsis density scores indicated an advantage of the bili over the mono first graders, while no differences between groups were found in fourth grade. The distribution also showed a large overlap between each group’s first and fourth grade results and an increase in the lowest scores as the main difference between grades, which reflected the findings for the overall ellipsis density. However, the overlap in scores between first grade and fourth grade was less pronounced for mono participants, i.e.
their results differed more markedly between grades. Thus, distribution and interindividual differences pointed towards an effect of L2 preschool experience on the first grade results, even though no such effect was observable in the mean ellipsis densities.

7.2.3.5 Ellipsis density: Statistical results

Factorial ANOVAs were conducted to statistically test main and interaction effects of grade, sex, and L2 experience group on the overall ellipsis density (cf. ch. 5.2.3). For an ANOVA including all participants, the test of homogeneity of variance was significant, albeit barely (F(7, 51)=2.29, p=0.042), so that the ANOVA was still conducted. This ANOVA found a statistically significant main effect of grade (F(1, 54)=5.83, p<0.05, partial η²=0.1), which explained 10% of the variance in results, but there were no significant main effects of experience group (F(2, 54)=1.2, ns) or sex (F(1, 54)=0.38, ns) and no significant interaction effects.

A second ANOVA was conducted without participant C8-G1-2, who was identified as a statistical outlier (cf. the previous sections). When C8-G1-2 was excluded from the data, the inhomogeneity of variance disappeared. However, the ANOVA did not yield any different results: there was again a highly significant main effect only of grade (F(1, 53)=9.76, p<0.01, partial η²=0.16), even if the size of the grade effect was larger in this second run. That is, grade explained 16% of the variance in results once C8-G1-2 was eliminated. Sex (F(1, 53)=0.28, ns) and L2 preschool experience (F(2, 53)=2.15, ns), on the other hand, were once again not significant. Similarly, none of the possible interaction effects was significant, even though the interaction between sex and grade showed a trend towards significance (F(1, 53)=3.1, p=0.09, ns).

Thus, the statistical analysis shows that, across all subgroups, grade is a highly significant factor for explaining variation in participants’ ellipsis density results.
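The effect sizes reported for grade can be cross-checked against the F statistics, since partial η² follows from an F value and its degrees of freedom as F·df_effect / (F·df_effect + df_error). The short Python sketch below applies this standard relation to the two grade effects; the function name is illustrative and not part of the study’s own analysis.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of grade, all participants: F(1, 54) = 5.83
eta_all = partial_eta_squared(5.83, df_effect=1, df_error=54)

# Main effect of grade, outlier C8-G1-2 excluded: F(1, 53) = 9.76
eta_without_outlier = partial_eta_squared(9.76, df_effect=1, df_error=53)

print(round(eta_all, 2), round(eta_without_outlier, 2))
```

Rounded to two decimals, both values agree with the reported effect sizes, i.e. roughly 10% and 16% of the variance explained by grade.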
Sex or experience group, on the other hand, do not have a significant influence on participants’ ellipsis density.

7.2.3.6 Summary: Substitution and ellipsis density

The present section aimed to answer the question of how often participants use ties by substitution or ellipsis and whether there are any differences attributable to grade, sex or L2 preschool experience. It was found that only fourth graders produced substitutions and that the overall number of substitutions in the data (5) remained marginal. Due to their marginal occurrence, no further analyses of the substitutions were conducted.

Ellipses were produced more often than substitutions: All fourth graders and 70% of the first graders (22 of 31) were found to produce ties by ellipsis. However, the overall number of ellipses produced was also low compared to references and connectives, and participants’ interindividual variation was very high. Grade was found to have a significant observable and statistical effect on participants’ ellipsis density, even though the effect size was relatively low, just as for overall cohesive density and connectives, and grade explained (depending on the set-up of the statistical test) only up to 16% of the variance. Thus, first graders achieved a mean ellipsis density of 0.09, i.e. they used roughly 1 tie by ellipsis per 10 clauses, while by the end of fourth grade the mean density had risen to 0.15, i.e. fourth graders produced approximately 3 ellipses per 20 clauses. While participants’ mean density increased, their interindividual variation decreased (from a standard deviation of 0.09 or 100% of the mean in first to 0.05 or 33% of the mean in fourth grade).
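The percentages given alongside the standard deviations are variability ratios, i.e. the standard deviation expressed as a share of the mean. As a rough check, the Python sketch below recomputes them, together with the relative increase in mean density, from the rounded values in Tab. 7.10. The function name is an illustrative assumption, and because the tabulated values are themselves rounded, ratios recomputed this way for other subgroups may deviate slightly from those reported.

```python
def variability_ratio(sd: float, mean: float) -> int:
    """Standard deviation as a percentage of the mean, rounded to whole percent."""
    return round(100 * sd / mean)


# Rounded values from Tab. 7.10 (ellipsis density by grade).
grade1_ratio = variability_ratio(sd=0.09, mean=0.09)
grade4_ratio = variability_ratio(sd=0.05, mean=0.15)

# Relative increase of the mean density from first to fourth grade, in percent.
mean_increase = round(100 * (0.15 - 0.09) / 0.09)

print(grade1_ratio, grade4_ratio, mean_increase)
```

The three results correspond to the 100%, 33% and ca. 67% cited for overall ellipsis density.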
The distribution of participants’ ellipsis density scores showed a large overlap between first grade and fourth grade, just as was found for overall cohesive density, references, and connectives; that is, 70% of the first graders (22 of 31) achieved an ellipsis density that could also have been produced by a fourth grader. A comparison between the first and fourth grade distribution curves showed that again the main difference was an increase in the minimum score from first to fourth grade. However, in the case of ellipsis density this higher minimum was due to a disappearance of participants not producing any ellipses; the minimum density itself remained very low and the variation high. That is, the distribution showed that the statistically significant difference between first and fourth grade ellipsis density is mainly attributable to the first graders not producing ellipses, which explains the relatively low effect size obtained in the statistical analysis.

Sex was not found to have a significant observable or statistically significant influence on mean ellipsis density or interindividual variation. Instead, male and female participants performed very similarly in first and fourth grade (especially when female outlier C8-G1-2 was disregarded) and they followed the general trends of an increase in mean ellipsis density and a decrease in interindividual variation. The distribution curves showed some distinction between male and female participants in that the difference between the first and the fourth grade curves was more pronounced for the male group. However, this phenomenon is most likely attributable to a group size effect, especially since the statistical analysis showed no significant differences between male and female participants.

L2 preschool experience was not found to have a significant observable or statistically significant effect on participants’ mean ellipsis density.
That is, only marginal differences in mean density were observed between the two experience groups in either first grade or fourth grade and both groups followed the overall trend of an increase in mean density and a decrease in variability. However, L2 preschool experience influenced participants’ interindividual variation and the distribution of the individual scores. More specifically, the ellipsis density of participants with L2 preschool experience varied only about half as much in first grade as the density of children without such prior L2 experience (the bili standard deviation was 0.06 or 67% in relation to the mean vs. 0.12 or 130% of the mean for the mono group). By the end of fourth grade the immense difference in variation had disappeared (mono fourth graders had a standard deviation of 0.05 or 36% of the mean, bili fourth graders a standard deviation of 0.05 or 33% of the mean). The distribution of mono and bili individual ellipsis density scores also showed a slight bili advantage in first grade in that 84% of the bili (16 of 19) but only 58% of the mono first graders (7 of 12) produced ellipses. Correspondingly, a comparison of monos’ and bilis’ first and fourth grade curves showed that the difference between first and fourth grade was more pronounced for mono than for bili participants. To sum up, it was found that grade significantly influenced ellipsis density. Besides the fact that all fourth but not all first graders produced ellipses, the same two main developmental trends were identified for ellipsis density as for the other densities, namely (1) a (statistically significant) increase in mean ellipsis density and (2) a strong decrease in interindividual variation. Even in fourth grade, however, participants’ mean ellipsis density remained comparatively low and the variation comparatively high. 
The respective distributions qualified the statistically significant increase in mean ellipsis density, since they showed, on the one hand, a substantial overlap between the first and the fourth grade ellipsis density scores and, on the other hand, that the difference between first and fourth grade consisted mainly of the elimination of zero ellipsis scores. Sex was not found to have any influence on participants’ results for ellipsis density. L2 preschool experience, however, was observed to affect the first grade results in that first graders with L2 preschool experience had a lower interindividual variation and were more likely to produce ellipses than first graders without L2 preschool experience. By the end of fourth grade this difference had also disappeared.

7.2.4 Lexical cohesion

7.2.4.1 Overall lexical density

The present section is aimed at answering the question of how often participants use lexical ties and whether there are any differences attributable to grade, sex or L2 preschool experience. Tab. 7.13 gives the descriptive statistics for participants’ lexical density in first and in fourth grade.

Tab. 7.13 Descriptive statistics lexical density by grade
Lexical density (No. of lexical ties per clause)

| Grade | Total N | Mean | Standard Deviation | Median | Minimum | Maximum |
|---|---|---|---|---|---|---|
| Grade 1 | 31 | 1.78 | 0.31 | 1.76 | 1.22 | 2.52 |
| Grade 4 | 28 | 1.92 | 0.18 | 1.89 | 1.58 | 2.31 |
| Total | 59 | 1.84 | 0.26 | 1.88 | 1.22 | 2.52 |

As Tab. 7.13 shows, all first graders, as well as all fourth graders, produce lexical ties. In addition, the results obtained for lexical density are the highest of the subcategories of cohesion analyzed and also the ones with the smallest amount of interindividual variation and the highest minimum density. More specifically, first graders produce a minimum of 1.22 lexical ties per clause (or approximately five ties in four clauses), and they achieve a mean number of 1.78 lexical ties, i.e. on average they use slightly fewer than two lexical cohesive devices per clause.
First graders’ interindividual differences are low, compared to ellipses and connectives (cf. the preceding sections), with a standard deviation of 0.31 or 17% of the mean, even if the range of their results is quite high with 1.3 lexical ties per clause from minimum (1.22) to maximum score (2.52). From first to fourth grade participants’ mean lexical density increases to 1.92 (+0.14), i.e. fourth graders produce roughly 3 lexical ties more in 20 clauses than do first graders. While the mean lexical density increases, the interindividual variation decreases. This decrease is evident from an even lower standard deviation, which amounts to 0.18 or 9% of the mean, and a reduction in range from 1.3 in first to 0.73 in fourth grade.

Fig. 7.26 shows the distribution of participants’ individual lexical density scores in increasing order as a function of grade; the corresponding values are given in Tab. 11.1 and Tab. 11.2 in the Appendix (ch. 11.1). There is a large overlap between the first and the fourth grade lexical density results, which ranges from 1.58 (fourth grade minimum) to 2.31 (fourth grade maximum). All fourth graders achieve a lexical density within this range but so do 68% of the first graders (21 of 31). Twenty-nine percent of the first graders (9 of 31) obtain a lower lexical density.[24] One first grader (3%, C8-G1-16[25]), however, outperforms all other participants by producing 2.52 lexical ties per clause, i.e. 0.24 ties per clause or 5 ties in 20 clauses more than the next highest first grade score (2.28). Participant C8-G1-16 can thus be considered a statistical outlier.

[24] Up to 0.36 ties per clause lower.
[25] A female first grader without L2 preschool experience.

A comparison of the first and fourth grade curves thus yields the same developmental trend for lexical cohesion as has been described repeatedly for the use of cohesive devices, namely that the
main difference between first grade and fourth grade concerns a reduction of the lower scores, which in the case of lexical ties means an increase in minimal density.

[Fig. 7.26 Distribution of participants’ individual lexical density scores by grade. Y-axis: No. of lexical ties per clause; x-axis: individual scores in increasing order; series: Grade 1, Grade 4.]

To sum up, all participants produced lexical ties, and lexical cohesion was found to have the highest density of all subcategories under investigation, as well as the highest minimum density. From first to fourth grade participants’ mean lexical density increased and their interindividual variation decreased. The distribution of participants’ individual density results showed that, once again, in the case of lexical density, the majority of first graders achieved results that might just as well have been produced by a fourth grader. Additionally, the distribution showed that the main difference between first grade and fourth grade was again an increase in participants’ minimum score, i.e. an elimination of the lower first grade scores.

7.2.4.2 Lexical density by sex

Tab. 7.14 gives the lexical density results obtained by male and female participants in first and in fourth grade. Male first graders produce an average of 1.85 lexical ties per clause, i.e. 0.11 more than female first graders, who use an average of 1.74 lexical ties per clause.[26]

[26] This seems to point to a slight advantage of the male first graders. However, keeping in mind the strong difference in group size, this advantage would have to be confirmed in a larger data set and thus cannot be considered a robust finding.

Tab. 7.14 Descriptive statistics lexical density by sex and grade
Lexical density (No. of lexical ties per clause)

| Grade | Sex | Total N | Mean | Standard Deviation | Median | Minimum | Maximum |
|---|---|---|---|---|---|---|---|
| Grade 1 | Male | 10 | 1.85 | 0.26 | 1.93 | 1.41 | 2.28 |
| Grade 1 | Female | 21 | 1.74 | 0.33 | 1.70 | 1.22 | 2.52 |
| Grade 1 | Total | 31 | 1.78 | 0.31 | 1.76 | 1.22 | 2.52 |
| Grade 4 | Male | 7 | 1.92 | 0.17 | 1.87 | 1.76 | 2.25 |
| Grade 4 | Female | 21 | 1.92 | 0.19 | 1.93 | 1.58 | 2.31 |
| Grade 4 | Total | 28 | 1.92 | 0.18 | 1.89 | 1.58 | 2.31 |

At the same time the female results are slightly more varied, even though the interindividual differences of both sexes are low compared to some of the other densities. That is, female first graders’ results have a standard deviation of 0.33 or 19% of the mean, while the male group’s standard deviation is 0.26 or 14% of the mean. Similarly, female first graders score on a range of 1.3 from minimum (1.22) to maximum (2.52), as opposed to a male range of 0.87 (from 1.41 to 2.28). The distribution of male and female first graders’ individual lexical density scores, which is shown in Fig. 7.27 (the corresponding values are given in Tab. 11.3 and 11.4 in the Appendix (ch. 11.2)), reflects these differences in the range of scores. The distribution also shows that male and female first graders’ results overlap to a large degree. This overlap ranges from 1.41 (male minimum) to 2.28 (male maximum) and includes all male first graders and also the large majority of female first graders (81%, i.e. 17 of 21). Three females (14%) obtain a lower lexical density and one (5%), the statistical outlier identified in the last section, obtains a higher one. The distribution thus shows only negligible differences between the two sexes in first grade.

[Fig. 7.27 Distribution of participants’ individual lexical density scores by sex in first grade. Y-axis: No. of lexical ties per clause; x-axis: individual scores in increasing order; series: male first graders, female first graders.]

Both male and female lexical densities increase from first to fourth grade, albeit to different degrees (Tab.
7.14): The mean number of lexical ties increases only marginally for the male group, namely by 0.07 or 7 ties in 100 clauses. Females’ lexical density, on the other hand, increases by 0.18, i.e. by almost 2 ties per 10 clauses. As a result, both male and female fourth graders produce the same mean number of lexical ties, namely 1.92 per clause. While the mean increases only slightly, both groups’ interindividual variation decreases comparatively strongly from first to fourth grade. The male group’s standard deviation declines from 0.26, i.e. a variability ratio of 14%, to an even lower 0.17, i.e. a variability ratio of merely 9%. At the same time the male participants’ scoring range decreases from 0.87 to 0.49. The female group’s standard deviation, on the other hand, declines from 0.33 to 0.19, i.e. from 19% to 10% of the mean; their scoring range simultaneously decreases from 1.3 in first to 0.73 in fourth grade.

[Fig. 7.28 Distribution of participants’ individual lexical density scores by sex and grade. Y-axis: No. of lexical ties per clause; x-axis: individual scores in increasing order; series: male first graders, female first graders, male fourth graders, female fourth graders.]

Fig. 7.28 shows the distribution of participants’ individual lexical density scores in increasing order as a function of sex and grade, i.e. on the one hand the differences and/or similarities between male and female participants in first and fourth grade and, on the other hand, the differences between the two sexes’ first and fourth grade results. The corresponding numerical values are given in Tab. 11.3 to 11.6 in the Appendix (ch. 11.2). A comparison of the male and female fourth grade distribution curves shows an overlap between 1.76 (male minimum) and 2.25 lexical ties per clause (male maximum). All male and 71% of female fourth graders (15 of 21) reach a lexical density within this range.
Twenty-four percent of females (5 of 21), however, score below and one of them (5%) scores above the overlap. That is, the distribution of individual participants’ lexical density scores seems to indicate a slight advantage of the male fourth graders over the female fourth graders. However, this finding would also need to be further investigated in a larger data set.

A comparison of each sex’s first grade distribution and that of fourth grade shows an overlap in scores between the two grades. Male first and fourth grade results overlap in the range of 1.76 (male fourth grade minimum) to 2.25 (male fourth grade maximum). All male fourth graders achieve a lexical density within this range as compared to 60% of the first graders (6 of 10). Roughly one third of the male first graders (30%, 3 of 10) score lower,[27] while one male first grader (10%) obtains an even higher lexical density than the upper limit of the overlap (2.28). Female lexical density results, on the other hand, overlap between 1.58 (female fourth grade minimum) and 2.31 (female fourth grade maximum). Again, all fourth graders are included in this range as compared to 62% of the first graders (13 of 21). One third of the female first graders (33%, 7 of 21), like the males, also score lower than their fourth grade peers, while female first grader C8-G1-16 (5%), already described as having produced the largest number of lexical ties of all participants, scores higher. Thus, a comparison of each sex’s first grade lexical density distribution and that of fourth grade confirms the general trend of an increase in the lower lexical density scores as opposed to a comparative stability in the upper ones.

To sum up, only negligible differences were found between male and female participants in either grade, although male first graders achieved a somewhat higher lexical density than female first graders.
Both groups followed the general developmental trend identified for lexical density, namely an increase in mean density and a decrease of the interindividual variation to an even lower level than in first grade. Similarly, the distribution of male and female participants’ individual lexical density scores followed the findings for the overall distribution, namely a strong overlap between first and fourth grade results and an increase in minimum density scores as the main difference between grades.

[27] Between 0.03, i.e. only marginally lower, and 0.35, i.e. substantially lower.

7.2.4.3 Lexical density by experience group

Tab. 7.15 gives participants’ lexical density results as a function of experience group and grade. Bili first graders produce a mean lexical density of 1.82 ties per clause, i.e. roughly 18 ties in 10 clauses, while mono first graders use an average of 1.7 lexical ties per clause or 17 per 10 clauses. That is, bili first graders produce, on average, 0.12 lexical ties per clause more, or roughly 1 tie more in 10 clauses. Both experience groups’ interindividual variation is comparatively low in first grade and the mono group’s results are again (a little) more heterogeneous: The bili group’s results have a standard deviation of 0.25 or 14% of the mean, as compared to the mono group with a standard deviation of 0.38 or 22% of the mean. The mono group also has a larger scoring range (1.3) than does the bili group (0.87).

Tab. 7.15 Descriptive statistics lexical density by experience group and grade
Lexical density (No. of lexical ties per clause)

| Grade | Experience group | Total N | Mean | Standard Deviation | Median | Minimum | Maximum |
|---|---|---|---|---|---|---|---|
| Grade 1 | Mono† | 12 | 1.70 | 0.38 | 1.67 | 1.22 | 2.52 |
| Grade 1 | Bili | 19 | 1.82 | 0.25 | 1.88 | 1.41 | 2.28 |
| Grade 1 | Total | 31 | 1.78 | 0.31 | 1.76 | 1.22 | 2.52 |
| Grade 4 | Mono | 12 | 1.87 | 0.18 | 1.82 | 1.68 | 2.31 |
| Grade 4 | Bili | 16 | 1.95 | 0.18 | 1.94 | 1.58 | 2.25 |
| Grade 4 | Total | 28 | 1.92 | 0.18 | 1.89 | 1.58 | 2.31 |

† ‘Mono’ denotes children with exclusively monolingual preschool experience, ‘bili’ those with bilingual preschool experience.
Fig. 7.29 shows the distribution of mono and bili first graders’ individual lexical density scores in increasing order. The corresponding numerical values are given in Tab. 11.7 and 11.8 in the Appendix (ch. 11.3). The distribution curves show an overlap of mono and bili first graders’ lexical density between 1.41 (bili minimum) and 2.28 (bili maximum). All 19 bili first graders achieve a lexical density within this range as compared to 67% of the mono first graders (8 of 12). Twenty-five percent, i.e. three of the mono participants, achieve a lower lexical density, and one mono first grader (8%, C8-G1-16), who was already described as a statistical outlier in the previous sections, obtains a higher one (2.52). Thus, the first grade distribution seems to indicate a slight advantage of the bili group in that mono first graders are a little more likely to produce lower lexical density scores than bili first graders.

[Fig. 7.29 Distribution of participants’ individual lexical density scores by experience group in first grade†. Y-axis: No. of lexical ties per clause; x-axis: individual scores in increasing order; series: mono first graders, bili first graders.]

Mono and bili participants’ mean lexical densities increase from first to fourth grade (+0.17 and +0.13 respectively). As a result of the slightly different increase rates, their mean lexical densities become more alike. That is, the bili group still produces a larger number of lexical ties per clause (1.95) in fourth grade, but the mono group’s mean lexical density (1.87) is only marginally lower with a difference of merely 0.08 or 8 ties in 100 clauses.
Both groups’ interindividual variation decreases from first to fourth grade and by the end of fourth grade it has become virtually the same; that is, the bili fourth graders’ results have a standard deviation of 0.18 or 9% of the mean and a range of 0.67. The mono fourth graders’ results also have a standard deviation of 0.18, which corresponds to 10% of their mean, and a scoring range of 0.63.

[Fig. 7.30 Distribution of participants’ individual lexical density scores by experience group and grade†. Y-axis: No. of lexical ties per clause; x-axis: individual scores in increasing order; series: mono first graders, bili first graders, mono fourth graders, bili fourth graders.]

Fig. 7.30 shows the distribution of mono and bili participants’ individual lexical density scores in increasing order, i.e. on the one hand the differences and/or similarities between mono and bili participants in both first grade and fourth grade and, on the other hand, the differences between each experience group’s first and fourth grade results. The corresponding values are given in Tab. 11.7 to 11.10 in the Appendix (ch. 11.3). There is an overlap of mono and bili fourth grade lexical density scores between 1.68 (mono minimum) and 2.25 (bili maximum). The great majority of both groups’ members, namely 92% of the mono (11 of 12) and 94% of the bili participants (15 of 16), achieve a lexical density within this range. Only one mono fourth grader (8%, C5-G4-12) scores higher (2.31) and one bili fourth grader lower (6%, C5-G4-6, 1.58 lexical ties per clause). That is, the distribution of individual participants’ lexical density scores shows virtually no difference between mono and bili lexical density scores in fourth grade.
A comparison between each experience group’s first and fourth grade results again shows large overlaps; this phenomenon was already established not only for the overall lexical density but also for the other subcategories of cohesion. With respect to lexical cohesion, the mono group’s scores overlap between 1.68 (fourth grade minimum) and 2.31 (fourth grade maximum). This range includes all 12 mono fourth graders but only 42% of the mono first graders (5 of 12). Fifty percent of the mono first graders (6 of 12) obtain a lexical density below the overlapping range, while one mono first grader (8%) obtains a density above this range (statistical outlier C8-G1-16). Bili first and fourth grade scores, on the other hand, overlap between 1.58 (fourth grade minimum) and 2.25 (fourth grade maximum). All 16 bili fourth graders obtain a lexical density within this range but so do 74% of the bili first graders (14 of 19). Twenty-one percent of the bili first graders (4 of 19) achieve a lower lexical density (between 0.03 and 0.17 lexical ties per clause lower) and one (5%, C1-G1-2) achieves a slightly higher lexical density (2.28) than the overlap. That is, the comparison of each experience group’s first and fourth grade curves reveals once more that fourth graders do not necessarily achieve a higher lexical density than first graders. Instead, there is a relatively large overlap, as has been found repeatedly in the analysis of cohesion, and the main difference between grades consists of a reduction of the lower first grade scores. However, the distribution also shows that mono first graders tend to score within the fourth grade range less often than bili first graders do, and thus that their increase in the lower scores is more pronounced.
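The overlap comparisons used throughout this section amount to sorting each first grade score into one of three bins relative to the fourth grade range. The following Python sketch only illustrates that bookkeeping. The function names and the sample scores are hypothetical (the individual scores are listed in the Appendix, not here); only the bili fourth grade range (1.58 to 2.25) and the reported bili first grade counts (4 below, 14 within, 1 above) are taken from the text.

```python
def classify_against_range(scores, low, high):
    """Count how many scores fall below, within, or above [low, high]."""
    counts = {"below": 0, "within": 0, "above": 0}
    for score in scores:
        if score < low:
            counts["below"] += 1
        elif score > high:
            counts["above"] += 1
        else:
            counts["within"] += 1
    return counts


def as_percentages(counts):
    """Express each count as a rounded percentage of the group size."""
    total = sum(counts.values())
    return {key: round(100 * value / total) for key, value in counts.items()}


# Hypothetical first grade scores, invented for illustration only.
example_scores = [1.41, 1.55, 1.60, 1.82, 1.88, 2.10, 2.28]
# Bili fourth grade range (minimum and maximum as reported in Tab. 7.15).
example_counts = classify_against_range(example_scores, low=1.58, high=2.25)

# Counts reported in the text for the 19 bili first graders.
reported_bili = as_percentages({"below": 4, "within": 14, "above": 1})
```

Applied to the reported counts, the percentages come out as 21%, 74% and 5%, which matches the figures given for the bili first graders above.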
To sum up, mean lexical density, interindividual variation and distribution taken together seem to indicate a slight advantage of first graders with L2 preschool experience over first graders without prior experience. That is, bili first graders produced a somewhat higher mean density, their results varied less and they were less likely to produce low lexical density scores. Both experience groups followed the general trends of an increase in mean lexical density and a decrease in variation. As a result of these two developments and distinct increase/decrease rates, any differences attributable to L2 preschool experience were negligible by the end of fourth grade. A comparison of the mono and the bili groups’ first and fourth grade distribution of individual scores showed that both groups followed the general pattern of a large overlap between first and fourth grade (lexical density) scores and an increase in the lowest (lexical density) scores as the main difference between grades. However, this difference between first and fourth grade was again found to be more pronounced for the mono group.

7.2.4.4 Lexical density: Statistical results

General linear model ANOVAs were conducted to statistically test main and interaction effects of grade, sex, and L2 experience group on the overall lexical density (cf. ch. 5.2.3). For an ANOVA including all participants there was a barely significant inhomogeneity of variance (F(7, 51)=2.26, p=0.044), so that the ANOVA could still be carried out. There were no significant main effects of grade (F(1, 51)=1.07, ns), sex (F(1, 51)=0.41, ns) or experience group (F(1, 51)=0.62, ns) nor any interaction effects. For a second ANOVA, participant C8-G1-16, who had been identified as a statistical outlier (see above), was excluded. As a result, the inhomogeneity of variance disappeared (F(7, 50)=1.75, ns). The second ANOVA without C8-G1-16 nevertheless confirmed the findings of the first one, i.e.
there were also no significant main effects of grade (F(1, 50)=1.98, ns), sex (F(1, 50)=0.93, ns) or experience group (F(1, 50)=1.27, ns) and no interaction effects. To sum up, none of the three factors investigated plays a significant role in explaining variance in lexical density, either by itself or in interaction with the other two factors.

7.2.4.5 Summary: Lexical density

The present section was aimed at answering the question of how often participants use lexical ties and whether there are any differences attributable to grade, sex or L2 preschool experience. First of all, it was found that all participants produced lexical ties and that lexical cohesion had the highest density of all subcategories of cohesion under investigation; even in first grade the minimum score was 1.22 lexical ties per clause. That is to say, all participants produced at least one lexical tie per clause, plus an additional one in five clauses. At the same time participants’ interindividual variation was very low in both grades compared to, for example, the variation in ellipsis or connective density.

With respect to an influence of grade, participants’ lexical density followed two developmental trends, which were already observed for the other densities, namely an increase in mean density and a decrease in interindividual variation: Participants’ mean lexical density increased from 1.8 lexical ties per clause in first grade to 1.92 ties per clause in fourth grade. That is, first and fourth graders both used approximately two lexical ties per clause, but fourth graders produced an average of 1 lexical tie more per 10 clauses; this difference was not found to be statistically significant, however. While participants’ mean density increased, their interindividual variation decreased from a comparatively low standard deviation of 0.3 in first grade to an even lower one of 0.18 in fourth grade, which corresponds to a decrease in variability ratio from 17 to 9%.
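The two derived figures used in this summary, the variability ratio (standard deviation as a percentage of the mean) and the number of additional ties over a stretch of clauses, are simple arithmetic on the values in Tab. 7.13. The sketch below is only an illustration of that arithmetic; the function names `variability_ratio` and `extra_ties` are hypothetical labels, not terms from the study's analysis procedure.

```python
def variability_ratio(standard_deviation, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return 100 * standard_deviation / mean


def extra_ties(mean_low, mean_high, clauses):
    """Additional ties expected over a given number of clauses."""
    return (mean_high - mean_low) * clauses


# Values from Tab. 7.13 (lexical ties per clause).
grade1 = {"mean": 1.78, "sd": 0.31}
grade4 = {"mean": 1.92, "sd": 0.18}

ratio_grade1 = variability_ratio(grade1["sd"], grade1["mean"])  # about 17.4
ratio_grade4 = variability_ratio(grade4["sd"], grade4["mean"])  # about 9.4
ties_per_10 = extra_ties(grade1["mean"], grade4["mean"], clauses=10)  # about 1.4
```

Rounded to whole percentages, the ratios give the 17% and 9% reported above, and the mean difference of 0.14 corresponds to between one and two additional lexical ties per 10 clauses.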
The distribution of participants’ individual scores in first and in fourth grade showed a pattern similar to that found for the overall cohesive density of participants’ narratives, as well as for the other subcategories of cohesion. That is, a substantial percentage of first graders (71%) achieved a lexical density that could also have been obtained by a fourth grader and the main difference between first grade and fourth grade again consisted of an increase in the lower lexical density scores (marked by an increase in participants’ minimum density). However, this phenomenon was more pronounced for some subgroups than for others.

Sex was not found to have a significant observable or a statistically significant influence on participants’ lexical density; any differences, such as, for example, a slightly higher mean density of male first graders, were too small to discard the possibility of a group size effect. In addition, both sexes followed the general trends of an increase in mean density, a decrease in variation, and a strong overlap between first and fourth grade scores.

L2 preschool experience was observed to have some influence on participants’ lexical density, but this influence was limited to first grade. First of all, it was observed that first graders with L2 preschool experience (bili group) achieved a somewhat higher mean lexical density (1.82) than those without prior experience (mono group; 1.7). However, this observed difference was not statistically significant. Secondly, the bili participants’ results varied a little less, i.e. the bili group reached a standard deviation of 0.25 or 14% of the mean and the mono group 0.38 or 22% of the mean. Additionally, a comparison between both groups’ distribution curves in first grade showed that the mono first graders were a little more likely to obtain a low lexical density than were the bili first graders. Correspondingly, the bili first and fourth grade results overlapped more strongly.
However, both experience groups followed the general trend of an increase in mean density, a (further) decrease in variation and, with respect to the distribution of the individual results, an increase in the lower scoring ranges. Due to distinct increase/decrease rates, only marginal differences attributable to L2 preschool experience remained by the end of fourth grade.

To sum up, participants’ lexical density followed the same developmental trends found for the other subcategories of cohesion, namely an increase in mean density, which was, however, not statistically significant, and a decrease in variation. These trends applied in differing degrees to the subgroups under investigation. Sex was not found to influence lexical density results whereas L2 preschool experience, on the other hand, had an observed, albeit not a statistically significant, effect on participants’ lexical density in first grade. That is, the mean density produced by children with L2 preschool experience was a little higher and they performed a little more homogeneously than did the children without prior experience. Another general pattern, which was also found for lexical density, was a strong overlap between participants’ first and fourth grade scores and a reduction of the lower scores from first to fourth grade paired with relative stability in the upper scoring ranges. This pattern was again influenced by L2 preschool experience in that more bili than mono first graders achieved a lexical density in the range of their fourth grade peers and the reduction of the lower scores was more pronounced for the mono participants.

7.3 Contribution of the subcategories to the overall cohesive density

7.3.1 Overall contribution

The previous sections described how often participants use each subcategory of cohesion, namely references, connectives, substitutions, ellipses, and lexical cohesion.
From the results a clear order of mean densities emerges, which holds true for both first grade and fourth grade: Lexical ties have the highest mean densities, followed by references and connectives. The use of ellipses and substitutions, on the other hand, remains marginal in both grades. One question that remains to be answered, however, is to what degree each subcategory contributes to the overall cohesiveness of participants’ stories, and whether there are any differences attributable to grade, sex or experience group in regard to this question. The percentage of each subcategory’s density relative to the overall cohesive density, i.e. each subcategory’s proportion, is the most appropriate measure for this purpose.

As Fig. 7.31 shows, lexical ties make up more than half (51.5%) and references roughly one third of the overall cohesive density (33.1%) in first grade. Connectives constitute almost 13% of the overall cohesive density in first grade, while ellipses contribute only marginally (2.5%). As described earlier, substitutions do not occur at all in first grade. The contribution of the individual subcategories does not change much from first to fourth grade (Fig. 7.32).[28] Lexical cohesion remains the most important subcategory (49.1%), followed by references (31%); the contribution by these two categories decreases a little between grades one and four (-2.4% and -2.1% respectively). The third most important subcategory of cohesion is again connectives (16.1%), the contribution of which increases (+3.2%). As in first grade, ellipses contribute only to a small degree to overall cohesion (3.8%), even though their contribution has also increased (+1.3%) and even though this increase is quite remarkable when compared to their first grade contribution. Finally, although some substitutions were produced in fourth grade, their number is still so small that substitutions account for less than 1% of the fourth grade cohesion (0.05%).
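The changes quoted above are differences in percentage points, whereas the remark that the increase in ellipses is remarkable "compared to their first grade contribution" refers to a relative change. A minimal Python sketch can make the two readings explicit. It uses the shares reported for first and fourth grade (Fig. 7.31 and Fig. 7.32); the function name `share_changes` and the rounding choices are assumptions made for illustration only.

```python
# Shares of the overall cohesive density in percent (Fig. 7.31 and Fig. 7.32).
first_grade = {
    "lexical cohesion": 51.5,
    "references": 33.1,
    "connectives": 12.9,
    "ellipses": 2.5,
    "substitutions": 0.0,
}
fourth_grade = {
    "lexical cohesion": 49.1,
    "references": 31.0,
    "connectives": 16.1,
    "ellipses": 3.8,
    "substitutions": 0.05,
}


def share_changes(before, after):
    """Return (percentage-point change, relative change in %) per subcategory.

    The relative change is None where the first grade share is zero,
    because a relative change from zero is undefined.
    """
    result = {}
    for name in before:
        absolute = round(after[name] - before[name], 2)
        relative = round(100 * absolute / before[name], 1) if before[name] else None
        result[name] = (absolute, relative)
    return result


changes = share_changes(first_grade, fourth_grade)
```

Under these figures the contribution of connectives rises by 3.2 percentage points, i.e. by roughly a quarter of its first grade share, and that of ellipses by 1.3 percentage points, i.e. by about half of its first grade share.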
[28] Even if some of the subcategories, e.g. connectives, show quite a strong increase relative to their respective first grade percentage. This applies to all (sub-)groups and distributions investigated.

[Fig. 7.31 Contribution of the subcategories of cohesion in first grade: References 33.1%, Connectives 12.9%, Ellipses 2.5%, Substitutions 0.0%, Lexical cohesion 51.5%.]

[Fig. 7.32 Contribution of the subcategories of cohesion in fourth grade: References 31.0%, Connectives 16.1%, Ellipses 3.8%, Substitutions 0.05%, Lexical cohesion 49.1%.]

7.3.2 Contribution by sex

The distribution of the subcategories for both males (Fig. 7.33 and 7.34) and females (Fig. 7.35 and 7.36) is very similar to the overall distribution, and the two sexes show no significant observable differences in either first grade or fourth grade: The bulk of male (50.9%) as well as female (51.8%) cohesive density is made up of lexical ties. The next highest contribution in both male (33.3%) and female data (33%) is made by references, followed by connectives, again in both male (14%) and female data (12.4%). Ellipses contribute only marginally to both male (1.9%) and female (2.9%) overall cohesive density. Thus, the only difference in first grade with respect to the contribution of each subcategory is a slightly stronger contribution of connective density in the male data as compared to a slightly stronger contribution of lexical and ellipsis density in the female data.

Male and female participants’ general distribution of the subcategories changes only slightly from first to fourth grade (Fig. 7.33 to Fig. 7.36) and in fourth grade there are even fewer observable differences between sexes; that is, lexical ties (males 48.9%, females 49.1%) and references (30.3% and 31.2%) contribute most to overall cohesive density, even though their contribution is a little lower than in first grade, while that of connectives has risen to 16.6% for male (+2.6%) and 16% for female participants (+3.6%). The contribution of ellipses has also increased very slightly in both male and female data and now makes up roughly 4% of the overall cohesive density in both groups.[29] The contribution of substitutions remains negligible for both sexes (females 0.04% and males 0.1%).

[29] However, compared to the first grade contribution this is again a very substantial increase.

[Fig. 7.33 Contribution of the subcategories of cohesion for male first graders: References 33.3%, Connectives 14.0%, Substitutions 0.0%, Ellipses 1.9%, Lexical ties 50.9%.]

[Fig. 7.34 Contribution of the subcategories of cohesion for male fourth graders: References 30.3%, Connectives 16.6%, Substitutions 0.11%, Ellipses 4.1%, Lexical ties 48.9%.]

[Fig. 7.35 Contribution of the subcategories of cohesion for female first graders: References 33.0%, Connectives 12.4%, Substitutions 0.0%, Ellipses 2.9%, Lexical ties 51.8%.]

[Fig. 7.36 Contribution of the subcategories of cohesion for female fourth graders: References 31.2%, Connectives 16.0%, Substitutions 0.04%, Ellipses 3.7%, Lexical ties 49.1%.]

7.3.3 Contribution by experience group

The degree to which the different subcategories contribute to overall cohesion in both experience groups’ data is similar to the general results described earlier. Thus, in first grade, lexical ties are most important in both the mono (52.6%, Fig. 7.37) and the bili data (50.8%, Fig. 7.39), followed by references (35% and 32% respectively) and connectives (9.8% and 14.7% respectively). Ellipses once more make the most marginal contribution to overall cohesion with 2.6% in the mono and 2.5% in the bili first graders’ data, while substitutions do not occur at all.
The first grade results do show some differences, however, between monos and bilis: The distribution of the bili data is influenced by a somewhat larger contribution of connectives than in the mono data (a difference of 4.9%), which in turn is responsible for a marginally lower contribution of lexical ties, references, and ellipses in the bili data.

In fourth grade the order of importance of the subcategories remains the same in both groups (Fig. 7.38 and 7.40), although some changes in the proportions can be observed. That is, lexical ties make up the largest part of the overall cohesive density in fourth grade with 48.4% in the mono and 49.6% in the bili data, even though their contribution has decreased somewhat in both the mono group (-4.2%) and the bili group (-1.2%). Similarly, references still make the second largest contribution with 31.7% in the mono and 30.5% in the bili data, even if their contribution has also decreased a little in both the mono group (-3.3%) and the bili group (-1.5%). The main difference between mono and bili participants is found in the development of the contribution of connectives. That is, the contribution of connectives increases substantially in the mono group (+6.3%) but only slightly in the bili data (+1.4%), so that by the end of fourth grade the two experience groups do not differ anymore in regard to the contribution of connectives (both 16.1%). The increase in the contribution of connectives in the mono data also explains its comparatively stronger changes in the contribution of lexical ties and references. In both experience groups the contribution of ellipses remains very low (3.7% in the mono and 3.8% in the bili data), although it has also increased (monos +1.1%, bilis +1.3%), and the contribution of substitutions stays marginal (monos 0.04%, bilis 0.06%). As a result of the changes between first grade and fourth grade, there are only negligible differences in the distributions of mono and bili fourth graders.
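The signed changes quoted above are differences between two percentage shares, i.e. percentage points. As a minimal sketch (not part of the original analysis), the following Python snippet recomputes them from the contributions reported for Fig. 7.37 to 7.40. The dictionary layout and the function name `point_change` are illustrative assumptions.

```python
# Contribution (%) of each subcategory to overall cohesive density,
# copied from the values reported for Fig. 7.37-7.40.
# The data layout and all names are illustrative, not the author's.
CONTRIBUTION = {
    "mono": {
        "grade1": {"lexical ties": 52.6, "references": 35.0,
                   "connectives": 9.8, "ellipses": 2.6, "substitutions": 0.0},
        "grade4": {"lexical ties": 48.4, "references": 31.7,
                   "connectives": 16.1, "ellipses": 3.7, "substitutions": 0.04},
    },
    "bili": {
        "grade1": {"lexical ties": 50.8, "references": 32.0,
                   "connectives": 14.7, "ellipses": 2.5, "substitutions": 0.0},
        "grade4": {"lexical ties": 49.6, "references": 30.5,
                   "connectives": 16.1, "ellipses": 3.8, "substitutions": 0.06},
    },
}


def point_change(group):
    """Return the grade 4 minus grade 1 change in percentage points."""
    grades = CONTRIBUTION[group]
    return {
        category: round(grades["grade4"][category] - grades["grade1"][category], 2)
        for category in grades["grade1"]
    }


if __name__ == "__main__":
    for group in ("mono", "bili"):
        print(group, point_change(group))
```

Run as a script, this prints one dictionary of percentage-point changes per experience group. The connective share is the one that moves most in the mono group and least in the bili group.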
Fig. 7.37 Contribution of the subcategories of cohesion for first graders with monolingual preschool experience (mono group) (references 35.0%; connectives 9.8%; substitutions 0.0%; ellipses 2.6%; lexical ties 52.6%)
Fig. 7.38 Contribution of the subcategories of cohesion for fourth graders with monolingual preschool experience (mono group) (references 31.7%; connectives 16.1%; substitutions 0.04%; ellipses 3.7%; lexical ties 48.4%)
Fig. 7.39 Contribution of the subcategories of cohesion for first graders with bilingual preschool experience (bili group) (references 32.0%; connectives 14.7%; substitutions 0.0%; ellipses 2.5%; lexical ties 50.8%)
Fig. 7.40 Contribution of the subcategories of cohesion for fourth graders with bilingual preschool experience (bili group) (references 30.5%; connectives 16.1%; substitutions 0.06%; ellipses 3.8%; lexical ties 49.6%)

To sum up, a general order of importance was found with respect to the degree to which the subcategories of cohesion contribute to the overall cohesion of participants’ stories:
• Lexical cohesion (ca. 50%)
• References (ca. 30%)
• Connectives (ca. 15%)
• Ellipses (< 4%)
• Substitutions (< 1%).

This general order was not influenced by grade, sex or L2 preschool experience. The only remarkable difference observed involved the degree of contribution by the subcategories in the first grade data of the two experience groups, which was found to be attributable to the larger contribution of connectives in the bili first graders’ data. By the end of fourth grade, however, even this difference had disappeared, i.e. the distribution of the subcategories of cohesion was virtually the same in all subgroups investigated and it corresponded closely to the overall distribution.

7.3.4 Qualitative changes

Does that mean there are no overall changes from first to fourth grade besides a slight increase in mean cohesive density?
A brief look at the make-up of the three most important subcategories (lexical cohesion, references, and connectives) can help to point toward an answer to this question. The dominant subcategory of lexical ties in both grades (Fig. 7.41 and 7.42) is repetitions, followed by lexical fields and then by a fairly marginal use of all other lexical ties. However, a shift in proportions can be observed between the two grades. Thus, the percentage of repetitions decreases by 12.3%, while the percentage of lexical fields increases by 8.7% and the percentage of all other subcategories combined also rises (from 8.4 to 12.1%). That is, there is a clear shift in the distribution of lexical cohesion subcategories, with the “easiest” subcategory (creating ties via simple repetition) decreasing quite substantially, while the subcategories indicative of greater lexical diversity increase from first to fourth grade.

Fig. 7.41 Contribution of the subcategories of lexical cohesion in first grade (repetition 61.1%; hyponymy 1.5%; synonymy 0.1%; near-synonymy 1.0%; instantial synonymy 0.8%; co-hyponymy 0.4%; opposition 4.6%; lexical fields 30.4%)
Fig. 7.42 Contribution of the subcategories of lexical cohesion in fourth grade (repetition 48.8%; hyponymy 0.9%; synonymy 0.4%; near-synonymy 2.2%; instantial synonymy 1.9%; co-hyponymy 0.8%; opposition 5.9%; lexical fields 39.1%)

A similar but even clearer development from first to fourth grade is evident in the frequencies of the subcategories of reference (Fig. 7.43 and 7.44). Demonstrative references (mainly the definite article) contribute most to the overall reference density in both grades, followed by personal references as the second most important subcategory. However, the percentage of demonstrative references decreases by 21% from first to fourth grade, while the percentage of personal references increases by 20%. The contribution of comparative references doubles from first to fourth grade but remains marginal in both grades. That is, an even clearer qualitative shift can be observed in the subcategories of references, with the use of personal references (i.e. pronouns) increasingly replacing demonstrative references (mainly the definite article in constructions such as the boy).

Fig. 7.43 Contribution of the subcategories of reference in first grade (personal references 15%; demonstrative references 83.9%; comparative references 0.9%)
Fig. 7.44 Contribution of the subcategories of reference in fourth grade (personal references 35%; demonstrative references 62.9%; comparative references 2.3%)

A similar qualitative development from first to fourth grade is also evident in the subcategories of connectives (Fig. 7.45 and 7.46); additive connectives contribute most to connective density in both first grade and fourth grade, followed by temporal connectives. However, a comparison between first grade and fourth grade shows that the percentage of additive connectives decreases by 9.1%. While the percentage of temporal connectives also decreases, even though by much less (-2.1%), it is the connectives expressing more specific relationships that increase. Together, the percentage of the “lesser” connective categories increases from merely 1.2% (adversative and causal) in first to 12.4% in fourth grade. More specifically, the proportion of adversative connectives makes up 5.8% in fourth grade (+5%) and that of causal connectives 3.1% (+2.7%).
Additionally, the use of connectives has become more diversified in fourth grade: Factual connectives (2.4%), conditional connectives (0.9%), and the subordinator 'with' (0.2%) also make a (marginal) contribution to the cohesion of participants’ stories.

Fig. 7.45 Contribution of the subcategories of connectives in first grade (additive connectives 67.8%; temporal connectives 31.0%; adversative connectives 0.8%; conditional connectives 0.0%; causal connectives 0.4%; factual connectives 0.0%; subordinator 'with' 0.0%)
Fig. 7.46 Contribution of the subcategories of connectives in fourth grade (additive connectives 58.7%; temporal connectives 28.9%; adversative connectives 5.8%; conditional connectives 0.9%; causal connectives 3.1%; factual connectives 2.4%; subordinator 'with' 0.2%)

7.3.5 Summary: Contribution of the subcategories to cohesive density

To sum up, a general order of the degree of contribution to the overall cohesive density of participants’ stories was found for the subcategories investigated: lexical ties > references > connectives > ellipses > substitutions. The respective degrees of contribution showed almost no changes from first to fourth grade, nor any influence of sex; some differences attributable to experience group were found with respect to the contribution of connectives. A more detailed look at the distribution of the subcategories of lexical cohesion, references, and connectives, on the other hand, revealed that clear qualitative shifts occurred within these subcategories, even though their overall proportion remained largely stable from first to fourth grade.
Thus, there was a shift from “easier” lexical categories, such as repetition, to categories indicating a more diversified use of vocabulary, a shift from the use of “easier” demonstrative constructions to the use of personal references to refer to story characters, and a shift from “easier” connectives with a general additive or temporal meaning to an increasingly specific encoding of relationships between clauses, e.g. as adversative.

7.4 Cohesion results: Summary

The two previous sections focused on three main questions, namely
• how cohesive participants’ stories are as measured by the number of cohesive ties per clause (ch. 7.1)
• whether there are any qualitative differences in regard to the cohesion of participants’ stories as measured by the use (ch. 7.2) and degree of contribution (ch. 7.3) of the subcategories of cohesion, i.e. references, connectives, substitutions, ellipses and lexical cohesion
• whether there are any differences attributable to grade, sex and experience group with respect to the first two questions.

Based on the findings of previous studies (cf. ch. 4) the following hypotheses had been put forward with respect to these research questions (cf. ch. 3.5 and 4.6):
1. Grade/age has a significant influence on the cohesion of participants’ stories.
2. Sex does not have a significant influence on the cohesion of participants’ stories.
3. Participants’ stories become more cohesive from first to fourth grade as measured by the overall number of cohesive devices.
4. There are no qualitative differences in cohesion between grades as measured by the frequency order of the subcategories of cohesion (lexical ties > references > connectives > ellipses and substitutions).

For an easier recapitulation and comparison of results Fig. 7.47 shows the overall mean cohesive density of participants’ stories as well as the densities of the individual subcategories by grade in one diagram; the corresponding descriptive statistics are given in Tab. 7.16.
It was found that all participants, i.e. even all first graders, produced at least some cohesive ties; the overall minimum score was approximately two cohesive ties per clause (produced by a first grader). However, not all participants used every subcategory, especially in first grade; that is, all first graders produced references and lexical ties and most of them also produced connectives. Only some of the first graders, however, used ellipses and none of them substitutions. All fourth graders, on the other hand, produced references, lexical ties, connectives and ellipses. Additionally, at least some of them also produced substitutions.

Fig. 7.47 Cohesive densities by grade † (ties per clause)

                     Grade 1   Grade 4   Increase
References           1.14      1.21      0.07
Connectives          0.45      0.63      0.18**
Ellipses             0.09      0.15      0.06*
Substitutions        0         0.002     0.002
Lexical cohesion     1.78      1.92      0.14
Total                3.45      3.91      0.46*

† The statistical significance for overall cohesion, connectives and ellipses is based on a factorial ANOVA including all participants. An ANOVA for connectives without a statistical outlier showed, however, that the effect of grade found for connectives is largely limited to participants without L2 preschool experience. For referential and lexical cohesion the difference between first and fourth grade was not statistically significant. However, an ANOVA for references computed without a statistical outlier showed an effect of grade, which applied only to female but not male participants. Substitutions were not tested, since they only occur in fourth grade.

Tab. 7.16 Descriptive statistics cohesive densities by grade (total sample)

                        N    Mean    Standard Deviation   Median   Min.   Max.
Grade 1
Reference density       31   1.14    0.23                 1.16     0.57   1.65
Connective density      31   0.45    0.28                 0.48     0      0.97
Substitution density    31   0       0                    -        0      0
Ellipsis density        31   0.09    0.09                 0.07     0      0.38
Lexical density         31   1.78    0.31                 1.76     1.22   2.52
Total density           31   3.45    0.53                 3.46     2.17   4.44
Grade 4
Reference density       28   1.21    0.17                 1.22     0.76   1.62
Connective density      28   0.63    0.13                 0.62     0.46   0.91
Substitution density    28   0.002   0.01                 -        0      0.02
Ellipsis density        28   0.15    0.05                 0.16     0.03   0.23
Lexical density         28   1.92    0.18                 1.89     1.58   2.31
Total density           28   3.91    0.32                 3.88     3.43   4.65

A comparison of the individual subcategories (cf. Fig. 7.4 and ch. 7.3) and the degree to which they contribute to the cohesion of participants’ stories established a general order of contribution, which was not influenced by grade, sex or experience group (in descending order): Lexical ties > references > connectives > ellipses > (substitutions). Lexical ties were found to make up about half and references about one third of the overall cohesion, followed by connectives, which contributed about 15%. Ellipses (below 4%) and substitutions (below 1%) were found to be of marginal importance. This frequency order closely corresponds to Bae’s (2001) findings for the written data of L2 first and second graders (cf. ch. 4.4) but leads this general order back to oral data and extends it to a later age group.

The overall variation in cohesive density results was found to be low compared to the variation found for coherence. However, there were very large differences among the subcategories. All variability ratios are summarized by grade, sex and experience group in Tab. 7.17 to allow for a better comparison of the results obtained for participants’ interindividual variation. A comparison of the individual variability ratios of the subcategories, as well as their scoring ranges, shows the following order of interindividual variation (in decreasing order; Tab. 7.17): Ellipses > connectives > references > lexical ties; that is, participants differed most strongly in their use of ellipses, followed by connectives. The results obtained for references and lexical ties, on the other hand, were comparatively homogeneous. The general order of interindividual variation was not found to be influenced by grade, sex or experience group, [30] even if the degree of interindividual variation in first grade was highly influenced by L2 preschool experience in the sense that the results of children with bilingual preschool experience varied less than those of first graders without prior L2 experience.

[30] Except for the bili first graders’ variation, which was virtually the same for references and lexical ties.

Tab. 7.17 Interindividual variation in cohesive density by grade, sex, and experience group

                          Total    Male   Female   Mono   Bili
Grade 1
Reference density         20% †    17%    22%      28%    13%
Connective density        62%      53%    67%      97%    43%
Ellipsis density          100%     100%   90%      133%   67%
Lexical density           17%      14%    19%      22%    14%
Total cohesive density    15%      12%    16%      19%    12%
Grade 4
Reference density         14%      11%    15%      13%    14%
Connective density        21%      18%    21%      21%    21%
Ellipsis density          33%      31%    36%      36%    33%
Lexical density           9%       9%     10%      10%    9%
Total cohesive density    8%       7%     9%       8%     9%

† The interindividual variation is given in terms of the standard deviations as a function of the respective means (variability ratio). Substitutions were not included due to their extremely low number of occurrences. ‘Mono’ denotes children with exclusively monolingual preschool experience, ‘bili’ those with bilingual preschool experience.

Two main developmental trends from first to fourth grade were observed in the data: (1) An increase in mean density and (2) a decrease in interindividual variation. This developmental pattern applied not only to the overall cohesive density but also to all of the subcategories (cf. Fig. 7.47, Tab. 7.16 and 7.17) and it was not influenced by either sex or experience group.
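The variability ratio used in Tab. 7.17 is the standard deviation expressed as a percentage of the mean. As a minimal sketch under that reading, the following Python snippet recomputes the "Total" column of Tab. 7.17 from the means and standard deviations in Tab. 7.16. The dictionary layout and the function name `variability_ratio` are illustrative assumptions, and the inputs are the rounded table values rather than the raw data.

```python
# (mean, standard deviation) per density measure, copied from Tab. 7.16.
# Substitutions are left out, as in Tab. 7.17.
TAB_7_16 = {
    "grade1": {
        "reference": (1.14, 0.23),
        "connective": (0.45, 0.28),
        "ellipsis": (0.09, 0.09),
        "lexical": (1.78, 0.31),
        "total": (3.45, 0.53),
    },
    "grade4": {
        "reference": (1.21, 0.17),
        "connective": (0.63, 0.13),
        "ellipsis": (0.15, 0.05),
        "lexical": (1.92, 0.18),
        "total": (3.91, 0.32),
    },
}


def variability_ratio(mean, sd):
    """Standard deviation as a percentage of the mean."""
    return sd / mean * 100


if __name__ == "__main__":
    for grade, measures in TAB_7_16.items():
        for name, (mean, sd) in measures.items():
            print(f"{grade} {name}: {variability_ratio(mean, sd):.0f}%")
```

Because the ratio divides by the mean, a measure with a very small mean (such as ellipsis density in first grade) can show a large ratio even when its absolute spread is small.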
Meanwhile the rates of increase in mean density varied for the individual subcategories and the following order of increase rates emerged (fourth grade result minus first grade result, divided by first grade result): Ellipses (+67%) > connectives (+40%) > lexical ties (+8%) and references (+6%). In other words, the mean densities of references and lexical ties increased only a little from first to fourth grade, while those of connectives, ellipses and substitutions increased very strongly. Even though substitutions were not considered for the order of increase rates, they had the comparatively strongest increase, since they were not produced at all in first grade. The statistical analysis largely confirmed the observed order. Thus, overall cohesive density, connectives and ellipses differed statistically significantly between first grade and fourth grade, while this was not the case for references and lexical cohesion; substitutions could not be tested. At the same time the influence of grade, as measured by the respective effect sizes, was low compared to that found for coherence (12% for overall cohesion, for example, up to 16% for ellipses and up to 24% for connectives). Additionally, the overall statistical significance did not always apply to all subgroups and the effect sizes were also influenced somewhat by sex or experience group.

With respect to a decrease in interindividual differences (as measured by the respective variability ratio) it was found that the variation of the overall cohesion of participants’ stories decreased by 47% from first to fourth grade. At the same time the decrease rates of the subcategories’ interindividual variation differed just as strongly as their increase rates in mean density. The following order of decrease rates emerges (first grade variability ratio minus fourth grade ratio, divided by first grade ratio): Connectives and ellipses (both -67%) > lexical ties (-46%) > references (-30%).
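Both rate definitions given in parentheses above amount to the same relative-change formula, applied once to mean densities and once to variability ratios. The following Python sketch illustrates this with the rounded values from Tab. 7.16 and Tab. 7.17; the function name `relative_change` is an illustrative assumption. Because the inputs are rounded, the recomputed decrease rates for connectives and lexical ties may differ from the reported ones by about one percentage point.

```python
def relative_change(first, fourth):
    """Change from first to fourth grade as a percentage of the first grade value."""
    return (fourth - first) / first * 100


# Mean densities (ties per clause) from Tab. 7.16: (grade 1, grade 4).
MEAN_DENSITY = {
    "ellipses": (0.09, 0.15),
    "connectives": (0.45, 0.63),
    "lexical ties": (1.78, 1.92),
    "references": (1.14, 1.21),
}

# Variability ratios (%) from the "Total" column of Tab. 7.17: (grade 1, grade 4).
VARIABILITY = {
    "total": (15, 8),
    "references": (20, 14),
    "ellipses": (100, 33),
}

if __name__ == "__main__":
    for name, (g1, g4) in MEAN_DENSITY.items():
        print(f"mean density, {name}: {relative_change(g1, g4):+.0f}%")
    for name, (g1, g4) in VARIABILITY.items():
        print(f"variability ratio, {name}: {relative_change(g1, g4):+.0f}%")
```

A negative result denotes a decrease, so the decrease rates in the text correspond to negative values of `relative_change`.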
This general order is not affected by sex or L2 preschool experience. However, participants without L2 preschool experience had much higher decrease rates than the bili participants, not only for the overall cohesive density (-58% vs. -25%) but also for all subcategories. The distribution of participants’ individual scores revealed two additional general phenomena: (1) An overlap between first and fourth grade scores and (2) an increase in the lowest scores from first to fourth grade, paired with a relative stability in the upper scores. That is, a large percentage of first graders accomplished density scores which could also have been achieved by a fourth grader. At the same time the major difference between first and fourth grade was found to be a reduction in the lower density scores as opposed to a relative stability in the upper scoring ranges. These general observations hold true for all subcategories of cohesion as well as for the two background variables included in the study, i.e. sex and experience group. However, the increase in minimum densities was more pronounced for mono than for bili participants. Sex was found to have very little influence on participants’ cohesion results; this was true across all subcategories. That is, male and female participants performed very similarly in both grades (the only statistically significant differences were found for references when an outlier was removed) and they followed the overall developmental trends for mean density, interindividual variation and the distribution of the individual scores. L2 preschool experience was found to influence cohesion results somewhat, but not as significantly as was the case for coherence (cf. ch. 1). The only stable differences between participants with and without L2 preschool experience were a lower interindividual variation of the bili group in first grade and a stronger overlap of their first and fourth grade scores due to consistently higher minimum scores in first grade. 
These differences were found for overall cohesion as well as for all of the subcategories. Even though bili first graders had a higher observed mean density for overall cohesion, connectives and lexical cohesion, only the difference in mean connective density was statistically significant. Both experience groups largely followed the overall trends of an increase in mean densities, a decrease in interindividual variation and an increase in minimum scores as the main difference between the first and fourth grade distribution of participants’ individual results. Due to slightly different increase/decrease rates and a stronger increase in minimum scores for children without L2 preschool experience, any observed or statistically significant differences in mean density and interindividual variation had disappeared by the end of fourth grade.

To sum up, all four hypotheses were confirmed. Grade had an influence on the cohesion of participants’ stories in that a general developmental pattern was found, namely an increase in the mean densities, which was, however, statistically significant only for the overall cohesiveness of participants’ stories and for their use of connectives and ellipses. The general increase in mean density from first to fourth grade was accompanied by a decrease in interindividual variation and an overlap between the first and fourth grade scores, which in turn was accompanied by an increase in minimum scores as the main difference between the first and fourth grade results. These developments held true for all subcategories of cohesion under investigation, even though the increase/decrease rates and overlaps differed for the individual subcategories.
No systematic differences attributable to sex were found in the data, but experience group, on the other hand, did have some influence on the cohesion results in first grade: The results of participants with L2 preschool experience had a lower interindividual variation in first grade and their results overlapped more strongly between grades due to higher minimum scores in first grade. In regard to participants’ mean densities, however, L2 preschool experience had a statistically significant impact on connective density only. By the end of fourth grade any influence of experience group had disappeared.

8 The relationship between (the development of) coherence and cohesion

The present chapter aims to answer the questions of whether there is a relationship in participants’ narrative discourse between (1) coherence and cohesion and (2) the development of coherence and cohesion from first to fourth grade. The analysis was limited to investigating linear relationships, however, and any lack of relationship in the results should be interpreted in this light. Thus, several correlation analyses (cf. ch. 5.2.3) were conducted to test whether
• high (low) coherence implies high (low) cohesion and vice versa (positive correlation) or
• high (low) coherence implies low (high) cohesion and vice versa (negative correlation) or
• there is no discoverable linear relationship (no correlation). [1]

The measures submitted to the analyses were participants’ total number of narrative components and their narrative index score, both of which represent measures of overall coherence, and participants’ total number of cohesive devices and their texts’ degree of cohesive density, both of which represent measures of overall cohesion.
Consequently, the following correlations are possible in terms of a relationship between (the development of) coherence and cohesion:
• Narrative index and cohesive density
• Narrative index and total number of cohesive devices
• Total number of narrative components and cohesive density
• Total number of narrative components and total number of cohesive devices.

For question (1) correlation results will be reported separately for first grade and fourth grade; the two groups were not pooled for analysis (cf. Bachmann 2004: 98f.), since it was shown in the previous chapters that they behave differently. Also, any significant correlation across all participants may not reflect an actual correlation between variables, since all variables increase from first to fourth grade and any correlation would be strongly influenced by these increases. Since the mono group and the bili group also tend to be at variance, these two groups’ results will equally be reported separately for each grade. No distinction by sex was made for the correlation analysis, since sex was not found to be a significant factor in any of the coherence or cohesion results reported in earlier chapters.

While both longitudinal and cross-sectional data were used in addressing question (1), for question (2) only the longitudinal data set was included in the analysis. Thus, all of the above measures’ increase rates were calculated for participants of the first cohort (fourth grade minus first grade results) and a correlation analysis was performed on these new increase variables. Only results for the entire first cohort will be reported, since sex was again excluded and experience groups were too small to conduct separate analyses.

[1] Analogously the development of coherence and cohesion.

The correlation matrix in Tab. 8.1 gives the correlation coefficients between coherence and cohesion measures for all first graders together as well as for the mono experience group and the bili experience group.
The correlations across all first graders indicate a relationship between coherence and cohesion: There is a significant moderate-to-strong positive correlation between (a) the total number of components and the total number of ties and (b) between narrative index and the total number of ties. That is, first graders produce either a high number of components (respectively a high index score) and a high number of cohesive devices or they produce a low number of both components (respectively a low index score) and low number of cohesive devices. However, in both correlations the shared variance is relatively low, namely (a) 19% (r²=0.19) and (b) 16% (r²=0.16).

Tab. 8.1 Correlations in first grade
(Density = cohesive density; Devices = total number of cohesive devices)

                                        First graders        Mono first graders   Bili first graders
                                        (N=31, df=29)        (N=12, df=10)        (N=19, df=17)
                                        Density   Devices    Density   Devices    Density   Devices
Narrative index †                       -.032     .401*      -.252     -.001      -.211     .278
Total number of narrative components    .008      .440*      -.359     -.138      -.190     .267

† All are Pearson product-moment correlation coefficients (r). * correlation is significant at the 0.05 level (2-tailed). ‘Mono’ denotes children with exclusively monolingual preschool experience, ‘bili’ those with bilingual preschool experience.

At the same time no correlation between these two sets of variables is found when mono group and bili group are looked at separately. Although this could be due merely to the effect of combining groups (cf. Bachmann 2004: 98f.), it does raise the question of whether the significant correlations are more likely to be attributable to the mediating effect of another variable. Additional analyses showed that the total number of components (r(29)=0.532, p<0.01, r²=0.28) and, to an even greater degree, the number of ties (r(29)=0.946, p<0.01, r²=0.89) both correlate positively with text length, i.e. the total number of clauses.
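The shared variance quoted above is simply the squared correlation coefficient. As a minimal sketch, the following Python snippet recomputes it for the first grade coefficients and, assuming the usual t test for a Pearson coefficient (which the text does not spell out), checks the two starred values in Tab. 8.1 against the standard two-tailed critical value for df=29. All function and constant names are illustrative assumptions.

```python
import math


def shared_variance(r):
    """Proportion of variance shared by two variables (r squared)."""
    return r * r


def t_statistic(r, df):
    """t value for testing a Pearson r against zero, with df = N - 2."""
    return r * math.sqrt(df / (1.0 - r * r))


# Two-tailed critical t value at the 0.05 level for df = 29 (standard t table).
T_CRIT_DF29 = 2.045

# First grade coefficients reported in the text (N=31, df=29).
FIRST_GRADE = {
    "components vs. ties": 0.440,
    "narrative index vs. ties": 0.401,
    "components vs. length": 0.532,
    "ties vs. length": 0.946,
}

if __name__ == "__main__":
    for label, r in FIRST_GRADE.items():
        print(f"{label}: r2 = {shared_variance(r):.2f}, "
              f"t(29) = {t_statistic(r, 29):.2f}")
```

The very high coefficient between ties and length means that almost nine tenths of the variance in the number of ties is shared with the number of clauses.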
Therefore, a partial correlation with length as the controlling variable was performed. It showed that the significant correlation between the total number of components and the total number of ties (r(28)=0.228, ns), as well as the significant correlation between narrative index and the total number of ties (r(28)=0.231, ns), disappear when controlled for the influence of length. That is, there is no significant linear relationship between any of the coherence and cohesion measures in first grade; any apparently significant relation between the two constructs is entirely attributable to the mediating effect of text length.

The correlation matrix in Tab. 8.2 gives the correlations between coherence and cohesion measures across all fourth graders as well as for the two experience subgroups. Across all participants, one significant strongly positive correlation remains in fourth grade, namely between the total number of components and the total number of cohesive devices, with a shared variance of 37% (r²=0.37). That is, fourth graders again produce either a high number of both components and cohesive devices or they produce a low number of both components and cohesive devices. This also holds true for the two experience groups, where the shared variance is 43% for the mono (r²=0.43) and 37% for the bili group (r²=0.37).

Tab. 8.2 Correlations in fourth grade
(Density = cohesive density; Devices = total number of cohesive devices)

                                        Fourth graders       Mono fourth graders  Bili fourth graders
                                        (N=28, df=26)        (N=12, df=10)        (N=16, df=14)
                                        Density   Devices    Density   Devices    Density   Devices
Narrative index †                       -.026     .157       -         -          .020      .210
Total number of narrative components    -.055     .624**     .234      .657*      -.234     .606*

† ‘Mono’ denotes children with exclusively monolingual preschool experience, ‘bili’ those with bilingual preschool experience. Narrative index scores in grade four are not normally distributed and therefore Spearman rank-order correlation coefficients are given (r_s). The mono group’s narrative index results in fourth grade could not be used for the correlation analysis at all, since they did not show any variation. In all other cases Pearson product-moment correlation coefficients (r) are given. ** correlation is significant at the 0.01 level (2-tailed), * correlation is significant at the 0.05 level (2-tailed).

All other coherence-cohesion pairs remain without significance, be it across all fourth graders or for the subgroups. Most importantly, no significant correlations were found between the narrative index, which focuses specifically on the narrative components indispensable for coherence (cf. ch. 6.3), and either of the two cohesion measures. This phenomenon could be explained by the variation in fourth grade results: As opposed to the interindividual variation in cohesion, coherence scores are almost constant, which could lead to the non-significant correlation.

However, the significant correlation between the number of components and the total number of ties could also again be mediated by text length. Additional analyses showed that this was in fact the case. Narrative components (r(26)=0.593, p<0.01, r²=0.35) and, again even more strongly, the total number of cohesive devices (r(26)=0.958, p<0.001, r²=0.92) correlated significantly with text length, with a shared variance of 35% and a full 92%, respectively. In the mono data the same phenomenon was found, i.e. narrative components (r(10)=0.691, p<0.05, r²=0.48) correlated significantly with length, as did, even more strongly, the total number of ties (r(10)=0.959, p<0.001, r²=0.92). The same was true for the bili data where narrative components (r(14)=0.569, p<0.05, r²=0.32) correlated significantly with length, as did, again even more strongly, the total number of ties (r(14)=0.972, p<0.001, r²=0.94).
In the two subgroups the shared variance ranged from 32% in the bili group to 48% in the mono group for the total number of components and from a full 92% to 95%, respectively, for the total number of ties. Therefore, partial correlations were performed with text length as the controlling variable. Any significant correlations between the number of components and the total number of ties disappeared once length was controlled for, be it across all fourth graders (r(25)=0.243, ns) or the mono subgroup (r(9)=-0.033, ns) and the bili subgroup (r(13)=0.273, ns). That is, the partial correlation results again qualify the initial results, to the effect that no relationship between coherence and cohesion is found in fourth grade when length is controlled for.

Tab. 8.3 Correlations between increase variables

                                                 Increase cohesive   Increase total number
                                                 density             of cohesive devices
Increase narrative index                         .037 †              -.011
Increase total number of narrative components    .093                .589*

† All are Pearson product-moment correlation coefficients (r). N=13 for all analyses, df=11. * correlation is significant at the 0.05 level (2-tailed).

Tab. 8.3 shows the correlation matrix for the development from first to fourth grade in the longitudinal data set. A significant, strongly positive correlation was found between the increase in the total number of narrative components and the increase in the total number of cohesive devices, with a shared variance of 35% (r²=0.35). That is, the two variables either both increase strongly or they both increase slowly. Again, a possible impact of text length was explored; in this case the increase in the number of clauses. The increase in the total number of ties correlated very strongly positively with the increase in text length (r(11)=0.964, p<0.01, r²=0.93), with a shared variance of a full 93%.
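The fourth grade partial correlations reported above can be approximated from the rounded zero-order coefficients with the standard first-order partial correlation formula. The text does not state how the values were computed, so the following Python snippet is only a sketch under that assumption; the function name `partial_corr` and the data layout are illustrative. Because the correlations between ties and length are close to 1, small rounding differences in the inputs are amplified, and the results agree with the reported values only to about two decimal places.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Rounded fourth grade coefficients from the text:
# x = total number of narrative components,
# y = total number of cohesive devices,
# z = text length (number of clauses).
FOURTH_GRADE = {
    "all":  {"r_xy": 0.624, "r_xz": 0.593, "r_yz": 0.958},
    "mono": {"r_xy": 0.657, "r_xz": 0.691, "r_yz": 0.959},
    "bili": {"r_xy": 0.606, "r_xz": 0.569, "r_yz": 0.972},
}

if __name__ == "__main__":
    for group, r in FOURTH_GRADE.items():
        value = partial_corr(r["r_xy"], r["r_xz"], r["r_yz"])
        print(f"{group}: partial r = {value:.3f}")
```

The numerator shows why the correlation shrinks: most of the raw association between components and ties is already accounted for by the product of their individual correlations with length.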
The increase in narrative components also correlated strongly with the increase in length (r(11)=0.523, ns), but this correlation just missed significance (p=0.067). A partial correlation with the increase in text length as the controlling variable rendered no significant correlation between the increases of the number of components and the total number of cohesive devices (r(10)=0.374, ns). That is, both the coherence and the cohesion of participants’ stories develop significantly from first to fourth grade (cf. the previous chapters) but their development is not (linearly) related in individual participants’ texts.

To sum up, no linear relationship between coherence and cohesion measures was found, notwithstanding a possible non-linear relation. With respect to the development of coherence and cohesion there were also no linear relations. Any seemingly significant correlations were attributable to the effect of (the increase in) text length and disappeared once (the increase in) length was controlled for.

9 General discussion

The present study investigated narrative coherence and cohesion in the L2 story productions of German children participating in an English-language immersion (IM) program. As stated in the introduction (ch. 1), this study had a threefold aim:

1. To investigate how linguistic and content organization of (narrative) discourse develop over the four-year duration of an early partial IM program (Goal 1)
2. To relate this development to participants’ cognitive and linguistic development (Goal 2)
3. To relate the overall results to the effectiveness of the program (Goal 3).

Participants were 59 elementary school children in Germany, who attended an early partial immersion program in which ca. 70% of the weekly schooling is conducted in the L2 English (cf. ch. 5.1.1). Learner variables taken into account were grade (first vs. fourth, which corresponds to mean age 6;7 and 9;7), sex and L2 preschool experience (English-German bilingual vs.
monolingual German preschool experience). All stories were collected with the help of a picture-elicited oral storytelling task (cf. ch. 5.1.3).

The main focus of the present study was, as stated in the introduction, to give a quantitative account of the development of coherence and cohesion. This quantitative account was based on an initial qualitative analysis, the result of which led to the operationalization of the categories of analysis provided by the underlying frameworks (story grammar, Halliday and Hasan’s approach to cohesion, cf. ch. 2 and 3); the detailed description of the categories of analysis in the methods chapter (ch. 5) therefore largely corresponds to the results of this qualitative analysis.

In chapter 3 a simplified model of narrative discourse production was presented, which showed two fundamental, interrelated organization principles of texts: Top-down or global and bottom-up or local organization. It was argued that the global organization of (narrative) discourse, i.e. its coherence, is based on an underlying narrative schema. Thus, narrative coherence was defined as a cognitive measure. It was further argued that narrative coherence can be investigated in language with the help of story grammar. Accordingly, narrative coherence was operationalized as narrative components reflecting the underlying story schema in the task material (cf. ch. 3 and ch. 5). The number of components, i.e. a quantitative measure, was used to assess the overall coherence of participants’ stories (cf. ch. 6.1), and, in order to assess qualitative distinctions, the individual components’ frequency was compared (cf. ch. 6.2). Additionally, a global narrative index was created, complementing the comparison between the individual components in that it reveals the degree to which a story is organized along the five narrative components axiomatic for a global structure (cf. ch. 6.3).

It was further argued in ch.
3 that local discourse organization is based on the use of linguistic devices to connect clauses into a unified stretch of speech, i.e. on a text’s cohesion, and consequently, cohesion was defined as a linguistic measure. Cohesion was operationalized via the classical text-linguistic categories first systematized by Halliday and Hasan (1976) (cf. ch. 5.2.2). Thus, the density of references, connectives, substitutions, ellipses, and lexical ties was assessed (cf. ch. 7.2), i.e. their respective mean number per clause, and the overall cohesive density of participants’ texts, i.e. the total number of cohesive ties per clause (cf. ch. 7.1). These measures served to assess quantitative differences attributable to grade, sex or experience group. Additionally, it was investigated to what extent the individual subcategories contribute to overall cohesive density (cf. ch. 7.3); this latter measure served to assess qualitative differences.

With respect to the first goal of my study, namely to investigate the coherence and cohesion of participants’ narrative productions, I raised several research questions (cf. ch. 3) and put forward corresponding hypotheses (cf. ch. 4). All of these were already reviewed in the respective results sections (ch. 6.4 and 7.4) to which the reader is referred for details. With respect to the second goal, two main research questions were posed in ch. 3, which have not yet been revisited:

1. Is there a cognitive development from grade one to four as measured by L2 narrative coherence?
2. Is there a linguistic development from grade one to four as measured by L2 narrative cohesion?

Additionally, the question was raised as to whether there is a relationship between (the development of) L2 cohesion and coherence and thus, by continuity, between linguistic and cognitive development.

In the following section I will summarize my results with a focus on similarities and differences between (the development of) coherence and cohesion.
Then I will review the results obtained in regard to a relationship between these two discourse measures. After that I will discuss two recurring themes of my study: The influence of the three learner variables grade, sex and L2 preschool experience and participants’ interindividual variation. Then I will present the answers for questions (1) and (2) (re-)stated above, which derive from the sum of my results, and critically discuss the validity of the constructs of coherence and cohesion. Next, some general limitations of my study will be considered. Finally, the third goal will be revisited; that is, I will conclude with a brief discussion of the importance of my results with respect to the immersion program at hand as well as immersion education in general.

9.1 Coherence and cohesion: Similarities and differences

Since detailed summaries of narrative coherence and cohesion results were given at the end of the respective results chapters (ch. 6.4 and ch. 7.4), the following summary will focus on similarities and differences between results obtained for the two measures.

All participants produced narrative components and cohesive ties; that is, even first graders were able to produce at least one narrative component, and all narrative components were realized by at least one first grader. At the same time even first graders used a minimum of two cohesive ties per clause; however, only references and lexical ties were produced by all of them. Not all first graders, on the other hand, used connectives or ellipses, and substitutions were produced exclusively by fourth graders.

Both the narrative coherence and cohesion of participants’ stories increased significantly from first to fourth grade, even though the effect of grade was far stronger for coherence than it was for cohesion.
At the same time an overlap between first and fourth grade results was observed for both discourse measures; however, this overlap was stronger for overall cohesion (58% of the results overlapped). The respective overlaps also showed that for both measures the difference between first and fourth grade scores consisted especially of an increase of the lower first grade scores with relative stability in the upper scoring ranges. This development was, of course, less pronounced for cohesion than for coherence results due to the cohesion results’ stronger overlap.

L2 preschool experience affected the overall coherence of participants’ stories but not their overall cohesion. At the same time the influence of preschool was limited to first grade. That is, first graders with bilingual preschool experience produced significantly more coherent stories than children without prior experience. Stories’ mean cohesive density, on the other hand, did not differ between experience groups. However, L2 preschool experience affected both measures’ interindividual variation in that individual results differed less for participants who had attended a bilingual preschool; this effect was again limited to first grade. At the same time both measures’ first and fourth grade scores overlapped more strongly for the experienced group.

Sex was not found to significantly influence overall coherence or cohesion, the respective interindividual differences, or the overlap between first and fourth grade scores in any systematic way.

With respect to each measure’s subcategories, i.e. individual narrative components and types of cohesive devices, there was also a general tendency for an increase from first to fourth grade: The frequency of almost all individual narrative components increased, and in most cases these increases were statistically significant. Connectives and ellipses also increased statistically significantly.
This was not the case for references and lexical ties, however, even though the observed results showed an increase. At the same time the cohesion results again overlapped between first grade and fourth grade and this overlap was comparatively large for all subcategories (largest for references and smallest for connectives). Thus, the main development concerned the disappearance of the lower first grade scores paired with relative stability in the upper scoring ranges, just as was found for overall cohesion.

The frequencies of the individual narrative components differed very strongly in first and less strongly in fourth grade. Similarly, all of the cohesion subcategories followed the trend of a decrease in participants’ interindividual variation from first to fourth grade.

An order of frequency was established for narrative components and the cohesion subcategories. For both areas the respective order of frequency was (largely) unaffected by grade. Thus, the frequency order of narrative components indicated an emerging global narrative structure in first grade, which was fully developed by the end of fourth grade (this was confirmed by the narrative index). That is, the major change from first to fourth grade consisted of an increase in each component’s frequency, while the general order of frequency was not observed to change significantly. Across grades the frequency order of cohesive devices, on the other hand, indicated that lexical ties and references make the largest contribution to cohesive density (together ca. 80%), followed by connectives (ca. 15%). Ellipses and substitutions were found to contribute only marginally. This result confirmed the findings of earlier studies (cf. ch. 4).

While sex also had no significant systematic impact on the individual narrative components or the subcategories of cohesion, L2 preschool experience affected both the frequency of the individual components and the density of most cohesion categories.
However, this preschool effect was again limited to first grade. Thus, all individual components were produced by at least one of the first graders with L2 preschool experience, while this was not the case for the inexperienced group. Additionally, the narrative components’ order of frequency indicated that the emerging global organization structure in first grade is limited to the L2 preschool group. With respect to the cohesion results, again the only consistent pattern regarding experience group was a lower interindividual variation in first grade and a stronger overlap between first and fourth grade scores for children with L2 preschool experience. Only the use of connectives, however, differed significantly between experience groups (experienced first graders had a statistically significantly higher connective density), even though the observed results also showed (slightly) higher scores for first graders with bilingual preschool experience with respect to references, ellipses, and lexical cohesion.

9.2 The relationship between (the development of) L2 cohesion and coherence

Correlations were computed between the main measures of coherence and cohesion (total number of narrative components, narrative index score, total number of cohesive devices and overall cohesive density) in order to investigate a potential relationship between overall coherence and cohesion. There were some significant correlations between coherence and cohesion; however, these were largely limited to the relationship between the number of components and the total number of ties. Additionally, no consistent pattern evolved across the (sub)groups investigated. Neither coherence measure correlated significantly with cohesive density and only in one instance was there a significant correlation between the narrative index and the number of ties.
The mono group and the bili group showed exactly the same pattern; however, additional analyses (partial correlations) showed that any initially significant correlations disappeared once the influence of text length on both variables was controlled for. That is, there were no true correlations between coherence and cohesion.

What do these results mean for the relationship between coherence and cohesion? The results indicate, first of all, that global narrative structure and the use of cohesive devices are not related, i.e. children producing a fully coherent story, as indicated by the index, do not necessarily also produce a large number of linguistic devices signaling connections between clauses. Secondly, the results also show that more coherent texts, i.e. texts with a large number of components or a high index score, are not necessarily associated with fewer cohesive devices. That is, children do not necessarily compensate for fewer cohesive devices with a tighter narrative structure. At the same time less coherent texts are not necessarily associated with more cohesive devices, i.e. children do not compensate for loosely structured narratives with a larger number of cohesive devices.

While participants’ discourse becomes significantly more coherent and cohesive from first to fourth grade, the increase in the two measures was also not found to be related, as an analysis of the longitudinal data indicated. More specifically, the only significant correlation, which was found between the increase in components and the increase in the total number of ties, was again mediated by text length, here the increase in text length. This confirms the conclusions drawn for the relationship between coherence and cohesion from a developmental perspective.
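How controlling for text length removes the longitudinal correlation can be illustrated with the standard first-order partial correlation formula applied to the three zero-order coefficients reported in ch. 8. This is a sketch under stated assumptions: the source does not say how the partial correlations were computed, the function and variable names are hypothetical, and the critical t values are taken from a standard two-tailed table at the 0.05 level.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


def t_statistic(r: float, df: int) -> float:
    """t value for testing a (partial) correlation coefficient against zero."""
    return r * math.sqrt(df / (1 - r**2))


# Zero-order correlations reported for the longitudinal data (N = 13).
r_components_ties = 0.589    # increase in components vs. increase in ties (Tab. 8.3)
r_components_length = 0.523  # increase in components vs. increase in clauses
r_ties_length = 0.964        # increase in ties vs. increase in clauses

r_partial = partial_correlation(
    r_components_ties, r_components_length, r_ties_length
)

# Two-tailed critical t values at alpha = 0.05 (standard table values, assumed here).
T_CRIT = {10: 2.228, 11: 2.201}

t_zero_order = t_statistic(r_components_ties, df=11)  # df = N - 2
t_partial = t_statistic(r_partial, df=10)             # df = N - 3

print(f"partial r = {r_partial:.3f}")
print(f"zero-order significant: {t_zero_order > T_CRIT[11]}")
print(f"partial significant:    {t_partial > T_CRIT[10]}")
```

With these rounded inputs the partial coefficient comes out at about 0.374, in line with the value reported for the longitudinal data, and its t value falls below the critical threshold while that of the zero-order coefficient exceeds it.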
The findings of the present study’s investigation of young L2 learners’ narrative discourse, which should be viewed against the background of the specific operationalizations of coherence and cohesion, thus disconfirm the traditional stance of cohesion contributing to coherence (cf. ch. 3.2). They do, however, support the view that cohesive devices are neither necessary nor sufficient for coherence. At the same time the results show that the density and/or number of cohesive devices cannot be regarded as influenced by the coherence of participants’ discourse. Furthermore, the results obtained for the development of each of the measures over time show that there is no change in the (lack of) relationship between coherence and cohesion from first to fourth grade.

Several limitations should be mentioned with respect to these findings. First of all, correlation analysis, which is the standard statistical procedure for testing relationships between variables, only serves to test linear relations. Further testing would thus be desirable to investigate the possibility of a non-linear relationship as well. Secondly, a more detailed look at correlations involving submeasures of coherence (individual narrative components) and especially cohesion (e.g. connectives) would need to be included in future studies (cf. also Bae 2001: 74). Finally, it would also be desirable to include coherence and cohesion measures derived from more qualitative, functional analyses, e.g. reference strategies (cf. for example Hickmann 2003).

9.3 Learner variables and their impact: Grade, sex and preschool experience

On the basis of previous research two hypotheses were put forward with respect to the learner variables investigated (cf. ch. 4):

1. Grade/age has a significant influence on the narrative coherence and cohesion of participants’ stories.
2. Sex has no significant influence on the narrative coherence and cohesion of participants’ stories.
No hypothesis was formulated regarding L2 preschool experience. The results obtained for coherence and cohesion showed a common pattern, which confirms the first hypothesis: Grade is the single most influential factor in explaining variation (cf. the sections above). However, grade explains variation in coherence far better than in cohesion. This can be seen in the effect sizes, which were very high for coherence measures, namely ca. 50% (partial eta-squared) for the total number of components and the global narrative index score and 0.32 to 0.77 (phi-values) for the individual narrative components. The effect sizes for cohesion were comparatively low, on the other hand, namely 12% for overall cohesion and typically between 10% and 20% for the subcategories (without outliers maximally 24%; all are partial eta-squared). Furthermore, grade had no statistically significant effect on two of the subcategories of cohesion, namely references and lexical cohesion, which together make up the largest part of the overall cohesive density. So far grade has been treated as one complex influence factor, which incorporates exposure as well as maturational effects. The question arises, however, as to whether it is feasible to do this. That is, the question of whether maturational effect and exposure effect can be isolated needs to be addressed. This question was already touched on in passing in the overview of previous research (cf. ch. 4) and I would like to argue that both effects overlap inseparably in my study. This is especially evident for narrative coherence: As explained in ch. 4, the age range of the present study’s participants corresponds to that of monolingual children where a strong development in coherence is to be expected. There is no readily conceivable reason why this should not also apply to the present study’s participants—even if they tell stories in an L2—since narrative coherence is a cognitive measure. That is, a maturational effect is to be expected. 
The impact of L2 preschool experience on first graders’ coherence results, on the other hand, indicates an overlap of both factors, at least if one does not assume that children with L2 preschool experience are cognitively more advanced per se. This latter interpretation does not hold up against the fact, however, that children without bilingual preschool experience have been able to catch up by the end of fourth grade. Thus, the substantially weaker performance of first graders without L2 preschool experience is more likely attributable to linguistic deficits. Since neither experience group consistently produces coherent stories in first grade but both do so in fourth grade, this in turn means that development is attributable to a combination of maturational effect and exposure effect. Consequently, these two factors are indeed inseparable in my study.

No significant influence of sex was found in the analyses. That is, there were no systematic observed or statistically significant differences between male and female participants with respect to cohesion or coherence; this confirms the second hypothesis. The observed results indicated some minor differences, e.g. with respect to connectives (ch. 7.2.2.2) and lexical cohesion (ch. 7.2.4.2), but no consistent pattern emerged. The lacking influence of sex is in line with the scarce number of similar studies, which found either no differences or only qualitative differences within subcategories, which were not investigated in the present analysis (cf. ch. 4). It would thus be desirable to conduct a more fine-grained analysis of the subcategories in the future to see whether this yields any more substantial differences attributable to participants’ sex.

The impact of L2 preschool experience is limited to first grade, although the development from first to fourth grade could also be affected by the difference in first grade scores.
By the end of fourth grade any differences attributable to experience group have evened out. At the same time the only consistent influence of experience group on both coherence and cohesion scores concerned participants’ interindividual variation, which was lower for first graders who had attended a bilingual preschool. Any other effect of experience group was limited mostly to the coherence measures, where first graders with L2 preschool experience generally outperform those without prior experience. The effect sizes for this phenomenon were comparatively high. More specifically, preschool experience explains between 23% (index score) and 33% (total number of components) of the variation in first grade. The influence of preschool experience on cohesion is far less evident, apart from the lower variation for first graders with bilingual preschool experience, which was described above. With respect to overall cohesion, as well as the subcategories, the difference between mono and bili first graders reaches significance only for connectives, and here preschool experience explains merely 8% of the variation in results. This is somewhat surprising, since the model outlined in ch. 2 presents cohesion as a linguistic measure and thus clear differences among groups would be expected for cohesion but much less so for coherence measures. This issue will be taken up again in one of the following sections.

Even grade, as the single most important factor in explaining variation, leaves about 50% of the variation in coherence and 88% or more of the variation in cohesion unexplained. Thus, other factors seem to influence coherence and particularly cohesion in L2 narrative discourse, even if some degree of random variation among participants is allowed for. One of the limitations of the present study is that only the influence of grade, sex, and L2 preschool experience was considered.
That is, many potential influence factors were not included, such as socioeconomic status, L1 proficiency, general cognitive abilities, motivation, or activities and skills strongly related to storytelling, such as story comprehension or parents’ storytelling (cf. also overviews of learner variables, e.g. Larsen-Freeman 1991, Ellis 1994). Any future study on coherence and cohesion would thus need to include as many additional impact factors as possible.

9.4 Interindividual variation

Several general phenomena were discovered with respect to participants’ interindividual variation as measured by the variability ratio (cf. ch. 5.2.3). First of all, it was found that interindividual variation is influenced by grade. That is, participants’ interindividual variation is higher in first than in fourth grade; this pattern agrees with other studies conducted in the Kiel IM project (cf. Wode 2009: 37). An influence of experience group was also found in that (a) the interindividual variation is generally lower for first graders with bilingual preschool experience than those without such prior experience, (b) the interindividual variation of participants without L2 preschool experience decreases more strongly from first to fourth grade and (c) by the end of fourth grade any differences in interindividual variation between experience groups have disappeared. Sex had no influence on participants’ interindividual variation.

Besides the learner variables grade and experience group, however, interindividual variation was also found to differ among the coherence and cohesion measures investigated. First of all, first graders’ interindividual variation was far larger for coherence than cohesion: The variation in overall coherence amounted to 64% in first grade for the number of components and 45% for the narrative index, while it was only 15% for the overall cohesive density. This order was also stable across experience groups.
Participants’ interindividual variation decreased far more strongly for coherence measures, however, so that by the end of fourth grade it was similar for both coherence and cohesion. That is, in fourth grade participants’ variation amounted to 17% for the number of components, 4% for the narrative index and 8% for the overall cohesive density. As the comparison between the number of components and the narrative index indicates, differences in interindividual variation between these two coherence measures are mainly attributable to the variety of components produced and much less to differences in global organization. This points to stylistic reasons for the difference in variation, especially in fourth grade.

The question remains of which factors are responsible for the much larger variance in coherence than in cohesion results, especially in first grade. This is possibly an artifact of the method of analysis; that is, while it was possible for participants to have realized none of the narrative components, even if they produced a stretch of speech in relation to the task material, it would be far more difficult to produce a stretch of speech not containing any cohesive ties. It is difficult to imagine, for example, how participants could produce utterances in relation to the task material that do not contain any content words. Any stretch of clauses thus very likely includes, for example, repetitions of a lexeme denoting the main character(s), boy or dog, and, consequently, a certain number of cohesive devices; that is, participants may not be able to avoid producing a minimum number of ties, since certain characters and phenomena are presented repeatedly in the task material.

At the same time the interindividual variation in cohesion was strongly dependent on the type of cohesive device. The variability in results was highest for ellipses (100% in first and 33% in fourth grade), followed by connectives (62% vs. 21%), references (20% vs.
14%) and lexical ties (17% vs. 9%). That is, the amount of variation depended on grade, but its general order did not, just as this order was not influenced by experience group, even though first grade results varied less for children with L2 preschool experience. This could indicate that some cohesive ties are more difficult to produce than others. The use of ellipses, for example, is strongly dependent on the use of complex syntax. If participants produce only simple sentences, as they are quite likely to do in first grade, the structural prerequisites for ellipses will not be met and thus no ellipses will be produced. At the same time the use of some types of cohesive devices may be more flexible than that of others. That is, some cohesive ties are artifacts of task and task material (especially lexical ties, cf. above), while others enjoy a much greater stylistic flexibility. This seems to be the case for connectives, for example, the use of which is mostly optional.

9.5 Coherence and cohesion: Cognitive and linguistic development

As I outlined in ch. 3 and summarized again at the beginning of the present chapter, narrative coherence, which is based on the application of a narrative schema as a global production plan, is a cognitive measure. Cohesion, on the other hand, was defined as a linguistic measure based on the use of cohesive devices, which serve to establish local relations among stretches of clauses and thereby to connect these stretches of clauses into a unified whole. By continuity, the development of narrative coherence and cohesion from first to fourth grade was assumed to reflect participants’ cognitive and, respectively, linguistic development. Thus, the following research questions had been put forward in ch. 3 and were repeated at the beginning of the present chapter:

• Is there a cognitive development from grade one to four as measured by L2 narrative coherence?
• Is there a linguistic development from grade one to four as measured by L2 narrative cohesion?

As my results showed, both narrative coherence and cohesion increase significantly from first to fourth grade. This in turn means that participants in the immersion program develop cognitively as well as linguistically over the three years between test time one and test time two. However, several questions arise as to the validity of coherence as a cognitive measure and that of cohesion as a linguistic measure.

Coherence, first of all, does not only show differences between grades but also between experience groups in first grade. That is, participants without L2 preschool experience were found to produce significantly less coherent stories in first but not in fourth grade. Does this mean that first graders with bilingual preschool experience are cognitively more advanced than those without prior experience? Such an interpretation is possible but problematic and would have to be crosschecked very carefully not only with evidence for a narrative schema in the L1 but also with further cognitive measures. It is more likely that the significant difference between experience groups in first grade points to the influence of a linguistic threshold. That is, first graders with bilingual preschool experience have the means to express the narrative schema in the L2 more readily available than those without such prior experience (cf. ch. 4 for an analogous phenomenon in L1 acquisition). Although it could similarly be claimed that the large differences in coherence between first and fourth graders are attributable to linguistic difficulties, this does not seem likely for two reasons: First of all, even for monolingual children the major development in global narrative structure has been found to take place within the age range under investigation (cf. the developmental overview in ch. 4).
Secondly, neither group consistently structured their narratives according to a global narrative schema in first grade; significantly more but not all first graders with L2 preschool experience produced a(n) (almost) fully coherent story, while some of the children without prior experience also produced a(n) (almost) fully coherent story. This in turn means that the usefulness of coherence as a cognitive measure can be obscured by a linguistic threshold. This applies, of course, as much to L2 as to L1 studies, even if the discrepancy between cognitive abilities and linguistic expression may be more salient for L2 speakers. Consequently, coherence can be regarded as a useful measure for cognitive development over time only.

And what about cohesion? The degree of cohesive density, i.e. the number of ties per clause, differs significantly between grades. However, the sum of the cohesion results casts doubt on whether the use of cohesive devices is a straightforward indicator of linguistic development, since (a) not all subcategories develop significantly, (b) the interindividual variation is comparatively low but differs strongly among the subcategories (cf. the previous section), (c) a general order of frequency applies to the subcategories and their contribution to overall cohesion and (d) no significant differences between experience groups were found except for the use of connectives, even though significant linguistic differences would be expected at least in first grade. What does this mean for the construct validity of cohesion?

Differences in interindividual variation among the cohesion subcategories have already been discussed in the previous section, and linguistic complexity and stylistic flexibility were identified as possible reasons for these differences. It was also discussed that the use of at least some cohesive devices can hardly be avoided if any utterances at all are produced in relation to task and task material.
The general order of frequency and contribution (cf. also the findings of other studies described in ch. 4) points additionally to structural influences on the use of cohesive devices. That is, clauses minimally have an SV-structure, i.e. they contain a subject, which will minimally be a noun, noun phrase or pronoun, and a verb. Thus, any clause will normally contain at least one word that is a potential lexical tie (the verb). At the same time participants, especially first graders, often fill the subject slot with a full noun phrase. This means that in all likelihood both (a) the definite article, i.e. a demonstrative reference, and (b) a noun, i.e. a word that again has the potential of a lexical tie, will be used. A typical SV clause will therefore contain at least two potential lexical ties and one potential (demonstrative) reference; this is reflected in the degree to which the individual subcategories contribute to overall cohesion (cf. ch. 7.2). The use of connectives is, on the other hand, far more flexible, since semantic relationships between clauses are often inherent in their content and do not have to be made explicit. Ellipses, finally, are bound to the use of complex syntax, i.e. they require certain structural prerequisites to be fulfilled, and their use is therefore far more limited. Thus, differences in interindividual variation as well as the general order of frequency and the contribution to overall cohesion point to a limited comparability among the subcategories.

However, the cohesion results raise an even more crucial question: What is a “cohesive” text and how cohesive does a text need to be? The importance of this question is reflected in my results for the relationship between coherence and cohesion, which showed that the two measures are not related and that their development from first to fourth grade, while parallel, is also not related.
That is, the degree of cohesiveness was not necessarily found to correlate positively or negatively with the degree of coherence. The changes found within subcategories (cf. ch. 7.3.4) also tie in with this question, since they showed that the majority of changes from first to fourth grade seem to take place within subcategories and not in overall cohesive density or the density of each subcategory. In the light of the issues raised above, the classical text-linguistic approach to cohesion (ultimately based on Halliday and Hasan (1976)) needs to be complemented by more properly functional approaches in order to shed light on other important phenomena, such as more general discourse phenomena that are governed by underlying cognitive strategies, e.g. referring to story characters. The linguistic implementation of reference strategies, for example, would cut across Halliday and Hasan’s categories of personal reference, ellipses (zero anaphora), demonstrative references and lexical ties (cf. Hickmann 1991, 2003, Bamberg 1987, Karmiloff-Smith 1985). Relatedly, the use of connectives may be governed by a “difficulty of inference” strategy, which calls for the use of a connective only if the storyteller believes that the relationship between two clauses is not easily inferable.

To sum up, the construct validity of cohesion as originally defined within the framework of Halliday and Hasan (1976) is not entirely robust. What does this mean for discourse as the result of top-down and bottom-up organization processes as outlined in the simplified model of narrative discourse production presented in chapter 3? The two organizational structures identified for texts, top-down and bottom-up, were only partly confirmed in the way they had been defined. Rather, it may be necessary to redefine top-down organization as the global strategies governing bottom-up processes.
That is, top-down processes refer to the cognitive pre-planning not only of narrative structure(s) but also of strategies underlying the use of cohesive devices. Bottom-up processes, in turn, would need to be redefined as the linguistic expression of these organizational strategies.

9.6 Limitations of the study

Any discussion of my results is, of course, not complete without mentioning the limitations of the present study and desiderata for future research. Several limitations have already been mentioned in previous sections; several additional ones that have not yet been discussed are presented below.

First of all, my study uses data from a limited number of non-randomly selected participants, since it explores data from classes in one specific program. Although this can hardly be avoided in a study of language acquisition in an educational setting and even less so in a pilot project such as the one in which my data were collected, it also means that the results may not be fully generalizable. In addition, the specific make-up of these classes, with not only two but three experience groups (cf. ch. 5.1), two of which were included in this study, may limit the comparability with other immersion programs. However, the results obtained for differences between groups and the development from first to fourth grade are so solid that these factors should not constitute a significant danger to reliability. An additional limitation, which is directly attributable to the number of participants, is that, in order to avoid excessively small subgroups, the administration of the task could not be counterbalanced; reasons for the order chosen were discussed in ch. 5.1. Thus, it would be desirable to repeat this study with a larger number of (randomly selected) participants from different programs so that all subgroups are big enough to be representative and the task order can be counterbalanced.
At the same time it would be desirable, as discussed in a previous section, to collect as many (additional) learner variables with potentially predictive value as possible.

Secondly, the present study focused on a quantitative analysis of coherence and cohesion. However, as my results have shown, several areas not considered here may need closer investigation with a more properly qualitative, functional approach in the future, for example a function-to-form analysis such as in several earlier studies on cohesion (e.g. Bamberg 1997, Karmiloff-Smith; cf. Berman and Slobin (1994) for the general approach). Thus, a more detailed analysis of the subcategories of cohesion and coherence would be desirable to discover more qualitative differences between participants; are there differences in the linguistic means used to establish connective (cf. Möller 2012) or referential ties, for example, which are attributable to grade and/or experience group? Are there differences as to what and how much additional information is included in the SETTING component or how the ENDING is expressed (cf. Möller 2010)? Similarly, a more detailed look at the relationship between cohesion and coherence should be conducted, looking at correlations between subcategories. Previous research has found, for example, that lexical and referential ties are significant predictors at least of coherence ratings (Bae 2001: 74). Especially with respect to coherence, an investigation of participants’ L1 data analogous to the present approach should be conducted in order to confirm whether differences between first graders with and without L2 preschool experience are really attributable to linguistic differences. With respect to cohesion, an analysis of L1 data could point out whether participants make systematic differences in the use of cohesive devices and whether the general order of frequency found in the L2 English data is confirmed for participants’ L1 German data.
Also, the approach to cohesion chosen in the present study should be complemented with a more properly functional analysis, particularly of connectives. Even more fundamentally, however, the use of cohesive devices in L2 discourse should be compared with L2 listeners’ (perceived) need for cohesive devices in (narrative) discourse comprehension, an issue that is far from resolved even in L1 research. Further areas of research that should be explored by future studies are, for example, a comparison with monolingual L1 data (in relation to which previous studies on L2 learners have reported an over- as well as an underuse of cohesive devices, cf. Bae 2001, Viberg 2001), and a comparison with results from traditional foreign language teaching. With respect to theory building, it would be highly desirable to link existing research and theories on the “next lower” level of (L2) production, i.e. syntax, more strongly with discourse phenomena and to review existing (L2) acquisition theories critically in the light of (L2) discourse development (cf. Hickmann 2003, esp. 334ff.).

9.7 Conclusions for the effectiveness of immersion programs

Discourse features are an important component of language proficiency and are therefore included in all major language tests and models of language proficiency (cf. ch. 1). The results of the present study show that participants in the Kiel IM program benefit from IM education with respect to one of the most important genres of discourse, namely narratives. Make-believe stories, such as the ones used for data elicitation in the Kiel project, require a largely autonomous construction of text with a culturally pre-defined structure and they have thus been identified as one of the most challenging types of narrative discourse (cf. ch. 1 and ch. 3). The results presented in previous chapters have shown that participants’ L2 discourse proficiency improves significantly over the duration of an IM program such as the one under consideration, i.e.
early partial IM at elementary school (for some participants in combination with IM in preschool, cf. ch. 5.1). While strong oral communication skills have been observed repeatedly for IM students (cf. ch. 1), especially in comparison with their peers in regular foreign language teaching programs, my results show that IM students also benefit with respect to academically valued discourse structuring skills, be it narrative structure or the linguistic connectedness of clauses and sentences. This beneficial effect is observed for both bilingual preschool (as evident from the significant differences between experience groups in first grade) and subsequent immersion in elementary school (shown by the development from first to fourth grade independent of experience group).

The present study also contributes to two areas of immersion research that have recently been identified as “neglected” (Tucker & Dubiner 2008: 272): newcomers to immersion programs and (knowledge) transfer. In the present study, first graders without L2 preschool experience represent such a newcomer group, and the results show that these children often lag behind initially (first grade results) but that they are able to compensate for this initial disadvantage if given sufficient time. Since only first and fourth grade results were studied, it is not entirely clear how much time is needed; my results show, however, that one year is not enough, while four years are sufficient. Evidence from other studies conducted in the Kiel IM project, for example on verbal morphology, points to two years as the necessary time span to catch up (cf. Wode 2009: 37). Further testing is needed to see whether this also applies to IM programs with other set-ups, especially those with a lower percentage of instruction in the L2.
Independently of bilingual or monolingual preschool experience, children thus benefit from attending an immersion program, but since any differences attributable to preschool experience have evened out by the end of fourth grade, does this mean that attending a bilingual preschool is superfluous? This conclusion would be premature. First of all, we do not know how a class without one or the other group would perform. It is not unlikely, however, that a class consisting entirely of children without L2 preschool experience would perform less strongly, since in heterogeneous classes inexperienced children can benefit from linguistically more advanced ones, especially in the initial phases. Secondly, this study compares only first and fourth grade data; that is, the results do not indicate how early the group with bilingual preschool experience arrives at the level it has reached by the end of fourth grade. First grade results suggest, however, that this will be earlier than for children without prior experience (at least for coherence). That is, the inexperienced group needs a comparatively longer exposure time to the L2 in elementary school in order to reach the same level of proficiency. This in turn has consequences for program planning, especially if an L3 is to be introduced into the program. That is, an L3 could be introduced earlier for a group with L2 preschool experience than for an inexperienced group. A more viable conclusion would therefore be that the significantly stronger performance of first graders with L2 preschool experience shows the beneficial effect of combining preschool and elementary school immersion and that an immersion program should indeed combine both educational levels in order to achieve optimal results.

With respect to knowledge transfer, coherence results point to a positive transfer of narrative schema, a language-independent cognitive structure. Earlier research (cf. ch.
4) on monolingual L1 speakers showed that in my participants’ age range (mean age 6;8 and 9;8) a strong development of coherence is to be expected in L1 discourse. The results of my study showed that an equally strong development can be observed in L2 texts. A comparison of my results and those of previous studies thus strongly suggests that the availability of a story schema for L2 discourse production means a general availability, i.e. the application of the schema is not bound to any particular language, even if its successful implementation can be hindered by linguistic difficulties. However, an analysis of L1 data would need to be conducted to fully confirm this claim.

Finally, the popular myth of cognitive deficits due to early bilingualism and/or schooling in an L2 has once more been shown to lack empirical evidence. That is, cognitive development, as evident from the narrative coherence results, is not impeded by attending an immersion program. On the contrary, immersion programs, such as the one whose results were analyzed in the present study, are highly successful in fostering both cognitive and linguistic skills.

10 Bibliography

Aarssen, J., Akinçi, M.-A. & Yağmur, K. 2001. Development of clause linkage in narratives: A comparison of Turkish children in Australia, France, the Netherlands and Turkey. In Research on child language acquisition. Proceedings of the 8th conference of the IASCL, A. Barreña, M.-J. Ezeizabarrena, I. Idiazabal & B. MacWhinney (eds), 41-56. San Sebastián: University of the Basque Country.
Akinçi, M.-A., Jisa, H. & Kern, S. 2001. Influence of L1 Turkish on L2 French narratives. In Narrative development in a multilingual context, L. Verhoeven & S. Strömqvist (eds), 188-208. Amsterdam: John Benjamins.
Aksu-Koç, A. & von Stutterheim, C. 1994. Temporal relations in narrative: Simultaneity. In Relating events in narrative: A crosslinguistic developmental study, R.A. Berman & D.I. Slobin (eds), 393-455.
Hillsdale, NJ: Lawrence Erlbaum Associates.
Allen, M.S., Kertoy, M.K., Sherblom, J.C. & Petit, J.M. 1994. Children’s narrative productions: A comparison of personal event and fictional stories. Applied Psycholinguistics 15: 149-176.
Almgren, M., Beloki, L., Idiazabal, I. & Manterola, I. 2008. Acquisition of Basque in successive bilingualism: Data from oral storytelling. In Language contact and contact languages, P. Siemund & N. Kintana (eds), 239-259. Amsterdam: John Benjamins.
Almgren, M., Idiazabal, I., Manterola, I. & Beloki, L. 2007. Language acquisition in Basque-Spanish successive bilinguals: Narrative skills in five year old children. Paper presented at the 6th International Symposium on Bilingualism, Hamburg.
Anderson, R.C. [1984] 1994. Role of the reader’s schema in comprehension, learning and memory. In Theoretical models and processing of reading, R.B. Ruddell, M.R. Ruddell & H. Singer (eds), 4th edition, 469-482. Newark, DE: International Reading Association.
Anderson, R.C. & Pearson, P.D. 1984. A schema theoretic view of basic processes in reading comprehension. In Handbook of reading research, P.D. Pearson (ed), 255-291. New York: Longman.
Asher, N. & Lascarides, A. 1998. Bridging. Journal of Semantics 15 (1): 83-113.
Augst, G. 2010. Zur Ontogenese der Erzählungskompetenz in der Primar- und Sekundarstufe. In Textformen als Lernformen, T. Pohl & T. Steinhoff (eds), 63-95. Duisburg: Gilles & Francke.
Augst, G., Disselhoff, K., Henrich, A., Pohl, T. & Völzing, P.-L. 2007. Text-Sorten-Kompetenz. Eine echte Longitudinalstudie zur Entwicklung der Textkompetenz im Grundschulalter. Frankfurt/Main et al.: Peter Lang.
Bachem, J. 2004. Lesefähigkeiten deutscher Kinder im frühen englischen Immersionsunterricht. M.A. thesis, Kiel University.
Bachman, L.F. 2004. Statistical analyses for language assessment. Cambridge, UK et al.: Cambridge University Press.
Bae, J. 2001. Cohesion and coherence in children’s written English: Immersion and English-only classes.
Issues in Applied Linguistics 12 (1): 51-88.
Baker, S. & MacIntyre, P.D. 2003. The role of gender and immersion in communication and second language orientations. Language Learning 53 (1): 65-96.
Bamberg, M. 1986. Studies on morphological and syntactic development. Linguistics 24: 227-284.
Bamberg, M. 1987. The acquisition of narratives. Learning to use language. Berlin et al.: Mouton de Gruyter.
Bamberg, M. 1994. Development of linguistic forms: German. In Relating events in narrative. A crosslinguistic developmental study, R.A. Berman & D.I. Slobin (eds), 189-238. Hillsdale, NJ: Lawrence Erlbaum Associates.
Bamberg, M. 1997. Narrative development: Six approaches. Mahwah, NJ & London: Lawrence Erlbaum Associates.
Bamberg, M. & Marchman, V. 1990. What holds a narrative together? The linguistic encoding of episode boundaries. IPrA Papers in Pragmatics 4 (1/2): 58-121.
Bamberg, M. & Marchman, V. 1991. Binding and unfolding: Towards the linguistic construction of narrative discourse. Discourse Processes 14: 277-305.
Bamberg, M. & Marchman, V. 1994. Foreshadowing and wrapping up in narrative. In Relating events in narrative. A crosslinguistic developmental study, R.A. Berman & D.I. Slobin (eds), 555-590. Hillsdale, NJ: Lawrence Erlbaum Associates.
Bamberg, M. & Moissinac, L. 2003. Discourse development. In The handbook of discourse processes, A.C. Graesser, M.A. Gernsbacher & S.R. Goldman (eds), 395-437. Mahwah, NJ: Lawrence Erlbaum Associates.
Bartlett, F.C. 1932. Remembering. Cambridge, UK: Cambridge University Press.
Becker Bryant, J. 2009. Pragmatic development. In The Cambridge handbook of child language, E.L. Bavin (ed), 339-354. Cambridge, UK et al.: Cambridge University Press.
Becker, T. 2005. Kinder lernen erzählen. Zur Entwicklung von narrativen Fähigkeiten von Kindern unter Berücksichtigung der Erzählform. 2nd edition. Baltmannsweiler: Schneider.
Beigman Klebanov, B. & Shamir, E. 2005. Lexical cohesion: Some implications of an empirical study.
In Natural Language Understanding and Cognitive Science. Proceedings of the 2nd International Workshop on Natural Language Understanding and Cognitive Science, NLUCS 2005, B. Sharp (ed), 13-21. INSTICC Press, <http://www.cs.huji.ac.il/~beata/#NLUCS> (5 July 2006).
Bennett-Kastor, T. 1986. Cohesion and predication in child narrative. Journal of Child Language 13: 353-370.
Berman, R.A. 1988. On the ability to relate events in narrative. Discourse Processes 11: 469-497.
Berman, R.A. 1995. Narrative competence and storytelling performance: How children tell stories in different contexts. Journal of Narrative and Life History 5 (4): 285-313.
Berman, R.A. 2001. Narrative development in multilingual contexts: A cross-linguistic perspective. In Narrative development in a multilingual context, L. Verhoeven & S. Strömqvist (eds), 419-428. Amsterdam: John Benjamins.
Berman, R.A. 2001a. Setting the narrative scene: How children begin to tell a story. In Children’s Language, Volume 10. Developing narrative and discourse competence, K. Nelson, A. Aksu-Koç & C.E. Johnson (eds), 1-30. Mahwah, NJ: Lawrence Erlbaum Associates.
Berman, R.A. 2004. The role of context in developing narrative abilities. In Relating events in narrative, Volume 2. Typological and contextual perspectives, S. Strömqvist & L. Verhoeven (eds), 261-280. Mahwah, NJ & London: Lawrence Erlbaum Associates.
Berman, R.A. 2008. The psycholinguistics of developing text construction. Journal of Child Language 35: 735-771.
Berman, R.A. & Slobin, D.I. (eds). 1994. Relating events in narrative. A crosslinguistic developmental study. Hillsdale, NJ: Lawrence Erlbaum Associates.
Berman, R.A. & Slobin, D.I. 1994a. IB: Goals and procedures. In Relating events in narrative. A crosslinguistic developmental study, R.A. Berman & D.I. Slobin (eds), 17-35. Hillsdale, NJ: Lawrence Erlbaum Associates.
Berman, R.A. & Slobin, D.I. 1994b. IIA: Narrative structure. In Relating events in narrative.
A crosslinguistic developmental study, R.A. Berman & D.I. Slobin (eds), 39-84. Hillsdale, NJ: Lawrence Erlbaum Associates.
Berman, R.A. & Slobin, D.I. 1994c. IIIA: English. In Relating events in narrative. A crosslinguistic developmental study, R.A. Berman & D.I. Slobin (eds), 127-187. Hillsdale, NJ: Lawrence Erlbaum Associates.
Berman, R.A. & Slobin, D.I. 1994d. VA: Becoming a proficient speaker. In Relating events in narrative. A crosslinguistic developmental study, R.A. Berman & D.I. Slobin (eds), 597-610. Hillsdale, NJ: Lawrence Erlbaum Associates.
Berman, R.A. & Verhoeven, L. 2002. Cross-Linguistic Perspectives on the Development of Text-Production Abilities: Speech and Writing. Written Language & Literacy (special issue) 5 (1): 1-43.
Biber, D., Conrad, S. & Leech, G. 2002. Longman student grammar of spoken and written English. Harlow, UK: Pearson Longman.
Biber, D., Johansson, S., Leech, G., Conrad, S. & Finegan, E. 1999. Longman grammar of spoken and written English. Harlow, UK: Longman.
Black, J.B. & Wilensky, R. 1979. An evaluation of story grammars. Cognitive Science 3: 213-230.
Blankman, C., Teglasi, H. & Lawser, M. 2002. Thematic apperception, narrative schemas, and literacy. Journal of Psychoeducational Assessment 20: 268-289.
Bley-Vroman, R. 1983. The comparative fallacy in interlanguage studies: The case of systematicity. Language Learning 33 (1): 1-17.
Bloor, T. & Bloor, M. 1995. The functional analysis of English. London, New York et al.: Arnold.
Botvin, G.J. & Sutton-Smith, B. 1977. The development of structural complexity in children’s fantasy narratives. Developmental Psychology 13 (4): 377-388.
Boueke, D., Schülein, F., Büscher, H., Terhorst, E. & Wolf, D. 1995. Wie Kinder erzählen. Untersuchungen zur Erzähltheorie und zur Entwicklung narrativer Fähigkeiten. München: Fink.
Bransford, J.D. & Johnson, M.K. 1972. Contextual prerequisites for understanding: Some investigations of comprehension and recall.
Journal of Verbal Learning and Verbal Behavior 11: 717-726.
Brewer, W.F. 1985. The story schema: Universal and culture-specific properties. In Literacy, language and learning. The nature and consequences of reading and writing, D.R. Olson, N. Torrance & A. Hildyard (eds), 167-94. Cambridge, UK: Cambridge University Press.
Brewer, W.F. & Lichtenstein, E.H. 1981. Event schemas, story schemas, and story grammars. In Attention and performance, J. Long & A. Baddeley (eds), 363-379. Hillsdale, NJ: Erlbaum.
Brewer, W.F. & Lichtenstein, E.H. 1982. Stories are to entertain: A structural affect theory of stories. Journal of Pragmatics 6: 473-486.
Bublitz, W., Lenk, U. & Ventola, E. (eds). 1999. Coherence in spoken and written discourse. How to create it and how to describe it. Amsterdam: John Benjamins.
Burmeister, P. & Daniel, A. 2002. How effective is late partial immersion? Some findings from a secondary school program in Germany. In An integrated view of language development. Papers in honor of Henning Wode, P. Burmeister, T. Piske & A. Rohde (eds), 499-515. Trier: Wissenschaftlicher Verlag Trier.
Burmeister, P. & Pasternak, R. 2004. Früh und intensiv: Englische Immersion in der Grundschule am Beispiel der Claus-Rixen-Grundschule Altenholz. Mitteilungsblatt FMF-Landesverband Schleswig-Holstein 2004 (August): 24-30.
Cain, K. 2003. Text comprehension and its relation to coherence and cohesion in children’s fictional narratives. British Journal of Developmental Psychology 21: 335-351.
Calfee, R. 1982. Some theoretical and practical ramifications of story grammars. Journal of Pragmatics 6 (5-6): 381-572.
Cameron, C.A., Lee, K., Webster, S., Munro, K., Hunt, A.K. & Linton, M.J. 1995. Text cohesion in children’s narrative writing. Applied Psycholinguistics 16: 257-269.
Chafe, W.L. 1980. The pear stories. Cognitive, cultural and linguistic aspects of narrative production. Norwood, NJ: Ablex.
Chang, C. 2006.
Linking early narrative skill to later language and reading ability in Mandarin-speaking children: a longitudinal study over eight years. Narrative Inquiry 16 (2): 275-293.
Clark, H.H. 1994. Discourse in production. In Handbook of psycholinguistics, M.A. Gernsbacher (ed), 985-1021. San Diego, CA: Academic Press.
Claussen, I. 1997. Ein Vergleich der Diskursfähigkeiten mono- und bilingual unterrichteter Schüler im Rahmen der Unterrichtserprobung in Schleswig Holstein. Unpublished M.A. thesis, Kiel University.
Collins, A., Brown, J.S. & Larkin, K.M. 1980. Inference in text understanding. In Theoretical issues in reading and comprehension, R.J. Spiro, B. Bruce & W.F. Brewer (eds), 385-405. Hillsdale, NJ: Lawrence Erlbaum Associates.
Commission of the European Communities. 2003. Promoting Language Learning and Linguistic Diversity. An Action Plan 2004-2006. Communication from the Commission to the Council, the European Parliament, the Economic and Social Committee and the Committee of the Regions. COM (2003) 449 final, 24.07.2003.
Cook, G. 1989. Discourse. Oxford, UK et al.: Oxford University Press.
Cook, V. 1999. Going beyond the native speaker in language teaching. TESOL Quarterly 33: 185-209.
Cook-Gumperz, J. & Green, J.L. 1984. A sense of story: Influences on children’s storytelling ability. In Coherence in spoken and written discourse, D. Tannen (ed), 201-218. Norwood, NJ: Ablex.
Cortina, J.M. 1993. What is coefficient alpha? An examination of theory and applications. Journal of Applied Psychology 78 (1): 98-104.
Council of Europe. 2001. Common European Framework of reference for languages. Learning, Teaching, Assessment. Cambridge, UK: Cambridge University Press.
Crowhurst, M. 1987. Cohesion in argument and narration at three grade levels. Research in the Teaching of English 21 (2): 185-201.
Cruse, A. 2004. Meaning in language. An introduction to semantics and pragmatics. 2nd edition. Oxford, UK et al.: Oxford University Press.
Crystal, D. 2003.
A dictionary of linguistics and phonetics. 5th edition. Malden, Mass. et al.: Blackwell.
Cutler, A. & Clifton, C. Jr. 2000. Comprehending spoken language: A blueprint of the listener. In The neurocognition of language, C.M. Brown & P. Hagoort (eds), 123-166. Oxford, UK et al.: Oxford University Press.
Cutting, J. 2002. Pragmatics & Discourse. A resource book for students. London, UK et al.: Routledge.
de Beaugrande, R. 1982. The story of grammars and the grammar of stories. Journal of Pragmatics 6: 383-422.
de Beaugrande, R.-A. & Dressler, W. 1981. Introduction to text linguistics. Harlow, UK: Longman.
Dickinson, D.K. & McCabe, A. 2001. Bringing it all together: The multiple origins, skills, and environmental supports of early literacy. Learning Disabilities Research & Practice 16 (4): 186-202.
Diehr, B. 2011. Sprachproduktion in der Erstsprache und in der Fremdsprache. Erkenntnisse über die diskursiven Fähigkeiten von Englischlernenden in der Grundschule. In Fremdsprachenunterricht in der Grundschule, M. Kötter & J. Rymarczyk (eds), 11-36. Frankfurt: Peter Lang.
Eckstein, P.P. 2008. Angewandte Statistik mit SPSS. Praktische Einführung für Wirtschaftswissenschaftler. 6th edition. Wiesbaden: Gabler.
Eikmeyer, H.-J., Kindt, W., Laubenstein, U., Lisken, S., Rieser, H. & Schade, U. 1995. Coherence regained. In Focus and coherence in discourse processing, G. Rickheit & C. Habel (eds), 115-142. Berlin & New York: Walter de Gruyter.
Ellis, R. 1994. The study of second language acquisition. Oxford, UK et al.: Oxford University Press.
Epstein, S.-A. & Phillips, J. 2009. Storytelling skills of children with specific language impairment. Child Language Teaching and Therapy 25 (3): 285-300.
Esser, J. 2009. Introduction to English text-linguistics. Frankfurt am Main et al.: Peter Lang.
Eysenck, M.W. 2001. Principles of cognitive psychology. 2nd edition. Hove, UK: Psychology Press.
Fairclough, N. 1989. Language and power. London: Longman.
Fairclough, N. 1992.
Discourse and social change. Cambridge, UK: Polity.
Fairclough, N. 1995. Critical discourse analysis: The critical study of language. London & New York: Longman.
Fairclough, N. 2003. Analysing discourse: Textual analysis for social research. London: Routledge.
Fazio, B., Naremore, R. & Connell, P. 1996. Tracking children from poverty at risk for specific language impairment: A 3-year longitudinal study. Journal of Speech and Hearing Research 39 (3): 52-63.
Field, A. 2009. Discovering statistics using SPSS. 3rd edition. Los Angeles, CA et al.: Sage.
Fitzgerald, J. & Spiegel, D.L. 1986. Textual cohesion and coherence in children’s writing. Research in the Teaching of English 20 (3): 263-280.
Fivush, R. & Slackman, E. 1986. The acquisition and development of scripts. In Event knowledge. Structure and function in development, K. Nelson (ed), 71-96. Hillsdale, NJ: Lawrence Erlbaum Associates.
Foltz, P.W. 2003. Quantitative cognitive models of text and discourse processing. In The handbook of discourse processes, A.C. Graesser, M.A. Gernsbacher & S.R. Goldman (eds), 487-523. Mahwah, NJ: Lawrence Erlbaum Associates.
Foster-Cohen, S. 2001. First language acquisition … second language acquisition: ‘What’s Hecuba to him or he to Hecuba?’. Second Language Research 17 (4): 329-344.
Gebauer, S.K., Zaunbauer, A.C.M. & Möller, J. 2012. Erstsprachliche Leistungsentwicklung im Immersionsunterricht: Vorteile trotz Unterrichts in einer Fremdsprache? Zeitschrift für Pädagogische Psychologie 26: 183-196.
Gee, J.P. 2005. An introduction to discourse analysis: Theory and method. 2nd edition. New York & London: Routledge.
Genesee, F. 1987. Learning through two languages. Cambridge, Mass.: Newbury House.
Gernsbacher, M.A. & Givón, T. (eds). 1995. Coherence in spontaneous text. Amsterdam: John Benjamins.
Gernsbacher, M.A. & Givón, T. 1995a. Introduction: Coherence as a mental entity. In Coherence in spontaneous text, M.A. Gernsbacher & T. Givón (eds), vii-x. Amsterdam: John Benjamins.
Gillam, R.B. & Pearson, N. 2004. Test of narrative language. Austin, TX: PRO-ED.
Givón, T. 1995. Coherence in text vs. coherence in mind. In Coherence in spontaneous text, M.A. Gernsbacher & T. Givón (eds), 59-115. Amsterdam: John Benjamins.
Gombert, J.E. 1992. Metalinguistic development. Chicago, IL: University of Chicago Press.
Goswami, U. 1998. Cognition in children. Hove, UK: Psychology Press.
Graesser, A.C., Gernsbacher, M.A. & Goldman, S.R. 2003. Introduction to the handbook of discourse processes. In The handbook of discourse processes, A.C. Graesser, M.A. Gernsbacher & S.R. Goldman (eds), 1-23. Mahwah, NJ: Lawrence Erlbaum Associates.
Graesser, A.C., McNamara, D., Louwerse, M.M. & Cai, Z. 2004. Coh-Metrix: Analysis of text on cohesion and language. Behavior Research Methods, Instruments & Computers 36: 193-202.
Greenbaum, S. 1996. The Oxford English grammar. Oxford, UK et al.: Oxford University Press.
Grice, H.P. 1975. Logic and conversation. In Syntax and Semantics, Vol. 3. Speech Acts, P. Cole & J. Morgan (eds), 41-58. New York: Academic Press.
Griffin, T., Hemphill, L., Camp, L. & Wolf, D. 2004. Oral discourse in the preschool years and later literacy skills. First Language 24: 124-147.
Gumperz, J.J., Kaltman, H. & O’Connor, M.C. 1984. Cohesion in spoken and written discourse: Ethnic style and the transition to literacy. In Coherence in spoken and written discourse, D. Tannen (ed), 3-19. Norwood, NJ: Ablex.
Halliday, M.A.K. & Hasan, R. 1976. Cohesion in English. London: Longman.
Halliday, M.A.K. & Hasan, R. [1985] 1997. Language, context and text. Aspects of language in a social-semiotic perspective. Victoria: Deakin University Press.
Harley, B., Cummins, J., Swain, M. & Allen, P. 1990. The nature of language proficiency. In The development of second language proficiency, B. Harley, P. Allen, J. Cummins & M. Swain (eds), 7-25. Cambridge, UK et al.: Cambridge University Press.
Hasan, R. 1984. Coherence and cohesive harmony.
In Understanding reading comprehension. Cognition, language and the structure of prose, J. Flood (ed), 181-220. Newark, Del.: International Reading Association. Hatch, E. & Lazaraton, A. 1991. The research manual. Design and statistics for applied linguistics. Boston MA: Heinle & Heinle. Hausendorf, H. & Quasthoff, U.M. 1996. Sprachentwicklung und Interaktion: Eine linguistische Studie zum Erwerb von Diskursfähigkeiten. Opladen: Westdeutscher Verlag. Heinemann, M. & Heinemann, W. 2002. Grundlagen der Textlinguistik: Interaktion - Text - Diskurs. Tübingen: Niemeyer. Hellmann, C. 1995. The notion of coherence in discourse. In Focus and coherence in discourse processing, G. Rickheit & Habel, C. (eds.), 190-202. Berlin & New York: Walter de Gruyter. Hendrickson, V. & Shapiro, L.R. 2001. Cohesive reference devices. Journal of Psychological Inquiry 6 (1): 17-22. Herman, D. 2009. Basic elements of narrative. Malden, MA et al.: Wiley-Blackwell. Herrman, C. & Fiebach, C. 2004. Gehirn und Sprache. Frankfurt/ Main: Fischer. Hickmann, M. 1991. The development of discourse cohesion: Some functional and cross-linguistic issues. In Language bases... discourse bases. Some aspects of contemporary French-language psycholinguistics research, G. Piéraut-Le Bonniec & M. Dolitsky (eds), 157-185. Amsterdam: John Benjamins. 246 Hickmann, M. 1996. Discourse organization and the development of reference to person, space, and time. In The handbook of child language, P. Fletcher & B. MacWhinney (eds), 194-218. Malden, Mass. et al.: Blackwell. Hickmann, M. 2003. Children’s discourse. Person, space and time across languages. Cambridge, UK: Cambridge University Press. Hickmann, M. 2004. Coherence, cohesion, and context: Some comparative perspectives in narrative development. In Relating events in narrative, Volume 2. Typological and contextual perspectives, S. Strömqvist & L. Verhoeven (eds), 281-306. Mahwah, NJ & London: Lawrence Erlbaum Associates. Hicks, D. 1991. 
Kinds of narrative: Genre skills among first graders from two communities. In Developing narrative structure, A. McCabe & C. Peterson (eds), 55-87. Hillsdale, NJ: Lawrence Erlbaum Associates. Hoey, M. 1996. Patterns of lexis in text. Oxford, UK et al.: Oxford University Press. Hoey, M. 2005. Lexical priming: A new theory of words and language. London: Routledge. Huddleston, R. & Pullum, G.K. 2002. The Cambridge grammar of the English language. Cambridge, UK et al.: Cambridge University Press. Hudson, J.A. & Shapiro, L.R. 1991. From knowing to telling: The development of children’s scripts, stories, and personal narratives. In Developing narrative structure, A. McCabe & C. Peterson (eds), 89-136. Hillsdale, NJ: Lawrence Erlbaum Associates. Hüttner, J. & Rieder-Bünemann, A. 2007. The effect of CLIL instruction on children’s narrative competence. VIEWS 16 (3), (CLIL Special Issue 2): 20-27. Hunt, K.W. 1965. Grammatical structures written at three grade levels. Champaign, IL: National Council of Teachers of English. Hyland, K. & Paltridge, B. (eds), 2011. The Continuum companion to discourse. London & New York: Continuum. Jaworski, A. & Coupland, N. (eds), 2006. The discourse reader. 2 nd edition. London et al.: Routledge. Joanette, Y. & Brownell, H.H. (eds). 1990. Discourse ability and brain damage. Theoretical and empirical perspectives. New York et al.: Springer. Johnson, R.K. & Swain, M. (eds). 1997. Immersion Education. International perspectives. Cambridge, Mass.: Cambridge University Press. Johnson-Laird, P.N. 1991. Mental models. In Foundations of cognitive science, M.I. Posner (ed), 469-499. Cambridge, MA: MIT Press. Johnstone, B. 2002. Discourse Analysis. Malden, MA & Oxford, UK: Blackwell. Johnstone, B. 2008. Discourse analysis. 2 nd edition. Malden, MA et al.: Blackwell. Jose, P.E. & Brewer, F.W. 1990. Early grade school children's liking of script and suspense story structures. Journal of Reading Behavior 22 (4): 355-372. Kähler, W. 2010. 
Statistische Datenanalyse. Verfahren verstehen und mit SPSS gekonnt einsetzen. 6th edition. Wiesbaden: Vieweg + Teubner. Kamalski, J., Sanders, T. & Lentz, L. 2008. Coherence marking, prior knowledge, and comprehension of informative persuasive texts: Sorting things out. Discourse Processes 45: 323- 345. 247 Kang, J.Y. 2004. Telling a coherent story in a foreign language: Analysis of Korean EFL learners’ referential strategies in oral discourse. Journal of Pragmatics 36: 1976-1990. Kang, J.Y. 2005. Written narratives as an index of L2 competence in Korean EFL learners. Journal of Second Language Writing 14 (4): 259-279. Kang, J.Y., Kim, Y.-S. & Pan, B.A. 2009. Five-year-olds' book talk and story retelling: Contributions of mother-child joint bookreading. First Language 29 (3): 243-265. Karmiloff-Smith, A. 1985. Language and cognitive processes from a developmental perspective. Language and Cognitive Processes 1: 61-85. Kayser, H. 1989. Some aspects of language understanding, language production, and intercomprehension in verbal interaction. In Connexity and coherence. Analysis of text and discourse, W. Heydrich, F. Neubauer, J.S. Petöfi & E. Sözer (eds), 342-365. Berlin & New York: de Gruyter. Kellogg, R.T. 1995. Cognitive psychology. Thousand Oaks, CA et al.: Sage. Kersten, K. 2009. Verbal inflections in L2 child narratives. A study of lexical aspect and grounding. Trier: Wissenschaftlicher Verlag Trier. Kersten, K. (ed). 2010. ELIAS - Early language and intercultural acquisition studies. Final report, public part. Magdeburg: ELIAS, <http: / / www.elias.bilikita.org/ > (10 March 2011). Kersten, K., Imhoff, C. & Sauer, B. 2002. The Acquisition of English verbs in an elementary school immersion program in Germany. In An integrated view of language development. Papers in honor of Henning Wode, P. Burmeister, T. Piske & A. Rohde (eds), 473-497. Trier: Wissenschaftlicher Verlag Trier. Khodadady, E. & Herriman, M. 2000. 
Schemata theory and selected response item tests: From Theory to Practice. In Fairness and validation in language assessment, A.J. Kunnan (ed), 201-222. Cambridge, UK: Cambridge University Press. Kintsch, W. 1974. The representation of meaning in memory. New York et al.: John Wiley & Sons. Kintsch, W. 1982. Gedächtnis und Kognition. Translated by Angelika Albert. Berlin et al.: Springer. Kintsch, W. 1995. How readers construct situation models for stories: The role of syntactic cues and causal inferences. In Coherence in spontaneous text, M.A. Gernsbacher & T. Givón (eds), 139-160. Amsterdam: John Benjamins. Kintsch, W. 1998. Comprehension. A paradigm for cognition. Cambridge, UK et al.: Cambridge University Press. Kintsch, W. & van Dijk, T.A. 1978. Toward a model of text comprehension and production. Psychological Review 85 (5): 363-394. Knott, A. & Sanders, T. 1998. The classification of coherence relations and their linguistic markers: An exploration of two languages. Journal of Pragmatics 30 (2): 135-175. Koch, P. & Oesterreicher, W. 1994. Schriftlichkeit und Sprache. In Schrift und Schriftlichkeit. Writing and its use. Ein interdiszplinäres Handbuch internationaler Forschung. An interdisciplinary handbook of international research, H. Günther & O. Ludwig (eds), 587-604. Berlin & New York: de Gruyter. 248 Kreyß, J. 1995. Comprehension processes as a means for text generation. In Focus and coherence in discourse processing, G. Rickheit & C. Habel (eds), 143-169. Berlin & New York: de Gruyter. Krohn, G. 1996. Kohäsive Merkmale in mündlichen Texten bilingual deutsch-englisch unterrichteter Schüler im Rahmen der Unterrichtserprobung in Schleswig Holstein. Unpublished manuscript, Kiel University. Kultusministerkonferenz (KMK). 2011. Schulische Bildung in der Bundesrepublik Deutschland. <http: / / www.kmk.org/ bildung-schule/ allgemeine-bildung.html> (20 March 2011). Kupersmitt, J. & Berman, R.A. 2001. Linguistic features of Spanish-Hebrew children’s narratives. 
In Narrative development in a multilingual context, L. Verhoeven & S. Strömqvist (eds), 277-317. Amsterdam: John Benjamins. Labov, W. [1972] 1999. The transformation of experience in narrative. In The discourse reader, A. Jaworski & N. Coupland (eds), 221-235. London, UK & New York: Routledge. Labov, W. & Waletzky, J. 1967. Narrative analysis: Oral versions of personal experience. In Essays on the verbal and visual arts, J. Helm (ed), 12-44. Seattle: University of Washington Press. Lakshmanan, U. & Selinker, L. 2001. Analyzing interlanguage: How do we know what learners know? Second Language Research 17 (4): 393-420. Lanza, E. 2001. Temporality and language contact in narratives by children bilingual in Norwegian and English. In Narrative development in a multilingual context, L. Verhoeven & S. Strömqvist (eds), 15-50. Amsterdam: John Benjamins. Larsen-Freeman, D. 1991. Second Language Acquisition Research: Staking Out the Territory. TESOL QUARTERLY 28 (2): 315-350. Laurén, U. 1998. Narrative structures in the stories of immersion pupils in their second language. In Immersion programmes. A European perspective, J. Arnau & J. Artigal (eds), 522-531. Barcelona: Universitat de Barcelona. Lazaruk, W. 2007. Linguistic, academic, and cognitive benefits of French Immersion. The Canadian Modern Language Review/ La Revue canadienne des languages vivantes 63 (5): 605-628. Lenk, U. 1998. Discourse markers and global coherence in conversation. Journal of Pragmatics 30: 245-257. Levelt, W.J.M. 1999. Producing spoken language: A blueprint of the speaker. In The neurocognition of language, C.M. Brown & P. Hagoort (eds), 83-122. Oxford, UK et al.: Oxford University Press. Lipka, L. 2002. English Lexicology: Lexical structure, word semantics & wordformation. Tübingen: Narr. Löbner, S. 2002. Understanding semantics. London: Hodder Arnold. Louwerse, M.M. 2004. A concise model of cohesion in text and coherence in comprehension. 
Memphis: Department of Psychology, 1-33, <http: / / madresearchlab.org/ references.htm> (21 December 2005). English translation of: Un modelo conciso de cohesión en el texto y coherencia en la comprensión. Revista Signos 37: 41-58, 249 Lück, H.E. & Miller, R. 2005. Illustrierte Geschichte der Psychologie. Weinheim: Beltz. Mackey, A. & Gass, S. 2005. Second language research. Methodology and design. Mahwah, NJ: Lawrence Erlbaum Associates. Mandel, R.G. & Johnson, N.S. 1984. A developmental analysis of story recall and comprehension in adulthood. Journal of verbal learning and verbal behavior 23: 643-659. Mandler, J. M. 1982. Another story of grammars: Comments on de Beaugrande's "The story of grammars and the grammar of stories." Journal of Pragmatics 6: 433-440. Mandler, J.M. & Goodman, M.S. 1982. On the psychological validity of story structure. Journal of verbal learning and verbal behaviour 21: 507-523. Mandler, J.M. & Johnson, N.S. 1977. Remembrance of things parsed: Story structure and recall. Cognitive Psychology 9: 111-151. Manhardt, J. & Rescorla, L. 2002. Oral narrative skills of late talkers at ages 8 and 9. Applied Psycholinguistics 23: 1-21. Mann, W. C. and Thompson, S. A. 1988. Rhetorical Structure Theory: Toward a functional theory of text organization. Text 8 (3): 243-281. Manolitsi, M. & Botting, N. 2011. Language abilities in children with autism and language impairment: Using narrative as a additional source of clinical information. Child Language Teaching and Therapy 27 (1): 39-55. Martin, J.R. 2001. Cohesion and Texture. In The handbook of discourse analysis, D. Schiffrin, D. Tannen & H. Hamilton (eds), 35-53. Malden, Mass. et al.: Blackwell. Maschewski, C. 2002. Pilotstudien zur Entwicklung kohäsiver Mittel im immersiven Englischunterricht von der 1. zur 2. Klasse der Claus-Rixen-Schule (Jahrgang 1999/ 2000). Unpublished diploma thesis (Staatsexamensarbeit), Kiel University. Mayer, M. 1969. Frog, where are you? . New York: Puffin Pied Piper. 
McCabe, A. & Peterson, C. 1991a. Getting the story: A longitudinal study of parental styles in eliciting narratives and developing narrative skills. In Developing narrative structure, A. McCabe & C. Peterson (eds), 217-253. Hillsdale, NJ: Lawrence Erlbaum Associates. McCutchen, D. & Perfetti, C.A. 1982. Coherence and connectedness in the development of discourse production. Text 2 (1-3): 113-139. Michaels, S. & Collins, J. 1984. Oral discourse styles: Classroom interaction and the acquisition of literacy. In Coherence in spoken and written discourse, D. Tannen (ed), 219-244. Norwood, NJ: Ablex. Ministerium für Bildung, Wissenschaft, Forschung und Kultur des Landes Schleswig Holstein. 1997. Lehrplan Grundschule. <http: / / lehrplan.lernnetz.de/ intranet1/ links/ materials/ index.php? wahl=4> (1 August 2011). Minsky, M. 1975. A framework for representing knowledge. In The psychology of computer vision, P.H. Winston (ed), 211-277. New York: McGraw-Hill. Möller, C. 2003. Erwerb und Entwicklung kohäsiver Bindungen im immersiven Englischunterricht. Unpublished diploma thesis (Staatsexamensarbeit), Kiel University. 250 Möller, C. 2009. The history and future of bilingual education: Immersion teaching in Germany and its Canadian origins. In Translation of cultures, P. Rüdiger & K. Gross (eds), 235-254. Amsterdam & New York: Rodopi. Möller, C. 2009a. Messbarkeit und Entwicklung von Textlänge in mündlichen Lernerdaten am Beispiel der L2 Englisch. In Empirische Fremdsprachenforschung. Konzepte und Perspektiven, C. Lütge, A.I. Kollenrott, B. Ziegenmeyer & G. Fellmann (eds), 63-74. Frankfurt a.M.: Peter Lang. Möller, C. 2010. Formulaic language in storytelling. Paper presented at the 4th International Formulaic Language Acquisition Research Network (FLaRN) Conference, Paderborn. Möller, C. 2012. Wie junge Fremdsprachenlerner temporale Räume schaffen: Eine Annäherung. In RaumTexte - TextRäume. Sprachwissenschaftliche Studien zur Verortung im Diskurs, C. Schubert & T. 
Schöberl (eds). Berlin: Frank & Timme. Möller, C. 2013. Zur Geschichte und Zukunft des bilingualen Unterrichts. In Mehrsprachigkeit in bilingualen Kindertagesstätten und Schulen. Voraussetzungen— Methoden—Erfolge, A.K. Steinlen & A. Rohde (eds), 14-30. Berlin: Dohrmann. Montanari, S. 2004. The development of narrative competence in the L1 and L2 of Spanish-English bilingual children. International Journal of Bilingualism 8 (4): 449-497. Mukherjee, V. 1999. Schriftlichkeit im bilingualen Unterricht: Kohäsive Merkmale in schriftlichen L2-Daten bilingual deutsch-englisch unterrichteter Schüler der 7. Jahrgangsstufe. Kiel: l&f Verlag. Myles, F. 2003. The early development of L2 narratives: A longitudinal study. Marges Linguistiques 5: 40-55. Nelson, K. (ed). 1986. Event knowledge. Structure and function in development. Hillsdale, NJ: Lawrence Erlbaum Associates. Nelson, K. 1986a. Event knowledge and cognitive development. In Event knowledge. Structure and function in development, K. Nelson (ed), 1-19 & 231-263. Hillsdale, NJ: Lawrence Erlbaum Associates. Nelson, K. & Gruendel, J.-M. 1986. Children’s scripts. In Event knowledge. Structure and function in development, K. Nelson (ed), 21-46. Hillsdale, NJ: Lawrence Erlbaum Associates. Neuner, J.L. 1987. Cohesive ties and chains in good and poor freshman essays. Research in the Teaching of English 21 (1): 92-105. Nistov, I. 2001 Reference continuation in L2 narratives of Turkish adolescnts in Norway. In Narrative development in a multilingual context, L. Verhoeven & S. Strömqvist (eds), 51-81. Amsterdam: John Benjamins. Norbury, C.F. & Bishop, D.V. 2003. Narrative skills of children with communication impairments. International Journal of Language & Communication Disorders 38 (3): 287-313. Norment, N. 1995. Quantitative analysis of cohesive devices in Spanish and Spanish ESL in narrative and expository written texts. Language Quarterly 33 (3-4): 135- 159. 251 Norris, J.A. & Bruning, R.H., 1988. 
Cohesion in the narratives of good and poor readers. Journal of Speech and Hearing Research 53, 416-424. O’Halloran, K.L. 2011. Multimodal discourse analysis. In The Continuum companion to discourse, K. Hyland and B. Paltridge (eds), 120-137. London & New York: Continuum. O’Neill, D.K. & Holmes, A. 2002. Young preschoolers’ ability to reference story characters: The contribution of gesture and character speech. First Language 22: 73- 103. O’Neill, D.K., Pearce, M.J. & Pick, J.L. 2004. Preschool children’s narratives and performance on the Peabody Individualized Achievement Test-Revised: Evidence of a relation between early narrative and later mathematical ability. First Language 24: 149-183. Pan, B.A. & Snow, C.E. 1999. The development of conversational and discourse skills. In The development of language, M. Barrett (ed), 229-249. Hove, UK: Psychology Press. Park, E.S. 2004. The comparative fallacy in UG studies. In Commentaries on the Comparative Fallacy in Second Language Research, J. Purdy (ed). Teachers College, Columbia University Working Papers in TESOL & Applied Linguistics 4 (1), <http: / / journals.tc-library.org/ index.php/ tesol/ issue/ view/ 7> (5 February 2010). Pearson, P.D. 1982. A primer for schema theory. The Volta Review 84: 25-33. Perfetti, C.A. 1999. Comprehending written language: A blueprint of the reader. In The neurocognition of language, C.M. Brown & P. Hagoort (eds), 167-208. Oxford, UK et al.: Oxford University Press. Petersen, D.B., Gillam, S.L. & Gillam, R.B. 2008. Emerging procedures in narrative assessment: The index of narrative complexity. Topics in Language Disorders 28 (2), 115-130. Peterson, C. & Dodsworth, P. 1991. A longitudinal analysis of young children’s cohesion and noun specification in narratives. Journal of Child Language 18: 397-415. Peterson, C. & McCabe, A. 1983. Developmental psycholinguistics. Three ways of looking at a child’s narrative. New York & London: Plenum. Peterson, C. & McCabe, A. 1991. 
Linking children’s connective use and narrative macrostructure. In Developing narrative structure, A. McCabe & C. Peterson (eds), 29-53. Hillsdale, NJ: Lawrence Erlbaum Associates. Porte, G. K. 2002. Appraising research in second language learning. A practical approach to critical analysis of quantitative research. Amsterdam: John Benjamins. Pradl, G.M. 1979. Learning how to begin and end a story. Language Arts 56 (1): 21-25. Quasthoff, U.M. 1980. Erzählen in Gesprächen. Linguistische Untersuchungen zu Strukturen und Funktionen am Beispiel einer Kommunikationsform des Alltags. Tübingen: Narr. Quasthoff, U.M. 1997. An interactive approach to narrative development. In Narrative development: Six approaches, M. Bamberg (ed), 51-83. Mahwah, NJ & London: Lawrence Erlbaum Associates. Quasthoff, U.M. & Becker, T. (eds). 2004. Narrative interaction. Amsterdam: John Benjamins. 252 Quirk, R., Greenbaum, S., Leech, G. & Svartvik, J. 1985. A comprehensive grammar of the English language. London et al.: Longman. Reese, E., Suggate, S., Long, J. & Schaughency, E., 2010. Children’s oral narrative and reading skills in the first 3 year of reading instruction. Reading and Writing 23: 627-644. Reid, J. 1992. A computer text analysis of four cohesion devices in English discourse by native and nonnative writers. Journal of Second Language Writing 1 (2): 79-107. Reilly J., Bates, E. & Marchman, V. 1998. Narrative discourse in children with early focal brain injury. Brain and Language 61: 335-375. Reilly, J., Losh, M., Bellugi, U. & Wulfeck, B. 2004. “Frog, where are you? ” Narratives in children with specific language impairment, early focal brain injury, and Williams syndrome. Brain and Language 88 (2): 229-247. Renkema, J. 2004. Introduction to discourse studies. Amsterdam: John Benjamins. Richardson, B. 2000. Recent concepts of narrative and the narratives of narrative theory. Style 34 (2): 168-175. Rickheit, G., Sichelschmidt, L. & Strohner, H. 1995. 
Economical principles in coherence management: A cognitive systems approach. In Focus and coherence in discourse processing, G. Rickheit & C. Habel (eds), 170-189. Berlin & New York: Walter de Gruyter. Rieger, B. 1987. Wissensrepräsentation und empirische Semantik: Eine computerlinguistische Aufgabe. In Theorie und Empirie, G. Pasternack (ed), 99-149. Bremen: Bremen University Press. Rietveld, T. & van Hout, R. 1993. Statistical techniques for the study of language and language behaviour. Berlin & New York: Mouton de Gruyter. Rohde, A. 2005. Lexikalische Prinzipien im Erst- und Zweitspracherwerb. Trier: Wissenschaftlicher Verlag Trier. Rohde, A. & Tiefenthal, C. 2002. On lexical learning abilities. In An Integrated View of Language Development. Papers in Honor of Henning Wode, P. Burmeister, T. Piske & A. Rohde (eds), 449-471. Trier: Wissenschaftlicher Verlag Trier. Rumelhart, D.E. 1975. Notes on a schema for stories. In Representation and Understanding. Studies in Cognitive Science, D.C. Bobrow & A. Collins (eds), 211-236. New York et al.: Academic Press. Rumelhart, D.E. 1980. Schemata: The building blocks of cognition. In Theoretical issues in reading and comprehension, R.J. Spiro, B. Bruce & W.F. Brewer (eds), 33- 58. Hillsdale, NJ: Lawrence Erlbaum Associates. Sanders, T. & Spooren, W. 2009. The cognition of discourse coherence. In Discourse, of Course: An overview of research in discourse studies, J. Renkema (ed), 197-212. Amsterdam and Philadelphia, PA: John Benjamins. Sanders, T., Spooren, W. & Noordman, L. 1992. Toward a taxonomy of coherence relations. Discourse Processes 15: 1-35 Sanford, A.J. & Moxey, L.M. 1995. Aspects of coherence in written language: A psychological perspective. In Coherence in spontaneous text, M.A. Gernsbacher & T. Givón (eds), 161-187. Amsterdam: John Benjamins. 253 Schank, R.C. 1975. The structure of episodes in memory. In Representation and understanding. Studies in cognitive science, D.G. Bobrow & A. Collins (eds), 237-272. 
New York: Academic Press. Schank, R.C. & Abelson, R. 1977. Scripts, plans, goals, and understanding. An inquiry into human knowledge structures. Hillsdale, NJ: Lawrence Erlbaum Associates. Schermer, F.J. 2006. Lernen und Gedächtnis. 4 th edition. Stuttgart: Kohlhammer. Schmitt, N. 1996. Uses and abuses of coefficient alpha. Psychological Assessment 8 (4): 350-353. Schneider, P., Dubé, R. V., & Hayward, D. 2005. The Edmonton Narrative Norms Instrument. Retrieved [September 12, 2012] from University of Alberta Faculty of Rehabilitation Medicine website: http: / / www.rehabmed.ualberta.ca/ spa/ enni. Schneider, P., Hayward, D. & Dubé, R.V. 2006. Storytelling from pictures using the Edmonton Narrative Norms Instrument. Journal of Speech Pathology and Audiology 30 (4): 224-238. Schneider, W. & Büttner, G. 2002. Entwicklung des Gedächtnisses bei Kindern und Jugendlichen. In Entwicklungspsychologie, R. Oerter & L. Montada (eds), 495-516. 5 th edition. Weinheim et al.: Beltz. Schriever, K. 1997. Analyse kohäsiver Elemente zum Nachweis kommunikativer Kompetenz von Schülern der 7. Jahrgangsstufe des deutsch-englisch bilingualen Unterrichts in Schleswig-Holstein. Unpublished manuscript, Kiel University. Schubert, C. 2008. Englische Textlinguistik. Eine Einführung. 1 st edition. Berlin: Erich Schmidt. Schubert, C. 2012. Englische Textlinguistik: eine Einführung. 2 nd edition. Berlin: Erich Schmidt. Scollon, R. & Scollon, S.B.K. 1984. Cooking it up and boiling it down: Abstracts in Athabaskan children’s story retellings. In Coherence in spoken and written discourse, D. Tannen (ed), 173-197. Norwood, NJ: Ablex. Segal, E.M. & Duchan, J.F. 1997. Interclausal connectives as indicators of structuring in narratives. In Processing interclausal relationships. Studies in the production and comprehension of text, J. Costermans & M. Fayol (eds), 95-119. Mahwah, NJ: Lawrence Erlbaum Associates. Seidman, S., Nelson, K. & Gruendel, J. 1986. 
Make-believe scripts: The transformation of event representations in fantasy. In Event knowledge. Structure and function in development, K. Nelson (ed), 161-187. Hillsdale, NJ: Lawrence Erlbaum Associates. Severing, R. & Verhoeven, L. 2001. Bilingual narrative development in Papiamento and Dutch. In Narrative development in a multilingual context, L. Verhoeven & S. Strömqvist (eds), 255-275. Amsterdam: John Benjamins. Shapiro, L.R. & Hudson, J.A. 1991. Tell me a make-believe story: Coherence and cohesion in young children’s picture-elicited narratives. Developmental Psychology 27: 960- 974. Shapiro, L.R. & Hudson, J.A. 1997. Coherence and cohesion in children’s stories. In Processing Interclausal Relationships. Studies in the production and comprehension of text, J. Costermans & M. Fayol (eds), 23-48. Mahwah, NJ: Lawrence Erlbaum Associates. 254 Singer, M. 1990. Psychology of language. An introduction to sentence and discourse processes. Hillsdale NJ: Lawrence Erlbaum Associates. Smit, U. 2008. The AILA Research Network - CLIL and immersion classrooms: Applied linguistics perspectives. Language Teaching 41 (2): 295-298. Smith, E.E. 1991. Concepts and induction. In Foundations of cognitive science, M.I. Posner (ed), 501-526. Cambridge, MA: MIT Press. Snyder, L.S. & Downey, D.M. 1991. The language-reading relationship in normal and reading-disabled children. Journal of Speech and Hearing Research 34: 129-140. Sodian, B. 2002. Entwicklung begrifflichen Wissens. In Entwicklungspsychologie, R. Oerter & L. Montada (eds), 443-468. 5 th edition. Weinheim et al.: Beltz. Solso, R.L. 2005. Kognitive Psychologie. Translated by Matthias Reiss. Berlin et al.: Springer. Sperry, L.L. & Sperry, D.E. 1996. Early development of narrative skills. Cognitive Development 11: 443-465. Spiegel Online. 2008. Merkel bedauert CDU-Beschluss zu deutscher Sprache. Spiegel Online, 2 December 2008, <http: / / www.spiegel.de/ politik/ deutschland/ 0,1518,594056,00.html> (3 December 2008). 
Spiegel Online. 2008a. Schwan wirft CDU Anti-Einwanderer-Wahlkampf vor. Spiegel Online, 3 December 2008, <http: / / www.spiegel.de/ politik/ deutschland/ 0,1518,594100,00.html> (3 December 2008). Spiegel, D.L. & Fitzgerald, J. 1990. Textual cohesion and coherence in children’s writing revisited. Research in the Teaching of English 24 (1): 48-66. Spooren, W. & Sanders, T. 2008. The acquisition order of coherence relations: On cognitive complexity in discourse. Journal of Pragmatics 40: 2003-2026. Stein, N.L. & Albro, E.R. 1997. Building complexity and coherence: Children’s use of goal-structured knowledge in telling stories. In Narrative development. Six approaches, M. Bamberg (ed), 5-44. Mahwah, NJ: Lawrence Erlbaum Associates. Stein, N.L. & Glenn, C.G. 1979. An analysis of story comprehension in elementary children. In New directions in discourse processing, R.O. Freedle (ed), 53-120. Norwood, NJ: Ablex. Stein, N.L. & Policastro, M. 1984. The concept of a story: A comparison between children’s and teacher’s viewpoints. In Learning and comprehension of text, H. Mandl, N.L. Stein & T. Trabasso (eds), 113-155. Hillsdale, NJ: Lawrence Erlbaum Associates. Stein, N.L. 1982. The definition of a story. Journal of Pragmatics 6: 487-507. Stenning, K. & Mitchell, L. 1985. Learning how to tell a good story: The development of content and language in children’s telling of one tale. Discourse Processes 8: 261- 279. Strömqvist, S. & Day, D. 1993. On the development of narrative structure in child L1 and adult L2 acquisition. Applied Psycholinguistics 14: 135-158. Strömqvist, S. & Verhoeven, L. 2004 Relating events in narrative style. Mahwah, NJ: Erlbaum. Strong, C.J. 1998. The Strong Narrative Assessment Procedure. Eau Claire, WI: Thinking Publications. 255 Swain, M. & Lapkin, S. 2006. Multilingualism through immersion education? In Mehrsprachige Individuen vielsprachige Gesellschaften, D. Wolff (ed.), 31-45. Frankfurt a.M.: Peter Lang. Taboada, M. 2009. 
Implicit and explicit coherence relations. In Discourse, of Course, J. Renkema (ed.), 127-140. Amsterdam and Philadelphia, PA: John Benjamins. Taboada, M. 2006. Discourse markers as signals (or not) of rhetorical relations. Journal of Pragmatics 38 (4): 567-592. Taboada, M. & Mann, W.C. 2006a. Rhetorical Structure Theory: Looking back and moving ahead. Discourse Studies 8 (3): 423-459. Taboada, M. & Mann, W.C. 2006b. Applications of Rhetorical Structure Theory. Discourse Studies 8 (4): 567-588. Tannen, D. (ed). 1984. Coherence in spoken and written discourse. Norwood NJ: Ablex. Thorndyke, P.W. 1977. Cognitive structures in comprehension and memory of narrative discourse. Cognitive Psychology 9: 77-110. Tiefenthal, C. 2009. Fast mapping im natürlichen L2-Erwerb. Trier: Wissenschaftlicher Verlag Trier. Toolan, M.J. 1988. Narrative. A critical linguistic introduction. London & New York: Routledge. Trabasso, T. & Rodkin, P.C. 1994. Knowledge of goal/ plans: A conceptual basis for narrating ‘Frog, where are you? ’. In Relating events in narrative. A crosslinguistic developmental study, R.A. Berman & D.I. Slobin (eds), 85-106. Hillsdale, NJ: Lawrence Erlbaum Associates. Trabasso, T., Suh, S. & Payton, P. 1995. Explanatory coherence in understanding and talking about events. In Coherence in spontaneous text, M.A. Gernsbacher & T. Givón (eds), 189-214. Amsterdam: John Benjamins. Trappes-Lomax, H. 2006. Discourse analysis. In The handbook of applied linguistics, A. Davies & C. Elder (eds), 133-164. Malden, MA et al.: Blackwell. Tucker, R.G. & Dubiner, D. 2008. Concluding thoughts: Does the immersion pathway lead to multilingualism? In Pathways to multilingualism. Evolving perspectives on immersion education, T. Williams Fortune & D.J. Tedick (eds), 267-277. Clevedon, UK et al.: Multilingual Matters. Ungerer, F. & Schmid, H.-J. 2006. An introduction to cognitive linguistics. 2 nd edition. Harlow, UK et al.: Pearson Education. van de Velde, R. 1989. 
11 Appendix

11.1 Individual cohesion scores ordered by frequency

Tab. 11.1 First graders’ cohesive density scores in ascending order

Grade 1: Scores by frequency

Overall cohesive density  Reference density  Connective density  Ellipsis density  Lexical density
2.17                      0.57               0                   0                 1.22
2.62                      0.74               0                   0                 1.29
2.66                      0.76               0.06                0                 1.33
2.85                      0.85               0.10                0                 1.41
2.89                      0.95               0.12                0                 1.47
2.92                      0.96               0.16                0                 1.49
2.94                      1.00               0.19                0                 1.52
3.05                      1.02               0.20                0                 1.53
3.26                      1.06               0.21                0.03              1.55
3.26                      1.08               0.22                0.04              1.58
3.27                      1.10               0.26                0.05              1.58
3.28                      1.10               0.31                0.05              1.64
3.40                      1.11               0.35                0.05              1.70
3.45                      1.12               0.47                0.06              1.73
3.46                      1.15               0.48                0.06              1.76
3.46                      1.16               0.48                0.07              1.76
3.47                      1.17               0.49                0.08              1.80
3.48                      1.17               0.49                0.08              1.85
3.48                      1.18               0.50                0.10              1.88
3.67                      1.21               0.55                0.11              1.90
3.73                      1.23               0.61                0.11              1.92
3.74                      1.24               0.65                0.12              1.95
3.76                      1.24               0.65                0.13              1.96
3.77                      1.29               0.68                0.15              1.98
3.87                      1.29               0.74                0.15              2.00
3.94                      1.30               0.74                0.17              2.03
4.03                      1.33               0.75                0.17              2.07
4.15                      1.34               0.75                0.17              2.18
4.17                      1.48               0.81                0.21              2.20
4.42                      1.56               0.87                0.21              2.28
4.44                      1.65               0.97                0.38              2.52

Total 31 participants

Tab.
11.2 Fourth graders’ cohesive density scores in ascending order

Grade 4: Scores by frequency

Total cohesive density    Reference density  Connective density  Ellipsis density  Lexical density
3.43                      0.76               0.46                0.03              1.58
3.49                      0.96               0.47                0.05              1.68
3.58                      1.04               0.48                0.08              1.69
3.59                      1.07               0.49                0.08              1.69
3.63                      1.07               0.49                0.09              1.73
3.64                      1.10               0.50                0.10              1.76
3.67                      1.10               0.51                0.10              1.80
3.69                      1.13               0.53                0.12              1.80
3.72                      1.13               0.53                0.13              1.81
3.74                      1.15               0.56                0.14              1.83
3.75                      1.16               0.57                0.14              1.87
3.76                      1.18               0.58                0.14              1.88
3.78                      1.20               0.58                0.15              1.88
3.88                      1.21               0.59                0.15              1.89
3.88                      1.22               0.64                0.16              1.89
3.90                      1.23               0.64                0.17              1.93
3.91                      1.24               0.65                0.17              1.94
3.95                      1.24               0.66                0.17              1.94
3.97                      1.26               0.67                0.17              2.00
3.98                      1.29               0.70                0.18              2.00
3.98                      1.31               0.72                0.18              2.00
4.02                      1.32               0.73                0.19              2.02
4.13                      1.32               0.76                0.19              2.02
4.21                      1.36               0.76                0.19              2.02
4.34                      1.36               0.79                0.20              2.21
4.44                      1.38               0.79                0.20              2.22
4.64                      1.49               0.86                0.21              2.25
4.65                      1.62               0.91                0.23              2.31

Total 28 participants

11.2 Individual cohesion scores ordered by frequency and sex

Tab. 11.3 Male first graders’ cohesive density scores in ascending order

Grade 1: Scores by frequency (male participants)

Total cohesive density    Reference density  Connective density  Ellipsis density  Lexical density
2.85                      0.85               0.10                0                 1.41
3.26                      1.00               0.16                0                 1.52
3.46                      1.15               0.21                0                 1.73
3.46                      1.17               0.49                0                 1.76
3.48                      1.18               0.50                0.06              1.90
3.67                      1.21               0.61                0.07              1.95
3.74                      1.24               0.68                0.11              1.96
3.94                      1.29               0.74                0.15              2.00
4.15                      1.48               0.74                0.15              2.03
4.44                      1.56               0.87                0.17              2.28

Total 10 participants

Tab. 11.4 Female first graders’ cohesive density scores in ascending order

Grade 1: Scores by frequency (female participants)

Total cohesive density    Reference density  Connective density  Ellipsis density  Lexical density
2.17                      0.57               0                   0                 1.22
2.62                      0.74               0                   0                 1.29
2.66                      0.76               0.06                0                 1.33
2.89                      0.95               0.12                0                 1.47
2.92                      0.96               0.19                0.03              1.49
2.94                      1.02               0.20                0.04              1.53
3.05                      1.06               0.22                0.05              1.55
3.26                      1.08               0.26                0.05              1.58
3.27                      1.10               0.31                0.05              1.58
3.28                      1.10               0.35                0.06              1.64
3.40                      1.11               0.47                0.08              1.70
3.45                      1.12               0.48                0.08              1.76
3.47                      1.16               0.48                0.10              1.80
3.48                      1.17               0.49                0.11              1.85
3.73                      1.23               0.55                0.12              1.88
3.76                      1.24               0.65                0.13              1.92
3.77                      1.29               0.65                0.17              1.98
3.87                      1.30               0.75                0.17              2.07
4.03                      1.33               0.75                0.21              2.18
4.17                      1.34               0.81                0.21              2.20
4.42                      1.65               0.97                0.38              2.52

Total 21 participants

Tab. 11.5 Male fourth graders’ cohesive density scores in ascending order

Grade 4: Scores by frequency (male participants)

Total cohesive density    Reference density  Connective density  Ellipsis density  Lexical density
3.64                      1.04               0.49                0.05              1.76
3.72                      1.07               0.51                0.14              1.80
3.76                      1.10               0.64                0.17              1.83
3.78                      1.13               0.67                0.18              1.87
3.97                      1.29               0.70                0.18              1.89
4.21                      1.32               0.76                0.19              2.00
4.34                      1.36               0.79                0.21              2.25

Total 7 participants

Tab. 11.6 Female fourth graders’ cohesive density scores in ascending order

Grade 4: Scores by frequency (female participants)

Total cohesive density    Reference density  Connective density  Ellipsis density  Lexical density
3.43                      0.76               0.46                0.03              1.58
3.49                      0.96               0.47                0.08              1.68
3.58                      1.07               0.48                0.08              1.69
3.59                      1.10               0.49                0.09              1.69
3.63                      1.13               0.50                0.10              1.73
3.67                      1.15               0.53                0.10              1.80
3.69                      1.16               0.53                0.12              1.81
3.74                      1.18               0.56                0.13              1.88
3.75                      1.20               0.57                0.14              1.88
3.88                      1.21               0.58                0.14              1.89
3.88                      1.22               0.58                0.15              1.93
3.90                      1.23               0.59                0.15              1.94
3.91                      1.24               0.64                0.16              1.94
3.95                      1.24               0.65                0.17              2.00
3.98                      1.26               0.66                0.17              2.00
3.98                      1.31               0.72                0.17              2.02
4.02                      1.32               0.73                0.19              2.02
4.13                      1.36               0.76                0.19              2.02
4.44                      1.38               0.79                0.20              2.21
4.64                      1.49               0.86                0.20              2.22
4.65                      1.62               0.91                0.23              2.31

Total 21 participants

11.3 Individual cohesion scores by frequency and experience group

Tab. 11.7 Bili first graders’ cohesive density scores in ascending order

Grade 1: Scores by frequency (bili group)

Overall cohesive density  Reference density  Connective density  Ellipsis density  Lexical density
2.66                      0.85               0.06                0                 1.41
2.85                      0.96               0.16                0                 1.49
3.26                      1.00               0.21                0                 1.52
3.27                      1.02               0.31                0.04              1.55
3.28                      1.06               0.35                0.05              1.58
3.40                      1.08               0.47                0.05              1.64
3.45                      1.10               0.49                0.06              1.70
3.46                      1.10               0.49                0.06              1.73
3.46                      1.12               0.50                0.07              1.76
3.67                      1.15               0.55                0.08              1.88
3.73                      1.17               0.61                0.10              1.92
3.74                      1.17               0.65                0.11              1.95
3.76                      1.18               0.65                0.12              1.96
3.77                      1.21               0.68                0.13              1.98
3.87                      1.23               0.74                0.15              2.00
3.94                      1.24               0.74                0.15              2.03
4.03                      1.29               0.75                0.17              2.07
4.15                      1.34               0.75                0.17              2.20
4.44                      1.56               0.87                0.21              2.28

Total 19 participants

Tab. 11.8 Mono first graders’ cohesive density scores in ascending order

Grade 1: Scores by frequency (mono group)

Overall cohesive density  Reference density  Connective density  Ellipsis density  Lexical density
2.17                      0.57               0                   0                 1.22
2.62                      0.74               0                   0                 1.29
2.89                      0.76               0.10                0                 1.33
2.92                      0.95               0.12                0                 1.47
2.94                      1.11               0.19                0                 1.53
3.05                      1.16               0.20                0.03              1.58
3.26                      1.24               0.22                0.05              1.76
3.47                      1.29               0.26                0.08              1.80
3.48                      1.30               0.48                0.11              1.85
3.48                      1.33               0.48                0.17              1.90
4.17                      1.48               0.81                0.21              2.18
4.42                      1.65               0.97                0.38              2.52

Total 12 participants

Tab. 11.9 Bili fourth graders’ cohesive density scores in ascending order

Grade 4: Scores by frequency (bili group)

Overall cohesive density  Reference density  Connective density  Ellipsis density  Lexical density
3.43                      0.76               0.47                0.03              1.58
3.49                      1.04               0.48                0.08              1.69
3.67                      1.07               0.49                0.09              1.80
3.69                      1.10               0.50                0.10              1.87
3.72                      1.13               0.53                0.13              1.88
3.74                      1.13               0.56                0.14              1.89
3.75                      1.15               0.59                0.14              1.89
3.88                      1.18               0.64                0.15              1.93
3.90                      1.20               0.64                0.17              1.94
3.97                      1.23               0.65                0.17              2.00
3.98                      1.31               0.66                0.18              2.00
4.02                      1.32               0.72                0.18              2.02
4.21                      1.32               0.73                0.19              2.02
4.34                      1.36               0.76                0.19              2.21
4.44                      1.36               0.79                0.21              2.22
4.64                      1.49               0.91                0.23              2.25

Total 16 participants

Tab. 11.10 Mono fourth graders’ cohesive density scores in ascending order

Grade 4: Scores by frequency (mono group)

Overall cohesive density  Reference density  Connective density  Ellipsis density  Lexical density
3.58                      0.96               0.46                0.05              1.68
3.59                      1.07               0.49                0.08              1.69
3.63                      1.10               0.51                0.10              1.73
3.64                      1.16               0.53                0.12              1.76
3.76                      1.21               0.57                0.14              1.80
3.78                      1.22               0.58                0.15              1.81
3.88                      1.24               0.58                0.16              1.83
3.91                      1.24               0.67                0.17              1.88
3.95                      1.26               0.70                0.17              1.94
3.98                      1.29               0.76                0.19              2.00
4.13                      1.38               0.79                0.20              2.02
4.65                      1.62               0.86                0.20              2.31

Total 12 participants

11.4 Individual coherence and cohesion scores

11.4.1 Coherence scores: Overall measures

Tab.
11.11 Overall coherence scores by participant

Participant †  Total number of narrative components  Narrative index score
G1-C1-1        6                                     5
G1-C1-2        6                                     7
G1-C1-3        1                                     2
G1-C1-4        5                                     7
G1-C1-6        8                                     9
G1-C1-7        2                                     4
G1-C1-8        3                                     6
G1-C1-10       6                                     9
G1-C1-13       2                                     4
G1-C1-14       7                                     7
G1-C1-15       2                                     4
G1-C1-16       10                                    9
G1-C1-17       6                                     7
G1-C8-1        6                                     7
G1-C8-2        5                                     7
G1-C8-3        10                                    9
G1-C8-4        3                                     6
G1-C8-5        7                                     9
G1-C8-6        2                                     4
G1-C8-7        1                                     2
G1-C8-8        1                                     2
G1-C8-9        1                                     2
G1-C8-10       1                                     2
G1-C8-11       3                                     4
G1-C8-13       9                                     7
G1-C8-14       7                                     9
G1-C8-15       3                                     6
G1-C8-16       1                                     2
G1-C8-18       2                                     4
G1-C8-20       8                                     9
G1-C8-21       10                                    9
G4-C1-1        9                                     9
G4-C1-2        8                                     9
G4-C1-3        7                                     9
G4-C1-4        10                                    9
G4-C1-6        11                                    9
G4-C1-7        11                                    9
G4-C1-8        10                                    9
G4-C1-10       9                                     9
G4-C1-13       12                                    9
G4-C1-14       9                                     9
G4-C1-15       8                                     7
G4-C1-16       12                                    9
G4-C1-17       10                                    9
G4-C5-3        12                                    9
G4-C5-6        11                                    9
G4-C5-7        9                                     9
G4-C5-8        10                                    9
G4-C5-9        12                                    9
G4-C5-10       10                                    9
G4-C5-11       10                                    9
G4-C5-12       10                                    9
G4-C5-14       11                                    9
G4-C5-15       12                                    9
G4-C5-17       7                                     9
G4-C5-18       10                                    9
G4-C5-19       10                                    9
G4-C5-20       6                                     8
G4-C5-21       12                                    9

† Participants’ individual results are ordered, first and foremost, by grade. References to individual participants are therefore coded as follows: Grade-cohort-participant number. ‘G1-C1-1’, for example, refers to child 1 from cohort 1 in first grade, ‘G4-C1-1’ to the same child in fourth grade (longitudinal data set).

11.4.2 Coherence scores: Individual components

Tab.
11.12 Individual narrative components by participant

Column key: SET = SETTING; IE = INITIATING EVENT; SR = SIMPLE REACTION; G1, G2 = GOAL 1, GOAL 2; A1 to A7 = ATTEMPT 1 to ATTEMPT 7; CON = CONSEQUENCE; END = ENDING.

Participant  SET  IE  SR  G1  G2  A1  A2  A3  A4  A5  A6  A7  CON  END
G1-C1-1      0†   1   1   0   1   1   0   1   1   0   0   0   0    0
G1-C1-2      1    1   0   0   1   1   0   1   1   0   0   0   0    0
G1-C1-3      0    0   0   0   0   0   0   0   0   0   0   0   1    0
G1-C1-4      1    1   0   0   1   0   1   1   0   0   0   0   0    0
G1-C1-6      1    1   1   0   1   1   1   1   0   0   0   0   1    0
G1-C1-7      1    1   0   0   0   0   0   0   0   0   0   0   0    0
G1-C1-8      1    1   0   0   0   1   0   0   0   0   0   0   0    0
G1-C1-10     1    1   0   0   1   0   1   0   0   1   0   0   1    0
G1-C1-13     1    1   0   0   0   0   0   0   0   0   0   0   0    0
G1-C1-14     1    1   0   1   1   0   0   0   1   1   0   0   0    1
G1-C1-15     0    1   0   0   0   1   0   0   0   0   0   0   0    0
G1-C1-16     1    1   0   1   1   0   1   1   1   0   1   0   1    1
G1-C1-17     0    1   0   1   1   0   1   1   0   0   0   0   0    1
G1-C8-1      1    1   0   0   1   1   1   1   0   0   0   0   0    0
G1-C8-2      1    1   0   0   1   1   0   1   0   0   0   0   0    0
G1-C8-3      1    1   0   1   1   1   1   1   0   0   1   0   1    1
G1-C8-4      1    1   0   0   0   0   0   0   0   0   0   0   1    0
G1-C8-5      1    1   0   0   1   0   1   1   0   0   1   0   1    0
G1-C8-6      1    0   0   0   0   0   0   0   0   0   0   0   1    0
G1-C8-7      1    0   0   0   0   0   0   0   0   0   0   0   0    0
G1-C8-8      1    0   0   0   0   0   0   0   0   0   0   0   0    0
G1-C8-9      0    1   0   0   0   0   0   0   0   0   0   0   0    0
G1-C8-10     1    0   0   0   0   0   0   0   0   0   0   0   0    0
G1-C8-11     1    1   1   0   0   0   0   0   0   0   0   0   0    0
G1-C8-13     1    1   0   0   1   1   1   1   1   1   1   0   0    0
G1-C8-14     1    1   0   0   1   1   0   1   0   1   0   0   1    0
G1-C8-15     1    1   0   0   0   0   0   0   0   0   0   0   0    1
G1-C8-16     1    0   0   0   0   0   0   0   0   0   0   0   0    0
G1-C8-18     0    1   0   0   0   0   0   0   0   0   0   0   1    0
G1-C8-20     1    1   1   0   1   1   0   1   0   0   1   0   1    0
G1-C8-21     1    1   0   0   1   1   0   1   1   1   1   1   1    0
G4-C1-1      1    1   0   1   1   1   1   1   0   0   0   0   1    1
G4-C1-2      1    1   0   1   1   1   0   1   0   0   0   0   1    1
G4-C1-3      1    1   0   1   1   1   0   1   0   0   0   0   0    1
G4-C1-4      1    1   0   1   1   1   1   1   0   1   0   0   1    1
G4-C1-6      1    1   1   1   1   1   1   1   0   0   1   0   1    1
G4-C1-7      1    1   0   1   1   1   1   1   0   1   1   0   1    1
G4-C1-8      1    1   1   0   1   1   1   1   1   1   0   0   1    0
G4-C1-10     1    1   0   1   1   1   1   1   0   0   0   0   1    1
G4-C1-13     1    1   1   1   1   1   1   1   1   0   1   0   1    1
G4-C1-14     1    1   0   1   1   1   1   1   1   0   0   0   0    1
G4-C1-15     0    1   0   1   1   1   0   1   0   0   1   0   1    1
G4-C1-16     1    1   0   1   1   1   1   1   1   1   1   0   1    1
G4-C1-17     1    1   0   1   1   1   1   1   1   0   0   0   1    1
G4-C5-3      1    1   1   1   1   1   1   1   1   0   1   0   1    1
G4-C5-6      1    1   1   1   1   1   1   1   0   0   1   0   1    1
G4-C5-7      1    1   0   1   1   1   1   1   0   0   1   0   0    1
G4-C5-8      1    1   1   1   1   1   1   1   0   0   1   0   0    1
G4-C5-9      1    1   0   1   1   1   1   1   1   1   1   0   1    1
G4-C5-10     1    1   0   1   1   1   1   1   1   0   0   1   0    1
G4-C5-11     1    1   0   1   1   1   1   1   1   1   0   0   0    1
G4-C5-12     1    1   1   0   1   1   1   1   1   1   0   0   1    0
G4-C5-14     1    1   1   1   1   1   1   1   0   1   1   0   0    1
G4-C5-15     1    1   1   1   1   1   1   1   1   1   0   0   1    1
G4-C5-17     1    1   0   1   1   1   0   1   0   0   0   0   0    1
G4-C5-18     1    1   0   1   1   1   1   1   0   0   1   0   1    1
G4-C5-19     1    1   1   1   1   1   0   1   0   0   1   0   1    1
G4-C5-20     1    1   1   1   0   1   0   0   0   0   0   0   0    1
G4-C5-21     1    1   1   1   1   1   1   1   1   1   1   0   0    1

† In all coherence analysis tables, ‘0’ corresponds to ‘component/feature absent/not realized’, while ‘1’ analogously stands for ‘component/feature present/realized’. Participants’ individual results are ordered, first and foremost, by grade. References to individual participants are therefore coded as follows: Grade-cohort-participant number. ‘G1-C1-1’, for example, refers to child 1 from cohort 1 in first grade, ‘G4-C1-1’ to the same child in fourth grade (longitudinal data set).

11.4.3 Cohesion scores

Tab.
11.13 Cohesion scores by participant

Participant †  Total cohesive density  Reference density  Connective density  Substitution density  Ellipsis density  Lexical density
G1-C1-1        3.74                    1.00               0.74                0.00                  0.00              2.00
G1-C1-2        4.44                    1.56               0.61                0.00                  0.00              2.28
G1-C1-3        3.26                    1.11               0.19                0.00                  0.11              1.85
G1-C1-4        2.17                    0.74               0.22                0.00                  0.00              1.22
G1-C1-6        3.40                    1.12               0.65                0.00                  0.05              1.58
G1-C1-7        3.48                    1.16               0.48                0.00                  0.08              1.76
G1-C1-8        4.42                    1.24               0.97                0.00                  0.03              2.18
G1-C1-10       2.89                    0.95               0.26                0.00                  0.21              1.47
G1-C1-13       3.67                    1.21               0.50                0.00                  0.00              1.96
G1-C1-14       3.87                    1.10               0.47                0.00                  0.10              2.20
G1-C1-15       3.73                    1.08               0.65                0.00                  0.12              1.88
G1-C1-16       4.03                    1.23               0.75                0.00                  0.08              1.98
G1-C1-17       3.27                    0.96               0.35                0.00                  0.04              1.92
G1-C8-1        3.94                    1.15               0.87                0.00                  0.17              1.76
G1-C8-2        3.05                    0.57               0.81                0.00                  0.38              1.29
G1-C8-3        3.46                    1.24               0.16                0.00                  0.11              1.95
G1-C8-4        3.77                    1.34               0.55                0.00                  0.17              1.70
G1-C8-5        3.46                    1.17               0.49                0.00                  0.07              1.73
G1-C8-6        3.47                    1.30               0.20                0.00                  0.17              1.80
G1-C8-7        3.48                    1.48               0.10                0.00                  0.00              1.90
G1-C8-8        2.92                    1.33               0.00                0.00                  0.00              1.58
G1-C8-9        3.45                    1.10               0.75                0.00                  0.05              1.55
G1-C8-10       2.94                    1.29               0.12                0.00                  0.00              1.53
G1-C8-11       4.15                    1.29               0.68                0.00                  0.15              2.03
G1-C8-13       3.28                    1.02               0.49                0.00                  0.13              1.64
G1-C8-14       3.76                    1.17               0.31                0.00                  0.21              2.07
G1-C8-15       2.85                    1.18               0.21                0.00                  0.06              1.41
G1-C8-16       4.17                    1.65               0.00                0.00                  0.00              2.52
G1-C8-18       2.62                    0.76               0.48                0.00                  0.05              1.33
G1-C8-20       3.26                    0.85               0.74                0.00                  0.15              1.52
G1-C8-21       2.66                    1.06               0.06                0.00                  0.06              1.49
G4-C1-1        4.34                    1.32               0.64                0.00                  0.14              2.25
G4-C1-2        4.21                    1.36               0.79                0.00                  0.18              1.89
G4-C1-3        3.58                    1.16               0.46                0.00                  0.08              1.88
G4-C1-4        3.98                    1.24               0.76                0.00                  0.17              1.81
G4-C1-6        3.67                    1.15               0.66                0.00                  0.17              1.69
G4-C1-7        3.88                    1.21               0.79                0.00                  0.20              1.68
G4-C1-8        3.91                    1.22               0.86                0.00                  0.10              1.73
G4-C1-10       3.63                    1.24               0.58                0.00                  0.12              1.69
G4-C1-13       3.97                    1.13               0.76                0.00                  0.21              1.87
G4-C1-14       3.88                    1.10               0.65                0.00                  0.10              2.02
G4-C1-15       4.02                    1.32               0.56                0.00                  0.14              2.00
G4-C1-16       3.74                    1.07               0.59                0.00                  0.19              1.89
G4-C1-17       3.49                    0.76               0.72                0.00                  0.08              1.93
G4-C5-3        3.72                    1.04               0.49                0.01                  0.18              2.00
G4-C5-6        3.43                    1.18               0.48                0.02                  0.17              1.58
G4-C5-7        3.98                    1.13               0.64                0.00                  0.19              2.02
G4-C5-8        4.44                    1.36               0.73                0.00                  0.13              2.22
G4-C5-9        4.13                    1.38               0.58                0.00                  0.14              2.02
G4-C5-10       3.76                    1.07               0.67                0.00                  0.19              1.83
G4-C5-11       3.59                    0.96               0.49                0.00                  0.20              1.94
G4-C5-12       4.65                    1.62               0.57                0.00                  0.15              2.31
G4-C5-14       3.90                    1.23               0.50                0.00                  0.23              1.94
G4-C5-15       3.75                    1.31               0.47                0.00                  0.09              1.88
G4-C5-17       3.95                    1.26               0.53                0.00                  0.16              2.00
G4-C5-18       4.64                    1.49               0.91                0.00                  0.03              2.21
G4-C5-19       3.64                    1.29               0.51                0.02                  0.05              1.76
G4-C5-20       3.69                    1.20               0.53                0.02                  0.15              1.80
G4-C5-21       3.78                    1.10               0.70                0.00                  0.17              1.80

† Participants’ individual results are ordered, first and foremost, by grade. References to individual participants are therefore coded as follows: Grade-cohort-participant number. ‘G1-C1-1’, for example, refers to child 1 from cohort 1 in first grade, ‘G4-C1-1’ to the same child in fourth grade (longitudinal data set).

12 List of figures

Fig. 3.1 A simplified model of discourse production in a picturebook-elicited storytelling task
Fig. 3.2 Narrative components of a single episode structure
Fig. 5.1 The structure of the frog story in narrative components as identified by story grammars
Fig. 6.1 Mean number of narrative components by grade
Fig. 6.2 Distribution of the total number of component scores by grade
Fig. 6.3 Mean number of narrative components by sex and grade
Fig. 6.4 Distribution of the total number of components by sex in first grade
Fig.
6.5 Distribution of total number of narrative component scores by sex in fourth grade
Fig. 6.6 Distribution of the total number of components by grade for male participants
Fig. 6.7 Distribution of the total number of components by grade for female participants
Fig. 6.8 Mean number of components by experience group and grade
Fig. 6.9 Distribution of the total number of components by experience group in first grade
Fig. 6.10 Distribution of the total number of components by experience group in fourth grade
Fig. 6.11 Distribution of the total number of components by grade for participants with exclusively monolingual preschool experience
Fig. 6.12 Distribution of the total number of components by grade for children with bilingual preschool experience
Fig. 6.13 Frequencies of the 14 individual narrative components by grade
Fig. 6.14 Increase in frequency from first to fourth grade for each narrative component
Fig. 6.15 Frequency of each narrative component by sex in first grade
Fig. 6.16 Frequency of each narrative component by sex in fourth grade
Fig. 6.17 Frequency of each narrative component in first grade by experience group
Fig. 6.18 Frequency of each narrative component by experience group in fourth grade
Fig. 6.19 Mean narrative index score by grade
Fig. 6.20 Distribution of narrative index scores by grade
Fig. 6.21 Mean narrative index score by sex and grade
Fig. 6.22 Distribution of narrative index scores by sex in first grade
Fig. 6.23 Distribution of narrative index scores by sex in fourth grade
Fig. 6.24 Mean narrative index score by experience group and grade
Fig. 6.25 Distribution of narrative index scores by experience group in first grade
Fig. 6.26 Distribution of narrative index scores by experience group in fourth grade
Fig. 7.1 Mean cohesive density by grade
Fig. 7.2 Distribution of participants’ individual cohesive density scores by grade
Fig. 7.3 Mean cohesive density by grade and sex
Fig. 7.4 Distribution of participants’ individual cohesive density scores by sex in first grade
Fig. 7.5 Distribution of participants’ individual cohesive density scores by sex in fourth grade
Fig. 7.6 Distribution of participants’ individual cohesive density scores by sex and grade
Fig. 7.7 Cohesive density by experience group and grade
Fig. 7.8 Distribution of participants’ individual cohesive density scores by experience group in first grade
Fig. 7.9 Distribution of participants’ individual cohesive density scores by experience group in fourth grade
Fig. 7.10 Distribution of participants’ individual cohesive density scores by experience group and grade
Fig. 7.11 Distribution of participants’ individual reference density scores by grade
Fig. 7.12 Distribution of participants’ individual reference density scores by sex in first grade
Fig. 7.13 Distribution of participants’ individual reference density scores by sex and grade
Fig. 7.14 Distribution of participants’ individual reference density scores by experience group in first grade
Fig. 7.15 Distribution of participants’ individual reference density scores by experience group and grade
Fig. 7.16 Distribution of participants’ individual connective density scores by grade
Fig. 7.17 Distribution of participants’ individual connective density scores by sex in first grade
Fig. 7.18 Distribution of participants’ individual connective density scores by sex and grade
Fig. 7.19 Distribution of participants’ individual connective density scores by experience group in first grade
Fig. 7.20 Distribution of participants’ individual connective density scores by experience group and grade
Fig. 7.21 Distribution of participants’ individual ellipsis density scores by grade
Fig. 7.22 Distribution of participants’ individual ellipsis density scores by sex in first grade
Fig. 7.23 Distribution of participants’ individual ellipsis density scores by sex and grade
Fig. 7.24 Distribution of participants’ individual ellipsis density scores by experience group in first grade
Fig. 7.25 Distribution of participants’ individual ellipsis density scores by experience group and grade
Fig. 7.26 Distribution of participants’ individual lexical density scores by grade
Fig. 7.27 Distribution of participants’ individual lexical density scores by sex in first grade
Fig. 7.28 Distribution of participants’ individual lexical density scores by sex and grade
Fig. 7.29 Distribution of participants’ individual lexical density scores by experience group in first grade
Fig. 7.30 Distribution of participants’ individual lexical density scores by experience group and grade
Fig. 7.31 Contribution of the subcategories of cohesion in first grade
Fig. 7.32 Contribution of the subcategories of cohesion in fourth grade
Fig. 7.33 Contribution of the subcategories of cohesion for male first graders
Fig. 7.34 Contribution of the subcategories of cohesion for male fourth graders
Fig. 7.35 Contribution of the subcategories of cohesion for female first graders
Fig. 7.36 Contribution of the subcategories of cohesion for female fourth graders
Fig. 7.37 Contribution of the subcategories of cohesion for first graders with monolingual preschool experience (mono group)
Fig. 7.38 Contribution of the subcategories of cohesion for fourth graders with monolingual preschool experience (mono group)
Fig. 7.39 Contribution of the subcategories of cohesion for first graders with bilingual preschool experience (bili group)
Fig. 7.40 Contribution of the subcategories of cohesion for fourth graders with bilingual preschool experience (bili group)
Fig. 7.41 Contribution of the subcategories of lexical cohesion in first grade
Fig. 7.42 Contribution of the subcategories of lexical cohesion in fourth grade
Fig. 7.43 Contribution of the subcategories of reference in first grade
Fig. 7.44 Contribution of the subcategories of reference in fourth grade
Fig. 7.45 Contribution of the subcategories of connectives in first grade
Fig. 7.46 Contribution of the subcategories of connectives in fourth grade
Fig. 7.47 Cohesive densities by grade

13 List of tables

Tab. 5.1 Descriptive statistics of the study’s participants
Tab. 5.2 Participants of cohort one
Tab. 5.3 Participants of cohort five (fourth grade)
Tab. 5.4 Participants of cohort eight (first grade)
Tab. 5.5 Subtypes of references included in the analysis
Tab. 5.6 Subtypes of connectives included in the analysis
Tab. 5.7 Synonymy and near-synonymy included in the analysis
Tab. 5.8 Hyponymy included in the analysis
Tab.
5.9 Oppositional relations included in the analysis
Tab. 5.10 Word fields included in the analysis
Tab. 6.1 Descriptive statistics total number of narrative components by grade
Tab. 6.2 Distribution of the total number of component scores by grade
Tab. 6.3 Descriptive statistics total number of components by sex and grade
Tab. 6.4 Distribution of the total number of component scores by sex and grade
Tab. 6.5 Descriptive statistics total number of narrative components by experience group and grade
Tab. 6.6 Distribution of the total number of components by experience group and grade
Tab. 6.7 Frequencies of the 14 individual narrative components by grade
Tab. 6.8 Order of difficulty for first and fourth graders
Tab. 6.9 Frequency of each narrative component by sex and grade
Tab. 6.10 Order of difficulty for first graders by sex
Tab. 6.11 Order of difficulty for fourth graders by sex
Tab. 6.12 Frequency of each narrative component by experience group and grade
Tab. 6.13 Order of difficulty for first graders by experience group
Tab. 6.14 Order of difficulty for fourth graders by experience group
Tab. 6.15 Index construction: Components and their assigned weight
Tab. 6.16 Descriptive statistics narrative index by grade
Tab. 6.17 Distribution of narrative index scores by grade
Tab. 6.18 Descriptive statistics narrative index by sex and grade
Tab. 6.19 Distribution of narrative index scores by sex and grade
Tab. 6.20 Descriptive statistics narrative index by experience group and grade
Tab. 6.21 Distribution of narrative index scores by experience group and grade
Tab. 7.1 Descriptive statistics cohesive density by grade
Tab. 7.2 Descriptive statistics cohesive density by sex and grade
Tab. 7.3 Descriptive statistics cohesive density by experience group and grade
Tab. 7.4 Descriptive statistics reference density
Tab. 7.5 Descriptive statistics reference density by sex and grade
Tab. 7.6 Descriptive statistics reference density by experience group and grade
Tab. 7.7 Descriptive statistics connective density by grade
Tab. 7.8 Descriptive statistics connective density by sex and grade
Tab. 7.9 Descriptive statistics connective density by experience group and grade
Tab. 7.10 Descriptive statistics ellipsis density by grade
Tab. 7.11 Descriptive statistics ellipsis density by sex and grade
Tab. 7.12 Descriptive statistics ellipsis density by experience group and grade
Tab. 7.13 Descriptive statistics lexical density by grade
Tab. 7.14 Descriptive statistics lexical density by sex and grade
Tab. 7.15 Descriptive statistics lexical density by experience group and grade
Tab. 7.16 Descriptive statistics cohesive densities by grade
Tab. 7.17 Interindividual variation in cohesive density by grade, sex, and experience group
Tab. 8.1 Correlations in first grade
Tab. 8.2 Correlations in fourth grade
Tab. 8.3 Correlations between increase variables
Tab. 11.1 First graders’ cohesive density scores in ascending order
Tab. 11.2 Fourth graders’ cohesive density scores in ascending order
Tab. 11.3 Male first graders’ cohesive density scores in ascending order
Tab. 11.4 Female first graders’ cohesive density scores in ascending order
Tab. 11.5 Male fourth graders’ cohesive density scores in ascending order
Tab. 11.6 Female fourth graders’ cohesive density scores in ascending order
Tab. 11.7 Bili first graders’ cohesive density scores in ascending order
Tab. 11.8 Mono first graders’ cohesive density scores in ascending order
Tab. 11.9 Bili fourth graders’ cohesive density scores in ascending order
Tab. 11.10 Mono fourth graders’ cohesive density scores in ascending order
Tab. 11.11 Overall coherence scores by participant
Tab. 11.12 Individual narrative components by participant
Tab. 11.13 Cohesion scores by participant
How do second language learners’ text/discourse abilities develop? The present monograph addresses this largely unanswered question by investigating narrative texts produced by elementary school students in an English immersion program in Germany. On the basis of a psycholinguistic model of discourse production, the texts are analyzed with respect to their coherence and cohesion.

Multilingualism and Language Teaching 3