Building Corpus-Informed Word Lists for L2 Vocabulary Learning in Nine Languages

Language: English

© 2012


Lexical competence is a crucial aspect of L2 learning, since building a rich repository of words is considered indispensable for successful communication. CALL practitioners have experimented with various kinds of computer-mediated glosses to facilitate L2 vocabulary building in the context of incidental vocabulary learning. Intentional learning, on the other hand, is generally undervalued, since it is considered out of fashion and not in line with the communicative L2 learning paradigm. Yet work is still being done in this area, and a substantial body of research indicates that the usefulness of incidental vocabulary learning does not preclude dedicated vocabulary study: by using aids explicitly geared to building vocabulary (such as word lists and word cards), L2 learners exhibit good retention rates and faster learning gains. Intentional vocabulary study should therefore have its place in the instructional and learning context. Regardless of the approach, incidental or intentional, the crucial question for vocabulary teaching and learning remains: which words, and how many, should be taught and learned at each language level? The EU-funded project "KELLY" (Keywords for Language Learning for Young and Adults Alike), presented here, attempted to answer this question. The project aimed to build corpus-informed vocabulary lists for L2 learners from CEFR levels A1 to C2 in nine languages: Arabic, Chinese, English, Greek, Italian, Norwegian, Polish, Russian and Swedish.

Keywords: intentional vocabulary learning, corpora, CEFR, corpus-informed word lists, digital cards.


How to cite

Charalabopoulou, Frieda; Gavrilidou, Maria; Johansson Kokkinakis, Sofie; Volodina, Elena. (2012). Building Corpus-Informed Word Lists for L2 Vocabulary Learning in Nine Languages. In Linda Bradley, Sylvie Thouësny (Eds.), CALL: Using, Learning, Knowing, EUROCALL Conference, Gothenburg, Sweden, 22-25 August 2012, Proceedings (pp. 49-53).

Request permissions

This article is published under the Attribution-NonCommercial-NoDerivatives International 4.0 (CC BY-NC-ND 4.0) licence. Under this licence, the contents are freely available online (as PDF files) for anybody to read, download, copy, and redistribute provided that the AUTHOR(s), EDITORIAL TEAM and PUBLISHER are properly cited. Commercial use and derivative works are, however, not permitted.

Permission is not required for the republication of tables, figures or illustrations, as long as they are reproduced accurately and the source material is fully cited. It may be the case that the licence does not give you all of the permissions necessary for your intended use. If this is your current situation, please do feel free to ask at
