Corpus d'Etude des Langues Vivantes Appliquées à une Spécialité

Learner language data set for the study of English for Specific Purposes

Author: Thomas GAILLAT

This data set supports the study of English as a second language (L2) among learners studying specific academic domains. It includes 671 texts written by students from various academic domains at a French university. All learners responded to the same task prompt, designed to elicit language related to their specific domain, and had their CEFR level assessed with the DIALANG test. The data set includes structured textual data with rich Universal Dependencies linguistic annotation and metadata. It can be used in several types of NLP tasks to gain insight into the learning of English as an L2. The data was collected as part of the Analytics for Language Learning project (A4LL, ANR-22-CE38-0015-01).

Data

data_public_anglais_annotated_CEFR_dialang.csv

data_public_anglais_annotated_CEFR_dialang_with_conllu.csv
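
A minimal sketch of how the annotated file might be read, assuming it is a comma-separated UTF-8 file in which one column holds CoNLL-U text. The column layout is not documented on this page, so the sketch relies only on the file's own header row; `read_rows` and `parse_conllu` are hypothetical helper names, not part of the data set.

```python
import csv
import io

# The ten standard CoNLL-U token columns, in order.
CONLLU_FIELDS = ["id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc"]


def parse_conllu(block):
    """Parse a CoNLL-U string into a list of sentences (lists of token dicts)."""
    sentences, current = [], []
    for line in block.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
        elif not line.startswith("#"):
            # Skip comment lines; keep only well-formed 10-column token lines.
            cols = line.split("\t")
            if len(cols) == len(CONLLU_FIELDS):
                current.append(dict(zip(CONLLU_FIELDS, cols)))
    if current:
        sentences.append(current)
    return sentences


def read_rows(handle, delimiter=","):
    """Yield each CSV row as a dict keyed by the file's own header.

    The delimiter is an assumption; change it if the header does not split.
    """
    # Annotation cells can be long, so raise the csv field size limit.
    csv.field_size_limit(10**9)
    yield from csv.DictReader(handle, delimiter=delimiter)
```

To use it, open `data_public_anglais_annotated_CEFR_dialang_with_conllu.csv` with `open(path, newline="", encoding="utf-8")`, pass the handle to `read_rows`, and print the keys of the first row to find out which column carries the CoNLL-U annotation before calling `parse_conllu` on it.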

Keywords

learner corpus
L2
English for specific purposes
corpus linguistics

http://nakala.fr/terms#created

2023-03-20

license

CC-BY-4.0

type

http://purl.org/coar/resource_type/c_ddb1

language

en
