ESL Speaking

Approaches and Methods in Language Teaching: CLT, TPR

Teaching a foreign language can be a challenging but rewarding job that opens up entirely new paths of communication to students. It’s beneficial for teachers to have knowledge of the many different language learning techniques including ESL teaching methods so they can be flexible in their instruction methods, adapting them when needed.

Keep on reading for all the details you need to know about the most popular foreign language teaching methods. Some of the ESL pedagogy ideas covered are the communicative approach, total physical response, the direct method, task-based language learning, suggestopedia, grammar-translation, the audio-lingual approach and more.

Most Popular Approaches and Methods in Language Teaching

Here’s a helpful rundown of the most common language teaching methods and ESL teaching methods. You may also want to take a look at this: Foreign language teaching philosophies.

#1: The Direct Method

In the direct method, all teaching occurs in the target language, encouraging the learner to think in that language. The learner does not practice translation or use their native language in the classroom. Practitioners of this method believe that learners should experience a second language without any interference from their native tongue.

Instructors do not stress rigid grammar rules but teach them indirectly through induction. This means that learners figure out grammar rules on their own by practicing the language. The goal for students is to develop connections between experience and language. They do this by concentrating on good pronunciation and the development of oral skills.

This method improves students’ understanding, fluency, reading, and listening skills. Standard techniques for this language learning method are question and answer, conversation, reading aloud, writing, and student self-correction.

#2: Grammar-Translation

With this method, the student learns primarily by translating to and from the target language. Instructors encourage the learner to memorize grammar rules and vocabulary lists. There is little or no focus on speaking and listening. Teachers conduct classes in the student’s native language with this ESL teaching method.

This method’s two primary goals are to progress the learner’s reading ability to understand literature in the second language and promote the learner’s overall intellectual development. Grammar drills are a common approach. Another popular activity is translation exercises that emphasize the form of the writing instead of the content.

Although the grammar-translation approach was one of the most popular language teaching methods in the past, it has significant drawbacks that have caused it to fall out of favour in modern schools. Principally, students often have trouble conversing in the second language because they receive no instruction in oral skills.

#3: Audio-Lingual

The audio-lingual approach encourages students to develop habits that support language learning. Students learn primarily through pattern drills, particularly dialogues, which the teacher uses to help students practice and memorize the language. These dialogues follow standard configurations of communication.

There are four types of pattern drills used in this method:

  • Repetition, in which the student repeats the teacher’s statement exactly
  • Inflection, where one of the words appears in a different form from the previous sentence (for example, a word may change from the singular to the plural)
  • Replacement, which involves one word being replaced with another while the sentence construction remains the same
  • Restatement, where the learner rephrases the teacher’s statement

This technique’s name comes from the order it uses to teach language skills. It starts with listening and speaking, followed by reading and writing, meaning that it emphasizes hearing and speaking the language before experiencing its written form. Because of this, teachers use only the target language in the classroom with this TESOL method.

Many of the current online language learning apps and programs closely follow the audio-lingual language teaching approach. It is a nice option for language learning remotely and/or alone, even though it’s an older ESL teaching method.

#4: Structural Approach

Proponents of the structural approach understand language as a set of grammatical rules that should be learned one at a time in a specific order. It focuses on mastering these structures, building one skill on top of another, instead of memorizing vocabulary. This is similar to how young children learn a new language naturally.

An example of the structural approach is teaching the present tense of a verb, like “to be,” before progressing to more advanced verb tenses, like the present continuous tense that uses “to be” as an auxiliary.

The structural approach teaches all four central language skills: listening, speaking, reading, and writing. It’s a technique that teachers can implement with many other language teaching methods.

Most ESL textbooks take this approach into account. The easier-to-grasp grammatical concepts are taught before the more difficult ones. This is one of the modern language teaching methods.

#5: Total Physical Response (TPR)

The total physical response method highlights aural comprehension by allowing the learner to respond to basic commands, like “open the door” or “sit down.” It combines language and physical movements for a comprehensive learning experience.

In an ordinary TPR class, the teacher would give verbal commands in the target language with a physical movement. The student would respond by following the command with a physical action of their own. It helps students actively connect meaning to the language and passively recognize the language’s structure.

Many instructors use TPR alongside other methods of language learning. While TPR can help learners of all ages, it is used most often with young students and beginners. It’s a nice option for an English teaching method to use alongside some of the other ones on this list. 

An example of a game that could fall under TPR is Simon Says. Another option is a simple review activity: after teaching classroom vocabulary or prepositions, give students instructions like these:

  • Pick up your pencil.
  • Stand behind someone.
  • Put your water bottle under your chair.

#6: Communicative Language Teaching (CLT)

These days, CLT is by far one of the most popular approaches and methods in language teaching. Keep reading to find out more about it.

This method stresses interaction and communication to teach a second language effectively. Students participate in everyday situations they are likely to encounter in the target language. For example, learners may practice introductory conversations, offering suggestions, making invitations, complaining, or expressing time or location.

Instructors also incorporate learning topics outside of conventional grammar so that students develop the ability to respond in diverse situations.

CLT teachers focus on being facilitators rather than straightforward instructors. Doing so helps students achieve CLT’s primary goal, learning to communicate in the target language instead of emphasizing the mastery of grammar.

Role-play, interviews, group work, and opinion sharing are popular activities practiced in communicative language teaching, along with games like scavenger hunts and information gap exercises that promote student interaction.

Most modern-day ESL teaching textbooks like Four Corners, Smart Choice, or Touchstone are heavy on communicative activities.

#7: Natural Approach

This approach aims to mimic natural language learning with a focus on communication and instruction through exposure. It de-emphasizes formal grammar training. Instead, instructors concentrate on creating a stress-free environment and avoiding forced language production from students.

Teachers also do not explicitly correct student mistakes. The goal is to reduce student anxiety and encourage them to engage with the second language spontaneously.

Classroom procedures commonly used in the natural approach are problem-solving activities, learning games, affective-humanistic tasks that involve the students’ own ideas, and content practices that synthesize various subject matter, like culture.

#8: Task-Based Language Teaching (TBL)

With this method, students complete real-world tasks using their target language. This technique encourages fluency by boosting the learner’s confidence with each task accomplished and reducing direct mistake correction.

Tasks fall under three categories:

  • Information gap, or activities that involve the transfer of information from one person, place, or form to another.
  • Reasoning gap tasks that ask a student to discover new knowledge from a given set of information using inference, reasoning, perception, and deduction.
  • Opinion gap activities, in which students react to a particular situation by expressing their feelings or opinions.

Popular classroom tasks practiced in task-based learning include presentations on an assigned topic and interviews with peers or adults in the target language. Another option is having students work together to make a poster and then give a short presentation about a current event. These are just a couple of examples, and there are countless other things you can do in the classroom. In terms of ESL pedagogy, this is one of the most popular modern language teaching methods.

It’s considered to be a modern method of teaching English. I personally try to do at least 1-2 task-based projects in all my classes each semester. It’s a nice change of pace from my usually very communicative-focused activities.

One huge advantage of TBL is that students have some degree of freedom to learn the language they want to learn. They can also pick up some self-reflection and teamwork skills.

#9: Suggestopedia Language Learning Method

This approach and method in language teaching was developed in the 1970s by psychotherapist Georgi Lozanov. It is sometimes known as the positive suggestion method, and it later also became known as desuggestopedia.

Apart from using physical surroundings and a good classroom atmosphere to make students feel comfortable, here are some of the main tenets of this second language teaching method:

  • Deciphering, where the teacher introduces new grammar and vocabulary.
  • Concert sessions, where the teacher reads a text and the students follow along with music in the background. This can be both active and passive.
  • Elaboration, where students finish what they’ve learned with dramas, songs, or games.
  • Introduction, in which the teacher introduces new things in a playful manner.
  • Production, where students speak and interact without correction or interruption.

#10: The Silent Way

The silent way is an interesting ESL teaching method that isn’t that common but it does have some solid footing. After all, the goal in most language classes is to make them as student-centred as possible.

In the Silent Way, the teacher talks as little as possible, with the idea that students learn best when discovering things on their own. Learners are encouraged to be independent and to figure out the language for themselves.

Instead of talking, the teacher uses gestures and facial expressions to communicate, as well as props, including the famous Cuisenaire Rods. These are rods of different colours and lengths.

Although it’s not practical to teach an entire course using the silent way, it does certainly have some value as a language teaching approach to remind teachers to talk less and get students talking more!

#11: Functional-Notional Approach

This English teaching method first of all recognizes that language is purposeful communication. The reason people talk is that they want to communicate something to someone else.

Parts of speech like nouns and verbs exist to express language functions and notions. People speak to inform, agree, question, persuade, evaluate, and perform various other functions. Language is also used to talk about concepts or notions like time, events, places, etc.

The role of the teacher in this second language teaching method is to evaluate how students will use the language. This will serve as a guide for what should be taught in class. Teaching specific grammar patterns or vocabulary sets does play a role, but the purpose for which students need to know these things should always be kept in mind with the functional-notional approach to English teaching.

#12: The Bilingual Method

The bilingual method uses two languages in the classroom, the mother tongue and the target language. The mother tongue is briefly used for grammar and vocabulary explanations. Then, the rest of the class is conducted in English.

#13: The Test Teach Test Approach (TTT)

This style of language teaching is ideal for directly targeting students’ needs. It’s best for intermediate and advanced learners. Definitely don’t use it for total beginners!

There are three stages:

  • A test or task of some kind that requires students to use the target language.
  • Explicit teaching or focus on accuracy with controlled practice exercises.
  • Another test or task to see whether students have improved in their use of the target language.

#14: Community Language Learning

In Community Language Learning, the class is considered to be one unit. They learn together. In this style of class, the teacher is not a lecturer but is more of a counsellor or guide.

In general, there is no set lesson for the day. Instead, students sit in a circle and decide what they want to talk about. They may ask the teacher for a translation or for advice on pronunciation or how to say something.

The conversations are recorded and then transcribed. Students and the teacher can then analyze the grammar and vocabulary, as well as subject-related content.

While community language learning may not comprehensively cover the English language, students will be learning what they want to learn. It’s also student-centred to the max. It’s perhaps a nice change of pace from the usual teacher-led classes, but it’s not often seen these days as the only method of teaching a class.

#15: The Situational Approach

This approach loosely falls under the behaviourist view of language as habit formation. The situational approach to teaching English was popular in England, starting in the 1930s.

Language Teaching Approaches FAQs

There are a number of common questions that people have about second or foreign language teaching and learning. Here are the answers to some of the most popular ones.

What are language teaching approaches?

A language teaching approach is a way of thinking about teaching and learning. An approach produces methods, which are ways of teaching something (in this case, a second or foreign language) using techniques or activities.

What are method and approach?

Method and approach are similar but there are some key differences. An approach is the way of dealing with something while a method involves the process or steps taken to handle the issue or task.

How many approaches are there in language learning?

Throughout history, there have been just over 30 popular approaches to language learning. However, there are around 10 that are most widely known including task-based learning, the communicative approach, grammar-translation and the audio-lingual approach. These days, the communicative approach is all the rage.

What is the best method of English language teaching?

It’s difficult to choose the best single approach or method for English language teaching as the one used depends on the age and level of the students as well as the material being taught. Most teachers find that a mix of the communicative approach, audio-lingual approach and task-based teaching works well in most cases.

What are the most effective methods of learning a language?

The most effective methods for learning a language really depend on the person, but in general, here are some of the best options: total immersion, the communicative approach, extensive reading, extensive listening, and spaced repetition.

The Modern Methods of Teaching English

There are several modern methods of teaching English that focus on engaging students and making learning more interactive and effective. Some of these methods include:

Communicative Language Teaching (CLT)

This approach emphasizes communication and interaction as the main goals of language learning. It focuses on real-life situations and encourages students to use English in meaningful contexts.

Task-Based Learning (TBL)

TBL involves designing activities or tasks that require students to use English to complete a specific goal or objective. This approach helps students develop language skills while focusing on the task at hand.

Technology-Enhanced Learning

Using technology such as computers, tablets, and smartphones can make learning more engaging and interactive. Online resources, apps, and educational games can be used to supplement traditional teaching methods.

Flipped Classroom

In a flipped classroom, students learn new material at home through videos or online resources, and then use class time for activities, discussions, and practice exercises. This approach allows for more individualized learning and interaction in the classroom.

Project-Based Learning (PBL)

PBL involves students working on projects or tasks that require them to use English in a real-world context. This approach helps students develop critical thinking and problem-solving skills while improving their language abilities.

Content and Language Integrated Learning (CLIL)

CLIL involves teaching subjects such as science or history in English, rather than teaching English as a separate subject. This approach helps students learn English while also learning about other subjects.

Gamification

Using game elements such as points, badges, and leaderboards can make learning English more fun and engaging. Educational games can help students practice language skills in a playful and interactive way.

These modern methods of teaching English focus on making learning more student-centered, interactive, and engaging, leading to better outcomes for students.

Have your say about Approaches and Methods in Language Teaching

What’s your top pick for a language teaching method? Is it one of the options from this list or do you have another one that you’d like to mention? Leave a comment below and let us know what you think. We’d love to hear from you. And whatever approach or method you use, you’ll want to check out these top 10 tips for new English teachers.

About Jackie

Jackie Bolen has been teaching English for more than 15 years to students in South Korea and Canada. She's taught all ages, levels and kinds of TEFL classes. She holds an MA degree, along with the Celta and Delta English teaching certifications.

Jackie is the author of more than 100 books for English teachers and English learners, including 101 ESL Activities for Teenagers and Adults and 1001 English Expressions and Phrases. She loves to share her ESL games, activities, teaching tips, and more with other teachers throughout the world.


Hi, thank you so much for this amazing article. I just wanted to ask: is PPP one of the methods of teaching ESL? If so, was there a reason it wasn’t included in the article (outdated, not effective, etc.)?

PPP is more of a subset of these other ones and not an approach or method in itself.



An Overview of Language Teaching Methods and Approaches

melisa Perea

English Language Teaching: Approaches, Methods, and Techniques

Written by: Mike Turner

June 15, 2021

When we are looking at the effectiveness of our teaching, we often get tied up in the minutiae of classroom practice. However, sometimes it’s useful to take a bit of a step back and examine what we are doing more broadly.  

In order to look at our different options as teachers, it is handy to use a consistent framework. I am indebted to several writers on TEFL methodology, but I have chosen specifically to apply the useful distinctions between approach, method, and technique made by Richards and Rodgers in their 1986 work Approaches and Methods in Language Teaching (London: CUP). Although the book is now 25 years old, it still provides one of the neatest and most accessible descriptions of some of the most influential approaches. The terminological distinctions they draw are particularly useful and are summarised below. I have then applied them, as succinctly as I can, to a variety of current and historical approaches. The list is not intended to be exhaustive, but I hope it will allow teachers to contextualise their own practice.

Approach, Method & Technique

An approach describes the theory or philosophy underlying how a language should be taught; a method or methodology describes, in general terms, a way of implementing the approach (syllabus, progression, kinds of materials); techniques describe specific practical classroom tasks and activities. For example:

Communicative Language Teaching (CLT) is an approach with a theoretical underpinning that a language is for communication.

A CLT methodology may be based on a notional-functional syllabus, or a structural one, but the learner will be placed at the centre, with the main aim being developing their Communicative Competence. Classroom activities will be chosen that will engage learners in communicating with each other.

CLT techniques might include role-plays, discussions, text ordering, speaking games, and problem-solving activities.

Some Different Approaches, Methods & Techniques

The Audiolingual Approach

The Audiolingual Approach is based on a structuralist view of language and draws on the psychology of behaviourism as the basis of its learning theory, employing stimulus and response.

Audio-lingual teaching uses a fairly mechanistic method that exposes learners to increasingly complex grammatical structures by getting them to listen to the language and respond. It often involves memorising dialogues, and there is no explicit teaching of grammar.

Techniques include listening and repeating, and oral drilling to achieve a high level of accuracy of language forms and patterns. At a later stage, teachers may use communicative activities.

CLIL - Content and Language Integrated Learning

CLIL is an approach that combines the learning of a specific subject matter with learning the target language. It becomes necessary for learners to engage with the language in order to fulfil the learning objectives. On a philosophical level, its proponents argue that it fosters intercultural understanding, meaningful language use, and the development of transferrable skills for use in the real world.

The method employs immersion in the target language, with the content and activities dictated by the subject being taught. Activities tend to integrate all four skills, with a mixture of task types that appeal to different learning styles.

Techniques involve reading subject-specific texts, listening to subject-based audio or audio-visual resources, discussions, and subject-related tasks.

CLT - Communicative Language Teaching (The Communicative Approach)

CLT emphasises that the main purpose of language is communication, and that meaning is paramount. The goal of the Communicative Approach is to develop learners’ communicative competence across all four skills. It has been the dominant approach in mainstream language education for many decades.

Most methodologies use an amalgamation of a structural and a functional syllabus, with a relatively common consensus emerging concerning the order in which language elements should be taught. Language is generally contextualised, and communication is encouraged from the start. Native speaker input is seen as highly desirable, though not essential. Much teaching is learner-centred.

Techniques are an eclectic mix, often borrowed from a range of other approaches. Because of this, CLT is often criticised for lacking a robust theoretical underpinning. Specific activities and games are chosen for their perceived effectiveness in relation to the knowledge or skills being taught. Typical activities include physical games such as board races and running dictations, information exchange activities, role-plays, and any tasks and games that involve communication between learners.

DOGME

DOGME is a humanistic communicative approach that focuses on conversational interactions where learners and the teacher work together on the development of knowledge and skills.

In terms of method, it generally eschews the use of textbooks and published materials in favour of real communication and the development of discourse-level skills. Language may be scaffolded by the teacher, with attention paid to emergent forms. Topics are chosen based on their relevance to the learners.

Techniques include conversational activities and exposure to the language through real-life texts, audio, and video materials.

Grammar Translation

An approach to language study generally used to prepare students for reading classical texts, notably Latin, in the original. It is thought that students benefit from learning about the ideas of classical thinkers, and from the rigour of rote learning and the application of grammatical rules.

The method commonly involves students learning grammar rules plus vocabulary lists based on the content of chosen texts. These are then applied to the written translation of texts from and into the target language. The teaching is usually done in the student’s native language. There is little emphasis on speaking, other than to recite sections of text.

Techniques include rote learning and drilling, translation activities, and recitation.

This approach is not really used in teaching Modern Foreign Languages but is still sometimes the basis for the teaching of classical languages such as Latin or Greek.

The Lexical Approach

An approach based on the notion that language comprises lexical units (chunks, collocations, and fixed phrases). Grammar is secondary and is acquired through learning these chunks.

The method focuses on learning sets of phrase-level, multi-word vocabulary and linguistic frames that can be manipulated by the learner using substitutions and adaptations. This can be done through adapting many standard EFL activities.

Techniques could include searching texts for lexical units, collocation matching games, lexical drills and chants, story-telling, role plays using fixed and semi-fixed expressions, activities with de-lexical verbs and examining concordances.

The Natural Approach

An approach to language learning that seeks to mirror how we learn our first language.

Methods focus on the possibility of ‘acquiring’ a second language rather than having to learn it artificially. Teaching is by a native-speaker teacher; the syllabus mirrors the order in which we acquire our first language; there is an initial ‘silent phase’ when the learner assimilates aspects of the language, before moving on to producing it. Errors are seen as important attempts to form and use appropriate rules.

Techniques focus on meaningful interactions and may include listening and following instructions; ordering activities; memory games; miming activities; and describing and guessing games.

The Silent Way

The Silent Way sees the process of learning a second language as a cognitive task, with learners as intelligent, autonomous individuals who can infer language use from well-structured input.

The methodology employs a graded structural syllabus, with the elements of language presented in a deliberately artificial way, using teaching aids such as charts and Cuisenaire rods.

Techniques involve, for example, mapping individual sounds and sequences onto the colours or physical characteristics of the teaching aids, and then having students infer rules based on recognising the systematic similarities and differences in the input material.

Situational Language Teaching (SLT)

This approach views language as a purposeful means of achieving goals in real-life situations.

The method employs oral practice of sentence patterns and structures related to these specific situations. It often uses props and realia in practice activities.

Techniques include drills, repetition and substitution activities, spoken dialogues, and situational role-plays. Oral practice aims towards accuracy and mastery of the situational language, moving at a later stage to the other three skills.

Johanna Kawasaki

  • August 7, 2023

ESL teaching methods

There’s no single way to teach English and, in fact, there have been many popular approaches over the years. These are a few of the top ESL teaching methods, including communicative language teaching (CLT) and total physical response (TPR), used in the classroom today. Learn more about these and other methods and how you can apply them to a real-life classroom in Bridge’s Professional Certificate courses.

Whether you’re new to the different teaching methods or you need a refresher, download this guide to popular ESL methodologies to brush up on the definition and applications of the latest approaches developed by industry experts.

Why learn ESL teaching methods?

There are many reasons why learning a few basic ESL teaching methods is a must for ESL teachers. Here are some ways that learning the most popular methods of teaching ESL can help you as an English teacher:

  • Demonstrating knowledge of these ESL teaching methods and strategies makes you more marketable.
  • Using TEFL/TESOL buzzwords during an interview can improve your chances of getting hired.
  • Using a variety of methods in the ESL classroom makes you a more effective and engaging teacher.
  • Understanding pedagogy helps you design better ESL materials and lessons.
  • Learning methodology can help you strategically use learning objectives that will benefit your students.

If you’re new to teaching, you’ll want to get initial training and qualification with a TEFL certificate. You can explore our online TEFL courses to get started!

What are some popular ESL teaching methods?

Method #1: Direct method

For the direct method, all teaching is done in the target language. Translations are not allowed in class, and the focus lies heavily on speaking instead of grammar. As a result, the direct method is a very student-centered strategy that has gained popularity in recent years.

Students are supposed to learn the target language naturally and instinctively, which is why the direct method is also called the “natural method.” Mistakes are corrected as they happen in class, and teachers reinforce the correct usage of the language with praise. This method is frequently used when teaching English online. Many virtual ESL companies require teachers to only speak English during class to encourage an immersive experience.

Get more ideas for correcting students’ mistakes by taking Bridge’s 20-hour Micro-credential course: Error Correction in the EFL Classroom.

Method #2: Communicative language teaching (CLT)

Communicative language teaching is perhaps the most popular approach among the methods of teaching ESL today. CLT emphasizes the student’s ability to communicate in real-life contexts. As a result, students learn to make requests, accept offers, explain things, and express their feelings and preferences.

Additionally, since CLT focuses on teaching language through real-world assignments and problem-solving, it’s less concerned with grammar accuracy and instead focuses on fluency.

Promote communication and fluency in your classroom with these ESL speaking activities.

Method #3: Task-/project-/inquiry-based learning

This teaching strategy for ESL students can sometimes be considered a part of CLT, but it heavily emphasizes the students’ independence and individuality. Inquiry-based learning is a modern approach that is becoming widely popular in schools all over the world. By asking questions and solving problems, with the teacher as a mere learning facilitator, student motivation and participation in tasks and projects are thought to increase.

Find out more about task-based learning.

Method #4: Total physical response (TPR)

Next is the Total Physical Response (TPR) method. You may have heard of this teaching strategy for ESL before, but what exactly is TPR? Total Physical Response has become a very popular approach in which students react to the teacher with movement. Some examples include miming, gesturing, or acting out the language.

For example, the teacher and students might make an exaggerated frown and pretend to cry when learning the word “sad.” TPR suggests that students learn the target language best through physical response rather than by analysis.

Additionally, TPR is often used when teaching English online and when teaching young learners, as it not only helps students remember vocabulary but also provides an outlet for their energy and helps them stay focused when sitting for long periods.

If you like TPR, you might also like using drama as an ESL teaching method.

Method #5: An eclectic approach

Many teachers choose from the collection of humanistic approaches (TPR, for example) and communicative approaches (the direct method and CLT). Often, they incorporate bits and pieces of many other teaching strategies for ESL learners and use what works best for their individual students. Generally speaking, there is no one-size-fits-all methodology. Each group of learners will have varying learning styles and preferences. For that reason, conducting a needs assessment is a great starting place for teachers who aren’t sure which methodology, or methodologies, to apply.

For example, a teacher who mostly uses the direct method may occasionally explain grammar in depth when preparing students for English proficiency exams, such as the Pearson Test of English (PTE), while a CLT advocate may borrow some aspects of the direct method or use TPR.

  • Pro Tip: Another great way to combine or develop teaching methods is to frequently reflect on your teaching style by using a journal where you write down comments, note adjustments, and brainstorm how you can change certain methods or procedures if necessary.

The list of ESL teaching styles doesn’t have to end here! You can find your own favorite TEFL/TESOL method from among those listed above, combine several strategies for teaching your ESL students, or develop your own ESL teaching methods and techniques. For a full breakdown of the different methodologies and how to evaluate your students’ needs, download Bridge’s ESL Methodologies Guide.

Delve deeper into these and other ESL teaching methods and techniques with Bridge Professional TEFL/TESOL Certificate courses.

After backpacking Australia on a Working Holiday visa, Bridge graduate Johanna traveled to Japan for a year to teach English. She then moved to New Zealand for another two years before returning to her chosen home country, Japan, where she currently lives. Now, with more than eight years of professional English teaching experience, Johanna enjoys her expat life in Japan teaching teenagers at a private junior and senior high school, where she recently received tenure after only two years. When she’s not teaching, Johanna continues to travel regionally and explore new places.

Introduction

Language teaching came into its own as a profession in the last century. Central to this process was the emergence of the concept of methods of language teaching. The method concept in language teaching—the notion of a systematic set of teaching practices based on a particular theory of language and language learning—is a powerful one, and the quest for better methods preoccupied teachers and applied linguists throughout the 20th century. Howatt (1984) documents the changes in language teaching over time, up through the Direct Method in the 20th century. One of the most lasting legacies of the Direct Method has been the notion of method itself.

Methodology in language teaching has been characterized in a variety of ways. A more or less classical formulation suggests that methodology links theory and practice. Within methodology a distinction is often made between methods and approaches, in which methods are held to be fixed teaching systems with prescribed techniques and practices, and approaches are language teaching philosophies that can be interpreted and applied in a variety of different ways in the classroom. This distinction is probably best seen as a continuum ranging from highly prescribed methods to loosely described approaches.

This Resource Guide provides information about and links to digests, journals, books, and Web sites that offer information about second language teaching methods and approaches.

Howatt, A. (1984). A history of English language teaching. Oxford: Oxford University Press.

This introduction is adapted from Rodgers, T. S. (2001). Language Teaching Methodology (ERIC Issue Paper). Washington, DC: ERIC Clearinghouse on Languages and Linguistics.

The following publications, Web sites, and listservs offer additional information about second language teaching methodology. This Resource Guide concludes with an annotated bibliography of ERIC documents on this topic.

Digests

Content-Centered Language Learning

Grammar and Its Teaching: Challenging the Myths

Integrated Skills in the ESL/EFL Classroom

Integrating Language and Content: Lessons from Immersion

Language Teaching Methodology

Lexical Approach to Second Language Teaching

Reading with a Purpose: Communicative Reading Tasks for the Foreign Language Classroom

Thematic, Communicative Language Teaching in the K–8 Classroom

Journals

The following journals often include articles on language teaching methods and approaches.

The ADFL Bulletin is a refereed journal published by the Association of Departments of Foreign Languages. The ADFL Bulletin prints essays dealing with professional, pedagogical, curricular, and departmental matters.

The Annual Review of Applied Linguistics provides reviews of research in applied linguistics and essays on pedagogy, second language acquisition, and computer assisted instruction.

The CALICO Journal is devoted to the dissemination of information concerning the application of technology to language teaching and language learning.

The Canadian Modern Language Review publishes peer-reviewed articles on all aspects of language learning and teaching. Article topics range from ESL, to French immersion, to international languages, to native languages.

The ELT Journal seeks to bridge the gap between the everyday practical concerns of ELT professionals and related disciplines such as education, linguistics, psychology, and sociology that may offer significant insights.

Foreign Language Annals is the official journal of the American Council on the Teaching of Foreign Languages (ACTFL) and is dedicated to the advancement of foreign language teaching.

The French Review is the official journal of the American Association of Teachers of French. The Review publishes articles and reviews on French and francophone literature, cinema, society and culture, linguistics, technology, and pedagogy six times a year.

Hispania is a journal devoted to the interests of the teaching of Spanish and Portuguese.

The International Review of Applied Linguistics in Language Teaching (IRAL) is devoted to problems of general and applied linguistics in their various forms.

The Journal for Accelerated Learning and Teaching provides information on the research behind accelerated learning theory.

Language Learning - A Journal of Research in Language Studies is a scientific journal dedicated to the understanding of language learning broadly defined.

Language Teaching brings together the latest findings in research worldwide in language teaching and learning. Key international periodicals are abstracted in each volume, and an annual research review identifies significant research trends.

The Modern Language Journal is devoted to research and discussion about the learning and teaching of foreign and second languages.

TESOL Journal publishes articles on ESOL methodology, curriculum materials and design, teacher development, literacy, bilingual education, and classroom inquiry and research.

TESOL Quarterly publishes articles on topics of significance to individuals concerned with the teaching of English as a second or foreign language and of standard English as a second dialect.

Die Unterrichtspraxis: The Teaching of German is the American Association of Teachers of German (AATG) journal for German language pedagogy.

Books

Asher, J. (1982). Learning Another Language Through Actions (2nd ed.). Los Gatos, CA: Sky Oaks Productions.

Bancroft, W. (1999). Suggestopedia and Language Acquisition: Variations on a Theme. New York: Gordon and Breach.

Bardovi-Harlig, K., & Hartford, B. (Eds.). (1997). Beyond Methods: Components of Second Language Teacher Education. New York: McGraw-Hill.

Brown, H.D. (1980). Principles of Language Learning and Teaching. Englewood Cliffs, NJ: Prentice-Hall.

Brumfit, C.J., & Johnson, K. (Eds.). (1979). The Communicative Approach to Language Teaching. Oxford: Oxford University Press.

Celce-Murcia, M. (Ed.). (1991). Teaching English as a Second or Foreign Language. Boston: Newbury House.

Curran, C.A. (1972). Counseling-Learning: A Whole-Person Model for Education. New York: Grune and Stratton.

Curran, C.A. (1976). Counseling-Learning in Second Languages. Apple River, IL: Apple River Press.

Gattegno, C. (1972). Teaching Foreign Languages in Schools: The Silent Way (2nd ed.). New York: Educational Solutions.

Gattegno, C. (1976). The Common Sense of Teaching Foreign Languages. New York: Educational Solutions.

Holt, D. (1993). Cooperative Learning: A Response to Linguistic and Cultural Diversity. McHenry, IL, and Washington, DC: Delta Systems and Center for Applied Linguistics.

Johnson, K. (1982). Communicative Syllabus Design and Methodology. Oxford: Pergamon.

Krashen, S.D. (1981). Second Language Acquisition and Second Language Learning. Oxford: Pergamon.

Krashen, S.D., & Terrell, T.D. (1983). The Natural Approach: Language Acquisition in the Classroom. Englewood Cliffs, NJ: Prentice Hall.

Larsen-Freeman, D. (2000). Techniques and Principles in Language Teaching (2nd ed.). Oxford: Oxford University Press.

Littlewood, W. (1982). Communicative Language Teaching: An Introduction. Cambridge: Cambridge University Press.

Littlewood, W. (1992). Teaching Oral Communication: A Methodological Framework. Oxford: Blackwell.

Lozanov, G. (1978). Suggestology and Outlines of Suggestopedy. New York: Gordon and Breach.

Lozanov, G., & Gateva, E. (1988). The Foreign Language Teacher's Suggestopedic Manual. New York: Gordon and Breach.

Musumeci, D. (1997). Breaking Tradition: An Exploration of the Historical Relationship Between Theory and Practice in Second Language Teaching. New York: McGraw-Hill.

Nunan, D. (1999). Second Language Teaching and Learning. Boston: Heinle & Heinle.

Richards, J. C., & Rodgers, T.S. (2001). Approaches and Methods in Language Teaching (2nd ed.). Cambridge: Cambridge University Press.

Short, D. (1999). New Ways in Teaching English at the Secondary Level. Alexandria, VA: Teachers of English to Speakers of Other Languages.

Stern, H.H. (1983). Fundamental Concepts of Language Teaching. Oxford: Oxford University Press.

Stevick, E. (1998). Working with Teaching Methods: What's at Stake? Boston: Heinle & Heinle.

Video

McGraw-Hill. (1997). The Natural Approach, From Theory to Practice: The 1994 McGraw-Hill Teleconference. New York: Author.

Web Sites

Community Language Learning Theory and Lesson Plan: General overview of the techniques and principles of Community Language Learning theory and a sample lesson plan.

Desuggestive Learning: The Web site of Dr. Georgi Lozanov, creator of Suggestopedia.

The English Teacher: Online magazine for English language teachers that features publications, professional support, lessons, and articles on methodology.

Foreign Language Teaching Forum: The topic of FLTEACH is foreign language teaching methods, including school/college articulation, training of student teachers, classroom activities, curriculum, and syllabus design.

Grammar Translation Method: Short essay outlining the use of the Grammar Translation Method in teaching German.

McGraw-Hill Teaching Methods Web Resources: Large archive of links to teaching method resources.

Overview of Methodologies for Language Teaching: A historical survey of the most influential methodologies, some "fringe" methodologies, and new tendencies in language teaching.

The Silent Way Message Board: Message board devoted to general discussions regarding the work of Dr. Caleb Gattegno, developer of The Silent Way.

Suggestopedia and Accelerative Language Teaching/Learning: A page of links to sites about Suggestopedia, Georgi Lozanov, and Accelerative Language teaching techniques.

Teaching Foreign Languages - The Silent Way: History and method of The Silent Way.

Theoretical Basis for the Natural Approach: Online PowerPoint presentation about the Natural Approach. Site includes links to information about other teaching methods.

Total Physical Response: Articles, workshops, and publications about TPR.

Teaching Methods and Second Language Instruction and Educational Trends or Methodology/ies or Approach/es or Grammar Translation Method or Audiolingual Methods or Silent Way or Suggestopedia or Community Language Learning or Total Physical Response or Natural Approach or Communicative Approach

ED441344 Second Language Teaching & Learning. Nunan, David 1999 ISBN: 0-8384-0838-9 Availability: Heinle & Heinle Publishers, 7625 Empire Dr., Florence, KY 41042-2978 ($24.95). Tel: 800-354-9706 (Toll Free). The purpose of this volume is to provide a contemporary portrait of second language learning and teaching, to identify major trends and issues, to show where these trends and issues have come from, and to illustrate ways teachers can incorporate these ideas in their own teaching practice. The book is a personal account, tracing the author's struggles with theoretical and conceptual issues and illustrating practical solutions. The book is intended for practicing teachers as well as future teachers. It is composed of ten chapters divided into three parts. Part one is a concept map for the rest of the book, covering the conceptual and empirical basis for second language learning and teaching. In part two, language is looked at in context, focusing on those aspects of language that can provide teachers with insights for developing materials and pedagogical procedures. The chapters focus on the learner and learning processes respectively. Part three focuses on thematic issues that arose in parts one and two. Chapters examining the "Cinderella skill," what it is that differentiates spoken from written language, key theoretical and empirical underpinnings of a reading program, and a discussion of a discourse-based approach to writing complete the book. A glossary, index, and extensive references, as well as many charts, tables, and diagrams are included.

EJ499460 Language Awareness as Methodology: Implications for Teachers and Teacher Training. Borg, Simon Language Awareness, v3 n2 p61-71 1994 ISSN:0965-8416 Discusses language awareness (LA) as a methodology in foreign language teaching, demonstrating that LA presumes not only linguistic awareness on the part of teachers but also an understanding of the learning and teaching processes this methodology promotes. Argues that training content needs to be educationally rather than linguistically orientated. (12 references)

EJ435952 Grammar Pedagogy in Second and Foreign Language Teaching. Celce-Murcia, Marianne TESOL Quarterly, v25 n3 p459-512 Fall 1991 ISSN:0039-8322 To provide some perspective on current issues and challenges concerning the role of grammar in language teaching, methodological trends of the past 25 years are reviewed. A proposal for a decision-making strategy is provided for resolving the controversy regarding how much grammar one should teach to language learners.

EJ432944 Methods in Elementary School Foreign Language Teaching. Curtain, Helena Foreign Language Annals, v24 n4 p323-29 Sep 1991 ISSN: 0015-718X A brief overview of the importance of the use of appropriate methodologies for elementary school foreign language instruction precedes a description of several strategies involving total physical response, story telling, games and songs, props, small-group work, role-play, content-based instruction, cultural and global awareness, language experience approach, and dialog journals. (11 references)

ED277280 Eight Approaches to Language Teaching. Doggett, Gina December 1986 Important features of eight second language teaching methods—grammar-translation, direct, audiolingual, the Silent Way, Suggestopedia, community language learning, Total Physical Response, and the communicative approach—are summarized. A chart outlines characteristics of these aspects of the methods: goals, teacher and student roles, the teaching/learning process, student-teacher and student-student interaction, dealing with feelings, view of language and culture, the aspects of language emphasized, the role of the students' native language, means for evaluation, and response to student errors. The report also lists additional information sources.

ED253092 Three Methods for Language Acquisition: Total Physical Response; the Tomatis Program; Suggestopedia. Bancroft, W. Jane November 1984 Total Physical Response is a strategy for learning second languages developed by James J. Asher. The Tomatis program, developed in France by Alfred Tomatis, is a method for treating dyslexia and communication problems and is also used for teaching basic elements of foreign languages. Suggestology is a psychotherapeutic system based on yogic techniques of physical and mental relaxation, created in Bulgaria by Georgi Lozanov. Suggestopedia is the application of suggestology to education, and specifically to foreign language instruction. Although seemingly different, the three methods have important elements in common: (1) they are based on the way children learn their native language, that is, by acquiring listening comprehension before speaking, reading, and writing skills. (2) They share the premise that learning a second language should be a "natural" experience with emphasis on communicative competence and realistic utterances. (3) They perceive language globally, with attention to detail emphasized later in the learning process. (4) They emphasize use of the brain's right hemisphere, for implicit learning.

EJ317053 The Natural Approach to Language Teaching: An Update. Terrell, T. D. Canadian Modern Language Review, v41 n3 p461-79 Jan 1985 It is proposed that language acquisition improves if beginning students are allowed to experience three stages of acquisition: comprehension (preproduction), early speech production, and speech emergence. Each stage requires a different kind of activity building on the previous stage's development.

ED236942 Silent Way in the University Setting. Lantolf, James P. 1983 The use of the Silent Way method of second language instruction in beginning and intermediate Spanish classes at the college level is described. The approach encourages student self-responsibility for learning the target language according to learning strategies selected by the student. Although the method was used during three semesters, the students underwent the greatest metamorphosis in their abilities to independently interact in Spanish during the first semester. Students' initial reactions to the courses, pronunciation, evaluation of student progress, the link between input and acquisition, teacher silence and the cultivation of communicative confidence, the effect of the Silent Way approach on student anxiety levels, and student performance on a cloze test are discussed. Sample student compositions and an editing task are appended.

ED230069 The Natural Approach: Language Acquisition in the Classroom. Krashen, Stephen D.; Terrell, Tracy D. 1983 ISBN: 0-88084-005-7 Availability: The Alemany Press, P.O. Box 5265, San Francisco, CA 94101 ($11.95). The theory and methods of the natural approach to language acquisition in the classroom are described. The natural approach is based on the theory that language acquisition occurs only when students receive comprehensible input. The emphasis is on reading and listening comprehension for beginning students. The seven chapters cover (1) language teaching approaches, (2) second language acquisition theory, (3) classroom implications of the theory, (4) how to begin using the natural approach, (5) oral communication development through acquisition activities, (6) additional sources of input for acquisition, and (7) testing and classroom management. Curriculum organization, classroom activities, management of classroom activities, the role of reading in the natural approach, homework, vocabulary, and error correction are also discussed.

ED221026 The Role of Grammar in a Communicative Approach to Second Language Teaching and Testing. Swain, Merrill; Canale, Michael 1982 A review of literature on communicative competence reveals many meanings of the concept and of the way it should be used in second and foreign language instruction. In the literature there are two views: that communicative competence includes grammatical competence, and that it does not; or at least the ability to communicate one's meaning is secondary to the appropriateness or grammaticalness of the utterance. The latter view has three bases, all related to first language learning. These bases are: (1) children focus more on being understood than on speaking grammatically, (2) full grammatical competence will come at a later stage, and (3) language learning is more effective when it involves real communicative acts. An examination of theoretical arguments and limited empirical evidence regarding transposing these three bases to second language learning suggests the need for a framework specific to second language learning. This framework would include grammatical, sociolinguistic, and strategic competence. The latter includes strategies of two main types, those related primarily to grammatical competence, and those related to sociolinguistic competence. While there is little research to support the view, it is proposed that a functional approach is better than one with a grammatical base.

ED207339 Evaluating Contemporary Language-Teaching Methodologies through Historical Perspective. Madsen, Harold S.; Bowen, J. Donald 1981 The comparative study of foreign language teaching methodologies benefits from an overview of the history of foreign language instruction, which begins with Roman youths learning Greek and, later, the classical form of Latin. In the Middle Ages and Renaissance, notable figures such as Erasmus and Montaigne espoused highly intensive though relatively unsystematic methods, though it was in this period that the love for the discipline of a grammatical system brought the grammar translation method into favor. By the end of the 19th century, the Natural and Phonetic Methods, reactions to grammar translation, had spawned the oral-aural Direct Method. The eclectic and thoroughly worked-out views of a figure such as Harold E. Palmer (1877-1949) sound remarkably modern. The recurring ideas of contemporary methodologies are also recurring ideas of history (e.g., starting instruction at an early age). Both the success of the audiolingual approach and the views of its contemporary detractors can be understood through historical perspective. The many innovative methods currently in use (including Total Physical Response and the Silent Way) also owe their distinctive appeal to one or more time-honored principles of foreign language instruction.

EJ265801 Method: Approach, Design, and Procedure. Richards, Jack C.; Rodgers, Ted TESOL Quarterly, v16 n2 p153-68 Jun 1982 Offers a model which can be used to describe any given second language teaching method, as well as analyze different methods for their internal adequacy, similarities, and differences.

EJ253831 The Dartmouth-Rassias Model of Teaching Foreign Languages. Stansfield, Charles; Hornor, Jeanne ADFL Bulletin, v12 n4 p23-27 May 1981 Describes the Dartmouth-Rassias language instruction model emphasizing its reliance on audiolingual techniques and on the intensive approach. Discusses classroom techniques, unique teacher selection methods, and importance of teacher attitudes. Reviews the results achieved by this model and expresses the hope that it will receive more attention from the teaching profession.

EJ251125 Directions for Change in an Audio-Lingual Approach. Knop, Constance K. Canadian Modern Language Review, v37 n4 p724-38 May 1981 Examines directions suggested by studies in communicative competence, cognitive mapping of learning styles, and classroom interaction, seeking integration with a basically audiolingual approach to second language instruction. Suggests alternate ways of conducting classes to help teachers meet students' individual learning needs while still using activities set up in audiolingual methodology.

ED205037 Teaching Foreign-Language Skills. Second Edition. Rivers, Wilga M. 1981 ISBN: 0-226-72098-5 (cloth); 0-226-72907-7 (paper) Availability: University of Chicago Press, 5801 S. Ellis Ave., Chicago, IL 60637 This second edition is a complete reworking of the 1968 text to include later views of language learning and teaching, and theories of linguistics and psychology. The text is intended particularly for use in methods classes in conjunction with observation of experienced foreign language teachers. The early chapters deal with general principles such as objectives and methods of language teaching, and theories of language and language learning. Subsequent chapters address practical matters related to the language class. These concerns are: (1) structured practice, (2) teaching sounds, (3) listening comprehension, (4) learning the fundamentals of the speaking skill, (5) various approaches to teaching communicative skills, (6) reading skills, (7) writing skills, (8) cultural understanding, (9) principles and techniques of testing, and (10) technology and language learning centers. The final chapter deals with early language learning, elementary school foreign language, languages for special purposes, vocabulary learning, and matters related to lesson planning and classroom management.

ED203667 Approach, Design and Procedure: Their Role in Methodology. Richards, Jack C.; Rodgers, Ted 1980 Three interrelated pedagogical elements—approach, design, and procedure—are basic in a discussion of language teaching. Approach defines those foundational assumptions, beliefs, and theories about the nature of language and language learning. Design specifies the relationships of theories to both the form and use of instructional materials. Procedure comprises classroom techniques and practices consequent upon particular approaches and designs. The discussion is in three parts: (1) some basic questions about structural-behavioral, functional, and interactional theories underlying particular pedagogical philosophies; (2) the relationship between theories in a particular approach and a design for language teaching, which would include a specification of the content, learner and teacher roles, and types and functions of instructional materials; and (3) procedural questions focused on the actual class techniques, practices, and activities operative in teaching and learning a language, and the relation between them and linguistic theory and learning models. It is concluded that an instructional system must be crafted to move from approach to design to procedure. In this way the study of methodology in applied linguistics assumes a significant role.

EJ242479 Some Pre-Methodological Considerations in Foreign-Language Teaching. Higgs, Theodore V. Canadian Modern Language Review, v37 n2 p309-19 Jan 1981 Combines studies in cognitive psychology and language acquisition with observations of pedagogical materials and student performance to analyze foreign-language teaching from the perspective of what students and teachers need to understand about language learning and language before meaningful debate over methodology can be undertaken.

Language Teaching Methodology and Classroom Research

Essays marked with a * received a distinction

  • * Analyzing teacher's questioning strategies, feedback and learners' outcomes Mohammad Umar Farooq
  • The Culture of Learning and the Good Teacher in Japan: An Analysis of Student Views Gregory S. Hadley
  • * An Interaction Analysis: A teacher's questions, feedback, and students' production through classroom observation Fumiko Yamazaki
  • Silence in classroom interaction Fuyuko Kato
  • Classroom Interaction in a Korean University English Language Class : Yvette Murdoch
  • * The power distance dimension and methodology : Magdalena Polak
  • * In defense of 'PPP' : Marian Dawson
  • Teacher questioning, modification and feedback behaviours and their implications for learner production: an action research case study : Paul Moritoshi
  • Writing Improvement in a 4th year EFL classes: Limits and Possibilities : Wolfgang Petter
  • The Effects of Uncertainty Avoidance on Interaction in the Classroom : Andrew Atkins
  • Consciousness-raising versus deductive approaches to language instruction: a study of learner preferences : James M. Ranalli
  • Problems in teaching English to Japanese students revealed by using a tally sheet and a short ethnographic-style commentary : Fumie Takakubo
  • * Student Difficulties Writing in English: Suggested Strategies to help, and their Potential Beneficial ‘side-effects’: William Penny
  • Uncertainty Avoidance and its Influence within the EFL Classroom Theron Muller
  • Exploring Teachers' Questions and Feedback : Christoph Suter
  • * Questioning and Feedback in the Interactive Classroom: Exploring Strategies Christiane Oberli
  • Process writing David Dawson
  • * Action research investigating the amount of teacher talk in my classroom : Thomas Warren-Price
  • * Turn-taking strategies used by native English and Japanese speakers: a limited, comparative study including tentative pedagogical implications for teaching English to Japanese students : Philip Shigeo Brown
  • Uncertainty Avoidance and Classroom Interaction: Implications for Language Teaching Mary Umemoto
  • Uncertainty Avoidance in Japan : Andrew Rolnick
  • * Type Token Ratios in One Teacher's Classroom Talk: An Investigation of Lexical Complexity Dax Thomas
  • * The Cultural Influence of 'Power Distance' in Language Learning Johnny Mendoza Govea
  • Power Distance in the EFL/ESL classroom   Sharon Ishizaki
  • Classroom Interaction Affected by Power Distance Michiko Kasuya
  • Learner Training in the Context of a Private Conversation School , Erin Peter Kourelis
  • The diary as a window to my classroom , Ryan Moulton
  • * The Effects of Uncertainty Avoidance in EFL Learning Situations , Andrew Lawson
  • * The Process Approach to Student Writing , Anabela Reis Alves
  • * Uncertainty Avoidance in a Japanese High School , Julian Pigott
  • * Teaching from the Orchestra: Cultural Values and Dimensions of Power Within Role Relations of an EFL Classroom , Michael Post
  • * Feedback: A Self-Observation Analysis , D Ashley Stockdale
  • * Observations of Second Language Teaching Strategies and Uncertainty Avoidance in a South Korean ESL University Classroom , Steven James Kurowski
  • Process Approach to Writing , Deborah Grossmann
  • * Implementing task-based language teaching in a Japanese EFL context , Paul Dickinson
  • * Evaluating the appropriateness of adopting a CLT approach in an English conversation classroom in Japan , Paul Raine
  • * Reactive Tokens at Turning Points , Joel Baker
  • Implementing Task-Based Language Teaching in Korean , Aja Dailey
  • * Teaching grammar with authentic material - advantages and disadvantages of a deductive and a consciousness-raising approach , Isabella Seeger
  • * Using a process approach to help student writing based on extracts of their work , Benet Vincent
  • * An Analysis of Questioning and Feedback Strategies Using the IRF Framework , Joshua Durey
  • Examining Teacher Talk in a Japanese Senior High School Oral Communication Class , Alex Small
  • * Increasing students' L2 usage: an analysis of teacher talk time and student talk time Matthew J. Davies
  • The Process Approach Baljinder Gosal
  • Teacher Talking Time: Analyzing my own classroom Paulo Pita
  • * Exploring TBLT in a Japanese EFL/ESP Context Daniel Hougham
  • * Professional Development Through Individual Diary Writing Sarah Jones
  • The Importance of Speech and other Techniques that Work Staci-Ann Ali
  • Improving EFL Writing through the Process Approach Elsa Fenanda Gonzalez
  • The main differences between an inductive and a deductive approach to grammar teaching, and the possibility of a combined approach Pelin Simit
  • * Problems in Adopting CLT in a Rural Korean Primary School Jonas Robertson
  • Adapting Teaching to Improve Listening Instruction for a Business English Class in Japan Leon Townsend-Cartwright
  • * The Process Approach to Writing Remediation Cynthia Ong
  • *  An action research investigation into the effectiveness of a teacher's questioning and feedback strategies during a 40-minute low-level young learner EFL class in South Korea   Chris Brady
  • Strategies Employed for Teaching Language Learners Mehboobkhan Ismail
  • The Advantages and Disadvantages of Using a Task-Based Approach in South Korean Hagwons Michael Alpaugh
  • *  Corpus Approaches to Language Ideology: A Methodology Under the Microscope  Nikolos Peyralans

Revisiting translation as a method in language teaching and learning


Yuan Ping, Revisiting translation as a method in language teaching and learning, ELT Journal, 2024, ccae022, https://doi.org/10.1093/elt/ccae022


This article begins by defining translation and discussing its use in language education. It then presents a case study of an EFL listening and speaking course taught at a Chinese university that incorporates communicative translation activities, including vocabulary exercises, group presentations, and role-playing. The article concludes by reflecting on the experience of teaching the course and the lessons learnt. It aims to foster a deeper understanding of the relationship between translation and language teaching/learning and to suggest practical applications of this understanding for teachers in diverse educational settings.

  • Online ISSN 1477-4526
  • Print ISSN 0951-0893
  • Copyright © 2024 Oxford University Press

  • Open access
  • Published: 03 June 2024

Applying large language models for automated essay scoring for non-native Japanese

Wenchao Li & Haitao Liu

Humanities and Social Sciences Communications, volume 11, Article number: 723 (2024)


  • Language and linguistics

Recent advancements in artificial intelligence (AI) have led to an increased use of large language models (LLMs) for language assessment tasks such as automated essay scoring (AES), automated listening tests, and automated oral proficiency assessments. The application of LLMs for AES in the context of non-native Japanese, however, remains limited. This study explores the potential of LLM-based AES by comparing the efficiency of different models, i.e. two conventional machine training technology-based methods (Jess and JWriter), two LLMs (GPT and BERT), and one Japanese local LLM (Open-Calm large model). To conduct the evaluation, a dataset consisting of 1400 story-writing scripts authored by learners with 12 different first languages was used. Statistical analysis revealed that GPT-4 outperforms Jess and JWriter, BERT, and the Japanese language-specific trained Open-Calm large model in terms of annotation accuracy and predicting learning levels. Furthermore, by comparing 18 different models that utilize various prompts, the study emphasized the significance of prompts in achieving accurate and reliable evaluations using LLMs.


Conventional machine learning technology in AES

AES has grown significantly with the advancement of machine learning technologies in recent decades. In the earlier stages of AES development, conventional machine learning-based approaches were commonly used. These approaches involved two steps: (a) a dataset of essays is fed to the machine learning system, where it serves as the basis for training the model and establishing patterns and correlations between linguistic features and human ratings; (b) the model is trained using linguistic features that best represent human ratings and can effectively discriminate learners’ writing proficiency. These features include lexical richness (Lu, 2012; Kyle and Crossley, 2015; Kyle et al. 2021), syntactic complexity (Lu, 2010; Liu, 2008), and text cohesion (Crossley and McNamara, 2016), among others. Conventional machine learning approaches in AES require human intervention, such as manual correction and annotation of essays, to create a labeled dataset for training the model. Several AES systems have been developed using conventional machine learning technologies. These include the Intelligent Essay Assessor (Landauer et al. 2003), the e-rater engine by Educational Testing Service (Attali and Burstein, 2006; Burstein, 2003), MyAccess with the IntelliMetric scoring engine by Vantage Learning (Elliot, 2003), and the Bayesian Essay Test Scoring system (Rudner and Liang, 2002). These systems have played a significant role in automating the essay scoring process and providing quick and consistent feedback to learners. However, as touched upon earlier, conventional machine learning approaches rely on predetermined linguistic features and often require manual intervention, making them less flexible and potentially limiting their generalizability to different contexts.

In the context of the Japanese language, conventional machine learning-based AES tools include Jess (Ishioka and Kameda, 2006) and JWriter (Lee and Hasebe, 2017). Jess assesses essays by deducting points from the perfect score, utilizing the Mainichi Daily News newspaper as a database. The evaluation criteria employed by Jess encompass various aspects, such as rhetorical elements (e.g., reading comprehension, vocabulary diversity, percentage of complex words, and percentage of passive sentences), organizational structures (e.g., forward and reverse connection structures), and content analysis (e.g., latent semantic indexing). JWriter employs linear regression analysis to assign weights to various measurement indices, such as average sentence length and total number of characters. These weights are then combined to derive the overall score. A pilot study involving the Jess model was conducted on 1320 essays at different proficiency levels, including primary, intermediate, and advanced. However, the results indicated that the Jess model failed to significantly distinguish between these essay levels. Out of the 16 measures used, four measures, namely median sentence length, median clause length, median number of phrases, and maximum number of phrases, did not show statistically significant differences between the levels. Additionally, two measures exhibited between-level differences but lacked linear progression: the number of attributive declined words and the Kanji/kana ratio. On the other hand, the remaining measures, including maximum sentence length, maximum clause length, number of attributive conjugated words, maximum number of consecutive infinitive forms, maximum number of conjunctive-particle clauses, k characteristic value, percentage of big words, and percentage of passive sentences, demonstrated statistically significant between-level differences and displayed linear progression.
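The weighted combination described for JWriter can be sketched as a plain linear model. This is only an illustration of the idea: the function name, the index names, the weights, and the intercept below are hypothetical and are not JWriter's actual regression equation.

```python
def weighted_score(indices, weights, intercept=0.0):
    """Combine measurement indices into one score, as a fitted linear model would."""
    missing = set(weights) - set(indices)
    if missing:
        raise KeyError(f"missing indices: {sorted(missing)}")
    # Score = intercept + sum of (weight * index value) over all weighted indices.
    return intercept + sum(weights[name] * indices[name] for name in weights)


# Hypothetical weights and intercept, for illustration only (not JWriter's values).
WEIGHTS = {"avg_sentence_length": 0.5, "total_characters": 0.01}
essay = {"avg_sentence_length": 20.0, "total_characters": 400}
score = weighted_score(essay, WEIGHTS, intercept=1.0)
```

Because every weight is fixed and visible, a learner who knows the equation can inflate a score by padding the features it rewards.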

Both Jess and JWriter exhibit notable limitations, including the manual selection of feature parameters and weights, which can introduce biases into the scoring process. The reliance on human annotators to label non-native language essays also introduces potential noise and variability in the scoring. Furthermore, an important concern is the possibility of system manipulation and cheating by learners who are aware of the regression equation utilized by the models (Hirao et al. 2020). These limitations emphasize the need for further advancements in AES systems to address these challenges.

Deep learning technology in AES

Deep learning has emerged as one of the approaches for improving the accuracy and effectiveness of AES. Deep learning-based AES methods utilize artificial neural networks that mimic the human brain’s functioning through layered algorithms and computational units. Unlike conventional machine learning, deep learning autonomously learns from the environment and past errors without human intervention. This enables deep learning models to establish nonlinear correlations, resulting in higher accuracy. Recent advancements in deep learning have led to the development of transformers, which are particularly effective in learning text representations. Noteworthy examples include bidirectional encoder representations from transformers (BERT) (Devlin et al. 2019) and the generative pretrained transformer (GPT) (OpenAI).

BERT is a language representation model that utilizes a transformer architecture and is trained on two tasks: masked language modeling and next-sentence prediction (Hirao et al. 2020; Vaswani et al. 2017). In the context of AES, BERT follows specific procedures, as illustrated in Fig. 1: (a) the tokenized prompts and essays are taken as input; (b) special tokens, such as [CLS] and [SEP], are added to mark the beginning and separation of prompts and essays; (c) the transformer encoder processes the prompt and essay sequences, resulting in hidden layer sequences; (d) the hidden layers corresponding to the [CLS] tokens (T[CLS]) represent distributed representations of the prompts and essays; and (e) a multilayer perceptron uses these distributed representations as input to obtain the final score (Hirao et al. 2020).
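Steps (a) and (b) can be sketched without any model code. The sketch assumes the prompt and essay are already tokenized into lists of strings and uses the conventional BERT sentence-pair layout with a 512-token limit; the function name and the truncation policy (cutting the essay, never the prompt) are illustrative assumptions, not the procedure of Hirao et al. Steps (c) to (e) need a trained encoder and are not shown.

```python
def build_bert_input(prompt_tokens, essay_tokens, max_len=512):
    """Lay out a prompt/essay pair as [CLS] prompt [SEP] essay [SEP]."""
    # Three positions are reserved for the special tokens.
    budget = max_len - len(prompt_tokens) - 3
    if budget < 0:
        raise ValueError("prompt alone exceeds max_len")
    # Assumption: over-long essays are simply truncated to fit the limit.
    essay_tokens = essay_tokens[:budget]
    tokens = ["[CLS]"] + prompt_tokens + ["[SEP]"] + essay_tokens + ["[SEP]"]
    # Segment ids tell the encoder which tokens belong to the prompt (0) or essay (1).
    segment_ids = [0] * (len(prompt_tokens) + 2) + [1] * (len(essay_tokens) + 1)
    return tokens, segment_ids


tokens, segment_ids = build_bert_input(["describe", "picnic"], ["ken", "made", "lunch"])
```

The hidden state at the position of `[CLS]` (index 0 of `tokens`) is what a scoring head would read in steps (d) and (e).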

Fig. 1: AES system with BERT (Hirao et al. 2020).

Training BERT on a substantial amount of sentence data through the Masked Language Model (MLM) allows it to capture contextual information within the hidden layers. Consequently, BERT is expected to be capable of identifying artificial essays as invalid and assigning them lower scores (Mizumoto and Eguchi, 2023). In the context of AES for nonnative Japanese learners, Hirao et al. (2020) combined the long short-term memory (LSTM) model proposed by Hochreiter and Schmidhuber (1997) with BERT to develop a tailored automated essay scoring system. The findings of their study revealed that the BERT model outperformed both the conventional machine learning approach utilizing character-type features such as “kanji” and “hiragana” and the standalone LSTM model. Takeuchi et al. (2021) presented an approach to Japanese AES that eliminates the requirement for pre-scored essays by relying solely on reference texts or a model answer for the essay task. They investigated multiple similarity evaluation methods, including frequency of morphemes, idf values calculated on Wikipedia, LSI, LDA, word-embedding vectors, and document vectors produced by BERT. The experimental findings revealed that the method utilizing the frequency of morphemes with idf values exhibited the strongest correlation with human-annotated scores across different essay tasks. The utilization of BERT in AES encounters several limitations. First, essays often exceed the model’s maximum length limit. Second, only score labels are available for training, which restricts access to additional information.
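The idea of comparing an essay with a reference text through idf-weighted morpheme frequencies can be sketched as follows. This is a generic illustration, not the implementation of Takeuchi et al.: it assumes texts are already segmented into morphemes, uses a tiny toy corpus in place of Wikipedia, and picks one common smoothed idf formula and cosine similarity as the comparison measure. All function names are hypothetical.

```python
import math
from collections import Counter


def idf_table(corpus):
    """Smoothed inverse document frequency for every morpheme in the corpus."""
    n_docs = len(corpus)
    doc_freq = Counter(m for doc in corpus for m in set(doc))
    return {m: math.log((1 + n_docs) / (1 + df)) + 1.0 for m, df in doc_freq.items()}


def weighted_vector(morphemes, idf):
    """Morpheme frequency times idf; unseen morphemes get the highest known idf."""
    default = max(idf.values())
    return {m: c * idf.get(m, default) for m, c in Counter(morphemes).items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(m, 0.0) for m, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


# Toy, pre-segmented data standing in for a large corpus and real essays.
corpus = [["犬", "が", "走る"], ["猫", "が", "寝る"], ["犬", "が", "寝る"]]
idf = idf_table(corpus)
reference = weighted_vector(["犬", "が", "走る"], idf)
sim_close = cosine(reference, weighted_vector(["犬", "が", "走る"], idf))
sim_far = cosine(reference, weighted_vector(["猫", "が", "寝る"], idf))
```

Because the particle が occurs in every document, it receives the lowest idf weight, so the two unrelated sentences still score well below the matching one.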

Mizumoto and Eguchi (2023) were pioneers in employing the GPT model for AES in non-native English writing. Their study focused on evaluating the accuracy and reliability of AES using the GPT-3 text-davinci-003 model, analyzing a dataset of 12,100 essays from the corpus of nonnative written English (TOEFL11). The findings indicated that AES utilizing the GPT-3 model exhibited a certain degree of accuracy and reliability. They suggest that GPT-3-based AES systems hold the potential to provide support for human ratings. However, applying a GPT model to AES presents a unique natural language processing (NLP) task that involves considerations such as nonnative language proficiency, the influence of the learner’s first language on the output in the target language, and identifying linguistic features that best indicate writing quality in a specific language. These linguistic features may differ morphologically or syntactically from those present in the learners’ first language, as observed in (1)–(3).

(1) 我-送了-他-一本-书
    Wǒ-sòngle-tā-yī běn-shū
    1sg-give.past-him-one.cl-book
    “I gave him a book.”

(2) Agglutinative
    彼-に-本-を-あげ-まし-た
    Kare-ni-hon-o-age-mashi-ta
    3sg-dat-book-acc-give.honorification.past

(3) Inflectional
    give, give-s, gave, given, giving

Additionally, the morphological agglutination and subject-object-verb (SOV) order in Japanese, along with its idiomatic expressions, pose additional challenges for applying language models in AES tasks (4).

(4) 足-が 棒-に なり-ました
    Ashi-ga bō-ni nari-mashita
    leg-nom stick-dat become-past
    “My leg became like a stick (I am extremely tired).”

Example (4) demonstrates the morpho-syntactic structure of Japanese and the presence of an idiomatic expression. In this sentence, the verb “なる” (naru), meaning “to become”, appears at the end of the sentence. The verb stem “なり” (nari) is followed by morphemes indicating honorification (“ます”, masu) and tense (“た”, ta), showcasing agglutination. While the sentence can be literally translated as “my leg became like a stick”, it carries an idiomatic interpretation that implies “I am extremely tired”.

To overcome this issue, CyberAgent Inc. (2023) has developed the Open-Calm series of language models specifically designed for Japanese. Open-Calm consists of pre-trained models available in various sizes, such as Small, Medium, Large, and 7b. Figure 2 depicts the fundamental structure of the Open-Calm model. A key feature of this architecture is the incorporation of the LoRA adapter and GPT-NeoX frameworks, which can enhance its language processing capabilities.

Fig. 2: GPT-NeoX model architecture (Okgetheng and Takeuchi 2024).

A recent study by Okgetheng and Takeuchi (2024) assessed the efficacy of Open-Calm language models in grading Japanese essays. The research utilized a dataset of approximately 300 essays, which were annotated by native Japanese educators. The findings demonstrate the considerable potential of Open-Calm language models in automated Japanese essay scoring. Specifically, among the Open-Calm family, the Open-Calm Large model (referred to as OCLL) exhibited the highest performance. However, at the time of writing, the Open-Calm Large model does not offer public access to its server. Consequently, users must deploy and operate the environment for OCLL themselves, which requires a PC equipped with an NVIDIA GeForce RTX 3060 (8 or 12 GB VRAM).

In summary, while the potential of LLMs in automated scoring of nonnative Japanese essays has been demonstrated in two studies, BERT-driven AES (Hirao et al. 2020) and OCLL-based AES (Okgetheng and Takeuchi, 2024), the number of research efforts in this area remains limited.

Another significant challenge in applying LLMs to AES lies in prompt engineering and ensuring its reliability and effectiveness (Brown et al. 2020; Rae et al. 2021; Zhang et al. 2021). Various prompting strategies have been proposed, such as the zero-shot chain of thought (CoT) approach (Kojima et al. 2022), which involves manually crafting diverse and effective examples. However, manual efforts can lead to mistakes. To address this, Zhang et al. (2021) introduced an automatic CoT prompting method called Auto-CoT, which demonstrates matching or superior performance compared to the CoT paradigm. Another prompt framework is trees of thoughts, enabling a model to self-evaluate its progress at intermediate stages of problem-solving through deliberate reasoning (Yao et al. 2023).

Beyond linguistic studies, there has been a noticeable increase in the number of foreign workers in Japan and Japanese learners worldwide (Ministry of Health, Labor, and Welfare of Japan, 2022; Japan Foundation, 2021). However, existing assessment methods, such as the Japanese Language Proficiency Test (JLPT), J-CAT, and TTBJ (footnote 1), primarily focus on reading, listening, vocabulary, and grammar skills, neglecting the evaluation of writing proficiency. As the number of workers and language learners continues to grow, there is a rising demand for an efficient AES system that can reduce costs and time for raters and be utilized for employment, examinations, and self-study purposes.

This study aims to explore the potential of LLM-based AES by comparing the effectiveness of five models: two LLMs (GPT (footnote 2) and BERT), one Japanese local LLM (OCLL), and two conventional machine learning-based methods (the linguistic feature-based scoring tools Jess and JWriter).

The research questions addressed in this study are as follows:

To what extent do the LLM-driven AES and linguistic feature-based AES, when used as automated tools to support human rating, accurately reflect test takers’ actual performance?

What influence does the prompt have on the accuracy and performance of LLM-based AES methods?

The subsequent sections of the manuscript cover the methodology, including the assessment measures for nonnative Japanese writing proficiency, criteria for prompts, and the dataset. The evaluation section focuses on the analysis of annotations and rating scores generated by LLM-driven and linguistic feature-based AES methods.

Methodology

The dataset utilized in this study was obtained from the International Corpus of Japanese as a Second Language (I-JAS) Footnote 3 . This corpus consisted of 1000 participants who represented 12 different first languages. For the study, the participants were given a story-writing task on a personal computer. They were required to write two stories based on the 4-panel illustrations titled “Picnic” and “The key” (see Appendix A). Background information for the participants was provided by the corpus, including their Japanese language proficiency levels assessed through two online tests: J-CAT and SPOT. These tests evaluated their reading, listening, vocabulary, and grammar abilities. The learners’ proficiency levels were categorized into six levels aligned with the Common European Framework of Reference for Languages (CEFR) and the Reference Framework for Japanese Language Education (RFJLE): A1, A2, B1, B2, C1, and C2. According to Lee et al. ( 2015 ), there is a high level of agreement (r = 0.86) between the J-CAT and SPOT assessments, indicating that the proficiency certifications provided by J-CAT are consistent with those of SPOT. However, it is important to note that the scores of J-CAT and SPOT do not have a one-to-one correspondence. In this study, the J-CAT scores were used as a benchmark to differentiate learners of different proficiency levels. A total of 1400 essays were utilized, representing the beginner (aligned with A1), A2, B1, B2, C1, and C2 levels based on the J-CAT scores. Table 1 provides information about the learners’ proficiency levels and their corresponding J-CAT and SPOT scores.

A dataset comprising a total of 1400 essays from the story writing tasks was collected. Among these, 714 essays were utilized to evaluate the reliability of the LLM-based AES method, while the remaining 686 essays were designated as development data to assess the LLM-based AES’s capability to distinguish participants with varying proficiency levels. The GPT 4 API was used in this study. A detailed explanation of the prompt-assessment criteria is provided in Section Prompt . All essays were sent to the model for measurement and scoring.

Measures of writing proficiency for nonnative Japanese

Japanese exhibits a morphologically agglutinative structure where morphemes are attached to the word stem to convey grammatical functions such as tense, aspect, voice, and honorifics, e.g. (5).

食べ-させ-られ-まし-た-か

tabe-sase-rare-mashi-ta-ka

[eat (stem)-causative-passive voice-honorification-tense. past-question marker]

Japanese employs nine case particles to indicate grammatical functions, including the nominative case particle が (ga), the accusative case particle を (o), the genitive case particle の (no), the dative case particle に (ni), the locative/instrumental case particle で (de), the ablative case particle から (kara), the directional case particle へ (e), and the comitative case particle と (to). The agglutinative nature of the language, combined with the case particle system, provides an efficient means of distinguishing between active and passive voice, either through morphemes or case particles, e.g. 食べる taberu “eat (conclusive)” (active voice); 食べられる taberareru “eat (conclusive)” (passive voice). In the active voice, “パン を 食べる” (pan o taberu) translates to “to eat bread”. In the passive voice, it becomes “パン が 食べられた” (pan ga taberareta), which means “(the) bread was eaten”. Additionally, different conjugations of the same lemma are counted as one type in order to ensure a comprehensive assessment of the language features. For example, 食べる taberu “eat (conclusive)”, 食べている tabeteiru “eat (progressive)”, and 食べた tabeta “eat (past)” are counted as one type.

To incorporate these features, previous research (Suzuki, 1999 ; Watanabe et al. 1988 ; Ishioka, 2001 ; Ishioka and Kameda, 2006 ; Hirao et al. 2020 ) has identified complexity, fluency, and accuracy as crucial factors for evaluating writing quality. These criteria are assessed through various aspects, including lexical richness (lexical density, diversity, and sophistication), syntactic complexity, and cohesion (Kyle et al. 2021 ; Mizumoto and Eguchi, 2023 ; Ure, 1971 ; Halliday, 1985 ; Barkaoui and Hadidi, 2020 ; Zenker and Kyle, 2021 ; Kim et al. 2018 ; Lu, 2017 ; Ortega, 2015 ). Therefore, this study proposes five scoring categories: lexical richness, syntactic complexity, cohesion, content elaboration, and grammatical accuracy. A total of 16 measures were employed to capture these categories. The calculation process and specific details of these measures can be found in Table 2 .

The T-unit, first introduced by Hunt (1966), is a measure used for evaluating speech and composition. It serves as an indicator of syntactic development and represents the shortest units into which a piece of discourse can be divided without leaving any sentence fragments. In the context of Japanese language assessment, Sakoda and Hosoi (2020) utilized the T-unit as the basic unit to assess the accuracy and complexity of Japanese learners’ speaking and storytelling. The calculation of T-units in Japanese follows these principles:

A single main clause constitutes 1 T-unit, regardless of the presence or absence of dependent clauses, e.g. (6).

ケンとマリはピクニックに行きました (main clause): 1 T-unit.

If a sentence contains a main clause along with subclauses, each subclause is considered part of the same T-unit, e.g. (7).

天気が良かったので (subclause)、ケンとマリはピクニックに行きました (main clause): 1 T-unit.

In the case of coordinate clauses, where multiple clauses are connected, each coordinated clause is counted separately. Thus, a sentence with coordinate clauses may have 2 T-units or more, e.g. (8).

ケンは地図で場所を探して (coordinate clause)、マリはサンドイッチを作りました (coordinate clause): 2 T-units.

Lexical diversity refers to the range of words used within a text (Engber, 1995 ; Kyle et al. 2021 ) and is considered a useful measure of the breadth of vocabulary in L n production (Jarvis, 2013a , 2013b ).

The type/token ratio (TTR) is widely recognized as a straightforward measure for calculating lexical diversity and has been employed in numerous studies. These studies have demonstrated a strong correlation between TTR and other methods of measuring lexical diversity (e.g., Bentz et al. 2016 ; Čech and Miroslav, 2018 ; Çöltekin and Taraka, 2018 ). TTR is computed by considering both the number of unique words (types) and the total number of words (tokens) in a given text. Given that the length of learners’ writing texts can vary, this study employs the moving average type-token ratio (MATTR) to mitigate the influence of text length. MATTR is calculated using a 50-word moving window. Initially, a TTR is determined for words 1–50 in an essay, followed by words 2–51, 3–52, and so on until the end of the essay is reached (Díez-Ortega and Kyle, 2023 ). The final MATTR scores were obtained by averaging the TTR scores for all 50-word windows. The following formula was employed to derive MATTR:

\({\rm{MATTR}}({\rm{W}})=\frac{{\sum }_{{\rm{i}}=1}^{{\rm{N}}-{\rm{W}}+1}{{\rm{F}}}_{{\rm{i}}}}{{\rm{W}}({\rm{N}}-{\rm{W}}+1)}\)

Here, N refers to the number of tokens in the essay. W is the window size in tokens (W < N; 50 in this study). \({F}_{i}\) is the number of types in the \({i}^{{th}}\) window. The \({\rm{MATTR}}({\rm{W}})\) is the mean of a series of type-token ratios (TTRs) based on the word form for all windows. It is expected that individuals with higher language proficiency will produce texts with greater lexical diversity, as indicated by higher MATTR scores.
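As an illustration of the MATTR formula above, the following minimal Python sketch counts the types in every W-token window and divides the sum by W(N − W + 1). It assumes the essay has already been tokenized and lemmatized, so that conjugations of one lemma count as a single type. The function name `mattr` and the fallback to a plain TTR for texts shorter than the window are illustrative assumptions, not part of the study's pipeline.

```python
def mattr(tokens, window=50):
    """Moving-average type-token ratio over all windows of `window` tokens.

    `tokens` is assumed to be a list of lemmatized word forms.
    """
    n = len(tokens)
    if n == 0:
        raise ValueError("tokens must not be empty")
    if n < window:
        # Assumption: for texts shorter than the window, fall back to plain TTR.
        return len(set(tokens)) / n
    # F_i: number of types in the i-th window (tokens i .. i + W - 1).
    type_counts = [len(set(tokens[i:i + window])) for i in range(n - window + 1)]
    # MATTR(W) = sum(F_i) / (W * (N - W + 1))
    return sum(type_counts) / (window * (n - window + 1))


# Toy example with a small window so the arithmetic is easy to follow.
# Windows: [a, a] -> 1 type, [a, b] -> 2 types, [b, b] -> 1 type.
print(mattr(["a", "a", "b", "b"], window=2))  # (1 + 2 + 1) / (2 * 3)
```

Because every window has the same length, the score is less sensitive to overall essay length than a single TTR computed over the whole text.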

Lexical density was captured by the ratio of the number of lexical words to the total number of words (Lu, 2012). Lexical sophistication refers to the utilization of advanced vocabulary, often evaluated through word frequency indices (Crossley et al. 2013; Haberman, 2008; Kyle and Crossley, 2015; Laufer and Nation, 1995; Lu, 2012; Read, 2000). In the context of writing, lexical sophistication can be interpreted as vocabulary breadth, which entails the appropriate usage of vocabulary items across various lexico-grammatical contexts and registers (Garner et al. 2019; Kim et al. 2018; Kyle et al. 2018). In Japanese specifically, words are considered lexically sophisticated if they are not included in the “Japanese Education Vocabulary List Ver 1.0”. Footnote 4 Consequently, lexical sophistication was calculated by determining the number of sophisticated word types relative to the total number of words per essay. Furthermore, it has been suggested that, in Japanese writing, sentences should ideally have a length of no more than 40 to 50 characters, as this promotes readability. Therefore, the median and maximum sentence length can be considered useful indices for assessment (Ishioka and Kameda, 2006).

Syntactic complexity was assessed based on several measures, including the mean length of clauses, verb phrases per T-unit, clauses per T-unit, dependent clauses per T-unit, complex nominals per clause, adverbial clauses per clause, coordinate phrases per clause, and mean dependency distance (MDD). The MDD reflects the distance between the governor and dependent positions in a sentence. A larger dependency distance indicates a higher cognitive load and greater complexity in syntactic processing (Liu, 2008; Liu et al. 2017). The MDD has been established as an efficient metric for measuring syntactic complexity (Jiang, Quyang, and Liu, 2019; Li and Yan, 2021). To calculate the MDD, the position numbers of the governor and dependent are subtracted, assuming that words in a sentence are assigned in a linear order, such as W1 … Wi … Wn. In any dependency relationship between words Wa and Wb, Wa is the governor and Wb is the dependent. The MDD of the entire sentence was obtained by averaging the absolute values of the governor − dependent position differences over all dependency relations:

MDD = \(\frac{1}{n}{\sum }_{i=1}^{n}|{\rm{D}}{{\rm{D}}}_{i}|\)

In this formula, \(n\) represents the number of dependency relations in the sentence, and \({DD}_{i}\) is the dependency distance of the \({i}^{{th}}\) dependency relation. As an illustration, the sentence ‘Mary-ga-John-ni-keshigomu-o-watashita’ is annotated as [Mary-nom-John-dat-eraser-acc-give-past]. Its three dependents (Mary-ga, John-ni, keshigomu-o) lie 3, 2, and 1 positions from the governing verb watashita, so the sentence’s MDD is (3 + 2 + 1)/3 = 2. Table 3 provides the CSV file as a prompt for GPT 4.
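The MDD calculation can be sketched in a few lines of Python. The sketch assumes that a dependency parse is already available as a list of governor positions, and that each noun plus its case particle is treated as one unit, which is the segmentation under which the example sentence yields an MDD of 2. The function name and the input encoding are illustrative assumptions.

```python
def mean_dependency_distance(heads):
    """Mean absolute governor-dependent distance for one sentence.

    heads[i] is the 1-based position of the governor of unit i + 1,
    or 0 when that unit is the root (it has no governor).
    """
    distances = [
        abs(head - position)
        for position, head in enumerate(heads, start=1)
        if head != 0
    ]
    if not distances:
        raise ValueError("sentence has no dependency relations")
    return sum(distances) / len(distances)


# Mary-ga (1), John-ni (2), keshigomu-o (3) all depend on watashita (4, root).
heads = [4, 4, 4, 0]
print(mean_dependency_distance(heads))  # 2.0
```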

Cohesion (semantic similarity) and content elaboration aim to capture the ideas presented in test takers’ essays. Cohesion was assessed using three measures: Synonym overlap/paragraph (topic), Synonym overlap/paragraph (keywords), and word2vec cosine similarity. Content elaboration and development were measured as the number of metadiscourse markers (type)/number of words. To capture content closely, this study proposed a novel distance-based representation, encoding the cosine distance between the i-vectors of the learner’s essay and of the essay task (topic and keywords). The learner’s essay is decoded into a word sequence and aligned to the essay task’s topic and keywords for log-likelihood measurement. The cosine distance reveals the content elaboration score of the learner’s essay. The cosine similarity between target and reference vectors is shown in (11), where \(L=({L}_{1},\ldots ,{L}_{n})\) and \(N=({N}_{1},\ldots ,{N}_{n})\) are the vectors representing the learner’s essay and the task’s topic and keywords, respectively. The content elaboration distance between L and N was calculated as follows:

\(\cos \left(\theta \right)=\frac{{\rm{L}}\,\cdot\, {\rm{N}}}{\left|{\rm{L}}\right|{\rm{|N|}}}=\frac{\mathop{\sum }\nolimits_{i=1}^{n}{L}_{i}{N}_{i}}{\sqrt{\mathop{\sum }\nolimits_{i=1}^{n}{L}_{i}^{2}}\sqrt{\mathop{\sum }\nolimits_{i=1}^{n}{N}_{i}^{2}}}\)

A high similarity value indicates a low difference between the two recognition outcomes, which in turn suggests a high level of proficiency in content elaboration.
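The cosine similarity in (11) can be computed directly from two equal-length vectors. The sketch below uses plain Python lists in place of the essay and task vectors. How those vectors are obtained (for example from word2vec embeddings) is outside its scope, and the example values are invented for illustration.

```python
import math


def cosine_similarity(l_vec, n_vec):
    """Cosine of the angle between the learner vector L and the task vector N."""
    if len(l_vec) != len(n_vec):
        raise ValueError("vectors must have the same length")
    dot = sum(l * n for l, n in zip(l_vec, n_vec))
    norm_l = math.sqrt(sum(l * l for l in l_vec))
    norm_n = math.sqrt(sum(n * n for n in n_vec))
    if norm_l == 0 or norm_n == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_l * norm_n)


# Hypothetical 3-dimensional vectors for an essay and its task description.
essay_vector = [0.2, 0.7, 0.1]
task_vector = [0.3, 0.6, 0.2]
print(cosine_similarity(essay_vector, task_vector))
```

Values close to 1 mean the two vectors point in nearly the same direction, and a value of 0 means they are orthogonal.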

To evaluate the effectiveness of the proposed measures in distinguishing different proficiency levels among nonnative Japanese speakers’ writing, we conducted a multi-faceted Rasch measurement analysis (Linacre, 1994 ). This approach applies measurement models to thoroughly analyze various factors that can influence test outcomes, including test takers’ proficiency, item difficulty, and rater severity, among others. The underlying principles and functionality of multi-faceted Rasch measurement are illustrated in (12).

\(\log \left(\frac{{P}_{{nijk}}}{{P}_{{nij}(k-1)}}\right)={B}_{n}-{D}_{i}-{C}_{j}-{F}_{k}\)

(12) defines the logarithm of the probability ratio \({P}_{nijk}/{P}_{nij(k-1)}\) as a function of multiple parameters. Here, n represents the test taker, i denotes a writing proficiency measure, j corresponds to the human rater, and k represents the proficiency score. The parameter \({B}_{n}\) signifies the proficiency level of test taker n (where n ranges from 1 to N). \({D}_{i}\) represents the difficulty parameter of test item i (where i ranges from 1 to L), while \({C}_{j}\) represents the severity of rater j (where j ranges from 1 to J). Additionally, \({F}_{k}\) represents the step difficulty for a test taker to move from score k − 1 to k. \({P}_{nijk}\) refers to the probability of rater j assigning score k to test taker n for test item i. \({P}_{nij(k-1)}\) represents the likelihood of test taker n being assigned score k − 1 by rater j for test item i. Each facet within the test is treated as an independent parameter and estimated within the same reference framework. To evaluate the consistency of scores obtained through both human and computer analysis, we utilized the Infit mean-square statistic. This statistic is a chi-square measure divided by the degrees of freedom and is weighted with information. It demonstrates higher sensitivity to unexpected patterns in responses to items near a person’s proficiency level (Linacre, 2002). Fit statistics are assessed based on predefined thresholds for acceptable fit. For the Infit MNSQ, which has a mean of 1.00, different thresholds have been suggested. Some propose stricter thresholds ranging from 0.7 to 1.3 (Bond et al. 2021), while others suggest more lenient thresholds ranging from 0.5 to 1.5 (Eckes, 2009). In this study, we adopted the criterion of 0.70–1.30 for the Infit MNSQ.
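To make the model in (12) concrete, the sketch below turns a set of facet parameters into score-category probabilities. Because (12) fixes the log-odds of adjacent categories, the probability of score k is proportional to the exponential of the cumulative sum of B_n − D_i − C_j − F_h for h = 1 … k, with the lowest category as the reference. The parameter values are invented for illustration and are not estimates from this study. Estimating the parameters from data requires dedicated Rasch software and is not shown.

```python
import math


def category_probabilities(b_n, d_i, c_j, step_difficulties):
    """Score-category probabilities under the many-facet Rasch model in (12).

    step_difficulties = [F_1, ..., F_K]; returns [P_0, ..., P_K], where P_0 is
    the probability of the lowest score category.
    """
    # Cumulative log-odds relative to the lowest category (which is fixed at 0).
    cumulative = [0.0]
    for f_k in step_difficulties:
        cumulative.append(cumulative[-1] + (b_n - d_i - c_j - f_k))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(value - peak) for value in cumulative]
    total = sum(weights)
    return [weight / total for weight in weights]


# Hypothetical values: able test taker, average item, slightly severe rater.
probs = category_probabilities(b_n=1.0, d_i=0.2, c_j=0.1, step_difficulties=[-0.5, 0.5])
print(probs)
# The adjacent-category log-odds reproduce B_n - D_i - C_j - F_k.
print(math.log(probs[1] / probs[0]))  # approximately 1.2
```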

Moving forward, we can now proceed to assess the effectiveness of the 16 proposed measures based on five criteria for accurately distinguishing various levels of writing proficiency among non-native Japanese speakers. To conduct this evaluation, we utilized the development dataset from the I-JAS corpus, as described in Section Dataset . Table 4 provides a measurement report that presents the performance details of the 14 metrics under consideration. The measure separation was found to be 4.02, indicating a clear differentiation among the measures. The reliability index for the measure separation was 0.891, suggesting consistency in the measurement. Similarly, the person separation reliability index was 0.802, indicating the accuracy of the assessment in distinguishing between individuals. All 16 measures demonstrated Infit mean squares within a reasonable range, ranging from 0.76 to 1.28. The Synonym overlap/paragraph (topic) measure exhibited a relatively high outfit mean square of 1.46, although the Infit mean square falls within an acceptable range. The standard error for the measures ranged from 0.13 to 0.28, indicating the precision of the estimates.

Table 5 further illustrated the weights assigned to different linguistic measures for score prediction, with higher weights indicating stronger correlations between those measures and higher scores. Specifically, the following measures exhibited higher weights compared to others: moving average type token ratio per essay has a weight of 0.0391. Mean dependency distance had a weight of 0.0388. Mean length of clause, calculated by dividing the number of words by the number of clauses, had a weight of 0.0374. Complex nominals per T-unit, calculated by dividing the number of complex nominals by the number of T-units, had a weight of 0.0379. Coordinate phrases rate, calculated by dividing the number of coordinate phrases by the number of clauses, had a weight of 0.0325. Grammatical error rate, representing the number of errors per essay, had a weight of 0.0322.

Criteria (output indicator)

The criteria used to evaluate the writing ability in this study were based on CEFR, which follows a six-point scale ranging from A1 to C2. To assess the quality of Japanese writing, the scoring criteria from Table 6 were utilized. These criteria were derived from the IELTS writing standards and served as assessment guidelines and prompts for the written output.

A prompt is a question or detailed instruction that is provided to the model to obtain a proper response. After several pilot experiments, we decided to provide the measures (Section Measures of writing proficiency for nonnative Japanese ) as the input prompt and use the criteria (Section Criteria (output indicator) ) as the output indicator. Regarding the prompt language, considering that the LLM was tasked with rating Japanese essays, would a prompt in Japanese work better Footnote 5 ? We conducted experiments comparing the performance of GPT-4 using both English and Japanese prompts. Additionally, we utilized the Japanese local model OCLL with Japanese prompts. Multiple trials were conducted using the same sample. Regardless of the prompt language used, we consistently obtained the same grading results with GPT-4, which assigned a grade of B1 to the writing sample. This suggested that GPT-4 is reliable and capable of producing consistent ratings regardless of the prompt language. On the other hand, when we used Japanese prompts with the Japanese local model “OCLL”, we encountered inconsistent grading results. Out of 10 attempts with OCLL, only 6 yielded consistent grading results (B1), while the remaining 4 showed different outcomes, including A1 and B2 grades. These findings indicated that the language of the prompt was not the determining factor for reliable AES. Instead, the size of the training data and the model parameters played crucial roles in achieving consistent and reliable AES results for the language model.

The following is the utilized prompt, which details all measures and requires the LLM to score the essays using holistic and trait scores.

Please evaluate Japanese essays written by Japanese learners and assign a score to each essay on a six-point scale, ranging from A1, A2, B1, B2, C1 to C2. Additionally, please provide trait scores and display the calculation process for each trait score. The scoring should be based on the following criteria:

Moving average type-token ratio.

Number of lexical words (token) divided by the total number of words per essay.

Number of sophisticated word types divided by the total number of words per essay.

Mean length of clause.

Verb phrases per T-unit.

Clauses per T-unit.

Dependent clauses per T-unit.

Complex nominals per clause.

Adverbial clauses per clause.

Coordinate phrases per clause.

Mean dependency distance.

Synonym overlap paragraph (topic and keywords).

Word2vec cosine similarity.

Connectives per essay.

Conjunctions per essay.

Number of metadiscourse markers (types) divided by the total number of words.

Number of errors per essay.

Japanese essay text

出かける前に二人が地図を見ている間に、サンドイッチを入れたバスケットに犬が入ってしまいました。それに気づかずに二人は楽しそうに出かけて行きました。やがて突然犬がバスケットから飛び出し、二人は驚きました。バスケットの中を見ると、食べ物はすべて犬に食べられていて、二人は困ってしまいました。(ID_JJJ01_SW1)

The score of the example above was B1. Figure 3 provides an example of holistic and trait scores provided by GPT-4 (with a prompt indicating all measures) via Bing Footnote 6 .

Figure 3: Example of GPT-4 AES and feedback (with a prompt indicating all measures).

Statistical analysis

The aim of this study is to investigate the potential use of LLM for nonnative Japanese AES. It seeks to compare the scoring outcomes obtained from feature-based AES tools, which rely on conventional machine learning technology (i.e. Jess, JWriter), with those generated by AI-driven AES tools utilizing deep learning technology (BERT, GPT, OCLL). To assess the reliability of a computer-assisted annotation tool, the study initially established human-human agreement as the benchmark measure. Subsequently, the performance of the LLM-based method was evaluated by comparing it to human-human agreement.

To assess annotation agreement, the study employed standard measures such as precision, recall, and F-score (Brants 2000 ; Lu 2010 ), along with the quadratically weighted kappa (QWK) to evaluate the consistency and agreement in the annotation process. Assume A and B represent human annotators. When comparing the annotations of the two annotators, the following results are obtained. The evaluation of precision, recall, and F-score metrics was illustrated in equations (13) to (15).

\({\rm{Recall}}(A,B)=\frac{{\rm{Number}}\,{\rm{of}}\,{\rm{identical}}\,{\rm{nodes}}\,{\rm{in}}\,A\,{\rm{and}}\,B}{{\rm{Number}}\,{\rm{of}}\,{\rm{nodes}}\,{\rm{in}}\,A}\)

\({\rm{Precision}}(A,\,B)=\frac{{\rm{Number}}\,{\rm{of}}\,{\rm{identical}}\,{\rm{nodes}}\,{\rm{in}}\,A\,{\rm{and}}\,B}{{\rm{Number}}\,{\rm{of}}\,{\rm{nodes}}\,{\rm{in}}\,B}\)

The F-score is the harmonic mean of recall and precision:

\({\rm{F}}-{\rm{score}}=\frac{2* ({\rm{Precision}}* {\rm{Recall}})}{{\rm{Precision}}+{\rm{Recall}}}\)

The highest possible value of an F-score is 1.0, indicating perfect precision and recall, and the lowest possible value is 0, which occurs if either precision or recall is zero.
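The agreement measures in (13) to (15) can be sketched as follows, treating each annotator's output as a set of nodes. Encoding a node as a (start, end, label) tuple is an assumption made for this example, and the sample annotations are invented.

```python
def precision_recall_f(nodes_a, nodes_b):
    """Recall, precision and F-score of annotation B measured against annotation A."""
    a, b = set(nodes_a), set(nodes_b)
    identical = len(a & b)  # nodes that both annotators marked identically
    recall = identical / len(a) if a else 0.0
    precision = identical / len(b) if b else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical span annotations: (start, end, label).
annotator_a = {(0, 5, "clause"), (5, 9, "clause"), (0, 9, "T-unit")}
annotator_b = {(0, 5, "clause"), (5, 8, "clause"), (9, 12, "clause"), (0, 9, "T-unit")}
precision, recall, f_score = precision_recall_f(annotator_a, annotator_b)
print(precision, recall, f_score)  # precision 2/4, recall 2/3, F-score 4/7
```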

In accordance with Taghipour and Ng ( 2016 ), the calculation of QWK involves two steps:

Step 1: Construct a weight matrix W as follows:

\({W}_{{ij}}=\frac{{(i-j)}^{2}}{{(N-1)}^{2}}\)

Here, i represents the annotation made by the tool, while j represents the annotation made by a human rater. N denotes the total number of possible annotations. Matrix O is subsequently computed, where \({O}_{i,j}\) represents the count of data annotated as i by the tool and as j by the human annotator. E refers to the expected count matrix, which is normalized so that the sum of elements in E matches the sum of elements in O.

Step 2: With matrices O and E, the QWK is obtained as follows:

\(K=1-\frac{{\sum }_{i,j}{W}_{i,j}\,{O}_{i,j}}{{\sum }_{i,j}{W}_{i,j}\,{E}_{i,j}}\)
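The two steps can be put together in a short Python sketch. It assumes ratings are encoded as integers from 0 to N − 1 (for example A1 = 0 through C2 = 5) and, following the usual construction in Taghipour and Ng (2016), builds E as the outer product of the two raters' score histograms, scaled to the same total as O. The sample ratings are invented.

```python
def quadratic_weighted_kappa(tool, human, n_categories):
    """Quadratically weighted kappa between tool and human ratings (ints in 0..N-1)."""
    if len(tool) != len(human) or not tool:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = n_categories
    # O[i][j]: number of essays rated i by the tool and j by the human rater.
    observed = [[0] * n for _ in range(n)]
    for i, j in zip(tool, human):
        observed[i][j] += 1
    total = len(tool)
    hist_tool = [sum(row) for row in observed]
    hist_human = [sum(observed[i][j] for i in range(n)) for j in range(n)]
    numerator = denominator = 0.0
    for i in range(n):
        for j in range(n):
            weight = (i - j) ** 2 / (n - 1) ** 2
            # E[i][j]: outer product of the histograms, normalized to sum to `total`.
            expected = hist_tool[i] * hist_human[j] / total
            numerator += weight * observed[i][j]
            denominator += weight * expected
    if denominator == 0:
        # Both raters used one and the same category throughout.
        return 1.0
    return 1.0 - numerator / denominator


levels = {"A1": 0, "A2": 1, "B1": 2, "B2": 3, "C1": 4, "C2": 5}
tool_scores = [levels[s] for s in ["B1", "B1", "A2", "C1"]]
human_scores = [levels[s] for s in ["B1", "B2", "A2", "C1"]]
print(quadratic_weighted_kappa(tool_scores, human_scores, n_categories=6))
```

The quadratic weights penalize a two-level disagreement four times as heavily as a one-level disagreement, which suits an ordinal scale such as the CEFR levels.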

The value of the quadratic weighted kappa increases as the level of agreement improves. Further, to assess the accuracy of LLM scoring, the proportional reductive mean square error (PRMSE) was employed. The PRMSE approach takes into account the variability observed in human ratings to estimate the rater error, which is then subtracted from the variance of the human labels. This calculation provides an overall measure of agreement between the automated scores and true scores (Haberman et al. 2015 ; Loukina et al. 2020 ; Taghipour and Ng, 2016 ). The computation of PRMSE involves the following steps:

Step 1: Calculate the mean squared errors (MSEs) for the scoring outcomes of the computer-assisted tool (MSE tool) and the human scoring outcomes (MSE human).

Step 2: Determine the PRMSE by comparing the MSE of the computer-assisted tool (MSE tool) with the MSE from human raters (MSE human), using the following formula:

\({\rm{PRMSE}}=1-\frac{{\rm{MSE}}\,{\rm{tool}}}{{\rm{MSE}}\,{\rm{human}}}=1-\frac{{\sum }_{i=1}^{n}{({{\rm{y}}}_{i}-{\hat{{\rm{y}}}}_{i})}^{2}}{{\sum }_{i=1}^{n}{({{\rm{y}}}_{i}-\hat{{\rm{y}}})}^{2}}\)

In the numerator, \({\hat{y}}_{i}\) represents the scoring outcome predicted by a specific LLM-driven AES system for a given sample. The term \({y}_{i}-{\hat{y}}_{i}\) represents the difference between this predicted outcome and the mean value of all LLM-driven AES systems’ scoring outcomes. It quantifies the deviation of the specific LLM-driven AES system’s prediction from the average prediction of all LLM-driven AES systems. In the denominator, \({y}_{i}-\hat{y}\) represents the difference between the scoring outcome provided by a specific human rater for a given sample and the mean value of all human raters’ scoring outcomes. It measures the discrepancy between the specific human rater’s score and the average score given by all human raters. The PRMSE is then calculated by subtracting the ratio of the MSE tool to the MSE human from 1. PRMSE falls within the range of 0 to 1, with larger values indicating reduced errors in LLM’s scoring compared to those of human raters. In other words, a higher PRMSE implies that LLM’s scoring demonstrates greater accuracy in predicting the true scores (Loukina et al. 2020).

The interpretation of kappa values, which range from −1 to 1, is based on the work of Landis and Koch (1977). Specifically, the following categories are assigned to different ranges of kappa values: −1 indicates complete inconsistency, 0 indicates random agreement, 0.00–0.20 indicates an extremely low level of agreement (slight), 0.21–0.40 indicates a low level of agreement (fair), 0.41–0.60 indicates a medium level of agreement (moderate), 0.61–0.80 indicates a high level of agreement (substantial), and 0.81–1.00 indicates an almost perfect level of agreement. All statistical analyses were executed using Python scripts.
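The PRMSE ratio given by the formula earlier in this section can be sketched as below. The sketch follows the formula as printed, under the assumption that the y values are human scores, the ŷ values with an index are the tool's scores for the same essays, and the unindexed ŷ in the denominator is the mean human score. It does not implement the rater-error correction described by Loukina et al. (2020), and the scores in the example are invented.

```python
def prmse_ratio(human_scores, tool_scores):
    """1 - (squared error of the tool) / (squared deviation of human scores from their mean)."""
    if len(human_scores) != len(tool_scores) or not human_scores:
        raise ValueError("score lists must be non-empty and of equal length")
    mean_human = sum(human_scores) / len(human_scores)
    # Numerator: squared differences between human and tool scores.
    sse_tool = sum((y - y_hat) ** 2 for y, y_hat in zip(human_scores, tool_scores))
    # Denominator: squared deviations of human scores from their mean (assumed reading).
    sse_human = sum((y - mean_human) ** 2 for y in human_scores)
    if sse_human == 0:
        raise ValueError("human scores have zero variance")
    return 1.0 - sse_tool / sse_human


# Hypothetical numeric scores for four essays.
human = [1, 2, 3, 4]
tool = [2, 2, 3, 3]
print(prmse_ratio(human, tool))  # 1 - 2/5
```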

Results and discussion

Annotation reliability of the LLM

This section focuses on assessing the reliability of the LLM’s annotation and scoring capabilities. To evaluate the reliability, several tests were conducted simultaneously, aiming to achieve the following objectives:

Assess the LLM’s ability to differentiate between test takers with varying levels of writing proficiency.

Determine the level of agreement between the annotations and scoring performed by the LLM and those done by human raters.

The evaluation of the results encompassed several metrics, including: precision, recall, F-Score, quadratically-weighted kappa, proportional reduction of mean squared error, Pearson correlation, and multi-faceted Rasch measurement.

Inter-annotator agreement (human–human annotator agreement)

We started with an agreement test of the two human annotators. Two trained annotators were recruited to determine the writing task data measures. A total of 714 scripts were used as the test data. Each analysis lasted 300–360 min. Inter-annotator agreement was evaluated using the standard measures of precision, recall, F-score, and QWK. Table 7 presents the inter-annotator agreement for the various indicators. As shown, the inter-annotator agreement was fairly high, with F-scores ranging from 1.0 for sentence and word number to 0.666 for grammatical errors.

The findings from the QWK analysis provided further confirmation of the inter-annotator agreement. The QWK values ranged from 0.950 (p < 0.001) for sentence and word number to 0.695 (p = 0.001) for synonym overlap number (keyword) and grammatical errors.

Agreement of annotation outcomes between human and LLM

To evaluate the consistency between human annotators and LLM annotators (BERT, GPT, OCLL) across the indices, the same test was conducted. The results of the inter-annotator agreement (F-score) between LLM and human annotation are provided in Appendix B-D. The F-scores ranged from 0.706 for the number of grammatical errors for OCLL-human to a perfect 1.000 for GPT-human for sentences, clauses, T-units, and words. These findings were further supported by the QWK analysis, which showed agreement levels ranging from 0.807 (p = 0.001) for metadiscourse markers for OCLL-human to 0.962 (p < 0.001) for words for GPT-human. The findings demonstrated that the LLM annotation achieved a high level of accuracy in identifying measurement units and counts.

Reliability of LLM-driven AES’s scoring and discriminating proficiency levels

This section examines the reliability of the LLM-driven AES scoring through a comparison of the scoring outcomes produced by human raters and the LLM ( Reliability of LLM-driven AES scoring ). It also assesses the effectiveness of the LLM-based AES system in differentiating participants with varying proficiency levels ( Reliability of LLM-driven AES discriminating proficiency levels ).

Reliability of LLM-driven AES scoring

Table 8 summarizes the QWK coefficient analysis between the scores computed by the human raters and the GPT-4 for the individual essays from I-JAS Footnote 7 . As shown, the QWK of all measures ranged from k  = 0.819 for lexical density (number of lexical words (tokens)/number of words per essay) to k  = 0.644 for word2vec cosine similarity. Table 9 further presents the Pearson correlations between the 16 writing proficiency measures scored by human raters and GPT 4 for the individual essays. The correlations ranged from 0.672 for syntactic complexity to 0.734 for grammatical accuracy. The correlations between the writing proficiency scores assigned by human raters and the BERT-based AES system were found to range from 0.661 for syntactic complexity to 0.713 for grammatical accuracy. The correlations between the writing proficiency scores given by human raters and the OCLL-based AES system ranged from 0.654 for cohesion to 0.721 for grammatical accuracy. These findings indicated an alignment between the assessments made by human raters and both the BERT-based and OCLL-based AES systems in terms of various aspects of writing proficiency.

Reliability of LLM-driven AES discriminating proficiency levels

After validating the reliability of the LLM’s annotation and scoring, the subsequent objective was to evaluate its ability to distinguish between various proficiency levels. For this analysis, a dataset of 686 individual essays was utilized. Table 10 presents a sample of the results, summarizing the means, standard deviations, and the outcomes of the one-way ANOVAs based on the measures assessed by the GPT-4 model. A post hoc multiple comparison test, specifically the Bonferroni test, was conducted to identify any potential differences between pairs of levels.

As the results reveal, seven measures presented linear upward or downward progress across the three proficiency levels. These were marked in bold in Table 10 and comprise one measure of lexical richness, i.e. MATTR (lexical diversity); four measures of syntactic complexity, i.e. MDD (mean dependency distance), MLC (mean length of clause), CNT (complex nominals per T-unit), and CPC (coordinate phrases rate); one cohesion measure, i.e. word2vec cosine similarity; and one measure of grammatical accuracy, i.e. GER (grammatical error rate). Regarding the ability of the sixteen measures to distinguish adjacent proficiency levels, the Bonferroni tests indicated that statistically significant differences exist between the primary level and the intermediate level for MLC and GER. One measure of lexical richness, namely LD, along with four measures of syntactic complexity (VPT, CT, DCT, ACC), two measures of cohesion (SOPT, SOPK), and one measure of content elaboration (IMM), exhibited statistically significant differences between proficiency levels. However, these differences did not demonstrate a linear progression between adjacent proficiency levels. No significant difference was observed in lexical sophistication between proficiency levels.

To summarize, our study aimed to evaluate the reliability and differentiation capabilities of the LLM-driven AES method. For the first objective, we assessed the LLM’s ability to differentiate between test takers with varying levels of writing proficiency using precision, recall, F-score, and quadratically weighted kappa. Regarding the second objective, we compared the scoring outcomes generated by human raters and the LLM to determine the level of agreement. We employed quadratically weighted kappa and Pearson correlations to compare the 16 writing proficiency measures for the individual essays. The results confirmed the feasibility of using the LLM for annotation and scoring in AES for nonnative Japanese. As a result, Research Question 1 has been addressed.

Comparison of BERT-, GPT-, OCLL-based AES, and linguistic-feature-based computation methods

This section compares the effectiveness of five AES methods for nonnative Japanese writing: three LLM-driven approaches (BERT, GPT, and OCLL) and two linguistic feature-based approaches (Jess and JWriter). The ratings obtained from each approach were compared with human ratings. All ratings were derived from the dataset introduced in the Dataset section. To facilitate the comparison, the agreement between the automated methods and human ratings was assessed using QWK and PRMSE. The performance of each approach is summarized in Table 11.
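
As an illustration of the agreement statistic used here, the sketch below computes a quadratically weighted kappa (QWK) between two lists of integer ratings. It follows the standard textbook definition; the function name and the rating range are assumptions for the example, and it is not the authors' evaluation code.

```python
def quadratic_weighted_kappa(human, model, min_rating, max_rating):
    """Quadratically weighted kappa between two lists of integer ratings."""
    if len(human) != len(model) or not human:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = max_rating - min_rating + 1
    if n < 2:
        raise ValueError("at least two rating categories are required")

    # Observed confusion matrix: rows are human ratings, columns are model ratings.
    observed = [[0] * n for _ in range(n)]
    for h, m in zip(human, model):
        observed[h - min_rating][m - min_rating] += 1

    total = len(human)
    human_hist = [sum(row) for row in observed]
    model_hist = [sum(observed[i][j] for i in range(n)) for j in range(n)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n):
        for j in range(n):
            weight = (i - j) ** 2 / (n - 1) ** 2
            # Expected count if the two raters were statistically independent.
            expected = human_hist[i] * model_hist[j] / total
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0:
        raise ValueError("kappa is undefined when all ratings are identical")
    return 1.0 - weighted_observed / weighted_expected
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean systematic disagreement. The quadratic weights penalize a two-level disagreement four times as heavily as a one-level disagreement.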

The QWK coefficient values indicate that the LLM-based ratings (GPT, BERT, OCLL) agreed more closely with human ratings than the feature-based AES methods (Jess and JWriter) did in assessing writing proficiency criteria, including lexical richness, syntactic complexity, content, and grammatical accuracy. Among the LLMs, the GPT-4 driven AES showed the highest agreement with human ratings in all criteria except syntactic complexity. The PRMSE values suggest that the GPT-based method outperformed linguistic feature-based methods and other LLM-based approaches.

Moreover, an interesting finding emerged during the study: the agreement coefficient between GPT-4 and human scoring was even higher than the agreement between different human raters themselves. This finding highlights the advantage of GPT-based AES over human rating. Rating involves a series of processes, including reading the learners’ writing, evaluating the content and language, and assigning scores. Within this chain of processes, various biases can be introduced, stemming from factors such as rater biases, test design, and rating scales. These biases can impact the consistency and objectivity of human ratings. GPT-based AES may benefit from its ability to apply consistent and objective evaluation criteria. By prompting the GPT model with detailed writing scoring rubrics and linguistic features, potential biases in human ratings can be mitigated. The model follows a predefined set of guidelines and does not possess the same subjective biases that human raters may exhibit. This standardization in the evaluation process contributes to the higher agreement observed between GPT-4 and human scoring. The section Prompt strategy delves further into the role of prompts in the application of LLMs to AES. It explores how the choice and implementation of prompts can impact the performance and reliability of LLM-based AES methods.

Furthermore, it is important to acknowledge the strengths of the local model, i.e. the Japanese local model OCLL, which excels in processing certain idiomatic expressions. Nevertheless, our analysis indicated that GPT-4 surpasses local models in AES. This superior performance can be attributed to the larger parameter size of GPT-4, estimated to be between 500 billion and 1 trillion, which exceeds the sizes of both BERT and the local model OCLL.

Prompt strategy

In the context of prompt strategy, Mizumoto and Eguchi (2023) conducted a study where they applied the GPT-3 model to automatically score English essays in the TOEFL test. They found that the accuracy of the GPT model alone was moderate to fair. However, when they incorporated linguistic measures such as cohesion, syntactic complexity, and lexical features alongside the GPT model, the accuracy significantly improved. This highlights the importance of prompt engineering and providing the model with specific instructions to enhance its performance. In this study, a similar approach was taken to optimize the performance of LLMs. GPT-4, which outperformed BERT and OCLL, was selected as the candidate model. Model 1 was used as the baseline, representing GPT-4 without any additional prompting. Model 2, on the other hand, involved GPT-4 prompted with 16 measures that included scoring criteria, efficient linguistic features for writing assessment, and detailed measurement units and calculation formulas. The remaining models (Models 3 to 18) utilized GPT-4 prompted with individual measures. The performance of these 18 different models was assessed using the output indicators described in the section Criteria (output indicator). By comparing the performances of these models, the study aimed to understand the impact of prompt engineering on the accuracy and effectiveness of GPT-4 in AES tasks.
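
The 18-model design described above can be sketched as a small prompt generator: one baseline prompt, one prompt listing all measures, and one prompt per individual measure. The instruction wording, the function names, the label "lexical sophistication", and the order in which measures map to Models 3 to 18 are all hypothetical; the study's actual prompts are not reproduced in this section.

```python
# Measure labels collected from the abbreviations mentioned in this article.
# "lexical sophistication" is a placeholder label, since no abbreviation is given here.
MEASURES = [
    "MATTR", "LD", "lexical sophistication", "MDD", "MLC", "CNT", "CPC", "VPT",
    "CT", "DCT", "ACC", "SOPT", "SOPK", "word2vec cosine similarity", "IMM", "GER",
]

# Hypothetical instruction text, not the wording used in the study.
BASE_INSTRUCTION = "Score the following Japanese essay written by a non-native learner."


def build_prompt(essay, measures=()):
    """Assemble one scoring prompt, optionally naming the measures to consider."""
    parts = [BASE_INSTRUCTION]
    if measures:
        parts.append("Consider these measures when scoring: " + ", ".join(measures) + ".")
    parts.append("Essay:\n" + essay)
    return "\n\n".join(parts)


def build_model_prompts(essay):
    """Return the 18 prompt variants keyed by model name."""
    prompts = {
        "Model 1": build_prompt(essay),            # baseline, no measures
        "Model 2": build_prompt(essay, MEASURES),  # all 16 measures
    }
    # Models 3 to 18: one measure each (the ordering is an assumption).
    for number, measure in enumerate(MEASURES, start=3):
        prompts[f"Model {number}"] = build_prompt(essay, [measure])
    return prompts
```

Keeping the instruction text fixed and varying only the list of measures isolates the effect of each measure on scoring accuracy, which is the comparison the study reports.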

Based on the PRMSE scores presented in Fig. 4, it was observed that Model 1, representing GPT-4 without any additional prompting, achieved a fair level of performance. However, Model 2, which utilized GPT-4 prompted with all measures, outperformed all other models in terms of PRMSE score, achieving a score of 0.681. These results indicate that the inclusion of specific measures and prompts significantly enhanced the performance of GPT-4 in AES. Among the measures, syntactic complexity was found to play a particularly significant role in improving the accuracy of GPT-4 in assessing writing quality. Following that, lexical diversity emerged as another important factor contributing to the model’s effectiveness. The study suggests that a well-prompted GPT-4 can serve as a valuable tool to support human assessors in evaluating writing quality. By utilizing GPT-4 as an automated scoring tool, the evaluation biases associated with human raters can be minimized. This has the potential to empower teachers by allowing them to focus on designing writing tasks and guiding writing strategies, while leveraging the capabilities of GPT-4 for efficient and reliable scoring.

Fig. 4: PRMSE scores of the 18 AES models.

This study aimed to investigate two main research questions: the feasibility of utilizing LLMs for AES and the impact of prompt engineering on the application of LLMs in AES.

To address the first objective, the study compared the effectiveness of five different models: GPT, BERT, the Japanese local LLM (OCLL), and two conventional machine learning-based AES tools (Jess and JWriter). The PRMSE values indicated that the GPT-4-based method outperformed other LLMs (BERT, OCLL) and linguistic feature-based computational methods (Jess and JWriter) across various writing proficiency criteria. Furthermore, the agreement coefficient between GPT-4 and human scoring surpassed the agreement among human raters themselves, highlighting the potential of using the GPT-4 tool to enhance AES by reducing biases and subjectivity, saving time, labor, and cost, and providing valuable feedback for self-study. Regarding the second goal, GPT-4, which outperformed BERT and OCLL, was selected as the candidate model, and the role of prompt design was investigated by comparing 18 models: a baseline model, a model prompted with all measures, and 16 models prompted with one measure at a time. The PRMSE scores of the models showed that GPT-4 prompted with all measures achieved the best performance, surpassing the baseline and other models.

In conclusion, this study has demonstrated the potential of LLMs in supporting human rating in assessments. By incorporating automation, we can save time and resources while reducing biases and subjectivity inherent in human rating processes. Automated language assessments offer the advantage of accessibility, providing equal opportunities and economic feasibility for individuals who lack access to traditional assessment centers or necessary resources. LLM-based language assessments provide valuable feedback and support to learners, aiding in the enhancement of their language proficiency and the achievement of their goals. This personalized feedback can cater to individual learner needs, facilitating a more tailored and effective language-learning experience.

Several areas merit further exploration. First, prompt engineering requires attention to ensure optimal performance of LLM-based AES across different language types. This study revealed that GPT-4, when prompted with all measures, outperformed models prompted with fewer measures. Therefore, investigating and refining prompt strategies can enhance the effectiveness of LLMs in automated language assessments. Second, it is crucial to explore the application of LLMs in second-language assessment and learning for oral proficiency, as well as their potential in under-resourced languages. Recent advancements in self-supervised machine learning techniques have significantly improved automatic speech recognition (ASR) systems, opening up new possibilities for creating reliable ASR systems, particularly for under-resourced languages with limited data. However, three challenges persist in the field of ASR. The first is that ASR assumes correct word pronunciation for automatic pronunciation evaluation, which proves challenging for learners in the early stages of language acquisition due to diverse accents influenced by their native languages; accurately segmenting short words becomes problematic in such cases. The second is that developing precise audio-text transcriptions for languages with non-native accented speech poses a formidable task. The last is that assessing oral proficiency levels involves capturing various linguistic features, including fluency, pronunciation, accuracy, and complexity, which are not easily captured by current NLP technology.

Data availability

The dataset utilized was obtained from the International Corpus of Japanese as a Second Language (I-JAS), available at https://www2.ninjal.ac.jp/jll/lsaj/ihome2.html.

J-CAT and TTBJ are two computerized adaptive tests used to assess Japanese language proficiency.

SPOT is a specific component of the TTBJ test.

J-CAT: https://www.j-cat2.org/html/ja/pages/interpret.html

SPOT: https://ttbj.cegloc.tsukuba.ac.jp/p1.html#SPOT

The study utilized a prompt-based GPT-4 model developed by OpenAI. OpenAI has not published the model's architecture; unofficial reports describe about 1.8 trillion parameters across 120 layers and a training set of about 13 trillion tokens. Training proceeded in two stages: initial training on internet text datasets to predict the next token, and subsequent fine-tuning through reinforcement learning from human feedback.

https://www2.ninjal.ac.jp/jll/lsaj/ihome2-en.html

http://jhlee.sakura.ne.jp/JEV/ by Japanese Learning Dictionary Support Group 2015.

We express our sincere gratitude to the reviewer for bringing this matter to our attention.

On February 7, 2023, Microsoft began rolling out a major overhaul to Bing that included a new chatbot feature based on OpenAI’s GPT-4 (Bing.com).

Appendix E-F present the analysis results of the QWK coefficient between the scores computed by the human raters and the BERT, OCLL models.

Attali Y, Burstein J (2006) Automated essay scoring with e-rater® V.2. J. Technol., Learn. Assess., 4

Barkaoui K, Hadidi A (2020) Assessing Change in English Second Language Writing Performance (1st ed.). Routledge, New York. https://doi.org/10.4324/9781003092346

Bentz C, Tatyana R, Koplenig A, Tanja S (2016) A comparison between morphological complexity measures: Typological data vs. language corpora. In Proceedings of the workshop on computational linguistics for linguistic complexity (CL4LC), 142–153. Osaka, Japan: The COLING 2016 Organizing Committee

Bond TG, Yan Z, Heene M (2021) Applying the Rasch model: Fundamental measurement in the human sciences (4th ed). Routledge

Brants T (2000) Inter-annotator agreement for a German newspaper corpus. Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00), Athens, Greece, 31 May-2 June, European Language Resources Association

Brown TB, Mann B, Ryder N, et al. (2020) Language models are few-shot learners. Advances in Neural Information Processing Systems, Online, 6–12 December, Curran Associates, Inc., Red Hook, NY

Burstein J (2003) The E-rater scoring engine: Automated essay scoring with natural language processing. In Shermis MD and Burstein JC (ed) Automated Essay Scoring: A Cross-Disciplinary Perspective. Lawrence Erlbaum Associates, Mahwah, NJ

Čech R, Miroslav K (2018) Morphological richness of text. In Masako F, Václav C (ed) Taming the corpus: From inflection and lexis to interpretation, 63–77. Cham, Switzerland: Springer Nature

Çöltekin Ç, Taraka R (2018) Exploiting Universal Dependencies treebanks for measuring morphosyntactic complexity. In Aleksandrs B, Christian B (ed) Proceedings of first workshop on measuring language complexity, 1–7. Torun, Poland

Crossley SA, Cobb T, McNamara DS (2013) Comparing count-based and band-based indices of word frequency: Implications for active vocabulary research and pedagogical applications. System 41:965–981. https://doi.org/10.1016/j.system.2013.08.002

Crossley SA, McNamara DS (2016) Say more and be more coherent: How text elaboration and cohesion can increase writing quality. J. Writ. Res. 7:351–370

CyberAgent Inc (2023) Open-Calm series of Japanese language models. Retrieved from: https://www.cyberagent.co.jp/news/detail/id=28817

Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics, Minneapolis, Minnesota, 2–7 June, pp. 4171–4186. Association for Computational Linguistics

Diez-Ortega M, Kyle K (2023) Measuring the development of lexical richness of L2 Spanish: a longitudinal learner corpus study. Studies in Second Language Acquisition 1-31

Eckes T (2009) On common ground? How raters perceive scoring criteria in oral proficiency testing. In Brown A, Hill K (ed) Language testing and evaluation 13: Tasks and criteria in performance assessment (pp. 43–73). Peter Lang Publishing

Elliot S (2003) IntelliMetric: from here to validity. In: Shermis MD, Burstein JC (ed) Automated Essay Scoring: A Cross-Disciplinary Perspective. Lawrence Erlbaum Associates, Mahwah, NJ

Engber CA (1995) The relationship of lexical proficiency to the quality of ESL compositions. J. Second Lang. Writ. 4:139–155

Garner J, Crossley SA, Kyle K (2019) N-gram measures and L2 writing proficiency. System 80:176–187. https://doi.org/10.1016/j.system.2018.12.001

Haberman SJ (2008) When can subscores have value? J. Educat. Behav. Stat., 33:204–229

Haberman SJ, Yao L, Sinharay S (2015) Prediction of true test scores from observed item scores and ancillary data. Brit. J. Math. Stat. Psychol. 68:363–385

Halliday MAK (1985) Spoken and Written Language. Deakin University Press, Melbourne, Australia

Hirao R, Arai M, Shimanaka H et al. (2020) Automated essay scoring system for nonnative Japanese learners. Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), pp. 1250–1257. European Language Resources Association

Hunt KW (1966) Recent Measures in Syntactic Development. Elementary English, 43(7), 732–739. http://www.jstor.org/stable/41386067

Ishioka T (2001) About e-rater, a computer-based automatic scoring system for essays [Konpyūta ni yoru essei no jidō saiten shisutemu e − rater ni tsuite]. University Entrance Examination. Forum [Daigaku nyūshi fōramu] 24:71–76

Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput. 9(8):1735–1780

Ishioka T, Kameda M (2006) Automated Japanese essay scoring system based on articles written by experts. Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, Australia, 17–18 July 2006, pp. 233-240. Association for Computational Linguistics, USA

Japan Foundation (2021) Retrieved from: https://www.jpf.gp.jp/j/project/japanese/survey/result/dl/survey2021/all.pdf

Jarvis S (2013a) Defining and measuring lexical diversity. In Jarvis S, Daller M (ed) Vocabulary knowledge: Human ratings and automated measures (Vol. 47, pp. 13–44). John Benjamins. https://doi.org/10.1075/sibil.47.03ch1

Jarvis S (2013b) Capturing the diversity in lexical diversity. Lang. Learn. 63:87–106. https://doi.org/10.1111/j.1467-9922.2012.00739.x

Jiang J, Quyang J, Liu H (2019) Interlanguage: A perspective of quantitative linguistic typology. Lang. Sci. 74:85–97

Kim M, Crossley SA, Kyle K (2018) Lexical sophistication as a multidimensional phenomenon: Relations to second language lexical proficiency, development, and writing quality. Mod. Lang. J. 102(1):120–141. https://doi.org/10.1111/modl.12447

Kojima T, Gu S, Reid M et al. (2022) Large language models are zero-shot reasoners. Advances in Neural Information Processing Systems, New Orleans, LA, 29 November-1 December, Curran Associates, Inc., Red Hook, NY

Kyle K, Crossley SA (2015) Automatically assessing lexical sophistication: Indices, tools, findings, and application. TESOL Q 49:757–786

Kyle K, Crossley SA, Berger CM (2018) The tool for the automatic analysis of lexical sophistication (TAALES): Version 2.0. Behav. Res. Methods 50:1030–1046. https://doi.org/10.3758/s13428-017-0924-4

Kyle K, Crossley SA, Jarvis S (2021) Assessing the validity of lexical diversity using direct judgements. Lang. Assess. Q. 18:154–170. https://doi.org/10.1080/15434303.2020.1844205

Landauer TK, Laham D, Foltz PW (2003) Automated essay scoring and annotation of essays with the Intelligent Essay Assessor. In Shermis MD, Burstein JC (ed), Automated Essay Scoring: A Cross-Disciplinary Perspective. Lawrence Erlbaum Associates, Mahwah, NJ

Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 159–174

Laufer B, Nation P (1995) Vocabulary size and use: Lexical richness in L2 written production. Appl. Linguist. 16:307–322. https://doi.org/10.1093/applin/16.3.307

Lee J, Hasebe Y (2017) jWriter Learner Text Evaluator, URL: https://jreadability.net/jwriter/

Lee J, Kobayashi N, Sakai T, Sakota K (2015) A Comparison of SPOT and J-CAT Based on Test Analysis [Tesuto bunseki ni motozuku ‘SPOT’ to ‘J-CAT’ no hikaku]. Research on the Acquisition of Second Language Japanese [Dainigengo to shite no nihongo no shūtoku kenkyū] (18) 53–69

Li W, Yan J (2021) Probability distribution of dependency distance based on a Treebank of Japanese EFL Learners’ Interlanguage. J. Quant. Linguist. 28(2):172–186. https://doi.org/10.1080/09296174.2020.1754611

Linacre JM (2002) Optimizing rating scale category effectiveness. J. Appl. Meas. 3(1):85–106

Linacre JM (1994) Constructing measurement with a Many-Facet Rasch Model. In Wilson M (ed) Objective measurement: Theory into practice, Volume 2 (pp. 129–144). Norwood, NJ: Ablex

Liu H (2008) Dependency distance as a metric of language comprehension difficulty. J. Cognitive Sci. 9:159–191

Liu H, Xu C, Liang J (2017) Dependency distance: A new perspective on syntactic patterns in natural languages. Phys. Life Rev. 21. https://doi.org/10.1016/j.plrev.2017.03.002

Loukina A, Madnani N, Cahill A, et al. (2020) Using PRMSE to evaluate automated scoring systems in the presence of label noise. Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications, Seattle, WA, USA → Online, 10 July, pp. 18–29. Association for Computational Linguistics

Lu X (2010) Automatic analysis of syntactic complexity in second language writing. Int. J. Corpus Linguist. 15:474–496

Lu X (2012) The relationship of lexical richness to the quality of ESL learners’ oral narratives. Mod. Lang. J. 96:190–208

Lu X (2017) Automated measurement of syntactic complexity in corpus-based L2 writing research and implications for writing assessment. Lang. Test. 34:493–511

Lu X, Hu R (2022) Sense-aware lexical sophistication indices and their relationship to second language writing quality. Behav. Res. Method. 54:1444–1460. https://doi.org/10.3758/s13428-021-01675-6

Ministry of Health, Labor, and Welfare of Japan (2022) Retrieved from: https://www.mhlw.go.jp/stf/newpage_30367.html

Mizumoto A, Eguchi M (2023) Exploring the potential of using an AI language model for automated essay scoring. Res. Methods Appl. Linguist. 3:100050

Okgetheng B, Takeuchi K (2024) Estimating Japanese Essay Grading Scores with Large Language Models. Proceedings of 30th Annual Conference of the Language Processing Society in Japan, March 2024

Ortega L (2015) Second language learning explained? SLA across 10 contemporary theories. In VanPatten B, Williams J (ed) Theories in Second Language Acquisition: An Introduction

Rae JW, Borgeaud S, Cai T, et al. (2021) Scaling Language Models: Methods, Analysis & Insights from Training Gopher. ArXiv, abs/2112.11446

Read J (2000) Assessing vocabulary. Cambridge University Press. https://doi.org/10.1017/CBO9780511732942

Rudner LM, Liang T (2002) Automated Essay Scoring Using Bayes’ Theorem. J. Technol., Learning and Assessment, 1 (2)

Sakoda K, Hosoi Y (2020) Accuracy and complexity of Japanese Language usage by SLA learners in different learning environments based on the analysis of I-JAS, a learners’ corpus of Japanese as L2. Math. Linguist. 32(7):403–418. https://doi.org/10.24701/mathling.32.7_403

Suzuki N (1999) Summary of survey results regarding comprehensive essay questions. Final report of “Joint Research on Comprehensive Examinations for the Aim of Evaluating Applicability to Each Specialized Field of Universities” for 1996-2000 [shōronbun sōgō mondai ni kansuru chōsa kekka no gaiyō. Heisei 8 - Heisei 12-nendo daigaku no kaku senmon bun’ya e no tekisei no hyōka o mokuteki to suru sōgō shiken no arikata ni kansuru kyōdō kenkyū’ saishū hōkoku-sho]. University Entrance Examination Section Center Research and Development Department [Daigaku nyūshi sentā kenkyū kaihatsubu], 21–32

Taghipour K, Ng HT (2016) A neural approach to automated essay scoring. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, Texas, 1–5 November, pp. 1882–1891. Association for Computational Linguistics

Takeuchi K, Ohno M, Motojin K, Taguchi M, Inada Y, Iizuka M, Abo T, Ueda H (2021) Development of essay scoring methods based on reference texts with construction of research-available Japanese essay data. In IPSJ J 62(9):1586–1604

Ure J (1971) Lexical density: A computational technique and some findings. In Coultard M (ed) Talking about Text. English Language Research, University of Birmingham, Birmingham, England

Vaswani A, Shazeer N, Parmar N, et al. (2017) Attention is all you need. In Advances in Neural Information Processing Systems, Long Beach, CA, 4–7 December, pp. 5998–6008, Curran Associates, Inc., Red Hook, NY

Watanabe H, Taira Y, Inoue Y (1988) Analysis of essay evaluation data [Shōronbun hyōka dēta no kaiseki]. Bulletin of the Faculty of Education, University of Tokyo [Tōkyōdaigaku kyōiku gakubu kiyō], Vol. 28, 143–164

Yao S, Yu D, Zhao J, et al. (2023) Tree of thoughts: Deliberate problem solving with large language models. Advances in Neural Information Processing Systems, 36

Zenker F, Kyle K (2021) Investigating minimum text lengths for lexical diversity indices. Assess. Writ. 47:100505. https://doi.org/10.1016/j.asw.2020.100505

Zhang Y, Warstadt A, Li X, et al. (2021) When do you need billions of words of pretraining data? Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Online, pp. 1112-1125. Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.acl-long.90

This research was funded by National Foundation of Social Sciences (22BYY186) to Wenchao Li.

Author information

Authors and affiliations.

Department of Japanese Studies, Zhejiang University, Hangzhou, China

Department of Linguistics and Applied Linguistics, Zhejiang University, Hangzhou, China

Contributions

Wenchao Li is in charge of conceptualization, validation, formal analysis, investigation, data curation, visualization and writing the draft. Haitao Liu is in charge of supervision.

Corresponding author

Correspondence to Wenchao Li .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Ethical approval

Ethical approval was not required as the study did not involve human participants.

Informed consent

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplemental material file #1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Li, W., Liu, H. Applying large language models for automated essay scoring for non-native Japanese. Humanit Soc Sci Commun 11, 723 (2024). https://doi.org/10.1057/s41599-024-03209-9

Received: 02 February 2024

Accepted: 16 May 2024

Published: 03 June 2024

DOI: https://doi.org/10.1057/s41599-024-03209-9

Title: LoRA: Low-Rank Adaptation of Large Language Models

Abstract: An important paradigm of natural language processing consists of large-scale pre-training on general domain data and adaptation to particular tasks or domains. As we pre-train larger models, full fine-tuning, which retrains all model parameters, becomes less feasible. Using GPT-3 175B as an example -- deploying independent instances of fine-tuned models, each with 175B parameters, is prohibitively expensive. We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks. Compared to GPT-3 175B fine-tuned with Adam, LoRA can reduce the number of trainable parameters by 10,000 times and the GPU memory requirement by 3 times. LoRA performs on-par or better than fine-tuning in model quality on RoBERTa, DeBERTa, GPT-2, and GPT-3, despite having fewer trainable parameters, a higher training throughput, and, unlike adapters, no additional inference latency. We also provide an empirical investigation into rank-deficiency in language model adaptation, which sheds light on the efficacy of LoRA. We release a package that facilitates the integration of LoRA with PyTorch models and provide our implementations and model checkpoints for RoBERTa, DeBERTa, and GPT-2 at this https URL .
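
To make the idea in the abstract concrete, here is a minimal pure-Python sketch of a linear layer with a frozen weight matrix and a trainable low-rank update. The class name, the zero initialization of `B`, the small Gaussian initialization of `A`, and the `alpha / rank` scaling are assumptions for illustration that the abstract itself does not state; this is not the released package.

```python
import random


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update scale * (B @ A)."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight            # frozen pre-trained weights, d_out x d_in
        self.scale = alpha / rank       # assumed scaling of the update
        # A (rank x d_in) starts small and random; B (d_out x rank) starts at zero,
        # so the layer initially behaves exactly like the frozen layer.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    @staticmethod
    def _matvec(matrix, vector):
        return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

    def forward(self, x):
        base = self._matvec(self.weight, x)
        # Project down to `rank` dimensions with A, then back up with B.
        update = self._matvec(self.B, self._matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_parameters(self):
        d_out, d_in, rank = len(self.B), len(self.A[0]), len(self.A)
        return rank * (d_in + d_out)
```

For a square layer of width d, full fine-tuning updates d × d values, while this sketch trains only 2 × rank × d. The gap grows with d, which is the source of the parameter savings the abstract describes.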

  • AI in education
  • Published Jan 23, 2024

Meet your AI assistant for education: Microsoft Copilot

With new advancements in AI happening faster than ever before, you might be wondering how you can use these tools in your classroom to save you time and energy. Educators worldwide are making strides to understand and integrate AI into their work and often find it to be a valuable tool. You can use AI to save time creating rubrics, personalized content for students, and educational materials such as quizzes and lesson plans.   

Generative AI is a newer piece of technology and a unique category of AI that focuses on creating new content. With generative AI you can generate new content like text, images, code, or audio. It achieves this by learning patterns from existing data and understanding the context and intent of language. This provides you with new opportunities for content creation, personalization, and innovation. Because this technology is creating new content, checking for accuracy in generative AI is essential—especially in the field of education.  

Microsoft Copilot is a tool that uses generative AI to serve as a helpful assistant to you in the classroom. Copilot can help you save time, differentiate instruction, and enhance student learning. With Copilot, you can easily create lesson plans, quizzes, rubrics, and other class resources for any level of learner.  

5 ways to use Copilot in education 

Here are just a few examples of the many ways you can use Microsoft Copilot to save time and energy: 

  • Personalized learning: Copilot can support personalized learning by helping you create content, tailored feedback, and guidance for students based on their individual needs and learning styles. 
  • Brainstorming: You can use Copilot to brainstorm new ideas for activities, lesson plans, supporting materials, and assignments.  
  • Lesson planning: Copilot can help you plan lessons by suggesting or drafting activities, resources, and assessments that align with learning objectives. You can also use Copilot to start a rubric for the lessons. 
  • Provide feedback: Copilot can help you draft initial feedback and ideas for students on their work, which you can edit and personalize for your students.  
  • Get quick answers: Copilot can help you get quick answers to your questions without having to read through multiple search results. Also, Copilot provides links to content sources so you can assess the source or dive deeper into the original content. 

Microsoft Copilot showing suggested prompts for educators. Copilot uses generative AI to serve as a helpful assistant to you in the classroom. 

Getting started with Microsoft Copilot

To get started with Microsoft Copilot, you can follow these steps:  

  • Open copilot.microsoft.com or select the Copilot icon on the sidebar in your Microsoft Edge browser. 
  • Type your prompt into the chat window. 
  • Review the sources linked at the bottom of the response, next to “Learn more.” You can fact-check the information provided or dive deeper into a topic by accessing the original articles, studies, or reports. 
  • Review the response to make sure the output is accurate and is what you want. You are the expert, and you decide what goes into the classroom. 
  • To get the most out of Copilot, you can keep the conversation going by following up on your prompts. This helps you collaborate with Copilot to gain more useful, tailored responses.   

You can also give feedback to Copilot based on the quality of its responses to help the AI learn and match your preferences.  

How to write a prompt for AI 

To guide generative AI effectively, give it clear and concise instructions, known as prompts. A well-crafted prompt improves the quality, relevance, and diversity of the AI’s output. A good prompt is clear, specific, and aligned with the goal of the task. A bad prompt can lead to ambiguous, irrelevant, or biased output. To get the best response from Copilot, consider the following tips:  

  • Define clear objectives.  Determine the main goal of the prompt and the role AI should take. Whether creating a syllabus, drafting a quiz, or revising lesson content, have a clear vision of the end goal. 
  • Be specific.  Chat experiences operate best when given detailed instructions. Specify grade level, subject, topic, or any other relevant parameters. For instance, “secondary math quiz on algebraic expressions” is clearer than “math quiz.” 
  • Structure the prompt.  Break complex tasks into smaller parts. Instead of asking the AI to draft an entire lesson, request an outline, then delve into specific sections. 
  • Iterate and refine.  The first response from AI might not always align perfectly with expectations. Don’t hesitate to rephrase the prompt, ask follow-up questions, or provide more context based on the initial output. 
  • Combine expertise.  Use AI as a tool to enhance and streamline work, but remember to overlay its suggestions with your educational expertise. AI can suggest content, but the educator decides the best way to edit and present it to their audience.   
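The first three tips above (clear objective, specifics, structure) can be sketched as a small prompt template. This is only an illustration: `PromptSpec` and `build_prompt` are hypothetical names, not part of Copilot or any Microsoft API, and the result is plain text you would paste into the chat window yourself.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Hypothetical container for the parts of a prompt (not a Copilot API)."""
    role: str                # the role the AI should take
    goal: str                # the main objective of the prompt
    grade_level: str = ""    # optional specifics; empty fields are skipped
    subject: str = ""
    topic: str = ""
    steps: list[str] = field(default_factory=list)  # smaller sub-tasks


def build_prompt(spec: PromptSpec) -> str:
    """Assemble a plain-text prompt from the spec."""
    lines = [f"Act as {spec.role}.", f"Goal: {spec.goal}."]
    details = [
        ("Grade level", spec.grade_level),
        ("Subject", spec.subject),
        ("Topic", spec.topic),
    ]
    # Only include the specifics that were actually provided.
    lines += [f"{label}: {value}." for label, value in details if value]
    if spec.steps:
        # Break the task into smaller parts instead of one large request.
        lines.append("Work through these parts one at a time:")
        lines += [f"{i}. {step}" for i, step in enumerate(spec.steps, start=1)]
    return "\n".join(lines)


spec = PromptSpec(
    role="a secondary math teacher",
    goal="draft a quiz",
    grade_level="secondary",
    subject="math",
    topic="algebraic expressions",
    steps=["Outline the quiz sections", "Write five questions for the first section"],
)
print(build_prompt(spec))
```

The last two tips still apply after this step: treat the generated text as a starting point, refine it based on the response you get, and edit the output with your own expertise.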

An infographic about how to write AI prompts to get better answers from Copilot. It lists five key elements: conversation style, specific instructions, tailoring for the audience, specifying length, and specifying format. A good prompt should be clear, specific, and aligned with the goal of the task. 

Want a fun way to practice creating effective prompts? Minecraft Education just announced Prompt Lab for Minecraft Educators, a free playbook on how to use Microsoft Copilot to write compelling prompts, develop interactive learning content and assessments, and generate creative ideas for Minecraft lesson plans.   

Create images from text with Copilot 

You can use Image Creator from Designer in Copilot to create personalized, engaging visuals for all sorts of lessons or topics. You can type in a description of an image, provide additional context like location or activity, and choose an art style. Image Creator generates an image straight from your imagination. Prompts can begin with “draw an image” or “create an image.” You can use this tool to create images for a class newsletter, lesson, or Teams post.   

  • Get started in Copilot by prompting “create an image…”  
  • Then build out your prompt with adjective + noun + verb + style.  
  • Click on your favorite image to open the result in a new tab and save the image. 

 An example would be “Create an image of an adorable black puppy wearing a hat in photorealistic style.” 
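As a rough sketch, the adjective + noun + verb + style pattern can be written as a tiny helper that reproduces the example above. The function name `image_prompt` and its parameters are made up for illustration. It only builds the text of the prompt and does not call Copilot or Image Creator.

```python
def image_prompt(adjective: str, noun: str, verb: str, style: str) -> str:
    """Build an image prompt from the adjective + noun + verb + style pattern.

    Hypothetical helper: it only formats text for you to paste into Copilot.
    """
    # Choose "a" or "an" from the first letter of the adjective (a rough heuristic).
    starts_with_vowel = adjective.lower().startswith(("a", "e", "i", "o", "u"))
    article = "an" if starts_with_vowel else "a"
    return f"Create an image of {article} {adjective} {noun} {verb} in {style} style."


print(image_prompt("adorable black", "puppy", "wearing a hat", "photorealistic"))
# Create an image of an adorable black puppy wearing a hat in photorealistic style.
```

Swapping a single argument, such as the style, is a quick way to compare variations of the same image idea.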

An example of Copilot creating four images of a black puppy wearing a hat in a photorealistic style, based on a text description. 

Try creating an image in Copilot for your lesson, or just for fun!   

Protected AI-powered chat

At Microsoft, our efforts are guided by our AI principles and Responsible AI Standard and build on decades of research on grounding and privacy-preserving machine learning. Copilot provides commercial data protection and delivers a secure AI-powered chat service for educational institutions. This means user and organizational data are protected, chat prompts and responses in Copilot are not saved, Microsoft has no eyes-on access to them, and they aren’t used to train the underlying large language models. Additionally, our Customer Copyright Commitment means education customers can be confident using our services and the output they generate without worrying about copyright claims.  

Get to know your Copilot 

Dive deeper into the world of generative AI and unlock its full potential for your classroom.  

  • The new AI for Educators Learning Path on Microsoft Learn is made up of three modules to help educators learn about and benefit from AI. 
  • Prompt Lab for Minecraft Educators demonstrates how to use Microsoft Copilot with Minecraft Education to design engaging learning experiences. Level up your Minecraft teaching with this useful new resource! 
  • AI classroom toolkit provides instructional information for educators and students to use generative AI safely and responsibly. 
  • AI for education on Microsoft Learn is a collection of resources and courses on how to use AI for educational purposes.  

Ready to elevate your teaching with Microsoft Copilot? Start using Copilot today at copilot.microsoft.com.  
