
How to Cite an Essay

Last Updated: February 4, 2023 Fact Checked

This article was co-authored by Diya Chaudhuri, PhD and by wikiHow staff writer, Jennifer Mueller, JD. Diya Chaudhuri holds a PhD in Creative Writing (specializing in Poetry) from Georgia State University. She has over 5 years of experience as a writing tutor and instructor for both the University of Florida and Georgia State University. There are 10 references cited in this article, which can be found at the bottom of the page. This article has been fact-checked, ensuring the accuracy of any cited facts and confirming the authority of its sources. This article has been viewed 558,547 times.

If you're writing a research paper, whether as a student or a professional researcher, you might want to use an essay as a source. You'll typically find essays published in another source, such as an edited book or collection. When you discuss or quote from the essay in your paper, use an in-text citation to relate back to the full entry listed in your list of references at the end of your paper. While the information in the full reference entry is basically the same, the format differs depending on whether you're using the Modern Language Association (MLA), American Psychological Association (APA), or Chicago citation method.


Step 1 Start your Works Cited entry with the author's name.

  • Example: Potter, Harry.

Step 2 List the title of the essay in quotation marks.

  • Example: Potter, Harry. "My Life with Voldemort."

Step 3 Provide the title and authors or editors of the larger work.

  • Example: Potter, Harry. "My Life with Voldemort." Great Thoughts from Hogwarts Alumni, by Bathilda Backshot,

Step 4 Add publication information for the larger work.

  • Example: Potter, Harry. "My Life with Voldemort." Great Thoughts from Hogwarts Alumni, by Bathilda Backshot, Hogwarts Press, 2019,

Step 5 Include the page numbers where the essay is found.

  • Example: Potter, Harry. "My Life with Voldemort." Great Thoughts from Hogwarts Alumni, by Bathilda Backshot, Hogwarts Press, 2019, pp. 22-42.

MLA Works Cited Entry Format:

LastName, FirstName. "Title of Essay." Title of Collection, by FirstName LastName, Publisher, Year, pp. ##-##.
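The MLA template above can be read as a string-building recipe. The following is a minimal Python sketch, not an official MLA tool; the function name `mla_essay_entry` and its parameters are illustrative assumptions, and a plain string cannot show the italics that the collection title needs.

```python
def mla_essay_entry(last, first, essay_title, collection, collection_author,
                    publisher, year, first_page, last_page):
    """Assemble an MLA Works Cited entry for an essay in a larger work."""
    return (f'{last}, {first}. "{essay_title}." {collection}, '
            f'by {collection_author}, {publisher}, {year}, '
            f'pp. {first_page}-{last_page}.')


# Values taken from the worked example in the steps above.
entry = mla_essay_entry("Potter", "Harry", "My Life with Voldemort",
                        "Great Thoughts from Hogwarts Alumni",
                        "Bathilda Backshot", "Hogwarts Press", 2019, 22, 42)
print(entry)
```

Remember to italicize the title of the collection when you paste the result into your paper.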

Step 6 Use the author's last name and the page number for in-text citations.

  • For example, you might write: While the stories may seem like great adventures, the students themselves were terribly frightened to confront Voldemort (Potter 28).
  • If you include the author's name in the text of your paper, you only need the page number where the referenced material can be found in the parenthetical at the end of your sentence.
  • If you have several authors with the same last name, include each author's first initial in your in-text citation to differentiate them.
  • For several titles by the same author, include a shortened version of the title after the author's name (if the title isn't mentioned in your text).
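The in-text rules above can be sketched as one small Python function. This is an illustrative sketch only: the name `mla_in_text` and its keyword arguments are assumptions, and the comma between the author and a shortened title follows common MLA practice rather than anything spelled out in the bullets above.

```python
def mla_in_text(last_name, page, *, name_in_text=False,
                first_initial=None, short_title=None):
    """Build an MLA parenthetical citation from the rules listed above."""
    # Leave the author out when the name already appears in the sentence.
    if name_in_text:
        author = ""
    elif first_initial:
        # Several authors share this last name: add the first initial.
        author = f"{first_initial}. {last_name}"
    else:
        author = last_name
    # Several works by one author: add a shortened title in quotation marks.
    title = f'"{short_title}"' if short_title else ""
    if author and title:
        lead = f"{author}, {title}"
    else:
        lead = author or title
    return f"({lead} {page})" if lead else f"({page})"


print(mla_in_text("Potter", 28))
print(mla_in_text("Potter", 28, name_in_text=True))
```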

Step 1 Place the author's name first in your Reference List entry.

  • Example: Granger, H.

Step 2 Add the year the larger work was published.

  • Example: Granger, H. (2018).

Step 3 Include the title of the essay.

  • Example: Granger, H. (2018). Adventures in time turning.

Step 4 Provide the author and title of the larger work.

  • Example: Granger, H. (2018). Adventures in time turning. In M. McGonagall (Ed.), Reflections on my time at Hogwarts

Step 5 List the page range for the essay and the publisher of the larger work.

  • Example: Granger, H. (2018). Adventures in time turning. In M. McGonagall (Ed.), Reflections on my time at Hogwarts (pp. 92-130). Hogwarts Press.

APA Reference List Entry Format:

LastName, I. (Year). Title of essay. In I. LastName (Ed.), Title of larger work (pp. ##-##). Publisher.
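As a rough illustration, the APA template above can be filled in programmatically. The sketch below assumes a single author and a single editor; the function name `apa_essay_entry` and its parameters are hypothetical, and italics for the title of the larger work still have to be applied by hand.

```python
def apa_essay_entry(author_last, author_initial, year, essay_title,
                    editor_initial, editor_last, book_title,
                    first_page, last_page, publisher):
    """Assemble an APA reference entry for an essay in an edited book."""
    return (f"{author_last}, {author_initial}. ({year}). {essay_title}. "
            f"In {editor_initial}. {editor_last} (Ed.), {book_title} "
            f"(pp. {first_page}-{last_page}). {publisher}.")


# Values taken from the worked example in the steps above.
entry = apa_essay_entry("Granger", "H", 2018, "Adventures in time turning",
                        "M", "McGonagall",
                        "Reflections on my time at Hogwarts",
                        92, 130, "Hogwarts Press")
print(entry)
```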

Step 6 Use the author's last name and year of publication for in-text citations.

  • For example, you might write: By using a time turner, a witch or wizard can appear to others as though they are actually in two places at once (Granger, 2018).
  • If you use the author's name in the text of your paper, include the parenthetical with the year immediately after the author's name. For example, you might write: Although technically against the rules, Granger (2018) maintains that her use of a time turner was sanctioned by the head of her house.
  • Add page numbers if you quote directly from the source. Simply add a comma after the year, then type the page number or page range where the quoted material can be found, using the abbreviation "p." for a single page or "pp." for a range of pages.
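The three APA in-text cases above (author and year, year only, and year plus pages) can be sketched in Python as follows. The function name `apa_in_text` and the convention of passing a single page as a number and a range as a two-item tuple are assumptions made for this example.

```python
def apa_in_text(last_name, year, pages=None, name_in_text=False):
    """Build an APA parenthetical citation.

    pages: None, a single page number, or a (first, last) tuple.
    """
    # Drop the author when the name already appears in the sentence.
    parts = [] if name_in_text else [last_name]
    parts.append(str(year))
    if isinstance(pages, tuple):
        parts.append(f"pp. {pages[0]}-{pages[1]}")  # page range
    elif pages is not None:
        parts.append(f"p. {pages}")                 # single page
    return "(" + ", ".join(parts) + ")"


print(apa_in_text("Granger", 2018))
print(apa_in_text("Granger", 2018, pages=(95, 96)))
```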

Step 1 Start your Bibliography entry with the name of the author of the essay.

  • Example: Weasley, Ron.

Step 2 Include the title of the essay in quotation marks.

  • Example: Weasley, Ron. "Best Friend to a Hero."

Step 3 Add the title and editor of the larger work along with page numbers for the essay.

  • Example: Weasley, Ron. "Best Friend to a Hero." In Harry Potter: Wizard, Myth, Legend, edited by Xenophilius Lovegood, 80-92.

Step 4 Provide publication information for the larger work.

  • Example: Weasley, Ron. "Best Friend to a Hero." In Harry Potter: Wizard, Myth, Legend, edited by Xenophilius Lovegood, 80-92. Ottery St. Catchpole: Quibbler Books, 2018.

Chicago Bibliography Format:

LastName, FirstName. "Title of Essay." In Title of Book or Essay Collection, edited by FirstName LastName, ##-##. Location: Publisher, Year.
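The Chicago bibliography template above can also be sketched as a small Python helper. The function name `chicago_bibliography_entry` and its parameters are illustrative assumptions; the title of the collection should be italicized in your final document.

```python
def chicago_bibliography_entry(last, first, essay_title, collection, editor,
                               first_page, last_page, location, publisher,
                               year):
    """Assemble a Chicago bibliography entry for an essay in a collection."""
    return (f'{last}, {first}. "{essay_title}." In {collection}, '
            f'edited by {editor}, {first_page}-{last_page}. '
            f'{location}: {publisher}, {year}.')


# Values taken from the worked example in the steps above.
entry = chicago_bibliography_entry(
    "Weasley", "Ron", "Best Friend to a Hero",
    "Harry Potter: Wizard, Myth, Legend", "Xenophilius Lovegood",
    80, 92, "Ottery St. Catchpole", "Quibbler Books", 2018)
print(entry)
```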

Step 5 Adjust your formatting for footnotes.

  • Example: Ron Weasley, "Best Friend to a Hero," in Harry Potter: Wizard, Myth, Legend, edited by Xenophilius Lovegood, 80-92 (Ottery St. Catchpole: Quibbler Books, 2018).
  • After the first footnote, use a shortened footnote format that includes only the author's last name, the title of the essay, and the page number or page range where the referenced material appears.
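To show how the full and shortened footnotes relate, here is a hedged Python sketch. The full form mirrors the example above; the punctuation of the shortened form (commas after the name and title, a closing period) is an assumption modeled on the full footnote, and both function names are hypothetical.

```python
def chicago_full_footnote(first, last, essay_title, collection, editor,
                          pages, location, publisher, year):
    """First footnote: names in normal order, commas instead of periods."""
    return (f'{first} {last}, "{essay_title}," in {collection}, '
            f'edited by {editor}, {pages} '
            f'({location}: {publisher}, {year}).')


def chicago_short_footnote(last, essay_title, pages):
    """Later footnotes: last name, essay title, and page or page range."""
    return f'{last}, "{essay_title}," {pages}.'


full = chicago_full_footnote(
    "Ron", "Weasley", "Best Friend to a Hero",
    "Harry Potter: Wizard, Myth, Legend", "Xenophilius Lovegood",
    "80-92", "Ottery St. Catchpole", "Quibbler Books", 2018)
short = chicago_short_footnote("Weasley", "Best Friend to a Hero", 85)
print(full)
print(short)
```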

Tip: If you use the Chicago author-date system for in-text citation, use the same in-text citation method as APA style.


  • https://style.mla.org/essay-in-authored-textbook/
  • https://owl.purdue.edu/owl/research_and_citation/mla_style/mla_formatting_and_style_guide/mla_works_cited_page_books.html
  • https://utica.libguides.com/c.php?g=703243&p=4991646
  • https://owl.purdue.edu/owl/research_and_citation/mla_style/mla_formatting_and_style_guide/mla_in_text_citations_the_basics.html
  • https://guides.libraries.psu.edu/apaquickguide/intext
  • https://guides.himmelfarb.gwu.edu/c.php?g=27779&p=170363
  • https://owl.purdue.edu/owl/research_and_citation/apa_style/apa_formatting_and_style_guide/in_text_citations_the_basics.html
  • http://libguides.heidelberg.edu/chicago/book/chapter
  • https://librarybestbets.fairfield.edu/citationguides/chicagonotes-bibliography#CollectionofEssays
  • https://libguides.heidelberg.edu/chicago/book/chapter

About This Article

Diya Chaudhuri, PhD

To cite an essay using MLA format, include the name of the author and the page number of the source you’re citing in the in-text citation. For example, if you’re referencing page 123 from a book by John Smith, you would include “(Smith 123)” at the end of the sentence. Alternatively, include the information as part of the sentence, such as “Rathore and Chauhan determined that Himalayan brown bears eat both plants and animals (6652).” Then, make sure that all your in-text citations match the sources in your Works Cited list.


How to Write an Academic Essay with References and Citations


Written by  Scribendi

If you're wondering how to write an academic essay with references, look no further. In this article, we'll discuss how to use in-text citations and references, including how to cite a website, how to cite a book, and how to cite a Tweet, according to various style guides.


You might need to cite sources when writing a paper that references other sources. For example, when writing an essay, you may use information from other works, such as books, articles, or websites. You must then inform readers where this information came from. Failure to do so, even accidentally, is plagiarism—passing off another person's work as your own.

You can avoid plagiarism and show readers where to find information by using citations and references. 

Citations tell readers where a piece of information came from. They take the form of footnotes, endnotes, or parenthetical elements, depending on your style guide. In-text citations are usually placed at the end of a sentence containing the relevant information. 

A reference list, bibliography, or works cited list at the end of a text provides additional details about these cited sources. This list includes enough publication information to allow readers to look up these sources themselves.

Referencing is important for more than simply avoiding plagiarism. Referring to a trustworthy source shows that the information is reliable. Referring to reliable information can also support your major points and back up your argument. 

Learning how to write an academic essay with references and how to use in-text citations will allow you to cite authors who have made similar arguments. This helps show that your argument is objective and not entirely based on personal biases.

How Do You Determine Which Style Guide to Use?


Often, a professor will assign a style guide. The purpose of a style guide is to provide writers with formatting instructions. If your professor has not assigned a style guide, they should still be able to recommend one. 

If you are entirely free to choose, pick one that aligns with your field (for example, APA is frequently used for scientific writing). 

Some of the most common style guides are as follows:

AP style for journalism

Chicago style for publishing

APA style for scholarly writing (commonly used in scientific fields)

MLA style for scholarly citations (commonly used in English literature fields)

Some journals have their own style guides, so if you plan to publish, check which guide your target journal uses. You can do this by locating your target journal's website and searching for author guidelines.

How Do You Pick Your Sources?

When learning how to write an academic essay with references, you must identify reliable sources that support your argument. 

As you read, think critically and evaluate sources for:

Objectivity

Keep detailed notes on the sources so that you can easily find them again, if needed.

Tip: Record these notes in the format of your style guide—your reference list will then be ready to go.

How to Use In-Text Citations in MLA

An in-text citation in MLA includes the author's last name and the relevant page number: 

(Author 123)

How to Cite a Website in MLA

Here's how to cite a website in MLA:

Author's last name, First name. "Title of page." Website. Website Publisher, date. Web. Date retrieved. <URL>

With information from a real website, this looks like:

Morris, Nancy. "How to Cite a Tweet in APA, Chicago, and MLA." Scribendi. Scribendi Inc., n.d. Web. 22 Dec. 2021. <https://www.scribendi.com/academy/articles/how_to_cite_a_website.en.html>

How Do You Cite a Tweet in MLA?

MLA uses the full text of a short Tweet (under 140 characters) as its title. Longer Tweets can be shortened using ellipses. 

MLA Tweet references should be formatted as follows:

@twitterhandle (Author Name). "Text of Tweet." Twitter, Day Month Year, time of publication, URL.

With information from an actual Tweet, this looks like:

@neiltyson (Neil deGrasse Tyson). "You can't use reason to convince anyone out of an argument that they didn't use reason to get into." Twitter, 29 Sept. 2020, 10:15 p.m., https://twitter.com/neiltyson/status/1311127369785192449.
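The MLA Tweet format can be sketched in Python as shown below. This is a minimal sketch under stated assumptions: the function name `mla_tweet_entry` is hypothetical, the 140-character limit comes from the text above, and cutting a long Tweet at a word boundary with a trailing ellipsis (via the standard library's `textwrap.shorten`) is just one possible way to shorten it.

```python
import textwrap


def mla_tweet_entry(handle, author, text, date, time, url, limit=140):
    """Assemble an MLA reference for a Tweet, shortening long text."""
    # Drop the Tweet's own final period so the entry does not end in "..".
    title = textwrap.shorten(text.rstrip("."), width=limit,
                             placeholder=" ...")
    # A shortened title already ends in an ellipsis; add no extra period.
    closing = "" if title.endswith("...") else "."
    return (f'@{handle} ({author}). "{title}{closing}" '
            f'Twitter, {date}, {time}, {url}.')


entry = mla_tweet_entry(
    "neiltyson", "Neil deGrasse Tyson",
    "You can't use reason to convince anyone out of an argument "
    "that they didn't use reason to get into.",
    "29 Sept. 2020", "10:15 p.m.",
    "https://twitter.com/neiltyson/status/1311127369785192449")
print(entry)
```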

How to Cite a Book in MLA

Here's how to cite a book in MLA:

Author's last name, First name. Book Title. Publisher, Year.

With publication information from a real book, this looks like:

Montgomery, L.M. Rainbow Valley. Frederick A. Stokes Company, 1919.

How to Cite a Chapter in a Book in MLA

Author's last name, First name. "Title of Chapter." Book Title, edited by Editor Name, Publisher, Year, pp. page range.

With publication information from an actual book, this looks like:

Ezell, Margaret J.M. "The Social Author: Manuscript Culture, Writers, and Readers." The Broadview Reader in Book History, edited by Michelle Levy and Tom Mole, Broadview Press, 2015, pp. 375–394.

How to Cite a Paraphrase in MLA

You can cite a paraphrase in MLA exactly the same way as you would cite a direct quotation. 

Make sure to include the author's name (either in the text or in the parenthetical citation) and the relevant page number.

How to Use In-Text Citations in APA

In APA, in-text citations include the author's last name and the year of publication; a page number is included only if a direct quotation is used: 

(Author, 2021, p. 123)

How to Cite a Website in APA

Here's how to cite a website in APA:

Author, A. A., & Author, B. B. (Year, Month Day). Title of page. https://URL

Morris, N. (n.d.). How to cite a Tweet in APA, Chicago, and MLA. https://www.scribendi.com/academy/articles/how_to_cite_a_website.en.html


How Do You Cite a Tweet in APA?

APA refers to Tweets using their first 20 words. 

Tweet references should be formatted as follows:

Author, A. A. [@twitterhandle]. (Year, Month Day). First 20 words of the Tweet. [Tweet] Twitter. URL

When we input information from a real Tweet, this looks like:

deGrasse Tyson, N. [@neiltyson]. (2020, Sept. 29). You can't use reason to convince anyone out of an argument that they didn't use reason to get into. [Tweet] Twitter. https://twitter.com/neiltyson/status/1311127369785192449
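The "first 20 words" rule is easy to get wrong by hand, so here is a short Python sketch of it. The function name `apa_tweet_entry` is hypothetical, words are assumed to be separated by whitespace, and the layout copies the example above rather than any official tool.

```python
def apa_tweet_entry(author, handle, date, text, url, max_words=20):
    """Assemble an APA reference for a Tweet using its first 20 words."""
    # Splitting on whitespace is a simple assumption about what a word is.
    title = " ".join(text.split()[:max_words])
    return f"{author} [@{handle}]. ({date}). {title} [Tweet] Twitter. {url}"


entry = apa_tweet_entry(
    "deGrasse Tyson, N.", "neiltyson", "2020, Sept. 29",
    "You can't use reason to convince anyone out of an argument "
    "that they didn't use reason to get into.",
    "https://twitter.com/neiltyson/status/1311127369785192449")
print(entry)
```

The example Tweet has 19 words, so it appears in full; anything past the twentieth word would be dropped.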

How to Cite a Book in APA


Here's how to cite a book in APA:   

Author, A. A. (Year). Book title. Publisher.

For a real book, this looks like:

Montgomery, L. M. (1919). Rainbow valley. Frederick A. Stokes Company.

How to Cite a Chapter in a Book in APA

Author, A. A. (Year). Chapter title. In E. E. Editor (Ed.), Book Title (pp. page range). Publisher.

With information from a real book, this looks like:

Ezell, M. J. M. (2014). The social author: Manuscript culture, writers, and readers. In M. Levy & T. Mole (Eds.), The Broadview Reader in Book History (pp. 375–394). Broadview Press.

Knowing how to cite a book and how to cite a chapter in a book correctly will take you a long way in creating an effective reference list.


How to Cite a Paraphrase in APA

You can cite a paraphrase in APA the same way as you would cite a direct quotation, including the author's name and year of publication. 

In APA, you may also choose to pinpoint the page from which the information is taken.

Referencing is an essential part of academic integrity. Learning how to write an academic essay with references and how to use in-text citations shows readers that you did your research and helps them locate your sources.

Learning how to cite a website, how to cite a book, and how to cite a paraphrase can also help you avoid plagiarism —an academic offense with serious consequences for your education or professional reputation.


How to Cite Sources

Here is a complete list for how to cite sources. Most of these guides present citation guidance and examples in MLA, APA, and Chicago.

If you’re looking for general information on MLA or APA citations, the EasyBib Writing Center was designed for you! It has articles on what’s needed in an MLA in-text citation, how to format an APA paper, what an MLA annotated bibliography is, how to make an MLA works cited page, and much more!

MLA Format Citation Examples

The Modern Language Association created the MLA Style, currently in its 9th edition, to provide researchers with guidelines for writing and documenting scholarly borrowings. Most often used in the humanities, MLA style (or MLA format) has been adopted and used by numerous other disciplines, in multiple parts of the world.

MLA provides standard rules to follow so that most research papers are formatted in a similar manner. This makes it easier for readers to comprehend the information. The MLA in-text citation guidelines, MLA works cited standards, and MLA annotated bibliography instructions provide scholars with the information they need to properly cite sources in their research papers, articles, and assignments.

  • Book Chapter
  • Conference Paper
  • Documentary
  • Encyclopedia
  • Google Images
  • Kindle Book
  • Memorial Inscription
  • Museum Exhibit
  • Painting or Artwork
  • PowerPoint Presentation
  • Sheet Music
  • Thesis or Dissertation
  • YouTube Video

APA Format Citation Examples

The American Psychological Association created the APA citation style in 1929 as a way to help psychologists, anthropologists, and even business managers establish one common way to cite sources and present content.

APA is used to cite sources in academic writing such as journal articles, and is intended to help readers better comprehend content and to avoid language bias wherever possible. The APA style (or APA format) is now in its 7th edition and provides citation style guides for virtually any type of resource.

Chicago Style Citation Examples

The Chicago/Turabian style of citing sources is generally used when citing sources for humanities papers, and is best known for its requirement that writers place bibliographic citations at the bottom of a page (in Chicago-format footnotes) or at the end of a paper (endnotes).

The Turabian and Chicago citation styles are almost identical, but the Turabian style is geared towards student papers such as theses and dissertations, while the Chicago style provides guidelines for all types of publications. This is why you’ll commonly see Chicago style and Turabian style presented together. The Chicago Manual of Style is currently in its 17th edition, and Turabian’s A Manual for Writers of Research Papers, Theses, and Dissertations is in its 8th edition.

Citing Specific Sources or Events

  • Declaration of Independence
  • Gettysburg Address
  • Martin Luther King Jr. Speech
  • President Obama’s Farewell Address
  • President Trump’s Inauguration Speech
  • White House Press Briefing

Additional FAQs

  • Citing Archived Contributors
  • Citing a Blog
  • Citing a Book Chapter
  • Citing a Source in a Foreign Language
  • Citing an Image
  • Citing a Song
  • Citing Special Contributors
  • Citing a Translated Article
  • Citing a Tweet

6 Interesting Citation Facts

The world of citations may seem cut and dried, but there’s more to them than just specific capitalization rules, MLA in-text citations, and other formatting specifications. Citations have been helping researchers document their sources for hundreds of years, and are a great way to learn more about a particular subject area.

Ever wonder what sets all the different styles apart, or how they came to be in the first place? Read on for some interesting facts about citations!

1. There are Over 7,000 Different Citation Styles

You may be familiar with MLA and APA citation styles, but there are actually thousands of citation styles used for all different academic disciplines all across the world. Deciding which one to use can be difficult, so be sure to ask your instructor which one you should be using for your next paper.

2. Some Citation Styles are Named After People

While a majority of citation styles are named for the specific organizations that publish them (e.g., APA is published by the American Psychological Association, and MLA format is named for the Modern Language Association), some are actually named after individuals. The most well-known example of this is perhaps Turabian style, named for Kate L. Turabian, an American educator and writer. She developed this style as a condensed version of the Chicago Manual of Style in order to present a more concise set of rules to students.

3. There are Some Really Specific and Uniquely Named Citation Styles

How specific can citation styles get? The answer is very. For example, the “Flavour and Fragrance Journal” style is based on a bimonthly, peer-reviewed scientific journal published since 1985 by John Wiley & Sons. It publishes original research articles, reviews and special reports on all aspects of flavor and fragrance. Another example is “Nordic Pulp and Paper Research,” a style used by an international scientific magazine covering science and technology for the areas of wood or bio-mass constituents.

4. More Citations Were Created on EasyBib.com in the First Quarter of 2018 Than There Are People in California

The US Census Bureau estimates that approximately 39.5 million people live in the state of California. Meanwhile, about 43 million citations were made on EasyBib from January to March of 2018. That’s a lot of citations.

5. “Citations” is a Word With a Long History

The word “citations” can be traced back literally thousands of years to the Latin word “citare” meaning “to summon, urge, call; put in sudden motion, call forward; rouse, excite.” The word then took on its more modern meaning and relevance to writing papers in the 1600s, where it became known as the “act of citing or quoting a passage from a book, etc.”

6. Citation Styles are Always Changing

The concept of citations always stays the same. It is a means of preventing plagiarism and demonstrating where you relied on outside sources. The specific style rules, however, can and do change regularly. For example, in 2018 alone, 46 new citation styles were introduced, and 106 updates were made to existing styles. At EasyBib, we are always on the lookout for ways to improve our styles and opportunities to add new ones to our list.

Why Citations Matter

Here are the ways accurate citations can help your students achieve academic success, and how you can answer the dreaded question, “why should I cite my sources?”

They Give Credit to the Right People

Citing their sources makes sure that the reader can differentiate the student’s original thoughts from those of other researchers. Not only does this make sure that the sources they use receive proper credit for their work, it ensures that the student receives deserved recognition for their unique contributions to the topic. Whether the student is citing in MLA format, APA format, or any other style, citations serve as a natural way to place a student’s work in the broader context of the subject area, and serve as an easy way to gauge their commitment to the project.

They Provide Hard Evidence of Ideas

Having many citations from a wide variety of sources related to their idea means that the student is working on a well-researched and respected subject. Citing sources that back up their claim creates room for fact-checking and further research. And, if they can cite a few sources that have the converse opinion or idea, and then demonstrate to the reader why they believe that viewpoint is wrong by again citing credible sources, the student is well on their way to winning over the reader and cementing their point of view.

They Promote Originality and Prevent Plagiarism

The point of research projects is not to regurgitate information that can already be found elsewhere. We have Google for that! What the student’s project should aim to do is promote an original idea or a spin on an existing idea, and use reliable sources to promote that idea. Copying or directly referencing a source without proper citation can lead to not only a poor grade, but accusations of academic dishonesty. By citing their sources regularly and accurately, students can easily avoid the trap of plagiarism, and promote further research on their topic.

They Create Better Researchers

By researching sources to back up and promote their ideas, students are becoming better researchers without even knowing it! Each time a new source is read or researched, the student is becoming more engaged with the project and is developing a deeper understanding of the subject area. Proper citations demonstrate a breadth of the student’s reading and dedication to the project itself. By creating citations, students are compelled to make connections between their sources and discern research patterns. Each time they complete this process, they are helping themselves become better researchers and writers overall.

When is the Right Time to Start Making Citations?

1. Make In-Text/Parenthetical Citations as You Need Them

As you are writing your paper, be sure to include references within the text that correspond with references in a works cited or bibliography. These are usually called in-text citations or parenthetical citations in MLA and APA formats. The most effective time to complete these is directly after you have made your reference to another source. For instance, after writing the line from Charles Dickens’ A Tale of Two Cities: “It was the best of times, it was the worst of times…,” you would include a citation like this (depending on your chosen citation style):

(Dickens 11).

This signals to the reader that you have referenced an outside source. What’s great about this system is that the in-text citations serve as a natural list for all of the citations you have made in your paper, which will make completing the works cited page a whole lot easier. After you are done writing, all that will be left for you to do is scan your paper for these references, and then build a works cited page that includes a citation for each one.
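The scanning step described above can be sketched in Python. This is a minimal sketch, assuming MLA-style parentheticals of the form (Author 11) with a single capitalized last name; the pattern and the function name `cited_authors` are illustrative assumptions, and citations in other shapes (such as a page-only citation, or one with a first initial) would need extra handling.

```python
import re

# Matches "(Dickens 11)" or "(Dickens 11-12)": one capitalized name,
# a space, then a page number or page range.
CITATION = re.compile(r"\(([A-Z][\w'-]*) (\d+(?:-\d+)?)\)")


def cited_authors(text):
    """Return the sorted, de-duplicated authors cited in the text."""
    return sorted({match.group(1) for match in CITATION.finditer(text)})


draft = ("It was the best of times (Dickens 11). The students were "
         "frightened (Potter 28), and so was the narrator (Dickens 42).")
print(cited_authors(draft))
```

Each name in the resulting list should have a matching entry on the works cited page.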

Need help creating an MLA works cited page? Try the MLA format generator on EasyBib.com! We also have a guide on how to format an APA reference page.

2. Understand the General Formatting Rules of Your Citation Style Before You Start Writing

While reading up on paper formatting may not sound exciting, being aware of how your paper should look early on in the paper writing process is super important. Citation styles can dictate more than just the appearance of the citations themselves; they can also affect the layout of your paper as a whole, with specific guidelines concerning margin width, title treatment, and even font size and spacing. Knowing how to organize your paper before you start writing will ensure that you do not receive a low grade for something as trivial as forgetting a hanging indent.

Don’t know where to start? Here’s a formatting guide on APA format.

3. Double-check All of Your Outside Sources for Relevance and Trustworthiness First

Collecting outside sources that support your research and specific topic is a critical step in writing an effective paper. But before you run to the library and grab the first 20 books you can lay your hands on, keep in mind that selecting a source to include in your paper should not be taken lightly. Before you proceed with using a source to back up your ideas, run a quick Internet search for it and see if other scholars in your field have written about it as well. Check to see if there are book reviews about it or peer accolades. If you spot something that seems off to you, you may want to consider leaving it out of your work. Doing this before you start making citations can save you a ton of time in the long run.

Finished with your paper? It may be time to run it through a grammar and plagiarism checker, like the one offered by EasyBib Plus. If you’re just looking to brush up on the basics, our grammar guides are ready anytime you are.

How useful was this post?

Click on a star to rate it!

We are sorry that this post was not useful for you!

Let us improve this post!

Tell us how we can improve this post?

Citation Basics

Harvard Referencing

Plagiarism Basics

Plagiarism Checker

Upload a paper to check for plagiarism against billions of sources and get advanced writing suggestions for clarity and style.

Get Started

Have a language expert improve your writing

Run a free plagiarism check in 10 minutes, automatically generate references for free.

  • Knowledge Base
  • Referencing

A Quick Guide to Referencing | Cite Your Sources Correctly

Referencing means acknowledging the sources you have used in your writing. Including references helps you support your claims and ensures that you avoid plagiarism.

There are many referencing styles, but they usually consist of two things:

  • A citation wherever you refer to a source in your text.
  • A reference list or bibliography at the end listing full details of all your sources.

The most common method of referencing in UK universities is Harvard style, which uses author-date citations in the text.

Each referencing style has different rules for presenting source information. For in-text citations, some use footnotes or endnotes, while others include the author’s surname and date of publication in brackets in the text.

The reference list or bibliography is presented differently in each style, with different rules for things like capitalisation, italics, and quotation marks in references.

Your university will usually tell you which referencing style to use; they may even have their own unique style. Always follow your university’s guidelines, and ask your tutor if you are unsure. The most common styles are summarised below.

Harvard referencing, the most commonly used style at UK universities, uses author–date in-text citations corresponding to an alphabetical bibliography or reference list at the end.

Vancouver referencing, used in biomedicine and other sciences, uses reference numbers in the text corresponding to a numbered reference list at the end.

APA referencing, used in the social and behavioural sciences, uses author–date in-text citations corresponding to an alphabetical reference list at the end.

MHRA referencing, used in the humanities, uses footnotes in the text with source information, in addition to an alphabetised bibliography at the end.

OSCOLA referencing, used in law, uses footnotes in the text with source information, and an alphabetical bibliography at the end in longer texts.

In-text citations should be used whenever you quote, paraphrase, or refer to information from a source (e.g. a book, article, image, website, or video).

Quoting and paraphrasing

Quoting is when you directly copy some text from a source and enclose it in quotation marks to indicate that it is not your own writing.

Paraphrasing is when you rephrase the original source into your own words. In this case, you don’t use quotation marks, but you still need to include a citation.

In most referencing styles, page numbers are included when you’re quoting or paraphrasing a particular passage. If you are referring to the text as a whole, no page number is needed.

In-text citations

In-text citations are quick references to your sources. In Harvard referencing, you use the author’s surname and the date of publication in brackets.

Up to three authors are included in a Harvard in-text citation. If the source has more than three authors, include the first author followed by ‘et al.’

The point of these citations is to direct your reader to the alphabetised reference list, where you give full information about each source. For example, to find the source cited above, the reader would look under ‘J’ in your reference list to find the title and publication details of the source.

Placement of in-text citations

In-text citations should be placed directly after the quotation or information they refer to, usually before a comma or full stop. If a sentence is supported by multiple sources, you can combine them in one set of brackets, separated by a semicolon.

If you mention the author’s name in the text already, you don’t include it in the citation, and you can place the citation immediately after the name.

  • Another researcher warns that the results of this method are ‘inconsistent’ (Singh, 2018, p. 13).
  • Previous research has frequently illustrated the pitfalls of this method (Singh, 2018; Jones, 2016).
  • Singh (2018, p. 13) warns that the results of this method are ‘inconsistent’.
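The Harvard rules described above (surname and year in brackets, up to three authors named, ‘et al.’ beyond that, several sources joined with semicolons) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the choice to join two or three surnames with commas and ‘and’ is an assumption, since universities vary on that detail.

```python
def _author_label(surnames):
    """Name up to three authors; beyond that, first author plus 'et al.'."""
    if len(surnames) > 3:
        return f"{surnames[0]} et al."
    if len(surnames) == 1:
        return surnames[0]
    # Assumption: two or three authors are joined as "A and B" / "A, B and C".
    return ", ".join(surnames[:-1]) + " and " + surnames[-1]


def harvard_in_text(surnames, year, page=None, author_in_text=False):
    """Format one in-text citation, optionally with a page number."""
    details = str(year) if page is None else f"{year}, p. {page}"
    if author_in_text:
        # The author is already named in the sentence: Singh (2018, p. 13)
        return f"{_author_label(surnames)} ({details})"
    return f"({_author_label(surnames)}, {details})"


def harvard_combined(sources):
    """Combine several (surnames, year) sources in one set of brackets."""
    parts = [f"{_author_label(names)}, {year}" for names, year in sources]
    return "(" + "; ".join(parts) + ")"


print(harvard_in_text(["Singh"], 2018, page=13))
print(harvard_in_text(["Singh"], 2018, page=13, author_in_text=True))
print(harvard_combined([(["Singh"], 2018), (["Jones"], 2016)]))
```

The three printed lines reproduce the three citation forms shown in the examples above.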

The terms ‘bibliography’ and ‘reference list’ are sometimes used interchangeably. Both refer to a list that contains full information on all the sources cited in your text. Sometimes ‘bibliography’ is used to mean a more extensive list, also containing sources that you consulted but did not cite in the text.

A reference list or bibliography is usually mandatory, since in-text citations typically don’t provide full source information. For styles that already include full source information in footnotes (e.g. OSCOLA and Chicago Style), the bibliography is optional, although your university may still require you to include one.

Format of the reference list

Reference lists are usually alphabetised by authors’ last names. Each entry in the list appears on a new line, and a hanging indent is applied if an entry extends onto multiple lines.

Harvard reference list example

Different source information is included for different source types. Each style provides detailed guidelines for exactly what information should be included and how it should be presented.

Below are some examples of reference list entries for common source types in Harvard style.

  • Chapter of a book
  • Journal article

Your university should tell you which referencing style to follow. If you’re unsure, check with a supervisor. Commonly used styles include:

  • Harvard referencing , the most commonly used style in UK universities.
  • MHRA , used in humanities subjects.
  • APA , used in the social sciences.
  • Vancouver , used in biomedicine.
  • OSCOLA , used in law.

Your university may have its own referencing style guide.

If you are allowed to choose which style to follow, we recommend Harvard referencing, as it is a straightforward and widely used style.

References should be included in your text whenever you use words, ideas, or information from a source. A source can be anything from a book or journal article to a website or YouTube video.

If you don’t acknowledge your sources, you can get in trouble for plagiarism.

To avoid plagiarism, always include a reference when you use words, ideas or information from a source. This shows that you are not trying to pass the work of others off as your own.

You must also properly quote or paraphrase the source. If you’re not sure whether you’ve done this correctly, you can use the Scribbr Plagiarism Checker to find and correct any mistakes.

Harvard referencing uses an author–date system. Sources are cited by the author’s last name and the publication year in brackets. Each Harvard in-text citation corresponds to an entry in the alphabetised reference list at the end of the paper.

Vancouver referencing uses a numerical system. Sources are cited by a number in parentheses or superscript. Each number corresponds to a full reference at the end of the paper.

How to Reference in an Essay (9 Strategies of Top Students)

Are you feeling overwhelmed by referencing?

When you’re first asked to do referencing in an essay it can be hard to get your head around it. If it’s been a while since you were first taught how to reference, it can be intimidating to ask again how to do it!

I have so many students who consistently lose marks just because they didn’t get referencing right! They’re either embarrassed to ask for extra help or too lazy to learn how to solve the issues.

So, here’s a post that will help you solve the issues on your own.

Already think you’re good at referencing? No worries. This post goes through some surprising and advanced strategies for anyone to improve no matter what level you are at!

In this post I’m going to show you exactly how to reference in an essay. I’ll explain why we do it and I’ll show you 9 actionable tips on getting referencing right that I’m sure you will not have heard anywhere else!

The post is split into three parts:

  • What is a Reference and What is a Citation?
  • Why Reference? (4 Things you Should Know)
  • How to Reference (9 Strategies of Top Students)

If you think you’ve already got a good understanding of the basics, you can jump to our 9 Advanced Strategies section.

Part 1: What is a Reference and What is a Citation?

What is a Citation?

An in-text mention of your source. A citation is a short mention of the source you got the information from, usually in the middle or at the end of a sentence in the body of your paragraph. It is usually abbreviated so as not to distract the reader too much from your own writing. Here are two examples of citations. The first is in APA format. The second is in MLA format:

  • APA: Archaeological records trace the original human being to equatorial Africa about 250,000–350,000 years ago (Schlebusch & Jakobsson, 2018).
  • MLA: Archaeological records trace the original human being to equatorial Africa about 250,000–350,000 years ago (Schlebusch and Jakobsson 1).

In APA format, you’ve got the authors and year of publication listed. In MLA format, you’ve got the authors and page number listed. If you keep reading, I’ll give some more tips on formatting further down in this article.
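To make the difference concrete, here is a minimal Python sketch that builds both citation forms from the same author list. The function names are hypothetical, and the handling of three or more authors (first surname plus "et al.") reflects the current APA and MLA handbooks rather than anything stated in this post, so check your own cheat sheet before relying on it.

```python
def _names(surnames, conjunction):
    """Join one or two surnames; shorten three or more to 'First et al.'."""
    if len(surnames) >= 3:
        return f"{surnames[0]} et al."
    return f" {conjunction} ".join(surnames)


def apa_in_text(surnames, year):
    """APA: authors joined with '&', then a comma and the year."""
    return f"({_names(surnames, '&')}, {year})"


def mla_in_text(surnames, page):
    """MLA: authors joined with 'and', then the page number, no comma."""
    return f"({_names(surnames, 'and')} {page})"


authors = ["Schlebusch", "Jakobsson"]
print(apa_in_text(authors, 2018))  # (Schlebusch & Jakobsson, 2018)
print(mla_in_text(authors, 1))     # (Schlebusch and Jakobsson 1)
```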

And a Reference is:

What is a Reference?

A reference is the full details of a source that you list at the end of the article. For every citation (see above) there needs to be a corresponding reference at the end of the essay showing more details about that source. The idea is that the reader can see the source in-text (i.e. they can look at the citation) and if they want more information they can jump to the end of the page and find out exactly how to go about finding the source.

Here’s how you would go about referencing the Schlebusch and Jakobsson source in a list at the end of the essay. Again, I will show you how to do it in APA and MLA formats:

  • APA: Schlebusch, C., & Jakobsson, M. (2018). Tales of human migration, admixture, and selection in Africa. Annual Review of Genomics and Human Genetics, 11(33), 1–24.
  • MLA: Schlebusch, Carina, and Mattias Jakobsson. “Tales of Human Migration, Admixture, and Selection in Africa.” Annual Review of Genomics and Human Genetics, vol. 11, no. 33, 2018, pp. 1–24.

In strategy 1 below I’ll show you the easiest, foolproof way to write these references perfectly every time.

One last quick note: sometimes we say ‘reference’ when we mean ‘citation’. That’s pretty normal. Just roll with the punches. It’s usually pretty easy to pick up on what our teacher means regardless of whether they use the word ‘reference’ or ‘citation’.

Part 2: Why Reference in an Essay? (4 Things you Should Know)

Referencing in an essay is important. By the time you start doing 200-level courses, you probably won’t pass the course unless you reference appropriately. So, the biggest answer to ‘why reference?’ is simple: Because you Have To!

Okay, let’s be serious though … here are the four top ‘real’ reasons to reference:

1. Referencing shows you Got an Expert’s Opinion

You can’t just write an essay on what you think you know. This is a huge mistake of beginning students. Instead this is what you need to do:

Top Tip: Essays at university are supposed to show off that you’ve learned new information by reading the opinions of experts.

Every time you place a citation in your paragraph, you’re showing that the information you’re presenting in that paragraph was provided to you by an expert. In other words, it means you consulted an expert’s opinion to build your knowledge.

If you have citations throughout the essay with links to a variety of different expert opinions, you’ll show your marker that you did actually genuinely look at what the experts said with an open mind and considered their ideas.

This will help you to grow your grades.

2. Referencing shows you read your Assigned Readings

Your teacher will most likely give you scholarly journal articles or book chapters to read for homework between classes. You might have even talked about those assigned readings in your seminars and tutorials.

Great! The assigned readings are very important to you.

You should definitely cite the assigned readings relevant to your essay topic in your evaluative essay (unless your teacher tells you not to). Why? I’ll explain below.

  • Firstly, the assigned readings were selected by your teacher because your teacher (you know, the person who’s going to mark your essay) believes they’re the best quality articles on the topic. Translation: your teacher gave you the best source you’re going to find. Make sure you use it!
  • Secondly, by citing the assigned readings you are showing your teacher that you have been paying attention throughout the course. You are showing your teacher that you have done your homework, read those assigned readings and paid attention to them. When my students submit an essay that has references to websites, blogs, wikis and magazines I get very frustrated. Why would you cite low quality non-expert sources like websites when I gave you the expert’s article!? Really, it frustrates me so, so much.

So, cite the assigned readings to show your teacher you read the scholarly articles your teacher gave to you. It’ll help you grow your marks.

3. Referencing deepens your Knowledge

Okay, so you understand that you need to use referencing to show you got experts’ opinions on the topic.

But there’s more to it than that. There’s actually a real benefit for your learning.

If you force yourself to cite two expert sources per paragraph, you’re actually forcing yourself to get two separate pieces of expert knowledge. This will deepen your knowledge!

So, don’t treat referencing like a vanity exercise to help you gain more marks. Actually view it as an opportunity to develop deeper understandings of the topic!

When you read expert sources, aim to pick up on some new gems of knowledge that you can discuss in your essays. Some things you should look out for when finding sources to reference:

  • Examples that link ideas to real life. Do the experts provide real-life examples that you can mention in your essay?
  • Facts and figures. Usually experts have conducted research on a topic and provide you with facts and figures from their research. Use those facts and figures to deepen your essay!
  • Short Quotes. Did your source say something in a really interesting, concise or surprising way? Great! You can quote that source in your essay .
  • New Perspectives. Your source might give you another perspective, angle or piece of information that you can add to your paragraph so that it’s a deep, detailed and interesting paragraph.

So, the reason we ask you to reference is at the end of the day because it’s good for you: it helps you learn!

4. Referencing backs up your Claims

You might think you already know a ton of information about the topic and be ready to share your mountains of knowledge with your teacher. Great!

So, should you still reference?

Yes. Definitely.

You need to show that you’re not the only person with your opinion. You need to ‘stand on the shoulders of giants.’ Show what other sources have said about your points to prove that experts agree with you.

You should be saying: this is my opinion and it’s based on facts, expert opinions and deep, close scrutiny of all the arguments that exist out there .

If you make a claim that no one else has made, your teacher is going to be like “Have you even been reading the evidence on this topic?” The answer, if there are no citations is likely: No. You haven’t.

Even if you totally disagree with the experts, you still need to say what their opinions are! You’ll need to say: “This is the experts’ opinions. And this is why I disagree.”

So, yes, you need to reference to back up every claim. Try to reference twice in every paragraph to achieve this.

Part 3: Strategies for How to Reference in an Essay (9 Strategies of Top Students)

Let’s get going with our top strategies for how to reference in an essay! These are strategies that you probably haven’t heard elsewhere. They work for everyone – from beginner to advanced! Let’s get started:

1. Print out your Reference Style Cheat Sheet

Referencing is hard and very specific. You need to know where to place your italics, where the commas go and whether to use an initial or full name for an author.

There are so many details to get right.

And here’s the bad news: The automated referencing apps and websites nearly always get it wrong! They tell you they can generate the citation for you. The fact of the matter is: they can’t!

Here’s the best way to get referencing right: Download a referencing cheat sheet and have it by your side while writing your essay.

Your assignment outline should tell you what type of referencing you should use. Different styles include: APA Style, MLA Style, Chicago Style, Harvard Style, Vancouver Style … and many more!

You need to find out which style you need to use and download your cheat sheet. You can jump onto google to find a cheat sheet by typing in the google bar:

how to reference in an essay

Download a pdf version of the referencing style cheat sheet, print it out, and place it on your pinboard or by your side when writing your essay.

2. Only cite Experts

There are good and bad sources to cite in an essay.

You should only cite sources written, critiqued and edited by experts. This shows that you have got the skill of finding information that is authoritative. You haven’t just used information that any old person popped up on their blog. You haven’t just gotten information from your local newspaper. Instead, you got information from the person who is an absolute expert on the topic.

Here’s an infographic listing sources that you should and shouldn’t cite. Feel free to share this infographic on social media, with your teachers and your friends:

[Infographic: good and bad sources to cite]

3. Always use Google Scholar

Always. Use. Google. Scholar.

Ten years ago students only had their online university search database to find articles. Those university databases suck. They rarely find the best quality sources and there’s always a big mix of completely irrelevant sources mixed in there.

Google Scholar is better at finding the sources you want. That’s because it looks through the whole article abstract and analyses it to see if it’s relevant to your search keywords. By contrast, most university search databases rely only on the titles of articles.

Use the power of the best quality search engine in the world to find scholarly sources .

Note: Google and Google Scholar are different search engines.

To use Google Scholar, go to: https://scholar.google.com

Then, search on google scholar using keywords. I’m going to search keywords for an essay on the topic: “What are the traits of a good nurse?”

[Screenshot: Google Scholar search results]

If you really like the idea of that first source, I recommend copying the title and trying your University online search database. Your university may give you free access.

4. Cite at least 50% sources you found on your Own Research

Okay, so I’ve told you that you should cite both assigned readings and readings you find from Google Scholar.

Here’s the ideal mix of assigned sources and sources that you found yourself: 50/50.

Your teacher will want to see that you can use both assigned readings and do your own additional research to write a top essay . This shows you’ve got great research skills but also pay attention to what is provided in class.

I recommend that you start with the assigned readings and try to get as much information out of them, then find your own additional sources beyond that using Google Scholar.

So, if your essay has 10 citations, a good mix is 5 assigned readings and 5 readings you found by yourself.

5. Cite Newer Sources

As a general rule, the newer the source the better .

The best rule of thumb that most teachers follow is that you should aim to mostly cite sources from the past 10 years . I usually accept sources from the past 15 years when marking essays.

However, sometimes you have a really great source that’s 20, 30 or 40 years old. You should only cite these sources if they’re what we call ‘seminal texts’. A seminal text is one that was written by an absolute giant in your field and revolutionized the subject.

Here are some examples of seminal authors whose articles you would be able to cite even though they’re old:

  • Education: Vygotsky, Freire, Piaget
  • Sociology: Weber, Marx, C. Wright Mills
  • Psychology: Freud, Rogers, Jung

Even if I cite seminal authors, I always aim for at least 80% of my sources to have been written in the past 10 years.
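If you want to check that 80% target against your own reference list, the arithmetic is simple enough to script. The sketch below is only an illustration: the function name and the sample publication years are made up, and it treats a source as recent when it is no more than 10 years older than the year you pass in.

```python
def share_recent(years, current_year, window=10):
    """Return the fraction of sources published within the last `window` years."""
    if not years:
        raise ValueError("the reference list is empty")
    recent = [y for y in years if current_year - y <= window]
    return len(recent) / len(years)


# Hypothetical publication years: four recent sources and one seminal text.
years = [2018, 2019, 2020, 2021, 1978]
share = share_recent(years, current_year=2023)
print(f"{share:.0%} of sources are from the past 10 years")
print("Meets the 80% target" if share >= 0.8 else "Below the 80% target")
```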

6. Reference twice per Paragraph

How much should you reference?

Here’s a good strategy: Provide two citations in every paragraph in the body of the essay.

It’s not compulsory to reference in the introduction and conclusion. However, in all the other paragraphs, aim for two citations.

Let’s go over the key strategies for achieving this:

  • These two citations should be to different sources, not the same sources twice;
  • Two citations per paragraph shows your points are backed up by not one, but two expert sources;
  • Place one citation in the first half of the paragraph and one in the second half. This will indicate to your marker that all the points in the whole paragraph are backed up by your citations.
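A rough way to check the two-per-paragraph rule on a draft is to count bracketed citations in each paragraph. The Python sketch below is a heuristic under stated assumptions, not a real citation parser: it assumes author-date citations such as (Singh, 2018), assumes paragraphs are separated by blank lines, and will miscount any other bracketed text that happens to contain a four-digit number. The function names are hypothetical, and it does not exempt the introduction or conclusion.

```python
import re

# Heuristic: a bracketed group that contains a four-digit year.
CITATION_GROUP = re.compile(r"\(([^()]*\b\d{4}\b[^()]*)\)")
YEAR = re.compile(r"\b\d{4}\b")


def count_citations(paragraph):
    """Count author-date citations; '(A, 2018; B, 2016)' counts as two."""
    total = 0
    for group in CITATION_GROUP.findall(paragraph):
        total += sum(1 for part in group.split(";") if YEAR.search(part))
    return total


def paragraphs_below_target(essay, target=2):
    """Return 1-based numbers of paragraphs with fewer than `target` citations."""
    paragraphs = [p for p in essay.split("\n\n") if p.strip()]
    return [i for i, p in enumerate(paragraphs, start=1)
            if count_citations(p) < target]


draft = (
    "The method is unreliable (Singh, 2018, p. 13). Others agree (Jones, 2016).\n\n"
    "This paragraph has a single citation (Singh, 2018)."
)
print(paragraphs_below_target(draft))  # [2]
```

It does not check that the two citations point to different sources or that they sit in different halves of the paragraph; those are still worth checking by eye.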

This is a good rule of thumb for you when you’re not sure when and how often to reference. When you get more confident with your referencing, you can mix this up a little.

7. The sum total of your sources should be minimum 1 per 150 words

You can, of course, cite one source more than once throughout the essay. You might cite the same source in the second, fourth and fifth paragraphs. That’s okay.

Essay Writing Tip: Provide one unique citation in the reference list for every 150 words in the essay.

But, you don’t want your whole essay to be based on a narrow range of sources. You want your marker to see that you have consulted multiple sources to get a wide range of information on the topic. Your marker wants to know that you’ve seen a range of different opinions when coming to your conclusions.

When you get to the end of your essay, check to see how many sources are listed in the end-text reference list. A good rule of thumb is 1 source listed in the reference list per 150 words. Here’s how that breaks down by essay size:

  • 1500 word essay: 10 sources (or more) listed in the reference list
  • 2000 word essay: 13 sources (or more) listed in the reference list
  • 3000 word essay: 20 sources (or more) listed in the reference list
  • 5000 word essay: 33 sources (or more) listed in the reference list
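The figures in that list come from dividing the word count by 150 and rounding down. As a small sketch (the function name is hypothetical):

```python
def minimum_sources(word_count, words_per_source=150):
    """Rule of thumb: one reference-list entry per 150 words, rounded down."""
    return word_count // words_per_source


for words in (1500, 2000, 3000, 5000):
    print(f"{words} word essay: {minimum_sources(words)} sources (or more)")
```

Running it prints 10, 13, 20 and 33 sources, matching the breakdown above.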

8. Instantly improve your Reference List with these Three Tips

Here are three things you can do to instantly improve your reference list. Each takes less than 20 seconds and gives your reference list a strong professional finish:

a) Ensure the font size and style are the same

You will usually find that your whole reference list ends up being in different font sizes and styles. This is because you tend to copy and paste the titles and names in the citations from other sources. If you submit the reference list with font sizes and styles that are not the same as the rest of the essay, the piece looks really unprofessional.

So, quickly highlight the whole reference list and change its font to the same font size and style as the rest of your essay. The screencast at the end of Step 8 walks you through this if you need a hand!

b) List your sources in alphabetical order.

Nearly every referencing style insists that references be listed in alphabetical order. It’s a simple thing to do before submitting and makes the piece look far more professional.

If you’re using Microsoft Word, simply highlight your whole reference list and click the A>Z button in the toolbar. If you can’t see it, you need to be under the ‘home’ tab (circled below):

[Screenshot: the A>Z sort button on the Home tab in Microsoft Word]

c) Use a hanging indent.

You’ve probably never heard of a hanging indent. It’s a style where the second and any later lines of each reference are indented further from the left-hand side of the page than the first line. It’s a strategy that’s usually used in reference lists provided in professional publications.

If you use the hanging indent, your reference list will look far more professional.
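If you keep your references in a plain-text file rather than in Word, the alphabetical sort and the hanging indent can both be sketched in Python. This is an illustration only: the function names are hypothetical, the 70-character line width and four-space indent are arbitrary choices, and skipping leading quotation marks when sorting is an assumption that suits entries beginning with a quoted title.

```python
import re
import textwrap


def sort_key(entry):
    """Sort case-insensitively, skipping leading quotation marks or punctuation."""
    return re.sub(r"^\W+", "", entry).casefold()


def format_reference_list(entries, width=70, indent="    "):
    """Alphabetise the entries and wrap each one with a hanging indent."""
    wrapped = [
        textwrap.fill(entry, width=width, subsequent_indent=indent)
        for entry in sorted(entries, key=sort_key)
    ]
    return "\n".join(wrapped)


# Placeholder entries for illustration only; substitute your own references.
entries = [
    "Surname, B. (Year). Title of a second placeholder source, long enough "
    "to wrap onto a second line. Publisher.",
    '"Anonymous Placeholder Title." Website Name, Year.',
]
print(format_reference_list(entries))
```

Only wrapped lines receive the indent, so the first line of each reference stays flush with the left margin.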

Here’s a quick video of me doing it for you:

9. Do one special edit especially for Referencing Style

The top students edit their essays three to five times spaced out over a week or more before submitting. One of those edits should be specifically for ensuring your reference list adheres to the referencing style that your teacher requires.

To do this, I recommend you get that cheat sheet printout that I mentioned in Step 1 and have it by your side while you read through the piece. Pay special attention to the use of commas, capital letters, brackets and page numbers for all citations. Also pay attention to the reference list: correct formatting of the reference list can be the difference between getting the top mark in the class and the fifth mark in the class. At the higher end of the marking range, things get competitive and formatting of the reference list counts.

A Quick Summary of the 9 Top Strategies…

Follow the rules of your referencing style guide (and that cheat sheet I recommended!) and use the top 9 tips above to improve your referencing and get top marks. Not only will your referencing look more professional, you’ll probably increase the quality of the content of your piece as well when you follow these tips!

Here’s a final summary of the 9 top tips:

Strategies for How to Reference in an Essay (9 Strategies of Top Students)

  • Print out your Reference Style Cheat Sheet
  • Only cite Experts
  • Always use Google Scholar
  • Cite at least 50% sources you found on your Own Research
  • Cite Newer Sources
  • Reference twice per Paragraph
  • The sum total of your sources should be minimum 1 per 150 words
  • Instantly improve your Reference List with these Three Tips
  • Do one special edit especially for Referencing Style

Chris Drew (PhD)

Dr. Chris Drew is the founder of the Helpful Professor. He holds a PhD in education and has published over 20 articles in scholarly journals. He is the former editor of the Journal of Learning Development in Higher Education.

Cite While You Write: MLA

DeCandido, Graceanne A. "Bibliographic Good vs. Evil in Buffy the Vampire Slayer." American Libraries Sept. 1999: 44-47. Print.

"Buffy Slays Academics." BBC News Education. 7 Nov. 2001. BBC. Web. 8 July 2008 <http://news.bbc.co.uk/1/hi/education/1642829.stm>.

Masquerade. "Moral Ambiguities." All Things Philosophical on Buffy the Vampire Slayer and Angel: The Series. 7 March 2005. Web. 9 July 2008 <http://www.atpobtvs.com/moram.html>.

Quoting and integrating sources into your paper

In any study of a subject, people engage in a “conversation” of sorts, where they read or listen to others’ ideas, consider them with their own viewpoints, and then develop their own stance. It is important in this “conversation” to acknowledge when we use someone else’s words or ideas. If we didn’t come up with it ourselves, we need to tell our readers who did come up with it.

It is important to draw on the work of experts to formulate your own ideas. Quoting and paraphrasing the work of authors engaged in writing about your topic adds expert support to your argument and thesis statement. With your writing, you are contributing to a scholarly conversation with experts on your topic. This is the difference between a scholarly research paper and any other paper: you must include your own voice in your analysis and ideas alongside those of scholars or experts.

All your sources must relate to your thesis, or central argument, whether they are in agreement or not. It is a good idea to address all sides of the argument or thesis to make your stance stronger. There are two main ways to incorporate sources into your research paper.

Quoting is when you use the exact words from a source. You will need to put quotation marks around the words that are not your own and cite where they came from. For example:

“It wasn’t really a tune, but from the first note the beast’s eyes began to droop . . . Slowly the dog’s growls ceased – it tottered on its paws and fell to its knees, then it slumped to the ground, fast asleep” (Rowling 275).

Follow these guidelines when opting to quote a passage:

  • Choose to quote passages that seem especially well phrased or are unique to the author or subject matter.
  • Be selective in your quotations. Avoid over-quoting. You also don’t have to quote an entire passage. Use ellipses (. . .) to indicate omitted words. Check with your professor for their ideal length of quotations – some professors place word limits on how much of a sentence or paragraph you should quote.
  • Before or after quoting a passage, include an explanation in which you interpret the significance of the quote for the reader. Avoid “hanging quotes” that have no context or introduction. It is better to assume that your reader will not understand your point until you spell it out than to assume that readers will follow your thought process exactly.
  • If you are having trouble paraphrasing (putting something into your own words), that may be a sign that you should quote it.
  • Shorter quotes are generally incorporated into the flow of a sentence while longer quotes may be set off in “blocks.” Check your citation handbook for quoting guidelines.

Paraphrasing is when you state the ideas from another source in your own words. Even when you use your own words, if the ideas or facts came from another source, you need to cite where they came from. Quotation marks are not used. For example:

With the simple music of the flute, Harry lulled the dog to sleep (Rowling 275).

Follow these guidelines when opting to paraphrase a passage:

  • Don’t take a passage and change a word here or there. You must write out the idea in your own words. Simply changing a few words from the original source or restating the information exactly using different words is considered plagiarism.
  • Read the passage, reflect upon it, and restate it in a way that is meaningful to you within the context of your paper. You are using this to back up a point you are making, so your paraphrased content should be tailored to that point specifically.
  • After reading the passage that you want to paraphrase, look away from it, and imagine explaining the main point to another person.
  • After paraphrasing the passage, go back and compare it to the original. Are there any phrases that have come directly from the original source? If so, you should rephrase it or put the original in quotation marks. If you cannot state an idea in your own words, you should use the direct quotation.
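
The comparison step in the last guideline can be roughly automated. The following is a minimal sketch, not an official plagiarism check: it assumes that any run of three consecutive words shared with the original is worth a second look. The function name `shared_phrases` and the three-word threshold are illustrative choices, not part of any citation standard.

```python
import re


def shared_phrases(original: str, paraphrase: str, n: int = 3) -> set:
    """Return every run of n consecutive words that appears in both texts.

    Assumption: words are compared case-insensitively and punctuation
    is ignored. The threshold n=3 is an arbitrary illustrative default.
    """
    def ngrams(text: str) -> set:
        # Keep only letters and apostrophes so punctuation does not matter.
        words = re.findall(r"[a-z']+", text.lower())
        return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

    return ngrams(original) & ngrams(paraphrase)


original = "Slowly the dog's growls ceased - it tottered on its paws and fell to its knees"
paraphrase = "With the simple music of the flute, Harry lulled the dog to sleep"

# An empty result means no three-word run was copied from the original.
overlap = shared_phrases(original, paraphrase)
```

Any phrase the function returns is a candidate for rewording or for quotation marks. An empty result does not prove the paraphrase is acceptable, because a paraphrase still needs a citation and must not simply mirror the original's structure.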

Summarizing is similar to paraphrasing, but it is used when you are giving an overview of many ideas. As in paraphrasing, quotation marks are not used, but a citation is still necessary. For example:

Through a combination of skill and their invisibility cloak, Harry, Ron, and Hermione slipped through Hogwarts to the dog’s room and down through the trapdoor within (Rowling 271-77).
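
The parenthetical citations in the three examples above follow an author-and-page pattern. As a small illustration, the sketch below builds that string. The function name `mla_in_text` is hypothetical, and the rule for shortening a page range (keeping only the last two digits of the second number when the leading digits match) is an assumption inferred from the "271-77" example; check your citation handbook before relying on it.

```python
from typing import Optional


def mla_in_text(author_last: str, first_page: int,
                last_page: Optional[int] = None) -> str:
    """Build an author-page parenthetical citation such as (Rowling 275).

    Assumption: for ranges above 99 whose leading digits match, the
    second number is shortened to its last two digits (271-77).
    """
    if last_page is None or last_page == first_page:
        return f"({author_last} {first_page})"
    start, end = str(first_page), str(last_page)
    if first_page > 99 and len(start) == len(end) and start[:-2] == end[:-2]:
        end = end[-2:]  # shorten 277 to 77 when both pages share the leading digits
    return f"({author_last} {start}-{end})"


quote_citation = mla_in_text("Rowling", 275)
summary_citation = mla_in_text("Rowling", 271, 277)
```

Here `quote_citation` matches the citation used for the quotation and paraphrase, and `summary_citation` matches the one used for the summary.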

Important guidelines

When integrating a source into your paper, remember to use these three important components:

  • Introductory phrase to the source material: mention the author, date, or any other relevant information when introducing a quote or paraphrase.
  • Source material: a direct quote, paraphrase, or summary with proper citation.
  • Analysis of source material: your response, interpretations, or arguments regarding the source material should introduce or follow it. When incorporating source material into your paper, relate your source and analysis back to your original thesis.

Ideally, papers will contain a good balance of direct quotations, paraphrasing and your own thoughts. Too much reliance on quotations and paraphrasing can make it seem like you are only using the work of others and have no original thoughts on the topic.

Always properly cite an author’s original idea, whether you have directly quoted or paraphrased it. If you have questions about how to cite properly in your chosen citation style, browse these citation guides. You can also review our guide to understanding plagiarism.

University Writing Center

The University of Nevada, Reno Writing Center provides helpful guidance on quoting and paraphrasing and explains how to make sure your paraphrasing does not veer into plagiarism. If you have any questions about quoting or paraphrasing, or need help at any point in the writing process, schedule an appointment with the Writing Center.

Works Cited

Rowling, J.K. Harry Potter and the Sorcerer's Stone. A.A. Levine Books, 1998.

The Silent Symphony: Decoding the Implications of Hazelwood v Kuhlmeier

This essay about the landmark case Hazelwood v Kuhlmeier explores the intricate balance between student press rights and school authority. It highlights how the Supreme Court’s ruling shifted the landscape of free speech in educational settings, allowing educators to censor school-sponsored publications under certain circumstances. The summary underscores the ongoing debate over the implications of the decision for student autonomy and democratic discourse within schools, emphasizing the need for robust safeguards to protect dissenting voices while upholding the educational mission of institutions.

Hazelwood v Kuhlmeier, an emblematic legal saga, has sculpted the terrain of student press rights, etching a profound narrative within the annals of First Amendment jurisprudence in the United States. Stemming from a contentious clash between student journalists and school administrators over the censorship of articles in a school-endorsed newspaper, this landmark case has catalyzed a profound introspection into the delicate balance between the liberties of students and the prerogatives of educational institutions. Unraveling the intricacies of this legal odyssey unveils a tapestry of competing interests and ethical quandaries, underscoring the nuanced interplay between freedom of expression and institutional authority.

At its crux, Hazelwood v Kuhlmeier embodies the tension between the constitutional rights of students and the imperatives of educational governance. Central to the dispute was the pivotal question of whether school officials wielded the prerogative to exercise prior restraint over student publications integrated into the curriculum. In a decisive 5-3 verdict, the Supreme Court tilted the scales in favor of the school district, affirming the discretion of educators to censor school-sponsored expressive endeavors provided they could substantiate a legitimate educational rationale. This precedent diverged sharply from the benchmark set by the Tinker standard in 1969, which accorded greater latitude to student speech rights.

The Hazelwood ruling still reverberates throughout the corridors of academia, having marked a paradigm shift in student press rights and administrative autonomy. While detractors lament the erosion of students’ First Amendment protections, advocates contend that the verdict strikes an equitable balance between unfettered expression and the pedagogical mission of schools. Nonetheless, the case serves as a poignant reminder of the need for robust safeguards against the arbitrary stifling of dissenting voices, so that democratic discourse within schools stays vibrant.

In the wake of Hazelwood, the legal discourse surrounding student press rights continues to evolve, with subsequent jurisprudence offering fertile ground for refinement and elucidation of the parameters governing school censorship. However, the legacy of Hazelwood endures as a poignant testament to the complexities inherent in reconciling the competing imperatives of student autonomy, institutional prerogatives, and societal mores. As the contours of free speech within educational precincts are meticulously delineated, the insights gleaned from Hazelwood serve as a beacon, illuminating the trajectory towards a more equitable and enlightened discourse within our scholastic institutions.

Cite this page

The Silent Symphony: Decoding the Implications of Hazelwood v Kuhlmeier. (2024, Jun 01). Retrieved from https://papersowl.com/examples/the-silent-symphony-decoding-the-implications-of-hazelwood-v-kuhlmeier/

"The Silent Symphony: Decoding the Implications of Hazelwood v Kuhlmeier." PapersOwl.com, 1 Jun 2024, https://papersowl.com/examples/the-silent-symphony-decoding-the-implications-of-hazelwood-v-kuhlmeier/

PapersOwl.com. (2024). The Silent Symphony: Decoding the Implications of Hazelwood v Kuhlmeier. [Online]. Available at: https://papersowl.com/examples/the-silent-symphony-decoding-the-implications-of-hazelwood-v-kuhlmeier/ [Accessed: 8 Jun. 2024]

"The Silent Symphony: Decoding the Implications of Hazelwood v Kuhlmeier." PapersOwl.com, Jun 01, 2024. Accessed June 8, 2024. https://papersowl.com/examples/the-silent-symphony-decoding-the-implications-of-hazelwood-v-kuhlmeier/

"The Silent Symphony: Decoding the Implications of Hazelwood v Kuhlmeier," PapersOwl.com, 01-Jun-2024. [Online]. Available: https://papersowl.com/examples/the-silent-symphony-decoding-the-implications-of-hazelwood-v-kuhlmeier/. [Accessed: 8-Jun-2024]

Purdue Online Writing Lab Purdue OWL® College of Liberal Arts

Welcome to the Purdue Online Writing Lab

This page is brought to you by the OWL at Purdue University. When printing this page, you must include the entire legal notice.

Copyright ©1995-2018 by The Writing Lab & The OWL at Purdue and Purdue University. All rights reserved. This material may not be published, reproduced, broadcast, rewritten, or redistributed without permission. Use of this site constitutes acceptance of our terms and conditions of fair use.

The Online Writing Lab at Purdue University houses writing resources and instructional material, and we provide these as a free service of the Writing Lab at Purdue. Students, members of the community, and users worldwide will find information to assist with many writing projects. Teachers and trainers may use this material for in-class and out-of-class instruction.

The Purdue On-Campus Writing Lab and Purdue Online Writing Lab assist clients in their development as writers—no matter what their skill level—with on-campus consultations, online participation, and community engagement. The Purdue Writing Lab serves the Purdue, West Lafayette, campus and coordinates with local literacy initiatives. The Purdue OWL offers global support through online reference materials and services.

A Message From the Assistant Director of Content Development 

The Purdue OWL® is committed to supporting students, instructors, and writers by offering a wide range of resources that are developed and revised with them in mind. To do this, the OWL team is always exploring possibilities for a better design, allowing accessibility and user experience to guide our process. As the OWL undergoes some changes, we welcome your feedback and suggestions by email at any time.

Please don't hesitate to contact us via our contact page if you have any questions or comments.

All the best,

What Europe Fears

American allies see a second Trump term as all but inevitable. “The anxiety is massive.”

In early April, a crowd of diplomats and dignitaries gathered in the Flemish countryside to toast the most powerful military alliance in the history of the world, and convince themselves it wasn’t about to collapse.

They arrived in a convoy of town cars that snaked down a private driveway and deposited them outside Truman Hall, a white-brick house set on 27 acres of gardens and hazelnut groves. Originally built by a Belgian chocolatier, the estate was sold to the American government at a discount—a thank-you gift for liberating Europe—and became the residence of the U.S. ambassador to NATO. Tonight, Julianne Smith, the inexhaustibly cheerful diplomat who currently holds the job, was stationed at the front door, greeting each guest.

The reception was part of a two-day onslaught of ceremonial activity ostensibly organized to celebrate the 75th anniversary of NATO. There were photo ops and triumphant speeches. The original copy of NATO’s founding charter was brought from Washington, D.C., for display, left open to the most important lines in the treaty, Article 5: “The Parties agree that an armed attack against one or more of them in Europe or North America shall be considered an attack against them all …” Officials ate cake, and declared the alliance stronger than ever.

At Truman Hall, every effort was made to keep the mood festive despite a storm looming outside. Beneath a backyard tent, Secretary of State Antony Blinken spoke, followed by NATO Secretary-General Jens Stoltenberg.

Stoltenberg, lean and unrumpled, decided to do something diplomatically unorthodox: acknowledge reality. Anxiety about America’s commitment to the alliance had been omnipresent and unspoken; now Stoltenberg was directly addressing the dangers of a potential U.S. withdrawal from the world.

“The United States left Europe after the First World War,” he said, adding, with a measure of Scandinavian understatement, “That was not a big success.”

The wind was picking up outside, pounding the flaps of the tent and making it difficult to hear. Stoltenberg raised his voice. “Ever since the alliance was established,” he said, “it has been a great success, preserving peace, preventing war, and enabling economic prosperity—”

A strong gust hit the tent, rattling the light trusses above. Guests glanced around nervously.

Stoltenberg stumbled. “The great success has been, uh, enabled or has happened not least because of U.S. leadership—”

Another gust, and the large chandelier hanging over the crowd began to swing. Murmurs rippled through the audience. Stoltenberg, perhaps aware of the unfortunate symbolism that would result from a NATO tent collapse, got quickly to the point.

“I cannot tell you exactly what the next crisis or the next conflict or the next war will be,” he said, but “as long as we stand together, no one can threaten us. We are safe.”

Stoltenberg would tell me weeks later that the speech was intended as a rallying cry. That night, it sounded more like a plea.

The undercurrent of dread at Truman Hall was not unique. I encountered it in nearly every conversation I had while traveling through Europe this spring. In capitals across the continent—from Brussels to Berlin, Warsaw to Tallinn—leaders and diplomats expressed a sense of alarm bordering on panic at the prospect of Donald Trump’s reelection.

“We’re in a very precarious place,” one senior NATO official told me. He wasn’t supposed to talk about such things on the record, but it was hardly a secret. The largest armed conflict in Europe since World War II was grinding into its third year. The Ukrainian counteroffensive had failed, and Russia was gaining momentum. Sixty billion dollars in desperately needed military aid for Ukraine had been stalled for months in the dysfunctional U.S. Congress. And, perhaps most ominous, America—the country with by far the biggest military in NATO—appeared on the verge of reelecting a president who has repeatedly threatened to withdraw the U.S. from the alliance.

Fear of losing Europe’s most powerful ally has translated into a pathologically intense fixation on the U.S. presidential race. European officials can explain the Electoral College in granular detail and cite polling data from battleground states. Thomas Bagger, the state secretary in the German foreign ministry, told me that in a year when billions of people in dozens of countries around the world will get the chance to vote, “the only election all Europeans are interested in is the American election.” Almost every official I spoke with believed that Trump is going to win.

The irony of Europe’s obsession with the upcoming election is that the people who will decide its outcome aren’t thinking about Europe much at all. In part, that’s because many Americans haven’t seen the need for NATO in their lifetime (despite the fact that the September 11 terrorist attacks were the only time Article 5 has been invoked). As one journalist in Brussels put it to me, the alliance has for decades been a “solution in search of a problem.” Now, with Russia waging war dangerously close to NATO territory, there’s a large problem. Throughout my conversations, one word came up again and again when I asked European officials about the stakes of the American election: existential.

“The anxiety is massive,” Victoria Nuland, who served until recently as undersecretary for political affairs at the State Department, told me. Like other diplomats in the Biden administration, she has spent the three-plus years since Trump unwillingly left office working to restabilize America’s relationship with its allies.

“Foreign counterparts would say it to me straight up,” Nuland recalled. “‘The first Trump election—maybe people didn’t understand who he was, or it was an accident. A second election of Trump? We’ll never trust you again.’”

BERLIN, GERMANY

To understand why European governments are so worried about Trump’s return, you could study his erratic behavior at international summits, his fraught relationship with Ukraine’s president and open admiration for Russia’s, his general aversion to the liberal international order. Or you could look at the exceedingly irregular tenure of Trump’s ambassador to Germany, Richard Grenell.

Four years after he left Berlin, people in the city’s political class still speak of Grenell as if they’re processing some unresolved trauma. The mere mention of his name elicits heavy sighs and mirthless chuckles and brief, frozen stares into the middle distance. For them, Grenell’s ambassadorship remains a bitter reminder of what working with the Trump administration was like—and what Trump’s return would mean.

Often, people will tell you about the parties.

Hosting social functions is part of an ambassador’s job. But the parties Grenell threw were more eclectic than a typical embassy reception. The guest lists were light on German political elites—many of whom Grenell made a sport of publicly tormenting—and featured instead a mix of far-right politicians, semi-canceled intellectuals, devout Christians, gay Trump fans, and sundry other friends and hangers-on. Standard social etiquette was at times disregarded; so was good taste. When Grenell hosted a superhero-themed Halloween party at the ambassador’s residence in 2019, one male guest came dressed in a burka, while another wore a “suicide bomber” costume. Photos from the party circulated privately among mystified German journalists. “It was a freak show,” recalled one Berlin-based reporter who saw the pictures and who, like others I spoke with, requested anonymity to speak candidly about the former ambassador. (Grenell declined my request for an interview.)

The scandalized reaction to Grenell’s parties was emblematic of his broader reception in Berlin. A right-wing foreign-policy pundit and Twitter troll—he once posted that Rachel Maddow should “take a breath and put on a necklace” and talked about Michelle Obama “sweating on the East Room’s carpet”—he arrived in Germany in May 2018 at a moment of growing geopolitical anxiety. Despite efforts by German Chancellor Angela Merkel to develop a normal working relationship with Trump, the new president seemed intent on antagonizing Europe—hitting allies with tariffs, abruptly withdrawing from the Iran nuclear deal, and constantly questioning the need for NATO. Another ambassador might have seen it as his job to ease tensions. But Grenell was not just any ambassador.

He was belligerent and uncouth, less a diplomat than a partisan operative. He was “a special animal,” Wolfgang Ischinger, a former German ambassador to the U.S., told me. “He did not play by the rules.”

Hours after starting the job, Grenell tweeted a terse warning that “German companies doing business in Iran should wind down operations immediately.” A few weeks later, he invited a Breitbart News reporter to his residence and said he planned to use his position to “empower other conservatives throughout Europe”—a comment widely interpreted as a political endorsement of European far-right parties, and one he later had to walk back.

Grenell wasn’t any more tactful in private. In his first meeting with the German foreign ministry, according to a former diplomatic official in Berlin who was briefed on the encounter, Grenell announced, “I’m here to implement the American president’s interests.” The officials, taken aback by his audacity, tried politely to correct him: No, he was there to lobby for America’s interests. But Grenell didn’t seem to see the difference.

He hung a giant oil painting of Trump in the entryway of the ambassador’s residence, and made a party trick out of flaunting his access to the White House. He would call the Oval Office “for fun” just to show that “he had a direct line to the U.S. president,” recalled Julian Reichelt, a friend of Grenell’s who was then the editor of the right-leaning German tabloid Bild.

As Trump escalated his crusade against the European political establishment—publicly rooting for Merkel’s right-wing opponents and identifying the European Union as a “foe”—Grenell seemed eager to join in. After the president hijacked a NATO summit in July 2018 to deliver a tirade against countries that weren’t spending enough on defense, Grenell did his best to replicate the performance in Berlin.

The ambassador quickly became a villain in the German press. The magazine Der Spiegel nicknamed him “Little Trump.” German politicians publicly called on the U.S. to recall Grenell. One member of the Bundestag compared him to a “far-right colonial officer”; another was quoted as saying that he acted like “the representative of a hostile power.”

Some observers would later speculate that the bad press was the product of a leak campaign by Merkel’s government to isolate Grenell. Others believed that he deliberately courted outrage. “He didn’t care a bit about his reputation here,” Christoph Heusgen, the chair of the Munich Security Conference, told me. “He cared about offending the Germans and making headlines because he knew his boss would love that.” Soon enough, the president was referring to Grenell as “my beautiful Ric” and reportedly telling advisers that his man in Berlin “gets it.”

Grenell’s defenders would later argue that his hardball tactics got results. Take, for example, his vociferous opposition to the Nord Stream 2 pipeline. The U.S. had long objected to its construction, which would dramatically increase Germany’s reliance on Russian energy. But Grenell pressed the issue much harder than his predecessors had—sending letters threatening sanctions against companies that worked on the project, and successfully lobbying Berlin to import American liquefied natural gas. After Russia invaded Ukraine, German President Frank-Walter Steinmeier admitted that clinging to Nord Stream 2 had been a “mistake.”

To Grenell’s admirers, it was his effectiveness that made him unpopular in Berlin. “The ideal U.S. ambassador for your average German government,” Reichelt told me, “just talks nicely about, like, the American dream and transatlantic relations and blah blah and freedom blah blah and what we can learn from each other.” Grenell refused to be a mascot. “He was doing politics—he was actually driving policies,” Reichelt said. (Reichelt was fired from Bild in 2021 after The New York Times reported on a sexual relationship he’d had with a subordinate; Reichelt denied abusing his authority.)

But by the time Grenell left Berlin, the mutual disdain between the ambassador and the political class was so thick that some wondered if he’d kept an enemies list. Grenell, who briefly served as Trump’s acting director of national intelligence, is reportedly on the shortlist for secretary of state or national security adviser in a second Trump administration, which means he’d be in a position to make life difficult for political leaders he disfavors. “I know many of these ministers, and they would be afraid,” one prominent German journalist told me. “I think he’s a guy who doesn’t forget.”

The Germans are bracing for Trump’s return in other ways. Inside the foreign ministry, officials have mapped out a range of policy areas likely to be destabilized by his reelection—NATO, Ukraine, tariffs, climate change—and are writing detailed proposals for how to deal with the fallout, multiple people told me. Can Trump’s moods be predicted? Who are his confidants, and how can the government get close to them?

The Germans have a contingency plan for President Joe Biden’s reelection too, but few seem to think they’ll need it. They’re preparing for a third scenario as well: a period of sustained uncertainty about the election’s outcome, accompanied by widespread political violence in the U.S. Nuland, the recently departed State Department official, told me that, based on her conversations with foreign counterparts, Germany isn’t alone in planning for this possibility. “If you are an adversary of the United States, whether you’re talking about Putin, Iran, or others, it would be a perfect opportunity to exploit the fact that we’re distracted,” she said.

René Pfister, Der Spiegel’s Washington bureau chief, told me that the first Trump administration left Germany struggling with difficult questions about its relationship with the U.S. Was America still interested in being the leader of the free world, or would it be governed by ruthless self-interest like China and Russia? Could it be counted on to defend its allies if Trump were reelected? “The Germans always had the impression that, regardless of the political affiliation of the president, you can rely, on the big questions, on the United States,” Pfister told me. “I think this confidence is totally shaken.”

BRUSSELS, BELGIUM

One afternoon in early April, I listened in as Julianne Smith, the U.S. ambassador who’d hosted the event at Truman Hall, conducted a virtual press briefing from NATO headquarters. Journalists had called in from across Europe, and their questions reflected the unease on the continent. A reporter from Portugal asked about the prospect of NATO countries reinstating military conscription in light of the Russian threat. Another, from Bulgaria, asked Smith to respond to politicians there pushing to withdraw from the alliance. A TV-news correspondent from North Macedonia asked whether Smith thought Russia would take the Balkans next if Ukraine fell.

When President Biden set about filling diplomatic posts after his election, he made reassuring rattled allies a top priority. Smith fit the mold of a model ambassador—a career foreign-policy wonk with deep government experience and comfortingly conventional views on America’s role in the world. She also brings a boundless Leslie Knopeian energy to the job, and has been well schooled in the finer points of diplomat-speak: She scarcely mentions a country or region without first establishing friendship—“our friends in the Middle East,” “our friends in Portugal”—and she does not talk to these friends; she only “engages” them (as in “I went to the Vatican quite a while ago to engage them on the war.”).

Listening to the press briefing, I thought Smith did well—she sounded calm and confident and relentlessly optimistic. But when the briefing ended, I was ushered into a hallway to await my scheduled interview with the ambassador, and I overheard her fretting to an aide about how she’d handled a question about recent Ukrainian strikes on energy infrastructure inside Russia. American officials, worried about escalation, were reportedly urging Ukraine to stop the attacks, and Smith had responded that the U.S. was “not particularly supportive of” Ukraine going after targets on Russian soil. Now she was second-guessing herself. Maybe she should have said that the U.S. doesn’t “encourage” the attacks, or that the attacks don’t have America’s “blessing.” (Last week, the Biden administration gave Ukraine permission to use American weapons to attack Russian targets in limited circumstances.)

“Maybe I’m splitting hairs,” I heard Smith say. “Just with my lack of sleep, I didn’t have my game face on. I didn’t nail it.” She sounded exhausted.

During our interview, I asked Smith if the job was what she’d expected. She laughed: “No, no, no.” Part of what had appealed to her about the NATO post was the potential for a 9-to-5 lifestyle. Her kids were still young, and she’d been looking forward to some work-life balance. Then, six weeks after she moved to Brussels, Russia invaded Ukraine, and all of a sudden she was at the center of a geopolitical crisis.

Smith told me her ambassadorial role is unique in that she doesn’t have just one host country to worry about when she makes public statements. She’s speaking to audiences in dozens of countries, and each one needs to hear something different from her. “You have to sit down and understand: ‘What is it that’s keeping you awake at night?’” she said. Maybe it’s an errant Russian missile entering their airspace. Or a destabilizing wave of refugees. Or a cyberattack. Or tanks crossing their borders. “They’re obviously looking to hear time and time again that the U.S. commitment to the alliance, and particularly Article 5, is ironclad and unwavering.”

Smith has developed an arsenal of sanguine talking points to convey this message. She cites U.S. opinion polls showing strong support for NATO. She rehearses America’s long, bipartisan history of standing by its European allies. “For over seven decades,” she told me, “American presidents of all political stripes have supported this alliance.”

I encountered the same performative positivity in meetings with American diplomats throughout Europe. In Warsaw, Ambassador Mark Brzezinski sat in the airy living room of his residence and talked about the “economic efficiencies” America has enjoyed as a result of its alliance with Poland. “The Poles are spending billions of dollars to protect themselves, mostly buying from U.S. defense contractors,” he said. In Berlin, Ambassador Amy Gutmann met me in an embassy room overlooking the Brandenburg Gate and recounted the heroic role America had played in the massive airlift that broke the 1949 Soviet blockade of West Berlin. “Before I came here,” Gutmann told me, “President Biden said, ‘Make sure you tell every person you meet in Germany how important the U.S.-German relationship is.’ And I’ve done that.”

But sentimental rhetoric and gestures of goodwill only go so far. George Kent, the U.S. ambassador to Estonia, told me about an Earth Day photo op he’d taken part in earlier this year. The plan was to plant a tree at the Park of Friendship in central Estonia. Upon arrival, he was greeted by a kindly septuagenarian gardener who’d been participating in the tradition for decades. Kent tried to make small talk about horticulture, but the gardener had other things on his mind: “Can we talk about the vote in Congress?” He wanted the latest news on the Ukraine aid package.

In interviews, State Department officials in Washington, who requested anonymity so they could speak candidly, acknowledged that efforts to “reassure” European allies are largely futile now. What exactly can a U.S. diplomat say, after all, about the fact that the Republican presidential nominee has said he would encourage Russia to “do whatever the hell they want” to NATO countries that he considers freeloaders?

“There’s not really anything we can do,” one U.S. official told me. European leaders “are smart, thoughtful people. The secretary isn’t going to get them in a room and say, ‘Hey, guys, it’s going to be okay, the election is a lock.’ That’s not something he can promise.”

WARSAW, POLAND

“What the fuck is happening in the United States?”

Agnieszka Homańska, seemingly startled by her own outburst, slowly placed her hands on the table as if to calm herself. “Sorry for being so frank.” We were sitting in a crowded bistro in downtown Warsaw with retro pop art on the walls and American Top 40 playing from the speakers. Homańska, a 25-year-old grad student and government worker who wore sneakers and a T-shirt that said BE BRAVE, was trying to explain how Poles her age felt about this year’s U.S. election.

Homańska exhibited none of the casual contempt for America often associated with young people in other European capitals. In the history she grew up learning, Americans were the good guys—defeating the Nazi occupiers, tearing down the Iron Curtain. Surveys consistently find that Poland is the most pro-America country in Europe, and one of the few where public opinion doesn’t change based on which party controls the White House. Ronald Reagan is a hero to many here; so is George H. W. Bush. In Poland, the mythology of America—vanquisher of tyrants, keeper of the democratic flame—persists. The U.S. is still a city on a hill.

But the Trump era punctured Homańska’s image of America, as it did for many younger Poles. Trump’s refusal to concede the 2020 election was jarring to those who saw the U.S. as an aspirational democracy. The storming of the Capitol on January 6 “was broadcast on every television,” she told me. Trump’s criminal charges—and his recent conviction on 34 felony counts in a Manhattan court—have made the news here too. “People don’t understand why Trump can still run for president.” (Like others I spoke with, Homańska was also confused by the fact that Joe Biden, who struck her as feeble and out of touch, is running again—were these really the best options America could muster? I told her she wasn’t alone in wondering about this.)

Many Poles see Trump through the prism of their own country’s recent politics. The right-wing nationalist Law and Justice party came to power in Poland a year before Trump’s election, and spent the next eight years co-opting democratic institutions, from the courts to the civil service to the public media. The government maintained a cozy relationship with Trump—President Andrzej Duda famously proposed naming an American military base in Poland after him—and he is still popular among conservative Poles. But last year, an intense electoral backlash to Law and Justice produced the largest voter turnout in Poland’s post-Soviet history, driven by young people. The new government, a coalition spanning from the center-left to the center-right, is focused on repairing Poland’s democracy.

After the election, Homańska decided to postpone her planned studies in Canada so she could help rebuild her country. When I asked her which countries she looked to as democratic role models, she mentioned Finland and Estonia, another former Soviet country that has successfully modernized. “Maybe there is something about the maturity of French democracy,” she added.

And America? I asked.

Homańska hesitated. “I don’t think that people my age would perceive America as an ideal way to create a democratic society,” she replied. She seemed almost apologetic.

An illustration of NATO nation flags with the USA flag scribbled out.

Many of the Poles I met were especially perplexed by one recent display of U.S. political dysfunction: the struggle to pass a military-aid package for Ukraine earlier this year. Polls showed that a majority of Americans supported the funding. Reporting suggested that most members of Congress favored it too. But somehow, because Trump opposed it, a minority of Republicans in the House had succeeded in holding up the bill for months while Ukraine was forced to ration bullets and let Russian missiles level buildings. Although the aid package finally passed in late April, some Western officials worry that the battlefield advances Russia made during the delay will be difficult to reverse.

The Russian threat is no abstract matter in Poland, where Prime Minister Donald Tusk has talked about living in a “prewar era” and regularly urges citizens to prepare for a conflict. I heard stories about people stocking up on gold and looking for apartments with basements that could double as bomb shelters. Schools are running duck-and-cover drills, and shooting ranges have become more popular as people realize they might soon need to know how to handle a gun. One Polish woman told me about a phone call she’d received from her aunt, who was wondering if she should restain her wood floors or save her money because her house might be destroyed soon anyway.

In Warsaw, Polish Minister of Foreign Affairs Radek Sikorski (who is married to the Atlantic writer Anne Applebaum) told me, “you will feel the physical vulnerability.” Travel 200 miles north and you reach Kaliningrad, where Russia is said to house nuclear weapons; go 200 miles east, and you hit the Ukrainian border. “It concentrates the mind.”

Poland has recently increased defense spending to 4 percent of its GDP—well beyond the standard of 2 percent set by NATO, and higher even than in the U.S. But officials know they’ll never be able to fend off a hostile Russia alone.

“It’s an existential threat,” Aleksandra Wiśniewska, who was elected to Poland’s Parliament last year, told me. Like other Polish politicians I spoke with, Wiśniewska—a 30-year-old former humanitarian aid worker who now sits on the foreign-affairs committee—was reluctant to say anything that might alienate the former, and perhaps future, American president. But she wanted me to understand that the choice American voters make this fall will reverberate beyond U.S. borders.

“I fear that the old United States that we all almost revere,” Wiśniewska told me, is “now sort of self-sabotaging. And by consequence, it will jeopardize the safety and security of the entire global order.”

FRANKENBERG, GERMANY

The U.S. Army’s 2nd Cavalry Regiment left Vilseck, Germany, before dawn on April 9 in a convoy of camouflaged jeeps, fuel tankers, armored vehicles, and trucks packed with soldiers and ammunition. They rumbled past windmills and pastoral villages, stopping only for fuel. Speed was essential: The road march to Bemowo Piskie, Poland, was more than 800 miles, and the fate of the Western world was, at least hypothetically, at stake.

The regiment was training for a long-dreaded crisis scenario: a Russian invasion of the Suwałki Gap. The 60-mile stretch of Polish farmland is sparsely populated but strategically important. If Russian forces annexed the territory, they could effectively seal off Lithuania, Latvia, and Estonia from the rest of NATO. To save the Baltic states, allies in Northern Europe would have to mobilize quickly.

During a refueling stop at a German barracks in Frankenberg, U.S. Army officers rattled off facts to me about the Stryker, a lightweight armored vehicle that looks like a tank but can drive up to 60 miles an hour, and demonstrated a language-translation app they’d developed to facilitate communication among allied troops. The drill they were conducting that day was part of a monthslong NATO military exercise—the largest since the end of the Cold War—involving all 32 allied countries; more than 1,000 combat vehicles; dozens of aircraft carriers, frigates, and battleships; and 90,000 troops. Although NATO officials have been careful not to single out Russia by name, the intended audience for the war games was clear. “Are exercises like this designed to send a message? They are, absolutely,” Colonel Martin O’Donnell told me as soldiers in fatigues milled around nearby. “The message is that we’re here. We’re ready. We have the capability to work with our allies and partners and meet you, potential adversary, wherever you may be.”

But the demonstration in Frankenberg sent another, perhaps less convenient, message as well. The convoy rushing to confront a theoretical Russian invasion was composed almost entirely of U.S. soldiers driving U.S. vehicles filled with U.S.-made guns and bullets and missiles. They’d link up with military units from other NATO countries eventually. But if America were removed from the equation, would the battle group in Bemowo Piskie stand a chance?

Whether Trump wins or not, there’s a growing consensus in Europe that the strain of American politics he represents—a throwback to the hard-edged isolationism of the 1920s and ’30s—isn’t going away. It’s become common in the past year for politicians to talk about the need for European “defense autonomy.”

“We can’t just flip a coin every four years and hope that Michigan voters will vote in the right direction,” Benjamin Haddad, a member of France’s National Assembly, said at an event earlier this year. “We have to take matters in our own hand.”

What exactly that would look like is a subject of intense debate. Italy’s foreign minister recently proposed forming a European Union army (an idea that’s been raised and rejected many times in the past). Others have suggested diverting resources from NATO to a separate European defense alliance (though European countries are not immune to the kind of populist nationalism that could make such alliances dysfunctional). Replacing the so-called nuclear umbrella provided by the U.S. arsenal would require countries such as Germany and Poland to develop their own nuclear stockpiles, to supplement the small ones France and the United Kingdom already have.

Within NATO, the immediate priority is “Trump-proofing” the alliance. In the past 18 months, Finland and Sweden have joined, each bringing relatively capable and high-tech militaries. Secretary-General Stoltenberg has also proposed shifting responsibility for Ukrainian arms deliveries from the U.S. to NATO in case the next administration decides to abandon the war.

Most notably, allied countries have dramatically increased their own military spending. I spoke with several officials who grudgingly credited Trump for this development—something NATO officials and U.S. presidents had spent decades advocating for unsuccessfully. In 2017, when Trump took office, only three allies, plus the U.S., were spending at least 2 percent of their GDP on defense. This year, that number is expected to rise to at least 18. Trump’s criticism of paltry defense budgets was not only effective, Stoltenberg told me, but fair. “European allies have not spent enough for many years,” he said. (No doubt Russia’s invasion of Ukraine also factored into the increased spending.)

Even with the funding influx, many officials believe Europe still has a long way to go before it could defend itself alone. The U.S. has some 85,000 troops currently stationed in Europe—more than the entire militaries of Belgium, Sweden, and Portugal combined—and provides essential intelligence gathering, ballistic-missile defense, and air-force capabilities. “Dreaming about strategic autonomy for Europe is a wonderful vision for maybe the next 50 years,” Ischinger, the former German ambassador, told me. “But right now, we need America more than ever.”

That reality has left politicians and diplomats across Europe honing their theories of Trump-ego management ahead of the U.S. election. To some, the former president’s emotional volatility represents a grave threat. The former diplomatic official in Berlin told me that in May 2020, Merkel called Trump to inform him that she wouldn’t be traveling to Washington for the G7 summit out of concern for COVID. Trump was enraged, according to the diplomat, who requested anonymity to describe a private conversation, and the call grew heated. A week later, Trump announced plans to permanently withdraw nearly 10,000 U.S. troops from Germany—a move seen within Merkel’s government as a petty act of revenge. (Biden later reversed the order; a spokesperson for the Trump campaign did not respond to a request for comment.)

Others think Trump’s ego could make him easier to manipulate. “He’s very transactional, and he’s very narcissistic,” the senior NATO official, who’s met Trump multiple times, told me. “And if you combine the two, then you can sell him—” the official paused. He recited an expression in his native language. Roughly translated, it meant “You can sell him turnips as if they’re lemons.”

What’s striking about these calculations is how thoroughly allies have already adjusted their perception of the U.S. relationship. I noticed a certain pattern in my conversations with European political leaders and diplomats: At some point in almost every interview, the European would begin pitching me on how much the U.S. benefits economically from the alliance. Preserving peace in Europe has sustained decades of lucrative trade for U.S. companies. A broader Russian war on the continent would be felt in the average American’s pocketbook. I later learned that these talking points were being encouraged by NATO officials as well as the U.S. State Department. The thinking behind the strategy is that Americans need to hear why supporting European allies is in their self-interest.

“They keep telling us how important it is to go and convince the housewives in Wisconsin and the farmers in Iowa,” a senior official from an allied country grumbled to me. “How many Americans are going to the housewives of southern Estonia or … the countryside in France to tell why Europe should stand by the United States?” He noted that the alliance protects the U.S. as well.

The more I listened to prime ministers and parliamentarians deliver the same earnest spiel, the more dispiriting I found it. At its most idealistic, the transatlantic alliance has always been about a shared commitment to democratic values. Now Europeans are bracing for an America that behaves like any other transactional superpower. Several officials expressed fears that Trump would turn America’s NATO membership into a kind of protection racket, threatening to abandon Europe unless this ally offers better trade terms, or that ally helps investigate a political enemy.

“We are exposed,” Bagger, the German state secretary, told me. Europe’s alliance with America, he said, “has served as our life insurance for the last 70 years.”

And with Vladimir Putin seizing territory in Europe and trying to unravel NATO, what choice would these countries have but to accept Trump’s terms?

NARVA, ESTONIA

The city of Narva sits on Estonia’s eastern border, separated from Russia by a river and a heavily guarded bridge. Some experts believe that if World War III breaks out in the coming years, this is where it will begin. The city is overwhelmingly populated by ethnic Russians, many of whom don’t speak Estonian and are therefore ineligible for citizenship. Western officials fear Putin might try to use the same playbook he developed in Crimea: enlisting Russian separatists to stoke unrest and create a pretense for annexing the city. Such a move would effectively dare the West to go to war with a nuclear power over a small Estonian city, or else watch the credibility of their vaunted alliance collapse. NATO calls this “the Narva scenario.”

On a cold spring morning, I drove two hours from the Estonian capital of Tallinn and arrived at the border-crossing station, a red-brick box of a building on the edge of the Narva River. There I met Aleksandr Kazmin, a border guard with close-cropped hair and a friendly face who spoke broken English with a thick Russian accent. He wore a patch on his coat that said Politsei and a gun on his hip.

The border checkpoint once saw a steady stream of commuters and tourists traveling back and forth between Russia and Estonia—at its peak, Kazmin told me, the station processed 27,000 people in a single day. But travel dropped dramatically once the war in Ukraine started. In the months following the invasion, many of the people coming across the Narva border were refugees. Then, earlier this year, Russia closed its side of the road for “renovations,” meaning that the only way to cross the bridge now is by foot. On the morning I visited, I saw a thin trickle of travelers—moms pushing strollers, young people with backpacks—shuffle in and out of the station.

Kazmin told me that the war had divided Narva, as it had the wider Russian diaspora. Those who are “already integrated in Estonian society” generally oppose Putin’s aggression, he said, but some “don’t want to integrate—they are living in Russian-media world.” He rolled his eyes before muttering in resignation, “Nothing to do. It’s their choice.”

I asked Kazmin if I could walk to the actual border, and he obliged. As we made our way across the bridge, passing a tangle of barbed wire that had been pushed to the side, he warned me that we might see a Russian border guard filming us from the checkpoint on the other side. Kazmin didn’t know exactly why the Russians did this—he guessed it was some kind of intelligence-gathering tactic—but it often happened when he brought a visitor to the bridge.

Sure enough, as we got closer, a guard appeared in the distance. He didn’t seem to have a camera, so I asked Kazmin if I could wave at him. Kazmin cautioned against it. Communication between the two sides, even for benign logistical coordination, is strictly regulated: Only specially trained officials at the station are allowed to talk to the Russians, and they do so using a Cold War–era crank phone.

We stopped when we reached the middle of the bridge. Kazmin told me this was the closest we would get to Russia, explaining that there was no permanent, official border; it was understood that the deepest point of the river was what technically separated the two countries, and that shifts over time. The spot was strangely beautiful. Below us, a current of water rushed toward the Baltic Sea; above us, a flurry of snow fell from the gray sky. Two imposing medieval fortresses faced each other from either side of the river, one built by the occupying Danes in the 13th century, the other by a Muscovite prince two centuries later—twin relics of conquests past. As I took in the view, Kazmin bounced up and down to keep warm, stealing glances at his Russian counterpart.

I thought about how much more precarious the world must feel to those living in a place like this, doing a job like his. The day before my visit to Narva, I had interviewed Estonian Prime Minister Kaja Kallas, who talked about the stakes of preserving the transatlantic alliance. Her country has a population of 1.3 million and is roughly the size of Vermont. She recalled sitting in a meeting with other world leaders shortly after her election where they discussed the Russian threat. “I made a note in my notebook: ‘For some countries here, talking about security and defense is a nice intellectual discussion,’” Kallas told me. “‘For us, it’s existential.’”

After dozens of interviews, I’d become desensitized to politicians using this word. But walking back across the bridge, I thought I understood what she meant.

Kazmin pointed to a tall flagpole perched beside the Narva station. At the top, the Estonian flag waved in the wind; beneath it, a navy-blue flag with the NATO seal. He said that flag had been installed only a few months earlier. I asked him if he thought the day would ever come when he saw Russian tanks rolling across the bridge. Kazmin got quiet for a moment. He said Russia’s government has long promised that it would not attack the Baltics—but that Putin had said the same thing about Ukraine.

“When they tell us they will not do something,” he said, “it means for us that they can do it—or will do it.”

  • Open access
  • Published: 03 June 2024

Applying large language models for automated essay scoring for non-native Japanese

  • Wenchao Li 1 &
  • Haitao Liu 2  

Humanities and Social Sciences Communications volume 11, Article number: 723 (2024)


  • Language and linguistics

Recent advancements in artificial intelligence (AI) have led to an increased use of large language models (LLMs) for language assessment tasks such as automated essay scoring (AES), automated listening tests, and automated oral proficiency assessments. The application of LLMs for AES in the context of non-native Japanese, however, remains limited. This study explores the potential of LLM-based AES by comparing the efficiency of different models, i.e. two conventional machine training technology-based methods (Jess and JWriter), two LLMs (GPT and BERT), and one Japanese local LLM (Open-Calm large model). To conduct the evaluation, a dataset consisting of 1400 story-writing scripts authored by learners with 12 different first languages was used. Statistical analysis revealed that GPT-4 outperforms Jess and JWriter, BERT, and the Japanese language-specific trained Open-Calm large model in terms of annotation accuracy and predicting learning levels. Furthermore, by comparing 18 different models that utilize various prompts, the study emphasized the significance of prompts in achieving accurate and reliable evaluations using LLMs.


Conventional machine learning technology in AES

AES has grown significantly with the advancement of machine learning technologies in recent decades. In the earlier stages of AES development, conventional machine learning-based approaches were commonly used. These approaches involved two steps: (a) a dataset of essays is fed to the machine learning system, where it serves as the basis for training the model and establishing patterns and correlations between linguistic features and human ratings; (b) the machine learning model is trained using the linguistic features that best represent human ratings and can effectively discriminate learners’ writing proficiency. These features include lexical richness (Lu, 2012; Kyle and Crossley, 2015; Kyle et al. 2021), syntactic complexity (Lu, 2010; Liu, 2008), and text cohesion (Crossley and McNamara, 2016), among others. Conventional machine learning approaches in AES require human intervention, such as manual correction and annotation of essays, to create a labeled dataset for training the model. Several AES systems have been developed using conventional machine learning technologies. These include the Intelligent Essay Assessor (Landauer et al. 2003), the e-rater engine by Educational Testing Service (Attali and Burstein, 2006; Burstein, 2003), MyAccess with the IntelliMetric scoring engine by Vantage Learning (Elliot, 2003), and the Bayesian Essay Test Scoring system (Rudner and Liang, 2002). These systems have played a significant role in automating the essay scoring process and providing quick and consistent feedback to learners. However, as touched upon earlier, conventional machine learning approaches rely on predetermined linguistic features and often require manual intervention, making them less flexible and potentially limiting their generalizability to different contexts.

In the context of the Japanese language, AES tools built on conventional machine learning include Jess (Ishioka and Kameda, 2006) and JWriter (Lee and Hasebe, 2017). Jess assesses essays by deducting points from a perfect score, using the Mainichi Daily News newspaper as a database. Its evaluation criteria cover rhetorical elements (e.g., reading comprehension, vocabulary diversity, percentage of complex words, and percentage of passive sentences), organizational structures (e.g., forward and reverse connection structures), and content analysis (e.g., latent semantic indexing). JWriter employs linear regression analysis to assign weights to various measurement indices, such as average sentence length and total number of characters. These weighted indices are then combined to derive the overall score. A pilot study involving the Jess model was conducted on 1320 essays at different proficiency levels, including primary, intermediate, and advanced. However, the results indicated that the Jess model failed to significantly distinguish between these essay levels. Out of the 16 measures used, four measures, namely median sentence length, median clause length, median number of phrases, and maximum number of phrases, did not show statistically significant differences between the levels. Additionally, two measures exhibited between-level differences but lacked linear progression: the number of attributive declined words and the Kanji/kana ratio. On the other hand, the remaining measures, including maximum sentence length, maximum clause length, number of attributive conjugated words, maximum number of consecutive infinitive forms, maximum number of conjunctive-particle clauses, k characteristic value, percentage of big words, and percentage of passive sentences, demonstrated statistically significant between-level differences and displayed linear progression.
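The weighted-index scoring described for JWriter can be illustrated with a minimal sketch. It is not JWriter's actual implementation: the sentence splitter, the weights, and the intercept below are hypothetical values chosen for illustration, and a real system would fit the weights by linear regression against human ratings.

```python
import re

def extract_indices(essay: str) -> dict[str, float]:
    """Compute two surface indices named in the text: average sentence
    length (in characters) and total number of characters.
    Assumption: sentence-final punctuation is excluded from the counts."""
    # Split on Japanese and Western sentence-final punctuation.
    sentences = [s.strip() for s in re.split(r"[。．.!?！？]+", essay) if s.strip()]
    total_chars = sum(len(s) for s in sentences)
    avg_len = total_chars / len(sentences) if sentences else 0.0
    return {"avg_sentence_length": avg_len, "total_characters": float(total_chars)}

# Hypothetical weights and intercept, NOT taken from JWriter.
WEIGHTS = {"avg_sentence_length": 0.05, "total_characters": 0.01}
INTERCEPT = 1.0

def linear_score(indices: dict[str, float]) -> float:
    """Combine the weighted indices into a single overall score."""
    return INTERCEPT + sum(WEIGHTS[name] * value for name, value in indices.items())

essay = "今日は晴れです。公園に行きました。"
indices = extract_indices(essay)
score = linear_score(indices)
print(indices, score)
```

Because the score is a fixed linear function of surface indices, a learner who knows the equation could raise the score by inflating those indices, which is the manipulation risk noted for this family of models.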

Both Jess and JWriter exhibit notable limitations, including the manual selection of feature parameters and weights, which can introduce biases into the scoring process. The reliance on human annotators to label non-native language essays also introduces potential noise and variability in the scoring. Furthermore, an important concern is the possibility of system manipulation and cheating by learners who are aware of the regression equation utilized by the models (Hirao et al. 2020). These limitations emphasize the need for further advancements in AES systems to address these challenges.

Deep learning technology in AES

Deep learning has emerged as one of the approaches for improving the accuracy and effectiveness of AES. Deep learning-based AES methods utilize artificial neural networks that mimic the human brain’s functioning through layered algorithms and computational units. Unlike conventional machine learning, deep learning autonomously learns from the environment and past errors without human intervention. This enables deep learning models to establish nonlinear correlations, resulting in higher accuracy. Recent advancements in deep learning have led to the development of transformers, which are particularly effective in learning text representations. Noteworthy examples include bidirectional encoder representations from transformers (BERT) (Devlin et al. 2019) and the generative pretrained transformer (GPT) (OpenAI).

BERT is a language representation model that utilizes a transformer architecture and is trained on two tasks: masked language modeling and next-sentence prediction (Hirao et al. 2020; Vaswani et al. 2017). In the context of AES, BERT follows specific procedures, as illustrated in Fig. 1: (a) the tokenized prompts and essays are taken as input; (b) special tokens, such as [CLS] and [SEP], are added to mark the beginning and separation of prompts and essays; (c) the transformer encoder processes the prompt and essay sequences, resulting in hidden layer sequences; (d) the hidden layers corresponding to the [CLS] tokens (T[CLS]) represent distributed representations of the prompts and essays; and (e) a multilayer perceptron uses these distributed representations as input to obtain the final score (Hirao et al. 2020).

Fig. 1: AES system with BERT (Hirao et al. 2020).
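Steps (a) to (e) can be traced in a small, self-contained sketch. Only the input formatting with [CLS] and [SEP] and the use of the [CLS] vector as input to a scoring head follow the description above. The `toy_encoder` function is a hypothetical stand-in for the transformer encoder, not BERT itself, and all weights are untrained, made-up values.

```python
import math

CLS, SEP = "[CLS]", "[SEP]"

def build_input(prompt_tokens, essay_tokens):
    """Steps (a)-(b): join the tokenized prompt and essay with special tokens."""
    return [CLS] + prompt_tokens + [SEP] + essay_tokens + [SEP]

def toy_encoder(tokens, dim=4):
    """Stand-in for step (c). This is NOT a transformer: each token gets a
    deterministic pseudo-embedding, and the [CLS] position is overwritten
    with the mean of all positions so that it summarizes the sequence."""
    def embed(tok):
        seed = sum(ord(ch) for ch in tok)
        return [math.sin(seed * (i + 1)) for i in range(dim)]
    hidden = [embed(t) for t in tokens]
    hidden[0] = [sum(col) / len(hidden) for col in zip(*hidden)]
    return hidden

def mlp_score(cls_vector, w1, b1, w2, b2):
    """Step (e): one ReLU hidden layer followed by a scalar output."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, cls_vector)) + b)
              for row, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2

tokens = build_input(["describe", "your", "day"], ["今日", "は", "晴れ", "です"])
hidden = toy_encoder(tokens)
cls_vector = hidden[0]  # step (d): the representation at the [CLS] position

# Hypothetical, untrained weights used purely for illustration.
w1 = [[0.5, -0.2, 0.1, 0.3], [0.1, 0.4, -0.3, 0.2]]
b1 = [0.0, 0.1]
w2 = [1.0, -0.5]
b2 = 2.0
score = mlp_score(cls_vector, w1, b1, w2, b2)
print(tokens, score)
```

In a real system the encoder would be a pretrained BERT model and the scoring head would be trained on human-rated essays; the sketch only shows how the pieces connect.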

Training BERT on a substantial amount of sentence data through the Masked Language Model (MLM) objective allows it to capture contextual information within the hidden layers. Consequently, BERT is expected to be capable of identifying artificial essays as invalid and assigning them lower scores (Mizumoto and Eguchi, 2023). In the context of AES for non-native Japanese learners, Hirao et al. (2020) combined the long short-term memory (LSTM) model proposed by Hochreiter and Schmidhuber (1997) with BERT to develop a tailored automated essay scoring system. Their findings revealed that the BERT model outperformed both the conventional machine learning approach utilizing character-type features such as “kanji” and “hiragana” and the standalone LSTM model. Takeuchi et al. (2021) presented an approach to Japanese AES that eliminates the requirement for pre-scored essays by relying solely on reference texts or a model answer for the essay task. They investigated multiple similarity evaluation methods, including frequency of morphemes, idf values calculated on Wikipedia, LSI, LDA, word-embedding vectors, and document vectors produced by BERT. The experimental findings revealed that the method utilizing the frequency of morphemes with idf values exhibited the strongest correlation with human-annotated scores across different essay tasks. The use of BERT in AES encounters several limitations. First, essays often exceed the model’s maximum length limit. Second, only score labels are available for training, which restricts access to additional information.
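The idea of scoring an essay by its similarity to a reference text, using morpheme frequencies weighted by idf, might be sketched as follows. This is an illustration under stated assumptions, not the method of Takeuchi et al.: the text does not say which similarity measure they used, so cosine similarity is assumed here, the idf smoothing is one common variant, the tiny background corpus stands in for Wikipedia, and the inputs are assumed to be already segmented into morphemes.

```python
import math
from collections import Counter

def idf_table(background_docs):
    """Smoothed idf computed from a background corpus of tokenized documents."""
    n = len(background_docs)
    df = Counter(tok for doc in background_docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + count)) + 1.0 for tok, count in df.items()}

def tfidf_vector(tokens, idf, default_idf):
    """Morpheme frequency times idf; unseen morphemes get default_idf."""
    tf = Counter(tokens)
    return {tok: freq * idf.get(tok, default_idf) for tok, freq in tf.items()}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical, pre-segmented toy data.
background = [["私", "は", "学生", "です"],
              ["彼", "は", "先生", "です"],
              ["本", "を", "読み", "ます"]]
idf = idf_table(background)
default_idf = math.log(1 + len(background)) + 1.0  # idf of a morpheme with df = 0

reference = ["私", "は", "本", "を", "読み", "ます"]
essay_a = ["私", "は", "本", "を", "読み", "まし", "た"]
essay_b = ["彼", "は", "先生", "です"]

ref_vec = tfidf_vector(reference, idf, default_idf)
sim_a = cosine(ref_vec, tfidf_vector(essay_a, idf, default_idf))
sim_b = cosine(ref_vec, tfidf_vector(essay_b, idf, default_idf))
print(sim_a, sim_b)
```

The essay that shares most of its morphemes with the reference (`essay_a`) receives the higher similarity, and no pre-scored essays are needed to compute it.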

Mizumoto and Eguchi (2023) were pioneers in employing the GPT model for AES in non-native English writing. Their study evaluated the accuracy and reliability of AES using the GPT-3 text-davinci-003 model, analyzing a dataset of 12,100 essays from the corpus of nonnative written English (TOEFL11). The findings indicated that AES utilizing the GPT-3 model exhibited a certain degree of accuracy and reliability. They suggest that GPT-3-based AES systems hold the potential to provide support for human ratings. However, applying a GPT model to AES presents a unique natural language processing (NLP) task that involves considerations such as nonnative language proficiency, the influence of the learner’s first language on the output in the target language, and the identification of linguistic features that best indicate writing quality in a specific language. These linguistic features may differ morphologically or syntactically from those present in the learners’ first language, as observed in (1)–(3).

(1) Isolating

我-送了-他-一本-书

Wǒ-sòngle-tā-yī běn-shū

1sg.-give.past-him-one.cl-book

“I gave him a book.”

(2) Agglutinative

彼-に-本-を-あげ-まし-た

Kare-ni-hon-o-age-mashi-ta

3sg.-dat-hon-acc-give.honorification.past

(3) Inflectional

give, give-s, gave, given, giving

Additionally, the morphological agglutination and subject-object-verb (SOV) order in Japanese, along with its idiomatic expressions, pose additional challenges for applying language models in AES tasks (4).

(4) 足-が 棒-に なり-ました

Ashi-ga bo-ni nari-mashita

leg-nom stick-dat become-past

“My leg became like a stick (I am extremely tired).”

The example sentence demonstrates the morpho-syntactic structure of Japanese and the presence of an idiomatic expression. In this sentence, the verb “なる” (naru), meaning “to become”, appears at the end of the sentence. The verb stem “なり” (nari) is followed by morphemes indicating honorification (“ます”, masu) and tense (“た”, ta), showcasing agglutination. While the sentence can be literally translated as “my leg became like a stick”, it carries an idiomatic interpretation that implies “I am extremely tired”.

To overcome this issue, CyberAgent Inc. (2023) has developed the Open-Calm series of language models specifically designed for Japanese. Open-Calm consists of pre-trained models available in various sizes, such as Small, Medium, Large, and 7b. Figure 2 depicts the fundamental structure of the Open-Calm model. A key feature of this architecture is the incorporation of the LoRA adapter and the GPT-NeoX framework, which can enhance its language processing capabilities.

Fig. 2: GPT-NeoX Model Architecture (Okgetheng and Takeuchi 2024).

In a recent study, Okgetheng and Takeuchi (2024) assessed the efficacy of Open-Calm language models in grading Japanese essays. The research utilized a dataset of approximately 300 essays, which were annotated by native Japanese educators. The findings demonstrate the considerable potential of Open-Calm language models in automated Japanese essay scoring. Specifically, among the Open-Calm family, the Open-Calm Large model (referred to as OCLL) exhibited the highest performance. However, as of the current date, the Open-Calm Large model does not offer public access to its server. Consequently, users are required to deploy and operate the environment for OCLL independently. To use OCLL, users must have a PC equipped with an NVIDIA GeForce RTX 3060 (8 or 12 GB VRAM).

In summary, while the potential of LLMs in automated scoring of nonnative Japanese essays has been demonstrated in two studies—BERT-driven AES (Hirao et al. 2020 ) and OCLL-based AES (Okgetheng and Takeuchi, 2024 )—the number of research efforts in this area remains limited.

Another significant challenge in applying LLMs to AES lies in prompt engineering and ensuring its reliability and effectiveness (Brown et al. 2020; Rae et al. 2021; Zhang et al. 2021). Various prompting strategies have been proposed, such as the zero-shot chain of thought (CoT) approach (Kojima et al. 2022), which involves manually crafting diverse and effective examples. However, manual efforts can lead to mistakes. To address this, Zhang et al. (2021) introduced an automatic CoT prompting method called Auto-CoT, which demonstrates matching or superior performance compared to the CoT paradigm. Another prompt framework is tree of thoughts, which enables a model to self-evaluate its progress at intermediate stages of problem-solving through deliberate reasoning (Yao et al. 2023).

Beyond linguistic studies, there has been a noticeable increase in the number of foreign workers in Japan and Japanese learners worldwide (Ministry of Health, Labor, and Welfare of Japan, 2022 ; Japan Foundation, 2021 ). However, existing assessment methods, such as the Japanese Language Proficiency Test (JLPT), J-CAT, and TTBJ Footnote 1 , primarily focus on reading, listening, vocabulary, and grammar skills, neglecting the evaluation of writing proficiency. As the number of workers and language learners continues to grow, there is a rising demand for an efficient AES system that can reduce costs and time for raters and be utilized for employment, examinations, and self-study purposes.

This study aims to explore the potential of LLM-based AES by comparing the effectiveness of five models: two LLMs (GPT Footnote 2 and BERT), one Japanese local LLM (OCLL), and two conventional machine learning-based methods (linguistic feature-based scoring tools - Jess and JWriter).

The research questions addressed in this study are as follows:

To what extent do the LLM-driven AES and linguistic feature-based AES, when used as automated tools to support human rating, accurately reflect test takers’ actual performance?

What influence does the prompt have on the accuracy and performance of LLM-based AES methods?

The subsequent sections of the manuscript cover the methodology, including the assessment measures for nonnative Japanese writing proficiency, criteria for prompts, and the dataset. The evaluation section focuses on the analysis of annotations and rating scores generated by LLM-driven and linguistic feature-based AES methods.

Methodology

The dataset utilized in this study was obtained from the International Corpus of Japanese as a Second Language (I-JAS) Footnote 3 . This corpus consisted of 1000 participants who represented 12 different first languages. For the study, the participants were given a story-writing task on a personal computer. They were required to write two stories based on the 4-panel illustrations titled “Picnic” and “The key” (see Appendix A). Background information for the participants was provided by the corpus, including their Japanese language proficiency levels assessed through two online tests: J-CAT and SPOT. These tests evaluated their reading, listening, vocabulary, and grammar abilities. The learners’ proficiency levels were categorized into six levels aligned with the Common European Framework of Reference for Languages (CEFR) and the Reference Framework for Japanese Language Education (RFJLE): A1, A2, B1, B2, C1, and C2. According to Lee et al. ( 2015 ), there is a high level of agreement (r = 0.86) between the J-CAT and SPOT assessments, indicating that the proficiency certifications provided by J-CAT are consistent with those of SPOT. However, it is important to note that the scores of J-CAT and SPOT do not have a one-to-one correspondence. In this study, the J-CAT scores were used as a benchmark to differentiate learners of different proficiency levels. A total of 1400 essays were utilized, representing the beginner (aligned with A1), A2, B1, B2, C1, and C2 levels based on the J-CAT scores. Table 1 provides information about the learners’ proficiency levels and their corresponding J-CAT and SPOT scores.

A dataset comprising a total of 1400 essays from the story writing tasks was collected. Among these, 714 essays were used to evaluate the reliability of the LLM-based AES method, while the remaining 686 essays were designated as development data to assess the LLM-based AES’s capability to distinguish participants with varying proficiency levels. The GPT-4 API was used in this study. A detailed explanation of the prompt-assessment criteria is provided in Section Prompt. All essays were sent to the model for measurement and scoring.

Measures of writing proficiency for nonnative Japanese

Japanese exhibits a morphologically agglutinative structure where morphemes are attached to the word stem to convey grammatical functions such as tense, aspect, voice, and honorifics, e.g. (5).

食べ-させ-られ-まし-た-か

tabe-sase-rare-mashi-ta-ka

[eat (stem)-causative-passive voice-honorification-tense. past-question marker]

Japanese employs nine case particles to indicate grammatical functions: the nominative case particle が (ga), the accusative case particle を (o), the genitive case particle の (no), the dative case particle に (ni), the locative/instrumental case particle で (de), the ablative case particle から (kara), the directional case particle へ (e), and the comitative case particle と (to). The agglutinative nature of the language, combined with the case particle system, provides an efficient means of distinguishing between active and passive voice, either through morphemes or case particles, e.g. 食べる taberu “eat.conclusive” (active voice); 食べられる taberareru “eat.conclusive” (passive voice). In the active voice, “パン を 食べる” (pan o taberu) translates to “to eat bread”. In the passive voice, it becomes “パン が 食べられた” (pan ga taberareta), which means “(the) bread was eaten”. Additionally, different conjugations of the same lemma are counted as one type in order to ensure a comprehensive assessment of the language features. For example, 食べる taberu “eat.conclusive”, 食べている tabeteiru “eat.progressive”, and 食べた tabeta “eat.past” are counted as one type.

To incorporate these features, previous research (Suzuki, 1999 ; Watanabe et al. 1988 ; Ishioka, 2001 ; Ishioka and Kameda, 2006 ; Hirao et al. 2020 ) has identified complexity, fluency, and accuracy as crucial factors for evaluating writing quality. These criteria are assessed through various aspects, including lexical richness (lexical density, diversity, and sophistication), syntactic complexity, and cohesion (Kyle et al. 2021 ; Mizumoto and Eguchi, 2023 ; Ure, 1971 ; Halliday, 1985 ; Barkaoui and Hadidi, 2020 ; Zenker and Kyle, 2021 ; Kim et al. 2018 ; Lu, 2017 ; Ortega, 2015 ). Therefore, this study proposes five scoring categories: lexical richness, syntactic complexity, cohesion, content elaboration, and grammatical accuracy. A total of 16 measures were employed to capture these categories. The calculation process and specific details of these measures can be found in Table 2 .

The T-unit, first introduced by Hunt (1966), is a measure used for evaluating speech and composition. It serves as an indicator of syntactic development and represents the shortest units into which a piece of discourse can be divided without leaving any sentence fragments. In the context of Japanese language assessment, Sakoda and Hosoi (2020) used the T-unit as the basic unit to assess the accuracy and complexity of Japanese learners’ speaking and storytelling. The calculation of T-units in Japanese follows these principles:

A single main clause constitutes 1 T-unit, regardless of the presence or absence of dependent clauses, e.g. (6).

ケンとマリはピクニックに行きました (main clause): 1 T-unit.

If a sentence contains a main clause along with subclauses, each subclause is considered part of the same T-unit, e.g. (7).

天気が良かった の で (subclause)、ケンとマリはピクニックに行きました (main clause): 1 T-unit.

In the case of coordinate clauses, where multiple clauses are connected, each coordinated clause is counted separately. Thus, a sentence with coordinate clauses may have 2 T-units or more, e.g. (8).

ケンは地図で場所を探して (coordinate clause)、マリはサンドイッチを作りました (coordinate clause): 2 T-units.

Lexical diversity refers to the range of words used within a text (Engber, 1995 ; Kyle et al. 2021 ) and is considered a useful measure of the breadth of vocabulary in L n production (Jarvis, 2013a , 2013b ).

The type/token ratio (TTR) is widely recognized as a straightforward measure for calculating lexical diversity and has been employed in numerous studies. These studies have demonstrated a strong correlation between TTR and other methods of measuring lexical diversity (e.g., Bentz et al. 2016 ; Čech and Miroslav, 2018 ; Çöltekin and Taraka, 2018 ). TTR is computed by considering both the number of unique words (types) and the total number of words (tokens) in a given text. Given that the length of learners’ writing texts can vary, this study employs the moving average type-token ratio (MATTR) to mitigate the influence of text length. MATTR is calculated using a 50-word moving window. Initially, a TTR is determined for words 1–50 in an essay, followed by words 2–51, 3–52, and so on until the end of the essay is reached (Díez-Ortega and Kyle, 2023 ). The final MATTR scores were obtained by averaging the TTR scores for all 50-word windows. The following formula was employed to derive MATTR:

\({\rm{MATTR}}({\rm{W}})=\frac{{\sum }_{{\rm{i}}=1}^{{\rm{N}}-{\rm{W}}+1}{{\rm{F}}}_{{\rm{i}}}}{{\rm{W}}({\rm{N}}-{\rm{W}}+1)}\)

Here, N refers to the number of tokens in the corpus. W is the chosen window size in tokens (W < N; 50 in this study). \({F}_{i}\) is the number of types in the i-th window. \({\rm{MATTR}}({\rm{W}})\) is the mean of the series of type-token ratios (TTRs), based on word forms, over all windows. It is expected that individuals with higher language proficiency will produce texts with greater lexical diversity, as indicated by higher MATTR scores.
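The MATTR computation described above can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes the essay has already been segmented into a list of tokens (the tokenizer is not shown), and the function name `mattr` and the fallback for texts shorter than one window are assumptions rather than part of the study.

```python
def mattr(tokens, window=50):
    """Moving-average type-token ratio over all windows of `window` tokens."""
    n = len(tokens)
    if n == 0:
        raise ValueError("token list is empty")
    if n <= window:
        # Assumption: fall back to the plain TTR when the text does not
        # contain more than one full window.
        return len(set(tokens)) / n
    # F_i / W for each of the N - W + 1 windows, then the mean of those ratios.
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(n - window + 1)
    ]
    return sum(ratios) / len(ratios)


# Toy usage with a window of 2 tokens instead of 50.
toy_score = mattr(["a", "a", "b", "b"], window=2)
```

Averaging the per-window ratios is algebraically the same as dividing the summed type counts by W(N − W + 1), as in the formula.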

Lexical density was captured by the ratio of the number of lexical words to the total number of words (Lu, 2012). Lexical sophistication refers to the utilization of advanced vocabulary, often evaluated through word frequency indices (Crossley et al. 2013; Haberman, 2008; Kyle and Crossley, 2015; Laufer and Nation, 1995; Lu, 2012; Read, 2000). In the context of writing, lexical sophistication can be interpreted as vocabulary breadth, which entails the appropriate usage of vocabulary items across various lexico-grammatical contexts and registers (Garner et al. 2019; Kim et al. 2018; Kyle et al. 2018). In Japanese specifically, words are considered lexically sophisticated if they are not included in the “Japanese Education Vocabulary List Ver 1.0”. Footnote 4 Consequently, lexical sophistication was calculated as the number of sophisticated word types relative to the total number of words per essay. Furthermore, it has been suggested that, in Japanese writing, sentences should ideally be no longer than 40 to 50 characters, as this promotes readability. Therefore, the median and maximum sentence length can be considered useful indices for assessment (Ishioka and Kameda, 2006).

Syntactic complexity was assessed based on several measures, including the mean length of clauses, verb phrases per T-unit, clauses per T-unit, dependent clauses per T-unit, complex nominals per clause, adverbial clauses per clause, coordinate phrases per clause, and mean dependency distance (MDD). The MDD reflects the distance between the governor and dependent positions in a sentence. A larger dependency distance indicates a higher cognitive load and greater complexity in syntactic processing (Liu, 2008; Liu et al. 2017). The MDD has been established as an efficient metric for measuring syntactic complexity (Jiang, Quyang, and Liu, 2019; Li and Yan, 2021). To calculate the MDD, words in a sentence are assumed to be assigned position numbers in linear order, such as W1 … Wi … Wn. In any dependency relationship between words Wa and Wb, Wa is the governor and Wb is the dependent, and the dependency distance is the difference between their position numbers. The MDD of the entire sentence was obtained by averaging the absolute values of these differences (governor − dependent):

MDD = \(\frac{1}{n}{\sum }_{i=1}^{n}|{\rm{D}}{{\rm{D}}}_{i}|\)

In this formula, \(n\) represents the number of words in the sentence, and \({DD}_{i}\) is the dependency distance of the \(i\)th dependency relationship of the sentence. For example, the sentence ‘Mary-ga-John-ni-keshigomu-o-watashita’ is annotated as [Mary-top-John-dat-eraser-acc-give-past], and its MDD is 2. Table 3 provides the CSV file used as a prompt for GPT-4.
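As a rough illustration of the MDD calculation, the sketch below averages the absolute governor-dependent distances over the dependency relations of one sentence. The parse used for the example is an assumption made here for illustration (four units, with the verb in position 4 governing the three case-marked arguments), not the annotation used in the study; under that assumption, averaging over the three relations gives the value of 2 reported above. The function name is likewise hypothetical.

```python
def mean_dependency_distance(dependencies):
    """Mean absolute distance over (governor_position, dependent_position) pairs.

    Positions are 1-based linear positions of the units in the sentence.
    """
    if not dependencies:
        raise ValueError("at least one dependency relation is required")
    distances = [abs(governor - dependent) for governor, dependent in dependencies]
    return sum(distances) / len(distances)


# Assumed parse of "Mary-ga John-ni keshigomu-o watashita":
# 1 = Mary-ga, 2 = John-ni, 3 = keshigomu-o, 4 = watashita (verb, governor).
example_parse = [(4, 1), (4, 2), (4, 3)]
example_mdd = mean_dependency_distance(example_parse)  # (3 + 2 + 1) / 3
```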

Cohesion (semantic similarity) and content elaboration aim to capture the ideas presented in test takers’ essays. Cohesion was assessed using three measures: synonym overlap/paragraph (topic), synonym overlap/paragraph (keywords), and word2vec cosine similarity. Content elaboration and development were measured as the number of metadiscourse markers (types) divided by the number of words. To capture content closely, this study proposed a novel distance-based representation that encodes the cosine distance between the i-vectors of the learner’s essay and those of the essay task (topic and keywords). The learner’s essay is decoded into a word sequence and aligned to the essay task’s topic and keywords for log-likelihood measurement. The cosine distance reveals the content elaboration score of the learner’s essay. The cosine similarity between the target and reference vectors is given in (11), where, for a given essay, \(L=({L}_{1},\ldots ,{L}_{n})\) and \(N=({N}_{1},\ldots ,{N}_{n})\) are the vectors representing the learner’s essay and the task’s topic and keywords, respectively. The content elaboration distance between L and N was calculated as follows:

\(\cos \left(\theta \right)=\frac{{\rm{L}}\,\cdot\, {\rm{N}}}{\left|{\rm{L}}\right|{\rm{|N|}}}=\frac{\mathop{\sum }\nolimits_{i=1}^{n}{L}_{i}{N}_{i}}{\sqrt{\mathop{\sum }\nolimits_{i=1}^{n}{L}_{i}^{2}}\sqrt{\mathop{\sum }\nolimits_{i=1}^{n}{N}_{i}^{2}}}\)

A high similarity value indicates a low difference between the two recognition outcomes, which in turn suggests a high level of proficiency in content elaboration.
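The cosine similarity in (11) can be written directly from its definition. The sketch below is a minimal pure-Python version; how the essay and task vectors are obtained (for example, from word2vec embeddings) is outside its scope, and the function name is an assumption.

```python
import math


def cosine_similarity(l_vec, n_vec):
    """Cosine of the angle between a learner vector L and a task vector N."""
    if len(l_vec) != len(n_vec):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(l_vec, n_vec))
    norm_l = math.sqrt(sum(a * a for a in l_vec))
    norm_n = math.sqrt(sum(b * b for b in n_vec))
    if norm_l == 0 or norm_n == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_l * norm_n)


# Parallel vectors give a similarity of 1; orthogonal vectors give 0.
parallel = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```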

To evaluate the effectiveness of the proposed measures in distinguishing different proficiency levels among nonnative Japanese speakers’ writing, we conducted a multi-faceted Rasch measurement analysis (Linacre, 1994 ). This approach applies measurement models to thoroughly analyze various factors that can influence test outcomes, including test takers’ proficiency, item difficulty, and rater severity, among others. The underlying principles and functionality of multi-faceted Rasch measurement are illustrated in (12).

\(\log \left(\frac{{P}_{{nijk}}}{{P}_{{nij}(k-1)}}\right)={B}_{n}-{D}_{i}-{C}_{j}-{F}_{k}\)

Equation (12) defines the logarithm of the probability ratio \({P}_{nijk}/{P}_{nij(k-1)}\) as a function of multiple parameters. Here, n represents the test taker, i denotes a writing proficiency measure, j corresponds to the human rater, and k represents the proficiency score. The parameter \({B}_{n}\) signifies the proficiency level of test taker n (where n ranges from 1 to N). \({D}_{i}\) represents the difficulty parameter of test item i (where i ranges from 1 to L), while \({C}_{j}\) represents the severity of rater j (where j ranges from 1 to J). Additionally, \({F}_{k}\) represents the step difficulty for a test taker to move from score k−1 to k. \({P}_{nijk}\) refers to the probability of rater j assigning score k to test taker n for test item i, and \({P}_{nij(k-1)}\) represents the probability of test taker n being assigned score k−1 by rater j for test item i. Each facet within the test is treated as an independent parameter and estimated within the same reference framework. To evaluate the consistency of scores obtained through both human and computer analysis, we used the Infit mean-square statistic. This statistic is a chi-square measure divided by the degrees of freedom and is weighted with information. It is more sensitive to unexpected patterns in responses to items near a person’s proficiency level (Linacre, 2002). Fit statistics are assessed against predefined thresholds for acceptable fit. For the Infit MNSQ, which has a mean of 1.00, different thresholds have been suggested. Some propose stricter thresholds ranging from 0.7 to 1.3 (Bond et al. 2021), while others suggest more lenient thresholds ranging from 0.5 to 1.5 (Eckes, 2009). In this study, we adopted the criterion of 0.70–1.30 for the Infit MNSQ.
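To make the adjacent-category relation in (12) concrete, the sketch below turns a set of facet parameters into the probabilities of each score category. It is not the estimation procedure used in the study (which fits the parameters from data with dedicated Rasch software); the parameter values and the function name are hypothetical, and the lowest category is taken as the reference category with no step difficulty.

```python
import math


def category_probabilities(b_n, d_i, c_j, step_difficulties):
    """Return [P_0, ..., P_K] implied by log(P_k / P_(k-1)) = B_n - D_i - C_j - F_k.

    step_difficulties holds F_1 ... F_K; category 0 is the reference category.
    """
    # Accumulate the log-odds step by step, starting from 0 for category 0.
    log_numerators = [0.0]
    for f_k in step_difficulties:
        log_numerators.append(log_numerators[-1] + (b_n - d_i - c_j - f_k))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_numerators)
    weights = [math.exp(value - peak) for value in log_numerators]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical values: an able test taker, an easy measure, a lenient rater.
steps = [-1.0, 0.0, 1.0]
probs = category_probabilities(b_n=1.0, d_i=0.2, c_j=-0.1, step_difficulties=steps)
```

By construction, the log-ratio of any two adjacent probabilities in `probs` equals B_n − D_i − C_j − F_k for the corresponding step.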

Moving forward, we can now proceed to assess the effectiveness of the 16 proposed measures based on five criteria for accurately distinguishing various levels of writing proficiency among non-native Japanese speakers. To conduct this evaluation, we utilized the development dataset from the I-JAS corpus, as described in Section Dataset . Table 4 provides a measurement report that presents the performance details of the 14 metrics under consideration. The measure separation was found to be 4.02, indicating a clear differentiation among the measures. The reliability index for the measure separation was 0.891, suggesting consistency in the measurement. Similarly, the person separation reliability index was 0.802, indicating the accuracy of the assessment in distinguishing between individuals. All 16 measures demonstrated Infit mean squares within a reasonable range, ranging from 0.76 to 1.28. The Synonym overlap/paragraph (topic) measure exhibited a relatively high outfit mean square of 1.46, although the Infit mean square falls within an acceptable range. The standard error for the measures ranged from 0.13 to 0.28, indicating the precision of the estimates.

Table 5 further illustrates the weights assigned to different linguistic measures for score prediction, with higher weights indicating stronger correlations between those measures and higher scores. Specifically, the following measures exhibited higher weights than the others. Moving average type-token ratio per essay had a weight of 0.0391. Mean dependency distance had a weight of 0.0388. Mean length of clause, calculated by dividing the number of words by the number of clauses, had a weight of 0.0374. Complex nominals per T-unit, calculated by dividing the number of complex nominals by the number of T-units, had a weight of 0.0379. Coordinate phrases rate, calculated by dividing the number of coordinate phrases by the number of clauses, had a weight of 0.0325. Grammatical error rate, representing the number of errors per essay, had a weight of 0.0322.

Criteria (output indicator)

The criteria used to evaluate the writing ability in this study were based on CEFR, which follows a six-point scale ranging from A1 to C2. To assess the quality of Japanese writing, the scoring criteria from Table 6 were utilized. These criteria were derived from the IELTS writing standards and served as assessment guidelines and prompts for the written output.

A prompt is a question or detailed instruction that is provided to the model to obtain a proper response. After several pilot experiments, we decided to provide the measures (Section Measures of writing proficiency for nonnative Japanese) as the input prompt and use the criteria (Section Criteria (output indicator)) as the output indicator. Regarding the prompt language, given that the LLM was tasked with rating Japanese essays, would a prompt in Japanese work better (Footnote 5)? We conducted experiments comparing the performance of GPT-4 using both English and Japanese prompts. Additionally, we used the Japanese local model OCLL with Japanese prompts. Multiple trials were conducted using the same sample. Regardless of the prompt language used, we consistently obtained the same grading results with GPT-4, which assigned a grade of B1 to the writing sample. This suggested that GPT-4 is reliable and capable of producing consistent ratings regardless of the prompt language. On the other hand, when we used Japanese prompts with the Japanese local model OCLL, we encountered inconsistent grading results. Out of 10 attempts with OCLL, only 6 yielded consistent grading results (B1), while the remaining 4 showed different outcomes, including A1 and B2 grades. These findings indicated that the language of the prompt was not the determining factor for reliable AES. Instead, the size of the training data and the model parameters played crucial roles in achieving consistent and reliable AES results for the language model.

The following is the utilized prompt, which details all measures and requires the LLM to score the essays using holistic and trait scores.

Please evaluate Japanese essays written by Japanese learners and assign a score to each essay on a six-point scale, ranging from A1, A2, B1, B2, C1 to C2. Additionally, please provide trait scores and display the calculation process for each trait score. The scoring should be based on the following criteria:

Moving average type-token ratio.

Number of lexical words (token) divided by the total number of words per essay.

Number of sophisticated word types divided by the total number of words per essay.

Mean length of clause.

Verb phrases per T-unit.

Clauses per T-unit.

Dependent clauses per T-unit.

Complex nominals per clause.

Adverbial clauses per clause.

Coordinate phrases per clause.

Mean dependency distance.

Synonym overlap paragraph (topic and keywords).

Word2vec cosine similarity.

Connectives per essay.

Conjunctions per essay.

Number of metadiscourse markers (types) divided by the total number of words.

Number of errors per essay.

Japanese essay text

出かける前に二人が地図を見ている間に、サンドイッチを入れたバスケットに犬が入ってしまいました。それに気づかずに二人は楽しそうに出かけて行きました。やがて突然犬がバスケットから飛び出し、二人は驚きました。バスケット の 中を見ると、食べ物はすべて犬に食べられていて、二人は困ってしまいました。(ID_JJJ01_SW1)

The score of the example above was B1. Figure 3 provides an example of holistic and trait scores provided by GPT-4 (with a prompt indicating all measures) via Bing Footnote 6 .

Fig. 3: Example of GPT-4 AES and feedback (with a prompt indicating all measures).
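For readers who want to reproduce the prompt layout described in this section, the sketch below assembles the instruction text, the list of measures, and one essay into a single string. It only builds the text; how the string is sent to a model is not shown. The names `MEASURES` and `build_prompt`, the numbering of the measures, and the exact line layout are assumptions made for illustration, not the authors' script.

```python
INSTRUCTIONS = (
    "Please evaluate Japanese essays written by Japanese learners and assign "
    "a score to each essay on a six-point scale, ranging from A1, A2, B1, B2, "
    "C1 to C2. Additionally, please provide trait scores and display the "
    "calculation process for each trait score. The scoring should be based on "
    "the following criteria:"
)

# The measures as listed in the prompt above.
MEASURES = [
    "Moving average type-token ratio.",
    "Number of lexical words (token) divided by the total number of words per essay.",
    "Number of sophisticated word types divided by the total number of words per essay.",
    "Mean length of clause.",
    "Verb phrases per T-unit.",
    "Clauses per T-unit.",
    "Dependent clauses per T-unit.",
    "Complex nominals per clause.",
    "Adverbial clauses per clause.",
    "Coordinate phrases per clause.",
    "Mean dependency distance.",
    "Synonym overlap paragraph (topic and keywords).",
    "Word2vec cosine similarity.",
    "Connectives per essay.",
    "Conjunctions per essay.",
    "Number of metadiscourse markers (types) divided by the total number of words.",
    "Number of errors per essay.",
]


def build_prompt(essay_text):
    """Combine the instructions, the numbered measures, and one essay."""
    numbered = "\n".join(
        f"{index}. {measure}" for index, measure in enumerate(MEASURES, start=1)
    )
    return f"{INSTRUCTIONS}\n\n{numbered}\n\nJapanese essay text:\n{essay_text}"


# Usage with a short placeholder essay.
prompt = build_prompt("二人はピクニックに行きました。")
```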

Statistical analysis

The aim of this study is to investigate the potential use of LLM for nonnative Japanese AES. It seeks to compare the scoring outcomes obtained from feature-based AES tools, which rely on conventional machine learning technology (i.e. Jess, JWriter), with those generated by AI-driven AES tools utilizing deep learning technology (BERT, GPT, OCLL). To assess the reliability of a computer-assisted annotation tool, the study initially established human-human agreement as the benchmark measure. Subsequently, the performance of the LLM-based method was evaluated by comparing it to human-human agreement.

To assess annotation agreement, the study employed standard measures such as precision, recall, and F-score (Brants 2000; Lu 2010), along with the quadratically weighted kappa (QWK), to evaluate the consistency and agreement in the annotation process. Let A and B represent two human annotators. Comparing their annotations, precision, recall, and F-score are defined in equations (13) to (15).

\({\rm{Recall}}(A,B)=\frac{{\rm{Number}}\,{\rm{of}}\,{\rm{identical}}\,{\rm{nodes}}\,{\rm{in}}\,A\,{\rm{and}}\,B}{{\rm{Number}}\,{\rm{of}}\,{\rm{nodes}}\,{\rm{in}}\,A}\)

\({\rm{Precision}}(A,\,B)=\frac{{\rm{Number}}\,{\rm{of}}\,{\rm{identical}}\,{\rm{nodes}}\,{\rm{in}}\,A\,{\rm{and}}\,B}{{\rm{Number}}\,{\rm{of}}\,{\rm{nodes}}\,{\rm{in}}\,B}\)

The F-score is the harmonic mean of recall and precision:

\({\rm{F}}-{\rm{score}}=\frac{2* ({\rm{Precision}}* {\rm{Recall}})}{{\rm{Precision}}+{\rm{Recall}}}\)

The highest possible value of an F-score is 1.0, indicating perfect precision and recall, and the lowest possible value is 0, which occurs if either precision or recall is zero.
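Equations (13) to (15) can be sketched as follows. The representation of an annotated node as a hashable tuple (label, start, end) is an assumption made for this example; the study does not specify its data format, and the function name is hypothetical.

```python
def agreement_scores(nodes_a, nodes_b):
    """Return (precision, recall, f_score) for annotator A against annotator B."""
    set_a, set_b = set(nodes_a), set(nodes_b)
    if not set_a or not set_b:
        raise ValueError("both annotators must have at least one node")
    identical = len(set_a & set_b)
    recall = identical / len(set_a)      # identical nodes / nodes in A
    precision = identical / len(set_b)   # identical nodes / nodes in B
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Toy annotations: A marks three nodes, B marks four, and two are identical.
annotator_a = [("clause", 0, 5), ("clause", 6, 9), ("t-unit", 0, 9)]
annotator_b = [("clause", 0, 5), ("clause", 6, 8), ("clause", 8, 9), ("t-unit", 0, 9)]
precision, recall, f_score = agreement_scores(annotator_a, annotator_b)
```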

In accordance with Taghipour and Ng ( 2016 ), the calculation of QWK involves two steps:

Step 1: Construct a weight matrix W as follows:

\({W}_{{ij}}=\frac{{(i-j)}^{2}}{{(N-1)}^{2}}\)

Here, i represents the annotation made by the tool, while j represents the annotation made by a human rater. N denotes the total number of possible annotations. Matrix O is subsequently computed, where \({O}_{i,j}\) represents the count of data annotated as i by the tool and as j by the human annotator. E refers to the expected count matrix, which is normalized so that the sum of elements in E matches the sum of elements in O.

Step 2: With matrices O and E, the QWK is obtained as follows:

\(K=1-\frac{{\sum }_{i,j}{W}_{i,j}\,{O}_{i,j}}{{\sum }_{i,j}{W}_{i,j}\,{E}_{i,j}}\)
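The two steps above can be sketched in plain Python. The sketch assumes that annotations are coded as integers from 0 to N − 1 (for example, A1 = 0 through C2 = 5) and that the expected matrix E is the outer product of the two raters' marginal counts, scaled to the same total as O; both the coding and the function name are assumptions of this example.

```python
def quadratic_weighted_kappa(tool, human, n_categories):
    """QWK between two equal-length lists of integer labels in 0..n_categories-1."""
    if len(tool) != len(human) or not tool:
        raise ValueError("label lists must be non-empty and of equal length")
    if n_categories < 2:
        raise ValueError("at least two categories are required")
    size = n_categories
    # Observed count matrix O: rows are tool labels, columns are human labels.
    observed = [[0] * size for _ in range(size)]
    for i, j in zip(tool, human):
        observed[i][j] += 1
    total = len(tool)
    row_totals = [sum(observed[i]) for i in range(size)]
    col_totals = [sum(observed[i][j] for i in range(size)) for j in range(size)]
    numerator = 0.0
    denominator = 0.0
    for i in range(size):
        for j in range(size):
            weight = (i - j) ** 2 / (size - 1) ** 2
            expected = row_totals[i] * col_totals[j] / total  # sums to `total`
            numerator += weight * observed[i][j]
            denominator += weight * expected
    if denominator == 0:
        raise ValueError("kappa is undefined when all labels fall in one category")
    return 1 - numerator / denominator


# Perfect agreement yields 1; fully reversed ratings on three levels yield -1.
perfect = quadratic_weighted_kappa([0, 1, 2, 3], [0, 1, 2, 3], n_categories=4)
reversed_ = quadratic_weighted_kappa([0, 1, 2], [2, 1, 0], n_categories=3)
```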

The value of the quadratic weighted kappa increases as the level of agreement improves. Further, to assess the accuracy of LLM scoring, the proportional reductive mean square error (PRMSE) was employed. The PRMSE approach takes into account the variability observed in human ratings to estimate the rater error, which is then subtracted from the variance of the human labels. This calculation provides an overall measure of agreement between the automated scores and true scores (Haberman et al. 2015 ; Loukina et al. 2020 ; Taghipour and Ng, 2016 ). The computation of PRMSE involves the following steps:

Step 1: Calculate the mean squared errors (MSEs) for the scoring outcomes of the computer-assisted tool (MSE tool) and the human scoring outcomes (MSE human).

Step 2: Determine the PRMSE by comparing the MSE of the computer-assisted tool (MSE tool) with the MSE from human raters (MSE human), using the following formula:

\({\rm{PRMSE}}=1-\frac{({\rm{MSE}}\,{\rm{tool}})}{({\rm{MSE}}\,{\rm{human}})}=1-\frac{{\sum }_{i=1}^{n}{({y}_{i}-{\hat{y}}_{i})}^{2}}{{\sum }_{i=1}^{n}{({y}_{i}-\hat{y})}^{2}}\)

In the numerator, \({\hat{y}}_{i}\) represents the scoring outcome predicted by a specific LLM-driven AES system for a given sample. The term \({y}_{i}-{\hat{y}}_{i}\) represents the difference between this predicted outcome and the mean value of all LLM-driven AES systems’ scoring outcomes; it quantifies the deviation of the specific system’s prediction from the average prediction of all LLM-driven AES systems. In the denominator, \({y}_{i}-\hat{y}\) represents the difference between the scoring outcome provided by a specific human rater for a given sample and the mean value of all human raters’ scoring outcomes; it measures the discrepancy between the specific human rater’s score and the average score given by all human raters. The PRMSE is then calculated by subtracting the ratio of the MSE tool to the MSE human from 1. PRMSE falls within the range of 0 to 1, with larger values indicating reduced errors in the LLM’s scoring compared to those of human raters. In other words, a higher PRMSE implies that the LLM’s scoring predicts the true scores more accurately (Loukina et al. 2020). The interpretation of kappa values is based on the work of Landis and Koch (1977): −1 indicates complete inconsistency, 0 indicates random agreement, 0.0–0.20 indicates an extremely low level of agreement (slight), 0.21–0.40 a low level of agreement (fair), 0.41–0.60 a medium level of agreement (moderate), 0.61–0.80 a high level of agreement (substantial), and 0.81–1 an almost perfect level of agreement. All statistical analyses were executed using Python scripts.
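The PRMSE formula can be sketched as below. This follows the formula as printed, reading the quantity subtracted in the denominator as the mean of the human scores; it is a simplified illustration and not the full rater-error-corrected estimator discussed by Haberman et al. (2015) and Loukina et al. (2020). The function and variable names are assumptions.

```python
def prmse(human_scores, tool_scores):
    """1 - (squared error of the tool) / (squared deviation of the human scores)."""
    if len(human_scores) != len(tool_scores) or not human_scores:
        raise ValueError("score lists must be non-empty and of equal length")
    mean_human = sum(human_scores) / len(human_scores)
    # Numerator: squared differences between human and tool scores.
    error_tool = sum((y - y_hat) ** 2 for y, y_hat in zip(human_scores, tool_scores))
    # Denominator: squared differences between human scores and their mean.
    error_human = sum((y - mean_human) ** 2 for y in human_scores)
    if error_human == 0:
        raise ValueError("PRMSE is undefined when all human scores are identical")
    return 1 - error_tool / error_human


# Toy scores on a numeric scale: the tool misses one essay by one point.
toy_prmse = prmse([1, 2, 3, 4], [1, 2, 3, 5])
```

Because both sums run over the same n essays, the 1/n factors of the two mean squared errors cancel, so the ratio of sums equals the ratio of MSEs.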

Results and discussion

Annotation reliability of the LLM

This section focuses on assessing the reliability of the LLM’s annotation and scoring capabilities. To evaluate the reliability, several tests were conducted simultaneously, aiming to achieve the following objectives:

Assess the LLM’s ability to differentiate between test takers with varying levels of writing proficiency.

Determine the level of agreement between the annotations and scoring performed by the LLM and those done by human raters.

The evaluation of the results encompassed several metrics, including: precision, recall, F-Score, quadratically-weighted kappa, proportional reduction of mean squared error, Pearson correlation, and multi-faceted Rasch measurement.

Inter-annotator agreement (human–human annotator agreement)

We started with an agreement test between two human annotators. Two trained annotators were recruited to determine the writing task data measures, and a total of 714 scripts were used as test data. Each analysis lasted 300–360 min. Inter-annotator agreement was evaluated using the standard measures of precision, recall, and F-score, together with QWK. Table 7 presents the inter-annotator agreement for the various indicators. As shown, the inter-annotator agreement was fairly high, with F-scores ranging from 1.0 for sentence and word number to 0.666 for grammatical errors.

The findings from the QWK analysis provided further confirmation of the inter-annotator agreement. The QWK values ranged from 0.950 (p = 0.000) for sentence and word number to 0.695 (p = 0.001) for synonym overlap number (keyword) and grammatical errors.
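The agreement measures used here can be sketched in plain Python. This is a minimal illustration under stated assumptions: one annotator is treated as the reference and the other as the prediction, annotations are represented as sets of item identifiers, and ratings are integers on a known scale. The function names and toy data are hypothetical and are not taken from the study.

```python
def precision_recall_f1(reference, predicted):
    """Precision, recall and F-score of one annotator against another.

    Both arguments are collections of annotated item identifiers.
    """
    reference, predicted = set(reference), set(predicted)
    true_positives = len(reference & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def quadratic_weighted_kappa(rater_a, rater_b, min_rating, max_rating):
    """Quadratically weighted kappa for two lists of integer ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = max_rating - min_rating + 1
    if n < 2:
        raise ValueError("the rating scale needs at least two categories")
    # Observed confusion matrix between the two raters.
    observed = [[0] * n for _ in range(n)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_rating][b - min_rating] += 1
    total = len(rater_a)
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n)) for j in range(n)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n):
        for j in range(n):
            weight = (i - j) ** 2 / (n - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count if the two raters were independent.
            weighted_expected += weight * hist_a[i] * hist_b[j] / total
    if weighted_expected == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1 - weighted_observed / weighted_expected


# Toy example (invented): error positions flagged by two annotators.
print(precision_recall_f1({1, 2, 3, 4}, {2, 3, 4, 5}))
print(quadratic_weighted_kappa([1, 2, 3, 4], [1, 2, 3, 4], 1, 4))
```

Quadratic weights penalise large disagreements more heavily than adjacent-category disagreements, which is why QWK is commonly preferred over unweighted kappa for ordinal scores.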

Agreement of annotation outcomes between human and LLM

To evaluate the consistency between human annotators and LLM annotators (BERT, GPT, OCLL) across the indices, the same test was conducted. The results of the inter-annotator agreement (F-score) between LLM and human annotation are provided in Appendices B–D. The F-scores ranged from 0.706 for the number of grammatical errors (OCLL-human) to a perfect 1.000 for sentences, clauses, T-units, and words (GPT-human). These findings were further supported by the QWK analysis, which showed agreement levels ranging from 0.807 (p = 0.001) for metadiscourse markers (OCLL-human) to 0.962 (p = 0.000) for words (GPT-human). The findings demonstrated that the LLM annotation achieved a high level of accuracy in identifying measurement units and counts.

Reliability of LLM-driven AES’s scoring and discriminating proficiency levels

This section examines the reliability of the LLM-driven AES scoring through a comparison of the scoring outcomes produced by human raters and the LLM (Reliability of LLM-driven AES scoring). It also assesses the effectiveness of the LLM-based AES system in differentiating participants with varying proficiency levels (Reliability of LLM-driven AES discriminating proficiency levels).

Reliability of LLM-driven AES scoring

Table 8 summarizes the QWK coefficient analysis between the scores computed by the human raters and GPT-4 for the individual essays from I-JAS (see footnote 7). As shown, the QWK of all measures ranged from k = 0.819 for lexical density (number of lexical words (tokens)/number of words per essay) to k = 0.644 for word2vec cosine similarity. Table 9 further presents the Pearson correlations between the 16 writing proficiency measures scored by human raters and GPT-4 for the individual essays. The correlations ranged from 0.672 for syntactic complexity to 0.734 for grammatical accuracy. The correlations between the writing proficiency scores assigned by human raters and the BERT-based AES system ranged from 0.661 for syntactic complexity to 0.713 for grammatical accuracy, and those between human raters and the OCLL-based AES system ranged from 0.654 for cohesion to 0.721 for grammatical accuracy. These findings indicated an alignment between the assessments made by human raters and both the BERT-based and OCLL-based AES systems in terms of various aspects of writing proficiency.
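The Pearson correlation reported here can be computed as in the following sketch. The function name and the toy scores are assumptions for illustration; they are not the study's data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length with at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return covariance / (spread_x * spread_y)


# Toy scores (invented): human ratings versus system ratings for five essays.
human_scores = [2, 3, 3, 4, 5]
system_scores = [2, 2, 3, 4, 4]
print(pearson_r(human_scores, system_scores))
```

Unlike QWK, the Pearson coefficient measures linear association only, so a system that is consistently one point too generous can still correlate perfectly with human raters.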

Reliability of LLM-driven AES discriminating proficiency levels

After validating the reliability of the LLM’s annotation and scoring, the subsequent objective was to evaluate its ability to distinguish between various proficiency levels. For this analysis, a dataset of 686 individual essays was utilized. Table 10 presents a sample of the results, summarizing the means, standard deviations, and the outcomes of the one-way ANOVAs based on the measures assessed by the GPT-4 model. A post hoc multiple comparison test, specifically the Bonferroni test, was conducted to identify any potential differences between pairs of levels.
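A one-way ANOVA followed by a Bonferroni-corrected comparison can be sketched as below. This is an illustrative sketch only: it computes the F statistic and the corrected significance threshold in plain Python, and the group values are invented. Obtaining a p-value additionally requires the F distribution, which a statistics library such as SciPy (`scipy.stats.f_oneway`) provides.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("F is undefined when there is no within-group variation")
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance threshold under the Bonferroni correction."""
    return alpha / n_comparisons


# Toy values (invented) for one measure at three proficiency levels.
primary = [1, 2, 3]
intermediate = [2, 3, 4]
advanced = [3, 4, 5]
print(one_way_anova_f([primary, intermediate, advanced]))
# Three levels give three pairwise comparisons.
print(bonferroni_alpha(0.05, 3))
```

With three proficiency levels there are three pairwise comparisons, so each pair is tested against one third of the overall significance level.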

As the results reveal, seven measures presented linear upward or downward progress across the three proficiency levels. These are marked in bold in Table 10 and comprise one measure of lexical richness, i.e. MATTR (lexical diversity); four measures of syntactic complexity, i.e. MDD (mean dependency distance), MLC (mean length of clause), CNT (complex nominals per T-unit), and CPC (coordinate phrases rate); one cohesion measure, i.e. word2vec cosine similarity; and one grammatical accuracy measure, i.e. GER (grammatical error rate). Regarding the ability of the sixteen measures to distinguish adjacent proficiency levels, the Bonferroni tests indicated that statistically significant differences exist between the primary level and the intermediate level for MLC and GER. One measure of lexical richness, namely LD, along with three measures of syntactic complexity (VPT, CT, DCT, ACC), two measures of cohesion (SOPT, SOPK), and one measure of content elaboration (IMM), exhibited statistically significant differences between proficiency levels. However, these differences did not demonstrate a linear progression between adjacent proficiency levels. No significant difference was observed in lexical sophistication between proficiency levels.

To summarize, our study aimed to evaluate the reliability and differentiation capabilities of the LLM-driven AES method. For the first objective, we assessed the LLM’s ability to differentiate between test takers with varying levels of oral proficiency using precision, recall, F-Score, and quadratically-weighted kappa. Regarding the second objective, we compared the scoring outcomes generated by human raters and the LLM to determine the level of agreement. We employed quadratically-weighted kappa and Pearson correlations to compare the 16 writing proficiency measures for the individual essays. The results confirmed the feasibility of using the LLM for annotation and scoring in AES for nonnative Japanese. As a result, Research Question 1 has been addressed.

Comparison of BERT-, GPT-, OCLL-based AES, and linguistic-feature-based computation methods

This section compares the effectiveness of five AES methods for nonnative Japanese writing, i.e. LLM-driven approaches utilizing BERT, GPT, and OCLL, and linguistic feature-based approaches using Jess and JWriter. The comparison was conducted by comparing the ratings obtained from each approach with human ratings. All ratings were derived from the dataset introduced in the Dataset section. To facilitate the comparison, the agreement between the automated methods and human ratings was assessed using QWK and PRMSE. The performance of each approach is summarized in Table 11.

The QWK coefficient values indicate that LLMs (GPT, BERT, OCLL) and human rating outcomes demonstrated higher agreement compared to feature-based AES methods (Jess and JWriter) in assessing writing proficiency criteria, including lexical richness, syntactic complexity, content, and grammatical accuracy. Among the LLMs, the GPT-4 driven AES and human rating outcomes showed the highest agreement in all criteria, except for syntactic complexity. The PRMSE values suggest that the GPT-based method outperformed linguistic feature-based methods and other LLM-based approaches. Moreover, an interesting finding emerged during the study: the agreement coefficient between GPT-4 and human scoring was even higher than the agreement between different human raters themselves. This discovery highlights the advantage of GPT-based AES over human rating. Ratings involve a series of processes, including reading the learners’ writing, evaluating the content and language, and assigning scores. Within this chain of processes, various biases can be introduced, stemming from factors such as rater biases, test design, and rating scales. These biases can impact the consistency and objectivity of human ratings. GPT-based AES may benefit from its ability to apply consistent and objective evaluation criteria. By prompting the GPT model with detailed writing scoring rubrics and linguistic features, potential biases in human ratings can be mitigated. The model follows a predefined set of guidelines and does not possess the same subjective biases that human raters may exhibit. This standardization in the evaluation process contributes to the higher agreement observed between GPT-4 and human scoring. Section Prompt strategy of the study delves further into the role of prompts in the application of LLMs to AES. It explores how the choice and implementation of prompts can impact the performance and reliability of LLM-based AES methods. 
Furthermore, it is important to acknowledge the strengths of the local model, i.e. the Japanese local model OCLL, which excels in processing certain idiomatic expressions. Nevertheless, our analysis indicated that GPT-4 surpasses local models in AES. This superior performance can be attributed to the larger parameter size of GPT-4, estimated to be between 500 billion and 1 trillion, which exceeds the sizes of both BERT and the local model OCLL.

Prompt strategy

In the context of prompt strategy, Mizumoto and Eguchi ( 2023 ) conducted a study where they applied the GPT-3 model to automatically score English essays in the TOEFL test. They found that the accuracy of the GPT model alone was moderate to fair. However, when they incorporated linguistic measures such as cohesion, syntactic complexity, and lexical features alongside the GPT model, the accuracy significantly improved. This highlights the importance of prompt engineering and providing the model with specific instructions to enhance its performance. In this study, a similar approach was taken to optimize the performance of LLMs. GPT-4, which outperformed BERT and OCLL, was selected as the candidate model. Model 1 was used as the baseline, representing GPT-4 without any additional prompting. Model 2, on the other hand, involved GPT-4 prompted with 16 measures that included scoring criteria, efficient linguistic features for writing assessment, and detailed measurement units and calculation formulas. The remaining models (Models 3 to 18) utilized GPT-4 prompted with individual measures. The performance of these 18 different models was assessed using the output indicators described in Section Criteria (output indicator) . By comparing the performances of these models, the study aimed to understand the impact of prompt engineering on the accuracy and effectiveness of GPT-4 in AES tasks.
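The 18-model design described above (one baseline, one prompt with all measures, and one prompt per individual measure) can be sketched as follows. This is a hypothetical illustration: the function name, the prompt wording, and the placeholder measure names are all assumptions and do not reproduce the prompts used in the study.

```python
def build_prompt_models(base_instruction, measures):
    """Return a dict mapping model names to prompt texts.

    Model 1 is the baseline, Model 2 lists every measure, and Models 3
    onwards each list a single measure.
    """
    models = {"Model 1": base_instruction}
    all_measures = "\n".join(f"- {m}" for m in measures)
    models["Model 2"] = f"{base_instruction}\nScore using all of these measures:\n{all_measures}"
    for number, measure in enumerate(measures, start=3):
        models[f"Model {number}"] = f"{base_instruction}\nScore using this measure:\n- {measure}"
    return models


# Placeholder names stand in for the 16 measures (hypothetical labels).
placeholder_measures = [f"measure_{i}" for i in range(1, 17)]
prompts = build_prompt_models("Score the following essay.", placeholder_measures)
print(len(prompts))
print(prompts["Model 3"])
```

Keeping the base instruction identical across all models means that any difference in scoring agreement can be attributed to the measures included in the prompt.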

Based on the PRMSE scores presented in Fig. 4 , it was observed that Model 1, representing GPT-4 without any additional prompting, achieved a fair level of performance. However, Model 2, which utilized GPT-4 prompted with all measures, outperformed all other models in terms of PRMSE score, achieving a score of 0.681. These results indicate that the inclusion of specific measures and prompts significantly enhanced the performance of GPT-4 in AES. Among the measures, syntactic complexity was found to play a particularly significant role in improving the accuracy of GPT-4 in assessing writing quality. Following that, lexical diversity emerged as another important factor contributing to the model’s effectiveness. The study suggests that a well-prompted GPT-4 can serve as a valuable tool to support human assessors in evaluating writing quality. By utilizing GPT-4 as an automated scoring tool, the evaluation biases associated with human raters can be minimized. This has the potential to empower teachers by allowing them to focus on designing writing tasks and guiding writing strategies, while leveraging the capabilities of GPT-4 for efficient and reliable scoring.

Fig. 4 PRMSE scores of the 18 AES models.

This study aimed to investigate two main research questions: the feasibility of utilizing LLMs for AES and the impact of prompt engineering on the application of LLMs in AES.

To address the first objective, the study compared the effectiveness of five different models: GPT, BERT, the Japanese local LLM (OCLL), and two conventional machine learning-based AES tools (Jess and JWriter). The PRMSE values indicated that the GPT-4-based method outperformed other LLMs (BERT, OCLL) and linguistic feature-based computational methods (Jess and JWriter) across various writing proficiency criteria. Furthermore, the agreement coefficient between GPT-4 and human scoring surpassed the agreement among human raters themselves, highlighting the potential of using the GPT-4 tool to enhance AES by reducing biases and subjectivity, saving time, labor, and cost, and providing valuable feedback for self-study. Regarding the second goal, the role of prompt design was investigated by comparing 18 models, including a baseline model, a model prompted with all measures, and 16 models prompted with one measure at a time. GPT-4, which outperformed BERT and OCLL, was selected as the candidate model. The PRMSE scores of the models showed that GPT-4 prompted with all measures achieved the best performance, surpassing the baseline and other models.

In conclusion, this study has demonstrated the potential of LLMs in supporting human rating in assessments. By incorporating automation, we can save time and resources while reducing biases and subjectivity inherent in human rating processes. Automated language assessments offer the advantage of accessibility, providing equal opportunities and economic feasibility for individuals who lack access to traditional assessment centers or necessary resources. LLM-based language assessments provide valuable feedback and support to learners, aiding in the enhancement of their language proficiency and the achievement of their goals. This personalized feedback can cater to individual learner needs, facilitating a more tailored and effective language-learning experience.

There are three important areas that merit further exploration. First, prompt engineering requires attention to ensure optimal performance of LLM-based AES across different language types. This study revealed that GPT-4, when prompted with all measures, outperformed models prompted with fewer measures. Therefore, investigating and refining prompt strategies can enhance the effectiveness of LLMs in automated language assessments. Second, it is crucial to explore the application of LLMs in second-language assessment and learning for oral proficiency, as well as their potential in under-resourced languages. Recent advancements in self-supervised machine learning techniques have significantly improved automatic speech recognition (ASR) systems, opening up new possibilities for creating reliable ASR systems, particularly for under-resourced languages with limited data. However, several challenges persist in the field of ASR. One is that ASR assumes correct word pronunciation for automatic pronunciation evaluation, which proves challenging for learners in the early stages of language acquisition because of diverse accents influenced by their native languages; accurately segmenting short words becomes problematic in such cases. Another is that developing precise audio-text transcriptions for non-native accented speech is a formidable task. Finally, assessing oral proficiency levels involves capturing various linguistic features, including fluency, pronunciation, accuracy, and complexity, which are not easily captured by current NLP technology.

Data availability

The dataset utilized was obtained from the International Corpus of Japanese as a Second Language (I-JAS), available at https://www2.ninjal.ac.jp/jll/lsaj/ihome2.html.

J-CAT and TTBJ are two computerized adaptive tests used to assess Japanese language proficiency.

SPOT is a specific component of the TTBJ test.

J-CAT: https://www.j-cat2.org/html/ja/pages/interpret.html

SPOT: https://ttbj.cegloc.tsukuba.ac.jp/p1.html#SPOT .

The study utilized a prompt-based GPT-4 model, developed by OpenAI, which has an impressive architecture with 1.8 trillion parameters across 120 layers. GPT-4 was trained on a vast dataset of 13 trillion tokens, using two stages: initial training on internet text datasets to predict the next token, and subsequent fine-tuning through reinforcement learning from human feedback.

https://www2.ninjal.ac.jp/jll/lsaj/ihome2-en.html .

http://jhlee.sakura.ne.jp/JEV/ by Japanese Learning Dictionary Support Group 2015.

We express our sincere gratitude to the reviewer for bringing this matter to our attention.

On February 7, 2023, Microsoft began rolling out a major overhaul to Bing that included a new chatbot feature based on OpenAI’s GPT-4 (Bing.com).

Appendix E-F present the analysis results of the QWK coefficient between the scores computed by the human raters and the BERT, OCLL models.

Attali Y, Burstein J (2006) Automated essay scoring with e-rater® V.2. J. Technol., Learn. Assess., 4

Barkaoui K, Hadidi A (2020) Assessing Change in English Second Language Writing Performance (1st ed.). Routledge, New York. https://doi.org/10.4324/9781003092346

Bentz C, Tatyana R, Koplenig A, Tanja S (2016) A comparison between morphological complexity measures: Typological data vs. language corpora. In Proceedings of the workshop on computational linguistics for linguistic complexity (CL4LC), 142–153. Osaka, Japan: The COLING 2016 Organizing Committee

Bond TG, Yan Z, Heene M (2021) Applying the Rasch model: Fundamental measurement in the human sciences (4th ed). Routledge

Brants T (2000) Inter-annotator agreement for a German newspaper corpus. Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00), Athens, Greece, 31 May-2 June, European Language Resources Association

Brown TB, Mann B, Ryder N, et al. (2020) Language models are few-shot learners. Advances in Neural Information Processing Systems, Online, 6–12 December, Curran Associates, Inc., Red Hook, NY

Burstein J (2003) The E-rater scoring engine: Automated essay scoring with natural language processing. In Shermis MD and Burstein JC (ed) Automated Essay Scoring: A Cross-Disciplinary Perspective. Lawrence Erlbaum Associates, Mahwah, NJ

Čech R, Miroslav K (2018) Morphological richness of text. In Masako F, Václav C (ed) Taming the corpus: From inflection and lexis to interpretation, 63–77. Cham, Switzerland: Springer Nature

Çöltekin Ç, Taraka R (2018) Exploiting Universal Dependencies treebanks for measuring morphosyntactic complexity. In Aleksandrs B, Christian B (ed), Proceedings of first workshop on measuring language complexity, 1–7. Torun, Poland

Crossley SA, Cobb T, McNamara DS (2013) Comparing count-based and band-based indices of word frequency: Implications for active vocabulary research and pedagogical applications. System 41:965–981. https://doi.org/10.1016/j.system.2013.08.002

Crossley SA, McNamara DS (2016) Say more and be more coherent: How text elaboration and cohesion can increase writing quality. J. Writ. Res. 7:351–370

CyberAgent Inc (2023) Open-Calm series of Japanese language models. Retrieved from: https://www.cyberagent.co.jp/news/detail/id=28817

Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics, Minneapolis, Minnesota, 2–7 June, pp. 4171–4186. Association for Computational Linguistics

Diez-Ortega M, Kyle K (2023) Measuring the development of lexical richness of L2 Spanish: a longitudinal learner corpus study. Studies in Second Language Acquisition 1-31

Eckes T (2009) On common ground? How raters perceive scoring criteria in oral proficiency testing. In Brown A, Hill K (ed) Language testing and evaluation 13: Tasks and criteria in performance assessment (pp. 43–73). Peter Lang Publishing

Elliot S (2003) IntelliMetric: from here to validity. In: Shermis MD, Burstein JC (ed) Automated Essay Scoring: A Cross-Disciplinary Perspective. Lawrence Erlbaum Associates, Mahwah, NJ

Engber CA (1995) The relationship of lexical proficiency to the quality of ESL compositions. J. Second Lang. Writ. 4:139–155

Garner J, Crossley SA, Kyle K (2019) N-gram measures and L2 writing proficiency. System 80:176–187. https://doi.org/10.1016/j.system.2018.12.001

Haberman SJ (2008) When can subscores have value? J. Educat. Behav. Stat., 33:204–229

Haberman SJ, Yao L, Sinharay S (2015) Prediction of true test scores from observed item scores and ancillary data. Brit. J. Math. Stat. Psychol. 68:363–385

Halliday MAK (1985) Spoken and Written Language. Deakin University Press, Melbourne, Australia

Hirao R, Arai M, Shimanaka H et al. (2020) Automated essay scoring system for nonnative Japanese learners. Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), pp. 1250–1257. European Language Resources Association

Hunt KW (1966) Recent Measures in Syntactic Development. Elementary English, 43(7), 732–739. http://www.jstor.org/stable/41386067

Ishioka T (2001) About e-rater, a computer-based automatic scoring system for essays [Konpyūta ni yoru essei no jidō saiten shisutemu e − rater ni tsuite]. University Entrance Examination. Forum [Daigaku nyūshi fōramu] 24:71–76

Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput. 9(8):1735–1780

Ishioka T, Kameda M (2006) Automated Japanese essay scoring system based on articles written by experts. Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, Australia, 17–18 July 2006, pp. 233-240. Association for Computational Linguistics, USA

Japan Foundation (2021) Retrieved from: https://www.jpf.gp.jp/j/project/japanese/survey/result/dl/survey2021/all.pdf

Jarvis S (2013a) Defining and measuring lexical diversity. In Jarvis S, Daller M (ed) Vocabulary knowledge: Human ratings and automated measures (Vol. 47, pp. 13–44). John Benjamins. https://doi.org/10.1075/sibil.47.03ch1

Jarvis S (2013b) Capturing the diversity in lexical diversity. Lang. Learn. 63:87–106. https://doi.org/10.1111/j.1467-9922.2012.00739.x

Jiang J, Ouyang J, Liu H (2019) Interlanguage: A perspective of quantitative linguistic typology. Lang. Sci. 74:85–97

Kim M, Crossley SA, Kyle K (2018) Lexical sophistication as a multidimensional phenomenon: Relations to second language lexical proficiency, development, and writing quality. Mod. Lang. J. 102(1):120–141. https://doi.org/10.1111/modl.12447

Kojima T, Gu S, Reid M et al. (2022) Large language models are zero-shot reasoners. Advances in Neural Information Processing Systems, New Orleans, LA, 29 November-1 December, Curran Associates, Inc., Red Hook, NY

Kyle K, Crossley SA (2015) Automatically assessing lexical sophistication: Indices, tools, findings, and application. TESOL Q 49:757–786

Kyle K, Crossley SA, Berger CM (2018) The tool for the automatic analysis of lexical sophistication (TAALES): Version 2.0. Behav. Res. Methods 50:1030–1046. https://doi.org/10.3758/s13428-017-0924-4

Kyle K, Crossley SA, Jarvis S (2021) Assessing the validity of lexical diversity using direct judgements. Lang. Assess. Q. 18:154–170. https://doi.org/10.1080/15434303.2020.1844205

Landauer TK, Laham D, Foltz PW (2003) Automated essay scoring and annotation of essays with the Intelligent Essay Assessor. In Shermis MD, Burstein JC (ed), Automated Essay Scoring: A Cross-Disciplinary Perspective. Lawrence Erlbaum Associates, Mahwah, NJ

Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 159–174

Laufer B, Nation P (1995) Vocabulary size and use: Lexical richness in L2 written production. Appl. Linguist. 16:307–322. https://doi.org/10.1093/applin/16.3.307

Lee J, Hasebe Y (2017) jWriter Learner Text Evaluator, URL: https://jreadability.net/jwriter/

Lee J, Kobayashi N, Sakai T, Sakota K (2015) A Comparison of SPOT and J-CAT Based on Test Analysis [Tesuto bunseki ni motozuku ‘SPOT’ to ‘J-CAT’ no hikaku]. Research on the Acquisition of Second Language Japanese [Dainigengo to shite no nihongo no shūtoku kenkyū] (18) 53–69

Li W, Yan J (2021) Probability distribution of dependency distance based on a Treebank of Japanese EFL Learners’ Interlanguage. J. Quant. Linguist. 28(2):172–186. https://doi.org/10.1080/09296174.2020.1754611

Linacre JM (2002) Optimizing rating scale category effectiveness. J. Appl. Meas. 3(1):85–106

Linacre JM (1994) Constructing measurement with a Many-Facet Rasch Model. In Wilson M (ed) Objective measurement: Theory into practice, Volume 2 (pp. 129–144). Norwood, NJ: Ablex

Liu H (2008) Dependency distance as a metric of language comprehension difficulty. J. Cognitive Sci. 9:159–191

Liu H, Xu C, Liang J (2017) Dependency distance: A new perspective on syntactic patterns in natural languages. Phys. Life Rev. 21. https://doi.org/10.1016/j.plrev.2017.03.002

Loukina A, Madnani N, Cahill A, et al. (2020) Using PRMSE to evaluate automated scoring systems in the presence of label noise. Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications, Seattle, WA, USA → Online, 10 July, pp. 18–29. Association for Computational Linguistics

Lu X (2010) Automatic analysis of syntactic complexity in second language writing. Int. J. Corpus Linguist. 15:474–496

Lu X (2012) The relationship of lexical richness to the quality of ESL learners’ oral narratives. Mod. Lang. J. 96:190–208

Lu X (2017) Automated measurement of syntactic complexity in corpus-based L2 writing research and implications for writing assessment. Lang. Test. 34:493–511

Lu X, Hu R (2022) Sense-aware lexical sophistication indices and their relationship to second language writing quality. Behav. Res. Method. 54:1444–1460. https://doi.org/10.3758/s13428-021-01675-6

Ministry of Health, Labor, and Welfare of Japan (2022) Retrieved from: https://www.mhlw.go.jp/stf/newpage_30367.html

Mizumoto A, Eguchi M (2023) Exploring the potential of using an AI language model for automated essay scoring. Res. Methods Appl. Linguist. 3:100050

Okgetheng B, Takeuchi K (2024) Estimating Japanese Essay Grading Scores with Large Language Models. Proceedings of 30th Annual Conference of the Language Processing Society in Japan, March 2024

Ortega L (2015) Second language learning explained? SLA across 10 contemporary theories. In VanPatten B, Williams J (ed) Theories in Second Language Acquisition: An Introduction

Rae JW, Borgeaud S, Cai T, et al. (2021) Scaling Language Models: Methods, Analysis & Insights from Training Gopher. ArXiv, abs/2112.11446

Read J (2000) Assessing vocabulary. Cambridge University Press. https://doi.org/10.1017/CBO9780511732942

Rudner LM, Liang T (2002) Automated Essay Scoring Using Bayes’ Theorem. J. Technol., Learning and Assessment, 1 (2)

Sakoda K, Hosoi Y (2020) Accuracy and complexity of Japanese Language usage by SLA learners in different learning environments based on the analysis of I-JAS, a learners’ corpus of Japanese as L2. Math. Linguist. 32(7):403–418. https://doi.org/10.24701/mathling.32.7_403

Suzuki N (1999) Summary of survey results regarding comprehensive essay questions. Final report of “Joint Research on Comprehensive Examinations for the Aim of Evaluating Applicability to Each Specialized Field of Universities” for 1996-2000 [shōronbun sōgō mondai ni kansuru chōsa kekka no gaiyō. Heisei 8 - Heisei 12-nendo daigaku no kaku senmon bun’ya e no tekisei no hyōka o mokuteki to suru sōgō shiken no arikata ni kansuru kyōdō kenkyū’ saishū hōkoku-sho]. University Entrance Examination Section Center Research and Development Department [Daigaku nyūshi sentā kenkyū kaihatsubu], 21–32

Taghipour K, Ng HT (2016) A neural approach to automated essay scoring. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, Texas, 1–5 November, pp. 1882–1891. Association for Computational Linguistics

Takeuchi K, Ohno M, Motojin K, Taguchi M, Inada Y, Iizuka M, Abo T, Ueda H (2021) Development of essay scoring methods based on reference texts with construction of research-available Japanese essay data. In IPSJ J 62(9):1586–1604

Ure J (1971) Lexical density: A computational technique and some findings. In Coulthard M (ed) Talking about Text. English Language Research, University of Birmingham, Birmingham, England

Vaswani A, Shazeer N, Parmar N, et al. (2017) Attention is all you need. In Advances in Neural Information Processing Systems, Long Beach, CA, 4–7 December, pp. 5998–6008, Curran Associates, Inc., Red Hook, NY

Watanabe H, Taira Y, Inoue Y (1988) Analysis of essay evaluation data [Shōronbun hyōka dēta no kaiseki]. Bulletin of the Faculty of Education, University of Tokyo [Tōkyōdaigaku kyōiku gakubu kiyō], Vol. 28, 143–164

Yao S, Yu D, Zhao J, et al. (2023) Tree of thoughts: Deliberate problem solving with large language models. Advances in Neural Information Processing Systems, 36

Zenker F, Kyle K (2021) Investigating minimum text lengths for lexical diversity indices. Assess. Writ. 47:100505. https://doi.org/10.1016/j.asw.2020.100505

Zhang Y, Warstadt A, Li X, et al. (2021) When do you need billions of words of pretraining data? Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Online, pp. 1112-1125. Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.acl-long.90

This research was funded by National Foundation of Social Sciences (22BYY186) to Wenchao Li.

Author information

Authors and affiliations.

Department of Japanese Studies, Zhejiang University, Hangzhou, China

Department of Linguistics and Applied Linguistics, Zhejiang University, Hangzhou, China

Contributions

Wenchao Li is in charge of conceptualization, validation, formal analysis, investigation, data curation, visualization and writing the draft. Haitao Liu is in charge of supervision.

Corresponding author

Correspondence to Wenchao Li .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Ethical approval

Ethical approval was not required as the study did not involve human participants.

Informed consent

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplemental material file #1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Li, W., Liu, H. Applying large language models for automated essay scoring for non-native Japanese. Humanit Soc Sci Commun 11 , 723 (2024). https://doi.org/10.1057/s41599-024-03209-9

Received: 02 February 2024

Accepted: 16 May 2024

Published: 03 June 2024

DOI: https://doi.org/10.1057/s41599-024-03209-9

how to cite while writing an essay

COMMENTS

  1. In-Text Citations: The Basics

    APA Citation Basics. When using APA format, follow the author-date method of in-text citation. This means that the author's last name and the year of publication for the source should appear in the text, for example, (Jones, 1998). One complete reference for each source should appear in the reference list at the end of the paper.

  2. 4 Ways to Cite an Essay

    3. Include the title of the essay. Type the title of the essay in sentence case, capitalizing only the first word and any proper nouns in the title. If the essay has a subtitle, type a colon at the end of the title and then type the subtitle, also in sentence case. Place a period at the end.

  3. The Basics of In-Text Citation

    The point of an in-text citation is to show your reader where your information comes from. Including citations avoids plagiarism by acknowledging the original author's contribution, allows readers to verify your claims and do follow-up research, and shows you are engaging with the literature of your field.

  4. How to Cite an Essay in MLA

    The guidelines for citing an essay in MLA format are similar to those for citing a chapter in a book. Include the author of the essay, the title of the essay, the name of the collection if the essay belongs to one, the editor of the collection or other contributors, the publication information, and the page number(s).

  5. How to Write an Academic Essay with References and Citations

    When learning how to write an academic essay with references, you must identify reliable sources that support your argument. As you read, think critically and evaluate sources for accuracy, objectivity, currency, and authority. Keep detailed notes on the sources so that you can easily find them again, if needed.

  6. How to Cite Sources

    To quote a source, copy a short piece of text word for word and put it inside quotation marks. To paraphrase a source, put the text into your own words. It's important that the paraphrase is not too close to the original wording. You can use the paraphrasing tool if you don't want to do this manually.

  7. Basic principles of citation

    The following are guidelines to follow when writing in-text citations: Ensure that the spelling of author names and the publication dates in reference list entries match those in the corresponding in-text citations. Cite only works that you have read and ideas that you have incorporated into your writing. The works you cite may provide key ...

  8. MLA: Citing Within Your Paper

    An in-text citation can be included in one of two ways: (1) put all the citation information at the end of the sentence, or (2) include the author name as part of the sentence (if the author name is unavailable, include the title of the work). Each source cited in-text must also be listed on your Works Cited page. RefWorks includes a citation builder ...

  9. A Quick Guide to Harvard Referencing

    When you cite a source with up to three authors, cite all authors' names. For four or more authors, list only the first name, followed by 'et al.'. In-text citation examples by number of authors: 1 author: (Davis, 2019); 2 authors: (Davis and Barrett, 2019); 3 authors: ...

  10. In-Text Citation and Notes

    APA: Parenthetical In-Text Citations. To cite a source in the text of an essay, APA advocates two methods: in-text citations and attribution within the essay's content. In-text citations should be included immediately after the quotation marks used in direct quotations or immediately after the use of the source, even if this means including the parenthetical reference in the middle of the ...

  11. How to Cite Sources

    The Chicago/Turabian style of citing sources is generally used when citing sources for humanities papers, and is best known for its requirement that writers place bibliographic citations at the bottom of a page (in Chicago-format footnotes) or at the end of a paper (endnotes). The Turabian and Chicago citation styles are almost identical, but ...

  12. A Quick Guide to Referencing

    In-text citations are quick references to your sources. In Harvard referencing, you use the author's surname and the date of publication in brackets. Up to three authors are included in a Harvard in-text citation. If the source has more than three authors, include the first author followed by 'et al.'.

  13. How to Reference in an Essay (9 Strategies of Top Students)

    Download a pdf version of the referencing style cheat sheet, print it out, and place it on your pinboard or by your side when writing your essay. 2. Only cite Experts. There are good and bad sources to cite in an essay. You should only cite sources written, critiqued and edited by experts.

  14. How to Quote

    Citing a quote in APA Style. To cite a direct quote in APA, you must include the author's last name, the year, and a page number, all separated by commas. If the quote appears on a single page, use "p."; if it spans a page range, use "pp.". An APA in-text citation can be parenthetical or narrative.

  15. Cite While You Write (MLA)

    Format: Use the complete title of the work in the sentence or a shortened version of the title in parentheses, as well as the page number if available. If shortening the title, make sure to use the first word of the corresponding works cited entry so that your reader can find the full citation. In-Text Example: Jac Bayles wrote her MA ...

  16. PDF Strategies for Essay Writing

    Harvard College Writing Center. Asking Analytical Questions. When you write an essay for a course you are taking, you are being asked not only to create a product (the essay) but, more importantly, to go through a process of thinking more deeply about a question or problem related to the course. By writing about a

  17. Quoting and integrating sources into your paper

    Important guidelines. When integrating a source into your paper, remember to use these three important components: Introductory phrase to the source material: mention the author, date, or any other relevant information when introducing a quote or paraphrase. Source material: a direct quote, paraphrase, or summary with proper citation.

  18. Paraphrasing

    6 Steps to Effective Paraphrasing. Reread the original passage until you understand its full meaning. Set the original aside, and write your paraphrase on a note card. Jot down a few words below your paraphrase to remind you later how you envision using this material. At the top of the note card, write a key word or phrase to indicate the ...

  19. How to Cite a Website

    Citing a website in MLA Style. An MLA Works Cited entry for a webpage lists the author's name, the title of the page (in quotation marks), the name of the site (in italics), the date of publication, and the URL. The in-text citation usually just lists the author's name. For a long page, you may specify a (shortened) section heading to ...

  20. The Silent Symphony: Decoding the Implications of Hazelwood v Kuhlmeier

    This essay about the landmark case Hazelwood v Kuhlmeier explores the intricate balance between student press rights and school authority. It highlights how the Supreme Court's ruling shifted the landscape of free speech in educational settings, allowing educators to censor school-sponsored publications under certain circumstances.

  21. The Pros and Cons of Spanking: a Comprehensive Analysis

    The fear and pain associated with spanking can undermine the child's sense of security and trust in their caregivers, potentially leading to long-term emotional harm. Additionally, spanking can model aggressive behavior, teaching children that physical force is an acceptable way to resolve conflicts. Another significant concern is the ...

  22. Welcome to the Purdue Online Writing Lab

    Mission. The Purdue On-Campus Writing Lab and Purdue Online Writing Lab assist clients in their development as writers—no matter what their skill level—with on-campus consultations, online participation, and community engagement. The Purdue Writing Lab serves the Purdue, West Lafayette, campus and coordinates with local literacy initiatives.

  23. How to Cite a Book

    To cite a book chapter, first give the author and title (in quotation marks) of the chapter cited, then information about the book as a whole and the page range of the specific chapter. The in-text citation lists the author of the chapter and the page number of the relevant passage. MLA format. Author last name, First name.

  24. How the U.S.'s European Allies Are Preparing for a Second Trump Term

    In 2017, when Trump took office, only three allies, plus the U.S., were spending at least 2 percent of their GDP on defense. This year, that number is expected to rise to at least 18. Trump's ...

  25. Applying large language models for automated essay scoring for non

    Recent advancements in artificial intelligence (AI) have led to an increased use of large language models (LLMs) for language assessment tasks such as automated essay scoring (AES), automated ...
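
The in-text rules quoted in the snippets above (Harvard: name up to three authors, use 'et al.' for four or more; APA direct quotes: author, year, and "p." or "pp." with the page) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `harvard_in_text` and `apa_quote_citation` are hypothetical, and the way three authors are joined ("A, B and C") and the en dash in page ranges are assumptions, since the snippets do not show those cases.

```python
def harvard_in_text(surnames, year):
    """Harvard-style in-text citation per the rule quoted above:
    up to three authors are all named; four or more become 'First et al.'."""
    if not surnames:
        raise ValueError("at least one author surname is required")
    if len(surnames) >= 4:
        names = f"{surnames[0]} et al."
    elif len(surnames) == 1:
        names = surnames[0]
    else:
        # Assumption: names are joined with commas and a final 'and'.
        names = ", ".join(surnames[:-1]) + " and " + surnames[-1]
    return f"({names}, {year})"


def apa_quote_citation(surname, year, first_page, last_page=None):
    """APA parenthetical citation for a direct quote: author, year, page(s)."""
    if last_page is None or last_page == first_page:
        pages = f"p. {first_page}"  # quote sits on a single page
    else:
        # Assumption: an en dash separates the first and last page.
        pages = f"pp. {first_page}\u2013{last_page}"
    return f"({surname}, {year}, {pages})"


print(harvard_in_text(["Davis"], 2019))             # (Davis, 2019)
print(harvard_in_text(["Davis", "Barrett"], 2019))  # (Davis and Barrett, 2019)
print(apa_quote_citation("Jones", 1998, 12))        # (Jones, 1998, p. 12)
```

A citation manager or the relevant style guide remains the authority for edge cases such as missing authors or dates, which this sketch does not handle.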