Source inclusion in synthesis writing: an NLP approach to understanding argumentation, sourcing, and essay quality

Scott Crossley, Qian Wan, Laura Allen, Danielle McNamara

Research output: Contribution to journal › Article › peer-review

6 Scopus citations


Synthesis writing is widely taught across domains and serves as an important means of assessing writing ability, text comprehension, and content learning. Synthesis writing differs from other types of writing in both cognitive and task demands because it requires writers to integrate information across source materials. However, little is known about how the integration of source material influences overall writing quality in synthesis tasks. This study examined approximately 900 source-based essays written in response to four different synthesis prompts, which instructed writers to use information from the sources to illustrate and support their arguments and to clearly indicate which sources they were drawing from (i.e., citation use). The essays were scored by expert raters for holistic quality, argumentation, and source use/inferencing. Hand-crafted natural language processing (NLP) features and pre-existing NLP tools were used to examine semantic and keyword overlap between the essays and the source texts, plagiarism from the source texts, and instances of source citation and quoting. These variables, along with text length and prompt, were then used to predict essay scores. The resulting models were strong predictors of human ratings, explaining between 47% and 52% of the variance in scores. The results indicate that text length was the strongest predictor of score, but also that more successful writers include stronger, semantically related information from the source, provide more citations and do so later in the text, and copy less from the source texts. This work introduces the use of NLP techniques to assess source integration, provides details on the types of source integration used by writers, and highlights the effects of source integration on writing quality.
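To make the kinds of source-integration variables named in the abstract more concrete, the following is a minimal Python sketch of three simple proxies: keyword overlap with a source, verbatim n-gram copying, and the count and relative position of citations. It is not the study's feature set or tooling, which the abstract does not specify. The function names, the small stopword list, the trigram window, and the `(Source A)` citation format are all assumptions made for illustration, and semantic overlap (which would need a semantic similarity model) is not covered.

```python
import re

# Assumption: a tiny illustrative stopword list, not the one used in the study.
STOPWORDS = frozenset({"the", "a", "an", "of", "and", "to", "in", "is", "that", "it"})

# Assumption: citations look like "(Source A)"; real prompts may use another format.
CITATION = re.compile(r"\(source [a-z]\)", re.IGNORECASE)


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def keyword_overlap(essay, source):
    """Share of the essay's non-stopword word types that also occur in the source."""
    essay_types = set(tokenize(essay)) - STOPWORDS
    source_types = set(tokenize(source)) - STOPWORDS
    if not essay_types:
        return 0.0
    return len(essay_types & source_types) / len(essay_types)


def copied_ngram_ratio(essay, source, n=3):
    """Share of the essay's word n-grams found verbatim in the source (a copying proxy)."""
    def ngrams(tokens):
        return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

    essay_grams = ngrams(tokenize(essay))
    source_grams = set(ngrams(tokenize(source)))
    if not essay_grams:
        return 0.0
    return sum(g in source_grams for g in essay_grams) / len(essay_grams)


def citation_features(essay):
    """Count citations and report their mean relative position (0 = start, 1 = end)."""
    positions = [m.start() / len(essay) for m in CITATION.finditer(essay)]
    count = len(positions)
    mean_position = sum(positions) / count if count else None
    return {"count": count, "mean_position": mean_position}


if __name__ == "__main__":
    source = "Solar power reduces emissions"
    essay = "Solar power reduces costs (Source A)"
    print(keyword_overlap(essay, source))
    print(copied_ngram_ratio(essay, source))
    print(citation_features(essay))
```

In a pipeline of the kind the abstract describes, values like these would be computed per essay and entered, together with text length and prompt, as predictors in a model of the human ratings.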

Original language: English (US)
Pages (from-to): 1053-1083
Number of pages: 31
Journal: Reading and Writing
Issue number: 4
State: Published - Apr 2023
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2021, The Author(s), under exclusive licence to Springer Nature B.V.


  • Corpus linguistics
  • Natural language processing
  • Synthesis writing

