Open Access | Published: 16 April 2024

Examining pre-service teachers’ feedback on low- and high-quality written assignments

Authors: Ignacio Máñez, Anastasiya A. Lipnevich, Carolina Lopera-Oquendo, Raquel Cerdán

Published in: Educational Assessment, Evaluation and Accountability


Abstract

Assessing student writing assignments and providing effective feedback constitute a complex pedagogical skill that teacher candidates need to master. Little research has closely examined the type of feedback that pre-service high school teachers spontaneously deliver when assessing student writing, which is the main goal of our study. In a sample of 255 high school teacher candidates, we examined the type of feedback they provided when assessing two writing assignments that differed in quality. One thousand eight hundred thirty-five comments were analyzed and coded into 11 sub-categories. Results showed that candidates’ feedback focused not only on task performance but also on the writing process. Although candidates frequently provided critical and past-oriented evaluations, they also crafted feedback in a neutral tone and included future-oriented suggestions. Further, feedback varied as a function of candidates’ gender, academic discipline, and the quality of students’ writing. Teacher training programs may use this information to design resources that address the nuances of feedback provision.
Notes

Supplementary Information

The online version contains supplementary material available at https://doi.org/10.1007/s11092-024-09432-x.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Among the many skills expected of pre-service teachers (or teacher candidates), one crucial aspect is their capacity to effectively evaluate and provide high-quality written feedback on students’ written assignments. This ability plays a vital role in enhancing writing outcomes and fostering the development of students’ writing skills across academic subjects (e.g., Dempsey et al., 2009; Duijnhouwer et al., 2010, 2012; Graham, 2018; Graham et al., 2020; Parr & Timperley, 2010). Proficiency in writing is an indispensable skill that holds significance across a multitude of domains. Its importance resonates throughout various academic, professional, and creative pursuits and transcends academic subjects. In their recent meta-analysis, Graham et al. (2020) synthesized effect sizes from 56 studies that examined the impact of content-related writing on student learning in science, social studies, and mathematics across Grades 1 to 12. The findings indicated that writing about content significantly enhanced learning (effect size = 0.30) in a consistent manner, irrespective of subject, grade level, or the specific features of writing activities, instruction, or assessment. One of the key predictors of student writing development is the feedback provided by teachers, and previous research has shown that both experienced in-service teachers and novice teachers tend to struggle when generating effective feedback (Evans, 2013; Junqueira & Payant, 2015; Orrell, 2006; Ropohl & Rönnebeck, 2019; Ryan et al., 2021; Underwood & Tregidgo, 2006). Furthermore, the bulk of the literature has focused on examining how in-service teachers provide written feedback on students’ writing (e.g., Otnes & Solheim, 2019; Parr & Timperley, 2010), generally neglecting the pre-service teacher sub-population (e.g., Dempsey et al., 2009; Fong et al., 2013; Lee, 2014).
Understanding the feedback pre-service teachers naturally provide is crucial for designing effective courses and resources to improve this critical skill in teacher preparation programs. This study is one of the first to provide a fine-grained picture of the type of written feedback that high school teacher candidates deliver when assessing written assignments. Hence, we had the following two goals: (1) we investigated what type of feedback teacher candidates, who are training to be high school teachers, provide, and (2) we examined the extent to which the quality of the student’s essay, teachers’ gender, and academic discipline predicted the type of comments they generate.

1 Characteristics of instructional feedback

Feedback is one of the most powerful instructional interventions teachers can use to help students increase their knowledge and skills (Hattie, 2009). Countless reviews and meta-analyses have suggested that feedback has the potential to enhance students’ performance and learning (e.g., Black & Wiliam, 1998; Evans, 2013; Hattie & Timperley, 2007; Kluger & DeNisi, 1996; Li & DeLuca, 2014; Shute, 2008; Winstone et al., 2017; Wisniewski et al., 2020). However, it is not uncommon for studies to show non-significant, or even negative, effects on students’ outcomes (e.g., van der Kleij et al., 2015). Over the last decades, scholars have proposed a number of theoretical frameworks to explain how feedback works and what factors contribute to its effectiveness (e.g., Lipnevich & Panadero, 2021). After conducting a review of the most prominent feedback models, Panadero and Lipnevich (2022) proposed an integrative framework that covers five components that may explain the large variability of feedback effects on learning. The components are message, implementation, student, context, and agent (MISCA). Of these five groups of factors, in the current study, we focus on message (characteristics of written feedback comments), student (quality of writing), and agent (teacher candidates’ gender and academic discipline), while considering the context (writing). Next, we delve into some of the key parameters related to the content of feedback messages.
To date, numerous researchers have proposed feedback typologies and taxonomies that range from simple dichotomies (e.g., verification vs. elaborated feedback, Kulhavy & Stock, 1989) to more fine-grained categories that cover a range of specific types of feedback (Hattie & Timperley, 2007; Shute, 2008). Describing all typologies and taxonomies is beyond the scope of the current paper, so we direct the reader to Panadero and Lipnevich (2022) and Lipnevich and Panadero (2021) for comprehensive reviews. Instead, we will focus on the categories that we selected for our study. These categories have been systematically acknowledged as crucial in the feedback literature and include the focus, content, emotional valence, and orientation of feedback.
Firstly, research has often explored the focus of teachers’ feedback. In their seminal work, Hattie and Timperley (2007) distinguished four main foci, i.e., task, process, self-regulation, and self. Task-focused feedback is intended to bring students’ attention to the specifics of the task at hand and allows them to know how successful they were in meeting teachers’ criteria. Process-focused feedback, on the other hand, allows students to comprehend what actions they have deployed, or should have deployed, to solve the task. The third focus has to do with students’ self-regulation and refers to how students can monitor their own learning. The final focus of feedback is on the self and refers to personal characteristics of a student. It is important to mention that feedback that focuses on the process usually incorporates self-regulatory information, pointing out actions or strategies to self-evaluate the task performed. Hence, disentangling these two types may be problematic. The extant literature suggests that teachers’ feedback should focus on the strategies and processes of writing (i.e., process-focused feedback) because such comments provide opportunities for transfer and have a higher chance of improving the writing process. Focusing exclusively on students’ task performance may improve the current task, but higher order writing skills will be unlikely to develop (Hawe et al., 2008; Parr & Timperley, 2010). Despite this claim, recent findings show that both in-service and pre-service teachers provide predominantly task-focused feedback when assessing student writing (Arts et al., 2016; Dysthe, 2011; Fong et al., 2013).
Closely related to the previous category, the content specificity of the feedback is arguably one of the most relevant variables to explore. When it comes to written assignments, it is important to be aware of what features of student writing teachers refer to when delivering feedback (Dempsey et al., 2009). In such context, feedback may refer to surface features such as grammar, spelling or punctuation, or deep features such as students’ writing style or the quality of the ideas they express. As a general recommendation, teachers are expected to provide content-level and higher-level stylistic comments (Graham, 2018), but findings suggest that they predominantly focus on surface errors (see, for example, Arts et al., 2016; Hawe et al., 2008).
Thirdly, emotional valence is inherent in any communicative act and hence represents an important characteristic of teacher feedback (Lipnevich & Smith, 2009; Lipnevich et al., 2021b; Dawson et al., 2021; Pitt & Norton, 2017; Ryan & Henderson, 2018). Research has shown that students’ writing is influenced by their emotions (Graham, 2018; Lipnevich et al., 2021a), and feedback is one of the key antecedents of achievement emotions (Pekrun, 2006, 2022). To better understand instructional practices and their impact on learning outcomes, it is crucial to consider the emotional valence of teacher comments (Graham & Harris, 2018; Yu, Geng, et al., 2021; Yu, Zheng, et al., 2021). Recent studies have categorized teacher feedback on student writing as constructive (neutral), positive, or negative (Fong et al., 2018), depending on the affective information transmitted and the emotions it elicits (Goetz et al., 2018; Pekrun, 2006). Positive messages usually include praise comments about a product (Congratulations! I see how well you’ve framed your writing!), an action (You’ve done a great job selecting information from those websites!), or personal characteristics that have played a favorable role in good performance (e.g., You’re very organized!). On the other hand, negative messages usually comprise critical remarks about failures and shortcomings, including evaluative judgments about others’ performance or personal characteristics (I expected more from you!). Besides this positive-negative dichotomy, feedback may also be neutral in tone when it conveys information on how students may improve without including an evaluative or normative component. Such informative feedback comments, which mainly state or explain what actions students should undertake to improve their work, have the potential to improve student writing from a constructive perspective (Fong et al., 2018; Graham, 2018; Hattie et al., 2021).
The final category in our classification is feedback orientation. Broadly speaking, teachers may provide comments aimed at evaluating past actions/outcomes or at suggesting ways to enhance students’ performance, and both types, in combination, may help students learn how to monitor their writing skills (e.g., Duijnhouwer et al., 2012; Pitt & Norton, 2017; Price et al., 2011). While the former lets students know where they are, the latter lets them know where they need to go and offers strategies on how to get there (Parr & Timperley, 2010). Past-oriented evaluative comments, especially those focused on negative aspects of the students’ performance, may inhibit students’ engagement with feedback or, in the best scenario, encourage only students with higher achievement levels (Pitt & Norton, 2017). Beyond evaluative feedback, the literature suggests that effective learner-centered feedback should include information about future-oriented processes and actions that students may undertake so as to improve their learning (Ryan et al., 2021). However, teachers appear to rarely deliver future-oriented comments when assessing written assignments (Arts et al., 2016).
Examining the type of feedback messages that pre-service teachers provide may shed some light on their strengths and weaknesses when delivering feedback on writing and help teacher education programs to better design their curricular activities. In our study, we did exactly that: we investigated the types of feedback that pre-service teachers provide on students’ written assignments.

2 Feedback on writing

Unarguably, writing is critical to success in today’s societies (Freedman et al., 2016) and teachers from different disciplines rely on assigning specific types of writing to support learning (Bazerman, 2009; Smagorinsky & Mayer, 2014). Although writing has been traditionally taught in language courses (Klein & Boscolo, 2016), efforts have been made in recent years to promote writing as a critical learning activity in other disciplines (e.g., Kiuhara et al., 2020; MacArthur, 2014; Wallace et al., 2004). Although social studies teachers are more likely to use writing to promote learning, followed by science and then mathematics teachers (Gillespie et al., 2014; Ray et al., 2016), meta-analytic work by Graham et al. (2020) shows that writing promotes school-aged students’ learning independently of the academic discipline (science, social studies, or mathematics) and the academic level (elementary, middle, or high school).
Teachers frequently assign a variety of writing activities, such as composing reports, summarizing information, building stories or narratives, or defending arguments. These assignments require students to engage in a broad spectrum of cognitive and metacognitive processes that teachers must consider when assessing their written products. For instance, students may set goals, strategically plan their compositions, activate prior knowledge, elaborate on new ideas or arguments, integrate information, monitor their comprehension and the quality of their written product, assess text structure, or search for spelling errors. These operations require considerable effort and are not always executed with consistent success by students (Graham & Harris, 2000).
Guiding students towards improved writing proficiency and enhanced learning outcomes through constructive feedback is of paramount importance. However, assessing and fostering writing while ensuring the effectiveness of teachers’ feedback is a complex endeavor. Teachers’ ability to design and deliver effective feedback is among the most relevant skills to acquire (Boud & Dawson, 2021; Graham, 2018). Nevertheless, providing feedback on writing assignments can be time-consuming and challenging, and it does not always guarantee significant learning gains (Duijnhouwer et al., 2012; Kellogg & Whiteford, 2009). The differences in the impact of feedback on writing may be attributed to the inherent complexity of this skill. Students vary in their ability to process feedback cognitively, affectively, and behaviorally, while teachers provide messages that are highly variable and specific to individual work (Parr & Timperley, 2010; Ryan et al., 2021). In addition to attributes of students and messages, teacher characteristics also may influence the type of feedback that is provided (see Panadero & Lipnevich, 2022). For example, research exploring potential gender differences among teachers in assessing and grading students’ performance has yielded varied results (Read et al., 2005). Some studies have indicated that male teachers tend to exhibit a bias favoring male students in their grading (Lavy, 2008), while others have not found consistent evidence to support this assertion (Doornkamp et al., 2022; Lindahl, 2016). In general, the body of research suggests that male teachers may tend to be more stringent in their assessment of students’ performance (Cheng & Kong, 2023; Lopera-Oquendo et al., 2023). However, it is worth noting that Read et al. (2005) discovered no statistically significant gender differences in how history higher education tutors presented feedback comments on students’ written work. 
These variations and disparities in assessment and grading practices based on gender could potentially result in distinct interactions through feedback comments when evaluating high school students’ writing, an area that has not been extensively explored in the literature.
Given the plethora of feedback-related considerations, the extant literature on written feedback suggests that there is not a single correct way to deliver effective feedback on writing. We do know that, ideally, comments should include enough information to let students know where they are (point a), where they are going (point b), and what actions they should undertake to get from point a to point b (Parr & Timperley, 2010). On the whole, findings support the claim that feedback on writing assignments should be specific and tailored to the student’s writing proficiency, include explanations, strategies, and future-oriented suggestions, provide clear actionable information, use a dialogic tone and, at the same time, avoid simple praise comments and grades because they lack relevant information to facilitate learning (Lipnevich et al., 2021b).
Recent studies have delved into how teachers design and deliver written feedback on writing for both traditional learners (e.g., Arts et al., 2016; Hattie et al., 2021; Otnes & Solheim, 2019; Parr & Timperley, 2010; Peterson & McClay, 2010) and second-language learners (e.g., Bitchener & Ferris, 2012; McMartin-Miller, 2014; Montgomery & Baker, 2007; Wigglesworth & Storch, 2012). These studies showed that teachers provided feedback that comprised surface comments, focusing on students’ writing mechanics rather than on content and higher-order writing skills (e.g., Arts et al., 2016; Hawe et al., 2008). Otnes and Solheim (2019), for example, found that school teachers’ written feedback on students’ writing was usually composed of directive messages in a non-dialogic tone and that it often included clear praise/criticism comments without further explanations.
Researchers also highlighted the importance of considering students’ current writing level to craft accurate feedback (Ropohl & Rönnebeck, 2019). For instance, Duijnhouwer et al. (2012) found that teachers provided more strategies when assessing weaker writing compared to higher quality work. Furthermore, a higher number of strategies within teachers’ feedback comments was associated with lower levels of students’ self-efficacy. In general, previous findings revealed that the quality of teachers’ written feedback when assessing a piece of writing can predict students’ progress (Hattie et al., 2021; Parr & Timperley, 2010). Unfortunately, studies also demonstrated that high school teachers were less likely to deliver high-quality actionable feedback compared to university professors (see, for example, Hattie et al., 2021). One possible explanation for the less-than-optimal feedback that students receive on their writing is that teachers often learn to assess their students’ knowledge and skills on the job (Orrell, 2006). In other words, teacher education programs may lack adequate training to enhance pre-service teachers’ assessment skills, including basic rules and approaches to feedback provision (DeLuca & Bellara, 2013). Thus, controlling for the quality of student essays, we examined the feedback on writing that teachers in training provided.

3 Pre-service teachers’ feedback on student writing

Compared to the large body of studies that have investigated the feedback provision skills of in-service teachers, pre-service teachers’ approaches to written feedback delivery have received significantly less attention (e.g., Dempsey et al., 2009; Fong et al., 2013; Guénette & Lyster, 2013; Junqueira & Payant, 2015; Ropohl & Rönnebeck, 2019). The studies that do exist suggest that beginning teachers struggle to devise effective feedback to support students’ writing skills and task performance. For instance, Guénette and Lyster (2013) explored the type of written feedback pre-service high school ESL teachers gave to their students on written assignments and identified that the bulk of comments comprised specific surface-level corrections and task-focused features, similar to what their in-service counterparts usually do (Furneaux et al., 2007). Conversely, in their analysis of pre-service teachers’ written feedback on a hypothetical essay, Fong et al. (2013) found that teacher candidates focused on both the task and the process outcomes, and that the process-oriented comments were positively related to writing improvements. Further, a recent study conducted by Ropohl and Rönnebeck (2019) examined pre-service high school chemistry teachers’ written feedback on 8th graders’ written plans to conduct an experimental study. These authors observed that pre-service teachers tended to provide descriptive and evaluative comments with little future-oriented information, and that students’ writing accuracy influenced the number of comments delivered, with low-performing students receiving more feedback.
Although these studies contribute to our understanding of pre-service teachers’ feedback practices, several critical limitations constrain the generalizability of their findings: (a) a single-case study or a small sample of pre-service teachers was employed (e.g., Fong et al., 2013; Guénette & Lyster, 2013; Junqueira & Payant, 2015; Ropohl & Rönnebeck, 2019), (b) a small number of feedback comments were analyzed (e.g., Fong et al., 2013; Ropohl & Rönnebeck, 2019), (c) feedback comments were classified along a single dimension (Fong et al., 2013), (d) differences in feedback provision as a function of teacher characteristics (e.g., gender or discipline) were not explored, and (e) little is known about teachers’ attempts to adapt feedback comments to students’ writing proficiency (Ropohl & Rönnebeck, 2019). Our study aims to overcome these limitations and fill the gap in the current literature.

4 The current study

In the current study, we examined teacher candidates’ feedback practices and investigated whether the type of comments pre-service teachers delivered depended on their individual characteristics (i.e., gender, academic discipline) as well as the quality of students’ writing. To better understand pre-service teachers’ feedback practices, we held the quality of student writing constant for all participants, used two standardized essays of low and high quality, and did not provide information about the gender of the student to avoid possible gender bias. That is, all teacher candidates were asked to provide feedback on the same two essays. Pre-service teachers’ comments were then coded into four categories, i.e., focus, content, emotional valence, and past and future orientation. The following research questions guided this study:
RQ1: What kind of feedback messages do pre-service high school teachers provide when assessing an academic writing assignment?
RQ2: To what extent do pre-service teachers’ gender and academic discipline, as well as students’ quality of writing, predict the type of feedback that pre-service teachers provide?

5 Method

5.1 Participants and procedure

A total of 282 master’s degree students in the high school education teacher training program at a public Spanish university participated in this study. An online instrument was administered to two course cohorts at the end of the 2020 and 2021 academic years. Participants who omitted more than 80% of the questionnaire items were excluded, leaving 255 observations in the final dataset (90.4%; N = 171 in 2020, N = 84 in 2021). Table 1 summarizes the participants’ demographic characteristics: 64.7% of the participants (N = 165) were women, 71.4% were between 21 and 25 years of age, and about two-thirds of the sample (62.4%, N = 159) were enrolled in the language teaching specialization program.
Table 1
Demographic characteristics of the participants

Variable                                                     N      %
Gender
  Female candidate                                         165   64.7
  Male candidate                                            88   34.5
  Missing                                                    2    0.8
Age
  21–25                                                    182   71.4
  26–30                                                     39   15.3
  More than 30                                              34   13.3
Discipline (master’s specialization program)
  Language studies and Classical Cultures
    (English, Spanish, French, German, Greek, and Latin)   159   62.4
  Social Sciences (Geography and History)                   57   22.4
  Sciences (Mathematics, Physics, and Chemistry)            39   15.3
Cohort (administration year)
  2020                                                     171   67.1
  2021                                                      84   32.9
Total                                                      255  100
This study adheres to institutional and governmental regulations governing research involving human participants. The study was approved by the ethical review committee (approval 2022-0460-QC) and followed the guidelines of the Declaration of Helsinki. All participants were informed about the nature of the study, i.e., the general research goal, the procedures, and the absence of potential risks associated with participation. They were also informed about the confidentiality and anonymity of the data collected, as well as their right to withdraw at any time. Before engaging in the study, participants provided informed consent.
This study was conducted online, with the questionnaire presented in Qualtrics. First, participants provided feedback on two anonymous essays with different levels of performance (one strong and one weak exemplar) selected from the Colombian standardized test of written communication SABER T&T, and then graded them using analytic and holistic scoring approaches. Second, participants were asked to complete the Receptivity to Instructional Feedback scale (Lipnevich et al., 2021a; Lipnevich & Lopera-Oquendo, 2024) and the Big Five Personality Inventory (BFI; Goldberg, 1993), and to report their demographic and academic background information. Finally, participants were asked to define the term “feedback” so as to gather information about their feedback conceptions. For the purposes of the current paper, we will not consider participants’ grading, personality variables, or feedback receptivity and will focus exclusively on teacher candidates’ feedback comments.
Before commencing the task, the participants received the essay prompt along with the instructions that the authors of the essay had received:
Next you will find a controversial topic. Please write an argumentative text justifying your position: ‘The parliament is currently discussing a bill to allow adolescents under the age of 14 to undergo cosmetic procedures. This bill has generated a lot of controversy, with some people in favor and some against this proposal. Do you agree that Spanish adolescents under the age of 14 should be allowed to undergo cosmetic procedures?’.
Next, the participants were shown two essays written by high school students in response to the above prompt and were invited to provide comments the same way they would in their own classes. Figure S1a, Supplementary Materials, depicts the comments interface. To provide a comment, teacher candidates could highlight the words or sentences they wanted to comment upon; a pop-up box was then displayed. Participants could provide up to eight comments of unlimited length (Figure S1b, Supplementary Materials). Each participant performed this task twice, once for the strong essay and once for the weak essay. Although participants were enrolled in the teacher training program, they had not received specific training on how to provide feedback on written assignments prior to participating in this study.

5.2 Data coding

Pre-service high school teachers’ feedback comments were coded according to the four dimensions of feedback: focus of feedback, content specificity, emotional valence, and past and future orientation. In order to perform the analysis, each category had two or three subcategories to classify the feedback comments. The definition of each subcategory along with some representative examples can be found in Table 2.
Table 2
Coding scheme employed to classify feedback
Category
Subcategory
Definition and examples
Focus of feedback
Task
Feedback comments focus on how well the task was understood or performed (i.e., feedback focuses on participant’s outcome or task performance). These comments may address specific instances that could rarely be generalized to other writing assignments.
Examples: “I really like this argument, well done.”, “The conclusion is confusing, it is not clear what your opinion is.”, “The text structure is not well defined, and some sentences make no sense.”
Process/self-reg.
Feedback comments focus on strategies followed or to be followed when writing (i.e., processes needed to perform the task), as well as self-monitoring or self-regulation. These comments may provide general guidelines that could be applied to other writing assignments.
Examples: “Once you consider your writing done, you should check whether there are mistakes and whether it responds to the topic in an appropriate manner.”, “Why did you not include more arguments in your writing? Please, check it.”
Content specificity
Grammar/spelling/punctuation
Feedback comments include information about the grammar, spelling, or punctuation such as incorrect word order and grammatical errors.
Examples: “You have missed some word or connector here.”, “I think you meant ‘error’, not ‘menor’”, “More commas need to be used in a proper way”
Style
Feedback comments refer to the student’s writing style, i.e., it refers to the way the student writes or expresses his/her own ideas. It may refer to syntactic or word choices, tone, and essay structure.
Examples: “To increase text cohesion, more connectors should be used.”, “Using less taxing sentences would be a good idea to persuade the reader.”
Ideas expressed
Feedback comments refer to the quality of the ideas/arguments expressed by pointing out whether they are appropriate and/or coherent to the topic of the essay.
Examples: “Consider that adolescents are not fully developed at the psychological and emotional level.”, “The arguments are clear and serve to defend the thesis.”
Emotional valence
Praise
Feedback comments congratulate on, compliment on or commend the students for his/her work. Comments express admiration of his/her piece of writing or a personal characteristic.
Examples: “Very nice writing”, “Keep up the good work!”, “These are good arguments; you’ve worked hard”
Criticism
Feedback comments judge the student’s piece of writing or a personal characteristic in an authoritarian or condescending manner.
Examples: “This is a mistake, it makes no sense to me.”, “The text structure is very deficient.”, “It is poorly written, there is no clarity of ideas.”
Neutral
Feedback comments presented in an informative manner, lacking emotional connotations.
Examples: “One of the aspects to improve would be going a little deeper into the topic and explain more arguments that are not included in your writing. Also, to add ideas in favor to then position yourself in the conclusion.”, “I understand the idea, but I think a verb is missing here.”
Orientation
Past-oriented
Comments assessing the student’s work in a past-oriented manner. These messages acknowledge something the student did well or point out negative aspects of the student’s writing, but do not encourage them to engage in further improvements.
Examples: “There is no coherence in this writing.”, “The statement indicates aesthetic procedures, without limiting it to plastic surgeries. It’s a mistake to reduce the argument to this idea since the statement does not refer to that.”
Future-oriented directive
Explicit prescriptive comments indicating what changes, modifications or adjustments the student must make in order to fix errors or improve his/her writing performance.
Examples: “Rephrase this sentence, it makes no sense.”, “Add commas where appropriate.”, “Do not use the first person in the introduction.”
Future-oriented suggestive
Prescriptive comments that provide the student with information about how to improve his/her current performance, including suggestions, hints, or clues that help move the writing forward. These are suggestive in nature and offer alternatives the student may pursue.
Examples: “If you change this sentence, it may be complemented by the ideas discussed below.”, “What do humans benefit from? You need to think deeper about this idea to improve your writing.”, “It would be advisable to start an introduction in a more depersonalized way.”
Previous research suggests that the process and self-regulation categories are hard to distinguish because the boundaries between them are somewhat artificial and, in both cases, comments refer to the student’s strategy (Arts et al., 2016). In the present study, comments focused on self-regulation were coded as process-focused feedback. Each feedback comment can be multi-dimensional (Parr & Timperley, 2010). For example, a comment may refer to the student’s task performance and writing style while also providing future-oriented suggestions on how to improve the text structure.
A total of 1835 feedback comments were coded (N = 1104 in 2020, N = 731 in 2021) by two researchers for each data collection cycle separately. For the 2020 sample, the two researchers first jointly coded 50 comments and then independently coded 100 comments (14% of the total number of comments) to calibrate. On average, the level of agreement was above 85%, and the inter-rater reliability indexed by Cohen’s κ was .66, ranging from .33 (p < .001) for the task subcategory to .94 (p < .001) for the criticism subcategory, showing acceptable agreement (Table S2, Supplementary Material) (McHugh, 2012). Disagreements were resolved through discussion. The remaining feedback comments were randomly divided into two sets of 550 comments that shared 25% of the comments to ensure that calibration remained consistent throughout the coding process. This follow-up calibration also showed acceptable agreement (95.5%), with an average Cohen’s κ of .89 across sets, ranging from .64 (p < .001) for the future-oriented directive subcategory to 1.0 (p < .001) for the process subcategory. For the 2021 sample, the process was almost identical. Two researchers, one of whom was a new coder who had not participated in the previous coding procedure, jointly coded 50 comments. The remaining comments were randomly divided into two sets of 400 comments that shared 40% of the sample to measure consistency between the two coders. The level of agreement was above 89%, and the inter-rater reliability measured with Cohen’s κ was .75, ranging from .58 (p < .001) for the future-oriented directive subcategory to .88 (p < .001) for the task subcategory, showing acceptable agreement. Discrepancies were resolved through discussion with a third researcher who had participated in the coding procedure for the 2020 sample.
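The two agreement indices used here, percent agreement and Cohen’s κ, can be computed per subcategory from two coders’ binary codes. The following is a minimal illustrative sketch; the coder arrays are hypothetical, not the study’s data:

```python
from collections import Counter

def percent_agreement(a, b):
    """Proportion of comments on which the two coders agree."""
    assert len(a) == len(b)
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Cohen's kappa for two coders' binary (0/1) codes on one subcategory:
    observed agreement corrected for agreement expected by chance."""
    n = len(a)
    po = percent_agreement(a, b)  # observed agreement
    # chance agreement from each coder's marginal rates
    ca, cb = Counter(a), Counter(b)
    pe = sum((ca[k] / n) * (cb[k] / n) for k in set(a) | set(b))
    return (po - pe) / (1 - pe)

# Hypothetical codes for 10 comments (1 = subcategory present)
coder1 = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
coder2 = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]
print(percent_agreement(coder1, coder2))        # 0.9
print(round(cohens_kappa(coder1, coder2), 3))   # 0.8
```

Note that, as in the paragraph above, high percent agreement can coexist with a lower κ when one code dominates the marginal distributions.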

5.3 Data analysis

Descriptive information about the characteristics of the feedback comments was examined for both the strong and the weak essay. For the first research question, frequencies for each subcategory were calculated for the total sample and by candidates’ gender and academic discipline. Z-tests comparing the observed proportions were then conducted.
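A two-proportion z-test of this kind can be sketched as follows. The counts used in the usage line are the praise counts from Table 4, taken purely as an illustration; the published χ² values may include a continuity correction, so they need not equal z² exactly:

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Two-sided z-test for the difference between two independent
    proportions, using the pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)             # pooled proportion
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # two-sided p-value from the standard normal tail: 2*(1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# e.g., praise: 173 of 852 strong-essay comments vs. 111 of 983 weak-essay comments
z, p = two_proportion_z(173, 852, 111, 983)
```

For this example the difference is clearly significant (p well below .001), in line with the praise row of Table 4.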
For the second research question, a set of multiple logistic regression models with a binomial distribution was fitted to estimate the main effects of pre-service candidates’ gender and academic discipline and of the quality of the essay (strong vs. weak performance) on the probability of providing comments in each subcategory. The proportion of comments in each subcategory was used as the dependent variable (an events/trials variable rather than a binary observation), whereas candidates’ gender, academic discipline, and cohort were used as predictors. Further, Zero-Inflated Poisson (ZIP) and Zero-Inflated Negative Binomial (ZINB) models were estimated to deal with the overabundance of zero counts in some subcategories. Goodness-of-fit statistics were calculated to select the best-fitting model for each subcategory; better models correspond to smaller Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) values (Friendly & Meyer, 2016). Diagnostic plots were also generated to check model assumptions. All analyses were conducted using R software version 4.1.2 (R Core Team, 2021). To control the inflation of Type I error resulting from multiple comparisons within the same sample, p-values were adjusted using the Benjamini-Hochberg method.
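The Benjamini-Hochberg step-up adjustment mentioned above (available in R as p.adjust(p, method = "BH")) can be sketched in a few lines; the raw p-values below are illustrative, not the study’s:

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, controlling the false
    discovery rate across a family of tests."""
    m = len(p_values)
    # sort p-values ascending, remembering original positions
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # step-up: walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

raw = [0.001, 0.022, 0.011, 0.244, 0.021]  # illustrative raw p-values
adj = benjamini_hochberg(raw)
```

Each adjusted value is at least as large as its raw p-value, and a test is declared significant when its adjusted value falls below the chosen FDR level.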

6 Results

6.1 Descriptive statistics

One thousand eight hundred thirty-five feedback comments were coded into the aforementioned subcategories (852 and 983 for the strong and the weak essays, respectively). Table 3 presents descriptive characteristics of the feedback messages for the total sample and the two levels of writing quality (strong and weak). Each participant provided an average of five feedback comments (M = 5.10, SD = 1.97, range = 1–8). About 95% of the comments comprised 1–3 sentences (M = 1.52, SD = 1.02, range = 1–11). On average, comments were 19 words long (M = 19.51, SD = 20.74, range = 1–247), and sentences were 13 words long (M = 12.59, SD = 8.64, range = 1–80). Participants somewhat tailored their comments to the students’ writing performance. A small-to-moderate difference was found for the total number of comments (d = .47), with the weak essay receiving, on average, one extra comment compared to the strong essay. Differences in the number of sentences per comment and their length (i.e., word count) were negligible, which means that participants did not adapt the length of their comments to the quality of the essay. Further, differences in feedback characteristics by cohort were even smaller, with effect sizes ranging from .07 (sentences per comment) to .31 (comments per participant) (Table S1, Supplementary Materials). Next, we explored whether the content of the feedback comments varied as a function of the quality of the writing (i.e., weak versus strong essay).
Table 3
Descriptive characteristics of the feedback comments as a whole and as a function of the quality of student writing

| Variable | Total sample M (SD) | Strong essay M (SD) | Weak essay M (SD) | Cohen’s d |
|---|---|---|---|---|
| Comments by participant | 5.10 (1.97) | 4.62 (1.87) | 5.51 (1.97) | .47 |
| Sentences by comment | 1.52 (1.02) | 1.57 (1.07) | 1.47 (0.98) | .10 |
| Words by comment | 19.51 (20.74) | 21.53 (22.77) | 17.75 (18.64) | .18 |
| Words by sentence | 12.59 (8.64) | 13.51 (9.20) | 11.80 (8.05) | .20 |

Feedback messages for strong essay N = 852, weak essay N = 983
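The Cohen’s d values in Table 3 are standardized mean differences. A minimal sketch using one common pooled-SD variant for two equally weighted groups follows; the authors’ exact formula is not specified, and the rounded table values reproduce the reported d = .47 only approximately:

```python
import math

def cohens_d(m1, sd1, m2, sd2):
    """Standardized mean difference using the pooled SD of two
    equally weighted groups."""
    pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return abs(m1 - m2) / pooled_sd

# comments per participant: weak essay (M = 5.51, SD = 1.97)
# vs. strong essay (M = 4.62, SD = 1.87), from Table 3
d = cohens_d(5.51, 1.97, 4.62, 1.87)
```

Note that each participant rated both essays, so a paired-design effect size would be an equally defensible choice here.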

6.2 Characteristics of the type of feedback pre-service teachers provided

We first computed the overall descriptive statistics for each feedback subcategory to draw a general picture of the comments provided (Table 4). Regarding the focus of the feedback, 1423 (77.5%) included information about students’ task performance and 789 (43%) referred to the writing process (i.e., strategies followed or to be followed when writing). When it comes to the specific content of the essay, 379 (20.7%) feedback comments referred to grammar/spelling/punctuation aspects, 1020 (55.6%) pointed out aspects of the students’ writing style, and 822 (44.8%) included information about the ideas and arguments expressed within the essay. For the emotional valence of the feedback, 284 (15.5%) included praise, 827 (45.1%) included criticisms, and 757 (41.3%) were delivered in a neutral tone. Regarding the feedback orientation, 1317 (71.8%) comments were past oriented, 343 (18.5%) included future-oriented directive actions to be performed, and 817 (44.5%) included future-oriented suggestive actions.
Table 4
Distribution of feedback comments for each subcategory as a function of the quality of student writing

| Category | Subcategory | Total n (%) | Strong essay n (%) | Weak essay n (%) | χ² | p |
|---|---|---|---|---|---|---|
| Focus of feedback | Task | 1423 (77.5) | 649 (76.2) | 774 (78.7) | 1.58 | .244 |
| | Process | 789 (43.0) | 392 (46.0) | 397 (40.4) | 5.66 | .022* |
| Content specificity | Grammar | 379 (20.7) | 146 (17.1) | 233 (23.7) | 11.61 | .001** |
| | Style | 1020 (55.6) | 474 (55.6) | 546 (55.5) | 0.00 | 1.000 |
| | Ideas expressed | 822 (44.8) | 408 (47.9) | 414 (42.1) | 5.92 | .021* |
| Emotional valence | Praise | 284 (15.5) | 173 (20.3) | 111 (11.3) | 27.66 | < .001*** |
| | Criticism | 827 (45.1) | 290 (34.0) | 537 (54.6) | 77.34 | < .001*** |
| | Neutral | 757 (41.3) | 401 (47.1) | 356 (36.2) | 21.73 | < .001*** |
| Orientation | Past-oriented | 1317 (71.8) | 585 (68.7) | 732 (74.5) | 7.30 | .011* |
| | Future-oriented directive | 343 (18.5) | 215 (25.5) | 128 (13.0) | 44.00 | < .001*** |
| | Future-oriented suggestive | 817 (44.5) | 307 (36.0) | 510 (51.9) | 45.78 | < .001*** |

*p < .05; **p < .01, ***p < .001
There were differences in the type of comments provided on the strong and weak essays. When assessing the weak essay, teacher candidates provided a higher number of feedback comments about grammar (p = .001), criticisms (p < .001), past-oriented evaluations (p = .011), and future-oriented suggestions (p < .001) than when assessing the strong essay. Conversely, the strong essay received a larger number of feedback comments about the writing process (p = .022), the quality of the ideas expressed (p = .021), praise (p < .001), neutral comments (p < .001), and future-oriented directive comments (p < .001), compared to the weak essay. No differences were found for the number of feedback comments referring to task performance or writing style.
Next, we computed descriptive statistics for each subcategory as a function of the quality of the writing and teacher candidates’ gender (Table 5) (see Table S3 for the distribution of feedback comments as a function of writing quality and cohort). Findings suggested that female candidates focused more on grammar aspects (p = .006 and p = .005 for the strong and weak essays, respectively) than male candidates. In contrast, male candidates commented more on the quality of the ideas than female candidates (p = .006 and p = .004 for the strong and weak essays, respectively). Regarding the emotional valence and orientation components, there were no significant differences in the proportion of comments by gender.
Table 5
Distribution of feedback comments for each subcategory as a function of the quality of student writing and teacher candidates’ gender

Strong essay:

| Category | Subcategory | Women n (%) | Men n (%) | χ² | p |
|---|---|---|---|---|---|
| Focus of feedback | Task | 406 (75.3) | 239 (77.6) | 0.44 | .592 |
| | Process | 263 (48.8) | 126 (40.9) | 4.59 | .112 |
| Content specificity | Grammar | 111 (20.6) | 35 (11.4) | 11.07 | .006** |
| | Style | 307 (57.0) | 165 (53.6) | 0.78 | .529 |
| | Ideas expressed | 233 (43.2) | 171 (55.5) | 11.38 | .006** |
| Emotional valence | Praise | 104 (19.3) | 69 (22.4) | 0.98 | .501 |
| | Criticism | 192 (35.6) | 96 (31.2) | 1.54 | .376 |
| | Neutral | 253 (46.9) | 145 (47.1) | 0.00 | 1.000 |
| Orientation | Past-oriented | 375 (69.6) | 206 (66.9) | 0.54 | .589 |
| | Future-oriented directive | 128 (23.7) | 86 (27.9) | 1.59 | .376 |
| | Future-oriented suggestive | 210 (39.0) | 94 (30.5) | 5.71 | .079 |

Weak essay:

| Category | Subcategory | Women n (%) | Men n (%) | χ² | p |
|---|---|---|---|---|---|
| Focus of feedback | Task | 493 (76.8) | 274 (82.0) | 3.29 | .140 |
| | Process | 276 (43.0) | 118 (35.3) | 5.04 | .108 |
| Content specificity | Grammar | 173 (26.9) | 57 (17.1) | 11.37 | .005** |
| | Style | 361 (56.2) | 182 (54.5) | 0.20 | .652 |
| | Ideas expressed | 242 (37.7) | 167 (50.0) | 13.16 | .004** |
| Emotional valence | Praise | 67 (10.4) | 42 (12.6) | 0.81 | .516 |
| | Criticism | 341 (53.1) | 191 (57.2) | 1.31 | .393 |
| | Neutral | 250 (38.9) | 106 (31.7) | 4.62 | .108 |
| Orientation | Past-oriented | 463 (72.1) | 262 (78.4) | 4.28 | .108 |
| | Future-oriented directive | 76 (11.8) | 50 (15.0) | 1.65 | .349 |
| | Future-oriented suggestive | 326 (50.8) | 179 (53.6) | 0.59 | .517 |

One thousand eight hundred thirty-five comments were analyzed (852 comments for the strong essay and 983 for the weak essay). Note that the subcategories are not mutually exclusive, so that the aggregate values do not necessarily correspond to the total number of comments
*p < .05; **p < .01, ***p < .001
We also computed descriptive statistics for each subcategory as a function of the quality of the essay and candidates’ academic discipline (Table 6). Regarding the focus of feedback, a statistically significant difference by area of specialization was found in the proportion of comments on the task for the weak essay: social science and science teacher candidates referred to task performance in almost every comment assessing the weak essay (90.2% and 84.4%, respectively). Interestingly, across disciplines, teacher candidates provided a similar proportion of comments about the writing process, independently of the quality of the essay. When it comes to content specificity, language teacher candidates delivered a higher proportion of comments related to grammar aspects in the weak essay than candidates in other disciplines (p < .001), but a lower proportion of comments related to the quality of ideas (p = .040 and p < .001 for the strong and weak essays, respectively). There were no significant differences among disciplines in the proportion of comments related to writing style.
Table 6
Distribution of feedback comments for each subcategory as a function of the quality of student writing and teacher candidates’ discipline

Strong essay:

| Category | Subcategory | Language n (%) | Social Science n (%) | Science n (%) | χ² | p |
|---|---|---|---|---|---|---|
| Focus of feedback | Task | 417 (75.1) | 98 (83.1) | 134 (74.9) | 0.79 | .518 |
| | Process | 263 (47.4) | 52 (44.1) | 77 (43.0) | 1.06 | .518 |
| Content specificity | Grammar | 108 (19.5) | 18 (15.3) | 20 (11.2) | 5.59 | .063 |
| | Style | 315 (56.8) | 60 (50.8) | 99 (55.3) | 0.69 | .518 |
| | Ideas expressed | 247 (44.5) | 72 (61.0) | 89 (49.7) | 6.92 | .040* |
| Emotional valence | Praise | 116 (20.9) | 35 (29.7) | 22 (12.3) | 0.25 | .719 |
| | Criticism | 197 (35.5) | 32 (27.1) | 61 (34.1) | 1.33 | .499 |
| | Neutral | 251 (45.2) | 51 (43.2) | 99 (55.3) | 1.96 | .453 |
| Orientation | Past-oriented | 389 (70.1) | 81 (68.6) | 115 (64.2) | 1.32 | .499 |
| | Future-oriented directive | 141 (25.4) | 40 (33.9) | 34 (19.0) | 0.01 | 1.000 |
| | Future-oriented suggestive | 222 (40.0) | 32 (27.1) | 53 (29.6) | 10.38 | .014* |

Weak essay:

| Category | Subcategory | Language n (%) | Social Science n (%) | Science n (%) | χ² | p |
|---|---|---|---|---|---|---|
| Focus of feedback | Task | 506 (75.1) | 111 (90.2) | 157 (84.4) | 16.51 | < .001*** |
| | Process | 287 (42.6) | 41 (33.3) | 69 (37.1) | 4.01 | .091 |
| Content specificity | Grammar | 187 (27.7) | 25 (20.3) | 21 (11.3) | 18.66 | < .001*** |
| | Style | 364 (54.0) | 68 (55.3) | 114 (61.3) | 1.86 | .268 |
| | Ideas expressed | 252 (37.4) | 59 (48.0) | 103 (55.4) | 19.04 | < .001*** |
| Emotional valence | Praise | 64 (9.5) | 23 (18.7) | 24 (12.9) | 6.35 | .033* |
| | Criticism | 359 (53.3) | 66 (53.7) | 112 (60.2) | 1.44 | .322 |
| | Neutral | 265 (39.3) | 37 (30.1) | 54 (29.0) | 8.51 | .012* |
| Orientation | Past-oriented | 495 (73.4) | 99 (80.5) | 138 (74.2) | 1.02 | .399 |
| | Future-oriented directive | 79 (11.7) | 26 (21.1) | 23 (12.4) | 2.85 | .160 |
| | Future-oriented suggestive | 355 (52.7) | 60 (48.8) | 95 (51.1) | 0.44 | .547 |

One thousand eight hundred thirty-five comments were analyzed (852 comments for the strong essay and 983 for the weak essay). Note that the subcategories are not mutually exclusive, so that the aggregate values do not necessarily correspond to the total number of comments. Hypothesis tests compared the proportion of comments provided by candidates in the language area against other disciplines
*p < .05; **p < .01, ***p < .001
Regarding the emotional valence of the feedback messages, results showed that candidates from the three disciplines delivered neutral or praise comments more frequently when assessing the strong essay but criticized more often when assessing the weak essay. There were no significant differences among disciplines in the proportion of comments across the emotional valence subcategories for the strong essay. In contrast, for the weak essay, there were significant differences in the proportion of feedback that included praise or neutral information: language teacher candidates delivered a lower proportion of praise comments and a higher proportion of neutral comments than candidates from other disciplines (p = .033 for praise and p = .012 for neutral). For the orientation category, past-oriented comments were the most prevalent among candidates from all three disciplines, and there were largely no statistically significant differences in the proportions of comments across orientation subcategories. The only exception was the future-oriented suggestive subcategory: candidates from the language discipline delivered more suggestions when assessing the strong essay than participants from other disciplines (p = .014).

6.3 Predicting feedback comments: the role of the quality of student writing and teacher candidates’ gender and discipline

Two sets of multiple logistic regression models, one with main effects only and one adding gender-by-discipline interactions, were fitted to examine the extent to which cohort, gender, and academic discipline predicted the type of feedback comments teacher candidates provided. The dependent variable was the proportion of comments in each subcategory. Models were estimated for the total sample of comments and, separately, for each essay quality level. Table 7 presents descriptive information for each subcategory as a function of writing quality, and Table S4 (Supplementary Material) shows descriptive information for the total sample of comments. On average, 16.5% of the comments focused on task performance, while 14.7% included past-oriented messages, independently of the students’ writing quality. Differences in proportions by writing quality were mainly observed for emotional valence, especially for criticism (M = .07, SD = .06 for the strong essay; M = .11, SD = .07 for the weak essay) and neutral comments (M = .11, SD = .09 for the strong essay; M = .07, SD = .08 for the weak essay).
Table 7
Descriptive information for the proportion of comments in each subcategory as a function of the quality of the writing

Strong essay:

| Category | Subcategory | Mean | SD | Range [min–max] | Skew | Kurtosis |
|---|---|---|---|---|---|---|
| Focus of feedback | Task | 0.168 | 0.057 | [0.00–0.33] | 0.065 | 0.652 |
| | Process | 0.094 | 0.075 | [0.00–0.38] | 0.683 | 0.739 |
| Content specificity | Grammar | 0.040 | 0.059 | [0.00–0.33] | 1.724 | 3.308 |
| | Style | 0.114 | 0.068 | [0.00–0.29] | −0.073 | −0.517 |
| | Ideas expressed | 0.106 | 0.081 | [0.00–0.38] | 0.710 | 0.626 |
| Emotional valence | Praise | 0.041 | 0.056 | [0.00–0.33] | 1.589 | 3.278 |
| | Criticism | 0.070 | 0.062 | [0.00–0.25] | 0.458 | −0.714 |
| | Neutral | 0.109 | 0.089 | [0.00–0.38] | 0.765 | 0.075 |
| Orientation | Past-oriented | 0.138 | 0.058 | [0.00–0.25] | −0.826 | 0.412 |
| | Future-oriented directive | 0.050 | 0.056 | [0.00–0.20] | 0.779 | −0.418 |
| | Future-oriented suggestive | 0.072 | 0.061 | [0.00–0.20] | 0.378 | −0.870 |

Weak essay:

| Category | Subcategory | Mean | SD | Range [min–max] | Skew | Kurtosis |
|---|---|---|---|---|---|---|
| Focus of feedback | Task | 0.163 | 0.055 | [0.00–0.33] | −0.406 | 1.120 |
| | Process | 0.089 | 0.076 | [0.00–0.33] | 0.800 | 0.596 |
| Content specificity | Grammar | 0.045 | 0.060 | [0.00–0.33] | 1.489 | 2.288 |
| | Style | 0.115 | 0.066 | [0.00–0.33] | 0.031 | −0.098 |
| | Ideas expressed | 0.096 | 0.072 | [0.00–0.33] | 0.776 | 0.904 |
| Emotional valence | Praise | 0.030 | 0.053 | [0.00–0.25] | 1.905 | 2.966 |
| | Criticism | 0.110 | 0.070 | [0.00–0.33] | 0.024 | −0.209 |
| | Neutral | 0.073 | 0.082 | [0.00–0.33] | 1.246 | 1.103 |
| Orientation | Past-oriented | 0.148 | 0.052 | [0.00–0.25] | −1.140 | 1.528 |
| | Future-oriented directive | 0.033 | 0.052 | [0.00–0.20] | 1.598 | 1.619 |
| | Future-oriented suggestive | 0.099 | 0.062 | [0.00–0.20] | −0.345 | −1.046 |
Furthermore, the distributions of some subcategories were highly skewed. The proportion of participants who did not provide any message in a subcategory varied between .01 (task) and .57 (grammar) for the strong essay, and between .03 (task) and .66 (praise) for the weak essay. Moreover, more than 50% of participants did not provide comments pertaining to the grammar, praise, and future-oriented directive subcategories. For these categories, additional Zero-Inflated Poisson (ZIP) and Zero-Inflated Negative Binomial (ZINB) models were estimated to control for the effect of an overabundance of zero counts in modeling.
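A zero-inflated Poisson, as referenced above, mixes a point mass at zero ("structural" zeros from participants who never use a subcategory) with an ordinary Poisson count process. A minimal sketch of its probability mass function with illustrative parameters (the study's models were fitted in R; this is not the authors' code):

```python
import math

def zip_pmf(k, lam, pi):
    """Probability mass of a Zero-Inflated Poisson: with probability pi the
    count is a structural zero; otherwise it comes from Poisson(lam)."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return pi + (1 - pi) * poisson
    return (1 - pi) * poisson

# e.g., half the participants never use a subcategory (pi = 0.5),
# the rest produce counts with mean 1 (lam = 1.0)
p_zero = zip_pmf(0, lam=1.0, pi=0.5)
```

With these parameters, zeros are far more frequent than a plain Poisson(1) would predict, which is exactly the pattern that motivates the ZIP/ZINB alternatives above.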
Results of multiple logistic regression models with significant effects for each subcategory as a function of the quality of the writing are displayed in Tables 8 and 9, while model estimation for the total sample of feedback comments is presented in Table S5 (Supplementary Materials).
Table 8
Logistic regression. Cohort, gender, and discipline as predictors of the proportion of feedback comments for the strong essay

| Subcategory | Predictor | β (SE) | OR | OR 95% CI [LL, UL] | p |
|---|---|---|---|---|---|
| Task | Intercept | −1.5 (0.09) | 0.22 | [0.19–0.26] | < .001*** |
| | Cohort [2021] | −0.32 (0.11) | 0.73 | [0.59–0.90] | .008** |
| | Gender [Men] | 0.01 (0.1) | 1.01 | [0.84–1.23] | .891 |
| | Discipline [Science] | −0.11 (0.13) | 0.90 | [0.70–1.15] | .671 |
| | Discipline [Social Science] | −0.07 (0.14) | 0.93 | [0.70–1.23] | .783 |
| Process | Intercept | −2.45 (0.12) | 0.09 | [0.07–0.11] | < .001*** |
| | Cohort [2021] | 0.4 (0.14) | 1.49 | [1.14–1.98] | .012* |
| | Gender [Men] | −0.12 (0.13) | 0.88 | [0.69–1.13] | .323 |
| | Discipline [Science] | 0.29 (0.17) | 1.34 | [0.95–1.87] | .155 |
| | Discipline [Social Science] | 0.24 (0.19) | 1.27 | [0.87–1.84] | .269 |
| Grammar | Intercept | −3.51 (0.20) | 0.03 | [0.02–0.04] | < .001*** |
| | Cohort [2021] | 0.64 (0.23) | 1.89 | [1.23–3.00] | .013* |
| | Gender [Men] | −0.44 (0.22) | 0.64 | [0.42–0.97] | .068 |
| | Discipline [Science] | 0.11 (0.31) | 1.12 | [0.60–2.03] | .720 |
| | Discipline [Social Science] | 0.38 (0.32) | 1.46 | [0.77–2.69] | .293 |
| Praise | Intercept | −2.97 (0.15) | 0.05 | [0.04–0.07] | < .001*** |
| | Cohort [2021] | −0.29 (0.19) | 0.75 | [0.52–1.10] | .228 |
| | Gender [Men] | 0.22 (0.18) | 1.25 | [0.88–1.76] | .255 |
| | Discipline [Science] | −0.74 (0.26) | 0.48 | [0.28–0.79] | .014* |
| | Discipline [Social Science] | 0.13 (0.23) | 1.14 | [0.72–1.79] | .568 |
| Future-oriented suggestive | Intercept | −2.61 (0.14) | 0.07 | [0.06–0.09] | < .001*** |
| | Cohort [2021] | 0.43 (0.15) | 1.54 | [1.14–2.10] | .018* |
| | Gender [Men] | −0.36 (0.19) | 0.70 | [0.47–1.01] | .110 |
| | Discipline [Science] | −0.38 (0.31) | 0.69 | [0.36–1.21] | .218 |
| | Discipline [Social Science] | −0.47 (0.37) | 0.63 | [0.29–1.22] | .218 |
| | Men : Science | 0.870 (0.38) | 2.39 | [1.17–5.11] | .047* |
| | Men : Social Science | 0.698 (0.45) | 2.01 | [0.85–5.09] | .172 |

Values in parentheses indicate the standard error. Values in square brackets indicate the 95% confidence interval of the odds ratio estimation. LL and UL indicate the lower and upper limits of the odds ratio. N = 235
*p < .05; **p < .01, ***p < .001
Table 9
Logistic regression. Cohort, gender, and discipline as predictors of the proportion of feedback comments for the weak essay

| Subcategory | Predictor | β (SE) | OR | OR 95% CI [LL, UL] | p |
|---|---|---|---|---|---|
| Process | Intercept | −2.62 (0.12) | 0.07 | [0.06–0.09] | < .001*** |
| | Cohort [2021] | 0.50 (0.14) | 1.65 | [1.27–2.16] | .001** |
| | Gender [Men] | −0.11 (0.13) | 0.89 | [0.69–1.15] | .477 |
| | Discipline [Science] | 0.20 (0.18) | 1.22 | [0.86–1.72] | .428 |
| | Discipline [Social Science] | 0.07 (0.20) | 1.07 | [0.72–1.58] | .723 |
| Grammar | Intercept | −2.70 (0.13) | 0.07 | [0.05–0.09] | < .001*** |
| | Cohort [2021] | −0.07 (0.16) | 0.93 | [0.69–1.27] | .662 |
| | Gender [Men] | −0.28 (0.17) | 0.76 | [0.54–1.05] | .166 |
| | Discipline [Science] | −0.87 (0.26) | 0.42 | [0.25–0.69] | .002** |
| | Discipline [Social Science] | −0.28 (0.24) | 0.76 | [0.46–1.20] | .308 |
| Praise | Intercept | −4.06 (0.22) | 0.02 | [0.01–0.03] | < .001*** |
| | Cohort [2021] | 0.25 (0.27) | 1.28 | [0.77–2.21] | .444 |
| | Gender [Men] | 0.01 (0.23) | 1.01 | [0.65–1.56] | .971 |
| | Discipline [Science] | 0.45 (0.31) | 1.56 | [0.85–2.89] | .250 |
| | Discipline [Social Science] | 0.78 (0.31) | 2.18 | [1.17–4.05] | .033* |
| Neutral | Intercept | −2.40 (0.11) | 0.09 | [0.07–0.11] | < .001*** |
| | Cohort [2021] | −0.04 (0.13) | 0.96 | [0.74–1.25] | .756 |
| | Gender [Men] | −0.11 (0.16) | 1.11 | [0.82–1.51] | .719 |
| | Discipline [Science] | −0.14 (0.25) | 0.87 | [0.53–1.39] | .719 |
| | Discipline [Social Science] | 0.13 (0.25) | 1.13 | [0.68–1.82] | .719 |
| | Men : Science | −0.46 (0.33) | 0.63 | [0.34–1.21] | .374 |
| | Men : Social Science | −0.99 (0.38) | 0.37 | [0.17–0.78] | .031* |
| Past-oriented | Intercept | −1.57 (0.08) | 0.21 | [0.18–0.24] | < .001*** |
| | Cohort [2021] | −0.25 (0.10) | 0.78 | [0.64–0.94] | .028* |
| | Gender [Men] | 0.05 (0.09) | 1.06 | [0.88–1.27] | .568 |
| | Discipline [Science] | −0.18 (0.12) | 0.83 | [0.65–1.06] | .232 |
| | Discipline [Social Science] | −0.10 (0.14) | 0.90 | [0.69–1.18] | .568 |
| Future-oriented suggestive | Intercept | −2.32 (0.10) | 0.10 | [0.08–0.12] | < .001*** |
| | Cohort [2021] | 0.34 (0.12) | 1.40 | [1.11–1.78] | .013* |
| | Gender [Men] | 0.13 (0.11) | 1.14 | [0.92–1.42] | .389 |
| | Discipline [Science] | 0.09 (0.15) | 1.10 | [0.81–1.47] | .678 |
| | Discipline [Social Science] | 0.043 (0.17) | 1.04 | [0.74–1.45] | .800 |

Values in parentheses indicate the standard error. Values in square brackets indicate the 95% confidence interval of the odds ratio estimation. LL and UL indicate the lower and upper limits of the odds ratio. N = 235
*p < .05; **p < .01, ***p < .001
Regarding the focus of feedback, participants in the 2021 cohort were 0.73 (β = −0.32, p = .008) times as likely as the 2020 cohort to provide comments about the task in the strong essay. Moreover, candidates from the 2021 cohort were 1.49 (β = 0.40, p = .012) and 1.65 (β = 0.50, p = .001) times more likely to provide process-focused comments in the strong and weak essays, respectively, compared to candidates from the 2020 cohort. For content specificity, candidates from 2021 were 1.89 (β = 0.64, p = .013) times as likely to write feedback comments about grammar in the strong essay relative to the 2020 cohort. Results also showed a significant effect of discipline on grammar comments in the weak essay: science teacher candidates were 0.42 (β = −0.87, p = .002) times as likely to comment on grammar aspects as their language-specialization counterparts.
The participants’ discipline was also a statistically significant predictor within the emotional valence category. Specifically, science teacher candidates were 0.48 (β = −0.74, p = .014) times as likely as language teacher candidates to provide praise when assessing the strong essay, whereas social science teacher candidates provided 2.18 (β = 0.78, p = .033) times more praise than language teacher candidates when assessing the weak essay. Additionally, there was a statistically significant interaction between gender and discipline for neutral comments on the weak essay: male social science teacher candidates provided a lower proportion of neutral comments than male language teacher candidates (β = −0.99, p = .031).
Finally, participants in the 2021 cohort were 0.78 (β = −0.25, p = .028) times as likely as the 2020 cohort to provide past-oriented evaluative comments when assessing the weak essay. Moreover, candidates from the 2021 cohort were 1.54 (β = 0.43, p = .018) and 1.40 (β = 0.34, p = .013) times more likely to provide future-oriented suggestive comments in the strong and weak essays, respectively, compared to those from the 2020 cohort. Additionally, male science teachers provided a higher proportion of future-oriented suggestive comments in the strong essay compared to male language teachers (β = 0.870, p = .047). Tables S6 to S8 (Supplementary Materials) display the goodness-of-fit statistics for all models conducted. Additionally, a comparison of goodness-of-fit statistics for the categories modeled with zero-inflated alternatives indicates that the best-fitting models were the logistic regressions with a binomial link function (Table S9, Supplementary Materials).
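The odds ratios reported throughout follow directly from the logistic regression coefficients via OR = exp(β). A quick check against a few of the coefficients reported above:

```python
import math

def odds_ratio(beta):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta)

# coefficients reported for the 2021 cohort in the tables above
print(round(odds_ratio(-0.32), 2))  # 0.73 (task focus, strong essay)
print(round(odds_ratio(0.50), 2))   # 1.65 (process focus, weak essay)
print(round(odds_ratio(0.43), 2))   # 1.54 (future-oriented suggestive, strong essay)
```

An OR below 1 (negative β) means the predictor lowers the odds of a comment falling in the subcategory; an OR above 1 raises them.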

7 Discussion

Teachers often assess and deliver written feedback comments on their students’ writing assignments, which is critical for improving writing outcomes and developing students’ writing skills (e.g., Dempsey et al., 2009; Duijnhouwer et al., 2010, 2012; Graham, 2018; Parr & Timperley, 2010). The extant literature suggests that providing effective feedback is a complex pedagogical skill that is difficult to master for both in-service and pre-service teachers (Ropohl & Rönnebeck, 2019; Ryan et al., 2021; Underwood & Tregidgo, 2006). So far, very few studies have closely examined the type of feedback that pre-service teachers deliver when assessing written assignments, which was the main goal of our study. We attempted to present a comprehensive overview of the feedback comments that pre-service high school teachers provide on two standardized essays of different quality. We also examined differences in the feedback they provided depending on gender, academic discipline, and students’ quality of writing (i.e., weak vs. strong essay).

7.1 What kind of feedback messages did pre-service high school teachers provide when assessing an academic writing assignment?

Guided by existing taxonomies (e.g., Otnes & Solheim, 2019; Wingard & Geosits, 2014; Zellermayer, 1989), we employed four categories to classify feedback comments, i.e., focus, content specificity, emotional valence, and past and future orientation. In general, the literature indicates that teachers’ feedback should be predominantly process-focused, as this type of feedback would have a higher chance of being transferred to a new task (Hawe et al., 2008; Parr & Timperley, 2010). However, recent findings show that both in-service and pre-service teachers usually provide task-focused feedback when assessing writing assignments (Arts et al., 2016; Dysthe, 2011; Fong et al., 2013). Despite the fact that over three quarters of the comments analyzed in the current study focused on the students’ task performance, more than 40% of feedback messages referred to the students’ writing process (e.g., Once you consider your writing done, you should check whether there are mistakes and whether it responds to the topic in an appropriate manner). This generally positive pattern of our results suggests that pre-service high school teachers may not only spontaneously assess the assignment outcome but also the writing process that leads students to that outcome (Fong et al., 2013).
When it comes to the content of comments, teacher candidates in our study mostly commented on students’ writing style and the arguments employed to defend their contentions, with relatively few comments focusing on grammar and punctuation. This finding contrasts with previous research suggesting that both in-service and pre-service teachers tend to provide feedback on surface features (grammar, spelling, or punctuation) rather than on higher-order writing skills (e.g., Arts et al., 2016; Furneaux et al., 2007; Hawe et al., 2008). It is possible that this overall optimistic finding has to do with our systematic and comprehensive design. That is, we examined 1835 comments delivered on two standardized essays, which allowed us to capture a more complete pattern of pre-service teachers’ written feedback than previous studies.
When it comes to the emotional valence of feedback, pre-service teachers mostly delivered criticisms or provided comments in a neutral or informative tone. Such informative feedback comments, which mainly state or explain what actions students should undertake, are effective for enhancing student writing (Hattie et al., 2021). Interestingly, very few comments included praise, which aligns with the evidence-based suggestion that praise does not enhance learning (Lipnevich & Smith, 2022). Teachers’ feedback that includes praise or criticism without further explanation may limit feedback effectiveness (Nicol & Macfarlane-Dick, 2006). In our study, teacher candidates provided praise and criticism along with additional information to help move students’ writing forward. This finding should be interpreted with caution. The teacher candidates did not know the students who had written the two essays; hence, they may not have been emotionally invested enough to provide positive comments that they otherwise would have, had they had a personal relationship with the student. Likewise, the number of critical comments could have varied, had the teachers known the authors of the two essays. These potential confounds aside, the bulk of teacher candidates’ comments in our study was neutral in tone, which suggests that they are well prepared to communicate feedback in a non-emotionally charged manner.
Regarding the past and future orientation of feedback, we found that over 70% of the comments were past-oriented, about 45% included future-oriented suggestive actions, and few comments included future-oriented directives. Overall, these findings indicate that pre-service teachers’ feedback included information about where students are and where they need to go, and offered suggestions on how to get there (Parr & Timperley, 2010). Although previous research has shown that teachers rarely deliver future-oriented comments when assessing a writing assignment (Arts et al., 2016), our findings revealed that about 50% of the comments offered students tools to improve their writing, which is a key aspect of feedback design (Duijnhouwer et al., 2012; Pitt & Norton, 2017; Price et al., 2011). We also found that teacher candidates provided a substantial number of past-oriented evaluative comments, which may inhibit students’ engagement with feedback if such comments focus on negative aspects of their performance (Pitt & Norton, 2017).

7.2 To what extent did pre-service teachers’ gender and academic discipline, as well as students’ quality of writing, predict the type of feedback provided?

To answer our second research question, we examined the effects of key variables on teacher candidates’ feedback provision: essay quality (strong vs. weak), candidates’ gender, and academic discipline. As the literature suggests, considering students’ writing level is crucial to crafting effective feedback (Graham, 2018; Ropohl & Rönnebeck, 2019). In our study, an interesting pattern emerged in the differences between comments on the weak and the strong essay. For the weak essay, candidates mostly focused on grammar and provided critical comments, past-oriented evaluations, and future-oriented suggestions. The strong essay, on the other hand, received more comments on the writing process and the quality of the ideas expressed, along with more praise, neutral messages, and future-oriented directives. These findings indicate that pre-service high school teachers can create feedback that includes actions to improve the writing and that they adapt feedback to the level of writing performance (Duijnhouwer et al., 2012; Hattie et al., 2021), with the strong essay receiving the type of feedback that facilitates future performance in a deeper way.
When considering teacher candidates’ gender, we observed interesting differences in feedback types. In general, women commented more on grammar, while men commented more on the quality of the ideas and arguments employed. However, for the focus of feedback, its emotional valence, and its past-future orientation, no significant gender differences in the proportion of comments were apparent. A significant gender-by-discipline interaction emerged for future-oriented suggestive comments on the strong essay, with male science teachers providing a higher proportion of suggestions on this essay than male language teachers. Another gender-by-discipline interaction was found for neutral comments on the weak essay, with male social science teachers providing a lower proportion of neutral comments on this essay than male language teachers.
When it comes to the teacher candidates’ academic discipline, social science and science candidates referred to task performance in almost every comment when assessing the weak essay. Interestingly, candidates from the three disciplines provided a similar proportion of comments about the writing process when assessing both the strong and the weak essays. Thus, teacher candidates appear able to foster the composition skills of young writers by focusing on the writing process when creating feedback. For the content specificity of the comments, language teacher candidates delivered a higher proportion of grammar-related comments on the weak essay than their science counterparts, suggesting that they pay closer attention to the mechanics of writing. Science teacher candidates, in turn, were more likely than their language counterparts to comment on the quality of the ideas expressed in the weak essay, and social science candidates provided more comments on the ideas when assessing the strong essay. As for the emotional valence of feedback, candidates from the three disciplines delivered neutral or praise comments more often when assessing the strong essay but were more likely to criticize the weak essay. Science teacher candidates, however, delivered a lower proportion of praise comments on the strong essay than their language counterparts, whereas language teacher candidates delivered a higher proportion of neutral comments on the weak essay than candidates from the other disciplines. Regarding the orientation of feedback, language teacher candidates delivered more future-oriented suggestions when assessing the strong essay than participants from the other disciplines.
Again, these findings are encouraging in that teacher candidates seem to tailor their comments to low- and high-quality essays, and language teacher trainees appear well prepared to help students improve their writing.
Because this study was conducted across two academic years, cohort was included as a predictor of the type of feedback comments created. Results suggest that teacher candidates from different cohorts may deliver different feedback. For instance, candidates from the 2021 cohort focused more on the writing process and offered more future-oriented suggestive comments when assessing both the weak and the strong essay than candidates from the 2020 cohort. However, candidates from the 2021 cohort were less likely to comment on students’ task performance and commented more on grammar when assessing the strong essay than candidates from the 2020 cohort. When assessing the weak essay, the 2021 cohort delivered fewer past-oriented messages than the 2020 cohort. These findings indicate that specific training on building effective feedback that responds to evidence-based instructional principles needs to be consistently integrated into teacher training programs, so as to ensure that teachers from different cohorts develop similar (and equally effective!) feedback routines.

8 Limitations and future directions

Although this study presents a comprehensive picture of the type of feedback pre-service high school teachers may deliver when assessing written essays, it also reveals areas that warrant further research. First, teacher candidates assessed two essays from fictitious students, which may have influenced the interactive nature of feedback. Teachers often hold beliefs about their students’ competence and motivation, which lead to different patterns of communication. Exploring how pre-service teachers provide feedback on writing assignments during their internship training would be of special interest to draw a more ecologically valid picture of feedback practices. At the same time, using standardized essays allowed us to draw conclusions about differences in teacher feedback while holding essay quality constant.
Second, providing feedback requires teachers to activate cognitive and metacognitive processes when assessing an assignment. To the best of our knowledge, no study has examined teachers’ cognitive processing when crafting feedback. Future studies may explore these mechanisms and processes. Furthermore, in our study, teacher candidates did not receive any specific training on evidence-based instructional principles that may support feedback construction. Future research may examine the extent to which training candidates to deliver effective feedback on specific assignments (e.g., writing tasks) may improve their feedback practices (Dempsey et al., 2009). It may also be relevant to account for the teacher candidates’ background to assess writing, the role of their own experiences receiving feedback, and the extent to which general or discipline-based training experiences make a difference in delivering feedback.
To be effective, feedback comments have to be received, processed, and used by students. Previous studies suggest that teachers sometimes overestimate the quality of their feedback comments compared to their students’ perceptions (Orrell, 2006). Research also indicates that students use feedback comments to make adjustments to their assignments even when no future-oriented comments are provided (Arts et al., 2016). Given that our study demonstrates that teacher candidates are able to deliver future-oriented comments, we call for research that carefully examines how students take up this particular type of feedback, and whether these comments translate into enhanced engagement and task improvements. Finally, this study was conducted in a sample of teacher trainees in a single university in Spain. Replicating this study in other universities and countries could be a fruitful avenue for research.

9 Implications and conclusions

In general, our findings offer several insights for teacher training programs and for student learning. The literature suggests that feedback effectiveness depends on teachers’ ability to deliver information in a manner that triggers actions that facilitate learning (e.g., Carless & Boud, 2018; Winstone et al., 2017). The coding scheme we employed to classify feedback allowed us to dig deeper into the quality of the comments pre-service high school teachers provide when assessing writing assignments. Contrary to previous findings (Hawe et al., 2008), we observed that teacher candidates treated the written products as work-in-progress rather than finished work, including a good number of comments on the writing process along with future-oriented messages. However, some candidates systematically focused on students’ task outcomes, which may mean that they paid more attention to the product and the specifics of the task, thus reducing the odds of promoting learning improvements. We also found that, overall, teacher candidates were capable of delivering neutral informative comments, which research shows could be most conducive to improvement (e.g., Lipnevich & Smith, 2009, 2022).
Taken together, our results suggest a pressing need for structured training that helps teacher candidates deliver spontaneous feedback across a spectrum of assignments while attending to its characteristics, such as focus, content, emotional tone, and orientation. Initial efforts should prioritize identifying candidates who naturally gravitate toward assessing student products and task-specific elements, laying the groundwork for instruction on evidence-based feedback practices. Subsequently, authentic, hands-on experiences become paramount, aimed at cultivating feedback strategies that prioritize the writing process, adopt a forward-looking perspective, and are conveyed in a neutral tone. Such an approach holds potential for harnessing the power of feedback to enhance writing proficiency and foster academic improvement (Parr & Timperley, 2010; Pitt & Norton, 2017). Notably, teacher trainers should keep several considerations in mind during training sessions. Factors such as candidates’ gender may influence the characteristics of feedback, with men providing more comments on arguments and women offering more comments on grammar. Similarly, disciplinary differences must be acknowledged: male social science educators may provide fewer neutral comments than male language candidates, whereas male science instructors may provide more future-oriented suggestions than male language educators.
This study holds significant implications for student learning within the classroom setting. Broadly speaking, research suggests that the provision of high-quality, effective feedback can greatly contribute to the acquisition and enhancement of higher-order writing skills, such as managing text structure and constructing compelling arguments. However, not all students receive identical types of feedback. Our research revealed that lower-performing students may receive fewer opportunities to cultivate critical writing skills than their higher-performing peers. Moreover, they may encounter more frequent critical feedback, potentially dampening their motivation. Additionally, students may derive varying benefits from different types of feedback depending on the subject matter: for instance, language teachers often prioritize mechanics and offer suggestions to refine the written product, whereas science instructors may focus more on evaluating the strength of the arguments presented. Despite the significance of these implications, our results should be approached with caution, as participants in our study did not know the students who produced the essays.
In this study, we attempted to provide a fine-grained picture of the type of written feedback that high school teacher candidates deliver when assessing written assignments, as well as to examine the extent to which the quality of the student’s essay, the candidate’s gender, and academic discipline predicted the type of comments generated. Although some feedback comments did not align with evidence-based instructional principles, the bulk of the comments included effective and valuable information with great potential to enhance students’ writing. All in all, the findings of this study are quite optimistic: teacher candidates in our sample delivered comments of higher quality than the literature would predict. There is always room for improvement, but we remain pleased with the participating cohort of teacher candidates.

Declarations

Competing interests

The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


References
Bitchener, J., & Ferris, D. R. (2012). Written corrective feedback in second language acquisition and writing. Routledge.
Fong, C. J., Williams, K. M., Schallert, D. L., & Warner, J. R. (2013). Without adding these details, your writing is meaningless: Evaluating preservice teachers’ constructive feedback on a writing assignment. Yearbook of the Literacy Research Association, 62, 344–358.
Freedman, S. W., Hull, G. A., Higgs, J. M., & Booten, K. P. (2016). Teaching writing in a digital and global age: Toward access, learning, and development for all. In D. H. Gitomer & C. A. Bell (Eds.), Handbook of research on teaching (5th ed., pp. 1389–1450). American Educational Research Association. https://doi.org/10.3102/978-0-935302-48-6_23
Friendly, M., & Meyer, D. (2016). Discrete data analysis with R: Visualization and modeling techniques for categorical and count data (1st ed.). Chapman & Hall.
Hawe, E., Dixon, H., & Watson, E. (2008). Oral feedback in the context of written language. The Australian Journal of Language and Literacy, 31(1), 43–58.
Kluger, A. N., & DeNisi, A. (1996). The effects of feedback interventions on performance: A historical review, a meta-analysis, and a preliminary feedback intervention theory. Psychological Bulletin, 119(2), 254–284.
Lopera-Oquendo, C., Lipnevich, A., & Máñez, I. (2023, June 28). Comparison of holistic and analytic grading in high-school pre-service teachers [Oral presentation]. EARLI SIG 1&4 Conference, Cádiz, Spain.
McHugh, M. L. (2012). Interrater reliability: The kappa statistic. Biochemia Medica, 22(3), 276–282.
Pekrun, R. (2022). Development of achievement emotions. In D. Dukes, A. Samson, & E. Walle (Eds.), The Oxford handbook of emotional development (pp. 446–462). Oxford University Press.
Underwood, J. S., & Tregidgo, A. P. (2006). Improving student writing through effective feedback: Best practices and recommendations. Journal of Teaching Writing, 22(2), 73–97.
