Developing Semi-Automated Evaluation of Analysis in Secondary Student Writing

Introduction

Fully functional and reliable automated “AI” grading of essays is still a long way off and well beyond the computing capability available in typical secondary school classrooms. However, useful steps in that direction are well within reach, particularly within the domain of the limited vocabulary and composition skills typical of students in grades six through twelve. High school social studies teachers in New York State assess student essays using a grading rubric provided by the State Education Department. One dimension of this rubric assesses the relative degree of “analytic” versus descriptive writing: students whose essays are more analytical than descriptive produce work of greater value. The artificially intelligent grading program at InnovationAssessments.com estimates the grade of a student writing sample by comparing it to a number of models in a corpus of full-credit samples. With a view to developing an algorithm that better imitates human raters, this paper outlines the data and methods underlying an algorithm that yields an assessment of the “richness of analysis” of a student writing sample.

Measuring “Richness” of Analysis in Secondary Student Writing Samples

The New York State generic scoring rubrics for high school social studies Regents exams, both for thematic and document-based essays, value student expository work where the piece “[i]s more analytical than descriptive (analyzes, evaluates, and/or creates* information)” (Abrams, 2004). A footnote in the Generic Grading Rubric states: “The term create as used by Anderson/Krathwohl, et al. in their 2001 revision of Bloom’s Taxonomy of Educational Objectives refers to the highest level of the cognitive domain. This usage of create is similar to Bloom’s use of the term synthesis. Creating implies an insightful reorganization of information into a new pattern or whole. While a level 5 paper will contain analysis and/or evaluation of information, a very strong paper may also include examples of creating information as defined by Anderson and Krathwohl.”

Anderson and Krathwohl (2002), in their revision of Bloom’s Taxonomy, define analysis thus:

4.0 Analyze – Breaking material into its constituent parts and detecting how the parts relate to one another and to an overall structure or purpose.
4.1 Differentiating
4.2 Organizing
4.3 Attributing

One of the ways that students analyze is by expressing cause and effect relationships (Anderson and Krathwohl’s “4.3 Attributing”). Using natural language processing techniques, it is possible to identify and examine cause and effect relationships in writing samples through lexical and syntactic indicators. Taking a cue from the New York State rubric, one could judge a student writing sample to be more “richly analytical” if it “spends” more words on cause and effect in proportion to the entire body of words written.

Identifying and Extracting Cause and Effect Relationships using Natural Language Processing

With regard to identifying cause-effect relationships in natural language, Asghar (2016, p. 2) notes that “[t]he existing literature on causal relation extraction falls into two broad categories: 1) approaches that employ linguistic, syntactic and semantic pattern matching only, and 2) techniques based on statistical methods and machine learning.” The former method was selected for this task because the domain is limited to secondary level student writing samples, which use a limited variety of writing structures. Previous work on this problem yielded better results in domain-specific contexts (Asghar, 2016), so tagging sentences containing cause-effect relationships in this context should be achievable with a high degree of accuracy.

The software is written in Perl. The following process is applied to the student writing sample for analysis (a minimal sketch of several of these steps follows the list):

  1. The text is “scrubbed” of extra consecutive spaces, HTML tags, and characters outside the normal alphanumeric ASCII range.
  2. The Flesch-Kincaid text complexity measure is calculated.
  3. The text is “lemmatized”, meaning words that have many variations are reduced to a root form (e.g., “is, am, are, were, was” are all turned to “be”; “cause, caused, causing” are all turned to “cause”).
  4. The text is “synonymized”, meaning words are changed to a single common synonym. The text is then separated into an array of sentences and all words are tagged by their part of speech.
  5. A variety of lexical and syntactic indicators of cause-effect are used in pattern matching to identify sentences that include a cause-effect relationship and extract them into an array.
  6. The resulting array of cause-effect sentences is converted into a “bag of words” without punctuation. Stop words are removed. All words are “stemmed”, meaning spelling variations are reduced to a common stem.
  7. Finally, both the original text and the array of cause-effect relationships are reduced further to a bag of unique words.
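
Here is a minimal Perl sketch of steps 1, 3, and 5. The lemma table and the list of lexical indicators below are toy stand-ins assumed for illustration; the production program presumably draws on much fuller word lists and syntactic patterns.

    # Sketch of steps 1, 3, and 5. The lemma table and the indicator list
    # are illustrative stand-ins for the program's fuller resources.
    use strict;
    use warnings;

    my %lemma = (
        is => 'be', am => 'be', are => 'be', were => 'be', was => 'be',
        caused => 'cause', causing => 'cause', causes => 'cause',
    );
    my @indicators = qw(because therefore thus consequently so);   # subset

    sub scrub {
        my ($text) = @_;
        $text =~ s/<[^>]+>//g;          # strip HTML tags
        $text =~ s/[^\x20-\x7E]//g;     # drop characters outside printable ASCII
        $text =~ s/\s+/ /g;             # collapse consecutive whitespace
        return $text;
    }

    sub lemmatize {
        my ($text) = @_;
        return join ' ', map { $lemma{lc $_} // lc $_ } split ' ', $text;
    }

    sub extract_cause_effect {
        my ($text) = @_;
        my @sentences = split /(?<=[.!?]) /, $text;
        my $pat = join '|', @indicators;
        return grep { /\b(?:$pat)\b/i } @sentences;
    }

    my $raw = '<p>The crops failed  because the rains were late.</p> Trade grew.';
    my @cause_effect = extract_cause_effect(lemmatize(scrub($raw)));
    # => ('the crops failed because the rains be late.')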

At this point, the computer program compares the bags of words. The resulting percentage is the proportion of unique words spent on cause-effect out of the total number of unique words. Recall that these are “bags of words” which have been lemmatized, synonymized, stemmed, and from which stop words have been removed.
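
The comparison in steps 6 and 7 can be sketched the same way. The stop list and the suffix-stripping “stemmer” below are crude placeholders for the real preprocessing.

    # Sketch of steps 6-7 and the final comparison. The stop list and the
    # suffix stripping are crude placeholders for the real preprocessing.
    use strict;
    use warnings;

    my %stop = map { $_ => 1 } qw(the a an and or of to in it be);

    sub unique_bag {
        my ($text) = @_;
        my %bag;
        for my $w (split /\W+/, lc $text) {
            next if $w eq '' or $stop{$w};
            $w =~ s/(?:ing|ed|s)$//;    # toy stemming
            $bag{$w} = 1;
        }
        return \%bag;
    }

    # These two strings would come from the earlier steps of the pipeline.
    my $whole_text = 'the crops failed because the rains be late. trade grew.';
    my $ce_text    = 'the crops failed because the rains be late.';

    my $whole = unique_bag($whole_text);
    my $ce    = unique_bag($ce_text);
    printf "%.0f%% of unique words spent on cause-effect\n",
        100 * keys(%$ce) / keys(%$whole);   # 5 of 7 unique words here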

Limitations of this Method

There are ways to express cause-effect relationships in English without using lexical indicators such as “because”, “thus”, or “as”. For example, one could express cause and effect this way: “It was raining very heavily. We put on the windshield wipers and we drove slowly.”

“Putting on the wipers” and “driving slowly” are caused by the heavy rain, yet no semantic or lexical indicators signal this. There are many challenges in dealing with “explicit and implicit causal relations based on syntactic-structure-based causal patterns” (Mirza, 2016). This algorithm does not attempt to identify this kind of expression of cause-effect; prior research in this area has shown limited promise to date (Mirza, 2016, p. 70).

Cause-effect is only one way to analyze. Differentiating (categorizing) and organizing (prioritizing, setting up a hierarchy) should also be addressed in future versions of this software. A student could compose a “richly” analytical piece without using cause-effect, although in this writer’s experience cause-effect is the most common form of analysis in the writing of this age group.

Analyzing a Corpus of Student Work

The New York State Education Department provides anchor papers for the Regents exams so that raters can have models of each possible essay score on a scale of one to five. Anchor papers are written by actual students during the field testing phase of the examination creation process. Sixty such anchor papers were selected for this study from collections of thematic and document-based essays available online at the New York State Education Department website archive (https://www.nysedregents.org/GlobalHistoryGeography/). Thirty came from papers scoring level five and thirty from papers scoring level two. Essays scoring five are exemplary and rare; papers scoring two are “passing” and represent the most common score. Essays are provided online in PDF format. Each one was converted to plain text using Google Drive’s OCR feature. Newline characters were removed, as was any text not composed by a student (such as header information). This constitutes the corpus.

The computer program analyzed each sample and returned the following statistics: number of cause-effect sentences found in the sample, the count of unique words “spent” on cause-effect relationships in the whole text, the count of unique words in the entire text, the percentage of unique words spent on cause-effect, the seconds it took to process, text complexity as measured by the Flesch-Kincaid readability formula, and finally a figure that is termed the “analysis score” and is intended to be a measure of “richness” in analysis in the writing sample.

An interesting and somewhat surprising finding came in comparing the corpus of level two essays to those scoring level five. There was no real difference in the percentage of unique words that students writing at these levels spent “doing” analysis of cause-effect. The mean percent of words spent on cause-effect relative to the unique words in the entire text was 46% in level five essays and 45% in level twos. There were no outliers, and the standard deviation was 0.09 for the level fives and 0.13 for the level twos. Initially, it seemed that essays of poor quality would show a much different figure, but this turned out not to be the case. What made these level two papers just passing was their length and limited factual content (recall that analysis is only one dimension on this rubric).

Text complexity is an important factor in evaluating student writing. The Flesch-Kincaid readability formula is one well-known method for calculating the grade level readability of a text. In an evaluation of the “richness” of a student’s use of analysis, text complexity is a significant and distinguishing feature. The “analysis score” is a figure intended to combine text complexity with words spent on cause-effect type analysis. It is calculated by multiplying the proportion of unique words spent on cause-effect by 100, and then by the grade level result of the Flesch-Kincaid formula. This measure yielded more differentiating results. In order to discover ranges of normal performance based on these models, the following statistics were calculated for each data set: lowest score (MIN), first quartile (Q1), median (MED), third quartile (Q3), and highest score (MAX).
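
As a sketch, with hypothetical input values: the Flesch-Kincaid grade-level formula below is the standard published one, and the quartiles use a simple nearest-rank rule that may differ from the exact method used in the study.

    # Sketch of the analysis score and the five-number summary.
    # The input values below are hypothetical examples.
    use strict;
    use warnings;

    # Standard Flesch-Kincaid grade-level formula.
    sub fk_grade {
        my ($words, $sentences, $syllables) = @_;
        return 0.39 * ($words / $sentences) + 11.8 * ($syllables / $words) - 15.59;
    }

    # (proportion of unique words spent on cause-effect) x 100 x FK grade level
    sub analysis_score {
        my ($share, $grade) = @_;
        return $share * 100 * $grade;
    }

    my $score = analysis_score(0.46, 9.2);   # => 423.2

    # Five-number summary via a simple nearest-rank quantile rule.
    sub five_number_summary {
        my @s  = sort { $a <=> $b } @_;
        my $at = sub { $s[int($_[0] * $#s + 0.5)] };
        return (MIN => $s[0], Q1 => $at->(0.25), MED => $at->(0.5),
                Q3  => $at->(0.75), MAX => $s[-1]);
    }

    my %range = five_number_summary(310, 355, 402, 423, 468, 512);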

If this corpus of sixty essays can be considered representative, then these ranges can serve as standards for assessing the richness of secondary level student analysis in a writing sample. These figures can be used to devise a rubric. On a scale of one to four, where four is the highest valued sample, the following ranges are derived from the combined statistics of all sixty essays:

[Table of analysis score ranges for rubric levels one through four.]

Incorporation of Cause-Effect Assessment into AI-Assisted Grading

The artificially-intelligent grading assistance provided to subscribers at InnovationAssessments.com estimates, to date, grades for student composition work by comparing eleven text features of the student sample with those of the most similar model answer in a corpus of one or more model texts. In cases where expository compositions are valued more highly for being “analytically rich”, incorporating this cause-effect function could refine and enhance AI-assisted scoring.

First, the algorithm examines the model in the corpus most similar to the student sample. If the analysis score of that model text is greater than or equal to 419, then analysis is assumed to be a valued feature of the response. In this case, an evaluation of the “analytical richness” of the student’s work is incorporated into the scoring estimate. Samples that are more analytical have a greater chance of scoring well.
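
In code, the gate might look like the sketch below. Only the 419 threshold comes from the text; the function name, the ratio cut-offs, and the bonus values are invented here for illustration.

    # Sketch of the gating logic. Only the 419 threshold comes from the
    # text; the ratio cut-offs and bonus values are invented placeholders.
    use strict;
    use warnings;

    my $ANALYSIS_THRESHOLD = 419;

    sub adjust_for_analysis {
        my ($estimate, $model_score, $student_score) = @_;
        # Only weigh analysis when the best-matching model is itself analytical.
        return $estimate if $model_score < $ANALYSIS_THRESHOLD;
        # Reward samples whose analytical richness approaches the model's.
        my $ratio = $student_score / $model_score;
        return $estimate + ($ratio >= 1.0 ? 5 : $ratio >= 0.75 ? 3 : 0);
    }

    print adjust_for_analysis(85, 430, 395), "\n";   # => 88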

Conclusion

An artificially intelligent grading program for secondary student expository writing that includes an evaluation of the richness of analysis in that text would be very valuable. Cause-effect statements are indicators of analysis. The algorithm described here identifies and extracts these sentences, processes them for meaningful analysis, and judges the quality of the student’s analysis with a number that combines a measure of the proportion of words spent on analysis with text complexity. An analysis of sixty samples of student writing yielded a range of scores at four levels of quality for use in artificial grading schemes. While this algorithm does not detect all varieties of cause-effect relationships, nor even all types of analysis, its incorporation into already established artificial scoring programs may well enhance their accuracy and reliability.

Sources

Abrams, D. (2004). Revised Generic Scoring Rubrics for the Regents Examinations in Global History and Geography and United States History and Government (field memo). Retrieved from http://www.p12.nysed.gov/assessment/ss/hs/rubrics/revisedrubrichssocst.pdf

Asghar, N. (2016). Automatic Extraction of Causal Relations from Natural Language Texts: A Comprehensive Survey. Retrieved from https://arxiv.org/pdf/1605.07895.pdf

Krathwohl, D. (2002). A Revision of Bloom’s Taxonomy: An Overview. Retrieved from https://www.depauw.edu/files/resources/krathwohl.pdf

Mirza, P. (2016). Extracting Temporal and Causal Relations between Events. doi:10.13140/RG.2.1.3713.5765

Sorgente, A., Vettigli, G., & Mele, F. (2013). Automatic Extraction of Cause-Effect Relations in Natural Language Text. Retrieved from http://ceur-ws.org/Vol-1109/paper4.pdf

21st Century Learning Spaces: Asynchronous Discussion Forum

My first experience with asynchronous discussion forums came in courses I was taking myself online through Empire State College a number of years ago. Many readers will recognize the assignment: given a prompt, students are to post their response and then reply to the responses of a number of other students in the class. Typically, there was a deadline by which these discussions had to take place. I liked the exercise and I found it useful to address the course material.

I would invite the reader to read my earlier post on synchronous chat, which presents some of the research on online discussion and chat.

Promoters of asynchronous discussion forums rightly point out that this task brings greater participation than face-to-face class discussions do. Whereas face-to-face participation may be dominated by an extroverted few or limited in other ways, the online forum brings everybody in. Asynchronous discussions leave time for research and reflection that is not practical in the face-to-face class. There are also some practical considerations for students at the middle and high school level that are not usually issues at the college level.

My Experience

I used asynchronous forum discussions in my middle and high school social studies classes for a decade, in each unit of study. In my context, students were assigned a persuasive prompt on which they were expected to take a position and post two supporting reasons. Next, they were assigned to present the opposing view to another student (even if it did not match their actual personal views), and finally they were to defend their original position in reply to the student who had been assigned to present the opposing view to them.

Sample 7th Grader Exchange

Seventh and eighth graders needed training right off the bat, naturally. Accustomed to social media, their early contributions were vapid and full of emojis and “txt” language. It was important to remind them that this was a formal enterprise and that standard English conventions held. It was often difficult to get them to elaborate their ideas toward the 200-word goal set for their opening post.

Not the kind of thing I was looking for!

I was working in a small, rural school where I would have the students from grades seven through ten, so I could see their work develop over the years.

By the end of 9th grade, posts became more sophisticated.

I found it to be a good practice to offer the highest marks to those who provided evidence and cited a source. I coded a citation generator right in the forum app to encourage this.

Grading the Posts

Scoring these can be labor intensive for no other reason than the layout of the forum itself. The page is designed for reading and responding, but this does not work well for scoring because there is too much scrolling and searching necessary to view posts and replies.

The scoring app makes it easy for the teacher to view the rubric, the student’s posts, and their replies to others in one place. Analysis tools let the teacher see how many posts were made, when they were made, and even the readability level of the contributions.
My early discussion grading rubric.
The grading rubric I adopted later on.

Practical Issues

The main problem I encountered with this assignment was that, at first, students would forget to complete it. I resolved this by assigning it in class and giving time. For example, on the first day I would present the prompt and instruct students to post their positions that class period before continuing with the day’s other work. The following day, students would have time to post their replies, and finally on a third day they would post their defense.

Another issue that came up was getting everyone the needed number of replies. Some posts would attract more replies than others, and some students needed a reply so they could offer a defense. The solution was to modify the assignment and declare that, once one has posted, one is obliged to offer the opposing view to the person above in the forum feed.

Interestingly, these assignments also led to spontaneous face-to-face class discussions, sometimes with me and sometimes within a group. Although this may have been somewhat distracting for students in the class working on other things, we found some compromise to allow these spontaneous interactions to proceed without disrupting the other work much. These were golden opportunities: conversations of enormous educational benefit that are so hard to initiate and encourage artificially.

I came to regard the discussion each unit as a sort of group persuasive writing effort. I included training in grade eight in persuasive writing and logical fallacies. The discussion app here at Innovation has a feature which allows readers to flag posts as committing a logical fallacy.

The Innovation Discussion Forum App is a 21st Century Learning Space

  • Guardrails: The app lets the teacher monitor all conversations and delete problematic ones.
  • Training Wheels: The teacher can attach a grading rubric and sample posts. I used to post first under a pseudonym to whom the first student could reply. Additionally, weaker students can peruse the posts of stronger students in an effort to get a clear picture of the kinds of opinions that can be had on the issue.
  • Debriefing: Debriefing is easily achieved by projecting the discussion screen on the front board. Students’ posts in this task are not anonymous.
  • Assessment and Feedback: The scoring app is very efficient and highly developed from years of use. The teacher can view all of the student’s posts and replies without having to scroll across the entire platform. Analysis tools reveal the readability of the text, how much students wrote, and how analytical it is.
  • Swiss Army Knife: The discussion app lends itself well to more in-depth persuasive writing assignments such as an essay.
  • Locus of Data Control: The student submissions are stored on a server licensed to the teacher’s control. Commercial apps such as Facebook and Twitter may be less dedicated to the kinds of privacy and control exigencies of education.

Ideas in Closing

Asynchronous discussions are great – my students and I enjoyed these tasks. It is my view that the higher level thinking demanded by persuasion and debate (Bloom’s evaluation level of the cognitive domain) really enhances long-term memory of the content. I cannot emphasize enough the value of these kinds of higher-order tasks. Working in a 21st century learning space promotes the participation of everybody.

Automated Scoring of Secondary Student Summaries and Short Answer Tests

Introduction

Research and development of software to harness artificial intelligence for scoring student essays faces many significant obstacles. Using machine learning techniques requires massive amounts of data and computing power far beyond what is available to the typical secondary public school. The cost and effort to devise such technology does not seem to be juice worth the squeeze, since it is still more time efficient and cost effective to just have a human do the job. However, the potential exists to devise AI-assisted grading software whose purpose is to increase the speed and accuracy of human raters. AI grading that is “assisted” applies natural language processing strategies to student writing samples in a narrowly defined context and operates in a mostly “supervised” fashion. That is, a human rater activates the software and may make scoring judgments with the advice provided by the AI. A promising area for this more narrowly contextualized application of artificially intelligent natural language processing is in scoring summaries and short answer tests. This also poses interesting possibilities for automated coaching for students while they write. This study examines a set of algorithms that derives a suggested score for a secondary level student summary or short answer test response by comparing the student work with a corpus of model answers selected by a human rater. The human rater stays on duty throughout the scoring process, adding full credit student work to the corpus so that the AI is trained, and making the final score selections.

Features of Text for Comparison

The AI examines the following text characteristics to evaluate a student work by comparison to one or more models (a sketch of two of the similarity measures follows the list):

  • “readability” as determined by the Flesch-Kincaid readability formula
  • the percent difference in number of unique words after preprocessing (“preprocessing” refers to text that has been scrubbed of irrelevant characters like HTML tags and extra spaces, then lemmatized, synonymized, and finally stemmed)
  • intersecting noun phrases
  • Jaccard similarity
  • cosine similarity of unigrams
  • cosine similarity of bigrams
  • cosine similarity of trigrams
  • intersecting proper nouns
  • cosine similarity of T-score
  • intersecting bigrams as percent of corpus size
  • intersecting trigrams as percent of corpus size
  • analysis score (see note below)
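
To make two of these measures concrete, here is a minimal Perl sketch of Jaccard similarity over sets of n-grams and cosine similarity over n-gram counts, assuming both texts have already been preprocessed as described above. The example strings are invented.

    # Sketch of two of the listed measures on preprocessed text: Jaccard
    # similarity of n-gram sets and cosine similarity of n-gram counts.
    use strict;
    use warnings;
    use List::Util qw(sum);

    sub ngrams {
        my ($text, $n) = @_;
        my @w = split ' ', lc $text;
        my %count;
        $count{ join ' ', @w[$_ .. $_ + $n - 1] }++ for 0 .. $#w - $n + 1;
        return \%count;
    }

    sub jaccard {
        my ($u, $v) = @_;
        my %union = (%$u, %$v);
        my $inter = grep { exists $v->{$_} } keys %$u;
        return keys(%union) ? $inter / keys(%union) : 0;
    }

    sub cosine {
        my ($u, $v) = @_;
        my $dot = sum(0, map { ($u->{$_} // 0) * $v->{$_} } keys %$v);
        my $nu  = sqrt(sum(0, map { $_ ** 2 } values %$u));
        my $nv  = sqrt(sum(0, map { $_ ** 2 } values %$v));
        return ($nu && $nv) ? $dot / ($nu * $nv) : 0;
    }

    my $student = ngrams('the war cause economic hardship', 2);
    my $model   = ngrams('the war cause widespread economic hardship', 2);
    printf "bigram cosine: %.2f  jaccard: %.2f\n",
        cosine($student, $model), jaccard($student, $model);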

The program first compares the student text to each model using cosine similarity of n-grams. The most similar model in the corpus is then compared to the student work in detail. Four hundred twenty-six short answer responses that had been scored by a human rater were compared using the algorithm. From these results, scoring ranges were developed within each text feature typifying scores of 100, 85, 65, 55, and 0. Outliers were removed from the dataset. Next, sets of student summaries were scored using the ranges for each text feature and the program’s scoring accuracy was monitored. With each successive scoring trial, the profiles were adjusted, sometimes more intuitively than methodically, until over the course of months the accuracy rate was satisfactory.

When analyzing a student writing sample for scoring, the score on each text feature is compared to the profiles, and the program keeps a tally of matches for each scoring category (100, 85, 65, 55, and 0). The “best fit” is the first stage of the suggested score. Noun phrases, intersecting proper nouns, and bigram cosine were found to correlate most highly with matching the human rater’s score, so an additional calculation is applied to the profile scores to weight these factors. Next, a set of functions calculates partial credit possibilities for scores in the categories of 94, 76, and 44 using statistics from the data analysis of the original dataset of 426 samples. Finally, samples where analysis is important in the response have their score adjusted one final time.
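
A sketch of how the tally might work follows. The feature ranges and weights shown are invented placeholders; the real profiles were fitted to the 426-sample dataset and are not published.

    # Sketch of the "best fit" tally. The ranges and weights are invented
    # placeholders; the real profiles were fitted to the 426-sample dataset.
    use strict;
    use warnings;

    # Per-category profiles: feature => [min, max] range typifying that score.
    my %profiles = (
        100 => { bigram_cosine => [0.60, 1.00], proper_nouns => [4, 99] },
         85 => { bigram_cosine => [0.45, 0.60], proper_nouns => [3, 99] },
         65 => { bigram_cosine => [0.30, 0.45], proper_nouns => [2, 99] },
         55 => { bigram_cosine => [0.15, 0.30], proper_nouns => [1, 99] },
          0 => { bigram_cosine => [0.00, 0.15], proper_nouns => [0, 99] },
    );

    # Features that correlate most with the human rater get extra weight.
    my %weight = ( bigram_cosine => 2, proper_nouns => 2 );

    sub best_fit {
        my (%features) = @_;
        my %tally;
        for my $cat (keys %profiles) {
            $tally{$cat} = 0;
            for my $f (keys %features) {
                my $range = $profiles{$cat}{$f} or next;
                my ($lo, $hi) = @$range;
                $tally{$cat} += ($weight{$f} // 1)
                    if $features{$f} >= $lo and $features{$f} <= $hi;
            }
        }
        my ($best) = sort { $tally{$b} <=> $tally{$a} or $b <=> $a } keys %tally;
        return $best;   # highest tally wins; ties go to the higher category
    }

    print best_fit(bigram_cosine => 0.52, proper_nouns => 3), "\n";   # => 85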

Note: the “analysis score” is a metric devised to evaluate the “analytical richness” of a student writing sample, as described in the preceding paper.

The development of the scoring ranges for text features proceeded somewhat methodically and at times more intuitively or organically. Over the course of months, when error patterns in AI scoring became apparent, adjustments were made to improve performance. Natural language processing, even at this basic level, is very demanding on computer memory and processing resources. At this writing, the server running this software has 6GB of RAM, and work is often being done on the code to reduce processing time. One strategy is to store both “raw” and processed versions of the student work products as they are written so that processing time can be shortened at the end. The corpus of model responses is also saved in this way.

Training the AI

Upon creation of an assignment, the teacher can save model responses to the corpus. Once students have completed the assignment, the teacher can begin by reviewing and scoring the work product of students who usually score full credit. Upon confirming that these are indeed full credit models, the teacher can click a button to add the student sample to the corpus of model answers. The software limits the teacher to five models in short answer testing and seven models in composition assessment.

Once the AI is trained, the teacher can run the scoring algorithm on each student submission. At this writing, processing takes about nine seconds per sample on average, depending on the text size. This program works best for assignments where there is a narrow range of full credit responses. Its primary purpose is to score writing samples by comparison to a limited number of full credit responses. Its strength is in recognizing similar meaning across texts that say the same thing in varying ways. This program does not assess spelling or technical / mechanical writing conventions, although it does rely on student accuracy to the extent that adherence to certain conventions is necessary for the program to operate. Examples: proper noun count requires that students capitalize them; sentence count requires that students apply standard rules of punctuation.

21st Century Learning Spaces: Debriefing Kit

The debriefing is a powerful tool for teaching to which students readily respond. I have had students tell me they really felt they benefited from these activities.

In general, the debriefing is a lesson that consists of analyzing student errors and offering corrections. Naturally, this is done anonymously so as to avoid embarrassment. It is particularly useful in teaching writing, computer programming, and similar complex tasks that can be broken down into smaller skill sets for training.

For example, when I teach French composition, I select errors from student compositions and present them anonymously to the class. I explain the error, I correct the error, and students then proceed to practice recognizing and correcting the error themselves. Innovation has a number of these lessons for sale at one of our online stores.

By way of another example, when teaching social studies, I help students develop skills for analyzing historic documents using constructed response tasks. This assignment calls upon students to provide historical or geographical context for a document and then to analyze its reliability and its relationships with other documents, such as cause-effect, turning point, or compare and contrast. Especially for the reliability element, it is useful to display student work, both strong and weak, for commentary and analysis.

If you’ll indulge a final example, when I teach persuasive writing I like to display student samples in class and we can practice together identifying claims, warrants, rebuttals, and so forth. We can weigh the strength of arguments and of writing style.

21st century learning spaces are designed to facilitate debriefing for all sorts of tasks. Since this is a key feature of my own teaching practice, it is really baked in to the Innovation platform:

  • Multiple-choice: Teachers can start up a “live session” after a test to review. In the live session, the host displays the question and students join the session from their own devices and interact. (Kahoot! is a well-known example).
  • Short Answer: Teachers can initiate a “live session” for short answer that works the same way.
  • Jeopardy-Style Review: It is easy to select questions from a set of recent tasks such as quizzes or short answer prompts and then generate a Ventura game.
  • Analytics: Innovation has a complete set of analytics tools for all online tasks. This includes multiple-choice and short answer item analysis, standardized (“curved”) grading functions, and statistical analysis tools to evaluate and compare assessments. Analytics tells teachers what to debrief; what has priority for review and remediation.
  • Item Analysis: The test “master” for each assessment presents an item analysis of student work and a ready-to-display version of the test.

Debriefing lesson planning can be very arduous. It can be time-consuming to create a slide show or document with copy-pasted elements of student work submissions for analysis. The Innovation platform facilitates this in multiple ways with a few clicks, in true form to a strong 21st century learning space.

21st Century Learning Spaces: Guardrails and Training Wheels

When digital natives, native to the world of online commerce, gaming, and entertainment (digital commercial spaces), come to the 21st century learning space, they bring with them customs from their native shores that are maladaptive. Guardrails are features of software applications that prevent students from engaging in counterproductive activity. Training wheels are app functions that assist students to meet their objectives by coaching, scaffolding, and offering interim assessment of progress.

Focusing Attention

For one thing, the native of yonder shore is accustomed to dividing their attention continuously from one phenomenon to the next. They call it “multi-tasking”, but we know in our land that this is a myth. On social media, advertisers call out to them like hawkers in a busy marketplace. In video games, the constant drive toward increased and sustained stimulation calls their attention elsewhere each moment. Even passive entertainment programs (what we used to call “TV shows”) change scene every few bewildering seconds. Notifications and popups clamor for attention at frequent intervals. Often equipped with multiple devices (phone and laptop), the native of digital commercial space is drawn from one virtual event to the next … text from a friend … notification of an email message … ads offering discounts on the item recently searched …

Distractibility is the principal maladaptive trait for the 21st century learning space. An unwillingness to ignore and delay some stimuli in favor of sustained attention to one task is the first transformation the native of digital commercial space needs to make. The mechanism of learning, of activity in the working memory that leads to encoding into long-term memory, is not well served by constant interruptions. Studies in cognitive load reinforce the idea that, while limits vary from individual to individual, there are limits to what can be held in working memory and that overload means information loss.

21st century learning spaces include guardrails to help focus attention and train executive functioning. There are a variety of ways to do this. Third party apps that force students to share their screens with the teacher and which limit the number of browser or window tabs that can be opened are key. Apps should react to loss of focus, such as a multiple-choice test that locks up if a student opens another browser window or one which reports this activity to the teacher. Video monitoring software can track when a student starts, pauses, and stops an embedded video for study.

Academic Honesty

Academic honesty is a new dialect that natives of the commercial world need to learn to speak. In that environment, liberal copy-paste and derivative creation is almost de rigueur. The 21st century learning space provides some guardrails and training wheels. For guardrails, there are apps that check for plagiarism, and app features such as recording and reporting student pastes and right-clicks in the working space.

Evaluating source material is more important in the 21st century learning space than in commercial country. For training wheels, there are apps that guide students in the customary features of a reliable source and that automatically check for errors. Citation generators teach the standard format of source citation in various disciplines.

Coaching and Tutoring by an Algorithmic AI

Studying sometimes means learning information, studying facts, old-fashioned memorization. In 21st century learning spaces, apps for this purpose have features that allow the student to limit the number of items to learn at a time. They also manage the items being studied such that things the student has already learned are hidden away from view so that energy is focused on what has not yet been learned.

The algorithmic AI in the tutor app at Innovation trains students in keywords to remember. The app discards questions students get right so they only work on those they do not yet know.

Composing longer text responses can benefit from coaching. For example, the algorithmic AI at Innovation can be easily trained to provide students immediate feedback on the composition of a summary or an outline. This is an important example of training wheels that supports skill development. The AI can detect copy-paste from an article as well, so as to provide a guardrail against plagiarism.

The algorithmic AI can coach students on composing a summary and estimate the grade they would earn on the work as it progresses.

Accountability: Tracking Activity

In the traditional physical classroom, we can track students’ activities and refocus them when they are misdirected. Once we enter the digital world, it is important that 21st century learning spaces permit teachers to maintain the same level of supervision. Such spaces need to include extensive auditing capabilities to see when students log in, start a task, finish a task, how long they spent on the task, their scores on assessments, and so forth.

Sample fragment of an audit at Innovation showing student activity. These reports can be shared directly with parents.

Support Staff

21st century learning spaces facilitate support staff participation. Software features should easily allow teaching assistants and parents to access selected students’ on-task audits, assignments, scores, and so forth. Proctors for tests in a separate location benefit from access codes allowing them to easily support student learning and testing security.

At Innovation, teachers can let teaching assistants and parents access coursework and student information.

Special Education

Individualized education programs (IEPs) offer students the equal opportunity afforded by testing modifications. A 21st century learning space will have these modification options built right in.

Innovation has a number of features to support program and testing modifications:

  • Feature that allows a proctor to unlock and monitor tests
  • Automated extended time on tests
  • Option to attach an alternative, lower-level reading assignment to standard tasks
At Innovation, teachers can set testing accommodations like extended time for individual students in compliance with IEPs.

Guardrails and Training Wheels

21st century learning spaces stand in contrast to commercial digital spaces in providing the support systems that middle and high school students need developmentally. If your experience is like mine, you will find the classroom learning environment much tamer and more effective with these elements in place. Trying to apply apps designed for a commercial environment (sales, games, social media) leads to a wild west effect in classrooms, where learning opportunities are lost to distractions.

21st Century Learning Spaces: The Paradigm

The premise of the 21st century learning space concept is that co-opting software applications and devices that were designed for entertainment, socializing, or commerce is a less-than-perfect model for education. The promise of technology for education is realized when the app design meets the needs of an educational community. Five interrelated characteristics of the 21st century learning space that I propose are:

  • Training Wheels
  • Guardrails
  • Debriefing Kit
  • Swiss Army Knife
  • Locus of Data Control

Training Wheels

The development of generative AI and lesser algorithmic AI both offer opportunities to aid the instructor in one of the core strategies of teaching: break it down into manageable pieces to master the goal. 21st century learning spaces could include coaching on spelling, grammar, and even content.

Computer software opens the door to more efficient content management. Teachers curating their classroom resources online have organizational tools that exceed old fashioned binders and notebooks. Addressing the needs of students with disabilities is a key efficiency of 21st century learning spaces: presenting modified texts and assignments becomes more manageable.

Training wheels are temporary assistive devices for young people learning new things. They are a modification to the program that is usually temporary; a scaffolding that brings students upward in the zone of proximal development.

Students have the tools they need to manage their own learning experiences.

21st century learning spaces incorporate a system of badges and rewards as well as provide visualization of students’ progress and achievements.

Guardrails

Young people are easily distracted, especially since their main use of digital devices has been entertainment. 21st century learning spaces have guardrails to limit distractions and develop executive functioning. Examples of such features include extensive logging of online activity in the learning space, a system of scoring and accountability, and a “proctor” feature that tracks student interaction with the content.

Plagiarism has never been easier than in the digital realm. Guardrail features of educational apps help prevent academic dishonesty by making it harder to go undetected.

Moderated social engagement apps reinforce learning through shared experiences, discussions, and study groups with confidence that inappropriate content is avoided.

Guardrails are there to protect us from error, safety features along the road at dangerous points to avoid a pitfall.

Debriefing Kit

In a learning community, it is helpful to study our errors to learn from them. This is especially useful in teaching writing, but it has applications to all subjects. Anonymity is very important: if we’re going to display student errors for analysis, everyone must be confident and assured that no one will be humiliated.

Learning analytics available to teachers in 21st century learning spaces provide detailed information about student progress to inform lesson plans and follow-up.

Creating debriefing lessons is time consuming. For example, when I taught social studies I would display anonymous passages from student essays to work on form or content in a whole class activity. When I taught French, I found it very useful to display selected sentences from compositions for correction or improvement.

21st century learning spaces lend themselves to debriefing: they are designed such that the anonymous presentation of teacher-selected student work is easily generated for debriefing.

Swiss Army Knife

Saved data exists in database tables in the digital world. 21st century learning spaces should leverage this flexibility to facilitate lesson planning in multiple modes. Multiple-choice questions can become short answer questions, test questions can become Jeopardy review games, notes taken in lecture can inspire questions for discussion, and so forth. All this should be easy and fast.

21st century learning spaces are a Swiss army knife. Such collections of applications serve many functions from the same core.

Locus of Data Control

When you post to Facebook, Twitter, or any other public commercial platform, where is your data? If you use Facebook to moderate a class discussion, what control do you, the teacher, have over your students’ contributions?

21st century learning spaces are those where the teacher rules the roost and student privacy protection is a high priority. In this paradigm, student work is licensed to the teacher’s control for a specified period, after which it is auto-deleted. Inappropriate content posted by students can be edited, hidden, retained for investigation by authorities, or deleted per the instructor’s decisions.

In the commercial domain, data is the valuable commodity used by tech companies. Our data. It is important that student work and teachers’ intellectual property are kept in safe digital locations and under the teacher’s control.

Reinventing the Term “Digital Native”

Marc Prensky’s 2001 article “Digital Natives, Digital Immigrants” sparked a lot of conversation, even debate, about the use of computers in education. Mr. Prensky proposed that students who grew up with digital devices integrated into their lives “think and process information fundamentally differently from their predecessors”, and he posited that it is “very likely that our students’ brains have physically changed” as a result of how they grew up (Prensky, 2001). Prensky characterized the older generation as digital immigrants, whose more limited command of computer use was an obstacle to teaching these digital natives. His recommendations focused on what we would now call “gamification” of learning; “edutainment”.

In the two decades since the coining of the term, information and communication technology (ICT) has changed, and many legitimate challenges arose to Prensky’s depiction.

I need a term for young people with extensive experience and skill with the digital world of commerce, entertainment, and social media. I would like to borrow Marc Prensky’s term “digital native”.

ICT Skills

The digital native is said to “possess sophisticated knowledge of and skills with information technologies” and to “have particular learning preferences or styles that differ from earlier generations of students” (Bennett, et al. 2008).

The ensuing decades saw some challenge to these notions. Scholars posed legitimate challenges to claims that rested on limited and anecdotal evidence (Bennett, et al. 2008). Further research in the first decade after Prensky’s papers, while confirming the near ubiquitous use of digital devices by adolescents, found that “a significant proportion of students had lower level skills than might be expected of digital natives” and that “only a minority of the students (around 21%) were engaged in creating their own content and multimedia for the Web” (Kvavik, Caruso & Morgan, 2004, as cited in Bennett, et al. 2008).

I would propose that computer software, especially in this early period of the 2000s but even still today, has been designed primarily for commerce, entertainment, and socializing. This is what Prensky and his supporters were seeing students use; skills in using this software were what students were developing. I submit that software designed to sell, entertain, and socialize has some features that are not supportive of an effective educational tool. Education needs platforms that engage students in what we know to be good learning practices. Prensky, in my view, was asking teachers to bend their lessons away from sound educational practice to match the entertainment that students were used to experiencing when using computer technology. I believe this was his mistake.

Thinking and Information Processing

Prensky posits that the extensive use of digital devices has changed how students think and process information. Digital natives are characterized as “accustomed to learning at high speed, making random connections, processing visual and dynamic information and learning through game-based activities” as well as multi-tasking (Bennett, et al. 2008). Not only do these assertions lack evidence, but I would argue that none of them lend themselves to sound learning practice. Multi-tasking does not work for human beings (Napier, 2014), since it interferes with encoding in long-term memory and increases cognitive load.

In addition to the lack of evidence that digital natives’ thinking is significantly different from that of previous generations, Bennett et al. note that “the claim that there might be a particular learning style or set of learning preferences characteristic of a generation of young people is highly problematic.”

While there is some overlap between engaging in digital experiences for commerce, entertainment, and social media and for education, experience has taught me that there are some fundamental differences. Digital teaching platforms should reflect sound educational practice and practical application to classrooms.

But we know that something is different. Those of us who were born before 1980 can attest to changes in adolescents associated with the digital age. In my view, this shift is better understood in less dramatic terms: as a cultural change of the sort that has been going on in civilization for millennia. The point is well taken that educators should adapt to this kind of cultural shift. However, where I diverge from Prensky is here: effective teaching practice does not mean that we adopt the ways of commerce, entertainment, and social media in wholesale fashion.

Beyond the myth of the “digital native” (2019) by Carlos A. Scolari

What are young people doing with media? An alternative framework for understanding how adolescents use technology is one which maps out the skills adolescents possess across different digital media. “[T]ransmedia skills are understood as a series of skills related to the production, management and consumption of digital interactive media” (Scolari, 2019).

“[T]ransmedia literacy turns [the] question around and asks what young people are doing with the media. Instead of considering young people as consumers taken over by the screens (television or interactive screens, large or small), they are considered ‘prosumers’ able to generate and share media content of different types and levels of complexity” (Scolari, 2019).

“[T]he concept of “digital native”, understood as a young person who “comes with a built-in chip” and who moves skillfully within digital networked environments, shows more problems than advantages” (Scolari, 2019).

In terms of transmedia skills, we still have some things to teach students. Strong skill sets are not evenly distributed (Scolari, 2019). Students come to us skilled in the areas they use most, engaging in entertainment, commerce, and informal social interaction. Skills linked to production are usually strong in adolescents, but those related to ideology and values are more limited.

Scolari writes: “at an individual level, a young person who demonstrates that they have advanced photographic production skills (creation of memes) or audiovisual management skills (a YouTube channel) can, at the same time, have less developed abilities in, for example, detecting stereotypes or managing privacy.”

A Revised Definition

We who have been teaching for decades know that there is a cultural shift going on that is related to information and communication technology. The term “digital native” coined by Prensky lacks a firm foundation and may well be more a reflection of the time it was conceived than anything else. In addition, it seems to me that the current generation of beginning teachers are in no way digital immigrants, having themselves grown up with extensive digital experience.

Adolescents use technology extensively and this does affect their starting point for education. We teachers can capitalize on the skills they possess already in the classroom while refining those they may lack and discouraging adolescent practices that are detrimental (like attempting to multi-task).

Let us redefine the “digital native” as the typical adolescent who has an uneven skill set for media, having come from immersive digital experiences in games, commerce, and social media. Let us recognize that, while there is some overlap, the approach students need to be taught to cultivate toward digital devices as learning tools has some important and very fundamental differences from what they have done before. The platforms we use to teach them in the digital world should reflect this. I would term these apps “21st century learning spaces”.

Teaching to the Test versus to the Standard

Striking a Balance: Leveraging Innovation Teaching Platform for Effective Pedagogy

In the ever-evolving landscape of education, the debate between teaching to the test and teaching to the standard persists, presenting educators with a perpetual challenge: how to prepare students for standardized assessments while also ensuring they acquire the essential knowledge and skills outlined in educational standards. Fortunately, innovative teaching platforms like Innovation Assessments offer a multifaceted approach to pedagogy that can help educators navigate this delicate balance and promote best practices in teaching and learning.

Utilizing Comprehensive Learning Management:

At the heart of the Innovation teaching platform lies a comprehensive learning management system designed to streamline content creation, assessment, and student engagement. By leveraging this robust platform, educators can seamlessly align their instruction with established educational standards while also incorporating targeted test preparation strategies.

Balancing Test Preparation and Conceptual Understanding:

Innovation Assessments provides educators with the tools to integrate test preparation within a broader framework of standards-aligned instruction. Through features such as customizable assessments, practice drills, and automated grading, educators can ensure that students receive targeted support in mastering both the content knowledge and test-taking skills necessary for success on standardized assessments.

Fostering Inquiry-Based Learning:

One hallmark of effective teaching practice is the promotion of inquiry-based learning, which encourages students to explore, question, and construct their understanding of the world around them. With its diverse array of multimedia resources, interactive activities, and collaborative tools, the Innovation platform empowers educators to create engaging learning experiences that foster curiosity, critical thinking, and problem-solving skills in students.

Personalizing Instruction to Meet Diverse Needs:

Recognizing that every student is unique, the Innovation platform offers personalized learning features that allow educators to tailor instruction to meet the individual needs, interests, and learning styles of their students. From adaptive assessments that adjust to students’ proficiency levels to customizable learning pathways that provide targeted remediation and enrichment, educators can ensure that each student receives the support and challenge they need to succeed.

Empowering Educators with Data-Driven Insights:

In addition to facilitating student learning, the Innovation platform equips educators with valuable data-driven insights into student performance, engagement, and growth over time. By analyzing student data and trends, educators can identify areas of strength and weakness, inform instructional decision-making, and implement targeted interventions to support student success.

Promoting Continuous Professional Growth:

Lastly, the Innovation platform serves as a catalyst for educators’ continuous professional growth by providing access to a wealth of resources, professional development opportunities, and collaborative communities. Through ongoing training, peer collaboration, and reflective practice, educators can refine their pedagogical approaches, stay abreast of emerging best practices, and ultimately elevate the quality of instruction in their classrooms.

In conclusion, while the debate between teaching to the test and teaching to the standard may persist, innovative teaching platforms like Innovation Assessments offer a holistic solution that transcends this dichotomy. By harnessing the power of technology to integrate test preparation within a broader framework of standards-aligned instruction, foster inquiry-based learning, personalize instruction, empower educators with data-driven insights, and promote continuous professional growth, the Innovation platform emerges as a potent tool for promoting best practices in teaching and learning. As educators embrace this multifaceted approach, they can navigate the delicate balance between preparing students for standardized assessments and equipping them with the knowledge, skills, and dispositions needed for success in the 21st century and beyond.

21st Century Learning Spaces: Synchronous Chat

When I was developing an app for synchronous chat, my eighth, ninth, and tenth graders were only too happy to be my beta-testers. It was in the last month before I was to retire and so I wanted to make good use of my time remaining, especially preparing students for the conversation part of the regional world language examination in French. The chat app arose out of the desire for an effective method for students to communicate in the lesson in a paired situation, in a 21st century learning space.

Synchronous Online Discussion in a Co-located Classroom Setting

A number of advantages to blending online discussion tools into the classroom present themselves. In peer face-to-face interactions, “student differences in social status, verbal abilities and personality traits cannot guarantee equal participation rates” (Chinn, Anderson, & Waggoner, 2001, as cited in Asterhan & Eisenmann), and “[h]igh-status, high-ability and extrovert peers may often dominate the discussion and group decision making” (Barron, 2003; Caspi et al., 2006, as cited in Asterhan & Eisenmann). Online discussion tools can reduce these factors and present a more egalitarian framework for participation.

Having students in the same room communicating with each other on a chat system may seem odd at first glance, but in addition to the benefits noted above, there are some practical benefits, especially at the secondary level. The presence of an adult ensures more on-task and more appropriate behavior (no “flaming”, for example). Students may not all have the equal access to home internet services that an asynchronous model would demand. Furthermore, the synchronous model greatly increases the likelihood that the task will get done; asynchronous assignments often fall prey to procrastination, a typical foible of the adolescent. A literature review by Asterhan and Eisenmann reveals that “[c]ommunication in synchronous discussion environment is closer to spoken conversation and therefore likely to be more engaging and animating than asynchronous conferencing (McAlister, Ravenscroft, & Scanlon, 2004, as cited in Asterhan & Eisenmann). Students have also been found to be more active and produce more contributions in synchronous, than in asynchronous environments (Cress, Kimmerle, & Hesse, 2009, as cited in Asterhan & Eisenmann).”

When used during the class period, synchronous chat is a small part of a larger lesson which includes scaffolding, participation, and debriefing.

Early synchronous chat software, such as that reviewed in the study by Asterhan and Eisenmann, had some practical limitations for class discussion. Instant messaging and threaded discussion boards both order posts by chronology, which makes conversations difficult to follow and so may actually defeat the purpose of the exercise. Some teachers have attempted to use Facebook or Twitter to facilitate class discussions, but these platforms were designed to satisfy a commercial interest.

A 21st century learning space paradigm provides the necessary structure (guardrails and training wheels) to maximize quality participation frequency while eliminating concerns about privacy and advertising.

How it Works

The chat app works like this: the teacher opens a chat session and displays the host control dashboard on the large screen. Next, students join the session from their devices, and once everyone is on board, the teacher explains the assignment. The teacher then clicks the control to generate random partners and to enable the chat session. A timer can optionally be set. Students engage in a real-time discussion to carry out the task for the allotted time. During this session, the teacher can display the current chats in progress (anonymously, of course) and offer any coaching that would be useful. At the conclusion of the time, the host closes the chat session and can debrief by displaying the chats and offering comment. The chats are anonymous: unless students introduce themselves in the live session, they do not necessarily know who their partner is. Each pair is identified by a “city,” a nickname the app draws from a list of world capitals.
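For readers curious about the mechanics, here is a minimal sketch of how random pairing and “city” nicknames might be implemented. The function and type names are illustrative assumptions, not the app’s actual code.

```ts
// Hypothetical sketch of random pairing with "city" nicknames.
const WORLD_CAPITALS = ["Paris", "Ottawa", "Nairobi", "Lima", "Hanoi", "Oslo"];

interface Pair {
  city: string;                 // anonymous nickname shared by the pair
  studentIds: [string, string]; // the two partners
}

function makeRandomPairs(studentIds: string[]): Pair[] {
  // Fisher-Yates shuffle so pairings are unpredictable
  const shuffled = [...studentIds];
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
  }
  const pairs: Pair[] = [];
  for (let i = 0; i + 1 < shuffled.length; i += 2) {
    pairs.push({
      city: WORLD_CAPITALS[pairs.length % WORLD_CAPITALS.length],
      studentIds: [shuffled[i], shuffled[i + 1]],
    });
  }
  return pairs; // an odd student out would need special handling (e.g., a trio)
}
```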

Host Screen Displayed at Front

The first issue that developed was that the students enjoyed it (not necessarily a problem, but…). It caused a lot of “real” chatter in class as students chuckled about funny things others had said or tried to find out who their partner was. Older students who were more serious about their studies were also motivated to communicate outside the chat session to strategize in real time about their assignment. My tenth graders were assigned to use the chat as a writing exercise, answering the prompt by collaboratively composing a paragraph. When a class is engaged in this activity, they need to be trained to maintain a mostly silent room, focused on the task and not the distractions.

A second issue that arose in the early version of the app was that students would forget the prompt or instructions. It was easy to modify the app to allow the teacher to attach “accessories”: text, video embed, and/or a PDF document with the assignment and rubric displayed. Now students could refresh their understanding of the assignment by clicking a button.
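To make the idea concrete, an accessory could be modeled something like the sketch below. The type names, fields, and file paths are my illustrative assumptions, not the app’s actual schema.

```ts
// Hypothetical model of a chat session's "accessories".
type Accessory =
  | { kind: "text"; body: string }
  | { kind: "video"; embedUrl: string }
  | { kind: "pdf"; fileUrl: string };

interface ChatSession {
  prompt: string;
  accessories: Accessory[]; // shown when a student clicks the refresh button
  timerMinutes?: number;    // optional countdown timer
}

// Example: a French conversation task with a rubric PDF attached.
const session: ChatSession = {
  prompt: "Décrivez votre week-end idéal.",
  accessories: [{ kind: "pdf", fileUrl: "/files/rubric.pdf" }],
  timerMinutes: 10,
};
```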

Sometimes a student would leave the chat window for another browser tab to look something up. For situations where this is not allowed, I modified the app to include a “proctor” that records, right in the app, when a student leaves the window and when they paste in text.
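In browser terms, this amounts to listening for a couple of DOM events. Here is a minimal sketch, with reportEvent() as a hypothetical helper standing in for whatever the app actually sends to the server:

```ts
// A browser-side sketch of the proctor idea. reportEvent() is a
// hypothetical helper; in practice it would POST to the server
// with the session and student identifiers attached.
function reportEvent(kind: string, detail = ""): void {
  console.log(`[proctor] ${new Date().toISOString()} ${kind} ${detail}`);
}

// Record when the student leaves or returns to the chat tab.
document.addEventListener("visibilitychange", () => {
  reportEvent(document.hidden ? "left-window" : "returned-to-window");
});

// Record when text is pasted into the chat input (element id assumed).
const input = document.querySelector<HTMLTextAreaElement>("#chat-input");
input?.addEventListener("paste", (e: ClipboardEvent) => {
  const pasted = e.clipboardData?.getData("text") ?? "";
  reportEvent("paste", `${pasted.length} characters`);
});
```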

Research on this sort of application supports the practice of including assessment in the activity (Gilbert and Dabbagh, 2005, as cited in Balaji & Chakrabarti, 2010). Students are aware of the rubric and are graded, which has an enhancing effect on their performance as they become more mindful of their progress. The timer, which displays at the front of the room from the teacher’s host screen, is also helpful: when students can see the time remaining, they are less likely to drift off-task without realizing it.

In keeping with the paradigm of the 21st century learning space, the app lends itself well to assessment and debriefing. The assessment screen makes it easy to assess student work on a built-in rubric.

Scoring Controls

Students can see their scores and comments.

I developed this in the context of teaching French, but its application to other subjects is clear. For example, a social studies lesson could include a document or video segment for students to analyze or a short discussion on a topic from lecture.

The chat application is designed as a 21st century learning space:

  • Guardrails: The proctor for the chat app reports on text paste-ins and leaving the browser tab.
  • Training Wheels: The optional accessories can provide the scaffold support for the discussion. The optional timer supports on-task behavior.
  • Debriefing: In debriefing mode, anonymized student contributions to chat can be displayed for analysis and discussion.
  • Assessment and Feedback: In scoring mode, an efficient system of evaluation saves time and offers students significant feedback.
  • Swiss Army Knife: The chat can be viewed in discussion mode, where other features can be applied such as identifying logical fallacies and replying to the posts of students other than one’s assigned partner. In forum mode, the teacher can participate.
  • Locus of Data Control: The student chat submissions are stored on a server licensed to the teacher’s control. Commercial apps such as Facebook and Twitter may be less attuned to the privacy and control requirements of education.

Synchronous chat turned out to be a hit in my French class. It provided a solid and effective tool for engaging everyone in the lesson and made me feel like my time was well spent. In the next academic year (2023-24), I will be teaching an online synchronous college level French course. Look for posts next fall where I share how the new app went over in that class.

References

Aderibigbe, S. A. Can online discussions facilitate deep learning for students in General Education?

Asterhan, C. S. C., & Eisenmann, T. (2011). Introducing synchronous e-discussion tools in co-located classrooms: A study on the experiences of ‘active’ and ‘silent’ secondary school students. Computers in Human Behavior.

21st Century Learning Spaces: Accountability and Executive Functioning

During the pandemic, many office workers moved to remote work from home. This precipitated a rise in monitoring software that companies could use to ensure that workers, being at home, were productive. An article in Forbes Magazine from 2021 reports that “[d]emand for worker surveillance tools increased by 74% compared to March 2019.” This rush to monitor and micromanage turned out to be unnecessary, as fears of a loss of productivity proved unfounded and “94% [of companies] reported that worker productivity either stayed at the same levels or improved.”

But this is not the case with adolescents.

The traditional classroom had to be a “very supervised” place because, being immature, most of our charges need guidance to get back on track. It is one reason remote learning went so badly for many youngsters: it is not in the nature of most adolescents to stay focused. The executive functioning needed to ignore distraction, to set goals and reasonable timelines for work, even to break a longer task into smaller, achievable segments, is rarely fully developed in adolescence. Until it develops, the instructor’s role includes teaching these skills and guiding students to follow the right course. Teaching with digital devices has, at present, reduced much of this supervisory ability. 21st century learning spaces should come with an array of monitoring and accountability features.


I recall an instance where a student of mine was completing the essay portion of an examination remotely. I was able to monitor his examination in real time using software that shared his screen with me. When I noticed that he was typing sentences that appeared beyond his ability, I googled those phrases and found the source he was plagiarizing from online (he had his phone with him to cheat). This monitoring software gave me a virtual way to replicate normal classroom supervision and to take the natural step of concluding the examination and awarding no credit.

After the pandemic, I continued using digital tools for student work. My students all had Chromebooks. I had a student who was clever in taking advantage of a certain distrust of technology among some of the adults around him. Faced with an incomplete assignment, he would claim he had done it and that the app must have “lost his work” or that it “did not save.” In a traditional classroom, I would have seen his paper and whether it was written on, but the digital workflow did not yet include this kind of monitor. I adjusted the software for his writing assignments to report when a response was deleted, when a student left the browser page for another, and when students pasted text in, and even to double-check the server to ensure an answer was saved. These application features restored important accountability assurances that were initially lost in the move to digital devices.

As time went on, my colleagues and I devised further modifications to the software at Innovation. I developed the “proctor” for the key testing and writing apps.

The proctor records data about the page and the students’ interactions with the assignment. Depending on the particular assignment, it records when work has begun, when an ancillary resource like a video has successfully loaded, when a student leaves the page and for how long, when text is pasted in, and when answers are saved. The proctor is visible to students (see illustration above) so they know their work is being monitored.

My colleague in the science department uses a flipped classroom technique. He made a great suggestion for an app to monitor student interaction with a video assignment. As a student watches the assigned video, the proctor records events such as when the video starts and stops, how long passes between pauses, when the video ends, and how long the student stayed on the page.
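The HTML5 video element exposes exactly these events, so a sketch of the idea is straightforward. As before, reportEvent() is a hypothetical stand-in for the app’s actual server call, and the element id is assumed:

```ts
// A sketch of video-interaction logging, using the same kind of
// hypothetical reportEvent() helper as the chat proctor sketch above.
function reportEvent(kind: string, detail = ""): void {
  console.log(`[proctor] ${new Date().toISOString()} ${kind} ${detail}`);
}

const video = document.querySelector<HTMLVideoElement>("#assignment-video");
let lastMark = performance.now(); // time of the most recent play/pause

video?.addEventListener("play", () => {
  lastMark = performance.now();
  reportEvent("video-play");
});

video?.addEventListener("pause", () => {
  const seconds = ((performance.now() - lastMark) / 1000).toFixed(1);
  lastMark = performance.now();
  reportEvent("video-pause", `${seconds}s since last play`);
});

video?.addEventListener("ended", () => reportEvent("video-ended"));
```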

The tracking monitor helped maintain a system of accountability for students.

Besides the proctor, Innovation tracks student activity around the site. The auditor maintains a record of logging in, accessing a course, starting a task, saving work, getting a score, etc.
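An audit trail like this can be thought of as a simple append-only log. The record shape below is an assumption for illustration, not Innovation’s actual schema:

```ts
// Hypothetical shape for an audit-trail record.
interface AuditEntry {
  studentId: string;
  action: "login" | "open-course" | "start-task" | "save-work" | "receive-score";
  timestamp: string; // ISO 8601
  detail?: string;
}

const auditLog: AuditEntry[] = [];

function audit(studentId: string, action: AuditEntry["action"], detail?: string): void {
  auditLog.push({ studentId, action, timestamp: new Date().toISOString(), detail });
}

// Example: one student's trail through a task.
audit("s123", "login");
audit("s123", "open-course", "French 4");
audit("s123", "save-work", "essay draft");
audit("s123", "receive-score", "18/20");
```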

The critical work of developing executive functioning in adolescents can be enhanced by giving youngsters the kind of data that, if they attend to it, can inform their decisions about what they should do. The proctor and other reporting tools are available to all students. Although consequences for missing the mark on attention to task can and should be part of the program, it is not good practice to be all sticks and no carrots. Objective data on what a teenager is actually doing (rather than what they remember doing or want you to think they did) can be the focus of discussions about on-task behavior and how the individual can take responsibility for it. We can look at performance on an assignment and examine the on-task behavior related to its production. Could better on-task behavior have improved the final product?

Data like this promotes accountability, but it also puts the student potentially in the driver’s seat and that is what developing executive functioning is all about.