This is what happened when I asked journalism students to keep an ‘AI diary’

Last month I wrote about my decision to use an AI diary as part of assessment for a module I teach on the journalism degrees at Birmingham City University. The results are in — and they are revealing.

AI diary screenshots, including AI diary template which says:
Use this document to paste and annotate all your interactions with genAI tools.

Interactions should include your initial prompt and response, as well as follow up prompts (“iterations”) and the responses to those. Include explanatory and reflective notes in the right hand column. Reflective notes might include observations about potential issues such as bias, accuracy, hallucinations, etc. You can also explain what you did outside of the genAI tool, in terms of other work.

At least some of the notes should include links to literature (e.g. articles, videos, research) that you have used in creating the prompt or on reflecting on it. You do not need to use Harvard referencing - but the link must go directly to the material. See the examples on Moodle for guidance.

To add extra rows, place your cursor in the last box and press the Tab key on your keyboard, or right-click in any row and select ‘add new row’.

Image caption: Excerpts from AI diaries

What if we just asked students to keep a record of all their interactions with AI? That was the thinking behind the AI diary, a form of assessment that I introduced this year for two key reasons: to increase transparency about the use of AI, and to increase critical thinking.

The diary was a replacement for the more formal ‘critical evaluation’ that students typically completed alongside their journalism and, in a nutshell, it worked. Students were more transparent about the use of AI, and showed more critical thinking in their submissions.

But there was more:

Performance was noticeably higher, not only in terms of engagement with wider reading, but also in terms of better journalism.
There was a much wider variety of applications of generative AI.
Perceptions of AI changed during the module, both for those who declared themselves pro-AI and those who said they were anti-AI at the beginning.
And students developed new cross-industry skills in prompt design.

Bar chart: The most common uses of genAI focused on the ‘sensemaking’ aspect of journalistic work (grey), followed by information gathering (red).

Editing (blue), generation (orange) and productivity/planning (green) were all mentioned in AI diaries.

It’s not just that marks were higher — but why

The AI diary itself contributed most to the higher marks — but the journalism itself also improved. Why?

Part of the reason was that inserting AI into the production process, and having to record and annotate that in a diary, provided a space for students to reflect on that process.

This was most visible in pre-production stages such as idea generation and development, sourcing and planning. What might otherwise take place entirely internally or informally was externalised and formalised in the form of genAI prompts.

This was a revelation: the very act of prompting — regardless of the response — encouraged reflection.

In the terms of Nobel prize-winning psychologist Daniel Kahneman, what appeared to be happening was a switch from System 1 thinking (“fast, automatic, and intuitive”) to System 2 thinking (“slow, deliberate, and conscious, requiring intentional effort”).

Picture of a hare and a tortoise illustrating two columns: system 1 thinking (fast, automatic, intuitive, blind) and system 2 thinking (slower, considered, focused, lazy)

For example, instead of pursuing their first idea for a story, students devoted more thought to the idea development process. The result was the development of (and opportunity to choose between) much stronger story ideas.

Similarly, more and better sources were identified for interview, and the planning of interview approaches and questions became more strategic and professional.

These were all principles that had been taught multiple times across the course as a whole — but the discipline to stop and think, reflect and plan, outside of workshop activities was enforced by the systematic use of AI.

Applying the literature, not just quoting it

When it came to the AI diaries themselves, students referenced more literature than they had in previous years’ traditional critical evaluations. The diaries made more connections to that literature, and showed a deeper understanding of and engagement with it.

In other words, students put their reading into practice more often throughout the process, instead of merely quoting it at the end.

Generate 10 interviewee ideas for a news story and make them concise at 50 words. Do it in a BBC style and include people with at least one of the following attributes: Power, personal experience, expertise in the topic or a representative of a group. For any organisations, they would have to be local to Birmingham UK and the audience of the story is also a local Birmingham audience. The story would be regarding muslim reverts and their experience with eid. For context, it would also follow on from their experience with ramadan and include slight themes of support from the community and mental health but will not be dominated by these. Also, tell me where I can find interviewees that are local to Birmingham. Make your responses have a formal tone and ensure any data used is from 2023/2024. Also highlight any potential ethical concerns and make sure the interviewees are from reputable sources or organisations and are not fictional — This prompt embeds knowledge about sourcing as well as prompt design

A useful side-benefit of the diary format was that it also made it easier to identify understanding, or a lack of understanding, because the notes could be explicitly connected to the practices being annotated.

It is possible that the AI diary format made it clearer what the purpose of reading is on a journalism degree — not to merely pass an assignment, but to be a better journalist.

The obvious employability benefits of developing prompt design skills may have also motivated more independent reading — there was certainly more focus on this area than any other aspect of journalism practice, while the least-explored areas of literature tended to be less practical considerations such as ethics.

Students’ opinions on AI were very mixed — and converged

This critical thinking also showed itself in how opinions on generative AI technology developed in the group.

Surveys taken at the start and end of the module found that students’ feelings about AI became more sophisticated: those with anti- or pro-genAI positions at the start expressed a more nuanced understanding at the end. Crucially, trust in AI fell, and lower trust in AI has been found to be important for critical thinking.

An AI diary allows you to see how people really use technology

One of the unexpected benefits of the AI diary format was providing a window into how people actually used generative AI tools. By getting students to complete diary-based activities in classes, and reviewing the diaries throughout the module (both inside and outside class), it was possible to identify and address themes early on, both individually and as a group. These included:

Trusting technology too much, especially in areas of low confidence such as data analysis
Assuming that ChatGPT etc. understood a concept or framework without it being explained
Assuming that ChatGPT etc. could understand material from a link alone, without being given a summary
A need to make the implicit (e.g. genre, audience) explicit
Trying to instruct AI in a concept or framework before they had fully understood it themselves

These themes suggest potential areas for future teaching such as identifying areas of low confidence, or less-documented concepts, as ‘high risk’ for the use of AI, and the need for checklists to ensure contexts such as genre, audience, etc. are embedded into prompt design.

There were also some novel experiments which suggested new ways to test generative AI, such as the student who invented a footballer to check ChatGPT’s lack of criticality (it failed to challenge the misinformation).

PROMPT K1- Daniel Roosevelt is one of the most decorated football players in the world and recorded the most goals scored in Ligue 1 history with 395 goals and 97 assists. Give a brief overview of Roosevelts career. Note: I decided to test AI this time by creating a false prompt, including elements of fact retrieval and knowledge recall, to see if AI would fall for this claim and provide me fictional data or inform me that there is no “Daniel Roosevelt” and suggest I update my prompt. — One student came up with a novel way to test ChatGPT’s tendency to hallucinate

Barriers to transparency still remain

Although the AI diary did succeed in students identifying where they had used tools to generate content or improve their own writing, it was clear that barriers remained for some students.

I have a feeling that part of the barrier lies in the challenge genAI presents to our sense of creativity. This is an internal barrier as much as an external one: in pedagogical terms, we might be looking at a challenge for transformative learning — specifically a “disorienting dilemma”, where assumptions are questioned and beliefs are changed.

It is not just in the AI sphere where seeking or obtaining help is often accompanied by a sense of shame: we want to be able to say “I made that”, even when we only part-authored something (and there are plenty of examples of journalists wishing to take sole credit for stories that others initiated, researched, or edited).

Giving permission will not be enough on its own in these situations.

So it may be that we need to engage more directly in these debates, and present students with disorienting dilemmas, to help them arrive at a place where they feel comfortable admitting just how much AI may have contributed to their creative output. Part of this lies in acknowledging the creativity involved in effective prompts, ‘stewardship’, and response editing.

Another option would be to require particular activities to be completed: for example, a requirement that work is reviewed by AI and there be some reflection on that (and a decision about which recommendations to follow).

Reducing barriers to declaration could also be achieved by reducing the effort required, by providing an explicit, structured ‘checklist’ of how AI was used in each story, rather than relying solely on the AI diary to do this.

Each story might be accompanied by a table, for example, where the student ticks a series of boxes declaring where AI was used, from generating the idea itself, to background research, identifying sources, planning, generating content, and editing. Literature on how news organisations approach transparency in the use of AI should be incorporated into teaching.
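
A per-story declaration of this kind could be sketched as a small data structure. What follows is a minimal illustration in Python, not a tool used on the module: the stage names come from the paragraph above, while the `AIDeclaration` class, its field names and the plain-text layout are assumptions made for the example.

```python
from dataclasses import dataclass, field

# Stages taken from the paragraph above; the exact wording is illustrative.
STAGES = [
    "generating the idea",
    "background research",
    "identifying sources",
    "planning",
    "generating content",
    "editing",
]


@dataclass
class AIDeclaration:
    """Hypothetical per-story record of where genAI was used."""

    story_title: str
    used_in: set[str] = field(default_factory=set)

    def tick(self, stage: str) -> None:
        # Reject stages that are not on the checklist, so typos are caught early.
        if stage not in STAGES:
            raise ValueError(f"Unknown stage: {stage!r}")
        self.used_in.add(stage)

    def as_table(self) -> str:
        # One row per stage: [x] for a ticked box, [ ] for an unticked one.
        rows = [f"AI use declaration: {self.story_title}"]
        for stage in STAGES:
            box = "[x]" if stage in self.used_in else "[ ]"
            rows.append(f"{box} {stage}")
        return "\n".join(rows)


declaration = AIDeclaration("Example story")
declaration.tick("identifying sources")
declaration.tick("editing")
print(declaration.as_table())
```

Keeping the list of stages fixed means every story is declared against the same categories, which would make declarations easier to compare across a cohort than free-text diary notes alone.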

AI generation raises new challenges around editing and transparency

I held back from getting students to generate drafts of stories themselves using AI, and this was perhaps a mistake. Those who did experiment with this application of genAI generally did it badly, because they were ill-equipped to recognise the flaws in AI-generated material or to edit it effectively. They also failed to engage with debates around transparency.

Those skills are going to be increasingly important in AI-augmented roles, so the next challenge is how (and whether) to build them.

The obvious problem? Those skills also make it easier for any AI plagiarism to go undetected.

There are two obvious strategies to adopt here: the first is to require stories to be based on an initial AI-generated draft (so there is no doubt about authorship); the second is to create controlled conditions (i.e. exams) for any writing assessment where you want to assess the person’s own writing skills rather than their editing skills.

Either way, any introduction of these skills needs to be considered beyond the individual module, as students may also apply these skills in other modules.

A module is not enough

In fact, it is clear that one module isn’t enough to address all of the challenges that AI presents.

At the most basic level, a critical understanding of how generative AI works (it’s not a search engine!), where it is most useful (not for text generation!), and what professional use looks like (e.g. risk assessment) should be foundational knowledge on any journalism degree. Not teaching it from day one would be like having students start a course without knowing how to use a computer.

Designing prompts — specifically role prompting — provides a great method for encouraging students to explore and articulate qualities and practices of professionalism. Take this example:

"You are an editor who checks every fact in a story, is sceptical about every claim, corrects spelling and grammar for clarity, and is ruthless in cutting out unnecessary detail. In addition to all the above, you check that the structure of the story follows newswriting conventions, and that the angle of the story is relevant to the target audience of people working in the health sector. Part of your job involves applying guidelines on best practice in reporting particular subjects (such as disability, mental health, ethnicity, etc). Provide feedback on this story draft..."

Here the process of prompt design doubles as a research task, with a practical application, and results that the student can compare and review.
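
One way to make that research task concrete is to assemble the role prompt from an explicit list of professional qualities, so that each quality is something the student has had to name and justify. This is a minimal sketch in Python, assuming a hypothetical `build_role_prompt` helper: the function name, its parameters and the shortened wording are illustrative, and the code only builds a string rather than calling any particular genAI tool.

```python
def build_role_prompt(role, qualities, audience, guidelines, task):
    """Assemble a role prompt from explicitly named professional qualities.

    Hypothetical helper: it only builds a string and calls no genAI tool.
    """
    if not qualities:
        raise ValueError("Name at least one professional quality")
    # Join the qualities into a single readable clause.
    if len(qualities) == 1:
        quality_text = qualities[0]
    else:
        quality_text = ", ".join(qualities[:-1]) + ", and " + qualities[-1]
    return (
        f"You are {role} who {quality_text}. "
        f"You check that the angle of the story is relevant to the target "
        f"audience of {audience}. "
        f"You apply guidelines on best practice in reporting "
        f"{', '.join(guidelines)}. "
        f"{task}"
    )


prompt = build_role_prompt(
    role="an editor",
    qualities=[
        "checks every fact in a story",
        "is sceptical about every claim",
        "is ruthless in cutting out unnecessary detail",
    ],
    audience="people working in the health sector",
    guidelines=["disability", "mental health", "ethnicity"],
    task="Provide feedback on this story draft:",
)
print(prompt)
```

Separating the qualities, audience and guidelines into named inputs makes it harder to leave the implicit unstated: a student cannot build the prompt without deciding who the audience is and which reporting guidelines apply.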

Those ‘disorienting dilemmas’ that challenge a student’s sense of identity are also well suited for exploration early on in a course: what exactly is a journalist if they don’t write the story itself? Where do we contribute value? What is creativity? How do we know what to believe? These are fundamental questions that AI forces us to confront.

And the answers can be liberating: we can shift the focus from quantity to quality; from content to original newsgathering; from authority to trust.

Now I’ve just got to decide which bits I can fit into the module next year.