Graded assessment is an important part of any course. AI tools can affect graded student assessment in both appropriate and inappropriate ways:
- They can make it easier for students to produce work that is clearly written and well formatted, and thus communicates their ideas more effectively.
- They can enable students to produce higher-quality work by acting as a tool for brainstorming and by providing feedback on form and content.
- They can enable students to submit assessment work (for example, project and lab reports or essays) that is not their own, in ways that are hard to detect.
The purpose of this section is to help teachers encourage students to use AI tools in constructive ways while reducing the risk of problematic graded assessment practices.
Topics on this page
- Appropriate and inappropriate use of AI tools in student assessment
- Detection of inappropriate use of AI in student assessment
- Reducing the risk of inappropriate use of AI in student assessment
- Use of AI tools by teachers in graded assessments
Appropriate AI tool use depends on learning goals. The purpose of graded student assessments is to validate whether the student is able to demonstrate the learning goals of the course. Whether using an AI tool to perform a task is legitimate in a course depends on whether that task is being assessed as a learning goal. For example, the use of an AI tool to translate a text may not be legitimate in a language course (where the skill of translation is being assessed) but may well be legitimate if done as part of an engineering project report.
Learning goals are influenced by AI tools. It is also important to note that the existence of AI tools may change the learning goals for a course: for example, when writing a program to analyze data or produce graphs, modern editors will often suggest the next line in your code. Analyzing data de facto becomes a collaborative activity between the human and the machine. This change in the behavior of tools means that it may be inappropriate to have as a learning goal “to learn how to produce graphs with a specific library”; instead, a broader goal such as “analyze the data from the experiment” may be more appropriate.
Therefore, whether a particular use of AI is legitimate depends on the skills that you intend to teach in your course, which differ from course to course and can even change over time.
Examples of legitimate use. Depending on your course goals, legitimate uses may include (this list is not exhaustive):
- Searching and summarizing literature (using tools such as elicit.org, or perplexity.ai)
- Brainstorming about a topic to help students define their ideas coherently
- Assisting students in defining a structure for a report or essay
- Coding with suggestions provided by AI
- Giving feedback on form and content, including readability improvements that help students express their ideas clearly and grammatically in different languages
Communicating legitimate use rules to students. Since there is no ‘one-size-fits-all’ rule regarding the use of AI in assessment, it is recommended that teachers make explicit to students what kinds of use are and are not legitimate in their course, and what rules accompany the use of AI tools. This could take the form of a statement on the course Moodle page. This statement should specify:
- Whether the use of AI tools is allowed in the course (e.g., “Students are allowed to use AI tools without restriction in this course” or “Students may only use the specified AI tools, under the circumstances outlined below”)
- Which AI tools they can use, if any (e.g., chatbots, image generators, code generators, literature search or summarization tools, etc.)
- Under what conditions (e.g., “as a study aid, but not in material submitted for assessment”, or “In material for assessment provided that the use is appropriately documented”)
- The rationale for this decision (e.g., “Some tasks are intended to help you learn and it would be detrimental to your learning if you used an AI tool to complete them”, or “We need to be able to assess if you can complete specific tasks unaided by an AI tool”)
- Consequences for non-compliance (i.e., “At EPFL, all assessment material that is not the student’s personal and original contribution must be recognizable as such [Lex 1.3.3, Article 4]. Use of AI tools in ways that are not authorized, or failure to attribute their use, will be treated as a potential case of cheating and will be forwarded to the EPFL Legal Affairs team.”)
If you would like to produce something more comprehensive, covering a wider range of circumstances, Stanford has developed a guide for creating a policy tailored to your course: link to Stanford Teaching Commons.
AI disclosure statement. EPFL rules (Lex 1.3.3, Article 4) require that all assessment material that is not the student’s personal and original contribution must be recognizable as such. It is therefore recommended that, where AI use is allowed by teachers, they require students to make explicit their use of AI in preparing material for assessment. The University of Sydney (link to ‘Acknowledging and Referencing the use of AI’) provides the following example of text that a teacher can insert into their assessment description:
“Use of generative artificial intelligence must be appropriately acknowledged. You can do this by <inserting a note at the end of your submission> where you need to <describe the AI tool(s) that you used, what you used it to do, what prompt(s) you provided, and how the output of the artificial intelligence was used or adapted by you>. This additional description does not add to your word count.”
Something similar can be provided as an example for students (again, this is based on the University of Sydney example):
“I acknowledge the use of ChatGPT (https://chat.openai.com/) to refine the academic language of my own work. On <date> I submitted my entire essay (<link to original document here>) with the prompt to <“Give feedback on the academic tone and accuracy of language, including grammatical structures, punctuation and vocabulary”>. The output (<link here>) was then used to improve my work.”
Issues with detection of AI-generated content. Detection of AI-generated content is a controversial topic, and conflicting research results further complicate matters. It is extremely difficult to determine whether a submission, or part of a submission, has been generated with the use of AI. Software that claims to be able to do so does not, at the time of writing (November 2024), appear to be effective, and may be biased against students working in a second language. Indeed, there have been reports of false accusations against students (see Farrelly and Baker, 2023). Furthermore, other than the similarity checking tools made available at EPFL (iThenticate, Turnitin), there are currently no AI detection tools validated for use with sensitive data such as student assessment submissions. For all these reasons, it is recommended that EPFL teachers do not use detection tools other than those provided by EPFL and that, even when using EPFL-provided tools, they do not rely on the results of AI-detection functionality to assess AI plagiarism.
Traditional similarity checking tools. Existing ‘similarity checking’ tools (e.g., iThenticate, Turnitin) detect similarities with existing source material and can therefore be used to identify suspected cases of plagiarism. They remain generally ineffective in cases where generative AI is used to produce new text. They may, however, detect cases where an AI tool reproduces an existing text that is then included in a student’s work without citation.
Such tools can also be used to spot bibliographic references that do not exist: they typically identify similarities for most genuine references but find none for invented ones. Some genuine references (e.g., Warnock, 1977 in the example below) may nevertheless go unflagged by this method. However, if many references in a student submission are not flagged by a similarity checking tool, you may need to take the next step in determining whether you have a case of AI plagiarism.
Figure: A bibliography in which similarities are detected for some references (i.e., they exist) but not for others (i.e., they do not exist elsewhere).
Steps to detect problematic submissions. In the absence of effective digital solutions, the onus falls to humans to use practices that can detect problematic submissions. These practices are very much the same regardless of the suspected issue, whether a student used an AI tool to generate their submission, plagiarized someone else’s work, or paid someone to complete it for them.
- Combining multiple assessment methods (e.g., oral presentation with project report) may be a useful strategy to detect whether students themselves produced the submitted work.
- In addition, where a teacher has concerns about the origin of some material submitted for assessment, the teacher may interview the student and require them to discuss and explain specific parts of their submission (e.g., code, figures, ideas, etc.).
- In the case of any suspected fraud and/or cheating, the teacher writes a report and sends it, together with evidence of the suspected item or behavior, to the section director and the Legal Service ([email protected]).
- Suspected cases should not be dealt with by the teacher alone, as this would prevent the school from having an overview of students who may repeatedly act in problematic ways across multiple courses. Moreover, reporting ensures that the teacher is supported in dealing with the issue and is clear on the legal context.
Clarifying expectations. One common reason for students to engage in problematic behaviors is that they may be unclear about what is required of them, especially when different courses have different requirements. Making requirements explicit for students (as described above) reduces this risk (see ‘using grading grids’ in the teaching guide).
However, even where students are told what is required, they may still struggle to see what this means in practice. This can be addressed through two strategies:
- Examples of good practice (such as the example of how to cite the use of generative AI provided above) can help students see how to meet the requirements in practice.
- Providing low-stakes (ungraded) opportunities for students to get feedback (e.g., peer feedback) on their citation of AI tools can help them improve these practices before a final submission.
Valuing the learning process. More generally, students may be less likely to consciously engage in cheating if they recognize that the skills and knowledge they are learning through completing the assessment are meaningful to them. It may be helpful to clarify for students the utility or value of the skills they are developing, practicing, or demonstrating through the course assessment. Going through an effortful learning process is essential for acquiring skills and knowledge.
Designing assessments. When designing student assessments, it may be a good idea to ask a generative AI tool (Microsoft 365 Copilot, ChatGPT) to answer the assessment question. This gives you some idea of the kinds of output one can expect from an AI tool. In doing so, it may be useful to provide multiple prompts, both to get a better sense of the range of possible outputs and to understand how a student using the tool might refine their prompts to include the concepts most relevant to the course.
- For example, a first prompt might say: “Can you suggest the structure of a short report (about 800 words) on a chemistry lab which involves extracting the dyes in M&M candy coatings using wool and then separating the dyes using the technique of paper chromatography?”
- This may be updated with a second prompt to see how the tool can integrate concepts explicitly addressed in the course content: “Can you adapt the report to make specific reference to (i) a Vee diagram, (ii) the mobile phase and (iii) the stationary phase of a paper chromatography investigation, and (iv) the concept of retention time?”
- A further update might be: “Can you explain how the experiment would have been different if (i) we used vinegar instead of ethanol, (ii) we used acrylic instead of wool?”
As highlighted by a recent study, a substantial portion of assessment questions at EPFL can actually be correctly answered by AI tools: link. In some cases, you may want to redesign your assessment methods. You can select strategies to mitigate the risk of students using AI tools to complete assessments, including:
- using different question types
- conducting oral interviews
- using proctored exams
- requiring video diaries documenting project progress.
A study conducted by Australian colleagues provides a comprehensive overview of vulnerabilities and mitigation strategies: link. The Technical University of Munich has published a guide to rethinking assessment in response to ChatGPT: link. Changing the context of assessments could also be helpful: a flipped classroom model would enable students to work on assignments in class while benefiting from guidance on the appropriate use of AI tools.
Use of AI tools by teachers for grading. Even though it is potentially interesting to use AI tools to assist in grading assessments, you need to take into account that student submissions may be regarded as sensitive data. Even though tools like Microsoft 365 Copilot comply in some respects with data protection regulations, there is currently no official recommendation from the Data Protection Office for using these tools to process student submissions. Running an LLM locally on EPFL infrastructure is, for now, the safest but technically most challenging approach.
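For teachers who nevertheless wish to experiment with local models, the sketch below illustrates the general idea: an open-weight model is run locally (e.g., on EPFL hardware) so that no data leaves the machine. The model name, prompt, and generation settings are illustrative assumptions rather than recommendations, and any use with real student submissions should first be discussed with the Data Protection Office.

```python
# Minimal sketch (illustrative assumptions only): generating formative feedback
# on a lab-report excerpt with an open-weight model run locally, so that no
# data is sent to an external service. Model, prompt and settings are examples.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Qwen/Qwen2.5-1.5B-Instruct",  # any locally hosted open-weight model
    device_map="auto",
)

submission_excerpt = "..."  # excerpt of a (non-sensitive or anonymized) student report

prompt = (
    "You are a teaching assistant. Give brief, constructive feedback on the "
    "clarity and structure of the following lab-report excerpt:\n\n"
    + submission_excerpt
)

# Deterministic generation keeps the feedback reproducible across runs.
output = generator(prompt, max_new_tokens=300, do_sample=False)
print(output[0]["generated_text"])
```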
It should also be noted that AI tools carry a risk of bias, and there is evidence that users tend to over-trust their output (link). These are important considerations in high-stakes assessment of students, and research on addressing them remains at an early stage.
As this situation is rapidly evolving (in terms of the legal framework, the research on the impact of such tools, and in the nature of the tools themselves), this recommendation will be regularly updated.
Message from AVP Education to students 9/10/2023
Today, there are a large number of generative Artificial Intelligence tools capable of generating synthetic media such as text or images [1][3]. Like any tool, they have their advantages, but also limitations and major risks that you need to be aware of. Always remember to stay critical.
When should you not consider using generative AI tools?
- Do not use such tools to learn new things or to search for information: they often generate plausible nonsense and can lead you to believe that what they generate is true or real when it isn’t.
- Do not use them to generate content that you are unable to check for veracity or for form: for example, text in a foreign language that you do not master.
When can you consider using generative AI tools?
- When you want to be surprised: for example, to generate ideas.
- When you have the possibility to check the accuracy of the result that the AI tool generates: for example, only generate code that you can run and check yourself.
- When you want help with the form of your production rather than with its contents: for example, to improve the wording of your text, to summarize a passage that is too long, or to overcome writer’s block.
What are the risks?
- Plausible nonsense [2]: we generally tend to trust machines more than ourselves (automation bias [4]), which makes us all the more vulnerable to the apparent plausibility of the content generated by this software, even when it is completely false or incorrect.
- Environmental impact: this software is among the least energy- and water-efficient ways to perform a task, so avoid using it when you have tools that will perform the same task with less impact (for example, searching the web, or even watching videos).
- Privacy: by using generative AI tools, you are sharing your data with private companies, so don’t enter any personal or sensitive data about yourself or others.
- Bias: this software suffers from different types of biases, whether gender bias (e.g. machine translation [5], image generation [6]) or bias based on ethnic origin or religious orientation (e.g. text generation [7]). Evaluate the results carefully and think critically.
What are the rules for using generative AI in your studies?
- Follow your teachers’ instructions. Teachers can design learning activities that may or may not include certain tools. A common practice is to mention your use of generative AI tools in your academic work, for example: “AI tool X was used to improve the grammar of the text and make it more understandable” or “Tool Y was used to generate an illustration.”
- You are responsible for the work you submit on your behalf and for acting in accordance with society’s expectations of future scientists. As part of project evaluations at EPFL, teachers may ask you to explain a paragraph or a fragment of computer code to verify that you are the author of the work you have submitted, without the help of generative AI tools.
We thank you for taking note of the above and best wishes for the rest of the semester.
Pierre Dillenbourg, Associate Vice-President for Education
Patrick Jermann, Head of CEDE – Center for Digital Education
To find out more:
- [1] Barraud, E., Petersen, T., Overney, J., Aubort, S., & Brouet, A.-M. (2023). Intelligence artificielle. Amie ou concurrente. Dimensions, 8. EPFL. https://longread.epfl.ch/dossier/intelligence-artificielle-amie-ou-concurrente/
- [2] Hardebolle, C., & Ramachandran, V. (to appear). SEFI Editorial for the Special Interest Group on Ethics. https://go.epfl.ch/plausiblenonsense
- [3] Rochel, J. (2023). ChatGPT. 6 questions fondamentales. https://ethix.ch/sites/default/files/inline-files/Ethix_ChatGPT_April2023.pdf
- [4] Suresh, H., Lao, N., & Liccardi, I. (2020, July). Misplaced trust: Measuring the interference of machine learning in human decision-making. In Proceedings of the 12th ACM Conference on Web Science (pp. 315-324). https://dl.acm.org/doi/10.1145/3394231.3397922
- [5] Schiebinger, L., Klinge, I., Sánchez de Madariaga, I., Paik, H. Y., Schraudner, M., and Stefanick, M. (Eds.) (2011-2021). Gendered Innovations in Science, Health & Medicine, Engineering and Environment. https://genderedinnovations.stanford.edu/case-studies/nlp.html#tabs-2
- [6] Nicoletti, L., & Bass, D. (2023). Humans Are Biased. Generative AI Is Even Worse: Text-to-image models amplify stereotypes about race and gender. Bloomberg. https://www.bloomberg.com/graphics/2023-generative-ai-bias/
- [7] Abid, A., Farooqi, M., & Zou, J. (2021). Large language models associate Muslims with violence. Nature Machine Intelligence, 3(6), 461-463. https://www.nature.com/articles/s42256-021-00359-2