Generative AI in Research

This guide will help you navigate some of the tools and information needed when considering the use of generative artificial intelligence (AI) in your research.

Areas to Consider when using Generative AI in Research

Generative AI models offer limited transparency into the data used to train them, including its sources and date ranges. When using any generative AI tool or interface, always make efforts to carefully assess the accuracy, relevance, and veracity of the system's outputs.

Generative models can also perpetuate biases present in their design, construction, and training data, and these biases can be amplified by the prompts a user inputs. This can manifest in harmful ways, including, but not limited to, perpetuating stereotypes, reinforcing misinformation, and creating unequal representation of different groups. When using generative AI, it is essential to be aware of this potential for bias, to develop prompts thoughtfully, and to critically review and assess the system's output.

Generative AI sometimes provides citations or attributions for the works it draws on in its answers, but these should be verified against external sources, as there are documented examples of AI models creating “fake” references. Below is an example of a generated citation to an article that does not exist.

Kallajoki M, et al. Homocysteine and bone metabolism. Osteoporos Int. 2002 Oct;13(10):822-7. PMID: 12352394
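
One practical way to verify a biomedical citation like the one above is to look up its PMID in PubMed. Below is a minimal, illustrative Python sketch that queries NCBI's public E-utilities (esummary) service; the lookup logic is an assumption about how you might script this check, not an official workflow. A fabricated reference will typically return no record, or a record whose title and authors do not match the citation.

    import requests

    # Look up a PMID via NCBI's public E-utilities esummary endpoint.
    PMID = "12352394"  # the PMID from the suspect citation above
    url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"
    resp = requests.get(url, params={"db": "pubmed", "id": PMID, "retmode": "json"})
    record = resp.json().get("result", {}).get(PMID, {})

    if "error" in record or not record.get("title"):
        print(f"PMID {PMID}: no matching PubMed record found")
    else:
        # A record may exist yet describe a different article entirely,
        # so still compare the returned title and authors to the citation.
        print(f"PMID {PMID}: {record.get('title')}")

Even when a PMID resolves, compare the returned metadata against the citation itself; generated references often attach a real identifier to a nonexistent article.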

Types of Bias

Below are some examples of different types of bias that might exist in generative AI and that should be taken into consideration when using tools or resources that utilize generative AI.

  • Machine bias refers to the biases that are present in the training data used to build the tools. Generative AI models learn from large human-generated datasets and will ingest the biases present in the text.

  • Confirmation bias occurs when individuals seek information that confirms their existing beliefs and disregard alternative information. This can surface either in the training data or in the way that a prompt is written. When users seek information on a particular subject, the AI might selectively generate content that reinforces their viewpoints.

  • Selection bias occurs when certain groups or perspectives are underrepresented or absent in the training data; the model will then lack the information needed to generate comprehensive answers.

  • Contextual bias can happen when the model is unable to understand or interpret the context of a conversation or prompt accurately. 

  • Linguistic bias arises when language models exhibit preferences for certain dialects or languages, making it challenging for individuals who speak other dialects or languages to access information or engage with AI interfaces.

  • Automation bias is the propensity for humans to favor suggestions from automated systems such as generative AI and to ignore contradictory information from non-automated sources, even when that information is correct.

Review data use policies of the generative AI tool you plan to use. Only input data that are appropriate to share publicly and externally to UNLV. Exercise caution when working with private, sensitive, or identifiable information, and avoid sharing any student information (which could be a FERPA violation), proprietary data, human subject data, controlled/regulated information, third party copyrighted materials or any materials that you do not own or manage the rights to.

Some generative AI tools have data use policies (for example, ChatGPT Enterprise) that may make them HIPAA compliant. Before inputting data into those tools, it is recommended that you seek guidance from the Office of Research Integrity.

Peer Review

Uploading a manuscript, in whole or in part, to a generative AI tool as part of your review process may breach the requirement to maintain the confidentiality of the content. For example, NIH recently issued clear guidance prohibiting the use of generative AI in the NIH peer review process.

If you suspect that text or an image has been created using generative AI and the author has not disclosed this, you may want to alert the editor to this concern. While there are tools available to detect generated images and text (see the Detection page of this guide), note that uploading those images or that text during the peer review process may itself breach the confidentiality of the content.

Rights and Attributions

Given the lack of transparency about the data used in any particular AI model, it is unclear whether or how authors' and creators' works are credited or attributed within the model. Additionally, there are growing concerns about the unauthorized inclusion of content. When using AI-generated content in your research, you may inadvertently include another person's work with incorrect or missing citation and attribution, raising issues of plagiarism and intellectual property rights.

Copyright

The challenges brought into the copyright system by generative AI are still being discussed and understood.

The United States Copyright Office has launched an initiative to examine the copyright law and policy issues raised by artificial intelligence (AI) technology, including the scope of copyright in works generated using AI tools and the use of copyrighted materials in AI training. The law remains unsettled regarding copyright and generative AI, but it is important to be aware that established concepts such as fair use, authorship, and derivative works all apply to the creation and use of generative AI content.

For more information on Copyright and Author's Rights, please visit the Copyright and Author Rights guide.

Tools for Detection

There are tools that attempt to detect the likelihood that an image or text has been generated by an AI tool. Plagiarism-checking tools have incorporated detection features with varying degrees of success. Your existing knowledge and skills provide an advantage when reviewing materials within your field of expertise. For materials outside your field or comfort zone, or materials from authors with a wide range of writing styles, fact-checking and citation-checking skills can aid your evaluation.

There are also tools for images. For instance, this AI Image Detector allows you to drag and drop an image file. This tool is a proof of concept; there may be more robust tools of this type available to you for evaluating published works or, with permission from the creators, unpublished works and student assignments.

Detection Challenges

The graphic below (Gu et al., 2022) shows AI-generated images to highlight how challenging it is to detect them without tools.

The first image in each row is the original image. For the first two sections, the last four images are regenerated from the first, real image. The images in Section A are all fake images generated by a well-trained generative AI model. Section B shows the results of regenerating images using a generative model trained on a single image. Section C shows the results of using generative AI to manipulate images: the generative models manipulate images by directly generating images that are similar in features but with modified content. In each group of Section C, the images in the middle are the originals, and the images on either side are deliberately manipulated fakes.

[Graphic: AI-generated images shown alongside the original images they were learned from, illustrating how similar they can be.]

While generative AI offers promising capabilities for researchers across disciplines, understanding concepts like perplexity and burstiness is crucial. These concepts act as tools for evaluating the AI's outputs, helping to ensure they are insightful, relevant, and unbiased. By being aware of these metrics, scholars can better navigate and leverage the vast potential of generative AI in their research endeavors.

In addition, understanding these concepts can help scholars identify whether a text is AI-generated. While neither perplexity nor burstiness is a foolproof method for identifying AI-generated content on its own, together they provide valuable tools for discerning readers.

Observing for unexpected combinations of information or repetitive emphasis can offer hints toward the origin of the text. In an era of sophisticated AI, critical reading combined with an awareness of these concepts becomes more important than ever. 

Perplexity

What is it? Perplexity is a measure used to evaluate how well a probability distribution predicts a sample. In the context of generative AI, it quantifies how "surprised" the model is by a given input, based on the data it has been trained on. A lower perplexity indicates that the model is less surprised and thus better at predicting the input. 

How does it relate to AI-generated content? If an AI language model produces a piece of text that seems improbable or unexpected based on its training, the perplexity would be high. For instance, a coherent and grammatically correct text would typically have lower perplexity than a jumbled, nonsensical one.

Why is it important? Imagine you're reading a book and trying to guess the next word in a sentence. If the language and context are familiar, you can often make accurate predictions. Similarly, a language model trained on vast amounts of data uses perplexity to assess how accurately it can predict or understand the next word or piece of data.
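
To make the definition concrete, here is a minimal, self-contained Python sketch of how perplexity is computed from a model's per-token probabilities. The probability values and variable names are invented for illustration; a real language model would supply the probabilities.

    import math

    # Hypothetical probabilities a language model might assign to each
    # successive token; these numbers are invented for illustration.
    expected_answer = [0.9, 0.8, 0.95]      # e.g. "...is Paris."
    surprising_answer = [0.9, 0.8, 0.0001]  # e.g. "...is a rare fruit."

    def perplexity(token_probs):
        # Perplexity is the exponential of the average negative
        # log-probability per token: lower means less "surprised."
        avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
        return math.exp(avg_neg_log)

    print(perplexity(expected_answer))    # low perplexity (about 1.1)
    print(perplexity(surprising_answer))  # high perplexity (about 24)

Detection tools that report perplexity apply the same idea in reverse: text that a language model finds uniformly easy to predict is flagged as possibly machine-generated.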

For researchers, understanding perplexity helps in:

  • Evaluating the quality of AI-generated outputs.
  • Comparing the performance of different models.
  • Assessing how well the model understands a given dataset or subject matter.
  • Deciphering AI-generated content by identifying contextual oddities.

Considerations for Researchers:

  • Training Data: If a model is trained on specific genres or disciplines, it might show low perplexity for similar content but high perplexity for unfamiliar subjects.
  • Overfitting: A model with too low perplexity might be overfitted to its training data, meaning it might not generalize well to new, unseen data.

Perplexity Illustrative Examples

High Perplexity (Unexpected and Hard to Predict)

  • Prompt: "The capital of France is..."
  • Output: "a rare fruit named blue apple."
  • This output is unexpected. A model trained on general knowledge would expect the answer to be "Paris". The "blue apple" answer would greatly surprise (or have high perplexity for) a well-trained model.

Low Perplexity (Expected and Easy to Predict)

  • Prompt: "The capital of France is..."
  • Output: "Paris."
  • This is the expected answer, so a well-trained model would predict this with low perplexity.

Identifying AI-generated Content:

  • Perplexity Clues: Generative AI models aim for coherence and alignment with the patterns they've learned from their training data. When you encounter an output that seems unexpected or nonsensical based on typical human language patterns, this may be indicative of AI generation.

Example of High Perplexity (Possible AI Error)

  • Text: "The Eiffel Tower, known for its role in the American Revolution, stands tall in Berlin."
  • The above sentence mashes together factually incorrect and geographically disparate elements. An AI, mistakenly joining different data points, might generate such an output, leading to high perplexity for any reader familiar with world history and geography.

Example of Low Perplexity (AI Imitating Human-Like Output)

  • Text: "The Eiffel Tower, an iconic symbol of Paris, is one of the most recognized structures in the world."
  • This text aligns well with what we expect based on common knowledge. While it shows an AI model is accurately replicating human-like knowledge, it doesn't necessarily indicate the content isn't AI-generated.

Burstiness

What is it? Burstiness refers to the tendency of certain events or terms to appear in clusters rather than uniformly or randomly distributed. In the context of AI-generated content, it can manifest as repetitive or clustered outputs when you might expect more diverse responses.
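
As a rough illustration, the Python sketch below measures one simple proxy for burstiness: the share of content words taken up by the single most repeated term. The function name, stopword list, and sample texts (condensed from the examples later in this section) are all invented for illustration; real detection tools use far more sophisticated measures.

    import re
    from collections import Counter

    # Common function words to ignore when counting repetition.
    STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are",
                 "with", "from", "this", "it", "its", "have", "has"}

    def top_term_share(text):
        # Fraction of content words accounted for by the most common
        # content word: a crude proxy for repetitive, "bursty" text.
        words = [w for w in re.findall(r"[a-z]+", text.lower())
                 if w not in STOPWORDS]
        term, count = Counter(words).most_common(1)[0]
        return term, count / len(words)

    bursty = ("Rainforests are dense. Rainforests have dense vegetation. "
              "Dense trees fill rainforests.")
    varied = ("Rainforests are vibrant ecosystems teeming with diverse "
              "life, from canopy trees to undergrowth.")

    print(top_term_share(bursty))   # one term dominates: high repetition
    print(top_term_share(varied))   # vocabulary spread out: low repetition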

Why is it important? Understanding burstiness is essential because it provides insight into:

  • Quality of AI-generated content: Repetitive or overly similar content might indicate the model's limited understanding or inherent biases.
  • Data patterns: Recognizing burstiness helps in identifying the patterns or biases present in the training data.

For researchers, grasping the concept of burstiness can aid in:

  • Detecting anomalies or repetitive patterns in AI-generated outputs.

  • Understanding potential biases in the training data.
  • Ensuring the diversity and quality of results for research applications.
  • Deciphering AI-generated content by identifying:
    • Repetition & Overemphasis: If a text frequently repeats themes, words, or ideas in a way that seems unnatural or overly emphasized, it might be a sign of AI generation.
    • Clustered Information: AI-generated content can sometimes present clustered information, meaning it might delve too deeply into a specific topic or idea, neglecting a more balanced approach.
    • Consistent Themes in Varied Inputs: If, upon giving varied prompts or inputs, the AI tends to veer towards certain themes or terms consistently, it can be an indication of its bursty nature and a hint that the content is AI-generated.

Considerations for Researchers:

  • Training Data: A model trained on data with inherent burstiness or repetition is more likely to produce clustered outputs.
  • Adjusting Parameters: Sometimes, adjusting model parameters or input prompts can help in mitigating burstiness in outputs.
  • Interdisciplinary Awareness: Understanding the norms of term usage in various disciplines can help in distinguishing genuine burstiness from discipline-specific patterns.

Burstiness Illustrative Examples

High Burstiness (Repetitive and Clustered)

  • Prompt: "Describe the ecosystem of a rainforest."
  • Output: "Rainforests are dense, moist, and full of life. Rainforests have dense vegetation. Dense trees are a common sight in rainforests. The density of the rainforests provides shelter to many animals. Because of this density, rainforests are unique."
  • Here, the output excessively focuses on the "density" aspect of rainforests, repetitively using variants of the term "dense", showcasing high burstiness.

Low Burstiness (Diverse and Spread Out)

  • Prompt: "Describe the ecosystem of a rainforest."
  • Output: "Rainforests are vibrant ecosystems teeming with diverse life. From the tall canopy trees to the undergrowth, they are home to countless species of plants, animals, and insects. The humid climate, regular rainfall, and rich soil foster rapid plant growth. This dense vegetation, in turn, provides shelter and sustenance for a plethora of animal species."
  • This output offers a varied and comprehensive view of rainforests without unnecessary repetition or clustering around a single theme.

Identifying AI-generated content:

  • Burstiness Clues:
    • Repetitiveness and overemphasis on certain words or themes can be a sign of AI generation. This might occur due to biases in the training data or the AI's propensity to get "stuck" on certain topics.

Example of High Burstiness (Possible AI Overemphasis)

  • Text: "Cats are popular pets. Cats are often kept in homes. Many people love cats because cats are affectionate. Cats, with their playful nature, make homes lively. It's no wonder cats are loved."
  • The repeated emphasis on "cats" in various contexts within a short span hints at AI generation. A human writer would likely introduce more diversity in phrasing and content.

Example of Low Burstiness (AI Imitating Diverse Human-Like Output)

  • Text: "Cats are popular pets known for their playful nature. They are often loved for being affectionate, and their distinct personalities make each one unique."
  • This content flows more naturally, offering diverse insights about cats without overly repetitive phrasing. However, smooth and diverse content doesn't mean it's not AI-generated; it just means the AI is doing a good job imitating human-like writing.

When using generative AI tools in writing, acknowledge and cite the output of those tools in your work. Norms and conventions for citing AI-generated content are likely to evolve over the next few years, but the guides below provide information on current guidelines from the major style guides.

In the past few years there have been increasing instances of systematic manipulation of the publishing process. Fraudulent manuscripts that resemble legitimate research articles have made their way through peer review and have been published in reputable journals. “Paper mills,” organizations that produce and sell fraudulent manuscripts, are at the center of this problem. This Nature news article describes how more than 10,000 research articles were retracted in 2023 due to integrity issues and the systematic manipulation of the publishing process.

The impact of fraudulent research published with the stamp of authority of a peer-reviewed journal is far-reaching. Not only does it damage the trust researchers place in the publication system, but fraudulent findings may also be built upon in later work, wasting researchers' time and money. Scholarly journals have become increasingly aware of paper mill articles and are working to develop methods to screen for them; for individual researchers, these published articles are extremely difficult to detect. While this issue is not directly the result of AI, AI is exacerbating the problem by making it easier to fabricate research, and the methods described in this guide for detecting AI may be useful in detecting fabricated research. Additionally, researchers should become familiar with how retractions are communicated in their discipline and journals. This can help avoid citing fraudulent research, though of course it only applies to research that has already been identified as fraudulent. The Retraction Watch database is a tool that can be used to identify retracted journal articles.
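
For a programmatic check, the Retraction Watch data is now also distributed through Crossref, whose public REST API can report whether a later record (such as a retraction notice) updates a given DOI. The sketch below is a hedged illustration: the DOI is a hypothetical placeholder, and it assumes the documented behavior of Crossref's "updates" filter.

    import requests

    # Ask the Crossref REST API whether any later record (e.g., a
    # retraction or correction notice) updates a given DOI.
    DOI = "10.1234/example.doi"  # hypothetical placeholder DOI
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"filter": f"updates:{DOI}", "rows": 20},
    )
    items = resp.json()["message"]["items"]

    if not items:
        print(f"No retraction or correction notices found for {DOI}")
    for item in items:
        for update in item.get("update-to", []):
            # update["type"] is, e.g., "retraction" or "correction"
            print(update.get("type"), "notice:", item.get("DOI"))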
