Bradley Naylon
bsnaylon@towson.edu
(410) 746-8266
The purpose of this study is to gauge the efficacy of generative AI in online data collection
environments. The results of this study will aid in our understanding of the difference in quality of
data collected from individuals and data collected from individuals assisted by a generative AI.
Stanford Report published an article reporting that a third of individuals surveyed have used
generative AI to help answer online surveys. The article discusses the issues involved in training AI
on generated content and how models can lose meaning when trained on this data. This concept is
explored in depth in the article "AI models collapse when trained on recursively generated data",
which details that models can collapse entirely if they are trained on content produced recursively.
Though there are proven issues with training AI on recursively generated data, the use of AI in
surveys also has proven benefits as an accessibility and probing tool. NORC at the University of
Chicago details some of these benefits. "These chatbots can clarify questions, prompt for more
detailed answers, and create a more interactive survey experience. This could help reduce
respondent fatigue and improve the depth and quality of the data we collect." Additionally, NORC
found that specificity and explanatory quality grew by 73% and 65%, respectively. However, this
study used AI purely as a probing tool, not one that generates text for the user.
This study aims to fill the gap in knowledge about data generated by AI chatbots in a survey
setting, where the chatbot serves as an assistive and probing tool. Though the NORC studies
reviewed AI chatbots in survey settings, they did not analyze text generated by the chatbot. And
though there are proven negative effects of using generated text in survey responses, those
responses were generated by models with general instruction sets, not models tailored to aid a
user in answering survey questions.
By creating a chatbot whose purpose is to expand on and increase specificity in user responses, we
may be able to maintain both the benefits of increased specificity and the complexity and nuance
of human-generated text. With data generated by humans and assisted by AI, we may be able to
create large amounts of training text that do not lead to model collapse over time.
Citations:
Generative AI can enhance survey interviews: NORC at the University of Chicago. Generative AI Can
Enhance Survey Interviews | NORC at the University of Chicago. (n.d.).
https://www.norc.org/research/library/generative-ai-can-enhance-survey-interviews.html
Expert view: The Promise & Pitfalls of AI-augmented survey research. The Promise & Pitfalls of AI-Augmented Survey Research | NORC at the University of Chicago. (n.d.).
https://www.norc.org/research/library/promise-pitfalls-ai-augmented-survey-research.html#:~:text=AI%2DAssisted%20Chatbots%20to%20Improve%20Survey%20Delivery&text=These%20chatbots%20can%20clarify%20questions,of%20the%20data%20we%20collect.
Shumailov, I., Shumaylov, Z., Zhao, Y., Papernot, N., Anderson, R., & Gal, Y. (2024, July 24). AI
models collapse when trained on recursively generated data. Nature News.
https://www.nature.com/articles/s41586-024-07566-y
Survey participants are turning to AI, putting results into question. Stanford Report. (n.d.).
https://news.stanford.edu/stories/2024/11/ai-generated-survey-responses-could-make-research-less-accurate-lot-less-interesting
Professors from Towson University whose classes are related to the topics of human computer
interaction, interface design, and machine learning will be sent recruitment information for this
study via TU email.
Participants are required to register a codename and confirm that they have read this consent form
and agree to its terms. Once participants have agreed to the terms of the consent form, they will be
redirected to the study’s data collection form. The codename should not contain any personally
identifiable information. If a codename containing personally identifiable information is
registered, the codename and its related data will be destroyed.
If you choose to participate in this study, you will be asked to answer a series of questions twice:
once independently and once with the assistance of an AI chatbot named “Kerry”. Participation in
this study should take approximately 15 minutes.
Questions are designed to be open-ended. The topics of the provided questions vary greatly and are
akin to in-depth “ice-breaker” or “small-talk” questions. No question will ask for personally
identifiable information. Please do not include any information in your responses that could be
used to identify you.
Students who wish to participate for extra credit should remember the codename used in the study,
and provide it to their instructor as described later in this document.
In order to participate in this study, you must be 18 years or older.
This study requires that you interact with a generative AI. This AI is designed to provide a user-friendly and supportive experience, ensuring that interactions are respectful and non-intrusive. If you feel the AI is generating responses that do not align with the intended experience, you may close the chatbot and exit the study at any time.
Participants who are enrolled as students at Towson University may directly benefit from the study as outlined in the Compensation section below. Participants who are not enrolled as students at Towson University are not expected to benefit directly from this study.
If proven effective, AI assistive chatbots may be used in data collection environments to:
Participation in this study is voluntary. You are free to withdraw or discontinue participation in this study at any time without penalty.
You may also choose not to answer any questions that you do not want to answer.
“Select courses” are defined as courses in which the professor has informed the PI that
they would like to enroll their class section in the extra credit program.
Participants who are not enrolled in select courses at Towson University will not receive
compensation.
Students from select courses at Towson University may receive extra credit for participating in this
study.
Extra credit will be granted to students from select courses who submit two sets of answers (one
set independently, one set using Kerry). Extra credit will not be granted to students who submit only
one set of answers, or who do not complete all questions in the survey in both sets of answers.
Students must note the codename used to submit their answers and provide that codename to
their professor, provided the professor is participating in the extra credit program for the study.
To maintain anonymity of student participants, Investigators will not have access to the list
associating codenames with student names, and survey answers will not be published or shared
with associated codenames.
To ensure your anonymity, we are not collecting any information that could be used to identify you. The program used to house the questionnaire will be set to remove all identifying information from the dataset, including IP addresses. We ask that you NOT place your name or other information that could identify you on your questionnaires. Any publications or reports that result from this research will not include identifying information on any participant.
If any identifiable or confidential information is submitted, the data will be destroyed immediately upon discovery, along with any information pertaining to the submission or codename of that individual. Any text stored by the chatbot vendor (OpenAI) will also be overwritten and destroyed.
If you have any questions regarding your rights as a research participant, please contact the Institutional Review Board Chairperson, Dr. Elizabeth Katz, Office of University Research Services, 8000 York Road, Towson University, Towson, Maryland 21252; phone (410) 704-2236. If you have questions about the study or if you wish to withdraw your consent, please contact the Investigators: Bradley Naylon; phone (410) 746-8266/email bsnaylon@towson.edu, or Jinjuan Feng; phone (410) 704-3463/email jfeng@towson.edu.
By clicking "Register Codename and Continue" below, I am indicating my understanding that (a) I am participating in a research study; (b) my participation is completely voluntary and that I can withdraw my consent at any time without penalty; and (c) I do not have to answer any questions I do not want to answer.
*Please do not use any information that could be used to identify you in your codename!