Correct Yourself, Keep My Trust: How Self-Correction and Social Connection Shape Credibility in Social Chatbots
This study compares webpage retraction, self-correction, and expert correction in a social chatbot. All three correct the false belief, but only self-correction preserves the chatbot's credibility, and stronger social connection predicts larger belief change only when the chatbot corrects itself.
Key Findings
Methodology
This research employs a randomized controlled experiment with three groups (N=120), comparing webpage retraction, self-correction, and expert correction strategies in a social chatbot context. The chatbot, Drew, powered by GPT-4, was designed to simulate health misinformation errors. Participants interacted with Drew through a four-phase process: rapport building via small talk, exposure to misinformation, a second interaction, and correction based on assigned condition. The study measured trustworthiness, perceived expertise, social connection (via social attraction and self-disclosure), and belief change through Likert-scale questionnaires and qualitative interviews. Statistical analyses included ANOVA, bootstrap confidence intervals, and permutation tests to ensure robustness. The study also examined how social connection moderates correction effectiveness, using correlation analyses.
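The summary names permutation tests and bootstrap confidence intervals but does not describe how they were run. As a rough sketch only, and not the authors' analysis code, the Python below shows one common way to apply both to the difference in mean trust ratings between two conditions. The function names, resample counts, and the ratings themselves are hypothetical placeholders, not data from the study.

```python
import random
import statistics


def permutation_test(a, b, n_perm=5000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = statistics.fmean(a) - statistics.fmean(b)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_perm):
        # Shuffle the condition labels by shuffling the pooled ratings.
        rng.shuffle(pooled)
        diff = (statistics.fmean(pooled[:len(a)])
                - statistics.fmean(pooled[len(a):]))
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (extreme + 1) / (n_perm + 1)


def bootstrap_ci(a, b, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the difference in group means."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping group sizes fixed.
        resampled_a = rng.choices(a, k=len(a))
        resampled_b = rng.choices(b, k=len(b))
        diffs.append(statistics.fmean(resampled_a)
                     - statistics.fmean(resampled_b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical 7-point Likert trust ratings, invented for illustration only.
self_correction = [6, 6, 5, 7, 6, 5, 6, 7, 6, 5]
expert_correction = [4, 5, 4, 3, 5, 4, 4, 5, 3, 4]

diff, p_value = permutation_test(self_correction, expert_correction)
ci_low, ci_high = bootstrap_ci(self_correction, expert_correction)
print(f"mean difference = {diff:.2f}, permutation p = {p_value:.4f}")
print(f"95% bootstrap CI = [{ci_low:.2f}, {ci_high:.2f}]")
```

Neither procedure assumes normally distributed ratings, which is one plausible reason to pair them with ANOVA when the outcome is an ordinal Likert score.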
Key Results
- All three correction strategies reduced belief in the false information, with no significant difference among them (p>0.05). However, only self-correction preserved or enhanced perceived trustworthiness and expertise: mean trust was 5.59 (SD=1.14) for self-correction, compared with 4.79 (SD=1.47) for expert correction and 4.76 (SD=1.45) for webpage retraction (p<0.05).
- Social connection measures, including social attraction and self-disclosure, significantly predicted belief change only in the self-correction condition (r>0.4, p<0.01), indicating that social bonds amplify correction effectiveness.
- Participants viewed self-correction as a sign of honesty and competence, which humanized the chatbot and increased trust. Conversely, external corrections were perceived as less sincere, leading to trust erosion.
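To put the reported trust gap on a standardized scale, the means and standard deviations above can be converted to Cohen's d. This is a back-of-the-envelope sketch, not a statistic reported in this summary. It assumes the 120 participants were split evenly into 40 per condition, which the summary does not state, and the function name is illustrative.

```python
import math


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = (((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2)
                  / (n_a + n_b - 2))
    return (mean_a - mean_b) / math.sqrt(pooled_var)


# Means and SDs are the trust scores reported in the results above.
# ASSUMPTION: 40 participants per condition (N=120 split evenly).
N_PER_GROUP = 40

d_vs_expert = cohens_d(5.59, 1.14, N_PER_GROUP, 4.79, 1.47, N_PER_GROUP)
d_vs_webpage = cohens_d(5.59, 1.14, N_PER_GROUP, 4.76, 1.45, N_PER_GROUP)

print(f"self-correction vs expert correction:  d = {d_vs_expert:.2f}")
print(f"self-correction vs webpage retraction: d = {d_vs_webpage:.2f}")
```

Under the equal-group assumption, both contrasts come out at roughly d = 0.6. With equal group sizes the pooled variance is just the average of the two variances, so the result does not depend on the exact per-group count.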
Significance
This work advances understanding of trust repair mechanisms in AI-human interactions by demonstrating that self-correction, when combined with strong social bonds, can repair and even boost credibility. It challenges the conventional focus solely on correction accuracy, highlighting the importance of social connection as a functional component that enhances the effectiveness of trust repair strategies. The findings have practical implications for designing AI systems that maintain long-term user trust, especially in domains prone to misinformation such as health and finance. Theoretically, it integrates social psychology principles into AI trust models, providing a nuanced perspective on how social bonds influence belief updating and credibility perception.
Technical Contribution
The paper introduces a novel experimental framework combining GPT-4-based social chatbots with social connection metrics to evaluate trust repair strategies. It systematically compares three correction methods, employing rigorous statistical validation. The study integrates social attraction and self-disclosure as moderating variables, providing empirical evidence for their role in amplifying correction effects. It also extends CASA (Computers Are Social Actors) theory into the domain of AI trust repair, offering a new model for understanding social influence in automated systems. The methodology and findings contribute to the development of more human-like, trustworthy AI agents capable of self-correction without compromising credibility.
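The moderation claim rests on computing the correlation between social connection and belief change separately within each correction condition. The sketch below illustrates that per-condition analysis under stated assumptions: the record layout, the condition labels, and every number in it are hypothetical, and the study's actual scoring of social attraction and self-disclosure is not described in this summary.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def correlation_by_condition(records):
    """Group (condition, connection, belief_change) rows and correlate each."""
    grouped = {}
    for condition, connection, belief_change in records:
        xs, ys = grouped.setdefault(condition, ([], []))
        xs.append(connection)
        ys.append(belief_change)
    return {cond: pearson_r(xs, ys) for cond, (xs, ys) in grouped.items()}


# Hypothetical rows: (condition, social connection score, belief change).
# These values are invented for illustration and are not study data.
records = [
    ("self", 1, 0.5), ("self", 2, 1.0), ("self", 3, 1.2), ("self", 4, 2.1),
    ("self", 5, 2.4), ("self", 6, 3.2), ("self", 7, 3.3),
    ("expert", 1, 2), ("expert", 2, 1), ("expert", 3, 3), ("expert", 4, 1),
    ("expert", 5, 2), ("expert", 6, 3), ("expert", 7, 2),
]

for condition, r in correlation_by_condition(records).items():
    print(f"{condition}: r = {r:.2f}")
```

A strong correlation in one condition alongside a weak one in another is suggestive rather than conclusive. A formal moderation test would compare the correlations directly or fit an interaction term, and the summary does not say which approach the authors took.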
Novelty
This research is the first to empirically compare webpage retraction, self-correction, and expert correction within a social chatbot context, explicitly examining the moderating role of social connection. It uniquely demonstrates that self-correction enhances perceived credibility and trustworthiness, contrary to prior assumptions that external correction is superior. The integration of social connection as a functional mechanism, rather than a mere design feature, marks a significant innovation. Additionally, leveraging GPT-4's advanced language capabilities to simulate realistic interactions adds a new dimension to trust repair studies.
Limitations
- The experimental setting focused on health misinformation, which may limit generalizability to other domains such as finance or politics. Future work should validate across diverse content areas.
- The modest sample size (N=120) limits statistical power, and online recruitment may introduce sampling bias; larger, more diverse samples are needed for broader applicability.
- The study primarily assesses short-term trust repair; long-term effects and behavioral outcomes require longitudinal investigation to understand sustained trust dynamics.
Future Work
Future research should explore cross-cultural differences in trust repair, extend to multi-modal interactions including voice and video, and develop adaptive self-correction algorithms that learn from user feedback. Investigating long-term trust maintenance mechanisms, integrating emotional intelligence, and deploying in real-world applications such as healthcare assistants or customer service bots will be crucial. Additionally, exploring how social connection can be systematically enhanced through design features to maximize correction efficacy remains an open avenue.
AI Executive Summary
In an era overwhelmed by misinformation, maintaining user trust in AI-powered social chatbots is a critical challenge. Errors made by these agents, especially in sensitive domains like health, can severely damage credibility and user engagement. Traditional correction methods, such as webpage retractions or external expert interventions, often fall short because they either fail to restore trust or inadvertently erode it. This creates a pressing need for more effective, trust-preserving correction strategies.
This study introduces a novel approach centered on self-correction, grounded in social psychology theories like CASA (Computers Are Social Actors). By comparing three correction strategies—webpage retraction, self-correction by the chatbot, and correction by an external expert—the researchers sought to understand their impact on credibility and belief change. The experiment involved a GPT-4-based social chatbot named Drew, designed to simulate health misinformation errors. Participants interacted with Drew through a structured four-phase process, with the key phase being the correction, which varied according to the assigned condition.
All three strategies effectively reduced belief in the false information, but only self-correction preserved or enhanced the chatbot’s perceived trustworthiness and expertise. The self-correcting chatbot received a higher mean trust score (5.59) than the chatbot corrected by an expert (4.79) or by a webpage retraction (4.76). Moreover, the study found that the strength of social connection, measured via social attraction and self-disclosure, significantly predicted belief change, but only when the same chatbot delivered the correction. This indicates that social bonds amplify the effectiveness of self-correction, making it a powerful mechanism for trust repair.
Qualitative feedback revealed that users interpret self-correction as honesty and responsibility, which humanizes the chatbot and fosters trust. Conversely, external corrections are perceived as less sincere, often leading to diminished trust. These insights suggest that AI developers should prioritize enabling chatbots to self-correct, especially in contexts where social bonds are established. Building social connection before a correction is needed may also make that correction more persuasive, since stronger connection predicted larger belief change in the self-correction condition.
The implications extend beyond health chatbots. Designing AI systems that can recognize and fix their own mistakes while maintaining social rapport could help sustain user trust and engagement in other domains. However, limitations such as domain specificity, sample size, and short-term focus highlight the need for further research. Future work should explore long-term trust dynamics, multi-modal interactions, and adaptive learning mechanisms.
In conclusion, this research underscores the importance of social connection in trust repair. Self-correction, when combined with strong social bonds, not only repairs but can also strengthen user trust in AI agents. This paradigm shift offers a promising pathway toward more trustworthy, human-like AI systems capable of sustaining long-term relationships with users.
Deep Dive
Abstract
When social chatbots make mistakes, and they do, how they recover determines whether users trust them again. Social chatbots are increasingly integrated into everyday life, yet they remain prone to generating convincing but inaccurate information. The social connection they build with users makes such errors particularly consequential. We conducted a between-subjects experiment (N=120) comparing three error correction strategies: a webpage retraction, self-correction by the same social chatbot, and correction by an expert chatbot. Our results reveal two key findings. First, all three strategies corrected the error equally well, but only self-correction did so without damaging the chatbot's credibility: participants rated self-correcting chatbots significantly higher in both trustworthiness and perceived expertise than chatbots whose errors were corrected by external sources. Second, the strength of the user's social connection with the chatbot, measured through social attraction and self-disclosure, significantly predicted the magnitude of belief change, but only when the chatbot corrected itself. Outsourcing corrections to an external source severed this link entirely. These findings suggest that social chatbots should correct their own mistakes rather than outsource corrections, and that investing in social connection is a functional mechanism that amplifies correction effectiveness, not merely a design feature. We discuss implications for designing chatbots that maintain long-term credibility while effectively addressing their own errors.