Insight Summarizer

Introduction

We’re thrilled to announce the launch of a powerful new feature on the Rival Platform: "AI Insights." This tool elevates the way users can analyze and interpret responses to both open-ended text and video questions within their chats. Below is a straightforward summary designed to help users fully leverage this innovative feature.

Accessing AI Insight Summarizer

You can access the AI Insights feature from the "Chat Data" (#1) section within any chat on the Rival Platform. The chat data section offers an index of all research questions associated with any given chat. Researchers can review and interact with responses by selecting a question from the “chat data list” and examining the responses in the answer table displayed to the right of their selection. The new “AI Insights” feature is conveniently located in a tab next to the "All Responses" tab (#2) for any selected open-ended text or video question. NOTE: this feature is only available for open ended text and video questions.

Open AI insights empty state.png

Generating and Viewing Insights

When you first open the AI Insights tab, it will be empty. To generate insights, simply click the prominent "GENERATE INSIGHTS" (#3) button. The AI will then analyze the open-ended responses to identify key themes based on the available data. For example, if the question is about what makes a soda refreshing, the AI might highlight themes such as carbonation, temperature, flavor, and preference for natural ingredients. Please note that at least ten responses are required for this feature to function correctly. If fewer than ten responses are available, a warning message will notify you that there is insufficient data for thematic analysis.

Insight Details and Confidence Scores

Each insight is accompanied by verbatim responses that are relevant to the theme, sorted by a relevance score (e.g., 96 out of 100). These scores help users evaluate the validity and quality of the insights. Additionally, each insight is given a confidence score, indicating the reliability of the theme, with scores like 93 out of 100 or 92.5 out of 100. These confidence scores allow users to prioritize the most trustworthy insights. Each time the user select “generate insights” the system will provide a list of the top 5 themes sorted from high confidence to low.

Working with Video Responses

The AI Insights feature also extends to video responses. When viewing insights for video questions, users can see both the verbatim text and the associated video clip, providing a richer context for the insights.

Actions and Customization

Users can take several actions on individual insights, such as editing the title or description, deleting them, or exporting them to Excel. These actions help users manage and customize the insights according to their needs. There are also options to perform actions at the insight set level, such as selecting and exporting multiple insights at once.

Conclusion

The AI Insights feature on the Rival Platform offers a streamlined way to analyze and interpret open-ended responses, whether text or video. With features like relevance and confidence scores, and the ability to take actionable steps, users can efficiently manage their data insights. This marks the first version of the AI Insights feature, and the team looks forward to receiving user feedback to continue improving this tool.

Demo video

Here

How AI Insights are generated

READER NOTE: This section is a technical explanation of how the AI is working to generate the insights.

Step 1: Generate Sentence Embeddings

Our insight generation process begins by generating sentence embeddings for all the survey responses (AKA "verbatims"). A sentence embedding is like a special code that turns a sentence into numbers that a computer can understand. This code captures the meaning of the sentence, so the computer can compare it with other sentences. In a vector database, these codes help the computer store and organize sentences based on their meanings, making it easier to find and group similar ideas. These embeddings represent the semantic meaning of each response in a numerical format that the AI can understand and process.

Step 2: Store Embeddings in a Vector Database

Once the sentence embeddings are generated, they are stored in a vector database. This database acts as the AI's "brain," where each verbatim is organized based on its semantic properties, allowing the AI to establish relationships between responses.

We establish a semantic footprint for each verbatim by embedding them in a vector database

Step 3: Create a Centroid Embedding Representing Insightfulness

In this step, we create a “Centroid” embedding that helps represent and define the concept of "insightfulness" within the context of the survey. We also provide the message text from the initial question to provide the AI with more context for the task of evaluating and generating themed insights that are relevant to the question that was asked. This “centroid” serves as a reference point, capturing the essence of what makes a response “insightful” according to the AI's understanding.

We establish a semantic centre in our vector database to help the AI understand the context for the thematic analysis task.

Step 4: Sort Embeddings Based on Proximity to the Centroid Embedding

With the insightfulness centroid established, the AI sorts all the stored embeddings based on their proximity to this centroid. This sorting process helps prioritize the responses that are most closely aligned with the goal of generating meaningful insights.

Step 5: Fill the Context Window with the Top 12K Tokens of Sorted Embeddings

To generate insights, the AI fills its “context window” with up to 12,000 tokens worth of the closest sorted embeddings (remember an embedding at this stage is an instance of a question response). A “context window” in AI applications refers to the amount of information the AI can "see" or "remember" at one time when processing text. A “token” in AI applications is like a small piece of a word or a word itself. When AI processes text, it breaks down sentences into these smaller parts, called tokens, to understand and analyze the content. For example, in the sentence "I love cats," the words "I," "love," and "cats" might each be treated as individual tokens. Some words, especially longer ones, can be split into multiple tokens. Tokens help the AI understand and work with the text more effectively. This step ensures that the AI is working with the most relevant and insightful responses when analyzing the data.

Step 6: Generate the Top 5 Insights Capturing Consensus Amongst Responses

Using the information within the context window, the AI generates the top five insights from the survey responses. These insights are designed to capture the consensus among the responses, meaning they reflect common themes or agreements across the data rather than focusing on outlier opinions or individual relevancy. The AI analyzes the semantic similarity of the responses within the context window, identifying patterns that reveal what the majority of respondents are expressing. By focusing on these areas of consensus, the AI ensures that the generated insights accurately represent the collective voice of the survey participants. This step is crucial for drawing out the most widely shared perspectives, providing a clear and representative summary of the survey's overall findings.

The AI runs thematic analysis on the response set and then generates the top 5 themes based on that analysis

Step 7: Create Centroid Embeddings for Each Insight

For each of the top five insights, the AI creates a corresponding centroid embedding. These centroids represent the semantic center of each insight, allowing the AI to further analyze and categorize the responses.

Step 8: Sort Verbatims Relative to the Insight Centroids

Finally, the AI sorts all the verbatims relative to their proximity to the insight centroids. Each response is assigned a "relevance score" based on how closely it aligns with the corresponding insight. Verbatims with high relevance scores are grouped together, ensuring that each insight is supported by the most pertinent responses.

Source verbatims are clustered based on their proximity to each insight themes center

AI Insights Summarizer Version

AI Model Attribution and Prompt Version - We are currently using the GPT 4.o Mini LLM to power our AI Insights summarizer. You can find the AI model and prompt version used for generating insights under the View AI Attribution option in the Actions menu for each insight. This ensures transparency and allows tracking of improvements over time.

GPT Model	Prompt Version	Release Date	Key Benefits
GPT 3.5 Turbo	1.0	7 March, 2024	Introduced AI-powered thematic analysis using GPT-3.5 Turbo, enabling researchers to quickly identify key themes from open-ended text and video responses with confidence and relevance scoring.
GPT 4.o Mini	1.0	31 March, 2025	Upgraded to GPT-4o Mini, delivering faster processing, improved contextual accuracy, and better handling of complex responses for more precise and reliable insights.

AI technology notes

APIs - We are currently using OpenAI’s “Chat completion” API to power our AI summarizer, AI Tone Refinement and AI probing features.

We have also used the OpenAI assistants API to power some experiments in the lab. Currently we have no features productized that use the OpenAI assistants API.

Comparison between the two APIs

Chat Completion API - OpenAI Platform

**General Conversation**:

Designed to handle general conversational tasks with less customization needed.
Suitable for creating simple chatbots or adding conversational capabilities to applications quickly.

**Prompt and Response**:

Focuses on generating text based on a given prompt, ideal for straightforward Q&A and chat interactions.
Provides high-quality responses to user inputs but may require additional layers to handle complex interactions or specific tasks.

**Ease of Use**:

Easier to set up for basic chat functionalities, requiring less initial configuration.
Ideal for scenarios where quick deployment of a conversational AI is needed without extensive customization.

Assistants API - OpenAI Platform.

**Customization and Configuration**:

Designed for building highly customized and specific AI assistants.
Provides tools to set up and manage the assistant’s behavior, knowledge base, and responses.
Allows for integrating specific workflows and tasks tailored to the user's needs.

**Purpose-Built for Assistants**:

Specifically aimed at creating robust virtual assistants that can perform a wide range of tasks.
More suitable for complex, multi-step interactions and handling various user queries.

**Management Features**:

Often includes features for managing the assistant’s performance, such as analytics, user feedback, and continuous learning.

In Summary

Assistants API: Best for creating tailored, multi-functional virtual assistants with advanced customization and management features.
Chat Completion API: Ideal for straightforward chat interactions and quick deployment, focusing on generating text responses based on user inputs.

More details regarding the OpenAI API’s can be found at the following link: API platform

Privacy and Security of AI Features

The Rival Platform adheres to strict privacy and security requirements, including ISO27001:2022, HIPAA, GDPR and SOC 2 Type II. Our partnership with OpenAI falls under the same scrutiny, and we are constantly monitoring for updates to the scope of their services. When we use the OpenAI APIs and LLMs, several data privacy and security features are provided to protect user data and ensure secure interactions. Here are the key features:

Data Encryption:
- In Transit: Data transmitted between your systems and OpenAI servers is encrypted using Transport Layer Security (TLS) to prevent interception and unauthorized access.
- At Rest: Data stored on OpenAI's servers is encrypted to safeguard it from unauthorized access and breaches.
Access Controls:
- API Keys: Secure API key management allows only authorized applications and users to access the API. You can regenerate keys if needed to maintain security.
- Role-Based Access: Granular permissions to control who can access and manage your API keys and data.
Data Minimization:
- Limited Data Retention: OpenAI retains data only for as long as necessary to provide the service, ensuring that data is not kept longer than required.
- Anonymization: Personal data is anonymized to protect user identities.
Compliance with Regulations:
- GDPR and CCPA Compliance: Adherence to data protection regulations such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), ensuring data subjects' rights are respected and protected.
Logging and Monitoring:
- Audit Logs: Detailed logging of API usage to track and monitor access and activities. This helps in detecting and responding to any unauthorized or suspicious activities.
- Monitoring: Continuous monitoring of API endpoints for unusual patterns or potential security threats.
Rate Limiting:
- Throttling: Rate limiting to prevent abuse and ensure fair usage. This also helps in mitigating denial-of-service (DoS) attacks.
Incident Response:
- Security Team: A dedicated security team is in place to respond to and mitigate any security incidents promptly.
- Incident Management: Established procedures for detecting, reporting, and addressing security incidents and breaches.
Regular Security Audits:
- Third-Party Audits: Regular security assessments and audits by third-party experts to identify and fix vulnerabilities.
- Internal Reviews: Ongoing internal security reviews and improvements to maintain high security standards.
Customer Control:
- Data Management: Customers have control over their data, including options for data deletion and export as needed.
- API Customization: Flexibility to customize API usage according to specific security and privacy needs.
Transparency and Documentation:
- Clear Policies: Transparent data privacy and security policies are provided, outlining how data is handled, processed, and protected.
- Comprehensive Documentation: Detailed documentation on best practices for securing API keys and using the API securely.
Employee Training:
- Security Training: Regular training for employees on data privacy, security best practices, and emerging threats to ensure they are equipped to handle sensitive data securely.