Batch Testing is Compass’s quality assurance system for validating agent responses. It lets you define sets of test questions, run them against your agent, and review the results — ensuring your agent provides accurate, compliant responses before (and after) going live.
Regular testing is one of the most effective ways to maintain response quality. Run batch tests after every significant change to your agent’s configuration or knowledge base.

Accessing batch tests

Navigate to Testing in the left sidebar. You’ll see a list of all existing tests and their current status.

Creating a test

1. Navigate to Testing
Click Testing in the left sidebar to open the batch testing page.

2. Click Create Test
Click the Create Test button in the top-right corner.

3. Enter test details
Give your test a descriptive name and select the agent you want to test. The name should reflect the purpose of the test — for example, “Post-KB Update — Product X” or “Guardrail Compliance Check”.

4. Add questions
Add the questions you want your agent to answer. You can use any of the three methods below — or combine them.

5. Run the test
Click Run Test to begin. Compass will send each question to your agent and capture the response.

Adding questions

Compass provides three ways to add questions to a batch test:
Let Compass’s AI generate questions automatically based on your agent’s configuration, knowledge base, and guardrails. This is the fastest way to create comprehensive test coverage. Simply click Generate and Compass will create a set of relevant questions tailored to your agent. You can review, edit, or remove any generated question before running the test.
AI-generated questions are a great starting point, but always review them to ensure they cover the specific scenarios you care about.
You can add more questions to an existing test at any time — even after the test has been run. This makes it easy to expand your test coverage iteratively.

Test statuses

Each batch test moves through the following statuses:
  • Draft — The test has been created but not yet run. You can still add or edit questions.
  • Scheduled — The test is queued and will begin shortly.
  • Running — The test is actively in progress. A progress bar shows completion status.
  • Completed — All questions have been processed and results are available for review.
  • Failed — The test encountered an error and could not complete.
While a test is running, the progress bar updates every few seconds. You can navigate away and return — the test will continue running in the background.
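The lifecycle above can be sketched as a simple state machine. This is purely illustrative — the status names come from the table above, but the transition map is an assumption based on the described flow (Draft → Scheduled → Running → Completed/Failed), and Compass does not expose these statuses programmatically:

```python
# Illustrative sketch of the batch-test status lifecycle described above.
# The transition map is an assumption inferred from the status table;
# it is not part of any Compass API.
TRANSITIONS = {
    "Draft": {"Scheduled"},
    "Scheduled": {"Running"},
    "Running": {"Completed", "Failed"},
    "Completed": set(),  # terminal state
    "Failed": set(),     # terminal state
}

def can_transition(current: str, target: str) -> bool:
    """Return True if a test may move from `current` to `target`."""
    return target in TRANSITIONS.get(current, set())
```

Note that Draft is the only state in which questions can still be edited, while Completed and Failed are terminal — a finished test is re-run by duplicating it (see Test actions below).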

Reviewing results

Once a test completes, click into it to review the results. You can filter the results by:
  • Answer status — Success, Failed, Timeout, or Pending.
  • Rating — Good, Acceptable, Poor, or Unrated.

Rating system

Each answer can be rated to track quality over time:
  • Good — The response is accurate, compliant, and well-written.
  • Acceptable — The response is adequate but could be improved.
  • Poor — The response is inaccurate, non-compliant, or unhelpful.
  • Unrated — The response has not yet been reviewed.

Question detail

Click any question in the results to see the full detail view:
  • Question — The full text of the test question.
  • Answer — The complete agent response.
  • Rating — Set or change the rating.
  • Notes — Add internal notes about the response quality or any issues identified.

Test actions

From the test list or within a test, you can perform the following actions:
  • Rename — Change the test name.
  • Duplicate — Create a copy of the test with all its questions (useful for re-running after changes).
  • Export CSV — Download the test results as a CSV file for offline review or reporting.
  • Delete — Permanently remove the test and its results.
  • Add questions — Add additional questions to an existing test at any time.
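Because results can be exported as CSV, they are easy to summarize offline. The sketch below uses hypothetical column names (`status`, `rating`) — check the header row of your actual export before relying on it:

```python
import csv
from collections import Counter

def summarize_results(path: str) -> dict:
    """Tally answer statuses and ratings in an exported batch-test CSV.

    Assumes columns named 'status' and 'rating' — hypothetical names;
    verify them against the header row of your actual export.
    """
    statuses, ratings = Counter(), Counter()
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            statuses[row.get("status", "Unknown")] += 1
            ratings[row.get("rating", "Unrated")] += 1
    return {"statuses": dict(statuses), "ratings": dict(ratings)}
```

A summary like this is a quick way to track, for example, the share of Good ratings across successive runs of a duplicated regression test.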

Best practices

Test after KB changes

Run batch tests after updating your Knowledge Base to ensure the agent still provides accurate responses with the new content.

Test guardrail changes

After modifying guardrails or compliance settings, run tests that specifically probe the boundaries you’ve configured.

Build a regression suite

Create a standard set of test questions that you run regularly. Duplicate tests to quickly re-run them after making changes.

Combine methods

Use AI-generated questions for broad coverage and manual questions for targeted edge cases. This gives you the best of both worlds.