The Procedures

What is done to beagles in laboratories — every major procedure type

Anatomy of a Toxicology Study

Regulatory toxicology follows a rigid architecture. Every element — group sizes, dose levels, observation schedules, terminal necropsy — is dictated by OECD Test Guidelines and ICH harmonized standards. Understanding the design explains why so many dogs are required and why nearly all of them die.

~95%: Dogs killed at study end. Histopathology requires tissue collection. Source: OECD TG 409
4: Dose groups per study. Control + low + mid + high. Source: ICH M3(R2)
3–5: Dogs per sex per group. Typical non-rodent group size. Source: OECD TG 409
32–80: Dogs per study. Main + recovery + TK satellites. Source: Industry range

Dose Group Structure

Every guideline repeat-dose study assigns dogs to at least four groups. The control group receives the vehicle (solvent or capsule shell) without the test substance. The three treatment groups receive escalating doses designed to bracket the expected human therapeutic exposure. OECD TG 409 sets this as the minimum design: at least three dose levels plus a concurrent control.

| Group | Purpose | Typical size (per sex) | What happens to them |
| --- | --- | --- | --- |
| Control | Vehicle only; baseline for all comparisons | 3–5 | Killed and necropsied at study end |
| Low dose | Expected to show no adverse effects (target NOAEL) | 3–5 | Killed and necropsied |
| Mid dose | Intermediate; defines dose-response curve | 3–5 | Killed and necropsied |
| High dose | Designed to produce observable toxicity | 3–5 | Killed and necropsied; highest suffering |
| Recovery | Assess reversibility of toxic effects after dosing stops | 2–3 (control + high dose only) | Kept alive weeks longer, then killed and necropsied |
| TK satellite | Dense blood sampling for toxicokinetics | 2–3 per key dose level | Killed; prevents blood-draw artifacts in main groups |

Why This Matters
The high-dose group is intentionally set at a level expected to cause toxic effects — organ damage, weight loss, behavioral changes. These dogs experience the worst outcomes by design. The mid dose bridges the gap; the low dose is supposed to approximate a “safe” level. But all groups, including controls, end in death.
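
The group sizes in the table can be added up to see how a single study reaches the 32–80 dog range. The sketch below is illustrative only: the class and field names are hypothetical, and it assumes TK satellites are counted per sex at three treated dose levels, which the table does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StudyDesign:
    """Hypothetical tally of dogs in one repeat-dose study."""
    main_groups: int = 4        # control + low + mid + high
    main_per_sex: int = 4       # table gives 3-5 per sex per group
    recovery_groups: int = 2    # control + high dose only
    recovery_per_sex: int = 2   # table gives 2-3
    tk_dose_levels: int = 3     # ASSUMPTION: satellites at each treated dose
    tk_per_sex: int = 2         # ASSUMPTION: the table's 2-3 is per sex
    sexes: int = 2

    def total_dogs(self) -> int:
        main = self.main_groups * self.main_per_sex * self.sexes
        recovery = self.recovery_groups * self.recovery_per_sex * self.sexes
        tk = self.tk_dose_levels * self.tk_per_sex * self.sexes
        return main + recovery + tk


# Smallest and largest designs using the table's ranges
small = StudyDesign(main_per_sex=3, recovery_per_sex=2, tk_per_sex=2)
large = StudyDesign(main_per_sex=5, recovery_per_sex=3, tk_per_sex=3)
print(small.total_dogs(), large.total_dogs())  # prints: 44 70
```

Under these assumptions both totals fall inside the quoted industry range; designs without TK satellites or recovery animals land nearer the lower bound.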

Randomization & Assignment

Dogs arrive from purpose-bred suppliers (Marshall BioResources, Envigo/Inotiv, or historically Ridglan Farms) after a quarantine and acclimation period of 2–4+ weeks. Before assignment, baseline data is collected: body weight, clinical observations, sometimes baseline ECG or bloodwork.

Stratified randomization
Dogs are assigned to groups using body-weight-stratified randomization so that group mean weights are comparable at study start. This is a GLP requirement — non-random assignment would invalidate the study.
Acclimation is a burden event
Research shows at least 4 weeks of acclimation is needed to eliminate stress-related physiological artifacts from transport and facility changes. Many protocols allow only 7–14 days, meaning dogs may enter dosing while still stressed.
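
Body-weight-stratified randomization can be sketched as follows. This is one common way to do it (rank by weight, cut into blocks the size of the number of groups, shuffle within each block), not necessarily the procedure any particular lab uses. The function name, animal IDs, and weights are hypothetical, and in practice the assignment would be run separately for each sex.

```python
import random
from statistics import mean


def stratified_assign(weights_kg, groups, seed=0):
    """Assign animals to groups so each group gets one animal per weight block.

    weights_kg: dict mapping animal ID -> body weight in kg
    groups:     list of group names
    Returns a dict mapping group name -> list of animal IDs.
    """
    rng = random.Random(seed)
    ranked = sorted(weights_kg, key=weights_kg.get)  # lightest to heaviest
    assignment = {g: [] for g in groups}
    k = len(groups)
    for start in range(0, len(ranked), k):
        block = ranked[start:start + k]   # animals of similar weight
        order = list(groups)
        rng.shuffle(order)                # random group order within the block
        for animal, group in zip(block, order):
            assignment[group].append(animal)
    return assignment


# Hypothetical baseline weights for 16 dogs of one sex
weights = {f"dog{i:02d}": 7.0 + 0.2 * i for i in range(16)}
groups = ["control", "low", "mid", "high"]
result = stratified_assign(weights, groups)
group_means = {g: mean(weights[a] for a in ids) for g, ids in result.items()}
print(group_means)
```

Because every group receives exactly one animal from each weight block, group mean weights cannot differ by more than the widest within-block spread.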

Endpoints: What Is Measured and When

Toxicology studies collect data at three levels: in-life observations, clinical pathology (blood and urine from living animals), and terminal procedures (necropsy and histopathology). Each level adds handling, restraint, and cumulative burden.

In-Life Observations

Twice daily: Morbidity/mortality checks — cage-side assessment of whether the dog is alive, alert, and not in acute distress.
Daily: Clinical observations during and after dosing — food consumption, vomiting, salivation, activity level, stool consistency.
Weekly: Detailed physical exam outside the cage — body weight, palpation, ophthalmology (pre-study and terminal), ECG for some compounds.

OECD TG 409 specifies this cadence: daily clinical observation, twice-daily mortality checks, weekly detailed exams outside the cage when practical.
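
To give a sense of scale, the cadence above can be tallied over a study's dosing period. This is a rough count per dog under the stated schedule; the function name is hypothetical, and pre-study and terminal examinations are not included.

```python
def observation_counts(study_days: int) -> dict:
    """Count scheduled in-life observation events for one dog."""
    return {
        "mortality_checks": 2 * study_days,   # twice daily
        "clinical_observations": study_days,  # once daily
        "detailed_exams": study_days // 7,    # once per completed week
    }


counts = observation_counts(90)
print(counts)
# prints: {'mortality_checks': 180, 'clinical_observations': 90, 'detailed_exams': 12}
```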

Clinical Pathology

Blood is sampled for hematology, clinical chemistry, and coagulation; urine is collected for urinalysis. Samples are typically taken pre-study (baseline), at interim timepoints, and at terminal sacrifice. Toxicokinetic sampling is denser: up to 8–11 timepoints per day on designated days (e.g., Day 1 and the last dosing day), at ~0.5 mL per draw.

For a 10-kg beagle with ~850 mL of circulating blood, a typical institutional limit of 10% of blood volume per 14 days works out to ~85 mL. A TK day with 11 draws at 0.5 mL totals only ~5.5 mL, well within that limit; the burden lies in the handling and restraint, not the volume.
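
The arithmetic behind those figures is sketched below. The 85 mL/kg constant is simply what the ~850 mL figure implies for a 10-kg dog, and the function name is hypothetical.

```python
ML_PER_KG = 85.0  # implied by ~850 mL circulating blood in a 10-kg beagle


def blood_budget(body_weight_kg, draws, ml_per_draw=0.5, limit_fraction=0.10):
    """Return (circulating mL, 14-day limit mL, drawn mL, fraction of limit used)."""
    circulating = body_weight_kg * ML_PER_KG
    limit = circulating * limit_fraction
    drawn = draws * ml_per_draw
    return circulating, limit, drawn, drawn / limit


circulating, limit, drawn, used = blood_budget(10, draws=11)
print(f"{circulating:.0f} mL total, {limit:.0f} mL limit, "
      f"{drawn:.1f} mL drawn ({used:.1%} of limit)")
# prints: 850 mL total, 85 mL limit, 5.5 mL drawn (6.5% of limit)
```

A single TK day uses under a tenth of the 14-day allowance, which is why volume limits alone say little about the burden of repeated restraint and needle access.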

Terminal Procedures

At the scheduled endpoint, dogs are euthanized (typically IV pentobarbital overdose) and subjected to complete necropsy: gross examination of all body cavities, organ weight measurement (heart, liver, kidneys, brain, adrenals, thyroid, etc.), and fixation of 40+ tissue sites in formalin for microscopic examination.

A board-certified veterinary pathologist examines H&E-stained slides from every tissue. This histopathological examination is the core regulatory deliverable — it identifies target organs of toxicity and establishes the no-observed-adverse-effect level (NOAEL) that determines safe human starting doses.

Why ~95% Are Killed

Histopathology requires tissue. You cannot biopsy 40+ organ sites from a living dog. OECD test guidelines explicitly require terminal necropsy of all surviving animals with complete gross and microscopic pathology. This data is not supplementary — it is the primary deliverable that regulators use to assess compound safety.

Even recovery groups end in euthanasia and necropsy. The question those groups answer is whether pathology findings reversed after dosing stopped, not whether the dog survives. EU data show a ~39% reuse rate for dogs (5,659 reuses in 2022), primarily in telemetry and crossover designs. But reuse delays death; it does not prevent it. Rehoming is rare: UK data record 44 of 10,456 dogs (0.4%) rehomed across 41 facilities.

Key Finding
The lethal endpoint is not incidental. It is the purpose. The entire study is designed to produce tissue slides. Every other measurement — clinical observations, body weights, blood chemistry — provides supporting context for the pathology findings. The dogs are the consumable medium.

Recovery Groups: The ~5% That Survive Longer

A subset of dogs — usually from the control and high-dose groups only — continues to be housed and monitored for 2–4 weeks after dosing ends. The purpose is to determine whether treatment-related pathology resolves, stabilizes, or progresses after exposure stops. This data directly influences regulatory risk assessment: reversible findings are considered less severe.

“Recovery” is a regulatory term, not a welfare outcome
Recovery dogs undergo the same terminal necropsy and full histopathology panel as main-study dogs. They live longer only to provide a delayed tissue snapshot. If organ damage has repaired by the time the dog is killed, the finding is classified as “reversible.” If it persists, it is “irreversible” — a more serious regulatory flag. Either way, the dog dies.

Statistical Power & Group Sizes

Non-rodent toxicology uses small group sizes (3–5 dogs per sex per group) compared to rodent studies (10–20 per sex per group). This creates an inherent tension: small n means low statistical power to detect anything but large treatment effects.

Why so few dogs?
Regulatory guidelines set the minimum at 4 dogs/sex/group (OECD TG 409). The small n reflects ethical pressure to minimize animal use, the high per-animal cost of dogs (~$5,000–$8,000 each from breeders), and the long per-animal processing time for necropsy and histopathology. In practice, many studies use 3/sex/group.
What this means for detection
With n=4 per group, a study can only reliably detect differences of ~2 standard deviations between groups (effect size d > 2.0). Subtle toxicity is routinely missed. Regulators compensate by relying heavily on histopathology — where a single finding in one dog can be biologically significant even without statistical significance.
Methodology Caveat
This is why individual-animal pathology data matters more than group statistics in dog studies. A liver lesion in 1 of 4 high-dose dogs is 25% incidence — not statistically significant, but potentially treatment-related if the finding is unusual. The pathologist’s judgment, not a p-value, drives the interpretation.
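
Both claims can be checked with a short calculation. The sketch below uses a normal approximation for the minimum detectable effect size (two-sided alpha of 0.05 and 80% power are assumptions, not values from the guidelines) and a hand-rolled two-sided Fisher exact test for the 1-of-4 example. Function names are hypothetical. The normal approximation is optimistic at n=4; a t-based calculation gives a somewhat larger threshold.

```python
from math import comb, sqrt
from statistics import NormalDist


def min_detectable_d(n_per_group, alpha=0.05, power=0.80):
    """Approximate smallest standardized difference detectable by a two-group comparison."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return (z_alpha + z_beta) * sqrt(2 / n_per_group)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x events in row 1, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table at least as unlikely as the observed one
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


d_min = min_detectable_d(4)
# High dose: 1 dog with lesion, 3 without; control: 0 with lesion, 4 without
p_one_of_four = fisher_exact_two_sided(1, 3, 0, 4)
print(f"d_min ~ {d_min:.2f}, p = {p_one_of_four:.2f}")
# prints: d_min ~ 1.98, p = 1.00
```

For comparison, even a lesion in all 4 high-dose dogs and none of 4 controls gives a two-sided p of only 2/70 (about 0.029), so incidence statistics carry little weight at these group sizes.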

GLP Documentation Requirements

Studies intended to support regulatory submissions (IND, NDA, MAA) must comply with Good Laboratory Practice regulations — 21 CFR Part 58 (FDA) or OECD GLP Principles. GLP is not about the science; it is about the paper trail.

Study protocol
Every procedure must be pre-specified: dose levels, routes, group assignments, observation schedules, sacrifice timing, tissue collection list. Deviations require documented justification.
Raw data integrity
All observations, measurements, and deviations are recorded in real time. No backdating, no erasures without explanation. Electronic records require audit trails (21 CFR Part 11). Data is archived for the life of the drug program.
QA unit & audit
An independent Quality Assurance unit inspects critical study phases and audits the final report. The Study Director is personally responsible for data integrity. QA verifies the report accurately reflects the raw data.
SOPs govern every step
CROs operating under GLP maintain Standard Operating Procedures for dosing, blood collection, restraint, clinical observations, necropsy, tissue processing, histopathology, and data management. These SOPs create highly standardized “procedure modules” — the same steps, the same sequence, applied to every dog. The documentation is meticulous. The experience for the dog is identical whether or not it is documented.
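
The raw-data rules above (no backdating, no erasure without explanation) amount to an append-only record with mandatory reasons for corrections. The sketch below illustrates that idea only; it is not an implementation of 21 CFR Part 11, and every class, method, and record name in it is hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    record_id: str
    value: str
    recorded_by: str
    recorded_at: datetime
    reason: str  # empty for original entries, required for corrections


class AuditTrail:
    """Append-only log: entries are never edited or deleted."""

    def __init__(self):
        self._entries = []

    def _append(self, record_id, value, user, reason):
        # Timestamp is assigned here, so callers cannot backdate an entry
        stamp = datetime.now(timezone.utc)
        self._entries.append(AuditEntry(record_id, value, user, stamp, reason))

    def record(self, record_id, value, user):
        if self.history(record_id):
            raise ValueError("record exists; use correct() with a reason")
        self._append(record_id, value, user, reason="")

    def correct(self, record_id, new_value, user, reason):
        if not reason.strip():
            raise ValueError("a correction requires a documented reason")
        if not self.history(record_id):
            raise KeyError(record_id)
        self._append(record_id, new_value, user, reason)

    def history(self, record_id):
        return [e for e in self._entries if e.record_id == record_id]

    def current(self, record_id):
        return self.history(record_id)[-1].value


trail = AuditTrail()
trail.record("BW-dog01-D7", "9.8 kg", "tech_a")
trail.correct("BW-dog01-D7", "9.6 kg", "tech_a", "transcription error")
print(trail.current("BW-dog01-D7"), len(trail.history("BW-dog01-D7")))
# prints: 9.6 kg 2
```

The original value stays in the history alongside the correction, which is what lets QA verify that the final report reflects the raw data.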

What Drives Cumulative Burden

Individual procedures can look “routine.” What makes them harmful is repetition, constraint, and the absence of recovery time. UK severity guidance stresses that overall classification should increase when mild harms are cumulative or prolonged. Three patterns dominate:

Clustered repetition (hours)
PK/TK days: up to 11 blood draws in 24 hours. The burden is handling + repeated restraint + needle access, not necessarily blood volume.
Chronic repetition (weeks–months)
General tox: once-daily dosing for 28–90+ days with daily handling, clinical observations, and housing constraints. The burden is daily interference + cumulative dosing effects + confinement.
Hybrid repetition
Telemetry safety pharmacology: surgery + recovery, then repeated crossover dosing sessions with extended monitoring. Adding automated blood sampling inserts clustered PK sampling into the same protocol.

Sources

OECD Test Guideline 409: Repeated Dose 90-Day Oral Toxicity Study in Non-Rodents (1998)

OECD Principles of Good Laboratory Practice, ENV/MC/CHEM(98)17 (1998)

21 CFR Part 58: Good Laboratory Practice for Nonclinical Laboratory Studies (FDA)

ICH M3(R2): Nonclinical Safety Studies for the Conduct of Human Clinical Trials (2009)

ICH S4: Duration of Chronic Toxicity Testing in Animals (1998)

ICH S7A: Safety Pharmacology Studies for Human Pharmaceuticals

UK Home Office: Notes on Actual Severity Reporting

EU+Norway 2022 Statistical Report under Directive 2010/63/EU

USDA APHIS FY2024 Research Facility Annual Report Summary

NIH Guide for the Care and Use of Laboratory Animals (8th Edition)

NC3Rs: Refinement resources for blood sampling and substance administration

Norecopa Severity Classification Compendium (European cross-reference)