
How We Grade Peptide Evidence

Our process for evaluating research, scoring evidence, and assigning grades — explained in plain language.

8 min read · Methodology · Transparency

Why this page exists

Peptide Garden exists for one reason: to give you honest, evidence-based information about peptides in language you can actually understand.

We don't sell peptides. We don't promote clinics. We don't have sponsors. That independence is the foundation of everything we publish. When you see an evidence grade on one of our profiles, it reflects the published scientific literature — not marketing claims, not community anecdotes, not what someone on Reddit said worked for them.

We built this page because you deserve to know exactly how we arrive at those grades. If you're going to trust our assessments — and we hope you will — you should be able to evaluate our process and hold us accountable to it.


The three evidence dimensions

Most rating systems reduce complex evidence to a single grade. We think that's misleading. A peptide can have strong animal research but almost no human data (that's most of them, actually). Collapsing those into one score obscures a distinction that matters for your decision-making.

That's why we score three dimensions separately.

Animal studies

Preclinical research conducted in animal models — typically rats, mice, and sometimes larger animals like rabbits or primates. Animal studies tell us about biological plausibility: does this compound do something measurable in a living system? They reveal mechanisms of action, dose-response relationships, and potential toxicity signals.

What animal studies don't tell us: whether a peptide works in humans, what the right human dose is, or what side effects to expect. The translation gap between animal models and human biology is significant and well-documented. Many compounds that work brilliantly in mice fail completely in human trials.

We score animal evidence based on study count, model diversity (results across multiple species are more convincing), independent replication (did different labs get similar results?), and recency (newer studies using modern methods carry more weight).

Here's what a strong animal evidence score looks like compared to the human evidence for that same peptide:

Animal studies — BPC-157: Strong (100+ studies)

Consistent results across multiple animal models including rats, mice, rabbits, and horses. Strongest evidence for tendon and GI repair.

Human evidence — BPC-157: Minimal (3 studies)

Only 3 small human studies with approximately 30 total participants. No completed Phase II or III trial. Zero randomized controlled trials.

BPC-157 is a textbook example of the gap animal studies can't bridge. More than a hundred animal studies — but almost nothing in humans. That distinction is exactly why we score these dimensions separately.

Human evidence

Published clinical research conducted in human participants. This is the gold standard. We weight studies by design quality:

  • Randomized controlled trials (RCTs) carry the most weight — participants are randomly assigned to treatment or placebo, reducing bias
  • Pilot studies and open-label trials provide useful signals but lack the rigor of RCTs
  • Case reports and case series are the weakest form of clinical evidence — they describe outcomes but can't establish causation

We also consider sample size (a trial with 5,000 participants tells us more than one with 12), independent replication (have multiple research groups confirmed the findings?), and whether results have been published in peer-reviewed journals.
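To make those inputs concrete, here is a small illustrative sketch in Python — hypothetical names, not our actual tooling — of the metadata one would record for each human study before weighing the evidence:

```python
from dataclasses import dataclass
from enum import Enum

class StudyDesign(Enum):
    """Study designs in descending order of weight (see the list above)."""
    RCT = "randomized controlled trial"
    OPEN_LABEL = "pilot or open-label trial"
    CASE_REPORT = "case report or case series"

@dataclass
class HumanStudy:
    design: StudyDesign   # RCTs carry the most weight
    participants: int     # a trial with 5,000 participants tells us more than one with 12
    peer_reviewed: bool   # published in a peer-reviewed journal?
    replicated: bool      # findings confirmed by an independent research group?
```

None of these fields alone decides a score; they're the raw material for the judgment described in the note on honesty further down this page.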

Compare two peptides at opposite ends of the spectrum:

Semaglutide: Strong, 95% (100+ trials)

One of the most extensively studied drugs in history. Multiple large Phase III RCT programs with 25,000+ total participants. FDA-approved for multiple indications.

BPC-157: Minimal, 12% (3 studies)

Only 3 small human studies. No randomized controlled trials. Total participant count approximately 30.

Semaglutide's 95% reflects the kind of evidence base most compounds never achieve. BPC-157's 12% doesn't mean the peptide doesn't work — it means we genuinely don't know yet, because the human research barely exists.

Safety data

How much controlled safety information exists for a peptide in humans. This dimension is often misunderstood, so we want to be very clear:

A high safety score does not mean "safe." It means "well-studied." We know a lot about the safety profile because thousands of people have been monitored in clinical settings.

A low safety score does not mean "dangerous." It means "we don't know yet." The absence of documented harm is not the same as evidence of safety — it might just mean nobody has looked carefully enough.

Safety — Semaglutide: Strong (25,000+ subjects)

Extensive post-marketing surveillance from millions of prescriptions. Well-characterized side effect profile with known serious risks documented and monitored.

Safety — BPC-157: Limited (2 subjects)

Total controlled human safety data from 2 subjects in a dose-escalation study. No long-term safety data in humans.

We score safety data based on the number of human subjects studied under controlled conditions, the duration of follow-up, and whether post-marketing surveillance data exists.
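Putting the three dimensions together, a profile's evidence block can be pictured as a simple record. This is an illustrative sketch — a hypothetical structure, not our production code — populated with BPC-157's numbers from the examples above:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DimensionScore:
    level: str                   # Strong / Moderate / Limited / Minimal / None
    headline: str                # the stat shown on the profile card
    score: Optional[int] = None  # 0-100 percentage; see "The scoring scale" below

# BPC-157, using the numbers from the examples above:
bpc_157 = {
    "animal": DimensionScore(level="Strong", headline="100+ studies"),
    "human": DimensionScore(level="Minimal", headline="3 studies", score=12),
    "safety": DimensionScore(level="Limited", headline="2 subjects"),
}
```

Keeping the three entries separate is the whole point: collapsing them into one number would hide exactly the animal-versus-human gap the BPC-157 example illustrates.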


The scoring scale

Every evidence dimension receives a percentage score (0–100) and a qualitative level. Here's what each level means:

Strong (75–100%)

Robust body of evidence. Multiple high-quality studies, often including large RCTs and independent replication. For safety, typically means thousands of monitored subjects and post-marketing data.

Moderate (50–74%)

Meaningful evidence with gaps. Several studies exist, possibly including some RCTs, but the body of evidence may be limited in scope, sample size, or replication. Enough to draw cautious conclusions.

Limited (25–49%)

Some evidence exists but with significant gaps. Studies may be small, uncontrolled, or unreplicated. Results are suggestive but far from conclusive.

Minimal (10–24%)

Very little evidence. Typically one or two small studies, often case reports or pilot data. Not enough to draw any reliable conclusions.

None (0–9%)

No meaningful evidence in this dimension. This doesn't necessarily mean the peptide is ineffective or unsafe — it means the research simply hasn't been done.
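The percentage itself involves judgment (see the note below), but the percentage-to-level mapping is mechanical, so it can be written down exactly. A minimal sketch in Python:

```python
def level_for(score: int) -> str:
    """Map a 0-100 evidence score to its qualitative level."""
    if score >= 75:
        return "Strong"    # 75-100%
    if score >= 50:
        return "Moderate"  # 50-74%
    if score >= 25:
        return "Limited"   # 25-49%
    if score >= 10:
        return "Minimal"   # 10-24%
    return "None"          # 0-9%
```

By this mapping, BPC-157's human-evidence score of 12 lands in Minimal, and semaglutide's 95 lands in Strong.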

A note on honesty: These composite scores involve editorial judgment. There is no purely mechanical formula that converts study counts into percentages — if there were, it would create a false sense of precision. We weigh study quality, design, sample size, replication, and recency, then assign a score that reflects our best assessment of the evidence landscape. We think transparency about that subjectivity is more honest than pretending the math does all the work.


What we don't do

Being clear about our limitations matters as much as explaining our process:

  • We don't evaluate product quality from specific vendors. Our profiles assess the evidence for a peptide compound, not any particular supplier's product. Purity, contamination, and manufacturing quality are real concerns — but they require laboratory testing, not literature review.
  • We don't provide medical advice or dosing recommendations. Nothing on this site is a substitute for consultation with a qualified healthcare provider. We report what the research shows. Your doctor helps you decide what that means for you.
  • We don't score based on anecdotal reports or community sentiment. User testimonials, forum posts, and social media reports are not included in our evidence assessments. They can be valuable for hypothesis generation, but they don't meet the bar for evidence scoring.
  • We don't have access to unpublished data or internal trial results. Pharmaceutical companies often hold data from ongoing or failed trials. Our scores reflect only what has been published in the peer-reviewed literature.
  • Our scores reflect the published, peer-reviewed literature as of the last review date. Evidence evolves. A score assigned today might change as new research is published. Every profile shows a Last reviewed date so you know how current our assessment is.

Claim assessments

Beyond the three evidence dimensions, each profile evaluates specific health claims commonly associated with the peptide. For every claim, we assign one of four verdicts:

Well-supported — Multiple quality studies confirm the claim. The evidence is consistent across different research groups and study designs. You can have reasonable confidence this effect is real.

Some supporting evidence — There is evidence in favor of the claim, but with meaningful caveats. Common reasons: the evidence is primarily from animal models, human studies are small or uncontrolled, or results haven't been independently replicated. The signal is there, but it's not definitive.

Not yet demonstrated — The available evidence contradicts the claim, or the evidence is too weak to support it. This doesn't always mean the claim is false — it means the research to date doesn't back it up.

More research needed — There isn't enough published research to assess the claim in either direction. This is the honest answer for many peptide claims, and we'd rather say "we don't know" than speculate.
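Because every claim receives exactly one of these four verdicts, the set is closed — there are no in-between labels and no unrated claims. As an illustrative sketch (hypothetical names, not our actual code):

```python
from enum import Enum

class ClaimVerdict(Enum):
    WELL_SUPPORTED = "Well-supported"          # multiple quality studies agree
    SOME_EVIDENCE = "Some supporting evidence" # a real signal, with caveats
    NOT_DEMONSTRATED = "Not yet demonstrated"  # evidence is contrary or too weak
    MORE_RESEARCH = "More research needed"     # too little research to judge
```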


How we stay current

Peptide research is a moving target. New studies are published regularly, and regulatory landscapes shift. Here's how we keep our profiles current:

  • Quarterly review cycle. We review every published profile at least once per quarter, checking for new publications, regulatory changes, and significant community developments.
  • Event-triggered updates. Major events — a new RCT publication, an FDA decision, a safety signal — trigger immediate review regardless of the regular cycle.
  • Last reviewed dates. Every profile displays a Last reviewed date in the header. This tells you exactly when we last evaluated the evidence. If a profile hasn't been reviewed in more than six months, we flag it (see the sketch after this list).
  • Version transparency. When we change a score, the profile reflects the updated assessment. We don't maintain a public changelog for every score adjustment, but significant changes are noted in our update summaries.
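The six-month staleness flag is simple enough to state precisely. Here is an illustrative sketch in Python — a hypothetical helper, assuming each profile stores its last-reviewed date:

```python
from datetime import date, timedelta
from typing import Optional

STALE_AFTER = timedelta(days=183)  # roughly six months

def needs_review_flag(last_reviewed: date, today: Optional[date] = None) -> bool:
    """True if a profile's last review is more than ~6 months old."""
    today = today or date.today()
    return today - last_reviewed > STALE_AFTER
```

With the quarterly review cycle, this flag should rarely trip; when it does, that profile moves to the front of the review queue.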

Disagreements and corrections

We get things wrong sometimes. If you believe we've made an error — a miscited study, a mischaracterized finding, a score that doesn't reflect the current literature — we want to hear about it.

The most helpful corrections include a specific citation (DOI or PMID) and a brief explanation of what you think we got wrong. We review every correction and update our profiles when warranted.

You can reach us through our contact page. We read everything.