The Big Supp Study
A Pilot of Structured Self-Research on Supplements
The global dietary supplement market exceeds $200 billion annually. To put that in perspective, that’s more than four times the annual budget of the NIH, the largest public funder of biomedical research in the world.
75% of American adults use supplements regularly, spending a median of $50 per month.
And yet, most of us are deciding what to take without knowing what really works. The majority of supplements do not have substantial research behind them.
We had an AI agent go through each supplement information page on the NIH’s supplement website and summarize the quality of evidence for 170 common supplements. It rated the evidence as strong for only 28 (16%) of them.
The details and results of this task are shared publicly here.
Why is it that so little evidence exists for such popular products? First, the incentives aren’t there. Supplement ingredients are generic and unpatentable, so no single company has reason to fund a definitive trial. If one company spends $100,000 proving its product supports cognition, every other brand using the same ingredient benefits for free. Second, the FDA does not require it. Some libertarians argue that we should abolish the FDA and that private certification bodies could take its place, but the supplements market shows us that this is not what happens in practice. Since DSHEA passed in 1994, dietary supplements have been explicitly exempt from FDA efficacy review, and in the three decades since, no major private certifier has emerged to perform that role. The certifications that do exist (NSF, USP, ConsumerLab) verify purity and accurate labeling, not whether the product works.
Even if many randomized controlled trials were being run on supplements, they don’t necessarily cut it for our individual needs. RCTs can tell us what an intervention does on average, but often different people will have different responses. This is called heterogeneity of treatment effects, and it applies with particular force to nutritional interventions. A supplement that produces a modest average effect across a trial population can be highly effective for some and useless for others.
Many studies don’t account for this, and when they do, they often find some subgroup for whom the population average is very misleading. In our own mouth taping study results, we found that people who scored high on our Mouth Breather Index had more significant benefits from mouth taping than those who scored low. Such heterogeneous effects are likely present in many supplements as well.
Structured Self-Research
This study follows a structured self-research design. Structured self-research aims to improve upon existing evidence by formalizing the self-experimentation that people already do. At the same time, it is essential to make the participant experience as easy and valuable as possible in order to maximize participation. The key features are:
A protocol that mirrors natural self-experimentation (baseline, then add the intervention, then observe how you feel with it versus without it)
Low-effort data collection (15 second daily check-in with simple reminders)
Measurement that aligns with participants’ goals (perceived improvements)
You can read more details and justifications in our introduction to structured self-research.
Big Supp Study Design
Big Supp is a pilot study using structured self-research to gather evidence on supplements. The goal is to start with a set of popular yet understudied supplements that have a common outcome.
Outcome: Mood, anxiety, and stress.
We chose to start here because, unlike outcomes where a blood test or wearable metric can do the job, subjective self-report is the best way to measure how someone feels. The gold-standard clinical instruments for these outcomes are also self-report. Brief daily check-ins can produce similarly relevant data.
The outcome is also aligned with the participants’ goal – when you start taking a supplement for mental well-being, you will continue if you subjectively feel that it is helping.
The outcome is measured by a daily check-in that takes 15 seconds or less, delivered via a push notification at the same time each evening. Three 0-10 ratings represent a usable daily snapshot of how someone feels while being easy enough that consistency is realistic.
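As a minimal sketch, one daily check-in could be stored as a small validated record. This is an illustration and not the study's actual data model: the class name DailyCheckIn and the assumption that the three ratings map one-to-one onto mood, anxiety, and stress are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DailyCheckIn:
    """One evening check-in: three 0-10 self-ratings (field names are assumed)."""
    day: date
    mood: int
    anxiety: int
    stress: int

    def __post_init__(self):
        # Reject anything outside the 0-10 scale described in the text.
        for name in ("mood", "anxiety", "stress"):
            value = getattr(self, name)
            if not 0 <= value <= 10:
                raise ValueError(f"{name} must be between 0 and 10, got {value}")


check_in = DailyCheckIn(day=date(2025, 1, 15), mood=6, anxiety=3, stress=4)
```

Keeping the record to three integers plus a date is what makes a check-in of 15 seconds or less plausible.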
The reminder system is designed to be as convenient as possible for the participant, with features like adaptive scheduling, customizable medium (optional SMS, email, push notification), and grace windows.
Initial Supplement Set:
L-theanine
Ashwagandha
Silexan (oral lavender oil capsules)
Saffron
Magnesium glycinate
Ashwagandha and L-theanine are the most widely used and evidenced of the set, which makes recruitment easier and serves as a validation check. Ashwagandha has the strongest evidence for stress reduction, with a recent meta-analysis covering 12 RCTs (n=1002), but also considerable heterogeneity (I² of 83-93%). L-theanine also has reasonable evidence for stress reduction, with a systematic review of 9 RCTs finding that 200-400 mg/day reduced short-term stress and anxiety, though longer-term and larger cohort studies are needed.
Silexan and saffron both have promising evidence bases that haven’t gotten wide attention: Silexan has five RCTs showing efficacy comparable to prescription anti-anxiety drugs, though all were conducted in Germany and funded by the manufacturer, while saffron has meta-analyses showing effects for depression and anxiety comparable to SSRIs, though the literature is concentrated in Iran where saffron is produced.
We also included magnesium glycinate because it is quite popular but barely studied in its specific form. Most magnesium research lumps formulations together, and focuses on clinical deficiencies rather than everyday use.
Some people will want to experiment with supplements outside of the initial set. This is permitted within this framework. An open registry lets participants join with any supplement targeting the same outcome space, using the standardized daily measurement. A pathway to propose a protocol lets participants submit a new supplement with a suggested dose, duration, and timing for lightweight safety review before being made available to others.
Protocols:
Each supplement is paired with a protocol that reflects its optimal dose, duration, and timing. Doses and durations vary by supplement, but every protocol follows the same three-phase structure of baseline, intervention, and post. For our example initial supplement set, it may look like:
Analysis:
On the individual level, each participant sees their own results comparing their ratings during intervention to the baseline and post phases. Since supplements often have heterogeneity in treatment effects, your own data is the most relevant evidence you can have. Structured self-research results are a level above the typical route of just starting a supplement and trying to remember how you felt before it.
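One simple way to express the individual-level comparison is a difference in mean ratings between the intervention phase and the off-supplement phases. The sketch below assumes that summary; the text does not specify the exact statistic, and the function name, the phase keys, and the example ratings are made up for illustration.

```python
from statistics import mean


def individual_effect(ratings_by_phase):
    """Mean intervention rating minus mean off-supplement (baseline + post) rating.

    Assumption: a plain difference in means; the study may use another summary.
    """
    off_supplement = ratings_by_phase["baseline"] + ratings_by_phase["post"]
    on_supplement = ratings_by_phase["intervention"]
    return mean(on_supplement) - mean(off_supplement)


# Hypothetical 0-10 ratings for one participant and one outcome.
example = {
    "baseline": [4, 5, 4, 5],
    "intervention": [6, 7, 6, 7],
    "post": [5, 4, 5, 4],
}
effect = individual_effect(example)
print(f"Change while on the supplement: {effect:+.1f} points")
```

Whether a positive number is good depends on the rating: for a mood rating where higher is better it indicates improvement, while for stress or anxiety the sign would be read the other way.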
On the cohort level, once a supplement hits 20+ completed participants, we report the full distribution of individual effects (rather than a single average). We also examine which baseline characteristics predict response, generating exploratory “who is likely to benefit” information that traditional trials often don’t.
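Reporting a distribution rather than a single average could look roughly like the following. Only the 20-participant threshold comes from the text; the choice of quartiles and a "share improved" figure, the function name, and the sample effects are assumptions for illustration.

```python
from statistics import quantiles

MIN_COMPLETED = 20  # reporting threshold stated in the text


def cohort_summary(effects):
    """Summarize per-participant effects, or return None below the threshold."""
    if len(effects) < MIN_COMPLETED:
        return None
    q1, median, q3 = quantiles(effects, n=4)  # quartile cut points
    return {
        "n": len(effects),
        "q1": q1,
        "median": median,
        "q3": q3,
        # Fraction of participants whose individual effect was above zero.
        "share_improved": sum(e > 0 for e in effects) / len(effects),
    }


# Hypothetical individual effects for 20 completed participants.
effects = [-1.0] * 5 + [0.5] * 5 + [2.0] * 10
summary = cohort_summary(effects)
print(summary)
```

A spread like this one, where a quarter of participants got worse and half improved substantially, is exactly what a single average would hide.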
Expected Outcomes
We have previously addressed some of the limitations of structured self-research. The main argument for it comes down to the fact that our goal is to improve upon the existing evidence in fields for which research is sparse and no incentives exist to generate more. The right comparison is not “Big Supp vs. a hypothetical perfect RCT” but “Big Supp vs. the actual evidence environment people make supplement decisions in today”.
Big Supp can generate evidence for understudied supplements, help individuals approach their own decisions scientifically, and establish a new model for self-driven experimentation.
If this resonates and you’d like to contribute to developing Big Supp, please get in touch.
Email: info@cosimoresearch.com







