Even though I'm a scientist and thoroughly trained in statistics, the idea that we can sequester 36 people, monitor their diet 24/7, and draw general conclusions doesn't sound completely right to me. Partly in the technical sense, and partly in the "why do the folks working on human health get away with sample sizes that would be laughed at in any other field?" sense.
I don't know enough about medical statistics to say, but I often see small sample sizes in studies where the effect size is expected to be high. That may be the case here.
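For a rough sense of why that matters, here's a minimal power-analysis sketch (using statsmodels; the effect sizes, alpha, and power target are assumed for illustration, not taken from any particular study): the larger the standardized effect you expect, the smaller the sample you need for the same statistical power.

```python
# Hypothetical power calculation: participants needed per group for an
# independent two-sample t-test at 80% power and alpha = 0.05.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for d in (0.2, 0.5, 0.8, 1.2):  # Cohen's d, from "small" to "very large"
    n = analysis.solve_power(effect_size=d, alpha=0.05, power=0.8)
    print(f"d = {d}: ~{n:.0f} participants per group")
# d = 0.2 needs roughly 390+ per group; d = 0.8 needs only ~26 per group,
# so a ~36-person study is only reasonable if the expected effect is large.
```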
There are too many variables in diet. If they study steak every meal vs. rice and beans every meal, they can come up with a conclusion. However, most people are not that one-track either way: sometimes the rich eat rice and beans, and sometimes even the poor manage to afford steak. And by steak, did I mean beef, lamb, goat, pork...? That might or might not matter. There are also chicken, turkey, snake, deer, elk, and dozens more animals people eat, which might or might not be healthy. Of each of the above there are different cuts (does it matter?) and different fat levels (does it matter?). And that is just meat; how many varieties of beans are there, and what about rice? What about all the other things people eat?
Do these other fields also study humans in controlled experiments?
I think it has to do with the sample-to-staff ratio. It's not enough to observe human subjects; you have to actively prevent them from going off the rails, and that doesn't scale well when you increase the sample size. I guess we could replicate a similar experiment n times and then do a meta-study, but that's not ideal either.
How would you tackle the logistics of scaling up the above experiment?
Yes, the most common example would be clinical trials for drugs and other medical treatments, which often have thousands of patients (with recruitment being the limiting factor). There are tons of ways that studies can go wrong, for example when patients don't take the treatment and lie about it (this is common) or have other lifestyle factors that influence the results, and these can't be easily smoothed out with a slightly larger N.
I don't know how to fix the nutrition studies; I'm still pretty skeptical that you could ever control enough variables to draw any sort of conclusion about things with tiny effect sizes. This isn't like the nutritional diseases we've seen in the past: if you look at a disease like pellagra (not getting enough niacin), literally tens of thousands of people died over a few years (beriberi, rickets, and scurvy are three other examples; these discoveries were tightly coupled to the discovery of essential nutrients, now called vitamins).
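To put a rough number on that skepticism, here's a sketch (again with statsmodels, and with assumed parameters; it naively treats a 36-person feeding study as two independent arms of 18, ignoring crossover/within-subject designs) solving for the smallest standardized effect a study of a given size can reliably detect:

```python
# Hypothetical minimum detectable effect (Cohen's d) at 80% power, alpha = 0.05,
# for an independent two-sample t-test with different group sizes.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for n_per_group in (18, 150, 5000):
    d = analysis.solve_power(nobs1=n_per_group, alpha=0.05, power=0.8)
    print(f"n = {n_per_group} per group: detectable d >= {d:.2f}")
# Roughly: 18 per group only sees d >= ~0.96, 150 per group d >= ~0.32,
# and even 5000 per group only gets down to d >= ~0.06 -- so genuinely tiny
# dietary effects stay below the noise floor for any study you could run.
```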
From my reading, that's not generally true. It all depends on the methodology. Safety or feasibility studies can use very small sample sizes. I've been reading safety studies on monoclonal antibodies like Cimzia, for example:
https://pubmed.ncbi.nlm.nih.gov/29030361/ (N=16)
https://pubmed.ncbi.nlm.nih.gov/28814432/ (N=17)
https://ard.bmj.com/content/83/Suppl_1/1145.2 (N=21)
Of course, these are not nutritional studies.
You should try reading the FDA approval for the drug; it was already approved before these publications (which aren't so much clinical trials as just medical research). The FDA approval has a whole paragraph about how the effect size was too small to demonstrate statistical significance, and the trials had n=300.
It's also indicated for use in a disease we don't understand, for people who didn't respond to all the previously approved drugs. Not a good example at all.
I'm not comparing these to the FDA approval process, but to your claim that trials use thousands of patients. These three studies are ascertaining the pregnancy safety of a drug, irrespective of whether we understand the disease or what the response rate is.
Cimzia has been well-studied, and we understand why it works on autoimmune diseases like inflammatory arthritis. It has 6 FDA approvals for different indications, so your description of the drug itself is incorrect.
The sample size doesn't concern me as much as what the setup forces on the participants' lifestyle and, in turn, whether the results apply to the rest of us. They are probably not getting the same exercise as a normal person (which runs the range from the "gym rat" who gets too much to the "couch potato" who barely walks).
> why do the folks working on human health get away with sample sizes that would be laughed at in any other field?
Because it's really hard and expensive to do such studies with more participants