Preregistration — Glossary Aria Research

Extended definition

Preregistration is the formal deposit, in a repository that issues a verifiable timestamp, of a study's hypotheses, methods, statistical analysis plan, and inclusion/exclusion criteria before data collection or (in some designs) before analysis. Its central purpose is to separate confirmatory tests (a specific hypothesis registered before the data are seen) from exploratory ones (post hoc analyses). That separation reduces the room for p-hacking, HARKing (hypothesizing after the results are known), and the “garden of forking paths”: the many defensible analytic choices that, when selected after seeing the data, inflate the false-positive rate.

The main platforms are the Open Science Framework (OSF), AsPredicted, ClinicalTrials.gov (clinical trials, where registration is a regulatory requirement), and PROSPERO (systematic reviews). Nosek, Ebersole, DeHaven, and Mellor (2018, PNAS) described the “preregistration revolution” and reported exponential growth in adoption since 2012. Munafò et al. (2017, Nature Human Behaviour) presented preregistration as a central element of their manifesto for reproducible science. Registered reports, the more rigorous format in which a journal accepts the study on the basis of its protocol before the results are known, are widely regarded as the current gold standard and are offered by more than 300 journals.
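The false-positive inflation attributed to the garden of forking paths can be made concrete with a small calculation. The sketch below assumes k analytic paths that behave like independent tests, each run at level alpha; real forking paths are usually correlated, so the numbers are an illustrative upper-range approximation rather than an exact rate. The function name is hypothetical.

```python
def familywise_false_positive_rate(alpha: float, k: int) -> float:
    """Probability of at least one false positive across k tests.

    Assumption (for illustration only): the k tests are independent
    and every null hypothesis is true.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if k < 1:
        raise ValueError("k must be at least 1")
    return 1.0 - (1.0 - alpha) ** k


if __name__ == "__main__":
    # One registered test keeps the nominal 5% rate; each extra
    # unregistered path raises the chance of a spurious "finding".
    for k in (1, 5, 10, 20):
        rate = familywise_false_positive_rate(0.05, k)
        print(f"{k:>2} paths: {rate:.3f}")
```

Under these assumptions, twenty candidate analyses at the 5% level give roughly a 64% chance of at least one significant result by chance alone, which is why registering a single primary analysis in advance matters.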

When it applies

Preregistration applies to any confirmatory study of a specific hypothesis: clinical trials (where regulation requires it), psychological experiments, observational studies with a defined hypothesis, and secondary analyses of public datasets that test a specific hypothesis. It is strongly recommended in funding proposals, and national agencies increasingly reward it. It is also increasingly required for systematic reviews, where the review protocol is registered (in PROSPERO or an equivalent registry) before the search begins. Registered reports are especially suited to studies with large samples or high costs, and to studies whose null results would otherwise be lost to publication bias.

When it does not apply

Strict preregistration does not apply to purely exploratory research: discovering patterns in data without a prior hypothesis is legitimate practice, and it should not be disguised as confirmatory. Nor does it apply to secondary analyses of data the researcher has already analyzed, because preregistration presupposes that the researcher has not yet seen the data; for existing data the alternatives are “blinded analysis” or a conditional preregistration that declares what has already been seen. In purely qualitative research, pre-specifying the analysis in the strict sense does not apply, although preregistering the research question, collection method, and analytic strategy is a growing practice. Finally, preregistration does not replace statistical power, methodological transparency, or reproducibility; it complements them.
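Because preregistration does not replace statistical power, a sample-size justification usually sits alongside the registered plan. The following is a minimal sketch of one common approximation, assuming a two-sided comparison of two independent, equal-sized groups and a standardized effect size (Cohen's d); it uses the normal approximation, so exact t-based software will typically report a slightly larger n. The function name and defaults are illustrative, not taken from any registry's template.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size_d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate participants needed per group (normal approximation).

    Assumptions (illustrative): two independent groups of equal size,
    two-sided test, effect expressed as Cohen's d.
    """
    if effect_size_d <= 0:
        raise ValueError("effect_size_d must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = standard_normal.inv_cdf(power)          # quantile for the target power
    return ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


if __name__ == "__main__":
    # Smaller expected effects demand much larger samples.
    for d in (0.2, 0.5, 0.8):
        print(f"d = {d}: about {n_per_group(d)} per group")
```

Stating the assumed effect size and the resulting n in the protocol lets readers judge whether a later null result is informative.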

Applications by field

- Health and biomedical sciences: registration on ClinicalTrials.gov has been a regulatory requirement since 2007 (FDA Amendments Act), and the ICMJE requires registration as a precondition for submission.
- Psychology: use of OSF has grown rapidly since 2012, and many top-tier journals offer registered reports.
- Experimental economics and political science: AsPredicted and EGAP (Evidence in Governance and Politics) consolidated the practice.
- Education: the Society for Research on Educational Effectiveness recommends preregistration in intervention studies.

Common pitfalls

The first pitfall is treating preregistration as a straitjacket: deviations from the original plan are legitimate when they are justified and transparent, and the final report should document each change and the reason for it. The second is vague preregistration. The protocol should specify the primary analysis in enough detail that any different analysis is clearly post hoc; a generic plan (“we’ll run a regression”) leaves flexibility that hollows out the purpose. The third is confusing preregistration with prepublication: preregistration deposits a protocol, not a manuscript, and a preprint serves the latter purpose. The fourth is registering a study and never publishing its null results, which cancels the protection against publication bias; good practice is to deposit a final report even when the hypothesis is not confirmed. The fifth is assuming that preregistration guarantees validity: a poorly designed protocol, with low power or problematic measures, does not become a good study by being preregistered.
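The first two pitfalls, undocumented deviations and vague plans, can be illustrated by writing the primary analysis down as structured data. The sketch below is one possible representation, not a format used by OSF, AsPredicted, or any other registry; the class, field names, and example values are all hypothetical. A plan specified at this level of detail makes any departure detectable and therefore reportable.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class PrimaryAnalysis:
    """Hypothetical record of a registered primary analysis."""
    outcome: str
    predictor: str
    model: str
    covariates: tuple[str, ...]
    exclusion_rule: str


def deviations(registered: PrimaryAnalysis, reported: PrimaryAnalysis) -> list[str]:
    """Names of the fields where the reported analysis departs from the plan."""
    return [
        f.name
        for f in fields(registered)
        if getattr(registered, f.name) != getattr(reported, f.name)
    ]


# Illustrative values only.
registered = PrimaryAnalysis(
    outcome="anxiety_score",
    predictor="condition",
    model="linear regression",
    covariates=("age", "baseline_score"),
    exclusion_rule="failed attention check",
)

# The reported analysis adds a covariate that was not registered.
reported = replace(registered, covariates=("age", "baseline_score", "gender"))

changed = deviations(registered, reported)
label = "confirmatory" if not changed else "deviation to document"
print(label, changed)
```

A plan that said only “we’ll run a regression” would leave every field except the model unspecified, so no comparison of this kind could separate the confirmatory analysis from a post hoc one.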