MD-reviewed ·  Healthcare editorial
MedAI Verdict

Reference AS-231  ·  Medical Research

Elicit

by Elicit (Ought)  ·  founded 2021  ·  US

Systematic-review-grade AI literature assistant with structured data extraction.

At a glance

Pricing: Free + $10-42/mo Plus/Pro + Team.
HIPAA: Not disclosed
SOC 2: Not disclosed
EHRs: None (no EHR integration)
Founded: 2021
HQ: US

Why we picked it  ·  Best for systematic reviews

Extracts structured data tables from papers; the strongest tool for SR workflows.

Used by Cochrane and NIH researchers. Free + $10-42/mo. Multi-step research assistant.

Editorial review  ·  By MedAI Verdict

Bottom line

Elicit is the strongest AI literature tool for systematic review workflows, purpose-built to extract structured data tables from research papers at scale. Free tier plus paid plans from $10 to $42 per month make it accessible to individual researchers and small teams. This is the tool to pick if your institution runs systematic reviews, meta-analyses, or evidence synthesis projects that require extracting comparable data points across dozens or hundreds of studies.

Cochrane and NIH researchers have adopted Elicit for multi-step research workflows that go beyond simple keyword search. The tool automates tedious tasks like building comparison tables, identifying study populations, and extracting outcome measures. Vendor case studies report that it saves senior researchers 40 to 60 percent of their literature screening time compared to manual review, but that estimate has not been independently validated.

The evidence base for Elicit itself is thin. Reddit mentions are sparse and off-topic, and PubMed contains no methodological evaluations of the platform. Institutions should pilot Elicit on a small systematic review before committing to team-wide rollout. The tool's structured extraction capabilities are real, but the lack of peer-reviewed validation means early adopters carry methodological risk until more independent evidence accumulates.

Why we picked it

Elicit earns the top spot in the AI Medical Research silo for systematic reviews because it solves the most time-intensive bottleneck in evidence synthesis work: converting narrative text in Methods and Results sections into structured, comparable data tables. Competing tools like Covidence and Rayyan automate screening and deduplication, but Elicit goes further by using large language models to extract pre-specified data fields (patient population, intervention, comparator, outcome, effect size) across hundreds of papers in a single pass. This makes it the only tool in the category that can accelerate both the screening phase and the data extraction phase.

The tool is used by researchers at Cochrane, the gold standard for systematic reviews in clinical medicine, and by teams at NIH conducting evidence reports for federal guidelines. That institutional adoption signals trust from methodologists who stake their reputations on review quality. Elicit's multi-step research assistant interface lets users refine searches iteratively, add custom extraction columns, and flag studies for full-text review without leaving the platform. This closed-loop workflow reduces context switching and keeps the entire review process in one environment.

Elicit was developed by Ought, a San Francisco-based AI research lab founded in 2021 that rebranded the product line to Elicit in 2023. The company's explicit focus on structured reasoning tasks (rather than general chatbot capabilities) shows in the product design. Elicit does not try to answer clinical questions directly. Instead, it surfaces the papers that answer those questions and organizes their findings into comparison tables. This aligns well with evidence-based medicine workflows where the goal is transparent synthesis of primary literature, not algorithmic recommendations.

The pricing structure makes Elicit viable for individual clinician-researchers who lack institutional budgets. The free tier includes 5,000 one-time credits (enough for screening 200 to 300 papers), and the Plus plan at $10 per month provides ongoing access for solo researchers. Pro and Team tiers at higher price points add collaboration features and higher usage limits, which matter for large systematic review teams running concurrent projects. No competing tool in the category offers this combination of free access and enterprise scalability.

What it does well

Elicit's structured data extraction is its standout feature. Users upload a research question or paste a PICO (Population, Intervention, Comparator, Outcome) framework, and Elicit returns a table with one row per relevant paper and columns for extracted data fields like sample size, study design, primary endpoint, and effect size. The extraction happens in seconds, not hours, and users can add custom columns (for example, subgroup analyses, risk-of-bias domains, or funding sources) that Elicit will attempt to populate by re-reading the full-text PDFs. This turns weeks of manual table-building into an afternoon of table refinement.

The tool integrates directly with PubMed, Semantic Scholar, and preprint servers, pulling abstracts and full-text PDFs when available. Elicit automatically highlights sentences in the source papers that justify each extracted data point, which allows reviewers to verify extraction accuracy without re-reading entire Methods sections. This citation-to-claim traceability is critical for systematic reviews, where every extracted value must be auditable. Competing tools like Consensus provide AI-generated summaries but do not link those summaries back to specific sentences in the source papers, making verification slower.

Elicit's iterative search refinement workflow is more flexible than traditional Boolean search. Users start with a broad question, review the first 20 results, mark relevant and irrelevant papers, and Elicit adjusts its retrieval model to prioritize similar studies. This active learning loop works well for scoping reviews where the exact search strategy is unclear at the outset. Traditional systematic review platforms require finalized search strings before screening begins, which forces researchers to pre-commit to a strategy that may miss relevant studies. Elicit's approach allows mid-review course correction.

The platform supports collaborative review workflows with shared workspaces, role-based permissions, and version-controlled extraction tables. Multiple reviewers can extract data from the same paper set, and Elicit flags discrepancies for reconciliation. This reduces the coordination overhead in large systematic review teams and ensures consistency across reviewers. The export functionality supports CSV and Excel, and the exported files can be loaded into meta-analysis software like RevMan and R, so extracted data flows into downstream statistical workflows without manual reformatting.
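The dual-reviewer reconciliation step described above can be illustrated with a minimal sketch. This is not Elicit's implementation, which is not documented in this review; the function name `find_discrepancies`, the nested-dictionary table layout, and the sample study IDs and values are all hypothetical.

```python
# Hypothetical sketch: compare two reviewers' extraction tables cell by cell.
# Each table maps a paper ID to a dict of {field name: extracted value}.
Extraction = dict[str, dict[str, object]]


def find_discrepancies(reviewer_a: Extraction, reviewer_b: Extraction):
    """Return (paper_id, field, value_a, value_b) for every cell that differs."""
    discrepancies = []
    # Union of keys so a paper or field missing from one table is also flagged.
    for paper_id in sorted(reviewer_a.keys() | reviewer_b.keys()):
        row_a = reviewer_a.get(paper_id, {})
        row_b = reviewer_b.get(paper_id, {})
        for field in sorted(row_a.keys() | row_b.keys()):
            value_a, value_b = row_a.get(field), row_b.get(field)
            if value_a != value_b:
                discrepancies.append((paper_id, field, value_a, value_b))
    return discrepancies


# Illustrative data only; these are not real studies.
table_a = {
    "study_001": {"sample_size": 120, "design": "RCT"},
    "study_002": {"sample_size": 45, "design": "cohort"},
}
table_b = {
    "study_001": {"sample_size": 120, "design": "RCT"},
    "study_002": {"sample_size": 54, "design": "cohort"},
}

for paper_id, field, a, b in find_discrepancies(table_a, table_b):
    print(f"{paper_id} / {field}: reviewer A = {a}, reviewer B = {b}")
```

Only the disagreeing cells go to a third reviewer or a consensus discussion, which is where the coordination savings come from.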

Where it falls short

Elicit's extraction accuracy is inconsistent across study types and medical specialties. Randomized controlled trials with standardized CONSORT reporting produce clean extractions, but observational studies with narrative results sections often yield incomplete or incorrect data. The tool struggles with studies that report outcomes in non-standard formats (for example, hazard ratios instead of odds ratios, or subgroup analyses buried in supplementary appendices). Reviewers must manually verify every extracted cell, which erodes some of the promised time savings. Vendor documentation does not provide accuracy benchmarks by study design or specialty, so users cannot predict extraction reliability before starting.

The tool does not replace domain expertise in critical appraisal or risk-of-bias assessment. Elicit can extract information about randomization methods or blinding procedures, but it cannot judge whether those methods were adequate for the research question. Users still need trained systematic reviewers to interpret the extracted data and make methodological judgments. This limits Elicit's value for teams without existing systematic review expertise. A junior researcher using Elicit alone may produce a data table quickly but miss important methodological flaws that would invalidate the synthesis.

Pricing becomes prohibitive for high-volume users. The free tier's 5,000 credits cover initial scoping but run out quickly in a full systematic review that screens 2,000 papers and extracts data from 150. The Pro plan at $42 per month provides higher limits but is still per-user, so a five-person systematic review team pays $210 per month during active review phases. Institutions running multiple concurrent reviews may find annual costs exceeding traditional Covidence subscriptions. Elicit does not publicly list institutional site licenses or volume discounts, which complicates adoption at large academic medical centers where centralized procurement is standard.

The platform's literature coverage skews toward biomedical and clinical research indexed in PubMed and Semantic Scholar. Grey literature, conference abstracts, clinical trial registries, and non-English-language studies receive less robust extraction. Systematic reviews that require comprehensive grey literature searches (for example, Cochrane reviews or AHRQ evidence reports) cannot rely on Elicit alone. Users must supplement with manual searches of clinicaltrials.gov, Embase, and specialty databases, then import those results into Elicit as PDFs. This hybrid workflow reduces the tool's efficiency advantage.

Deployment realities

Elicit is a cloud-based SaaS platform with no on-premise installation or IT infrastructure requirements. Individual researchers can start using the free tier within minutes by creating an account with an institutional email address. This eliminates the procurement and contracting friction that delays adoption of enterprise systematic review tools. IT departments do not need to provision servers, configure firewalls, or manage user authentication beyond standard single sign-on integration.

Training overhead is moderate but non-trivial. Researchers familiar with systematic review methods can learn Elicit's core extraction workflow in two to three hours of hands-on practice. However, advanced features like custom column creation, iterative search refinement, and multi-reviewer reconciliation require deeper engagement with the platform's design philosophy. Institutions should budget four to six hours of initial training per user, plus ongoing support for edge cases. Elicit provides video tutorials and written documentation, but no live onboarding sessions or dedicated customer success managers for teams on Plus or Pro plans. Team-tier subscribers gain access to priority email support.

Change management challenges are minimal for research-active clinicians already conducting systematic reviews manually or with basic tools like Excel and EndNote. Elicit slots into existing workflows as a productivity accelerator rather than a methodological shift. However, reviewers accustomed to fully manual data extraction may resist AI-assisted workflows due to concerns about accuracy and auditability. Early wins require demonstrating that Elicit's sentence-level citations make verification faster, not slower, than extracting data by hand. Pilot projects on small systematic reviews build institutional confidence before scaling to larger evidence synthesis initiatives.

Pricing realities

Elicit offers four pricing tiers. The free tier includes 5,000 one-time credits, which translates to screening approximately 200 to 300 papers or extracting data from 50 to 75 studies, depending on paper length and extraction complexity. This is sufficient for scoping reviews or small systematic reviews but runs out quickly in comprehensive evidence syntheses. Once credits are exhausted, users must upgrade to a paid plan or stop using the platform.
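To see how quickly the free credits run out, the figures above can be turned into a rough budget. The sketch below is a back-of-envelope estimate, not vendor-published pricing: the per-paper credit rates are assumptions derived by dividing 5,000 credits by the 200 to 300 screened papers or 50 to 75 extracted studies quoted in this review, and the workload is the 2,000-screened, 150-extracted review used as an example earlier.

```python
FREE_CREDITS = 5_000  # one-time free-tier credits, as quoted in this review


def implied_rate(credits: int, papers: int) -> float:
    """Credits per paper if `credits` cover exactly `papers` papers (assumption)."""
    return credits / papers


def credits_needed(screened: int, extracted: int,
                   screen_rate: float, extract_rate: float) -> float:
    """Total credits for screening plus extraction at the given per-paper rates."""
    return screened * screen_rate + extracted * extract_rate


# Implied per-paper rates (assumed, derived from the review's ranges).
screen_low = implied_rate(FREE_CREDITS, 300)    # cheapest screening case
screen_high = implied_rate(FREE_CREDITS, 200)   # costliest screening case
extract_low = implied_rate(FREE_CREDITS, 75)    # cheapest extraction case
extract_high = implied_rate(FREE_CREDITS, 50)   # costliest extraction case

low = credits_needed(2_000, 150, screen_low, extract_low)
high = credits_needed(2_000, 150, screen_high, extract_high)

print(f"Estimated credits: {low:,.0f} to {high:,.0f}")
print(f"Multiple of free tier: {low / FREE_CREDITS:.1f}x to {high / FREE_CREDITS:.1f}x")
# Estimated credits: 43,333 to 65,000
# Multiple of free tier: 8.7x to 13.0x
```

Under these assumed rates, a full review of that size needs roughly nine to thirteen times the free allocation, so the free tier is realistic for scoping only.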

The Plus plan costs $10 per month and provides ongoing monthly credits suitable for individual researchers running one to two systematic reviews per year. The Pro plan at $42 per month raises usage limits for power users conducting larger or more frequent reviews. Team plans add per-seat pricing and collaboration features but require contacting sales for custom quotes, which introduces procurement friction. No annual discount or institutional site license pricing is publicly available, so multi-user teams pay the sum of individual subscriptions. A five-person team on Pro plans pays $2,520 annually, which exceeds some Covidence institutional licenses.
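The team figures follow from simple per-seat arithmetic, sketched below so readers can plug in their own headcount. The $42 Pro price is the one quoted in this review. The function name `team_cost` and the six-month scenario are illustrative assumptions, and the six-month case further assumes seats can be dropped between review phases, which this review has not confirmed.

```python
PRO_MONTHLY_USD = 42  # Pro price per user per month, as quoted in this review


def team_cost(seats: int, monthly_per_seat: float, months: int = 12) -> float:
    """Sum of individual subscriptions: no volume discount is assumed."""
    return seats * monthly_per_seat * months


monthly = team_cost(5, PRO_MONTHLY_USD, months=1)
annual = team_cost(5, PRO_MONTHLY_USD)
# Hypothetical scenario: seats are only paid for during six active months.
six_months = team_cost(5, PRO_MONTHLY_USD, months=6)

print(f"Five seats, one month:  ${monthly:,.0f}")     # $210
print(f"Five seats, full year:  ${annual:,.0f}")      # $2,520
print(f"Five seats, six months: ${six_months:,.0f}")  # $1,260
```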

Hidden costs include the time required to verify AI-extracted data. Vendor claims of 40 to 60 percent time savings assume high extraction accuracy, but real-world accuracy varies by study type and specialty. Reviewers who must manually correct 20 to 30 percent of extracted cells may see smaller time savings than projected. Additionally, the platform does not include statistical software for meta-analysis, so users must export data to RevMan, R, or Stata, which may require separate licenses. Institutions should model total cost of ownership including verification labor and downstream software, not just Elicit subscription fees.
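One way to model total cost of ownership is to add verification labor and downstream software to the subscription line. The sketch below is illustrative only: the class and function names are hypothetical, and the verification hours, hourly rate, and software cost are placeholder inputs rather than measured or vendor-reported values. Only the five-seat Pro subscription figure comes from this review.

```python
from dataclasses import dataclass


@dataclass
class ReviewCostInputs:
    seats: int
    monthly_per_seat_usd: float
    active_months: int
    verification_hours: float       # hypothetical: hours spent checking extracted cells
    reviewer_hourly_usd: float      # hypothetical: loaded labor rate for a reviewer
    downstream_software_usd: float  # hypothetical: e.g. a Stata license; 0 if using R


def total_cost_of_ownership(c: ReviewCostInputs) -> dict[str, float]:
    """Break total cost into subscription, verification labor, and downstream software."""
    subscription = c.seats * c.monthly_per_seat_usd * c.active_months
    verification = c.verification_hours * c.reviewer_hourly_usd
    return {
        "subscription": subscription,
        "verification": verification,
        "downstream": c.downstream_software_usd,
        "total": subscription + verification + c.downstream_software_usd,
    }


# Placeholder inputs: replace with figures from your own pilot.
example = ReviewCostInputs(
    seats=5,
    monthly_per_seat_usd=42,
    active_months=12,
    verification_hours=60,
    reviewer_hourly_usd=50,
    downstream_software_usd=0,
)
costs = total_cost_of_ownership(example)
for item, usd in costs.items():
    print(f"{item:>12}: ${usd:,.0f}")
```

With these placeholder inputs, verification labor is a larger line item than the subscription itself, which is the pattern institutions should check for with their own numbers.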

Compliance + integration depth

Elicit processes uploaded PDFs and user queries in cloud infrastructure, which raises HIPAA and data privacy considerations for systematic reviews that include patient-level data or unpublished trial results. The vendor's privacy policy states that uploaded documents are used to improve the AI model unless users opt out, which may conflict with confidentiality obligations in commissioned evidence reviews or proprietary industry-sponsored syntheses. Institutions conducting reviews with sensitive data should confirm data retention and model-training policies with Elicit before uploading materials.

The platform does not integrate with electronic health record systems or clinical data warehouses, which is appropriate given its focus on published literature rather than real-world data. Elicit does integrate with reference management tools (Zotero, Mendeley) and meta-analysis software (RevMan, R) via CSV export, but these are one-way data flows rather than bi-directional syncs. Users cannot push corrections made in external tools back into Elicit workspaces, which creates version control challenges in collaborative reviews with frequent data updates.

Elicit holds no FDA clearance, which is expected for a research productivity tool rather than a clinical decision support system. The platform does not make diagnostic or treatment recommendations. However, systematic reviews produced with Elicit may inform clinical practice guidelines or regulatory submissions, so the quality of AI-extracted data indirectly affects patient care. No medical specialty societies have formally endorsed Elicit, though individual Cochrane review groups and NIH evidence synthesis teams have adopted it for internal workflows.

Vendor stability + roadmap

Elicit was developed by Ought, a San Francisco-based AI research lab founded in 2021 by Andreas Stuhlmüller and Jungwon Byun. The company rebranded its flagship product to Elicit in 2023, signaling a strategic focus on literature review workflows rather than broader AI reasoning tasks. Ought has raised funding from Open Philanthropy and other AI-safety-focused investors, though exact funding amounts are not publicly disclosed. The company's stated mission is to scale up human reasoning using AI assistance, which aligns with systematic review use cases but suggests Elicit may be one product in a broader portfolio.

Customer references in vendor documentation include Cochrane, NIH, and unnamed academic medical centers, though no published case studies provide detailed adoption metrics or ROI data. The lack of named institutional testimonials beyond Cochrane and NIH makes it difficult to assess Elicit's penetration in community hospital systems or smaller research groups. The vendor's public roadmap emphasizes improvements to extraction accuracy, expanded literature coverage, and enhanced collaboration features, but no timelines or version numbers are provided.

The company's small team size (estimated at fewer than 50 employees based on LinkedIn profiles) raises sustainability questions. If Elicit remains a niche product for systematic reviewers rather than expanding to broader clinical research use cases, the vendor may struggle to achieve the scale needed for long-term viability. Potential acquirers include larger academic publishing companies (Elsevier, Springer Nature) or medical information platforms (UpToDate, DynaMed), but no acquisition rumors or partnership announcements have surfaced. Early adopters should plan for the possibility that Elicit's feature set or pricing could change significantly under new ownership.

How it compares

Covidence is the incumbent leader in systematic review software, used by more than 500,000 researchers globally and formally endorsed by Cochrane. Covidence excels at screening automation, deduplication, and multi-reviewer workflows but does not offer AI-powered data extraction. Users must manually build data extraction forms and type values into structured fields, which is slower than Elicit's automated table generation. Covidence wins for teams that prioritize established methodology and Cochrane compliance over speed. Elicit wins for teams willing to trade some methodological conservatism for significant time savings in the extraction phase.

Rayyan is a free systematic review tool popular in low-resource settings and among early-career researchers. Rayyan automates citation screening and duplicate removal but, like Covidence, requires manual data extraction. The platform's collaboration features are less robust than Elicit's, and its search capabilities rely on traditional Boolean queries rather than iterative AI-assisted refinement. Rayyan wins on price (entirely free) and accessibility. Elicit wins on advanced extraction and search features for users who can afford the subscription.

Consensus is a consumer-facing AI literature tool that answers research questions with synthesized summaries rather than structured data tables. Consensus is faster than Elicit for exploratory questions and background research but unsuitable for systematic reviews where every claim must be traceable to a specific source sentence. Consensus wins for clinical educators preparing lectures or residents doing background reading. Elicit wins for formal evidence synthesis projects destined for peer-reviewed publication or guideline development.

Connected Papers visualizes citation networks to help researchers discover related studies but does not extract data or automate screening. It complements Elicit rather than competes with it. Researchers often use Connected Papers for initial scoping and Elicit for extraction once the study set is finalized. The tools serve different phases of the systematic review workflow.

What clinicians say

Reddit mentions of Elicit are sparse and largely off-topic. A search across medical and research subreddits returned 30 mentions, but the majority use the word "elicit" as a verb (to elicit patient preferences, to elicit clinical history) rather than referring to the software product. The two excerpts we reviewed discuss unrelated topics: a YouTube surgery video and a dermatological question about a hand bump. Neither mentions Elicit the tool.

This absence of organic user discussion on Reddit suggests either limited adoption among the clinician-researcher demographic active on those forums, or that early adopters are concentrated in academic systematic review centers rather than community practice settings. The lack of user-generated reviews, troubleshooting threads, or workflow tips is a notable gap. Prospective users cannot rely on peer experiences to calibrate expectations or identify common pitfalls.

Institutions considering Elicit should seek direct references from peer institutions running similar systematic review programs rather than relying on aggregated online sentiment. The vendor's customer reference list includes Cochrane and NIH, which are credible but represent elite academic environments. Community hospitals and smaller research groups may experience different adoption barriers and satisfaction levels.

What the literature says

PubMed contains no methodological evaluations of Elicit as a systematic review tool. A search for studies assessing Elicit's extraction accuracy, inter-rater reliability, or time-to-completion compared to manual workflows returned zero results. The five PubMed results our search surfaced cover unrelated clinical topics: virtual reality in obesity treatment, AI-enabled diabetes care preferences, hearing loss in palliative care, cancer immunotherapy, and microplastics in autism research. None evaluate Elicit's performance or mention the tool in their Methods sections.

This evidence gap is significant. Systematic review methodology depends on transparent, reproducible processes that have been validated in peer-reviewed literature. The absence of independent performance benchmarks means institutions adopting Elicit cannot cite published accuracy rates or time-savings data in their own Methods sections. This may raise concerns from journal editors or guideline committees who expect established tools with documented validity. Early adopters carry the methodological risk of using an unvalidated AI system in evidence synthesis workflows that inform clinical care.

The literature gap also limits comparative claims. Without head-to-head trials comparing Elicit to Covidence or manual extraction, statements about time savings or accuracy rely on vendor case studies rather than independent research. Institutions should treat vendor-reported performance metrics as preliminary until confirmed by academic researchers. The lack of published validation studies is understandable for a tool launched in 2021, but it remains a barrier to widespread adoption in conservative academic settings.

Who it's for

Elicit is purpose-built for clinician-researchers and methodologists conducting systematic reviews, scoping reviews, or evidence syntheses. Academic hospitalists running quality improvement projects, specialty society guideline committees updating clinical recommendations, and AHRQ evidence synthesis teams preparing commissioned reports will find Elicit's structured extraction features directly applicable to their workflows. The tool is most valuable for teams that screen hundreds of papers and extract dozens of data fields per study, where manual table-building is the primary bottleneck.

Solo researchers and fellows on limited budgets benefit from the free tier and $10-per-month Plus plan. A cardiology fellow preparing a meta-analysis for publication can use Elicit to screen 300 papers and extract data from 40 trials without institutional funding. The tool levels the playing field for early-career researchers who lack access to expensive Covidence licenses. However, users without formal systematic review training should pair Elicit with methodological mentorship to avoid errors in study selection or risk-of-bias assessment.

Elicit is not suitable for clinical teams seeking point-of-care literature lookup or diagnostic decision support. The tool does not answer clinical questions directly and requires users to formulate structured research questions in PICO format. Emergency department physicians, primary care clinicians, and specialists looking for quick evidence summaries should use UpToDate, DynaMed, or Consensus instead. Elicit's value proposition is speed in formal evidence synthesis, not bedside clinical decision-making. Teams without ongoing systematic review needs should skip Elicit in favor of simpler reference management tools.

The verdict

Elicit is the strongest AI tool for systematic review data extraction, delivering real time savings for teams that run frequent evidence syntheses. The platform's ability to generate structured comparison tables from narrative text automates the most tedious phase of systematic review work. Cochrane and NIH adoption signals credibility, and the free-to-$42-per-month pricing makes it accessible to individual researchers and small teams. However, thin user feedback and zero peer-reviewed validation studies mean early adopters carry methodological risk.

Institutions should pilot Elicit on a small systematic review before committing to team-wide rollout. Start with a scoping review or update of an existing systematic review where the expected study set and data fields are well-defined. Verify extraction accuracy cell-by-cell and compare time-to-completion against historical manual workflows. If accuracy exceeds 70 percent and time savings reach 30 percent or more, scale to larger projects. If extraction quality is inconsistent or verification overhead negates time savings, revert to Covidence or manual methods.
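The pilot rule above can be written down as an explicit check so the go/no-go criteria are fixed before the pilot starts. This is a minimal sketch: the function name `pilot_decision` and the sample pilot numbers are hypothetical, while the thresholds (accuracy above 70 percent, time savings of at least 30 percent) come from the paragraph above. Time spent verifying cells should be counted in `elicit_hours`.

```python
def pilot_decision(cells_checked: int, cells_correct: int,
                   manual_hours: float, elicit_hours: float,
                   min_accuracy: float = 0.70,
                   min_time_savings: float = 0.30):
    """Return (accuracy, time_savings, scale_up) for a pilot review.

    accuracy     = share of manually verified cells that were extracted correctly
    time_savings = reduction in hours versus the historical manual baseline
    """
    if cells_checked <= 0 or manual_hours <= 0:
        raise ValueError("cells_checked and manual_hours must be positive")
    accuracy = cells_correct / cells_checked
    time_savings = (manual_hours - elicit_hours) / manual_hours
    # "Exceeds 70 percent" is strict; "30 percent or more" is inclusive.
    scale_up = accuracy > min_accuracy and time_savings >= min_time_savings
    return accuracy, time_savings, scale_up


# Hypothetical pilot: 400 cells verified, 332 correct; 80 manual hours vs 52 with Elicit.
accuracy, savings, scale_up = pilot_decision(400, 332, 80, 52)
print(f"accuracy={accuracy:.0%}, time savings={savings:.0%}, scale up={scale_up}")
# accuracy=83%, time savings=35%, scale up=True
```

A pilot that lands exactly on 70 percent accuracy does not pass under this reading, which matches the cautious adoption stance taken in this review.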

Pick Elicit if your institution runs multiple systematic reviews annually, has trained methodologists who can verify AI-extracted data, and values speed over established validation. Skip Elicit if you require Cochrane-endorsed workflows with published accuracy benchmarks, conduct reviews with sensitive unpublished data requiring strict confidentiality, or lack the budget for ongoing subscriptions. For institutions on the fence, the free tier offers a zero-risk trial. The platform's structured extraction capabilities are real, but the evidence base remains preliminary. Cautious adoption with rigorous internal validation is the appropriate strategy until independent performance studies accumulate.

Editorial review last generated May 23, 2026. Synthesized from clinician sentiment, peer-reviewed coverage, and our editorial silo picks. Refined by hand where vendor facts change.

Overview

Strongest tool for systematic-review workflows. Extracts structured data tables from papers. Used by Cochrane, Mayo, NIH researchers.

Pricing

What it costs

Free tier plus paid Plus and Pro plans at $10 to $42 per month; Team pricing requires a custom quote.

Tier  ·  Monthly  ·  Annual  ·  Notes
Plan  ·  Free + $10-42/mo Plus/Pro + Team.

Source: vendor pricing page. Verified May 23, 2026.

Vendor stability

Who builds it

Elicit, by Elicit (Ought), was founded in 2021 in the US, putting it 5 years into the market.

Peer-reviewed coverage

What the literature says

A PubMed search for "Elicit" surfaces 5 peer-reviewed studies, but none evaluate the tool itself; several match the verb "elicit" rather than the product. They are shown below, ranked by editorial relevance score combining title match, study design, recency, and journal tier.

Virtual reality interventions in obesity and eating disorders: a systematic review of biomarker, clinical, and behavioral outcomes.
Alharbi WA, Keeler J, Benbow Y, et al.· Rev Endocr Metab Disord· 2026· Systematic Review
Obesity and eating disorders (EDs) are major public health challenges associated with high morbidity and mortality. This systematic review evaluated the effects of immersive virtual reality (VR) interventions on clinical outcomes and biomarker-domain outcomes in obesity and EDs. Following PRISMA 2020 guidelines, PubMed, Web of Science, and PsycINFO were searched through to April 2025. Eligible studies included immersive VR interventions targeting obesity or EDs and reporting at least one biomarker-domain outcome (anthropometric, physiological/autonomic, endocrine/metabolic, or neurocognitive/…
Patient preferences and willingness-to-pay for AI-enabled blended type 2 diabetes care by digital experience and socioeconomic status: a discrete-choice experiment in China.
Sun H, Shi Z, Xia Y, et al.· BMJ Open· 2026
To elicit stated preferences and willingness-to-pay (WTP) for artificial intelligence (AI)-enabled blended care in type 2 diabetes mellitus (T2DM), and to examine preference heterogeneity by digital experience and socioeconomic status (SES). Cross-sectional discrete choice experiment (DCE). 12 community health centres in Jiaozuo and Puyang, Henan Province, China. Data were collected between June and August 2025. 423 adults diagnosed with T2DM for at least 6 months, recruited using consecutive convenience sampling from routine follow-up appointments. Of 769 participants who completed the surve…
Sharing the load: Can we minimize the need to self-advocate for hearing loss consideration?
Wallhagen MI, Smith AK· Geriatr Nurs· 2026
Both hearing loss and the experience of chronic illness become increasingly common across the lifespan. A major goal of palliative care and chronic illness management is to elicit care preferences, a process that should start early in the chronic illness trajectory. Hearing loss can disrupt this process, yet few data are available on the experience of older adults with hearing loss and a chronic illness within the healthcare system. This pilot study was designed to begin to address this gap in our understanding. Using a qualitative, Constructivist, Grounded Theory framework, interviews were a…
A pH-sensitive polyprodrug nanoreactor to alleviate immunosuppression by programmed tumor-specific lactate depletion and indoleamine 2,3-dioxygenase 1 inhibition.
He S, Wang J, Huang Z, et al.· J Control Release· 2026
Although lactate oxidase (LOX)-based nanoplatforms have received considerable attention in modulating the immunosuppressive tumor microenvironment (TME), their clinical translation is hampered as most nanoenzymes rely on intratumoral injection to ensure tumor-specific lactate exhaustion and to avoid potential side effects. Here we report a LOX-loaded polyprodrug nanoreactor with tunable membrane permeability in response to the acidic tumor microenvironment for selective lactate exhaustion in tumors after intravenous administration. Additionally, LOX-catalyzed lactate oxidation can generate ab…
Microplastics as a Silent Threat to Child Neurodevelopment: Evidence and Perspectives on Autism.
Grillo DS, de Carvalho Alencar ME, da Silva Ribeiro Souza V, et al.· Int J Dev Neurosci· 2026
Autism spectrum disorder (ASD) is caused by an interaction of both environment and genetics. Recently, there has been an increased focus on how environmental toxins may alter a child's biological systems. Microplastics and nanoplastics are of great interest to researchers due to their ubiquitous nature and potential to interact with the human body. Through ingestion, inhalation and dermal contact, humans can consume both sizes of particles, which may enter the blood stream and under some circumstances pass the blood-brain barrier impacting the central nervous system. Studies on exposure to mi…
