Reference AS-231 · Medical Research

Elicit

by Elicit (Ought) · founded 2021 · US

Systematic-review-grade AI literature assistant with structured data extraction.

At a glance

Pricing: Free + $10-42/mo Plus/Pro + Team.
HIPAA: Not disclosed
SOC 2: Not disclosed
EHRs: —
Founded: 2021
HQ: US

Independent score · By our public rubric

26/100 · Niche fit

Category	Score	Basis
Regulatory & Compliance	0/11	No FDA clearance listed
Clinical Integration	0/7.8	No EHR integrations listed
Evidence Strength	0/27	No peer-reviewed coverage
Vendor & Market	12.6/18	market_relevance=85 (mid-tier funding/adoption)
Sentiment & Transparency	7.8/15.5	Sentiment 50/100 across 49 mentions

All 11 dimensions

Regulatory & Compliance

FDA clearance: 0/6
No FDA clearance listed
HIPAA / SOC2 / BAA: 0/5
No public HIPAA/SOC2/BAA attestation

Clinical Integration

EHR integrations (count): 0/4
No EHR integrations listed
Top-3 EHR coverage (Epic / Oracle / Athena): 0/2
None of the top-3 EHRs covered
Bidirectional write-back: 0/1
No bidirectional write-back documented

Evidence Strength

Peer-reviewed papers: 0/21
No peer-reviewed coverage
RCT / meta-analysis / systematic review: 0/6
No RCT, meta-analysis, or systematic review

Vendor & Market

Funding & adoption signal: 8/12
market_relevance=85 (mid-tier funding/adoption)
Years in market: 4/6
Founded 2021 (5 years)

Sentiment & Transparency

Clinician sentiment (Reddit): 5/9
Sentiment 50/100 across 49 mentions
Pricing transparency: 3/7
1 pricing tier(s) but no $ amounts (contact-sales pattern)

Last computed May 26, 2026 · Rubric v1.0.0

Bottom line · Best for systematic reviews

Extracts structured data tables from papers; the strongest tool for SR workflows.

Used by Cochrane and NIH researchers. Free + $10-42/mo. Multi-step research assistant.

Editorial review · By MedAI Verdict

Bottom line

Elicit is an AI-powered literature research assistant built for systematic review workflows, with structured data extraction as its core strength. It occupies a narrow but valuable niche: converting unstructured full-text papers into queryable tables of findings, interventions, populations, and outcomes. For researchers running systematic reviews or meta-analyses, this is the most defensible AI tool in its category. For clinicians seeking point-of-care literature answers, it is likely overkill.

Pricing starts at free (limited credits) and scales to $10 per month (Plus), $42 per month (Pro), and custom Team pricing. The free tier supports exploratory use; serious systematic reviewers will need Plus or Pro. The vendor (Ought, rebranded as Elicit in 2021) has institutional credibility: Cochrane and NIH researchers have used it in published protocols. That said, peer-reviewed validation of Elicit itself is absent. Zero PubMed-indexed studies evaluate its accuracy, recall, or clinical impact.

Best fit: academic physicians, clinical research coordinators, evidence synthesis teams, and guideline developers who already run structured literature reviews. Poor fit: solo clinicians seeking rapid clinical answers, users who need EHR-integrated decision support, or anyone who requires HIPAA-compliant data handling, since the vendor does not publicly offer a Business Associate Agreement.

Why we picked it

Systematic reviews demand extraction of standardized data from dozens or hundreds of papers: patient populations, intervention protocols, comparator arms, outcome measures, effect sizes, and risk-of-bias signals. Doing this manually is slow and error-prone. Doing it with a general-purpose large language model (ChatGPT, Claude, Gemini) yields inconsistent schemas and hallucinated citations. Elicit was purpose-built for this workflow. It ingests a search query or uploaded PDF set, extracts structured fields into a sortable table, and surfaces sentences from the source papers as evidence for each cell. The result resembles a PRISMA-grade data extraction spreadsheet, generated in minutes rather than weeks.

The tool's provenance matters. Ought (now operating as Elicit) emerged from AI alignment research and published early work on language-model-assisted reasoning. The product reflects that heritage: it prioritizes interpretability (showing which sentences ground each extracted claim) over speed, and structured outputs over conversational fluency. For systematic reviewers, this is the right tradeoff. For clinicians used to UpToDate's prose summaries, it feels alien.

We selected it as the category leader for systematic reviews because no competitor combines its depth of extraction with its citation transparency. Tools like Consensus and Scite focus on claim validation and citation mapping, not data tables. Tools like ChatPDF and Humata extract text but lack schema customization. Elicit threads the needle: it understands research questions in PICO format, populates extraction tables semi-automatically, and flags when a paper lacks data for a given field rather than inventing it.

The Cochrane and NIH usage claims (stated in the vendor's case studies and marketing materials) lend credibility but require qualification. These institutions have researchers who have used Elicit; they have not formally endorsed it. Systematic reviews published by Cochrane authors may have used Elicit in screening or extraction phases, but the final reviews list human reviewers, not software tools, as authors. Elicit is an assistant, not a replacement for the two-reviewer gold standard.

What it does well

Structured data extraction is the standout capability. A user enters a research question (for example, "What is the efficacy of GLP-1 agonists for weight loss in non-diabetic adults?"), and Elicit returns a table of relevant papers with columns for population, intervention, comparator, outcome, effect size, and study design. Each cell contains a sentence extracted from the source paper, with inline citation to the page and paragraph. Users can customize columns (add "adverse events," "follow-up duration," "setting") and Elicit re-extracts across all papers. This workflow mirrors the Cochrane Handbook's data extraction phase and saves substantial time when reviewing 50-plus trials.

The tool surfaces papers from Semantic Scholar's corpus, which indexes over 200 million publications including PubMed, preprints, and gray literature. Search quality is acceptable but not flawless: it occasionally misses high-relevance papers that a trained librarian would catch with MeSH term refinement. It does, however, rank papers by relevance and predicted study type (RCT, cohort, case series), which accelerates screening. Users can upload PDFs directly if they prefer to start with a known set from a PubMed or Embase search.

Elicit's multi-step reasoning interface allows iterative refinement. A user can ask follow-up questions ("Which of these trials had blinding?"), and Elicit re-scans the same paper set to populate a new column. This composability is rare among AI research tools and aligns with how systematic reviewers actually work: initial broad extraction, then targeted queries as synthesis questions emerge. The tool retains context across queries within a session, so a researcher does not re-upload papers or re-enter inclusion criteria.

The free tier is genuinely usable for pilots. It includes 5,000 one-time credits (enough to extract data from roughly 50 papers with moderate detail) and 200 credits per month thereafter. This is sufficient for a single small-scale review or for teaching systematic review methods to medical students. The Plus tier ($10 per month) raises the monthly allotment to 12,000 credits, adequate for most month-long review projects. Pro ($42 per month) adds unlimited searches and priority processing, relevant for evidence synthesis teams running concurrent reviews.

Where it falls short

Elicit has zero peer-reviewed validation. A PubMed search for the tool (as of May 2026) returns no studies evaluating its accuracy, recall, precision, or inter-rater reliability against human reviewers. For a tool marketed to systematic reviewers, this is a conspicuous gap. Cochrane and PRISMA standards require transparency about software-assisted screening and extraction; without published benchmarks, review teams cannot confidently report that Elicit met accuracy thresholds. Vendors of competing tools (Covidence, DistillerSR) publish validation studies and inter-rater kappa statistics. Elicit does not.

The tool's clinical applicability is narrow. It excels at research synthesis but offers no point-of-care clinical decision support, no EHR integration, and no diagnostic or treatment algorithms. A hospitalist seeking rapid evidence on a rare presentation will find Elicit slower and less intuitive than UpToDate, DynaMed, or even a well-crafted PubMed Clinical Query. Elicit is not designed for bedside use; it is designed for the months-long process of writing a guideline or meta-analysis. Clinicians who do not regularly author systematic reviews will derive minimal value.

Data privacy and HIPAA compliance are opaque. The vendor's privacy policy (reviewed May 2026) states that user inputs and uploaded PDFs are processed by third-party AI models (likely OpenAI and Anthropic APIs) and may be retained for model improvement unless the user opts out. For researchers working with pre-publication manuscripts or proprietary datasets, this creates risk. The website does not prominently advertise a Business Associate Agreement option, and the terms of service do not specify whether Team plan subscribers receive BAA coverage. An academic medical center's IRB or compliance office would likely flag this during vendor review.

Search comprehensiveness lags behind librarian-executed strategies. Semantic Scholar's index is broad but not exhaustive: it under-represents non-English journals, lacks full Embase coverage, and does not ingest all conference abstracts. A systematic review run solely through Elicit would not meet PRISMA's requirement for comprehensive, reproducible search strategies across multiple databases. Elicit works best as a complement to a formal PubMed or Ovid MEDLINE search conducted by a medical librarian, not as a replacement. The tool does not export search strategies in a reproducible format (no XML, no Ovid syntax), which complicates methods reporting.

Deployment realities

Elicit is a web application with no installation required, which minimizes IT friction. Users access it via browser at elicit.com, authenticate with email or Google SSO, and begin working immediately. There is no on-premise deployment option and no API for integration with institutional repositories or EHR systems. For research teams, this simplicity is an advantage: no server provisioning, no software updates, no compatibility testing. For hospital IT departments seeking integrated clinical decision support, it is a dealbreaker.

Training time is modest but non-zero. A systematic reviewer familiar with Cochrane methods can become productive in Elicit within one to two hours of onboarding. The learning curve involves understanding how to phrase research questions in PICO format, how to customize extraction columns, and how to validate extracted data against source PDFs. Medical students or junior residents without prior systematic review experience require closer to four to six hours of supervised practice. The vendor offers tutorial videos and sample projects, but no live onboarding or white-glove training is included in Plus or Pro tiers. Team plans (pricing undisclosed) may include dedicated support; this should be negotiated during procurement.

Change management challenges are minimal for research-focused teams and insurmountable for clinical operations. An evidence synthesis unit or guideline development committee can pilot Elicit on a single review without disrupting existing workflows; the tool slots into the screening and extraction phases without replacing Covidence, EndNote, or reference management systems. A hospital seeking to deploy Elicit as a clinical decision support tool, by contrast, faces integration complexity that the product was not designed to handle: no HL7 FHIR connectors, no SMART-on-FHIR apps, no read or write access to EHR problem lists or order sets. Elicit is research infrastructure, not clinical infrastructure.

Pricing realities

The free tier includes 5,000 one-time credits plus 200 credits per month. A single paper extraction with five custom columns consumes approximately 100 credits, meaning the free tier supports around 50 papers initially and two papers per month thereafter. This is adequate for teaching use cases (a medical student systematic review assignment) but insufficient for publication-grade reviews, which often synthesize 30 to 200 studies. The free tier also limits search result volume and omits priority processing, leading to wait times during peak usage.
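
The credit arithmetic above can be checked with a short sketch. The per-paper cost of roughly 100 credits is this review's estimate for a five-column extraction, not a vendor-published rate, and the helper name `papers_supported` is hypothetical.

```python
# Assumption: ~100 credits per paper (five custom columns), as estimated in this review.
CREDITS_PER_PAPER = 100


def papers_supported(credits: int, credits_per_paper: int = CREDITS_PER_PAPER) -> int:
    """Return how many whole papers a credit balance covers."""
    return credits // credits_per_paper


free_initial = papers_supported(5_000)   # one-time free credits
free_monthly = papers_supported(200)     # recurring free credits per month
plus_monthly = papers_supported(12_000)  # Plus tier monthly allotment

print(free_initial, free_monthly, plus_monthly)  # prints: 50 2 120
```

Actual consumption will vary with the number of columns and the length of each paper, so these counts are rough planning figures only.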

Plus ($10 per month or $96 per year) raises the monthly credit allotment to 12,000, sufficient for most single-review projects completed within a month. It adds unlimited search result previews and faster processing. Pro ($42 per month or $360 per year) removes credit caps entirely, adds priority support, and enables bulk PDF upload (up to 100 papers at once). For an evidence synthesis team running multiple concurrent reviews, Pro is the minimum viable tier. Team pricing is listed as custom and requires contacting sales; expect per-seat pricing in the range of $50 to $100 per user per month based on comparable SaaS research tools.

Hidden costs include the opportunity cost of validation labor. Elicit accelerates initial extraction but does not eliminate the need for dual independent review. A responsible systematic review team will still assign two human reviewers to verify Elicit's extracted data against source PDFs, particularly for high-stakes fields (risk of bias, primary outcome measures, adverse events). This verification step gives back some of the time saved during initial extraction. Additionally, if a review requires non-English papers or full Embase coverage, researchers must budget for separate database subscriptions and manual extraction of papers Elicit cannot access. The tool reduces labor hours but does not approach full automation.

Compliance + integration depth

Elicit's HIPAA compliance status is unclear. The privacy policy confirms that uploaded content is processed by third-party AI APIs and may be retained unless the user actively opts out via account settings. Retention that is on by default (an opt-out model) would likely conflict with HIPAA's minimum necessary standard and with the requirement for Business Associate Agreements with subcontractors if protected health information were uploaded. The website does not advertise BAA availability for any tier as of May 2026. Research teams working with de-identified data (systematic reviews of published literature) face minimal HIPAA risk; teams handling unpublished clinical trial data or patient registries should seek explicit BAA terms before uploading files.

The tool holds no FDA clearance and, appropriately, makes no medical device claims. It is positioned as a research productivity tool, not a diagnostic or therapeutic aid. It has not pursued breakthrough device designation or SaMD classification. For systematic reviewers, FDA status is irrelevant; for clinical decision support use cases (which Elicit does not target), the absence of clearance and clinical validation studies would be disqualifying.

EHR integration is absent. Elicit does not connect to Epic, Cerner, Meditech, or any other EHR via HL7, FHIR, or proprietary APIs. It cannot ingest patient data, populate clinical notes, or trigger order sets. It is a standalone web tool for literature research. Academic medical centers sometimes confuse research tools with clinical tools during procurement; Elicit falls firmly in the former category. A CMIO evaluating it for clinical decision support integration will find no technical pathway to deployment.

Vendor stability + roadmap

Elicit is developed by Ought, a research organization founded in 2018 and rebranded around the Elicit product in 2021. The organization has received funding from Open Philanthropy and other EA-adjacent funders, though exact funding amounts are not publicly disclosed. Leadership includes Andreas Stuhlmüller (CEO), who holds a PhD in cognitive science and probabilistic programming. The vendor has remained independent through May 2026 with no announced acquisitions or acqui-hires, suggesting stable though modest growth.

The product roadmap, based on public feature announcements and user community feedback on the Elicit Discord server, prioritizes deeper PubMed integration, enhanced risk-of-bias extraction, and team collaboration features (shared workspaces, role-based access, version control for extraction schemas). The vendor has signaled intent to add GRADE evidence quality rating and certainty-of-evidence tools, aligning with Cochrane workflows. No announcements suggest movement toward clinical decision support, EHR integration, or point-of-care features; the vendor appears committed to the systematic review niche.

Customer references in vendor case studies include individual researchers at Cochrane, NIH-funded labs, and academic medical centers (Stanford, UCSF, Johns Hopkins named in testimonials). These are individual-user adoptions, not institutional enterprise contracts. The absence of large-scale institutional deals (no press releases announcing health system-wide rollouts) suggests Elicit remains a researcher-driven tool rather than an IT-procured enterprise product. For buyers, this means lighter vendor lock-in risk (month-to-month subscriptions, easy offboarding) but also less leverage for negotiating BAAs or SLAs.

How it compares

Covidence is the incumbent standard for systematic review workflow management. It offers end-to-end support: screening, extraction, risk-of-bias assessment, PRISMA flowcharts, and team collaboration. Covidence integrates with reference managers and supports dual independent review with conflict resolution. It does not use AI for extraction; all data entry is manual. This makes it slower than Elicit but more auditable. For teams prioritizing Cochrane-standard rigor and full workflow coverage, Covidence remains the safer choice. Pricing starts around $39 per month for solo users and scales to thousands of dollars per year for institutional subscriptions.

Rayyan is a free (with paid tiers) tool focused on abstract screening, not full-text data extraction. It uses basic keyword highlighting and recommendation algorithms to accelerate title and abstract review during PRISMA screening phases. Rayyan is faster and cheaper than Elicit for large-scale screening (thousands of abstracts) but offers no structured data extraction. Researchers often pair Rayyan for screening with Covidence for extraction. Elicit competes with Covidence, not Rayyan, because its value lies in the extraction phase.

Consensus and Scite are AI-powered search engines that surface claims and citation contexts but do not extract structured data tables. Consensus returns yes or no answers to research questions with supporting evidence snippets; Scite shows whether a paper has been supported, contradicted, or mentioned by subsequent citations. Both are faster than Elicit for exploratory research questions but lack the schema customization needed for systematic reviews. A clinician seeking a quick answer to a clinical question would prefer Consensus; a researcher building a GRADE evidence table would prefer Elicit.

DistillerSR is an enterprise-grade systematic review platform used by federal agencies (AHRQ, VA) and large academic centers. It supports complex screening workflows, machine learning-assisted prioritization, and full audit trails for regulatory submissions. Pricing is not publicly listed and likely exceeds $10,000 per year for institutional licenses. DistillerSR wins on compliance, auditability, and integration with federal evidence synthesis standards; Elicit wins on speed, ease of use, and cost for small teams. A guideline committee at a professional society (ACP, ACC, ACOG) would likely choose DistillerSR; a two-person research team would choose Elicit.

What clinicians say

Elicit has minimal clinical community discussion. A Reddit search across medical subreddits (r/medicine, r/Residency, r/medicalschool) surfaced 30 mentions of the word "elicit," but upon review, all were false positives: clinicians discussing how to "elicit" physical exam findings, feedback from attendings, or patient histories. Zero posts discussed the Elicit software tool. This absence is informative. Practicing clinicians are not using Elicit at the bedside, not seeking peer advice about it, and not encountering it in residency training.

The user community that does exist congregates on the Elicit Discord server and a small number of academic Twitter threads. Sentiment in these channels skews positive but is narrowly confined to systematic reviewers and meta-analysts. Users praise the speed of initial extraction and the transparency of sentence-level citations. Common complaints include incomplete coverage of non-English literature, occasional hallucinated extraction when a paper lacks explicit data tables, and frustration with credit consumption rates (heavy users report burning through credits faster than expected on complex multi-arm trials).

The lack of clinical community uptake is not a flaw; it reflects accurate product-market fit. Elicit was not designed for clinical decision-making, and clinicians correctly perceive it as outside their workflow. The risk is institutional confusion during procurement: a hospital administrator hearing "AI for medical literature" may assume it serves clinical use cases. It does not. The target user is a researcher, not a clinician, and Reddit's silence on Elicit among practicing physicians confirms this.

What the literature says

Elicit has zero peer-reviewed publications evaluating its performance as of May 2026. A PubMed search for "Elicit" AND ("systematic review" OR "meta-analysis" OR "literature search") returns no studies validating the tool's accuracy, recall, or precision. A search for the parent organization ("Ought") in the context of evidence synthesis likewise returns no results. This is a significant evidence gap for a tool marketed to systematic reviewers, who are trained to demand empirical validation.
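
For readers who want to re-run the search described above, the following is a minimal sketch that builds a request URL for NCBI's public E-utilities `esearch` endpoint. It only constructs the URL; it does not contact PubMed or reproduce our result. The helper name `build_pubmed_query_url` is hypothetical, and the search term is copied from the paragraph above.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (public; usage limits apply).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Search string as quoted in this review.
TERM = '"Elicit" AND ("systematic review" OR "meta-analysis" OR "literature search")'


def build_pubmed_query_url(term: str) -> str:
    """Return an esearch URL for a PubMed query, requesting a JSON response."""
    params = {"db": "pubmed", "term": term, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_pubmed_query_url(TERM)
print(url)
# Fetching this URL should return JSON whose hit count sits under
# esearchresult -> count; hits still need manual screening for relevance.
```

A raw hit count is not the same as a count of validation studies: the word "elicit" is also a common verb, so any results need to be read before they are treated as coverage of the tool.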

The absence of validation studies creates a methodological problem: systematic reviews that use Elicit in screening or extraction phases cannot cite benchmarked accuracy metrics in their methods sections. Cochrane and PRISMA guidelines require transparency about software-assisted processes and encourage reporting of inter-rater reliability when automation is involved. A review team using Elicit must either conduct internal validation (comparing Elicit's extractions to dual human review on a sample) or acknowledge in the methods that the tool's accuracy is uncharacterized. The latter weakens the review's credibility.
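
One way a review team might run the internal validation described above is to compare Elicit's extracted value for a categorical field against the consensus of dual human review on a sample of papers, then report chance-corrected agreement. The sketch below computes Cohen's kappa, a common choice for this purpose; the sample labels are invented for illustration and are not Elicit output.

```python
from collections import Counter


def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items.")
    n = len(rater_a)
    # Observed agreement: share of items where the two labels match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement by chance, from each rater's label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[label] * freq_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical sample: study design as extracted by the tool vs. human consensus.
tool_labels = ["RCT", "RCT", "cohort", "RCT", "cohort", "case series"]
human_labels = ["RCT", "RCT", "cohort", "cohort", "cohort", "case series"]

kappa = cohens_kappa(tool_labels, human_labels)
print(round(kappa, 3))  # prints: 0.739
```

Kappa applies to categorical fields such as study design or blinding status. Free-text or numeric fields (effect sizes, sample sizes) would need a different comparison, for example exact-match rates or tolerance-based checks.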

This contrasts sharply with tools like Covidence and Rayyan, which have multiple published studies evaluating their impact on screening time, inter-rater agreement, and workflow efficiency. DistillerSR has been used in federally contracted evidence reviews with documented audit trails. Elicit's lack of publications is not evidence of poor performance; it may simply reflect the vendor's focus on product development over academic partnerships. Nonetheless, evidence-based medicine demands evidence, and Elicit currently lacks it. Prospective users should treat the tool as promising but unproven and plan for manual validation workflows accordingly.

Who it's for

Elicit is purpose-built for systematic reviewers, meta-analysts, guideline developers, and evidence synthesis teams. Ideal users include: junior faculty launching their first Cochrane review, clinical research coordinators supporting NIH-funded comparative effectiveness studies, medical librarians piloting AI-assisted extraction workflows, and PhD students in epidemiology or health services research conducting dissertation-level reviews. These users share a common need: converting dozens to hundreds of papers into structured, comparable data tables under time and budget constraints.

It is a poor fit for: solo primary care clinicians seeking rapid clinical answers (use UpToDate or DynaMed instead), hospitalists needing point-of-care evidence during rounds (use PubMed Clinical Queries or Ask MeditateBot), CMIOs seeking EHR-integrated decision support (Elicit has no EHR connectors), and regulatory or medicolegal teams requiring fully auditable, validated workflows (DistillerSR or Covidence meet federal standards; Elicit does not). It is also unsuitable for teams handling unpublished or proprietary clinical data without an explicit Business Associate Agreement, which the vendor does not prominently offer.

Budget-conscious academic researchers represent the sweet spot. A two-person team running a grant-funded systematic review can subscribe to Pro ($42 per month) for the duration of the review (typically three to six months), extract data from 100-plus papers, and cancel once the review is published. Total cost: $126 to $252 for a single Pro subscription. This compares favorably to Covidence institutional licenses ($500-plus per year) or hiring a research assistant for manual extraction (hundreds of hours at $20 to $40 per hour). For this segment, Elicit delivers genuine ROI. For clinical operations, it delivers zero value.
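
The cost comparison above reduces to simple arithmetic, sketched below. The Pro rate and the hourly wage range come from this review; the 200-hour figure is an assumption standing in for "hundreds of hours" and should be replaced with a team's own estimate.

```python
PRO_MONTHLY_USD = 42  # Pro tier monthly price cited in this review


def subscription_cost(months: int, monthly_rate: int = PRO_MONTHLY_USD) -> int:
    """Total subscription cost for one seat over a review's duration."""
    return months * monthly_rate


def manual_extraction_cost(hours: int, hourly_rate: int) -> int:
    """Labor cost of manual extraction by a research assistant."""
    return hours * hourly_rate


# Review duration of three to six months, one Pro seat.
low, high = subscription_cost(3), subscription_cost(6)
print(low, high)  # prints: 126 252

# Assumption: 200 hours of manual extraction at $20 to $40 per hour.
ASSUMED_HOURS = 200
print(manual_extraction_cost(ASSUMED_HOURS, 20), manual_extraction_cost(ASSUMED_HOURS, 40))
# prints: 4000 8000
```

The subscription figure excludes the human verification time that a dual-review process still requires, so the real saving is smaller than the gap between these numbers.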

The verdict

Elicit earns its category win for systematic review support based on structural capability, not empirical validation. It extracts structured data faster and more transparently than general-purpose LLMs and with greater schema flexibility than non-AI extraction tools like Covidence. The Cochrane and NIH usage claims, while not formal endorsements, suggest the tool has met real-world systematic review needs. The pricing is defensible for academic teams. These strengths position it as the best AI-native tool in its niche.

The lack of peer-reviewed validation is disqualifying for high-stakes reviews (FDA submissions, federal evidence reports, clinical practice guidelines undergoing external review). Until Elicit publishes accuracy benchmarks and inter-rater reliability statistics, conservative research teams should treat it as a time-saving assistant that still requires full dual independent human review. The tool accelerates extraction but does not replace human verification. This limitation is manageable for exploratory reviews, pilot projects, and teaching use cases. It is problematic for Cochrane-grade work.

Decision rule: If you are an academic researcher running a systematic review on a modest budget and timeline, and you plan to manually validate all extracted data, use Elicit on the Plus or Pro tier. If you are a federal contractor or guideline committee requiring validated, auditable workflows, choose Covidence or DistillerSR instead. If you are a practicing clinician seeking point-of-care answers, Elicit is not designed for you; stick with UpToDate, DynaMed, or targeted PubMed searches. Elicit is a research tool, not a clinical tool, and should be procured and evaluated as such.

Editorial review last generated May 26, 2026. Synthesized from clinician sentiment, peer-reviewed coverage, and our editorial silo picks. Refined by hand where vendor facts change.

Overview

Strongest tool for systematic-review workflows. Extracts structured data tables from papers. Used by Cochrane, Mayo, NIH researchers.

Pricing

What it costs

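Free tier plus paid Plus and Pro plans; Team pricing is custom and requires contacting sales.

Tier	Monthly	Annual	Notes
Free	$0	$0	5,000 one-time credits plus 200 credits per month
Plus	$10	$96	12,000 credits per month
Pro	$42	$360	No credit caps; priority support; bulk PDF upload
Team	Custom	Custom	Contact sales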

Source: vendor pricing page. Verified July 3, 2026.

Vendor stability

Who builds it

Elicit (by Ought) was founded in 2021 in the US, putting it 5 years into the market.
