AprScope

Methodology

Last updated 2026-05-13

This page documents how the data on AprScope is sourced, processed, and presented. We update it whenever the pipeline changes substantively.

Data sources

All on-chain data on this site comes from DefiLlama (api.llama.fi and yields.llama.fi), the primary public aggregator for DeFi metrics. We use the following endpoints:

Endpoint                 | What we get                                                        | Refresh cadence
/protocols               | Protocol catalog, current TVL, audits, chains supported            | Hourly
/chains                  | Chain catalog, current TVL per chain, gas token, EVM compatibility | Every 6 hours
/pools (yields.llama.fi) | All active yield pools, APY breakdown, IL flags, predictions       | Hourly
/chart/{pool_id}         | Historical daily APY and TVL per pool                              | Daily, top 200 pools by TVL

When DefiLlama updates a value, the next scheduled ingest for that endpoint picks it up (within an hour for protocols and pools). There is no manual delay or curation between DefiLlama’s data and what you see on the site.
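
As an illustration only, the cadences listed above could be expressed as a small schedule check. This is a sketch, not our production scheduler: the names REFRESH_INTERVALS, is_due, and top_pools_for_history, and the tvl_usd field, are hypothetical and chosen for the example.

```python
from datetime import datetime, timedelta, timezone

# Intervals copied from the endpoint list above.
# The dictionary itself is a hypothetical config shape, not our real one.
REFRESH_INTERVALS = {
    "/protocols": timedelta(hours=1),
    "/chains": timedelta(hours=6),
    "/pools": timedelta(hours=1),
    "/chart/{pool_id}": timedelta(days=1),
}


def is_due(endpoint, last_run, now):
    """Return True if the endpoint was never fetched or its interval has elapsed."""
    if last_run is None:
        return True
    return now - last_run >= REFRESH_INTERVALS[endpoint]


def top_pools_for_history(pools, limit=200):
    """Select the pools that get daily history: the largest by TVL.

    `pools` is a list of dicts with a `tvl_usd` key (assumed field name).
    """
    return sorted(pools, key=lambda p: p["tvl_usd"], reverse=True)[:limit]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(is_due("/chains", now - timedelta(hours=7), now))
```

Keeping the intervals in one mapping means a cadence change is a one-line edit that can be reviewed alongside this page.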

We do not currently use:

  • Direct on-chain queries (Etherscan, Arbiscan, etc.) for primary data
  • Price-aggregator APIs (CoinGecko, CoinMarketCap) for current pricing
  • Subgraph queries (The Graph) for historical data

These may be added in future phases for redundancy and to fill gaps in DefiLlama’s coverage.

Pool slug stability

Every pool gets a URL-safe slug at first ingest, in the format {protocol-slug}-{chain-slug}-{symbol}[-{pool-meta}]. For example, the Aave V3 USDC pool on Ethereum is at /yields/pool/aave-v3-ethereum-usdc.

Once assigned, a slug never changes, even if the underlying protocol rebrands or the symbol displayed by DefiLlama is updated. This ensures bookmarks and external links remain valid indefinitely. If two pools would generate the same slug (rare but possible for similar pool variants), the second receives a stable hash suffix derived from its DefiLlama pool ID.
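
The slug rules above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the choice of SHA-256 truncated to six hex characters for the collision suffix is an assumption for the example (the text only says the suffix is a stable hash of the DefiLlama pool ID).

```python
import hashlib
import re


def slugify(text):
    """Lowercase the text and collapse each run of non-alphanumerics into one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_slug(protocol, chain, symbol, pool_meta=None):
    """Build {protocol-slug}-{chain-slug}-{symbol}[-{pool-meta}]."""
    parts = [slugify(protocol), slugify(chain), slugify(symbol)]
    if pool_meta:
        parts.append(slugify(pool_meta))
    return "-".join(p for p in parts if p)


def assign_slug(pool_id, base_slug, slugs_by_pool_id):
    """Assign a slug at first ingest; later calls return the stored slug unchanged.

    `slugs_by_pool_id` stands in for the database table of assigned slugs.
    """
    if pool_id in slugs_by_pool_id:
        # Already assigned: never changes, even if base_slug would differ today.
        return slugs_by_pool_id[pool_id]
    slug = base_slug
    if slug in slugs_by_pool_id.values():
        # Collision: append a suffix derived only from the pool ID, so it is stable.
        # SHA-256 and the 6-character length are assumptions for this sketch.
        suffix = hashlib.sha256(pool_id.encode("utf-8")).hexdigest()[:6]
        slug = f"{base_slug}-{suffix}"
    slugs_by_pool_id[pool_id] = slug
    return slug


if __name__ == "__main__":
    print(build_slug("aave-v3", "Ethereum", "USDC"))  # aave-v3-ethereum-usdc
```

Because the suffix depends only on the pool ID, re-running ingestion from scratch in the same order yields the same slugs.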

What “soft delete” means

Pools that disappear from DefiLlama’s /pools response (because the underlying contract was paused, deprecated, or migrated) are marked is_active = false in our database rather than deleted. This preserves their history for users following links from search engines or external sites. They are excluded from the homepage, top-N lists, and protocol/chain detail pages, but their dedicated URL continues to load with the most recent data we have.
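
A minimal sketch of the soft-delete pass, using an in-memory dict in place of the database table. The function names and the row shape are assumptions for illustration; only the is_active flag comes from the description above.

```python
def apply_soft_delete(stored_pools, live_pool_ids):
    """Flag pools missing from the latest /pools response instead of deleting them.

    stored_pools: dict mapping pool ID -> row dict with an "is_active" key
                  (a stand-in for the database table).
    live_pool_ids: set of pool IDs present in the latest /pools response.
    Returns the IDs that were newly marked inactive.
    """
    deactivated = []
    for pool_id, row in stored_pools.items():
        if pool_id not in live_pool_ids and row["is_active"]:
            row["is_active"] = False  # the row and its history are kept
            deactivated.append(pool_id)
    return deactivated


def listable_pools(stored_pools):
    """IDs eligible for the homepage, top-N lists, and protocol/chain pages."""
    return [pid for pid, row in stored_pools.items() if row["is_active"]]


if __name__ == "__main__":
    table = {
        "p1": {"is_active": True, "apy": 4.2},
        "p2": {"is_active": True, "apy": 7.9},
    }
    print(apply_soft_delete(table, {"p1"}))  # ['p2']
    print(listable_pools(table))             # ['p1']
```

The inactive row still exists after the pass, which is what lets the pool's dedicated URL keep loading with its last known data.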

Risk profile generation

The “Risk profile” block on every pool page is rule-based, not AI-generated. The bullets are derived deterministically from structured data:

  • Smart-contract risk is shown on every pool, with the same generic warning text. We do not attempt to rank specific contracts as more or less safe; this would require an audit-level review per pool that we cannot provide at scale.
  • Impermanent loss is shown if and only if DefiLlama flags the pool with il_risk = "yes" (i.e., the pool holds multiple assets that can diverge in price). The 7-day IL figure is taken directly from DefiLlama if available.
  • Depeg risk is shown for pools flagged as stablecoin pools by DefiLlama.
  • Yield variability is shown on every pool. The current vs. 30-day-mean APY comparison is computed from our snapshot data.
  • DefiLlama yield prediction is included only when DefiLlama provides a predicted_class and predicted_probability for the pool.

There is no editorial freedom in what appears here. The same input data always produces the same output text.
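
To make the determinism concrete, the rules above could be sketched as a pure function of the pool row. This is illustrative only: the bullet identifiers, the function names, and the stablecoin key are hypothetical, while il_risk, predicted_class, and predicted_probability follow the field names used in the list above.

```python
from statistics import mean


def risk_profile(pool):
    """Return the ordered list of risk bullets that apply to a pool row (a dict)."""
    bullets = ["smart_contract_risk"]  # shown on every pool
    if pool.get("il_risk") == "yes":
        bullets.append("impermanent_loss")
    if pool.get("stablecoin"):  # assumed name for the stablecoin flag
        bullets.append("depeg_risk")
    bullets.append("yield_variability")  # shown on every pool
    has_prediction = (
        pool.get("predicted_class") is not None
        and pool.get("predicted_probability") is not None
    )
    if has_prediction:
        bullets.append("defillama_prediction")
    return bullets


def apy_vs_30d_mean(current_apy, daily_apys):
    """Current APY minus the mean of the supplied daily APYs, in percentage points.

    Returns None when there is no snapshot history to compare against.
    """
    if not daily_apys:
        return None
    return current_apy - mean(daily_apys)


if __name__ == "__main__":
    row = {"il_risk": "yes", "stablecoin": False}
    print(risk_profile(row))
    print(apy_vs_30d_mean(5.0, [4.0, 6.0, 2.0]))  # 1.0
```

Neither function reads a clock, random state, or external service, so identical input rows give identical bullets.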

AI-generated content

Where you see an “Overview” block on a pool, protocol, chain, or asset page, the text was generated by Anthropic’s Claude Haiku 4.5 model from structured data for that entity. The pipeline is:

  1. The relevant database row is serialized to JSON (e.g., for a pool: protocol, chain, symbol, current APY breakdown, TVL, IL risk, exposure type, predicted class).
  2. A constrained Jinja2 prompt template is rendered with that JSON. The system prompt instructs the model to use only the data provided, never invent percentages or names, and write in a factual neutral tone.
  3. The output is validated: it must be at least 80 words, no more than 220 words, and must not contain forbidden phrases such as “guaranteed return”, “risk-free”, “best investment”, or “you should invest”.
  4. Validated text is stored in our database alongside the model identifier and generation cost. Failed responses are logged and discarded; the page falls back to showing only the rule-based risk profile.
  5. Generation runs weekly for new entities. Once a quarter, all overviews are regenerated in full for freshness.
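
The validation in step 3 can be sketched as below. The word limits and forbidden phrases are the ones listed above; counting words by splitting on whitespace and matching phrases case-insensitively are assumptions of this example, as is the function name.

```python
MIN_WORDS = 80
MAX_WORDS = 220
FORBIDDEN_PHRASES = (
    "guaranteed return",
    "risk-free",
    "best investment",
    "you should invest",
)


def validate_overview(text):
    """Return a list of failure reasons; an empty list means the text passes."""
    problems = []
    # Assumption: a "word" is any whitespace-separated token.
    word_count = len(text.split())
    if word_count < MIN_WORDS:
        problems.append(f"too short: {word_count} words")
    if word_count > MAX_WORDS:
        problems.append(f"too long: {word_count} words")
    # Assumption: phrase matching ignores case.
    lowered = text.lower()
    for phrase in FORBIDDEN_PHRASES:
        if phrase in lowered:
            problems.append(f"forbidden phrase: {phrase}")
    return problems


if __name__ == "__main__":
    sample = " ".join(["yield"] * 100) + " This pool is Risk-Free."
    print(validate_overview(sample))  # ['forbidden phrase: risk-free']
```

Returning reasons rather than a bare boolean fits step 4, where failed responses are logged before being discarded.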

The AI’s role is to convert numbers into prose. It does not have access to any data outside the input JSON. It cannot search the web, read protocol documentation, or reference other pools. If a value is missing from the input, the prompt instructs the model to write “data unavailable” rather than guess.

We disclose the AI generation date and the source data lineage at the bottom of every overview block.

Editorial content

Hand-written content (blog posts, the About page, this Methodology page, optional protocol overrides) lives in markdown files in our git repository. Authors are real, named individuals; the byline and bio on each post link to a profile page. Editorial content overrides AI-generated content on the rare protocol pages where we’ve published a hand-written description.

What we don’t claim

We do not attempt any of the following:

  1. Independent audit verification. When we display “audit count” for a protocol, that’s the number reported by DefiLlama or the protocol itself. We do not read audit reports or assess findings; that’s beyond the scope of an aggregator.
  2. APY prediction. The “DefiLlama yield prediction” we surface is DefiLlama’s model, not ours. We show it because users find it useful; we do not stand behind it.
  3. Best-pool recommendations. Sorting pools by current APY is mechanical. Choosing the right pool depends on your risk tolerance, position size, time horizon, and tax situation. That choice is not one we can or should make for you.
  4. Real-time data. Our refresh cadence is hourly for current snapshots and daily for history. Pools whose APY changes minute-to-minute are not represented at sub-hour resolution. For real-time figures, use the protocol’s own dashboard.

Limitations we know about

  • Pool history coverage. We have rich daily history for the top 200 pools by TVL. The long tail of pools shows current snapshots but no historical chart, because the upstream /chart/{pool_id} endpoint at DefiLlama is rate-limited in a way we don’t yet have a workaround for. We’re considering The Graph subgraphs and direct on-chain indexing as future fallbacks.
  • Asset modeling. Currently asset pages are derived from pool.symbol, which works for single-asset pools (USDC, ETH, stETH) but treats LP pair symbols (USDC-ETH) as separate “assets”. A proper asset model with contract addresses and CoinGecko-linked metadata is planned.
  • Chain coverage gaps. A handful of DefiLlama chain identifiers (BSC, Gnosis, OP Mainnet) use different names in /pools than in /chains, causing approximately 19 pools to be skipped during ingestion. We have a planned alias mapping to fix this.
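
The planned alias mapping for the chain coverage gap could take roughly this shape. This is a sketch only: the alias pairs in CHAIN_ALIASES are placeholders for illustration and have not been checked against DefiLlama’s responses, and the function name is hypothetical.

```python
# Placeholder alias pairs (lowercased /pools name -> /chains name).
# These specific pairs are illustrative guesses, NOT verified against the API.
CHAIN_ALIASES = {
    "binance": "BSC",
    "xdai": "Gnosis",
    "optimism": "OP Mainnet",
}


def canonical_chain(name, known_chains):
    """Resolve a chain name from /pools to a name in the chain catalog.

    known_chains: set of chain names loaded from /chains.
    Returns None when no match is found, so the caller can log the pool
    instead of silently skipping it.
    """
    if name in known_chains:
        return name
    alias = CHAIN_ALIASES.get(name.strip().lower())
    if alias in known_chains:
        return alias
    return None


if __name__ == "__main__":
    catalog = {"Ethereum", "BSC", "Gnosis", "OP Mainnet"}
    for raw in ("Ethereum", "xDai", "Unknownchain"):
        print(raw, "->", canonical_chain(raw, catalog))
```

Returning None for unresolved names keeps the count of skipped pools measurable while the mapping is being filled in.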

If you find a data error or a pool that’s incorrectly tagged, please report it through the contact page.

Source code and reproducibility

The ETL pipeline (Python), the database schema (PostgreSQL), and the frontend (Astro) are all version-controlled. The data lineage from DefiLlama API → our database → page render is straightforward and inspectable.

Last updated

This methodology page was last reviewed on 2026-05-13. Substantial changes will be reflected in the lastUpdated field at the top of the page.