# AgentLaw — Full Platform Description

> Legal research infrastructure for AI agents. Structured legal propositions with authority hierarchy, confidence scoring, and graph relationships.
> Last updated: 2026-04-03

This file contains the complete description of the AgentLaw platform — architecture, data model, API specification, business model, and competitive positioning. For the summary version, see llms.txt.

---

## 1. Core Insight

When agents are the consumer, the abstraction layer shifts up. Humans need documents because humans reason over text. Agents need structured propositions because agents reason over data.

Existing legal research platforms (Westlaw, Lexis) offer document retrieval through an API. That forces every agent to:

1. Retrieve full case documents (5,000-20,000 tokens each)
2. Read and parse the text to extract relevant holdings
3. Assess authority weight, jurisdiction scope, and temporal validity
4. Repeat for every query

This is expensive (hundreds of thousands of tokens per research question), slow, and error-prone.

AgentLaw pre-structures the knowledge. A proposition lookup returns the relevant law — with citations, authority ranking, and graph context — for a fraction of the cost.

### Token Economics

| Approach | Tokens per research query | What you get |
|----------|--------------------------|--------------|
| Document retrieval (Westlaw/Lexis) | 50,000-300,000 | Raw case/statute text requiring agent parsing |
| Proposition lookup (AgentLaw) | < 500 | Structured assertion with confidence, hierarchy, citations, graph |

### Comparison: Document-Oriented vs. Proposition-Oriented
| Dimension | Document-Oriented (Westlaw/Lexis) | Proposition-Oriented (AgentLaw) |
|-----------|-----------------------------------|--------------------------------|
| Unit of information | A document (case, statute) | A legal proposition (structured assertion with provenance) |
| Cost per research query | 50K-300K tokens | Hundreds of tokens |
| Search model | Keywords -> ranked document list | Structured query -> knowledge graph traversal |
| Citation checking | Visual signals (Shepard's / KeyCite) | Citation network as a queryable graph API |
| Authority ranking | Human judgment | Machine-readable hierarchy baked into every response |
| Jurisdiction | Filters applied manually | First-class jurisdiction hierarchy in every proposition |
| Currency | "As of" dates on documents | Temporal validity on every node; conflict detection |

---

## 2. Data Model

### 2.1 Proposition Schema

Each proposition node carries:

- **id** — Unique identifier (e.g., `cdp-abuse-of-discretion-001`)
- **proposition_text** — The structured legal assertion
- **authority_type** — One of: `holding`, `dicta`, `statutory_text`, `reg_interpretation`, `procedural_rule`, `agency_guidance`
- **confidence_tier** — Static legal-weight label: `controlling`, `persuasive`, `unsettled`, `dicta_only`
- **confidence_score** — Dynamic 0.0-1.0 score computed from 4 component signals
- **Component signals:**
  - `authority_strength` (0.0-1.0) — weight of the issuing authority
  - `recency` (0.0-1.0) — how recent the supporting authority is
  - `consistency` (0.0-1.0) — agreement across authorities
  - `novelty` (0.0-1.0) — how novel or untested the proposition is
- **uncertainty_type** — `settled`, `unsettled` (circuit split), `undeveloped` (no authority)
- **jurisdiction_scope** — e.g., `nationwide`, `tax_court`, `circuit_specific`
- **valid_from** — Date proposition became effective
- **valid_to** — Date proposition was superseded (null if current)

### 2.2 Citations

Each proposition has one or
more supporting citations:

- **citation_text** — Full citation (e.g., "Murphy v. Commissioner, 125 T.C. 301 (2005)")
- **pinpoint** — Specific page/paragraph reference
- **is_primary** — Whether this is the primary supporting citation
- **court** — Issuing court
- **year** — Year of decision

### 2.3 Statutory Anchors

Propositions are anchored to specific statutes/regulations:

- **anchor_type** — `irc` (Internal Revenue Code), `reg` (Treasury Regulation), `rule` (Tax Court Rule)
- **section** — Section number (e.g., `6330`, `301.6330-1`)
- **description** — Human-readable description

### 2.4 Graph Edges

Typed relationships between propositions:

- **edge_type** — One of: `supports`, `contradicts`, `narrows`, `broadens`, `codifies`, `interprets`, `applies`, `distinguishes`, `overrules`, `modifies`, `creates_exception`
- **source_id** — Source proposition ID
- **target_id** — Target proposition ID
- **description** — Explanation of the relationship

### 2.5 Knowledge Graph Tables (SQLite)

Four core tables:

- `propositions` — All proposition nodes with metadata
- `citations` — Supporting citations linked to propositions
- `statutory_anchors` — IRC/Reg/Rule links per proposition
- `edges` — Typed relationships between propositions

Three convenience views:

- `v_propositions_with_primary_cite` — Propositions joined with their primary citation
- `v_propositions_by_irc` — Propositions grouped by IRC section
- `v_graph` — Full graph view with edge details

---

## 3. Authority Hierarchy

When authorities conflict, the higher source controls — always:

1. **Internal Revenue Code (IRC)** — supreme
2. **Treasury Regulations** — controlled by Code
3. **Tax Court and Federal Case Law** — controlled by Code and Regs
4. **IRS Internal Revenue Manual (IRM)** — NOT law. Useful for procedure, but zero legal force.
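Encoded as data, the hierarchy above can be resolved mechanically. A minimal sketch, assuming illustrative rank values, proposition IDs, and a `resolve_conflict` helper; none of these names are part of AgentLaw's actual API:

```python
# Lower rank number = higher authority. Rank values are illustrative.
AUTHORITY_RANK = {
    "irc": 1,               # Internal Revenue Code: supreme
    "reg": 2,               # Treasury Regulations: controlled by the Code
    "case_law": 3,          # Tax Court and federal case law
    "agency_guidance": 4,   # IRM: procedural guidance, zero legal force
}

def resolve_conflict(propositions):
    """Return the proposition from the highest-ranked authority level."""
    return min(propositions, key=lambda p: AUTHORITY_RANK[p["authority_level"]])

# Hypothetical conflict between an IRM entry and a Code-level proposition.
conflict = [
    {"id": "irm-levy-notice-001", "authority_level": "agency_guidance"},
    {"id": "irc-6330-hearing-001", "authority_level": "irc"},
]
winner = resolve_conflict(conflict)
print(winner["id"])  # the IRC proposition controls; the IRM entry loses
```

This mirrors the rule stated above: when an IRM proposition conflicts with the Code, the Code wins unconditionally.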
### Enforcement

This hierarchy is enforced programmatically:

- `validate_data_integrity()` catches structural violations (e.g., IRM marked as controlling)
- `detect_conflicts()` finds propositions on shared anchors at different authority levels
- `resolve_hierarchy()` returns pre-resolved results so consuming agents don't need hierarchy logic

IRM propositions must be `authority_type = 'agency_guidance'` and can never be `confidence_tier = 'controlling'` on their own authority. If IRM conflicts with Code, Regs, or case law, the IRM loses.

---

## 4. Confidence Scoring

### 4.1 Formula

The confidence score (0.0-1.0) is computed deterministically from four component signals:

- `authority_strength` — Weight of the issuing court/body
- `recency` — How recently the proposition was affirmed
- `consistency` — Agreement across multiple authorities
- `novelty` — Inverse of how established the proposition is

### 4.2 Re-scoring Pipeline

When new cases arrive:

1. New case enters the candidate queue
2. Effects are extracted (which propositions does this case affect?)
3. Affected propositions are identified
4. Component signals are recalculated
5. Confidence scores are updated

### 4.3 Staleness Detection

Propositions with unprocessed candidates in the queue are flagged as potentially stale. The API exposes this via `GET /v1/propositions/stale`.

---

## 5. Architecture Layers

### Layer 1: Legal Knowledge Graph

The foundation. Propositions as nodes with typed edges (supports, contradicts, narrows, broadens, codifies, interprets, applies, distinguishes, overrules, modifies, creates_exception). Authority hierarchy enforced at the data layer. Confidence scoring from four component signals.
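The four-signal formula from section 4.1 can be sketched as a weighted combination. The weights below are assumptions for illustration only; the document specifies the signals but not how they are combined. Note that `novelty` counts against confidence, since it is the inverse of how established a proposition is:

```python
def confidence_score(authority_strength, recency, consistency, novelty):
    """Deterministic 0.0-1.0 confidence from the four component signals.

    Weights are illustrative assumptions, not AgentLaw's published formula.
    Novelty reduces confidence: an untested proposition is less settled.
    """
    score = (
        0.40 * authority_strength
        + 0.20 * recency
        + 0.25 * consistency
        + 0.15 * (1.0 - novelty)
    )
    return round(score, 3)

# A well-settled Tax Court holding: strong authority, consistent, not novel.
print(confidence_score(0.95, 0.8, 0.9, 0.1))
```

Because the formula is a fixed arithmetic combination, re-scoring after a new case (step 5 of the pipeline) is reproducible and auditable: the same signals always yield the same score.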
### Layer 2: Legal Reasoning Primitives

Instead of a search endpoint, legal reasoning operations as API calls:

- **Authority lookup**: Controlling authority for a proposition in a jurisdiction
- **Statute search**: All propositions anchored to a given IRC section or regulation
- **Graph traversal**: What narrows, supports, or contradicts a proposition
- **Conflict detection**: Where authorities at different levels disagree
- **Temporal queries**: What was the law on X as of date Y

### Layer 3: Jurisdiction Intelligence

- **Hierarchy resolution**: Given a court and an issue, rank binding vs. persuasive authority
- **Golsen rule**: Tax Court follows the law of the circuit to which a case is appealable
- **Preemption mapping**: Federal/state preemption relationships
- **Regulatory body jurisdiction**: Which agency has authority over a given activity

### Layer 4: Temporal Legal State

- **Current-law endpoint**: Resolves the current state of a statute, accounting for amendments and sunsets
- **Point-in-time queries**: What was the law on X as of date Y
- **Pending change feeds**: Proposed rules, bills, cert petitions
- **Change webhooks**: Subscribe to topics, get structured notifications when law changes

### Layer 5: Agent Workflow Primitives

- **Research contexts**: Matter-specific workspaces with cross-session continuity
- **Research chains**: Iterative refinement tracking (initial query -> narrowing -> conclusion)
- **Collaborative research**: Multiple agents share a context, avoid duplicating work
- **Confidence scoring**: Machine-readable confidence levels on every proposition

### Layer 6: Compliance & Regulatory Graph

- **Regulatory obligation mapping**: Given an entity type and activities, return applicable obligations
- **Compliance posture scoring**: Assess exposure across regulatory domains
- **Enforcement pattern analysis**: Historical enforcement data as queryable trends

---
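Layer 2's graph-traversal primitive reduces to a query over the `edges` table from section 2.5. A minimal in-memory sketch using SQLite; the schema is abridged, and the edge rows other than the target proposition ID are hypothetical examples:

```python
import sqlite3

# Abridged `edges` table from the section 2.5 schema.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE edges ("
    "  edge_type TEXT, source_id TEXT, target_id TEXT, description TEXT)"
)
con.executemany(
    "INSERT INTO edges VALUES (?, ?, ?, ?)",
    [
        # Hypothetical edges pointing at a real proposition ID from the doc.
        ("supports", "cdp-timely-mailing-002", "cdp-abuse-of-discretion-001",
         "Timely-mailing holding supports the review standard"),
        ("narrows", "cdp-equiv-hearing-003", "cdp-abuse-of-discretion-001",
         "Equivalent-hearing rule narrows the review standard"),
    ],
)

def related(proposition_id):
    """All edges touching a proposition, in either direction."""
    rows = con.execute(
        "SELECT edge_type, source_id, target_id FROM edges"
        " WHERE source_id = ? OR target_id = ?",
        (proposition_id, proposition_id),
    )
    return rows.fetchall()

for edge_type, src, dst in related("cdp-abuse-of-discretion-001"):
    print(edge_type, src, "->", dst)
```

The `GET /v1/propositions/{id}/related` endpoint described below exposes this traversal over HTTP, so a consuming agent never writes SQL.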
## 6. API Specification

### 6.1 Authentication

- API key via `X-API-Key` header
- Key format: `al_free_...` (free tier) / `al_live_...` (paid tier)
- Keys stored as SHA-256 hashes
- Daily rate limits per key

### 6.2 Free Tier Endpoints

Text previews only, no confidence scores:

- `GET /v1/propositions` — List all propositions (previews)
- `GET /v1/propositions/search?q={query}` — Keyword search
- `GET /v1/topics/{topic}` — Browse by topic
- `GET /v1/statutes/{anchor_type}/{section}` — Statute lookup (previews)

### 6.3 Paid Tier Endpoints

Full details, scores, graph, hierarchy, ingestion:

- `GET /v1/propositions/{id}` — Full proposition with citations, anchors, edges
- `GET /v1/propositions/{id}/related` — Graph traversal from a proposition
- `GET /v1/propositions/stale` — Propositions with unprocessed candidates
- `GET /v1/statutes/{anchor_type}/{section}/resolve` — Hierarchy-resolved lookup
- `GET /v1/review-queue` — Low-confidence propositions needing review
- `POST /v1/candidates` — Submit new case/revision to candidate queue
- `POST /v1/candidates/{id}/effects` — Add extracted effects
- `POST /v1/candidates/{id}/process` — Process through re-scoring pipeline
- `GET /v1/candidates/queue` — Queue overview

### 6.4 Response Format

All responses are JSON. Proposition responses include:

```json
{
  "id": "cdp-abuse-of-discretion-001",
  "proposition": "Abuse of discretion exists when...",
  "authority_type": "holding",
  "confidence_tier": "controlling",
  "confidence_score": 0.915,
  "uncertainty_type": "settled",
  "jurisdiction_scope": "tax_court",
  "valid_from": "2005-12-05",
  "valid_to": null,
  "citations": [...],
  "statutory_anchors": [...],
  "edges": [...]
}
```

---

## 7. Current Coverage

### 7.1 Collection Due Process (CDP) — Live

248 verified legal propositions covering:

- Hearing request deadlines and timely mailing rules
- Standard of review (abuse of discretion vs.
de novo)
- Underlying liability challenges and prior opportunity preclusion
- Collection alternatives (installment agreements, offers in compromise, CNC status)
- Equivalent hearings (after missed 30-day deadline)
- Balancing test and verification requirements
- Tax Court petition rights and judicial review
- Frivolous position penalties
- IRM procedural requirements for Appeals officers

Live demo: https://cdprights.com

### 7.2 Data Sources

- **DAWSON** — US Tax Court case management system (daily automated fetch)
- **IRM** — XML source files from irs.gov (IRM 8.22.1-8.22.9, IRM 5.19.8)
- **Tax Knowledge Base** — Structured knowledge (foundation, doctrine, practice patterns)

### 7.3 Expansion Path

1. CDP -> full tax controversy (deficiency, innocent spouse, passport, penalties)
2. Tax controversy -> tax compliance (regulatory obligations, filing requirements)
3. Tax -> financial regulation (adjacent domain, shared regulatory structure)
4. Single-domain -> multi-domain (general federal, then state law)

---

## 8. Competitive Positioning

AgentLaw is not a legal AI tool. It's the legal data infrastructure that legal AI tools query.

| Company | What they do | How AgentLaw differs |
|---------|-------------|---------------------|
| Harvey | AI assistant that uses legal databases | Harvey is the agent; AgentLaw is the database the agent queries |
| Casetext / CoCounsel | AI on top of Westlaw's document store | Still document-oriented; AgentLaw is proposition-oriented |
| vLex / Fastcase | Legal databases with AI features | AI is a feature; AgentLaw makes agents the primary consumer |
| CourtListener / RECAP | Open legal data | Raw documents; AgentLaw adds the structured knowledge layer |
| Westlaw API | Document retrieval via API | Same document paradigm; AgentLaw restructures the data model |

The moat: raw legal text is increasingly commoditized. Structured legal knowledge — with authority ranking, temporal validity, and graph relationships — is not.

---
## 9. Business Model

- **Per-query pricing** (not per-seat) — agents make thousands of queries; per-query aligns incentives
- **Tiered access**: Free (raw text + search) -> Pro (proposition graph + reasoning primitives) -> Enterprise (custom configs + research contexts)
- **Free tier for open data**: Public court opinions/statutes at document level; charge for the structured proposition layer
- **Direct product revenue**: CDP self-help tool as first revenue stream while building the platform

---

## 10. Technical Stack

- **Database**: SQLite knowledge graph
- **Scoring**: Deterministic confidence scoring (4 component signals)
- **API**: FastAPI (Python) with OpenAPI/Swagger docs
- **Ingestion**: Candidate queue with staleness detection, extraction schema, DAWSON bridge
- **Agent**: Claude API with tool use (8 tools wrapping SQL queries)
- **Deployment**: Hetzner VPS, Cloudflare DNS

---

## 11. Design Principles

1. **Proposition-oriented, not document-oriented** — The unit of information is a structured legal assertion with provenance, not a case or statute document.
2. **Authority hierarchy is enforced, not suggested** — Programmatic enforcement ensures agents get correct hierarchy resolution without implementing it themselves.
3. **Confidence is computed, not guessed** — The four-component scoring formula is deterministic and auditable.
4. **Currency is structural** — Temporal validity and staleness detection are built into the data model, not bolted on.
5. **Agent-native from day one** — Every design decision optimizes for machine consumption: structured responses, graph traversal, minimal tokens.

---

## Contact

- Website: https://agent-law.net
- CDP Demo: https://cdprights.com