Tool: Entity Schema Generator

Extracts named entities from your page, links them to Wikidata, and generates ready-to-use JSON-LD structured data.

What it does

Analyzes your page content to find named entities (people, companies, products, places, technologies) and generates JSON-LD schema markup that makes these entities machine-readable.

How it works

Step 1: Entity extraction (NER)

Uses regex-based heuristic NER (no LLM calls required) to identify capitalized multi-word phrases and known patterns for organizations, product names, place names, and technical terms. This step is fast and deterministic, and it runs offline.
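
A minimal sketch of this kind of extraction, for illustration only. The two patterns and the function name extract_entities are assumptions, not the tool's actual rules, which are likely broader.

```python
import re

# Hypothetical patterns, chosen for illustration; the real tool's rules are not documented here.
PATTERNS = [
    # A capitalized name followed by a common company suffix, e.g. "Acme Robotics Inc".
    re.compile(r"\b[A-Z][A-Za-z]+(?:\s+[A-Z][A-Za-z]+)*\s+(?:Inc|Ltd|LLC|Corp|GmbH)\b"),
    # Two or more consecutive capitalized words, e.g. "New York".
    re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b"),
]


def extract_entities(text):
    """Return candidate entity phrases, deduplicated, ordered by first position in the text."""
    first_position = {}
    for pattern in PATTERNS:
        for match in pattern.finditer(text):
            phrase = match.group(0)
            # Keep the earliest offset at which each phrase was seen.
            if phrase not in first_position or match.start() < first_position[phrase]:
                first_position[phrase] = match.start()
    return sorted(first_position, key=first_position.get)


if __name__ == "__main__":
    print(extract_entities("Acme Robotics Inc opened an office in New York."))
```

Because the patterns rely on capitalization, input written entirely in lowercase yields no candidates.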

Step 2: Wikidata linking

Each extracted entity is searched against the Wikidata API to find a matching knowledge graph entry. If found, the entity is enriched with its Wikidata ID, description, and type classification (Person, Organization, Place, etc.).
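
The lookup could be sketched as follows, assuming the tool uses Wikidata's public wbsearchentities action and takes the top hit. The function names, the User-Agent string, and the take-the-first-result policy are illustrative assumptions. Type classification would need a further lookup of the entity's claims and is omitted here.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_search_url(name, language="en"):
    """Build a wbsearchentities request URL for one entity name."""
    params = {
        "action": "wbsearchentities",
        "search": name,
        "language": language,
        "format": "json",
        "limit": 1,
    }
    return WIKIDATA_API + "?" + urllib.parse.urlencode(params)


def pick_first_match(response):
    """Return ID, description, and page URL of the top search hit, or None if there is none."""
    results = response.get("search", [])
    if not results:
        return None
    top = results[0]
    return {
        "id": top["id"],
        "description": top.get("description"),
        "url": "https://www.wikidata.org/wiki/" + top["id"],
    }


def link_entity(name):
    """Query Wikidata over the network; the User-Agent value is a placeholder."""
    request = urllib.request.Request(
        build_search_url(name),
        headers={"User-Agent": "entity-schema-example/0.1"},
    )
    with urllib.request.urlopen(request, timeout=10) as reply:
        return pick_first_match(json.load(reply))
```

Taking the first hit is the simplest policy and can mislink ambiguous names, which is one reason to review the generated schema.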

Step 3: JSON-LD generation

Generates <script type="application/ld+json"> blocks you can paste directly into your page. The output includes:

Organization schema — if company entities are detected
Person schema — for people mentioned on the page
Product/Service schema — for commercial entities
sameAs — links to Wikidata entries for entity disambiguation

Example output

{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Anthropic",
  "sameAs": "https://www.wikidata.org/wiki/Q107432851",
  "description": "American AI safety company"
}
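
A sketch of how a linked entity might be turned into a script block like the one described in Step 3. The function name to_json_ld_script and its parameters are hypothetical; the tool's real generator may include more fields.

```python
import json


def to_json_ld_script(schema_type, name, wikidata_id=None, description=None):
    """Wrap one entity in a <script type="application/ld+json"> element."""
    data = {
        "@context": "https://schema.org",
        "@type": schema_type,
        "name": name,
    }
    # sameAs is only emitted when a Wikidata match exists.
    if wikidata_id:
        data["sameAs"] = f"https://www.wikidata.org/wiki/{wikidata_id}"
    if description:
        data["description"] = description
    body = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a value such as "</script>" cannot end the element early;
    # "\/" is a valid JSON escape for "/", so the data is unchanged when parsed.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


if __name__ == "__main__":
    # Values taken from the example output above.
    print(to_json_ld_script(
        "Organization",
        "Anthropic",
        wikidata_id="Q107432851",
        description="American AI safety company",
    ))
```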

API endpoint

GET /api/entity-schema?url=https://example.com/page
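
When calling the endpoint from code, it is safest to percent-encode the page URL so that its own query string is not mistaken for extra parameters. A small sketch; the host tools.example.com is a placeholder, since this page does not state the base URL, and the response format is not assumed.

```python
from urllib.parse import urlencode

BASE_URL = "https://tools.example.com"  # hypothetical host, replace with the real one


def entity_schema_request_url(page_url, base_url=BASE_URL):
    """Build the GET URL for the entity-schema endpoint with an encoded url parameter."""
    return f"{base_url}/api/entity-schema?{urlencode({'url': page_url})}"


if __name__ == "__main__":
    print(entity_schema_request_url("https://example.com/page"))
```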

Limitations

Regex NER may miss entities without standard capitalization patterns.
Wikidata linking depends on the entity having a Wikidata entry.
Generated schema should be reviewed before adding to production pages.