{"id":247,"date":"2026-03-01T06:48:31","date_gmt":"2026-03-01T06:48:31","guid":{"rendered":"https:\/\/globalsolidarity.live\/aiearth\/?p=247"},"modified":"2026-03-01T06:48:33","modified_gmt":"2026-03-01T06:48:33","slug":"spacearch-aiearth-markets","status":"publish","type":"post","link":"https:\/\/globalsolidarity.live\/aiearth\/ai-native-media-stack\/spacearch-aiearth-markets\/","title":{"rendered":"SpaceArch AIEARTH Markets"},"content":{"rendered":"\n<h1 class=\"wp-block-heading\">AI Extraction\u2013Validation\u2013Publication Pipeline v1<\/h1>\n\n\n\n<h3 class=\"wp-block-heading\">(Entity-first \u2022 Claim-based \u2022 Provenance-driven \u2022 Human-in-the-loop)<\/h3>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">1) Objectives<\/h2>\n\n\n\n<p>The pipeline must reliably transform unstructured inputs into:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Canonical Entities<\/strong> (Company \/ Person \/ Corridor)<\/li>\n\n\n\n<li><strong>Claims<\/strong> (field-level truth units with evidence + confidence)<\/li>\n\n\n\n<li><strong>Relationships<\/strong> (graph edges with provenance)<\/li>\n\n\n\n<li><strong>Publishable Notes<\/strong> (human narrative rendering + embedded JSON-LD)<\/li>\n\n\n\n<li><strong>Machine Feeds\/APIs<\/strong> (entity updates, corridor briefs, NDJSON streams)<\/li>\n<\/ol>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">2) Inputs (Ingestion Channels)<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">A. Structured<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Company submission form (recommended primary)<\/li>\n\n\n\n<li>Founder\/Person submission form<\/li>\n\n\n\n<li>Corridor opportunity intake form (trade\/investment\/RE\/legal)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">B. 
Semi-structured<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Interview transcript (Q&amp;A)<\/li>\n\n\n\n<li>Email threads<\/li>\n\n\n\n<li>\u201cPitch deck + short answers\u201d templates<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">C. Unstructured<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>PDFs (deck, registry docs)<\/li>\n\n\n\n<li>Website content snapshots<\/li>\n\n\n\n<li>Public registry extracts (optional)<\/li>\n\n\n\n<li>Partner feeds (institutions)<\/li>\n<\/ul>\n\n\n\n<p>All inputs are stored as immutable artifacts with IDs:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>document_id<\/code>, <code>transcript_id<\/code>, <code>submission_id<\/code><\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">3) Pipeline Stages (End-to-End)<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Stage 0 \u2014 Intake &amp; Preprocessing<\/h3>\n\n\n\n<p><strong>Goal:<\/strong> Normalize inputs and create a traceable job.<\/p>\n\n\n\n<p><strong>Actions<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Assign <code>job_id<\/code><\/li>\n\n\n\n<li>Store raw artifact in object storage<\/li>\n\n\n\n<li>Extract text (PDF \u2192 text)<\/li>\n\n\n\n<li>Chunk into logical sections<\/li>\n\n\n\n<li>Detect language + translate (optional)<\/li>\n\n\n\n<li>Create <code>ingestion_manifest<\/code>:\n<ul class=\"wp-block-list\">\n<li>source type<\/li>\n\n\n\n<li>timestamps<\/li>\n\n\n\n<li>submitter identity (if any)<\/li>\n\n\n\n<li>consent flags (interview \/ PII)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p><strong>Outputs<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>raw_text<\/code><\/li>\n\n\n\n<li><code>chunks[]<\/code><\/li>\n\n\n\n<li><code>manifest.json<\/code><\/li>\n<\/ul>\n\n\n\n<p><strong>Gates<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>If no consent and contains PII \u2192 route to compliance review<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator 
has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\">Stage 1 \u2014 Entity Detection &amp; Candidate Generation<\/h3>\n\n\n\n<p><strong>Goal:<\/strong> Identify which entities exist and whether they already exist.<\/p>\n\n\n\n<p><strong>AI Tasks<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Named entity recognition (Company, Person, City, Institution, Product)<\/li>\n\n\n\n<li>Normalize names, domains, locations<\/li>\n\n\n\n<li>Propose entity candidates:\n<ul class=\"wp-block-list\">\n<li>existing match probability<\/li>\n\n\n\n<li>new entity proposal<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p><strong>Deterministic Tasks<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Domain normalization (strip tracking)<\/li>\n\n\n\n<li>Location normalization (country code ISO2)<\/li>\n\n\n\n<li>Slug suggestion<\/li>\n<\/ul>\n\n\n\n<p><strong>Outputs<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>entity_candidates[]<\/code> with:\n<ul class=\"wp-block-list\">\n<li><code>entity_type<\/code><\/li>\n\n\n\n<li><code>canonical_name<\/code><\/li>\n\n\n\n<li><code>match_candidates[]<\/code> (existing entity IDs + match score)<\/li>\n\n\n\n<li><code>proposed_new_entity<\/code> (if no match)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p><strong>Gates<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>If match score \u2265 threshold (e.g., 0.92) \u2192 auto-link to existing entity<\/li>\n\n\n\n<li>Else \u2192 human chooses (or AI asks minimal disambiguation)<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\">Stage 2 \u2014 Claim Extraction (Field-Level Truth Units)<\/h3>\n\n\n\n<p><strong>Goal:<\/strong> Extract structured fields as claims with evidence.<\/p>\n\n\n\n<p><strong>AI Tasks<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Extract schema-aligned fields per entity type:\n<ul class=\"wp-block-list\">\n<li>Company: sector, stage, business model, products, 
markets, etc.<\/li>\n\n\n\n<li>Person: roles, affiliations, expertise<\/li>\n\n\n\n<li>Corridor: nodes, domains, operating model<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p><strong>Hard Rules<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>If a field is not explicitly supported by the input evidence:\n<ul class=\"wp-block-list\">\n<li>mark as <code>unknown<\/code> OR<\/li>\n\n\n\n<li>create a claim with low confidence + \u201cself_reported\u201d level<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p><strong>Outputs (Claim objects)<\/strong><br>Each claim:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>claim_id<\/code><\/li>\n\n\n\n<li><code>entity_id<\/code> (or temporary entity key)<\/li>\n\n\n\n<li><code>field_path<\/code> (e.g., <code>funding.total_raised<\/code>)<\/li>\n\n\n\n<li><code>value<\/code><\/li>\n\n\n\n<li><code>evidence_span<\/code> (chunk_id + start\/end offsets)<\/li>\n\n\n\n<li><code>source_ref<\/code> (document_id \/ interview \/ registry)<\/li>\n\n\n\n<li><code>confidence<\/code> (0\u2013100)<\/li>\n\n\n\n<li><code>verification_level<\/code> default = <code>self_reported<\/code><\/li>\n\n\n\n<li><code>created_at<\/code><\/li>\n<\/ul>\n\n\n\n<p><strong>Gates<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Critical fields require evidence:\n<ul class=\"wp-block-list\">\n<li>legal_name, location, founded_date, funding, certifications, export markets<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>If evidence is missing \u2192 the claim cannot move to \u201cverified\u201d<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\">Stage 3 \u2014 Relationship Extraction (Graph Edges)<\/h3>\n\n\n\n<p><strong>Goal:<\/strong> Turn implicit relationships into explicit edges.<\/p>\n\n\n\n<p><strong>AI Tasks<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Identify relationships:\n<ul class=\"wp-block-list\">\n<li>Company founded_by Person<\/li>\n\n\n\n<li>Company partners_with 
Institution<\/li>\n\n\n\n<li>Company exports_to Country<\/li>\n\n\n\n<li>Corridor includes Node City<\/li>\n\n\n\n<li>Company corridor_fit domains<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p><strong>Outputs<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>edges[]<\/code>:\n<ul class=\"wp-block-list\">\n<li><code>from_entity_id<\/code><\/li>\n\n\n\n<li><code>to_entity_id<\/code><\/li>\n\n\n\n<li><code>relation_type<\/code><\/li>\n\n\n\n<li><code>evidence<\/code><\/li>\n\n\n\n<li><code>confidence<\/code><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p><strong>Gates<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>No edge without evidence pointer<\/li>\n\n\n\n<li>If evidence is weak \u2192 \u201cproposed edge\u201d status<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\">Stage 4 \u2014 Normalization &amp; Taxonomy Assignment<\/h3>\n\n\n\n<p><strong>Goal:<\/strong> Convert free-text into controlled vocab references.<\/p>\n\n\n\n<p><strong>AI Tasks<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Map sector strings \u2192 <code>sector taxonomy codes<\/code><\/li>\n\n\n\n<li>Map tech mentions \u2192 <code>technology taxonomy codes<\/code><\/li>\n\n\n\n<li>Map topics \u2192 <code>topic taxonomy codes<\/code><\/li>\n\n\n\n<li>Map corridor domains \u2192 <code>corridor_domain codes<\/code><\/li>\n<\/ul>\n\n\n\n<p><strong>Deterministic Checks<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Only allow tags existing in the taxonomy registry<\/li>\n\n\n\n<li>Reject unknown codes or route to \u201ctaxonomy steward\u201d queue<\/li>\n<\/ul>\n\n\n\n<p><strong>Outputs<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Normalized entity draft:\n<ul class=\"wp-block-list\">\n<li><code>primary_sector<\/code> 
tagRef<\/li>\n\n\n\n<li><code>secondary_sectors[]<\/code><\/li>\n\n\n\n<li><code>technologies[]<\/code><\/li>\n\n\n\n<li><code>topics[]<\/code><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p><strong>Gates<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>If the AI suggests a non-existent tag \u2192 steward approval is required<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\">Stage 5 \u2014 Consistency &amp; Quality Validation (Automated)<\/h3>\n\n\n\n<p><strong>Goal:<\/strong> Detect contradictions and enforce completeness.<\/p>\n\n\n\n<p><strong>Checks<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Date consistency (founded_date not in the future)<\/li>\n\n\n\n<li>Country\/city coherence (valid ISO2)<\/li>\n\n\n\n<li>Funding coherence (currency format, non-negative)<\/li>\n\n\n\n<li>Duplicates (domain collision, name collision)<\/li>\n\n\n\n<li>Required fields present for publishable draft<\/li>\n\n\n\n<li>Risk checks:\n<ul class=\"wp-block-list\">\n<li>prohibited content<\/li>\n\n\n\n<li>PII leaks<\/li>\n\n\n\n<li>defamation risk (accusations)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p><strong>Outputs<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>validation_report.json<\/code>:\n<ul class=\"wp-block-list\">\n<li>errors (blockers)<\/li>\n\n\n\n<li>warnings (non-blocking)<\/li>\n\n\n\n<li>completeness score<\/li>\n\n\n\n<li>confidence summary<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p><strong>Gates<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Errors block progression<\/li>\n\n\n\n<li>Warnings can pass but are logged<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\">Stage 6 \u2014 Human Review (Editorial + Verification)<\/h3>\n\n\n\n<p><strong>Goal:<\/strong> Convert \u201cAI draft\u201d into \u201cpublishable truth\u201d.<\/p>\n\n\n\n<p><strong>Two distinct roles<\/strong><\/p>\n\n\n\n<ol 
class=\"wp-block-list\">\n<li><strong>Editor Review<\/strong><\/li>\n<\/ol>\n\n\n\n<ul class=\"wp-block-list\">\n<li>readability, clarity, tone<\/li>\n\n\n\n<li>remove promotional language<\/li>\n\n\n\n<li>ensure template compliance<\/li>\n<\/ul>\n\n\n\n<ol start=\"2\" class=\"wp-block-list\">\n<li><strong>Verifier Review<\/strong><\/li>\n<\/ol>\n\n\n\n<ul class=\"wp-block-list\">\n<li>confirm evidence for critical claims<\/li>\n\n\n\n<li>assign verification level:\n<ul class=\"wp-block-list\">\n<li>self_reported \u2192 partially_verified \u2192 verified \u2192 externally_verified<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>approve or reject claims<\/li>\n<\/ul>\n\n\n\n<p><strong>UI Requirements<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Side-by-side view:\n<ul class=\"wp-block-list\">\n<li>extracted field \u2192 evidence highlight \u2192 approve\/edit<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>One-click demote \u201cunsupported claims\u201d<\/li>\n\n\n\n<li>Mark \u201cneeds more evidence\u201d with checklist<\/li>\n<\/ul>\n\n\n\n<p><strong>Outputs<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>approved_claims[]<\/code><\/li>\n\n\n\n<li><code>rejected_claims[]<\/code><\/li>\n\n\n\n<li><code>edited_entity_draft<\/code><\/li>\n<\/ul>\n\n\n\n<p><strong>Gates<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Publishing requires:\n<ul class=\"wp-block-list\">\n<li>editorial_state \u2265 <code>verified<\/code> OR at least \u201cpublished (self_reported)\u201d with explicit label<\/li>\n\n\n\n<li>last_verified_at set<\/li>\n\n\n\n<li>sources listed<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\">Stage 7 \u2014 Canonical Entity Build (Source of Truth)<\/h3>\n\n\n\n<p><strong>Goal:<\/strong> Build the final entity record from approved claims.<\/p>\n\n\n\n<p><strong>Process<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Merge approved claims into 
entity<\/li>\n\n\n\n<li>Maintain:\n<ul class=\"wp-block-list\">\n<li>per-field provenance pointers<\/li>\n\n\n\n<li>entity-level provenance summary<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Write to:\n<ul class=\"wp-block-list\">\n<li>Postgres entity store<\/li>\n\n\n\n<li>Graph store (edges)<\/li>\n\n\n\n<li>Search index (keyword)<\/li>\n\n\n\n<li>Vector DB (embeddings)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p><strong>Outputs<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>entity.json<\/code> (canonical)<\/li>\n\n\n\n<li><code>graph_updates<\/code><\/li>\n\n\n\n<li><code>search_doc<\/code><\/li>\n\n\n\n<li><code>embedding_artifacts<\/code><\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\">Stage 8 \u2014 Article Rendering (Narrative View)<\/h3>\n\n\n\n<p><strong>Goal:<\/strong> Generate the public note from the canonical entity.<\/p>\n\n\n\n<p><strong>AI Tasks<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Produce:\n<ul class=\"wp-block-list\">\n<li>Executive Summary (150\u2013250 words)<\/li>\n\n\n\n<li>Core Activity &amp; Tech Structure<\/li>\n\n\n\n<li>Market Positioning<\/li>\n\n\n\n<li>Ecosystem Context<\/li>\n\n\n\n<li>Corridor Analysis (if applicable)<\/li>\n\n\n\n<li>Verification Note<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p><strong>Hard Constraints<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>No new facts may be introduced beyond approved claims<\/li>\n\n\n\n<li>All quantitative statements must reference approved fields<\/li>\n\n\n\n<li>Style: impersonal, technical, neutral<\/li>\n<\/ul>\n\n\n\n<p><strong>Outputs<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>article.md<\/code> (or structured blocks)<\/li>\n\n\n\n<li><code>render_blocks.json<\/code> (for CMS layout)<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\">Stage 9 \u2014 Machine Layer Generation (JSON-LD + 
Feeds)<\/h3>\n\n\n\n<p><strong>Goal:<\/strong> Publish machine-readable signals.<\/p>\n\n\n\n<p><strong>Generation<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>JSON-LD (schema.org Organization\/Person + additionalProperty)<\/li>\n\n\n\n<li>OpenGraph metadata<\/li>\n\n\n\n<li>RSS entry (human)<\/li>\n\n\n\n<li>AI feed entry:\n<ul class=\"wp-block-list\">\n<li><code>entity_updates.ndjson<\/code> (diff-based)<\/li>\n\n\n\n<li><code>corridor_briefs.ndjson<\/code><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p><strong>Outputs<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>page.html<\/code> (or CMS page)<\/li>\n\n\n\n<li><code>embedded_jsonld<\/code><\/li>\n\n\n\n<li><code>feeds<\/code><\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\">Stage 10 \u2014 Publication &amp; Indexing<\/h3>\n\n\n\n<p><strong>Goal:<\/strong> Release to public + trigger indexing.<\/p>\n\n\n\n<p><strong>Actions<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Publish page (CMS or static deploy)<\/li>\n\n\n\n<li>Update sitemaps + lastmod<\/li>\n\n\n\n<li>Ping search engines (optional)<\/li>\n\n\n\n<li>Notify subscribers \/ partners (webhooks)<\/li>\n\n\n\n<li>Log publishing event<\/li>\n<\/ul>\n\n\n\n<p><strong>Outputs<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Public URL<\/li>\n\n\n\n<li>API cache refresh<\/li>\n\n\n\n<li>Webhook events<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">4) Status Model (State Machine)<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Entity state<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>draft<\/li>\n\n\n\n<li>in_review<\/li>\n\n\n\n<li>verified<\/li>\n\n\n\n<li>published<\/li>\n\n\n\n<li>archived<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Claim state<\/h3>\n\n\n\n<ul 
class=\"wp-block-list\">\n<li>proposed<\/li>\n\n\n\n<li>approved<\/li>\n\n\n\n<li>rejected<\/li>\n\n\n\n<li>superseded (replaced by a newer claim)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Edge state<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>proposed<\/li>\n\n\n\n<li>approved<\/li>\n\n\n\n<li>rejected<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">5) Confidence &amp; Verification Policy<\/h2>\n\n\n\n<p><strong>Confidence (0\u2013100)<\/strong> = AI confidence in extraction correctness.<br><strong>Verification level<\/strong> = editorial evidence quality.<\/p>\n\n\n\n<p>Rules:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>High confidence does NOT equal verified.<\/li>\n\n\n\n<li>Verified requires evidence review.<\/li>\n<\/ul>\n\n\n\n<p>Default levels:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Interview\/submission \u2192 self_reported<\/li>\n\n\n\n<li>Public registry doc \u2192 verified<\/li>\n\n\n\n<li>Partner feed + doc \u2192 externally_verified<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">6) Anti-Hallucination Guards (Critical)<\/h2>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>No free-text generation without entity constraints<\/strong><br>Generation must be constrained to approved fields only.<\/li>\n\n\n\n<li><strong>Evidence pointer required for critical fields<\/strong><br>No evidence \u2192 cannot be \u201cverified\u201d.<\/li>\n\n\n\n<li><strong>Diff-based updates<\/strong><br>Only changed fields are published as updates.<\/li>\n\n\n\n<li><strong>Changelog required<\/strong><br>Every published entity page includes an update log.<\/li>\n<\/ol>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">7) Outputs (What You Sell)<\/h2>\n\n\n\n<p>This pipeline creates multiple products:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Public 
media<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Human-readable notes<\/li>\n\n\n\n<li>Sector dossiers<\/li>\n\n\n\n<li>Corridor briefs<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">AI\/Institutional products<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Verified entity feed<\/li>\n\n\n\n<li>Corridor pipeline feed<\/li>\n\n\n\n<li>Graph export<\/li>\n\n\n\n<li>Premium API endpoints<\/li>\n\n\n\n<li>Webhooks for updates<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">8) Minimal MVP Implementation (Practical)<\/h2>\n\n\n\n<p>If you want the leanest working version:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Entity store (Postgres)<\/li>\n\n\n\n<li>Claims table (per-field)<\/li>\n\n\n\n<li>Taxonomy registry (simple table)<\/li>\n\n\n\n<li>Editor\/Verifier UI (approve claims)<\/li>\n\n\n\n<li>Article renderer (template)<\/li>\n\n\n\n<li>JSON-LD embed + RSS + NDJSON feed<\/li>\n<\/ol>\n\n\n\n<p>That\u2019s enough to be \u201cAI-native\u201d for real.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">9) Recommended \u201cJobs\u201d &amp; Queues<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>intake_queue<\/code><\/li>\n\n\n\n<li><code>entity_resolution_queue<\/code><\/li>\n\n\n\n<li><code>claim_extraction_queue<\/code><\/li>\n\n\n\n<li><code>taxonomy_mapping_queue<\/code><\/li>\n\n\n\n<li><code>validation_queue<\/code><\/li>\n\n\n\n<li><code>editor_review_queue<\/code><\/li>\n\n\n\n<li><code>verifier_queue<\/code><\/li>\n\n\n\n<li><code>publish_queue<\/code><\/li>\n\n\n\n<li><code>monitor_update_queue<\/code><\/li>\n<\/ul>\n\n\n\n<p>This makes it scalable to multiple cities and corridors.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>AI Extraction\u2013Validation\u2013Publication Pipeline v1 (Entity-first \u2022 Claim-based \u2022 Provenance-driven \u2022 Human-in-the-loop) 1) Objectives The pipeline must 
reliably transform<\/p>\n","protected":false},"author":1,"featured_media":239,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[21],"tags":[],"class_list":["post-247","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-native-media-stack"],"_links":{"self":[{"href":"https:\/\/globalsolidarity.live\/aiearth\/wp-json\/wp\/v2\/posts\/247","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/globalsolidarity.live\/aiearth\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/globalsolidarity.live\/aiearth\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/globalsolidarity.live\/aiearth\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/globalsolidarity.live\/aiearth\/wp-json\/wp\/v2\/comments?post=247"}],"version-history":[{"count":1,"href":"https:\/\/globalsolidarity.live\/aiearth\/wp-json\/wp\/v2\/posts\/247\/revisions"}],"predecessor-version":[{"id":248,"href":"https:\/\/globalsolidarity.live\/aiearth\/wp-json\/wp\/v2\/posts\/247\/revisions\/248"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/globalsolidarity.live\/aiearth\/wp-json\/wp\/v2\/media\/239"}],"wp:attachment":[{"href":"https:\/\/globalsolidarity.live\/aiearth\/wp-json\/wp\/v2\/media?parent=247"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/globalsolidarity.live\/aiearth\/wp-json\/wp\/v2\/categories?post=247"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/globalsolidarity.live\/aiearth\/wp-json\/wp\/v2\/tags?post=247"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}