# Day Zero Semantic Layer

### How to set up

{% embed url="<https://www.youtube.com/watch?v=lDHZwFF42d4>" %}

#### Steps Summary

1. **Connect data source** (Snowflake, BigQuery, Databricks, PostgreSQL, Athena, etc.).
2. **Schema sync** → captures tables, joins, and data types.
3. **Query history sync** → learns past business usage.
4. **Day Zero initialization** → builds autonomous semantic relationships.
5. **Data stats collection** → captures data completeness and quality indicators.
6. **Ready state** → Connecty is now fully semantic and query-ready.
7. **Goals and KPI recommendations**.

### Overview

The **DayZero Semantic Layer** (DayZero SL) feature automatically builds a semantic model the moment a data connection is set up.\
It discovers datasets, relationships, metrics, and joins, and initializes Connecty’s **Autonomous Semantic Graph (ASG)** without any manual configuration.

DayZero SL uses existing metadata, query history, and data samples to form a holistic understanding of your data’s logical and business context. Admins can then review and validate the result (human-in-the-loop).

**What this means for your business:** Teams can start extracting insights based on your custom definitions from day one, instead of waiting for business definitions to be uploaded manually.

***

### Key Capabilities

#### 1. **Automatic Semantic Layer Generation**

During DayZero sync, Connecty runs a **Query-to-Semantic Layer** process and a **Semantic Layer Reconciliation** process:

* Converts discovered relationships into an internal grammar representation
* Merges inferred logic with existing verified grammar (if any)
* Ensures consistency and removes ambiguous mappings

**Business impact:** Analysts get an immediately usable data reasoning layer. The AI understands metric relationships and naming conventions without needing model scripts or dbt packages.

***

#### 2. **Query History Analysis**

Connecty analyzes past query history (from tools like Athena or Snowflake) to identify:

* Frequently used joins and filters
* Metrics and business expressions used in previous SQL
* Naming patterns that indicate business intent (e.g., *“total\_sales”*, *“monthly\_active\_users”*)

**Business impact:** The system learns the organization's natural data language automatically, so the first AI answers already align with how your team talks about KPIs.
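
To make the idea concrete, here is a minimal sketch of how frequently used joins and metric aliases could be counted from raw SQL text. It is an illustration only, not Connecty's implementation: the sample queries, the regular expressions, and the `summarize_history` function are all hypothetical, and a production system would use a proper SQL parser rather than regexes.

```python
import re
from collections import Counter

# Hypothetical sample of past queries; real history would come from the warehouse.
QUERY_HISTORY = [
    "SELECT SUM(o.amount) AS total_sales FROM orders o "
    "JOIN customers c ON o.customer_id = c.id",
    "SELECT c.region, COUNT(*) FROM orders o "
    "JOIN customers c ON o.customer_id = c.id GROUP BY c.region",
    "SELECT COUNT(DISTINCT e.user_id) AS monthly_active_users FROM events e",
]

# Matches "JOIN <table> [alias] ON <left> = <right>" (simple equi-joins only).
JOIN_RE = re.compile(
    r"\bJOIN\s+(\w+)(?:\s+\w+)?\s+ON\s+([\w.]+)\s*=\s*([\w.]+)", re.IGNORECASE
)
# Matches "AS <alias>", which often carries the business name of a metric.
ALIAS_RE = re.compile(r"\bAS\s+(\w+)", re.IGNORECASE)


def summarize_history(queries):
    """Count join conditions and output aliases across a list of SQL strings."""
    joins = Counter()
    aliases = Counter()
    for sql in queries:
        for table, left, right in JOIN_RE.findall(sql):
            joins[(table.lower(), left.lower(), right.lower())] += 1
        for alias in ALIAS_RE.findall(sql):
            aliases[alias.lower()] += 1
    return joins, aliases


joins, aliases = summarize_history(QUERY_HISTORY)
print(joins.most_common(1))
print(sorted(aliases))
```

In this toy history, the `orders` to `customers` join appears twice, which is the kind of signal that would mark it as a frequently used relationship.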

***

#### 3. **Automatic Schema Discovery**

When a new data connection is added (e.g., Snowflake, Databricks, BigQuery, PostgreSQL, Athena), Connecty automatically scans:

* Tables, columns, and joins
* Data types and primary keys
* Relationships between entities

This information is used to populate the **ASG (Autonomous Semantic Graph)**.

**Business impact:** Reduces dependency on engineers to define structures manually. The semantic model starts forming immediately after connection, accelerating the time to first insight.
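
As a rough illustration of how scanned schema metadata can become a graph, the sketch below builds an undirected table graph from foreign-key relationships. The `TABLES` and `FOREIGN_KEYS` structures and the `build_graph` function are assumptions made for this example; they do not describe the internal format of the ASG.

```python
# Hypothetical metadata, shaped like what a schema scan might return.
TABLES = {
    "orders": {"columns": {"id": "integer", "customer_id": "integer",
                           "amount": "numeric"},
               "primary_key": "id"},
    "customers": {"columns": {"id": "integer", "region": "text"},
                  "primary_key": "id"},
    "events": {"columns": {"id": "integer", "user_id": "integer"},
               "primary_key": "id"},
}

# (source table, source column, target table, target column)
FOREIGN_KEYS = [("orders", "customer_id", "customers", "id")]


def build_graph(tables, foreign_keys):
    """Return {table: set of directly related tables} from foreign keys."""
    graph = {name: set() for name in tables}
    for src, _src_col, dst, _dst_col in foreign_keys:
        graph[src].add(dst)
        graph[dst].add(src)
    return graph


graph = build_graph(TABLES, FOREIGN_KEYS)
print(graph)
```

Tables with no declared relationships (here, `events`) still appear as nodes, so later steps such as query history analysis can attach inferred edges to them.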

***

#### 4. **Column-Level Data Statistics**

Connecty automatically computes data quality metrics during sync, including:

* Null counts and percentages
* Unique value counts
* Minimum, maximum, and average (for numeric fields)

**Business impact:** Analysts and AI agents can evaluate data completeness and reliability instantly, improving the accuracy of generated queries and recommendations.
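
The statistics listed above are simple to compute. The following sketch shows one way to do it for a single column held in memory; the `column_stats` function and its output keys are hypothetical names chosen for this example, and a real sync would more likely push these aggregations down to the warehouse as SQL.

```python
from statistics import mean


def column_stats(values):
    """Profile one column given as a list, where None represents NULL."""
    total = len(values)
    non_null = [v for v in values if v is not None]
    null_count = total - len(non_null)
    stats = {
        "null_count": null_count,
        "null_pct": (100.0 * null_count / total) if total else 0.0,
        "unique_count": len(set(non_null)),
    }
    # Min, max, and average only apply when every non-null value is numeric.
    numeric = [v for v in non_null
               if isinstance(v, (int, float)) and not isinstance(v, bool)]
    if numeric and len(numeric) == len(non_null):
        stats.update(min=min(numeric), max=max(numeric), avg=mean(numeric))
    return stats


print(column_stats([10, 20, None, 30]))
print(column_stats(["a", None, "a"]))
```

A high `null_pct` or a `unique_count` of 1 are examples of signals that could steer query generation away from an unreliable column.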

***

#### 5. **DayZero Query History Sync Workflow**

Connecty executes a chained sync process:

1. **DE Query History Sync** – collects past query usage.
2. **DW Query History Sync** – builds semantic clusters from those queries.
3. **Grammar Reconciliation** – updates the ASG with query-derived logic.
4. **Completion Event** – marks the environment as ready for semantic exploration.

**Business impact:** Connecty’s AI doesn’t start from zero — it starts with a pre-trained understanding of your organization’s analytical behavior.
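
The chained process above can be pictured as a pipeline in which each step consumes the output of the previous one. The sketch below is a simplified model under that assumption; the step functions, the `run_chain` helper, and the context keys are hypothetical placeholders, not Connecty's actual job definitions.

```python
def run_chain(steps, context):
    """Run steps in order; each step receives and returns the shared context."""
    for name, step in steps:
        context = step(context)
        context.setdefault("completed", []).append(name)
    return context


def collect_history(ctx):
    # Placeholder for collecting past query usage.
    ctx["queries"] = ["q1", "q2", "q3"]
    return ctx


def build_clusters(ctx):
    # Placeholder for grouping queries into semantic clusters.
    ctx["clusters"] = {"all": list(ctx["queries"])}
    return ctx


def reconcile_grammar(ctx):
    # Placeholder for updating the graph with query-derived logic.
    ctx["graph_updated"] = bool(ctx["clusters"])
    return ctx


def mark_ready(ctx):
    # Placeholder for the completion event.
    ctx["status"] = "ready" if ctx["graph_updated"] else "failed"
    return ctx


STEPS = [
    ("de_query_history_sync", collect_history),
    ("dw_query_history_sync", build_clusters),
    ("grammar_reconciliation", reconcile_grammar),
    ("completion_event", mark_ready),
]

result = run_chain(STEPS, {})
print(result["status"], result["completed"])
```

Because each step depends on the previous step's output, a failure partway through leaves `completed` as a record of how far the chain progressed.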

***

#### 6. **Parallel Semantic Reasoning**

The DayZero process runs multiple semantic inference steps in parallel:

* Entity extraction
* Join and key relationship detection
* Metric and aggregation classification

**Business impact:** The semantic graph is ready within minutes, even for large warehouses with hundreds of tables.
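
Running independent inference steps side by side might look like the sketch below, which uses Python's standard `concurrent.futures` thread pool. The three step functions use deliberately naive heuristics (for example, treating `customer_id` as a pointer to a `customers` table) and, like the `SCHEMA` sample and `run_inference`, are hypothetical stand-ins rather than Connecty's inference logic.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical schema: table name -> column names.
SCHEMA = {
    "orders": ["id", "customer_id", "amount"],
    "customers": ["id", "region"],
}


def extract_entities(schema):
    """Treat every table as a candidate entity."""
    return sorted(schema)


def detect_joins(schema):
    """Naive convention: a column '<name>_id' points at table '<name>s'."""
    joins = []
    for table, cols in schema.items():
        for col in cols:
            if col.endswith("_id"):
                target = col[:-3] + "s"
                if target in schema:
                    joins.append((table, col, target))
    return joins


def classify_metrics(schema):
    """Flag columns whose names suggest an aggregatable measure."""
    hints = ("amount", "total", "count", "price")
    return sorted(f"{t}.{c}" for t, cols in schema.items()
                  for c in cols if c in hints)


def run_inference(schema):
    """Submit all three steps at once and collect their results by name."""
    tasks = {
        "entities": extract_entities,
        "joins": detect_joins,
        "metrics": classify_metrics,
    }
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {name: pool.submit(fn, schema) for name, fn in tasks.items()}
        return {name: future.result() for name, future in futures.items()}


print(run_inference(SCHEMA))
```

The steps can run concurrently because none of them reads another's output; they only read the shared schema.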

***

#### 7. **Context Graph**

Every generated DWQuery includes a **chart specification** that describes how results should be visualized (e.g., line chart, bar chart, trend).

**Business impact:** The semantic layer doesn’t just interpret queries — it also encodes visualization intent, allowing Connecty to produce instant charts and summaries directly in chat.
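
To show what a chart specification could carry, here is a small sketch that picks a chart type from the shape of a result set. The `ChartSpec` fields, the column "kinds", and the selection rules in `suggest_chart` are assumptions for illustration; the actual specification format is not documented here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ChartSpec:
    chart_type: str  # e.g. "line" or "bar"
    x: str           # column used for the x axis
    y: str           # column used for the y axis


def suggest_chart(columns: List[Tuple[str, str]]) -> Optional[ChartSpec]:
    """columns: (name, kind) pairs, kind in {"time", "category", "number"}."""
    numbers = [name for name, kind in columns if kind == "number"]
    if not numbers:
        return None  # nothing to plot
    for name, kind in columns:
        if kind == "time":
            return ChartSpec("line", name, numbers[0])  # trend over time
    for name, kind in columns:
        if kind == "category":
            return ChartSpec("bar", name, numbers[0])   # compare categories
    return None


print(suggest_chart([("month", "time"), ("revenue", "number")]))
print(suggest_chart([("region", "category"), ("revenue", "number")]))
```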

***

#### 8. **Verified-Entity Enforcement**

When “verified-only” mode is enabled, Connecty’s DayZero-generated semantic layer:

* Flags unverified entities
* Restricts reasoning and query generation to approved definitions

**Business impact:** Ensures business consistency from day zero — only trusted metrics and dimensions are used in AI responses.
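
The two behaviors above (flagging and restricting) can be sketched as simple filters over an entity list. The `ENTITIES` sample, the `verified` field, and both function names are hypothetical and only illustrate the logic, not Connecty's data model.

```python
# Hypothetical entities with a verification flag set by data stewards.
ENTITIES = [
    {"name": "total_sales", "verified": True},
    {"name": "gross_margin", "verified": False},
    {"name": "region", "verified": True},
]


def flagged_entities(entities):
    """Names of entities that still need verification."""
    return [e["name"] for e in entities if not e["verified"]]


def usable_entities(entities, verified_only):
    """Names available to query generation under the current mode."""
    if not verified_only:
        return [e["name"] for e in entities]
    return [e["name"] for e in entities if e["verified"]]


print(flagged_entities(ENTITIES))
print(usable_entities(ENTITIES, verified_only=True))
```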

***

### What Happens Next

Once the DayZero Semantic Layer is initialized:

* Business users can start asking natural language questions immediately.
* Data stewards can begin verifying and refining entities.
* Connecty continues to learn and reconcile new relationships automatically.

### Custom Semantic Inputs (Optional add-on)

Some customers want Connecty to reflect their **existing conceptual models and dictionaries**, not just what we infer from the warehouse. That’s available as a **custom add-on**.

**Industry / Conceptual Models (Custom)**

Connecty can ingest and map **industry models** (e.g., OSHA, banking/credit, financial, loyalty, insurance) into the semantic layer, so the AI reasons in terms like *Account, Policy, Claim, Incident, Credit Line, Loyalty Member* instead of just table names.

You provide your existing models (diagrams, dictionaries, dbt / YAML / JSON, catalog exports); we map those concepts onto your actual schemas and plug them into the Day Zero Semantic Graph.

**Corporate Dictionaries & Taxonomies (Custom)**

Connecty can also ingest **corporate business glossaries and taxonomies**:

* Business glossaries (names, descriptions, owners)
* Domain taxonomies (e.g., Product → Category → Subcategory, Region → Market → Territory)
* Synonyms/abbreviations (e.g., GMV, LTV, LOB)
* Exports from tools such as **Collibra** and **Alation**

These inputs are used to normalize your language, so “churn”, “attrition” and “cancellations” all resolve to the same governed KPI, and “verified-only” mode keeps AI answers aligned to approved definitions.
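
A minimal sketch of this kind of normalization follows, assuming a flat synonym table. The `SYNONYMS` mapping, the KPI identifier `customer_churn_rate`, and the `resolve_term` function are invented for the example; real glossary imports would carry richer metadata such as owners and descriptions.

```python
# Hypothetical synonym table: user-facing term -> governed KPI identifier.
SYNONYMS = {
    "churn": "customer_churn_rate",
    "attrition": "customer_churn_rate",
    "cancellations": "customer_churn_rate",
}


def resolve_term(term):
    """Map a user's wording to a governed KPI, or None if it is unknown."""
    return SYNONYMS.get(term.strip().lower())


for word in ("Churn", "attrition", " cancellations ", "bookings"):
    print(repr(word), "->", resolve_term(word))
```

Returning `None` for unknown terms, rather than guessing, leaves room for a steward to add the missing synonym to the glossary.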
