Data Cloud Consultant Interview Questions

What is the Salesforce Data Cloud architecture and how does it relate to a production org?

Salesforce Data Cloud is a multi-tenant, cloud-native Customer Data Platform (CDP) built on Hyperforce infrastructure. It runs as a separate Salesforce org but shares the same identity (Salesforce login) as the connected production org. Key architectural concepts:

Data Cloud Credits — the consumption-based pricing unit governing ingestion volume, identity resolution runs, and activation usage.
Data Cloud Org relationship — a production Salesforce CRM org is connected to the Data Cloud org via the Salesforce CRM connector, enabling bidirectional data flow.
Data is stored in Data Cloud's lake-house architecture (not in standard Salesforce storage).
All Data Cloud objects and features are accessible via the Data Cloud app within the connected org's UI.

What are Data Streams and what connector types does Data Cloud support?

Data Streams are the ingestion pipelines that bring data into Data Cloud from external sources. Supported connector types:

Salesforce CRM Connector — syncs standard and custom objects from a connected Salesforce org (near real-time delta sync).
Marketing Cloud Connector — ingests subscriber, engagement, and journey data from Marketing Cloud.
Ingestion API — REST API for streaming or bulk ingestion from custom applications, mobile apps, or external systems.
Cloud Storage Connectors — batch file ingestion from Amazon S3, Azure Blob Storage, and Google Cloud Storage using CSV or Parquet files.

Each Data Stream maps source fields to Data Model Objects (DMOs). Formula fields can be created within a Data Stream to transform or derive values during ingestion before data is mapped to DMOs.

What are Data Model Objects (DMOs) and the Unified Data Model in Data Cloud?

The Unified Data Model (UDM) is Data Cloud's standardized data schema. Data Model Objects (DMOs) are the individual tables within the UDM:

Standard DMOs — pre-built objects aligned to common data concepts: Individual (person), Contact Point (email, phone, address), Sales Order, Engagement (web/email events), and more.
Custom DMOs — user-defined objects for data that doesn't fit standard schemas.
Person Account vs Individual — in B2C orgs that use Person Accounts, a person is modeled in CRM with both Account and Contact fields; in Data Cloud, the Individual DMO is the canonical person record.
Engagement Concept Objects — capture behavioral events (email opens, web clicks, purchases).

DMO relationships (lookups) allow joining data across objects for segmentation and insights.

What is Identity Resolution in Data Cloud and how do deterministic vs probabilistic matching differ?

Identity Resolution (IR) unifies records from multiple data sources into a single Unified Individual profile by linking related Contact Point records. Configuration uses IR Rulesets:

Deterministic Matching — exact-match rules on high-confidence identifiers (exact email address, exact phone number, exact loyalty ID). High precision, lower recall.
Probabilistic Matching — statistical scoring across multiple partial signals (similar name + same city); matches records that are likely the same person without a single exact identifier. Higher recall, lower precision.

Reconciliation Rules determine which attribute value "wins" when multiple sources provide different values for the same field: Most Frequent (majority value), Last Updated (most recent), or Source Priority (configured source order). The Match Rate Dashboard in Data Cloud shows the percentage of records resolved into unified profiles.
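
The two matching methods and the three reconciliation rules can be sketched in plain Python. This is only an illustration of the ideas, not how Data Cloud implements them: the function names, the 0.7/0.3 weights, and the sample records are all assumptions made up for this example.

```python
from collections import Counter
from datetime import date
from difflib import SequenceMatcher
from typing import List, Optional


def deterministic_match(a: dict, b: dict) -> bool:
    """Exact match on a normalized, high-confidence identifier (email here)."""
    email_a = (a.get("email") or "").strip().lower()
    email_b = (b.get("email") or "").strip().lower()
    return email_a != "" and email_a == email_b


def probabilistic_score(a: dict, b: dict) -> float:
    """Toy likelihood score: fuzzy name similarity plus a same-city bonus."""
    name_sim = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    same_city = 1.0 if a["city"].lower() == b["city"].lower() else 0.0
    return 0.7 * name_sim + 0.3 * same_city  # weights are illustrative only


def reconcile(values: List[dict], rule: str,
              source_order: Optional[List[str]] = None) -> str:
    """Pick the winning value; each entry has 'value', 'updated', 'source'."""
    if rule == "most_frequent":
        return Counter(v["value"] for v in values).most_common(1)[0][0]
    if rule == "last_updated":
        return max(values, key=lambda v: v["updated"])["value"]
    if rule == "source_priority":
        if not source_order:
            raise ValueError("source_priority needs a source_order list")
        return min(values, key=lambda v: source_order.index(v["source"]))["value"]
    raise ValueError(f"unknown rule: {rule}")


# Hypothetical sample data: two source records and one conflicting attribute.
crm = {"name": "Jon Smith", "email": "Jon.Smith@Example.com ", "city": "Austin"}
web = {"name": "John Smith", "email": "jon.smith@example.com", "city": "austin"}
city_values = [
    {"value": "Austin", "updated": date(2024, 1, 5), "source": "crm"},
    {"value": "Austin", "updated": date(2023, 6, 1), "source": "loyalty"},
    {"value": "Dallas", "updated": date(2024, 3, 9), "source": "web"},
]
```

With this data, the deterministic rule links the two records on the normalized email, while the reconciliation rules disagree on purpose: Most Frequent picks "Austin" and Last Updated picks "Dallas", which is the kind of trade-off an interviewer may ask you to reason about.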

What is a Unified Profile and how can it be accessed in real time?

The Unified Profile is the result of Identity Resolution — a consolidated view of an individual assembled from all matched source records across Data Streams. It includes all contact points (emails, phones, addresses), attributes from source DMOs, and computed insights. Access methods:

Profile Explorer — a visual UI in Data Cloud that allows admins and analysts to look up any Unified Individual and browse their complete profile, segment membership, and engagement timeline.
Profile API — a REST API endpoint that returns the Unified Profile for a given identifier (email, phone, loyalty ID) in real time; used to power personalized experiences in Experience Cloud, mobile apps, LWC components, and Einstein Personalization.
Computed Insights on the Unified Individual enrich the profile with aggregated metrics (e.g., total purchase value, days since last engagement).

How does Segmentation work in Data Cloud and what is waterfall segmentation?

Segmentation in Data Cloud uses a drag-and-drop Segment Criteria Builder to define audience rules against Unified Individuals and related DMOs. Key concepts:

Segment-on-Segment — build a segment using another segment as an inclusion or exclusion criterion, enabling hierarchical audience logic.
Waterfall Segmentation — a controlled segment strategy where individuals flow through priority-ordered buckets and are assigned to the first matching segment, preventing overlap across buckets (e.g., "Platinum → Gold → Silver → Standard").
Refresh Schedule — standard segments publish on a 12-hour or 24-hour schedule, while Rapid Publish segments publish more frequently (every 1 or 4 hours). Rapid Publish only delivers faster updates downstream if the associated Activation also supports the faster cadence.
Segment Membership Table — a DMO that records which Unified Individuals belong to each segment at each refresh, providing historical membership data.
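
The waterfall idea (first matching bucket wins, so buckets never overlap) can be sketched in a few lines of Python. The tier names come from the example above; the spend thresholds, the `waterfall_assign` helper, and the sample people are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Rule = Callable[[dict], bool]


def waterfall_assign(people: List[dict],
                     buckets: List[Tuple[str, Rule]],
                     default: str = "Standard") -> Dict[str, List[str]]:
    """Assign each person to the first bucket whose rule matches."""
    assignment: Dict[str, List[str]] = {name: [] for name, _ in buckets}
    assignment.setdefault(default, [])
    for person in people:
        for name, rule in buckets:
            if rule(person):
                assignment[name].append(person["id"])
                break  # first match wins, so buckets never overlap
        else:
            # No rule matched: the person falls through to the default bucket.
            assignment[default].append(person["id"])
    return assignment


# Priority-ordered buckets; thresholds are made up for illustration.
tiers = [
    ("Platinum", lambda p: p["spend"] >= 10_000),
    ("Gold", lambda p: p["spend"] >= 5_000),
    ("Silver", lambda p: p["spend"] >= 1_000),
]
people = [
    {"id": "a", "spend": 12_000},
    {"id": "b", "spend": 6_000},
    {"id": "c", "spend": 1_500},
    {"id": "d", "spend": 200},
]
result = waterfall_assign(people, tiers)
```

Person "a" also satisfies the Gold and Silver rules, but only lands in Platinum because the loop stops at the first match.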

What are Calculated Insights in Data Cloud and how are they built?

Calculated Insights (CIs) are pre-aggregated metrics computed using ANSI SQL and stored as enrichment attributes on DMOs. They allow complex calculations to be computed once and reused across many segments and activations. Key points:

ANSI SQL based — CIs are written as SQL SELECT statements with GROUP BY clauses; they run on Data Cloud's lake-house engine.
Group-by dimensions — CIs can group by Unified Individual ID to produce per-person metrics (e.g., total_spend, avg_order_value, days_since_last_purchase).
CI on Unified Individual vs DMO — CIs can target the Unified Individual DMO for person-level scores or other DMOs for object-level aggregates.
Refresh — CIs can be refreshed on a schedule or on-demand; output is stored back in Data Cloud as a new attribute available for segmentation and activation.
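
To make the SELECT plus GROUP BY shape concrete, here is a small sketch that runs a per-person aggregation against an in-memory SQLite table. SQLite only stands in for Data Cloud's query engine, and the `sales_order` table and its column names are hypothetical, not actual DMO or field API names.

```python
import sqlite3

# In-memory stand-in for an order DMO; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_order (individual_id TEXT, amount REAL, order_date TEXT)"
)
conn.executemany(
    "INSERT INTO sales_order VALUES (?, ?, ?)",
    [
        ("u1", 120.0, "2024-01-10"),
        ("u1", 80.0, "2024-03-02"),
        ("u2", 40.0, "2024-02-15"),
    ],
)

# One row per individual: the same shape a per-person Calculated Insight has.
INSIGHT_SQL = """
SELECT individual_id,
       SUM(amount)     AS total_spend,
       AVG(amount)     AS avg_order_value,
       MAX(order_date) AS last_purchase_date
FROM sales_order
GROUP BY individual_id
ORDER BY individual_id
"""

insights = {row[0]: row[1:] for row in conn.execute(INSIGHT_SQL)}
```

Each key in `insights` is an individual ID and each value holds the aggregated metrics, which mirrors how a CI result becomes a reusable attribute for segmentation.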

What are Activation Targets in Data Cloud and what are the key activation types?

Activation Targets define where and how segment members are pushed to downstream systems. Key types:

Salesforce CRM Activation — updates Contact or Lead records in the connected Salesforce org; can populate a Campaign Member or update custom fields. Enables CRM-based workflows to act on Data Cloud segments.
Marketing Cloud Activation — sends segment members into a Marketing Cloud Journey (as a Journey Entry Source) or updates a Subscription Center.
Advertising Activations — Facebook Ads: hashes PII (email/phone) and pushes to Custom Audiences. Google Ads: uploads to Customer Match lists. LinkedIn: populates Matched Audiences. Hashing is applied automatically to comply with platform requirements.
Activations are associated with a segment; when the segment refreshes, the activation automatically syncs updated membership to the target.

What are Data Spaces in Data Cloud and when do you use them?

Data Spaces are logical partitions within a single Data Cloud org that isolate data for different business units, brands, or regions. Key use cases:

Business Unit Separation — a global company with multiple brands can keep each brand's customer data in a separate Data Space, preventing cross-contamination of segments and profiles.
Regional Data Residency — partition data by geography to support data sovereignty requirements (e.g., EU data stays in the EU Data Space while US data is in a separate partition).
Access Control — users can be granted access to specific Data Spaces only, so regional teams see only their data.

Each Data Space has its own Data Streams, DMOs, Segments, and Activations. Data Spaces reduce the need to maintain multiple separate Data Cloud orgs for different business divisions.

How does Consent Management work in Data Cloud?

Data Cloud includes a consent data model to respect customer privacy preferences during activation. Key objects:

Party Consent — a record per individual per processing purpose (e.g., "Marketing Emails: Opted In").
Contact Point Consent — consent tied to a specific contact point (e.g., a particular email address opted out of promotional communications).

Consent data is ingested via Data Streams from CRM, Marketing Cloud, or preference centers. During Activation, Data Cloud evaluates consent records and suppresses individuals who have opted out of the activation's communication purpose, so no manual suppression segment is required. Enforcing consent at the platform level, before data is pushed to downstream channels, supports GDPR and CCPA compliance.
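
The suppression step can be pictured as a simple filter over segment members. This is a conceptual sketch only: the record layout, the "OptIn"/"OptOut" status values, and the choice to keep members with no consent record are assumptions for the example, not Data Cloud's actual consent schema or behavior.

```python
from typing import Iterable, List


def suppress_opted_out(member_ids: Iterable[str],
                       consents: List[dict],
                       purpose: str) -> List[str]:
    """Drop members who have an opt-out recorded for the given purpose."""
    opted_out = {
        c["individual_id"]
        for c in consents
        if c["purpose"] == purpose and c["status"] == "OptOut"
    }
    # Assumption: members without any consent record for the purpose are kept.
    return [m for m in member_ids if m not in opted_out]


consents = [
    {"individual_id": "u1", "purpose": "Marketing Emails", "status": "OptIn"},
    {"individual_id": "u2", "purpose": "Marketing Emails", "status": "OptOut"},
    {"individual_id": "u3", "purpose": "SMS Alerts", "status": "OptOut"},
]
audience = suppress_opted_out(["u1", "u2", "u3"], consents, "Marketing Emails")
```

Here "u3" stays in the email audience because the opt-out applies to a different purpose, which is why consent is modeled per purpose rather than as a single flag.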

What are Data Actions in Data Cloud and how do they work?

Data Actions allow Data Cloud to trigger a Salesforce Flow in near real time when a segment membership changes or a Calculated Insight crosses a defined threshold. Use cases:

A customer enters a "High Churn Risk" segment — a Data Action triggers a Flow that creates a follow-up task for the account manager in CRM.
A customer's total spend Calculated Insight exceeds $10,000 — a Data Action triggers a Flow to upgrade the customer to a VIP tier.

Data Action configuration:

Create a Data Action linked to a segment or CI.
Define the trigger condition (enter segment, exit segment, threshold breach).
Map Data Cloud attributes to Flow input variables.
The target Flow runs in the connected Salesforce org.

This enables near-real-time alerting and automated response to data changes without waiting for batch processing.
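
The three trigger conditions and the attribute mapping step can be sketched as plain functions. Nothing here calls Salesforce: the helper names, the `totalSpend` and `customerId` Flow variable names, and the profile fields are hypothetical.

```python
from typing import Dict, List, Set


def segment_events(previous: Set[str], current: Set[str]) -> Dict[str, List[str]]:
    """Compare two membership snapshots to find who entered and who exited."""
    return {
        "entered": sorted(current - previous),
        "exited": sorted(previous - current),
    }


def crossed_threshold(old_value: float, new_value: float, threshold: float) -> bool:
    """True only on the update that moves the value from at/below to above."""
    return old_value <= threshold < new_value


def build_flow_inputs(profile: dict, mapping: Dict[str, str]) -> dict:
    """Map profile attributes to (hypothetical) Flow input variable names."""
    return {flow_var: profile[attr] for flow_var, attr in mapping.items()}


events = segment_events({"u1", "u2"}, {"u2", "u3"})
profile = {"id": "u3", "total_spend": 10250.0}
flow_inputs = build_flow_inputs(
    profile, {"customerId": "id", "totalSpend": "total_spend"}
)
```

Checking the previous value in `crossed_threshold` matters: without it, every later update for a customer already above $10,000 would fire the action again.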

How does the Ingestion API work in Data Cloud?

The Ingestion API is a REST API for sending data directly into Data Cloud from external systems. Two modes:

Streaming Ingestion — sends individual events or records in real time (e.g., web clickstream events, mobile app events). Ideal for behavioral engagement data requiring low latency.
Bulk Ingestion — uploads batches of records via multipart file upload; suited for large historical data loads or nightly batch syncs.

Key technical details:

Schema Validation — the payload must conform to the schema defined on the connected Data Stream; non-conforming records are rejected.
Idempotency Key — include a unique key per request to prevent duplicate processing if a request is retried.
Error Handling — the API returns HTTP status codes and error payloads for validation failures; failed records appear in the Data Stream's error log in the Data Cloud UI.
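
The following local simulation shows how schema validation, an idempotency key, and an error log fit together on the receiving side. It does not call the real Ingestion API; the `Receiver` class, the schema, and the return strings are invented for this sketch.

```python
from typing import Dict, List

# Hypothetical stream schema: field name -> required Python type.
SCHEMA: Dict[str, type] = {"event_id": str, "email": str, "amount": float}


def validate(record: dict, schema: Dict[str, type]) -> List[str]:
    """Return a list of schema violations (empty means the record conforms)."""
    errors = []
    for field, expected in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            errors.append(f"{field}: expected {expected.__name__}")
    return errors


class Receiver:
    """Toy stand-in for an ingestion endpoint."""

    def __init__(self, schema: Dict[str, type]) -> None:
        self.schema = schema
        self.seen_keys = set()
        self.accepted: List[dict] = []
        self.error_log: List[tuple] = []

    def ingest(self, idempotency_key: str, records: List[dict]) -> str:
        if idempotency_key in self.seen_keys:
            return "duplicate_ignored"  # a retried request is not reprocessed
        self.seen_keys.add(idempotency_key)
        for record in records:
            problems = validate(record, self.schema)
            if problems:
                self.error_log.append((record, problems))
            else:
                self.accepted.append(record)
        return "processed"


receiver = Receiver(SCHEMA)
batch = [
    {"event_id": "e1", "email": "a@example.com", "amount": 19.99},
    {"event_id": "e2", "email": "b@example.com", "amount": "19.99"},  # wrong type
]
first = receiver.ingest("req-001", batch)
retry = receiver.ingest("req-001", batch)
```

The retry with the same key is ignored, so the valid record is stored once, and the record with a string `amount` ends up in the error log instead of the accepted set.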

What is the Profile API and how is it used in real-time personalization?

The Profile API allows external systems to query the Unified Profile of an individual in real time using a known identifier (email, phone, loyalty ID, or Salesforce Contact ID). Use cases:

Experience Cloud — an LWC component on a community page calls the Profile API to retrieve the visitor's Calculated Insights (e.g., lifetime value, product preferences) and render personalized content.
Einstein Personalization — the real-time decision engine calls the Profile API to evaluate segment membership and recommend the next best action or offer.
Custom Applications — any web app or mobile app can call the Profile API via OAuth 2.0 to build personalized in-session experiences.

The Profile API response includes profile attributes, contact points, segment memberships, and computed insights, making it the single source of truth for real-time personalization decisions.

What is Einstein Studio in Data Cloud and what ML capabilities does it provide?

Einstein Studio is Data Cloud's machine learning hub for bringing AI models into the Data Cloud ecosystem. Capabilities:

Einstein Discovery Integration — connect Einstein Discovery predictions to Data Cloud DMOs; prediction scores become attributes on Unified Individuals for use in segmentation.
Bring Your Own Model (BYO Model) — import externally trained models from Amazon SageMaker or Google Cloud Vertex AI into Einstein Studio. The model is connected via its API endpoint.
Model Scoring on DMO — configure the imported model to score records in a Data Cloud DMO (e.g., score all Unified Individuals for churn probability). Scores are written back as DMO attributes.
Model outputs can then be used in Segment criteria (e.g., "Churn Score > 0.8"), Calculated Insights, and Activations, enabling AI-driven audience selection without leaving Salesforce.

How do vector search and embeddings work in Data Cloud for Agentforce?

Data Cloud supports storing and querying vector embeddings to power semantic search and AI grounding. Key concepts:

Embeddings — numerical vector representations of text (articles, emails, product descriptions) generated by an embedding model (e.g., OpenAI embeddings or Salesforce's own model).
Vector Search — instead of exact keyword matching, vector search finds semantically similar content based on cosine similarity between query and stored vectors.
Chunking — long documents are split into smaller text chunks before embedding to fit model token limits; each chunk is stored as a separate vector.
Grounding for Agentforce — when an Einstein/Agentforce agent needs to answer a question, Data Cloud's vector store is queried to retrieve the most relevant document chunks (RAG — Retrieval-Augmented Generation), which are included in the LLM prompt context for accurate, grounded responses.
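
The chunk, embed, and retrieve flow can be illustrated with a toy example. A real system would use a learned embedding model; here a bag-of-words count vector stands in for the embedding so that the cosine similarity step is easy to follow. All function names and sample texts are assumptions for this sketch.

```python
import math
from collections import Counter
from typing import List


def chunk_text(text: str, max_words: int = 8) -> List[str]:
    """Split a document into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real model returns a dense float vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: List[str], k: int = 2) -> List[str]:
    """Return the k chunks most similar to the query."""
    query_vec = embed(query)
    return sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)[:k]


chunks = [
    "reset your password from the login page",
    "shipping takes three to five business days",
    "refunds are issued within ten days",
]
best = top_k("how do I reset my password", chunks, k=1)
```

In a RAG setup, the chunks returned by `top_k` are what gets placed into the prompt context. Note that this toy embedding only matches shared words, whereas a learned embedding can also match paraphrases.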

What are Data Transforms in Data Cloud and when do you use them?

Data Transforms are SQL-based transformation jobs that read from existing DMOs, apply logic, and write the results to new derived DMOs. They are analogous to SQL views or materialized views within Data Cloud. Use cases:

Cleansing and standardizing raw ingested data (e.g., normalizing phone number formats).
Joining data from multiple DMOs into a single denormalized table for easier segmentation.
Creating derived metrics or categorizations that don't fit the standard Calculated Insights framework.
Preparing data for activation targets that require a specific schema.

Scheduled Refresh — Data Transforms can be set to run on a schedule (hourly, daily); the output DMO is updated on each run. Transforms differ from Calculated Insights: a Transform produces a new record-level DMO, whereas a Calculated Insight produces aggregated metrics.
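
As an illustration of the cleansing use case, the sketch below normalizes phone numbers and writes the results as new record-level rows. Data Transforms themselves are SQL-based as described above; this Python version only shows the logic. It assumes North American numbers with country code 1, and the `normalize_phone` and `transform` names and the output field names are hypothetical.

```python
import re
from typing import List, Optional


def normalize_phone(raw: str, country_code: str = "1") -> Optional[str]:
    """Return a +<country><number> string, or None if the input is unusable."""
    digits = re.sub(r"\D", "", raw or "")  # keep digits only
    if len(digits) == 10:
        digits = country_code + digits
    if len(digits) == 11 and digits.startswith(country_code):
        return "+" + digits
    return None


def transform(rows: List[dict]) -> List[dict]:
    """Produce derived record-level rows, skipping unusable phone values."""
    derived = []
    for row in rows:
        phone = normalize_phone(row.get("phone", ""))
        if phone is not None:
            derived.append({"individual_id": row["individual_id"], "phone_e164": phone})
    return derived


raw_rows = [
    {"individual_id": "u1", "phone": "(512) 555-0147"},
    {"individual_id": "u2", "phone": "1-512-555-0147"},
    {"individual_id": "u3", "phone": "555-0000"},
]
clean_rows = transform(raw_rows)
```

After normalization, "u1" and "u2" carry the same phone value, so an exact-match rule on phone could link them, which the raw formats would have prevented.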

How does Data Cloud integrate with Marketing Cloud Connect?

Data Cloud and Marketing Cloud integration works bidirectionally via the Marketing Cloud Connector:

Segment Activation to Marketing Cloud Journey Builder (JB) — a Data Cloud Activation Target pushes segment members into Marketing Cloud as a contact list or Journey Entry Event Source. When a new member enters the segment, they are automatically enrolled in the linked Journey. Attributes mapped during activation are available as Journey data extensions for personalization.
Engagement Data Back to Data Cloud — Marketing Cloud sends email engagement events (sends, opens, clicks, bounces, unsubscribes) back to Data Cloud via the Marketing Cloud Connector as Engagement DMO records. This closes the loop, enriching the Unified Profile with channel engagement history and enabling re-segmentation based on email behavior.

How does Data Cloud integrate with Tableau and CRM Analytics?

Data Cloud provides two primary analytics integration paths:

Tableau Integration — Tableau can connect directly to Data Cloud using the Salesforce Data Cloud connector in Tableau Desktop or Tableau Cloud. Analysts can query DMOs, Calculated Insights, and Unified Profiles as data sources and build any visualization. Tableau uses Data Cloud's query engine, supporting live queries or extracts.
CRM Analytics (formerly Tableau CRM) — pre-built Data Cloud recipe connectors sync DMO data into CRM Analytics datasets for dashboard and lens creation within Salesforce. Pre-built analytics templates for Data Cloud provide out-of-the-box dashboards for identity resolution quality, segment trends, and activation performance.

Both tools can combine Data Cloud data with Salesforce CRM data for comprehensive cross-system analytics.

What are data governance and stewardship capabilities in Data Cloud?

Data Cloud includes governance capabilities to support enterprise data management:

Data Ownership — each Data Stream and DMO has an assigned owner responsible for data quality and access.
Retention Policies — define how long ingested data is retained in Data Cloud before automatic deletion (e.g., retain engagement events for 12 months). Retention policies are set at the Data Stream level.
GDPR Right to Erasure — Data Cloud supports individual erasure requests: the Individual Deletion Job deletes all records associated with a Unified Individual (across all DMOs and source data) and removes the individual from all segments and activations. Deletion propagates to connected activations.
Data Lineage — Data Cloud provides lineage views showing how data flows from Data Streams to DMOs to Segments to Activations, supporting auditability.

What is the difference between a Unified Individual and a Unified Contact Point?

Unified Individual is the resolved, merged person-level record created by Identity Resolution. It represents a single real-world person assembled from all matched source records (Contacts, Leads, Loyalty profiles, etc.). It holds person-level attributes (name, demographics) and links to all associated data.

Unified Contact Point represents a specific communication channel identifier, such as an email address, phone number, or postal address, associated with a Unified Individual. One Unified Individual can have multiple Unified Contact Points (e.g., a person with two email addresses). Contact Points carry consent data and are the basis for deterministic matching rules (matching individuals by shared email or phone). Activations use Contact Point data to determine how to reach an individual on a given channel.

How do you use Data Cloud segmentation for B2B vs B2C use cases?

Data Cloud segmentation supports both B2C (individual-focused) and B2B (account-focused) scenarios:

B2C Segmentation — segments are built on the Unified Individual DMO, targeting people based on demographics, behavior, engagement, and purchase history. Typical use cases: high-value customer segments, re-engagement audiences, churn prediction audiences.
B2B Segmentation — while Data Cloud's core entity is the Individual, B2B use cases extend to Account-based segmentation by joining the Unified Individual to an Account DMO (synced from Salesforce CRM). Segments can include "All contacts at accounts in the Technology industry with ARR > $500K." Account-based segments are then activated to CRM for ABM campaigns or to LinkedIn Ads for account-targeted advertising. Calculated Insights aggregate metrics at both the individual and account level for rich B2B audience intelligence.

What is the Salesforce CRM Connector in Data Cloud and what data does it sync?

The Salesforce CRM Connector is a native, no-code connector that establishes a near-real-time sync between a connected Salesforce org and Data Cloud. Key characteristics:

Objects Synced — any standard or custom object can be synced; commonly synced objects include Contact, Lead, Account, Opportunity, Case, Campaign, and CampaignMember.
Delta Sync — only changed records are synced after the initial full load, keeping Data Cloud current without full table refreshes.
Field Selection — admins select which fields to include for each object; formula fields and encrypted fields have limitations.
Bidirectional — in addition to pulling CRM data into Data Cloud, CRM Activation pushes segment results back to CRM Contact/Lead records or Campaign Members.

The connector uses Salesforce's Bulk API 2.0 and Streaming API infrastructure.

What are Data Cloud Credits and how is consumption calculated?

Data Cloud Credits are the consumption-based currency used to price Data Cloud usage. Credit consumption is driven by several activities:

Data Ingestion — credits consumed per record ingested via Data Streams (different rates for real-time vs. batch).
Identity Resolution — credits consumed per IR ruleset run, based on the number of records processed.
Segmentation Refresh — each segment refresh consumes credits based on DMO size and complexity.
Activation — credits consumed per record activated to an external target.
Profile API Calls — each real-time profile lookup consumes a small credit amount.

Organizations purchase Credit packs and monitor consumption via the Data Cloud Usage dashboard. Optimizing segment refresh frequency, minimizing unnecessary activations, and managing data retention policies are key strategies to control credit consumption.

How does the Data Cloud + Agentforce integration work?

Data Cloud serves as the grounding data layer for Agentforce AI agents. Integration points:

Unified Profile as Context — when an Agentforce agent handles a customer interaction, it queries the Data Cloud Unified Profile via the Profile API to retrieve the customer's attributes, segment memberships, and Calculated Insights. This context is injected into the LLM prompt to generate personalized, informed responses.
Vector Search for Grounding — knowledge base content (articles, product documentation) stored as vector embeddings in Data Cloud is retrieved using semantic vector search and included in the agent's prompt for RAG (Retrieval-Augmented Generation), reducing the risk of hallucinations.
Data Actions as Agent Triggers — Agentforce agents can be triggered by Data Actions when segment changes indicate a customer need.
Segment-Driven Agent Assignment — route customers to specialized agents based on Data Cloud segment membership (e.g., VIP customers route to premium support agents).

How do you troubleshoot Identity Resolution quality issues in Data Cloud?

Identity Resolution quality issues typically manifest as over-merging (too many individuals merged) or under-merging (the same person appears as multiple profiles). Troubleshooting steps:

Match Rate Dashboard — review the overall match rate, number of unified individuals, and average records per unified individual. An unusually high average (e.g., 50+ records per person) suggests over-merging.
Profile Explorer — look up specific individuals to inspect which source records were merged and via which match rule.
Review Match Rules — overly broad probabilistic rules (e.g., matching on first name + city alone) cause over-merging; tighten criteria or add required fields.
Data Quality — poor input data quality (missing emails, generic phone numbers like "555-0000") causes under-merging; cleanse source data before ingestion.
Reconciliation Rules — if the wrong attribute values are winning, adjust source priority or switch to Last Updated reconciliation.
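
The "average records per unified individual" check from the first step can be sketched as follows. The input is assumed to be a list of (source record ID, unified individual ID) links exported for analysis; the function names and that export format are hypothetical, and the 50-record cutoff is simply the example figure from above.

```python
from collections import Counter
from typing import Dict, List, Tuple

Link = Tuple[str, str]  # (source_record_id, unified_individual_id)


def records_per_unified(links: List[Link]) -> Dict[str, int]:
    """Count how many source records were merged into each unified profile."""
    return dict(Counter(unified_id for _, unified_id in links))


def average_records(links: List[Link]) -> float:
    """Average number of source records per unified individual."""
    counts = records_per_unified(links)
    return len(links) / len(counts) if counts else 0.0


def over_merged(links: List[Link], max_records: int = 50) -> List[str]:
    """Unified IDs whose merged record count suggests over-merging."""
    counts = records_per_unified(links)
    return sorted(uid for uid, n in counts.items() if n >= max_records)


# Hypothetical links: 60 source records collapsed into U1, 2 into U2.
links = [(f"s{i}", "U1") for i in range(60)] + [("x1", "U2"), ("x2", "U2")]
suspects = over_merged(links)
```

A flagged ID such as "U1" is a good candidate to look up in Profile Explorer to see which match rule pulled the records together.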
