Product tour

This product tour provides a high-level overview of the Cognite Data Fusion (CDF) architecture and the main steps to fast-track your implementation.

Tip

Refer to the Tutorials section for hands-on experience setting up a Cognite Data Fusion (CDF) project.

Architecture

Cognite Data Fusion (CDF) is a platform for contextualization and DataOps:

  • Contextualization combines machine learning, artificial intelligence, and domain knowledge to map resources from different source systems to each other in your industrial knowledge graph.

  • DataOps is a set of tools and practices to manage your data lifecycle through collaboration and automation.

CDF runs in the cloud and has a modular design:

[Figure: CDF modular design]

The sections below introduce the main steps to implementing CDF and how they relate to the different CDF modules.

Step 1: Set up data governance

When making decisions, it's important to know whether the data is reliable. End users should also know when they can trust the data.

Before integrating and contextualizing data in CDF, you must define and implement your data governance policies. We recommend appointing a CDF admin to work with the IT department to ensure that CDF follows your organization's security practices. Connect CDF to your identity provider (IdP), and use the existing user identities to manage access to the CDF tools and data.
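As an illustration, the sketch below uses the Cognite Python SDK to create a CDF access group whose membership is managed by a group in your IdP. The project name, IdP values, and capabilities are placeholders for your own governance policies, and the exact capability syntax may vary between SDK versions.

# A minimal sketch using the Cognite Python SDK (cognite-sdk).
# All names, URLs, and IDs below are placeholders for your own setup.
from cognite.client import CogniteClient, ClientConfig
from cognite.client.credentials import OAuthClientCredentials
from cognite.client.data_classes import Group

# Authenticate against your IdP (here: OAuth2 client credentials).
credentials = OAuthClientCredentials(
    token_url="https://login.example.com/oauth2/v2.0/token",  # placeholder
    client_id="my-client-id",                                  # placeholder
    client_secret="my-client-secret",                          # placeholder
    scopes=["https://api.cognitedata.com/.default"],
)
client = CogniteClient(
    ClientConfig(client_name="governance-setup", project="my-cdf-project", credentials=credentials)
)

# Create a CDF group whose membership is managed by an IdP group (source_id),
# granting read access to assets and time series as an example capability set.
group = Group(
    name="data-readers",
    source_id="8c2a4f4e-0000-0000-0000-000000000000",  # object ID of the IdP group
    capabilities=[
        {"assetsAcl": {"actions": ["READ"], "scope": {"all": {}}}},
        {"timeSeriesAcl": {"actions": ["READ"], "scope": {"all": {}}}},
    ],
)
client.iam.groups.create(group)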

To build applications on top of the data in CDF, you depend on a well-defined data model to make assumptions about the data structure. CDF has out-of-the-box data models to build a structured, flexible, contextualized knowledge graph.
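The sketch below shows one way to inspect the data models already deployed in a project and to create a space for your own extensions, using the Python SDK's data modeling API. The space name is a placeholder, and the exact classes depend on your SDK version.

# A minimal sketch: listing data models in a project and creating a space
# to hold your own views and instances. Names are placeholders.
from cognite.client import CogniteClient
from cognite.client.data_classes.data_modeling import SpaceApply

client = CogniteClient()  # assumes a configured client, as in the governance sketch

# List data models already deployed in the project, including out-of-the-box models.
for model in client.data_modeling.data_models.list(limit=None):
    print(model.space, model.external_id, model.version)

# Create a space where your own parts of the knowledge graph can live.
client.data_modeling.spaces.apply(SpaceApply(space="my_solution_space", name="My solution space"))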

Step 2: Integrate data

With data governance policies in place, you can add data from your IT, OT, and ET sources into CDF. These data sources include industrial control systems supplying sensor data, ERP systems, and engineering systems with massive 3D CAD models.

Extract data

With read access to the data sources, you can set up the system integration to stream data into the CDF staging area, where it can be normalized and enriched. We support standard protocols and interfaces like PostgreSQL and OPC-UA to facilitate data integration with your existing ETL tools and data warehouse solutions.
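As a concrete example, the sketch below lands extracted source rows in the CDF staging area (RAW) with the Python SDK. The database, table, and row contents are placeholders for whatever your extractor or ETL tool produces.

# A minimal sketch: writing extracted rows to CDF RAW (the staging area).
from cognite.client import CogniteClient

client = CogniteClient()  # assumes a configured client, as in the governance sketch

# Rows are stored as-is, keyed by a unique row key; no reshaping happens here.
rows = {
    "pump-1001": {"tag": "PUMP-1001", "source": "sap", "description": "Feed pump"},
    "pump-1002": {"tag": "PUMP-1002", "source": "sap", "description": "Backup pump"},
}
client.raw.rows.insert(db_name="src_sap", table_name="equipment", row=rows, ensure_parent=True)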

We offer extractors made for specific systems, as well as support for standard ETL tools that work with most databases. This approach lets us minimize logic in the extractors and run and re-run transformations on data in the cloud.

Transform data

The data is stored in its original format in the CDF staging area. You can run and re-run transformations on your data in the cloud and reshape it to fit the CDF data model.

Decoupling the extraction and transformation steps makes it easier to maintain the data pipelines and reduces the load on the source systems. We recommend transforming the data using your existing ETL tools, and also offer the CDF Transformation tool as an alternative for lightweight transformation jobs.
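The sketch below creates a lightweight CDF Transformation with the Python SDK, reshaping the staged RAW rows from the earlier example into assets. The names, SQL query, and destination are placeholders illustrating the pattern, not a prescribed mapping.

# A minimal sketch: a lightweight CDF Transformation from RAW rows to assets.
from cognite.client import CogniteClient
from cognite.client.data_classes import Transformation, TransformationDestination

client = CogniteClient()  # assumes a configured client

transformation = Transformation(
    name="sap-equipment-to-assets",
    external_id="sap_equipment_to_assets",
    destination=TransformationDestination.assets(),
    query="""
        SELECT
          concat('sap:', key)  AS externalId,
          `tag`                AS name,
          `description`        AS description
        FROM `src_sap`.`equipment`
    """,
)
client.transformations.create(transformation)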

Enhance data

The automatic and interactive contextualization tools in CDF let you combine artificial intelligence, machine learning, a powerful rules engine, and domain expertise to map resources from different source systems to each other in the CDF data model. Start by contextualizing your data with artificial intelligence, machine learning, and rules engines, then let domain experts validate and fine-tune the results, as sketched below.
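The sketch below uses the Python SDK's entity matching API to train a model that suggests links between time series and assets; the suggestions are then reviewed by domain experts before they are applied. The field names and example entities are placeholders.

# A minimal sketch: entity matching to suggest links between source systems.
from cognite.client import CogniteClient

client = CogniteClient()  # assumes a configured client

sources = [{"id": 1, "name": "21PT1019.PV"}]    # e.g. time series from the control system
targets = [{"id": 1001, "name": "21-PT-1019"}]  # e.g. assets from the maintenance system

model = client.entity_matching.fit(
    sources=sources,
    targets=targets,
    match_fields=[("name", "name")],
    feature_type="bigram",
)
job = model.predict(sources=sources, targets=targets, num_matches=1)
suggestions = job.result  # reviewed and confirmed by domain experts before being applied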

Step 3: Build solutions

With complete and contextualized data in your industrial knowledge graph, you can build powerful apps and AI agents to meet your business needs.

All the information stored in CDF is available through our REST-based API. Cognite also provides connectors and SDKs for common programming languages and analytics tools, like Python, JavaScript, Spark, OData (Excel, Power BI), and Grafana. We also offer community SDKs for Java, Scala, Rust, and .NET.
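For example, an application backend or notebook can read contextualized data through the Python SDK as sketched below. The external IDs and time window are placeholders.

# A minimal sketch: reading contextualized data for an app or analysis.
from cognite.client import CogniteClient

client = CogniteClient()  # assumes a configured client

# Look up an asset and pull a week of datapoints from one of its time series.
asset = client.assets.retrieve(external_id="sap:pump-1001")
datapoints = client.time_series.data.retrieve(
    external_id="21PT1019.PV",
    start="7d-ago",
    end="now",
)
df = datapoints.to_pandas()  # ready for analysis in e.g. a notebook or dashboard backend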

The Functions service provides a scalable, secure, and automated way to host and run Python code.
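The sketch below deploys a small Python handler as a Cognite Function with the Python SDK and calls it. The function name, external ID, and input data are placeholders.

# A minimal sketch: deploying and calling a Cognite Function.
from cognite.client import CogniteClient

client = CogniteClient()  # assumes a configured client

def handle(client, data):
    # Runs inside CDF with an authenticated client; 'data' is the call payload.
    count = len(client.assets.list(limit=data.get("limit", 10)))
    return {"asset_count": count}

function = client.functions.create(
    name="asset-counter",
    external_id="asset-counter",
    function_handle=handle,
)

# Deployment takes a few minutes; call the function once its status is Ready.
call = function.call(data={"limit": 100})
print(call.get_response())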