Product tour

This product tour provides a high-level overview of the Cognite Data Fusion (CDF) architecture and the main steps to fast-track your implementation.

Tip

Refer to the Tutorials section for hands-on experience setting up a Cognite Data Fusion (CDF) project.

Architecture

Cognite Data Fusion (CDF) is a platform for contextualization and DataOps:

  • Contextualization combines machine learning, artificial intelligence, and domain knowledge to map resources from different source systems to each other in your industrial knowledge graph.

  • DataOps is a set of tools and practices to manage your data lifecycle through collaboration and automation.

CDF runs in the cloud and has a modular design:

[Figure: CDF modular design]

The sections below introduce the main steps to implementing CDF and how they relate to the different CDF modules.

Step 1: Set up data governance

When making decisions, it's important to know whether the data is reliable. End users should also know when they can trust the data.

Before integrating and contextualizing data in CDF, you must define and implement your data governance policies. We recommend appointing a CDF admin to work with the IT department to ensure that CDF follows your organization's security practices. Connect CDF to your identity provider (IdP), and use the existing user identities to manage access to the CDF tools and data.
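As an illustration, the sketch below uses the Cognite Python SDK to create a CDF access group whose membership is managed by a group in your IdP. The project name, IdP values, and capabilities are placeholders for your own governance policies, and the exact capability syntax may vary between SDK versions.

# A minimal sketch using the Cognite Python SDK (cognite-sdk).
# All names, URLs, and IDs below are placeholders for your own setup.
from cognite.client import CogniteClient, ClientConfig
from cognite.client.credentials import OAuthClientCredentials
from cognite.client.data_classes import Group

# Authenticate against your IdP (here: OAuth2 client credentials).
credentials = OAuthClientCredentials(
    token_url="https://login.example.com/oauth2/v2.0/token",  # placeholder
    client_id="my-client-id",                                  # placeholder
    client_secret="my-client-secret",                          # placeholder
    scopes=["https://api.cognitedata.com/.default"],
)
client = CogniteClient(
    ClientConfig(client_name="governance-setup", project="my-cdf-project", credentials=credentials)
)

# Create a CDF group whose membership is managed by an IdP group (source_id),
# granting read access to assets and time series as an example capability set.
group = Group(
    name="data-readers",
    source_id="8c2a4f4e-0000-0000-0000-000000000000",  # object ID of the IdP group
    capabilities=[
        {"assetsAcl": {"actions": ["READ"], "scope": {"all": {}}}},
        {"timeSeriesAcl": {"actions": ["READ"], "scope": {"all": {}}}},
    ],
)
client.iam.groups.create(group)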

To build applications on top of the data in CDF, you depend on a well-defined data model to make assumptions about the data structure. CDF has out-of-the-box data models to build a structured, flexible, contextualized knowledge graph.
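The sketch below shows one way to inspect the data models already deployed in a project and to create a space for your own extensions, using the Python SDK's data modeling API. The space name is a placeholder, and the exact classes depend on your SDK version.

# A minimal sketch: listing data models in a project and creating a space
# to hold your own views and instances. Names are placeholders.
from cognite.client import CogniteClient
from cognite.client.data_classes.data_modeling import SpaceApply

client = CogniteClient()  # assumes a configured client, as in the governance sketch

# List data models already deployed in the project, including out-of-the-box models.
for model in client.data_modeling.data_models.list(limit=None):
    print(model.space, model.external_id, model.version)

# Create a space where your own parts of the knowledge graph can live.
client.data_modeling.spaces.apply(SpaceApply(space="my_solution_space", name="My solution space"))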

Step 2: Integrate data

With data governance policies in place, you can add data from your IT, OT, and ET sources into CDF. These data sources include industrial control systems supplying sensor data, ERP systems, and engineering systems with massive 3D CAD models.

Extract data

With read access to the data sources, you can set up the system integration to stream data into the CDF staging area, where it can be normalized and enriched. We support standard protocols and interfaces like PostgreSQL and OPC-UA to facilitate data integration with your existing ETL tools and data warehouse solutions.
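As a concrete example, the sketch below lands extracted source rows in the CDF staging area (RAW) with the Python SDK. The database, table, and row contents are placeholders for whatever your extractor or ETL tool produces.

# A minimal sketch: writing extracted rows to CDF RAW (the staging area).
from cognite.client import CogniteClient

client = CogniteClient()  # assumes a configured client, as in the governance sketch

# Rows are stored as-is, keyed by a unique row key; no reshaping happens here.
rows = {
    "pump-1001": {"tag": "PUMP-1001", "source": "sap", "description": "Feed pump"},
    "pump-1002": {"tag": "PUMP-1002", "source": "sap", "description": "Backup pump"},
}
client.raw.rows.insert(db_name="src_sap", table_name="equipment", row=rows, ensure_parent=True)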

We offer extractors made for specific systems, as well as support for standard ETL tools that work with most databases. This approach lets us minimize logic in the extractors and run and re-run transformations on data in the cloud.

Transform data

The data is stored in its original format in the CDF staging area. You can run and re-run transformations on your data in the cloud and reshape it to fit the CDF data model.

Decoupling the extraction and transformation steps makes it easier to maintain the data pipelines and reduces the load on the source systems. We recommend transforming the data using your existing ETL tools, and also offer the CDF Transformation tool as an alternative for lightweight transformation jobs.
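The sketch below creates a lightweight CDF Transformation with the Python SDK, reshaping the staged RAW rows from the earlier example into assets. The names, SQL query, and destination are placeholders illustrating the pattern, not a prescribed mapping.

# A minimal sketch: a lightweight CDF Transformation from RAW rows to assets.
from cognite.client import CogniteClient
from cognite.client.data_classes import Transformation, TransformationDestination

client = CogniteClient()  # assumes a configured client

transformation = Transformation(
    name="sap-equipment-to-assets",
    external_id="sap_equipment_to_assets",
    destination=TransformationDestination.assets(),
    query="""
        SELECT
          concat('sap:', key)  AS externalId,
          `tag`                AS name,
          `description`        AS description
        FROM `src_sap`.`equipment`
    """,
)
client.transformations.create(transformation)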

Enhance data

The automatic and interactive contextualization tools in CDF let you combine artificial intelligence, machine learning, a powerful rules engine, and domain expertise to map resources from different source systems to each other in the CDF data model. Start by contextualizing your data with artificial intelligence, machine learning, and rules engines, then let domain experts validate and fine-tune the results, as sketched below.
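The sketch below uses the Python SDK's entity matching API to train a model that suggests links between time series and assets; the suggestions are then reviewed by domain experts before they are applied. The field names and example entities are placeholders.

# A minimal sketch: entity matching to suggest links between source systems.
from cognite.client import CogniteClient

client = CogniteClient()  # assumes a configured client

sources = [{"id": 1, "name": "21PT1019.PV"}]    # e.g. time series from the control system
targets = [{"id": 1001, "name": "21-PT-1019"}]  # e.g. assets from the maintenance system

model = client.entity_matching.fit(
    sources=sources,
    targets=targets,
    match_fields=[("name", "name")],
    feature_type="bigram",
)
job = model.predict(sources=sources, targets=targets, num_matches=1)
suggestions = job.result  # reviewed and confirmed by domain experts before being applied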

Step 3: Build solutions

With complete and contextualized data in your industrial knowledge graph, you can build powerful apps and AI agents to meet your business needs.

All the information stored in CDF is available through our REST-based API. Cognite also provides connectors and SDKs for common programming languages and analytics tools, like Python, JavaScript, Spark, OData (Excel, Power BI), and Grafana. We also offer community SDKs for Java, Scala, Rust, and .NET.
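For example, an application backend or notebook can read contextualized data through the Python SDK as sketched below. The external IDs and time window are placeholders.

# A minimal sketch: reading contextualized data for an app or analysis.
from cognite.client import CogniteClient

client = CogniteClient()  # assumes a configured client

# Look up an asset and pull a week of datapoints from one of its time series.
asset = client.assets.retrieve(external_id="sap:pump-1001")
datapoints = client.time_series.data.retrieve(
    external_id="21PT1019.PV",
    start="7d-ago",
    end="now",
)
df = datapoints.to_pandas()  # ready for analysis in e.g. a notebook or dashboard backend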

The Functions service provides a scalable, secure, and automated way to host and run Python code.
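The sketch below deploys a small Python handler as a Cognite Function with the Python SDK and calls it. The function name, external ID, and input data are placeholders.

# A minimal sketch: deploying and calling a Cognite Function.
from cognite.client import CogniteClient

client = CogniteClient()  # assumes a configured client

def handle(client, data):
    # Runs inside CDF with an authenticated client; 'data' is the call payload.
    count = len(client.assets.list(limit=data.get("limit", 10)))
    return {"asset_count": count}

function = client.functions.create(
    name="asset-counter",
    external_id="asset-counter",
    function_handle=handle,
)

# Deployment takes a few minutes; call the function once its status is Ready.
call = function.call(data={"limit": 100})
print(call.get_response())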