About Cognite Data Fusion (CDF)

Cognite is a SaaS provider, and Cognite Data Fusion (CDF) is our industrial DataOps platform product.

Cognite Data Fusion (CDF) streams your industrial data into a data model. You can add connections between different types of data, both automatically and manually, and store the information in a knowledge graph in the cloud. With your data in the cloud, you can use the CDF services and tools to build solutions and applications that meet your business needs.

With Cognite, you own your data. We use it only to provide agreed-upon services, handle it securely, and comply with privacy and legal regulations. If you leave our services, we ensure you continue to own your data.

Architecture

Cognite Data Fusion (CDF) runs in the cloud and has a modular design.

The CDF architecture

You can interact with your data through dedicated workspaces in the CDF web application, or with our APIs and SDKs.

The following sections introduce the main steps to implementing CDF and how they relate to the different CDF modules.

Step 1: Set up data management

When making decisions, it's important to know that the data is reliable and trustworthy.

Before integrating and contextualizing data in CDF, you must define and implement your data governance policies. We recommend appointing a CDF admin to work with the IT department to ensure that CDF follows your organization's security practices. Connect CDF to your identity provider (IdP), and use the existing user identities to manage access to the CDF tools and data.
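Conceptually, IdP-based access control means CDF grants capabilities based on the groups a user already belongs to in your identity provider. The sketch below illustrates that idea in plain Python; the group names and capability strings are hypothetical examples, not real CDF configuration.

```python
# Illustrative sketch only: access follows the user's IdP group membership.
# Group names and capability strings below are hypothetical, not real CDF values.

# Hypothetical mapping from IdP group to the capabilities it grants.
GROUP_CAPABILITIES = {
    "idp-data-engineers": {"raw:read", "raw:write", "transformations:run"},
    "idp-analysts": {"timeseries:read", "assets:read"},
}

def allowed(user_groups, capability):
    """Check whether any of the user's IdP groups grants a capability."""
    return any(capability in GROUP_CAPABILITIES.get(g, set()) for g in user_groups)

print(allowed(["idp-analysts"], "assets:read"))  # True
print(allowed(["idp-analysts"], "raw:write"))    # False
```

Because access derives from existing IdP identities, deprovisioning a user in the IdP revokes their CDF access as well.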

To build applications on top of the data in CDF, you depend on a well-defined data model to make assumptions about the data structure. CDF has out-of-the-box data models to build a structured, flexible, contextualized knowledge graph.
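To make the knowledge-graph idea concrete, here is a minimal sketch of nodes and typed relationships between them. The types and field names are illustrative only, not the actual CDF data-model API.

```python
from dataclasses import dataclass, field

# Conceptual sketch of a contextualized knowledge graph.
# Types and names are illustrative, not the CDF data-model API.

@dataclass
class Node:
    external_id: str
    type: str                      # e.g. "asset", "time_series", "document"
    properties: dict = field(default_factory=dict)

@dataclass
class Edge:
    source: str                    # external_id of the source node
    target: str                    # external_id of the target node
    relation: str                  # e.g. "measures", "documents"

pump = Node("pump-1001", "asset", {"name": "Pump 1001"})
sensor = Node("ts-press-1001", "time_series", {"unit": "bar"})
edges = [Edge(sensor.external_id, pump.external_id, "measures")]

# Traverse the graph: which time series measure this asset?
measuring = [e.source for e in edges
             if e.target == "pump-1001" and e.relation == "measures"]
print(measuring)  # ['ts-press-1001']
```

An application can rely on this structure (every sensor relates to an asset) instead of re-deriving relationships from raw source data.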

Step 2: Integrate data

With data governance policies in place, you can add data from your IT, OT, and ET sources into CDF. These data sources include industrial control systems supplying sensor data, ERP systems, and massive 3D CAD models in engineering systems.

Extract data

With read access to the data sources, you can set up the system integration to stream data into the CDF staging area, where it can be normalized and enriched. We support standard protocols and interfaces like PostgreSQL and OPC-UA to facilitate data integration with your existing ETL tools and data warehouse solutions.
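The key property of the staging area is that rows land unchanged, in the source system's original shape. The sketch below models staging as an in-memory dictionary to show the flow; it is plain illustrative Python, not the extractor or SDK API.

```python
# Minimal sketch of the extract step, with the staging area modeled as
# (database, table) -> rows. Illustrative only, not the CDF staging API.

staging = {}  # {(database, table): {row_key: columns}}

def ingest_rows(database, table, rows):
    """Land rows in staging unchanged; normalization happens later, in the cloud."""
    staging.setdefault((database, table), {}).update(rows)

# Rows arrive in the source system's original shape, quirks and all.
ingest_rows("scada", "sensor_readings", {
    "r1": {"TagName": "PT-1001", "Value": "4.2", "Unit": "barg"},
    "r2": {"TagName": "TT-1002", "Value": "77.5", "Unit": "degF"},
})
print(len(staging[("scada", "sensor_readings")]))  # 2
```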

We provide extractors tailored to specific source systems, and we support standard ETL tools that work with most databases. This approach lets us minimize the logic in the extractors and instead run and re-run transformations on the data in the cloud.

Transform data

The data is stored in its original format in the CDF staging area. You can run and re-run transformations on your data in the cloud and reshape it to fit the CDF data model.

Decoupling the extraction and transformation steps makes it easier to maintain the data pipelines and reduces the load on the source systems. We recommend transforming the data using your existing ETL tools. We also offer the CDF Transformation tool as an alternative for lightweight transformation jobs.
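The sketch below shows what a lightweight, re-runnable transformation looks like: staged rows are renamed and unit-normalized to fit a target shape, while the source rows in staging stay untouched. Field names and units are illustrative; real CDF Transformations are typically expressed as SQL, which this Python only mimics conceptually.

```python
# Sketch of a lightweight transformation step reshaping staged rows into a
# target model. Illustrative field names and units, not a real CDF job.

def transform(row):
    """Normalize one staged sensor row: rename fields, convert units."""
    value = float(row["Value"])
    if row["Unit"] == "degF":
        value = (value - 32) * 5 / 9   # convert Fahrenheit to Celsius
        unit = "degC"
    else:
        unit = row["Unit"]
    return {"external_id": row["TagName"].lower(),
            "value": round(value, 2),
            "unit": unit}

staged = [
    {"TagName": "PT-1001", "Value": "4.2", "Unit": "barg"},
    {"TagName": "TT-1002", "Value": "77.5", "Unit": "degF"},
]
# Re-runnable: the staged source rows are never modified, so the whole
# transformation can be rerun whenever the logic changes.
print([transform(r) for r in staged])
```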

Enhance data

The automatic and interactive contextualization tools in CDF let you combine artificial intelligence, machine learning, a powerful rules engine, and domain expertise to map resources from different source systems to each other in the CDF data model. Start by contextualizing your data with artificial intelligence, machine learning, and rules engines, then let domain experts validate and fine-tune the results.
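A simple rule in this spirit is identifier normalization: strip separators and casing so that the same equipment tag written differently in two source systems still matches. The sketch below proposes matches with a confidence score for an expert to validate; it is an illustrative matcher, not the actual contextualization API.

```python
import re

# Illustrative rule-based matcher in the spirit of contextualization:
# normalize identifiers from two source systems and propose matches with
# a confidence score for a domain expert to validate. Not the real API.

def normalize(tag):
    """Strip separators and casing so 'PT_1001' and 'pt-1001' compare equal."""
    return re.sub(r"[^a-z0-9]", "", tag.lower())

def propose_matches(time_series, assets):
    matches = []
    for ts in time_series:
        for asset in assets:
            if normalize(ts) == normalize(asset):
                matches.append({"time_series": ts, "asset": asset,
                                "confidence": 1.0})
    return matches

proposed = propose_matches(["PT_1001", "TT-1002"], ["pt-1001", "PV-2001"])
print(proposed)
# A domain expert would now approve or reject each proposed match.
```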

Step 3: Consume data and build solutions

With complete and contextualized data in your industrial knowledge graph, you can use the built-in industrial tools and build powerful apps and AI agents to meet your business needs.

All the information stored in CDF is available through our REST-based API. Cognite also provides connectors and SDKs for common programming languages and analytics tools, like Python, JavaScript, Spark, OData (Excel, Power BI), and Grafana. We also offer community SDKs for Java, Scala, Rust, and .NET.
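As a rough sketch of what a REST call against CDF looks like, the snippet below only constructs the request. The URL pattern and header follow Cognite's public API as I recall it; treat the exact cluster name, path, and resource as assumptions and verify them against the API reference.

```python
# Sketch of constructing (not sending) a REST request to the CDF API.
# The URL pattern and resource name are assumptions; check the API docs.

def build_request(cluster, project, resource, token):
    url = f"https://{cluster}.cognitedata.com/api/v1/projects/{project}/{resource}"
    headers = {"Authorization": f"Bearer {token}",
               "Content-Type": "application/json"}
    return url, headers

url, headers = build_request("api", "my-project", "timeseries", "<token>")
print(url)  # https://api.cognitedata.com/api/v1/projects/my-project/timeseries
```

In practice you would let an SDK handle authentication, retries, and pagination rather than hand-building requests like this.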

The Functions service provides a scalable, secure, and automated way to host and run Python code.
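A hosted function is ordinary Python with a conventional entry point. The sketch below uses an entry point named `handle(client, data)`, which is my recollection of the documented convention; verify the exact signature against the Functions documentation. The payload fields are hypothetical.

```python
# Sketch of a hosted-function entry point. The `handle(client, data)`
# signature is an assumption from memory; the payload below is hypothetical.

def handle(client, data):
    """Entry point the Functions service calls; `data` is the call payload."""
    threshold = data.get("threshold", 100)
    readings = data.get("readings", [])
    # In a real function, `client` would be an authenticated CDF client
    # used to read inputs from and write results back to CDF.
    return {"alerts": [r for r in readings if r > threshold]}

# Local invocation with a stub client, useful for testing before deploying.
print(handle(None, {"threshold": 50, "readings": [42, 87, 13]}))  # {'alerts': [87]}
```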