An overview of Fusion's components and system architecture.
Physical Architecture
Fusion consists of the following components:
-
Fusion Server: A Java EE 6 web application certified with Apache Tomcat.
-
Metadata repository: A database that contains metadata about all applications.
Unified Data Model (UDM)
Fusion maps all data sources to the Unified Data Model (UDM) to give applications a uniform method of accessing data, no matter which source is being accessed. Every source is represented as a set of named spaces, and each space consists of tables with an associated format specification.
The Fusion UDM consists of the following components:
-
Table: Objects stored in data sources are represented as tables. Examples of objects that can be represented as tables include CSV files on Amazon S3, Salesforce objects (such as Account and User), and tables in a relational database. A table is a named set of records.
-
Table schema: An ordered list of columns that defines the structure of a table and all its records.
-
Record: A table's row -- an ordered list of the values of the columns in that row.
-
Space: A named collection of tables. For example, a set of schemas in a PostgreSQL database is represented as a space in the Fusion UDM.
-
Source: A named collection of spaces. See "Source" in Fusion's Basic Components.
-
Format specification: The format of a table or file, such as ZIP, CSV, or XML. The Load operation takes a format specification as a parameter that describes the format of a file in a file-based source.
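The containment hierarchy described above (a source holds spaces, a space holds tables, a table holds records that follow its schema) can be sketched as a small data model. This is an illustrative sketch only, not Fusion's implementation: the class names, field names, and the table() lookup helper are assumptions chosen to mirror the UDM terms.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Table:
    """A named set of records that share one schema (an ordered list of columns)."""
    name: str
    schema: List[str]
    records: List[Tuple] = field(default_factory=list)
    format_spec: Optional[str] = None  # hypothetical field, e.g. "CSV" for a file-based source

    def add_record(self, *values):
        # A record is an ordered list of values, one per schema column.
        if len(values) != len(self.schema):
            raise ValueError("record length must match the table schema")
        self.records.append(tuple(values))


@dataclass
class Space:
    """A named collection of tables."""
    name: str
    tables: Dict[str, Table] = field(default_factory=dict)


@dataclass
class Source:
    """A named collection of spaces."""
    name: str
    spaces: Dict[str, Space] = field(default_factory=dict)

    def table(self, space_name, table_name):
        # The lookup path is the same whatever storage system backs the source.
        return self.spaces[space_name].tables[table_name]


# Example: one table reachable through source name, space name, and table name.
accounts = Table("Account", ["Id", "Name"])
accounts.add_record("001", "Acme")
crm = Source("crm", {"default": Space("default", {"Account": accounts})})
```

In this sketch a caller always addresses data as source, space, then table, which is the uniformity the UDM is meant to provide.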
Fusion's Basic Components
Fusion's basic components are:
-
Sources: Sources (also referred to as data sources) are data storage systems external to Fusion that Fusion accesses and manages. Examples of data sources include PostgreSQL and MS SQL databases and Azure Blob storage. All sources are mapped to the UDM, which gives Fusion applications a uniform interface no matter which source they access.
Sources are defined according to their type (such as Oracle, PostgreSQL, or Hadoop), not by what you use them for (such as Staging or Production). Each source type has a list of type-specific parameters.
-
Specifications: The rules for various data integration tasks. For example, the Matching specification type contains rules to find and consolidate duplicate records in a table. Each specification has a name, a type, and a list of type-specific parameters.
-
Scenarios: A scenario is a procedure that consists of a sequence of operation calls. Scenarios are the basic mechanisms used to perform operations on data sources. Scenarios can be used to implement many different types of applications, such as ETL and Data Quality processes.
-
Operations: Operations implement basic data integration tasks. For example, the LoadTableFromSource operation loads any table from any data source into another data source. Transform is a universal operation used to implement various types of data transformation. Operations are called from scenarios. Typically, their parameters include a source name, space name, table name, target source name, target space name, and target table name, plus other operation-specific parameters.
Each specification type usually has a corresponding operation that runs the specification. Such operations take the name of a specification as a parameter. If the specification includes the source and target information, the operation does not require source and target parameters. The operation's other parameters may allow you to run part of the specification or to run it in a specific mode. For example, MatchOneSource is an operation that runs a Matching specification. It has parameters that allow you to direct it to apply only matching rules and no consolidation rules, which can be useful during development.
-
Scripts: Scripts contain queries or scripts written in a source-specific language, for example, a SQL query to be used on a PostgreSQL database. Each script definition consists of a name, a data source name, and the script content. Some operations take scripts as parameters. For example, the RunSourceScript operation runs the script specified in its parameters.
-
Applications: An application is a set of functional capabilities based on a set of sources, specifications, scenarios and scripts. These functional capabilities can be triggered from:
-
An EnterWorks widget, such as the Match and Merge widget.
-
An EnterWorks scheduled job. For example, Scheduled Imports can be configured to invoke the pre-processing class EnterWorksDiff.
An application's components are not available in other applications; however, they may be cloned to another application.
-
Objects: Objects are any of Fusion's basic components: sources, specifications, scenarios, operations, scripts, and applications. An object is identified by its name and its type. Objects can contain sub-objects. For example, an application's sub-objects include sources and scenarios.
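The relationship between scenarios and operations described above (a scenario is an ordered sequence of operation calls, and each operation takes source, space, and table names plus operation-specific parameters) can be illustrated with a toy runner. This is a sketch under stated assumptions, not Fusion's API: the operation names load_table and dedupe_by_column, the OPERATIONS registry, the run_scenario function, and the in-memory sources dictionary are all hypothetical and only loosely echo the kinds of tasks the text attributes to loading and matching operations.

```python
# In-memory stand-in for data sources: source -> space -> table -> list of records.
sources = {
    "staging": {"public": {"customers": [("1", "Acme"), ("2", "Acme"), ("3", "Globex")]}},
    "warehouse": {"public": {}},
}


def load_table(src, space, table, target_src, target_space, target_table):
    """Hypothetical operation: copy a table from one source into another."""
    sources[target_src][target_space][target_table] = list(sources[src][space][table])


def dedupe_by_column(src, space, table, column_index):
    """Hypothetical operation: keep the first record seen for each value in one column."""
    seen, kept = set(), []
    for record in sources[src][space][table]:
        if record[column_index] not in seen:
            seen.add(record[column_index])
            kept.append(record)
    sources[src][space][table] = kept


# Operations are looked up by name, so a scenario can refer to them as plain strings.
OPERATIONS = {"load_table": load_table, "dedupe_by_column": dedupe_by_column}

# A scenario: an ordered sequence of operation calls with their parameters.
scenario = [
    ("load_table", dict(src="staging", space="public", table="customers",
                        target_src="warehouse", target_space="public",
                        target_table="customers")),
    ("dedupe_by_column", dict(src="warehouse", space="public",
                              table="customers", column_index=1)),
]


def run_scenario(steps):
    # Execute each operation call in order, passing its parameters by name.
    for op_name, params in steps:
        OPERATIONS[op_name](**params)


run_scenario(scenario)
```

After the run, the warehouse copy holds one record per distinct name while the staging table is left unchanged, which shows why the order of operation calls in a scenario matters.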