Uses third-party JDBC drivers to connect to and query the metadata of the database.
JDBC_RESULT_QUERY_FORMAT
option in the DbUrl or DbOptions property. Note that this is a breaking change if you have configured this node in Data360 Analyze prior to the 3.16.0 release. If you have existing nodes configured to connect to Snowflake using our shipped driver, where the DbUrl property has been set explicitly or if you have not set the
DbType property, you must add the JDBC_RESULT_QUERY_FORMAT
option to prevent the node from failing. You can use this node to extract SQL code from a database which you can then feed into the DeepSQL node for lineage analysis. To configure the Database Metadata node:
- Configure the required properties to connect to your database. If you are connecting to a database that uses a driver which is pre-installed with Data360 Analyze, follow the Simple database connection steps. Or, if you are connecting to a database that uses a driver which is not shipped with Data360 Analyze, see Advanced database connection.Tip: For a list of the drivers which are included with your installation, see Acquiring data from a database.
- If you are working with a large database, it is recommended that you filter your queries, see Filtering your queries.
The Database Metadata node has the following outputs:
- Metadata - Provides metadata about the data source that you are connecting to.
- Catalogs - Provides a list of catalogs in the database.
- Schemas - Provides a list of schemas in the database.
- Tables - Provides a list of tables and views in the database.
- Table Columns - Provides a list of tables, views and columns in the database.
- Procedures - Provides a list of procedures in the database.
- Procedure Parameters - Provides a list of procedure parameters in the database.
- Functions - Provides a list of functions in the database.
- Function Parameters - Provides a list of function parameters in the database.
- Triggers - Provides a list of triggers in the database.
-
Object Definitions - Provides a list of SQL statements which you can use as input to the DeepSQL node. These SQL statements define the objects of the database that you are connected to. The Notes field on this output pin provides the following information:
- 0 - Body is valid
- 1 - Body is invalid (more than 4000 bytes)
- 2 - Body is unavailable (for example, it is not supported)
- 3 - Body is encrypted
Example
You are using the Database Metadata node to connect to and query a SQL Server database.
Click the Object Definitions output pin to expand the data viewer:
Each record shows a database object and the corresponding SQL statements that are needed to create the object. In the following example, the ObjectCode field contains the required SQL statements to create a view, a procedure, a trigger and a table from the SQL Server database:
The Object Definitions output pin is connected to a DeepSQL node, which analyzes the SQL statements to produce a set of nodes and edges to enable data lineage.
See DeepSQL for an example of working with the DeepSQL node.
Filtering your queries
- When working with a large database, initially, it is recommended that you set the ShowDBInfoOnly property to True. This will allow you to obtain database metadata and schema information.
- Click the Schemas output pin to view the schema information in the data viewer. Identify the schema that you want to analyze.
- Specify the schema in the SchemaFilter property to filter the information that is retrieved when you next run the node.
%
character, and in this case a stem match will be performed. For the CatalogFilter property, the only option is to type the name of a catalog and an exact match is performed. To filter by more than one item, specify the filters in a comma separated list. For example, to retrieve only the 'INVOICES' and 'CUSTOMERS' tables, set the TableViewFilter property to INVOICES,CUSTOMERS
.Database-specific information
The information that is output by the Database Metadata node varies depending on the database that you are connecting to. For example, Oracle does not support catalogs, so the Tables output pin will display NULL
for each record in the Catalog column.
The following table outlines which database components are supported by some of the most common database connections:
Y = Yes and N = No
Database-supported components | |||||||
---|---|---|---|---|---|---|---|
Database | Schemas |
Catalogs |
Tables | Views | Functions | Procedures |
Triggers |
DB2 | Y | N | Y | Y | Y | Y | Y |
Hana | Y | N | Y | Y | Y | Y | Y |
Hive | Y | N | Y | Y | N | N | N |
MS Access | N | N | Y | N | Y | Y | N |
MySQL / MariaDB | N | Y | Y | Y | Y | Y | Y |
Oracle | Y | N | Y | Y | Y | Y | Y |
PostgreSQL | Y |
Y |
Y | Y | Y | Y | Y |
RedShift | Y |
Y |
Y | Y | Y | Y | N |
Snowflake | Y |
Y |
Y | Y | Y | Y | N |
SQL Server | Y |
Y |
Y | Y | Y | Y | Y |
Sybase | Y |
Y |
Y | Y | Y | Y | Y |
Teradata | Y | N | Y | Y | Y | Y | Y |
DB2
Notes:
- Supports z/OS. If you are using z/OS, you will require an additional license file to connect which is typically named:
db2jcc_license_cisuz.jar
Place the license file in the following directory:
<site directory>/lib/java/db/db2
For example:
C:/users/<username>/AppData/Local/Data360Analyze/site-7731/lib/java/db/db2
Then, configure the DbDriverClasspath property to reference this, as follows:
{{%ls.brain.javaLibDir%}}/db/db2
{{%ls.appDataDir%}}/lib/java/db/db2/db2jcc_license_cisuz.jar
- Modules are not supported in Data360 Analyze or Data360 Govern. This means that, for example, if there are two functions with the same name in two different modules it would not be possible to differentiate the two unless the module was part of the path.
Permissions:
- No additional permissions are required.
System schemas for non-z/OS systems:
- NULLID, SQLJ, SYSCAT, SYSFUN, SYSIBM, SYSIBMADM, SYSIBMINTERNAL, SYSIBMTS, SYSPROC, SYSPUBLIC, SYSSTAT, SYSTOOLS
System schemas for z/OS systems:
- ADB, CLARKG, DSN8110, DSN81110, DSN811SA, DSN8BQRY, DSNRGCOL, IBMUSER, Q, RUNSTATS, SYSIBM, SYSIBMTS, SYSTOOL
Hana
Notes:
- Does not support macros or parameterized views.
Permissions, method one:
- Create a role by running the following command, where
<rolename>
is replaced with the name of the role:create role <rolename>
- Grant the role to the required user by running the following command, where
<rolename>
is replaced with the name of the role and<user>
is replaced with the name of the user:grant <rolename> to <user>
- Grant permission to the role by running the following command, where
<rolename>
is replaced with the name of the role:grant catalog read to <rolename>
Permissions, method two:
- Grant a special internal role to the user by running the following command where
<user>
is replaced with the name of the user:grant SAP_INTERNAL_HANA_SUPPORT to <user>
System schemas:
- HANA_XS_BASE, SAPDBCTRL, SAP_REST_API, SAP_XS_LM, SAP_XS_LM_PE, SAP_XS_LM_PE_TMP, SAP_XS_USAGE, SYS, _SYS_AFL, _SYS_AUDIT, _SYS_BI, _SYS_BIC, _SYS_DATA_ANONYMIZATION, _SYS_EPM, _SYS_EPM_DATA, _SYS_LDB, _SYS_PLAN_STABILITY, _SYS_REPO, _SYS_RT, _SYS_SECURITY, _SYS_SQL_ANALYZER, _SYS_STATISTICS, _SYS_TASK, _SYS_TELEMETRY, _SYS_WORKLOAD_REPLAY, _SYS_XS
Hive
Notes:
- Catologs display
" "
instead ofNULL
.
Permissions:
- No additional permissions are required.
System schemas:
- information_schema, sys
MS Access
Notes:
- The Object Definitions pin only includes tables.
MySQL / MariaDB
Notes:
- MariaDB is an open source version of MySQL.
Permissions:
- Run the following two commands for each database where metadata is needed, replacing
<database>
and<user>
with the name of the database and the name of the user:grant select, execute, trigger, show view on <database>.* to '<user>'@'%'
grant select on mysql.proc to '<user>'@'%'
MariaDBsystem catalogs:
- information_schema, mysql, performance_schema
MySQL system catalogs:
- information_schema, mysql, performance_schema, sys
Oracle
Notes:
- Packages are not supported, that is, the node will not output any object code defined in a package. Materialized / parameterized views are not supported.
- The Procedures output pin shows both functions and procedures. The Functions output pin also shows functions.
- The "select any dictionary" system privilege may be required when connecting to Oracle.
- Oracle has the ability to define objects that are invalid. These objects are not included in the Object Definitions pin.
- Oracle has the ability to specify whether triggers are enabled or disabled. Disabled triggers are not included on the Object Definitions pin.
- Triggers that are greater than 4000 characters will have a null value for the ObjectCode field and a value of 1 in the Notes field.
Permissions:
- Use the following commands where metadata is needed, replacing
<user>
with the name of the user:grant select any table to <user>
grant execute any procedure to <user>
grant select_catalog_role to <user>
System schemas:
- SYSTEM, XDB, CTXSYS, ANONYMOUS, DIP, SYS, DBSNMP, RDSADMIN, OUTLN
PostgreSQL
Notes:
- Catalogs associated with schemas are commonly
NULL
, implying the default catalog.
Permissions:
- Use the following command where metadata is needed, replacing
<user>
with the name of the user:grant trigger on all tables in schema public to <user>
System schemas:
- information_schema, pg_catalog
RedShift
Notes:
-
Functions and procedures are output to the procedures pin.
Permissions:
Run the following commands, where <user>
is replaced with the name of the user:
-
grant select on all tables in schema public to <user>
-
grant execute on all procedures in schema public to <user>
-
grant execute on all functions in schema public to <user>
System schemas:
- information_schema, pg_catalog
Snowflake
Option 1: Elevated permissions to obtain everything:
Permissions:
- Create a role.
- Assign the role to the user by running the following command, where
<role>
is replaced with the name of the role and<user>
is replaced with the name of the user:grant role <role> to user <user>
- Grant permissions to the role by running the following commands, replacing
<role>
,<user>
,<database>
and<schema>
with the relevant values:grant ownership on all procedures in database <database> to role <role>
grant ownership on all views in database <database> to role <role>
grant ownership on all functions in database <database> to role <role>
grant select on all tables in database <database> to role <role>
grant usage on warehouse COMPUTE_WH to role <role>
Note:COMPUTE_WH
is the default warehouse of the database. Change this value if necessary.grant usage on schema <schema> to role <role>
- Grant permission to all objects that will be created in the future by running the following commands, replacing
<role>
,<user>
,<database>
and<schema>
with the relevant values:grant ownership on future tables in database <database> to role <role>
grant ownership on future views in database <database> to role <role>
grant ownership on future procedures in database <database> to role <role>
grant ownership on future functions in database <database> to role <role>
grant usage on future schemas in database <database> to role <role>
Option 2: Restricted permissions to obtain everything except Object Definitions:
- Create a role.
- Assign the role to the user by running the following command, where
<role>
is replaced with the name of the role and<user>
is replaced with the name of the user:grant role <role> to user <user>
- Grant permissions to the role by running the following commands, replacing
<role>
,<user>, <warehouse>
and<catalog>
with the relevant values:grant role <role> to user <user>
use <catalog>
grant usage on warehouse <warehouse> to role <role>
GRANT USAGE ON DATABASE <catalog> TO ROLE <role>
GRANT USAGE ON Schema <catalog>.<schema>TO ROLE <role>
GRANT SELECT ON ALL tables IN DATABASE <catalog>TO ROLE <role>
GRANT SELECT ON ALL VIEWS IN DATABASE <catalog> TO ROLE <role>
GRANT USAGE ON ALL procedures IN DATABASE <catalog>TO ROLE <role>
GRANT USAGE ON ALL FUNCTIONS IN DATABASE <catalog>TO ROLE <role>
System catalogs:
- SNOWFLAKE_SAMPLE_DATA, SNOWFLAKE, UTIL_DB
System schemas:
- INFORMATION_SCHEMA
SQL Server
Notes:
- SQL Server has the ability to have encrypted functions, procedures and views. The table
sys.syscomments
contains a column calledencrypted
which has a value of 0 or 1, where 1 means that the object is encrypted. The ObjectCode field on the Object Definitions pin will populate the notes column with a value of 3 if the object is encrypted.
Permissions:
- Run one of the following two commands for each database where metadata is needed, replacing
<database>
with the name of the database and<user>
with the name of the user:grant view definition on database::[<database>] to [<user>]
Or
grant select, execute, alter on database::[<database>] to [<user>]
Note: The first option (grant view definition on database::[<database>] to [<user>]
) is more restrictive and therefore should be used in most circumstances. - To grant metadata access to all databases on the host, run both of these commands, replacing
<user>
with the name of the user:grant connect any database to [<user>]
grant view any definition to [<user>]
System catalogs:
- master, tempdb, model, msdb, rdsadmin
System schemas:
- db_accessadmin, db_backupoperator, db_datareader, db_datawriter, db_ddladmin, db_denydatareader, db_denydatawriter, db_owner, db_securityadmin, INFORMATION_SCHEMA, sys, guest
Sybase
Permissions:
- Run the following commands for each database where metadata is needed, replacing
<database>
with the name of the database and<user>
with the name of the user:use <database>
sp_adduser <user>
select 'grant select, execute on ' + name + ' to <user>' from sysobjects where type in ('U','P')
System catalogs:
- model, sybsystemprocs, sybsystemdb, tempdb, sybsecurity, sybmgmtdb
Teradata
Notes:
- Teradata does not support the retrieval of procedure bodies, so the Data360 Govern Database Connector will not show relationships based on procedures.
- Teradata reports users as schemas.
- Does not support macros.
-
It is recommended to use Teradata JDBC driver version 17.10.00.26 or higher.
Permissions:
- No additional permissions are required.
System schemas:
- DBC, dbcmngr, LockLogShredder, SQLJ, SYSJDBC, SYSBAR, SYSLIB, SYSSPATIAL, SYSUDTLIB, SYSUIF, Sys_Calendar, SysAdmin, SystemFe, TDMaps, TDQCD, TDStats, TD_SERVER_DB, TD_SYSFNLIB, TD_SYSXML, tdwm
Hints and tips
- The user that is connecting to the database must have the required access permissions.
- If you are having trouble connecting to a database, see Troubleshooting database connection errors.
- For additional information, see JDBC Query and Acquiring data from a database.
Properties
DbUser
Optionally specify the database username to connect to the database.
DbPassword
Optionally specify the password used to connect to the database.
DbType
Select a database type to connect to.
If you are connecting to a database that uses a driver which is not pre-installed with Data360 Analyze, then this property is not required. In this case, specify the connection details in the Advanced property group.
DbName
Specify the name of the database to connect to.
If you are connecting to a database that uses a driver which is not pre-installed with Data360 Analyze, then this property is not required. In this case, specify the connection details in the Advanced property group.
DbHost
Specify the database host to connect to e.g. localhost
. Leave this property blank if connecting to MS Access.
If you are connecting to a database that uses a driver which is not pre-installed with Data360 Analyze, then this property is not required. In this case, specify the connection details in the Advanced property group.
DbUrl
Optionally specify the driver's database URL.
If a database is selected from the DbType property and a DbName and DbHost are specified, then this property is not required.
Otherwise, specify the URL in the following format: jdbc:subprotocol:subname
. For example, jdbc:sqlserver://localhost:1433;databaseName=CustomerData
. Provided by the driver's vendor.
See the help topic Acquiring data from a database for more information on configuring this property.
DbDriver
Optionally specify the class name of your JDBC driver.
If a database is selected from the DbType property and a DbName and DbHost are specified, then this property is not required.
Otherwise, specify the class name of your JDBC driver, as provided by the driver's vendor.
See the help topic Acquiring data from a database for more information on configuring this property.
DbOptions
Optionally specify database connection properties e.g. key=val&key=val2
or key=value
entry per line.
DbDriverClasspath
Optionally specify the database driver classpath.
If a database is selected from the DbType property and a DbName and DbHost are specified, then this property is not required.
Otherwise, specify the classpath with each classpath entry on a new line.
Each classpath entry must be the location on the file system of either a single jar file, or a directory containing jars or class files.
ShowDBInfoOnly
Optionally specify whether to extract database metadata and Schema information only.
The default value is True.
CatalogFilter
Optionally specify a list (comma separated) of names to constrain the Catalogs retrieved.
%
character.SchemaFilter
Optionally specify a list (comma separated) of filters to constrain the Schemas retrieved. When using a filter, "%" means match any substring of 0 or more characters, and "_" means match any one character.
ExcludedCatalogs
Optionally specify a list (comma separated) of catalogs that will be ignored.
ExcludedSchemas
Optionally specify a list (comma separated) of schemas that will be ignored.
TableViewFilter
Optionally specify a list (comma separated) of filters to constrain the Tables/Views retrieved. When using a filter, "%" means match any substring of 0 or more characters, and "_" means match any one character.
ProcedureFilter
Optionally specify a list (comma separated) of filters to constrain the Procedures retrieved. When using a filter, "%" means match any substring of 0 or more characters, and "_" means match any one character.
FunctionFilter
Optionally specify a list (comma separated) of filters to constrain the Functions retrieved. When using a filter, "%" means match any substring of 0 or more characters, and "_" means match any one character.
TriggerFilter
Optionally specify a list (comma separated) of filters to constrain the Triggers retrieved. When using a filter, "%" means match any substring of 0 or more characters, and "_" means match any one character.
ExcludeSystemCatalogs
Optionally specify whether to exclude the system predefined Catalogs or not.
The default value is True.
ExcludeSystemSchemas
Optionally specify whether to exclude the system predefined Schemas or not.
The default value is True.
ShowObjectDefinition
Optionally specify whether the code associated with Views, Procedures, Functions, and Triggers should be output.
The default value is False.
DefaultToUnicode
Optionally specify whether string columns should default to Unicode type.
If set to False, string columns default to string type. The default value is True.
UnicodeBinding
Optionally specify how Unicode strings should be passed to the JDBC driver. Choose from:
- CHARACTER - Bind the string using the mechanism for CHAR and VARCHAR types.
- NATIONAL CHARACTER - Bind the string using the mechanism for NCHAR and NVARCHAR types.
SqlDateType
Optionally select a datetype in which to map a SQL DATE. Choose from:
- date
- datetime
Inputs and outputs
Inputs: 1 optional.
Outputs: Metadata, Catalogs, Schemas, Tables, Table Columns, Procedures, Procedure Parameters, Functions, Function Parameters, Triggers, Object Definitions.