Azure Datalake Storage List - Data360_Analyze - 3.12

Data360 Analyze Server Help

Product: Data360 Analyze
Version: 3.12
Language: English
Portfolio: Verify
Product family: Data360
Copyright: 2023
First publish date: 2016

Lists files on an Azure Datalake Storage server.

Azure Datalake Storage nodes enable you to access data lakes on Azure storage, so that you can integrate them into your data flows.

To create a list of files in an Azure Datalake Storage location:

  1. Enter the RemotePath to the location you want to interrogate.

    If you want to list files from multiple Azure Datalake Storage locations, use the (from Field) variant of the RemotePath property to point to an input field that contains the Azure locations.

  2. Provide your Azure AccountName, together with one of the following, in the relevant fields:
    • The AccountKey.
    • The ClientID, ClientSecret and TenantID of a registered app.
Tip: For additional information on Azure Datalake Storage, see the Microsoft Azure online documentation.
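
For orientation, a comparable listing can be produced outside Data360 Analyze with Microsoft's azure-storage-file-datalake Python package. The following is a minimal sketch and does not describe how the node is implemented. The helper names account_url and list_paths are hypothetical, and the sketch assumes the public Azure cloud endpoint, account-key authentication, and that directories are excluded from the result.

```python
def account_url(account_name: str) -> str:
    """Build the Data Lake (DFS) endpoint for a storage account.

    Assumes the public Azure cloud; sovereign clouds use other domains.
    """
    return f"https://{account_name}.dfs.core.windows.net"


def list_paths(account_name, account_key, file_system, remote_path, recurse=False):
    """Return the names of the files found under remote_path.

    Requires the azure-storage-file-datalake package. It is imported here so
    that this module still loads where the package is not installed.
    """
    from azure.storage.filedatalake import DataLakeServiceClient

    service = DataLakeServiceClient(account_url(account_name), credential=account_key)
    fs_client = service.get_file_system_client(file_system)
    return [
        p.name
        for p in fs_client.get_paths(path=remote_path, recursive=recurse)
        if not p.is_directory  # assumption: only files are of interest
    ]
```

The arguments mirror the AccountName, AccountKey, FileSystem, RemotePath and Recurse properties described below. For the registered-app form of authentication, a ClientSecretCredential from the azure-identity package could be passed as the credential instead of the account key.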

As well as providing a simple list of files within your Azure Datalake Storage, the Azure Datalake Storage List node allows you to inspect the file contents using the MetadataMode property.

Properties

FileSystem

Specify the file system of the Azure Datalake Storage.

A value is required for this property.

RemotePath

Specify the path to the Azure Datalake Storage objects.

A value is required for this property.

Recurse

Optionally specify whether to recursively enumerate the files under RemotePath.

The default value is False.
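
The effect of Recurse can be illustrated on an in-memory list of paths. This is a sketch under the assumption that a non-recursive listing returns only the files directly under RemotePath; the function name list_files and the sample paths are hypothetical.

```python
from pathlib import PurePosixPath


def list_files(all_files, remote_path, recurse=False):
    """Select the files under remote_path from a flat list of full paths."""
    root = PurePosixPath(remote_path)
    selected = []
    for name in all_files:
        path = PurePosixPath(name)
        if root not in path.parents:
            continue  # not located under remote_path at all
        if recurse or path.parent == root:
            selected.append(name)
    return selected


files = ["raw/a.csv", "raw/2023/b.csv", "raw/2023/q1/c.parquet", "other/d.csv"]
print(list_files(files, "raw"))                # ['raw/a.csv']
print(list_files(files, "raw", recurse=True))  # ['raw/a.csv', 'raw/2023/b.csv', 'raw/2023/q1/c.parquet']
```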

AccountName

Specify the Azure Account Name.

A value is required for this property.

One of the following should be entered:

  • AccountKey

    The Azure Secret Key.

Or the combination of:

  • ClientID

    The Client ID for the registered app.

  • ClientSecret

    The Client Secret for the registered app.

  • TenantID

    The Tenant ID (directory) for the registered app.
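
The rule for the authentication properties can be expressed as a small check. This is a minimal sketch; the function name auth_mode and its return values are hypothetical. The documentation does not say which credentials take precedence when both forms are supplied, so the sketch arbitrarily prefers the account key.

```python
def auth_mode(account_name, account_key=None, client_id=None,
              client_secret=None, tenant_id=None):
    """Return which authentication form a set of property values satisfies."""
    if not account_name:
        raise ValueError("AccountName is required")
    if account_key:
        # Assumption: the account key wins if both forms are supplied.
        return "account_key"
    app_fields = {
        "ClientID": client_id,
        "ClientSecret": client_secret,
        "TenantID": tenant_id,
    }
    missing = [name for name, value in app_fields.items() if not value]
    if not missing:
        return "registered_app"
    raise ValueError(
        "Provide AccountKey, or all of ClientID, ClientSecret and TenantID; "
        "missing: " + ", ".join(missing)
    )
```

For example, auth_mode("acct", client_id="c") raises a ValueError that names ClientSecret and TenantID as missing.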

FailureBehavior

Optionally specify what to do when the request fails. Choose from:

  • Error - Report error and stop further processing.
  • Log - Log a warning message and skip the file.
  • Ignore - Skip the file.

The default value is Log.
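
The three options can be pictured as different reactions to the same exception. The following sketch is illustrative only and is not the node's implementation; list_with_failure_behavior and fake_lister are hypothetical names, and each failed request is treated as one skipped item.

```python
import logging

logger = logging.getLogger("adls_list_sketch")


def list_with_failure_behavior(paths, list_one, failure_behavior="Log"):
    """Call list_one(path) for each path, reacting to failures as configured."""
    results = []
    for path in paths:
        try:
            results.extend(list_one(path))
        except Exception as exc:
            if failure_behavior == "Error":
                raise  # report the error and stop further processing
            if failure_behavior == "Log":
                logger.warning("Skipping %s: %s", path, exc)
            elif failure_behavior != "Ignore":
                raise ValueError(f"Unknown FailureBehavior: {failure_behavior}") from exc
    return results


def fake_lister(path):
    """Stand-in for a remote request; fails for one specific path."""
    if path == "missing":
        raise FileNotFoundError(path)
    return [f"{path}/file.csv"]


print(list_with_failure_behavior(["a", "missing", "b"], fake_lister, "Ignore"))
```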

Enabled

Optionally specify whether the node is enabled or disabled.

You can either choose True or False, or reference another property (see the Using derived property values topic) which will be evaluated to a true or false value.

Disabled nodes are not executed, even if they are selected to run.

The default value is True.

Note: The Enabled property cannot reference (either directly or indirectly) any of the Run Properties of a data flow.

LogLevel

Optionally specify the level at which non-fatal messages are logged.

The lower the level, the more information will be recorded in the log file. Choose from:

  • 0 - Information
  • 1 - Low
  • 2 - Medium
  • 3 - High
  • 4 - Fatal

The default value is 2 (Medium), which can be changed in the ls_brain_node.prop configuration file, by modifying the property ls.brain.node.logLevel.
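
One way to read the levels is as a threshold: a message is recorded when its own level is at or above the configured LogLevel, which is why lower settings record more. The sketch below assumes that threshold interpretation, which this page implies but does not spell out; the names LEVELS and is_logged are hypothetical.

```python
LEVELS = {0: "Information", 1: "Low", 2: "Medium", 3: "High", 4: "Fatal"}


def is_logged(message_level, log_level=2):
    """Return True if a message at message_level is recorded at log_level.

    Assumption: messages at or above the configured level are recorded.
    """
    if message_level not in LEVELS or log_level not in LEVELS:
        raise ValueError("level must be an integer from 0 to 4")
    return message_level >= log_level


print(is_logged(1))     # False: Low is below the default of 2 (Medium)
print(is_logged(3))     # True: High is above the default
print(is_logged(0, 0))  # True: level 0 records everything
```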

MetadataMode

Optionally specify to what extent additional file information will be retrieved.

Choose from:

  • Basic - Standard file metadata
  • Identification - In addition to the standard metadata, outputs the media type of each file.
  • Structure - The node inspects the structure of supported files and outputs the field metadata to the "file metadata" output pin.
Note: Only .csv and .parquet files are currently supported for metadata inspection.

The default value is Basic.
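
To give a feel for what the Identification and Structure modes add, the sketch below derives a media type and a list of field names using only the Python standard library. How the node itself identifies files and reads their structure is not documented here, so both the extension-based guess and the header-row approach are assumptions; identify and csv_fields are hypothetical names.

```python
import csv
import io
import mimetypes


def identify(file_name):
    """Guess a media type from the file extension (None if unknown)."""
    media_type, _encoding = mimetypes.guess_type(file_name)
    return media_type


def csv_fields(text):
    """Return the field names from the header row of CSV text."""
    reader = csv.reader(io.StringIO(text))
    return next(reader, [])


sample = "id,name,amount\n1,Ann,9.5\n"
print(identify("sales.csv"))
print(csv_fields(sample))  # ['id', 'name', 'amount']
```

Reading .parquet structure would need a third-party library and is left out of this sketch.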

Inputs and outputs

Inputs: 1 optional.

Outputs: listed files, errors.