Azure Datalake Storage Get - Data360_Analyze - Latest

Data360 Analyze Server Help

Product type: Software
Portfolio: Verify
Product family: Data360
Product: Data360 Analyze
Version: Latest
Language: English
Product name: Data360 Analyze
Title: Data360 Analyze Server Help
Copyright: 2024
First publish date: 2016
Last updated: 2024-11-28
Published on: 2024-11-28T15:26:57.181000

Downloads files from an Azure storage server.

Azure Datalake Storage nodes give you access to data lakes on Azure storage, so that you can use that data in your data flows.

Note: The ADLS nodes have been updated to support Azure Data Lake Storage Gen 2. As part of this change, these nodes will no longer support Gen 1 storage accounts. The following announcement by Microsoft instructs users of Gen 1 to migrate to Gen 2: https://azure.microsoft.com/en-us/updates/action-required-switch-to-azure-data-lake-storage-gen2-by-29-february-2024/

Downloading objects from Azure Datalake Storage

  1. Drag an Azure Datalake Storage Get node onto the canvas.

  2. Enter the FileSystem and RemotePath where the objects that you want to download are located.

  3. Provide your Azure AccountName, together with one of the following, in the relevant fields:

    • The AccountKey.
    • The ClientID, together with the ClientSecret and the TenantID.
  4. In the Directory property, specify the directory to use as the root storage location for the retrieved objects.
  5. Complete other properties, as required.

Downloading selected objects from Azure Datalake Storage

  1. Drag an Azure Datalake Storage List node onto the canvas and connect the listed files output to an Azure Datalake Storage Get node.

  2. On the Azure Datalake Storage List node, enter the FileSystem and RemotePath where the objects that you want to download are located.

  3. Provide your Azure AccountName, together with one of the following, in the relevant fields:

    • The AccountKey.
    • The ClientID, together with the ClientSecret and the TenantID.
  4. Run the Azure Datalake Storage List node, to generate a list of files in the specified path.
  5. On the Azure Datalake Storage Get node, select the (from Field) variant of the RemotePath property, and specify the name of the input field that contains the listed files.
  6. Repeat Step 3 above, in the Azure Datalake Storage Get node.
  7. In the Directory property, specify where to store the downloaded files.
  8. Run the Azure Datalake Storage Get node, to download the files from the specified Azure storage.
Tip: For additional information on Azure Datalake Storage, see the Microsoft Azure online documentation.
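
To illustrate what the (from Field) variant of RemotePath does conceptually, the following Python sketch reads one remote path per input record from a named field. It is an illustration only, not the node's implementation: the function name remote_paths_from_field, the field name FileName, and the sample paths are all hypothetical.

```python
from typing import Iterable, Mapping


def remote_paths_from_field(
    records: Iterable[Mapping[str, str]], field_name: str
) -> list[str]:
    """Collect one remote path per input record from the named field."""
    paths = []
    for index, record in enumerate(records):
        if field_name not in record:
            # A missing field is treated as a configuration error in this sketch.
            raise KeyError(f"record {index} has no field named {field_name!r}")
        paths.append(record[field_name])
    return paths


# Hypothetical output of the List node: one record per listed file.
listed_files = [
    {"FileName": "landing/2024/orders.csv"},
    {"FileName": "landing/2024/customers.csv"},
]

print(remote_paths_from_field(listed_files, "FileName"))
```

Each value returned here corresponds to one object that the Get node would then be asked to download.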

Properties

FileSystem

Specify the file system of the Azure Datalake Storage. A value is required for this property.

RemotePath

Specify the path to the Azure Datalake Storage objects. A value is required for this property.

AccountName

Specify the Azure Account Name. A value is required for this property.

In addition, enter one of the following:

  • AccountKey: The Azure Secret Key.

Or the combination of:

  • ClientID: The Client ID for the registered app.

  • ClientSecret: The Client Secret for the registered app.

  • TenantID: The Tenant ID (directory) for the registered app.
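
The credential rule above (AccountName plus either an AccountKey or the ClientID, ClientSecret and TenantID combination) can be expressed as a small validation check. The Python sketch below is illustrative only; the function name choose_auth_mode and its return values are hypothetical. How the node treats the case where both options are supplied is not stated here, so the sketch assumes that case is rejected as ambiguous.

```python
from typing import Optional


def choose_auth_mode(
    account_name: str,
    account_key: Optional[str] = None,
    client_id: Optional[str] = None,
    client_secret: Optional[str] = None,
    tenant_id: Optional[str] = None,
) -> str:
    """Return which credential option a set of property values satisfies."""
    if not account_name:
        raise ValueError("AccountName is required")

    app_values = (client_id, client_secret, tenant_id)
    has_key = bool(account_key)
    has_full_app = all(app_values)
    has_partial_app = any(app_values) and not has_full_app

    if has_partial_app:
        raise ValueError("ClientID, ClientSecret and TenantID must be given together")
    if has_key and has_full_app:
        # Assumption of this sketch: supplying both options is ambiguous.
        raise ValueError("enter either AccountKey or the registered app values, not both")
    if has_key:
        return "account_key"
    if has_full_app:
        return "registered_app"
    raise ValueError("enter AccountKey, or ClientID with ClientSecret and TenantID")


print(choose_auth_mode("myaccount", account_key="secret"))
```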

Directory

Specify the directory to use as the root location for storing all retrieved objects.

A value is required for this property.

LocalPath

Optionally specify the local file in which to store the retrieved object.

The default value is the name of the object.

Choose the (from Field) variant of this property, to look up the value from an input field with the name specified.
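
As a rough illustration of how Directory and LocalPath could combine, the Python sketch below builds a destination path for one object. It rests on two assumptions that this page does not confirm: that LocalPath is resolved relative to Directory, and that the default "name of the object" is the remote path as given. The function name resolve_local_path and the sample paths are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional


def resolve_local_path(
    directory: str, object_name: str, local_path: Optional[str] = None
) -> PurePosixPath:
    """Build the destination path for one retrieved object."""
    # LocalPath, when set, replaces the default of the object's own name.
    relative = local_path if local_path else object_name
    # Strip leading slashes so the result always stays under the root directory.
    return PurePosixPath(directory) / relative.lstrip("/")


print(resolve_local_path("/data/adls", "landing/orders.csv"))
print(resolve_local_path("/data/adls", "landing/orders.csv", "orders_copy.csv"))
```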

Overwrite

Set to False to prevent overwriting an existing file that has the same name.

The default value is True.
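
The effect of the Overwrite setting can be sketched in Python as a guard around the file write. This is an illustration under assumptions, not the node's implementation: the function name write_download is hypothetical, and the sketch simply leaves the existing file untouched when Overwrite is False, since this page does not say what the node reports in that case.

```python
import tempfile
from pathlib import Path


def write_download(target: Path, data: bytes, overwrite: bool = True) -> bool:
    """Write data to target; return False if an existing file was kept."""
    if target.exists() and not overwrite:
        # Overwrite is False and a file with the same name exists: keep it.
        return False
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return True


with tempfile.TemporaryDirectory() as root:
    target = Path(root) / "orders.csv"
    first = write_download(target, b"v1")                    # new file is written
    second = write_download(target, b"v2", overwrite=False)  # existing file is kept
    kept = target.read_bytes()

print(first, second, kept)
```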

FailureBehavior

Optionally specify what to do when a file fails to download. Choose from:

  • Error - Report error and stop further processing.
  • Log - Log a warning message and skip the file.
  • Ignore - Skip the file.

The default value is Log.
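
The three FailureBehavior options can be pictured as different ways of handling an exception inside a download loop. The Python sketch below shows that control flow only; download_all, fake_download, and the logger name are hypothetical, and the real download step is replaced by a stand-in function.

```python
import logging
from typing import Callable, Iterable

logger = logging.getLogger("adls_get_sketch")


def download_all(
    paths: Iterable[str],
    download_one: Callable[[str], None],
    failure_behavior: str = "Log",
) -> list[str]:
    """Download each path, handling failures as the chosen option describes."""
    if failure_behavior not in ("Error", "Log", "Ignore"):
        raise ValueError(f"unknown FailureBehavior: {failure_behavior!r}")
    downloaded = []
    for path in paths:
        try:
            download_one(path)
        except Exception as exc:
            if failure_behavior == "Error":
                # Report the error and stop further processing.
                raise RuntimeError(f"download failed for {path}") from exc
            if failure_behavior == "Log":
                # Log a warning message and skip the file.
                logger.warning("skipping %s: %s", path, exc)
            # "Ignore" skips the file without logging.
            continue
        downloaded.append(path)
    return downloaded


def fake_download(path: str) -> None:
    """Stand-in for a real download; fails for one specific path."""
    if path == "missing.csv":
        raise OSError("object not found")


print(download_all(["a.csv", "missing.csv", "b.csv"], fake_download, "Ignore"))
```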

Enabled

Optionally specify whether the node is enabled or disabled.

You can either choose True or False, or reference another property (see the Using derived property values topic) that evaluates to a true or false value.

Disabled nodes are not executed, even if they are selected to run.

The default value is True.

Note: The Enabled property cannot reference (either directly or indirectly) any of the Run Properties of a data flow.

LogLevel

Optionally specify the level at which non-fatal messages are logged.

The lower the level, the more information will be recorded in the log file. Choose from:

  • 0 - Information
  • 1 - Low
  • 2 - Medium
  • 3 - High
  • 4 - Fatal

The default value is 2 (Medium), which can be changed in the ls_brain_node.prop configuration file, by modifying the property ls.brain.node.logLevel.
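
One way to read "the lower the level, the more information will be recorded" is as a threshold: a message is recorded when its own level is at or above the node's LogLevel. The Python sketch below encodes that reading. It is an assumption about the semantics rather than a description of the product's logging code, and the names LEVELS and is_logged are hypothetical.

```python
LEVELS = {0: "Information", 1: "Low", 2: "Medium", 3: "High", 4: "Fatal"}


def is_logged(message_level: int, node_log_level: int = 2) -> bool:
    """Return True if a message at message_level passes the node's threshold."""
    if message_level not in LEVELS or node_log_level not in LEVELS:
        raise ValueError("levels must be integers from 0 to 4")
    # Assumed threshold rule: lower LogLevel values let more messages through.
    return message_level >= node_log_level


# With the default LogLevel of 2 (Medium), list which message levels are kept.
for level, name in LEVELS.items():
    print(level, name, is_logged(level))
```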

Inputs and outputs

Inputs: 1.

Outputs: downloaded files, errors.