Adding Data from Flat File - discovery - 23.1

Spectrum Discovery Guide

First publish date: 2007
Last edition: 2024-02-07
Last publication: 2024-02-07T17:21:58.768552
These steps describe creating a match rule from records in flat files on your workstation or on the server.
  1. On the Discovery application page, click Prepare.
  2. Click the Create Rule button.
    This displays the Select Source page.
  3. Next to Select data source, click the Flat File option.
  4. Click the Select File button.
    This opens the Choose File dialog box.
  5. Click Server, and then navigate to the data file on the Spectrum server.
    A list of files at the selected location is displayed below the file path selection box.
  6. Select the file you need, and click OK.
    Note: You can enter the file name or part of the name in the Filter box to locate a specific file in the list.
  7. To modify settings for the file, select it, and make changes as described in this table.
    Settings Description

    Character encoding

    The text file's encoding. Select one of these:
    UTF-8
    Supports all Unicode characters and is backwards-compatible with ASCII. For more information about UTF, see unicode.org/faq/utf_bom.html.
    UTF-16
    Supports all Unicode characters but is not backwards-compatible with ASCII. For more information about UTF, see unicode.org/faq/utf_bom.html.
    US-ASCII
    A character encoding based on the order of the English alphabet.
    UTF-16BE
    UTF-16 encoding with big endian byte serialization (most significant byte first).
    UTF-16LE
    UTF-16 encoding with little endian byte serialization (least significant byte first).
    ISO-8859-1
    An ASCII-based character encoding typically used for Western European languages. Also known as Latin-1.
    ISO-8859-2
    An ASCII-based character encoding typically used for Eastern European languages. Also known as Latin-2.
    ISO-8859-3
    An ASCII-based character encoding typically used for Southern European languages. Also known as Latin-3.
    ISO-8859-9
    An ASCII-based character encoding typically used for the Turkish language. Also known as Latin-5.
    CP850
    An ASCII-based code page used to write Western European languages.
    CP500
    An EBCDIC code page used to write Western European languages.
    Shift_JIS
    A character encoding for the Japanese language.
    MS932
    Microsoft's extension of Shift_JIS that adds NEC special characters, the NEC selection of IBM extensions, and IBM extensions.
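
As a rough illustration of why the encoding choice matters, the sketch below shows how the same text becomes different bytes under different encodings. It is not part of Spectrum Discovery. It uses Python's built-in codecs, and the codec names (for example, `utf-16-be` or `cp500`) are Python's own aliases, assumed here to correspond to the options listed above.

```python
# Illustrative only: Python codec names are assumed to match the options above.
SAMPLE = "MIAMI"

def encoded_bytes(text: str, encoding: str) -> bytes:
    """Return the bytes that `text` occupies in the given encoding."""
    return text.encode(encoding)

# UTF-8 is backwards-compatible with ASCII: ASCII-only text yields identical bytes.
print(encoded_bytes(SAMPLE, "utf-8") == encoded_bytes(SAMPLE, "ascii"))

# UTF-16BE writes the most significant byte first, UTF-16LE the least significant.
print(encoded_bytes("M", "utf-16-be"))
print(encoded_bytes("M", "utf-16-le"))

# CP500 is an EBCDIC code page, so even plain letters differ from their ASCII bytes.
print(encoded_bytes("M", "cp500") != encoded_bytes("M", "ascii"))
```

If a file is read with the wrong encoding, the symptoms range from garbled characters to decoding errors, so it helps to confirm how the file was produced before choosing a value.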
    Field delimiter

    Specifies the character used to separate fields in a delimited file.

    For example, this record uses a pipe (|) as a field delimiter:

    7200 13TH ST|MIAMI|FL|33144

    The characters available as field delimiters are:

    • Comma
    • Semicolon
    • Pipe
    • Tab
    • Space
    • Period
    You can also add custom field delimiters. To add a custom field delimiter, follow these steps:
    1. Click the Add button next to Field delimiter. The Add Separator pop-up window is displayed.
    2. Enter the field delimiter in the Character field. The corresponding Unicode value is displayed automatically.
    3. Enter a suitable name for your delimiter in the Description field.
    4. Click Save. Your delimiter appears in the Field delimiter drop-down list.
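
The following sketch illustrates what a field delimiter does and how a Unicode value relates to a delimiter character. It is an illustration outside the product. The function names `split_fields` and `code_point_label` are hypothetical, and the `U+XXXX` label format is an assumption about how a code point might be displayed, not a description of the Add Separator window.

```python
def split_fields(record: str, delimiter: str) -> list:
    """Split one record (without text qualifiers) into fields on the delimiter."""
    return record.split(delimiter)

def code_point_label(delimiter: str) -> str:
    """Format a single character as a U+XXXX label (display format assumed)."""
    if len(delimiter) != 1:
        raise ValueError("expected exactly one character")
    return f"U+{ord(delimiter):04X}"

# The pipe-delimited record from the example above.
print(split_fields("7200 13TH ST|MIAMI|FL|33144", "|"))

# Code points for a built-in delimiter and a possible custom one.
print(code_point_label("|"))
print(code_point_label("~"))
```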
    Text qualifier

    The character used to surround text values in a delimited file.

    For example, this record uses double quotes (") as a text qualifier.

    "7200 13TH ST"|"MIAMI"|"FL"|"33144"

    The characters available to define as text qualifiers are:

    • Single Quotes (')
    • Double Quotes (")
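
A text qualifier matters most when a value contains the delimiter character itself. The sketch below uses Python's standard `csv` module as a stand-in parser to show the effect. Spectrum's own parsing may differ in detail, and the `parse_record` helper and the `UNIT A|B` value are hypothetical.

```python
import csv
import io

def parse_record(line: str, delimiter: str = "|", qualifier: str = '"') -> list:
    """Parse one delimited record, honoring the text qualifier."""
    reader = csv.reader(io.StringIO(line), delimiter=delimiter, quotechar=qualifier)
    return next(reader)

# The qualified record from the example above.
print(parse_record('"7200 13TH ST"|"MIAMI"|"FL"|"33144"'))

# A qualified value can contain the delimiter without being split in two.
print(parse_record('"UNIT A|B"|"MIAMI"'))

# The same idea with single quotes as the text qualifier.
print(parse_record("'UNIT A|B'|'MIAMI'", qualifier="'"))
```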
    Line separator

    Specifies the character used to separate records in a sequential or delimited file.

    The line separator settings available are:

    Unix (U+000A)
    A line feed character separates the records. This is the standard record separator for Unix systems.
    Macintosh (U+000D)
    A carriage return character separates the records. This is the standard record separator for classic Macintosh systems (Mac OS 9 and earlier).
    Windows (U+000D U+000A)
    A carriage return followed by a line feed separates the records. This is the standard record separator for Windows systems.
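
To show how the three separator styles behave, here is a minimal sketch that splits raw text into records. It is an illustration only. The `SEPARATORS` mapping and the `split_records` helper are hypothetical names, and the sample text reuses addresses from this topic.

```python
# Separator styles and the characters they stand for.
SEPARATORS = {
    "Unix": "\n",        # U+000A
    "Macintosh": "\r",   # U+000D
    "Windows": "\r\n",   # U+000D U+000A
}

def split_records(data: str, style: str) -> list:
    """Split raw file text into records using the named separator style."""
    records = data.split(SEPARATORS[style])
    # Drop the empty string left behind by a trailing separator, if any.
    if records and records[-1] == "":
        records.pop()
    return records

windows_text = "7200 13TH ST|MIAMI|FL|33144\r\nOne Global View|Troy|NY|12180\r\n"

# The matching style yields clean records.
print(split_records(windows_text, "Windows"))

# A mismatched style leaves a stray carriage return at the end of each record.
print(split_records(windows_text, "Unix"))
```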
    First row is header row

    Specifies whether the first record in a delimited file contains header information. Yes indicates that it does.

    For example, this file snippet shows a header row in the first record.

    "AddressLine1"|"City"|"StateProvince"|"PostalCode"
    "7200 13TH ST"|"MIAMI"|"FL"|"33144"
    "One Global View"|"Troy"|"NY"|12180
  8. Click Save and Continue to save your changes.
    You are now ready to create your match rule.
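
To check a file before loading it, it can help to preview how the settings above combine. The sketch below is one way to do that outside the product, using Python's standard `csv` module. The `read_flat_file` helper, its parameter names, and the generated `Field1`, `Field2` names for files without a header row are all hypothetical, not Spectrum behavior.

```python
import csv
import io

def read_flat_file(text, delimiter="|", qualifier='"', first_row_is_header=True):
    """Parse delimited text into a list of dicts keyed by field name."""
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter=delimiter, quotechar=qualifier)
    rows = list(reader)
    if not rows:
        return []
    if first_row_is_header:
        header, data = rows[0], rows[1:]
    else:
        # No header row: invent placeholder names (an assumption for this sketch).
        header = [f"Field{i + 1}" for i in range(len(rows[0]))]
        data = rows
    return [dict(zip(header, row)) for row in data]

# Sample rows modeled on the header-row example in this topic.
SAMPLE = (
    '"AddressLine1"|"City"|"StateProvince"|"PostalCode"\n'
    '"7200 13TH ST"|"MIAMI"|"FL"|"33144"\n'
    '"One Global View"|"Troy"|"NY"|"12180"\n'
)

for record in read_flat_file(SAMPLE):
    print(record["City"], record["PostalCode"])
```

Setting `first_row_is_header=False` for a file that does have a header would turn the header into a data record, which is the kind of mismatch this preview is meant to catch.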