Loading Property Attributes Assessment Data

Property Attributes Assessment Getting Started Guide

Product type: Data
Portfolio: Enrich
Product family: Enrich Addresses > Property Features
Product: Property Attributes Assessment
Version: Latest
Language: English
Copyright: 2024
First publish date: 2020
Last updated: 2024-04-29
Published on: 2024-04-29T04:26:07.269000

  1. Create an EMR cluster with the following configuration:
    Table 1. Software Details
    Component   Version
    EMR         emr-6.0.0
    Hadoop      Amazon 3.2.1
    Hive        Hive 3.1.2
    Tez         Tez 0.9.2
    Table 2. Hardware Details
    Nodes       Machine Type   Machine Details
    Master      m4.xlarge      4 vCore, 16 GiB memory, EBS only storage, EBS Storage: 400 GiB
    Slaves x 2  m4.xlarge      4 vCore, 16 GiB memory, EBS only storage, EBS Storage: 400 GiB
  2. Download the Property_Attributes_Assessment_yyyymm.zip file to /mnt/data/.
  3. Uncompress the file using the following command:
    unzip Property_Attributes_Assessment_yyyymm.zip
  4. The data is extracted to the following file:
    /mnt/data/Property_Attributes_Assessment_yyyymm/Property_Attributes_Assessment_Data/property_attributes_assessment_usa.txt
  5. Start the Hive shell.
  6. Run the create table script in the Hive shell.
  7. Load the data into Hive using the following command:
    hive:>LOAD DATA LOCAL INPATH '/mnt/data/Property_Attributes_Assessment_yyyymm/Property_Attributes_Assessment_Data/property_attributes_assessment_usa.txt'
    OVERWRITE INTO TABLE asmt_d;
  8. Once the data is loaded, run the following query:
    hive:>select count(*) from asmt_d;
    Note: To verify the load, compare the number of records in the source file with the row count returned for the Hive table. The two counts should match.
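
The count comparison in step 8 could be scripted along the following lines. This is a minimal sketch, not part of the product. It assumes that the file holds one record per line with no header row, that every line ends with a newline, and that the hive command-line client is available on the node where the file was extracted. The function names count_records, counts_match, and verify_load are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: compare the line count of the source file with the Hive row count.

# Print the number of newline-terminated lines in the given file.
# Assumption: one record per line, no header row.
count_records() {
  wc -l < "$1" | tr -d '[:space:]'
}

# Succeed (exit status 0) when both arguments are the same integer.
counts_match() {
  [ "$1" -eq "$2" ]
}

# Compare the file's line count with the row count of table asmt_d.
# Requires a working Hive installation; it is defined here but not called.
verify_load() {
  local data_file="$1"
  local file_count hive_count
  file_count=$(count_records "$data_file")
  # -S runs Hive in silent mode; -e executes the quoted query.
  hive_count=$(hive -S -e 'select count(*) from asmt_d;' | tr -d '[:space:]')
  if counts_match "$file_count" "$hive_count"; then
    echo "Counts match: $file_count"
  else
    echo "Counts differ: file=$file_count hive=$hive_count" >&2
    return 1
  fi
}
```

To use it, call verify_load with the path of the extracted file from step 4. If the file has a header row or its last line lacks a trailing newline, the line count will differ from the record count by one and needs adjusting.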