When you add data sources to load and analyze your data in a Hadoop cluster, the task named Distributed Profiling displays in the Task column. The distributed profiling task indicates that distributed profiling activities such as data load analysis, data profiling analysis aggregation, and key & dependency analysis in your distributed environment are in progress or have completed.
If a distributed profiling task status shows that it failed to complete, the Reason column indicates that a problem was encountered at a certain point during profiling activities.
An expired Kerberos ticket in your distributed environment is one possible reason for the task to stop before completing. For more information about troubleshooting Kerberos tickets in your Hadoop environment, contact your system administrator.
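As a quick first check before contacting an administrator, the following minimal sketch tests whether the current user holds a usable Kerberos ticket. It assumes the MIT Kerberos `klist` command is installed and on the PATH, and relies on `klist -s`, which runs silently and exits with a nonzero status when the credentials cache is missing or expired. The function name `has_valid_ticket` is hypothetical and is not part of the product.

```python
import subprocess
from typing import Sequence


def has_valid_ticket(command: Sequence[str] = ("klist", "-s")) -> bool:
    """Return True if the ticket check command exits with status 0.

    Assumption: the default command is MIT Kerberos `klist -s`, which
    exits nonzero when the credentials cache is unreadable or expired.
    """
    try:
        result = subprocess.run(
            list(command),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except OSError:
        # The command is not installed or cannot be executed.
        return False
    return result.returncode == 0
```

Run the check as the same user account that submits the profiling jobs, because each user has a separate credentials cache.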
To see information about why the failure occurred, open the mtb_server.log on the repository server system. Entries in the log record the date, time, job ID value, and affected data for each activity.
For more information about troubleshooting distributed profiling errors, see the Troubleshooting chapter in the for Big Data Installation Guide.
To view information about distributed profiling errors
- On the server system where the repository server is installed, navigate to and open the mtb_server.log file. By default, it is located in the following location:
  - Windows: C:/ProgramData/Trillium Software/MBSW/software_version/Data/logs
  - Linux: server_path/metabase/logs
- Search for the log entry for the job ID associated with the distributed profiling task.
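For large log files, the search step can be scripted. The sketch below is a hedged illustration: the exact layout of mtb_server.log entries is not described here, so it assumes only that the job ID value appears somewhere in each relevant line as a standalone token. The function name `find_job_entries` is hypothetical.

```python
import re
from typing import Iterator, Tuple


def find_job_entries(log_path: str, job_id: str) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, line) for each log line that mentions job_id.

    Assumption: the job ID appears as a standalone token, so searching
    for "42" should not match "142" or "420".
    """
    pattern = re.compile(r"(?<!\w)" + re.escape(job_id) + r"(?!\w)")
    # errors="replace" keeps the scan going if the log has odd bytes.
    with open(log_path, "r", encoding="utf-8", errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if pattern.search(line):
                yield number, line.rstrip("\n")
```

Pass the full path to mtb_server.log for your platform and the job ID shown for the failed task, then loop over the results and print each line number and entry.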