This tool shows which features in the selected layer are most similar or most dissimilar, by attribute, to those in another layer.
To open the Identify Similar Features tool, navigate to:
Tools > Vector Processing > Statistical Analysis > Identify Similar Features
Once this tool is selected, a window appears in which you can select the data layers and features required to run the tool.
You must select the layer to be used for the training data. The values in this layer are used as the training data for the machine learning model that determines the similarity between unique characteristics.
Once the first layer is chosen, a second pop-up box opens in which you must select the target layer. This is the layer from which matches are obtained.
For the input layer, three options are available for selecting the area used as training data:
- Select By Query - Opens Query Builder. The tool then runs on the features returned by the query you create.
- Select By Graphics - Lets you manually select the features or the area of interest to which the operation is applied.
- All Data - Uses the entire layer as input.
For more information about these options and how to implement them, see Data Selection Methods.
After you select the layers, a window opens in which you set the parameters required for the operation.
Match Method - The algorithm used to measure similarity. The options are:
- Attributes Values (default option) - Based on the sum of squared differences of standardized attribute values for each target feature.
- Ranked Attribute Values - Based on the sum of squared differences of attribute ranks for each target feature.
- Attribute Profiles - Based on cosine similarity. It finds similar relationships between standardized attribute values rather than similar magnitudes.
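The three match methods can be illustrated with a minimal Python sketch. It is not the tool's implementation: the function names, the sample values for population, area, and number of households, and the choice to standardize over the reference feature and the candidates pooled together are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev

def zscores(rows):
    """Standardize each attribute column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    stats = [(mean(c), pstdev(c)) for c in cols]
    return [[(v - m) / s if s else 0.0 for v, (m, s) in zip(row, stats)]
            for row in rows]

def ranks(rows):
    """Replace each value by its 1-based rank within its attribute column.
    Ties are not averaged in this simplified version."""
    ranked_cols = []
    for col in zip(*rows):
        order = sorted(range(len(col)), key=lambda i: col[i])
        r = [0] * len(col)
        for position, i in enumerate(order, start=1):
            r[i] = position
        ranked_cols.append(r)
    return [list(row) for row in zip(*ranked_cols)]

def squared_difference(a, b):
    """Sum of squared differences; lower means more similar."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; higher means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical data: (population, area, number of households)
reference = (1200, 3.5, 400)
candidates = [(1150, 3.2, 390), (9000, 12.0, 3100), (2400, 7.0, 800)]
rows = [reference] + candidates

z = zscores(rows)
r = ranks(rows)
attribute_values = [squared_difference(z[0], zc) for zc in z[1:]]
ranked_values = [squared_difference(r[0], rc) for rc in r[1:]]
attribute_profiles = [cosine_similarity(z[0], zc) for zc in z[1:]]
```

Note that the first two methods produce a distance, so the smallest score is the best match, whereas Attribute Profiles produces a similarity, so the largest score is the best match.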
Ranks of Target Features - Specifies the total number of ranks by which the output is sorted.
Select Attribute - Specifies the attributes on which similarity is measured. In this example, the output aims to produce all the areas in the training layer that are like the target layer based on the values of population, area, and number of households.
Best Match/Dissimilar - Determines whether the output displays the most similar or the most dissimilar features.
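How the rank count and the Best Match/Dissimilar choice might interact can be sketched as follows. This assumes that Ranks of Target Features caps how many ranked features are returned and that each feature already has a dissimilarity score where lower means more similar. The function name `rank_features`, the feature identifiers, and the scores are hypothetical.

```python
def rank_features(scores, number_of_ranks, most_similar=True):
    """Return (rank, feature_id, score) tuples for the requested number of ranks.

    scores maps a feature id to a dissimilarity score (lower = more similar).
    With most_similar=False the order is reversed, so rank 1 is the least
    similar feature.
    """
    ordered = sorted(scores.items(), key=lambda item: item[1],
                     reverse=not most_similar)
    return [(rank, feature_id, score)
            for rank, (feature_id, score)
            in enumerate(ordered[:number_of_ranks], start=1)]

# Hypothetical dissimilarity scores for four target features
scores = {"A": 0.02, "B": 7.9, "C": 1.3, "D": 0.4}

best_matches = rank_features(scores, 2)                      # A, then D
dissimilar = rank_features(scores, 2, most_similar=False)    # B, then C
```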
The output displays the values of the selected attributes, together with their ranks, in a table.
You can view the output after storing it. To save the output, enter the Output Layer Name and click the Save Data button. The saved output is added as a new layer in the current map session, where you can apply styling as required. This output layer is also available in the 'My Data' section and can be used in any map session.