Which issue arises when outliers significantly distort the position of centroids or create isolated clusters?

Prepare for the GARP Risk and AI (RAI) Exam with targeted quizzes. Utilize flashcards, multiple-choice questions, and detailed explanations to enhance learning. Ace your exam with our comprehensive quiz!

Multiple Choice

Which issue arises when outliers significantly distort the position of centroids or create isolated clusters?

Explanation:
In centroid-based clustering such as K-means, the center of each cluster is the mean of its points. The mean is not robust to extreme values, so a single outlier can pull the centroid away from the main group. That shift changes which points are assigned to which cluster. In some cases an extreme outlier ends up forming its own tiny cluster, or stretches a cluster toward itself. This effect, in which outliers shift centroids and can create isolated clusters, is why the method is described as sensitive to outliers.
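The pull on the centroid can be seen with a few lines of arithmetic. The sketch below is only an illustration: the four-point cluster, the outlier at (22, 22), and the helper names `mean_center` and `median_center` are all made up for this example. The per-coordinate median is included purely as a contrast to show what a more robust center would do.

```python
from statistics import mean, median

def mean_center(points):
    """Per-coordinate mean: the kind of centroid K-means uses."""
    return tuple(mean(axis) for axis in zip(*points))

def median_center(points):
    """Per-coordinate median: a more robust center, shown for contrast."""
    return tuple(median(axis) for axis in zip(*points))

# Hypothetical toy cluster: four points in a tight square.
cluster = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0)]
outlier = (22.0, 22.0)  # one made-up anomalous observation
with_outlier = cluster + [outlier]

print(mean_center(cluster))         # (1.5, 1.5)
print(mean_center(with_outlier))    # (5.6, 5.6)
print(median_center(cluster))       # (1.5, 1.5)
print(median_center(with_outlier))  # (2.0, 2.0)
```

In this toy case one extra point moves the mean by 4.1 units in each coordinate, well outside the original square, while the median moves by only 0.5.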

The other answer choices refer to different problems: the effect of high dimensionality on distance calculations, the dependence of results on the initial centroid positions, and bias introduced by the choice of distance metric. The scenario described points specifically to a lack of robustness to anomalous points, that is, sensitivity to outliers.
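The isolated-cluster case from the explanation can also be reproduced with a small hand-written version of the standard K-means loop (assign each point to its nearest centroid, then recompute each centroid as the mean of its members). This is a minimal sketch and not a library implementation. The data, the outlier at (100, 100), and the function names are hypothetical. The starting centroids are deliberately placed at the true group centers, so the result cannot be blamed on initialization.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, centroids, max_iter=100):
    """Plain K-means loop; returns (labels, centroids)."""
    centroids = list(centroids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:  # no movement: converged
            break
        centroids = updated
    return labels, centroids

# Two hypothetical groups of four points each.
group_a = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
group_b = [(5.0, 5.0), (5.0, 6.0), (6.0, 5.0), (6.0, 6.0)]
start = [(0.5, 0.5), (5.5, 5.5)]  # the true group centers

labels_clean, centers_clean = kmeans(group_a + group_b, start)
labels_noisy, centers_noisy = kmeans(group_a + group_b + [(100.0, 100.0)], start)

print(labels_clean)   # [0, 0, 0, 0, 1, 1, 1, 1]
print(labels_noisy)   # [0, 0, 0, 0, 0, 0, 0, 0, 1]
print(centers_noisy)  # [(3.0, 3.0), (100.0, 100.0)]
```

Without the outlier, the two groups are recovered as expected. With it, the second centroid is dragged toward (100, 100) after the first update, the points of the second group then sit closer to the first centroid, and the run ends with both real groups merged into one cluster and the outlier alone in the other.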
