Tags: scikit-learn, data-science, cluster-analysis, hierarchical-clustering, hdbscan

HDBSCAN getting minimal hierarchy in clustering

I’m clustering some items, and HDBSCAN has given good results. However, I frequently see clusters such as dog, cat, guppy, tuna at different levels of granularity. To fix this, I’d like to use the cluster hierarchy that HDBSCAN produces. This question asks something similar, but when I follow that answer, I get a very large number of groups (e.g. 50 groups for a 50-item list). Many of these groups look like they could be eliminated from the hierarchy (i.e. guppy -> fish rather than guppy -> … -> tropical fish -> … -> fish). Is this doable? If so, how would I approach it?
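One possible way to thin out such a hierarchy, sketched here under assumptions rather than as an HDBSCAN feature: decide which nodes are worth keeping, then reattach each kept node to its nearest kept ancestor. The helper `collapse_hierarchy`, the `parent_of` mapping, and the `keep` set are hypothetical names, and the labels are toy data. With the `hdbscan` library, a child-to-parent mapping could be built from the `parent` and `child` columns of `clusterer.condensed_tree_.to_pandas()`, and `keep` could be chosen by a threshold on `child_size`; that selection rule is an assumption, not something the library prescribes.

```python
def collapse_hierarchy(parent_of, keep):
    """Reattach every kept node to its nearest kept ancestor.

    parent_of: dict mapping child -> parent (the root has no entry).
    keep: set of node ids to retain in the simplified hierarchy.
    Returns a dict mapping each kept non-root node to its new parent,
    or to None if no kept ancestor exists.
    """
    collapsed = {}
    for node in parent_of:
        if node not in keep:
            continue
        ancestor = parent_of[node]
        # Walk upward, skipping intermediate nodes that are not kept.
        while ancestor not in keep and ancestor in parent_of:
            ancestor = parent_of[ancestor]
        collapsed[node] = ancestor if ancestor in keep else None
    return collapsed


# Toy hierarchy with hypothetical labels (not real HDBSCAN output).
parent_of = {
    "guppy": "small tropical fish",
    "small tropical fish": "tropical fish",
    "tropical fish": "fish",
    "tuna": "fish",
    "fish": "animal",
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
}
keep = {"guppy", "tuna", "dog", "cat", "fish", "mammal", "animal"}

collapsed = collapse_hierarchy(parent_of, keep)
for child, parent in collapsed.items():
    print(f"{child} -> {parent}")
```

In this toy run, guppy ends up directly under fish because the two intermediate nodes are not in `keep`. Raising `min_cluster_size` when fitting HDBSCAN is another lever that tends to produce fewer, coarser clusters in the condensed tree to begin with.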