Clustering
Clustering classifies data without any prior knowledge of the classes to be distinguished, so it is a form of unsupervised learning. That is, clusters are separated on the basis of similarity alone, without prior knowledge of the samples. A cluster is a set of patterns, drawn from a finite number of patterns given in the pattern space, that lie close together and form a group; the process of forming these groups is called clustering. Similarity between clusters is evaluated with a distance measure, for example the Euclidean distance, Mahalanobis distance, Lance-Williams distance, or Hamming distance.
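As a rough illustration of three of the distance measures named above, the following is a minimal sketch in plain Python. The function names are hypothetical, the Mahalanobis version is restricted to 2-D points with a caller-supplied 2x2 covariance matrix (an assumption made to keep the example free of external libraries), and the Lance-Williams distance is left out.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def mahalanobis_2d(a, b, cov):
    """Mahalanobis distance between 2-D points for a 2x2 covariance matrix."""
    (s11, s12), (s21, s22) = cov
    det = s11 * s22 - s12 * s21
    if det == 0:
        raise ValueError("covariance matrix must be invertible")
    # Closed-form inverse of the 2x2 covariance matrix.
    i11, i12, i21, i22 = s22 / det, -s12 / det, -s21 / det, s11 / det
    dx, dy = a[0] - b[0], a[1] - b[1]
    return math.sqrt(dx * (i11 * dx + i12 * dy) + dy * (i21 * dx + i22 * dy))
```

With an identity covariance matrix the Mahalanobis distance reduces to the Euclidean distance; a larger variance along one axis shrinks distances measured along that axis.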
Data clustering is a common data-analysis technique used in many fields, including machine learning, data mining, pattern recognition, image analysis, and bioinformatics. Clustering means partitioning a data set into several subsets (clusters) so that the data in each subset share some common trait; this is done by computing similarity (or proximity) with some distance measure. Data clustering falls broadly into two types: hierarchical clustering and partitional clustering.
Hierarchical clustering can be agglomerative (bottom-up) or divisive (top-down). The clusters, built up from the individual elements, form a hierarchy: a tree with the individual elements at one end and a single cluster containing every element at the other.
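The agglomerative (bottom-up) direction can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes single linkage (cluster distance is the distance between the closest pair of members) and Euclidean distance, neither of which is specified above, and the function name `single_linkage` is hypothetical.

```python
import math


def single_linkage(points, num_clusters):
    """Repeatedly merge the two closest clusters until num_clusters remain."""
    if num_clusters < 1:
        raise ValueError("num_clusters must be at least 1")
    # Start at the leaves of the tree: every point is its own cluster.
    clusters = [[p] for p in points]
    while len(clusters) > num_clusters:
        best = None  # (distance, index_i, index_j) of the closest pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(math.dist(p, q)
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
three = single_linkage(points, 3)
one = single_linkage(points, 1)
```

Stopping at `num_clusters=len(points)` returns the individual elements, and stopping at `num_clusters=1` returns the single all-inclusive cluster, which are the two ends of the tree described above.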
Partitional clustering groups the data in a flat manner, without considering any hierarchy among clusters; in general, the number of clusters is fixed in advance, based on how many clusters the data is expected to split into ............ (Wikipedia : Data Clustering)
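K-means, which appears in the term list of this entry, is one widely used partitional method with a preset number of clusters. The sketch below is a minimal plain-Python version of Lloyd's iteration under stated assumptions: Euclidean distance, the first k points used as initial centroids (chosen only to keep the example deterministic), and a hypothetical function name `k_means`.

```python
import math


def k_means(points, k, max_iter=100):
    """Lloyd's algorithm; the first k points serve as initial centroids."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [tuple(p) for p in points[:k]]
    groups = []
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            groups[nearest].append(p)
        # Update step: move each centroid to the mean of its group.
        # An empty group keeps its previous centroid.
        updated = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centroids[c]
            for c, g in enumerate(groups)
        ]
        if updated == centroids:  # converged: no centroid moved
            break
        centroids = updated
    return centroids, groups


centroids, groups = k_means([(0, 0), (10, 10), (0, 1), (10, 11)], k=2)
```

The result depends on the initial centroids, so practical implementations usually randomize the starting points and keep the best of several runs.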
term :
Unsupervised Learning | K-means Algorithm | Neural Network | Self-Organizing Map | Machine Learning | Data Mining | Pattern Recognition | Bioinformatics
site :
A Tutorial on Clustering Algorithms : K-means | Fuzzy C-means | Hierarchical | Mixture of Gaussians | Links (★★★)