Social media is an emerging area of interest that attracts many researchers because it provides tremendous amounts of readily available data that can be exploited for purposes such as social networking, decision making and marketing. One emerging field within social media data mining is topic detection from social data. A topic is harvested from social data through clustering, an unsupervised machine learning task that groups similar social data; the importance of each group is then assessed to derive a general label, which is called the topic. In this work, two clustering techniques that share a stochastic capability, hierarchical clustering and density-based clustering, are compared using a dataset retrieved from Twitter that corresponds to real-world events. The classes in the dataset are restricted to four main topics, and the dataset is then used to test the performance of the clustering algorithms. Clustering performance is evaluated with the V-measure, which is the harmonic mean of the homogeneity and completeness scores of a clustering. The results show that the hierarchical clustering technique outperforms the density-based clustering technique in determining the correct number of clusters and in reliably assigning data to their respective clusters. Apart from the comparative study discussed in this project, an analysis tool based on social data is developed to address problems related to social data analysis.
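To make the evaluation metric concrete, the following is a minimal sketch of how the V-measure can be computed from ground-truth topic labels and predicted cluster labels. It is not the project's own code: the function names (`entropy`, `conditional_entropy`, `v_measure`) and the toy labels are assumptions made for illustration, and the handling of zero entropy (scoring 1.0) is a common convention rather than something stated in the text above.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy H(labels), using the natural logarithm."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """Conditional entropy H(target | given) for two equal-length label lists."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum((c / n) * log(c / given_counts[g]) for (g, _), c in joint.items())


def v_measure(classes, clusters):
    """Return (homogeneity, completeness, v) for true classes vs. predicted clusters."""
    h_classes = entropy(classes)
    h_clusters = entropy(clusters)
    # Assumed convention: a score is 1.0 when the corresponding entropy is zero.
    homogeneity = (
        1.0 if h_classes == 0
        else 1.0 - conditional_entropy(classes, clusters) / h_classes
    )
    completeness = (
        1.0 if h_clusters == 0
        else 1.0 - conditional_entropy(clusters, classes) / h_clusters
    )
    if homogeneity + completeness == 0:
        return homogeneity, completeness, 0.0
    # Harmonic mean of homogeneity and completeness.
    v = 2 * homogeneity * completeness / (homogeneity + completeness)
    return homogeneity, completeness, v


if __name__ == "__main__":
    # Hypothetical labels: two true topics, one of them split across two clusters.
    true_topics = [0, 0, 1, 1]
    predicted = [0, 0, 1, 2]
    print(v_measure(true_topics, predicted))
```

Homogeneity is high when each cluster contains members of a single topic, and completeness is high when all members of a topic land in the same cluster; the harmonic mean penalizes a clustering that achieves one at the expense of the other.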
Phenix is an analytical dashboard that uses social data to provide social analytics. The dashboard's responsive interface provides a functional and reliable administration experience, especially on mobile devices. The social data streaming service is real-time, so the analytics on current social trends stay up to date. The application can be used in fields such as crisis management, social media marketing, SEO and decision making.
Social data is fetched from popular social networks such as Twitter via a real-time streaming service, and analysis is performed directly during streaming, providing real-time social analysis usable in various fields.
Social data is retrieved for specific regions that are considered crucial, in order to retrieve the corresponding trends. This makes it possible to track the top trends for particular regions.
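Region-based trend tracking of this kind can be sketched as a per-region hashtag count. The example below is an illustration only and does not describe how Phenix is implemented: the post structure (a dict with `region` and `text` keys), the function name `top_trends_by_region`, and the use of hashtags as the trend signal are all assumptions.

```python
import re
from collections import Counter, defaultdict

# Assumed trend signal: hashtags of the form #word.
HASHTAG = re.compile(r"#(\w+)")


def top_trends_by_region(posts, k=3):
    """Return the k most frequent hashtags for each region.

    `posts` is an iterable of dicts with hypothetical keys "region" and "text".
    Each hashtag is counted at most once per post, case-insensitively.
    """
    counts = defaultdict(Counter)
    for post in posts:
        tags = {tag.lower() for tag in HASHTAG.findall(post["text"])}
        counts[post["region"]].update(tags)
    return {region: counter.most_common(k) for region, counter in counts.items()}


if __name__ == "__main__":
    # Hypothetical posts, standing in for items arriving from a stream.
    sample_posts = [
        {"region": "A", "text": "#flood warning issued #Flood"},
        {"region": "A", "text": "stay safe everyone #flood"},
        {"region": "A", "text": "heavy #traffic downtown"},
        {"region": "B", "text": "polls open today #election"},
    ]
    print(top_trends_by_region(sample_posts, k=1))
```

In a streaming setting the same counters could be updated as each post arrives, so the ranking reflects the data seen so far instead of a fixed batch.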
The analytics tools provided include region-based social trend tracking and a sentiment analytics tool, which give critical insight into current social trends. Information gained from the provided analytics tools can be used in marketing strategy, SEO and brand awareness.
Research team members:
|Leader|Professor Madya Dr Azah Kamilah binti Draman@ Muda|
||Professor Dr Goh Ong Sing|
||Professor Madya Dr Choo Yun Huoy|
||Mohammad Safar A/L Shariff|