Finding the true influencer
Problem being addressed
The rise of social networks has created an influx of data that describes how individuals are connected, and as more and more data has become available, it has become possible to tailor advertisements to individuals’ interests. Identifying who to sponsor is an interesting problem that companies face, as sponsors with little reach may end up costing the company more money than they bring in through new customers.
The work contributes a new and novel approach to utilize machine learning algorithms in graph analysis, specifically focusing on clustering and key figure analysis. The work proposes that embedded clusters can be determined for the purpose of advertising. Key figure analysis utilizing multiple centrality measures is used to identify leaders of a community, and a proposal of using these measures within embedded communities is made. Particularly, clustering is used to take the place of keywords in identifying advertising demographics, and key figure analysis as a method to determine sponsorship figures.
Advantages of this solution
The research provides a basis for analyzing social media datasets using machine learning with graph theoretic techniques. Means for analyzing data clusters and key figures within them have been found, and suitable metrics are also provided. This work could easily be applied to locating influencers and tight clusters within any given network. The clique analysis put forth gives a basis for finding leaders among smaller groups within a community, specifically leaders that partake in multiple groups.
Solution originally applied in these industries
Possible New Application of the Work
Given a data set of people with terroristic tendencies, similar techniques could identify leaders in the population, or provide a good starting point to look into certain individuals not previously known as a potential threat.
While the context of the data set restricts the results to advertising suggestions, or influencer identification, these techniques could easily be applied to other datasets. By applying similar techniques to medical data, it would be possible to identify disease clusters. This would potentially allow easier identification of strains of the disease, and it would be possible to perform key - figure analysis on the clusters to find likely patient zeros. In this case, identifying the first patient can allow for research into cures to progress quicker as the disease progression can be monitored.
Source URL: #############