The Contents-Based Website Classification For The Internet Advertising Planning: An Empirical Application Of The Natural Language Analysis
Published 2017 · Computer Science
This study proposes a model for classifying websites by their content and discusses its applications to Internet advertising (ad) strategies. Internet ad agencies manage a vast number of ad spaces embedded in websites and must choose which advertisements to place in each of them. To optimize their ad placement strategy, agencies therefore need to know the properties and topics of each website. However, because website content is written in natural language, these qualitative sentences must be converted into quantitative data before websites can be classified with statistical models. To address this issue, this study applies statistical analysis to website information written in natural language. We apply a dictionary of neologisms to decompose website sentences into words and create a data set of indicator matrices for classifying the websites. From this data set, we estimate the topics of each website using latent Dirichlet allocation, which is a fast and robust method for sparse matrices. Finally, we discuss how to apply the results to optimize ad strategies.
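The pipeline described in the abstract (dictionary-based word decomposition, an indicator matrix, then latent Dirichlet allocation) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not taken from the study: the site names, texts, and dictionary entries are invented, whitespace splitting stands in for the dictionary-based decomposition, and a small collapsed Gibbs sampler is used as one common way to fit LDA, which may differ from the authors' estimation procedure.

```python
import random

# Hypothetical toy data: site names, texts, and dictionary entries are invented.
SITES = {
    "site_a": "cheap flights and hotel deals for your next trip",
    "site_b": "hotel reviews and flights to plan a trip",
    "site_c": "football scores and match highlights from the league",
    "site_d": "league table, match reports and football news",
}
DICTIONARY = {"flights", "hotel", "trip", "football", "match", "league"}


def tokenize(text, dictionary):
    """Keep only dictionary words (a stand-in for dictionary-based decomposition)."""
    words = (w.strip(".,").lower() for w in text.split())
    return [w for w in words if w in dictionary]


def indicator_matrix(sites, dictionary):
    """Return (vocab, rows) where rows[i][j] is 1 if site i contains word j."""
    vocab = sorted(dictionary)
    rows = []
    for text in sites.values():
        present = set(tokenize(text, dictionary))
        rows.append([1 if w in present else 0 for w in vocab])
    return vocab, rows


def lda_gibbs(rows, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-site topic proportions."""
    rng = random.Random(seed)
    n_words = len(rows[0])
    # Each present word in an indicator row becomes one token (a word id).
    docs = [[j for j, x in enumerate(row) if x] for row in rows]
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * n_words)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    # Smoothed topic proportions per site; each row sums to 1.
    theta = []
    for d, doc in enumerate(docs):
        denom = len(doc) + alpha * n_topics
        theta.append([(doc_topic[d][k] + alpha) / denom for k in range(n_topics)])
    return theta


vocab, rows = indicator_matrix(SITES, DICTIONARY)
theta = lda_gibbs(rows, n_topics=2)
for name, mix in zip(SITES, theta):
    print(name, [round(p, 2) for p in mix])
```

In a setting like the one the abstract describes, the per-site topic proportions in `theta` would be the quantity an agency could use to match advertisements to websites. The number of topics and the priors `alpha` and `beta` are arbitrary choices here and would need tuning on real data.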