Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/70238
Title: Keyword extraction on online advertisement using clustering and classification methodology
Authors: Liu, Peng
Keywords: DRNTU::Library and information science
Issue Date: 2017
Abstract: Keyword advertising is a form of online advertising in which an advertiser pays to have an advertisement appear in the results listing when a person searches the web using a particular phrase. The selection of keywords is particularly important: keywords summarize the key characteristics of the advertised products and services, and they are an important factor in increasing the reach of the advertisement (ad) and, potentially, its conversion rate. At my company, Optimate, we provide services that help clients optimize their online marketing campaigns, advertisement placements and customer reach across multiple channels such as Google AdWords and Facebook. Keyword selection remains a crucial component in increasing the overall effectiveness and efficiency of these services. In this report, I propose a new approach to extracting keywords from advertisement text that considers the grammatical pattern of the text, historical ads, and other attributes such as industry and objective. The approach can be broadly divided into three phases: keyword candidate generation, K-means clustering, and K-nearest-neighbour classification. The selection rules for keyword candidates are based on linguistic features and the Part-of-Speech (POS) pattern of the ad content. The aim of candidate generation is to produce a comprehensive list of possible keywords for subsequent classification. K-means clustering divides the ads into groups, and the subsequent classification is performed only within the group to which the ad belongs. This reduces the computational complexity and selects the group that is likely to yield better keywords. The TF-IDF features of the keyword candidates are then analysed, and the cosine distance is computed and used as input to the K-nearest-neighbour classification. Based on the majority vote of the 20 nearest neighbour keywords, each candidate keyword is classified as either a true keyword or a false keyword. 
This approach achieves good results in extracting keywords, but there are still issues limiting its effectiveness. Nevertheless, this approach offers a quick, highly flexible, and easily implementable solution to keyword extraction.
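The final classification phase described in the abstract can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the code from the report: the function names (tfidf_vectors, cosine_distance, knn_is_keyword), the smoothed IDF formula, the choice to represent each candidate by a TF-IDF vector over its own tokens, and the toy phrases are all assumptions made for the example. The POS-based candidate generation and the K-means step that restricts the neighbours to one cluster are omitted.

```python
import math
from collections import Counter


def tfidf_vectors(token_lists):
    """Build a sparse TF-IDF vector (term -> weight) for each token list."""
    n_docs = len(token_lists)
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append({
            # Assumed weighting: relative term frequency times a smoothed IDF.
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all-zero."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def knn_is_keyword(candidate, labelled, k=20):
    """Majority vote over the k labelled vectors nearest to the candidate.

    labelled is a list of (vector, is_keyword) pairs; a tie counts as False.
    """
    nearest = sorted(
        labelled, key=lambda pair: cosine_distance(candidate, pair[0])
    )[:k]
    votes = sum(1 for _, is_keyword in nearest if is_keyword)
    return votes * 2 > len(nearest)


# Toy usage with hypothetical annotated phrases (k=3 because the set is tiny).
phrases = [
    (["summer", "sale"], True),
    (["free", "shipping"], True),
    (["click", "here"], False),
    (["learn", "more"], False),
    (["summer", "dress"], True),
]
candidate_tokens = ["summer", "shoes"]
vectors = tfidf_vectors([tokens for tokens, _ in phrases] + [candidate_tokens])
labelled = list(zip(vectors[:-1], [label for _, label in phrases]))
print(knn_is_keyword(vectors[-1], labelled, k=3))
```

The default k=20 mirrors the 20-neighbour vote mentioned in the abstract; the tie-breaking rule (a tie counts as a false keyword) is an arbitrary choice for this sketch.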
URI: http://hdl.handle.net/10356/70238
Rights: Nanyang Technological University
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)

Files in This Item:
File | Description | Size | Format
Final Report LIU PENG.pdf (Restricted Access) | Project Report | 2.02 MB | Adobe PDF
keyword annotation.xlsx (Restricted Access) | Dataset by Human annotator | 117.16 kB | Microsoft Excel
data_industry_obective.xlsm (Restricted Access) | | 223.25 kB | Unknown
facebook_ad_data.json (Restricted Access) | | 1.03 MB | Unknown
codes.7z (Restricted Access) | | 15.41 kB | Unknown

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.