4 February 2019
Internet, feature extraction, social networking (online), slang feature extraction, analysing topic change, social media, youth slang, neologism, Internet slang, social networking sites, fresh information, automatic information collection, document groups, target slang, general words, slang classification method, SNS
Recently, the authors often see words such as youth slang, neologism and Internet slang on social networking sites (SNSs) that are not registered on dictionaries. Since the documents posted to SNSs include a lot of fresh information, they are thought to be useful for collecting information. It is important to analyse these words (hereinafter referred to as ‘slang’) and capture their features for the improvement of the accuracy of automatic information collection. This study aims to analyse what features can be observed in slang by focusing on the topic. They construct topic models from document groups including target slang on Twitter by latent Dirichlet allocation. With the models, they chronologically the analyse change of topics during a certain period of time to find out the difference in the features between slang and general words. Then, they propose a slang classification method based on the change of features.