Fingerprint-Based Near-Duplicate Document Detection with Applications to SNS Spam Detection

Phuc-Tran Ho; Sung-Ryul Kim
January 2014
International Journal of Distributed Sensor Networks;2014, p1
Academic Journal
Social networking is used by millions of people around the world and has become the most popular way for people to connect and interact online with their friends. There are now many social networking sites, such as Facebook, MySpace, and Twitter, each with a huge number of active users. These sites are therefore also attractive targets for spammers and cheaters who want to steal users' personal information or advertise their products. Recently, many methods using different techniques have been proposed to detect spam comments on social networks. In this paper, we propose a similarity-based method that combines a fingerprinting technique with a trie data structure and a meet-in-the-middle approach in order to achieve higher accuracy in spam comment detection. Using our proposed approach, we are able to detect around 98% of the spam comments in our dataset.
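
The abstract names fingerprinting as the basis of the similarity measure but does not give details. As a rough illustration only, the following Python sketch shows one common form of fingerprint-based near-duplicate detection: hashing character k-grams and comparing the resulting sets with Jaccard similarity. This is not the authors' algorithm; it omits the trie and meet-in-the-middle components, and the function names, the k-gram length `k=5`, the truncated MD5 hash, and the `threshold=0.6` cutoff are all assumptions chosen for the example.

```python
import hashlib


def fingerprints(text, k=5):
    """Return the set of hashed character k-grams of a normalized comment."""
    # Normalize: lowercase and collapse runs of whitespace (an assumed choice).
    normalized = " ".join(text.lower().split())
    if len(normalized) < k:
        grams = [normalized]
    else:
        grams = [normalized[i:i + k] for i in range(len(normalized) - k + 1)]
    # Keep the first 8 hex digits (32 bits) of MD5 as a compact fingerprint.
    return {int(hashlib.md5(g.encode("utf-8")).hexdigest()[:8], 16) for g in grams}


def similarity(a, b, k=5):
    """Jaccard similarity of the two fingerprint sets, a value in [0, 1]."""
    fa, fb = fingerprints(a, k), fingerprints(b, k)
    return len(fa & fb) / len(fa | fb)


def is_near_duplicate(comment, known_spam, threshold=0.6, k=5):
    """True if the comment is at least `threshold` similar to any known spam."""
    return any(similarity(comment, spam, k) >= threshold for spam in known_spam)


if __name__ == "__main__":
    spam = ["Click here to win a free iPhone now!!!"]
    print(is_near_duplicate("Click here to win a free iPhone now!", spam))
    print(is_near_duplicate("See you at lunch tomorrow?", spam))
```

Comparing a comment against every known spam message, as this sketch does, scales linearly with the size of the spam collection; an index over the fingerprints, such as the trie the abstract mentions, is one way to avoid that full scan.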


