Ernestas Poškus

Technical blog

"We must view with profound respect the infinite capacity of the human mind to resist the introduction of useful knowledge." - Thomas R. Lounsbury

Spam Taxonomy

Authors: Zoltán Gyöngyi, Hector Garcia-Molina
Name: Web Spam Taxonomy
Link: http://ilpubs.stanford.edu:8090/771/1/2005-9.pdf

Fri, Nov 12, 2021

Web spamming refers to actions intended to mislead search engines into ranking some pages higher than they deserve. The primary consequence of web spamming is that the quality of search results decreases. The secondary consequence of spamming is that search engine indexes are inflated with useless pages, increasing the cost of each processed query.

We use the term spamming (also, spamdexing) to refer to any deliberate human action that is meant to trigger an unjustifiably favorable relevance or importance for some web page, considering the page’s true value.

All types of actions intended to boost ranking (either relevance, or importance, or both), without improving the true value of a page, are considered spamming.

Spamming techniques:

Body spam
Title spam
Meta tag spam
Anchor text spam
Repetition of terms
Dumping a large number of unrelated terms
Weaving spam terms into content
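
To make "repetition of terms" concrete, here is a minimal sketch of how one might flag text in which a single term dominates. It is not a method from the paper: the function names `top_term_share` and `looks_repetitive` and the 0.3 threshold are assumptions chosen purely for illustration.

```python
import re
from collections import Counter
from typing import Tuple


def top_term_share(text: str) -> Tuple[str, float]:
    """Return the most frequent term and its share of all terms in the text."""
    # Lowercase and split into simple alphanumeric tokens.
    terms = re.findall(r"[a-z0-9]+", text.lower())
    if not terms:
        return "", 0.0
    term, count = Counter(terms).most_common(1)[0]
    return term, count / len(terms)


def looks_repetitive(text: str, threshold: float = 0.3) -> bool:
    """Flag text where one term's share exceeds the threshold.

    The default threshold is a hypothetical value, not taken from the paper.
    """
    _, share = top_term_share(text)
    return share > threshold


if __name__ == "__main__":
    spam = "cheap flights cheap flights cheap cheap deals cheap"
    normal = "the quick brown fox jumps over a lazy dog"
    print(top_term_share(spam), looks_repetitive(spam))
    print(top_term_share(normal), looks_repetitive(normal))
```

The same check could be run separately on the body, title, meta tags, or anchor text to see which field is being stuffed. On longer real-world text, common words such as "the" would likely need to be filtered out first, and a check this simple would not catch dumping or weaving, where the spam terms are varied or mixed into otherwise normal content.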