Unsolicited mail, phone calls, and even visits from people selling unwanted things are a familiar problem. The power of computers has, unfortunately, added electronic media, such as e-mails and text messages, to the annoyance. Unsolicited advertising e-mails and text messages are very cheap to mass-distribute and therefore need only a very small return to be profitable.
These unsolicited electronic messages have been called SPAM. For an explanation of the name, refer to this amusing link. The overall problem of spam is described in detail in this Wikipedia article.
With spam becoming such a vexing problem, many statisticians and machine learning practitioners have started working on ways to automatically decide whether a message is spam or HAM. (HAM is the common label for non-spam messages, that is, messages you actually want.)
In these activities, you will step into the shoes of a machine learning expert and think about ways to automatically classify real-world text messages as either ham or spam.