SMS text messages play a role in many modern interactive scenarios. For example, many popular music TV shows offer an interactive feature that lets viewers communicate with each other by sending text messages to the show, where they are displayed at the bottom of the screen. Another example is a lounge that lets users of SMS devices post and view comments on a wide-angle, high-definition display during a broadcast sporting event.

However, such interaction carries risks. An interactive display that has not been secured against misuse can easily be abused. There are always people who want to disturb or interrupt a conversation, for example by texting random or rude content. This work addresses the problem of content-based filtering of SMS text messages. One example of content-based filtering is a relevance filter for a sporting event: participants can discuss topics related to the event, while any other message is filtered out and deleted.

In this work, an automated software classifier will be developed, using the support vector machine and naive Bayes algorithms to classify text messages. Although alternative classifiers exist, such as artificial neural networks, nearest-neighbour approaches, and Markov chains, the support vector machine and naive Bayes classifiers are chosen because they are simple, accurate, and widely used in the field. They do have disadvantages, however: for example, the naive Bayes classifier makes the unrealistic assumption that, given the message class, each word in a phrase is independent of its position and of every other word in the phrase.
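To make the independence assumption concrete, the following is a minimal sketch of a multinomial naive Bayes relevance filter. It is an illustration only, not the classifier developed in this work: the training messages, the labels "relevant" and "irrelevant", and the function names train and classify are all hypothetical, and add-one (Laplace) smoothing is an assumed design choice.

```python
import math
from collections import Counter, defaultdict


def train(samples):
    """Count labels and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def classify(model, text):
    """Return the label with the highest log posterior under naive Bayes."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in label_counts.items():
        # Log prior of the class.
        score = math.log(count / total)
        # Add-one smoothing: every vocabulary word gets one extra count.
        denom = sum(word_counts[label].values()) + len(vocab)
        # Independence assumption: each word contributes its own factor,
        # regardless of its position or of the other words in the message.
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data for a sporting-event relevance filter.
TRAINING = [
    ("great goal by the striker", "relevant"),
    ("the keeper saved that penalty", "relevant"),
    ("what a match tonight", "relevant"),
    ("buy cheap watches now", "irrelevant"),
    ("call me for free money", "irrelevant"),
    ("win cash now call today", "irrelevant"),
]

model = train(TRAINING)
print(classify(model, "great penalty goal"))
print(classify(model, "free cash now"))
```

Because the score is a sum of per-word terms, reordering the words of a message cannot change the predicted label, which is exactly the position independence described above.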

One challenging aspect of this work is converting SMS text messages into standard English phrases. The difficulty is that most SMS messages are written in a compressed form of English, in which some words may be omitted and left implied. Nevertheless, the conversion is needed to reduce the complexity of the classification process. Previous work has addressed this problem. One approach expanded the feature set; another used a phrase-based statistical model built on Shannon's channel model. Both approaches reported a significant improvement in the classification of SMS messages. This work will therefore use these approaches as a starting point and, by studying and evaluating their performance and characteristics, look for possible improvements.
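The channel-model idea can be sketched as follows: the SMS text is treated as a corrupted version of an English sentence, and the decoder picks the English sentence e that maximises P(e) * P(sms | e). The sketch below is a simplified word-level illustration, not the phrase-based model from the prior work. The tables CHANNEL and BIGRAM, their probabilities, the FLOOR value, and the function normalize are all hypothetical, and the exhaustive search is chosen for clarity only.

```python
import math
from itertools import product

# Hypothetical channel model: P(sms_token | english_word).
CHANNEL = {
    "c": {"see": 0.6, "sea": 0.1},
    "u": {"you": 0.7},
    "l8r": {"later": 0.8},
    "2": {"to": 0.4, "two": 0.3, "too": 0.3},
}

# Hypothetical bigram language model: P(word | previous word).
# "<s>" marks the start of the sentence.
BIGRAM = {
    ("<s>", "see"): 0.2,
    ("see", "you"): 0.5,
    ("you", "later"): 0.3,
    ("going", "to"): 0.4,
}

# Assumed floor probability for bigrams missing from the table.
FLOOR = 1e-4


def candidates(token):
    """English candidates for a token; unknown tokens map to themselves."""
    return CHANNEL.get(token, {token: 1.0})


def normalize(sms):
    """Return the candidate sentence maximising P(e) * P(sms | e)."""
    tokens = sms.lower().split()
    options = [list(candidates(t).items()) for t in tokens]
    best, best_score = (), float("-inf")
    # Exhaustive search over all combinations: fine for short messages,
    # but exponential in general.
    for combo in product(*options):
        score, prev = 0.0, "<s>"
        for word, p_channel in combo:
            p_language = BIGRAM.get((prev, word), FLOOR)
            score += math.log(p_channel) + math.log(p_language)
            prev = word
        if score > best_score:
            best, best_score = combo, score
    return " ".join(word for word, _ in best)


print(normalize("c u l8r"))
print(normalize("going 2 town"))
```

A practical decoder would estimate both tables from data and replace the exhaustive search with dynamic programming such as the Viterbi algorithm.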