CS 101 Laboratory #10

Spam, Spam, Spam


Objective: To gain more experience with Strings.
As in previous weeks, you should bring a completed design document to lab.

Due date: Friday, December 2, at the beginning of your lecture section.


Background

What is spam? Before the mid-90s, Spam was a canned "meat" product, which we believe actually still exists under that name. Others are fondly reminded of a Monty Python skit involving a restaurant that served Spam in all of its dishes. Of course, these days, the term "spam" means just one thing - unwanted email! There is lots more information available online about the history of the term "spam". Hormel (the maker of Spam) has an amusing site that tries to give Spam back its "good" reputation, including the slogan "Spam. So good...it's gone."

The Problem

This week we will build a program that should give you insight into how you might add a spam filter to a mail program. Our spam filter is rather simple. The user provides a list of words. The program then searches the "from" and "subject" headers of all your mail messages and divides your mail into two lists. One list holds the messages whose headers contain at least one filter word (the spam); the other holds the messages whose headers contain none (the good mail).

The SpamFilter program is shown below:

In this program, the user selects the machine to read mail from, then enters his/her username and password. The program connects to the mail server and downloads the headers of all the messages. Initially, all the mail is considered good mail (not spam).

The user can then add filter words. Each filter word appears in the text area in the top right after it is entered. Adding a filter word also causes the mail to be scanned for occurrences of that word: messages with the word in their headers are removed from the good mail list and moved to the spam list.
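To make the scanning step concrete, here is a minimal sketch of the test applied to a single message. It is only an illustration: the class name FilterMatchDemo and the method matchesAny are hypothetical, and we assume a plain, case-sensitive substring match, which this handout does not prescribe.

```java
// Hypothetical sketch; not part of the starter code.
class FilterMatchDemo {

    // Returns true if any filter word occurs in either header.
    // Assumption: matching is a plain, case-sensitive substring test.
    static boolean matchesAny(String from, String subject, String[] filters) {
        for (String word : filters) {
            if (from.contains(word) || subject.contains(word)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] filters = { "lottery", "prize" };
        // Neither header contains a filter word, so this prints false.
        System.out.println(matchesAny("From: Amy", "Subject: lab 10", filters));
        // The subject contains "prize", so this prints true.
        System.out.println(matchesAny("From: Bob", "Subject: claim your prize", filters));
    }
}
```

A message for which such a test returns true would belong on the spam list; all others would stay on the good mail list.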

The JComboBox at the bottom of the screen allows the user to switch between viewing the good mail or the spam.

You can use this program to connect to your actual mailbox and filter your own mail by entering your mail server name, username and password into the user interface. We provide the code to actually communicate with a mail server. You do not need to worry about errors in your program causing problems with your real mailbox, though, as there is nothing in your program that modifies your mailbox in any way.

If you don't want to use your actual mailbox, we have set up temporary mailboxes for you to use on the mail server basin.cs.middlebury.edu. You would connect with the same username and password that you use to connect to benjerry. Of course, these mailboxes are currently empty, so you will need to send yourself a few short pieces of mail to test your program, for example to the address username@cs.middlebury.edu (with your login id in place of "username", of course).

To start, click on this link to download the SpamFilter starter.

Design

This week we will again require that you prepare a written "design" for your program before lab. At the beginning of the lab, the instructor will briefly examine each of your designs to make sure you are on the right track. Submit your final design with your code submission.

The only class you are responsible for designing is the SpamFilter class. For this class, you should provide:

We provide a MailConnection class that manages the communication with a mail server. We also provide a simple Mail class that the SpamFilter class uses to store and retrieve a "from" header and a "subject" header. It provides the following:

public Mail (String from, String subject)
Constructs a mail object with just a "from" header and a "subject" header.

public String getFrom()
Returns the from header that was passed in on construction.

public String getSubject()
Returns the subject header that was passed in on construction.

public String toString()
Returns the "from" and "subject" headers as a single string with a newline between them and a newline at the end.
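The behavior described above could be realized roughly as follows. This is a stand-in written only from the descriptions in this handout; the name MailSketch is hypothetical, and the Mail class supplied with the lab may be implemented differently.

```java
// Hypothetical stand-in based on the method descriptions above;
// the Mail class provided with the lab may differ.
class MailSketch {
    private final String from;
    private final String subject;

    // Constructs a mail object with just a "from" and a "subject" header.
    MailSketch(String from, String subject) {
        this.from = from;
        this.subject = subject;
    }

    String getFrom() {
        return from;
    }

    String getSubject() {
        return subject;
    }

    // Both headers as one string: a newline between them and one at the end.
    @Override
    public String toString() {
        return from + "\n" + subject + "\n";
    }
}

class MailSketchDemo {
    public static void main(String[] args) {
        MailSketch m = new MailSketch("From: Amy", "Subject: lab 10");
        // print (not println) because toString already ends with a newline.
        System.out.print(m);
    }
}
```

Because toString already ends with a newline, several such strings can be appended to a text area one after another without adding separators.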

The MailConnection class is a bit more interesting. Here are descriptions of the constructor and methods the SpamFilter class needs:

public MailConnection (String host, String userName, char[] password)
To talk to a mail server, we first construct a MailConnection object. We pass in the name of our mail server, something like "mail.middlebury.edu", a login id, and a password. Notice that the password parameter is a character array rather than a String. This is not really a problem, because the GUI component that the user types their password into (a JPasswordField) returns a character array, rather than a String, so the program just passes the contents of the password field on to the MailConnection constructor.

When you construct a MailConnection it attempts to connect to the mail server. There are a number of reasons why it might fail. For example, the user might mistype his/her login id or password or the mail server might not be running for some reason. If any of these failures occur, a dialog box will pop up to inform the user. The program must also be aware that the connection failed because it will not be possible to look at the mail if there is no connection. For that reason, we provide the following method.

public boolean isConnected()
Returns true if the program currently has a connection to the mail server.

public void disconnect()
Closes the connection to the mail server. This does nothing if there is no active connection.

public int getNumMsgs()
Returns the number of messages in the mailbox you are connected to. This returns 0 if there is no active connection.

public String header (int msg)
Returns the headers of the mail message identified by the number passed in. Unlike Java arrays, which are indexed from 0, mailboxes number their messages starting at 1 and going up to the number of messages in the mailbox.

The mail headers are returned in one long string, such as:

Received: from arcticcat.middlebury.edu ([140.233.2.8]) by bearcat.middlebury.edu with Microsoft SMTPSVC(5.0.2195.6713);
 Fri, 18 Nov 2005 15:36:19 -0500
Received: from anthrax.middlebury.edu ([140.233.2.72]) by arcticcat.middlebury.edu with Microsoft SMTPSVC(5.0.2195.6713);
 Fri, 18 Nov 2005 15:36:19 -0500
Received: from 140.233.20.15 [140.233.20.15]
by anthrax.middlebury.edu
with XWall v3.35 ;
Fri, 18 Nov 2005 15:36:18 -0500
Message-ID: <437E3B2F.1080301@middlebury.edu>
Date: Fri, 18 Nov 2005 15:35:59 -0500
From: David Guertin 
MIME-Version: 1.0
To: "Briggs, Amy" 
Subject: Re: two needs for CS 101
In-Reply-To: <437DE9F4.2000107@middlebury.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Return-Path: guertin@middlebury.edu

Your spam filter will look at only the "From" and "Subject" headers, which the SpamFilter class puts into an array for you.
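Since message numbers start at 1 while Java array indices start at 0, a loop over the mailbox has to shift by one. The sketch below illustrates this. The interface HeaderSource is a hypothetical stand-in that declares only the getNumMsgs and header methods described above, so that the example compiles without the lab's MailConnection class; the fake mailbox in main is likewise made up for illustration.

```java
// Hypothetical stand-in for the two MailConnection methods used here.
interface HeaderSource {
    int getNumMsgs();
    String header(int msg);
}

class HeaderLoopDemo {

    // Collects the raw header string of every message into an array.
    static String[] allHeaders(HeaderSource source) {
        int count = source.getNumMsgs();
        String[] result = new String[count];
        // Messages are numbered 1..count; array slots are 0..count-1.
        for (int msg = 1; msg <= count; msg++) {
            result[msg - 1] = source.header(msg);
        }
        return result;
    }

    public static void main(String[] args) {
        // A fake two-message mailbox standing in for a real server.
        final String[] fake = { "Subject: first", "Subject: second" };
        HeaderSource source = new HeaderSource() {
            public int getNumMsgs() { return fake.length; }
            public String header(int msg) { return fake[msg - 1]; }
        };
        for (String h : allHeaders(source)) {
            System.out.println(h);
        }
    }
}
```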

Your job is to complete the definition of the SpamFilter class to search the "From" and "Subject" headers for keywords indicating the message is spam. The SpamFilter class extends FrameController rather than FrameWindowController. We do this because this program makes no use of the canvas. The begin method constructs the GUI. If you need to initialize some instance variables, you may want to add code to the begin method to do that. The GUI components that we construct and that your program will need to interact with are:

JTextArea mailArea
This is the main text area where the user's good mail headers or spam headers are displayed.

JComboBox goodOrSpam
This is the menu at the bottom of the screen that allows the user to control whether mailArea displays the headers from good mail or from spam. When the user makes a selection from this menu, what is displayed should change if the user's mail has already been downloaded. If no mail has been downloaded, nothing should happen.

JTextField filterField
This is the text field where the user enters a filter to be added to or removed from the list of filters.

JTextArea filterArea
This is the text area showing all the terms actively used as filters. Each filter the user adds should be displayed on a separate line within this text area.

JButton addFilterButton
This is the button labelled "Add to filter". When the user clicks this button, the entry in filterField should be added to filterArea. The filterField should be cleared. If the user's mail has already been downloaded, it should be refiltered and mailArea should be updated to show the results of the filtering.

JButton removeFilterButton
This is the button labelled "Remove from filter". When the user clicks this button, the entry in filterField should be removed from filterArea. The filterField should be cleared. If the user's mail has already been downloaded, it should be refiltered and mailArea should be updated to show the results of the filtering. If the entry in the text field does not exist in the filters, nothing should happen.

JComboBox servers
This is the menu where the user selects the mail server to connect to. Note that the user can type in the name of the server to connect to a server not in the original menu.

JTextField user
This is the text field where the user types in his/her username to connect to a mailbox.

JPasswordField pass
This is where the user types in his/her password. A JPasswordField is very similar to a JTextField except that what the user types is not displayed on the screen. Also, the program uses the getPassword() method to find out what the user typed in this field, rather than getText(). Recall that getPassword returns a char[] rather than a String, but we just pass this value on to the MailConnection constructor.

JButton connectButton
This is the button labeled "Get mail". When the user clicks this button, the program creates a new mail connection using the information in servers, user, and pass. It then downloads headers from all of the user's mail and saves just the "From" and "Subject" headers. It then closes the connection to the mail server. Your job is then to filter the saved mail headers into good mail or spam using the current filters. Then, display good mail or spam depending on the setting of the goodOrSpam menu.

You must provide:

Note that the code you are given extracts just the "From" and "Subject" headers from the long string that header returns. As shown earlier, that string contains multiple headers separated by newlines. To find a single header, the program looks for a substring that begins with a newline character (\n) followed by "From:" or "Subject:" and ends at the next newline character. Note how it handles the special case where the header it is looking for is the last one and therefore does not end with a newline. The search for these strings is case-sensitive.
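The search just described can be sketched as follows. This is an illustration of the idea, not the starter code itself: the names HeaderExtractDemo and extractHeader are hypothetical, and we assume here that the returned string keeps the header name (for example "From:") at its front.

```java
// Hypothetical sketch; the provided starter code may differ in detail.
class HeaderExtractDemo {

    // Returns the header line that starts with the given name (for example
    // "From:" or "Subject:"), or an empty string if there is no such header.
    // The search is case-sensitive.
    static String extractHeader(String headers, String name) {
        // The header must follow a newline, so a header on the very
        // first line of the string would not be found by this search.
        int start = headers.indexOf("\n" + name);
        if (start == -1) {
            return "";
        }
        start = start + 1; // skip the leading newline

        int end = headers.indexOf("\n", start);
        if (end == -1) {
            // Special case: the header is the last one and has no
            // trailing newline, so it runs to the end of the string.
            end = headers.length();
        }
        return headers.substring(start, end);
    }

    public static void main(String[] args) {
        String headers = "Date: Fri, 18 Nov 2005 15:35:59 -0500\n"
                + "From: David Guertin\n"
                + "Subject: Re: two needs for CS 101";
        System.out.println(extractHeader(headers, "From:"));
        System.out.println(extractHeader(headers, "Subject:"));
    }
}
```

Searching for "\n" + name rather than just name keeps the search from matching the same text in the middle of another header line.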

Implementation Suggestions

We suggest that you approach this problem in the following order:

Submitting Your Work

We will grade your assignment both by running your submitted BlueJ project and by reading a printout of your Java source code. You should submit a printout for SpamFilter.java only. Homework is due by the beginning of your lecture section.

Before submitting your work, make sure that your .java file includes a comment containing your name and lab section. Also, before turning in your work, be sure to double check both its logical organization and your style of presentation. Make your code as clear as possible and include appropriate comments describing major sections of code and declarations. Make sure your indentation is all consistent. Refer to the lab style sheet for more information about style.

Print out a copy of your Java source file, your design document, and the Homework 10 Cover Page. The cover page provides the guidelines for how your homework will be graded. Turn in one stapled hardcopy of all your work, with the cover page on top. You should include at the bottom any assigned exercises from the textbook.

Electronic submission

Because of Java security mechanisms, it will not be possible to run this program in a web browser or with an appletviewer. For this reason we will want you to submit your entire BlueJ project.

Open a terminal window and type

   cd public_html/cs101/Spam

This should bring you to your files for this assignment. Then, type

   ~cs101/submit

This will copy all files in your current directory and store them in our account. You can resubmit as many times as you want -- only the newest submission will be kept. The final deadline for both paper and electronic submission is Friday, December 2 at the beginning of your lecture section.

Good luck and have fun!

