CS 451 Homework 1 - Multivariate linear regression using gradient descent

Due: Monday, 9/24, at 5pm

This is the first programming assignment. It is in Octave/Matlab, while most of the remaining assignments will use Python. The goal of this assignment is to complete the work from the in-class exercises from class 3 and class 4, and then apply your program to a small dataset.

Each student should work independently, but you may use the code you developed in class with your partner as a starting point.

1. Create a directory hw1 within your cs451 directory, and copy the files from class 4. Also copy any code you developed with your partner. Make a copy of lab4.m and call it hw1.m. Add a comment with your name at the top. Your hw1.m script should read a matrix file as input. Initially you can use ex1data2.txt, but later you should substitute a different file. Your program should work for any number of features n and any number of training examples m. The input matrix will have size m x (n+1) and can be written as [X y], i.e., the first n columns are the input matrix X, and the last column is y.
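As a rough illustration of the [X y] layout, the following sketch splits a data matrix into X and y without hard-coding n. The matrix values are made-up placeholders rather than the contents of any course file, and the variable names are only one possible choice.

```octave
% Placeholder data matrix of size m x (n+1); values are illustrative only.
data = [2000 3 400000;
        1600 3 330000;
        2400 4 370000;
        1400 2 230000];

[m, cols] = size(data);   % m = number of training examples
n = cols - 1;             % n = number of features (all columns but the last)
X = data(:, 1:n);         % m x n input matrix
y = data(:, end);         % m x 1 target vector
fprintf('m = %d, n = %d\n', m, n);
```

Because n is derived from the matrix size, the same lines should keep working when a file with more feature columns is substituted.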

As in lab 4, your script should perform feature normalization and then estimate theta using gradient descent and print it. It should also plot the cost curve given the code provided. If you want, experiment with the learning rate and the number of iterations to see how fast you can get it to converge.
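The two steps described above might look roughly like the sketch below. It is not the lab 4 code: the toy data, the values of alpha and num_iters, and all variable names are assumptions chosen for illustration, and the loops from the in-class exercises may be organized differently.

```octave
% Toy data (made up) generated exactly from y = 1 + 2*x1 + 3*x2.
X = [1 2; 2 1; 3 4; 4 3; 5 5];
y = 1 + 2 * X(:, 1) + 3 * X(:, 2);
m = size(X, 1);

% Feature normalization: subtract the column mean, divide by the column std.
mu = mean(X);
sigma = std(X);
Xnorm = (X - repmat(mu, m, 1)) ./ repmat(sigma, m, 1);

% Prepend the column of ones for the intercept term.
Xb = [ones(m, 1) Xnorm];

% Batch gradient descent (alpha and num_iters are assumed values).
alpha = 0.3;
num_iters = 500;
theta = zeros(size(Xb, 2), 1);
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
  err = Xb * theta - y;                       % residuals, m x 1
  J_history(iter) = (err' * err) / (2 * m);   % cost before this update
  theta = theta - (alpha / m) * (Xb' * err);  % update all thetas at once
end

disp(theta);
```

Plotting J_history against the iteration number gives the cost curve; a curve that is still dropping at the last iteration suggests more iterations or a larger alpha.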

2. Once it works, apply your script to a different dataset with more features. Go to this Linear Regression Datasets page and download dataset "x15.txt" (gas consumption versus local conditions). Make a copy of the file, e.g., "x15a.txt", and remove all the text above the actual numbers. Then, load the file from your hw1.m script. After loading, remove the first column of the data matrix since it contains the row numbers, which we don't want to use as a feature. Assuming your implementation can handle an arbitrary number of features n, your code should now run and compute theta. Print the final theta, as well as the final cost.
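Loading a numeric text file and dropping its index column could be sketched as follows. To stay self-contained, the example writes a tiny stand-in file with placeholder numbers (not the real dataset) and assumes a layout of row number, four features, and the target; in hw1.m you would load your own x15a.txt instead.

```octave
% Write a small stand-in file; the numbers are placeholders, not real data.
fname = [tempname() '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '1 8.0 4000 2000 0.55 550\n');
fprintf(fid, '2 7.5 4500 1500 0.60 520\n');
fprintf(fid, '3 9.0 3800 3000 0.50 580\n');
fclose(fid);

data = load(fname);   % a numbers-only text file loads as a matrix
data(:, 1) = [];      % delete the first column (row numbers)
n = size(data, 2) - 1;
fprintf('%d examples, %d features\n', size(data, 1), n);

delete(fname);        % clean up the stand-in file
```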

Now, suppose we had a new state with the following features:

 xtest = [7.5  4200  9999  0.7]
Add some code to your hw1.m script to compute the estimated petrol consumption ytest = h_theta(xtest) and print it. Note that you'll have to apply feature normalization using the mu and sigma values you computed earlier before you can multiply by theta.
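The prediction step can be sketched as below. The mu, sigma, and theta values here are hypothetical placeholders standing in for whatever your script computed during training; only xtest comes from the assignment.

```octave
% Hypothetical training results (placeholders, not real fitted values).
mu = [8 4000 5000 0.6];          % 1 x n feature means
sigma = [1 500 3000 0.05];       % 1 x n feature standard deviations
theta = [570; -10; -30; 5; 60];  % (n+1) x 1, intercept first

xtest = [7.5 4200 9999 0.7];

% Normalize with the training mu and sigma, then prepend the intercept 1.
xtest_norm = (xtest - mu) ./ sigma;
ytest = [1 xtest_norm] * theta;
fprintf('ytest = %.4f\n', ytest);
```

The key point is that mu and sigma are reused from training rather than recomputed from xtest, since theta was fitted to features scaled that way.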

3. Your last task is to compute theta via the normal equation, without feature normalization. Call the result theta2, and print it, as well as the corresponding cost J2. Use theta2 to compute another estimate ytest2 = h_theta2(xtest). Note that theta2 will differ from theta, since it applies to the unnormalized features, but the estimates ytest and ytest2 should be the same - are they? (If not, perhaps your gradient descent didn't fully converge, and you'll have to adjust alpha or the number of iterations.)
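A minimal sketch of the normal equation, theta2 = (X'X)^(-1) X'y, on made-up toy data is shown below. The data, the test point, and the use of pinv are illustrative assumptions; the backslash form (Xb' * Xb) \ (Xb' * y) is a common alternative.

```octave
% Toy data (made up) generated exactly from y = 1 + 2*x1 + 3*x2.
X = [1 2; 2 1; 3 4; 4 3; 5 5];
y = 1 + 2 * X(:, 1) + 3 * X(:, 2);
m = size(X, 1);

% No feature normalization here; just prepend the column of ones.
Xb = [ones(m, 1) X];

% Normal equation, solved in closed form.
theta2 = pinv(Xb' * Xb) * Xb' * y;

% Cost at theta2, using the same squared-error cost as gradient descent.
J2 = sum((Xb * theta2 - y) .^ 2) / (2 * m);

% Prediction uses the raw (unnormalized) test features.
xtest_toy = [2 3];               % hypothetical test point for the toy data
ytest2 = [1 xtest_toy] * theta2;

disp(theta2);
fprintf('J2 = %g, ytest2 = %g\n', J2, ytest2);
```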

What to hand in

Make sure when you run your script hw1.m, it plots the cost function and prints theta, J, ytest, theta2, J2, and ytest2. Create a zip archive containing your code and all the files it uses (including x15a.txt). Before submitting, unzip your archive in a new directory, restart Octave, load hw1.m, and test that everything still works. Once it does, submit your complete archive using the CS 451 HW 1 submission page.