CS 312 Software Development

CS 312 - Practical Eight

In this practical you will implement the database backend for the Simplepedia server. You will test your implementation against the file-based SQLite database, but because you will access it through the Knex library, the same code would also work with MySQL or PostgreSQL (Heroku, for instance, provides PostgreSQL).

Goals

  • Gain familiarity with RDBMS-backed servers, Knex and SQLite.
  • Employ code generators. Code generation (like with CRA) is a very helpful feature for working with complex libraries (that, for example, expect specific files in specific directories).

Prerequisites

  1. Visit the Practical 8 page on Replit (or click the 'Start Project' link for Practical 8 on our team page).

  2. Click through to the GitHub classroom assignment to create your private repository.

  3. Return to the assignment on Replit and go to the Version Control panel.

  4. Click the button to create a new git repository (not the one for connecting to an existing repo).

  5. Open the shell and follow the instructions on your GitHub repository for connecting an existing repository:

    1. git remote add origin repo-name where repo-name is the name of your repository (e.g., https://github.com/csci312-s21/practical08-ChristopherPAndrews.git)
    2. git branch -m main
    3. git push -u origin main

Setup Knex

Knex is both a library and a command line tool (usable via npx). Install Knex and the database clients (for SQLite 3 and PostgreSQL):

npm install knex pg sqlite3

Initialize Knex with npx knex init. This will create a configuration file named ./knexfile.js.

Remove the descriptions of the staging and production servers, and edit the file to look like the snippet below. You are configuring a SQLite-based test environment (with its own seed directory and an in-memory instance of SQLite), a SQLite-based development environment, and a production configuration using PostgreSQL. You won't need the production configuration today, but you would use something similar when deploying to Heroku (e.g., for your project).

module.exports = {
  development: {
    client: "sqlite3",
    connection: {
      filename: "./simplepedia.sqlite3",
    },
    useNullAsDefault: true,
  },

  test: {
    client: "sqlite3",
    connection: ":memory:",
    useNullAsDefault: true,
    seeds: {
      directory: "./seeds/test",
    },
  },

  production: {
    client: "pg",
    connection: process.env.DATABASE_URL,
    ssl: true,
  },
};

Make a Knex Migration

Recall that migrations are how we automatically configure the database. Each migration has two parts, the "up" function which makes the desired changes to the database, e.g. create a table or add a column, and the "down" function which reverts those changes. For SQLite, running the migration will also create the database file if it doesn't exist.

Create a skeleton for a migration with npx knex migrate:make load-articles. This command will create a new folder called migrations, in which you will find your new migration file (note that the date is included in the name so the current state of the database schema can be tracked). You will need to fill in the exports.up (which should create your table) and exports.down (which should delete or "drop" the table) using the Knex schema builder API.

In the database, you want to create a table named Article with the four columns that correspond to the Article properties. Recall from the programming assignments that the four properties are: id, title, contents, and edited. The id field should be of type increments, an auto-incrementing integer, ensuring that each article has a unique id number (Knex automatically sets the increments column as the primary key). The remaining columns should be string or text as shown below. The difference between string and text is the intended size: string is typically a shorter, fixed-length value, while text implies a longer, variable-length string. The actual implementation will depend on the underlying database (e.g., SQLite has only variable-length strings).

exports.up = function (knex) {
  return knex.schema.createTable("Article", (table) => {
    table.increments("id");
    table.string("title");
    table.text("contents");
    table.string("edited");
  });
};

exports.down = function (knex) {
  return knex.schema.dropTableIfExists("Article");
};

When defining the schema, you can specify additional constraints on the columns (as chained methods). What additional constraints should exist on the articles? Can there be articles with identical titles? Can the edited time be null? Add additional constraints to your schema.

View the schema

Once you have completed your migration, run any unperformed migrations (in this case just one) with the following command. Note we are explicitly specifying the environment (and thus the database that should be modified).

npx knex migrate:latest --env development

The migration should have created a simplepedia.sqlite3 file.

SQLite CLI

Replit's shell does not have a copy of the SQLite interpreter installed, but you can download the simplepedia.sqlite3 file to your local machine and try out the CLI. macOS and Linux should have it installed by default; on Windows you will likely have to install it manually.

To check out the current schema, open the database with the SQLite client application in the shell via sqlite3 simplepedia.sqlite3 and execute .schema at the interpreter prompt. You should see the schema for your new Article table (as well as a knex table that knex added to keep track of migrations -- don't tinker with this). To get out of the sqlite3 command line tool, type Ctrl+D or .exit.

The sqlite3 client accepts two different types of commands. You can type raw SQL statements to interact with the database directly, or you have a small collection of system commands for working with the client. The system commands all start with a dot (like .schema). You can get a full list of these commands with .help.

Seeding the Article Data

"Seeding" is the process of pre-populating databases. It is worth noting that we frequently do not need to do this. If you are making a blog, online market, scheduling tool, etc., your content comes dynamically from your users. Even Simplepedia could conceptually be started "clean". However, seeding is frequently used for testing.

Create the Seed File

Seed files are short scripts that populate the database. Create a skeleton seed file with:

npx knex seed:make load-articles --env development

The above command should have created a seeds/load-articles.js file. Modify that file to read in the articles from the seed.json file and insert them into the database. For variety (and simplicity), we will use Node's synchronous file I/O interface this time. Update the code generated by Knex to look like the snippet below:

const fs = require("fs");

exports.seed = function (knex) {
  const contents = fs.readFileSync("seed.json");
  const data = JSON.parse(contents);

  // Deletes ALL existing entries
  return knex("table_name")
    .del()
    .then(function () {
      // Inserts seed entries
      return knex("table_name").insert([
        { id: 1, colName: "rowValue1" },
        { id: 2, colName: "rowValue2" },
        { id: 3, colName: "rowValue3" },
      ]);
    });
};

Modify the exports.seed function by changing table_name to match your table name (i.e. Article).

If you followed the directions above, the column names in the database match the property names of the objects in the data Array. Those objects can be passed directly to the insert function (the database itself will generate the id property). However, we have too much data to just pass the whole array to insert; instead we need to use Knex's batchInsert utility to insert the articles in blocks (in this case, blocks of 100 articles).

exports.seed = function (knex) {
  const contents = fs.readFileSync("./data/seed.json");
  const data = JSON.parse(contents);

  // Deletes ALL existing entries
  // Use batch insert because we have too many articles for simple insert
  return knex("Article")
    .del()
    .then(() => knex.batchInsert("Article", data, 100));
};
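Conceptually, batchInsert just splits the rows into blocks and issues one insert per block. A rough plain-JavaScript sketch of that chunking step (the chunk helper and its names are hypothetical, for illustration only, not Knex's actual implementation):

```javascript
// Hypothetical illustration of what batchInsert does internally:
// split the rows into blocks of at most `size`, then each block
// would be passed to a single insert.
function chunk(rows, size) {
  const blocks = [];
  for (let i = 0; i < rows.length; i += size) {
    blocks.push(rows.slice(i, i + size));
  }
  return blocks;
}

// e.g., 250 articles in blocks of 100 -> three inserts (100, 100, 50)
const blocks = chunk(new Array(250).fill({}), 100);
```

The point of batching is to avoid building one enormous SQL statement; each block becomes a manageable insert of its own.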

Run the Seed

Run the seed with npx knex seed:run --env development to populate your database. Now reopen your database with the SQLite3 command-line tool via sqlite3 simplepedia.sqlite3. Execute the following command in the SQLite interpreter to view the first 10 articles.

A note about SQLite grammar: the semi-colon is required, but the keywords in SQL are not case-sensitive. You will frequently see them in all caps, but the parser will accept them either way.

select * from Article limit 10;

While you have the interpreter open, experiment with a few SQL queries. How could you query for just the article with an id of 11, or all the articles with titles that start with "C" (and thus would be shown in the "C" section of the Simplepedia client)? Specifically, what WHERE conditions would you use in your query?


Update the Server

Now it is time to look at the server. If you look at the files in pages/api you will see that all of the important work is being done in a library file: lib/backend-utils.js. This is another instance of separation of concerns. There is no reason for the API routes to know where the data comes from. This makes it easy to swap out one solution for another (switching from a JSON backed solution to a database, say).

This will also make testing easier, since we can focus on the helper functions. This isn't to say that we shouldn't test that the routes do the right thing (that would be the server side of testing how it appears to the user), but it will certainly make this practical easier...

Connect to the database

Before you can use the database in your code, you need to initialize Knex. You can do this at the top of backend-utils.js.

Note the common pattern in which we use process.env.NODE_ENV to specify the environment (e.g. production, development, etc.) or default to development.

import knexConfig from "../../knexfile";
import knexInitializer from "knex";

const knex = knexInitializer(
  knexConfig[process.env.NODE_ENV || "development"]
);

Implement getArticles()

In order to fetch all of the articles, you want to perform a select query with Knex. If you don't give the function any arguments, it is the equivalent of SELECT *, which returns all the columns. As with the queries described earlier, the Knex query builder returns a Promise, so we want to use await.

const rows = await knex("Article").select();

Return the rows from the function. This is the equivalent to the SQL statement SELECT * FROM Article. It will return every attribute of every article in the collection. These will be marshalled into JavaScript objects and returned in an Array.

You can test that this function works by firing up the dev server and then popping out the web view and tacking /api/articles on to the end of the URL.

You should see a long array of articles.

Testing

The skeleton includes a simple test suite in backend-utils.test.js. If you look at the import statement at the top of the file, you will see that it imports all of the functions we are testing, as well as the knex variable you created. This will allow us to manipulate the knex instance in our tests.

The Jest test runner automatically sets NODE_ENV=test; thus Knex will use the test database you defined in knexfile.js (not the development database you just seeded). Note the simpler ./seeds/test/load-articles.js script. In test mode, the database will be seeded with this smaller collection. This is just one example of how being able to define different database environments is a very helpful feature of Knex.

As you should recall, we want our tests to be repeatable and to run independently. So, we need the database to be in the same state at the start of every test.

This is another advantage of using knex -- we can automate migration and seeding. We will "rollback" the latest migration (run the down function in the migration, which will delete the table), run the migration again to get a clean table, and finally re-seed it. We can do all of that in the beforeEach:

beforeEach(async () => {
  await knex.migrate.rollback();
  await knex.migrate.latest();
  await knex.seed.run();
});

Let's test the function you just completed. If you run npm test you will be flooded with errors from all of the functions you haven't implemented. So, look for the first test and change the function test to test.only (the function call will now look like test.only("getArticles gets all articles", ...etc)). This will run just this one test. This is a great tool for when you are worried about one particular test. Another useful variant is skip, which can be added in the same way. You can use this if there is one test you aren't ready to tackle and you just want to hide it. This can also be applied to describe to do this at the test suite level.

Now run the tests with npm test. The test for the getArticles function you just implemented should pass.

Implement getArticle

This function accepts an id and returns the corresponding article if it can be found. If it isn't found, it should return null.

The query will look almost the same as the last one. This time, however, we want to add a conditional statement to the end. In this case, use .where({id:id}). This just chains on to the end of the query.

There are a number of different forms for .where. In this case we are making use of the "object syntax" form, which allows us to specify an object with column names as properties and desired values as the property values.

Bear in mind that this will still return an array, even though we are just expecting a single article (if it can be found). So, you need to check if there are values in the returned array. If there are, return the first one (not the array), and if there aren't return null.
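That last step can be sketched in plain JavaScript, independent of Knex (the helper name is made up; rows stands in for the array the query resolves to):

```javascript
// Hypothetical helper: given the array that a .where({ id }) query
// resolves to, return the single matching article, or null if
// nothing matched.
function firstOrNull(rows) {
  return rows.length > 0 ? rows[0] : null;
}

firstOrNull([{ id: 11, title: "Cats" }]); // the article object itself
firstOrNull([]); // null
```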

Test this in the browser and using npm test.

Implement deleteArticle

To implement this one, you can copy the same query again, as we are again interested in finding an article with a particular id. However, we want to delete it rather than returning it, so you can remove the .select and add .del() on the end.

This time the return value of the query isn't an array, it is count of how many records have been deleted. Use this to return a Boolean value indicating if the deletion was a success or not.
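In plain JavaScript, converting that count into the Boolean the function should return is a one-liner (a sketch; deletedCount stands in for the value the .del() query resolves to):

```javascript
// A delete query resolves to the number of rows removed.
// At least one row removed means success; zero means no article
// with that id existed.
function wasDeleted(deletedCount) {
  return deletedCount > 0;
}
```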

Implement updateArticle

This function is very similar to deleteArticle (detecting patterns yet?). However, instead of del(), you want to use update(). We pass the fields that we want to update in an object. So, we can pass the entire article object in and it will update all of the fields (even if they don't all need updating).
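In plain-JavaScript terms, passing the whole article to update() behaves like merging every named property into the stored row, whether or not it changed. A sketch of that semantics (the sample objects are made up; this is not Knex code):

```javascript
// update() sets every column named in the object you pass,
// even columns whose values are unchanged.
const stored = { id: 7, title: "Old title", contents: "...", edited: "2021-01-01" };
const incoming = { id: 7, title: "New title", contents: "...", edited: "2021-05-19" };

// The effective result is the stored row with every incoming field applied.
const updated = { ...stored, ...incoming };
```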

Implement addArticle

This one is a little trickier because we want to add the article to the collection and then return a new object that has the new id field added by the database. Unfortunately, while we use SQLite, we can't do this in a single step. We need to first insert the article into the database. This operation returns the id of the new article. Then we have to look up the appropriate article in the database in a second query.

You will use the .insert function to add the data. This works as you might expect, you can just pass in an object with properties that match the table's fields.

One gotcha with this is that the return value is an array of length one, where the first element is the new id. This is because, as you saw in the seed file, .insert() can insert multiple records at a time.
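A common pattern is to destructure the single id out of that array. A plain-JavaScript sketch (insertResult stands in for what the insert query resolves to; the value 42 is made up):

```javascript
// .insert() can add many records at once, so it resolves to an
// array of ids, one per inserted record. When we insert a single
// article, we just want the first (and only) element.
const insertResult = [42]; // e.g., what an insert might resolve to
const [id] = insertResult;
```

You can then use that id in the follow-up query that fetches the newly created article.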

Implement getSections

This function requires the most complicated of the queries. We could just write a query that returns all of the titles and then use the code you wrote for assignment 2 to convert them to a list of sections. However, SQL is pretty powerful and we can do a little better.

Unfortunately, Knex doesn't support all of the SQL features that we need, so we are going to resort to knex.raw(), which allows us to write some native SQL. In this case I am just going to give you the whole query:

let sections = await knex("Article")
  .select(knex.raw("upper(substr(title, 1, 1)) as section"))
  .distinct();

We are making use of two string manipulation functions that are present in both SQLite and PostgreSQL. Basically, this takes the first character of every title (using substr) and then converts them to uppercase (using upper). We then use as to rename the resulting "column". Finally, we add .distinct() on the end to make it a unique set.

Bear in mind that the output of this will be no different from our other selection queries -- an array of objects. In this case, the objects will have a single property: section. You will want to convert this array into an array of plain strings. Hopefully I no longer need to remind you which tool to use to transform an array of values into a different array of modified values...
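For concreteness, here is that conversion sketched in plain JavaScript, using a made-up rows array shaped like the query result:

```javascript
// The query resolves to an array of objects, each with a single
// `section` property, e.g. [{ section: "A" }, { section: "C" }].
const rows = [{ section: "A" }, { section: "C" }];

// Convert to an array of plain strings.
const sections = rows.map((row) => row.section);
```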

Implement getTitles

This function is very similar to getArticles() and getArticle(). This time, we want to select only the id and title fields, so add those as strings to the .select(). We need a .where(), but this time we won't use the object syntax. Instead, we are going to use .where("title", "like", `${section}%`).

This uses SQL's built in LIKE operator that does pattern matching ("%" is the wildcard character). So, this will find all articles that have a title that starts with our section.

We can make the result a little nicer by ordering them as well. All you need to do is add .orderBy("title") on to the end of the query.
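To see what the database is doing, the combination of LIKE prefix matching and ordering can be mimicked in plain JavaScript (a sketch with a made-up titles array; it ignores the fact that SQL's LIKE is case-insensitive for ASCII in SQLite, while startsWith is case-sensitive):

```javascript
// LIKE "C%" matches any title beginning with "C"; in JavaScript
// terms, that prefix match is String.prototype.startsWith.
const titles = ["Cats", "Dogs", "Computers", "Art"];
const section = "C";

const matching = titles
  .filter((title) => title.startsWith(section))
  .sort(); // analogous to chaining .orderBy("title")
```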

Use Your New Server

When you are all done, you can use your newly implemented server with the Simplepedia client you implemented in assignment 4 (provided you have completed assignment 4). You should be able to copy your components and pages over and set the SERVER variable to the empty string.

Finishing Up

  1. Add and commit your changes and push them to GitHub. Make sure to add and commit the new files you created.
  2. Submit your repository to Gradescope

Extra

Deploy to Heroku

Deploying this practical to Heroku and setting up the production PostgreSQL server would be good practice for the project. Here are some steps you can follow to get your code deployed:

First, we are going to add a new script to package.json that will allow Heroku to start your application on an arbitrary port.

"heroku:start": "next start -p $PORT"

Second, add a new file called Procfile. Inside, add one line, which will tell Heroku how to start your application:

web: npm run heroku:start

Commit your changes and push them to GitHub.

Now you can go to Heroku and make a new app. Use the GitHub deployment tab to connect to your repository and do a manual deployment. All of these steps are basically the same as you did for the DevOps practical. Now for the new stuff: provisioning the database.

Go to the Resources tab. You will see something that says 'Add-ons', with a search field under it. Type "postgres" into the field. It will make suggestions as you type -- you are looking for "Heroku Postgres". Select it and agree to the free "Hobby-dev" level. You now have a database connection. Heroku has even configured your environment variables for you so that process.env.DATABASE_URL will now point to the new database. However, we aren't done yet...

Go to the Settings tab. There you will see the "Config Vars". Click on the "Reveal Config Vars" button. There you will find your freshly created DATABASE_URL variable. You need to add a second variable. In the space provided, add a variable PGSSLMODE with value no-verify. This gets around the fact that we don't have an SSL provider.

Now that the database is set up, we need to create our tables and seed it. This is pretty much the same process you started with, but now we need to do it through Heroku's web interface. Click the "More" button in the upper right and select "Run command". This will allow you to run command line instructions within your deployment. The two commands you need to run are:

npx knex migrate:latest

and

npx knex seed:run

If both of those run without errors, you should be ready to go.

Just like you directly accessed your SQLite database via the sqlite3 client, you can do the same with your PostgreSQL database. Download and install one of the many PostgreSQL clients and use the DATABASE_URL from Heroku for the connection information.


Last updated 05/19/2021