Practice Problems 8
Complete before class on 2022-11-11

Complete the following problems. Try to solve each problem on paper first, then use Thonny to confirm your answers.

  1. Assume that a=np.array([1, 2, 3]) and b=np.array([4, 5, 6]) (after import numpy as np). Evaluate each of the following expressions. Make it clear whether the result is a scalar (a single value) or a vector (an array of values).
    1. a * b
    2. np.sum(a - b)
    3. a / np.sum(a)
    4. (a + b) + 2
    5. b - np.mean(b)
  2. Rewrite the following code into “plain” Python that does not use NumPy, assuming a is a list. Built-in functions such as sum are considered “plain” Python.

     def mystery(a):
         return np.max(a) - np.min(a)
    
  3. Rewrite the following code into “plain” Python that does not use NumPy, assuming a and b are lists of the same length. Built-in functions such as sum are considered “plain” Python.

     def mystery(a, b):
         return np.sum((np.array(a)-np.mean(a)) * (np.array(b)-np.mean(b)))
    
  4. Rewrite the following Python function using NumPy to not have any explicit loops:

     def length_normalize(items):
         """
         Normalize all the values in the list by the sum
             
         Args:
             items: A list of numbers
            
         Returns: List of normalized numbers
         """
         total = 0
         for item in items:
             total += item
            
         new_items = []
         for item in items:
             new_items.append(item / total)
         return new_items
    
  5. Consider the following Table assigned to the tips variable, a subset of which is shown below (you can download the file here and read it into Python via tips = ds.Table().read_table("tips.csv")).

     >>> tips
     total_bill | tip  | sex    | smoker | day  | time   | size
     16.99      | 1.01 | Female | No     | Sun  | Dinner | 2
     10.34      | 1.66 | Male   | No     | Sun  | Dinner | 3
     21.01      | 3.5  | Male   | No     | Sun  | Dinner | 3
     23.68      | 3.31 | Male   | No     | Sun  | Dinner | 2
     24.59      | 3.61 | Female | No     | Sun  | Dinner | 4
     25.29      | 4.71 | Male   | No     | Sun  | Dinner | 4
     8.77       | 2    | Male   | No     | Sun  | Dinner | 2
     26.88      | 3.12 | Male   | No     | Sun  | Dinner | 4
     15.04      | 1.96 | Male   | No     | Sun  | Dinner | 2
     14.78      | 3.23 | Male   | No     | Sun  | Dinner | 2
     ... (234 rows omitted) 
    

    Briefly describe the plot generated by the following code. & is the element-wise and operation.

     d = tips.where((tips["sex"] == "Female") & (tips["time"] == "Lunch"))
     plt.plot(d["total_bill"], d["tip"], "ro")
     d = tips.where((tips["sex"] == "Male") & (tips["time"] == "Lunch"))
     plt.plot(d["total_bill"], d["tip"], "bo")
     d = tips.where((tips["sex"] == "Female") & (tips["time"] == "Dinner"))
     plt.plot(d["total_bill"], d["tip"], "rx")
     d = tips.where((tips["sex"] == "Male") & (tips["time"] == "Dinner"))
     plt.plot(d["total_bill"], d["tip"], "bx")
     plt.show()
    
  6. For the dataset above, write datascience code to subset the data to just those rows where the tip is greater than 15% of the total bill.

  7. For the dataset above, write code using the datascience group method to concisely and efficiently compute the average tip percentage for all combinations of diner gender and meal time (“Lunch” vs. “Dinner”). As a suggestion, the NumPy np.mean function can be used as the function applied to each group.

  8. [Bonus] Write code to perform the same computation, the mean tip percentage for all combinations of diner gender and meal time (“Lunch” vs. “Dinner”), using just Python built-in functions and data structures. There are many ways to go about this, but as a hint, tuples, e.g. (sex, time), can be used as dictionary keys. You can iterate through the rows of a Table with the rows attribute and access the fields of each row as attributes, e.g.

     for row in tips.rows:
         print(row.tip / row.total_bill)
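
As a reminder of how tuples work as dictionary keys, here is a minimal sketch on made-up data unrelated to tips.csv. The observations list and the names totals, counts, and means are hypothetical, chosen only for illustration; this is one possible pattern for grouped averages, not the required approach.

```python
# Hypothetical records, each a (species, season, weight) tuple.
observations = [
    ("cat", "summer", 4.0),
    ("dog", "summer", 10.0),
    ("cat", "winter", 5.0),
    ("cat", "summer", 6.0),
]

totals = {}  # maps (species, season) -> running sum of weights
counts = {}  # maps (species, season) -> number of records seen

for species, season, weight in observations:
    key = (species, season)  # a tuple is hashable, so it can be a dict key
    totals[key] = totals.get(key, 0) + weight
    counts[key] = counts.get(key, 0) + 1

# Divide each group's sum by its count to get the group mean.
means = {key: totals[key] / counts[key] for key in totals}

for key, value in means.items():
    print(key, value)
```

Using dict.get with a default of 0 avoids a separate check for whether a key has been seen before.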