Tag Archives: Accelerated C++

Accelerated C++ Solution to Exercise 4-3

Exercise 4-3

What happens if we rewrite the previous program to allow values up to but not including 1000 but neglect to change the arguments to setw? Rewrite the program to be more robust in the face of changes that allow i to grow without adjusting the setw arguments.

Solution

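To answer the first part: setw sets only a minimum width, so nothing is truncated. With the setw(3) and setw(6) of the previous program, however, the square fills the whole setw(6) field once it reaches six digits (from 317 onwards, since 317 × 317 = 100489), and the two columns run together.

There are many ways to solve the second part. The strategy I use below aims for flexibility: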

  1. Obtain the asymmetric range [m,n) from the user, with the condition n > m. In other words, m is the startNumber and n – 1 is the endNumber.
  2. Compute the number of elements within this asymmetric range as n – m. This is the number of loop iterations required (or the number of rows to output).
  3. Create a function getStreamWidth that computes the maximum width required for column 1 (the list of numbers) and column 2 (the square of column 1). This function handles both negative and non-negative input values.
  4. Use a for loop to output columns 1 and 2 using the column widths computed up front (step 3 above).

The Project

I have decided to partition the program as follows, for the sake of practice.

C++ Source Files

  • main.cpp – contains main, where execution of the program starts.
  • getStreamWidth.cpp – contains all functions relating to obtaining the stream widths.

C++ Header Files

  • getStreamWidth.h – declares the function defined in getStreamWidth.cpp.

The diagram below shows what the Code::Blocks Management tree looks like after these files have been created.

[Image: Acpp4p3MgntTree]

The actual content of the source and header files is documented in the following sections.

Source Files

main.cpp

#include <iostream>
#include <ios>
#include <iomanip>
#include <algorithm>
#include "getStreamWidth.h"

using std::cout;  // <iostream>
using std::cin;   // <iostream>
using std::endl;  // <iostream>
using std::streamsize;  // <ios>
using std::setw;  // <iomanip>
using std::max;  // <algorithm>

int main()
{
    // display program intro message
    cout << "***********************************************************\n"
         << "*** This program computes the square of the numbers     ***\n"
         << "*** in the asymmetric range [m,n).                      ***\n"
         << "*** (Limitation: please ensure n > m)                   ***\n"
         << "*** e.g. [3,7) contains elements 3, 4, 5, 6 (but not 7) ***\n"
         << "***********************************************************";
    cout << endl;

    // ask user to supply m
    cout << "Enter m: ";
    int m;
    cin >> m;

    // ask user to supply n
    cout << "Enter n: ";
    int n;
    cin >> n;

    // ensure m and n are input correctly. If not, exit program.
    if (n <= m)
    {
        cout << "Please make sure n > m";
        return 1;
    }

    // initialise value
    const int startNumber = m;    // first output integer
    const int endNumber = n - 1;  // last output integer
    const int numLoops = n - m;  // number of rows to output

    // find the maxwidth for column 1 and 2
    const streamsize col1Width = max(getStreamWidth(startNumber), getStreamWidth(endNumber));
    const streamsize col2Width = max(getStreamWidth(startNumber * startNumber), getStreamWidth(endNumber * endNumber));

    // display a summary
    cout << "Asymmetric range: [" << m << "," << n << ")" << endl;
    cout << "Number of rows = " << numLoops << endl;
    cout << "Column 1 width = " << col1Width << " | Column 2 width = " << col2Width << endl;

    // get ready to print report
    int y = startNumber;
    for (int i = 0; i != numLoops; ++i)
    {
        cout << setw(col1Width) << y << setw(col2Width) << (y * y) << setw(0) << endl;
        ++y;
    }
    return 0;
}

getStreamWidth.cpp

#include <ios>

using std::streamsize;

// return the required streamsize to fit a particular integer number
streamsize getStreamWidth(int number)
{

    streamsize numDigits;

    // initialise numDigits depending on the sign of number.
    // If negative, start at 2: one leading space plus the minus sign.
    // Otherwise start at 1: the leading space only.
    if (number < 0)
    {
        numDigits = 2;
        number *= -1;
    }
    else numDigits = 1;

    // add one for each decimal digit of number; the do-while makes sure
    // that 0 still counts as one digit
    do
    {
        ++numDigits;
        number /= 10;
    } while (number != 0);

    return numDigits;
}
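
As a cross-check on the digit-counting approach above, the width can also be measured by formatting the number into a string and taking its length. This is only a sketch of an alternative, not part of the original solution: the name widthViaString is hypothetical, and the extra 1 is the leading separator space that getStreamWidth also reserves. Note that this version counts 0 as one digit.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical alternative to getStreamWidth: format the number into a
// string and measure it. size() already counts the minus sign and every
// digit; the extra 1 is the leading space that separates the columns.
std::streamsize widthViaString(int number)
{
    std::ostringstream out;
    out << number;
    return static_cast<std::streamsize>(out.str().size()) + 1;
}
```

For example, widthViaString(-3) gives 3 and widthViaString(999) gives 4, which matches the column 1 widths reported in the test results below.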

Header Files

getStreamWidth.h

#ifndef GUARD_GETSTREAMWIDTH_H
#define GUARD_GETSTREAMWIDTH_H

#include <ios>  // std::streamsize

std::streamsize getStreamWidth(int number);

#endif // GUARD_GETSTREAMWIDTH_H

Test Results

Asymmetric range [-3, 4)

***********************************************************
*** This program computes the square of the numbers     ***
*** in the asymmetric range [m,n).                      ***
*** (Limitation: please ensure n > m)                   ***
*** e.g. [3,7) contains elements 3, 4, 5, 6 (but not 7) ***
***********************************************************
Enter m: -3
Enter n: 4
Asymmetric range: [-3,4)
Number of rows = 7
Column 1 width = 3 | Column 2 width = 2
 -3 9
 -2 4
 -1 1
  0 0
  1 1
  2 4
  3 9

Asymmetric range [995,1006)

***********************************************************
*** This program computes the square of the numbers     ***
*** in the asymmetric range [m,n).                      ***
*** (Limitation: please ensure n > m)                   ***
*** e.g. [3,7) contains elements 3, 4, 5, 6 (but not 7) ***
***********************************************************
Enter m: 995
Enter n: 1006
Asymmetric range: [995,1006)
Number of rows = 11
Column 1 width = 5 | Column 2 width = 8
  995  990025
  996  992016
  997  994009
  998  996004
  999  998001
 1000 1000000
 1001 1002001
 1002 1004004
 1003 1006009
 1004 1008016
 1005 1010025

Asymmetric range [-1005,-996)

***********************************************************
*** This program computes the square of the numbers     ***
*** in the asymmetric range [m,n).                      ***
*** (Limitation: please ensure n > m)                   ***
*** e.g. [3,7) contains elements 3, 4, 5, 6 (but not 7) ***
***********************************************************
Enter m: -1005
Enter n: -996
Asymmetric range: [-1005,-996)
Number of rows = 9
Column 1 width = 6 | Column 2 width = 8
 -1005 1010025
 -1004 1008016
 -1003 1006009
 -1002 1004004
 -1001 1002001
 -1000 1000000
  -999  998001
  -998  996004
  -997  994009

Asymmetric range [0,1000)

***********************************************************
*** This program computes the square of the numbers     ***
*** in the asymmetric range [m,n).                      ***
*** (Limitation: please ensure n > m)                   ***
*** e.g. [3,7) contains elements 3, 4, 5, 6 (but not 7) ***
***********************************************************
Enter m: 0
Enter n: 1000
Asymmetric range: [0,1000)
Number of rows = 1000
Column 1 width = 4 | Column 2 width = 7
   0      0
   1      1
   2      4
   ...
 993 986049
 994 988036
 995 990025
 996 992016
 997 994009
 998 996004
 999 998001

Reference

Koenig, Andrew & Moo, Barbara E., Accelerated C++, Addison-Wesley, 2000

Accelerated C++ Solution to Exercise 4-2

Exercise 4-2

Write a program to calculate the squares of int values up to 100. The program should write two columns: The first lists the value; the second contains the square of that value. Use setw to manage the output so that the values line up in columns.

Solution

To keep the code as simple as possible for this particular problem, I hard-code the column widths. The code can of course be made smarter later on to remove the need for hard-coding!

setw(n) returns a manipulator (its exact type is unspecified) that, when written to an output stream s, has the effect of calling s.width(n).

After some experiments in Code::Blocks I learned that if the integer needs fewer than n characters, the output stream right-aligns it and pads the left side with spaces. The same applies to strings.
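
The padding behaviour described above can be checked with a small sketch that writes into a string instead of the console. The helper name formatRow is hypothetical and exists only for this illustration.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Format one row the way the exercise does, but into a string so that the
// padding is easy to inspect. setw applies only to the next item written,
// so it has to be repeated for each column.
std::string formatRow(int value, int square, int w1, int w2)
{
    assert(w1 >= 0 && w2 >= 0);
    std::ostringstream out;
    out << std::setw(w1) << value << std::setw(w2) << square;
    return out.str();
}
```

formatRow(7, 49, 3, 6) gives "  7    49". A width that is too small never truncates the number: formatRow(100, 10000, 2, 3) gives "10010000", with the two columns run together.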

We are asked to output two columns.

Column 1 contains the integers in the symmetric range [0,100], i.e. between 0 and 100 inclusive. We therefore require n = 3 so that the largest value, 100, fits.

Column 2 contains their squares, which lie in the symmetric range [0,10000], i.e. between 0 and 10000 inclusive. We therefore require n = 6: the largest value, 10000, takes up a width of 5, plus 1 leading space to separate this column from column 1.

The program will look like this:

#include <iostream>
#include <iomanip>

using std::cout;
using std::endl;
using std::setw;

int main()
{
    for (int i = 0; i != 101; ++i)
    {
        cout << setw(3) << i << setw(6) << (i * i) << endl;
    }
}

Result

  0     0
  1     1
  2     4
  3     9
  4    16
  5    25
  6    36
  7    49
  8    64
  9    81
 10   100
 11   121
 12   144
 13   169
 14   196
 15   225
 16   256
 17   289
 18   324
 19   361
 20   400
 21   441
 22   484
 23   529
 24   576
 25   625
 26   676
 27   729
 28   784
 29   841
 30   900
 31   961
 32  1024
 33  1089
 34  1156
 35  1225
 36  1296
 37  1369
 38  1444
 39  1521
 40  1600
 41  1681
 42  1764
 43  1849
 44  1936
 45  2025
 46  2116
 47  2209
 48  2304
 49  2401
 50  2500
 51  2601
 52  2704
 53  2809
 54  2916
 55  3025
 56  3136
 57  3249
 58  3364
 59  3481
 60  3600
 61  3721
 62  3844
 63  3969
 64  4096
 65  4225
 66  4356
 67  4489
 68  4624
 69  4761
 70  4900
 71  5041
 72  5184
 73  5329
 74  5476
 75  5625
 76  5776
 77  5929
 78  6084
 79  6241
 80  6400
 81  6561
 82  6724
 83  6889
 84  7056
 85  7225
 86  7396
 87  7569
 88  7744
 89  7921
 90  8100
 91  8281
 92  8464
 93  8649
 94  8836
 95  9025
 96  9216
 97  9409
 98  9604
 99  9801
100 10000

Reference

Koenig, Andrew & Moo, Barbara E., Accelerated C++, Addison-Wesley, 2000

Accelerated C++ Solution to Exercise 4-1

Exercise 4-1

We noted in §4.2.3/65 that it is essential that the argument types in a call to max match exactly. Will the following code work? If there is a problem, how would you fix it?

int maxlen;
Student_info s;
max(s.name.size(), maxlen);

Solution

If we refer back to the program in §4.2.3/65, s.name.size() is of type std::string::size_type. If we call std::max with a first argument (s.name.size()) of type std::string::size_type and a second argument (maxlen) of type int, we are passing two different types. We are “comparing apples with oranges”! std::max is a function template that expects both arguments to have the same type, so the compiler cannot deduce a single type for the call and the program fails to compile.

To fix this, simply make maxlen the same type as s.name.size(). We assume that the required using-declarations already appear at the top of the program.

string::size_type maxlen;
Student_info s;
max(s.name.size(), maxlen);

Now both arguments that we supply to max are of the same type, string::size_type, i.e. we are now “comparing apples with apples”. The program should now compile.
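
The fix can be sketched as a small function, together with one alternative that is not from the book: naming the template argument explicitly, so that the int is converted rather than deduced. Both function names are hypothetical, and the alternative is shown only as an assumption-laden illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Fix used above: keep the running maximum in the same type that
// std::string::size() returns, so both arguments to std::max agree.
std::string::size_type longest(const std::string& name,
                               std::string::size_type maxlen)
{
    return std::max(name.size(), maxlen);
}

// Alternative (an assumption of this sketch, not from the book): state the
// template argument explicitly, so the int is converted to size_type.
// A negative int would wrap to a huge unsigned value, hence the assert.
std::string::size_type longestFromInt(const std::string& name, int maxlen)
{
    assert(maxlen >= 0);
    return std::max<std::string::size_type>(name.size(), maxlen);
}
```

Changing the type of maxlen, as in the solution, is the simpler choice because it avoids mixing signed and unsigned values altogether.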

Proof of Concept

Seeing is believing, so here is the tangible evidence for the reasoning above. I create a very simple program that reproduces the “invalid” code in question, compile it to see the compilation error, fix it, and then re-compile it to show that the error is gone.

Before the Fix

For this demonstration I define a minimal Student_info type explicitly.

#include <iostream>
#include <string>
#include <algorithm>

struct Student_info
{
    std::string name;
};

int main()
{
    int maxlen;       // this causes error
    Student_info s;
    std::max(s.name.size(), maxlen);
    return 0;
}

Compiling this program in an IDE (e.g. Code::Blocks) returns the following error:

error: no matching function for call to ‘max(std::basic_string<char>::size_type, int&)’

The compiler detects that the two arguments have different types and complains accordingly.

After the Fix

We can correct this easily by changing int maxlen to std::string::size_type maxlen. The corrected program is as follows:

#include <iostream>
#include <string>
#include <algorithm>

struct Student_info
{
    std::string name;
};

int main()
{
    std::string::size_type maxlen = 0;   // this works
    Student_info s;
    std::max(s.name.size(), maxlen);
    return 0;
}

The corrected program now compiles without errors.

Reference

Koenig, Andrew & Moo, Barbara E., Accelerated C++, Addison-Wesley, 2000

Accelerated C++ Solution to Exercise 4-0

Exercise 4-0

Compile, execute, and test the programs in this chapter

Solution

Chapter 4 (Organising programs and data) draws on everything learnt in the previous chapters (0 to 3) and introduces program partitioning: breaking a large program down into multiple .cpp (source) files and .h (header) files to make it more manageable. In this post I demonstrate my understanding of chapter 4 by presenting its one core project, which makes use of program partitioning.

My learning strategy:

  1. Read through chapter 4 – try to understand as much as possible.
  2. Write and execute the chapter 4 project in Code::Blocks – try to get the partitioned program to work.
  3. Read through chapter 4 again and experiment with the project – now that the project is working, I would like to understand why and how it works.
  4. Document the core learning outcome – this is what this post is for!

Now that I have spent 2 days completing (and repeating) steps 1 to 3 above, I believe it is time for step 4: documenting the learning outcome by writing this post.

The Project

Purpose of the Chapter 4 project: to read input in a flat-file-like format and produce a summarised output (see diagram below).

[Image: Acpp4p0Problem]

Chapter 4 requires us to create a partitioned program (a so-called project) made up of multiple .cpp source files and .h header files. Through the header files, these files “know” about each other and can work together to solve a bigger problem.

In Code::Blocks (and probably most mainstream IDEs), we can create such a project fairly easily to keep these files tidy and organised.

I now document the C++ Source Files and Header Files in the following sections.

C++ Source Files

  • main.cpp – contains main, where execution of the program starts.
  • grade.cpp – contains all functions relating to computing grades.
  • median.cpp – contains all functions relating to computing median.
  • Student_info.cpp – contains all functions relating to handling a Student_info object.

C++ Header Files

  • grade.h – declares the functions defined in grade.cpp
  • median.h – declares the functions defined in median.cpp
  • Student_info.h – declares the functions defined in Student_info.cpp, and defines the data structure of the Student_info (object) type.

See my post How to add header and source files in Code::Block for additional information.

The diagram below shows what the Code::Blocks Management tree looks like after these files have been created.

[Image: Acpp4p0MgntTree]

The actual content of the source and header files is documented in the following sections.

Source Files

main.cpp

#include <algorithm>
#include <iomanip>
#include <ios>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>
#include "grade.h"
#include "Student_info.h"

using std::cin;
using std::cout;
using std::endl;
using std::domain_error;
using std::max;
using std::setprecision;
using std::sort;
using std::streamsize;
using std::string;
using std::vector;

int main()
{
    vector<Student_info> students;
    Student_info record;
    string::size_type maxlen = 0;   // the length of the longest name

    // read and store all the student's data.
    // Invariant:   students contain all the student records read so far
    //              maxlen contains the length of the longest name in students
    while (read(cin, record))
    {
        // find the length of longest name
        maxlen = max(maxlen, record.name.size());
        students.push_back(record);
    }

    // alphabetize the student records
    sort(students.begin(), students.end(), compare);

    // write the names and grades
    for (vector<Student_info>::size_type i = 0;
         i != students.size(); ++i)
    {
        // write the name, padded on the right to maxlen + 1 characters
        cout << students[i].name
             << string(maxlen + 1 - students[i].name.size(), ' ');

        // compute and write the grade
        try
        {
            double final_grade = grade(students[i]);
            streamsize prec = cout.precision();
            cout << setprecision(3) << final_grade
                 << setprecision(prec);
        }
        catch (const domain_error& e)
        {
            cout << e.what();
        }
        cout << endl;
    }
    return 0;
}

grade.cpp

#include <stdexcept>
#include <vector>
#include "grade.h"
#include "median.h"
#include "Student_info.h"

using std::domain_error;
using std::vector;

// definitions for the grade functions from S4.1/52, S4.1.2/54, S4.2.2/63

// compute a student's overall grade from midterm and final exam
// grades and homework grade (S4.1/52)
double grade(double midterm, double final, double homework)
{
    return 0.2 * midterm + 0.4 * final + 0.4 * homework;
}

// compute a student's overall grade from midterm and final exam grades
// and vector of homework grades.
// this function does not copy its argument, because median (function) does it for us.
// (S4.1.2/54)
double grade(double midterm, double final, const vector<double>& hw)
{
    if (hw.size() == 0)
        throw domain_error("student has done no homework");
    return grade(midterm, final, median(hw));
}

// this function computes the final grade for a Student_info object
// (S4.2.2/63)
double grade(const Student_info& s)
{
    return grade(s.midterm, s.final, s.homework);
}

median.cpp

// source file for the median function
#include <algorithm>
#include <stdexcept>
#include <vector>

using std::domain_error;
using std::sort;
using std::vector;

// compute the median of a vector<double>
double median(vector<double> vec)
{
    typedef vector<double>::size_type vec_sz;

    vec_sz size = vec.size();
    if (size == 0)
        throw domain_error("median of an empty vector");

    sort(vec.begin(),vec.end());

    vec_sz mid = size/2;

    return size % 2 == 0 ? (vec[mid] + vec[mid-1]) / 2 : vec[mid];
}

Student_info.cpp

#include "Student_info.h"

using std::istream;
using std::vector;

// we are interested in sorting the Student_info object by the student's name
bool compare(const Student_info& x, const Student_info& y)
{
    return x.name < y.name;
}

// read student's name, midterm exam grade, final exam grade, and homework grades
// and store into the Student_info object
// (as defined in S4.2.2/62)
istream& read(istream& is, Student_info& s)
{
    // read and store the student's name and midterm and final exam grades
    is >> s.name >> s.midterm >> s.final;

    // read and store all the student's homework grades
    read_hw(is, s.homework);
    return is;
}

// read homework grades from an input stream into a vector<double>
// (as defined in S4.1.3/57)
istream& read_hw(istream& in, vector<double>& hw)
{
    if (in)
    {
        // get rid of previous contents
        hw.clear();

        // read homework grades
        double x;
        while (in >> x)
            hw.push_back(x);

        // clear the stream so that input will work for the next student
        in.clear();
    }
    return in;
}

Header Files

grade.h

#ifndef GUARD_GRADE_H
#define GUARD_GRADE_H

//grade.h
#include <vector>
#include "Student_info.h"

double grade(double, double, double);
double grade(double, double, const std::vector<double>&);
double grade(const Student_info&);

#endif // GUARD_GRADE_H

median.h

#ifndef GUARD_MEDIAN_H
#define GUARD_MEDIAN_H

// median.h - final version
#include <vector>
double median(std::vector<double>);

#endif // GUARD_MEDIAN_H

Student_info.h

#ifndef GUARD_STUDENT_INFO_H
#define GUARD_STUDENT_INFO_H

// Student_info.h
#include <iostream>
#include <string>
#include <vector>

struct Student_info
{
    std::string name;
    double midterm, final;
    std::vector<double> homework;
};

bool compare(const Student_info&, const Student_info&);
std::istream& read(std::istream&, Student_info&);
std::istream& read_hw(std::istream&, std::vector<double>&);

#endif // GUARD_STUDENT_INFO_H

Test Program

I believe that test running the program multiple times (and differently each time) will help me understand more about why and how the program works as a whole. Experiment, experiment, and experiment…

After compiling all the files and hitting the run button, a blank console window opens and waits for me to provide input. I performed various tests using different input values and formats. The results will hopefully help me to see patterns and understand the program better.

Test 1

I will now input all values on one line, hit enter, then hit end-of-file (F6), then hit end-of-file (F6) again, and see what the output looks like and why it appears that way.

Test 1 – Input and Result

Johnny 70 80 50 60 30 Fred 95 90 100 100 100 Joe 40 40 50 60 50 70 70 50
^Z
^Z
Fred   95
Joe    46
Johnny 66

Process returned 0 (0x0)   execution time : 121.246 s
Press any key to continue.

Test 1 – Observation and Explanation

  1. The first while (read(cin, record)) { } (in the main program) reads from std::cin, which lets the user type input values in the console window.
  2. I type all values on one line (separated by space characters), in this order: name, midterm score, final score, homework scores.
  3. I then hit the enter button. This sends the line that I typed to the input buffer.
  4. The istream& read(istream&, Student_info&) function (as defined in Student_info.cpp) reads the first buffer value “Johnny” into s.name, which removes that value from the buffer.
  5. It then reads the (now first) buffer value 70 into s.midterm, removing it from the buffer.
  6. It then reads the (now first) buffer value 80 into s.final, removing it from the buffer.
  7. The istream& read_hw(istream&, vector<double>&) function is then invoked (as defined in Student_info.cpp). It empties the vector<double>& hw. The while (in >> x) loop reads all the valid values from the buffer into hw until it meets a value that is invalid (e.g. a string rather than a number). In this case, it stores 50, 60, and 30 in hw[0], hw[1], and hw[2] respectively. When it encounters the (non-numeric) value “Fred”, the read fails, the stream in enters an error state, and the while loop exits. The in.clear() call resets the error state so that the data for the next student can be read. Because “Fred” was not read during this while loop, it was not removed from the buffer and is now the first value in it. This is an important point: in the 2nd iteration, the program is able to read “Fred” as the name of the 2nd Student_info object!
  8. The while (read(cin, record)) { } loop then enters its 2nd iteration (to process Fred’s scores), and then its 3rd to process Joe’s scores. In the end, the vector<Student_info> students contains the 3 Student_info objects (i.e. “Johnny”, “Fred”, and “Joe”).
  9. After the whole one-line input has been processed, I enter end-of-file. This exits the while loop of the read_hw function.
  10. I enter end-of-file once more. This exits the while loop of the main program.
  11. Now that both loops have exited, the main program proceeds to the sort(students.begin(), students.end(), compare) phase. The code after that outputs a nicely formatted summary showing the overall score for each student, sorted by the student’s name.
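
The reading behaviour walked through in the steps above can be reproduced without a console by feeding the same kind of one-line input to a string stream. This is a minimal sketch, not the chapter's code: Record and readAll are hypothetical names, and the two functions read and read_hw are folded into one loop.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Simplified stand-in for the chapter's Student_info (hypothetical name).
struct Record
{
    std::string name;
    double midterm, final_exam;
    std::vector<double> homework;
};

// Read every record from a string, mirroring read() and read_hw():
// the homework loop stops at the first token that is not a number
// (the next student's name, or end of input), and clear() resets the
// stream state so that the next record can be read.
std::vector<Record> readAll(const std::string& input)
{
    std::istringstream in(input);
    std::vector<Record> records;
    Record r;
    while (in >> r.name >> r.midterm >> r.final_exam)
    {
        r.homework.clear();
        double x;
        while (in >> x)
            r.homework.push_back(x);
        in.clear();  // forget the failed read of a non-numeric token
        records.push_back(r);
    }
    return records;
}
```

Calling readAll("Johnny 70 80 50 60 30 Joe 40 40 50 60") yields two records: Johnny with three homework grades and Joe with two. The outer loop ends when the attempt to read a third name hits the end of the input, which plays the role of the second end-of-file in the console test.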

The above explanation is not comprehensive; explaining the whole program would take multiple pages! The main reason I documented it is to highlight these core observations and concepts:

  • The behaviour of std::cin and the buffer – my previous post Solution to Exercise 1-6 helped me make sense of why and how this chapter 4 program works, e.g. the effect of hitting the enter button the first time round!
  • The first end-of-file exits the innermost while loop (the one in the read_hw function).
  • The second end-of-file exits the outermost while loop (the one in the main program), which allows execution to continue to the sort step.
  • Chapter 4 of the book explains most of the details in depth, so I am not going to repeat them here.

Test 2

I will now enter the input in the most consistent and readable format, i.e. one line per student (name, mid-term score, final score, and homework scores). When I am done I will hit enter, then end-of-file (F6), then end-of-file (F6) again, and see what the output looks like and why. (I use the same values as in test 1, to hopefully show that the output is the same.)

Test 2 – Input and Result

Johnny 70 80 50 60 30
Fred 95 90 100 100 100
Joe 40 40 50 60 50 70 70 50
^Z
^Z
Fred   95
Joe    46
Johnny 66

Process returned 0 (0x0)   execution time : 33.322 s
Press any key to continue.

Test 2 – Observation and Explanation

The output of this test is exactly the same as in test 1. This is not surprising. The overall process of test 2 is very similar to that of test 1, with one minor difference: in test 1 we typed all the values on one line and hit enter, which sent all values (for all 3 students) to the buffer at once. The program then read from the buffer and eventually created the 3 Student_info objects.

In test 2, we type the values for student 1 on one line. Hitting enter sends the values for this one student to the buffer. The read function reads the name, then the mid-term score, then the final score, and then the read_hw function (called within read) reads the vector elements homework 0 to homework 2. The program then waits for our next homework score.

Then we type the values for the 2nd student (Fred) in a similar fashion. This time, after we hit enter, the read_hw function (that we talked about just now), which is expecting a numeric fourth homework score for Johnny, suddenly “sees” the non-numeric (string) value “Fred”. It exits its while loop, the stream enters an error state, and in.clear() then clears that state (so that the next student can be read). Back in the main program, the 2nd iteration of while (read(cin, record)) { } reads “Fred” (the first value in the buffer) as the student’s name, followed by the remaining numeric values in the buffer (as mid-term score, final score, and homework scores). This cycle repeats for the 3rd student, “Joe”.

As in test 1, the first end-of-file (F6) exits the inner while loop (of the read_hw function). The second end-of-file (F6) exits the outer while loop (of the main program).

The main program then proceeds with the code after the loop and outputs the results accordingly.

Test 3

In test 3, I combine test 1 and test 2. I use the same set of values, but this time some of them are spread over multiple lines and some share a line. I would like to show that the result is exactly the same as in tests 1 and 2, for the reasons explained there.

Test 3 – Input and Result

Johnny 70 80 50
60 30
Fred 95 90 100 100 100
Joe
40
40
50
60 50 70 70 50
^Z
^Z
Fred   95
Joe    46
Johnny 66

Process returned 0 (0x0)   execution time : 47.050 s
Press any key to continue.

Test 3 – Observation and Explanation

As expected, the result is exactly the same as in tests 1 and 2. This shows that, as long as the input values are the same, it doesn’t matter whether we spread our data over multiple lines or keep it on one line. However, I do find the input format of test 2 (one line per student) the tidiest and easiest to understand. In fact, most of the flat files that I deal with at work (such as CSV files read with SAS) use this type of format: 1 line per observation (or record). So my recommendation is to stick with the (CSV-like) flat-file format used in test 2.

Test 4

In this test, I would like to demonstrate what the result looks like, if I enter no homework for some students.

Test 4 – Input and Result

Johnny 70 80 50 60 30
Leon 100 100
Fred 95 90 100 100 100
Simon 90 90
Joe 40 40 50 60 50 70 70 50
^Z
^Z
Fred   95
Joe    46
Johnny 66
Leon   student has done no homework
Simon  student has done no homework

Process returned 0 (0x0)   execution time : 48.859 s
Press any key to continue.

Test 4 – Observation and Explanation

Note that Leon and Simon have done no homework! As expected, the program is clever enough to pick this up and print a message for these students instead of exiting the program entirely.

This works because grade throws a domain_error when the homework vector is empty, and the try and catch blocks in main.cpp catch that exception and write its message in place of the grade.
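
The throw-and-catch pattern behind this result can be sketched in isolation. The names gradeOrThrow and report are hypothetical, and the mean is used in place of the median purely to keep the sketch short, so this is not the chapter's grade function.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, simplified grade(): throws when there is no homework,
// as the chapter's grade() does. The mean stands in for the median here.
double gradeOrThrow(double midterm, double final_exam,
                    const std::vector<double>& hw)
{
    if (hw.empty())
        throw std::domain_error("student has done no homework");
    double sum = 0;
    for (std::vector<double>::size_type i = 0; i != hw.size(); ++i)
        sum += hw[i];
    return 0.2 * midterm + 0.4 * final_exam + 0.4 * (sum / hw.size());
}

// What main does for each student: try to compute the grade and, if the
// exception arrives, report its message instead of terminating.
std::string report(double midterm, double final_exam,
                   const std::vector<double>& hw)
{
    try
    {
        gradeOrThrow(midterm, final_exam, hw);
        return "ok";
    }
    catch (const std::domain_error& e)
    {
        return e.what();
    }
}
```

With an empty homework vector, report returns the exception's message, which is the text printed next to Leon and Simon in the output above.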

Test 5

This time, I enter 2 lines of input correctly. On the 3rd line I enter only a name (and hit enter), and on the 4th line I enter another name and hit enter. The program processes only the first two lines of input and outputs the results for them. It ignores the invalid 3rd and 4th lines entirely. This is as expected.

Test 5 – Input and Result

Johnny 70 80 50 60 30
Fred 95 90 100 100 100
Joe
Simon
Fred   95
Johnny 66

Process returned 0 (0x0)   execution time : 29.578 s
Press any key to continue.

Test 5 – Observation and Explanation

On the 3rd line, after reading “Joe” as the name, the read function (in the Student_info.cpp file) expects a numeric midterm score. Because the next value, “Simon”, is not numeric, the read fails and the function returns the stream in an error state (through the istream& is reference). Because of this, the while loop in the main program exits, and the program proceeds with the code after the loop. Also, because the read failed before any homework could be read, the 3rd record was never added to students, i.e. only the records for Johnny and Fred were stored. Hence the output contains only these two students.
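
The failed read can be demonstrated with a string stream. This is a small sketch under the assumption that a record needs at least a name and two numeric grades; readsNameAndGrades is a hypothetical helper, not part of the project.

```cpp
#include <sstream>
#include <string>

// Returns true if a name followed by two numeric grades can be read from
// the text. With "Joe Simon", the name is read, the extraction of the
// midterm grade fails on "Simon", and the stream stays in a failed state,
// so the whole chained expression evaluates to false.
bool readsNameAndGrades(const std::string& text)
{
    std::istringstream in(text);
    std::string name;
    double midterm, final_exam;
    return static_cast<bool>(in >> name >> midterm >> final_exam);
}
```

This is the same test that while (read(cin, record)) performs in main: once the stream is in a failed state, the condition is false and the loop ends.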

Conclusion

These tests show that the program works as long as the input data is in a consistent and expected format. The flat-file format of test 2 (one line per record) is probably the best one to use, as it is consistent and easy to understand. Chapter 4 has taught me a great deal about partitioning a program, and has refreshed my understanding of how std::cin and the buffer work. Data extract-transform-load (ETL) is a core process used in industry for reading flat files. Chapter 4 has helped me understand how C++ can handle ETL.

Reference

Koenig, Andrew & Moo, Barbara E., Accelerated C++, Addison-Wesley, 2000

Accelerated C++ Solution to Exercise 3-6

Exercise 3-6

The average-grade computation in §3.1/36 might divide by zero if the student didn’t enter any grades. Division by zero is undefined in C++, which means that the implementation is permitted to do anything it likes. What does your C++ implementation do in this case? Rewrite the program so that its behavior does not depend on how the implementation treats division by zero.

Solution

Before changing any code, let me try running the program like this:

  • Test 1: do not supply any grades at all.
  • Test 2: supply mid-term and final-exam grades, but do not supply the homework grades.

Test 1

Please enter your first name: Johnny
Hello, Johnny!
Please enter your midterm and final exam grades: ^Z
Enter all your homework grades, followed by end-of-file: Your final grade is nan

Test 2

Please enter your first name: Johnny
Hello, Johnny!
Please enter your midterm and final exam grades: 80
90
Enter all your homework grades, followed by end-of-file: ^Z
Your final grade is nan

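In both tests, because I did not supply any homework grades, the program runs into a division by zero when it computes the average homework grade (sum / count, with sum equal to 0.0 and count equal to 0). I ran the tests via the Code::Blocks IDE on a Windows Vista machine, and the output is nan (i.e. Not a Number). This is what an implementation with IEEE 754 floating-point arithmetic typically produces for 0.0 / 0, but the C++ standard does not guarantee it.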

To avoid this, add an if statement that checks the value of count. If count is zero, exit the program peacefully with a message such as “Cannot compute final grade due to missing grades supplied – ensure you supply all grades as required.”

    if (count == 0)
    {
        cout << "Cannot compute final grade due to missing grades supplied - ensure you supply all grades as required." << endl;
        return 1;
    }

The full code looks like this:

#include <iomanip>
#include <ios>
#include <iostream>
#include <string>

using std::cin;
using std::cout;
using std::endl;
using std::setprecision;
using std::string;
using std::streamsize;

int main()
{
    // ask for and read the student's name
    cout << "Please enter your first name: ";
    string name;
    cin >> name;
    cout << "Hello, " << name << "!" << endl;

    // ask for and read the midterm and final grades
    cout << "Please enter your midterm and final exam grades: ";
    double midterm, final;
    cin >> midterm >> final;

    // ask for the homework grades
    cout << "Enter all your homework grades, "
            "followed by end-of-file: ";

    // the number and sum of grades read so far
    int count = 0 ;
    double sum = 0.0;

    // a variable into which to read
    double x;

    // invariant:
    //    we have read count grades so far, and
    //    sum is the sum of the first count grades
    //    after entering the last value, hit the F6 button, then enter (to indicate end of file)
    //    or hit Ctrl+z, then enter.
    while (cin >> x)
    {
        ++count;
        sum += x;
    }

    double dummy = count; // for some reason the code fails unless I add this line.

    if (count == 0)
    {
        cout << "Cannot compute final grade due to missing grades supplied - ensure you supply all grades as required." << endl;
        return 1;
    }


    // write the result
    streamsize prec = cout.precision();

    cout << "Your final grade is " << setprecision(3)
         << 0.2 * midterm + 0.4 * final + 0.4 * sum / count
         << setprecision(prec) << endl;

    return 0;

}

 

Result

Please enter your first name: Johnny
Hello, Johnny!
Please enter your midterm and final exam grades: ^Z
Enter all your homework grades, followed by end-of-file: Cannot compute final gr
ade due to missing grades supplied - ensure you supply all grades as required.
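The same guard can also be wrapped in a small helper, so that the division is never attempted on an empty set of grades. This is only a sketch: tryMean is a hypothetical name, and storing the grades in a vector is my assumption (the program above keeps a running sum and count instead).

```cpp
#include <vector>

// Hypothetical helper: computes the mean only when there is at least one grade.
// Returns false, leaving `mean` untouched, when the vector is empty.
bool tryMean(const std::vector<double>& grades, double& mean)
{
    if (grades.empty())
        return false;               // nothing to average: no division is attempted

    double sum = 0.0;
    for (std::vector<double>::size_type i = 0; i != grades.size(); ++i)
        sum += grades[i];

    mean = sum / grades.size();     // safe: size() is at least 1 here
    return true;
}
```

The caller decides what to do when tryMean returns false, for example printing the same error message and returning 1 from main.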

Reference

Koenig, Andrew & Moo, Barbara E., Accelerated C++, Addison-Wesley, 2000

Accelerated C++ Solution to Exercise 3-5

Exercise 3-5

Write a program that will keep track of grades for several students at once. The program could keep two vectors in sync. The first should hold the student’s names, and the second the final grades that can be computed as input is read. For now, you should assume a fixed number of homework grades. We’ll see in §4.1.3/56 how to handle a variable number of grades intermixed with student names.

Solution

This exercise is a good one. The nature of the problem and the eventual solution builds on all the fundamentals that we have learnt from this chapter. There are many ways to solve this. Here is one of these many ways.

The solution strategy:

  • Create a constant numHomework that defines the fixed number of homework scores required for each student.
  • Have a while loop that lets the user enter the studentName – the user can leave the loop peacefully by entering the end-of-file key (Ctrl-Z / F6 on Windows).
  • As soon as a studentName is read, we append it to the studentNames vector – this vector stores the studentName elements.
  • Within the while loop we have a for loop that lets the user enter the homeworkScore values one by one, until the predefined constant numHomework is reached.
  • During the process we compute the totalScore and subsequently the meanScore for the corresponding student.
  • As soon as the meanScore is computed, we append it to the meanScores vector – this vector stores the list of meanScore elements.
  • The structure of the loops ensures the two vectors studentNames and meanScores are always in sync.
  • Allow the user to either continue (by entering another studentName), or exit (by entering the end-of-file key, i.e. Ctrl-Z or F6 on Windows).
  • Output the pairs of studentName and meanScore elements of the vectors studentNames and meanScores using a for loop.

Putting this all together, we have our full program.

#include <iostream>
#include <iomanip>
#include <algorithm>
#include <ios>
#include <string>
#include <vector>

using std::cin;             // <iostream>
using std::cout;            // <iostream>
using std::endl;            // <iostream>
using std::setprecision;    // <iomanip>
using std::sort;            // <algorithm>
using std::streamsize;      // <ios>
using std::string;          // <string>
using std::vector;          // <vector>


int main()
{
    typedef vector<double>::size_type vecSize;

    const vecSize numHomework = 5;     // fixed number of homework scores per student

    string studentName;
    vector<string> studentNames;

    double homeworkScore;
    double totalScore;
    double meanScore;
    vector<double> meanScores;

    cout << "Enter student name: ";
    while (cin >> studentName)
    {
        studentNames.push_back(studentName);
        cout << "Enter " << numHomework << " homework scores below..." << endl;

        totalScore = 0; // Initialise
        meanScore = 0;  // Initialise

        for (vecSize i = 0; i != numHomework ; ++i)
        {
            cin >> homeworkScore;
            totalScore += homeworkScore;
        }

        meanScore = totalScore / numHomework;
        meanScores.push_back(meanScore);

        cout << "Enter another student name "
                "(or enter F6 key to exit): ";
    }

    vecSize numStudents = studentNames.size();
    cout << endl;
    cout << "Number of students entered: " << numStudents << endl;

    streamsize prec = cout.precision();
    for (vecSize i = 0; i != numStudents ; ++i)
    {
        cout << endl;
        cout << "Student: " << studentNames[i] << endl;
        cout << "Mean Score: " << setprecision(5)
            << meanScores[i] << setprecision(prec) << endl;
    }

    return 0;

}

Result

I now run a test to demonstrate the program output.

Enter student name: Johnny
Enter 5 homework scores below...
40
50
60
70
80
Enter another student name (or enter F6 key to exit): Fred
Enter 5 homework scores below...
99
78
67.5
66.8
100
Enter another student name (or enter F6 key to exit): Joe
Enter 5 homework scores below...
0
0
10
2
0
Enter another student name (or enter F6 key to exit): ^Z

Number of students entered: 3

Student: Johnny
Mean Score: 60

Student: Fred
Mean Score: 82.26

Student: Joe
Mean Score: 2.4

Process returned 0 (0x0)   execution time : 55.152 s
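The program above assumes every student is followed by exactly 5 valid scores. As one possible hardening (not part of my original solution), the sketch below appends a name and its mean together, and only after all the scores for that student have been read. The function name readStudents and its parameters are my own choices for illustration.

```cpp
#include <cassert>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without typing input
#include <string>
#include <vector>

// Reads "name score score ..." records with exactly numHomework scores each.
// A name and its mean are appended together, and only after every score for
// that student has been read, so the two vectors cannot drift apart.
void readStudents(std::istream& in,
                  std::vector<std::string>& names,
                  std::vector<double>& means,
                  std::vector<double>::size_type numHomework)
{
    assert(numHomework != 0);           // a mean over zero scores is not defined

    std::string name;
    while (in >> name)
    {
        double total = 0.0;
        double score;
        std::vector<double>::size_type numRead = 0;
        while (numRead != numHomework && in >> score)
        {
            total += score;
            ++numRead;
        }
        if (numRead != numHomework)
            break;                      // incomplete record: discard it and stop

        names.push_back(name);
        means.push_back(total / numHomework);
    }
}
```

Passing cin as the first argument gives the interactive behaviour; passing a std::istringstream makes it easy to replay the test data above without retyping it.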

Reference

Koenig, Andrew & Moo, Barbara E., Accelerated C++, Addison-Wesley, 2000

Accelerated C++ Solution to Exercise 3-4

Exercise 3-4

Write a program to report the length of the longest and shortest string in its input.

Solution

I notice that the solution to this problem will be somewhat similar to the Solution to Exercise 3-3 (identify number of distinct elements in a vector). Except this time we are determining the length of the longest and shortest (string) elements. I therefore expect the eventual program to be similar.

Strategy

  • Ask user to provide the list of words so we can append the string elements to a vector, v.
  • Compute the size of vector, N.
  • If vector size is 0, output an error message (as we need at least 1 string). Exit program peacefully with return code 1.
  • If vector size is 1, we compute the size of the string element v[0]. The length of the longest string, SL, will be the same as the length of the shortest string, SS.
  • If vector size is 2 or above, we apply an algorithm to compute SL and SS (to be explained further below).

Algorithm

The algorithm (for the case of vector size of 2 or above) is described as follows:

  • Notice that this time we do not need to sort the vector – the condition checks downstream will be sufficient to update SS and SL.
  • We initialise SL and SS to the size of the first string element v[0]. SL and SS will be updated accordingly during the comparison step.
  • We initialise the index B to 1. This will be used for comparing the length of the v[B] with respect to SS and SL.
  • We perform N-1 comparisons, one for each remaining element v[B] against the current SL and SS. (e.g. if there are 10 elements, v[0] seeds SS and SL, and the other 9 elements are compared against them).
  • During each comparison step, if v[B].size() is larger than SL, we re-assign SL to v[B].size().
  • If v[B].size() is smaller than SS, we re-assign SS to v[B].size().
  • (If v[B].size() is equal to SS or SL, we do nothing.)
  • We then increment the index B by 1 for the next comparison.
  • We display the final values of SS and SL – these are the lengths of the shortest and longest string elements within the vector v.
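The steps above can also be sketched as a standalone function. This is an alternative formulation rather than the program I actually wrote: the name shortestAndLongest is hypothetical, and std::min / std::max from <algorithm> replace the two if statements.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Finds the shortest (SS) and longest (SL) string lengths in a non-empty vector.
void shortestAndLongest(const std::vector<std::string>& v,
                        std::string::size_type& SS,
                        std::string::size_type& SL)
{
    assert(!v.empty());          // the caller is expected to handle N == 0

    SS = v[0].size();
    SL = v[0].size();

    // start at index 1: v[0] has already seeded both values
    for (std::vector<std::string>::size_type B = 1; B != v.size(); ++B)
    {
        SS = std::min(SS, v[B].size());
        SL = std::max(SL, v[B].size());
    }
}
```

With this shape the N = 1 case needs no special treatment: the loop body never runs, and SS and SL both stay at the length of v[0].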

The Program

Putting this all together, we have our full program.

#include <iostream>
#include <string>
#include <vector>

using std::cin;             // <iostream>
using std::cout;            // <iostream>
using std::endl;            // <iostream>
using std::string;          // <string>
using std::vector;          // <vector>

int main()
{
    // display header message
    cout << "***************************************************************\n"
            "*** This program reports the longest and shortest strings   ***\n"
            "***************************************************************\n";
    cout << endl;

    // ask for a list of words and store the list as a vector
    cout << "Enter a list of words one by one: ";
    vector<string> v;
    string x;
    while (cin >> x)
        v.push_back(x);    // append new input to the vector

    cout << endl;

    // define and compute core vector variables
    typedef vector<string>::size_type vecSize;   // define a type for vector size related variables
    vecSize N = v.size();            // number of elements in the vector
    vecSize numLoops = N - 1;        // number of (comparison) operations required

    typedef string::size_type strSize;   // define a type for string size related variables
    strSize SL;              // the length of the longest word
    strSize SS;              // the length of the shortest word

    // Check vector size, action accordingly
    if (N ==0 )
    {
        cout << "You need to enter at least 1 word! " << endl;
        return 1;
    }

    else if (N ==1 )
    {
        SL = v[0].size();
        cout << "Only 1 string supplied. The length of string = " << SL << endl;
        return 0;
    }

    else
    {
        // display some results to console window
        cout << "Vector size (number of words entered): " << N << endl;
        cout << endl;

        // declare new variables
        vecSize B = 1;     // vector index of the element being compared
        SS = v[0].size();            // the length of the shortest word
        SL = v[0].size();            // the length of the longest word

        // Loop through the rest of the vector and update SS and SL
        for (vecSize i = 0; i != numLoops; ++i)
        {
            if (v[B].size() > SL)
            {
                SL = v[B].size();
            }
            if (v[B].size() < SS)
            {
                SS = v[B].size();
            }
            ++B;
        }
        // Display final results
        cout << endl;
        cout << "Length of shortest word: " << SS << endl;
        cout << "Length of longest word: " << SL << endl;
    }
    return 0;
}

Result

The results of the 3 sets of tests are shown below:

  • N = 0
  • N = 1
  • N >= 2

The results appear to agree well with the algorithm used.

N = 0

***************************************************************
*** This program reports the longest and shortest strings   ***
***************************************************************

Enter a list of words one by one: ^Z

You need to enter at least 1 word!

Process returned 1 (0x1)   execution time : 3.156 s

N = 1

***************************************************************
*** This program reports the longest and shortest strings   ***
***************************************************************

Enter a list of words one by one: a23456789
^Z

Only 1 string supplied. The length of string = 9

N >= 2

***************************************************************
*** This program reports the longest and shortest strings   ***
***************************************************************

Enter a list of words one by one: a23456789
a23456
a2345b
a234
a234567890
^Z

Vector size (number of words entered): 5


Length of shortest word: 4
Length of longest word: 10

Reference

Koenig, Andrew & Moo, Barbara E., Accelerated C++, Addison-Wesley, 2000

Accelerated C++ Solution to Exercise 3-3

Exercise 3-3

Write a program to count how many times each distinct word appears in its input.

Solution


Note: Viktor has kindly pointed out that I have misread the question as “How many distinct words are there” (instead of how many times each distinct word appears). The solution below therefore does not actually answer the question above! (If I have time I may redo the question; otherwise please just ignore my solution below.)


My solution is a result of combining the knowledge gained from Solution to Exercise 2-8 (on recursive operations between element v[i] and v[i+1]) and Solution to Exercise 3-2 (on manipulating vector).

The solution strategy:

  • Ask user to provide the list of words so we can append the string elements to a vector, v.
  • Compute the size of vector, N.
  • If vector size is 0, the number of distinct words is also 0.
  • If vector size is 1, the number of distinct words is also 1.
  • If vector size is 2 or above, we apply an algorithm to identify and count the distinct words. (To be explained further below).

The algorithm (for the case of vector size of 2 or above) is described as follows:

  1. We sort the vector v in non-descending order.
  2. We initialise the index A to 0, and index B to 1. These indices will be used for comparing adjacent elements within the vector (to check for distinct elements).
  3. We also initialise the distinct element count ND to 1 – we know that we have at least 1 distinct word.
  4. We perform N-1 comparisons between the adjacent elements v[A] and v[B]. (e.g. if there are 10 elements, we can only compare 9 pairs of adjacent elements).
  5. During each comparison step, if v[B] is not the same as v[A], it implies that v[B] is a newly discovered distinct word. We increment ND.
  6. Note: if v[B] is the same as v[A], it implies v[B] is a repeated word. We do nothing in that case, as we are only interested in discovering new distinct words.
  7. We then increment the indices A and B by 1 for the next comparison, i.e. we first compare v[0] and v[1], then v[1] and v[2], then v[2] and v[3], etc.
  8. We display the final value of ND – this is the number of distinct words computed.

Putting this all together, we have our full program.

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

using std::cin;             // <iostream>
using std::cout;            // <iostream>
using std::endl;            // <iostream>
using std::sort;            // <algorithm>
using std::string;          // <string>
using std::vector;          // <vector>

int main()
{
    // display header message
    cout << "***************************************************************\n"
            "*** This program computes number of unique words            ***\n"
            "***************************************************************\n";
    cout << endl;

    // ask for a list of words and store the list as a vector
    cout << "Enter a list of words one by one: ";
    vector<string> v;
    string x;
    while (cin >> x)
        v.push_back(x);    // append new input to the vector

    cout << endl;

    // define and compute core vector variables
    typedef vector<string>::size_type vecSize;   // define a type for vector size related variables
    vecSize N = v.size();            // number of elements in the vector
    vecSize numLoops = N - 1;        // number of (comparison) operations required
    vecSize ND;                      // number of distinct words

    // Check vector size, action accordingly
    if (N ==0 )
    {
        ND = 0;
        cout << "Number of distinct words = " << ND << endl;
        return 0;
    }

    else if (N ==1 )
    {
        ND = 1;
        cout << "Number of distinct words = " << ND << endl;
        return 0;
    }

    else
    {
        // sort the vector;
        sort(v.begin(),v.end());

        // display some results to console window
        cout << "Vector size (number of words entered): " << N << endl;
        cout << endl;
        cout << "Display the sorted (non-descending) distinct words below." << endl;
        cout << v[0] << endl;

        // declare new variables
        vecSize A = 0;     // vector index
        vecSize B = 1;     // vector index
        ND = 1;            // number of distinct words

        // Loop through the vector, compute ND, and identify the distinct words
        for (vecSize i = 0; i != numLoops; ++i)
        {
            if (v[B] != v[A])
            {
                ++ND;
                cout << v[B] << endl;  // display any newly discovered distinct words
            }
            ++A;
            ++B;
        }
        // Display final distinct word count
        cout << endl;
        cout << "Number of distinct elements (words): " << ND << endl;
    }
    return 0;

}

Result

The results of the 3 sets of tests are shown below:

  • N = 0
  • N = 1
  • N >= 2

The results also reveal that sorting the elements of a vector is case-sensitive, e.g. the string John (beginning with upper case) is different from john (all lower case).

N = 0

***************************************************************
*** This program computes number of unique words            ***
***************************************************************

Enter a list of words one by one: ^Z

Number of distinct words = 0

N = 1

***************************************************************
*** This program computes number of unique words            ***
***************************************************************

Enter a list of words one by one: Johnny
^Z

Number of distinct words = 1

N >= 2

***************************************************************
*** This program computes number of unique words            ***
***************************************************************

Enter a list of words one by one: john
peter
pete
johnny
john
John
^Z

Vector size (number of words entered): 6

Display the sorted (non-descending) distinct words below.
John
john
johnny
pete
peter

Number of distinct elements (words): 5
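Since the exercise really asks how many times each distinct word appears, the same sort-then-compare-adjacent idea can be extended to keep a count per word. The following is a minimal sketch under that reading; countWords and WordCount are names I made up for illustration, and this is not the solution from the book.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, std::vector<std::string>::size_type> WordCount;

// Returns each distinct word together with the number of times it occurs.
// The vector is taken by value so the caller's copy is left unsorted.
std::vector<WordCount> countWords(std::vector<std::string> v)
{
    std::vector<WordCount> counts;
    std::sort(v.begin(), v.end());     // equal words become adjacent

    for (std::vector<std::string>::size_type i = 0; i != v.size(); ++i)
    {
        if (i == 0 || v[i] != v[i - 1])
            counts.push_back(WordCount(v[i], 1));   // first sighting of a new word
        else
            ++counts.back().second;                 // repeat of the previous word
    }
    return counts;
}
```

For the six words used in the N >= 2 test above, this should report john twice and every other word once. An empty vector simply yields an empty result, so no special cases for N = 0 or N = 1 are needed.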

Reference

Koenig, Andrew & Moo, Barbara E., Accelerated C++, Addison-Wesley, 2000

Accelerated C++ Solution to Exercise 3-2

Exercise 3-2

Write a program to compute and print the quartiles (that is, the quarter of the numbers with the largest values, the next highest quarter, and so on) of a set of integers.

Solution

According to this Wikipedia page, for discrete distributions, there is no universal agreement on selecting the quartile values – there are 3 major computing methods.

I will apply Method 1 in my solution which has the following definitions:

  1. Use the median to divide the ordered data set into two halves. Do not include the median in either half.
  2. The lower quartile value is the median of the lower half of the data. The upper quartile value is the median of the upper half of the data.

This rule is employed by the TI-83 calculator boxplot and “1-Var Stats” functions (according to Wikipedia).
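The two steps of Method 1 can be written down almost literally, which gives a handy cross-check for the case-by-case algorithm derived later. This is only a sketch: medianOfRange and quartiles are hypothetical helper names, and the index arithmetic for the two halves is my own reading of the rule.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

typedef std::vector<double>::size_type vecSize;

// Median of the sorted elements v[lo], ..., v[hi - 1]; requires lo < hi <= v.size().
double medianOfRange(const std::vector<double>& v, vecSize lo, vecSize hi)
{
    assert(lo < hi && hi <= v.size());
    vecSize n = hi - lo;
    vecSize mid = lo + n / 2;
    return n % 2 == 1 ? v[mid]                       // odd count: the middle element
                      : (v[mid - 1] + v[mid]) / 2;   // even count: mean of the middle two
}

// Method 1 quartiles for at least two numbers. Takes v by value so it sorts a copy.
void quartiles(std::vector<double> v, double& q1, double& q2, double& q3)
{
    assert(v.size() >= 2);
    std::sort(v.begin(), v.end());
    vecSize N = v.size();
    vecSize half = N / 2;                  // size of each half; the median is left out when N is odd
    q1 = medianOfRange(v, 0, half);        // lower half: v[0] .. v[half - 1]
    q2 = medianOfRange(v, 0, N);
    q3 = medianOfRange(v, N - half, N);    // upper half: v[N - half] .. v[N - 1]
}
```

For the dataset 41, 39, 15, 7, 36, 40 this gives 15, 37.5, and 40 for the three quartiles.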

It is also very important to note that there are two types of medians:

  • Datum median – the middle data point of an ordered vector.
  • Non-datum median – the mean of the middle two data points of an ordered dataset.

Datum vs Non-datum

The lower quartile, median, and upper quartile can each be either a datum or a non-datum. It depends on the dataset pattern, which I shall expand on shortly.

The 6 distinctive dataset patterns

After 3 hours of sketching and iterations, I am pleased to say that I have found “the dataset pattern” which I believe is consistent for all types of input datasets, and robust to implement. I summarise this pattern in the following table:

ID   Pattern                 Condition     Quartiles (Q1, Q2, Q3)
1    NULL dataset            N = 0         (Cannot compute)
2    1 Number only           N = 1         Quartiles all resolve to that 1 number provided.
3    Datum Profile [0 0 0]   N % 4 == 0    (image: Datum Profile 000)
4    Datum Profile [0 1 0]   N % 4 == 1    (image: Datum Profile 010)
5    Datum Profile [1 0 1]   N % 4 == 2    (image: Datum Profile 101)
6    Datum Profile [1 1 1]   N % 4 == 3    (image: Datum Profile 111)

Note

  • I use Q1, Q2, and Q3 to denote Quartile 1, 2, and 3. (i.e. end of quarter 1, 2, and 3)
  • If the dataset is empty (Null), it is not possible to compute the quartiles for that dataset.
  • If the dataset contains only 1 number, it is quite natural to assume that there is no variation (or no range). i.e. all quartiles correspond to that number.
  • If there are at least 2 numbers provided, the pattern is determined by the Datum Profile [Q1 Q2 Q3]. There are four possible types of Datum Profiles: [0 0 0], [0 1 0], [1 0 1], and [1 1 1].
  • For example, a datum profile of [1 0 1] means: lower quartile is a datum, median is a non-datum, upper quartile is a datum. i.e. 1 represents datum. 0 represents non-datum.
  • N is the total number of elements within the dataset.
  • The % (percent) sign is used to compute remainder. e.g. N % 4 gives the remainder of N divided by 4.

Nomenclature

I use the following short-form letters to make the formula easier to understand, and program easier to code.

Symbol   Meaning
v        Represents the vector, e.g. v[M] resolves to the value of the element located at index M of the vector v.
N        Number of elements of the vector (i.e. the input dataset).
ML       An index value for computing the lower quartile. (Think Median of the Lower half.)
M        An index value for computing the median.
MU       An index value for computing the upper quartile. (Think Median of the Upper half.)
ml       The computed value of the lower quartile.
m        The computed value of the median.
mu       The computed value of the upper quartile.

Solution Strategy

At a high level, this is what the program should do:

  • Read the list of numbers from the user and store in a vector.
  • Pattern 1: If vector contains no elements (i.e. NULL), output an error message (mentioning the program requires at least 1 number to compute quartiles), then exit the program peacefully.
  • Pattern 2: If vector contains just 1 number, output a message saying that all quartiles values are the same as that number provided, then exit the program peacefully.
  • Sort the vector so it becomes non-descending – this is required for the median computation.
  • Determine whether the dataset belongs to pattern 3, 4, 5, or 6, execute the corresponding algorithms, display the quartile summary at the end.

Before walking through this solution strategy I think it is worth expanding a bit more on how I derived the algorithms for patterns 3, 4, 5, and 6 in the first place.

Derivation of Algorithms

Pattern 1 (null dataset) and pattern 2 (1 number only) are very straightforward and have already been explained above. So I’m not going to expand further on these first two patterns.

Pattern 3, 4, 5 and 6, however, are much more interesting in comparison. These patterns require a bit more intelligence to solve – so I will focus on these one-by-one and summarise this at the end.

The Repeating Cycles

The first thing that I did to try to spot a pattern was to sketch out a number of vectors with sizes ranging from N=2 to N=11 (it could have been more, but the pattern had become quite obvious at that point!), circle the datum profile, and hope to see some sort of pattern. Through the exercise I observed that the datum profile changes every time I increase N by 1. I also observed that the profile change follows a cycle of 4 increments, e.g. the profile is the same at N=2, N=6, N=10, etc. The following table is the “sketch” that I produced, which led me to the discovery of the algorithms.

N    N % 4   Profile   Sketch (image)
2    2       [101]     Acpp3p2N2
3    3       [111]     Acpp3p2N3
4    0       [000]     Acpp3p2N4
5    1       [010]     Acpp3p2N5
6    2       [101]     Acpp3p2N6
7    3       [111]     Acpp3p2N7
8    0       [000]     Acpp3p2N8
9    1       [010]     Acpp3p2N9
10   2       [101]     Acpp3p2N10
11   3       [111]     Acpp3p2N11

The pattern can therefore be summarised as follows:

Pattern ID   Datum Profile   N % 4
3            [000]           0
4            [010]           1
5            [101]           2
6            [111]           3
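The summary table translates directly into a small lookup. The sketch below is purely illustrative (datumProfile is a hypothetical name that does not appear in my program) and assumes N is at least 2, since N = 0 and N = 1 are handled as patterns 1 and 2.

```cpp
#include <string>
#include <vector>

// Maps the dataset size N (assumed to be at least 2) to its datum profile
// "[Q1 Q2 Q3]", where 1 marks a datum and 0 marks a non-datum.
std::string datumProfile(std::vector<double>::size_type N)
{
    switch (N % 4)
    {
    case 0:  return "[0 0 0]";
    case 1:  return "[0 1 0]";
    case 2:  return "[1 0 1]";
    default: return "[1 1 1]";   // N % 4 == 3
    }
}
```

One way to see why the cycle has length 4: the median is a datum exactly when N is odd, and the lower and upper quartiles are datums exactly when each half (of size N / 2, using integer division) has an odd number of elements.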

Now that we have this core table, let’s derive the equations.

Equations and Examples

These diagrams summarise the equations, examples, and code implementations for each of the four datum patterns.

Profile [000] – Equations and Examples

[Figures: Profile 000 Equations (3 images)]

Profile [010] – Equations and Examples

[Figures: Profile 010 Equations (2 images), Acpp3p2Pic5b]

Profile [101] – Equations and Examples

[Figures: Acpp3p2Pic6, Acpp3p2Pic7, Acpp3p2Pic8a, Acpp3p2Pic8b]

Profile [111] – Equations and Examples

[Figures: Acpp3p2Pic9, Acpp3p2Pic10, Acpp3p2Pic11a, Acpp3p2Pic11b]
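For readers who cannot see the diagrams, the index arithmetic can be read off the program in the next section. The summary below is reconstructed from that code rather than copied from the images, with v indexed from 0 and ⌊ ⌋ standing for integer division.

```latex
\begin{aligned}
M   &= \left\lfloor N/2 \right\rfloor, \qquad M_L = \left\lfloor M/2 \right\rfloor, \\
M_U &= \begin{cases}
         M + M_L     & \text{if } N \bmod 4 \in \{0, 2\} \\
         M + M_L + 1 & \text{if } N \bmod 4 \in \{1, 3\}
       \end{cases} \\[6pt]
\text{datum (1):}     \quad & m_l = v[M_L], \qquad m = v[M], \qquad m_u = v[M_U] \\[6pt]
\text{non-datum (0):} \quad & m_l = \frac{v[M_L - 1] + v[M_L]}{2}, \qquad
                              m   = \frac{v[M - 1] + v[M]}{2}, \qquad
                              m_u = \frac{v[M_U - 1] + v[M_U]}{2}
\end{aligned}
```

Each digit of the datum profile selects the datum or non-datum form for the corresponding quartile. For example, with N = 6 the profile is [1 0 1], so M = 3, ML = 1, MU = 4, giving ml = v[1], m = (v[2] + v[3]) / 2, and mu = v[4].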

The Program

Now that we have our algorithms, let’s put everything together into a final program. I am purely using the skeleton program from chapter 3 of the book (which has an in-depth explanation of the various components). I merely implement the algorithms into this skeleton program, with some extra enhancements for user-friendly output.

#include <algorithm>
#include <iomanip>
#include <ios>
#include <iostream>
#include <string>
#include <vector>

using std::cin;             // <iostream>
using std::cout;            // <iostream>
using std::endl;            // <iostream>
using std::setprecision;    // <iomanip>
using std::sort;            // <algorithm>
using std::streamsize;      // <ios>
using std::string;          // <string>
using std::vector;          // <vector>

int main()
{
    // display header message
    cout << "***************************************************************\n"
            "*** This program computes quartiles given a list of numbers ***\n"
            "***************************************************************\n";
    cout << endl;

    // ask for a list of numbers and store the list as a vector
    cout << "Enter all a list of numbers: ";
    vector<double> v;
    double x;
    while (cin >> x)
        v.push_back(x);

    // check vector size and action accordingly
    cout << endl;
    typedef vector<double>::size_type vecSize;
    vecSize N = v.size();
    if (N ==0 )
    {
        cout << "You must enter some numbers! " << endl;
        return 1;
    }

    else if (N ==1 )
    {
        cout << " Only 1 number supplied. Q1, Q2, and Q3 all equate to " << v[0] << endl;
        return 0;
    }

    else
    {
        // sort the numbers
        sort(v.begin(),v.end());
    }

    // declare new variables
    vecSize NMod4 = (N % 4);  // identification of 1 of the 4 known datum distribution profiles
    string datumDistr = "";   // datum distribution profile
    vecSize M, ML, MU;        // core vector indices for quartile computation
    double m, ml, mu;         // quartile values are stored here

    // compute quartiles for the 4 known patterns
    if ( NMod4 == 0 )
    {
        // Q1-Q3 datum distribution: [0 0 0]
        datumDistr = "[0 0 0]";
        M = N / 2;
        ML = M / 2;
        MU = M + ML;

        // grab quartile values
        ml= (v[ML] + v[ML-1]) / 2;     // datum: 0
        m = (v[M] + v[M-1]) / 2;       // datum: 0
        mu = (v[MU] + v[MU-1]) / 2;    // datum: 0
    }

    else if ( NMod4 == 1 )
    {
        // Q1-Q3 datum distribution: [0 1 0]
        datumDistr = "[0 1 0]";
        M = N / 2;
        ML = M / 2;
        MU = M + ML + 1;

        // grab quartile values
        ml= (v[ML] + v[ML-1]) / 2;      // datum: 0
        m = v[M];                       // datum: 1
        mu = (v[MU] + v[MU-1]) / 2;     // datum: 0
    }

    else if ( NMod4 == 2 )
    {
        // Q1-Q3 datum distribution: [1 0 1]
        datumDistr = "[1 0 1]";
        M = N / 2;
        ML = M / 2;
        MU = M + ML;

        // grab quartile values
        ml= v[ML];                    // datum: 1
        m = (v[M] + v[M-1]) / 2;     // datum: 0
        mu = v[MU];                   // datum: 1
    }

    else if ( NMod4 == 3 )
    {
        // Q1-Q3 datum distribution: [1 1 1]
        datumDistr = "[1 1 1]";
        M = N / 2;
        ML = M / 2;
        MU = M + ML + 1;

        // grab quartile values
        ml= v[ML];                    // datum: 1
        m = v[M];                     // datum: 1
        mu = v[MU];                   // datum: 1
    }

    else
    {
        cout << "Unknown pattern discovered - new algorithm may be required.";
    }

    // Display results
    streamsize prec = cout.precision();
    cout << "Display the sorted (non-descending) vector below." << endl;
    cout << "Index: Number" << endl;
    for (vecSize i = 0; i !=  N; ++i)
    {
        cout << i << ": " << v[i] << endl;
    }
    cout << endl;
    cout << "Vector size: " << N << endl;
    cout << "Datum Distribution: " << datumDistr << endl;
    cout << setprecision(3) << endl
         << " Q1: " << ml << endl
         << " Q2: " << m << endl
         << " Q3: " << mu << endl
         << setprecision(prec);
}

Result

I’m going to run the program a number of times, each with a different input dataset (e.g. incrementing N by 1, trying out the Wikipedia example and comparing, etc.).

N = 0

***************************************************************
*** This program computes quartiles given a list of numbers ***
***************************************************************

Enter all a list of numbers: ^Z

You must enter some numbers!

Process returned 1 (0x1)   execution time : 9.804 s

N = 1

***************************************************************
*** This program computes quartiles given a list of numbers ***
***************************************************************

Enter all a list of numbers: 5
^Z

 Only 1 number supplied. Q1, Q2, and Q3 all equate to 5

Process returned 0 (0x0)   execution time : 8.430 s

N = 2

***************************************************************
*** This program computes quartiles given a list of numbers ***
***************************************************************

Enter all a list of numbers: 10
20
^Z

Display the sorted (non-descending) vector below.
Index: Number
0: 10
1: 20

Vector size: 2
Datum Distribution: [1 0 1]

 Q1: 10
 Q2: 15
 Q3: 20

N = 3

***************************************************************
*** This program computes quartiles given a list of numbers ***
***************************************************************

Enter all a list of numbers: 10
20
30
^Z

Display the sorted (non-descending) vector below.
Index: Number
0: 10
1: 20
2: 30

Vector size: 3
Datum Distribution: [1 1 1]

 Q1: 10
 Q2: 20
 Q3: 30

N = 4

***************************************************************
*** This program computes quartiles given a list of numbers ***
***************************************************************

Enter all a list of numbers: 6
2
4
9
^Z

Display the sorted (non-descending) vector below.
Index: Number
0: 2
1: 4
2: 6
3: 9

Vector size: 4
Datum Distribution: [0 0 0]

 Q1: 3
 Q2: 5
 Q3: 7.5

N = 5

Enter all a list of numbers: 9
4
20
39
44
^Z

Display the sorted (non-descending) vector below.
Index: Number
0: 4
1: 9
2: 20
3: 39
4: 44

Vector size: 5
Datum Distribution: [0 0 0]

 Q1: 6.5
 Q2: 20
 Q3: 41.5

Wikipedia example (even size vector)

The results match!

**************************************************************
*** This program computes quartiles given a list of numbers **
**************************************************************

Enter all a list of numbers: 41
39
15
7
36
40
^Z

Display the sorted (non-descending) vector below.
Index: Number
0: 7
1: 15
2: 36
3: 39
4: 40
5: 41

Vector size: 6
Datum Distribution: [1 0 1]

 Q1: 15
 Q2: 37.5
 Q3: 40

Wikipedia example (odd size vector)

The results match!

***************************************************************
*** This program computes quartiles given a list of numbers ***
***************************************************************

Enter all a list of numbers: 49
43
41
39
15
6
7
36
40
42
47
^Z

Display the sorted (non-descending) vector below.
Index: Number
0: 6
1: 7
2: 15
3: 36
4: 39
5: 40
6: 41
7: 42
8: 43
9: 47
10: 49

Vector size: 11
Datum Distribution: [1 1 1]

 Q1: 15
 Q2: 40
 Q3: 43
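
The runs above are all consistent with one compact rule: Q2 is the median of the whole sorted vector, while Q1 and Q3 are the medians of its lower and upper halves (the middle element is left out of both halves when N is odd). Below is a minimal sketch of that rule, useful for cross-checking the numbers. It is not the program listed above: the names medianOfRange, Quartiles and quartiles are hypothetical, and it assumes at least two input values, since N = 0 and N = 1 are handled separately by the main program.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: median of the already-sorted range v[first, last).
double medianOfRange(const std::vector<double>& v,
                     std::size_t first, std::size_t last)
{
    assert(first < last && last <= v.size());   // range must be non-empty
    const std::size_t n = last - first;
    const std::size_t mid = first + n / 2;
    // even count: average the two middle values; odd count: take the middle one
    return n % 2 == 0 ? (v[mid - 1] + v[mid]) / 2 : v[mid];
}

// Hypothetical result type for the three quartiles.
struct Quartiles
{
    double q1, q2, q3;
};

// Assumption: v holds at least two values. The vector is taken by value
// so that sorting does not disturb the caller's copy.
Quartiles quartiles(std::vector<double> v)
{
    assert(v.size() >= 2);
    std::sort(v.begin(), v.end());

    const std::size_t n = v.size();
    const std::size_t half = n / 2;   // size of each half (median excluded if n is odd)

    Quartiles q;
    q.q1 = medianOfRange(v, 0, half);       // lower half
    q.q2 = medianOfRange(v, 0, n);          // whole vector
    q.q3 = medianOfRange(v, n - half, n);   // upper half
    return q;
}
```

Passing the N = 4 dataset (6, 2, 4, 9) to quartiles should give 3, 5 and 7.5, and the two Wikipedia datasets should give 15, 37.5, 40 and 15, 40, 43, the same values as the console output above.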

Reference

Koenig, Andrew & Moo, Barbara E., Accelerated C++, Addison-Wesley, 2000

Accelerated C++ Solution to Exercise 3-1

Exercise 3-1

Suppose we wish to find the median of a collection of values. Assume that we have read some values so far, and that we have no idea how many values remain to be read. Prove that we cannot afford to discard any of the values that we have read. Hint: One proof strategy is to assume that we can discard a value, and then find values for the unread–and therefore unknown–part of our collection that would cause the median to be the value that we discarded.

Solution

I must admit that I find this question very difficult to understand. When I first read it, I wasn’t sure whether it was meant to be a simple, straightforward question or some kind of trick question with a definite answer.

I decided to Google this and found this very interesting forum on the topic (including some comments from the original author, A. Koenig!). After spending an hour going through the thread, I decided that the question and answer could get a bit too mathematical / theoretical, and I am not prepared to spend the next hours or days researching it. For this reason, I will stick to my motto, KISS (Keep It Simple and Stupid): I will just give a couple of examples showing that the true median of a full set “without discarding” may differ from the (distorted) median of the same full set “with discarding”. This demonstrates why we cannot afford to discard any numbers.

Example 1

Assume we have read the numbers 1, 2, 3, 4 so far, and imagine that the next number is 5.

| Scenario | The eventual “full set” | The median | Observation |
|---|---|---|---|
| No discarding | 1, 2, 3, 4, 5 | 3 | True median |
| We discard 1 | 2, 3, 4, 5 | 3.5 | Larger than true median |
| We discard 2 | 1, 3, 4, 5 | 3.5 | Larger than true median |
| We discard 3 | 1, 2, 4, 5 | 3 | Identical to true median |
| We discard 4 | 1, 2, 3, 5 | 2.5 | Smaller than true median |

Example 2

Assume we have read the numbers 1, 2, 3, 4 so far, and imagine that the next two numbers are 5 and 6.

| Scenario | The eventual “full set” | The median | Observation |
|---|---|---|---|
| No discarding | 1, 2, 3, 4, 5, 6 | 3.5 | True median |
| We discard 1 | 2, 3, 4, 5, 6 | 4 | Larger than true median |
| We discard 2 | 1, 3, 4, 5, 6 | 4 | Larger than true median |
| We discard 3 | 1, 2, 4, 5, 6 | 4 | Larger than true median |
| We discard 4 | 1, 2, 3, 5, 6 | 3 | Smaller than true median |

Conclusion

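The two examples above demonstrate that we cannot afford to discard any of the numbers we have read if we wish to identify the true median. Discarding 3 happens to preserve the median in Example 1, but the same discard distorts it in Example 2. Since we do not know which values are still to come, no discard is safe: discarding any number from the (read) set may distort the median.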
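
The tables in Examples 1 and 2 can be recomputed with a few lines of C++. The following is only a sketch for checking the arithmetic, not part of the book's exercise; the function names median and medianAfterDiscard are hypothetical, and the discarded value is identified by its index in the list of values read so far.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Median of a non-empty collection. The vector is taken by value so that
// sorting does not disturb the caller's copy.
double median(std::vector<double> v)
{
    assert(!v.empty());
    std::sort(v.begin(), v.end());
    const std::size_t mid = v.size() / 2;
    // even count: average the two middle values; odd count: take the middle one
    return v.size() % 2 == 0 ? (v[mid - 1] + v[mid]) / 2 : v[mid];
}

// Hypothetical helper: the median we would report if we had thrown away
// read[discard] before the values in `unread` arrived.
double medianAfterDiscard(const std::vector<double>& read,
                          std::size_t discard,
                          const std::vector<double>& unread)
{
    assert(discard < read.size());

    std::vector<double> kept;
    for (std::size_t i = 0; i != read.size(); ++i)
    {
        if (i != discard)
            kept.push_back(read[i]);   // keep everything except the discarded value
    }
    kept.insert(kept.end(), unread.begin(), unread.end());
    return median(kept);
}
```

With the values 1, 2, 3, 4 already read, discarding index 2 (the value 3) gives a median of 3 when the unread part is just 5, but 4 when the unread part is 5 and 6, whereas the true medians are 3 and 3.5 respectively.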

Reference

Koenig, Andrew & Moo, Barbara E., Accelerated C++, Addison-Wesley, 2000