Accelerated C++ Solution to Exercise 5-6

Exercise 5-6

Rewrite the extract_fails function from S5.1.1/77 so that instead of erasing each failing student from the input vector students, it copies the records for the passing students to the beginning of students, and then uses the resize function to remove the extra elements from the end of students. How does the performance of this version compare with the one in S5.1.1/77?

Solution

The original (version 2) extract_fail function in the text book S5.1.1/77 looks like this – it uses the vector<XXX>::erase function.

{
    Student_group fail;
    Student_group::size_type i = 0;

    // invariant: elements [0,i) of students represent passing grades
    while (i != students.size())
    {
        if (fgrade(students[i]))
        {
            fail.push_back(students[i]);
            students.erase(students.begin() + i);
        }
        else
            ++ i;
    }
    return fail;
}

Note that I have used the technique developed from my Solution to Exercise 5-4, in which all the vector<Student_info> are replaced by Student_group, using the C++ typedef. (Just trust me for now there is a typdef somewhere else that defines Student_group) I believe it makes sense to build on what we have learnt!

The newly “re-written” function will look like this it uses a combination of vector<XXX>::insert and vector<XXX>::resize functions.

Student_group extract_fails_v2_5p6(Student_group& students)
{
    Student_group fail;
    typedef Student_group::size_type VecSize;
    VecSize countPass = 0;

    // invariant: elements [0,i) of students represent passing grades
    VecSize i = 0;
    while (i != students.size())
    {
        if (fgrade(students[i]))
            fail.push_back(students[i]);
        else
        {
            students.insert(students.begin(), students[i++]);
            ++countPass;
        }
        ++i;
    }
    students.resize(countPass);

    return fail;
}

If after skimping through this new code and you fully understand the concepts / what’s going on, then congratulation you are done!

From a series of tests the new version happens to take 50% more time than the original version (when reaching around 10,000 lines of input) – performance results can be found at the bottom of the page under the Test Result section.

If you however would like to know more in details, such as how I derive this new code logic, then read on!

Code Logic – The insert/resize Method

The way I derive the eventual code was simply doing some sketches on paper (In face this is how I usually solve problems – making sketches and pictures!). Below shows the diagrams that I drawn – essentially I started with a hypothetical example and just work my way through using logic. Some notes:

  • The countPass is a counter to keep track of the number of passing students – this will be used later on to feed the vector<XXX>::resize function.
  • The i, is just an incrementing variable (our good old friend), to facilitate a while loop.

Acpp5p6Pic1a

Acpp5p6Pic1b

Acpp5p6Pic2a

Acpp5p6Pic2b

Have we reached the end of the vector? (yes)

Acpp5p6Pic3a

Use the resize function to keep only the passing students in vector<Student_info> students.

Acpp5p6Pic3b

And these are the final output!

Acpp5p6Pic3c

An Important Learning – iterator

This exercise has exposed an interesting fact: iterator does NOT work when the underlying vector changes size! i.e. the following code will not work (even though it looks correct). One must use vector indexing instead! (I spent an hour debugging this – a hard lesson learnt!). Must be a feature of vector iterator vs vector index.

Student_group extract_fails_v2_5p6(Student_group& students)
{
    Student_group fail;
    Student_group::iterator i = students.begin();
    Student_group::size_type countPass = 0;

    // invariant: elements [0,i) of students represent passing grades
    while (i != students.end())
    {
        if (fgrade(*i))
            fail.push_back(*i);
        else
        {
            students.insert(students.begin(), *i);
            ++countPass;
            ++i;
        }
        ++i;
    }
    students.resize(countPass);

    return fail;
}

The Project

Here are the complete set of header and source files that enable me to perform the test.

Source File List

Header File List

Source Files

main.cpp

#include <iostream>
#include <iomanip>
#include <ios>
#include <ctime>
#include "Student_info.h"
#include "Student_group.h" // switch between vector and list based Student_group here
#include "grade.h"
#include "extract_fails.h"

using std::cin;
using std::cout;
using std::endl;
using std::setprecision;
using std::streamsize;

int main()
{

    Student_group students;
    Student_info record;

    // read and store all the student's data.
    while (read(cin, record))
        students.push_back(record);

    // Extract the failed students and time-stamp it!
    clock_t startTime = clock();
    Student_group students_failed = extract_fails_v2(students);
    /* Student_group students_failed = extract_fails_v2_5p6(students);   */
    clock_t endTime = clock();
    clock_t clockTicksTaken = endTime - startTime;
    double timeInSeconds = clockTicksTaken / (double) CLOCKS_PER_SEC;


    streamsize prec = cout.precision();
    cout << "Elapsed (in seconds): " << setprecision(30) << timeInSeconds
         << setprecision(prec) << endl;

    // Uncomment the following block to display passing and failing students
    /*
    cout << endl;

    // Report passing students
    cout << "These students have passed." << endl;
    for (Student_group::const_iterator iter = students.begin();
         iter != students.end(); ++iter)
        cout << iter->name << " (" << grade(*iter) << ")" << endl;

    cout << endl;

    // Report failing students
    cout << "These students have failed." << endl;
    for (Student_group::const_iterator iter = students_failed.begin();
         iter != students_failed.end(); ++iter)
        cout << iter->name << " (" << grade(*iter) << ")" << endl;
    */

    return 0;
}

extract_fails.cpp

#include "Student_info.h"
#include "Student_group.h"
#include "grade.h"

// separate passing and failing student records
// derived from S5.1.1/77
Student_group extract_fails_v2(Student_group& students)
{
    Student_group fail;
    Student_group::size_type i = 0;

    // invariant: elements [0,i) of students represent passing grades
    while (i != students.size())
    {
        if (fgrade(students[i]))
        {
            fail.push_back(students[i]);
            students.erase(students.begin() + i);
        }
        else
            ++ i;
    }
    return fail;
}

// Required by Exercise 5-6
Student_group extract_fails_v2_5p6(Student_group& students)
{
    Student_group fail;
    typedef Student_group::size_type VecSize;
    VecSize countPass = 0;

    // invariant: elements [0,i) of students represent passing grades
    VecSize i = 0;
    while (i != students.size())
    {
        if (fgrade(students[i]))
            fail.push_back(students[i]);
        else
        {
            students.insert(students.begin(), students[i++]);
            ++countPass;
        }
        ++i;
    }
    students.resize(countPass);

    return fail;
}

grade.cpp

#include <stdexcept>
#include <vector>
#include "grade.h"
#include "median.h"
#include "Student_info.h"

using std::domain_error;
using std::vector;

// definitions for the grade functions from S4.1/52, S4.1.2/54, S4.2.2/63

// compute a student's overall grade from midterm and final exam
// grades and homework grade (S4.1/52)
double grade(double midterm, double final, double homework)
{
    return 0.2 * midterm + 0.4 * final + 0.4 * homework;
}

// compute a student's overall grade from midterm and final exam grades
// and vector of homework grades.
// this function does not copy its argument, because median (function) does it for us.
// (S4.1.2/54)
double grade(double midterm, double final, const vector<double>& hw)
{
    if (hw.size() == 0)
        throw domain_error("student has done no homework");
    return grade(midterm, final, median(hw));
}

// this function computes the final grade for a Student_info object
// (S4.2.2/63)
double grade(const Student_info& s)
{
    return grade(s.midterm, s.final, s.homework);
}

// predicate to determine whether a student failed
// (S5.1/75)
bool fgrade(const Student_info& s)
{
    return grade(s) < 60;
}

median.cpp

// source file for the median function
#include <algorithm>
#include <stdexcept>
#include <vector>

using std::domain_error;
using std::sort;
using std::vector;

// compute the median of a vector<double>
double median(vector<double> vec)
{
    typedef vector<double>::size_type vec_sz;

    vec_sz size = vec.size();
    if (size == 0)
        throw domain_error("median of an empty vector");

    sort(vec.begin(),vec.end());

    vec_sz mid = size/2;

    return size % 2 == 0 ? (vec[mid] + vec[mid-1]) / 2 : vec[mid];
}

Student_info.cpp

#include "Student_info.h"

using std::istream;
using std::vector;

// we are interested in sorting the Student_info object by the student's name
bool compare(const Student_info& x, const Student_info& y)
{
    return x.name < y.name;
}

// read student's name, midterm exam grade, final exam grade, and homework grades
// and store into the Student_info object
// (as defined in S4.2.2/62)
istream& read(istream& is, Student_info& s)
{
    // read and store the student's name and midterm and final exam grades
    is >> s.name >> s.midterm >> s.final;

    // read and store all the student's homework grades
    read_hw(is, s.homework);
    return is;
}

// read homework grades from an input stream into a vector<double>
// (as defined in S4.1.3/57)
istream& read_hw(istream& in, vector<double>& hw)
{
    if (in)
    {
        // get rid of previous contents
        hw.clear();

        // read homework grades
        double x;
        while (in >> x)
            hw.push_back(x);

        // clear the stream so that input will work for the next student
        in.clear();
    }
    return in;
}

Header Files

extract_fails.h

#ifndef GUARD_EXTRACT_FAILS_H
#define GUARD_EXTRACT_FAILS_H

// extract_fails.h
#include "Student_group.h"

Student_group extract_fails_v2(Student_group&);
Student_group extract_fails_v2_5p6(Student_group&);

#endif // GUARD_EXTRACT_FAILS_H

grade.h

#ifndef GUARD_GRADE_H
#define GUARD_GRADE_H

// grade.h
#include <vector>
#include "Student_info.h"

double grade(double, double, double);
double grade(double, double, const std::vector<double>&);
double grade(const Student_info&);
bool fgrade(const Student_info&);

#endif // GUARD_GRADE_H

median.h

#ifndef GUARD_MEDIAN_H
#define GUARD_MEDIAN_H

// median.h
#include <vector>
double median(std::vector<double>);

#endif // GUARD_MEDIAN_H

Student_group.h

#ifndef GUARD_STUDENT_GROUP_H
#define GUARD_STUDENT_GROUP_H

// Student_group.h
#include <list>
#include <vector>
#include "Student_info.h"

// ********************* AMEND THIS ***************************
// Solution to Exercise 5-4: Pick one of the followings
typedef std::vector<Student_info> Student_group;
//typedef std::list<Student_info> Student_group;
// ************************************************************

#endif // GUARD_STUDENT_GROUP_H

Student_info.h

#ifndef GUARD_STUDENT_INFO_H
#define GUARD_STUDENT_INFO_H

// Student_info.h
#include <iostream>
#include <string>
#include <vector>

struct Student_info
{
    std::string name;
    double midterm, final;
    std::vector<double> homework;
};

bool compare(const Student_info&, const Student_info&);
std::istream& read(std::istream&, Student_info&);
std::istream& read_hw(std::istream&, std::vector<double>&);

#endif // GUARD_STUDENT_INFO_H

Test Program

Just simply amend the following line in the main() program accordingly for test purposes.

Student_group students_failed = extract_fails_v2(students);
/* Student_group students_failed = extract_fails_v2_5p6(students); */

Performance Comparison

I’ve run a few tests and recorded the job run time for the extract_fail step. from this it looks like the insert/resize method is actually slower than the erase method. But then, I guess it depends on the input file – i.e. depending on the ratio of passing students to failing students, the sequence of the passing and failing students, and potentially other factors.

v2 (erase) v2_5p6 (insert/resize)
10 lines 0.001 seconds 0.000 seconds
100 lines 0.003 seconds 0.004 seconds
1,000 lines 0.156 seconds 0.212 seconds
10,000 lines 9.79 seconds 16.6 seconds

Reference

Koenig, Andrew & Moo, Barbara E., Accelerated C++, Addison-Wesley, 2000

6 thoughts on “Accelerated C++ Solution to Exercise 5-6”

  1. Using an iterator does appear to work, though I believe the insert invalidates all existing related iterator references. Also I didn’t notice that much of a difference in running time on an input of 10k. Interesting differences.

  2. You can use iterators but you’ll need to reinit them after the insert and keep track of where you we’re so you can increment back to where you were.

  3. But this way you can’t actually use list because list doesn’t support indexing, so I’ve been trying to find a way with iterators but I can’t… anyone who could give me a clue on the iterators way?

  4. Here is a solution that uses iterator
    // Chapter 5 problem 5.6 : using insert in the beginning and resize the vector instead of using erase
    Student_group extract_fails_insert(Student_container& students)
    {
    Student_group::const_iterator iter = students.begin();
    Student_group fail;

    using Student_size = Student_group::size_type;
    // how many passing students has been inserted
    Student_size pass = 0;
    // how many students has been read
    Student_size counter = 0;

    // invariant: elements [0,end)
    while(iter != students.end()){
    ++counter;
    // if students pass
    if (!fgrade(*iter)){
    ++pass;
    // insert the passing student in the beginning and restart iter from beginning of students
    iter = students.insert(students.begin(),*iter);
    // key: ignore what has been inserted (pass) and what has been read from students (counter)
    for (Student_size i = 0; i != pass + counter; ++i)
    ++iter;
    } else {
    fail.push_back(*iter);
    ++iter;
    }
    }

    // resize to only contain passing students
    students.resize(pass);

    return fail;

    }

  5. Hello Johnny,

    First of all : thank you for sharing your work, it is helping me a lot with the book.

    About this excercise, I did not have the same interpretation of the question statement.

    I think the idea was to insert the students at the beginning of the vector, but not to insert them one by one at the very beginning. Something like :

    vector extract_fails(vector& students)
    {
    vector fail;
    int i = 0;

    unsigned int successCount = 0;

    while (i != students.size()) {
    if (fgrade(students[i])) {
    fail.push_back(students[i]);
    } else
    {
    students[successCount++] = students[i];
    }
    ++i;
    }

    students.resize(successCount);

    return fail;

    }

    This leads to an algorithm that is more efficient than the previous two, since no additional allocation is made.

Comments are closed.