Exercise 3-3
Write a program to count how many times each distinct word appears in its output.
Solution
Note: Viktor has kindly pointed out the fact that I have mis-read the question as “How many distinct words are there” (instead of how many times each distinct word appears). The solution below therefore, does not actually under the question above!!! (If I have time I may redo the question – otherwise please just ignore my solution below!!!)
My solution is a result of combining the knowledge gained from Solution to Exercise 2-8 (on recursive operations between element v[i] and v[i+1]) and Solution to Exercise 3-2 (on manipulating vector).
The solution strategy:
- Ask user to provide the list of words so we can append the string elements to a vector, v.
- Compute the size of vector, N.
- If vector size is 0, the number of distinct words is also 0.
- If vector size is 1, the number of distinct words is also 1.
- If vector size is 2 or above, we apply an algorithm to identify and count the distinct words. (To be explained further below).
The algorithm (for the case of vector size of 2 or above) is described as followings:
- We sort the vector v in non-descending order.
- We initialise the index A to 0, and index B to 1. These indices will be used for comparing adjacent elements within the vector (to check for distinctive elements).
- We also initialise the distinct element count figure ND to 1 – we know that we have at least 1 distinct word.
- We perform N-1 number of comparisons between the adjacent elements v[A] and v[B]. (e.g. if there are 10 elements, we can only compare 9 sets of adjacent elements).
- During each comparison step, if v[B] is not the same as v[A], it implies that v[B] is a newly discovered distinct word. We increment ND.
- Note: if v[B] is the same as v[A], it implies v[B] is a repeated word. We do nothing in that case, as we are only interested in discovering new distinct words.
- We then increment the indices A and B by 1 for the next comparisons. i.e. we first compare v[0] and v[1], then v[1] and v[2], then v[2] and v[3], etc…
- We display the final value of ND – this is the number of distinct words computed.
Putting this all together, we have our full program.
#include <algorithm> #include <iostream> #include <string> #include <vector> using std::cin; // <iostream> using std::cout; // <iostream> using std::endl; // <iostream> using std::sort; // <algorithm> using std::string; // <string> using std::vector; // <string> int main() { // display header message cout << "***************************************************************\n" "*** This program computes number of unique words ***\n" "***************************************************************\n"; cout << endl; // ask for a list of numbers and store the list as a vector cout << "Enter a list of words one by one: "; vector<string> v; string x; while (cin >> x) v.push_back(x); // append new input to the vector cout << endl; // define and compute core vector variables typedef vector<string>::size_type vecSize; // define a type for vector size related variables vecSize N = v.size(); // number of elements in the vector vecSize numLoops = N - 1; // number of (comparison) operators required vecSize ND; // number of distinct words // Check vector size, action accordingly if (N ==0 ) { ND = 0; cout << "Number of distinct words = " << ND << endl; return 0; } else if (N ==1 ) { ND = 1; cout << "Number of distinct words = " << ND << endl; return 0; } else { // sort the vector; sort(v.begin(),v.end()); // display some results to console window cout << "Vector size (number of words entered): " << N << endl; cout << endl; cout << "Display the sorted (non-descending) distinct words below." << endl; cout << v[0] << endl; // declare new variables vecSize A = 0; // vector index vecSize B = 1; // vector index ND = 1; // number of distinct words // Loop through the vector, compute ND, and identify the distinct words for (vecSize i = 0; i != numLoops; ++i) { if (v[B] != v[A]) { ++ND; cout << v[B] << endl; // display any newly discovered distinct words } ++A; ++B; } // Display final distinct word count cout << endl; cout << "Number of distinct elements (words): " << ND << endl; } return 0; }
Result
Below shows the results of the 3 sets of test:
- N = 0
- N=1
- N>=2
The results also reveal that fact that the sorting elements of a vector is case-sensitive. e.g. the string John (beginning with upper case) is different to john (all lower case).
N = 0
*************************************************************** *** This program computes number of unique words *** *************************************************************** Enter a list of words one by one: ^Z Number of distinct words = 0
N = 1
*************************************************************** *** This program computes number of unique words *** *************************************************************** Enter a list of words one by one: Johnny ^Z Number of distinct words = 1
N >= 2
*** This program computes number of unique words *** *************************************************************** Enter a list of words one by one: john peter pete johnny john John ^Z Vector size (number of words entered): 6 Display the sorted (non-descending) distinct words below. John john johnny pete peter Number of distinct elements (words): 5
Hi, I think you misunderstood the problem. The description says to write a program that counts how many times each distinct word appears, not how many distinct words there are.
Ooops! You are so right!!! (I definitely had mis-read the question)
Hi! First of all, thank you for taking the time to do what you did for all the chapters. It helps immensely.
Moreover, I have a question regarding one part of your solution.
You decide to create variables A and B and give them the values 0 and 1 (respectively).
Is there any reason as to why you didn’t use the same ‘i’ of the for loop and simply say [i] and [i+1] ? (That way they also increment)
Like this:
for (vecSize i = 0; i != numLoops; ++i)
{
if (v[i+1] != v[i])
.
.
.
Also, I believe that your counter and the int in the last for loop should be ints and not vecSize. I think vecSize works because vector::size_type is a typedef of an int or longint, but I am almost sure you should not use it in this case.
Finally, and this is just what I believe is more of a common practice, ‘<‘ is more commonly used than’ !=’ , leaving the for loop like this.
int count = 1;
for (int i=0; i < size-1; i++) {
if (words[i+1] != words[i]) {
++count;
}
}
In conclusion, that is the beauty of programming, that a simple question can have so many different solutions which can all reach the correct output.
But again, thank you for your input and the good work you’ve done here!
here is my solution:
if (size == 0)
{
cout << "There is no words in the vector. Size = " << size <<
"\nPlease try again!" << endl;
}
else if (size == 1)
{
cout << "Number of distinct words = " << size <<
"\nPlease try again!" << endl;
}
else
{
sort(words.begin(), words.end());
int a = 0; // index of the first word in the vector
int b = 1; // index of the second word in the vector
int x = 1;
for (int i = 0; i < size; i++)
{
if (words[a] == words[b])
{
x++;
}
else
{
cout << words[a] << " - " << x << endl;
x = 1;
}
a++;
b++;
}
}
https://pastebin.com/VQXKXk2U
My solution, counts the unique words and outputs them, but doesn’t store them.
Your solution is great but I think you don’t need to put the last variable wordCount = 1 at the bottom
Here’s my solution: https://gist.github.com/luitzenhietkamp/debe965cbe369d5d11e611e9e633cba3
int main()
{
freopen(“input.txt”, “r”, stdin);
freopen(“output.txt”, “w”, stdout);
//enter string vector
std::vector v;
string x;
while(cin>>x) v.push_back(x);
//sorting => immediately see the similar words standing by each other
sort(v.begin(),v.end());
//size of vector
vector::size_type size=v.size();
//initialize count variable
int count = 1;
//start counting
for (int i = 0; i < size; ++i)
{
//cout<<v[i] <<endl;
if (v[i]==v[i+1])
{
++count;
} else {
cout<<“\”” <<v[i] <<“\”” <<” appears ” <<count <<” times “<<endl;
count=1;
}
}
return 0;
}
Here is my solution :D :
#include “stdafx.h”
#include
#include
#include // Pre Proccessor Directive – Includes files that program needs
#include
#include
#include
using namespace std; // Standard Library = std , this includes this library
// CTRL + K + C block code
int main()
{
//ask for list of strings
cout << “Please enter multiple strings. ” << endl <<
“followed by end-of-file : “;
// a variable into which to read
string x;
vector<string> WordList;
// invariant:homework contains all the homework grades read so far
while (cin >> x) {
WordList.push_back(x);
}
cout << endl;
//check that the student entered some homework grades
typedef vector<double>::size_type vec_sz;
vec_sz size = WordList.size();
sort(WordList.begin(), WordList.end());
if (size == 0) {
cout << endl << "YOU HAVE FAILED TO ENTER ANYTHING. "
"Please try again." << endl;
return 1;
}
// compute the counts of each distinct word in the list
if (size == 1) {
cout << WordList[0] << "Appears once" << endl;
}
if (size > 1) {
int b = 0;
for (int a = 0; a != size; a++) {
int appearances = 0;
for (int b = 0; b != size; b++) {
if (WordList[a] == WordList[b])
{
++appearances;
}
}
cout << WordList[a] << " appears " << appearances << " times" << endl;
if (appearances > 1) {
a += appearances - 1;
}
}
cout << endl;
cout << endl;
}
return 0;
}