PS3: Introduction to refactoring
Last Git reminder for this session
Maintain your repositories!
Conflict resolution
You have been introduced to conflict resolution in the SAÉ project. Try to resolve the conflicts on your own. If you encounter difficulty, ask for assistance from your instructor.
Ignore error handling for now
We have not covered error handling in class yet. For now, you can assume that the arguments are always valid and do not need to worry about error handling.
Compile and run regularly
Do not forget to compile and run your code regularly during the refactoring process. The behavior of your code should remain consistent with your expectations!
Objectives
The goal of this session is to understand the following points:
- Code organization and structuring
- Single Responsibility Principle
- The Stepdown Rule
- Functions with few arguments
- const and &
Exercise 1: Short functions
Code organization
- Create a short-functions/ directory inside PS3/.
- Create the following files inside short-functions/.
// File: product.h
#ifndef PRODUCT_H
#define PRODUCT_H
#include <string>
class Product {
private:
    std::string mName;
    double mPrice; // in euros
    int mQuantity; // in units
public:
    Product(const std::string& name, double price, int quantity);
    std::string getName() const;
    double getPrice() const;
    int getQuantity() const;
    bool operator<(const Product& other) const;
};
#endif
// File: product.cpp
#include "product.h"
#include <string>
Product::Product(const std::string& name, double price, int quantity)
    : mName(name), mPrice(price), mQuantity(quantity) {}
std::string Product::getName() const {
    return mName;
}
double Product::getPrice() const {
    return mPrice;
}
int Product::getQuantity() const {
    return mQuantity;
}
bool Product::operator<(const Product& other) const {
    return mPrice < other.mPrice;
}
// File: display-prices.h
#ifndef DISPLAY_PRICE_H
#define DISPLAY_PRICE_H
#include "product.h"
void displayAvailableProductsByNonDecreasingPriceAndDisplayTotalPrice();
#endif
// File: display-prices.cpp
#include "product.h"
#include "display-prices.h"
#include <iostream>
#include <vector>
#include <algorithm>
void displayAvailableProductsByNonDecreasingPriceAndDisplayTotalPrice(){
    // Fetching products data
    std::vector<Product> products;
    // Our current products
    products.push_back(Product("Laptop", 1000, 5));
    products.push_back(Product("Smartphone", 700, 0));
    products.push_back(Product("Tablet", 500, 2));
    products.push_back(Product("Headphones", 150, 10));
    products.push_back(Product("Smartwatch", 200, 0));
    // Fetching available products data
    std::vector<Product> availableProducts;
    for (const Product& product : products) {
        if (product.getQuantity() > 0) {
            availableProducts.push_back(product);
        }
    }
    // Sorting available products by non-decreasing price
    std::sort(availableProducts.begin(), availableProducts.end());
    // Calculating total price
    double totalPrice = 0;
    for (const Product& product : availableProducts) {
        totalPrice += product.getPrice() * product.getQuantity();
    }
    // Displaying available products by non-decreasing price
    std::cout << "Sorted Available Products (by non-decreasing price):" << std::endl;
    for (const Product& product : availableProducts) {
        std::cout << "Name: " << product.getName() << " - Price: " << product.getPrice()
                  << " - Quantity: " << product.getQuantity() << std::endl;
    }
    // Displaying total price
    std::cout << "Total Price of Available Products: " << totalPrice << std::endl;
}
// File: main.cpp
#include "display-prices.h"
int main() {
    displayAvailableProductsByNonDecreasingPriceAndDisplayTotalPrice();
    return 0;
}
Why not group all the code into a single file?
When the project becomes more complex than a simple coding exercise, code separation brings the same benefits as the cleanliness and development principles discussed in class: modularity, reusability, maintainability, extensibility, encapsulation, readability, error handling, ...
Where is using namespace std?
using namespace std avoids the need to write std:: before all types, functions, or objects from the C++ standard library.
However, when the project becomes more complex, conflicts may arise if one of your functions (or types/objects) shares its name with one from the standard library.
We will therefore avoid using namespace std from now on.
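To make the risk of conflict concrete, here is a minimal sketch. The namespace shop and the functions unitsInStock and outOfStockProducts are hypothetical names chosen for illustration. The helper shop::count shares its name with the standard algorithm std::count; writing the prefix explicitly keeps it clear which one is called.

```cpp
#include <algorithm>
#include <vector>

namespace shop {
// Hypothetical helper: total number of units in stock.
// It deliberately shares its name with std::count.
int count(const std::vector<int>& quantities) {
    int total = 0;
    for (int quantity : quantities) {
        total += quantity;
    }
    return total;
}
}  // namespace shop

int unitsInStock(const std::vector<int>& quantities) {
    // Our own function, selected explicitly with shop::
    return shop::count(quantities);
}

int outOfStockProducts(const std::vector<int>& quantities) {
    // The standard algorithm: number of elements equal to 0
    return static_cast<int>(std::count(quantities.begin(), quantities.end(), 0));
}
```

With the prefixes written out, both functions can coexist and the reader never has to guess which count is meant.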
Header files .h in C++
In C++, header files (.h) contain the declarations of functions, classes, and variables, but not their definitions.
This separation between declaration and definition allows C++ to manage dependencies and modularize the code.
For example, if a source file .cpp needs the Product class, it is enough to include (#include) "product.h" without worrying about the implementation, because product.h “promises” that the declared constructor and methods will be implemented.
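The difference between a declaration and a definition can be sketched in a single file. The functions applyDiscount and discountedLaptopPrice are hypothetical and only serve as an illustration; in a real project, the declaration would live in a header and the definition in the matching .cpp file.

```cpp
// What a header would contain: a declaration (the "promise").
double applyDiscount(double price, double rate);

// Code that only knows the declaration can already call the function.
double discountedLaptopPrice() {
    return applyDiscount(1000.0, 0.25);
}

// What the matching .cpp file would contain: the definition.
double applyDiscount(double price, double rate) {
    return price * (1.0 - rate);
}
```

The compiler accepts the call in discountedLaptopPrice before it has seen the body of applyDiscount; the body only needs to exist somewhere by the time the program is linked.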
Preprocessor directives in C++
The lines that start with # are called preprocessor directives:
#ifndef PRODUCT_H
#define PRODUCT_H
#include <string>
//...
#endif
Compilation in C++
To properly organize C++ code and understand its importance, it is essential to grasp how compilation works in C++.
In reality, the term “compilation” encompasses 4 steps for each source file .cpp.
- Preprocessing
  - Directives are executed. For example, #include is replaced with the contents of the included file.
- Translation Unit
  - The preprocessed file becomes a large text file called a translation unit.
- Compilation
  - The translation unit is compiled into machine code in an object file (.o).
- Linking
  - The object files are linked based on their dependencies (headers and corresponding source files) to form the executable.
Why separate headers and source code in C++?
- Imagine a source file that contains both the declaration and the definition of a class (as we are used to), and that we need to reuse in several files to build our executable.
- The best practice is to compile all .cpp files, rather than tracking the dependencies between files to avoid compiling some of them, at the risk of getting lost.
- This code will thus be compiled at least twice: once as a .cpp file of its own, and once more when it is included in another .cpp file.
- C++ follows the one definition rule, which states that, for a given executable, each function or variable must be defined exactly once.
- During linking, C++ detects an error because the executable contains duplicate definitions coming from different translation units.
- A header is not compiled on its own like a source file, because it only contains declarations, not the definitions themselves. When the header is included in .cpp source files via #include, identical declarations may appear multiple times, but C++ allows multiple declarations of the same element in different translation units (as long as there is only one definition).
- Therefore, never include a .cpp file, only .h files.
Include guards
These directives together form an include guard:
#ifndef PRODUCT_H
#define PRODUCT_H
//...
#endif
#define creates a macro, here PRODUCT_H, which is defined without any replacement text. The naming convention for macros is the same as for global constants, and the _H suffix indicates that the macro belongs to the include guard of a header (though other types of macros may exist).
#ifndef (short for “if not defined”), #define, and #endif mean: “If the macro PRODUCT_H is not defined yet, define it and keep the code up to #endif; otherwise, skip that code”.
The include guard ensures that a header is included only once in each source code and therefore in each translation unit.
The downside of include guards is the need for a consistent naming scheme for an executable (whose compilation can come from thousands of different files spread across different directories): there cannot be two headers with the same macro (even if the files are in very distant directories from each other).
Another directive, #pragma once, is sometimes used instead of include guards, but it is not part of the C++ standard and has its own drawbacks.
Standard best practices in C++ recommend using include guards.
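The effect of an include guard can be simulated in a single file by pasting the same guarded content twice, which is what the preprocessor produces when a header is included twice. The names PRICE_TAG_H, PriceTag, and doubled are hypothetical and chosen for this sketch only.

```cpp
// First inclusion: PRICE_TAG_H is not defined yet, so the code is kept.
#ifndef PRICE_TAG_H
#define PRICE_TAG_H
struct PriceTag {
    double amount;  // in euros
};
#endif

// Second inclusion of the same content: PRICE_TAG_H is now defined,
// so everything up to #endif is skipped and PriceTag is not redefined.
#ifndef PRICE_TAG_H
#define PRICE_TAG_H
struct PriceTag {
    double amount;  // in euros
};
#endif

double doubled(const PriceTag& tag) {
    return tag.amount * 2;
}
```

Without the guard lines, the second copy of PriceTag would be a redefinition and the file would not compile.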
Why include <string> a “second” time?
If product.h includes <string> and product.cpp includes product.h, why do we need to include <string> a second time?
Include what you use is a good practice in C++.
First, headers do not always include all the libraries needed by the source files, as some libraries may not be necessary for a declaration but will be essential for the definition.
Additionally, checking that all necessary libraries are already included in the headers is a waste of time when there are many headers. On the other hand, an extra include is not an issue if include guards are used (which is the case for the headers of the C++ standard library).
Product::
It is necessary to write Product:: in front of all methods of Product to indicate to C++ that these elements are internal to Product. It is possible to define functions external to Product in the same file, but this is not a good practice.
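As an illustration of the ClassName:: prefix, here is a small sketch with a hypothetical Cart class (not part of the exercise). The methods are defined outside the class with the prefix Cart::, while isCartEmpty has no prefix and is therefore an ordinary external function.

```cpp
class Cart {
private:
    int mItemCount;
public:
    Cart();
    void addItem();
    int getItemCount() const;
};

// The prefix Cart:: tells C++ that these definitions belong to Cart,
// so they can access the private attribute mItemCount.
Cart::Cart() : mItemCount(0) {}

void Cart::addItem() {
    mItemCount++;
}

int Cart::getItemCount() const {
    return mItemCount;
}

// No prefix: this is an external function. It cannot access mItemCount
// directly and has to go through the public method getItemCount().
bool isCartEmpty(const Cart& cart) {
    return cart.getItemCount() == 0;
}
```

In a real project, following the advice above, isCartEmpty would be declared and defined in its own files rather than next to the methods of Cart.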
operator<
The method operator< is an overload of the comparison operator <.
By default, Product objects cannot be compared with <. We define this comparison by saying that one Product object is smaller than another (other) if its price is smaller (mPrice < other.mPrice).
This will allow sorting the products based on their price later.
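The same mechanism can be sketched on another class. Parcel and isLighter are hypothetical names used only for this example; the comparison is based on the weight instead of the price.

```cpp
class Parcel {
private:
    double mWeight;  // in kilograms
public:
    explicit Parcel(double weight) : mWeight(weight) {}

    // A parcel is "smaller" than another one if it is lighter.
    bool operator<(const Parcel& other) const {
        return mWeight < other.mWeight;
    }
};

bool isLighter(const Parcel& first, const Parcel& second) {
    // first < second is translated by C++ into first.operator<(second)
    return first < second;
}
```

Two parcels with the same weight are not smaller than each other, so isLighter returns false in that case.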
for (const Product& product : products)
It is possible to write a for loop with the following syntax: for (ElementType element : vectorOfElements).
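Here is a short sketch of this loop syntax on a vector of int. The functions totalUnits and restock are hypothetical. The first loop reads a copy of each element; the second uses a reference (int&) so that the elements of the vector itself are modified.

```cpp
#include <vector>

int totalUnits(const std::vector<int>& quantities) {
    int total = 0;
    // quantity is a copy of each element: fine for reading a "light" int
    for (int quantity : quantities) {
        total += quantity;
    }
    return total;
}

void restock(std::vector<int>& quantities, int extraUnits) {
    // quantity is a reference: the element stored in the vector is modified
    for (int& quantity : quantities) {
        quantity += extraUnits;
    }
}
```

If the second loop were written with int quantity instead of int& quantity, only the copies would be modified and the vector would stay unchanged.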
Code organization
It is good practice to break down the code into classes, related external functions, and the main (each having its own header, except for main).
Finer breakdowns of external functions can be made if part of these functions needs to be reused elsewhere (modularity and reusability).
- Compile the code with g++ -o short-functions main.cpp display-prices.cpp product.cpp (the order of .cpp files does not matter).
C++ version
If you see warnings during compilation related to the syntax standards used (for example, for (ElementType element : vectorOfElements), which only exists since C++11), you can compile with a more recent version of C++ that is compatible with your g++ compiler.
To check the version (year) of C++ used by default by g++ on your workstation, use the following command:
g++ -dM -E -x c++ /dev/null | grep __cplusplus | sed 's/[^0-9]*\([0-9]\{4\}\)[0-9]*L/\1/'
For example, 2011 corresponds to C++11.
To check the versions compatible with g++ on your workstation, use the command:
g++ -v --help 2> /dev/null | grep -oP '(?<=-std=)c\+\+\S+' | sed 's/\.$//' | sort | uniq
The C++ versions released after C++11 are C++14, C++17, C++20, and C++23.
You can compile your code using a more recent and compatible version than the default one. For example:
g++ -std=c++14 -o short-functions main.cpp display-prices.cpp product.cpp
- Run ./short-functions.
Refactoring the code
- What does this code do?
sort
The sort function (std::sort, from <algorithm>) sorts a vector in place (no copy of the vector is created). It takes as arguments iterators to the beginning and to the end of the vector and uses the default < comparison (which has been defined for Product). Therefore, it sorts the values in non-decreasing order (increasing meaning strictly ascending).
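As a small illustration on plain numbers, the sketch below sorts a vector of prices. The function sortedCopy is a hypothetical helper: it receives its argument by value, so the sort happens in place on the copy and the caller's vector is left untouched.

```cpp
#include <algorithm>
#include <vector>

// prices is passed by value: the function works on its own copy.
std::vector<double> sortedCopy(std::vector<double> prices) {
    // begin() and end() are iterators delimiting the range to sort;
    // the elements are compared with <, so equal values stay side by side.
    std::sort(prices.begin(), prices.end());
    return prices;
}
```

With the input 700, 150, 500, 150, the result contains 150 twice in a row, which is why the order is called non-decreasing rather than increasing.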
const and &
Using const for function arguments ensures that the function will not modify the const argument. The reference & for arguments allows working on the same object as the one passed to the function instead of a copy, which avoids copying “heavy” objects. Here, Product has three attributes of type string, double, and int. double and int are considered “light”, while string can be “heavy”.
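The three ways of passing an argument can be compared on a std::string. The function names below (withExclamation, addExclamation, isBlank) are hypothetical and only illustrate the difference.

```cpp
#include <string>

// By value: text is a copy, the caller's string is not modified.
std::string withExclamation(std::string text) {
    text += "!";
    return text;
}

// By reference: text is the caller's string, which is modified.
void addExclamation(std::string& text) {
    text += "!";
}

// By const reference: no copy is made, and the function promises
// not to modify the caller's string.
bool isBlank(const std::string& text) {
    return text.empty();
}
```

Trying to write text += "!" inside isBlank would be rejected by the compiler, which is exactly the guarantee that const provides.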
- Quiz: What are the issues with the code of displayAvailableProductsByNonDecreasingPriceAndDisplayTotalPrice?
- Refactor the code of display-prices.cpp (by modifying the other affected files, such as the header and main).
Did you accomplish your task?
- Did you follow The Stepdown Rule?
- Are your functions well-named?
- Do your functions have a single responsibility?
- Do your functions have few arguments?
- Are your one-argument functions categorized as discussed in class?
- Did you place const and & in the correct places?
Exercise 2: Classify it
- Create a directory classify-it/ (within which the code will be organized into several different files).
- In classify-it/, create the following file.
// File: main.cpp
#include <iostream>
#include <string>
#include <vector>
double calculateAverage(const std::vector<double>& scores) {
    double sum = 0;
    for (double score : scores) {
        sum += score;
    }
    return sum / scores.size();
}
void printScores(const std::string& studentName, int studentId, const std::vector<double>& scores) {
    std::cout << std::endl << "Scores for " << studentName << " (Id: " << studentId << "): ";
    for (size_t i = 0; i < scores.size(); i++) {
        std::cout << "[" << i+1 << "] " << scores[i] << " ";
    }
    std::cout << std::endl;
}
void printAverage(const std::vector<double>& scores){
    std::cout << "Average: " << calculateAverage(scores) << std::endl;
}
int main() {
    std::string studentName;
    int studentId;
    std::vector<double> scores;
    studentId = 12345;
    studentName = "Alice";
    scores.push_back(11);
    scores.push_back(12.5);
    scores.push_back(14.75);
    scores.push_back(19);
    printScores(studentName, studentId, scores);
    return 0;
}
- What does this code do?
size_t
size_t is the default type for the index of a vector rather than int because a vector can be very large, and its indices might exceed the limit of int.
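As a small sketch of an index-based loop with this type, the hypothetical function indexOfHighest below returns the position of the highest score. It assumes that the vector is not empty, in line with the choice of ignoring error handling in this session.

```cpp
#include <cstddef>
#include <vector>

// Assumption: scores contains at least one element.
std::size_t indexOfHighest(const std::vector<double>& scores) {
    std::size_t best = 0;
    // i has the same type as scores.size(), so the comparison
    // i < scores.size() does not mix signed and unsigned integers.
    for (std::size_t i = 1; i < scores.size(); i++) {
        if (scores[i] > scores[best]) {
            best = i;
        }
    }
    return best;
}
```

Indices start at 0, so for the scores 11, 12.5, 19, and 14.75 the function returns 2.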
It is difficult to add students and their grades.
We are therefore going to create a Student
class with the following three attributes:
- Only the student’s Id is necessary to construct the object. The Id is immutable: once initialized by the constructor, it cannot be modified. This is made possible by using const in front of the attribute.
Initializing a const attribute
An immutable (const) attribute can only be initialized in the initializer list in C++.
For example:
class MyClass{
private:
    const int mAttribute;
public:
    MyClass(int valueName) : mAttribute(valueName){
        // Constructor body
    }
};
This attribute cannot be initialized in the constructor body. For example, the following code will result in a compilation error:
class MyClass{
private:
    const int mAttribute;
public:
    MyClass(int valueName) {
        mAttribute = valueName; // Error: a const attribute cannot be assigned
    }
};
- The student’s name can be modified. Initially, it is empty (by default, a string someString declared without an initial value is == ""). We will not handle potential errors from an empty name in this exercise.
Setter
A typical setter looks like this:
void setAttribute(const AttributeType& valueName){
    mAttribute = valueName;
}
- The student’s vector of scores will not be accessible. The only access will be through displaying (printScores) and adding scores via the appendScore function, which adds a score to the end of the vector using push_back. There are no scores initially, but we will also ignore potential errors coming from an empty vector in this exercise.
- Create the Student class with the appropriate methods (including the corresponding .h and .cpp files), and add functions to calculate the average and display the scores.
- Modify the main.cpp directives to include student.h and the necessary libraries.
Did you accomplish your task?
- Do the .h and .cpp files follow best practices?
- Are the attributes named properly (without unnecessary prefixes)?
- Do the getters and setters match our use case?
- Have you converted the functions into methods of the class properly?
- Have you checked if methods that do not modify the object are marked const?
User input (Bonus)
The benefit of having a Student class is to make it easier to enter data for multiple students.
In order to input data for several students, we will write several functions to handle user input.
- Create a file user-input.cpp along with the associated header file.
- Here are some function declarations to help you:
void inputNumberOfStudents(int& number);
void inputStudentId(int& id);
void inputStudentName(Student& student);
void inputStudentScores(Student& student);
Student createStudentFromInput();
void printAllScoresAndAverages(const std::vector<Student>& students);
- Here is the code of inputStudentScores to help you:
void inputStudentScores(Student& student) {
    double score;
    std::cout << "Enter score (-1 to stop): ";
    std::cin >> score;
    while (score != -1) {
        student.appendScore(score);
        std::cout << "Enter score (-1 to stop): ";
        std::cin >> score;
    }
}
- Implement the given functions and modify main to retrieve user input and display the grades as well as the averages for each student.
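The sentinel loop used in inputStudentScores can also be tried without a keyboard. The sketch below is only a variant for experimenting, not the expected solution: readScores and parseScores are hypothetical functions, and reading from a std::istream& instead of std::cin directly is an assumption made so that the same loop works on text. It also stops when a read fails, which goes slightly beyond what this session requires.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads scores from any input stream until -1 is read
// (or until a read fails, for example at the end of the input).
std::vector<double> readScores(std::istream& input) {
    std::vector<double> scores;
    double score;
    while (input >> score && score != -1) {
        scores.push_back(score);
    }
    return scores;
}

// Convenience wrapper: treats a string as if it had been typed by the user.
std::vector<double> parseScores(const std::string& typedText) {
    std::istringstream input(typedText);
    return readScores(input);
}
```

Calling readScores(std::cin) reads from the keyboard, while parseScores("11 12.5 -1 19") stops at -1 and keeps only the first two scores.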
Return to the objectives and check the points you have mastered. Review the points you have not yet fully understood. Ask your instructor for help if needed.