Chapter 5 Conditions and Loops

In programming, controlling the flow of execution is an essential task. As you start to write more sophisticated programs in R, you will often need to control which parts of your code run and in what order. This is where the concept of flow control comes in.

There are two types of typical scenarios where flow control is used:

  1. Execute some parts of the code or skip others, based on certain criteria or conditions (i.e., an expression that evaluates to TRUE or FALSE)
  2. Repeat a particular code chunk a certain number of times, which is referred to as looping.

In this chapter, we will explore these core programming techniques using:

  • if-else statements
  • for and while loops

In Chapter 11, we will return to loops and point you to more idiomatic ways of iterating in R (i.e., with Tidyverse syntax).

5.1 Conditions

Before we talk about the flow controls, let’s deal with the more fundamental element of the control structures: what is a condition? To that end, let’s first explore a few important concepts:

  • Boolean Values
  • Comparison Operators
  • Boolean Operators

Boolean Values

Unlike numbers or characters, the Boolean data type has only two values, TRUE and FALSE. In R, the Boolean values TRUE and FALSE lack the quotes you place around strings, and they are always uppercase.

cond1 <- TRUE
class(cond1)
[1] "logical"

Comparison Operators

Comparison Operators are important in programming. They compare two values and evaluate down to a single Boolean value. They are also referred to as relational operators.

Comparison Operators in R

Operator    Meaning
==          Equal to
!=          Not equal to
>           Greater than
<           Less than
>=          Greater than or equal to
<=          Less than or equal to
X %in% Y    X is a member of Y

These comparison operators return TRUE or FALSE depending on the values we give them.

45 == 45
[1] TRUE
45 > 50
[1] FALSE
45 != 4
[1] TRUE

Please note that these operators work not only with numbers but also with character strings.

a <- "run"
b <- "run"
c <- "walk"
all <- c("run","walk","march")

a == b
[1] TRUE
a == c
[1] FALSE
c %in% all
[1] TRUE

When using comparison operators, be careful of the data types (i.e., numeric or character).

## Create objects
a <- "42" ## char vec
b <- 42 ## num vec

## Comparison operations
a == 42.0
[1] TRUE
a > 40
[1] TRUE
a == b
[1] TRUE

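Please note from the above example that R performs implicit data type conversion (coercion): when a character value is compared with a number, R converts the number to a character string and then compares the two as strings. This is why a == b returns TRUE even though a is a character vector and b is numeric. It also means that a > 40 is a text comparison rather than a numeric one, so the results of such comparisons can be surprising. Coercion does not make every mixed-type operation work: arithmetic on a character value, such as a + 1, produces an error.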
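
As a small sketch of why data types matter in comparisons (the values below are illustrative and not taken from the example above), the following chunk compares a character string with a number. Because the number is turned into a string first, the comparison follows text ordering, not numeric ordering.

```r
a <- "42"   ## character vector

## The number 5 is converted to "5", and "42" sorts before "5" as text
a > 5
## [1] FALSE

## Converting explicitly restores the numeric comparison
as.numeric(a) > 5
## [1] TRUE

## "42.0" and "42" are different strings, even though 42.0 == 42 as numbers
"42.0" == 42
## [1] FALSE
```

When in doubt, converting both sides to the same type with as.numeric() or as.character() before comparing makes the intent explicit.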

Please note the difference between == and =. The == operator asks whether two values are equal to each other, while the = operator assigns the value/object on the right to the variable name on the left. In R, the preferred way of assignment is to use the <- operator instead.

Boolean Operators

There are four basic Boolean operators in R: & (AND), | (inclusive OR), and xor() (exclusive OR) each combine two conditions, while ! (NOT) negates a single condition. With these operators, we can build more complex conditions out of simpler ones.

## Inclusive OR
x <- 47
x > 30 | x < 50
[1] TRUE
## AND
x <- 55
x > 30 & x < 50
[1] FALSE
## NOT
x <- 55
x > 50
[1] TRUE
!x > 50
[1] FALSE
## Create objects
a <- c(TRUE, FALSE, TRUE, FALSE)
b <- c(FALSE, TRUE, TRUE, FALSE)

## Exclusive vs. Inclusive OR
xor(a, b)
[1]  TRUE  TRUE FALSE FALSE
a|b
[1]  TRUE  TRUE  TRUE FALSE

Can you analyze the following given piece of R code and predict its output before execution?

# Exclusive OR
x <- c(1, 2, 3)
y <- c(1, 3, 5)
z <- xor(x %in% y, y %in% x)
print(z)

# Inclusive OR
z2 <- x %in% y | y %in% x
print(z2)

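Like arithmetic operators, the Boolean operators have an order of precedence. R evaluates arithmetic and comparison operators first, then the ! (not) operator, then the & (and) operator, and finally the | (or) operator. This is why !2 + 2 == 5 below is read as !(2 + 2 == 5).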

## TRUE & FALSE & TRUE
2 + 2 == 4 &
  2 + 2 == 5 &
  2 * 2 == 2 + 2
[1] FALSE
## TRUE & TRUE & TRUE
2 + 2 == 4 &
  !2 + 2 == 5 &
  2 * 2 == 2 + 2
[1] TRUE
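
The order of evaluation can be made visible with parentheses. The following is a minimal sketch with an illustrative value of x; the object names r1 and r2 are only placeholders for this example.

```r
x <- 55

## Without parentheses, & is evaluated before |,
## so this is read as: x < 60 | (x > 50 & x < 40)
r1 <- x < 60 | x > 50 & x < 40
r1
## [1] TRUE

## Parentheses force | to be evaluated first
r2 <- (x < 60 | x > 50) & x < 40
r2
## [1] FALSE

## ! is applied after the comparison, so these two are the same
identical(!x > 50, !(x > 50))
## [1] TRUE
```

Adding parentheses even where they are not strictly required often makes a complex condition easier to read.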

Elements of Flow Control

Flow control statements often start with a condition and are followed by a block of code. A quick recap:

  • Conditions: Any Boolean expression can serve as a condition, because it evaluates down to a Boolean value (i.e., TRUE or FALSE). A flow control statement decides what to do and what to skip based on whether the condition is TRUE or FALSE.
  • Block of Code: Lines of R code can be grouped together in blocks, using initial and ending curly brackets { and }. The beginning and ending of the block of code are clearly indicated.

5.2 if Statements

The main purpose of if is to control precisely which operations are carried out in a given code chunk.

An if statement runs a code chunk only if a certain condition is true. This conditional expression allows R to respond differently depending on whether the condition is TRUE or FALSE.

The basic template of if is as follows:


## Simple Binary Condition

if(CONDITION IS TRUE){

  DO THIS CODE CHUNK 1
  
} else {

  DO THIS CODE CHUNK 2

} ## endif



## More Complex Conditions
if (condition1) {
  expr1

} else if (condition2) {
  expr2

} else if (condition3) {
  expr3

} else {
  expr4

} ## endif

  • The condition is placed in parentheses () right after if.

  • The condition must be an expression that returns only a single logical value (TRUE or FALSE).

  • If it is TRUE, the code chunk 1 in the curly braces will be executed; if the condition is not satisfied, the code chunk 2 in the curly braces after else will be executed.
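
As a concrete sketch of the templates above, the chunk below fills the placeholders with an illustrative scoring rule. The objects score, scores, label, and status, as well as the cut-off values, are assumptions made only for this example. The second part shows one way to reduce a vector comparison to the single logical value that if expects, using any().

```r
score <- 72   ## hypothetical value

## Conditions are checked from top to bottom; only the first TRUE branch runs
if (score >= 90) {
  label <- "high"
} else if (score >= 60) {
  label <- "medium"
} else {
  label <- "low"
} ## endif
label
## [1] "medium"

## A comparison on a vector yields several logical values;
## any() collapses them into a single TRUE or FALSE
scores <- c(95, 40, 72)
if (any(scores < 60)) {
  status <- "at least one low score"
} else {
  status <- "no low scores"
} ## endif
status
## [1] "at least one low score"
```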


Suppose you want to simulate a password checking system. The system requires you to enter a password to access it. The system has a password stored on the server. The gatekeeper of the system will check whether your input password matches the one stored on the server. If your input password does not match the one on the server, you will be banned from accessing the system.

correct_passcode <- 987  ## system correct passcode
input <- 113 ## assuming that you have the input 113
## Passcode checking & output
if(input == correct_passcode){
  writeLines("Congratulations! Now you may get in!")
} else{
  writeLines("Sorry! Wrong password.")
}
Sorry! Wrong password.

If the input matches the system passcode, you will be granted access to the system.

correct_passcode <- 987  ## system correct passcode
input <- 987 ## assuming that you have the input 987
## Passcode checking & output
if(input == correct_passcode){
  writeLines("Congratulations! Now you may get in!")
} else{
  writeLines("Sorry! Wrong password.")
}
Congratulations! Now you may get in!

Now we can ask R to prompt the user to input data directly in the R console, using the readline() function. (Note: Please run the entire code chunk all at once.)

## ------------------------------------- ##
## Run the entire code chunk all at once ##
## ------------------------------------- ##

## System Correct Passcode
correct_passcode <- 987  

## User's input
input <- readline(prompt = "Please enter your password:")

## Passcode checking & output
if (input == correct_passcode) {
  writeLines("Congratulations! Now you may get in!")
} else {
  writeLines("Sorry! Wrong password.")
} ## endif

In R, there are two functions with very similar names, readline() and readLines(). Please check the documentation of both to make sure you understand their differences.

  • readline() is used to read a single line of input from the user in the R console. It prompts the user for input and waits for them to enter a response, then returns that response as a character string.

  • readLines() is used to read multiple lines of input from a file or other external source. It reads in all the lines of a file or input stream and returns them as a character vector.
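
Since readline() waits for keyboard input, it is only useful in an interactive session. By contrast, readLines() can be tried out without any user input. The following is a minimal sketch that uses a temporary file as a stand-in for a real text file; the object names tmp and lines_in are placeholders.

```r
## Write two lines to a temporary file (a stand-in for a real text file)
tmp <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line"), con = tmp)

## readLines() returns a character vector with one element per line
lines_in <- readLines(tmp)
length(lines_in)
## [1] 2
lines_in[2]
## [1] "second line"

## Clean up the temporary file
unlink(tmp)
```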

5.3 for

The for loop is a core programming construct in R, which allows you to repeat a code chunk a certain number of times.

Typically, you would use a for loop to iterate over elements of a vector, list or data frame and/or perform a certain operation a fixed number of times.

The general structure of the for loop statement is to repeat the operations included in a code chunk while incrementing an index or a counter.

The basic for loop template is as follows:


for(LOOP_INDEX in LOOP_VECTOR){
  
  DO THIS CODE CHUNK
  
}
  • The LOOP_INDEX is a placeholder that represents an element in the LOOP_VECTOR.

  • When the loop begins, the LOOP_INDEX starts off as the first element in the LOOP_VECTOR.

  • When the loop reaches the end of the brace, the LOOP_INDEX is incremented, taking on the next element in the LOOP_VECTOR.

  • This process continues until the loop reaches the final element of the LOOP_VECTOR. At this point, the code chunk is executed for the last time, and the loop exits.


For example, suppose we have a character vector with a few words in it. We can use a for loop to get the number of characters for each element (i.e., word) in the vector.

word_vec <- c("apple","banana","watermelon","papaya")

for(w in word_vec){
  word_nchar <- nchar(w)
  writeLines(as.character(word_nchar))
}
5
6
10
6

For the above example, there is another way to write the for loop:

for(i in 1:length(word_vec)){
  word_nchar <- nchar(word_vec[i])
  writeLines(as.character(word_nchar))
}
5
6
10
6
  • In our first example, the LOOP_INDEX serves as the exact element object in the LOOP_VECTOR.
  • In our second example, the LOOP_INDEX serves as the index of the element object in the LOOP_VECTOR.
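
When looping over indices, seq_along() is a commonly recommended alternative to 1:length(). The sketch below illustrates the difference with an empty vector; empty_vec and n_iter are placeholder names for this example.

```r
word_vec <- c("apple", "banana", "watermelon", "papaya")

## For a non-empty vector, both approaches give the same indices
identical(seq_along(word_vec), 1:length(word_vec))
## [1] TRUE

## For an empty vector, 1:length() counts down from 1 to 0
empty_vec <- character(0)
1:length(empty_vec)
## [1] 1 0
seq_along(empty_vec)
## integer(0)

## With seq_along(), the loop body is correctly skipped
n_iter <- 0
for (i in seq_along(empty_vec)) {
  n_iter <- n_iter + 1
}
n_iter
## [1] 0
```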

Try the following code chunk and examine the differences in the outputs between print() and writeLines(). Do you know why the first for-loop does not work?

# writeLines() (This for-loop does not work!!)
for(i in 1:length(word_vec)){
  word_nchar <- nchar(word_vec[i])
  writeLines(word_nchar)
}

# `print()`
for(i in 1:length(word_vec)){
  word_nchar <- nchar(word_vec[i])
  print(word_nchar)
}

Please note that control structures direct the flow of execution of the code; they are not meant to produce values. A for loop itself always evaluates to NULL. That is, assigning a for-loop structure to an object does NOT capture what was computed inside the loop. The following code chunk runs without an error and still prints the numbers, but numOfChars ends up as NULL.

###############################################################
## WARNING!!! This code chunk does NOT store the results!!!! ##
###############################################################

numOfChars <- for(i in 1:length(word_vec)){
  word_nchar <- nchar(word_vec[i])
  writeLines(as.character(word_nchar))
}
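
To keep the values computed inside a loop, one common pattern is to create an empty result vector before the loop and fill it in one position at a time. The following is a minimal sketch of that pattern; loop_value is a placeholder name used only to show what the loop itself evaluates to.

```r
word_vec <- c("apple", "banana", "watermelon", "papaya")

## A for loop evaluates to NULL, so this assignment stores NULL
loop_value <- for (w in word_vec) nchar(w)
is.null(loop_value)
## [1] TRUE

## Create the result vector first, then fill it in position by position
numOfChars <- integer(length(word_vec))
for (i in seq_along(word_vec)) {
  numOfChars[i] <- nchar(word_vec[i])
}
numOfChars
## [1]  5  6 10  6
```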

Exercise 5.1 Use the data set from stringr::sentences for this exercise. Create a for-loop structure to get the number of characters for all sentences in the stringr::sentences. Present your results in a data frame of three columns:

  • Column 1 includes a unique integer ID for each sentence;
  • Column 2 includes the texts of each sentence;
  • Column 3 includes the number of characters of each sentence.

Note: A for-loop structure is needed for this exercise. But it may not necessarily be the most efficient way.

The data set stringr::sentences includes 720 English sentences. The first six sentences are shown here.

[1] "The birch canoe slid on the smooth planks." 
[2] "Glue the sheet to the dark blue background."
[3] "It's easy to tell the depth of a well."     
[4] "These days a chicken leg is a rare dish."   
[5] "Rice is often served in round bowls."       
[6] "The juice of lemons makes fine punch."      

Your results may look like the following data frame:

Exercise 5.2 Use the data set stringr::fruit for this exercise. Create a for-loop and if-statement to instruct R to go through each fruit name and print out only those fruit names that start with a vowel letter (i.e., a, e, i, o, u).

[1] "apple"
[1] "apricot"
[1] "avocado"
[1] "eggplant"
[1] "elderberry"
[1] "olive"
[1] "orange"
[1] "ugli fruit"

5.4 while loop

There is another type of loop. Unlike the for loop, which repeats a code chunk by going through every element in a vector/list/data frame, the while loop repeats a code chunk for as long as a specific condition evaluates to TRUE and stops once it evaluates to FALSE. (You can think of it as an if-statement that keeps repeating.)

The basic template is as follows:


while(LOOP_CONDITION){
  
  DO THIS CODE CHUNK (UNTIL THE LOOP_CONDITION BECOMES FALSE)
  
}

  • Upon the start of a while loop, the LOOP_CONDITION is evaluated.
  • If the condition is TRUE, the braced code chunk is executed line by line till the end of the chunk.
  • At this point, the LOOP_CONDITION will be checked again.
  • The loop terminates immediately when the condition is evaluated to be FALSE. That is, the code chunk will be repeated as long as the condition is true.

Based on the template above, it is important to note that the code chunk executed must eventually cause the loop to exit. In particular, the code chunk needs to change the value of at least one object that the LOOP_CONDITION depends on, so that the condition eventually evaluates to FALSE.

If the LOOP_CONDITION never evaluates to FALSE, the loop will run indefinitely, and R will appear to hang until you interrupt it (e.g., with Esc in RStudio or Ctrl+C in a terminal).
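
As a minimal sketch of a while loop that is guaranteed to exit, the chunk below uses a counter that the loop condition depends on. The object name counter and the limit of 5 are arbitrary choices for this example.

```r
counter <- 1   ## the loop condition depends on this object

while (counter <= 5) {
  writeLines(paste("Iteration", counter))
  counter <- counter + 1   ## this update eventually makes the condition FALSE
} ## endwhile

## The loop stops as soon as counter exceeds 5
counter
## [1] 6
```

If the line that updates counter were removed, the condition would stay TRUE and the loop would never exit on its own.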


Let’s come back to our password checker. This time let’s create a dumb checker.

When you give a wrong password which is smaller than the true passcode, it will automatically approach the right answer for you (and of course no real-world application would do that!)

## --------------------------------------------  ##
## Please run the entire code chunk all at once! ##
## --------------------------------------------  ##

## Create a program
app_v1 <- function(ans = 90, guess = 83) {
  while (guess != ans) {
    cat("Your `guess` is too small! \nThe system will take care for you!\n")
    guess <- guess + 1
    cat("Now the system is adjusting your `guess` to ", guess, "\n\n")
  } ## endwhile
  
  cat('Great! The passcode is finally cracked.\n')
} ## endfunc

## Run the program
app_v1(ans = 90, guess = 83)
Your `guess` is too small! 
The system will take care for you!
Now the system is adjusting your `guess` to  84 

Your `guess` is too small! 
The system will take care for you!
Now the system is adjusting your `guess` to  85 

Your `guess` is too small! 
The system will take care for you!
Now the system is adjusting your `guess` to  86 

Your `guess` is too small! 
The system will take care for you!
Now the system is adjusting your `guess` to  87 

Your `guess` is too small! 
The system will take care for you!
Now the system is adjusting your `guess` to  88 

Your `guess` is too small! 
The system will take care for you!
Now the system is adjusting your `guess` to  89 

Your `guess` is too small! 
The system will take care for you!
Now the system is adjusting your `guess` to  90 

Great! The passcode is finally cracked.

Exercise 5.3 The current implementation of app_v1() works fine when the guess value is smaller than the actual ans value. However, if the guess value is larger than the actual ans value, the loop condition never becomes FALSE and the script enters an infinite loop.

To solve this issue, we need to create an updated version of the program called app_v2(), which should automatically adjust the guessed number to approach the actual answer. This adjustment can be done by incrementally adding or subtracting one from the guessed number until it matches the actual answer.

The updated app_v2() program should take two parameters as input: ans and guess. It should then compare the guess value to the actual ans value. If the guess value is equal to ans, the program should print a message stating that the guess is correct. Examples are provided below.

app_v2(ans = 90, guess = 87)
Your `guess` is too small!
The system will take care for you!
Now the system is adjusting your `guess` to  88 

Your `guess` is too small!
The system will take care for you!
Now the system is adjusting your `guess` to  89 

Your `guess` is too small!
The system will take care for you!
Now the system is adjusting your `guess` to  90 
app_v2(ans = 90, guess = 93)
Your `guess` is too large!
The system will take care for you!
Now the system is adjusting your `guess` to  92 

Your `guess` is too large!
The system will take care for you!
Now the system is adjusting your `guess` to  91 

Your `guess` is too large!
The system will take care for you!
Now the system is adjusting your `guess` to  90 
app_v2(ans = 90, guess = 90)
This is spooky. Your guess is correct!!

5.5 Toy Example

Now let’s play a guessing game. The game is as follows:

The program will pick a random number from 1 to 100. A user has to guess which number the computer has picked. Every time the user makes a wrong guess, the computer will tell the user whether the correct answer is higher or lower.

We first pack the game as an R function object:

## --------------------------------------------  ##
## Please run the entire code chunk all at once! ##
## --------------------------------------------  ##

## Create a function object
guessMyNumber <- function() {
  ## Randomly select an integer from 1 to 100
  ans <- sample(1:100, size = 1)
  
  ## Initialize the variable `guess`
  guess <- -1
  
  ## Instructions for user
  writeLines("Now I am thinking of a number between 1 and 100.")
  
  ## As long as user's guess is not the answer
  while (guess != ans) {
    ## Read the prompt input from user
    guess <- readline(prompt = "Please guess my number(1~100):")
    
    ## Convert input string into a number
    guess <- as.numeric(guess)
    
    ## if user's guess is smaller than the answer
    if (guess < ans) {
      writeLines("The answer is HIGHER.")
    ## if user's guess is larger than the answer
    } else if (guess > ans) {
      writeLines("The answer is LOWER")
    ## correct guess
    } else{
      writeLines(paste0("Good Job! You had the correct answer! My number is ", guess))
    } ## endif
  } ## endwhile
} ## endfunc

Please note that in the above function definition, we have included a line of code to initialize the guess to -1 before the while loop:

## Initialize the variable `guess`
  guess <- -1

This step is crucial. Without it, the variable guess would not exist inside the function when the while loop condition is first evaluated: the code would try to compare guess with ans before guess has been defined, and R would stop with an error such as object 'guess' not found. (A related error, missing value where TRUE/FALSE needed, appears when guess is NA, for example after as.numeric() fails to convert a non-numeric input.)

To fix the issue, the guess variable needs to be initialized to some value before the while loop. For example, a default value could be set for the initial guess.


The variable guess is initialized to -1 rather than any other number because the range of possible answers is 1 to 100, and -1 is not a valid guess within this range.

Initializing guess to -1 guarantees that the loop will run at least once, as the default first guess (-1) will always be different from the value of ans, which is randomly selected from the range of valid guesses.

Once the user enters a valid guess, guess is updated to that value, and the loop continues until the user’s guess matches the value of ans.
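
The effect of the initial value can be traced without any keyboard input. The following sketch replaces readline() with a fixed vector of made-up answers (simulated_input) and fixes ans instead of drawing it with sample(); both are assumptions made only to keep the example reproducible.

```r
ans <- 42                                ## fixed here instead of sample()
simulated_input <- c("10", "80", "42")   ## hypothetical stand-ins for readline()

guess <- -1     ## outside 1 to 100, so the first check of the condition is TRUE
n_rounds <- 0   ## counts how many times the loop body has run

while (guess != ans) {
  n_rounds <- n_rounds + 1
  guess <- as.numeric(simulated_input[n_rounds])
} ## endwhile

n_rounds
## [1] 3
guess
## [1] 42
```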

After you load the above code chunk and create the guessMyNumber() function in your current R environment, you can play the game by running the function guessMyNumber():

guessMyNumber()

The above code demonstrates how to create a guessing game in R using a function called guessMyNumber(). Initially, the function does not accept any parameters. However, one can create a function with a parameter (e.g., guess), which allows the user to specify initial values for the game. For example:

## --------------------------------------------  ##
## Please run the entire code chunk all at once! ##
## --------------------------------------------  ##

## Create a function object
guessMyNumber_1 <- function(guess) {
  ## Randomly select an integer from 1 to 100
  ans <- sample(1:100, size = 1)
  
  ## Instructions for user
  writeLines("Now I am thinking of a number between 1 and 100.")
  
  ## As long as user's guess is not the answer
  while (guess != ans) {
    ## Read the prompt input from user
    guess <- readline(prompt = "Please guess my number(1~100):")
    
    ## Convert input string into a number
    guess <- as.numeric(guess)
    
    ## if user's guess is smaller than the answer
    if (guess < ans) {
      writeLines("The answer is HIGHER.")
    ## if user's guess is larger than the answer
    } else if (guess > ans) {
      writeLines("The answer is LOWER")
    ## correct guess
    } else{
      writeLines(paste0("Good Job! You had the correct answer! My number is ", guess))
    } ## endif
  } ## endwhile
} ## endfunc

The revised function guessMyNumber_1() now accepts a parameter guess that enables the user to specify the initial guess value when they start playing the game. The user can start the game by running the function and specifying the initial guess value, like this: guessMyNumber_1(guess = 15). However, the revised function has two potential issues:

  • It is not very intuitive to ask the user to provide a guess value when they have not even started the game yet. (They probably do not even know the objective of the game.)
  • If the user specifies a guess value that happens to be the computer’s selection, the game will not even start.

To address these issues, we may also try the following revision (guessMyNumber_2()):

## --------------------------------------------  ##
## Please run the entire code chunk all at once! ##
## --------------------------------------------  ##

## Create a function object
guessMyNumber_2 <- function(guess = -1) {
  ## Randomly select an integer from 1 to 100
  ans <- sample(1:100, size = 1)
  
  ## Instructions for user
  writeLines("Now I am thinking of a number between 1 and 100.")
  
  ## As long as user's guess is not the answer
  while (guess != ans) {
    ## Read the prompt input from user
    guess <- readline(prompt = "Please guess my number(1~100):")
    
    ## Convert input string into a number
    guess <- as.numeric(guess)
    
    ## if user's guess is smaller than the answer
    if (guess < ans) {
      writeLines("The answer is HIGHER.")
    ## if user's guess is larger than the answer
    } else if (guess > ans) {
      writeLines("The answer is LOWER")
    ## correct guess
    } else{
      writeLines(paste0("Good Job! You had the correct answer! My number is ", guess))
    } ## endif
  } ## endwhile
} ## endfunc

The revised guessMyNumber_2() gives the user two ways to start the game. The user can either supply an initial guess through the parameter guess, e.g., guessMyNumber_2(guess = 15), or accept the default value of guess (i.e., -1) by calling guessMyNumber_2() without any argument.
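
How a default parameter value behaves can be seen in a much smaller function. The helper start_value() below is hypothetical and not part of the game; it only reports which starting value it received.

```r
## Hypothetical helper: reports which starting value the function received
start_value <- function(guess = -1) {
  if (guess == -1) {
    "default start"
  } else {
    paste("user-supplied start:", guess)
  } ## endif
} ## endfunc

## No argument given: the default value -1 is used
start_value()
## [1] "default start"

## Argument given: the default is overridden
start_value(guess = 15)
## [1] "user-supplied start: 15"
```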

The decision of whether or not to include parameters in a program or function ultimately depends on the developer’s design goals and the intended use of the program by its users. Parameters allow for greater flexibility and customization, as users can provide input values to the function to modify its behavior.

However, sometimes the program may be designed to have a fixed set of inputs and outputs, in which case parameters may not be necessary. The design choice should prioritize the usability and user experience of the program, taking into consideration the user’s level of expertise, potential use cases, and any potential issues that may arise from the inclusion or exclusion of parameters.

Exercise 5.4 The above guessMyNumber() can be improved. Sometimes careless (or mischievous) users do not enter numbers as requested. Instead, they may accidentally (or on purpose) enter characters that are not digits at all, or numbers that are not within the range of 1 to 100.

Please improve the guessMyNumber() function in R to handle non-digit and out-of-range inputs from users. If a user enters non-digit characters or digits outside the range of 1 to 100, the program should send a warning message. Additionally, once the user guesses the correct answer, the program should record the number of guesses the user made in total. The desired output of the revised function should be similar to the example output provided.

> guessMyNumber_v2()
Now I am thinking of a number between 1 and 100.
Please guess my number(1~100):%
Invalid guess. Please enter a number between 1 and 100.
Please guess my number(1~100):1000000
Invalid guess. Please enter a number between 1 and 100.
Please guess my number(1~100):-10
Invalid guess. Please enter a number between 1 and 100.
Please guess my number(1~100):50
The answer is LOWER
Please guess my number(1~100):25
The answer is LOWER
Please guess my number(1~100):10
The answer is HIGHER.
Please guess my number(1~100):15
The answer is HIGHER.
Please guess my number(1~100):19
The answer is LOWER
Please guess my number(1~100):18
The answer is LOWER
Please guess my number(1~100):17
The answer is LOWER
Please guess my number(1~100):16
Good Job! 
After 8 guess(es), you got the correct answer! My number is 16!