# Chapter 4 Manual and Examples of smplot

This chapter is a manual for **smplot**; it includes numerous examples. It also includes tutorials about `sm_bar()`

, `sm_bland_altman()`

and `sm_raincloud()`

, all of which are not mentioned in the preceding chapters. However, this chapter does not describe `sm_effsize()`

, `sm_power()`

and `sm_common_axis()`

; these functions are described in Chapters 5-7.

- If you are not sure about any of the functions, please type
`?`

before the function names, ex.`?sm_bar`

.

## 4.1 Installation of the Package

- The
**smplot**package is NOT available on CRAN yet. So, you will need to download it directly from my github for now. - To install it, please type in the R console:

```
install.packages('devtools')
::install_github('smin95/smplot') devtools
```

- To use the package, load it:

`library(smplot)`

### What is `smplot`

?

**smplot**is a package that provides functions that visually improve graphs produced from**ggplot2**.- So it does not work with plots made from base R.

- It was first developed in May 2021.
- It is
**free**and**open source**(https://github.com/smin95/smplot).

## 4.2 Colors and Graph Themes of smplot

### smplot’s color palette

- Its color palette can be accessed via two functions:
`sm_color()`

and`sm_palette()`

. `sm_color()`

accepts the**character string**of the color name.`sm_palette()`

accepts the**number of colors**(up to**20**) and returns the hex codes accordingly.

- For example, if you want
`blue`

and`red`

, just type the input like this:

`sm_color('blue','red')`

`## [1] "#1262b3" "#cc3d3d"`

- But, do not form a single vector that contains two characters. If so,
`sm_color()`

will only return the first color.

`sm_color(c('blue','red'))`

```
## Warning in if (color == "blue") return("#1262b3"): the condition has length > 1 and
## only the first element will be used
```

`## [1] "#1262b3"`

- If you need 5 colors, you can use
`sm_palette()`

.

`sm_palette(5)`

`## [1] "#cc1489" "#1262b3" "#5b4080" "#e57717" "#0f993d"`

### smplot’s graph themes

- There are several graph themes. The text positions and the font are all similar.
`sm_corr_theme()`

and`sm_hvgrid()`

are**equivalent**. They have major horizontal and vertical grids.`sm_hgrid()`

has major horizontal grids.This is the default theme of`sm_raincloud()`

, which creates a raincloud plot.`sm_minimal`

has no major grid. This is useful when a graph has a lot of annotation, such as texts and arrows.`sm_slope_theme()`

is a theme for a slope chart. It removes everything except the y-axis. This is the default theme of`sm_slope()`

.`sm_classic()`

has no grid. This is useful for all types of plots, such as bar graph, correlation plot and Bland-Altman plot. It has a typical x-axis and y-axis without panel borders. The Bland-Altman plot created using**smplot**uses`sm_classic()`

automatically.`sm_vgrid()`

has major vertical grids.- All of these functions, except for
`sm_slope_theme()`

and`sm_minimal()`

, have two arguments:`borders`

and`legends`

. The two exceptions only have the argument`legends`

.- For some of these functions, the defaults are set to
`borders = TRUE`

and`legends = TRUE`

. - You can check the defaults by typing
`?`

in front of the function. Ex:`?sm_corr_theme`

- For some of these functions, the defaults are set to
- Let’s explore these themes in depth. First, let’s create
`p1`

, which has the default theme of**ggplot2**.

```
library(tidyverse)
<- ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = class)) +
p1 geom_point(size = 2)
p1
```

- Now we can change the theme using
`sm_corr_theme()`

.

`+ sm_corr_theme() p1 `

- We can also remove
`borders`

and`legends`

by setting them as`FALSE`

.

```
<- p1 + sm_corr_theme(borders = FALSE, legends = FALSE)
p2 p2
```

- You can also apply smplot’s colors by using
`scale_color_manual()`

.

`+ scale_color_manual(values = sm_palette(7)) p2 `

- We can also apply
`sm_hgrid()`

.

`+ sm_hgrid() p1 `

- Let’s try
`sm_vgrid()`

.

`+ sm_vgrid() p1 `

- You can remove the legend as shown below.

`+ sm_vgrid(legends = FALSE) p1 `

`sm_classic()`

is one of my favourites!

`+ sm_classic() p1 `

- You can add the legend as shown below.

`+ sm_classic(legends = TRUE) p1 `

- Another choice is
`sm_minimal()`

.

`+ sm_minimal() p1 `

- You can choose to include the legend in
`sm_minimal()`

.

`+ sm_minimal(legends = TRUE) p1 `

## 4.3 Correlation Plot

`sm_corr_theme()`

and`sm_statCorr()`

can be used as a pair when plotting a correlation.This is the plot using the default theme of

**ggplot2**.

```
<- ggplot(data = mtcars, mapping = aes(x = drat, y = mpg)) +
p1 geom_point(shape = 21, fill = sm_color('green'), color = 'white', size = 3)
p1
```

- The next plot uses
`sm_corr_theme()`

to apply the smplot’s theme and`sm_statCorr()`

to print linear regression slope and statistical results from a paired correlation test (Pearson’s). **Important:**`sm_statCorr()`

recognizes the data for the y- and x-axes from the`mapping = aes()`

in`ggplot()`

function.- There is no
`mapping`

argument in`sm_statCorr()`

.

- There is no

```
+ sm_corr_theme() +
p1 sm_statCorr(color = sm_color('green'))
```

- You can also change the
`line_type`

to`'solid'`

in`sm_statCorr()`

, and change the location of the printed texts by using`label_x`

and`label_y`

arguments. - You can also change the font size of the printed texts by setting
`text_size`

to a larger numerical value.

```
+ sm_corr_theme() +
p1 sm_statCorr(color = sm_color('green'),
line_type = 'solid',
label_x = 3.7,
label_y = 11,
text_size = 5)
```

### 4.3.1 Data frame for a correlation plot

Column 1 has to be the data for x-axis.

Column 2 has to be the data for y-axis.

This structure of the data frame is slightly different from that is typically used in ggplot2 and smplot functions (ex.

`sm_boxplot()`

,`sm_bar()`

,`sm_violin()`

and`sm_raincloud()`

).Correlation plot and a bar plot requires a different data frame structure.

```
# Example
set.seed(11) # generate random data
= c(rnorm(19,0,1),2.5)
method1 = c(rnorm(19,0,1),2.5)
method2 <- rep(paste0('S',seq(1:20)), 2)
Subject <- data.frame(Value = matrix(c(method1,method2),ncol=1))
Data <- rep(c('Method 1', 'Method 2'), each = length(method1))
Method <- cbind(Subject, Data, Method) # used for sm_bar(), sm_boxplot(), sm_violin(), etc
df_general
<- data.frame(first = method1, second = method2) # used for correlation df_corr
```

- We have created two data frames:
`df_general`

and`df_corr`

. Let’s take a look at their structures.

`head(df_general)`

```
## Subject Value Method
## 1 S1 -0.59103110 Method 1
## 2 S2 0.02659437 Method 1
## 3 S3 -1.51655310 Method 1
## 4 S4 -1.36265335 Method 1
## 5 S5 1.17848916 Method 1
## 6 S6 -0.93415132 Method 1
```

- Notice that
`df_general`

has three columns. The first column is subject, second column is data (i.e.,`Value`

) and third column is measurement group.

`head(df_corr)`

```
## first second
## 1 -0.59103110 -0.65571812
## 2 0.02659437 -0.68251762
## 3 -1.51655310 -0.01585819
## 4 -1.36265335 -0.44260479
## 5 1.17848916 0.35255750
## 6 -0.93415132 0.07317058
```

- Notice that
`df_corr`

has two columns, each of which represents a measurement group.

```
# correlation plot using data frame 'df_corr'
ggplot(data = df_corr, mapping = aes(x = first, y = second)) +
geom_point(shape = 21, fill = sm_color('crimson'), color = 'white',
size = 3) + sm_corr_theme(borders = FALSE) +
scale_y_continuous(limits = c(-2.5,2.5)) +
scale_x_continuous(limits = c(-2.5,2.5)) +
sm_statCorr(color = sm_color('crimson'), corr_method = 'pearson',
label_x = -2.2, label_y = 2.3) +
ggtitle('Correlation plot') +
xlab('Method 1') + ylab('Method 2')
```

```
# bar graph using data frame 'df_general'
ggplot(data = df_general, mapping = aes(x = Method, y = Value, fill = Method)) +
sm_bar(shape = 21, color = 'white', bar_fill_color = 'gray80') +
scale_fill_manual(values = sm_color('crimson','green'))
```

### 4.3.2 Correlation plot with both regression and reference lines

- You can also add a reference line (slope = 1) in a correlation plot.
- This can be done with
`geom_abline()`

. In this example, the reference line’s slope is set to 1 and it has a dashed line style.

```
# correlation plot using data frame 'df_corr'
ggplot(data = df_corr, mapping = aes(x = first, y = second)) +
geom_point(shape = 21, fill = sm_color('crimson'), color = 'white',
size = 3) + sm_corr_theme(borders = FALSE) +
geom_abline(slope = 1, linetype = 'dashed') +
scale_y_continuous(limits = c(-2.8,2.8), expand = c(0,0)) +
scale_x_continuous(limits = c(-2.8,2.8), expand = c(0,0)) +
sm_statCorr(color = sm_color('crimson'), corr_method = 'pearson',
label_x = -2.2, label_y = 2.3) +
ggtitle('Correlation plot') +
xlab('Method 1') + ylab('Method 2')
```

## 4.4 Boxplot - `sm_boxplot()`

`sm_boxplot()`

generates a boxplot and individual points at the same time.- It automatically uses
`sm_hgrid()`

as its default theme. - It also has arguments
`borders`

and`legends`

, both of which have been set to`FALSE`

as defaults. - First, let’s generate some random data.

```
set.seed(1) # generate random data
= rnorm(16,0,1)
day1 = rnorm(16,5,1)
day2 <- rep(paste0('S',seq(1:16)), 2)
Subject <- data.frame(Value = matrix(c(day1,day2),ncol=1))
Data <- rep(c('Day 1', 'Day 2'), each = length(day1))
Day <- cbind(Subject, Data, Day) df
```

- Now, let’s make a boxplot using
`sm_boxplot()`

.

```
# a boxplot with the random data, all black points
ggplot(data = df, mapping = aes(x = Day, y = Value)) +
sm_boxplot(fill = 'black')
```

- Now let’s apply different color for each Day.

```
# a boxplot with different colored points
ggplot(data = df, mapping = aes(x = Day, y = Value, fill = Day)) +
sm_boxplot(shape = 21, color = 'white') +
scale_fill_manual(values = sm_color('blue','orange'))
```

- You can change the shape of the boxplot by setting
`notch = TRUE`

. You can also change the size of the individual points using`point_size`

argument. - A notched boxplot shows the confidence interval around the median (+/- 1.58 * interquartile range / sqrt(n)).
- The notches are used for group comparison.
- If the notch of each box does not overlap, there is a strong likelihood that the medians are significantly different between groups.

```
ggplot(data = df, mapping = aes(x = Day, y = Value, fill = Day)) +
sm_boxplot(shape = 21, point_size = 4, notch = 'TRUE', alpha = 0.5) +
scale_fill_manual(values = sm_color('blue','orange'))
```

### 4.4.1 Plotting individual points with unique colors

- One can also use
`sm_boxplot()`

to plot individual points with unique colors. - However,
`sm_boxplot()`

cannot print distinct box colors across distinct x levels (i.e., in this example, all boxes are grey). This is because I think that it is not a good practice to print different colors of boxes when each point has unique color as well.

```
ggplot(data = df, mapping = aes(x = Day, y = Value, fill = Subject)) +
sm_boxplot(shape = 21, color = 'white') +
scale_fill_manual(values = sm_palette(16))
```

## 4.5 Violin Plot - `sm_violin()`

`sm_violin()`

plots a violin plot, individual points and lines that indicate means and +/- 1 standard deviation at the same time.- It is very similar to
`sm_boxplot()`

except there is no option for`notch = TRUE`

in`sm_violin()`

. - Also
`sm_violin()`

uses both`color`

(for the lines of mean and SD) and`fill`

(for the colors of the points) arguments. - The default border color of the points is
`white`

. `sm_violin()`

automatically uses`sm_hgrid()`

as its default theme.- It also has arguments
`borders`

and`legends`

, both of which have been set to`FALSE`

as defaults.

```
# a violin plot with the random data, all black points and lines
ggplot(data = df, mapping = aes(x = Day, y = Value)) +
sm_violin(fill = 'black')
```

```
# a violin plot with different colored points and lines
ggplot(data = df, mapping = aes(x = Day, y = Value, color = Day)) +
sm_violin() +
scale_color_manual(values = sm_color('blue','orange'))
```

### 4.5.1 Plotting individual points with unique colors

- One can also use
`sm_violin()`

to plot individual points with unique colors. - The x-level has to be grouped in the aesthetics (ex.
`group = Day`

). - But
`sm_violin()`

cannot print distinct violin colors across distinct x levels (i.e., in this example, all violins are grey). This is because it is aesthetically distracting to assign different colors of the violin across distinct x-levels.

```
ggplot(data = df, mapping = aes(x = Day, y = Value, fill = Subject,
group = Day)) +
sm_violin(shape = 21, color = 'white', point_alpha = 0.6) +
scale_fill_manual(values = sm_palette(16))
```

```
ggplot(data = df, mapping = aes(x = Day, y = Value, fill = Subject,
group = Day, color = Day)) +
sm_violin(shape = 21, color = 'white', point_alpha = 0.6) +
scale_fill_manual(values = sm_palette(16)) +
scale_color_manual(values = sm_color('blue', 'orange'))
```

## 4.6 Bar Plot - `sm_bar()`

`sm_bar()`

automatically uses`sm_bar_theme()`

/`sm_hgrid()`

.- It also has arguments
`borders`

and`legends`

, both of which have been set to`FALSE`

as defaults. - Let’s use data (
`df`

) we generated before.

```
ggplot(data = df, mapping = aes(x = Day, y = Value, fill = Day)) +
sm_bar(shape = 21, color = 'white', bar_fill_color = 'gray80') +
scale_fill_manual(values = sm_color('blue','orange'))
```

- In this case, the error bar represents
**standard error**. If you prefer to show**standard deviation**, then you should set`errorbar_type = 'sd'`

in`sm_bar()`

.

```
ggplot(data = df, mapping = aes(x = Day, y = Value, fill = Day)) +
sm_bar(shape = 21, color = 'white', bar_fill_color = 'gray80', errorbar_type = 'sd') +
scale_fill_manual(values = sm_color('blue','orange'))
```

- 95% confidence interval can also be displayed with
`errorbar_type = 'ci'`

.

```
ggplot(data = df, mapping = aes(x = Day, y = Value, fill = Day)) +
sm_bar(shape = 21, color = 'white', bar_fill_color = 'gray80', errorbar_type = 'ci') +
scale_fill_manual(values = sm_color('blue','orange'))
```

- You can also make the bar slightly transparent by setting
`bar_alpha`

to less than 1.

```
ggplot(data = df, mapping = aes(x = Day, y = Value, fill = Day)) +
sm_bar(shape = 21, color = 'white', bar_fill_color = 'gray80',
errorbar_type = 'ci', bar_alpha = 0.7) +
scale_fill_manual(values = sm_color('blue','orange'))
```

### 4.6.1 Plotting individual points with unique colors

- One can also use
`sm_bar()`

to plot individual points with unique colors.

```
ggplot(data = df, mapping = aes(x = Day, y = Value, color = Subject)) +
sm_bar(bar_fill_color = 'gray80') +
scale_color_manual(values = sm_palette(16))
```

`sm_bar()`

can also print distinct box colors across distinct x levels.

```
ggplot(data = df, mapping = aes(x = Day, y = Value, color = Subject,
fill = Day)) +
sm_bar() +
scale_color_manual(values = sm_palette(16)) +
scale_fill_manual(values = sm_color('yelloworange','skyblue'))
```

## 4.7 Slope Chart - `sm_slope()`

`sm_slope()`

plots a slope chart.- It also has the argument
`legends`

, which has been set to`FALSE`

as the default. - A slope chart is useful to describe changes between two different timepoints for each measurement (ex. a participant).
- It automatically uses
`sm_slope_theme()`

. - Let’s use
`df`

that we generated before. **Important:**To make this function work, the`mapping`

within`ggplot()`

has to have a certain structure.- x- and y-axes have to be defined.
- A slope chart groups each observation (ex.
`Subject`

) across x-axis. This has to be specified in`mapping`

as`group =`

.

- The x-axis cannot be
**continuous**. It has to be**discrete**. So, it should take the form of`character`

or`factor`

(ex. ‘One’, ‘Two’, ‘Three’). If x-axis only has number (i.e.,`double`

form, such as 1.02, 1.05, 1.5), then`sm_slope()`

will produce an error. `labels`

argument is required to use`sm_slope()`

. This refers to the labels of the ticks in the x-axis. Ex.`labels = c('Day 1', 'Day 2')`

.

```
ggplot(data = df, mapping = aes(x = Day, y = Value, group = Subject)) +
sm_slope(labels = c('Day 1', 'Day 2'))
```

- Let’s set the shape to 21.
- Let’s make the border color to
`white`

. - Let’s apply the same color to each
**Day**.

```
ggplot(data = df, mapping = aes(x = Day, y = Value, group = Subject)) +
sm_slope(labels = c('Day 1','Day 2'), shape = 21, color = 'white', fill = sm_color('blue'))
```

- You could also apply different color for each
`Day`

using`scale_fill_manual()`

.

```
ggplot(data = df, mapping = aes(x = Day, y = Value, group = Subject, fill = Day)) +
sm_slope(labels = c('Day 1','Day 2'), shape = 21, color = 'white') +
scale_fill_manual(values = sm_color('blue','orange'))
```

- You can also change the line color and other aesthetics. For more information, please type
`?sm_slope`

.

```
ggplot(data = df, mapping = aes(x = Day, y = Value, group = Subject, fill = Day)) +
sm_slope(labels = c('Day 1','Day 2'), shape = 21, color = 'white',
fill = sm_color('blue'), line_color = '#bfd5db',
line_size = 0.6)
```

## 4.8 A Bland Altman Plot - `sm_bland_altman()`

`sm_bland_altman()`

and`sm_statBlandAlt()`

functions can be used to create a Bland-Altman plot.- The plot is used to measure agreement between two different measurements.
- It is also used to measure test-retest variability of a method.
- Let’s generate random data.

```
set.seed(1)
<- rnorm(20)
first <- rnorm(20)
second <- as_tibble(cbind(first,second)) # requires library(tidyverse) df3
```

- Now let’s draw a Bland Altman plot using
`sm_bland_altman()`

, which requires two arguments: first dataset, second dataset. They have to be numerical vectors of equal length.- This function automatically uses
`sm_classic()`

theme.

- This function automatically uses

```
sm_bland_altman(df3$first, df3$second, color = sm_color('green')) +
scale_y_continuous(limits = c(-4,4))
```

- The upper dashed line represents the upper limit of the difference between two measurements (mean difference + 1.96 * standard deviation of the difference).
- The lower dashed line represents the lower limit of the difference between two measurements (mean difference - 1.96 * standard deviation of the difference).
- The middle dashed line represents the mean difference.
- The shaded region is the 95% confidence interval of the difference between the two measurements from one-sample t-test (difference vs 0).
- If the shaded region includes 0 in the y-axis, then there is no significant difference (p > 0.05) between 0 and the difference.
- If it does not include 0, then there is a significant difference. This indicates that the two measurement results are considerably different.

- I usually label them with
`annotate()`

, which is a function from**ggplot2**. This process can be tedious. - Also,
`sm_statBlandAlt()`

calculates the statistical values that are necessary to draw a Bland-Altman plot, such as the mean difference, upper and lower limits. This function is used to annotate the values in the plot.- The arguments for this function are first and second datasets, just like in
`sm_bland_altman()`

.

- The arguments for this function are first and second datasets, just like in

```
<- sm_statBlandAlt(df3$first,df3$second) # store the results in res variable
res
sm_bland_altman(df3$first, df3$second, color = sm_color('green')) +
scale_y_continuous(limits = c(-4,4)) +
annotate('text', label = 'Mean', x = -1, y = res$mean_diff + 0.4) +
annotate('text', label = signif(res$mean_diff,3), x = -1, y = res$mean_diff - 0.4) +
annotate('text', label = 'Upper limit', x = 1.2, y = res$upper_limit + 0.4) +
annotate('text', label = signif(res$upper_limit,3), x = 1.2, y = res$upper_limit - 0.4) +
annotate('text', label = 'Lower limit', x = 1.2, y = res$lower_limit + 0.4) +
annotate('text', label = signif(res$lower_limit,3), x = 1.2, y = res$lower_limit-0.4)
```

- Let’s change the border color of the circles to white. To do so, we will have to change their shape to 21.

```
sm_bland_altman(df3$first, df3$second, shape = 21, fill = sm_color('green'), color = 'white') +
scale_y_continuous(limits = c(-4,4)) +
annotate('text', label = 'Mean', x = -1, y = res$mean_diff + 0.4) +
annotate('text', label = signif(res$mean_diff,3), x = -1, y = res$mean_diff - 0.4) +
annotate('text', label = 'Upper limit', x = 1.2, y = res$upper_limit + 0.4) +
annotate('text', label = signif(res$upper_limit,3), x = 1.2, y = res$upper_limit - 0.4) +
annotate('text', label = 'Lower limit', x = 1.2, y = res$lower_limit + 0.4) +
annotate('text', label = signif(res$lower_limit,3), x = 1.2, y = res$lower_limit-0.4)
```

## 4.9 Raincloud plot - `sm_raincloud()`

`sm_raincloud()`

generates a raincloud plot.- It also has arguments
`borders`

and`legends`

. The the default of the former has been set as`TRUE`

and that of the latter as`FALSE`

. - A raincloud plot is a combination of jittered points, a boxplot and a violin plot.
- However, this plot can be visually crowded. Some people like to use raincloud plots, some do not. So, the choice to use it is entirely yours.
- Let’s generate some random data.

```
set.seed(2) # generate random data
= rnorm(20,0,1)
day1 = rnorm(20,5,1)
day2 = rnorm(20,6,1.5)
day3 = rnorm(20,7,2)
day4 <- rep(paste0('S',seq(1:20)), 4)
Subject <- data.frame(Value = matrix(c(day1,day2,day3,day4),ncol=1))
Data <- rep(c('Day 1', 'Day 2', 'Day 3', 'Day 4'), each = length(day1))
Day <- cbind(Subject, Data, Day) df2
```

- The x-axis variable column has to have the right level. If not, you should convert the column as factor and establish the levels correctly.
- Now let’s draw a raincloud plot using
`sm_raincloud()`

.

`sm_raincloud(data = df2, x = Day, y = Value) `

- Let’s change the x-axis labels.

```
sm_raincloud(data = df2, x = Day, y = Value) +
scale_x_continuous(limits = c(0.25,4.75), labels = c('First', 'Second', 'Third', 'Fourth'), breaks = c(1,2,3,4)) +
xlab('Day')
```

- The filling colors of the violin plots and boxplots can be modified by using
`scale_fill_manual()`

. - The border color of the violin plot can be changed by using
`scale_color_manual()`

.- I will set it
`transparent`

to remove the border of the violin plots.

- I will set it
- The color of the points can be used by either of the 2 functions depending on the shape, which can be set within
`sm_raincloud()`

.

```
sm_raincloud(data = df2, x = Day, y = Value, boxplot_alpha = 0.5,
color = 'white', shape = 21, sep_level = 2) +
scale_x_continuous(limits = c(0.25,4.75), labels = c('One', 'Two', 'Three', 'Four'), breaks = c(1,2,3,4)) +
xlab('Day') +
scale_color_manual(values = rep('transparent',4)) +
scale_fill_manual(values = sm_palette(4))
```

`sep_level`

is an argument to specify the degree of separation among points, boxplots and violin plots. When`sep_level = 0`

, they will all be crowded. When`sep_level = 4`

, they will all be separated from each other.- I personally prefer when the boxplot and violin plots are together, but not the points. So I set the default to
`sep_level = 2`

. - Shown below is an example when
`sep_level = 4`

with a grid theme`sm_minimal()`

.

- I personally prefer when the boxplot and violin plots are together, but not the points. So I set the default to

```
sm_raincloud(data = df2, x = Day, y = Value, boxplot_alpha = 0.5,
color = 'white', shape = 21, sep_level = 4) +
scale_x_continuous(limits = c(0.25,4.75), labels = c('1', '2', '3', '4'), breaks = c(1,2,3,4)) +
xlab('Day') +
scale_color_manual(values = rep('transparent',4)) +
scale_fill_manual(values = sm_palette(4)) +
sm_minimal()
```

- You can also flip the raincloud plot by setting
`which_side`

to`left`

.

```
sm_raincloud(data = df2, x = Day, y = Value, boxplot_alpha = 0.5,
color = 'white', shape = 21, sep_level = 2, which_side = 'left') +
scale_x_continuous(limits = c(0.25,4.75), labels = c('1', '2', '3', '4'), breaks = c(1,2,3,4)) +
xlab('Day') +
scale_color_manual(values = rep('transparent',4)) +
scale_fill_manual(values = sm_palette(4)) +
sm_minimal()
```

- So far the distribution plots (violin plots) have been vertical. We can change their configuration by setting
`vertical = FALSE`

.

```
sm_raincloud(data = df2, x = Day, y = Value, boxplot_alpha = 0.5,
color = 'white', shape = 21, sep_level = 2, which_side = 'left', vertical = FALSE) +
scale_x_continuous(limits = c(0.25,4.75), labels = c('1', '2', '3', '4'), breaks = c(1,2,3,4)) +
xlab('Day') +
scale_color_manual(values = rep('transparent',4)) +
scale_fill_manual(values = sm_palette(4)) +
sm_minimal()
```

- The orientation is not correct, so let’s change it by setting
`which_side = 'right'`

.

```
sm_raincloud(data = df2, x = Day, y = Value, boxplot_alpha = 0.5,
color = 'white', shape = 21, sep_level = 2, which_side = 'right', vertical = FALSE) +
scale_x_continuous(limits = c(0.25,4.75), labels = c('1', '2', '3', '4'), breaks = c(1,2,3,4)) +
xlab('Day') +
scale_color_manual(values = rep('transparent',4)) +
scale_fill_manual(values = sm_palette(4))
```

## 4.10 Forest plot - `sm_forest()`

- As the name suggests,
`sm_forest()`

draws a forest plot. - It has very similar arguments to those in
`sm_raincloud()`

.`sep_level`

sets the separation level between the average point and individual points. When`sep_level = 0`

, they overlap. When`sep_level = 3`

, they are separated. The values range from 0 to 3.

- The shape and the color of the average point can be set by arguments
`avg_point_shape`

and`avg_point_size`

. - The range of the error bar can be either at
`top`

or`bottom`

, which could be set by the argument`range_side`

. - The type of the error bar can be set using the argument
`errorbar_type`

, the default of which is`ci`

, short for confidence interval (95%). `borders`

and`legends`

can be used to remove the borders and legends as you can do in`sm_raincloud()`

.- Most importantly,
`x`

is required to be a continuous variable.`y`

should be a categorical variable (i.e.,`factor`

).

First, generate random numbers.

```
set.seed(2) # generate random data
= rnorm(20,0,1)
day1 = rnorm(20,5,1)
day2 = rnorm(20,6,1.5)
day3 = rnorm(20,7,2)
day4 <- rep(paste0('S',seq(1:20)), 4)
Subject <- data.frame(Value = matrix(c(day1,day2,day3,day4),ncol=1))
Data <- rep(c('Day 1', 'Day 2', 'Day 3', 'Day 4'), each = length(day1))
Day <- cbind(Subject, Data, Day) df2
```

When drawing a forest plot, the ratio of the figure is very important, and we will do an example until we save it properly. `x`

here is set to be the value, and `y`

the categorical variable. In `sm_raincloud`

, it was the opposite.

`sm_forest(data = df2, x = Value, y = Day) `

Therefore, the x and y scales have been flipped compared to the raincloud plots above. Notice that `sm_forest()`

automatically calculates the mean, 95% confidence interval and plots individual points as well.

We can further customise the aesthetics.

```
sm_forest(data = df2, x = Value, y = Day, alpha = 0.2,
avg_point_shape= 23, shape = 21) +
scale_color_manual(values = rep('transparent',4)) +
scale_fill_manual(values = sm_palette(4))
```

By changing the shape of the average point to `23`

(diamond with outline), we can include the line `scale_color_manual(values = rep('transparent',4))`

.

You can also change `jitter_width`

and `sep_level`

.

```
sm_forest(data = df2, x = Value, y = Day, alpha = 0.2,
avg_point_shape= 23, shape = 21,
jitter_width = 0.1, sep_level = 0) +
scale_color_manual(values = rep('transparent',4)) +
scale_fill_manual(values = sm_palette(4)) +
ylab('Day') +
ggtitle('Forest plot')
```

Now we can save the plot as an image file.
First store the plot into a variable, then set the correct width (`base_width = 3`

) and height (`base_height = 5`

) using `save_plot()`

.

```
<- sm_forest(data = df2, x = Value, y = Day, alpha = 0.2,
forest_plot avg_point_shape= 23, shape = 21,
jitter_width = 0.15, sep_level = 0) +
scale_color_manual(values = rep('transparent',4)) +
scale_fill_manual(values = sm_palette(4)) +
ylab('Day') +
ggtitle('Forest plot')
save_plot('forest.png', forest_plot, base_height = 5,
base_width = 3)
```

You can also set `errorbar_type = sd`

so that the the error bar and the text denote the standard deviation of each sample.

```
sm_forest(data = df2, x = Value, y = Day, alpha = 0.2,
avg_point_shape= 23, shape = 21,
jitter_width = 0.1, sep_level = 0,
errorbar_type = 'sd') +
scale_color_manual(values = rep('transparent',4)) +
scale_fill_manual(values = sm_palette(4)) +
ylab('Day') +
ggtitle('Forest plot')
```

## 4.11 Overriding Defaults of smplot’s Themes

You can override all the defaults by adding

`theme()`

object to your**ggplot2**graph.Here is a bar graph.

```
ggplot(data = df, mapping = aes(x = Day, y = Value, fill = Day)) +
sm_bar(shape = 21, color = 'white', bar_fill_color = 'gray80') +
scale_fill_manual(values = sm_color('blue','orange'))
```

- Now let’s remove the x-axis title
**Day**.

```
ggplot(data = df, mapping = aes(x = Day, y = Value, fill = Day)) +
sm_bar(shape = 21, color = 'white', bar_fill_color = 'gray80') +
scale_fill_manual(values = sm_color('blue','orange')) +
theme(axis.title.x = element_blank())
```

- Let’s customise the graph more by changing the y-axis title and adding a main title.

```
ggplot(data = df, mapping = aes(x = Day, y = Value, fill = Day)) +
sm_bar(shape = 21, color = 'white', bar_fill_color = 'gray80') +
scale_fill_manual(values = sm_color('blue','orange')) +
theme(axis.title.x = element_blank()) +
ylab('Reading speed') +
ggtitle('Reading performance in children')
```

### 4.11.1 Scaling the y-axis

A bar plot is shown below using this random data.

```
set.seed(1) # generate random data
= abs(rnorm(16,1,1))
day1 = abs(rnorm(16,5,1))
day2 <- rep(paste0('S',seq(1:16)), 2)
Subject <- data.frame(Value = matrix(c(day1,day2),ncol=1))
Data <- rep(c('Day 1', 'Day 2'), each = length(day1))
Day <- cbind(Subject, Data, Day) df
```

However, there is something weird going on regarding the y-axis limit.

```
ggplot(data = df, mapping = aes(x = Day, y = Value, fill = Day)) +
sm_bar(shape = 21, color = 'white', bar_fill_color = 'gray80') +
scale_fill_manual(values = sm_color('blue','orange'))
```

The y-axis limit does not start from 0. Let’s specify the y-axis limit manually using `scale_y_continuous()`

.

```
ggplot(data = df, mapping = aes(x = Day, y = Value, fill = Day)) +
sm_bar(shape = 21, color = 'white', bar_fill_color = 'gray80') +
scale_fill_manual(values = sm_color('blue','orange')) +
scale_y_continuous(limits = c(0,7))
```

Although we have specified that the y-axis limit begins from 0 and ends at 7, there is still a small margin below 0. What is going on here?

The default of **ggplot2** is that there is always a small margin below the lowest point of the y-axis limit and above the largest point of the y-axis.The empty space below 0 and above 7 can be removed by using `expand = c(0,0)`

within `scale_y_continuous()`

.

```
ggplot(data = df, mapping = aes(x = Day, y = Value, fill = Day)) +
sm_bar(shape = 21, color = 'white', bar_fill_color = 'gray80') +
scale_fill_manual(values = sm_color('blue','orange')) +
scale_y_continuous(limits = c(0,7), expand = c(0,0))
```

Note that `expand = c(0,0)`

has already been used in this chapter where it discusses the correlation plot (with both regression and reference lines). You can plot it without `expand = c(0,0)`

and see what happens.

```
# correlation plot using data frame 'df_corr'
ggplot(data = df_corr, mapping = aes(x = first, y = second)) +
geom_point(shape = 21, fill = sm_color('crimson'), color = 'white',
size = 3) + sm_corr_theme(borders = FALSE) +
geom_abline(slope = 1, linetype = 'dashed') +
scale_y_continuous(limits = c(-2.8,2.8)) +
scale_x_continuous(limits = c(-2.8,2.8)) +
sm_statCorr(color = sm_color('crimson'), corr_method = 'pearson',
label_x = -2.2, label_y = 2.3) +
ggtitle('Correlation plot') +
xlab('Method 1') + ylab('Method 2')
```

Notice that even we have specified the x- and y-limits from -2.8 to 2.8, we still see -3 and 3. This is because **ggplot2** default provides extra margin space. Also, the plot above is not pretty because the grid lines at the outer ends act as pseudo-borders. So it is best to remove them. We can do this with `expand = c(0,0)`

, which reduces the margin for both x and y axes.

```
# correlation plot using data frame 'df_corr'
ggplot(data = df_corr, mapping = aes(x = first, y = second)) +
geom_point(shape = 21, fill = sm_color('crimson'), color = 'white',
size = 3) + sm_corr_theme(borders = FALSE) +
geom_abline(slope = 1, linetype = 'dashed') +
scale_y_continuous(limits = c(-2.8,2.8), expand = c(0,0)) +
scale_x_continuous(limits = c(-2.8,2.8), expand = c(0,0)) +
sm_statCorr(color = sm_color('crimson'), corr_method = 'pearson',
label_x = -2.2, label_y = 2.3) +
ggtitle('Correlation plot') +
xlab('Method 1') + ylab('Method 2')
```

I think this looks much nicer.

## 4.12 Overriding Defaults of smplot Colors

`sm_color('blue)`

prints a hex code of the`blue`

. Likewise,`sm_color('blue','orange')`

prints out two hex codes, one for`blue`

and another for`orange`

.Therefore, instead of using

`sm_color()`

function to call forth the colors, you can directly write the hex codes of colors that are not included in**smplot**.

```
<- c('#ff1493', '#483d8B') # pink and lavender
my_colors
ggplot(data = df, mapping = aes(x = Day, y = Value, fill = Day)) +
sm_bar(shape = 21, color = 'white', bar_fill_color = 'gray80') +
scale_fill_manual(values = my_colors) +
scale_y_continuous(limits = c(0,7), expand = c(0,0.05))
```