46  R Command Reference

This appendix is a quick reference for commonly used R commands, covering both base R and tidyverse functions.

46.1 Base R Fundamentals

Assignment and Basic Operations

Operation    Syntax                    Example
Assignment   <- or =                   x <- 5
Print value  Variable name or print()  x or print(x)
Comment      #                         # This is a comment
Help         ? or help()               ?mean
Examples     example()                 example(mean)
Search help  ?? or help.search()       ??regression

Arithmetic Operators

Operator  Description         Example
+         Addition            5 + 3
-         Subtraction         5 - 3
*         Multiplication      5 * 3
/         Division            5 / 3
^ or **   Exponentiation      5^3
%%        Modulo (remainder)  5 %% 3
%/%       Integer division    5 %/% 3

Comparison Operators

Operator  Description               Example
==        Equal to                  x == 5
!=        Not equal to              x != 5
<         Less than                 x < 5
>         Greater than              x > 5
<=        Less than or equal to     x <= 5
>=        Greater than or equal to  x >= 5

Logical Operators

Operator  Description                        Example
&         AND (element-wise)                 x > 0 & x < 10
|         OR (element-wise)                  x < 0 | x > 10
!         NOT                                !is.na(x)
&&        AND (single value, short-circuit)  cond1 && cond2
||        OR (single value, short-circuit)   cond1 || cond2
%in%      Value in set                       x %in% c(1, 2, 3)
xor()     Exclusive OR                       xor(TRUE, FALSE)
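The difference between the element-wise and single-value operators is easiest to see side by side. The following is a small sketch using a made-up vector `x` (the variable names are illustrative, not part of the reference tables):

```r
x <- c(-2, 4, 12)

# Element-wise operators return one result per element
in_range <- x > 0 & x < 10        # FALSE  TRUE FALSE
outside  <- x < 0 | x > 10        #  TRUE FALSE  TRUE

# && and || work on single values and stop as soon as the answer is known
first_negative <- length(x) > 0 && x[1] < 0   # TRUE

# %in% tests membership for each element of the left-hand side
member <- x %in% c(4, 12)         # FALSE  TRUE  TRUE

# Modulo and integer division from the arithmetic table
remainder <- 5 %% 3               # 2
quotient  <- 5 %/% 3              # 1
```

Use `&` and `|` when filtering vectors or data frame rows, and `&&` and `||` inside `if ()` conditions, where a single TRUE or FALSE is expected.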

46.2 Vector Operations

Creating Vectors

Function     Description            Example
c()          Combine values         c(1, 2, 3, 4, 5)
:            Sequence               1:10
seq()        Sequence with step     seq(0, 1, by = 0.1)
seq_len()    Sequence of length n   seq_len(10)
seq_along()  Sequence along object  seq_along(x)
rep()        Repeat values          rep(1, times = 5)
rep()        Repeat each            rep(1:3, each = 2)
vector()     Preallocate vector     vector("numeric", 10)
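One practical reason to prefer `seq_len()` and `seq_along()` over `1:n` is the behavior when the length is zero. A minimal sketch with illustrative variable names:

```r
n <- 0
counts_down <- 1:n          # c(1, 0): a for loop over this would run twice
empty       <- seq_len(n)   # integer(0): a for loop over this would not run

# vector() preallocates a vector of zeros that a loop can fill in
out <- vector("numeric", 3)
for (i in seq_along(out)) {
  out[i] <- i^2
}
out                         # 1 4 9
```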

Vector Indexing

Syntax        Description            Example
x[i]          Element at position i  x[3]
x[c(i,j)]     Multiple elements      x[c(1, 3, 5)]
x[-i]         Exclude element        x[-1]
x[condition]  Logical subsetting     x[x > 5]
x[1:n]        Range of elements      x[1:5]
x["name"]     By name                x["first"]

Vector Functions

Function     Description          Example
length()     Number of elements   length(x)
sum()        Sum of elements      sum(x)
mean()       Arithmetic mean      mean(x)
median()     Median value         median(x)
sd()         Standard deviation   sd(x)
var()        Variance             var(x)
min()        Minimum              min(x)
max()        Maximum              max(x)
range()      Range (min and max)  range(x)
quantile()   Quantiles            quantile(x, 0.5)
sort()       Sort values          sort(x)
order()      Indices for sorting  order(x)
rev()        Reverse order        rev(x)
unique()     Unique values        unique(x)
table()      Frequency table      table(x)
cumsum()     Cumulative sum       cumsum(x)
diff()       Differences          diff(x)
which()      Indices where TRUE   which(x > 5)
which.max()  Index of max         which.max(x)
which.min()  Index of min         which.min(x)
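`sort()` and `order()` are easy to confuse: the first returns the sorted values, the second returns the positions that would sort them. A short sketch on a made-up named vector:

```r
x <- c(b = 30, a = 10, c = 20)

sorted    <- sort(x)        # values in increasing order, names kept
positions <- order(x)       # 2 3 1: positions that would sort x
reordered <- x[positions]   # same result as sort(x)

above <- which(x > 15)      # positions where the condition is TRUE
top   <- which.max(x)       # position of the largest value
```

`order()` is the one to reach for when sorting one object by the values of another, for example `df[order(df$age), ]`.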

46.3 Data Frames

Creating Data Frames

Function         Description            Example
data.frame()     Create data frame      data.frame(x = 1:3, y = c("a", "b", "c"))
tibble()         Create tibble          tibble(x = 1:3, y = c("a", "b", "c"))
as.data.frame()  Convert to data frame  as.data.frame(matrix)
as_tibble()      Convert to tibble      as_tibble(df)

Data Frame Indexing

Syntax           Description           Example
df$col           Column by name        df$age
df[, "col"]      Column as vector      df[, "age"]
df["col"]        Column as data frame  df["age"]
df[i, ]          Row by position       df[1, ]
df[i, j]         Element               df[1, 2]
df[condition, ]  Filter rows           df[df$age > 25, ]
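The indexing forms above differ in what they return. The sketch below uses a small made-up data frame to show the distinction for a base `data.frame` (a tibble behaves differently: `df[, "age"]` stays a one-column tibble):

```r
df <- data.frame(
  name = c("Ann", "Bob", "Cy"),
  age  = c(31, 24, 45)
)

ages_vec  <- df[, "age"]         # numeric vector
ages_df   <- df["age"]           # one-column data frame
older     <- df[df$age > 25, ]   # rows where age exceeds 25
first_age <- df[1, 2]            # single element: row 1, column 2
```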

Data Frame Functions

Function    Description          Example
nrow()      Number of rows       nrow(df)
ncol()      Number of columns    ncol(df)
dim()       Dimensions           dim(df)
names()     Column names         names(df)
colnames()  Column names         colnames(df)
rownames()  Row names            rownames(df)
head()      First rows           head(df, 10)
tail()      Last rows            tail(df, 10)
str()       Structure            str(df)
summary()   Summary statistics   summary(df)
glimpse()   Tidyverse structure  glimpse(df)
View()      Open viewer          View(df)

46.4 Reading and Writing Data

Base R I/O

Function       Description            Example
read.csv()     Read CSV               read.csv("file.csv")
read.table()   Read table             read.table("file.txt", header = TRUE)
read.delim()   Read tab-delimited     read.delim("file.tsv")
write.csv()    Write CSV              write.csv(df, "file.csv", row.names = FALSE)
write.table()  Write table            write.table(df, "file.txt")
saveRDS()      Save R object          saveRDS(obj, "file.rds")
readRDS()      Read R object          readRDS("file.rds")
save()         Save multiple objects  save(x, y, file = "data.RData")
load()         Load RData file        load("data.RData")
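A round trip through a file is a quick way to check what each format preserves. This sketch writes a made-up data frame to temporary files (the paths come from `tempfile()`, so nothing is left in the working directory):

```r
df <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))

# CSV: plain text, column types are re-guessed on reading
csv_path <- tempfile(fileext = ".csv")
write.csv(df, csv_path, row.names = FALSE)
df_csv <- read.csv(csv_path)

# RDS: stores a single R object exactly, including its types
rds_path <- tempfile(fileext = ".rds")
saveRDS(df, rds_path)
df_rds <- readRDS(rds_path)

same_after_rds <- identical(df, df_rds)
```

Without `row.names = FALSE`, `write.csv()` adds a column of row names that comes back as an extra column when the file is read.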

Tidyverse I/O (readr)

Function      Description          Example
read_csv()    Read CSV (fast)      read_csv("file.csv")
read_tsv()    Read TSV             read_tsv("file.tsv")
read_delim()  Read with delimiter  read_delim("file.txt", delim = "|")
read_fwf()    Read fixed-width     read_fwf("file.txt", fwf_widths(c(5, 10)))
write_csv()   Write CSV            write_csv(df, "file.csv")
write_tsv()   Write TSV            write_tsv(df, "file.tsv")

46.5 Tidyverse: dplyr

Core Verbs

Function     Description            Example
filter()     Filter rows            filter(df, age > 25)
select()     Select columns         select(df, name, age)
mutate()     Create/modify columns  mutate(df, age_sq = age^2)
arrange()    Sort rows              arrange(df, age)
summarise()  Summarize data         summarise(df, mean_age = mean(age))
group_by()   Group data             group_by(df, category)
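The core verbs are usually chained rather than called one at a time. The sketch below combines several of them on a made-up data frame; it assumes dplyr is installed and uses the native pipe `|>`, which requires R 4.1 or later:

```r
library(dplyr)

df <- data.frame(
  category = c("a", "a", "b", "b", "b"),
  age      = c(20, 30, 40, 50, 60)
)

result <- df |>
  filter(age > 25) |>                         # drop the 20-year-old
  group_by(category) |>                       # one group per category
  summarise(mean_age = mean(age), n = n()) |> # one row per group
  arrange(desc(mean_age))                     # largest mean first
```

Each step takes a data frame and returns a data frame, which is what makes the chain possible.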

Selection Helpers

Function       Description                  Example
starts_with()  Columns starting with        select(df, starts_with("temp"))
ends_with()    Columns ending with          select(df, ends_with("_id"))
contains()     Columns containing           select(df, contains("score"))
matches()      Columns matching regex       select(df, matches("^x[0-9]"))
everything()   All columns                  select(df, name, everything())
where()        Columns where condition      select(df, where(is.numeric))
all_of()       All listed columns (strict)  select(df, all_of(col_names))
any_of()       Listed columns, if present   select(df, any_of(col_names))

Additional dplyr Functions

Function        Description                Example
distinct()      Unique rows                distinct(df, category)
count()         Count occurrences          count(df, category)
slice()         Select rows by position    slice(df, 1:10)
slice_head()    First n rows               slice_head(df, n = 5)
slice_tail()    Last n rows                slice_tail(df, n = 5)
slice_sample()  Random rows                slice_sample(df, n = 10)
slice_max()     Rows with max values       slice_max(df, age, n = 3)
slice_min()     Rows with min values       slice_min(df, age, n = 3)
pull()          Extract column as vector   pull(df, name)
rename()        Rename columns             rename(df, new_name = old_name)
relocate()      Reorder columns            relocate(df, name, .before = age)
across()        Apply to multiple columns  mutate(df, across(where(is.numeric), scale))
rowwise()       Row-wise operations        rowwise(df)
ungroup()       Remove grouping            ungroup(df)
n()             Count in group             summarise(df, count = n())
n_distinct()    Count unique values        summarise(df, unique = n_distinct(x))
first()         First value                summarise(df, first = first(x))
last()          Last value                 summarise(df, last = last(x))
nth()           Nth value                  summarise(df, third = nth(x, 3))

Joins

Function      Description                     Example
left_join()   Keep all left rows              left_join(df1, df2, by = "id")
right_join()  Keep all right rows             right_join(df1, df2, by = "id")
inner_join()  Keep matching rows              inner_join(df1, df2, by = "id")
full_join()   Keep all rows                   full_join(df1, df2, by = "id")
semi_join()   Filter left by right            semi_join(df1, df2, by = "id")
anti_join()   Filter left, no match in right  anti_join(df1, df2, by = "id")
bind_rows()   Stack data frames               bind_rows(df1, df2)
bind_cols()   Combine columns                 bind_cols(df1, df2)
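The join types differ only in which unmatched rows they keep, which is easiest to see on two small tables that partly overlap. A sketch with made-up data, assuming dplyr is installed:

```r
library(dplyr)

df1 <- data.frame(id = c(1, 2, 3), name = c("Ann", "Bob", "Cy"))
df2 <- data.frame(id = c(2, 3, 4), score = c(88, 92, 75))

left  <- left_join(df1, df2, by = "id")   # ids 1, 2, 3; score is NA for id 1
inner <- inner_join(df1, df2, by = "id")  # ids 2, 3 only
full  <- full_join(df1, df2, by = "id")   # ids 1, 2, 3, 4
anti  <- anti_join(df1, df2, by = "id")   # id 1: in df1 with no match in df2
```

`semi_join()` and `anti_join()` never add columns from the right table; they only filter the left one.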

46.6 Tidyverse: tidyr

Function                Description                    Example
pivot_longer()          Wide to long                   pivot_longer(df, cols = -id, names_to = "var", values_to = "val")
pivot_wider()           Long to wide                   pivot_wider(df, names_from = var, values_from = val)
separate()              Split column (superseded)      separate(df, col, into = c("a", "b"), sep = "_")
separate_wider_delim()  Split by delimiter             separate_wider_delim(df, col, delim = "_", names = c("a", "b"))
unite()                 Combine columns                unite(df, new_col, a, b, sep = "_")
drop_na()               Remove NA rows                 drop_na(df)
fill()                  Fill NA with previous          fill(df, column, .direction = "down")
replace_na()            Replace NA values              replace_na(df, list(x = 0))
complete()              Complete missing combinations  complete(df, x, y)
expand()                Create all combinations        expand(df, x, y)
nest()                  Nest data                      nest(df, data = -group)
unnest()                Unnest data                    unnest(df, data)
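`pivot_longer()` and `pivot_wider()` are inverses of each other, which a round trip makes concrete. A sketch on a made-up two-row table, assuming tidyr is installed:

```r
library(tidyr)

wide <- data.frame(id = 1:2, x = c(10, 20), y = c(30, 40))

# Wide to long: one row per (id, variable) pair
long <- pivot_longer(wide, cols = -id, names_to = "var", values_to = "val")

# Long to wide: back to one row per id
back <- pivot_wider(long, names_from = var, values_from = val)
```

Here `long` has four rows (two ids times two variables), and `back` holds the same values as `wide`, returned as a tibble.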

46.7 Tidyverse: ggplot2

Basic Structure

ggplot(data, aes(x = x_var, y = y_var)) +
  geom_*() +
  labs() +
  theme_*()

Geometries

Function          Plot Type             Usage
geom_point()      Scatter plot          Continuous x and y
geom_line()       Line plot             Continuous x and y
geom_smooth()     Smoothed line         Add trend line
geom_bar()        Bar chart             Counts of observations
geom_col()        Bar chart             Precomputed values
geom_histogram()  Histogram             Distribution
geom_density()    Density plot          Smooth distribution
geom_boxplot()    Box plot              Distribution by group
geom_violin()     Violin plot           Distribution shape
geom_jitter()     Jittered points       Avoid overplotting
geom_area()       Area plot             Filled area
geom_tile()       Heatmap tiles         Grid data
geom_text()       Text labels           Add text
geom_label()      Text with background  Labeled points
geom_errorbar()   Error bars            Uncertainty
geom_hline()      Horizontal line       Reference line
geom_vline()      Vertical line         Reference line
geom_abline()     Diagonal line         y = intercept + slope * x

Aesthetics

Aesthetic  Description                    Example
x          X-axis variable                aes(x = var)
y          Y-axis variable                aes(y = var)
color      Point/line color               aes(color = group)
fill       Fill color                     aes(fill = group)
size       Point size (lines: linewidth)  aes(size = value)
shape      Point shape                    aes(shape = group)
linetype   Line type                      aes(linetype = group)
alpha      Transparency                   aes(alpha = value)
group      Grouping                       aes(group = id)

Scales and Labels

Function                Description          Example
labs()                  Add labels           labs(title = "Title", x = "X", y = "Y")
scale_x_continuous()    Continuous x scale   scale_x_continuous(limits = c(0, 100))
scale_y_continuous()    Continuous y scale   scale_y_continuous(breaks = seq(0, 10, 2))
scale_x_log10()         Log10 x scale        scale_x_log10()
scale_color_manual()    Manual colors        scale_color_manual(values = c("red", "blue"))
scale_fill_brewer()     ColorBrewer palette  scale_fill_brewer(palette = "Set1")
scale_fill_viridis_d()  Viridis discrete     scale_fill_viridis_d()
coord_flip()            Flip coordinates     coord_flip()
coord_polar()           Polar coordinates    coord_polar()

Faceting

Function      Description       Example
facet_wrap()  Wrap into panels  facet_wrap(~variable)
facet_grid()  Grid of panels    facet_grid(rows ~ cols)
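The pieces in this section (data, aesthetics, geometries, labels, facets, theme) are added together with `+`. The sketch below fills in the basic structure with the built-in `mtcars` data set; it assumes ggplot2 is installed, and the choice of variables and labels is purely illustrative:

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +                                          # scatter layer
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) + # linear trend per group
  facet_wrap(~cyl) +                                      # one panel per cylinder count
  labs(
    title = "Fuel economy by weight",
    x = "Weight (1000 lbs)",
    y = "Miles per gallon",
    color = "Cylinders"
  ) +
  theme_minimal()

# print(p) draws the plot; in an interactive session typing p is enough
```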

46.8 Statistical Functions

Function       Description               Example
t.test()       T-test                    t.test(x, y)
cor()          Correlation               cor(x, y)
cor.test()     Correlation test          cor.test(x, y)
lm()           Linear model              lm(y ~ x, data = df)
glm()          Generalized linear model  glm(y ~ x, family = binomial, data = df)
aov()          ANOVA                     aov(y ~ group, data = df)
chisq.test()   Chi-squared test          chisq.test(table(x, y))
wilcox.test()  Wilcoxon test             wilcox.test(x, y)
ks.test()      Kolmogorov-Smirnov test   ks.test(x, y)
summary()      Model summary             summary(model)
coef()         Model coefficients        coef(model)
residuals()    Model residuals           residuals(model)
predict()      Predictions               predict(model, newdata)
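The model helpers in the table are typically used together: fit once, then extract what is needed. A minimal sketch on made-up data (the numbers are invented for illustration):

```r
df <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(3.1, 4.9, 7.2, 8.8, 11.0)
)

model <- lm(y ~ x, data = df)   # fit y = intercept + slope * x

est <- coef(model)              # named vector: (Intercept), x
res <- residuals(model)         # one residual per observation

# newdata must contain the predictor columns used in the formula
newdata <- data.frame(x = c(6, 7))
pred <- predict(model, newdata)
```

`summary(model)` prints the coefficient table with standard errors and p-values, along with R-squared.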

46.9 String Functions (stringr)

Function           Description                  Example
str_length()       String length                str_length("hello")
str_sub()          Substring                    str_sub("hello", 1, 3)
str_c()            Concatenate                  str_c("a", "b", sep = "-")
str_detect()       Detect pattern               str_detect(x, "pattern")
str_replace()      Replace first match          str_replace(x, "old", "new")
str_replace_all()  Replace all matches          str_replace_all(x, "old", "new")
str_split()        Split string                 str_split(x, ",")
str_trim()         Trim surrounding whitespace  str_trim(x)
str_to_lower()     Lowercase                    str_to_lower(x)
str_to_upper()     Uppercase                    str_to_upper(x)
str_to_title()     Title case                   str_to_title(x)
str_extract()      Extract first match          str_extract(x, "[0-9]+")
str_count()        Count matches                str_count(x, "a")
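All of these functions are vectorized over their first argument, so they apply to a whole character vector at once. A sketch on made-up strings, assuming stringr is installed:

```r
library(stringr)

x <- c("  apple_01 ", "banana_22", "cherry")

clean     <- str_trim(x)                     # "apple_01" "banana_22" "cherry"
has_digit <- str_detect(clean, "[0-9]")      # TRUE TRUE FALSE
nums      <- str_extract(clean, "[0-9]+")    # "01" "22" NA
n_a       <- str_count(clean, "a")           # 1 3 0
labels    <- str_to_upper(str_replace(clean, "_", "-"))
```

When a pattern does not match, `str_extract()` returns `NA` rather than an empty string, which keeps the result the same length as the input.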

46.10 Control Flow

Conditionals

# if-else
if (condition) {
  # code if TRUE
} else if (other_condition) {
  # code if other TRUE
} else {
  # code if all FALSE
}

# Vectorized if-else
ifelse(condition, value_if_true, value_if_false)

# dplyr case_when
case_when(
  condition1 ~ value1,
  condition2 ~ value2,
  TRUE ~ default_value
)
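To make the templates above concrete, the sketch below applies `ifelse()` and `case_when()` to a made-up vector of ages. It assumes dplyr is installed; the cut-offs are arbitrary:

```r
library(dplyr)

age <- c(12, 17, 35, 70)

# ifelse(): two outcomes, evaluated for every element
status <- ifelse(age < 18, "minor", "adult")

# case_when(): several conditions, checked in order; the first match wins
group <- case_when(
  age < 13 ~ "child",
  age < 18 ~ "teen",
  age < 65 ~ "adult",
  TRUE     ~ "senior"   # fallback when nothing above matched
)
```

Because conditions are checked top to bottom, the more specific ones need to come first.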

Loops

# for loop
for (i in 1:10) {
  print(i)
}

# while loop
while (condition) {
  # code
}

# Apply functions (often preferred over explicit loops)
# x is a list or vector, f is a function
lapply(x, f)    # Returns a list
sapply(x, f)    # Simplifies to a vector or matrix when possible
map(x, f)       # purrr, returns a list
map_dbl(x, f)   # purrr, returns a double vector
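The return types are the main practical difference between these functions. The sketch below uses base R only on a made-up list; `vapply()` is not listed above and is included here as an additional base R option that fixes the return type in advance:

```r
values <- list(a = 1:3, b = 4:6, c = 7:9)

means_list <- lapply(values, mean)               # list with elements a, b, c
means_vec  <- sapply(values, mean)               # named numeric vector
means_safe <- vapply(values, mean, numeric(1))   # errors if any result is not one number
```

With purrr loaded, `map(values, mean)` corresponds to the `lapply()` call and `map_dbl(values, mean)` to the `vapply()` call.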

46.11 Package Management

Function              Description                Example
install.packages()    Install from CRAN          install.packages("dplyr")
library()             Load package               library(dplyr)
require()             Load (returns TRUE/FALSE)  require(dplyr)
installed.packages()  List installed             installed.packages()
update.packages()     Update all                 update.packages()
remove.packages()     Remove package             remove.packages("dplyr")
packageVersion()      Package version            packageVersion("dplyr")
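A common pattern is to install a package only when it is missing and then load it. The sketch below uses `requireNamespace()`, a base R function not listed in the table, to check availability without attaching the package. The package name is a placeholder: `stats` ships with R, so nothing is installed when this runs as written.

```r
pkg <- "stats"   # placeholder; replace with the package you need

# Install only if the package cannot be found
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# character.only = TRUE lets library() take the name from a variable
library(pkg, character.only = TRUE)

ver <- packageVersion(pkg)
```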