**Introduction**

A dummy variable designates subgroups within your analysis, typically using the values 0 and 1. In this blog, we’ll show you how to create dummy variables from both continuous variables and binary string variables in R.

**Load Data**

Let’s enter some data into R to experiment on. You can copy and paste the following code into R:

subject <- 1:20

gender <- c(rep('male',10),rep('female',10))

books <- c(12, 15, 0, 23, 18, 10, 10, 9, 8, 10,

25, 21, 13, 18, 3, 21, 22, 12, 12, 11)

df <- data.frame(subject, gender, books)

print(df)

Treat these as data on the number of books read in 2022 by 20 subjects, 10 male and 10 female.

**Create Dummy Variable from Binary String Variable**

Suppose we want to run an ordinary least squares regression of the number of books on gender. Although gender is a string variable, R’s lm() will accept it: it converts the strings to a factor and dummy-codes them automatically, using the alphabetically first level (here, female) as the reference category. Confirm by trying the following code:

lmbooks = lm(books ~ gender, data = df)

lmbooks

The coefficient is labelled gendermale, so it reports males relative to females. To control the coding ourselves, we can turn the string variable of gender into an explicit dummy variable with male = 0, female = 1.

Try the following code:

df$gender.dummy <- ifelse(df$gender == "female", 1, 0)

print(df)

The dataset now contains a new column, gender.dummy, in which females = 1 and males = 0.
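The same 0/1 coding can be produced in more than one way, which is handy as a cross-check. The sketch below rebuilds the gender vector so that it runs on its own, then compares ifelse() with as.integer() applied to a logical test and with the column that base R's model.matrix() generates. The object names dummy_ifelse, dummy_logical, gender_f, and dummy_mm are illustrative only and do not appear elsewhere in this post.

```r
# Rebuild the example gender vector so this block runs on its own
gender <- c(rep("male", 10), rep("female", 10))

# Approach used in this post: ifelse() on a string comparison
dummy_ifelse <- ifelse(gender == "female", 1, 0)

# Alternative: a logical comparison converted to integer (TRUE -> 1, FALSE -> 0)
dummy_logical <- as.integer(gender == "female")

# Alternative: let model.matrix() dummy-code a factor with male as the reference
gender_f <- factor(gender, levels = c("male", "female"))
dummy_mm <- model.matrix(~ gender_f)[, "gender_ffemale"]

# All three codings should agree
all(dummy_ifelse == dummy_logical)
all(dummy_ifelse == dummy_mm)
```

Setting the factor levels explicitly, rather than relying on alphabetical order, is what makes male the reference group in the model.matrix() version.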

We could now run that regression:

lmbooks = lm(books ~ gender.dummy, data = df)

summary(lmbooks)

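In the summary output, the coefficient on gender.dummy is the difference in mean books between the two groups. Because we coded females as 1 and males as 0, it shows that women read 4.3 more books than men on average (15.8 versus 11.5), but this difference is not statistically significant, *p* = .154.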
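To see where the 4.3 comes from, the sketch below rebuilds the data so that it runs standalone, compares the group means with the regression slope, and then fits the same model on a factor with male set as the reference level. The names group_means, fit_dummy, fit_factor, and gender.f are illustrative choices rather than part of the original example.

```r
# Rebuild the example data so this block runs on its own
books <- c(12, 15, 0, 23, 18, 10, 10, 9, 8, 10,
           25, 21, 13, 18, 3, 21, 22, 12, 12, 11)
gender <- c(rep("male", 10), rep("female", 10))
df <- data.frame(gender, books)
df$gender.dummy <- ifelse(df$gender == "female", 1, 0)

# Mean number of books in each group
group_means <- tapply(df$books, df$gender, mean)
group_means

# The slope on the dummy is the female mean minus the male mean
fit_dummy <- lm(books ~ gender.dummy, data = df)
coef(fit_dummy)["gender.dummy"]

# Same model, letting lm() dummy-code a factor with male as the reference
df$gender.f <- factor(df$gender, levels = c("male", "female"))
fit_factor <- lm(books ~ gender.f, data = df)
coef(fit_factor)["gender.ffemale"]
```

With a single 0/1 predictor, the intercept is the mean of the group coded 0 and the slope is the gap between the two group means, so both fits should report the same difference.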

**Create Dummy Variable from Continuous Variable**

Still working with the previous data, let’s say we want to create a new dummy variable from the continuous variable of books. We want to code 0 to 10 books as 0 and 11 or more books as 1. We can use the following R code to create the dummy variable accordingly:

df$books.dummy <- ifelse(df$books > 10, 1, 0)

print(df)

The dataset now contains a new dummy variable, books.dummy, based on your cutoff value of 10 books: subjects who read 10 or fewer books are coded 0, and subjects who read 11 or more are coded 1.
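A quick cross-tabulation is a useful check that the cutoff landed where intended. The sketch below rebuilds the books vector so that it runs by itself, and also shows cut() as one possible alternative when labelled categories are preferred over 0 and 1. The object name books.cat and the labels "0-10" and "11+" are illustrative assumptions.

```r
# Rebuild the example books vector so this block runs on its own
books <- c(12, 15, 0, 23, 18, 10, 10, 9, 8, 10,
           25, 21, 13, 18, 3, 21, 22, 12, 12, 11)

# Dummy as created in this post: more than 10 books -> 1, otherwise 0
books.dummy <- ifelse(books > 10, 1, 0)

# Check the boundary: a value of 10 should map to 0, and 11 to 1
table(books, books.dummy)

# Alternative: cut() with right-closed intervals (-Inf, 10] and (10, Inf]
books.cat <- cut(books, breaks = c(-Inf, 10, Inf), labels = c("0-10", "11+"))
table(books.cat, books.dummy)
```

Because cut() closes intervals on the right by default, a value of exactly 10 falls in the lower category, matching the books > 10 test used for the dummy.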
