Learning to Create Frequency Tables in R: A Step-by-Step Guide


A frequency table is an indispensable cornerstone of Exploratory Data Analysis (EDA). This analytical tool systematically organizes raw measurements by calculating and displaying the counts, or frequencies, of distinct categories or values present within a dataset. By providing this concise, structured display, the frequency table is crucial for gaining immediate insights into the underlying distribution, central tendencies, and structural composition of the variables under examination.

This comprehensive tutorial provides a step-by-step guide on effectively constructing various types of frequency tables—one-way, two-way, and three-way—using the robust capabilities of the R programming language. We will primarily utilize R’s built-in and highly versatile table() function, demonstrating its power in simplifying complex counting tasks for different data dimensions.

Setting Up the Example Data Frame for Analysis

To effectively illustrate the creation of frequency tables, we must first define a sample data frame. We will name this structure df, and it is designed to mimic operational transactional records. This frame includes three variables: the retail location (store), the value of daily sales transactions (sales), and the volume of returned goods (returns).

To ensure that the results are consistent and verifiable by all users, we employ the set.seed(0) function. This practice guarantees that the subsequent random generation of our sample data is fully reproducible, allowing the exact data frame structure to be generated every time the following code block is executed.

#make this example reproducible
set.seed(0)

#create data frame 
df <- data.frame(store=rep(c('A', 'B', 'C'), each=3),
                 sales=round(runif(9, 2, 6), 0),
                 returns=round(runif(9, 1, 3), 0))

#view data frame 
df

  store sales returns
1     A     6       2
2     A     3       1
3     A     3       1
4     B     4       1
5     B     6       2
6     B     3       2
7     C     6       3
8     C     6       2
9     C     5       2

The resulting df structure, displayed above, forms the essential foundation for our frequency calculations. The primary analytical goal is to determine how often each unique categorical value appears across the different variables, providing the necessary counts to perform subsequent statistical procedures, such as the Chi-Square test, which relies heavily on well-formed contingency tables.

Creating One-Way Frequency Tables in R (Univariate Analysis)

A one-way frequency table represents the simplest form of frequency analysis, focusing exclusively on the distribution of counts for a single variable. This process is formally known as univariate analysis. By examining only one variable, we gain an immediate, high-level overview of the data’s composition, identifying how frequently each specific category or level occurs within the dataset. This is often the first step in assessing data quality and identifying imbalances.

To generate this table, we call the table() function, providing it with a single vector—in this case, the store variable, accessed using R’s standard subsetting notation, df$store. The function automatically identifies all unique values within that vector and calculates the raw count for each occurrence.

#calculate frequency of each store
table(df$store)

A B C 
3 3 3 

The resulting output is a simple named array detailing the count for each unique store identifier. The interpretation is highly intuitive:

  • Store A, Store B, and Store C each account for 3 observations (rows) in the data frame.

This finding confirms that our sample data is perfectly balanced with respect to the store variable, a crucial piece of information necessary before proceeding with more complex statistical modeling or inferential tests.

Creating Two-Way Frequency Tables in R (Contingency Tables)

When the analysis incorporates a second variable, we transition from univariate to bivariate analysis. This generates a two-way frequency table, which is more commonly referred to as a contingency table or a cross-tabulation. The primary purpose of this table is to visualize the joint frequency distribution of the two selected variables, illustrating precisely how the categories of one variable interact with and relate to the categories of the other.

To construct this table using the table() function, we must pass two distinct vectors as arguments. Here, we analyze the relationship between df$store and df$sales, allowing us to quantify the count of observations for every possible combination of store location and sales value.

#calculate two-way frequency table
table(df$store, df$sales)

    3 4 5 6
  A 2 0 0 1
  B 1 1 0 1
  C 0 0 1 2 

The resulting structure is a matrix where the rows correspond to the first variable specified (store) and the columns correspond to the second variable (sales). Crucially, every cell within this matrix reports the count of data points that simultaneously satisfy both the row criterion and the column criterion. This visualization makes patterns immediately apparent.

For instance, interpreting the first row (Store A) reveals specific sales patterns:

  • The count of 2 at the intersection of Store A and Sales 3 indicates that Store A recorded sales of 3 units on two separate occasions.
  • Conversely, the count of 0 for Store A and Sales 4 confirms that Store A never recorded a 4-unit sale in this dataset.
  • The count of 1 for Store A and Sales 6 shows that Store A had one instance of a 6-unit sale.

Contingency tables are fundamental for preliminary tests of association. By inspecting the table, we can quickly hypothesize differences—for example, observing that Store B is the only location to have recorded a sales value of 4 suggests that sales patterns might indeed vary significantly across the different retail locations.

Creating Three-Way Frequency Tables in R (Multi-Dimensional Analysis)

The power of R’s table() function extends beyond two dimensions, enabling the creation of multi-dimensional frequency tables. When three variables are introduced, the output structure is presented as a series of two-way tables, effectively “sliced” by the unique levels of the third variable. This technique allows for highly granular analysis of three-way interactions.

We incorporate all three variables from our data frame—df$store, df$sales, and df$returns—by passing all three vectors to the function. This calculation reveals the joint distribution across store, sales volume, and return volume simultaneously.

#calculate three-way frequency table
table(df$store, df$sales, df$returns)

, ,  = 1

   
    3 4 5 6
  A 2 0 0 0
  B 0 1 0 0
  C 0 0 0 0

, ,  = 2

   
    3 4 5 6
  A 0 0 0 1
  B 1 0 0 1
  C 0 0 1 1

, ,  = 3

   
    3 4 5 6
  A 0 0 0 0
  B 0 0 0 0
  C 0 0 0 1 

The output is logically separated into distinct layers, where each layer corresponds to a specific value of the third variable (returns). Analyzing the first layer, labeled , , = 1, provides the joint frequencies only for observations where the volume of returns was exactly 1. Within this constrained subset, we can see that Store A recorded 3 sales twice, and Store B recorded 4 sales once.

This detailed, layered structure enables analysts to uncover specific situational patterns that would be completely obscured in lower-dimensional tables. For instance, by observing the final layer, , , = 3, we immediately identify that the only instance where 3 returns occurred was associated with a 6-unit sale at Store C. Such precision is valuable for deep diagnostic analysis.

Practical Limitations and Alternative Approaches

While the R environment technically permits the generation of frequency tables for four or five dimensions by continuously adding arguments to the table() function, the practical utility of these higher-dimensional structures decreases rapidly. The primary hurdle is the challenge of effective interpretation. As the number of variables increases, visualizing and drawing meaningful conclusions from the massive, layered output becomes exponentially difficult.

A second, critical technical limitation is the resulting sparsity of the table. Higher-dimensional tables frequently contain a vast number of zero counts, meaning that many possible combinations of variable levels simply do not exist in the dataset. This sparsity significantly inflates the size of the output without contributing proportional analytical value, often making the table unwieldy and non-informative.

For routine data analysis workflows, the one-way and two-way frequency tables remain the most practical, frequently employed, and easily interpretable tools. When the analysis necessitates examining the complex interactions of three or more variables, data scientists typically pivot toward more appropriate methodological alternatives. These often include advanced data visualization techniques, such as mosaic plots or heatmaps, or moving directly into formal statistical modeling rather than relying solely on raw frequency counts.

Additional Resources for R Data Manipulation

To further refine your proficiency in R data handling, statistical testing, and the creation of analytical summaries, please explore these related tutorials and resources:

How to Create Tables in R

How to Perform a Chi-Square Test of Independence in R

How to Perform a Chi-Square Goodness of Fit Test in R

Cite this article

Mohammed looti (2025). Learning to Create Frequency Tables in R: A Step-by-Step Guide. PSYCHOLOGICAL STATISTICS. Retrieved from https://statistics.arabpsychology.com/create-frequency-tables-in-r-with-examples/

Mohammed looti. "Learning to Create Frequency Tables in R: A Step-by-Step Guide." PSYCHOLOGICAL STATISTICS, 7 Nov. 2025, https://statistics.arabpsychology.com/create-frequency-tables-in-r-with-examples/.

Mohammed looti. "Learning to Create Frequency Tables in R: A Step-by-Step Guide." PSYCHOLOGICAL STATISTICS, 2025. https://statistics.arabpsychology.com/create-frequency-tables-in-r-with-examples/.

Mohammed looti (2025) 'Learning to Create Frequency Tables in R: A Step-by-Step Guide', PSYCHOLOGICAL STATISTICS. Available at: https://statistics.arabpsychology.com/create-frequency-tables-in-r-with-examples/.

[1] Mohammed looti, "Learning to Create Frequency Tables in R: A Step-by-Step Guide," PSYCHOLOGICAL STATISTICS, vol. X, no. Y, ص Z-Z, November, 2025.

Mohammed looti. Learning to Create Frequency Tables in R: A Step-by-Step Guide. PSYCHOLOGICAL STATISTICS. 2025;vol(issue):pages.

Download Post (.PDF)
Scroll to Top