澳洲幸运5开奖号码历史查询

Descriptive Statistics: Definition, Overview, Types, and Examples

Definition

Descriptive statistic༺s are techniques that sum💮marize a dataset’s main features. The data, which represents an entire population or a sample, is presented concisely in a way that doesn’t draw conclusions beyond the data.

What Are Descriptive Statistics?

Descriptive statistics are brief informational coefficients that summarize a given dataset, which can be either a representation of the entire 澳洲幸运5开奖号码历史查询:population or a sample of a population. Descriptive statistics are broken down into measures of central tendency and measures of variability (spread). Measures of central tendency include the mean, median, and mode, while measures of variability include 澳洲幸运5开奖号码历史查询:standard deviation, variance, minimum and maximum variables, kurtosis, and skewness.

Key Takeaways

  • Descriptive statistics summarize or describe the characteristics of a dataset.
  • Descriptive statistics consist of three basic categories of measures: measures of central tendency, measures of variability (or spread), and frequency distribution.
  • Measures of central tendency describe the center of the dataset (mean, median, mode).
  • Measures of variability describe the dispersion of the dataset (variance, standard deviation).
  • Measures of frequency distribution describe the occurrence of data within the dataset (count).
Descriptive Statistics

Jessica Olah / Investopedia

Understanding Descriptive Statistics

Descriptive statistics help describe anꩵd explain the features of a specific dataset by giving short summaries about the sample and measures of the data. The most recognized types of descriptive statꦰistics are measures of center. For example, the mean, median, and mode, which are used at almost all levels of math and statistics, are used to define and describe a dataset. The mean, or the average, is calculated by adding all the figures within the dataset and then dividing by the number of figures within the set.

For example, the sum of the following dataset is 20: (2, 3, 4, 5, 6). The mean is 4 (20/5). The mode of a dataset is the value appearing most often, and the median is the value situated in the middle of the dataset. It is the figure separating the higher figures from the lower figures within a dataset. However, there are less common types of descriptive statistics that are still very important.

People use descriptive statistics to repurpose hard-to-understand quantitative insights across a large dataset into bite-sized descriptions. A student’s grade-point average (GPA), for example, provides ꧋a good understanding of descriptive statistics. The idea of a GPA is that it takes data points from a range of individual course grades and averages them together to provide a general understanding of a student’s overall acad🍬emic performance. A student’s personal GPA reflects their mean academic performance.

Important

Descriptive statistics, especially in fields such as medicine, often visually depict data using scatter plots, histograms, line graphs, or stem and leaf displays. We’ll t🌞alk more about visuals later in this article.

Types of Descriptive Statistics

All descriptive statistics are either measures of central tendency or measures of variability, also known as measures of dispersion.

Central Tendency

Measures🅺 of central tendency focus on the average or middle values of datasets, whereas measures of variability focus on the dispersion of data. These two measures use graphs, tables, and general discussions to help people understand the me𓃲aning of the analyzed data.

Measures of central tendency describe the center po♉sition of a distribution for a dataset. A person analyzes the frequency of each data point in the distribution and describes it using the mean, median, or mode, which measures the most common patterns of the analyzed dataset.

Measures of Variability

Measures of variability (or measures of spread) aid in analyzing how dispersed the distribution is for a dataset. For ex𓄧ample, while the measures of central tendency may give a person the average of a dataset, it does not describe how the data is distributed within the𝓡 set.

So, while the average of the data might be 65 out of 100, there can still be data points at both 1 and 100. Measures of variability help communicate this by describing the shape and spread of the dataset. Range, 澳洲幸运5开奖号码历史查询:quartiles, absolute deviation, and variance are all examples of measures of variability.

Consider the following dataset: 5, 19, 24, 62, 91, 100. The range of thaꦕt dataset is 95, which is calculated by subtracting the lowest number (5) in the dataset from the highest (100).

Distribution

Distribution (or frequency distribution) refers to the number of times a data point occurs. Alternatively, it can be how manꦑy times a data point fails to occur. Consider this datase🐷t: male, male, female, female, female, other. The distribution of this data can be classified as:

  • The number of males in the dataset is 2.
  • The number of females in the dataset is 3.
  • The number of individuals identifying as other is 1.
  • The number of non-males is 4.

Univariate vs. Bivariate

In d🐓escriptive statistics, univariate data analyzes only one variable. It is used to identify characteristics of a single trait and is not used to analyze any relationships or causations.

For example, imagine a room full of high school students. Say you wanted to gather the average age of the individuals in the room. This univa꧑riate data is only dependent on one factor: each person’s age. By gathering this one piece of information fromꦬ each person and dividing by the total number of people, you can determine the average age.

Bivariate data, on the other hand, attempts to link two variables by searching for correlation. Two types of data are collected, and the relationship between the two pieces of information is analyzed together. Because multiple variables are analyzed, this approach may also be referred to as 澳洲幸运5开奖号码历史查询:multivariate.

Let’s say each high school student in the example above takes a col🐠lege assessment test, and we want to see whether older students are testing better than younger students. In addition to gathering the ages of the students, we need to find out each student’s test score. Then, using data analytics, we mathematically or graphically depict whether there is a relationship between student age and test scores.

Fast Fact

Preparing and reporting financial statements are an example of descriptive statistics. Analyzing that fin🧔ancial information to make decisions on the future is inferential statistics.

Descriptive Statistics and Visualizations

One essential aspect of descriptive statistics is graphical representation. Visualizing data distributions effectively can be♚ incredibly powerful, and this is done in several ways.

Histograms are tools for displaying the distribution of numerical data. They divide the data into bins or intervals and represent the frequency or count of data points falling into ea𝓰ch bin through bars of varying heights. Histog🍷rams help identify the shape of the distribution, central tendency, and variability of the data.

Another visualization is boxplots. Boxplots, also known as box-and-whisker plots, provide a concise summary of a data distribution by highlighting key summary statistics, including the median 💃(middle line inside the box), quartiles (edges of the box), and potential outliers (points outside, or the “whiskers”). Boxplots visually depict the spread and skewness of the data and are particularly useful for comparing distributions across different groups or variables.

Descriptive Statistics and Outliers

Whenever descriptive statistics are being discussed, it’s important to note outliers. Outliers are data ❀points that significantly differ from other observations in a dataset. These could be errors, anomalies, or rare events within the data.

Detecting and managing outliers is a step in descriptive statistics to ensure accurate and reliable data analysis. To identify outliers, you can use graphical techniques (such as boxplots or scatter plots) or statistical methods (such as Z-score or IQR method). These approaches help pinpoint observa🦋tions that devi♊ate substantially from the overall pattern of the data.

The presence of outliers can have a notable impact on descriptive statistics, skewing results and affecting the interpretation of data. Outliers can disproportionately influence measures of centra💎l tendency, such as the mean, pulling it toward their extreme values. For example, the dataset of (1, 1, 1, 997) is 250, even though that is hardly representative of the dataset. This distortion can lead to misleading conclusions about the typical behavior of the d🦩ataset.

Depending on the context, outliers can often be treated by removing them (if they are genuinely erroneous or irrelevant). Alternatively, outliers may hold important information and should be kept for the value they may be able to demo🦩nstrate. As you analyze your data, consider the relevance of what outliers can contribute and whether it makes more sense to just strike those data points from your descriptive statistic calculations.

Descriptive Statistics vs. Infereﷺntial Statistics

Descriptive statistics have a different function from infer🌟ential statistics, which are datasets that are used to make decisions or apply characteristics from one dataset to another.

Imagine another example where a company sells hot sauce. The company gathers data such as the count of sales, average quantity purchased per 澳洲幸运5开奖号码历史查询:transaction, and average sale per day of the week. All of this information is descriptive, as it tells a story of what actually happenedꦿ in the past. In this case, it is not being used beyond being informational.

Now let’s say that the company wants to roll out a new hot sauce. It gathers the same sales data above, but it uses the information to make predictions about what the sales of the new🍬 hot sauce will be. The act of using descriptive statistics and applying characteristics to a diff꧑erent dataset makes the dataset inferential statistics. We are no longer simply summarizing data; we are using it to predict what will happen regarding an entirely different body of data (in this case, the new hot sauce product).

Explain Like I’m 5

Imagine you have a large number of something and want to tell your friends about them,🐟 but there are so many of them. Descriptive statistics give your friends a quick su🔥mmary, so they know what you have without looking at every single item.

Descriptive statistics are about using numbers and often easy pictures, like charts, to tell a story about something being examined. They help you understand what is in the collection of something and explain it to others quickly.

What Do Descriptive Statistics Do?

Descriptive statistics ඣare a means of describing features of a dataset by generating summaries about data samples. For example, a population census may include descriptive statistics regarding the ratio of men and women in a specific ♐city.

What Are Examples of Descriptive Statistics?

In recapping a Major League Baseball season, for ex🍃ample, descriptive statistics might include team batting averages, the number of runs allowed per team, and the average wins per divisio🎃n.

What Is the Main Purpose of Descriptive Statistics?

The main purpose of descriptive statistics is to provide information about a dataset. In the e💮xample above, there are dozens of baseball teams, hundreds of players, and thousands of games. Descripti💝ve statistics summarizes large amounts of data into useful bits of information.

What Are the Types of Descriptive Statistics?

The three main types of descriptive statistics are frequency distribution, central tendency, and variability of a dataset. Fre𒀰quency distribution recor💫ds how often data occurs, central tendency records the data’s center point of distribution, and variability of a dataset records its degree of dispersion.

Can Descriptive Statistics Be Used to Make Inferences or Predictions?

Technically speaking, descriptive statistics only serve to help understand historical data attributes. Inferential statistics, a separate branch of statistics, are used to understand how variables interact with one another in a dataset and possibly predict what might happen in the future.

The Bottom Line

Descri👍ptive statistics refer to the analysis, summary, and communication of findings that describe a dataset. Often not useful for decision making, descriptive statistics still hold value in explaining high-level summaries of a set of information such a🅰s the mean, median, mode, variance, range, and count of information.

Article Sources
Investopedia requires writers to use primary sources to support their work. These include white papers, government data, original reporting, and interviews with industry experts. We also reference original research from other reputable publishers where appropriate. You can learn more about the standards we follow in producing accurate, unbiased content in our editorial policy.
  1. Purdue Online Writing Lab, Purdue University. “.”

  2. Nation𝔉al Center for Biotechnology Information. “.”

  3. California State University Northridge. “.”

  4. Kent State University, Department of ꧃Mathematical Sciences. “.”

  5. Purdue Online Writing Lab, Purdue University. “.”

Open a New Bank Account
The offers that appear in this table are from partnerships from which Investopedia receives compensation. This compensation may impact how and where listings appear. Investopedia does not include all offers available in the marketplace.

Related Articles