📊 Statistics Calculator

Calculate mean, median, mode, standard deviation, variance, quartiles, and more. Includes box plot visualization and outlier detection for comprehensive statistical analysis.

Enter Your Data

Enter Numbers (comma-separated or one per line)

Tip: Separate numbers with commas, spaces, or line breaks

Statistical Results

Mean (Average)

Median

Mode

Range

Standard Deviation

Variance

Sum

Count

Minimum

Maximum

Q1 (25th Percentile)

Q3 (75th Percentile)

Box Plot Visualization

Sorted Dataset


Understanding Statistical Measures

Measures of Central Tendency

Mean (Average): The sum of all values divided by the count. Most affected by extreme values (outliers). Best used when data is normally distributed without significant outliers.

Median: The middle value when data is sorted. Not affected by outliers. Better than mean for skewed distributions or data with outliers. For even-numbered datasets, it's the average of the two middle values.

Mode: The most frequently occurring value(s). Useful for categorical data or finding the most common value. A dataset can have no mode, one mode (unimodal), or multiple modes (bimodal, multimodal).
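The three measures can be sketched in a few lines of Python using only the standard library. The helper name `modes` and the sample data are illustrative assumptions, not this calculator's actual code; the helper returns an empty list when no value repeats, matching the "no mode" case described above.

```python
from collections import Counter
from statistics import mean, median


def modes(values):
    """Return the most frequent value(s), or [] when no value repeats."""
    counts = Counter(values)
    if not counts:  # empty input has no mode
        return []
    top = max(counts.values())
    if top == 1:  # every value is unique, so there is no mode
        return []
    return sorted(v for v, c in counts.items() if c == top)


data = [2, 3, 3, 5, 7, 7, 9]  # illustrative data
print(mean(data))        # sum of values divided by the count
print(median(data))      # 5, the middle value of the sorted data
print(modes(data))       # [3, 7], a bimodal dataset
print(modes([1, 2, 3]))  # [], no mode
```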

Measures of Dispersion (Spread)

Range: The difference between the maximum and minimum values. Simple measure of spread but sensitive to outliers.

Variance: The average of squared differences from the mean (for a sample, the sum of squared differences is divided by N-1 rather than N). Measures how spread out the data is. Units are squared, making interpretation less intuitive.

Standard Deviation: The square root of variance. Most commonly used measure of spread. Same units as the original data. About 68% of data falls within 1 standard deviation of the mean in normal distributions.

Interquartile Range (IQR): The difference between Q3 and Q1. Contains the middle 50% of data. Resistant to outliers and useful for identifying them.

Quartiles and Box Plots

Q1 (First Quartile): 25% of data falls below this value.

Q2 (Second Quartile/Median): 50% of data falls below this value.

Q3 (Third Quartile): 75% of data falls below this value.

Box plots visualize the five-number summary: minimum, Q1, median, Q3, and maximum. They make it easy to see the distribution shape, central tendency, and identify potential outliers.
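As a rough sketch, the five-number summary can be computed with Python's standard library. Quartile conventions differ between tools, so the `method="inclusive"` choice below is an assumption and may not match this calculator's Q1 and Q3 exactly; `five_number_summary` is a hypothetical helper name.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    # Assumption: "inclusive" linear interpolation; other tools may use a
    # different quartile convention and give slightly different Q1/Q3.
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return data[0], q1, q2, q3, data[-1]


print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```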

When to Use Each Measure

Use Mean when: Data is normally distributed, no significant outliers, you need all values to contribute equally
Use Median when: Data is skewed, contains outliers, or for ordinal data
Use Mode when: Working with categorical data or finding the most typical value
Use Standard Deviation when: Data is normally distributed and you want to understand typical deviation from the mean
Use IQR when: Data contains outliers or is not normally distributed

Outlier Detection

Outliers are data points that differ significantly from other observations. They can indicate measurement errors, data entry errors, or genuinely exceptional cases.

This calculator identifies outliers using the IQR method:

Values below Q1 - 1.5 × IQR are lower outliers
Values above Q3 + 1.5 × IQR are upper outliers

Descriptive Statistics: Mean, Median, Mode and When to Use Which

The Three Measures of Central Tendency

Mean, median, and mode all describe the "center" of a dataset, but they're appropriate for different situations. Choosing the wrong measure can seriously mislead your analysis.

Measure	Calculation	Best Used For	Sensitive to Outliers?
Mean (Average)	Sum ÷ Count	Symmetric distributions; financial calculations	Yes, highly
Median	Middle value when sorted	Income, prices, skewed data	No, robust
Mode	Most frequent value	Categorical data; shoe/clothing sizes	No

Why the Mean Can Mislead

Consider home prices in a neighborhood: $200K, $220K, $210K, $250K, $215K, $1,800K (one mansion). Mean: about $482K, which says nothing useful about a typical home. Median: $217.5K (the average of the two middle values, $215K and $220K), which accurately represents the middle of the market.
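The arithmetic for this example can be checked with a short Python sketch. The variable names are illustrative; the prices are the six values listed above, in thousands of dollars.

```python
from statistics import mean, median

# Home prices from the example above, in thousands of dollars
prices = [200, 220, 210, 250, 215, 1800]

print(mean(prices))    # 482.5
print(median(prices))  # 217.5 (average of the middle pair, 215 and 220)

# Dropping the mansion shows how far a single value pulls the mean
typical = [p for p in prices if p < 1000]
print(mean(typical))    # 219
print(median(typical))  # 215
```

Removing one house moves the mean by more than $260K but the median by only $2.5K, which is what "robust to outliers" means in practice.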

This is why median household income is usually reported instead of mean income: a few billionaires would inflate the mean dramatically. Median income gives a more accurate picture of what a typical household earns.

When Mode Matters Most

Mode is essential for categorical (non-numeric) data where mean and median are meaningless. If you survey shoe size preferences, mode (most popular size) is the only meaningful average. In continuous data, modes show up as peaks in the distribution; multiple peaks (a multimodal distribution) often reveal subgroups in your data.

Practical Rule

If your data has outliers or is clearly skewed (income, prices, response times), use the median. If data is roughly symmetric with no extreme outliers (heights, test scores in a class), use the mean. The mean is also appropriate when every value should carry equal weight in a decision.

Standard Deviation Demystified: What It Really Tells You

Understanding Spread, Not Just Center

Two datasets can have identical means but completely different distributions. Standard deviation (SD or s) quantifies how spread out data points are from the mean. A small SD means data clusters tightly around the average; a large SD means data is highly variable.

Example: Test scores. Class A: 70, 72, 68, 73, 71. Mean = 70.8, SD ≈ 1.9 (very consistent). Class B: 40, 60, 90, 85, 79. Mean = 70.8, SD ≈ 20.6 (widely varied). Same mean, completely different teaching outcomes. (SD values use the sample formula, which divides by N-1.)
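A quick Python check of the two classes, using the standard library's sample standard deviation (`statistics.stdev`, which divides by N-1). With that formula the SDs come out to about 1.9 and 20.6.

```python
from statistics import mean, stdev

class_a = [70, 72, 68, 73, 71]
class_b = [40, 60, 90, 85, 79]

for name, scores in (("Class A", class_a), ("Class B", class_b)):
    # stdev() is the sample standard deviation (divides by N - 1)
    print(f"{name}: mean = {mean(scores):.1f}, SD = {stdev(scores):.1f}")
```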

The Empirical Rule (68-95-99.7)

For data that follows a normal distribution (bell curve), the empirical rule states:

68% of values fall within ±1 standard deviation of the mean
95% of values fall within ±2 standard deviations of the mean
99.7% of values fall within ±3 standard deviations (virtually all data)

This is why "within 2 standard deviations" is often used as a benchmark for "normal" in quality control and scientific research. Values beyond ±3 SD are extremely unusual (about 0.3% of values) and often warrant investigation as potential outliers or errors.
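The 68-95-99.7 figures are rounded values of the normal distribution's cumulative probabilities. A minimal sketch with Python's standard library (`statistics.NormalDist`) reproduces them; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

shares = {}
for k in (1, 2, 3):
    # Probability that a normal value lies within k SDs of the mean
    shares[k] = standard_normal.cdf(k) - standard_normal.cdf(-k)
    print(f"within ±{k} SD: {shares[k]:.2%}")
```

The unrounded shares are roughly 68.27%, 95.45%, and 99.73%.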

Variance vs Standard Deviation

Variance is simply SD squared (SD²). Both measure spread, but SD is expressed in the same units as your original data (dollars, meters, points), making it more interpretable. Variance is useful in mathematical calculations (such as ANOVA) where squared units are needed. For communication and interpretation, always report SD.

Z-Scores: Putting SD to Work

Z-scores (z = (x - mean) / SD) standardize values from any distribution to a common scale. A z-score of +2 means the value is 2 standard deviations above the mean; in a normal distribution, only about 2.3% of values lie higher. Because z-scores are unit-free, they let you compare values measured on completely different scales.
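A small Python sketch of the z-score formula. The two tests, their means, and their SDs are hypothetical numbers chosen for illustration, and `z_score` is an assumed helper name.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Standardize x: how many SDs it lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# Hypothetical example: two tests scored on different scales
z_math = z_score(82, mean=70, sd=8)        # 1.5
z_verbal = z_score(640, mean=500, sd=100)  # 1.4
print(z_math, z_verbal)

# Share of a normal distribution lying above z = +2
upper_tail = 1 - NormalDist().cdf(2)
print(f"{upper_tail:.4f}")
```

Here the math score is slightly more exceptional than the verbal score, even though the raw numbers are not directly comparable.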

Outlier Detection and Skewed Distributions Explained

Identifying Outliers: IQR Method

The Interquartile Range (IQR) method is a widely used, robust approach to outlier detection, because quartiles are barely affected by the extreme values being tested. Calculate:

Q1 (25th percentile) and Q3 (75th percentile)
IQR = Q3 - Q1
Lower fence = Q1 - 1.5 × IQR
Upper fence = Q3 + 1.5 × IQR
Values outside these fences are potential outliers
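The steps above can be sketched in Python as follows. The function name `iqr_outliers` is hypothetical, and the `method="inclusive"` quartile convention is an assumption; a tool that computes quartiles differently will place the fences slightly differently.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return (lower_fence, upper_fence, outliers) using the IQR rule."""
    # Assumption: "inclusive" quartiles; other conventions shift the fences
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return lower, upper, outliers


prices = [200, 220, 210, 250, 215, 1800]  # home prices in $K from this page
print(iqr_outliers(prices))
```

For the home price data, only the $1,800K mansion falls outside the fences.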

Understanding Data Skewness

Distribution Type	Relationship	Examples	Recommended Central Measure
Symmetric (Normal)	Mean ≈ Median ≈ Mode	Heights, test scores, measurement errors	Mean
Right-skewed (Positive)	Mean > Median > Mode	Income, wealth, home prices, wait times	Median
Left-skewed (Negative)	Mean < Median < Mode	Age at death, exam scores near ceiling	Median
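The mean-versus-median relationship in the table suggests a crude first check for skew direction. The sketch below is only a heuristic under that assumption (it ignores the mode and the size of the gap), and `skew_direction` is a hypothetical helper name.

```python
from statistics import mean, median


def skew_direction(values):
    """Rough skew hint from comparing the mean with the median."""
    m, med = mean(values), median(values)
    if m > med:
        return "right-skewed (mean > median)"
    if m < med:
        return "left-skewed (mean < median)"
    return "roughly symmetric (mean = median)"


print(skew_direction([200, 220, 210, 250, 215, 1800]))  # right-skewed
print(skew_direction([1, 2, 3, 4, 5]))                  # roughly symmetric
```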

What to Do With Outliers

Do not automatically remove outliers! First, investigate: Is it a data entry error? (Fix or remove.) Is it a legitimate extreme value? (Keep it; it's real.) Is it from a different population? (May need to model separately.) Removing legitimately extreme values can bias your analysis so that it no longer reflects reality.

When outliers persist after investigation, consider: reporting results with and without outliers; using median instead of mean; applying logarithmic transformation if data spans orders of magnitude; or using robust statistical methods designed to handle outliers.

Box Plot as a Diagnostic Tool

The box plot (box-and-whisker plot) visually shows median, quartiles, and outliers at a glance. Our statistics calculator generates one automatically. A box plot should be your first step when exploring a new dataset: it immediately reveals skewness, spread, and outliers before you even calculate a single statistic.

Statistical Significance & Reference Tables

Understanding statistical concepts is essential for interpreting data correctly. Here are the most commonly used critical values and distribution thresholds:

Confidence Level	Significance Level (α)	Z-score (two-tailed)	Common Use
80%	α = 0.20	±1.282	Exploratory analysis
90%	α = 0.10	±1.645	Business decisions
95%	α = 0.05	±1.960	Standard in most research
98%	α = 0.02	±2.326	Clinical studies
99%	α = 0.01	±2.576	Pharmaceutical trials
99.7%	α = 0.003	±3.000 (3-sigma)	Manufacturing / Six Sigma
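These critical values come from the inverse of the standard normal cumulative distribution, evaluated at 1 - α/2. A minimal Python sketch using `statistics.NormalDist` is shown below; `two_tailed_z` is a hypothetical helper name.

```python
from statistics import NormalDist


def two_tailed_z(confidence):
    """Critical z for a two-tailed interval at the given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.80, 0.90, 0.95, 0.98, 0.99, 0.997):
    print(f"{level:.1%}: ±{two_tailed_z(level):.3f}")
```

Computed this way, the 99.7% level gives roughly ±2.97; the ±3.000 in the table is the rounded 3-sigma convention, which corresponds to about 99.73% confidence.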

Descriptive Statistics: Quick Definitions

Measure	Definition	Best Used When
Mean	Sum ÷ count	Data has no outliers, symmetric distribution
Median	Middle value when sorted	Outliers present (incomes, home prices)
Mode	Most frequent value	Categorical data, bimodal distributions
Std Deviation	Square root of variance (typical distance from mean)	Measuring variability around the mean

Frequently Asked Questions

What's the difference between population and sample standard deviation?

Population standard deviation divides by N (total count) and is used when you have data for the entire population. Sample standard deviation divides by N-1 (degrees of freedom) and is used when you have a sample that represents a larger population. This calculator uses sample standard deviation (N-1), which is more common in practice.
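The only difference between the two formulas is the divisor, as this Python sketch shows. It is an illustration of the definitions, not this calculator's actual code; `std_dev` and its `sample` flag are assumed names, and the sample version needs at least two values.

```python
from math import sqrt


def std_dev(values, sample=True):
    """Standard deviation; sample=True divides by N - 1, otherwise by N."""
    n = len(values)
    mean = sum(values) / n
    squared_diffs = sum((x - mean) ** 2 for x in values)
    return sqrt(squared_diffs / (n - 1 if sample else n))


scores = [70, 72, 68, 73, 71]  # Class A scores from this page
print(round(std_dev(scores, sample=False), 2))  # 1.72 (population, N)
print(round(std_dev(scores, sample=True), 2))   # 1.92 (sample, N - 1)
```

The sample value is never smaller than the population value, and the gap shrinks as N grows.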

How do I interpret standard deviation?

Standard deviation tells you how spread out your data is. A small standard deviation means data points are close to the mean, while a large standard deviation means they're more spread out. In a normal distribution, approximately 68% of values fall within 1 standard deviation of the mean, 95% within 2 standard deviations, and 99.7% within 3 standard deviations.

When should I use median instead of mean?

Use median when your data is skewed or contains outliers. For example, median is better for income data because a few very high incomes can skew the mean upward. Median gives a better sense of the "typical" value in such cases. Mean is better when data is symmetrically distributed without outliers.

What if my dataset has no mode?

If all values appear only once, there is no mode. This is common in continuous data or datasets with many unique values. In such cases, mode isn't a useful measure of central tendency, and you should focus on mean or median instead.

How are outliers identified in this calculator?

Outliers are identified using the IQR (Interquartile Range) method. Any value less than Q1 - 1.5 × IQR or greater than Q3 + 1.5 × IQR is considered an outlier. This is the same method used in box plots and is a standard statistical technique.

What is variance and why is it important?

Variance measures the average squared deviation from the mean. It's important in statistical analysis and is the foundation for many other statistical measures. However, because it's in squared units, standard deviation (the square root of variance) is often more interpretable for describing data spread.

How do I read a box plot?

A box plot shows five key values: minimum, Q1, median, Q3, and maximum. The box represents the middle 50% of data (from Q1 to Q3), with a line at the median. The "whiskers" extend to the minimum and maximum or, when outliers are drawn as separate points, to the most extreme values that are not outliers. Box plots make it easy to see if data is symmetric or skewed, identify the spread, and spot potential outliers.

Can I paste data from Excel or other spreadsheets?

Yes! You can copy data from Excel, Google Sheets, or any spreadsheet and paste it directly into the calculator. The calculator automatically handles commas, spaces, tabs, and line breaks as separators. You can paste a single column, row, or even a range of cells.
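For readers who want to reproduce this input handling, here is one way it could be done in Python. This is a sketch of the described behavior, not the calculator's actual parser; `parse_numbers` is a hypothetical name.

```python
import re


def parse_numbers(text):
    """Split pasted text on commas, spaces, tabs, or line breaks."""
    tokens = re.split(r"[,\s]+", text.strip())
    # Skip empty tokens left by leading or trailing separators
    return [float(tok) for tok in tokens if tok]


print(parse_numbers("1, 2\t3\n4.5 6"))  # [1.0, 2.0, 3.0, 4.5, 6.0]
```

Because commas act as separators in this sketch, a value pasted with a thousands separator such as 1,800 would be read as two numbers, so strip that formatting before pasting.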