Outlier Calculator
Paste a list of numbers and detect outliers using three industry-standard methods: the classic 1.5 x IQR rule (Tukey fences), the Z-score threshold, and the robust Modified Z-score based on median absolute deviation. Switch methods or run all three at once to see where they agree.
Formula
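The three rules described on this page can be summarised as follows. This is a sketch assembled from the definitions given in the sections below; the symbols $\bar{x}$ (mean), $\sigma$ (population standard deviation) and $\tilde{x}$ (median) are notation chosen here for illustration, not taken from the calculator itself.

```latex
\begin{aligned}
\text{IQR} &= Q_3 - Q_1, &
  &\text{flag } x_i \text{ if } x_i < Q_1 - k \cdot \text{IQR}
   \ \text{ or } \ x_i > Q_3 + k \cdot \text{IQR} \\
Z_i &= \frac{x_i - \bar{x}}{\sigma}, &
  &\text{flag } x_i \text{ if } |Z_i| > 3.0 \\
M_i &= \frac{0.6745\,(x_i - \tilde{x})}{\text{MAD}}, \quad
  \text{MAD} = \operatorname{median}\bigl(|x_i - \tilde{x}|\bigr), &
  &\text{flag } x_i \text{ if } |M_i| > 3.5
\end{aligned}
```

Here k is 1.5 for the standard fences and 3.0 for the extreme fences.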
Worked example
For 4, 5, 6, 7, 8, 9, 10, 11, 12, 40: Q1 = 6, Q3 = 11, IQR = 5. Fences at 1.5 x IQR are -1.5 and 18.5, so 40 is the only IQR outlier. The mean is 11.2 and population sd is 9.91, giving Z = (40 - 11.2) / 9.91 = 2.91, which is below the 3.0 Z-score threshold so the Z-score method does not flag it. The Modified Z-score for 40 is 0.6745 x (40 - 8.5) / 2.5 = 8.50, well above 3.5, confirming 40 as a strong outlier.
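The arithmetic above can be checked with a few lines of Python. This is a minimal sketch using only the standard library; it assumes quartiles are taken as the medians of the lower and upper halves of the sorted data, which reproduces the Q1 and Q3 quoted in the example, and the variable names are illustrative.

```python
from statistics import mean, median, pstdev

data = [4, 5, 6, 7, 8, 9, 10, 11, 12, 40]
xs = sorted(data)
n = len(xs)

# Assumption: Q1 and Q3 are the medians of the lower and upper halves.
lower, upper = xs[: n // 2], xs[n // 2 :]
q1, q3 = median(lower), median(upper)
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Z-score of 40 using the population standard deviation, as in the example.
z_40 = (40 - mean(xs)) / pstdev(xs)

# Modified Z-score of 40 using the median and the median absolute deviation.
med = median(xs)
mad = median(abs(x - med) for x in xs)
m_40 = 0.6745 * (40 - med) / mad

print(q1, q3, iqr)                      # 6 11 5
print(low_fence, high_fence)            # -1.5 18.5
print(round(z_40, 2), round(m_40, 2))   # 2.91 8.5
```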
Three outlier detection methods explained
This calculator supports three widely used methods. The IQR (Tukey fences) method sorts your data, finds the first and third quartiles, and flags any value below Q1 - k x IQR or above Q3 + k x IQR. Because it is based on quartiles rather than the mean, it resists the very extremes it is designed to catch, which makes it a robust default for general use. John Tukey chose k = 1.5 as a practical balance: for normally distributed data it flags roughly 0.7% of values, while k = 3.0 catches only the most extreme cases. The Z-score method measures how many standard deviations each value sits from the mean; values whose absolute Z-score exceeds the threshold (commonly 3.0) are flagged. This method assumes approximate normality, and it is sensitive to extreme values because they pull the mean and inflate the standard deviation. The Modified Z-score (Iglewicz-Hoaglin, 1993) replaces the mean with the median and the standard deviation with the Median Absolute Deviation (MAD), making it far more robust when the data already contains extreme values. The recommended threshold is 3.5.
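The roughly 0.7% figure for k = 1.5 can be reproduced from the standard normal curve. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `fraction_outside_fences` is hypothetical and chosen only for this illustration.

```python
from statistics import NormalDist

def fraction_outside_fences(k: float) -> float:
    """Share of a normal population lying beyond the Tukey fences for a given k."""
    std_normal = NormalDist()        # mean 0, standard deviation 1
    q3 = std_normal.inv_cdf(0.75)    # third quartile, about 0.6745
    q1 = -q3                         # the normal curve is symmetric
    iqr = q3 - q1
    upper_fence = q3 + k * iqr
    # Both tails are equal, so double the upper-tail probability.
    return 2 * (1 - std_normal.cdf(upper_fence))

print(f"k = 1.5: {fraction_outside_fences(1.5):.2%}")   # about 0.70%
print(f"k = 3.0: {fraction_outside_fences(3.0):.5%}")
```

The third quartile of the standard normal, about 0.6745, is the same constant that appears in the Modified Z-score: it rescales the MAD so that it is comparable to a standard deviation when the data is normal.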
Five-number summary and what it tells you
Beyond the outlier list, this calculator reports the full five-number summary: minimum, Q1, median, Q3, and maximum. These five values anchor a box-and-whisker plot and give a quick picture of the data spread without assuming any particular distribution. The IQR (Q3 minus Q1) is the width of the middle 50% of your data. A large IQR relative to the range means the data is spread fairly evenly; a small IQR with distant extremes is a strong signal that outliers exist. The mean and standard deviation are also reported so you can judge whether the data is roughly symmetric (mean close to median) or skewed by extreme values.
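As an illustration, a five-number summary can be computed as sketched below. The function name `five_number_summary` is hypothetical, and the quartile convention is an assumption: Q1 and Q3 are the medians of the lower and upper halves, with the middle value left out of both halves when the count is odd. Other tools use different quartile conventions and may give slightly different results.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    xs = sorted(values)
    n = len(xs)
    if n < 4:
        raise ValueError("need at least four values")
    half = n // 2
    # Assumption: for an odd count the middle value belongs to neither half.
    lower, upper = xs[:half], xs[n - half:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

summary = five_number_summary([4, 5, 6, 7, 8, 9, 10, 11, 12, 40])
print(summary)   # (4, 6, 8.5, 11, 40)

# IQR as a share of the full range: a small share hints at distant extremes.
minimum, q1, med, q3, maximum = summary
print((q3 - q1) / (maximum - minimum))
```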
Mild vs extreme outliers and what to do next
Using k = 1.5 identifies mild outliers, values that are unusual but not impossibly far from the bulk. Using k = 3.0 identifies extreme outliers, values so distant that a data-entry error or instrument fault is a likely explanation, although a genuine rare event remains possible. The "All methods" mode runs IQR at k = 1.5, Z-score at threshold 3.0, and Modified Z-score at threshold 3.5 simultaneously. A value flagged by all three is a very strong outlier candidate. A value flagged by only one method deserves scrutiny but may simply reflect a skewed distribution. Whichever method you use, the goal is to investigate the flagged values, not to remove them blindly. Confirm or rule out entry errors, consider whether the value represents a real rare event, and document any decision to transform or exclude it.
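One way to picture the "All methods" mode is a function that records which methods flag each value. This is a minimal sketch, not the calculator's actual code: the name `flag_all_methods`, the method labels, the median-of-halves quartile convention, and the choice to skip a method when its spread measure is zero are all assumptions made for illustration.

```python
from statistics import mean, median, pstdev

def flag_all_methods(values, k=1.5, z_limit=3.0, m_limit=3.5):
    """Map each flagged value to the list of methods that flagged it."""
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    # Assumption: quartiles are the medians of the lower and upper halves.
    q1, q3 = median(xs[:half]), median(xs[n - half:])
    iqr = q3 - q1
    mu, sigma = mean(xs), pstdev(xs)   # population standard deviation
    med = median(xs)
    mad = median(abs(x - med) for x in xs)

    flags = {}
    for x in xs:
        hits = []
        if x < q1 - k * iqr or x > q3 + k * iqr:
            hits.append("iqr")
        # Assumption: skip a method when its spread measure is zero.
        if sigma > 0 and abs(x - mu) / sigma > z_limit:
            hits.append("z")
        if mad > 0 and abs(0.6745 * (x - med) / mad) > m_limit:
            hits.append("modified_z")
        if hits:
            flags[x] = hits
    return flags

print(flag_all_methods([4, 5, 6, 7, 8, 9, 10, 11, 12, 40]))
# {40: ['iqr', 'modified_z']}
```

On the worked-example data, 40 is flagged by two of the three methods, which matches the figures computed there: the Z-score of 2.91 falls just short of 3.0.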
Outlier detection method comparison
| Method | Flags a value when | Assumes normality | Robust to extremes | Best for |
|---|---|---|---|---|
| IQR (k = 1.5) | below Q1 - 1.5 x IQR or above Q3 + 1.5 x IQR | No | Yes | General use, skewed data |
| IQR (k = 3.0) | below Q1 - 3 x IQR or above Q3 + 3 x IQR | No | Yes | Extreme outliers only |
| Z-score (3.0) | \|Z\| > 3.0 | Yes | No | Large, bell-shaped samples |
| Modified Z (3.5) | \|M\| > 3.5 | No | Yes | Small samples, heavy-tailed data |
Choosing the right method depends on your data distribution and how robust you need the detection to be.
Frequently asked questions
What is an outlier and how is it detected?
An outlier is a value that falls far outside the typical range of the rest of the data. The most common detection rule is the 1.5 x IQR method: any value below Q1 - 1.5 x IQR or above Q3 + 1.5 x IQR is flagged. The Z-score method flags values more than 3 standard deviations from the mean. The Modified Z-score (MAD-based) flags values whose |0.6745 x (x - median) / MAD| exceeds 3.5.
What is the difference between mild and extreme outliers?
Mild outliers lie beyond the 1.5 x IQR fences but inside the 3 x IQR fences. Extreme outliers lie beyond the 3 x IQR fences. This two-level system, introduced by Tukey, lets you triage: extreme outliers are likely errors or very rare events, while mild outliers may simply reflect natural variation at the tails of a skewed distribution.
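The two-level triage can be sketched in a few lines. The function name `classify` and the sample data are illustrative only, and the quartiles again assume the median-of-halves convention.

```python
from statistics import median

def classify(values):
    """Return {value: 'mild' or 'extreme'} for values beyond the Tukey fences."""
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    # Assumption: quartiles are the medians of the lower and upper halves.
    q1, q3 = median(xs[:half]), median(xs[n - half:])
    iqr = q3 - q1

    labels = {}
    for x in xs:
        # Distance outside the Q1..Q3 box (zero or negative when inside it).
        distance = max(q1 - x, x - q3)
        if distance > 3.0 * iqr:
            labels[x] = "extreme"
        elif distance > 1.5 * iqr:
            labels[x] = "mild"
    return labels

# Illustrative data: Q1 = 6, Q3 = 12, IQR = 6, so the upper fences are 21 and 30.
print(classify([4, 5, 6, 7, 8, 9, 10, 11, 12, 25, 40]))
# {25: 'mild', 40: 'extreme'}
```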
Which outlier method should I use?
Use the IQR method (k = 1.5) for most practical purposes because it does not assume normality and is resistant to the extremes it is hunting for. Use the Z-score method only if you are confident your data is roughly bell-shaped and your sample is large. Use the Modified Z-score when you have a small sample or suspect the data already contains extreme contamination.
How many numbers do I need?
You need at least four values so the data can be split into a lower and an upper half to compute Q1 and Q3. The Z-score and Modified Z-score calculations on their own need only two values for the standard deviation or MAD to be meaningful, but the four-value minimum applies to every method: with fewer than four numbers no results are reported.
Should I delete outliers from my data?
Not automatically. An outlier may be a data-entry error, a measurement glitch, or a genuine rare event. Investigate the cause first: check for typos, instrument failures, and whether the value is physically plausible. Only remove or transform a value if you can justify and document the reason. In many cases keeping the outlier and using a robust statistical method is the right approach.
What is the Modified Z-score and why is it better for small samples?
The Modified Z-score (Iglewicz and Hoaglin, 1993) uses the median and the Median Absolute Deviation (MAD) instead of the mean and standard deviation. Because the median and MAD are not influenced by extreme values the way the mean and standard deviation are, this method is much more reliable when the data already contains a few very large or very small values. The formula is M_i = 0.6745 x (x_i - median) / MAD, and values with |M_i| > 3.5 are flagged.