Skewness

Skewness measures the asymmetry of a distribution. It’s the third standardized moment:

$\text{skewness} = \frac{1}{N} \sum_{i=0}^{N-1} \left( \frac{x(i) - \mu_{x}}{\sigma_{x}} \right)^{3}$

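Here $\mu_{x}$ and $\sigma_{x}$ are the mean and standard deviation of the $N$ samples. The cube preserves sign (positive when $x(i)$ is above the mean, negative when below), so the sum is sensitive to asymmetry. A perfectly symmetric distribution has skewness zero: positive and negative cubed deviations from the mean cancel. The converse does not hold: an asymmetric distribution usually has nonzero skewness, but its skewness can still be zero if the two sides happen to balance in the third moment.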
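The moment formula translates almost line for line into NumPy. The following is a minimal sketch, not a reference implementation; the function name `population_skewness` and the two small example arrays are made up here for illustration.

```python
import numpy as np

def population_skewness(x):
    """Third standardized moment, using the population (ddof=0) standard deviation."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    sigma = x.std()          # NumPy's default ddof=0 matches the 1/N formula
    z = (x - mu) / sigma     # standardized deviations; a constant signal gives sigma = 0 and NaN
    return np.mean(z ** 3)

# Illustrative data: one symmetric set, one with a single large value on the right.
symmetric = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
right_tailed = np.array([1.0, 1.0, 2.0, 2.0, 3.0, 10.0])

skew_symmetric = population_skewness(symmetric)
skew_right = population_skewness(right_tailed)
print(skew_symmetric, skew_right)
```

The symmetric array gives a value of zero (up to floating-point error), while the array with one large value on the right gives a positive value.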

Positive skew (right-skewed). A long tail on the right. The bulk of the data sits on the left, and a few unusually large values pull a thin tail off to the right. The mean (which the tail tugs on) typically ends up to the right of the median, which is typically to the right of the mode. Rule-of-thumb ordering: mode, median, mean from left to right.

Negative skew (left-skewed). Mirror image. A long tail on the left. The bulk sits on the right; a thin tail extends left. Rule-of-thumb ordering: mean, median, mode from left to right.

Symmetric distribution. Skewness $= 0$. The mean and median coincide; the mode coincides with them for unimodal symmetric distributions.

The mode–median–mean ordering rule is reliable for typical unimodal distributions like the log-normal but is not a theorem — counterexamples exist for multimodal and unusually-shaped distributions (see, e.g., von Hippel 2005). For most engineering signals it holds; for pathological distributions it can fail. Don’t use the rule as a definition of skewness — use the moment formula above.
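To see the rule of thumb on one concrete case, the sketch below draws a log-normal sample and compares a rough mode estimate, the median, and the mean. The sample size, seed, and histogram binning are arbitrary choices made for this illustration, and the histogram peak is only a crude estimate of the mode.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)  # right-skewed sample

mean = x.mean()
median = np.median(x)

# Crude mode estimate: centre of the fullest histogram bin on [0, 5].
counts, edges = np.histogram(x, bins=100, range=(0.0, 5.0))
peak = np.argmax(counts)
mode_estimate = 0.5 * (edges[peak] + edges[peak + 1])

# Skewness from the moment formula (population standard deviation).
z = (x - mean) / x.std()
skewness = np.mean(z ** 3)

print(mode_estimate, median, mean, skewness)
```

For this sample the ordering comes out as mode, median, mean from left to right, with positive skewness. That is one example consistent with the rule, not evidence that the rule holds in general.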

In Feature extraction from signals, skewness is a useful feature for distinguishing classes that produce signals with different shapes. ECG heartbeats associated with certain arrhythmias are skewed differently from healthy heartbeats. Activity-recognition signals from running are skewed differently from walking. Skewness captures asymmetry that mean and standard deviation can’t.

Skewness is the third of the first four moments used as standard signal features, alongside the mean, Standard deviation, and Kurtosis. Together they describe the shape of the distribution of values in a window.
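As a sketch of how the four moments might be turned into per-window features, the code below splits a signal into non-overlapping windows and computes one row of features per window. The function name `moment_features`, the window size, and the synthetic signal are assumptions made for illustration. Note that `scipy.stats.kurtosis` returns excess kurtosis (normal distribution = 0) by default.

```python
import numpy as np
from scipy import stats

def moment_features(signal, window_size):
    """Return an array of shape (n_windows, 4): mean, std, skewness, excess kurtosis."""
    signal = np.asarray(signal, dtype=float)
    n_windows = len(signal) // window_size
    # Drop the incomplete tail so the signal reshapes into equal-length windows.
    windows = signal[: n_windows * window_size].reshape(n_windows, window_size)
    return np.column_stack([
        windows.mean(axis=1),
        windows.std(axis=1),             # population standard deviation (ddof=0)
        stats.skew(windows, axis=1),     # uncorrected skewness (bias=True default)
        stats.kurtosis(windows, axis=1), # excess kurtosis (fisher=True default)
    ])

rng = np.random.default_rng(0)
signal = rng.normal(size=1000)           # placeholder signal, not real sensor data
features = moment_features(signal, window_size=100)
print(features.shape)
```

Each row can then be passed to a classifier as the feature vector for its window.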

In Pandas: `df['col'].skew()` for the whole column, `df['col'].rolling(N).skew()` for a rolling window. In SciPy: `scipy.stats.skew(arr)`.
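A short usage sketch of these calls follows. The column name `col`, the synthetic exponential data, and the window length of 50 are placeholders chosen for illustration.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(0)
df = pd.DataFrame({"col": rng.exponential(scale=1.0, size=500)})  # right-skewed data

whole_column = df["col"].skew()                 # one number for the full column
rolling = df["col"].rolling(50).skew()          # one value per position, NaN until the window fills
scipy_value = stats.skew(df["col"].to_numpy())  # SciPy on the underlying array

print(whole_column, scipy_value)
print(rolling.tail())
```

With the default `min_periods`, the first 49 rolling values are NaN because the window is not yet full. The Pandas and SciPy numbers differ slightly because the two libraries use different defaults for bias correction.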

The formula above is the population skewness. Sample-skewness formulas often include a correction factor such as $\frac{N^{2}}{(N-1)(N-2)}$ (applied when $\sigma_{x}$ is replaced by the sample standard deviation) to reduce the estimator’s bias. Pandas uses one such corrected form by default, whereas `scipy.stats.skew` returns the uncorrected form unless `bias=False` is passed. For long signals the correction is negligible, but for short windows it can matter. The textbook uses the uncorrected form for clarity.
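The size of the correction can be checked numerically. The sketch below compares the uncorrected value with the corrected one on a short, made-up array. It assumes the corrected form is the adjusted Fisher-Pearson coefficient, which relative to the uncorrected formula amounts to multiplying by $\frac{\sqrt{N(N-1)}}{N-2}$.

```python
import numpy as np
import pandas as pd
from scipy import stats

x = np.array([2.0, 3.0, 3.0, 4.0, 5.0, 7.0, 12.0, 20.0])  # illustrative short window
n = len(x)

g1 = stats.skew(x)                   # uncorrected (population) skewness, bias=True default
G1 = stats.skew(x, bias=False)       # bias-corrected skewness
pandas_value = pd.Series(x).skew()   # Pandas default

# Ratio between corrected and uncorrected values for this window length.
factor = np.sqrt(n * (n - 1)) / (n - 2)
print(g1, G1, pandas_value, factor)

# The same ratio for longer windows, to show how quickly it approaches 1.
for length in (10, 100, 10_000):
    print(length, np.sqrt(length * (length - 1)) / (length - 2))
```

For a window this short the corrected value is noticeably larger in magnitude than the uncorrected one, and the ratio shrinks toward 1 as the window grows.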

Idriss Rami — Notes
