Last compiled · 2026-03-17 02:35:06 UTC

How it works

Where the data comes from

All of the data on this site is pulled from GitHub at build time. For each tracked package, we fetch every release that has been published, along with every issue, open or closed, that has been filed against that package's repository. We do not use any proprietary or internal data from Pixel & Tonic, and we are not affiliated with them in any way.

Issues are matched to a major version based on their labels and content. Releases are grouped by their semantic version numbers into major, minor, and patch levels.

Release periods

The fundamental unit of measurement on this site is a "release period", which is the window of time between two consecutive releases. For example, if version 5.4.1 was published on January 10th and version 5.4.2 was published on February 3rd, the release period for 5.4.1 spans those 24 days.

During each release period, we count how many issues were opened, how many were closed, and how many were specifically referenced in the next release's notes (which we call "fixed in release"). This gives us a picture of how much activity and how much resolution happened between each pair of releases.
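
As a minimal sketch, a release period and its tallies might be modelled like this (the field names are my own, not necessarily what the site uses, and the counts below are illustrative):

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ReleasePeriod:
    """The window between two consecutive releases of a package."""
    version: str           # the release that opens the window
    start: date            # when this release was published
    end: date              # when the next release was published
    opened: int            # issues opened during the window
    closed: int            # issues closed during the window
    fixed_in_release: int  # issues referenced in the next release's notes

    @property
    def days(self) -> int:
        return (self.end - self.start).days

# The worked example from the text: 5.4.1 published January 10th,
# 5.4.2 published February 3rd (the year is illustrative).
period = ReleasePeriod("5.4.1", date(2026, 1, 10), date(2026, 2, 3),
                       opened=7, closed=5, fixed_in_release=6)
print(period.days)  # 24
```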

The three chart metrics

The chart on each version page can be switched between three different views using the dropdown. Each one shows a different way of looking at the same underlying issue data.

Issues by release

This is the simplest view. It shows the raw count of issues that were opened during each release period. If 7 issues were opened between version 5.4.1 and version 5.4.2, the data point for that period is 7. This is useful for getting a sense of absolute volume, but it does not account for how long each release period lasted. A release that was out for 60 days will naturally accumulate more issues than one that was out for 5 days, even if the underlying rate of issues is the same.

Issues by day

This view normalises the issue count by the number of days in the release period. It divides the number of issues opened by the number of days between releases to give an average daily issue rate. Using the same example, if 7 issues were opened over 24 days, the value would be approximately 0.29 issues per day. This is better for comparing release periods of different lengths on equal footing.
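
The normalisation is a straight division; a sketch (the zero-day guard is my assumption, in case two releases land on the same day):

```python
def issues_per_day(opened: int, days: int) -> float:
    """Average daily issue rate for a release period."""
    # Guard against zero-length periods (same-day releases); the exact
    # behaviour the site uses for this edge case is an assumption here.
    return opened / days if days > 0 else float(opened)

print(round(issues_per_day(7, 24), 2))  # 0.29
```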

Fix rate by release

This view shows what percentage of issues opened during a release period were referenced as fixed in the subsequent release's notes. If 10 issues were opened and 6 of them were mentioned in the next release's changelog, the fix rate for that period is 60%. A high fix rate suggests that the maintainers are actively addressing reported issues in their releases, while a low or declining fix rate might suggest that issues are piling up unresolved.
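
Expressed as code, using the example from the text (how the site handles a period with zero opened issues is not documented, so the guard here is an assumption):

```python
def fix_rate(opened: int, fixed_in_next_release: int) -> float:
    """Percentage of issues opened in a period that the next
    release's notes reference as fixed."""
    if opened == 0:
        return 0.0  # assumption: no opened issues means no meaningful rate
    return 100.0 * fixed_in_next_release / opened

print(fix_rate(10, 6))  # 60.0
```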

Trend lines

Each chart also shows a dashed trend line alongside the main data. This trend line is calculated using cumulative linear regression, which means that the trend value at each point only considers the data from the beginning up to and including that point. It does not look ahead at future data.

At the first data point, there is nothing to regress on, so the trend value is just the raw value itself. At the second point, we fit a straight line through the first two values and take the fitted end value. At the third point, we fit a line through all three values, and so on. This means the trend line starts out volatile (because it is based on very little data) and gradually stabilises as more releases are included.
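
The procedure described above can be sketched as follows: at each point we fit an ordinary least-squares line through everything seen so far and take the fitted value at the current position. This is my reading of "cumulative linear regression", not the site's actual code:

```python
def cumulative_trend(values: list[float]) -> list[float]:
    """For each point i, fit a least-squares line through values[0..i]
    and take the fitted value at position i. Never looks ahead."""
    trend = []
    for i in range(len(values)):
        window = values[: i + 1]
        n = len(window)
        if n == 1:
            trend.append(window[0])  # nothing to regress on yet
            continue
        xs = range(n)
        mean_x = sum(xs) / n
        mean_y = sum(window) / n
        cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, window))
        var = sum((x - mean_x) ** 2 for x in xs)
        slope = cov / var
        intercept = mean_y - slope * mean_x
        trend.append(intercept + slope * (n - 1))  # fitted end value
    return trend

print(cumulative_trend([4.0, 6.0, 5.0]))  # [4.0, 6.0, 5.5]
```

Note how the second trend value passes exactly through the second data point (a line through two points has no residual), matching the volatility described above.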

The "Trending up", "Trending down", or "Stable" label next to the chart heading is determined by comparing the first and last values of the trend line. If the last value is more than 10% higher than the first, we call it "trending up". If it is more than 10% lower, it is "trending down". Anything in between is considered "stable". The percentage change is shown in brackets next to the label.
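
The labelling rule reduces to a first-versus-last comparison with a ±10% band. A sketch (the handling of a zero first value is my assumption; the site's behaviour there is not documented):

```python
def trend_label(trend: list[float]) -> str:
    """Label a trend line by comparing its first and last values."""
    first, last = trend[0], trend[-1]
    if first == 0:
        return "Stable (0%)"  # assumption: avoid dividing by zero
    change = 100.0 * (last - first) / abs(first)
    if change > 10:
        return f"Trending up ({change:+.0f}%)"
    if change < -10:
        return f"Trending down ({change:+.0f}%)"
    return f"Stable ({change:+.0f}%)"

print(trend_label([4.0, 6.0, 5.5]))  # a rise of 37.5% -> "Trending up"
```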

The stability index

The stability index is the blue line on the chart and the headline number shown on each version page. It is a composite score that combines all three metrics into a single value between 0.0 and 1.0. A score of 0.44 or above is considered good (shown in green), a score of at least 0.35 but below 0.44 is considered stable (shown in grey), and a score below 0.35 indicates declining stability (shown in red).

The score is calculated from four weighted components:

  • Issues by release (35% weight) looks at the percentage change of the trend line for issue counts per release period. Because fewer issues is better, this value is inverted before being added to the score. A version where issue counts are trending downward will receive a positive contribution from this component.
  • Issues by day (15% weight) looks at the percentage change of the trend line for the average daily issue rate. This is also inverted, so a decreasing rate contributes positively. This component carries less weight because it partially overlaps with the issues-by-release metric.
  • Fix rate (20% weight) looks at the percentage change of the trend line for fix rates. Unlike the issue metrics, this one is not inverted because a rising fix rate is directly a good thing. If maintainers are fixing a larger proportion of reported issues over time, this component pushes the stability score upward.
  • Quiet period bonus (30% weight) specifically rewards versions where longer release periods have relatively fewer issues. It works by computing a "days per issue" value for each release period, then comparing the average days-per-issue in the more recent half of the data against the earlier half. If recent releases are sitting for longer without accumulating proportionally more issues, the version gets a positive bonus. This is the heaviest-weighted component because it directly captures what most people mean by "stability": a release that can sit in production for a long time without problems.
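
Combining the four components is then a weighted sum, with the two issue metrics negated so that falling issue counts contribute positively. This is a sketch from the weights stated above; the inputs are the (dampened) trend percentage changes, and the example values are illustrative:

```python
def raw_stability_score(issues_by_release_pct: float,
                        issues_by_day_pct: float,
                        fix_rate_pct: float,
                        quiet_period_pct: float) -> float:
    """Weighted raw score before sigmoid normalisation.
    Issue metrics are inverted: a falling issue trend is good."""
    return (0.35 * -issues_by_release_pct
            + 0.15 * -issues_by_day_pct
            + 0.20 * fix_rate_pct
            + 0.30 * quiet_period_pct)

# Issues falling 20%, daily rate falling 10%, fix rate up 15%,
# quiet-period comparison up 25% (all values illustrative):
print(raw_stability_score(-20, -10, 15, 25))  # 19.0
```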

Confidence dampening

One problem with using percentage changes is that small absolute numbers can produce misleadingly large percentages. If a version goes from 1 issue to 5 issues, that is technically a 400% increase, but 5 issues is not actually a lot in absolute terms. The percentage is enormous only because the starting number was so small.

To deal with this, each percentage change is scaled by a confidence factor that depends on how large the underlying values actually are. The confidence factor is calculated as the mean of the data values divided by a threshold value, capped at 1.0. This means that when the mean value is below the threshold, the percentage change is proportionally reduced.

The thresholds are set differently for each metric based on what constitutes a "meaningful" amount of data. For issues by release, the threshold is 10 issues (so a version averaging 5 issues per release would have its percentage change halved). For issues by day, the threshold is 0.5 issues per day. For fix rate, the threshold is 30%. Above these thresholds, the full percentage change is used.
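
The dampening step can be sketched as follows (function name is my own; the thresholds are those stated above):

```python
def dampened_change(pct_change: float, values: list[float],
                    threshold: float) -> float:
    """Scale a percentage change by how much data backs it up:
    confidence = mean(values) / threshold, capped at 1.0."""
    mean = sum(values) / len(values)
    confidence = min(mean / threshold, 1.0)
    return pct_change * confidence

# 1 issue then 5 issues: a +400% change, but the mean (3) is well below
# the 10-issue threshold, so only 30% of the change is kept:
print(dampened_change(400.0, [1, 5], threshold=10))  # 120.0
```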

The sigmoid normalisation

After the four weighted components are combined into a single raw score, that score needs to be mapped into the 0.0 to 1.0 range. We use a sigmoid function for this: 1 / (1 + e^(-score/20)).

The sigmoid is an S-shaped curve that smoothly maps any number to a value between 0 and 1. A raw score of 0 maps to exactly 0.5. Positive raw scores map to values above 0.5, and negative raw scores map to values below 0.5. The further the raw score is from zero, the closer the output gets to 1.0 or 0.0, but it never actually reaches either extreme. Note that because the colour thresholds are set asymmetrically (good starts at 0.44, bad is below 0.35), a raw score of zero actually lands in the "good" range rather than being perfectly neutral.

The divisor of 20 controls how steep the curve is. With this setting, a raw score of around +/- 40 will push the output close to the extremes (roughly 0.88 or 0.12). Scores beyond that range continue to push toward 1.0 or 0.0 but with diminishing effect. This prevents any single extreme metric from completely dominating the final score.
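
The normalisation step is the standard logistic function with the stated divisor:

```python
import math

def normalise(raw_score: float) -> float:
    """Map the raw weighted score into the 0-1 range:
    1 / (1 + e^(-score/20))."""
    return 1.0 / (1.0 + math.exp(-raw_score / 20.0))

print(round(normalise(0), 2))    # 0.5
print(round(normalise(40), 2))   # 0.88
print(round(normalise(-40), 2))  # 0.12
```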

Cumulative calculation

Just like the trend lines, the stability index is calculated cumulatively. The value at each point on the chart only considers data up to and including that point. This means the stability score at release 5.4.3 is based on everything from 5.4.0 through 5.4.3, but it knows nothing about 5.4.4 or later.

The headline "Stability" number shown in the stats grid on each version page is simply the last value in this cumulative sequence, which represents the stability score calculated using all available data for that version.

Major version pages

On the pages that list minor versions within a major version (for example, the Craft CMS 5.x page), the chart data is grouped differently. Instead of showing every individual release period, each data point represents a minor version. The values used are taken from the last release period within each minor version, so the data point for "5.4" shows the metrics from the final patch release in the 5.4.x series.
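
The grouping amounts to keeping the last release period seen within each `major.minor` prefix. A sketch under that reading (the pair-of-tuples representation is mine, not a confirmed implementation detail):

```python
def last_period_per_minor(periods):
    """From an ordered list of (version, value) pairs, keep only the
    final entry within each minor version."""
    by_minor = {}
    for version, value in periods:
        minor = ".".join(version.split(".")[:2])  # "5.4.2" -> "5.4"
        by_minor[minor] = (version, value)        # later entries overwrite
    return by_minor

periods = [("5.3.0", 4), ("5.3.1", 2), ("5.4.0", 9), ("5.4.1", 7), ("5.4.2", 3)]
print(last_period_per_minor(periods))
# {'5.3': ('5.3.1', 2), '5.4': ('5.4.2', 3)}
```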

The stability scores shown next to each minor version in the table and on the version blocks are calculated independently for each minor version using its own release periods. This means the stability of 5.4.x is based entirely on what happened within the 5.4.x release series and is not influenced by what happened in 5.3.x or 5.5.x.

When a minor version has only a single release (for example, a brand new 5.6.0 with no patch releases yet), there are not enough data points to calculate a meaningful stability score. In that case, the site falls back to showing the stability score of the previous minor version, with the version number shown in brackets to make it clear where the number came from.

What the colours mean

Throughout the site, green and red colours are used to indicate whether a metric is moving in a favourable or unfavourable direction. For issue counts and daily rates, red means the numbers are going up (more issues is bad) and green means they are going down (fewer issues is good). For fix rate, the meaning is reversed: green means the fix rate is going up (more issues being fixed is good) and red means it is going down.

For the stability index specifically, a score of 0.44 or above is shown in green (good), a score below 0.35 is shown in red (bad), and anything from 0.35 up to 0.44 is shown in a neutral grey (stable). The version blocks on the major version pages use the same colour as a subtle background tint, making it easy to scan across minor versions and see which ones are in good shape and which ones might warrant caution.

Limitations

This tool looks at publicly available GitHub data only. It does not know about issues reported through other channels, internal bug trackers, security advisories that were handled privately, or the severity of any individual issue. A version could have a high stability score while still having one critical bug, or a low score because of a flood of minor cosmetic issues.

The stability index is a directional indicator, not an absolute measurement. It is most useful for comparing the trajectory of a version over time, or for getting a rough sense of how a minor version compares to its predecessors. It should not be the only factor you consider when deciding whether to update.