How risk scoring works

Every pull request analyzed by Pullminder receives a risk score from 0 to 100. The score reflects how likely the PR is to introduce problems — security vulnerabilities, regressions, policy violations, or review blind spots. Higher scores mean higher risk.

What goes into the score

Pullminder runs multiple analyzers against each PR. Each analyzer examines a different dimension of risk:

Analyzer	What it checks
Diff size	Total lines changed, number of files touched, and whether the diff exceeds review-friendly thresholds
File sensitivity	Whether changed files are in sensitive paths (e.g., `auth/`, `migrations/`, `*.pem`, CI configs)
Test coverage gap	Whether the PR modifies source code without adding or updating corresponding tests
Security patterns	Known vulnerability patterns such as hardcoded secrets, SQL injection vectors, insecure crypto usage
Dependency changes	Additions, removals, or version bumps in dependency manifests (`package.json`, `go.mod`, `requirements.txt`, etc.)
Review quality signals	Whether the PR has a description, linked issues, reasonable commit structure, and adequate reviewer assignment
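
As an illustration of one analyzer, the file sensitivity check can be thought of as glob matching over the changed paths. The sketch below is not Pullminder's implementation. The pattern list and the `sensitive_files` helper are hypothetical, built only from the example paths in the table above.

```python
from fnmatch import fnmatchcase

# Hypothetical pattern list based on the examples in the table above.
# The patterns Pullminder actually ships are not documented here.
SENSITIVE_PATTERNS = [
    "auth/*",
    "*/auth/*",
    "migrations/*",
    "*/migrations/*",
    "*.pem",
    ".github/workflows/*",  # assumed stand-in for "CI configs"
]


def sensitive_files(changed_paths):
    """Return the changed paths that match at least one sensitive pattern."""
    return [
        path
        for path in changed_paths
        if any(fnmatchcase(path, pattern) for pattern in SENSITIVE_PATTERNS)
    ]


changed = [
    "src/auth/login.py",
    "docs/README.md",
    "certs/server.pem",
    "migrations/0042_add_index.sql",
]
print(sensitive_files(changed))
```

In this sketch only `docs/README.md` is left out of the result. Note that `fnmatchcase` lets `*` match across `/`, which is why `*.pem` catches a certificate in any directory.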

How the score is calculated

The scoring model works as follows:

1. Pullminder runs all enabled rule packs against the diff.
2. Each rule pack produces zero or more findings, each with a severity level.
3. Findings are weighted based on their severity and the rule pack’s `max_weight` configuration.
4. The weighted findings are summed and normalized to produce a score between 0 and 100.

A PR with no findings scores 0. A PR that triggers multiple high-severity findings across several analyzers will score closer to 100.
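
The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual scoring code. The severity weights, the pack names, and the choice to normalize by the sum of all `max_weight` values are hypothetical.

```python
# Hypothetical severity weights; Pullminder's real values are not documented here.
SEVERITY_WEIGHT = {"low": 1, "medium": 3, "high": 6, "critical": 10}


def risk_score(findings, max_weights):
    """Compute a 0-100 score from findings.

    findings:    list of (pack_name, severity) tuples
    max_weights: dict mapping pack_name -> max_weight cap for that pack
    """
    # Sum the severity weights of each pack's findings.
    raw = {}
    for pack, severity in findings:
        raw[pack] = raw.get(pack, 0) + SEVERITY_WEIGHT[severity]

    # Cap each pack at its max_weight, then sum across packs.
    # Findings from packs without a configured max_weight are ignored here.
    capped_total = sum(min(raw.get(pack, 0), cap) for pack, cap in max_weights.items())

    # Normalize against the largest total the enabled packs could produce (assumption).
    total_cap = sum(max_weights.values())
    if total_cap == 0:
        return 0
    return round(100 * capped_total / total_cap)


max_weights = {"security": 40, "diff-size": 20, "test-coverage": 20, "file-sensitivity": 20}
findings = [("security", "high"), ("security", "critical"), ("test-coverage", "medium")]

print(risk_score([], max_weights))        # no findings
print(risk_score(findings, max_weights))  # a few findings in two packs
```

With these assumed numbers, an empty finding list scores 0, and a PR only approaches 100 when several packs reach their caps at once.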

Risk levels

Pullminder maps each PR to one of four risk levels based on its findings:

Level	Meaning
Low	Routine change. Minimal review effort needed.
Medium	Some areas deserve attention. Review the flagged findings.
High	Significant risk. Careful review recommended before merging.
Critical	Major concerns detected. Address findings before merging.
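
If the level is derived from the numeric score, the mapping could look like the sketch below. The cut-off values 25, 50, and 75 are purely illustrative assumptions. This page does not state the real thresholds, or whether the level is computed from the score at all.

```python
from bisect import bisect_right

# Hypothetical cut-offs: scores below 25 are Low, 25-49 Medium, 50-74 High, 75+ Critical.
THRESHOLDS = [25, 50, 75]
LEVELS = ["Low", "Medium", "High", "Critical"]


def risk_level(score):
    """Map a 0-100 score to one of the four risk levels."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    # bisect_right counts how many thresholds the score has reached or passed.
    return LEVELS[bisect_right(THRESHOLDS, score)]


for score in (0, 30, 60, 90):
    print(score, risk_level(score))
```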

How rule packs affect scores

Each rule pack declares a `max_weight` that caps its contribution to the total score. This prevents a single noisy pack from dominating results.

You can adjust `max_weight` per pack to reflect your team’s priorities. A team that considers test coverage non-negotiable might raise the test-coverage pack’s weight while lowering diff-size.
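
To see the effect of the cap, the sketch below applies two hypothetical `max_weight` settings to the same raw pack totals. The numbers and the `capped_contributions` helper are assumptions for illustration. The real configuration format is not shown on this page.

```python
def capped_contributions(raw_weights, max_weights):
    """Cap each pack's raw weighted total at that pack's max_weight."""
    return {
        pack: min(raw_weights.get(pack, 0), cap)
        for pack, cap in max_weights.items()
    }


# Raw weighted totals before capping (hypothetical values).
raw = {"diff-size": 30, "test-coverage": 30}

# Two hypothetical configurations for the same two packs.
balanced = {"diff-size": 20, "test-coverage": 20}
tests_first = {"diff-size": 10, "test-coverage": 30}

print(capped_contributions(raw, balanced))
print(capped_contributions(raw, tests_first))
```

Under the second configuration, the same missing-test findings account for three times as much of the total as the oversized diff, which is the kind of prioritization the paragraph above describes.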

How to lower your score

If your PRs consistently score higher than you would like, these practices help:

Write tests alongside code changes. The test coverage gap analyzer checks whether modified source files have corresponding test updates. Adding tests directly addresses one of the most common finding types.
Keep PRs small and focused. Large diffs trigger the diff size analyzer and make it more likely that other analyzers find issues. Splitting work into smaller PRs reduces risk per review.
Avoid mixing sensitive path changes with feature work. Changing authentication logic, database migrations, or CI configuration alongside unrelated feature code raises the file sensitivity score. Isolate sensitive changes into dedicated PRs.
Fix security findings promptly. Security pattern findings carry the highest weights. Addressing hardcoded secrets, injection vectors, or insecure crypto usage has the largest impact on your score.
Write PR descriptions and link issues. The review quality analyzer rewards PRs that provide context. A clear description and linked issue help both Pullminder and your reviewers.

Incident history and author context

Two additional analyzers contribute to the score when Collective Memory is enabled:

Incident history — files that have been hotfixed or reverted before raise the score, weighted by recency. A file patched last week contributes more than one patched a year ago.
Author context — the score rises when the PR author has no prior history with files that have incidents, or when a large change has no second reviewer.

These signals appear in the Red-Flag section of the PR comment when they fire. See Risk Red-Flagging for threshold configuration and override options.
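
One common way to weight incidents by recency is exponential decay, sketched below. This is an assumption about the general shape only. The decay function Pullminder uses is not described here, and the 90-day half-life and the `incident_weight` helper are hypothetical.

```python
from datetime import date

# Hypothetical half-life: an incident's weight halves every 90 days.
HALF_LIFE_DAYS = 90


def incident_weight(incident_date, today, half_life_days=HALF_LIFE_DAYS):
    """Return a weight in (0, 1] that shrinks as the incident gets older."""
    age_days = max((today - incident_date).days, 0)
    return 0.5 ** (age_days / half_life_days)


today = date(2025, 1, 1)
last_week = incident_weight(date(2024, 12, 25), today)
last_year = incident_weight(date(2024, 1, 1), today)

print(f"patched last week: {last_week:.3f}")
print(f"patched a year ago: {last_year:.3f}")
```

Under this assumption, a file patched last week keeps most of its weight, while one patched a year ago contributes only a small fraction.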