Degree of Authorship (DOA): The Research-Backed Method for Measuring Code Ownership

Degree of Authorship (DOA) goes beyond commit counts to measure true code ownership. Learn how the Fritz et al. model calculates authorship using creation, modification, and knowledge decay.

Introduction

When you need to know who really owns a piece of code, the first instinct is often to count commits. "Who has committed the most to this file?" seems like a reasonable way to identify the expert.

But commit counts are surprisingly misleading. Consider a file where Developer A has 100 commits (formatting fixes, dependency updates, minor tweaks) while Developer B has just 5 commits, but those commits include the original implementation and a major architectural refactor. By commit count, A "owns" this file. But if a critical bug appears at 2 AM, who would you page? Almost certainly Developer B.

Degree of Authorship (DOA) addresses this problem. Grounded in academic research, DOA measures true code ownership by considering not just the quantity of contributions but their nature and timing. It produces a score that better reflects actual understanding of the code.

The Problem with Simple Metrics

Commit counts miss several important factors that determine real code ownership.

First, they ignore the significance of creating a file. The person who wrote the original implementation typically has context that subsequent editors don't possess. They understand the design decisions, the constraints that were considered, and the reasons the code is structured the way it is. Later editors might fix bugs or add features without ever fully understanding the foundation.

Second, commit counts ignore knowledge decay. If you made substantial contributions to a file two years ago but haven't touched it since, while your colleague has been actively maintaining it for the past six months, your understanding is likely outdated. The code has evolved in ways you don't know about.

Third, commit counts treat all contributions equally. A 500-line feature implementation should carry far more ownership weight than a hundred one-line typo fixes, but simple counting weights the typo fixes a hundred times more heavily.

What Degree of Authorship Measures

DOA produces a normalized score between 0 and 1 that represents how much a contributor "owns" a particular file based on their creation and modification history, adjusted for how much has changed since their last edit.

A score of 0.8 means strong ownership: this person either created the file or has made substantial, recent modifications. They're the go-to expert. A score of 0.2 indicates familiarity rather than deep ownership, meaning they've worked with the code but wouldn't be the first choice to fix a complex bug or make significant changes.

The approach comes from software engineering research by Fritz and colleagues. Their studies validated DOA against developers' own assessments and found it correlated well with self-reported expertise, how quickly people could answer questions about code, and the time required to make changes. This isn't theoretical: it has been empirically shown to predict actual understanding.

The Three Components

DOA combines three factors to produce its ownership score.

First Authorship asks a simple question: did this contributor create the file? The person who wrote the original code typically has foundational understanding that later editors don't share. They know why the file exists, what problem it was meant to solve, and what design tradeoffs were made. This factor contributes significantly to the ownership score because creation builds understanding that's difficult to replicate through later modifications alone.

Deliveries counts how many times a contributor has modified the file. More modifications indicate ongoing engagement with the code. Someone who has returned to make changes twenty times understands more than someone who touched it once. This factor recognizes that repeated interaction builds deeper familiarity.

Acceptances measures how much has changed since the contributor's last edit. If you modified a file fifty commits ago and forty-nine other changes have happened since, your knowledge is stale. Others have modified the code in ways you're not aware of. This factor captures knowledge decay, reflecting the reality that understanding fades when you're not actively working with code.

The mathematical formula weights these three components and normalizes the result to a 0-1 scale. Creation provides a significant boost. Each modification adds incrementally. And the decay from others' modifications follows a logarithmic curve, where the first few changes after your last edit degrade your knowledge significantly, but additional changes have diminishing impact. (You can find the specific coefficients in our technical documentation.)
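
For readers who want to see the shape of the calculation, here is a minimal Python sketch. The coefficients used (a 3.293 base, 1.098 for first authorship, 0.164 per delivery, and 0.687 on the logarithmic decay term) are the ones reported in the Fritz et al. research line; our implementation may use different values, so treat this as illustrative rather than definitive.

```python
import math

def doa_absolute(first_authorship: bool, deliveries: int, acceptances: int) -> float:
    """Absolute DOA using the coefficients published by Fritz and colleagues.

    first_authorship -- did this contributor create the file?
    deliveries       -- number of changes this contributor made to the file
    acceptances      -- number of changes by others since this contributor's
                        last edit (the knowledge-decay term)
    """
    return (3.293
            + 1.098 * (1 if first_authorship else 0)
            + 0.164 * deliveries
            - 0.687 * math.log(1 + acceptances))

def doa_normalized(per_contributor: dict[str, tuple[bool, int, int]]) -> dict[str, float]:
    """Normalize one file's absolute scores to 0-1 by dividing by the maximum."""
    absolute = {name: doa_absolute(*stats) for name, stats in per_contributor.items()}
    top = max(absolute.values())
    return {name: max(score / top, 0.0) for name, score in absolute.items()}

# Hypothetical file: Bob created it and edited it 10 times with nothing changed
# since; Alice made 3 edits, with 15 changes by others since her last one.
print(doa_normalized({"bob": (True, 10, 0), "alice": (False, 3, 15)}))
# -> approximately {'bob': 1.0, 'alice': 0.31}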

Interpreting DOA Scores

Scores of 0.75 or higher indicate a file "author," someone with deep ownership who can work with this code independently. These are your go-to experts for questions, bug fixes, and major changes.

Scores from 0.50 up to 0.75 indicate significant contributors with solid understanding. They can make changes effectively and would be reasonable choices for code reviews, though they might not have the deepest historical context.

Scores from 0.25 up to 0.50 indicate familiarity. These contributors can work with the file given context but would benefit from pairing with someone more experienced for complex changes.

Scores below 0.25 indicate minimal knowledge. The contributor has touched this file but would need substantial ramp-up time to work with it effectively.
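
These bands translate directly into a small helper; the labels below are simply the ones used in this article.

```python
def interpret_doa(score: float) -> str:
    """Map a normalized DOA score to the ownership bands described above."""
    if score >= 0.75:
        return "author"       # deep ownership; go-to expert
    if score >= 0.50:
        return "significant"  # solid understanding; good reviewer
    if score >= 0.25:
        return "familiar"     # workable with context; pair for complex changes
    return "minimal"          # substantial ramp-up needed
```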

How DOA Improves Bus Factor Calculation

Traditional bus factor calculation uses commit counts: count each contributor's commits, find how many contributors you need to account for 50% of the commits, and that's your bus factor. The problem is treating all commits equally and ignoring knowledge decay.
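
In code, the traditional approach is only a few lines; `commits_per_author` is a hypothetical mapping from contributor name to total commit count.

```python
def commit_count_bus_factor(commits_per_author: dict[str, int]) -> int:
    """Smallest number of contributors whose commits reach 50% of the total."""
    total = sum(commits_per_author.values())
    covered = 0
    for rank, commits in enumerate(
            sorted(commits_per_author.values(), reverse=True), start=1):
        covered += commits
        if covered >= total / 2:
            return rank
    return len(commits_per_author)
```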

DOA-based bus factor works differently. First, calculate DOA for each contributor on each file. Then identify "authors" (contributors with DOA above 0.75) for each file. Finally, find the minimum set of authors needed to cover more than half of all files in the repository.
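
Here is a sketch of that procedure, assuming you already have normalized DOA scores keyed by file and contributor. Finding the true minimum set is NP-hard, so the sketch uses the standard greedy approximation: repeatedly pick the author who covers the most still-uncovered files.

```python
from collections import Counter

def doa_bus_factor(doa_scores: dict[str, dict[str, float]],
                   author_threshold: float = 0.75) -> int:
    """Greedy estimate of the smallest set of authors covering >50% of files.

    doa_scores maps file path -> {contributor: normalized DOA}.
    """
    authors_of = {path: {c for c, s in scores.items() if s >= author_threshold}
                  for path, scores in doa_scores.items()}
    uncovered = set(authors_of)
    half = len(authors_of) / 2
    picked = 0
    while len(authors_of) - len(uncovered) <= half:
        counts = Counter(a for path in uncovered for a in authors_of[path])
        if not counts:  # remaining files have no author at all (orphaned)
            break
        best, _ = counts.most_common(1)[0]
        uncovered = {path for path in uncovered if best not in authors_of[path]}
        picked += 1
    return picked
```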

This approach produces more accurate results because it accounts for creation significance and knowledge decay. Consider a repository where Alice has 500 commits but most are small fixes, while Bob has 300 commits including major feature implementations. Traditional counting might show Alice covering 50% of contributions, suggesting a bus factor of 1. But DOA analysis might reveal that Bob has high authorship scores on files Alice barely touches. The DOA-based bus factor would be 2, reflecting that both Alice and Bob hold critical knowledge.

Practical Applications

When you need to find the expert on a specific file or directory, DOA gives you a clear answer. Calculate scores for all contributors, filter for those above 0.75, and rank by score. The top contributor is your subject matter expert. Use this for routing code review requests, assigning bug fixes, or planning knowledge transfer.
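
A minimal lookup might look like this, assuming the same file-to-scores mapping as in the bus factor sketch:

```python
def experts_for(path: str,
                doa_scores: dict[str, dict[str, float]],
                threshold: float = 0.75) -> list[tuple[str, float]]:
    """Author-level contributors for a file, strongest ownership first."""
    scores = doa_scores.get(path, {})
    authors = [(name, s) for name, s in scores.items() if s >= threshold]
    return sorted(authors, key=lambda pair: pair[1], reverse=True)
```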

DOA helps identify what we call "orphaned files," code where no active contributor has meaningful ownership. If the maximum DOA across all contributors is below 0.5, that file has no true expert. Bugs in orphaned files take longer to fix because no one has deep context. Changes carry more risk because no one fully understands the implications. Tracking orphaned file percentage gives you a health metric for your codebase.
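
One possible sketch of that check, where `active_contributors` is an assumed set of people still on the team:

```python
def orphaned_files(doa_scores: dict[str, dict[str, float]],
                   active_contributors: set[str],
                   cutoff: float = 0.5) -> list[str]:
    """Files where no currently active contributor reaches the DOA cutoff."""
    orphans = []
    for path, scores in doa_scores.items():
        best = max((s for name, s in scores.items()
                    if name in active_contributors), default=0.0)
        if best < cutoff:
            orphans.append(path)
    return orphans

# Dividing by the total file count gives the orphaned-file percentage:
# rate = len(orphaned_files(doa_scores, team)) / len(doa_scores)
```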

Over time, DOA can reveal concerning trends. Watch for key contributors whose scores are declining, as others are modifying "their" files and their knowledge is becoming outdated. Watch for rising orphaned file counts, which indicate knowledge isn't being transferred effectively. Watch for new contributors who aren't building authorship, which might signal onboarding problems.

Limitations to Keep in Mind

DOA measures what can be observed in git history, which means it misses knowledge transferred through conversation, documentation reading, or architecture discussions. If Alice explained the entire system to Bob over lunch, DOA has no way to know.

Shallow git history can distort results. If repositories were cloned with limited history depth, DOA will underestimate longer-tenured contributors' ownership.

Large refactors can temporarily skew scores. A single commit that renames files across the codebase might artificially boost one person's authorship.

Pair programming creates attribution challenges. If pairs don't rotate who commits, one person gets credit for understanding that's actually shared.

The key is using DOA as one signal among several. Combine it with qualitative assessment from team leads, code review participation data, and self-reported expertise. No single metric captures everything, but DOA provides a research-backed foundation for understanding code ownership.

Conclusion

Degree of Authorship offers a meaningful improvement over naive commit counting for understanding who really owns your code. By considering whether someone created a file, how often they've modified it, and how much has changed since their last edit, DOA produces scores that correlate with actual developer understanding.

Use DOA for more accurate bus factor calculation, identifying subject matter experts, detecting orphaned files before they become problems, and monitoring knowledge distribution over time. Combined with other metrics and qualitative judgment, it helps you build a clear picture of where knowledge lives in your organization, and where it might be at risk.
