Why Financial Model Version Control Is a Hard Problem
Start with why this matters. A financial model that underpins a major decision — an acquisition bid, a capital allocation, a product launch approval — may go through dozens of meaningful revisions across weeks or months. Multiple people edit it. The base case is updated as new information arrives. Scenarios are added, changed, and sometimes deleted. By the time the decision is made, nobody can reliably answer four basic questions:
- What changed between version A and version B? Not just which file was modified, but which specific cells, and from what value to what value.
- Why did it change? What drove the assumption update — a new data source, a management conversation, a market development?
- Which version is current? With multiple people editing across email threads, shared drives, and local copies, "current" becomes ambiguous fast.
- What was the model saying when the decision was made? If the deal closes and the model was wrong, the audit starts here.
These are not academic concerns. Model version confusion has contributed to real deal errors, regulatory issues, and costly rework at institutions from mid-market PE shops to bulge-bracket banks. The 2012 JP Morgan "London Whale" trading loss — $6.2 billion — involved, among other things, errors introduced when a risk model was transferred from one spreadsheet to another with a manual copy-paste operation. The error was invisible because there was no meaningful diff capability between the original and the copy.
The $6.2B lesson: The JP Morgan London Whale post-mortem identified that a key VaR model had been manually re-implemented in Excel and contained a formula error — dividing by a sum instead of an average — that halved the reported risk. There was no version control system that would have flagged the divergence between old and new. A structural diff would have caught it immediately.
Why Git — the Software Answer — Doesn't Work for Spreadsheets
If you have worked in software, your first instinct is probably: just use Git. Put the model in a repository, commit after every meaningful change, use branches for scenarios, and diff to see what changed. This is exactly the right instinct. It also completely fails for spreadsheets, for three reasons that are worth understanding precisely.
1. Excel files are binary blobs, not text
Git diffs work by comparing files line-by-line at the text level. .xlsx files are ZIP archives containing XML, relationships, shared strings, styles, and binary parts. Even a trivial one-cell change — updating a single number in a cell — results in a binary diff that looks like thousands of bytes of noise. Git will tell you "these two versions differ," but it cannot tell you which cell changed or by how much.
You can partially mitigate this by converting Excel to a text-diffable format — CSV exports, OpenDocument XML unpacked, or purpose-built tools like xltrail — but each introduces new problems: you lose formatting, you lose macros, you lose the ability to use the file natively, and you introduce a conversion step that must happen consistently every time.
2. Merging is not possible in any meaningful sense
Git's real power is not tracking changes — it is merging branches. Two developers can work on the same codebase simultaneously, and Git can automatically reconcile most of their changes. This works because text-level diffs map cleanly to the semantic structure of code.
For spreadsheets, automatic merging is essentially impossible. If two analysts both edit cell B14 in different branches, there is no Git merge strategy that can determine which value to keep — or whether the correct resolution is some third value derived from both changes. Excel's co-authoring feature handles this by locking cells during edit sessions, which prevents simultaneous editing rather than enabling merge. That is a very different thing.
3. There is no semantic diff layer
The most valuable thing a version control system does for code is not recording what bytes changed — it is answering "what does this change mean?" A code diff tells you which function was modified, what logic was added or removed, and in context, why that matters.
A meaningful financial model diff should answer: "Revenue growth assumption moved from 8% to 6.5% in the base case; EBITDA margin in year 3 changed from 22% to 20%; the exit multiple distribution was tightened." That is actionable information. A raw binary diff of an .xlsx file tells you nothing of the sort.
Tools like xltrail, Spreadsheet Compare (built into Microsoft Office 365), and various Python-based diff utilities exist and are genuinely useful for one-off comparisons. None of them integrate with a version control workflow in a way that analysts actually use day-to-day.
What Analysts Actually Use (and What Each Approach Gets You)
In the absence of a real solution, the industry has converged on a set of workarounds. Each one partially solves the problem and introduces its own failure modes.
The Save-As naming convention
The universal approach. You save versions with dates and initials: ProjectCardinal_LBO_v3_JM_0612.xlsx. When reviewing, you go to the folder and look for the most recent file.
What it gets you: a rough chronological record of file states, and the ability to go back to a prior version by opening an old file. That is genuinely valuable.
What it fails at: it provides no information about what changed between versions. To diff v3 and v4, you open both files and manually compare, which takes significant time and misses subtle changes. It also fails completely on the "why" dimension — there is no annotation mechanism, so the reason for every change lives in an email thread or, more often, in nobody's memory. And "which version is current" requires reading file names and dates, which is unreliable once multiple people are involved.
Cloud file versioning (SharePoint, Dropbox, Box, OneDrive)
SharePoint and Dropbox automatically save version history for files stored in them. You can restore any prior version and, in some cases, see a timestamp and the user who saved it. McKinsey, Bain, and most large financial institutions use SharePoint or OneDrive as the de facto version history for their models.
What it gets you: automatic timestamped snapshots without any manual naming discipline. Genuinely useful for "what did the model look like last Thursday?"
What it fails at: the same things as Save-As, with the additional problem that versions are saved every few minutes regardless of whether any meaningful change occurred — making it difficult to identify which version snapshot corresponds to a particular decision point. Still no semantic diff, no annotations, no "why."
Excel's built-in Track Changes
Excel has a Track Changes feature (also called "Show Changes" in newer versions) that highlights which cells were modified, by whom, and when. For simple models with one or two editors, this can be useful.
What it gets you: cell-level change attribution — "cell C18 was changed from 8% to 6.5% by JM on June 12." This is the closest native Excel gets to a meaningful diff.
What it fails at: Track Changes is designed for document review, not model version management. It does not handle branching scenarios, does not provide a model-level snapshot history, and is frequently turned off because it slows down large models. It also does not propagate well through the file-sharing workflows that teams actually use — once a file is saved without track changes enabled, the history is gone.
Cell comments and audit tabs
Some disciplined modelers build an "audit" or "changelog" tab manually — a structured log of what changed, when, and why. Goldfields like investment banking models at top firms (Goldman Sachs, Morgan Stanley, Evercore) sometimes include these as a matter of internal quality standards.
What it gets you: the "why" dimension, which nothing else provides. A changelog entry that says "Updated revenue growth from 8% to 6.5% following management downgrade call — per discussion with CRO 6/10" is enormously more useful than a timestamp.
What it fails at: it is entirely manual and therefore entirely discipline-dependent. In practice, changelog tabs start clean and gradually fall behind the model, especially under deadline pressure. They also do not provide the structural diff capability — they tell you what changed in prose, not at the cell level.
Microsoft Spreadsheet Compare
The most capable of the built-in tools. Spreadsheet Compare (available in Office Professional Plus and Microsoft 365 E3+) does a genuine cell-by-cell diff between two Excel files and highlights differences in a structured viewer. It handles both value differences and formula differences.
What it gets you: a real cell-level diff between any two file versions. This is the closest thing to a semantic diff that Microsoft has shipped natively.
What it fails at: it is a standalone comparison tool, not a version control system. It does not maintain a history, does not track who changed what, does not support branching scenarios, and requires you to already have both files. It answers "what is different between these two things I have" but not "show me the history of how this model evolved."
The Comparison: What Each Approach Actually Provides
| Approach | What changed? | Why it changed? | Who changed it? | Which is current? | Scenario branching? |
|---|---|---|---|---|---|
| Save-As naming | No | No | Via initials in filename | Sort by date | No |
| SharePoint / Dropbox versioning | No | No | Yes (login-based) | Yes (most recent) | No |
| Excel Track Changes | Cell-level | No | Yes | Partially | No |
| Manual changelog tab | If maintained | Yes (narrative) | If maintained | No | No |
| Spreadsheet Compare | Cell-level diff | No | No | No (point-in-time) | No |
| Git (raw) | Binary noise | Commit messages | Yes | Yes (HEAD) | Yes (branches) |
| Branching spreadsheet | Structural diff | Via annotations | Yes | Yes (explicit) | First-class |
The pattern is clear. Each existing solution answers one or two of the five questions well and ignores the rest. None of them handle scenario branching at all — which is, for most analysts, the version control problem that matters most.
The Scenario Branching Problem Is the Core Issue
When analysts say "I wish I had better version control for my financial model," they usually mean one of two things. The first is linear history: they want to track how the base case evolved over time. The second — and more pressing — is branching: they want to manage multiple simultaneous versions of the model (scenarios, sensitivities, alternative structures) without them diverging into separate, inconsistent files.
Linear history and branching are different problems that require different solutions. Linear history is what SharePoint versioning and Excel Track Changes partially address. Branching is what nothing addresses well.
The branching problem looks like this: you have a base case model. You need a bear case, a bull case, and a downside stress case. You copy the file three times and start editing each one. Six weeks later:
- The base case has been updated eight times with new information
- The bear case has been updated three times, but not consistently with the base case updates
- The bull case was never updated after week two and is now meaningfully out of date
- The stress case was handed to a different analyst who restructured part of the revenue model
- Nobody is certain which scenarios reflect the current base case logic and which don't
This is not a discipline failure — it is a structural failure. The copy-paste branching approach has no mechanism for ensuring that scenarios inherit from a living base case. Once the files separate, they drift. The only way to prevent drift is continuous manual reconciliation, and that work does not get done under deadline pressure.
What branching should mean for a financial model: A scenario should be defined as a set of overrides on the base case — not a separate copy. Every input the scenario doesn't explicitly override should flow through from the base case automatically. When the base case is updated, all scenarios update automatically everywhere they haven't explicitly diverged. This is inheritance, not copying.
What Institutional Desks Actually Do
The most sophisticated financial modeling shops have developed informal systems for managing this. It is worth looking at what those systems look like in practice, because they reveal what the right solution needs to provide.
The Goldman model
Analysts who have worked at Goldman Sachs describe a model culture where the central file is treated as the single source of truth, edited in place rather than duplicated. Scenarios live as named ranges or scenario-selector architecture within a single file. The "versions" are the model at different dates, preserved in SharePoint history. Scenario management is handled via a structured input block where each column is a scenario — a manual implementation of scenario inheritance.
This works, but it requires significant upfront architecture discipline and does not scale easily to complex scenario trees. It also still has no cell-level diff capability and relies on analysts to manually keep the scenario columns in sync when the model structure changes.
The PE fund model
At private equity funds, models tend to be more analyst-owned and less institutionally standardized. Version control typically means a SharePoint folder with dated file copies and a convention that the most recent file is current. Scenario management is done via file copies — one file per scenario — with the acknowledged pain that scenarios gradually diverge from each other.
The most disciplined shops add a reconciliation step at key decision gates: before a model goes to the investment committee, an analyst is responsible for verifying that each scenario still shares the same underlying model logic and that any formula changes to the base case were propagated to all scenario files. This is tedious and error-prone.
The startup FP&A model
Startups face an acute version of the same problem with fewer resources. A three-person finance team at a Series B company might be maintaining a single Excel file as their financial model, board deck model, and fundraising model simultaneously. The "version control" is a shared Google Drive folder with file copies labeled by date. Scenarios are managed via tab colors and cell highlighting conventions that mean nothing to anyone other than the person who built them.
When that person leaves, the model's history is essentially gone. The next finance hire starts from scratch rather than trying to understand a model with no documentation and no coherent history.
The Architecture a Real Solution Needs
Based on what the existing tools get wrong, a real solution to financial model version control needs to provide all of the following — not some of them:
Structural diff at the input level
The system must be able to answer "what is different between scenario A and scenario B" at the assumption level, not the byte level. Specifically: which inputs are different, what their values are in each scenario, and what the downstream output impact is. This is the diff capability that neither Excel Track Changes nor Spreadsheet Compare provides in a scenario context.
Inherited scenarios with explicit overrides
Scenarios must inherit from the base case automatically. A scenario is not a copy — it is a set of overrides. Every assumption not explicitly overridden uses the base case value. When the base case is updated, all scenarios update everywhere they haven't diverged. This eliminates the stale inheritance problem structurally rather than procedurally.
First-class scenario identity
Each scenario should have a name, a description of what it represents, and optionally an annotation explaining why the key overrides are set the way they are. This is the "why" dimension that only the manual changelog tab currently provides, but it needs to be built into the scenario structure itself — not maintained separately.
Model history with decision-point snapshots
The system should allow explicit model snapshots at decision points — "this was the model as presented to the IC on June 12." Those snapshots should be retrievable and diffable against the current state. Automatic periodic versioning is less valuable than deliberate decision-gate snapshots, because the relevant question is "what did the model say when we made the decision," not "what did it say at 3:47pm last Tuesday."
No new workflow overhead
The biggest failure mode of every sophisticated solution proposed for this problem is that it requires analysts to change their workflow significantly. A separate tool for diffing means analysts need to remember to use it. A Git integration means analysts need to learn Git. Any solution that requires consistent manual action from a team under deadline pressure will not be consistently used.
The right solution has version control and scenario branching built into the native modeling experience. The analyst should not need to think about it separately from building the model.
Why the Right Mental Model Is a Branch, Not a Copy
Software developers learned this decades ago: the branch is a first-class concept in version control because it perfectly captures what a scenario or alternative actually is — a deviation from a shared base, not an independent fork.
In Git, when you branch, you are not copying the repository. You are creating a pointer to a shared history and recording only the differences from that point forward. The vast majority of the codebase is identical across branches; only the specific changes are recorded separately. Merging brings those changes back together.
The same principle applies to financial models. A bear case is not a separate model — it is the base case plus a handful of overridden assumptions. A stress test is the base case plus an extreme sensitivity on the two or three variables that matter most. Modeling those as branches rather than copies means:
- The base case and all scenarios always share the same calculation engine
- Updating the base case propagates to all scenarios automatically
- The "diff" between any two scenarios is explicit and precise: exactly the assumptions that were overridden
- Adding a new scenario takes seconds, not the hours of a full file copy and review
This is the insight that is obvious in retrospect and almost universally absent from current financial modeling practice.
A practical test for your current version control setup: Pick any two scenario files you are currently maintaining. Can you, in under two minutes, produce a complete list of every assumption that differs between them and by how much? If not, you do not have version control — you have files.
Practical Steps You Can Take Right Now
Until you have a system that solves all of this natively, there are structural improvements you can make that partially address the problem with existing tools.
Centralize every input, no exceptions
The prerequisite for any version control capability is having all assumptions in one place. A model with hardcoded numbers scattered across calculation sheets cannot be diffed meaningfully. Move every assumption to a dedicated input sheet, name it with a descriptive range name, and reference it throughout the model. This makes even manual comparison far easier.
Build a scenario input block, not a scenario file collection
Instead of maintaining separate files for each scenario, build a scenario block on your assumption sheet: one row per assumption, one column per scenario, and a master selector that feeds the active scenario values into the calculation engine. This is more work upfront and has its own maintenance challenges, but it eliminates file proliferation and makes the diff between scenarios immediately legible.
Use SharePoint versioning with explicit commit messages
SharePoint and OneDrive do not natively support commit messages — a version is just a timestamp and a user. You can work around this by saving a version and then immediately renaming the SharePoint version with a descriptive label via the version history UI. It is cumbersome, but it provides the "why" context that pure timestamping lacks.
At every decision gate, take a named snapshot
Before any model goes to a board, an investment committee, or a counterparty, save a clean named copy — not just a date-stamped autosave. Name it for the decision: ProjectCardinal_LBO_IC_June12.xlsx. This file should never be edited again. It is the record of what the model said at that decision point and will be the reference if questions arise later.
Branching built into the spreadsheet itself
TreBranch treats scenarios as first-class branches — not file copies. Each scenario inherits the base case automatically and records only its overrides. DiffPanel shows you exactly which inputs differ between any two scenarios and what drives the output gap. Model History captures decision-gate snapshots you can return to at any time. No extra tools, no workflow overhead.
Try Free — Microsoft Store →Free trial. Buy once, own it forever. 100% offline — your models never leave your computer.
Frequently Asked Questions
Can you use Git for Excel files at all?
Git can track .xlsx files in the sense that it will detect when they have changed and record a new commit. But the diff it produces is binary noise — it tells you a file changed, not what changed inside it. Tools like git-xl and xltrail add Excel-aware diff plugins that extract meaningful cell-level changes from version history. These are genuinely useful for individual modelers who want chronological tracking, but they do not solve the merging problem or the scenario branching problem.
What does Microsoft's Spreadsheet Compare tool actually do?
Spreadsheet Compare (part of Office Professional Plus and Microsoft 365 E3+) performs a cell-by-cell comparison between two Excel files and displays differences in a structured side-by-side view. It handles value differences, formula differences, and formatting differences separately. It is most useful for auditing a model that was supposed to be a copy of another — checking that the right assumptions were carried over, for example. It is not a version control system and has no history or branching capabilities.
How do investment banks handle financial model version control?
Most investment banks use a combination of SharePoint or network drive storage (providing automatic versioning), file naming conventions, and internal model review protocols. Bulge-bracket banks like Goldman Sachs and Morgan Stanley have internal model validation groups that audit models at key decision points — this is an institutional workaround for the absence of a technical solution. The review process substitutes human audit for software version control. It is expensive and does not scale to the volume of models that exist at any large institution.
Is there a way to diff two Excel files for free?
Yes. Microsoft's Spreadsheet Compare is available free within Office Professional Plus and Microsoft 365 E3+. If you do not have access to it, Python's openpyxl library can be used to extract cell values from two files and compare them programmatically — there is no native UI, but the comparison logic is straightforward. For a quick one-off comparison, opening both files and using Excel's "View Side by Side" with "Synchronous Scrolling" is the simplest no-tool approach, though it is entirely manual.
How should you handle model version control when multiple people are editing simultaneously?
Simultaneous editing of a financial model is the hardest version control case. The least-bad approach with standard tools is to designate a single file owner who merges changes from others rather than having multiple people edit simultaneously. Excel's co-authoring feature (available in OneDrive/SharePoint) prevents cell-level conflicts by locking cells during editing sessions, but it provides no diff or history capability and tends to cause performance issues with complex models. For teams that need genuine simultaneous collaboration on a model, the honest answer is that no off-the-shelf tool handles this well — it is an unsolved problem in current tooling.
What is the best way to document why assumptions were changed in a financial model?
The most durable approach is a structured changelog tab in the workbook itself — a running log with columns for date, analyst, cell or assumption changed, old value, new value, and rationale. The rationale field is the critical one; it captures the context that no automated tool can. The limitation is that changelog tabs require consistent discipline to maintain and tend to fall behind under deadline pressure. A faster alternative is to use Excel cell comments (right-click → New Comment) directly on the assumption cells — the comment is attached to the cell and visible inline, though it does not survive if the cell is deleted or moved.