Bad Data and Automated Rebalancing

How automated rebalancing systems handle bad data.

As readers of this blog know, we think automated rebalancing is changing wealth management. It can reduce costs, improve customization and generate tax alpha greater than most advisors’ fees, all at a much lower cost than current processes. It is becoming table stakes, and advisors who don’t take advantage of what automated rebalancing makes possible will be at an insupportable competitive disadvantage.

But automated rebalancing has a weakness: it only works if the data it relies on is accurate. There’s an old computer science saying: garbage in, garbage out. If you feed an automated rebalancing system incorrect holdings or model data, it’s not likely to generate useful trades. The solution? It comes in two steps:

  1. Perform integrity checks to catch bad data before you trade.
  2. Implement a rebalancing workflow that auto-corrects any data-related bad trades as soon as the data errors are fixed.

Let’s walk through these one at a time.


Data Integrity Checks

The first line of defense against bad data is to suspend trading of accounts that may be affected by it. There are many possible sources of bad data. In practice, we see three:

  1. Stale Uploads: Rebalancing systems upload holdings information from custodians or intermediate accounting systems. If, for whatever reason, an account is missing from an upload, the data for that account will be stale.
  2. Corporate Actions: In theory, custodians and intermediate accounting systems all apply corporate actions on the same day. But in practice, corporate action processing is often delayed. This can cause problems. If, for example, your holdings don’t reflect a 2-for-1 split, you’re going to think you have half as many shares of the security as you really do.
  3. Unknown Securities: Rebalancing systems like Smartleaf have their own security master for prices and risk signature. Occasionally, we won’t recognize a CUSIP or ticker included in a model or a holdings upload. Most of the time, it’s because of delayed corporate action processing by the custodian (see above), but sometimes it’s because the security is obscure and thinly traded.

The solution is the same in each case: suspend trading for accounts affected by the bad data. However, the details of how this is done will vary from system to system. Here’s how it works in ours:

  1. Stale Uploads: This one’s easy. Our system automatically suspends trading for accounts with holdings information that has not been updated since the last market close.
  2. Corporate Actions: We show our clients all the corporate actions we know about from our own corporate action feed. Our clients can compare our list of corporate actions with their own. If there’s a mismatch — for example, if their custodian is a day late in recording a split — our clients can apply a filter to suspend trading of any affected account.
  3. Unknown Securities: We show our clients all the securities we don’t recognize. Usually, they’ll apply a filter to suspend trading of any affected account. If the client knows about a security and feels it can be safely ignored, they can choose to proceed with trading.
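
To make the three checks concrete, here is a minimal sketch of how they might be combined into one pre-trade gate. This is an illustration, not Smartleaf’s actual implementation: the `Account` class, the `suspension_reasons` function and all of its parameters are hypothetical names, and tickers stand in for whatever identifiers a real system uses.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Account:
    # Hypothetical record; a real system would carry far more detail.
    account_id: str
    holdings_updated_at: datetime      # time of the last holdings upload
    tickers: frozenset = frozenset()   # securities held by the account


def suspension_reasons(account, last_market_close, known_tickers,
                       mismatched_action_tickers,
                       acknowledged_unknowns=frozenset()):
    """Return the list of reasons (possibly empty) to suspend trading."""
    reasons = []
    # Stale upload: holdings not refreshed since the last market close.
    if account.holdings_updated_at < last_market_close:
        reasons.append("stale_upload")
    # Corporate actions: the account holds a security whose corporate
    # action the custodian has not yet recorded.
    if account.tickers & mismatched_action_tickers:
        reasons.append("corporate_action_mismatch")
    # Unknown securities: not in the security master, and not marked by
    # the client as safe to ignore.
    if account.tickers - known_tickers - acknowledged_unknowns:
        reasons.append("unknown_security")
    return reasons
```

An account is traded only when the returned list is empty. The `acknowledged_unknowns` argument models the case where a client knows about an unrecognized security and chooses to proceed anyway.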


Self-Correcting Daily-Rebalancing Workflow

With these data integrity checks in place, it is exceedingly rare for any account to be traded on bad data. For most of our clients, there have never been any bad-data trades. For others, it’s a once-every-few-years event.

Fortunately, when it does happen, the bad trades are auto-corrected as soon as the data is fixed. This self-correction works only because automated rebalancing systems support a uniform daily rebalancing workflow. If, instead, you had a calendar-based rebalancing workflow (say, every quarter), you’d need a special process to make sure you go back and fix errors caused by bad data. But if you have a daily rebalancing workflow, this isn’t necessary. The problem is simply caught and corrected the next day (or whenever the data problems are fixed).

Here’s how a daily rebalancing workflow works: every day, you trade any account that meets one or both of two conditions:

  1. The account is not compliant with one or more important constraints, such as “satisfy a cash out request” or “never own tobacco stocks.”
  2. The net benefit of trading exceeds a preset “cost/benefit threshold.” For each account, the system generates a “cost/benefit score” that measures how much the account can benefit from trading. What constitutes a benefit? It is strictly defined by the client. It includes bringing the account closer to its target asset allocation and recommended security weights, buying securities with higher security return rankings, and tax loss harvesting. What constitutes a cost? Costs include commissions, bid-ask spreads and taxes.
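
The two conditions above can be sketched as a single daily decision. This is a simplified illustration under stated assumptions: `AccountState`, `should_trade` and their fields are hypothetical names, and benefit and cost are assumed to already be expressed in the same units (for example, dollars), which is a real system’s hard part.

```python
from dataclasses import dataclass, field


@dataclass
class AccountState:
    # Hypothetical summary of one account on one day.
    violated_constraints: list = field(default_factory=list)  # e.g. ["cash_out_request"]
    benefit: float = 0.0   # client-defined benefit of the proposed trades
    cost: float = 0.0      # commissions + bid-ask spreads + taxes


def should_trade(state, threshold):
    """Trade if any constraint is violated, or if the net benefit
    (benefit minus cost) exceeds the cost/benefit threshold."""
    if state.violated_constraints:
        return True
    return (state.benefit - state.cost) > threshold
```

Note that a constraint violation triggers a trade regardless of the cost/benefit score, while an otherwise compliant account trades only when the improvement clears the threshold.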

Here’s the thing. If one of these triggers causes an account to be traded, the system-generated trades will remove the trigger. Specifically, the trades always fix whatever constraint was being violated, and take advantage of any opportunity to improve the portfolio net of costs. So, after the account trades, there’s nothing left to do until circumstances change — e.g. there’s a tactical asset allocation change, a security swap in a model, a new loss harvesting opportunity, a change in the account’s customization settings, etc. This means that under normal circumstances, the account won’t trade again for a while — it will neither violate a constraint nor have trades that exceed a cost/benefit threshold.

However, this isn’t true if an account traded on bad data and that data is subsequently fixed. In this case, the system is likely to trade the account twice in rapid succession, perhaps two days in a row. The first day it trades based on bad data; the second day, armed with good data, it undoes the previous day’s trades. The key point here is that no special process was needed to correct the error. The firm’s standard daily rebalancing workflow did the job.
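
A toy simulation shows the self-correction at work. It assumes a deliberately simplified rebalancer whose target is a fixed share count, and a hypothetical `daily_rebalance` function; a real system targets weights and nets out costs, but the mechanism is the same. The same function runs every day, and the reversal falls out of it once the data is right.

```python
def daily_rebalance(reported_shares, target_shares, tolerance=0):
    """Return the share trade the standard daily pass would generate
    (positive = buy, negative = sell, 0 = nothing to do)."""
    drift = target_shares - reported_shares
    return drift if abs(drift) > tolerance else 0


target = 200
true_shares = 200  # what the account really holds after a 2-for-1 split

# Day 1: the holdings upload missed the split and still reports 100 shares.
day1_trade = daily_rebalance(reported_shares=100, target_shares=target)
true_shares += day1_trade  # the system buys shares it did not need

# Day 2: the data is fixed, so the upload reports the true position.
day2_trade = daily_rebalance(reported_shares=true_shares, target_shares=target)
true_shares += day2_trade  # the same workflow reverses the bad trade

# Day 3: the account is on target, so no trigger fires.
day3_trade = daily_rebalance(reported_shares=true_shares, target_shares=target)

print(day1_trade, day2_trade, day3_trade, true_shares)
```

Nothing in the code knows that day 1 was a mistake. The day 2 trade is simply what the ordinary daily pass produces when it sees accurate holdings.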


Automated rebalancing is extraordinarily powerful, but it does depend on clean data. The bad news is that there will never be a world with perfect data. The good news is that data integrity checks and a self-correcting rebalancing workflow can reduce the problems caused by bad data to near zero.

President, Co-Founder