7. How Do Backtests Work in Composer?

A backtest is the basic way for any user to evaluate the logic and performance of any Composer symphony. Once a symphony’s logic can run without any errors, it will generate a backtest by running its logic against real-world historical market data.

It’s critical to note that a backtest is not necessarily a reflection of how a symphony will perform in the future. If you’ve been around the investment world for any length of time, you’ve seen the disclaimer that says “past performance is not indicative of future returns,” and it’s every bit as applicable here as it is with any other investment.

A backtest simply shows you how a symphony would have performed in the past by reacting to real-world market conditions over the backtested period.

A symphony that backtests to 2018 will show you how it would have behaved during the COVID crash, but it won’t necessarily demonstrate how it will behave in the next crash, because the next crash isn’t likely to happen under the same circumstances as 2020’s crash, just as the 2020 crash differed from the 2008 crash, which differed from the dotcom crash of 2001, and so on.

Here are the major elements of a Composer backtest:

Time period (one year by default)
Initial investment amount
Fees, other costs, and slippage
The chart view (corresponds to time period settings by default)
Any comparison benchmarks
Calculated final values and costs/expenses for the backtest period
Symphony alpha, beta, R^2, and R values
Cumulative, annualized, and trailing 1-month and 3-month returns
Sharpe ratio
Max drawdown
Calmar ratio
Historical allocations

Let’s explore the meaning and purpose of each element.

Backtest Time Period

A backtest can extend as far into the past as data is available for all assets included in the symphony’s Asset blocks. By default, a Composer backtest runs over the most recent one-year period, but you can adjust the timeframe as broadly or narrowly as you like—within the constraints of the tickers used in the symphony.

How far can a backtest go?

The length of a backtest is always limited by the most recently launched ticker used in the symphony for either condition checks or investments.

Many symphonies’ backtests reach back only to 2022 due to the popularity of volatility-related tickers like UVIX and SVIX, which were introduced in 2022.
Symphonies that invest in TQQQ can often be backtested to 2010, since that’s when the TQQQ ETF became publicly available, unless they incorporate tickers with shorter histories.
If a symphony primarily analyzes TQQQ but also evaluates QQQU (a newer 2x leveraged ETF focused on the “Magnificent Seven” tech stocks), the entire backtest will start no earlier than March 2024, when QQQU first became investable.
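
The rule above can be sketched in a few lines of Python. This is only an illustration of the "most recently launched ticker wins" logic, not Composer's implementation. The dates are placeholders rounded to the year or month mentioned in the text, and the `FIRST_AVAILABLE` table and `earliest_backtest_start` function are hypothetical names.

```python
from datetime import date

# Placeholder first-available dates, rounded to the year (or month) given
# in the text above. They are illustrative, not exact listing dates.
FIRST_AVAILABLE = {
    "TQQQ": date(2010, 1, 1),
    "UVIX": date(2022, 1, 1),
    "SVIX": date(2022, 1, 1),
    "QQQU": date(2024, 3, 1),
}

def earliest_backtest_start(tickers):
    """The backtest can start no earlier than the newest ticker's first date."""
    return max(FIRST_AVAILABLE[t] for t in tickers)

print(earliest_backtest_start(["TQQQ"]))
print(earliest_backtest_start(["TQQQ", "SVIX"]))
print(earliest_backtest_start(["TQQQ", "QQQU"]))
```

Adding a single newer ticker, even one used only in a condition check, moves the start date for the whole symphony.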

The length of a backtest does not necessarily indicate a symphony’s potential. However, many Composer investors prefer symphonies with longer backtest histories because they provide more data points for evaluation and reduce the risk of overfitting to short-term market conditions.

If you add benchmarks for comparison, those benchmarks will also be limited by the most recently launched ticker in the symphony.

For example:

If a benchmark began trading in 2021 but your symphony backtests to 2018, the benchmark’s metrics will only begin in 2021—even though your symphony has a longer historical record.
This will affect most of the benchmark’s backtested performance metrics, which we’ll examine in the next section.

Initial Investment, Fees, and Slippage

Several backtest settings can significantly influence the values you see in Composer:

Initial Investment Amount:

By default, the initial investment for a backtest is set to $100,000, but you can adjust it as needed—whether to reflect your actual portfolio size or to test how smaller allocations affect performance.

Fractional Share Limitations:
- If your initial investment is too small, Composer may not be able to buy fractional shares of certain assets.
- Most symphonies can trade effectively with as little as $500.
- Alpaca accounts can sometimes function with as little as $100, but for more accurate trade execution, an investment of at least $500 is recommended for the backtest.

Fees Included in Backtests

Composer’s backtest accounts for the following costs and fees:

Annual Composer Trading Pass – $384 per year
Regulatory Fees – Required costs to facilitate trade execution
Slippage – The difference between the expected trade price and the actual execution price

Understanding Slippage

Slippage is more pronounced when trading lower-volume tickers, where fewer shares are available at a given price.

High-volume assets like TQQQ trade millions of shares daily, meaning slippage is minimal.
Low-volume assets (with only a few thousand shares traded daily) may experience significant price variation between expected and executed trades.

Customizing Backtest Settings

You can enable or disable fees and slippage in a symphony’s backtest.
You can also adjust slippage settings by setting the basis points to any value.

By default:

All fees and slippage are enabled
Slippage is set to one basis point (0.01%)
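
To get a feel for what a basis-point slippage setting means in dollars, here is a minimal sketch. It assumes slippage is charged as a flat fraction of each trade's value, which is a simplification and may not match how Composer models it. The function name `slippage_cost` is hypothetical.

```python
def slippage_cost(trade_value, slippage_bps=1.0):
    """Dollar slippage on one trade; 1 basis point = 0.01% = 1/10,000."""
    return trade_value * slippage_bps / 10_000

# A single $10,000 trade at the default 1 bp setting.
print(slippage_cost(10_000))

# Hypothetical worst case: turning over a full $100,000 portfolio
# on each of 252 trading days at 1 bp per trade.
print(252 * slippage_cost(100_000))
```

Under these assumptions, one $10,000 trade costs $1, while a symphony that turned over its full $100,000 every trading day would give up about $2,520 in a year. Trading frequency matters more than the per-trade figure suggests.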

Composer's Backtest Chart View

The backtest chart in Composer’s symphony editor is the most detailed version of the symphony chart you’ll see on the platform. It displays charted performance of your tested symphony against a default benchmark, which will always be SPY (the SPDR S&P 500 ETF).

Below the date axis, you’ll see a sliding bar that allows you to narrow the chart view to shorter time frames. It’s usually more precise to set your time frame in the Period setting at the top of this view, but the sliding bar can help you visualize a symphony’s performance on dynamically shortening time frames.

Benchmarks on the Composer chart view

You can evaluate any symphony’s performance relative to other symphonies in your watchlist or live portfolio, as well as compare them against individual stocks or ETFs as benchmarks.

By default, every symphony is benchmarked against SPY, since the S&P 500 is the most widely used investing benchmark in U.S. public markets.

In strategies with annualized returns exceeding 100%, SPY may appear as a nearly flat line—especially when examining performance over longer time frames (2+ years).
Despite large differences in performance between your symphony and its benchmarks, you can still analyze relative performance by hovering over the chart.
When hovering, the chart dynamically displays the total return percentage for each option on the specific date you select.

A backtest can extend as far back as data is available for each ticker used in the symphony. However, this availability varies from symphony to symphony:

Some benchmarks may have longer historical records than the symphony being tested.
Others may start much later, depending on when the benchmark’s ticker was first introduced.
All calculations in the chart and performance data tables are based on the earliest available backtesting date across all symphonies in the comparison.

For example:

If you compare a benchmark that begins in 2020 with a symphony that backtests to 2015, the symphony’s performance line will start at the leftmost part of the chart.
The benchmark’s performance data will only appear starting from 2020, which may place it closer to the middle of the chart view.
As long as all symphonies in a comparison have data for the full period noted in the “Performance Metrics” table below the chart, their results will be reported for that entire period.

The next section will provide a more detailed breakdown of how these performance metrics are calculated.

Composer Backtest Performance Metrics

The easiest way to compare your symphony’s backtested performance against all selected benchmarks over the full backtest period is by reviewing the tables located just below the backtest chart.

Directly beneath the “Add Benchmarks” button, you’ll find three performance tables, each displaying various metrics and values for both your symphony and its benchmarks.

In the following sections, we’ll examine each of these tables in detail.

Composer's Simulated Returns Metrics

Every backtest generates Simulated Returns based on the settings configured in the Investment, Fees, & Slippage section of the backtest chart.

These key metrics provide insight into how your symphony would have performed under real-world trading conditions:

1. Initial Investment

By default, set to $100,000, but can be adjusted to any amount.
This value will always match the setting selected in the backtest chart.

2. Final Value

Represents the theoretical ending value of your initial investment at the end of the backtest period, net of fees and slippage.
Does not account for taxes—consult a tax attorney or CPA for tax implications on simulated or actual gains.

3. Regulatory Fees

Mandated by the SEC (Securities and Exchange Commission) and FINRA (Financial Industry Regulatory Authority).
These fees are legally required for all investors and traders in U.S. markets and contribute to regulatory budgets.

4. Total Slippage

The difference between the expected trade price and the actual execution price.
Typically a minor cost, measured in basis points (hundredths of a percentage point) per trade.
While slippage tends to be the largest trading cost in Composer, it usually only affects a small fraction of total returns.

5. Trading Pass

Incorporates the cost of Composer’s Trading Pass subscription ($40/month) into backtest expenses.
This cost is fixed at $40 per month for the duration of the backtest.
In reality, the Trading Pass cost is spread across an entire portfolio, reducing its impact on individual symphonies.
Many users pay less than $40/month by subscribing annually, referring others for discounts, or both.
Since this cost is fixed, it is more burdensome for small accounts but insignificant for larger portfolios.
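
The effect of a fixed subscription cost on different account sizes is simple arithmetic. The sketch below uses the $40/month figure from the text and assumes the full cost is charged against a single symphony. The function name `annual_pass_drag` is hypothetical.

```python
MONTHLY_PASS_COST = 40.0  # dollars per month, as stated in the text

def annual_pass_drag(portfolio_value, monthly_cost=MONTHLY_PASS_COST):
    """Yearly subscription cost as a fraction of the starting portfolio value."""
    return (monthly_cost * 12) / portfolio_value

for value in (500, 5_000, 100_000):
    print(f"${value:,}: {annual_pass_drag(value):.2%} of the account per year")
```

On these numbers the pass costs 96% of a $500 account per year, 9.6% of a $5,000 account, and 0.48% of a $100,000 account. This is why backtests with small initial investments can look much worse once fees are enabled.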

Composer's Benchmark Comparisons

Directly below the Simulated Returns table, you’ll find another table comparing your benchmarks against your backtested symphony.

This table only appears if you have selected benchmarks for comparison, and it exclusively displays metrics for your benchmarks relative to your symphony.

Below is a breakdown of how to interpret the key metrics in the “Compared To” table:

1. Alpha

Measures the excess return of your backtested symphony relative to a benchmark, accounting for expected risk.
Alpha is an excess return, not a multiple: if Alpha = 3.33, your symphony returned 3.33 (in the units shown in the table) more over the past 12 months than its Beta to the benchmark alone would predict.
A higher Alpha suggests superior performance relative to the benchmark after adjusting for risk.

2. Beta

Indicates how sensitive (or volatile) your symphony is compared to the benchmark.
Beta = 1.0 → Moves in sync with the benchmark.
Beta > 1.0 → More volatile than the benchmark (e.g., Beta = 1.6 means your symphony moved 1.6 times as much as the benchmark).
Beta < 1.0 → Less volatile than the benchmark.
Beta < 0 → Moves inversely to the benchmark (rare). A Beta of -1.0 implies a one-to-one inverse relationship over the past year.

3. R² (R-Squared, or the Coefficient of Determination)

Quantifies how much of the variation in your symphony’s returns is explained by the benchmark’s movements, through the Alpha and Beta relationship. Ranges from 0 to 1:
- R² = 1.0 → Alpha and Beta fully explain the symphony’s movements relative to the benchmark.
- R² = 0.05 → Only about 5% of the symphony’s movement is explained by the benchmark; the two are largely unrelated.
Lower R² values suggest a weaker relationship between the symphony and the benchmark.

4. R (Pearson’s Correlation Coefficient)

Measures how strongly the symphony’s returns correlate with the benchmark.
Ranges from -1 to 1:
- R = 1.0 → Perfect positive correlation (symphony moves in lockstep with the benchmark).
- R = 0 → No linear correlation between symphony and benchmark.
- R = -1.0 → Perfect inverse correlation (symphony moves opposite to the benchmark).
Symphonies using similar logic or investing in the same tickers will have R values closer to 1.
- Example: TQQQ-based symphonies will likely have high R values when compared to other TQQQ-focused strategies.
- However, a TQQQ-focused symphony compared to one investing in gold miners or bank stocks will likely have low R values, as these sectors generally move independently.
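
The four metrics above all come from the same relationship: a straight-line fit of the symphony's returns against the benchmark's returns. The sketch below uses an ordinary least-squares fit on daily returns. That is the textbook approach and an assumption here, since Composer's exact method (return frequency, annualization, any risk-free adjustment) is not described in this guide. The function name and sample data are hypothetical.

```python
from math import sqrt

def mean(xs):
    return sum(xs) / len(xs)

def compare_to_benchmark(symphony_returns, benchmark_returns):
    """Least-squares fit: symphony = alpha + beta * benchmark + noise.

    Returns per-period alpha, beta, Pearson's R, and R squared.
    """
    mx, my = mean(benchmark_returns), mean(symphony_returns)
    cov = sum((x - mx) * (y - my)
              for x, y in zip(benchmark_returns, symphony_returns))
    var_x = sum((x - mx) ** 2 for x in benchmark_returns)
    var_y = sum((y - my) ** 2 for y in symphony_returns)
    beta = cov / var_x              # sensitivity to the benchmark
    alpha = my - beta * mx          # return not explained by beta
    r = cov / sqrt(var_x * var_y)   # correlation, between -1 and 1
    return {"alpha": alpha, "beta": beta, "r": r, "r_squared": r * r}

# Hypothetical daily returns: the symphony moves exactly twice as much as
# the benchmark and adds a constant 0.1% per day on top.
bench = [0.01, -0.02, 0.015, 0.0, -0.005]
symph = [2 * b + 0.001 for b in bench]
stats = compare_to_benchmark(symph, bench)
print(stats)
```

In this constructed example Beta comes out at about 2, Alpha at about 0.001 per day, and both R and R² at about 1, because the benchmark explains the symphony's movement completely. Real symphonies will show noisier values.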

Composer's Backtested Performance Metrics

The third table in the backtest results section is where most investors focus their attention when comparing a backtested symphony against its benchmarks. This table provides critical performance statistics used to evaluate potential investments. Below is a detailed breakdown of each metric and how to interpret it.

Performance Metrics:

Cumulative Return

Represents the total return over the full backtested period.
Calculated separately for each symphony or benchmark, starting from the earliest date on which it has data.
If a symphony has data from 2015, it will report returns from 2015 to the present, while a benchmark with data starting in 2020 will only report returns from 2020 onward.
This can lead to discrepancies if a symphony’s backtest covers a significantly longer or shorter period than its benchmarks.

Annualized Return

Expresses the cumulative return as a compound yearly rate: the constant annual return that, compounded over the number of years in the backtest, would produce the same cumulative return.
Adjusted for the start and end date of the backtest and the total trading days available.
If all benchmarks and symphonies start from the same date, this metric provides a direct performance comparison.
If start dates differ, this metric reflects annualized returns from each respective starting date, which can make comparisons less meaningful.
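
As a worked example of how cumulative and annualized returns relate, the sketch below assumes the conventional compound (geometric) definition of annualized return and a 252-trading-day year. Both are assumptions rather than confirmed details of Composer's calculation, and the function names are hypothetical.

```python
TRADING_DAYS_PER_YEAR = 252  # common convention, assumed here

def cumulative_return(start_value, end_value):
    """Total return over the whole period, e.g. 1.0 means +100%."""
    return end_value / start_value - 1

def annualized_return(cum_return, trading_days):
    """Constant yearly rate that compounds to the same cumulative return."""
    years = trading_days / TRADING_DAYS_PER_YEAR
    return (1 + cum_return) ** (1 / years) - 1

# Hypothetical backtest: $100,000 doubles over five years of trading days.
cum = cumulative_return(100_000, 200_000)
print(f"cumulative: {cum:.1%}")
print(f"annualized: {annualized_return(cum, 5 * TRADING_DAYS_PER_YEAR):.2%}")
```

A 100% cumulative return over five years works out to roughly 14.9% per year when compounded, noticeably less than the 20% that simple division would suggest. The same cumulative return earned over a shorter window produces a much higher annualized figure, which is why differing start dates make comparisons unreliable.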

Trailing 1-Month Return

Measures a symphony’s total gain or loss over the past month (~30 calendar days or ~21 trading days).
Starts one day after the corresponding date one month ago.
- Example: If the backtest ends on November 20, the trailing 1-month return is measured from October 21 to November 20.
Useful for identifying short-term momentum but not ideal for long-term evaluation.

Trailing 3-Month Return

Measures a symphony’s total gain or loss over the past three months (~90 calendar days or ~63 trading days).
Similar to the 1-month return, but provides a broader view of short-term trends.
Helps identify “hot” symphonies but is not necessarily an indicator of long-term sustainability.
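
Trailing returns can be approximated from a daily value series by counting back a fixed number of trading days. This sketch uses the ~21 and ~63 trading-day figures mentioned above instead of exact calendar dates, so results may differ slightly from what Composer reports. The function name and data are hypothetical.

```python
def trailing_return(values, trading_days):
    """Return over the last `trading_days` sessions of a daily value series."""
    if len(values) <= trading_days:
        raise ValueError("not enough history for this window")
    return values[-1] / values[-1 - trading_days] - 1

# Hypothetical daily portfolio values: 70 sessions rising by 1 each day.
values = [100 + i for i in range(70)]

one_month = trailing_return(values, 21)    # 169 / 148 - 1
three_month = trailing_return(values, 63)  # 169 / 106 - 1
print(f"1-month: {one_month:.2%}, 3-month: {three_month:.2%}")
```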

Risk-Adjusted Performance Metrics:

Sharpe Ratio (Reward-to-Variability Ratio)

Measures risk-adjusted performance by dividing the return earned above a risk-free asset (e.g., U.S. Treasuries) by the volatility (standard deviation) of returns.
Higher Sharpe Ratios suggest higher returns per unit of risk, but extremely high values can indicate potential over-optimization.
Typical Sharpe Ratios:
- Most publicly traded assets have a Sharpe below 2.5.
- A Sharpe above 3.0 is often viewed with skepticism, as it may indicate unsustainable risk-taking.
- Example: MicroStrategy (MSTR), despite gaining 600% in 2024, had a Sharpe of ~2.5.
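
For readers who want to see the arithmetic, here is a minimal sketch of an annualized Sharpe ratio computed from daily returns. It assumes a 252-trading-day year, a constant risk-free rate, and the sample standard deviation. Composer's exact inputs are not specified in this guide, so treat the function names and data as hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

TRADING_DAYS = 252  # assumed trading days per year

def annualized_volatility(daily_returns):
    """Standard deviation of daily returns, scaled to a yearly figure."""
    return stdev(daily_returns) * sqrt(TRADING_DAYS)

def sharpe_ratio(daily_returns, annual_risk_free=0.0):
    """Mean excess daily return per unit of daily volatility, annualized."""
    daily_rf = annual_risk_free / TRADING_DAYS
    excess = [r - daily_rf for r in daily_returns]
    return mean(excess) / stdev(excess) * sqrt(TRADING_DAYS)

# Hypothetical daily returns for one trading week.
returns = [0.01, -0.005, 0.007, 0.002, -0.003]
print(round(sharpe_ratio(returns), 2))
print(round(sharpe_ratio(returns, annual_risk_free=0.05), 2))
```

Raising the risk-free rate lowers the ratio, and a handful of unusually smooth days can produce a very large value. This is one reason very high Sharpe ratios on short windows deserve skepticism.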

Standard Deviation (StDev) - Volatility Measurement

Measures the variability of a symphony’s performance relative to its mean return over the past 12 months.
Lower values indicate more stable investments, while higher values suggest greater risk and price fluctuations.
Standard Deviation Examples (At the Time of Writing):
- SPY: 12.2% (Low volatility)
- Berkshire Hathaway (BRK-B): 14.4%
- MicroStrategy (MSTR): 100%+ (Highly volatile)
A StDev above 100% is rare and often signals unusual or unsustainable performance patterns.

Max Drawdown (DD) - Peak-to-Trough Decline

Measures the largest decline from a symphony’s highest peak to its lowest trough during the backtest.
- Example: If a symphony’s Max Drawdown = 21%, it means the worst decline in the backtest period reduced its value by 21%.
Smaller Max DD values indicate less risk, making them a highly desirable trait in symphony design.
Longer backtest periods tend to have higher Max DD values, as they encompass more market downturns.
The “Holy Grail” of symphony design is a low Max Drawdown paired with strong annualized returns, but this combination is rare.
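
Max Drawdown is straightforward to compute from a series of portfolio values by tracking the running peak. The sketch below is a generic implementation on hypothetical data, chosen to reproduce the 21% example above.

```python
def max_drawdown(values):
    """Largest peak-to-trough decline, as a positive fraction of the peak."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)                    # highest value seen so far
        worst = max(worst, (peak - v) / peak)  # decline from that peak
    return worst

# Hypothetical values: peak of 120, trough of 94.8, then a recovery.
values = [100, 120, 110, 94.8, 105, 130]
print(f"{max_drawdown(values):.0%}")  # prints 21%
```

Note that the later recovery to 130 does not erase the drawdown. The metric records the worst decline experienced at any point in the period.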

Risk-Return Tradeoff Metrics:

Calmar Ratio

Measures a symphony’s annualized return divided by its Max Drawdown.
Most useful for longer time periods, where Max DD values better reflect actual market downturns.
High Calmar Ratios can sometimes be misleading:
- SPY’s one-year Calmar Ratio: 3.77
- MicroStrategy (MSTR) one-year Calmar Ratio: 16.08
- Extremely high Calmar Ratios can be the result of prolonged gains without major reversals, rather than a true indication of low risk.
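
The Calmar Ratio is a single division, which makes it easy to see why short, calm periods inflate it. The numbers below are hypothetical and the function name is an assumption for illustration.

```python
def calmar_ratio(annualized_return, max_drawdown):
    """Annualized return divided by the size of the worst drawdown."""
    if max_drawdown == 0:
        raise ValueError("Calmar ratio is undefined with no drawdown")
    return annualized_return / abs(max_drawdown)

# Same 30% annualized return, two different worst declines.
print(round(calmar_ratio(0.30, 0.21), 2))  # long window including a real downturn
print(round(calmar_ratio(0.30, 0.05), 2))  # short window with no major reversal
```

With an identical return, the ratio jumps from about 1.43 to 6.0 simply because the shorter window never saw a serious decline.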

Historical Allocations in Composer Backtests

The Historical Allocations table (also available as a graph) provides a daily snapshot of which assets your backtested symphony would have held at any point during the backtest, along with their respective allocations.

This section of the backtest is useful for:

Identifying asset holdings on key market days (e.g., during extreme rallies or crashes).
Analyzing how often the symphony holds each investable asset.
Assessing each asset’s contribution to the symphony’s overall valuation on a day-to-day basis.

For example, if your backtest chart shows a sharp drawdown in September 2022, you could review the Historical Allocations table to see:

Which assets the symphony held during that period.
Whether specific assets contributed to the decline.
Opportunities to adjust the symphony’s investable tickers or logic to potentially reduce similar drawdowns in the future.
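
If you copy the allocation data out of Composer, the kind of review described above can be sketched as follows. The data layout (a date mapped to ticker weights), the sample values, and the function names are all hypothetical. Composer's own export format may differ.

```python
from collections import Counter

# Hypothetical daily allocations: ISO date -> {ticker: portfolio weight}.
allocations = {
    "2022-09-12": {"TQQQ": 1.0},
    "2022-09-13": {"TQQQ": 0.5, "BIL": 0.5},
    "2022-09-14": {"BIL": 1.0},
    "2022-09-15": {"BIL": 1.0},
}

def holding_frequency(allocs):
    """Fraction of days on which each ticker had a non-zero weight."""
    counts = Counter(
        ticker
        for weights in allocs.values()
        for ticker, weight in weights.items()
        if weight > 0
    )
    return {ticker: n / len(allocs) for ticker, n in counts.items()}

def holdings_on(allocs, day):
    """Weights held on a given day (empty if the day is not in the data)."""
    return allocs.get(day, {})

print(holding_frequency(allocations))
print(holdings_on(allocations, "2022-09-13"))
```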

While refining a symphony based on historical performance may seem beneficial, this approach can lead to overfitting—a key risk every Composer investor should understand.

In the next section, we’ll explore overfitting, its implications, and how it can impact a symphony’s long-term performance.

What is Overfitting in a Composer Backtest?

Overfitting occurs when a symphony’s conditions are designed to exploit past market-moving events, rather than adapt to future uncertainties. While this approach can generate exceptional backtested results, including high annualized and cumulative returns with minimal drawdowns, these results are often misleading. An overfit symphony effectively “predicts” the future within a backtest by responding to historical events it already “knows” have occurred, making it unreliable in real-world market conditions.

For example, you could create a strategy that aggressively invests in MicroStrategy (MSTR) in 2023 while avoiding it in 2022, using carefully selected metrics to achieve the best possible backtested results. While this may produce outstanding simulated performance, once the symphony runs in a live market environment, its effectiveness may decline because historical conditions cannot reliably generate forward-looking signals.

It is not always easy to determine whether a symphony has been overfit, but there are some key indicators to watch for:

Unusually high accuracy in rotating into tickers just before major price movements.
Extremely low drawdowns, even during significant market corrections.
Exceptional backtested performance before the Out-of-Sample (OOS) Start Date, followed by weaker performance afterward.
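
The last indicator can be checked numerically by splitting a symphony's daily returns at its Out-of-Sample Start Date and annualizing each side. The sketch below assumes a 252-trading-day year and uses made-up returns and dates. The function names are hypothetical.

```python
def annualized(daily_returns, trading_days=252):
    """Compound daily returns and scale the result to a yearly rate."""
    if not daily_returns:
        raise ValueError("no returns in this window")
    growth = 1.0
    for r in daily_returns:
        growth *= 1 + r
    return growth ** (trading_days / len(daily_returns)) - 1

def in_vs_out_of_sample(dated_returns, oos_start):
    """Split (iso_date, daily_return) pairs at oos_start; annualize each side."""
    before = [r for d, r in dated_returns if d < oos_start]
    after = [r for d, r in dated_returns if d >= oos_start]
    return annualized(before), annualized(after)

# Hypothetical data: strong returns before the OOS date, weak ones after.
history = [(f"2024-01-{d:02d}", 0.002) for d in range(1, 11)]
history += [(f"2024-02-{d:02d}", 0.0002) for d in range(1, 11)]

in_sample, out_sample = in_vs_out_of_sample(history, "2024-02-01")
print(f"in-sample: {in_sample:.1%}, out-of-sample: {out_sample:.1%}")
```

A large gap between the two figures does not prove overfitting on its own, especially over short windows, but it is a reason to look more closely at the symphony's logic.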

However, strong backtested performance alone is not necessarily proof of overfitting. The best way to assess whether a symphony is overfit is to analyze its logic and conditions.

A major red flag is overly specific conditional logic, particularly when using intricate, fine-tuned thresholds. For instance, a symphony that applies conditions such as “if the 13-day moving average of TQQQ is less than the 71-day RSI of TECL” or “if the 19-day RSI of BITO is greater than 77.31%” is likely optimized for historical trends rather than adaptable market behavior. This is especially concerning when applied to low-volume tickers, where price movements can be erratic and unreliable.

Complexity is not inherently bad, but symphonies that rely on highly intricate and inexplicable logic are more likely to be overfit. This is not a strict rule, but it serves as a general guideline for evaluating whether a symphony is designed for robust, forward-looking performance or simply tailored to past data.

How to use this information to choose symphonies

Choosing the right symphony (or symphonies) for your Composer portfolio can seem intimidating at first, especially if you’re leaping into Discover or the symphony editor for the first time with no real context.

This guide is meant to help you understand the various components and aspects of Composer and its symphonies, so you can better identify and select symphonies that work for your investment goals and risk tolerance. However, no one becomes an “expert” at algorithmic investing overnight, or even within just a few days, just as no one becomes an “expert” at any other approach to investing overnight.

Learning how to identify the best symphonies for your goals and risk tolerance will be an ongoing effort, and you’ll get better at it over time as you investigate more symphonies and develop a greater understanding of how everything works on this platform.