Portfolio support for Bar data #2236

Open
stefansimik opened this issue Jan 23, 2025 · 8 comments
Labels
enhancement New feature or request

@stefansimik
Contributor

stefansimik commented Jan 23, 2025

Current Behavior

The Portfolio component doesn't fully support bar-based backtesting, leading to incorrect PnL calculations when using OHLCV bar data. Specifically:

  • Portfolio.unrealized_pnl() returns None while positions are open
  • Portfolio.realized_pnl() returns incorrect values (zero profit) after positions are closed
  • The Portfolio module doesn't import or handle Bar data natively

Proposed Solution

Implement bar support based on Bar-to-Tick Conversion:

  • Convert each OHLCV bar into synthetic trade ticks
  • Split each bar into 4 proportional trade events based on OHLC prices
  • Distribute volume and timestamps across the synthetic ticks
  • Leverage existing tick-based Portfolio functionality
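
The four steps above could look roughly like the sketch below. It is not Nautilus code: `Bar`, `SyntheticTick`, and `bar_to_ticks` are hypothetical names, and the fixed open-high-low-close ordering, the even volume split, and the evenly spaced timestamps are simplifying assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bar:
    # Hypothetical stand-in for an OHLCV bar (not the Nautilus Bar type).
    ts_open: int   # bar open time in nanoseconds
    ts_close: int  # bar close time in nanoseconds
    open: float
    high: float
    low: float
    close: float
    volume: float


@dataclass(frozen=True)
class SyntheticTick:
    # Hypothetical stand-in for a trade tick.
    ts: int
    price: float
    size: float


def bar_to_ticks(bar: Bar) -> list[SyntheticTick]:
    """Split one bar into 4 synthetic trade ticks.

    Assumptions: prices are visited in O, H, L, C order, volume is split
    evenly, and timestamps are spaced evenly from bar open to bar close.
    """
    prices = (bar.open, bar.high, bar.low, bar.close)
    span = bar.ts_close - bar.ts_open
    size = bar.volume / len(prices)
    return [
        SyntheticTick(ts=bar.ts_open + span * i // 3, price=price, size=size)
        for i, price in enumerate(prices)
    ]


bar = Bar(ts_open=0, ts_close=60_000_000_000,
          open=100.0, high=103.0, low=99.0, close=102.0, volume=100.0)
for tick in bar_to_ticks(bar):
    print(tick)
```

The real intra-bar price path is unknown, so any ordering of the high and low is a modelling choice; a production version would need to make that choice explicit.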

Reproduction

A minimal example demonstrating the current limitation is attached.

run_backtest.py.zip

It is a fully reproducible example in a single file (including artificial bar data).
Just unzip and run.

Strategy setup is extremely simple:

  • Just 12 artificial bars (1-min) with increasing prices
  • Just 2 orders: BUY MARKET at the 1st bar and SELL MARKET at the 7th bar
  • Commission $2.50 per contract

I've added two debug points in the code to make it easy to see the empty values returned by the Portfolio object:

  1. Line 93 (during open position): unrealized_pnl returns None instead of profit
  2. Line 101 (after position close): realized_pnl returns 0 when there should be profit
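
For reference, here is a minimal sketch of the kind of values one would expect at those two debug points. The prices, the contract multiplier of 1, and the choice to net commissions against realized PnL are hypothetical assumptions for illustration; they are not taken from the attached script or from the Nautilus implementation.

```python
def unrealized_pnl(entry_price: float, mark_price: float,
                   quantity: float, multiplier: float = 1.0) -> float:
    """Open-position PnL of a long position marked at the latest price."""
    return (mark_price - entry_price) * quantity * multiplier


def realized_pnl(entry_price: float, exit_price: float, quantity: float,
                 commission_per_contract: float, multiplier: float = 1.0) -> float:
    """Closed-position PnL of a long round trip, net of commissions.

    Assumes commission is charged on both the entry and the exit fill.
    """
    gross = (exit_price - entry_price) * quantity * multiplier
    commissions = 2 * commission_per_contract * quantity
    return gross - commissions


# Hypothetical prices: bought 1 contract at 100.0, later bar closes at 103.0.
print(unrealized_pnl(100.0, 103.0, 1))      # 3.0

# Sold at 106.0 with a 2.50 commission per contract on each fill.
print(realized_pnl(100.0, 106.0, 1, 2.50))  # 1.0
```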

Why this is important

This is my main rationale for why bar data should be treated as a first-class citizen in Nautilus in the future.

Three key factors support this perspective:

  1. First experience: Bar data is typically the first type of data that beginners encounter, as it's widely available and accessible. When these commonly available data sources don't function properly, it creates immediate problems, raises repetitive questions, and causes disillusionment during users' first experiences with Nautilus Trader. It's therefore crucial that these common usage scenarios work seamlessly.
  2. Cost-effectiveness: The price disparity between 1-minute and tick data is substantial. Looking at EUR/USD futures from DataBento over a 5-year period illustrates this:
    • 1-minute bars: 9.03 USD
    • Tick data: 92.59 USD (roughly 10x more expensive)
  3. Practical workflow: During initial strategy development, we need to analyze multiple markets across several years. Using tick data would quickly drive costs into thousands of dollars. Starting with bar data lets developers identify promising strategies first, then selectively invest in higher-resolution data for the most promising markets.

These factors make bar data support essential for practical usability in these basic, meaningful and widely needed scenarios.

2 possible implementation approaches

There are two viable approaches to handling bar data in the Portfolio component
(as mentioned in this Discord message):

1. Pre-processing Bars to Ticks (Optimal Performance)

This approach involves converting bars to synthetic ticks before adding them to the engine:

  • Advantages:

    • Highest performance due to reduced runtime computation
    • Consistent with existing tick-based architecture
    • Can be cached/stored for repeated backtests
    • Allows fine-grained control over tick generation
  • Implementation:

    • Process bars into quote/trade ticks as a preprocessing step
    • Store generated ticks or pipe directly to engine
    • Leverage existing examples in documentation
    • No runtime overhead during backtest execution

2. On-the-fly Bar Conversion (Maximum Convenience)

This approach lets Nautilus automatically convert bars to ticks during runtime:

  • Advantages:

    • Zero setup required from users
    • Works out of the box with raw bar data
    • Consistent with existing bar-to-tick conversions in matching engine
    • Simplifies initial user experience
  • Implementation:

    • Extend existing bar-to-tick conversion logic to Portfolio module
    • Add automatic conversion triggers on Portfolio updates
    • Ensure proper timestamp/volume distribution
    • Cache converted ticks within backtest run
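
A rough sketch of the on-the-fly variant follows, assuming a hypothetical adapter that sits in front of an existing tick handler. `OnTheFlyBarAdapter` and its `on_tick` callback are illustrative names, not Nautilus APIs, and the open-high-low-close ordering with evenly spaced timestamps is again only an assumption.

```python
from typing import Callable


class OnTheFlyBarAdapter:
    """Hypothetical runtime adapter: converts each incoming bar to 4 price
    updates and forwards them to an existing tick-based handler."""

    def __init__(self, on_tick: Callable[[str, int, float], None]) -> None:
        self._on_tick = on_tick
        # Converted ticks are cached per (instrument, bar close time) for the run.
        self._cache: dict[tuple[str, int], list[tuple[int, float]]] = {}

    def on_bar(self, instrument_id: str, ts_open: int, ts_close: int,
               open_: float, high: float, low: float, close: float) -> None:
        key = (instrument_id, ts_close)
        ticks = self._cache.get(key)
        if ticks is None:
            span = ts_close - ts_open
            ticks = [
                (ts_open + span * i // 3, price)
                for i, price in enumerate((open_, high, low, close))
            ]
            self._cache[key] = ticks
        for ts, price in ticks:
            self._on_tick(instrument_id, ts, price)


received = []
adapter = OnTheFlyBarAdapter(lambda inst, ts, price: received.append((ts, price)))
adapter.on_bar("TEST.SIM", 0, 60, 1.0, 3.0, 0.5, 2.0)  # hypothetical instrument id
print(received)
```

The user only ever calls `on_bar`; the conversion is hidden behind the adapter, which is the convenience this approach trades against extra runtime work.
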
@faysou
Collaborator

faysou commented Jan 23, 2025

Why not just use the close of a bar? The order execution is already responsible for distributing volume across a bar.

I suppose subscribing to external bars should be enough to have the most granular bars.
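
As a minimal sketch of that idea, assuming positions are simply marked at the most recent bar close: `BarClosePricer` is a hypothetical helper, not part of Nautilus, and the instrument id below is made up.

```python
from typing import Optional


class BarClosePricer:
    """Hypothetical helper that marks positions at the latest bar close."""

    def __init__(self) -> None:
        self._last_close: dict[str, float] = {}

    def on_bar(self, instrument_id: str, close: float) -> None:
        # Only the close is kept; open, high, and low are ignored.
        self._last_close[instrument_id] = close

    def unrealized_pnl(self, instrument_id: str, entry_price: float,
                       signed_qty: float) -> Optional[float]:
        """signed_qty is positive for long and negative for short positions."""
        close = self._last_close.get(instrument_id)
        if close is None:
            return None  # no bar seen yet, so there is nothing to mark against
        return (close - entry_price) * signed_qty


pricer = BarClosePricer()
print(pricer.unrealized_pnl("TEST.SIM", 100.0, 1))  # None
pricer.on_bar("TEST.SIM", 103.0)
print(pricer.unrealized_pnl("TEST.SIM", 100.0, 1))  # 3.0
```

Compared with splitting each bar into four ticks, this needs one price update per bar and no assumptions about the intra-bar path.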

@stefansimik
Contributor Author

> Why not just use the close of a bar? The order execution is already responsible for distributing volume across a bar.
>
> I suppose subscribing to external bars should be enough to have the most granular bars.

Yeah, I am fully open 👍
By mentioning the "bar-to-tick" conversion, I just wanted to emphasize that some related functionality is already implemented and could be reused or taken into consideration.

@faysou
Collaborator

faysou commented Jan 23, 2025

I'll have a look at some point, curious about it

@faysou
Collaborator

faysou commented Jan 23, 2025

I've solved both issues: PnL calculation when using bars, and realized_pnl being 0 after the position is closed (it's not 0 anymore).

I really like your examples, testing often takes me as much or more time than writing code.

@stefansimik
Contributor Author

Unbelievable how quickly 🙏
I will be ready to check & test with you anytime 😊👍

@faysou
Collaborator

faysou commented Jan 23, 2025

I've worked on some complicated features, and this one was easy in comparison. On the other hand, experience is starting to accumulate as well: I'm getting to know more and more components of Nautilus.

Also, the bugs you find and I sometimes fix, or the features, are things that will be useful to me as well.

@faysou
Collaborator

faysou commented Jan 23, 2025

I think the hardest feature I worked on was the mixed data request client + catalog. It was hard and required touching a lot of things in the system. Aggregating bars before the start of a backtest was also hard, especially for time bars, where there are a lot of special cases. Maybe you could test that more as well; I think it works, but the more people use it the better.

@zhaoyiming808

thank you faysou, I really need it❤️
