Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issues with PP's trade matching algorithms/code #4461

Open
pfalcon opened this issue Jan 5, 2025 · 2 comments
Open

Issues with PP's trade matching algorithms/code #4461

pfalcon opened this issue Jan 5, 2025 · 2 comments
Labels

Comments

@pfalcon
Copy link
Contributor

pfalcon commented Jan 5, 2025

There're some issues with how PP implements matching individual transactions into trades. This is important because:

  1. PP doesn't support short trades (Short Positionen  #353), and due to the issues, adding support for them is complicated.
  2. Even in the current functionality, there're subtle bugs like Incorrect FIFO order used when doing multiple transfers #4446
  3. Given all the duplication and discrepancies, it's hard to believe that the results produced is consistent or correct in all cases.

The high-level issues are:

  1. The code to match trades is duplicated and repeated several times throughout the application, with subtle, and nor-so-subtle differences in approach and behavior.
  2. Some duplications use too simplistic data structures and/or algorithms to support features like short trades.
  3. All the code is under-documented and under-commented, so it's actually not possible to easily enough to assess what exactly it's doing (i.e. what are requirements to handle various cases), and whether that's implemented fully/correctly.

Specific drill-downs into each issue:

Code duplication

The most advanced trade matching implementation is snapshot.trades.TradeCollector . If to believe #4446, it works almost as intended, but still has bugs.

Another realm of code is in snapshot.security subpackage, where there's 2 similar scaffolding algorithm to match trades in CapitalGainsCalculation and CostCalculation. They literally differ only in processing of matched trades, while the matching algorithm appears to be similar (note: it should be the same to get consistent results). It begs to have been factored out into common base algorithm, but no.

Data structures

TradeCollector uses per-portfolio queue (FIFO) data structures to match trades from transactions, and that's the adequate way. CapitalGainsCalculation/CostCalculation use single FIFO, and that cannot work with short trades: if one portfolio has an open BUY trade, while another gets opening SELL (i.e. short trade), then this SELL shouldn't be paired with an open BUY in another portfolio. There's also strong suspicion that single-FIFO algorithm can't process TRANSFER transaction properly (or at least how TradeCollector does, which again leads to concern that trades may be shown to user in one way, while performance metrics for them calculated in subtly different way).

On the other hand, TradeCollector starts with breaking natural transaction order by forcing them sorted by timestamp (which is fine) and buy/sell type, which is not compatible with opening SELL transactions.

Beyond that, to support short trades, Trade object itself, and related LineItems for CapitalGainsCalculation/CostCalculation should include an (explicit or implicit) marker whether it's long or short trade.

Specification

Given all the duplications and variations, it's unclear which of the algorithms works "more correct" in which aspect of its behavior, given that the code is undocumented with docstrings and barely commented. #4459 and #4460 show that even simpler algorithms than trade matching do weird things.

Another example is #4446: if it was well specified how transfer transaction should work and in which code achieves that, such bugs wouldn't happen or would be easy to fix.

@pfalcon pfalcon added the bug label Jan 5, 2025
@pfalcon
Copy link
Contributor Author

pfalcon commented Jan 5, 2025

Some analysis/ideas:

Of 2 code realm above, CapitalGainsCalculation/CostCalculation is the older implementation, tracing back to 2014. It would have called for the trade matching algo from CostCalculation to be factored out and reused for CapitalGainsCalculation, but nah, was duplicated instead.

Now, in 2019, TradeCollector was introduced, which is again, the most advanced and most correct (see data structures comparison) implementation. *Calculation vs TradeCollector however work a bit differently, the former gets ValuationAtStart/End records, and constructs Trail. Given that they still match trades first, it would seem possible to make the pipeline be: Transactions -> TradeCollector -> *Calculation. Figuring out what are the complications on that way is a different story though.

But I'm writing all this from the experience of trying to implement short trades support. It's actually not too cumbersome to patch basic support (at the expense of stuff like transfers) to Trade/TradeCollector, as all the data structrues are already there. But then there's long trail of patch-repatching everything again and again in Calculation's.

And there can be no talk about implementing more advanced features like #4437 without single place in the codebase to form trades. So, that should be the only way forward.

@pfalcon
Copy link
Contributor Author

pfalcon commented Jan 8, 2025

As #4459 (comment) mentions, there're different performance types are being calculated, one is breaking down by security, another - by account. That may be a reason why there's code duplication.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant