Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance for large models #750

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

MBradbury
Copy link

Addresses some performance issues identified in #749.

Primary changes:

  1. LpConstraint no longer inherits from LpAffineExpression, this is so a reference can be kept to the already created object instead of copying it
  2. Avoid creating temporary LpAffineExpression when building a LpConstraint with a RHS that is an int/float
  3. Avoid creating temporary lists (e.g., of Iterables before iterating over them, before transforming lists into another form)

In some experiments with a model I am working with I have observed 4x improvement in setting constraints. pulp/apis/cplex_api.py:buildSolverModel has gone from 70s to 50s.

Comparison of numbers in performance script from #749 comparing the original performance to performance with this PR. Using list of tuples and passing that directly to LpAffineExpression remains faster than using lpSum.

Name Original PR
Slow Objective 16.692 16.595
Slow Constraint 22.000 6.435
Fast Objective 2.945 2.999
Fast Constraint 20.278 5.552

@MBradbury
Copy link
Author

Just a note that I added some type hinting to help me understand the code. Happy to remove if needed.

@pchtsp
Copy link
Collaborator

pchtsp commented May 6, 2024

please run the black linter as per instructions: https://coin-or.github.io/pulp/develop/contribute.html#applying-the-black-linter-formatter (and check any other step you may be missing). I'll try to review the changes soon.

Copy link
Collaborator

@pchtsp pchtsp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes. they look comprehensive. I also appreciate the type hints.

I'm curious though if there was a way to keep the inheritance of LpAffineExpression without making a copy when initializing the LpConstraint. Can't we just modifiy the LpConstraint init in line 1012 to fill the values without copying them? and then we would keep everything else as is.

@@ -339,6 +339,8 @@ def buildSolverModel(self, lp):
lp.solverModel.objective.set_sense(
lp.solverModel.objective.sense.maximize
)
if lp.objective is None:
raise PulpSolverError("No objective set")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in theory it's possible to not set an objective function afaik and just model a constraint satisfaction problem. Isn't that the case?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure, I added this because the next line calls get on lp.objective which will raise an exception if the objective is None which the type checker was saying was an option.

@MBradbury
Copy link
Author

Thanks for the changes. they look comprehensive. I also appreciate the type hints.

I'm curious though if there was a way to keep the inheritance of LpAffineExpression without making a copy when initializing the LpConstraint. Can't we just modifiy the LpConstraint init in line 1012 to fill the values without copying them? and then we would keep everything else as is.

This will still lead to a large copy as it is essentially copying a dict to another dict. So for the test file that would be a copy of a 3 million entry dict.

Another approach could be to have LpAffineExpression not inherit from dict and instead have it as a variable. Then the constraint would just use that variable when inherited from LpAffineExpression, avoiding the large copy.

@MBradbury
Copy link
Author

MBradbury commented May 6, 2024

I have tried to decouple LpAffineExpression from its dict base class, but this has led to test failures I cannot get to the bottom of. This commit can be found here: MBradbury@12feb60 if you have the opportunity to see what might be wrong.

@stumitchell
Copy link
Contributor

Looking good however, I suggest that we do a major version change as the change to modifying things in place instead of copying may cause errors in existing code.

@pchtsp
Copy link
Collaborator

pchtsp commented May 27, 2024

it seems some tests are failing

@MBradbury
Copy link
Author

Looks like LpConstraint needs a asCplexLpAffineExpression method that calls the method of the same name on expr. Or the test can be updated to use LpProblem.objective.expr.asCplexLpAffineExpression instead of LpProblem.objective.asCplexLpAffineExpression.

@christiansegercrantz
Copy link

What's the current situation with this PR, is it still being worked on? I'm currently experiencing a very similar situation where the solving itself is taking 1.7s but I accumulate 22s of time from LpSum.

@MBradbury
Copy link
Author

MBradbury commented Nov 26, 2024

There was a small change that I mentioned in an earlier message, I just don't have the time to address it currently. I expect that this will be an issue for me in March/April, when I need to use Pulp for an annual task, so will likely take a look again then.

When I originally took a look the biggest design issue is that LpAffineExpression inherits from a dict. Ideally this would not be the case to avoid excessive copying, but fixing this issue was even harder to get right than the current attempt at speeding up Pulp.

Edit: Just a note that I still have no solution to directly constructing a pulp.LpAffineExpression being faster than a pulp.lpSum. So I always just construct a pulp.LpAffineExpression directly now.

So instead of:

prob = pulp.lpSum([
                dv[a, b] * reward[a, b]
                for a in range(len(A))
                for b in range(len(B))
            ])

I do:

prob = pulp.LpAffineExpression([
                (dv[a, b], reward[a, b])
                for a in range(len(A))
                for b in range(len(B))
            ])

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants