Support different data types for optimization parameters #93

Closed
Engineero opened this issue May 10, 2018 · 16 comments · Fixed by #531

@Engineero

Engineero commented May 10, 2018

It would be nice to support different data types (e.g. int, float, bool, and maybe a categorical string) for the parameters over which we optimize. I am not sure what the syntax would look like, except perhaps a list of data types, passed in alongside the parameter bounds, with one entry per parameter.

All three of these types could be handled the same way, with int being drawn uniformly from the integer interval specified, bool being drawn uniformly from {0, 1}, and categorical strings being mapped to a drawing from integer values [0, 1, ..., n_categories-1] or one-hot encoded as @PedroCardoso suggested below.
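A minimal sketch of the sampling scheme described above, using only the standard library. The `SPEC` layout, the type tags, and `sample_point` are hypothetical names made up for illustration; none of this is an existing bayes_opt API.

```python
import random

# Hypothetical parameter spec: name -> (type tag, domain). Not a bayes_opt API.
SPEC = {
    "n_layers": ("int", (1, 8)),
    "dropout": ("float", (0.0, 0.5)),
    "use_bias": ("bool", None),
    "kernel": ("categorical", ["rbf", "linear", "poly"]),
}


def sample_point(spec, rng):
    """Draw one random point, treating every discrete type as an integer draw."""
    point = {}
    for name, (kind, domain) in spec.items():
        if kind == "float":
            low, high = domain
            point[name] = rng.uniform(low, high)
        elif kind == "int":
            low, high = domain
            point[name] = rng.randint(low, high)  # inclusive on both ends
        elif kind == "bool":
            point[name] = bool(rng.randint(0, 1))  # uniform over {0, 1}
        elif kind == "categorical":
            index = rng.randint(0, len(domain) - 1)  # index in [0, n_categories - 1]
            point[name] = domain[index]
        else:
            raise ValueError(f"unknown parameter type: {kind}")
    return point


rng = random.Random(0)
print(sample_point(SPEC, rng))
```

This only covers drawing random points; how the surrogate model should treat the discrete dimensions is a separate question.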

See [E. C. Garrido-Merchan and D. Hernandez-Lobato, 2017] for one approach.

@fmfn
Member

fmfn commented May 18, 2018

Interesting, the kernel change they propose wouldn't be too hard to implement. My only concern is making the API more and more cumbersome by piling on features. However, this one is requested often enough to be worth considering.
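A rough sketch of what such a kernel change could look like, assuming it amounts to rounding the integer-valued coordinates before the covariance is evaluated, i.e. computing k(T(x), T(y)) instead of k(x, y). The function names and the plain-Python RBF kernel are illustrative assumptions, not code from bayes_opt or from the paper.

```python
import math


def rbf(x, y, length_scale=1.0):
    """Squared-exponential kernel on plain lists of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)


def round_integer_dims(x, integer_dims):
    """Round the coordinates listed in integer_dims; leave the rest unchanged."""
    return [float(round(v)) if i in integer_dims else v for i, v in enumerate(x)]


def transformed_kernel(x, y, integer_dims, base_kernel=rbf):
    """Evaluate the base kernel on transformed inputs, k(T(x), T(y))."""
    return base_kernel(
        round_integer_dims(x, integer_dims),
        round_integer_dims(y, integer_dims),
    )


# 2.2 and 1.8 both round to 2, so the transformed kernel treats them as one point.
print(transformed_kernel([2.2, 0.5], [1.8, 0.5], integer_dims={0}))
print(rbf([2.2, 0.5], [1.8, 0.5]))  # the untransformed kernel still separates them
```

With this transformation the surrogate is constant across all continuous values that round to the same integer, which is the behaviour one would want for an integer parameter.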

@fmfn fmfn self-assigned this May 18, 2018
@Engineero
Author

I can try to take a look at it too. I'll let you know if I get anywhere.

@PedroCardoso

Could I propose a different approach for categorical string data types?
I would suggest a one-hot encoding, which in practice creates n boolean dimensions in the search space. This fits because the categories are independent of one another.
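The one-hot idea can be sketched as follows. The category list and helper names are hypothetical, and decoding by taking the largest coordinate is just one possible convention for mapping a continuous suggestion back to a category.

```python
CATEGORIES = ["rbf", "linear", "poly"]  # hypothetical categories


def one_hot_encode(value, categories):
    """Map a category to n coordinates, one per category."""
    return [1.0 if c == value else 0.0 for c in categories]


def one_hot_decode(vector, categories):
    """Pick the category with the largest coordinate (ties go to the first)."""
    best = max(range(len(categories)), key=lambda i: vector[i])
    return categories[best]


print(one_hot_encode("linear", CATEGORIES))         # [0.0, 1.0, 0.0]
print(one_hot_decode([0.2, 0.7, 0.4], CATEGORIES))  # linear
```

Unlike the integer mapping, this does not impose an artificial ordering on the categories, at the cost of n dimensions per categorical parameter.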

@PedroCardoso

Has anyone made progress on this?

@dehdari

dehdari commented Aug 6, 2018

It would be really useful to have these types supported.

@guidocalvano

+1 I'd like this too!

@dingtine

+1 I'd like this too!

@jmehault

I proposed an implementation for the integer type in a merge request. It is more of a first attempt than finished work, and improvements can be made. Could anyone take a look and discuss it, please?

@gustavolvieira

Great suggestion! Parameter typing would be really useful, especially for categorical parameters.

@janwendt

janwendt commented Mar 22, 2019

+1 I would like to use integers

@friedsela

friedsela commented Apr 16, 2019

Is it possible to exclude specific points in the bounds?
I mean, when defining
BayesianOptimization(f=black_box_function, pbounds={'e': (0, 1)})
I do not actually want 'e' to be 0.
Sure, I could write pbounds={'e': (0.0001, 1)} or something like that, but it is not very clean.

Another thing: it sometimes "gets stuck" on points (iterations 10-16), which seems like a waste:

| iter | target | e |
| --- | --- | --- |
| 1 | 0.7492 | 0.2963 |
| 2 | 0.03762 | 0.7072 |
| 3 | 0.6771 | 0.2084 |
| 4 | 0.4013 | 0.4408 |
| 5 | 0.03448 | 0.9871 |
| 6 | 0.3762 | 0.001 |
| 7 | 0.7429 | 0.2671 |
| 8 | 0.7461 | 0.287 |
| 9 | 0.721 | 0.3317 |
| 10 | 0.7492 | 0.2928 |
| 11 | 0.7492 | 0.2927 |
| 12 | 0.7492 | 0.2925 |
| 13 | 0.7492 | 0.2928 |
| 14 | 0.7492 | 0.2917 |
| 15 | 0.7492 | 0.291 |
| 16 | 0.7492 | 0.2945 |
| 17 | 0.7461 | 0.2891 |
| 18 | 0.7367 | 0.3024 |
| 19 | 0.03448 | 0.8466 |
| 20 | 0.05016 | 0.5746 |

Is it intentional?

@fmfn
Member

fmfn commented Apr 17, 2019

There's no way to exclude boundary points, and I don't think the feature would justify the extra complexity. As you mentioned, you can simply use (1e-4, 1), or something like it, since the choice of lower bound should be immaterial. If, however, you believe the difference between picking 1e-5 or 1e-3 as a lower bound is important, you should transform this variable to a log scale.

And in this particular example the optimizer was not stuck; it was simply exploiting the region of the maximum around 0.292....
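The log-scale suggestion above can be sketched as follows. The body of `black_box_function` is a made-up stand-in, and `log_scale_objective` and the `log10_e` parameter name are hypothetical; the point is only that the optimizer searches over log10(e) while the original function still receives e.

```python
import math


def black_box_function(e):
    """Hypothetical stand-in objective; only meaningful for e > 0."""
    return -(math.log10(e) + 2.0) ** 2


def log_scale_objective(log10_e):
    """Objective reparameterized so the optimizer searches over log10(e)."""
    return black_box_function(10.0 ** log10_e)


# Bounds on log10(e): e ranges over [1e-4, 1] and never reaches 0.
pbounds = {"log10_e": (-4.0, 0.0)}

print(log_scale_objective(-2.0))  # evaluates black_box_function at e = 0.01
```

The reparameterized function and bounds would then be passed in the same way as in the question, e.g. `BayesianOptimization(f=log_scale_objective, pbounds=pbounds)`, so that equal steps in the search space correspond to equal ratios of e.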

@friedsela

Thank you.

@atisman89

+1 any updates? thanks.

@marcelroed

I'm looking to use this for booleans, any updates?

@till-m
Member

till-m commented Dec 27, 2024

Implemented in #531 and published as a beta release, version 3.0.0b1. If anyone is working on problems that use typed optimization, I would love to get some feedback from beta testers. :) Documentation for the feature is available here (example usage) and here (API reference).
