Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introspect comparison operators #84

Merged
merged 7 commits into from
Oct 19, 2023
Merged

Introspect comparison operators #84

merged 7 commits into from
Oct 19, 2023

Conversation

plcplc
Copy link
Contributor

@plcplc plcplc commented Oct 17, 2023

What

Rather than dreaming up the comparison operators that can apply to each type we ask the database.

How

We introduce a new configuration field, comparisonOperatorAliases, which defines what names to expose database infix comparison operators under. This cannot be derived only from introspection. The default uses the names of graphql-engine v2.

We filter how much of the introspection data ends up in the deployment and schema to include only types, comparison operators, and aggregation functions that are actually used in native queries or tables, in order to keep the size small and the contents concise and relevant.

The operators are sourced from the catalog table pg_operator, because this is supported by both Postgres and CockroachDB. We could relatively easily extend our comparison operators to cover normal prefix predicates, but that is future work.

Answering the question "which operators are defined for a given type" is a somewhat nuanced affair, since we have to take into account operator overloading and implicit casts.

There is even the insular case of an operator (SIMILAR TO in Postgres) being defined as syntactic sugar over LIKE. Since SIMILAR TO is somewhat obscure (IMO) and not really very distinguished from LIKE and REGEX, I've opted not to try and have it appear automatically during configuration/introspection. (And any user who categorically needs to use it are free to define it themselves in the deployment)

@@ -23,6 +23,16 @@ pub struct ComparisonOperator {
pub argument_type: ScalarType,
}

/// Define the names that comparison operators will be exposed as by the automatic introspection.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I understand the purpose of this struct and the related field in the metadata.json. We don't seem to use it in the query-engine nor is it visible in the deployment.json. What is its purpose and should it be here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be present in the deployment. It is passed as an argument to configuration.sql which uses it to decide the name to export operators under.

Copy link
Contributor

@soupi soupi Oct 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be present in the deployment.

Sorry, you are right. I missed it when I looked.

It is passed as an argument to configuration.sql which uses it to decide the name to export operators under.

Perhaps move it to version1.rs since it's not part of the metadata and is not used by the query engine?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I'm still a bit lost about the purpose of the mapping. I thought we start the configuration server without any configuration files. If that's the case, why do we have the mapping in the deployment file? What is it used for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps move it to version1.rs since it's not part of the metadata and is not used by the query engine?

I mainly put it in metadata because that was where the other types were defined, but I like this suggestion!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I'm still a bit lost about the purpose of the mapping. I thought we start the configuration server without any configuration files. If that's the case, why do we have the mapping in the deployment file? What is it used for?

Now that we introspect the comparison operators from the database we only get their names as defined in the database ('=', '<>', '~~', etc.).

We need a means to assign names to these which will work reasonably well in the NDC api and GraphQL especially, so a mapping needs to exist somewhere, which might as well be in configuration, making it customizable and extendable.

When the configuration server is called at first (given likely only a connection string), it assumes a default value for the mapping, which amounts to the same names we used in graphql-engine v2.

Subsequent calls to the configuration server will use the given mapping to inform introspection, letting users pick different names or expose more (or fewer!) operators.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@soupi : I've moved the type to version1.rs :-)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a means to assign names to these which will work reasonably well in the NDC api and GraphQL

Isn't that already what ComparisonOperators is though? It seems we now have two ways to map a graphql operator name to a postgres operator name, and its not obvious to me what the role of each of these is, whether they can diverge and what happens when they do, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mapping directs the names that introspection/configuration server comes up with (i.e., which ComparisonOperators it produces)

The ComparisonOperators decide what gets into the schema.

Would it be less confusing if the comparison operator mapping and the list of excluded schemas were under a shared obejct field (e.g. configurationDefaults) to emphasise that they affect only the configuration server?

It would be neat if we could just use the names of the operators directly and decide what to expose and with what name in the DDN layer like we do tables and columns/fields. Then it would be the responsibility of the LSP server to suggest v2 names for operator mappings.

The reason I did put in the mapping here is that I don't know if the ndc spec is currently capable of mapping and selecting comparison operators, and because it would require more coordination with LSP development.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be less confusing if the comparison operator mapping and the list of excluded schemas were under a shared obejct field (e.g. configurationDefaults) to emphasise that they affect only the configuration server?

That could help I think. Thanks for explaining.

@@ -1228,43 +1209,6 @@
"returnType": "float8"
}
},
"geometry": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We seem to remove a lot of valid operators. How come?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we are filtering types/functions/operators based on relevance, meaning that a type and associated aggregations and comparison operators are only included if there is a table or native query that uses that type.

The intention with this is only to keep the stored introspection data more concise, as there's no point including types that can never actually appear as values.

This is similar to how we by default exclude schemas that we deem to be internal to the database.

@plcplc plcplc disabled auto-merge October 18, 2023 15:20
Copy link
Contributor

@soupi soupi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving the changes, but would appreciate it if the comparison operators would move to where excluded schemas are, as you suggested.

@plcplc plcplc enabled auto-merge October 19, 2023 08:58
@plcplc plcplc added this pull request to the merge queue Oct 19, 2023
Merged via the queue into main with commit 4cd5ee8 Oct 19, 2023
24 checks passed
@plcplc plcplc deleted the plc/issues/NDAT-969 branch October 19, 2023 09:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants