-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(csharp/src/Drivers/Apache/Spark): low latency test cases #1948
base: main
Are you sure you want to change the base?
Conversation
I'm sorry; I seem to have missed this. It looks like there might be some imminent merge conflicts with #2014 (attn @birschick-bq). |
{ | ||
TExecuteStatementReq executeRequest = new TExecuteStatementReq(this.connection.sessionHandle, this.SqlQuery); | ||
SetStatementProperties(executeRequest); | ||
executeRequest.GetDirectResults = new TSparkGetDirectResults(BatchSizeDefault); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Set it to the "current" value of BatchSize, in case it's been update through configuration/programmatically.
executeRequest.GetDirectResults = new TSparkGetDirectResults(BatchSizeDefault); | |
executeRequest.GetDirectResults = new TSparkGetDirectResults(BatchSize); |
ThriftHttpTransport transport = new ThriftHttpTransport(httpClient, config); | ||
_httpClient.BaseAddress = new UriBuilder(Uri.UriSchemeHttps, hostName, -1, path).Uri; | ||
_httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", token); | ||
_httpClient.DefaultRequestHeaders.UserAgent.ParseAdd("SimbaSparkJDBCDriver/2.06.15 Python/PyHive"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd update the UserAgent
value and leave a comment as to why this needs to be particular constant. Ideally, the ADBC driver agent name could also be added to your whitelist.
private static readonly char[] Characters = "0123456789abcdef".ToCharArray(); | ||
private static readonly Random Random = new Random(); | ||
|
||
public string Generate() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a more descriptive name
public string Generate() | |
public string GenerateTraceId() |
long count = 0; | ||
long target = 80; | ||
Queue<String> log = new Queue<string>(); | ||
string FileName = $@"c:\temp\hmsmeta\{DateTime.Now.ToString("yyyyMMdd_HHmm")}__{target}.log"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add a logFolder
property into the configuration file properties. This is not likely portable or desirable.
for (int i = 0; i < 32; i++) | ||
{ | ||
result.Append(Characters[Random.Next(Characters.Length)]); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be extracted to a function with the length as the parameter and then resused for lines 315-318.
@@ -32,6 +32,17 @@ internal SparkStatement(SparkConnection connection) | |||
: base(connection) | |||
{ | |||
} | |||
public override QueryResult ExecuteQuery() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: empty line
public override QueryResult ExecuteQuery() | |
public override QueryResult ExecuteQuery() |
@@ -32,6 +32,17 @@ internal SparkStatement(SparkConnection connection) | |||
: base(connection) | |||
{ | |||
} | |||
public override QueryResult ExecuteQuery() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this override code be applied to ExecuteQueryAsync, too?
var conn = (connection as SparkConnection); | ||
|
||
if(conn != null) | ||
{ | ||
conn.ResetTraceId(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More concise?
```suggestion
if (connection is SparkConnection sparkConnection) sparkConnection.ResetTraceId();
statement.SqlQuery = TestConfiguration.Query; | ||
|
||
QueryResult queryResult = statement.ExecuteQuery(); | ||
for (int i = 0; i < target; i++) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test could be refactored to handle any test "kernel" (lambda) to wrap the test such that we could capture performance information in a log file. What do you think?
using AdbcStatement statement = adbcConnection.CreateStatement(); | ||
statement.SqlQuery = TestConfiguration.Query; | ||
|
||
QueryResult queryResult = statement.ExecuteQuery(); | ||
Tests.DriverTests.CanExecuteQuery(queryResult, TestConfiguration.ExpectedResultsCount); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be the lambda fed into the performance logger.
sorry, I didn't meant to ask for PR right away, since this is still at initial stage, not sure how the reviewers got added. |
You can set it to be a Draft PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like I wrote a bunch of comments a while back and never submitted them. I haven't looked at the most recent iteration, and this is still marked as a draft.
@@ -2,6 +2,7 @@ | |||
|
|||
<PropertyGroup> | |||
<TargetFrameworks>net472;net6.0</TargetFrameworks> | |||
<NoWarn>CS8618</NoWarn> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of disabling this warning, it looks like the _statementResp field should probably just be made nullable.
long count = 0; | ||
long target = 80; | ||
Queue<String> log = new Queue<string>(); | ||
string FileName = $@"c:\temp\hmsmeta\{DateTime.Now.ToString("yyyyMMdd_HHmm")}__{target}.log"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of hardcoding a path, please use something like Path.GetTempPath
and consider deleting the file once the test has finished.
var sparkConn = adbcConnection as SparkConnection; | ||
if(sparkConn == null) | ||
{ | ||
throw new InvalidCastException(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this is test code and we're just throwing a cast exception anyway, the code is a little cleaner as just
var sparkConn = (SparkConnection)adbcConnection;
(The cast will throw if it's not the right type.)
ThriftHttpTransport transport = new ThriftHttpTransport(httpClient, config); | ||
_httpClient.BaseAddress = new UriBuilder(Uri.UriSchemeHttps, hostName, -1, path).Uri; | ||
_httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", token); | ||
_httpClient.DefaultRequestHeaders.UserAgent.ParseAdd("SimbaSparkJDBCDriver/2.06.15 Python/PyHive"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the best choice for a user-agent? Are we driving some specific server behavior by sending this?
No description provided.