-
Notifications
You must be signed in to change notification settings - Fork 1k
PHOENIX-7198 support for multi row constructors in single upsert query #2222
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Probably I need to add more tests for this functionality. |
|
This could be a nice feature! |
|
@richardantal Is there performance gain from using multiple values in a single query ? I don't think so since we can achieve the same by batching multiple upsert statements and then doing a commit. |
| conn.createStatement().execute("UPSERT INTO " + tableName + "(K, INT, INT2) VALUES ('E', 5, 5),('F', 61, 6)"); | ||
| conn.createStatement().execute("UPSERT INTO " + tableName + " VALUES ('C', 31, 32),('D', 41, 42)"); | ||
| conn.createStatement().execute("UPSERT INTO " + tableName + " VALUES ('G', 7, 72),('H', 8)"); | ||
| conn.createStatement().execute("UPSERT INTO " + tableName + " VALUES ('I', 9),('I', 10)"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is valid standard SQL (i.e. not all columns are specified, but no column list either) , but if this works for the single value upsert then it's fine.
| } | ||
|
|
||
| @Test | ||
| public void testValidMultipleUpsert3() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more failure cases to exercise the parser would be nice.
i.e. no value list at all, no commas between, not closing some parentheses and similar.
| conn.createStatement().execute(ddl); | ||
|
|
||
|
|
||
| conn.createStatement().execute("UPSERT INTO " + tableName + " VALUES ('A', 11, 12)"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to also have some tests for expressions and mixed expressions and literals.
i.e.
VALUES (substring('APPLE',0,1), 2*2)
| + constantExpression.toString() + " in column " + column); | ||
|
|
||
| int index = -1; | ||
| for (byte[][] valuesListIem : valuesList) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keeping the two lists in sync like this awkward.
Using an old-style for() would be more readable to me:
for(int index=0; index<valuesList.length();i++)
and then use the index to get both the valuesListItem and constantExpression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also a typo in valuesListIem
My take:
also:
|
|
Thanks for the reviews, I added a bit more tests I also uploaded a simple performance test for this feature to the ticket as an attachment. |
|
10% improvement for bulk loading is nothing to sneeze at @richardantal . |
stoty
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 LGTM
| : UPSERT (hint=hintClause)? INTO t=from_table_name | ||
| (LPAREN p=upsert_column_refs RPAREN)? | ||
| ((VALUES LPAREN v=one_or_more_expressions RPAREN ( ON DUPLICATE KEY ( ig=IGNORE | | ||
| ((VALUES LPAREN e = one_or_more_expressions {v.add(e);} RPAREN (COMMA LPAREN e = one_or_more_expressions {v.add(e);} RPAREN )* ( ON DUPLICATE KEY ( ig=IGNORE | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
Would extracting LPAREN e = one_or_more_expressions {v.add(e);} RPAREN imto a function make sense ?
76ce007 to
4844056
Compare
|
I rebased the change to resolve conflict with spotless change. |
|
The last CI run still shows spotless issues. |
4844056 to
9bf2be8
Compare
|
I rebase the change |
|
Given the grammar changes, we want this only for 5.4.0 right? |
| } | ||
| boolean isAutoCommit = connection.getAutoCommit(); | ||
| if (valueNodes == null) { | ||
| if (valueNodesList.isEmpty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this not also consider valueNodesList being null? like we consider above with the for-loop?
9bf2be8 to
8cfb7e3
Compare
|
Thanks Viraj for the question, it is a good one. I tweaked the above condition a little, to be closer to what we had earlier. |
| targetColumns.add(rowTimestampCol); | ||
| if (valueNodes != null && !valueNodes.isEmpty()) { | ||
| valueNodes.add(getNodeForRowTimestampColumn(rowTimestampCol)); | ||
| if (valueNodesList != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably this check is not needed, but better to be safe
8cfb7e3 to
e2ed715
Compare
e2ed715 to
e20aaf1
Compare
|
I think the 2 failing tests (PhoenixTableLevelMetricsIT,MetadataServerConnectionsIT) is unrelated |
|
@virajjasani @kadirozde Can I proceed to merging this? |
|
It's been long time, sorry i could not take another round of the review. It seemed good in the past. The only concern is, i hope UpsertCompiler does not break any existing UPSERT functionalities. |
|
Triggered new build: https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-2222/8/ @richardantal could you also run spotless:apply if any recent change needs spotless fix? |
e20aaf1 to
10e7091
Compare
|
The latest build still reports:
|
10e7091 to
b486b13
Compare
|
I ran locally the failing tests, all passed |
|
Meanwhile I figured that running |
|
Yes Java 8 is the default in jenkins builds |
UpsertCompiler should not break any existing UPSERT functionalities. With this change we just accept a list of the already supported expressions. |
|
Sounds good. Based on @stoty's review, feel free to merge. Just one thing to confirm: could you run ConcurrentMutationsExtendedIT a couple of times to ensure no test is failing? In the jenkins build, concurrent mutations test is failing for the last two builds. Just want to make sure it is flaky and locally it is passing at least two times. Thank you @richardantal! |
1 similar comment
|
Sounds good. Based on @stoty's review, feel free to merge. Just one thing to confirm: could you run ConcurrentMutationsExtendedIT a couple of times to ensure no test is failing? In the jenkins build, concurrent mutations test is failing for the last two builds. Just want to make sure it is flaky and locally it is passing at least two times. Thank you @richardantal! |
No description provided.