Should I Replace My FOR XML PATH String Merges with String_agg?

By Jeffry Schwartz | Function

Mar 07

Overview

If you are looking for major performance gains, the answer is no.  However, if you are using SQL Server 2017 and developing new code, you should consider using the new String_agg function that is available on SQL Server 2017 because code that uses this function is so much more readable and easier to write and maintain.  This article compares the performance of the two methods while explaining the differences in the code.  The examples provided employ lists of SQL Server table names that are merged into a dynamic SQL clause that can be used in where or join clauses.

Testing Strategy

Since list length is often a concern when splitting or merging items, various numbers of records were merged and timed.  A relatively small table was chosen to minimize the impact of I/O, and caches were purged before the execution of each method.  Since the individual execution times were very small, the pertinent code was placed within a 10,000-iteration loop.  The tested numbers of records to be merged into a string were as follows:  225, 450, 675, 900, and 1125, and the loop code is shown below. 

Each test was performed within its own batch on an otherwise idle system.  Run-time performance was captured using an Extended Events session, which was configured to capture sqlserver.sql_batch_completed events ONLY to minimize overhead.  The session definition, start, and stop code are shown below.  This session collects information for completed batches that ran longer than 10 microseconds and were executed from database #7.

The Code

Two different T-SQL methods were used for testing.  The FOR XML PATH solution incorporates the reverse and stuff functions along with FOR XML PATH.  The String_agg solution is much simpler, but is only available on SQL Server 2017 currently.

The FOR XML PATH method has been documented thoroughly over the years in many articles, so these explanations will not be repeated here.  However, two points are worth mentioning.  The first one is that the @strSeparator (in this case ‘ or  ’) string will reside at the end of the constructed string (in this case @strFilter).  Since the last occurrence of @strSeparator must be removed to prevent a syntax error, characters must be dropped from the end of the string.  Although it may seem obvious that four characters (the length of @strSeparator) should be removed, actually only three need be removed because the end of the constructed string is “r” not “ “.  Therefore, the length argument of the left function specification is the length of @strFilter minus three.

The second point of interest concerns the parentheses that surround the select statement.  They may seem unnecessary, but they are important because without them, the following errors occur:

Msg 156, Level 15, State 1, Line 26

Incorrect syntax near the keyword ‘select’.

Msg 102, Level 15, State 1, Line 28

Incorrect syntax near ‘)’.

The second method, String_agg, is much simpler to implement, but does have a restriction, as shown below:

Msg 9829, Level 16, State 1, Line 62

STRING_AGG aggregation result exceeded the limit of 8000 bytes. Use LOB types to avoid result truncation.

This limitation can be circumvented by converting the [Full Table Name] column to an nvarchar(max) variable, so that the overall string length no longer matters.

The code for the two methods is shown below:

Test Results

Results of the tests are shown in the Figure 1 below.  Clearly, there was very little difference between the two methods, although the number of elements directly affected performance of both methods.    Durations were equal to processor times and proportional to the number of records that were merged.  The query plans for the two methods were quite similar and shown in Figure 2 and Figure 3, respectively.

Figure 1: Average Execution Time

Figure 2: FOR XML PATH Execution Plan

Figure 3: String_agg Execution Plan

Conclusion

In summary, existing T-SQL code need not be replaced.  However, if possible, new development that will run on SQL Server 2017 or higher should use the String_agg function for simplicity, improved readability, and easier maintenance.

For more information about blog posts, concepts and definitions, further explanations, or questions you may have…please contact us at SQLRx@sqlrx.com. We will be happy to help! Leave a comment and feel free to track back to us. Visit us at www.sqlrx.com!

 

  • […] Jeffry Schwartz compares the performance of STRING_AGG in SQL Server 2017 versus the tried and true method of FOR XML PATH for string concatenation: […]

  • >