Modify SqlStreamingXml XmlWriter to internally use a MemoryStream#3974
Modify SqlStreamingXml XmlWriter to internally use a MemoryStream#3974jimhblythe wants to merge 4 commits intodotnet:mainfrom
Conversation
…tead of a StringBuilder. (dotnet#1877) Note: UTF8Encoding(false) addition in s_writerSettings is consistent with prior default used within StringWriter/StringBuilder
|
@dotnet-policy-service agree |
src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/SqlStream.cs
Outdated
Show resolved
Hide resolved
…ple elements Enhance comments within SqlStreamingXml Extend Manual tests to fully cover GetChars WriteXmlElement includes uncovered paths not accessible for SQL XML column types which normalize Whitespace, CDATA, EntityReference, XmlDeclaration, ProcessingInstruction, DocumentType, and Comment node types
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
paulmedynski
left a comment
There was a problem hiding this comment.
Looks great - thanks for this optimization and expanded test coverage! Just one question.
src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/SqlStream.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/SqlStream.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.Data.SqlClient/tests/ManualTests/SQL/SqlStreamingXmlTest/SqlStreamingXmlTest.cs
Show resolved
Hide resolved
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3974 +/- ##
==========================================
- Coverage 75.22% 65.82% -9.40%
==========================================
Files 266 275 +9
Lines 42932 65896 +22964
==========================================
+ Hits 32294 43379 +11085
- Misses 10638 22517 +11879
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Add tests that verify SqlDataReader.GetChars correctly handles non-ASCII characters when reading XML columns with SequentialAccess: - GetChars_NonAsciiContent: Latin accented characters (2-byte UTF-8) - GetChars_NonAsciiContent_BulkRead: Bulk read path with accented chars - GetChars_CjkContent: CJK characters (3-byte UTF-8) These tests establish a baseline for correct behavior on main before PR #3974 (issue #1877) refactors SqlStreamingXml internals.
|
@jimhblythe can you take a look at the tests in #4005 and #4008? They seem to suggest that this introduces a regression for non-ascii characters. You can check the pipeline results to see the test outcomes. I'd like to see some tests like that included in this PR as well, please 🙏 |
I will additionally including a fourth test to verify surrogate pairs:
The test shows these worked with prior code. |
…reamingXml, reconstructing XML fragments as strings and streaming them char-by-char. Improves efficiency, reduces allocations, and fixes non-ASCII and surrogate pairs. Add comprehensive unit tests for XML edge cases (non-ASCII, surrogate pairs, comments, CDATA, attributes, namespaces, etc.). Refactor existing tests for clarity and better handling of disposables. (AI assist using ChatGPT to better consider edge cases)
|
Regarding this from my last comment: Is exact replication of prior logic desired or more consistent with how SSMS would show an XML column? I am reworking the logic and tests cases such that GetChars responds the same when streaming an Xml typed column as it does when reading Convert(NVarChar(max), xmlColumn) - each test case will verify reading as both data types. |
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |


Modify SqlStreamingXml XmlWriter to internally use a MemoryStream instead of a StringBuilder.
Note: UTF8Encoding(false) addition in s_writerSettings is consistent with prior default used within StringWriter/StringBuilder
Issues
Fixes #1877 to be O(n)
Testing
Added 2 new Manual tests to ensure linear behavior for single large node, and secondary validation for multiple nodes
