Commit d185d52

FEAT: Support for streaming large parameters in execute() (#176)
### Work Item / Issue Reference

> [AB#33395](https://sqlclientdrivers.visualstudio.com/c6d89619-62de-46a0-8b46-70b92a84d85e/_workitems/edit/33395)

> GitHub Issue: #<ISSUE_NUMBER>

-------------------------------------------------------------------

### Summary

This pull request adds support for streaming large parameters to SQL Server using ODBC's Data At Execution (DAE) mechanism, particularly for long Unicode strings and binary data. The changes update both the Python and C++ layers to correctly identify large parameters, mark them for DAE, and handle the streaming process during execution. Additional refactoring improves parameter type mapping and memory handling for these cases.

**Large parameter streaming (DAE) support:**

* Updated the `_map_sql_type` method in `cursor.py` to return an `is_dae` flag for parameters that require streaming (e.g., long Unicode strings, long binary data), and to calculate the correct size for Unicode strings using UTF-16 encoding.
* Modified `_create_parameter_types_list` in `cursor.py` to set DAE-related fields (`isDAE`, `strLenOrInd`, `dataPtr`) in the parameter info when streaming is needed.

**C++ bindings and execution logic:**

* Extended the `ParamInfo` struct and its Python bindings to include DAE fields (`isDAE`, `strLenOrInd`, `dataPtr`) for use during parameter binding and streaming.
* Added ODBC DAE API function pointers (`SQLParamData`, `SQLPutData`) and integrated their loading and usage into the driver handle setup.
* Refactored parameter binding and execution logic in `ddbc_bindings.cpp` to handle DAE parameters: if a parameter is marked for DAE, the code enters a loop to stream the data in chunks using `SQLParamData` and `SQLPutData`. This is done for large Unicode strings and (potentially) binary data.

**Build and warning improvements:**

* Adjusted MSVC compiler flags to remove the "treat warnings as errors" option, making builds less strict on warnings.
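The chunked streaming described in the summary can be illustrated in pure Python. This is only a sketch of the idea, not the driver's C++ code: `iter_dae_chunks` and its `chunk_bytes` default are hypothetical names and values chosen for illustration, and each yielded piece stands in for the buffer a driver might hand to one `SQLPutData` call.

```python
from typing import Iterator, Union


def iter_dae_chunks(
    value: Union[str, bytes, bytearray], chunk_bytes: int = 8192
) -> Iterator[bytes]:
    """Yield successive byte chunks of a large parameter value.

    Hypothetical helper: strings are encoded as UTF-16-LE (the wide-char
    layout used with SQL_C_WCHAR on Windows); binary values pass through.
    """
    # An even chunk size keeps every chunk aligned to whole 16-bit code units.
    # A surrogate pair may still straddle two chunks, which is harmless
    # because the receiver sees one contiguous byte stream.
    if chunk_bytes <= 0 or chunk_bytes % 2:
        raise ValueError("chunk_bytes must be a positive even number")
    data = value.encode("utf-16-le") if isinstance(value, str) else bytes(value)
    for start in range(0, len(data), chunk_bytes):
        yield data[start:start + chunk_bytes]


if __name__ == "__main__":
    pieces = list(iter_dae_chunks("a" * 5000, chunk_bytes=4096))
    print([len(p) for p in pieces])
```

Joining the chunks and decoding them as UTF-16-LE gives back the original string, which is the property the real streaming loop has to preserve.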
1 parent 8968a5c commit d185d52

3 files changed: +145 -73 lines changed

mssql_python/cursor.py

Lines changed: 44 additions & 14 deletions
@@ -18,6 +18,8 @@
 from mssql_python.exceptions import InterfaceError, NotSupportedError, ProgrammingError
 from .row import Row
 
+# Constants for string handling
+MAX_INLINE_CHAR = 4000 # NVARCHAR/VARCHAR inline limit; this triggers NVARCHAR(MAX)/VARCHAR(MAX) + DAE
 
 class Cursor:
     """
@@ -233,10 +235,11 @@ def _map_sql_type(self, param, parameters_list, i):
                 ddbc_sql_const.SQL_C_DEFAULT.value,
                 1,
                 0,
+                False,
             )
 
         if isinstance(param, bool):
-            return ddbc_sql_const.SQL_BIT.value, ddbc_sql_const.SQL_C_BIT.value, 1, 0
+            return ddbc_sql_const.SQL_BIT.value, ddbc_sql_const.SQL_C_BIT.value, 1, 0, False
 
         if isinstance(param, int):
             if 0 <= param <= 255:
@@ -245,26 +248,30 @@ def _map_sql_type(self, param, parameters_list, i):
                     ddbc_sql_const.SQL_C_TINYINT.value,
                     3,
                     0,
+                    False,
                 )
             if -32768 <= param <= 32767:
                 return (
                     ddbc_sql_const.SQL_SMALLINT.value,
                     ddbc_sql_const.SQL_C_SHORT.value,
                     5,
                     0,
+                    False,
                 )
             if -2147483648 <= param <= 2147483647:
                 return (
                     ddbc_sql_const.SQL_INTEGER.value,
                     ddbc_sql_const.SQL_C_LONG.value,
                     10,
                     0,
+                    False,
                 )
             return (
                 ddbc_sql_const.SQL_BIGINT.value,
                 ddbc_sql_const.SQL_C_SBIGINT.value,
                 19,
                 0,
+                False,
             )
 
         if isinstance(param, float):
@@ -273,6 +280,7 @@ def _map_sql_type(self, param, parameters_list, i):
                 ddbc_sql_const.SQL_C_DOUBLE.value,
                 15,
                 0,
+                False,
             )
 
         if isinstance(param, decimal.Decimal):
@@ -284,6 +292,7 @@ def _map_sql_type(self, param, parameters_list, i):
                 ddbc_sql_const.SQL_C_NUMERIC.value,
                 parameters_list[i].precision,
                 parameters_list[i].scale,
+                False,
             )
 
         if isinstance(param, str):
@@ -297,6 +306,7 @@ def _map_sql_type(self, param, parameters_list, i):
                     ddbc_sql_const.SQL_C_WCHAR.value,
                     len(param),
                     0,
+                    False,
                 )
 
             # Attempt to parse as date, datetime, datetime2, timestamp, smalldatetime or time
@@ -309,6 +319,7 @@ def _map_sql_type(self, param, parameters_list, i):
                     ddbc_sql_const.SQL_C_TYPE_DATE.value,
                     10,
                     0,
+                    False,
                 )
             if self._parse_datetime(param):
                 parameters_list[i] = self._parse_datetime(param)
@@ -317,6 +328,7 @@ def _map_sql_type(self, param, parameters_list, i):
                     ddbc_sql_const.SQL_C_TYPE_TIMESTAMP.value,
                     26,
                     6,
+                    False,
                 )
             if self._parse_time(param):
                 parameters_list[i] = self._parse_time(param)
@@ -325,25 +337,26 @@ def _map_sql_type(self, param, parameters_list, i):
                     ddbc_sql_const.SQL_C_TYPE_TIME.value,
                     8,
                     0,
+                    False,
                 )
 
             # String mapping logic here
             is_unicode = self._is_unicode_string(param)
-            # TODO: revisit
-            if len(param) > 4000: # Long strings
+            if len(param) > MAX_INLINE_CHAR: # Long strings
                 if is_unicode:
-                    utf16_len = len(param.encode("utf-16-le")) // 2
                     return (
                         ddbc_sql_const.SQL_WLONGVARCHAR.value,
                         ddbc_sql_const.SQL_C_WCHAR.value,
-                        utf16_len,
+                        len(param),
                         0,
+                        True,
                     )
                 return (
                     ddbc_sql_const.SQL_LONGVARCHAR.value,
                     ddbc_sql_const.SQL_C_CHAR.value,
                     len(param),
                     0,
+                    True,
                 )
             if is_unicode: # Short Unicode strings
                 utf16_len = len(param.encode("utf-16-le")) // 2
@@ -352,12 +365,14 @@ def _map_sql_type(self, param, parameters_list, i):
                     ddbc_sql_const.SQL_C_WCHAR.value,
                     utf16_len,
                     0,
+                    False,
                 )
             return (
                 ddbc_sql_const.SQL_VARCHAR.value,
                 ddbc_sql_const.SQL_C_CHAR.value,
                 len(param),
                 0,
+                False,
             )
 
         if isinstance(param, bytes):
@@ -367,12 +382,14 @@ def _map_sql_type(self, param, parameters_list, i):
                     ddbc_sql_const.SQL_C_BINARY.value,
                     len(param),
                     0,
+                    False,
                 )
             return (
                 ddbc_sql_const.SQL_BINARY.value,
                 ddbc_sql_const.SQL_C_BINARY.value,
                 len(param),
                 0,
+                False,
             )
 
         if isinstance(param, bytearray):
@@ -382,12 +399,14 @@ def _map_sql_type(self, param, parameters_list, i):
                     ddbc_sql_const.SQL_C_BINARY.value,
                     len(param),
                     0,
+                    True,
                 )
             return (
                 ddbc_sql_const.SQL_BINARY.value,
                 ddbc_sql_const.SQL_C_BINARY.value,
                 len(param),
                 0,
+                False,
             )
 
         if isinstance(param, datetime.datetime):
@@ -396,6 +415,7 @@ def _map_sql_type(self, param, parameters_list, i):
                 ddbc_sql_const.SQL_C_TYPE_TIMESTAMP.value,
                 26,
                 6,
+                False,
             )
 
         if isinstance(param, datetime.date):
@@ -404,6 +424,7 @@ def _map_sql_type(self, param, parameters_list, i):
                 ddbc_sql_const.SQL_C_TYPE_DATE.value,
                 10,
                 0,
+                False,
             )
 
         if isinstance(param, datetime.time):
@@ -412,14 +433,11 @@ def _map_sql_type(self, param, parameters_list, i):
                 ddbc_sql_const.SQL_C_TYPE_TIME.value,
                 8,
                 0,
+                False,
             )
 
-        return (
-            ddbc_sql_const.SQL_VARCHAR.value,
-            ddbc_sql_const.SQL_C_CHAR.value,
-            len(str(param)),
-            0,
-        )
+        # For safety: unknown/unhandled Python types should not silently go to SQL
+        raise TypeError("Unsupported parameter type: The driver cannot safely convert it to a SQL type.")
 
     def _initialize_cursor(self) -> None:
         """
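The string branch of `_map_sql_type` after this change can be condensed into a standalone sketch. It rests on several assumptions: `map_str_param` is a hypothetical name, type constants are plain strings instead of `ddbc_sql_const` members, `not param.isascii()` stands in for `_is_unicode_string` (whose body is not shown in this diff), and `SQL_WVARCHAR` for short Unicode strings is a guess because that line falls outside the displayed hunks.

```python
from typing import Tuple

MAX_INLINE_CHAR = 4000  # same threshold as the constant added in this commit

# (sql_type, c_type, column_size, decimal_digits, is_dae)
Mapping = Tuple[str, str, int, int, bool]


def map_str_param(param: str) -> Mapping:
    """Hypothetical, simplified version of the string mapping logic."""
    is_unicode = not param.isascii()  # assumption: stand-in for _is_unicode_string
    if len(param) > MAX_INLINE_CHAR:
        # Long strings are flagged for Data At Execution streaming.
        if is_unicode:
            return ("SQL_WLONGVARCHAR", "SQL_C_WCHAR", len(param), 0, True)
        return ("SQL_LONGVARCHAR", "SQL_C_CHAR", len(param), 0, True)
    if is_unicode:
        # Short Unicode strings: column size counted in UTF-16 code units.
        utf16_len = len(param.encode("utf-16-le")) // 2
        return ("SQL_WVARCHAR", "SQL_C_WCHAR", utf16_len, 0, False)
    return ("SQL_VARCHAR", "SQL_C_CHAR", len(param), 0, False)


if __name__ == "__main__":
    print(map_str_param("hello"))
    print(map_str_param("x" * 4001)[-1])
```

Only the fifth tuple element is new in this commit; every other branch of the real method returns `False` in that position, apart from the large `bytearray` case.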
@@ -495,14 +513,19 @@ def _create_parameter_types_list(self, parameter, param_info, parameters_list, i
             paraminfo.
         """
         paraminfo = param_info()
-        sql_type, c_type, column_size, decimal_digits = self._map_sql_type(
+        sql_type, c_type, column_size, decimal_digits, is_dae = self._map_sql_type(
             parameter, parameters_list, i
         )
         paraminfo.paramCType = c_type
         paraminfo.paramSQLType = sql_type
         paraminfo.inputOutputType = ddbc_sql_const.SQL_PARAM_INPUT.value
         paraminfo.columnSize = column_size
         paraminfo.decimalDigits = decimal_digits
+        paraminfo.isDAE = is_dae
+
+        if is_dae:
+            paraminfo.dataPtr = parameter # Will be converted to py::object* in C++
+
         return paraminfo
 
     def _initialize_description(self):
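The way the five-element mapping feeds the parameter info object can be sketched without the C++ bindings. `ParamInfoSketch` and `build_param_info` are hypothetical stand-ins for the bound `ParamInfo` struct and `_create_parameter_types_list`; only the field names are taken from the diff, and `inputOutputType` is left out for brevity.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass
class ParamInfoSketch:
    """Hypothetical pure-Python stand-in for the C++ ParamInfo binding."""
    paramCType: str
    paramSQLType: str
    columnSize: int
    decimalDigits: int
    isDAE: bool = False
    dataPtr: Optional[Any] = None  # only populated for streamed parameters


def build_param_info(
    parameter: Any, mapping: Tuple[str, str, int, int, bool]
) -> ParamInfoSketch:
    # The mapping now has five elements, so callers must unpack is_dae too.
    sql_type, c_type, column_size, decimal_digits, is_dae = mapping
    info = ParamInfoSketch(
        paramCType=c_type,
        paramSQLType=sql_type,
        columnSize=column_size,
        decimalDigits=decimal_digits,
        isDAE=is_dae,
    )
    if is_dae:
        # Keep a reference to the Python object so it can be streamed later.
        info.dataPtr = parameter
    return info


if __name__ == "__main__":
    big = "x" * 5000
    print(build_param_info(big, ("SQL_LONGVARCHAR", "SQL_C_CHAR", 5000, 0, True)).isDAE)
```

Any other caller of `_map_sql_type` that still unpacks four values would raise `ValueError` after this change, so each call site needs the same update.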
@@ -762,9 +785,16 @@ def execute(
             self.is_stmt_prepared,
             use_prepare,
         )
-
+        # Check return code
+        try:
+
         # Check for errors but don't raise exceptions for info/warning messages
-        check_error(ddbc_sql_const.SQL_HANDLE_STMT.value, self.hstmt, ret)
+            check_error(ddbc_sql_const.SQL_HANDLE_STMT.value, self.hstmt, ret)
+        except Exception as e:
+            log('warning', "Execute failed, resetting cursor: %s", e)
+            self._reset_cursor()
+            raise
+
 
         # Capture any diagnostic messages (SQL_SUCCESS_WITH_INFO, etc.)
         if self.hstmt:
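The error handling added to `execute()` follows a log, reset, re-raise pattern. The sketch below isolates that pattern under stated assumptions: `SketchCursor`, `_check`, and the negative-return-code convention are hypothetical, and the standard `logging` module replaces the project's `log` helper.

```python
import logging

logger = logging.getLogger("dae_sketch")


class SketchCursor:
    """Hypothetical cursor showing only the reset-on-failure behaviour."""

    def __init__(self) -> None:
        self.reset_count = 0

    def _reset_cursor(self) -> None:
        # The real method resets the statement handle; here we just count calls.
        self.reset_count += 1

    @staticmethod
    def _check(ret: int) -> None:
        # Assumption for illustration: negative return codes signal failure.
        if ret < 0:
            raise RuntimeError(f"execute failed with return code {ret}")

    def execute(self, ret: int) -> None:
        try:
            self._check(ret)
        except Exception as exc:
            logger.warning("Execute failed, resetting cursor: %s", exc)
            self._reset_cursor()
            raise  # bare raise keeps the original exception and traceback


if __name__ == "__main__":
    cursor = SketchCursor()
    cursor.execute(0)
    try:
        cursor.execute(-1)
    except RuntimeError as err:
        print("caught:", err, "| resets:", cursor.reset_count)
```

Resetting before re-raising means a statement that fails partway through, for example during a streamed parameter, is not left half-executed on the handle, while the caller still sees the original error.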
