This repository was archived by the owner on Feb 2, 2024. It is now read-only.
Commit da003ef
Refactor read_csv to support converters (#993)
* Refactor read_csv to support converters
Motivation: support pd.read_csv use with converters to avoid costly
unboxing of string columns read with pyarrow read_csv when returning
DF from objmode.
* Adds StdStringView Numba type to hstr_ext
Motivation: for optimization purposes (avoiding a copy when creating NRT-managed
unicode instances) when working with string data stored in native extensions.
* Adds zip and dict builtins overloads to support easy literal dict ctor
Motivation: there's no easy way to create Numba LiteralStrKeyDict
objects for const dicts with many elements. This adds a special overload
for dict builtin that creates LiteralStrKeyDict from tuple of pairs
('col_name', col_data).
* Replacing zip overload builtin with internal sdc_tuple_zip function
Details: the zip builtin is already overloaded in Numba and has priority
over user-defined overloads. Hence, when we want to zip two single-element
tuples, e.g. zip(('A', ), (1, )), the builtin overload will match
and type inference will unliteral all tuples, producing iter objects
(that are always homogeneous in Numba). That is, the literality of the objects
will be lost. Using sdc_tuple_zip explicitly avoids this problem.
* Fixing issue with literal dict ctor with single element
* Moving stringlib to native
* Fixing refcnt issue and adding tests
* Adding rewrite for dict(zip()) calls
* Fixing str_view_to_float impl and tests
* Fixing refcnt problem with pyarrow table ptr
* Fixing bugs found in failed tests and examples
1 parent dfaa715 commit da003ef
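The converter and dict(zip()) items in the commit message can be illustrated with a small pure-Python sketch. It does not use Numba, pyarrow or SDC and is not the code from this commit. It only mirrors the semantics described above, assuming that a converter is a per-column function applied to each raw string cell, and that the resulting frame-like object is built from ('col_name', col_data) pairs. The helper name read_with_converters is hypothetical.

```python
import csv
import io


def read_with_converters(text, converters):
    """Parse CSV text into a dict of column name -> list of values.

    Hypothetical helper (not part of SDC): every cell is read as a string
    and the column's converter, if one is given, is applied to it. This
    mirrors the semantics of the `converters` argument of pd.read_csv.
    """
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]

    # Transpose the rows into one tuple of raw strings per column.
    raw_columns = [tuple(row[i] for row in body) for i in range(len(header))]

    # Apply the per-column converter; columns without one stay strings.
    columns = tuple(
        [converters.get(name, str)(cell) for cell in col]
        for name, col in zip(header, raw_columns)
    )

    # dict(zip(names, data)) is the call pattern the commit message says
    # is rewritten so that a literal-keyed dict can be built under Numba.
    return dict(zip(tuple(header), columns))


data = "A,B\n1,2.5\n3,4.0\n"
frame = read_with_converters(data, {"A": int, "B": float})
print(frame)
```

In plain Python the tuples passed to zip keep their per-element types for free. The commit message explains that under Numba this is not the case for the builtin zip, which is why an explicit tuple-zip function is used there instead.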
File tree
12 files changed, +1606 -1171 lines changed
- sdc
- datatypes
- extensions
- hiframes
- io
- native
- str_ext
- rewrites
- tests