Merged
33 commits
2006959
feat: add CSV conversion command to ensrainbow CLI
djstrong Sep 30, 2025
b491441
refactor
djstrong Sep 30, 2025
4c18e0b
Create brave-kiwis-notice.md
djstrong Sep 30, 2025
5aefe9d
fix tests
djstrong Oct 1, 2025
f2c8f20
use fast-csv package
djstrong Oct 1, 2025
e20932d
add documentation for csv convert
djstrong Oct 6, 2025
b9c31b0
feat: add filtering capabilities to CSV conversion
djstrong Oct 17, 2025
e2b9255
feat: enhance CSV conversion with Bloom filter and deduplication options
djstrong Nov 24, 2025
2c94d41
refactor: simplify command options in package.json
djstrong Nov 24, 2025
721a50d
refactor: improve memory management and logging in CSV conversion
djstrong Dec 11, 2025
56bc356
refactor: streamline CSV conversion CLI options and improve logging
djstrong Dec 15, 2025
11992d7
fix: improve error handling and logging in CSV conversion tests
djstrong Dec 15, 2025
3dea60e
refactor: update CSV conversion logic and improve deduplication handling
djstrong Dec 16, 2025
b6c668a
Merge branch 'main' into csv-conversion-tool
djstrong Dec 16, 2025
b02b7f1
refactor: remove unused dependencies and enhance CSV conversion tests
djstrong Dec 17, 2025
35a05cb
Apply suggestions from code review
djstrong Jan 5, 2026
2cc8cad
refactor: rename convert command to convert-sql and update CLI docume…
djstrong Jan 5, 2026
bbc2786
refactor: update CLI documentation for output file and label set desc…
djstrong Jan 5, 2026
af4b041
docs: enhance SQL conversion section with repository link for legacy …
djstrong Jan 5, 2026
0ebee69
refactor: update CLI to make output-file optional and enhance documen…
djstrong Jan 5, 2026
9cdbb39
fix: enforce existing database path requirement in CLI and improve er…
djstrong Jan 5, 2026
a7fd4f3
refactor: update createRainbowRecord function to use RainbowRecord ty…
djstrong Jan 5, 2026
aac6789
refactor: remove label set version requirement from CLI and enhance o…
djstrong Jan 6, 2026
35cf39b
docs: update documentation to reflect removal of label set version re…
djstrong Jan 7, 2026
7125f23
feat: rename convert command for SQL dumps
djstrong Jan 7, 2026
f7ca244
refactor: update CSV conversion documentation
djstrong Jan 7, 2026
7ab5165
test: add error handling test for label set ID mismatch in CSV conver…
djstrong Jan 7, 2026
22e64ff
Merge branch 'main' into csv-conversion-tool
djstrong Jan 7, 2026
3967d4c
refactor: rename and enhance label set version retrieval function to …
djstrong Jan 7, 2026
0a55554
refactor: remove label set version requirement from CLI commands and …
djstrong Jan 7, 2026
bfd6102
Merge branch 'main' into csv-conversion-tool
djstrong Jan 7, 2026
653d200
Update apps/ensrainbow/src/cli.ts
djstrong Jan 7, 2026
af0c7f0
lint
djstrong Jan 7, 2026
5 changes: 5 additions & 0 deletions .changeset/brave-kiwis-notice.md
@@ -0,0 +1,5 @@
---
"ensrainbow": patch
---

feat: add CSV conversion command to ensrainbow CLI to convert rainbow tables from CSV format to ensrainbow format
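The test changes in this PR rely on a versioning rule that the test comments state only implicitly: `convert` without `--existing-db-path` yields label set version 0, `convert` with `--existing-db-path` yields the database's highest version plus one, and `ingest-ensrainbow` rejects a file whose version is not exactly one higher than the current highest. A minimal sketch of that rule follows. The function and type names are hypothetical and do not come from the ensrainbow source, and treating an empty database as accepting only version 0 is an assumption inferred from the tests.

```typescript
// Hypothetical helpers illustrating the versioning rule the tests exercise.
// None of these names are taken from the ensrainbow codebase.

/** Highest label set version already in the database, or undefined if there is none. */
type HighestVersion = number | undefined;

/** Version that a newly converted file would be assigned. */
function nextLabelSetVersion(highest: HighestVersion): number {
  // No existing database (no --existing-db-path): start at version 0.
  // Otherwise: one higher than the highest version already ingested.
  return highest === undefined ? 0 : highest + 1;
}

/** Whether a file carrying `incoming` may be ingested on top of `highest`. */
function canIngest(highest: HighestVersion, incoming: number): boolean {
  // Assumption: an empty database accepts only version 0.
  return incoming === nextLabelSetVersion(highest);
}

console.log(nextLabelSetVersion(undefined)); // 0
console.log(nextLabelSetVersion(1)); // 2
console.log(canIngest(0, 1)); // true
console.log(canIngest(0, 2)); // false
```

The last call mirrors the "should ingest first file successfully but reject second file" test: the database holds version 0, so a version 2 file is refused.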
4 changes: 3 additions & 1 deletion apps/ensrainbow/package.json
@@ -19,6 +19,7 @@
"validate:lite": "tsx src/cli.ts validate --lite",
"purge": "tsx src/cli.ts purge",
"convert": "tsx src/cli.ts convert",
"convert-sql": "tsx src/cli.ts convert-sql",
"test": "vitest",
"test:coverage": "vitest --coverage",
"lint": "biome check --write .",
@@ -38,7 +39,8 @@
"progress": "^2.0.3",
"protobufjs": "^7.4.0",
"viem": "catalog:",
"yargs": "^17.7.2"
"yargs": "^17.7.2",
"@fast-csv/parse": "^5.0.0"
},
"devDependencies": {
"@ensnode/shared-configs": "workspace:*",
167 changes: 129 additions & 38 deletions apps/ensrainbow/src/cli.test.ts
@@ -107,32 +107,29 @@ describe("CLI", () => {
const ensrainbowFile = join(TEST_FIXTURES_DIR, "test_ens_names_0.ensrainbow");
const ensrainbowOutputFile = join(tempDir, "test_ens_names_0.ensrainbow");
const labelSetId = "test-ens-names"; // Needed for convert
const labelSetVersion = 0; // Needed for convert

expect(() =>
cli.parse([
"convert",
"convert-sql",
"--input-file",
sqlInputFile,
"--output-file",
ensrainbowOutputFile,
]),
).toThrow(/Missing required arguments: label-set-id, label-set-version/);
).toThrow(/Missing required argument: label-set-id/);

// Successful convert with args
const ingestCli = createCLI({ exitProcess: false });
await ingestCli.parse([
"convert",
"convert-sql",
"--input-file",
sqlInputFile,
"--output-file",
ensrainbowOutputFile,
"--label-set-id",
labelSetId,
"--label-set-version",
labelSetVersion.toString(),
]);
//command: pnpm convert --input-file test/fixtures/test_ens_names.sql.gz --output-file test/fixtures/test_ens_names_0.ensrainbow --label-set-id test-ens-names --label-set-version 0
//command: pnpm convert-sql --input-file test/fixtures/test_ens_names.sql.gz --output-file test/fixtures/test_ens_names_0.ensrainbow --label-set-id test-ens-names
//verify that the file is created

await expect(stat(ensrainbowOutputFile)).resolves.toBeDefined();
@@ -159,32 +156,29 @@ describe("CLI", () => {
const sqlInputFile = join(TEST_FIXTURES_DIR, "ens_test_env_names.sql.gz");
const ensrainbowOutputFile = join(tempDir, "ens_test_env_0.ensrainbow");
const labelSetId = "ens-test-env"; // Needed for convert
const labelSetVersion = 0; // Needed for convert

expect(() =>
cli.parse([
"convert",
"convert-sql",
"--input-file",
sqlInputFile,
"--output-file",
ensrainbowOutputFile,
]),
).toThrow(/Missing required arguments: label-set-id, label-set-version/);
).toThrow(/Missing required argument: label-set-id/);

// Successful convert with args
const ingestCli = createCLI({ exitProcess: false });
await ingestCli.parse([
"convert",
"convert-sql",
"--input-file",
sqlInputFile,
"--output-file",
ensrainbowOutputFile,
"--label-set-id",
labelSetId,
"--label-set-version",
labelSetVersion.toString(),
]);
//command: pnpm convert --input-file test_ens_names.sql.gz --output-file test_ens_names_0.ensrainbow --label-set-id test-ens-names --label-set-version 0
//command: pnpm convert-sql --input-file ens_test_env_names.sql.gz --output-file ens_test_env_0.ensrainbow --label-set-id ens-test-env
//verify that the file is created

await expect(stat(ensrainbowOutputFile)).resolves.toBeDefined();
@@ -207,30 +201,56 @@ describe("CLI", () => {
const sqlInputFile = join(TEST_FIXTURES_DIR, "test_ens_names.sql.gz");
const ensrainbowOutputFile = join(tempDir, "test_ens_names_1.ensrainbow");
const labelSetId = "test-ens-names"; // Needed for convert
const labelSetVersion = 1; // Needed for convert

expect(() =>
cli.parse([
"convert",
"convert-sql",
"--input-file",
sqlInputFile,
"--output-file",
ensrainbowOutputFile,
]),
).toThrow(/Missing required arguments: label-set-id, label-set-version/);
).toThrow(/Missing required argument: label-set-id/);

const ingestCli2 = createCLI({ exitProcess: false });
// Successful convert with args
// Successful convert with args (convert-sql always creates version 0)
// To test version 1, we need to use convert command with existing database
// But for this test, we'll create version 0 and then manually test the ingestion failure
const csvInputFile = join(TEST_FIXTURES_DIR, "test_labels_2col.csv");
const tempDbDirForV1 = join(tempDir, "temp-db-for-v1");
const version0FileForV1 = join(tempDir, "test_ens_names_0_for_v1.ensrainbow");

// Create version 0 file
await ingestCli2.parse([
"convert",
"--input-file",
sqlInputFile,
csvInputFile,
"--output-file",
version0FileForV1,
"--label-set-id",
labelSetId,
]);

// Ingest version 0 to create database
await ingestCli2.parse([
"ingest-ensrainbow",
"--input-file",
version0FileForV1,
"--data-dir",
tempDbDirForV1,
]);

// Create version 1 file using existing database
await ingestCli2.parse([
"convert",
"--input-file",
csvInputFile,
"--output-file",
ensrainbowOutputFile,
"--label-set-id",
labelSetId,
"--label-set-version",
labelSetVersion.toString(),
"--existing-db-path",
tempDbDirForV1,
]);
//verify it is created
await expect(stat(ensrainbowOutputFile)).resolves.toBeDefined();
@@ -254,38 +274,99 @@ describe("CLI", () => {
});

it("should ingest first file successfully but reject second file with label set version not being 1 higher than the current highest label set version", async () => {
// First, ingest a valid file with label set version 0
const firstInputFile = join(TEST_FIXTURES_DIR, "test_ens_names_0.ensrainbow");
// First, we'll create a version 0 file and then a version 2 file
const secondInputFile = join(tempDir, "test_ens_names_2.ensrainbow");

// Create an ensrainbow file with label set version 2
const sqlInputFile = join(TEST_FIXTURES_DIR, "test_ens_names.sql.gz");
// To create version 2, we need to create version 0, ingest it, create version 1, ingest it, then create version 2
const csvInputFile = join(TEST_FIXTURES_DIR, "test_labels_2col.csv");
const labelSetId = "test-ens-names";
const labelSetVersion = 2; // Higher than 1

// Successful convert with label set version 2
// Create temporary directory for building up versions sequentially
const tempDbDir = join(tempDir, "temp-db");
const version0File = join(tempDir, "test_ens_names_0_temp.ensrainbow");
const version1File = join(tempDir, "test_ens_names_1_temp.ensrainbow");

const convertCli = createCLI({ exitProcess: false });

// Step 1: Create version 0 file
await convertCli.parse([
"convert",
"--input-file",
sqlInputFile,
csvInputFile,
"--output-file",
version0File,
"--label-set-id",
labelSetId,
]);

// Step 2: Ingest version 0 to create database (database now has version 0)
await convertCli.parse([
"ingest-ensrainbow",
"--input-file",
version0File,
"--data-dir",
tempDbDir,
]);

// Step 3: Create version 1 file using existing database (will be version 1)
await convertCli.parse([
"convert",
"--input-file",
csvInputFile,
"--output-file",
version1File,
"--label-set-id",
labelSetId,
"--existing-db-path",
tempDbDir,
]);

// Step 4: Ingest version 1 into the same database (database now has versions 0 and 1, highest is 1)
await convertCli.parse([
"ingest-ensrainbow",
"--input-file",
version1File,
"--data-dir",
tempDbDir,
]);

// Step 5: Create version 2 file using existing database (will be version 2, since highest is 1)
await convertCli.parse([
"convert",
"--input-file",
csvInputFile,
"--output-file",
secondInputFile,
"--label-set-id",
labelSetId,
"--label-set-version",
labelSetVersion.toString(),
"--existing-db-path",
tempDbDir,
]);

// Verify the file with label set version 2 was created
await expect(stat(secondInputFile)).resolves.toBeDefined();

// Create a completely separate version 0 file for the final test
// Use a fresh CLI instance and ensure no existing-db-path is used
const finalTestCli = createCLI({ exitProcess: false });
const finalTestVersion0File = join(tempDir, "final_test_v0.ensrainbow");
await finalTestCli.parse([
"convert",
"--input-file",
csvInputFile,
"--output-file",
finalTestVersion0File,
"--label-set-id",
labelSetId,
]);

// First ingest succeeds with label set version 0
const ingestCli = createCLI({ exitProcess: false });
await ingestCli.parse([
"ingest-ensrainbow",
"--input-file",
firstInputFile,
finalTestVersion0File,
"--data-dir",
testDataDir,
]);
@@ -311,35 +392,45 @@ describe("CLI", () => {
const thirdInputFile = join(tempDir, "different_label_set_id_1.ensrainbow");

// Create an ensrainbow file with different label set id
const sqlInputFile = join(TEST_FIXTURES_DIR, "test_ens_names.sql.gz");
const csvInputFile = join(TEST_FIXTURES_DIR, "test_labels_2col.csv");
const labelSetId = "different-label-set-id"; // Different from test-ens-names
const labelSetVersion = 0;

// Create temporary directory for version 0 database
const tempDbDir0 = join(tempDir, "temp-db-different-v0");

// Create second file with different label set id and label set version 0
const convertCli = createCLI({ exitProcess: false });
await convertCli.parse([
"convert",
"--input-file",
sqlInputFile,
csvInputFile,
"--output-file",
secondInputFile,
"--label-set-id",
labelSetId,
"--label-set-version",
labelSetVersion.toString(),
]);

// Create third file with different label set id and label set version 1
// First, ingest version 0 to create database
await convertCli.parse([
"ingest-ensrainbow",
"--input-file",
secondInputFile,
"--data-dir",
tempDbDir0,
]);

// Then create version 1 using existing database
await convertCli.parse([
"convert",
"--input-file",
sqlInputFile,
csvInputFile,
"--output-file",
thirdInputFile,
"--label-set-id",
labelSetId,
"--label-set-version",
"1",
"--existing-db-path",
tempDbDir0,
]);

// Verify the file with different label set id was created