DBbot is an npm package that allows researchers to create a bot based on a CSV file. The bot can query and filter the data of the CSV file, making it easy to interact with and manipulate large datasets.
- Easy to use and integrate
- Query and filter data from a CSV file
- Flexible and customizable
- Supports various data operations
This is a description of what the user will see
First, the bot will provide the uaer an explaintion about the database (CSV file). Second, the bot will display the user a list of the columns of the database, and the user can choose from it which column he/she would like to query.
After ther user has choose the attribute (column) the bot will display a list the of operators of the chosen column. Each column has it's own operators. The operators are used for query the data. For example, if the user chose a "name" attribute and he/she want to query only the names that starts with the letter "S", he/she can choose the "StartWith" operator to achive that. Some operators require input of parameters. The bot will ask the user for it as well.
Once the operator has been chosen and given all his parameters, the bot will prompt the user if he/she would like to add another attribute to the query. If the user chooses "No", the bot will fetch the data from the database and provide the user a CSV file to download with the results. If the user chooses "Yes", the whole proccess will start over.
First, node.js should be installed on your machine. If you don't know how to install it, you can look at this guide
Open your code editor, and create new folder my-new-project.
Open the integrated terminal of your code editor, and navigate to your new project:
cd my-new-projectThen init a new node.js project with the command:
npm initYou can install DBbot using npm:
npm install @gold-lab/dbbotAnd that's it! You are now ready to use DBbot.
Here is a basic example of how to use DBbot:
import { DBbot, App } from "dbbot";
// Create bot and app instances
const dbBot = new DBbot();
const app = new App();
// Initialize the bot with a CSV file
dbBot.loadFile("./example.csv");
// Run the bot
app.deploy(dbBot);- Attribute - a column of the data set
- Operator - a function used for filtering data from an attribute
- Developer - the programmer who write the code of the bot using dbbot npm package
- User - the user of the final product of the bot who only see the UI of the bot and not the code.
By default the bot has 11 operators to query data from the csv columns:
- Numeric
- Equal
- Less
- Greater
- Range
- String
- SoundLike
- StartWith
- EndWith
- Contains
- Equal
- Factor
- ChooseOne
- ChooseMultiple
Attributes with string values will have the string operators, and attributes with numeric values will have the numeric operators. Factor Attributes behave differently, it have several build in option to choose from. To define an attribute as a factor attribute the developer needs to do it manually. More on this in the next sections
When building the bot, the developer can add custom operators to specific attributes. A custom operator must have the following:
- name: The name of the operator
- function: The function to execute for the query of the data. Read more below about the requirements of the function.
- dataType: The type of the values the operator can work with. - can only be "string" or "numeric"
- column: The name of the attribute (column) the operator will be assign to.
- params: The parameters of the function. Each parameter must have a name and a dataType.
The function of the operator must always have a parameter called "cell" and a return value of type boolean. The cell parameter must always be the first one. For example:
// The function have a paraeter called "cell" which can be of type string or number
function startsWithB(cell: string) {
return cell.startsWith("B"); // return value of type boolean
}const startWithBOperator = {
name: "startsWithB",
customFunction: function (cell: string) {
return cell.startsWith("B");
},
dataType: "string",
column: "name",
params: [
{
name: "cell",
dataType: "string",
},
],
};
dbBot.addCustomOperator(startWithBOperator); // add the operator to the botTo add the operator to the bot, The addCustomOperator must be called with the operator as a parameter.
In many databases there are missing values in some attributes. DBbot gives the developer the option to decide how to treat those values.
The are some default value the developer can change the null values to:
- mean: The mean on the not-null values of the attribute. Only for numeric attributes.
- median: The median on the not-null values of the attribute. Only for numeric attributes.
- mode: The mode on the not-null values of the attribute. Only for numeric attributes.
- remove: Remove the cells with null values.
- custom: A custom value.
// Define what values treat as null values
const nullValues = [null, "NA", NaN];
// Get the attribute
const heightColumn = dbBot.getColumnByName("height");
// Fill the null values with the mean of the attribute
heightColumn.fillNullValues("mean", nullValues);
// Fill the null values with a custom value - 5
heightColumn.fillNullValues("custom", nullValues, 5);Another option is to fill al null values of all columns at the same time and with the same value (one for string attributes and one for numeric attributes)
// Define what values treat as null values
const nullValues = [null, "NA", NaN];
// Fill all null values
dbBot.fillNullValuesAll({
numericValue: -1,
stringValue: "FILL",
nullValue: nullValues,
});As was mention before, the developer can convert attributes to factor attributes:
dbBot.convertColumnsToFactor(["gender"]); // can enter multiple columnsThe factor attributes have several build in options. For example - attribute called gender with the values options of "male", "female" and "other.
The name of each attribute can be changed:
// change a display name of "name" attribute to "newName"
dbBot.changeColumnDisplayName("name", "newName");
// If the new name is already exists in another attribute, it throw an errorThe user will see the new name in the chat bot, and the dbBot.getColumnByName() method will be affected as well.
NOTE: The dbBot.getColumnById() will not be affected by changes in the display name of the attribute.
The id property of a column is a read only property, and can not be changed once the csv file get load.
All the messages of the chat bot can be changed.
There are two types of messages:
- Fixed in time messages
- Slots messgaes
The fixed in time messages has a fixed position in the chat, and only the text can be changed. The slots messages are messages that can be placed between the fixed in time messages, and the researcher can insert as many as he/she wants.
-
The fixed in time messages
- attributeMessage: In this message the user will see the list of attributes to choose from. The custom message is the text before the list.
- operatorMessage: Same as the attributeMessage, but with the operators list
- errorMessage: An error message in the chat, when the user enters invalid input.
- continueMessage: A message to confirm an continue the conversion, e.g. "Press 1 to continue"
- resultMessage: This message will be displayed to the user when the result of the query are ready.
-
The slots messages:
- welcomeSlot: Messages in this slots will be displayed at the begining of the chat.
- operatorSlot: Message in this slots will be displayed after the fix in time operatorMessage message.
- paramsSlot: Message in this slots will be displayed after the the messages of the params of the operator.
- restartSlot: Message in this slots will be displayed after the query has been chosen. This slots meant to ask the user if he/she wants to add another attribute to the query.
- resultSlot: Message in this slots will be displayed before the fix in time resultMessage message.
const messages = { attributeMessage: "Demo messages for attributes", operatorMessage: "Demo message for operators", errorMessage: "Demo error message", resultMessage: "Demo result message", }; dbBot.customMessages = messages;
// welcome slot const welcomeMessages = []; const welcomeFirstMsg = "Demo welcome slot first message"; welcomeMessages.push(welcomeFirstMsg); const welcomeSecondMsg = "Demo welcome slot second message"; welcomeMessages.push(welcomeSecondMsg); // operator slot const operatorMessages = []; const operatorFirstMsg = "Demo operator slot first message"; operatorMessages.push(operatorFirstMsg); const operatorSecondMsg = "Demo operator slot second message"; operatorMessages.push(operatorSecondMsg); const slots = { welcomeSlot: welcomeMessages, operatorSlot: operatorMessages, }; dbBot.slots = slots;
The DBbot class is the main class for the bot
- Description: Stores the file path of the currently loaded data.
- Default:
""
- Description: Contains predefined bot-related messages.
- Default:
{
customMessages: {
attributeMessage: undefined,
operatorMessage: undefined,
errorMessage: undefined,
continueMessage: undefined,
resultMessage: undefined,
},
slots: {
welcomeSlot: [],
operatorSlot: [],
paramsSlot: [],
restartSlot: [],
resultSlot: [],
}
}- Description: Contains predefined custom bot-related messages.
- Default:
{
attributeMessage: undefined,
operatorMessage: undefined,
errorMessage: undefined,
continueMessage: undefined,
resultMessage: undefined,
}- Description: Contains predefined slots bot-related messages.
- Default:
{
welcomeSlot: [],
operatorSlot: [],
paramsSlot: [],
restartSlot: [],
resultSlot: [],
}- Description: Stores configuration details specific to the bot.
- Default:
{
name: "DBBot",
helpDescription: "NO DESCRIPTION PROVIDED",
}- Description: Core data structure containing headers, columns, and custom operators.
- Default:
{
headers: [],
columns: [],
customOperators: this._customOperators,
}- Description: Configurable color settings for bot interactions.
- Default:
"blue"
- Description: Configurable color settings for bot interactions.
- Default:
"purple"
- Description:
This method allows you to define an array of null values for the bot. It updates the
nullValuesproperty, which holds an array of values that are considered null. - Parameters:
values- Type:
any[] - Description: An array of values that you wish to set as null values. The values in this array will replace the current
nullValuesarray.
- Type:
- RType:
void - Returns: nothing
- Description:
This method sets the description for a specified column. It updates the
descriptionproperty of the column object retrieved by its name. - Parameters:
column- Type:
string - Description: The name of the column whose description is to be set.
- Type:
description- Type:
string - Description: The description to be set for the specified column.
- Type:
- RType:
void - Returns: nothing
- Description:
This method retrieves the current
nullValuesproperty, which holds the array of values considered as null and the associated configuration. - Parameters:
- None
- RType:
NullValues - Returns: The current
nullValuesproperty.
- Description:
This method sets the
isFilterIncludesNullproperty in thenullValuesobject. It also validates that the provided value is of typeboolean, throwing aTypeErrorif the validation fails. - Parameters:
value- Type:
boolean - Description: The boolean value to be set for the
isFilterIncludesNullproperty.
- Type:
- RType:
void - Returns: nothing
- Description:
This method converts specified columns to the "factor" type. It retrieves the columns by name, invokes the
ConvertToFactor()method on each column, and updates the column's data by deleting all existing rows and adding new rows from the corresponding data map. If a column has no data, an error message is logged. - Parameters:
columns- Type:
string[] - Description: An array of column names to be converted to the "factor" type.
- Type:
- RType:
void - Returns: nothing
- Description:
This method retrieves a column by its
displayNameproperty. It internally calls the private methodgetColumnByStringPropertyto find a column whosedisplayNamematches the provided name, performing a case-insensitive comparison. - Parameters:
name- Type:
string - Description: The name of the column (as the
displayName) to be retrieved.
- Type:
- RType:
Column - Returns: The
Columnobject that matches the provideddisplayName. If no matching column is found, an error is thrown.
- Description:
This method retrieves a column by its
idproperty. It internally calls the private methodgetColumnByStringPropertyto find a column whoseidmatches the provided value, performing a case-insensitive comparison. - Parameters:
id- Type:
string - Description: The ID of the column to be retrieved.
- Type:
- RType:
Column - Returns: The
Columnobject that matches the providedid. If no matching column is found, an error is thrown.
- Description:
This method changes the display name of a specified column. It first checks if a column with the new name already exists. If not, it updates the column's
displayNameand updates the corresponding header in the_data.headersarray. If a column with the new name is found, an error is thrown. - Parameters:
name- Type:
string - Description: The current display name of the column to be renamed.
- Type:
newName- Type:
string - Description: The new display name to be assigned to the column.
- Type:
- RType:
void - Returns: nothing
- Description:
This method adds a custom operator by registering it and generating an import statement for any specified functions. It creates a file text containing the necessary import statements and the custom function, which is then added to the
operatorsFiles.functionsobject under the operator's name. - Parameters:
params- Type:
types.AddCustomOperatorParams - Description: An object containing parameters for adding a custom operator, including:
name(string): The name of the custom operator.customFunction(Function): The custom function to be assigned to the operator.importFunctions(string[] | undefined): An optional array of functions to be imported for use within the custom operator.
- Type:
- RType:
void - Returns: nothing
- Description:
This method loads a description file from the specified path, parses it, and updates the
descriptionproperty of the corresponding columns. It reads the file synchronously, parses the data into records, and associates the description from the file with each column by matching the column name. - Parameters:
path- Type:
string - Description: The file path of the description file to be loaded.
- Type:
- RType:
void - Returns: nothing
- Description:
This method loads and processes a file from the specified path. It reads the file data synchronously, parses the records, and updates the
_data.headersanddataMapproperties. It then adds columns to thedataMapand performs any additional automatic column assignments. If an error occurs during file reading or parsing, it is logged to the console. - Parameters:
path- Type:
string - Description: The file path of the file to be loaded and processed.
- Type:
- RType:
void - Returns: nothing
- Description:
This method fills null values in all columns of the data. It loops through each column and assigns a value based on the column's data type. If the column is of type
NUMERIC, it uses thenumericValue; otherwise, it uses thestringValue. If no value is provided for either, the method does nothing for that column. It also allows specifying custom null values. - Parameters:
numericValue- Type:
any - Description: The value to use for numeric columns when filling null values.
- Type:
stringValue- Type:
any - Description: The value to use for string columns when filling null values.
- Type:
nullValue- Type:
any[] - Description: An optional array of values representing the null values to be replaced. The default value is
[null].
- Type:
- RType:
void - Returns: nothing
The App class is the class for execute and deploy the bot
- Description:
This method generates a configuration file for the bot and saves it as a JSON file in the current working directory. It first calls the
createOperatorsFilemethod on thebotinstance to prepare operator-related data. Then, it serializes the bot object into a JSON file nameddb_bot.json. - Parameters:
bot- Type:
DBbot - Description: The bot instance for which the configuration file is being generated.
- Type:
- RType:
void - Returns: nothing
- Description:
This method deploys the bot by generating its configuration file and executing a deployment script. It first calls the
generateConfigFilemethod to create the configuration file for the bot. Then, it spawns a new process to execute thedeploy.shscript located in the current working directory. If an error occurs during the deployment, it is logged to the console. - Parameters:
bot- Type:
DBbot - Description: The bot instance to be deployed.
- Type:
- RType:
void - Returns: nothing
The Column class is the class for the Attributes of the data
- Description: Indicates whether the default operators should be used.
- Default:
true
- Description: The description of the column
- Default:
"No description available"
- Description: The id of the column
- Default:
None
- Description:
This method removes all rows from the current data structure by clearing the
rowsproperty. After this operation, therowsarray will be empty. - Parameters:
- None
- RType:
void - Returns: nothing
- Description:
This method retrieves the data of the current column in the form of a
ColumnDataobject. The returned object includes the column's ID, rows, data type, display name, and an array of operators associated with the column. - Parameters:
- None
- RType:
ColumnData - Returns:
- An object containing the following properties:
id(Type:string): The unique identifier of the column.rows(Type:any[]): An array representing the rows in the column.dataType(Type:string): The data type of the column.displayName(Type:string): The display name of the column.operators(Type:any[]): An array of operators associated with the column.
- An object containing the following properties:
- Description:
This method converts the column's data type to
FACTORand updates the column's operators array. It includes default factor operators and any custom operators associated with the column. - Parameters:
- None
- RType:
void - Returns: nothing
- Description:
This method adds the provided rows to the column by appending them to the existing
rowsarray. - Parameters:
rows- Type:
any[] - Description: An array of rows to be added to the column.
- Type:
- RType:
void - Returns: nothing
- Description:
This method adds a new operator to the column by appending it to the
operatorsArray. - Parameters:
operator- Type:
Operator - Description: The operator to be added to the column's operators array.
- Type:
- RType:
void - Returns: nothing
- Description:
This method calculates the mean (average) of the numeric rows in the column. It validates that the column's data type is
NUMERICbefore performing the calculation. Non-numeric values (NaN) in the rows are ignored. The result is rounded to two decimal places. - Parameters:
- None
- RType:
number - Returns:
- The mean (average) of the numeric rows in the column as a number.
- Throws:
- An error if the column's data type is not
NUMERIC.
- An error if the column's data type is not
- Description:
This method calculates the mode (the most frequently occurring value) of the numeric rows in the column. It validates that the column's data type is
NUMERICbefore performing the calculation. Non-numeric values (NaN) are ignored during the computation. - Parameters:
- None
- RType:
number - Returns:
- The mode (most frequently occurring value) of the numeric rows in the column as a number.
- Throws:
- An error if the column's data type is not
NUMERIC. - An error if all values occur with the same frequency, indicating there is no distinct mode.
- An error if the mode cannot be determined due to an unexpected issue.
- An error if the column's data type is not
- Description:
This method calculates the median (the middle value) of the numeric rows in the column. It validates that the column's data type is
NUMERICbefore performing the calculation. Non-numeric values (NaN) are excluded from the computation, and the rows are sorted in ascending order. - Parameters:
- None
- RType:
number - Returns:
- The median of the numeric rows in the column:
- If the number of rows is odd, the middle value.
- If the number of rows is even, the average of the two middle values.
- The median of the numeric rows in the column:
- Throws:
- An error if the column's data type is not
NUMERIC.
- An error if the column's data type is not
- Description: This method fills null values in the column based on the specified method. It supports various methods such as replacing nulls with a custom value, removing null values, or filling them with the mean, median, or mode of the column. The method throws errors if invalid parameters are provided or if the required custom value is not given.
- Parameters:
method- Type:
NullMethod - Description: The method to be used for filling the null values. Possible values are
CUSTOM,REMOVE,MEAN,MEDIAN, orMODE.
- Type:
nullValue- Type:
any[](default:[null]) - Description: The value(s) to be considered as null in the column. By default, this is set to
[null].
- Type:
customValue- Type:
NumOrStr | null(optional) - Description: A custom value to replace the null values, used when
methodis set toCUSTOM. If not provided for theCUSTOMmethod, an error is thrown.
- Type:
- RType:
void - Returns:
- - Throws:
- An error if the
methodis invalid or unrecognized. - An error if the
methodisCUSTOMandcustomValueisundefined.
- An error if the
The CustomOperator class is the class for custom operators for filtering the data.
- Description:
Creates an instance of the
CustomOperatorclass, initializing it with a name and a custom function that defines the operator's behavior. - Parameters:
name- Type:
string - Description: The name of the operator.
- Type:
customFunction- Type:
Function - Description: The custom function that defines the behavior of the operator.
- Type:
- RType:
CustomOperator - Returns: A new instance of the
CustomOperatorclass.
- Description:
Executes the
customFunctiondefined in the constructor and returns its result. - Parameters: None
- RType:
any - Returns: The result of the custom function execution.
- Description:
Adds the provided parameters to the operator's
paramsarray. - Parameters:
params- Type:
Params[] - Description: An array of parameters to be added to the operator's
paramsarray.
- Type:
- RType:
void - Returns: nothing
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License.
Special thanks to all the contributors and the open-source community.