DescribeExplorer

Explorer

DashAI.back.exploration.explorers.DescribeExplorer

Explorer that generates a statistical summary table using pandas describe.

For numeric columns the output includes: count (number of non-missing values), mean, std (standard deviation), min, the requested percentile rows (defaulting to 25 %, 50 %, and 75 %), and max. For object and categorical columns it reports count, unique (number of distinct values), top (most frequent value), and freq (frequency of the most common value).
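As a rough illustration of the rows listed above, the following sketch calls pandas describe directly on a small mixed-type table. The DataFrame and its column names are made up for the example, and the explorer itself is not involved.

```python
import pandas as pd

# Hypothetical data: one numeric column with a missing value, one text column.
df = pd.DataFrame(
    {
        "age": [23, 35, 41, None],
        "city": ["Lima", "Quito", "Lima", "Bogota"],
    }
)

# include="all" summarises numeric and non-numeric columns in one table.
summary = df.describe(include="all")
print(summary)
```

The count for age is 3 rather than 4, which is how a missing value shows up in the summary. Statistics that do not apply to a column (for example mean for city) appear as NaN.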

This explorer is a fast first step when exploring a new dataset: it immediately surfaces the central tendency, spread, and range of each column, flags potential data quality issues (e.g. unexpectedly low counts indicating missing values), and helps decide which transformations or visualisations to apply next.

Users can customise which percentiles appear in the output and restrict or expand the dtype groups that are summarised via the include and exclude parameters.

Parameters

percentiles, default='25, 50, 75': Comma-separated percentiles to include in the exploration, given as integers between 0 and 100. Example: '25, 50, 75'
include, default=all: Data types to include in the exploration.
exclude, default=None: Data types to exclude from the exploration.
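The sketch below shows how these three parameters could map onto pandas describe, which expects percentiles as fractions between 0 and 1. The helper to_fractions and the sample data are hypothetical. The explorer's actual conversion is not shown in this reference, so treat this as an assumption about the general idea.

```python
from typing import List

import pandas as pd


def to_fractions(percentiles: str) -> List[float]:
    """Hypothetical helper: turn '10, 90' into [0.1, 0.9] for pandas."""
    return [int(part.strip()) / 100 for part in percentiles.split(",")]


# Made-up table with one numeric and one text column.
df = pd.DataFrame(
    {"price": [10.0, 20.0, 30.0, 40.0], "label": ["a", "b", "a", "c"]}
)

fractions = to_fractions("10, 90")

# Summarise only numeric columns, with custom percentile rows.
numeric_only = df.describe(percentiles=fractions, include=["number"])

# Summarise everything except numeric columns.
no_numbers = df.describe(exclude=["number"])

print(numeric_only)
print(no_numbers)
```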

Methods

get_results(self, exploration_path: str, options: Dict[str, Any]) -> Dict[str, Any]

Defined on DescribeExplorer

Load and return the saved statistical summary for the frontend.

Parameters

exploration_path : str: Path to the JSON file saved by save_notebook.
options : Dict[str, Any]: Rendering options from the frontend. Supports "orientation" (str, default "dict"), which is forwarded to pandas.DataFrame.to_dict.

Returns

Dict[str, Any]: Dictionary with keys "data" (nested dict of the transposed describe output in the requested orientation), "type" ("tabular"), and "config" (dict containing {"orient": <orientation>}).
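To make the shape of the returned dictionary concrete, here is a minimal sketch that builds a payload with the same three keys from an in-memory summary. It skips loading the saved JSON file, and the sample data is invented. Only the key names and the use of pandas.DataFrame.to_dict come from the description above.

```python
import pandas as pd

# Made-up numeric table standing in for a real dataset.
df = pd.DataFrame({"height": [1.0, 2.0, 3.0], "width": [4.0, 5.0, 6.0]})

orientation = "dict"  # value of the "orientation" option
transposed = df.describe().T  # one row per column, one column per statistic

payload = {
    "data": transposed.to_dict(orient=orientation),
    "type": "tabular",
    "config": {"orient": orientation},
}

print(payload["data"]["mean"])
```

With the "dict" orientation the outer keys are statistic names and the inner keys are column names. Other orientations accepted by to_dict, such as "records" or "index", change that nesting.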

launch_exploration(self, dataset: 'DashAIDataset', explorer_info: DashAI.back.dependencies.database.models.Explorer) -> Any

Defined on DescribeExplorer

Compute a statistical summary of the dataset using pandas describe.

Parameters

dataset : DashAIDataset: The dataset to summarize.
explorer_info : Explorer: The explorer database record (unused).

Returns

Any: A pandas.DataFrame containing descriptive statistics (count, mean, std, min, percentiles, max for numeric columns; count, unique, top, freq for object columns).

save_notebook(self, notebook_info: DashAI.back.dependencies.database.models.Notebook, explorer_info: DashAI.back.dependencies.database.models.Explorer, save_path: 'Path', result: Any) -> str

Defined on DescribeExplorer

Save the descriptive statistics DataFrame to a JSON file on disk.

Parameters

notebook_info : Notebook: The notebook database record (unused).
explorer_info : Explorer: The explorer record used for filename generation.
save_path : Path: Directory where the file will be saved.
result : Any: The pandas.DataFrame returned by launch_exploration.

Returns

str: The path of the saved JSON file as a POSIX string.
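The save step can be pictured with the following sketch, which writes a describe result to a JSON file and returns the path as a POSIX string. The function save_summary and the filename are hypothetical. The real method derives the filename from explorer_info, and its exact JSON layout is not documented here.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def save_summary(result: pd.DataFrame, save_path: Path, filename: str) -> str:
    """Write the summary as JSON and return the file path as a POSIX string."""
    target = save_path / filename
    result.to_json(target)  # pandas default layout: {column: {statistic: value}}
    return target.as_posix()


with tempfile.TemporaryDirectory() as tmp:
    summary = pd.DataFrame({"x": [1, 2, 3]}).describe()
    saved = save_summary(summary, Path(tmp), "describe.json")
    # Read the file back while the temporary directory still exists.
    with open(saved, encoding="utf-8") as handle:
        loaded = json.load(handle)

print(saved)
print(loaded["x"]["mean"])
```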

validate_parameters(cls, params: Dict[str, Any]) -> bool

Defined on DescribeExplorer

Validate explorer parameters against the schema and business rules.

Parameters

cls : type: The explorer class (injected automatically by Python for classmethods).
params : Dict[str, Any]: Parameter dictionary to validate (must match DescribeExplorerSchema).

Returns

bool: True if all validations pass, False if any percentile value is outside [0, 100] or cannot be parsed as an integer.
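The percentile rule described above can be sketched as a small standalone check. The function name and the assumption that percentiles arrive as a comma-separated string are illustrative, and the schema validation that the real method also performs is left out.

```python
from typing import Any, Dict


def percentiles_are_valid(params: Dict[str, Any]) -> bool:
    """Return False if any percentile is not an integer within [0, 100]."""
    raw = params.get("percentiles", "25, 50, 75")
    try:
        values = [int(part.strip()) for part in str(raw).split(",")]
    except ValueError:
        # At least one entry could not be parsed as an integer.
        return False
    return all(0 <= value <= 100 for value in values)


print(percentiles_are_valid({"percentiles": "25, 50, 75"}))
print(percentiles_are_valid({"percentiles": "25, 150"}))
print(percentiles_are_valid({"percentiles": "median"}))
```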

get_metadata(cls) -> Dict[str, Any]

Defined on BaseExplorer

Get metadata for the explorer, used by the DashAI frontend.

Returns

Dict[str, Any]: Dictionary containing display name, description, image preview path, category, icon, color, allowed semantic types, allowed dtypes, and input cardinality constraints.

get_schema(cls) -> dict

Defined on ConfigObject

Generate the JSON Schema for the component.

Returns

dict: Dictionary representing the JSON Schema of the component.

prepare_dataset(self, loaded_dataset: 'DashAIDataset', columns: List[Dict[str, Any]]) -> 'DashAIDataset'

Defined on BaseExplorer

Prepare the dataset by selecting only the columns needed for this exploration.

Parameters

loaded_dataset : DashAIDataset: The full dataset loaded from storage.
columns : List[Dict[str, Any]]: List of column descriptor dicts, each containing at least "columnName". Optional keys: "id", "valueType", "dataType".

Returns

DashAIDataset: Dataset restricted to the requested columns.
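The column selection can be illustrated with a plain dictionary of lists standing in for a DashAIDataset. The function select_columns and the sample data are hypothetical. Only the "columnName" key comes from the parameter description above.

```python
from typing import Any, Dict, List


def select_columns(
    table: Dict[str, list], columns: List[Dict[str, Any]]
) -> Dict[str, list]:
    """Keep only the columns named in the descriptor dicts, in that order."""
    names = [column["columnName"] for column in columns]
    return {name: table[name] for name in names}


# Stand-in for the full dataset loaded from storage.
table = {"age": [23, 35], "city": ["Lima", "Quito"], "notes": ["", "vip"]}

# Column descriptors as the frontend might send them (optional keys omitted).
descriptors = [{"columnName": "age"}, {"columnName": "city"}]

selected = select_columns(table, descriptors)
print(list(selected))
```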

validate_and_transform(self, raw_data: dict) -> dict

Defined on ConfigObject

Take the data provided by the user to initialize the model and return it together with all the objects the model needs to work.

Parameters

raw_data : dict: A dictionary with the data provided by the user to initialize the model.

Returns

dict: A validated dictionary with the necessary objects.

validate_columns(cls, explorer_info: DashAI.back.dependencies.database.models.Explorer, column_spec: Dict[str, Dict[str, str]]) -> bool

Defined on BaseExplorer

Validate that the selected columns satisfy the explorer's constraints.

Parameters

explorer_info : Explorer: The database record for the explorer instance, including the selected columns.
column_spec : Dict[str, Dict[str, str]]: A mapping from column name to a dict with at least "type" (semantic type name) and "dtype" (dtype string).

Returns

bool: True if all column constraints are satisfied, False otherwise.
