DescribeExplorer

Explorer
DashAI.back.exploration.explorers.DescribeExplorer

Explorer that generates a statistical summary table using pandas describe.

For numeric columns the output includes: count (number of non-missing values), mean, std (standard deviation), min, the requested percentile rows (25%, 50%, and 75% by default), and max. For object and categorical columns it reports count, unique (number of distinct values), top (most frequent value), and freq (number of occurrences of the most frequent value).
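The rows described above can be seen by calling pandas describe directly. The following is a minimal sketch on a made-up DataFrame (the column names and values are illustrative only); it does not go through DescribeExplorer itself.

```python
import pandas as pd

# Illustrative data: one numeric column with a missing value, one text column.
df = pd.DataFrame(
    {
        "age": [22, 35, 58, None],
        "city": ["Lima", "Quito", "Lima", "Bogota"],
    }
)

# With no arguments, describe summarises numeric columns only.
numeric = df.describe()
print(numeric)

# include="all" adds the count/unique/top/freq rows for non-numeric columns.
full = df.describe(include="all")
print(full)
```

The count row for age is lower than the number of rows in the frame because the missing value is not counted, which is the kind of data quality signal this summary is meant to surface.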

This explorer is a fast first step when exploring a new dataset: it immediately surfaces the central tendency, spread, and range of each column, flags potential data quality issues (e.g. unexpectedly low counts indicating missing values), and helps decide which transformations or visualisations to apply next.

Users can customise which percentiles appear in the output and restrict or expand the dtype groups that are summarised via the include and exclude parameters.

Parameters

percentiles, default=25, 50, 75
Percentiles to include in the summary, given as comma-separated integers between 0 and 100. Example: '25, 50, 75'
include, default=all
Data types to include in the exploration.
exclude, default=None
Data types to exclude from the exploration.
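As a rough illustration of how these three parameters could map onto a pandas call, the sketch below converts a comma-separated percentile string into the fractions that pandas.DataFrame.describe expects. The helper parse_percentiles is hypothetical and is not part of DashAI; how DescribeExplorer performs this conversion internally is an assumption.

```python
from typing import List

import pandas as pd


def parse_percentiles(raw: str) -> List[float]:
    """Hypothetical helper: turn '10, 50, 90' into [0.1, 0.5, 0.9]."""
    return [int(part.strip()) / 100 for part in raw.split(",") if part.strip()]


# Illustrative data only.
df = pd.DataFrame({"price": [1.0, 2.0, 3.0, 4.0, 5.0], "label": list("aabbc")})

# Custom percentiles, summarising every column regardless of dtype.
summary = df.describe(percentiles=parse_percentiles("10, 50, 90"), include="all")
print(summary.index.tolist())

# Excluding numeric dtypes leaves only the non-numeric columns.
non_numeric = df.describe(exclude=["number"])
print(non_numeric)
```

pandas requires that include and exclude do not overlap, so in practice only one of the two is usually set for a given run.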

Methods

get_results(self, exploration_path: str, options: Dict[str, Any]) -> Dict[str, Any]

Defined on DescribeExplorer

Load and return the saved statistical summary for the frontend.

Parameters

exploration_path : str
Path to the JSON file saved by save_notebook.
options : Dict[str, Any]
Rendering options from the frontend. Supports "orientation" (str, default "dict"), which is forwarded to pandas.DataFrame.to_dict as its orient argument.

Returns

Dict[str, Any]
Dictionary with keys "data" (nested dict of the transposed describe output in the requested orientation), "type" ("tabular"), and "config" (dict containing {"orient": <orientation>}).
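To make the shape of the returned dictionary concrete, here is a small sketch that builds a payload with the same three keys from an in-memory summary. The function build_results is hypothetical, and the point at which the real method transposes the table or reads it from disk is an assumption; only the key names and the use of to_dict come from the description above.

```python
from typing import Any, Dict

import pandas as pd


def build_results(summary: pd.DataFrame, options: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in that mirrors the documented return shape."""
    orientation = options.get("orientation", "dict")
    return {
        # Transposed so that each dataset column becomes a row of statistics.
        "data": summary.T.to_dict(orient=orientation),
        "type": "tabular",
        "config": {"orient": orientation},
    }


summary = pd.DataFrame({"x": [1, 2, 3, 4]}).describe()

as_dict = build_results(summary, {})
print(as_dict["data"]["count"])

as_records = build_results(summary, {"orientation": "records"})
print(as_records["data"])
```

With the default "dict" orientation the outer keys are statistic names and the inner keys are dataset column names; with "records" the data becomes a list holding one dict per dataset column.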

launch_exploration(self, dataset: 'DashAIDataset', explorer_info: DashAI.back.dependencies.database.models.Explorer) -> Any

Defined on DescribeExplorer

Compute a statistical summary of the dataset using pandas describe.

Parameters

dataset : DashAIDataset
The dataset to summarize.
explorer_info : Explorer
The explorer database record (unused).

Returns

Any
A pandas.DataFrame containing descriptive statistics (count, mean, std, min, percentiles, max for numeric columns; count, unique, top, freq for object columns).

save_notebook(self, notebook_info: DashAI.back.dependencies.database.models.Notebook, explorer_info: DashAI.back.dependencies.database.models.Explorer, save_path: 'Path', result: Any) -> str

Defined on DescribeExplorer

Save the descriptive statistics DataFrame to a JSON file on disk.

Parameters

notebook_info : Notebook
The notebook database record (unused).
explorer_info : Explorer
The explorer record used for filename generation.
save_path : Path
Directory where the file will be saved.
result : Any
The pandas.DataFrame returned by launch_exploration.

Returns

str
The path of the saved JSON file as a POSIX string.
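The save step can be pictured as writing the DataFrame with to_json and returning the target path in POSIX form. The sketch below is an assumption about the general approach, not the actual implementation: save_summary and the filename are hypothetical, and the real method derives the filename from the explorer record.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def save_summary(summary: pd.DataFrame, save_dir: Path, filename: str) -> str:
    """Hypothetical helper: write the summary as JSON and return a POSIX path."""
    target = save_dir / filename
    summary.to_json(target)
    return target.as_posix()


with tempfile.TemporaryDirectory() as tmp:
    summary = pd.DataFrame({"x": [1, 2, 3]}).describe()
    saved = save_summary(summary, Path(tmp), "describe_example.json")

    # Read the file back to confirm the statistics survived the round trip.
    restored = json.loads(Path(saved).read_text())
    print(restored["x"]["count"])
```

Returning a string rather than a Path keeps the value easy to store in a database field and to pass back to get_results later.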

validate_parameters(cls, params: Dict[str, Any]) -> bool

Defined on DescribeExplorer

Validate explorer parameters against the schema and business rules.

Parameters

cls : type
The explorer class (injected automatically by Python for classmethods).
params : Dict[str, Any]
Parameter dictionary to validate (must match DescribeExplorerSchema).

Returns

bool
True if all validations pass, False if any percentile value is outside [0, 100] or cannot be parsed as an integer.
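The percentile rule described above can be sketched in plain Python. This is only an illustration of the documented behaviour (integers in [0, 100], False on parse failure); percentiles_are_valid is a hypothetical name, and the schema check that the real classmethod also performs is omitted.

```python
from typing import Any, Dict


def percentiles_are_valid(params: Dict[str, Any]) -> bool:
    """Hypothetical check: every comma-separated item is an int in [0, 100]."""
    raw = str(params.get("percentiles", "25, 50, 75"))
    for part in raw.split(","):
        try:
            value = int(part.strip())
        except ValueError:
            # Not parseable as an integer (e.g. "ten", "25.5", or empty).
            return False
        if not 0 <= value <= 100:
            return False
    return True


print(percentiles_are_valid({"percentiles": "25, 50, 75"}))  # True
print(percentiles_are_valid({"percentiles": "10, 150"}))     # False
print(percentiles_are_valid({"percentiles": "ten"}))         # False
```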

get_metadata(cls) -> Dict[str, Any]

Defined on BaseExplorer

Get metadata for the explorer, used by the DashAI frontend.

Returns

Dict[str, Any]
Dictionary containing display name, description, image preview path, category, icon, color, allowed dtypes, restricted dtypes, and input cardinality constraints.

get_schema(cls) -> dict

Defined on ConfigObject

Generate the JSON Schema for the component.

Returns

dict
Dictionary representing the JSON Schema of the component.

prepare_dataset(self, loaded_dataset: 'DashAIDataset', columns: List[Dict[str, Any]]) -> 'DashAIDataset'

Defined on BaseExplorer

Prepare the dataset by selecting only the columns needed for this exploration.

Parameters

loaded_dataset : DashAIDataset
The full dataset loaded from storage.
columns : List[Dict[str, Any]]
List of column descriptor dicts, each containing at least "columnName". Optional keys: "id", "valueType", "dataType".

Returns

DashAIDataset
Dataset restricted to the requested columns.

validate_and_transform(self, raw_data: dict) -> dict

Defined on ConfigObject

Validate the data provided by the user to initialize the model and return it together with all the objects the model needs to work.

Parameters

raw_data : dict
A dictionary with the data provided by the user to initialize the model.

Returns

dict
A validated dictionary with the necessary objects.

validate_columns(cls, explorer_info: DashAI.back.dependencies.database.models.Explorer, column_spec: Dict[str, Dict[str, str]]) -> bool

Defined on BaseExplorer

Validate that the selected columns satisfy the explorer's constraints.

Parameters

explorer_info : Explorer
The database record for the explorer instance, including the selected columns.
column_spec : Dict[str, Dict[str, str]]
A mapping from column name to a dict with at least a "type" key describing the column's data type.

Returns

bool
True if all column constraints are satisfied, False otherwise.