DescribeExplorer
Explorer that generates a statistical summary table using pandas describe.
For numeric columns the output includes: count (number of non-missing
values), mean, std (standard deviation), min, the requested
percentile rows (defaulting to 25 %, 50 %, and 75 %), and max. For
object and categorical columns it reports count, unique (number of
distinct values), top (most frequent value), and freq (frequency of
the most common value).
This explorer is a fast first step when exploring a new dataset: it immediately surfaces the central tendency, spread, and range of each column, flags potential data quality issues (e.g. unexpectedly low counts indicating missing values), and helps decide which transformations or visualisations to apply next.
Users can customise which percentiles appear in the output and restrict or
expand the dtype groups that are summarised via the include and
exclude parameters.
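The summary described above can be reproduced directly with pandas. The following is a minimal sketch, not the explorer's actual implementation: the toy DataFrame and the step that converts integer percentiles to fractions are assumptions made for illustration (pandas itself expects percentiles between 0 and 1).

```python
import pandas as pd

# Toy data (hypothetical): one numeric column with a missing value, one object column.
df = pd.DataFrame(
    {
        "age": [22, 35, 58, None],
        "city": ["Lima", "Quito", "Lima", "Bogota"],
    }
)

# Assumption: integer percentiles such as 10, 50, 90 are divided by 100,
# because pandas.DataFrame.describe expects fractions in [0, 1].
requested = [10, 50, 90]
fractions = [p / 100 for p in requested]

# include="all" summarises numeric and non-numeric columns together.
summary = df.describe(percentiles=fractions, include="all")
print(summary)

# include / exclude narrow the dtype groups that are summarised.
numeric_only = df.describe(percentiles=fractions, include=["number"])
text_only = df.describe(exclude=["number"])
```

A count of 3 for "age" against 4 rows is the kind of low count that points to missing values.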
Parameters
- percentiles, default=25, 50, 75 - Percentiles to include in the exploration. Use integers between 0 and 100. Example: '25, 50, 75'
- include, default=all - Data types to include in the exploration.
- exclude, default=None - Data types to exclude from the exploration.
Methods
get_results(self, exploration_path: str, options: Dict[str, Any]) -> Dict[str, Any]
Defined in DescribeExplorer. Load and return the saved statistical summary for the frontend.
Parameters
- exploration_path : str
- Path to the JSON file saved by save_notebook.
- options : Dict[str, Any]
- Rendering options from the frontend. Supports "orientation" (str, default "dict"), which is forwarded to pandas.DataFrame.to_dict.
Returns
- Dict[str, Any]
- Dictionary with keys "data" (nested dict of the transposed describe output in the requested orientation), "type" ("tabular"), and "config" (dict containing {"orient": <orientation>}).
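The shape of the returned dictionary can be illustrated with a short sketch. It assumes the summary is already available as a DataFrame and skips loading from exploration_path, since the layout of the saved JSON is not documented here. The helper name build_payload is hypothetical.

```python
import pandas as pd


def build_payload(summary: pd.DataFrame, options: dict) -> dict:
    """Hypothetical helper mirroring the documented return keys."""
    # "orientation" falls back to "dict" when the frontend does not send it.
    orientation = options.get("orientation", "dict")
    return {
        # Transpose so each original column becomes a row, then serialise.
        "data": summary.T.to_dict(orient=orientation),
        "type": "tabular",
        "config": {"orient": orientation},
    }


summary = pd.DataFrame({"age": [22, 35, 58]}).describe()

default_payload = build_payload(summary, {})
records_payload = build_payload(summary, {"orientation": "records"})
```

With the default "dict" orientation the data is keyed by statistic and then by column name. With "records" it becomes a list holding one dict per original column.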
launch_exploration(self, dataset: 'DashAIDataset', explorer_info: DashAI.back.dependencies.database.models.Explorer) -> Any
Defined in DescribeExplorer. Compute a statistical summary of the dataset using pandas describe.
Parameters
- dataset : DashAIDataset
- The dataset to summarize.
- explorer_info : Explorer
- The explorer database record (unused).
Returns
- Any
- A pandas.DataFrame containing descriptive statistics (count, mean, std, min, percentiles, max for numeric columns; count, unique, top, freq for object columns).
save_notebook(self, notebook_info: DashAI.back.dependencies.database.models.Notebook, explorer_info: DashAI.back.dependencies.database.models.Explorer, save_path: 'Path', result: Any) -> str
Defined in DescribeExplorer. Save the descriptive statistics DataFrame to a JSON file on disk.
Parameters
- notebook_info : Notebook
- The notebook database record (unused).
- explorer_info : Explorer
- The explorer record used for filename generation.
- save_path : Path
- Directory where the file will be saved.
- result : Any
- The pandas.DataFrame returned by launch_exploration.
Returns
- str
- The path of the saved JSON file as a POSIX string.
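A rough sketch of the save step is shown below. The real method derives the filename from explorer_info. Here the filename is passed in explicitly, and both the helper name save_summary and the default to_json layout are assumptions.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def save_summary(summary: pd.DataFrame, save_path: Path, filename: str) -> str:
    """Hypothetical helper: write the summary as JSON and return a POSIX path."""
    target = save_path / filename
    # Assumption: the default "columns" orientation of DataFrame.to_json is used.
    summary.to_json(target)
    return target.as_posix()


summary = pd.DataFrame({"age": [22, 35, 58]}).describe()

with tempfile.TemporaryDirectory() as tmp:
    saved = save_summary(summary, Path(tmp), "describe_example.json")
    # Read the file back with the standard library to inspect what was stored.
    stored = json.loads(Path(saved).read_text())
```

Returning the path via as_posix() keeps the separator consistent across operating systems, which matters when the string is stored and read back later.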
validate_parameters(cls, params: Dict[str, Any]) -> bool
Defined in DescribeExplorer. Validate explorer parameters against the schema and business rules.
Parameters
- cls : type
- The explorer class (injected automatically by Python for classmethods).
- params : Dict[str, Any]
- Parameter dictionary to validate (must match DescribeExplorerSchema).
Returns
- bool
- True if all validations pass, False if any percentile value is outside [0, 100] or cannot be parsed as an integer.
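The percentile rule can be sketched in plain Python. This is an illustration of the documented behaviour only: it assumes percentiles arrive as a comma-separated string such as '25, 50, 75', and it leaves out the schema check. The function name percentiles_are_valid is hypothetical.

```python
from typing import Any, Dict


def percentiles_are_valid(params: Dict[str, Any]) -> bool:
    """Hypothetical check: every percentile is an integer within [0, 100]."""
    # Assumption: the value is a comma-separated string like "25, 50, 75".
    raw = str(params.get("percentiles", "25, 50, 75"))
    try:
        values = [int(item.strip()) for item in raw.split(",")]
    except ValueError:
        # At least one item could not be parsed as an integer.
        return False
    return all(0 <= value <= 100 for value in values)


ok = percentiles_are_valid({"percentiles": "10, 50, 90"})
too_large = percentiles_are_valid({"percentiles": "10, 101"})
not_a_number = percentiles_are_valid({"percentiles": "10, abc"})
```

Returning False rather than raising matches the bool return type documented above.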
get_metadata(cls) -> Dict[str, Any]
Defined in BaseExplorer. Get metadata for the explorer, used by the DashAI frontend.
Returns
- Dict[str, Any]
- Dictionary containing display name, description, image preview path, category, icon, color, allowed dtypes, restricted dtypes, and input cardinality constraints.
get_schema(cls) -> dict
Defined in ConfigObject. Generate the component's JSON Schema.
Returns
- dict
- Dictionary representing the JSON Schema of the component.
prepare_dataset(self, loaded_dataset: 'DashAIDataset', columns: List[Dict[str, Any]]) -> 'DashAIDataset'
Defined in BaseExplorer. Prepare the dataset by selecting only the columns needed for this exploration.
Parameters
- loaded_dataset : DashAIDataset
- The full dataset loaded from storage.
- columns : List[Dict[str, Any]]
- List of column descriptor dicts, each containing at least "columnName". Optional keys: "id", "valueType", "dataType".
Returns
- DashAIDataset
- Dataset restricted to the requested columns.
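The column selection can be pictured with the sketch below. It uses a pandas DataFrame as a stand-in for DashAIDataset, whose own API is not shown in this page, so the helper select_columns and the sample descriptors are purely illustrative.

```python
from typing import Any, Dict, List

import pandas as pd


def select_columns(frame: pd.DataFrame, columns: List[Dict[str, Any]]) -> pd.DataFrame:
    """Hypothetical stand-in: keep only the columns named in the descriptors."""
    # Only "columnName" is required; other descriptor keys are ignored here.
    names = [column["columnName"] for column in columns]
    return frame[names]


frame = pd.DataFrame(
    {"age": [22, 35], "city": ["Lima", "Quito"], "income": [1200.0, 1850.5]}
)
descriptors = [
    {"columnName": "age", "id": 0, "valueType": "Integer"},
    {"columnName": "income"},
]

selected = select_columns(frame, descriptors)
```

The result keeps the order of the descriptors, so the summary columns appear in the order the user selected them.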
validate_and_transform(self, raw_data: dict) -> dict
Defined in ConfigObject. Take the data provided by the user to initialize the model and return it with all the objects the model needs to work.
Parameters
- raw_data : dict
- A dictionary with the data provided by the user to initialize the model.
Returns
- dict
- A validated dictionary with the necessary objects.
validate_columns(cls, explorer_info: DashAI.back.dependencies.database.models.Explorer, column_spec: Dict[str, Dict[str, str]]) -> bool
Defined in BaseExplorer. Validate that the selected columns satisfy the explorer's constraints.
Parameters
- explorer_info : Explorer
- The database record for the explorer instance, including the selected columns.
- column_spec : Dict[str, Dict[str, str]]
- A mapping from column name to a dict with at least a "type" key describing the column's data type.
Returns
- bool
- True if all column constraints are satisfied, False otherwise.