Profile
CharStats
Source code in python/scouter/profile/_profile.pyi
DataProfile
Data profile of features
features
property
Returns dictionary of features and their data profiles
__str__()
model_dump_json()
model_validate_json(json_string)
staticmethod
Load Data profile from json
Parameters:
Name | Type | Description | Default |
---|---|---|---|
json_string | str | JSON string representation of the data profile | required |
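A minimal sketch of a JSON round trip using `model_dump_json()` and `model_validate_json(json_string)`. The helper name `json_round_trip` is hypothetical, the import path `scouter.profile` is an assumption inferred from the source file location, and `model_dump_json()` is assumed to return a `str`. The scouter calls are guarded so the block still runs when the package is not installed.

```python
import json
from typing import Any, Callable


def json_round_trip(profile: Any, loader: Callable[[str], Any]) -> Any:
    """Dump a profile to a JSON string, then rebuild it with `loader`."""
    json_string = profile.model_dump_json()
    # Sanity check: raises ValueError if the dump is not valid JSON.
    json.loads(json_string)
    return loader(json_string)


try:
    # Assumed import path; adjust to match the installed package layout.
    from scouter.profile import DataProfile, DataProfiler
    import numpy as np
except ImportError:
    DataProfile = None

if DataProfile is not None:
    data = np.random.default_rng(0).normal(size=(100, 3))
    original = DataProfiler().create_data_profile(data)
    # model_validate_json is a staticmethod, so it is called on the class.
    restored = json_round_trip(original, DataProfile.model_validate_json)
    print(restored)
```

Passing the loader as an argument keeps the helper independent of the import path, which may differ between versions.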
save_to_json(path=None)
Save data profile to json file
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | Optional[Path] | Optional path to save the data profile. If None, outputs to | None |
Returns:
Type | Description |
---|---|
Path | Path to the saved data profile |
DataProfiler
__init__()
create_data_profile(data, data_type=None, bin_size=20, compute_correlations=False)
Create a data profile from data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | Any | Data to create a data profile from. Data can be a numpy array, a polars dataframe, or a pandas dataframe. The data must not contain missing values, NaNs, or infinities. These values are incompatible with computing quantiles, histograms, and correlations, so they must be removed or imputed beforehand. | required |
data_type | Optional[DataType] | Optional data type. Inferred from data if not provided. | None |
bin_size | int | Optional bin size for histograms. Defaults to 20 bins. | 20 |
compute_correlations | bool | Whether to compute correlations. | False |
Returns:
Type | Description |
---|---|
DataProfile | DataProfile |
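Because `create_data_profile` expects data without missing values, NaNs, or infinities, one way to prepare a numpy array is to drop the affected rows first. The following is a sketch under stated assumptions: the helper `drop_non_finite_rows` is hypothetical and not part of scouter, and the import path `scouter.profile` is inferred from the source file location. Imputation would be an equally valid choice; row removal is used here only because it is the shortest to show.

```python
import numpy as np


def drop_non_finite_rows(data: np.ndarray) -> np.ndarray:
    """Keep only rows in which every value is finite (no NaN, +inf, -inf)."""
    mask = np.isfinite(data).all(axis=1)
    return data[mask]


# Synthetic data with one NaN and one infinity injected.
rng = np.random.default_rng(seed=0)
raw = rng.normal(size=(100, 3))
raw[3, 0] = np.nan
raw[7, 2] = np.inf

clean = drop_non_finite_rows(raw)

try:
    # Assumed import path; adjust to match the installed package layout.
    from scouter.profile import DataProfiler
except ImportError:
    DataProfiler = None

if DataProfiler is not None:
    profiler = DataProfiler()
    # Arguments mirror the documented signature and its defaults.
    profile = profiler.create_data_profile(
        clean,
        bin_size=20,
        compute_correlations=False,
    )
    print(profile)
```

The resulting `DataProfile` exposes the per-feature profiles through its `features` property.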