Domain Selection

Domain selection is a key differentiator of Rassket. By choosing a domain that matches your data, you enable specialized feature engineering and domain-aware explanations, which can improve both model accuracy and interpretability.

Energy vs Research

Rassket offers two primary domains, each optimized for different types of problems and data characteristics.

Energy Domain

Select Energy when your data relates to:

Energy consumption, production, or generation
Grid operations and demand management
Renewable energy systems
Energy markets and pricing
Emissions and sustainability metrics

Energy Sub-Domains

Forecasting: Energy forecasting, optimization, and energy market signals
Renewables: Renewable energy generation and integration
Grid: Grid operations and demand management
Energy Economics: Energy markets and pricing

Research Domain

Select Research when your data relates to:

Academic and scientific datasets
Experimental and observational studies
Econometric analysis
Statistical modeling
Time series analysis in research contexts

Research Sub-Domains

Forecasting: Time series forecasting within research contexts
Econometrics: Economic modeling and statistical analysis
Statistical Modeling: Advanced statistical methods
Experimental Analysis: Experimental design and observational studies

Sub-Domains are Optional: You can skip sub-domain selection and use the base domain. Sub-domains provide more refined explanations and feature engineering, but the base domain still provides significant benefits over generic AutoML.
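
The domain and sub-domain lists above can be pictured as a small lookup with an optional second level. The sketch below is illustrative only: the names `SUB_DOMAINS` and `DomainSelection` are hypothetical and are not part of Rassket's API. It shows that a selection is valid with the base domain alone, and that a sub-domain only makes sense under its own domain.

```python
from dataclasses import dataclass
from typing import Optional

# Sub-domain names are copied from the lists above.
# The data structure itself is a hypothetical illustration, not Rassket's API.
SUB_DOMAINS = {
    "energy": {"forecasting", "renewables", "grid", "energy economics"},
    "research": {
        "forecasting",
        "econometrics",
        "statistical modeling",
        "experimental analysis",
    },
}


@dataclass(frozen=True)
class DomainSelection:
    """A domain plus an optional sub-domain (None means 'use the base domain')."""

    domain: str
    sub_domain: Optional[str] = None

    def __post_init__(self):
        domain = self.domain.lower()
        if domain not in SUB_DOMAINS:
            raise ValueError(f"unknown domain: {self.domain!r}")
        if (
            self.sub_domain is not None
            and self.sub_domain.lower() not in SUB_DOMAINS[domain]
        ):
            raise ValueError(
                f"{self.sub_domain!r} is not a sub-domain of {self.domain!r}"
            )


base_only = DomainSelection("energy")                  # sub-domain skipped
refined = DomainSelection("research", "econometrics")  # sub-domain chosen
```

Note that "Forecasting" appears under both domains, so a sub-domain name is only meaningful together with its parent domain.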

Why Domain Selection Matters

1. Specialized Feature Engineering

Each domain applies domain-specific feature engineering:

Energy Domain Features

Temporal Features: Hour-of-day, day-of-week, month, season
Lag Features: Previous consumption/production values
Rolling Statistics: Moving averages, rolling standard deviations
Cyclical Encoding: Sine/cosine transformations for seasonal patterns
Energy-Specific: Peak/off-peak indicators, demand patterns
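
Cyclical encoding is the least obvious item in this list, so here is a minimal sketch of the general technique in standard-library Python. It is not Rassket's internal code, and the function name `cyclical_encode` is an assumption made for illustration. The idea is that hour 23 and hour 0 are neighbors in time, but a raw integer column puts them 23 units apart; mapping the value onto the unit circle restores that adjacency.

```python
import math


def cyclical_encode(value, period):
    """Map a periodic value (e.g. hour 0-23 with period 24) to (sin, cos)."""
    angle = 2 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


hour_23 = cyclical_encode(23, 24)
hour_0 = cyclical_encode(0, 24)
hour_12 = cyclical_encode(12, 24)

# Distance on the circle reflects closeness in time:
# 23:00 is close to 00:00, while 12:00 is as far from 00:00 as possible.
near = math.dist(hour_23, hour_0)
far = math.dist(hour_12, hour_0)
```

Both components are needed: sine alone cannot distinguish, for example, 03:00 from 09:00, because the two hours share the same sine value.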

Research Domain Features

Statistical Transformations: Log, square root, Box-Cox transformations
Interaction Terms: Domain-appropriate feature interactions
Research Metrics: Effect sizes, confidence intervals
Experimental Features: Treatment/control indicators, blocking variables
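
As a rough illustration of the statistical transformations listed above, the following sketch implements the standard one-parameter Box-Cox formula by hand. It assumes strictly positive inputs and a caller-supplied lambda; how Rassket chooses the transformation or its parameter is not stated here, and the helper name `box_cox` is hypothetical.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of a strictly positive value x with parameter lam."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive input")
    if lam == 0:
        return math.log(x)          # the limiting case as lam approaches 0
    return (x ** lam - 1) / lam


skewed = [1.0, 2.0, 4.0, 8.0, 64.0]            # synthetic right-skewed values
logged = [box_cox(x, 0) for x in skewed]       # lam = 0 is the log transform
rooted = [box_cox(x, 0.5) for x in skewed]     # lam = 0.5 equals 2 * (sqrt(x) - 1)
```

The log and square-root transformations are therefore special cases of the same family. In practice, lambda is usually estimated from the data rather than fixed by hand.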

2. Domain-Aware Explanations

When you select a domain, Rassket provides explanations that make sense in your context:

Energy: "Peak demand hours show the strongest correlation with consumption"
Research: "The treatment effect is statistically significant with p < 0.05"

These explanations help you understand not just what the model learned, but why it matters in your specific domain.

3. Improved Model Selection

Domain selection influences model recommendations:

Energy: Time-series models (LSTM, ARIMA) may be prioritized for forecasting
Research: Interpretable models (linear regression, logistic regression) may be prioritized for explainability

4. Better Evaluation Metrics

Domain selection steers evaluation toward metrics suited to the domain:

Energy: Focus on forecasting accuracy (RMSE, MAE) and peak prediction
Research: Emphasis on statistical significance and effect sizes
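
For readers who want to see what these metrics measure, here is a small standard-library sketch of MAE, RMSE, and one common effect size (Cohen's d with a pooled standard deviation). The function names and sample numbers are made up for illustration, and which effect-size measure Rassket reports is an assumption.

```python
import math
import statistics


def mae(actual, predicted):
    """Mean absolute error: the average size of the errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: penalizes large errors more heavily than MAE."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_t, n_c = len(treatment), len(control)
    pooled_var = (
        (n_t - 1) * statistics.variance(treatment)
        + (n_c - 1) * statistics.variance(control)
    ) / (n_t + n_c - 2)
    difference = statistics.mean(treatment) - statistics.mean(control)
    return difference / math.sqrt(pooled_var)


# Synthetic energy forecast: one large miss (160 vs 180) dominates RMSE.
actual = [100.0, 120.0, 140.0, 160.0]
predicted = [110.0, 120.0, 130.0, 180.0]
forecast_mae = mae(actual, predicted)
forecast_rmse = rmse(actual, predicted)

# Synthetic experiment: two small groups with equal spread.
effect = cohens_d([6.0, 7.0, 8.0], [4.0, 5.0, 6.0])
```

RMSE is never smaller than MAE on the same data, and the gap between them grows when a few errors are much larger than the rest, which is why both are useful when peak prediction matters.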

How Domain Choice Improves Feature Engineering

Example: Energy Consumption Forecasting

Without domain selection, Rassket might create only generic features. With Energy domain selection, Rassket:

Creates "hour_of_day" and "day_of_week" features from timestamps
Generates lag features (consumption_1h_ago, consumption_24h_ago)
Adds rolling averages (7-day average, 30-day average)
Identifies peak hours and creates binary indicators
Encodes seasonal patterns using cyclical transformations
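
The first four steps above can be sketched in standard-library Python as follows. This is an illustration of the general techniques, not Rassket's implementation: the function `build_features`, the peak window of 17:00 to 20:59, and the 24-hour rolling window are all assumptions, and the data is synthetic. The sketch also assumes hourly readings that are sorted by time and have no gaps.

```python
from datetime import datetime, timedelta
from statistics import mean

PEAK_HOURS = range(17, 21)  # assumption: 17:00-20:59 counts as peak


def build_features(timestamps, consumption, lags=(1, 24), window=24):
    """Return one feature dict per reading; missing history is marked None."""
    rows = []
    for i, ts in enumerate(timestamps):
        row = {
            "hour_of_day": ts.hour,
            "day_of_week": ts.weekday(),  # Monday = 0
            "is_peak_hour": int(ts.hour in PEAK_HOURS),
        }
        for lag in lags:
            # Value observed `lag` hours earlier, if that far back exists.
            row[f"consumption_{lag}h_ago"] = consumption[i - lag] if i >= lag else None
        # Rolling mean over the previous `window` readings. The current value
        # is excluded so the feature never leaks the target.
        row[f"rolling_mean_{window}h"] = (
            mean(consumption[i - window:i]) if i >= window else None
        )
        rows.append(row)
    return rows


start = datetime(2024, 1, 1)  # a Monday
timestamps = [start + timedelta(hours=h) for h in range(48)]
consumption = [float(h) for h in range(48)]  # synthetic, steadily rising load
features = build_features(timestamps, consumption)
```

Longer windows such as the 7-day and 30-day averages mentioned above correspond to `window=24 * 7` and `window=24 * 30` on hourly data. Early rows have no lag or rolling values; they are typically dropped or imputed before training.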

Example: Research Time Series

With Research domain selection, Rassket:

Applies statistical transformations appropriate for the data distribution
Creates interaction terms between treatment and control variables
Generates features for experimental design (blocking, stratification)
Applies domain-appropriate scaling and normalization
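
Two of these steps, scaling and interaction terms, can be illustrated with a short standard-library sketch. It reads "interaction terms" as a 0/1 treatment indicator multiplied by a standardized covariate, which is one common construction and an assumption about what Rassket does. The function names and data are hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Z-score scaling: subtract the mean, divide by the sample standard deviation."""
    mu, sigma = mean(values), stdev(values)
    return [(v - mu) / sigma for v in values]


def add_treatment_interaction(treated, covariate):
    """Multiply a 0/1 treatment indicator by the standardized covariate."""
    scaled = standardize(covariate)
    return [t * z for t, z in zip(treated, scaled)]


treated = [1, 0, 1, 0]            # 1 = treatment group, 0 = control group
age = [30.0, 40.0, 50.0, 60.0]    # synthetic covariate
age_scaled = standardize(age)
interaction = add_treatment_interaction(treated, age)
```

The interaction column is zero for every control-group row, so a model that includes it can estimate how the covariate's effect differs in the treatment group. Standardizing first keeps the interaction on a scale comparable to the other features.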

Optional Sub-Domains

Sub-domains provide even more refined feature engineering and explanations. They're optional but recommended when your use case closely matches a sub-domain.

When to Use Sub-Domains

Use when: Your problem closely matches a sub-domain description
Skip when: Your problem spans multiple sub-domains or doesn't fit neatly

Best Practice: If you're unsure, start with the base domain. You can always retrain with a sub-domain later if needed.

What Happens After Domain Selection

Once you select a domain (and optionally a sub-domain), Rassket:

Applies Domain-Specific Preprocessing: Feature engineering based on your domain
Stores Domain Context: Used throughout the workflow for explanations
Prepares Domain-Aware Models: Model selection considers domain context
Enables Domain Explanations: Results include domain-specific insights

Changing Domains

If you realize you selected the wrong domain after preprocessing:

You can upload a new file and select a different domain
Or continue with the current domain: models can still be accurate, just without domain-specific enhancements

Domain selection happens early in the workflow, but its effects are felt throughout model training, evaluation, and explanation generation.

Next Steps

After domain selection and preprocessing, proceed to Data Analysis to understand your problem type and get model recommendations.