Domain Selection

Domain selection is a key differentiator of Rassket. By choosing a domain that matches your data, you enable specialized feature engineering and domain-aware explanations that improve both model accuracy and interpretability.

Energy vs Research

Rassket offers two primary domains, each optimized for different types of problems and data characteristics.

Energy Domain

Select Energy when your data relates to:

  • Energy consumption, production, or generation
  • Grid operations and demand management
  • Renewable energy systems
  • Energy markets and pricing
  • Emissions and sustainability metrics

Energy Sub-Domains

  • Forecasting: Energy forecasting, optimization, and energy market signals
  • Renewables: Renewable energy generation and integration
  • Grid: Grid operations and demand management
  • Energy Economics: Energy markets and pricing

Research Domain

Select Research when your data relates to:

  • Academic and scientific datasets
  • Experimental and observational studies
  • Econometric analysis
  • Statistical modeling
  • Time series analysis in research contexts

Research Sub-Domains

  • Forecasting: Time series forecasting within research contexts
  • Econometrics: Economic modeling and statistical analysis
  • Statistical Modeling: Advanced statistical methods
  • Experimental Analysis: Experimental design and observational studies

Sub-Domains are Optional: You can skip sub-domain selection and use the base domain. Sub-domains provide more refined explanations and feature engineering, but the base domain still provides significant benefits over generic AutoML.
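
As a rough illustration of how a domain and an optional sub-domain fit together, the sketch below encodes the lists from this page in a lookup table. The names SUB_DOMAINS and validate_selection are hypothetical and exist only for this example; they are not part of Rassket's API.

```python
from typing import Optional

# Domain and sub-domain names as listed on this page. The data structure
# and function are hypothetical illustrations, not Rassket's API.
SUB_DOMAINS = {
    "Energy": {"Forecasting", "Renewables", "Grid", "Energy Economics"},
    "Research": {
        "Forecasting",
        "Econometrics",
        "Statistical Modeling",
        "Experimental Analysis",
    },
}


def validate_selection(domain: str, sub_domain: Optional[str] = None) -> dict:
    """Check a domain / optional sub-domain pair and return it as a dict."""
    if domain not in SUB_DOMAINS:
        raise ValueError(f"Unknown domain: {domain!r}")
    # A sub-domain of None means "use the base domain only".
    if sub_domain is not None and sub_domain not in SUB_DOMAINS[domain]:
        raise ValueError(f"{sub_domain!r} is not a sub-domain of {domain!r}")
    return {"domain": domain, "sub_domain": sub_domain}
```

Calling validate_selection("Energy") returns a selection with no sub-domain, which mirrors the base-domain-only option described above.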

Why Domain Selection Matters

1. Specialized Feature Engineering

Each domain applies domain-specific feature engineering:

Energy Domain Features

  • Temporal Features: Hour-of-day, day-of-week, month, season
  • Lag Features: Previous consumption/production values
  • Rolling Statistics: Moving averages, rolling standard deviations
  • Cyclical Encoding: Sine/cosine transformations for seasonal patterns
  • Energy-Specific: Peak/off-peak indicators, demand patterns
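
To make the feature types above concrete, here is a minimal standard-library sketch of temporal, cyclical, and peak features derived from a single timestamp. The function name, the feature names, and the 17:00 to 20:59 peak window are assumptions chosen for illustration; Rassket's actual feature names and peak definition may differ.

```python
import math
from datetime import datetime

# Assumed peak window (17:00-20:59). Real peak hours depend on the grid
# or tariff and are not specified by this page.
PEAK_HOURS = range(17, 21)


def energy_time_features(ts: datetime) -> dict:
    """Derive temporal, cyclical, and peak features from one timestamp."""
    hour_angle = 2 * math.pi * ts.hour / 24
    month_angle = 2 * math.pi * (ts.month - 1) / 12
    return {
        "hour_of_day": ts.hour,
        "day_of_week": ts.weekday(),  # Monday = 0
        "month": ts.month,
        # Sine/cosine pairs place each value on a circle.
        "hour_sin": math.sin(hour_angle),
        "hour_cos": math.cos(hour_angle),
        "month_sin": math.sin(month_angle),
        "month_cos": math.cos(month_angle),
        "is_peak": int(ts.hour in PEAK_HOURS),
    }
```

The sine/cosine pair is what keeps 23:00 and 00:00 close together in feature space, which a plain integer hour column would not do.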

Research Domain Features

  • Statistical Transformations: Log, square root, Box-Cox transformations
  • Interaction Terms: Domain-appropriate feature interactions
  • Research Metrics: Effect sizes, confidence intervals
  • Experimental Features: Treatment/control indicators, blocking variables
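
The statistical transformations listed above are closely related: the Box-Cox family includes the log transform as a special case and a rescaled square root as another. The sketch below implements the standard one-parameter Box-Cox formula for a single value. It is a generic illustration, not Rassket's implementation, and how Rassket chooses the parameter is not described on this page.

```python
import math


def box_cox(x: float, lam: float) -> float:
    """Box-Cox transform of a strictly positive value.

    lam = 0 gives the natural log; lam = 0.5 gives 2 * (sqrt(x) - 1).
    """
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1) / lam
```

In practice the parameter is usually estimated from the data, for example by maximum likelihood, rather than fixed by hand.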

2. Domain-Aware Explanations

When you select a domain, Rassket provides explanations that make sense in your context:

  • Energy: "Peak demand hours show the strongest correlation with consumption"
  • Research: "The treatment effect is statistically significant with p < 0.05"

These explanations help you understand not just what the model learned, but why it matters in your specific domain.

3. Improved Model Selection

Domain selection influences model recommendations:

  • Energy: Time-series models (LSTM, ARIMA) may be prioritized for forecasting
  • Research: Interpretable models (linear regression, logistic regression) may be prioritized for explainability

4. Better Evaluation Metrics

Domain selection determines which evaluation metrics are emphasized:

  • Energy: Focus on forecasting accuracy (RMSE, MAE) and peak prediction
  • Research: Emphasis on statistical significance and effect sizes
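
For reference, the metrics named above can be computed as follows. This is a standard-library sketch using the textbook definitions of RMSE, MAE, and Cohen's d (with a pooled standard deviation) as one common effect size. The function names are illustrative, and Rassket may report these metrics differently.

```python
import math
from statistics import mean, variance


def rmse(actual, predicted):
    """Root mean squared error; penalizes large misses more than MAE."""
    return math.sqrt(mean([(a - p) ** 2 for a, p in zip(actual, predicted)]))


def mae(actual, predicted):
    """Mean absolute error, in the same units as the target."""
    return mean([abs(a - p) for a, p in zip(actual, predicted)])


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled sample variance."""
    n1, n2 = len(treatment), len(control)
    pooled_var = (
        (n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / math.sqrt(pooled_var)
```

Because RMSE squares each error, it reacts more strongly to large misses, which is one reason it is often reported alongside MAE when peak prediction matters.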

How Domain Choice Improves Feature Engineering

Example: Energy Consumption Forecasting

Without domain selection, Rassket might create only generic features. With Energy domain selection, Rassket:

  • Creates "hour_of_day" and "day_of_week" features from timestamps
  • Generates lag features (consumption_1h_ago, consumption_24h_ago)
  • Adds rolling averages (7-day average, 30-day average)
  • Identifies peak hours and creates binary indicators
  • Encodes seasonal patterns using cyclical transformations
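
The lag and rolling-average features in this example can be sketched in a few lines of plain Python. The helper names below are hypothetical, and the input is assumed to be an evenly spaced hourly series, so a lag of 1 corresponds to consumption_1h_ago and a lag of 24 to consumption_24h_ago.

```python
from typing import List, Optional


def lag(values: List[float], steps: int) -> List[Optional[float]]:
    """Value observed `steps` positions earlier; None where no history exists."""
    return [values[i - steps] if i >= steps else None for i in range(len(values))]


def rolling_mean(values: List[float], window: int) -> List[Optional[float]]:
    """Mean of the last `window` values, including the current one."""
    return [
        sum(values[i - window + 1 : i + 1]) / window if i >= window - 1 else None
        for i in range(len(values))
    ]


hourly = [10.0, 12.0, 11.0, 13.0, 15.0]
consumption_1h_ago = lag(hourly, 1)
rolling_3h = rolling_mean(hourly, 3)
```

When the rolling mean is used to predict the current value, it is usually computed on the lagged series instead, so that the target does not leak into its own feature.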

Example: Research Time Series

With Research domain selection, Rassket:

  • Applies statistical transformations appropriate for the data distribution
  • Creates interaction terms between treatment and control variables
  • Generates features for experimental design (blocking, stratification)
  • Applies domain-appropriate scaling and normalization
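
Two of the steps above, scaling and interaction terms, can be illustrated with the standard library alone. The sketch assumes a binary treatment indicator (1 = treated, 0 = control) and a single numeric control variable. The function names are hypothetical, and z-score standardization is just one common choice of scaling.

```python
from statistics import mean, stdev


def z_score(values):
    """Standardize to zero mean and unit sample standard deviation."""
    m, s = mean(values), stdev(values)
    if s == 0:
        raise ValueError("Cannot standardize a constant column")
    return [(v - m) / s for v in values]


def interaction(treated, covariate):
    """Element-wise product of a 0/1 treatment indicator and a covariate."""
    return [t * c for t, c in zip(treated, covariate)]


treated = [1, 0, 1]
age = [2.0, 3.0, 4.0]
treated_x_age = interaction(treated, age)
```

An interaction column like treated_x_age lets a linear model estimate a different slope for the covariate in the treated group than in the control group.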

Optional Sub-Domains

Sub-domains provide even more refined feature engineering and explanations. They're optional but recommended when your use case closely matches a sub-domain.

When to Use Sub-Domains

  • Use when: Your problem closely matches a sub-domain description
  • Skip when: Your problem spans multiple sub-domains or doesn't fit neatly

Best Practice: If you're unsure, start with the base domain. You can always retrain with a sub-domain later if needed. The base domain still provides significant benefits over generic AutoML.

What Happens After Domain Selection

Once you select a domain (and optionally a sub-domain), Rassket:

  1. Applies Domain-Specific Preprocessing: Feature engineering based on your domain
  2. Stores Domain Context: Used throughout the workflow for explanations
  3. Prepares Domain-Aware Models: Model selection considers domain context
  4. Enables Domain Explanations: Results include domain-specific insights

Changing Domains

If you realize you selected the wrong domain after preprocessing:

  • You can upload a new file and select a different domain
  • Or continue with the current domain. Models will still train and can still perform well, but they won't benefit from feature engineering and explanations tailored to your data's actual domain

Domain selection happens early in the workflow, but its effects are felt throughout model training, evaluation, and explanation generation.

Next Steps

After domain selection and preprocessing, proceed to Data Analysis to understand your problem type and get model recommendations.