FAQ & Concepts
Common questions and important concepts about Rassket, its capabilities, and best practices.
What Makes Rassket Different?
Domain-Aware Intelligence
Unlike generic AutoML tools, Rassket understands domain context. When you select Energy or Research, it applies specialized feature engineering and provides domain-appropriate explanations.
Decision Intelligence Focus
Rassket doesn't just build models—it provides insights and explanations that help you make decisions. The platform explains what matters, why it matters, and how to use predictions.
Production-Ready Outputs
Every model comes with export-ready packages, comprehensive reports, and deployment documentation. No additional work needed to move from prototype to production.
Accessibility
Rassket makes ML accessible to non-technical users while providing transparency and control for technical users. You don't need deep ML expertise to get value.
When to Use Sub-Domains
Use Sub-Domains When:
- Your problem closely matches a sub-domain description
- You want the most refined feature engineering
- You need domain-specific explanations
- Your use case is clearly defined
Skip Sub-Domains When:
- Your problem spans multiple sub-domains
- You're unsure which sub-domain fits
- You want to start quickly
- Your use case is exploratory
How Model Accuracy Is Evaluated
Evaluation Process
- Data Splitting: Data is split into train/validation/test sets
- Cross-Validation: Models are evaluated using k-fold cross-validation
- Test Set Evaluation: Final metrics are computed on held-out test set
- Multiple Metrics: Comprehensive metrics are calculated, not just one
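The split-then-validate flow above can be sketched with scikit-learn. This is a generic illustration of the pattern, not Rassket's internal code; the dataset, model, and fold count are placeholders.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic data standing in for an uploaded dataset.
X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)

# 1. Data Splitting: hold out a test set never seen during training or tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestRegressor(random_state=0)

# 2. Cross-Validation: k-fold evaluation on the training portion only.
cv_scores = cross_val_score(model, X_train, y_train, cv=5, scoring="r2")

# 3. Test Set Evaluation: final metric computed once on the held-out set.
model.fit(X_train, y_train)
test_r2 = model.score(X_test, y_test)
```

The key discipline the steps encode: the test set is touched exactly once, after all model selection is done, so the reported metric is an honest estimate.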
Metrics Used
- Regression: R², RMSE, MAE, MSE
- Classification: Accuracy, Precision, Recall, F1, ROC-AUC
- Additional: Cross-validation scores, confidence intervals
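For reference, the listed metrics can all be computed directly with scikit-learn; the toy values below are illustrative.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, f1_score, mean_absolute_error,
                             mean_squared_error, precision_score, r2_score,
                             recall_score, roc_auc_score)

# Regression metrics: R², RMSE, MAE, MSE.
y_true = np.array([3.0, 2.5, 4.0, 5.1])
y_pred = np.array([2.8, 2.7, 4.2, 5.0])
mse = mean_squared_error(y_true, y_pred)
rmse = mse ** 0.5
mae = mean_absolute_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)

# Classification metrics: Accuracy, Precision, Recall, F1, ROC-AUC.
yc_true = [0, 1, 1, 0, 1]
yc_prob = [0.2, 0.8, 0.6, 0.3, 0.9]          # predicted probabilities
yc_pred = [int(p >= 0.5) for p in yc_prob]   # thresholded labels
acc = accuracy_score(yc_true, yc_pred)
prec = precision_score(yc_true, yc_pred)
rec = recall_score(yc_true, yc_pred)
f1 = f1_score(yc_true, yc_pred)
auc = roc_auc_score(yc_true, yc_prob)        # ROC-AUC uses probabilities
```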
Model Comparison
When multiple models are trained:
- All models are evaluated on the same test set
- Metrics are compared side-by-side
- Best model is selected based on primary metric
- Statistical significance is considered
Reliability Assessment
Rassket provides:
- Model reliability scores
- Potential issues identification
- Overfitting detection
- Confidence intervals
Data Privacy and Handling
Data Storage
- Uploaded files are stored securely
- Data is processed according to your domain selection
- Models are stored with unique identifiers
- Raw data is not retained in exported packages
Data Processing
- Data is processed server-side
- Preprocessing pipelines are preserved
- Feature engineering is reproducible
- No data is shared with third parties
Export Security
- Model packages contain no raw data
- Only model files and preprocessing pipelines are exported
- Reports contain aggregated metrics only
- No sensitive data in exports
Common Questions
How long does training take?
Training time depends on:
- Dataset Size: Small datasets (<10K rows) take minutes; large datasets (>100K rows) can take 30+ minutes
- Number of Models: Training multiple models multiplies time
- Model Complexity: Complex models take longer
- Hyperparameter Tuning: More optimization trials increase time
Most training completes within 10-30 minutes. Keep the browser tab open during training.
What if my data doesn't fit Energy or Research?
You can still use Rassket! Select the closest domain, or use the base domain. Rassket can still build effective models. Domain selection enhances feature engineering and explanations, but models work for any tabular data.
Can I use Rassket without selecting a domain?
Domain selection is required for preprocessing, but you can choose the base domain (Energy or Research) without selecting a sub-domain. This still provides domain-aware benefits.
What file formats are supported?
Currently, Rassket supports CSV files. Ensure your CSV:
- Has headers in the first row
- Uses commas as delimiters
- Is UTF-8 encoded
- Is under 50MB
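A quick pre-upload check for the requirements above can be scripted. The 50MB limit and UTF-8 requirement come from this page; the function name and structure are our own sketch.

```python
import csv
import os

def check_csv(path, max_bytes=50 * 1024 * 1024):
    """Return a list of problems with a CSV file; empty list means OK."""
    issues = []
    if os.path.getsize(path) > max_bytes:
        issues.append("file exceeds 50MB")
    try:
        with open(path, encoding="utf-8") as f:
            header = next(csv.reader(f), None)
            # A single-column header often means the wrong delimiter was used.
            if not header:
                issues.append("missing header row")
            elif len(header) < 2:
                issues.append("only one column found (check that commas are the delimiter)")
    except UnicodeDecodeError:
        issues.append("file is not UTF-8 encoded")
    return issues
```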
Can I train multiple models?
Yes! During analysis, you can select multiple recommended models. Rassket will train all selected models and compare their performance, helping you find the best one.
How do I deploy models to production?
Export the model package (ZIP file). It contains:
- Trained model file
- Preprocessing pipeline
- Inference code
- Documentation
- Requirements file
Follow the included documentation to deploy in your environment.
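The general save-and-reload pattern looks like the round trip below, assuming a pickled scikit-learn pipeline. The file name and model are placeholders; the exported package's own documentation is authoritative for its actual layout and loading code.

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Train a preprocessing + model pipeline (stand-in for an exported artifact).
X, y = make_classification(n_samples=100, random_state=0)
pipe = make_pipeline(StandardScaler(), RandomForestClassifier(random_state=0))
pipe.fit(X, y)

# Persist it the way a package might, then reload for inference.
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
joblib.dump(pipe, path)

loaded = joblib.load(path)
preds = loaded.predict(X[:5])
```

Bundling preprocessing and model in one pipeline is what makes the export self-contained: new data goes through the exact transformations used at training time.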
What if training fails?
Common causes:
- Missing Target: Ensure target column is selected
- Insufficient Data: Very small datasets may fail
- Data Quality: Check for data issues
- Timeout: Large datasets may time out; try training on a sample first
Check error messages for specific issues. Most problems are data-related and can be fixed.
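For the timeout case, one workaround is to upload a random sample of the data. A pandas sketch (the DataFrame here stands in for your file; the 50K row count is illustrative):

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Stand-in for a large dataset; in practice you would load yours with
# pd.read_csv("your_large_file.csv").
df = pd.DataFrame({"feature": np.arange(200_000),
                   "target": np.random.rand(200_000)})

# Take a reproducible 50K-row random sample and write it for upload.
sample = df.sample(n=50_000, random_state=0)
out_path = os.path.join(tempfile.mkdtemp(), "sample_for_upload.csv")
sample.to_csv(out_path, index=False)
```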
Can I retrain models with different settings?
Yes! Upload your data again and select different options:
- Different domain or sub-domain
- Different models
- Different problem type
Compare results to find the best configuration.
Key Concepts
AutoML
Automated Machine Learning—the process of automating ML pipeline steps including data preprocessing, feature engineering, model selection, and hyperparameter tuning.
Domain Awareness
Understanding domain context (Energy vs. Research) to apply specialized feature engineering and provide domain-appropriate explanations.
Decision Intelligence
The combination of ML predictions with explanations and insights that help users make informed decisions based on model outputs.
Feature Engineering
The process of creating new features from raw data to improve model performance. Rassket automates this with domain-specific enhancements.
Hyperparameter Tuning
Optimizing model hyperparameters (settings that control model behavior) to improve performance. Rassket uses Optuna for automated optimization.
SHAP Values
SHapley Additive exPlanations—a unified measure of feature importance that explains individual predictions and overall model behavior.
Best Practices
Data Preparation
- Ensure CSV has headers
- Clean data before upload (remove obvious errors)
- Include relevant features
- Ensure target column is present (for training)
Domain Selection
- Choose domain that best matches your data
- Select sub-domain if your use case matches closely
- Don't worry if unsure—base domain still helps
Model Training
- Start with automatic model selection
- Train multiple models for comparison if needed
- Review metrics and diagnostics
- Export models you want to use
Evaluation
- Review multiple metrics, not just one
- Check feature importance
- Read AI insights for context
- Review visualizations for issues
Next Steps
Still have questions?
- Review the Dashboard Walkthrough for detailed workflow
- Check Getting Started for setup guidance
- Explore Use Cases for examples