Getting Started

This guide walks you through uploading your first dataset to Rassket and explains what happens during the initial data processing phase.

Uploading Your Data

Supported File Types

Rassket currently supports CSV (Comma-Separated Values) files. Your CSV file should:

Have headers in the first row
Use commas as delimiters
Be encoded in UTF-8
Have a maximum file size of 50MB

File Size Limit: Uploads are limited to 50MB, and files near that limit may take longer to process. For larger datasets, consider sampling or aggregating your data before upload.
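If you want to check a file against these requirements before uploading, the sketch below shows one way to do it locally with Python's standard library. It is not Rassket's own validator: the function name check_csv_requirements, the error messages, and the interpretation of 50MB as 50 * 1024 * 1024 bytes are all assumptions made for illustration.

```python
import csv
import io

# 50MB limit from the requirements above; treating 1MB as 1024 * 1024 bytes is an assumption.
MAX_BYTES = 50 * 1024 * 1024


def check_csv_requirements(raw: bytes) -> list:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if len(raw) > MAX_BYTES:
        problems.append("file is larger than 50MB")
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Without a valid decoding the remaining checks cannot run.
        return problems + ["file is not valid UTF-8"]
    # Only the first row is parsed, so this stays cheap for large files.
    header = next(csv.reader(io.StringIO(text, newline="")), None)
    if not header or any(not name.strip() for name in header):
        problems.append("first row is missing or contains an empty header")
    elif len(header) == 1 and any(sep in header[0] for sep in ";\t|"):
        # A single column containing another common separator suggests a non-comma delimiter.
        problems.append("first row does not appear to be comma-delimited")
    return problems
```

To use it, read the file in binary mode (for example, open(path, "rb").read()) and pass the bytes to the function. The delimiter check is a heuristic and can be wrong for files that really do have a single column.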

Upload Process

The upload interface provides two methods:

Drag & Drop: Simply drag your CSV file into the upload area
Click to Browse: Click the upload area to open your file browser

Once you select a file, Rassket will:

Validate the file format
Check file size
Upload the file to the server
Display basic file information (rows, columns, memory usage)

What Happens After Upload

1. File Validation

Rassket automatically validates your CSV file structure:

Structure Validation: Ensures the file is properly formatted CSV
Schema Detection: Identifies column names and data types
Quality Checks: Detects missing values, duplicates, and data inconsistencies
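To make the quality checks concrete, here is a minimal sketch of how missing values, duplicate rows, and rows with the wrong number of fields could be counted. Rassket's actual checks are not documented here, so the function name quality_report, the set of tokens treated as missing, and the shape of the returned report are assumptions.

```python
import csv
import io
from collections import Counter

# Tokens treated as "missing" in this sketch (an assumption, not Rassket's list).
MISSING_TOKENS = {"", "na", "n/a", "nan", "null"}


def quality_report(csv_text: str) -> dict:
    """Count missing values per column, duplicate rows, and ragged rows."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    header = next(reader)
    rows = [row for row in reader if row]  # csv.reader yields [] for blank lines
    missing = {name: 0 for name in header}
    ragged = 0
    for row in rows:
        if len(row) != len(header):
            ragged += 1  # wrong field count: a structural inconsistency
            continue
        for name, value in zip(header, row):
            if value.strip().lower() in MISSING_TOKENS:
                missing[name] += 1
    # Every occurrence of a row beyond the first counts as one duplicate.
    counts = Counter(tuple(row) for row in rows)
    duplicates = sum(n - 1 for n in counts.values())
    return {"missing": missing, "duplicate_rows": duplicates, "ragged_rows": ragged}
```

Running a report like this on your own data before upload can help you anticipate what the validation step will flag.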

2. File Information Display

After successful upload, you'll see:

Rows: Total number of data rows
Columns: Total number of features/columns
Memory Usage: Approximate memory footprint
Column Names: Preview of your dataset columns
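The same kind of summary can be approximated locally. The sketch below is illustrative only: file_summary is a hypothetical helper, and it estimates memory as the UTF-8 byte length of the cell values, which is an assumption and will not match the figure Rassket reports.

```python
import csv
import io


def file_summary(csv_text: str) -> dict:
    """Summarize a CSV: row count, column count, column names, rough size."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    header = next(reader)
    n_rows = 0
    n_bytes = 0
    for row in reader:
        if not row:
            continue  # skip blank lines
        n_rows += 1
        # Rough size estimate: bytes needed to store each cell as UTF-8 text.
        n_bytes += sum(len(cell.encode("utf-8")) for cell in row)
    return {
        "rows": n_rows,
        "columns": len(header),
        "column_names": header,
        "approx_bytes": n_bytes,
    }
```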

3. Domain Selection

After upload, you'll be prompted to select a domain. This is a critical step that enables domain-aware processing. See the Domain Selection guide for detailed information.

How Rassket Understands Your Dataset

Automatic Schema Detection

Rassket automatically analyzes your dataset to understand:

Data Types: Numeric vs. categorical columns
Missing Values: Which columns have missing data and how much
Distributions: Basic statistical properties of numeric columns
Cardinality: Number of unique values in categorical columns
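As a rough illustration of what such a column profile involves, the sketch below classifies one column as numeric or categorical and reports missing counts, basic statistics, or cardinality. The function name profile_column, the missing-value tokens, and the rule "numeric if every present value parses as a float" are assumptions, not Rassket's documented behavior.

```python
import statistics

# Tokens treated as missing in this sketch (an assumption).
MISSING = {"", "na", "nan", "null"}


def _to_float(value: str):
    """Return the value as a float, or None if it does not parse."""
    try:
        return float(value)
    except ValueError:
        return None


def profile_column(values: list) -> dict:
    """Profile one column given its raw string values."""
    present = [v.strip() for v in values if v.strip().lower() not in MISSING]
    numbers = [_to_float(v) for v in present]
    profile = {"missing": len(values) - len(present)}
    if present and all(n is not None for n in numbers):
        # Every present value parsed as a number: report basic distribution stats.
        profile.update(
            kind="numeric",
            mean=statistics.fmean(numbers),
            median=statistics.median(numbers),
        )
    else:
        # Otherwise treat the column as categorical and count distinct values.
        profile.update(kind="categorical", cardinality=len(set(present)))
    return profile
```

For example, profile_column(["red", "blue", "red", "NA"]) reports one missing value and a cardinality of 2.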

Problem Type Detection

When you proceed to analysis, Rassket will automatically detect:

Regression: Predicting continuous numeric values
Binary Classification: Predicting one of two classes
Multi-class Classification: Predicting one of multiple classes

Detection is based on the target column you select and the values that column contains.
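One common heuristic for this kind of detection is sketched below. It is an assumption about how such a rule might look, not Rassket's actual logic: the function name detect_problem_type and the max_classes threshold of 20 are hypothetical.

```python
def _is_number(value: str) -> bool:
    try:
        float(value)
        return True
    except ValueError:
        return False


def detect_problem_type(target: list, max_classes: int = 20) -> str:
    """Guess the problem type from the raw string values of the target column."""
    distinct = {v.strip() for v in target if v.strip()}
    if len(distinct) < 2:
        raise ValueError("target needs at least two distinct values")
    if len(distinct) == 2:
        return "binary_classification"
    if all(_is_number(v) for v in distinct):
        # Fractional values, or many distinct integers, suggest a continuous target.
        has_fraction = any(not float(v).is_integer() for v in distinct)
        if has_fraction or len(distinct) > max_classes:
            return "regression"
    return "multiclass_classification"
```

A heuristic like this can misjudge edge cases, such as an integer rating from 1 to 5 that you intend to treat as a regression target, so it is worth confirming the detected type before training.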

Feature Engineering Preparation

Once you select a domain, Rassket prepares domain-specific feature engineering:

Identifies time-series patterns (if applicable)
Detects potential feature interactions
Plans domain-appropriate transformations
Prepares preprocessing pipelines

Pro Tip: If you're unsure which domain to select, you can proceed with a generic domain. Rassket will still build models, but domain-specific enhancements won't be applied.

Preprocessing Phase

After domain selection, Rassket automatically preprocesses your data:

1. Missing Value Handling

Numeric columns: Imputation using median or mean (domain-dependent)
Categorical columns: Mode imputation or "missing" category
Time-series: Forward-fill or interpolation for temporal data
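The three strategies above can be sketched in a few lines each. This is a generic illustration using None to mark a missing value. The function names are hypothetical, and how Rassket chooses between the options for a given domain is not shown.

```python
import statistics


def impute_median(values: list) -> list:
    """Numeric columns: replace missing entries with the median of the rest."""
    median = statistics.median([v for v in values if v is not None])
    return [median if v is None else v for v in values]


def impute_mode(values: list, use_missing_category: bool = False) -> list:
    """Categorical columns: fill with the most common value or a "missing" label."""
    if use_missing_category:
        return ["missing" if v is None else v for v in values]
    mode = statistics.mode([v for v in values if v is not None])
    return [mode if v is None else v for v in values]


def forward_fill(values: list) -> list:
    """Time-ordered data: carry the last observed value forward."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)  # leading gaps stay None: there is nothing to carry yet
    return filled
```

Median imputation is less sensitive to outliers than mean imputation, which is one reason it is often preferred for skewed numeric columns.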

2. Duplicate Removal

Identical rows are detected and removed. This prevents data leakage (the same row appearing in both training and evaluation data) and improves model performance.
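A minimal sketch of exact-duplicate removal, assuming each row is a list of cell values and that the first occurrence should be kept in its original position. The function name drop_duplicate_rows is hypothetical.

```python
def drop_duplicate_rows(rows: list) -> list:
    """Keep the first occurrence of each row and preserve the original order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # lists are not hashable, so compare rows as tuples
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```

Only rows that match in every column are treated as duplicates here. Near-duplicates, such as rows that differ only in whitespace or letter case, would need normalization first.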

3. Domain-Aware Feature Engineering

Based on your selected domain, Rassket creates additional features:

Energy Domain: Time-of-day features, seasonal patterns, lag features for consumption patterns
Research Domain: Statistical transformations, interaction terms, research-specific aggregations
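To illustrate the Energy Domain features, the sketch below derives time-of-day, weekend, season, and lag features from timestamped consumption readings. The function name energy_features, the feature names, and the Northern Hemisphere season mapping are assumptions chosen for the example. The features Rassket actually generates may differ.

```python
from datetime import datetime

# Meteorological seasons for the Northern Hemisphere (an assumption).
SEASONS = ("winter", "spring", "summer", "autumn")


def energy_features(readings: list, lag: int = 1) -> list:
    """Build features from (ISO timestamp, consumption) pairs in time order."""
    features = []
    for i, (stamp, _consumption) in enumerate(readings):
        ts = datetime.fromisoformat(stamp)
        features.append({
            "hour": ts.hour,                          # time-of-day feature
            "is_weekend": ts.weekday() >= 5,          # Saturday is 5, Sunday is 6
            "season": SEASONS[ts.month % 12 // 3],    # December to February maps to winter
            # Lag feature: consumption `lag` steps earlier, None when unavailable.
            f"lag_{lag}": readings[i - lag][1] if i >= lag else None,
        })
    return features
```

Lag features only use readings that come earlier in time, so they do not expose future values to the model.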

4. Data Type Optimization

Columns are optimized for memory efficiency and model compatibility:

Numeric columns: Appropriate precision (float32 vs float64)
Categorical columns: Category encoding for memory efficiency
Date/time columns: Proper datetime parsing and feature extraction
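The memory trade-offs behind the first two optimizations can be shown with Python's array module. This is a sketch of the general idea, not Rassket's implementation: the function names, the rel_tol threshold, and the use of unsigned integer codes are assumptions.

```python
import math
from array import array


def downcast_if_safe(values: list, rel_tol: float = 1e-6) -> array:
    """Store floats as 32-bit when every value survives the conversion closely enough."""
    as_f32 = array("f", values)
    if all(math.isclose(a, b, rel_tol=rel_tol) for a, b in zip(as_f32, values)):
        return as_f32  # roughly half the memory of 64-bit storage
    # Values outside the float32 range (for example, extremely small ones) keep 64 bits.
    return array("d", values)


def encode_categories(values: list):
    """Replace repeated strings with small integer codes plus a lookup table."""
    code_by_label = {}
    codes = array("I")  # unsigned integer codes
    for v in values:
        # Assign the next free code the first time a label is seen.
        codes.append(code_by_label.setdefault(v, len(code_by_label)))
    return list(code_by_label), codes
```

With category encoding, each distinct string is stored once in the lookup table, and the column itself holds only integers. That is where the memory saving comes from when a column has many rows but few unique values.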

Next Steps

Once preprocessing is complete, you're ready to:

Proceed to Data Analysis to understand your problem type
Select models for training (or let Rassket choose automatically)
Train your models and evaluate results

Continue to the Dashboard Walkthrough for a complete step-by-step guide through the entire workflow.