Getting Started

This guide walks you through uploading your first dataset to Rassket and explains what happens during the initial data processing phase.

Uploading Your Data

Supported File Types

Rassket currently supports CSV (Comma-Separated Values) files. Your CSV file should:

  • Have headers in the first row
  • Use commas as delimiters
  • Be encoded in UTF-8
  • Have a maximum file size of 50MB

File Size Limit: Files larger than 50MB may take longer to process. For very large datasets, consider sampling or aggregating your data before upload.
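
As a rough illustration, the requirements above can be checked locally before uploading. This is a minimal sketch, not Rassket's own validator: `precheck_csv` and `MAX_BYTES` are hypothetical names, and treating 50MB as 50 × 1024 × 1024 bytes is an assumption.

```python
import csv
import io

# Assumption: "50MB" is interpreted as 50 * 1024 * 1024 bytes.
MAX_BYTES = 50 * 1024 * 1024


def precheck_csv(raw: bytes) -> list:
    """Return a list of problems found; an empty list means the basic checks passed."""
    problems = []
    if len(raw) > MAX_BYTES:
        problems.append("file is larger than 50MB")
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return problems + ["file is not valid UTF-8"]
    # csv.reader splits on commas by default.
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return problems + ["file is empty"]
    header = rows[0]
    if any(not name.strip() for name in header):
        problems.append("header row has an empty column name")
    if len(set(header)) != len(header):
        problems.append("header row has duplicate column names")
    width = len(header)
    # Report only the first record whose field count differs from the header.
    for record_no, row in enumerate(rows[1:], start=2):
        if row and len(row) != width:
            problems.append(f"row {record_no} has {len(row)} fields, expected {width}")
            break
    return problems
```

To try it, pass the raw bytes of a file, for example `precheck_csv(open("data.csv", "rb").read())`, where `data.csv` is a placeholder path.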

Upload Process

The upload interface provides two methods:

  1. Drag & Drop: Simply drag your CSV file into the upload area
  2. Click to Browse: Click the upload area to open your file browser

Once you select a file, Rassket will:

  • Validate the file format
  • Check file size
  • Upload the file to the server
  • Display basic file information (rows, columns, memory usage)

What Happens After Upload

1. File Validation

Rassket automatically validates your CSV file structure:

  • Structure Validation: Ensures the file is properly formatted CSV
  • Schema Detection: Identifies column names and data types
  • Quality Checks: Detects missing values, duplicates, and data inconsistencies

2. File Information Display

After successful upload, you'll see:

  • Rows: Total number of data rows
  • Columns: Total number of features/columns
  • Memory Usage: Approximate memory footprint
  • Column Names: Preview of your dataset columns

3. Domain Selection

After upload, you'll be prompted to select a domain. This is a critical step that enables domain-aware processing. See the Domain Selection guide for detailed information.

How Rassket Understands Your Dataset

Automatic Schema Detection

Rassket automatically analyzes your dataset to understand:

  • Data Types: Numeric vs. categorical columns
  • Missing Values: Which columns have missing data and how much
  • Distributions: Basic statistical properties of numeric columns
  • Cardinality: Number of unique values in categorical columns
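
To make these four properties concrete, here is a small standard-library sketch of a column profiler. It only illustrates the kind of summary described above and is not Rassket's implementation; `profile_columns` and the `MISSING_TOKENS` set are assumptions made for this example.

```python
import csv
import io
import statistics

# Assumption: these spellings count as "missing" in this sketch.
MISSING_TOKENS = {"", "na", "n/a", "nan", "null"}


def _is_number(value: str) -> bool:
    try:
        float(value)
        return True
    except ValueError:
        return False


def profile_columns(csv_text: str) -> dict:
    """Summarize each column: inferred type, missing count, and basic statistics."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    profile = {}
    for name in reader.fieldnames or []:
        values = [row[name] for row in rows]
        present = [
            v for v in values
            if v is not None and v.strip().lower() not in MISSING_TOKENS
        ]
        info = {"missing": len(values) - len(present)}
        if present and all(_is_number(v) for v in present):
            numbers = [float(v) for v in present]
            info["type"] = "numeric"
            info["mean"] = statistics.fmean(numbers)
            info["median"] = statistics.median(numbers)
        else:
            info["type"] = "categorical"
            info["cardinality"] = len(set(present))
        profile[name] = info
    return profile
```

A column is treated as numeric only when every non-missing value parses as a number; anything else falls back to categorical, which is a deliberately conservative choice for this sketch.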

Problem Type Detection

When you proceed to analysis, Rassket automatically detects which type of problem you are solving:

  • Regression: Predicting continuous numeric values
  • Binary Classification: Predicting one of two classes
  • Multi-class Classification: Predicting one of multiple classes

This detection happens based on your target column selection and the nature of the target variable.
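
One plausible way to classify a target column is sketched below. The rules and the `max_classes` threshold are assumptions chosen for illustration; the guide does not document the exact heuristic Rassket uses, and `detect_problem_type` is a hypothetical name.

```python
def detect_problem_type(target: list, max_classes: int = 20) -> str:
    """Guess the problem type from the values of the target column."""
    values = [v for v in target if v is not None]
    distinct = set(values)
    if len(distinct) < 2:
        raise ValueError("target needs at least two distinct values")
    if len(distinct) == 2:
        return "binary_classification"
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    if numeric:
        # Assumption: a numeric target with few distinct whole-number values
        # is a set of class labels; anything else is a continuous quantity.
        integer_like = all(float(v).is_integer() for v in values)
        if not (integer_like and len(distinct) <= max_classes):
            return "regression"
    return "multiclass_classification"
```

Under these rules, a target such as house prices is treated as regression, while a column of ratings from 1 to 5 is treated as multi-class classification.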

Feature Engineering Preparation

Once you select a domain, Rassket prepares domain-specific feature engineering:

  • Identifies time-series patterns (if applicable)
  • Detects potential feature interactions
  • Plans domain-appropriate transformations
  • Prepares preprocessing pipelines

Pro Tip: Even if you're unsure about domain selection, you can proceed with a generic domain. Rassket will still build accurate models, though domain-specific enhancements won't be applied.

Preprocessing Phase

After domain selection, Rassket automatically preprocesses your data:

1. Missing Value Handling

  • Numeric columns: Imputation using median or mean (domain-dependent)
  • Categorical columns: Mode imputation or "missing" category
  • Time-series: Forward-fill or interpolation for temporal data
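
The three strategies above can be sketched with the standard library as follows. This is an illustration only: the function names are hypothetical, missing values are assumed to be represented as `None`, and how Rassket chooses between strategies for a given domain is not shown.

```python
import statistics
from collections import Counter


def impute_numeric(values: list, strategy: str = "median") -> list:
    """Fill None entries with the median (default) or mean of the present values."""
    present = [v for v in values if v is not None]
    if strategy == "median":
        fill = statistics.median(present)
    else:
        fill = statistics.fmean(present)
    return [fill if v is None else v for v in values]


def impute_categorical(values: list, strategy: str = "mode") -> list:
    """Fill None entries with the most common value, or with a "missing" category."""
    if strategy == "mode":
        fill = Counter(v for v in values if v is not None).most_common(1)[0][0]
    else:
        fill = "missing"
    return [fill if v is None else v for v in values]


def forward_fill(values: list) -> list:
    """Carry the last observed value forward; leading gaps stay None."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled
```

Median imputation is less sensitive to outliers than the mean, which is one common reason to prefer it for skewed numeric columns.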

2. Duplicate Removal

Identical rows are detected and removed, which prevents data leakage (the same record appearing in both training and validation data) and improves model performance.
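
A minimal sketch of exact-duplicate removal, assuming each row is a list of cell values and that the first occurrence should be kept (`drop_duplicate_rows` is a hypothetical helper, not part of Rassket):

```python
def drop_duplicate_rows(rows: list) -> list:
    """Keep the first occurrence of each row and preserve the original order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # lists are not hashable, so compare rows as tuples
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```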

3. Domain-Aware Feature Engineering

Based on your selected domain, Rassket creates additional features:

  • Energy Domain: Time-of-day features, seasonal patterns, lag features for consumption patterns
  • Research Domain: Statistical transformations, interaction terms, research-specific aggregations
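
For the Energy domain, time-of-day and lag features might look like the sketch below. The feature names, the ISO 8601 timestamp format, and the `add_time_and_lag_features` helper are all assumptions for illustration; the features Rassket actually generates may differ.

```python
from datetime import datetime


def add_time_and_lag_features(readings: list, lag: int = 1) -> list:
    """Derive calendar and lag features from (iso_timestamp, value) pairs.

    Readings are assumed to be sorted by time and evenly spaced.
    """
    features = []
    for i, (stamp, value) in enumerate(readings):
        ts = datetime.fromisoformat(stamp)
        features.append({
            "value": value,
            "hour": ts.hour,              # time-of-day feature
            "day_of_week": ts.weekday(),  # Monday is 0
            "month": ts.month,            # coarse seasonal signal
            # Value observed `lag` steps earlier; None when there is no history yet.
            "lag": readings[i - lag][1] if i >= lag else None,
        })
    return features
```

Lag features use only earlier readings, so they do not leak future consumption into the row being predicted.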

4. Data Type Optimization

Columns are optimized for memory efficiency and model compatibility:

  • Numeric columns: Appropriate precision (float32 vs float64)
  • Categorical columns: Category encoding for memory efficiency
  • Date/time columns: Proper datetime parsing and feature extraction
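
The trade-off behind the first two bullets can be shown with the standard library alone. This sketch uses Python's `array` module as a stand-in for a real columnar store; the helper names and the `rel_tol` threshold are assumptions, and Rassket's own optimization rules are not documented here.

```python
import math
from array import array


def smallest_float_array(values: list, rel_tol: float = 1e-6) -> array:
    """Store values as 32-bit floats when that loses no meaningful precision."""
    try:
        narrow = array("f", values)  # "f" is a C float (typically 32-bit)
    except OverflowError:
        return array("d", values)
    if all(math.isclose(a, b, rel_tol=rel_tol) for a, b in zip(narrow, values)):
        return narrow
    return array("d", values)        # "d" is a C double (typically 64-bit)


def encode_categories(values: list) -> tuple:
    """Replace repeated strings with small integer codes plus a lookup list."""
    categories = sorted(set(values))
    index = {category: code for code, category in enumerate(categories)}
    return categories, [index[v] for v in values]
```

Halving the width of a numeric column halves its memory footprint, and integer codes avoid storing the same string once per row.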

Next Steps

Once preprocessing is complete, you're ready to:

  1. Proceed to Data Analysis to understand your problem type
  2. Select models for training (or let Rassket choose automatically)
  3. Train your models and evaluate results

Continue to the Dashboard Walkthrough for a complete step-by-step guide through the entire workflow.