Performance and Profiling

This section documents the performance characteristics and profiling results of the 5D Neural Network Interpolator across different dataset sizes.

Overview

A comprehensive performance benchmark was conducted to evaluate the neural network’s behavior with datasets of varying sizes: 1,000, 5,000, and 10,000 samples. The benchmark examines three key aspects:

  • Training Time: How training duration scales with dataset size

  • Memory Usage: Memory consumption during training and prediction phases

  • Accuracy Metrics: Model performance (MSE and R²) across different dataset sizes

Running the Benchmark

The performance benchmark script is located at backend/experiments/performance_benchmark.py.

Prerequisites:

pip install psutil

Run the benchmark:

cd backend
python experiments/performance_benchmark.py

The script will:

  • Generate synthetic datasets (1K, 5K, 10K samples)

  • Run all three performance experiments

  • Save results to backend/experiments/results/

  • Generate plots to backend/experiments/figures/

Experiment 1: Training Time vs Dataset Size

This experiment measures how training time scales with the number of samples in the dataset.

Method:

  • Neural networks were trained with identical architectures (2 hidden layers: 64, 32 neurons)

  • Training was performed for 100 epochs with a learning rate of 0.001

  • Training time was measured from start to completion
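The timing step described above can be sketched as a small wrapper around a monotonic clock. This is a minimal illustration, not the code in the benchmark script: `time_training` and `dummy_train` are hypothetical names, and `dummy_train` is a stand-in workload rather than the real training loop.

```python
import time


def time_training(train_fn, *args, **kwargs):
    """Run train_fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()  # monotonic, high-resolution clock
    result = train_fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def dummy_train(n_samples, epochs=100):
    """Hypothetical stand-in workload: visits every sample once per epoch."""
    total = 0.0
    for _ in range(epochs):
        for i in range(n_samples):
            total += i * 1e-6
    return total


if __name__ == "__main__":
    for n in (1_000, 5_000, 10_000):
        _, seconds = time_training(dummy_train, n)
        print(f"{n:>6} samples: {seconds:.3f} s")
```

`time.perf_counter` is used rather than `time.time` because it is not affected by system clock adjustments during the run.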

Results:

Training Time Results

Dataset Size      Training Time (s)    Epochs
1,000 samples     0.29                 100
5,000 samples     0.38                 100
10,000 samples    0.55                 100

Findings:

  • Training time increases with dataset size

  • The 5x increase from 1K to 5K samples results in a 1.3x increase in training time

  • The 10x increase from 1K to 10K samples results in a 1.9x increase in training time

  • Training time therefore grows sublinearly with dataset size over the tested range, which is consistent with efficient batch processing and good scalability
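The ratios quoted in the findings follow directly from the reported timings. The short check below recomputes them; the `timings` dictionary copies the reported values, and `scaling_ratios` is a hypothetical helper written for this illustration.

```python
# Training times in seconds, copied from the results reported above.
timings = {1_000: 0.29, 5_000: 0.38, 10_000: 0.55}


def scaling_ratios(timings, baseline_size):
    """Map each dataset size to (size ratio, time ratio) relative to the baseline."""
    base_time = timings[baseline_size]
    return {
        n: (n / baseline_size, t / base_time)
        for n, t in timings.items()
        if n != baseline_size
    }


if __name__ == "__main__":
    for n, (size_ratio, time_ratio) in scaling_ratios(timings, 1_000).items():
        print(f"{n:>6} samples: {size_ratio:.0f}x data, {time_ratio:.1f}x time")
```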

Visualisation:

The training time results are visualised in backend/experiments/figures/training_time_vs_dataset_size.png.

Experiment 2: Memory Usage Analysis

This experiment profiles memory consumption during both training and prediction phases.

Method:

  • Memory usage was measured using psutil

  • Baseline memory was recorded before each operation

  • Peak memory during training and prediction was captured

  • The memory change was calculated as peak memory minus baseline memory and recorded
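One way to capture a baseline and a peak with psutil is to sample the process's resident set size (RSS) from a background thread while the operation runs. The sketch below assumes that approach; the benchmark script may measure memory differently. `measure_memory_change` and its `read_rss` parameter are hypothetical names. The only psutil call used is `psutil.Process().memory_info().rss`.

```python
import threading


def _psutil_rss_bytes():
    """Resident set size of the current process in bytes, read via psutil."""
    import psutil  # imported lazily so a custom reader can be used without it
    return psutil.Process().memory_info().rss


def measure_memory_change(fn, read_rss=_psutil_rss_bytes, interval=0.01):
    """Run fn() and return (result, peak RSS minus baseline RSS, in MB)."""
    baseline = read_rss()
    peak = baseline
    stop = threading.Event()

    def sample():
        nonlocal peak
        # Poll RSS every `interval` seconds until asked to stop.
        while not stop.wait(interval):
            peak = max(peak, read_rss())

    sampler = threading.Thread(target=sample, daemon=True)
    sampler.start()
    try:
        result = fn()
    finally:
        stop.set()
        sampler.join()
    peak = max(peak, read_rss())  # one final reading after fn returns
    return result, (peak - baseline) / (1024 * 1024)
```

Sampling can miss short-lived allocations that occur between two readings, so a value obtained this way is a lower bound on the true peak.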

Results:

Memory Usage Results

Dataset Size      Training Memory (MB)    Prediction Memory (MB)
1,000 samples     ~2.1                    ~0.02
5,000 samples     ~2.7                    ~0.0
10,000 samples    ~0.5                    ~0.0

Findings:

  • The memory change during training stays below 3 MB for every dataset size (~2.1, ~2.7, and ~0.5 MB)

  • The prediction phase has minimal memory overhead (about 0.02 MB at most)

  • Memory usage does not scale linearly with dataset size: the 10K run recorded the smallest training change (~0.5 MB), which suggests efficient memory management

  • The neural network is memory efficient for both training and prediction
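A back-of-envelope estimate helps put these figures in context. The sketch below counts the parameters of a fully connected network with the documented hidden layers (64 and 32 neurons) and estimates the raw size of the largest dataset. It assumes 5 input features (from the "5D" in the interpolator's name), a single output value, and 32-bit floats; these are assumptions, not details confirmed by the benchmark. It also ignores gradients, optimizer state, and framework overhead.

```python
BYTES_PER_FLOAT32 = 4


def dense_param_count(layer_sizes):
    """Weights plus biases for a fully connected network with these layer widths."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


def dataset_mb(n_samples, n_columns=6):
    """Raw float32 size in MB; 6 columns = 5 inputs + 1 target (assumed)."""
    return n_samples * n_columns * BYTES_PER_FLOAT32 / (1024 * 1024)


# Assumed layout: 5 inputs -> 64 -> 32 -> 1 output.
layer_sizes = [5, 64, 32, 1]
n_params = dense_param_count(layer_sizes)
weights_kb = n_params * BYTES_PER_FLOAT32 / 1024

if __name__ == "__main__":
    print(f"parameters: {n_params}")
    print(f"weights: {weights_kb:.1f} KB")
    print(f"10K-sample dataset: {dataset_mb(10_000):.2f} MB")
```

Under these assumptions both the model weights and the largest dataset occupy well under 1 MB, which is in line with the small memory changes reported above.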

Visualisation:

Memory usage results are visualised in backend/experiments/figures/memory_usage_vs_dataset_size.png.

Experiment 3: Accuracy Metrics vs Dataset Size

This experiment evaluates model accuracy (MSE and R²) across different dataset sizes.

Method:

  • Neural networks were trained with identical architectures (2 hidden layers: 64, 32 neurons)

  • Evaluation was performed on the test set

  • MSE and R² scores were calculated
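For reference, both metrics can be computed from the test-set targets and predictions using their standard definitions: MSE is the mean of the squared residuals, and R² is 1 minus the ratio of the residual sum of squares to the total sum of squares. The functions below are a plain-Python sketch of those definitions with hypothetical names; the benchmark script may use a library implementation instead.

```python
def mean_squared_error(y_true, y_pred):
    """Mean of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy values for illustration only, not benchmark data.
    y_true = [1.0, 2.0, 3.0, 4.0]
    y_pred = [1.1, 1.9, 3.2, 3.8]
    print(f"MSE = {mean_squared_error(y_true, y_pred):.3f}")
    print(f"R^2 = {r2_score(y_true, y_pred):.3f}")
```

An R² of 1 means the predictions match the targets exactly, and an R² of 0 means the model does no better than always predicting the mean of the targets.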

Results:

Accuracy Metrics Results

Dataset Size      MSE      R² Score
1,000 samples     0.162    0.422
5,000 samples     0.129    0.543
10,000 samples    0.119    0.569

Findings:

  • MSE decreases as dataset size increases, indicating better model fit

  • R² score improves with larger datasets (from 0.42 to 0.57)

  • The improvement from 5K to 10K samples (R² +0.026) is smaller than the improvement from 1K to 5K samples (R² +0.121), which suggests diminishing returns as dataset size increases

  • As expected, larger datasets provide more training examples, leading to better generalisation
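The diminishing returns can be made concrete by normalising each R² gain by the number of samples added. The sketch below does this with the reported values; `marginal_r2_gains` is a hypothetical helper written for this illustration.

```python
# (dataset size, MSE, R²) rows copied from the results reported above.
results = [
    (1_000, 0.162, 0.422),
    (5_000, 0.129, 0.543),
    (10_000, 0.119, 0.569),
]


def marginal_r2_gains(rows):
    """R² gained per 1,000 additional samples between consecutive rows."""
    gains = []
    for (n0, _, r0), (n1, _, r1) in zip(rows, rows[1:]):
        gains.append((n0, n1, (r1 - r0) / ((n1 - n0) / 1000)))
    return gains


if __name__ == "__main__":
    for n0, n1, gain in marginal_r2_gains(results):
        print(f"{n0:>6} -> {n1:>6}: {gain:.4f} R² per 1,000 extra samples")
```

With the reported values, the gain per 1,000 extra samples falls from about 0.030 between 1K and 5K to about 0.005 between 5K and 10K.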

Key Observations:

  • With 1,000 samples: R² = 0.42 (moderate fit)

  • With 5,000 samples: R² = 0.54 (good fit)

  • With 10,000 samples: R² = 0.57 (good fit)

Visualisation:

Accuracy metrics are visualised in backend/experiments/figures/accuracy_metrics_vs_dataset_size.png.

Summary and Recommendations

Performance Characteristics:

  • Scalability: Training time grows sublinearly with dataset size (1.9x longer for 10x more samples)

  • Memory Efficiency: Low memory usage makes it suitable for regular computers and laptops

  • Accuracy: Model performance improves with larger datasets, but returns diminish beyond 5K samples

Recommendations:

  • For production use: Use 5,000+ samples for better accuracy

  • For maximum accuracy: Use 10,000+ samples, although gains are marginal beyond 5K samples

  • Memory constraints: The system is memory-efficient and handles datasets of up to 10K samples with minimal overhead, so it runs comfortably on personal computers and laptops

  • Training time: Even with 10K samples, 100 epochs of training complete in under 1 second, so training is not a bottleneck for the application

Summary of Benchmark Configuration

The benchmark was run with the following configuration:

  • Neural Network Architecture: 2 hidden layers (64, 32 neurons)

  • Training Parameters:

    • Learning rate: 0.001

    • Optimizer: Adam

    • Loss function: MSE

    • Epochs: 100

  • Data Split: 70% train, 15% validation, 15% test

  • Random Seed: 42 (for reproducibility)

  • Hardware: CPU-based training using PyTorch CPU backend

    • CPU: 1.8 GHz Dual-Core Intel Core i5

    • RAM: 8 GB

    • OS: macOS Monterey 12.7.6
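The 70/15/15 split with a fixed seed of 42 listed above could be reproduced along the following lines. This is a sketch under assumptions: `split_indices` is a hypothetical name, and it uses Python's `random` module, whereas the benchmark script may split the data with a different library and would then produce a different shuffle for the same seed.

```python
import random


def split_indices(n_samples, train_frac=0.70, val_frac=0.15, seed=42):
    """Shuffle sample indices reproducibly and cut them into train/val/test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # local RNG, global state untouched
    n_train = round(n_samples * train_frac)
    n_val = round(n_samples * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder, about 15% of the samples
    return train, val, test


if __name__ == "__main__":
    for n in (1_000, 5_000, 10_000):
        train, val, test = split_indices(n)
        print(f"{n:>6} samples -> {len(train)} / {len(val)} / {len(test)}")
```

Using a dedicated `random.Random(seed)` instance keeps the split repeatable regardless of other random calls made elsewhere in the program.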

The full benchmark script is available in the backend/experiments/performance_benchmark.py file.