Performance and Profiling¶
This section documents the performance characteristics and profiling results of the 5D Neural Network Interpolator across different dataset sizes.
Overview¶
A comprehensive performance benchmark was conducted to evaluate the neural network’s behavior with datasets of varying sizes: 1,000, 5,000, and 10,000 samples. The benchmark examines three key aspects:
Training Time: How training duration scales with dataset size
Memory Usage: Memory consumption during training and prediction phases
Accuracy Metrics: Model performance (MSE and R²) across different dataset sizes
Running the Benchmark¶
The performance benchmark script is located at backend/experiments/performance_benchmark.py.
Prerequisites:
pip install psutil
Run the benchmark:
cd backend
python experiments/performance_benchmark.py
The script will:
Generate synthetic datasets (1K, 5K, 10K samples)
Run all three performance experiments
Save results to backend/experiments/results/
Generate plots to backend/experiments/figures/
Experiment 1: Training Time vs Dataset Size¶
This experiment measures how training time scales with the number of samples in the dataset.
Method:
Neural networks were trained with identical architectures (2 hidden layers: 64, 32 neurons)
Training was performed for 100 epochs with a learning rate of 0.001
Training time was measured from start to completion
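The timing step can be sketched as follows. This is a minimal illustration, not the benchmark script's actual code: `time_training_runs` and `dummy_train` are hypothetical names, and `dummy_train` is a placeholder workload standing in for a real 100-epoch training run.

```python
import time
from typing import Callable, Dict, Iterable


def time_training_runs(train_fn: Callable[[int], object],
                       dataset_sizes: Iterable[int]) -> Dict[int, float]:
    """Return the wall-clock seconds taken by train_fn(n) for each dataset size n."""
    timings: Dict[int, float] = {}
    for n in dataset_sizes:
        start = time.perf_counter()  # monotonic, high-resolution clock
        train_fn(n)                  # hypothetical: one full training run on n samples
        timings[n] = time.perf_counter() - start
    return timings


def dummy_train(n_samples: int) -> None:
    """Placeholder workload so the sketch runs without a model (assumption)."""
    total = 0.0
    for i in range(n_samples * 10):
        total += i * 0.5


timings = time_training_runs(dummy_train, [1_000, 5_000, 10_000])
for n, seconds in timings.items():
    print(f"{n:>6} samples: {seconds:.4f} s")
```

`time.perf_counter` is used rather than `time.time` because it is monotonic and intended for measuring short durations.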
Results:
Training Time Results¶
| Dataset Size | Training Time (s) | Epochs |
|---|---|---|
| 1,000 samples | 0.29 | 100 |
| 5,000 samples | 0.38 | 100 |
| 10,000 samples | 0.55 | 100 |
Findings:
Training time increases with dataset size
The 5x increase from 1K to 5K samples results in a 1.3x increase in training time
The 10x increase from 1K to 10K samples results in a 1.9x increase in training time
Training time therefore grows sub-linearly with dataset size over this range, which suggests efficient batch processing and good scalability
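The ratios quoted in the findings can be reproduced from the results table with a few lines of arithmetic. The sketch below also computes an empirical scaling exponent k, assuming time grows roughly as n^k. That exponent is an added illustration and not part of the original benchmark; a value below 1 means sub-linear growth over the measured range only.

```python
import math

# Measured training times in seconds, copied from the results table.
training_time_s = {1_000: 0.29, 5_000: 0.38, 10_000: 0.55}


def scaling_summary(times, base_n):
    """Compare each dataset size against the base size.

    Returns a list of (n, size_ratio, time_ratio, exponent) tuples, where
    exponent is k in time ~ n**k between the base size and n.
    """
    rows = []
    for n in sorted(times):
        if n == base_n:
            continue
        size_ratio = n / base_n
        time_ratio = times[n] / times[base_n]
        exponent = math.log(time_ratio) / math.log(size_ratio)
        rows.append((n, size_ratio, time_ratio, exponent))
    return rows


summary = scaling_summary(training_time_s, base_n=1_000)
for n, size_ratio, time_ratio, exponent in summary:
    print(f"{n:>6} samples: {size_ratio:.0f}x data -> "
          f"{time_ratio:.1f}x time (exponent {exponent:.2f})")
```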
Visualisation:
The training time results are visualised in backend/experiments/figures/training_time_vs_dataset_size.png.
Experiment 2: Memory Usage Analysis¶
This experiment profiles memory consumption during both training and prediction phases.
Method:
Memory usage was measured using psutil
Baseline memory was recorded before each operation
Peak memory during training and prediction was captured
Memory change was calculated as peak memory minus baseline memory and recorded
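One way to implement the baseline and peak measurement is to sample memory in a background thread while the operation runs. The sketch below is an assumption about how this could be done, not the benchmark's actual code: `measure_memory_change` is a hypothetical helper, and the memory reader is injected so the example runs without extra packages. With psutil installed, the reader could return the process resident set size (RSS), as noted in the comments.

```python
import threading
from typing import Callable


def measure_memory_change(operation: Callable[[], object],
                          read_memory_mb: Callable[[], float],
                          interval_s: float = 0.01) -> float:
    """Return peak-minus-baseline memory (MB) observed while `operation` runs."""
    baseline = read_memory_mb()
    peak = baseline
    stop = threading.Event()

    def sample() -> None:
        nonlocal peak
        while not stop.is_set():
            peak = max(peak, read_memory_mb())
            stop.wait(interval_s)  # sleep, but wake early once stop is set

    sampler = threading.Thread(target=sample, daemon=True)
    sampler.start()
    try:
        operation()
    finally:
        stop.set()
        sampler.join()
    peak = max(peak, read_memory_mb())  # one final reading after the operation
    return peak - baseline


# Demo with a simulated reader (assumption) so the sketch is self-contained.
# With psutil, a real reader could be:
#   process = psutil.Process()
#   read_memory_mb = lambda: process.memory_info().rss / (1024 ** 2)
simulated = {"mb": 100.0}


def fake_training() -> None:
    simulated["mb"] = 103.0  # pretend the operation allocated 3 MB


change = measure_memory_change(fake_training, lambda: simulated["mb"])
print(f"Memory change: {change:.1f} MB")
```

Sampling can miss short-lived spikes between readings, so a smaller `interval_s` gives a more faithful peak at the cost of more overhead.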
Results:
Memory Usage Results¶
| Dataset Size | Training Memory (MB) | Prediction Memory (MB) |
|---|---|---|
| 1,000 samples | ~2.1 | ~0.02 |
| 5,000 samples | ~2.7 | ~0.0 |
| 10,000 samples | ~0.5 | ~0.0 |
Findings:
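Memory change during training stays small at every dataset size (roughly 0.5 to 2.7 MB)
Prediction phase has minimal memory overhead (about 0.02 MB or less)
Memory usage does not grow in proportion to dataset size: the 10K run recorded a smaller change (~0.5 MB) than the 1K run (~2.1 MB), so differences of this size are likely dominated by measurement noise rather than by the data itself
The neural network is memory-efficient for both training and prediction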
Visualisation:
Memory usage results are visualised in backend/experiments/figures/memory_usage_vs_dataset_size.png.
Experiment 3: Accuracy Metrics vs Dataset Size¶
This experiment evaluates model accuracy (MSE and R²) across different dataset sizes.
Method:
Neural networks were trained with identical architectures (2 hidden layers: 64, 32 neurons)
Evaluation was performed on the test set
MSE and R² scores were calculated
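For reference, the two metrics follow the standard definitions: MSE is the mean of the squared residuals, and R² is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below is a dependency-free illustration with a hypothetical helper name, `mse_and_r2`; the benchmark script may well use a library routine instead.

```python
from typing import Sequence, Tuple


def mse_and_r2(y_true: Sequence[float],
               y_pred: Sequence[float]) -> Tuple[float, float]:
    """Return (MSE, R^2) for paired targets and predictions."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    # Residual sum of squares: how far predictions are from the targets.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))

    # Total sum of squares: how far targets are from their own mean.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all targets are identical")

    return ss_res / n, 1.0 - ss_res / ss_tot


mse, r2 = mse_and_r2([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
print(f"MSE = {mse:.3f}, R^2 = {r2:.3f}")
```

An R² of 1 means perfect predictions, while 0 means the model does no better than always predicting the mean of the targets.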
Results:
Accuracy Metrics Results¶
| Dataset Size | MSE | R² Score |
|---|---|---|
| 1,000 samples | 0.162 | 0.422 |
| 5,000 samples | 0.129 | 0.543 |
| 10,000 samples | 0.119 | 0.569 |
Findings:
MSE decreases as dataset size increases, indicating better model fit
R² score improves with larger datasets (from 0.42 to 0.57)
The improvement from 5K to 10K samples is smaller than from 1K to 5K, suggesting diminishing returns as dataset size increases
As expected, larger datasets provide more training examples, which leads to better generalisation
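The diminishing returns can be made concrete by computing the R² gain per additional 1,000 samples between consecutive rows of the results table. This is a small added calculation on the reported numbers, not an extra experiment.

```python
# R² scores copied from the results table.
r2_by_size = {1_000: 0.422, 5_000: 0.543, 10_000: 0.569}

sizes = sorted(r2_by_size)
gains = []  # (smaller size, larger size, R² gain, gain per 1,000 added samples)
for small, large in zip(sizes, sizes[1:]):
    gain = r2_by_size[large] - r2_by_size[small]
    per_1k = gain / ((large - small) / 1_000)
    gains.append((small, large, gain, per_1k))

for small, large, gain, per_1k in gains:
    print(f"{small:>6} -> {large:>6}: R² +{gain:.3f} "
          f"({per_1k:.4f} per 1,000 added samples)")
```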
Key Observations:
With 1,000 samples: R² = 0.42 (moderate fit)
With 5,000 samples: R² = 0.54 (good fit)
With 10,000 samples: R² = 0.57 (good fit)
Visualisation:
Accuracy metrics are visualised in backend/experiments/figures/accuracy_metrics_vs_dataset_size.png.
Summary and Recommendations¶
Performance Characteristics:
Scalability: Training time grows sub-linearly with dataset size (about 1.9x longer for 10x more samples)
Memory Efficiency: Low memory usage makes it suitable for regular computers and laptops
Accuracy: Model performance improves with larger datasets, but returns diminish beyond 5K samples
Recommendations:
For production use: Use 5,000+ samples for better accuracy
For maximum accuracy: Use 10,000+ samples, although gains are marginal beyond 5K samples
Memory constraints: The system is memory-efficient and handles datasets of up to 10K samples with minimal overhead, so it runs comfortably on personal computers and laptops.
Training Time: Even with 10K samples, 100 epochs of training complete in under 1 second, so training time is not a bottleneck for the application.
Summary of Benchmark Configuration¶
The benchmark was run with the following configuration:
Neural Network Architecture: 2 hidden layers (64, 32 neurons)
Training Parameters:
Learning rate: 0.001
Optimizer: Adam
Loss function: MSE
Epochs: 100
Data Split: 70% train, 15% validation, 15% test
Random Seed: 42 (for reproducibility)
Hardware: CPU-based training using PyTorch CPU backend
CPU: 1.8 GHz Dual-Core Intel Core i5
RAM: 8 GB
OS: macOS Monterey 12.7.6
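The configuration above could translate into PyTorch roughly as follows. This is a sketch under several assumptions that the configuration does not state: ReLU activations, a single output value, full-batch updates, and a synthetic placeholder target. `build_model` and `train` are hypothetical names, and the input width of 5 is inferred from the "5D" in the interpolator's name.

```python
import torch
from torch import nn


def build_model(n_inputs: int = 5) -> nn.Sequential:
    """Two hidden layers (64, 32 neurons); ReLU and one output are assumptions."""
    return nn.Sequential(
        nn.Linear(n_inputs, 64),
        nn.ReLU(),
        nn.Linear(64, 32),
        nn.ReLU(),
        nn.Linear(32, 1),
    )


def train(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
          epochs: int = 100, lr: float = 0.001) -> list:
    """Train with Adam and MSE loss; returns the loss recorded at each epoch."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    losses = []
    model.train()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)  # full-batch update (assumption)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses


torch.manual_seed(42)                # seed from the configuration
x = torch.rand(1000, 5)              # placeholder inputs, not the real dataset
y = x.sum(dim=1, keepdim=True)       # placeholder target function (assumption)
model = build_model()
losses = train(model, x, y)
print(f"final training loss: {losses[-1]:.4f}")
```

With these layer sizes the model has 2,497 trainable parameters, which is consistent with the short training times reported on a CPU.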
The full benchmark script is available in the backend/experiments/performance_benchmark.py file.
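The 70/15/15 data split and fixed seed from the configuration can be sketched with the standard library alone. `split_indices` is a hypothetical helper; the benchmark script may implement the split differently, for example with a library utility.

```python
import random
from typing import List, Tuple


def split_indices(n_samples: int, seed: int = 42,
                  train_frac: float = 0.70, val_frac: float = 0.15
                  ) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle sample indices reproducibly and split into train/val/test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # local RNG, global state untouched

    n_train = round(n_samples * train_frac)
    n_val = round(n_samples * val_frac)

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]      # remainder goes to the test set
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1_000)
print(len(train_idx), len(val_idx), len(test_idx))
```

Using a dedicated `random.Random(seed)` instance keeps the split reproducible without affecting other code that relies on the global random state.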