Performance and Profiling
==========================

This section documents the performance characteristics and profiling results of the 5D Neural Network Interpolator across different dataset sizes.

Overview
--------

A performance benchmark was run on datasets of 1,000, 5,000, and 10,000 samples to evaluate how the neural network behaves as the dataset grows. The benchmark examines three key aspects:

- **Training Time**: How training duration scales with dataset size

- **Memory Usage**: Memory consumption during training and prediction phases

- **Accuracy Metrics**: Model performance (MSE and R²) across different dataset sizes

Running the Benchmark
----------------------

The performance benchmark script is located at ``backend/experiments/performance_benchmark.py``.

**Prerequisites:**

.. code-block:: bash

   pip install psutil

**Run the benchmark:**

.. code-block:: bash

   cd backend
   python experiments/performance_benchmark.py

The script will:

- Generate synthetic datasets (1K, 5K, 10K samples)

- Run all three performance experiments

- Save results to ``backend/experiments/results/``

- Generate plots to ``backend/experiments/figures/``

Experiment 1: Training Time vs Dataset Size
--------------------------------------------

This experiment measures how training time scales with the number of samples in the dataset.

**Method:**

- Neural networks were trained with identical architectures (2 hidden layers: 64, 32 neurons)

- Training was performed for 100 epochs with a learning rate of 0.001

- Training time was measured from start to completion
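
The timing step can be sketched as follows. This is a minimal sketch rather than the benchmark script's actual code: ``time_training`` and ``dummy_train`` are hypothetical names, and ``dummy_train`` only stands in for the real training loop.

.. code-block:: python

   import time


   def time_training(train_fn, *args, **kwargs):
       """Return (result, elapsed_seconds) for one call to train_fn."""
       start = time.perf_counter()  # monotonic, high-resolution clock
       result = train_fn(*args, **kwargs)
       elapsed = time.perf_counter() - start
       return result, elapsed


   def dummy_train(n_samples, epochs=100):
       """Hypothetical stand-in for the real training loop."""
       total = 0.0
       for _ in range(epochs):
           for i in range(n_samples):
               total += i * 1e-6
       return total


   if __name__ == "__main__":
       for n in (1_000, 5_000, 10_000):
           _, seconds = time_training(dummy_train, n)
           print(f"{n:>6} samples: {seconds:.3f} s")

``time.perf_counter`` is used here because it is not affected by system clock adjustments, which makes it a reasonable default for short measurements like these.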

**Results:**

Training Time Results
^^^^^^^^^^^^^^^^^^^^^

+---------------+----------------------+----------+
| Dataset Size  | Training Time (s)    | Epochs   |
+===============+======================+==========+
| 1,000 samples | 0.29                 | 100      |
+---------------+----------------------+----------+
| 5,000 samples | 0.38                 | 100      |
+---------------+----------------------+----------+
| 10,000 samples| 0.55                 | 100      |
+---------------+----------------------+----------+

**Findings:**

- Training time increases with dataset size

- The 5x increase from 1K to 5K samples results in a 1.3x increase in training time

- The 10x increase from 1K to 10K samples results in a 1.9x increase in training time

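- Training time therefore grows much more slowly than dataset size over this range. This is consistent with efficient batch processing, although at these small sizes fixed per-run overhead likely accounts for a large share of the measured time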
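
The ratios quoted above follow directly from the results table. The short sketch below reproduces the arithmetic; the helper name ``scaling_ratios`` is illustrative and not part of the benchmark script.

.. code-block:: python

   # Training times in seconds, copied from the results table.
   training_time_s = {1_000: 0.29, 5_000: 0.38, 10_000: 0.55}


   def scaling_ratios(results, baseline_size):
       """Map each size to (data ratio, time ratio) relative to the baseline."""
       base_time = results[baseline_size]
       return {
           size: (size / baseline_size, t / base_time)
           for size, t in results.items()
           if size != baseline_size
       }


   ratios = scaling_ratios(training_time_s, 1_000)
   for size, (data_ratio, time_ratio) in ratios.items():
       print(f"{data_ratio:.0f}x data -> {time_ratio:.1f}x time")
   # 5x data -> 1.3x time
   # 10x data -> 1.9x time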

**Visualisation:**

The training time results are visualised in ``backend/experiments/figures/training_time_vs_dataset_size.png``.

Experiment 2: Memory Usage Analysis
-----------------------------------

This experiment profiles memory consumption during both training and prediction phases.

**Method:**

- Memory usage was measured using psutil

- Baseline memory was recorded before each operation

- Peak memory during training and prediction was captured

- The memory change was calculated and recorded as peak memory minus baseline memory
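
A baseline-and-difference measurement of this kind might look like the sketch below. It is an assumption about the approach, not the script's actual code: ``current_rss_mb`` and ``memory_change_mb`` are hypothetical names, and the sketch reads resident memory once after the operation instead of tracking a true peak, which would require sampling while the operation runs.

.. code-block:: python

   import os


   def current_rss_mb():
       """Resident set size of this process in MB, read via psutil."""
       import psutil  # third-party; installed in the prerequisites step

       return psutil.Process(os.getpid()).memory_info().rss / (1024 * 1024)


   def memory_change_mb(operation, read_memory_mb=current_rss_mb):
       """Run operation() and return (result, memory after - memory before)."""
       baseline = read_memory_mb()
       result = operation()
       after = read_memory_mb()
       return result, after - baseline

For example, ``memory_change_mb(lambda: [0.0] * 1_000_000)`` returns the list together with the change in resident memory in MB. The reader function is passed as a parameter so that a different measurement source can be swapped in.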

**Results:**

Memory Usage Results
^^^^^^^^^^^^^^^^^^^^

+---------------+------------------------+------------------------+
| Dataset Size  | Training Memory (MB)   | Prediction Memory (MB) |
+===============+========================+========================+
| 1,000 samples | ~0.10                  | ~0.02                  |
+---------------+------------------------+------------------------+
| 5,000 samples | ~2.20                  | ~0.0                   |
+---------------+------------------------+------------------------+
| 10,000 samples| ~3.41                  | ~0.0                   |
+---------------+------------------------+------------------------+

**Findings:**

Training memory grows faster than dataset size over the measured range. When the dataset increases 10x (from 1,000 to 10,000 samples), training memory increases by approximately 34x (from ~0.10 MB to ~3.41 MB). The growth is uneven: memory rises about 22x between 1,000 and 5,000 samples, but only about 1.6x between 5,000 and 10,000 samples. The 1,000-sample reading is very small, so part of the large ratio may reflect measurement noise or allocator behaviour rather than true scaling.

The prediction phase uses negligible memory (at most ~0.02 MB) at every dataset size.

- Training Complexity: Three data points are too few to establish a complexity class. Between 5,000 and 10,000 samples the growth is sublinear (about 1.55x memory for 2x data), but the 1,000-sample point does not fit the same trend.

- Prediction Complexity: Effectively O(1) memory-wise, as the memory usage remains constant and minimal regardless of the training set size used.

- Low Memory Requirements: Even at 10,000 samples, the training memory increase remains under 4 MB, suggesting this is a lightweight model suitable for resource-constrained environments.

- Overall, absolute memory use is small enough that memory is unlikely to be a bottleneck for datasets of this size. More dataset sizes would be needed to characterise scaling beyond 10,000 samples.
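
One way to check how memory scales is to assume memory grows roughly as ``n ** k`` and estimate the exponent ``k`` between pairs of measurements. The sketch below does this for the training memory column; ``scaling_exponent`` is an illustrative helper, and the power-law form is an assumption made only for this check.

.. code-block:: python

   import math

   # Training memory in MB, copied from the results table.
   training_memory_mb = {1_000: 0.10, 5_000: 2.20, 10_000: 3.41}


   def scaling_exponent(n1, m1, n2, m2):
       """Exponent k such that m2 / m1 == (n2 / n1) ** k."""
       return math.log(m2 / m1) / math.log(n2 / n1)


   sizes = sorted(training_memory_mb)
   exponents = {}
   for small, large in zip(sizes, sizes[1:]):
       k = scaling_exponent(
           small, training_memory_mb[small], large, training_memory_mb[large]
       )
       exponents[(small, large)] = k
       print(f"{small} -> {large}: k = {k:.2f}")

   overall = scaling_exponent(1_000, 0.10, 10_000, 3.41)
   print(f"1000 -> 10000: k = {overall:.2f}")

An exponent of 1 would mean linear growth. Here the estimate is roughly 1.9 between 1,000 and 5,000 samples, roughly 0.6 between 5,000 and 10,000 samples, and roughly 1.5 across the full range, so the three points do not follow a single power law.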

**Visualisation:**

Memory usage results are visualised in ``backend/experiments/figures/memory_usage_vs_dataset_size.png``.

Experiment 3: Accuracy Metrics vs Dataset Size
----------------------------------------------

This experiment evaluates model accuracy (MSE and R²) across different dataset sizes.

**Method:**

- Neural networks were trained with identical architectures (2 hidden layers: 64, 32 neurons)

- Evaluation was performed on the test set

- MSE and R² scores were calculated
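
For reference, both metrics can be computed from the true and predicted values as in the sketch below. It uses the standard definitions in plain Python; whether the benchmark script computes them this way or through a library is not stated here, so the function names are illustrative.

.. code-block:: python

   def mean_squared_error(y_true, y_pred):
       """Average of squared differences between targets and predictions."""
       return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


   def r2_score(y_true, y_pred):
       """Coefficient of determination: 1 - SS_res / SS_tot."""
       mean_true = sum(y_true) / len(y_true)
       ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
       ss_tot = sum((t - mean_true) ** 2 for t in y_true)
       return 1.0 - ss_res / ss_tot


   # Small made-up example, not benchmark data.
   y_true = [1.0, 2.0, 3.0, 4.0]
   y_pred = [1.5, 2.0, 2.5, 4.0]
   print(mean_squared_error(y_true, y_pred))  # 0.125
   print(r2_score(y_true, y_pred))            # 0.9

An R² of 1 means the predictions match the targets exactly, and an R² of 0 means the model does no better than always predicting the mean of the targets.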

**Results:**

Accuracy Metrics Results
^^^^^^^^^^^^^^^^^^^^^^^^

+---------------+--------+----------+
| Dataset Size  | MSE    | R² Score |
+===============+========+==========+
| 1,000 samples | 0.162  | 0.422    |
+---------------+--------+----------+
| 5,000 samples | 0.129  | 0.543    |
+---------------+--------+----------+
| 10,000 samples| 0.119  | 0.569    |
+---------------+--------+----------+

**Findings:**

- MSE decreases as dataset size increases, indicating better model fit

- R² score improves with larger datasets (from 0.42 to 0.57)

- The improvement from 5K to 10K samples is smaller than the improvement from 1K to 5K, which suggests diminishing returns as the dataset grows

- As expected, larger datasets provide more training examples, which leads to better generalisation

**Key Observations:**

- With 1,000 samples: R² = 0.42 (the model explains about 42% of the variance in the test targets)

- With 5,000 samples: R² = 0.54 (about 54% of the variance)

- With 10,000 samples: R² = 0.57 (about 57% of the variance)
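
The diminishing returns can be made concrete by dividing each R² improvement by the number of samples added. The sketch below is illustrative arithmetic on the results table; ``gain_per_thousand`` is a hypothetical helper name.

.. code-block:: python

   # R² scores copied from the results table.
   r2_by_size = {1_000: 0.422, 5_000: 0.543, 10_000: 0.569}


   def gain_per_thousand(results):
       """R² gained per 1,000 added samples between consecutive sizes."""
       sizes = sorted(results)
       gains = {}
       for small, large in zip(sizes, sizes[1:]):
           delta = results[large] - results[small]
           gains[(small, large)] = delta / ((large - small) / 1_000)
       return gains


   gains = gain_per_thousand(r2_by_size)
   for (small, large), gain in gains.items():
       print(f"{small} -> {large}: {gain:+.4f} R² per 1,000 samples")

Between 1,000 and 5,000 samples each additional 1,000 samples adds about 0.030 to R², while between 5,000 and 10,000 samples the gain drops to about 0.005.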

**Visualisation:**

Accuracy metrics are visualised in ``backend/experiments/figures/accuracy_metrics_vs_dataset_size.png``.

Summary and Recommendations
----------------------------

**Performance Characteristics:**

- **Scalability**: Training time grows much more slowly than dataset size over the tested range (about 1.9x the time for 10x the data)

- **Memory Efficiency**: Low memory usage (under 4 MB of additional memory during training) makes it suitable for ordinary desktop computers and laptops

- **Accuracy**: Model performance improves with larger datasets, but returns diminish beyond 5K samples

**Recommendations:**

- **For production use**: Use 5,000+ samples for better accuracy

- **For maximum accuracy**: Use 10,000+ samples, although gains are marginal beyond 5K samples

- **Memory constraints**: The system is memory-efficient and handles datasets of up to 10K samples with minimal overhead, so it runs comfortably on personal computers and laptops.

- **Training Time**: Even with 10K samples, 100 epochs of training complete in under 1 second, so training time is not a bottleneck for the application.


Summary of Benchmark Configuration
-----------------------------------

The benchmark was run with the following configuration:

- **Neural Network Architecture**: 2 hidden layers (64, 32 neurons)

- **Training Parameters**:

  - Learning rate: 0.001
  - Optimizer: Adam
  - Loss function: MSE
  - Epochs: 100

- **Data Split**: 70% train, 15% validation, 15% test

- **Random Seed**: 42 (for reproducibility)

- **Hardware**: CPU-based training using PyTorch CPU backend

  - CPU: 1.8 GHz Dual-Core Intel Core i5
  - RAM: 8 GB
  - OS: macOS Monterey 12.7.6
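
The 70% / 15% / 15% split with a fixed seed listed above can be sketched as follows. This is one plausible way to produce such a split, not necessarily how the benchmark script does it; ``split_indices`` is a hypothetical name.

.. code-block:: python

   import random


   def split_indices(n_samples, train_frac=0.70, val_frac=0.15, seed=42):
       """Shuffle indices reproducibly and split into train/val/test lists."""
       indices = list(range(n_samples))
       random.Random(seed).shuffle(indices)  # local RNG, global state untouched
       n_train = round(n_samples * train_frac)
       n_val = round(n_samples * val_frac)
       train = indices[:n_train]
       val = indices[n_train:n_train + n_val]
       test = indices[n_train + n_val:]  # remainder goes to the test set
       return train, val, test


   train_idx, val_idx, test_idx = split_indices(1_000)
   print(len(train_idx), len(val_idx), len(test_idx))  # 700 150 150

Using a dedicated ``random.Random(seed)`` instance keeps the split reproducible without depending on the global random state.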

The full benchmark script is available in the ``backend/experiments/performance_benchmark.py`` file.