Installation

Get the project set up on your machine in a few steps.

Prerequisites

Python 3.9+
Git
Conda (Miniconda or Anaconda)
13GB+ free disk space for the full dataset
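A quick way to confirm the disk-space prerequisite before downloading anything is sketched below. It is not part of the project: the function names are made up for illustration, and the default threshold of 13 GB is taken from the full-download size quoted in Dataset Setup, so adjust it if you use a smaller subset.

```python
import shutil
from pathlib import Path


def free_space_gb(path: str = ".") -> float:
    """Return free space, in GB (10**9 bytes), on the filesystem holding `path`."""
    return shutil.disk_usage(Path(path)).free / 1e9


def has_enough_space(path: str = ".", required_gb: float = 13.0) -> bool:
    """True if the filesystem holding `path` has at least `required_gb` GB free.

    The 13 GB default is an assumption based on the full UCF101 download size.
    """
    return free_space_gb(path) >= required_gb


if __name__ == "__main__":
    target = "."  # replace with the directory where the dataset will live
    print(f"Free space at {target!r}: {free_space_gb(target):.1f} GB")
    print("Enough space:", has_enough_space(target))
```

Point `target` at the dataset directory (for example the persistent storage mount) rather than the repository checkout if the two live on different disks.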

Setup Steps

1. Clone the Repository

git clone https://github.com/aclarke/echoes.git
cd echoes

2. Create Conda Environment

conda env create -f environment.yml
conda activate echoes

This installs all dependencies including PyTorch, torchvision, MLflow, TensorBoard, and project utilities.

3. Verify Installation

python -c "import torch; print(f'PyTorch version: {torch.__version__}')"
python -c "import torchvision; print(f'torchvision version: {torchvision.__version__}')"

Both commands should print version information without errors.
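To check the remaining dependencies in one pass, something like the following sketch may help. The module list is an assumption derived from the packages named above (using their usual import names); the authoritative list is whatever environment.yml declares.

```python
import importlib.util

# Import names for the packages mentioned above. This list is an assumption;
# adjust it to match the project's environment.yml.
REQUIRED_MODULES = ["torch", "torchvision", "mlflow", "tensorboard"]


def find_missing(modules):
    """Return the top-level module names that cannot be located in the active environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages found.")
```

Because `find_spec` only locates a module without importing it, this runs quickly, but it will not catch a package that is present yet broken at import time. The two commands above remain the definitive check for PyTorch and torchvision.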

Dataset Setup

Option 1: Download Full UCF101 (13GB)

For full benchmarking, download the complete UCF101 dataset:

python scripts/download_ucf101_full.py /mnt/echoes_data

This downloads the archive and extracts it to /mnt/echoes_data/ucf101/.

Option 2: Use Existing Dataset

If you already have the dataset, ensure it's in one of these locations:

/mnt/echoes_data/ucf101/ (persistent storage)
./data/ucf101/ (local directory)
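The lookup across these two locations could be sketched as follows. This is illustrative only: `find_dataset_dir` is a hypothetical helper, and the search order (persistent storage first, then the local directory) is an assumption rather than documented project behavior.

```python
from pathlib import Path
from typing import Iterable, Optional

# The two locations listed above. Checking persistent storage first is an
# assumption of this sketch, not something the project guarantees.
DEFAULT_LOCATIONS = (Path("/mnt/echoes_data/ucf101"), Path("data/ucf101"))


def find_dataset_dir(candidates: Iterable[Path] = DEFAULT_LOCATIONS) -> Optional[Path]:
    """Return the first candidate that is an existing, non-empty directory, else None."""
    for candidate in candidates:
        if candidate.is_dir() and any(candidate.iterdir()):
            return candidate
    return None


if __name__ == "__main__":
    found = find_dataset_dir()
    print(f"Dataset found at: {found}" if found else "No dataset found in the default locations.")
```

The relative path `data/ucf101` is resolved against the current working directory, so run the script from the repository root.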

Validate Dataset

python scripts/validate_dataset.py /mnt/echoes_data/ucf101

This verifies the dataset has all 101 classes and 13,320 videos.
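For a rough idea of what such a check involves, here is a minimal sketch. It is not the project's validate_dataset.py, and it assumes a layout of one subdirectory per class containing .avi clips; the real script may expect a different structure or perform additional checks.

```python
from pathlib import Path

EXPECTED_CLASSES = 101
EXPECTED_VIDEOS = 13_320


def count_dataset(root):
    """Count class directories and .avi files under `root`.

    Assumes one subdirectory per class, each holding .avi clips
    (a hypothetical layout chosen for this sketch).
    """
    class_dirs = [d for d in Path(root).iterdir() if d.is_dir()]
    n_videos = sum(
        1 for d in class_dirs for f in d.iterdir() if f.suffix.lower() == ".avi"
    )
    return len(class_dirs), n_videos


def is_complete(root) -> bool:
    """True if the counts match the expected 101 classes and 13,320 videos."""
    return count_dataset(root) == (EXPECTED_CLASSES, EXPECTED_VIDEOS)
```

Calling `count_dataset` on the dataset directory returns a `(classes, videos)` pair, which makes it easy to see how far an incomplete download falls short.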

Code Quality Tools

The project uses several tools to maintain code quality:

Linting and Formatting

# Check code style
ruff check .

# Auto-fix style issues
ruff check --fix .

# Format code
ruff format .

Run Tests

# Unit tests
pytest tests/

# With coverage
pytest --cov=data --cov=scripts tests/

# Integration tests (requires full dataset)
RUN_INTEGRATION_TESTS=1 pytest tests/test_integration.py

Pre-commit Hooks

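Install the pre-commit hooks so checks run automatically on every commit, then run them once against the whole codebase:

# Install the git hooks (one-time setup)
pre-commit install

# Run all hooks against every file
pre-commit run --all-files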

Development Workflow

1. Activate the conda environment: conda activate echoes
2. Make changes to code
3. Run linters: ruff check --fix . && ruff format .
4. Run tests: pytest tests/
5. Commit changes: git commit -m "Your message"

Troubleshooting

PyTorch GPU Not Detected

If you have a GPU but PyTorch isn't using it:

# Check GPU availability
python -c "import torch; print(torch.cuda.is_available())"

# Reinstall PyTorch for your system
conda install pytorch::pytorch pytorch::torchvision pytorch::torchaudio -c pytorch
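Before reinstalling, it can help to find out whether the installed PyTorch build supports CUDA at all. The sketch below gathers a few facts using standard PyTorch attributes; `cuda_report` is a hypothetical helper name, not something the project provides.

```python
def cuda_report():
    """Collect basic facts about the installed PyTorch build and visible GPUs."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in the active environment.
        return {"torch_installed": False}

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None means the build was compiled without CUDA support,
        # so it cannot use a GPU regardless of the drivers installed.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` is `None`, the environment has a CPU-only build and a reinstall is likely needed. If it shows a CUDA version but `cuda_available` is `False`, the GPU driver is a more likely culprit than the PyTorch package.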

Dataset Download Fails

# Check internet connection
ping github.com

# Try manual download at https://www.crcv.ucf.edu/datasets/ucf101/
# Then extract to /mnt/echoes_data/ucf101/

Conda Environment Issues

# Recreate environment from scratch
conda env remove --name echoes
conda env create -f environment.yml
conda activate echoes

Next Steps

Quick Start - Run your first experiment
Running Experiments - Advanced training options
Infrastructure - Understand the deployment setup