
Building a Production ML Workspace: Part 1 - Designing an Organized Structure


When you start working with GPU infrastructure for machine learning—whether it's a DGX workstation, cloud GPUs, or local hardware—one of the first challenges you face isn't technical complexity. It's organizational chaos.

You'll be juggling Ollama models, fine-tuning experiments, agent prototypes, datasets, training checkpoints, and Jupyter notebooks. Without a clear structure from day one, you'll waste precious time searching for files, recreating experiments, and wondering "which model was that again?"

This article shows you how to design a workspace structure that scales from your first experiment to hundreds of concurrent projects, with a specific focus on Ollama integration and LLM development workflows.

About This Series

This is Part 1 of a 5-part series on building production ML workspaces on GPU infrastructure. Each part stands alone but builds toward a complete system:

  • Part 1: Workspace Structure (you are here)
  • Part 2: Documentation Systems
  • Part 3: Experiment Tracking
  • Part 4: Agent Templates
  • Part 5: Ollama Integration

Why Organization Matters for ML Work

Before diving into the structure, let's understand why this matters:

The Problem: ML projects generate a lot of artifacts:

  • Downloaded models (5-100+ GB each)
  • Training checkpoints (multiple per experiment)
  • Datasets in various processing stages
  • Experimental code and notebooks
  • Results, logs, and visualizations
  • Custom models and configurations

The Cost: Without organization:

  • 30+ minutes per day searching for files
  • Lost experiment context (what were we testing?)
  • Duplicate work (did we already try this?)
  • Difficulty reproducing results
  • Hard to track what actually works

The Solution: A well-designed directory structure that:

  • Makes file locations obvious
  • Separates concerns cleanly
  • Scales to hundreds of projects
  • Supports your workflow naturally
  • Reduces cognitive load

Core Design Principles

Our workspace structure follows five key principles:

1. Separation of Concerns

Different types of work live in different directories. Ollama models don't mix with fine-tuning experiments. Agents don't mix with datasets. This makes searching faster and reduces cross-contamination.

2. Tool-Specific Organization

Ollama has unique workflows (Modelfiles, pull, create) that deserve dedicated space. Generic ML models go elsewhere. This reflects how you actually work with these tools.

3. Lifecycle Management

Experiments move through states: active → completed → archived. Your directory structure should reflect these lifecycles, keeping your active work focused.

4. Timestamped Naming

Use YYYYMMDD-HHMMSS-descriptive-name for experiments. This provides chronological sorting, unique names, and temporal context at a glance.

5. Discoverability

Names should be intuitive. If you need a dataset, look in datasets/. If you need an Ollama model, look in ollama-models/. No guessing.


The Complete Workspace Structure

Here's the full structure we'll build. Don't worry—we'll break down each section:

(local path)
├── ollama-models/          # Ollama-specific work
│   ├── custom/            # Custom models you created
│   ├── downloaded/        # Models pulled from registry
│   └── modelfiles/        # Modelfile version control
│
├── model-tuning/          # Fine-tuning workspace
│   ├── datasets/         # Training datasets
│   ├── configs/          # Training configurations
│   ├── checkpoints/      # Training checkpoints
│   ├── fine-tuned-models/ # Final outputs
│   └── evaluation/       # Evaluation results
│
├── agents/               # AI agent development
│   ├── prototypes/      # Experimental agents
│   ├── production/      # Production-ready agents
│   ├── libraries/       # Shared components
│   └── tools/           # Agent-specific tools
│
├── datasets/            # Data management
│   ├── raw/            # Original data
│   ├── processed/      # Cleaned data
│   └── annotations/    # Labeled data
│
├── experiments/         # Experiment tracking
│   ├── active/         # Running experiments
│   ├── completed/      # Finished experiments
│   └── archived/       # Historical reference
│
├── notebooks/          # Jupyter development
│   ├── exploration/   # Data exploration
│   ├── analysis/      # Results analysis
│   └── demos/         # Demonstrations
│
├── scripts/           # Automation
│   ├── utilities/    # Helper scripts
│   ├── automation/   # Workflow automation
│   └── deployment/   # Deployment scripts
│
├── results/          # Experiment outputs
├── logs/             # Comprehensive logging
│   ├── experiments/
│   ├── training/
│   └── agents/
│
└── templates/        # Project templates
    ├── experiment/
    └── agent/

Breaking Down Each Directory

Ollama Models Directory

ollama-models/
├── custom/            # Models you create from Modelfiles
├── downloaded/        # Models pulled from Ollama registry
└── modelfiles/        # Version-controlled Modelfiles

Why separate from other models?

  • Ollama has specific workflows (pull, create, run)
  • Modelfiles need version control
  • Custom models need different tracking than downloads
  • Easy to see what you've created vs. downloaded

Example usage:

# Pull a model
ollama pull llama3.1

# Create custom model
cd ollama-models/modelfiles/my-assistant/
ollama create my-assistant:v1 -f v1-Modelfile

# Record what you pulled under downloaded/ and what you created under custom/
# (Ollama keeps the model weights themselves in its own model store)
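
As a rough sketch of what the versioned Modelfile in that example could contain: the snippet below writes a v1-Modelfile into modelfiles/my-assistant/. FROM, PARAMETER, and SYSTEM are standard Modelfile instructions, but the temperature value, the system prompt, and the WORKSPACE variable are illustrative assumptions, not settings from this series.

```shell
set -eu

# Hypothetical workspace root; defaults to a throwaway temp directory.
WORKSPACE="${WORKSPACE:-$(mktemp -d)}"
MODEL_DIR="$WORKSPACE/ollama-models/modelfiles/my-assistant"
mkdir -p "$MODEL_DIR"

# Write a versioned Modelfile. The values below are placeholders to adapt.
cat > "$MODEL_DIR/v1-Modelfile" <<'EOF'
FROM llama3.1
PARAMETER temperature 0.7
SYSTEM """You are a concise assistant for internal ML tooling questions."""
EOF

echo "Wrote $MODEL_DIR/v1-Modelfile"
echo "Next: ollama create my-assistant:v1 -f $MODEL_DIR/v1-Modelfile"
```

Keeping each revision as its own file (v1-Modelfile, v2-Modelfile, and so on) makes it easy to diff prompt or parameter changes between model versions.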

Naming convention for custom models:

base-model-purpose-version
Examples:
  llama3-customer-support-v1
  mistral-code-reviewer-v2
  phi3-summarizer-v1

Model Tuning Directory

model-tuning/
├── datasets/         # Training data
├── configs/          # Training configurations
├── checkpoints/      # Model checkpoints during training
├── fine-tuned-models/ # Final trained models
└── evaluation/       # Evaluation metrics and reports

Complete workflow:

  1. Prepare dataset → datasets/
  2. Create config → configs/
  3. Train model → checkpoints saved to checkpoints/
  4. Evaluate → results in evaluation/
  5. Export final → fine-tuned-models/
  6. Import to Ollama → create Modelfile, add to ollama-models/custom/

Why this structure:

  • Clear pipeline from data to deployed model
  • Checkpoints separate from final models
  • Configs version-controlled
  • Evaluation results easily comparable

Agents Directory

agents/
├── prototypes/      # Experimental, rapidly changing
├── production/      # Stable, tested, documented
├── libraries/       # Shared code between agents
└── tools/           # Agent-specific tools

Agent organization pattern:

agents/production/customer-support-agent/
├── README.md              # Full documentation
├── agent.py              # Main code
├── config.yaml           # Configuration
├── requirements.txt      # Dependencies
├── prompts/              # Prompt templates
├── tools/                # Agent tools
├── tests/                # Tests
└── logs/                 # Execution logs
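
The layout above can be scaffolded with a small helper. This is a minimal sketch: the new_agent function and the WORKSPACE variable are hypothetical names introduced here, and the files it creates are empty placeholders.

```shell
set -eu

# Hypothetical workspace root; defaults to a throwaway temp directory.
WORKSPACE="${WORKSPACE:-$(mktemp -d)}"

# new_agent STAGE NAME: create the agent skeleton under agents/STAGE/NAME
# and print the resulting path.
new_agent() {
  stage="$1"; name="$2"
  dir="$WORKSPACE/agents/$stage/$name"
  for sub in prompts tools tests logs; do
    mkdir -p "$dir/$sub"
  done
  # Create empty placeholder files without clobbering existing ones.
  for f in README.md agent.py config.yaml requirements.txt; do
    [ -e "$dir/$f" ] || : > "$dir/$f"
  done
  echo "$dir"
}

AGENT_DIR="$(new_agent production customer-support-agent)"
echo "Scaffolded $AGENT_DIR"
```

The same helper works for prototypes by passing prototypes as the first argument.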

Prototype vs. Production:

  • Prototypes: Fast iteration, minimal docs, breaking changes OK
  • Production: Stable API, comprehensive docs, tests, versioning

Experiments Directory

experiments/
├── active/         # Currently running
├── completed/      # Done but recent (< 1 month)
└── archived/       # Historical (> 1 month old)

Experiment naming:

YYYYMMDD-HHMMSS-descriptive-name

Examples:
  20241019-143000-llama-sentiment-analysis
  20241019-155500-phi3-code-generation-benchmark
  20241020-090000-custom-model-evaluation
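
One way to generate names in this format, sketched under the assumption that descriptions should be lowercased and hyphen-separated (experiment_name is a hypothetical helper, not part of any tool):

```shell
set -eu

# experiment_name DESCRIPTION: print YYYYMMDD-HHMMSS-descriptive-name.
experiment_name() {
  # Lowercase the description and collapse every run of characters outside
  # a-z0-9 into a single hyphen.
  slug="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9' '-')"
  # Trim a leading or trailing hyphen left over from punctuation.
  slug="${slug#-}"; slug="${slug%-}"
  printf '%s-%s\n' "$(date +%Y%m%d-%H%M%S)" "$slug"
}

experiment_name "Llama sentiment analysis"
```

The timestamp comes from the local clock at call time, so two calls in different seconds yield different names even for the same description.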

Lifecycle:

  1. Create in active/ from template
  2. Run experiment, document results
  3. Move to completed/ when done
  4. After 1 month, move to archived/
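
The four steps above can be sketched in shell. Two assumptions to flag: WORKSPACE is a hypothetical root variable, and the archival step uses each directory's modification time as a stand-in for "finished more than a month ago", which may not match when the experiment actually ended.

```shell
set -eu

# Hypothetical workspace root; defaults to a throwaway temp directory.
WORKSPACE="${WORKSPACE:-$(mktemp -d)}"
EXP="$WORKSPACE/experiments"
mkdir -p "$EXP/active" "$EXP/completed" "$EXP/archived" \
         "$WORKSPACE/templates/experiment"

# 1. Create the experiment in active/ from the template.
name="$(date +%Y%m%d-%H%M%S)-custom-model-evaluation"
cp -R "$WORKSPACE/templates/experiment" "$EXP/active/$name"

# 2-3. After running and documenting it, move it to completed/.
mv "$EXP/active/$name" "$EXP/completed/$name"

# 4. Archive completed experiments not modified for more than 30 days.
find "$EXP/completed" -mindepth 1 -maxdepth 1 -type d -mtime +30 \
  -exec mv {} "$EXP/archived/" \;
```

Step 4 is safe to run repeatedly, for example from a scheduled job, because it only touches directories that have aged past the threshold.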

Benefits:

  • active/ stays clean (only current work)
  • completed/ for recent reference
  • archived/ keeps history without clutter
  • Chronological sorting with timestamps

Datasets Directory

datasets/
├── raw/            # Original, unmodified data
├── processed/      # Cleaned, transformed data
└── annotations/    # Labels, tags, metadata

Best practices:

  • Never modify raw/ - immutable source of truth
  • Document processing steps (scripts in scripts/utilities/)
  • Symlink large datasets instead of copying
  • Use consistent naming: datasetname_YYYYMMDD_version
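
Two of these practices, read-only raw data and symlinks instead of copies, might look like the following. The dataset name reviews_20241019_v1.csv, its contents, and the WORKSPACE variable are made up for illustration.

```shell
set -eu

# Hypothetical workspace root; defaults to a throwaway temp directory.
WORKSPACE="${WORKSPACE:-$(mktemp -d)}"
RAW="$WORKSPACE/datasets/raw"
mkdir -p "$RAW" "$WORKSPACE/model-tuning/datasets"

# Stand-in for a real download, named datasetname_YYYYMMDD_version.
printf 'text,label\n' > "$RAW/reviews_20241019_v1.csv"

# Remove write permission from every raw file so accidental edits fail.
find "$RAW" -type f -exec chmod a-w {} +

# Reference the dataset from the tuning workspace instead of copying it.
ln -s "$RAW/reviews_20241019_v1.csv" \
      "$WORKSPACE/model-tuning/datasets/reviews_v1.csv"
```

Only the files are made read-only here, so new raw datasets can still be added to the directory later.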

Notebooks Directory

notebooks/
├── exploration/   # Initial data exploration
├── analysis/      # Results analysis
└── demos/         # Demonstrations and presentations

Naming convention:

YYYYMMDD-purpose-description.ipynb

Examples:
  20241019-exploration-sentiment-dataset.ipynb
  20241019-analysis-llama-vs-phi3-performance.ipynb
  20241020-demo-custom-model-showcase.ipynb

Scripts Directory

scripts/
├── utilities/    # Helper scripts (data processing, etc.)
├── automation/   # Workflow automation
└── deployment/   # Deployment scripts

Examples:

  • utilities/convert_data.py - Data format conversion
  • automation/run_benchmark.sh - Automated benchmarking
  • deployment/deploy_agent.py - Agent deployment

Logs Directory

logs/
├── experiments/   # Experiment execution logs
├── training/      # Training logs
└── agents/        # Agent execution logs

Log naming:

YYYYMMDD-HHMMSS-activity.log

Examples:
  20241019-143000-sentiment-experiment.log
  20241019-155500-model-training.log
  20241020-090000-agent-execution.log
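
A minimal sketch of capturing a run into a log named this way. The two echo lines stand in for a real experiment command, and WORKSPACE is a hypothetical root variable.

```shell
set -eu

# Hypothetical workspace root; defaults to a throwaway temp directory.
WORKSPACE="${WORKSPACE:-$(mktemp -d)}"
LOG_DIR="$WORKSPACE/logs/experiments"
mkdir -p "$LOG_DIR"

# Build a log path in the YYYYMMDD-HHMMSS-activity.log form.
LOG_FILE="$LOG_DIR/$(date +%Y%m%d-%H%M%S)-sentiment-experiment.log"

# Send stdout and stderr both to the terminal and to the log file.
{
  echo "starting sentiment experiment"
  echo "done"
} 2>&1 | tee "$LOG_FILE"
```

Using tee keeps the output visible while the run is in progress and still leaves a complete record on disk.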

Templates Directory

templates/
├── experiment/    # Experiment template
└── agent/         # Agent template

These contain pre-built structures for rapid project creation. We'll cover these in detail in Part 3 (Experiments) and Part 4 (Agents).


Creating the Structure

Here are the commands to create the complete workspace:

# Create main directories
mkdir -p (local path)

# Ollama structure
mkdir -p (local path)

# Model tuning structure
mkdir -p (local path)

# Agents structure
mkdir -p (local path)

# Datasets structure
mkdir -p (local path)

# Experiments structure
mkdir -p (local path)

# Notebooks structure
mkdir -p (local path)

# Scripts structure
mkdir -p (local path)

# Logs structure
mkdir -p (local path)

# Templates structure
mkdir -p (local path)
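
The paths in the commands above are elided, so here is one possible equivalent as a single script. It is a sketch built from the tree diagram earlier in this article; the WORKSPACE variable is an assumption, and you would point it at wherever your workspace should live.

```shell
set -eu

# Hypothetical workspace root; defaults to a throwaway temp directory.
WORKSPACE="${WORKSPACE:-$(mktemp -d)}"

# One leaf directory per line, mirroring the tree diagram.
# mkdir -p creates the parent directories as needed.
while IFS= read -r dir; do
  mkdir -p "$WORKSPACE/$dir"
done <<'EOF'
ollama-models/custom
ollama-models/downloaded
ollama-models/modelfiles
model-tuning/datasets
model-tuning/configs
model-tuning/checkpoints
model-tuning/fine-tuned-models
model-tuning/evaluation
agents/prototypes
agents/production
agents/libraries
agents/tools
datasets/raw
datasets/processed
datasets/annotations
experiments/active
experiments/completed
experiments/archived
notebooks/exploration
notebooks/analysis
notebooks/demos
scripts/utilities
scripts/automation
scripts/deployment
results
logs/experiments
logs/training
logs/agents
templates/experiment
templates/agent
EOF

echo "Workspace created under $WORKSPACE"
```

Because mkdir -p is idempotent, the script can be rerun after adding new lines to the list without disturbing existing directories.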

Verify creation:

cd (local path)
find . -mindepth 1 -type d | sort

If the commands mirror the tree above, this lists 39 directories: the 30 leaf directories plus their 9 parents.


Real-World Benefits

After implementing this structure, you'll experience:

Time Saved:

  • ~30 minutes per day (no file searching)
  • ~15 minutes per new project (template usage)
  • ~10 minutes per experiment (clear organization)

Quality Improvements:

  • Consistent project structure
  • Better documentation (templates prompt it)
  • Easier collaboration (others understand structure)
  • Reproducible work (everything in its place)

Cognitive Benefits:

  • Less decision fatigue ("where should this go?")
  • Clear mental model of workspace
  • Easier to return after breaks
  • Reduced stress from chaos

Design Decisions Explained

Why Timestamp-Based Experiment Naming?

Decision: Use YYYYMMDD-HHMMSS-descriptive-name

Rationale:

  • Chronological sorting works naturally with ls
  • Naming collisions are unlikely, even with the same description, because two experiments would have to start in the same second
  • Temporal context visible in name
  • Easy to search by date range

Alternative considered: Simple numbering (exp-001, exp-002)

Rejected because: Numbers don't convey temporal information
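
To illustrate the date-range point, the sketch below filters experiment directories by their YYYYMMDD prefix. The in_range helper, the WORKSPACE variable, and the 20240912-101500-old-baseline directory are hypothetical; the sample directories exist only so the snippet runs on its own.

```shell
set -eu

# Hypothetical workspace root; defaults to a throwaway temp directory.
WORKSPACE="${WORKSPACE:-$(mktemp -d)}"
EXP="$WORKSPACE/experiments"

# Sample directories so the sketch runs standalone.
mkdir -p "$EXP/completed/20241019-143000-llama-sentiment-analysis" \
         "$EXP/completed/20241020-090000-custom-model-evaluation" \
         "$EXP/archived/20240912-101500-old-baseline"

# in_range FROM TO: print experiments whose date prefix is within [FROM, TO].
in_range() {
  for path in "$EXP"/*/*; do
    [ -d "$path" ] || continue
    day="$(basename "$path" | cut -c1-8)"
    # Skip directories that do not follow the naming convention.
    case "$day" in *[!0-9]*|"") continue ;; esac
    if [ "$day" -ge "$1" ] && [ "$day" -le "$2" ]; then
      echo "$path"
    fi
  done
}

in_range 20241001 20241031
```

Sequential numbers like exp-001 would need a separate lookup to answer the same question.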

Why Separate Ollama Directory?

Decision: Dedicated ollama-models/ top-level directory

Rationale:

  • Ollama has tool-specific workflows
  • Modelfiles need special tracking
  • Different from generic ML models
  • Easy discoverability

Alternative considered: Generic models/ directory

Rejected because: Ollama-specific needs warrant dedicated space

Why Three-Stage Experiment Lifecycle?

Decision: Active/Completed/Archived

Rationale:

  • Active: High priority, currently working
  • Completed: Done but recent, easy reference
  • Archived: Historical record without clutter

Benefits:

  • Clean active directory
  • Recent work accessible
  • History preserved
  • Manageable growth

Alternative considered: Just active/completed

Rejected because: Completed grows forever without archival stage


Integration with Development Tools

This workspace structure integrates naturally with:

Claude Code / AI Assistants:

  • Clear structure guides AI suggestions
  • Templates provide context
  • Documentation in predictable locations

Git Version Control:

# Initialize in specific subdirectories
cd (local path)
git init

# Or workspace-wide tracking
cd (local path)
git init

Jupyter Notebooks:

  • Organized by purpose (exploration/analysis/demos)
  • Easy to find related notebooks
  • Clear progression (exploration → analysis → demos)

Ollama CLI:

  • Model files in predictable locations
  • Version-controlled Modelfiles
  • Clear custom vs. downloaded distinction

What's Next

You now have a battle-tested workspace structure that can handle hundreds of ML projects without becoming chaotic. But structure alone isn't enough—you need systems to track your work, document experiments, and capture insights.

In Part 2: Documentation Systems, we'll build a three-tier logging system that:

  • Captures execution details for debugging
  • Tracks daily activity for review
  • Creates session narratives for blog posts
  • Maintains experiment history
  • Builds a model registry

We'll also create activity tracking systems that turn your ML work into reproducible, shareable knowledge.


Key Takeaways

  • Separation of concerns prevents chaos as projects grow
  • Tool-specific directories (like ollama-models/) reflect real workflows
  • Timestamped naming provides chronological organization
  • Lifecycle management keeps active work focused
  • Templates (coming in Part 3 & 4) accelerate project creation
  • Roughly 30 minutes saved daily with proper organization

Resources

Workspace Templates:

  • Full structure creation script (provided above)
  • Template files (covered in Parts 3 & 4)


Series Navigation

  • Next: Part 2: Documentation Systems
  • Series Home: Building a Production ML Workspace on GPU Infrastructure

Questions or suggestions? Find me on Twitter @bioinfo or at rundatarun.io




About the Author: Justin Johnson builds AI systems and writes about practical AI development.
