Practical Applications · October 19, 2025 · 8 min read · shipped

Building a Production ML Workspace: Part 1 - Designing an Organized Structure

When you start working with GPU infrastructure for machine learning—whether it's a DGX workstation, cloud GPUs, or local hardware—one of the first challenges you face isn't technical complexity. It's organizational chaos.

You'll be juggling Ollama models, fine-tuning experiments, agent prototypes, datasets, training checkpoints, and Jupyter notebooks. Without a clear structure from day one, you'll waste precious time searching for files, recreating experiments, and wondering "which model was that again?"

This article shows you how to design a workspace structure that scales from your first experiment to hundreds of concurrent projects, with a specific focus on Ollama integration and LLM development workflows.

About This Series

This is Part 1 of a 5-part series on building production ML workspaces on GPU infrastructure. Each part stands alone but builds toward a complete system:

Part 1: Workspace Structure (you are here)
Part 2: Documentation Systems
Part 3: Experiment Tracking
Part 4: Agent Templates
Part 5: Ollama Integration

Why Organization Matters for ML Work

Before diving into the structure, let's understand why this matters:

The Problem: ML projects generate a lot of artifacts:

Downloaded models (5-100+ GB each)
Training checkpoints (multiple per experiment)
Datasets in various processing stages
Experimental code and notebooks
Results, logs, and visualizations
Custom models and configurations

The Cost: Without organization:

30+ minutes per day searching for files
Lost experiment context (what were we testing?)
Duplicate work (did we already try this?)
Difficulty reproducing results
Hard to track what actually works

The Solution: A well-designed directory structure that:

Makes file locations obvious
Separates concerns cleanly
Scales to hundreds of projects
Supports your workflow naturally
Reduces cognitive load

Core Design Principles

Our workspace structure follows five key principles:

1. Separation of Concerns

Different types of work live in different directories. Ollama models don't mix with fine-tuning experiments. Agents don't mix with datasets. This makes searching faster and reduces cross-contamination.

2. Tool-Specific Organization

Ollama has unique workflows (Modelfiles, pull, create) that deserve dedicated space. Generic ML models go elsewhere. This reflects how you actually work with these tools.

3. Lifecycle Management

Experiments move through states: active → completed → archived. Your directory structure should reflect these lifecycles, keeping your active work focused.

4. Timestamped Naming

Use YYYYMMDD-HHMMSS-descriptive-name for experiments. This provides chronological sorting, unique names, and temporal context at a glance.

5. Discoverability

Names should be intuitive. If you need a dataset, look in datasets/. If you need an Ollama model, look in ollama-models/. No guessing.

The Complete Workspace Structure

Here's the full structure we'll build. Don't worry—we'll break down each section:

(local path)
├── ollama-models/          # Ollama-specific work
│   ├── custom/            # Custom models you created
│   ├── downloaded/        # Models pulled from registry
│   └── modelfiles/        # Modelfile version control
│
├── model-tuning/          # Fine-tuning workspace
│   ├── datasets/         # Training datasets
│   ├── configs/          # Training configurations
│   ├── checkpoints/      # Training checkpoints
│   ├── fine-tuned-models/ # Final outputs
│   └── evaluation/       # Evaluation results
│
├── agents/               # AI agent development
│   ├── prototypes/      # Experimental agents
│   ├── production/      # Production-ready agents
│   ├── libraries/       # Shared components
│   └── tools/           # Agent-specific tools
│
├── datasets/            # Data management
│   ├── raw/            # Original data
│   ├── processed/      # Cleaned data
│   └── annotations/    # Labeled data
│
├── experiments/         # Experiment tracking
│   ├── active/         # Running experiments
│   ├── completed/      # Finished experiments
│   └── archived/       # Historical reference
│
├── notebooks/          # Jupyter development
│   ├── exploration/   # Data exploration
│   ├── analysis/      # Results analysis
│   └── demos/         # Demonstrations
│
├── scripts/           # Automation
│   ├── utilities/    # Helper scripts
│   ├── automation/   # Workflow automation
│   └── deployment/   # Deployment scripts
│
├── results/          # Experiment outputs
├── logs/             # Comprehensive logging
│   ├── experiments/
│   ├── training/
│   └── agents/
│
└── templates/        # Project templates
    ├── experiment/
    └── agent/

Breaking Down Each Directory

Ollama Models Directory

ollama-models/
├── custom/            # Models you create from Modelfiles
├── downloaded/        # Models pulled from Ollama registry
└── modelfiles/        # Version-controlled Modelfiles

Why separate from other models?

Ollama has specific workflows (pull, create, run)
Modelfiles need version control
Custom models need different tracking than downloads
Easy to see what you've created vs. downloaded

Example usage:

# Pull a model
ollama pull llama3.1

# Create custom model
cd ollama-models/modelfiles/my-assistant/
ollama create my-assistant:v1 -f v1-Modelfile

# Record pulled models under downloaded/ and models you create under custom/

Naming convention for custom models:

base-model-purpose-version
Examples:
  llama3-customer-support-v1
  mistral-code-reviewer-v2
  phi3-summarizer-v1
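A naming convention is easier to keep when something checks it. The sketch below is one possible validator, not part of Ollama: the regular expression is inferred from the three examples above, and the function name parse_model_name is hypothetical.

```python
import re

# Pattern inferred from the three examples above: base model, hyphenated purpose, then vN.
MODEL_NAME = re.compile(
    r"^(?P<base>[a-z0-9.]+)-(?P<purpose>[a-z0-9]+(?:-[a-z0-9]+)*)-v(?P<version>\d+)$"
)


def parse_model_name(name: str) -> dict:
    """Split a custom model name into base model, purpose, and integer version."""
    match = MODEL_NAME.match(name)
    if match is None:
        raise ValueError(f"{name!r} does not follow base-model-purpose-version")
    return {
        "base": match["base"],
        "purpose": match["purpose"],
        "version": int(match["version"]),
    }


print(parse_model_name("llama3-customer-support-v1"))
# prints: {'base': 'llama3', 'purpose': 'customer-support', 'version': 1}
```

Names that skip the purpose or the version, such as llama3-v1, raise ValueError, which makes the check usable before running ollama create.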

Model Tuning Directory

model-tuning/
├── datasets/         # Training data
├── configs/          # Training configurations
├── checkpoints/      # Model checkpoints during training
├── fine-tuned-models/ # Final trained models
└── evaluation/       # Evaluation metrics and reports

Complete workflow:

Prepare dataset → datasets/
Create config → configs/
Train model → checkpoints saved to checkpoints/
Evaluate → results in evaluation/
Export final → fine-tuned-models/
Import to Ollama → create Modelfile, add to ollama-models/custom/

Why this structure:

Clear pipeline from data to deployed model
Checkpoints separate from final models
Configs version-controlled
Evaluation results easily comparable

Agents Directory

agents/
├── prototypes/      # Experimental, rapidly changing
├── production/      # Stable, tested, documented
├── libraries/       # Shared code between agents
└── tools/           # Agent-specific tools

Agent organization pattern:

agents/production/customer-support-agent/
├── README.md              # Full documentation
├── agent.py              # Main code
├── config.yaml           # Configuration
├── requirements.txt      # Dependencies
├── prompts/              # Prompt templates
├── tools/                # Agent tools
├── tests/                # Tests
└── logs/                 # Execution logs

Prototype vs. Production:

Prototypes: Fast iteration, minimal docs, breaking changes OK
Production: Stable API, comprehensive docs, tests, versioning

Experiments Directory

experiments/
├── active/         # Currently running
├── completed/      # Done but recent (< 1 month)
└── archived/       # Historical (> 1 month old)

Experiment naming:

YYYYMMDD-HHMMSS-descriptive-name

Examples:
  20241019-143000-llama-sentiment-analysis
  20241019-155500-phi3-code-generation-benchmark
  20241020-090000-custom-model-evaluation
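Typing timestamps by hand invites mistakes, so it can help to generate the name. This is a minimal sketch: the function name experiment_name and the rule for turning a description into a hyphenated slug are assumptions, not something the convention itself specifies.

```python
import re
from datetime import datetime
from typing import Optional


def experiment_name(description: str, when: Optional[datetime] = None) -> str:
    """Return a YYYYMMDD-HHMMSS-descriptive-name string for an experiment directory."""
    when = when or datetime.now()
    # Lowercase the description and collapse anything that is not a letter or digit into "-".
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    if not slug:
        raise ValueError("description must contain at least one letter or digit")
    return f"{when.strftime('%Y%m%d-%H%M%S')}-{slug}"


print(experiment_name("Llama sentiment analysis", datetime(2024, 10, 19, 14, 30, 0)))
# prints: 20241019-143000-llama-sentiment-analysis
```

Called without a second argument, the function uses the current local time.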

Lifecycle:

Create in active/ from template
Run experiment, document results
Move to completed/ when done
After 1 month, move to archived/

Benefits:

active/ stays clean (only current work)
completed/ for recent reference
archived/ keeps history without clutter
Chronological sorting with timestamps
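The last lifecycle step, moving experiments to archived/ after a month, is easy to forget and easy to automate. The sketch below makes two assumptions worth stating: "one month" is approximated as 30 days, and age is measured from the timestamp in the directory name (when the experiment started), not from when it finished. The function name is hypothetical.

```python
import shutil
from datetime import datetime, timedelta
from pathlib import Path


def archive_old_experiments(experiments_root: Path, now: datetime, max_age_days: int = 30) -> list:
    """Move experiments older than max_age_days from completed/ to archived/."""
    completed = experiments_root / "completed"
    archived = experiments_root / "archived"
    if not completed.is_dir():
        return []
    archived.mkdir(parents=True, exist_ok=True)
    moved = []
    for exp_dir in sorted(completed.iterdir()):
        if not exp_dir.is_dir():
            continue
        try:
            # The first 15 characters of the name are the YYYYMMDD-HHMMSS timestamp.
            started = datetime.strptime(exp_dir.name[:15], "%Y%m%d-%H%M%S")
        except ValueError:
            continue  # skip directories that do not follow the naming convention
        if now - started > timedelta(days=max_age_days):
            shutil.move(str(exp_dir), str(archived / exp_dir.name))
            moved.append(exp_dir.name)
    return moved
```

Passing now explicitly keeps the function easy to test; in a scheduled job you would pass datetime.now().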

Datasets Directory

datasets/
├── raw/            # Original, unmodified data
├── processed/      # Cleaned, transformed data
└── annotations/    # Labels, tags, metadata

Best practices:

Never modify raw/ - immutable source of truth
Document processing steps (scripts in scripts/utilities/)
Symlink large datasets instead of copying
Use consistent naming: datasetname_YYYYMMDD_version
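Two of these practices can be enforced rather than remembered. The sketch below, for POSIX systems, shows one way to symlink a large dataset into raw/ and to clear the write bits on a raw file. Both helper names are hypothetical, and read-only permissions guard against accidents only, not deliberate changes.

```python
import stat
from pathlib import Path


def make_read_only(path: Path) -> None:
    """Clear the write bits on a file so raw data is harder to modify by accident."""
    mode = path.stat().st_mode
    path.chmod(mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))


def link_dataset(source: Path, raw_dir: Path) -> Path:
    """Symlink a large dataset into raw_dir instead of copying it."""
    raw_dir.mkdir(parents=True, exist_ok=True)
    link = raw_dir / source.name
    if not link.is_symlink():
        # Point at the resolved absolute path so the link survives directory changes.
        link.symlink_to(source.resolve(), target_is_directory=source.is_dir())
    return link
```

A typical call is link_dataset(Path("/mnt/storage/reviews_20241019_v1.csv"), workspace / "datasets" / "raw"), where the storage path and the workspace variable are placeholders for your own locations.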

Notebooks Directory

notebooks/
├── exploration/   # Initial data exploration
├── analysis/      # Results analysis
└── demos/         # Demonstrations and presentations

Naming convention:

YYYYMMDD-purpose-description.ipynb

Examples:
  20241019-exploration-sentiment-dataset.ipynb
  20241019-analysis-llama-vs-phi3-performance.ipynb
  20241020-demo-custom-model-showcase.ipynb

Scripts Directory

scripts/
├── utilities/    # Helper scripts (data processing, etc.)
├── automation/   # Workflow automation
└── deployment/   # Deployment scripts

Examples:

utilities/convert_data.py - Data format conversion
automation/run_benchmark.sh - Automated benchmarking
deployment/deploy_agent.py - Agent deployment

Logs Directory

logs/
├── experiments/   # Experiment execution logs
├── training/      # Training logs
└── agents/        # Agent execution logs

Log naming:

YYYYMMDD-HHMMSS-activity.log

Examples:
  20241019-143000-sentiment-experiment.log
  20241019-155500-model-training.log
  20241020-090000-agent-execution.log
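One way to get this naming for free is to let the logging setup build the file name. The following is a sketch using Python's standard logging module; the function name open_run_log and the log line format are assumptions, not conventions from this series.

```python
import logging
from datetime import datetime
from pathlib import Path


def open_run_log(logs_root: Path, category: str, activity: str) -> logging.Logger:
    """Create logs/<category>/YYYYMMDD-HHMMSS-<activity>.log and return a logger writing to it."""
    if category not in {"experiments", "training", "agents"}:
        raise ValueError(f"unknown log category: {category}")
    log_dir = logs_root / category
    log_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    log_path = log_dir / f"{stamp}-{activity}.log"

    logger = logging.getLogger(f"{category}.{activity}.{stamp}")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid duplicate lines if called twice within the same second
        handler = logging.FileHandler(log_path)
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

For example, open_run_log(workspace / "logs", "training", "model-training").info("epoch 1 done") writes one line to a new file under logs/training/, where workspace is a placeholder for your workspace root.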

Templates Directory

templates/
├── experiment/    # Experiment template
└── agent/         # Agent template

These contain pre-built structures for rapid project creation. We'll cover them in detail in Part 3 (Experiment Tracking) and Part 4 (Agent Templates).
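Until those parts fill in the template contents, the mechanics of "create from template" can be sketched in a few lines. The function name new_experiment is hypothetical, and the sketch assumes only that templates/experiment/ exists; what goes inside it is left to Part 3.

```python
import shutil
from pathlib import Path


def new_experiment(workspace: Path, name: str) -> Path:
    """Copy templates/experiment/ into experiments/active/<name> and return the new path."""
    template = workspace / "templates" / "experiment"
    target = workspace / "experiments" / "active" / name
    if target.exists():
        raise FileExistsError(f"experiment already exists: {target}")
    # copytree creates the destination and any missing parent directories.
    shutil.copytree(template, target)
    return target
```

The name argument is expected to follow the YYYYMMDD-HHMMSS-descriptive-name convention, for example 20241019-143000-llama-sentiment-analysis.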

Creating the Structure

Here are the commands to create the complete workspace:

# Create main directories
mkdir -p (local path)

# Ollama structure
mkdir -p (local path)

# Model tuning structure
mkdir -p (local path)

# Agents structure
mkdir -p (local path)

# Datasets structure
mkdir -p (local path)

# Experiments structure
mkdir -p (local path)

# Notebooks structure
mkdir -p (local path)

# Scripts structure
mkdir -p (local path)

# Logs structure
mkdir -p (local path)

# Templates structure
mkdir -p (local path)

Verify creation:

cd (local path)
find . -type d | sort

If every directory in the tree above was created, the listing shows 39 directories plus . for the workspace root itself.
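If you prefer a script you can rerun, the same layout can be created from Python. This is a sketch under one assumption: the root directory is whatever path you pass in, since the workspace path is not fixed here. The dictionary mirrors the tree shown earlier, and the names WORKSPACE_LAYOUT and create_workspace are hypothetical.

```python
from pathlib import Path

# Layout copied from the tree in this article: top-level directory -> subdirectories.
WORKSPACE_LAYOUT = {
    "ollama-models": ["custom", "downloaded", "modelfiles"],
    "model-tuning": ["datasets", "configs", "checkpoints", "fine-tuned-models", "evaluation"],
    "agents": ["prototypes", "production", "libraries", "tools"],
    "datasets": ["raw", "processed", "annotations"],
    "experiments": ["active", "completed", "archived"],
    "notebooks": ["exploration", "analysis", "demos"],
    "scripts": ["utilities", "automation", "deployment"],
    "results": [],
    "logs": ["experiments", "training", "agents"],
    "templates": ["experiment", "agent"],
}


def create_workspace(root: Path) -> list:
    """Create every directory in WORKSPACE_LAYOUT under root and return them sorted."""
    created = []
    for top, subdirs in WORKSPACE_LAYOUT.items():
        top_dir = root / top
        top_dir.mkdir(parents=True, exist_ok=True)  # same effect as mkdir -p
        created.append(top_dir)
        for sub in subdirs:
            sub_dir = top_dir / sub
            sub_dir.mkdir(exist_ok=True)
            created.append(sub_dir)
    return sorted(created)
```

Calling create_workspace(Path("/path/to/your/workspace")) returns 39 paths: 10 top-level directories and 29 subdirectories. Running it again is harmless because existing directories are left alone.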

Real-World Benefits

After implementing this structure, you can expect benefits along these lines (the time figures are rough estimates):

Time Saved:

~30 minutes per day (no file searching)
~15 minutes per new project (template usage)
~10 minutes per experiment (clear organization)

Quality Improvements:

Consistent project structure
Better documentation (templates prompt it)
Easier collaboration (others understand structure)
Reproducible work (everything in its place)

Cognitive Benefits:

Less decision fatigue ("where should this go?")
Clear mental model of workspace
Easier to return after breaks
Reduced stress from chaos

Design Decisions Explained

Why Timestamp-Based Experiment Naming?

Decision: Use YYYYMMDD-HHMMSS-descriptive-name

Rationale:

Chronological sorting works naturally with ls
No naming collisions (even with same description)
Temporal context visible in name
Easy to search by date range

Alternative considered: Simple numbering (exp-001, exp-002)
Rejected because: Numbers don't convey temporal information
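The date-range point follows from the format itself: zero-padded YYYYMMDD prefixes compare correctly as plain strings, so no date parsing is needed. A minimal sketch, with the hypothetical function name experiments_between:

```python
from pathlib import Path


def experiments_between(directory: Path, start: str, end: str) -> list:
    """Return experiment names whose YYYYMMDD prefix lies in [start, end], oldest first."""
    names = [p.name for p in directory.iterdir() if p.is_dir()]
    # Fixed-width, zero-padded dates sort and compare correctly as strings.
    return sorted(
        n for n in names
        if n[:8].isdigit() and start <= n[:8] <= end
    )
```

For example, experiments_between(Path("experiments/completed"), "20241001", "20241031") lists everything started in October 2024. Directories whose names do not begin with eight digits are ignored.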

Why Separate Ollama Directory?

Decision: Dedicated ollama-models/ top-level directory

Rationale:

Ollama has tool-specific workflows
Modelfiles need special tracking
Different from generic ML models
Easy discoverability

Alternative considered: Generic models/ directory
Rejected because: Ollama-specific needs warrant dedicated space

Why Three-Stage Experiment Lifecycle?

Decision: Active/Completed/Archived

Rationale:

Active: High priority, currently working
Completed: Done but recent, easy reference
Archived: Historical record without clutter

Benefits:

Clean active directory
Recent work accessible
History preserved
Manageable growth

Alternative considered: Just active/completed
Rejected because: Completed grows forever without archival stage

Integration with Development Tools

This workspace structure integrates naturally with:

Claude Code / AI Assistants:

Clear structure guides AI suggestions
Templates provide context
Documentation in predictable locations

Git Version Control:

# Initialize in specific subdirectories
cd (local path)
git init

# Or workspace-wide tracking
cd (local path)
git init

Jupyter Notebooks:

Organized by purpose (exploration/analysis/demos)
Easy to find related notebooks
Clear lifecycle (experimental → production)

Ollama CLI:

Model files in predictable locations
Version-controlled Modelfiles
Clear custom vs. downloaded distinction

What's Next

You now have a battle-tested workspace structure that can handle hundreds of ML projects without becoming chaotic. But structure alone isn't enough—you need systems to track your work, document experiments, and capture insights.

In Part 2: Documentation Systems, we'll build a three-tier logging system that:

Captures execution details for debugging
Tracks daily activity for review
Creates session narratives for blog posts
Maintains experiment history
Builds a model registry

We'll also create activity tracking systems that turn your ML work into reproducible, shareable knowledge.

Key Takeaways

Separation of concerns prevents chaos as projects grow
Tool-specific directories (like ollama-models/) reflect real workflows
Timestamped naming provides chronological organization
Lifecycle management keeps active work focused
Templates (coming in Parts 3 & 4) accelerate project creation
Roughly 30 minutes saved daily with proper organization

Resources

Workspace Templates:

Full structure creation script (provided above)
Template files (covered in Parts 3 & 4)

Related Reading:

Ollama Documentation: https://github.com/ollama/ollama
Git for ML: Version control best practices
Experiment tracking patterns

Series Navigation

Next: Part 2: Documentation Systems
Series Home: Building a Production ML Workspace on GPU Infrastructure

Questions or suggestions? Find me on Twitter @bioinfo or at rundatarun.io

About the Author: Justin Johnson builds AI systems and writes about practical AI development.

