The Jam Machine

A generative AI music composition tool that creates MIDI sequences using a GPT-2 model trained on ~5,000 MIDI songs.

How It Works

Music is encoded as text so that it can be modeled like a language. A GPT-2 model learns patterns from thousands of encoded songs and generates new, coherent token sequences, which are then decoded back into MIDI.

~5,000 MIDI Songs ──► Encoder ──► Text Tokens ──► Train GPT-2
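
As a rough illustration of the Encoder step, the sketch below turns a few notes into a line of text tokens. The token names (BAR_START, NOTE_ON, TIME_DELTA, NOTE_OFF, BAR_END) and the helper encode_notes are assumptions made for this example, not necessarily the project's actual format; the Encoding & Decoding Guide describes the real one.

```python
from typing import List, Tuple

# A note is (pitch, start_step, end_step), with times already quantized
# to integer steps. This representation is an assumption for the sketch.
Note = Tuple[int, int, int]


def encode_notes(notes: List[Note]) -> str:
    """Encode one bar of notes as space-separated text tokens (hypothetical format)."""
    events = []
    for pitch, start, end in notes:
        # The middle element orders note-offs (0) before note-ons (1)
        # when both happen at the same time step.
        events.append((start, 1, f"NOTE_ON={pitch}"))
        events.append((end, 0, f"NOTE_OFF={pitch}"))
    events.sort()

    tokens = ["BAR_START"]
    now = 0
    for time, _, token in events:
        if time > now:
            # Emit the elapsed time since the previous event.
            tokens.append(f"TIME_DELTA={time - now}")
            now = time
        tokens.append(token)
    tokens.append("BAR_END")
    return " ".join(tokens)


# Middle C for four steps, then E for four steps.
text = encode_notes([(60, 0, 4), (64, 4, 8)])
print(text)
```

Representing time as deltas between events, rather than absolute positions, keeps the set of distinct time tokens small, which suits a compact vocabulary.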

Your Input ──► GPT-2 Model ──► Generated Text ──► Decoder ──► MIDI File
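
The Decoder step can be pictured as the reverse operation: walk through the generated text and rebuild note events. The sketch below is a minimal version under assumed token names (NOTE_ON, NOTE_OFF, TIME_DELTA); decode_tokens is a hypothetical helper, and writing the resulting notes to an actual MIDI file is left out.

```python
from typing import Dict, List, Tuple


def decode_tokens(text: str) -> List[Tuple[int, int, int]]:
    """Parse text tokens into (pitch, start_step, end_step) notes (hypothetical format)."""
    now = 0                          # current time in quantized steps
    open_notes: Dict[int, int] = {}  # pitch -> step at which it started
    notes: List[Tuple[int, int, int]] = []

    for token in text.split():
        name, _, value = token.partition("=")
        if name == "TIME_DELTA":
            now += int(value)
        elif name == "NOTE_ON":
            open_notes[int(value)] = now
        elif name == "NOTE_OFF":
            # Generated text may be imperfect: skip a note-off
            # that has no matching note-on.
            start = open_notes.pop(int(value), None)
            if start is not None:
                notes.append((int(value), start, now))
        # Any other token (structural markers etc.) is ignored here.
    return notes


generated = "BAR_START NOTE_ON=60 TIME_DELTA=4 NOTE_OFF=60 NOTE_ON=64 TIME_DELTA=4 NOTE_OFF=64 BAR_END"
print(decode_tokens(generated))
```

Tolerating unmatched tokens matters because a language model offers no guarantee that its output is well formed.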

The model’s vocabulary has ~300 tokens covering 128 MIDI pitches, 16 instrument families, time steps, and structural markers.
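
To see how a vocabulary of roughly this size could arise, the sketch below builds one from the categories listed above. Only the 128 pitches and 16 instrument families come from this page; the token names, the separate note-on and note-off tokens per pitch, the 16 time-step values, and the marker list are all assumptions, so the real breakdown may differ.

```python
from typing import Dict

PITCHES = range(128)             # MIDI pitches 0..127
INSTRUMENT_FAMILIES = range(16)  # 16 instrument families
TIME_STEPS = range(1, 17)        # ASSUMED: 16 distinct time-step values
MARKERS = [                      # ASSUMED structural and special tokens
    "PIECE_START", "TRACK_START", "TRACK_END",
    "BAR_START", "BAR_END", "UNK",
]


def build_vocab() -> Dict[str, int]:
    """Map each token string to an integer id (hypothetical layout)."""
    tokens = list(MARKERS)
    tokens += [f"INST={i}" for i in INSTRUMENT_FAMILIES]
    tokens += [f"TIME_DELTA={t}" for t in TIME_STEPS]
    tokens += [f"NOTE_ON={p}" for p in PITCHES]
    tokens += [f"NOTE_OFF={p}" for p in PITCHES]
    return {token: index for index, token in enumerate(tokens)}


def instrument_family(program: int) -> int:
    """Group a General MIDI program number (0..127) into one of 16 families.

    General MIDI arranges its 128 programs in 16 families of 8; whether
    the project uses exactly this grouping is an assumption.
    """
    if not 0 <= program <= 127:
        raise ValueError("program must be in 0..127")
    return program // 8


vocab = build_vocab()
print(len(vocab))  # 6 markers + 16 + 16 + 128 + 128
```

Under these assumptions the note tokens dominate the vocabulary (256 of the entries), which is why the total stays near 300 even with instruments, time steps, and markers added.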

Encoding & Decoding Guide: How MIDI files become text tokens and back. Covers the full pipeline, the token format, quantization trade-offs, and a worked example using The Strokes - Reptilia.
Embedding Explorer: Visualizations of the model’s learned token embeddings, attention patterns, and next-token predictions. Compares the trained model with an untrained one.

See the README for installation and usage instructions.