PMetal: Fine-Tune LLMs on Your Mac — No Cloud Required
Professional-grade large language model fine-tuning that runs entirely on Apple Silicon. Your data stays on your machine, your models stay under your control, and your costs stay at the price of electricity.
What Is PMetal?
PMetal is a local LLM fine-tuning platform built specifically for Apple Silicon. It gives developers, researchers, and businesses the tools to train and adapt large language models entirely on their own hardware — no cloud accounts, no API keys, no data leaving the building.
The name is a nod to Metal, Apple's GPU compute framework that powers everything under the hood. PMetal harnesses the unified memory architecture of M-series chips to make fine-tuning workloads that once required expensive cloud GPUs practical on hardware you already own.
Key Capabilities at a Glance
- LoRA / QLoRA / DoRA: all major parameter-efficient fine-tuning methods, ready to run
- GRPO reasoning training: Group Relative Policy Optimization for training reasoning models
- 20+ model architectures: Llama, Mistral, Gemma, Phi, Qwen, and more out of the box
- Desktop GUI: point-and-click fine-tuning with real-time loss curves and progress
- TUI & CLI: fully scriptable terminal interfaces for automation and CI workflows
- Python SDK: integrate fine-tuning directly into your existing ML pipelines
The Problem: Cloud Fine-Tuning Is Expensive and Risky
For the past few years, fine-tuning a language model has meant one thing: renting cloud GPUs. A modest training run on a capable model could easily cost hundreds of dollars. A production-quality fine-tune with hyperparameter sweeps could run into the thousands. And that cost repeats every time you want to experiment, iterate, or retrain on fresh data.
Cost is only half the story. The other half is data. When you send your training data to a cloud provider, you lose control of it. For businesses handling customer information, proprietary knowledge bases, internal documents, or anything governed by HIPAA, GDPR, SOC 2, or internal compliance policies, sending that data off-premises may be unlawful, a breach of contract, or simply unacceptable.
The result has been a two-tier AI landscape: large organizations with dedicated ML infrastructure can fine-tune models on private data, while everyone else either uses generic public models or accepts the privacy trade-off of cloud training. PMetal exists to close that gap.
Cloud Fine-Tuning Pain Points
- GPU rental costs add up fast — especially during experimentation phases
- Your training data is uploaded to and processed on third-party infrastructure
- Compliance teams may prohibit sending sensitive data outside the organization
- Vendor lock-in: model weights and training artifacts live in someone else's storage
- Internet dependency creates latency, availability risk, and bandwidth costs
- No reproducibility guarantees if the provider changes their environment
The Solution: Run Everything Locally on Apple Silicon
Apple Silicon changed the calculus for local ML workloads. The M-series unified memory architecture means your CPU and GPU share the same high-bandwidth memory pool. A Mac Studio with 192GB of unified memory can comfortably hold and train models that would require A100-class cloud hardware — at a one-time hardware cost rather than an ongoing rental fee.
PMetal is built from the ground up to exploit this architecture. It uses Metal Performance Shaders for GPU-accelerated matrix operations, implements memory-efficient quantization to pack larger models into available RAM, and provides parameter-efficient fine-tuning methods (LoRA, QLoRA, DoRA) that reduce the memory footprint of training to a fraction of full fine-tuning.
The result is a complete, self-contained fine-tuning environment that installs on your Mac in minutes. Your training data never leaves your machine. Your model weights are stored wherever you choose. Your results are reproducible because nothing about your environment changes between runs.
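The unified-memory claim above can be sanity-checked with back-of-envelope arithmetic: a model's weight footprint is roughly parameter count times bytes per parameter. The sketch below is illustrative arithmetic only, not PMetal's internal memory accounting, and it ignores activations and KV cache.

```python
# Rough weight footprint: params * bits-per-param / 8.
# Illustrative estimate only -- not a PMetal benchmark.

def weight_footprint_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate memory needed just to hold the weights, in GB."""
    return n_params * bits_per_param / 8 / 1e9

for n_params, label in [(7e9, "7B"), (13e9, "13B"), (70e9, "70B")]:
    fp16 = weight_footprint_gb(n_params, 16)
    q4 = weight_footprint_gb(n_params, 4)
    print(f"{label}: fp16 ≈ {fp16:.1f} GB, 4-bit ≈ {q4:.1f} GB")
```

A 70B model is about 140 GB at fp16 and about 35 GB at 4-bit, which is why a 192GB Mac Studio can hold either, and why quantization brings mid-size models within reach of laptop-class memory.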
Fine-Tuning Methods: LoRA, QLoRA, DoRA, and GRPO
PMetal supports the full spectrum of modern parameter-efficient fine-tuning techniques, letting you choose the right trade-off between quality, speed, and memory usage for your specific use case.
LoRA (Low-Rank Adaptation)
The gold standard for efficient fine-tuning. LoRA injects small trainable rank decomposition matrices into the model's attention layers, allowing the base model weights to remain frozen while only a tiny fraction of parameters are updated. Ideal for task-specific adaptation with minimal compute.
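The core idea fits in a few lines of NumPy: the frozen weight `W` is augmented with a trainable low-rank product `B @ A`. Shapes, the `alpha / r` scaling, and the zero-init of `B` follow the original LoRA formulation; this is an illustrative sketch, not PMetal's implementation.

```python
import numpy as np

# LoRA sketch: frozen weight W plus a trainable low-rank update B @ A.
# Illustrative only -- shapes and names are not PMetal's API.

d_out, d_in, r, alpha = 512, 512, 8, 16

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable, rank r
B = np.zeros((d_out, r))                    # trainable, zero-init

def lora_forward(x):
    # y = W x + (alpha / r) * B A x  -- only A and B receive gradients
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
y = lora_forward(x)

full = W.size            # parameters updated by full fine-tuning
lora = A.size + B.size   # parameters updated by LoRA
print(f"trainable fraction: {lora / full:.2%}")
```

Because `B` starts at zero, the adapted model is exactly the base model at step one, and here only about 3% of the layer's parameters are trainable; at realistic model scale the fraction is far smaller still.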
QLoRA (Quantized LoRA)
LoRA applied to a 4-bit quantized base model. QLoRA dramatically reduces the memory footprint of fine-tuning, making it possible to train 7B and 13B parameter models on hardware with 16–24GB of unified memory. The quality-to-resource ratio is exceptional.
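That memory claim can be checked with rough arithmetic: only the frozen 4-bit base weights plus the small adapters (and the adapters' optimizer state) need to reside in memory during training. The estimate below is illustrative, excludes activations and KV cache, and is not a PMetal benchmark; the 20M-adapter figure is an assumed typical LoRA configuration.

```python
def qlora_budget_gb(n_params, lora_params, base_bits=4,
                    adapter_bytes=2, optimizer_bytes=8):
    """Rough QLoRA training footprint in GB: 4-bit frozen base
    + fp16 adapters + Adam-style optimizer state for the adapters.
    Activations excluded; illustrative arithmetic only."""
    base = n_params * base_bits / 8
    adapters = lora_params * (adapter_bytes + optimizer_bytes)
    return (base + adapters) / 1e9

# 7B base model with an assumed ~20M trainable LoRA parameters
print(f"{qlora_budget_gb(7e9, 20e6):.1f} GB")
```

Under these assumptions the weights-plus-adapters budget for a 7B model is under 4 GB, which leaves generous headroom for activations on a 16GB machine and explains why 13B models remain feasible in the 16–24GB range.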
DoRA (Weight-Decomposed Low-Rank Adaptation)
A refinement of LoRA that decomposes pre-trained weights into magnitude and direction components and applies the low-rank update only to the directional component. In its paper's evaluations, DoRA outperforms LoRA at equivalent trainable-parameter counts.
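The magnitude/direction split can be sketched directly: the pretrained weight is re-expressed as a per-column magnitude vector times a normalized direction, and the LoRA update lives only inside the directional term. This follows the DoRA paper's formulation; it is an illustration, not PMetal's code.

```python
import numpy as np

# DoRA sketch: W' = m * (W + B A) / ||W + B A||_c, where ||.||_c is the
# column-wise norm and m is initialized to ||W||_c.
# Illustrative only -- not PMetal's implementation.

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 64, 4

W = rng.standard_normal((d_out, d_in))        # frozen pretrained weight
m = np.linalg.norm(W, axis=0, keepdims=True)  # trainable magnitude, init ||W||_c
A = rng.standard_normal((r, d_in)) * 0.01     # trainable low-rank pair
B = np.zeros((d_out, r))

def dora_weight():
    V = W + B @ A                              # direction updated via LoRA
    return m * V / np.linalg.norm(V, axis=0, keepdims=True)
```

With `B` zero-initialized, the merged weight starts exactly at `W`; training then adjusts length (`m`) and orientation (`B @ A`) separately, which is the decoupling DoRA credits for its gains.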
GRPO (Group Relative Policy Optimization)
PMetal is one of the first local fine-tuning tools to support GRPO, the reinforcement learning technique used to train reasoning-capable models. GRPO lets you train models that reason step-by-step, improving performance on math, coding, and logic tasks without the separate critic (value) model that PPO-style training requires.
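The "group relative" part replaces a learned value baseline: for each prompt, a group of completions is sampled, and each completion's advantage is its reward standardized against the group's mean and standard deviation. A minimal sketch of that advantage computation (illustrative, not PMetal's trainer):

```python
import statistics

def grpo_advantages(rewards):
    """Group-relative advantages: standardize each completion's reward
    against its own group's mean and standard deviation.
    Sketch of the GRPO baseline, not PMetal's trainer."""
    mu = statistics.mean(rewards)
    sigma = statistics.pstdev(rewards) or 1.0  # guard: all-equal group
    return [(r - mu) / sigma for r in rewards]

# One prompt, a group of 4 sampled completions scored by a rule-based
# reward (e.g. 1.0 if the final answer is correct, else 0.0):
print(grpo_advantages([1.0, 0.0, 0.0, 1.0]))  # → [1.0, -1.0, -1.0, 1.0]
```

Completions that beat their group average are reinforced and the rest are penalized, so a simple programmatic reward (a math checker, a unit test) is enough to drive reasoning training.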
Four Ways to Work with PMetal
PMetal is designed to fit into your workflow, not the other way around. Whether you prefer a graphical interface, the terminal, or Python scripting, there is a first-class PMetal experience for you.
Desktop GUI
A native macOS application with a clean visual interface for loading datasets, configuring training runs, and monitoring progress in real time. Loss curves, memory utilization, and estimated completion times are all visible at a glance. No command-line experience required.
Terminal UI (TUI)
A rich terminal interface for those who live in the command line. Keyboard-driven navigation, side-by-side training monitors, and full feature parity with the desktop GUI — all without leaving your terminal session.
CLI
A composable command-line interface designed for scripting and automation. Chain PMetal commands in shell scripts, integrate with Makefiles, or trigger fine-tuning jobs as part of a larger data pipeline without writing Python.
Python SDK
A full-featured Python library for integrating PMetal into your existing ML workflows. Programmatically configure training jobs, sweep hyperparameters, load checkpoints, and export adapters — all from within your Jupyter notebooks or training scripts.
On-Premises Means You Own Your AI
When a fine-tuned model runs in the cloud, the business relationship between you and your AI provider shapes what you can do with it. Rate limits, terms of service changes, pricing adjustments, outages, and provider decisions are outside your control. Your AI is on someone else's infrastructure, subject to someone else's policies.
PMetal takes a different philosophy: the model you train is yours, full stop. The weights live on your hardware. The training data stays in your control. There is no usage metering, no per-inference cost, and no third party that can revoke access. The only ongoing cost is electricity.
For organizations with sensitive workloads — healthcare, legal, finance, defense, or any domain with stringent compliance requirements — this is not just a nice-to-have. It is often the only acceptable path to deploying AI on private data.
Data Sovereignty by Design
PMetal has no telemetry, no cloud sync, and no requirement for an internet connection after installation. Your training runs, datasets, model checkpoints, and exported adapters are entirely local. Compliance auditors will find nothing to flag because there is nothing leaving the machine.
Get Started with PMetal
PMetal runs on any Apple Silicon Mac — M1, M2, M3, or M4, across the full lineup from MacBook Air to Mac Pro. Models from 1B to 70B+ parameters are supported depending on available unified memory. The recommended minimum is 16GB unified memory; 32GB or more unlocks the full range of capabilities.
The project is open source and available on GitHub. Full documentation, quickstart guides, example datasets, and a community forum are available at the PMetal product page.
Fine-Tuning That Respects Your Data
PMetal is the result of a straightforward belief: the hardware sitting on your desk is already powerful enough to fine-tune world-class language models. The only thing that was missing was software built specifically for it. Now it exists.