From the deadwax

How We Wrote a FLAC Codec in Pure Swift

engineering · flac · audio · swift

Apple ships a FLAC decoder in Core Audio. It can play FLAC files. There is no encoder. If your app needs to write FLAC, Apple's frameworks won't help.

So if you build a macOS app that needs to write FLAC files, your options are: bundle a C library like libFLAC, vendor FFmpeg, or link against a framework like SFBAudioEngine that wraps one of those C libraries plus its 26 open-source dependencies.

We went a different direction. We wrote a complete FLAC codec from scratch in Swift.

Why it matters that there are no dependencies

Private Press touches your audio files. It writes artwork into their metadata containers. For the ingest pipeline (coming post-1.0), it will also transcode between formats. The code that does this needs to be code we wrote, code we test, and code we can audit.

A vendored C library is someone else's code compiled for a generic target. It has its own memory management, its own buffer handling, its own release cycle. A regression in a dependency update could corrupt a file. A buffer overflow in a C string parser could overwrite audio data. These aren't hypothetical risks. They're the reason security-conscious projects minimize their dependency trees.

Private Press has zero third-party dependencies: 37 Apple frameworks for UI, networking, and system integration, and nothing else. Everything that touches your files is Keynell code. The FLAC codec is the most complex piece of that commitment.

What a FLAC encoder actually does

FLAC is a lossless compression format. The audio goes in, gets compressed, and comes back out bit-identical. No quality loss, ever. The compression works by predicting each audio sample from the samples before it, then encoding only the prediction error (the residual). Good predictions produce small residuals. Small residuals compress well.

The encoder has four layers:

Primitives. Bit-level I/O, CRC checksums, and Rice entropy coding. FLAC operates at the bit level, not the byte level. A single sample's residual might be 3 bits. The next might be 7. The bit reader and writer handle this, plus the CRC-8 and CRC-16 checksums that protect every frame against corruption.
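
For a sense of what that looks like in code, here is a sketch of Rice-coding a single residual. The [Bool] bit buffer is a stand-in for a real bit writer (which packs into bytes), and the function name is ours for illustration, not the codec's API.

```swift
// Rice-code one residual: zigzag-fold the sign, write the quotient in unary
// (that many 0 bits, then a 1), then the k low-order remainder bits.
func appendRice(_ residual: Int32, parameter k: Int, to bits: inout [Bool]) {
    // Fold signed to unsigned: 0, -1, 1, -2, 2, ... -> 0, 1, 2, 3, 4, ...
    let folded = UInt32(bitPattern: (residual << 1) ^ (residual >> 31))
    let quotient = Int(folded >> k)
    bits.append(contentsOf: repeatElement(false, count: quotient))   // unary quotient
    bits.append(true)                                                // terminator
    for shift in stride(from: k - 1, through: 0, by: -1) {           // remainder, MSB first
        bits.append((folded >> shift) & 1 == 1)
    }
}
```

A residual of 3 with k = 2 folds to 6 and comes out as just four bits (0110). Small residuals stay that short, which is the whole point of getting the prediction right first.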

Prediction. Two prediction strategies. Fixed prediction uses hardcoded polynomial predictors (orders 0 through 4). Linear Predictive Coding (LPC) computes custom coefficients from the signal's autocorrelation, fitting a filter to the specific audio content of each block. LPC is where the compression happens. A well-fit LPC filter of order 12 can reduce a block's residual energy by orders of magnitude compared to the fixed predictors.
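
The fixed predictors are easiest to see in code. Here is a sketch of the order-2 case, which assumes the signal is locally a straight line and keeps only the error (in the real format, the first `order` samples of each block are stored verbatim as warm-up; wrapping operators are used here just to keep the illustration from trapping on extreme inputs):

```swift
// Order-2 fixed prediction: predict x[n] = 2*x[n-1] - x[n-2], keep the error.
func order2Residuals(_ x: [Int32]) -> [Int32] {
    guard x.count > 2 else { return [] }
    return (2..<x.count).map { n in
        x[n] &- 2 &* x[n - 1] &+ x[n - 2]   // residual = sample - prediction
    }
}
```

Feed it a ramp (1, 2, 3, 4, …) and every residual is zero, the best possible input for the Rice coder above. LPC does the same thing with coefficients fitted to the block instead of hardcoded ones.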

Framing. Audio is divided into frames, each containing one block of samples per channel. Stereo files get an additional optimization: channel decorrelation. Instead of encoding left and right independently, the encoder can encode left + difference, right + difference, or mid + side. Whichever representation produces the smallest residuals wins.
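
Mid/side looks lossy at first glance because the averaging drops a bit, but left + right and left − right always have the same parity, so that bit is recoverable from the side channel. A rough sketch of the round trip (our own helper names, wrapping operators for illustration):

```swift
// Encode: mid is the floored average, side is the difference.
func toMidSide(left: Int32, right: Int32) -> (mid: Int32, side: Int32) {
    ((left &+ right) >> 1, left &- right)
}

// Decode: restore the bit lost by >> 1 from the side channel's parity.
func fromMidSide(mid: Int32, side: Int32) -> (left: Int32, right: Int32) {
    let sum = (mid << 1) | (side & 1)   // left + right, exactly
    let left = (sum &+ side) >> 1       // ((L + R) + (L - R)) / 2
    return (left, left &- side)
}
```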

Stream. The top layer orchestrates everything: metadata blocks (STREAMINFO, PICTURE, VORBIS_COMMENT, SEEKTABLE), frame sequencing, and MD5 verification of the complete decoded output.
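
For reference, these are the block types FLAC defines, with the type codes they carry on disk (the enum is ours; the codes are the spec's). A stream is the "fLaC" marker, a STREAMINFO block, any further metadata blocks, then the audio frames.

```swift
// FLAC metadata block types and their on-disk type codes (per the FLAC spec).
enum MetadataBlockType: UInt8 {
    case streamInfo    = 0
    case padding       = 1
    case application   = 2
    case seekTable     = 3
    case vorbisComment = 4   // tags
    case cueSheet      = 5
    case picture       = 6   // embedded artwork
}
```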

Where the hardware comes in

The LPC stage is compute-intensive. For each block of audio, the encoder needs to compute the autocorrelation of the signal, solve the Levinson-Durbin recursion for optimal coefficients, then apply those coefficients as a FIR filter across every sample in the block. Multiply that by thousands of frames in a typical album.
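
The Levinson-Durbin step is short but easy to get subtly wrong. Here is a rough sketch of the recursion, taking the autocorrelation values r[0...order] and returning predictor coefficients; a real encoder goes on to quantize these to integers for the bitstream and guards the degenerate cases (a silent block, for instance) that this sketch ignores.

```swift
// Levinson-Durbin: from autocorrelation r[0...order], compute LPC coefficients
// a[1...order] minimizing the prediction error, in O(order^2) time.
func levinsonDurbin(_ r: [Double], order: Int) -> [Double] {
    precondition(r.count > order && r[0] > 0)
    var a = [Double](repeating: 0, count: order + 1)   // a[0] unused
    var error = r[0]
    for i in 1...order {
        var acc = r[i]
        for j in 1..<i { acc -= a[j] * r[i - j] }
        let k = acc / error                  // reflection coefficient for order i
        var next = a
        next[i] = k
        for j in 1..<i { next[j] = a[j] - k * a[i - j] }
        a = next
        error *= 1 - k * k                   // remaining prediction error
    }
    return Array(a[1...])                    // predict x[n] ≈ Σ a[j] · x[n - j]
}
```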

We use Apple's Accelerate framework for the hot paths. vDSP.convolve() computes the autocorrelation and applies the LPC filter, each as a single vectorized call. On Apple Silicon, this dispatches to the AMX coprocessor automatically. No explicit SIMD, no GPU kernels, no configuration. Just native Swift calling Accelerate, running on hardware designed for exactly this kind of math.
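
To show roughly where Accelerate enters, here is a cut-down autocorrelation that leans on vDSP's vectorized dot product. It is illustrative only, not the production hot path (which, as described above, goes through vDSP.convolve()), and the function name is ours.

```swift
import Accelerate

// Autocorrelation lags 0...maxLag of one block, each lag as one vectorized call.
func autocorrelation(_ block: [Float], maxLag: Int) -> [Float] {
    precondition(maxLag < block.count)
    return block.withUnsafeBufferPointer { x in
        (0...maxLag).map { lag -> Float in
            var r: Float = 0
            // r[lag] = Σ x[i] · x[i + lag]
            vDSP_dotpr(x.baseAddress!, 1,
                       x.baseAddress! + lag, 1,
                       &r, vDSP_Length(block.count - lag))
            return r
        }
    }
}
```

Those values feed straight into the Levinson-Durbin sketch above.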

A C library like libFLAC doesn't use the AMX coprocessor. It brings its own math, compiled for a generic x86 or ARM target, running on the CPU while dedicated acceleration hardware sits idle.

Frame-parallel encoding

FLAC frames encode independently, which makes parallelism possible in principle. The reference encoder added multithreading in version 1.5 (February 2025), and earlier academic work explored the idea. Our encoder uses Swift's structured concurrency (TaskGroup) to process up to 4 frames concurrently. Seek points are assembled after all frames complete. The result is the same valid FLAC file, produced faster on modern hardware.
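
Stripped down, the pattern looks roughly like this: keep a window of four frames in flight, and store results by frame index so the output order never depends on completion order. encodeFrame here is a placeholder for the real per-frame encoder, not the shipped API.

```swift
import Foundation

// Frame-parallel encoding sketch: at most `width` frames encode concurrently;
// each result lands in its slot by index, so the stream stays in frame order.
func encodeFrames(
    _ blocks: [[Int32]],
    width: Int = 4,
    encodeFrame: @escaping @Sendable (_ block: [Int32], _ index: Int) async throws -> Data
) async throws -> [Data] {
    let encoded = try await withThrowingTaskGroup(of: (Int, Data).self) { group -> [Data?] in
        var results = [Data?](repeating: nil, count: blocks.count)
        var next = 0

        // Prime the window with the first `width` frames.
        while next < min(width, blocks.count) {
            let i = next
            group.addTask {
                let frame = try await encodeFrame(blocks[i], i)
                return (i, frame)
            }
            next += 1
        }

        // Each completed frame frees a slot for the next one.
        while let (i, frame) = try await group.next() {
            results[i] = frame
            if next < blocks.count {
                let j = next
                group.addTask {
                    let frame = try await encodeFrame(blocks[j], j)
                    return (j, frame)
                }
                next += 1
            }
        }
        return results
    }
    return encoded.compactMap { $0 }
}
```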

This matters for batch operations. When you're processing a large collection, the difference between sequential and parallel encoding across hundreds of albums adds up.

What we verified

The codec passes round-trip tests at every layer. Encode audio to FLAC, decode it back, compare sample-by-sample. Bit-identical. This is tested across bit depths (16, 24, 32), sample rates (44.1kHz through 192kHz), channel counts (mono through 8-channel), and block sizes.
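
In test form, the property reads roughly like the sketch below. FLACEncoder and FLACDecoder are placeholder names for illustration, not the shipped API; the point is the assertion, equality with no tolerance.

```swift
import XCTest

// Round-trip property sketch. `FLACEncoder` / `FLACDecoder` are hypothetical
// stand-ins for the codec's entry points.
final class RoundTripTests: XCTestCase {
    func testEncodeDecodeIsBitIdentical() throws {
        // One block of 16-bit mono noise; the real suite sweeps bit depths,
        // sample rates, channel counts, and block sizes.
        let samples = (0..<4096).map { _ in Int32.random(in: -32768...32767) }

        let flac = try FLACEncoder(sampleRate: 44_100, bitsPerSample: 16, channels: 1)
            .encode(channels: [samples])
        let decoded = try FLACDecoder().decode(flac)

        // Lossless means equal, not close: no tolerance, sample by sample.
        XCTAssertEqual(decoded.channels[0], samples)
    }
}
```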

The metadata round-trip is equally strict. Every metadata block type (STREAMINFO, VORBIS_COMMENT, PICTURE, SEEKTABLE, APPLICATION, CUESHEET) is parsed, preserved, and rewritten without loss. Your existing FLAC tags, artwork, and cue sheets survive intact.

What this means for your collection

If you have FLAC files, the codec that handles them in Private Press is code we wrote and test. It handles hi-res audio up to 32-bit/192kHz without breaking a sweat. It preserves every metadata block. It runs on the hardware Apple built for signal processing.

If the ingest pipeline needs to transcode to FLAC, it uses the same codec. No shelling out to a command-line tool. No temporary files piped through FFmpeg. A direct, in-process encode with the same atomic write guarantees as every other file operation in Private Press.

Zero dependencies isn't a marketing line. It's a security posture and a commitment to quality at every layer. Your files are touched only by code that exists to serve your collection. Nothing else.