Dropping Unused Attributes to Reduce Tile Size

Dropping unused attributes to reduce tile size means explicitly filtering GeoJSON or database properties before or during vector tile encoding. In production pipelines this is usually done by handing the tile encoder a strict property allowlist, typically through Tippecanoe's -y (--include) option, so that non-essential keys never reach the Protocol Buffers payload. Removing redundant metadata can shrink .pbf files by 15–60%, depending on how attribute-heavy the source data is, which lowers CDN egress costs, speeds up tile fetches, and reduces browser memory overhead during WebGL rendering.

How MVT Encoding Amplifies Attribute Bloat

Vector tiles bundle geometry, property keys, and property values into a compact binary format, and every unused attribute still takes up space in that container. Within each tile, every layer carries its own keys table and values table: each distinct key and each distinct value is stored once per layer, and features refer to them by integer index. If your source data contains 40 properties but your frontend only uses 6, the remaining 34 keys still populate the keys table, their values still populate the values table, and every feature carries an extra pair of indices for each unused property. High-cardinality strings such as names, IDs, and timestamps deduplicate poorly, so at scale this overhead can cancel out the savings from geometry simplification.

The Mapbox Vector Tile Specification defines this dictionary-based encoding explicitly: each layer stores a keys array and a values array, and each feature's tags field lists alternating key and value indices into them. Because the encoder cannot predict which properties your application will query at runtime, it defaults to preserving everything. Implementing strict Attribute Filtering Rules early in your workflow prevents bloat from propagating into your tile cache and ensures downstream style expressions only reference guaranteed properties.
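To make the cost concrete, here is a toy model of a layer's attribute tables in plain Python. It is not a real encoder and uses no protobuf library; the function name encode_layer and the sample features are hypothetical, and the structure (a keys list, a values list, and per-feature tags of alternating indices) follows the specification only in outline.

```python
from typing import Any, Optional


def encode_layer(features: list[dict[str, Any]], keep: Optional[set[str]] = None):
    """Toy model of an MVT layer's attribute tables (not a real encoder).

    Returns (keys, values, tags), where tags holds one list per feature of
    alternating key-index / value-index pairs.
    """
    keys: list[str] = []
    values: list[Any] = []
    key_index: dict[str, int] = {}
    value_index: dict[tuple[type, Any], int] = {}
    tags: list[list[int]] = []

    for props in features:
        feature_tags: list[int] = []
        for key, value in props.items():
            if keep is not None and key not in keep:
                continue  # dropped attributes never reach either table
            if key not in key_index:
                key_index[key] = len(keys)
                keys.append(key)
            typed = (type(value), value)  # keeps 1 and True as distinct values
            if typed not in value_index:
                value_index[typed] = len(values)
                values.append(value)
            feature_tags += [key_index[key], value_index[typed]]
        tags.append(feature_tags)
    return keys, values, tags


features = [
    {"name": "Main St", "highway": "primary", "osm_id": 101, "editor": "alice"},
    {"name": "Oak Ave", "highway": "primary", "osm_id": 102, "editor": "bob"},
]

full = encode_layer(features)
slim = encode_layer(features, keep={"name", "highway"})
print(len(full[0]), len(full[1]))  # 4 7  (keys, values without filtering)
print(len(slim[0]), len(slim[1]))  # 2 3  (keys, values with the allowlist)
```

Note that the shared value "primary" is stored once, while the unique osm_id and editor values each add an entry. That is why identifier-like attributes tend to be the most expensive ones to keep.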

Core Implementation: Tippecanoe Attribute Filtering

Tippecanoe filters attributes through three documented options: -y name (--include=name) keeps only the named attributes, -x name (--exclude=name) drops the named attributes, and -X (--exclude-all) drops every attribute and encodes geometry only. Each -y or -x takes a single attribute name, so repeat the flag once per key instead of comma-separating names. These options apply to every feature in the run rather than to individual layers, and if none is given, all attributes are preserved, which makes explicit configuration mandatory for size reduction. Filtering happens during encoding, so it doesn't alter your source datasets; dropped keys simply never enter the MVT structure.

Because the allowlist usually differs from layer to layer, it is convenient to keep it in a small JSON file under version control, keyed by layer name. This file is a pipeline convention, not a Tippecanoe input format: a wrapper either expands it into -y flags or uses it to strip properties from each layer's source before encoding.

The JSON follows this pattern:

```json
{
  "roads": ["name", "highway", "surface", "maxspeed"],
  "buildings": ["height", "building", "levels"],
  "landuse": ["class", "name"]
}
```

With an allowlist applied, the encoded tiles shrink in two ways:

  1. Key Dictionary Pruning: Only allowlisted keys are written to the layer's keys table. Unused keys never receive an index.
  2. Value Table Pruning: Values that belong only to dropped keys are never added to the layer's values table, saving significant bytes across millions of features.

Refer to the official Tippecanoe documentation for current CLI syntax and the full list of attribute options.
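When different layers need different allowlists, one option is to strip properties from each layer's GeoJSON before it reaches the encoder. The sketch below assumes each layer lives in its own FeatureCollection; the function name filter_properties and the sample data are hypothetical.

```python
import json
from typing import Any


def filter_properties(collection: dict[str, Any], allowed: list[str]) -> dict[str, Any]:
    """Return a copy of a GeoJSON FeatureCollection keeping only allowed property keys."""
    allowed_set = set(allowed)
    slim_features = []
    for feature in collection.get("features", []):
        props = feature.get("properties") or {}  # "properties" may be null in GeoJSON
        slim = dict(feature)  # shallow copy so the input is left untouched
        slim["properties"] = {k: v for k, v in props.items() if k in allowed_set}
        slim_features.append(slim)
    result = dict(collection)
    result["features"] = slim_features
    return result


roads = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "geometry": {"type": "LineString", "coordinates": [[0, 0], [1, 1]]},
        "properties": {"name": "Main St", "highway": "primary", "osm_id": 101},
    }],
}

allowlist = {"roads": ["name", "highway", "surface", "maxspeed"]}
slim_roads = filter_properties(roads, allowlist["roads"])
print(json.dumps(slim_roads["features"][0]["properties"]))
# {"name": "Main St", "highway": "primary"}
```

Writing each filtered collection to a file named after its layer fits Tippecanoe's default behavior of deriving layer names from input file names, so the per-layer structure of the allowlist carries through to the tileset.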

Production Python Pipeline

The following script sketches an automation wrapper you can adapt for production use. It flattens the per-layer allowlist into --include flags, executes Tippecanoe with strict attribute pruning, fails loudly if encoding does not succeed, and reports the output size.

```python
import logging
import subprocess
from pathlib import Path

logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")

def build_tile_with_filtered_attributes(
    input_geojson: Path,
    output_mbtiles: Path,
    attribute_filter: dict[str, list[str]],
    max_zoom: int = 14,
    min_zoom: int = 0,
) -> None:
    """Encode vector tiles while dropping unused attributes to reduce tile size."""
    if not input_geojson.exists():
        raise FileNotFoundError(f"Source GeoJSON not found: {input_geojson}")

    # 1. Flatten the per-layer allowlist into one set of keys. Tippecanoe's
    #    --include applies to every layer in a run, so the union is kept.
    allowed_keys = sorted({key for keys in attribute_filter.values() for key in keys})
    if not allowed_keys:
        # Without any --include flag Tippecanoe would keep every attribute.
        raise ValueError("attribute_filter must name at least one attribute to keep")

    # 2. Construct Tippecanoe command with one --include flag per allowed key
    output_mbtiles.parent.mkdir(parents=True, exist_ok=True)
    cmd = [
        "tippecanoe",
        "--output", str(output_mbtiles),
        *[f"--include={key}" for key in allowed_keys],
        "--maximum-zoom", str(max_zoom),
        "--minimum-zoom", str(min_zoom),
        "--drop-densest-as-needed",
        "--force",
        str(input_geojson),
    ]

    logging.info("Encoding tiles with strict attribute filtering...")
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)

    if result.returncode != 0:
        logging.error("Tippecanoe failed:\n%s", result.stderr)
        raise RuntimeError("Tile encoding failed. Check logs for details.")

    # 3. Report sizes. GeoJSON text and MBTiles are different formats, so this
    #    is a sanity check only; to measure what filtering saved, compare
    #    against an unfiltered build of the same source.
    source_size = input_geojson.stat().st_size
    tile_size = output_mbtiles.stat().st_size
    logging.info(
        "Encoding complete. Source GeoJSON: %.2f MB | MBTiles: %.2f MB",
        source_size / 1_048_576, tile_size / 1_048_576,
    )

if __name__ == "__main__":
    # Example usage
    build_tile_with_filtered_attributes(
        input_geojson=Path("data/osm_extracts.geojson"),
        output_mbtiles=Path("dist/filtered_tiles.mbtiles"),
        attribute_filter={
            "roads": ["name", "highway", "surface", "maxspeed", "oneway"],
            "buildings": ["height", "building", "levels"],
            "water": ["name", "waterway"],
        },
        max_zoom=15
    )
```
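Comparing an MBTiles file with its source GeoJSON says little about what attribute filtering saved, because the two formats differ in almost every respect. A fairer measure is to build the tileset twice, once with and once without the allowlist, and compare the stored tile bytes. The sketch below assumes both files follow the MBTiles layout with a tiles table (or view) exposing a tile_data column; the helper names are hypothetical.

```python
import sqlite3
from contextlib import closing


def total_tile_bytes(mbtiles_path: str) -> int:
    """Sum the stored (typically gzip-compressed) size of every tile."""
    with closing(sqlite3.connect(mbtiles_path)) as conn:
        (total,) = conn.execute(
            "SELECT COALESCE(SUM(LENGTH(tile_data)), 0) FROM tiles"
        ).fetchone()
    return int(total)


def percent_reduction(baseline_bytes: int, filtered_bytes: int) -> float:
    """Percentage by which the filtered build is smaller than the baseline."""
    if baseline_bytes <= 0:
        raise ValueError("baseline must contain at least one byte of tile data")
    return (baseline_bytes - filtered_bytes) * 100 / baseline_bytes


# Example with made-up byte counts; in practice pass total_tile_bytes(...)
# for an unfiltered and a filtered build of the same source.
print(f"{percent_reduction(1000, 600):.1f}%")
```

Summing tile payloads instead of using the file size avoids counting SQLite page overhead and metadata, which do not change with the allowlist.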

Validation & Performance Trade-offs

Attribute pruning delivers measurable gains, but requires validation to prevent rendering regressions:

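  • Verify the Output Schema: Run tippecanoe-decode --stats on the output to inspect per-tile sizes and per-layer feature counts, and check the vector_layers and tilestats entries in the MBTiles metadata. Confirm dropped keys no longer appear in any layer's field list.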
  • Network Waterfall Analysis: Compare tile fetch times before and after filtering. A 30–50% reduction in .pbf size mainly shortens content download and decompression time; Time to First Byte (TTFB) depends on server and network latency and is largely unaffected by payload size.
  • WebGL Memory Profiling: Fewer properties mean smaller feature objects in MapLibre GL JS or Mapbox GL JS. Monitor heap allocation in browser dev tools; aggressive filtering can cut per-tile memory by 20–40%, though the figure depends on your data and styles.
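  • Style Expression Safety: Ensure your frontend style layers never reference dropped keys. Guard optional properties with ["has", "key"], or supply a fallback through ["coalesce", ["get", "key"], fallback]; a ["get", ...] on a missing property evaluates to null, which can cause silent rendering failures.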
  • Dynamic Data Handling: If your pipeline ingests evolving datasets, version your filter JSON alongside your tile generation jobs. Integrate this logic into Automated Generation Pipelines with Tippecanoe to enforce schema consistency across CI/CD runs.
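Two of the checks above lend themselves to a CI step. The sketch below reads the attribute names recorded in an MBTiles file's metadata and collects the property keys a style references, so both can be compared against the allowlist. It assumes the metadata table has a json row containing vector_layers with a fields object per layer, and it only recognizes ["get", key] and ["has", key] expressions; legacy filters, property functions, and {token} text fields are not covered. The function names are hypothetical.

```python
import json
import sqlite3
from contextlib import closing
from typing import Any


def mbtiles_fields(path: str) -> dict[str, set[str]]:
    """Map each layer id to the attribute names listed in the MBTiles metadata."""
    with closing(sqlite3.connect(path)) as conn:
        row = conn.execute(
            "SELECT value FROM metadata WHERE name = 'json'"
        ).fetchone()
    if row is None:
        return {}
    layers = json.loads(row[0]).get("vector_layers", [])
    return {layer["id"]: set(layer.get("fields", {})) for layer in layers}


def referenced_keys(node: Any) -> set[str]:
    """Collect keys used by ["get", key] and ["has", key] in a style fragment."""
    found: set[str] = set()
    if isinstance(node, list):
        if len(node) >= 2 and node[0] in ("get", "has") and isinstance(node[1], str):
            found.add(node[1])
        for child in node:
            found |= referenced_keys(child)
    elif isinstance(node, dict):
        for child in node.values():
            found |= referenced_keys(child)
    return found


style_layer = {
    "id": "road-labels",
    "source-layer": "roads",
    "layout": {"text-field": ["coalesce", ["get", "name"], ["get", "ref"]]},
    "paint": {"text-color": ["case", ["has", "surface"], "#333", "#999"]},
}
allowed = {"name", "highway", "surface", "maxspeed"}
missing = referenced_keys(style_layer) - allowed
print(sorted(missing))  # ['ref']
```

Failing the build when missing is non-empty, or when mbtiles_fields reports a key outside the allowlist, catches schema drift before tiles reach the cache.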

Dropping unused attributes to reduce tile size is not a one-time optimization; it is a continuous constraint that keeps your mapping infrastructure lean, cost-effective, and responsive at scale.