Reducing storage costs in S3 by auto-compressing PDF uploads

Trigger a Lambda when a PDF is uploaded to S3, compress it with the PDF Squeezer API, and store the smaller file. Smaller files cost less to transfer, and they cost less to store once the original is deleted or moved to a cheaper storage class.

Architecture

S3 → Event notification (ObjectCreated) → Lambda → PDF Squeezer API → write compressed PDF back to S3 (e.g. a compressed/ prefix). Optionally delete or archive the original to avoid storing both.
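The optional clean-up step could look like the sketch below. It assumes the compressed copy has already been written. The function name retire_original, the archive/ prefix, and the GLACIER storage class are illustrative choices, not part of the setup described here. The S3 client is passed in as an argument so the logic can be exercised without AWS access.

```python
def retire_original(s3, bucket, key, mode="archive", archive_prefix="archive/"):
    """Delete the original object, or move it under archive_prefix in a colder storage class.

    `s3` is a boto3 S3 client (or any object exposing copy_object and delete_object).
    Returns the archive key, or None when the original was simply deleted.
    """
    if mode not in ("archive", "delete"):
        raise ValueError(f"unknown mode: {mode!r}")
    archive_key = None
    if mode == "archive":
        archive_key = f"{archive_prefix}{key}"
        # Server-side copy into the archive prefix with a cheaper storage class.
        s3.copy_object(
            Bucket=bucket,
            Key=archive_key,
            CopySource={"Bucket": bucket, "Key": key},
            StorageClass="GLACIER",
        )
    # Remove the original so it is no longer stored at the standard rate.
    s3.delete_object(Bucket=bucket, Key=key)
    return archive_key
```

Because the archive copy is itself a newly created object, the trigger should be limited to the upload prefix so the copy does not start another compression run.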

Lambda handler (Python)

On each S3 put, get the object, call the API, then put the result.

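import os
import urllib.parse

import boto3
import requests  # not part of the Lambda Python runtime; bundle it in the deployment package or a layer

s3 = boto3.client('s3')
API_KEY = os.environ['PDF_SQUEEZER_KEY']
API_URL = 'https://api.pdfsqueezer.io/v1/compress'
OUT_PREFIX = 'compressed/'

def handler(event, context):
    for record in event.get('Records', []):
        bucket = record['s3']['bucket']['name']
        # S3 URL-encodes object keys in event notifications (a space arrives as '+').
        key = urllib.parse.unquote_plus(record['s3']['object']['key'])
        # Skip non-PDFs, and skip our own output so the function does not retrigger itself.
        if not key.lower().endswith('.pdf') or key.startswith(OUT_PREFIX):
            continue
        obj = s3.get_object(Bucket=bucket, Key=key)
        pdf_bytes = obj['Body'].read()
        files = {'file': (os.path.basename(key), pdf_bytes, 'application/pdf')}
        headers = {'Authorization': f'Bearer {API_KEY}'}
        # Keep the request timeout below the function's configured timeout.
        r = requests.post(API_URL, files=files, headers=headers,
                          params={'stripMetadata': True}, timeout=30)
        if r.status_code != 200:
            # The error body may not be JSON, so report the raw text.
            raise RuntimeError(f'Compression failed for {key}: {r.status_code} {r.text}')
        out_key = f"{OUT_PREFIX}{key}"
        s3.put_object(Bucket=bucket, Key=out_key, Body=r.content, ContentType='application/pdf')
    return {'statusCode': 200}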
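Two details of the event payload are easy to miss: S3 URL-encodes object keys in notifications, and an object written back to the same bucket produces its own ObjectCreated event. The sketch below isolates the key handling so it can be checked locally without AWS. The function name plan_output_key and the sample event are illustrative, and the compressed/ prefix follows the layout described in the architecture section.

```python
from urllib.parse import unquote_plus

OUT_PREFIX = "compressed/"

def plan_output_key(raw_key, out_prefix=OUT_PREFIX):
    """Return the destination key for an uploaded PDF, or None if the object should be skipped."""
    key = unquote_plus(raw_key)  # event keys are URL-encoded; '+' stands for a space
    if not key.lower().endswith(".pdf"):
        return None  # not a PDF
    if key.startswith(out_prefix):
        return None  # already compressed output; processing it again would loop
    return f"{out_prefix}{key}"

# Trimmed-down shape of an S3 notification; real events carry more fields.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "my-bucket"}, "object": {"key": "uploads/Q1+report.pdf"}}},
        {"s3": {"bucket": {"name": "my-bucket"}, "object": {"key": "compressed/uploads/old.pdf"}}},
        {"s3": {"bucket": {"name": "my-bucket"}, "object": {"key": "uploads/notes.txt"}}},
    ]
}

planned = [plan_output_key(r["s3"]["object"]["key"]) for r in sample_event["Records"]]
print(planned)  # ['compressed/uploads/Q1 report.pdf', None, None]
```

Only the first record yields an output key; the second is the function's own output and the third is not a PDF.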

S3 event configuration

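Add an S3 event notification on the bucket with event type s3:ObjectCreated:* and your Lambda as the destination. If the compressed file is written back to the same bucket, set a prefix filter (e.g. uploads/) that excludes compressed/; otherwise the function is invoked again for its own output. The Lambda's execution role needs s3:GetObject and s3:PutObject on the bucket, and the API key must be set in the PDF_SQUEEZER_KEY environment variable.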
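The notification can also be set from code. The following is a minimal sketch using boto3's put_bucket_notification_configuration. The function names, the placeholder ARN, and the uploads/ prefix are assumptions for illustration, and it presumes S3 has already been granted permission to invoke the function. Note that this call replaces the bucket's entire notification configuration, so any existing rules would need to be included as well.

```python
def build_notification_config(lambda_arn, prefix="uploads/", suffix=".pdf"):
    """Build the NotificationConfiguration dict for a PDF-only upload trigger."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {
                        "FilterRules": [
                            {"Name": "prefix", "Value": prefix},
                            {"Name": "suffix", "Value": suffix},
                        ]
                    }
                },
            }
        ]
    }

def apply_notification_config(bucket, lambda_arn):
    """Send the configuration to S3. Requires AWS credentials and boto3."""
    import boto3  # imported here so the builder above works without boto3 installed

    s3 = boto3.client("s3")
    s3.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=build_notification_config(lambda_arn),
    )

# Placeholder ARN for illustration only.
config = build_notification_config(
    "arn:aws:lambda:us-east-1:123456789012:function:compress-pdf"
)
```

The suffix filter matches the key literally, so uploads ending in .PDF would not trigger the function under this rule; drop the suffix rule and rely on the handler's own check if mixed-case extensions are expected.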
