FLUX API Integration Guide

API Endpoints Overview

Primary Global Endpoint

api.bfl.ai - Primary Endpoint

Routes requests across all available clusters globally
Provides automatic failover between clusters for enhanced uptime
Intelligent load distribution prevents bottlenecks during high traffic periods
Important: Always use the polling_url returned in responses when using this endpoint
Suitable for: Standard inference
Not suitable for: Finetuning operations and finetuned model inference workloads (finetuning remains region-specific)

Regional Endpoints

api.eu.bfl.ai - European Multi-cluster Endpoint

Multi-cluster routing limited to EU regions
GDPR compliant
Provides the same uptime and load balancing benefits within EU regions

api.us.bfl.ai - US Multi-cluster Endpoint

Multi-cluster routing limited to US regions
Provides the same uptime and load balancing benefits within US regions

Legacy Regional Endpoints

api.eu1.bfl.ai - EU Single-cluster Endpoint

Required for finetuning operations in EU region
Single cluster, no automatic failover

api.us1.bfl.ai - US Single-cluster Endpoint

Required for finetuning operations in US region
Single cluster, no automatic failover
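
The endpoint rules above can be condensed into a small lookup. The following is a minimal sketch, not part of any official SDK: the function name select_base_url and its parameters are hypothetical, and it assumes the only inputs that matter are a regional preference and whether the workload involves finetuning.

```python
from typing import Optional

# Hostnames taken from the endpoint list above.
GLOBAL_HOST = "api.bfl.ai"
REGIONAL_HOSTS = {"eu": "api.eu.bfl.ai", "us": "api.us.bfl.ai"}
SINGLE_CLUSTER_HOSTS = {"eu": "api.eu1.bfl.ai", "us": "api.us1.bfl.ai"}


def select_base_url(region: Optional[str] = None, finetuning: bool = False) -> str:
    """Return a base URL for a workload (hypothetical helper).

    region: "eu", "us", or None for no regional preference.
    finetuning: True for finetuning or finetuned-model inference.
    """
    if finetuning:
        # Finetuning is region-specific, so the region must be explicit.
        if region not in SINGLE_CLUSTER_HOSTS:
            raise ValueError("finetuning requires region 'eu' or 'us'")
        return f"https://{SINGLE_CLUSTER_HOSTS[region]}"
    if region is None:
        # No preference: use the global multi-cluster endpoint.
        return f"https://{GLOBAL_HOST}"
    if region not in REGIONAL_HOSTS:
        raise ValueError(f"unknown region: {region!r}")
    return f"https://{REGIONAL_HOSTS[region]}"
```

Keeping the mapping in one place makes it harder to send a finetuning request to a multi-cluster endpoint by accident.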

Key Benefits of New Endpoints

Enhanced Reliability

Reduced downtime through automatic cluster failover

Better Performance

Intelligent traffic distribution prevents overload during peak usage

Seamless Experience

Load balancing happens transparently on our end

Polling URL Usage

When using the primary global endpoint (api.bfl.ai) or regional endpoints (api.eu.bfl.ai, api.us.bfl.ai), you must use the polling_url returned in the initial request response.

Webhook Users: If you’re using webhooks to receive results, no changes are needed. The polling_url requirement only applies when implementing async polling behavior to check request status.

Example Implementation

import requests
import time
import os

# Submit request to global endpoint
response = requests.post(
    'https://api.bfl.ai/v1/flux-pro-1.1',
    headers={
        'accept': 'application/json',
        'x-key': os.environ.get("BFL_API_KEY"),
        'Content-Type': 'application/json',
    },
    json={
        'prompt': 'A serene landscape with mountains',
        'aspect_ratio': '16:9'
    }
)

response.raise_for_status()  # Surface HTTP errors before reading the body
data = response.json()
request_id = data['id']
polling_url = data['polling_url']  # Use this URL for polling

# Poll using the returned polling_url
while True:
    time.sleep(0.5)
    result = requests.get(
        polling_url,
        headers={
            'accept': 'application/json',
            'x-key': os.environ.get("BFL_API_KEY"),
        },
        params={'id': request_id}
    ).json()
    
    if result['status'] == 'Ready':
        print(f"Image ready: {result['result']['sample']}")
        break
    elif result['status'] in ['Error', 'Failed']:
        print(f"Generation failed: {result}")
        break
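
The loop above polls until a terminal status arrives, so a request that never completes would block forever. One possible variant bounds the wait with a deadline. This is a sketch under assumptions: poll_until_done, fetch_status, timeout_s, and interval_s are hypothetical names, the default timeout is arbitrary, and the terminal statuses are the ones used in the example above.

```python
import time
from typing import Any, Callable, Dict


def poll_until_done(
    fetch_status: Callable[[], Dict[str, Any]],
    timeout_s: float = 120.0,
    interval_s: float = 0.5,
) -> Dict[str, Any]:
    """Call fetch_status until it reports a terminal status or timeout_s elapses.

    fetch_status is any zero-argument callable that returns the decoded
    JSON body of one polling request.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch_status()
        status = result.get("status")
        if status == "Ready":
            return result
        if status in ("Error", "Failed"):
            raise RuntimeError(f"Generation failed: {result}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"No result after {timeout_s} seconds")
        time.sleep(interval_s)
```

With the variables from the example above, fetch_status could be `lambda: requests.get(polling_url, headers=headers, params={'id': request_id}, timeout=10).json()`, where headers holds the same accept and x-key values. Passing the fetch step in as a callable also lets the loop be tested without network access.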

Content Delivery and Storage Guidelines

Delivery URLs

Generated images are served from region-specific delivery URLs:

EU: delivery-eu1.bfl.ai
US: delivery-us1.bfl.ai

Important Delivery Considerations

Not for Direct Serving: The result.sample URLs from delivery endpoints are not meant to be served directly to end users.

No CORS Support: We do not enable CORS on delivery URLs, which means they cannot be used directly in web browsers for cross-origin requests.

10-Minute Expiration: Generated images expire after 10 minutes and become inaccessible.

Network Access: If your infrastructure uses firewalls or network restrictions, ensure you whitelist the delivery endpoints (delivery-eu1.bfl.ai, delivery-us1.bfl.ai) to allow downloading generated images.
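
Because of the 10-minute expiration, it can help to track how long a result has been available before attempting a download. The sketch below is illustrative only: it assumes the window starts when your code first sees the Ready status, and the one-minute safety margin and both function names are hypothetical choices rather than documented values.

```python
from datetime import datetime, timedelta

EXPIRY_WINDOW = timedelta(minutes=10)  # from the expiration note above
SAFETY_MARGIN = timedelta(minutes=1)   # assumption: headroom for slow downloads


def download_deadline(ready_at: datetime) -> datetime:
    """Latest time at which a download should still be started."""
    return ready_at + EXPIRY_WINDOW - SAFETY_MARGIN


def is_still_downloadable(ready_at: datetime, now: datetime) -> bool:
    """True while 'now' is strictly before the download deadline."""
    return now < download_deadline(ready_at)
```

A result that fails this check is better regenerated than fetched, since the delivery URL may already be inaccessible.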

Recommended Image Handling

Download and Re-serve Pattern:

import requests
import os
from datetime import datetime
from typing import Dict, Any

def download_and_store_image(result_url: str, local_path: str) -> str:
    """
    Download image from BFL delivery URL and store locally
    """
    response = requests.get(result_url, timeout=30)  # Avoid hanging on a stalled download
    response.raise_for_status()
    
    with open(local_path, 'wb') as f:
        f.write(response.content)
    
    return local_path

def handle_generation_result(result: Dict[str, Any]) -> Dict[str, Any]:
    """
    Process generation result and store image locally
    """
    if result['status'] == 'Ready':
        sample_url = result['result']['sample']
        
        # Generate a filename that is unique down to the microsecond
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
        filename = f"generated_image_{timestamp}.jpg"
        local_path = os.path.join("./images", filename)
        
        # Ensure directory exists
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        
        # Download and store
        stored_path = download_and_store_image(sample_url, local_path)
        
        # Now serve from your own infrastructure
        return {
            'status': 'ready',
            'local_path': stored_path,
            'public_url': f"https://yourdomain.com/images/{filename}"
        }
    
    return result

Migration Checklist

Update API Endpoints

For standard inference, replace legacy endpoints with the new endpoint that fits your needs
Use api.bfl.ai for global load balancing
Use api.eu.bfl.ai or api.us.bfl.ai to keep routing within EU or US regions

Implement Polling URL Handling

Ensure your code extracts and uses the polling_url from API responses
Update polling logic to use the provided polling URL instead of hardcoded endpoints

Update Finetuning Workflows

Continue using region-specific endpoints (api.eu1.bfl.ai, api.us1.bfl.ai) for finetuning
Ensure finetuned model inference uses the same region as training

Implement Proper Image Handling

Set up download and re-serve infrastructure for generated images
Plan for the 10-minute expiration window
Consider a CDN or cloud storage for better performance

Best Practices

Error Handling

import requests
import time
from typing import Dict, Any

def robust_api_call(url: str, headers: Dict[str, str], json_data: Dict[str, Any], max_retries: int = 3) -> Dict[str, Any]:
    """
    Robust API call with retry logic and proper error handling
    """
    for attempt in range(max_retries):
        try:
            response = requests.post(url, headers=headers, json=json_data, timeout=30)
            
            if response.status_code == 429:
                # Rate limit exceeded, wait and retry
                wait_time = 2 ** attempt  # Exponential backoff
                time.sleep(wait_time)
                continue
                
            elif response.status_code == 402:
                # Insufficient credits
                raise Exception("Insufficient credits. Please add credits to your account.")
                
            elif response.status_code >= 400:
                # Other client/server errors
                response.raise_for_status()
            
            return response.json()
            
        except requests.exceptions.RequestException as e:
            if attempt == max_retries - 1:
                raise  # Re-raise with the original traceback
            time.sleep(2 ** attempt)
    
    raise Exception(f"Failed after {max_retries} attempts")

Rate Limiting

Maximum 24 concurrent requests for most endpoints
Maximum 6 concurrent requests for flux-kontext-max
Implement exponential backoff for 429 responses
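
Staying under the concurrency limits on the client side avoids most 429 responses in the first place. Below is a minimal sketch using semaphores. It assumes the limits apply per model and that a "request" lasts from submission until its result is ready, neither of which is stated above. ConcurrencyLimiter and its methods are hypothetical names.

```python
import threading
from typing import Callable, Dict, TypeVar

T = TypeVar("T")

DEFAULT_LIMIT = 24                      # "most endpoints" in the list above
MODEL_LIMITS = {"flux-kontext-max": 6}  # stricter limit from the list above


class ConcurrencyLimiter:
    """Caps in-flight work per model (hypothetical helper, not an SDK class)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._semaphores: Dict[str, threading.BoundedSemaphore] = {}

    @staticmethod
    def limit_for(model: str) -> int:
        """Return the concurrency limit assumed for this model."""
        return MODEL_LIMITS.get(model, DEFAULT_LIMIT)

    def _semaphore_for(self, model: str) -> threading.BoundedSemaphore:
        # Create each model's semaphore lazily, guarded against races.
        with self._lock:
            if model not in self._semaphores:
                self._semaphores[model] = threading.BoundedSemaphore(
                    self.limit_for(model)
                )
            return self._semaphores[model]

    def run(self, model: str, fn: Callable[[], T]) -> T:
        """Run fn while holding one of the model's slots."""
        with self._semaphore_for(model):
            return fn()
```

Here fn would wrap the whole submit-and-poll cycle for one image, so the slot is released only once the result is ready or the request has failed.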

Content Management

Download images immediately upon generation completion
Implement proper error handling for expired URLs
Consider implementing a queue system for high-volume applications
Use appropriate storage solutions (CDN, cloud storage) for serving images to users
