Disclaimer: This post was written with the help of Claude Sonnet 4

You know that feeling when you’ve automated everything perfectly, and then reality hits you with a curveball? That’s exactly what happened during our recent migration from media.tifi.tv to media.telex.im. What started as a straightforward file migration turned into an unexpected deep dive into Elasticsearch operations. Here’s how it all unfolded.

The Perfect Plan (That Wasn’t So Perfect)

The mission seemed straightforward: migrate all media files from our old domain to the new one. We had thousands of files scattered across our PostgreSQL database—user avatars, uploaded documents, logos—all pointing to https://media.tifi.tv/telexbucket/. The new infrastructure was ready at https://media.telex.im/ with separate buckets for staging and production environments.

I had it all mapped out in my head: download files from the old server, copy them to the new MinIO buckets, and update the database URLs. Clean, automated, bulletproof. Or so I thought.

Building the Migration Pipeline

I started with what I knew best: Python automation. The first script (download_files.py) would parse a list of URLs from our database export and download everything in parallel:

import os

import requests

def download_file(url: str, env: str) -> str:
    """Download file into env-specific directory and return local path."""
    filename = url.rsplit("/", 1)[-1]  # last part of URL
    env_dir = os.path.join(os.getcwd(), env)
    os.makedirs(env_dir, exist_ok=True)
    filepath = os.path.join(env_dir, filename)

    try:
        response = requests.get(url, timeout=15)
        response.raise_for_status()
        with open(filepath, "wb") as f:
            f.write(response.content)
        return filepath
    except Exception as e:
        print(f"❌ Failed to download {url}: {e}")
        return ""

Threading was crucial here. With 20 concurrent workers, what would have taken hours completed in minutes:

from concurrent.futures import ThreadPoolExecutor, as_completed

MAX_WORKERS = 20  # 20 concurrent downloads was the sweet spot

def run_downloads(tasks):
    """Download in parallel using threads and return results."""
    results = []
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as executor:
        future_to_task = {
            executor.submit(download_file, new_url, env): (env, table, col, old, new_url)
            for env, table, col, old, new_url in tasks
        }
        for future in as_completed(future_to_task):
            # Process results as they complete
            env, table, col, old, new_url = future_to_task[future]
            full_path = future.result()
            results.append([table, col, old, full_path])
    return results

This felt like the right approach—fast, efficient, and it generated CSV files tracking every downloaded file. But I quickly realized the downloaded files were sitting on the wrong server.
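
The CSV tracking itself was nothing fancy. Here's a minimal sketch of how the results from run_downloads can be flushed to disk (the filename is a placeholder):

import csv

results = run_downloads(tasks)
with open("downloaded_files.csv", "w", newline="") as f:  # placeholder name
    writer = csv.writer(f)
    writer.writerow(["table", "column", "old_url", "local_path"])
    writer.writerows(results)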

The Remote Copy Challenge

The second script (copy_files.py) used scp to transfer everything to the destination server. This is where things got interesting. You can’t just copy thousands of files sequentially—that would take forever. But you also can’t overwhelm the remote server with too many concurrent connections.

import os
import subprocess

REMOTE_BASE = "user@host:/srv/media"  # placeholder: scp destination prefix

def scp_file(local_path, env):
    """Copy a file via scp and return remote path (or empty if failed)."""
    filename = os.path.basename(local_path)
    remote_dir = f"{REMOTE_BASE}/{env}"
    remote_path = f"{remote_dir}/{filename}"

    try:
        subprocess.run(
            ["scp", local_path, f"{remote_dir}/"],
            check=True
        )
        print(f"✅ Copied {local_path} -> {remote_path}")
        return remote_path
    except subprocess.CalledProcessError as e:
        print(f"❌ Failed to copy {local_path}: {e}")
        return ""

The key was finding the right balance with MAX_WORKERS = 10. Too few workers and the transfer crawled. Too many and the SSH connections would get refused or timeout. It’s like trying to pour water through a funnel—there’s an optimal flow rate.
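
The pool around scp_file mirrored the download step. A minimal sketch, assuming rows of (local_path, env) pairs read back from the download CSV:

from concurrent.futures import ThreadPoolExecutor, as_completed

MAX_WORKERS = 10  # the balance point found by trial and error

def run_copies(rows):
    """Copy files in parallel; rows are (local_path, env) pairs."""
    remote_paths = []
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as executor:
        futures = [executor.submit(scp_file, path, env) for path, env in rows]
        for future in as_completed(futures):
            remote_paths.append(future.result())
    return remote_paths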

MinIO and the Final Destination

Once files were on the destination server, the third script (copy_to_new_minio_and_write_to_postgresql.py) handled the final placement into MinIO buckets. This is where the URL transformation logic really mattered:

def build_new_url(old_url: str, env: str) -> str:
    """
    Replace telexbucket with the environment-specific bucket
    and update domain from tifi.tv -> telex.im
    """
    if env == "production":
        bucket = "telexprodbucket"
    else:  # staging
        bucket = "telexstagingbucket"

    new_url = old_url.replace("telexbucket", bucket)
    new_url = new_url.replace("tifi.tv", "telex.im")
    return new_url

def get_bucket_path(old_url: str, env: str) -> str:
    """
    Return the correct target bucket path for mc cp
    """
    if env == "production":
        bucket = "telexprodbucket"
    else:
        bucket = "telexstagingbucket"

    if "/file-uploads/" in old_url:
        return f"localminio/{bucket}/public/file-uploads"
    else:
        return f"localminio/{bucket}/public/profile_pics"
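
To make the transformation concrete, here is what build_new_url produces for a production file upload (the filename is made up):

>>> build_new_url("https://media.tifi.tv/telexbucket/public/file-uploads/report.pdf", "production")
'https://media.telex.im/telexprodbucket/public/file-uploads/report.pdf'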

Different file types needed different bucket paths, and each environment had its own naming convention. The mc (MinIO Client) commands looked simple enough:

# full_path comes from the scp step; bucket_path from get_bucket_path()
mc_cmd = [
    "mc", "cp",
    full_path,
    bucket_path
]
subprocess.run(mc_cmd, check=True)

But coordinating this across thousands of files while tracking success and failure rates required careful orchestration.
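
A trimmed-down sketch of that orchestration loop, assuming rows of (old_url, full_path, env) tuples parsed from the earlier CSVs:

import subprocess

successes, failures = [], []
for old_url, full_path, env in rows:
    bucket_path = get_bucket_path(old_url, env)
    new_url = build_new_url(old_url, env)
    try:
        subprocess.run(["mc", "cp", full_path, bucket_path], check=True)
        successes.append((old_url, new_url))  # feeds the PostgreSQL step
    except subprocess.CalledProcessError:
        failures.append(old_url)

print(f"✅ {len(successes)} copied, ❌ {len(failures)} failed")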

PostgreSQL: The Easy Part

Updating PostgreSQL felt like coming home after all the file manipulation. Clean, predictable SQL:

sql = f"UPDATE {table} SET {column} = '{new_url}' WHERE {column} = '{old_url}';"

Each CSV row became an UPDATE statement. The pattern was straightforward: find the old URL, replace it with the new one. PostgreSQL handled this elegantly, and I could verify the changes immediately with simple SELECT queries.

-- Before
SELECT avatar_url FROM users WHERE avatar_url LIKE '%media.tifi.tv%';

-- After  
SELECT avatar_url FROM users WHERE avatar_url LIKE '%media.telex.im%';

The transaction safety of PostgreSQL meant I could batch these updates without worrying about partial failures leaving the database in an inconsistent state.
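
Since every URL came from our own database export, the f-string was workable in practice, but the safer pattern is a parameterized, batched version. A minimal sketch with psycopg2 (the DSN is a placeholder; note that table and column names can't be bound as parameters, only the values):

import psycopg2

conn = psycopg2.connect("dbname=telex user=postgres")  # placeholder DSN
with conn:  # one transaction: commits on success, rolls back on error
    with conn.cursor() as cur:
        for table, column, old_url, new_url in rows:
            cur.execute(
                f"UPDATE {table} SET {column} = %s WHERE {column} = %s;",
                (new_url, old_url),
            )
conn.close()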

The Moment of Truth

By this point, I was feeling pretty confident. The Python pipeline was solid:

  1. ✅ Download files from old server
  2. ✅ Copy files to new server
  3. ✅ Upload files to MinIO buckets
  4. ✅ Update PostgreSQL with new URLs

Everything worked beautifully. The migration scripts handled both staging and production environments, tracked progress with detailed CSV logs, and even included dry-run modes for testing. I ran a few verification queries against PostgreSQL and saw all the URLs properly updated.

Mission accomplished, right?

The Forgotten Database

The scripts worked like a charm. Files downloaded in parallel, copied to the new MinIO buckets, and PostgreSQL updates executed flawlessly. I was feeling pretty good about myself—until I realized I’d only solved half the problem.

While I was busy patting myself on the back, our Elasticsearch cluster was sitting there with thousands of documents still pointing to the old media.tifi.tv URLs. Think of it like updating your address with the postal service but forgetting to tell your friends—half your mail still goes to the wrong place.

PostgreSQL was easy because I could write straightforward UPDATE queries. But Elasticsearch? That’s a different beast entirely. You can’t just run a simple UPDATE statement like you would in SQL. You need to work with its REST API and understand its query structure.

Learning Elasticsearch the Hard Way

Instead of diving into documentation, I decided to learn by doing. I fired up Postman and started experimenting. My first attempts were clumsy—I was treating Elasticsearch like PostgreSQL and getting frustrated when it didn’t behave the same way.

Here’s what a simple search looked like initially:

{
    "query": {
        "match": {
            "media.file_link": "media.tifi.tv/telexbucket"
        }
    }
}

This would find documents containing those URLs, but updating them? That required a completely different approach.
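
Outside Postman, the same search is a couple of lines of Python; the host and index name below are placeholders for whatever your cluster uses:

import requests

ES_HOST = "http://localhost:9200"  # placeholder
query = {"query": {"match": {"media.file_link": "media.tifi.tv/telexbucket"}}}
resp = requests.post(f"{ES_HOST}/messages/_search", json=query)  # index name assumed
print(resp.json()["hits"]["total"])  # in ES 7+, an object like {"value": 1234, ...}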

The Update Revelation

The breakthrough came when I discovered _update_by_query. This isn’t like a SQL UPDATE—it’s more like saying “find all documents matching this pattern and run this script on each one.” Here’s what actually worked:

{
    "script": {
        "source": "if (ctx._source.media != null) { for (int i = 0; i < ctx._source.media.size(); i++) { if (ctx._source.media[i].file_link != null) { ctx._source.media[i].file_link = ctx._source.media[i].file_link.replace('media.tifi.tv/telexbucket','media.telex.im/telexprodbucket'); } } }"
    },
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "media.file_link": "media.tifi.tv/telexbucket"
                    }
                }
            ]
        }
    }
}

Notice how different this is from SQL. Instead of a simple UPDATE statement, you’re writing a script that iterates through arrays and conditionally updates fields. It’s like the difference between giving someone directions and teaching them to navigate.
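
Mechanically, that body gets POSTed to the _update_by_query endpoint instead of _search. A sketch of the call, with the same placeholder host and index as above:

import requests

ES_HOST = "http://localhost:9200"  # placeholder
resp = requests.post(
    f"{ES_HOST}/messages/_update_by_query",  # index name assumed
    json=update_body,  # the script + query document shown above
)
print(resp.json().get("updated"), "documents updated")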

The Complexity Spiral

What started as a simple URL replacement quickly spiraled into multiple scenarios:

  1. Media files nested in arrays needed array iteration
  2. Avatar URLs were simple string fields requiring different logic
  3. Staging vs Production environments needed different target URLs
  4. Organization-specific updates required filtering by hundreds of org IDs

Each scenario demanded its own approach. For avatar URLs, the script was much simpler:

{
    "script": {
        "source": "if (ctx._source.containsKey('avatar_url') && ctx._source.avatar_url != null) { ctx._source.avatar_url = ctx._source.avatar_url.replace('media.tifi.tv/telexbucket','media.telex.im/telexstagingbucket'); }"
    },
    "query": {
        "match": {
            "avatar_url": "media.tifi.tv/telexbucket"
        }
    }
}

Boolean Queries: When Simple Isn’t Enough

The real learning curve hit when I needed to update only specific organizations’ data. This meant constructing boolean queries with hundreds of organization IDs:

{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "media.file_link": "media.tifi.tv/telexbucket"
                    }
                }
            ],
            "filter": [
                {
                    "terms": {
                        "org_id.keyword": [
                            "01978a73-9d32-7839-b295-a3305f6f407c",
                            "01921d81-d2a3-75c8-aaec-32c5ae3c26d6",
                            // ... hundreds more IDs
                        ]
                    }
                }
            ]
        }
    }
}

This is where Elasticsearch's terms filter shines. Instead of hand-assembling an enormous IN (...) string in SQL, you pass the IDs as a structured array, and because the clause runs in filter context it skips scoring entirely, so the query stays fast even with hundreds of values.

The Learning Process

Through trial and error, I discovered the patterns:

  • match queries for full-text search
  • term queries for exact matches
  • bool queries for combining conditions
  • must for AND logic
  • should for OR logic
  • filter for performance when you don’t need scoring

Each query type serves a specific purpose, like having different tools in a toolbox. You wouldn’t use a hammer for everything, and you shouldn’t use match for everything either.
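
As one worked example, here's OR logic in practice: a bool query with should clauses, written as a Python dict ready to pass as json= to _search, matching documents that point at either the old or the new domain (field and values follow the earlier examples):

query = {
    "query": {
        "bool": {
            "should": [
                {"match": {"media.file_link": "media.tifi.tv/telexbucket"}},
                {"match": {"media.file_link": "media.telex.im/telexprodbucket"}},
            ],
            "minimum_should_match": 1,  # at least one clause must match
        }
    }
}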

Testing Before Committing

One thing I learned the hard way: always test your queries before running updates. I used simple search queries to verify I was targeting the right documents:

{
    "query": {
        "match": {
            "media.file_link": "media.tifi.tv/telexbucket"
        }
    },
    "size": 5
}

This showed me exactly which documents would be affected. It’s like doing a SELECT before your UPDATE in SQL—good practice that saved me from several disasters.

The Final Automation

Once I understood the patterns, I could build proper automation around the Elasticsearch operations. The key was breaking down the migration into discrete steps:

  1. Search and verify what needs updating
  2. Test the update logic on a small subset
  3. Run the full update with conflicts=proceed to handle concurrent updates (see the sketch after this list)
  4. Verify the results with follow-up searches
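
Wired together, the whole thing fits in a short script. A sketch under the same placeholder host/index assumptions, using the _count endpoint to verify before and after:

import requests

ES_HOST = "http://localhost:9200"  # placeholder
INDEX = "messages"                 # assumed index name
match = {"match": {"media.file_link": "media.tifi.tv/telexbucket"}}

def count_matches():
    """Steps 1 and 4: how many documents still carry old URLs?"""
    resp = requests.post(f"{ES_HOST}/{INDEX}/_count", json={"query": match})
    return resp.json()["count"]

print("before:", count_matches())

# Step 3: the full update, tolerating version conflicts.
update_body = {
    "script": {"source": "..."},  # the replace script shown earlier
    "query": match,
}
resp = requests.post(
    f"{ES_HOST}/{INDEX}/_update_by_query?conflicts=proceed",
    json=update_body,
)
print("updated:", resp.json()["updated"])
print("after:", count_matches())  # should be 0

Step 2 in practice just meant running the same update with a terms filter on a handful of org IDs added to the query first.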

What I Learned

This migration taught me that automation isn’t just about writing scripts—it’s about understanding the tools you’re automating. My Python scripts handled the file operations perfectly, but they couldn’t help me with Elasticsearch because that required domain-specific knowledge.

The biggest lesson? When you encounter a new system, don’t try to force it into patterns you already know. PostgreSQL and Elasticsearch serve different purposes and have different strengths. Fighting against a tool’s design will only frustrate you.

The Aftermath

In the end, the migration was successful. Files moved, URLs updated, and users saw no interruption in service. But the real value was the learning experience. Now when I encounter Elasticsearch challenges, I don’t panic—I open Postman and start exploring.

Sometimes the best education comes from problems you didn’t expect to solve. This migration forced me to understand Elasticsearch in a practical, hands-on way that no tutorial could match. And that knowledge has proven invaluable for future projects.

The moral of the story? Your automation scripts will get you far, but be ready to learn new tools when they reach their limits. The problems you didn’t plan for often teach you the most.