Mike Czech

I'm Mike Czech, a software engineer and data scientist living in Hamburg, Germany. I work on autonomous driving safety at MOIA and write about data, machine learning, software engineering, travel, and other things I’m learning.

I have a feeling that the next breakthroughs in AI won’t necessarily come from more capable models, but rather from much faster inference. This week, we’ve seen two interesting developments in that direction.

First, OpenAI announced GPT-5.6 Sol:

“We’re also launching GPT‑5.6 Sol on Cerebras at up to 750 tokens per second in July”

Second, Google released Nano Banana 2 Lite, bringing image generation down to just a few seconds.

I suspect that even with model capabilities around the level of Claude Opus 4.6+, this kind of inference speed would enable a lot of new use cases, much like video streaming only became practical once the internet became fast enough.
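
A quick back-of-the-envelope calculation shows what that throughput could mean for waiting time. The 750 tokens per second figure is the "up to" number from the announcement quoted above; the 3,000-token response length and the 75 tokens per second baseline are assumed values chosen purely for illustration, not measurements.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to stream num_tokens at a steady decode rate.

    Ignores time to first token, network latency, and queueing.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


RESPONSE_TOKENS = 3_000  # assumed response length, for illustration only
BASELINE_TPS = 75        # assumed baseline decode rate, for illustration only
FAST_TPS = 750           # "up to" figure from the quoted announcement

baseline_s = generation_seconds(RESPONSE_TOKENS, BASELINE_TPS)
fast_s = generation_seconds(RESPONSE_TOKENS, FAST_TPS)

print(f"baseline: {baseline_s:.0f} s, fast: {fast_s:.0f} s")
# prints "baseline: 40 s, fast: 4 s"
```

Under these assumptions, a response that would take most of a minute arrives in a few seconds, which is roughly the difference between waiting for a result and interacting with it.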

I’ve noticed that with AI-assisted coding, it’s becoming even more common to end up with large PRs. This is problematic for several reasons:

  • bugs are harder to spot
  • merge conflicts increase
  • reviews take longer

The interesting thing is that AI coding agents are also really good at splitting large PRs into smaller, more manageable pieces! I’ll often ask an agent to identify independent changes, separate refactors from functional changes, and suggest a sequence of smaller PRs.

To me, this shows that AI-assisted coding isn’t just about producing more code in less time. It can also help reinforce good engineering practices, which are necessary if you want to scale development over time.

One of my favourite new tricks is to be a little more verbose in Slack and then use Claude and Slack MCP to generate a pull request from the discussion. That way, ideas from our Slack discussions make their way into the product almost immediately!

It’s a small example of how software engineering is becoming less about producing code and more about collaboratively shaping the product.

Recently, I’ve been working more with dbt again and came across a useful way to handle questionable rows: configure a test to warn and store its failures.

{{ config(
    severity = 'warn',
    store_failures = true
) }}

select *
from {{ ref('some_model') }}
where ...

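With severity set to warn, a failing test is reported as a warning rather than an error, so dbt test does not stop the rest of the pipeline. With store_failures enabled, the failing rows are also written to a table in the warehouse. That makes it a handy way to flag invalid or suspicious records and keep them available for investigation.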
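
The warn-and-store pattern can be sketched outside dbt to make the mechanics visible. The following is only a stand-in built on Python's sqlite3 module, not how dbt implements it: the helper run_warn_test, the failures table name, the speed_kmh column, and the range check are all hypothetical and exist only for this illustration.

```python
import sqlite3
import warnings


def run_warn_test(conn: sqlite3.Connection, test_sql: str, failures_table: str) -> int:
    """Store the rows returned by test_sql and warn instead of raising.

    Hypothetical helper: test_sql must select the rows that violate the check.
    """
    # Start from a clean failures table on every run.
    conn.execute(f'DROP TABLE IF EXISTS "{failures_table}"')
    conn.execute(f'CREATE TABLE "{failures_table}" AS {test_sql}')
    (n_failures,) = conn.execute(
        f'SELECT COUNT(*) FROM "{failures_table}"'
    ).fetchone()
    if n_failures:
        # A warning keeps the surrounding pipeline running.
        warnings.warn(f"{n_failures} failing row(s) stored in {failures_table}")
    return n_failures


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_model (id INTEGER, speed_kmh REAL)")
conn.executemany(
    "INSERT INTO some_model VALUES (?, ?)",
    [(1, 42.0), (2, -5.0), (3, 310.0)],
)

# Hypothetical check: speeds outside a plausible range count as failures.
test_sql = "SELECT * FROM some_model WHERE speed_kmh < 0 OR speed_kmh > 250"
n = run_warn_test(conn, test_sql, "some_model_failures")

stored = conn.execute("SELECT id FROM some_model_failures ORDER BY id").fetchall()
print(n, stored)  # prints "2 [(2,), (3,)]"
```

The suspicious rows end up in their own table where they can be queried later, while everything after the check still runs.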

I’m currently working with large Polars dataframes (20M+ rows) and have noticed that using an inner join to filter on a categorical column can be significantly slower than using is_in. There’s a good example on GitHub that demonstrates this issue:

In [156]: N = 10

In [157]: df = pl.DataFrame({"x": pl.Series(range(N))})

In [158]: %timeit -n10 -r10 df.filter(pl.col("x").is_in(df.select("x").to_series() + 1))
120 µs ± 12.3 µs per loop (mean ± std. dev. of 10 runs, 10 loops each)

In [159]: %timeit -n10 -r10 df.join(df.select("x") + 1, on="x")
865 µs ± 101 µs per loop (mean ± std. dev. of 10 runs, 10 loops each)

The somewhat obvious lesson here is to avoid an expensive general-purpose operation such as a join when a simpler one such as is_in does exactly what you need, in this case filtering rows by membership.
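
The difference between the two strategies can be shown in plain Python. This is a conceptual sketch only and says nothing about how Polars implements either operation; the function names and the sample rows are made up for the illustration. A membership filter needs one lookup per row, while an inner join has to index the other side and build a combined row for every match.

```python
from collections import defaultdict


def filter_by_membership(rows, key, allowed):
    """Keep rows whose key value is in `allowed`: one set lookup per row."""
    allowed_set = set(allowed)
    return [row for row in rows if row[key] in allowed_set]


def filter_by_inner_join(rows, key, right_rows):
    """Hash inner join: index the right side, then emit one merged row per match."""
    index = defaultdict(list)
    for right in right_rows:
        index[right[key]].append(right)
    joined = []
    for row in rows:
        for right in index.get(row[key], []):
            joined.append({**row, **right})
    return joined


rows = [
    {"id": 1, "city": "Hamburg"},
    {"id": 2, "city": "Berlin"},
    {"id": 3, "city": "Hamburg"},
]
allowed = ["Hamburg", "Hamburg"]  # note the duplicate value

kept = filter_by_membership(rows, "city", allowed)
joined = filter_by_inner_join(rows, "city", [{"city": c} for c in allowed])

print(len(kept))    # prints 2
print(len(joined))  # prints 4
```

Besides doing more work, the join also changes the result when the filter values contain duplicates: each matching row appears once per duplicate, so the join is only a drop-in replacement for the membership filter when the right-hand keys are unique.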