Using PDB in Metaflow

March 23, 2025 · 1 min · 138 words · Me

Finding the Maximal Rectangle in Augmented Reality

March 16, 2025 · 2 min · 272 words · Me

How Boosted Decision Trees can Benefit from Language Models

March 13, 2025 · 5 min · 981 words · Me

Polars is_in vs. inner join

March 3, 2025 · 1 min · 125 words · Me

A Lightweight Vector DB with DuckDB and Cloud Run

March 2, 2025 · 3 min · 518 words · Me

Do you really need a distributed query engine?

October 11, 2024 · 2 min · 362 words · Me

Optimizing Data Loading and GPU Usage in PyTorch

March 20, 2024 · 3 min · 445 words · Me

SQL Query Testing with DuckDB and SQLGlot

January 12, 2024 · 2 min · 365 words · Me

Enhance Training Speed with Mixed Precision Training

December 20, 2023 · 2 min · 353 words · Me

Optional Sampling for Better Feedback Loops

October 4, 2023 · 3 min · 481 words · Me

Reducing Memory Requirements with Sparse Data Structures

June 11, 2023 · 2 min · 317 words · Me

Reconsidering Data Types for Efficiency

September 20, 2022 · 2 min · 242 words · Me

Columnar vs. Row-Based Storage: Key Differences and Use Cases

March 20, 2022 · 2 min · 347 words · Me

Enhancing Code Performance with Vectorization

January 22, 2022 · 2 min · 351 words · Me

Table Partitioning and Clustering Strategies

November 4, 2020 · 2 min · 283 words · Me

Implementing a Python Package Repository on GCP

September 13, 2020 · 4 min · 832 words · Me

Optimizing Resource Utilization in Batch Jobs on GKE

April 15, 2019 · 3 min · 567 words · Me

Concolic Testing With KLEE

July 8, 2014 · 3 min · 452 words · Me