Duckdb : AntonioCortes.com

DuckDB and httpfs behind a proxy: the secret nobody tells you

2026-03-02

4 min read

The problem: httpfs ignores your environment variables

If you use DuckDB with the httpfs extension to read remote Parquet files, CSVs from S3, or any other HTTP resource, you probably assume that the HTTP_PROXY and HTTPS_PROXY environment variables work just as they do in every other tool. curl respects them. wget respects them. Python requests respects them. Node.js respects them.

DuckDB does not.

I ran into this while working in a corporate environment with a mandatory proxy. I had a script reading Parquet files from Google Cloud Storage using httpfs, and it simply would not work. No clear error, no descriptive timeout, just silence. Meanwhile, a curl to the same resource with the same environment variables returned data without issue.

#duckdb #devops #databases

DuckDB: File Formats and Performance Optimizations

2026-02-01

3 min read

Lately I’ve been working quite a bit with DuckDB, and one of the things that interests me most is understanding how to optimize performance for the file format in use.

Working with Parquet is not the same as working with compressed or uncompressed CSV, and the performance differences can be dramatic.

Let’s review the key optimizations to keep in mind when working with different file formats in DuckDB.

Parquet: Direct Query or Load First?

DuckDB has advanced Parquet support, including the ability to query Parquet files directly without loading them into the database. But when should you do one or the other?
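To make the two options concrete, here is a minimal sketch that builds the SQL for each one. The helper names and the `data/events.parquet` path are hypothetical. `read_parquet` and `CREATE TABLE ... AS SELECT` are standard DuckDB SQL, but which option is faster depends on your workload, so treat this as an illustration of the choice rather than a recommendation.

```python
def _quote(value):
    """Quote a value as a SQL string literal (doubles single quotes)."""
    return "'" + value.replace("'", "''") + "'"


def direct_query_sql(path, columns="*"):
    """Option 1: query the Parquet file in place, without loading it."""
    return f"SELECT {columns} FROM read_parquet({_quote(path)});"


def load_table_sql(table, path):
    """Option 2: copy the Parquet file into a DuckDB table first."""
    identifier = '"' + table.replace('"', '""') + '"'
    return (f"CREATE TABLE {identifier} AS "
            f"SELECT * FROM read_parquet({_quote(path)});")


if __name__ == "__main__":
    # Hypothetical file path, used only for illustration.
    print(direct_query_sql("data/events.parquet"))
    print(load_table_sql("events", "data/events.parquet"))
```

Either statement could be run from the DuckDB CLI or through the `duckdb` Python package with `connection.execute(...)`.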

#duckdb #sql #optimization
