Rüdiger Voigt

Rüdiger Voigt is a Business Intelligence Consultant and Software Developer based near Cologne, Germany.

He creates data warehouses and efficient ETL processes, making business insights easily accessible through intuitive dashboards. Additionally, he provides customized training and executive workshops to help clients leverage BI tools and data-driven applications effectively.

He brings an interdisciplinary perspective to his work, rooted in his studies of Political Science and research interests that span Artificial Intelligence, Data Science, and Comparative Politics.

Notes

Downloading an Archive with exoskeleton

Build a resilient Python crawler with exoskeleton and MariaDB to harvest thousands of documents. This note covers scraping search results, extracting metadata, and organizing downloads for future analysis.

Software Projects / Open Source

I publish and maintain open-source Python tools on GitHub. The most important are:

exoskeleton

A modern Python framework for polite, fault-tolerant web crawling and large-scale downloads, with a database-backed queue, deduplication, file versioning, and progress reporting.
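To illustrate what a database-backed queue with deduplication means in practice, here is a minimal sketch. It is not the framework's actual API: the function names (`open_queue`, `enqueue`, `next_url`, `mark_done`) and the table layout are hypothetical, and SQLite from the standard library stands in for a database server so the example runs without setup.

```python
import sqlite3


def open_queue(path=":memory:"):
    """Open (or create) a queue table; the URL is the primary key."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS queue ("
        " url TEXT PRIMARY KEY,"
        " done INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def enqueue(conn, url):
    """Add a URL unless it is already known. Return True if it was added."""
    before = conn.total_changes
    # The primary key makes the database reject duplicates;
    # OR IGNORE turns that rejection into a silent no-op.
    conn.execute("INSERT OR IGNORE INTO queue (url) VALUES (?)", (url,))
    conn.commit()
    return conn.total_changes > before


def next_url(conn):
    """Return the oldest URL that has not been processed, or None."""
    row = conn.execute(
        "SELECT url FROM queue WHERE done = 0 ORDER BY rowid LIMIT 1"
    ).fetchone()
    return row[0] if row else None


def mark_done(conn, url):
    """Record that a URL was processed so it is not handed out again."""
    conn.execute("UPDATE queue SET done = 1 WHERE url = ?", (url,))
    conn.commit()
```

Because the queue state lives in the database rather than in memory, a crawler built this way can be stopped and restarted without losing its place, provided a file path is used instead of `:memory:`.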

salted

A fast asynchronous link checker for HTML, Markdown, and TeX files that caches results, normalizes URLs, and works both from the command line and in CI pipelines.
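URL normalization matters for a link checker because the same target often appears in several spellings, and each spelling would otherwise be checked separately. The sketch below shows one possible set of rules using only the standard library. The helper names `normalize_url` and `unique_urls` and the specific rules are assumptions for illustration, not the tool's own implementation.

```python
from urllib.parse import urlsplit, urlunsplit

# Ports that can be dropped because they are implied by the scheme.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Lowercase scheme and host, drop default ports and the fragment.

    Simplification: any user:password part of the URL is discarded.
    """
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    netloc = host
    if parts.port is not None and DEFAULT_PORTS.get(scheme) != parts.port:
        netloc = f"{host}:{parts.port}"
    # An empty path and "/" point to the same resource.
    path = parts.path or "/"
    # The fragment is never sent to the server, so it is left out.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


def unique_urls(urls):
    """Return normalized URLs in first-seen order, without duplicates."""
    seen = set()
    result = []
    for url in urls:
        key = normalize_url(url)
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result
```

With rules like these, `HTTP://Example.com:80/a#top` and `http://example.com/a` collapse into a single entry, so the page is requested only once.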

userprovided

A Python package for validating, normalizing, and standardizing user input, with utilities for parameters, URLs, email addresses, hashes, and basic security checks.

compatibility

A lightweight Python library for package authors that checks Python and OS compatibility, warns about untested versions, and reminds users to update before they run into runtime problems.
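The basic idea of such a check can be sketched with the standard library alone. This is not the library's API: the function `check_python` and the version bounds `TESTED_MIN` and `TESTED_MAX` are hypothetical and chosen only for illustration.

```python
import sys
import warnings

# Hypothetical bounds: the oldest supported and the newest tested version.
TESTED_MIN = (3, 9)
TESTED_MAX = (3, 13)


def check_python(version=None, minimum=TESTED_MIN, maximum=TESTED_MAX):
    """Return True if the interpreter version lies in the tested range.

    Raises RuntimeError for versions that are too old and warns
    (returning False) for versions newer than the tested range.
    """
    if version is None:
        version = sys.version_info[:2]
    version = tuple(version[:2])
    if version < minimum:
        raise RuntimeError(
            f"Python {version[0]}.{version[1]} is not supported; "
            f"at least {minimum[0]}.{minimum[1]} is required."
        )
    if version > maximum:
        warnings.warn(
            f"Python {version[0]}.{version[1]} is newer than the latest "
            f"tested version {maximum[0]}.{maximum[1]}.",
            RuntimeWarning,
            stacklevel=2,
        )
        return False
    return True
```

A package author would typically call such a function once at import time, so that users learn about an untested interpreter immediately instead of through an obscure error later.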