Incidentally, does anyone have resources for SMALL data? E.g. a few MB at a time, but with the same ETL, scheduling, and traceability requirements. I'd love some lite versions of big-data tools, but they need to be simple, small, and cheap.

Take a look at Luigi [1], a lightweight task orchestrator with minimal dependencies.

[1] https://github.com/spotify/luigi
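
To give a feel for the pattern, here is a rough stdlib-only sketch of target-based task orchestration: each task declares what it requires and what file it outputs, and a task counts as done once its output exists. This is not Luigi's actual API. The `Task` base class, `build`, `demo`, and the `Extract`/`Transform` tasks are all hypothetical names made up for illustration, and a real tool adds scheduling, parameters, and failure handling on top.

```python
import json
import tempfile
from pathlib import Path


class Task:
    """Hypothetical base class: a task is done when its output file exists."""

    def __init__(self, workdir: Path):
        self.workdir = workdir

    def requires(self):
        # Tasks that must finish before this one can run.
        return []

    def output(self) -> Path:
        raise NotImplementedError

    def run(self):
        raise NotImplementedError


class Extract(Task):
    def output(self):
        return self.workdir / "raw.json"

    def run(self):
        # Stand-in for pulling a few MB from a real source.
        self.output().write_text(json.dumps([1, 2, 3]))


class Transform(Task):
    def requires(self):
        return [Extract(self.workdir)]

    def output(self):
        return self.workdir / "doubled.json"

    def run(self):
        raw = json.loads(Extract(self.workdir).output().read_text())
        self.output().write_text(json.dumps([x * 2 for x in raw]))


def build(task, log):
    """Run dependencies first; skip any task whose output already exists."""
    if task.output().exists():
        return
    for dep in task.requires():
        build(dep, log)
    task.run()
    log.append(type(task).__name__)  # simple execution trace


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        first_log, second_log = [], []
        build(Transform(workdir), first_log)
        build(Transform(workdir), second_log)  # outputs exist, nothing reruns
        result = json.loads((workdir / "doubled.json").read_text())
        return first_log, second_log, result
```

Calling `demo()` runs the pipeline twice in a temporary directory. The second run does no work because the outputs already exist, which is what makes reruns cheap, and the log gives a basic form of traceability.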