Building data infrastructure that will last
As a consultant, I have been called in to review and, in many cases, replace dozens of half-finished, abandoned, and sometimes forgotten data infrastructure projects.
In a few cases the data infrastructure needs only minor tweaking to operate effectively, but in others the project is so incomplete, or so lacking in a central design, that the best option is to replace the old system. An ex-manager in analytics I spoke with recently made a great point: it is easy to pick an ETL or data pipeline tool that sounds good on paper. As part of Uber’s cloud journey, we are migrating the on-prem Apache Hadoop®-based data lake, along with analytical and machine learning workloads, to the GCP™ infrastructure platform.