Ask DT: Data Scientists Not Utilizing Data Warehouse
5 points by shicken 1623 days ago | web | 2 comments

Hi, looking for some advice. I have been a data scientist for 3 years, but I am struggling to work with my colleagues. They regularly write tens of thousands of lines of R code (mostly copied and pasted) that take hours to run. When I tell them they could just run a sum on the production data warehouse tables instead of dumping the data lake, and so avoid having to clean, validate, and aggregate the data themselves, they look at me like I am a fool, even though it would save them hours and give them a result within a minute. Has anyone else experienced this?

All the time. Bring proof and examples, and show them how something that takes hours can really take minutes or seconds. Data warehouses are typically OLAP systems, built for analytical processing.
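
If it helps to have something concrete to show, here is a rough sketch of the kind of side-by-side comparison I mean. It is only an illustration: an in-memory SQLite database stands in for the warehouse, and the `sales` table, its columns, and the function names are all made up for the demo. The point is that both approaches return the same totals, but one lets the database do the aggregation while the other pulls every row out first.

```python
import sqlite3
import time


def build_demo_db(n_rows=200_000):
    """Create a toy 'warehouse' table (hypothetical schema, SQLite stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    regions = ["north", "south", "east", "west"]
    rows = ((regions[i % 4], i % 100) for i in range(n_rows))
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()
    return conn


def sum_in_database(conn):
    """Let the database aggregate; only one row per region comes back."""
    cur = conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
    return dict(cur.fetchall())


def sum_client_side(conn):
    """Pull every row out and aggregate in application code."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] = totals.get(region, 0) + amount
    return totals


def timed(fn, conn):
    """Run fn(conn) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(conn)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    conn = build_demo_db()
    pushed, t_pushed = timed(sum_in_database, conn)
    pulled, t_pulled = timed(sum_client_side, conn)
    assert pushed == pulled  # same answer either way
    print(f"in-database: {t_pushed:.3f}s, client-side: {t_pulled:.3f}s")
```

A local SQLite file will not reproduce an hours-versus-minutes gap, so treat the timings here as a template rather than evidence. For the real argument, run the same pair of approaches against one of your colleagues' actual jobs and bring those numbers.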