The data Tower of Babel

AI is good at translation. The problem is that “speaking” data is not just a translation problem. Yes, in some environments, using one of the many variations of Text2SQL, you can type “top 10 customers by Q1 spend” and get a viable response from auto-generated SQL. But enterprise data requests also need to account for factors like lineage (“where did this data come from and when was it refreshed?”). They need documentation, testing, versioning and governance. And they need to support multiple iterations as analysts, business users and data engineers work together to get the right data. AI can generate the right answers, but it can’t get everyone on the same page.  
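To make the Text2SQL point concrete, here is a minimal sketch of the kind of query such a tool might auto-generate for "top 10 customers by Q1 spend," run against a tiny in-memory stand-in table. The table and column names (`orders`, `customer`, `order_date`, `amount`) are illustrative assumptions, not from any real schema.

```python
import sqlite3

# Tiny in-memory stand-in for an enterprise orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, order_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("Acme", "2024-01-15", 1200.0),
        ("Acme", "2024-03-02", 800.0),
        ("Globex", "2024-02-10", 1500.0),
        ("Initech", "2024-05-20", 9000.0),  # outside Q1 -- should be excluded
    ],
)

# Roughly what a Text2SQL tool might emit for
# "top 10 customers by Q1 spend":
query = """
    SELECT customer, SUM(amount) AS q1_spend
    FROM orders
    WHERE order_date BETWEEN '2024-01-01' AND '2024-03-31'
    GROUP BY customer
    ORDER BY q1_spend DESC
    LIMIT 10
"""
top_customers = conn.execute(query).fetchall()
```

The query itself is easy; what it doesn't carry is any of the surrounding context — where `orders` was sourced from, when it was last refreshed, or whether "spend" should net out refunds.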

Mitigating disconnects 

So, if AI isn’t the answer, what is? You need to apply different approaches to bridge the gap between groups. This doesn’t mean teaching each group the other’s language, though there is an element of that. It’s more about designing communications with an awareness of the needs of whoever is receiving the message — crystallizing best practices and making communications consistent and effective.  

For instance, when possible, analysts should develop visual prototypes using live data and actual data structures, rather than sketching approximations in PowerPoint. Mock-ups and pseudo-data can obscure edge cases that become apparent only when working with the actual database. More importantly, engineers work in code, and code requires specifics. The goal isn’t to make power users abandon their visual medium. It’s to include enough context that the visual metaphor can be cleanly translated to code. The less engineering needs to infer, the more accurate the results are likely to be. Most modern tools will let you do this with only a small representative “slice” of the data.
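One common way to pull that small representative "slice" is to sample a few rows from every category, so each group (and its edge cases) shows up in the prototype data. A minimal sketch, with illustrative field names (`region`, `amount`) that stand in for whatever dimensions matter in a real table:

```python
import random
from collections import defaultdict

def representative_slice(rows, key, per_group=3, seed=42):
    """Take up to `per_group` rows from each group defined by `key`,
    so every category appears in the prototype data."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    rng = random.Random(seed)  # fixed seed: the slice is reproducible
    sample = []
    for group_rows in groups.values():
        k = min(per_group, len(group_rows))
        sample.extend(rng.sample(group_rows, k))
    return sample

# Illustrative records -- a real prototype would read from the live tables.
orders = [
    {"region": "EMEA", "amount": 120},
    {"region": "EMEA", "amount": 0},      # edge case: zero-value order
    {"region": "APAC", "amount": 340},
    {"region": "AMER", "amount": 5600},
    {"region": "AMER", "amount": -40},    # edge case: refund
]
slice_ = representative_slice(orders, key="region", per_group=2)
```

Stratifying by group rather than sampling uniformly is the point: a uniform sample of a skewed table can easily miss the small categories where the edge cases live.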
