All AI initiatives have an underlying data story

We’re all going overboard with the AI boom and the hype cycle. The problem with all hype cycles is that at the peak the horizon looks beautiful, and when the slide begins the avalanche tears everything apart. The AI hype cycle lives in the media houses and newsrooms. On the ground, we know how sticky the situation is.

Any AI initiative needs a very compelling use case and a good data set on which it can build, learn and improve. Fintechs tend to be very aggressive on the use case side, since developing the code and the models is under their control, but when it comes to making the models effective they struggle to find relevant data sets. Banks still hold the data. This has been the same story in the newsrooms for the past 2 years, and although a lot has been done to foster collaboration and data sharing, a lot still remains to be done.

Even if the bank allows data sharing with the Fintech, many times while kick-starting the AI journey, or a product adoption or development journey, we realize we’re literally sitting on a gold mine of data. Gold mine here is more than just a metaphor: like ore, the data has to be dug out and refined before it is worth anything. Sometimes the data is in such bad shape that you have to re-process the entire data set to clean up the anomalies. That in itself is a herculean task.

Most Common Issues

Many banks have this problem, and they’re sitting on it and doing nothing for multiple reasons:

The missing data pieces
  1. Technical debt mounted because the code was never refactored
  2. Systems went live without a data clean-up, on the assumption that it would be done later, but it never got prioritized
  3. Original source systems lack data documentation, and now no one knows the relevance of certain fields
  4. Certain non-mandatory fields were left blank for convenience, creating sparse data sets
  5. Assumptions and logic for interpreting certain blank data fields are usually coded into the application and never documented for general use

There are many more specific reasons for data gaps, but when you start feeding these data sets to the ML models the pre-processing starts failing, and many times projects detour into an extended data clean-up or data sanity exercise, which impacts the overall turnaround of the projects. From the outside it seems banks are dragging their heels on new tech and are innovation averse, but in reality they’re trying their best to show the world that they too can innovate, and are tied down by constraints no one paid attention to.
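One way to spot this early is to profile how sparse each field is before any model work starts. Below is a minimal sketch of that idea in plain Python. The sample extract, its field names and the blank_rates helper are all invented for illustration; a real bank extract would have different columns and would likely need its own rules for what counts as "blank".

```python
import csv
import io

# Hypothetical sample extract; field names and values are invented for illustration.
SAMPLE_CSV = """customer_id,occupation,income_band,branch_code
1001,Engineer,B,MUM01
1002,,,MUM01
1003,Teacher,,DEL02
1004,,A,
"""


def blank_rates(csv_text):
    """Return {field: share of rows where the field is empty or only whitespace}."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in reader.fieldnames}
    total = 0
    for row in reader:
        total += 1
        for name in counts:
            # Treat missing values and whitespace-only values as blank.
            if not (row[name] or "").strip():
                counts[name] += 1
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: blanks / total for name, blanks in counts.items()}


rates = blank_rates(SAMPLE_CSV)
# List the sparsest fields first so they can be reviewed before modelling.
for field, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{field}: {rate:.0%} blank")
```

A report like this does not fix anything by itself, but it tells you up front which fields need a clean-up or a documented interpretation rule, instead of finding out when the pre-processing fails.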

Conclusion

Many banks are launching new data initiatives to solve this puzzle, but the sheer volume of data turns a programme targeted for 1 year into one that runs for multiple years. That can impact aggressive AI or ML timelines and, in many cases, Fintech / vendor integration with the bank. No matter what the media says, we at the grassroots know that every AI initiative will eventually turn into a data exercise.
