How a data lakehouse helps harness the power of AI
Its technology can provide the data infrastructure that healthcare organizations need to integrate AI and improve healthcare outcomes.
Generative artificial intelligence is revolutionizing healthcare. In a recent study, nearly 30 percent of healthcare leaders said they’ve already implemented AI, with about 60 percent planning to implement such technology in the next 12 to 24 months or more.
The same research points to the importance of healthcare organizations building a solid data foundation. On average, only 57 percent of a healthcare organization’s data is being used. A strong data platform helps healthcare organizations excel in AI by supporting their AI roadmaps and ensuring more of their data is meaningfully put to work.
Whether it’s a highly targeted supervised model or a 100-billion-plus-parameter large language model, any AI-powered innovation is only as powerful as the data platform on which it’s built. The quality of the data within that platform will be reflected in the quality of the AI it supports, for better or for worse. In healthcare, this means organizations need a structured but adaptable data environment that supports both current technologies and future developments.
In supporting leading healthcare organizations through their digital transformations, I've identified five key characteristics that data must have for a platform to effectively harness AI. Together, they will enable healthcare leaders to embrace the present and future of this technology.
The most valuable healthcare data in an AI environment is:
High quality and accurate. It’s vital that AI algorithms be trained on reliable and precise information to earn trust and achieve adoption.
Longitudinal and holistic. Some of the most powerful AI outcomes emerge from data connections that are not obvious to humans. Surfacing them takes datasets that are both broad and deep, spanning structured and unstructured data.
Catalogued and organized. The context of data can be as vital to a model’s success as the data itself, requiring careful tagging and routing of different tiers of data.
Omni-directional and real time. Deployment and meaningful adoption of models require seamless and timely flows between data producers and consumers.
Secure and high-scale. The quantity of data that current and future AI models are capable of processing is staggering, requiring efficient and elastic underlying infrastructure and a security-first architecture.
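To make the first and third characteristics more concrete, here is a minimal Python sketch of how a platform might check a record's quality and route it to a tagged tier. It is an illustration under stated assumptions, not a description of any particular product: the `CatalogEntry` class, the tier names, the required fields, and the `route` helper are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class CatalogEntry:
    """Hypothetical catalog record describing one dataset in the platform."""
    name: str
    tier: str                                   # illustrative tiers: "curated", "quarantine"
    tags: set[str] = field(default_factory=set)
    contains_phi: bool = True                   # assume protected health information by default


# Assumed minimal schema; a real platform would define its own.
REQUIRED_FIELDS = ("patient_id", "code", "recorded_at")


def quality_issues(record: dict) -> list[str]:
    """Return a list of human-readable problems found in one record."""
    issues = [f"missing {name}" for name in REQUIRED_FIELDS if not record.get(name)]
    recorded_at = record.get("recorded_at")
    if isinstance(recorded_at, datetime):
        if recorded_at.tzinfo is None:
            issues.append("recorded_at lacks a timezone")
        elif recorded_at > datetime.now(timezone.utc):
            issues.append("recorded_at is in the future")
    return issues


def route(record: dict, curated: CatalogEntry, quarantine: CatalogEntry) -> CatalogEntry:
    """Send clean records to the curated tier and everything else to quarantine."""
    return quarantine if quality_issues(record) else curated
```

In this sketch, only records that pass every check reach the tier a model would train on, while the catalog entry carries the context (tier, tags, sensitivity) alongside the data.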
The capabilities of a data lakehouse
The modern data lakehouse embraces all five of these tenets, combining the best attributes of a well-structured warehouse and a flexible and fast lake.
Crucially, a data lakehouse offers both the data quality and the data quantity that large language models and machine learning depend on, which other data architectures might lack. For AI that works, healthcare organizations need the right data, at the right volume, properly organized.
The latest generation of AI models also requires significant horsepower. Training and operating these models require massive computing capacity and specialized chipsets. Storing and transmitting the petabytes of data that power them also requires an efficient architecture. Overlooking these details can lead to unexpectedly steep cloud infrastructure bills. An efficient zero-copy lakehouse architecture can support massive horizontal scale without breaking the bank.
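The idea behind zero-copy can be shown with a toy Python analogy. This is not how a lakehouse is implemented. It only illustrates the principle that consumers reading shared data in place avoid both duplicate storage and stale copies. The function name `demo_zero_copy` and the sample bytes are made up for the illustration.

```python
def demo_zero_copy() -> tuple[bytes, bytes]:
    """Contrast a duplicated extract with a view onto shared storage."""
    # One buffer stands in for data stored once in the platform.
    stored = bytearray(b"patient-record-bytes" * 1000)

    copied = bytes(stored[:7])          # a consumer that duplicates the data
    shared = memoryview(stored)[:7]     # a consumer that reads the data in place

    # The producer updates the stored data (same length, so no reallocation).
    stored[0:7] = b"PATIENT"

    # The in-place view reflects the update; the duplicate is now stale.
    return bytes(shared), copied


fresh, stale = demo_zero_copy()
print(fresh, stale)
```

The duplicated extract costs extra memory and drifts out of date as soon as the source changes, which at petabyte scale is the kind of overhead that shows up on a cloud bill.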
In such a data environment, the information that’s aggregated can be instantly deployed for current AI uses, like summarizing complex medical histories for a busy provider before their appointment or stitching together stray, unstructured health data to complete a more holistic patient record.
Today, AI can save time for providers and support staff, reducing the hours spent sifting through records and freeing them up for more meaningful top-of-license work. It can also act as a second set of eyes, catching critical clinical insights that might otherwise be overlooked, from previously undocumented conditions to potentially missed screenings or treatments.
In the future, there are many ways AI might shape healthcare workers’ and administrators’ day-to-day lives and the way hospitals deliver care. A data lakehouse makes this existing and potential innovation possible, in whatever form it takes, at an enterprise level. AI is already here, and its future is inside the data lakehouse, where innovation awaits.
Nick Stepro is chief product and technology officer at Arcadia.