Data Solutions Case Study

Data Lake

Our team developed an efficient, cost-effective alternative to an enterprise data warehouse by designing and building a data lake.

The Problem

Our client needed to make better use of its data to streamline case processing, make quicker decisions, and improve program integrity. However, it was challenged by massive volumes of data from disparate sources, much of it trapped in siloed legacy systems and stored in different formats.

Our Solution

Given the volume and complexity of the data, and the high cost of developing an enterprise data warehouse, our team developed an efficient, cost-effective alternative by designing and building a data lake. Using an Oracle extract process, we transformed and stored more than 10 GB of structured and semi-structured data each day. The data was drawn from over 25,000 fields in 1,575 tables and consolidated into 1,790 fields in 61 tables in the data lake, where it could be used as needed for a variety of activities, including reporting, monitoring, and asynchronous predictive analytics.

The advantage of a data lake is that it allows incoming data to remain as close as possible to its original, native format, and then provides “just in time” transformation and shaping of that data, based on the unique requirements of each project or activity. The data lake, with its terabytes of data, lowered costs and increased capabilities by reducing the impact on source systems and making data available much more quickly.
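As a rough illustration of the “just in time” idea, the sketch below keeps records in their raw form and applies a different shaping function only when a given activity reads them. It is a minimal Python sketch, not the project's actual implementation: the record layout, the field names (CASE_ID, OPEN_DT, AMT, STATUS), and the functions reporting_view, analytics_view, and shape are all hypothetical and chosen only for illustration.

```python
import json
from typing import Callable, Iterable, Iterator

# Hypothetical raw records, stored as close as possible to their source format.
# Every value is still a string, exactly as it might arrive from an extract.
RAW_RECORDS = [
    '{"CASE_ID": "1001", "OPEN_DT": "2020-01-15", "AMT": "250.00", "STATUS": "open"}',
    '{"CASE_ID": "1002", "OPEN_DT": "2020-02-03", "AMT": "75.50", "STATUS": "closed"}',
]


def read_raw(lines: Iterable[str]) -> Iterator[dict]:
    """Parse raw lines lazily; nothing is reshaped at storage time."""
    for line in lines:
        yield json.loads(line)


def reporting_view(record: dict) -> dict:
    """Hypothetical shape for a reporting activity: identifiers and status only."""
    return {"case_id": int(record["CASE_ID"]), "status": record["STATUS"]}


def analytics_view(record: dict) -> dict:
    """Hypothetical shape for an analytics activity: typed numeric features."""
    return {
        "case_id": int(record["CASE_ID"]),
        "amount": float(record["AMT"]),
        "is_open": record["STATUS"] == "open",
    }


def shape(lines: Iterable[str], view: Callable[[dict], dict]) -> list:
    """Apply a project-specific view at read time, leaving the raw data untouched."""
    return [view(record) for record in read_raw(lines)]


if __name__ == "__main__":
    print(shape(RAW_RECORDS, reporting_view))
    print(shape(RAW_RECORDS, analytics_view))
```

Because the raw records are never rewritten, a new project can add its own view function without another extract from the source systems.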

Challenges

  • Storing disparate, highly complex data from multiple sources and legacy systems, and making it available when needed
  • Transforming and normalizing stored data to meet the differing data requirements of new and ongoing projects and activities
  • Incorporating changing requirements and new data sources
  • Complying with a growing array of state and federal regulations
  • Adhering to stringent privacy and security requirements
  • Providing complete audit functionality

Benefits

  • Quick and highly cost-effective solution
  • Capture and storage of key data from many different sources in original, native format
  • “Just in time” transformation and normalization of data to meet specific project requirements
  • Access to, and better use of, existing data
  • Improved monitoring, reporting, and analysis
  • Improved staff productivity and accuracy
  • Increased detection and deterrence of fraud, waste, and abuse
