Data Solutions

Data Lake


The client needed to organize and streamline data to improve operator efficiency in a fast-paced call center.  The quicker decisions could be made from accessible real-time data, the sooner revenue growth could be realized.  The existing way of serving this data was not meeting the needs of the business.

With massive volumes of data from disparate sources trapped in legacy systems, custom software development was the only good choice.  Most of the data sat in siloed legacy systems that could not interact with current systems.  Each legacy system typically stored its data in a different format, which made the data harder to put to use.  The Software Consulting team at Sparkfish was brought in to determine whether a feasible solution existed.

We quickly identified the business challenges and mapped them to possible technical solutions.  As we analyzed each dataset, we identified its differences and the complexity of integrating it.  We built a team of Senior Developers, Project Managers, and Data Scientists to address the following challenges:


  • Storing and distributing highly complex data from multiple sources and legacy systems
  • Integrating multiple data formats into a single solution
  • Normalizing stored data to meet the requirements of new and ongoing back-end systems
  • Incorporating changing data management requirements
  • Complying with regulations for protected personal information (PPI)
  • Adhering to stringent privacy and security requirements
  • Providing complete audit functionality


Due to the complexity of the data and the cost of enterprise data warehouse development, our team developed an efficient alternative.  We designed and built a data lake using SQL.  We transformed and stored more than 10 GB of structured and semi-structured data, consolidating over 25,000 fields in 1,575 tables into 1,790 fields in 61 tables in real time.  The data lake is now used as needed for a variety of activities, including reporting, monitoring, and asynchronous predictive analytics.
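As a rough illustration of landing data from differently formatted sources in a single SQL store, here is a minimal sketch.  It is not the project's actual code: SQLite stands in for the SQL platform, and the table name `raw_events`, the source names, and the sample records are all hypothetical.

```python
import json
import sqlite3

# Hypothetical raw records from two legacy systems with different formats.
CRM_RECORD = {"CUST_ID": "1001", "CUST_NM": "Ada Smith", "PH": "555-0100"}
BILLING_RECORD = "1001|2023-04-01|49.99"  # pipe-delimited legacy export


def create_lake(conn):
    """Create one landing table that keeps every payload in its native form."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS raw_events (
            id        INTEGER PRIMARY KEY AUTOINCREMENT,
            source    TEXT NOT NULL,   -- which legacy system sent the record
            format    TEXT NOT NULL,   -- how the payload is encoded
            payload   TEXT NOT NULL,   -- the record exactly as received
            loaded_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )


def land(conn, source, fmt, payload):
    """Store one raw record without reshaping it; return its row id."""
    cur = conn.execute(
        "INSERT INTO raw_events (source, format, payload) VALUES (?, ?, ?)",
        (source, fmt, payload),
    )
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
create_lake(conn)
land(conn, "crm", "json", json.dumps(CRM_RECORD))
land(conn, "billing", "pipe", BILLING_RECORD)

rows = conn.execute(
    "SELECT source, format, payload FROM raw_events ORDER BY id"
).fetchall()
```

Recording the source and format next to each untouched payload means any later transformation can decide how to read a record without going back to the legacy system.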

The advantage of a data lake is that incoming data can remain as close as possible to its original, native format.  Using a “just in time” code set, we shaped the data to the unique requirements of each project or activity.  The information could now flow freely to each of the internal systems used in the call center.  Operator efficiency improved faster than expected.
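The “just in time” idea can be sketched as follows: raw records stay untouched, and each project declares only the fields it needs at read time.  This is a minimal sketch under stated assumptions, not the code used in the project; the record layouts, field names, and the `OPERATOR_VIEW` and `REPORTING_VIEW` mappings are hypothetical.

```python
import json

# Raw records kept exactly as received (hypothetical sources and field names).
RAW = [
    {"source": "crm", "format": "json",
     "payload": '{"CUST_ID": "1001", "CUST_NM": "Ada Smith", "PH": "555-0100"}'},
    {"source": "billing", "format": "pipe",
     "payload": "1001|2023-04-01|49.99"},
]


def parse(record):
    """Decode a raw payload into a dict according to its declared format."""
    if record["format"] == "json":
        return json.loads(record["payload"])
    if record["format"] == "pipe":
        cust_id, bill_date, amount = record["payload"].split("|")
        return {"CUST_ID": cust_id, "BILL_DT": bill_date, "AMT": amount}
    raise ValueError(f"unknown format: {record['format']}")


# Each view maps a target field name to (source field, converter).
OPERATOR_VIEW = {
    "customer_id": ("CUST_ID", str),
    "name": ("CUST_NM", str),
    "phone": ("PH", str),
}
REPORTING_VIEW = {
    "customer_id": ("CUST_ID", str),
    "amount": ("AMT", float),
}


def shape(records, source, view):
    """Shape records from one source into the fields a given view asks for."""
    shaped = []
    for record in records:
        if record["source"] != source:
            continue
        parsed = parse(record)
        shaped.append(
            {target: convert(parsed[src]) for target, (src, convert) in view.items()}
        )
    return shaped


operator_rows = shape(RAW, "crm", OPERATOR_VIEW)
reporting_rows = shape(RAW, "billing", REPORTING_VIEW)
```

Because the shaping happens at read time, a new project can add its own view without changing how the raw data is stored.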

The data lake, and its terabytes of data, lowered costs and increased capabilities.  By reducing the impact on source systems and making data available faster, the solution positioned the client to realize revenue growth within four months.


  • Quick and highly cost-effective solution for managing structured and raw data
  • Capture and storage of key data from many different sources in original, native format
  • “Just in time” transformation and normalization of data to meet specific project requirements
  • Modern data aggregation with API development connectors built in
  • Access to and better use of existing data
  • Improved monitoring, reporting, and analysis
  • Improved staff productivity and accuracy
  • Increased detection and deterrence of fraud, waste, and abuse
