Designing the workflows and framework
As the final stage of the data lake setup, the following work was carried out so that data could be mined and reported as required:
- Developed the DevOps process
- Set up monitoring, alerts, and scheduling to track job failures and successes
- Built an ETL framework on AWS Glue to discover data and store the associated metadata in the AWS Glue Data Catalog
- Made the catalogued data searchable through Elasticsearch and available to business applications through SQL queries
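
To make the Glue-based cataloguing step more concrete, here is a minimal Python sketch of how a crawler could be registered on a schedule and how the resulting catalog metadata could be flattened into documents for a search index. It is an illustration under stated assumptions, not the framework actually built for this project: the function names, crawler name, IAM role, database name, and S3 path are all hypothetical, and `glue_client` is assumed to be a boto3 Glue client (or any object exposing the same `create_crawler`, `start_crawler`, and `get_paginator` calls).

```python
from typing import Any, Dict, List, Optional


def build_crawler_request(name: str, role_arn: str, database: str,
                          s3_path: str,
                          schedule: Optional[str] = None) -> Dict[str, Any]:
    """Build the keyword arguments for Glue's create_crawler call.

    All values are supplied by the caller; nothing here is taken from
    the original project.
    """
    request: Dict[str, Any] = {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }
    if schedule:
        # Glue expects a cron expression such as "cron(0 2 * * ? *)".
        request["Schedule"] = schedule
    return request


def run_crawler(glue_client: Any, request: Dict[str, Any]) -> None:
    """Create the crawler, then start an immediate first run."""
    glue_client.create_crawler(**request)
    glue_client.start_crawler(Name=request["Name"])


def list_catalog_tables(glue_client: Any, database: str) -> List[Dict[str, Any]]:
    """Collect every table definition in one Data Catalog database."""
    paginator = glue_client.get_paginator("get_tables")
    tables: List[Dict[str, Any]] = []
    for page in paginator.paginate(DatabaseName=database):
        tables.extend(page["TableList"])
    return tables


def tables_to_search_docs(database: str,
                          tables: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Flatten Glue table metadata into simple documents that a search
    engine such as Elasticsearch could index (hypothetical document shape)."""
    docs = []
    for table in tables:
        storage = table.get("StorageDescriptor", {})
        docs.append({
            "database": database,
            "table": table["Name"],
            "location": storage.get("Location"),
            "columns": [col["Name"] for col in storage.get("Columns", [])],
        })
    return docs
```

With boto3 installed and AWS credentials configured, `boto3.client("glue")` could be passed as `glue_client`. Keeping the request builder and the document flattening as pure functions means they can be checked without touching AWS; the indexing call itself is left out because the index layout used in the project is not described here.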
Author:
Aditya Sathyadev, Co-founder and Director – BizAcuity
IIT (BHU), IIM (Calcutta)