A data lake is a central place to store all your data for analytics purposes.

  • Fully managed service that makes it easy to set up a data lake in days (instead of months)
  • Discover, cleanse, transform, and ingest data with automation (collecting, moving, cataloging…)
  • Combine structured and unstructured data
  • Out-of-the-box source blueprints: S3, RDS, relational & NoSQL databases
  • Fine-grained access control
  • Built on top of AWS Glue (though users do not interact with it directly)

One of the main advantages is centralized access control: permissions for every service that accesses the data lake are managed in one place, which makes them easier to administer.
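
As a rough illustration of centralized, fine-grained permissions, the sketch below assumes the service described in these notes is AWS Lake Formation and uses the boto3 "lakeformation" client's grant_permissions call. The role ARN, database, table, and column names are hypothetical placeholders, and the helper functions are illustrative only, not part of any AWS library.

```python
from typing import Any, Dict, List


def build_column_grant(
    principal_arn: str, database: str, table: str, columns: List[str]
) -> Dict[str, Any]:
    """Build keyword arguments for a column-level SELECT grant.

    The dictionary follows the request shape of the boto3
    lakeformation client's grant_permissions call.
    """
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": columns,  # only these columns become readable
            }
        },
        "Permissions": ["SELECT"],
    }


def apply_grant(request: Dict[str, Any]) -> None:
    """Send the grant to AWS (needs boto3 and valid credentials)."""
    import boto3  # imported lazily so the builder above runs without AWS access

    client = boto3.client("lakeformation")
    client.grant_permissions(**request)


if __name__ == "__main__":
    # Hypothetical principal and catalog names, for illustration only.
    request = build_column_grant(
        "arn:aws:iam::111122223333:role/analyst",
        "sales_db",
        "orders",
        ["order_id", "amount"],
    )
    print(request)
    # apply_grant(request)  # uncomment to run against a real account
```

The point of the sketch is that one grant, defined in a single place, can limit a principal to specific columns of a table, instead of maintaining separate policies in each analytics service.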