
Introduction to Big Data Engineering

This course introduces learners to the fundamentals of big data systems, tools, and techniques, progressing to intermediate-level skills in designing, building, and managing big data pipelines. Students will explore key frameworks like Hadoop and Apache Spark, work with data storage solutions, and delve into data ingestion, processing, and analytics. The course also covers cloud solutions, stream processing, machine learning, and data governance, with a focus on hands-on practice and real-world applications.

Course Duration: 36 hours
Level: Beginner to Intermediate

Course Objectives

  • Understand the core concepts of big data and its characteristics (Volume, Velocity, Variety, Veracity, Value)

  • Learn how to work with data storage systems, including HDFS and Amazon S3

  • Gain proficiency in big data frameworks, including Hadoop and Apache Spark (a short illustrative sketch follows this list)

  • Develop skills in building ETL pipelines with tools such as Apache Kafka and Apache NiFi

  • Understand cloud-based big data solutions and tools

  • Learn the basics of machine learning on big data

  • Establish and manage data governance and security practices

  • Complete a hands-on capstone project to design and implement a scalable big data pipeline
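
As a taste of the hands-on work, the sketch below shows the kind of minimal PySpark batch job built early in the course. It is illustrative only: it assumes a local Spark installation, and the input file "events.csv" and its columns (user_id, event_type) are hypothetical placeholders.

    # Minimal PySpark batch job: ingest a CSV, aggregate, persist results.
    # Assumes a local Spark install; "events.csv" and its columns are
    # hypothetical placeholders for this sketch.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = (
        SparkSession.builder
        .appName("course-demo")   # job name shown in the Spark UI
        .master("local[*]")       # run locally on all available cores
        .getOrCreate()
    )

    # Ingest: load raw event data with a header row and inferred types.
    events = spark.read.csv("events.csv", header=True, inferSchema=True)

    # Process: count events per user and event type, busiest first.
    counts = (
        events.groupBy("user_id", "event_type")
              .agg(F.count("*").alias("n_events"))
              .orderBy(F.desc("n_events"))
    )

    # Analyze and persist: inspect the top rows, then write Parquet.
    counts.show(10)
    counts.write.mode("overwrite").parquet("event_counts.parquet")

    spark.stop()

The same read-transform-write pattern scales from a laptop to a cluster largely by changing the master setting and the storage paths.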

Prerequisites

  • Basic programming knowledge (Python or Java preferred)

  • Familiarity with databases and basic SQL

  • Understanding of fundamental data structures and algorithms
