Course:
Postgraduate Program in Smart Cities (Pós-Graduação em Cidades Inteligentes)
Curricular unit:
Big Data Analytics
Semester:
Spring
Number of credits:
7.5
Number of class hours per week:
2.00
Objectives of the curricular unit:
Big data is a blanket term for the non-traditional strategies and technologies needed to gather, organize, process, and extract insights from large datasets. In this course, we will discuss the challenges created by Big Data and some of the state-of-the-art approaches to deal with them.
In this curricular unit, students will gain practical experience with Hadoop, Hive, and Spark and understand their role in the analytical workflow of a data scientist. Lectures will cover the complex and heterogeneous Big Data ecosystem and the privacy and societal implications of these technologies. In the labs, students will gain hands-on experience with the state-of-the-art tools and methods associated with the analysis of Big Data.
Intended Learning Objectives
- Explain what Big Data is and what its implications for society are;
- Identify the sources of Big Data;
- Describe the architecture of Big Data Systems;
- Explain the core technologies that enabled the Big Data revolution;
- Understand the role and importance of the Hadoop Ecosystem;
- Explain what Map-Reduce and HDFS are, and describe their role in the Hadoop Ecosystem;
- Set up a Hive Data Warehouse;
- Explore and Analyze data with Hive;
- Understand which data can be ingested by Flume and Sqoop, and how to do it;
- Understand what Spark is;
- Load, Transform and Analyze data using Spark;
- Manipulate structured data with SparkSQL;
- Analyze large networked data with Spark GraphX;
- Develop Spark applications to create machine learning models.
Attendance requirements:
Introductory knowledge of programming in Python or another programming language.
Familiarity with structured databases and with SQL.
Language of instruction:
Portuguese. If there are foreign students or teachers, classes will be taught in English.