BDA400 - Introduction to Big Data

Outline info
Last revision date 2018-07-20 11:56:02.957
Last review date 2018-07-20 11:56:14.583

Subject Title
Introduction to Big Data

Subject Description
The data warehousing and analytics marketplace continues to be one of the fastest-growing areas of technology and applications. Much of this growth is fueled by applications that leverage new types of data to gain new insights into their industry, competition and customers. This subject introduces students to big data and explains key concepts such as the Hybrid Data Warehouse, Logical Data Warehouse and Data Lake. It provides hands-on experience with the following: open source capabilities such as Hadoop and Spark; working with non-traditional data, both semi-structured and unstructured, such as text, JSON and social data; and performing analytics using capabilities such as SQL over Hadoop and MapReduce.

Credit Status
1 Credit

Learning Outcomes
Upon successful completion of this subject the student will be able to:

  1. Explain how traditional data warehouse environments are evolving and what new business problems are being solved
  2. Explain key concepts such as the Logical Data Warehouse, Data Lake and Hybrid Data Warehouse
  3. Explain the purpose of key open source and big data components such as HDFS, MapReduce, YARN, Ambari, ZooKeeper, Hive, HBase and Spark
  4. Leverage open source components to perform text analytics
  5. Leverage open source components to run SQL over data stored in Hadoop
  6. Leverage MapReduce to run analytics in a Hadoop environment
  7. Leverage SparkSQL, SparkR, SparkML and other APIs in a Hadoop environment
  8. Perform management tasks in a Hadoop environment
  9. Set up security in a Hadoop environment
  10. Deal with data movement between a traditional relational database and a Hadoop environment

Essential Employability Skills
Communicate clearly, concisely and correctly in the written, spoken and visual form that fulfils the purpose and meets the needs of the audience.

Execute mathematical operations accurately.

Apply a systematic approach to solve problems.

Locate, select, organize, and document information using appropriate technology and information systems.

Analyze, evaluate, and apply relevant information from a variety of sources.

Manage the use of time and other resources to complete projects.

Cheating and Plagiarism
Each student should be aware of the College's policy regarding Cheating and Plagiarism. Seneca's Academic Policy will be strictly enforced.

To support academic honesty at Seneca College, all work submitted by students may be reviewed for authenticity and originality, utilizing software tools and third-party services. Please visit the Academic Honesty site for further information regarding cheating and plagiarism policies and procedures.

All students and employees have the right to study and work in an environment that is free from discrimination and/or harassment. Language or activities that defeat this objective violate the College Policy on Discrimination/Harassment and shall not be tolerated. Information and assistance are available from the Student Conduct Office at

Accommodation for Students with Disabilities
The College will provide reasonable accommodation to students with disabilities in order to promote academic success. If you require accommodation, contact the Counselling and Disabilities Services Office at ext. 22900 to initiate the process for documenting, assessing and implementing your individual accommodation needs.

Promotion Policy

Grading Policy
A+     90% to 100%
A      80% to 89%
B+     75% to 79%
B      70% to 74%
C+     65% to 69%
C      60% to 64%
D+     55% to 59%
D      50% to 54%
F      0% to 49% (Not a Pass)
EXC    Excellent
SAT    Satisfactory
UNSAT  Unsatisfactory

For further information, see a copy of the Academic Policy, available online or at Seneca's Registrar's Offices.

Modes of Evaluation

Grade Breakdown:
40% - Final Exam
30% - Midterm
30% - Assignments
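
As an illustration only, the breakdown above can be combined with the Grading Policy bands to estimate a final letter grade. The sketch below is not an official calculator: the function names are hypothetical, each component score is assumed to be a percentage from 0 to 100, and each band is assumed to apply from its lower bound upward (so 89.6% is treated as an A), which the outline does not state.

```python
# Component weights in percent, taken from the Grade Breakdown.
WEIGHTS = {"final_exam": 40, "midterm": 30, "assignments": 30}

# (lower bound in percent, letter), highest band first, from the Grading Policy.
BANDS = [
    (90, "A+"), (80, "A"), (75, "B+"), (70, "B"),
    (65, "C+"), (60, "C"), (55, "D+"), (50, "D"), (0, "F"),
]


def final_percentage(scores):
    """Weighted average of component scores, each given as 0-100."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(percentage):
    """Map a final percentage to a letter grade.

    Assumption: a band applies from its lower bound upward, so
    fractional values fall into the band below the next bound.
    """
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    for lower_bound, letter in BANDS:
        if percentage >= lower_bound:
            return letter


if __name__ == "__main__":
    scores = {"final_exam": 85, "midterm": 70, "assignments": 90}
    total = final_percentage(scores)
    print(total, letter_grade(total))
```

With these example scores, the weighted total is 0.40 x 85 + 0.30 x 70 + 0.30 x 90 = 82, which falls in the A band.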

Approved by: Sharon Estok