Course Description: Data-driven models are revolutionizing science and industry. These models depend on systems that can collect, stream, process, and validate data at scale. This course is an introduction to “big” data engineering in which students receive hands-on experience building and deploying realistic data-intensive systems. It covers streaming, data cleaning, relational data modeling and SQL, and machine learning model training. A core theme of the course is “scale”: we will discuss the theory and practice of programming with large external datasets that cannot fit in main memory on a single machine. The course consists of bi-weekly programming assignments, a midterm examination, and a final examination.
Location: MWF 9:30-10:20 SHFE 203
Office Hours: MW 4:30-5:30 243 JCL (Sanjay)
Office Hours (TA): Wed 11-12 (Rose), Thurs 9:30-10:30 (Will), both in 259 JCL
Grading: Quizzes (10%), Homework (20%), Midterm (30%), Final (40%). The exam schedule is listed below:
- Midterm (6:30pm-8:30pm May 10)
- Final (10:30am-12:30pm June 12)
- For any conflicts, a makeup exam will be scheduled prior to these times. It is your responsibility to coordinate this well in advance.
Late Policy: Late work receives 0%. Reasonable exceptions, such as family emergencies or illness, will be considered.
Official Communication: The TA(s) and Instructor WILL NOT respond to personal emails. Please communicate through Piazza either with a public post if it is of general interest or a private message.
| Date | Topic | Assignments |
|---|---|---|
| 4/1 | Course Introduction (pdf) | |
| 4/5 | Operators (pdf) (submission instructions) | HW0 |
| 4/8 | Composing Operators (pdf) (db.py) | |
| 4/10 | Main-Memory Aggregation (pdf) | |
| 4/12 | Out-of-core algorithms (pdf) (iosim.py) | |
| 4/15 | Out-of-core cont’d / Hash Join (pdf) | HW1 |
| 4/17 | In Class Quiz | |
| 4/26 | SQL II | HW2 OUT |
| 4/29 | Intro to Machine Learning | HW1 DUE |
| 5/3 | ML Systems I | |
| 5/6 | ML Systems II | |
| 5/8 | (Guest Lecture) Midterm Review | |
| 5/10 | ETL and Potter’s Wheel | Midterm |
| 5/13 | Potter’s Wheel Cont’d | HW2 DUE |
| 5/15 | Formal Logic/Abstract Algebra Review | HW3 OUT |
| 5/17 | Integrity Constraints I | |
| 5/20 | Integrity Constraints II | |
| 5/31 | Knowledge Bases I | |
| 6/3 | Knowledge Bases II | |
| 6/5 | View From The Top | HW4 DUE |