Background

Every day, thousands of videos and images are uploaded to the web, creating an ever-growing demand for methods to make them easier to retrieve, search, and index. YouTube alone claims that 100 hours' worth of video material is uploaded every minute. Most of this material consists of consumer-produced, “unconstrained” videos from social media networks, such as YouTube uploads or Flickr content. Since many of these videos reflect people's everyday life experience, they constitute a corpus of never-before-seen scale for empirical research. But what methods can cope with this amount of data? How does one approach the problems in a research setting, with or without thousands of compute cores at one's disposal? How does one budget and estimate the time needed to perform such experiments? What rigor has to be employed to make experiments robust in evaluations? What about reproducibility?

It turns out that there are many myths in Machine Learning that this class will dispel. For example, speech recognition does not need the cloud and many computer vision tasks do not need a GPU.

Content

This class will provide a theoretical and practical perspective on the experimental design for machine learning experiments on multimedia data.

The class consists of lectures and a hands-on component. The lectures provide a theoretical introduction to machine learning design and signal processing for various media types, including the visual content, the temporal structure, acoustic content, metadata, user comments, etc. Moreover, the lectures will discuss contemporary work in the field of multimedia content analysis. Guest speakers will enrich the class with their experiences.

For the hands-on component, students will receive Amazon EC2 credits to implement a project discussed with the instructor. Experiments can be performed using the Multimedia Commons infrastructure on the YFCC100M corpus.

The Multimedia Commons. See http://mmcommons.org

Lectures

1) Motivation: Lecture Slides

2) Scientific process revisited, deterministic and statistical machine learning: Lecture Slides, Homework.

3) Guest talks: Rishi Puri, Zachary Stone (UC Berkeley). Homework

4) Fundamentals of Machine Learning: Capacity in general and why it is important. Lecture Slides, Homework

5) Capacity for Neural Networks. Lecture Slides, Homework. Demo: Tensorflow Meter

6) Estimating required capacity given a task. Lecture Slides. Homework. Demo: Capacity estimation tools.

7) Fundamentals of Machine Learning: Generalization and Adversarial Examples. Lecture Slides, Homework.

8) Guest Talk: Joel Hestness (Cerebras). Lecture Slides.

9) The Relationship between Image Classification and Compression. Lecture Slides.

10) Perceptual Data Best Practices for Audio. Lecture Slides.

11) Perceptual Data Best Practices for Video. Lecture Slides.

12) Multimedia Retrieval Research. Lecture Slides.

13) Student presentations

14) Summary and QA: On whiteboard. Slides from ACM MM Tutorial.

Grading

To pass, students have to attend regularly and write a report related to a project as outlined below.

Master of Engineering students are required to take the final exam; everybody else is encouraged to take it, but it is optional. Weekly homework will be assigned; it is optional but highly recommended, especially for non-graduate students.

Without a final, the project counts for 100% of the grade. If a final is taken, the project counts for 80% of the grade and the final for 20%. If the final is taken voluntarily, it either improves the grade or does not count.
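The weighting rule can be sketched in a few lines of Python. This is only an illustration, not the official grade computation: the function name `course_score`, the 0-100 score scale, and the `final_required` flag are assumptions made for the example, and "only improves the grade or does not count" is read here as taking the better of the project-only score and the 80/20 weighted score.

```python
from typing import Optional


def course_score(
    project: float,
    final: Optional[float] = None,
    final_required: bool = False,
) -> float:
    """Combine project and final scores (assumed 0-100 scale).

    Hypothetical helper illustrating the weighting described above.
    """
    if final is None:
        if final_required:
            raise ValueError("a final score is needed when the final is required")
        # No final taken: the project is the whole grade.
        return project

    # Final taken: project counts 80%, final counts 20%.
    weighted = 0.8 * project + 0.2 * final

    if final_required:
        return weighted

    # Voluntary final: it can only help, otherwise it is ignored.
    return max(project, weighted)


if __name__ == "__main__":
    print(course_score(90.0))                             # project only
    print(course_score(90.0, 50.0, final_required=True))  # required final
    print(course_score(90.0, 50.0))                       # voluntary final, ignored
    print(course_score(80.0, 100.0))                      # voluntary final, helps
```

Under this reading, a voluntary final with a score below the project score leaves the grade unchanged, while a required final always enters at 20%.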

The final is open book. I definitely recommend bringing the Machine Learning Experimental Design Cheat Sheet.

Project Requirements

Teams of 1 to 3 students enrolled in the class choose a project to either produce or reproduce a scientific result that entails machine learning. They write a paper that judges the experimental design and limits of the machine learning approach.

The project should comment on all measurements taken and scientific methods applied. Intellectual tools for this are presented in class and are summarized in the Machine Learning Experimental Process overview sheet. The report must be in written form and can be accompanied by slides, code, and videos.

Further inspiration can be found here:

- ACM reproducibility guidelines and the ACM SIG Multimedia reproducibility guidelines.

- The 10 questions sheet from the lecture.

In this class, we do not care about accuracy as much as we care about reproducibility and scientific judgement of the experimental design. A project will not fail due to low accuracy.

Important: Group projects must describe each individual student's contributions in a paragraph.

Optional: Submit a paper to ACM Multimedia, ACM ICMR, IEEE MIPR, IEEE ICASSP, or another conference about your excellent results.

Enrollment

EE, CS, and data science MS and PhD students can directly enroll. Undergraduates should contact me for enrollment details. Priority will be given to URAP students of the multimedia group.

Pre-Requisites

The class requires solid programming skills and assumes familiarity with fundamental statistical concepts like the central limit theorem, probability distributions, and information measures. Familiarity with basic signal processing and computer architecture is helpful. Furthermore, a team-working attitude and open-mindedness towards interdisciplinary approaches are essential.

Materials

The Machine Learning Experimental Design Cheat Sheet helps with the ML fundamentals of the class.

The Machine Learning Experimental Process is an overview of the suggested experimental process.

The 10 questions sheet.

David MacKay's fantastic book (especially Chapter 40) can be consulted for depth.

In general, the supporting materials used for this class consist of contemporary research articles from conferences and journals. Details will be presented in class. I humbly recommend my textbook from Cambridge University Press. An overview of research on a large-scale video analysis task is given in the Springer book. Also check out our demo on (deep) neural network capacity.

Lectures from the 2012 version of the class (before deep learning) can be accessed here.

Experimental Design for Machine Learning on Multimedia Data

Fall 2019

CS294-082 Lecture (CCN 33112) -- Room 310 Soda -- Fridays 3:30-5pm

Prof. Gerald Friedland