Abstract: We present a fully automatic system for extracting the semantic structure of a typical academic presentation video, which captures the whole presentation stage with abundant camera motions ...