Machine learning methods for audio-visual event analysis