Knowledge-Augmented Vision-and-Language Assistant