Leveraging privileged information and symmetry for policy learning in partially observable robotic systems