Reinforcement learning for proactive content caching in wireless networks