Learning and hoping to contribute to general-purpose AI one day
Python, PyTorch, OpenAI Gym, Keras, TensorFlow, scikit-learn, Jupyter, Google Colab, Docker
Learning from OpenAI contributions and breakthrough research papers, I explored state-of-the-art model-free algorithms like Proximal Policy Optimization (PPO), Deep Deterministic Policy Gradient (DDPG), Vanilla Policy Gradient (VPG), Trust Region Policy Optimization (TRPO), Twin Delayed DDPG (TD3), Soft Actor-Critic (SAC), Deep Q-Network (DQN), and Advantage Actor-Critic in its synchronous and asynchronous forms (A2C/A3C), as well as model-based algorithms like the famous AlphaZero with Monte Carlo Tree Search (MCTS). For each algorithm, I studied its benefits, use cases, and implementations. Archived GitHub repository available upon request.