API
Common: Auxiliary modules like trainer and logger.
- Engine: Engine for building Hsuanwu applications.
- Logger: Logger for managing output information.
Xploit: Modules that focus on exploitation in RL.
- Agent: Agent for interacting and learning.
Type | Algorithm |
---|---|
On-Policy | A2C 🖥️⛓️💰, PPO 🖥️⛓️💰, DAAC 🖥️⛓️💰, DrAC 🖥️⛓️💰🔭, DrDAAC 🖥️⛓️💰🔭 |
Off-Policy | DQN 🖥️⛓️💰, DDPG 🖥️⛓️💰, SAC 🖥️⛓️💰, DrQ-v2 🖥️⛓️💰🔭 |
Distributed | IMPALA ⛓️ |
- 🖥️: Supports neural-network processing units (NPUs).
- ⛓️: Supports multiprocessing.
- 💰: Supports intrinsic reward shaping.
- 🔭: Supports observation augmentation.
Module | Recurrent | Box | Discrete | MultiBinary | Multiprocessing | NPU | Paper | Citations |
---|---|---|---|---|---|---|---|---|
SAC | ❌ | ✔️ | ❌ | ❌ | ❌ | ✔️ | Link | 5077⭐ |
DrQ | ❌ | ✔️ | ❌ | ❌ | ❌ | ✔️ | Link | 433⭐ |
DDPG | ❌ | ✔️ | ❌ | ❌ | ❌ | ✔️ | Link | 11819⭐ |
DrQ-v2 | ❌ | ✔️ | ❌ | ❌ | ❌ | ✔️ | Link | 100⭐ |
DAAC | ❌ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | Link | 56⭐ |
PPO | ❌ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | Link | 11155⭐ |
DrAC | ❌ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | Link | 29⭐ |
IMPALA | ✔️ | ✔️ | ✔️ | ❌ | ✔️ | ✔️ | Link | 1219⭐ |
Tips of Agent
- 🐌: In development.
- NPU: Supports neural-network processing units.
- Recurrent: Supports recurrent neural networks.
- Box: An N-dimensional box containing every point in the action space.
- Discrete: A list of possible actions, where only one action can be taken at each timestep.
- MultiBinary: A list of possible actions, where any combination of actions can be taken at each timestep.
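To make the three action-space kinds concrete, here is a minimal pure-Python sketch of how an action is sampled from each (these helper functions are illustrative, not part of Hsuanwu's API):

```python
import random

def sample_box(low, high, n):
    """Box: an N-dimensional box; every point inside it is a valid action."""
    return [random.uniform(low, high) for _ in range(n)]

def sample_discrete(n):
    """Discrete: exactly one of n actions is chosen per timestep."""
    return random.randrange(n)

def sample_multibinary(n):
    """MultiBinary: each of n actions is independently on or off."""
    return [random.randint(0, 1) for _ in range(n)]

box_action = sample_box(-1.0, 1.0, 3)       # 3 floats in [-1, 1]
discrete_action = sample_discrete(5)        # an int in [0, 5)
multibinary_action = sample_multibinary(4)  # e.g. [1, 0, 0, 1]
```

This is why, for instance, SAC and DDPG in the table support only Box: their policies output continuous action vectors.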
- Encoder: Neural network-based encoders for processing observations.
Module | Input | Reference | Target Task |
---|---|---|---|
EspeholtResidualEncoder | Images | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures | Atari or Procgen games. |
IdentityEncoder | States | N/A | DeepMind Control Suite: state |
MnihCnnEncoder | Images | Playing Atari with Deep Reinforcement Learning | Atari games. |
TassaCnnEncoder | Images | DeepMind Control Suite | DeepMind Control Suite: pixel |
PathakCnnEncoder | Images | Curiosity-Driven Exploration by Self-Supervised Prediction | Atari or MiniGrid games. |
VanillaMlpEncoder | States | N/A | DeepMind Control Suite: state |
Tips of Encoder
- Naming Rule: 'surname of the first author' + 'backbone' + 'Encoder'.
- Input: The expected observation type.
- Target Task: The tasks tested in the original paper, or potential target tasks.
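As an illustration of what such an encoder looks like, here is a rough PyTorch sketch in the style of MnihCnnEncoder (the exact Hsuanwu implementation may differ): the three-layer CNN from "Playing Atari with Deep Reinforcement Learning", mapping stacked 84x84 frames to a flat feature vector.

```python
import torch
from torch import nn

class MnihStyleCnnEncoder(nn.Module):
    """Illustrative sketch, not Hsuanwu's actual MnihCnnEncoder class."""

    def __init__(self, in_channels: int = 4, feature_dim: int = 512) -> None:
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        # An 84x84 input becomes 7x7x64 after the conv stack.
        self.linear = nn.Linear(64 * 7 * 7, feature_dim)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # Scale uint8 pixel values to [0, 1] before encoding.
        return self.linear(self.trunk(obs / 255.0))

encoder = MnihStyleCnnEncoder()
features = encoder(torch.zeros(1, 4, 84, 84))  # shape (1, 512)
```

The downstream agent (e.g. DQN or PPO) then operates on this 512-dimensional feature vector rather than on raw pixels.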
- Storage: Modules for storing collected experiences.
Module | Remark |
---|---|
VanillaRolloutStorage | On-Policy RL |
VanillaReplayStorage | Off-Policy RL |
NStepReplayStorage | Off-Policy RL |
PrioritizedReplayStorage | Off-Policy RL |
DistributedStorage | Distributed RL |
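The idea behind the replay storages can be sketched in a few lines. The following is a simplified stand-in for something like VanillaReplayStorage (the class name and interface here are illustrative, not Hsuanwu's actual API):

```python
import random
from collections import deque

class SimpleReplayStorage:
    """Minimal uniform replay buffer for off-policy RL (illustrative only)."""

    def __init__(self, capacity: int) -> None:
        # Oldest transitions are evicted first once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def add(self, obs, action, reward, next_obs, done) -> None:
        self.buffer.append((obs, action, reward, next_obs, done))

    def sample(self, batch_size: int):
        # Uniform sampling; a prioritized storage would instead weight
        # transitions, e.g. by TD error.
        return random.sample(self.buffer, batch_size)

    def __len__(self) -> int:
        return len(self.buffer)

storage = SimpleReplayStorage(capacity=1000)
for step in range(10):
    storage.add(obs=step, action=0, reward=1.0, next_obs=step + 1, done=False)
batch = storage.sample(batch_size=4)  # 4 random transitions
```

The on-policy rollout storage differs mainly in that it is cleared after each policy update, while NStepReplayStorage additionally folds n-step returns into each stored transition.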
Xplore: Modules that focus on exploration in RL.
- Augmentation: PyTorch nn.Module-like modules for observation augmentation.
Module | Input | Reference |
---|---|---|
GaussianNoise | States | Reinforcement Learning with Augmented Data |
RandomAmplitudeScaling | States | Reinforcement Learning with Augmented Data |
GrayScale | Images | Reinforcement Learning with Augmented Data |
RandomColorJitter | Images | Reinforcement Learning with Augmented Data |
RandomConvolution | Images | Reinforcement Learning with Augmented Data |
RandomCrop | Images | Reinforcement Learning with Augmented Data |
RandomCutout | Images | Reinforcement Learning with Augmented Data |
RandomCutoutColor | Images | Reinforcement Learning with Augmented Data |
RandomFlip | Images | Reinforcement Learning with Augmented Data |
RandomRotate | Images | Reinforcement Learning with Augmented Data |
RandomShift | Images | Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning |
RandomTranslate | Images | Reinforcement Learning with Augmented Data |
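To illustrate what these augmentations do, here is a simplified NumPy sketch of the random-shift idea used by DrQ-v2 (Hsuanwu's modules are PyTorch nn-like and operate on batches; this single-image version is only for intuition): pad the image with edge pixels, then crop back to the original size at a random offset.

```python
import numpy as np

def random_shift(image: np.ndarray, pad: int = 4) -> np.ndarray:
    """image: (H, W, C) array; returns a randomly shifted copy, same shape."""
    h, w, _ = image.shape
    # Replicate edge pixels so the shifted image has no black borders.
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    # Pick a random crop offset in [0, 2 * pad].
    top = np.random.randint(0, 2 * pad + 1)
    left = np.random.randint(0, 2 * pad + 1)
    return padded[top:top + h, left:left + w]

obs = np.random.randint(0, 256, size=(84, 84, 3), dtype=np.uint8)
shifted = random_shift(obs)  # same shape, shifted by up to ±4 pixels
```

Applying such augmentations to observations before the encoder is what the 🔭 mark in the agent table refers to.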
- Distribution: Distributions for sampling actions.
Module | Type | Reference |
---|---|---|
NormalNoise | Noise | torch.distributions |
OrnsteinUhlenbeckNoise | Noise | Continuous Control with Deep Reinforcement Learning |
TruncatedNormalNoise | Noise | Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning |
Bernoulli | Distribution | torch.distributions |
Categorical | Distribution | torch.distributions |
DiagonalGaussian | Distribution | torch.distributions |
SquashedNormal | Distribution | torch.distributions |
Tips of Distribution
- In Hsuanwu, action noise is implemented as a Distribution, so that noises and distributions share a unified interface.
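For intuition about the noise modules, here is a pure-Python sketch of Ornstein-Uhlenbeck noise (from "Continuous Control with Deep Reinforcement Learning"); the class below is illustrative and does not mirror Hsuanwu's Distribution interface exactly:

```python
import math
import random

class OrnsteinUhlenbeckNoiseSketch:
    """Mean-reverting random walk used as temporally correlated action noise."""

    def __init__(self, mu=0.0, theta=0.15, sigma=0.2, dt=1e-2):
        self.mu, self.theta, self.sigma, self.dt = mu, theta, sigma, dt
        self.x = mu  # current noise state

    def sample(self) -> float:
        # Drift toward mu, perturbed by scaled Gaussian noise; successive
        # samples are correlated, unlike independent NormalNoise draws.
        dx = (self.theta * (self.mu - self.x) * self.dt
              + self.sigma * math.sqrt(self.dt) * random.gauss(0.0, 1.0))
        self.x += dx
        return self.x

noise = OrnsteinUhlenbeckNoiseSketch()
samples = [noise.sample() for _ in range(5)]  # temporally correlated values
```

The correlation between successive samples is what makes OU noise useful for exploration in continuous control: the perturbation pushes the action in a consistent direction for a while instead of jittering around zero.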
- Reward: Intrinsic reward modules for enhancing exploration.
Module | Remark | Repr. | Visual | Reference |
---|---|---|---|---|
PseudoCounts | Count-based exploration | ✔️ | ✔️ | Never Give Up: Learning Directed Exploration Strategies |
ICM | Curiosity-driven exploration | ✔️ | ✔️ | Curiosity-Driven Exploration by Self-Supervised Prediction |
RND | Count-based exploration | ❌ | ✔️ | Exploration by Random Network Distillation |
GIRM | Curiosity-driven exploration | ✔️ | ✔️ | Intrinsic Reward Driven Imitation Learning via Generative Model |
NGU | Memory-based exploration | ✔️ | ✔️ | Never Give Up: Learning Directed Exploration Strategies |
RIDE | Procedurally-generated environment | ✔️ | ✔️ | RIDE: Rewarding Impact-Driven Exploration for Procedurally-Generated Environments |
RE3 | Entropy Maximization | ❌ | ✔️ | State Entropy Maximization with Random Encoders for Efficient Exploration |
RISE | Entropy Maximization | ❌ | ✔️ | Rényi State Entropy Maximization for Exploration Acceleration in Reinforcement Learning |
REVD | Divergence Maximization | ❌ | ✔️ | Rewarding Episodic Visitation Discrepancy for Exploration in Reinforcement Learning |
Tips of Reward
- 🐌: In development.
- Repr.: The method involves representation learning.
- Visual: The method works well in visual RL.
See Tutorials: Use intrinsic reward and observation augmentation for usage examples.
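The common idea behind these modules is to add a bonus to the extrinsic reward that decays as states become familiar. The simplest instance is a count-based bonus proportional to 1/sqrt(N(s)), where N(s) counts visits to state s; methods like PseudoCounts and RND generalize this to large or continuous state spaces where exact counting is infeasible. A minimal sketch (class and method names here are illustrative, not Hsuanwu's API):

```python
import math
from collections import Counter

class CountBasedBonus:
    """Intrinsic reward 1/sqrt(N(s)) for tabular state spaces (illustrative)."""

    def __init__(self, scale: float = 1.0) -> None:
        self.counts = Counter()
        self.scale = scale

    def compute_irs(self, state) -> float:
        # Increment the visit count, then return a bonus that shrinks
        # as the state is revisited.
        self.counts[state] += 1
        return self.scale / math.sqrt(self.counts[state])

bonus = CountBasedBonus()
first = bonus.compute_irs("s0")   # 1.0: novel state, maximal bonus
repeat = bonus.compute_irs("s0")  # ~0.707: bonus decays on revisit
```

The agent then optimizes the sum of extrinsic and intrinsic reward, which is what the 💰 mark in the agent table refers to.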
Evaluation: Reasonable and reliable metrics for algorithm evaluation.
See Tutorials: Evaluate your model.
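One example of such a metric is the interquartile mean (IQM) over runs, popularized for RL evaluation by "Deep Reinforcement Learning at the Edge of the Statistical Precipice"; whether Hsuanwu's Evaluation module computes exactly this variant is an assumption, so treat the sketch below as illustrative:

```python
def interquartile_mean(scores):
    """Mean of the middle 50% of scores: robust to outlier runs."""
    s = sorted(scores)
    n = len(s)
    lo, hi = n // 4, n - n // 4  # drop the bottom and top quartiles
    middle = s[lo:hi]
    return sum(middle) / len(middle)

runs = [0.1, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 9.0]
iqm = interquartile_mean(runs)  # 1.05: ignores the 0.1 and 9.0 outliers
```

Unlike the plain mean, the IQM is not dragged around by a single lucky or catastrophic seed, which makes cross-algorithm comparisons more reliable with few runs.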
Env: Packaged environments (e.g., Atari games) for fast invocation.
Module | Name | Remark | Reference |
---|---|---|---|
make_atari_env | Atari Games | Discrete control | The Arcade Learning Environment: An Evaluation Platform for General Agents |
make_bullet_env | PyBullet Robotics Environments | Continuous control | Pybullet: A Python Module for Physics Simulation for Games, Robotics and Machine Learning |
make_dmc_env | DeepMind Control Suite | Continuous control | DeepMind Control Suite |
make_minigrid_env | MiniGrid Games | Discrete control | Minimalistic Gridworld Environment for Gymnasium |
make_procgen_env | Procgen Games | Discrete control | Leveraging Procedural Generation to Benchmark Reinforcement Learning |
make_robosuite_env | Robosuite Robotics Environments | Continuous control | Robosuite: A Modular Simulation Framework and Benchmark for Robot Learning |
Pre-training: Methods of pre-training in RL.
See Tutorials: Pre-training in Hsuanwu.