
API (old)

Common: Auxiliary modules like trainer and logger.

  • Engine: Engine for building Hsuanwu applications.
  • Logger: Logger for managing output information.

Xploit: Modules that focus on exploitation in RL.

  • Agent: Agent for interacting and learning.

| Type | Algorithm |
|:-|:-|
| On-Policy | A2C 🖥️⛓️💰, PPO 🖥️⛓️💰, DAAC 🖥️⛓️💰, DrAC 🖥️⛓️💰🔭, DrDAAC 🖥️⛓️💰🔭 |
| Off-Policy | DQN 🖥️⛓️💰, DDPG 🖥️⛓️💰, SAC 🖥️⛓️💰, DrQ-v2 🖥️⛓️💰🔭 |
| Distributed | IMPALA ⛓️ |

  • 🖥️: Supports neural-network processing units (NPU).
  • ⛓️: Supports multi-processing.
  • 💰: Supports intrinsic reward shaping.
  • 🔭: Supports observation augmentation.

| Module | Recurrent | Box | Discrete | MultiBinary | Multi Processing | NPU | Paper | Citations |
|:-|:-:|:-:|:-:|:-:|:-:|:-:|:-|:-|
| SAC | | ✔️ | | | | ✔️ | Link | 5077⭐ |
| DrQ | | ✔️ | | | | ✔️ | Link | 433⭐ |
| DDPG | | ✔️ | | | | ✔️ | Link | 11819⭐ |
| DrQ-v2 | | ✔️ | | | | ✔️ | Link | 100⭐ |
| DAAC | | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | Link | 56⭐ |
| PPO | | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | Link | 11155⭐ |
| DrAC | | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | Link | 29⭐ |
| IMPALA | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | | Link | 1219⭐ |

Tips for Agent

  • 🐌: In development.
  • NPU: Supports neural-network processing units.
  • Recurrent: Supports recurrent neural networks.
  • Box: An N-dimensional box that contains every point in the action space.
  • Discrete: A list of possible actions, where only one action can be used at each timestep.
  • MultiBinary: A list of possible actions, where any combination of actions can be used at each timestep.
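
As a usage illustration for the agents above, here is a minimal sketch of the intended workflow (build an environment, pick an agent, train). The import paths and keyword arguments below are hypothetical, not the verbatim Hsuanwu API; see the tutorials for the actual entry points.

```python
# Sketch only: module paths and keyword arguments are hypothetical and
# merely illustrate the workflow (build an environment, pick an agent, train).
from hsuanwu.env import make_atari_env      # hypothetical import path
from hsuanwu.xploit.agent import PPO        # hypothetical import path

if __name__ == "__main__":
    # Vectorized Atari environment for an on-policy agent.
    env = make_atari_env(env_id="Alien-v5", num_envs=8)
    # PPO supports Box/Discrete/MultiBinary actions, multi-processing,
    # intrinsic reward shaping, and observation augmentation (see table above).
    agent = PPO(env=env, seed=1, device="cuda")
    agent.train(num_train_steps=1_000_000)   # hypothetical signature
```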
  • Encoder: Neural network-based encoders for processing observations.

| Module | Input | Reference | Target Task |
|:-|:-|:-|:-|
| EspeholtResidualEncoder | Images | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures | Atari or Procgen games |
| IdentityEncoder | States | N/A | DeepMind Control Suite: state |
| MnihCnnEncoder | Images | Playing Atari with Deep Reinforcement Learning | Atari games |
| TassaCnnEncoder | Images | DeepMind Control Suite | DeepMind Control Suite: pixel |
| PathakCnnEncoder | Images | Curiosity-Driven Exploration by Self-Supervised Prediction | Atari or MiniGrid games |
| VanillaMlpEncoder | States | N/A | DeepMind Control Suite: state |

Tips for Encoder

  • Naming Rule: 'Surname of the first author' + 'Backbone' + 'Encoder'.
  • Input: Input type.
  • Target Task: The tasks evaluated in the original paper, or potential target tasks.
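
Encoders are plain PyTorch modules that map raw observations to feature vectors. Below is a minimal, illustrative re-implementation of a Mnih-style CNN encoder (the architecture from "Playing Atari with Deep Reinforcement Learning"); it is a sketch of the idea, not the library's own MnihCnnEncoder.

```python
import torch
from torch import nn

class MnihStyleCnnEncoder(nn.Module):
    """Illustrative CNN encoder in the spirit of Mnih et al. (2013)."""

    def __init__(self, obs_channels: int = 4, feature_dim: int = 256) -> None:
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(obs_channels, 16, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        # Infer the flattened size from a dummy 84x84 observation.
        with torch.no_grad():
            n_flat = self.trunk(torch.zeros(1, obs_channels, 84, 84)).shape[1]
        self.linear = nn.Sequential(nn.Linear(n_flat, feature_dim), nn.ReLU())

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # Scale pixel observations to [0, 1] before the convolutions.
        return self.linear(self.trunk(obs / 255.0))
```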
  • Storage: Storage modules for collected experiences.

| Module | Remark |
|:-|:-|
| VanillaRolloutStorage | On-Policy RL |
| VanillaReplayStorage | Off-Policy RL |
| NStepReplayStorage | Off-Policy RL |
| PrioritizedReplayStorage | Off-Policy RL |
| DistributedStorage | Distributed RL |
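
To make the role of a storage concrete, here is a minimal NumPy ring buffer in the spirit of an off-policy replay storage. It is an illustrative sketch, not the library's VanillaReplayStorage.

```python
import numpy as np

class SimpleReplayStorage:
    """Illustrative ring-buffer replay storage for off-policy RL (sketch only)."""

    def __init__(self, obs_shape, action_dim, capacity=100_000):
        self.capacity, self.idx, self.full = capacity, 0, False
        self.obs = np.zeros((capacity, *obs_shape), dtype=np.float32)
        self.actions = np.zeros((capacity, action_dim), dtype=np.float32)
        self.rewards = np.zeros((capacity, 1), dtype=np.float32)
        self.next_obs = np.zeros((capacity, *obs_shape), dtype=np.float32)
        self.dones = np.zeros((capacity, 1), dtype=np.float32)

    def add(self, obs, action, reward, next_obs, done):
        """Insert one transition, overwriting the oldest entry when full."""
        i = self.idx
        self.obs[i], self.actions[i] = obs, action
        self.rewards[i], self.next_obs[i], self.dones[i] = reward, next_obs, done
        self.idx = (self.idx + 1) % self.capacity
        self.full = self.full or self.idx == 0

    def sample(self, batch_size=256):
        """Sample a uniform random batch of stored transitions."""
        high = self.capacity if self.full else self.idx
        idxs = np.random.randint(0, high, size=batch_size)
        return (self.obs[idxs], self.actions[idxs], self.rewards[idxs],
                self.next_obs[idxs], self.dones[idxs])
```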

Xplore: Modules that focus on exploration in RL.

  • Augmentation: PyTorch.nn-like modules for observation augmentation.

| Module | Input | Reference |
|:-|:-|:-|
| GaussianNoise | States | Reinforcement Learning with Augmented Data |
| RandomAmplitudeScaling | States | Reinforcement Learning with Augmented Data |
| GrayScale | Images | Reinforcement Learning with Augmented Data |
| RandomColorJitter | Images | Reinforcement Learning with Augmented Data |
| RandomConvolution | Images | Reinforcement Learning with Augmented Data |
| RandomCrop | Images | Reinforcement Learning with Augmented Data |
| RandomCutout | Images | Reinforcement Learning with Augmented Data |
| RandomCutoutColor | Images | Reinforcement Learning with Augmented Data |
| RandomFlip | Images | Reinforcement Learning with Augmented Data |
| RandomRotate | Images | Reinforcement Learning with Augmented Data |
| RandomShift | Images | Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning |
| RandomTranslate | Images | Reinforcement Learning with Augmented Data |
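
These augmentations act on batched observation tensors like ordinary torch.nn modules. As an illustration of the idea, here is a minimal pad-and-random-crop shift in the style popularized by DrQ-v2; it is a sketch, not the library's RandomShift implementation.

```python
import torch
import torch.nn.functional as F
from torch import nn

class RandomShiftAug(nn.Module):
    """Illustrative random-shift augmentation: replicate-pad, then random crop."""

    def __init__(self, pad: int = 4) -> None:
        super().__init__()
        self.pad = pad

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width) image batch.
        n, _, h, w = x.shape
        padded = F.pad(x, [self.pad] * 4, mode="replicate")
        out = torch.empty_like(x)
        # Crop each sample back to (h, w) at a random offset.
        for i in range(n):
            top = int(torch.randint(0, 2 * self.pad + 1, (1,)))
            left = int(torch.randint(0, 2 * self.pad + 1, (1,)))
            out[i] = padded[i, :, top:top + h, left:left + w]
        return out
```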
  • Distribution: Distributions for sampling actions.

| Module | Type | Reference |
|:-|:-|:-|
| NormalNoise | Noise | torch.distributions |
| OrnsteinUhlenbeckNoise | Noise | Continuous Control with Deep Reinforcement Learning |
| TruncatedNormalNoise | Noise | Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning |
| Bernoulli | Distribution | torch.distributions |
| Categorical | Distribution | torch.distributions |
| DiagonalGaussian | Distribution | torch.distributions |
| SquashedNormal | Distribution | torch.distributions |

Tips for Distribution

  • In Hsuanwu, action noise is implemented as a Distribution so that deterministic and stochastic policies share a unified sampling interface, as sketched below.
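
To make that concrete, here is a minimal sketch of a Gaussian action-noise wrapper that exposes the same sample()/mean interface as a probability distribution; it is illustrative only, not the library's NormalNoise.

```python
import torch
from torch import distributions as pyd

class GaussianActionNoise:
    """Illustrative noise wrapper that mimics a distribution's interface."""

    def __init__(self, mu: torch.Tensor, sigma: float = 0.1) -> None:
        # `mu` is the deterministic action proposed by the policy.
        self.mu = mu
        self.dist = pyd.Normal(loc=torch.zeros_like(mu), scale=sigma)

    def sample(self, clip=None) -> torch.Tensor:
        """Return a noisy action; optionally clip the noise (as in TD3/DrQ-v2)."""
        noise = self.dist.sample()
        if clip is not None:
            noise = noise.clamp(-clip, clip)
        return (self.mu + noise).clamp(-1.0, 1.0)

    @property
    def mean(self) -> torch.Tensor:
        # Evaluation-time action: the deterministic policy output itself.
        return self.mu
```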
  • Reward: Intrinsic reward modules for enhancing exploration.

| Module | Remark | Repr. | Visual | Reference |
|:-|:-|:-:|:-:|:-|
| PseudoCounts | Count-based exploration | ✔️ | ✔️ | Never Give Up: Learning Directed Exploration Strategies |
| ICM | Curiosity-driven exploration | ✔️ | ✔️ | Curiosity-Driven Exploration by Self-Supervised Prediction |
| RND | Count-based exploration | | ✔️ | Exploration by Random Network Distillation |
| GIRM | Curiosity-driven exploration | ✔️ | ✔️ | Intrinsic Reward Driven Imitation Learning via Generative Model |
| NGU | Memory-based exploration | ✔️ | ✔️ | Never Give Up: Learning Directed Exploration Strategies |
| RIDE | Procedurally-generated environments | ✔️ | ✔️ | RIDE: Rewarding Impact-Driven Exploration for Procedurally-Generated Environments |
| RE3 | Entropy maximization | | ✔️ | State Entropy Maximization with Random Encoders for Efficient Exploration |
| RISE | Entropy maximization | | ✔️ | Rényi State Entropy Maximization for Exploration Acceleration in Reinforcement Learning |
| REVD | Divergence maximization | | ✔️ | Rewarding Episodic Visitation Discrepancy for Exploration in Reinforcement Learning |

Tips for Reward

  • 🐌: In development.
  • Repr.: The method involves representation learning.
  • Visual: The method works well in visual RL.

See Tutorials: Use intrinsic reward and observation augmentation for usage examples.
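
As an example of how such a module computes a bonus, here is a minimal sketch of an RE3-style reward: observations are embedded with a fixed, randomly initialized encoder, and the intrinsic reward is the log-distance to the k-th nearest neighbour in embedding space. This illustrates the technique and is not the library's RE3 module.

```python
import torch

def re3_style_intrinsic_reward(embeddings: torch.Tensor, k: int = 3) -> torch.Tensor:
    """Illustrative k-NN entropy bonus: r_i = log(||y_i - y_i^(k-NN)|| + 1).

    `embeddings` has shape (num_steps, embed_dim) and is assumed to come from a
    fixed, randomly initialized encoder applied to the collected observations.
    """
    dists = torch.cdist(embeddings, embeddings, p=2)               # pairwise distances
    knn_dists, _ = torch.topk(dists, k + 1, dim=1, largest=False)  # includes self (distance 0)
    return torch.log(knn_dists[:, -1] + 1.0)                       # distance to k-th neighbour
```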

Evaluation: Reasonable and reliable metrics for algorithm evaluation.

See Tutorials: Evaluate your model.
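
These metrics follow the "reliable evaluation" methodology of aggregating over runs with robust statistics such as the interquartile mean (IQM) rather than a plain mean. A minimal NumPy sketch of the IQM, shown only to illustrate the kind of metric involved:

```python
import numpy as np

def interquartile_mean(scores: np.ndarray) -> float:
    """Mean of the middle 50% of scores: drop the bottom and top quartiles."""
    scores = np.sort(scores.reshape(-1))
    cut = int(np.floor(0.25 * scores.size))
    return float(scores[cut:scores.size - cut].mean())

# Example: normalized scores from 10 runs of one task.
runs = np.array([0.10, 0.20, 0.50, 0.55, 0.60, 0.62, 0.70, 0.75, 0.90, 3.00])
print(interquartile_mean(runs))  # robust to the outlier run (3.00)
```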

Env: Packaged environments (e.g., Atari games) for fast invocation.

| Module | Name | Remark | Reference |
|:-|:-|:-|:-|
| make_atari_env | Atari Games | Discrete control | The Arcade Learning Environment: An Evaluation Platform for General Agents |
| make_bullet_env | PyBullet Robotics Environments | Continuous control | Pybullet: A Python Module for Physics Simulation for Games, Robotics and Machine Learning |
| make_dmc_env | DeepMind Control Suite | Continuous control | DeepMind Control Suite |
| make_minigrid_env | MiniGrid Games | Discrete control | Minimalistic Gridworld Environment for Gymnasium |
| make_procgen_env | Procgen Games | Discrete control | Leveraging Procedural Generation to Benchmark Reinforcement Learning |
| make_robosuite_env | Robosuite Robotics Environments | Continuous control | Robosuite: A Modular Simulation Framework and Benchmark for Robot Learning |
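
Each maker returns a wrapped environment that plugs directly into the agents above. A minimal sketch of invoking one of them follows; the argument names shown (env_id, num_envs, seed) are hypothetical and may differ from the actual signatures.

```python
# Sketch only: the argument names below are hypothetical and may differ
# from the actual maker signatures documented in the tutorials.
from hsuanwu.env import make_dmc_env   # hypothetical import path

if __name__ == "__main__":
    # A DeepMind Control Suite task with pixel observations.
    env = make_dmc_env(env_id="cartpole_balance", num_envs=1, seed=1)
    obs, info = env.reset()            # Gymnasium-style reset
    print(obs.shape)
```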

Pre-training: Methods of pre-training in RL.

See Tutorials: Pre-training in Hsuanwu.

Deployment: Methods of model deployment in RL.

See Tutorials: Deploy your model in inference devices.
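
In practice, deployment amounts to exporting the trained policy network to an inference format supported by the target device. Below is a minimal sketch that exports a stand-in actor to ONNX with PyTorch's built-in exporter; the actual Hsuanwu deployment workflow is described in the tutorial above.

```python
import torch
from torch import nn

# Stand-in actor network; in practice this is the trained policy's actor.
actor = nn.Sequential(nn.Linear(9, 256), nn.ReLU(), nn.Linear(256, 1), nn.Tanh())
actor.eval()

# Trace the actor with a dummy state and export it to ONNX, which inference
# engines such as ONNX Runtime or TensorRT can consume.
dummy_state = torch.randn(1, 9)
torch.onnx.export(actor, dummy_state, "actor.onnx",
                  input_names=["state"], output_names=["action"])
```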