
Devlog #8 - Evaluating Trained Models

Digital Media Final Project

Model Comparison

| Parameter | Default Model | SearchAgent (Custom) | Description |
|---|---|---|---|
| trainer_type | ppo | ppo | Same algorithm used |
| max_steps | 500,000 | 3,000,000 | Longer training to allow more learning and convergence |
| summary_freq | 50,000 | 10,000 | More frequent summaries for closer monitoring |
| HYPERPARAMETERS | | | |
| learning_rate | 3e-4 | 3e-4 | No change |
| batch_size | 1024 | 1024 | No change |
| buffer_size | 10,240 | 10,240 | No change |
| beta | not set | 2.5e-4 | Entropy regularisation strength; a low value keeps the policy less random |
| epsilon | not set | 0.2 | PPO clipping parameter that limits the size of each policy update |
| lambd | not set | 0.95 | GAE lambda for the bias-variance trade-off in advantage estimation |
| num_epoch | not set | 3 | Number of passes over the buffer per policy update |
| learning_rate_schedule | linear | linear | Gradual learning rate decay over training |
| NETWORK SETTINGS | | | |
| hidden_units | 128 | 256 | Larger network capacity to learn more complex features |
| num_layers | 2 | 2 | Same depth, balancing expressiveness and speed |
| normalize | false | true | Normalise inputs to stabilise and speed up training |
| REWARD SIGNALS: EXTRINSIC | | | |
| gamma | 0.99 | 0.99 | No change |
| strength | 1.0 | 1.0 | No change |
| REWARD SIGNALS: CURIOSITY | | | |
| strength | not set | 0.1 | Added curiosity for intrinsic motivation/exploration |
| gamma | not set | 0.99 | Discount factor for curiosity rewards |
| learning_rate | not set | 0.0003 | Learning rate specific to the curiosity module |
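
For reference, the SearchAgent (Custom) column could be written as an ML-Agents trainer configuration roughly like the sketch below. This is a reconstruction from the table values, not the project's actual file: the behaviour name `SearchAgent` is assumed from the column header, and the nesting follows the usual ML-Agents YAML layout.

```yaml
# Sketch of a trainer config matching the SearchAgent (Custom) column.
# Assumption: the behaviour name in the Unity scene is "SearchAgent".
behaviors:
  SearchAgent:
    trainer_type: ppo
    max_steps: 3000000        # default model used 500000
    summary_freq: 10000       # default model used 50000
    hyperparameters:
      learning_rate: 3.0e-4
      batch_size: 1024
      buffer_size: 10240
      beta: 2.5e-4            # entropy regularisation strength
      epsilon: 0.2            # PPO clipping parameter
      lambd: 0.95             # GAE lambda
      num_epoch: 3
      learning_rate_schedule: linear
    network_settings:
      hidden_units: 256       # default model used 128
      num_layers: 2
      normalize: true         # default model used false
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
      curiosity:              # not present in the default model
        strength: 0.1
        gamma: 0.99
        learning_rate: 3.0e-4
```

Saved under a hypothetical name such as `search_agent.yaml`, a file like this would typically be passed to `mlagents-learn` along with a `--run-id`, so that each model's summaries end up in a separate run for comparison.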