Module moog_demos.example_configs.predators_arena
Task with predators chasing agent in open arena.
The predators (red circles) chase the agent. The predators bounce off the arena boundaries, while the agent cannot exit the arena but does not bounce (i.e. its collisions with the boundaries are inelastic). Trials terminate only when the agent is caught by a predator. The subject controls the agent with a joystick.
This task also contains an auto-curriculum: When the subject does well (evades the predators for a long time before being caught), the predators' masses are decreased, thereby increasing the predators' speeds. Conversely, when the subject does poorly (gets caught quickly), the predators' masses are increased, thereby decreasing the predators' speeds.
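As a rough sketch of the curriculum rule implemented in StateInitialization.state_initializer below (the standalone function and its name here are illustrative, not part of the module):

def update_predator_mass(mass, trial_len, threshold_trial_len=200,
                         step_scaling_factor=0.1):
    # A trial longer than the threshold means the subject did well, so make
    # the predators lighter (hence faster); a shorter trial makes them
    # heavier (hence slower).
    if trial_len > threshold_trial_len:
        return mass * (1. - step_scaling_factor)
    return mass * (1. + step_scaling_factor)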
Functions
def get_config(num_predators)
-
def get_config(num_predators):
    """Get config dictionary of kwargs for environment constructor.

    Args:
        num_predators: Int. Number of predators.
    """

    ############################################################################
    # Sprite initialization
    ############################################################################

    state_initialization = StateInitialization(
        num_predators=num_predators,
        step_scaling_factor=0.1,
        threshold_trial_len=200,
    )

    ############################################################################
    # Physics
    ############################################################################

    agent_friction_force = physics_lib.Drag(coeff_friction=0.25)
    predator_friction_force = physics_lib.Drag(coeff_friction=0.04)
    predator_random_force = physics_lib.RandomForce(max_force_magnitude=0.03)
    predator_attraction = physics_lib.DistanceForce(
        physics_lib.linear_force_fn(zero_intercept=-0.0025, slope=0.0001))
    elastic_asymmetric_collision = physics_lib.Collision(
        elasticity=1., symmetric=False)
    inelastic_asymmetric_collision = physics_lib.Collision(
        elasticity=0., symmetric=False)
    forces = (
        (agent_friction_force, 'agent'),
        (predator_friction_force, 'predators'),
        (predator_random_force, 'predators'),
        (predator_attraction, 'agent', 'predators'),
        (elastic_asymmetric_collision, 'predators', 'walls'),
        (inelastic_asymmetric_collision, 'agent', 'walls'),
    )
    physics = physics_lib.Physics(*forces, updates_per_env_step=10)

    ############################################################################
    # Task
    ############################################################################

    task = tasks.ContactReward(
        -1, layers_0='agent', layers_1='predators', reset_steps_after_contact=0)

    ############################################################################
    # Action space
    ############################################################################

    action_space = action_spaces.Joystick(
        scaling_factor=0.01, action_layers='agent')

    ############################################################################
    # Observer
    ############################################################################

    observer = observers.PILRenderer(
        image_size=(64, 64), anti_aliasing=1, color_to_rgb='hsv_to_rgb')

    ############################################################################
    # Game rules
    ############################################################################

    def _increment_count(meta_state):
        # Increment the per-trial step counter read by StateInitialization.
        meta_state['step_count'] += 1

    rules = game_rules.ModifyMetaState(_increment_count)

    ############################################################################
    # Final config
    ############################################################################

    config = {
        'state_initializer': state_initialization.state_initializer,
        'physics': physics,
        'task': task,
        'action_space': action_space,
        'observers': {'image': observer},
        'game_rules': (rules,),
        'meta_state_initializer': state_initialization.meta_state_initializer,
    }
    return config
Get config dictionary of kwargs for environment constructor.
Args
num_predators
- Int. Number of predators.
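A minimal usage sketch, assuming the MOOG environment constructor accepts this config dictionary as keyword arguments; the import path and the reset/step API below are assumptions, not part of this module:

from moog import environment
from moog_demos.example_configs import predators_arena

config = predators_arena.get_config(num_predators=3)
env = environment.Environment(**config)

timestep = env.reset()
# The Joystick action space takes a 2-D displacement controlling the agent.
timestep = env.step([0., 0.5])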
Classes
class StateInitialization (num_predators, step_scaling_factor, threshold_trial_len)
-
class StateInitialization():
    """State initialization class to dynamically adapt predator mass.

    This is essentially an auto-curriculum: When the subject does well (evades
    the predators for a long time before being caught), the predators' masses
    are decreased, thereby increasing the predators' speeds. Conversely, when
    the subject does poorly (gets caught quickly), the predators' masses are
    increased, thereby decreasing the predators' speeds.
    """

    def __init__(self, num_predators, step_scaling_factor, threshold_trial_len):
        """Constructor.

        This class uses the meta-state to keep track of the number of steps
        before the agent is caught. See the game rules section near the bottom
        of this file for the counter incrementer.

        Args:
            num_predators: Int. Number of predators.
            step_scaling_factor: Float. Fractional decrease of predator mass
                after a trial longer than threshold_trial_len. Also used as
                fractional increase of predator mass after a trial shorter
                than threshold_trial_len. Should be small and positive.
            threshold_trial_len: Length of a trial above which the predator
                mass is decreased and below which the predator mass is
                increased.
        """
        self._mass = 1.
        self._step_scaling_factor = step_scaling_factor
        self._threshold_trial_len = threshold_trial_len

        # Agent
        agent_factors = distribs.Product(
            [distribs.Continuous('x', 0., 1.),
             distribs.Continuous('y', 0., 1.)],
            shape='circle', scale=0.1, c0=0.33, c1=1., c2=0.66,
        )
        self._agent_generator = sprite_generators.generate_sprites(
            agent_factors, num_sprites=1)

        # Predators
        predator_factors = distribs.Product(
            [distribs.Continuous('x', 0., 1.),
             distribs.Continuous('y', 0., 1.)],
            shape='circle', scale=0.1, c0=0., c1=1., c2=0.8,
        )
        self._predator_generator = sprite_generators.generate_sprites(
            predator_factors, num_sprites=num_predators)

        # Walls
        self._walls = shapes.border_walls(
            visible_thickness=0., c0=0., c1=0., c2=0.5)

        self._meta_state = None

    def state_initializer(self):
        """State initializer method to be fed to environment."""
        agent = self._agent_generator(without_overlapping=self._walls)
        predators = self._predator_generator(
            without_overlapping=self._walls + agent)

        if self._meta_state is not None:
            if self._meta_state['step_count'] > self._threshold_trial_len:
                self._mass -= self._mass * self._step_scaling_factor
            else:
                self._mass += self._mass * self._step_scaling_factor
        for s in predators:
            s.mass = self._mass

        state = collections.OrderedDict([
            ('walls', self._walls),
            ('agent', agent),
            ('predators', predators),
        ])
        return state

    def meta_state_initializer(self):
        """Meta-state initializer method to be fed to environment."""
        self._meta_state = {'step_count': 0}
        return self._meta_state
State initialization class to dynamically adapt predator mass.
This is essentially an auto-curriculum: When the subject does well (evades the predators for a long time before being caught), the predators' masses are decreased, thereby increasing the predators' speeds. Conversely, when the subject does poorly (gets caught quickly), the predators' masses are increased, thereby decreasing the predators' speeds.
Constructor.
This class uses the meta-state to keep track of the number of steps before the agent is caught. See the game rules section near the bottom of this file for the counter incrementer.
Args
num_predators
- Int. Number of predators.
step_scaling_factor
- Float. Fractional decrease of predator mass after a trial longer than threshold_trial_len. Also used as fractional increase of predator mass after a trial shorter than threshold_trial_len. Should be small and positive.
threshold_trial_len
- Length of a trial above which the predator mass is decreased and below which the predator mass is increased.
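A rough sketch exercising the mass-adaptation logic directly; in the actual environment the step counter is incremented every step by a ModifyMetaState game rule rather than set by hand:

init = StateInitialization(
    num_predators=3, step_scaling_factor=0.1, threshold_trial_len=200)

meta_state = init.meta_state_initializer()   # {'step_count': 0}
meta_state['step_count'] = 500               # pretend the agent survived 500 steps
state = init.state_initializer()             # 500 > 200, so mass drops 1.0 -> 0.9
print(state['predators'][0].mass)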
Methods
def meta_state_initializer(self)
-
def meta_state_initializer(self):
    """Meta-state initializer method to be fed to environment."""
    self._meta_state = {'step_count': 0}
    return self._meta_state
Meta-state initializer method to be fed to environment.
def state_initializer(self)
-
def state_initializer(self):
    """State initializer method to be fed to environment."""
    agent = self._agent_generator(without_overlapping=self._walls)
    predators = self._predator_generator(
        without_overlapping=self._walls + agent)

    if self._meta_state is not None:
        if self._meta_state['step_count'] > self._threshold_trial_len:
            self._mass -= self._mass * self._step_scaling_factor
        else:
            self._mass += self._mass * self._step_scaling_factor
    for s in predators:
        s.mass = self._mass

    state = collections.OrderedDict([
        ('walls', self._walls),
        ('agent', agent),
        ('predators', predators),
    ])
    return state
State initializer method to be fed to environment.