obs spaces simplified to boxes in init and reset, superfluous methods deleted;...
Files changed: 18 (+0, −132)
This is a re-implementation of the Anti-poaching game, motivated by the growing number of compatibility layers and hard-to-find bugs. This new version, tentatively numbered 0.3, should be in full agreement with the current description of the game in the NeurIPS draft.
Notable changes to the environment (and other resulting code) include:
- Poachers enter the `NULL_POS` state when captured, but are NOT terminated or truncated. Any penalty that they receive can now be assigned to them directly, without maintaining a global list of `total_rewards`. This is theoretically sound, and also drastically simplifies dealing with QMIX.
- Poachers lose C_{prey} for each prey they were carrying when captured, and C_{trap} for each trap. They lose C_{trap} whenever a trap (full or empty) is captured, and gain R_{trap} whenever they get a prey from a trap.

The code has been tested with the test suite (modified to handle the changed behaviour of the PettingZoo environment) and with the examples, including `manual_examples` and `rllib_examples`.
Note: `episode_reward_mean` is now 0 :)
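
For reviewers, the capture and reward rules described above can be sketched roughly as follows. This is only an illustration, not the code in this merge request: the `Poacher` class, the `capture` and `step_rewards` helpers, the `(-1, -1)` encoding of `NULL_POS`, and the numeric values of the constants are all assumptions made for the example. "For each trap" is read here as each trap the poacher is carrying when captured.

```python
from dataclasses import dataclass

# Illustrative values only (assumed); the real constants come from the environment.
C_PREY = 1.0  # penalty per prey carried when captured
C_TRAP = 0.5  # penalty per trap lost
R_TRAP = 2.0  # reward per prey retrieved from a trap
NULL_POS = (-1, -1)  # hypothetical sentinel; the actual encoding may differ


@dataclass
class Poacher:
    pos: tuple
    prey: int = 0
    traps: int = 0
    captured: bool = False


def capture(p: Poacher) -> float:
    """Move a poacher to NULL_POS and return its (negative) capture reward."""
    penalty = -(C_PREY * p.prey + C_TRAP * p.traps)
    p.pos, p.prey, p.traps, p.captured = NULL_POS, 0, 0, True
    return penalty


def step_rewards(poachers, captured_now, traps_seized, prey_collected):
    """Assign every reward directly to the poacher concerned.

    No global reward list is kept, and captured poachers are neither
    terminated nor truncated.
    """
    rewards, terminations, truncations = {}, {}, {}
    for name, p in poachers.items():
        r = 0.0
        if name in captured_now:
            r += capture(p)
        r -= C_TRAP * traps_seized.get(name, 0)    # placed traps that were captured
        r += R_TRAP * prey_collected.get(name, 0)  # prey taken from own traps
        rewards[name] = r
        terminations[name] = False
        truncations[name] = False
    return rewards, terminations, truncations


poachers = {
    "poacher_0": Poacher(pos=(2, 3), prey=2, traps=1),
    "poacher_1": Poacher(pos=(0, 0)),
}
rewards, terminations, truncations = step_rewards(
    poachers,
    captured_now={"poacher_0"},
    traps_seized={"poacher_1": 1},
    prey_collected={"poacher_1": 1},
)
print(rewards)  # {'poacher_0': -2.5, 'poacher_1': 1.5}
```

Because each reward is attached to the agent that earned it in the same step, the per-agent dictionaries can be returned as-is from a PettingZoo-style `step`, which is what makes the global `total_rewards` list unnecessary.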