Biology is an undeniable inspiration when it comes to robotics, not only because of its effectiveness but also because of its elegance and beauty. In the article "Braitenberg's dream: A tale of a brain is a tale of a body", we saw how to embody a computation. Embodiment allows us to integrate the rules of behaviour into the structure and shape of the agent.
There, we tasted different flavours of the two main actions every biological organism performs, approach and avoid, in what is called a "taxis" behaviour. The model's premise is that the agent's body decides what to move towards, given some sensed stimuli; the brain's role is pretty much just to establish the correct mapping between sensors and actuators.
Of course, the world is more complex than that, and for any living machine, there will be multiple mappings and orientations that will be beneficial under different circumstances. In that case, we can imagine extending our agent with additional sensors and mappings that differentiate, for example, between dangerous and desired properties of the world. The reader is encouraged to play around with such possibilities; in this article, we will extend our simple model to deal with something more pervasive in the real world: uncertainty. In doing so, we will illustrate the mechanism while avoiding unnecessary complexity.
Simple agents, complex behaviours I: Uncertainty [Robotics Tutorial]
Uncertainty plays a role in every part of robotics, from a noisy world to noisy sensors to noisy outcomes. It is not surprising then that most of the modern control theories in robotics are expressed in terms of probability theory. In a nutshell, it is natural to assume that an agent must collect samples from the world to create some internal model of the regularities it discovers by acting on it. While the nature of such models can get quite complicated, there is a simple mechanism that illustrates that principle: evidence accumulation.
A famous experiment in visual psychophysics exemplifies this idea. Monkeys were trained to look at a random pattern of moving dots and then indicate the perceived motion direction by moving their eyes in what is called a "saccadic movement". The catch is that not all the dots moved in the same direction, so the task became harder as the proportion of coherently moving dots approached chance. The difficulty in deciding showed up in the time it took the monkeys to respond. What was happening in the monkey's brain while it waited?
Remarkably, when measuring the activity of neurons in a specific part of the brain, there was a clear ramping up towards what looked like a threshold. Thresholds are a handy way of making decisions: just wait long enough until the evidence is overwhelming and then act! Many models of decision-making are variations of the same theme. So we can upgrade our little agent's brain to make decisions in a noisy environment by providing it with an evidence accumulator and a threshold.
The setting is similar to the previous article. We already have a differential drive robot with a brain that drives the wheels' speed in proportion to some sensed world property. Our new world is composed of many dots that move in different directions, parametrized by a coherence value. When the coherence is 1, all the points move to the right; when the coherence is 0, all the points move to the left. The robot's new sensor is such that it can only attend to one of the dots' velocities at each step, so it can only get an idea of the overall motion by sampling many such points one by one.
The Brain Accumulator
As we already said, our robot's brain will be an evidence accumulator. It is, at its heart, a one-dimensional stochastic process. Here is how it works: While there is no clear evidence, the system wanders around 0 in a random walk. However, once the agent observes a new piece of evidence, the accumulator jumps towards one of two thresholds on the positive and negative sides of the real line. If further evidence arrives that is consistent with the previous one, one of the two thresholds will eventually be crossed, and a decision will be made.
Mathematically, we express this as the following stochastic differential equation, which is, for our current purpose, just like a first-order ordinary differential equation with a noise term added:
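(The equation itself seems to have been lost here; one consistent way of writing it, reconstructed from the three terms described in the next paragraph, is:

x' = −a·x + c·E + σ·ξ(t)

where x is the accumulator state, E is the evidence observed at the current step, and ξ(t) is Gaussian white noise scaled by σ.)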
The first term on the right indicates a decay over time with rate "a": if no new evidence is collected for a while, the system returns to its neutral position; it is a kind of memory. The second represents the effect of new evidence, with "c" standing for the agent's confidence; an overconfident agent will take bigger jumps towards the threshold. Finally, the last term is a random Gaussian term, which makes the accumulation non-deterministic. We can use each of these three terms to express different ideas about behaviour. As an exercise, what can we say in behavioural terms about an agent with a faster decay? Or with higher variance in the random term?
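Discretised with a simple Euler-Maruyama scheme, the accumulator fits in a few lines of Python. This is an illustrative sketch, not the article's actual implementation; the function name and default parameter values are assumptions:

```python
import numpy as np

def accumulator_step(x, evidence, a=0.5, c=1.0, sigma=0.1, dt=0.1, rng=None):
    """One Euler-Maruyama step of the accumulator x' = -a*x + c*E + noise.

    - a:     decay rate (how fast the memory fades back to neutral)
    - c:     confidence (how big a jump each piece of evidence causes)
    - sigma: scale of the Gaussian noise term
    """
    if rng is None:
        rng = np.random.default_rng()
    dW = rng.normal(0.0, np.sqrt(dt))  # Gaussian (Wiener) increment
    return x + (-a * x + c * evidence) * dt + sigma * dW
```

Setting `sigma = 0` recovers the deterministic part: with no evidence the state decays towards 0, and each observation pushes it by `c * evidence * dt` towards one of the thresholds.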
The accumulation function replaces our sensory input in the original implementation of the Braitenberg vehicle.
Note that we drive the agent forward unless one of the two thresholds is crossed. Once they are crossed, we change the left or right wheel speed to orient the agent towards the perceived motion direction. The rest of the main loop remains the same: we compute the speed and orientation using the forward kinematics model.
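A minimal sketch of this control rule, assuming a decision threshold `THETA`, a base forward speed and an axle length; these names and values are illustrative, not the article's actual code:

```python
import numpy as np

THETA = 1.0        # decision threshold (assumed value)
BASE_SPEED = 1.0   # forward wheel speed while undecided
AXLE = 0.5         # distance between the wheels (assumed)

def wheel_speeds(x):
    """Map the accumulator state to (left, right) wheel speeds.

    While |x| < THETA the agent drives straight ahead; once a
    threshold is crossed, one wheel slows down so the agent turns
    towards the perceived motion direction.
    """
    if x >= THETA:         # rightward motion perceived: turn right
        return BASE_SPEED, 0.2 * BASE_SPEED
    if x <= -THETA:        # leftward motion perceived: turn left
        return 0.2 * BASE_SPEED, BASE_SPEED
    return BASE_SPEED, BASE_SPEED

def forward_kinematics(pose, v_left, v_right, dt=0.1):
    """Standard differential-drive forward kinematics."""
    px, py, heading = pose
    v = 0.5 * (v_left + v_right)       # linear speed
    omega = (v_right - v_left) / AXLE  # angular speed
    return (px + v * np.cos(heading) * dt,
            py + v * np.sin(heading) * dt,
            heading + omega * dt)
```

Each step of the main loop then reads: update the accumulator, map its state to wheel speeds, and integrate the pose with the forward kinematics.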
Now let's program the experimental task. We initialize an array of random points and a specified direction in the proportion given by the coherence parameter for each of them.
We randomly select one point at each task step and pass the velocity to the agent. After that, we move the agent and the points. Note that we wrap the space around to keep the dots and the robot in the same two-dimensional domain. Finally, we make some plots of the dots, their velocity vectors, the agent and the accumulator's state.
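The task itself can be sketched as follows; the helper names and the wrapped unit-square domain are assumptions for illustration:

```python
import numpy as np

def make_dots(n, coherence, speed=0.05, extent=1.0, rng=None):
    """Random dot field: a `coherence` fraction moves right, the rest left."""
    if rng is None:
        rng = np.random.default_rng()
    pos = rng.uniform(0.0, extent, size=(n, 2))
    n_right = int(round(coherence * n))
    vx = np.where(np.arange(n) < n_right, speed, -speed)
    vel = np.column_stack([vx, np.zeros(n)])
    return pos, vel

def step_dots(pos, vel, extent=1.0):
    """Move the dots and wrap them back into the domain (a torus)."""
    return (pos + vel) % extent, vel

def sample_velocity(vel, rng=None):
    """The sensor attends to a single randomly chosen dot per step."""
    if rng is None:
        rng = np.random.default_rng()
    return vel[rng.integers(len(vel))]
```

At each step the sampled velocity is fed to the accumulator as evidence, the dots and the agent are moved, and both are wrapped back into the domain.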
Let's see how our agent behaves in this forgiving but uncertain world. When the evidence is strong and almost all the points move in the same direction, the agent follows the direction of movement without much dithering.
In the left figure, we see the path of the robot, which goes straight up until it makes the decision. In the right image, we can see the exact moment the accumulation crosses the threshold. In the same situation, an overconfident agent will naturally make an earlier decision:
However, when the points are not coherent, a "normal" agent will take longer to decide. Note that the accumulation goes up and down trying to cope with the noisy environment. In the end, however, the right decision is made most of the time.
In this scenario, with coherence = 0.43, an overconfident agent makes a faster decision but then dithers for a while, "wondering" whether it was the correct decision after all; by being impulsive, it even makes mistakes.
Here, for example, with coherence = 0.45, the agent would have benefited from collecting more evidence.
While the scenario presented in this article is a bit contrived, evidence accumulation underlies many models that estimate latent variables of the world to drive actions. In such models, the estimated entities are sometimes distributions, priors or other mathematical objects that encode the uncertainty of the situation. We will turn to more complex scenarios in future articles. For now, it is important to add that the opportunity to collect more samples by acting on the world can turn this straightforward model into a valuable tool for generating interesting autonomous behaviours.
Consider, for example, the case in which you have a very rough visual system intended to detect balls for the robot to play with. We can strengthen an initially uncertain estimate that a given object is a ball by pushing it and observing whether it rolls; in this way, the evidence for a ball's presence can quickly become overwhelming.
This principle can be beneficial in human-robot interaction scenarios as well. Given the fuzziness of human behaviour, the robot can accumulate evidence about the nature of a particular interaction and act accordingly: Is the human angry? Is it sad?
- Haefner, R. M., Berkes, P., & Fiser, J. (2016). Perceptual decision-making as probabilistic inference by neural sampling. Neuron, 90(3), 649-660.
- Gerstner, W., Kistler, W. M., Naud, R., & Paninski, L. (2014). Neuronal dynamics: From single neurons to networks and models of cognition. Cambridge University Press.