Teaching a glider to soar using reinforcement learning in the field

1 Like

Also looking at using deep reinforcement learning for perching manoeuvres, reducing drag and Flow manipulation

Would love to see a single line AWES attempt that landing

I think this could actually be quite useful for AWE. Im just not sure who will use it first. I think if you cant solve a task poorly using traditional code, then you will not have a platform on which to apply this AI.

So I see this useful in an electrically articulated kite for tasks such as:

  • landing safely in near ground wind
  • optimizing power output in production
  • keeping a kite airborne in a nonproductive «safe linger» mode

Im sure there’s more…

1 Like

Deep learning[edit]

Depiction of a basic artificial neural network

Deep learning is a form of machine learning that utilizes a neural network to transform a set of inputs into a set of outputs via an artificial neural network. Deep learning methods, often using supervised learning with labeled datasets, have been shown to solve tasks that involve handling complex, high-dimensional raw input data such as images, with less manual feature engineering than prior methods, enabling significant progress in several fields including computer vision and natural language processing.

Reinforcement learning[edit]

Diagram explaining the loop recurring in reinforcement learning algorithms

Diagram of the loop recurring in reinforcement learning algorithms

Reinforcement learning is a process in which an agent learns to make decisions through trial and error. This problem is often modeled mathematically as a Markov decision process (MDP), where an agent at every timestep is in a state {\displaystyle s}s, takes action {\displaystyle a}a, receives a scalar reward and transitions to the next state {\displaystyle s’}s' according to environment dynamics {\displaystyle p(s’|s,a)}{\displaystyle p(s'|s,a)}. The agent attempts to learn a policy {\displaystyle \pi (a|s)}{\displaystyle \pi (a|s)}, or map from observations to actions, in order to maximize its returns (expected sum of rewards). In reinforcement learning (as opposed to optimal control) the algorithm only has access to the dynamics {\displaystyle p(s’|s,a)}{\displaystyle p(s'|s,a)} through sampling.

Deep reinforcement learning[edit]

In many practical decision making problems, the states {\displaystyle s}s of the MDP are high-dimensional (eg. images from a camera or the raw sensor stream from a robot) and cannot be solved by traditional RL algorithms. Deep reinforcement learning algorithms incorporate deep learning to solve such MDPs, often representing the policy {\displaystyle \pi (a|s)}{\displaystyle \pi (a|s)} or other learned functions as a neural network, and developing specialized algorithms that perform well in this setting.