Ben Bokser's Portfolio
  • Projects
    • Bipedal Robot with Floating Body MPCl
    • SpryDrive
    • Bipedal Robot with Reaction Wheel Stabilization
    • Reaction Wheel Assisted Quadruped Locomotion
    • RoboJoey
    • Soft Knee Exoskeleton
    • ASME SDC 2018
    • Hydraulically Actuated Robotic Arm
    • ToddlerCane
    • A Middleweight Combat Robot
  • Publications
  • Blog
  • About
  • Contact

Model Predictive Control for a Legged Robot

10/12/2020

0 Comments

 
I have recently implemented a model predictive controller (MPC) to calculate the necessary reaction forces for a legged robot. The work presented here is based on "Highly Dynamic Quadruped Locomotion via
Whole-Body Impulse Control and Model Predictive Control"
by Kim et al. If you don't know what model predictive control is, I recommend this excellent explanation by Steve Brunton. I also found this guide to model predictive control with CasADI to be extremely helpful. CasADi is an open source nonlinear optimization tool which I'm using for MPC.

However, I'm not entirely sure yet whether this code works the way it's intended--I'm not getting any error messages, but the force values I'm getting seem too low. If you spot anything wrong with my implementation, please let me know.

​Here's the code.

Part 1. The Dynamics Formulation
From the paper:
Picture
First off, you may be wondering what this notation means.
Picture
This refers to a skew-symmetric matrix. The 3x1 vectors r1 and r2 (footstep positions) are converted into a 3x3 matrix in order to perform a cross product operation with the inverse of the global inertia tensor.
   
% matlab code

r = [[0, -r_z, r_y];
     [r_z, 0, -r_x];
     [-r_y, r_x, 0]];
     
i_inv = sym('i', [3, 3]);

mtimes(i_inv, r)

  
A few more notes:
  • The state term x is a 12x1 vector composed of the angle, position, angular velocity, and velocity of the body w.r.t. to the global frame.
  • In my case, there are only two legs, so there six scalar controls values: f1x, f1y, f1z, f2x, f2y, and f2z. 
  • The gravity term is a 12x1 matrix of zeros until the last term, which is the gravitational constant.
  • R_z is a 3x3 rotation matrix for translating angular velocity from the global frame to the body frame.
   
% matlab code

syms theta_x theta_y theta_z p_x p_y p_z omega_x omega_y omega_z pdot_x ...
     pdot_y pdot_z f1_x f1_y f1_z f2_x f2_y f2_z ...
     dt i_inv r1x r1y r1z r2x r2y r2z mass gravity...
     i11 i12 i13 i21 i22 i23 i31 i32 i33...
     rz11 rz12 rz13 rz21 rz22 rz23 rz31 rz32 rz33

x = [[theta_x];  % states
     [theta_y];
     [theta_z];
     [p_x];
     [p_y];
     [p_z];
     [omega_x];
     [omega_y];
     [omega_z];
     [pdot_x];
     [pdot_y];
     [pdot_z]];

f = [[f1_x];  % controls
     [f1_y];
     [f1_z];
     [f2_x];
     [f2_y];
     [f2_z]];

g = [[0];
     [0];
     [0];
     [0];
     [0];
     [0];
     [0];
     [0];
     [0];
     [0];
     [0];
     [gravity]];

A = [[1,  0,  0,  0,  0,  0,  rz11*dt,  rz12*dt,  rz13*dt,  0,  0,  0];
     [0,  1,  0,  0,  0,  0,  rz21*dt,  rz22*dt,  rz23*dt,  0,  0,  0];
     [0,  0,  1,  0,  0,  0,  rz31*dt,  rz32*dt,  rz33*dt,  0,  0,  0];
     [0,  0,  0,  1,  0,  0,  0,  0,  0,  dt,  0,  0];
     [0,  0,  0,  0,  1,  0,  0,  0,  0,  0,  dt,  0];
     [0,  0,  0,  0,  0,  1,  0,  0,  0,  0,  0,  dt];
     [0,  0,  0,  0,  0,  0,  1,  0,  0,  0,  0,  0];
     [0,  0,  0,  0,  0,  0,  0,  1,  0,  0,  0,  0];
     [0,  0,  0,  0,  0,  0,  0,  0,  1,  0,  0,  0];
     [0,  0,  0,  0,  0,  0,  0,  0,  0,  1,  0,  0];
     [0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  1,  0];
     [0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  1]];

B = [[0, 0, 0, 0, 0, 0];
     [0, 0, 0, 0, 0, 0];
     [0, 0, 0, 0, 0, 0];
     [0, 0, 0, 0, 0, 0];
     [0, 0, 0, 0, 0, 0];
     [0, 0, 0, 0, 0, 0];
     [(i12*r1z - i13*r1y)*dt, (-i11*r1z + i13*r1x)*dt, (i11*r1y - i12*r1x)*dt, ... 
        (i12*r2z - i13*r2y)*dt, (-i11*r2z + i13*r2x)*dt, (i11*r2y - i12*r2x)*dt];
     [(i22*r1z - i23*r1y)*dt, (-i21*r1z + i23*r1x)*dt, (i21*r1y - i22*r1x)*dt, ... 
        (i22*r2z - i23*r2y)*dt, (-i21*r2z + i23*r2x)*dt, (i21*r2y - i22*r2x)*dt];
     [(i32*r1z - i33*r1y)*dt, (-i31*r1z + i33*r1x)*dt, (i31*r1y - i32*r1x)*dt, ... 
        (i32*r2z - i33*r2y)*dt, (-i31*r2z + i33*r2x)*dt, (i31*r2y - i32*r2x)*dt];
     [dt/mass, 0, 0, dt/mass, 0, 0];
     [0, dt/mass, 0, 0, dt/mass, 0];
     [0, 0, dt/mass, 0, 0, dt/mass]];

x_next = mtimes(A, x) + mtimes(B, f) + g;

  
The end result is a very long equation which will be used as a constraint for the QP. The states and controls are defined as symbolics, but other values such as the inertia tensor, rotation matrix, and foot positions are fed in as numerical values from sensor feedback.
   
# python

gravity = -9.807
dt = self.dt
mass = self.mass

x_next = [dt * omega_x * rz11 + dt * omega_y * rz12 + dt * omega_z * rz13 + theta_x,
          dt * omega_x * rz21 + dt * omega_y * rz22 + dt * omega_z * rz23 + theta_y,
          dt * omega_x * rz31 + dt * omega_y * rz32 + dt * omega_z * rz33 + theta_z,
          dt * pdot_x + p_x,
          dt * pdot_y + p_y,
          dt * pdot_z + p_z,
          dt * f1_x * (i12 * r1z - i13 * r1y) + dt * f1_y * (-i11 * r1z + i13 * r1x)
          + dt * f1_z * (i11 * r1y - i12 * r1x) + dt * f2_x * (i12 * r2z - i13 * r2y)
          + dt * f2_y * (-i11 * r2z + i13 * r2x) + dt * f2_z * (i11 * r2y - i12 * r2x) + omega_x,
          dt * f1_x * (i22 * r1z - i23 * r1y) + dt * f1_y * (-i21 * r1z + i23 * r1x)
          + dt * f1_z * (i21 * r1y - i22 * r1x) + dt * f2_x * (i22 * r2z - i23 * r2y)
          + dt * f2_y * (-i21 * r2z + i23 * r2x) + dt * f2_z * (i21 * r2y - i22 * r2x) + omega_y,
          dt * f1_x * (i32 * r1z - i33 * r1y) + dt * f1_y * (-i31 * r1z + i33 * r1x)
          + dt * f1_z * (i31 * r1y - i32 * r1x) + dt * f2_x * (i32 * r2z - i33 * r2y)
          + dt * f2_y * (-i31 * r2z + i33 * r2x) + dt * f2_z * (i31 * r2y - i32 * r2x) + omega_z,
          dt * f1_x / mass + dt * f2_x / mass + pdot_x,
          dt * f1_y / mass + dt * f2_y / mass + pdot_y,
          dt * f1_z / mass + dt * f2_z / mass + gravity + pdot_z]

self.fn = cs.Function('fn', [theta_x, theta_y, theta_z,
                      p_x, p_y, p_z,
                      omega_x, omega_y, omega_z,
                      pdot_x, pdot_y, pdot_z,
                      f1_x, f1_y, f1_z,
                      f2_x, f2_y, f2_z],
                      x_next)  # nonlinear mapping of function f(x,u)
  

Part 2. The Objective Function
Picture
The objective function is shown above.
Picture
Let's start with this notation. What does it mean? Well, double bars represent a matrix norm. However, the R in subscript clues us into the fact that this is a special case of the matrix norm with an inner product. Q and R are weighing matrices and should be diagonal.
Picture
However, I'm not sure what the purpose of the square root would be here. It seems to me that minimizing the square root of f(x) is basically equivalent to minimizing f(x) by itself, but more computationally expensive. In fact, running the code with the square root operation actually causes a "QP is infeasible!" error. As such, I'm leaving the square root operation out.
​
So, the objective function minimizes x and f (the state and control vectors) based on the difference between actual and reference state as well as the value of f, scaled to Q and R respectively, over k time steps where k is between 0 and m. Here, m refers to the prediction horizon, which is chosen by the user.
   
u = cs.SX.sym('u', self.n_controls, self.N)  # decision variables, control action matrix
st_ref = cs.SX.sym('st_ref', self.n_states + self.n_states)  # initial and reference states
x = cs.SX.sym('x', self.n_states, (self.N + 1))  # represents the states over the opt problem.
obj = 0  # cs.SX(0)  # objective function
constr = []  # constraints vector

Q = np.zeros((12, 12))  # state weighing matrix
Q[0, 0] = 1
Q[1, 1] = 1
Q[2, 2] = 1
Q[3, 3] = 1
Q[4, 4] = 1
Q[5, 5] = 1
Q[6, 6] = 1
Q[7, 7] = 1
Q[8, 8] = 1
Q[9, 9] = 1
Q[10, 10] = 1
Q[11, 11] = 1

R = np.zeros((6, 6))  # control weighing matrix
R[0, 0] = 1
R[1, 1] = 1
R[2, 2] = 1
R[3, 3] = 1
R[4, 4] = 1
R[5, 5] = 1

st = x[:, 0]  # initial state
constr = cs.vertcat(constr, st - st_ref[0:self.n_states])  # initial condition constraints
# compute objective and constraints
for k in range(0, self.N):
    st = x[:, k]  # state
    con = u[:, k]  # control action
    # calculate objective
    obj = obj + cs.mtimes(cs.mtimes((st - st_ref[self.n_states:(self.n_states * 2)]).T, Q),
                          st - st_ref[self.n_states:(self.n_states * 2)])) \
        + cs.mtimes(cs.mtimes(con.T, R), con))
  
Both my Q and R are identity matrices right now, as I'm not sure how best to weight them.

Part 3. The Constraints Vector
In the same for loop, we can now add the constraints vector. For each iteration of the loop, the discrete dynamics equation from Part 1 is subtracted from the next state vector, x(k+1).
   
for k in range(0, self.N):
    # ...
    st_next = x[:, k + 1]
    f_value = self.fn(st[0], st[1], st[2], st[3], st[4], st[5],
                      st[6], st[7], st[8], st[9], st[10], st[11],
                      con[0], con[1], con[2], con[3], con[4], con[5])
    st_n_e = np.array(f_value)
    constr = cs.vertcat(constr, st_next - st_n_e)  # compute constraints
  
There are more constraints, however. Namely, friction cone constraints, governed by friction of the feet with the ground.  As stated in the paper:
Picture
Here's my implementation below. Both the Fx and Fy inequalities must be repeated due to the absolute value function. Fz is not required here because it's extremely simple and can be implemented as an input constraint.
   
# add additional constraints
for k in range(0, self.N):
    constr = cs.vertcat(constr, u[0, k] - self.mu * u[2, k])  # f1x - mu*f1z
    constr = cs.vertcat(constr, -u[0, k] - self.mu * u[2, k])  # -f1x - mu*f1z

    constr = cs.vertcat(constr, u[1, k] - self.mu * u[2, k])  # f1y - mu*f1z
    constr = cs.vertcat(constr, -u[1, k] - self.mu * u[2, k])  # -f1y - mu*f1z

    constr = cs.vertcat(constr, u[3, k] - self.mu * u[5, k])  # f2x - mu*f2z
    constr = cs.vertcat(constr, -u[3, k] - self.mu * u[5, k])  # -f2x - mu*f2z

    constr = cs.vertcat(constr, u[4, k] - self.mu * u[5, k])  # f2y - mu*f2z
    constr = cs.vertcat(constr, -u[4, k] - self.mu * u[5, k])  # -f2y - mu*f2z
  

Part 4. Problem Setup
The solver used here is qpOASES. The optimization variables (in this case, the state and controls), objective function, constraint functions, and reference state are declared as shown. It is important to specify a bound tolerance and termination tolerance.
   
opt_variables = cs.vertcat(cs.reshape(x, n_states * (self.N + 1), 1),
                                   cs.reshape(u, n_controls * self.N, 1))
qp = {'x': opt_variables, 'f': obj, 'g': constr, 'p': st_ref}
opts = {'print_time': 0, 'error_on_fail': 0, 'printLevel': "low", 'boundTolerance': 1e-6,
        'terminationTolerance': 1e-6}
solver = cs.qpsol('S', 'qpoases', qp, opts)
  

Part 5. Upper and Lower Bounds
The upper and lower bounds for the constraint vector still needs to be set. The dynamics equation is set up as an equality constraint, so the upper and lower bounds are both zero. The initial condition constraint is also an equality constraint, since the first state vector must be equal to the input state vector. The rest of the constraints have a lower bound close to infinity.
   
c_length = np.shape(constr)[0]
o_length = np.shape(opt_variables)[0]

lbg = list(itertools.repeat(-1e10, c_length))  # inequality constraints: big enough to act like infinity
lbg[0:(self.N + 1)] = itertools.repeat(0, self.N + 1)  # IC + dynamics equality constraint
ubg = list(itertools.repeat(0, c_length))  # inequality constraints
  
Constraints for the optimization variables are set separately. This is where the friction cone restraint for Fz comes into play. The reaction force perpendicular to the ground (assumed flat) must be upward, because the robot can only push itself away from the ground rather than pull itself toward the ground. Therefore, the lower bound on all Fz is 0.

Additionally, the MPC must check if each leg is actually in contact with the ground when calculating forces for it. If that leg is not in contact, all bounds are set equal to zero.
   
# constraints for optimization variables
lbx = list(itertools.repeat(-1e10, o_length))  # input inequality constraints
ubx = list(itertools.repeat(1e10, o_length))  # input inequality constraints

dyn_len = self.n_states * (self.N + 1)

lbx[(dyn_len + 2)::3] = [0 for i in range(20)]  # lower bound on all f1z and f2z

if c_l == 0:  # if left leg is not in contact... don't calculate output forces for that leg.
    ubx[(self.n_states * (self.N + 1))::6] = [0 for i in range(10)]  # upper bound on all f1x
    ubx[(self.n_states * (self.N + 1) + 1)::6] = [0 for i in range(10)]  # upper bound on all f1y
    lbx[(self.n_states * (self.N + 1))::6] = [0 for i in range(10)]  # lower bound on all f1x
    lbx[(self.n_states * (self.N + 1) + 1)::6] = [0 for i in range(10)]  # lower bound on all f1y
    ubx[(self.n_states * (self.N + 1) + 2)::6] = [0 for i in range(10)]  # upper bound on all f1z

if c_r == 0:  # if right leg is not in contact... don't calculate output forces for that leg.
    ubx[(self.n_states * (self.N + 1) + 3)::6] = [0 for i in range(10)]  # upper bound on all f2x
    ubx[(self.n_states * (self.N + 1) + 4)::6] = [0 for i in range(10)]  # upper bound on all f2y
    lbx[(self.n_states * (self.N + 1) + 3)::6] = [0 for i in range(10)]  # lower bound on all f2x
    lbx[(self.n_states * (self.N + 1) + 4)::6] = [0 for i in range(10)]  # lower bound on all f2y
    ubx[(self.n_states * (self.N + 1) + 5)::6] = [0 for i in range(10)]  # upper bound on all f2z
  

Part 6. Solving
Firstly, initial values must be fed into the solver.

I've set the initial values for my control inputs as an array of zeros, although I'm wondering if that should really be the last set of control inputs applied.

The initial values for the state are pulled in and calculated from the simulator. This 12x1 array is repeated N+1 times, where N is the prediction horizon, in order to fill the entire array including future states.
   
u0 = np.zeros((self.N, self.n_controls))  # six control inputs
X0 = np.matlib.repmat(x_in, 1, self.N + 1).T  # initialization of the state's decision variables

parameters = cs.vertcat(x_in, x_ref)  # set values of parameters vector
# init value of optimization variables
x0 = cs.vertcat(np.reshape(X0.T, (self.n_states * (self.N + 1), 1)),
                np.reshape(u0.T, (self.n_controls * self.N, 1)))
  
Finally, the initial states, bounds, and parameters are plugged into the solver. The controls vector is pulled from the solution, and the first row of optimized control values is returned. The rest of the values (representing future time steps) are ignored.
   
sol = solver(x0=x0, lbx=lbx, ubx=ubx, lbg=lbg, ubg=ubg, p=parameters)

solu = np.array(sol['x'][self.n_states * (self.N + 1):])
u = np.reshape(solu.T, (self.n_controls, self.N)).T  # get controls from the solution

u_cl = u[0, :]  # ignore rows other than first row

return u_cl
  
0 Comments



Leave a Reply.

    Archives

    July 2021
    October 2020
    May 2020

    Categories

    All

    RSS Feed

© 2016-2021 Benjamin Bokser | CC BY-NC 4.0
  • Projects
    • Bipedal Robot with Floating Body MPCl
    • SpryDrive
    • Bipedal Robot with Reaction Wheel Stabilization
    • Reaction Wheel Assisted Quadruped Locomotion
    • RoboJoey
    • Soft Knee Exoskeleton
    • ASME SDC 2018
    • Hydraulically Actuated Robotic Arm
    • ToddlerCane
    • A Middleweight Combat Robot
  • Publications
  • Blog
  • About
  • Contact