To initialize neural networks it’s often desirable to generate a set of vectors which span the space. In the case of a square weights matrix this means that we want a random orthonormal basis.

The code below generates such a random basis by concatenating random Householder transforms.

import numpy import random import math def make_orthonormal_matrix(n): """ Makes a square matrix which is orthonormal by concatenating random Householder transformations """ A = numpy.identity(n) d = numpy.zeros(n) d[n-1] = random.choice([-1.0, 1.0]) for k in range(n-2, -1, -1): # generate random Householder transformation x = numpy.random.randn(n-k) s = math.sqrt((x**2).sum()) # norm(x) sign = math.copysign(1.0, x[0]) s *= sign d[k] = -sign x[0] += s beta = s * x[0] # apply the transformation y = numpy.dot(x,A[k:n,:]) / beta A[k:n,:] -= numpy.outer(x,y) # change sign of rows A *= d.reshape(n,1) return A n = 100 A = make_orthonormal_matrix(n) # test matrix maxdot = 0 maxlen = 0.0 for i in range(n-1): maxlen = max(math.fabs(math.sqrt((A[i,:]**2).sum())-1.0), maxlen) for j in range(i+1,n): maxdot = max(math.fabs(numpy.dot(A[i,:],A[j,:])), maxdot) print("max dot product = %g" % maxdot) print("max vector length error = %g" % maxlen)

Another way to do this is to do a QR decomposition of a random Gaussian matrix. However the code above avoids calculating the R matrix.

Postscript:

I did some timing tests and it seems like the QR method is 3 times faster in python3:

import numpy from scipy.linalg import qr n = 4 H = numpy.random.randn(n, n) Q, R = qr(H) print(Q)