Recursively applying the Laplace expansion


• For 2×2 matrices, if 𝑨 = (a₁₁ a₁₂; a₂₁ a₂₂), recall that the inverse of 𝑨 is

𝑨⁻¹ = 1/(a₁₁a₂₂ − a₁₂a₂₁) · (a₂₂ −a₁₂; −a₂₁ a₁₁)

• Hence, 𝑨 is invertible if and only if a₁₁a₂₂ − a₁₂a₂₁ ≠ 0

• This quantity is the determinant of 𝑨 ∈ ℝ²ˣ², i.e.,

det 𝑨 = |a₁₁ a₁₂; a₂₁ a₂₂| = a₁₁a₂₂ − a₁₂a₂₁

which we have observed in the preceding example.

• For 𝑛 = 3 (known as Sarrus' rule),

det 𝑨 = a₁₁a₂₂a₃₃ + a₂₁a₃₂a₁₃ + a₃₁a₁₂a₂₃ − a₃₁a₂₂a₁₃ − a₁₁a₃₂a₂₃ − a₂₁a₁₂a₃₃
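• The 2×2 formula and Sarrus' rule translate directly into code. Below is a minimal NumPy sketch (my own illustration, not part of the original slides) implementing both closed forms and comparing them with numpy.linalg.det:

```python
import numpy as np

def det2(A):
    """Determinant of a 2x2 matrix: a11*a22 - a12*a21."""
    return A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]

def det3_sarrus(A):
    """Determinant of a 3x3 matrix via Sarrus' rule."""
    return (A[0, 0] * A[1, 1] * A[2, 2]
            + A[1, 0] * A[2, 1] * A[0, 2]
            + A[2, 0] * A[0, 1] * A[1, 2]
            - A[2, 0] * A[1, 1] * A[0, 2]
            - A[0, 0] * A[2, 1] * A[1, 2]
            - A[1, 0] * A[0, 1] * A[2, 2])

A2 = np.array([[1.0, 2.0], [4.0, 3.0]])
A3 = np.array([[1.0, 2.0, 3.0], [3.0, 1.0, 2.0], [0.0, 0.0, 1.0]])
print(det2(A2), np.linalg.det(A2))          # both approximately -5
print(det3_sarrus(A3), np.linalg.det(A3))   # both approximately -5
```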

• We call a square matrix 𝑻 an upper-triangular matrix if Tᵢⱼ = 0 for i > j, i.e., the matrix is zero below its diagonal.

• Analogously, we define a lower-triangular matrix as a matrix with zeros above its diagonal.

• How can we compute the determinant of an 𝑛×𝑛 (𝑛 > 3) matrix?

• We reduce this problem to computing the determinant of (n−1)×(n−1) matrices. By recursively applying the Laplace expansion, we can compute the determinant of an n×n matrix by ultimately computing determinants of 2×2 matrices.

• Laplace expansion. Let 𝑨 ∈ ℝⁿˣⁿ. Then, for all j = 1, …, n:

1. Expansion along column j:  det 𝑨 = Σₖ₌₁ⁿ (−1)ᵏ⁺ʲ aₖⱼ det 𝑨ₖ,ⱼ

2. Expansion along row j:  det 𝑨 = Σₖ₌₁ⁿ (−1)ᵏ⁺ʲ aⱼₖ det 𝑨ⱼ,ₖ

Here 𝑨ₖ,ⱼ is the (n−1)×(n−1) submatrix of 𝑨 obtained by deleting row k and column j.
• Let us compute the determinant of

𝑨 = (1 2 3; 3 1 2; 0 0 1)

• Using the Laplace expansion along the first row:

det 𝑨 = 1 · |1 2; 0 1| − 2 · |3 2; 0 1| + 3 · |3 1; 0 0| = 1(1 − 0) − 2(3 − 0) + 3(0 − 0) = −5

• We can check this using Sarrus' rule:

det 𝑨 = 1·1·1 + 3·0·3 + 0·2·2 − 0·1·3 − 1·0·2 − 3·2·1 = 1 − 6 = −5.
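• Because the expansion only ever requires determinants of smaller submatrices, it fits naturally into a short recursive function. A minimal NumPy sketch (illustrative only; the factorial cost makes it impractical for large n), checked on the matrix from the example:

```python
import numpy as np

def laplace_det(A):
    """Determinant via recursive Laplace expansion along the first row."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    if n == 2:
        return A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]
    det = 0.0
    for j in range(n):
        # Submatrix with row 0 and column j removed
        sub = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        det += (-1) ** j * A[0, j] * laplace_det(sub)
    return det

A = np.array([[1.0, 2.0, 3.0], [3.0, 1.0, 2.0], [0.0, 0.0, 1.0]])
print(laplace_det(A))  # -5.0, matching the hand computation above
```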

• Adding a multiple of a column/row to another one does not change det 𝑨
• Multiplication of a column/row with λ ∈ ℝ scales det 𝑨 by λ. In particular, det(λ𝑨) = λⁿ det 𝑨
• Swapping two rows/columns changes the sign of det 𝑨
• Because of the last three properties, we can use Gaussian elimination to compute det 𝑨 by bringing 𝑨 into row-echelon form. We can stop Gaussian elimination when we have 𝑨 in a triangular form where the elements below the diagonal are all 0. Recall: the determinant of a triangular matrix is the product of the diagonal elements.

• We can verify this result with the previous example.
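• These properties also give the practical algorithm described above: eliminate to triangular form, track row swaps, and multiply the diagonal entries. A minimal NumPy sketch (an assumed implementation with partial pivoting, not taken from the slides), verified on the previous example:

```python
import numpy as np

def det_by_elimination(A):
    """Determinant via Gaussian elimination to upper-triangular form."""
    U = A.astype(float)
    n = U.shape[0]
    sign = 1.0
    for k in range(n):
        # Pivot: bring the largest entry of column k to the diagonal (each swap flips the sign).
        p = k + np.argmax(np.abs(U[k:, k]))
        if U[p, k] == 0:
            return 0.0               # no nonzero pivot in this column => det = 0
        if p != k:
            U[[k, p]] = U[[p, k]]
            sign = -sign
        # Adding multiples of the pivot row leaves the determinant unchanged.
        for i in range(k + 1, n):
            U[i, k:] -= (U[i, k] / U[k, k]) * U[k, k:]
    # Determinant of a triangular matrix = product of its diagonal elements.
    return sign * np.prod(np.diag(U))

A = np.array([[1.0, 2.0, 3.0], [3.0, 1.0, 2.0], [0.0, 0.0, 1.0]])
print(det_by_elimination(A))  # -5.0, as before
```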

• Matrices characterize linear transformations.

[Figure: the unit square with vertices A, B, C, D and its images under several linear transformations; in one panel the vertex A = (1, 1)ᵀ is mapped to (0.5, 0.5)ᵀ.]

• When the absolute value of the determinant is greater than 1, the transformation enlarges areas.


• For example, a rotation by −45° is represented by the matrix (cos(−45°) −sin(−45°); sin(−45°) cos(−45°)), whose determinant is cos²(−45°) + sin²(−45°) = 1, so a rotation leaves areas unchanged.
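• A quick numerical check of this geometric picture (an illustrative sketch, not from the slides): the −45° rotation matrix has determinant 1, while a diagonal scaling with determinant 3 triples the area of the unit square.

```python
import numpy as np

theta = np.deg2rad(-45)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
print(np.linalg.det(R))          # 1.0: rotations preserve area

S = np.array([[2.0, 0.0],
              [0.0, 1.5]])       # example scaling with det = 3
print(np.linalg.det(S))          # 3.0: the unit square's area is tripled
```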

• Some linear transformations (matrices) are not invertible

• The characteristic polynomial p_𝑨(λ) ≔ det(𝑨 − λ𝑰) will allow us to compute eigenvalues and eigenvectors.

• Example

• For 𝑨 = (1 2; 4 3),

p_𝑨(λ) = det(𝑨 − λ𝑰) = det(1−λ 2; 4 3−λ)

• Let 𝑨 ∈ ℝⁿˣⁿ be a square matrix. Then λ ∈ ℝ is an eigenvalue of 𝑨 and 𝒙 ∈ ℝⁿ \ {𝟎} is the corresponding eigenvector of 𝑨 if

𝑨𝒙 = 𝜆𝒙

• Equivalent conditions for λ ∈ ℝ to be an eigenvalue of 𝑨 ∈ ℝⁿˣⁿ:

• rk(𝑨 − λ𝑰) < n

• det(𝑨 − λ𝑰) = 0

• For 𝑨 = (1 2; 4 3), we have

p_𝑨(λ) = det(𝑨 − λ𝑰) = det(1−λ 2; 4 3−λ) = (1 − λ)(3 − λ) − 2·4
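• For a 2×2 matrix, det(𝑨 − λ𝑰) expands to λ² − tr(𝑨)λ + det 𝑨, so the eigenvalues are the roots of that quadratic. A minimal NumPy sketch using the example matrix above (illustration only):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [4.0, 3.0]])

# p_A(lambda) = lambda^2 - tr(A)*lambda + det(A) for a 2x2 matrix
coeffs = [1.0, -np.trace(A), np.linalg.det(A)]
print(np.roots(coeffs))          # roots of the characteristic polynomial
print(np.linalg.eigvals(A))      # the same values (possibly in a different order)
```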

• If λ is an eigenvalue of 𝑨 ∈ ℝⁿˣⁿ, then the corresponding eigenspace E_λ is the solution space of the homogeneous system of linear equations (𝑨 − λ𝑰)𝒙 = 𝟎

Example (The Case of the Identity Matrix)

• For the identity matrix 𝑰 ∈ ℝⁿˣⁿ, every vector 𝒙 ∈ ℝⁿ \ {𝟎} satisfies 𝑰𝒙 = 1 · 𝒙, so λ = 1 is the only eigenvalue and its eigenspace is E₁ = ℝⁿ.

• Useful properties regarding eigenvalues and eigenvectors
• A matrix 𝑨 and its transpose 𝑨ᵀ possess the same eigenvalues, but not necessarily the same eigenvectors
• The eigenspace E_λ is the null space of 𝑨 − λ𝑰, since
𝑨𝒙 = λ𝒙 ⟺ 𝑨𝒙 − λ𝒙 = 𝟎 ⟺ (𝑨 − λ𝑰)𝒙 = 𝟎 ⟺ 𝒙 ∈ ker(𝑨 − λ𝑰)
• Symmetric, positive definite matrices always have positive, real eigenvalues.

• Recall that a symmetric matrix 𝑨 is positive definite if ∀𝒙 ∈ V \ {𝟎}: 𝒙ᵀ𝑨𝒙 > 0

• Worked example: let 𝑨 = (4 2; 1 3).

Step 1: Characteristic Polynomial. p_𝑨(λ) = det(𝑨 − λ𝑰) = det(4−λ 2; 1 3−λ) = (4 − λ)(3 − λ) − 2·1

Step 2: Eigenvalues. Setting p_𝑨(λ) = λ² − 7λ + 10 = (2 − λ)(5 − λ) = 0 gives the roots λ₁ = 2 and λ₂ = 5.

Step 3: Eigenvectors and Eigenspaces. From our definition of the eigenvector

(𝑨 − λ𝑰)𝒙 = (4−λ 2; 1 3−λ) 𝒙 = 𝟎

• For λ = 5 we obtain

(−1 2; 1 −2)(x₁; x₂) = 𝟎,

which is solved by every vector with x₁ = 2x₂, so the eigenspace is E₅ = span[(2, 1)ᵀ].

• This eigenspace is one-dimensional as it possesses a single basis vector.

• Analogously, we find the eigenvector for λ = 2 by solving

(2 2; 1 1)(x₁; x₂) = 𝟎

• The corresponding eigenspace is given as

E₂ = span[(1, −1)ᵀ]
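• The hand computation can be verified numerically. A minimal NumPy sketch (note that np.linalg.eig returns unit-length eigenvectors, so they are scalar multiples of the basis vectors found above):

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

vals, vecs = np.linalg.eig(A)
print(vals)     # eigenvalues 5 and 2 (possibly in a different order)
print(vecs)     # columns are eigenvectors, proportional to (2, 1) and (1, -1)

# Each eigenvector lies in the null space (kernel) of A - lambda*I:
for lam, v in zip(vals, vecs.T):
    print(np.allclose((A - lam * np.eye(2)) @ v, 0))   # True
```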

• In our previous example, the geometric multiplicity of 𝜆 = 5 and 𝜆 = 2 is 1.

• In another example, the matrix 𝑨 = (2 0; 1 2) has the repeated eigenvalue λ₁ = λ₂ = 2. The algebraic multiplicity of this eigenvalue is 2.

Definition. A square matrix 𝑨 ∈ ℝⁿˣⁿ is defective if it possesses fewer than n linearly independent eigenvectors.
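• For the repeated-eigenvalue example above, a short NumPy sketch (illustration only) makes the defect visible: the algebraic multiplicity of λ = 2 is 2, but the eigenspace is only one-dimensional:

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [1.0, 2.0]])

print(np.linalg.eigvals(A))        # [2. 2.]: lambda = 2 with algebraic multiplicity 2
# Geometric multiplicity = dim ker(A - 2I) = n - rank(A - 2I)
print(2 - np.linalg.matrix_rank(A - 2 * np.eye(2)))   # 1 < 2, so A is defective
```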

• For any 𝑨 ∈ ℝᵐˣⁿ, the matrix 𝑺 ≔ 𝑨ᵀ𝑨 is symmetric, positive semidefinite: 𝒙ᵀ𝑺𝒙 = 𝒙ᵀ𝑨ᵀ𝑨𝒙 = (𝑨𝒙)ᵀ(𝑨𝒙) ≥ 0

• If rk(𝑨) = 𝑛, then 𝑺 ≔ 𝑨Š𝑨 is positive definite.
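• Both statements can be checked numerically for a random matrix. A minimal NumPy sketch (illustration; the 5×3 test matrix is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))          # full column rank with probability 1
S = A.T @ A                              # symmetric by construction

eigenvalues = np.linalg.eigvalsh(S)      # eigvalsh: eigenvalues of a symmetric matrix
print(eigenvalues)                       # all >= 0; here > 0 since rk(A) = n = 3
print(np.all(eigenvalues > 0))           # True (positive definite)
```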

so that we obtain the eigenvalues λ₁ = 1 and λ₂ = 7, where λ₁ is a repeated eigenvalue. Following our standard procedure for computing eigenvectors, we obtain the eigenspaces

E₁ = span[(−1, 1, 0)ᵀ, (−1, 0, 1)ᵀ],  E₇ = span[(1, 1, 1)ᵀ]

• To construct an orthogonal basis of E₁, we exploit the fact that 𝒙₁, 𝒙₂ are eigenvectors associated with the same eigenvalue λ₁ = 1: any linear combination of them is again an eigenvector associated with λ₁, and the Gram-Schmidt algorithm builds an orthogonal basis using such linear combinations.

• Therefore, even if 𝒙₁ and 𝒙₂ are not orthogonal, we can apply the Gram-Schmidt algorithm to obtain eigenvectors which are orthogonal to each other, orthogonal to 𝒙₃, and eigenvectors of 𝑨 associated with λ₁ = 1.
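• A small sketch of this Gram-Schmidt step. The matrix below is a stand-in chosen so that it has exactly the eigenspaces E₁ and E₇ listed above (the slides' original matrix is not shown in this excerpt, so this choice is an assumption for illustration):

```python
import numpy as np

# Stand-in symmetric matrix with E_1 = span[(-1,1,0), (-1,0,1)] and E_7 = span[(1,1,1)]
A = np.array([[3.0, 2.0, 2.0],
              [2.0, 3.0, 2.0],
              [2.0, 2.0, 3.0]])

x1 = np.array([-1.0, 1.0, 0.0])     # eigenvectors for lambda = 1, not orthogonal to each other
x2 = np.array([-1.0, 0.0, 1.0])
x3 = np.array([1.0, 1.0, 1.0])      # eigenvector for lambda = 7

# Gram-Schmidt: replace x2 by its component orthogonal to x1
u1 = x1
u2 = x2 - (x2 @ u1) / (u1 @ u1) * u1

print(u1 @ u2)                        # 0: orthogonal to each other
print(u1 @ x3, u2 @ x3)               # 0 0: both orthogonal to x3
print(np.allclose(A @ u2, 1.0 * u2))  # True: u2 is still an eigenvector for lambda = 1
```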
