Linear algebra is the branch of mathematics concerning linear equations such as
a_{1}x_{1} + ... + a_{n}x_{n} = b,
linear functions such as
(x_{1}, ..., x_{n}) ↦ a_{1}x_{1} + ... + a_{n}x_{n},
and their representations through matrices and vector spaces.^{[1]}^{[2]}^{[3]}
Linear algebra is central to almost all areas of mathematics. For instance, it is fundamental in modern presentations of geometry, including for defining basic objects such as lines, planes and rotations. Functional analysis may likewise be viewed as the application of linear algebra to spaces of functions. Linear algebra is also used in most sciences and engineering fields, because it allows many natural phenomena to be modeled and such models to be computed with efficiently. Nonlinear systems cannot be modeled exactly with linear algebra, but linear models are often used as first-order approximations to them.
The study of linear algebra first emerged from the introduction of determinants for solving systems of linear equations. Determinants were considered by Leibniz in 1693, and in 1750 Gabriel Cramer used them to give explicit solutions of linear systems, a method now called Cramer's rule. Later, Gauss further developed the theory of solving linear systems by using Gaussian elimination, which was initially listed as an advancement in geodesy.^{[4]}
The study of matrix algebra first emerged in England in the mid-1800s. In 1844 Hermann Grassmann published his "Theory of Extension" which included foundational new topics of what is today called linear algebra. In 1848, James Joseph Sylvester introduced the term matrix, which is Latin for "womb". While studying compositions of linear transformations, Arthur Cayley was led to define matrix multiplication and inverses. Crucially, Cayley used a single letter to denote a matrix, thus treating a matrix as an aggregate object. He also realized the connection between matrices and determinants, and wrote "There would be many things to say about this theory of matrices which should, it seems to me, precede the theory of determinants".^{[4]}
In 1882, Hüseyin Tevfik Pasha wrote the book titled "Linear Algebra".^{[5]}^{[6]} The first modern and more precise definition of a vector space was introduced by Peano in 1888;^{[4]} by 1900, a theory of linear transformations of finite-dimensional vector spaces had emerged. Linear algebra took its modern form in the first half of the twentieth century, when many ideas and methods of previous centuries were generalized as abstract algebra. The use of matrices in quantum mechanics, special relativity, and statistics helped spread the subject of linear algebra beyond pure mathematics. The development of computers led to increased research in efficient algorithms for Gaussian elimination and matrix decompositions, and linear algebra became an essential tool for modelling and simulations.^{[4]}
The origin of many of these ideas is discussed in the articles on determinants and Gaussian elimination.
Linear algebra first appeared in American graduate textbooks in the 1940s and in undergraduate textbooks in the 1950s.^{[7]} Following work by the School Mathematics Study Group, U.S. high schools asked 12th grade students to do "matrix algebra, formerly reserved for college" in the 1960s.^{[8]} In France during the 1960s, educators attempted to teach linear algebra through finite-dimensional vector spaces in the first year of secondary school. This was met with a backlash in the 1980s that removed linear algebra from the curriculum.^{[9]} In 1993, the U.S.-based Linear Algebra Curriculum Study Group recommended that undergraduate linear algebra courses be given an application-based "matrix orientation" as opposed to a theoretical orientation.^{[10]} Reviews of the teaching of linear algebra call for stress on visualization and geometric interpretation of theoretical ideas,^{[11]} and for including the jewel in the crown of linear algebra, the singular value decomposition (SVD), as 'so many other disciplines use it'.^{[12]} To better suit 21st century applications, such as data mining and uncertainty analysis, linear algebra can be based upon the SVD instead of Gaussian elimination.^{[13]}^{[14]}
The main structures of linear algebra are vector spaces. A vector space over a field F (often the field of the real numbers) is a set V equipped with two binary operations. Elements of V are called vectors, and elements of F are called scalars. The first operation, vector addition, takes any two vectors v and w and outputs a third vector v + w. The second operation, scalar multiplication, takes any scalar a and any vector v and outputs a new vector av. The operations of addition and multiplication in a vector space must satisfy the following axioms.^{[15]} In the list below, let u, v and w be arbitrary vectors in V, and a and b scalars in F.
Axiom | Statement |
Associativity of addition | u + (v + w) = (u + v) + w |
Commutativity of addition | u + v = v + u |
Identity element of addition | There exists an element 0 ∈ V, called the zero vector, such that v + 0 = v for all v ∈ V. |
Inverse elements of addition | For every v ∈ V, there exists an element -v ∈ V, called the additive inverse of v, such that v + (-v) = 0 |
Distributivity of scalar multiplication with respect to vector addition | a(u + v) = au + av |
Distributivity of scalar multiplication with respect to field addition | (a + b)v = av + bv |
Compatibility of scalar multiplication with field multiplication | a(bv) = (ab)v ^{[nb 1]} |
Identity element of scalar multiplication | 1v = v, where 1 denotes the multiplicative identity in F. |
The first four axioms are those of V being an abelian group under vector addition. Elements of a vector space may be of various kinds; for example, they can be sequences, functions, polynomials or matrices. Linear algebra is concerned with properties common to all vector spaces.
As in the theory of other algebraic structures, linear algebra studies mappings between vector spaces that preserve the vector-space structure. Given two vector spaces V and W over a field F, a linear transformation (also called linear map, linear mapping or linear operator) is a map
T: V → W
that is compatible with addition and scalar multiplication:
T(u + v) = T(u) + T(v),  T(av) = aT(v)
for any vectors u, v ∈ V and any scalar a ∈ F.
It follows that for any vectors u, v ∈ V and scalars a, b ∈ F:
T(au + bv) = aT(u) + bT(v).
When a bijective linear mapping exists between two vector spaces (that is, every vector from the second space is associated with exactly one in the first), we say that the two spaces are isomorphic. Because an isomorphism preserves linear structure, two isomorphic vector spaces are "essentially the same" from the linear algebra point of view. One essential question in linear algebra is whether a mapping is an isomorphism or not; for a mapping between finite-dimensional spaces of the same dimension, this question can be answered by checking whether the determinant of its matrix is nonzero. If a mapping is not an isomorphism, linear algebra is interested in finding its range (or image) and the set of elements that get mapped to zero, called the kernel of the mapping.
Linear transformations have geometric significance. For example, 2 × 2 real matrices denote standard planar mappings that preserve the origin.
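As a small illustration of the two defining properties of a linear map, the following Python sketch applies a 2 × 2 matrix to vectors of the plane and checks additivity and homogeneity on sample inputs. The helper names (`apply`, `add`, `scale`) and the chosen matrix, a quarter-turn rotation, are assumptions made for this example; exact rational arithmetic is used so that the equality checks are meaningful.

```python
from fractions import Fraction

def apply(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of entries)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def add(u, v):
    """Component-wise sum of two vectors."""
    return [x + y for x, y in zip(u, v)]

def scale(a, v):
    """Scalar multiple a*v."""
    return [a * x for x in v]

# Rotation of the plane by 90 degrees counter-clockwise (hypothetical example).
R = [[0, -1],
     [1, 0]]

u = [Fraction(1), Fraction(2)]
v = [Fraction(3), Fraction(-1)]
a = Fraction(5, 2)

# T(u + v) = T(u) + T(v)
additive = apply(R, add(u, v)) == add(apply(R, u), apply(R, v))
# T(a u) = a T(u)
homogeneous = apply(R, scale(a, u)) == scale(a, apply(R, u))
# A linear map always sends the origin to the origin.
fixes_origin = apply(R, [0, 0]) == [0, 0]
```

Checking a few sample vectors does not prove linearity; it only shows how the two conditions read when the map is given by a matrix.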
Again, in analogy with theories of other algebraic objects, linear algebra is interested in subsets of vector spaces that are themselves vector spaces; these subsets are called linear subspaces. For example, both the range and kernel of a linear mapping are subspaces, and are thus often called the range space and the nullspace. Another important way of forming a subspace is to take linear combinations of a set of vectors v_{1}, v_{2}, ..., v_{k}:
a_{1}v_{1} + a_{2}v_{2} + ... + a_{k}v_{k},
where a_{1}, a_{2}, ..., a_{k} are scalars. The set of all linear combinations of vectors v_{1}, v_{2}, ..., v_{k} is called their span, which forms a subspace.
A linear combination of any system of vectors with all zero coefficients is the zero vector of V. If this is the only way to express the zero vector as a linear combination of v_{1}, v_{2}, ..., v_{k} then these vectors are linearly independent. Given a set of vectors that span a space, if any vector w is a linear combination of other vectors (and so the set is not linearly independent), then the span would remain the same if we remove w from the set. Thus, a set of linearly dependent vectors is redundant in the sense that there will be a linearly independent subset which will span the same subspace. Therefore, we are mostly interested in a linearly independent set of vectors that spans a vector space V, which we call a basis of V. Any set of vectors that spans V contains a basis, and any linearly independent set of vectors in V can be extended to a basis.^{[16]} It turns out that if we accept the axiom of choice, every vector space has a basis;^{[17]} nevertheless, this basis may be unnatural, and indeed, may not even be constructible. For instance, there exists a basis for the real numbers, considered as a vector space over the rationals, but no explicit basis has been constructed.
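For vectors with rational coordinates, linear independence can be tested mechanically: the vectors are independent exactly when row reduction leaves as many non-zero rows as there are vectors. The sketch below is one possible way to do this in Python; the function names `rank` and `is_linearly_independent` and the sample vectors are assumptions of this example.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, computed by row reduction over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    rank_so_far = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        # Find a row at or below the current pivot position with a non-zero entry.
        pivot = next((r for r in range(rank_so_far, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank_so_far], rows[pivot] = rows[pivot], rows[rank_so_far]
        # Clear the entries below the pivot.
        for r in range(rank_so_far + 1, len(rows)):
            factor = rows[r][col] / rows[rank_so_far][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank_so_far])]
        rank_so_far += 1
    return rank_so_far

def is_linearly_independent(vectors):
    """Vectors are independent exactly when their rank equals their count."""
    return rank(vectors) == len(vectors)

# Three independent vectors: they form a basis of Q^3.
first_independent = is_linearly_independent([[1, 0, 0], [1, 1, 0], [1, 1, 1]])
# The third vector is the sum of the first two, so the set is redundant.
second_independent = is_linearly_independent([[1, 2, 3], [4, 5, 6], [5, 7, 9]])
span_dimension = rank([[1, 2, 3], [4, 5, 6], [5, 7, 9]])  # dimension of the span
```

The rank returned for the second set is the dimension of its span, which is the size of any basis that can be extracted from it.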
Any two bases of a vector space V have the same cardinality, which is called the dimension of V. The dimension of a vector space is well-defined by the dimension theorem for vector spaces. If a basis of V has a finite number of elements, V is called a finite-dimensional vector space. If V is finite-dimensional and U is a subspace of V, then dim U ≤ dim V. If U_{1} and U_{2} are subspaces of V, then
dim(U_{1} + U_{2}) = dim U_{1} + dim U_{2} − dim(U_{1} ∩ U_{2}).
One often restricts consideration to finite-dimensional vector spaces. A fundamental theorem of linear algebra states that all vector spaces of the same dimension over the same field are isomorphic,^{[19]} giving an easy way of characterizing isomorphism.
A particular basis {v_{1}, v_{2}, ..., v_{n}} of V allows one to construct a coordinate system in V: the vector with coordinates (a_{1}, a_{2}, ..., a_{n}) is the linear combination
a_{1}v_{1} + a_{2}v_{2} + ... + a_{n}v_{n}.
The condition that v_{1}, v_{2}, ..., v_{n} span V guarantees that each vector v can be assigned coordinates, whereas the linear independence of v_{1}, v_{2}, ..., v_{n} assures that these coordinates are unique (i.e. there is only one linear combination of the basis vectors that is equal to v). In this way, once a basis of a vector space V over F has been chosen, V may be identified with the coordinate n-space F^{n}. Under this identification, addition and scalar multiplication of vectors in V correspond to addition and scalar multiplication of their coordinate vectors in F^{n}. Furthermore, if V and W are an n-dimensional and an m-dimensional vector space over F, respectively, and a basis of V and a basis of W have been fixed, then any linear transformation T: V → W may be encoded by an m × n matrix A with entries in the field F, called the matrix of T with respect to these bases. When V = W and the same basis is used for the domain and the codomain, two matrices that encode the same linear transformation in different bases are called similar. Matrix theory replaces the study of linear transformations, which were defined axiomatically, by the study of matrices, which are concrete objects. This major technique distinguishes linear algebra from theories of other algebraic structures, which usually cannot be parameterized so concretely.
There is an important distinction between the coordinate n-space R^{n} and a general finite-dimensional vector space V. While R^{n} has a standard basis {e_{1}, e_{2}, ..., e_{n}}, a vector space V typically does not come equipped with such a basis and many different bases exist (although they all consist of the same number of elements equal to the dimension of V).
One major application of the matrix theory is calculation of determinants, a central concept in linear algebra. While determinants could be defined in a basis-free manner, they are usually introduced via a specific representation of the mapping; the value of the determinant does not depend on the specific basis. It turns out that a mapping has an inverse if and only if the determinant has an inverse (every non-zero real or complex number has an inverse^{[20]}). If the determinant is zero, then the nullspace is nontrivial. Determinants have other applications, including a systematic way of seeing if a set of vectors is linearly independent (we write the vectors as the columns of a matrix, and if the determinant of that matrix is zero, the vectors are linearly dependent). Determinants could also be used to solve systems of linear equations (see Cramer's rule), but in real applications, Gaussian elimination is a faster method.
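The remark that elimination is the practical route also applies to the determinant itself: reducing the matrix to triangular form and multiplying the pivots gives the determinant, with a sign change for every row swap. The following Python sketch illustrates this under the assumption of rational entries; the function name `determinant` and the sample matrices are chosen for this example only.

```python
from fractions import Fraction

def determinant(matrix):
    """Determinant of a square matrix via Gaussian elimination with exact fractions."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot in this column: the matrix is singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return det

# Non-zero determinant: the mapping is invertible and the columns are independent.
invertible = determinant([[2, 1], [1, 3]])  # 2*3 - 1*1 = 5
# Zero determinant: the columns are linearly dependent and the nullspace is nontrivial.
singular = determinant([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

Adding a multiple of one row to another leaves the determinant unchanged, which is why only the pivots and the row swaps contribute to the result.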
In general, the action of a linear transformation may be quite complex. Attention to low-dimensional examples gives an indication of the variety of their types. One strategy for a general n-dimensional transformation T is to find "characteristic lines" that are invariant sets under T. If v is a non-zero vector such that Tv is a scalar multiple of v, then the line through 0 and v is an invariant set under T and v is called a characteristic vector or eigenvector. The scalar λ such that Tv = λv is called a characteristic value or eigenvalue of T.
To find an eigenvector or an eigenvalue, we note that
Tv − λv = (T − λI)v = 0,
where I is the identity matrix. For there to be nontrivial solutions to that equation, det(T − λI) = 0. The determinant is a polynomial in λ, and so the eigenvalues are not guaranteed to exist if the field is R. Thus, we often work with an algebraically closed field such as the complex numbers when dealing with eigenvectors and eigenvalues so that an eigenvalue will always exist. It would be particularly nice if given a transformation T taking a vector space V into itself we can find a basis for V consisting of eigenvectors. If such a basis exists, we can easily compute the action of the transformation on any vector: if v_{1}, v_{2}, ..., v_{n} are linearly independent eigenvectors of a mapping T of an n-dimensional space with (not necessarily distinct) eigenvalues λ_{1}, λ_{2}, ..., λ_{n}, and if v = a_{1}v_{1} + ... + a_{n}v_{n}, then
Tv = T(a_{1}v_{1} + ... + a_{n}v_{n}) = a_{1}Tv_{1} + ... + a_{n}Tv_{n} = a_{1}λ_{1}v_{1} + ... + a_{n}λ_{n}v_{n}.
Such a transformation is called diagonalizable, since in the eigenbasis the transformation is represented by a diagonal matrix. Because operations like matrix multiplication, matrix inversion, and determinant calculation are simple on diagonal matrices, computations involving matrices are much simpler if we can bring the matrix to a diagonal form. Not all matrices are diagonalizable (even over an algebraically closed field).
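For a 2 × 2 matrix the characteristic polynomial is a quadratic, so its eigenvalues can be written down with the quadratic formula. The Python sketch below does this with complex arithmetic, which also shows why an algebraically closed field is convenient. The function name `eigenvalues_2x2` and the two sample matrices are assumptions made for illustration.

```python
import cmath

def eigenvalues_2x2(m):
    """Roots of det(M - lambda*I) = lambda^2 - trace*lambda + det for a 2x2 matrix."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

# A symmetric matrix has real eigenvalues: here 3 and 1.
symmetric = eigenvalues_2x2([[2, 1], [1, 2]])

# A quarter-turn rotation has no real eigenvalue; over C they are i and -i.
rotation = eigenvalues_2x2([[0, -1], [1, 0]])
```

For larger matrices the characteristic polynomial is rarely solved directly; this closed form is only meant to make the 2 × 2 case concrete.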
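Besides these basic concepts, linear algebra also studies vector spaces with additional structure, such as an inner product. The inner product is an example of a bilinear form, and it gives the vector space a geometric structure by allowing for the definition of length and angles. Formally, an inner product is a map
⟨·, ·⟩: V × V → F
that satisfies the following three axioms for all vectors u, v, w in V and all scalars a in F:^{[21]}^{[22]}
Conjugate symmetry: ⟨u, v⟩ = \overline{⟨v, u⟩}.
Linearity in the first argument: ⟨au, v⟩ = a⟨u, v⟩ and ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩.
Positive-definiteness: ⟨v, v⟩ ≥ 0, with equality only for v = 0.
Note that in R, conjugate symmetry reduces to symmetry.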
We can define the length of a vector v in V by
‖v‖^{2} = ⟨v, v⟩,
and we can prove the Cauchy-Schwarz inequality:
|⟨u, v⟩| ≤ ‖u‖ · ‖v‖.
In particular, for non-zero vectors u and v, the quantity
|⟨u, v⟩| / (‖u‖ · ‖v‖) ≤ 1,
and so we can call this quantity the cosine of the angle between the two vectors.
Two vectors u and v are orthogonal if ⟨u, v⟩ = 0. An orthonormal basis is a basis where all basis vectors have length 1 and are orthogonal to each other. Given any finite-dimensional inner product space, an orthonormal basis can be found by the Gram-Schmidt procedure. Orthonormal bases are particularly nice to deal with, since if v = a_{1}v_{1} + ... + a_{n}v_{n}, then a_{i} = ⟨v, v_{i}⟩.
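The Gram-Schmidt procedure mentioned above can be sketched in a few lines of Python for the standard inner product on R^{n}. The sketch uses the numerically more stable "modified" ordering of the subtractions, and the names `dot` and `gram_schmidt`, the tolerance, and the sample vectors are assumptions of this example rather than part of any particular library.

```python
import math

def dot(u, v):
    """Standard inner product on R^n."""
    return sum(x * y for x, y in zip(u, v))

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize linearly independent vectors (modified Gram-Schmidt)."""
    basis = []
    for v in vectors:
        w = list(v)
        # Remove the components of w along the basis vectors found so far.
        for q in basis:
            coeff = dot(w, q)
            w = [x - coeff * y for x, y in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        if norm < tol:
            raise ValueError("vectors are linearly dependent")
        basis.append([x / norm for x in w])
    return basis

q1, q2 = gram_schmidt([[3.0, 4.0], [1.0, 0.0]])

# Coordinates of v in an orthonormal basis are inner products with the basis vectors.
v = [2.0, -1.0]
coords = [dot(v, q1), dot(v, q2)]
reconstructed = [coords[0] * a + coords[1] * b for a, b in zip(q1, q2)]
```

The last lines illustrate the coordinate formula a_{i} = ⟨v, v_{i}⟩: summing the basis vectors weighted by these inner products recovers v up to rounding.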
The inner product facilitates the construction of many useful concepts. For instance, given a transform T, we can define its Hermitian conjugate T* as the linear transform satisfying
⟨Tu, v⟩ = ⟨u, T*v⟩.
If T satisfies TT* = T*T, we call T normal. It turns out that, over the complex numbers, normal matrices are precisely the matrices that have an orthonormal system of eigenvectors that span V.
Because of the ubiquity of vector spaces, linear algebra is used in many fields of mathematics, natural sciences, computer science, and social science. Below are just some examples of applications of linear algebra.
Linear algebra provides the formal setting for the linear combination of equations used in the Gaussian method. Suppose the goal is to find and describe the solution(s), if any, of the following system of linear equations:
2x + y − z = 8  (L_{1})
−3x − y + 2z = −11  (L_{2})
−2x + y + 2z = −3  (L_{3})
The Gaussian-elimination algorithm is as follows: eliminate x from all equations below L_{1}, and then eliminate y from all equations below L_{2}. This will put the system into triangular form. Then, using back-substitution, each unknown can be solved for.
In the example, x is eliminated from L_{2} by adding (3/2)L_{1} to L_{2}. x is then eliminated from L_{3} by adding L_{1} to L_{3}. Formally:
L_{2} + (3/2)L_{1} → L_{2}
L_{3} + L_{1} → L_{3}
The result is:
2x + y − z = 8
(1/2)y + (1/2)z = 1
2y + z = 5
Now y is eliminated from L_{3} by adding -4L_{2} to L_{3}:
L_{3} + (−4)L_{2} → L_{3}
The result is:
2x + y − z = 8
(1/2)y + (1/2)z = 1
−z = 1
This result is a system of linear equations in triangular form, and so the first part of the algorithm is complete.
The last part, back-substitution, consists of solving for the unknowns in reverse order. It can thus be seen that
z = −1.
Then, z can be substituted into L_{2}, which can then be solved to obtain
y = 3.
Next, z and y can be substituted into L_{1}, which can be solved to obtain
x = 2.
The system is solved.
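The two phases described above, elimination to triangular form followed by back-substitution, can be sketched in Python as follows. This is a minimal illustration for square systems with a unique solution, using exact fractions to avoid rounding; the function name `solve` and the 3 × 3 system passed to it are illustrative choices for this example.

```python
from fractions import Fraction

def solve(a, b):
    """Solve the square system a x = b by elimination and back-substitution."""
    n = len(a)
    # Augmented matrix [a | b] with exact rational entries.
    m = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick a row with a non-zero entry in this column to act as the pivot row.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular; no unique solution")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate the current unknown from every equation below the pivot row.
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    # Back-substitution, from the last unknown to the first.
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

# 2x + y - z = 8, -3x - y + 2z = -11, -2x + y + 2z = -3
solution = solve([[2, 1, -1], [-3, -1, 2], [-2, 1, 2]], [8, -11, -3])  # x=2, y=3, z=-1
```

Practical floating-point implementations usually choose the pivot of largest absolute value in each column to limit rounding error; with exact fractions any non-zero pivot works.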
We can, in general, write any system of linear equations as a matrix equation:
Ax = b.
The solution of this system is characterized as follows: first, we find a particular solution x_{0} of this equation using Gaussian elimination. Then, we compute the solutions of Ax = 0; that is, we find the null space N of A. The solution set of this equation is given by x_{0} + N = {x_{0} + n : n ∈ N}. If the number of variables is equal to the number of equations, then we can characterize when the system has a unique solution: since N is trivial if and only if det A ≠ 0, the equation has a unique solution if and only if det A ≠ 0.^{[23]}
The least squares method is used to determine the best-fit line for a set of data.^{[24]} This line will minimize the sum of the squares of the residuals.
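For a straight-line fit y ≈ mx + c, minimizing the sum of squared residuals leads to a 2 × 2 linear system (the normal equations) whose solution has a closed form. The Python sketch below computes that solution directly; the function name `fit_line` and the sample data are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept, from the 2x2 normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Determinant of the normal-equation matrix [[sxx, sx], [sx, n]].
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("all x values are equal; the line is not determined")
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return slope, intercept

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # these points lie exactly on y = 2x + 1
slope, intercept = fit_line(xs, ys)
```

When the data lie exactly on a line the residuals vanish and the fit recovers that line; for noisy data the same formulas return the line with the smallest sum of squared vertical distances.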
Fourier series are a representation of a function f: [−π, π] → R as a trigonometric series:
f(x) = a_{0}/2 + \sum_{n=1}^{∞} [a_{n} cos(nx) + b_{n} sin(nx)].
This series expansion is extremely useful in solving partial differential equations. In this article, we will not be concerned with convergence issues; it is nice to note that all Lipschitz-continuous functions have a converging Fourier series expansion, and nice enough discontinuous functions have a Fourier series that converges to the function value at most points.
The space of all functions that can be represented by a Fourier series forms a vector space (technically speaking, we call functions that have the same Fourier series expansion the "same" function, since two different discontinuous functions might have the same Fourier series). Moreover, this space is also an inner product space with the inner product
⟨f, g⟩ = (1/π) \int_{−π}^{π} f(x) g(x) dx.
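The functions g_{n}(x) = sin(nx) for n > 0 and h_{n}(x) = cos(nx) for n ≥ 0 form an orthogonal basis for the space of Fourier-expandable functions; each has unit norm under this inner product except the constant function h_{0}, whose squared norm is 2. We can thus use the tools of linear algebra to find the expansion of any function in this space in terms of these basis functions. For instance, to find the coefficient a_{k} for k ≥ 1, we take the inner product with h_{k}:
⟨f, h_{k}⟩ = (a_{0}/2)⟨h_{0}, h_{k}⟩ + \sum_{n=1}^{∞} a_{n}⟨h_{n}, h_{k}⟩ + \sum_{n=1}^{∞} b_{n}⟨g_{n}, h_{k}⟩,
and by orthonormality, ⟨f, h_{k}⟩ = a_{k}; that is,
a_{k} = (1/π) \int_{−π}^{π} f(x) cos(kx) dx.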
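The coefficient formula can be checked numerically by approximating the inner product with a simple quadrature rule. The following Python sketch assumes the inner product ⟨f, g⟩ = (1/π) ∫ f(x) g(x) dx over [−π, π] and uses the midpoint rule; the function names and the number of steps are choices made for this example, and the test function f(x) = x is used because its sine coefficients are known in closed form.

```python
import math

def inner(f, g, steps=20000):
    """Approximate (1/pi) * integral of f*g over [-pi, pi] by the midpoint rule."""
    h = 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        x = -math.pi + (i + 0.5) * h
        total += f(x) * g(x)
    return total * h / math.pi

def sine_coefficient(f, k):
    """b_k = <f, g_k> with g_k(x) = sin(kx)."""
    return inner(f, lambda x: math.sin(k * x))

def cosine_coefficient(f, k):
    """a_k = <f, h_k> with h_k(x) = cos(kx)."""
    return inner(f, lambda x: math.cos(k * x))

# For f(x) = x the known expansion has b_k = 2 * (-1)**(k + 1) / k and a_k = 0.
b1 = sine_coefficient(lambda x: x, 1)
b2 = sine_coefficient(lambda x: x, 2)
a1 = cosine_coefficient(lambda x: x, 1)
```

The computed values agree with the closed-form coefficients only up to the quadrature error, which shrinks as the number of steps grows.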
Quantum mechanics draws heavily on notions from linear algebra. In quantum mechanics, the physical state of a particle is represented by a vector, and observables (such as momentum, energy, and angular momentum) are represented by linear operators on the underlying vector space. More concretely, the wave function of a particle describes its physical state and lies in the vector space L^{2} (the functions ψ: R^{3} → C such that \iiint |ψ|^{2} dx dy dz is finite), and it evolves according to the Schrödinger equation. Energy is represented as the operator H = −(ħ^{2}/2m)∇^{2} + V(x, y, z), where V is the potential energy. H is also known as the Hamiltonian operator. The eigenvalues of H represent the possible energies that can be observed. Given a particle in some state ψ, we can expand ψ into a linear combination of eigenstates of H. The component of ψ in each eigenstate determines the probability of measuring the corresponding eigenvalue, and the measurement forces the particle to assume that eigenstate (wave function collapse).
Many of the principles and techniques of linear algebra can be seen in the geometry of lines in a real two-dimensional plane E. When formulated using vectors and matrices the geometry of points and lines in the plane can be extended to the geometry of points and hyperplanes in high-dimensional spaces.
Point coordinates in the plane E are ordered pairs of real numbers, (x,y), and a line is defined as the set of points (x,y) that satisfy the linear equation^{[25]}
λ: ax + by + c = 0,
where a, b and c are not all zero. Then,
λ: \begin{bmatrix} a & b & c \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = 0,
or
Ax = 0,
where x = (x, y, 1) is the 3 × 1 set of homogeneous coordinates associated with the point (x, y).^{[26]}
Homogeneous coordinates identify the plane E with the z = 1 plane in three-dimensional space. The x-y coordinates in E are obtained from homogeneous coordinates y = (y_{1}, y_{2}, y_{3}) by dividing by the third component (if it is nonzero) to obtain y = (y_{1}/y_{3}, y_{2}/y_{3}, 1).
The linear equation, λ, has the important property that if x_{1} and x_{2} are homogeneous coordinates of points on the line, then the point αx_{1} + βx_{2} is also on the line, for any real α and β.
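Now consider the equations of the two lines λ_{1} and λ_{2},
λ_{1}: a_{1}x + b_{1}y + c_{1} = 0,  λ_{2}: a_{2}x + b_{2}y + c_{2} = 0,
which form a system of linear equations. The intersection of these two lines is defined by x = (x, y, 1) that satisfy the matrix equation,
\begin{bmatrix} a_{1} & b_{1} \\ a_{2} & b_{2} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} c_{1} \\ c_{2} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix},
or using homogeneous coordinates,
Bx = \begin{bmatrix} a_{1} & b_{1} & c_{1} \\ a_{2} & b_{2} & c_{2} \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.
The point of intersection of these two lines is the unique non-zero solution of these equations. In homogeneous coordinates, the solutions are multiples of the following solution:^{[26]}
x_{1} = \begin{vmatrix} b_{1} & c_{1} \\ b_{2} & c_{2} \end{vmatrix},  x_{2} = −\begin{vmatrix} a_{1} & c_{1} \\ a_{2} & c_{2} \end{vmatrix},  x_{3} = \begin{vmatrix} a_{1} & b_{1} \\ a_{2} & b_{2} \end{vmatrix}
if the rows of B are linearly independent (i.e., λ_{1} and λ_{2} represent distinct lines). Divide through by x_{3} to get Cramer's rule for the solution of a set of two linear equations in two unknowns.^{[27]} Notice that this yields a point in the z = 1 plane only when the 2 × 2 submatrix associated with x_{3} has a non-zero determinant.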
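The intersection computation can be sketched in Python, assuming each line is given by integer coefficients (a, b, c) of an equation of the form ax + by + c = 0. The homogeneous solution is built from the three 2 × 2 minors of the coefficient matrix and then divided by its third component. The function name `intersect` and the sample lines are hypothetical.

```python
from fractions import Fraction

def intersect(line1, line2):
    """Intersection of a1*x + b1*y + c1 = 0 and a2*x + b2*y + c2 = 0.

    Coefficients are assumed to be integers. Returns (x, y), or None when
    the lines are parallel or identical.
    """
    a1, b1, c1 = line1
    a2, b2, c2 = line2
    # Homogeneous solution: the three 2x2 minors of the coefficient matrix.
    x1 = b1 * c2 - b2 * c1
    x2 = -(a1 * c2 - a2 * c1)
    x3 = a1 * b2 - a2 * b1
    if x3 == 0:
        return None  # no finite intersection point in the z = 1 plane
    # Dividing by the third component gives Cramer's rule.
    return Fraction(x1, x3), Fraction(x2, x3)

# x + y - 3 = 0 and x - y - 1 = 0 meet at (2, 1).
point = intersect((1, 1, -3), (1, -1, -1))
# x + y = 0 and x + y - 1 = 0 are parallel.
parallel = intersect((1, 1, 0), (1, 1, -1))
```

The `None` case corresponds to a vanishing third homogeneous coordinate, that is, to a 2 × 2 submatrix with zero determinant.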
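It is interesting to consider the case of three lines, λ_{1}, λ_{2} and λ_{3}, which yield the matrix equation,
\begin{bmatrix} a_{1} & b_{1} \\ a_{2} & b_{2} \\ a_{3} & b_{3} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} c_{1} \\ c_{2} \\ c_{3} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},
which in homogeneous form yields,
Cx = \begin{bmatrix} a_{1} & b_{1} & c_{1} \\ a_{2} & b_{2} & c_{2} \\ a_{3} & b_{3} & c_{3} \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
Clearly, this equation has the solution x = (0,0,0), which is not a point on the z = 1 plane E. For a solution to exist in the plane E, the coefficient matrix C must have rank at most 2, which means its determinant must be zero. Another way to say this is that the columns of the matrix must be linearly dependent.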
Another way to approach linear algebra is to consider linear functions on the two-dimensional real plane E = R^{2}. Here R denotes the set of real numbers. Let x = (x, y) be an arbitrary vector in E and consider the linear function λ: E → R, given by
λ: \begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = c,
or
Ax = c.
This transformation has the important property that if Ay = d, then
A(x + y) = Ax + Ay = c + d.
This shows that the sum of vectors in E maps to the sum of their images in R. This is the defining characteristic of a linear map, or linear transformation.^{[25]} For this case, where the image space is the real numbers, the map is called a linear functional.^{[27]}
Consider the linear functional a little more carefully. Let i = (1,0) and j = (0,1) be the natural basis vectors on E, so that x = xi + yj. It is now possible to see that
Ax = A(xi + yj) = xAi + yAj = \begin{bmatrix} Ai & Aj \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = c.
Thus, the columns of the matrix A are the images of the basis vectors of E in R.
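This is true for any pair of vectors used to define coordinates in E. Suppose we select a non-orthogonal non-unit vector basis v = (v_{1}, v_{2}) and w = (w_{1}, w_{2}) to define coordinates of vectors in E. This means a vector x has coordinates (α, β), such that x = αv + βw. Then, we have the linear functional
λ: Ax = A(αv + βw) = αAv + βAw = αd + βe = c,
where Av = d and Aw = e are the images of the basis vectors v and w. This is written in matrix form as
\begin{bmatrix} d & e \end{bmatrix} \begin{bmatrix} α \\ β \end{bmatrix} = c.
This leads to the question of how to determine the coordinates of a vector x relative to a general basis v and w in E. Assume that we know the coordinates of the vectors x, v and w in the natural basis i = (1,0) and j = (0,1). Our goal is to find the real numbers α, β, so that x = αv + βw, that is
\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} v_{1} & w_{1} \\ v_{2} & w_{2} \end{bmatrix} \begin{bmatrix} α \\ β \end{bmatrix}.
To solve this equation for α, β, we compute the linear coordinate functionals σ and τ for the basis v, w, which are given by,^{[26]}
σ = \frac{1}{v_{1}w_{2} − v_{2}w_{1}} \begin{bmatrix} w_{2} & −w_{1} \end{bmatrix},  τ = \frac{1}{v_{1}w_{2} − v_{2}w_{1}} \begin{bmatrix} −v_{2} & v_{1} \end{bmatrix}.
The functionals σ and τ compute the components of x along the basis vectors v and w, respectively, that is,
σx = α,  τx = β,
which can be written in matrix form as
\begin{bmatrix} α \\ β \end{bmatrix} = \frac{1}{v_{1}w_{2} − v_{2}w_{1}} \begin{bmatrix} w_{2} & −w_{1} \\ −v_{2} & v_{1} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}.
These coordinate functionals have the properties,
σv = 1,  σw = 0,  τv = 0,  τw = 1.
These equations can be assembled into the single matrix equation,
\begin{bmatrix} σ \\ τ \end{bmatrix} \begin{bmatrix} v & w \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.
Thus, the matrix formed by the coordinate linear functionals is the inverse of the matrix formed by the basis vectors.^{[25]}^{[27]}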
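The relationship between a basis and its coordinate functionals can be made concrete with a short Python sketch. It assumes the functionals are the rows of the inverse of the 2 × 2 matrix whose columns are the basis vectors, and it uses exact fractions; the names `coordinate_functionals`, `pair`, `sigma` and `tau` and the sample basis are assumptions of this example.

```python
from fractions import Fraction

def coordinate_functionals(v, w):
    """Row vectors sigma, tau with sigma.v = 1, sigma.w = 0, tau.v = 0, tau.w = 1."""
    det = Fraction(v[0] * w[1] - v[1] * w[0])
    if det == 0:
        raise ValueError("v and w are linearly dependent and do not form a basis")
    # Rows of the inverse of the matrix whose columns are v and w.
    sigma = [w[1] / det, -w[0] / det]
    tau = [-v[1] / det, v[0] / det]
    return sigma, tau

def pair(functional, x):
    """Apply a row vector (linear functional) to a column vector."""
    return functional[0] * x[0] + functional[1] * x[1]

v, w = [2, 1], [1, 3]  # a non-orthogonal, non-unit basis of the plane
sigma, tau = coordinate_functionals(v, w)

x = [5, 5]
alpha, beta = pair(sigma, x), pair(tau, x)  # x = alpha*v + beta*w, here 2*v + 1*w
```

Applying the two functionals to the basis vectors themselves returns the entries of the identity matrix, which is the inverse relationship stated above.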
The set of points in the plane E that map to the same image c in R under the linear functional λ defines a line in E. This line is the inverse image λ^{-1}(c), that is, the set of the points x = (x, y) that solve the equation,
Ax = \begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = c.
Notice that a linear functional operates on known values for x = (x, y) to compute a value c in R, while the inverse image seeks the values for x = (x, y) that yield a specific value c.
In order to solve the equation, we first recognize that only one of the two unknowns (x, y) can be determined, so we select y to be determined (assuming b ≠ 0), and rearrange the equation
by = c − ax.
Solve for y and obtain the inverse image as the set of points,
x(t) = \begin{bmatrix} 0 \\ c/b \end{bmatrix} + t \begin{bmatrix} 1 \\ −a/b \end{bmatrix} = p + th.
For convenience the free parameter x has been relabeled t.
The vector p defines the intersection of the line with the y-axis, known as the y-intercept. The vector h satisfies the homogeneous equation,
Ah = \begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} 1 \\ −a/b \end{bmatrix} = 0.
Notice that if h is a solution to this homogeneous equation, then t h is also a solution.
The set of points that a linear functional maps to zero is the kernel of the linear functional. The line can be considered to be the set of points h in the kernel translated by the vector p.^{[25]}^{[27]}
Since linear algebra is a successful theory, its methods have been developed and generalized in other parts of mathematics. In module theory, one replaces the field of scalars by a ring. The concepts of linear independence, span, basis, and dimension (which is called rank in module theory) still make sense. Nevertheless, many theorems from linear algebra become false in module theory. For instance, not all modules have a basis (those that do are called free modules), the rank of a free module is not necessarily unique, not every linearly independent subset of a module can be extended to form a basis, and not every subset of a module that spans the space contains a basis.
In multilinear algebra, one considers multivariable linear transformations, that is, mappings that are linear in each of a number of different variables. This line of inquiry naturally leads to the idea of the dual space, the vector space V^{*} consisting of linear maps f: V → F, where F is the field of scalars. Multilinear maps can be described via tensor products of elements of V^{*}.
If, in addition to vector addition and scalar multiplication, there is a bilinear vector product V × V → V, the vector space is called an algebra; for instance, associative algebras are algebras with an associative vector product (like the algebra of square matrices, or the algebra of polynomials).
Functional analysis mixes the methods of linear algebra with those of mathematical analysis and studies various function spaces, such as L^{p} spaces.
Representation theory studies the actions of algebraic objects on vector spaces by representing these objects as matrices. It is interested in all the ways that this is possible, and it does so by finding subspaces invariant under all transformations of the algebra. The concept of eigenvalues and eigenvectors is especially important.
Algebraic geometry considers the solutions of systems of polynomial equations.
Several topics in computer programming make use of many of the techniques and theorems of linear algebra.