<?xml version="1.0" encoding="UTF-8" standalone="no"?>
	<?xml-stylesheet type="text/xsl" href="mathml.xsl"?>
	<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN" "http://www.w3.org/Math/DTD/mathml2/xhtml-math11-f.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<meta http-equiv="pragma" content="no-cache" />
<meta http-equiv="Expires" content="-1" />
<title>Data</title>
<style type="text/css">body {
	font-size : 12pt;
	font-family : serif;
	color : #000000
}
math {
color : #000000;
font-family : Mathematica1, Mathematica2, Mathematica3, Mathematica4, Mathematica5, serif, CMSY10, Symbol, Times, Lucida Sans Unicode, MT Extra
}
</style>
</head>
<body>
<h2>Matrices as Datasets</h2><br />
<br />
 Another useful way to think of a matrix is as a table of observations of random variables. For example, if we have two random variables <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi></math> and <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>y</mi></math>, with three observations each, we can tabulate the observations of one variable as a column vector <math xmlns="http://www.w3.org/1998/Math/MathML"><mover accent="true"><mrow><mi>X</mi></mrow><mo stretchy="true">&RightArrow;</mo></mover><mo>=</mo>
<mrow><mo>(</mo><mtable><mtr columnalign="right"><mtd>
<msub><mi>x</mi><mn>1</mn></msub></mtd></mtr>
<mtr columnalign="right"><mtd>
<msub><mi>x</mi><mn>2</mn></msub></mtd></mtr>
<mtr columnalign="right"><mtd>
<msub><mi>x</mi><mn>3</mn></msub>
</mtd></mtr>
</mtable><mo>)</mo></mrow></math> or even put both variables together as <div align="center"><math xmlns="http://www.w3.org/1998/Math/MathML"><mstyle displaystyle="true"><mi>D</mi><mo>=</mo>
<mo>[</mo><mtable><mtr columnalign="right"><mtd>
<msub><mi>x</mi><mn>1</mn></msub></mtd><mtd><msub><mi>y</mi><mn>1</mn></msub></mtd></mtr>
<mtr columnalign="right"><mtd>
<msub><mi>x</mi><mn>2</mn></msub></mtd><mtd><msub><mi>y</mi><mn>2</mn></msub></mtd></mtr>
<mtr columnalign="right"><mtd>
<msub><mi>x</mi><mn>3</mn></msub></mtd><mtd><msub><mi>y</mi><mn>3</mn></msub>
</mtd></mtr>
</mtable><mo>]</mo></mstyle></math></div>. It is common and convenient to 'center' these data matrices by subtracting the <b>mean</b> of each variable from its column (the vector of means is easily added back in when appropriate). With data arranged this way, our familiar matrix operations become very useful. The inner product of a centered data vector with itself yields the sum of squared deviations, which is the variance up to a normalizing constant (a fact used to great effect in the geometry of statistics):<br />
<br />
<div align="center"><math xmlns="http://www.w3.org/1998/Math/MathML"><mstyle displaystyle="true"><msup><mi>X</mi><mo>'</mo></msup><mo>&CircleTimes;</mo><mi>X</mi><mo>=</mo><mrow><mo>(</mo><mtable><mtr columnalign="right"><mtd>
<mrow><mo>(</mo><msub><mi>x</mi><mn>1</mn></msub><mo>-</mo><msub><mi>&mu;</mi><mi>x</mi></msub><mo>)</mo></mrow></mtd><mtd><mrow><mo>(</mo><msub><mi>x</mi><mn>2</mn></msub><mo>-</mo><msub><mi>&mu;</mi><mi>x</mi></msub><mo>)</mo></mrow></mtd><mtd><mrow><mo>(</mo><msub><mi>x</mi><mn>3</mn></msub><mo>-</mo><msub><mi>&mu;</mi><mi>x</mi></msub><mo>)</mo></mrow>
</mtd></mtr>
</mtable><mo>)</mo></mrow><mo>&CircleTimes;</mo>
<mrow><mo>(</mo><mtable><mtr columnalign="right"><mtd>
<mrow><mo>(</mo><msub><mi>x</mi><mn>1</mn></msub><mo>-</mo><msub><mi>&mu;</mi><mi>x</mi></msub><mo>)</mo></mrow></mtd></mtr>
<mtr columnalign="right"><mtd>
<mrow><mo>(</mo><msub><mi>x</mi><mn>2</mn></msub><mo>-</mo><msub><mi>&mu;</mi><mi>x</mi></msub><mo>)</mo></mrow></mtd></mtr>
<mtr columnalign="right"><mtd>
<mrow><mo>(</mo><msub><mi>x</mi><mn>3</mn></msub><mo>-</mo><msub><mi>&mu;</mi><mi>x</mi></msub><mo>)</mo></mrow>
</mtd></mtr>
</mtable><mo>)</mo></mrow><mo>=</mo><msub><mo>&sum;</mo><mi>i</mi></msub><msup><mrow><mo>(</mo><msub><mi>x</mi><mi>i</mi></msub><mo>-</mo><msub><mi>&mu;</mi><mi>x</mi></msub><mo>)</mo></mrow><mn>2</mn></msup><mo>&#x221D;</mo><mtext>var</mtext><mrow><mo>(</mo><mi>x</mi><mo>)</mo></mrow></mstyle></math></div><br />
<br />
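As a small numerical illustration of the inner product above, the following Python sketch centers three observations and takes the inner product of the centered vector with itself. The data values and the helper name <code>centered</code> are assumptions made up for this example. The raw inner product is the sum of squared deviations, so the sketch divides by n - 1 to recover the usual sample variance.<br />
<pre>
```python
import statistics

# Hypothetical observations of x (illustrative values only).
x = [2.0, 4.0, 9.0]

def centered(values):
    """Subtract the mean from every observation."""
    mu = sum(values) / len(values)
    return [v - mu for v in values]

xc = centered(x)

# Inner product of the centered vector with itself:
# the sum of squared deviations from the mean.
sum_sq = sum(a * a for a in xc)

# Dividing by n - 1 gives the sample variance.
sample_var = sum_sq / (len(x) - 1)

# The standard library's sample variance is printed for comparison.
print(sum_sq, sample_var, statistics.variance(x))
```
</pre>
Dividing by n instead of n - 1 would give the population variance; which normalization is appropriate depends on the application.<br />
<br />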
 Extending this to multiple variables, with each column of the data matrix centered as above, brings us to covariance matrices, a fundamental object in applied statistics:<br />
<br />
<div align="center"><math xmlns="http://www.w3.org/1998/Math/MathML"><mstyle displaystyle="true"><msup><mi>D</mi><mo>'</mo></msup><mo>&CircleTimes;</mo><mi>D</mi><mo>=</mo>
<mrow><mo>(</mo><mtable><mtr columnalign="right"><mtd>
<msub><mi>x</mi><mn>1</mn></msub></mtd><mtd><msub><mi>x</mi><mn>2</mn></msub></mtd><mtd><msub><mi>x</mi><mn>3</mn></msub></mtd></mtr>
<mtr columnalign="right"><mtd>
<msub><mi>y</mi><mn>1</mn></msub></mtd><mtd><msub><mi>y</mi><mn>2</mn></msub></mtd><mtd><msub><mi>y</mi><mn>3</mn></msub>
</mtd></mtr>
</mtable><mo>)</mo></mrow>
<mo>&CircleTimes;</mo>
<mo>[</mo><mtable><mtr columnalign="right"><mtd>
<msub><mi>x</mi><mn>1</mn></msub></mtd><mtd><msub><mi>y</mi><mn>1</mn></msub></mtd></mtr>
<mtr columnalign="right"><mtd>
<msub><mi>x</mi><mn>2</mn></msub></mtd><mtd><msub><mi>y</mi><mn>2</mn></msub></mtd></mtr>
<mtr columnalign="right"><mtd>
<msub><mi>x</mi><mn>3</mn></msub></mtd><mtd><msub><mi>y</mi><mn>3</mn></msub>
</mtd></mtr>
</mtable><mo>]</mo>

<mo>&#x221D;</mo>
<mo>[</mo><mtable><mtr columnalign="right"><mtd>
<mtext>  var</mtext><mrow><mo>(</mo><mi>x</mi><mo>)</mo></mrow></mtd><mtd><mtext>cov</mtext><mrow><mo>(</mo><mi>x</mi><mo>,</mo><mi>y</mi><mo>)</mo></mrow></mtd></mtr>
<mtr columnalign="right"><mtd>
<mtext>cov</mtext><mrow><mo>(</mo><mi>y</mi><mo>,</mo><mi>x</mi><mo>)</mo></mrow></mtd><mtd><mtext>  var</mtext><mrow><mo>(</mo><mi>y</mi><mo>)</mo></mrow>
</mtd></mtr>
</mtable><mo>]</mo></mstyle></math></div><br />
<br />
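The product above can be sketched numerically in plain Python. The observations and the helper names <code>centered</code>, <code>transpose</code>, and <code>matmul</code> are assumptions chosen for illustration. The sketch forms the centered data matrix, multiplies its transpose by itself, and then divides by n - 1, which is one common convention for turning the sums of squares and cross products into sample variances and covariances.<br />
<pre>
```python
# Hypothetical paired observations (illustrative values only).
x = [2.0, 4.0, 9.0]
y = [1.0, 3.0, 2.0]

def centered(values):
    """Subtract the mean from every observation."""
    mu = sum(values) / len(values)
    return [v - mu for v in values]

def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Ordinary matrix product of two list-of-lists matrices."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)]
            for row in a]

# Centered data matrix D: one row per observation, one column per variable.
D = [list(row) for row in zip(centered(x), centered(y))]

# D'D holds sums of squares (diagonal) and cross products (off-diagonal).
scatter = matmul(transpose(D), D)

# Dividing by n - 1 gives the sample covariance matrix.
n = len(D)
cov = [[s / (n - 1) for s in row] for row in scatter]

print(scatter)
print(cov)
```
</pre>
The diagonal of <code>cov</code> holds the variances of x and y, and the two off-diagonal entries are the same covariance of x with y.<br />
<br />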
Note that these covariance matrices are always real, symmetric, and <i>positive semidefinite</i>.<br />
<br />
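These properties can be spot-checked numerically. The sketch below uses a made-up centered data matrix and a hypothetical helper <code>quad_form</code>. Because the quadratic form of D'D at a vector v equals the squared length of Dv, it can never be negative. Sampling random vectors only illustrates this and is not a proof.<br />
<pre>
```python
import random

# Hypothetical centered data matrix (rows are observations,
# columns are variables); each column sums to zero.
D = [[-3.0, -1.0], [-1.0, 1.0], [4.0, 0.0]]

# S = D'D, computed entry by entry from the columns of D.
cols = list(zip(*D))
S = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]

# Symmetry: entry (i, j) equals entry (j, i).
size = len(S)
is_symmetric = all(S[i][j] == S[j][i]
                   for i in range(size) for j in range(size))

def quad_form(m, v):
    """Return the quadratic form v' m v."""
    k = len(v)
    return sum(v[i] * m[i][j] * v[j] for i in range(k) for j in range(k))

# Positive semidefiniteness: v' S v is never negative.
# A tiny tolerance guards against floating-point rounding.
random.seed(0)
trials = [[random.uniform(-1, 1), random.uniform(-1, 1)]
          for _ in range(1000)]
all_nonnegative = all(quad_form(S, v) >= -1e-12 for v in trials)

print(is_symmetric, all_nonnegative)
```
</pre>
<br />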

</body>
</html>