GETTING STARTED IN MATLAB (ver 1.0, beta/draft)

 

Regression with matrix algebra

 

 

 

 

 

 

We are going to run the following regression:

GDPpcap = constant + Consump + Mgrowth + Popgrowth + Xgrowth

where the variables have the following sample means:

 

>> muconsump=mean(Consump)

muconsump =

    3.2439

>> mumgrowth=mean(Mgrowth)

mumgrowth =

    6.9586

>> muxgrowth=mean(Xgrowth)

muxgrowth =

    6.0626

>> mupopgrowth=mean(Popgrowth)

mupopgrowth =

    1.0589

>> mugdp=mean(GDPpcap)

mugdp =

    2.0861

>> muones=mean(Ones)   % Ones is the 39x1 column of ones created below with ones(39,1)

muones =

     1

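>> devgdp=GDPpcap./mugdp;          % each series is divided by its mean (a ratio to the mean, not a deviation from it)

>> devx=Xgrowth./muxgrowth;

>> devm=Mgrowth./mumgrowth;

>> devcon=Consump./muconsump;

>> devpop=Popgrowth./mupopgrowth;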
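
The dev variables above are computed with ./, so each one is a ratio to the mean rather than a deviation from it. The following is only a small sketch with a made-up vector xd (it is not part of the GDP data) that shows the two calculations side by side:

>> xd = [2; 4; 6];          % hypothetical data, for illustration only
>> mud = mean(xd);          % mud = 4
>> ratiod = xd./mud         % ratio to the mean: [0.5; 1.0; 1.5]
>> devd = xd - mud          % deviation from the mean: [-2; 0; 2]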

 

 

Stata can run this regression directly. Here we replicate its results in MATLAB using matrix algebra.

 

 

 

 

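>> Ones=ones(39,1);   % column of ones for the constant term (39 observations)

>> ivs = [Xgrowth Mgrowth Consump Popgrowth Ones];   % regressor matrix, constant in the last column

>> b=inv(ivs'*ivs)*ivs'*GDPpcap   % OLS coefficients, in the same order as the columns of ivs

b =

    0.0966
    0.0867
    0.8317
   -0.9844
   -0.7585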

>> gdphat=ivs*b;        % fitted values

>> e=GDPpcap-gdphat;    % residuals
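
The coefficient formula used above, inv(X'X)X'y, can also be solved with MATLAB's backslash operator, which avoids forming the inverse explicitly and is generally the more accurate route numerically. The sketch below uses made-up data (xd, yd, Xd and the other names ending in d are hypothetical, not the GDP data set) to check that both routes agree:

>> xd = (1:10)';                    % hypothetical regressor
>> yd = 3 + 2*xd;                   % hypothetical response, exactly linear in xd
>> Xd = [xd ones(10,1)];            % regressor plus a column of ones for the constant
>> binvd = inv(Xd'*Xd)*Xd'*yd;      % normal equations, as in the text
>> bbsd = Xd\yd;                    % least-squares solution via backslash
>> max(abs(binvd - bbsd))           % the difference is at round-off level
>> resd = yd - Xd*bbsd;             % residuals (essentially zero here, since yd is exactly linear)

With real data the two results should match to many decimal places unless the regressors are nearly collinear, which is the case where backslash is the safer choice.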

>> sumsqre=e'*e

sumsqre =

   16.2059

 

>> sscpinv=inv(ivsdev'*ivsdev)

sscpinv =

    0.0343   -0.0072    0.0038    0.0016   -0.0325
   -0.0072    0.0736   -0.1227   -0.0893    0.1455
    0.0038   -0.1227    0.3314    0.2186   -0.4311
    0.0016   -0.0893    0.2186    1.6649   -1.7959
   -0.0325    0.1455   -0.4311   -1.7959    2.1397

 

>> variance=sumsqre/34   % residual variance: n - k = 39 - 5 = 34 degrees of freedom

variance =

    0.4766
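
With the residual variance and the inverse of X'X in hand, the usual next step is the coefficient standard errors, which are the square roots of the diagonal of variance*inv(X'X). The following is a minimal sketch with made-up data (all names ending in d are hypothetical and the response values are invented), not a replication of the GDP regression:

>> xd = (1:8)';
>> yd = [2.1; 3.9; 6.2; 7.8; 10.1; 12.2; 13.8; 16.1];   % invented responses
>> Xd = [xd ones(8,1)];                  % regressor plus constant
>> bd = Xd\yd;                           % OLS coefficients
>> ed = yd - Xd*bd;                      % residuals
>> [nd,kd] = size(Xd);                   % nd observations, kd coefficients
>> s2d = (ed'*ed)/(nd-kd);               % residual variance with nd - kd degrees of freedom
>> sed = sqrt(s2d*diag(inv(Xd'*Xd)))     % standard errors of the coefficients
>> tstatd = bd./sed                      % t statistics

Taking the degrees of freedom from size(Xd) rather than typing the number in avoids a miscount when regressors are added or dropped.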

 

When you first open MATLAB this is what you will see:

MatlabScreen.jpg

Once you have files in your directory, they will appear in the "Current Directory" window:

If you right-click the file Heating.csv, a menu will pop up:

 

MATLAB gives you, among other things, the option to open the file as text, open it with another program, or import it.

The Import Data option opens the "Import Wizard" (which you can also open with File > Import Data from the main menu, or by typing uiimport in the command window).

Step 1 - Select the format of your data

 

Step 2 - Click "Finish"

If you click on the "Workspace" tab you'll see two variables, "data" and "textdata".

MATLAB works with matrices (the name stands for "MATrix LABoratory"). In the "Value" column, <323x4 double> means that data is a matrix with 323 rows and 4 columns, stored in double-precision numeric format (64 bits per number).

The "textdata" variable is a cell array, a container that can hold text or numeric values. Here it holds strings and has 324 rows and 2 columns.

If you double-click on "data", a fourth window will appear. The "Array Editor" is where your data sits; it looks like a spreadsheet.
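
The same kind of import can also be done from the command line. The sketch below is only an illustration under stated assumptions: it first writes a tiny made-up file (demo_import.csv, with invented column names and values) so that it can be run anywhere, then reads it back with importdata. For a file that mixes a header line with numeric columns, importdata typically returns a structure whose data and textdata fields play the same role as the two variables created by the Import Wizard.

>> fid = fopen('demo_import.csv','w');                  % hypothetical file name
>> fprintf(fid,'id,temp\n1,20.5\n2,21.0\n3,19.8\n');    % one header line, three data rows
>> fclose(fid);
>> Ad = importdata('demo_import.csv');                  % expected to be a structure for this kind of file
>> size(Ad.data)                                        % dimensions of the numeric block
>> Ad.textdata                                          % the header text, stored in a cell array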

 

 

The list of array operators includes:

 

+ Addition

- Subtraction

.* Element-by-element multiplication

./ Element-by-element division

.\ Element-by-element left division

.^ Element-by-element power

.' Unconjugated array transpose

Source: Matlab tutorial
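
A short sketch of how the dotted operators differ from ordinary matrix operations, using a made-up 2x2 matrix Ad (a hypothetical name). The matrix product * is not in the list above and is shown only for contrast:

>> Ad = [1 2; 3 4];
>> Ad*Ad        % matrix product:                 [7 10; 15 22]
>> Ad.*Ad       % element-by-element product:     [1 4; 9 16]
>> Ad.^2        % element-by-element power:       [1 4; 9 16]
>> Ad./Ad       % element-by-element division:    [1 1; 1 1]
>> Ad.'         % transpose:                      [1 3; 2 4]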

 

 

Checking for missing data

 

>> sum(isnan(data))

ans =

     0     0     2     2     2     0

Columns 3, 4 and 5 each contain 2 missing values (NaN).
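
The same idea can be used to count missing values by row or to locate the affected rows. A small sketch with a made-up matrix Xd (hypothetical, not the imported data):

>> Xd = [1 NaN 3; 4 5 6; NaN 8 9];   % made-up data with two NaNs
>> sum(isnan(Xd))                    % NaNs per column: [1 1 0]
>> sum(isnan(Xd),2)                  % NaNs per row:    [1; 0; 1]
>> find(any(isnan(Xd),2))            % rows that contain a NaN: [1; 3]
>> sum(isnan(Xd(:)))                 % total number of NaNs: 2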

 

 

 

Code                             Description

i = find(~isnan(x));             Find the indices of the elements of a vector x that are
x = x(i)                         not NaN, then keep only those elements.

x = x(~isnan(x));                Remove NaNs from a vector x.

x(isnan(x)) = [];                Remove NaNs from a vector x (alternative method).

X(any(isnan(X),2),:) = [];       Remove any rows containing NaNs from a matrix X.

Source: http://www.mathworks.com/access/helpdesk/help/techdoc/index.html?/access/helpdesk/help/techdoc/data_analysis/bqm3i7n-13.html&http://www.mathworks.com/access/helpdesk/help/techdoc/data_analysis/f0-9167.html

If you frequently need to remove NaNs, you might want to write a short M-file function (saved as exciseRows.m) that you can call:

function X = exciseRows(X)

X(any(isnan(X),2),:) = [];

The following command computes the correlation coefficients of X after all rows containing NaNs are removed:

C = corrcoef(exciseRows(X));
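
The row-removal step can be tried at the prompt without creating the M-file. The sketch below applies the same indexing expression to a made-up matrix Xd (hypothetical data) and keeps the original intact by writing the result to a new variable:

>> Xd = [1 2; NaN 3; 4 5; 6 9];              % made-up data, row 2 has a NaN
>> Xcleand = Xd(~any(isnan(Xd),2),:)         % keeps rows 1, 3 and 4: [1 2; 4 5; 6 9]
>> Cd = corrcoef(Xcleand)                    % 2x2 correlation matrix of the two columns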

 

 

 

Generating matrices

 

>> E = [1 2 3 4 5; 2 3 4 5 6; 3 4 5 6 7; 4 5 6 7 8; 5 6 7 8 9]

E =

     1     2     3     4     5
     2     3     4     5     6
     3     4     5     6     7
     4     5     6     7     8
     5     6     7     8     9

 

 

>> B = zeros(3,4)

B =

     0     0     0     0
     0     0     0     0
     0     0     0     0

>> C = ones(3,4)

C =

     1     1     1     1
     1     1     1     1
     1     1     1     1

 

>> R = rand(3,4)

R =

    0.0357    0.6787    0.3922    0.7060
    0.8491    0.7577    0.6555    0.0318
    0.9340    0.7431    0.1712    0.2769

>> Rn = randn(3,4)

Rn =

   -1.3362   -0.6918   -1.5937   -0.3999
    0.7143    0.8580   -1.4410    0.6900
    1.6236    1.2540    0.5711    0.8156

 

>> D = [B C; R Rn]

D =

         0         0         0         0    1.0000    1.0000    1.0000    1.0000
         0         0         0         0    1.0000    1.0000    1.0000    1.0000
         0         0         0         0    1.0000    1.0000    1.0000    1.0000
    0.0357    0.6787    0.3922    0.7060   -1.3362   -0.6918   -1.5937   -0.3999
    0.8491    0.7577    0.6555    0.0318    0.7143    0.8580   -1.4410    0.6900
    0.9340    0.7431    0.1712    0.2769    1.6236    1.2540    0.5711    0.8156
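
A few other common ways of generating matrices, shown as a brief sketch (the variable names ending in d are hypothetical). Blocks stacked vertically must have the same number of columns, and blocks placed side by side must have the same number of rows:

>> Id = eye(3)                  % 3x3 identity matrix
>> vd = 1:5                     % row vector 1 2 3 4 5
>> wd = linspace(0,1,5)         % five evenly spaced points: 0 0.25 0.5 0.75 1
>> Md = [Id; vd(1:3)]           % stack a 1x3 row under the 3x3 identity
>> size(Md)                     % [4 3]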

 

Deleting rows and columns

 

>> E   % Recalling matrix E

E =

     1     2     3     4     5
     2     3     4     5     6
     3     4     5     6     7
     4     5     6     7     8
     5     6     7     8     9

>> E(:,3) = []   % Deleting the third column

E =

     1     2     4     5
     2     3     5     6
     3     4     6     7
     4     5     7     8
     5     6     8     9

>> E(3,:) = []   % Deleting the third row

E =

     1     2     4     5
     2     3     5     6
     4     5     7     8
     5     6     8     9

 

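You cannot delete a single element from a matrix with row and column subscripts, because the result would no longer be rectangular: E(1,2) = [] returns an error. Deleting with a single (linear) index, as in E(2) = [], does work, but it reshapes the remaining elements into a row vector.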
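
A small made-up example of what that reshaping looks like (Ed and Gd are hypothetical names). MATLAB counts linear indices down the columns, so index 2 of Ed is the element 4:

>> Ed = [1 2 3; 4 5 6];
>> Ed(2) = []            % removes the 4 and returns the row vector 1 2 5 3 6
>> Gd = [1 2 3; 4 5 6];
>> Gd(2,1) = NaN         % alternative: mark the element as missing and keep the 2x3 shape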

 

Creating tables

 

>> F = (10:5:100)'

F =

    10
    15
    20
    25
    30
    35
    40
    45
    50
    55
    60
    65
    70
    75
    80
    85
    90
    95
   100

 

>> squares = [F F.*F]

squares =

          10         100
          15         225
          20         400
          25         625
          30         900
          35        1225
          40        1600
          45        2025
          50        2500
          55        3025
          60        3600
          65        4225
          70        4900
          75        5625
          80        6400
          85        7225
          90        8100
          95        9025
         100       10000
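
The same pattern extends to more columns, and fprintf can print the table with a chosen layout. A brief sketch with a made-up vector kd (hypothetical name); note that fprintf reads the matrix column by column, so the table is transposed before printing:

>> kd = (1:5)';
>> Td = [kd kd.^2 kd.^3]                 % columns: k, k squared, k cubed
>> fprintf('%4d %6d %8d\n', Td')         % one table row per printed line

The last row of Td is 5 25 125.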

 

Descriptive statistics

 

>> R

R =

    1.4929    2.4352    6.1604    5.4972    3.8045    7.7917    0.1190
    2.5751    9.2926    4.7329    9.1719    5.6782    9.3401    3.3712
    8.4072    3.4998    3.5166    2.8584    0.7585    1.2991    1.6218
    2.5428    1.9660    8.3083    7.5720    0.5395    5.6882    7.9428
    8.1428    2.5108    5.8526    7.5373    5.3080    4.6939    3.1122

>> mean(R)

ans =

    4.6322    3.9409    5.7142    6.5274    3.2177    5.7626    3.2334

>> std(R)

ans =

    3.3551    3.0434    1.7847    2.4304    2.4489    3.0817    2.9372

>> var(R)

ans =

   11.2568    9.2620    3.1850    5.9069    5.9970    9.4966    8.6273

>> median(R)

ans =

    2.5751    2.5108    5.8526    7.5373    3.8045    5.6882    3.1122

>> mode(R)

ans =

    1.4929    1.9660    3.5166    2.8584    0.5395    1.2991    0.1190

>> max(R)

ans =

    8.4072    9.2926    8.3083    9.1719    5.6782    9.3401    7.9428

>> min(R)

ans =

    1.4929    1.9660    3.5166    2.8584    0.5395    1.2991    0.1190

>> [n,p]=size(R)   % Structure of the data, n=rows, p=columns

n =

     5

p =

     7
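
All of these functions work column by column when given a matrix. Most accept a dimension argument to work across rows instead, and max and min can also return the position of the value found. The sketch below uses a small made-up matrix Sd (hypothetical). It also shows why mode(R) above equals min(R): when no value repeats, mode returns the smallest one.

>> Sd = [1 2; 3 4; 5 9];
>> mean(Sd)               % column means: [3 5]
>> mean(Sd,2)             % row means: [1.5; 3.5; 7]
>> [mxd,rowd] = max(Sd)   % mxd = [5 9], found in rows rowd = [3 3]
>> mode([1 2 2 3])        % 2, the most frequent value
>> mode([4 1 3])          % 1, the smallest value, since nothing repeats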

 

Plotting data (command)

 

>> data = rand(20,3)*10   % Let's generate some numbers

data =

    1.8482    0.2922    7.9618
    9.0488    9.2885    0.9871
    9.7975    7.3033    2.6187
    4.3887    4.8861    3.3536
    1.1112    5.7853    6.7973
    2.5806    2.3728    1.3655
    4.0872    4.5885    7.2123
    5.9490    9.6309    1.0676
    2.6221    5.4681    6.5376
    6.0284    5.2114    4.9417
    7.1122    2.3159    7.7905
    2.2175    4.8890    7.1504
    1.1742    6.2406    9.0372
    2.9668    6.7914    8.9092
    3.1878    3.9552    3.3416
    4.2417    3.6744    6.9875
    5.0786    9.8798    1.9781
    0.8552    0.3774    0.3054
    2.6248    8.8517    7.4407
    8.0101    9.1329    5.0002

>> [n,p]= size(data)   % Get the structure of the data

n =

    20

p =

     3

 

>> t = 1:n;   % Generate the timeline for the x-axis

>> plot(t,data)

>> legend('GDP','Invest','Pop','Location','NorthWest')   % legend in the upper-left corner

>> xlabel('Time'), ylabel('Growth')

 

 

>> plot(t,data(:,1))   % If you want to plot only the first column

>> plot(t,data(:,2))   % If you want to plot only the second column

>> plot(t,data(:,3))   % If you want to plot only the third column
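
Each plot command above replaces the previous figure. To see the three columns at once in separate panels, subplot can be used. This is only a sketch with freshly generated made-up numbers (pd and tpd are hypothetical names, and the panel titles are placeholders):

>> pd = rand(20,3)*10;                  % made-up series, one per column
>> tpd = 1:size(pd,1);                  % timeline for the x-axis
>> figure                               % open a new figure window
>> subplot(3,1,1), plot(tpd,pd(:,1)), title('Column 1')
>> subplot(3,1,2), plot(tpd,pd(:,2)), title('Column 2')
>> subplot(3,1,3), plot(tpd,pd(:,3)), title('Column 3')
>> xlabel('Time')                       % label under the bottom panel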

 

 

CURVE FITTING

 

Open the Curve Fitting Tool with the cftool command (it is part of the Curve Fitting Toolbox):

>> cftool
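
For a simple polynomial fit without the graphical tool, the base MATLAB functions polyfit and polyval can be used from the command line. This is a different route from cftool, shown here only as a sketch with made-up data that lie exactly on a quadratic (xcd, ycd, pcd and yfitd are hypothetical names):

>> xcd = (0:9)';
>> ycd = 1 + 2*xcd + 0.5*xcd.^2;        % made-up data on an exact quadratic
>> pcd = polyfit(xcd,ycd,2)             % coefficients, highest power first: approximately [0.5 2 1]
>> yfitd = polyval(pcd,xcd);            % fitted values at the data points
>> max(abs(yfitd - ycd))                % essentially zero for this noise-free example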

 

 

 

 

>> b1=inv(ivsdev'*ivsdev)*ivsdev'*devgdp

b1 =

    0.2752
    0.3140
    1.2200
   -0.8049

 

 

Online references

 

http://www.mathworks.com/matlabcentral/

http://www.duke.edu/~hpgavin/matlab.html

http://oak.cats.ohiou.edu/~lacombe/research.html

http://www.mathworks.com/products/matlab/demos.html

 

 

GOOGLE Books

 

http://books.google.com/books?ct=result&q=%2Bmatlab&btnG=Search+Books