GETTING STARTED IN MATLAB (ver 1.0, beta/draft)
Regression with matrix algebra
We are going to run the following regression:
GDPpcap = constant + Consump + Mgrowth + Popgrowth + Xgrowth
>> muconsump=mean(Consump)
muconsump =
    3.2439
>> mumgrowth=mean(Mgrowth)
mumgrowth =
    6.9586
>> muxgrowth=mean(Xgrowth)
muxgrowth =
    6.0626
>> mupopgrowth=mean(Popgrowth)
mupopgrowth =
    1.0589
>> mugdp=mean(GDPpcap)
mugdp =
    2.0861
>> muones=mean(Ones)
muones =
     1
>> devgdp=GDPpcap./mugdp   %Each series is divided by its mean (./ is element-by-element division)
>> devx=Xgrowth./muxgrowth
>> devm=Mgrowth./mumgrowth
>> devcon=Consump./muconsump
>> devpop=Popgrowth./mupopgrowth
Running this regression in Stata gives the results we want to reproduce. We will replicate the process in MATLAB using matrix algebra.
>> Ones=ones(39,1)
>> ivs = [Xgrowth Mgrowth Consump Popgrowth Ones]
>> b=inv(ivs'*ivs)*ivs'*GDPpcap
b =
    0.0966
    0.0867
    0.8317
   -0.9844
   -0.7585
>> gdphat= ivs*b
>> e=GDPpcap-gdphat
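The line that computes b is the ordinary least squares formula b = (X'X)^(-1) X'y, with ivs playing the role of X and GDPpcap the role of y. The following is a minimal sketch of the same steps on made-up data (not the GDP data set); the names X, y, bTrue, bInv and bSlash are assumptions introduced only for illustration. It also shows the backslash operator, which solves the same least-squares problem without forming the inverse explicitly.

```matlab
% Illustrative data only: these numbers are made up, not the GDP data set.
n  = 39;                         % same number of rows as Ones in the text
x1 = (1:n)';                     % hypothetical regressor 1
x2 = mod((1:n)', 7);             % hypothetical regressor 2
X  = [x1 x2 ones(n,1)];          % design matrix with a constant column
bTrue = [0.5; -2; 3];            % coefficients used to build y
y  = X*bTrue;                    % noise-free response, so OLS should recover bTrue

bInv   = inv(X'*X)*X'*y;         % textbook formula, as used in the text
bSlash = X\y;                    % least-squares solve without calling inv()

yhat = X*bSlash;                 % fitted values
e    = y - yhat;                 % residuals
disp([bInv bSlash])              % the two columns should agree
```

Both approaches should give the same coefficients up to rounding; the backslash form is usually the safer habit when the columns of X are close to collinear.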
>> sumsqre=e'*e
sumsqre =
   16.2059
>> sscpinv=inv(ivsdev'*ivsdev)
sscpinv =
    0.0343   -0.0072    0.0038    0.0016   -0.0325
   -0.0072    0.0736   -0.1227   -0.0893    0.1455
    0.0038   -0.1227    0.3314    0.2186   -0.4311
    0.0016   -0.0893    0.2186    1.6649   -1.7959
   -0.0325    0.1455   -0.4311   -1.7959    2.1397
>> variance=sumsqre/33
variance =
    0.4911
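The residual variance and the inverse cross-product matrix are the two ingredients of the coefficient standard errors. The sketch below puts them together on made-up data; X, y, noise and the other names are assumptions, not variables from this tutorial. It divides the sum of squared residuals by n - k (rows minus estimated coefficients), which is an assumption about the intended degrees of freedom: with 39 rows and 5 columns in ivs that would be 34, whereas the command above divides by 33, so check which divisor matches your Stata output.

```matlab
% Illustrative data only (made up); X, y and noise are assumptions.
n = 10;
X = [(1:n)' ones(n,1)];                    % one regressor plus a constant
noise = [0.1 -0.2 0.05 0.15 -0.1 0.2 -0.05 -0.15 0.1 -0.1]';
y = X*[2; 1] + noise;

b  = X\y;                                  % OLS coefficients
e  = y - X*b;                              % residuals
k  = size(X,2);                            % number of estimated coefficients
s2 = (e'*e)/(n - k);                       % residual variance, assuming df = n - k
XtXinv = inv(X'*X);                        % same role as sscpinv in the text
se = sqrt(s2*diag(XtXinv));                % standard errors of the coefficients
tstat = b./se;                             % t statistics
disp([b se tstat])
```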
When you first open MATLAB, this is what you will see:
Once you have some files in your directory, they will appear in the "Current Directory" window:
If you right-click on the file Heating.csv, a menu will pop up:
MATLAB gives you, among other things, the option to open the file as text, to open it with another program, or to import it.
The import data option opens the "Import Wizard" (which you can also open using File > Import Data from the main menu, or by typing uiimport in the command window).
Step 1 - Select the format of your data
Step 2 - Click "Finish"
If you click on the "Workspace" tab you'll see two variables, "data" and "textdata".
MATLAB works with matrices, so in the "Value" column, <323x4 double> means that the data is stored in "double precision" numeric format (64 bits for each number stored in memory) as a matrix with 323 rows and 4 columns. MATLAB stands for "MATrix LABoratory".
The "textdata" variable is a "cell" array, which can contain strings or numeric values; the default is string, and here it has 324 rows and 2 columns.
If you double-click on "data", a fourth window will appear. The "Array Editor" is where your data sits. It looks like a spreadsheet.
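The same information shown in the Workspace tab can be checked from the command window. The sketch below uses stand-in variables with the sizes quoted above, since the real ones come from the Import Wizard; the header text 'Year' is a hypothetical value used only to show that a cell can hold text.

```matlab
% Stand-ins with the sizes quoted in the text; the real variables come from the Import Wizard.
data = zeros(323,4);        % numeric matrix, double precision by default
textdata = cell(324,2);     % cell array, each cell can hold text or numbers

class(data)                 % 'double'
size(data)                  % 323 4
class(textdata)             % 'cell'
textdata{1,1} = 'Year';     % hypothetical header text stored in one cell
whos data textdata          % lists name, size, bytes and class
```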
The list of operators includes:
+ Addition
- Subtraction
.* Element-by-element multiplication
./ Element-by-element division
.\ Element-by-element left division
.^ Element-by-element power
.' Unconjugated array transpose
Source: MATLAB tutorial
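A short sketch of how the element-by-element operators differ from their matrix counterparts, using two made-up 2x2 matrices (A and B are illustrative names, not variables from this tutorial):

```matlab
A = [1 2; 3 4];
B = [5 6; 7 8];

A.*B      % element-by-element product: [5 12; 21 32]
A*B       % matrix product:             [19 22; 43 50]
A./B      % each A(i,j) divided by B(i,j)
A.\B      % left division, same result as B./A
A.^2      % each element squared:       [1 4; 9 16]
A.'       % transpose:                  [1 3; 2 4]
```

The dot makes the operation act on matching positions of the two arrays, which is why A.*B and A*B give different answers.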
Checking for missing data
>> sum(isnan(data))
ans =
     0     0     2     2     2     0
There are 2 missing values in each of columns 3, 4 and 5.
Code | Description |
i = find(~isnan(x)); x = x(i) | Find the indices of elements in a vector x that are not NaNs. Keep only the non-NaN elements. |
x = x(~isnan(x)); | Remove NaNs from a vector x. |
x(isnan(x)) = []; | Remove NaNs from a vector x (alternative method). |
X(any(isnan(X),2),:) = []; | Remove any rows containing NaNs from a matrix X. |
If you frequently need to remove NaNs, you might want to write a short M-file function that you can call:
function X = exciseRows(X)
X(any(isnan(X),2),:) = [];
The following command computes the correlation coefficients of X after all rows containing NaNs are removed:
C = corrcoef(exciseRows(X));
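As a concrete illustration of the steps above, here is a small sketch on a made-up matrix with two missing values. It uses a logical index (keep) rather than a separate M-file so that it runs as a single script; the variable names are assumptions.

```matlab
% Made-up matrix with missing values in rows 2 and 4.
X = [1 2 3; 4 NaN 6; 7 8 9; NaN 11 12; 13 14 16];

nMissing = sum(isnan(X))            % missing values per column: 1 1 0
keep = ~any(isnan(X),2);            % true for rows that contain no NaN
Xclean = X(keep,:)                  % rows 1, 3 and 5 remain

C = corrcoef(Xclean);               % correlations computed on complete rows only
```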
Generating matrices
>> E = [1 2 3 4 5; 2 3 4 5 6; 3 4 5 6 7; 4 5 6 7 8; 5 6 7 8 9]
E =
     1     2     3     4     5
     2     3     4     5     6
     3     4     5     6     7
     4     5     6     7     8
     5     6     7     8     9
>> B = zeros(3,4)
B =
     0     0     0     0
     0     0     0     0
     0     0     0     0
>> C = ones(3,4)
C =
     1     1     1     1
     1     1     1     1
     1     1     1     1
>> R = rand(3,4)
R =
    0.0357    0.6787    0.3922    0.7060
    0.8491    0.7577    0.6555    0.0318
    0.9340    0.7431    0.1712    0.2769
>> Rn = randn(3,4)
Rn =
   -1.3362   -0.6918   -1.5937   -0.3999
    0.7143    0.8580   -1.4410    0.6900
    1.6236    1.2540    0.5711    0.8156
>> D = [B C; R Rn]
D =
         0         0         0         0    1.0000    1.0000    1.0000    1.0000
         0         0         0         0    1.0000    1.0000    1.0000    1.0000
         0         0         0         0    1.0000    1.0000    1.0000    1.0000
    0.0357    0.6787    0.3922    0.7060   -1.3362   -0.6918   -1.5937   -0.3999
    0.8491    0.7577    0.6555    0.0318    0.7143    0.8580   -1.4410    0.6900
    0.9340    0.7431    0.1712    0.2769    1.6236    1.2540    0.5711    0.8156
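The last command builds D out of four blocks: a space (or comma) places blocks side by side and a semicolon stacks them. A minimal sketch of the size rule behind it, reusing the same names; the values of R and Rn are random, so they will differ from the output shown above on every run.

```matlab
B  = zeros(3,4);
C  = ones(3,4);
R  = rand(3,4);      % uniform draws on (0,1); values differ on every run
Rn = randn(3,4);     % standard normal draws

D = [B C; R Rn];     % side by side with a space, stacked with a semicolon
size(D)              % 6 8

% Blocks must agree in size: [B C] is 3x8, so the block below it needs 8 columns.
top = [B C];
bottom = [R Rn];
isequal(D, [top; bottom])
```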
Deleting rows and columns
>> E   %Recalling matrix E
E =
     1     2     3     4     5
     2     3     4     5     6
     3     4     5     6     7
     4     5     6     7     8
     5     6     7     8     9
>> E(:,3) = []   %Deleting the third column
E =
     1     2     4     5
     2     3     5     6
     3     4     6     7
     4     5     7     8
     5     6     8     9
>> E(3,:) = []   %Deleting the third row
E =
     1     2     4     5
     2     3     5     6
     4     5     7     8
     5     6     8     9
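You cannot delete a single element from a matrix and keep it a matrix: an expression such as E(1,2) = [] gives an error, while deleting with a single subscript, as in E(2) = [], removes the element and reshapes what is left into a row vector.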
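The deletion rules can be tried out with the sketch below, which starts again from the original 5x5 matrix E; the copy v and the try/catch block are additions for illustration only.

```matlab
E = [1 2 3 4 5; 2 3 4 5 6; 3 4 5 6 7; 4 5 6 7 8; 5 6 7 8 9];

E(:,3) = [];          % drop column 3, E is now 5x4
E(3,:) = [];          % drop row 3, E is now 4x4

v = E;
v(2) = [];            % single subscript: removes one element, leaves a 1x15 row vector
size(v)

try
    E(1,2) = [];      % two subscripts naming a single element: not allowed
catch err
    disp(err.message) % MATLAB reports that this deletion is not valid
end
```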
Creating tables
>> F = (10:5:100)'
F =
    10
    15
    20
    25
    30
    35
    40
    45
    50
    55
    60
    65
    70
    75
    80
    85
    90
    95
   100
>> squares = [F F.*F]
squares =
          10         100
          15         225
          20         400
          25         625
          30         900
          35        1225
          40        1600
          45        2025
          50        2500
          55        3025
          60        3600
          65        4225
          70        4900
          75        5625
          80        6400
          85        7225
          90        8100
          95        9025
         100       10000
Descriptive statistics
>> R
R =
    1.4929    2.4352    6.1604    5.4972    3.8045    7.7917    0.1190
    2.5751    9.2926    4.7329    9.1719    5.6782    9.3401    3.3712
    8.4072    3.4998    3.5166    2.8584    0.7585    1.2991    1.6218
    2.5428    1.9660    8.3083    7.5720    0.5395    5.6882    7.9428
    8.1428    2.5108    5.8526    7.5373    5.3080    4.6939    3.1122
>> mean(R)
ans =
    4.6322    3.9409    5.7142    6.5274    3.2177    5.7626    3.2334
>> std(R)
ans =
    3.3551    3.0434    1.7847    2.4304    2.4489    3.0817    2.9372
>> var(R)
ans =
   11.2568    9.2620    3.1850    5.9069    5.9970    9.4966    8.6273
>> median(R)
ans =
    2.5751    2.5108    5.8526    7.5373    3.8045    5.6882    3.1122
>> mode(R)
ans =
    1.4929    1.9660    3.5166    2.8584    0.5395    1.2991    0.1190
>> max(R)
ans =
    8.4072    9.2926    8.3083    9.1719    5.6782    9.3401    7.9428
>> min(R)
ans =
    1.4929    1.9660    3.5166    2.8584    0.5395    1.2991    0.1190
>> [n,p]=size(R)   %Structure of the data, n=rows, p=columns
n =
     5
p =
     7
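All of these functions work column by column when given a matrix, which is why each answer above is a row with one entry per column. Note also that the mode(R) output equals the min(R) output: when no value repeats, mode returns the smallest value. A small sketch on a made-up matrix M (an illustrative name) where the results are easy to check by hand:

```matlab
M = [1 2; 3 4; 5 9];     % made-up 3x2 matrix

mean(M)                  % column means: 3 5
mean(M,2)                % row means:    1.5; 3.5; 7
median(M)                % 3 4
std(M(:,1))              % 2 (std divides by n-1)
var(M(:,1))              % 4
mode(M)                  % 1 2: no value repeats, so the smallest in each column is returned
mode([4 4 7 1 1 4])      % 4, the most frequent value
[n,p] = size(M)          % n = 3 rows, p = 2 columns
```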
Plotting data (command)
>> data = rand(20,3)*10   %Let's generate some numbers
data =
    1.8482    0.2922    7.9618
    9.0488    9.2885    0.9871
    9.7975    7.3033    2.6187
    4.3887    4.8861    3.3536
    1.1112    5.7853    6.7973
    2.5806    2.3728    1.3655
    4.0872    4.5885    7.2123
    5.9490    9.6309    1.0676
    2.6221    5.4681    6.5376
    6.0284    5.2114    4.9417
    7.1122    2.3159    7.7905
    2.2175    4.8890    7.1504
    1.1742    6.2406    9.0372
    2.9668    6.7914    8.9092
    3.1878    3.9552    3.3416
    4.2417    3.6744    6.9875
    5.0786    9.8798    1.9781
    0.8552    0.3774    0.3054
    2.6248    8.8517    7.4407
    8.0101    9.1329    5.0002
>> [n,p]= size(data)   %Get the structure of the data
n =
    20
p =
     3
>> t = 1:n;   %Generate the timeline for the x-axis
>> plot(t,data)
>> legend('GDP','Invest','Pop','Location','NorthWest')   %Legend in the upper-left corner
>> xlabel('Time'), ylabel('Growth')
>> plot(t,data(:,1))   %If you want to plot only the first column
>> plot(t,data(:,2))   %If you want to plot only the second column
>> plot(t,data(:,3))   %If you want to plot only the third column
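Each new plot command replaces the previous figure contents, so the three single-column plots above overwrite one another. One way to see all three columns at once is subplot; the sketch below is an illustration on random data, and the hidden figures ('Visible','off') and the title text are assumptions you can drop or change.

```matlab
data = rand(20,3)*10;            % made-up series, as in the text
t = 1:size(data,1);              % timeline for the x-axis

fig1 = figure('Visible','off');  % hidden figure; remove 'Visible','off' to see it
h = plot(t, data);               % one line object per column of data
legend('GDP','Invest','Pop','Location','NorthWest')
xlabel('Time'), ylabel('Growth')
title('Three made-up series')

fig2 = figure('Visible','off');
for j = 1:3
    subplot(3,1,j)               % grid of 3 rows and 1 column, panel j
    plot(t, data(:,j))           % column j in its own panel
end
```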
CURVE FITTING
You open the Curve Fitting Tool with the cftool command:
>> cftool
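cftool is a graphical tool that belongs to the Curve Fitting Toolbox. If you only need a simple polynomial fit from the command window, the built-in functions polyfit and polyval are a different, lighter route; the sketch below is not what cftool does internally, and the data are made up so the answer is known in advance.

```matlab
x = (0:9)';
y = 2*x + 1;                 % made-up data lying exactly on a straight line
p = polyfit(x, y, 1);        % degree-1 fit: p(1) is the slope, p(2) the intercept
yfit = polyval(p, x);        % fitted values at the original x
disp(p)
```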
>> b1=inv(ivsdev'*ivsdev)*ivsdev'*devgdp
b1 =
    0.2752
    0.3140
    1.2200
   -0.8049
Online references
http://www.mathworks.com/matlabcentral/
http://books.google.com/books?ct=result&q=%2Bmatlab&btnG=Search+Books