Canonical Correlation

High School and Beyond Dataset

Variable Number	Variable Name	Variable Label	Coded Response
1	id	ID number	3-digit number
2	sex	Sex	1=male 2=female
3	race	Race or Ethnicity	1=hispanic 2=asian 3=african-american 4=white
4	ses	Socioeconomic status	1=low 2=medium 3=high
5	sctyp	School type	1=public 2=private
6	hsp	High school program	1=general 2=academic preparatory 3=vocational/technical
7	locus	Locus of control	(mean = 0 and sd = 1)
8	slfcpt	Self-concept	(mean = 0 and sd = 1)
9	mot	Motivation	(average of three motivational items)
10	car	Career choice	1=clerical 2=craftsman 3=farmer 4=homemaker 5=laborer 6=manager 7=military 8=operative 9=professional 1 10=professional 2 11=proprietor 12=protective 13=sales 14=school 15=service 16=technical 17=not working
11	read	Standardized reading score	(mean = 50 and sd = 10)
12	write	Standardized writing score	(mean = 50 and sd = 10)
13	math	Standardized math score	(mean = 50 and sd = 10)
14	sci	Standardized science score	(mean = 50 and sd = 10)
15	socst	Standardized social studies score	(mean = 50 and sd = 10)

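/* Read the High School and Beyond data: 15 variables per record */
data hsb;
  infile 'hsb.dat';
  input id sex race ses sctyp hsp locus slfcpt mot
        career read write math sci socst;
run;

/* Regression of locus of control on three test scores; STB prints standardized estimates */
proc reg data=hsb;
  model locus = read write math / stb;
run;

/* Discriminant analysis of SES on the test scores, with MANOVA tests and canonical output */
proc discrim data=hsb manova canonical;
  class ses;
  var read write math;
run;

/* Canonical correlation: psychological measures (VAR) versus test scores (WITH) */
proc cancorr data=hsb;
  var locus slfcpt mot;
  with read write math;
run;

/* Iterated principal factor analysis with SMC priors and varimax rotation */
proc factor data=hsb method=prinit priors=smc heywood rotate=varimax;
  var locus slfcpt mot read write math sci socst;
run;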

---------------------------------------------------------------------------

[Regression Output]

Model: MODEL1
Dependent Variable: LOCUS

Analysis of Variance

                   Sum of      Mean
Source      DF    Squares    Square  F Value   Prob>F

Model        3   45.38359  15.12786   40.299   0.0001
Error      596  223.73220   0.37539
C Total    599  269.11579

Root MSE     0.61269     R-square     0.1686
Dep Mean     0.09653     Adj R-sq     0.1645
C.V.       634.69316
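
The fit statistics above all follow from the Analysis of Variance table. As a minimal sketch (the DATA step and its variable names are illustrative and not part of the original program), they can be recomputed from the printed sums of squares and degrees of freedom:

```sas
/* Illustrative check only: recompute the fit statistics from the printed */
/* ANOVA table. All variable names are made up for this example.          */
data _null_;
  ss_model = 45.38359;   df_model = 3;
  ss_error = 223.73220;  df_error = 596;
  ss_total = 269.11579;  df_total = 599;
  dep_mean = 0.09653;

  ms_model = ss_model / df_model;                  /* Mean Square, Model */
  ms_error = ss_error / df_error;                  /* Mean Square, Error */
  f_value  = ms_model / ms_error;                  /* F Value            */
  root_mse = sqrt(ms_error);                       /* Root MSE           */
  rsq      = ss_model / ss_total;                  /* R-square           */
  adj_rsq  = 1 - (1 - rsq) * df_total / df_error;  /* Adj R-sq           */
  cv       = 100 * root_mse / dep_mean;            /* C.V.               */

  put f_value= root_mse= rsq= adj_rsq= cv=;
run;
```

The values written to the log should agree with the listing up to rounding. The very large C.V. reflects only that the mean of LOCUS is close to zero.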

Parameter Estimates

               Parameter     Standard    T for H0:
Variable DF     Estimate        Error   Parameter=0  Prob > |T|

INTERCEP  1    -1.562191   0.15552897     -10.044      0.0001
READ      1     0.013507   0.00360415       3.748      0.0002
WRITE     1     0.012067   0.00354769       3.401      0.0007
MATH      1     0.006279   0.00388416       1.617      0.1065

              Standardized
Variable  DF      Estimate

INTERCEP   1    0.00000000
READ       1    0.20358696
WRITE      1    0.17510731
MATH       1    0.08819281

[Discriminant Analysis Output]

Discriminant Analysis

600 Observations        599 DF Total
  3 Variables           597 DF Within Classes
  3 Classes               2 DF Between Classes


                        Class Level Information

                                                    Prior
   SES   Frequency      Weight   Proportion   Probability
     1         139    139.0000     0.231667      0.333333
     2         299    299.0000     0.498333      0.333333
     3         162    162.0000     0.270000      0.333333


Discriminant Analysis     Pooled Covariance Matrix Information

Covariance       Natural Log of the Determinant
Matrix Rank      of the Covariance Matrix

      3                    12.2917218


Discriminant Analysis     Pairwise Generalized Squared Distances Between Groups

 2         _   _       -1  _   _
D (i|j) = (X - X )' COV   (X - X )
            i   j           i   j

                 Generalized Squared Distance to SES
 From SES             1             2             3
        1             0       0.31342       0.98945
        2       0.31342             0       0.19325
        3       0.98945       0.19325             0


Discriminant Analysis

Multivariate Statistics and F Approximations

S=2    M=0    N=296.5

Statistic                   Value         F   Num DF  Den DF  Pr > F
Wilks' Lambda            0.88906941   12.0096      6    1190  0.0001
Pillai's Trace           0.11099492   11.6733      6    1192  0.0001
Hotelling-Lawley Trace   0.12469921   12.3452      6    1188  0.0001
Roy's Greatest Root      0.12411623   24.6578      3     596  0.0001

NOTE: F Statistic for Roy's Greatest Root is an upper bound.
NOTE: F Statistic for Wilks' Lambda is exact.

Canonical Discriminant Analysis

                        Adjusted       Approx       Squared
         Canonical      Canonical     Standard     Canonical
        Correlation    Correlation     Error      Correlation
   1      0.332283       0.325805     0.036348      0.110412
   2      0.024138       -.010077     0.040835      0.000583

                       Eigenvalues of INV(E)*H
                         = CanRsq/(1-CanRsq)

        Eigenvalue    Difference    Proportion    Cumulative

   1       0.1241        0.1235       0.9953        0.9953
   2       0.0006         .           0.0047        1.0000
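
The identity in the table heading, eigenvalue = CanRsq/(1-CanRsq), can be checked directly against the canonical correlations printed just above. The following DATA step is a small sketch with illustrative variable names:

```sas
/* Illustrative check: eigenvalues of INV(E)*H from the canonical correlations. */
data _null_;
  array r{2} _temporary_ (0.332283, 0.024138);  /* canonical correlations from the listing */
  array ev{2} ev1-ev2;

  total = 0;
  do i = 1 to 2;
    ev{i} = r{i}**2 / (1 - r{i}**2);            /* CanRsq / (1 - CanRsq) */
    total = total + ev{i};
  end;

  do i = 1 to 2;
    eigenvalue = ev{i};
    proportion = eigenvalue / total;            /* share of the summed eigenvalues */
    put i= eigenvalue= proportion=;
  end;
run;
```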

              Test of H0: The canonical correlations in the
                current row and all that follow are zero

        Likelihood
           Ratio      Approx F      Num DF      Den DF    Pr > F

   1    0.88906941     12.0096           6        1190    0.0001
   2    0.99941736      0.1737           2         596    0.8406


Total Canonical Structure

                   CAN1              CAN2

READ           0.888254         -0.295295
WRITE          0.752720         -0.443850
MATH           0.933310          0.324888


Between Canonical Structure

                   CAN1              CAN2

READ           0.999709         -0.024143
WRITE          0.999084         -0.042795
MATH           0.999680          0.025279


Pooled Within Canonical Structure

                   CAN1              CAN2

READ           0.876870         -0.308983
WRITE          0.733301         -0.458315
MATH           0.925962          0.341649

Total-Sample Standardized Canonical Coefficients

                   CAN1              CAN2

READ        0.448968815      -0.635268929
WRITE       0.137497524      -0.870695286
MATH        0.595918616       1.306821765


Pooled Within-Class Standardized Canonical Coefficients

                   CAN1              CAN2

READ        0.429673382      -0.607966834
WRITE       0.133341917      -0.844380139
MATH        0.567466508       1.244427616

Raw Canonical Coefficients

                   CAN1              CAN2

READ       0.0444392333      -.0628793431
WRITE      0.0141364479      -.0895182559
MATH       0.0632963677      0.1388059858


Class Means on Canonical Variables

         SES              CAN1              CAN2

           1      -.5465768557      -.0228164669
           2      0.0112924485      0.0241526513
           3      0.4481342028      -.0250009496

Discriminant Analysis     Linear Discriminant Function

               _     -1 _                              -1 _
Constant = -.5 X' COV   X      Coefficient Vector = COV   X
                j        j                                 j

                            SES

                        1                2                3

CONSTANT        -17.65626        -21.00216        -23.90440
READ              0.16808          0.18992          0.21242
WRITE             0.27408          0.27776          0.28834
MATH              0.29651          0.33834          0.35917


Discriminant Analysis     Classification Summary for Calibration Data: WORK.HSB

Resubstitution Summary using Linear Discriminant Function

Generalized Squared Distance Function:

 2         _       -1   _
D (X) = (X-X )' COV  (X-X )
 j          j            j

Posterior Probability of Membership in each SES:

                   2                    2
Pr(j|X) = exp(-.5 D (X)) / SUM exp(-.5 D (X))
                   j        k           k
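
With equal priors, the posterior probability above can be computed from the linear discriminant functions alone, because -.5 D²(X) for class j equals the linear discriminant score for class j minus .5 X' COV⁻¹ X, and that last term is the same for every class and cancels in the ratio. The sketch below applies this to a hypothetical student with scores of 50 on all three tests. The scores and variable names are assumptions for illustration; the coefficients are copied from the Linear Discriminant Function table.

```sas
/* Illustrative sketch: posterior probabilities for one hypothetical student. */
data _null_;
  /* Hypothetical scores, chosen only for illustration */
  read = 50; write = 50; math = 50;

  /* Linear discriminant scores, coefficients copied from the listing */
  array score{3} score1-score3;
  score{1} = -17.65626 + 0.16808*read + 0.27408*write + 0.29651*math;
  score{2} = -21.00216 + 0.18992*read + 0.27776*write + 0.33834*math;
  score{3} = -23.90440 + 0.21242*read + 0.28834*write + 0.35917*math;

  /* Subtract the largest score before exponentiating to avoid overflow */
  smax  = max(of score1-score3);
  denom = 0;
  do j = 1 to 3;
    denom = denom + exp(score{j} - smax);
  end;

  do j = 1 to 3;
    posterior = exp(score{j} - smax) / denom;   /* Pr(j|X) with equal priors */
    put j= posterior=;
  end;
run;
```

The observation is assigned to the class with the largest posterior probability, which is how each row of the classification table is filled in.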

                    Number of Observations and Percent Classified into SES:

  From SES          1          2          3      Total

         1         95         16         28        139
                68.35      11.51      20.14     100.00

         2        122         51        126        299
                40.80      17.06      42.14     100.00

         3         39         27         96        162
                24.07      16.67      59.26     100.00
                
Total             256         94        250        600
Percent         42.67      15.67      41.67     100.00

Priors         0.3333     0.3333     0.3333


           Error Count Estimates for SES:

                1          2          3    Total
Rate       0.3165     0.8294     0.4074   0.5178
Priors     0.3333     0.3333     0.3333
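
The error rates above can be recovered from the classification table: each class rate is one minus the proportion correctly classified, and the total is the prior-weighted average of the class rates. A minimal sketch, with illustrative variable names:

```sas
/* Illustrative check: error count estimates from the classification table. */
data _null_;
  array n_total{3}   _temporary_ (139, 299, 162);  /* row totals                 */
  array n_correct{3} _temporary_ (95, 51, 96);     /* diagonal of the table      */

  prior = 1/3;                                     /* equal priors, as printed   */
  total_rate = 0;
  do j = 1 to 3;
    rate = 1 - n_correct{j} / n_total{j};          /* misclassification rate     */
    total_rate = total_rate + prior * rate;        /* prior-weighted average     */
    put j= rate=;
  end;
  put total_rate=;
run;
```

Because the priors are equal rather than proportional to class size, the total rate is not the same as the raw proportion of misclassified observations.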

[Canonical Correlation Analysis]


                        Adjusted       Approx       Squared
         Canonical      Canonical     Standard     Canonical
        Correlation    Correlation     Error      Correlation

   1      0.446310       0.440513     0.032720      0.199193
   2      0.091170        .           0.040519      0.008312
   3      0.002400        .           0.040859      0.000006

                       Eigenvalues of INV(E)*H
                         = CanRsq/(1-CanRsq)

        Eigenvalue    Difference    Proportion    Cumulative

   1       0.2487        0.2404       0.9674        0.9674
   2       0.0084        0.0084       0.0326        1.0000
   3       0.0000         .           0.0000        1.0000

              Test of H0: The canonical correlations in the
                current row and all that follow are zero

        Likelihood
           Ratio      Approx F      Num DF      Den DF    Pr > F

   1    0.79414650     15.9574           9    1445.791    0.0001
   2    0.99168226      1.2450           4        1190    0.2900
   3    0.99999424      0.0034           1         596    0.9533
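
Each likelihood ratio in the table above is the product of (1 - squared canonical correlation) over the current row and all rows that follow, so the first row is Wilks' Lambda. The following DATA step is a sketch of that calculation with illustrative variable names:

```sas
/* Illustrative check: sequential likelihood ratios from the canonical correlations. */
data _null_;
  array r{3} _temporary_ (0.446310, 0.091170, 0.002400);

  do k = 1 to 3;
    lr = 1;
    do i = k to 3;
      lr = lr * (1 - r{i}**2);   /* product over rows k, k+1, ..., 3 */
    end;
    put k= lr=;
  end;
run;
```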


Multivariate Statistics and F Approximations

S=3    M=-0.5    N=296

Statistic                     Value          F      Num DF    Den DF  Pr > F

Wilks' Lambda              0.79414650    15.9574         9  1445.791  0.0001
Pillai's Trace             0.20751038    14.7630         9      1788  0.0001
Hotelling-Lawley Trace     0.25712716    16.9323         9      1778  0.0001
Roy's Greatest Root        0.24873970    49.4163         3       596  0.0001

NOTE: F Statistic for Roy's Greatest Root is an upper bound.


Canonical Correlation Analysis

Raw Canonical Coefficients for the 'VAR' Variables

                      V1                V2                V3

LOCUS       1.2559521341       0.642735041      -0.636900954
SLFCPT      -0.231449286      1.0673050885       1.012326503
MOT         1.2293420708      -2.373306414      1.6001728133


Raw Canonical Coefficients for the 'WITH' Variables

                     W1                W2                W3

READ        0.042654856      0.0805224686      -0.111464486
WRITE      0.0544447099      -0.130501783      -0.009401289
MATH       0.0183642562      0.0580981087      0.1426914796


Standardized Canonical Coefficients for the 'VAR' Variables

                  V1            V2            V3

LOCUS         0.8418        0.4308       -0.4269
SLFCPT       -0.1633        0.7530        0.7142
MOT           0.4213       -0.8134        0.5484


Standardized Canonical Coefficients for the 'WITH' Variables

                 W1            W2            W3

READ         0.4309        0.8135       -1.1261
WRITE        0.5296       -1.2693       -0.0914
MATH         0.1729        0.5470        1.3434

Canonical Structure

Correlations Between the 'VAR' Variables and Their Canonical Variables

                  V1            V2            V3

LOCUS         0.9172        0.3603       -0.1702
SLFCPT        0.1024        0.5920        0.7994
MOT           0.5806       -0.4905        0.6499


Correlations Between the 'WITH' Variables and Their Canonical Variables

                 W1            W2            W3

READ         0.8813        0.3872       -0.2711
WRITE        0.9098       -0.4119        0.0506
MATH         0.8007        0.2965        0.5206


Correlations Between the 'VAR' Variables and the
Canonical Variables of the 'WITH' Variables

                  W1            W2            W3

LOCUS         0.4093        0.0329       -0.0004
SLFCPT        0.0457        0.0540        0.0019
MOT           0.2591       -0.0447        0.0016


Correlations Between the 'WITH' Variables and
the Canonical Variables of the 'VAR' Variables

                 V1            V2            V3

READ         0.3933        0.0353       -0.0007
WRITE        0.4061       -0.0376        0.0001
MATH         0.3573        0.0270        0.0012
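
The cross correlations in the last two tables are the within-set structure correlations scaled by the corresponding canonical correlation. As a small sketch (variable names are illustrative), the LOCUS row can be reproduced as follows:

```sas
/* Illustrative check: cross correlations for LOCUS from its structure correlations. */
data _null_;
  array r{3}       _temporary_ (0.446310, 0.091170, 0.002400);  /* canonical correlations */
  array s_locus{3} _temporary_ (0.9172, 0.3603, -0.1702);       /* LOCUS with V1-V3       */

  do k = 1 to 3;
    cross = s_locus{k} * r{k};   /* correlation of LOCUS with W1, W2, W3 */
    put k= cross=;
  end;
run;
```

The same scaling applies to every row of both cross-correlation tables, which is why only the first pair of canonical variables shows correlations of any size.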
                      
                      
[Factor Analysis Output]       

Initial Factor Method: Iterated Principal Factor Analysis

Prior Communality Estimates: SMC

     LOCUS    SLFCPT       MOT      READ     WRITE      MATH       SCI     SOCST
  0.205013  0.108939  0.173511  0.625200  0.538955  0.572827  0.556934  0.439018

Preliminary Eigenvalues:  Total = 3.22039668  Average = 0.40254959

                       1           2           3           4
Eigenvalue        3.2643      0.4515      0.0714     -0.0529
Difference        2.8129      0.3800      0.1244      0.0092
Proportion        1.0136      0.1402      0.0222     -0.0164
Cumulative        1.0136      1.1538      1.1760      1.1596

                       5           6           7           8
Eigenvalue       -0.0622     -0.1046     -0.1313     -0.2157
Difference        0.0425      0.0266      0.0845
Proportion       -0.0193     -0.0325     -0.0408     -0.0670
Cumulative        1.1403      1.1078      1.0670      1.0000

1 factors will be retained by the PROPORTION criterion.

Iter    Change   Communalities

  1   0.098014   0.20466 0.01092 0.08337 0.68936 0.59256 0.62595 0.58382 0.47367
  2   0.018630   0.20151 0.00989 0.07749 0.70799 0.60450 0.63881 0.58696 0.47807
  3   0.005926   0.20049 0.00982 0.07694 0.71392 0.60707 0.64198 0.58653 0.47801
  4   0.002069   0.20023 0.00981 0.07686 0.71599 0.60759 0.64279 0.58602 0.47768
  5   0.000772   0.20016 0.00980 0.07684 0.71676 0.60767 0.64299 0.58576 0.47752

Convergence criterion satisfied.

Eigenvalues of the Reduced Correlation Matrix:
  Total = 3.31696379  Average = 0.41462047

                       1           2           3           4
Eigenvalue        3.3175      0.3783      0.1060      0.0145
Difference        2.9392      0.2722      0.0916      0.0639
Proportion        1.0002      0.1140      0.0320      0.0044
Cumulative        1.0002      1.1142      1.1462      1.1505

                       5           6           7           8
Eigenvalue       -0.0494     -0.0539     -0.1027     -0.2933
Difference        0.0045      0.0488      0.1905
Proportion       -0.0149     -0.0163     -0.0310     -0.0884
Cumulative        1.1356      1.1194      1.0884      1.0000

Factor Pattern

           FACTOR1

LOCUS      0.44740
SLFCPT     0.09901
MOT        0.27720
READ       0.84662
WRITE      0.77953
MATH       0.80187
SCI        0.76535
SOCST      0.69103

Variance explained by each factor

   FACTOR1
  3.317508


Final Communality Estimates: Total = 3.317508

     LOCUS    SLFCPT       MOT      READ     WRITE      MATH       SCI     SOCST
  0.200163  0.009803  0.076840  0.716759  0.607674  0.642991  0.585762  0.477516
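
With a single retained factor, each final communality is the squared loading from the Factor Pattern table, and their sum is the variance explained by the factor. A minimal sketch with illustrative variable names:

```sas
/* Illustrative check: communalities as squared loadings on FACTOR1. */
data _null_;
  /* Loadings in the order LOCUS SLFCPT MOT READ WRITE MATH SCI SOCST */
  array loading{8} _temporary_
    (0.44740, 0.09901, 0.27720, 0.84662, 0.77953, 0.80187, 0.76535, 0.69103);

  total = 0;
  do i = 1 to 8;
    communality = loading{i}**2;
    total = total + communality;   /* running sum = variance explained */
    put i= communality=;
  end;
  put total=;
run;
```

The results should match the listing up to rounding in the printed loadings.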


Rotation Method: Varimax

Rotation not possible with 1 factor.