# 3. IQ data set

## 3.a) Available statistical IQ data set

The current statistical research has been performed with the IQ data set contained in the Young Adulthood Study: 1939-1967 [made accessible in 1979 on electronic files]. This IQ data set was collected by Virginia Crandall and made available through an archive at the Henry A. Murray Research Center of The Radcliffe Institute for Advanced Study, Harvard University, Cambridge, Massachusetts [Producer and Distributor].

This collection of longitudinal data contains the variables we are interested in: those relating to the intelligence quotients (IQ) of parents and their corresponding children. The reliability of the statistical data is assured.

After a preliminary analysis of the available statistical IQ data set, the following variables were selected: one variable each for the mothers (M, Otis intelligence test), the fathers (F, Otis test), and the children (C4), with 70 corresponding values; two more children's variables (C1 and C5), with 69 corresponding values; and three further children's variables with fewer corresponding values (C2, C3, and C6, with 58, 42, and 64 values respectively), which we will use only to create variable X6, the average of the children's six variables.
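As an illustration of how an averaged variable such as X6 could be built when some children's measurements are missing, the following is a minimal sketch. It assumes that the average is taken over whichever of the six scores are available for each child; the text does not state how missing values were handled, and the function name `child_average` and the sample rows are hypothetical.

```python
from statistics import mean

def child_average(scores):
    """Mean of the available (non-None) scores for one child, or None if none exist."""
    available = [s for s in scores if s is not None]
    return mean(available) if available else None

# Hypothetical rows in the order (C1, C2, C3, C4, C5, C6).
# None marks a missing measurement; these are NOT values from the study.
children = [
    (112, 108, None, 115, 110, 109),
    (98, None, None, 104, 101, None),
]

# One averaged value per child (an assumed construction of X6).
x6 = [child_average(row) for row in children]
```

Averaging only the available scores keeps every child in the sample, at the cost of mixing averages based on different numbers of tests.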

The statistical IQ data set is taken from middle-class white families, with a mean IQ of 110, slightly above the average. For each family, the data correspond to the father, the mother, and one child.

## 3.a.2. Limitations of statistical data set

• Sample size of statistical IQ data set

This limitation could become very serious: although the sample size is 70 (n = 70; Otis IQ test of the mothers and fathers, and one test of the children), the analysis by groups reduces it to only 7 groups when each group contains 10 values.

Nevertheless, we perform this grouping for group sizes of 2, 3, 4, 5, 6, 7, 8, 9, and 10. In addition, different groupings are created depending on the order in which the 70 values are arranged.
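The grouping described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the values are sorted by some ordering criterion, split into consecutive groups of equal size, and each group is summarized by its mean. The function name `group_means`, the use of the mean as the group summary, and the choice to drop leftover values when the group size does not divide the sample size are all assumptions, not details given in the text.

```python
from statistics import mean

def group_means(values, order_by, group_size):
    """Sort values by order_by, split into consecutive groups, return each group's mean.

    Leftover values that do not fill a complete group are dropped (an assumption).
    """
    pairs = sorted(zip(order_by, values), key=lambda p: p[0])
    ranked = [v for _, v in pairs]
    n_full = len(ranked) // group_size
    return [
        mean(ranked[i * group_size:(i + 1) * group_size])
        for i in range(n_full)
    ]

# Placeholder data: 70 values ordered by themselves, in groups of 10.
placeholder = list(range(70))
means_of_ten = group_means(placeholder, placeholder, 10)  # 7 group means

# Repeating the grouping for every group size from 2 to 10.
all_groupings = {
    size: group_means(placeholder, placeholder, size) for size in range(2, 11)
}
```

Changing the `order_by` argument (for example, ordering by one parent's score instead of the other's) produces a different grouping of the same 70 values, which is one way the number of derived variables can grow.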

In this way, as you will see in the following sections, we multiplied the number of variables studied by more than 50. Consequently, the model becomes very sensitive to small modifications of the statistical data set in the different groupings.

The different variables represent different views of the same statistical data set; in other words, they simultaneously provide estimates of the existing correlations in different dimensions.

In my opinion, this sensitivity is the strongest point of the model: the good fits obtained are very significant evidence of the goodness of fit of the model's structure, especially because they were obtained without any modification of the original variables, which preserves the full reliability of the statistical data.

The strength of the analysis made it possible to achieve the initial objectives, and much more.

• Statistical IQ Data set quality

As the previous table of the statistical data set and selected variables shows, the test types or evaluation methods used were not the same.

Likewise, values considered extreme should be removed when they are not reasonable.

There is only one statistical data set for the parents' IQ, whereas for the children there are several IQ data sets that, as we will see, are not highly correlated with one another.

Even so, these limitations should reinforce the results obtained, since a more precise overall statistical data set would be expected to show a higher correlation between the variables.

Moreover, the relative homogeneity of the sample also works against the study's objective, because it makes it more difficult to discriminate between the study's values. Results obtained in spite of this are therefore all the more relevant.

• Temporal stability of intellectual ability

The children's different IQ data sets were obtained over a number of years. Although no clear conclusion has been reached, it is fair to say that the temporal stability of the statistical IQ data set is compatible with the different values observed in the model's simulation.