Updated 10/20/04, 1/28/04

Stata Commands

**1. codebook** displays information about variables' names, labels and values.

**2. clear** clears out the dataset that is currently in memory. We need to do this before we can create or read a new dataset.

**3. describe** displays a summary of a Stata dataset, describing the variables and other information.

**4. edit** opens a spreadsheet-like window in which you can enter and change data. You can also get to the "Data Editor" from the pull-down "Window" menu or by clicking on the "Data Editor" icon on the tool bar.

Enter values and press return. Double-click a column heading to change the name of that variable. When you are done, click the "close box" for the "Data Editor" window.

**5. inspect** displays information about the values of variables and is useful for checking data accuracy.

**6. list** without any variable names displays the values of all the variables for all the cases. **list** followed by variable names displays the values of only those variables.

**7. log using filename.log** opens a log file called "filename.log" that records everything you type and all of the output from the commands (For example: log using

**8. log using filename.log, noproc** opens a log file called "filename.log" that records only what you type with no output.

**9. log using filename.log, append** opens an existing log file called "filename.log" and adds the current log to the end of that file.

**10. log using filename.log, replace** opens an existing log file called "filename.log," deletes its contents and places the current log into the file.

**11.** The command **log off** temporarily suspends logging, while **log on** turns logging back on again.

**12.** The command **log close** closes and saves the current log file.

The log command is especially important. Having a log of commands and output is extremely useful for documenting data analyses and recovering from errors that may occur. You will be tempted not to use the log feature of Stata because it can seem like a bother. But the time will come when you need to recreate a data file or data analysis, and the only way you will succeed is if you created and saved a log file.
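The log commands above can be combined into one session. This is only a minimal sketch, assuming Stata 8 or later: the file name session1.log is hypothetical, and sysuse auto loads an example dataset that ships with Stata, not the data used in this module.

```stata
* Start a log; replace overwrites session1.log if it already exists
log using session1.log, replace

* Commands and their output are now recorded
sysuse auto, clear
summarize price mpg

* Suspend logging: the describe output below is not recorded
log off
describe

* Resume logging, then close and save the log file
log on
list make price in 1/5
log close

* Display the saved log on the screen
type session1.log
```

Using append instead of replace on the first line would add this session to the end of an existing session1.log rather than overwriting it.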

**13.** Stata displays **--more--** whenever it fills up the computer screen. Pressing the 'space bar' will display the next screen, and so on, until all of the information has been displayed. To get out of --more--, you can click on the 'Break' button, select 'Break' from the pull-down 'Tools' menu, or press the 'q' key.

**14.** The **save** command will save the dataset (for example, as "*my*.dta"). Editing the dataset changes the data in the computer's memory; it does not change the data that is stored on the computer's disk. The **replace** option allows you to save a changed file to the disk, replacing the original file.

**15.** The **type** command displays the contents of a text file to the screen.

**16.** The **use** command loads a Stata dataset into memory for use.
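The difference between data in memory and data on disk can be seen in a short sequence of clear, save and use. This is a sketch under assumptions: auto_copy.dta and price_k are hypothetical names, and sysuse auto (Stata 8 or later) stands in for your own dataset.

```stata
* Clear memory and load the example dataset shipped with Stata
clear
sysuse auto

* Save a copy on disk (auto_copy.dta is a hypothetical file name)
save auto_copy.dta, replace

* Add a variable: this changes the data in memory only,
* the file on disk does not contain price_k yet
generate price_k = price / 1000
label variable price_k "Price (thousands of dollars)"

* Write the changed data to disk, replacing the earlier copy
save auto_copy.dta, replace

* Reload the saved file and confirm that price_k is there
use auto_copy.dta, clear
describe
summarize price_k
```

Without the replace option, the second save would stop with an error because auto_copy.dta already exists.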

Statistics with Stata

**1.** The **summarize** command displays basic descriptive statistics: n, mean, standard deviation, min and max. The **detail** option provides more descriptive statistics, including the variance, skewness, kurtosis, the median and other percentiles. (For example: summarize ses, detail)

**2.** The **graph** command is used to create one- and two-dimensional graphs.

**3.** The **correlate** command displays a matrix of Pearson correlations for the variables listed.

**4. graph twoway scatter variable1 variable2** creates a scatter plot with variable1 on the vertical axis and variable2 on the horizontal axis.

**5.** The **tabulate** command with one variable creates a frequency distribution table. Note that the **nolabel** option shows the numeric values of the variable instead of the "value label."

The **tabulate** command with **two variables** creates a two-way table or cross tabulation. (For example, tabulate ses female).

The **tabulate** command with the **chi2** option includes the chi-square value along with its p-value. (For example, tabulate ses female, chi2)
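The three forms of tabulate can be tried side by side. The sketch below assumes Stata 8 or later and uses the auto example dataset shipped with Stata, with its variables foreign and rep78, in place of the ses and female variables mentioned above.

```stata
* Load the example dataset shipped with Stata
sysuse auto, clear

* One variable: frequency table showing the value labels
tabulate foreign

* Same table showing the underlying numeric codes instead
tabulate foreign, nolabel

* Two variables: cross tabulation of repair record by car origin
tabulate rep78 foreign

* Same cross tabulation with the chi-square test and its p-value
tabulate rep78 foreign, chi2
```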

6. t-tests: examples

ttest write=50

This example involves **the single-sample t-test**, testing whether the sample was drawn from a population with a mean of 50. By the way, the standardized writing test in this sample was normed nationally with a mean of 50.

ttest write=read

This example makes use of **the t-test for dependent samples**. In this case, we are testing whether there is a significant difference between the writing and the reading test scores.

ttest write, by(female)

ttest write, by(female) unequal

The t-test for independent groups comes in two varieties: pooled variance and unequal variance. We want to look at the differences in writing test scores between males and females (the variable female). We will begin with the ttest for independent groups with pooled variances and compare the results to the ttest for independent groups using unequal variance.

There is a test for heterogeneity of variance, sdtest, but it is overly sensitive to non-normality and statisticians do not recommend using it to screen for heterogeneity of variance.
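One way to compare the pooled and unequal-variance results is to display the values that ttest leaves behind in r(). This is a sketch under assumptions: it uses the auto example dataset shipped with Stata (Stata 8 or later), with mpg and foreign standing in for write and female.

```stata
* Load the example dataset shipped with Stata
sysuse auto, clear

* Single-sample t-test: is mean mpg different from 20?
ttest mpg == 20

* Independent groups, pooled variance
ttest mpg, by(foreign)
display "pooled:  t = " r(t) "  df = " r(df_t) "  p = " r(p)

* Independent groups, unequal variance
ttest mpg, by(foreign) unequal
display "unequal: t = " r(t) "  df = " r(df_t) "  p = " r(p)
```

With the unequal option the degrees of freedom are adjusted and are usually not a whole number, which is the easiest way to tell the two outputs apart.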

7. Analysis of Variance Examples:

oneway write prog, tabulate

anova write prog

sort prog

by prog: summarize write

table prog, contents(n write mean write sd write)

Here are two different ways to perform a one-way analysis of variance (ANOVA): oneway and anova. They both give exactly the same answer. The most visible difference is that oneway includes a test for homogeneity of variance.

anova write female prog female*prog

This example demonstrates a 2 X 3 factorial analysis of variance (female has two levels and prog has three).

8. Regression Examples:

regress write read

regress write read, beta

predict pre1

generate pre2 = 23.95944 + .5517051*read

list pre1 pre2

graph twoway (scatter write read)(lfit write read)

graph twoway (scatter write read, jitter(2))(lfit write read)

These are two examples of simple linear regression. The first one displays confidence
intervals for the regression coefficients while the second one displays standardized regression
coefficients along with the 'regular' regression coefficients. The predict command computes a
predicted writing score for each observation. Compare 'pre1' with 'pre2', which was created using
the generate command. The first graph command displays a scatter plot of write against read
along with the regression line of write on read. The second graph command uses the
**jitter** option to help show the points where multiple observations fall on the same spot.
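The comparison of predict with a hand-built prediction can also be done without typing the coefficients, because regress leaves them in _b[]. This is a sketch under assumptions: it uses the auto example dataset shipped with Stata (Stata 8 or later), and fit1 and fit2 are hypothetical variable names.

```stata
* Load the example dataset shipped with Stata
sysuse auto, clear

* Simple linear regression of mpg on weight
regress mpg weight

* Predicted values computed by Stata
predict fit1

* The same predictions built by hand from the stored coefficients
generate fit2 = _b[_cons] + _b[weight]*weight

* The two columns should agree
list mpg weight fit1 fit2 in 1/5
summarize fit1 fit2
```

Using _b[_cons] and _b[weight] avoids copying rounded numbers from the output, so the sketch still works if the data or the model change.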

regress write read math

regress write read math female

This time we have two examples of a multiple regression, the first one with two predictor variables and the second one with three.

Copy Stata Output and Stata Graphs

**Important**

Stata holds only about **500 lines** of output in the results window; output beyond that limit is discarded.

You **cannot** use the pull-down menus to save the contents of the results window (i.e. you
**cannot** choose File and then Save to save the results).

1. Copy from the Results Window and Paste into Word

You can use the mouse to scroll through the results window and mark an area that you want to save.
You can then choose **Edit** and then **Copy** from the pull down menu.

You can then go to Microsoft Word and from its pull down menu choose **Edit** then **Paste**. Most likely, the results will look lousy, with the text badly aligned. This happens because the output from Stata uses **fixed space fonts**, and most fonts in Microsoft Word are **proportionally spaced fonts** (for example, the text in the window above is Times New Roman). If you select the text in Word and choose a fixed space font like **Courier**, the output will then look as it did in Stata.

2. Copying Graphs to Word

If you create a graph in Stata, you can copy that graph and then paste it into Microsoft Word.
You need to copy the graph by choosing from the pull down menu **Edit** then **Copy** or
**Copy Graph** in Windows.

You can then open Word and then paste the graph by choosing from the Word pull down menu
**Edit** and then **Paste**. You will then see the Stata graph pasted into the Word document.

HELP/SEARCH

**1. help**

When you know the name of the command (e.g. summarize) you can type help summarize in the command window to get help on the summarize command. You can also use the pull-down menu by clicking on Help and then Stata Command and then typing summarize in the window to get help for the summarize command.

2. search

When you don't know the name of the command, you can search the Stata help files (as well as their help on their website) based on keywords. For example, you want to know more about increasing memory in Stata so you want to search for the keyword "memory". You can type search memory or you can use the help pull down menu by clicking Help and then click search and then type in the window the keyword you want to search for, e.g. memory.

3. findit

Use findit when you are looking for a program to download. findit works best if you know the name of the program, but it will also search on more general topics.
