Export Summary Statistics from Stata
Default summary statistics
By default, asdocx reports the number of observations, mean, standard deviation, minimum, and maximum for all numeric variables. Therefore, we do not need to type variable names with the sum command.
* Load example dataset
sysuse auto, clear
* Estimate summary statistics
asdocx sum, replace
Variable | Obs | Mean | Std. Dev. | Min | Max |
---|---|---|---|---|---|
price | 74 | 6165.257 | 2949.496 | 3291 | 15906 |
mpg | 74 | 21.297 | 5.786 | 12 | 41 |
rep78 | 69 | 3.406 | 0.99 | 1 | 5 |
headroom | 74 | 2.993 | 0.846 | 1.5 | 5 |
trunk | 74 | 13.757 | 4.277 | 5 | 23 |
weight | 74 | 3019.459 | 777.194 | 1760 | 4840 |
length | 74 | 187.932 | 22.266 | 142 | 233 |
turn | 74 | 39.649 | 4.399 | 31 | 51 |
displacement | 74 | 197.297 | 91.837 | 79 | 425 |
gear_ratio | 74 | 3.015 | 0.456 | 2.19 | 3.89 |
foreign | 74 | 0.297 | 0.46 | 0 | 1 |
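As a quick cross-check, the same five statistics can be viewed in the Results window with Stata's built-in summarize command, which also leaves the results for the last variable in r(). This is only a sketch using official Stata commands; it does not write anything to the asdocx output file.

```stata
* Load the example dataset shipped with Stata
sysuse auto, clear

* Obs, Mean, Std. dev., Min, and Max for every variable
* (string variables such as make are listed with 0 observations)
summarize

* Stored results for a single variable can be inspected directly
summarize price
display "N = " r(N) "  mean = " r(mean) "  sd = " r(sd)
```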
Statistics for selected variables
If summary statistics are required only for selected variables, we write the variable names after the word sum. Assuming that we need summary statistics for price, trunk, mpg, weight, and foreign, our code and results are shown below.
asdocx sum price trunk mpg weight foreign, replace
Table: Descriptive Statistics
Variable | Obs | Mean | Std. Dev. | Min | Max |
---|---|---|---|---|---|
price | 74 | 6165.257 | 2949.496 | 3291 | 15906 |
trunk | 74 | 13.757 | 4.277 | 5 | 23 |
mpg | 74 | 21.297 | 5.786 | 12 | 41 |
weight | 74 | 3019.459 | 777.194 | 1760 | 4840 |
foreign | 74 | 0.297 | 0.46 | 0 | 1 |
Reporting variable labels
Variable labels can be reported with the option label. This option works with all other tables that asdocx can create. If a variable does not have a label, the variable name is reported.
asdocx sum price trunk mpg weight foreign, replace label
Table: Descriptive Statistics
Variable | Obs | Mean | Std. Dev. | Min | Max |
---|---|---|---|---|---|
price | 74 | 6165.257 | 2949.496 | 3291 | 15906 |
Trunk space (.. ft.) | 74 | 13.757 | 4.277 | 5 | 23 |
Mileage (mpg) | 74 | 21.297 | 5.786 | 12 | 41 |
Weight (lbs.) | 74 | 3019.459 | 777.194 | 1760 | 4840 |
Car origin | 74 | 0.297 | 0.46 | 0 | 1 |
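The labels reported by the label option are the ones stored in the dataset. The following is a minimal sketch, using only built-in Stata commands, for inspecting them and for attaching a label to a variable; the label text is purely illustrative and not taken from the dataset.

```stata
sysuse auto, clear

* List storage type, display format, and variable label for each variable
describe price trunk mpg weight foreign

* Attach or change a variable label (illustrative text only)
label variable price "Price in US dollars"
describe price
```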
Using [if] [in] conditions
asdocx accepts if and in conditions just like any other Stata command. Both if and in should come after the variable list and before the comma. In the following example, we report descriptive statistics for cars whose price is greater than 4000.
asdocx sum if price > 4000, replace
Table: Descriptive Statistics
Variable | Obs | Mean | Std. Dev. | Min | Max |
---|---|---|---|---|---|
price | 63 | 6586.81 | 3003.064 | 4010 | 15906 |
mpg | 63 | 20.444 | 5.488 | 12 | 41 |
rep78 | 59 | 3.356 | 0.978 | 1 | 5 |
headroom | 63 | 3.016 | 0.889 | 1.5 | 5 |
trunk | 63 | 14.317 | 4.261 | 5 | 23 |
weight | 63 | 3125.714 | 772.782 | 1760 | 4840 |
length | 63 | 191.238 | 21.603 | 147 | 233 |
turn | 63 | 40.111 | 4.381 | 31 | 51 |
displacement | 63 | 208.095 | 92.766 | 85 | 425 |
gear_ratio | 63 | 2.984 | 0.454 | 2.19 | 3.89 |
foreign | 63 | 0.286 | 0.455 | 0 | 1 |
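The placement rule can be sketched with a variable list plus both qualifiers. The first command uses the built-in summarize to show the standard Stata syntax; the second assumes that asdocx accepts the same form, as described above. The variable choice and the observation range 1/50 are arbitrary.

```stata
sysuse auto, clear

* Standard Stata order: command, varlist, if, in, then the comma and options
summarize price mpg if foreign == 0 in 1/50

* Same layout with asdocx (assumes it mirrors the syntax described above)
asdocx sum price mpg if foreign == 0 in 1/50, replace
```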
Controlling decimal points
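The number of decimal places can be controlled with the dec() option. The default is to report three decimal places. To report four decimal places, we add the dec(4) option. Note that the example output below comes from a different dataset than the auto data used above.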
asdocx sum , dec(4)
Table: Descriptive Statistics
Variable | Obs | Mean | Std. Dev. | Min | Max |
---|---|---|---|---|---|
cob | 170 | 0.1824 | 0.3873 | 0 | 1 |
ridageyr | 170 | 31.1882 | 25.5146 | 0 | 80 |
sex | 170 | 0.5294 | 0.5006 | 0 | 1 |
married | 92 | 0.5217 | 0.5023 | 0 | 1 |
ridreth3 | 170 | 2.9412 | 1.7497 | 1 | 7 |
hs | 170 | 45.8353 | 49.0983 | 0 | 99 |
poor | 148 | 0.4932 | 0.5017 | 0 | 1 |
insured | 169 | 0.8343 | 0.3729 | 0 | 1 |
rtnhcpl | 170 | 0.9353 | 0.2467 | 0 | 1 |
srhealth | 170 | 0.8353 | 0.372 | 0 | 1 |
cursmk | 36 | 0.3889 | 0.4944 | 0 | 1 |
alc_use | 36 | 1.1111 | 0.9495 | 0 | 2 |
dropped | 170 | 0.5118 | 0.5013 | 0 | 1 |
mec8yr | 170 | 8083.9818 | 8791.0193 | 0 | 39463.809 |
sdmvpsu | 170 | 1.6647 | 0.6612 | 1 | 3 |
sdmvstra | 170 | 95.6 | 4.0403 | 90 | 103 |
tzok: Reporting an equal number of decimal points
In the preceding example, we reported four decimal points using the option dec(4). However, some of the values show fewer decimal places, or none at all. The reason is that asdocx does not report trailing zeros. We can force asdocx to report an equal number of decimal places even when the trailing digits are zeros. This can be done using the option tzok, that is, trailing zeros OK.
asdocx sum , dec(4) tzok
Table: Descriptive Statistics
Variable | Obs | Mean | Std. Dev. | Min | Max |
---|---|---|---|---|---|
cob | 170 | 0.1824 | 0.3873 | 0.0000 | 1.0000 |
ridageyr | 170 | 31.1882 | 25.5146 | 0.0000 | 80.0000 |
sex | 170 | 0.5294 | 0.5006 | 0.0000 | 1.0000 |
married | 92 | 0.5217 | 0.5023 | 0.0000 | 1.0000 |
ridreth3 | 170 | 2.9412 | 1.7497 | 1.0000 | 7.0000 |
hs | 170 | 45.8353 | 49.0983 | 0.0000 | 99.0000 |
poor | 148 | 0.4932 | 0.5017 | 0.0000 | 1.0000 |
insured | 169 | 0.8343 | 0.3729 | 0.0000 | 1.0000 |
rtnhcpl | 170 | 0.9353 | 0.2467 | 0.0000 | 1.0000 |
srhealth | 170 | 0.8353 | 0.3720 | 0.0000 | 1.0000 |
cursmk | 36 | 0.3889 | 0.4944 | 0.0000 | 1.0000 |
alc_use | 36 | 1.1111 | 0.9495 | 0.0000 | 2.0000 |
dropped | 170 | 0.5118 | 0.5013 | 0.0000 | 1.0000 |
mec8yr | 170 | 8083.9818 | 8791.0193 | 0.0000 | 39463.8086 |
sdmvpsu | 170 | 1.6647 | 0.6612 | 1.0000 | 3.0000 |
sdmvstra | 170 | 95.6000 | 4.0403 | 90.0000 | 103.0000 |
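Stata's own display formats show the same distinction between dropping and keeping trailing zeros. This sketch is only an analogy using the built-in display command; it is not a description of how asdocx formats numbers internally.

```stata
* Default display: trailing zeros are not shown
display 95.6

* Fixed format with four decimal places: trailing zeros are kept
display %9.4f 95.6

* Whole numbers are padded the same way under a fixed format
display %9.4f 90
```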
Detailed summary statistics
To find detailed summary statistics, we normally type the summarize, detail or sum, detail command in Stata. To make a table of detailed summary statistics, we just add detail after the comma in the asdocx sum command. With this option, the following statistics are reported in the table: observations, mean, standard deviation, minimum, maximum, 1st percentile, 99th percentile, skewness, and kurtosis. If additional statistics or a specific combination of statistics are required, we can use the customized statistics option (see the following section).
asdocx sum, replace detail
Table: Descriptive Statistics
Variables | Obs | Mean | Std. Dev. | Min | Max | p1 | p99 | Skew. | Kurt. |
---|---|---|---|---|---|---|---|---|---|
price | 74 | 6165.3 | 2949.5 | 3291 | 15906 | 3291 | 15906 | 1.7 | 4.8 |
mpg | 74 | 21.3 | 5.8 | 12 | 41 | 12 | 41 | 0.9 | 4 |
rep78 | 69 | 3.4 | 1 | 1 | 5 | 1 | 5 | -0.1 | 2.7 |
headroom | 74 | 3 | 0.8 | 1.5 | 5 | 1.5 | 5 | 0.1 | 2.2 |
trunk | 74 | 13.8 | 4.3 | 5 | 23 | 5 | 23 | 0 | 2.2 |
weight | 74 | 3019.5 | 777.2 | 1760 | 4840 | 1760 | 4840 | 0.1 | 2.1 |
length | 74 | 187.9 | 22.3 | 142 | 233 | 142 | 233 | 0 | 2 |
turn | 74 | 39.6 | 4.4 | 31 | 51 | 31 | 51 | 0.1 | 2.2 |
displacement | 74 | 197.3 | 91.8 | 79 | 425 | 79 | 425 | 0.6 | 2.4 |
gear_ratio | 74 | 3 | 0.5 | 2.2 | 3.9 | 2.2 | 3.9 | 0.2 | 2.1 |
foreign | 74 | 0.3 | 0.5 | 0 | 1 | 0 | 1 | 0.9 | 1.8 |
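For reference, the same quantities can be obtained from the built-in summarize, detail, which stores the percentiles, skewness, and kurtosis in r(). The sketch below uses only official Stata commands and is independent of asdocx.

```stata
sysuse auto, clear

* Detailed statistics for one variable in the Results window
summarize price, detail

* Pull the values that the detail table reports from the stored results
display "p1 = " r(p1) "  p99 = " r(p99)
display "skewness = " r(skewness) "  kurtosis = " r(kurtosis)
```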
Custom summary statistics
To make a table of a specific combination of statistics, use the option statistics() or stat() with the asdocx sum command. The option statistics() allows the following statistics:
option | details |
---|---|
N | Number of observations |
mean | Arithmetic mean |
sd | Standard deviation |
semean | Standard error of the mean |
sum | Sum / total |
range | Range |
min | The smallest value |
max | The largest value |
count | Counts the number of non-missing observations |
var | Variance |
cv | Coefficient of variation |
skewness | Skewness |
kurtosis | Kurtosis |
iqr | Interquartile range |
p1 | 1st percentile |
p5 | 5th percentile |
p10 | 10th percentile |
p25 | 25th percentile |
p50 | Median or the 50th percentile |
p75 | 75th percentile |
p90 | 90th percentile |
p99 | 99th percentile |
tstat | t-statistic for the test that the mean of the given variable equals 0 |
Assume that we wish to report the number of observations, mean, standard deviation, t-value, and the 1st and 99th percentiles for all variables.
asdocx sum, replace stat(N mean sd tstat p1 p99)
Table: Descriptive Statistics
Variables | N | Mean | Std. Dev. | 1st Perc. | 99th Perc. | t-value |
---|---|---|---|---|---|---|
price | 74 | 6165.257 | 2949.496 | 3291 | 15906 | 17.981 |
mpg | 74 | 21.297 | 5.786 | 12 | 41 | 31.666 |
rep78 | 69 | 3.406 | 0.99 | 1 | 5 | 28.578 |
headroom | 74 | 2.993 | 0.846 | 1.5 | 5 | 30.436 |
trunk | 74 | 13.757 | 4.277 | 5 | 23 | 27.666 |
weight | 74 | 3019.459 | 777.194 | 1760 | 4840 | 33.421 |
length | 74 | 187.932 | 22.266 | 142 | 233 | 72.605 |
turn | 74 | 39.649 | 4.399 | 31 | 51 | 77.527 |
displacement | 74 | 197.297 | 91.837 | 79 | 425 | 18.481 |
gear ratio | 74 | 3.015 | 0.456 | 2.19 | 3.89 | 56.839 |
foreign | 74 | 0.297 | 0.46 | 0 | 1 | 5.557 |
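The t-value column appears consistent with a one-sample t-test of the hypothesis that the variable's mean is zero, that is, the mean divided by its standard error. For price, 6165.257 / (2949.496 / sqrt(74)) is approximately 17.98, which matches the table. A sketch of this check with built-in commands:

```stata
sysuse auto, clear

* t = mean / (sd / sqrt(N)), computed from summarize's stored results
quietly summarize price
display "t = " r(mean) / (r(sd) / sqrt(r(N)))

* Official one-sample t-test of H0: mean(price) == 0, for comparison
ttest price == 0
```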
Statistics over a grouping variable
To find summary statistics separately for each category of a grouping variable, we can use by(varname) or the prefix bysort varname: with asdocx. Examples of grouping variables include country, year, industry, gender, family, etc. In the following example, we report the number of observations, mean, SD, t-value, and the 1st and 99th percentiles for each category of the variable foreign. In the auto dataset, the variable foreign has two categories: Domestic and Foreign.
bys foreign : asdocx sum, replace stat(N mean sd tstat p1 p99)
Variables | N | mean | sd | 1st Perc. | 99th Perc. | t-value |
---|---|---|---|---|---|---|
Car type = Domestic | | | | | | |
price | 52 | 6072.423 | 3097.104 | 3291 | 15906 | 13.425 |
mpg | 52 | 19.827 | 4.743 | 12 | 34 | 28.483 |
rep78 | 48 | 3.021 | 0.838 | 1 | 5 | 24.985 |
headroom | 52 | 3.154 | 0.916 | 1.5 | 5 | 23.994 |
trunk | 52 | 14.75 | 4.306 | 7 | 23 | 24.406 |
weight | 52 | 3317.115 | 695.364 | 1800 | 4840 | 33.919 |
length | 52 | 196.135 | 20.046 | 147 | 233 | 68.581 |
turn | 52 | 41.442 | 3.968 | 31 | 51 | 75.723 |
displacement | 52 | 233.712 | 85.263 | 86 | 425 | 19.325 |
gear ratio | 52 | 2.807 | 0.336 | 2.19 | 3.58 | 58.862 |
foreign | 52 | 0 | 0 | 0 | 0 | . |
Car type = Foreign | | | | | | |
price | 52 | 6072.423 | 3097.104 | 3291 | 15906 | 12.525 |
mpg | 52 | 19.827 | 4.743 | 12 | 34 | 18.364 |
rep78 | 48 | 3.021 | 0.838 | 1 | 5 | 27.386 |
headroom | 52 | 3.154 | 0.916 | 1.5 | 5 | 25.891 |
trunk | 52 | 14.75 | 4.306 | 7 | 23 | 15.95 |
weight | 52 | 3317.115 | 695.364 | 1800 | 4840 | 28.439 |
length | 52 | 196.135 | 20.046 | 147 | 233 | 59.238 |
turn | 52 | 41.442 | 3.968 | 31 | 51 | 113.933 |
displacement | 52 | 233.712 | 85.263 | 86 | 425 | 22.079 |
gear ratio | 52 | 2.807 | 0.336 | 2.19 | 3.58 | 52.855 |
foreign | 52 | 0 | 0 | 0 | 0 | . |
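A sketch of the two grouping forms follows. The built-in commands can be used to cross-check group sizes and means in the Results window. The last line assumes that the by(varname) option mentioned above is passed to asdocx alongside the other options; the original text names the option but does not show it in use.

```stata
sysuse auto, clear

* Built-in check: one block of statistics per category of foreign
bysort foreign: summarize price mpg

* Built-in compact table of statistics by group
tabstat price mpg, by(foreign) statistics(n mean sd)

* Assumed asdocx form using the by() option instead of the bysort prefix
asdocx sum, replace stat(N mean sd tstat p1 p99) by(foreign)
```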
See also
- tabmany – Table of multiple coded answers
- mrtab – One- and two-way tables of multiple responses
- fre – One-way frequency tables
- tabcount – tabulates frequencies for up to 7 variables
- tab3way – Three way table of frequencies and percentages
- missings – Various utilities for managing missing values
- tabulate, tab1, tab2
- pctab – Percentage over a grouping variable
- crosstab – table of means and weighted by in cross tabulations