Burrel Vann Jr

*open text log & overwrite old
   log using logname.log, text replace

*change working directory
   cd "path to directory"

*descriptive statistics on variables
   summarize variable1 variable2

*view the codebook for variables
   codebook variable1 variable2

*drop missing cases on variable (using drop); missing() also catches extended missing values (.a-.z)
   drop if missing(variable1)

*drop missing cases on variable (using keep)
   keep if !missing(variable1)

*frequency table of one variable (for a crosstab, list two variables: tab variable1 variable2)
   tab variable1

*browse data for variables
   browse variable1 variable2

*create new dummy (0/1) variable from old variable value, leaving missing cases missing
   gen newvariable1 = oldvariable1==value if !missing(oldvariable1)

*create variable label
   label variable newvariable1 "label for the variable"

*create value label for variable (unattached)
   label define newlabelname value1 "label for value1" value2 "label for value2"

*attach value label to the new variable
   label value newvariable1 newlabelname
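
*worked example of the dummy-and-label steps above: a minimal sketch, assuming
*Stata's built-in auto dataset; highprice and yesno are made-up names for illustration
   sysuse auto, clear
   gen highprice = price > 6000 if !missing(price)
   label variable highprice "price above 6000"
   label define yesno 0 "no" 1 "yes"
   label values highprice yesno
   tab highprice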

*save data as new file & overwrite
   save newdatasetname.dta, replace

*close log
   log close

*declare the panel (group) variable for panel-data commands such as xtreg, fe
   xtset variable1

*set the default confidence level (here 99%) for subsequent commands
   set level 99

*drop variables
   drop variable1 variable2

*create new numeric variable from string (non-numeric strings become missing)
   gen newvariable1 = real(oldvariable1)

*rename variable
   rename oldvariable1 newvariable1

*replace missing values on a variable with new value
   replace variable1 = newvalue if variable1 ==.

*combining: create a new string value for cases that match given string values on two variables
   gen newvariablename = "newstringvalue" if oldvariable1 == "oldstringvalue1" & oldvariable2 == "oldstringvalue2"

*use Stata as a calculator
   display expression
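
*worked example: a minimal sketch of display evaluating expressions (the numbers are arbitrary)
   display 2 + 2
   display sqrt(16)
   display (3.5 - 1.2) / 4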

*output (selected variables) to comma-delimited CSV (in Stata 13+, export delimited supersedes outsheet)
   outsheet variable1 variable2 variable3 using filename.csv, comma

*output (all variables) to comma-delimited CSV
   outsheet using filename.csv, comma

*recode into new variable
   recode oldvariable oldvalue1=newvalue1 oldvalue2=newvalue2, gen(newvariable)

*change variable from string to numeric and replace
   destring variable1, replace

*store an estimation output, after running the estimation
   est store estimation1

*output stored estimations to a text table, with b and s.e. (estout is user-written: ssc install estout; replace #decimals with a number)
   estout estimation1 estimation2 estimation3 using title.txt, cells(b(star fmt(#decimals)) se(par fmt(#decimals))) replace
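
*worked example of storing and tabulating estimates: a minimal sketch, assuming the
*built-in auto dataset and that the user-written estout package is installed
*(ssc install estout); m1 and m2 are made-up names, and the table prints to the screen
   sysuse auto, clear
   regress price mpg
   est store m1
   regress price mpg weight
   est store m2
   estout m1 m2, cells(b(star fmt(3)) se(par fmt(3)))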

*generate a sum of a variable (total() is the current name of egen's older sum() function)
   egen newvariable = total(oldvariable)

*generate a sum of a variable by another variable
   bys clustervariable: egen newvariable = total(oldvariable)
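
*worked example of overall and by-group sums: a minimal sketch, assuming the built-in
*auto dataset; totalprice and grouptotal are made-up names, and total() is the
*current name of egen's sum function
   sysuse auto, clear
   egen totalprice = total(price)
   bysort foreign: egen grouptotal = total(price)
   list make foreign price grouptotal totalprice in 1/5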

*oneway ANOVA (one outcome, one grouping variable)
   oneway DV IV1

*oneway ANOVA with post-hoc (Scheffe multiple comparisons)
   oneway DV IV1, scheffe
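
*worked example: a minimal sketch, assuming the built-in auto dataset;
*oneway takes one outcome (price) and one grouping variable (rep78)
   sysuse auto, clear
   oneway price rep78, scheffe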

*correlation
   correlate variable1 variable2 variable3

*regression
   regress DV IV1 IV2 IV3

*negative binomial regression
   nbreg DV IV1 IV2 IV3

*nested regression
   nestreg: regress DV (IV1inblock1) (IV2inblock2 IV3inblock2)

*stepwise regression (forward selection; pe() is the p-value required for a variable to enter)
   stepwise, pe(PINvalue): regress DV IV1 IV2 IV3

*Cronbach's alpha/item-reliability analysis
   alpha IV1 IV2 IV3 IV4 IV5

*factor analysis
   factor IV1 IV2 IV3 IV4 IV5

*principal components analysis
   pca IV1 IV2 IV3 IV4 IV5

*scree plot of eigenvalues (run after pca or factor)
   screeplot
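
*worked example: a minimal sketch, assuming the built-in auto dataset;
*screeplot uses the results of the most recent pca or factor command
   sysuse auto, clear
   pca price mpg weight length
   screeplot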

*scatterplot (y variable first, then x variable)
   twoway scatter yvariable xvariable

*line graph
   twoway line yvariable xvariable

*time series: declare the time variable
   tsset timevariable

*time series: line graph (plotted against the time variable set by tsset)
   tsline yvariable

*correlation, also displaying means, standard deviations, minimums, and maximums
   correlate variable1 variable2 variable3, means

*regression with fixed effects (run xtset on the panel variable first)
   xtreg DV IV1 IV2 IV3, fe
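
*worked example: a minimal sketch, assuming internet access for webuse and the
*nlswork example dataset from Stata's documentation; xtset must come before xtreg
   webuse nlswork, clear
   xtset idcode
   xtreg ln_wage age tenure, fe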

*regression with 99% CI
   regress DV IV1 IV2 IV3, level(99)

*regression reporting standardized (beta) coefficients
   regress DV IV1 IV2 IV3, beta

*regression with 2 categorical IVs and their interaction (## includes main effects and interaction)
   regress DV IV1##IV2

*regression with 1 categorical IV, 1 continuous IV (c. prefix), and their interaction
   regress DV IV1##c.continuousIV2
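
*worked example: a minimal sketch, assuming the built-in auto dataset;
*foreign is treated as categorical (i.) and mpg as continuous (c.)
   sysuse auto, clear
   regress price i.foreign##c.mpg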

*regression with cluster-robust standard errors
   regress DV IV1, vce(cluster clustervariable)