*open text log & overwrite old
log using logname.log, text replace
*change directory
cd "foldername of directory"
*descriptive statistics on variables
summarize variable1 variable2
*view the codebook for variables
codebook variable1 variable2
*drop missing cases on variable (using drop); missing() also catches extended missing values .a-.z
drop if missing(variable1)
*drop missing cases on variable (using keep)
keep if !missing(variable1)
*one-way frequency table of variable (for a crosstab, list two variables: tab variable1 variable2)
tab variable1
*browse data for variables
browse variable1 variable2
*create new 0/1 indicator variable from old variable value (left missing where the old variable is missing)
gen newvariable1 = oldvariable1==value if !missing(oldvariable1)
*create variable label
label variable newvariable1 "label for the variable"
*create value label for variable (unattached)
label define newlabelname value1 "label for value1" value2 "label for value2"
*attach value label to the new variable
label values newvariable1 newlabelname
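*worked example of the indicator and labeling steps above: a minimal sketch, assuming the auto dataset that ships with Stata
*highmpg, highmpglbl and the cutoff of 25 are illustrative names and values only
preserve
sysuse auto, clear
gen highmpg = mpg >= 25 if !missing(mpg)
label variable highmpg "Mileage of 25 mpg or more"
label define highmpglbl 0 "Under 25 mpg" 1 "25 mpg or more"
label values highmpg highmpglbl
tab highmpg
restore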
*save data as new file & overwrite
save newdatasetname.dta, replace
*close log
log close
*declare the panel (group) variable for xt commands such as fixed effects
xtset variable1
*set the default confidence level for confidence intervals
set level 99
*drop variables
drop variable1 variable2
*create new numeric variable from string
gen newvariable1 = real(oldvariable1)
*rename variable
rename oldvariable1 newvariable1
*replace missing values on a variable with new value
replace variable1 = newvalue if variable1 ==.
*combining: create a string variable whose value depends on the string values of more than one variable
gen newvariablename = "newstringvalue" if oldvariable1 == "oldstringvalue1" & oldvariable2 == "oldstringvalue2"
*use Stata as a calculator
display calculations
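*for example, display evaluates an expression and prints the result (the expressions here are illustrative)
display 2 + 2
display sqrt(16)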
*output (selected variables) to comma-delimited CSV (outsheet still works but is superseded by export delimited in Stata 13 and later)
outsheet variable1 variable2 variable3 using filename.csv, comma
*output (all variables) to comma-delimited CSV
outsheet using filename.csv, comma
*recode into new variable
recode oldvariable oldvalue1=newvalue1 oldvalue2=newvalue2, gen(newvariable)
*change variable from string to numeric and replace
destring variable1, replace
*store an estimation output, after running the estimation
est store estimation1
*output stored estimations as a text table with b and s.e. (estout is user-written; install with: ssc install estout)
estout estimation1 estimation2 estimation3 using title.txt, cells(b(star fmt(#decimals)) se(par fmt(#decimals))) replace
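*worked example of storing and tabulating estimations: a sketch, assuming the built-in auto dataset
*this swaps in Stata's built-in estimates table for estout, so it runs without installing anything; model1 and model2 are illustrative names
preserve
sysuse auto, clear
regress price mpg
estimates store model1
regress price mpg weight
estimates store model2
estimates table model1 model2, b(%9.3f) se(%9.3f)
restore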
*generate a sum of a variable (total() is the documented egen function; sum() is its older name)
egen newvariable = total(oldvariable)
*generate a sum of a variable by another variable
bysort clustervariable: egen newvariable = total(oldvariable)
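*worked example of overall and by-group totals: a sketch, assuming the built-in auto dataset; new variable names are illustrative
preserve
sysuse auto, clear
egen totalweight = total(weight)
bysort foreign: egen groupweight = total(weight)
list make foreign weight groupweight totalweight in 1/5
restore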
*oneway ANOVA (takes exactly one response variable and one factor variable)
oneway DV IV
*oneway ANOVA with post-hoc
oneway DV IV, scheffe
*correlation
correlate variable1 variable2 variable3
*regression
regress DV IV1 IV2 IV3
*negative binomial regression
nbreg DV IV1 IV2 IV3
*nested regression
nestreg: regress DV (IV1inblock1) (IV2inblock2 IV3inblock2)
*stepwise regression (forward selection; pe() is the p-value required for entry)
stepwise, pe(PINvalue): regress DV IV1 IV2 IV3
*Cronbach's alpha/item-reliability analysis
alpha IV1 IV2 IV3 IV4 IV5
*factor analysis
factor IV1 IV2 IV3 IV4 IV5
*principal components analysis
pca IV1 IV2 IV3 IV4 IV5
*Post-PCA/FA Screeplot of Eigenvalues
screeplot
*scatterplot
twoway scatter yvariable xvariable
*line graph
twoway line yvariable xvariable
*time series: declare the time variable
tsset timevariable
*time series: line graph
tsline yvariable
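*worked example of the time-series steps: a sketch, assuming the sp500 example dataset that ships with Stata
*date is a daily date with gaps on non-trading days, which tsset reports but allows
preserve
sysuse sp500, clear
tsset date
tsline close
restore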
*correlation with means
correlate variable1 variable2 variable3, means
*regression with fixed effects (run xtset on the panel variable first)
xtreg DV IV1 IV2 IV3, fe
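*worked example of a fixed-effects regression: a sketch, assuming internet access for webuse and the nlswork example dataset
*the choice of ln_wage, age and tenure is illustrative only
preserve
webuse nlswork, clear
xtset idcode year
xtreg ln_wage age tenure, fe
restore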
*regression with 99% CI
regress DV IV1 IV2 IV3, level(99)
*regression reporting standardized (beta) coefficients
regress DV IV1 IV2 IV3, beta
*regression with 2 IV categorical dummies and interactions
regress DV IV1##IV2
*regression with 1 IV categorical dummy, 1 continuous and interactions
regress DV IV1##c.continuousIV2
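*worked example of a categorical-by-continuous interaction: a sketch, assuming the built-in auto dataset
*margins then reports the slope of mpg separately for each level of foreign
preserve
sysuse auto, clear
regress price i.foreign##c.mpg
margins foreign, dydx(mpg)
restore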
*regression with clustered standard errors (vce(cluster ...) is the current syntax; cluster() is the older form)
regress DV IV1, vce(cluster clustervariable)