Stata's particular strength is the analysis of time-based data. It provides not only basic time-series models such as ARIMA but also their multivariate equivalents (VAR and VEC models). You can also model volatility with GARCH models. Kaplan-Meier curves let you analyse survival times, and mixed models help you analyse panel data. A powerful scripting language completes the package.
Stata produces all kinds of classical statistics. You can use it for descriptive statistics, hypothesis testing, and data visualization. Stata is typically used in research and development. Its wide range of statistical methods helps scientists in many fields of application (social science, econometrics, epidemiology, medical research).
Whether you are a student or a senior researcher, there is a suitable version of Stata available: Stata/IC, Stata/SE, and Stata/MP.
Arguments for Stata:
 Used in research and development
 Wide range of statistical and graphical methods
 Comprehensive statistical software
 Flexible and especially powerful for analysis of time series
 Easy-to-learn yet powerful scripting language
Stata statistical software is a complete, integrated statistical software package that provides everything you need for data analysis, data management, and graphics. Stata is not sold in modules, which means you get everything you need in one package.
Easy to learn yet fully programmable for the most demanding data management and statistical requirements.
With Stata's menus and dialogs, you can easily point and click or drag and drop your way to all of Stata's statistical, graphical, and data management features. You can completely reshape your data, create group-level variables for panel or longitudinal data, graph a receiver operating characteristic (ROC) curve or impulse-response function (IRF), perform a case-control analysis, estimate a random-effects count-data model or a Cox proportional hazards model, or compute marginal effects from a nonlinear estimator. You can even access the dialog boxes for each command directly from the online help system. This is a great way to explore all of the capabilities of Stata.
Stata Software is available in 3 different flavors
Whether you’re a student or a seasoned research professional, we have a package designed to suit your needs:
 Stata/MP: The fastest version of Stata (for quad-core, dual-core, and multicore/multiprocessor computers) that can analyze the most data
 Stata/SE: Stata for large datasets
 Stata/IC: Stata for midsized datasets
 Numerics by Stata: Stata for embedded and web applications
Stata/MP is the fastest and largest version of Stata. Virtually any current computer can take advantage of the advanced multiprocessing of Stata/MP. This includes the Intel i3, i5, i7, i9, Xeon, and Celeron, and AMD multicore chips. On dual-core chips, Stata/MP runs 40% faster overall and 72% faster where it matters, on the time-consuming estimation commands. With more than two cores or processors, Stata/MP is even faster. Find out more about Stata/MP.
Stata/MP, Stata/SE, and Stata/IC all run on any machine, but Stata/MP runs faster. You can purchase a Stata/MP license for up to the number of cores on your machine (maximum is 64). For example, if your machine has eight cores, you can purchase a Stata/MP license for eight cores, four cores, or two cores.
Stata/MP can also analyze more data than any other flavor of Stata. Stata/MP can analyze 10 to 20 billion observations given the current largest computers, and is ready to analyze up to 1 trillion observations once computer hardware catches up.
Stata/SE and Stata/IC differ only in the dataset size that each can analyze. Stata/SE and Stata/MP can fit models with more independent variables than Stata/IC (up to 10,998). Stata/SE can analyze up to 2 billion observations.
Stata/IC allows datasets with as many as 2,048 variables and 2 billion observations. Stata/IC can have at most 798 independent variables in a model.
Numerics by Stata can support any of the data sizes listed above in an embedded environment.
All the above flavors have the same complete set of features and include PDF documentation.
Product features  Stata/IC  Stata/SE  Stata/MP
Maximum number of variables  2,048  32,767  120,000
Maximum number of observations *  2.14 billion  2.14 billion  Up to 20 billion
Maximum number of independent variables  798  10,998  10,998
Multicore support (time to run a logistic regression with 5 million observations and 10 covariates)  1 core (10.0 sec)  1 core (10.0 sec)  2 cores (5.0 sec), 4 cores (2.6 sec), 4+ cores (even faster)
Complete suite of statistical features  Yes!  Yes!  Yes!
Publication-quality graphics  Yes!  Yes!  Yes!
Matrix programming language  Yes!  Yes!  Yes!
Complete PDF documentation  Yes!  Yes!  Yes!
Exceptional technical support  Yes!  Yes!  Yes!
Includes within-release updates  Yes!  Yes!  Yes!
64-bit version available  Yes!  Yes!  Yes!
Windows, macOS, and Linux  Yes!  Yes!  Yes!
Memory requirements  1 GB  2 GB  4 GB
Disk space requirements  1 GB  1 GB  1 GB
* The maximum number of observations is limited only by the amount of available RAM on your system.
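To put the multicore timings from the table in perspective, the short Python sketch below converts them into speed-up and parallel-efficiency figures. It uses only the three timings quoted above (reading the 4-core figure as 2.6 seconds); the function names are illustrative and are not part of Stata.

```python
# Timings quoted in the comparison table: logistic regression with
# 5 million observations and 10 covariates (seconds per core count).
timings_sec = {1: 10.0, 2: 5.0, 4: 2.6}


def speedup(cores, timings=timings_sec):
    """Speed-up relative to the single-core run."""
    return timings[1] / timings[cores]


def efficiency(cores, timings=timings_sec):
    """Share of ideal linear scaling achieved on `cores` cores."""
    return speedup(cores, timings) / cores


for cores in sorted(timings_sec):
    print(f"{cores} core(s): {speedup(cores):.2f}x speed-up, "
          f"{efficiency(cores):.0%} of linear scaling")
```

On these figures, two cores halve the run time and four cores come close to, but do not quite reach, a fourfold speed-up.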
Stata scripting language
Stata's scripting language is easy to learn and helps you get the most out of your data. It lets you not only use and modify the existing routines to generate standard reports but also extend Stata with newly created statistical functions.
Efficient data management with Stata
Data management with Stata is easy and efficient. Joining datasets, creating new variables, or producing summary tables takes very little time.
Professional Graphics with Stata
Stata provides professional graphics that can be used directly in documents and publications. These include not only predefined standard graphs but also highly customizable graphics.
Further Information:
Trial version of Stata
The producer provides a free 30-day trial version on its website. The trial version contains all the features of Stata. You can register for this license by visiting the following link: http://www.stata.com/customerservice/evaluatestata/
Compatible operating systems
Stata will run on the platforms listed below. While Stata software is platform-specific, your Stata license is not; therefore, you need not specify your operating system when placing your order for a license.
Platforms
 Windows 10 *
 Windows 8 *
 Windows Server 2019, 2016, 2012 *
* Stata requires 64-bit Windows for x86-64 processors made by Intel® or AMD
 Mac with Apple Silicon or 64-bit Intel processor
 macOS 11.0 (Big Sur) or newer for Macs with Apple Silicon and macOS 10.12 (Sierra) or newer for Macs with 64-bit Intel processors
 Any 64-bit (x86-64 or compatible) running Linux
 For xstata, you need to have GTK 2.24 installed
Hardware requirements
Package  Memory  Disk space 

Stata/MP  4 GB  2 GB 
Stata/SE  2 GB  2 GB 
Stata/BE  1 GB  2 GB 
Stata for Linux requires a video card that can display thousands of colors or more (16-bit or 24-bit color).
What's new in Stata?
Tables
Customize your tables of
 Summary statistics
 Results from hypothesis tests
 Regression results
 LR and Wald tests, GOF statistics, ...
 Results from any Stata command
Export to
 Word, Excel
 LaTeX
 HTML, Markdown
 and more
Bayesian econometrics
Bayesian
 VAR models
 IRF and FEVD analysis
 Dynamic forecasting
 Panel/longitudinal-data models
 Linear and nonlinear DSGE models
PyStata—Python and Stata
 Call Python from Stata.
 Call Stata from Python.
 Exchange data, metadata, and results seamlessly.
 Use Stata from Jupyter Notebook, Spyder, PyCharm IDE, and more.
Jupyter Notebook with Stata
 Invoke Stata and Mata from Jupyter Notebook.
 Easily reproduce your work and collaborate with others.
 Access results from Stata analyses within Python.
 Stata output, graphs, and tables seamlessly integrate with your Jupyter Notebook.
Difference-in-differences (DID) and DDD models
 Evaluate the effect of a policy, a treatment, or an intervention.
 Control for confounding unobserved group and time characteristics.
 Use panel data or repeated cross-sections.
 Use DID. In vogue since 1855.
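As a rough illustration of the idea behind DID, the following Python sketch computes the classic two-group, two-period estimate from four group means. The numbers and names are hypothetical and chosen only to show the arithmetic; the sketch is not Stata syntax and does not reproduce Stata's estimators or standard errors.

```python
# Hypothetical mean outcomes (illustrative numbers, not real data).
means = {
    ("treated", "before"): 20.0,
    ("treated", "after"): 28.0,
    ("control", "before"): 18.0,
    ("control", "after"): 21.0,
}


def did_estimate(m):
    """Change in the treated group minus change in the control group."""
    treated_change = m[("treated", "after")] - m[("treated", "before")]
    control_change = m[("control", "after")] - m[("control", "before")]
    return treated_change - control_change


effect = did_estimate(means)
print(effect)  # (28 - 20) - (21 - 18) = 5.0
```

Subtracting the control group's change removes the common time trend, and differencing within each group removes fixed group differences; this is what controlling for unobserved group and time characteristics amounts to in the simplest case.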
Faster Stata
Stata is fast, and keeps getting faster.
 Faster sort and collapse
 Faster mixed models
 Faster estimation commands
 Faster import delimited
 And more
Interval-censored Cox model
You want to model time to an event.
But you don't know the exact event times, only the intervals in which events happen.
And you don't want to make parametric assumptions.
Try an interval-censored Cox model.
Multivariate meta-analysis
Do you have multiple effect sizes?
Do they share a common control group?
Do they share the same group of subjects?
Multivariate meta-analysis can help.
Bayesian VAR models
You fit your VAR models with var.
You fit your Bayesian regression models with bayes:.
Now fit your Bayesian VAR models with bayes: var.
Bayesian multilevel modeling
Nonlinear, joint, SEM-like, and more.
More multilevel models.
More powerful.
Easier to use.
Treatment-effects lasso estimation
When you want:
Causal inference, average treatment effects, potential-outcome means, double-robust estimation
And you have:
Many (maybe hundreds or thousands of) potential covariates
Use treatment-effects estimation with lasso variable selection.
New functions for dates and times
 Calculate durations, such as ages and other differences between datetimes.
 Calculate relative dates, or dates from other dates, such as the previous or next birthday or anniversary relative to a reference date.
 Extract individual components from datetime values and variables.
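The kinds of calculations listed above can be sketched in a few lines of Python. This is an illustration of the logic only, not Stata's date functions: the function names are hypothetical, and the convention that a 29 February birthday falls on 1 March in non-leap years is an assumption made for this sketch.

```python
from datetime import date


def age_in_years(born, on):
    """Completed years between `born` and `on`."""
    years = on.year - born.year
    # Not yet reached this year's birthday: one year less.
    if (on.month, on.day) < (born.month, born.day):
        years -= 1
    return years


def next_birthday(born, reference):
    """First birthday anniversary strictly after `reference`."""
    def anniversary(year):
        try:
            return born.replace(year=year)
        except ValueError:
            # Assumption: 29 Feb birthdays move to 1 Mar in non-leap years.
            return date(year, 3, 1)

    candidate = anniversary(reference.year)
    if candidate <= reference:
        candidate = anniversary(reference.year + 1)
    return candidate


born = date(1990, 7, 15)
ref = date(2021, 7, 14)
print(age_in_years(born, ref), next_birthday(born, ref))
```

Extracting components is then a matter of reading `born.year`, `born.month`, or `born.day`.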
Leave-one-out meta-analysis
Are there influential studies in your data?
Use leave-one-out meta-analysis to find out.
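The principle can be sketched in Python as follows. The study data are hypothetical, and the sketch assumes a simple common-effect (inverse-variance) pooled estimate for brevity, which is not necessarily the model Stata fits; it only illustrates how omitting one study at a time exposes influential studies.

```python
# Hypothetical studies: (label, effect size, standard error).
studies = [
    ("A", 0.30, 0.10),
    ("B", 0.25, 0.12),
    ("C", 0.35, 0.15),
    ("D", 0.90, 0.11),  # deliberately out of line with the others
    ("E", 0.28, 0.09),
]


def pooled_effect(rows):
    """Inverse-variance weighted mean of the effect sizes."""
    weights = [1.0 / se ** 2 for _, _, se in rows]
    weighted_sum = sum(w * es for w, (_, es, _) in zip(weights, rows))
    return weighted_sum / sum(weights)


def leave_one_out(rows):
    """Pooled estimate recomputed with each study omitted in turn."""
    return {
        name: pooled_effect([r for r in rows if r[0] != name])
        for name, _, _ in rows
    }


overall = pooled_effect(studies)
loo = leave_one_out(studies)
most_influential = max(loo, key=lambda name: abs(loo[name] - overall))

print(f"all studies: {overall:.3f}")
for name, estimate in loo.items():
    print(f"omitting {name}: {estimate:.3f}")
print("largest shift when omitting:", most_influential)
```

A study whose omission moves the pooled estimate far more than the others (study D in this made-up example) is a candidate influential study.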
Galbraith plots
Graphically summarize meta-analysis results
 Study-specific effect sizes
 Precision of effect sizes
 Overall effect size
Detect potential outliers
Assess heterogeneity
Panel-data multinomial logit model
You can model categorical outcomes with mlogit.
You can model panel data with xt.
Now you can do both!
Stata's new xtmlogit command models categorical outcomes that change over time.
Bayesian panel-data models
Bayesian analysis lets you answer probabilistic questions with panel-data models.
 How likely is it that an extra year of schooling will increase wages?
 What is the probability of default for a low-risk portfolio?
Incorporate prior knowledge, see posterior distributions of random effects, compute Bayesian predictions, and more.
Zero-inflated ordered logit model
Need to model an ordinal outcome?
Have excess zeros (or responses in the lowest category)?
ziologit is the answer.
Nonparametric tests for trend
Do responses have an increasing or decreasing trend? Find out using one of four nonparametric tests for trend:
 Cochran–Armitage test
 Jonckheere–Terpstra test
 Linear-by-linear test
 Cuzick's test with ranks
Bayesian IRF and FEVD analysis
What is the effect of a shock over time?
What is the mean or median of the effect for a distribution of probable scenarios?
Bayesian IRF analysis answers these and more.
Bayesian dynamic forecasting
After VAR, you want a dynamic forecast.
After Bayesian estimation, you want statistics of posterior distributions.
Estimate both. Visualize both.
Lasso with clustered data
Your data have ...
many variables.
Your data have ...
clusters of observations.
Your lasso for prediction, model selection, or inference can now select variables while accounting for clustering.
BIC for lasso penalty selection
Which variables should lasso include?
BIC for lasso penalty selection can tell you.
Bayesian linear and nonlinear DSGE models
Forming rational expectations of the future is hard.
DSGE models include these expectations.
Prior information helps.
Do-file Editor enhancements
 Persistent bookmarks
 Navigation Control
 Syntax highlighting for Java, XML, and more
 Autocompletion for quotes, parentheses, and brackets
Stata on Apple Silicon
 Native M1 processor support
 Universal application for both Intel and Apple Silicon Macs
 One license, both kinds of hardware
Intel Math Kernel Library (MKL)
Mata functions and operators use heavily optimized LAPACK routines underpinned by the Intel Math Kernel Library.
Use your favorite Stata commands like always; underlying functions are faster, so you get results faster.
Java integration
 Use Java interactively (like JShell) from within Stata.
 Embed Java code in do-files.
 Embed Java code in ado-files.
 Compile and execute Java code "on the fly" without external programs.
H2O integration
 Start a new H2O cluster or connect to an existing one.
 Manipulate data on an H2O cluster.
 Access the capabilities of H2O directly in Stata.
JDBC
Connecting Stata to databases is now easier.
Want to access data from Oracle, MySQL, Amazon Redshift, Snowflake, Microsoft SQL Server, and others?
Use jdbc.
Want one driver that works on Windows, Mac, and Linux?
Use jdbc.
Interval-censored survival models
Fit any of Stata's six parametric survival models to interval-censored data. All the usual survival features are supported: stratified estimation, robust and clustered SEs, survey data, graphs, and more.
Nonlinear multilevel

Mixed logit models: Advanced choice modeling
Do you walk to work, ride a bus, or drive your car? Which of three insurance plans do you buy? Which political party do you vote for? We make dozens of choices every day. Researchers have access to gaggles of data about those choices. Mixed logit introduces random effects into choice modeling and thereby relaxes the IIA assumption and increases model flexibility.
Nonparametric regression
When you know something matters. But have no idea how.
Create Word documents from Stata

Bayesian multilevel models
Small number of groups? Consider Bayesian multilevel modeling.
Threshold regression
Your time-series regression may change parameters at some point in time or at multiple points in time. The activity of foraging animals might follow a completely different pattern at temperatures above some threshold. You may not know the value of that threshold. Finding such thresholds and estimating the parameters within the regimes is what threshold regression does.
Panel-data tobit with random coefficients
Stata has long had estimators for random effects (random intercepts) in panel data.
Search, browse, and import FRED data
The St. Louis Federal Reserve makes available over 470,000 U.S. and international economic and financial time series. You can now easily search, browse, and import these data.
Multilevel regression for interval-measured outcomes
Incomes are sometimes recorded in groupings, as are people's weights, insect counts, grade-point averages, and hundreds of other measures. Often we have repeated measurements for individuals, or schools, or orchards, etc. So ... we need multilevel regression for interval-measured (interval-censored) outcomes.
Multilevel tobit regression for censored outcomes

Panel-data cointegration tests

Tests for multiple breaks in time series

Multiple-group generalized SEM
Generalized SEM now supports multiple-group analysis. Easily specify groups and test parameter invariance across groups. GSEM models include

ICD-10-CM/PCS

Power for cluster randomized designs
Power analysis for comparing
when you randomize clusters instead of individuals
Power for linear regression models

Heteroskedastic linear regression

Poisson models with sample selection
Counts are common. How many fish did you catch? How many accidents occurred? How many patents does a firm generate? Outcomes are not always seen. Folks evade the game warden. Accidents are not always reported. Some firms prefer trade secrets to patents. So you need Poisson models with sample selection.
More in panel data
 Nonlinear models with random effects, including random coefficients
 Bayesian panel-data models
 Interval regression with random intercepts and random coefficients
More in graphics
 Transparency in graphs
 SVG export
More in statistics
 Bayesian survival models
 Zero-inflated ordered probit
 Add your own power and sample-size methods
 Bayesian sample-selection models
 And yet more
More in the interface
 Stata in Swedish
 Stata in Chinese
 Improvements to the Do-file Editor
And, even more
 Stream random-number generator
 Improvements for Java plugins
You can also find the complete feature list on the Stata.com website:
https://www.stata.com/features/
Stata Features
Data management
data transformations, match-merge, ODBC, XML, by-group processing, append files, sort, row–column transposition, labeling, saving results
Basic statistics
summaries, cross-tabulations, correlations, t tests, equality-of-variance tests, tests of proportions, confidence intervals, factor variables
Linear models
regression; bootstrap, jackknife, and robust Huber/White/sandwich variance estimates; instrumental variables; three-stage least squares; constraints; quantile regression; GLS
Multilevel mixed-effects models
generalized linear models; continuous, binary, and count outcomes; two-, three-, and higher-level models; random intercepts; random slopes; crossed random effects; BLUPs of effects and fitted values; hierarchical models; residual error structures; support for survey data in linear models
Binary, count, and discrete outcomes
logistic, probit, tobit; Poisson and negative binomial; conditional, multinomial, nested, ordered, rank-ordered, and stereotype logistic; multinomial probit; zero-inflated and left-truncated count models; selection models; marginal effects
Longitudinal data/panel data
random and fixed effects with robust standard errors; linear mixed models, random-effects probit, GEE, random- and fixed-effects Poisson, dynamic panel-data models, and instrumental-variables regression; panel unit-root tests; AR(1) disturbances
Generalized linear models (GLMs)
ten link functions, user-defined links, seven distributions, ML and IRLS estimation, nine variance estimators, seven residuals
Nonparametric methods
Wilcoxon-Mann-Whitney, Wilcoxon signed ranks, and Kruskal-Wallis tests; Spearman and Kendall correlations; Kolmogorov-Smirnov tests; exact binomial CIs; survival data; ROC analysis; smoothing; bootstrapping
Exact statistics
exact logistic and Poisson regression, exact case-control statistics, binomial tests, Fisher's exact test for r × c tables
ANOVA/MANOVA
balanced and unbalanced designs; factorial, nested, and mixed designs; repeated measures; marginal means; contrasts
Multivariate methods
factor analysis, principal components, discriminant analysis, rotation, multidimensional scaling, Procrustean analysis, correspondence analysis, biplots, dendrograms, user-extensible analyses
Cluster analysis
hierarchical clustering; k-means and k-median nonhierarchical clustering; dendrograms; stopping rules; user-extensible analyses
Resampling and simulation methods
bootstrapping, jackknife and Monte Carlo simulation; permutation tests
Tests, predictions, and effects
Wald tests; LR tests; linear and nonlinear combinations, predictions and generalized predictions, marginal means, least-squares means, adjusted means; marginal and partial effects; forecast models; Hausman tests
Graphics
line charts, scatterplots, bar charts, pie charts, hi-lo charts, regression diagnostic graphs, survival plots, nonparametric smoothers, distribution Q-Q plots
Survey methods
multistage designs; bootstrap, BRR, jackknife, linearized, and SDR variance estimation; poststratification; DEFF; predictive margins; means, proportions, ratios, totals; summary tables; regression, instrumental variables, probit, Cox regression
Survival analysis
Kaplan-Meier and Nelson-Aalen estimators; Cox regression (frailty); parametric models (frailty); competing risks; hazards; time-varying covariates; left- and right-censoring; Weibull, exponential, and Gompertz analysis
Epidemiology
standardization of rates, case-control, cohort, matched case-control, Mantel-Haenszel, pharmacokinetics, ROC analysis, ICD-9-CM
Time series
ARIMA; ARFIMA; ARCH/GARCH; VAR; VECM; multivariate GARCH; unobserved components model; dynamic factors; state-space models; business calendars; correlograms; periodograms; forecasts; impulse-response functions; unit-root tests; filters and smoothers; rolling and recursive estimation
Multiple imputation
nine univariate imputation methods; multivariate normal imputation; chained equations; explore pattern of missingness; manage imputed datasets; fit model and pool results; transform parameters; joint tests of parameter estimates; predictions
Simple maximum likelihood
specify likelihood using simple expressions; no programming required; survey data; standard, robust, bootstrap, and jackknife SEs; matrix estimators
Programmable maximum likelihood
user-specified functions; NR, DFP, BFGS, BHHH; OIM, OPG, robust, bootstrap, and jackknife SEs; Wald tests; survey data; numeric or analytic derivatives
Other statistical methods
kappa measure of interrater agreement; Cronbach's alpha; stepwise regression; tests of normality
Programming features
adding new commands; command scripting; object-oriented programming; menu and dialog-box programming; Project Manager; plugins
Matrix programming: Mata
interactive sessions, large-scale development projects, optimization, matrix inversions, decompositions, eigenvalues and eigenvectors, LAPACK engine, real and complex numbers, string matrices, interface to Stata datasets and matrices, numerical derivatives, object-oriented programming
Internet capabilities
ability to install new commands, web updating, web file sharing, latest Stata news
Accessibility
Section 508 compliance, accessibility for persons with disabilities
Sample session
A sample session of Stata for Mac, Unix, or Windows.
User-written commands
User-written commands for meta-analysis, data management, survival, econometrics
Graphical user interface
menus and dialogs for all features; Data Editor; Variables Manager; Graph Editor; Project Manager; Do-file Editor; Clipboard Preview Tool; multiple preference sets
Graphics
line charts; scatterplots; bar charts; pie charts; hi-lo charts; contour plots; GUI Editor; regression diagnostic graphs; survival plots; nonparametric smoothers; distribution Q-Q plots
Documentation
20 manuals; 11,000+ pages; seamless navigation; thousands of worked examples; methods and formulas; references
Power and sample size
power; sample size; effect size; minimum detectable effect; means; proportions; variances; correlations; case-control studies; cohort studies; survival analysis; balanced or unbalanced designs; results in tables or graphs
Treatment effects
inverse probability weight (IPW); doubly robust methods; propensity score matching; regression adjustment; covariate matching; multilevel treatments; average treatment effects (ATEs); average treatment effects on the treated (ATETs); potential-outcome means (POMs)
SEM (Structural equation modeling)
graphical path diagram builder; standardized and unstandardized estimates; modification indices; direct and indirect effects; continuous, binary, count, and ordinal outcomes (GLM); multilevel models; random slopes and intercepts; factor scores, empirical Bayes, and other predictions; groups and tests of invariance; goodness of fit; handles MAR data by FIML; correlated data
Functions
statistical; randomnumber; mathematical; string; date and time
Embedded statistical computations
Numerics by Stata
Contrasts, pairwise comparisons, and margins
compare means, intercepts, or slopes; compare to reference category, adjacent category, grand mean, etc.; orthogonal polynomials; multiple comparison adjustments; graph estimated means and contrasts; interaction plots
GMM and nonlinear regression
generalized method of moments (GMM); nonlinear regression