In Chapter 5 we discuss some numerical problems that result from implementing Markov chain Monte Carlo algorithms on digital computers. These concerns can be quite complicated, but the foundational issues are essentially like those shown here: numerical treatment within low-level algorithmic implementation. In Chapter 6 we look at the problem of a non-invertible Hessian matrix, a serious problem that can occur not just because of collinearity, but also because of problems in computation or data.
We propose some solutions, including a new approach based on generalizing the inversion process followed by importance sampling simulation. In Chapter 7 we investigate a complicated modeling scenario with important theoretical concerns: ecological inference, which is susceptible to numerical inaccuracies. In Chapter 10 Paul Allison discusses numerical issues in logistic regression. Many related issues are exacerbated with spatial data, the topic of Chapter 9 by James LeSage.
Finally, in Chapter 11 we provide a summary of recommendations and an extended discussion of methods for ensuring replicable research. Throughout, there are real examples and replications of published social science research and innovations in numerical methods. Our purpose is not just to present a collection of recommendations from different methodological literatures.
We hope that this will bolster the idea that political science and other social sciences should seek to recertify accepted results. Markov chain Monte Carlo has revolutionized Bayesian estimation, and a new focus on sophisticated software solutions has similarly reinvigorated the study of ecological inference. Benchmarks are useful tools to assess the accuracy and reliability of computer software.
This is a neglected area, but it turns out that the transmission of data across applications can degrade the quality of these data, even in a way that affects estimation. We discuss a number of existing benchmarks to test numerical algorithms and provide a new set of standard benchmark tests for the distributional accuracy of statistical packages.
When a Hessian is non-invertible purely because of an interaction between the model and the data, and not because of rounding and other numerical errors, the desired variance matrix does not exist; the likelihood function may still contain considerable information about the questions of interest. Ecological inference, the problem of inferring individual behavior from aggregate data, was, and perhaps still is, arguably the longest-standing unsolved problem in modern quantitative social science.
The results illuminate the trade-offs among correctness, complexity, and numerical sensitivity. As social scientists ourselves, we recognize that our data analysis and estimation processes can differ substantially from those described in a number of even excellent texts. All too often, new ideas in statistics are presented with examples from biology. There is nothing wrong with this, and the points are made more clearly when the author actually cares about the data being used.
These are actual examples taken from some of our favorite statistical texts. This is a book for those who agree. Further error may arise from the limitations of algorithms, such as pseudo-random number generators (PRNGs) and nonlinear optimization algorithms. In this chapter we provide a detailed treatment of the sources of inaccuracy in statistical computing.
A Microsoft technical note acknowledges, in effect, the inaccuracy of some functions in Excel v5. Calculation of the standard deviation by Microsoft Excel is a telling example of a software design choice that produces inaccurate results. In this case, the textbook formula is not the fastest way to calculate the standard deviation, since it requires one pass through the data to compute the mean and a second pass to compute the difference terms.
Excel is one of the most popular software packages for business statistics and simulation, and its solver functions are used particularly heavily (Fylstra et al.). Each column of 10 numbers in Table 2. illustrates the problem; the example presented here is an extension of Simon and LeSage. Excel is not alone in its algorithm choice; we entered the numbers in Table 2. into other packages as well. The standard deviation is a simple formula, and the limitations of alternative implementations are well known; Wilkinson and Dallal pointed out failures in the variance calculations in statistical packages almost three decades ago.
However, given the uses to which Excel is normally put and the fact that internal limits in Excel prohibit analysis of truly large datasets, the one-pass algorithm offers no real performance advantage. In this case, the textbook formula is more accurate than the algorithm used by Excel. However, we do not claim that the textbook formula here is the most robust method to calculate the standard deviation. Numerical stability could be improved in this formula in a number of ways, such as by sorting the differences before summation.
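The contrast between the two formulas is easy to reproduce. The sketch below (in Python, for illustration only; the data values are our own and not the book's Table 2 entries) compares the two-pass textbook formula with a naive one-pass formula on numbers sharing a large common offset:

```python
def variance_two_pass(xs):
    # Textbook formula: one pass for the mean, a second for squared deviations.
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def variance_one_pass(xs):
    # Naive one-pass formula: subtracts two nearly equal large sums,
    # inviting catastrophic cancellation.
    n = len(xs)
    s = ss = 0.0
    for x in xs:
        s += x
        ss += x * x
    return (ss - s * s / n) / (n - 1)

data = [1e8 + d for d in (1.0, 2.0, 3.0, 4.0)]
print(variance_two_pass(data))  # 5/3, correct to machine precision
print(variance_one_pass(data))  # noticeably wrong on this data
```

The two-pass result matches the exact variance 5/3; the one-pass result is badly off because almost all significant digits cancel in `ss - s*s/n`.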
Nor do we claim that textbook formulas are in general always numerically robust; quite the opposite is true (see Higham). However, there are other one-pass algorithms for the standard deviation that are nearly as fast and much more accurate than the one that Excel uses. An important concern is that Excel produces incorrect results without warning, allowing users unwittingly to accept erroneous results.
In this example, even moderately sophisticated users would not have much basis for caution. A standard deviation is requested for a small column of numbers, all of which are similarly scaled, and each of which is well within the documented precision and magnitude used by the statistical package, yet Excel reports severely inaccurate results. Because numeric inaccuracies can occur in intermediate calculations that programs obscure from the user, and since such inaccuracies may be undocumented, users who do not understand the potential sources of inaccuracy in statistical computing have no way of knowing when results received from statistical packages and other programs are accurate.
The intentional, and unnecessary, inaccuracy of Excel underscores the fact that trust in software and its developers must be earned, not assumed. In the remainder of this chapter we discuss the various sources of such potential inaccuracy. Because of the multiplicity of disciplines that the subject touches on, laying out some terminology is useful. Accuracy almost always refers to the absolute or relative error of an approximate quantity.
When referring to measurement, precision refers to the degree of agreement among a set of measurements of the same quantity—the number of digits (possibly in binary) that are the same across repeated measurements. However, on occasion, it is also used simply to refer to the number of digits reported in an estimate.
Other meanings exist that are not relevant to our discussion; for example, Bayesian statisticians use the word precision to describe the inverse of the variance. An algorithm is said to solve a problem if and only if it can be applied to any instance of that problem and is guaranteed to produce a correct solution to that instance.
Heuristics may be distinguished from approximations and randomized algorithms. An approximation algorithm produces a solution within some known relative or absolute error of the optimal solution. A randomized algorithm produces a correct solution with some known probability of success.
The behavior of approximation and randomized algorithms, unlike heuristics, is formally provable across all problem instances. The same algorithm may be expressed using different computer languages, different encoding schemes for variables and parameters, different accuracy and precision in calculations, and run on different types of hardware. An implementation is a particular instantiation of the algorithm in a real computer environment.
Algorithms are designed and analyzed independent of the particular hardware and software used to execute them. For a precise explanation of the mechanics of rounding error, see Section 2. Altering the algorithm to sort S before summing reduces rounding error and leads to more accurate results.
This is generally true, not just for the previous example. Implementation matters. A particular algorithm may have been chosen for asymptotic performance that is irrelevant to the current data analysis, may not lend itself easily to accurate implementation, or may elide crucial details regarding the handling of numerical errors or boundary details. Both algorithm and implementation must be considered when evaluating accuracy. An algorithm may be correct but still lead to inaccurate implementations.
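The effect of summation order is simple to demonstrate. In this Python sketch (our illustration, not an example from the text), adding many small terms to a large one in the naive order loses them entirely, while summing the sorted values first preserves them:

```python
import math

values = [1.0e16] + [1.0] * 100000  # one large term, many small ones

naive = 0.0
for v in values:            # large term first: each 1.0 is absorbed
    naive += v

ascending = 0.0
for v in sorted(values):    # small terms accumulate before meeting the large one
    ascending += v

print(naive)              # 1e16: the 100000 small contributions vanished
print(ascending)          # 1.00000000001e16: the small terms survive
print(math.fsum(values))  # exact compensated sum agrees with the sorted order
```

The naive loop returns exactly 1e16 because each added 1.0 is smaller than half the spacing between adjacent doubles at that magnitude.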
In the next section we discuss the role of algorithms and implementations in inference. This inverse probability model of inference is, unfortunately, impossible. There is always potential error in the collection and coding of social science data—people lie about their opinions or incorrectly remember responses to surveys, votes are tallied for the wrong candidate, census takers miss a household, and so on.
In theory, some sources of error could be dealt with formally in the model but frequently are dealt with outside the model. Although we rarely model measurement error explicitly in these cases, we have pragmatic strategies for dealing with them: We look for outliers, clean the data, and enforce rigorous data collection procedures.
Other sources of error go almost entirely unacknowledged. That error can be introduced in the act of estimation is known, but this is rarely addressed, even informally. Particularly for estimates that are too complex to calculate analytically, using only pencil and paper, we must consider how computation may affect results. (There is a great deal to recommend the Bayesian perspective, but most researchers settle for the more limited but easily understood model of inference: maximum likelihood estimation; see King.) By algorithm we intend to encompass choices made in creating output that are not part of the statistical description of the model and which are independent of a particular computer program or language: this includes the choice of mathematical approximations for elements of the model.
Implementation is meant to capture all remaining aspects of the program, including bugs, the precision of data storage, and arithmetic operations. We discuss both algorithmic and implementation choices at length in the following sections of this chapter. Ignoring the subtle difference between output and estimates may often be harmless. However, as we saw at the beginning of this chapter, the two may be very different. An algorithm may be proved to work properly only on a subset of the possible data and models.
Our approach is more formal and precise, but is roughly compatible. Where noise is present in the data or its storage representation and not explicitly modeled, correct inference requires the output to be stable. What matters is how an algorithm handles inaccuracy. Note that unstable output could be caused by sensitivity in the algorithm, implementation, or model. Any error, from any source, may lead to incorrect inferences if the output is not stable.
Users of statistical computations must cope with errors and inaccuracies in implementation and limitations in algorithms. Problems in implementations include mistakes in programming and inaccuracies in computer arithmetic. Problems in algorithms include approximation errors in the formula for calculating a statistical distribution, differences between the sequences produced by pseudo-random number generators and true random sequences, and the inability of nonlinear optimization algorithms to guarantee that the solution found is a global one.
We examine each of these in turn (see, e.g., Harrell; Greene; and Montgomery et al.). It is tacitly understood that domain knowledge is needed to select an appropriate model, and it has begun to be recognized that knowledge of data collection is necessary to understand whether data actually correspond to the variables of the model. Thus, correct inference sometimes requires a combination of expertise in the substantive domain, statistics, computer algorithms, and numerical analysis (Figure 2.).
In this section we review the primary sources of errors and inaccuracy. Our intent in this chapter is not to make positive recommendations. We save such recommendations for subsequent chapters. We believe, as Acton does, that there is no simple recipe to offer. For example, Maros writes on the wide gap between the simplicity of the simplex algorithm in its early incarnations and the sophistication of current implementations of it. He argues that advances in optimization software have come about not simply through algorithms, but through an integration of algorithmic analysis with software engineering principles, numerical analysis of software, and the design of computer hardware.
To offer even a hint of a panacea would be misleading: no statistical software is known to work properly with certainty. In limited circumstances it is theoretically possible to prove software correct, but to our knowledge no statistical software package has been proven correct using formal methods. Until recently, in fact, such formal methods were widely viewed by practitioners as being completely impractical (Clarke et al.).
In practice, statistical software will be tested but not proven correct. As Dahl et al. famously observed, testing can reveal the presence of errors but never their absence. For example, in an experiment by Brown and Gould (following a survey by Creeth), experienced spreadsheet programmers were given standardized tasks and allowed to check their results. Although we suspect that statistical programs have a much lower error rate, the example illustrates that caution is warranted. Although the purpose of this book is to discuss more subtle inaccuracies in statistical computing, one should be aware of the potential threat to inference posed by bugs.
Since it is unlikely that identical mistakes will be made in different implementations, one straightforward method of testing for bugs is to reproduce results using multiple independent implementations of the same algorithm see Chapter 4. Disillusioned computer users have just the opposite approach; they are constantly afraid that their answers are almost meaningless. The central issues in computational numerical analysis are how to minimize errors in calculations and how to estimate the magnitude of inevitable errors.
Statistical computing environments generally place the user at a level far removed from these considerations, yet the manner in which numbers are handled at the lowest possible level affects the accuracy of the statistical calculations. An understanding of this process starts with studying the basics of data storage and manipulation at the hardware level. We provide an overview of the topic here; for additional views, see Knuth and Overton. Cooper further points out that there are pitfalls in the way that data are stored and organized on computers, and we touch on these issues in Chapter 3.
For a particular y, only m, e, and a sign are stored; the representation is made unique by normalizing the mantissa. Some numbers cannot be exactly represented using this scheme; an example is the number 0.1. When two numbers of different magnitude are combined, the mantissa of the smaller number must be divided down to align the exponents, and this division may cause low-order bits in the mantissa of the smaller number to be lost (see Knuth; Higham). In other words, the smallest number that can be added does not have full precision. If two nearly equal numbers are subtracted, cancellation may occur, leaving only the accumulated rounding error as a result, perhaps to be further multiplied and propagated in other calculations (Higham).
A double-precision number uses double the storage of a single-precision number. The number 2, for example, is represented exactly (as 1.0 × 2^1); no rounding occurs in its representation. The number 0.1, by contrast, is not represented exactly. The general method for adding and subtracting is as follows: (1) align the radix points of the two operands; (2) add or subtract the mantissas, performing a carry on the exponent if necessary; (3) normalize the result, and round the result if necessary. Other subtleties of the standard are not important for this particular example.
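These representation facts can be inspected directly. In Python (an illustration of ours, not the book's notation), `math.frexp` exposes the stored mantissa and exponent, and `float.hex` shows that 0.1 is stored only approximately:

```python
import math

# 2.0 is stored exactly: frexp returns mantissa 0.5 and exponent 2 (0.5 * 2**2).
print(math.frexp(2.0))        # (0.5, 2)

# 0.1 has no finite binary expansion; the stored value is the nearest double.
print((0.1).hex())            # 0x1.999999999999ap-4
print(0.1 == 3602879701896397 / 2 ** 55)  # True: the value actually stored
```

The last line shows the exact rational number a double labeled "0.1" really holds, which is slightly larger than one-tenth.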
In practice, IEEE implementation handles these and other subtle rounding issues correctly. Floating-point arithmetic libraries that do not support the IEEE standard, such as those supplied on older mainframe computers, generally do not handle these issues correctly and should be avoided for statistical computation where possible. For more detail, see Overton. In effect, the number 2 has simply been dropped from the calculation.
The same thing occurs when we subtract a quantity that is small relative to the other operand. In contrast, when we perform these operations in a different order, we may obtain a completely different result. There are three important lessons to take from this example.
First, rounding errors occur in binary computer arithmetic that are not obvious when one considers only ordinary decimal arithmetic. Second, even a simple computation can be subject to rounding error. Third, even one error, when propagated through the calculation, can cause wildly inaccurate results. A practical rule follows: attempt to keep calculations at the same scale throughout.
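All three lessons can be seen in a few lines of Python (an illustrative sketch of ours, not the book's worked example):

```python
big, small = 2.0 ** 53, 1.0

# Absorption: adding 1.0 to 2**53 has no effect, so the order of
# operations changes the answer even though the real-valued results agree.
print((big + small) - big)   # 0.0
print((big - big) + small)   # 1.0

# Rounding invisible in decimal arithmetic: neither 0.1 nor 0.2 is exactly
# representable in binary, and the small errors propagate through the sum.
print(0.1 + 0.2 == 0.3)      # False
```

Mathematically identical expressions thus produce different floating-point results depending purely on evaluation order.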
Sometimes, albeit rarely, rounding errors can help a computation, such that a computed answer can be more accurate than any of the intermediate quantities of the computation. Thus, in Chapters 3 and 4 we discuss various ways of testing and improving the accuracy of arithmetic. Identical software code can produce different results on different operating systems and hardware.
Within the IEEE standard, there are a number of places where results are not determined exactly. Hence, on some systems (such as many Intel-based computers), some intermediate calculations in a sequence are performed at extended precision and then rounded down to double precision at the end. There is some question whether this practice complies with the IEEE standard. This type of conversion is referred to as casting in some programming languages.
Compiler optimizations may also interfere with replication of results; certain optimizations in the C compiler gcc, for example, can change floating-point results. Although some compilers allow such optimizations to be disabled explicitly, many optimizations are applied by default. Differences among compilers and among the hardware platforms targeted by the compiler can cause the same code to produce different results when compiled under different operating systems or run on different hardware.
This is not the case with all algorithms. In this section we discuss the limitations of algorithms commonly used in statistical computations. These limited algorithms usually fall into one of the following four categories: 1. Randomized algorithms return a correct solution with some known proba- bility, p, and an incorrect solution otherwise.
They are used most commonly with decision problems. Heuristic algorithms (or, simply, heuristics) are procedures that often work in practice but provide no guarantees on the optimality of their results. Nor do they provide bounds on the relative or absolute error of these results as compared to the true quantities of interest. Local search algorithms comprise practically all general nonlinear optimization algorithms.
Local search algorithms are guaranteed only to provide locally optimal solutions. In practice, the number of repetitions i is set such that the risk of a false negative is negligible. For example, linear programming, the minimization of m continuous variables subject to n linear inequality constraints, is computationally tractable.
But integer linear programming, where each of the m variables is an integer, is computationally intractable. (Integer linear programming is NP-complete.) A similar form of truncation error can stem from using an asymptotic series expansion, which inherently limits the number of terms that can contribute to the accuracy.
As Lozier and Olver point out in their extensive review of software for evaluating functions, before the construction of such software there are two stages. First, one must choose a suitable mathematical representation of the function of interest, such as asymptotic expansions, continued fractions, difference and differential equations, functional identities, integral representations, and Taylor series expansions.
Random numbers are also used in subsampling techniques, resampling techniques such as the jackknife and the bootstrap (Efron; Hall; Shao and Tu), and to pick starting parameters and search directions for some nonlinear optimization algorithms. For example, computer-savvy gamblers have been known to exploit poor random number generators in gaming (Grochowski), and an otherwise secure encryption implementation has been defeated for similar reasons (Goldberg and Wagner). Peter Neumann reports on a variety of software system errors related to random number generators.
We provide an overview of the topic here and a discussion of the appropriate choice of generators for Monte Carlo and MCMC simulation in Chapter 5. The numbers provided by computer algorithms are not genuinely random. This sequence is statistically similar, in limited respects, to random draws from a uniform distribution. However, a pseudo-random sequence does not mimic a random sequence completely, and there is no complete theory to describe how similar PRNG sequences are to truly random sequences.
This is a fundamental limitation of the algorithms used to generate these sequences, not a result of inaccuracy in implementation. A linear congruential generator (LCG) produces a sequence via the recurrence x_{i+1} = (a x_i + c) mod m; note that in practice, x is usually divided by m to yield values in [0, 1). This is still an extremely popular generator, and modern versions of it very frequently use the choice of m and a attributed to Lewis et al. Even with these well-tested parameter values, the generator is now considered a comparatively poor one, because it has a short period, constrained by its modulus, and exhibits a lattice structure in higher dimensions (Marsaglia). For poor choices of a, m, and c, this lattice structure is extreme.
The infamous RANDU generator, which was widely used in early computing and from which many other generators descended, is simply an LCG with a = 65,539, m = 2^31, and c = 0. Although the sequence produced appears somewhat random when successive pairs of points are plotted in two dimensions (Figure 2.), triples of successive draws fall on just 15 planes in three dimensions. Any of these generators, using appropriately chosen parameters for initialization, are likely to be better than the standard LCG in terms of both period length and distributional properties. Combined generators (see below) are good insurance against such defects, but even these may, in theory, result in periodicity effects for simulations that demand a large number of random numbers, as we show in Chapter 5 for Markov chain Monte Carlo methods.
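RANDU's lattice defect can be verified directly. The following Python sketch (the helper name `lcg` is ours) implements a generic LCG and checks the well-known identity x_{k+2} ≡ 6x_{k+1} − 9x_k (mod 2^31), which follows from a = 2^16 + 3 and confines RANDU's triples to a handful of planes:

```python
def lcg(seed, a, m, c=0):
    # Generic linear congruential generator: x_{i+1} = (a*x_i + c) mod m.
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

# RANDU: a = 65539 = 2**16 + 3, m = 2**31, c = 0.
gen = lcg(seed=1, a=65539, m=2 ** 31)
xs = [next(gen) for _ in range(1000)]

# Since (2**16 + 3)**2 = 6*(2**16 + 3) - 9 (mod 2**31), every triple of
# successive draws satisfies the same fixed linear relation.
flat = all((xs[k + 2] - 6 * xs[k + 1] + 9 * xs[k]) % 2 ** 31 == 0
           for k in range(len(xs) - 2))
print(flat)  # True
```

A generator with good three-dimensional behavior would show no such exact linear dependence among successive triples.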
A newer class of PRNGs, nonlinear generators, is promising because these generators appear to eliminate defects of previous PRNGs, although they are slower, less thoroughly tested, and less well understood. The resulting stream has a longer period and a better lattice structure. Another approach to combining generators, originated by Collings, is more straightforward.
Collings compounds generators by maintaining a pool of k separate generators of different types and intermixing the results. A separate generator is used to generate a number i from [1, k], and the ith generator is used to provide the next number in the sequence.
If the period of each generator used in the pool is p, the period of the combined generator is roughly p^2. All of the PRNGs discussed are designed to generate uniformly distributed random numbers. Sampling of random numbers from other distributions is usually done by applying a transformation to a uniformly distributed PRNG. Two of the simpler techniques for such transformation are the inverse cumulative distribution function (CDF) method and the rejection method.
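As an illustration of the inverse-CDF method (our sketch, not the book's example): for the exponential distribution with rate λ, F(x) = 1 − e^{−λx}, so F^{-1}(u) = −ln(1 − u)/λ, and applying this transform to uniform draws yields exponential draws.

```python
import math
import random

random.seed(12345)  # fixed seed so the run is reproducible

lam = 2.0
# Inverse-CDF transform: u ~ Uniform(0,1)  ->  -ln(1-u)/lam ~ Exponential(lam)
draws = [-math.log(1.0 - random.random()) / lam for _ in range(200000)]

mean = sum(draws) / len(draws)
print(round(mean, 2))  # close to the theoretical mean 1/lam = 0.5
```

The sample mean lands near 1/λ, consistent with the transformed draws following the target distribution.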
The rejection method can be used where the inverse CDF is inapplicable and is straightforward to apply to multivariate distributions. Intuitively, it involves drawing a bounding box (or other bounding region) around the probability density function, uniformly sampling from the region, and throwing away any samples that fall above the density.
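This intuition can be sketched in a few lines of Python (the half-normal target here is our own choice for illustration, not an example from the text):

```python
import math
import random

random.seed(99)

def half_normal_pdf(x):
    # Density of |Z| for Z ~ N(0,1); mass beyond x = 4 is negligible.
    return math.sqrt(2.0 / math.pi) * math.exp(-x * x / 2.0)

box_height = half_normal_pdf(0.0)   # the density's maximum
accepted = []
while len(accepted) < 50000:
    y = random.uniform(0.0, 4.0)          # horizontal draw from the box
    u = random.uniform(0.0, box_height)   # vertical draw from the box
    if u <= half_normal_pdf(y):           # keep points under the curve
        accepted.append(y)

mean = sum(accepted) / len(accepted)
print(round(mean, 2))  # near the half-normal mean sqrt(2/pi)
```

Accepted points are distributed according to the target density, so their sample mean approaches sqrt(2/π) ≈ 0.798.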
More formally, the method is as follows: (1) generate u from a uniform distribution on (0,1) and y from the proposal distribution Y; (2) if u falls below the scaled density at y, accept y; otherwise, repeat. Several properties are desirable in a PRNG. First, a PRNG should have a long period. The recommended minimum length of the period depends on the number of random numbers n used by the simulation. Conservatively, PRNGs provided by most packages are inadequate for even the simplest simulations. Even using the less conservative recommendation, the typical period is wholly inadequate for computer-intensive techniques such as the double bootstrap, as McCullough and Vinod point out.
Second, the sequence should be free of structure: some PRNGs produce numbers that are apparently independent in one dimension but exhibit a latticelike structure in higher dimensions. Third, the distribution of draws from the generator must be extremely close to uniform. In practice, we do not know if a PRNG produces a distribution that is close to uniform. Good tests, however, constitute prototypes of simulation problems and examine both the sequence as a whole and the quality of subsequences (Knuth). Fourth, to ensure independence across sequences, the user must supply seeds that are truly random.
In practice, statistical software selects the seed automatically using the current clock value, and users rarely change this. As encryption researchers have discovered, such techniques produce seeds that are not completely random (Eastlake et al.). Finally, for the purpose of later replication of the analysis, PRNG results must be reproducible. In most cases, reproducibility can be ensured by using the same generator and saving the seed used to initialize the random sequence.
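The basic discipline is easy to demonstrate (a Python sketch; any generator with a settable seed behaves the same way):

```python
import random

saved_seed = 20030401          # record this value alongside the analysis

random.seed(saved_seed)
first_run = [random.random() for _ in range(3)]

random.seed(saved_seed)        # re-initialize from the saved seed
second_run = [random.random() for _ in range(3)]

print(first_run == second_run)  # True: the sequence is exactly reproduced
```

Saving the seed (and noting the generator used) is what makes a simulation-based result replicable later.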
However, even generators that are based on the same PRNG algorithm can be implemented in subtly different ways that will interfere with exact reproduction; Gentle documents examples of this. In addition, more care must be used in parallel computing environments. Wherever multiple threads of execution sample from a single generator, interprocessor delays may vary during a run, affecting the sequence of random numbers received by each thread. It may be necessary to record the subsequences used in each thread of the simulation to ensure later reproducibility (Srinivasan et al.).
If these conditions are met, there remains, inevitably, residual approximation error. This approximation error can also cause Monte Carlo algorithms to converge more slowly with PRNGs than would be expected using true random draws and may prevent convergence for some problems (Traub and Woźniakowski). The behavior of short subsequences is particularly important for simulations using multiple threads of execution.
Entacher shows that many popular random number generators are inadequate for this purpose. Since the error of a PRNG does not dissipate entirely with sample size, traditional analysis of simulations based on asymptotic assumptions about sampling error overstates the accuracy of the simulation (Fishman). Developers of PRNG algorithms stress that there is no single generator that is appropriate for all tasks.
PRNGs should be chosen with the characteristics of the simulation in mind. Hardware generators are typically many orders of magnitude slower than PRNGs. As of the time this book was written, the less expensive generators produce roughly 10,000 random bytes per second.
Thus they are more often used in cryptographic applications, which require small amounts of extremely high quality randomness, than in Monte Carlo simulation. However, even the slowest generators can be used to provide high-quality seeds to PRNGs or to run many of the Monte Carlo simulations used by social scientists, if not for large MCMC computations.
With forethought, large numbers of random bits from hardware random number generators can be stored over time for later use in simulation. Moreover, the availability and speed of these generators have increased dramatically over the last few years, putting hardware random number generation within reach of the social scientist. We list a number of hardware random number generators and online sources for random bits in the Web site associated with this book.
Note that although many of these chip sets are in wide use in workstations and even in home computers, programming effort is needed to access the generator. This device gathers entropy from a combination of interkeystroke times and other system interrupts. Some caution is still warranted with respect to hardware random number generators. Typically, some forms of TRNG hardware generation are subject to environmental conditions, physical breakage, or incorrect installation. Although most hardware generators check their output using the FIPS test suite when the device starts up, these tests are not nearly as rigorous as those supplied by standard test suites for statistical software (see Chapter 3).
In Chapter 3 we show that modern statistical packages are still prone to the problems we describe, and in Chapter 10 we discuss some aspects of this problem with respect to nonlinear regression. The purpose of this section is to alert researchers to the limitations of these algorithms.
Thus, numerical inaccuracies may prevent the location of local optima, even when the search algorithm itself is mathematically correct. The conditions for global optima for some classes of problems are known. See Chapter 4 for a discussion of inference in the presence of multiple optima.
These theorems apply to such popular black-box optimization techniques as neural networks, genetic algorithms, and simulated annealing. In addition, some of these methods raise other practical problems that render their theoretical properties invalid in all practical circumstances see Chapter 4.
Veall describes another way of testing the hypothesis, using a similar set of local optima. Unlike a grid search, these tests provide a way to formally test the hypothesis that the optimum found is global. We discuss these tests in more detail in Chapter 4. These tests are not in wide use and to our knowledge have not been incorporated into any statistical software package.
The data collected are shown in Figure 2. These contours are shown in the area very close to the true solution—a luxury available only after the solution is found. Furthermore, in the lower left quadrant there appears to be a small basin of attraction that does not include the real solution. There are generally no tests that will prove the absence of bugs. However, comparing the results from multiple independent implementations of the same algorithm is likely to reveal any bugs that affect those particular results.
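A minimal version of this cross-check in Python (our illustration): re-implement a statistic independently and compare it against a library routine on the same test data.

```python
import statistics

def sample_variance(xs):
    # Independent re-implementation: two-pass textbook formula.
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
ours = sample_variance(data)
theirs = statistics.variance(data)   # library implementation of the same quantity

# Agreement does not prove both are correct, but disagreement flags a bug.
print(abs(ours - theirs) < 1e-12)  # True
```

Because identical mistakes rarely appear in independently written code, agreement across implementations raises confidence in the result.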
These errors can accumulate during the course of executing the program, resulting in large differences between the results produced by an algorithm in theory, and its implementation in practice. These limitations can be of different sorts. Other pathological problems have also been reported, the sum of which tends to erode trust in statistical computing.
For example, Yeo found that by simply reordering the input of data to SAS, he was able to get a noticeably different regression result. After reading this chapter, one may be tempted to trust no computational algorithm.
The remainder of the book provides a variety of techniques for discovering inaccuracy and ameliorating it. The underlying theme of this chapter and book is that careful consideration of the problem, and an understanding of the limitations of computers, can guide researchers in selecting algorithms that will improve the reliability of the inference they wish to make. Statistical software is a critical tool used in the vast majority of quantitative and statistical analysis in many disciplines.
Without training in computer science, social science researchers must treat their package of choice as a black box and take it as a matter of faith that their software produces reliable and accurate results. Many statistical software packages are currently available to researchers. Some are designed to analyze a particular problem; others support a wide range of features. Commercial giants such as SAS and SPSS, and large open-source packages such as R, offer thousands of commonly used data manipulation, visualization, and statistical features.
Such differences can determine whether results are robust and replicable. In this chapter we discuss strategies for evaluating the accuracy of computer algorithms and criteria for choosing accurate statistical software that produces reliable results. We then explore the methodology for testing statistical software in detail, provide comparisons among popular packages, and demonstrate how the choice of package can affect published results.
Our hope is that this chapter leads social scientists to be more informed consumers of statistical software. For some distribution functions, and many individual computations in matrix algebra, it is possible to derive analytical bounds on the accuracy of those functions, given a particular implementation and algorithm.
Methods such as interval arithmetic can be used to track the accumulated round-off errors across a set of calculations (Higham). Furthermore, in an ideal world, all statistical computation would follow uniform standards for the best practices for treatment of data, choice of algorithm, and programming techniques. Best practices for many common analyses used by social scientists are discussed in later chapters. Unfortunately, these best practices are often ignored, even in large commercial statistical packages.
While best practices can improve accuracy and stability, Chapter 2 explains why perfect accuracy and stability cannot be guaranteed for all algorithms, particularly random number generators and nonlinear optimization algorithms. There are three general heuristics that can help identify potential computational problems: 1. Test benchmark cases. Correct estimates can sometimes be computed, exactly or to a known level of accuracy, for a particular model and set of test data.
The estimates generated by a particular algorithm and implementation can then be compared to these known results. Discrepancies are an indication of potential computation problems. Benchmarks are useful and should be employed on publicly distributed software wherever feasible. Second, realistic benchmarks, for which estimates can be calculated with known accuracy, are sometimes impossible to create.
Third, benchmark testing can detect some inaccuracies but is valid only for the model and data tested. The performance of an algorithm for different models and data remains unknown. 2. Apply substantive knowledge. In any statistical analysis, the researcher should always apply substantive knowledge of the model, data, and phenomena being analyzed to check that the results are plausible.
Implausible results should be held up to extensive scrutiny. 3. Use sensitivity analysis. One popular approach is to replicate the analysis keeping the data and model the same, but using many different algorithms, algorithmic parameters (such as starting values), and implementations. If results disagree, one should investigate, applying the other techniques, until it is clear which set of results should be discarded.
This is highly recommended where multiple implementations and algorithms are available. The effort required to create alternatives where none presently exist, however, can be prohibitively high. A second popular and complementary approach is to replicate the anal- ysis while perturbing the input data and to observe the sensitivity of the estimates to such perturbations.
Sensitivity or pseudoinstability is not a measure of true computational stability, because values for the correct esti- mates are unknown. We discuss this in more detail in Chapter 4. These two sensitivity tests can be combined fruitfully, as we show in Chapter 7.
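A minimal sketch of the perturbation approach, with hypothetical data: re-estimate an OLS slope many times after adding noise at roughly the scale of measurement error thought to be in the data, and examine the spread of the estimates:

```python
import random

random.seed(1)

# Toy data with an exactly linear relationship: the true slope is 2.0.
x = [float(i) for i in range(1, 21)]
y = [2.0 * v for v in x]

def ols_slope(x, y):
    """Closed-form OLS slope from centered cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Perturbation analysis: replicate the estimate under small input noise.
eps = 1e-6
slopes = []
for _ in range(200):
    y_pert = [v + random.uniform(-eps, eps) for v in y]
    slopes.append(ols_slope(x, y_pert))

spread = max(slopes) - min(slopes)
print(spread)  # well-conditioned case: the estimates barely move
```

If conclusions change under noise no larger than what is plausibly in the data already, the reported precision of the original estimates is suspect.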
These can be arbitrarily small, even in the presence of multicollinearity (Beaton et al.). Combining the two methods can help to separate the portions of pseudoinstability due to model and data problems from those due to algorithm and implementation. Note that the size and form of the noise is not what serves to differentiate numerical problems from model and data problems: even simple uniform noise at the level of machine round-off can affect analyses purely because of model and data problems.
It is the combination of perturbations and varying implementations that allows one to gain some insight into sources of sensitivity. Nevertheless, regardless of the cause of sensitivity, one should be cautious if the conclusions are not pseudostable with respect to the amount of noise that is reasonably thought to be in the data. These three approaches cannot be used to prove the accuracy of a particular method but are useful in drawing attention to potential problems. When two programs produce different results, there are several possible explanations: one or both programs has a bug; one performs some calculations less accurately; or the results from each are conditioned on different implementation-level parameters.
Alternatively, one or both programs may use an algorithm for which the required conditions are not met by the particular model and data; algorithms may afford different levels of approximation error; or the problem itself may be ill-conditioned. We discuss ill-conditioning in the next section.
With the exception of standard software bugs from programming error, it is not obvious whether the programmer or end user is at fault for ignoring these issues. Users of statistical software should pay close attention to warning messages, diagnostics, and stated limitations of implementations and algorithms. Often, however, software developers fail to provide adequate diagnostics or informative warning messages, or to document the computational methods used and their limitations.
Users should also examine data for outliers, coding errors, and other problems, as these may result in ill-conditioned data. However, users often have no a priori knowledge that a particular set of data is likely to cause computational problems given the algorithm and implementation chosen by the programmer.
Following Higham, condition numbers are used to represent the conditioning of a problem with respect to a particular set of inputs. The condition number is a particularly useful formalization inasmuch as it is easier to derive the backward error of a computational method than its overall accuracy or stability. Although conditioning is an important factor in the accuracy of any computation, social scientists should not assume that all computational inaccuracies are simply a matter of conditioning.
In fact, a computational method with a large backward error will yield inaccurate results even where the problem itself is well conditioned. Moreover, the conditioning of the problem depends on the data, model, algorithm, and form of perturbation. There is no such thing as data that are well conditioned with respect to every model.
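Ill-conditioning can be seen even in a two-equation system. In this constructed sketch (not an example from the chapter), the coefficient rows are nearly collinear, so a relative change of about 5e-8 in one input moves the solution by an amount on the order of 1:

```python
# Conditioning sketch for a 2x2 linear system A x = b: when the rows
# of A are nearly collinear, tiny changes in b are hugely amplified
# in the solution.

def solve2(a, b, c, d, e, f):
    """Solve [[a, b], [c, d]] @ [x, y] = [e, f] by Cramer's rule."""
    det = a * d - b * c
    return ((e * d - b * f) / det, (a * f - e * c) / det)

# Nearly collinear rows: an ill-conditioned system.
a, b, c, d = 1.0, 1.0, 1.0, 1.0000001

x1 = solve2(a, b, c, d, 2.0, 2.0000001)  # exact solution is (1, 1)
x2 = solve2(a, b, c, d, 2.0, 2.0000002)  # right-hand side nudged by 1e-7

print(x1)  # approximately (1, 1)
print(x2)  # approximately (0, 2): the tiny nudge moved x by about 1
```

The same amplification occurs in regression with nearly collinear predictors: the cross-product matrix plays the role of A, and perturbations at the level of data entry error swing the coefficients.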
These formulas may be inappropriate when used to estimate the conditioning of another type of problem or computation procedure. Following Longley, investigations of software inaccuracy and how to detect it have resurfaced regularly.
In Chapter 2 we explain why the computer algorithms on which statistical software is built contain unavoidable numerical inaccuracies. Unfortunately, comprehensively testing statistical packages for all numerical inaccuracies is practically impossible given the amount of time that is required to investigate all algorithms. In lieu of an exhaustive analysis of a statistical program, benchmarks serve as a basis for assessing their degree of numerical accuracy. Benchmarks are problems—models and data—with known answers that can be compared to the estimates produced by statistical packages.
In addition, we benchmark an overlooked source of inaccuracy: basic processes of data input and export. These datasets provide benchmarks for assessing the accuracy of univariate descriptive statistics, linear regression, analysis of variance, and nonlinear regression (Rogers et al.).
Since the release of the StRD, numerous reviewers have used them to assess the accuracy of software packages, and vendors of statistical software have begun to publish the test results themselves. Good performance on the StRD provides evidence that a software package is reliable for tested algorithms, but of course, provides no evidence for untested algorithms. For each tested algorithm, the StRD contain multiple problems designed to challenge various aspects of the algorithm.
Details of the formulas used to generate these descriptive statistics are available in the archive. These observations are read into a statistical software package, and the same descriptive statistics are generated to verify the accuracy of the algorithm.
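This verification step can be sketched in a few lines. The data below are constructed for illustration (not an actual StRD dataset): they have a large mean and tiny spread, so their exact sample variance (1.0) exposes the textbook one-pass formula:

```python
# StRD-style check: data with a large mean but tiny spread.
# The exact sample variance of [c+1, c+2, c+3] is 1.0 regardless of c.
c = 1e9
x = [c + 1, c + 2, c + 3]
n = len(x)
mean = sum(x) / n

# Textbook "one-pass" formula: subtracts two huge, nearly equal
# quantities, so catastrophic cancellation destroys the answer.
naive_var = (sum(v * v for v in x) - n * mean * mean) / (n - 1)

# Two-pass formula: centers first, and is numerically stable here.
stable_var = sum((v - mean) ** 2 for v in x) / (n - 1)

print(naive_var, stable_var)  # stable gives exactly 1.0; naive does not
```

Comparing both computed values to the certified value 1.0 is exactly what an StRD-style benchmark does, and it immediately flags the one-pass algorithm.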
Although StRD tests are well documented and relatively easy to apply, some care is warranted in their application: 1. Data must be read into the package at full precision. Otherwise, inaccuracy in data input may be attributed incorrectly to the statistical algorithm rather than to the data input process. A software package may also fail to display results with full internal precision, or may truncate the values when they are exported into another program for analysis.
A software package may present errors in a graph or other visualization even where the internal values are correct. Often, a statistical package will provide multiple tools for computing the same quantity, and different tools, even within the same package, can provide different results. The Microsoft Excel standard deviation example in Chapter 2 shows how estimates from a built-in function may also be generated through other spreadsheet functions, which may yield different results.
For some benchmark tests, the StRD problem requires a function to be coded in the language of the statistical package. While NIST datasets include sample code to implement StRD problems, statistical packages, like programming languages, have syntactic differences in precedence, associativity, and function naming conventions that can lead the same literal formula to yield different results. When in doubt, we recommend consulting the language reference manual and using parentheses extensively.
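Python itself illustrates how precedence and associativity conventions change what a literal formula means; other packages make different choices, which is why parentheses matter:

```python
# Precedence: in Python, ** binds tighter than unary minus,
# so -3 ** 2 parses as -(3 ** 2). Some packages parse it as (-3) ** 2.
print(-3 ** 2)    # -9
print((-3) ** 2)  # 9

# Associativity: ** is right-associative in Python,
# so 2 ** 3 ** 2 parses as 2 ** (3 ** 2). Left association gives 64.
print(2 ** 3 ** 2)    # 512
print((2 ** 3) ** 2)  # 64
```

Typing the identical character string into two packages with different conventions can therefore benchmark two different formulas without the user noticing.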
Every nonlinear optimization algorithm begins with a set of starting values that are initially assigned, explicitly or implicitly, to the parameters of the function being optimized. In addition, a nonlinear problem may be tested using the default starting values provided by the statistical package itself, which are often simply a vector of zeros or ones. For example, a particular package may allow one to change the method by varying a number of options (see Chapter 4 for a more thorough treatment of these options).
Often, default options are chosen for speed over accuracy, because speedier algorithms typically perform no worse than more accurate, and slower, algorithms for easy problems, and because the market for statistical software provides incentives to value speed and features over accuracy (see Renfro; McCullough and Vinod). These solvers will generally not yield identical results for the same StRD problems. The nonlinear benchmark problems in the StRD are formulated as nonlinear least squares problems.
Formally, nonlinear regression problems can simply be reformulated more generally as maximum likelihood problems, under the assumptions that the data themselves are observed without error, the model is known, and the error term is normally distributed (see Seber and Wild). In practice, however, nonlinear least squares solvers use algorithms that take advantage of the more restricted structure of the original problem in ways that maximum likelihood solvers cannot. For these reasons, although MLE solvers are not fundamentally less accurate than nonlinear least squares solvers, the former will tend to perform worse on the StRD test problems.
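The equivalence can be checked numerically. In this sketch with made-up data, the normal-theory negative log-likelihood is the sum of squared residuals up to a positive affine transformation, so a grid search minimizes both at the same parameter value:

```python
import math

# Under normal errors with known sigma, minimizing the negative
# log-likelihood and minimizing the sum of squared residuals select
# the same parameter. Model: y = theta * x + error (made-up data).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

def ssr(theta):
    return sum((yi - theta * xi) ** 2 for xi, yi in zip(x, y))

def negloglik(theta, sigma=1.0):
    n = len(x)
    return (n / 2) * math.log(2 * math.pi * sigma ** 2) + ssr(theta) / (2 * sigma ** 2)

# Coarse grid search: both criteria pick out the same theta.
grid = [i / 1000 for i in range(1500, 2500)]
theta_ssr = min(grid, key=ssr)
theta_mle = min(grid, key=negloglik)
print(theta_ssr, theta_mle)  # identical
```

A dedicated least squares solver, however, can exploit the residual structure (for example, cheap Jacobian-based approximations to the Hessian), which a generic likelihood maximizer cannot.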
McCullough recommends reporting two sets of test results for nonlinear regression problems. He reports the results for the default options using start I and also reports results using an alternative set of options, derived through ad hoc experimentation.
On the other hand, it is common for statistical software vendors to report results based on a tuned set of options using the easier start II values. Users who are unfamiliar with statistical computation, even if they are experienced researchers, may simply use the default algorithms, options, and starting values supplied by the statistical software, and change these only if the package fails to give any answer. See Chapter 1 for examples of published research that were invalidated by naive use of defaults.
For this reason, when used with defaults, it is important that a software package not return a plausible but completely inaccurate solution. Warning messages, such as failure to converge, should be provided to alert users to potential problems. A well-designed implementation will be able to replicate these digits to the same degree of accuracy, will document the accuracy of the output if it is less, or will warn the user when problems were encountered.
See Chapters 4 and 8 for further discussion of these algorithms. As mentioned in Chapter 2, the latter case is termed the log absolute error (LAE). LRE values of zero or less are usually considered to be completely inaccurate in practice, although technically, a statistical package that produced estimates with LREs of zero or less may provide some information about the solution. Various statistical tests can be applied to the output of a PRNG to detect patterns that indicate violations of randomness.
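As a concrete sketch (our own helper, not code from the chapter), the LRE can be computed from an estimate and a certified benchmark value as follows; by convention it is capped near the precision of a double and falls back to the log absolute error when the certified value is zero:

```python
import math

def lre(estimate, certified):
    """Log relative error: roughly the number of correct significant
    digits in `estimate` relative to the certified benchmark value."""
    if estimate == certified:
        return 15.0  # cap at roughly double precision
    if certified == 0:
        # Log absolute error (LAE) when the certified value is zero.
        return -math.log10(abs(estimate))
    return min(15.0, -math.log10(abs(estimate - certified) / abs(certified)))

print(lre(0.6667, 2 / 3))  # about 4 correct digits
print(lre(0.66, 2 / 3))    # about 2 correct digits
print(lre(123.0, 2 / 3))   # negative: completely inaccurate
```

An LRE of about 4 means the estimate agrees with the certified value to roughly four significant digits, which is how StRD results are usually reported.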
Strictly speaking, a PRNG cannot be proved innocent of all patterns but can only be proved guilty of non-randomness. Most tests for PRNGs follow a simple general three-part methodology: 1. Generate a sequence s of length N. (A minimum length for N may be dictated by the test statistic used in step 2.) 2. Compute a test statistic t(s) on s. 3. Compare t(s) to E[t(s)] under the hypothesis that s is random. The comparison is usually made with a threshold value, an acceptance range, or a p-value. In this case, the hypothesis being tested is that the PRNG is random, and the null hypothesis is that it is not.
No test can prove that the generator is random, because a single test does not guarantee that a PRNG will be assessed similarly with regard to another test. A test for randomness may only prove the null hypothesis that the generator is not random. Failure to accept the null is evidence, not proof, of the hypothesis that the PRNG is indeed random. It is important to note that tests such as these examine only one aspect of pseudo-random number generation: the distributional properties of the sequence created.
As discussed in Chapter 2, there are two other requirements for correct use of PRNGs: adequate period length and truly random seed values. Furthermore, parallel applications and innovative simulation applications require special treatment. In practice, a variety of tests are used to reveal regularities in known classes of PRNGs and capture some behavior of the sequence that is likely to be used by researchers relying on the PRNGs in their simulations.
Some test statistics are based on simple properties of the uniform distribution, while other test statistics are designed explicitly to detect the types of structure that PRNGs are known to be prone to produce. Good tests have the structure of practical applications while providing a known solution. Two examples of common tests illustrate this general methodology. Generate a long random sequence s of bits. Compute the distribution of all runs of lengths l in s.
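The runs test just described can be sketched in a few lines (our own minimal implementation; the sequence length of 20,000 bits is an arbitrary choice). For fair, independent bits, a run of length l should account for a 2^-l fraction of all runs:

```python
import random

random.seed(42)

# Step 1: generate a bit sequence.
bits = [random.randint(0, 1) for _ in range(20_000)]

# Step 2: tally the lengths of all runs (maximal blocks of equal bits).
run_lengths = {}
current = 1
for prev, cur in zip(bits, bits[1:]):
    if cur == prev:
        current += 1
    else:
        run_lengths[current] = run_lengths.get(current, 0) + 1
        current = 1
run_lengths[current] = run_lengths.get(current, 0) + 1

# Step 3: compare observed run-length proportions to the expected
# geometric fall-off: P(run length = l) = 2 ** -l.
total_runs = sum(run_lengths.values())
for l in range(1, 6):
    observed = run_lengths.get(l, 0) / total_runs
    print(l, round(observed, 3), 2.0 ** -l)
```

A full test would convert the observed-versus-expected discrepancy into a chi-squared statistic and a p-value, following the three-part methodology above.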
Note that it tests only the distribution of the members of the set and does not test their ordering. Let t(s) be the number of values that occur more than once in this sequence. No software is provided for these tests, but the tests described are reasonably straightforward to implement.
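The collision-style test just described is also short to sketch (our own implementation, with arbitrary parameters n and m). Drawing n values from a space of m possibilities should yield roughly n²/(2m) repeated values:

```python
import random

random.seed(7)

# Draw n values from a space of m possibilities; for a good generator
# the number of values occurring more than once should be near the
# birthday-problem expectation of about n*n / (2*m).
n, m = 1_000, 1_000_000
draws = [random.randrange(m) for _ in range(n)]

counts = {}
for d in draws:
    counts[d] = counts.get(d, 0) + 1

t_s = sum(1 for c in counts.values() if c > 1)  # the test statistic
print(t_s)  # expectation here is n*n / (2*m) = 0.5
```

Far too many collisions indicate clustering; far too few indicate that the generator's outputs are "too evenly" spread, and both are departures from randomness.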
It includes tests selected from the sources above, plus compression, complexity, spectral, and entropy tests. In practice, this appears to be the most complete and rigorous suite available (McCullough). A new version is being developed at the University of Hong Kong by W.
The preliminary version examined provided two additional tests and allowed some of the parameters of existing tests to be varied. Many of the tests are simply designed to detect correlations among multiple streams extracted from a single generator. Empirical tests, such as those just described, should be considered an essential complement to, but not a replacement for, theoretical analysis of the random number algorithm.
In addition, new tests continue to be developed. The seed used for initialization must be truly random. Non-random seeds may lead to correlations across multiple sequences, even if there is no intrasequence correlation. Although it is possible to collect adequate randomness from hardware and from external events (such as the timing of keystrokes), many commonly used methods of resetting seeds, such as using the system clock, are inadequate (Eastlake et al.).
The tests described previously are designed only to detect intrasequence correlations and will not reveal such problems with seed selection. Statistical packages that offer methods to select a new seed automatically should make use of hardware to generate true random values, and should thoroughly document the method by which the seed is generated.
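A sketch of this recommendation in Python (the module names are Python's own, used here for illustration): draw the seed from OS-provided entropy rather than the clock, record it so the run is documented and replicable, and use it to create a dedicated stream:

```python
import secrets
import random

# The system clock is a poor seed source; operating systems expose
# hardware-derived entropy instead, which Python's `secrets` module
# (like `os.urandom`) draws from.
seed = secrets.randbits(64)   # 64 bits of OS-provided entropy
rng = random.Random(seed)     # a dedicated, seeded stream

sample = [rng.random() for _ in range(3)]
print(seed, sample)

# Recording the seed keeps the analysis replicable:
# the same seed regenerates the same stream.
rng2 = random.Random(seed)
assert [rng2.random() for _ in range(3)] == sample
```

Publishing the recorded seed alongside simulation results is the simplest way to make a Monte Carlo study exactly reproducible.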
The tests we just described do not measure the period of the generator, so it is essential that a statistical package's documentation include the periods of its PRNGs. Modern PRNG algorithms have much longer periods; even the simple multiply-with-carry generator has an extremely long period. See Chapter 2 for more detail.
A statistical package should offer a choice of generators. Although some packages are now doing just this, it remains common for a package to provide a single generator. (Non-random seeds may also make a sequence easier to predict, an important consideration in cryptography; see Viega and McGraw.)
Some statistical applications, however, such as parallelized MCMC, use multiple processors simultaneously to run different parts of the simulation or analysis.
Calculus Problem Solver -- differentiates any arbitrary equation and outputs the result, providing detailed step-by-step solutions in a tutorial-like format. Can also initiate an interactive quiz in which you solve differentiation problems while the computer corrects your solutions. ZeroRejects -- implements the "Six Sigma" statistical process control methodology developed by Motorola.
The alpha and beta versions are freely downloadable. WinSPC (free trial) -- statistical process control software. Includes equivalence- and non-inferiority testing for most tests, Monte Carlo simulation for small samples, and group sequential interim analyses. Design-Ease and Design-Expert -- two programs from Stat-Ease that specialize in the design of experiments.
Full-function evaluation copies of both programs are available for download. Has a comprehensive web-based tutorial and reference manual. Factor -- a comprehensive factor analysis program. Provides univariate and multivariate descriptive statistics of input variables (mean, variance, skewness, kurtosis), charts for ordinal variables, dispersion matrices (user defined), covariance, Pearson correlation, and polychoric correlation matrix with optional Ridge estimates.
Provides mean, variance and histogram of fitted and standardized residuals, and automatic detection of large standardized residuals. You can download the Version 3. To obtain a free copy of the program and manual, send an e-mail to the custodians: Statistics-Chemometrics shell. Weka -- a collection of machine learning algorithms for data mining tasks, implemented in Java. Can be executed from a command-line environment, from a graphical interface, or called from your own Java code.
Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization, and is well-suited for developing new machine learning schemes. StatCalc -- a PC calculator that computes table values and other statistics for 34 probability distributions. Also includes some nonparametric table values, tolerance factors, and the bivariate normal distribution. A help file is provided for each distribution. Scientific Calculator - ScienCalc program contains high-performance arithmetic, trigonometric, hyperbolic and transcendental calculation routines.
All the function routines therein map directly to Intel FPU floating point machine instructions. EqPlot -- Equation graph plotter program plots 2D graphs from equations. The application comprises algebraic, trigonometric, hyperbolic and transcendental functions. PCP (Pattern Classification Program) -- a machine-learning program for supervised classification of patterns (vectors of measurements). Supports interactive keyboard-driven menus and batch processing.
The website says it's no longer being developed. An augmented Windows version (Aug.) EXE - For comparisons of two independent groups or samples. The current version number is 3. EXE - For use in descriptive epidemiology including the appraisal of separate samples in comparative studies.
EXE - Miscellaneous randomization, random sampling, adjustment of multiple-test p-values, appraisal of synergism, assessment of a scale, correlation-coefficient tools, large contingency tables, three-way tables, median polish and mean polish, appraisal of effect of unmeasured confounders. EXE - Multiple logistic regression. The current version number is 1. EXE - For appraisal of differences and agreement between matched samples or observations.
EXE - Multiple Poisson regression. EXE - An expression evaluator with storage of constants, interim results, and formulae and calculator for p values and their inverse, confidence intervals, and time spans. The current version number is 4. Provides sophisticated methods in a friendly interface. TETRAD is limited in the class of models it handles. The TETRAD programs describe causal models in three distinct parts or stages: a picture, representing a directed graph specifying hypothetical causal relations among the variables; a specification of the family of probability distributions and kinds of parameters associated with the graphical model; and a specification of the numerical values of those parameters.
EpiData -- a comprehensive yet simple tool for documented data entry. Overall frequency tables (codebook) and listing of data included, but no statistical analysis tools. Calculate sample size required for a given confidence interval, or confidence interval for a given sample size. Can handle finite populations. Online calculator also available. Biomapper -- a kit of GIS and statistical tools designed to build habitat suitability (HS) models and maps for any kind of animal or plant.
Deals with: preparing ecogeographical maps for use as input for ENFA. Graphical displays include an automatic collection of elementary graphics corresponding to groups of rows or to columns in the data table, automatic k-table graphics and geographical mapping options, searching, zooming, selection of points, and display of data values on factor maps. Simple and homogeneous user interface. Weibull Trend Toolkit -- fits a Weibull distribution function (like a normal distribution, but more flexible) to a set of data points by matching the skewness of the data.
Command-line interface versions available for major computer platforms; a Windows version, WinBUGS, supports a graphical user interface, on-line monitoring, and convergence diagnostics. GUIDE is a multi-purpose machine learning algorithm for constructing classification and regression trees. Incredibly powerful and multi-featured program for data manipulation and analysis.
Designed for econometrics, but useful in many other disciplines as well. Creates output models as LaTeX files, in tabular or equation format. Has an integrated scripting language: enter commands either via the GUI or via script; command loop structure for Monte Carlo simulations and iterative estimation procedures; GUI controller for fine-tuning Gnuplot graphs; link to GNU R for further data analysis.
Includes a sample US macro database. See also the gretl data page. Originally designed for survival models, but the language has evolved into a general-purpose tool for building and estimating general likelihood models. Joinpoint Trend Analysis Software from the National Cancer Institute -- for the analysis of trends using joinpoint models where several different lines are connected together at the "joinpoints.
Takes trend data. Models may incorporate estimated variation for each point. In addition, the models may also be linear on the log of the response. The software also allows viewing one graph for each joinpoint model, from the model with the minimum number of joinpoints to the model with the maximum number of joinpoints. DTREG generates classification and regression decision trees. It uses V-fold cross-validation with pruning to generate the optimal size tree, and it uses surrogate splitters to handle missing data.
A free demonstration copy is available for download. NLREG performs general nonlinear regression. NLREG will fit a general function, whose form you specify, to a set of data values. Origin -- technical graphics and data analysis software for Windows. Biostatistics and Epidemiology: Completely Free OpenEpi Version 2.
Anderson Statistical Software Library -- a large collection of free statistical software (almost 70 programs!) from the M.D. Anderson Cancer Center. Performs power, sample size, and related calculations needed to plan studies. Covers a wide variety of situations, including studies whose outcomes involve the Binomial, Poisson, Normal, and log-normal distributions, or are survival times or correlation coefficients.
Two populations can be compared using direct and indirect standardization, the SMR and CMF, and by comparing two lifetables. Confidence intervals and statistical tests are provided. There is an extensive helpfile in which everything is explained. Lifetables is listed in the Downloads section of the QuantitativeSkills web site. Sample Size for Microarray Experiments -- compute how many samples are needed for a microarray experiment to find genes that are differentially expressed between two kinds of samples.
This is a stand-alone Windows 95 through XP program that receives information about dose-limiting toxicities (DLTs) observed at some starting dose, and calculates the doses to be administered next. DLT information obtained at each dosing level guides the calculation of the next dose level. Epi Info has been in existence for over 20 years and is currently available for Microsoft Windows.
The program allows for data entry and analysis. Within the analysis module, analytic routines include t-tests, ANOVA, nonparametric statistics, cross tabulations and stratification with estimates of odds ratios, risk ratios, and risk differences, logistic regression (conditional and unconditional), survival analysis (Kaplan-Meier and Cox proportional hazards), and analysis of complex survey data.
Limited support is available. The calculation of person-years allows flexible stratification by sex, and self-defined and unrestricted calendar periods and age groups, and can lag person-years to account for latency periods. Developed by Eurostat to facilitate the application of these modern time series techniques to large-scale sets of time series and in the explicit consideration of the needs of production units in statistical institutes.
Contains two main modules: seasonal adjustment and trend estimation with an automated procedure. Ideal for learning meta-analysis: reproduces the data, calculations, and graphs of virtually all data sets from the most authoritative meta-analysis books, and lets you analyze your own data "by the book". Generates numerous plots: standard and cumulative forest, p-value function, four funnel types, several funnel regression types, exclusion sensitivity, Galbraith, L'Abbe, Baujat, modeling sensitivity, and Trim-and-Fill.
Surveys, Testing, and Measurement: Completely Free CCOUNT -- a package for market research data cleaning, manipulation, cross tabulation and data analysis. IMPS (Integrated Microcomputer Processing System) -- performs the major tasks in survey and census data processing: data entry, data editing, tabulation, data dissemination, statistical analysis and data capture control.
Stats 2. SABRE -- for the statistical analysis of multi-process random effect response data. Responses can be binary, ordinal, count and linear recurrent events; response sequences can be of different types. Such multi-process data is common in many research areas, e. Sabre has been used intensively on many longitudinal datasets surveys either with recurrent information collected over time or with a clustered sampling scheme.
Last released in Mac, K; Win anticipated in September. NewMDSX -- software for Multidimensional Scaling (MDS), a term that refers to a family of models where the structure in a set of data is represented graphically by the relationships between a set of points in a space.
MDS can be used on a variety of data, using different models and allowing different assumptions about the level of measurement. SuperSurvey -- to design and implement surveys, and to acquire, manage and analyze data from surveys. Optional Web Survey Module and Advanced Statistics Module (curve fitting, multiple regression, logistic regression, factor analysis, analysis of variance, discriminant function, cluster, and canonical correlation).
Free version is limited to 1 survey, 10 questions, 25 total responses. Rasch Measurement Software -- deals with the various nuances of constructing optimal rating scales from a number of (usually dichotomous) measurements, such as responses to questions in a survey or test. These may be freely downloaded, used, and distributed, and they do not expire.
This Excel spreadsheet converts confidence intervals to p values, and this PDF file explains its background and use. RegressIt - An Excel add-in for teaching and applied work. Performs multivariate descriptive analysis and ordinary linear regression. Creates presentation-quality charts in native editable Excel format, intelligently formatted tables, high quality scatterplot matrices, parallel time series plots of many variables, summary statistics, and correlation matrices.
Easily explore variations on models, apply nonlinear and time transformations to variables, test model assumptions, and generate out-of-sample forecasts. SimulAr -- provides a very elegant point-and-click graphical interface that makes it easy to generate random variables (correlated or uncorrelated) from twenty different distributions, run Monte-Carlo simulations, and generate extensive tabulations and elegant graphical displays of the results.
EZAnalyze -- enhances Excel (Mac and PC) by adding "point and click" functionality for analyzing data and creating graphs (no formula entry required). Does all basic descriptive statistics (mean, median, standard deviation, and range), and "disaggregates" data (breaks it down by categories), with results shown as tables or disaggregation graphs. Advanced features: correlation; one-sample, independent samples, and paired samples t-tests; chi square; and single factor ANOVA.
Update Available! EZ-R Stats -- supports a variety of analytical techniques, such as: Benford's law, univariate stats, cross-tabs, histograms. Simplifies the analysis of large volumes of data, enhances audit planning by better characterizing data, identifies potential audit exceptions and facilitates reporting and analysis. Marko Lucijanic's Excel spreadsheet to perform Log Rank test on survival data, and his article. SSC-Stat -- an Excel add-in designed to strengthen those areas where the spreadsheet package is already strong, principally in the areas of data management, graphics and descriptive statistics.
SSC-Stat is especially useful for datasets in which there are columns indicating different groups, and its menu features support data management, graphics, and descriptive statistics. A separate collection of 22 distribution spreadsheets: each gives a graph of the distribution, along with the values of various parameters, for whatever shape and scale parameters you specify; you can also download a single file containing all 22 spreadsheets. Sample-size calculator for cluster randomized controlled trials, which are used when the outcomes are not completely independent of each other.
This independence assumption is violated in cluster randomized trials because subjects within any one cluster are more likely to respond in a similar manner. A measure of this similarity is known as the intracluster correlation coefficient (ICC).
Because of the lack of independence, sample sizes have to be increased. This web site contains two tools to aid the design of cluster trials — a database of ICCs and a sample size calculator along with instruction manuals. Exact confidence intervals for samples from the Binomial and Poisson distributions -- an Excel spreadsheet with several built-in functions for calculating probabilities and confidence intervals.
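The sample-size inflation such calculators perform follows from the design effect, DE = 1 + (m − 1) × ICC, where m is the average cluster size. A minimal sketch of the arithmetic in stdlib Python (the helper names are hypothetical, not the site's own calculator):

```python
import math

def design_effect(cluster_size, icc):
    """Variance inflation factor for cluster randomized trials:
    DE = 1 + (m - 1) * ICC, where m is the average cluster size."""
    return 1 + (cluster_size - 1) * icc

def clustered_sample_size(n_individual, cluster_size, icc):
    """Inflate an individually randomized sample size to account for
    within-cluster similarity, rounding up to whole subjects."""
    return math.ceil(n_individual * design_effect(cluster_size, icc))

# Example: a trial needing 200 subjects under simple randomization,
# with clusters of 20 and an ICC of 0.05 (DE = 1.95), needs roughly
# twice as many subjects.
n = clustered_sample_size(200, 20, 0.05)
```

Even a modest ICC nearly doubles the required sample here, which is why ICC databases like the one described above are valuable at the design stage.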
By Smith of Virginia Tech. A user-friendly add-in for Excel to draw a biplot display (a graph of row and column markers) from data that forms a two-way table, based on results from principal components analysis, correspondence analysis, canonical discriminant analysis, metric multidimensional scaling, redundancy analysis, canonical correlation analysis, or canonical correspondence analysis.
Allows for a variety of transformations of the data prior to the singular value decomposition and scaling of the markers following the decomposition. Lifetable -- does a full abridged current life table analysis to obtain the life expectancy of a population. From the Downloads section of the QuantitativeSkills web site.
A third spreadsheet concerns a method for two clusters by Donner and Klar. You will have to insert your own data by overwriting the tables in the second column (total number of positive responses), third column (total number of negative responses), or fourth column (total number). XLStatistics -- a step-by-step guide to data analysis, with separate workbooks for handling data with different numbers and types of variables. XLStatistics is not an Excel add-in, and all the working and code is visible.
A free version for analysis of 1- and 2-variable data is available. XLSTAT -- an Excel add-in for PC and Mac offering a wide range of statistical features, including data visualization, multivariate data analysis, modeling, machine learning, and statistical tests, as well as field-oriented solutions: sensory data analysis (preference mapping), time series analysis (forecasting), marketing (conjoint analysis, PLS structural equation modeling), biostatistics (survival analysis, OMICs data analysis), and more.
It offers a free trial of all features as well as a free version. Statistics101 -- executes programs written in the easy-to-learn Resampling Stats statistical simulation language. You write a short, simple program in the language, describing the process behind a probability or statistics problem.
Statistics101 then executes your Resampling Stats model thousands of times, each time with different random numbers or samples, keeping track of the results. When the program completes, you have your answer. Runs on Windows, Mac, and Linux -- any system that supports Java. R -- a programming language and environment for statistical computing and graphics.
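The resampling approach such a program embodies is easy to mimic in any language. A hedged stdlib-Python sketch of the same idea -- describe the chance process, repeat it many times, count the qualifying outcomes:

```python
import random

def simulate(trials=100_000, seed=42):
    """Monte-Carlo estimate of P(>= 8 heads in 10 fair coin flips),
    in the spirit of a Resampling Stats program: describe the chance
    process, repeat it many times, and tally qualifying outcomes."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        heads = sum(rng.random() < 0.5 for _ in range(10))
        if heads >= 8:
            hits += 1
    return hits / trials

estimate = simulate()
# The exact answer is (C(10,8) + C(10,9) + C(10,10)) / 2**10
# = 56/1024, about 0.0547; the simulated estimate should be close.
```

The appeal of the resampling style is that the program is a direct transcription of the chance process, so no distributional formula needs to be derived.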
Similar to S or S-Plus (it will run most S code unchanged). Provides a wide variety of statistical techniques (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering), and well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. The R environment includes facilities for data handling, calculations on arrays, data analysis, and graphical display. Review and comparison of R graphical user interfaces -- a number of graphical user interfaces (GUIs) allow you to use R by menu instead of by programming.
Written by Robert A. Detailed reviews of R graphical user interfaces are also available by the same author. RStudio -- a set of integrated tools designed to help you be more productive with R. It includes a console; a syntax-highlighting editor that supports direct code execution; and tools for plotting, history, debugging, and workspace management. The integrated development environment offers syntax highlighting, code completion, and smart indentation; execution of R code directly from the source editor; quick jumps to function definitions; easy management of multiple working directories using projects; integrated R help and documentation; an interactive debugger to diagnose and fix errors quickly; and extensive package development tools. RStudio Server provides access via a web browser, moving computation closer to the data and scaling compute and RAM centrally. Shiny -- a web application framework for R.
Turn your analyses into interactive web applications. R-Instat -- a free, open-source statistical package that is easy to use, even with low computer literacy. The software is designed to support improved statistical literacy in Africa and beyond, through work undertaken primarily within Africa, and provides a large set of statistical functions.
There is a free version and a commercial version. They both have the same statistical functions. The commercial version offers technical support. Zelig -- an add-on for R that can estimate, help interpret, and present the results of a large range of statistical methods.
It translates hard-to-interpret coefficients into quantities of interest; combines multiply imputed data sets to deal with missing data; automates bootstrapping for all models; uses sophisticated nonparametric matching commands that improve parametric procedures; allows one-line commands to run analyses in all designated strata; automates the creation of replication data files so that you or anyone else can replicate the results of your analyses (hence satisfying the replication standard); makes it easy to evaluate counterfactuals; and allows conditional, population, and superpopulation inferences.
It includes many specific methods, based on likelihood, frequentist, Bayesian, robust Bayesian, and nonparametric theories of inference. Zelig comes with detailed, self-contained documentation that minimizes startup costs for Zelig and R, automates graphics and summaries for all models, and, with only three simple commands required, generally makes the power of R accessible for all users. Zelig also works well for teaching, and is designed so that scholars can use the same program with students that they use for their research.
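One of the features listed above, automated bootstrapping, can be sketched in a few lines of stdlib Python. This illustrates the general percentile-bootstrap technique, not Zelig's own implementation, and the names are hypothetical:

```python
import random
import statistics

def bootstrap_ci(data, stat=statistics.mean, reps=5000, alpha=0.05, seed=1):
    """Percentile bootstrap confidence interval for an arbitrary statistic:
    resample the data with replacement, recompute the statistic each time,
    and take the empirical alpha/2 and 1 - alpha/2 quantiles."""
    rng = random.Random(seed)
    n = len(data)
    boots = sorted(stat([rng.choice(data) for _ in range(n)])
                   for _ in range(reps))
    lo = boots[int((alpha / 2) * reps)]
    hi = boots[int((1 - alpha / 2) * reps) - 1]
    return lo, hi

# A small illustrative sample whose mean is 2.35; the 95% interval
# should bracket that value.
sample = [2.1, 2.4, 1.9, 2.8, 2.2, 2.6, 2.0, 2.5, 2.3, 2.7]
low, high = bootstrap_ci(sample)
```

Because `stat` is a parameter, the same routine bootstraps a median, a regression coefficient, or any other statistic without new derivations -- which is the convenience tools like Zelig automate.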
Apophenia -- a statistics library for C. Octave -- a high-level mathematical programming language, similar to MATLAB, for numerical computations -- solving common numerical linear algebra problems, finding the roots of nonlinear equations, integrating ordinary functions, manipulating polynomials, and integrating ordinary differential and differential-algebraic equations.
J -- a modern, high-level, general-purpose, high-performance programming language. J runs both as a GUI and in a console command line, and is particularly strong in the mathematical, statistical, and logical analysis of arrays of data.
To determine whether a Durbin-Watson test statistic is statistically significant at a given alpha level, you can refer to a table of critical values. The statistic ranges from 0 to 4, with values near 2 indicating no first-order autocorrelation; if the statistic falls outside the critical bounds given in the table, you can reject the null hypothesis of the test and conclude that autocorrelation is present.
If you reject the null hypothesis of the Durbin-Watson test and conclude that autocorrelation is present in the residuals, you have a few options to correct the problem if you deem it serious enough. For positive serial correlation, consider adding lags of the dependent and/or independent variables to the model. For negative serial correlation, check to make sure that none of your variables is overdifferenced. For seasonal correlation, consider adding seasonal dummy variables to the model. These strategies are typically sufficient to remove the problem of autocorrelation. For step-by-step examples of Durbin-Watson tests, refer to tutorials that explain how to perform the test using different statistical software.
Posted on January 21, updated April 2, by Zach.
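The Durbin-Watson statistic itself is simple to compute from a model's residuals. A minimal stdlib-Python sketch (illustrative only; statistical packages provide equivalent built-ins such as statsmodels' `durbin_watson`):

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: d = sum((e_t - e_{t-1})^2) / sum(e_t^2).
    d ranges from 0 to 4; values near 2 suggest no first-order
    autocorrelation, values well below 2 suggest positive autocorrelation,
    and values well above 2 suggest negative autocorrelation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Strongly positively autocorrelated residuals give d near 0 ...
assert durbin_watson([1, 1, 1, 1, -1, -1, -1, -1]) < 1
# ... while perfectly alternating residuals push d toward 4.
assert durbin_watson([1, -1, 1, -1, 1, -1]) > 3
```

In practice you would pass the residual vector from a fitted regression to this function, then compare the result against the tabled critical values described above.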