From: RonBonnell@SPRINTMAIL.COM
Subject: Compare statistic software

I have been trying to compare various PC statistics packages. Here are my present conclusions from research on the Internet and talking to the vendors. Any comments on my conclusions? Am I way off in my thinking somewhere?

I realize that the best package depends on what we need to do. At this time I am not sure what we will do. We will be examining medical claims and evaluating the effectiveness of different health management programs. We will likely hire a PhD to do the analysis, but we are trying to decide which software package we should start to work with.

My conclusions:

Statistical Analysis Software Options

Summary:

I reviewed several statistical software packages on the Internet and also talked to users and vendors of the software. Each statistics package has its own advantages, and many statisticians work with several different packages according to their strengths.

In general, from Internet comments, regardless of cost, there is an overwhelming response in favor of SYSTAT over SAS. SYSTAT is much easier to use, yet still has a powerful programming language, and its representation of analysis in graphics is superior to anything else out there.

From what I see, SAS may be better suited to an organization in the following situations:

1. Data is in several different locations, including mainframes.
2. SAS programming is used as a means to access the data for several other purposes besides statistical analysis. SAS is an integrated suite of software tools used for purposes such as data warehousing, executive information systems, data visualization, and application development, in addition to statistical analysis.
3. You have a staff of highly experienced, dedicated SAS programmers.
4. There is a long-term commitment to dedicate staff to the SAS environment.

There is probably nothing which cannot be done with SAS, but getting it done is always very difficult. It is likely that SYSTAT or another statistics package could accomplish the same thing and let you go beyond what you would do in SAS, because you wouldn't be spending all your time trying to figure out the simple actions in SAS. In our case SAS would literally be 50 times more expensive over a 4 year period and be at least an order of magnitude more difficult to use.

I would suggest starting simple and working our way up as needed. If all we need are some statistical numbers, an Access upgrade could do the trick. If we need further analysis and graphics, SYSTAT could very well do the job. Specific needs for a non-core statistical procedure could move us to a different program such as Statistica, STATA, or SAS.

I should also note that these statistics packages are competing with each other and continually updating their capabilities to adopt the latest features or statistical modeling techniques. The best package at any particular point in time may be the one with the latest major release.

Product Comparisons:

1. Access Add In
   a) There is an Access Add In for $200 which provides many statistical analysis routines, such as linear regression and looking at correlations.
   b) For example:
      i) Take a data set of HRA
      ii) Set Age and Weight as independent fields
      iii) Set Cholesterol as a dependent field
      iv) Determine regression type:
         a) y = Ax + C
         b) y = Ax + Bz + ... + C
         c) y = Ax + Bx**z + ... + C
      v) The result then provides a table of values including:
         a) Correlation R values, using Pearson's correlation
            (1) Note: statistical packages will allow you to choose further types of correlation, e.g. SYSTAT can compute the Spearman and several other correlations.
         b) Y intercept
         c) Coefficient of x
         d) Standard error of coefficients
         e) Beta value of coefficients
         f) t-value of the coefficient
         g) Probability of coefficients
         h) A number of ANOVA values
   c) The Access Add-In creates a table of the results; the statistical packages can also produce a graphic result, e.g. scatter points with a linear regression line.

2. SAS
   a) A very complete package
   b) Can link to an Access database via ODBC
   c) Very difficult to use; the manuals take up an entire shelf
   d) Expensive: $5K+ with an additional $2.5K per year, plus the cost of other modules if needed.
   e) Can be run on several different platforms, from large mainframe computers down to PCs.
   f) I have heard comments that people feel trapped with SAS: with their huge investment in software and training, they need to keep paying the license fees every year.
   g) There are a lot of jobs out there for SAS programmers
   h) Training is available in various cities. I didn't see a STAT course in Minneapolis, but there are other SAS courses there.
   i) Best Quotes:
      i) "Unless you are into rarefied realms of unusual stats, it (SPSS) will probably cover your needs"
      ii) "SAS is about as hostile as they come."
      iii) "I have been trying to stare at some SAS macros to see what in the world is going on here ... ohhhh!!! after 25 minutes (That is a bizarre way to do what amounts to an AGGREGATE command.)"
      iv) "I am coming from the Systat world where the statistical graphics capability is truly staggering. ... This ought to be easy to do, but I fear that like everything else with SAS, instead of being one or two commands, it may involve multiple lines of coding. Have I missed something in the 1300-odd pages of the graph manual?"

3. SYSTAT
   a) Powerful statistical package
   b) Much easier-to-use programmer interface, though it still requires learning the programming syntax.
   c) Rich graphic capability
   d) Those who have both SAS and SYSTAT tend to use SYSTAT more, but use SAS or Access for some processes such as data management.
   e) We already have a copy and can upgrade to the new version for $300. Otherwise the cost would have been $1000.
   f) New release in August which will be easier to use and have additional functions (it may have ODBC connectivity to an Access database).
   g) Tends to be used in the scientific community
   h) Training courses have been under development since SYSTAT was purchased by SPSS; none are currently available.
   i) Best Quotes:
      i) "Everything else is second-rate compared to SYSTAT"
      ii) Comparing to JMP (end user app from SAS): "... but nothing in comparison to the power of Systat or SAS"
      iii) "I would dump STATA in a second and use SYSTAT"
      iv) "Its graphics capabilities are so far beyond the others that I can't imagine any other choice."

4. SPSS
   a) Owned by the same company which owns SYSTAT
   b) Non-programmers find it easier to use than SYSTAT; it is more menu driven than programming driven.
   c) Slower performance
   d) $1500 for the starter modules. Additional cost for further modules.
   e) Tends to be used more in the professional field, whereas SYSTAT is used more in the scientific field.
   f) Training available

5. STATA
   a) Interactive, very fast performance: seconds vs. half an hour or more in SAS or SYSTAT.
   b) Fast performance requires large memory; we would need a memory upgrade and possibly a system upgrade. Models could be constrained by lack of memory.
   c) $1000 for the package.
   d) Offers NetCourses: less expensive training conducted via e-mail.

6. STATISTICA
   a) Gets very glowing reviews from every magazine comparison
   b) A very comprehensive package with fast performance
   c) Very high accuracy in computations
   d) I don't see a lot of recent comparisons by users in the Internet newsgroups
   e) Can connect to an Access database via ODBC
   f) Training available in Chicago
   g) $1000 for the package

Observations:

1. The Access Add In provides statistical information but with less flexibility. If we want to visualize the data we will need a different application, or we will spend a lot of time developing graphics in Access.
2.
   Many statisticians use several of the statistical packages at the same time. The core statistical capabilities exist in each of the packages. Each package has its own strengths and ease-of-use features for the different types of analysis.
3. SPSS may be easier for some people to use with its point-and-click interface, and STATA may have an advantage in performance (though there could be data size limits), but the overwhelming response in all my searches on the Internet favors SYSTAT for statistical analysis.
4. In general SYSTAT and SAS appear comparable in their capabilities on a PC. They both use a powerful programming language and have rich statistical routines. Though it may be possible to extend SAS beyond the capability of SYSTAT, it is unlikely we would need to do so. SYSTAT would likely do everything we need with less effort and with better graphical results.
5. SAS appears to be more appropriate for an enterprise solution, where the data may reside in many different formats and SAS is the tool used to get at the data and perform statistical analysis. Since we are looking only for a single-user PC solution with somewhat limited use, the other packages appear more appropriate.
6. There are some specific differences in the packages' capabilities. We need to know what we will be doing with the data. For example, STATA doesn't provide cluster analysis but has the most mature bootstrapping capability.

Recommendation:

We need to determine what type of statistical analysis we plan to perform, then move up the following solution scale to find the most appropriate statistical solution. I would approach the solution in the following order:

1. Determine if the Access Add-In provides the needed analysis (from what I know, we will probably want to move beyond the Access Add-In capability).
2. Determine if the SYSTAT upgrade provides the needed function (Are we looking for some specific non-core statistical analysis routine?
   Which statistical program package has this routine?)
3. Determine if speed and interactivity lead us to turn to STATA for performance.
4. Commit the resources to incorporate SAS.
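
For anyone who wants to see what the Access Add-In example under Product Comparisons amounts to (Age and Weight as independent fields, Cholesterol as the dependent field, regression type y = Ax + Bz + C), here is a rough sketch in Python using only the standard library. It is not the Add-In's own method, which I have not seen. It is ordinary least squares solved through the normal equations. The function names (solve, fit_ols, r_squared) and the six data records are made up purely for illustration; they are not HRA data.

```python
# Sketch: ordinary least squares for y = A*x + B*z + C, standard library only.
# All names and data here are hypothetical and for illustration only.

def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back-substitution on the upper-triangular system.
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def fit_ols(predictors, response):
    """Return [A, B, ..., C]: one coefficient per predictor, then the intercept."""
    design = [list(row) + [1.0] for row in predictors]  # constant column for C
    k = len(design[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(design, response)) for i in range(k)]
    return solve(xtx, xty)


def r_squared(predictors, response, coeffs):
    """Share of the variance in the response explained by the fitted model."""
    mean_y = sum(response) / len(response)
    fitted = [sum(c * v for c, v in zip(coeffs, list(row) + [1.0]))
              for row in predictors]
    ss_res = sum((y - f) ** 2 for y, f in zip(response, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in response)
    return 1.0 - ss_res / ss_tot


# Invented records: (age in years, weight in pounds) -> cholesterol
age_weight = [(25, 150), (35, 180), (45, 165), (55, 200), (65, 175), (40, 210)]
cholesterol = [172, 195, 198, 226, 224, 215]

coeffs = fit_ols(age_weight, cholesterol)
a_age, b_weight, intercept = coeffs
print(f"cholesterol = {a_age:.3f}*age + {b_weight:.3f}*weight + {intercept:.3f}")
print(f"R^2 = {r_squared(age_weight, cholesterol, coeffs):.3f}")
```

This only covers the coefficients, the intercept and R squared. The standard errors, t-values and ANOVA figures that the Add-In and the full packages report would need more code, which is a fair illustration of why a dedicated package is attractive once we go beyond the basics.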