Conventional wisdom suggests that, when data come from studies with small sample sizes and the response variable has a non-normal distribution, one should either apply a power transformation (log, square root, etc.) before analysis or use a distribution-free procedure such as a rank transformation or a randomization test. To better appreciate the effect of specific alternatives on both the type I error and the power to detect differences between treatment groups, simulation studies were conducted assuming specific gamma distributions G(r, theta). A simple two-group design was assumed. The reference group always had an average disease level mu = r × theta = 3.0, and the treatment group had means whose percentage reductions ranged from 0% to 50%. By varying the shape parameter r over 1, 2, 4, 8, and 16, one could investigate distributional profiles ranging from almost symmetric (r = 16) to highly skewed (r = 1 or 2). Six statistical test procedures were compared. All test procedures were robust with respect to the type I error. The UMP test based on a ratio of sample means produced the greatest power for all combinations of n, r and R-T. The power losses associated with the randomization test, the t-test on the original scale, and the t-test on the square root scale were very small (3% to 6% in absolute value) for n = 10 and 15, and less than 2% for group sizes of 25 or more. The power loss associated with the t-test on the log scale was much larger, with power 5% to 10% lower than that of the t-test on the original scale. The Wilcoxon rank test produced results similar to those of the LOG t-test for small samples. The loss in power for the unshifted LOG test could be recouped by use of a shifted LOG (x + c) test. The same procedures based on differences in sample means were then compared for comparable lognormal distributions.
Here the log transformation performed best, ahead of the Wilcoxon rank test, and both performed considerably better than the t-test on the original scale. These results suggest that statistical inferences can be highly dependent on both the distributional form of the response variable and the scale of measurement used in the statistical analysis.
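
As an illustration only, the gamma simulation design described above might be sketched as follows for one of the six procedures, the randomization test on the difference in sample means. This is not the authors' code. The function names, the numbers of simulations and permutations, the two-sided form of the test, and the choice to keep the shape r fixed while shrinking the scale theta in the treatment group are all assumptions made for the sketch.

```python
import random
import statistics


def simulate_groups(n, r, reduction, rng, ref_mean=3.0):
    """Draw one reference and one treatment sample of size n from gamma distributions."""
    # The gamma mean is r * theta, so this scale gives the reference mean of 3.0.
    theta_ref = ref_mean / r
    # Assumption: the treatment keeps shape r and only the scale is reduced.
    theta_trt = theta_ref * (1.0 - reduction)
    ref = [rng.gammavariate(r, theta_ref) for _ in range(n)]
    trt = [rng.gammavariate(r, theta_trt) for _ in range(n)]
    return ref, trt


def randomization_p_value(ref, trt, rng, n_perm=500):
    """Two-sided Monte Carlo randomization test on the difference in sample means."""
    observed = abs(statistics.fmean(ref) - statistics.fmean(trt))
    pooled = ref + trt
    n = len(ref)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of group labels
        diff = abs(statistics.fmean(pooled[:n]) - statistics.fmean(pooled[n:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


def rejection_rate(n, r, reduction, n_sim=200, alpha=0.05, seed=1):
    """Fraction of simulated studies in which the test rejects at level alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        ref, trt = simulate_groups(n, r, reduction, rng)
        if randomization_p_value(ref, trt, rng) <= alpha:
            rejections += 1
    return rejections / n_sim


if __name__ == "__main__":
    # reduction = 0.0 estimates the type I error; larger reductions estimate power.
    for reduction in (0.0, 0.25, 0.5):
        print(reduction, rejection_rate(n=10, r=2, reduction=reduction))
```

With a reduction of 0 the rejection rate estimates the type I error, and with a positive reduction it estimates power. The estimates are noisy at the small simulation counts used here, so a serious comparison would need many more replicates.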