The DO Loop

Rick Wicklin | June 2, 2025 | 1

The modulo operation for large integers in SAS

On social media, a SAS user reported that SAS could not compute the modulo of an extremely large integer. In SAS, the modulo operation is usually performed by using the MOD function, which computes the remainder of dividing an integer, N, by another integer, d. (In symbols, the remainder is
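One way to see why the modulo of a very large integer can go wrong is to compare exact integer arithmetic with double-precision arithmetic. The following is a minimal Python sketch, not SAS code, and it assumes that the large integer is stored as a double, which represents integers exactly only up to 2^53. Whether that is the cause of the reported problem is not stated in this excerpt. The helper names `exact_mod` and `double_mod` are hypothetical.

```python
import math

def exact_mod(n: int, d: int) -> int:
    """Remainder of n divided by d, using exact (arbitrary-precision) integers."""
    return n % d

def double_mod(n: int, d: int) -> float:
    """Same remainder, but computed after n and d are rounded to doubles."""
    return math.fmod(float(n), float(d))

# 2**53 + 1 is the smallest positive integer that a double cannot store exactly.
N = 2**53 + 1
d = 2

print(exact_mod(N, d))   # exact arithmetic: N is odd, so the remainder is 1
print(double_mod(N, d))  # N was rounded to 2**53 first, so the remainder is 0.0
```

The two results differ because the low-order digit of N is lost before the remainder is computed, not because the remainder operation itself is wrong.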

Analytics | Data Visualization | Programming Tips

Rick Wicklin | May 27, 2025 | 0

Visualize the Gini-Simpson diversity index

A previous article discusses the Gini-Simpson diversity index and how to compute it in SAS. Suppose you have a sample that contains R classes. (Classes are also called groups or categories.) Intuitively, the sample exhibits "high diversity" if the class sizes are approximately equal. The sample shows "low diversity" if
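To make the intuition about class sizes concrete, here is a minimal Python sketch. It assumes the commonly cited form of the index, 1 minus the sum of the squared class proportions; the linked articles may use a variant. The function name `gini_simpson` and the example counts are hypothetical.

```python
def gini_simpson(counts):
    """Gini-Simpson index from a list of class counts: 1 - sum(p_i**2)."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return 1.0 - sum((c / total) ** 2 for c in counts)

# Four classes of equal size: the index reaches its maximum for R = 4, which is 1 - 1/4.
equal = gini_simpson([25, 25, 25, 25])

# One dominant class and three tiny ones: the index is close to 0.
skewed = gini_simpson([97, 1, 1, 1])

print(equal, skewed)
```

Under this form, the index can be read as the probability that two observations drawn at random with replacement belong to different classes.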

Learn SAS | Programming Tips

Rick Wicklin | May 19, 2025 | 1

Calculate the Gini-Simpson diversity index in SAS

An article by David Corliss in Amstat News (Corliss D. (2025) "Quantifying Diversity: Calculating the Gini-Simpson Diversity Index") discusses a new statistical measure of diversity that was adopted by the US Census Bureau. The statistic is called the Gini-Simpson diversity index. The Census Bureau has published an article about how

Analytics | Learn SAS

Rick Wicklin | May 12, 2025 | 0

Stratified bootstrapping and when to use it

When you use the bootstrap method in statistics, the most common resampling method is called case resampling. For data that has N observations, each bootstrap sample is created by sampling with replacement from the N observations (or "cases") in the data. However, if the data set includes categorical variables, it
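Case resampling as described above can be sketched in a few lines of Python. This is an illustration only, not the SAS implementation from the post; the function name `case_resample` and the toy data are hypothetical.

```python
import random

def case_resample(cases, rng):
    """Draw one bootstrap sample: N cases sampled with replacement from N cases."""
    n = len(cases)
    return [cases[rng.randrange(n)] for _ in range(n)]

# Toy data: each case is an (id, value) pair.
data = [(1, 2.3), (2, 4.1), (3, 3.7), (4, 5.0), (5, 1.9)]

rng = random.Random(12345)          # fixed seed so the run is reproducible
sample = case_resample(data, rng)

# The sample has the same size as the data; some cases can repeat, others can be absent.
print(sample)
```

Because every case is drawn with equal probability, nothing in this scheme controls how many cases from each level of a categorical variable end up in a given bootstrap sample.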

Advanced Analytics | Machine Learning

Rick Wicklin | May 5, 2025 | 0

Implement a SMOTE simulation algorithm in SAS

A recent article describes the main features of simulation by using the Synthetic Minority Over-sampling Technique (SMOTE). SMOTE was created to oversample from a set of rare events prior to running a machine learning classification algorithm. However, at its heart, the SMOTE algorithm (Chawla et al., 2002) provides a way

Learn SAS | Machine Learning | Programming Tips

Rick Wicklin | April 28, 2025 | 2

The SMOTE method for generating synthetic data

The Synthetic Minority Over-sampling Technique (SMOTE) was created to address class-imbalance problems in machine learning algorithms. The idea is to oversample from the rare events prior to running a machine learning classification algorithm. However, at its heart, the SMOTE algorithm (Chawla et al., 2002) is essentially a way to simulate
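The core step of SMOTE, as described by Chawla et al. (2002), is to pick a rare-event observation, pick one of its k nearest neighbors among the other rare events, and place a synthetic point at a random position on the line segment between them. The following Python sketch shows that single step under simplifying assumptions: Euclidean distance, continuous variables only, and a brute-force neighbor search. The function name `smote_point` and the toy data are hypothetical, and this is not the SAS implementation from the post.

```python
import math
import random

def smote_point(minority, k, rng):
    """Generate one synthetic point from a list of minority-class points (tuples)."""
    base = rng.choice(minority)
    # Brute-force search: the k nearest other minority points, by Euclidean distance.
    others = [p for p in minority if p is not base]
    neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
    nbr = rng.choice(neighbors)
    # Interpolate: base + u * (neighbor - base), with u uniform on [0, 1).
    u = rng.random()
    return tuple(b + u * (q - b) for b, q in zip(base, nbr))

# Toy minority class: the four corners of the unit square.
rare = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

rng = random.Random(2025)           # fixed seed so the run is reproducible
synthetic = [smote_point(rare, k=2, rng=rng) for _ in range(5)]
print(synthetic)
```

Each synthetic point lies between two existing rare events, so the new points stay inside the region spanned by the minority class.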
