/*----------------------------------------------------------------------------- File: RenameVariablesInLibref.sas Author: John Sabel john.sabel@ofm.wa.gov Washington State Education Research & Data Center Creation date: 2014-08-20 Version: 0.93 Purpose For all data sets in a libref (i.e. libname), renames all the variables that have entries in a renaming Excel worksheet. The renaming worksheet should contain thefollowing columns: 1) Label - (Optional) The labels of candidate variables that are to be renamed. 2) OldName - The original names of candidate variables to be renamed. 3) NewName - The revised names of candidate variables to be renamed. If the optional label column exists in the renaming Excel worksheet, this macro perfentially bases the renaming based on the variable labels. If a neither the variable itself has a label, or the renaming worksheet doesn't have a matching label, then the renaming will be based on comparing the variable name with a matching OldName in the worksheet. If no matching labels or OldNames are found, then no renaming occurs. Note Since this macro uses PROC IMPORT to import the Excel renaming worksheet data, the Excel workbook containing the worksheet need to be closed when running the macro, otherwise the macro will throw the error "" and stop running. The renaming worksheet could be used to maintain variable name standards. For example, you could have a standard where the variable name for a person's last name should be "LAST_NAME". Then in the renaming worksheet, you could have multiple entries that have the NewName as "LAST_NAME", but with the OldNames containing all the variations of last name in your data such as "LastName", "PersonLastName", "lname", "NameLast". With these entries, by running all your data sets through this macro, you would cause all last name variable names to be standardized on "LAST_NAME". The origin of this macro occurred when I imported a text files with 33 character plus variable names into SAS using Enterprise Guide. With these variabless, Enterprise Guide truncates the variables to 32 characters, but maintains the original variable names as labels. If two or more variables share the same first 32 characters, Enterprise Guide will further truncate the second and succeeding variables to 27 characters and append "_0001" to the name of the second variable, and "_0002" to the third variable should it exist, and so on. I find variables named like this hard to work with, so I wanted to condense the variable names into something that better resembled the original 33+ character variable names. Since I was importing more than one text file, I wasn't absolutely sure that the "_0001" version of one variable in one text file was necessarily the same "_0001" version of a similar variable in another text file. This is why I developed the capacity for this macro to preferentially rename based on variable labels. This source code listing contains two helper macros, 1)%_DownloadWorksheet(), and 2)%_RenameDsnVariablesViaRenameDsn. I created these so I could reuse them in the sister macro %RenameVariablesInDataSet(). Parameters LibnameIn: The libref containing the data sets and/or tables for which you want the variables to be renamed. The libname could refer to a file server or disc drive containing SAS data sets, or to an DBMS system containing tables, or combinations of all three if the original libref concatenates libraries. LibnameOut: The name of the libref which you want to contain the outputted data sets/tables. LibnameOut can be equal to LibnameIn. In this case, if both LibnameIn and LibnameOut refer to a single library (as opposed to referring to concatenated libraries), the original data sets will be modified. RegExFilter_1: (Optional) Regular expression filter for subsetting the data sets that you wish to have this macro operate on. The default, "/./I", acts as no filter at all. WorkbookPath: The path, including file name, to the Excel workbook containing the renaming worksheet (see WorksheetName below). WorksheetName: The name of the worksheet containing the label/old name/ new name mappings. Debug: (Optional) Y = Don't delete intermidiate data set. N = Delete intermidiate data set. Default is N. Signature %RenameVariablesInLibref(LibnameIn=, LibnameOut=, WorkbookPath=, WorksheetName=, RegExFilter_1=/./I, Debug=N) Revisions: Date Version Author Comments 2014-09-04 0.91 John Removed unneeded UPCASE statements from SQL WHERE clauses involving Dictionary tables and LIBNAMEs and MEMNAMEs. 2014-09-18 0.92 John Fixed bug in %_RenameDsnVariablesViaRenameDsn(). 2015-06-03 0.93 John Modfied code to remove any "old name" labels for variables that the macro renames. Also, macro now writes to the SAS log some information concerning number of variablesrenamed. Fixed bug where macro would error out if there were no renamings to be done. Where there are no renamings, macro will simply copy DsnInto DsnOut and write a warning to the log. -----------------------------------------------------------------------------*/ %macro RenameVariablesInLibref(LibnameIn=, LibnameOut=, WorkbookPath=, WorksheetName=, RegExFilter_1=/./I, Debug=N); %put --- Start of %upcase(&sysmacroname) macro; %_DownloadWorksheet(DsnOut=__OldNameNewNameTable, WorkbookPath=&WorkbookPath., WorksheetName=&WorksheetName., Debug=&Debug.) proc sql %if &Debug.=N %then %do; noprint; %end; %else %do; ; title "Data Set Names"; title2 "Libref: %upcase(&LibnameIn.), RegEx Table Filter: &RegExFilter_1."; %end; SELECT DISTINCT MEMNAME INTO :MemNames1-:MemNames999 FROM Dictionary.COLUMNS WHERE LIBNAME = "%upcase(&LibnameIn.)" AND MEMTYPE = 'DATA' AND prxmatch("&RegExFilter_1", MEMNAME) > 0 /* AND prxmatch("&RegExColFilter_1", NAME) > 0*/ ; %let NumDataSets=&sqlobs.; quit; %do I = 1 %to &NumDataSets.; %_RenameDsnVariablesViaRenameDsn(DsnIn=&LibnameIn..&&MemNames&I.., DsnOut=&LibnameOut..&&MemNames&I.., RenameDsnIn=__OldNameNewNameTable, Debug=&Debug.) %end; %if &Debug.=N %then %do; proc sql; DROP TABLE __OldNameNewNameTable; quit; %end; %put --- End of %upcase(&sysmacroname) macro; %mend RenameVariablesInLibref; %macro _DownloadWorksheet(DsnOut=, WorkbookPath=, WorksheetName=, Debug=N); %put --- Start of %upcase(&sysmacroname) macro; proc import out=&DsnOut. datafile="&WorkbookPath." dbms=xlsx replace; sheet="&WorksheetName."; run; %put --- End of %upcase(&sysmacroname) macro; %mend _DownloadWorksheet; %macro _RenameDsnVariablesViaRenameDsn(DsnIn=, DsnOut=, RenameDsnIn=, Debug=N); %put --- Start of %upcase(&sysmacroname) macro; %if &Debug.~=N %then %put _local_; %local _OldNameNewName _OldNameNewName1 _OldNameNewName2; %local LabelToNone1 OldNameLabelToNone; %let _OldNameNewName1=%str( ); %let _OldNameNewName2=%str( ); %let LabelToNone1=%str( ); %let OldNameLabelToNone=%str( ); data __RenameVar_DsnIn_Copy; set &DsnIn.; run; /* Create the old name to new name rename pairs. */ proc sql noprint; /* This allows macro to determine if renaming workbook has a label column. If so, use it. */ SELECT STRIP(NAME) FROM DICTIONARY.COLUMNS WHERE LIBNAME = "WORK" AND MEMNAME = "%upcase(__OldNameNewNameTable)" AND UPCASE(NAME) = "LABEL" ; %let _HasLabelColumn=&sqlobs.; %let NumRenamedByLabel=0; %if &_HasLabelColumn.>0 %then %do; SELECT "'" || strip(Dic.Name) || "'n=" || strip(Xwk.NewName), strip(Xwk.NewName) || "=' '" INTO :_OldNameNewName1 SEPARATED BY ' ', :LabelToNone1 SEPARATED BY ' ' FROM __OldNameNewNameTable Xwk INNER JOIN DICTIONARY.COLUMNS Dic ON UPCASE(Xwk.Label) = UPCASE(Dic.Label) WHERE Dic.LIBNAME = 'WORK' AND Dic.MEMNAME = "%upcase(__RenameVar_DsnIn_Copy)" AND Xwk.Label IS NOT NULL AND Xwk.NewName IS NOT NULL ; %let NumRenamedByLabel=&sqlobs.; %end; SELECT "'" || strip(Dic.Name) || "'n=" || strip(Xwk.NewName), strip(Xwk.NewName) || "=' '" INTO :_OldNameNewName2 SEPARATED BY ' ', :OldNameLabelToNone SEPARATED BY ' ' FROM __OldNameNewNameTable Xwk INNER JOIN DICTIONARY.COLUMNS Dic ON UPCASE(Xwk.OldName) = UPCASE(Dic.Name) WHERE Dic.LIBNAME = 'WORK' AND Dic.MEMNAME = "%upcase(__RenameVar_DsnIn_Copy)" AND %if &_HasLabelColumn.=1 %then %do; Xwk.Label IS NULL AND %end; Xwk.NewName IS NOT NULL ; %let NumRenamedByOldName=&sqlobs.; %let _OldNameNewName=&_OldNameNewName1. &_OldNameNewName2.; quit; /* Dropping DsnOut firsts allows macro to work on DBMS based libraries. */ %if %sysfunc(exist(&DsnOut.)) %then %do; proc sql; DROP TABLE &DsnOut.; quit; %end; %if &NumRenamedByLabel=0 & &NumRenamedByOldName.=0 %then %do; %put WARNING: No variables were renamed. &DsnOut. is an exact copy of &DsnIn..; data &DsnOut.; set __RenameVar_DsnIn_Copy; run; %end; %else %do; %if &NumRenamedByLabel.=1 %then %put NOTE: For &DsnOut., one variable was renamed using an Label/NewName combination.; %if &NumRenamedByLabel.>1 %then %put NOTE: For &DsnOut., &NumRenamedByLabel. variables were renamed using Label/NewName combinations.; %if &NumRenamedByOldName.=1 %then %put NOTE: For &DsnOut., one variable was renamed using an OldName/NewName combination.; %if &NumRenamedByOldName.>1 %then %put NOTE: For &DsnOut., &NumRenamedByOldName. variables were renamed using OldName/NewName combinations.; data &DsnOut.; set __RenameVar_DsnIn_Copy (rename=(&_OldNameNewName.)); label /* &LabelToNone1. */ &OldNameLabelToNone.; run; %end; %if &Debug.=N %then %do; proc sql; DROP TABLE __RenameVar_DsnIn_Copy; quit; %end; %if &Debug.~=N %then %put _local_; %put --- End of %upcase(&sysmacroname) macro; %mend _RenameDsnVariablesViaRenameDsn; /* /* * Sample setup, test data and macro call **************************************; * This sample code consists of three parts: Part 1: Creates a sample worksheet containing variable renaming pairs for first and last names. Part 2: Creates test data sets. Part 3: Macro call to %RenameVariablesInLibref() to rename all first and last names in the test data to the standardized variables names "FIRST_NAME" and "LAST_NAME".; * Part 1: Create an renaming data set and export to an Excel worksheet on C:\Temp; data OldNameNewName_Names; input @1 OldName $32. @33 NewName $32.; datalines; lname LAST_NAME LastName LAST_NAME StudentLastName LAST_NAME NameLast LAST_NAME DQ_LastName LAST_NAME fname FIRST_NAME FirstName FIRST_NAME StudentFirstName FIRST_NAME NameFirst FIRST_NAME DQ_FirstName FIRST_NAME ; run; proc export data=OldNameNewName_Names outfile="C:\temp\TestOldnameNewName.xlsx" dbms=xlsx replace; sheet='OldNameNewName_Names'; run; * Part 2: Create test data sets. The names of each begin with "TestData_".; data TestData_InPatient; input lname $ fname $; datalines; Robert Frank Clatsup Patty Kalaloch Joe ; run; data TestData_K12; input StudentFirstName $ StudentLastName $; datalines; Jose Gonzales Patty Clatsup ; run; data TestData_License; input fname $ lname $; datalines; Nguyet Tran Julie Sams ; run; * Part 3: Run %RenameVariablesInLibref() using a regular expression filter to rename variables in only those data sets in the WORK library that have names that begin with "TestData_".; * Libname C_TEMP is set up for the output files so that original input data sets would not be modified. However, you could set up LibnameOut to equal WORK, and this would cause the original inputted data sets to be modifed.; libname C_TEMP 'C:\Temp'; %RenameVariablesInLibref(LibnameIn=WORK, LibnameOut=C_TEMP, WorkbookPath=C:\temp\TestOldnameNewName.xlsx, WorksheetName=OldNameNewName_Names, RegExFilter_1=/./I, Debug=N) %RenameVariablesInLibref(LibnameIn=WORK, LibnameOut=C_TEMP, WorkbookPath=C:\temp\TestOldnameNewName.xlsx, WorksheetName=OldNameNewName_Names, RegExFilter_1=/^TestData_/I, Debug=N) */