/*------------------------------------------------------------------------------ File: RenameVariablesInDataSet.sas Author: John Sabel john.sabel@ofm.wa.gov Washington State Education Research & Data Center Creation date: 2014-08-20 Version: 0.92 Purpose Renames all the variables in a data set that have entries in a renaming Excel worksheet. The renaming worksheet should contain the following columns: 1) Label - (Optional) The labels of candidate variables that are to be renamed. 2) OldName - The original names of candidate variables to be renamed. 3) NewName - The revised names of candidate variables to be renamed. If the optional label column exists in the renaming Excel worksheet, this macro perfentially bases the renaming based on the variable labels. If a neither the variable itself has a label, or the renaming worksheet doesn't have a matching label, then the renaming will be based on comparing the variable name with a matching OldName in the worksheet. If no matching labels or OldNames are found, then no renaming occurs. Note Since this macro uses PROC IMPORT to import the Excel renaming worksheet data, the Excel workbook containing the worksheet need to be closed when running the macro, otherwise the macro will throw the error "" and stop running. The renaming worksheet could be used to maintain variable name standards. For example, you could have a standard where the variable name for a person's last name should be "LAST_NAME". Then in the renaming worksheet, you could have multiple entries that have the NewName as "LAST_NAME", but with the OldNames containing all the variations of last name in your data such as "LastName", "PersonLastName", "lname", "NameLast". With these entries, by running all your data sets through this macro, you would cause all last name variable names to be standardized on "LAST_NAME". The origin of this macro occurred when I imported a text files with 33 character plus variable names into SAS using Enterprise Guide. With these variabless, Enterprise Guide truncates the variables to 32 characters, but maintains the original variable names as labels. If two or more variables share the same first 32 characters, Enterprise Guide will further truncate the second and succeeding variables to 27 characters and append "_0001" to the name of the second variable, and "_0002" to the third variable should it exist, and so on. I find variables named like this hard to work with, so I wanted to condense the variable names into something that better resembled the original 33+ character variable names. Since I was importing more than one text file, I wasn't absolutely sure that the "_0001" version of one variable in one text file was necessarily the same "_0001" version of a similar variable in another text file. This is why I developed the capacity for this macro to preferentially rename based on variable labels. This source code listing contains two helper macros, 1)%_DownloadWorksheet(), and 2)%_RenameDsnVariablesViaRenameDsn. I created these so I could reuse them in the sister macro %RenameVariablesInLibref(). Parameters DsnIn: The two level name of a data set. DsnOut: The name of the outputted data set. DsnOut can be the same as DsnIn. In this case the original DsnIn will be modified. WorkbookPath: The path, including file name, to the Excel workbook containing the renaming worksheet (see WorksheetName below). WorksheetName: The name of the worksheet containing the label/old name/ new name mappings. Debug: (Optional) Y = Don't delete intermidiate data set. N = Delete intermidiate data set. Default is N. Signature %RenameVariablesInDataSet(DsnIn=, DsnOut=, WorkbookPath=, WorksheetName=, Debug=N) Revisions: Date Version Author Comments 2014-09-04 0.91 John Removed unneeded UPCASE statements from SQL WHERE clauses involving Dictionary tables and LIBNAMEs and MEMNAMEs. 2015-06-03 0.92 John Modfied code to remove any "old name" labels for variables that the macro renames. Also, macro now writes to the SAS log some information concerning number of variablesrenamed. Fixed bug where macro would error out if there were no renamings to be done. Where there are no renamings, macro will simply copy DsnInto DsnOut and write a warning to the log. -------------------------------------------------------------------------------*/ %macro RenameVariablesInDataSet(DsnIn=, DsnOut=, WorkbookPath=, WorksheetName=, Debug=N); %put --- Start of %upcase(&sysmacroname) macro; %if &Debug.~=N %then %put _local_; %let DsnOut_nrbquote=%nrbquote(&DsnOut.); %_DownloadWorksheet(DsnOut=__OldNameNewNameTable, WorkbookPath=&WorkbookPath., WorksheetName=&WorksheetName., Debug=&Debug.) %_RenameDsnVariablesViaRenameDsn(DsnIn=&DsnIn., DsnOut=&DsnOut., RenameDsnIn=__OldNameNewNameTable, Debug=&Debug.) %if &Debug.=N %then %do; proc sql; DROP TABLE __OldNameNewNameTable; quit; %end; %exit: %if &Debug.~=N %then %put _local_; %put --- End of %upcase(&sysmacroname) macro; %mend RenameVariablesInDataSet; %macro _DownloadWorksheet(DsnOut=, WorkbookPath=, WorksheetName=, Debug=N); %put --- Start of %upcase(&sysmacroname) macro; %if &Debug.~=N %then %put _local_; proc import out=&DsnOut. datafile="%trim(&WorkbookPath.)" dbms=xlsx replace; sheet="%trim(&WorksheetName.)"; run; %if &Debug.~=N %then %put _local_; %put --- End of %upcase(&sysmacroname) macro; %mend _DownloadWorksheet; %macro _RenameDsnVariablesViaRenameDsn(DsnIn=, DsnOut=, RenameDsnIn=, Debug=N); %put --- Start of %upcase(&sysmacroname) macro; %if &Debug.~=N %then %put _local_; %local _OldNameNewName _OldNameNewName1 _OldNameNewName2; %local LabelToNone1 OldNameLabelToNone; %let _OldNameNewName1=%str( ); %let _OldNameNewName2=%str( ); %let LabelToNone1=%str( ); %let OldNameLabelToNone=%str( ); data __RenameVar_DsnIn_Copy; set &DsnIn.; run; /* Create the old name to new name rename pairs. */ proc sql noprint; /* This allows macro to determine if renaming workbook has a label column. If so, use it. */ SELECT STRIP(NAME) FROM DICTIONARY.COLUMNS WHERE LIBNAME = "WORK" AND MEMNAME = "%upcase(__OldNameNewNameTable)" AND UPCASE(NAME) = "LABEL" ; %let _HasLabelColumn=&sqlobs.; %let NumRenamedByLabel=0; %if &_HasLabelColumn.>0 %then %do; SELECT "'" || strip(Dic.Name) || "'n=" || strip(Xwk.NewName), strip(Xwk.NewName) || "=' '" INTO :_OldNameNewName1 SEPARATED BY ' ', :LabelToNone1 SEPARATED BY ' ' FROM __OldNameNewNameTable Xwk INNER JOIN DICTIONARY.COLUMNS Dic ON UPCASE(Xwk.Label) = UPCASE(Dic.Label) WHERE Dic.LIBNAME = 'WORK' AND Dic.MEMNAME = "%upcase(__RenameVar_DsnIn_Copy)" AND Xwk.Label IS NOT NULL AND Xwk.NewName IS NOT NULL ; %let NumRenamedByLabel=&sqlobs.; %end; SELECT "'" || strip(Dic.Name) || "'n=" || strip(Xwk.NewName), strip(Xwk.NewName) || "=' '" INTO :_OldNameNewName2 SEPARATED BY ' ', :OldNameLabelToNone SEPARATED BY ' ' FROM __OldNameNewNameTable Xwk INNER JOIN DICTIONARY.COLUMNS Dic ON UPCASE(Xwk.OldName) = UPCASE(Dic.Name) WHERE Dic.LIBNAME = 'WORK' AND Dic.MEMNAME = "%upcase(__RenameVar_DsnIn_Copy)" AND %if &_HasLabelColumn.=1 %then %do; Xwk.Label IS NULL AND %end; Xwk.NewName IS NOT NULL ; %let NumRenamedByOldName=&sqlobs.; %let _OldNameNewName=&_OldNameNewName1. &_OldNameNewName2.; quit; /* Dropping DsnOut firsts allows macro to work on DBMS based libraries. */ %if %sysfunc(exist(&DsnOut.)) %then %do; proc sql; DROP TABLE &DsnOut.; quit; %end; %if &NumRenamedByLabel=0 & &NumRenamedByOldName.=0 %then %do; %put WARNING: No variables were renamed. &DsnOut. is an exact copy of &DsnIn..; data &DsnOut.; set __RenameVar_DsnIn_Copy; run; %end; %else %do; %if &NumRenamedByLabel.=1 %then %put NOTE: For &DsnOut., one variable was renamed using an Label/NewName combination.; %if &NumRenamedByLabel.>1 %then %put NOTE: For &DsnOut., &NumRenamedByLabel. variables were renamed using Label/NewName combinations.; %if &NumRenamedByOldName.=1 %then %put NOTE: For &DsnOut., one variable was renamed using an OldName/NewName combination.; %if &NumRenamedByOldName.>1 %then %put NOTE: For &DsnOut., &NumRenamedByOldName. variables were renamed using OldName/NewName combinations.; data &DsnOut.; set __RenameVar_DsnIn_Copy (rename=(&_OldNameNewName.)); label /* &LabelToNone1. */ &OldNameLabelToNone.; run; %end; %if &Debug.=N %then %do; proc sql; DROP TABLE __RenameVar_DsnIn_Copy; quit; %end; %if &Debug.~=N %then %put _local_; %put --- End of %upcase(&sysmacroname) macro; %mend _RenameDsnVariablesViaRenameDsn; /* * Part 1: Create a example renaming data set and export to an Excel worksheet on C:\Temp.; * Create renaming data set.; data OldNameNewName_Animal; input @1 Label $55. @56 OldName $32. @88 NewName $32.; datalines; Orycteropus_afer Aardvark Bos_taurus Cow Mammalia_Carnivora_Ursidae_Ursus_acrtos_horribilus Mammalia_Carnivora_Ursidae_0001 Grizzly_Bear Mammalia_Carnivora_Ursidae_Ursus_acrtos_middendorffi Mammalia_Carnivora_Ursidae_0002 Kodiak_Bear ; run; * Export renaming data set as a worksheet in an Excel workbook.; proc export data=OldNameNewName_Animal outfile="C:\temp\TestOldnameNewName.xlsx" dbms=xlsx replace; sheet='OldNameNewName_Animal'; run; * Part 2: Create sample animal data set; data ScaryAnimals; Mammalia_Carnivora_Ursidae_0001="Run!"; Mammalia_Carnivora_Ursidae_0002="Run harder!"; Orycteropus_afer = 666; * The number of the aardvark. ; label Mammalia_Carnivora_Ursidae_0001 = "Mammalia_Carnivora_Ursidae_Ursus_acrtos_horribilus" Mammalia_Carnivora_Ursidae_0002 = "Mammalia_Carnivora_Ursidae_Ursus_acrtos_middendorffi"; run; * Part 3: Run %RenameVariablesInDataSet(); %RenameVariablesInDataSet(DsnIn=ScaryAnimals, DsnOut=ScaryAnimals_Out, WorkbookPath=C:\temp\TestOldnameNewName.xlsx, WorksheetName=OldNameNewName_Animal, Debug=N) */