/*----------------------------------------------------------------------------- File: LinkViaNameChange.sas Author: John Sabel Washington State Education Research & Data Center Creation date: 11/30/12 Version: 0.91 Purpose Deterministically link name data with old name/new name name change pairs found in DOL data, AOC (court) data, combined marriage and divorce data. (These data sets will have to have been previously massaged into new name / old name pairs.) Then it classifies resulting data set based on similarities in the middle names. DOL data and AOC data has data for both males and females. But WA state marriage and divorce data can only be used to infer name changes for females. A future enhancement would be to bring in the county information into the output, e.g. county of court, residence county of bride. Test code is at the bottom of this source listing. You can see from this test code the structure of the name change data that contains old name/new name pairings. Macros %_PossibleNameChanges(DsnIn_1=, DsnIn_2=, DsnOut=, AocDsnIn=, DolDsnIn=, MarrDivDsnIn=, HasSourceCol=) Deterministically links names to old name/new name pairs based on first last names and DOB. It is a little complicated by the fact that AOC (court) data has no DOB, and the other sources do have DOB. %_ClassifyNameChanges(DsnIn=, DsnOut=, Debug=N) Classifies all possible name changes found in %_PossibleNameChanges(). %GetNameChanges(DsnIn=, DsnOut=, AocDsnIn=, DolDsnIn=, MarrDivDsnIn=, HasSourceCol=N) The macro you would call when you are looking for name changes within a single data set. %GetNameChangesBetweenDataSets(AocDsnIn=, DolDsnIn=, MarrDivDsnIn=, DsnIn_1=, DsnIn_2=, DsnOut_1_2=, DsnOut_2_1=) The macro you would call when you are looking for name changes between two data sets. %OrganizeNameChangeClasses(DsnIn=, DsnOut=) Organizes the data set created by %GetNameChanges() or %GetNameChangesBetweenDataSets() into a form suitable for manual review. When %_ClassifyNameChanges() is run, a single matched old name / new name pair could occur in more than one class. This macro then ensures that each matched old name/new name pair occurs only once, occuring in the class with the highes sort order, Therefore, if a matched old name / new name pair occurs in Class 1, then it won't reoccur in any other classes further down the sort order. You could dispense with this macro if you were to move the logic into %_ClassifyNameChanges(). I imagine you would want to do this if you were to port these macros into an implementation in another language. I have it in here because it makes it easy in case I want to add another class. I could just add the class to %_ClassifyNameChanges(), with the appropriate Class variable so the new class occurs in the sort hierarchy used in %OrganizeNameChangeClasses(). Revisions: Date Version Author Comments 2013-02-20 0.91 John Changed the order in which the macros occur in the source file. Also, added additional comments. Also, changed the name of the source file. -----------------------------------------------------------------------------*/ /* Deterministically links P20IDs between two sources (or one source if DsnIn_1=DsnIn_2) using both DOL and AOC name change data. The records are linked based on concatenatation of first and last names in all sources. */ %macro _PossibleNameChanges(DsnIn_1=, DsnIn_2=, DsnOut=, AocDsnIn=, DolDsnIn=, MarrDivDsnIn=, HasSourceCol=); %put --- Start of %upcase(&sysmacroname) macro; /* This SQL statement uses a UNION operators to join three SELECT statements pertaining to the following: 1) DOL data, 2) AOC (court) data, 3) Combined marriage and divorce data */ proc sql; CREATE TABLE &DsnOut. AS /* Drivers License data ******************************************************************/ SELECT AllNameChanges.ValidationSource, AllNameChanges.TranDate_New AS NameChangeDate format=mmddyys10., CASE WHEN Linked_1.P20ID < Linked_2.P20ID THEN catx('_', Linked_1.P20ID,Linked_2.P20ID) ELSE catx('_', Linked_2.P20ID,Linked_1.P20ID) END AS P20ID_P20ID length=22, Linked_1.P20ID AS P20ID_1, Linked_2.P20ID AS P20ID_2, %if &HasSourceCol.=Y %then %do; Linked_1.Source AS Source_1, Linked_2.Source AS Source_2, %end; catx('_', Linked_1.FirstName, Linked_1.MiddleName, Linked_1.LastName) AS Name_1, /* catx() is a SAS concatenate function. To perform the same operation using ANSI SQL-92 do this (should work in SAS as well): TRIM(AllNameChanges.FirstName_Old) || "_" || TRIM(AllNameChanges.MiddleName_Old) || "_" TRIM(AllNameChanges.LastName_Old) AS Name_Old, */ catx('_', AllNameChanges.FirstName_Old, AllNameChanges.MiddleName_Old, AllNameChanges.LastName_Old) AS Name_Old, catx('_', AllNameChanges.FirstName_New, AllNameChanges.MiddleName_New, AllNameChanges.LastName_New) AS Name_New, catx('_', Linked_2.FirstName, Linked_2.MiddleName, Linked_2.LastName) AS Name_2, AllNameChanges.County_Old as CountyCode_Old, AllNameChanges.County_New as CountyCode_New, Linked_1.MiddleName AS MiddleName_1, Linked_2.MiddleName AS MiddleName_2, AllNameChanges.MiddleName_Old AS MiddleName_Old, AllNameChanges.MiddleName_New AS MiddleName_New, Linked_1.BirthDate As BirthDate_1 format=mmddyys10., Linked_2.BirthDate As BirthDate_2 format=mmddyys10. FROM &DsnIn_1. Linked_1 INNER JOIN &DolDsnIn. AllNameChanges ON cats(Linked_1.LastName, Linked_1.FirstName) = cats(AllNameChanges.LastName_Old, AllNameChanges.FirstName_Old) AND /* cats() is a SAS concatenate function. A similar statement using ANSI SQL-92 would be as follows: ON TRIM(Linked_1.LastName) || TRIM(Linked_1.FirstName) = TRIM(AllNameChanges.LastName_Old) || TRIM(AllNameChanges.FirstName_Old) AND */ Linked_1.BirthDate = AllNameChanges.BirthDate_Old INNER JOIN &DsnIn_2. Linked_2 ON cats(AllNameChanges.LastName_New, AllNameChanges.FirstName_New) = cats(Linked_2.LastName, Linked_2.FirstName) AND AllNameChanges.BirthDate_New = Linked_2.BirthDate WHERE Linked_1.BirthDate IS NOT NULL AND Linked_1.P20ID <> Linked_2.P20ID AND /* Check only for people who are of driving age. But what about identity cards? */ /* yrdif() is a SAS function. */ yrdif(Linked_1.BirthDate, AllNameChanges.TranDate_New, 'ACT/365') > 14 UNION ALL /* AOC (court) name change data. This data does not have birth dates. *******************/ SELECT AllNameChanges.ValidationSource, AllNameChanges.CaseFileDate AS NameChangeDate format=mmddyys10., CASE WHEN Linked_1.P20ID < Linked_2.P20ID THEN catx('_', Linked_1.P20ID,Linked_2.P20ID) ELSE catx('_', Linked_2.P20ID,Linked_1.P20ID) END AS P20ID_P20ID length=22, Linked_1.P20ID AS P20ID_1, Linked_2.P20ID AS P20ID_2, catx('_', Linked_1.FirstName, Linked_1.MiddleName, Linked_1.LastName) AS Name_1, catx('_', AllNameChanges.FirstName_Old, AllNameChanges.MiddleName_Old, AllNameChanges.LastName_Old) AS Name_Old, catx('_', AllNameChanges.FirstName_New, AllNameChanges.MiddleName_New, AllNameChanges.LastName_New) AS Name_New, catx('_', Linked_2.FirstName, Linked_2.MiddleName, Linked_2.LastName) AS Name_2, AllNameChanges.CountyCode as CountyCode_Old, AllNameChanges.CountyCode as CountyCode_New, Linked_1.MiddleName AS MiddleName_1, Linked_2.MiddleName AS MiddleName_2, AllNameChanges.MiddleName_Old AS MiddleName_Old, AllNameChanges.MiddleName_New AS MiddleName_New, Linked_1.BirthDate As BirthDate_1 format=mmddyys10., Linked_2.BirthDate As BirthDate_2 format=mmddyys10. FROM &DsnIn_1. Linked_1 INNER JOIN &AocDsnIn. AllNameChanges ON cats(Linked_1.LastName, Linked_1.FirstName) = cats(AllNameChanges.LastName_Old, AllNameChanges.FirstName_Old) INNER JOIN &DsnIn_2. Linked_2 ON cats(AllNameChanges.LastName_New, AllNameChanges.FirstName_New) = cats(Linked_2.LastName, Linked_2.FirstName) WHERE Linked_1.BirthDate IS NOT NULL AND /* Since AOC data does not have birth dates, check to see that Person 1 has the same birth date as Person 2. */ Linked_1.BirthDate = Linked_2.BirthDate AND Linked_1.P20ID <> Linked_2.P20ID AND /* Check for name change only for people born before the CaseFileDate. */ Linked_1.BirthDate < AllNameChanges.CaseFileDate UNION ALL /* Marriage and divorce name change data. This data DOES have birth dates. **************/ SELECT AllNameChanges.ValidationSource, NameChangeDate format=mmddyys10., CASE WHEN Linked_1.P20ID < Linked_2.P20ID THEN catx('_', Linked_1.P20ID,Linked_2.P20ID) ELSE catx('_', Linked_2.P20ID,Linked_1.P20ID) END AS P20ID_P20ID length=22, Linked_1.P20ID AS P20ID_1, Linked_2.P20ID AS P20ID_2, catx('_', Linked_1.FirstName, Linked_1.MiddleName, Linked_1.LastName) AS Name_1, catx('_', AllNameChanges.FirstName, AllNameChanges.MiddleName, AllNameChanges.LastName_Old) AS Name_Old, catx('_', AllNameChanges.FirstName, AllNameChanges.MiddleName, AllNameChanges.LastName_New) AS Name_New, catx('_', Linked_2.FirstName, Linked_2.MiddleName, Linked_2.LastName) AS Name_2, AllNameChanges.CountyCode as CountyCode_Old, AllNameChanges.CountyCode as CountyCode_New, Linked_1.MiddleName AS MiddleName_1, Linked_2.MiddleName AS MiddleName_2, AllNameChanges.MiddleName AS MiddleName_Old, AllNameChanges.MiddleName AS MiddleName_New, Linked_1.BirthDate As BirthDate_1 format=mmddyys10., Linked_2.BirthDate As BirthDate_2 format=mmddyys10. FROM &DsnIn_1. Linked_1 INNER JOIN &MarrDivDsnIn. AllNameChanges ON Linked_1.LastName = AllNameChanges.LastName_Old AND Linked_1.FirstName = AllNameChanges.FirstName AND Linked_1.BirthDate = AllNameChanges.BirthDate INNER JOIN &DsnIn_2. Linked_2 ON AllNameChanges.LastName_New = Linked_2.LastName AND AllNameChanges.FirstName = Linked_2.FirstName AND AllNameChanges.BirthDate = Linked_2.BirthDate WHERE Linked_1.BirthDate IS NOT NULL AND Linked_1.P20ID <> Linked_2.P20ID AND /* These lines are not needed, but do they make the query run faster? */ Linked_1.FirstName = Linked_2.FirstName AND Linked_1.BirthDate = Linked_2.BirthDate AND /* Check only for people who old enough to marry and divorce */ yrdif(Linked_1.BirthDate, AllNameChanges.NameChangeDate, 'ACT/365') > 15 ; quit; %put --- End of %upcase(&sysmacroname) macro; %mend _PossibleNameChanges; %macro _ClassifyNameChanges(DsnIn=, DsnOut=, Debug=N); %put --- Start of %upcase(&sysmacroname) macro; proc sql; CREATE TABLE __SameMiddleName AS SELECT DISTINCT '1' AS Class length=30, * FROM &DsnIn. NameChanges WHERE NOT(day(BirthDate_1) = 1 and month(BirthDate_1) = 1) AND ( MiddleName_1 = MiddleName_Old AND MiddleName_2 = MiddleName_New ) ; CREATE TABLE __SameMiddleInitial AS SELECT DISTINCT '2' AS Class, NameChanges.* FROM &DsnIn. NameChanges LEFT JOIN __SameMiddleName AlreadySpecified ON ( NameChanges.P20ID_1 = AlreadySpecified.P20ID_1 AND NameChanges.P20ID_2 = AlreadySpecified.P20ID_2 ) OR ( NameChanges.P20ID_1 = AlreadySpecified.P20ID_2 AND NameChanges.P20ID_2 = AlreadySpecified.P20ID_1 ) WHERE AlreadySpecified.P20ID_1 IS NULL AND NOT(day(NameChanges.BirthDate_1) = 1 and month(NameChanges.BirthDate_1) = 1) AND ( substr(NameChanges.MiddleName_1, 1, 1) = substr(NameChanges.MiddleName_Old, 1, 1) AND substr(NameChanges.MiddleName_2, 1, 1) = substr(NameChanges.MiddleName_New, 1, 1) ) ; CREATE TABLE __NullMiddleName AS SELECT DISTINCT '3' as Class, NameChanges.* FROM &DsnIn. NameChanges LEFT JOIN ( SELECT P20ID_1, P20ID_2 FROM __SameMiddleName UNION ALL SELECT P20ID_1, P20ID_2 FROM __SameMiddleInitial ) AlreadySpecified ON ( NameChanges.P20ID_1 = AlreadySpecified.P20ID_1 AND NameChanges.P20ID_2 = AlreadySpecified.P20ID_2 ) OR ( NameChanges.P20ID_1 = AlreadySpecified.P20ID_2 AND NameChanges.P20ID_2 = AlreadySpecified.P20ID_1 ) WHERE AlreadySpecified.P20ID_1 IS NULL AND NOT(day(NameChanges.BirthDate_1) = 1 and month(NameChanges.BirthDate_1) = 1) AND Name_1 <> Name_2 AND ( ( NameChanges.MiddleName_1 IS NULL AND NameChanges.MiddleName_2 IS NULL ) OR ( NameChanges.MiddleName_2 IS NULL AND ( MiddleName_1 = MiddleName_Old OR substr(MiddleName_1, 1, 1) = MiddleName_Old OR MiddleName_1 = substr(MiddleName_Old, 1, 1) ) ) OR ( NameChanges.MiddleName_1 IS NULL AND ( MiddleName_New = MiddleName_2 OR substr(MiddleName_New, 1, 1) = MiddleName_2 OR MiddleName_New = substr(MiddleName_2, 1, 1) ) ) ) ; quit; proc sql; CREATE TABLE __SameMiddleName_Jan1st AS SELECT DISTINCT '1a - Jan 1st DOB' AS Class, * FROM &DsnIn. NameChanges WHERE (day(BirthDate_1) = 1 and month(BirthDate_1) = 1) AND ( MiddleName_1 = MiddleName_Old AND MiddleName_2 = MiddleName_New ) ; CREATE TABLE __SameMiddleInitial_Jan1st AS SELECT DISTINCT '2a - Jan 1st DOB' AS Class, NameChanges.* FROM &DsnIn. NameChanges LEFT JOIN __SameMiddleName AlreadySpecified ON ( NameChanges.P20ID_1 = AlreadySpecified.P20ID_1 AND NameChanges.P20ID_2 = AlreadySpecified.P20ID_2 ) OR ( NameChanges.P20ID_1 = AlreadySpecified.P20ID_2 AND NameChanges.P20ID_2 = AlreadySpecified.P20ID_1 ) WHERE AlreadySpecified.P20ID_1 IS NULL AND (day(NameChanges.BirthDate_1) = 1 and month(NameChanges.BirthDate_1) = 1) AND ( substr(NameChanges.MiddleName_1, 1, 1) = substr(NameChanges.MiddleName_Old, 1, 1) AND substr(NameChanges.MiddleName_2, 1, 1) = substr(NameChanges.MiddleName_New, 1, 1) ) ; CREATE TABLE __NullMiddleName_Jan1st AS SELECT DISTINCT '3a - Jan 1st DOB' as Class, NameChanges.* FROM &DsnIn. NameChanges LEFT JOIN ( SELECT P20ID_1, P20ID_2 FROM __SameMiddleName UNION ALL SELECT P20ID_1, P20ID_2 FROM __SameMiddleInitial ) AlreadySpecified ON ( NameChanges.P20ID_1 = AlreadySpecified.P20ID_1 AND NameChanges.P20ID_2 = AlreadySpecified.P20ID_2 ) OR ( NameChanges.P20ID_1 = AlreadySpecified.P20ID_2 AND NameChanges.P20ID_2 = AlreadySpecified.P20ID_1 ) WHERE AlreadySpecified.P20ID_1 IS NULL AND (day(NameChanges.BirthDate_1) = 1 and month(NameChanges.BirthDate_1) = 1) AND Name_1 <> Name_2 AND ( ( NameChanges.MiddleName_1 IS NULL AND NameChanges.MiddleName_2 IS NULL ) OR ( NameChanges.MiddleName_2 IS NULL AND ( MiddleName_1 = MiddleName_Old OR substr(MiddleName_1, 1, 1) = MiddleName_Old OR MiddleName_1 = substr(MiddleName_Old, 1, 1) ) ) OR ( NameChanges.MiddleName_1 IS NULL AND ( MiddleName_New = MiddleName_2 OR substr(MiddleName_New, 1, 1) = MiddleName_2 OR MiddleName_New = substr(MiddleName_2, 1, 1) ) ) ) ; quit; proc sql; CREATE TABLE &DsnOut. AS SELECT * FROM __SameMiddleName UNION SELECT * FROM __SameMiddleInitial UNION SELECT * FROM __NullMiddleName UNION SELECT * FROM __SameMiddleName_Jan1st UNION SELECT * FROM __SameMiddleInitial_Jan1st UNION SELECT * FROM __NullMiddleName_Jan1st ORDER BY Class, ValidationSource, P20ID_1, P20ID_2 ; quit; %if &Debug. = N %then %do; proc sql; DROP TABLE __SameMiddleName, __SameMiddleInitial, __NullMiddleName, __SameMiddleName_Jan1st, __SameMiddleInitial_Jan1st, __NullMiddleName_Jan1st; quit; %end; %put --- End of %upcase(&sysmacroname) macro; %mend _ClassifyNameChanges; %macro GetNameChanges(DsnIn=, DsnOut=, AocDsnIn=, DolDsnIn=, MarrDivDsnIn=, HasSourceCol=N); %put --- Start of %upcase(&sysmacroname) macro; %_PossibleNameChanges(DsnIn_1=&DsnIn., DsnIn_2=&DsnIn., DsnOut=__PossibleNameChanges, AocDsnIn=&AocDsnIn., DolDsnIn=&DolDsnIn., MarrDivDsnIn=&MarrDivDsnIn., HasSourceCol=&HasSourceCol.) %_ClassifyNameChanges(DsnIn=__PossibleNameChanges, DsnOut=&DsnOut.); %put --- End of %upcase(&sysmacroname) macro; %mend GetNameChanges; %macro GetNameChangesBetweenDataSets(AocDsnIn=, DolDsnIn=, MarrDivDsnIn=, DsnIn_1=, DsnIn_2=, DsnOut_1_2=, DsnOut_2_1=); %put --- Start of %upcase(&sysmacroname) macro; %_PossibleNameChanges(DsnIn_1=&DsnIn_1., DsnIn_2=&DsnIn_2., DsnOut=__PossibleNameChanges_1_2, AocDsnIn=&AocDsnIn., DolDsnIn=&DolDsnIn., MarrDivDsnIn=&MarrDivDsnIn., HasSourceCol=N) %_ClassifyNameChanges(DsnIn=__PossibleNameChanges_1_2, DsnOut=&DsnOut_1_2.); %_PossibleNameChanges(DsnIn_1=&DsnIn_2., DsnIn_2=&DsnIn_1., DsnOut=__PossibleNameChanges_2_1_01, AocDsnIn=&AocDsnIn., DolDsnIn=&DolDsnIn., MarrDivDsnIn=&MarrDivDsnIn., HasSourceCol=N) /* The same person might have current and former names in BOTH data sets. This will remove these second of twin occurrances from __PossibleNameChanges_2_1_01. */ /* proc sql;*/ /* CREATE TABLE __PossibleNameChanges_2_1_02 AS*/ /* SELECT TwoToOne.**/ /* FROM __PossibleNameChanges_2_1_01 TwoToOne*/ /* LEFT JOIN __PossibleNameChanges_1_2 OneToTwo*/ /* ON TwoToOne.P20ID_1 = OneToTwo.P20ID_1 AND*/ /* TwoToOne.P20ID_2 = OneToTwo.P20ID_2 AND*/ /* TwoToOne.Name_1 = OneToTwo.Name_1 AND*/ /* TwoToOne.Name_2 = OneToTwo.Name_2*/ /* WHERE OneToTwo.P20ID_1 IS NULL*/ /* ;*/ /* quit;*/ %_ClassifyNameChanges(DsnIn=__PossibleNameChanges_2_1_01, DsnOut=__ClassifiedNameChanges_2_1_01 (rename=(P20ID_1=P20ID_2 P20ID_2=P20ID_1 Name_1=Name_2 Name_2=Name_1 MiddleName_1=MiddleName_2 MiddleName_2=MiddleName_1 BirthDate_1=BirtDate_2 BirthDate_2=BirthDate_1))) data &DsnOut_2_1.; set __ClassifiedNameChanges_2_1_01; run; %put --- End of %upcase(&sysmacroname) macro; %mend GetNameChangesBetweenDataSets; %macro OrganizeNameChangeClasses(DsnIn=, DsnOut=); /* If you are porting these macros to a non-SAS implementation, see description under soruce code file header for %OrganizeNameChangeClasses(DsnIn=, DsnOut=). */ %put --- Start of %upcase(&sysmacroname) macro; proc sql noprint; SELECT DISTINCT Class INTO :_Class1 - :_Class99 FROM &DsnIn. ORDER BY Class ; %let NumClasses = &sqlobs.; quit; %do i = 1 %to &NumClasses.; proc sql; CREATE TABLE __Root AS SELECT * FROM &DsnIn. WHERE Class = "&&_Class&I" ; quit; proc sql; CREATE TABLE __SingleClass&i. AS SELECT * FROM __Root %do j = %eval(&i.+1) %to &NumClasses.; UNION ALL SELECT * FROM &DsnIn. WHERE Class = "&&_Class&J" AND P20ID_P20ID IN ( SELECT P20ID_P20ID FROM __Root ) %end; ORDER BY P20ID_P20ID, Class ; quit; %end; proc sql; CREATE TABLE __AlmostOut LIKE __Root; CREATE TABLE Moo AS SELECT * FROM __AlmostOut %do i = 1 %to &NumClasses.; UNION ALL SELECT * FROM __SingleClass&i. %end; quit; data &DsnOut.; length RecordOrder 8; set Moo; RecordOrder = _n_; run; proc sql; DROP TABLE __Root, __AlmostOut; DROP TABLE __SingleClass1 %if &NumClasses. > 1 %then %do; %do i = 2 %to &NumClasses.; ,__SingleClass&i. %end; %end; ; quit; %put --- End of %upcase(&sysmacroname) macro; %mend OrganizeNameChangeClasses; /* * Test data and sample macro calls. ; data TestData_1; input @1 P20ID $ @10 LastName $ @20 FirstName $ @30 MiddleName $ @40 BirthDate mmddyy8.; datalines; A001 Sabel John Robert 07301965 A002 Vanlet Paul Herbert 02231929 A003 NewLast NewFirst MI 10102010 A004 Smith John B 07072007 A005 Musho Jane Ellen 05051980 A006 Moody Madeye Xerxes 06061966 A007 Stanley Madeye X 06061966 ; run; data TestData_2; input @1 P20ID $ @10 LastName $ @20 FirstName $ @30 MiddleName $ @40 BirthDate mmddyy8.; datalines; B001 John Sabel Robert 07301965 B002 Smith Robert Molt 02231929 B003 OldLast OldFirst MI 10102010 B004 Brown Susan Delores 07072007 B005 Riley Jane Ellen 05051980 ; run; data TestData_AOC; input @1 ValidationSource $ @10 CaseFileDate mmddyy8. @20 LastName_Old $ @30 FirstName_Old $ @40 MiddleName_Old $ @50 LastName_New $ @60 FirstName_New $ @70 MiddleName_New $ @80 CountyCode $; datalines; AOC_Test 0102000 Sabel John Robert John Sabel Robert 01 AOC_Test 0102010 Vanlet Paul Herbert Smith Robert Molt 17 AOC_Test 0102010 OldLast OldFirst MI NewLast NewFirst MI 27 AOC_Test 0102010 Brown Susan D Smith John Brad 21 ; run; data TestData_DOL; input @1 ValidationSource $ @10 TranDate_New mmddyy8. @20 LastName_Old $ @30 FirstName_Old $ @40 MiddleName_Old $ @50 BirthDate_Old mmddyy8. @60 LastName_New $ @70 FirstName_New $ @80 MiddleName_New $ @90 BirthDate_New mmddyy8. @100 County_Old $ @110 County_New $; datalines; DOL_Test 0102000 Sabel John Robert 07301965 John Sabel Robert 07301965 AA BB ; run; data TestData_MarrDiv; input @1 ValidationSource $ @10 NameChangeDate mmddyy8. @20 LastName_Old $ @30 LastName_New $ @40 FirstName $ @50 MiddleName $ @60 BirthDate mmddyy8. @70 CountyCode $; datalines; MarrTest 0102000 Musho Riley Jane E 05051980 GG MarrTest 0102000 Moody Stanley Madeye X 06061966 LL ; run; options mprint symbolgen; %GetNameChangesBetweenDataSets(DsnIn_1=TestData_1, DsnIn_2=TestData_2, DsnOut_1_2=_Test_1to2, DsnOut_2_1=_Test_2to1, AocDsnIn=TestData_AOC, DolDsnIn=TestData_DOL, MarrDivDsnIn=TestData_MarrDiv) %OrganizeNameChangeClasses(DsnIn=_Test_1to2, DsnOut=_Test_1to2_Grouped_Classes) */ *;