/*----------------------------------------------------------------------------- File: LinkPlusToolkit.sas Author: John Sabel john.sabel@ofm.wa.gov Washington State Education Research & Data Center Creation date: 2/27/12 Version: 0.92 Purpose This file combines all the ERDC Link Plus related SAS functions and macros into one file. Therefore, running this file once in SAS will cause all the macros to become compiled, and ready for use. This file contains the following programs: 1) CreateLinkPlusFunctions.sas - Uses PROC FCMP to create Link Plus related functions. 2) ReadInLinkPlusReport.sas - Creates a macro that can read into SAS a Link Plus report. 3) ModifyLinkPlusReport.sas - Creates a macro that will modify the Link Plus report brought into sas by %ReadInLinkPlusReport. 4) SeparateByCondition.sas - After determining accept, reject and manual review cutoff values, this macro can segment up the Link Plus report data sets in SAS using these cutoff values. 4) CreateLinkPlusExportFile.sas - Creates a Link Plus export file from a SAS data set that contains Link Plus report data. 5) AssembleLinkPlusReport.sas - Creates a macro to assemble a Link Plus report from a Link Plus report brought into SAS using %ReadInLinkPlusReport and (optionall) modified by %ModifyLinkPlusReport. 6) MergeLinkPlusLinkageFiles.sas - Creates a macro that can concatenate all the Link Plus data file to data file linkage export files that exist in a single directory. Revisions Date Version Author Comments 2012-07-08 0.91 JRS Added CreateLinkPlusExportFile.sas. 2012-10-08 0.92 JRS Added helper macro %FetchLinkPlusLinkageExportFile. This macro is necessary for MergeLinkPlusLinkageFiles to run. ------------------------------------------------------------------------------- Copyright © 2012 Washington State Office of Financial Management This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. -----------------------------------------------------------------------------*/ /*----------------------------------------------------------------------------- File: CreateLinkPlusFunctions.sas Author: John Sabel john.sabel@ofm.wa.gov Washington State Education Research & Data Center Creation date: 1/31/12 Version: 0.9 Purpose This program creates one Link Plus specific function. When you run this program, the function that is created is stored in SASUser.FCMP.LinkPlus. There is a dependency between this location and the program "ReadInLinkPlus.sas." The program "ReadInLinkPlus.sas" is coded to look for the function FixIdVariables() in SASUser.FCMP.LinkPlus. The programs, and the resulting dependency are written to faciliate ease of first time use. Link Plus specific functions: 1) FixIdVariables(MetadataString $) $ Used in "ReadInLinkPlus.sas." ------------------------------------------------------------------------------- Copyright © 2012 Washington State Office of Financial Management This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. -----------------------------------------------------------------------------*/ proc fcmp outlib = SASUser.FCMP.LinkPlus; function FixIdVariables(MetadataString $) $; length ReturnString $512 MetaString $512; /* Need to create copy of MetadataString to avoid a WA RNING being thrown. */ MetaString = strip(MetadataString); if indexc(MetaString, '09'x) then delimiter = '09'x; else delimiter = ' '; NumWords = countc(MetaString, '', 'st') + 1; if index(MetaString, '_1') then do; do i = 1 to NumWords - 1; SingleWord = scanq(MetaString, i, delimiter); ReturnString = catx(delimiter, ReturnString, SingleWord); if prxmatch('/_1/', SingleWord) then do; SingleWord2 = prxchange('s/(\w+)_1/$1_2/', - 1, SingleWord) ; MetaString = tranwrd(MetaString, delimiter || trim(SingleWord), delimiter || trim(SingleWord2)); *MetaString = tranwrd(MetaString, '09'x || SingleWord, '09'x || SingleWord2); end; end; ReturnString = catx(delimiter, ReturnString, scanq(MetaString, NumWords, delimiter)); end; else do; do i = NumWords to 2 by -1; SingleWord = scanq(MetaString, i, delimiter); ReturnString = catx(delimiter, SingleWord, ReturnString); if prxmatch('/_2/', SingleWord) then do; SingleWord2 = prxchange('s/(\w+)_2/$1_1/', - 1, SingleWord) ; MetaString = tranwrd(MetaString, trim(SingleWord) || delimiter, trim(SingleWord2) || delimiter); *MetaString = tranwrd(MetaString, '09'x || SingleWord, '09'x || SingleWord2); end; end; ReturnString = catx(delimiter, scanq(MetaString, 1, delimiter), ReturnString); end; Put ReturnString=; return(ReturnString); endsub; run; /*----------------------------------------------------------------------------- File: ReadInLinkPlusReport.sas Author: John Sabel john.sabel@ofm.wa.gov Washington State Education Research & Data Center Creation date: 1/31/12 Version: 0.91 Purpose This program reads in the Link Plus report file. The program splits the report file into two data sets, 1) the header portion (metadata) of the report file, and 2) the data itself. Note By default, SAS variables are eight characters long. If your dataset has variables longer than eight characters, and these variables will be used in "ModifyLinkPlusReport.sas", then you will need to modify this program to to read in the full length of these characters. Examples of how to modify this program are shown below. For example, this program modifies different permutations of variables for first, middle and last name so as to allow for 20 character field lengths. Also, this program uses the function FixIdVariables(). The definition of FixIdVariables is in a separate program, "CreateLinkPlusFunctions.sas", available on the ERDC website. So you will have to download and run "CreateLinkPlusFunctions.sas" before you can run this program. You only need to run "CreateLinkPlusFunctions.sas" once, and then the FixIdVariables() will be stored permanently in SASUSER.FCMP.LINKPLUS. Everytime this program is run, it checks to see if SAS has a referenece to the package SASUSER.FCMP. If it doeesn't, it adds a reference. The program, and the resulting dependency to FixIdVariables() was written to faciliate ease of first time use. In more formal code, you could remove the check, and in in an autoexec file, add the reference to the package using an options statement (i.e. "options cmplib=SASUSER.FCMP"). Input Parameters: LinkRptPathIn: Path of a unmodified Link Plus report. Output Parameters: MetadataDsnOut: The metadata header of a Link Plus report as a SAS dataset. ReportDsnOut: The data section of a Link Plus report as a SAS dataset. Debug Parameter: Y = Don't delete intermediate dataset "__MetaDataHeader." N = Delete "__MetaDataHeader." Default is N. Macro Variables Outputted put_1, put_2: A tab delimited list of Link Plus report variables. Used in the creation of a modifed Link Plus report. The clerical review function of Link Plus doesn't actually use any of these variables directly from the report. Instead, the clerical review function pulls in these variables from the original ACSCII your files used to as input to Link Plus. If you want to keep report size down, and you don't care to view it yourself, then you could use the macro "justtabs" in "AssembleLinkPlusReport.sas." justtabs A tab delimited list of nothing. This can be used in lieu of put_1 and put_2 above when you don't care to put in the unneeded variables when you create a modified Link Plus report in "AssembleLinkPlusReport.sas." Revisions Date Version Author Comments 2012-02-27 0.91 JRS Fixed bug where program would only run in debug mode. Modified comments. ------------------------------------------------------------------------------- Copyright © 2012 Washington State Office of Financial Management This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. -----------------------------------------------------------------------------*/ %macro ReadInLinkPlusReport(LinkRptPathIn=, MetadataDsnOut=, ReportDsnOut=, Debug=N); %put --- Start of %upcase(&sysmacroname) macro; /* These variables are used in AssembleLinkPlusReport.sas */ %global put_1 put_2; /* Checks to see if reference already exists to the location of FixIdVariables(). If the reference doesn't exist,then is adds it. */ %if %index(%sysfunc(getoption(cmplib)), SASUSER.FCMP) = 0 %then %do; options append=(CMPLIB=SASUSER.FCMP); %end; /* This data step captures metadata into a set of macro variables from the header of Link Plus' report file */ data __MetaDataHeader; infile "&LinkRptPathIn." length=len obs=50 dsd; length put_1 $300 put_2 $300; retain HeaderLineNum 0; input dataline : $varying300. len @; HeaderLineNum = HeaderLineNum + 1; if substr(dataline,1,10)='Linkage ID' then do; /* Remove blanks and "-" characters, so that "Linkage ID" becomes "LinkageID", etc. */ dataline2 = strip(compress(compress(dataline, "-"))); /* For deduplication reports: strip out LinkageID, SetID, Record#*/ dataline3=prxchange('s/LinkageID\tSetID\tRecord#\t//',-1, dataline2); /* For linking reports: strip out LinkageID, File, Record# */ dataline4=prxchange('s/LinkageID\tFile\tRecord#\t//',-1, dataline3); /* Strip out Class, Score */ dataline5=prxchange('s/\tClass\tScore//',-1,dataline4); /* Create new variable names by appending "_1" to remaining variable names */ length_1 = prxchange('s/(\w+)/$1_1/', -1, dataline5); length_1 = FixIdVariables(length_1); /* Create new variable names by appending "_2" to remaining variable names */ length_2 = prxchange('s/(\w+)/$1_2/', -1, dataline5); length_2 = FixIdVariables(length_2); %if &Debug. = Y %then %do; put length_1 =; put length_2 =; put; %end; /* Add "$" in between every variable name. Used in the "input" statement below. */ input_1 = prxchange('s/1\t/1 \$ /', -1, length_1); input_1 = prxchange('s/2\t/2 \$ /', -1, input_1); /* Add "$" after last variable name */ input_1 = compbl(input_1 || ' $'); input_2=compbl(prxchange('s/1\t/1 \$ /', -1, length_2)); input_2=compbl(prxchange('s/2\t/2 \$ /', -1, input_2)); /* Add "$" after last variable name */ input_2 = compbl(input_2 || ' $'); /* Create put statement fragments from raw input statement fragments. Changes all "$" that are for an input statement to tabs for use in the put statements used to output modified report. However, these variable names and data are NOT used by Link Plus when it generates Clerical Review views. It obtains the data for the views by pulling the it from the source data file. So if you don't care about having these variablesin your modified report, use the "justtabs" variable in the final put statements. */ put_1 = prxchange("s/ \$ / '09'x /", -1, input_1); put_2 = prxchange("s/ \$ / '09'x /", -1, input_2); justtabs = prxchange('s/(\w+)//', -1, length_1); justtabs = prxchange("s/\t/ '09'x /", -1, justtabs); %if &Debug. = Y %then %do; put put_1 =; put put_2 =; put; put justtabs =; /* Only produces (NumberColumns -1) tabs */ put; %end; /* A kludge. */ input_1 = tranwrd(input_1, '_2', '_kludge'); input_2 = tranwrd(input_2, '_1', '_kludge'); /****************************************************************** * * * Examples of how to modify input statements so that more than * * eight characters will be read into SAS report. * * * ******************************************************************/ /* Modify the input statements to take in the full 9 character SSN. */ /* Parameters to tranwrd() are case sensitive. */ input_1 = tranwrd(input_1, 'ssn_1 $', 'ssn_1 : $9.'); input_2 = tranwrd(input_2, 'ssn_2 $', 'ssn_2 : $9.'); /* Modify input statements to take in full 20 character of first and last names that are in ERDC data. */ input_1 = tranwrd(input_1, 'LASTNAME_1 $', 'LASTNAME_1 : $20.'); input_2 = tranwrd(input_2, 'LASTNAME_2 $', 'LASTNAME_2 : $20.'); input_1 = tranwrd(input_1, 'FIRSTNAME_1 $', 'FIRSTNAME_1 : $20.'); input_2 = tranwrd(input_2, 'FIRSTNAME_2 $', 'FIRSTNAME_2 : $20.'); input_1 = tranwrd(input_1, 'lname_1 $', 'lname_1 : $20.'); input_2 = tranwrd(input_2, 'lname_2 $', 'lname_2 : $20.'); input_1 = tranwrd(input_1, 'fname_1 $', 'fname_1 : $20.'); input_2 = tranwrd(input_2, 'fname_2 $', 'fname_2 : $20.'); input_1 = tranwrd(input_1, 'LastName_1 $', 'LastName_1 : $20.'); input_2 = tranwrd(input_2, 'LastName_2 $', 'LastName_2 : $20.'); input_1 = tranwrd(input_1, 'FirstName_1 $', 'FirstName_1 : $20.'); input_2 = tranwrd(input_2, 'FirstName_2 $', 'FirstName_2 : $20.'); input_1 = tranwrd(input_1, 'last_name_1 $', 'last_name_1 : $20.'); input_2 = tranwrd(input_2, 'last_name_2 $', 'last_name_2 : $20.'); input_1 = tranwrd(input_1, 'first_name_1 $', 'first_name_1 : $20.'); input_2 = tranwrd(input_2, 'first_name_2 $', 'first_name_2 : $20.'); input_1 = tranwrd(input_1, 'middle_name_1 $', 'middle_name_1 : $20.'); input_2 = tranwrd(input_2, 'middle_name_2 $', 'middle_name_2 : $20.'); /* Modify input statements to take in the full 16 characters of the DistrictStudentCode from the K-12 data.. */ input_1 = tranwrd(input_1, 'DistrictStudentCode_1 $', 'DistrictStudentCode_1 : $16.'); input_2 = tranwrd(input_2, 'DistrictStudentCode_2 $', 'DistrictStudentCode_2 : $16.'); /* Modify input statements to take in full 11 digit rec_key used in hospitalization data */ input_1 = tranwrd(input_1, 'REC_KEY_1 $', 'REC_KEY_1 : $11.'); input_2 = tranwrd(input_2, 'REC_KEY_2 $', 'REC_KEY_2 : $11.'); /******************************************************************/ /* End of examples of modifying input statements. */ /******************************************************************/ %if &Debug. = Y %then %do; put input_1 =; put input_2 =; put; put HeaderLineNum =; %end; call symputx('length_1',length_1); call symputx('length_2',length_2); call symputx('input_1',input_1); call symputx('input_2',input_2); call symputx('put_1',put_1); call symputx('put_2',put_2); call symputx('justtabs',justtabs); call symputx('HeaderLineNum',HeaderLineNum); end; run; /* Create a dataset containing just the metadata header portion of a Link Plus report. */ data &MetadataDsnOut.; infile "&LinkRptPathIn." length=len dsd obs=&HeaderLineNum.; input dataline : $varying200. len @; *putlog dataline _error_ _n_; run; /* Create a dataset containing the report data itself from a Link Plus report. */ data &ReportDsnOut.; %let StartingObs=%eval(&HeaderLineNum.+1); infile "&LinkRptPathIn." dlm='09'x dsd firstobs=&StartingObs.; input #2 LinkageID1 SetOrLinkID1 RecordNum1 &input_1. #3 LinkageID2 SetOrLinkID2 RecordNum2 &input_2. class score; output; run; %if &Debug. = N %then %do; proc datasets lib=work nolist; delete __MetaDataHeader; run; quit; %end; %put --- End of %upcase(&sysmacroname) macro; %mend ReadInLinkPlusReport; /*----------------------------------------------------------------------------- File: ModifyLinkPlusReport.sas Author: John Sabel john.sabel@ofm.wa.gov Washington State Education Research & Data Center Creation date: 1/31/12 Version: 0.9 Purpose This program modifies the SAS version of a Link Plus report that is generated by "ReadInLinkPlusReport.sas." It modifies Link Plus's "score" variable so that each Link Plus "class" value has a non-overlapping range of scores. In other words: 1) Class 1: Range of modified scores, 0 to 99.9 2) Class 2: Range of modified scores, 100 to 199.9 3) Class 3: Range of modified scores, 200 to 299.9 4) etc. What reassigning the score variable allow you to do is to perform clerical review in Link Plus by class. For example, the definition of Link Plus's class 1 is where matches have the same SSN, birth date, first name, and last name. In this case, barring unusual cases such as single letter first or last names, you would want to accept the entire class, regardless of score. After using this macro, followed by "AssembleLinkPlusReport.sas", in Link Plus's clerical review function, you could accept all scores that range from 0 to 99.9. In a similar fashion, you could come up with accept/reject scores for each class. This is a barebones implementation. You could also modify class definitions in the code, and come up with additional classes. I have used up to 99 classes before. Input Parameters ReportDsnIn: Data from a Link Plus report as a SAS dataset. Debug: Not used. Output Parameters ReportDsnOut: Modified data from a Link Plus report as a SAS dataset. -----------------------------------------------------------------------------*/ %macro ModifyLinkPlusReport(ReportDsnIn=, ReportDsnOut=, Debug=N); %put --- Start of %upcase(&sysmacroname) macro; /* Mofifies Link Plus report data that was loaded into a SAS */ data &ReportDsnOut.; set &ReportDsnIn.; /* In the program "ReadInLinkPlusReport.sas", the original variable names are stored in the macro variables "put_1" and "put_2" (or in "justtabs"). These macro variables are then used to define the output for the modfied Link Plus report generated in "AssembleLinkPlusReport.sas." As a result, the output dataset defined by "ReportDataset" here can have as many additional intermediate variables as desired. */ orig_score = score; orig_class = class; score = (class * 100) + score; run; proc sort data=&ReportDsnOut.; by class descending score ; run; %if &Debug.=Y %then %do; %put NOTE: The Debug parameter was set to Y, but this macro does not have a debug mode.; %end; %put --- End of %upcase(&sysmacroname) macro; %mend ModifyLinkPlusReport; /*----------------------------------------------------------------------------- File: SeparateByCondition.sas Author: John Sabel john.sabel@ofm.wa.gov Washington State Education Research & Data Center Creation date: 9/15/10 Version: 0.9 Purpose This program separates a dataset into two datasets based on a condition set. A condition set can contain one or more conditions. -----------------------------------------------------------------------------*/ %macro SeparateByCondition(DsnIn=, MeetsCondition_DsnOut=, DoesNotMeetCondition_DsnOut=, ConditionSet=, Debug=N); %put --- Start of %upcase(&sysmacroname.) macro; data &MeetsCondition_DsnOut &DoesNotMeetCondition_DsnOut.; set &DsnIn.; if (&ConditionSet.) then output &MeetsCondition_DsnOut.; else output &DoesNotMeetCondition_DsnOut.; run; %if &Debug.=Y %then %do; %put NOTE: The Debug parameter was set to Y, but this macro does not have a debug mode.; %end; %put --- End of %upcase(&sysmacroname.) macro; %mend SeparateByCondition; /*----------------------------------------------------------------------------- File: CreateLinkPlusExportFile.sas Author: John Sabel john.sabel@ofm.wa.gov Washington State Education Research & Data Center Creation date: 7/12/11 Version: 0.9 Purpose This program creates a Link Plus linkage export file from a dataset that was originally imported from a Link Plus report using ReadInLinkPlusReport.sas. -----------------------------------------------------------------------------*/ %macro CreateLinkPlusExportFile(ReportDsnIn=, FilePathOut=, Match_Status=, Key_1=, Key_2=, Debug=N); %put --- Start of %upcase(&sysmacroname) macro; %if &Debug.=Y %then %do; %put NOTE: The Debug parameter was set to Y, but this macro does not have a debug mode.; %end; data _null_; set &ReportDsnIn.; file "&FilePathOut." dlm='09'x; if _n_ = 1 then do; put 'Match_Status' '09'x 'Score' '09'x 'Class' '09'x 'Linkage_ID' '09'x "&Key_1._File1" '09'x "&Key_2._File2"; end; put "&Match_Status." '09'x score class LinkageID1 &Key_1._1 &Key_2._2; run; %put --- End of %upcase(&sysmacroname) macro; %mend CreateLinkPlusExportFile; /*----------------------------------------------------------------------------- File: AssembleLinkPlusReport.sas Author: John Sabel john.sabel@ofm.wa.gov Washington State Education Research & Data Center Creation date: 1/31/12 Version: 0.9 Purpose This program "assembles" a Link Plus report from its constituent parts that were created in from a raw Link Plus report by running "ReadInLinkPlusReport.sas": 1) The Link Plus report header which contains column headings and other information (the parameter and dataset "MetadataDsnIn"), 2) The data itself (the parameter and dataset "ReportDsnIn"). Input Parameters: MetadataDsnIn: The two-level dataset name of the report header of a Link Plus report. ReportDsnIn: The two-level dataset name of the data of a Link Plus report that is the form as a SAS dataset. Debug: Not used. Hidden Input Parameters: Macro variables "put_1" and "put_2": Running the macro "ReadInLinkPlusReport.sas" creates these. Alternatively, you could use the single macro variable "justtabs" in lieu of "put_1" and "put_2", though you will need uncomment and comment out the appropriate lines. See the comments in the source code of "ReadInLinkPlusReport.sas" for more details. Output Parameters: LinkRptPathOut: The fully qualified file name of the Link Plus report to be create. ------------------------------------------------------------------------------- Copyright © 2012 Washington State Office of Financial Management This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. -----------------------------------------------------------------------------*/ %macro AssembleLinkPlusReport(MetadataDsnIn=, ReportDsnIn=, LinkRptPathOut=, Debug=N); %put --- Start of %upcase(&sysmacroname) macro; /* Output metadata */ data _null_; file "&LinkRptPathOut."; set &MetadataDsnIn. end=eof; varlen = length(dataline); put dataline $varying200. varlen; run; /* Output data */ /* As far as Link Plus is concerned, the data step doesn't need the formats (i.e. 10., 6.0, 6.1), but they do make the modified report look a little more like the original report */ data _null_; file "&LinkRptPathOut." mod; set &ReportDsnIn. end=eof; put #2 LinkageID1 10. '09'x SetOrLinkID1 '09'x RecordNum1 '09'x &put_1. /* &justtabs. '09'x */ #3 LinkageID2 10. '09'x SetOrLinkID2 '09'x RecordNum2 '09'x &put_2. /* &justtabs. '09'x */ class 6.0 '09'x score 6.1; run; /* Link Plus wants last line of report file to contain just a CR/LF. That is what this does. */ data _null_; file "&LinkRptPathOut." mod; put; run; %if &Debug.=Y %then %do; %put NOTE: The Debug parameter was set to Y, but this macro does not have a debug mode.; %end; %put --- End of %upcase(&sysmacroname) macro; %mend AssembleLinkPlusReport; /*----------------------------------------------------------------------------- File: MergeLinkPlusLinkageFiles.sas Author: John Sabel john.sabel@ofm.wa.gov Washington State Education Research & Data Center Creation date: 2/27/12 Version: 0.9 Purpose For a given export directory, this program concatenates all the Link Plus exported "View" files for a data file to data file linkage project. It will not combine export files from a deduplication project. It filters the files in the export directory based on a ".txt" file extension. Therefore, this program will throw an error if it encounters in the export directory a ".txt" file that is NOT a export file. Parameters DirIn: The directory path to the exported Link Plus views. This directory should only contain the export files that you want to link together. DsnOut: What you want to call the final two-level dataset name. OnlyGoodLinks: (Optional) Set to Y means only links with that have a Match_Status = 1 will be kept, otherwise, all linkages, including those with a Match_Status of 2 and 3 will be kept. Default is Y. Debug: (Optional) Y = Don't delete intermidiate datasets. N = Delete intermidiate datasets. Default is N Note Running this macro with the "OnlyGoodLinks" parameter set to N results in a SAS data set that contains every linkage record, whether they are flagged "accept", "uncertain" or "reject." Running the following SQL statement will result in a nice report that can give you information that will help you to set cutoff values for future linking runs using similar data: proc sql; title 'FirstBatch Statistics'; SELECT class, count(*) AS TotalMatches, sum(case when Match_Status = 1 then 1 else 0 end) as AcceptedMatches, sum(case when Match_Status = 2 then 1 else 0 end) as RejectedMatches, sum(case when Match_Status = 3 then 1 else 0 end) as UncertainMatches, min(case when Match_Status = 1 then (score - (class * 100)) end) AS MinScoreAccepted, max(case when Match_Status <> 1 then (score - (class * 100)) end) AS MaxScoreNotAccepted FROM <> GROUP BY class ORDER BY class ; quit; This SQL statement assumes that the Link Plus report data had been first modified by the program ModifyLinkPlusReport.sas (The default implentation of ModifyLinkPlusReport.sas modified the score variable). ------------------------------------------------------------------------------- Copyright © 2012 Washington State Office of Financial Management This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. -----------------------------------------------------------------------------*/ %macro MergeLinkPlusLinkageFiles(DirIn=, DsnOut=, OnlyGoodLinks=Y, Debug=N); %put --- Start of %upcase(&sysmacroname) macro; %local err or; %let err=ERR; %let or=OR; %if "&debug" ^= "Y" and "&debug" ^= "N" %then %do; %put &err&or: (&sysmacroname.) Invalid value for the macro parameter DEBUG.; %put &err&or: (&sysmacroname) Value should be either N or Y.; %goto exit; %end; %local i; filename dircmd pipe "dir /b &DirIn."; data __Dirfiles; length outfile $200; Keep outfile; infile dircmd missover length=length; input @; input bigline $varying200. length; outfile = strip(bigline); if substr(strip(reverse(outfile)), 1, 3) = 'txt'; /* We only want .txt files */ run; /* Create a list of export files, and a list of what will be the SAS dataset names for these files. Because export files names could begin with a number, the SAS dataset names will all be prepended by the characters "__". */ proc sql noprint; SELECT outfile, "__" || substr(outfile,1,length(outfile)-4) INTO :RawDataNames separated by ' ', :SasDataNames separated by ' ' FROM __Dirfiles ; %let numfiles = &sqlobs.; quit; %if &numfiles.= 0 %then %do; %put &err&or: (&sysmacroname.) The directory "&DirIn." does not contain .txt files.; %goto exit; %end; /* Create datasets from all the Link Plus Export files */ %FetchLinkPlusLinkageExportFile(FileIn=&DirIn.\%scan(&RawDataNames.,1,%str( )),DsnOut=%scan(&SasDataNames.,1,%str( )),OnlyGoodLinks=&OnlyGoodLinks.) %if %sysfunc(exist(%scan(&SasDataNames.,1,%str( ))))=0 %then %do; %put &err&or: (&sysmacroname.) The dataset "%scan(&SasDataNames.,&i.,%str( ))" does not exist.; %put &err&or: (&sysmacroname) This macro will now terminate.; %goto exit; %end; data &DsnOut.; set %scan(&SasDataNames.,1,%str( )); run; %do i = 2 %to &numfiles.; %FetchLinkPlusLinkageExportFile(FileIn=&DirIn.\%scan(&RawDataNames.,&i.,%str( )),DsnOut=%scan(&SasDataNames.,&i.,%str( )),OnlyGoodLinks=&OnlyGoodLinks.) /* Triggered if macro FetchLinkPlusLinkageExportFile() does not return a SAS data set. */ %if %sysfunc(exist(%scan(&SasDataNames.,&i.,%str( ))))=0 %then %do; %put &err&or: (&sysmacroname.) The dataset "%scan(&SasDataNames.,&i.,%str( ))" does not exist.; %put &err&or: (&sysmacroname) This macro will now terminate.; %goto exit; %end; proc append base=&DsnOut. data=%scan(&SasDataNames.,&i.,%str( )); run; %end; %exit: %if &debug. = N %then %do; proc datasets lib=work nolist; delete __Dirfiles &SasDataNames.; run; quit; %end; %else %do; %put _user_; %end; %put --- End of %upcase(&sysmacroname) macro; %mend MergeLinkPlusLinkageFiles; /*----------------------------------------------------------------------------- File: FetchLinkPlusLinkageExportFile.sas Author: John Sabel Creation date: 3/8/11 Purpose Creates a SAS dataset from a Link Plus tab delimited link export file. Parameters FileIn: The full file path to the Link Plus export file. DsnOut: The two-level dataset name that is created. OnlyGoodLinks: (Optional) Set to Y means only links with that have a Match_Status = 1 will be kept, otherwise, all linkages, including those with a Match_Status of 2 and 3 will be kept. Default is Y. tested this program with REC_KEY. The key must be numeric. Debug: (Optional) Y = Don't delete intermidiate datasets. N = Delete intermidiate datasets. Default is 'N' Modification Log: Person Date Description -----------------------------------------------------------------------------*/ /* Might want to add parameter "NumericKeys" to handle non-numeric keys?. */ %macro FetchLinkPlusLinkageExportFile(FileIn=,DsnOut=,OnlyGoodLinks=Y,Debug=N); %put --- Start of %upcase(&sysmacroname) macro; %local err or; %let err=ERR; %let or=OR; %local vars; %if &Debug.=Y %then %do; %put NOTE: The Debug parameter was set to Y, but this macro does not have a debug mode.; %end; filename in "&FileIn."; * Path to export file; data _null_; infile in dsd missover obs=1 length=len; input thehead :$varying200. len; call symput('vars',thehead); run; %if %index(%bquote(&vars.),Match_Status) ^= 1 %then %do; %put &err&or: (&sysmacroname.) The file "&FileIn." is not a Link Plus export file.; %goto exit; %end; data &DsnOut.; /*length &length.;*/ INFILE in delimiter='09'x end=eof print firstobs=2 missover; * if empty values at end of line then use missover; INPUT &vars.; %if &OnlyGoodLinks.=Y %then %do; if Match_Status = 1; %end; run; %exit: %put --- End of %upcase(&sysmacroname) macro; %mend FetchLinkPlusLinkageExportFile;