my brain hurts!
Jun. 9th, 2004 03:33 pmjust got through a gnarly data-selection and sorting problem... I had two data sets from the same people. Data set 1 had one observation per person, while dataset 2 had anywhere from one to three observations. I was trying to retain the correct observation (the one with all the data) in a new dataset and get rid of duplicates... but SAS is dumb (or I am dumb, or both) and it kept getting screwed up.
Finally I realized that I needed to deal with the people who had duplicates separately. So here are the steps:
Finally I realized that I needed to deal with the people who had duplicates separately. So here are the steps:
- Merge the two original datasets, but the merged data go to one temporary dataset if there is a single observation for an ID, but another temporary dataset if there are >1 observations per ID. (Also get rid of any observations for people who are only in the second original data set.)
- In the dataset with >1 observation, retain those where some index variable matches for both original datasets. (in this case, "visit=dietvisit")
- Append the new new dataset with the single-observation temporary dataset.
Code:/* now add dietary iron values where they are available */ data new dups; merge fefin (in=g) diet; by alive_id; if not g then delete; /* delete people who only have diet data */ if first.alive_id=last.alive_id then output new; else output dups; /* this is where people w/ >1 observation go */ run; /* if >1 visit then we want the one where dietary iron was collected */ data nodups; set dups; if (visit ^= '' & dietvisit ^='' & visit=dietvisit); run; /* append the correct duplicate observations to the new data set */ proc append base=new data=nodups; run;