quietann: (Default)
[personal profile] quietann
just got through a gnarly data-selection and sorting problem... I had two data sets from the same people. Data set 1 had one observation per person, while dataset 2 had anywhere from one to three observations. I was trying to retain the correct observation (the one with all the data) in a new dataset and get rid of duplicates... but SAS is dumb (or I am dumb, or both) and it kept getting screwed up.

Finally I realized that I needed to deal with the people who had duplicates separately. So here are the steps:



  1. Merge the two original datasets, but the merged data go to one temporary dataset if there is a single observation for an ID, but another temporary dataset if there are >1 observations per ID. (Also get rid of any observations for people who are only in the second original data set.)

  2. In the dataset with >1 observation, retain those where some index variable matches for both original datasets. (in this case, "visit=dietvisit")

  3. Append the new new dataset with the single-observation temporary dataset.

    Code:

    /* now add dietary iron values where they are available */
    data new dups;
    	merge fefin (in=g) diet;
    	by alive_id;
    	if not g then delete;		/* delete people who 
                                            only have diet data */
    	if first.alive_id=last.alive_id then output new;	
    
    	else output dups;		/* this is where people 
                                            w/ >1 observation go */
    run;
    
    /* if >1 visit then we want the one where dietary iron was collected */
    
    data nodups;
    	set dups;
    	if (visit ^= '' & dietvisit ^='' & visit=dietvisit);		
    run;
    
    /* append the correct duplicate observations to the new data set */
    
    proc append base=new data=nodups;
    run;
    
You may post here only if quietann has given you access; posting by non-Access List accounts has been disabled.
If you don't have an account you can create one now.
HTML doesn't work in the subject.
More info about formatting

Profile

quietann: (Default)
quietann

November 2011

S M T W T F S
  12345
6789101112
131415 16171819
20212223242526
27282930   

Most Popular Tags

Style Credit

Expand Cut Tags

No cut tags
Page generated Jan. 28th, 2026 09:01 pm
Powered by Dreamwidth Studios