Transfer of Rfree flags to a new data set

There are many cases in which it is appropriate to replicate the set of reserved reflections used to track Rfree so that the same set is used for multiple data sets. Some examples of this are:
  1. You start refinement of your structure, but become worried that there was an error in the data processing. You re-run denzo/scalepack/scala/truncate or whatever to produce a new reflection file. The first thing you might want to test is whether your re-processed data are in fact better. If they are, then you probably want to continue your refinement from where you left off but using the new data file. In either case you should keep the same set of reserved reflections to track Rfree.

  2. You start refinement of your structure against one data set, and then while your refinement is in progress you get a better data set (higher resolution, more complete, whatever). It would not be correct to select a new set of reserved reflections randomly from your new data set. Most of the reflections selected that way would not be "free" at all: they would be replicate measurements of reflections that you already used in refinement against your first data set. Instead you should reserve the same set of reflections in your new data set as in your old one. If your new data set goes to higher resolution, then you augment the original set by adding reserved reflections from your new high-resolution shell of data.

  3. You are refining highly isomorphous structures, for instance multiple parallel soaks of different small molecules into the same bunch of crystals. You will presumably begin each structure refinement/analysis from the same starting model. Neither the models nor the data are independent, so you should not allow reflections used in the refinement of the starting model to be treated as "free" in the later refinements. Instead you should use the same original set of reflections for Rfree in each case.
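
The bookkeeping common to all three cases can be sketched in a few lines of Python. This is only an illustration of the logic, not part of X-PLOR or any other package: the function name transfer_free_flags, the dictionary layout, and the seed argument are all hypothetical. It assumes both data sets are indexed consistently and reduced to the same asymmetric unit, so that a given (h, k, l) means the same reflection in each. Any reflection missing from the old set (a new high-resolution shell, for instance) is flagged at random at the same fraction as the old set.

```python
import random


def transfer_free_flags(old_flags, new_hkls, free_fraction=None, seed=0):
    """Carry Rfree flags from an old data set over to a new one.

    old_flags     : dict mapping (h, k, l) -> 1 (reserved/free) or 0 (working)
    new_hkls      : iterable of (h, k, l) tuples present in the new data set
    free_fraction : fraction of newly seen reflections to reserve; defaults
                    to the fraction observed in old_flags
    seed          : hypothetical knob so the random choice is reproducible
    """
    if not old_flags:
        raise ValueError("old_flags is empty; nothing to transfer")
    if free_fraction is None:
        free_fraction = sum(old_flags.values()) / len(old_flags)

    rng = random.Random(seed)
    new_flags = {}
    for hkl in new_hkls:
        if hkl in old_flags:
            # Seen before: keep exactly the same flag as in the old set.
            new_flags[hkl] = old_flags[hkl]
        else:
            # Never used in refinement: safe to reserve at random.
            new_flags[hkl] = 1 if rng.random() < free_fraction else 0
    return new_flags


# Toy example: three reflections in common, two new ones at higher resolution.
old = {(1, 0, 0): 1, (0, 1, 0): 0, (0, 0, 1): 0, (1, 1, 0): 1}
new = [(1, 0, 0), (0, 1, 0), (1, 1, 0), (2, 0, 0), (2, 1, 0)]
flags = transfer_free_flags(old, new)
```

Reflections present in both sets always keep their old flag. Only reflections that were never part of the earlier refinement are assigned a flag at random.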

Imperfectly isomorphous structures present more of a borderline case than the examples above. Just how isomorphous do two structures have to be before the data sets are not independent? It's something of a judgement call. On the other hand, it doesn't hurt to keep the same set of reserved reflections, so you might as well do it anyway.

Here is a pair of fragments from X-PLOR scripts. The first extracts the Rfree flags (TEST) from an existing data file; the second merges them into a new data file.

!
! Read in an existing data file,
! write out only the Rfree flags
!
@@crystal_setup.inp
    nreflection = 50000
    reflection @olddata.xpl end

    write reflection
        test
        output=oldtest.xpl
    end
stop

!
! Merge Fo, Sigma from new data set with 
!       Rfree flags (TEST) from previous data set
! Write out merged data set to newdata.xpl
!
@@crystal_setup.inp
    nreflection = 100000
    reflection  @newdata.fob end
    reflection  @oldtest.xpl end
    resolution 100. 2.0
    reduce

    write reflection
        fobs sigma test
        output=newdata.xpl
    end

    mbins = 20
    print completeness
    ?
stop

E A Merritt 1999