Instructions for running mapfromfileS.cpp

This is a rudimentary installation, consisting of the C++ source file and sample configuration and input files. Use the c++ command to compile and link it, for example:

    c++ -o mapfromfileS.exe mapfromfileS.cpp

The resulting executable requires six command-line arguments in invariant order.

The first argument (argv[1]) is the name of the configuration file, where additional variables are given values.

The second argument is the name of the data file, which contains a record for each plant in the mapping population.

The third argument is the name of the pre-dosage file, which states the pre-deletion dosage for each marker, by marker.

The fourth argument is the name of the output file, which will contain the initially obtained map, the finally obtained map, echoed configuration settings, and additional information that helps with debugging.

The fifth argument is the name of an intermediary file that contains the total deletion frequencies of each marker and each combination of markers. The program creates this file, so its name should be novel if the file is to be re-used.

The sixth and last argument is a numerical flag that tells mapfromfileS.exe whether to use the data file (second argument) or a stored file of processed deletion frequencies (fifth argument) as input.

Here is an example that was used to map the W7984 x Opata 85 population in the GrainGenes database:

    mapfromfileS.exe wheatmaprun0323c.cfg wheatdata0312.txt wheatprd.txt wheatrunout0323c.txt wheatposrec.txt 1

Copies of the three input files above have been included in this archive.

The configuration file provides values for the following variables in an invariant order.

ploidy: The effective ploidy of the organism. Many situations are treated as deletion from a monoploid genotype, in which case ploidy is 1.

baseNumber: The number of linkage groups to be obtained.
This typically is the somatic chromosome number for inbred and deletion populations, and the gametic chromosome number for backcross populations.

nLoci: The number of markers, which is the number of loci for backcrosses, twice the number of loci for inbreds, and any multiple of the number of loci for deletion populations.

plantsTotal: The number of individuals in the mapping population.

deletionmapping: A flag that tells mapfromfileS whether to treat the data as a true deletion population (if 1) or not (0).

simpMultFlag: A flag that tells mapfromfileS whether to treat markers as simplex or potentially multiplex in polyploids.

maxlgl: The maximum allowed number of markers in one linkage group.

cutoff: The maximum allowed value of recombination frequency between adjacent markers within a linkage group.

pctile: The percentile cutoff allowed for gaps between chromosome ends that are recombining at random.

rseed: A long integer, between 1 and 2^31-1 inclusive, for seeding a random-number generator.

temperature: The initial temperature for simulated annealing during the correction phase.

tempiters: The number of iterations that begin a new temperature during cooling in the simulated annealing phase.

annealiters: The number of iterations at each temperature during simulated annealing.

step: A multiplier (slightly less than 1) for the temperature at the beginning of each cycle of iterations.

The data file looks something like a Fasta sequence file. Header lines begin with a '>' and contain a line of annotation for the individual plant. Subsequent lines list the dosage for each marker in that plant, as space-delimited integers in fixed order by marker number. There is no particular length limit to each following line; the data are read in order with fscanf. The user must keep track of which marker identities go with which marker number.
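As a rough illustration of the data file layout described above, here is a minimal sketch of a reader. It is not taken from mapfromfileS.cpp: the program itself reads with fscanf, whereas this sketch uses C++ streams, and the names PlantRecord and readPlants are hypothetical. It assumes only what is stated above, namely that a '>' line starts a new plant and that all following integers, on however many lines, belong to that plant.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type: one entry per plant in the data file.
struct PlantRecord {
    std::string annotation;    // header text following the '>'
    std::vector<int> dosages;  // one dosage per marker, in marker-number order
};

// Hypothetical reader: collects every plant record found in the stream.
std::vector<PlantRecord> readPlants(std::istream& in) {
    std::vector<PlantRecord> plants;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line[0] == '>') {
            // A header line begins a new plant record.
            PlantRecord record;
            record.annotation = line.substr(1);
            plants.push_back(record);
        } else if (!plants.empty()) {
            // Dosage lines may be of any length; append their integers in order.
            std::istringstream fields(line);
            int dosage;
            while (fields >> dosage) {
                plants.back().dosages.push_back(dosage);
            }
        }
    }
    return plants;
}
```

A check that every record holds exactly nLoci dosages and that plantsTotal records were read would catch most formatting mistakes before a mapping run.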
For inbreds, each parent is considered to have its own set of markers, so that marker number modulo nLoci is associated with one and only one locus, and each locus n is associated with markers n and n + nLoci.

The predosage file is a series of space-delimited single digits ranging from 0 up to the predosage value for the same marker. In most cases, this is 0 or 1, but it can be greater if dosage of an individual marker can be measured directly, as from a microarray or from quantitative PCR. Again, there is no particular line length requirement.

For further explanation, please ask me at ccrane@purdue.edu.
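The marker numbering for inbreds described above can be sketched as two small helpers. These are hypothetical and not part of mapfromfileS.cpp. The sketch assumes 0-based marker numbers and takes the modulus to be the number of loci (half the marker count for inbreds), passed in as lociCount; neither assumption is confirmed by the program source.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper: the locus that a marker belongs to.
// lociCount is the number of loci; an inbred population then has 2 * lociCount markers.
int locusOfMarker(int marker, int lociCount) {
    assert(lociCount > 0 && marker >= 0 && marker < 2 * lociCount);
    return marker % lociCount;
}

// Hypothetical helper: the two markers (one per parent) associated with a locus.
std::pair<int, int> markersOfLocus(int locus, int lociCount) {
    assert(lociCount > 0 && locus >= 0 && locus < lociCount);
    return std::make_pair(locus, locus + lociCount);
}
```

With five loci, for example, markers 2 and 7 both map to locus 2 under these assumptions.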