Note: Distances scaled ...

The nbpMatching package's distancematrix() function formats a distance matrix for the nonbimatch() function. nonbimatch() only accepts integer values, so as.integer() is called on the distances. This truncates values, so 999.99 becomes 999.
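As a quick check of the truncation described above, the base R snippet below applies as.integer() to a few values. It is only a sketch of the behavior, not the package's internal code; the vector name x and its values are made up for illustration.

```r
# Hypothetical example values (not taken from any real data set)
x <- c(999.99, 0.75, 56.235543)

# as.integer() drops the fractional part instead of rounding
x_int <- as.integer(x)
print(x_int)
```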

Additionally, nonbimatch() will display a warning message letting you know the program is rescaling your distances to work with the matching algorithm. Its default setting limits the maximum length of distances to 6 digits. This can be tweaked with the precision option in nonbimatch(). Examples and R code follow.

Suppose you had distances that looked like:
1.2345667
56.235543
2345.7987

First, distancematrix() will convert the values to integers:
1
56
2345

The algorithm notes the largest distance is in the thousands (4 digits), two digits short of the 6-digit limit. Thus it will multiply all the distances by 100 and then truncate:
123
5623
234579

Conversely, if the distances are too long, it will divide by the appropriate power of 10. Consider this second example:
1.2456671
4.5112114
562.35543
2345798.7

Here the maximum distance is 7 digits long, one more than the 6-digit limit, so the algorithm will divide by 10 and truncate:
0
0
56
234579
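The arithmetic in the two examples above can be reproduced with a few lines of base R. This is a minimal sketch under stated assumptions, not the nbpMatching source: rescale_sketch is a hypothetical helper, it assumes the largest distance is at least 1, and it scales the original (untruncated) values before dropping the fractional part, which is what the numbers listed above correspond to.

```r
# Sketch only: an assumed re-implementation of the rescaling described above.
# `rescale_sketch` is a hypothetical helper, not part of nbpMatching.
rescale_sketch <- function(d, precision = 6) {
  # Number of digits in the integer part of the largest distance
  # (assumes max(d) >= 1)
  digits <- floor(log10(max(d))) + 1
  # Shift by a power of 10 so the largest value has `precision` digits,
  # then drop the fractional part
  trunc(d * 10^(precision - digits))
}

ex1 <- c(1.2345667, 56.235543, 2345.7987)             # max has 4 digits
ex2 <- c(1.2456671, 4.5112114, 562.35543, 2345798.7)  # max has 7 digits

print(rescale_sketch(ex1))
print(rescale_sketch(ex2))
```

In the second call the two smallest distances collapse to 0, which is the loss of resolution that a single very large distance can cause.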

What to do if you have some really big distances in your matrix?

Your matrix may contain some really big distances, e.g. distances between people who would never match to each other, and after rescaling these can make all your other distances look like 0. This could happen if you had two distinct clusters of subjects who should be matching within their clusters. The trick here is to cap your distances at some maximum value like 999999. Going back to the last example, you'd transform your distances
1.2456671
4.5112114
562.35543
2345798.7

to
1.2456671
4.5112114
562.35543
999999

Since the maximum is now 6 digits long, no rescaling is needed and the algorithm would simply truncate them:
1
4
562
999999

Note that you could optionally round() distances before calling distancematrix() if you don't want values truncated.
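To see how rounding differs from truncation, the sketch below compares the two on the capped example distances from above, using base R only. The variable names are chosen for illustration.

```r
# Capped distances from the example above
d <- c(1.2456671, 4.5112114, 562.35543, 999999)

# Fractional parts dropped, as as.integer() does on its own
truncated <- as.integer(d)

# Rounded to the nearest whole number first, then converted
rounded <- as.integer(round(d))

print(truncated)
print(rounded)
```

Only the second distance differs between the two results: 4.5112114 is truncated to 4 but rounded to 5.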

# Here is an illustration of truncating your distances.
d0 = read.csv("distances.csv", header=FALSE)
max(d0) # suppose the max > 999999

# Truncate the distances at 999,999
d1 = d0 # work on a copy so the original distances are kept
d1[d1>999999] = 999999

# Then perform the matching as usual.
library(nbpMatching)
d2 = distancematrix(d1)
d3 = nonbimatch(d2)
d3$halves

Topic revision: r4 - 31 Mar 2015, RobertGreevy