XNR-NXNR


XNR and NXNR Searches

Results from an XNR (exclusive near) search are a subset of both the AND and the NR results.

{AND} ⊇ {NR} ⊇ {XNR} (1)

A subtractive complement, relative to AND, is also defined.

{NXNR} = {AND} - {XNR} (2)

This is depicted in the next figure. Some authors call an exclusive near search a PROXIMITY search (1).


Figure 1. XNR results as a subset of AND results.

The blue region represents the XNR matches and the pink region the XNR nonmatches, or NXNR. Together the two regions make up the full set of AND results, so NXNR is the subtractive complement of XNR relative to the AND set.

In other words, the blue region corresponds to the AND results that match an XNR search, whereas the pink region corresponds to the AND results that do not match it. We could have computed this complement relative to NR, but we elected not to.
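Relations (1) and (2) can be checked with ordinary set operations. The following is a minimal sketch in Python; the document IDs are made up for illustration and do not come from any real search.

```python
# Hypothetical document IDs returned by each search mode for the same query.
and_results = {1, 2, 3, 4, 5, 6, 7, 8}
nr_results = {2, 3, 4, 5, 6}
xnr_results = {3, 4, 5}

# Relation (1): {AND} ⊇ {NR} ⊇ {XNR}
assert and_results >= nr_results >= xnr_results

# Relation (2): {NXNR} = {AND} - {XNR}
nxnr_results = and_results - xnr_results

# The two regions of Figure 1 do not intersect and together cover the AND set.
assert xnr_results.isdisjoint(nxnr_results)
assert xnr_results | nxnr_results == and_results
assert len(xnr_results) + len(nxnr_results) == len(and_results)
```

With these sample sets, the NXNR results are the documents that satisfy AND but fall outside the XNR set.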

Applications to IR

In our implementation, an XNR search matches the first two search terms, in any order, when they are no more than XNR terms apart, with the terms themselves excluded from the count.

When declared in a query, the 'XNR' abbreviation is also interpreted as a placeholder for a value and as a distance operator. Thus, the query

~XNR:w1 w2 w3 ... (3)

with XNR = 10, that is,

~10:w1 w2 w3 ... (4)

instructs the platform to find documents where

  • w1 is separated by no more than 10 words from w2
  • w2 is separated by no more than 10 words from w1

The leading tilde is required so that the parser recognizes the query as an XNR search and not as an NR one.
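To make the role of the tilde concrete, here is a minimal parsing sketch in Python. It only assumes the syntax shown in expressions (3) and (4); the function name parse_xnr_query is hypothetical, and this is not the parser the platform actually uses.

```python
import re

# Assumed syntax, taken from expressions (3) and (4): "~<distance>:<terms>".
XNR_PREFIX = re.compile(r"^~(\d+):(.*)$")


def parse_xnr_query(query):
    """Return (distance, terms) for an XNR query, or None if the tilde prefix is missing."""
    match = XNR_PREFIX.match(query.strip())
    if match is None:
        # No leading "~<number>:", so this sketch does not treat it as an XNR search.
        return None
    distance = int(match.group(1))
    terms = match.group(2).split()
    return distance, terms
```

For example, parse_xnr_query("~10:w1 w2 w3") yields the distance 10 and the term list w1, w2, w3, while the same string without the tilde yields None and would be left to whatever handles the other search modes.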

As a result, an XNR search can find documents with passages (portions of texts) starting with w1 and ending with w2 or vice versa.
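The distance test itself can be sketched as follows. This is an illustration under stated assumptions, not the platform's code: the text is tokenized naively on whitespace, the distance is taken to be the number of words strictly between the two terms, and the name xnr_match is hypothetical.

```python
def xnr_match(text, w1, w2, max_gap):
    """Return True if w1 and w2 occur, in either order, with at most
    max_gap words between them (the two terms themselves are not counted)."""
    words = text.lower().split()
    w1, w2 = w1.lower(), w2.lower()
    pos1 = [i for i, w in enumerate(words) if w == w1]
    pos2 = [i for i, w in enumerate(words) if w == w2]
    # Words strictly between positions i and j: abs(i - j) - 1.
    # The "0 <" guard stops a single occurrence from matching itself.
    return any(0 < abs(i - j) <= max_gap + 1 for i in pos1 for j in pos2)
```

Because the absolute difference of positions is used, a passage that starts with w2 and ends with w1 matches just as one that starts with w1 and ends with w2.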

Unlike the window passages described in the IR literature (2-5), which are obtained by fragmenting documents, these passages are defined by the query and can be overlapping or nonoverlapping. We define an overlapping passage as one that starts or ends within other passages of similar word lengths. This is illustrated in Figure 2.


Figure 2. Representation of overlapping and nonoverlapping exclusive passages starting with w1 and ending with w2 or vice versa.
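Query-defined passages of this kind can be enumerated with a short routine. The sketch below is only an illustration: the names xnr_passages and is_overlapping are hypothetical, the distance is assumed to count the words strictly between the two terms, and the overlap test simplifies the definition above by ignoring the "similar word lengths" condition.

```python
def xnr_passages(words, w1, w2, max_gap):
    """Return (start, end) index pairs of passages that begin with one term
    and end with the other, with at most max_gap words in between."""
    passages = []
    for i, word in enumerate(words):
        if word not in (w1, w2):
            continue
        other = w2 if word == w1 else w1
        # The closing term may sit at most max_gap + 1 positions after the opening one.
        for j in range(i + 1, min(i + max_gap + 2, len(words))):
            if words[j] == other:
                passages.append((i, j))
    return passages


def is_overlapping(passage, passages):
    """True if the passage starts or ends inside some other passage."""
    start, end = passage
    return any(
        other != passage
        and (other[0] <= start <= other[1] or other[0] <= end <= other[1])
        for other in passages
    )


words = "w1 a w2 b w1 c c c c w2 d w1".split()
passages = xnr_passages(words, "w1", "w2", max_gap=2)
flags = [is_overlapping(p, passages) for p in passages]
```

In this sample, the first two passages share the word at index 2 and are flagged as overlapping, while the last passage stands apart from the others.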

Final Remarks

Minerazzi supports the XNR search mode and its subtractive complement, relative to the AND mode. This mode helps users discriminate between AND results based on a word distance criterion that is defined at query time.

References

  1. Kostoff, R. N., Rigsby, J. T., and Barth, R. B. (2006). Adjacency and Proximity Searching in the Science Citation Index and Google.
  2. Kaszkiel, M. and Zobel, J. (2001). Effective ranking with arbitrary passages. Journal of the American Society For Information Science and Technology, 52(4):344-364.
  3. Callan, J. P. (1994). Passage-level evidence in document retrieval. In W. B. Croft & C. J. van Rijsbergen (Eds.), Proceedings of the 17th annual international ACM-SIGIR conference on research and development in information retrieval, Dublin, Ireland, July (pp. 302-310), New York: ACM.
  4. Kaszkiel, M. and Zobel, J. (1997). Passage retrieval revisited. In N. J. Belkin, D. Narasimhalu, & P. Willett (Eds.), Proceedings of the 20th annual international ACM-SIGIR conference on research and development in information retrieval, Philadelphia, PA (pp. 178-185).
  5. Liu, X. and Croft, B. (2002). Language models for information retrieval: Passage retrieval based on language models. Proceedings of the eleventh international conference on Information and knowledge management, November (pp. 375-382).