Match

This page describes the match query.

Combination

Combine a match query with a significant_text aggregation to find the most significant terms in a category.

GET news_headlines/_search
{
  "query": {
    "match": {
      "category": "ENTERTAINMENT"
    }
  },
  "aggregations": {
    "popular_in_entertainment": {
      "significant_text": {
        "field": "headline"
      }
    }
  }
}

The response (truncated):

"hits": { ... },
"aggregations": {
    "popular_in_entertainment": {
        "doc_count": 16058,
        "bg_count": 200853,
        "buckets": [
        {
            "key": "trailers",
            "doc_count": 387,
            "score": 0.21944632913239076,
            "bg_count": 479
        },
        {
            "key": "movie",
            "doc_count": 419,
            "score": 0.16123418320234606,
            "bg_count": 730
        },
...
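
The score of each bucket can be reproduced from the counts in the response. The sketch below assumes the aggregation used the default JLH heuristic, which multiplies the absolute change in a term's frequency by its relative change. The function name `jlh_score` and its parameter names are illustrative, not part of any Elasticsearch client.

```python
def jlh_score(doc_count, fg_total, bg_count, bg_total):
    """JLH score: absolute change times relative change in term frequency."""
    if fg_total == 0 or bg_total == 0 or bg_count == 0:
        return 0.0                    # guard chosen for this sketch only
    fg_rate = doc_count / fg_total    # share of matching docs containing the term
    bg_rate = bg_count / bg_total     # share of all docs containing the term
    if fg_rate <= bg_rate:
        return 0.0                    # term is not over-represented in the foreground
    return (fg_rate - bg_rate) * (fg_rate / bg_rate)


# Totals and bucket counts copied from the response above.
FG_TOTAL, BG_TOTAL = 16058, 200853
trailers = jlh_score(387, FG_TOTAL, 479, BG_TOTAL)
movie = jlh_score(419, FG_TOTAL, 730, BG_TOTAL)
print(round(trailers, 4), round(movie, 4))  # 0.2194 0.1612
```

"trailers" appears in fewer entertainment headlines than "movie" (387 versus 419), yet it scores higher because it is much rarer in the index as a whole.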

Precision and Recall

Simple match

The match query uses OR logic by default: a document matches if the field contains at least one of the query terms. In pseudo-code:

# "|" means logical OR between terms, not a string operation.
query = "Khloe" | "Kardashian" | "Kendall" | "Jenner"

GET news_headlines/_search
{
  "query": {
    "match": {
      "headline": {
        "query": "Khloe Kardashian Kendall Jenner"
      }
    }
  }
}

This query matches too many documents, because a single shared term is enough for a hit. Precision needs to be increased.
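
A minimal sketch of why OR matching is loose, written in plain Python rather than against a cluster. It assumes that lowercasing and splitting on whitespace is a fair stand-in for the field's analyzer. The headlines are hypothetical and not taken from the news_headlines index.

```python
def tokenize(text):
    # Assumption: lowercase + whitespace split approximates the analyzer.
    return set(text.lower().split())


def match_or(query, headline):
    """True if the headline shares at least one term with the query."""
    return bool(tokenize(query) & tokenize(headline))


# Hypothetical headlines for illustration only.
headlines = [
    "Khloe Kardashian and Kendall Jenner attend gala",
    "Kendall Jenner walks the runway",
    "Bruce Jenner interview airs tonight",
    "Stock markets close higher",
]
query = "Khloe Kardashian Kendall Jenner"
or_hits = [h for h in headlines if match_or(query, h)]
print(len(or_hits))  # 3
```

The third headline is a hit only because it contains "Jenner", which is the kind of loosely related result that lowers precision.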

Increase precision

Set the operator parameter to "and" so that a document matches only if it contains all of the query terms.

GET news_headlines/_search
{
  "query": {
    "match": {
      "headline": {
        "query": "Khloe Kardashian Kendall Jenner",
        "operator": "and"
      }
    }
  }
}

The response (truncated):

...
"hits": {
    "total": {
        "value": 1,
        "relation": "eq"
    },
...

As you can see, only 1 document matches. Precision increased, but the result set is narrower than expected.
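
The effect of the and operator can be sketched as a subset test: every query term must appear in the headline. As before, the whitespace tokenizer is an assumption standing in for the real analyzer, and the headlines are hypothetical.

```python
def tokenize(text):
    # Assumption: lowercase + whitespace split approximates the analyzer.
    return set(text.lower().split())


def match_and(query, headline):
    """True only if every query term appears in the headline."""
    return tokenize(query) <= tokenize(headline)


# Hypothetical headlines for illustration only.
headlines = [
    "Khloe Kardashian and Kendall Jenner attend gala",
    "Kendall Jenner and Khloe spotted in Paris",
    "Kendall Jenner walks the runway",
]
query = "Khloe Kardashian Kendall Jenner"
and_hits = [h for h in headlines if match_and(query, h)]
print(and_hits)  # ['Khloe Kardashian and Kendall Jenner attend gala']
```

The second headline contains three of the four terms and is still rejected, which shows how a single missing term excludes an otherwise relevant document.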

minimum_should_match

GET news_headlines/_search
{
  "query": {
    "match": {
      "headline": {
        "query": "Khloe Kardashian Kendall Jenner",
        "minimum_should_match": 3
      }
    }
  }
}

The response (truncated):

...
"hits": {
    "total": {
        "value": 6,
        "relation": "eq"
    },
...

Requiring at least 3 of the 4 terms returns 6 documents, 5 more than the query with the and operator.
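
A rough sketch of how minimum_should_match sits between the two extremes: a threshold of 1 behaves like OR, a threshold equal to the number of terms behaves like and. The helper `match_at_least` is a hypothetical name, the tokenizer is a simplification of the real analyzer, and the headlines are invented for illustration.

```python
def tokenize(text):
    # Assumption: lowercase + whitespace split approximates the analyzer.
    return set(text.lower().split())


def match_at_least(query, headline, minimum):
    """True if the headline contains at least `minimum` distinct query terms."""
    return len(tokenize(query) & tokenize(headline)) >= minimum


# Hypothetical headlines; the comment gives the number of shared terms.
headlines = [
    "Khloe Kardashian and Kendall Jenner attend gala",  # 4
    "Kendall Jenner and Khloe spotted in Paris",        # 3
    "Kendall Jenner walks the runway",                  # 2
    "Bruce Jenner interview airs tonight",              # 1
]
query = "Khloe Kardashian Kendall Jenner"
counts = {
    m: sum(match_at_least(query, h, m) for h in headlines)
    for m in (1, 3, 4)
}
print(counts)  # {1: 4, 3: 2, 4: 1}
```

Raising the threshold trades recall for precision one term at a time, which is why a value of 3 lands between the OR and and results.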
