Basics

When you search for something in Elasticsearch, it returns results for items that are similar to the query.

Precision

image from https://youtu.be/CCTgroOcyfM?t=494

What portion of the retrieved data is actually relevant to the search query?

Recall

image from https://youtu.be/CCTgroOcyfM?t=518

What portion of relevant data is being returned as search results?
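The two definitions above can be made concrete with a small calculation. This is a minimal sketch, not anything Elasticsearch computes for you: the function name `precision_recall` and the document IDs are hypothetical and chosen only for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for retrieved vs. truly relevant document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # documents that are both returned and relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical IDs: the search returned four documents, three are truly relevant.
retrieved = ["doc1", "doc2", "doc3", "doc4"]
relevant = ["doc1", "doc2", "doc5"]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.50 recall=0.67
```

Here two of the four returned documents are relevant (precision 2/4), and two of the three relevant documents were returned (recall 2/3).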

Precision and recall are inversely related:

  • Precision

    • Returns results that match the query exactly.

    • May return only a few documents.

  • Recall

    • Returns a large number of results.

    • Returned documents may not match the query exactly.

Score

  • A value that represents how relevant a document is to a specific query.

  • A score is computed for each document that is a hit.

  • The score uses two types of data:

    • term frequency

    • inverse document frequency

TF (Term Frequency)

Term frequency measures how many times each search term appears in a document. For example, suppose the search query is "How to form good habits".

https://youtu.be/CCTgroOcyfM?t=830
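As a rough illustration of term frequency, the sketch below counts how often each query term appears in one document. The document text, the simple regex tokenizer, and the function names are assumptions made for this example; Elasticsearch's own analyzers tokenize text differently.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def term_frequencies(query, text):
    """Count how often each query term appears in the document text."""
    counts = Counter(tokenize(text))
    return {term: counts[term] for term in tokenize(query)}


query = "How to form good habits"
# Hypothetical document text, for illustration only.
doc = "Good habits are hard to form. Small habits repeated daily become lasting habits."

tf = term_frequencies(query, doc)
print(tf)  # {'how': 0, 'to': 1, 'form': 1, 'good': 1, 'habits': 3}
```

In this made-up document, "habits" has the highest term frequency, so on TF alone it would contribute the most to the score.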

IDF (Inverse Document Frequency)

IDF decreases the weight of terms that occur very frequently across documents. In our case, "habit" occurred very frequently.

https://youtu.be/CCTgroOcyfM?t=830
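The effect of IDF can be sketched with a toy corpus. This is a simplified TF-IDF illustration under stated assumptions, not Elasticsearch's actual scoring (its default similarity is BM25, which also builds on TF and IDF). The corpus, the smoothed IDF variant, and all function names are hypothetical.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def inverse_document_frequency(term, documents):
    """One common smoothed IDF variant: log((1 + N) / (1 + df)) + 1."""
    df = sum(1 for doc in documents if term in tokenize(doc))
    return math.log((1 + len(documents)) / (1 + df)) + 1


def tf_idf_score(query, document, documents):
    """Sum TF * IDF over the distinct query terms for one document."""
    doc_tokens = tokenize(document)
    return sum(
        doc_tokens.count(term) * inverse_document_frequency(term, documents)
        for term in set(tokenize(query))
    )


# Hypothetical corpus in which "habits" appears in most documents.
documents = [
    "How to form good habits",
    "Good habits for better sleep",
    "Habits of successful people",
    "How to bake bread",
]

idf = {t: inverse_document_frequency(t, documents) for t in ("habits", "how", "bread")}
scores = [tf_idf_score("How to form good habits", d, documents) for d in documents]

for doc, score in zip(documents, scores):
    print(f"{score:.2f}  {doc}")
```

Because "habits" appears in three of the four documents, it receives the lowest IDF weight, while a rare term such as "bread" receives the highest. The first document still scores highest for the query because it contains every query term.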
