202602141413
Status: #reference
Tags: Information Retrieval, Relevance Feedback
State: #nascient
Pseudo-Relevance Feedback
Sometimes an Information Retrieval system performs poorly not because of the system itself, but because of the query it was given. Search engines like Google and DuckDuckGo offer advanced search operators (often called dorking) to make your queries more specific, but this requires you to fine-tune the query by hand to retrieve the information you need.
Pseudo-relevance feedback, on the other hand, is a widely used query expansion method which attempts to compensate for the potential weakness of a user's query by making a reasonable assumption about the initial results.
Pseudo-Relevance Feedback is a type of blind/local feedback (which means it adjusts the results without the user being involved or aware of it). It's built on the assumption that, if your query is even somewhat well specified, the top documents retrieved by the system will be relevant to the information you are searching for.
So it takes your query, but instead of trying to find synonyms for the query terms (however one may do this), it runs the query as-is and takes the top-k documents retrieved by this first pass.
The system then analyses these documents and extracts relevant terms with which to augment your query. The hope is that the augmented query will find relevant documents more precisely.
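The loop described above can be sketched in a few lines of Python. This is only a minimal illustration under loud assumptions: the toy corpus, the term-overlap ranker `rank`, the helper `expand_query`, and the parameters `k` and `n_terms` are all hypothetical stand-ins, and raw term counts are used to pick expansion terms where a real system would use a proper retrieval model and term-weighting scheme.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberately naive assumption)."""
    return text.lower().split()


def rank(query_terms, docs):
    """Return indices of matching docs, best first, by simple term overlap.

    This is a stand-in for a real ranking function such as BM25.
    """
    scored = []
    for i, doc in enumerate(docs):
        counts = Counter(tokenize(doc))
        score = sum(counts[t] for t in query_terms)
        if score > 0:
            scored.append((score, i))
    # Highest score first; ties keep original document order.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scored]


def expand_query(query, docs, k=2, n_terms=2):
    """Pseudo-relevance feedback: assume the top-k docs are relevant,
    then append their most frequent non-query terms to the query."""
    query_terms = tokenize(query)
    top_k = rank(query_terms, docs)[:k]

    candidates = Counter()
    for i in top_k:
        candidates.update(t for t in tokenize(docs[i]) if t not in query_terms)

    # Most frequent first; alphabetical order breaks ties deterministically.
    best = sorted(candidates.items(), key=lambda kv: (-kv[1], kv[0]))[:n_terms]
    return query_terms + [term for term, _ in best]


# Hypothetical toy corpus, for illustration only.
docs = [
    "jaguar speed cat habitat cat",
    "jaguar cat prey habitat",
    "jaguar car engine speed",
    "python snake habitat",
]

expanded = expand_query("jaguar cat", docs)
print(expanded)
```

The expanded query would then be run through the ranker a second time. Note that nothing here checks whether the top-k documents really are relevant; the method simply trusts them.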
The advantages of this approach are that it's automatic and that it captures context that would be hard to encode with pure synonym search. The disadvantage is that, if the assumption is incorrect and the retrieved documents are not relevant, the expanded query will only find further irrelevant documents (a concept called topic drift, also known as query drift). Even if the assumption holds, it is quite possible to add noise by choosing low-signal terms.
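To make the low-signal-term problem concrete, here is a small hedged sketch of one common way to score candidate expansion terms: multiply how often a term occurs in the feedback documents by a smoothed inverse document frequency, so that terms appearing in nearly every document score close to zero. The note itself does not prescribe this weighting; the functions `idf` and `score_candidates` and the toy corpus are assumptions made for illustration.

```python
import math
from collections import Counter


def idf(term, corpus_token_sets):
    """Smoothed inverse document frequency: 0 when a term is in every doc."""
    df = sum(1 for tokens in corpus_token_sets if term in tokens)
    return math.log((1 + len(corpus_token_sets)) / (1 + df))


def score_candidates(feedback_docs, corpus):
    """Score each term in the feedback docs by (count in feedback) * idf."""
    corpus_token_sets = [set(doc.lower().split()) for doc in corpus]
    tf = Counter(t for doc in feedback_docs for t in doc.lower().split())
    return {term: count * idf(term, corpus_token_sets) for term, count in tf.items()}


# Hypothetical toy corpus; the first two docs play the role of the top-k.
corpus = [
    "the jaguar is a cat",
    "the jaguar hunts prey",
    "the car is fast",
    "the snake eats prey",
]
scores = score_candidates(corpus[:2], corpus)

# Print terms from highest to lowest score.
for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {score:.3f}")
```

In this toy corpus "the" is as frequent as any other term in the feedback documents, yet it scores zero because it occurs in every document, which is exactly the kind of low-signal term a raw count would wrongly promote.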
Relevant Links
| File | Folder | Last Modified |
|---|---|---|