Named Entity Recognition
Named Entity Recognition is currently one of the strongest task areas on the site. The main question is no longer whether NER can work for archive indexing, but which model family is the most defensible default and where LLMs still help.
Current View
The summary below gives a quick overview of the current position before you read the underlying benchmark posts.
Default recommendation
Use dedicated transformer NER models as the primary indexing path, selecting the strongest model by language and collection type.
Where LLMs fit
Use LLM extraction as a slower fallback or targeted enrichment path when dedicated models are weak or unavailable.
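The routing idea in the two recommendations above can be sketched as follows. This is a minimal illustration, not the site's actual pipeline: the `Router` class, the `(language, collection)` registry keys, the `"*"` wildcard for a language-wide model, and the stub extractors are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Entity = Tuple[str, str]  # (surface text, label)
Extractor = Callable[[str], List[Entity]]


@dataclass
class Router:
    # Hypothetical registry: (language, collection type) -> dedicated NER extractor.
    dedicated: Dict[Tuple[str, str], Extractor]
    # Hypothetical LLM-based extractor, used only when no dedicated model applies.
    llm_fallback: Optional[Extractor] = None

    def pick(self, language: str, collection: str) -> Tuple[str, Optional[Extractor]]:
        """Return (path name, extractor) for a document."""
        exact = self.dedicated.get((language, collection))
        if exact is not None:
            return "dedicated", exact
        # Assumption: "*" marks a language-wide dedicated model.
        generic = self.dedicated.get((language, "*"))
        if generic is not None:
            return "dedicated-generic", generic
        return "llm-fallback", self.llm_fallback

    def extract(self, text: str, language: str, collection: str) -> Tuple[str, List[Entity]]:
        """Run the selected extractor and report which path was used."""
        path, extractor = self.pick(language, collection)
        if extractor is None:
            return path, []
        return path, extractor(text)


# Stub extractors standing in for real models (illustrative only).
def stub_de_newspaper(text: str) -> List[Entity]:
    return [("Berlin", "LOC")]


def stub_llm(text: str) -> List[Entity]:
    return [("unknown", "MISC")]


router = Router(
    dedicated={("de", "newspaper"): stub_de_newspaper},
    llm_fallback=stub_llm,
)
result_de = router.extract("Beispieltext", "de", "newspaper")
result_fi = router.extract("esimerkki", "fi", "letters")
```

Returning the path name alongside the entities makes it easy to track how often the slower fallback is actually used for a given collection.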
Main caveat
Label inventories vary heavily across datasets, so mapped evaluation is necessary and direct score comparisons need care.
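One way to picture mapped evaluation is the sketch below, which projects dataset-specific labels onto a shared inventory before computing entity-level F1. The mapping table, the shared label names, and the choice to drop spans with unmapped labels are assumptions made for illustration and are not taken from the benchmark posts.

```python
from typing import Dict, Set, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, label)

# Hypothetical mapping from dataset-specific labels to a shared inventory.
# Collapsing GPE into LOCATION is a lossy choice made only for this example.
LABEL_MAP: Dict[str, str] = {
    "PER": "PERSON", "pers": "PERSON",
    "LOC": "LOCATION", "GPE": "LOCATION", "loc": "LOCATION",
    "ORG": "ORGANIZATION", "org": "ORGANIZATION",
}


def map_spans(spans: Set[Span], label_map: Dict[str, str]) -> Set[Span]:
    """Rewrite labels to the shared inventory; drop spans with no mapping."""
    return {(s, e, label_map[lab]) for s, e, lab in spans if lab in label_map}


def mapped_f1(gold: Set[Span], pred: Set[Span], label_map: Dict[str, str]) -> float:
    """Exact-match entity-level F1 computed on mapped labels."""
    g = map_spans(gold, label_map)
    p = map_spans(pred, label_map)
    tp = len(g & p)
    if tp == 0:
        return 0.0
    precision = tp / len(p)
    recall = tp / len(g)
    return 2 * precision * recall / (precision + recall)


# Gold uses one label scheme, predictions another; DATE has no shared target.
gold = {(0, 6, "PER"), (10, 16, "GPE"), (20, 24, "DATE")}
pred = {(0, 6, "pers"), (10, 16, "org")}
score = mapped_f1(gold, pred, LABEL_MAP)  # 0.5: one of two mapped spans matches
```

Dropping unmapped labels changes the denominator, which is one reason scores from different label inventories are not directly comparable even after mapping.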
Current signal
The strongest current evidence still favors dedicated NER over prompt-based LLM extraction for routine multilingual archive indexing.
Evidence Notes
These posts contain the detailed benchmark tables, setup notes, and interpretation behind the current recommendation.
Large Language Models for Entities?
LLM-based NER is useful as a fallback and experimentation path, but current results show it to be slower and usually less accurate than dedicated NER models.
Named Entity Recognition
Dedicated NER models remain the strongest default for high-throughput archive indexing, especially when they are routed by language and text domain.