
Named Entity Recognition

Named Entity Recognition is currently one of the strongest task areas on the site. The main question is no longer whether NER can work for archive indexing, but which model family is the most defensible default and where LLMs still help.

Current View

The summary below gives a quick overview of the current position before you read the underlying benchmark posts.

Default recommendation

Use dedicated transformer NER models as the primary indexing path, selecting the strongest model by language and collection type.

Where LLMs fit

Use LLM extraction as a slower fallback or targeted enrichment path when dedicated models are weak or unavailable.
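The routing idea can be sketched roughly as follows. This is a minimal illustration, not the site's actual pipeline: the `NerRouter` class, the `(language, collection_type)` registry keys, the `"*"` wildcard, and both stub extractors are hypothetical names introduced here. It only covers the "unavailable" case; treating a dedicated model as "weak" would need a quality threshold that is not shown.

```python
from typing import Callable, Dict, List, Optional, Tuple

Entity = Tuple[str, str]                      # (surface text, label)
Extractor = Callable[[str], List[Entity]]     # any callable: text -> entities


class NerRouter:
    """Hypothetical router: dedicated model first, LLM extraction as fallback."""

    def __init__(self, dedicated: Dict[Tuple[str, str], Extractor],
                 llm_fallback: Extractor) -> None:
        # Keys are (language, collection_type); "*" matches any collection type.
        self.dedicated = dedicated
        self.llm_fallback = llm_fallback

    def pick(self, language: str, collection_type: str) -> Optional[Extractor]:
        # Prefer an exact match, then a language-wide default.
        exact = self.dedicated.get((language, collection_type))
        if exact is not None:
            return exact
        return self.dedicated.get((language, "*"))

    def extract(self, text: str, language: str,
                collection_type: str) -> Tuple[str, List[Entity]]:
        extractor = self.pick(language, collection_type)
        if extractor is None:
            # No dedicated model registered: use the slower LLM path.
            return "llm_fallback", self.llm_fallback(text)
        return "dedicated", extractor(text)


# Stand-in extractors so the sketch runs without any model downloads.
def stub_german_newspaper_model(text: str) -> List[Entity]:
    return [("Berlin", "LOC")] if "Berlin" in text else []


def stub_llm_extractor(text: str) -> List[Entity]:
    return []


router = NerRouter(
    dedicated={("de", "newspaper"): stub_german_newspaper_model},
    llm_fallback=stub_llm_extractor,
)
path, entities = router.extract("Brief aus Berlin", "de", "newspaper")
```

Returning the path name alongside the entities makes it easy to record how much of a collection went through the fallback, which is useful when deciding whether a language needs its own dedicated model.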

Main caveat

Label inventories vary heavily across datasets, so scores should be computed after mapping each dataset's labels onto a shared inventory, and direct score comparisons across datasets need care.
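As a rough illustration of mapped evaluation, the sketch below projects gold and predicted labels onto a shared inventory before computing exact-match, entity-level precision, recall, and F1. The `SHARED_LABELS` mapping, the span format, and the choice to drop unmapped labels are assumptions made for this example, not the setup used in the benchmark posts.

```python
from typing import Dict, Iterable, Set, Tuple

Span = Tuple[int, int, str]   # (start offset, end offset, label)

# Hypothetical mapping from dataset-specific labels to a shared inventory.
# Labels missing from the mapping (e.g. DATE) are excluded from scoring.
SHARED_LABELS: Dict[str, str] = {
    "PER": "PER", "PERSON": "PER",
    "LOC": "LOC", "GPE": "LOC",
    "ORG": "ORG",
}


def map_spans(spans: Iterable[Span], mapping: Dict[str, str]) -> Set[Span]:
    """Rewrite labels into the shared inventory, dropping unmapped ones."""
    mapped = set()
    for start, end, label in spans:
        target = mapping.get(label)
        if target is not None:
            mapped.add((start, end, target))
    return mapped


def mapped_scores(gold: Iterable[Span], predicted: Iterable[Span],
                  mapping: Dict[str, str]) -> Tuple[float, float, float]:
    """Exact-match entity-level precision, recall, F1 after label mapping."""
    g = map_spans(gold, mapping)
    p = map_spans(predicted, mapping)
    true_positives = len(g & p)
    precision = true_positives / len(p) if p else 0.0
    recall = true_positives / len(g) if g else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


gold = [(0, 5, "PERSON"), (10, 16, "GPE"), (20, 24, "DATE")]
pred = [(0, 5, "PER"), (10, 16, "ORG")]
precision, recall, f1 = mapped_scores(gold, pred, SHARED_LABELS)
```

In this toy case `PERSON` and `PER` count as a match after mapping, the `GPE` span is scored as a miss because the prediction says `ORG`, and the `DATE` span is ignored because the shared inventory has no equivalent. How unmapped labels are treated changes the scores, which is one reason direct comparisons need care.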

Current signal

The strongest current evidence still favors dedicated NER over prompt-based LLM extraction for routine multilingual archive indexing.

Evidence Notes

These posts contain the detailed benchmark tables, setup notes, and interpretation behind the current recommendation.

📇 NER 🤖 LLM 2️⃣ Secondary

Large Language Models for Entities?

LLM-based NER is useful as a fallback and experimentation path, but current results show it to be slower and usually less accurate than dedicated NER models.

📇 NER 1️⃣ Preliminary

Named Entity Recognition

Dedicated NER models remain the strongest default for high-throughput archive indexing, especially when they are routed by language and text domain.