Resource-efficient LLM system for local-service search
A production-oriented LLM/SLM system for local-service search. It combines structured extraction, point-of-interest (POI) grounding, LLM-as-a-judge verification, agent planning, RL-style post-training, and nearline cache-based deployment. The system improved multilingual query understanding, increased cache reuse, and reduced online serving pressure under real search-engine constraints.
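
To make the nearline cache idea concrete, here is a minimal sketch of how such a deployment could work; it is an illustration, not the system's actual implementation. It assumes that model outputs are keyed by a normalized query string, that a cache miss is queued for later processing instead of calling the model online, and that a background job fills the cache. The names `normalize_query`, `NearlineCache`, `lookup`, and `run_nearline_batch` are hypothetical, as is the normalization rule (NFKC, casefold, whitespace collapse).

```python
import unicodedata
from collections import deque
from typing import Callable, Optional


def normalize_query(query: str) -> str:
    """Hypothetical normalization: NFKC, casefold, collapse whitespace.

    Mapping surface variants to one key is what lets different spellings
    of the same query share a single cache entry.
    """
    text = unicodedata.normalize("NFKC", query).casefold()
    return " ".join(text.split())


class NearlineCache:
    """Illustrative cache: the online path only reads; a nearline job writes."""

    def __init__(self, slow_model: Callable[[str], dict]) -> None:
        # slow_model stands in for an expensive LLM/SLM extraction call (assumption).
        self._slow_model = slow_model
        self._store: dict = {}
        self._pending: deque = deque()
        self.hits = 0
        self.misses = 0

    def lookup(self, query: str) -> Optional[dict]:
        """Online path: return a cached result, or None and queue the query."""
        key = normalize_query(query)
        result = self._store.get(key)
        if result is not None:
            self.hits += 1
            return result
        self.misses += 1
        if key not in self._pending:
            self._pending.append(key)
        return None

    def run_nearline_batch(self) -> int:
        """Nearline path: run the model on queued queries and fill the cache."""
        processed = 0
        while self._pending:
            key = self._pending.popleft()
            self._store[key] = self._slow_model(key)
            processed += 1
        return processed


def fake_extractor(query: str) -> dict:
    # Placeholder for structured extraction; the fields are invented for the demo.
    return {"query": query, "category": "unknown"}


cache = NearlineCache(fake_extractor)
first = cache.lookup("  Coffee   Shop ")   # miss: nothing cached yet, query is queued
filled = cache.run_nearline_batch()        # background job fills the cache
second = cache.lookup("coffee shop")       # hit: same normalized key
```

In this sketch the online request never waits on the model: a miss returns `None` so the caller can fall back to a cheaper path, and later requests for the same normalized query are served from the cache. That is one plausible way a nearline design could raise cache reuse and lower online serving load.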
