Abstract: Recently, researchers in the field of math word problem (MWP) solving have reported performance metrics for various large language models (LLMs) on benchmark datasets, with some models ...
GSM8K-V is a purely visual multi-image mathematical reasoning benchmark that systematically maps each GSM8K math word problem into its visual counterpart to enable a clean, within-item comparison ...
Seeking a review of the Supreme Court’s August 12 order, which paused coercive action against end-of-life vehicles (ELVs) to curb air pollution in Delhi-NCR, the Commission for Air Quality Monitoring ...
Reflecting on the past five years of employment trends across advertising, it’s clear marketers and employers have been on a disorienting, turbulent roller coaster ride. For a brief time, the ...
Abstract: The accessibility and quality of education in Sri Lanka face significant disparities, particularly between rural and urban areas. This research developed a personalized Intelligent Tutoring ...
Chinese AI startup DeepSeek has released two powerful new AI models that the company claims match or exceed the capabilities of OpenAI's GPT-5 and Google's Gemini-3.0-Pro — a development that could ...