Specifically, we show how language models, including the transformer models that underlie large language models such as BERT and GPT, can handle numerical information, and in particular ...