Text rendering is actually quite complicated and a core game toolkit feature. If your game toolkit can't do bitmap fonts, I would suggest looking at a different toolkit. I understand many game toolkits aren't good at text rendering, which IMO says quite a lot about many game toolkits. 😉
I'm not convinced that Spine is the right place to implement text rendering. It's only related to animation in that something is rendered to the screen. Apps will likely want to render more text than just what is in their skeletons, at which point the Spine Runtimes are providing general game toolkit features. Maintaining the Spine Runtimes is already quite a task! For text rendering, at a minimum, you need to load the font data, lay out the glyphs, and render them efficiently. Doing all that would make the Spine Runtimes more of a game toolkit than most game toolkits! Most game toolkits don't have a proper way to render efficiently, instead naively forcing all rendering through a scene graph. We'd not only need to implement and port the core text rendering routines, but also slog through how to render it efficiently using the various terrible game toolkit APIs.
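To give a rough idea of just the "lay out the glyphs" step, here is a minimal sketch of single-line bitmap font layout. The `Glyph` fields and the `layout_line` function are hypothetical names for illustration, loosely modeled on the offset, advance, and kerning data that bitmap font files commonly carry. It assumes left-to-right text on one line and ignores wrapping, alignment, markup, and complex scripts, which is where most of the real work is.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    """Per-glyph metrics (hypothetical structure, for illustration only)."""
    x_offset: float   # horizontal offset from the pen position to the glyph image
    y_offset: float   # vertical offset from the line origin to the glyph image
    x_advance: float  # how far the pen moves after drawing this glyph


def layout_line(text, glyphs, kerning, start_x=0.0, start_y=0.0):
    """Place the glyphs of one left-to-right line.

    glyphs:  dict mapping a character to its Glyph
    kerning: dict mapping (previous_char, char) to a pen adjustment
    Returns (placements, end_x), where placements is a list of (char, x, y).
    """
    pen_x = start_x
    placements = []
    prev = None
    for ch in text:
        glyph = glyphs.get(ch)
        if glyph is None:
            # Assumption: characters missing from the font are skipped.
            prev = None
            continue
        if prev is not None:
            pen_x += kerning.get((prev, ch), 0.0)
        placements.append((ch, pen_x + glyph.x_offset, start_y + glyph.y_offset))
        pen_x += glyph.x_advance
        prev = ch
    return placements, pen_x


# Usage with made-up metrics for two glyphs and one kerning pair.
font = {"A": Glyph(1.0, 0.0, 10.0), "V": Glyph(0.0, 0.0, 9.0)}
pairs = {("A", "V"): -2.0}
placements, end_x = layout_line("AV", font, pairs)
```

Even this toy version needs font data loaded from somewhere and still leaves the efficient rendering of the resulting quads to the toolkit.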
Also, supporting generation of a bitmap font from a TTF is very complex (typically done with FreeType), but is necessary to support rendering arbitrary text (e.g., allowing the user to type) in CJK and other languages which have many glyphs. Properly supporting right-to-left languages and combining marks requires something like HarfBuzz and is even more complicated than anything mentioned so far.
BTW, I wrote Hiero long ago, as well as the "other software" wiki page. 🙂
A workaround can be to animate bones in Spine, then render text using your game toolkit, but transform the rendering by the bone matrices. A placeholder image could be used in Spine so you can see the text, then at runtime the text would be rendered dynamically. This could be made a reusable component and is something that could be included in some Spine Runtimes. Not all, because many don't allow enough control over rendering to make it feasible.
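As a rough sketch of the math involved in that workaround: a bone's world transform can be treated as a 2x2 matrix plus a translation, and each corner of the text quads your toolkit produces gets pushed through it before drawing. The `BoneTransform` class and the `bone_from_pose` and `transform_quad` helpers below are hypothetical stand-ins, not the Spine Runtimes API. The sketch assumes no shear and assumes your toolkit gives you access to the glyph quad vertices.

```python
import math
from dataclasses import dataclass


@dataclass
class BoneTransform:
    """World transform of a bone: 2x2 matrix (a, b, c, d) plus translation.

    Hypothetical stand-in for whatever your runtime exposes for a bone.
    """
    a: float
    b: float
    c: float
    d: float
    world_x: float
    world_y: float

    def apply(self, x, y):
        """Map a point from the bone's local space to world space."""
        return (self.a * x + self.b * y + self.world_x,
                self.c * x + self.d * y + self.world_y)


def bone_from_pose(x, y, rotation_degrees, scale_x=1.0, scale_y=1.0):
    """Build a transform from position, rotation, and scale (no shear)."""
    r = math.radians(rotation_degrees)
    return BoneTransform(
        a=math.cos(r) * scale_x, b=-math.sin(r) * scale_y,
        c=math.sin(r) * scale_x, d=math.cos(r) * scale_y,
        world_x=x, world_y=y,
    )


def transform_quad(bone, quad):
    """Transform the corners of one glyph quad, given as a list of (x, y)."""
    return [bone.apply(x, y) for (x, y) in quad]


# Usage: a 10x5 glyph quad laid out at the origin in the bone's local space,
# drawn with a bone positioned at (100, 50) and rotated 90 degrees.
bone = bone_from_pose(100.0, 50.0, 90.0)
glyph_quad = [(0.0, 0.0), (10.0, 0.0), (10.0, 5.0), (0.0, 5.0)]
world_quad = transform_quad(bone, glyph_quad)
```

In practice you would lay out the text once relative to the placeholder's origin, then re-apply the bone transform each frame as the animation plays. Whether you can do this at all depends on the toolkit letting you get at the vertices or set a transform for the text draw call.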
A similar argument could be made for UI. It would be great for Spine to support animating UI components, but it's not reasonable to implement a complete UI solution in the Spine Runtimes. Instead, animate a placeholder in Spine, then transform the actual UI rendering at runtime.
We (badlogic and I) already implemented all of the above in libgdx, which we maintain and make available 100% for free. libgdx was a massive undertaking with over 1M lines of code and hundreds of contributors. Even relatively small portions of it don't really make sense to include in an animation API.