You certainly need to follow what your target users need, so long as you can do it ethically and sustainably.
The question is really the rate of change, though, not the change itself. If the change occurs slowly enough, odds are that more users will be okay with it. There’s also the question of measuring the users’ experience (which really needs to be done strictly passively without getting in the users’ way) to know how to “fade” out the cues.
I find it useful to define an ideal as a target for discussion, and then see how close to it one can feasibly get. Consider the Medium floaty thingy: the beginner would probably benefit from seeing it all the time. Ideally, the system would track every keystroke and mouse movement, detect when the inputs suggest that the user knows how to use it (e.g., the mouse moves directly to the right location to find the doohickey, without wandering to several different locations), and then start fading it over the subsequent few hours of use. Different users would get different experiences depending on their level of engagement.
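To make that ideal a bit more concrete, here is a rough Python sketch of the detect-then-fade logic. Everything in it is an assumption for illustration rather than anything Medium actually does: the names `path_directness` and `CueFader`, the "directness" measure (straight-line distance divided by distance actually travelled), the 0.8 threshold, the streak of five direct approaches, and the three-hour linear fade are all hypothetical placeholders you would tune against real usage data.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def path_directness(points: List[Point]) -> float:
    """Straight-line distance divided by distance travelled (1.0 = perfectly direct)."""
    if len(points) < 2:
        return 1.0
    travelled = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if travelled == 0:
        return 1.0
    return math.dist(points[0], points[-1]) / travelled


@dataclass
class CueFader:
    # All defaults are hypothetical values chosen for illustration only.
    directness_threshold: float = 0.8  # how straight a mouse path must be
    streak_needed: int = 5             # consecutive direct approaches required
    fade_hours: float = 3.0            # hours of use over which the cue fades
    streak: int = 0
    fade_started_at: Optional[float] = None  # usage hours when fading began

    def record_approach(self, points: List[Point], usage_hours: float) -> None:
        """Record one passively observed mouse path that ended at the control."""
        if path_directness(points) >= self.directness_threshold:
            self.streak += 1
        else:
            self.streak = 0  # wandering resets the streak
        if self.fade_started_at is None and self.streak >= self.streak_needed:
            self.fade_started_at = usage_hours

    def opacity(self, usage_hours: float) -> float:
        """Cue opacity: 1.0 until proficiency is detected, then a linear fade to 0.0."""
        if self.fade_started_at is None:
            return 1.0
        elapsed = usage_hours - self.fade_started_at
        return min(1.0, max(0.0, 1.0 - elapsed / self.fade_hours))
```

With `streak_needed=2` and `fade_hours=2.0`, two direct approaches by the 1.5-hour mark start the fade, and `opacity(2.5)` comes out at 0.5. Measuring fade time in hours of active use rather than wall-clock time is a deliberate choice here, so that occasional users are not rushed through the fade.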
How you’d actually implement this depends on all kinds of factors that will vary between teams, organizations, apps, etc. But if you don’t aim for a high bar, you’ll never know how high you can actually go.
Yes, it requires more work and maintenance. But if it helps the users, then it’s worth it. And if users are more comfortable, they’re more likely to tell their friends to use your system.
You know those tiny, smooth “transition” animations that are now ubiquitous in interfaces? I remember when they, too, were seen as burdensome and expensive to develop and maintain.