The first thing you need is a macro, which I call SpeakA, which contains this action:

That's the routine that speaks text asynchronously (hence the A). Nothing in this macro makes it run asynchronously; that's flagged in the other macro, which is as follows. The explanation follows the code.

Firstly, I'm using the application Pages, so I activate it. If you are using something else, feel free to substitute it, but you may have to change some of the timing.
Secondly, I assume the user has already selected the text and triggered this macro with a hotkey.
Thirdly, I initiate the asynchronous reading of the text. Note the flag that marks the call as asynchronous.
Fourthly, I use a shell command to calculate the size of the spoken text file. It assumes the default voice. If you prefer a different voice, feel free to change it, but that might change the timing.
Then I pause for three seconds. That's a totally arbitrary value. My program doesn't know how far down the screen the selected text begins. If it starts at the top of the screen, you may want to change this to 30 seconds to give the speech synthesizer a chance to reach the middle of the screen before the rest of the code kicks in.
Lastly, there's a loop that scrolls for a duration derived from the size of the speech file. There are two very important numbers in this code that, as I told you in a previous post, you would have to tinker with. The first is 0.07, which is how long each tick of the scroll pauses for. This is basically the speed of the scrolling. It is very dependent on your fonts and screen size (even your video driver). Since I have no way of knowing those values for your screen, you will have to experiment with it. The second is 8. The size of the speech file (which represents the duration) is divided by this constant to tell KM how long to keep scrolling for. This value may also need tinkering at your end, depending on things like the voice you choose and the speed of your CPU. Of course, I can't predict those things.
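If it helps to see the timing arithmetic outside Keyboard Maestro, here is a rough Python sketch of the pause-then-scroll logic described above. It is not the macro itself. The function names, the `scroll_one_tick` callback, and the reading of "size divided by 8" as a duration in seconds are all assumptions made for illustration; the unit of the size depends on whatever your shell command reports.

```python
import time
from typing import Callable, Tuple


def scroll_plan(speech_file_size: float,
                divisor: float = 8,
                tick_pause: float = 0.07) -> Tuple[float, int]:
    """Return (scroll_duration, tick_count) for a given speech file size.

    Assumption: speech_file_size / divisor is treated as the number of
    seconds to keep scrolling. Both constants need tuning per machine.
    """
    if divisor <= 0 or tick_pause <= 0:
        raise ValueError("divisor and tick_pause must be positive")
    duration = speech_file_size / divisor
    ticks = int(duration / tick_pause)  # whole scroll ticks that fit
    return duration, ticks


def run_scroll(speech_file_size: float,
               scroll_one_tick: Callable[[], None],
               divisor: float = 8,
               tick_pause: float = 0.07,
               initial_delay: float = 3.0,
               sleep: Callable[[float], None] = time.sleep) -> int:
    """Wait for the initial delay, then scroll one tick at a time.

    scroll_one_tick is a hypothetical stand-in for the scroll action;
    sleep is injectable so the loop can be tried without real waiting.
    """
    sleep(initial_delay)  # the arbitrary head start for the speech
    _, ticks = scroll_plan(speech_file_size, divisor, tick_pause)
    for _ in range(ticks):
        scroll_one_tick()
        sleep(tick_pause)  # smaller pause = faster scrolling
    return ticks
```

Raising the divisor shortens the total scrolling time, and lowering the tick pause makes the page move faster, so the two numbers have to be tuned together.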
This code works perfectly for me. It does exactly what I think you described: it keeps the part of the selected text that's being read right in the middle of the screen until the end of the speech.
Try my code and tinker with the two values, 8 and 0.07. Maybe those values will work for you as they are, but my fonts are very large on my screen and yours are probably smaller, so I'm guessing you will have to change them.