From Simon’s post, "Roger Taylor and John Strong gave me both some pointers yesterday regarding the transparent sprite copy routine…..
I proceeded to correct those errors today and came up with a v2 [edit: version 5 actually]…. then my good friend Pere Serrat from the dragon scene decided he needed a bit of a break from his current project and decided to provide some input
Pere Serrat, Hugo Dufort and i have had a collab before working on my whacky PAL artifact stuff…
@pere and i have bounced this transparent sprite thing back and forth for the last 2 hours or so, providing input to each other…
so, this is what we came up with…
and it’s ALOT different than v1 or v2"
Version 3 on left, 5 on right.
Things get interesting here, as the changes are a little more subtle. Put on your mad cap! You might be surprised with the result in this version.
Lines 7–8: Simon just reversed the load order. No, he’s not crazy. Okay yes he is, but that’s not why he did it.
You will also notice he tossed out clr ,u from version 3, as it is no longer useful. It actually wasn’t useful in version 3 either. The CLR was there, I assume, to handle OR’ing the individual nibbles into the destination.
Version 3
11 [4+0] sta ,u ; update dest buffer with sprite
Version 5
14 [4+0] stb ,u ; update dest buffer
In both cases, the store negates the usefulness of CLR, because sta/stb overwrites the entire byte of the destination.
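To see why the CLR buys nothing, here is a minimal Python sketch (a model, not 6809 code; all function names are made up for illustration). It compares a plain store to ,u with and without a preceding clear, and contrasts that with an OR-style merge, which is only the text's guess at what the CLR was meant to support.

```python
def store(dest: int, reg: int) -> int:
    """Model of `sta ,u` / `stb ,u`: the old destination byte is ignored."""
    return reg & 0xFF


def clr_then_store(dest: int, reg: int) -> int:
    """Model of `clr ,u` followed by a store."""
    dest = 0  # clr ,u
    return store(dest, reg)


def or_into(dest: int, reg: int) -> int:
    """Hypothetical OR-style merge, where old destination bits survive."""
    return (dest | reg) & 0xFF


# A plain store gives the same byte whether or not the destination was cleared.
same = all(
    store(d, r) == clr_then_store(d, r)
    for d in range(256)
    for r in range(256)
)
print("clear makes no difference to a store:", same)

# Only a merge that keeps old bits would care about the clear.
print(hex(or_into(0x0F, 0x30)), hex(or_into(0x00, 0x30)))
```

The exhaustive check covers every destination and register value, so the clear can be dropped without changing what ends up in the buffer.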
Lines 10–14: here we deal with the left nibble. In the quest to drop instructions, he managed to drop one. By loading B with the sprite byte (line 12) after the bita test, he is able to reuse the instructions at lines 13 and 14.
10 [2] seeN1 bita #$f0 ; test left nibble
11 [5] beq useB1 ; if zero use background
12 [4+1] ldb -1,x ; else get sprite byte
13 [2] useB1 andb #$f0 ; use only left nibble and clear right one
14 [4+0] stb ,u ; update dest buffer
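The trick is easier to trust once both versions are shown to write the same byte. The following is a rough Python model of the left-nibble step only, under the assumption that each register can be treated as a plain 8-bit integer. The function names are hypothetical, and the right-nibble handling at seeN2 is left out.

```python
def left_nibble_v3(sprite: int, background: int) -> int:
    """Model of version 3: mask A first, then pick a store."""
    a = sprite            # lda ,x
    b = background        # ldb ,y
    a &= 0xF0             # anda #$f0
    if a == 0:            # beq useB1 (taken)
        b &= 0xF0         # useB1 andb #$f0
        return b          # stb ,u
    return a              # sta ,u (then bra seeN2)


def left_nibble_v5(sprite: int, background: int) -> int:
    """Model of version 5: test A, optionally reload B, share the tail."""
    b = background        # ldb ,y+
    a = sprite            # lda ,x+
    if a & 0xF0:          # bita #$f0 / beq useB1 (not taken)
        b = sprite        # ldb -1,x re-reads the sprite byte
    b &= 0xF0             # useB1 andb #$f0
    return b              # stb ,u


# Compare the two models over every sprite/background byte pair.
mismatches = [
    (s, bg)
    for s in range(256)
    for bg in range(256)
    if left_nibble_v3(s, bg) != left_nibble_v5(s, bg)
]
print("mismatches:", len(mismatches))
```

Because bita only sets flags and leaves A untouched, version 5 can fall through into the shared andb/stb pair on both paths, which is where the dropped instruction comes from.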
But is it faster? Let’s take a look at the number of instructions that have to be run when the left nibble of A is zero, and when it is not.
Number of instructions
Version | Not 0 | Equals 0 | Total |
---|---|---|---|
5 | 5 | 4 | 9 |
3 | 4 | 4 | 8 |
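For anyone who wants to check the tally, here is a small Python sketch that lists the instructions on each path and counts them. It assumes the count starts at the seeN1 label and ends at the store to ,u, leaving out the two loads, since that is what reproduces the figures in the table. The dictionary layout and names are purely illustrative.

```python
# Instructions executed from seeN1 through the store to ,u.
# Assumption: the two loads before seeN1 are not counted.
PATHS = {
    3: {
        "not_zero": ["anda #$f0", "beq useB1", "sta ,u", "bra seeN2"],
        "zero": ["anda #$f0", "beq useB1", "andb #$f0", "stb ,u"],
    },
    5: {
        "not_zero": ["bita #$f0", "beq useB1", "ldb -1,x", "andb #$f0", "stb ,u"],
        "zero": ["bita #$f0", "beq useB1", "andb #$f0", "stb ,u"],
    },
}


def counts(version: int) -> tuple[int, int, int]:
    """Return (not zero, equals zero, total) instruction counts."""
    not_zero = len(PATHS[version]["not_zero"])
    zero = len(PATHS[version]["zero"])
    return not_zero, zero, not_zero + zero


for version in (5, 3):
    print(version, *counts(version))
```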
Hmm. Simon is actually running more instructions in the new version. Bad Simon, bad!
Since instruction count isn’t the real deciding factor, let’s take a closer look at this and include cycle counts. Let’s not include the CLR instruction from version 3 to be a little fairer.
Cycle counts
Version | Not 0 | Equals 0 | Total |
---|---|---|---|
5 | 30 | 25 | 55 |
3 | 24 | 21 | 45 |
Version 3
06 [4+0] loop1 lda ,x ; get sprite byte
07 [4+0] ldb ,y ; get background byte
;08 [6+0] clr ,u ; clean dest buffer
09 [2] seeN1 anda #$f0 ; use left nibble
10 [5] beq useB1 ; if zero use background
11 [4+0] sta ,u ; update dest buffer with sprite
12 [5] bra seeN2 ; test right nibble
13
14 [2] useB1 andb #$f0 ; use background left nibble
15 [4+0] stb ,u ; update dest buffer
Version 5
07 [4+2] loop1 ldb ,y+ ; get background byte
08 [4+2] lda ,x+ ; get sprite byte
09
10 [2] seeN1 bita #$f0 ; test left nibble
11 [5] beq useB1 ; if zero use background
12 [4+1] ldb -1,x ; else get sprite byte
13 [2] useB1 andb #$f0 ; use only left nibble and clear right one
14 [4+0] stb ,u ; update dest buffer
So version 3, even with its extra instruction, comes to 45 cycles versus 55 for version 5. Oops.
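The totals can be reproduced with a short Python sketch that sums the bracketed cycle figures along each path. The figures are taken exactly as printed in the two listings (with [4+2] read as 6), not checked against a 6809 reference, and the commented-out CLR is excluded. The names used here are illustrative only.

```python
# (instruction, cycles) pairs, cycles as printed in the listings above.
V3_NOT_ZERO = [("lda ,x", 4), ("ldb ,y", 4), ("anda #$f0", 2),
               ("beq useB1", 5), ("sta ,u", 4), ("bra seeN2", 5)]
V3_ZERO = [("lda ,x", 4), ("ldb ,y", 4), ("anda #$f0", 2),
           ("beq useB1", 5), ("andb #$f0", 2), ("stb ,u", 4)]

V5_NOT_ZERO = [("ldb ,y+", 6), ("lda ,x+", 6), ("bita #$f0", 2),
               ("beq useB1", 5), ("ldb -1,x", 5), ("andb #$f0", 2),
               ("stb ,u", 4)]
V5_ZERO = [("ldb ,y+", 6), ("lda ,x+", 6), ("bita #$f0", 2),
           ("beq useB1", 5), ("andb #$f0", 2), ("stb ,u", 4)]


def cycles(path: list[tuple[str, int]]) -> int:
    """Sum the cycle column of one execution path."""
    return sum(cost for _, cost in path)


v3 = (cycles(V3_NOT_ZERO), cycles(V3_ZERO))
v5 = (cycles(V5_NOT_ZERO), cycles(V5_ZERO))

print("version 3:", v3, "total", sum(v3))
print("version 5:", v5, "total", sum(v5))
```

One thing the breakdown makes visible: in these listings the version 5 loads use post-increment ([4+2] each) while the version 3 loads do not ([4+0]), which accounts for 4 cycles per path, or 8 of the 10-cycle gap. Whether version 3 pays for its pointer increments somewhere else is not shown in this excerpt.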
Copyright © 2025, Lee Patterson