# Raspberry Casket

A fast and small open source Pretracker replayer

## Raspberry Casket Player V2.x (10-Aug-2023)

Provided by Chris 'platon42' Hodges <chrisly@platon42.de>

Rewritten by *platon42/Desire* based on a resourced, binary identical
version of the original Pretracker V1.0 replayer binary provided
by *hitchhikr* (thanks!), originally written in C by *Pink/Abyss*.

This version is the hard work of reverse engineering all the
offsets, removing all the C compiler crud, removing dead and
surplus code (maybe artefacts from earlier ideas that did nothing),
and optimizing the code where possible. This resulted in a smaller
replayer, faster sample calculation and a significantly faster
tick routine.
Bugs from the original replayer were fixed.

I also added a few optional features that come in handy, such as
song-end detection and precalc progress support.

Note that this player can also be used for the playback of
Pretracker 1.5 tunes, provided you don't use sfx or sub-songs.

Also: Open source. It's 2023, keeping the code closed is just not
part of the demoscene spirit (anymore?), at least for a replayer.

This player has been optimized and worked on continuously since its
first release in late 2022.

Productions that I know have been using Raspberry Casket so far:

- [Smooth Flowing/Dekadence](https://www.pouet.net/prod.php?which=94347)
- [Cracking Posadas/Software Failure](https://www.pouet.net/prod.php?which=94570)

### Verification

The first versions of the replayer were verified against about
60 Pretracker tunes to produce an identical internal state for each
tick and identical samples (if certain optimization switches are disabled).

During the process, this promise of identical state and identical samples
had to be dropped due to bugs in the original player and optimizations.
This is especially the case for the track delay feature of Pretracker,
which could in some cases cause odd behaviour and unwanted muting that
has been fixed in Raspberry Casket. So the verification is now heavily
reduced to about 20 songs that are still identical.

I do, however, now also have an emulated Paula output verification that
compares the generated sound between the original code and Raspberry Casket.
Divergences are manually checked from time to time.

If you find any problems,
please let me know at chrisly@platon42.de. Thank you.

### Usage

The new replayer comes as a drop-in binary replacement if you wish.
In this case you will get faster sample generation (about 12%
faster on 68000) and about 45% less CPU time spent during playback.
However, you won't get features such as song-end detection and precalc
progress this way.
This mode uses the old CPU DMA wait that takes away 8 raster lines.

If you want to get rid of the unnecessary waiting, you can switch
to a copper driven audio control. If you want to use the top portion
of the copperlist for this, you probably need to double buffer it.
Otherwise, you could also position the copperlist at the end of
the display and use single buffering if you call the tick routine
during the vertical blank.

Please use the documented sizes for the `MySong` and `MyPlayer` data
structures, which are the symbols `sv_SIZEOF` and `pv_SIZEOF`
respectively (about 2KB and 12KB with volume table).

The source needs two common include files to compile (`custom.i` and
`dmabits.i`). You should leave assembler optimizations enabled.

1. (If you're using copper list mode, call `pre_PrepareCopperlist`.)
2. Call `pre_SongInit` with
   - a pointer to `MySong` (`sv_SIZEOF`) in `a1` and
   - the music data in `a2`.

   It will return the amount of sample memory needed in `d0`.
3. Then call `pre_PlayerInit` with
   - a pointer to `MyPlayer` (`pv_SIZEOF`) in `a0`
   - a pointer to chip memory sample buffer in `a1`
   - the pointer to `MySong` in `a2`
   - a pointer to a longword for progress information or null in `a3`

   This will create the samples, too.
4. After that, regularly call `pre_PlayerTick` with `MyPlayer` in `a0`
   (and optionally the copperlist in `a1` if you're using that mode).
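
The calling sequence described in this section could look roughly like the
following. This is only a minimal sketch under assumptions, not code taken from
the replayer: `InitMusic`, `MusicTick`, `SongData`, `MySongBuf`, `MyPlayerBuf`,
`SampleBuf`, `SAMPLE_BUF_MAX` and the file name `song.prt` are hypothetical
names, the `d0` return codes of `InitMusic` are a convention of this sketch,
and the `section` syntax follows vasm/PhxAss style assemblers. It covers the
mode without copper list (no `pre_PrepareCopperlist`, nothing in `a1` for the
tick). Which registers the `pre_*` routines preserve is not stated here, so
every pointer is reloaded before each call.

```asm
; Assumes the replayer source (defining sv_SIZEOF, pv_SIZEOF and the
; pre_* routines) has been assembled before this point.

SAMPLE_BUF_MAX  equ     64*1024         ; assumed upper bound, adjust per song

InitMusic:
        lea     MySongBuf,a1            ; a1 = MySong structure
        lea     SongData,a2             ; a2 = Pretracker music data
        jsr     pre_SongInit            ; d0 = sample memory needed
        cmp.l   #SAMPLE_BUF_MAX,d0
        bhi.s   .toobig                 ; static buffer too small for this song

        lea     MyPlayerBuf,a0          ; a0 = MyPlayer structure
        lea     SampleBuf,a1            ; a1 = chip memory sample buffer
        lea     MySongBuf,a2            ; a2 = MySong, initialised above
        suba.l  a3,a3                   ; a3 = null: no progress reporting
        jsr     pre_PlayerInit          ; this also creates the samples
        moveq   #0,d0                   ; sketch convention: 0 = ok
        rts
.toobig:
        moveq   #-1,d0                  ; sketch convention: -1 = buffer too small
        rts

; Call this regularly, for example once per vertical blank.
MusicTick:
        lea     MyPlayerBuf,a0          ; a0 = MyPlayer structure
        jsr     pre_PlayerTick
        rts

        section musicdata,data
SongData:       incbin  "song.prt"      ; hypothetical file name

        section musicbss,bss
MySongBuf:      ds.b    sv_SIZEOF
                cnop    0,4             ; keep the next structure aligned
MyPlayerBuf:    ds.b    pv_SIZEOF

        section musicchip,bss_c         ; sample buffer must be in chip memory
SampleBuf:      ds.b    SAMPLE_BUF_MAX
```

A statically sized sample buffer keeps the sketch free of any memory
allocation code. If the size returned in `d0` should drive the allocation
instead, replace the comparison with a call to your own chip memory allocator.
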
### Size

The original C compiled code was... just bad. The new binary is
less than 1/3 of the original one.

The code has also been optimized in a way that it compresses better.
The original code compressed with *Blueberry's* Shrinkler goes from
18052 bytes down to 9023 bytes.

Raspberry Casket, depending on the features compiled in, is about
5802 bytes and shrinkles down to ~4125 bytes (in isolation).

So this means that the optimization is not just "on the outside".

About 2.4 KB of the code (and data) is spent on sample generation,
the remaining code on playback.

### Timing

#### Sample precalculation

Sample generation is faster than the original 1.0 player and also
faster than the 1.5 player, which performs slightly better
than the 1.0 one (compiler change?).

According to my measurements on my set of Pretracker tunes,
Raspberry Casket needs between 10% and 20% fewer instructions.
Of these instructions, about 5% are `muls` operations and the new
player is only able to shave off between 3% and 8% of those,
so this is probably the limiting factor.

#### Playback

Raspberry Casket is about twice as fast as the old replayer for playback.

Unfortunately, the replayer is still pretty slow and has high
jitter compared to other standard music replayers.

This means it may take up to 32 raster lines (13-18 on average),
which is significantly more than a standard Protracker replayer
(the original one could take about 60 raster lines worst case and
about 34 on average!).

Watch out for *Presto*, the [LightSpeedPlayer](https://github.com/arnaud-carre/LSPlayer) variant that should
solve this problem.

### Known issues

- Behaviour for undefined volume slides with both the up and the down nibble specified is different (e.g. A9A, hi Rapture!). Don't do that.
- Don't use loops with odd lengths and offsets (even if Pretracker allows this when dragging the loop points).
- Don't stop the music with F00 and use a note delay (EDx) on the same line.
- Don't try to play music with no waves, instruments or patterns.
- Pattern breaks with target row >= $7f will be ignored.
- Shinobi seems to have used an early beta version of Pretracker where it was possible to specify a Subloop Wait of 0. That's illegal and unsupported.
- Pattern break (Dxx) + Song pos (Bxx) on the same line does not work in the original Pretracker & Player: the new Dxx position is ignored.
  There is code to enable it in the player, so you could in theory make backwards running tracks like in Protracker.
  But this doesn't make sense as long as the tracker itself does not support it.
- Setting the same track delay multiple times will no longer mute the delayed channel.
- Clearing the track delay (multiple times) will no longer mute the delayed channel nor cause a delay of one tick to the note played in the no-longer delayed channel.

## Changelog

### V2.x (unreleased)
- Split wave generation out of main file, reorganised content into header files.
- Optimized some more code paths for Raspberry Casket replayer.
- In the wave generator optimized away a table (32 words), replacement code is even smaller!
- Replaced the period table by byte-deltas, saved 36 bytes and compression is even better!
- Optimized some code paths for octave selection.
- Removed two tables of 25 bytes each, saving another 42 bytes.
- Completely reworked track delay handling, fixed oddities and improved output quality.
  - This removes a big source of CPU jitter when track delay is enabled (no longer clearing the track delay buffer).
  - This also fixes uses of the illegal period 0 in the lead-in that could cause the replayer to miss the first trigger.
- Moved pattern table init from PlayerInit to SongInit, optimized SongInit a bit.
- Wave order table filling moved and optimized in SongInit.
- Added Presto player draft.
- Bugfix: Song-end detection for back-jumps was broken since at least V1.1.
- Optimized some more wave selection code.
- Drop-in replacement code size: 5802 bytes.

### V1.x (unreleased)
- Fixed a bug regarding the copper output mode with looping waves having a loop-offset.
- Fixed wrong register use on triggering waves regarding the loop offset.
- Minor code size optimizations.

### V1.1 (28-Dec-22)
- Optimized base displacement by reordering variables.
- Further optimized ADSR code.
- Optimized wave loop code.
- Baked this strange vibrato speed multiplication into the precalculated vibrato value (where possible).
- Various small optimizations.
- Store instrument number * 4 on loading to avoid using two adds every frame.
- Optimized speed/shuffle code. The idea of using xor turned out to make things too complicated for pattern breaks/jumps.
- Rearranged code for more short branches.
- Optimized track delay code further.
- Optimized pattern / song advance code.
- Maximum jitter now about one rasterline less, average about 0.5 rasterlines less (my measurements, your mileage may vary).
- Drop-in replacement code size: 6228 bytes.

### V1.0 (26-Dec-22)

- Initial release.
- Drop-in replacement code size: 6446 bytes.