@Moggy
The circuit is very simple: an octal flip-flop (74377) drives the resistor DAC network, and two of them can output stereo sound. The diodes enable the write to each flip-flop only on the proper address, e.g. $DF (0b11011111) for the A5 address line. A0 and A1 are reserved for NMI, keyboard and tape, and A2 for the printer, but the others seem free to use, or at least I didn't find anything else used in the base ZX81 ROM.
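To make the port arithmetic concrete, here is a small Python sketch (just a calculator, not ZX81 code). It assumes the scheme described above, where a device is selected when its single address line is low; the function names are made up for illustration. Note that the pattern 0b11011111 (A5 low) is $DF in hex, while $BF would be A6 low.

```python
# Address lines the post describes as already taken on the ZX81.
RESERVED_LINES = {0, 1, 2}  # A0/A1: NMI, keyboard, tape; A2: printer


def port_for_line(line: int) -> int:
    """Return the 8-bit port address with only address line `line` low."""
    if not 0 <= line <= 7:
        raise ValueError("line must be 0..7")
    return 0xFF & ~(1 << line)


def selected_lines(port: int) -> list:
    """Return the address lines that are low (i.e. selected) for a port."""
    return [n for n in range(8) if not port & (1 << n)]


# Print the ports for the two lines proposed for stereo output.
for line in (5, 6):
    assert line not in RESERVED_LINES
    port = port_for_line(line)
    print(f"A{line}: ${port:02X} (0b{port:08b})")
```

Writing to a port with both A5 and A6 low (for example $9F) would, under the same assumption, load both flip-flops with the same byte at once.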
The two ICs (for stereo) might cost about one euro together; the resistors and diodes might cost a bit more, but still very little. And, importantly for me, this circuit would be in keeping with the period and with Clive's emphasis on "minimization".
Regarding software, everything will be streamed by the ZX81 itself, timed by its own clock. As for memory, I think a 256-byte waveform with ADSR per instrument might already give good results. For the music, trackers have already proven that good 4-track music is possible, and their song data is very compact too.
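As a rough illustration of the "256-byte waveform with ADSR" idea, here is a Python model of the data (the real player would of course be Z80 code). Everything in it is an assumption for the sake of the example: the sine shape, the tick-based envelope times and the function names.

```python
import math

TABLE_SIZE = 256  # a one-byte phase counter indexes the whole table

# One cycle of a sine wave stored as signed values in -127..127.
WAVE = [round(127 * math.sin(2 * math.pi * i / TABLE_SIZE))
        for i in range(TABLE_SIZE)]


def adsr(t, attack, decay, sustain, release, note_len):
    """Envelope level 0..255 at tick t; all times are in ticks (hypothetical)."""
    if t < attack:                      # ramp up to full level
        return 255 * t // attack
    if t < attack + decay:              # fall from full level to sustain
        return 255 - (255 - sustain) * (t - attack) // decay
    if t < note_len:                    # hold while the note is on
        return sustain
    if t < note_len + release:          # fade out after note-off
        return sustain - sustain * (t - note_len) // release
    return 0


def dac_sample(phase, level):
    """Scale a table entry by the envelope and return an unsigned DAC byte."""
    return 128 + (WAVE[phase & 0xFF] * level) // 256
```

The unsigned result is centred on 128, so silence sits at mid-scale of the resistor DAC; mixing several tracks would mean summing the signed values before adding the offset.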
@MAK
It seems to me that we can drop the 74138 by dedicating one entire address line to each flip-flop (e.g. A5 and A6 for stereo) and using it to enable the write, as depicted in the hand-drawn schematic above. To save the NOT gate as well, on top of dedicating an address line: since the data is ready after the address is set, WR seems a better choice for CLK. It might even work directly, without the NOT gate, if IORQ were slowed down a bit by the diode-diode OR gate (or by a tiny capacitor). The sequence should be:
1. A0-A7 are set to the correct output port
2. D0-D7 are set to the correct data to be written
3. IORQ is set to 0
4. WR is set to 0
5. WR is set to 1
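The five steps above can be replayed against a toy model of the latch. This is only a logic-level sketch in Python, under the assumptions that the 74377 stores D on a rising CLK edge while its enable input is low, that the enable comes from the diode OR of the address line and IORQ, and that WR is wired straight to CLK. Real bus timing (in particular how close together IORQ and WR rise) is not modelled, and the class and function names are invented for the example.

```python
class Latch74377:
    """Toy model: D is stored on a rising CLK edge while /E is low."""

    def __init__(self):
        self.q = 0       # current outputs feeding the resistor DAC
        self._clk = 1    # WR idles high

    def update(self, enable_n, clk, d):
        # Latch only on a 0 -> 1 clock transition with the enable active (low).
        if self._clk == 0 and clk == 1 and enable_n == 0:
            self.q = d & 0xFF
        self._clk = clk
        return self.q


def io_write(latch, select_line, port, data):
    """Replay the write sequence for one OUT to `port` carrying `data`."""
    addr_bit = (port >> select_line) & 1      # step 1: address on A0-A7
    # step 2: data on D0-D7 (the `data` argument)
    iorq = 0                                  # step 3: IORQ goes low
    enable_n = addr_bit | iorq                # diode OR: low only if both are low
    latch.update(enable_n, 0, data)           # step 4: WR (CLK) goes low
    return latch.update(enable_n, 1, data)    # step 5: WR goes high, data is latched


left = Latch74377()
io_write(left, 5, 0xDF, 0x80)   # A5 low: the byte is latched
io_write(left, 5, 0xBF, 0x11)   # A5 high: the latch keeps its old value
print(hex(left.q))
```

In this model the write to $BF is ignored by the A5 latch, which is what lets a second 74377 on A6 act as the other stereo channel.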
As depicted in the schematic: