I would replace C3 with 0.0047-0.01μF and C9 with 1μF to start.
Now that I think about it, you can get pretty clever with the drive channel. If you spin around real fast and squint at the schematic while covering one eye it looks a little like a Trainwreck Express. More on that in a bit.
While I agree with ac427v that cutting lows early in an OD circuit is a good strategy, it's worth noting that C2 and C3 are in series and R7 and R8 are in parallel. This puts the input cutoff frequency at about 250Hz. Much of the body is gone when high-passing that far up. I still say to tweak C3, but maybe don't go higher than 0.0047μF to keep it from getting muddy.
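If you want to see how the cutoff moves as you swap C3, here is a rough sketch of the math: series caps, parallel resistors, then the usual first-order RC corner. The values for C2, R7 and R8 below are placeholders I made up for illustration, not read off the schematic, so plug in the real ones before trusting the numbers.

```python
import math

def series_capacitance(*caps):
    """Total capacitance of capacitors in series (farads)."""
    return 1.0 / sum(1.0 / c for c in caps)

def parallel_resistance(*resistors):
    """Total resistance of resistors in parallel (ohms)."""
    return 1.0 / sum(1.0 / r for r in resistors)

def rc_cutoff_hz(r_ohms, c_farads):
    """-3 dB corner of a first-order RC filter: f = 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)

# Placeholder values, NOT taken from the schematic. Swap in the real ones.
C2 = 0.022e-6
R7 = 1e6
R8 = 1e6

# Candidate C3 values to compare (farads).
for c3 in (0.001e-6, 0.0047e-6, 0.01e-6):
    c_total = series_capacitance(C2, c3)
    r_total = parallel_resistance(R7, R8)
    f = rc_cutoff_hz(r_total, c_total)
    print(f"C3 = {c3 * 1e6:.4f} uF -> cutoff ~ {f:.0f} Hz")
```

Because the caps are in series, the smaller one dominates, which is why C3 is the part worth tweaking.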
Back to being clever and the Trainwreck Express similarities: there are 3 stages and a Fender-style tonestack in both the drive section of this amp and the Trainwreck. Granted the tonestack is in a different spot, so just copying the Trainwreck values blindly isn't likely to be fruitful. But we can take inspiration.
I would first depopulate C13 and replace R24 with a 10k. This will give you the cold clipping/asymmetry you're seeking, BUT it will drastically cut the output of the stage. To make up for that I suggest paralleling a 100k across R25, or even outright jumpering across it. This has the benefit of also relaxing the low cut at the end of the drive stage.
The other thing you need to make the most of the cold clipper is to hit it pretty hard. There's a fixed voltage divider formed by R116 and R55 that cuts the signal in half. I'd parallel a 220k across R116 to start (experiment with values from 100k to 470k). This will give you more signal into the tonestack, and therefore into the cold clipper.
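To get a feel for how much signal each parallel value buys you, here is a quick sketch of the divider math. It assumes R116 is the series leg and R55 the leg to ground, that both are 470k (a placeholder that just gives the stock half-signal ratio, not a value from the schematic), and that whatever follows doesn't load the divider.

```python
def parallel_resistance(*resistors):
    """Total resistance of resistors in parallel (ohms)."""
    return 1.0 / sum(1.0 / r for r in resistors)

def divider_ratio(r_series, r_shunt):
    """Fraction of the input that appears across r_shunt (unloaded divider)."""
    return r_shunt / (r_series + r_shunt)

# Placeholder values, NOT from the schematic: equal legs give the stock 0.5 ratio.
R116 = 470e3  # assumed to be the series leg
R55 = 470e3   # assumed to be the leg to ground

print(f"stock: {divider_ratio(R116, R55):.2f} of the signal passes")

# Resistors to try in parallel with R116 (ohms).
for r_added in (100e3, 220e3, 470e3):
    r_series = parallel_resistance(R116, r_added)
    ratio = divider_ratio(r_series, R55)
    print(f"{r_added / 1e3:.0f}k across R116: {ratio:.2f} of the signal passes")
```

The smaller the resistor you hang across R116, the more signal reaches the tonestack.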
Also I would put a 1μF cap in parallel with R17 (which I would replace with a 1k) and/or replace R18 with a 10k trimpot. The first change hard wires a high shelf boost at about 200Hz and biases the stage slightly hotter. The second allows the large cathode cap to be somewhat active even when "switched out", and lets you dial in the shades in between no cap and full bypass. Find the one you like and that becomes the new baseline.
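For anyone who wants to check where that shelf sits, here is a rough sketch. The 1k and 1μF come from the suggestion above. The tube figures are my assumptions: a 12AX7 with typical datasheet values (mu of 100, internal plate resistance of 62.5k) and a 100k plate load. None of those are confirmed from the schematic. The shelf starts rising around the plain RC corner of the cathode resistor and cap, and levels off around the corner set by the cap against the cathode resistor in parallel with the impedance looking into the cathode.

```python
import math

def parallel_resistance(*resistors):
    """Total resistance of resistors in parallel (ohms)."""
    return 1.0 / sum(1.0 / r for r in resistors)

def rc_corner_hz(r_ohms, c_farads):
    """First-order RC corner: f = 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)

def cathode_impedance(r_plate_load, r_plate_internal, mu):
    """Impedance looking into the cathode of a triode gain stage."""
    return (r_plate_load + r_plate_internal) / (mu + 1.0)

RK = 1e3      # cathode resistor, from the suggestion above
CK = 1e-6     # cathode bypass cap, from the suggestion above
RA = 100e3    # plate load resistor: ASSUMED, not from the schematic
RP = 62.5e3   # internal plate resistance: ASSUMED typical 12AX7 figure
MU = 100.0    # amplification factor: ASSUMED typical 12AX7 figure

lower = rc_corner_hz(RK, CK)
zk = cathode_impedance(RA, RP, MU)
upper = rc_corner_hz(parallel_resistance(RK, zk), CK)

print(f"shelf starts rising near {lower:.0f} Hz")
print(f"shelf levels off near {upper:.0f} Hz")
```

With these guesses the two corners land on either side of the roughly 200Hz figure, so the ballpark holds. A different tube or plate resistor will shift the upper corner.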
Or you can be even more clever and put the cold clipper here (10k in place of R17) and put a 1.5k resistor in parallel with C9. This will give you the option of cold clipper or hot fully bypassed stage, each with its own dedicated gain knob (assuming I'm understanding the gain switching schematic correctly). You'll still want to tweak the voltage divider, but you'll probably find you keep one knob pretty low and the other fairly high.
These tweaks can likely introduce a good bit of fizz into the signal, so don't be shy about using 220-820pF caps to bypass the plates in the drive stage. I'd probably go fairly aggressive with 680-820pF on only the last stage to start out. This gives the upper harmonics a chance to intermingle before they are cut.
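As a rough guide to where those cap values start rolling off, here is a sketch that treats the plate as a source impedance equal to the plate load in parallel with the tube's internal plate resistance. The 100k load and 62.5k internal resistance are assumed typical 12AX7 figures, not values from the schematic. It also ignores the loading of the next stage and the extra output impedance you get with an unbypassed cathode, so treat the results as ballpark only.

```python
import math

def parallel_resistance(*resistors):
    """Total resistance of resistors in parallel (ohms)."""
    return 1.0 / sum(1.0 / r for r in resistors)

def rc_corner_hz(r_ohms, c_farads):
    """First-order RC corner: f = 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)

RA = 100e3   # plate load resistor: ASSUMED, not from the schematic
RP = 62.5e3  # internal plate resistance: ASSUMED typical 12AX7 figure

# Approximate source impedance the bypass cap works against.
r_source = parallel_resistance(RA, RP)

# Cap values in picofarads across the suggested range.
for c_pf in (220, 470, 680, 820):
    f = rc_corner_hz(r_source, c_pf * 1e-12)
    print(f"{c_pf} pF -> rolloff starts near {f / 1e3:.1f} kHz")
```

Bigger caps pull the corner lower, which is why I'd save the aggressive values for the last stage only.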
Hope this gives you a few ideas. I don't know how this would all sound, but it sounds like anything would be an improvement over what you're experiencing with the stock design. Worth experimenting.
Update: added a marked up schematic of most of these ideas plus a few I didn't discuss.