Half a million random songs sounds like quite enough data for good results.
And I do agree with the results - I've talked about it for years with my friends. Sometimes it seems to me that over half the hit songs on the radio are built on the same four chords, and the rest merely change a few odd notes. And ever since radio stations started compressing their on-air music so heavily that there's no volume difference between a quietly picked intro and a smashing chorus, producers have used more and more compression on records, until modern (popular) music has almost no dynamics left.
I'd also extend it to song structures: there aren't many song intros in popular music any more, and the songs seem to jump to the chorus quicker every year, until much of it sounds like one prolonged chorus with some interesting sounds in between.
That's basically the reason I don't listen to commercial pop radio stations any more. At least indie and metal/rock stations still play music that has some variance to it... Popular music produced by the big record companies for the masses has indeed become just a continuum of variations on the same exact formula and base, all ground up and pushed out as a leveled-out, ready-chewed paste from the tube.
Dee