If I see one trend in HOW the non-ANSI IS setups work, it is that most of them seem to under-represent flooders and over-represent throwers. As many of the baseline models ARE tighter beamed, the calibration tends to be closer to the thrower calibration and further from the flooders...and, sure, the spectral response is going to be all over the place.
The main weakness of a DIYS IS is that the light MUST be evenly distributed for a single-point measurement to be representative.
The simplest version of that is the ceiling bounce, with the meter next to the light on the table, etc. MOST of the light from a tight hot spot will come straight down, reflected off the ceiling over the table. If the (floody) light puts, say, 60% of its spot OUTSIDE the table's diameter, most of its output is not measured with the same weighting as the light coming straight down onto the meter on the table.
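To put rough numbers on that, here is a toy two-zone sketch in Python. It is NOT a model of any real setup: the zone weights, the 1000 lumen outputs, and the 10% figure for the thrower are all made up for illustration (only the 60% flooder figure comes from the example above), and `meter_reading` is just a hypothetical helper name.

```python
# Toy two-zone model of a ceiling-bounce reading. All numbers are assumptions.
# The weights say how strongly light landing in each zone registers on the meter.
WEIGHT_OVER_TABLE = 1.0  # light reflected straight down over the table
WEIGHT_OUTSIDE = 0.4     # light that lands outside the table's diameter


def meter_reading(lumens, fraction_outside):
    """Relative meter reading for a light with the given output and beam spread."""
    inside = lumens * (1.0 - fraction_outside)
    outside = lumens * fraction_outside
    return inside * WEIGHT_OVER_TABLE + outside * WEIGHT_OUTSIDE


# Two lights with the SAME true output, but different beam shapes.
thrower = meter_reading(1000, fraction_outside=0.10)
flooder = meter_reading(1000, fraction_outside=0.60)

print(f"thrower reads {thrower:.0f}, flooder reads {flooder:.0f}")
print(f"flooder reads {flooder / thrower:.0%} of the thrower")
```

With those made-up weights, two lights with identical output read quite differently, and the flooder is the one that comes up short. Change the weights and the size of the gap changes, but the direction stays the same as long as off-table light counts for less.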
So, given the high-tech coatings and engineering that goes into a real IS, a DIYS IS is going to be off, and, it will tend to be off the same way across the board, for example being "calibrated" more for throwers than for flooders, or to the emissions from one LED at a particular amperage/drive level vs another, etc.
It can get complicated, as the variables include the proportions of corona and spill relative to the hot spot, plus a lot of others that affect how hard it is to get an even, representative distribution onto your lux meter's sensor.
The other issue you can't control is the light-to-light consistency in actual output. Batch-to-batch variance in reflectors, LEDs, etc., can mean that if you measure 3 lights, they could be 15% apart in output, and, since the factory output spec is based upon a 3-light AVERAGE, YOUR light might only be within 15% of that, high or low.
Add that variable set to your DIYS IS's built-in variables, your lux meter's built-in variables, the IS variables, etc., and, when you add it all up, you COULD see dramatically different measurements which, after crunching the stats, could all be the same.
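As a rough illustration of "crunching the stats", here is a small Python sketch that stacks several error sources and checks whether two readings are really distinguishable. It assumes the error sources are independent and combines them root-sum-square, which is a common simplification and not something I can claim holds for any particular setup. Only the 15% unit-to-unit figure comes from the discussion above; the 10% and 5% figures and the function names are hypothetical.

```python
import math


def combined_uncertainty(*relative_errors):
    """Root-sum-square of independent relative errors (independence is assumed)."""
    return math.sqrt(sum(e ** 2 for e in relative_errors))


def ranges_overlap(reading_a, reading_b, rel_unc):
    """True if the two readings' uncertainty bands overlap."""
    low_a, high_a = reading_a * (1 - rel_unc), reading_a * (1 + rel_unc)
    low_b, high_b = reading_b * (1 - rel_unc), reading_b * (1 + rel_unc)
    return low_a <= high_b and low_b <= high_a


# 15% unit-to-unit variance (from the post); the other two are made-up guesses
# for the DIY sphere and the lux meter.
total = combined_uncertainty(0.15, 0.10, 0.05)
print(f"combined uncertainty: about {total:.1%}")

# Two readings that look very different on paper.
print("900 vs 1100 distinguishable?", not ranges_overlap(900, 1100, total))
```

Under those assumptions, a 900 reading and an 1100 reading sit inside each other's error bands, so you can't honestly say one light is brighter than the other.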
Example: If the dogma is that SureFire is accurately measuring the lumens of their lights (Which can be correct, given gov contracts, etc...) - and, assuming that SF lumens are good for calibration...whatever the beam characteristics of the SF you use to calibrate are, become the reference you use as a baseline to measure the SF.
So SF lumens will, in that scenario, tend to be "right", because they were used to calibrate themselves....and lights that use different LEDs, beam angles, and so forth, will be "off", as they were not used to baseline themselves, but compared to another light's baseline.
If, for example, Zebralights were used to establish the "baseline", then only ZLs would be found to be accurate, and all the throwy/different beam distribution and spectrum lights would be "off".
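The "whatever you calibrate with reads right" effect can be sketched the same way. This is a toy Python example, not anyone's actual calibration procedure: the 0.4 weighting for off-axis light, the beam fractions, and the 1000 lumen outputs are all invented, and the function names are hypothetical.

```python
def raw_reading(lumens, fraction_outside, weight_outside=0.4):
    """Raw meter reading; off-axis light is assumed to count for less (made-up weight)."""
    return lumens * ((1 - fraction_outside) + fraction_outside * weight_outside)


def calibration_factor(ref_lumens, ref_fraction_outside):
    """Factor that makes the reference light read its rated lumens."""
    return ref_lumens / raw_reading(ref_lumens, ref_fraction_outside)


def measured_lumens(true_lumens, fraction_outside, factor):
    """What the calibrated DIY setup reports for a given light."""
    return factor * raw_reading(true_lumens, fraction_outside)


THROWER, FLOODER = 0.10, 0.60  # assumed fraction of light landing off-axis

k_thrower = calibration_factor(1000, THROWER)  # calibrated against a thrower
k_flooder = calibration_factor(1000, FLOODER)  # calibrated against a flooder

# Both test lights truly put out 1000 lumens.
for name, k in (("thrower-calibrated", k_thrower), ("flooder-calibrated", k_flooder)):
    t = measured_lumens(1000, THROWER, k)
    f = measured_lumens(1000, FLOODER, k)
    print(f"{name}: thrower reads {t:.0f}, flooder reads {f:.0f}")
```

In this toy setup the reference light always reads exactly right, and the other beam type reads low or high depending on which one you baselined against, even though both lights have the same true output.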
So, a real IS is the correct answer, and DIYS versions can ballpark things, but you would really need to do a lot of work to account for all the variables, and not just be satisfied to say, well, "it's right for the known lights".
If your DIYS IS were actually as accurate as a real IS, and you could go to court and use it to prove that X light is overstating or understating its lumens, then you could make a lot of them and sell them to all the suckers paying many thousands of dollars for a real one, instead of buying, or just making, your DIYS version. (You is the "royal you", not a poster or anyone in particular...)
That said, we ALL use our lux meters to baseline with...as we don't have an IS, and probably never will. Hence, we do take our DIYS measurements with the proverbial grain of salt.