All color is best-effort
This is a dual feature! It's available as a video too. Watch on YouTube
I do not come to you with answers today, but rather some observations and a lot of questions.
The weird glitch
Recently I was editing some video and I noticed this:
Not what the finger is pointing at — the dots.
Here are the separate layers this image is made up of: the background is a stock image I’ve licensed from Envato Elements:
Because I use it as a background image, I’ve cranked down the exposure in the Color tab:
And for a little added style and some additional readability for subtitles, I’ve added a tilt-shift blur:
On top of that, we have some text captured from Safari as a transparent image by right-clicking on an element in the Inspector and picking “Capture Screenshot”, one of my favorite tricks as of late:
However, the transparency is not complete. GitHub’s CSS has tables with an opaque background. So I added an additional 3D keyer to remove the background:
Those two layers composited already show some strangeness:
Uhh I can hardly see anything.
Very well — enhance!
It’s still kind of subtle, but you can see the dots.
Are those dots present without the 3D Keyer?
No, they’re not:
They are also not here on the Fusion tab at all:
The dots are bad when playback is using full resolution:
But things get a lot weirder once you scale down. This is half resolution:
And this is quarter resolution:
I have investigated this for half a day and I have good news and bad news.
The good news is: unpausing makes all the artifacts disappear!
Ah, great! So surely that means it’s not present in the export, yes?
Well bear, what do you think the bad news is?
Ah f-
Even if I considered those artifacts acceptable, the Tilt-Shift Blur effect makes them impossible to ignore: it blows up every single point into those large colored circles. Even with a strength of zero, it turns those dots into streaks that are reminiscent of memory corruption:
Disappointingly, I don’t know what is causing this particular problem.
There’s a whole bunch of things that don’t solve it: the 3D Keyer can work in different color spaces:
…and has a bunch of knobs you can turn:
But none of them truly fix it. The other keyers exhibit the same issue, here’s the Ultra Keyer instead:
Same with the Luma Keyer, the Chroma Keyer, etc. No amount of checking “Pre-divide / post-multiply” makes a difference. Manually adding an “Alpha multiply” node does jack shit. Same with “Gamut limiter” — there goes my theory that it’s just outputting colors outside the gamut but that… because it might be using f32 values internally, uhhh something happened?
But nope.
Playing with color spaces
What about the source PNG file, is it in some weird colorspace?
Bear, it’s in the most vanilla colorspace you can imagine: sRGB.
What’s with the numbers?
Oh that’s just the IEC standard for it:
Among other things, the standard (published in 1999, amended in 2003) specifies the proper transfer function to use, which is almost like gamma = 2.2, but not quite:
The two solid curves are:
- gamma 2.2 (top)
- gamma 2.4 (bottom)
The dotted curve is sRGB IEC61966-2.1
Close enough though: even if some part of the pipeline used “simplified sRGB” by doing gamma 2.2 instead of using the IEC curve, we wouldn’t see that.
What would we see?
In that case, probably nothing - but when we use the wrong color profile, things look, well, wrong. They can look washed out, they can look too saturated, too dark, too bright, any of these.
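To put a rough number on “close enough”, here’s a quick Rust sketch of mine comparing a pure gamma 2.2 curve against the piecewise IEC curve (which we’ll define properly a bit later in this article). The difference peaks near black, at a few percent of the encoded range:

```rust
// Compare "simplified sRGB" (pure gamma 2.2) with the piecewise
// IEC 61966-2-1 transfer function, over the whole [0, 1] range.
fn srgb_encode(l: f64) -> f64 {
    if l <= 0.0031308 {
        12.92 * l
    } else {
        1.055 * l.powf(1.0 / 2.4) - 0.055
    }
}

fn main() {
    let (mut max_diff, mut at) = (0.0f64, 0.0f64);
    for i in 0..=100_000 {
        let l = i as f64 / 100_000.0;
        let diff = (srgb_encode(l) - l.powf(1.0 / 2.2)).abs();
        if diff > max_diff {
            (max_diff, at) = (diff, l);
        }
    }
    // The curves disagree the most in the near-black region.
    println!("max difference: {max_diff:.4}, at L = {at:.5}");
}
```

A mismatch like that shifts shadows slightly; it doesn’t produce discrete dots.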
Believe it or not, HDR is the main reason I’m using iPhones all around now. By default, my iPhone 14 with the Camera app will shoot HDR video.
You can disable HDR in the settings:
…but personally, I prefer shooting with the Blackmagic Camera app for iPhone, which calls color spaces by their actual names:
I took 4 short videos in quick succession and dragged them all into DaVinci Resolve just to find out what would happen:
They look different.
Indeed they do! First off, there’s most likely an exposure / white balance / etc. mismatch between the built-in Camera app (which had everything on full auto) and the Blackmagic Camera app, but even just looking at the three pieces of Blackmagic footage, things look different, even on this sRGB screenshot, shown on my DCI-P3 display.
CIE Chromaticity diagram
If we switch to Resolve’s Color tab, we can see which colors are actually used in each image, via CIE Chromaticity diagrams.
Rec.709:
P3-D65:
Rec.2020-HLG:
iPhone Camera app:
The “horseshoe” shape in white represents the visible spectrum, and the triangle labelled “Rec.709” represents our color space: which colors of the visible spectrum we’re able to “represent”, or “encode”.
The circle is the “white point” — Rec.709, P3-D65 and Rec.2020-HLG are all using the same white point, D65, which…
…is intended to represent average daylight and has a correlated colour temperature of approximately 6500 K. CIE standard illuminant D65 should be used in all colorimetric calculations requiring representative daylight, unless there are specific reasons for using a different illuminant. Variations in the relative spectral power distribution of daylight are known to occur, particularly in the ultraviolet spectral region, as a function of season, time of day, and geographic location.
— ISO 10526:1999/CIE S005/E-1998, CIE Standard Illuminants for Colorimetry
Without agonizing too much about the science of it, we can see a couple of interesting things: the Rec.709 footage feels very “noisy” for some reason:
I have no idea if it’s normal or not, but we don’t have the same thing in the P3-D65 footage:
So… maybe it’s an encoding artifact? Not sure.
But the thing that’s hard to ignore is on the Rec.2020 chromaticity diagram. Much like, if you crank the gain on an audio track, you can see it clip: you can see the peaks being flattened to the minimum and maximum representable value, as if “clipped” with scissors:
Similarly, we can see that our Rec.2020 footage feels a little cramped in our Rec.709 color space: it wants to get out!
But that’s not even what feels wrong with this comparison… the Rec.709 footage looks “less wrong” to me — the Rec.2020 one has some colors that look… off. The floor is too bright. The cat’s fur becomes too bright too quick.
Hey, that’s the gamma curve!
Right!!! The whole reason we have a “gamma curve” in the first place is to spend our bits wisely.
If all you had was 16 shades of grey (sorry E.L James), which would you choose?
Linear, or sRGB?
Oooh, I’ll take the sRGB ones any day.
Exactly — with a “linear” progression (in terms of light emission), the shades become “too bright” much too quickly.
You can refer to What every coder should know about gamma or the Krita docs about Gamma and Linear if you want to dive more in-depth.
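If you’d rather see those 16 shades as numbers, here’s a quick Rust sketch (using the sRGB transfer function we’ll define properly in a moment) showing how fast a ramp that’s linear in emitted light shoots up in encoded, roughly-perceptual terms:

```rust
// 16 shades of grey, stepped linearly in emitted light: by the very
// first step, we're already at ~29% of the encoded range.
fn srgb_encode(l: f64) -> f64 {
    if l <= 0.0031308 {
        12.92 * l
    } else {
        1.055 * l.powf(1.0 / 2.4) - 0.055
    }
}

fn main() {
    for i in 0..16 {
        let linear = i as f64 / 15.0;
        println!("shade {i:2}: linear {linear:.3} -> encoded {:.3}", srgb_encode(linear));
    }
}
```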
And that’s what the curve I showed earlier is all about:
The sRGB IEC61966-2.1 reverse OETF
Reading this graph, we can see that an encoded value of 0.1 is barely equivalent to 0.01 lightness. To obtain 0.1 lightness, we have to go all the way to 0.35!
If we weren’t using “gamma curves” like these, colors would look a lot worse, especially given how long we’ve been using 8-bit color.
Our first transfer function
Before things get a lot more confusing, let’s ask ourselves: what is the exact name of the transfer function we just plotted?
As far as I can tell, it’s a “reverse OETF”.
An OETF (opto-electronic transfer function) is used when capturing/encoding: The camera sensor detects some amount of light and we need to figure out which integer value to encode it as in the video signal.
Think sRGB, Rec.709, Rec.2020-HLG, Rec.2020-PQ, etc.
An EOTF (electro-optical transfer function) is used by monitors/displays, taking a signal and deciding how much light to emit for each integer value of the signal (more or less).
Think Display P3, sRGB, Adobe RGB, etc. — monitor color profiles:
Finally, an OOTF is the composition of the OETF and the EOTF and we do not need to worry about it at all — check the Wikipedia page on transfer functions in imaging if you really must.
Really, the only one we have to care about is the “OETF”, which has already been applied in our camera, and which we need to reverse in order to know how things really are.
For sRGB, the OETF is defined as:

$$
f_e(L) = \begin{cases} s \cdot L & \text{if } L \le \beta \\ (1 + a) \cdot L^{1/\gamma} - a & \text{otherwise} \end{cases}
$$

where $a = 0.055$, $\beta = 0.0031308$, $\gamma = 2.4$, and $s = 12.92$.

$L$ is the lightness value (how much light hit the camera sensor), and $f_e(L)$ is the encoded signal (the RGB value we put in the video file).

We called it $f_e$, with $e$ for “encoding”, and also for consistency with its inverse.

The “reverse OETF” is then:

$$
f_d(V) = \begin{cases} V / s & \text{if } V \le s \cdot \beta \\ \left( \frac{V + a}{1 + a} \right)^{\gamma} & \text{otherwise} \end{cases}
$$

where $V \in [0, 1]$, and $a$, $\beta$, $\gamma$, and $s$ are the same as for $f_e$.

Called $f_d$ for “decoding”, you guessed it.

We could’ve called it $f_e^{-1}$, but eh.
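In code, those two are short enough to just write down. A minimal Rust sketch, with the two graph readings from earlier as sanity checks:

```rust
// The sRGB OETF (f_e) and reverse OETF (f_d), straight from the formulas.
const A: f64 = 0.055;
const BETA: f64 = 0.0031308;
const GAMMA: f64 = 2.4;
const S: f64 = 12.92;

fn f_e(l: f64) -> f64 {
    if l <= BETA {
        S * l
    } else {
        (1.0 + A) * l.powf(1.0 / GAMMA) - A
    }
}

fn f_d(v: f64) -> f64 {
    if v <= S * BETA {
        v / S
    } else {
        ((v + A) / (1.0 + A)).powf(GAMMA)
    }
}

fn main() {
    // an encoded value of 0.1 is barely 1% lightness...
    println!("f_d(0.1) = {:.4}", f_d(0.1)); // ≈ 0.0100
    // ...and 10% lightness encodes all the way up at ~0.35:
    println!("f_e(0.1) = {:.4}", f_e(0.1)); // ≈ 0.3492
}
```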
How do we know we got the inverse right?
Although not very rigorous, plotting them allows us to see a symmetry across the $y = x$ diagonal.
It looks a little like an almond!
The sRGB OETF (top), a diagonal (straight), and the reverse OETF (👉👈)
Another way is to compute $f_d(f_e(L))$ and simplify, to see if we fall back to $L$.

First the linear part:

$$
f_d(f_e(L)) = \frac{s \cdot L}{s} = L
$$

Then the exponential part:

$$
f_d(f_e(L)) = \left( \frac{\left( (1 + a) \cdot L^{1/\gamma} - a \right) + a}{1 + a} \right)^{\gamma} = \left( L^{1/\gamma} \right)^{\gamma} = L
$$

And finally, the breakpoint for $f_e$ was at $L = \beta = 0.0031308$, which gives us:

For the linear segment:

$$
f_e(\beta) = 12.92 \times 0.0031308 \approx 0.04045
$$

For the exponential segment:

$$
f_e(\beta) = 1.055 \times 0.0031308^{1/2.4} - 0.055 \approx 0.04045
$$

Yup, looks like we got it right — that explains where the breakpoint for $f_d$, $V \approx 0.04045$, comes from.
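And if the algebra isn’t convincing, we can brute-force the same checks, with f_e and f_d written out as before:

```rust
fn f_e(l: f64) -> f64 {
    if l <= 0.0031308 { 12.92 * l } else { 1.055 * l.powf(1.0 / 2.4) - 0.055 }
}

fn f_d(v: f64) -> f64 {
    if v <= 0.04045 { v / 12.92 } else { ((v + 0.055) / 1.055).powf(2.4) }
}

fn main() {
    // round-tripping should be (numerically) lossless...
    for i in 0..=10_000 {
        let l = i as f64 / 10_000.0;
        assert!((f_d(f_e(l)) - l).abs() < 1e-12);
    }
    // ...and both segments should agree at the breakpoint:
    let beta = 0.0031308_f64;
    println!("{:.5}", 12.92 * beta); // ≈ 0.04045
    println!("{:.5}", 1.055 * beta.powf(1.0 / 2.4) - 0.055); // ≈ 0.04045
}
```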
Parade scope
Back to our footage, there’s another fun color visualization we can look at: the “Parade”. This displays red, green, and blue values (here on a scale of 0 to 1023, to mimic 10-bit encoding, although that’s configurable) from left to right.
Our Rec.709 footage, which looks “good”, has a good spread:
Our Rec.2020 footage however, seems squished at the top of the scale:
Luckily, there’s a slider for that!
But here’s the thing: the footage looks fine when reviewing it on my iPhone:
It’s not “blown out”, or anything.
Well, since we’re getting technical… what color space is that screenshot in?
Good question! I Airplay’d the screenshot to myself and got an sRGB JPEG file. It definitely looks more “vivid” on my iPhone screen than on the webpage. I’m not actually sure iPhones can take HDR screenshots?
But the point is: the iPhone did something to that Rec.2020 footage. It’s showing more colors to me on the iPhone screen, but it’s able to save an sRGB version as a screenshot that doesn’t look as wrong as what I had when I dragged the video file on my DaVinci Resolve timeline.
In fact, my Mac can do “that” too: opening the footage in QuickTime shows “correct” colors as well:
And what “that” is, is tone mapping.
More transfer functions
How do we know for sure how color is encoded into our H.265 files?
The macOS Finder gave us three numbers for each of these files:

- 1-1-1 for Blackmagic Cam in “Rec.709” mode
- 12-1-6 for Blackmagic Cam in “P3 D65” mode
- 9-18-9 for Blackmagic Cam in “Rec.2020 - HDR” mode
Those numbers tell us all we need to know, and are standardized in the ITU-T H.265 Recommendation, which is a free download, and I have gone through its 728 pages to fish out the relevant information.
The first number is for color primaries. Notable numbers include:
- 1 for Rec. ITU-R BT.709-6 and IEC 61966-2-1 sRGB or sYCC (HD but SDR video content)
- 4 for Rec. ITU-R BT.601-7 625 (PAL)
- 5 for Rec. ITU-R BT.601-7 525 (NTSC)
- 9 for Rec. ITU-R BT.2020-2 and Rec. ITU-R BT.2100-2 (HDR)
- 12 for SMPTE ST 2113 “P3D65” (hey that sounds familiar)
These are the xy coordinates of what the color space considers red, green, and blue, in the CIE 1931 color space.
Again, the horseshoe is the entire visible spectrum and the triangles are “color gamuts” that represent which colors we can actually encode.
The Rec.2020 gamut (thick circles) is larger: it covers more of the visible spectrum than Rec.709 (thin circles). That’s why the cloud of points in this chromaticity graph feels cramped — it’s actually Rec.2020 footage that we’ve clamped to Rec.709, so we have to throw away some of the colors that were encoded into the original footage.
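As a cheat sheet, here’s that first number as a Rust lookup — only the values listed above; the real table in the spec has many more entries:

```rust
/// A few notable `colour_primaries` values from the ITU-T H.265
/// Recommendation, as listed above.
fn primaries_name(code: u8) -> &'static str {
    match code {
        1 => "Rec. ITU-R BT.709-6 / sRGB / sYCC",
        4 => "Rec. ITU-R BT.601-7 625 (PAL)",
        5 => "Rec. ITU-R BT.601-7 525 (NTSC)",
        9 => "Rec. ITU-R BT.2020-2 / BT.2100-2",
        12 => "SMPTE ST 2113 \"P3D65\"",
        _ => "(other / reserved)",
    }
}

fn main() {
    // The first number of each triplet the Finder showed us:
    for code in [1, 12, 9] {
        println!("{code:2} -> {}", primaries_name(code));
    }
}
```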
Wait a minute, I have several questions.
Go right ahead.
First: why didn’t we just pick a super large gamut that covers the entire visible spectrum? Why is Rec. 709 so small to begin with?
The short answer is 8-bit color: we talked about spending our bits wisely and it’s time to visualize it:
Without gamma curves, all of our possible encoded colors are concentrated near the illuminant, and there are a lot of gaps near pure red, green, and blue colors.
But if you check “Apply TRC”, you can see all the discrete colors spread out away from the illuminant, and closer to the color primaries. This is valid for Rec.709 and Rec.2020 just as well.
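We can also count codes instead of eyeballing dots. A quick sketch, assuming 8-bit sRGB: how many of our 256 codes land below 1% linear light, with and without the transfer curve?

```rust
// How many of 256 codes describe "dark" colors (below 1% linear light)?
fn srgb_decode(v: f64) -> f64 {
    if v <= 0.04045 { v / 12.92 } else { ((v + 0.055) / 1.055).powf(2.4) }
}

fn main() {
    let dark_with_trc = (0..256)
        .filter(|&c| srgb_decode(c as f64 / 255.0) < 0.01)
        .count();
    let dark_linear = (0..256).filter(|&c| (c as f64 / 255.0) < 0.01).count();
    // With the TRC, dozens of codes cover the darkest shades;
    // with a linear encoding, only a couple do.
    println!("with TRC: {dark_with_trc}, linear: {dark_linear}"); // 26 vs 3
}
```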
And that’s what the second number is for! TRC for Transfer characteristics, which really is “transfer functions”, or “tone response curve” (also TRC), or “gamma curve”, yada yada.
Well. It’s not that simple. Note 5 says, paraphrasing:

some values of transfer_characteristics are defined in terms of a reference OETF, and others are defined in terms of a reference EOTF, according to the convention that has been applied in other Specifications. In the cases of Rec. ITU-R BT.709-6 and Rec. ITU-R BT.2020-2 (which can be indicated by transfer_characteristics equal to 1, 6, 14, or 15), although the value is defined in terms of a reference OETF, a suggested corresponding reference EOTF characteristic function for flat panel displays used in HDTV studio production has been specified in Rec. ITU-R BT.1886-0.
Here’s the ITU-R BT.1886 EOTF in question:

$$
L = a \cdot \max(V + b, 0)^{\gamma}
$$
(Source: ITU-R BT.1886 on Wikipedia)
$V$ is the “input video signal level”, in the range $[0, 1]$, so far so good — we can divide our 8-bit values by 255, or our 10-bit values by 1023, no worries there.
$L$ is the screen luminance in $\mathrm{cd/m^2}$ (candelas per square meter, also called nits), which, uhhh, where did my beautiful normalized light-linear value go? This is much too real-world for me. $\gamma$ is 2.4, which is unsurprising for Rec.709 / SDR (see below).
Also, $\gamma$ is just the lower-case Greek letter “gamma”.
$a$ is user gain, previously known as “contrast”, because, that’s right, EOTFs are for displays:

$$
a = \left( L_W^{1/\gamma} - L_B^{1/\gamma} \right)^{\gamma}
$$
Mnemonics:
- EOTF = Electro-optical TF = (electric ➡️ optic)
- OETF = Opto-electrical TF = (optic ➡️ electric)
Displays are optical, video files are electrical.
$b$ is user black level lift, aka “brightness”:

$$
b = \frac{L_B^{1/\gamma}}{L_W^{1/\gamma} - L_B^{1/\gamma}}
$$
Finally, $L_W$ and $L_B$ are the screen luminance for white and black, respectively, also in $\mathrm{cd/m^2}$.
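Putting the whole thing together in Rust, with a hypothetical 100-nit white / 0.1-nit black display just to get concrete numbers out:

```rust
// The BT.1886 EOTF with its user gain (a) and black lift (b) derived
// from the white/black screen luminances, per the definitions above.
fn bt1886_eotf(v: f64, l_w: f64, l_b: f64) -> f64 {
    let gamma = 2.4;
    let root_w = l_w.powf(1.0 / gamma);
    let root_b = l_b.powf(1.0 / gamma);
    let a = (root_w - root_b).powf(gamma); // user gain, "contrast"
    let b = root_b / (root_w - root_b); // black level lift, "brightness"
    a * (v + b).max(0.0).powf(gamma)
}

fn main() {
    // A 100-nit SDR display with a 0.1-nit black level (an assumption,
    // just to have concrete numbers):
    for v in [0.0, 0.25, 0.5, 0.75, 1.0] {
        println!("V = {v:.2} -> {:.2} cd/m²", bt1886_eotf(v, 100.0, 0.1));
    }
}
```

Notice how $V = 0$ lands on 0.1 nits and $V = 1$ lands on 100 nits: $a$ and $b$ exist precisely to pin the curve to the display’s actual black and white.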
So yeah. What we plotted earlier isn’t the EOTF, it was a reverse OETF.
I hope someone other than you had that question.
Notable transfer_characteristics values for H.265 include 1, for Rec. ITU-R BT.709-6.
They give this OETF:

$$
V = \begin{cases} \alpha \cdot L_c^{0.45} - (\alpha - 1) & \text{for } 1 \ge L_c \ge \beta \\ 4.500 \cdot L_c & \text{for } \beta > L_c \ge 0 \end{cases}
$$

How do we know it’s an OETF? Because its input is $L_c$, a “linear optical intensity with a nominal real-valued range of 0 to 1”.

The values $\alpha$ and $\beta$ are constants defined so that the curve segments meet at the breakpoint.

For TRC 1, 6, 11, 14 and 15 we have:

$$
\alpha = 1.099296826809442\ldots, \quad \beta = 0.018053968510807\ldots
$$
Can we plot some of these? I’m getting a little lost in the theory.
Sure! Let me just rewrite the OETF in a way that’s a little more consistent with our sRGB work from earlier:

$$
f_e(L) = \begin{cases} s \cdot L & \text{if } L \le \beta \\ \alpha \cdot L^{1/\gamma} - (\alpha - 1) & \text{otherwise} \end{cases}
$$

where:

$$
s = 4.5, \quad \beta \approx 0.018, \quad \alpha \approx 1.099, \quad \gamma = \frac{1}{0.45} \approx 2.22
$$
Whoa hey that’s exactly the same formula as sRGB!
Yup, only with a different slope $s$ for the linear bit, a different breakpoint $\beta$, a different constant $\alpha$, and $\gamma = 1/0.45 \approx 2.22$ rather than $\gamma = 2.4$.
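Here’s that curve in Rust, with a check that the two segments really do meet at the breakpoint — that’s exactly what those long constants are for:

```rust
// The BT.709 OETF, using the exact constants from the H.265 spec.
const ALPHA: f64 = 1.099296826809442;
const BETA: f64 = 0.018053968510807;

fn bt709_encode(l: f64) -> f64 {
    if l <= BETA {
        4.5 * l
    } else {
        ALPHA * l.powf(0.45) - (ALPHA - 1.0)
    }
}

fn main() {
    // both segments should agree at the breakpoint:
    let linear = 4.5 * BETA;
    let expo = ALPHA * BETA.powf(0.45) - (ALPHA - 1.0);
    println!("{linear:.9} vs {expo:.9}"); // ≈ 0.081242858 on both sides
}
```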
Here’s a plot of the sRGB gamma curve, zoomed in near the origin:
The sRGB OETF's linear segment (solid) and the start of its exponential segment (dotted)
And here’s the Rec. ITU-R BT.709-6 curve, which we’ll hereafter lovingly refer to as just BT.709 (BT for “broadcast television”):
The BT.709 OETF's linear segment (solid) and the start of its exponential segment (dotted)
Whoaaaa. Same shape, different constants.
Moving on to HDR, value 18 refers to ARIB STD-B67, also Rec. ITU-R BT.2100-2 HLG, which we’ll simply refer to as “HLG” moving forward, for Hybrid log-gamma.
The definition given in the H.265 spec V10 is:

$$
V = \begin{cases} \sqrt{3 \cdot L_c} & \text{for } 0 \le L_c \le \frac{1}{12} \\ a \cdot \ln(12 \cdot L_c - b) + c & \text{for } \frac{1}{12} < L_c \le 1 \end{cases}
$$

Where $a = 0.17883277$, $b = 0.28466892$, and $c = 0.55991073$.
Can uhhh.. can we make that more readable?
With pleasure:

$$
f_e(L) = \begin{cases} \sqrt{L} / 2 & \text{if } L \le 1 \\ a \cdot \ln(L - b) + c & \text{otherwise} \end{cases}
$$

Where $L = 12 \cdot L_c$ ranges over $[0, 12]$, and $a$, $b$, and $c$ are the same values already mentioned.
Hey! There’s no linear segment! And what’s with the twelves?
Well! Because in spirit, lightness levels are no longer in $[0, 1]$, they’re in $[0, 12]$.
That’s right, HDR is not just 10-bit (or 12-bit), it’s also brighter colors.
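In Rust, with the breakpoint and the top of the range as sanity checks:

```rust
// The HLG OETF, in the "rescaled" form where lightness runs from 0 to 12.
const A: f64 = 0.17883277;
const B: f64 = 0.28466892;
const C: f64 = 0.55991073;

fn hlg_encode(l: f64) -> f64 {
    if l <= 1.0 { l.sqrt() / 2.0 } else { A * (l - B).ln() + C }
}

fn main() {
    // at the breakpoint, both branches give 0.5 (the constants b and c
    // are chosen so the log segment lines up with the square-root one):
    println!("{:.4}", hlg_encode(1.0)); // 0.5000
    // and the very top of the extended range encodes to ~1.0:
    println!("{:.4}", hlg_encode(12.0)); // ≈ 1.0000
}
```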
How white is your white?
There’s a cool demo called Wanna see a whiter white that embeds an HDR video to show you that, when you say #ffffff on a webpage (or rgb(255, 255, 255)), you’re really referring to “the brightest SDR white”, which really depends on your display.
Me, I have a pair of LG 27UP85NP-W computer displays at home, which sport the “VESA DisplayHDR(TM) 400” badge, meaning they should be able to reach all the way to 400 nits (remember, nits are just $\mathrm{cd/m^2}$, candelas per square meter), and they’ve been tested to reach 413, so that’s cool.
But that’s the absolute highest they’ll go.
Right now I have them using the DCI-P3 color profile:
And I have the “High Dynamic Range” option left unchecked in macOS’s Display settings:
So… how white is my white? I don’t actually know! Because I don’t have a luminance meter, like the Konica Minolta LS-150:
Foot-lamberts???? Good gravy.
So I can’t measure how much light my screen puts out: I can only compare various shades with my eyes, which adapt quickly to various lighting conditions!
In fact, that’s precisely why sRGB is traditionally displayed with a gamma of around 2.2, suitable for a typical office setting, Rec.709 is displayed with gamma 2.4, for darker living rooms in the evening, and DCI-P3 is displayed with gamma 2.6, for near-dark movie theaters! (DCI for “Digital Cinema Initiative”):
From top to bottom: a gamma 2.2 curve, a gamma 2.4 curve, and a gamma 2.6 curve.
The higher the display gamma, the deeper the blacks — in other terms, it “improves contrast”.
But, and that’s where things get confusing, this display gamma is not the same as the encoding gamma that’s used in the OETFs we’ve seen.
The sRGB OETF is pretty close to a Gamma of 2.2:
sRGB OETF (solid) versus gamma 2.2 curve (dotted)
The Rec.709 OETF, even though it uses $\gamma = 1/0.45 \approx 2.22$ for the exponential part, is overall better approximated by Gamma 1.96:
Rec.709 OETF (solid) versus gamma 1.96 curve (dotted)
The gamma used in the OETF, again, is really just there to make sure we make the most out of our bits, whereas the display gamma is usually adjustable on computer monitors and TVs, to suit the overall lighting environment: it’s all about contrast.
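Where does a number like 1.96 come from? One way to get at it: brute-force the pure power curve closest to the BT.709 OETF. A sketch, using least squares over a uniform sampling (other fitting choices give slightly different answers):

```rust
// Find the pure gamma curve x^(1/g) that best fits the BT.709 OETF,
// in the least-squares sense, over a uniform sampling of [0, 1].
fn bt709_encode(l: f64) -> f64 {
    if l <= 0.018053968510807 {
        4.5 * l
    } else {
        1.099296826809442 * l.powf(0.45) - 0.099296826809442
    }
}

fn main() {
    let mut best = (f64::MAX, 0.0);
    for gi in 150..=250 {
        let g = gi as f64 / 100.0;
        let err: f64 = (1..=1000)
            .map(|i| {
                let l = i as f64 / 1000.0;
                (bt709_encode(l) - l.powf(1.0 / g)).powi(2)
            })
            .sum();
        if err < best.0 {
            best = (err, g);
        }
    }
    println!("best fit: gamma {:.2}", best.1); // lands in the ~1.9-2.0 range
}
```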
Gamma is such a fascinating operation for me, because it always maps $[0, 1]$ back to $[0, 1]$, as opposed to something like lift:

$$
f(x) = x + l
$$

Or gain:

$$
f(x) = g \cdot x
$$
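A tiny demonstration of that property, with arbitrary lift, gain, and gamma values:

```rust
// gamma keeps both endpoints fixed; lift moves black, gain moves white.
fn main() {
    let (gamma, l, g) = (2.2_f64, 0.1, 1.2);
    for x in [0.0_f64, 1.0] {
        println!(
            "x = {x}: gamma -> {:.1}, lift -> {:.1}, gain -> {:.1}",
            x.powf(gamma),
            x + l,
            g * x,
        );
    }
}
```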
Conclusion
I left this article in that state for months — working on other projects in the meantime.
At some point I thought I’d discovered that if you tell DaVinci Resolve to use an HDR color space while editing, the stars would be gone. But I never could really put my finger on what was going wrong.
And then it happened again:
Like before, it’s barely noticeable in Full resolution:
And impossible to ignore in Half resolution:
I discovered that in the Color tab, increasing the gamma slider makes the problem less pronounced (although it also changes the colors), and decreasing the gamma slider makes it much, much more visible:
I even found a way to isolate just the artifacts, so you can see a night full of stars:
My conclusion was that if you use a gamut mapping node, then it works around the problem, since that’s how I worked around the problem for my unsynn video:
But you know what? It doesn’t work anymore. My workaround workaroundn’t.
And so, the mystery remains complete. It’s probably a floating point thing. But which one? Who knows. Not me! Not me.