Archive for the ‘graphics’ Category

Stupid image resizing

Wednesday, October 1st, 2008

It greatly annoys me when web pages don’t resize images properly. In particular, there seems to be an epidemic at the moment of web pages that embed images at one resolution and then use styles like “max-width: 100%” in a column narrower than the image. This makes the images look horrible because most browsers (at least IE7 and Firefox 2) resize the images by just dropping columns, which makes diagonal lines and curves look wavy. At least Firefox 3 gets this right and resamples the image, but it’s still a waste of bandwidth.
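To make the difference concrete, here is a minimal sketch of the two approaches applied to a single row of grey values. The function names are my own for illustration, and this is only a rough model of what "dropping columns" and "resampling" amount to, not any browser's actual code.

```python
def shrink_by_dropping(row, factor):
    """Keep every `factor`-th sample and throw the rest away."""
    return row[::factor]


def shrink_by_averaging(row, factor):
    """Replace each complete group of `factor` samples with its mean."""
    return [sum(row[i:i + factor]) / factor
            for i in range(0, len(row) - factor + 1, factor)]


# A row of alternating black and white pixels (fine detail).
row = [0, 255, 0, 255, 0, 255, 0, 255]
dropped = shrink_by_dropping(row, 2)
averaged = shrink_by_averaging(row, 2)
print(dropped, averaged)
```

Dropping samples turns the fine stripes into solid black, while averaging gives a uniform mid-grey, which is much closer to what the eye would see from a distance.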

Along similar lines, here is an interesting article about the interactions between gamma correction and image scaling. I hadn’t thought about that before (my own image resizing routines always just assumed a linear brightness scale) but this has definitely opened my eyes.
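As a rough illustration of the issue, the sketch below averages two pixel values in two ways: directly on the stored values, and in linear light. It assumes the stored values are sRGB-encoded (my assumption, using the standard sRGB transfer function); the function names are hypothetical.

```python
def srgb_to_linear(value):
    """Convert an 8-bit sRGB value (0..255) to linear light (0..1)."""
    c = value / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(c):
    """Convert linear light (0..1) back to an 8-bit sRGB value."""
    s = c * 12.92 if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055
    return round(s * 255)


def average_naive(a, b):
    """Average the stored values as if brightness were linear."""
    return round((a + b) / 2)


def average_linear(a, b):
    """Decode to linear light, average, then re-encode."""
    return linear_to_srgb((srgb_to_linear(a) + srgb_to_linear(b)) / 2)


print(average_naive(0, 255), average_linear(0, 255))
```

Averaging black and white in linear light gives a noticeably brighter grey than the naive average, which is why scaling routines that assume a linear brightness scale tend to darken fine detail.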

Ray tracing in GR

Saturday, September 27th, 2008

Following on from this post, a natural generalization is to non-Euclidean spaces. This is important for simulating gravity, for example rendering a scientifically accurate trip through a wormhole (something I have long wanted to do but never got to work). The main difference is that one’s rays are curved in general, which makes the equations much more difficult (really they need to be numerically integrated, making it orders of magnitude slower than normal ray-tracing). One complication of this is that generally the rays will also curve between the eye point and the screen. But the rays between your screen and your eye in real life do not curve, so it would look wrong!

I think the way out of this is to make the virtual screen very small and close to the eye. This doesn’t affect the rendering in flat space (since only the directions of the rays matter) and effectively eliminates the need to take into account curvature between the screen and the eye (essentially it makes the observer into a locally Euclidean reference frame).

Another complication of simulated relativity is the inability to simulate time dilation. Well, you can simulate it perfectly well if you’re the only observer in the simulated universe but this would be a big problem for anyone who wanted to make a relativistically-accurate multiplayer game - as soon as the players are moving fast enough with respect to each other to have different reference frames, they will disagree about their expected relative time dilations.

Linux cleartype subtly broken

Wednesday, September 17th, 2008

Most modern graphical desktops have an option to render fonts taking into account the positions of the sub-pixels on liquid crystal displays. Since these are always in the same positions relative to the pixels, one can subtly alter the horizontal position of something by changing its hue. This effectively triples the horizontal resolution, which can make for a great improvement in the readability of text.

Unfortunately, my Ubuntu desktop doesn’t do this correctly - some bits of text have a yellowish fringe and some bits of text have a bluish fringe, both of which are quite distracting. The problem is that while you can alter the horizontal position of something at sub-pixel intervals, you can only alter its width by whole pixels (otherwise the hue changes don’t cancel out and integrating over a region of the screen gives a yellow or cyan colour).
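The width constraint can be checked with a small sketch. It assumes black text on a white background and an R, G, B sub-pixel order from left to right (both assumptions of mine), and it ignores the filtering that real renderers apply. The function names are hypothetical.

```python
def pack_subpixels(coverage):
    """Turn per-sub-pixel ink coverage (0..1, length a multiple of 3)
    into (r, g, b) pixels for black text on white."""
    pixels = []
    for i in range(0, len(coverage), 3):
        r, g, b = (round(255 * (1 - c)) for c in coverage[i:i + 3])
        pixels.append((r, g, b))
    return pixels


def channel_totals(pixels):
    """Sum each colour channel over a run of pixels."""
    return tuple(sum(p[k] for p in pixels) for k in range(3))


# A bar three sub-pixels (one whole pixel) wide, shifted by one sub-pixel.
whole = channel_totals(pack_subpixels([0, 1, 1, 1, 0, 0]))
# A bar only two sub-pixels wide at the same position.
partial = channel_totals(pack_subpixels([0, 1, 1, 0, 0, 0]))
print(whole, partial)
```

For the whole-pixel-wide bar the three channel totals are equal, so the region averages out to a neutral grey. For the narrower bar they differ, which is the colour fringe.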

I’ve therefore had to switch my desktop to use grayscale anti-aliasing, which is a bit more blurry. Fortunately the pixels on my monitor are small enough that this doesn’t bother me very much. I do prefer the font rendering that Windows does, though. While FreeType does apparently include code to support Microsoft’s patented programmed hinting I can’t seem to get the font rendering on Linux to look as good as it does on Windows.

Equations for 3D graphics

Tuesday, September 16th, 2008

When I first learnt how to do 3D graphics it was as a recipe. You take your 3D world coordinates x, y and z and rotate them (rotation matrices around the coordinate axes are particularly easy to write down). If you want perspective, you then divide x and y by z (for isometric views just ignore z). Next you scale the result by the number of pixels per world-coordinate unit at z=1 and translate so that x=y=0 is in the center of the screen.
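Written out as code, the recipe looks something like the sketch below. The function names, the choice of the y axis for the example rotation, and the convention that screen y increases downwards are my assumptions for illustration.

```python
import math


def rotate_y(point, angle):
    """Rotate a 3D point about the y axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)


def project(point, scale, width, height, perspective=True):
    """Map a rotated 3D point to pixel coordinates.

    `scale` is the number of pixels per world unit at z = 1.
    With perspective=False, z is simply ignored.
    """
    x, y, z = point
    if perspective:
        x, y = x / z, y / z
    # Translate so that x = y = 0 lands in the centre of the screen.
    return (width / 2 + scale * x, height / 2 - scale * y)


print(project(rotate_y((1.0, 1.0, 2.0), 0.0), 100, 640, 480))
```

Points with z of zero or less (at or behind the viewer) are not handled here and would need clipping before the divide.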

This worked great, but didn’t tell me why this was the right thing to do.

Later, reading about ray tracing I realized what was really going on. Your model is in 3D space but also in 3D space are some additional points - the entrance pupil of your eye and the pixels that make up your screen. If you imagine a straight line from your eye passing through a particular point of your model and going on to infinity, that line may also pass through the screen. If it does, the point on the screen that it passes through corresponds to the pixel on the physical screen at which that point should be drawn.

Since computer screens are generally rectangular, if the positions of the corners of the screen are a, b, c and d (a and d being diagonally opposite to each other) the position of the fourth corner can be determined from the first three by using the equation a+d=b+c. So to fix the position, orientation and size of the screen we only need to consider 3 corners (a, b and c). We also need to consider the position of the eye, e, which is independent of a, b and c. We thus have 12 degrees of freedom (3 coordinates each for a, b, c and e). Three of these degrees of freedom correspond to translations of the whole screen/eye system in 3D space. Two of them correspond to orientation (looking up/down and left/right). Two correspond to the horizontal and vertical size of the screen. Three more give the position of the eye relative to the screen. One more gives the “twist” of the screen (rotation about the axis between the eye and the closest point on the screen plane). That’s eleven degrees of freedom - what’s the other one? It took me a while to find it but eventually I realized that I was under-constraining a, b and c - the remaining degree of freedom is the angle between the top and left edges of the screen (which for every monitor I’ve ever seen will be 90 degrees - nice to know that this system can support different values for this though).
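Here is a minimal sketch of this model in code. It takes a as one corner, b and c as the two corners adjacent to it (so the corner diagonally opposite a is b + c - a), and finds where the line from the eye e through a model point p crosses the screen. The function names and the use of Cramer's rule are my choices for illustration.

```python
def sub(p, q):
    """Component-wise difference of two 3D points."""
    return tuple(pi - qi for pi, qi in zip(p, q))


def det3(c0, c1, c2):
    """Determinant of the 3x3 matrix whose columns are c0, c1, c2."""
    return (c0[0] * (c1[1] * c2[2] - c1[2] * c2[1])
            - c1[0] * (c0[1] * c2[2] - c0[2] * c2[1])
            + c2[0] * (c0[1] * c1[2] - c0[2] * c1[1]))


def screen_coords(p, e, a, b, c):
    """Return (s, t), the fractional position on the screen where the line
    from eye e through point p crosses it, or None if there is no such
    crossing in front of the eye. s runs from a to b, t from a to c.
    """
    # Solve e + k*(p - e) = a + s*(b - a) + t*(c - a) for k, s and t.
    r = sub(p, e)
    nu = sub(a, b)          # -(b - a)
    nv = sub(a, c)          # -(c - a)
    rhs = sub(a, e)
    d = det3(r, nu, nv)
    if d == 0:
        return None         # line is parallel to the screen plane
    k = det3(rhs, nu, nv) / d
    if k <= 0:
        return None         # screen plane is behind the eye along this line
    s = det3(r, rhs, nv) / d
    t = det3(r, nu, rhs) / d
    return (s, t)


eye = (0, 0, 0)
a, b, c = (-1, 1, 1), (1, 1, 1), (-1, -1, 1)
print(screen_coords((2, 2, 4), eye, a, b, c))
```

Multiplying s and t by the pixel width and height gives the pixel position; values outside the range 0 to 1 mean the point is off the edge of the screen. Nothing in the code requires the edges to be at 90 degrees to each other.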

If you solve the equations, this system turns out to be exactly the same transformations as the original recipe, only a little more general and somewhat better justified. It also unifies the perspective and isometric views - isometric is what you get if the distance between the eye and the screen is infinity. Obviously if you were really infinitely far away from your computer screen you wouldn’t be able to see anything on it, which is why the isometric view doesn’t look as realistic as the perspective view.

Many 3D graphics engines allow you to set a parameter called “perspective” or “field of view” which effectively increases or decreases how “distorted” the perspective looks and how much peripheral vision you have. This is essentially the same as the eye-screen distance in my model. To get the most realistic image the FoV should be set according to the distance between your eyes and your screen.
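The relationship is simple trigonometry. The sketch below assumes the eye is centred in front of the screen and looking straight at it; the function name is hypothetical.

```python
import math


def horizontal_fov_degrees(screen_width, eye_distance):
    """Horizontal field of view subtended by a screen of the given width
    seen from the given distance (both in the same units)."""
    return math.degrees(2 * math.atan((screen_width / 2) / eye_distance))


# For example, a screen 0.5 m wide viewed from 0.6 m away.
print(horizontal_fov_degrees(0.5, 0.6))
```

As the distance grows the angle shrinks towards zero, which is the isometric limit.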

Generations of screen sizes

Sunday, July 6th, 2008

If you make a list of common screen resolutions and plot them on a log/log scale, you will notice that a regular pattern emerges - given an x by y resolution, x*2 by y*2 also tends to be common. However the sizes are not evenly distributed throughout logarithmic space - there are gaps which suggest different “generations” of monitor sizes, each monitor having 4 times the number of pixels as the corresponding monitor in the previous generation. I suspect this is because of the influence of TV - the resolutions are clustered around power-of-2 multiples of TV resolutions.

Generation 1 - 256-480 pixels wide, 180-300 pixels tall (non-interlaced TV, handhelds)
Generation 2 - 512-960 pixels wide, 360-600 pixels tall (interlaced TV, older computer monitors)
Generation 3 - 1024-1920 pixels wide, 720-1200 pixels tall (most current computer monitors)
Generation 4 - 2048-3840 pixels wide, 1440-2400 pixels tall (“Q” standards - very high end monitors)
Generation 5 - 4096-7680 pixels wide, 2880-4800 pixels tall (“H” standards - no monitors exist yet)
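Since each generation doubles the previous one, the table can be generated from the first row. The sketch below classifies a resolution by its width only (a simplification of mine, since the table also gives height ranges); the function name is hypothetical.

```python
import math


def generation(width):
    """Return the generation number for a horizontal resolution,
    or None if the width falls in a gap between generations."""
    if width < 256:
        return None
    g = math.floor(math.log2(width / 256)) + 1
    low, high = 256 * 2 ** (g - 1), 480 * 2 ** (g - 1)
    return g if low <= width <= high else None


for w in (320, 640, 1280, 1920, 3840):
    print(w, generation(w))
```

A width such as 500 returns None because it falls between the 480 upper limit of generation 1 and the 512 lower limit of generation 2.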

Rendering rings of teleportation

Wednesday, June 4th, 2008

Rings of teleportation are very handy things to have around. The surface bounded by one ring is equated with the surface bounded by the other, so if you put something through one ring it will come out through the other. (Like the portals in “Portal”, but more portable). They don’t exist, of course, but this technicality doesn’t prevent us from drawing pictures of them.

Writing code to render these things is an interesting exercise. It’s easy to do with a ray tracer - if a ray intersects the disc inside one ring, just continue it to the equivalent point on the other ring.
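A minimal sketch of that step is below. It assumes each ring is a flat circular disc described by a centre, a radius and an orthonormal frame (right, up, normal), and that points and directions are carried across by re-expressing them in the other ring's frame. The class and function names are hypothetical.

```python
from dataclasses import dataclass


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def scale(a, k):
    return tuple(x * k for x in a)


@dataclass
class Ring:
    center: tuple
    right: tuple    # unit vector in the plane of the ring
    up: tuple       # unit vector in the plane, perpendicular to right
    normal: tuple   # unit vector perpendicular to the plane
    radius: float


def teleport(origin, direction, src, dst):
    """If the ray hits the disc inside `src`, return the continued ray
    (origin, direction) leaving `dst`; otherwise return None."""
    denom = dot(direction, src.normal)
    if abs(denom) < 1e-12:
        return None                      # ray is parallel to the disc
    t = dot(sub(src.center, origin), src.normal) / denom
    if t <= 0:
        return None                      # disc plane is behind the ray
    offset = sub(add(origin, scale(direction, t)), src.center)
    if dot(offset, offset) > src.radius ** 2:
        return None                      # hit the plane outside the ring
    # Re-express the hit point and direction in the other ring's frame.
    x, y = dot(offset, src.right), dot(offset, src.up)
    new_origin = add(dst.center, add(scale(dst.right, x), scale(dst.up, y)))
    dr, du, dn = (dot(direction, axis)
                  for axis in (src.right, src.up, src.normal))
    new_direction = add(scale(dst.right, dr),
                        add(scale(dst.up, du), scale(dst.normal, dn)))
    return new_origin, new_direction
```

Giving the destination ring a different frame (for example a negated right vector) or scaling x and y by the ratio of the radii changes how the two surfaces are equated without changing the rest of the ray tracer.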

Once that’s working, you can put the rings side-by-side so that light goes around in circles - if you put your eye point in the middle you can see an infinite tunnel.

A trick you can play is to reverse the orientation of one of the rings. If you then look through one ring and out of the other at an object, the object will appear to be inverted, as in a mirror image.

Another trick is to make the rings different sizes, or shapes. As long as there is a 1:1 function equating points on one surface with points on the other, it works fine.

However, having rings of different sizes or non-circular shapes opens the possibility of putting one ring through the other. What happens then? It seems like the “infinite tunnel” then becomes a real thing rather than just an optical effect, but where does the second ring exist in real space? It seems that the only place it can appear is through the other side of the first ring, but that would mean that every point in space appears in an infinite number of places - this seems like it would have rather drastic consequences.

So it seems more likely that the second ring would be prevented from going through the first somehow (perhaps a ring edge would get in the way).

What I want from an HDR workflow

Tuesday, June 3rd, 2008

Once I’ve got my HDR camera and my HDR monitor, I’ll need new photographic workflow applications to get the images looking the way I want them. I expect that there will be a few parameters that I’ll almost always want to tweak, much as I almost always re-crop my photos at the moment. These parameters are likely to be:

  • Colour balance (2D slider)
  • Exposure (slider)
  • Dynamic range compression (slider)
  • Tone mapping radius (slider)

The last two of these reproduce the functionality of current HDR applications, allowing creation of tone-mapped images for non-HDR output (like printing, or legacy monitors).
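As a rough idea of how the exposure and compression sliders might act on a single pixel, here is a sketch using a simple global curve in the style of the Reinhard operator (my choice for illustration, not what any particular application does). It does not model the tone mapping radius, which would need a local operator that looks at neighbouring pixels. The function name and parameter ranges are hypothetical.

```python
def tone_map(luminance, exposure=1.0, compression=1.0):
    """Map a scene luminance (>= 0) to a display value in 0..1.

    exposure scales the input; compression = 0 gives a plain linear
    response that clips at 1, larger values squeeze the highlights.
    """
    scaled = luminance * exposure
    return min(1.0, scaled / (1.0 + compression * scaled))


for value in (0.1, 1.0, 10.0, 1000.0):
    print(value, tone_map(value))
```

With compression at 1, even very bright inputs stay just below 1 rather than clipping, at the cost of flattening the contrast in the highlights.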

The high dynamic range revolution

Monday, June 2nd, 2008

Currently some people are making beautiful HDR images like these. This takes an input image with a high dynamic range (often composed of multiple exposures with different exposure times to get good colour resolution over a wide range of brightnesses) and “compresses” the range down to monitor or printout ranges. This can give an effect similar to an oil painting (painters use similar techniques).

But such techniques will soon become unnecessary as the dynamic range that monitors can display increases. As I’ve mentioned before I’ve seen this technology in action and it’s seriously impressive - the pictures are incredibly realistic, like looking out of a window. As these monitors drop in price they will become ubiquitous and then we will want to take pictures that take full advantage of them.

Shooting RAW with a good digital SLR goes some way towards this, but I think that with the new generation of monitors will come a new generation of cameras optimized for taking HDR images. This might be as simple as reading the sensor several times over the course of the exposure, or it might be a completely new sensor design.

With new monitors and new cameras, the entire graphics pipeline will be re-engineered for HDR.

Photographic workflow

Sunday, June 1st, 2008

The workflow that I use for the photographs on my website has remained pretty much unchanged for many years.

  1. Copy the photos from the card to the computer and then delete them from the card.
  2. Open the folder of photos in ACDSee and delete any obvious duds.
  3. Open all the photos in Paint Shop Pro 4 (yes, I know it’s ancient but it works well, I know my way around all the tools and it’s fast).
  4. I look for similar photos and close the ones that are redundant or unattractive, eventually whittling it down to the set of photos that will form a nice album page.
  5. I rotate (sometimes by arbitrary angles) and crop. Sometimes I’ll adjust brightness and/or contrast to save a poor photo if there’s something in particular that I want to have a picture of. Sometimes I’ll use a more sophisticated program like Photoshop to remove redeye or do other colour manipulations.
  6. Very occasionally I will use the clone tool to erase something that I don’t want in the photo.
  7. I’ll resample the photos to the appropriate size and save them as jpgs.
  8. Finally I’ll manipulate the directory listing in a text editor to create the html file, add captions and upload the lot.

Someday I’ll trade in my trusty Olympus C3000Z and get a nice digital SLR. But I might wait a few years because the high dynamic range revolution is coming. More about that tomorrow.

High density mouse pointer

Wednesday, May 9th, 2007

Many techniques have been invented for making the mouse pointer more visible. One is a little animation of circles radiating out from it when you press a “locate” button, another is just making it bigger. Yet another is adding “mouse trails” showing the position the pointer was at over the last few frames (though this has the disconcerting effect of making your pointer appear to be a snake). One which I think has been inadequately explored is making it “high density”. Normally when you move the mouse the operating system erases the pointer from the old location and plots it at the new location (doing nothing in between), so if you move the pointer around quickly in a circle it seems to appear at discrete locations.

I think it would be better if, in any given frame, the operating system plotted the pointer at the position it was at in the last frame, the position it is at in the current frame and everywhere in between. This would give the impression of a “streak” as you moved the pointer, or a solid circle if you moved it in a rapid circle, as if the pointer is plotted much more often than once per frame - more “densely” in other words. It would be kind of like mouse trails done properly.
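The set of positions to plot in each frame could be computed along the lines of the sketch below. It assumes the pointer moved in a straight line between the two frames (the real path is unknown between samples), and the function name is hypothetical.

```python
def pointer_positions(previous, current):
    """Return every pixel position on the straight line from the pointer's
    position in the last frame to its position in this frame, inclusive."""
    (x0, y0), (x1, y1) = previous, current
    steps = max(abs(x1 - x0), abs(y1 - y0))
    if steps == 0:
        return [current]
    return [(round(x0 + (x1 - x0) * i / steps),
             round(y0 + (y1 - y0) * i / steps))
            for i in range(steps + 1)]


print(pointer_positions((0, 0), (3, 0)))
```

Plotting the pointer image at each returned position produces the streak; a stationary pointer yields a single position, so it looks exactly as it does now.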