# User:Versatranitsonlywaytofly

## Avoiding the paper-projection look at screen edges: a small eye sees off-center objects as smaller, even when they lie on a line parallel to the line through the viewer's ears

Here I begin changing the coordinates of the pixels displayed on the screen, to get the proper viewing angle as in the real world. I found a very good example that can be edited and suits my needs for creating, or at least trying to create, real-world-like images. You just need to install the DirectX SDK. The tutorial can be found in "(SDK root)\Samples\C++\Direct3D10\Tutorials\Tutorial11". You need Microsoft Visual Studio 2010 (I used the full edition rather than Express; you may try the Express edition as well). Download the DirectX SDK (June 2010) here: http://www.microsoft.com/download/en/details.aspx?id=8109 . The tutorial is described in the SDK's help file. The exact directory is "C:\Program Files\Microsoft DirectX SDK (June 2010)\Samples\C++\Direct3D10\Tutorials\Tutorial11". Open the file Tutorial11_2010.vcxproj with Visual Studio. For me it did not open right away, so I had to choose "restart this application under different credentials" and then "yes".

To change the waves in the tutorial, copy the file "Tutorial11.fx" from the directory "C:\Program Files\Microsoft DirectX SDK (June 2010)\Samples\C++\Direct3D10\Tutorials\Tutorial11" to the desktop and open it with Notepad or WordPad. Before opening it you can rename the file to ".txt", then edit and save it, rename it back to ".fx" and overwrite the original. You can edit and experiment to see what each change does. To run the sample, open "Tutorial11_2010.vcxproj", press the green triangle (Start Debugging; the sample starts in a new window) and then "yes".

THIS IS THE ORIGINAL "Tutorial11.fx" CODE:
"
//--------------------------------------------------------------------------------------

//-------------------------------------------------------------------------------------- // Constant Buffer Variables //--------------------------------------------------------------------------------------

   Texture2D g_txDiffuse;
SamplerState samLinear
{
Filter = MIN_MAG_MIP_LINEAR;
};

   cbuffer cbConstant
{
float3 vLightDir = float3(-0.577,0.577,-0.577);
};

   cbuffer cbChangesEveryFrame
{
matrix World;
matrix View;
matrix Projection;
float Time;
};

   cbuffer cbUserChanges
{
float Waviness;
};

   struct VS_INPUT
{
float3 Pos          : POSITION;
float3 Norm         : NORMAL;
float2 Tex          : TEXCOORD0;
};

   struct PS_INPUT
{
float4 Pos : SV_POSITION;
float3 Norm : TEXCOORD0;
float2 Tex : TEXCOORD1;
};


//-------------------------------------------------------------------------------------- // DepthStates //--------------------------------------------------------------------------------------

   DepthStencilState EnableDepth
{
DepthEnable = TRUE;
DepthFunc = LESS_EQUAL;
};

   BlendState NoBlending
{
AlphaToCoverageEnable = FALSE;
BlendEnable[0] = FALSE;
};


   PS_INPUT VS( VS_INPUT input )
{
PS_INPUT output = (PS_INPUT)0;

output.Pos = mul( float4(input.Pos,1), World );

output.Pos.x += sin( output.Pos.y*0.1f + Time )*Waviness;

output.Pos = mul( output.Pos, View );
output.Pos = mul( output.Pos, Projection );
output.Norm = mul( input.Norm, World );
output.Tex = input.Tex;

return output;
}


   float4 PS( PS_INPUT input) : SV_Target
{
// Calculate lighting assuming light color is <1,1,1,1>
float fLighting = saturate( dot( input.Norm, vLightDir ) );
float4 outputColor = g_txDiffuse.Sample( samLinear, input.Tex ) * fLighting;
outputColor.a = 1;
return outputColor;
}


//-------------------------------------------------------------------------------------- // Technique //--------------------------------------------------------------------------------------

       technique10 Render
{
pass P0
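{
// Bind the shaders defined above (assumed vs_4_0 / ps_4_0 profiles, as in the Direct3D 10 SDK samples)
SetVertexShader( CompileShader( vs_4_0, VS() ) );
SetGeometryShader( NULL );
SetPixelShader( CompileShader( ps_4_0, PS() ) );

       SetDepthStencilState( EnableDepth, 0 );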
SetBlendState( NoBlending, float4( 0.0f, 0.0f, 0.0f, 0.0f ), 0xFFFFFFFF );
}
}

".
THIS IS THE CODE AS I EDITED IT, AS A STARTING POINT FOR FURTHER EDITS TOWARD THE FINAL GOAL OF MAKING OBJECTS LOOK AS THEY DO IN THE REAL WORLD:
"
//--------------------------------------------------------------------------------------

//-------------------------------------------------------------------------------------- // Constant Buffer Variables //--------------------------------------------------------------------------------------

   Texture2D g_txDiffuse;
SamplerState samLinear
{
Filter = MIN_MAG_MIP_LINEAR;
};

   cbuffer cbConstant
{
float3 vLightDir = float3(-0.577,0.577,-0.577);
};

   cbuffer cbChangesEveryFrame
{
matrix World;
matrix View;
matrix Projection;
float Time;
};

   cbuffer cbUserChanges
{
float Waviness;
};

   struct VS_INPUT
{
float3 Pos          : POSITION;
float3 Norm         : NORMAL;
float2 Tex          : TEXCOORD0;
};

   struct PS_INPUT
{
float4 Pos : SV_POSITION;
float3 Norm : TEXCOORD0;
float2 Tex : TEXCOORD1;
};


//-------------------------------------------------------------------------------------- // DepthStates //--------------------------------------------------------------------------------------

   DepthStencilState EnableDepth
{
DepthEnable = TRUE;
DepthFunc = LESS_EQUAL;
};

   BlendState NoBlending
{
AlphaToCoverageEnable = FALSE;
BlendEnable[0] = FALSE;
};


   PS_INPUT VS( VS_INPUT input )
{
PS_INPUT output = (PS_INPUT)0;

output.Pos = mul( float4(input.Pos,1), World );

output.Pos = mul( output.Pos, View );
output.Pos = mul( output.Pos, Projection );

   output.Pos.x += sin( output.Pos.y * 0.1f)*20;

   output.Norm = mul( input.Norm, World );
output.Tex = input.Tex;

return output;
}


   float4 PS( PS_INPUT input) : SV_Target
{
// Calculate lighting assuming light color is <1,1,1,1>
float fLighting = saturate( dot( input.Norm, vLightDir ) );
float4 outputColor = g_txDiffuse.Sample( samLinear, input.Tex ) * fLighting;
outputColor.a = 1;
return outputColor;
}


//-------------------------------------------------------------------------------------- // Technique //--------------------------------------------------------------------------------------

       technique10 Render
{
pass P0
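{
// Bind the shaders defined above (assumed vs_4_0 / ps_4_0 profiles, as in the Direct3D 10 SDK samples)
SetVertexShader( CompileShader( vs_4_0, VS() ) );
SetGeometryShader( NULL );
SetPixelShader( CompileShader( ps_4_0, PS() ) );

       SetDepthStencilState( EnableDepth, 0 );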
SetBlendState( NoBlending, float4( 0.0f, 0.0f, 0.0f, 0.0f ), 0xFFFFFFFF );
}
}

".

COMPARISON. The only difference is in the vertex shader code:
Original:
//--------------------------------------------------------------------------------------

PS_INPUT VS( VS_INPUT input ) {

   PS_INPUT output = (PS_INPUT)0;

output.Pos = mul( float4(input.Pos,1), World );

output.Pos.x += sin( output.Pos.y*0.1f + Time )*Waviness;

output.Pos = mul( output.Pos, View );
output.Pos = mul( output.Pos, Projection );
output.Norm = mul( input.Norm, World );
output.Tex = input.Tex;

return output;


}

Mine, after editing:
//--------------------------------------------------------------------------------------

PS_INPUT VS( VS_INPUT input ) {

   PS_INPUT output = (PS_INPUT)0;

output.Pos = mul( float4(input.Pos,1), World );

output.Pos = mul( output.Pos, View );
output.Pos = mul( output.Pos, Projection );

   output.Pos.x += sin( output.Pos.y * 0.1f)*20;

   output.Norm = mul( input.Norm, World );
output.Tex = input.Tex;

return output;


}

"Waviness" is a value you can control in the demo, from 0 to 20. I simply removed the word "Waviness" and wrote "20". I also removed "Time", because even when the object is not rotating about the y axis (the vertical axis, which is y in computer graphics where mathematics usually uses z) a wave travels through the object and changes over time, and I do not want this effect: my goal is to stop objects looking bigger in the corners than in the center. I also thought that I should perhaps put the line "output.Pos.x += sin( output.Pos.y * 0.1f)*20;" after the multiplications by the "World", "View" and "Projection" matrices, which rotate, move and project objects onto the screen, and it worked on the first guess. Now the object is waved as a projection, regardless of whether you rotate it or not. From here I think I can, or you can, write code for the proper display of objects, which I discussed on the Wikipedia talk page for Ray tracing (graphics), http://en.wikipedia.org/wiki/Talk:Ray_tracing_(graphics) , in the section "Ray tracing will never be correct on computer until size of virtual camera (or eye) matrix will not be 1000 times smaller".
Here are screenshots to prove that the wave is applied only to the projection on the screen: http://imageshack.us/g/263/tutorial11remaked.jpg/ .
In the line "output.Pos.x += sin( output.Pos.y * 0.1f)*20;" the number "0.1f" sets the distance between waves: the pattern repeats every 2π/0.1 ≈ 62.8 units of output.Pos.y. If you remove "*0.1f" from that line, the wave repeats every 2π ≈ 6.28 units, so the wavelength becomes comparable to the spacing between neighboring vertices.
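As a quick check of how "0.1f" sets the spacing between waves, here is a small Python sketch. It is not shader code; the function names `wave_offset` and `wavelength` are my own, and the sketch only mirrors the arithmetic of the edited line, with `y` in whatever units `output.Pos.y` has at that point.

```python
import math

def wave_offset(y, frequency=0.1, amplitude=20.0):
    """Mirror of `sin( output.Pos.y * 0.1f ) * 20` from the edited shader line."""
    return math.sin(y * frequency) * amplitude

def wavelength(frequency=0.1):
    """Distance along y after which the sine pattern repeats."""
    return 2.0 * math.pi / frequency

if __name__ == "__main__":
    print(round(wavelength(0.1), 2))  # spacing with the 0.1f factor
    print(round(wavelength(1.0), 2))  # spacing with the factor removed
```

The offset is largest (20 units) a quarter of a wavelength away from y = 0 and returns to the same value one full wavelength later.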
But after my trials I understood that I cannot get the effect I want, objects that look as they do in the real world, with this kind of code manipulation. All the matrices only rotate the vertices, and everything is somehow normalized (vectors everywhere instead of definite coordinates). I think waves like these could be made on the Quake or Quake 2 engine. I think the 3DMark2000 demo in which a sphere waves used this kind of vertex manipulation, and it needs no clever shaders, because the World, View and Projection matrices have existed, I think, since the first DirectX and since the start of 3D graphics, perhaps even since the game "Wolfenstein 3D".
Here are some screenshots http://imageshack.us/g/851/tutorial11remakedjustfo.jpg/ based on the following code or similar:
"output.Pos.x =output.Pos.x +(output.Pos.y/48)*Waviness;"
As you can see, I do not ask for the bottom of the object to move left, only right, but it still moves, because the vectors are somehow mirrored. I mean that "output.Pos.x" is not the actual x position on the screen: going to full screen at a higher resolution changes nothing, and with "Waviness" at 20 the bottom (the legs) is farther left than at 0, while the head does the opposite. (A likely reason is that output.Pos.y is measured from the origin, so it is negative for the lower part of the object and the shift changes sign there.)
Here is code in which the bottom (the legs) does not move and only the head moves right:

PS_INPUT VS( VS_INPUT input ) {

   PS_INPUT output = (PS_INPUT)0;
output.Pos = mul( float4(input.Pos,1), World );
output.Pos = mul( output.Pos, View );
output.Pos.x =output.Pos.x +(output.Pos.y/480)*Waviness;
output.Pos.x +=Waviness/2;
output.Pos = mul( output.Pos, Projection );
output.Norm = mul( input.Norm, World );
output.Tex = input.Tex;

   return output;


}

But the legs still seem to move left a little, so "output.Pos.y/500" may be correct, or maybe even "output.Pos.y/515".
The main problem is that her face should become wider as her head moves right while the legs stay in the same position. So the face must be wider than the legs, and this is not happening; without it I cannot pursue the goal of real-world-looking objects any further.
So it seems that almost the whole algorithm for drawing 3D must change, in all of computer graphics. Right now the code just moves all the object's vertices along the x axis. I need to move vertices in proportion to their distance from the center (the same on the x and y axes, something like ${\displaystyle x_{n}=x+{\frac {c}{n(x^{2}+y^{2})}}}$, ${\displaystyle y_{n}=y+{\frac {c}{n(x^{2}+y^{2})}}}$). The farther a vertex (or pixel) is from the center, the more it should be moved. To do this without changing the underlying code, one needs to know that code. Then, for example, one can define a plane x=az+by+c parallel to the zOy plane (in computer graphics x and y are as in 2D graphics and z is depth), measure the distance from each vertex to this plane, and then move each vertex closer to the plane by an amount that depends on that distance.
Here is the math for how I plan to show objects as in the real world. First we compute the proper image in only one quarter of the screen, the top right one. We use a screen resolution of 640*480, so the center is O(320; 240). I suppose the field of view for this quarter is 45 degrees (or maybe 22.5). If it is 45 degrees, let ${\displaystyle x_{t}=240}$ pixels be one leg of a triangle and ${\displaystyle z_{t}=240}$ the other leg (z points toward us); the hypotenuse, between the end of ${\displaystyle x_{t}}$ and the end of ${\displaystyle z_{t}}$ (which is your eye), is ${\displaystyle h_{xz}={\sqrt {240^{2}+240^{2}}}={\sqrt {115200}}=339.411255}$. So the image will be correct only inside a quarter circle of radius ${\displaystyle R=240}$. So ${\displaystyle \cos \alpha ={\frac {x_{t}}{h_{xz}}}={\frac {240}{339.411255}}={\frac {1}{\sqrt {2}}}}$, ${\displaystyle \alpha =\arccos {\frac {1}{\sqrt {2}}}={\frac {\pi }{4}}.}$ So we must have ${\displaystyle {\sqrt {x_{t}^{2}+y_{t}^{2}}}\leq 240}$. Here is how we calculate the proper x and y coordinates:
${\displaystyle x_{n}=x-{\frac {x_{t}}{\sqrt {240^{2}+({\sqrt {x_{t}^{2}+y_{t}^{2}}})^{2}}}}-320=x-{\frac {x_{t}}{\sqrt {240^{2}+x_{t}^{2}+y_{t}^{2}}}}-320;}$
Here ${\displaystyle 0<x_{t}<240}$, because ${\displaystyle {\sqrt {x_{t}^{2}+y_{t}^{2}}}<240.}$ And ${\displaystyle 320<x<560}$, because ${\displaystyle {\sqrt {x_{t}^{2}+y_{t}^{2}}}<240.}$ And ${\displaystyle x_{t}=x-320.}$
${\displaystyle y_{n}=y-{\frac {y_{t}}{\sqrt {240^{2}+({\sqrt {x_{t}^{2}+y_{t}^{2}}})^{2}}}}-240=y-{\frac {y_{t}}{\sqrt {240^{2}+x_{t}^{2}+y_{t}^{2}}}}-240;}$
Here ${\displaystyle 0<y_{t}<240}$, because ${\displaystyle {\sqrt {x_{t}^{2}+y_{t}^{2}}}<240.}$ And ${\displaystyle 240<y<480}$, because ${\displaystyle {\sqrt {x_{t}^{2}+y_{t}^{2}}}<240.}$ And ${\displaystyle y_{t}=y-240.}$
For example, with a screen resolution of 640*480 pixels the center coordinates are (320; 240). Then for ${\displaystyle x=480}$ and ${\displaystyle y=360}$ we can calculate the corrected pixel coordinates:
${\displaystyle x_{480}=480-{\frac {480-320}{\sqrt {240^{2}+(480-320)^{2}+(360-240)^{2}}}}-320=160-{\frac {160}{\sqrt {240^{2}+160^{2}+120^{2}}}}=160-{\frac {160}{312.41}}=}$
${\displaystyle =160-0.512=159.488;}$
${\displaystyle y_{360}=360-{\frac {360-240}{\sqrt {240^{2}+(480-320)^{2}+(360-240)^{2}}}}-240=120-{\frac {120}{\sqrt {240^{2}+160^{2}+120^{2}}}}=120-{\frac {120}{312.41}}=}$
${\displaystyle =120-0.384=119.616.}$
The following formulas show that ${\displaystyle \Delta x=0.512}$ and ${\displaystyle \Delta y=0.384}$ are correct:
${\displaystyle {\sqrt {x_{480}^{2}+y_{360}^{2}}}={\sqrt {159.488^{2}+119.616^{2}}}={\sqrt {39744.4096}}=199.3600.}$
${\displaystyle {\sqrt {x_{t}^{2}+y_{t}^{2}}}={\sqrt {160^{2}+120^{2}}}={\sqrt {40000}}=200.}$
Now let us choose ${\displaystyle x_{t}=200}$, ${\displaystyle y_{t}=0}$, so x=200+320=520 (which does not violate the rule, since 520<320+240=560) and y=240, and calculate:
${\displaystyle x_{520}=520-{\frac {520-320}{\sqrt {240^{2}+(520-320)^{2}+(240-240)^{2}}}}-320=200-{\frac {200}{\sqrt {240^{2}+200^{2}+0^{2}}}}=200-{\frac {200}{312.41}}=}$
${\displaystyle =200-0.6402=199.3598;}$
The values 199.3600 and 199.3598 differ only because the intermediate results were rounded. So, as you see, in general a radius of a given length measured from the point (320; 240) becomes shorter by the same amount after the correction, no matter which point on the screen it connects to the center (320; 240).
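The worked example can be re-checked with a short Python sketch. It only evaluates the formulas for ${\displaystyle x_{n}}$ and ${\displaystyle y_{n}}$ numerically; the function name `corrected` and the optional parameter `c` are my own (with `c = 1` it is the first pair of formulas, other values give the scaled variant).

```python
import math

CENTER_X, CENTER_Y = 320.0, 240.0  # screen center for 640*480
EYE_DISTANCE = 240.0               # the 240 under the square root

def corrected(x, y, c=1.0):
    """Return (x_n, y_n), measured from the screen center."""
    xt, yt = x - CENTER_X, y - CENTER_Y
    h = math.sqrt(EYE_DISTANCE ** 2 + xt ** 2 + yt ** 2)
    # x_n = x - c*x_t/h - 320 is the same as x_t - c*x_t/h
    return xt - c * xt / h, yt - c * yt / h

if __name__ == "__main__":
    for point in [(480, 360), (520, 240)]:
        xn, yn = corrected(*point)
        print(point, round(xn, 4), round(yn, 4), round(math.hypot(xn, yn), 4))
```

Both sample points start 200 pixels from the center, and with full floating-point precision both end up at the same corrected radius, so the small mismatch in the hand calculation comes from rounding.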
In case something turns out to be wrong, you can use the following formulas instead:
${\displaystyle x_{n}=x-{\frac {c\cdot x_{t}}{\sqrt {240^{2}+({\sqrt {x_{t}^{2}+y_{t}^{2}}})^{2}}}}-320=x-{\frac {c\cdot x_{t}}{\sqrt {240^{2}+x_{t}^{2}+y_{t}^{2}}}}-320;}$
${\displaystyle y_{n}=y-{\frac {c\cdot y_{t}}{\sqrt {240^{2}+({\sqrt {x_{t}^{2}+y_{t}^{2}}})^{2}}}}-240=y-{\frac {c\cdot y_{t}}{\sqrt {240^{2}+x_{t}^{2}+y_{t}^{2}}}}-240;}$
where c is a constant to tune by trial until the result looks most realistic. For this resolution c should be about 10-50.
Shorter formulas are also possible (but perhaps less correct):
${\displaystyle x_{n}=x-c\cdot x_{t}{\sqrt {x^{2}+y^{2}}}-320;}$
${\displaystyle y_{n}=y-c\cdot y_{t}{\sqrt {x^{2}+y^{2}}}-240.}$
In this case the constant c should be about 1/1000.
Actually, something more like these formulas is needed:
${\displaystyle x_{n}=x-(k\cdot x_{t}+c){\sqrt {x^{2}+y^{2}}}-320;}$
${\displaystyle y_{n}=y-(k\cdot y_{t}+c){\sqrt {x^{2}+y^{2}}}-240.}$
This is because the y coordinates are still too large when x is very large and y is small. But an object at the middle of the right edge is still wider than it is tall (say, if the natural object size in the center is ${\displaystyle x_{s}=1}$ and ${\displaystyle y_{s}=1}$, then at the middle of the right edge the object size is something like ${\displaystyle x_{s}=2}$ and ${\displaystyle y_{s}=1.5}$) when you rotate the camera while standing at the same point.
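For comparison with the sizes estimated above, here is a Python sketch of what ordinary flat-screen perspective (pinhole) projection gives for a small object kept at a constant distance from the eye. This is standard projection geometry rather than anything taken from the tutorial code, and the function name `edge_stretch` is my own: a point seen at angle θ from the view axis lands at a distance proportional to tan θ from the screen center, so a small object is stretched by 1/cos²θ along the direction away from the center and by 1/cos θ across it.

```python
import math

def edge_stretch(angle_deg):
    """Stretch of a small object seen `angle_deg` off the view axis,
    relative to the same object at the screen center.
    Returns (radial, tangential) = (1/cos^2, 1/cos)."""
    c = math.cos(math.radians(angle_deg))
    return 1.0 / (c * c), 1.0 / c

if __name__ == "__main__":
    for angle in (0, 20, 32.5, 45, 60):
        radial, tangential = edge_stretch(angle)
        print(angle, round(radial, 3), round(tangential, 3))
```

At the middle of the right edge the radial direction is horizontal, so with a 90 degree horizontal field of view (edge at 45 degrees) the object would be about 2 times wider and about 1.41 times taller than in the center.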

## Partial solution

Here is code that may fix, exactly or not (I have not figured that out yet), the top right quarter of the screen:
//--------------------------------------------------------------------------------------
PS_INPUT VS( VS_INPUT input )

{

    PS_INPUT output = (PS_INPUT)0;
output.Pos = mul( float4(input.Pos,1), World );
output.Pos = mul( output.Pos, View );
output.Pos.z =output.Pos.z +(output.Pos.y/500)*Waviness*10;
output.Pos.z +=10*Waviness/2;
output.Pos.z =output.Pos.z +(output.Pos.x/500)*Waviness*10;
output.Pos = mul( output.Pos, Projection );
output.Norm = mul( input.Norm, World );
output.Tex = input.Tex;
return output;


}

Just replace the vertex shader in the "tutorial11.fx" file with it. Then the object becomes smaller toward the top right corner, as I want. In this demo z is negative toward the viewer and positive away from the viewer (OpenGL and XNA use the opposite z direction).
Here are some images: http://imageshack.us/g/69/tutorial11atlastquarter.jpg/ .

## YES, I made it, for all quarters

Here is a possibly exact solution for fixing perspective so that scenes look as they do in a real photograph or movie, and just as in real life.
Here is the code for all quarters for Tutorial11, which is in the directory "C:\Program Files\Microsoft DirectX SDK (June 2010)\Samples\C++\Direct3D10\Tutorials\Tutorial11". Just replace the vertex shader in the file "tutorial11.fx" (you can open it with Notepad by renaming it to "tutorial11.txt" and, after editing, renaming it back to "tutorial11.fx"). This is the code to put in (it works for all quarters; change Waviness to see how objects shrink the farther they are from the center):
//--------------------------------------------------------------------------------------
   PS_INPUT VS( VS_INPUT input )
{
PS_INPUT output = (PS_INPUT)0;
output.Pos = mul( float4(input.Pos,1), World );
output.Pos = mul( output.Pos, View );
if(output.Pos.y < 0)
{
output.Pos.z =output.Pos.z -(output.Pos.y/500)*Waviness*10;
output.Pos.z +=10*Waviness/2;
}
else
{
output.Pos.z =output.Pos.z +(output.Pos.y/500)*Waviness*10;
output.Pos.z +=10*Waviness/2;
}
if(output.Pos.x < 0)
{
output.Pos.z =output.Pos.z -(output.Pos.x/500)*Waviness*10;
}
else
{
output.Pos.z =output.Pos.z +(output.Pos.x/500)*Waviness*10;
}
output.Pos = mul( output.Pos, Projection );
output.Norm = mul( input.Norm, World );
output.Tex = input.Tex;
return output;
}

Here is the improved code, with the lines "output.Pos.z +=10*Waviness/2;" removed. They are no longer needed once the positive and negative coordinates of each quarter are treated separately (deleting these lines of course improves precision and makes the code clearer, with no unwanted imprecise displacement):

   PS_INPUT VS( VS_INPUT input )
{
PS_INPUT output = (PS_INPUT)0;
output.Pos = mul( float4(input.Pos,1), World );
output.Pos = mul( output.Pos, View );
if(output.Pos.y < 0)
{
output.Pos.z =output.Pos.z -(output.Pos.y/500)*Waviness*10;

}
else
{
output.Pos.z =output.Pos.z +(output.Pos.y/500)*Waviness*10;

}
if(output.Pos.x < 0)
{
output.Pos.z =output.Pos.z -(output.Pos.x/500)*Waviness*10;
}
else
{
output.Pos.z =output.Pos.z +(output.Pos.x/500)*Waviness*10;
}
output.Pos = mul( output.Pos, Projection );
output.Norm = mul( input.Norm, World );
output.Tex = input.Tex;
return output;
}

Here are some images showing how the perspective correction looks: http://imageshack.us/g/220/tutorial11withperspecti.jpg/ .
I see no reason why it should be wrong, so the next step is to play some game with this correction, to feel whether it is the exact solution or whether a slightly different method is needed. In any case it already fixes a lot. With my code, perspective in games can now look closer to reality, or the same as in reality.
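The four if/else branches of the improved vertex shader amount to pushing each vertex along z by an amount proportional to |x| + |y|. The Python sketch below (function names are my own) mirrors the shader arithmetic line by line and compares it with that shorter form; it only checks the arithmetic and is not a replacement for the shader.

```python
def push_z_branches(x, y, z, waviness):
    """Line-by-line mirror of the if/else vertex shader version."""
    if y < 0:
        z = z - (y / 500) * waviness * 10
    else:
        z = z + (y / 500) * waviness * 10
    if x < 0:
        z = z - (x / 500) * waviness * 10
    else:
        z = z + (x / 500) * waviness * 10
    return z

def push_z_abs(x, y, z, waviness):
    """Same displacement written with absolute values."""
    return z + (abs(x) + abs(y)) * waviness / 50

if __name__ == "__main__":
    for x, y in [(100, 50), (-100, 50), (100, -50), (-100, -50)]:
        print(x, y, push_z_branches(x, y, 300.0, 20), push_z_abs(x, y, 300.0, 20))
```

In HLSL the same thing could presumably be written in two lines with the `abs` intrinsic instead of the branches.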

### Tutorial11.fx update

Use the following code if you do not want the object to look as if seen through a crystal with 8 triangular faces (imagine this crystal cut in half through its plane of symmetry, with you looking through one of the halves, which looks like a rhombus):

   PS_INPUT VS( VS_INPUT input )
{
PS_INPUT output = (PS_INPUT)0;
output.Pos = mul( float4(input.Pos,1), World );
output.Pos = mul( output.Pos, View );
output.Pos.z =output.Pos.z + sqrt(output.Pos.y * output.Pos.y + output.Pos.x * output.Pos.x)*Waviness/20;
output.Pos = mul( output.Pos, Projection );
output.Norm = mul( input.Norm, World );
output.Tex = input.Tex;
return output;
}

Now, depending on "Waviness", the object will look as if seen through a wide field of view camera, a door peephole or a fisheye lens. In the previous code a vertex at A(100; 100) was pushed along the z axis by the same amount as a vertex at B(200; 0), because 100+100 = 200+0. But if you measure the distance from the center O(0; 0) to A(100; 100) and to B(200; 0), you find that B is farther from the center than A, so B should be pushed more, as this update does. Indeed ${\displaystyle {\sqrt {200^{2}+0^{2}}}=200}$ and ${\displaystyle {\sqrt {100^{2}+100^{2}}}={\sqrt {20000}}=141.4213562}$. So 141 is much less than 200, yet the old algorithm, which did not use squares, pushed A(100; 100) and B(200; 0) away in the z direction by the same amount, and this should not be. In the updated algorithm vertex A(100; 100) is pushed about 141 units in the z direction (away from the camera) and vertex B(200; 0) is pushed 200 units, each multiplied by Waviness/20 (so exactly these amounts when Waviness is 20).
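The difference between the two versions for points A and B can be checked with a small Python sketch (function names are my own; the scale factors 1/50 and 1/20 are taken from the two shader versions):

```python
import math

def push_sum(x, y, waviness):
    """Previous version: z grows with |x| + |y| (scale 10/500 = 1/50)."""
    return (abs(x) + abs(y)) * waviness / 50

def push_radial(x, y, waviness):
    """Updated version: z grows with sqrt(x*x + y*y) (scale 1/20)."""
    return math.hypot(x, y) * waviness / 20

if __name__ == "__main__":
    for name, (x, y) in {"A": (100, 100), "B": (200, 0)}.items():
        print(name, push_sum(x, y, 20), round(push_radial(x, y, 20), 4))
```

With the sum, A and B get the same push, and the set of points with equal push is a diamond (rhombus). With the square root the push depends only on the distance from the center, so the equal-push contours are circles.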

## A wall in a game looks parallel to the line through the ear holes and to the line from one leg to the other

http://imageshack.us/g/225/japanparaleltoplain.jpg/

Here are screenshots from Tomb Raider: Legend, which show how the whole wall is a projection onto the virtual camera's photomatrix (image sensor). In the real world a more distant object looks smaller. An object seen at the edge of the view is more distant even though it lies on the same plane parallel to the virtual camera's photomatrix, so in real life, and likewise in a movie or photograph, it appears smaller.

## A projector may show a movie like 3D graphics

If an image photographed through a lens onto a camera's photomatrix shows objects at the edges as smaller, even though they lie on the same line parallel to the line through the ears, then an image projector should do the opposite and turn a real-life image (recorded video or a photograph) into one that looks more like a rendered 3D graphics image. If this does not happen, it may simply be because the field of view of a video camera, photo camera or eye is too small compared with the field of view in a game. Say the field of view in a game is 45 degrees, while that of a video camera or webcam is 30 or even 20 degrees. That would be why a projector does not completely turn a photograph or recorded video into an image as unbalanced as in 3D games. I know only one thing: if you reduce the field of view in a game (almost impossible in a game, but very easy in 3ds Max), a 3D demo or a 3D tutorial, objects at the edges no longer look so much bigger when the camera rotates while standing at the same point. So I am not even sure whether the imbalance seen in 3D graphics is invisible in camcorders and photo cameras only because their field of view is too small. But one thing is certain: a projector should produce the opposite of the effect a photo camera creates. So if reality really looks like a 3D game, only with a smaller field of view, then my correction is not needed for 3D games, and it would be enough to set a smaller field of view.

Think of a laser pointer: at 90 degrees (light perpendicular to the surface) the red spot is smallest, and at 150 degrees it becomes an ellipse with a larger area (though the light is less intense). Projector light at the edges of the projected image likewise arrives at a larger angle, so objects at the edges will look wider. If the source is a photograph rather than 3D graphics, the photograph then becomes like 3D graphics; and if the source is 3D graphics, the result is even worse than 3D graphics, because even objects on a line parallel to the line through the virtual viewer's ears will look bigger at the edges. But as I said, the projector's distortion can be almost invisible, because movies very likely use a smaller field of view than 3D games.
There is also the possibility (it is hard to understand how lenses work in this case) that projector and photo camera lenses are made so that the picture looks as realistic as possible (with respect to the size of objects in the corners, as the distance from the image center increases). But I do not know whether this is possible, because there may be just one type of lens (one law for light passing through lenses), and you cannot shape a lens however you like, because then you may get no image at all, only a combination of colored spots.

## Field of view in games

In the multiplayer of "Call of Duty: Modern Warfare" you can change the field of view between 65 and 80 degrees by opening the console with "~" and typing "/cg_fov 75", which sets the field of view to 75 degrees. (You can also type /cg_fovscale 1.2, or 2, 3, 4, 5, 6, 7, 8 or 9; just make sure you are on a devmap server. Press "left shift" + "~" and you will see a cheat protection message if you try more than 80 or less than 65.) The default field of view is 65 degrees, which is a pretty smart choice and makes the game look more like a movie. Humans do not see in widescreen, so the left and right sides of a display with 1920*1080 resolution (16:9) are wasted. I think the human eye's field of view is about 60 degrees, somewhere between 40 and 120 degrees. It is hard to say, because the view is blurred away from the center. A webcam's field of view is smaller than the eye's (again hard to compare, because the eye's image is blurred at the edges), so it must be about 45 degrees, maybe between 20 and 70 degrees. So you see how wrong the field of view is in games (90 degrees in most, as in "Far Cry 2"). I did find one tutorial where the field of view is 45 degrees.

### Unreal development kit

You can download the Unreal Development Kit, which includes a game to launch with one level, like a demo version of "Unreal Tournament 3" with only one map. In this demo you can change the field of view from 40 to 120 degrees; if you choose less or more, it still stays within this interval. To change the field of view, open the console with "~" and type "fov 60", and the field of view becomes 60 degrees. Only the guns' field of view stays the same; I guess it is 90 degrees for all guns, or something between 60 and 90 degrees. This of course makes the game look less realistic, but at least you can see the gun; I doubt a webcam held at eye height would capture a gun held at belly height.

Screenshots with FOV 40, 60, 80 and 120 degrees:

### Bulletstorm

To change the field of view in Bulletstorm, rename the file "DefaultCamera.ini" to "DefaultCamera.txt", open it with Notepad on the desktop, edit and save it, rename it back to ".ini" and move it back to the "C:\Program Files\EA\Bulletstorm\StormGame\Config" folder. It works for single player and perhaps for multiplayer too. The weapons' field of view changes together with the FOV of all other objects, as in most multiplayer games.
When walking, the default field of view is 85 degrees, and when running the default FOV is 95 degrees. The default FOV for targeting is 45 degrees and for aiming 40 degrees. Magnifying objects in the center (as when looking through a sniper scope) is the same as using a smaller FOV. But shrinking the scene (or objects) at a big FOV is not the same as increasing the FOV.
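The statement that magnification in the center is the same as a smaller FOV can be put into numbers with a Python sketch. It assumes ordinary perspective projection onto a window of unchanged size, with both FOV values measured along the same axis; the function name `center_zoom` is my own and nothing here is taken from the game's files.

```python
import math

def center_zoom(fov_from_deg, fov_to_deg):
    """How many times larger objects at the screen center appear
    when the field of view changes from fov_from_deg to fov_to_deg."""
    half_from = math.radians(fov_from_deg) / 2.0
    half_to = math.radians(fov_to_deg) / 2.0
    return math.tan(half_from) / math.tan(half_to)

if __name__ == "__main__":
    print(round(center_zoom(85, 45), 3))  # walking -> targeting
    print(round(center_zoom(85, 40), 3))  # walking -> aiming
    print(round(center_zoom(85, 95), 3))  # walking -> running (less than 1)
```

Under these assumptions, going from the walking FOV of 85 degrees to the aiming FOV of 40 degrees magnifies the center of the picture about 2.5 times.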

## How I would describe my effect

My effect, which makes objects smaller at the edges, is the same as the effect of moving your eyes. If you move your eyes over the screen, it is exactly like moving your eyes in the real world. A big field of view in a game (which theoretically can be 0<FOV<180) does not look normal, because a human does not see with a field of view as big as 120 degrees. (A human sees about 60-90 degrees or maybe less. There is also the theory of the fast-scanning eye, but in the end objects away from the center are still blurred; so if you move your eye quickly and say "my field of view is 120 degrees or more", that is exactly my effect.) So I could set the field of view to, say, 120 or 140 degrees and then apply my effect of pushing the edges farther away. Then, as you rotate the camera, each object would stay the same size. It would be like moving your eyes over the monitor as you move them in the real world, and the mouse would be used much less for rotating the camera around. That sounds really fantastic and great, like stepping into reality on the display and forgetting about moving the mouse to see what is behind you or to your left (far fewer mouse motions).

One more important thing. If you set the field of view in a game to about 50 degrees, you will see the game as if it were a movie or photograph (I no longer doubt this). But the eye can concentrate on and clearly see only one point, so the eye's field of view is even smaller: everything not at the center of the eye is more or less blurred (the farther from the center, the more blurred). So the eye's real, unblurred field of view may be about 5 degrees. If the eye sees clearly only what is in the center, that is the same as a very small field of view. And if you need to see the details of all objects clearly by moving the eye, then camera-recorded video does not describe reality correctly, because it still shows objects slightly bigger in the corners than in the center (if the objects are on a straight line parallel to the line through the ears). When you move the eye to look at objects in the corners, they will be smaller than in the center (again, if the objects are on a straight line parallel to the line through the ears). So the algorithm I invented in "tutorial11" does not behave exactly like a camera (but because a webcam has a small field of view, the difference is not big). My algorithm from "tutorial11" (see the earlier sections) behaves exactly as the eye sees everything, whether you move the eye over the monitor or in reality. For example, if I modeled the room you live in as a 3D room (with computer graphics), and you sat in the same chair in the 3D room and in the real room, with the virtual camera at the height of your eyes, then by moving only your eyes you would see everything exactly the same! Glasses with a small display for each eye would make it even more realistic, but that is only a transitional effect and not so important, because a human can understand that it is a monitor and can imagine where his eye is (it sounds a little strange, but it is the most realistic 3D graphics, much more realistic than what you know now).
So mouse motions would be like turning the head, and eye motions would be eye motions, and that is all there is to it. It will be enough to move your eyes instead of the mouse to learn more about what is to your left or right, and you will not see a picture as unrealistic as the one you get by increasing the field of view to 120 degrees or more (and even at 90 degrees FOV it is not very realistic, as I said).

### Projector VS door peephole

My effect is similar to a door peephole's glass lens, only weaker. I finally figured out that there can be two types of lenses: one makes objects at the edges smaller, and the other does not make objects smaller when they lie on a flat plane parallel to the photomatrix [of a photo camera or webcam]. Lenses that image all objects on a plane parallel to the photomatrix at equal size are used in processor manufacturing (with the chip reticle, or mask). Video projectors use the same type of lens as chip manufacturing. The other type is like a door peephole, which is made of glass for seeing who is behind the door and shows objects at the edges as smaller. The human eye lens may be either of these two types, or may differ between people (which is not very likely). But as I said, since a human can focus only on objects in the center, it is almost impossible to say which type the human eye lens is. A video game has the same virtual lens as video projectors and the lenses used in chip manufacturing. Makers of webcams, photo cameras and other video recording devices may not even notice that they are not all using the same type of lens, some being closer to the video projector type and others to the door peephole type. But since photo cameras and other video recorders contain many lenses, I tend to think it more likely that, to make them all work together, the lenses for capturing video must be of the projector and chip-manufacturing type (so everything is as in a game). But since video recorders and photo cameras mostly use a small field of view in their default mode, it is still hard to see the difference (at a big field of view the lens type must be clearly visible). So my algorithm simulates the human eye, which can focus only on one small spot in the center and does not see clearly at the edges.
Because of this, in a video game I want all spots at equal distance from the player to appear at equal size. Then I can also use a bigger field of view, and each spot (or object) will still be of equal size if it is at the same distance as any other spot on the screen. Moving the eye (or eyes) to track any object on the screen will feel like looking around in the real world (because a human can concentrate his gaze only at the center of the eye, so the real field of view for seeing objects clearly is very small, about 5 degrees).

### Projector VS door peephole (maybe the projector has a small field of view or it shows the desktop incorrectly)

I changed my mind after searching for videos from cameras set to a big field of view: they all show images as if through a door peephole. I also saw photographs that look more or less like a view through a door peephole whenever the field of view is more than 30-60 degrees. For CPU reticles, either a small field of view is somehow chosen (through a big distance or something similar), or the transistor density is higher near the chip edges (and this is not fatal for the chip). So a video projector either shows images from a big distance (it has a small field of view), or from a close distance it shows them incorrectly; for example, a projected computer desktop with its open windows and folders would have objects that get bigger farther from the center. I do not want to say too much about what a projector does in this case to video recorded in the real world, but it produces the opposite effect, so real-world video shown through a projector can be transformed to look like a virtual 3D computer world (if the field of view of the recording and of the projector are the same, say 90 degrees). But I give no guarantee; maybe there still are tricks (with optical lenses) that let a projector show the desktop and folders correctly even from a close distance (some kind of magnification and so on).
If it is so that at a big field of view all cameras show objects on a plane parallel to the camera sensor as smaller near the edges (in proportion to their distance from the center), then my correction algorithm really will make 3D computer graphics look more alive and realistic. It should add realism even at a 90 degree field of view, with the feeling that the picture is truly more realistic.
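
The difference between the two lens types can be put into numbers. The C++ sketch below is only an illustration under simple assumptions: a small object, an ideal pinhole camera at the origin, and apparent size taken as object size divided by a distance. The names `sizeByPlaneDistance` and `sizeByEyeDistance` are made up for this example and are not part of the DirectX SDK.

```cpp
#include <cassert>
#include <cmath>

// Position of a small object in camera space:
// x to the right, y up, z away from the camera.
struct Point3 { double x, y, z; };

// "Projector" type (what 3D games do): apparent size depends only on
// the distance z between the object and the camera plane.
double sizeByPlaneDistance(double objectSize, Point3 p) {
    assert(p.z > 0.0);  // the object must be in front of the camera
    return objectSize / p.z;
}

// "Door peephole" type (the effect wanted here): apparent size depends on
// the straight-line distance from the eye to the object.
double sizeByEyeDistance(double objectSize, Point3 p) {
    double d = std::sqrt(p.x * p.x + p.y * p.y + p.z * p.z);
    assert(d > 0.0);    // the object must not sit exactly in the eye
    return objectSize / d;
}
```

For two equal objects on the same plane parallel to the sensor, one at (0; 0; 10) and one at (10; 0; 10), the first function gives the same size for both, while the second gives the off-center object about 0.707 of the size of the central one.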

## Tutorial13 with tutorial11

To apply my invented effect to Tutorial13 the way I did in Tutorial11, go to "C:\Program Files\Microsoft DirectX SDK (June 2010)\Samples\C++\Direct3D10\Tutorials\Tutorial13" and edit the file "tutorial13.fx" by changing this:

   GSPS_INPUT VS( VS_INPUT input )
{
GSPS_INPUT output = (GSPS_INPUT)0;

output.Pos = mul( float4(input.Pos,1), World );
output.Norm = mul( input.Norm, (float3x3)World );
output.Tex = input.Tex;

return output;
}

to this:

   GSPS_INPUT VS( VS_INPUT input )
{
GSPS_INPUT output = (GSPS_INPUT)0;
output.Pos = mul( float4(input.Pos,1), World );
output.Pos = mul( output.Pos, View );
if(output.Pos.y < 0)
{
output.Pos.z =output.Pos.z -output.Pos.y;
}
else
{
output.Pos.z =output.Pos.z +output.Pos.y;
}
if(output.Pos.x < 0)
{
output.Pos.z =output.Pos.z -output.Pos.x;
}
else
{
output.Pos.z =output.Pos.z +output.Pos.x;
}
output.Norm = mul( input.Norm, (float3x3)World );
output.Tex = input.Tex;
return output;
}

Then vertices farther from the center are moved away from the camera, which is what I want, and the quality of the Tutorial13 explosion effect does not change at all.
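
The two if/else pairs in the shader amount to adding the absolute values of x and y to z. A small CPU-side sketch in C++ (not shader code; `pushBack` and the factor `k` are names chosen for this illustration, with k = 1 matching the shader above) makes that visible:

```cpp
#include <cassert>
#include <cmath>

// A vertex position in view space (after multiplying by World and View).
struct Vec3 { float x, y, z; };

// Move the vertex away from the camera by k * (|x| + |y|).
// This mirrors the if/else pairs in the shader: subtracting a negative
// coordinate or adding a positive one both add its absolute value.
Vec3 pushBack(Vec3 v, float k) {
    assert(k >= 0.0f);  // a negative k would pull vertices toward the camera
    v.z += k * (std::fabs(v.x) + std::fabs(v.y));
    return v;
}
```

For example, `pushBack(Vec3{-3, 4, 10}, 1)` leaves x and y alone and gives z = 17, while a vertex on the view axis (x = y = 0) is not moved at all.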

These are screenshots from Tutorial13 with my effect (the Tutorial13 effect is the explosion; my effect sets an equal distance from the virtual player's eye to any object on a plane parallel to the virtual camera sensor; I just did not choose the right number, so objects near the edges become too small, but I only wanted to show that it works): http://imageshack.us/g/189/tutorial13with11tut.jpg/ .

### Tutorial13 update

Actually the previous code worked only in the center, without certain rotations, so different code is needed. The complete code, free of those rotation errors, is this:

   // Push a view-space position away from the camera by 0.7071*(|x|+|y|).
// Adding the absolute values gives the same result as branching on the signs of x and y.
float4 PushBack( float4 viewPos )
{
viewPos.z += 0.7071 * ( abs(viewPos.x) + abs(viewPos.y) );
return viewPos;
}

[maxvertexcount(12)]
void GS( triangle GSPS_INPUT input[3], inout TriangleStream<GSPS_INPUT> TriStream )
{
GSPS_INPUT output;
//
// Calculate the face normal
//
float3 faceEdgeA = input[1].Pos - input[0].Pos;
float3 faceEdgeB = input[2].Pos - input[0].Pos;
float3 faceNormal = normalize( cross(faceEdgeA, faceEdgeB) );
float3 ExplodeAmt = faceNormal*Explode;
//
// Calculate the face center
//
float3 centerPos = (input[0].Pos.xyz + input[1].Pos.xyz + input[2].Pos.xyz)/3.0;
float2 centerTex = (input[0].Tex + input[1].Tex + input[2].Tex)/3.0;
centerPos += faceNormal*Explode;
//
// Output the pyramid; every position is pushed back between View and Projection
//
for( int i=0; i<3; i++ )
{
output.Pos = input[i].Pos + float4(ExplodeAmt,0);
output.Pos = PushBack( mul( output.Pos, View ) );
output.Pos = mul( output.Pos, Projection );
output.Norm = input[i].Norm;
output.Tex = input[i].Tex;
TriStream.Append( output );

int iNext = (i+1)%3;
output.Pos = input[iNext].Pos + float4(ExplodeAmt,0);
output.Pos = PushBack( mul( output.Pos, View ) );
output.Pos = mul( output.Pos, Projection );
output.Norm = input[iNext].Norm;
output.Tex = input[iNext].Tex;
TriStream.Append( output );

output.Pos = float4(centerPos,1) + float4(ExplodeAmt,0);
output.Pos = PushBack( mul( output.Pos, View ) );
output.Pos = mul( output.Pos, Projection );
output.Norm = faceNormal;
output.Tex = centerTex;
TriStream.Append( output );

TriStream.RestartStrip();
}
for( int i=2; i>=0; i-- )
{
output.Pos = input[i].Pos + float4(ExplodeAmt,0);
output.Pos = PushBack( mul( output.Pos, View ) );
output.Pos = mul( output.Pos, Projection );
output.Norm = -input[i].Norm;
output.Tex = input[i].Tex;
TriStream.Append( output );
}
TriStream.RestartStrip();
}

which replaces this code:

   [maxvertexcount(12)]
void GS( triangle GSPS_INPUT input[3], inout TriangleStream<GSPS_INPUT> TriStream )
{
GSPS_INPUT output;

//
// Calculate the face normal
//
float3 faceEdgeA = input[1].Pos - input[0].Pos;
float3 faceEdgeB = input[2].Pos - input[0].Pos;
float3 faceNormal = normalize( cross(faceEdgeA, faceEdgeB) );
float3 ExplodeAmt = faceNormal*Explode;

//
// Calculate the face center
//
float3 centerPos = (input[0].Pos.xyz + input[1].Pos.xyz + input[2].Pos.xyz)/3.0;
float2 centerTex = (input[0].Tex + input[1].Tex + input[2].Tex)/3.0;
centerPos += faceNormal*Explode;

//
// Output the pyramid
//
for( int i=0; i<3; i++ )
{
output.Pos = input[i].Pos + float4(ExplodeAmt,0);
output.Pos = mul( output.Pos, View );
output.Pos = mul( output.Pos, Projection );
output.Norm = input[i].Norm;
output.Tex = input[i].Tex;
TriStream.Append( output );

int iNext = (i+1)%3;
output.Pos = input[iNext].Pos + float4(ExplodeAmt,0);
output.Pos = mul( output.Pos, View );
output.Pos = mul( output.Pos, Projection );
output.Norm = input[iNext].Norm;
output.Tex = input[iNext].Tex;
TriStream.Append( output );

output.Pos = float4(centerPos,1) + float4(ExplodeAmt,0);
output.Pos = mul( output.Pos, View );
output.Pos = mul( output.Pos, Projection );
output.Norm = faceNormal;
output.Tex = centerTex;
TriStream.Append( output );

TriStream.RestartStrip();
}

for( int i=2; i>=0; i-- )
{
output.Pos = input[i].Pos + float4(ExplodeAmt,0);
output.Pos = mul( output.Pos, View );
output.Pos = mul( output.Pos, Projection );
output.Norm = -input[i].Norm;
output.Tex = input[i].Tex;
TriStream.Append( output );
}
TriStream.RestartStrip();
}

The rest of the "tutorial13.fx" code does not need to change.

## Explanation of why an object on the edge of the screen becomes not only bigger, but also wider

This explains why objects at the middle of the right or left side become wider more than they become taller.

It is very simple if you think about it. It is related to perspective. I guess I must explain how the whole rendering algorithm in 3D graphics works. First, all objects are placed in a 3-dimensional coordinate system. Each object consists of vertices. Lines connect the vertices of an object, and textures are put on the faces that those lines enclose. Each object appears rotated by some amount depending on its distance to the camera. In 3D graphics the camera is a simple flat plane, and what is shown is the projection of the vertices and lines onto that plane. Objects whose vertices are farther from the plane are made smaller. Each vertex has coordinates (x; y; z). If the distance of a vertex from the plane is, say, 10, then its x and y coordinates are divided by 10 and become (x/10; y/10). So farther objects (vertices) simply shrink toward the center. But that is not the only process. All objects appear rotated depending on their distance from the virtual camera plane, and this happens because vertices farther from the plane end up closer to the screen center. The bigger the distance, the smaller the apparent rotation. This is related to field of view. It is a natural process in drawing and closely related to perspective. Perspective means that if you are close, you see near objects turned toward you at a bigger angle, and thus they have a bigger ${\displaystyle {\frac {width}{height}}}$ ratio (if the object is at the middle of the left or right side) than the same object far away from you. It is like this: if the projection of a near object onto the plane is 20 high and 50 wide, then a farther object (placed by the same wall) has a projection of, say, 10 high and 20 wide. The width decreased more than the height when the distance increased. This explains why, when the camera in a 3D videogame is rotated while standing at the same point, objects near the edges become not only bigger but also look wider (as if the proportions of height and width changed).
So if I move vertices back depending on their distance from the center of the plane (onto which objects are projected), the object proportions will naturally become normalized, because the object size will be the same everywhere, like in the real world. So this must be the EXACT algorithm for making it feel like the real world, and it is my algorithm, which I showed in Tutorial11. Nothing else needs to be added to it, because it is exact as shown, without any squares and roots (they are not needed in this case, working with vertices). The plane function can also be described as ${\displaystyle z=x+y}$ for the top right quarter of the screen (the if/else branches in the shader take care of negative x and y), so the z coordinate gets farther from the virtual camera plane when the x and y coordinates are near the edges, either big or negatively big (just a reminder that in 3D graphics the z axis points from or to the virtual camera and not up; up is y). This also follows exactly from projection onto a plane and related things (look up the math of planes).

When the plane (virtual camera) is rotated, and an object that was in the center of the screen before the rotation is in the top right corner after it, the object size has no doubt increased. This means that the distance from the object to the plane (camera) decreased a lot, because the right "wing" of the plane came much closer to the object, and now (after rotation) the distance is measured between the plane edge and the object instead of between the plane center and the object (when you look from the same point, the plane rotates only around its center). This plane can be described as ${\displaystyle z=kx+ky.}$ Here k is a rotation coefficient or normalization coefficient, chosen so that all objects look the same size when the camera is rotated while standing at the same point. So when either the x or the y coordinate increases, z (of the plane) increases and the plane corner gets closer to the object by the same amount. If both x and y increase, then by the plane formula the z coordinate (which in 3D graphics points away from us) increases by the sum of x and y multiplied by k. Now we only have to find what exactly k must be. But first, to remove confusion, let me say that this effect of objects becoming wider when they are closer is a natural effect of perspective or field of view. If an object is far, it is no big deal that its farthest vertex is 20 units and its nearest vertex 15 units from the camera plane. But if the same object is close (say with vertices at 10 and 5 units), its nearest vertex is twice as close as its farthest vertex. So for the close object, the farthest vertex moves toward the center twice as much as the nearest vertex, while for the far object the farthest vertex moves only 20/15=1.33 times as much as the nearest one. This explains why near objects naturally look wider than the same objects far away if they lie along a long wall (objects near the camera plane have a bigger projected width:height ratio than far objects).
Actually, every plane of vertices parallel to the camera plane shrinks depending on its distance, and each such plane shrinks toward the center of the monitor screen. So a vertex seems to move not only according to its own coordinates, but also according to the room left by the other vertices that moved before it. In other words, the algorithm works so that the farther a vertex is from the screen center, the more it moves toward the center, and it also moves more toward the center the farther it is from the camera plane. So after the standard 3D algorithm (not mine) the vertex coordinates become ${\displaystyle x_{n}=x-ax-bz}$ and ${\displaystyle y_{n}=y-ay-bz}$ (for the top right quarter of the screen), followed by the projection onto the camera plane and so on, or something like that. Here a depends on the monitor resolution and so on.
So what exactly must c be here:
   if(output.Pos.y < 0)
{
output.Pos.z =output.Pos.z -output.Pos.y*c;
}
else
{
output.Pos.z =output.Pos.z +output.Pos.y*c;
}
if(output.Pos.x < 0)
{
output.Pos.z =output.Pos.z -output.Pos.x*c;
}
else
{
output.Pos.z =output.Pos.z +output.Pos.x*c;
}

?
From experience I can only say roughly that it must be about c=1/5=0.2.
Let's calculate c for the top right quarter of the screen. Assume the display resolution is 640*480. Then the resolution of the top right quarter is 320*240. Let's say that in these formulas:
${\displaystyle x_{n}=x-ax-bz,}$
${\displaystyle y_{n}=y-ay-bz,}$
the coefficient a is equal to b, a=b (another solution could be a=2*b). Here z is the distance from the camera plane to the object vertex; x and y are screen coordinates (in the top right quarter). I also have a feeling that b must be something like b=d-z, where d=>1000.
The coefficient c can easily be calibrated with a big field of view set in the game: stand at one point, look around, and compare how an object looks in the corner and in the center (it should be the same size). A calibration done at a big field of view should then be good for all FOVs and fit exactly.
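
One way to get a starting value for c before calibrating by eye is sketched below in C++. It rests on assumptions that are not in the text above: the object is small, it stays at the same distance d from the viewer while the camera turns by an angle θ around the vertical axis only, and its apparent height is taken as proportional to 1/z after the push. Under those assumptions the height off-center equals the height in the center when cos θ + c·sin θ = 1, that is c = tan(θ/2). `pushedDepth` and `calibrateC` are hypothetical helper names.

```cpp
#include <cassert>
#include <cmath>

const double kPi = 3.14159265358979323846;

// Depth after the push, for a small object at distance d from the viewer
// seen at angle thetaDeg away from the view axis (rotation about Oy only):
// z = d*cos(theta), x = d*sin(theta), pushed depth = z + c*x.
double pushedDepth(double d, double thetaDeg, double c) {
    assert(d > 0.0);
    double t = thetaDeg * kPi / 180.0;
    return d * std::cos(t) + c * d * std::sin(t);
}

// Value of c for which the pushed depth at angle thetaDeg equals the depth d
// of the same object seen in the center: c = tan(theta / 2).
double calibrateC(double thetaDeg) {
    assert(thetaDeg > 0.0 && thetaDeg < 180.0);
    return std::tan(thetaDeg * kPi / 360.0);
}
```

With these assumptions `calibrateC(45)` is about 0.41 and `calibrateC(60)` (the edge of a 120 degree field of view) is about 0.58, so a value found this way changes with the angle and is only a rough guide for the calibration by eye.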

Here are some screenshots showing how the width:height ratio of an object changes with its distance from the camera. The smaller the distance from the camera to the object, the bigger the width:height ratio (of the same object):

http://imageshack.us/g/683/fov120thatswhy.jpg/ .

### How the unmodified 3D graphics algorithm of games works

The distance from any vertex of an object to the camera plane is the shortest distance from the vertex to the plane. With math formulas it is no problem to calculate the distance from a vertex to any plane, and no problem to find the coordinates of the vertex projection on the plane. The projection of a vertex onto the plane is the point of the plane nearest to the vertex. We have a monitor with a resolution of 640*480 pixels. Let ${\displaystyle x_{p}}$ and ${\displaystyle y_{p}}$ be the pixel coordinates on the screen and ${\displaystyle x_{v}}$, ${\displaystyle y_{v}}$ and ${\displaystyle z_{v}}$ the vertex coordinates. In 3D graphics the Ox axis points to the right, the Oy axis points up and the Oz axis points away from us (from the camera). All 3D objects have their places in the 3-dimensional coordinate system, and all vertex positions are known. This is how more distant objects are made to look smaller (at least in the top right quarter):

${\displaystyle x_{p}={\frac {x_{v}}{z_{v}}};}$
${\displaystyle y_{p}={\frac {y_{v}}{z_{v}}}.}$
It is best to imagine that the camera (plane) is the yOx plane and that ${\displaystyle z_{v}}$ counts from 0 up to, say, 640. Distant vertices that would otherwise fall outside the screen move toward the screen center after this calculation, become visible, and are projected onto the plane (if the z-buffer allows). For example, take 3 vertices of an object with coordinates A(-400; -400; 100), B(-500; -500; 100), C(-600; -600; 100); after the algorithm takes effect the coordinates are A(-4; -4; 100), B(-5; -5; 100), C(-6; -6; 100). The ${\displaystyle z_{v}}$ coordinate no longer matters for the position on the camera plane, so only the x and y coordinates of vertices A, B and C are projected.
Another example: for vertices D(-400; -400; 10), E(-500; -500; 10), F(-600; -600; 10), the algorithm gives the new coordinates D(-40; -40; 10), E(-50; -50; 10), F(-60; -60; 10) by the same formulas:
${\displaystyle x_{p}={\frac {x_{v}}{z_{v}}};}$
${\displaystyle y_{p}={\frac {y_{v}}{z_{v}}}.}$
These new vertices are projected onto the plane (camera), together with the points of the lines that connect the vertices and the textures between the lines of the object. Only those vertices can be projected onto the camera plane whose perpendicular to the plane (the shortest line from the vertex to the plane) hits the camera rectangle after the division. So a vertex with coordinates like (-100000; -100000; 10) cannot be projected, but a vertex such as (-100000; -100000; 500) can, and it will be on the screen in the bottom left corner.
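
The worked examples above can be checked with a few lines of C++. This is a bare sketch of the division step only, assuming the 640*480 screen from the text with the origin in the screen center; `project` and `onScreen` are names invented for the example.

```cpp
#include <cassert>
#include <cmath>

struct Vertex { double x, y, z; };   // position relative to the camera
struct Pixel  { double x, y; };      // position relative to the screen center

// Perspective division: x_p = x_v / z_v, y_p = y_v / z_v.
Pixel project(Vertex v) {
    assert(v.z > 0.0);  // only vertices in front of the camera are handled
    return Pixel{ v.x / v.z, v.y / v.z };
}

// True if the projected point lands inside a width*height screen
// whose center is the origin.
bool onScreen(Pixel p, double width, double height) {
    return std::fabs(p.x) <= width / 2.0 && std::fabs(p.y) <= height / 2.0;
}
```

Here `project(Vertex{-400, -400, 100})` gives (-4; -4) and `project(Vertex{-400, -400, 10})` gives (-40; -40). The vertex (-100000; -100000; 10) lands at (-10000; -10000), far outside a 640*480 screen, while (-100000; -100000; 500) lands at (-200; -200), inside the bottom left quarter.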

If the camera (plane) size is 640*480 vertex units and the camera center is at (0; 0; 0), rotate the camera plane around the Oy axis. A vertex that was projected into the camera center before the rotation is then projected, along the shortest path to the camera plane, onto the middle of the right (or left) edge of the camera. And of course the ${\displaystyle z_{v}}$ value changes significantly, say from 1000 to 700, if the rotation about the Oy axis is about 45 degrees. This is how an object becomes bigger when the camera is rotated while standing at the same point (and thus bigger in a corner than in the center). I have only not explained how the ${\displaystyle {\frac {width}{height}}}$ ratio changes when the object becomes bigger (you can think of it as average width and average height). If the object is closer to the camera (plane), its ${\displaystyle {\frac {width}{height}}}$ ratio is bigger, and if the object is farther from the camera (all its vertices are farther from the plane that is the camera), its ${\displaystyle {\frac {width}{height}}}$ ratio is smaller (here we assume the object is at the middle of the left or right edge; if the object is at the middle of the top or bottom edge, the ${\displaystyle {\frac {width}{height}}}$ ratio is bigger for more distant objects).
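
The change of the depth value under rotation can be sketched in C++ as follows. The function name `seenFromTurnedCamera` is hypothetical, and the sketch assumes the camera stays at the origin and turns left around the Oy axis, so that an object straight ahead ends up on the right side of the view.

```cpp
#include <cassert>
#include <cmath>

struct Vertex { double x, y, z; };

// Coordinates of a vertex as seen from a camera that stays at the origin
// and has turned left by angleDeg around the vertical Oy axis.
Vertex seenFromTurnedCamera(Vertex v, double angleDeg) {
    assert(angleDeg > -90.0 && angleDeg < 90.0);  // keep the example simple
    const double kPi = 3.14159265358979323846;
    double a = angleDeg * kPi / 180.0;
    return Vertex{
        v.x * std::cos(a) + v.z * std::sin(a),   // distance to the right of the view axis
        v.y,                                     // height is unchanged
        -v.x * std::sin(a) + v.z * std::cos(a)   // distance from the camera plane
    };
}
```

A vertex at (0; 0; 1000) seen after a 45 degree turn has a depth of about 707 instead of 1000 (1000·cos 45°), so after the division by depth the object appears about 1.41 times bigger.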

## Mystery solved - the object does not become wider, but additional sides of the object are seen

Yes, I was looking for the answer in the wrong direction. If an object is parallel to the camera plane in the center, and the same object is at the middle of the left edge of the screen and also parallel to the camera plane, then simply more sides of the object are seen, and that is why it looks wider. In the center only the front side is visible, while at the middle of the left edge the front side and a little of the right side are visible (think of the object as a cube whose front side is parallel to the camera plane). When the front side of the cube is not parallel to the camera plane, the misunderstanding appears that something is wrong with the object width (when comparing the object in the center and at the middle of the left side of the screen).
So all that is needed is an algorithm that makes objects smaller in the screen corners (to get the scene looking like the real world), without bothering about the width:height ratio of objects. So everything is OK then, and my algorithm, which pushes vertices farther from the camera (plane) the farther they are from the screen center, seems exact.
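
Why the right side of the cube shows up only off-center can be sketched with a standard facing test in C++. The eye is assumed to be at the origin and the cube axis-aligned, with its front side parallel to the camera plane; `faceVisible` and `rightSideVisible` are names made up for this illustration.

```cpp
#include <cassert>

struct Vec3 { double x, y, z; };

double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// A flat face is seen from the eye at the origin when the eye lies on the
// side its outward normal points to, i.e. when dot(normal, pointOnFace) < 0.
bool faceVisible(Vec3 pointOnFace, Vec3 outwardNormal) {
    return dot(outwardNormal, pointOnFace) < 0.0;
}

// Right-hand side face of an axis-aligned cube with the given center and
// half edge length; the front face stays parallel to the camera plane.
bool rightSideVisible(Vec3 cubeCenter, double halfEdge) {
    assert(halfEdge > 0.0);
    Vec3 faceCenter{ cubeCenter.x + halfEdge, cubeCenter.y, cubeCenter.z };
    return faceVisible(faceCenter, Vec3{ 1.0, 0.0, 0.0 });
}
```

For a cube with half edge 1 centered at (0; 0; 10) the right side is hidden, and after moving the cube to (-5; 0; 10) without rotating it the right side becomes visible.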

### update

To understand it better, imagine putting a cube in the center with its front side parallel to the camera sensor (photomatrix). Then move the cube to the left, with its front side still parallel to the sensor. Because of perspective, the right side of the cube additionally becomes visible. The same happens with a sphere (a sphere in a corner looks wider, like an ellipse, and in the center like a circle). It is exactly the same as with the cube, only harder to understand, because a sphere is round and it is hard to distinguish its front side from its right side, which becomes visible in the left corner because of perspective. Suppose we painted the sphere in 6 equal sectors, a front side and the other "sides", like the faces of a cube, and then put the sphere in the center and in the left corner. The painted front side would be the same size in both places, but the right side, painted in another color, would be visible on the left and invisible in the center, where it is hidden behind the front side. To make it even clearer, imagine that the front 1/6 of the sphere is painted red, the right side green, and the other 4 sides in other colors. If the sphere is in the center, only red is visible; if the sphere is at the middle of the left edge, red and green are visible (the front and right side of the sphere, correspondingly). The front side painted red should be understood as the part that is seen entirely red when the sphere is in the center of the screen. If we move the sphere a little to any side (without rotating it), we see the same red area plus an additional color (or red and two other colors if the sphere is moved along a straight line parallel to the camera sensor plane toward, say, the top left corner).
This effect also happens with the "Panasonic Lumix DMC-GH1" camera, which can film like 3D graphics. In reality (with the "Panasonic Lumix DMC-GH1") it happens because the lenses are polished to behave that way. Whatever field of view you use with such a lens, objects on a plane parallel to the camera sensor are always shown with a front side of equal size, no matter how far toward the edge the object is (left, right, up or down); only the shortest distance between the sensor and the plane parallel to it, in which the object travels, matters. An object in the center shows only its front side; an object in the corner shows a front side of the same size as in the center plus another side, which is visible because of perspective (this is why the object looks wider at the middle of the left or right edge; imagine a cube whose front side is parallel to the camera sensor).
Here are some cube images from 3ds Max: http://imageshack.us/g/641/cubealmostincenter.jpg/ .

## Field of view in reality

I measured the field of view of a webcam and it is about 38 degrees. The distance from the webcam to a ruler is 30 cm, only 20 cm of the ruler is visible, and the ruler is parallel to the webcam sensor. So the radius is r=30, and we know that ${\displaystyle c=2\pi r}$. So ${\displaystyle {\frac {c}{2}}=\pi r=3.14\cdot 30=94.2}$ (cm), which is the half circle corresponding to 180 degrees. So 94.2/20=4.71, and 180/4.71=38.2 degrees. This treats the visible 20 cm of the ruler as an arc of that circle, which is close enough at this angle. This is the horizontal field of view of the webcam (about 38 degrees).
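
The same measurement can be written as a short C++ sketch. `fovExactDeg` uses the usual pinhole relation (half the visible length over the distance is the tangent of half the angle), which is an assumption about the webcam, and `fovArcDeg` is the arc shortcut used above; both function names are made up for this example.

```cpp
#include <cassert>
#include <cmath>

const double kPi = 3.14159265358979323846;

// Field of view from a ruler of visible length `visible` held parallel to
// the sensor at distance `distance` (same units), for an ideal pinhole camera.
double fovExactDeg(double visible, double distance) {
    assert(visible > 0.0 && distance > 0.0);
    return 2.0 * std::atan(visible / (2.0 * distance)) * 180.0 / kPi;
}

// The shortcut used above: treat the ruler as an arc of a circle of radius
// `distance`, so the angle in radians is simply visible / distance.
double fovArcDeg(double visible, double distance) {
    assert(visible > 0.0 && distance > 0.0);
    return (visible / distance) * 180.0 / kPi;
}
```

For 20 cm visible at 30 cm the arc shortcut gives about 38.2 degrees and the pinhole relation about 36.9 degrees, so the two agree to within about a degree and a half at this angle.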

Here is a video from a camera with a 117 degree field of view: http://www.youtube.com/watch?v=0TBnXKZ-bhc .
Here is a video with different FOVs: http://www.youtube.com/watch?v=p0eUejPEaI8 .
Here is a video with an 84 degree field of view, and this is how a game would look with my algorithm: http://www.youtube.com/watch?v=v4o9LcO9a3U&NR=1 .
Here are videos with big FOVs:
Here is a video that confuses me, because the FOV effect is like in 3D videogames (maybe the video was made with multiple cameras): http://www.youtube.com/watch?v=qXVTEmBLakY&feature=related .

See how strangely the object size increases at 1:50; but maybe this is not the FOV of 3D videogames, and I am misunderstanding something, and it is just an ordinary small field of view (about 30-40 degrees). Here is a review of this Panasonic Lumix DMC-GH1 camera, which costs about 1000 dollars: http://gizmodo.com/5304887/panasonic-lumix-dmc+gh1-review-a-1500-misfit . Here the Panasonic Lumix DMC-GH1 is filming at a big FOV and everything looks like in a 3D videogame: http://www.youtube.com/watch?v=xF6M6BUgwUg . After this video I see no reason for further development of this theme. A lens can simply be whatever you want, and the eye lens can be one of those types. But since the eye cannot see clearly anything that is not in its center, it is impossible to say what kind of lens the eye uses. Also, the field of view of the eye is possibly 50-60 degrees, and at such an angle not much can be said about the lens type.

## Which field of view effect is more natural in reality, like in 3D games or like fisheye?

If there is a cube whose front side is parallel to the virtual camera sensor (photomatrix), then shrinking the cube with my algorithm (when the cube is at the edge of the screen) makes its front side no longer square, so an average cube size is needed for the cube to look the same from each camera rotation. This is harder to achieve than simply having all cube front sides parallel to the virtual camera sensor be the same size.

And by the way, forget about treating a wide field of view with my algorithm as the same as moving only the eye muscles, because it is not the same: obviously an object in, say, the top right corner has a different shape than in the center.

So the way games look now is more natural, or at least better calibrated for their own type. The same holds for real cameras: cameras of the parallel recording type, unlike door peephole type cameras, have an unambiguous calibration technique (all objects [cubes] with front sides parallel to the camera sensor must show those front sides at equal size). For the effect where all objects stay the same while the camera rotates at one point, spheres would perhaps be the best reference for equal size. But even spheres can be misleading, because they most likely change shape as well, and then only the average size of the shape can serve as the reference for equal size of all objects when standing at one point and just rotating the camera. So for this type of camera it is much harder to fix a correct method for making lenses so that object sizes represent this kind of viewing most correctly (there can always be a small error). But I think that the smaller the object at the edge that is compared with the same small object in the center (standing at the same point and just rotating the camera), the more precisely the value can be calibrated for the rest of the objects. So I think the calibration can be almost error free. Still, which lens type effect is better I leave for you to decide.

## A human sees at least 160 degrees

After research with my own eye I have to admit that the human field of view can be almost 180 degrees. Close one eye, look straight ahead without moving the eye, and hold a finger near your ear at different distances, always on a plane parallel to the eye retina (move the hand with the finger closer to or farther from your head; if you look with the right eye, move the finger of your right hand to the right and try different distances from the eye, but do not move it backward or forward). The finger will be blurry, but you will still see it when you move the finger muscles (so here you are holding the finger at the limit of the field of view, 170-180 degrees). The finger looks smaller than when it is close, although it is on the same parallel plane, which means that a human sees like in my algorithm (more like through a door peephole) rather than like in 3D games. So my algorithm, which determines object size by the distance of the object from the camera center, is like a simulation of the human eye (of how a human sees). The standard computer 3D graphics algorithm determines object size by the shortest distance between the object and the camera treated as a big plane (parallel to the camera sensor); this algorithm, currently used everywhere in 3D graphics, is like what the "Panasonic Lumix DMC-GH1" camera films, but not like most cameras, I guess, and still not like a human sees. I should also mention that at a field of view of 40 degrees or less, my algorithm, the default 3D games algorithm and all cameras show images from which it is impossible to tell which category the algorithm or camera belongs to.

### update

But because of the blur at the edges of eye vision, there is still doubt about which category ("parallel to plane" or "fisheye"/"door peephole") the eye belongs to. If, because of the blur around the center, not much can be said about the eye lens type, then the most logical assumption is that a human sees roughly like a webcam (30-40 degrees FOV; at that FOV it is very hard to tell the lens type or algorithm type), and the rest of the image around the center is something like almost no objects, more like a dream. It can also be that the human eye is between those two types (in that case one just needs to calibrate to the middle between the default 3D games algorithm and my algorithm, which measures the distance from the camera center to any object).

## Applying the edge minimization effect to a textured sphere in DX10

You need the DirectX 10 SDK, which contains the examples; this time I apply the effect of reducing size in the screen corners to a sphere with a texture. Go to "C:\Program Files\Microsoft DirectX SDK (June 2010)\Samples\C++\Direct3D10\DDSWithoutD3DX". This time two files need to be modified: "DDSWithoutD3DX.fx" and "DDSWithoutD3DX10.cpp" (I am not sure whether it will work after the modifications on DirectX 9 hardware).
In the file "DDSWithoutD3DX.fx" change this code:

//--------------------------------------------------------------------------------------
// This shader computes standard transform and lighting
//--------------------------------------------------------------------------------------

   VS_OUTPUT RenderSceneVS( VS_INPUT input )
{
VS_OUTPUT Output;
float3 vNormalWorldSpace;

// Transform the position from object space to homogeneous projection space
Output.Position = mul( input.Position, g_mWorldViewProjection );

// Transform the normal from object space to world space
vNormalWorldSpace = normalize(mul(input.Normal, (float3x3)g_mWorld)); // normal (world space)

   // Calc diffuse color
Output.Diffuse.rgb = max(0.3,dot(vNormalWorldSpace, g_vLightDir)).rrr;
Output.Diffuse.a = 1.0f;

// Just copy the texture coordinate through
Output.TextureUV = input.TextureUV;

return Output;
}

to this code:

//--------------------------------------------------------------------------------------
// This shader computes standard transform and lighting
//--------------------------------------------------------------------------------------

   VS_OUTPUT RenderSceneVS( VS_INPUT input )
{
VS_OUTPUT Output;
float3 vNormalWorldSpace;
// Transform the position from object space to homogeneous projection space
Output.Position = mul( input.Position, g_mWorld );
Output.Position = mul( Output.Position, g_mView );
if(Output.Position.y < 0)
{
Output.Position.z =Output.Position.z -Output.Position.y;
}
else
{
Output.Position.z =Output.Position.z +Output.Position.y;
}
if(Output.Position.x < 0)
{
Output.Position.z =Output.Position.z -Output.Position.x;
}
else
{
Output.Position.z =Output.Position.z +Output.Position.x;
}
Output.Position = mul( Output.Position, g_mProj );
// Transform the normal from object space to world space
vNormalWorldSpace = normalize(mul(input.Normal, (float3x3)g_mWorld)); // normal (world space)
// Calc diffuse color
Output.Diffuse.rgb = max(0.3,dot(vNormalWorldSpace, g_vLightDir)).rrr;
Output.Diffuse.a = 1.0f;
// Just copy the texture coordinate through
Output.TextureUV = input.TextureUV;
return Output;
}
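The two `if`/`else` pairs in the modified shader add the absolute values of x and y to z before the projection. A minimal C++ sketch of the same arithmetic on the CPU, handy for checking numbers by hand. The names `Vec3`, `addAbsXYToDepth` and `projectX` are invented here for illustration and are not part of the SDK sample, and a 90 degree field of view is assumed for the projection:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a view-space position (not an SDK type).
struct Vec3 { float x, y, z; };

// Same effect as the four if/else branches in the shader:
// z grows by |y| + |x|, so points far from the view axis are pushed deeper.
Vec3 addAbsXYToDepth(Vec3 p) {
    p.z += std::fabs(p.y);
    p.z += std::fabs(p.x);
    return p;
}

// Perspective divide for a 90 degree field of view: screen x in [-1, 1].
float projectX(const Vec3& p) {
    assert(p.z > 0.0f);  // the point must be in front of the camera
    return p.x / p.z;
}
```

For the point (1; 0; 1), which would normally land on the right screen edge, `addAbsXYToDepth` gives z = 2 and `projectX` then gives 0.5, so the point is drawn halfway between the center and the edge.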

Also in the file "DDSWithoutD3DX.fx", add the last two lines ("float4x4 g_mView;" and "float4x4 g_mProj;") to the "Global variables" section:

//--------------------------------------------------------------------------------------
// Global variables
//--------------------------------------------------------------------------------------

   float3   g_vLightDir = float3(0,0.707,-0.707);  // Light's direction in world space
float4x4 g_mWorld;                  // World matrix for object
float4x4 g_mWorldViewProjection;    // World * View * Projection matrix
float4x4 g_mView;
float4x4 g_mProj;

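And add these lines to "DDSWithoutD3DX10.cpp", each one near the similar existing lines: the declarations next to the other global handles, the "GetVariableByName" calls where the other handles are obtained, and the "SetMatrix" calls where the other matrices are set before drawing. This file is easy to find in the left pane of Visual Studio; after adding the lines, save "DDSWithoutD3DX10.cpp". Do not try to do this through Notepad, for ".cpp" files it is fatal: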
   ID3D10EffectMatrixVariable*         g_pmView = NULL;
ID3D10EffectMatrixVariable*         g_pmProj = NULL;
g_pmView = g_pEffect10->GetVariableByName( "g_mView" )->AsMatrix();
g_pmProj = g_pEffect10->GetVariableByName( "g_mProj" )->AsMatrix();
g_pmView->SetMatrix( ( float* )&mView );
g_pmProj->SetMatrix( ( float* )&mProj );

To make sure that the effect is working, replace the ball mesh with the car mesh. The ball mesh is "C:\Program Files\Microsoft DirectX SDK (June 2010)\Samples\Media\misc\ball.sdkmesh". So just copy the mesh "carinnards.sdkmesh" from the directory "C:\Program Files\Microsoft DirectX SDK (June 2010)\Samples\Media\ExoticCar" into the ball mesh directory "C:\Program Files\Microsoft DirectX SDK (June 2010)\Samples\Media\misc" under the name "ball.sdkmesh". You will then see how well everything works for the car (the car looks as if seen through a fisheye lens).
Here are some images of the car through the fisheye lens: http://imageshack.us/g/710/carinsteadball.jpg/ .

### update

The car now looks not as if seen through a fisheye lens but as if through a kind of rhombus-shaped lens. So squares and a square root are needed, as in the Pythagorean theorem, or sine and cosine, as for a circle. At first I thought the square root was not necessary: taking the square root after squaring only gives the length (of the vector or radius), and a constant could correct for it in the fisheye lens effect and save some cycles.
So this code:
   if(Output.Position.y < 0)
{
Output.Position.z =Output.Position.z -Output.Position.y;
}
else
{
Output.Position.z =Output.Position.z +Output.Position.y;
}
if(Output.Position.x < 0)
{
Output.Position.z =Output.Position.z -Output.Position.x;
}
else
{
Output.Position.z =Output.Position.z +Output.Position.x;
}

must be replaced with this code:
   if(Output.Position.y < 0)
{
Output.Position.z =Output.Position.z +0.3* pow(Output.Position.y, 2);
}
else
{
Output.Position.z =Output.Position.z + 0.3* pow(Output.Position.y, 2);
}
if(Output.Position.x < 0)
{
Output.Position.z =Output.Position.z +0.3 * pow(Output.Position.x, 2);
}
else
{
Output.Position.z =Output.Position.z +0.3 * pow(Output.Position.x, 2);
}

This can be written as "Output.Position.z =Output.Position.z +0.3 * Output.Position.x * Output.Position.x;" and so on.
Here is a faster way (squaring makes the minus sign disappear, so the "if" and "else" statements are not needed):
   Output.Position.z =Output.Position.z +0.3 * Output.Position.y * Output.Position.y;
Output.Position.z =Output.Position.z +0.3 * Output.Position.x * Output.Position.x;

And here is an even faster way:
   Output.Position.z =Output.Position.z +0.3 * Output.Position.y * Output.Position.y +0.3 * Output.Position.x * Output.Position.x;

The version with the square root looks like this:
   Output.Position.z =Output.Position.z +pow(Output.Position.y * Output.Position.y + Output.Position.x * Output.Position.x , 0.5);

which is the same as this:
   Output.Position.z =Output.Position.z +sqrt(Output.Position.y * Output.Position.y + Output.Position.x * Output.Position.x);

Here are images of the car using these formulas. The car no longer has rhombus-shaped edges and is oval, as if truly seen through a fisheye (door peephole) lens: http://imageshack.us/g/412/notrhombuscar.jpg/ .
After all, the formulas that do not use the square root are most likely wrong (unless there is some automatic normalization of the vector or something like that, which is not likely), so only the last two are correct.
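Why the first version gave rhombus-shaped edges and the square-root version gives round ones can be checked with a small C++ sketch. Points with equal |x| + |y| lie on a rhombus around the view axis, while points with equal sqrt(x² + y²) lie on a circle. The function names are invented here for illustration; the default constant 0.3 is the one used in the intermediate shader version:

```cpp
#include <cassert>
#include <cmath>

// Offset added to z by the first version of the shader (if/else branches).
double rhombusOffset(double x, double y) {
    return std::fabs(x) + std::fabs(y);
}

// Offset added to z by the version with the square root.
double circleOffset(double x, double y) {
    return std::sqrt(x * x + y * y);
}

// Offset added by the intermediate version without the square root.
// It grows with the square of the radius, so it equals circleOffset
// only at radius 0 and at radius 1 / k, not everywhere.
double squaredOffset(double x, double y, double k = 0.3) {
    assert(k > 0.0);
    return k * (x * x + y * y);
}
```

On the axis, at (1; 0), `rhombusOffset` and `circleOffset` both return 1. On the diagonal at almost the same radius, (0.7071; 0.7071), `circleOffset` is still about 1 but `rhombusOffset` is about 1.4142, so the diagonal directions are squeezed more, which would explain the rhombus outline.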

## Distance-to-edge calibration (an object in the center must be about 1.4 times bigger than at the right or left edge)

The default horizontal field of view is 90 degrees, so the left and the right half of the screen each cover 45 degrees. Coordinates in computer graphics are (x; y; z) = (horizontal; vertical; depth). If the point in the center has coordinates A(0; 0; 1) and the point in the middle of the right edge has coordinates B(1; 0; 1), then the distance from the origin O(0; 0; 0), where the viewer is, to point B is about 1.4142 times bigger than to point A, because ${\displaystyle OA={\sqrt {0^{2}+0^{2}+1^{2}}}=1}$ and ${\displaystyle OB={\sqrt {1^{2}+0^{2}+1^{2}}}={\sqrt {2}}=1.4142.}$ The distance from point D(0.7071; 0.7071; 1) to point O is the same as from point B to point O, because ${\displaystyle OD={\sqrt {0.7071^{2}+0.7071^{2}+1^{2}}}={\sqrt {0.5+0.5+1}}={\sqrt {2}}=1.4142.}$
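These distances can be checked with a few lines of C++. This is only a sketch of the arithmetic; the function name `distanceFromViewer` is invented here:

```cpp
#include <cassert>
#include <cmath>

// Straight-line distance from the viewer O(0; 0; 0) to a point (x; y; z).
double distanceFromViewer(double x, double y, double z) {
    assert(z > 0.0);  // only points in front of the viewer are considered
    return std::sqrt(x * x + y * y + z * z);
}
```

Calling it for A(0; 0; 1) gives 1, and for B(1; 0; 1) and D(0.7071; 0.7071; 1) it gives about 1.4142 in both cases.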
So the calibration must be done by checking how many times smaller the front face of a cube, parallel to the camera, is at the middle of the right edge than in the center. This front face must be exactly ${\displaystyle {\sqrt {2}}=1.4142}$ times smaller at the middle of the right (or left) edge than in the center (it is also better if half of the cube's front face is off screen and if the front face is no more than about 100 pixels wide).
Since the horizontal field of view is 90 degrees, you can work out theoretically how many times smaller an object in the top left corner must be than in the center. For an aspect ratio of 1:1 the corner is the point C(1; 1; 1), and the cube's front face at point C must be 1.73205 times smaller than at point A(0; 0; 1), because the distance to point C is ${\displaystyle OC={\sqrt {1^{2}+1^{2}+1^{2}}}={\sqrt {3}}=1.73205.}$ The vector OC{1; 1; 1} makes the same angle, about 54.7 degrees, with each of the 3 axes [x, y, z]. But the monitor aspect ratio is not 1:1 but 4:3 or 16:9, so the vertical field of view is smaller than the horizontal one (for a 4:3 aspect ratio it is 2*arctan(3/4), about 73.7 degrees, not simply 3*90/4=67.5 degrees) and the corner is not as far away as point C. If you want to know how many times smaller the cube's front face, parallel to the camera, must be in the top left corner of a 4:3 screen than at point A(0; 0; 1), you need the coordinates of the corner point. We know that for a 90 degree FOV the x and z coordinates are equal to 1. The y coordinate is 3*1/4=0.75, so point E in the top left corner of the monitor has coordinates E(1; 0.75; 1) (the sign of x does not change the distance), and the distance from O(0; 0; 0) to point E is ${\displaystyle OE={\sqrt {1^{2}+0.75^{2}+1^{2}}}={\sqrt {2.5625}}=1.600781059.}$ So the front face of a small cube, parallel to the sensor of the virtual camera, must be 1.600781 times smaller in the top left corner than in the center of the screen, if the cube is moved to the top left corner only in the x and y directions and its z coordinate does not change.
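The same numbers can be recomputed for other aspect ratios. The sketch below assumes an ordinary perspective projection, where the vertical half angle follows from the horizontal half angle through the tangent, and a 90 degree horizontal field of view for the corner distance. Both function names are invented here:

```cpp
#include <cassert>
#include <cmath>

// Vertical field of view in degrees for a given horizontal field of view
// and aspect ratio (width : height), for an ordinary perspective projection.
double verticalFovDegrees(double horizontalFovDegrees, double width, double height) {
    assert(width > 0.0 && height > 0.0);
    const double pi = std::acos(-1.0);
    const double halfH = horizontalFovDegrees * pi / 360.0;  // half angle in radians
    const double halfV = std::atan(std::tan(halfH) * height / width);
    return 2.0 * halfV * 180.0 / pi;
}

// Distance from the viewer to the screen corner at depth z = 1,
// assuming a 90 degree horizontal field of view (so x = 1 at the edge).
double cornerDistance(double width, double height) {
    assert(width > 0.0 && height > 0.0);
    const double y = height / width;  // 0.75 for 4:3
    return std::sqrt(1.0 + y * y + 1.0);
}
```

For 4:3, `verticalFovDegrees(90, 4, 3)` gives about 73.74 and `cornerDistance(4, 3)` gives about 1.6008; for 1:1 the corner distance is about 1.7321.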
But in many games, and especially in 3D rendering programs such as 3D Studio Max, the horizontal FOV is 90 degrees whatever the monitor aspect ratio (only the vertical FOV changes, and it is smaller for widescreen monitors). So you can calibrate just by comparing the cube's front face, parallel to the camera, in the center of the screen and at the middle of the left or right edge. At the middle of the left edge the front face must be ${\displaystyle {\sqrt {2}}=1.4142}$ times smaller than in the center (the cube's z coordinate must not change while the cube is moved from the center to the middle of the left edge).
I think no constants are needed, only this code:
   output.Pos.z =output.Pos.z + sqrt(output.Pos.x * output.Pos.x + output.Pos.y * output.Pos.y);

It should make any object (a cube's front face parallel to the camera) at the same z distance look 1.4142 times smaller at the middle of the left (or right) edge of the screen than in the center (if the front face was 50 pixels wide and 50 pixels high, then at the middle of the left edge it will be 50/1.4142=35 pixels wide and 35 pixels high). For the most accurate calibration, half of the cube's front face must be off screen (at the middle of the left edge), and it is best if the front face is as small as possible, or moved far from the camera in the z direction, so that it consists of very few pixels. But you get the idea: a front face 50 pixels wide in the center is just fine (it must then become 35 pixels wide at the edge, or 50/1.6=31 pixels wide and 31 pixels high in the top left corner of a screen with a 4:3 aspect ratio).
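Before measuring on screen, the expected numbers can be estimated on the CPU. The sketch below evaluates the shader line and, for comparison, the straight-line distance used in the derivation above. It assumes that on-screen size is inversely proportional to the depth fed into the perspective division; all three function names are invented here:

```cpp
#include <cassert>
#include <cmath>

// Depth after the shader line  z = z + sqrt(x*x + y*y).
double lensDepth(double x, double y, double z) {
    assert(z > 0.0);
    return z + std::sqrt(x * x + y * y);
}

// Straight-line distance from the viewer, for comparison.
double viewerDistance(double x, double y, double z) {
    return std::sqrt(x * x + y * y + z * z);
}

// On-screen size under perspective division: size scales with 1 / depth,
// so going from depth zOld to depth zNew multiplies it by zOld / zNew.
double scaledSize(double sizePixels, double zOld, double zNew) {
    assert(zOld > 0.0 && zNew > 0.0);
    return sizePixels * zOld / zNew;
}
```

For the edge point (1; 0; 1), `viewerDistance` gives about 1.4142 and `lensDepth` gives 2, so a 50 pixel face scales to about 35 pixels with the first depth and to 25 pixels with the second. Measuring the cube on screen, as described above, shows which of the two factors the modified shader actually produces.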

## update

If a cube with its front face parallel to the camera shows a 50*50 pixel square in the center, then after applying the formula "output.Pos.z =output.Pos.z + sqrt(output.Pos.x * output.Pos.x + output.Pos.y * output.Pos.y);" a cube at the middle of the left edge will be moved closer to the center (because the formula expands the field of view a little). So you first need to see where the cube from the middle of the left edge ends up after the formula, and only then compare the 50 pixel front face in the center with the cube in its new place. But there is a good chance that this formula already has everything that is needed, is already calibrated, and does not need constants (to multiply the value under the square root).

So the cube's front face at the middle of the screen edge must be ${\displaystyle {\sqrt {2}}=1.4142}$ times farther away than the front face in the center of the screen. This means (from the perspective formulas y/z and x/z) that after applying the formula "output.Pos.z =output.Pos.z + sqrt(output.Pos.x * output.Pos.x + output.Pos.y * output.Pos.y);" the position of the front face must be p=(screen width/2)/1.4142=(1280/2)/1.4142=640/1.4142=452.54834, or about 453 pixels from the center, and not 640 pixels from the center. So the 50 pixel wide front face in the center must be compared with the front face of the cube near the left or right edge (not at the very end of the edge), 453 pixels from the screen center to the center of that smaller front face, and this smaller front face must be 50/1.4142=35 pixels wide. The smaller cube must be on the horizontal center line. To the right that is 640+453=1093 pixels from the left side of the monitor to the center of the smaller front face, if the screen resolution is 1280*720 (16:9) or 1280*960 (4:3). Calibrating only on the horizontal line calibrates the whole scene properly, because the horizontal field of view (FOV) is the same for all monitor aspect ratios (90 degrees in most cases). This calibration method is good only for games or tutorials with a [horizontal] FOV of 90 degrees. Another example: if the screen is 1600 pixels wide, then a cube that was at the far right edge before the lens effect will, after the lens effect, be ${\displaystyle p={\frac {\frac {1600}{2}}{\sqrt {2}}}={\frac {800}{\sqrt {2}}}={\frac {800}{1.4142}}=565.685425\approx 566}$ pixels to the right of the screen center (or 800+566=1366 pixels from the left side of the monitor). The size of the cube [front face parallel to the camera] will also be ${\displaystyle {\sqrt {2}}}$ times smaller (35 pixels at the position (566; 0) on the right instead of 50 pixels in the screen center (0; 0)). So in almost all cases it does not matter whether the screen resolution is 1600*1200 or 1600*900, because the horizontal FOV is the same for any aspect ratio.
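The pixel positions in this calculation, half the screen width divided by the square root of 2, can be reproduced for any screen width with a short C++ sketch. The function names are invented here, and the division by sqrt(2) is taken from the text as the calibration target rather than derived from the shader:

```cpp
#include <cassert>
#include <cmath>

// Distance in pixels from the screen center at which an object that was
// at the left or right edge is expected after the lens effect
// (half the screen width divided by sqrt(2), as in the text).
double edgeOffsetAfterLens(double screenWidthPixels) {
    assert(screenWidthPixels > 0.0);
    return (screenWidthPixels / 2.0) / std::sqrt(2.0);
}

// The same position on the right half, measured from the left side of the monitor.
double edgePositionFromLeft(double screenWidthPixels) {
    return screenWidthPixels / 2.0 + edgeOffsetAfterLens(screenWidthPixels);
}
```

For a width of 1280 this gives about 452.5 and 1092.5 pixels, and for a width of 1600 about 565.7 and 1365.7 pixels.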

## "SoftParticles_2010": first fisheye lens effect for the whole scene

To launch the project "SoftParticles_2010.vcxproj" from the directory "C:\Program Files\Microsoft DirectX SDK (June 2010)\Samples\C++\Direct3D10\SoftParticles" (and then push the green button, as in all the other demos), you need to have the "DirectX SDK" installed and any edition of "Visual Studio C++" version 10 (2010) or later. It can be Express, Ultimate or Professional; what matters is that it is "VS C++".

Here are images of how it looks with the code "output.Pos.z =output.Pos.z + sqrt(output.Pos.x * output.Pos.x + output.Pos.y * output.Pos.y);" :
Here are screenshots of how it looks with the code "output.Pos.z =output.Pos.z + sqrt(output.Pos.x * output.Pos.x + output.Pos.y * output.Pos.y) *2;" :

In the "SoftParticles.fx" file, this original code:
   cbuffer cbPerObject
{
matrix g_mWorldViewProj;
matrix g_mWorldView;
matrix g_mWorld;
matrix g_mInvView;
matrix g_mInvProj;
float3 g_vViewDir;
};

   //
// Vertex shader for drawing the scene
//
PSSceneIn VSScenemain(VSSceneIn input)
{
PSSceneIn output;
output.vPos = mul( float4(input.Pos,1), g_mWorld );
output.Pos = mul( float4(input.Pos,1), g_mWorldViewProj );
output.Norm = normalize( mul( input.Norm, (float3x3)g_mWorld ) );
output.Tan = normalize( mul( input.Tan, (float3x3)g_mWorld ) );
output.Tex = input.Tex;
return output;
}

must be changed to this code:
   cbuffer cbPerObject
{
matrix g_mWorldViewProj;
matrix g_mWorldView;
matrix g_mWorld;
matrix g_mView;
matrix g_mProj;
matrix g_mInvView;
matrix g_mInvProj;
float3 g_vViewDir;
};

   //
// Vertex shader for drawing the scene
//
PSSceneIn VSScenemain(VSSceneIn input)
{
PSSceneIn output;
output.vPos = mul( float4(input.Pos,1), g_mWorld );
output.Pos = mul(mul( float4(input.Pos,1), g_mWorld ), g_mView);
output.Pos.z =output.Pos.z + sqrt(output.Pos.x * output.Pos.x + output.Pos.y * output.Pos.y);
output.Pos = mul(output.Pos, g_mProj);
output.Norm = normalize( mul( input.Norm, (float3x3)g_mWorld ) );
output.Tan = normalize( mul( input.Tan, (float3x3)g_mWorld ) );
output.Tex = input.Tex;
return output;
}
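The combined matrix `g_mWorldViewProj` has to be split because the z change must happen in view space, after the world and view transforms but before the projection. A C++ sketch of that order of operations follows. `Vec4`, `Mat4`, `mul`, `perspectiveLH` and `screenX` are names invented here, and the projection matrix is written out by hand in the left-handed layout Direct3D uses, under the assumption that the sample's camera produces a matrix of this form:

```cpp
#include <cassert>
#include <cmath>

struct Vec4 { double x, y, z, w; };
struct Mat4 { double m[4][4]; };

// Row vector times matrix, the same convention as HLSL mul(v, M).
Vec4 mul(const Vec4& v, const Mat4& a) {
    return { v.x * a.m[0][0] + v.y * a.m[1][0] + v.z * a.m[2][0] + v.w * a.m[3][0],
             v.x * a.m[0][1] + v.y * a.m[1][1] + v.z * a.m[2][1] + v.w * a.m[3][1],
             v.x * a.m[0][2] + v.y * a.m[1][2] + v.z * a.m[2][2] + v.w * a.m[3][2],
             v.x * a.m[0][3] + v.y * a.m[1][3] + v.z * a.m[2][3] + v.w * a.m[3][3] };
}

// Left-handed perspective matrix: after mul(), w holds the view-space z.
Mat4 perspectiveLH(double fovYRadians, double aspect, double zNear, double zFar) {
    assert(aspect > 0.0 && zNear > 0.0 && zFar > zNear);
    const double yScale = 1.0 / std::tan(fovYRadians / 2.0);
    const double q = zFar / (zFar - zNear);
    Mat4 p = {};                 // all elements start at zero
    p.m[0][0] = yScale / aspect;
    p.m[1][1] = yScale;
    p.m[2][2] = q;
    p.m[2][3] = 1.0;
    p.m[3][2] = -zNear * q;
    return p;
}

// View-space position -> normalized screen x in [-1, 1].
// With applyLens set, z is increased before projection, as in the shader.
double screenX(Vec4 viewPos, const Mat4& proj, bool applyLens) {
    if (applyLens)
        viewPos.z += std::sqrt(viewPos.x * viewPos.x + viewPos.y * viewPos.y);
    const Vec4 clip = mul(viewPos, proj);
    return clip.x / clip.w;      // perspective divide
}
```

With a 90 degree field of view and aspect ratio 1, the view-space point (1; 0; 1) lands at screen x = 1 without the lens line and at 0.5 with it. The projection copies z into w, so changing z after the projection would no longer affect the divide in the same way.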


And in the "SoftParticles.cpp" file, these pieces of code:
   ID3D10EffectMatrixVariable* g_pmWorldViewProj = NULL;
ID3D10EffectMatrixVariable* g_pmWorldView = NULL;
ID3D10EffectMatrixVariable* g_pmWorld = NULL;
ID3D10EffectMatrixVariable* g_pmInvView = NULL;
ID3D10EffectMatrixVariable* g_pmInvProj = NULL;

   // Obtain the parameter handles
g_pmWorldViewProj = g_pEffect10->GetVariableByName( "g_mWorldViewProj" )->AsMatrix();
g_pmWorldView = g_pEffect10->GetVariableByName( "g_mWorldView" )->AsMatrix();
g_pmWorld = g_pEffect10->GetVariableByName( "g_mWorld" )->AsMatrix();
g_pmInvView = g_pEffect10->GetVariableByName( "g_mInvView" )->AsMatrix();
g_pmInvProj = g_pEffect10->GetVariableByName( "g_mInvProj" )->AsMatrix();

   // Get the projection & view matrix from the camera class
D3DXMATRIX mWorld;
D3DXMATRIX mView;
D3DXMATRIX mProj;
D3DXMATRIX mInvView;
D3DXMATRIX mInvProj;
D3DXMatrixIdentity( &mWorld );
mProj = *g_Camera.GetProjMatrix();
mView = *g_Camera.GetViewMatrix();
D3DXMATRIX mWorldViewProj = mWorld*mView*mProj;
D3DXMATRIX mWorldView = mWorld*mView;
D3DXMatrixInverse( &mInvView, NULL, &mView );
D3DXMatrixInverse( &mInvProj, NULL, &mProj);

   g_pmWorldViewProj->SetMatrix( (float*)&mWorldViewProj );
g_pmWorldView->SetMatrix( (float*)&mWorldView );
g_pmWorld->SetMatrix( (float*)&mWorld );
g_pmInvView->SetMatrix( (float*)&mInvView );
g_pmInvProj->SetMatrix( (float*)&mInvProj );

must be changed to these:
   ID3D10EffectMatrixVariable* g_pmWorldViewProj = NULL;
ID3D10EffectMatrixVariable* g_pmWorldView = NULL;
ID3D10EffectMatrixVariable* g_pmWorld = NULL;
ID3D10EffectMatrixVariable* g_pmView = NULL;
ID3D10EffectMatrixVariable* g_pmProj = NULL;
ID3D10EffectMatrixVariable* g_pmInvView = NULL;
ID3D10EffectMatrixVariable* g_pmInvProj = NULL;

   // Obtain the parameter handles
g_pmWorldViewProj = g_pEffect10->GetVariableByName( "g_mWorldViewProj" )->AsMatrix();
g_pmWorldView = g_pEffect10->GetVariableByName( "g_mWorldView" )->AsMatrix();
g_pmWorld = g_pEffect10->GetVariableByName( "g_mWorld" )->AsMatrix();
g_pmView = g_pEffect10->GetVariableByName( "g_mView" )->AsMatrix();
g_pmProj = g_pEffect10->GetVariableByName( "g_mProj" )->AsMatrix();
g_pmInvView = g_pEffect10->GetVariableByName( "g_mInvView" )->AsMatrix();
g_pmInvProj = g_pEffect10->GetVariableByName( "g_mInvProj" )->AsMatrix();

   // Get the projection & view matrix from the camera class
D3DXMATRIX mWorld;
D3DXMATRIX mView;
D3DXMATRIX mProj;
D3DXMATRIX mInvView;
D3DXMATRIX mInvProj;
D3DXMatrixIdentity( &mWorld );
mProj = *g_Camera.GetProjMatrix();
mView = *g_Camera.GetViewMatrix();
D3DXMATRIX mWorldViewProj = mWorld*mView*mProj;
D3DXMATRIX mWorldView = mWorld*mView;
D3DXMatrixInverse( &mInvView, NULL, &mView );
D3DXMatrixInverse( &mInvProj, NULL, &mProj);

   g_pmWorldViewProj->SetMatrix( (float*)&mWorldViewProj );
g_pmWorldView->SetMatrix( (float*)&mWorldView );
g_pmWorld->SetMatrix( (float*)&mWorld );
g_pmView->SetMatrix( (float*)&mView );
g_pmProj->SetMatrix( (float*)&mProj );
g_pmInvView->SetMatrix( (float*)&mInvView );
g_pmInvProj->SetMatrix( (float*)&mInvProj );


This is from the DirectX SDK (June 2010). This sample is in "C:\Program Files\Microsoft DirectX SDK (June 2010)\Samples\C++\Direct3D\ShadowMap". You also need "Visual Studio C++", Professional or any other edition, but it must be Visual Studio 2010 (maybe VC 2008 will do, but I am not sure). To open it, choose the file "ShadowMap_2010.vcxproj", confirm the prompts (OK or similar) a few times, then push the green triangle and it should run. You can then edit the "ShadowMap.fx" file with Notepad (on the desktop) or, if you understand programming quite well, in Visual Studio through the folder icon in the corner (in the Solution Explorer window): Solution 'ShadowMap_2010' (1 project) >> ShadowMap >> Shaders >> ShadowMap.fx. After editing the fx file, save it (File >> Save ShadowMap.fx) and then run.
This time nothing needs to be changed; just add two lines to the "ShadowMap.fx" file.
    //-----------------------------------------------------------------------------
// Desc: Process vertex for scene
//-----------------------------------------------------------------------------
void VertScene( float4 iPos : POSITION,
float3 iNormal : NORMAL,
float2 iTex : TEXCOORD0,
out float4 oPos : POSITION,
out float2 Tex : TEXCOORD0,
out float4 vPos : TEXCOORD1,
out float3 vNormal : TEXCOORD2,
out float4 vPosLight : TEXCOORD3 )
{
//
// Transform position to view space
//
vPos = mul( iPos, g_mWorldView );
vPos.z =vPos.z+ sqrt(vPos.x * vPos.x + vPos.y * vPos.y); //my line
//
// Transform to screen coord
//
oPos = mul( vPos, g_mProj );
//
// Compute view space normal
//
vNormal = mul( iNormal, (float3x3)g_mWorldView );
//
// Propagate texture coord
//
Tex = iTex;
vPos = mul( iPos, g_mWorldView ); //my line
//
// Transform the position to light projection space, or the
// projection space as if the camera is looking out from
// the spotlight.
//
vPosLight = mul( vPos, g_mViewToLightProj );
}

The two lines "vPos.z =vPos.z+ sqrt(vPos.x * vPos.x + vPos.y * vPos.y);" and "vPos = mul( iPos, g_mWorldView );" were inserted into the code by me. The first line is for the geometry. The second line ("vPos = mul( iPos, g_mWorldView );") is needed for a proper light map. But there is a problem: all objects in the scene, especially the flat 2D planes, have a very small number of vertices (triangles). In today's games every object, however simple, even a plane, needs at least 10-100 vertices, because otherwise you will not get effects like looking through glass, water drops or hot air. So there is a small chance that the second line ("vPos = mul( iPos, g_mWorldView );") is not required (I mean that with objects that have more triangles everything might look correct without it).
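The point about too few vertices can be illustrated numerically. The lens line runs once per vertex, and the rasterizer draws straight lines between the transformed vertices, so a long edge with no vertex in the middle cannot bend. The C++ sketch below, with invented names and an assumed 90 degree field of view, measures how far the middle of such an edge is from where the formula would put it:

```cpp
#include <cassert>
#include <cmath>

// Screen-space y (for a 90 degree field of view) of a view-space point
// after the lens line  z = z + sqrt(x*x + y*y).
double lensScreenY(double x, double y, double z) {
    assert(z > 0.0);
    const double zLens = z + std::sqrt(x * x + y * y);
    return y / zLens;
}

// Error at the middle of a horizontal edge from (-halfWidth; y; z) to
// (+halfWidth; y; z) that has vertices only at its two ends: the edge is
// drawn straight between the transformed end points, while the formula
// applied at the middle itself would place that point elsewhere.
double midpointError(double halfWidth, double y, double z) {
    const double drawn = 0.5 * (lensScreenY(-halfWidth, y, z) + lensScreenY(halfWidth, y, z));
    const double wanted = lensScreenY(0.0, y, z);
    return wanted - drawn;
}
```

For an edge across the whole screen, `midpointError(1.0, 0.5, 1.0)` is about 0.097 in normalized screen units, while for a short edge, `midpointError(0.01, 0.5, 1.0)`, it is below 0.0001. Splitting planes into more triangles therefore brings the drawn shape closer to the intended curve.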

Here is the third line (in the "ShadowMap.fx" file), in this place:
  //-----------------------------------------------------------------------------
// Desc: Process vertex for the shadow map
//-----------------------------------------------------------------------------
void VertShadow( float4 Pos : POSITION,
float3 Normal : NORMAL,
out float4 oPos : POSITION,
out float2 Depth : TEXCOORD0 )
{
//
// Compute the projected coordinates
//
oPos = mul( Pos, g_mWorldView );
//    oPos.z =oPos.z+ sqrt(oPos.x * oPos.x + oPos.y * oPos.y); //my line (shadows with fisheye not in same place as in normal case (this line wrong))
oPos = mul( oPos, g_mProj );
//
// Store z and w in our spare texcoord
//
Depth.xy = oPos.zw;
}

This third line "oPos.z =oPos.z+ sqrt(oPos.x * oPos.x + oPos.y * oPos.y);" must not be inserted, because with it the shadows cast from object onto object do not match their coordinates exactly (the shadows must be in the same places as when everything is viewed without the fisheye lens effect, that is, as in the unedited original version). There is still a chance that the columns of arrows have many vertical lines but only a few horizontal lines and that this causes the effect, but because the shadows on the walls shrink I still think that this third line must not be inserted. IF THERE WERE OBJECTS WITH MORE POLYGONS, I COULD TELL FOR SURE WHETHER THIS (THIRD) LINE MUST OR MUST NOT BE INSERTED.
Judging from the fact that the cones of the closest arrows are lit in exactly the same places with the fisheye lens as in the original (unedited) case, and that the blue rings are also lit the same in both cases, I COME TO THE CONCLUSION THAT THERE IS A 90% CHANCE THAT THE SECOND LINE IS REQUIRED AND ONLY ABOUT A 5% CHANCE THAT THE THIRD LINE IS REQUIRED (it also puts the shadows in the proper place, but the shadow from arrow onto arrow is wrong in one place and, as I said, this may be due to the too poor geometry of the 3D arrows). The third line ("oPos.z =oPos.z+ sqrt(oPos.x * oPos.x + oPos.y * oPos.y);") shrinks the shadows toward the center.
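One possible explanation for the shadows shrinking toward the center, assuming the shadow pass is rendered from the light's point of view (as the comment about the spotlight in `VertScene` suggests): the shadow map is written through `VertShadow` but read in the scene pass through `vPosLight`, and both must map a surface point to the same shadow-map coordinate. A C++ sketch of that mismatch, with invented names and an assumed 90 degree light cone:

```cpp
#include <cassert>
#include <cmath>

// Shadow-map x coordinate in [-1, 1] of a point given in the light's view space.
// distort = true mimics adding sqrt(x*x + y*y) to z before the projection.
double shadowMapX(double x, double y, double z, bool distort) {
    assert(z > 0.0);
    const double depth = distort ? z + std::sqrt(x * x + y * y) : z;
    return x / depth;
}

// True when the coordinate used to write the shadow map (VertShadow) and the
// coordinate used to read it (vPosLight, computed from the undistorted
// position when the second line is present) agree.
bool writeMatchesRead(double x, double y, double z, bool distortShadowPass) {
    const double written = shadowMapX(x, y, z, distortShadowPass);
    const double read = shadowMapX(x, y, z, false);
    return std::fabs(written - read) < 1e-9;
}
```

For the point (0.5; 0; 1), the undistorted coordinate is 0.5, but with the third line it is written at about 0.333, closer to the center of the shadow map, so the lookup in the scene pass no longer finds the depth that belongs to that point.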

You can add a 4th line, which makes the geometry of the wooden lamp look like the rest of the scene, as through a fisheye lens (yes, for some reason the lamp geometry is processed separately from the vertices of the room and the other objects). This 4th line affects only the geometry of the lamp itself and is therefore unimportant. Here it is:
   //-----------------------------------------------------------------------------
// Desc: Process vertex for the light object
//-----------------------------------------------------------------------------
void VertLight( float4 iPos : POSITION,
float3 iNormal : NORMAL,
float2 iTex : TEXCOORD0,
out float4 oPos : POSITION,
out float2 Tex : TEXCOORD0 )
{
//
// Transform position to view space
//
oPos = mul( iPos, g_mWorldView );
oPos.z =oPos.z+ sqrt(oPos.x * oPos.x + oPos.y * oPos.y); //my line (has no effect on the rest of the scene or on the shadows)
//
// Transform to screen coord
//
oPos = mul( oPos, g_mProj );
//
// Propagate texture coord
//
Tex = iTex;
}


## Fisheye effect per pixel for HDRPipeline (2010)

Steps needed to apply the fisheye effect to a DirectX 10 SDK demo (HDRPipeline):

1) Install the DirectX 10 SDK and Visual Studio C++ 2010 (if Microsoft does not remove this HDRPipeline demo from later versions of the DirectX SDK, it should be possible to reproduce all of this with the corresponding later versions of DirectX and Visual Studio);
2) To open this (HDRPipeline) demo, double-click HDRPipeline_2010.vcxproj (or, if you do not see the vcxproj extension, you can probably open the project from the Visual Studio C++ menu), then click "Restart this application under different credentials" (maybe you can click "Ignore"; I have not tried that yet), then Yes (to give permission);
3) To run the demo, click the green triangle button or press Ctrl+F5. You should see a window with the HDR scene;
4) Go to "C:\Program Files\Microsoft DirectX SDK (June 2010)\Samples\C++\Direct3D\HDRPipeline\Shader Code" and copy the file "PostProcessing.psh" to the desktop (and make sure it is not marked as read-only; I do not remember whether that is necessary for editing and saving with Notepad);
5) Rename "PostProcessing.psh" to "PostProcessing.txt" before editing and saving [and then back to "PostProcessing.psh" after editing and saving];
6) This original code (in the file "PostProcessing.psh"):
 //------------------------------------------------------------------
//
// Samples the input texture 16x according to the provided offsets
// and then writes the average to the output texture
//------------------------------------------------------------------
float4 DownSample( in float2 t : TEXCOORD0 ) : COLOR
{
float4 average = { 0.0f, 0.0f, 0.0f, 0.0f };
for( int i = 0; i < 16; i++ )
{
average += tex2D( tex0, t + float2(tcDownSampleOffsets[i].x, tcDownSampleOffsets[i].y) );
}
average *= ( 1.0f / 16.0f );
return average;
}

must be replaced with this code (it is for monitors with a 4:3 aspect ratio):
 //------------------------------------------------------------------
//
// Samples the input texture 16x according to the provided offsets
// and then writes the average to the output texture
//------------------------------------------------------------------
float4 DownSample( in float2 t : TEXCOORD0 ) : COLOR
{
// float2 f=((t*2-1)*(sqrt((t.x*2-1)*(t.x*2-1)+(t.y*2-1)*(t.y*2-1))/3.4142+1))*0.5+0.5; //my
float2 g;
// g=((t*2-1)*(sqrt((t.x*2-1)*(t.x*2-1)+0.75*0.75*(t.y*2-1)*(t.y*2-1))/5+1))*0.5+0.5; //my
// g.x=((t.x*2-1)*(sqrt((t.x*2-1)*(t.x*2-1)+0.75*0.75*(t.y*2-1)*(t.y*2-1))/5+1))*0.5+0.5; //my
// g.y=((0.75*(t.y*2-1)*(sqrt((t.x*2-1)*(t.x*2-1)+0.75*0.75*(t.y*2-1)*(t.y*2-1))/5+1))*0.5+0.375)*4/3; //my
g.x=((t.x*2-1)*sqrt(1+(t.x*2-1)*(t.x*2-1)+0.75*0.75*(t.y*2-1)*(t.y*2-1)))*0.5+0.5; //my
//idea: l=sqrt((t.x*2-1)*(t.x*2-1)+0.75*0.75*(t.y*2-1)*(t.y*2-1)) is the length of one side and the other side has length 1 (all this is for a 90 degree horizontal fov), so the result is LL=sqrt(l*l+1), where LL is the maximum length in the corner; lx=sqrt(1+(t.x*2-1)*(t.x*2-1)) would be the length along the x axis only //my
g.y=((0.75*(t.y*2-1)*sqrt(1+(t.x*2-1)*(t.x*2-1)+0.75*0.75*(t.y*2-1)*(t.y*2-1)))*0.5+0.375)*4/3; //my
float4 average = { 0.0f, 0.0f, 0.0f, 0.0f };
for( int i = 0; i < 16; i++ )
{
//      average += tex2D( tex0, t + float2(tcDownSampleOffsets[i].x, tcDownSampleOffsets[i].y) ); //original
//      average += tex2D( tex0, max(abs(t.x), abs(t.y))*t/sqrt((t.x*2-1)*(t.x*2-1)+(t.y*2-1)*(t.y*2-1)) + float2(tcDownSampleOffsets[i].x, tcDownSampleOffsets[i].y) );  //my
//   average += tex2D( tex0, ((t*2-1)*(sqrt((t.x*2-1)*(t.x*2-1)+(t.y*2-1)*(t.y*2-1))/5.828+1))*0.5+0.5 + float2(tcDownSampleOffsets[i].x, tcDownSampleOffsets[i].y) ); //my
//  average += tex2D( tex0, t*(sqrt((t.x*2-1)*(t.x*2-1)+(t.y*2-1)*(t.y*2-1))/5.828+1) + float2(tcDownSampleOffsets[i].x, tcDownSampleOffsets[i].y) ); //my
average += tex2D( tex0, ((g*2-1)/1.41421356)*0.5+0.5 + float2(tcDownSampleOffsets[i].x, tcDownSampleOffsets[i].y) ); //my
}
average *= ( 1.0f / 16.0f );
return average;
}

The code above contains explanations and attempts (so that you can see that not everything that looks correct at first glance is correct), so here is only the really correct code, repeated:
 float4 DownSample( in float2 t : TEXCOORD0 ) : COLOR
{
float2 g;
g.x=((t.x*2-1)*sqrt(1+(t.x*2-1)*(t.x*2-1)+0.75*0.75*(t.y*2-1)*(t.y*2-1)))*0.5+0.5; //my
//idea: l=sqrt((t.x*2-1)*(t.x*2-1)+0.75*0.75*(t.y*2-1)*(t.y*2-1)) is the length of one side and the other side has length 1 (all this is for a 90 degree horizontal fov), so the result is LL=sqrt(l*l+1), where LL is the maximum length in the corner; lx=sqrt(1+(t.x*2-1)*(t.x*2-1)) would be the length along the x axis only //my
g.y=((0.75*(t.y*2-1)*sqrt(1+(t.x*2-1)*(t.x*2-1)+0.75*0.75*(t.y*2-1)*(t.y*2-1)))*0.5+0.375)*4/3; //my
float4 average = { 0.0f, 0.0f, 0.0f, 0.0f };
for( int i = 0; i < 16; i++ )
{
//      average += tex2D( tex0, t + float2(tcDownSampleOffsets[i].x, tcDownSampleOffsets[i].y) ); //original
average += tex2D( tex0, ((g*2-1)/1.41421356)*0.5+0.5 + float2(tcDownSampleOffsets[i].x, tcDownSampleOffsets[i].y) ); //my
}
average *= ( 1.0f / 16.0f );
return average;
}
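The coordinate remapping in `DownSample` can be tried on the CPU to see which source position each output pixel reads. The C++ sketch below repeats the arithmetic of the `g.x` and `g.y` lines and the final division by 1.41421356 in one function. `Float2` and `remapTexCoord` are names invented here, and the aspect factor 0.75 is the 4:3 value from the shader:

```cpp
#include <cassert>
#include <cmath>

struct Float2 { double x, y; };  // hypothetical stand-in for HLSL float2

// CPU version of the texture-coordinate remapping in DownSample.
// t is in [0, 1] x [0, 1]; aspect is height / width (0.75 for 4:3).
Float2 remapTexCoord(Float2 t, double aspect = 0.75) {
    assert(aspect > 0.0);
    const double u = t.x * 2.0 - 1.0;             // [-1, 1], 0 at the screen center
    const double v = aspect * (t.y * 2.0 - 1.0);  // scaled to the same units as u
    const double len = std::sqrt(1.0 + u * u + v * v);  // distance to the point (u; v; 1)
    const double s = len / std::sqrt(2.0);        // 1 at the middle of the left/right edge
    return { u * s * 0.5 + 0.5, (v * s / aspect) * 0.5 + 0.5 };
}
```

The center (0.5; 0.5) and the middle of the right edge (1; 0.5) map to themselves, while the output pixel at (0.75; 0.5) reads the source at about x = 0.698, so the middle of the image is magnified and the edges are compressed. The corner (1; 1) maps to about 1.066 in both coordinates, slightly outside the texture, so what appears there depends on the sampler's address mode.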

7) Notice that to make this blur effect for HDR, Microsoft uses blurred textures of lower resolution, and the maximum number of pixels added together is for some reason only 16 (a strange coincidence with the 8 x87 FPU registers + 8 general purpose registers, or the 16 SSE registers);
8) Here is the Microsoft code from the file "PostProcess.cpp" (this file is in the directory "C:\Program Files\Microsoft DirectX SDK (June 2010)\Samples\C++\Direct3D\HDRPipeline"), with the original lines commented out and my changes marked "//my":
 //--------------------------------------------------------------------------------------
//  CreateResources( )
//
//      DESC:
//          This function creates all the necessary resources to produce the post-
//          processing effect for the HDR input. When this function completes successfully
//          rendering can commence. A call to 'DestroyResources()' should be made when
//          the application closes.
//
//      PARAMS:
//          pDevice      : The current device that resources should be created with/from
//          pDisplayDesc : Describes the back-buffer currently in use, can be useful when
//                         creating GUI based resources.
//
//      NOTES:
//          n/a
//--------------------------------------------------------------------------------------
HRESULT CreateResources( IDirect3DDevice9* pDevice, const D3DSURFACE_DESC* pDisplayDesc )
{
// [ 0 ] GATHER NECESSARY INFORMATION
//-----------------------------------
HRESULT hr = S_OK;
LPD3DXBUFFER pCode = NULL;
V( HDREnumeration::FindBestHDRFormat( &PostProcess::g_fmtHDR ) );
if( FAILED( hr ) )
{
// High Dynamic Range Rendering is not supported on this device!
OutputDebugString( L"PostProcess::CreateResources() - Current hardware does not allow HDR rendering!\n" );
return hr;
}
// [ 1 ] CREATE BRIGHT PASS TEXTURE
//---------------------------------
// Bright pass texture is 1/2 the size of the original HDR render target.
// Part of the pixel shader performs a 2x2 downsampling. The downsampling
// is intended to reduce the number of pixels that need to be worked on -
// in general, HDR pipelines tend to be fill-rate heavy.
V( pDevice->CreateTexture(
//     pDisplayDesc->Width/2, pDisplayDesc->Height/2, //original
pDisplayDesc->Width , pDisplayDesc->Height , //my
1, D3DUSAGE_RENDERTARGET, PostProcess::g_fmtHDR,
D3DPOOL_DEFAULT, &PostProcess::g_pBrightPassTex, NULL
) );
if( FAILED( hr ) )
{
// We couldn't create the texture - lots of possible reasons for this...
OutputDebugString(
L"PostProcess::CreateResources() - Could not create bright-pass render target. Examine D3D Debug Output for details.\n" );
return hr;
}
// [ 2 ] CREATE BRIGHT PASS PIXEL SHADER
//--------------------------------------
WCHAR str[MAX_PATH];
V_RETURN( DXUTFindDXSDKMediaFileCch( str, MAX_PATH, L"Shader Code\\PostProcessing.psh" ) );
V( D3DXCompileShaderFromFile(
str,
NULL, NULL,
"BrightPass",
"ps_2_0",
0,
&pCode,
NULL,
&PostProcess::g_pBrightPassConstants
) );
if( FAILED( hr ) )
{
// in the 'Shader Code' folder to get a proper compile breakdown.
OutputDebugString(
L"PostProcess::CreateResources() - Compiling of 'BrightPass' from 'PostProcessing.psh' failed!\n" );
return hr;
}
V( pDevice->CreatePixelShader( reinterpret_cast< DWORD* >( pCode->GetBufferPointer() ),
&PostProcess::g_pBrightPassPS ) );
if( FAILED( hr ) )
{
// Couldn't turn the compiled shader into an actual, usable, pixel shader!
OutputDebugString(
L"PostProcess::CreateResources() - Could not create a pixel shader object for 'BrightPass'.\n" );
pCode->Release();
return hr;
}
pCode->Release();
// [ 3 ] CREATE DOWNSAMPLED TEXTURE
//---------------------------------
// This render target is 1/8th the size of the original HDR image (or, more
// importantly, 1/4 the size of the bright-pass). The downsampling pixel
// shader performs a 4x4 downsample in order to reduce the number of pixels
// that are sent to the horizontal/vertical blurring stages.
V( pDevice->CreateTexture(
//    pDisplayDesc->Width/8, pDisplayDesc->Height/8,   // original
pDisplayDesc->Width, pDisplayDesc->Height,   //my
1, D3DUSAGE_RENDERTARGET, PostProcess::g_fmtHDR,
D3DPOOL_DEFAULT, &PostProcess::g_pDownSampledTex, NULL
) );
if( FAILED( hr ) )
{
// We couldn't create the texture - lots of possible reasons for this...
OutputDebugString(
L"PostProcess::CreateResources() - Could not create downsampling render target. Examine D3D Debug Output for details.\n" );
return hr;
}
// [ 3 ] CREATE DOWNSAMPLING PIXEL SHADER
//---------------------------------------
V_RETURN( DXUTFindDXSDKMediaFileCch( str, MAX_PATH, L"Shader Code\\PostProcessing.psh" ) );
V( D3DXCompileShaderFromFile(
str,
NULL, NULL,
"DownSample",
"ps_2_0",
0,
&pCode,
NULL,
&PostProcess::g_pDownSampleConstants
) );
if( FAILED( hr ) )
{
// in the 'Shader Code' folder to get a proper compile breakdown.
OutputDebugString(
L"PostProcess::CreateResources() - Compiling of 'DownSample' from 'PostProcessing.psh' failed!\n" );
return hr;
}
V( pDevice->CreatePixelShader( reinterpret_cast< DWORD* >( pCode->GetBufferPointer() ),
&PostProcess::g_pDownSamplePS ) );
if( FAILED( hr ) )
{
// Couldn't turn the compiled shader into an actual, usable, pixel shader!
OutputDebugString(
L"PostProcess::CreateResources() - Could not create a pixel shader object for 'DownSample'.\n" );
pCode->Release();
return hr;
}
pCode->Release();
// [ 4 ] CREATE HORIZONTAL BLOOM TEXTURE
//--------------------------------------
// The horizontal bloom texture is the same dimension as the down sample
// render target. Combining a 4x4 downsample operation as well as a
// horizontal blur leads to a prohibitively high number of texture reads.
V( pDevice->CreateTexture(
//   pDisplayDesc->Width/8, pDisplayDesc->Height/8,  //original
pDisplayDesc->Width, pDisplayDesc->Height,  //my
1, D3DUSAGE_RENDERTARGET, PostProcess::g_fmtHDR,
D3DPOOL_DEFAULT, &PostProcess::g_pBloomHorizontal, NULL
) );
if( FAILED( hr ) )
{
// We couldn't create the texture - lots of possible reasons for this...
OutputDebugString(
L"PostProcess::CreateResources() - Could not create horizontal bloom render target. Examine D3D Debug Output for details.\n" );
return hr;
}
// [ 5 ] CREATE HORIZONTAL BLOOM PIXEL SHADER
//-------------------------------------------
V_RETURN( DXUTFindDXSDKMediaFileCch( str, MAX_PATH, L"Shader Code\\PostProcessing.psh" ) );
V( D3DXCompileShaderFromFile(
str,
NULL, NULL,
"HorizontalBlur",
"ps_2_0",
0,
&pCode,
NULL,
&PostProcess::g_pHBloomConstants
) );
if( FAILED( hr ) )
{
// in the 'Shader Code' folder to get a proper compile breakdown.
OutputDebugString(
L"PostProcess::CreateResources() - Compiling of 'HorizontalBlur' from 'PostProcessing.psh' failed!\n" );
return hr;
}
V( pDevice->CreatePixelShader( reinterpret_cast< DWORD* >( pCode->GetBufferPointer() ),
&PostProcess::g_pHBloomPS ) );
if( FAILED( hr ) )
{
// Couldn't turn the compiled shader into an actual, usable, pixel shader!
OutputDebugString(
L"PostProcess::CreateResources() - Could not create a pixel shader object for 'HorizontalBlur'.\n" );
pCode->Release();
return hr;
}
pCode->Release();
// [ 6 ] CREATE VERTICAL BLOOM TEXTURE
//------------------------------------
// The vertical blur texture must be the same size as the horizontal blur texture
// so as to get a correct 2D distribution pattern.
V( pDevice->CreateTexture(
//    pDisplayDesc->Width/8, pDisplayDesc->Height/8,  //original
pDisplayDesc->Width, pDisplayDesc->Height,  //my
1, D3DUSAGE_RENDERTARGET, PostProcess::g_fmtHDR,
D3DPOOL_DEFAULT, &PostProcess::g_pBloomVertical, NULL
) );
if( FAILED( hr ) )
{
// We couldn't create the texture - lots of possible reasons for this...
OutputDebugString(
L"PostProcess::CreateResources() - Could not create vertical bloom render target. Examine D3D Debug Output for details.\n" );
return hr;
}
// [ 7 ] CREATE VERTICAL BLOOM PIXEL SHADER
//-----------------------------------------
V_RETURN( DXUTFindDXSDKMediaFileCch( str, MAX_PATH, L"Shader Code\\PostProcessing.psh" ) );
V( D3DXCompileShaderFromFile(
str,
NULL, NULL,
"VerticalBlur",
"ps_2_0",
0,
&pCode,
NULL,
&PostProcess::g_pVBloomConstants
) );
if( FAILED( hr ) )
{
// in the 'Shader Code' folder to get a proper compile breakdown.
OutputDebugString(
L"PostProcess::CreateResources() - Compiling of 'VerticalBlur' from 'PostProcessing.psh' failed!\n" );
return hr;
}
V( pDevice->CreatePixelShader( reinterpret_cast< DWORD* >( pCode->GetBufferPointer() ),
&PostProcess::g_pVBloomPS ) );
if( FAILED( hr ) )
{
// Couldn't turn the compiled shader into an actual, usable, pixel shader!
OutputDebugString(
L"PostProcess::CreateResources() - Could not create a pixel shader object for 'VerticalBlur'.\n" );
pCode->Release();
return hr;
}
pCode->Release();
return hr;
}

But to get a truly per-pixel 'render to target texture', the width and height must not be divided by 2 for the BRIGHT PASS TEXTURE, nor divided by 8 for the DOWNSAMPLED TEXTURE, the HORIZONTAL BLOOM TEXTURE or the VERTICAL BLOOM TEXTURE.
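To get a feel for what removing the divisors costs, here is a minimal C++ sketch that compares the pixel counts of the render targets with and without the divisors. `TargetSize` and `scaledTarget` are names made up for this illustration; they are not part of the SDK sample.

```cpp
#include <cstdint>

// Hypothetical helper type: dimensions of one render target in pixels.
struct TargetSize {
    std::uint32_t width;
    std::uint32_t height;
    std::uint64_t pixels() const { return static_cast<std::uint64_t>(width) * height; }
};

// Size of a render target whose sides are the display sides divided by `divisor`
// (2 = original bright pass, 8 = original down-sample and bloom, 1 = no division).
TargetSize scaledTarget(TargetSize display, std::uint32_t divisor) {
    return TargetSize{ display.width / divisor, display.height / divisor };
}
```

For a 1600x1200 display, `scaledTarget(display, 8)` is 200x150 (30000 pixels), while the undivided target has 1920000 pixels, so every pass that writes to it has 64 times as many pixels to fill.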

### How many cores is the GPU hiding? About 8

Here is everything that is in the "PostProcessing.psh" file:
//======================================================================
//
//      HIGH DYNAMIC RANGE RENDERING DEMONSTRATION
//      Written by Jack Hoxley, November 2005
//
//======================================================================
//------------------------------------------------------------------
//  GLOBAL VARIABLES
//------------------------------------------------------------------
// float4 tcDownSampleOffsets[16];         // The sampling offsets used by 'DownSample' and 'BrightPass' //original
// float HBloomWeights[9];                 // Description of the sampling distribution used by  //original
// float HBloomOffsets[9];                 // the HorizontalBlur() function  //original
// float VBloomWeights[9];                 // Description of the sampling distribution used by //original
// float VBloomOffsets[9];                 // the VerticalBlur() function //original
float fBrightPassThreshold;             // Values greater than this are accepted for the bright pass
float4 tcDownSampleOffsets[16];         // The sampling offsets used by 'DownSample' and 'BrightPass'  //original 16
float HBloomWeights[15];                 // Description of the sampling distribution used by  //my
float HBloomOffsets[15];                 // the HorizontalBlur() function  //my
float VBloomWeights[15];                 // Description of the sampling distribution used by //my
float VBloomOffsets[15];                 // the VerticalBlur() function  //my
sampler tex0 : register( s0 );          // Whatever texture is set using IDirect3DDevice9::SetTexture( 0, ... )
//------------------------------------------------------------------
// BRIGHT PASS AND 2x2 DOWN-SAMPLING PIXEL SHADER
//
// performs the 2x2 down sample, and then accepts any pixels
// that are greater or equal to the configured threshold
//------------------------------------------------------------------
float4 BrightPass( in float2 t : TEXCOORD0 ) : COLOR
{
float4 average = { 0.0f, 0.0f, 0.0f, 0.0f };
// load in and combine the 16 samples (4 in the original) from the source HDR texture
//   for( int i = 0; i < 4; i++ )  //original
for( int i = 0; i < 16; i++ )
{
average += tex2D( tex0, t + float2( tcDownSampleOffsets[i].x, tcDownSampleOffsets[i].y ) );
}
average *= 0.0625f;  //original 0.25f
// Determine the brightness of this particular pixel. As with the luminance calculations
// there are 4 possible variations on this calculation:
// 1. Do a very simple mathematical average:
//float luminance = dot( average.rgb, float3( 0.33f, 0.33f, 0.33f ) );
// 2. Perform a more accurately weighted average:
//float luminance = dot( average.rgb, float3( 0.299f, 0.587f, 0.114f ) );
// 3. Take the maximum value of the incoming, same as computing the
//    brightness/value for an HSV/HSB conversion:
float luminance = max( average.r, max( average.g, average.b ) );
// 4. Compute the luminance component as per the HSL colour space:
//float luminance = 0.5f * ( max( average.r, max( average.g, average.b ) ) + min( average.r, min( average.g, average.b ) ) );
// 5. Use the magnitude of the colour
//float luminance = length( average.rgb );
// Determine whether this pixel passes the test...
//    if( luminance < fBrightPassThreshold )   //original
//        average = float4( 0.0f, 0.0f, 0.0f, 1.0f );  //original
average.rgb =pow(average.rgb, fBrightPassThreshold*4);  //my line
// Write the colour to the bright-pass render target
return average;
}
//------------------------------------------------------------------
//
// Samples the input texture 16x according to the provided offsets
// and then writes the average to the output texture
//------------------------------------------------------------------
float4 DownSample( in float2 t : TEXCOORD0 ) : COLOR
{
// float2 f=((t*2-1)*(sqrt((t.x*2-1)*(t.x*2-1)+(t.y*2-1)*(t.y*2-1))/3.4142+1))*0.5+0.5; //my
float2 g;  //my
// g=((t*2-1)*(sqrt((t.x*2-1)*(t.x*2-1)+0.75*0.75*(t.y*2-1)*(t.y*2-1))/5+1))*0.5+0.5; //my
// g.x=((t.x*2-1)*(sqrt((t.x*2-1)*(t.x*2-1)+0.75*0.75*(t.y*2-1)*(t.y*2-1))/5+1))*0.5+0.5; //my
// g.y=((0.75*(t.y*2-1)*(sqrt((t.x*2-1)*(t.x*2-1)+0.75*0.75*(t.y*2-1)*(t.y*2-1))/5+1))*0.5+0.375)*4/3; //my
g.x=((t.x*2-1)*sqrt(1+(t.x*2-1)*(t.x*2-1)+0.75*0.75*(t.y*2-1)*(t.y*2-1)))*0.5+0.5; //my
// The idea: l=sqrt((t.x*2-1)*(t.x*2-1)+0.75*0.75*(t.y*2-1)*(t.y*2-1)) is the length of one side and the other side has length 1 (all of this is for a 90 degree horizontal FOV), so the result is LL=sqrt(l*l+1), where LL is the maximum length (in the corner), and lx=sqrt(1+(t.x*2-1)*(t.x*2-1)) would be the length along the x axis only //my
g.y=((0.75*(t.y*2-1)*sqrt(1+(t.x*2-1)*(t.x*2-1)+0.75*0.75*(t.y*2-1)*(t.y*2-1)))*0.5+0.375)*4/3; //my
float4 average = { 0.0f, 0.0f, 0.0f, 0.0f };
for( int i = 0; i < 16; i++ )
{
//      average += tex2D( tex0, t + float2(tcDownSampleOffsets[i].x, tcDownSampleOffsets[i].y) ); //original
//      average += tex2D( tex0, max(abs(t.x), abs(t.y))*t/sqrt((t.x*2-1)*(t.x*2-1)+(t.y*2-1)*(t.y*2-1)) + float2(tcDownSampleOffsets[i].x, tcDownSampleOffsets[i].y) );  //my
//   average += tex2D( tex0, ((t*2-1)*(sqrt((t.x*2-1)*(t.x*2-1)+(t.y*2-1)*(t.y*2-1))/5.828+1))*0.5+0.5 + float2(tcDownSampleOffsets[i].x, tcDownSampleOffsets[i].y) ); //my
//  average += tex2D( tex0, t*(sqrt((t.x*2-1)*(t.x*2-1)+(t.y*2-1)*(t.y*2-1))/5.828+1) + float2(tcDownSampleOffsets[i].x, tcDownSampleOffsets[i].y) ); //my
average += tex2D( tex0, ((g*2-1)/1.41421356)*0.5+0.5 + float2(tcDownSampleOffsets[i].x, tcDownSampleOffsets[i].y) ); //my
}
average *= ( 1.0f / 16.0f );
return average;
}
//------------------------------------------------------------------
// HORIZONTAL BLUR
//
// Takes 15 samples from the down-sampled texture (7 either side and
// one central; the original took 9 samples, 4 either side) biased by
// the provided weights. Different weight
// distributions will give more subtle/pronounced blurring.
//------------------------------------------------------------------
float4 HorizontalBlur( in float2 t : TEXCOORD0 ) : COLOR
{
float4 color = { 0.0f, 0.0f, 0.0f, 0.0f };
// for( int i = 0; i < 9; i++ ) //original
for( int i = 0; i < 15; i++ )
{
color += (tex2D( tex0, t + float2( HBloomOffsets[i], 0.0f ) ) * HBloomWeights[i] );
}
return float4( color.rgb, 1.0f );
}
//------------------------------------------------------------------
// VERTICAL BLUR
//
// Takes 15 samples from the down-sampled texture (7 above/below and
// one central; the original took 9 samples, 4 above/below) biased by
// the provided weights. Different weight
// distributions will give more subtle/pronounced blurring.
//------------------------------------------------------------------
float4 VerticalBlur( in float2 t : TEXCOORD0 ) : COLOR
{
float4 color = { 0.0f, 0.0f, 0.0f, 0.0f };
//  for( int i = 0; i < 9; i++ ) //original
for( int i = 0; i < 15; i++ )
{
color += (tex2D( tex0, t + float2( 0.0f, VBloomOffsets[i] ) ) * VBloomWeights[i] );
}
return float4( color.rgb, 1.0f );
}
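
To see what the modified `DownSample()` does to the texture coordinates, the same arithmetic can be run on the CPU. This is only a sketch that copies the active `g.x`, `g.y` and `tex2D` coordinate expressions from the shader above; `Float2` and `remapDownSampleCoord` are names invented for the illustration, and the per-sample `tcDownSampleOffsets` term is left out.

```cpp
#include <cmath>

// Hypothetical stand-in for the HLSL float2 type.
struct Float2 { double x, y; };

// Repeats the coordinate arithmetic of the modified DownSample() shader:
// t is the input texture coordinate in [0,1] x [0,1]; the result is the
// coordinate that the shader passes to tex2D (before the sample offsets).
Float2 remapDownSampleCoord(Float2 t) {
    const double aspect = 0.75;          // height / width of a 4:3 screen, as in the shader
    const double u = t.x * 2.0 - 1.0;    // horizontal position in [-1,1]
    const double v = t.y * 2.0 - 1.0;    // vertical position in [-1,1]
    const double s = std::sqrt(1.0 + u * u + aspect * aspect * v * v);

    const double gx = (u * s) * 0.5 + 0.5;
    const double gy = ((aspect * v * s) * 0.5 + 0.375) * 4.0 / 3.0;

    // Same final scaling by 1/sqrt(2) as in the tex2D call.
    return Float2{ ((gx * 2.0 - 1.0) / 1.41421356) * 0.5 + 0.5,
                   ((gy * 2.0 - 1.0) / 1.41421356) * 0.5 + 0.5 };
}
```

With these formulas the screen centre (0.5, 0.5) maps to itself and the middle of the right edge (1, 0.5) maps to x = 1, but the corner (1, 1) maps to about 1.066 on both axes, slightly outside the source texture. What is read there depends on the sampler address mode, which is not shown on this page.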


Here is the original code from the file "PostProcess.cpp":
//--------------------------------------------------------------------------------------
//  PerformPostProcessing( )
//
//      DESC:
//          This is the core function for this module - it takes the raw HDR image
//          generated by the 'HDRScene' component and puts it through 4 post
//          processing stages - to end up with a bloom effect on the over-exposed
//          (HDR) parts of the image.
//
//      PARAMS:
//          pDevice : The device that will be rendered to
//
//      NOTES:
//          n/a
//
//--------------------------------------------------------------------------------------
HRESULT PerformPostProcessing( IDirect3DDevice9* pDevice )
{
// [ 0 ] BRIGHT PASS
//------------------
LPDIRECT3DTEXTURE9 pHDRSource = NULL;
if( FAILED( HDRScene::GetOutputTexture( &pHDRSource ) ) )
{
// Couldn't get the input - means that none of the subsequent
// work is worth attempting!
OutputDebugString( L"PostProcess::PerformPostProcessing() - Unable to retrieve source HDR information!\n" );
return E_FAIL;
}
LPDIRECT3DSURFACE9 pBrightPassSurf = NULL;
if( FAILED( PostProcess::g_pBrightPassTex->GetSurfaceLevel( 0, &pBrightPassSurf ) ) )
{
// Can't get the render target. Not good news!
OutputDebugString(
L"PostProcess::PerformPostProcessing() - Couldn't retrieve top level surface for bright pass render target.\n" );
return E_FAIL;
}
pDevice->SetRenderTarget( 0, pBrightPassSurf );         // Configure the output of this stage
pDevice->SetTexture( 0, pHDRSource );                   // Configure the input..
PostProcess::g_pBrightPassConstants->SetFloat( pDevice, "fBrightPassThreshold", PostProcess::g_BrightThreshold );
// We need to compute the sampling offsets used for this pass.
// A 2x2 sampling pattern is used, so we need to generate 4 offsets
D3DXVECTOR4 offsets[4];
// Find the dimensions for the source data
D3DSURFACE_DESC srcDesc;
pHDRSource->GetLevelDesc( 0, &srcDesc );
// Because the source and destination are NOT the same sizes, we
// need to provide offsets to correctly map between them.
float sU = ( 1.0f / static_cast< float >( srcDesc.Width ) );
float sV = ( 1.0f / static_cast< float >( srcDesc.Height ) );
// The last two components (z,w) are unused. This makes for simpler code, but if
// constant-storage is limited then it is possible to pack 4 offsets into 2 float4's
offsets[0] = D3DXVECTOR4( -0.5f * sU, 0.5f * sV, 0.0f, 0.0f );
offsets[1] = D3DXVECTOR4( 0.5f * sU, 0.5f * sV, 0.0f, 0.0f );
offsets[2] = D3DXVECTOR4( -0.5f * sU, -0.5f * sV, 0.0f, 0.0f );
offsets[3] = D3DXVECTOR4( 0.5f * sU, -0.5f * sV, 0.0f, 0.0f );
PostProcess::g_pBrightPassConstants->SetVectorArray( pDevice, "tcDownSampleOffsets", offsets, 4 );
RenderToTexture( pDevice );
// [ 1 ] DOWN SAMPLE
//------------------
LPDIRECT3DSURFACE9 pDownSampleSurf = NULL;
if( FAILED( PostProcess::g_pDownSampledTex->GetSurfaceLevel( 0, &pDownSampleSurf ) ) )
{
// Can't get the render target. Not good news!
OutputDebugString(
L"PostProcess::PerformPostProcessing() - Couldn't retrieve top level surface for down sample render target.\n" );
return E_FAIL;
}
pDevice->SetRenderTarget( 0, pDownSampleSurf );
pDevice->SetTexture( 0, PostProcess::g_pBrightPassTex );
// We need to compute the sampling offsets used for this pass.
// A 4x4 sampling pattern is used, so we need to generate 16 offsets
// Find the dimensions for the source data
PostProcess::g_pBrightPassTex->GetLevelDesc( 0, &srcDesc );
// Find the dimensions for the destination data
D3DSURFACE_DESC destDesc;
pDownSampleSurf->GetDesc( &destDesc );
// Compute the offsets required for down-sampling. If constant-storage space
// is important then this code could be packed into 8xFloat4's. The code here
// is intentionally less efficient to aid readability...
D3DXVECTOR4 dsOffsets[16];
int idx = 0;
for( int i = -2; i < 2; i++ )
{
for( int j = -2; j < 2; j++ )
{
dsOffsets[idx++] = D3DXVECTOR4(
( static_cast< float >( i ) + 0.5f ) * ( 1.0f / static_cast< float >( destDesc.Width ) ),
( static_cast< float >( j ) + 0.5f ) * ( 1.0f / static_cast< float >( destDesc.Height ) ),
0.0f, // unused
0.0f  // unused
);
}
}
PostProcess::g_pDownSampleConstants->SetVectorArray( pDevice, "tcDownSampleOffsets", dsOffsets, 16 );
RenderToTexture( pDevice );
// [ 2 ] BLUR HORIZONTALLY
//------------------------
LPDIRECT3DSURFACE9 pHBloomSurf = NULL;
if( FAILED( PostProcess::g_pBloomHorizontal->GetSurfaceLevel( 0, &pHBloomSurf ) ) )
{
// Can't get the render target. Not good news!
OutputDebugString(
L"PostProcess::PerformPostProcessing() - Couldn't retrieve top level surface for horizontal bloom render target.\n" );
return E_FAIL;
}
pDevice->SetRenderTarget( 0, pHBloomSurf );
pDevice->SetTexture( 0, PostProcess::g_pDownSampledTex );
// Configure the sampling offsets and their weights
float HBloomWeights[9];
float HBloomOffsets[9];
for( int i = 0; i < 9; i++ )
{
// Compute the offsets. We take 9 samples - 4 either side and one in the middle:
//     i =  0,  1,  2,  3, 4,  5,  6,  7,  8
//Offset = -4, -3, -2, -1, 0, +1, +2, +3, +4
HBloomOffsets[i] = ( static_cast< float >( i ) - 4.0f ) * ( 1.0f / static_cast< float >( destDesc.Width ) );
// 'x' is just a simple alias to map the [0,8] range down to a [-1,+1]
float x = ( static_cast< float >( i ) - 4.0f ) / 4.0f;
// Use a gaussian distribution. Changing the standard-deviation
// (second parameter) as well as the amplitude (multiplier) gives
// distinctly different results.
HBloomWeights[i] = g_GaussMultiplier * ComputeGaussianValue( x, g_GaussMean, g_GaussStdDev );
}
// Commit both arrays to the device:
PostProcess::g_pHBloomConstants->SetFloatArray( pDevice, "HBloomWeights", HBloomWeights, 9 );
PostProcess::g_pHBloomConstants->SetFloatArray( pDevice, "HBloomOffsets", HBloomOffsets, 9 );
RenderToTexture( pDevice );
// [ 3 ] BLUR VERTICALLY
//----------------------
LPDIRECT3DSURFACE9 pVBloomSurf = NULL;
if( FAILED( PostProcess::g_pBloomVertical->GetSurfaceLevel( 0, &pVBloomSurf ) ) )
{
// Can't get the render target. Not good news!
OutputDebugString(
L"PostProcess::PerformPostProcessing() - Couldn't retrieve top level surface for vertical bloom render target.\n" );
return E_FAIL;
}
pDevice->SetRenderTarget( 0, pVBloomSurf );
pDevice->SetTexture( 0, PostProcess::g_pBloomHorizontal );
// Configure the sampling offsets and their weights
// It is worth noting that although this code is almost identical to the
// previous section ('H' weights, above) there is an important difference: destDesc.Height.
// The bloom render targets are *not* square, such that you can't re-use the same offsets in
// both directions.
float VBloomWeights[9];
float VBloomOffsets[9];
for( int i = 0; i < 9; i++ )
{
// Compute the offsets. We take 9 samples - 4 either side and one in the middle:
//     i =  0,  1,  2,  3, 4,  5,  6,  7,  8
//Offset = -4, -3, -2, -1, 0, +1, +2, +3, +4
VBloomOffsets[i] = ( static_cast< float >( i ) - 4.0f ) * ( 1.0f / static_cast< float >( destDesc.Height ) );
// 'x' is just a simple alias to map the [0,8] range down to a [-1,+1]
float x = ( static_cast< float >( i ) - 4.0f ) / 4.0f;
// Use a gaussian distribution. Changing the standard-deviation
// (second parameter) as well as the amplitude (multiplier) gives
// distinctly different results.
VBloomWeights[i] = g_GaussMultiplier * ComputeGaussianValue( x, g_GaussMean, g_GaussStdDev );
}
// Commit both arrays to the device:
PostProcess::g_pVBloomConstants->SetFloatArray( pDevice, "VBloomWeights", VBloomWeights, 9 );
PostProcess::g_pVBloomConstants->SetFloatArray( pDevice, "VBloomOffsets", VBloomOffsets, 9 );
RenderToTexture( pDevice );
// [ 4 ] CLEAN UP
//---------------
SAFE_RELEASE( pHDRSource );
SAFE_RELEASE( pBrightPassSurf );
SAFE_RELEASE( pDownSampleSurf );
SAFE_RELEASE( pHBloomSurf );
SAFE_RELEASE( pVBloomSurf );
return S_OK;
}

And here is how the same code in the file "PostProcess.cpp" looks after I changed (increased) the number of offsets:
//--------------------------------------------------------------------------------------
//  PerformPostProcessing( )
//
//      DESC:
//          This is the core function for this module - it takes the raw HDR image
//          generated by the 'HDRScene' component and puts it through 4 post
//          processing stages - to end up with a bloom effect on the over-exposed
//          (HDR) parts of the image.
//
//      PARAMS:
//          pDevice : The device that will be rendered to
//
//      NOTES:
//          n/a
//
//--------------------------------------------------------------------------------------
HRESULT PerformPostProcessing( IDirect3DDevice9* pDevice )
{
// [ 0 ] BRIGHT PASS
//------------------
LPDIRECT3DTEXTURE9 pHDRSource = NULL;
if( FAILED( HDRScene::GetOutputTexture( &pHDRSource ) ) )
{
// Couldn't get the input - means that none of the subsequent
// work is worth attempting!
OutputDebugString( L"PostProcess::PerformPostProcessing() - Unable to retrieve source HDR information!\n" );
return E_FAIL;
}
LPDIRECT3DSURFACE9 pBrightPassSurf = NULL;
if( FAILED( PostProcess::g_pBrightPassTex->GetSurfaceLevel( 0, &pBrightPassSurf ) ) )
{
// Can't get the render target. Not good news!
OutputDebugString(
L"PostProcess::PerformPostProcessing() - Couldn't retrieve top level surface for bright pass render target.\n" );
return E_FAIL;
}
pDevice->SetRenderTarget( 0, pBrightPassSurf );         // Configure the output of this stage
pDevice->SetTexture( 0, pHDRSource );                   // Configure the input..
PostProcess::g_pBrightPassConstants->SetFloat( pDevice, "fBrightPassThreshold", PostProcess::g_BrightThreshold );
// We need to compute the sampling offsets used for this pass.
// The original used a 2x2 sampling pattern (4 offsets); this version generates 16 offsets
D3DXVECTOR4 offsets[16];   //original 4
// Find the dimensions for the source data
D3DSURFACE_DESC srcDesc;
pHDRSource->GetLevelDesc( 0, &srcDesc );
// Because the source and destination are NOT the same sizes, we
// need to provide offsets to correctly map between them.
float sU = ( 1.0f / static_cast< float >( srcDesc.Width ) );
float sV = ( 1.0f / static_cast< float >( srcDesc.Height ) );
// The last two components (z,w) are unused. This makes for simpler code, but if
// constant-storage is limited then it is possible to pack 4 offsets into 2 float4's
offsets[0] = D3DXVECTOR4( -0.5f * sU, 0.5f * sV, 0.0f, 0.0f );
offsets[1] = D3DXVECTOR4( 0.5f * sU, 0.5f * sV, 0.0f, 0.0f );
offsets[2] = D3DXVECTOR4( -0.5f * sU, -0.5f * sV, 0.0f, 0.0f );
offsets[3] = D3DXVECTOR4( 0.5f * sU, -0.5f * sV, 0.0f, 0.0f );
offsets[4] = D3DXVECTOR4( -1.5f * sU, 1.5f * sV, 0.0f, 0.0f );  //myoffsets[4-15]
offsets[5] = D3DXVECTOR4( 1.5f * sU, 1.5f * sV, 0.0f, 0.0f );
offsets[6] = D3DXVECTOR4( -1.5f * sU, -1.5f * sV, 0.0f, 0.0f );
offsets[7] = D3DXVECTOR4( 1.5f * sU, -1.5f * sV, 0.0f, 0.0f );
offsets[8] = D3DXVECTOR4( -2.5f * sU, 2.5f * sV, 0.0f, 0.0f );
offsets[9] = D3DXVECTOR4( 2.5f * sU, 2.5f * sV, 0.0f, 0.0f );
offsets[10] = D3DXVECTOR4( -2.5f * sU, -2.5f * sV, 0.0f, 0.0f );
offsets[11] = D3DXVECTOR4( 2.5f * sU, -2.5f * sV, 0.0f, 0.0f );
offsets[12] = D3DXVECTOR4( -3.5f * sU, 3.5f * sV, 0.0f, 0.0f );
offsets[13] = D3DXVECTOR4( 3.5f * sU, 3.5f * sV, 0.0f, 0.0f );
offsets[14] = D3DXVECTOR4( -3.5f * sU, -3.5f * sV, 0.0f, 0.0f );
offsets[15] = D3DXVECTOR4( 3.5f * sU, -3.5f * sV, 0.0f, 0.0f );
PostProcess::g_pBrightPassConstants->SetVectorArray( pDevice, "tcDownSampleOffsets", offsets, 16 ); //4 original
RenderToTexture( pDevice );
// [ 1 ] DOWN SAMPLE
//------------------
LPDIRECT3DSURFACE9 pDownSampleSurf = NULL;
if( FAILED( PostProcess::g_pDownSampledTex->GetSurfaceLevel( 0, &pDownSampleSurf ) ) )
{
// Can't get the render target. Not good news!
OutputDebugString(
L"PostProcess::PerformPostProcessing() - Couldn't retrieve top level surface for down sample render target.\n" );
return E_FAIL;
}
pDevice->SetRenderTarget( 0, pDownSampleSurf );
pDevice->SetTexture( 0, PostProcess::g_pBrightPassTex );
// We need to compute the sampling offsets used for this pass.
// A 4x4 sampling pattern is used, so we need to generate 16 offsets
// Find the dimensions for the source data
PostProcess::g_pBrightPassTex->GetLevelDesc( 0, &srcDesc );
// Find the dimensions for the destination data
D3DSURFACE_DESC destDesc;
pDownSampleSurf->GetDesc( &destDesc );
// Compute the offsets required for down-sampling. If constant-storage space
// is important then this code could be packed into 8xFloat4's. The code here
// is intentionally less efficient to aid readability...
D3DXVECTOR4 dsOffsets[16]; //original 16
int idx = 0;
for( int i = -2; i < 2; i++ )
{
for( int j = -2; j < 2; j++ )
{
dsOffsets[idx++] = D3DXVECTOR4(
( static_cast< float >( i ) + 0.5f ) * ( 1.0f / static_cast< float >( destDesc.Width ) ),
( static_cast< float >( j ) + 0.5f ) * ( 1.0f / static_cast< float >( destDesc.Height ) ),
0.0f, // unused
0.0f  // unused
);
}
}
PostProcess::g_pDownSampleConstants->SetVectorArray( pDevice, "tcDownSampleOffsets", dsOffsets, 16 ); //original 16
RenderToTexture( pDevice );
// [ 2 ] BLUR HORIZONTALLY
//------------------------
LPDIRECT3DSURFACE9 pHBloomSurf = NULL;
if( FAILED( PostProcess::g_pBloomHorizontal->GetSurfaceLevel( 0, &pHBloomSurf ) ) )
{
// Can't get the render target. Not good news!
OutputDebugString(
L"PostProcess::PerformPostProcessing() - Couldn't retrieve top level surface for horizontal bloom render target.\n" );
return E_FAIL;
}
pDevice->SetRenderTarget( 0, pHBloomSurf );
pDevice->SetTexture( 0, PostProcess::g_pDownSampledTex );
// Configure the sampling offsets and their weights
float HBloomWeights[15]; //original 9
float HBloomOffsets[15]; //original 9
for( int i = 0; i < 15; i++ ) //original 9
{
// Compute the offsets. We take 15 samples - 7 either side and one in the middle (9 samples in the original):
//     i =  0,  1,  2,  3,  4,  5,  6, 7,  8,  9, 10, 11, 12, 13, 14
//Offset = -7, -6, -5, -4, -3, -2, -1, 0, +1, +2, +3, +4, +5, +6, +7
HBloomOffsets[i] = ( static_cast< float >( i ) - 7.0f ) * ( 1.0f / static_cast< float >( destDesc.Width ) ); //-4 original
// 'x' is just a simple alias to map the [0,14] range down to a [-1,+1]
float x = ( static_cast< float >( i ) - 7.0f ) / 7.0f; //4 original
// Use a gaussian distribution. Changing the standard-deviation
// (second parameter) as well as the amplitude (multiplier) gives
// distinctly different results.
HBloomWeights[i] = g_GaussMultiplier * ComputeGaussianValue( x, g_GaussMean, g_GaussStdDev );
}
// Commit both arrays to the device:
PostProcess::g_pHBloomConstants->SetFloatArray( pDevice, "HBloomWeights", HBloomWeights, 15 );
PostProcess::g_pHBloomConstants->SetFloatArray( pDevice, "HBloomOffsets", HBloomOffsets, 15 );
RenderToTexture( pDevice );
// [ 3 ] BLUR VERTICALLY
//----------------------
LPDIRECT3DSURFACE9 pVBloomSurf = NULL;
if( FAILED( PostProcess::g_pBloomVertical->GetSurfaceLevel( 0, &pVBloomSurf ) ) )
{
// Can't get the render target. Not good news!
OutputDebugString(
L"PostProcess::PerformPostProcessing() - Couldn't retrieve top level surface for vertical bloom render target.\n" );
return E_FAIL;
}
pDevice->SetRenderTarget( 0, pVBloomSurf );
pDevice->SetTexture( 0, PostProcess::g_pBloomHorizontal );
// Configure the sampling offsets and their weights
// It is worth noting that although this code is almost identical to the
// previous section ('H' weights, above) there is an important difference: destDesc.Height.
// The bloom render targets are *not* square, such that you can't re-use the same offsets in
// both directions.
float VBloomWeights[15];
float VBloomOffsets[15];
for( int i = 0; i < 15; i++ )
{
// Compute the offsets. We take 15 samples - 7 either side and one in the middle (9 samples in the original):
//     i =  0,  1,  2,  3,  4,  5,  6, 7,  8,  9, 10, 11, 12, 13, 14
//Offset = -7, -6, -5, -4, -3, -2, -1, 0, +1, +2, +3, +4, +5, +6, +7
VBloomOffsets[i] = ( static_cast< float >( i ) - 7.0f ) * ( 1.0f / static_cast< float >( destDesc.Height ) );
// 'x' is just a simple alias to map the [0,14] range down to a [-1,+1]
float x = ( static_cast< float >( i ) - 7.0f ) / 7.0f;
// Use a gaussian distribution. Changing the standard-deviation
// (second parameter) as well as the amplitude (multiplier) gives
// distinctly different results.
VBloomWeights[i] = g_GaussMultiplier * ComputeGaussianValue( x, g_GaussMean, g_GaussStdDev );
}
// Commit both arrays to the device:
PostProcess::g_pVBloomConstants->SetFloatArray( pDevice, "VBloomWeights", VBloomWeights, 15 );
PostProcess::g_pVBloomConstants->SetFloatArray( pDevice, "VBloomOffsets", VBloomOffsets, 15 );
RenderToTexture( pDevice );
// [ 4 ] CLEAN UP
//---------------
SAFE_RELEASE( pHDRSource );
SAFE_RELEASE( pBrightPassSurf );
SAFE_RELEASE( pDownSampleSurf );
SAFE_RELEASE( pHBloomSurf );
SAFE_RELEASE( pVBloomSurf );
return S_OK;
}
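
The only real difference between the 9-tap and the 15-tap blur is the half-width used for the offsets and for the mapping to [-1,+1]. A minimal sketch that works for any odd number of taps is given here; `BlurTaps`, `gaussian` and `makeBlurTaps` are illustrative names, and `gaussian` is an assumed stand-in (a plain normal distribution) for the sample's `ComputeGaussianValue`, whose body is not shown on this page.

```cpp
#include <cmath>
#include <vector>

struct BlurTaps {
    std::vector<float> offsets;  // texture-coordinate offset of each tap
    std::vector<float> weights;  // weight applied to each tap
};

// Assumed stand-in for ComputeGaussianValue(): the normal probability density.
float gaussian(float x, float mean, float stdDev) {
    const float pi = 3.14159265f;
    return (1.0f / std::sqrt(2.0f * pi * stdDev * stdDev)) *
           std::exp(-((x - mean) * (x - mean)) / (2.0f * stdDev * stdDev));
}

// Builds `taps` samples centred on the current texel (taps must be odd and >= 3):
// 9 taps give offsets of -4..+4 texels, 15 taps give -7..+7 texels.
BlurTaps makeBlurTaps(int taps, unsigned texSize, float mean, float stdDev, float multiplier) {
    BlurTaps result;
    const float half = static_cast<float>(taps / 2);
    for (int i = 0; i < taps; ++i) {
        const float k = static_cast<float>(i) - half;   // tap position in texels
        result.offsets.push_back(k * (1.0f / static_cast<float>(texSize)));
        result.weights.push_back(multiplier * gaussian(k / half, mean, stdDev));
    }
    return result;
}
```

Calling it with `texSize` set to the render target width gives the horizontal arrays, and with the height it gives the vertical ones, which is why the two passes cannot share offsets on a non-square target.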

Notice that the maximum number of offsets is, for some reason, 16 (or 15 for the vertical and horizontal blur). If you try to use more offsets you will get an error. The extra offsets probably work correctly only for the vertical and horizontal blur; the purpose of adding more offsets is to make sure that the GPU is really doing the work and to find out how many cores the GPU has. To be realistic, I think a GPU does not have 640-2048 shader cores, but 8 or 32 cores (I even doubt the number 32). From some deeper observations my GPU is claimed to have 8 cores, and the Radeon 7970 has 32 compute units (2048 Stream Processors). Here is a program which tells how many cores your GPU has (the site's author claims that a Radeon 4870 or 4850 has 10 cores). So, in my view, these GPU cores run not at about 700 MHz, but like normal cores at about 3 GHz.

My resolution is 1600*1200 at about 45 fps, and I halved the CPU frequency (through the BIOS) to be sure that the CPU is not interfering; the result is that the fps does not change at all with the CPU frequency 2 times smaller. So if the GPU does a multiplication and an addition in 4 cycles, let's see how many operations are needed. 1600*1200 = 1920000, which is about 2 million pixels. Each pixel has 3 colors (RGB). There are 15 horizontal blur additions, 15 vertical blur additions and 16+16 additions in the other two passes, so 30+32 = 62 additions in total. The number of colors is 3, so that is 62*3 = 186 addition operations. All of this runs at about 45 fps with a 2-core CPU at the full-screen resolution of 1600*1200. If each GPU core works at about 3 GHz, then (counting 62 additions per pixel) this gives ${\displaystyle {\frac {3\cdot 10^{9}}{1920000\cdot 62\cdot [4(cycles)]}}={\frac {3\cdot 10^{9}}{119040000\cdot [4(cycles)]}}={\frac {25.2016129}{4}}=6.3\;(fps).}$
So 6.3 (fps) * 8 (cores) = 50.4 fps, which agrees with the theory of 8 GPU cores. An alternative view is that not only addition needs 4 cycles, but multiplication needs 4 cycles too, while the 3 colors (RGB) are added and multiplied as a vector with a SIMD operation, 3 times faster (to pack 3 bytes you shift them left or right and then add them, and you have 64 bits with a lot of space between the bytes). Then ${\displaystyle {\frac {25.2016129}{4+4}}\cdot 3=9.45\;(fps).}$ Then 9.45 (fps) * 8 (cores) = 75.6 fps.

Something else is suspicious: the fps is multiplied by 4 in the code of the file "HDRPipeline.cpp": "g_dwFrameRate = g_dwFrameCount * 4;". But I don't see anything lagging, and the program FRAPS shows the same frame rate, ~45 fps. The frequency of the GPU also still needs to be found out, and it is not 4 GHz but ~3 GHz, so maybe this is not a big deal. So there is still a chance that my GPU has not 8, but 4 cores or 2 cores or maybe even 1 core. There is a big chance that each blur pass (4 passes in total: down-sample, another down-sample, vertical blur and horizontal blur) is a separate frame, and maybe that is why the fps needs to be multiplied by 4. In that case there is almost no chance that there are more than two cores, or it is almost certain that there is just one core, because the colors can be shifted and added into one 64-bit value, and the multiplications and additions can be done on this value, since there is a lot of room between three 8-bit values in 64 bits (or even three 16-bit values can be packed into 64 bits, and possibly that is why too many offsets (more than 16) can make the three 16-bit RGB colors interfere with each other in the 64-bit integer).

Since shifting bits left by one place is multiplication by two and shifting bits right by one place is division by two, the RED channel of the render target texture needs to be shifted by 32 or 48 bits, so Red (once it has been converted to a 64-bit integer by multiplying the RED channel by a 64-bit integer) needs to be multiplied by 2^32=4294967296 or by 2^48=281474976710656. The Green channel bits, after conversion from an 8-bit or 16-bit integer to a 64-bit integer, need to be shifted 16 or 32 bits to the left, so GREEN is multiplied by 2^16=65536 or by 2^32=4294967296. The Blue channel bits are already in place, so the blue channel only needs to be multiplied by a 64-bit integer so that it becomes a 64-bit integer. When all channels are 64-bit integers shifted to the left accordingly, we can just add the three channels (R[64]+G[64]+B[64]), and all further blurring operations are then done on this 64-bit integer of packed RGB data. And this will be 3 times faster.
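
The packing idea can be tried out on the CPU. The following is a minimal sketch of the scheme described above (blue in place, green shifted by 16 bits, red shifted by 32 bits); it only illustrates the arithmetic and says nothing about what the GPU really does. `Rgb`, `packRgb`, `unpackRgb` and `sumPacked` are names invented for the illustration.

```cpp
#include <cstdint>

// Three colour values, each stored in its own 16-bit lane.
struct Rgb { std::uint16_t r, g, b; };

// Pack three channels into one 64-bit integer: blue in bits 0-15,
// green shifted left by 16 (times 2^16), red shifted left by 32 (times 2^32).
std::uint64_t packRgb(Rgb c) {
    return (static_cast<std::uint64_t>(c.r) << 32) |
           (static_cast<std::uint64_t>(c.g) << 16) |
            static_cast<std::uint64_t>(c.b);
}

// Split a packed value back into its three 16-bit lanes.
Rgb unpackRgb(std::uint64_t p) {
    return Rgb{ static_cast<std::uint16_t>((p >> 32) & 0xFFFF),
                static_cast<std::uint16_t>((p >> 16) & 0xFFFF),
                static_cast<std::uint16_t>(p & 0xFFFF) };
}

// Sum `count` packed pixels with one 64-bit addition per pixel instead of three.
// Only valid while every per-channel sum stays below 65536; otherwise a lane
// carries into its neighbour and the colours mix.
std::uint64_t sumPacked(const std::uint64_t* pixels, int count) {
    std::uint64_t total = 0;
    for (int i = 0; i < count; ++i) total += pixels[i];
    return total;
}
```

With 8-bit color values each 16-bit lane can hold the sum of 257 samples of the maximum value 255 before it carries into the next lane, so 16 samples fit easily. Larger channel values leave less room, which is the kind of interference between channels suspected in the text.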

### Small HDR improvement

HDR is made using the render-to-texture technique. First the whole scene is rendered to a texture 243 pixels wide and 243 pixels high (the width:height aspect ratio is 1:1, so the image looks shrunk horizontally). Then each block of 3*3 pixels is added together, and after this process we have a texture 243/3=81 pixels wide and 81 pixels high. Then 3*3 pixels are added together again and we get a texture 27 pixels wide and 27 pixels high. After this, 9 pixels are again combined into 1 and a texture of 9*9 pixels is built. Next we have a texture 3 pixels wide and 3 pixels high. And finally we get a texture of just 1 pixel, which is the average luminance.
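
The chain of texture sizes described above follows from dividing the side by 3 at every step. A minimal sketch is given here; `luminanceChain` is a name chosen for the illustration, and it assumes a factor greater than 1.

```cpp
#include <vector>

// Side lengths of the square luminance textures when every step shrinks
// the side by `factor` (factor must be greater than 1), down to 1 pixel.
std::vector<int> luminanceChain(int startSize, int factor) {
    std::vector<int> sizes;
    for (int size = startSize; size >= 1; size /= factor) {
        sizes.push_back(size);
    }
    return sizes;
}
```

Because 243 is 3 to the power 5, `luminanceChain(243, 3)` contains six textures and ends exactly at a single pixel after five 3*3 reductions.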

Here is everything that is in the "luminance.psh" file (which is in the directory "C:\Program Files\Microsoft DirectX SDK (June 2010)\Samples\C++\Direct3D\HDRPipeline\Shader Code"):
//======================================================================
//
//      HIGH DYNAMIC RANGE RENDERING DEMONSTRATION
//      Written by Jack Hoxley, November 2005
//
//======================================================================
//------------------------------------------------------------------
//  GLOBAL VARIABLES
//------------------------------------------------------------------
float4      tcLumOffsets[4];                // The offsets used by GreyScaleDownSample()
float4      tcDSOffsets[9];                 // The offsets used by DownSample()
sampler     s0  :   register( s0 );         // The first texture
//------------------------------------------------------------------
//  DEBUG LUMINANCE DISPLAY
//------------------------------------------------------------------
float4 LuminanceDisplay( in float2 t : TEXCOORD0 ) : COLOR0
{
// Acquire the luminance from the texture
float4 l = tex2D( s0, t );
// Compute a simple scalar, due to the values being > 1.0f
// the output is often all white, so just to make it a little
// more informative we can apply a scalar to bring it closer
// to the 0..1 range
float scalar = 1.0f;
// Only the RED and GREEN channels have anything stored in them, but
// we're not interested in the maximum value, so we just use the red
// channel:
return float4( l.r * scalar, l.r * scalar, l.r * scalar, 1.0f );
}
//------------------------------------------------------------------
//  This entry point performs the basic first-pass when measuring
//  luminance of the HDR render target. It samples the HDR target
//  multiple times so as to compensate for the down-sampling and
//  subsequent loss of data.
//------------------------------------------------------------------
float4 GreyScaleDownSample( in float2 t : TEXCOORD0 ) : COLOR0
{
// Compute the average of the 4 necessary samples
float average = 0.0f;
float maximum = -1e20;
float4 color = 0.0f;
float3 WEIGHT = float3( 0.299f, 0.587f, 0.114f );
for( int i = 0; i < 4; i++ )
{
color = tex2D( s0, t + float2( tcLumOffsets[i].x, tcLumOffsets[i].y ) );
// There are a number of ways we can try and convert the RGB value into
// a single luminance value:
// 1. Do a very simple mathematical average:
//float GreyValue = dot( color.rgb, float3( 0.33f, 0.33f, 0.33f ) );
// 2. Perform a more accurately weighted average:
//float GreyValue = dot( color.rgb, WEIGHT );
// 3. Take the maximum value of the incoming, same as computing the
//    brightness/value for an HSV/HSB conversion:
float GreyValue = max( color.r, max( color.g, color.b ) );
// 4. Compute the luminance component as per the HSL colour space:
//float GreyValue = 0.5f * ( max( color.r, max( color.g, color.b ) ) + min( color.r, min( color.g, color.b ) ) );
// 5. Use the magnitude of the colour
//float GreyValue = length( color.rgb );
maximum = max( maximum, GreyValue );
//         average += (0.25f * log( 1e-5 + GreyValue )); //1e-5 necessary to stop the  singularity at GreyValue=0  //original
average +=0.25* exp(0.25f * log( 1e-5 + GreyValue ));  //myline
}
//   average = exp( average );  //original
// Output the luminance to the render target
return float4( average, maximum, 0.0f, 1.0f );
}
//------------------------------------------------------------------
//  This entry point will, using a 3x3 set of reads will downsample
//  from one luminance map to another.
//------------------------------------------------------------------
float4 DownSample( in float2 t : TEXCOORD0 ) : COLOR0
{
// Compute the average of the 9 necessary samples
float4 color = 0.0f;
float maximum = -1e20;
float average = 0.0f;
for( int i = 0; i < 9; i++ )
{
color = tex2D( s0, t + float2( tcDSOffsets[i].x, tcDSOffsets[i].y ) );
average += color.r;
maximum = max( maximum, color.g );
}
// We've just taken 9 samples from the
// high resolution texture, so compute the
// actual average that is written to the
// lower resolution texture (render target).
average /= 9.0f;
// Return the new average luminance
return float4( average, maximum, 0.0f, 1.0f );
}

So you can see that I replaced the line "average += (0.25f * log( 1e-5 + GreyValue ));" with the line "average +=0.25* exp(0.25f * log( 1e-5 + GreyValue ));" and turned off the line "average = exp( average );". The only thing I changed is that the logarithm of each sample is divided by 4 before the exponent is taken, so each sample becomes exp(0.25*ln(GreyValue)), which is the fourth root of GreyValue, and the four results are averaged (values in the 0 to 1 range are then treated as bigger numbers; for example, if GreyValue=0.75 then exp(0.25*ln(0.75))=0.93, and exp(0.25*ln(0.1))=0.56). But I think this blurring part of "GreyScaleDownSample" is not needed at all. This code alone should be enough:
float4 GreyScaleDownSample( in float2 t : TEXCOORD0 ) : COLOR0
{
// Take a single sample (the 4-sample average is skipped)
float average = 0.0f;
float maximum = -1e20;
float4 color = 0.0f;
float3 WEIGHT = float3( 0.299f, 0.587f, 0.114f );
color = tex2D( s0, t );
float GreyValue = max( color.r, max( color.g, color.b ) );
maximum = max( maximum, GreyValue );
//         average += (0.25f * log( 1e-5 + GreyValue )); //1e-5 necessary to stop the  singularity at GreyValue=0  //original
average += exp(0.25f * log( 1e-5 + GreyValue ));  //myline
//   average = exp( average );  //original
// Output the luminance to the render target
return float4( average, maximum, 0.0f, 1.0f );
}

So whether you add 4 pixels and then divide by 4, or skip this part, everything works almost the same (and visually I don't see a difference).
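
To see what the changed line does numerically, here is a minimal sketch in Python that mirrors the three variants of "GreyScaleDownSample" for a list of exactly 4 grey values. The function names are hypothetical; only the formulas are taken from the shader code above.

```python
import math

EPS = 1e-5  # same guard as in the shader, avoids log(0)

def original_average(grey_values):
    """Original shader: exp of the mean logarithm (geometric mean of 4 samples)."""
    total = sum(0.25 * math.log(EPS + g) for g in grey_values)
    return math.exp(total)

def modified_average(grey_values):
    """Modified line: arithmetic mean of the fourth roots of 4 samples."""
    return sum(0.25 * math.exp(0.25 * math.log(EPS + g)) for g in grey_values)

def single_sample(grey_value):
    """Simplified shader: fourth root of one sample, no averaging."""
    return math.exp(0.25 * math.log(EPS + grey_value))

if __name__ == "__main__":
    flat = [0.75, 0.75, 0.75, 0.75]
    print(original_average(flat))   # stays close to 0.75
    print(modified_average(flat))   # raised to about 0.93
    print(single_sample(0.1))       # about 0.56
```

For a flat patch the modified average and the single sample give the same number, which fits the observation that skipping the 4-sample blur changes almost nothing.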
The next thing to change is in the "HDRSource.psh" file (which is in the directory "C:\Program Files\Microsoft DirectX SDK (June 2010)\Samples\C++\Direct3D\HDRPipeline\Shader Code"). This file ("HDRSource.psh") is small, and here is all the code in it:
//======================================================================
//
//      HIGH DYNAMIC RANGE RENDERING DEMONSTRATION
//      Written by Jack Hoxley, October 2005
//
//======================================================================
//------------------------------------------------------------------
//  GLOBAL VARIABLES
//------------------------------------------------------------------
// float HDRScalar = 3.0f;         // Constant exposed to the application.  //original
float HDRScalar = 1.0f;  //my
// Larger values will generate a brighter colour.
//------------------------------------------------------------------
//------------------------------------------------------------------
float4 main( in float4 c : COLOR ) : COLOR
{
return float4(
c.r * HDRScalar,
c.g * HDRScalar,
c.b * HDRScalar,
1.0f
);
}

So I just changed "HDRScalar" from 3 to 1. Since this HDRPipeline project has no Sun textures and no refractions or reflections [of bright light or of bright textures like the Sun], multiplying by 3 is unnecessary (and distracts from what is important). All colours are now in the 0 to 1 range (this multiplication by 3 is probably the last step, so before it they were always in the 0 to 1 range).

You see, the Sun is very bright even when refracted through dark objects, and the Sun reflected on dark objects also looks bright. The light the Sun casts on objects is weak by comparison, but looking into the Sun or at its reflection in a mirror is very, very bright. So if a Sun texture is refracted or reflected on a mirror surface, then no matter how dark the surface is, the Sun disk will be bright. So, for example, horizon textures with the Sun (a sunny day) you multiply by 3 or even by 10, and horizon textures of a rainy or cloudy day you multiply by 1. Only note that such sunny-day textures must be made with Sun brightness 1, blue sky brightness 0.3, and cloud brightness about 0.2-0.5 in the brightest parts of the sky or clouds.

You can also have lights which illuminate a white or almost white surface to values from 0 to 3, and when it comes to output you clamp the final scene to values from 0 to 1 (you simply cut off everything above 1). So first you multiply the strong light by 3. Almost all objects (with brightness above 0.3) will then shine white (or at maximum brightness if it is, say, a single-colour object), but if this scene is refracted through dark glass or reflected on a dark surface (one that keeps a quarter of the light), it will not have brightness from 0 to 0.25, but brightness from 0 to 0.75. After the objects lit by such a strong light (which was multiplied by 3) have been reflected or refracted and everything has been rendered, you can clamp the values to the 0-1 range, like this: clamp(sample.rgb, 0, 1). The function "clamp(sample, min, max)" sets everything above max=1 to 1. Here sample is the final texture, which has colours in floating point format (these RGB colours are from 0 to 3; the clamp function brings them into the 0 to 1 range, and the video adapter converts RGB colours in the 0 to 1 range into integers in the 0 to 255 range).

But to be realistic, in many cases you will not want such a strong light source, except perhaps when it happens naturally, like putting a normal light very close to an object, and then this scene (of a light very close to an object) is refracted or reflected. In this case you have a normal light and the illumination of the objects automatically exceeds the 0 to 1 range. You still must clamp (after everything is rendered) to avoid errors, or maybe it will even be clamped automatically to the 0-1 range (so you see, you have HDR even if you are not trying to do it, but then you must not use the clamp function on any lit objects, or use it very carefully).
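
The order of clamping and reflecting is what matters in the paragraph above. The following minimal Python sketch uses hypothetical function names and the example numbers from the text (light multiplied by 3, a dark surface that keeps 0.25 of the light) to compare clamping at the very end with clamping too early.

```python
def clamp(x, lo=0.0, hi=1.0):
    """Same meaning as the HLSL clamp(x, min, max) for a single value."""
    return max(lo, min(hi, x))

def reflect_then_clamp(surface, light_scale, reflectance):
    """Keep values above 1 while rendering and clamp only the final result."""
    return clamp(surface * light_scale * reflectance)

def clamp_then_reflect(surface, light_scale, reflectance):
    """Clamp the lit object first, then reflect it on the dark surface."""
    return clamp(surface * light_scale) * reflectance

def to_byte(value):
    """Convert a 0..1 colour into the 0..255 integer sent to the display."""
    return round(clamp(value) * 255)

if __name__ == "__main__":
    print(reflect_then_clamp(1.0, 3.0, 0.25))  # bright reflection survives
    print(clamp_then_reflect(1.0, 3.0, 0.25))  # reflection is dimmed
    print(to_byte(3.0))                        # overbright value saturates
```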
So here is how to make HDR with simple math logic (divide by the average) instead of ad hoc experimental formulas. The following code is from the file "FinalPass.psh":
//======================================================================
//
//      HIGH DYNAMIC RANGE RENDERING DEMONSTRATION
//      Written by Jack Hoxley, November 2005
//
//======================================================================
//------------------------------------------------------------------
//  GLOBAL VARIABLES
//------------------------------------------------------------------
sampler     original_scene  : register( s0 );   // The HDR data
sampler     luminance       : register( s1 );   // The 1x1 luminance map
sampler     bloom           : register( s2 );   // The post processing results
float       fExposure;                          // A user configurable bias to under/over expose the image
float       fGaussianScalar;                    // Used in the post-processing, but also useful here
float       g_rcp_bloom_tex_w;                  // The reciprocal WIDTH of the texture in 'bloom'
float       g_rcp_bloom_tex_h;                  // The reciprocal HEIGHT of the texture in 'bloom'
//------------------------------------------------------------------
//------------------------------------------------------------------
float4 main( in float2 t : TEXCOORD0 ) : COLOR0
{
// Read the HDR value that was computed as part of the original scene
float4 c = tex2D( original_scene, t );
// Read the luminance value, target the centre of the texture
// which will map to the only pixel in it!
float4 l = tex2D( luminance, float2( 0.5f, 0.5f ) );
// Compute the blur value using a bilinear filter
// It is worth noting that if the hardware supports linear filtering of a
// floating point render target that this step can probably be skipped.
float xWeight = frac( t.x / g_rcp_bloom_tex_w ) - 0.5;
float xDir = xWeight;
xWeight = abs( xWeight );
xDir /= xWeight;
xDir *= g_rcp_bloom_tex_w;
float yWeight = frac( t.y / g_rcp_bloom_tex_h ) - 0.5;
float yDir = yWeight;
yWeight = abs( yWeight );
yDir /= yWeight;
yDir *= g_rcp_bloom_tex_h;
// sample the blur texture for the 4 relevant pixels, weighted accordingly
float4 b = ((1.0f - xWeight) * (1.0f - yWeight))    * tex2D( bloom, t );
b +=       (xWeight * (1.0f - yWeight))             * tex2D( bloom, t + float2( xDir, 0.0f ) );
b +=       (yWeight * (1.0f - xWeight))             * tex2D( bloom, t + float2( 0.0f, yDir ) );
b +=       (xWeight * yWeight)                      * tex2D( bloom, t + float2( xDir, yDir ) );
// Compute the actual colour:
float4 final = c + 0.25f * b;
//   float4 final = 0.1*c + 0.25f * b*2; //my for fisheyelens
// Reinhard's tone mapping equation (See Eqn#3 from
// "Photographic Tone Reproduction for Digital Images" for more details) is:
//
//      (      (   Lp    ))
// Lp * (1.0f +(---------))
//      (      ((Lm * Lm)))
// -------------------------
//         1.0f + Lp
//
// Lp is the luminance at the given point, this is computed using Eqn#2 from the above paper:
//
//        exposure
//   Lp = -------- * HDRPixelIntensity
//          l.r
//
// The exposure ("key" in the above paper) can be used to adjust the overall "balance" of
// the image. "l.r" is the average luminance across the scene, computed via the luminance
// downsampling process. 'HDRPixelIntensity' is the measured brightness of the current pixel
// being processed.
//    float Lp = (fExposure / l.r) * max( final.r, max( final.g, final.b ) ); //original
// A slight difference is that we have a bloom component in the final image - this is *added* to the
// final result, therefore potentially increasing the maximum luminance across the whole image.
// For a bright area of the display, this factor should be the integral of the bloom distribution
// multipled by the maximum value. The integral of the gaussian distribution between [-1,+1] should
// be AT MOST 1.0; but the sample code adds a scalar to the front of this, making it a good enough
// approximation to the *real* integral.
//    float LmSqr = (l.g + fGaussianScalar * l.g) * (l.g + fGaussianScalar * l.g); //original
// Compute Eqn#3:
//    float toneScalar = ( Lp * ( 1.0f + ( Lp / ( LmSqr ) ) ) ) / ( 1.0f + Lp );  //original
float toneScalar =fExposure / l.r;  //my
// Tonemap the final outputted pixel:
c = final * toneScalar;
// Return the fully composed colour
c.a = 1.0f;
return c;
}

So what I did is turn off the line "float Lp = (fExposure / l.r) * max( final.r, max( final.g, final.b ) );" and the line "float LmSqr = (l.g + fGaussianScalar * l.g) * (l.g + fGaussianScalar * l.g);", and replace the line "float toneScalar = ( Lp * ( 1.0f + ( Lp / ( LmSqr ) ) ) ) / ( 1.0f + Lp );" with the line "float toneScalar =fExposure / l.r;". Now the logic is crystal clear: every pixel is multiplied by the exposure and divided by the average luminance.
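
The two versions of toneScalar can be compared side by side. This is a minimal Python sketch with hypothetical function and parameter names; the formulas follow the shader lines quoted above, and the numbers used in the example are made-up inputs, not defaults of the SDK sample.

```python
def reinhard_tone_scalar(pixel_max, avg_lum, max_lum, exposure, gaussian_scalar):
    """Original lines: Lp, LmSqr and the Eqn#3 expression from FinalPass.psh."""
    lp = (exposure / avg_lum) * pixel_max
    lm_sqr = (max_lum + gaussian_scalar * max_lum) ** 2
    return (lp * (1.0 + lp / lm_sqr)) / (1.0 + lp)

def simple_tone_scalar(avg_lum, exposure):
    """Replacement line: exposure divided by the average luminance."""
    return exposure / avg_lum

if __name__ == "__main__":
    # Example inputs only (assumed values, not taken from the sample).
    for avg in (0.1, 0.25, 0.5, 1.0):
        print(avg,
              reinhard_tone_scalar(1.0, avg, 1.0, 0.5, 0.0),
              simple_tone_scalar(avg, 0.5))
```

The simple version depends only on the average luminance, so it is the same for every pixel of the frame, while the original expression also changes with the brightness of the pixel itself.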
This code from the file "PostProcessing.psh" replaces the bright-pass threshold with a power function:
//------------------------------------------------------------------
// BRIGHT PASS AND 2x2 DOWN-SAMPLING PIXEL SHADER
//
// performs the 2x2 down sample, and then accepts any pixels
// that are greater or equal to the configured threshold
//------------------------------------------------------------------
float4 BrightPass( in float2 t : TEXCOORD0 ) : COLOR
{
float4 average = { 0.0f, 0.0f, 0.0f, 0.0f };
// load in and combine the samples from the source HDR texture (4 in the original, 16 here)
//   for( int i = 0; i < 4; i++ )  //original
for( int i = 0; i < 16; i++ )
{
average += tex2D( tex0, t + float2( tcDownSampleOffsets[i].x, tcDownSampleOffsets[i].y ) );
}
average *= 0.0625f;  //original 0.25f
// Determine the brightness of this particular pixel. As with the luminance calculations
// there are 4 possible variations on this calculation:
// 1. Do a very simple mathematical average:
//float luminance = dot( average.rgb, float3( 0.33f, 0.33f, 0.33f ) );
// 2. Perform a more accurately weighted average:
//float luminance = dot( average.rgb, float3( 0.299f, 0.587f, 0.114f ) );
// 3. Take the maximum value of the incoming, same as computing the
//    brightness/value for an HSV/HSB conversion:
float luminance = max( average.r, max( average.g, average.b ) );
// 4. Compute the luminance component as per the HSL colour space:
//float luminance = 0.5f * ( max( average.r, max( average.g, average.b ) ) + min( average.r, min( average.g, average.b ) ) );
// 5. Use the magnitude of the colour
//float luminance = length( average.rgb );
// Determine whether this pixel passes the test...
//    if( luminance < fBrightPassThreshold )   //original
//        average = float4( 0.0f, 0.0f, 0.0f, 1.0f );  //original
average.rgb =pow(average.rgb, fBrightPassThreshold*4);  //my line
// Write the colour to the bright-pass render target
return average;
}

So instead of making bloom only on bright areas above the bright-pass threshold, I chose to make it with a power function (since everything is in the 0 to 1 range), so that not-so-bright areas get just a little bit of bloom. I just replaced the code "if( luminance < fBrightPassThreshold ) average = float4( 0.0f, 0.0f, 0.0f, 1.0f );" with the code "average.rgb =pow(average.rgb, fBrightPassThreshold*4);". By default fBrightPassThreshold=0.8, so the exponent is 0.8*4=3.2. For faster rendering (although I still think it may use a precomputed lookup table of, say, 256 values or 65536 values) you can square twice, which gives the power of 4, instead of calling the power function (the power function is quite slow, while a square is just a multiplication, which has the same speed as addition). But don't worry: the power function is slow only from a theoretical point of view; maybe it is simplified on the GPU, or there are just not so many pixels to worry about its speed. Say the power function needs 140 cycles and the GPU clock is 1000 MHz. There are 3*1600*1200=5760000 colour values (three channels per pixel). So 10^9 /(5760000*140)=10^9 /806400000=1.24 fps. Even 8 cores are not enough for that. So somehow the power function is simplified, or a lookup table is created before real-time rendering.
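
The difference between the hard threshold and the power function, and the frame-rate estimate, can be checked with a small script. This is a minimal Python sketch with hypothetical function names; the threshold 0.8, the factor 4, the 140 cycles and the 1000 MHz clock are the numbers used in the text above.

```python
def bright_pass_threshold(rgb, threshold=0.8):
    """Original test: a pixel whose brightest channel is below the threshold becomes black."""
    return (0.0, 0.0, 0.0) if max(rgb) < threshold else tuple(rgb)

def bright_pass_power(rgb, threshold=0.8):
    """Replacement: raise every channel to the power threshold*4 (3.2 by default)."""
    exponent = threshold * 4.0
    return tuple(c ** exponent for c in rgb)

def bright_pass_square_twice(rgb):
    """Cheaper variant: two squarings give the 4th power with multiplications only."""
    return tuple((c * c) * (c * c) for c in rgb)

def estimated_fps(clock_hz=1e9, width=1600, height=1200, channels=3, cycles=140):
    """Frames per second if every colour value costs `cycles` clock cycles on one core."""
    return clock_hz / (channels * width * height * cycles)

if __name__ == "__main__":
    grey = (0.5, 0.5, 0.5)
    print(bright_pass_threshold(grey))     # dropped completely
    print(bright_pass_power(grey))         # kept, but strongly darkened
    print(bright_pass_square_twice(grey))  # 4th power without pow()
    print(estimated_fps())                 # about 1.24
```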

#### Faster HDR

${\displaystyle (e^{0.25\cdot \ln(0.1)}+e^{0.25\cdot \ln(0.9)})/2=(0.56234+0.974)/2=0.7681725358.}$
${\displaystyle e^{0.25\cdot \ln((0.1+0.9)/2)}=e^{0.25\cdot \ln(0.5)}=0.8408964.}$
So you see, if half of the pixels have a brightness of about 0.1 and half have a brightness of 0.9, there is not much difference (about 0.77 versus 0.84) between calculating the natural logarithm (ln()) and e for each pixel (of the 243*243 resolution) and simply first averaging all pixels into one pixel and then performing the ln and exponent combination. Doing it the second way saves computing power. Actually GPUs are so fast that you don't really save anything, but in a crisis situation you need to know how to make HDR if you don't have access to the pixels, but only to the one-pixel average.
The average must be made bigger than it really is, because we divide by the average and you don't want brighter objects to look white in a normal scene. With this trick, adaptation happens only when dark pixels dominate the scene (almost no bright objects; and if there are at least a few bright objects, then the whole scene is dark and the bright objects look normal and not overbright, which is how it is in real life).
But ln() (in High Level Shader Language, I mean in DirectX, the log() function is the natural logarithm) is more suited to an unbalanced scene, where not everything is in the 0 to 1 range. If everything is in the 0 to 1 range, I mean if the average is in the 0 to 1 range, then it is better to use two square roots (the square root of a square root is the same as the power 0.25): it's much faster (but who cares about speed for one pixel?) and simpler.
${\displaystyle (0.1^{0.25}+0.9^{0.25})/2=(0.56234+0.974)/2=0.7681725358.}$
${\displaystyle ((0.1+0.9)/2)^{0.25}=0.5^{0.25}=0.8408964.}$
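
The four numbers above can be reproduced with a few lines. This is a minimal Python sketch with hypothetical function names; it computes the per-pixel variant with exp and log, and the average-first variant with two square roots.

```python
import math

def mean_of_fourth_roots(values):
    """Per-pixel variant: exp(0.25*ln(v)) for every pixel, then the average."""
    return sum(math.exp(0.25 * math.log(v)) for v in values) / len(values)

def fourth_root_of_mean(values):
    """Average-first variant: one average, then two square roots."""
    mean = sum(values) / len(values)
    return math.sqrt(math.sqrt(mean))

if __name__ == "__main__":
    scene = [0.1, 0.9]  # half dark pixels, half bright pixels
    print(mean_of_fourth_roots(scene))  # about 0.768
    print(fourth_root_of_mean(scene))   # about 0.841
```

For this example the average-first variant gives a result about 9% larger, and it needs the root only once per frame instead of once per pixel.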