
April 12th, 2013 - Editor In Full Swing


Exciting times for me.  I’ve separated all the parts of my project into about 12-13 different libraries and am in the process of going crazy with perfecting the encapsulation so that there are no bizarre entanglements going on anymore.

All of this work means I can now move much faster on the editor, since it integrates all the world rendering from the client.  I’m already moving stuff around, but soon I’ll work on painting alphas and extruding terrain.

Preparations for Portability/Release(?) - April 10th, 2013

At some point during the past month, I tried to bring my complete project to work to see how it would fare in a different environment.  I had been very sloppy with my encapsulation and dependencies, and had hard-coded a lot of things.  It was a nightmare and took me about 2 hours of local hacks to get even one area (the viewer app) working.

This was a wake-up call:  If I was ever going to release the project, I would need to get it into a releasable state.  So the past few days I spent a ton of time removing the hard-coded things and making the project build cleanly.  My current choice is to put all dependencies in the source tree and have people build them from the solution rather than requiring anything to be installed by a user/designer beforehand.  The only requirement at the moment for your environment is to have an environment variable defined for the root of the tree - this is used by the apps themselves and by some helpers.

I am trying to determine how I will end up releasing my ongoing work here. One possibility is to just release ALL source for everything.  This is the most flexible option for people to do what they want to do; however, I have nightmares of random people making forks and starting their ‘own’ version. Another is to release the libraries as libraries/headers, and then the source for the server and client only.  That is relatively little code, and most of it would be custom per project anyway.

Whichever way I go, I will want some interested parties to help beta (more like alpha) test the progress.  I will not be wanting other contributors to the code-base, but will welcome ideas/suggestions from the community.

If you’re interested in being a sort of alpha-tester, please let me know via http://modcraft.superparanoid.de/ - send me a private message (I’m relaxok on there).

Note - I am likely to rename the project soon.  I’d like to give it a snappy name, but may settle for something completely generic, like MMOSDK.  We shall see.

In other news - I have done a lot more culling portal-wise and am now 90% done with portal support. More soon!

February 24, 2013 - Portal Rendering (I’m Back!)

Yes, I have returned. Real life intervened for quite a while, but I’m working on the game engine once again.

I’ve accomplished two major things the past few days:

1) I completed the terrain rendering optimization mentioned in the last post.  I now use a simple box filter to create my own mipmaps so that I can manually create the data required for a Texture2DArray resource.  This let me completely get rid of all the ugly branching my pixel shader was doing in order to address different textures.  Speed is very fast now.  The next optimization is to have lower rez skin versions for further-away tiles.

2) Portals!  Yes, I have finally started to implement them.  You can see in the above video the portal rectangles in white, and in wireframe mode you can see how internal objects disappear when the camera is not viewing the portal from the proper direction.  The test uses the dot product of the portal’s normal and the difference between the eye origin and the center point of the portal face.  This vastly improves rendering speeds anywhere there are buildings and such.  Towns and cities are much speedier now.  The next step is fully implementing the portal chain, which should make dungeons quite speedy as well.  I’m only using portal 0 for now, since in most buildings that’s the front door.
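The facing test from item 2 can be sketched roughly like this.  It’s a minimal illustration and not the engine’s actual code: `Vec3`, `Portal` and `eyeFacesPortal` are made-up names, and it assumes the portal normal points toward the side the interior may be viewed from.

```cpp
// Minimal 3-component vector; a real engine would use its own math types.
struct Vec3 {
    float x, y, z;
};

inline Vec3 operator-(const Vec3& a, const Vec3& b) {
    return {a.x - b.x, a.y - b.y, a.z - b.z};
}

inline float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Hypothetical portal record: the center of the portal face and its normal.
// Assumption: the normal points toward the side from which the interior
// is allowed to be seen (e.g. out of the front door).
struct Portal {
    Vec3 center;
    Vec3 normal;
};

// Returns true when the eye sits on the side the normal points to, so the
// camera could be looking in through the portal. When this is false, the
// objects behind the portal can be skipped.
inline bool eyeFacesPortal(const Vec3& eye, const Portal& portal) {
    const Vec3 toEye = eye - portal.center;
    return dot(toEye, portal.normal) > 0.0f;
}
```

This only answers which side of the portal the camera is on.  Checking whether the portal rectangle is actually inside the view frustum would be a separate step.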

In some other areas:

- I’ve laid the groundwork for the world builder tool by implementing picking of all world objects (things that will be placed/rotated/etc in an editor).  Picking is actually trickier than I anticipated once you want to do something beyond bounding boxes.  Ideally I will have to get bsp leaf ray intersection done, otherwise picking objects that are close together gets quite difficult.  

- I started supporting String CRCs so that costly things like string comparisons of full resource path names can be very fast.  This is in progress.
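To illustrate the string CRC idea, here is a small sketch using the standard CRC-32 (reflected polynomial 0xEDB88320).  Which CRC variant the engine uses isn’t stated here, and `ResourceId` and the example paths are hypothetical.  The point is just that the path is hashed once and later comparisons are integer compares.

```cpp
#include <cstdint>
#include <string>

// Bitwise CRC-32 (reflected polynomial 0xEDB88320). A table-driven version
// would be faster; this one is kept short for illustration.
inline std::uint32_t crc32(const std::string& text) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (char ch : text) {
        crc ^= static_cast<unsigned char>(ch);
        for (int bit = 0; bit < 8; ++bit) {
            // Shift right, applying the polynomial when the low bit is set.
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

// Hypothetical resource handle: hash the path once, compare integers after.
// Assumption: paths are already normalized (case, slashes) before hashing,
// and hash collisions are either tolerated or checked for elsewhere.
struct ResourceId {
    std::uint32_t hash;

    explicit ResourceId(const std::string& path) : hash(crc32(path)) {}

    bool operator==(const ResourceId& other) const { return hash == other.hash; }
    bool operator!=(const ResourceId& other) const { return hash != other.hash; }
};
```

Two different paths can in principle produce the same 32-bit value, so a real system would probably want to detect collisions when resources are registered.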

Major Optimization Fun

October 2nd, 2012 - A few weeks ago I became quite disappointed when comparing my rendering speed to that of the real WoW client, which is one of the most high-performing game clients in recent history.

I’ve made a ton of changes, using new profiling tools like Intel’s VTune, and the newest NSight for Visual Studio.

Some of the optimizations in recent weeks are below.  A few were silly and obvious, but some were only found when really drilling down.

— Completely re-tool cbuffer usage into frame specific, object specific, and mesh specific structures.  This especially helps reduce lighting related Maps/Unmaps into the shader.  I reduced Map/Unmap copies by about 75%. 

— Stop using string (shader name)->pointer maps for shader-specific d3d11 buffers and such.  I converted my whole ShaderSet system to be enum based arrays.

— Don’t use sqrt in Pythagorean distance calculations unless you actually need to present the actual value.  Comparisons don’t need it, since comparing squared distances gives the same ordering.  This brought the render lists’ object sorting (see below) from about 5% of the rendering thread’s CPU time to 0.5%.  Surprise kids: sqrt isn’t fast.

— Toy with shader compile options for speed.

— Locally cache a bunch of variables that were accessed by accessor a lot and not optimized out in compilation.

— Separate opaque and transparent render lists, resulting in far fewer blend state changes (also in preparation for instancing of opaques)

— Sort opaque list front to back and transparent list back to front.  The former takes advantage of early-z culling in the pipeline, which is always recommended; I had never done it because of worries about my transparent objects.

— Don’t bother clearing stencil buffer - not using it.

— Calculate world-light lighting values once every second or so instead of every frame.  That was definitely overkill. 

— Turn some copy-constructor-calling assigns into references to avoid lots of copying (specifically transparency timelines and the hefty bone timelines)

— Get rid of my ‘extra stuff’ Map/Unmap buffer and move the stuff into one of the other buffers.  Still not sure why I didn’t do this before.

— Only calculate bounding box corners once for passive (stationary) objects.  Massive speedup.  Bounding box calculation was 9% of the rendering thread’s CPU time before, now negligible.  Just a terrible oversight on my part before.

— Draw terrain early in the frame so it can be worked on while doing all the object-specific culling and such on the CPU side.  This is a technique mentioned in several of the more hardcore optimization sites (like Matt Fisher who wrote GPUView while interning(!) at Microsoft).

— Start reducing branching in my terrain’s pixel shader.  This will be one of the biggest speedups of all (going from about 60 fps in an average scene to 100+) if my tests so far prove correct.   I have quite a ways to go though.  Unfortunately, indexing an array of textures in a pixel shader can’t be done with a variable, only with an integer constant.  In order to get around this and do true texture array accessing (sampling via x,y,z rather than just trying to access textures[n]) I will no longer be able to use D3DX11 SRV loading from graphic files and will have to roll my own texture creation and mipmapping.  To get started with this I’m beginning to use stb_image, if you’ve heard of that.  It’s nice for lightweight, no-frills loading of png files (for example) into raw RGBA memory, which is easy to point D3D at.
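For the roll-your-own mipmapping part, a 2x2 box filter over raw RGBA memory is probably the simplest starting point.  The sketch below is only an illustration: `ImageRGBA`, `downsampleBox` and `buildMipChain` are made-up names, it assumes tightly packed 8-bit RGBA with power-of-two dimensions, and it averages the stored byte values directly without any gamma handling.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Tightly packed 8-bit RGBA image: 4 bytes per pixel, no row padding.
struct ImageRGBA {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> pixels;
};

// Byte offset of channel c of pixel (x, y).
inline std::size_t texelIndex(const ImageRGBA& img, int x, int y, int c) {
    return (static_cast<std::size_t>(y) * img.width + x) * 4 + c;
}

// Builds the next mip level by averaging each 2x2 block of the source.
// Assumes power-of-two dimensions; odd sizes would drop the last row/column.
ImageRGBA downsampleBox(const ImageRGBA& src) {
    ImageRGBA dst;
    dst.width = std::max(src.width / 2, 1);
    dst.height = std::max(src.height / 2, 1);
    dst.pixels.resize(static_cast<std::size_t>(dst.width) * dst.height * 4);

    for (int y = 0; y < dst.height; ++y) {
        for (int x = 0; x < dst.width; ++x) {
            // Clamp so that 1-pixel-wide or 1-pixel-tall levels still work.
            const int x0 = 2 * x;
            const int y0 = 2 * y;
            const int x1 = std::min(x0 + 1, src.width - 1);
            const int y1 = std::min(y0 + 1, src.height - 1);
            for (int c = 0; c < 4; ++c) {
                const int sum = src.pixels[texelIndex(src, x0, y0, c)] +
                                src.pixels[texelIndex(src, x1, y0, c)] +
                                src.pixels[texelIndex(src, x0, y1, c)] +
                                src.pixels[texelIndex(src, x1, y1, c)];
                // +2 rounds to nearest instead of truncating.
                dst.pixels[texelIndex(dst, x, y, c)] =
                    static_cast<std::uint8_t>((sum + 2) / 4);
            }
        }
    }
    return dst;
}

// Level 0 is the original image; each following level is half the size,
// down to 1x1.
std::vector<ImageRGBA> buildMipChain(ImageRGBA base) {
    std::vector<ImageRGBA> chain;
    chain.push_back(std::move(base));
    while (chain.back().width > 1 || chain.back().height > 1) {
        chain.push_back(downsampleBox(chain.back()));
    }
    return chain;
}
```

Each level’s bytes would then be handed to D3D as the initial data for one mip of one slice of the texture array.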

What’s the result of all this?  Well, I don’t have a good test framework and scene set up for exact testing numbers of such things.  But all told, I think frame rates are up anywhere from 50-80% compared to before the optimizations.
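Of the items in the list above, the render-list changes are the easiest to show in code: split opaque from transparent, compare squared distances instead of calling sqrt, and sort the two lists in opposite directions.  This is a rough sketch with made-up types (`RenderItem`, `RenderLists`, `buildRenderLists`), not the engine’s real structures.

```cpp
#include <algorithm>
#include <vector>

struct Vec3 {
    float x, y, z;
};

// Squared distance between two points. sqrt is monotonic, so ordering by
// squared distance gives the same result as ordering by true distance.
inline float distanceSquared(const Vec3& a, const Vec3& b) {
    const float dx = a.x - b.x;
    const float dy = a.y - b.y;
    const float dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Hypothetical render-list entry.
struct RenderItem {
    int id;
    Vec3 position;
    bool transparent;
    float distSq;  // filled in by buildRenderLists
};

struct RenderLists {
    std::vector<RenderItem> opaque;       // sorted front to back
    std::vector<RenderItem> transparent;  // sorted back to front
};

// Splits the visible objects into two lists and sorts each one relative
// to the eye position.
RenderLists buildRenderLists(const std::vector<RenderItem>& visible,
                             const Vec3& eye) {
    RenderLists lists;
    for (RenderItem item : visible) {
        item.distSq = distanceSquared(item.position, eye);
        (item.transparent ? lists.transparent : lists.opaque).push_back(item);
    }
    // Nearest first, so early-z can reject hidden pixels.
    std::sort(lists.opaque.begin(), lists.opaque.end(),
              [](const RenderItem& a, const RenderItem& b) {
                  return a.distSq < b.distSq;
              });
    // Farthest first, so blending composites in the right order.
    std::sort(lists.transparent.begin(), lists.transparent.end(),
              [](const RenderItem& a, const RenderItem& b) {
                  return a.distSq > b.distSq;
              });
    return lists;
}
```

Drawing everything in the opaque list before the transparent one is also what keeps the number of blend state changes down.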

More updates soon.

September 5th, 2012 - BSP Tree

Well, I finally bit the bullet and incorporated the BSP tree data from WMO models.  So my format now has support for a BSP tree.  This is pretty much needed for collision detection that has decent performance.  I’m not totally clear on how it works mathematically yet, but above you can see me drawing Theramore Isle, drawing collision vertices only.  Each BSP leaf mesh is a random color so you can see how granular the BSP structure is.  It goes way, way beyond the meshes the models normally use.

The next step is to raycast with the planes and leaf polys, from a point on the character bounding boxes to test collision.  I have not done this before, but I think it’s going to be quite a math lesson.

September 3rd, 2012 - Baked Shadows v2.0

If you’ll recall months ago I had baked shadows in terrain working, however it was quite ugly - blocky and such.  There is actually more data there than first thought.  The iffy documentation on the ADT format makes it seem like there’s only ‘on and off’ in the shadow data, so I assumed better looking shadows were created by multisampling textures.   However, there is more data than that.  Here I am basically shading the data range from 0.8 to 1.0 (as a multiplier).  There is some hint that a different ‘shadow color’ from a light database may be used but I think it looks fine as is.  Note the smoothness compared to the old method.

Someday, I will incorporate real shadows for everything but I’d prefer not to slow rendering down for that right now (the same reason Blizzard didn’t add it to the client way back when)

August 27th, 2012 - Fades

Obviously, you can’t draw every object in every map tile, so typically you implement a draw distance for objects in addition to a scene draw distance.  One thing that has been really bothering me with the draw distance is the instant on/off clipping of objects.   So today, I also implemented fade-ins and fade-outs for all objects going in and out of view.  In the above video, I’m demonstrating the fades, and messing with the draw distance to make it a bit more obvious.  This gives you a much smoother experience in the client, similar to the WoW client.  
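One simple way to get this kind of fade is a per-object alpha that ramps toward 1 while the object is inside the draw distance and toward 0 once it falls outside.  The sketch below is an assumption about how it could be done, not a description of my exact implementation; `FadeState`, `updateFade` and `shouldDraw` are illustrative names.

```cpp
#include <algorithm>

// Hypothetical per-object fade state; alpha 0 = invisible, 1 = fully drawn.
struct FadeState {
    float alpha = 0.0f;
};

// Moves alpha toward 1 while the object is within the draw distance and
// toward 0 when it is not. fadeSeconds is how long a full fade takes and
// is assumed to be greater than zero.
inline void updateFade(FadeState& fade, bool withinDrawDistance,
                       float dtSeconds, float fadeSeconds) {
    const float step = dtSeconds / fadeSeconds;
    fade.alpha += withinDrawDistance ? step : -step;
    fade.alpha = std::min(1.0f, std::max(0.0f, fade.alpha));
}

// An object only needs to be submitted for drawing while it is at least
// partly visible.
inline bool shouldDraw(const FadeState& fade) {
    return fade.alpha > 0.0f;
}
```

The alpha would then be fed to the object’s shader as a multiplier.  Objects that are mid-fade are partly transparent, so they would need to be blended like any other transparent object.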

August 27th, 2012 - Spawning and Leak-Hunting

Once again, I have time to work on this.  Expect lots of updates now.  

It’s useful to be able to spawn creatures, for all sorts of reasons - so I finally implemented it.  Again, note that this is not a local client hack - it actually spawns the mob into the world on the server.

In other news, I’ve found many of the COM memory leaks.  Though I am still leaking memory, it’s no longer COM objects and the leaks are quite reduced.  This is exciting news.

Rendering Wall

July 6th, 2012 - The last 2 months or so, I have barely gotten anything done on the project.  There are a few issues I’ve run into.  One is that since I am now rendering model-sets down to the last bottle on a table - I’m drawing far more objects.  This requires way better culling than I am presently doing.  A city scene like Stormwind for example has an incredible amount of models in it without proper reduction.   I am beginning to implement a portal system, though I am really not too sure how it will work yet.  

Essentially, portal rendering techniques, which most people think of as being good for closed dungeon environments, let you skip drawing parts of a world or model that are not visible because you are not viewing them through the correct ‘portal’ - some geometry that separates rooms or areas.  Beyond dungeon rendering though, it’s quite good for city scenes, so that you are only drawing the facade of a building and not anything inside of it, unless you can see in through the front door for example - or not drawing the upstairs when you are not within the structure.   These changes should vastly improve the rendering of scenes with many buildings.  Until I complete that, things are pretty slow.

Additionally, I am dealing with your usual memory leaks and also certain hard to trace crashes.

If I don’t make enough progress soon with the portal rendering, I may get rid of that low level of model sets, and concentrate on some more system-oriented things like tackling combat, auras, and spells in a complete way.

For now, I have incorporated portal information into the model format and am converting WMO portals, but I am not using that information yet in the rendering itself.   That is the next step.
