2D Gaussian Splatting: Geometrically Accurate Radiance Field Reconstruction

Discover how 2D Gaussian Splatting transforms neural rendering by replacing volumetric 3D Gaussians with surface-aligned 2D disks.

2D Gaussian Splatting is a breakthrough in neural rendering that brings a new level of geometric precision to radiance field reconstruction. While previous methods like Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS) have achieved photorealistic image synthesis, they often fall short when it comes to accurate surface geometry – resulting in soft edges, depth inconsistencies, and unstable surface normals.

Unlike 3D Gaussian Splatting, which scatters volumetric blobs through space, 2DGS models the world using flat, oriented 2D disks – surface-aligned primitives that mirror the actual geometry of objects. Each Gaussian becomes a small, oriented disk lying directly on the surface of an object, capturing not only its colour but also its shape, depth, and normal direction with remarkable consistency.

The Result?
This simple yet powerful shift – from 3D volume to 2D surface representation – changes everything. The method achieves real-time rendering, clean and triangulable geometry, and multi-view consistency that even high-end neural fields struggle to match.

  1. Why 2D Gaussian Splatting?
    1. From NeRF to 3DGS – The Evolution of Neural Rendering
    2. 3D Gaussian Splatting – From Neural Fields to Explicit Geometry
    3. The 2DGS Revolution – Flattening the Volume into a Surface
  2. Key Advantages of 2D Gaussian Splatting (2DGS)
    1. View Consistency – The Key Advantage
    2. Perspective-Correct Ray-Splat Intersection
    3. Regularization for Surface Fidelity
    4. Real-Time Rendering with Geometric Precision
  3. How 2D Gaussian Splatting Works – A Step-by-Step Pipeline
    1. 2DGS Pipeline Summary – Input to Output
  4. 2DGS Pipeline Execution
  5. 2DGS Rasterized Output on MipNeRF360 Flowers Dataset
  6. Conclusion
  7. References

1. Why 2D Gaussian Splatting?

The motivation behind 2D Gaussian Splatting (2DGS) arises from a fundamental shortcoming in today’s neural rendering systems: a persistent trade-off between photorealism and geometric accuracy.
To understand why this method is a breakthrough, let’s first see what went wrong with earlier approaches and how 2DGS overcomes those limitations.

1.1 From NeRF to 3DGS – The Evolution of Neural Rendering

Introduced in 2020, NeRF revolutionized novel view synthesis with the concept of a radiance field – a function that learns how light interacts with every point in space.

Instead of explicitly storing geometry, NeRF models the entire 3D scene as a continuous neural function F_θ(x, d) that maps:

  • a 3D position x=(x,y,z), and
  • a view direction d
    to
  • color c and density σ.

This function is typically implemented using a multi-layer perceptron (MLP) – a small neural network trained on multi-view images.

Rendering is performed using volumetric ray marching:
For each pixel, NeRF traces a ray through the 3D space, samples hundreds of points, queries the MLP for colour and density at each sample, and then integrates them using the volume rendering equation.
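
To make that integration concrete, here is a minimal NumPy sketch of the volume-rendering quadrature – the function name and toy inputs are our own illustration, not NeRF's actual implementation:

import numpy as np

def volume_render(colors, sigmas, deltas):
    # alpha_i = 1 - exp(-sigma_i * delta_i): opacity of each ray segment
    alphas = 1.0 - np.exp(-sigmas * deltas)
    # T_i: light remaining before sample i (exclusive cumulative product)
    trans = np.concatenate([[1.0], np.cumprod(1.0 - alphas)[:-1]])
    weights = trans * alphas              # contribution of each sample
    return (weights[:, None] * colors).sum(axis=0)

# toy ray with 128 samples of color, density, and inter-sample spacing
rgb = volume_render(np.random.rand(128, 3), 5.0 * np.random.rand(128), np.full(128, 0.02))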

| Strength | Weakness |
| --- | --- |
| Produces photorealistic images with accurate lighting and reflections. | Extremely slow – requires hundreds of neural network evaluations per pixel. |
| Learns view-dependent effects like specular highlights. | Geometry is implicit – hard to extract or edit. |
| Works well for static, bounded scenes. | Training can take hours or days, and rendering is far from real time. |

NeRF demonstrated that a neural network can learn the appearance and structure of a 3D scene directly from images – but the method’s computational cost made it impractical for large-scale or interactive applications.

1.2 3D Gaussian Splatting – From Neural Fields to Explicit Geometry

To overcome NeRF’s inefficiency, researchers shifted from implicit neural networks to explicit representations. Introduced in 2023, 3D Gaussian Splatting (3DGS) brought a simple yet revolutionary idea: represent the scene not as a neural function, but as a set of millions of 3D Gaussian primitives.

Intuition – Imagine a 3D volume filled with millions of tiny, translucent 3D Gaussian blobs floating in space. When viewed through a camera, each Gaussian is projected to 2D and blended with the others based on depth and transparency – forming the rendered image.

Fig 2. Conversion of a sparse point cloud into smooth 3D Gaussian splats [Source]

Each Gaussian is like a translucent ellipsoid floating in space, described by:

  • Center position: (x,y,z)
  • Covariance matrix (Σ): defines its 3D shape, size, and orientation
  • Color (RGB) and opacity (α)
  • View-dependent color function, modelled by spherical harmonics

Rendering a scene then becomes a 2D rasterization problem instead of neural integration:

  • Each Gaussian is projected onto the image plane as an ellipse.
  • The colors and transparencies of overlapping Gaussians are blended using alpha compositing – a weighted accumulation process that mimics light absorption through translucent layers.
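
As a rough illustration of the projection step, the sketch below implements the affine (EWA-style) covariance projection that 3DGS-style rasterizers use; the function and the hypothetical Jacobian J are our own stand-ins, not code from the 3DGS release:

import numpy as np

def project_covariance(Sigma3d, W, J):
    Sigma_cam = W @ Sigma3d @ W.T   # rotate covariance into camera coordinates
    return J @ Sigma_cam @ J.T      # 2x2 covariance of the screen-space ellipse

# toy example: an axis-aligned Gaussian seen by an identity camera
Sigma3d = np.diag([0.04, 0.01, 0.01])
W = np.eye(3)
J = np.array([[100.0, 0.0, 0.0],    # hypothetical projection Jacobian at the
              [0.0, 100.0, 0.0]])   # Gaussian center (focal / depth terms)
Sigma2d = project_covariance(Sigma3d, W, J)   # -> ellipse blended during rasterization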

The Geometry Problem in 3DGS

3DGS treats each Gaussian as a volumetric blob rather than as a point on a surface. This leads to:

  • Blurry or “thick” surfaces: because each Gaussian occupies a small 3D volume, delicate structures (like table edges or thin leaves) are smeared across space.
  • Multi-view Inconsistency: each viewpoint sees a slightly different cross-section of the blob, leading to flickering normals and depth artifacts.
  • Difficult mesh extraction: the volumetric nature makes it hard to reconstruct accurate triangle meshes – most surfaces appear fuzzy or inflated.

1.3 The 2DGS Revolution – Flattening the Volume into a Surface

2D Gaussian Splatting (2DGS), proposed in SIGGRAPH 2024, solves these problems by rethinking the representation itself.

Intuition – Imagine creating a 2D image by “spraying” small, colored, fuzzy circles (2D Gaussians) onto a canvas. The Gaussians overlap and blend to reconstruct a detailed image.

Instead of 3D ellipsoids, 2DGS models the world as a collection of flat, elliptical disks – each one embedded directly on the surface of the object. These disks are called 2D-oriented Gaussian primitives and have well-defined tangent planes, orientations, and scales.

Fig 3. Transformation of a 2D Gaussian splat from tangent-plane object space to image space

Mathematically, a point on a disk centred at p_k is parameterized as:

P(u, v) = p_k + s_u t_u u + s_v t_v v

Here:

  • p_k​ is the disk centre (3D position),
  • t_u and t_v​ are orthogonal tangent vectors,
  • s_u​ and s_v​ are scaling factors controlling the Gaussian spread,
  • (u,v) are the local 2D coordinates within the disk.

Each 2D Gaussian lies exactly on a tangent plane with a normal vector n = t_u × t_v.

A standard 2D Gaussian function modulates its color and opacity:

G(u, v) = exp(−(u² + v²) / 2)

This design ensures that every primitive represents a real surface patch rather than a floating 3D blob. We will learn more about the entire pipeline later in this blog post.
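
A minimal NumPy sketch of this parameterization (the function names and the example disk are ours) shows how a local (u, v) coordinate maps to a 3D surface point and a Gaussian weight:

import numpy as np

def splat_point(p_k, t_u, t_v, s_u, s_v, u, v):
    # P(u, v) = p_k + s_u * t_u * u + s_v * t_v * v
    return p_k + s_u * t_u * u + s_v * t_v * v

def gaussian_weight(u, v):
    # standard isotropic 2D Gaussian on the tangent plane
    return np.exp(-(u**2 + v**2) / 2.0)

# hypothetical disk at the origin with an axis-aligned tangent frame
p_k = np.zeros(3)
t_u, t_v = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
n = np.cross(t_u, t_v)                                # normal n = t_u x t_v -> (0, 0, 1)
x = splat_point(p_k, t_u, t_v, 0.1, 0.1, 0.5, -0.3)   # 3D point on the disk
w = gaussian_weight(0.5, -0.3)                        # its opacity/color modulation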

2. Key Advantages of 2D Gaussian Splatting (2DGS)

2D Gaussian Splatting (2DGS) combines the best qualities of NeRF’s photorealism and 3D Gaussian Splatting’s real-time rendering, while eliminating their biggest weaknesses – fuzzy geometry, inconsistent normals, and heavy computation.

Let’s explore how 2DGS achieves this remarkable balance through its key innovations.

2.1 View Consistency – The Key Advantage

One of the most critical properties of 2DGS is multi-view consistency – the ability to represent surfaces identically from every camera viewpoint.

The Problem in 3DGS and NeRF

  • In 3DGS, each Gaussian is a volumetric ellipsoid, and different camera rays intersect different parts of that ellipsoid.
    → This leads to slight variations in color, depth, and normal estimation between views.
  • In NeRF, view consistency is implicitly learned by the neural network, but it is not guaranteed, since each pixel requires volumetric integration over hundreds of samples along a ray.

Fig 4. Comparison of 3D and 2D Gaussians showing consistent surface intersections across different camera viewpoints.

How 2DGS Fixes It

  • In 2DGS, each primitive is a surface-aligned 2D disk.
    A surface point’s appearance and position don’t change with viewpoint – every camera ray samples the same planar Gaussian.
  • This eliminates parallax inconsistencies, stabilizes shading, and ensures that depth and normal maps remain coherent across all views.

2.2 Perspective-Correct Ray-Splat Intersection

When a 2D Gaussian is viewed under perspective projection, its appearance changes – a circular disk may appear elliptical depending on the viewpoint. To handle this, 2DGS introduces perspective-correct splatting using explicit ray-plane intersection.

Here’s the process:

  • Each pixel casts a camera ray into the scene.
  • The ray intersects the Gaussian disk’s plane at a single point.
  • The intersection coordinates (u,v) are computed analytically using the camera and disk transformations.
  • The pixel color is computed by evaluating the Gaussian value G(u,v) and blending it with other splats along the ray.

This process is both mathematically exact and differentiable, enabling accurate gradient-based optimization during training.

2.3 Regularization for Surface Fidelity

2DGS not only renders beautifully but also captures geometry accurately.

The Problem in Prior Methods

  • 3DGS: Surface normals are emergent from overlapping volumetric Gaussians, often noisy or unstable.
  • NeRF: Normals are implicit derivatives of a learned density field, which tend to fluctuate or smooth out fine details.

The 2DGS Advantage

Since each 2D Gaussian is defined directly on a surface plane with an explicit normal vector, geometry becomes first-class, not a byproduct.

Additionally, two geometric regularizers reinforce this structure:

| Regularizer | Purpose | Effect |
| --- | --- | --- |
| Depth Distortion Loss (L_d) | Forces splats along a ray to converge to a single surface layer | Removes “floating” artifacts and ensures thin surfaces |
| Normal Consistency Loss (L_n) | Encourages neighbouring disks to share smooth, consistent orientations | Produces clean, continuous surface normals |

Together, these losses make 2DGS both geometrically accurate and visually stable, with clear depth boundaries and physically correct normals.
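
A schematic PyTorch sketch of the two regularizers, assuming the per-ray blending weights, intersection depths, and normals are already available from the rasterizer (tensor names and shapes are ours):

import torch

def depth_distortion_loss(weights, z):
    # L_d: penalize the spread of blending weights in depth so splats
    # collapse onto a single thin surface along each ray
    diff = (z[:, None] - z[None, :]).abs()
    return (weights[:, None] * weights[None, :] * diff).sum()

def normal_consistency_loss(weights, splat_normals, depth_normal):
    # L_n: align each splat's normal with the normal implied by the depth map
    cos = (splat_normals * depth_normal).sum(dim=-1)
    return (weights * (1.0 - cos)).sum()

# toy values for one ray intersecting 4 splats
w = torch.tensor([0.4, 0.3, 0.2, 0.1])
z = torch.tensor([1.00, 1.01, 1.02, 1.50])
n = torch.nn.functional.normalize(torch.randn(4, 3), dim=-1)
N = torch.tensor([0.0, 0.0, 1.0])
print(depth_distortion_loss(w, z), normal_consistency_loss(w, n, N))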

2.4 Real-Time Rendering with Geometric Precision

A significant strength of 2DGS is that it delivers NeRF-level visual quality and SDF-level geometry at 3DGS-level real-time speeds. SDF, short for Signed Distance Function, is a mathematical representation of a 3D shape or surface. It tells us, for any point in 3D space, how far that point is from the surface, and whether it is inside or outside the object.
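
As a tiny illustration of the idea, the SDF of a sphere can be written in a few lines (a toy example, unrelated to the 2DGS codebase):

import numpy as np

def sphere_sdf(p, center=np.zeros(3), radius=1.0):
    # distance to the surface: negative inside, zero on it, positive outside
    return np.linalg.norm(p - center) - radius

print(sphere_sdf(np.array([2.0, 0.0, 0.0])))   #  1.0 -> one unit outside
print(sphere_sdf(np.array([0.0, 0.0, 0.0])))   # -1.0 -> at the sphere's center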

  • Unlike NeRF, which uses MLPs to evaluate radiance hundreds of times per pixel, 2DGS directly manipulates explicit Gaussian parameters.
  • Rendering uses standard rasterization and alpha blending – operations natively optimized on modern GPUs.
  • Every stage (intersection, blending, filtering) is differentiable, allowing end-to-end optimization without sacrificing runtime speed.

3. How 2D Gaussian Splatting Works – A Step-by-Step Pipeline

At its heart, 2D Gaussian Splatting (2DGS) converts ordinary multi-view images into a precise, differentiable surface-based radiance field. The pipeline explained below traces this journey from input images to the final rendered output, detailing every transformation along the way.

Stage 1: Input and Initialization

Input:

  • A set of calibrated multi-view images (each with known camera intrinsics and extrinsics).
  • Optional sparse point cloud initialization from Structure-from-Motion (SfM).

Goal:

Provide an initial guess of the scene’s geometry – rough point positions and normals.

Process:

Each SfM point becomes a seed Gaussian primitive. Unlike 3DGS (which would initialize ellipsoids in 3D space), 2DGS initializes 2D disks, each tangent to the estimated surface. Each disk starts with:

| Parameter | Meaning |
| --- | --- |
| p_k | 3D center position of the disk |
| t_u, t_v | Two tangent vectors defining its local 2D plane |
| n_k = t_u × t_v | Surface normal |
| s_u, s_v | Scaling factors controlling disk size along tangent axes |
| c_k, α_k | Initial color and opacity |

This setup encodes the local geometry and appearance at each surface patch before optimization begins.

Stage 2: Tangent-Plane Parameterization

Each disk defines a local 2D coordinate system – the tangent plane – where the Gaussian’s spread is computed.

P(u, v) = p_k + s_u t_u u + s_v t_v v

where p_k is the disk center in 3D world space, t_u and t_v are the orthogonal tangent vectors, (u, v) are the local coordinates within the disk, and s_u and s_v are the scaling factors determining the disk radius along each axis.

On this plane, the Gaussian falloff is isotropic:

G(u, v) = exp(−(u² + v²) / 2)
This function defines how opacity and color intensity decrease smoothly away from the disk center.
Every disk thus becomes a tiny, smooth “surface brushstroke” contributing to the scene’s radiance.

Stage 3: Transformation to Camera Space

To render from a specific viewpoint, every disk must be expressed relative to the camera.

Transformation Matrix:

  • R = [t_u, t_v, n_k]: rotation matrix from local → world coordinates.
  • S = diag(s_u, s_v, 0): scaling along the local axes (zero along the normal, since the disk is flat).
  • p_k: translation (disk center).

These combine into a 4×4 homogeneous transform H = [RS | p_k] (with bottom row (0, 0, 0, 1)) that maps local disk coordinates to world space. Multiplying by the world-to-camera transform W then expresses the disk in the camera’s coordinate frame:

x_cam = W H (u, v, 1, 1)^T

This step positions every disk exactly as the camera “sees” it, preparing for ray-based evaluation.
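
A small NumPy sketch of this transform, assuming the homogeneous matrix H described above (the helper name and example values are our own):

import numpy as np

def build_H(p_k, t_u, t_v, s_u, s_v):
    # columns: scaled tangent axes, a zero column (the disk has no thickness),
    # and the disk center as translation; the bottom row keeps it homogeneous
    H = np.eye(4)
    H[:3, 0] = s_u * t_u
    H[:3, 1] = s_v * t_v
    H[:3, 2] = 0.0
    H[:3, 3] = p_k
    return H

# hypothetical splat and an identity world-to-camera transform W
H = build_H(np.array([0.0, 0.0, 2.0]), np.array([1.0, 0, 0]), np.array([0, 1.0, 0]), 0.1, 0.1)
W = np.eye(4)
x_cam = W @ H @ np.array([0.5, -0.3, 1.0, 1.0])   # x_cam = W H (u, v, 1, 1)^T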

Stage 4: Perspective-Correct Ray-Splat Intersection

This is the mathematical heart of 2DGS – how a pixel ray interacts with a disk. Older surface splatting methods (and even 3DGS) relied on affine projections that approximate how a disk looks under perspective. 2DGS replaces this with an exact ray-plane intersection computation.

Process:

  • Each image pixel corresponds to a camera ray defined by its position and direction.
  • For each Gaussian disk, we compute where that ray intersects the disk’s plane.
  • The intersection point’s coordinates (u,v) on the tangent plane are obtained analytically, by expressing the ray as the intersection of two homogeneous planes and transforming those planes into the splat’s local frame (see the sketch after this list for a geometric equivalent).

  • Using these (u,v) coordinates, we evaluate the Gaussian function G(u,v).
    The result indicates how strongly this disk contributes to the pixel’s color.

Benefit: This exact ray–plane intersection ensures perspective-correct rendering – the shape, size, and contribution of every disk remain geometrically accurate, even under extreme viewing angles.
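
Below is a plain geometric version of this computation in NumPy. The paper’s rasterizer solves the same intersection via homogeneous plane equations for efficiency, so treat this as an illustrative sketch, not the repo’s CUDA kernel:

import numpy as np

def ray_splat_uv(ray_o, ray_d, p_k, t_u, t_v, s_u, s_v):
    n = np.cross(t_u, t_v)                 # plane normal of the disk
    denom = ray_d @ n
    if abs(denom) < 1e-8:                  # ray parallel to the plane: no hit
        return None
    t = ((p_k - ray_o) @ n) / denom        # ray parameter at the intersection
    d = (ray_o + t * ray_d) - p_k          # in-plane offset from the disk center
    return (d @ t_u) / s_u, (d @ t_v) / s_v

# hypothetical pixel ray hitting a disk two units down the z axis
uv = ray_splat_uv(np.zeros(3), np.array([0.0, 0.0, 1.0]),
                  np.array([0.02, -0.01, 2.0]), np.array([1.0, 0, 0]),
                  np.array([0, 1.0, 0]), 0.1, 0.1)
# the splat's contribution is then G(u, v) = exp(-(u**2 + v**2) / 2)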

Stage 5: Filtering and Stability

When a disk is viewed edge-on, its projected area may shrink dramatically, leading to aliasing or numerical instability.

To prevent this, 2DGS applies an object-space low-pass filter (following Botsch et al., 2005):

ĝ(x) = max{ G(u(x)), G((x − c) / σ) }

  • G(u(x)): the exact Gaussian response at the ray–splat intersection.
  • G((x − c)/σ): a fixed screen-space Gaussian centred at the projected disk centre c, which takes over when the projected size is tiny.
  • σ = √2 / 2 (in pixels) in practice.

This adaptive smoothing maintains temporal stability and ensures that disks remain visible even when seen at sharp grazing angles.
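
A few lines of NumPy capture the idea, assuming the exact response and the projected center are already computed (the names are ours):

import numpy as np

SIGMA = np.sqrt(2.0) / 2.0   # screen-space filter scale, as in the paper

def filtered_response(g_exact, pixel, center_2d):
    # take the larger of the exact ray-splat Gaussian and a fixed
    # screen-space Gaussian around the projected center c
    g_screen = np.exp(-np.sum((pixel - center_2d) ** 2) / (2.0 * SIGMA ** 2))
    return max(g_exact, g_screen)

# a grazing-angle splat: the exact response is tiny, but the filter keeps it visible
print(filtered_response(1e-6, np.array([10.2, 7.1]), np.array([10.0, 7.0])))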

Stage 6: Alpha Compositing (Color Accumulation)

Once all disks are projected and evaluated, the renderer composites them using a front-to-back blending rule – similar to 3DGS but over surface-aligned splats:

c(x) = Σ_i c_i α_i ĝ_i(u(x)) ∏_{j<i} (1 − α_j ĝ_j(u(x)))

Here:

  • c_i: disk color (possibly view-dependent).
  • α_i: disk opacity.
  • ĝ_i(u(x)): filtered Gaussian value.
  • The product term represents the transparency accumulated from the splats in front.

Disks are sorted by depth (nearest first), and blending continues until total opacity ≈ 1, meaning the pixel is fully covered. This yields the final rendered color c(x) for each pixel.

Because all operations are differentiable, gradients can flow back to disk parameters during training.
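
A minimal NumPy sketch of this front-to-back rule (a reference loop for clarity; the real rasterizer does this per tile on the GPU):

import numpy as np

def composite(colors, alphas, g_values):
    # front-to-back alpha blending over depth-sorted (nearest-first) splats
    pixel, transmittance = np.zeros(3), 1.0
    for c_i, a_i, g_i in zip(colors, alphas, g_values):
        w = transmittance * a_i * g_i        # this splat's contribution weight
        pixel += w * c_i
        transmittance *= 1.0 - a_i * g_i     # light surviving past this splat
        if transmittance < 1e-4:             # early exit: pixel fully covered
            break
    return pixel

# three overlapping splats, nearest first
print(composite(np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1.0]]),
                np.array([0.8, 0.7, 0.9]), np.array([0.9, 0.5, 1.0])))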

Stage 7: Differentiable Optimization in 2D Gaussian Splatting

During training, 2DGS updates each disk’s parameters (p_k, t_u, t_v, s_u, s_v, c_k, α_k) using gradient descent.

The loss function combines:

| Component | Purpose |
| --- | --- |
| L_c | RGB reconstruction (color fidelity) |
| L_d | Depth distortion – keeps splats on a thin surface |
| L_n | Normal consistency – smooths orientations |

L = L_c + α L_d + β L_n

This ensures that disks:

  • stay aligned with real-world surfaces,
  • maintain smooth normals, and
  • reproduce image colors accurately.
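
The sketch below shows the optimization skeleton in PyTorch with stand-in losses; parameter shapes, the learning rate, and the loss weights are illustrative, not the repo’s defaults:

import torch

# hypothetical per-disk parameters; requires_grad lets gradients from the
# differentiable rasterizer reach geometry and appearance jointly
N = 1000
p     = torch.randn(N, 3, requires_grad=True)   # centers p_k
s     = torch.rand(N, 2, requires_grad=True)    # scales (s_u, s_v)
c     = torch.rand(N, 3, requires_grad=True)    # colors c_k
alpha = torch.rand(N, 1, requires_grad=True)    # opacities alpha_k
opt = torch.optim.Adam([p, s, c, alpha], lr=1e-3)

# stand-in losses; in 2DGS these come from the rendered image, ray depths,
# and normal maps produced by the differentiable rasterizer
l_c = (c - 0.5).pow(2).mean()
l_d = s.pow(2).mean()
l_n = (1.0 - torch.nn.functional.normalize(p, dim=-1)[:, 2]).mean()
loss = l_c + 1.0 * l_d + 0.05 * l_n   # L = L_c + alpha*L_d + beta*L_n (illustrative weights)
loss.backward()
opt.step()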

Stage 8: Output – Real-Time Rendering and Geometry Extraction

Outputs:

  • Rendered Images:
    • High-quality, geometrically accurate novel views produced in real time.
  • Depth Maps:
    • Consistent, noise-free depth estimation per pixel.
  • Triangle Mesh:
    • Extracted from depth maps via TSDF fusion + Marching Cubes, yielding watertight, artifact-free 3D geometry.
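
As a sketch of that final step, TSDF fusion and mesh extraction can be done with Open3D roughly as follows. render.py handles this internally, so the function below, its `views` input, and the voxel parameters are our own illustrative assumptions:

import open3d as o3d

def fuse_views_to_mesh(views, voxel_length=0.004, sdf_trunc=0.02):
    # views: assumed iterable of (color, depth, intrinsic, extrinsic) per frame
    volume = o3d.pipelines.integration.ScalableTSDFVolume(
        voxel_length=voxel_length,   # voxel size in scene units
        sdf_trunc=sdf_trunc,         # truncation band of the signed distance field
        color_type=o3d.pipelines.integration.TSDFVolumeColorType.RGB8)
    for color, depth, intrinsic, extrinsic in views:
        rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
            color, depth, depth_scale=1.0, depth_trunc=5.0,
            convert_rgb_to_intensity=False)
        volume.integrate(rgbd, intrinsic, extrinsic)   # fuse this view's depth
    mesh = volume.extract_triangle_mesh()   # Marching Cubes over the fused TSDF
    mesh.compute_vertex_normals()
    return mesh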

3.1 2DGS Pipeline Summary – Input to Output

| Stage | Description | Output |
| --- | --- | --- |
| 1 | Multi-view images → SfM point initialization | Rough surface seeds |
| 2 | Tangent-plane parameterization | 2D Gaussian disks on surfaces |
| 3 | Transform to camera space | Camera-aligned splats |
| 4 | Ray–splat intersection | Perspective-correct (u,v) hits |
| 5 | Filtering | Stable Gaussian responses |
| 6 | Alpha compositing | Per-pixel blended colors |
| 7 | Differentiable optimization | Learned geometry + color |
| 8 | Rasterization / mesh extraction | Final rendered views + 3D model |

4. 2DGS Pipeline Execution

This section walks through the complete process of executing the 2D Gaussian Splatting (2DGS) pipeline on the MipNeRF360 “Flowers” dataset – including environment setup, dependency conflicts, CUDA version mismatches, and runtime fixes. The complete step-by-step procedure is as follows –

  • Start by cloning the 2DGS implementation (surfel-based)
git clone https://github.com/hbb1/2d-gaussian-splatting.git --recursive
  • Environment Setup

The repo provides an environment.yml file that installs PyTorch with CUDA 11.8, includes Open3D and Trimesh for mesh extraction and visualization, and builds the CUDA extensions (diff-surfel-rasterization, simple-knn).

conda env create --file environment.yml
conda activate surfel_splatting

But during the creation of the environment, we might get:

RuntimeError: The detected CUDA version (12.X) mismatches the version that was used to compile PyTorch (11.8)

The error occurs when the system’s CUDA toolkit does not match the CUDA 11.8 build the repo environment expects. To fix this, we can create a clean environment using PyTorch built for CUDA 12.1, which is compatible with a CUDA 12.x runtime. The terminal commands to create this clean environment are given below –

conda create -n surfel_splatting python=3.10 -y
conda activate surfel_splatting
pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu121

Verify CUDA version:

python -c "import torch; print(torch.__version__, torch.version.cuda)"

Output should be:

2.1.2+cu121  12.1
  • Install Remaining Dependencies
conda install -c conda-forge ffmpeg=4.2.2 pillow=10.2.0 typing_extensions=4.9.0 pip=23.3.1
pip install open3d==0.18.0 mediapy==1.1.2 lpips==0.1.4 scikit-image==0.21.0 tqdm==4.66.2 trimesh==4.3.2 plyfile opencv-python
  • Build CUDA Submodules
cd submodules/diff-surfel-rasterization
python setup.py install

cd ../simple-knn
python setup.py install

After successfully running the above terminal commands, our conda environment with all dependencies and built submodules is ready. Now we can proceed with the training and rendering parts of the pipeline.

  • Training Command
python train.py -s /home/opencvuniv/Work/Shubham/2d-gaussian-splatting/360_extra_scenes/flowers -m output/m360/flowers

This loads the images and camera poses, optimizes the surfel-based Gaussians, and saves checkpoints to output/m360/flowers/. Make sure the dataset path passed to -s matches your local setup.

Progress logs report the current iteration and training loss; training ends at the specified iteration count (default: 30,000).

  • Rendering Command
python render.py -s /home/opencvuniv/Work/Shubham/2d-gaussian-splatting/360_extra_scenes/flowers -m output/m360/flowers --skip_train --skip_test --mesh_res 1024

This loads the trained checkpoint and extracts a bounded mesh of the reconstruction (the --skip_train and --skip_test flags skip re-rendering the training and test views) – useful when we want to focus on the foreground.

But if we want to render with unbounded mesh extraction, use the below terminal command –

python render.py -s /home/opencvuniv/Work/Shubham/2d-gaussian-splatting/360_extra_scenes/flowers -m output/m360/flowers --unbounded --skip_train --skip_test --mesh_res 1024

5. 2DGS Rasterized Output on MipNeRF360 Flowers Dataset

Limitations & Future Work in 2DGS

Even though 2DGS sets a new benchmark, the authors note three limitations:

  • Semi-Transparent Objects – Assumes opaque surfaces; struggles with glass or water.
  • Densification Bias – Spawns more splats in texture-rich areas than flat ones; can miss tiny structures.
  • Over-Smoothing Trade-Off – Strong regularization may slightly reduce micro-detail.

6. Conclusion

2D Gaussian Splatting fundamentally rethinks radiance-field representation.
By replacing volumetric 3D Gaussians with surface-aligned 2D disks and coupling them with geometric regularization, it achieves:

  • Physically accurate geometry
  • Real-time rendering performance
  • Compact storage and clean meshes

It elegantly unifies the visual fidelity of NeRF with the geometric rigor of SDFs – marking a new era in geometry-aware neural rendering.

7. References

  • Huang, B., Yu, Z., Chen, A., Geiger, A., and Gao, S. “2D Gaussian Splatting for Geometrically Accurate Radiance Fields.” SIGGRAPH 2024.
  • Kerbl, B., Kopanas, G., Leimkühler, T., and Drettakis, G. “3D Gaussian Splatting for Real-Time Radiance Field Rendering.” ACM Transactions on Graphics, SIGGRAPH 2023.
  • Mildenhall, B., Srinivasan, P. P., Tancik, M., Barron, J. T., Ramamoorthi, R., and Ng, R. “NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis.” ECCV 2020.
  • Botsch, M., Hornung, A., Zwicker, M., and Kobbelt, L. “High-Quality Surface Splatting on Today’s GPUs.” Symposium on Point-Based Graphics, 2005.
  • 2DGS reference implementation: https://github.com/hbb1/2d-gaussian-splatting
