MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors

Ever tried to build a dense 3D map from a handheld camera and found the results noisy, incomplete, or too slow to be useful? Dense SLAM pipelines often struggle to combine real-time tracking with high-quality reconstruction priors. MASt3R-SLAM aims to bridge that gap: a practical, research-grade system that brings 3D reconstruction priors into a real-time dense SLAM pipeline.

This article walks through what MASt3R-SLAM does, who it’s for, how it works, and how you can get started using the repository and the provided checkpoints. All information below is drawn exclusively from the repository text provided for this project.

What It Does

At its core, MASt3R-SLAM is a real-time dense SLAM system that leverages learned 3D reconstruction priors to improve dense mapping and tracking quality. The project integrates prior reconstruction knowledge with SLAM to produce denser, higher-fidelity reconstructions while maintaining real-time performance.

Problems it addresses:

  • Making dense 3D reconstruction faster and more reliable in real-time settings.
  • Combining learned priors from state-of-the-art reconstruction models with SLAM tracking and mapping.
  • Providing a practical codebase and checkpoints so researchers can reproduce results and test on standard datasets.

Note: A full paper, project page and demo videos are linked in the repository and accompany the code for deeper technical detail and evaluation results.

Who It’s For

MASt3R-SLAM is aimed primarily at researchers and advanced practitioners working in robotics, computer vision, and augmented reality who need a real-time dense SLAM system that benefits from learned reconstruction priors.

Typical users and use cases:

  • Academic researchers comparing SLAM and reconstruction approaches or building on learned priors.
  • Robotics engineers wanting denser maps for navigation, manipulation, or scene understanding.
  • AR/VR developers interested in live dense mapping for immersive experiences.
  • Anyone who wants a reproducible codebase for experiments on datasets such as TUM-RGBD, 7-Scenes, EuRoC, and ETH3D SLAM.

Skill level: Intermediate to advanced. The repository expects familiarity with Python, Conda environments, CUDA-enabled PyTorch, and basic SLAM/dataset handling. If you are comfortable with Git, Conda and GPU-enabled deep learning stacks, you will be able to run the provided demos and evaluations.

How It Works

The repository combines several components to achieve real-time dense SLAM with reconstruction priors. Full implementation details are given in the accompanying paper, but at a high level the codebase integrates learned reconstruction models with a SLAM backend that runs in real time (an illustrative sketch follows the overview below).

Technical overview:

  • MASt3R-style reconstruction priors are integrated into the SLAM processing pipeline via pretrained checkpoints and retrieval codebooks.
  • The implementation is written for PyTorch and requires a CUDA-enabled GPU. Different CUDA versions are supported via specific PyTorch installation commands.
  • It leverages external open-source projects as building blocks: MASt3R, MASt3R-SfM, DROID-SLAM, and ModernGL (all acknowledged in the repository).
  • The system supports multiple data sources: live Realsense camera feeds, MP4 videos, and folders of images. It also supplies scripts to download common SLAM datasets for benchmarking.
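
Putting the components above together, here is a minimal, illustrative sketch of how a learned reconstruction prior might sit inside a per-frame dense SLAM loop. It is not MASt3R-SLAM's actual API: the prior and estimate_pose callables and the keyframe heuristic are hypothetical placeholders; consult the paper and source code for the real pipeline.

# Illustrative sketch only -- NOT MASt3R-SLAM's actual API. `prior`,
# `estimate_pose`, and the keyframe heuristic are hypothetical placeholders.
from typing import Callable, List, Tuple
import numpy as np

PointMap = np.ndarray   # (H, W, 3) dense 3D point per pixel
Pose = np.ndarray       # 4x4 camera-to-world transform

def run_dense_slam(
    frames: List[np.ndarray],
    prior: Callable[[np.ndarray], Tuple[PointMap, np.ndarray]],
    estimate_pose: Callable[[PointMap, List[Tuple[PointMap, Pose]]], Pose],
) -> List[Tuple[PointMap, Pose]]:
    """For each frame: query the learned prior for dense geometry, track the
    camera against the keyframes collected so far, and keep frames that the
    (placeholder) heuristic deems useful for the map."""
    keyframes: List[Tuple[PointMap, Pose]] = []
    for frame in frames:
        pointmap, confidence = prior(frame)        # dense geometry + per-pixel confidence
        pose = estimate_pose(pointmap, keyframes)  # tracking against the existing map
        if not keyframes or confidence.mean() > 0.5:   # placeholder keyframe test
            keyframes.append((pointmap, pose))
    return keyframes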

Key technologies and dependencies

  • Python 3.11 (Conda environment)
  • PyTorch (specific versions matched to system CUDA)
  • NVIDIA CUDA toolkit (nvcc is used to check the installed version)
  • Third-party modules included as submodules and editable installs (e.g., thirdparty/mast3r, thirdparty/in3d)
  • Optional: torchcodec for faster MP4 loading

Getting Started

Follow the setup steps below, which mirror the repository's instructions. For details not covered here, the repository links to additional resources (paper, project page, issue tracker).

# Create and activate the Conda environment
conda create -n mast3r-slam python=3.11
conda activate mast3r-slam

# Check your CUDA version
nvcc --version

The repository provides specific PyTorch installation commands depending on your CUDA version. Choose the matching block for your system:

# CUDA 11.8
conda install pytorch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 pytorch-cuda=11.8 -c pytorch -c nvidia

# CUDA 12.1
conda install pytorch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 pytorch-cuda=12.1 -c pytorch -c nvidia

# CUDA 12.4
conda install pytorch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 pytorch-cuda=12.4 -c pytorch -c nvidia
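
Whichever block matches your system, you can confirm that the install picked up the intended CUDA build with a few standard PyTorch calls (this check is not part of the repository, just a convenience):

# Sanity check: PyTorch version, CUDA build, and GPU visibility.
import torch

print("PyTorch:", torch.__version__)            # e.g. 2.5.1
print("CUDA build:", torch.version.cuda)        # should match the block you installed
print("GPU available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))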

Clone the repository and install dependencies (the project uses submodules and editable installs for third-party components):

git clone https://github.com/rmurai0610/MASt3R-SLAM.git --recursive
cd MASt3R-SLAM/

# if you've cloned the repo without --recursive, run
# git submodule update --init --recursive

pip install -e thirdparty/mast3r
pip install -e thirdparty/in3d
pip install --no-build-isolation -e .

# Optionally install torchcodec for faster mp4 loading
pip install torchcodec==0.1

The repository requires model checkpoints for MASt3R and retrieval. Licensing for these checkpoints and the associated datasets is described in the MASt3R project’s CHECKPOINTS_NOTICE. The repository provides commands to download the required checkpoints into a local checkpoints/ folder:

mkdir -p checkpoints/
wget https://download.europe.naverlabs.com/ComputerVision/MASt3R/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth -P checkpoints/
wget https://download.europe.naverlabs.com/ComputerVision/MASt3R/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_trainingfree.pth -P checkpoints/
wget https://download.europe.naverlabs.com/ComputerVision/MASt3R/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_codebook.pkl -P checkpoints/
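
After the downloads finish, a quick way to confirm that all three files landed in checkpoints/ is a short script like the one below (a convenience helper, not part of the repository):

# Convenience check (not part of the repository): verify the three checkpoint
# files from the README exist in checkpoints/ and are non-empty.
from pathlib import Path

expected = [
    "MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth",
    "MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_trainingfree.pth",
    "MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_codebook.pkl",
]
for name in expected:
    path = Path("checkpoints") / name
    status = f"{path.stat().st_size / 1e6:.1f} MB" if path.exists() else "MISSING"
    print(f"{name}: {status}")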

WSL users: The repository notes that Ubuntu has been primarily tested. If you run under WSL, switch to the windows branch to disable multiprocessing (which can cause shared-memory issues) with:

git checkout windows

Running examples and demos

Example dataset download and run commands from the repository:

bash ./scripts/download_tum.sh
python main.py --dataset datasets/tum/rgbd_dataset_freiburg1_room/ --config config/calib.yaml

Live demo with a Realsense camera:

python main.py --dataset realsense --config config/base.yaml

Process a video or a folder of images:

python main.py --dataset <path/to/video>.mp4 --config config/base.yaml
python main.py --dataset <path/to/folder> --config config/base.yaml

# With known calibration parameters (intrinsics.yaml)
python main.py --dataset <path/to/video>.mp4 --config config/base.yaml --calib config/intrinsics.yaml
python main.py --dataset <path/to/folder> --config config/base.yaml --calib config/intrinsics.yaml
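
If you have a whole directory of recordings to map, the same CLI can be driven from a small wrapper script. The sketch below simply shells out to main.py with the flags shown above; the recordings/ path is an assumption you should adapt:

# Batch-process every .mp4 in a folder by calling main.py with the documented
# flags. This wrapper is not part of the repository; adjust paths as needed.
import subprocess
from pathlib import Path

VIDEO_DIR = Path("recordings")   # hypothetical folder of .mp4 recordings
CONFIG = "config/base.yaml"

for video in sorted(VIDEO_DIR.glob("*.mp4")):
    print(f"Processing {video} ...")
    subprocess.run(
        ["python", "main.py", "--dataset", str(video), "--config", CONFIG],
        check=True,
    )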

Key Features

  • Real-time dense SLAM: Designed to run live or on recorded data while producing dense reconstructions.
  • 3D reconstruction priors: Integrates pretrained MASt3R priors and retrieval codebooks to improve reconstruction quality.
  • Multiple input sources: Live Realsense, MP4 video, or image folders.
  • Dataset support and evaluation scripts: Scripts to download and evaluate on TUM-RGBD, 7-Scenes, EuRoC, and ETH3D SLAM.
  • Editable third-party modules: Uses submodules and editable pip installs for rapid iteration (thirdparty/mast3r, thirdparty/in3d).

Comparison and unique selling points

While the repository does not include a direct head-to-head metric comparison in the README text, the combination of real-time SLAM with learned reconstruction priors and the supplied MASt3R checkpoints is the project’s distinguishing factor. The system is positioned as a practical implementation that builds on and integrates notable projects such as MASt3R and DROID-SLAM to achieve dense, real-time reconstruction.

Why It’s Worth Trying

If your work relies on dense mapping at interactive rates or you want to explore the use of learned 3D priors inside a SLAM pipeline, MASt3R-SLAM offers a readily usable codebase with downloadable checkpoints and dataset scripts. The repository includes links to a paper, a demo video, and a project page for deeper technical background and results.

Hardware note: The authors state experiments were run on an RTX 4090 and that results may differ on other GPUs. This is important when reproducing paper results.

Community, support and reproducibility: The repo acknowledges the open-source projects it builds on and includes scripts for evaluating on multiple datasets. The README also notes that results may differ slightly from the paper because the released implementation uses multiprocessing.

Project Links and Repository

The official repository is available at: https://github.com/rmurai0610/MASt3R-SLAM.

Additional resources linked in the repository header include:

  • The paper (arXiv preprint)
  • The project page
  • Demo videos

The repo also acknowledges and depends on the following projects (see the repository for details and license notes):

  • MASt3R
  • MASt3R-SfM
  • DROID-SLAM
  • ModernGL

Final Thoughts

MASt3R-SLAM is a focused, research-oriented codebase that brings learned 3D reconstruction priors into a real-time dense SLAM system. The repository is practical—providing environment setup instructions, downloadable checkpoints, dataset scripts, and evaluations—making it a solid starting point for anyone exploring dense mapping with learned priors.

If you intend to reproduce the paper’s results, pay attention to the CUDA/PyTorch versions, the checkpoint licenses (linked in the repository), and the authors’ note about hardware (RTX 4090) and minor reproducibility differences. For more detailed theory and quantitative results, consult the linked paper and project page.

Citation:

@article{murai2024_mast3rslam,
    title={{MASt3R-SLAM}: Real-Time Dense {SLAM} with {3D} Reconstruction Priors},
    author={Murai, Riku and Dexheimer, Eric and Davison, Andrew J.},
    journal={arXiv preprint},
    year={2024},
}