Zhe Jun Tang, Ph.D.
Machine Learning Researcher specializing in probabilistic modeling, deep learning, and large-scale data analysis. Experienced in designing algorithms for complex, high-dimensional problems across computer vision, NLP, and time-series forecasting, bridging theory and practical impact.
Education
University of Oxford
Oxford, U.K. | 2018
Master's in Engineering Science | Non-Graduating Study Abroad Year; Thesis Graded First
Courses: Statistical Learning, Information Engineering, Control Systems, Machine Vision & Robotics
National University of Singapore
Singapore | 2019
Bachelor's in Electrical Engineering | First Class Honors
Texas Instruments Book Prize Award | Top student in Digital Signal Processing and System
United Engineers Book Prize Award | Top student in Industrial Control System
First-Author Publications
3iGS: Factorised Tensorial Illumination for 3D Gaussian Splatting
Abstract. The use of 3D Gaussians as a representation of radiance fields has enabled high-quality novel view synthesis at real-time rendering speeds. However, the choice of optimising the outgoing radiance of each Gaussian independently as spherical harmonics results in unsatisfactory view-dependent effects. In response to these limitations, our work, Factorised Tensorial Illumination for 3D Gaussian Splatting (3iGS), improves upon 3D Gaussian Splatting (3DGS) rendering quality. Instead of optimising a single outgoing radiance parameter, 3iGS enhances 3DGS view-dependent effects by expressing the outgoing radiance as a function of a local illumination field and Bidirectional Reflectance Distribution Function (BRDF) features. We optimise a continuous incident illumination field through a tensorial factorisation representation, while separately fine-tuning the BRDF features of each 3D Gaussian relative to this illumination field. Our method significantly enhances the rendering quality of specular view-dependent effects in 3DGS while maintaining rapid training and rendering speeds.
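The core idea, expressing outgoing radiance as the interaction of a factorised local illumination field with per-Gaussian BRDF features, can be illustrated with a minimal numpy sketch. All names, sizes, and the inner-product decoder below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
R, G, F = 4, 8, 16   # rank, grid resolution, feature dim (illustrative sizes)

# Rank-R factorisation of a 3D illumination grid:
# field[x, y, z] ≈ sum_r vx[r, x] * vy[r, y] * vz[r, z] * basis[r]
vx = rng.normal(size=(R, G))
vy = rng.normal(size=(R, G))
vz = rng.normal(size=(R, G))
basis = rng.normal(size=(R, F))      # per-rank illumination feature vectors

def illumination(ix, iy, iz):
    """Query the factorised field at integer grid indices (no interpolation)."""
    w = vx[:, ix] * vy[:, iy] * vz[:, iz]   # (R,) per-rank weights
    return w @ basis                         # (F,) local illumination features

def outgoing_radiance(brdf_feat, ix, iy, iz):
    """Toy decoder: inner product of per-Gaussian BRDF features with the
    local illumination features (a stand-in for a learned decoder)."""
    return float(brdf_feat @ illumination(ix, iy, iz))

feat = rng.normal(size=F)                    # BRDF features of one Gaussian
radiance = outgoing_radiance(feat, 2, 3, 4)
```

The factorised field stores O(3·R·G + R·F) parameters instead of a dense O(G³·F) grid, which is what keeps the illumination query cheap at render time.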
ABLE-NeRF: Attention-Based Rendering with Learnable Embeddings for Neural Radiance Fields
Abstract. Neural Radiance Field (NeRF) is a popular method for representing 3D scenes by optimising a continuous volumetric scene function. Its success, which lies in applying volumetric rendering (VR), is also its Achilles' heel in producing view-dependent effects. As a consequence, glossy and transparent surfaces often appear murky. One remedy to reduce these artefacts is to constrain the VR equation by excluding volumes with back-facing normals. While this approach has some success in rendering glossy surfaces, translucent objects are still poorly represented. In this paper, we present an alternative to the physics-based VR approach by introducing a self-attention-based framework over volumes along a ray. In addition, inspired by modern game engines, which use Light Probes to store local lighting passing through the scene, we incorporate Learnable Embeddings to capture view-dependent effects within the scene. Our method, which we call ABLE-NeRF, significantly reduces 'blurry' glossy surfaces in rendering and produces realistic translucent surfaces lacking in prior art. On the Blender dataset, ABLE-NeRF achieves SOTA results and surpasses Ref-NeRF in all three image quality metrics: PSNR, SSIM, and LPIPS.
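The replacement of alpha compositing with attention over ray samples can be sketched in a few lines of numpy. This is a single attention head with no learned projections, and all names and sizes are illustrative, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(1)
N, D = 32, 16                        # samples per ray, feature dim (illustrative)

samples = rng.normal(size=(N, D))    # per-sample volume features along one ray
token = rng.normal(size=(1, D))      # learnable embedding acting as the query

def attend(q, kv):
    """Scaled dot-product attention: the query aggregates the ray samples
    with learned (here random) weights instead of physics-based compositing."""
    scores = q @ kv.T / np.sqrt(kv.shape[1])          # (1, N)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)             # softmax over samples
    return w @ kv, w                                  # (1, D) ray feature

ray_feature, weights = attend(token, samples)
```

Because the aggregation weights are learned rather than derived from densities, the model is free to express view-dependent effects that the fixed volumetric rendering equation struggles with.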
MPT-Net: Mask Point Transformer Network for Large-Scale Point Cloud Semantic Segmentation
Abstract. Point cloud semantic segmentation is important for road scene perception, a task driverless vehicles must master to achieve full-fledged autonomy. In this work, we introduce Mask Point Transformer Network (MPT-Net), a novel architecture for point cloud segmentation that is simple to implement. MPT-Net consists of a local and global feature encoder and a transformer-based decoder: a 3D Point-Voxel Convolution encoder backbone with voxel self-attention to encode features, and a Mask Point Transformer (MPT) module to decode point features and segment the point cloud. Firstly, we introduce the novel MPT, designed specifically for point cloud segmentation. MPT offers two benefits: it attends to every point in the point cloud using mask tokens to extract class-specific features globally via cross-attention, and it provides inter-class feature exchange via self-attention on the learned mask tokens. Secondly, we design a backbone that uses sparse point-voxel convolutional blocks and a transformer self-attention block to learn local and global contextual features. We evaluate MPT-Net on the large-scale outdoor driving scene point cloud datasets SemanticKITTI and nuScenes. Our experiments show that by replacing the standard segmentation head with MPT, MPT-Net achieves state-of-the-art performance, surpassing our baseline approach by 3.8% on SemanticKITTI, and is highly effective in detecting 'stuff' classes in point clouds.
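The mask-token mechanism, where one learned token per class cross-attends to all points and the resulting class features score each point, can be illustrated with a minimal numpy sketch. Sizes, the single attention head, and the dot-product classifier are illustrative assumptions, not the MPT-Net implementation:

```python
import numpy as np

rng = np.random.default_rng(2)
P, C, D = 1000, 5, 16        # points, classes, feature dim (illustrative)

points = rng.normal(size=(P, D))        # encoded per-point features
mask_tokens = rng.normal(size=(C, D))   # one learnable token per class

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Cross-attention: each class token aggregates class-specific point features
# from the whole cloud in one pass.
attn = softmax(mask_tokens @ points.T / np.sqrt(D))   # (C, P)
class_feats = attn @ points                           # (C, D)

# Per-point class logits via dot product with the class features;
# argmax yields the segmentation labels.
logits = points @ class_feats.T                       # (P, C)
labels = logits.argmax(axis=1)                        # (P,)
```

Scoring points against C class features costs O(P·C·D), which is why replacing a per-point classification head with mask tokens keeps decoding cheap even for large outdoor scans.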
Experience
NTU S-Lab for Advanced Intelligence
Singapore | 2022 – 2024
Machine Learning Researcher | Neural Rendering, Real-Time 3D Graphics
- Developed learned reflectance parameterisations to model non-linear radiance behavior in 3D Gaussian splatting.
- Optimised structured tensor decompositions for efficient real-time illumination modeling.
- Proposed transformer-based inverse rendering models embedding physics priors into attention.
- Experimented with LLMs/VLMs for language-aligned scene editing objectives.
SenseTime Research
Singapore | 2019 – 2022
Algorithm Researcher | Point Cloud Segmentation & 3D Scene Understanding
- Built compute-efficient cross-attention transformer for large-scale 3D point segmentation.
- Designed masked-token attention, reducing class-level decoding complexity.
- Parallelized training across >100 GPUs, cutting training time by 40%.
ST Electronics
Singapore | 2017 – 2018
AI Research Intern | Signal Processing & Drone Detection via Deep Learning
- Developed patented deep learning method for spectrum-based drone detection.
- Deployed CUDA-based FFT algorithms to enhance air defense system performance.
University of Oxford, Oxford Photonics Group
Oxford, U.K. | 2017 – 2018
Student Researcher | Optical Wireless Communication & Tracking
- Built high-speed optical communication links for VR applications.
- Designed low-cost optical tracking systems, reducing costs by 90%.
Technical Proficiencies
Languages
Python (proficient); C, CUDA, Verilog, Assembly (basic)
Platforms
NVIDIA GPU clusters (HPC), Linux, Xilinx FPGA, HFSS, Arduino
ML Frameworks
PyTorch, TensorFlow, NumPy, pandas, SciPy, scikit-learn, Matplotlib, OpenCV
Honors, Awards, and Competition Ranking
NTU - SenseTime Talent Programme: Full scholarship for Ph.D. candidature and a monthly stipend of ~US$4,000
Texas Instruments Book Prize: Top Student in NUS ECE for Digital Signal Processing and System
United Engineers - Faculty of Engineering Annual Book Prize: Top Student in NUS ECE for Industrial Control System
LBC Family Engineering Scholarship: Awarded to the top 0.33% of Engineering students for dual matriculation at the University of Oxford
WorldQuant Brain / Quantitative Researcher: GOLD Certificate
SemanticKITTI LiDAR Segmentation Competition 2021: 11th worldwide (solo)