Deep & Machine Learning
In this project, we explore and experiment with existing NeRF-related neural networks. We are interested in using NeRF to synthesize authentic free-viewpoint photos, especially of large objects for which clean views are hard to obtain, such as large architecture with occlusions, or viewpoints too high to photograph without a drone.
This project explores gradient-domain processing for image blending, tone mapping and non-photorealistic rendering, focusing mainly on the Poisson blending algorithm. The tasks include basic gradient minimization, 4-neighbor Poisson blending, mixed-gradient Poisson blending and a color2gray method that preserves gray-scale intensity. The whole project is implemented in Python. The detailed implementation can be viewed here: https://github.com/CaoYuchen/16726/tree/main/a2, and on the related website.
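The core idea of Poisson blending can be illustrated in one dimension, where each pixel has two neighbors instead of four. The sketch below is only an illustration under stated assumptions: the function name `poisson_blend_1d` is hypothetical, and it uses plain Gauss-Seidel iterations rather than whatever solver the linked repository uses. Inside the blended region the result keeps the gradients of the source, while the pixels outside the region stay fixed to the target.

```python
def poisson_blend_1d(source, target, start, end, iters=500):
    """Blend source[start:end] into target while keeping source gradients.

    Hypothetical helper for illustration. Pixels outside [start, end)
    keep their target values and act as boundary conditions.
    """
    if start < 1 or end > len(target) - 1 or len(source) != len(target):
        raise ValueError("region needs a fixed target pixel on both sides")
    v = list(target)
    for _ in range(iters):
        for i in range(start, end):
            # Discrete Poisson equation in 1D:
            # 2*v[i] - v[i-1] - v[i+1] = 2*s[i] - s[i-1] - s[i+1]
            laplacian = 2 * source[i] - source[i - 1] - source[i + 1]
            v[i] = (v[i - 1] + v[i + 1] + laplacian) / 2
    return v


source = [0, 1, 4, 9, 16, 25]
target = [10, 10, 10, 10, 10, 10]
blended = poisson_blend_1d(source, target, start=1, end=5)
print(blended)
```

A mixed-gradient variant would, for each neighbor pair, use whichever of the source or target gradients has the larger magnitude instead of always using the source gradient.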
This project implements two well-known GAN architectures: DC-GAN and CycleGAN. It is written in PyTorch; the main code covers building the discriminator and generator networks, the loss functions, and forward and backward propagation. It also explores methods that help a GAN generate better results, such as data augmentation, differentiable augmentation, different loss functions and different discriminators, and runs the networks on different datasets to check their robustness. The detailed implementation can be viewed here: https://github.com/CaoYuchen/16726/tree/main/a3, and on the related website.
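For readers unfamiliar with GAN objectives, the following is a minimal sketch of the standard binary cross-entropy GAN losses in plain Python. It is an assumption that this is one of the loss variants tried in the project; the function names are hypothetical, and the real code operates on PyTorch tensors rather than lists of probabilities.

```python
import math


def bce(prob, label, eps=1e-12):
    """Binary cross-entropy for one predicted probability."""
    return -(label * math.log(prob + eps)
             + (1 - label) * math.log(1 - prob + eps))


def discriminator_loss(d_real, d_fake):
    """The discriminator should output 1 on real images and 0 on fakes."""
    real = sum(bce(p, 1.0) for p in d_real) / len(d_real)
    fake = sum(bce(p, 0.0) for p in d_fake) / len(d_fake)
    return real + fake


def generator_loss(d_fake):
    """Non-saturating generator loss: fakes should be scored as real."""
    return sum(bce(p, 1.0) for p in d_fake) / len(d_fake)


# A discriminator that cannot tell real from fake outputs 0.5 everywhere.
print(discriminator_loss([0.5, 0.5], [0.5, 0.5]))
print(generator_loss([0.9]), generator_loss([0.1]))
```

The generator loss shrinks as the discriminator assigns higher "real" probability to generated samples, which is the signal the generator is trained on.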
Neural Style Transfer
Neural Style Transfer here uses a VGG-19 based network, with MSE as the loss function and L-BFGS to optimize the input image (noise). It uses only the feature-extraction part of VGG-19, and only in evaluation mode (no gradient updates for these layers); the optimization instead happens at the two ends, the loss function and the input. The loss function consists of two parts, content loss and style loss; we’ll implement them separately first, and then combine them with assigned weights. The detailed implementation can be viewed here: https://github.com/CaoYuchen/16726/tree/main/a4, and on the related website.
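The two loss terms can be sketched without any deep learning library. In the hedged example below, a feature map is a list of channels, each a flat list of activations; content loss is the MSE between feature maps, and style loss is the MSE between their Gram matrices. The function names, the Gram normalization and the default weights are illustrative assumptions, not values taken from the repository.

```python
def mse(a, b):
    """Mean squared error between two equally shaped 2D lists."""
    flat_a = [x for row in a for x in row]
    flat_b = [x for row in b for x in row]
    return sum((x - y) ** 2 for x, y in zip(flat_a, flat_b)) / len(flat_a)


def gram_matrix(features):
    """Channel-by-channel correlations, normalized by channels * positions."""
    c, n = len(features), len(features[0])
    return [[sum(fi[k] * fj[k] for k in range(n)) / (c * n)
             for fj in features] for fi in features]


def total_loss(generated, content, style,
               content_weight=1.0, style_weight=1e6):
    """Weighted sum of content loss and style loss (weights are assumed)."""
    content_loss = mse(generated, content)
    style_loss = mse(gram_matrix(generated), gram_matrix(style))
    return content_weight * content_loss + style_weight * style_loss


feats = [[1.0, 2.0], [3.0, 4.0]]      # 2 channels, 2 spatial positions
shuffled = [[2.0, 1.0], [4.0, 3.0]]   # same activations, positions swapped
print(total_loss(feats, feats, feats))
print(total_loss(shuffled, feats, feats))
```

Swapping spatial positions leaves the Gram matrix unchanged, so only the content term reacts; this is why the Gram matrix captures style statistics rather than layout.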
Dense Object Reconstruction from RGBD Images with Embedded Deep Shape Representations
Most problems involving simultaneous localization and mapping can nowadays be solved using one of two fundamentally different approaches. The traditional approach is given by a least-squares objective, which minimizes many local photometric or geometric residuals over explicitly parametrized structure and camera parameters. Unmodeled effects violating the Lambertian surface assumption or geometric invariance of individual residuals are encountered through statistical averaging or the addition of robust kernels and smoothness terms. Aiming at more accurate measurement models and the inclusion of higher-order shape priors, the community more recently shifted its attention to deep end-to-end models for solving geometric localization and mapping problems. However, at test-time, these feed-forward models ignore the more traditional geometric or photometric consistency terms, thus leading to a low ability to recover fine details and potentially complete failure in corner case scenarios. With an application to dense object modeling from RGB-D images, our work aims at taking the best of both worlds by embedding modern higher-order object shape priors into classical iterative residual minimization objectives. We demonstrate a general ability to improve mapping accuracy with respect to each modality alone, and present a successful application to real data.
This paper was published in an ACCV workshop; it can be found here: https://arxiv.org/abs/1810.04891
As the name implies, this project’s goal is to draw SVG from 3D shapes. The main tasks are drawing lines and triangles, and handling super-sampling, transforms, trilinear filtering, alpha compositing, scaling, etc. You can find the original link here, and this is my implementation: https://github.com/CaoYuchen/CMU15662/tree/main/DrawSVG.
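Triangle drawing with super-sampling can be sketched as follows. This is a simplified illustration, not the assignment's actual code: the function names are hypothetical, it tests each sub-sample against the three edge functions of the triangle, and it averages the hits into a per-pixel coverage value.

```python
def edge(ax, ay, bx, by, px, py):
    """Signed area test: which side of edge a->b the point p lies on."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def inside(tri, px, py):
    """True if p is inside the triangle, for either winding order."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    e0 = edge(x0, y0, x1, y1, px, py)
    e1 = edge(x1, y1, x2, y2, px, py)
    e2 = edge(x2, y2, x0, y0, px, py)
    return (e0 >= 0 and e1 >= 0 and e2 >= 0) or \
           (e0 <= 0 and e1 <= 0 and e2 <= 0)


def rasterize(tri, width, height, rate=2):
    """Return per-pixel coverage in [0, 1] using rate*rate sub-samples."""
    image = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            hits = 0
            for sy in range(rate):
                for sx in range(rate):
                    # Sub-sample centers on a regular grid inside the pixel.
                    px = x + (sx + 0.5) / rate
                    py = y + (sy + 0.5) / rate
                    hits += inside(tri, px, py)
            image[y][x] = hits / (rate * rate)
    return image


coverage = rasterize([(0, 0), (4, 0), (0, 4)], width=4, height=4)
for row in coverage:
    print(row)
```

Pixels fully inside the triangle get coverage 1, pixels fully outside get 0, and pixels crossed by an edge get a fractional value that softens the jagged boundary.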
This assignment covers basic half-edge-based 3D mesh and modeling algorithms, BVH, ray-traced rendering, rigging, simple animation and particle effects.
Computer Vision & SLAM
Representations and Benchmarking of Modern Visual SLAM Systems
Simultaneous Localisation And Mapping (SLAM) has long been recognized as a core problem to be solved within countless emerging mobile applications that require intelligent interaction or navigation in an environment. Classical solutions to the problem primarily aim at localisation and reconstruction of a geometric 3D model of the scene. More recently, the community increasingly investigates the development of Spatial Artificial Intelligence (Spatial AI), an evolutionary paradigm pursuing a simultaneous recovery of object-level composition and semantic annotations of the recovered 3D model. Several interesting approaches have already been presented, producing object-level maps with both geometric and semantic properties rather than just accurate and robust localisation performance. As such, they require much broader ground truth information for validation purposes. We discuss the structure of the representations and optimisation problems involved in Spatial AI, and propose new synthetic datasets that, for the first time, include accurate ground truth information about the scene composition as well as individual object shapes and poses. We furthermore propose evaluation metrics for all aspects of such joint geometric-semantic representations and apply them to a new semantic SLAM framework. It is our hope that the introduction of these datasets and proper evaluation metrics will be instrumental in the evaluation of current and future Spatial AI systems and as such contribute substantially to the overall research progress on this important topic.
This paper was published in the journal Sensors; it can be found here: https://www.mdpi.com/1424-8220/20/9/2572
A basic framework of SLAM
A fundamental SLAM system with tracking, mapping and pose optimization in Matlab. It includes SIFT and Harris feature extraction, the 7/8-point algorithms, the homography method, and Levenberg-Marquardt minimization of the average error for pose optimization. The detailed implementation can be viewed here: https://github.com/CaoYuchen/SLAM-basicframe
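To give a feel for the Levenberg-Marquardt step, here is a deliberately tiny sketch in Python (the project itself is in Matlab). It assumes a toy problem that is not the project's actual pose parametrization: estimating a single 2D rotation angle that aligns two point sets. All names are hypothetical. The damping factor is lowered after a successful step and raised after a rejected one.

```python
import math


def residuals(theta, src, dst):
    """Stacked x and y errors between rotated source and destination."""
    c, s = math.cos(theta), math.sin(theta)
    res = []
    for (x, y), (u, v) in zip(src, dst):
        res.append(c * x - s * y - u)
        res.append(s * x + c * y - v)
    return res


def jacobian(theta, src):
    """Derivative of each residual with respect to theta."""
    c, s = math.cos(theta), math.sin(theta)
    jac = []
    for x, y in src:
        jac.append(-s * x - c * y)
        jac.append(c * x - s * y)
    return jac


def levenberg_marquardt(src, dst, theta=0.0, lam=1e-3, iters=50):
    cost = sum(r * r for r in residuals(theta, src, dst))
    for _ in range(iters):
        r = residuals(theta, src, dst)
        j = jacobian(theta, src)
        jtj = sum(a * a for a in j)
        jtr = sum(a * b for a, b in zip(j, r))
        step = -jtr / (jtj + lam)          # damped Gauss-Newton step
        new_cost = sum(e * e for e in residuals(theta + step, src, dst))
        if new_cost < cost:                # accept and trust the model more
            theta, cost, lam = theta + step, new_cost, lam / 10
        else:                              # reject and damp more strongly
            lam *= 10
    return theta


src = [(1.0, 0.0), (0.0, 1.0), (2.0, 1.0)]
true_theta = 0.5
dst = [(math.cos(true_theta) * x - math.sin(true_theta) * y,
        math.sin(true_theta) * x + math.cos(true_theta) * y) for x, y in src]
estimate = levenberg_marquardt(src, dst)
print(estimate)
```

A full camera pose has six degrees of freedom, so the scalar division becomes a small linear solve, but the accept/reject logic around the damping factor stays the same.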
This course mainly covers basic computer systems knowledge, from the compiler to the linker, including the stack, heap, caches, a simple network implementation, and the use of debuggers and disassemblers. The assignments are written in C. The implementation can be found here: https://github.com/CaoYuchen/CMU15513
This is a dynamic self-balancing system based on a vector field. Points start from random locations and are influenced by the field force at each pixel; they gradually settle into a dynamically stable state, as a representation of a cybernetic system. It is implemented in Processing; the code can be found here: https://github.com/CaoYuchen/Cybernetic
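The behavior described above can be sketched in a few lines of Python (the project itself uses Processing). The field below is an assumption chosen purely for illustration, not the field used in the repository: it rotates points around the origin while pulling them toward the unit circle, so randomly placed points end up circulating on a stable ring.

```python
import math
import random


def field(x, y):
    """Hypothetical field: rotation plus attraction toward the unit circle."""
    r = math.hypot(x, y) or 1e-9       # avoid division by zero at the origin
    radial = 1.0 - r                   # positive inside the circle, negative outside
    return radial * x / r - y, radial * y / r + x


def simulate(n=50, steps=2000, dt=0.01, seed=0):
    """Move n random points through the field with explicit Euler steps."""
    rng = random.Random(seed)
    points = [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(n)]
    for _ in range(steps):
        moved = []
        for x, y in points:
            fx, fy = field(x, y)
            moved.append((x + dt * fx, y + dt * fy))
        points = moved
    return points


final_points = simulate()
radii = [math.hypot(x, y) for x, y in final_points]
print(min(radii), max(radii))
```

The points never stop moving, yet their distance from the origin stays almost constant, which is one simple reading of a "dynamically stable" state.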
Digital Media Programming
Interactive White Board
This is a set of English teaching tools, the Interactive White Board (IWB), that I made for EF Education First. Its main purpose is to assist children-oriented teaching in an entertaining and interactive way. My main contributions include:
- Design styles for children-oriented teaching
- The entire front-end programming
- The data transfer part of the back-end
I also made a questionnaire website, Goal Map, for EF Education to collect information quickly from parents and help their children find the best learning curve and courses to take.
There are also some other websites that I made; you can check them here:
Dr.Dox Quest For Time
GGJ 2022 game project: https://github.com/gcwhitfield/DrDoxQuestForTime. It’s a puzzle game in which Dr.Dox is a time traveler seeking treasures, but he has to be careful not to bump into his past self, to avoid a time paradox. My major role in this project:
- Game design
CMU Advanced Game Studio project: Penumbra. This project is a split-screen two-player game. The players, who play the daughter and the father, need to cooperate and communicate with each other offline to solve puzzles, go through the forest and hide from monsters, so that the two characters finally meet in the game. My major roles in this project:
- Game Programmer
- Sound Artist
- Game Designer
- 3D Artist