/Codes are a puzzle. A game, just like any other game. - Alan Turing/
I'm an AI enthusiast who enjoys participating in data science competitions and using machine learning techniques to solve interesting problems. I'm also a developer who enjoys building applications to help my community. I'm currently an undergraduate at the National University of Singapore, pursuing a Double Degree in Mathematics and Computer Science.
I completed an internship at OptioAI, during which I worked on creating AI solutions powered by locally hosted large language models, alongside smaller neural networks and classical models.
- [(ECCV 2024 Conference Paper) Enhancing Semantic Fidelity in Text-to-Image Synthesis: Attention Regulation in Diffusion Models] Recent advancements in diffusion models have notably improved the perceptual quality of generated images in text-to-image synthesis tasks. However, diffusion models often struggle to produce images that accurately reflect the intended semantics of the associated text prompts. We examine cross-attention layers in diffusion models and observe a propensity for these layers to disproportionately focus on certain tokens during the generation process, thereby undermining semantic fidelity. To address the issue of dominant attention, we introduce attention regulation, a computation-efficient on-the-fly optimization approach at inference time to align attention maps with the input text prompt. Notably, our method requires no additional training or fine-tuning and serves as a plug-in module on a model. Hence, the generation capacity of the original model is fully preserved. We compare our approach with alternative approaches across various datasets, evaluation metrics, and diffusion models. Experiment results show that our method consistently outperforms other baselines, yielding images that more faithfully reflect the desired concepts with reduced computation overhead. Code is available here.
- [(ECCV 2024 Workshop Paper) Investigating Copyright Issues of Diffusion Models under Practical Scenarios] Diffusion models excel in many generative modeling tasks, notably in creating images from text prompts, a task referred to as text-to-image (T2I) generation. Despite the ability to generate high-quality images, these models often replicate elements from their training data, leading to increasing copyright concerns in real applications in recent years. In response to this rising concern about copyright infringement, recent studies have examined the copyright behavior of diffusion models when using direct, copyrighted prompts. Our research extends this by examining subtler forms of infringement, where even indirect prompts can trigger copyright issues. Specifically, we introduce a data generation pipeline to systematically produce data for studying copyright in diffusion models. Our pipeline enables us to investigate copyright infringement in a more practical setting, involving replicating visual features rather than entire works using seemingly irrelevant prompts for T2I generation. We generate data using our proposed pipeline to test various diffusion models, including the latest Stable Diffusion XL. Our findings reveal a widespread tendency for these models to produce copyright-infringing content, highlighting a significant challenge in this field.
- [Simple Maths Expression Solver] A web application that takes in a photo of a simple handwritten mathematical expression and evaluates the numeric answer. The neural network uses CNN layers and has an architecture inspired by VGG-16. This was created during the SCSE Computing Challenge 2021, in which my group placed among the top 5 and won the Outstanding Creative Team award.
- [Address Extraction] A neural network built on the BERT model to extract street names and points of interest from a dataset of messy Indonesian addresses.
- [Registration Platform] A web application built during the Covid-19 pandemic to help a church keep track of how many people will attend the physical Sunday worship service.
- [Centralised Database] A web application built to help a company share training resources with its employees. The web application automatically synchronises the resources uploaded to Google Drive by the admin.
- [Bus Search] A web application that displays the cheapest and shortest route between two bus stops in Singapore.
- [Productivity Bot] A Discord bot that keeps track of the number of hours spent working or studying through a clock-in and clock-out system. The admin can use the bot to generate a leaderboard ranked by hours of work done.
- Email: [email protected]
- LinkedIn: Tze Tzun Teoh