
Betterdata, a Singapore-based startup that uses programmable synthetic data to keep real data safe, announced today that it has raised $1.55 million. The seed round, which was reportedly oversubscribed, was led by Investible, with participation from Franklin Templeton, Xcel Next, Singapore University of Technology and Design, Bon Auxilium, Tenity, Plug and Play and Entrepreneur First.
Founded in 2021 by Dr. Uzair Javaid, its CEO, and Kevin Yee, its chief technologist, the startup aims to make data sharing faster and more secure as data protection regulations increase around the world. The company is currently developing research and development partnerships with two major universities in Singapore and the United States (it cannot publicly name them), and its clients include the Shanghai Pudong Development Bank.
Betterdata says it differs from traditional data-sharing methods, which rely on data anonymization that destroys data, because it uses generative artificial intelligence and privacy engineering instead.
Yee explained to TechCrunch that programmable synthetic data uses generative models to create and augment new datasets. These include deep learning models such as the generative adversarial networks (GANs) used in deepfakes, the transformers used in ChatGPT and the diffusion models used in Stable Diffusion.
These synthetic datasets have similar characteristics and structure to real-world data, but do not reveal sensitive or private information about individuals.
“The idea is to create a fictional version of a real dataset that can be safely used for a variety of purposes, including protecting confidential data, reducing bias, and improving machine learning models,” he said.
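To make the idea of a "fictional version of a real dataset" concrete, here is a minimal sketch in Python. It is not Betterdata's method, which the article describes only as built on generative models such as GANs, transformers and diffusion models. As a stand-in, this toy summarizes each column independently and samples new rows from those summaries. The column names, the sample records and the helper functions `fit_marginals` and `sample_synthetic` are all hypothetical, and a sampler this naive carries no formal privacy guarantee and ignores correlations between columns.

```python
import random
import statistics


def fit_marginals(rows, numeric_cols, categorical_cols):
    """Summarize each column on its own (a deliberately naive model)."""
    model = {"numeric": {}, "categorical": {}}
    for col in numeric_cols:
        values = [row[col] for row in rows]
        # Keep only the mean and standard deviation, not the raw values.
        model["numeric"][col] = (statistics.mean(values), statistics.stdev(values))
    for col in categorical_cols:
        values = [row[col] for row in rows]
        categories = sorted(set(values))
        weights = [values.count(category) for category in categories]
        model["categorical"][col] = (categories, weights)
    return model


def sample_synthetic(model, n, seed=0):
    """Draw n fictional rows from the fitted per-column summaries."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n):
        row = {}
        for col, (mu, sigma) in model["numeric"].items():
            row[col] = rng.gauss(mu, sigma)
        for col, (categories, weights) in model["categorical"].items():
            row[col] = rng.choices(categories, weights=weights, k=1)[0]
        synthetic.append(row)
    return synthetic


# Hypothetical "real" records, invented for this illustration.
real_rows = [
    {"age": 34, "balance": 1200.0, "segment": "retail"},
    {"age": 51, "balance": 8300.0, "segment": "business"},
    {"age": 27, "balance": 450.0, "segment": "retail"},
    {"age": 45, "balance": 5100.0, "segment": "retail"},
]

model = fit_marginals(real_rows, ["age", "balance"], ["segment"])
synthetic_rows = sample_synthetic(model, 5)
```

The synthetic rows share the schema and rough per-column statistics of the originals, which is the property the quote describes. A production system would also need to preserve relationships between columns and measure how much the output leaks about any individual.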
Programmable synthetic data helps developers in several ways. Examples include protecting sensitive data, complying with data protection regulations like GDPR and HIPAA, improving data availability between teams, creating more data to train, test and validate machine learning models, and creating more records for underrepresented groups or classes to resolve data imbalance issues.
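The last item, creating extra records for underrepresented classes, can be illustrated with a small sketch. This is an assumption-laden toy rather than anything the company has described: it tops up each minority class by interpolating between randomly chosen pairs of existing records in that class. That is loosely in the spirit of SMOTE but skips its nearest-neighbor step. The function `balance_by_interpolation`, the column names and the sample data are hypothetical.

```python
import random
from collections import Counter


def balance_by_interpolation(rows, label_col, feature_cols, seed=0):
    """Add interpolated rows until every class matches the largest class."""
    rng = random.Random(seed)
    counts = Counter(row[label_col] for row in rows)
    target = max(counts.values())
    balanced = list(rows)
    for label, count in counts.items():
        pool = [row for row in rows if row[label_col] == label]
        for _ in range(target - count):
            # Pick two existing records of this class and blend them.
            a, b = rng.choice(pool), rng.choice(pool)
            t = rng.random()
            new_row = {label_col: label}
            for col in feature_cols:
                new_row[col] = a[col] + t * (b[col] - a[col])
            balanced.append(new_row)
    return balanced


# Hypothetical imbalanced dataset: four "ok" records, two "fraud" records.
rows = [
    {"amount": 20.0, "hour": 9.0, "label": "ok"},
    {"amount": 35.0, "hour": 13.0, "label": "ok"},
    {"amount": 18.0, "hour": 17.0, "label": "ok"},
    {"amount": 42.0, "hour": 11.0, "label": "ok"},
    {"amount": 900.0, "hour": 3.0, "label": "fraud"},
    {"amount": 1200.0, "hour": 2.0, "label": "fraud"},
]

balanced_rows = balance_by_interpolation(rows, "label", ["amount", "hour"])
balanced_counts = Counter(row["label"] for row in balanced_rows)
```

After the call, both classes have the same number of records, so a model trained on `balanced_rows` no longer sees the minority class as rare. Generative approaches like those named in the article aim for the same outcome with more realistic records.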
Betterdata will use the funding for product launches and for enhancements to its programmable synthetic data technology stack, including support for single-table, multi-table and time-series datasets. These are different variants of tabular datasets, Yee explained, with the main difference being their structure and the problems they were created to solve.
For example, single-table datasets focus on independent tables, while multi-table datasets are designed to consider relationships between multiple tables, and time-series datasets deal with data collected over time.
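The three shapes can be sketched with plain Python structures. This is only an illustration of the distinction drawn above, using hypothetical table names, columns and helper functions (`orphaned_rows`, `is_time_ordered`). It shows the extra constraint each variant adds, and which a synthetic generator would presumably need to preserve.

```python
from datetime import date

# Single table: each row stands alone.
customers = [
    {"customer_id": 1, "age": 34, "segment": "retail"},
    {"customer_id": 2, "age": 51, "segment": "business"},
]

# Multi-table: transactions point back to customers through customer_id.
transactions = [
    {"txn_id": 10, "customer_id": 1, "amount": 25.0},
    {"txn_id": 11, "customer_id": 1, "amount": 40.0},
    {"txn_id": 12, "customer_id": 2, "amount": 310.0},
]

# Time series: rows for one customer are meaningful only in time order.
balances = [
    {"customer_id": 1, "day": date(2023, 1, 1), "balance": 1200.0},
    {"customer_id": 1, "day": date(2023, 1, 2), "balance": 1175.0},
    {"customer_id": 1, "day": date(2023, 1, 3), "balance": 1135.0},
]


def orphaned_rows(child_rows, parent_rows, key):
    """Return child rows whose key has no matching parent row."""
    parent_keys = {row[key] for row in parent_rows}
    return [row for row in child_rows if row[key] not in parent_keys]


def is_time_ordered(rows, time_col):
    """Check that timestamps never go backwards."""
    times = [row[time_col] for row in rows]
    return all(earlier <= later for earlier, later in zip(times, times[1:]))


no_orphans = orphaned_rows(transactions, customers, "customer_id") == []
ordered = is_time_ordered(balances, "day")
```

Synthetic single-table data only has to look plausible row by row. Synthetic multi-table data must also keep references like `customer_id` valid, and synthetic time-series data must keep a sensible ordering and progression over time.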
Betterdata also plans to hire more people, including sales and marketing staff, and to expand its operations to more Asia-Pacific markets outside of Singapore in the next one to two years.
In a statement on Investible’s investment, Principal Khairu Rejal said: “Betterdata solves one of the biggest problems facing the AI industry today: the lack of high-quality data that also meets privacy requirements. Generating synthetic data that mimics real-world data without compromising quality and privacy helps organizations meet global compliance and privacy laws at scale.”