SyntheX AI: Singapore Startup Revolutionizing AI Training with Synthetic Data

Date: March 18, 2026

Artificial intelligence models are only as good as the data they're trained on. But what happens when the data is scarce, sensitive, or biased? This is the challenge that SyntheX AI, a Singapore-based startup, is tackling head-on—and they're doing it with something called synthetic data.

The Data Dilemma

Every AI model worth its salt requires massive amounts of training data. But for many enterprises—especially those in healthcare, finance, and government—real data is hard to come by. Patient records contain sensitive personal information. Financial transactions are tightly regulated. Government datasets often involve national security concerns.

Beyond scarcity, there's the bias problem. AI systems trained on incomplete or skewed datasets can perpetuate and amplify existing inequalities. A healthcare AI trained primarily on data from one demographic might perform poorly for patients from different backgrounds. A loan approval system trained on historical data might unfairly disadvantage certain groups.

"We saw enterprises struggling with a fundamental problem," explains Dr. Wei Lin, SyntheX AI's co-founder and CEO. "They have the AI expertise, they have the computing resources, but they don't have the data they need—or the data they have is problematic. Synthetic data gives them a way to build robust, fair AI systems without these constraints."

How Synthetic Data Works

Synthetic data is artificially generated data that mimics the statistical properties of real-world data—without containing any actual personal or sensitive information. Think of it as a digital twin of a dataset: it looks, feels, and behaves like real data, but no actual individuals are represented.

SyntheX AI's platform uses advanced generative AI models to create synthetic datasets tailored to each client's specific needs. Users define the characteristics they need—the distribution of ages, the range of income levels, the prevalence of certain medical conditions—and the system generates data that matches these parameters while ensuring full privacy compliance.
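To make the idea of parameter-driven generation concrete, here is a minimal Python sketch. It is not SyntheX AI's method, which the company has not published in detail. It simply draws fabricated records from distributions a user specifies. The function name, field names, and parameter values are all hypothetical, and a production system would use a trained generative model rather than independent random draws.

```python
import random

def generate_records(n, age_range, income_mean, income_sd, condition_rate, seed=0):
    """Draw n fabricated records from user-specified distributions.

    Hypothetical illustration only: every field is sampled independently,
    so correlations found in real data are not reproduced.
    """
    rng = random.Random(seed)  # seeded so the output is reproducible
    records = []
    for _ in range(n):
        records.append({
            # Uniform integer age within the requested range (inclusive).
            "age": rng.randint(age_range[0], age_range[1]),
            # Normally distributed income, clipped at zero.
            "income": max(0.0, rng.gauss(income_mean, income_sd)),
            # Condition flag set with the requested prevalence.
            "has_condition": rng.random() < condition_rate,
        })
    return records

# Example request: adults aged 18 to 90, with 8% prevalence of a condition.
records = generate_records(10_000, (18, 90), 60_000, 15_000, 0.08)
prevalence = sum(r["has_condition"] for r in records) / len(records)
print(f"observed prevalence: {prevalence:.3f}")
```

The observed prevalence lands close to the requested 8%, which is the sense in which generated data "matches" the parameters a user defines.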

The key differentiator is SyntheX AI's proprietary implementation of differential privacy, a mathematical framework that bounds how much any single original record can influence the generated data. That bound limits what someone trying to reverse-engineer the synthetic dataset could learn about any individual record. This is particularly crucial for industries with strict regulatory requirements like healthcare (HIPAA) and financial services (MAS guidelines).
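For readers unfamiliar with differential privacy, the textbook Laplace mechanism gives a feel for how it works. The sketch below is a generic illustration, not SyntheX AI's implementation. It answers a counting query with calibrated random noise, so that adding or removing one person's record changes the distribution of answers only by a factor controlled by the privacy parameter epsilon. The helper names and the sample ages are made up for this example.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(values, predicate, epsilon, rng):
    """Return a noisy count of values matching predicate.

    A counting query has sensitivity 1: one record changes the true
    count by at most 1, so the noise scale is 1 / epsilon.
    """
    true_count = sum(1 for v in values if predicate(v))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)

rng = random.Random(42)
ages = [23, 35, 41, 58, 62, 67, 70, 29, 51, 77]  # fabricated sample data

# How many people are 60 or older? The true answer is 4; the released
# answer is 4 plus noise. A smaller epsilon means more noise and more privacy.
noisy = private_count(ages, lambda a: a >= 60, epsilon=1.0, rng=rng)
print(f"noisy count: {noisy:.2f}")
```

Generating whole synthetic datasets under differential privacy is considerably more involved, but the same principle applies: noise is calibrated to the maximum effect a single record can have.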

Real-World Impact

SyntheX AI's technology is already making waves across multiple sectors. In healthcare, the company has partnered with several Singapore hospitals to generate synthetic patient data for training diagnostic AI systems. These synthetic datasets allow researchers to build and test AI models without compromising patient privacy—a game-changer for medical AI development.

One notable collaboration is with the National University Health System (NUHS), where SyntheX AI helped generate synthetic medical imaging data for training AI systems to detect early-stage cancers. By augmenting limited real datasets with carefully generated synthetic examples, the AI's detection accuracy improved by 34%.

"The beauty of synthetic data is that we can create edge cases that are rare in real life but critical for AI performance," notes Dr. Sarah Chen, Head of AI Research at NUHS. "We can generate thousands of examples of rare tumor types that our doctors rarely see in practice. Our AI learns to recognize these patterns without putting any patients at risk."

In finance, SyntheX AI has worked with Singapore's DBS Bank to generate synthetic transaction data for building fraud detection systems. By training on synthetic data that includes realistic but completely fabricated fraud scenarios, DBS's AI has become significantly better at identifying novel fraud techniques before they cause real damage.

Addressing AI Bias

Beyond privacy and scarcity, synthetic data offers a powerful solution to AI bias. When real-world data reflects historical inequities, AI trained on that data will perpetuate those inequities. But synthetic data can be deliberately designed to be fair and representative.

SyntheX AI has developed a "fairness-first" generation framework that allows clients to explicitly specify diversity requirements. Want to ensure your hiring AI doesn't discriminate based on age, gender, or ethnicity? SyntheX AI can generate training data with balanced representation across all of these dimensions.

"This isn't just about making AI 'feel' fair," explains Dr. Lin. "We can mathematically prove that the synthetic data meets specified fairness criteria. Our clients can then train AI systems with the confidence that they're building on a truly equitable foundation."
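The article does not say which fairness criteria the company checks. As one simple possibility, the Python sketch below tests whether each group in a dataset holds roughly an equal share of the records. The function names, the 5% tolerance, and the sample data are assumptions for illustration. Balanced representation in training data is only one criterion and does not by itself guarantee that a model trained on the data behaves fairly.

```python
from collections import Counter

def group_shares(records, attribute):
    """Return each group's share of the records for the given attribute."""
    counts = Counter(r[attribute] for r in records)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}

def is_balanced(records, attribute, tolerance=0.05):
    """True if every observed group's share is within tolerance of an equal split."""
    shares = group_shares(records, attribute)
    if not shares:
        return False  # an empty dataset cannot be called balanced
    target = 1.0 / len(shares)
    return all(abs(share - target) <= tolerance for share in shares.values())

# Fabricated examples: an even split and a 90/10 split.
balanced = [{"gender": g} for g in ["F", "M"] * 50]
skewed = [{"gender": "M"}] * 90 + [{"gender": "F"}] * 10

print(is_balanced(balanced, "gender"))
print(is_balanced(skewed, "gender"))
```

A check like this only sees groups that actually appear in the data, so a group that is missing entirely would need to be listed explicitly in a real audit.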

Looking Ahead

SyntheX AI recently secured S$18 million in Series B funding, led by Lightspeed Venture Partners with participation from existing investors including Sequoia Capital Southeast Asia. The funding will be used to grow the company's research team, develop industry-specific synthetic data solutions, and enter markets across Southeast and Northeast Asia.

The company is also investing heavily in what it calls "multimodal synthetic data"—the ability to generate synthetic data across different formats simultaneously. This includes combining synthetic text, images, and tabular data to train more sophisticated AI systems that can understand and process multiple types of information.

As AI continues to advance, the demand for high-quality training data will only grow. With increasing regulatory focus on AI governance—including Singapore's own Model AI Governance Framework—companies need solutions that balance innovation with responsibility. SyntheX AI believes synthetic data is the key to unlocking both.

"We're not trying to replace real data," says Dr. Lin. "We're providing a complement that enables enterprises to do more—of better quality AI—while staying fully compliant and ethical. The future of AI training is hybrid: real data for validation, synthetic data for coverage."

As Singapore positions itself as a global AI hub, startups like SyntheX AI demonstrate that the city-state is thinking beyond just building AI models—it's solving the fundamental infrastructure challenges that will determine how fast and how fairly AI can advance.


This article is part of our ongoing coverage of Singapore's AI startup ecosystem. For more AI news from the Lion City, visit AI Dominance SG.
