USBE outlines synthetic‑data pilot to let researchers test code without exposing student records

April 27, 2026 | Utah State Board of Education, Utah Government Divisions, Utah Legislative Branch, Utah


This article was created by AI summarizing key points discussed. AI makes mistakes, so for full details and context, please refer to the video of the full meeting.

A USBE presenter described an internal synthetic‑data project that produces artificial datasets resembling real local education agency (LEA) submissions, so researchers and vendors can develop and test code without access to identifiable student records.

The presenter said the workflow trains machine‑learning models on real data to learn its structure, then generates synthetic datasets that mimic the real distributions without containing any real student entries. “No real student data goes into the system,” the presenter said, explaining that the process includes model selection and checks for disclosure risk. The presentation cited differential‑privacy techniques (described in the meeting as a mathematical approach to privacy) and noted that USBE is working with the University of Utah to review the code.
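The summary does not say which models or privacy mechanism USBE uses. As a rough illustration of the general idea only, the sketch below shows one textbook differential‑privacy approach: add Laplace noise to category counts, then sample synthetic records from the noisy counts. This is a swapped‑in technique, not USBE's method. The function names, the toy categories, and the epsilon value are all hypothetical, and the "real" records here are randomly generated stand‑ins.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_histogram(records, categories, epsilon, rng):
    """Return noisy counts per category.

    Assumes add/remove-one-record neighbors, so each record changes
    exactly one count by 1 (L1 sensitivity 1) and the noise scale is
    1 / epsilon.
    """
    counts = Counter(records)
    scale = 1.0 / epsilon
    noisy = {c: counts.get(c, 0) + laplace_noise(scale, rng) for c in categories}
    # Clamping negatives is post-processing and does not weaken the guarantee.
    return {c: max(0.0, v) for c, v in noisy.items()}


def sample_synthetic(noisy_counts, n, rng):
    """Draw n synthetic records in proportion to the noisy counts."""
    categories = list(noisy_counts)
    weights = [noisy_counts[c] for c in categories]
    if sum(weights) <= 0:
        # Fall back to uniform sampling if every count was clamped to zero.
        weights = [1.0] * len(categories)
    return rng.choices(categories, weights=weights, k=n)


# Toy stand-in data (hypothetical): (grade, program flag) pairs.
rng = random.Random(0)
categories = [(g, p) for g in ("3", "4", "5") for p in ("A", "B")]
toy_real = [rng.choice(categories) for _ in range(500)]

noisy = dp_histogram(toy_real, categories, epsilon=1.0, rng=rng)
synthetic = sample_synthetic(noisy, n=500, rng=rng)
```

Only the noisy counts feed the sampler, so the synthetic records depend on the real ones solely through the privacy‑protected histogram. A production system would need far richer models and a formal disclosure‑risk review of the kind the presenter mentioned.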

Why USBE is pursuing synthetic data: presenters said it enables faster turnaround on research and developer testing (for example, testing a student information system (SIS) or a machine‑learning model) without repeated board approvals or data‑sharing agreements. External parties could run their code against the synthetic datasets and hand back only code or outputs rather than raw data.

Limits and caution: presenters described the effort as still early‑stage. They said the synthetic data are intended for development and algorithm testing, not for producing authoritative analytic estimates about the population, and emphasized that documentation and validation steps will accompany any release.

Next steps: USBE staff said the synthetic dataset will be reviewed by partner universities, and availability will be determined as policies and validation progress. Attendees asked whether the product would be used to train third‑party large models; presenters said it could be, but that future usage and any commercial arrangements are still being evaluated.
