Jingchen (Monika) Hu receives NSF-National Center for Science and Engineering Statistics (NCSES) consortium funding

Jingchen (Monika) Hu, Associate Professor of Mathematics and Statistics, received funding through the NSF’s National Center for Science and Engineering Statistics (NCSES) to work on the creation of synthetic data for the Survey of Earned Doctorates and on the use of verification metrics. Hu and collaborators from the Urban Institute, RTI International, and the RAND Corporation will produce a synthetic data file for public use that supports a tiered access model, explore the use of synthetic data for evidence-building, and test the use of verification metrics in validating estimates produced from synthetic data. A synthetic dataset does not contain the exact records of the original dataset; instead, it retains the statistical properties of the original, preserving the information researchers need for their queries.
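
To make the idea of a synthetic dataset concrete, here is a minimal Python sketch. It is an illustration only, not the method the project will use: the toy records, the field names ("bio", "math"), and the simple normal model are all assumptions made up for this example, and real synthesizers for survey microdata are considerably more sophisticated. The sketch fits a small statistical model to a made-up "confidential" file and then samples new records from that model, so the released records are not copies of the originals but show similar shares and averages.

```python
import random
import statistics
from collections import defaultdict


def fit_model(records):
    """Estimate each field's share, plus the mean and standard deviation of years."""
    by_field = defaultdict(list)
    for field, years in records:
        by_field[field].append(years)
    n = len(records)
    return {
        field: (len(vals) / n, statistics.mean(vals), statistics.stdev(vals))
        for field, vals in by_field.items()
    }


def synthesize(model, n, rng):
    """Draw n new records from the fitted model instead of copying originals."""
    fields = list(model)
    weights = [model[f][0] for f in fields]
    out = []
    for _ in range(n):
        field = rng.choices(fields, weights=weights)[0]
        _, mu, sigma = model[field]
        out.append((field, rng.gauss(mu, sigma)))
    return out


rng = random.Random(0)

# Hypothetical "confidential" records: (field of study, years to degree).
# These values are invented for illustration and are not survey data.
original = [("bio", rng.gauss(6.0, 1.0)) for _ in range(600)]
original += [("math", rng.gauss(5.0, 0.8)) for _ in range(400)]

model = fit_model(original)
synthetic = synthesize(model, 1000, rng)

for name, data in (("original", original), ("synthetic", synthetic)):
    share_bio = sum(1 for f, _ in data if f == "bio") / len(data)
    mean_years = statistics.mean(y for _, y in data)
    print(f"{name}: share bio = {share_bio:.2f}, mean years = {mean_years:.2f}")
```

Running the sketch prints the share of "bio" records and the average years to degree for both files. The two lines should be close to each other even though no synthetic record is drawn directly from the original file.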

Monika will provide methodological advice and quality review for the development of a synthetic version of the Survey of Earned Doctorates, which will use open-source code, such as R packages, to produce a public-use microdata file. She will also help develop accompanying educational and communication materials, informed by stakeholder needs and expectations gathered through extensive user engagement.

Monika will lead the identification and development of disclosure risk metrics that appropriately measure the risk of releasing the synthetic data. She will also support the Urban Institute and other collaborators in producing the final report to NSF, in planning the dissemination of the synthetic data and the verification metrics, and in shaping the public messaging about this new public-use data product.

This project is part of a “coordinated research and development program that will further NCSES’s role as a central Federal clearinghouse for the collection, interpretation, analysis, and dissemination of objective data on science, engineering, technology, and research and development.” It will also contribute to NCSES efforts to “develop new ways of acquiring, cleaning, and standardizing data; combining multiple data sets; and linking data from various government and private sources to yield valuable insights into critical issues” as well as “documenting and providing lessons learned for similar projects across the government.”

Posted October 16, 2023
Grants in Action - Celebrating
Sciences