
How to extract dual‑vintage PUMA data from the 2022 ACS 5‑year PUMS using MDAT

March 30, 2026 | U.S. Census Bureau, Department of Commerce (DOC), Executive, Federal


This article was created by AI summarizing key points discussed. AI makes mistakes, so for full details and context, please refer to the video of the full meeting.

Tyson, the presenter, demonstrated how to use the Microdata Access Tool (MDAT) to get detailed PUMA‑level estimates from the 2022 American Community Survey (ACS) 5‑year Public Use Microdata Sample (PUMS) by creating two separate tables, one using 2010 PUMA boundaries and one using 2020 PUMA boundaries, and then combining their totals.

The tutorial is aimed at users who need more detailed categories (for example, the language spoken at home) than the prebuilt tables on data.census.gov provide. Tyson said, “Because the 2022 ACS 5‑year PUMS has dual vintages, we’ll actually have to create 2 separate tables to account for this,” and then walked through the exact MDAT steps.

He began by opening MDAT at data.census.gov/app/mdat, selecting the ACS 5‑year PUMS dataset and switching the vintage to 2022. Then Tyson added the detailed household language variable (HHLANP) and the PUMA variable as working variables. For the first table he selected the PUMA10 (2010) variable, warned that MDAT limits the number of PUMAs per table, and filtered the geography by state (New York and Pennsylvania) to disambiguate repeated PUMA codes across states.
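The reason for the state filter can be sketched in a few lines of Python. This is a minimal illustration, not part of the MDAT workflow: the records, weights, and the `weighted_totals` helper are hypothetical, and the state codes are the FIPS codes for New York (36) and Pennsylvania (42). It shows how totals keyed on the PUMA code alone merge areas from different states, while a (state, PUMA) key keeps them apart.

```python
from collections import Counter

# Hypothetical records: (state FIPS code, PUMA10 code, weight).
# The values are made up for illustration; they are not ACS data.
records = [
    ("42", "01600", 20),  # Pennsylvania
    ("36", "01600", 15),  # same PUMA code, assumed here to recur in New York
    ("42", "01600", 5),
    ("36", "04110", 30),
]


def weighted_totals(rows, key):
    """Sum the weight column of each row, grouped by key(row)."""
    totals = Counter()
    for row in rows:
        totals[key(row)] += row[2]
    return dict(totals)


# Grouping on the PUMA code alone mixes the two states together.
by_puma_only = weighted_totals(records, key=lambda r: r[1])

# Grouping on (state, PUMA) keeps each state's PUMA separate.
by_state_and_puma = weighted_totals(records, key=lambda r: (r[0], r[1]))

print(by_puma_only)
print(by_state_and_puma)
```

In MDAT the same separation comes from placing the selected states in the table alongside the recoded PUMA variable.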

Using the cart, he created a custom recode for the PUMA10 variable, searched for and selected codes 01600 (Butler County, Pennsylvania) and 04110 (New York City Queens Community District 5 under 2010 boundaries), and gave each recode group a readable label (for example, “Puma 01600 Butler County, PA”). He then built the table layout by moving geographies to columns and placing the recoded PUMA variable ahead of the language variable in the rows, so the table shows the two 2010 PUMAs with detailed language categories.

Tyson showed how to hide the total selected geographies column (so the table displays only the individual PUMAs) and noted that some columns will show zeros when a PUMA code exists in a state the user did not select. He pointed out that the first table covers only part of the five‑year period (2018–2021).

To capture the 2022 data in the same ACS 5‑year PUMS, Tyson instructed viewers to repeat the process using the PUMA20 (2020) variable: return to the MDAT landing page, reselect the ACS 5‑year PUMS with vintage 2022, add the same HHLANP variable, choose PUMA20, recode the PUMA20 values (selecting 01600 for Butler County and 04405 for New York City Queens Community District 5 under 2020 boundaries), and rename the groups accordingly.
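One way to keep the two tables consistent is to write the selections down once before building either table. The sketch below is only a bookkeeping aid: the class and function names are hypothetical, while the PUMA codes and the year split (2018–2021 under PUMA10, 2022 under PUMA20) are the ones described in the walkthrough.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PumaSelection:
    """One area of interest and its code under each PUMA vintage."""
    label: str
    state: str
    puma10: str  # code under 2010 boundaries
    puma20: str  # code under 2020 boundaries


# Codes as given in the walkthrough.
SELECTIONS = [
    PumaSelection("Butler County, PA", "Pennsylvania", "01600", "01600"),
    PumaSelection("NYC Queens Community District 5", "New York", "04110", "04405"),
]


def puma_variable_for_year(year):
    """Return the PUMA variable covering a data year in the 2022 5-year file."""
    if 2018 <= year <= 2021:
        return "PUMA10"
    if year == 2022:
        return "PUMA20"
    raise ValueError(f"{year} is outside the 2018-2022 period")


def codes_for(variable):
    """Map each area label to the code to select for the given variable."""
    attr = {"PUMA10": "puma10", "PUMA20": "puma20"}[variable]
    return {s.label: getattr(s, attr) for s in SELECTIONS}


def changed_between_vintages():
    """List the areas whose code differs between the two vintages."""
    return [s.label for s in SELECTIONS if s.puma10 != s.puma20]


print(codes_for("PUMA10"))
print(codes_for("PUMA20"))
print(changed_between_vintages())
```

Listing the areas whose code changed is a reminder to confirm the new code against a map before recoding, since an unchanged code does not by itself guarantee unchanged boundaries.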

After building the second table with the 2020 boundaries and arranging the variables the same way, Tyson advised combining the two tables by adding the totals for each detailed language category by PUMA to obtain the full 2022 ACS 5‑year PUMS estimate. He closed by directing viewers to the resources link for additional guidance and mapping tools (data.census.gov mapping and TIGERweb) to confirm GEOID or boundary changes.
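The final addition step can be sketched as follows, assuming each MDAT table has been copied into a dictionary keyed by (area label, language category). The `combine_tables` helper is hypothetical and every number is a made-up placeholder rather than an ACS estimate. Keying on the readable label instead of the PUMA code matters here because an area's code can differ between vintages.

```python
from collections import defaultdict

# Placeholder values standing in for the 2010-boundary table (2018-2021 data).
table_2010 = {
    ("Butler County, PA", "Spanish"): 120,
    ("Butler County, PA", "Polish"): 40,
    ("NYC Queens CD5", "Spanish"): 900,
}

# Placeholder values standing in for the 2020-boundary table (2022 data).
table_2020 = {
    ("Butler County, PA", "Spanish"): 30,
    ("NYC Queens CD5", "Spanish"): 250,
    ("NYC Queens CD5", "Polish"): 60,
}


def combine_tables(first, second):
    """Add matching cells; a cell missing from one table counts as zero."""
    combined = defaultdict(int)
    for table in (first, second):
        for key, estimate in table.items():
            combined[key] += estimate
    return dict(combined)


combined = combine_tables(table_2010, table_2020)
for (area, language), estimate in sorted(combined.items()):
    print(f"{area:20} {language:10} {estimate}")
```

A category that appears in only one table is carried through unchanged, which matches adding the two tables by hand when one of them has no row for that category.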

Practical takeaways: create one MDAT table for the 2010 PUMA boundaries and one for the 2020 boundaries, limit PUMAs per table as required, filter by state to avoid cross‑state code collisions, give recoded groups clear labels, hide the overall totals column for clarity, and sum the matching cells across the two tables to generate complete 2022 estimates.
