As part of what I've been discussing recently in the community, I attempted to build a simple Python application that queries the USASpending API for every entry containing the keyword "Cuba." I'm particularly interested in grants and contracts funded by the Department of State and the now-defunct USAID. For the State Department awards, I still need to figure out how to filter out those tied to services requested for the United States Embassy in Havana, which fall outside the scope of my study.
Working with the USASpending API brings up problems familiar to any programmer. For instance, the award_type_codes filter doesn't allow mixing categories. If I query contracts (which are closely tied to services procured from vendors) together with grants, the API returns a 500 error. The documentation isn't very explicit on this point, but with Claude's critical assistance, I arrived at a version that fires 5 parallel requests, one per award type group:
const GROUPS = {
  contracts: ["A", "B", "C", "D"],
  idvs: ["IDV_A", "IDV_B", "IDV_B_A", "IDV_B_B", "IDV_B_C", "IDV_C", "IDV_D", "IDV_E"],
  grants: ["02", "03", "04", "05"],
  direct_payments: ["06", "10"],
  other: ["09", "11"]
};
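To make the fan-out concrete, here is a minimal sketch of one request per group followed by a client-side merge. It is not the exact code from my tool. The helper names (buildSearchBody, mergeAndSort, searchAllGroups), the field list, and the choice of "Award Amount" as the sort key are assumptions for illustration. The /api/v2/search/spending_by_award/ path and the shape of its response should be checked against the API documentation before relying on them.

```javascript
// Award type groups from the post; each group must be queried separately.
const GROUPS = {
  contracts: ["A", "B", "C", "D"],
  idvs: ["IDV_A", "IDV_B", "IDV_B_A", "IDV_B_B", "IDV_B_C", "IDV_C", "IDV_D", "IDV_E"],
  grants: ["02", "03", "04", "05"],
  direct_payments: ["06", "10"],
  other: ["09", "11"],
};

// Build the POST body for a single group (field names are assumptions).
function buildSearchBody(codes, keyword, page = 1) {
  return {
    filters: { keywords: [keyword], award_type_codes: codes },
    fields: ["Award ID", "Recipient Name", "Award Amount", "Awarding Agency"],
    page,
    limit: 100,
    sort: "Award Amount",
    order: "desc",
  };
}

// Merge the per-group arrays, tag each row with its group, and re-sort by
// amount, since every response arrives sorted only within its own group.
function mergeAndSort(resultsByGroup) {
  const merged = [];
  for (const [group, rows] of Object.entries(resultsByGroup)) {
    for (const row of rows) merged.push({ ...row, group });
  }
  return merged.sort(
    (a, b) => (b["Award Amount"] ?? 0) - (a["Award Amount"] ?? 0)
  );
}

// Fire one request per group in parallel. fetchImpl is injectable so the
// function can be exercised without touching the network.
async function searchAllGroups(keyword, fetchImpl = fetch, baseUrl = "") {
  const entries = await Promise.all(
    Object.entries(GROUPS).map(async ([group, codes]) => {
      const res = await fetchImpl(`${baseUrl}/api/v2/search/spending_by_award/`, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(buildSearchBody(codes, keyword)),
      });
      if (!res.ok) throw new Error(`${group}: HTTP ${res.status}`);
      const data = await res.json();
      return [group, data.results ?? []];
    })
  );
  return mergeAndSort(Object.fromEntries(entries));
}
```

Calling searchAllGroups("Cuba") from a page served by the local proxy would return a single array in which each row carries a group property, which makes it easy to filter the merged list by category later.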
The results are then merged on the client side and re-sorted. There's also the pagination problem. Since each page is limited to 100 results, additional pages have to be fetched (here in parallel, up to a cap) until all results matching the query have been retrieved:
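// Cap the page count so a very broad query cannot trigger unbounded requests.
const pagesNeeded = Math.min(Math.ceil(total / 100), MAXPAGES);
if (pagesNeeded > 1) {
  // Page 1 is already loaded; request pages 2..pagesNeeded concurrently.
  // `more` holds one entry per extra page, to be appended to the first page's results.
  const more = await Promise.all(
    Array.from({ length: pagesNeeded - 1 }, (_, k) => fetchPage(k + 2))
  );
}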
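When the total number of matches isn't known up front, a strictly sequential variant is one alternative. The sketch below assumes that each response includes a page_metadata.hasNext flag, which is my reading of the response format and worth verifying. The fetchPage argument is a hypothetical function that resolves to the parsed JSON for a given page.

```javascript
// fetchPage(n) is a hypothetical function resolving to an object shaped like
// { results: [...], page_metadata: { hasNext: boolean } } for page n (1-based).
async function fetchSequentially(fetchPage, maxPages = 50) {
  const rows = [];
  for (let page = 1; page <= maxPages; page++) {
    const data = await fetchPage(page);
    rows.push(...(data.results ?? []));
    // Stop as soon as the API reports that no further page exists.
    if (!data.page_metadata?.hasNext) break;
  }
  return rows;
}
```

This is slower than the parallel version because each request waits for the previous one, but it never asks for a page that doesn't exist, and maxPages still bounds the total number of calls.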
Another tricky point was the modification history, that is, capturing every movement in the Treasury Department's accounting system that increases or decreases the obligations tied to each award. In the end, it turned out there's an endpoint behind the site's "award detail" section that makes this query remarkably efficient:
const body = {
  award_id: awardInternalId,
  page: 1,
  limit: 100,
  sort: "action_date",
  order: "asc"
};
const r = await fetch('/api/v2/transactions/', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(body)
});
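Once the transactions are in hand, the modification history can be turned into a running total of obligations. The sketch below assumes each transaction row exposes action_date, modification_number, and federal_action_obligation. Those field names reflect my understanding of the response and should be confirmed against real output. The function name runningObligations is made up for this example.

```javascript
// Turn a list of transactions into a chronological ledger with a cumulative
// obligation figure. Positive amounts add to the obligation; negative ones
// (de-obligations) reduce it.
function runningObligations(transactions) {
  let cumulative = 0;
  return [...transactions]
    // ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    .sort((a, b) =>
      a.action_date < b.action_date ? -1 : a.action_date > b.action_date ? 1 : 0
    )
    .map((t) => {
      const delta = Number(t.federal_action_obligation) || 0;
      cumulative += delta;
      return {
        date: t.action_date,
        modification: t.modification_number,
        delta,
        cumulative,
      };
    });
}
```

Passing the results array of the response to runningObligations yields one row per modification, which is enough to chart how an award grew or shrank over time.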
Since I run the tool from my own machine, the idea was to build a Python HTTP server that serves the HTML and doubles as a proxy, without the complications I ran into when I first tried to build a plain HTML page. On the technical side, one of the clearest takeaways is that data being public through an API doesn't mean it's easy to query. This prototype, once properly adjusted and validated, will become the data source for a tool that is also in development. I'll keep you updated.