Getting started
This page walks through the conventions shared by every function in
tesouropy: the bilingual interface, the polars output, caching, retries,
logging, and the fault-tolerance contract.
Install and import
Bilingual interface (PT/EN)
Almost every function has two names: a Portuguese name with Portuguese parameters, and an English alias mapping English parameter names onto the Portuguese ones. They return identical data — pick whichever you prefer.
import tesouropy as tn  # the examples on this page use the alias `tn`

# Portuguese
rreo = tn.get_rreo(
    an_exercicio=2022, nr_periodo=6,
    co_tipo_demonstrativo="RREO", no_anexo="RREO-Anexo 01",
    co_esfera="E", id_ente=17,
)

# English (same call, same result)
rreo = tn.get_budget_report(
    fiscal_year=2022, period=6,
    report_type="RREO", appendix="RREO-Anexo 01",
    sphere="E", entity_id=17,
)
Always call with keyword arguments
Function signatures put required parameters first (Python forbids a required parameter after one with a default), so the positional order may differ slightly from the R package. Calling with keywords keeps your code unambiguous and stable.
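As a rough illustration of how an English alias can forward to its Portuguese counterpart, the sketch below renames keyword arguments using the parameter pairs shown in the example above. The helper `translate_kwargs` and the two `*_stub` functions are hypothetical stand-ins, not tesouropy's actual implementation.

```python
from typing import Any, Dict, Mapping

# English -> Portuguese parameter names, taken from the example calls above.
PT_BY_EN: Dict[str, str] = {
    "fiscal_year": "an_exercicio",
    "period": "nr_periodo",
    "report_type": "co_tipo_demonstrativo",
    "appendix": "no_anexo",
    "sphere": "co_esfera",
    "entity_id": "id_ente",
}


def translate_kwargs(kwargs: Mapping[str, Any],
                     mapping: Mapping[str, str] = PT_BY_EN) -> Dict[str, Any]:
    """Rename English keyword arguments to their Portuguese equivalents."""
    unknown = set(kwargs) - set(mapping)
    if unknown:
        raise TypeError(f"unexpected keyword argument(s): {sorted(unknown)}")
    return {mapping[name]: value for name, value in kwargs.items()}


def get_rreo_stub(**kwargs: Any) -> Dict[str, Any]:
    """Hypothetical stand-in for the Portuguese function; echoes its arguments."""
    return kwargs


def get_budget_report_stub(**kwargs: Any) -> Dict[str, Any]:
    """Hypothetical English alias: translate the names, then delegate."""
    return get_rreo_stub(**translate_kwargs(kwargs))
```

Because the translation in this sketch works on argument names, it only makes sense for keyword calls, which is one more reason to avoid positional arguments.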
polars output
Every fetch returns a polars DataFrame with tidy,
snake_case, accent-folded column names. Use the full polars API to wrangle it:
import polars as pl

entes = tn.get_entes()

# Municipalities of Pernambuco, biggest first
(
    entes
    .filter((pl.col("uf") == "PE") & (pl.col("esfera") == "M"))
    .sort("populacao", descending=True)
    .select("cod_ibge", "ente", "populacao")
    .head()
)
Caching
Every HTTP request is cached in memory by default (use_cache=True), keyed
by URL + query parameters. Repeated calls in the same session are instant. Clear
the cache with:
Pass use_cache=False to force a fresh request.
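To make the "keyed by URL + query parameters" idea concrete, here is a minimal sketch of an in-memory cache of that shape. The names `cache_key`, `cached_get` and `_CACHE` are hypothetical, the example URL is a placeholder, and the choice to store the result of a forced request is an assumption rather than documented tesouropy behavior.

```python
from typing import Any, Callable, Dict, Mapping, Optional, Tuple

# Hypothetical module-level store: one entry per (URL, query parameters) pair.
_CACHE: Dict[Tuple[str, Tuple[Tuple[str, Any], ...]], Any] = {}


def cache_key(url: str, params: Optional[Mapping[str, Any]]):
    """Build a hashable key; sorting makes the key independent of parameter order."""
    return (url, tuple(sorted((params or {}).items())))


def cached_get(url: str,
               params: Optional[Mapping[str, Any]],
               fetch: Callable[[str, Optional[Mapping[str, Any]]], Any],
               use_cache: bool = True) -> Any:
    """Return the cached response if allowed, otherwise call `fetch`."""
    key = cache_key(url, params)
    if use_cache and key in _CACHE:
        return _CACHE[key]
    result = fetch(url, params)
    # Assumption: a forced request also refreshes the stored entry.
    _CACHE[key] = result
    return result
```

In this sketch two calls that differ only in the order of their query parameters share one cache entry, and `use_cache=False` always reaches the network.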
Retries
Transient failures are retried automatically: 5 attempts with progressive backoff (3 → 6 → 9 → 12 s) on HTTP 429/500/502/503/504 and connection errors. Non-retryable errors (400, 404, …) fail fast with an actionable message.
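The retry schedule can be sketched as a loop that waits 3 s times the attempt number between tries, which gives the 3, 6, 9, 12 s sequence for 5 attempts. This is an illustrative sketch only: `HttpError` and `call_with_retries` are hypothetical names, and the library's real implementation may differ.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Status codes the text lists as retryable.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


class HttpError(Exception):
    """Hypothetical error type carrying the HTTP status code."""

    def __init__(self, status: int) -> None:
        super().__init__(f"HTTP {status}")
        self.status = status


def call_with_retries(fetch: Callable[[], T],
                      max_attempts: int = 5,
                      step_seconds: float = 3.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `fetch`, retrying transient failures with linearly growing waits."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except HttpError as exc:
            # Non-retryable statuses (400, 404, ...) fail fast.
            if exc.status not in RETRYABLE_STATUS or attempt == max_attempts:
                raise
        except ConnectionError:
            if attempt == max_attempts:
                raise
        # Wait 3 s, 6 s, 9 s, 12 s before attempts 2 to 5.
        sleep(step_seconds * attempt)
    raise RuntimeError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the sketch easy to test without actually waiting.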
Logging and verbosity
The package logs progress through the "tesouropy" logger, attached to a
NullHandler, so importing it is silent by default. Opt in to see progress:
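One way to opt in uses only the standard `logging` module and the logger name given above. The helper name `enable_progress_logging` is hypothetical, and the assumption that progress messages are emitted at `INFO` level is not confirmed by this page.

```python
import logging


def enable_progress_logging(level: int = logging.INFO) -> logging.Logger:
    """Attach a stream handler to the "tesouropy" logger so its messages are shown."""
    logger = logging.getLogger("tesouropy")
    logger.setLevel(level)
    # Avoid stacking duplicate handlers if this is called more than once.
    if not any(isinstance(h, logging.StreamHandler) for h in logger.handlers):
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


enable_progress_logging()
```

Applications that already configure logging centrally (for example with `logging.basicConfig(level=logging.INFO)`) usually do not need a dedicated handler like this one.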
To log the full request URL of each call (handy for debugging or pasting into a
browser), pass verbose=True per call, or set it globally:
Limiting rows while exploring
Large datasets (CUSTOS, broad SICONFI queries) can be capped with max_rows
and tuned with page_size:
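The interaction between a row cap and a page size can be pictured with the sketch below. It assumes an offset/limit style of pagination, which may not match the underlying APIs; `fetch_capped` and `fetch_page` are hypothetical names, and the default page size is arbitrary.

```python
from typing import Any, Callable, List, Optional


def fetch_capped(fetch_page: Callable[[int, int], List[Any]],
                 page_size: int = 1000,
                 max_rows: Optional[int] = None) -> List[Any]:
    """Fetch pages of `page_size` rows until the data ends or `max_rows` is reached."""
    rows: List[Any] = []
    offset = 0
    while max_rows is None or len(rows) < max_rows:
        # Never ask for more rows than are still needed.
        limit = page_size if max_rows is None else min(page_size, max_rows - len(rows))
        page = fetch_page(offset, limit)
        rows.extend(page)
        if len(page) < limit:
            break  # a short page means there is no more data
        offset += len(page)
    return rows
```

With a small `max_rows`, a sketch like this issues one or two requests instead of walking the whole dataset, which is the point of capping rows while exploring.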
Fault tolerance
Two families of functions never throw away data they already fetched.
Partial pagination
If a page after the first fails, you get the rows fetched so far, flagged:
df = tn.get_custos_pessoal_ativo(ano=2023)

# `partial` is only set when a later page failed
if getattr(df, "partial", False):
    print("incomplete:", df.last_page_error)
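The contract described here (keep what was fetched, flag the result) can be sketched in plain Python. `PagedResult` and `fetch_all_pages` are hypothetical names used for illustration; the sketch assumes a failure on the first page is simply raised, since there is nothing to preserve yet.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional


@dataclass
class PagedResult:
    """Rows plus the two flags mentioned in the text."""
    rows: List[Any] = field(default_factory=list)
    partial: bool = False
    last_page_error: Optional[Exception] = None


def fetch_all_pages(fetch_page: Callable[[int], List[Any]]) -> PagedResult:
    """Fetch pages 1, 2, ... until an empty page; keep rows if a later page fails."""
    result = PagedResult()
    page = 1
    while True:
        try:
            rows = fetch_page(page)
        except Exception as exc:
            if page == 1:
                raise  # nothing fetched yet, so there is nothing to preserve
            result.partial = True
            result.last_page_error = exc
            return result
        if not rows:
            return result
        result.rows.extend(rows)
        page += 1
```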
State-wide loops
The *_for_state helpers fetch data for every municipality of a state. If one
municipality fails after all retries, the failure is recorded and the loop
continues:
rreo_es = tn.get_rreo_for_state(
    state_uf="ES", an_exercicio=2021, nr_periodo=6,
    co_tipo_demonstrativo="RREO", no_anexo="RREO-Anexo 01",
)

getattr(rreo_es, "failed", None)   # DataFrame of failed municipalities, if any
getattr(rreo_es, "no_data", None)  # municipalities that returned 0 rows
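The record-and-continue behavior can be sketched as a loop that sorts each municipality into one of three buckets. `fetch_for_each` and `fetch_one` are hypothetical names, and plain lists stand in for the DataFrames the library returns.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def fetch_for_each(
    entity_ids: Iterable[int],
    fetch_one: Callable[[int], List[Any]],
) -> Tuple[List[Any], List[Dict[str, Any]], List[int]]:
    """Fetch every entity; record failures and empty results instead of stopping."""
    rows: List[Any] = []
    failed: List[Dict[str, Any]] = []
    no_data: List[int] = []
    for entity_id in entity_ids:
        try:
            result = fetch_one(entity_id)
        except Exception as exc:
            # Assumed to be raised only after retries are exhausted.
            failed.append({"id_ente": entity_id, "error": str(exc)})
            continue
        if not result:
            no_data.append(entity_id)
            continue
        rows.extend(result)
    return rows, failed, no_data
```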
Metadata attributes are transient
partial, failed, no_data and last_page_error are attached to the
returned DataFrame instance. Most polars operations return a new
DataFrame that will not carry them, so read them right after the call.
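A small helper can capture these attributes immediately after the fetch, before any transformation replaces the object. `read_fetch_metadata` is a hypothetical helper, not part of tesouropy; it relies only on `getattr` with a default, as in the examples above.

```python
from typing import Any, Dict

# Attribute names listed in the note above.
METADATA_ATTRS = ("partial", "failed", "no_data", "last_page_error")


def read_fetch_metadata(df: Any) -> Dict[str, Any]:
    """Copy the transient metadata attributes of a fetch result into a plain dict."""
    return {name: getattr(df, name, None) for name in METADATA_ATTRS}
```

Calling it on the line right after the fetch, for example `meta = read_fetch_metadata(df)`, keeps the flags available however the DataFrame is filtered or reshaped afterwards.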
Where to next
- SICONFI — fiscal reports and entities
- CUSTOS — federal cost data
- SADIPEM — public debt
- SIORG — organizational structure
- Transferências — constitutional transfers
- SIOPE — education spending
- RREO longitudinal — coherent series across years