BigDataPE: Secure and Intuitive Access to BigDataPE API Datasets

BigDataPE is an R package that provides a secure and intuitive way to access datasets from the BigDataPE platform. The package allows users to fetch data from the API using token-based authentication, manage multiple tokens for different datasets, and retrieve data efficiently using chunking.

❕️ Disclaimer
This package is a wrapper for Brazilian public APIs provided by the Big Data PE platform, which is maintained by the Government of the State of Pernambuco, the institution responsible for the data. To stay consistent with R package development standards, all wrapper functions use English function and parameter names (e.g., bdpe_store_token(), bdpe_fetch_data(), bdpe_fetch_chunks()). Because the source API is natively in Portuguese, however, response column names are returned in Portuguese (e.g., nu_notificacao, dt_notificacao, co_municipio_residencia, tp_sexo, no_bairro_residencia), and some data values are also in Portuguese (e.g., neighbourhood names such as "BOA VIAGEM" and "SANTO AMARO"). Access to the Big Data PE API also requires an approved access request on the platform and a connection to the PE Conectado network or a VPN; external connections will time out.
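
Since responses keep their Portuguese column names, some users may want to relabel them after fetching. The following is a minimal sketch in base R, not part of the package: `rename_columns()` is a hypothetical helper, the sample values are made up, and the English labels are our own reading of the column names rather than an official data dictionary.

```r
# Sample data frame mimicking the Portuguese column names listed above.
# The values are made up for illustration only.
raw <- data.frame(
  nu_notificacao = c("0001", "0002"),
  dt_notificacao = c("2024-01-15", "2024-02-03"),
  tp_sexo = c("F", "M"),
  no_bairro_residencia = c("BOA VIAGEM", "SANTO AMARO"),
  stringsAsFactors = FALSE
)

# Hypothetical English labels (assumed translations, not an official mapping).
english_names <- c(
  nu_notificacao = "notification_number",
  dt_notificacao = "notification_date",
  tp_sexo = "sex",
  no_bairro_residencia = "residence_neighbourhood"
)

# Rename only the columns that have a mapping; leave the rest untouched.
rename_columns <- function(df, mapping) {
  hits <- names(df) %in% names(mapping)
  names(df)[hits] <- unname(mapping[names(df)[hits]])
  df
}

translated <- rename_columns(raw, english_names)
names(translated)
```

Columns without an entry in the mapping keep their original names, so the helper can be applied to any response without failing on unexpected fields.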

Installation

You can install the BigDataPE package directly from GitHub:

# Install the devtools package if you haven't already
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}

# Install BigDataPE from GitHub
devtools::install_github("StrategicProjects/bigdatape")

After installation, load the package:

library(BigDataPE)

Features

Securely store and manage API tokens using environment variables.
Fetch data from the BigDataPE API using a simple interface.
Retrieve large datasets iteratively using chunking.
Easily manage multiple datasets and their associated tokens.

Functions Overview

1. Store Token: `bdpe_store_token`

This function securely stores an authentication token for a specific dataset.

bdpe_store_token(base_name, token)

Parameters:

base_name: The name of the dataset.
token: The authentication token for the dataset.

Example:

bdpe_store_token("education_dataset", "your-token-here")

2. Retrieve Token: `bdpe_get_token`

This function retrieves the securely stored token for a specific dataset.

bdpe_get_token(base_name)

Parameters:

base_name: The name of the dataset.

Example:

token <- bdpe_get_token("education_dataset")
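
The Features list says tokens are kept in environment variables. The sketch below shows that general idea in base R. It is not the package's implementation: the `BDPE_DEMO_` prefix and the helpers `demo_store_token()` and `demo_get_token()` are hypothetical, and the real variable naming and error handling may differ.

```r
# Hypothetical prefix; the real package may use a different naming scheme.
token_prefix <- "BDPE_DEMO_"

demo_store_token <- function(base_name, token) {
  # do.call() lets us build the variable name dynamically for Sys.setenv().
  args <- stats::setNames(list(token), paste0(token_prefix, base_name))
  do.call(Sys.setenv, args)
  invisible(TRUE)
}

demo_get_token <- function(base_name) {
  token <- Sys.getenv(paste0(token_prefix, base_name), unset = NA)
  if (is.na(token)) {
    stop("No token stored for dataset '", base_name, "'.")
  }
  token
}

demo_store_token("education_dataset", "your-token-here")
demo_get_token("education_dataset")
```

Variables set with `Sys.setenv()` last only for the current R session, so under this approach a token would need to be stored again after restarting R.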

3. Remove Token: `bdpe_remove_token`

This function removes the token associated with a specific dataset.

bdpe_remove_token(base_name)

Parameters:

base_name: The name of the dataset.

Example:

bdpe_remove_token("education_dataset")

4. List Tokens: `bdpe_list_tokens`

This function lists all datasets with stored tokens.

bdpe_list_tokens()

Example:

datasets <- bdpe_list_tokens()
print(datasets)
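
If tokens live in environment variables, as the Features list states, listing and removing them can be done by filtering variable names. This is a rough sketch under that assumption only: the `BDPE_DEMO_` prefix and the `demo_list_tokens()` and `demo_remove_token()` helpers are hypothetical and do not describe the package's internals.

```r
token_prefix <- "BDPE_DEMO_"  # hypothetical prefix, not the package's

# Seed two illustrative tokens for this session.
Sys.setenv(
  BDPE_DEMO_education_dataset = "token-a",
  BDPE_DEMO_health_dataset = "token-b"
)

demo_list_tokens <- function() {
  vars <- names(Sys.getenv())
  stored <- vars[startsWith(vars, token_prefix)]
  # Strip the prefix so only dataset names are returned.
  sort(substring(stored, nchar(token_prefix) + 1))
}

demo_remove_token <- function(base_name) {
  Sys.unsetenv(paste0(token_prefix, base_name))
}

before <- demo_list_tokens()
demo_remove_token("health_dataset")
after <- demo_list_tokens()
```

Here `before` contains both dataset names, while `after` no longer contains `"health_dataset"`.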

5. Fetch Data: `bdpe_fetch_data`

This function retrieves data from the BigDataPE API using securely stored tokens.

bdpe_fetch_data(
  base_name,
  limit = Inf,
  offset = 0,
  query = list(),
  endpoint = "https://www.bigdata.pe.gov.br/api/buscar"
)

Parameters:

base_name: The name of the dataset.
limit: Number of records per page. Default is Inf.
offset: Starting record for the query. Default is 0.
query: Additional query parameters.
endpoint: The API endpoint URL.

Example:

data <- bdpe_fetch_data("education_dataset", limit = 50)
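
As the Disclaimer notes, requests made from outside the PE Conectado network or a VPN time out. A script that should keep running in that case can wrap the call in `tryCatch()`. The sketch below assumes the fetch function signals an ordinary R error on failure; `safe_fetch()` and `failing_fetch()` are hypothetical names, and the stand-in fetcher exists only so the example runs offline.

```r
# Run any fetch function and return NULL (with a warning) instead of
# stopping the script when the request fails.
safe_fetch <- function(fetch_fn, ...) {
  tryCatch(
    fetch_fn(...),
    error = function(e) {
      warning("Fetch failed: ", conditionMessage(e), call. = FALSE)
      NULL
    }
  )
}

# Stand-in for a call that times out; used here so the sketch runs offline.
failing_fetch <- function(base_name, limit = 50) {
  stop("Timeout was reached")
}

result <- suppressWarnings(
  safe_fetch(failing_fetch, "education_dataset", limit = 50)
)
is.null(result)
```

With the package loaded, the same wrapper could be called as `safe_fetch(bdpe_fetch_data, "education_dataset", limit = 50)`.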

6. Fetch Data in Chunks: `bdpe_fetch_chunks`

This function retrieves data from the API iteratively in chunks.

bdpe_fetch_chunks(
  base_name,
  total_limit = Inf,
  chunk_size = 50000,
  query = list(),
  endpoint = "https://www.bigdata.pe.gov.br/api/buscar"
)

Parameters:

base_name: The name of the dataset.
total_limit: Maximum number of records to fetch. Default is Inf (fetch all available data).
chunk_size: Number of records per chunk. Default is 50,000.
query: Additional query parameters.
endpoint: The API endpoint URL.

Example:

# Fetch up to 500 records in chunks of 100
data <- bdpe_fetch_chunks(
          "education_dataset", 
          total_limit = 500, 
          chunk_size = 100)

# Fetch all available data in chunks of 200
all_data <- bdpe_fetch_chunks(
              "education_dataset", 
              chunk_size = 200)
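
To make the roles of total_limit and chunk_size concrete, here is one way a chunked retrieval loop could look. It is a sketch, not the package's code: `fetch_in_chunks()` and `fake_fetch_page()` are hypothetical, and an in-memory data frame stands in for the remote API so the example runs without network access.

```r
# In-memory stand-in for the remote dataset (values are illustrative).
fake_source <- data.frame(id = seq_len(230))

# Stand-in page fetcher: returns up to `limit` rows starting after `offset`.
fake_fetch_page <- function(limit, offset) {
  if (offset >= nrow(fake_source)) {
    return(fake_source[0, , drop = FALSE])
  }
  last <- min(offset + limit, nrow(fake_source))
  fake_source[(offset + 1):last, , drop = FALSE]
}

fetch_in_chunks <- function(fetch_page, total_limit = Inf, chunk_size = 100) {
  chunks <- list()
  fetched <- 0
  while (fetched < total_limit) {
    # Never request more rows than total_limit still allows.
    size <- min(chunk_size, total_limit - fetched)
    page <- fetch_page(limit = size, offset = fetched)
    if (nrow(page) == 0) break    # source exhausted
    chunks[[length(chunks) + 1]] <- page
    fetched <- fetched + nrow(page)
    if (nrow(page) < size) break  # a short page means it was the last one
  }
  if (length(chunks) == 0) {
    return(data.frame())
  }
  do.call(rbind, chunks)
}

all_rows <- fetch_in_chunks(fake_fetch_page, chunk_size = 100)
capped <- fetch_in_chunks(fake_fetch_page, total_limit = 150, chunk_size = 100)
```

The offset advances by the number of rows actually received, and the final request is trimmed so the combined result never exceeds total_limit.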

7. Construct URL with Query Parameters: `parse_queries`

This internal function constructs a URL with query parameters.

parse_queries(url, query_list)

Parameters:

url: The base URL.
query_list: A list of query parameters.

Example:

url <- parse_queries(
            "https://www.example.com", 
            list(param1 = "value1", param2 = "value2")
            )
print(url)
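
For readers curious how a list of parameters can be turned into a query string, the following base R sketch shows one approach. It is not the source of parse_queries: `build_query_url()` is a hypothetical helper, it assumes every list element is a single named value, and it does not handle a base URL that already contains a query string.

```r
build_query_url <- function(url, query_list) {
  if (length(query_list) == 0) {
    return(url)
  }
  # Percent-encode names and values so spaces and accents are transmitted safely.
  keys <- vapply(
    names(query_list), utils::URLencode, character(1),
    reserved = TRUE, USE.NAMES = FALSE
  )
  values <- vapply(
    query_list,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1), USE.NAMES = FALSE
  )
  paste0(url, "?", paste(keys, values, sep = "=", collapse = "&"))
}

encoded_url <- build_query_url(
  "https://www.example.com",
  list(param1 = "value1", param2 = "BOA VIAGEM")
)
print(encoded_url)
```

Encoding matters for this API in particular because values such as neighbourhood names can contain spaces.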

Example Workflow

Here’s a complete example workflow:

# Store a token for a dataset
bdpe_store_token("education_dataset", "your-token-here")

# Fetch 100 records starting from the first record
data <- bdpe_fetch_data("education_dataset", limit = 100, offset = 0)

# Fetch data in chunks
all_data <- bdpe_fetch_chunks(
  "education_dataset", 
  total_limit = 500, 
  chunk_size = 100)

# List all datasets with stored tokens
datasets <- bdpe_list_tokens()

# Remove a token
bdpe_remove_token("education_dataset")

Contributing

If you find any issues or have feature requests, feel free to create an issue or a pull request on GitHub.

License

This package is licensed under the MIT License. See the LICENSE file for more details.
