Skip to contents

Fetches a feed and enqueues each item's link. The item title and date are attached to the request's user_data (available to handlers as ctx$request$user_data), so feed metadata can be carried into the dataset.

Usage

cr_from_rss(
  crawler,
  url,
  label = NULL,
  include = NULL,
  exclude = NULL,
  max = Inf
)

Arguments

crawler

A Crawler.

url

URL of an RSS or Atom feed.

label

Optional handler label routing the enqueued URLs.

include, exclude

Optional glob patterns (see cr_on_html()).

max

Maximum number of items to enqueue.

Value

The crawler, invisibly.

Examples

if (FALSE) { # \dontrun{
crawler() |>
  cr_on_html(\(ctx) ctx$push_data(list(
    url = ctx$request$url, title = ctx$request$user_data$title
  ))) |>
  cr_from_rss("https://example.com/feed.xml")
} # }